What is SIP

The Basics

The Session Initiation Protocol (SIP) is a signalling protocol used for establishing sessions in an IP network. A session could be a simple two-way telephone call or a collaborative multimedia conference. The ability to establish these sessions makes a host of innovative services possible, such as voice-enriched e-commerce, web page click-to-dial, Instant Messaging with buddy lists, and IP Centrex services.

Over the last couple of years, the Voice over IP community has adopted SIP as its protocol of choice for signalling. SIP is an RFC standard (RFC 3261) from the Internet Engineering Task Force (IETF), the body responsible for administering and developing the mechanisms that comprise the Internet. SIP is still evolving and being extended as technology matures and SIP products are socialised in the marketplace.

The IETF's philosophy is one of simplicity: specify only what you need to specify. SIP is very much of this mould. Having been developed purely as a mechanism to establish sessions, it does not know about the details of a session; it just initiates, terminates and modifies sessions. This simplicity means that SIP scales, is extensible, and sits comfortably in different architectures and deployment scenarios.

SIP is a request-response protocol that closely resembles two other Internet protocols, HTTP and SMTP (the protocols that power the world wide web and email); consequently, SIP sits comfortably alongside Internet applications. Using SIP, telephony becomes another web application and integrates easily into other Internet services. SIP is a simple toolkit that service providers can use to build converged voice and multimedia services.
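To make the resemblance to HTTP concrete, the sketch below parses a SIP request much as one might parse an HTTP request: a start line, then "Name: value" headers, then a blank line and an optional body. The function name parse_sip_request and the sample message (hosts, tag, branch and Call-ID values) are illustrative assumptions, not something defined in this article, and the parser deliberately ignores details such as repeated headers and compact header names.

```python
def parse_sip_request(raw):
    """Split a SIP request into start line fields, headers, and body.

    Simplified sketch: repeated headers (for example several Via lines)
    overwrite each other, and compact header forms are not expanded.
    """
    # Headers and body are separated by an empty line, as in HTTP.
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")

    # The start line has the same shape as HTTP's: METHOD target VERSION.
    method, request_uri, version = lines[0].split(" ", 2)

    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()

    return {
        "method": method,
        "request_uri": request_uri,
        "version": version,
        "headers": headers,
        "body": body,
    }


# Hypothetical INVITE; every address and identifier here is made up.
EXAMPLE_INVITE = (
    "INVITE sip:bob@biloxi.example.com SIP/2.0\r\n"
    "Via: SIP/2.0/UDP pc33.atlanta.example.com;branch=z9hG4bK776asdhds\r\n"
    "Max-Forwards: 70\r\n"
    "To: Bob <sip:bob@biloxi.example.com>\r\n"
    "From: Alice <sip:alice@atlanta.example.com>;tag=1928301774\r\n"
    "Call-ID: a84b4c76e66710@pc33.atlanta.example.com\r\n"
    "CSeq: 314159 INVITE\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
)

request = parse_sip_request(EXAMPLE_INVITE)
print(request["method"], request["request_uri"], request["version"])
print(request["headers"]["cseq"])
```

A production SIP stack does far more (transactions, retransmission, header grammar), but the text-based, header-oriented shape is what makes SIP feel familiar to anyone who has worked with HTTP or SMTP.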

Providing telephony services requires a number of different standards and protocols to come together: to transport media (RTP), to authenticate users (RADIUS, Diameter), to provide directories (LDAP), to guarantee voice quality (RSVP, YESSIR), and to inter-work with today's telephone network. Here we will only cover SIP.

SIP - Playing Nicely with the Other Protocols

Session Initiation Protocol (SIP) has become a strong, catalytic force shaping today's telecom industry. This IETF-driven protocol represents a key ingredient in the converging world of telecommunications-based applications. But SIP does not do everything, and it does not solve every problem. SIP has limits, and SIP works with other protocols to get the job done.

So what are the limits to SIP? And are we losing perspective as an industry when we say that SIP is a one-stop-shop for convergence?

SIP is not a panacea. It was never designed that way, and that's a good thing! Typically, all-inclusive approaches (like H.323) have been fraught with difficulty and represent the wrong kind of thinking in today's modular network. SIP is flexible, but it sticks to doing what it does best.

So let's have a closer look. We will see that SIP does certain things well, and leaves other functions alone. We will see that SIP works with a number of other protocols to get the job done while still playing nicely with some neighboring technologies.

SIP - Playing an Important Role

SIP is an IETF application-layer protocol for establishing, manipulating, and tearing down sessions. SIP's main purpose is to help session originators deliver invitations to potential session participants wherever they may be. In a nutshell, that is SIP's role.

So SIP is not the panacea - because it was never built to be that way. Let's review two of the fundamental assumptions behind SIP's design:

Reusing Existing Protocols - SIP was designed specifically to reuse as many existing protocols and protocol design concepts as possible. For example, SIP was modeled after HTTP, uses URLs for addressing, and relies on SDP to convey session information.

Maximizing Interoperability - SIP was also designed so that it would be easy to bind SIP functions to existing protocols and applications, such as e-mail and Web browsers. SIP does this by limiting itself to a modular philosophy - just like many other Internet protocols - and focusing on a specific set of functions.
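As an illustration of the URL-style addressing mentioned above, here is a minimal sketch that splits a SIP address into its parts. The function name parse_sip_uri and the example addresses are assumptions made for this illustration. The sketch handles only the common "sip:user@host:port;param=value" shape and does not attempt the full grammar (for example, bracketed IPv6 hosts or a password in the user part).

```python
def parse_sip_uri(uri):
    """Break a sip: or sips: address into scheme, user, host, port, params.

    Simplified sketch: no IPv6 literals, no password handling, and any
    "?header=value" suffix is discarded.
    """
    scheme, sep, rest = uri.partition(":")
    if not sep or scheme.lower() not in ("sip", "sips"):
        raise ValueError("not a SIP URI: %r" % uri)

    rest, _, _ = rest.partition("?")          # drop embedded headers, if any
    address, *raw_params = rest.split(";")    # ;key=value parameters follow

    # With no "@" present, rpartition leaves user empty and keeps the host.
    user, _, hostport = address.rpartition("@")
    host, colon, port = hostport.partition(":")

    params = {}
    for item in raw_params:
        key, _, value = item.partition("=")
        params[key] = value

    return {
        "scheme": scheme.lower(),
        "user": user or None,
        "host": host,
        "port": int(port) if colon else None,
        "params": params,
    }


# Hypothetical addresses used only for illustration.
alice = parse_sip_uri("sip:alice@atlanta.example.com:5060;transport=tcp")
domain_only = parse_sip_uri("sip:biloxi.example.com")
print(alice)
print(domain_only)
```

The address looks and behaves much like an e-mail address wrapped in a URL scheme, which is part of what makes it easy to embed SIP addresses in web pages and other Internet applications.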

It's actually good news that SIP does not try to solve everything single-handedly. We can examine this statement more closely with a quick look at the H.323 approach to IP telephony. H.323 is not a single protocol but rather an entire suite of protocols that cover everything from soup to nuts - codecs, call control, conferencing, and many other functions in one vertically integrated stack.

The advantage of this approach is that by strictly controlling so many aspects of the implementation it is easier to ensure that H.323-based systems function well together. On the downside, H.323 becomes heavy and cumbersome. Flexibility is sacrificed as one is tied to a single family of technologies.

For a mature technology this may not be a problem, since the best solutions are likely to have been discovered and incorporated into standards. However, for a field as young and fast-changing as IP telephony, where many problems and solutions are still under debate, flexibility is more important. SIP is part of this flexible approach, as it works alongside a wide variety of protocols, each addressing a different aspect of the problem space. The advantage is the ability to choose from among many competing technologies and move to newer and better ones as they emerge. This has always been the philosophy behind SIP and this is the approach of the IETF to IP telephony in general.

SIP is an important piece of this modular approach to IP telephony protocols. SIP addresses the need for a protocol to deal with generalized sessions. This involves finding potential call participants and contacting them as they move from place to place, changing their location and even the equipment they are using. Calls may require the use of multiple streams of various media, and very large numbers of participants might be involved in a call, joining and leaving in a constantly changing topology. This is what SIP does.

SIP - Working with Other Protocols

SIP was designed to solve only a few problems and to work with a broad spectrum of existing and future IP telephony protocols. To this end SIP provides four basic functions. SIP allows for the establishment of user location (i.e. translating from a user's name to their current network address). SIP provides for feature negotiation so that all of the participants in a session can agree on the features to be supported among them. SIP is a mechanism for call management - for example adding, dropping, or transferring participants. And finally SIP allows for changing features of a session while it is in progress. All of the other key functions are done with other protocols.
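The first of these functions, user location, can be pictured as a lookup table from a user's public name to wherever that user is currently reachable. The sketch below is a toy model under stated assumptions: the class name LocationService, its method names, and the example addresses are hypothetical, and timestamps are passed in explicitly so the behaviour is easy to follow. The terms "address-of-record" and "contact" follow RFC 3261 usage.

```python
class LocationService:
    """Toy model of user location: public name -> current contact address."""

    def __init__(self):
        # address-of-record -> (contact address, absolute expiry time)
        self._bindings = {}

    def register(self, address_of_record, contact, expires, now):
        """Record where a user can currently be reached, for `expires` seconds."""
        self._bindings[address_of_record] = (contact, now + expires)

    def lookup(self, address_of_record, now):
        """Return the current contact, or None if unknown or expired."""
        binding = self._bindings.get(address_of_record)
        if binding is None:
            return None
        contact, expires_at = binding
        if now >= expires_at:
            # Stale registration: forget it so the user appears unreachable.
            del self._bindings[address_of_record]
            return None
        return contact


# Hypothetical usage: Bob registers from his desk phone, then from a laptop.
service = LocationService()
service.register("sip:bob@biloxi.example.com",
                 "sip:bob@192.0.2.4", expires=3600, now=0)
at_desk = service.lookup("sip:bob@biloxi.example.com", now=10)

service.register("sip:bob@biloxi.example.com",
                 "sip:bob@198.51.100.7", expires=60, now=100)
on_laptop = service.lookup("sip:bob@biloxi.example.com", now=120)
expired = service.lookup("sip:bob@biloxi.example.com", now=1000)
print(at_desk, on_laptop, expired)
```

The caller only ever dials the public name; the mapping to a device changes underneath as the user moves, which is how SIP supports mobility without the caller needing to know where the other party is.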

Yes, this does indeed mean that SIP is not a session description protocol, and that SIP does not do conference control. SIP is not a resource reservation protocol and it has nothing to do with quality of service (QoS). SIP can work in a framework with other protocols to make sure these roles are played out, but SIP does not do them. SIP can function with SOAP, HTTP, XML, VXML, WSDL, UDDI, SDP and an alphabet soup of others. Everyone has a role to play!
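The division of labour with SDP can be sketched in a few lines. SIP carries a session description as an opaque message body; it is SDP, a separate specification, that says which media are on offer. The example below reads the media lines out of a small SDP description. The function name media_offers and the sample description (addresses, ports, session identifiers) are illustrative assumptions, and only the "m=" lines are interpreted.

```python
# Hypothetical SDP body, as it might be carried inside a SIP INVITE.
EXAMPLE_SDP = (
    "v=0\r\n"
    "o=alice 2890844526 2890844526 IN IP4 pc33.atlanta.example.com\r\n"
    "s=-\r\n"
    "c=IN IP4 192.0.2.101\r\n"
    "t=0 0\r\n"
    "m=audio 49172 RTP/AVP 0\r\n"
    "a=rtpmap:0 PCMU/8000\r\n"
)


def media_offers(sdp):
    """Return one dict per SDP media line (m=<media> <port> <proto> <fmt>...)."""
    offers = []
    for line in sdp.splitlines():
        if not line.startswith("m="):
            continue  # everything else is ignored in this sketch
        media, port, proto, *formats = line[2:].split()
        offers.append({
            "media": media,
            # A port may be written as "port/count"; keep only the port.
            "port": int(port.split("/")[0]),
            "proto": proto,
            "formats": formats,
        })
    return offers


offers = media_offers(EXAMPLE_SDP)
print(offers)
```

Nothing in this function knows anything about SIP, and nothing in SIP's own message handling needs to understand these lines, which is the modularity the text describes.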

There is no question that SIP was designed to be a modular component of a larger IP telephony solution and thus functions well with a large number of these IP-related protocols. But SIP is even friendlier, as it "plays nicely" with protocols that are often viewed as overlapping in function. For the near term we can expect that SIP will have to coexist with overlapping protocols such as H.323, MGCP, and MEGACO.

H.323 networks are already deployed in many parts of the world. Network operators are interested in growing network capability with coexisting SIP networks, and SIP to H.323 translation products are already available. MGCP and MEGACO can also benefit from SIP, as by themselves they aren't enough to build a complete IP telephony system. These protocols sit architecturally below SIP and gain functionality by, in effect, being controlled through SIP.

Clearly, SIP is an important protocol that is becoming widely deployed. SIP is a catalytic protocol that delivers key signaling elements, which can turn a voice over IP network into a true IP communications network - a network capable of delivering next-generation converged services. SIP is powerful, and yet simple. But that power comes from doing what it does best, and playing nicely with the rest of the protocols in the converged protocol sandbox!