Instant Messaging and Presence (IMP) is a collection of technologies that enable near real-time, text-based communication between two or more participants over the Internet or an internal network (intranet). This paper discusses the two major protocols in use for Instant Messaging and Presence: XMPP and SIMPLE. We will also look at the landscape of proprietary IMP protocols currently used in the market.
There are a number of protocols that have been defined for use in Instant Messaging and Presence applications. Some of them are proprietary and there are also some standardized protocols defined for the same.
This paper explores two prominent standards defined for Instant Messaging and Presence applications, XMPP and SIMPLE. We will also explore the usage of these protocols versus the other proprietary protocols.
Instant Messaging refers to the exchange of text and multimedia messages between two or more parties in near real-time over public or private networks. IM differs from email in that email is asynchronous in nature.
IM applications are also integrated with the presence information of a user. Presence conveys the ability and willingness of a user to communicate across a set of devices. Presence information need not be limited to simple "on-line" and "off-line" status indications. It can be rich, conveying mood, privacy, and many other attributes, as defined for example in RFC 4480, “RPID: Rich Presence Extensions to the Presence Information Data Format”.
In the next few sections we will explore two of the open standards that have been defined for IM and Presence applications: XMPP and SIMPLE.
XMPP: Extensible Messaging and Presence Protocol
Extensible Messaging and Presence Protocol (XMPP) is an open, XML-based protocol originally aimed at near-real-time, extensible instant messaging (IM) and presence information (e.g., buddy lists), but now expanded into the broader realm of message oriented middleware. It remains the core protocol of the Jabber Instant Messaging and Presence technology. Built to be extensible, the protocol has been extended with features such as Voice over Internet Protocol and file transfer signalling.
The core technology behind XMPP was invented by Jeremie Miller in 1998, refined in the Jabber open-source community in 1999 and 2000, and formalized by the IETF in 2002 and 2003, resulting in publication of the XMPP RFCs in 2004.
More information can be found at http://xmpp.org/about/history.shtml
XMPP uses a distributed architecture and is generally implemented as a client-server system. A client using XMPP accesses a server over a TCP connection, and servers also communicate with each other over TCP connections.
Following are the XMPP components:
The server manages the connections from other entities in the form of streams to and from authorized clients, servers and other entities.
The client is the end-user entity that initiates streams to a server.
A gateway is a special-purpose server-side service whose primary function is to translate XMPP into the protocol used by a foreign (non-XMPP) messaging system, as well as to translate the return data back into XMPP. Examples are gateways to email, Internet Relay Chat, SIMPLE, Short Message Service, and legacy instant messaging services such as AIM, ICQ, MSN Messenger, and Yahoo! Instant Messenger.
The figure below depicts the XMPP high level architecture.
XMPP has two fundamental concepts for the quick, asynchronous exchange of relatively small payloads of structured information between presence-aware entities: XML streams and XML stanzas. These terms are defined as follows:
An XML stream is a container for the exchange of XML elements between any two entities over a network. The start of an XML stream is denoted unambiguously by an opening XML <stream> tag (with appropriate attributes and namespace declarations), while the end of the XML stream is denoted unambiguously by a closing XML </stream> tag.
During the life of the stream, the entity that initiated it can send an unbounded number of XML elements over the stream. The "initial stream" is negotiated from the initiating entity (usually a client or server) to the receiving entity (usually a server), and can be seen as corresponding to the initiating entity's "session" with the receiving entity. The initial stream enables unidirectional communication from the initiating entity to the receiving entity; in order to enable information exchange from the receiving entity to the initiating entity, the receiving entity MUST negotiate a stream in the opposite direction (the "response stream").
An XML stanza is a discrete semantic unit of structured information that is sent from one entity to another over an XML stream. The start of any XML stanza is denoted unambiguously by the element start tag at depth=1 of the XML stream (e.g., <presence>), and the end of any XML stanza is denoted unambiguously by the corresponding close tag at depth=1 (e.g., </presence>). An XML stanza may also contain child elements (with accompanying attributes, elements, and XML character data) as necessary in order to convey the desired information. The XMPP standard defines three XML stanzas: <message/>, <presence/>, and <iq/>.
In essence, then, an XML stream acts as an envelope for all the XML stanzas sent during a session. This can be represented as follows:
| <stream>            |
| <presence>          |
|   <show/>           |
| </presence>         |
| <message to='foo'>  |
|   <body/>           |
| </message>          |
| <iq to='bar'>       |
|   <query/>          |
| </iq>               |
| ...                 |
| </stream>           |
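The envelope idea above can be sketched in a few lines of Python. This is only an illustration, assuming the simplified, namespace-free `<stream>` element from the diagram (a real XMPP stream uses a namespaced stream element with attributes) and a hypothetical helper name, `iter_stanzas`. It feeds the stream to an incremental parser in arbitrary pieces, as they might arrive over TCP, and yields each element that closes at depth 1 as one stanza.

```python
import xml.etree.ElementTree as ET


def iter_stanzas(chunks):
    """Yield each depth-1 child of the stream root as soon as it is complete."""
    parser = ET.XMLPullParser(events=("start", "end"))
    depth = 0  # 0 = outside the stream, 1 = inside <stream>, 2+ = inside a stanza
    for chunk in chunks:
        parser.feed(chunk)
        for event, elem in parser.read_events():
            if event == "start":
                depth += 1
            else:
                depth -= 1
                if depth == 1:
                    # A stanza (<presence/>, <message/>, <iq/>) just closed.
                    yield elem


# Hypothetical stream data, split at arbitrary points to mimic network reads.
incoming = [
    "<stream><presence><show/></presence><message to='foo'>",
    "<body/></message><iq to='bar'><query/></iq>",
    "</stream>",
]

stanza_names = [stanza.tag for stanza in iter_stanzas(incoming)]
print(stanza_names)
```

Because stanzas are recognized purely by depth, the receiver can act on each one without waiting for the closing stream tag, which may not arrive until the session ends.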
In this section we will discuss some of the key strengths and weaknesses of XMPP.
1. XMPP is standardized. Proprietary protocols, by contrast, are not, and work only with the clients and servers that implement them.
2. The chief difference between XMPP and the proprietary IM networks is that XMPP is decentralized. Proprietary vendors, by contrast, control not only the specification but the exchange of messages as well. For example, every AIM message is sent to an AOL server before it is sent to the recipient.
3. XMPP is designed to be extensible; new feature sets can be added without breaking the existing protocol. Extensions are managed through an open standards process at the Jabber Software Foundation (JSF) in the form of Jabber Enhancement Proposals (JEPs).
4. XMPP allows interworking with proprietary services like AIM and ICQ using a federation mechanism implemented via gateways.
1. Presence data overhead: A high percentage of XMPP inter-server traffic is presence data and a lot of the data is redundantly transmitted. Hence XMPP currently has a large overhead in delivering presence data to multiple recipients.
2. No binary data: The way XMPP is encoded as a single long XML document makes it impossible to deliver unmodified binary data. Therefore, file transfers use external protocols like HTTP. If unavoidable, XMPP also provides in-band file transfers by encoding all data using base64. Other binary data like encrypted conversations or graphic icons are embedded using the same method.
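The cost of the base64 approach can be made concrete with a small sketch. The element name `data` and the helper names below are illustrative assumptions, not a specific XMPP extension. The code wraps arbitrary bytes in an XML element, recovers them, and measures the size increase.

```python
import base64
import xml.etree.ElementTree as ET


def wrap_binary(payload: bytes) -> str:
    """Embed raw bytes in an XML element as base64 text (element name is illustrative)."""
    data = ET.Element("data")
    data.text = base64.b64encode(payload).decode("ascii")
    return ET.tostring(data, encoding="unicode")


def unwrap_binary(xml_text: str) -> bytes:
    """Recover the original bytes from the element produced by wrap_binary."""
    return base64.b64decode(ET.fromstring(xml_text).text)


payload = bytes(range(256)) * 3  # 768 bytes covering every possible byte value
wrapped = wrap_binary(payload)
recovered = unwrap_binary(wrapped)
encoded_size = len(base64.b64encode(payload))

print(recovered == payload)
print(encoded_size / len(payload))
```

Base64 emits four characters for every three input bytes, so the encoded payload is about one third larger than the original before any XML framing is added.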
Based on data gathered by the IMtrends search engine following is an estimate on the deployment percentages for XMPP based servers.
SIMPLE stands for SIP for Instant Messaging and Presence Leveraging Extensions. It is an instant messaging (IM) and presence protocol suite based on the Session Initiation Protocol (SIP) and managed by the IETF. As the name suggests, it is designed for presence and instant messaging. Although it is also XML based (with XML components taken from XMPP, for that matter), it is piggybacked on SIP as an event package mechanism.
Like XMPP, and in contrast to the vast majority of IM and presence protocols used by software deployed today, SIMPLE is an open standard.
SIMPLE is a protocol produced by the IETF SIMPLE Working Group. This working group focuses on the application of the Session Initiation Protocol (SIP, RFC 3261) to the suite of services collectively known as instant messaging and presence (IMP). The SIMPLE WG has produced a number of RFCs to address the requirements of IMP.
The primary focus of the group is on the following:
- Proposed standard SIP extensions documenting the transport of Instant Messages in SIP, compliant with the requirements for IM outlined in RFC 2779 and CPIM.
- Proposed standard SIP event packages and any related protocol mechanisms used to support presence, compliant to the requirements for presence outlined in RFC 2779 and CPIM.
- Architecture for the implementation of a traditional buddy list based instant messaging and presence application with SIP.
Further information about the SIMPLE WG can be found at:
The SIMPLE architecture is also a distributed architecture with clients, servers and gateways. We will look at the IM and Presence related aspects of the architecture.
There are two RFCs defined for Instant Messaging: RFC 3428, “SIP Extension for Instant Messaging”, and RFC 4975, “Message Session Relay Protocol (MSRP)”.
SIP Extension for Instant Messaging (RFC 3428)
This RFC defines the use of the MESSAGE method to exchange instant messages between peer entities.
The MESSAGE method for sending instant messages works like a pager: there is no explicit association between messages, so each IM is independent of the other messages exchanged between the two entities. In this sense the concept of a “conversation” exists only in the client user interface. This can be contrasted with a “session” model, where there is an explicit conversation with a clear beginning and end.
When one user wants to send an instant message to another, the sender generates a SIP request using the MESSAGE method. The Request-URI of this request will normally be the "address of record" for the recipient of the instant message, but it may be a device address in situations where the client has current information about the recipient's location. For example, the client could be coupled with a presence system that supplies an up to date device contact for a given address of record. The body of the request will contain the message to be delivered. This body can be of any MIME type, including message/cpim.
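As a rough illustration of the request described above, the following sketch assembles a page-mode MESSAGE request as bytes. The function name, the addresses, and the Call-ID, branch and tag values are all made-up placeholders; a real user agent would generate unique values and would also handle routing, authentication and the response from the recipient.

```python
def build_sip_message(sender, recipient, text, host, call_id, branch, tag, cseq=1):
    """Assemble a minimal SIP MESSAGE request carrying a plain-text body."""
    body = text.encode("utf-8")
    lines = [
        f"MESSAGE {recipient} SIP/2.0",          # Request-URI: the address of record
        f"Via: SIP/2.0/TCP {host};branch={branch}",
        "Max-Forwards: 70",
        f"From: <{sender}>;tag={tag}",
        f"To: <{recipient}>",
        f"Call-ID: {call_id}",
        f"CSeq: {cseq} MESSAGE",
        "Content-Type: text/plain",
        f"Content-Length: {len(body)}",          # length of the body in bytes
    ]
    # Headers are separated by CRLF; an empty line precedes the body.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii") + body


# All identifiers below are placeholders chosen for the example.
request = build_sip_message(
    sender="sip:user1@domain.com",
    recipient="sip:user2@domain.com",
    text="Watson, come here.",
    host="user1pc.domain.com",
    call_id="asd88asd77a@1.2.3.4",
    branch="z9hG4bK776sgdkse",
    tag="49583",
)
print(request.decode("utf-8"))
```

Nothing in the request refers to any earlier or later message, which is what makes this the page-mode model: each MESSAGE stands alone.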
Message Session Relay Protocol (RFC 4975)
The previous section describes page-mode messaging. Session-mode messaging, however, has a number of benefits over page-mode messaging, such as explicit rendezvous, tighter integration with other media types, direct client-to-client operation, and brokered privacy and security.
RFC 4975 defines a session-oriented instant message transport protocol called the Message Session Relay Protocol (MSRP), whose sessions can be negotiated with an offer/answer exchange using the Session Description Protocol (SDP). The exchange is carried by a signaling protocol such as SIP. This allows a messaging session to be one of the possible media types in a session.
MSRP sessions are typically arranged using SIP in the same way a session of audio or video media is set up. One SIP user agent (A) sends the other (B) a SIP invitation containing an offered session description that includes a session of MSRP. The receiving SIP user agent can accept the invitation and include an answer session description that acknowledges the choice of media. A's session description contains an MSRP URI that describes where A is willing to receive MSRP requests from B, and vice versa.
MSRP defines two request types. SEND requests are used to deliver a complete message or a chunk (a portion of a complete message), while REPORT requests report on the status of a previously sent message, or a range of bytes inside a message.
Messages sent using MSRP can be very large and can be delivered in several SEND requests, where each SEND request contains one chunk of the overall message. Long chunks may be interrupted in mid-transmission to ensure fairness across shared transport connections. To support this, MSRP uses a boundary-based framing mechanism.
This chunking mechanism allows a sender to interrupt a chunk part of the way through sending it. The ability to interrupt messages allows multiple sessions to share a TCP connection, and for large messages to be sent efficiently while not blocking other messages that share the same connection, or even the same MSRP session.
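The framing can be sketched as follows. This is a simplified illustration, not a complete MSRP implementation: it splits a message into fixed-size chunks rather than interrupting a chunk on demand, the helper name `msrp_chunks` is an assumption, and the transaction identifiers and URIs are placeholders (real transaction identifiers should be random). Each chunk carries a Byte-Range header giving its position in the whole message, and the closing boundary line ends with `+` when more chunks follow or `$` when the message is complete.

```python
def msrp_chunks(message, chunk_size, to_path, from_path, message_id,
                content_type="text/plain"):
    """Yield one framed SEND request per chunk of the message (simplified)."""
    total = len(message)
    for index, start in enumerate(range(0, total, chunk_size)):
        piece = message[start:start + chunk_size]
        end = start + len(piece)
        flag = "$" if end == total else "+"  # "$" = last chunk, "+" = more to come
        tid = f"tx{index}"                   # placeholder transaction id
        head = (
            f"MSRP {tid} SEND\r\n"
            f"To-Path: {to_path}\r\n"
            f"From-Path: {from_path}\r\n"
            f"Message-ID: {message_id}\r\n"
            f"Byte-Range: {start + 1}-{end}/{total}\r\n"  # byte positions count from 1
            f"Content-Type: {content_type}\r\n\r\n"
        )
        end_line = f"\r\n-------{tid}{flag}\r\n"  # boundary: 7 hyphens, id, flag
        yield head.encode("ascii") + piece + end_line.encode("ascii")


# Placeholder URIs and message id, used only for this example.
chunks = list(msrp_chunks(
    message=b"Hey Bob, are you there?",
    chunk_size=10,
    to_path="msrp://biloxi.example.com:12763/kjhd37s2s20w2a;tcp",
    from_path="msrp://atlanta.example.com:7654/jshA7weztas;tcp",
    message_id="87652491",
))
for chunk in chunks:
    print(chunk.decode("ascii"))
```

Because every chunk repeats the Message-ID and states its own byte range, the receiver can reassemble the message even when chunks from other messages are interleaved on the same connection.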
Presence information is a status indicator that conveys the ability and willingness of a user to communicate. A user's client provides presence information (presence state) over a network connection to a presence service. The service stores it in the user's personal availability record and can distribute it to other users (called watchers) to convey the user's availability for communication. The user whose presence is being conveyed is called a presentity.
The Presence Architecture as defined by SIMPLE can be described by the following diagram:
A brief description of the different entities is given below:
- Presence source: The presence source (or presentity) is an entity that can provide presence information to a presence server. The presence source can be a user or any entity in the network.
- Presence Server: The presence server is the network entity that stores the presence information published by a presence source and provides the presence information to the watchers.
- Watcher: A watcher is an entity that requests information about a presentity.
- Resource List Server: A resource list server manages presence lists on behalf of the users.
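The relationships between these entities can be sketched as a small in-memory model. The class and method names here are illustrative assumptions, and the model leaves out everything SIP-specific (in SIMPLE these interactions are carried by SIP requests, with subscription lifetimes and authorization). It only shows who stores state and who gets notified.

```python
from collections import defaultdict


class PresenceServer:
    """Stores the latest state per presentity and notifies its watchers."""

    def __init__(self):
        self._state = {}                    # presentity -> latest published status
        self._watchers = defaultdict(list)  # presentity -> watcher callbacks

    def publish(self, presentity, status):
        """Called by a presence source when its state changes."""
        self._state[presentity] = status
        for notify in self._watchers[presentity]:
            notify(presentity, status)

    def subscribe(self, presentity, notify):
        """Register a watcher and send it the current state, if any."""
        self._watchers[presentity].append(notify)
        if presentity in self._state:
            notify(presentity, self._state[presentity])


class ResourceListServer:
    """Expands one subscription to a named list into per-presentity subscriptions."""

    def __init__(self, presence_server):
        self._server = presence_server
        self._lists = {}

    def define_list(self, name, presentities):
        self._lists[name] = list(presentities)

    def subscribe(self, name, notify):
        for presentity in self._lists[name]:
            self._server.subscribe(presentity, notify)


# Hypothetical users; the watcher simply records every notification it receives.
seen = []
server = PresenceServer()
rls = ResourceListServer(server)
rls.define_list("friends", ["sip:alice@example.com", "sip:bob@example.com"])

server.publish("sip:alice@example.com", "open")
rls.subscribe("friends", lambda who, status: seen.append((who, status)))
server.publish("sip:bob@example.com", "closed")
print(seen)
```

The resource list server lets a watcher follow a whole buddy list with a single subscription instead of one subscription per contact.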
In this section we will discuss some of the key strengths and weaknesses of SIMPLE:
1. The SIMPLE protocol is also standardized by the IETF. Hence there would be a greater incentive for its deployment.
2. SIMPLE defines both a session-oriented and a pager mode of operation. This makes the protocol flexible, and each application can choose which variant to implement based on the end-user requirements.
3. The distributed nature of the servers makes the protocol less prone to a single point of failure.
1. The presence data in SIMPLE is XML based, which adds a lot of messaging overhead and could potentially consume significant network bandwidth. This can be mitigated by capping the rate at which user equipment (UE) can publish its data and the rate at which the servers can generate notifications.
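One way such a cap could work is sketched below. This is an assumption-laden illustration rather than a mechanism taken from the SIMPLE specifications: the class name, the minimum-interval policy and the choice to keep only the newest held-back state are all design choices made for the example. Time is passed in explicitly so the behaviour is easy to follow.

```python
class NotificationThrottle:
    """Allow at most one notification per min_interval seconds (illustrative policy)."""

    def __init__(self, min_interval):
        self.min_interval = min_interval
        self._last_sent = None  # time of the most recent notification
        self._pending = None    # newest state held back, if any

    def offer(self, state, now):
        """Return the state if it may be sent now, else hold it and return None."""
        if self._last_sent is None or now - self._last_sent >= self.min_interval:
            self._last_sent = now
            self._pending = None
            return state
        self._pending = state  # overwrite: only the latest state matters
        return None

    def flush(self, now):
        """Return the held-back state once the interval has elapsed, else None."""
        if self._pending is not None and now - self._last_sent >= self.min_interval:
            state, self._pending = self._pending, None
            self._last_sent = now
            return state
        return None


throttle = NotificationThrottle(min_interval=5.0)
results = [
    throttle.offer("online", now=0.0),  # sent immediately
    throttle.offer("away", now=1.0),    # held back
    throttle.offer("busy", now=2.0),    # held back, replaces "away"
    throttle.flush(now=4.0),            # still too early
    throttle.flush(now=5.0),            # interval elapsed, "busy" goes out
]
print(results)
```

Intermediate states that are overwritten before the interval elapses are never transmitted, which is where the bandwidth saving comes from; the trade-off is that watchers may see a state change a few seconds late.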
Comparison of features in XMPP and SIMPLE
With XMPP and SIMPLE both offering similar capabilities, the choice of which protocol to use depends on the features offered by each. A good comparison of the features can be found at http://xmpp.org/about/xmpp-simple.shtml
Proprietary IMP Protocols
Besides the SIMPLE and XMPP protocols, there are a number of proprietary protocols as well. Following is a brief list of the companies and products that have their own IMP protocols.
- Microsoft Office Communicator
- IBM Sametime
- Yahoo IM
- AOL IM
The main reason for each company having their own proprietary protocols is control of the customer. Each company wants to lock their customers into a proprietary protocol which does not inter-work with other protocols, so that users are forced to use the client and servers of that company or service. Hence there is no incentive for them to support the open protocols like XMPP or SIMPLE.
XMPP and SIMPLE provide a similar set of functionality for Presence and IM. Following is a summary of what XMPP and SIMPLE have to offer.
Advantages of using SIMPLE:
- VoIP today mostly uses SIP as the session setup protocol. Hence when deploying SIMPLE, there is no need to develop a new protocol – SIMPLE can use the base SIP software stack and build upon it.
- Since SIMPLE is built on top of SIP, it can enjoy all of the features that SIP provides including authorization, authentication and even compression (SigComp).
- SIMPLE has already been selected for IMS as well. Hence going forward service providers will be using SIMPLE - this includes mobile, cable and fixed networks.
Advantages of using XMPP:
- XMPP is a lightweight protocol: only a small set of RFCs needs to be implemented for a functional XMPP system. SIMPLE, by contrast, builds on SIP, so the number of RFCs to be supported is large, and it may be expensive to deploy these stacks on thin clients like mobile devices.
- XMPP has been around for a long time and is used by a number of IM and presence applications. Hence the penetration of XMPP in the market is higher. For example, Google uses XMPP in the GTalk application.
- Scale: XMPP based systems can scale better than SIMPLE since XMPP is more lightweight.
- Additional features can easily be added via extensions to the protocol, which makes it flexible.
Considering the points discussed above, both XMPP and SIMPLE will continue to be deployed in the market. There seems to be no clear winner – both will co-exist depending on which application or service is being deployed.
Note that there are also a number of proprietary protocols in use and federation mechanisms will be used in the market to interwork between these variants of IMP protocols. Federation is the process of interworking between the different IM standards, so that users of different IM systems or service providers are able to communicate with each other. This typically involves the use of a gateway server which can convert from one IM format to another. Some examples of federation can be found at the following: