
Oct 20, 2009

XMPP and SIMPLE: A Comparative Study

Instant Messaging and Presence (IMP) is a collection of technologies that enable near real-time text-based communication between two or more participants over the internet or an internal network (intranet). This paper discusses the two major protocols in use for Instant Messaging and Presence: XMPP and SIMPLE. We will also look at the landscape of proprietary IMP protocols currently used in the market.

Introduction

A number of protocols have been defined for use in Instant Messaging and Presence applications. Some of them are proprietary, while others are open standards.

This paper explores two prominent standards defined for Instant Messaging and Presence applications, XMPP and SIMPLE. We will also explore the usage of these protocols versus the other proprietary protocols.

Instant Messaging refers to the exchange of text and multimedia messages between two or more parties in near real-time over public or private networks. IM is differentiated from email in that email is asynchronous in nature.

IM applications are also integrated with the Presence information of a user. Presence conveys the ability and willingness of a user to communicate across a set of devices. The presence information of a user need not be limited to simple "on-line" and "off-line" status indications. It can be rich and convey mood, privacy and many other attributes, as defined for example in RFC 4480 “RPID: Rich Presence Extensions to the Presence Information Data Format”.

In the next few sections we will explore two of the open standards that have been defined for IM and Presence applications: XMPP and SIMPLE.

XMPP: Extensible Messaging and Presence Protocol

Extensible Messaging and Presence Protocol (XMPP) is an open, XML-based protocol originally aimed at near-real-time, extensible instant messaging (IM) and presence information (e.g., buddy lists), but now expanded into the broader realm of message oriented middleware. It remains the core protocol of the Jabber Instant Messaging and Presence technology. Built to be extensible, the protocol has been extended with features such as Voice over Internet Protocol and file transfer signalling.

History of XMPP

The core technology behind XMPP was invented by Jeremie Miller in 1998, refined in the Jabber open-source community in 1999 and 2000, and formalized by the IETF in 2002 and 2003, resulting in publication of the XMPP RFCs in 2004.

More information can be found at http://xmpp.org/about/history.shtml

XMPP Architecture

XMPP uses a distributed architecture and is generally implemented in a client-server fashion. A client accesses a server over a TCP connection, and servers also communicate with each other over TCP connections.

Following are the XMPP components:

  • Server

The server manages connections, in the form of XML streams, to and from authorized clients, servers and other entities.

  • Client

The client is the end-user entity that initiates streams to a server.

  • Gateway

A gateway is a special-purpose server-side service whose primary function is to translate XMPP into the protocol used by a foreign (non-XMPP) messaging system, as well as to translate the return data back into XMPP. Examples are gateways to email, Internet Relay Chat, SIMPLE, Short Message Service, and legacy instant messaging services such as AIM, ICQ, MSN Messenger, and Yahoo! Instant Messenger.

The figure below depicts the XMPP high level architecture.

[Figure: XMPP high-level architecture]

XMPP Streams and Stanzas

XMPP has two fundamental concepts for the quick, asynchronous exchange of relatively small payloads of structured information between presence-aware entities: XML streams and XML stanzas. These terms are defined as follows:

  • Stream

An XML stream is a container for the exchange of XML elements between any two entities over a network. The start of an XML stream is denoted unambiguously by an opening XML <stream> tag (with appropriate attributes and namespace declarations), while the end of the XML stream is denoted unambiguously by a closing XML </stream> tag.

During the life of the stream, the entity that initiated it can send an unbounded number of XML elements over the stream. The "initial stream" is negotiated from the initiating entity (usually a client or server) to the receiving entity (usually a server), and can be seen as corresponding to the initiating entity's "session" with the receiving entity. The initial stream enables unidirectional communication from the initiating entity to the receiving entity; in order to enable information exchange from the receiving entity to the initiating entity, the receiving entity MUST negotiate a stream in the opposite direction (the "response stream").

  • Stanza

An XML stanza is a discrete semantic unit of structured information that is sent from one entity to another over an XML stream. The start of any XML stanza is denoted unambiguously by the element start tag at depth=1 of the XML stream (e.g., <presence>), and the end of any XML stanza is denoted unambiguously by the corresponding close tag at depth=1 (e.g., </presence>). An XML stanza may also contain child elements (with accompanying attributes, elements, and XML character data) as necessary in order to convey the desired information. The XMPP standard defines three XML stanzas: <message/>, <presence/>, and <iq/>.

In essence, then, an XML stream acts as an envelope for all the XML stanzas sent during a session. This can be represented as follows:

|--------------------|
| <stream> |
|--------------------|
| <presence> |
| <show/> |
| </presence> |
|--------------------|
| <message to='foo'> |
| <body/> |
| </message> |
|--------------------|
| <iq to='bar'> |
| <query/> |
| </iq> |
|--------------------|
| ... |
|--------------------|
| </stream> |
|--------------------|
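The envelope idea above can be sketched in a few lines of Python. This is only an illustration, not a real XMPP parser: it uses the simplified, namespace-free <stream> element from the diagram, and the function name iter_stanzas and the sample chunks are made up for the example. It feeds the stream to an incremental XML parser and emits every element that closes at depth 1 as a stanza, without waiting for </stream>.

```python
import xml.etree.ElementTree as ET


def iter_stanzas(chunks):
    """Yield each depth-1 child of the stream root as soon as it closes."""
    parser = ET.XMLPullParser(events=("start", "end"))
    depth = 0  # 0 = outside the stream, 1 = inside <stream>, 2+ = inside a stanza
    for chunk in chunks:
        parser.feed(chunk)
        for event, elem in parser.read_events():
            if event == "start":
                depth += 1
            else:
                depth -= 1
                if depth == 1:
                    # An element just closed directly under <stream>: a stanza.
                    yield elem


# Data arrives in pieces, as it would from a TCP socket; the stream is still open.
chunks = [
    "<stream><presence><show/>",
    "</presence><message to='foo'><body/></message>",
    "<iq to='bar'><query/></iq>",
]

stanzas = [(s.tag, s.get("to")) for s in iter_stanzas(chunks)]
print(stanzas)
```

Each stanza is handed over as soon as its closing tag is seen, which is why the stream can stay open for the whole session.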

Features of XMPP

In this section we will discuss some of the key strengths and weaknesses of XMPP.

Strengths:

1. The XMPP protocol is standardized. Proprietary protocols, by contrast, are not standardized and work only with the clients and servers that implement that particular protocol.

2. The chief difference between XMPP and the proprietary IM networks is that XMPP is decentralized. Vendors of proprietary protocols control not only the specification but the exchange of messages as well. For example, every AIM message is sent to an AOL server before it is sent to the recipient.

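3. XMPP is designed to be extensible; new feature sets can be added without breaking the existing protocol. Extensions are managed through an open standards process at the XMPP Standards Foundation (XSF, formerly the Jabber Software Foundation, JSF) and published as XMPP Extension Protocols (XEPs, formerly Jabber Enhancement Proposals, JEPs).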

4. XMPP allows interworking with proprietary protocols like AOL and ICQ using a federation mechanism implemented via gateways.

Weaknesses:

1. Presence data overhead: A high percentage of XMPP inter-server traffic is presence data and a lot of the data is redundantly transmitted. Hence XMPP currently has a large overhead in delivering presence data to multiple recipients.

2. No binary data: Because an XMPP stream is encoded as a single long XML document, it cannot carry unmodified binary data. File transfers therefore use external protocols like HTTP. Where this cannot be avoided, XMPP also provides in-band file transfers by encoding all data using base64. Other binary data, like encrypted conversations or graphic icons, is embedded using the same method.
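The cost of the base64 approach can be estimated with a short Python sketch. The payload below is arbitrary sample data chosen for the example. Base64 maps every 3 bytes of input to 4 ASCII characters, so an in-band transfer grows by roughly one third, before any XML wrapping is added.

```python
import base64

# 3072 bytes of arbitrary binary data, including bytes that are illegal in XML.
payload = bytes(range(256)) * 12

encoded = base64.b64encode(payload)
overhead = len(encoded) / len(payload)

print(len(payload), len(encoded), round(overhead, 3))

# The encoded form uses only A-Z, a-z, 0-9, '+', '/' and '=',
# so it can sit inside an XML element without escaping.
xml_safe = not (set(encoded) & set(b"<>&"))

# The receiver recovers the original bytes unchanged.
restored = base64.b64decode(encoded)
```

The one-third size increase comes on top of the stanza markup, which is one reason out-of-band transfer is preferred for large files.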

Market deployments

Based on data gathered by the IMtrends search engine, the following is an estimate of the deployment percentages for XMPP-based servers.

[Figure: estimated deployment percentages for XMPP-based servers]

Source: http://www.process-one.net/en/imtrends/article/usage_estimation_of_public_xmpp_servers_per_domain/

SIMPLE

SIMPLE stands for Session Initiation Protocol for Instant Messaging and Presence Leveraging Extensions. It is an instant messaging (IM) and presence protocol suite based on the Session Initiation Protocol (SIP) and managed by the IETF. As the name suggests, it is designed for presence and instant messaging. Although it is also XML based (with XML components taken from XMPP for that matter), it is piggybacked on SIP as an event package mechanism.

Like XMPP, and in contrast to the vast majority of IM and presence protocols used by software deployed today, SIMPLE is an open standard.

History of SIMPLE

SIMPLE is a protocol suite produced by the IETF SIMPLE Working Group. This working group focuses on the application of the Session Initiation Protocol (SIP, RFC 3261) to the suite of services collectively known as instant messaging and presence (IMP). The SIMPLE WG has produced a number of RFCs to address the requirements of IMP.

The primary focus of the group is on the following:

  • Proposed standard SIP extensions documenting the transport of Instant Messages in SIP, compliant to the requirements for IM outlined in RFC 2779 and CPIM.
  • Proposed standard SIP event packages and any related protocol mechanisms used to support presence, compliant to the requirements for presence outlined in RFC 2779 and CPIM.
  • Architecture for the implementation of a traditional buddy list based instant messaging and presence application with SIP.

Further information about the SIMPLE WG can be found at:

http://www.ietf.org/dyn/wg/charter/simple-charter.html

SIMPLE Architecture

The SIMPLE architecture is also a distributed architecture with clients, servers and gateways. We will look at the IM and Presence related aspects of the architecture.

Instant Messaging

Two RFCs are defined for Instant Messaging: RFC 3428 “SIP Extension for Instant Messaging” and RFC 4975 “Message Session Relay Protocol (MSRP)”.

SIP Extension for Instant Messaging (RFC 3428)

This RFC defines the use of the MESSAGE method to exchange instant messages between peer entities.

The MESSAGE method sends instant messages in what is known as pager mode. There is no explicit association between messages: each IM stands alone and is not tied to the other messages exchanged between the two entities. In this sense the concept of a “conversation” exists only in the client user interface. This contrasts with a “session” model, where there is an explicit conversation with a clear beginning and end.

When one user wants to send an instant message to another, the sender generates a SIP request using the MESSAGE method. The Request-URI of this request will normally be the "address of record" for the recipient of the instant message, but it may be a device address in situations where the client has current information about the recipient's location. For example, the client could be coupled with a presence system that supplies an up to date device contact for a given address of record. The body of the request will contain the message to be delivered. This body can be of any MIME type, including message/cpim.
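As a rough illustration of the paragraph above, the following Python sketch assembles a pager-mode MESSAGE request as raw bytes. It is not a SIP stack: the function name build_sip_message, the example.com addresses, and the fixed branch, tag and Call-ID values are placeholders invented for the example, and a real user agent would generate unique values and handle the response.

```python
def build_sip_message(sender, recipient, text, *, host="client.example.com",
                      branch="z9hG4bK-demo1", tag="demo-tag",
                      call_id="demo-call-1"):
    """Return a SIP MESSAGE request whose body is the instant message."""
    body = text.encode("utf-8")
    headers = [
        # The Request-URI is normally the recipient's address of record.
        f"MESSAGE {recipient} SIP/2.0",
        f"Via: SIP/2.0/TCP {host};branch={branch}",
        "Max-Forwards: 70",
        f"From: <{sender}>;tag={tag}",
        f"To: <{recipient}>",
        f"Call-ID: {call_id}@{host}",
        "CSeq: 1 MESSAGE",
        # Any MIME type is allowed; plain text keeps the example short.
        "Content-Type: text/plain",
        f"Content-Length: {len(body)}",
    ]
    # Header lines end with CRLF; an empty line separates headers from the body.
    return ("\r\n".join(headers) + "\r\n\r\n").encode("ascii") + body


request = build_sip_message(
    "sip:alice@example.com", "sip:bob@example.com", "Watson, come here."
)
print(request.decode("utf-8"))
```

Nothing in the request refers to an earlier or later message, which is what makes this mode sessionless.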

Message Session Relay Protocol (RFC 4975)

The previous section describes page-mode messaging. Session-mode messaging, however, has a number of benefits over page-mode messaging, such as explicit rendezvous, tighter integration with other media types, direct client-to-client operation, and brokered privacy and security.

RFC 4975 defines a session-oriented instant message transport protocol called the Message Session Relay Protocol (MSRP), whose sessions can be negotiated with an offer/answer exchange using the Session Description Protocol (SDP). The exchange is carried by a signaling protocol such as SIP. This allows a messaging session to be one of the possible media types in a session.

MSRP sessions are typically arranged using SIP the same way a session of audio or video media is set up. One SIP user agent A sends the other B a SIP invitation containing an offered session description that includes a session of MSRP. The receiving SIP user agent can accept the invitation and include an answer session description that acknowledges the choice of media. A's session description contains an MSRP URI that describes where A is willing to receive MSRP requests from B and vice versa.

MSRP defines two request types. SEND requests are used to deliver a complete message or a chunk (a portion of a complete message), while REPORT requests report on the status of a previously sent message, or a range of bytes inside a message.

Messages sent using MSRP can be very large and can be delivered in several SEND requests, where each SEND request contains one chunk of the overall message. Long chunks may be interrupted in mid-transmission to ensure fairness across shared transport connections. To support this, MSRP uses a boundary-based framing mechanism.

This chunking mechanism allows a sender to interrupt a chunk part of the way through sending it. The ability to interrupt messages allows multiple sessions to share a TCP connection, and for large messages to be sent efficiently while not blocking other messages that share the same connection, or even the same MSRP session.
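The chunking and framing described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not an MSRP implementation: the function name msrp_send_chunks, the example paths, the message ID and the transaction IDs (tx0000, tx0001, ...) are invented for the example. A real sender must also pick transaction IDs so that the end-line never appears inside the chunk body. Each request carries a Byte-Range header giving its position in the whole message, and closes with an end-line of seven hyphens, the transaction ID, and a flag: "+" if more chunks follow, "$" for the last one.

```python
def msrp_send_chunks(message, chunk_size, to_path, from_path, message_id):
    """Split one message into MSRP SEND requests of at most chunk_size bytes."""
    total = len(message)
    requests = []
    for index, start in enumerate(range(0, total, chunk_size)):
        chunk = message[start:start + chunk_size]
        end = start + len(chunk)
        tid = f"tx{index:04d}"  # placeholder transaction ID
        head = (
            f"MSRP {tid} SEND\r\n"
            f"To-Path: {to_path}\r\n"
            f"From-Path: {from_path}\r\n"
            f"Message-ID: {message_id}\r\n"
            # Byte positions are 1-based and inclusive.
            f"Byte-Range: {start + 1}-{end}/{total}\r\n"
            "Content-Type: text/plain\r\n"
            "\r\n"
        ).encode("ascii")
        flag = b"$" if end == total else b"+"
        end_line = b"\r\n-------" + tid.encode("ascii") + flag + b"\r\n"
        requests.append(head + chunk + end_line)
    return requests


chunks = msrp_send_chunks(
    b"abcdefghij", 4,
    "msrp://bob.example.com:8888/s2;tcp",
    "msrp://alice.example.com:7777/s1;tcp",
    "msg1",
)
for request in chunks:
    print(request.decode("ascii"))
```

Because every chunk names its own byte range and the shared Message-ID, the receiver can reassemble the message even when chunks from other messages are interleaved on the same connection.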

Presence

Presence information is a status indicator that conveys the ability and willingness of a user to communicate. A user's client provides presence information (presence state) over a network connection to a presence service. The service stores it in the user's personal availability record and can distribute it to other users (called watchers) to convey the user's availability for communication. The user whose presence is published is called a presentity.

The Presence Architecture as defined by SIMPLE can be described by the following diagram:

[Figure: SIMPLE presence architecture]

A brief description of the different entities is given below:

  • Presence source: The presence source (or presentity) is an entity that can provide presence information to a presence server. The presence source can be a user or any entity in the network.
  • Presence Server: The presence server is the network entity that stores the presence information published by a presence source and provides the presence information to the watchers.
  • Watcher: A watcher is an entity that requests information about a presentity.
  • Resource List Server: A resource list server manages presence lists on behalf of the users.
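The interaction between these entities can be modelled with a small in-memory Python sketch. It leaves out SIP entirely: in SIMPLE the same roles are carried out with the SIP PUBLISH, SUBSCRIBE and NOTIFY methods, whereas here they are plain method calls. The class name PresenceServer, the callback signature and the status strings are assumptions made for the example.

```python
from collections import defaultdict


class PresenceServer:
    """Stores published presence and notifies the watchers of each presentity."""

    def __init__(self):
        self._state = {}                    # presentity -> latest status
        self._watchers = defaultdict(list)  # presentity -> notify callbacks

    def subscribe(self, presentity, notify):
        """Register a watcher and send it the current state, if any."""
        self._watchers[presentity].append(notify)
        if presentity in self._state:
            notify(presentity, self._state[presentity])

    def publish(self, presentity, status):
        """Called by a presence source; fans the new state out to watchers."""
        self._state[presentity] = status
        for notify in self._watchers[presentity]:
            notify(presentity, status)


server = PresenceServer()
seen = []

server.publish("sip:alice@example.com", "open")
server.subscribe("sip:alice@example.com",
                 lambda who, status: seen.append((who, status)))
server.publish("sip:alice@example.com", "closed")

print(seen)
```

The watcher first receives the state that was stored before it subscribed, then every later change, which mirrors the store-and-distribute role of the presence server described above.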

Strengths and weaknesses of SIMPLE

In this section we will discuss some of the key strengths and weaknesses of SIMPLE:

Strengths:

1. The SIMPLE protocol is also standardized by the IETF, which gives vendors a greater incentive to deploy it.

2. SIMPLE defines both a session-oriented and a pager mode of operation. This makes the protocol flexible, and each application can choose which variant to implement based on the end-user requirements.

3. The distributed nature of the servers makes the protocol less prone to a single point of failure.

Weaknesses:

1. Because the presence data in SIMPLE is XML based, it adds a lot of messaging overhead and could potentially consume significant network bandwidth. This can be mitigated by capping the rate at which the user equipment (UE) can publish its data and at which the servers can generate notifications.
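One way to picture such a rate cap is the Python sketch below. It is an illustrative assumption, not a mechanism taken from the SIMPLE specifications: the class name NotificationThrottle and its offer and flush methods are invented for the example. The throttle sends a notification at most once per interval and, while waiting, keeps only the most recent state, so intermediate changes are never transmitted.

```python
class NotificationThrottle:
    """Allows at most one notification per min_interval seconds per watcher."""

    def __init__(self, min_interval):
        self.min_interval = min_interval
        self._last_sent = None  # time of the last notification actually sent
        self._pending = None    # newest state held back by the cap

    def offer(self, state, now):
        """Return the state to send now, or None if it has to wait."""
        if self._last_sent is None or now - self._last_sent >= self.min_interval:
            self._last_sent = now
            self._pending = None
            return state
        self._pending = state  # overwrite any older held-back state
        return None

    def flush(self, now):
        """Return the held-back state once the interval has elapsed."""
        if self._pending is not None and now - self._last_sent >= self.min_interval:
            state, self._pending = self._pending, None
            self._last_sent = now
            return state
        return None


throttle = NotificationThrottle(min_interval=5.0)
sent = [
    throttle.offer("away", now=0.0),    # sent immediately
    throttle.offer("busy", now=1.0),    # held back
    throttle.offer("online", now=2.0),  # replaces "busy"
    throttle.flush(now=4.0),            # still too early
    throttle.flush(now=5.0),            # sends "online"
]
print(sent)
```

Three state changes in two seconds result in only two notifications, at the price of watchers seeing the latest state a few seconds late.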

Comparison of features in XMPP and SIMPLE

Since XMPP and SIMPLE both offer similar capabilities, the choice of protocol depends on the features offered by each. A good comparison of the features can be found at http://xmpp.org/about/xmpp-simple.shtml

Proprietary IMP Protocols

Besides the SIMPLE and XMPP protocols, there are a number of proprietary protocols as well. The following is a brief list of companies and products that have their own IMP protocols.

  • Microsoft Office Communicator
  • IBM Sametime
  • Yahoo IM
  • AOL IM
  • Skype

The main reason for each company having their own proprietary protocols is control of the customer. Each company wants to lock their customers into a proprietary protocol which does not inter-work with other protocols, so that users are forced to use the client and servers of that company or service. Hence there is no incentive for them to support the open protocols like XMPP or SIMPLE.

Conclusion

XMPP and SIMPLE provide a similar set of functionality for Presence and IM. Following is a summary of what XMPP and SIMPLE have to offer.

Advantages of using SIMPLE:

  • VoIP today mostly uses SIP as the session setup protocol. Hence when deploying SIMPLE, there is no need to develop a new protocol – SIMPLE can use the base SIP software stack and build upon it.
  • Since SIMPLE is built on top of SIP, it can enjoy all of the features that SIP provides including authorization, authentication and even compression (SigComp).
  • SIMPLE has already been selected for IMS as well. Hence going forward service providers will be using SIMPLE - this includes mobile, cable and fixed networks.

Advantages of using XMPP:

  • XMPP is a lightweight protocol: only a small set of RFCs needs to be implemented for a functional XMPP system. Since SIMPLE uses SIP, by contrast, the number of RFCs to be supported is large, and it may be expensive to deploy these stacks on thin clients like mobile devices.
  • XMPP has been around for a long time and is used by a number of IM and presence applications. Hence the penetration of XMPP in the market is higher. For example, Google uses XMPP in the GTalk application.
  • Scale: XMPP-based systems can scale better than SIMPLE since XMPP is more lightweight.
  • Additional features can easily be added via extensions to the protocol, which makes it flexible.

Considering the points discussed above, both XMPP and SIMPLE will continue to be deployed in the market. There seems to be no clear winner – both will co-exist depending on which application or service is being deployed.

Note that there are also a number of proprietary protocols in use and federation mechanisms will be used in the market to interwork between these variants of IMP protocols. Federation is the process of interworking between the different IM standards, so that users of different IM systems or service providers are able to communicate with each other. This typically involves the use of a gateway server which can convert from one IM format to another. Some examples of federation can be found at the following:

XMPP and SIMPLE: A Comparative Study

Instant messaging and Presence (IMP) is a collection of technologies that create the possibility of near real-time text-based communication between two or more participants over the internet or some form of internal network/intranet. This paper discusses the two major protocols in use for Instant Messaging and Presence: XMPP and SIMPLE. We will also look at the landscape of proprietary IMP protocols currently used in the market as well.

Introduction

There are a number of protocols that have been defined for use in Instant Messaging and Presence applications. Some of them are proprietary and there are also some standardized protocols defined for the same.

This paper explores two prominent standards defined for Instant Messaging and Presence applications, XMPP and SIMPLE. We will also explore the usage of these protocols versus the other proprietary protocols.

Instant Messaging refers to the exchange of text and multimedia messages between two or more parties in near real-time over pubic or private networks. IM is differentiated from email in that email is asynchronous in nature.

IM applications are also integrated with the Presence information of a user. Presence conveys the ability and willingness of a user to communicate across a set of devices. The presence information of a user need not be limited to simple "on-line" and "off-line" status indications. The presence information can be rich and convey mood and privacy and many other attributes as defined for example in RFC 4480 “RPID: Rich Presence Extensions to the Presence Information Data Format”

In the next few sections we will explore two of the open standards that have been defined for IM and Presence applications: XMPP and SIMPLE

XMPP: Extensible Messaging and Presence Protocol

Extensible Messaging and Presence Protocol (XMPP) is an open, XML-based protocol originally aimed at near-real-time, extensible instant messaging (IM) and presence information (e.g., buddy lists), but now expanded into the broader realm of message oriented middleware. It remains the core protocol of the Jabber Instant Messaging and Presence technology. Built to be extensible, the protocol has been extended with features such as Voice over Internet Protocol and file transfer signalling.

History of XMPP

The core technology behind XMPP was invented by Jeremie Miller in 1998, refined in the Jabber open-source community in 1999 and 2000, and formalized by the IETF in 2002 and 2003, resulting in publication of the XMPP RFCs in 2004

More information can be found at http://xmpp.org/about/history.shtml

XMPP Architecture

XMPP uses a distributed architecture. XMPP is generally implemented as client-server architecture. The client using XMPP accesses a server over a TCP connection, and servers also communicate with each other over TCP connections.

Following are the XMPP components:

  • Server

The server manages the connections from other entities in the form of streams to and from authorized clients, servers and other entities.

  • Client

The client is basically the end user which initiates the streams.

  • Gateway

A gateway is a special-purpose server-side service whose primary function is to translate XMPP into the protocol used by a foreign (non-XMPP) messaging system, as well as to translate the return data back into XMPP. Examples are gateways to email, Internet Relay Chat, SIMPLE, Short Message Service, and legacy instant messaging services such as AIM, ICQ, MSN Messenger, and Yahoo! Instant Messenger.

The figure below depicts the XMPP high level architecture.

clip_image002

XMPP Streams and Stanzas

XMPP has two fundamental concepts for the quick, asynchronous exchange of relatively small payloads of structured information between presence-aware entities: XML streams and XML stanzas. These terms are defined as follows:

  • Stream

An XML stream is a container for the exchange of XML elements between any two entities over a network. The start of an XML stream is denoted unambiguously by an opening XML <stream> tag (with appropriate attributes and namespace declarations), while the end of the XML stream is denoted unambiguously by a closing XML </stream> tag.

During the life of the stream, the entity that initiated it can send an unbounded number of XML elements over the stream. The "initial stream" is negotiated from the initiating entity (usually a client or server) to the receiving entity (usually a server), and can be seen as corresponding to the initiating entity's "session" with the receiving entity. The initial stream enables unidirectional communication from the initiating entity to the receiving entity; in order to enable information exchange from the receiving entity to the initiating entity, the receiving entity MUST negotiate a stream in the opposite direction (the "response stream").

  • Stanza

An XML stanza is a discrete semantic unit of structured information that is sent from one entity to another over an XML stream. The start of any XML stanza is denoted unambiguously by the element start tag at depth=1 of the XML stream (e.g., <presence>), and the end of any XML stanza is denoted unambiguously by the corresponding close tag at depth=1 (e.g., </presence>). An XML stanza may also contain child elements (with accompanying attributes, elements, and XML character data) as necessary in order to convey the desired information. The XMPP standard defines three XML stanzas: <message/>, <presence/>, and <iq/>.

In essence, then, an XML stream acts as an envelope for all the XML stanzas sent during a session. This can be represented as follows:

|--------------------|
| <stream> |
|--------------------|
| <presence> |
| <show/> |
| </presence> |
|--------------------|
| <message to='foo'> |
| <body/> |
| </message> |
|--------------------|
| <iq to='bar'> |
| <query/> |
| </iq> |
|--------------------|
| ... |
|--------------------|
| </stream> |
|--------------------|

Features of XMPP

In this section we will discuss some of the key strengths and weaknesses of XMPP

Strengths:

1. The XMPP protocol is standardized. However proprietary protocols used are not standardized, and work only with the clients and servers that implement that proprietary protocol.

2. The chief difference between XMPP and the proprietary IM networks is that XMPP is decentralized. However proprietary protocols control not only the specification, but the exchange of messages as well. For example, every AIM message is sent to an AOL server before it is sent to the recipient.

3. XMPP is designed to be extensible; new feature sets can be added without breaking the existing protocol. Extensions are managed through an open standards process at the JSF called Jabber Enhancement Proposals (JEP).

4. XMPP allows interworking with proprietary protocols like AOL and ICQ using a federation mechanism implemented via gateways.

Weaknesses:

1. Presence data overhead: A high percentage of XMPP inter-server traffic is presence data and a lot of the data is redundantly transmitted. Hence XMPP currently has a large overhead in delivering presence data to multiple recipients.

2. No binary data: The way XMPP is encoded as a single long XML document makes it impossible to deliver unmodified binary data. Therefore, file transfers use external protocols like HTTP. If unavoidable, XMPP also provides in-band file transfers by encoding all data using base64. Other binary data like encrypted conversations or graphic icons are embedded using the same method.

Market deployments

Based on data gathered by the IMtrends search engine following is an estimate on the deployment percentages for XMPP based servers.

image

Source: http://www.process-one.net/en/imtrends/article/usage_estimation_of_public_xmpp_servers_per_domain/

SIMPLE

SIMPLE, stands for SIP Instant Messaging and Presence Leveraging Extensions and is an instant messaging (IM) and presence protocol suite based on Session Initiation Protocol (SIP) managed by the IETF. As the name suggest, it is designed for presence and instant messaging. Although it is also XML based (with XML components taken from XMPP for that matter), it is piggybacked on SIP as an event package mechanism.

Like XMPP, and in contrast to the vast majority of IM and presence protocols used by software deployed today, SIMPLE is an open standard.

History of SIMPLE

Simple is a protocol produced by the IETF SIMPLE Working Group. This working group focuses on the application of the Session Initiation Protocol (SIP, RFC 3261) to the suite of services collectively known as instant messaging and presence (IMP). The SIMPLE WG has produced a bunch of RFCs to address the requirements of IMP.

The primary focus of the group in on the following:

  • Proposed standard SIP extensions documenting the transport of Instant Messages in SIP, compliant to the requirements for IM outlined in RFC 2779, CPIM
  • Proposed standard SIP event packages and any related protocol mechanisms used to support presence, compliant to the requirements for presence outlined in RFC 2779 and CPIM.
  • Architecture for the implementation of a traditional buddy list based instant messaging and presence application with SIP.

Further information about the SIMPLE WG can be found at:

http://www.ietf.org/dyn/wg/charter/simple-charter.html

SIMPLE Architecture

The SIMPLE architecture is also a distributed architecture with clients, servers and gateways. We will look at the IM and Presence related aspects of the architecture.

Instant Messaging

There are two RFCs defined for Instant Messaging RFC 3428 “SIP Extension for Instant Messaging” and RFC 4975 “Message Session Relay Protocol (MSRP)”

SIP Extension for Instant Messaging (RFC 3428)

This RFC defines the use of the MESSAGE method to exchange instant messages between peer entities.

The MESSAGE method for sending instant messages is similar to a pager mode. There is no explicit association between messages. Each IM is not associated with the other messages exchanged between the two entities. In this sense the concept of a “conversation” only exists in the client user interface. This can be contrasted this with a "session” model, where there is an explicit conversation with a clear beginning and end.

When one user wants to send an instant message to another, the sender generates a SIP request using the MESSAGE method. The Request-URI of this request will normally be the "address of record" for the recipient of the instant message, but it may be a device address in situations where the client has current information about the recipient's location. For example, the client could be coupled with a presence system that supplies an up to date device contact for a given address of record. The body of the request will contain the message to be delivered. This body can be of any MIME type, including message/cpim.

Message Session Relay Protocol (RFC 4957)

The previous section describes a page-mode messaging. Session-mode messaging has a number of benefits over page-mode messaging, however, such as explicit rendezvous, tighter integration with other media-types, direct client-to-client operation, and brokered privacy and security.

RFC 4975 defines a session-oriented instant message transport protocol called the Message Session Relay Protocol (MSRP), whose sessions can be negotiated with an offer or answer using the Session Description Protocol (SDP). The exchange is carried by a signaling protocol, like SIP. This allows a messaging session as one of the possible media-types in a session.

MSRP sessions are typically arranged using SIP the same way a session of audio or video media is set up. One SIP user agent A sends the other B a SIP invitation containing an offered session description that includes a session of MSRP. The receiving SIP user agent can accept the invitation and include an answer session description that acknowledges the choice of media. A's session description contains an MSRP URI that describes where A is willing to receive MSRP requests from B and vice versa.

MSRP defines two request types. SEND requests are used to deliver a complete message or a chunk (a portion of a complete message), while REPORT requests report on the status of a previously sent message, or a range of bytes inside a message.

Messages sent using MSRP can be very large and can be delivered in several SEND requests, where each SEND request contains one chunk of the overall message. Long chunks may be interrupted in mid- transmission to ensure fairness across shared transport connections. To support this, MSRP uses a boundary-based framing mechanism.

This chunking mechanism allows a sender to interrupt a chunk part of the way through sending it. The ability to interrupt messages allows multiple sessions to share a TCP connection, and for large messages to be sent efficiently while not blocking other messages that share the same connection, or even the same MSRP session.

Presence

Presence information is a status indicator that conveys ability and willingness of a user to communicate. A user's client provides presence information (presence state) via a network connection to a presence service, which is stored in what constitutes his personal availability record (called a presentity) and can be made available for distribution to other users (called watchers) to convey his availability for communication. There a whole bunch of

The Presence Architecture as defined by SIMPLE can be described by the following diagram:

clip_image006

A brief description of the different entities is given below:

  • Presence source: The presence source (or presentity) is an entity that can provide presence information to a presence server. The presence source can be a user or any entity in the network.
  • Presence Server: The presence server is the network entity that stores the presence information published by a presence source and provides the presence information to the watchers
  • Watcher: A watcher is an entity that requests information about a presentity
  • Resource List Server: A resource list server manages presence lists on behalf of the users
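
The interaction between these entities can be sketched as a small in-memory model. This is a hedged illustration only: PresenceServerSketch and its methods are hypothetical names, the callback stands in for the notifications a real server would send, and no SIP messages are involved.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;

/** Illustrative in-memory model of publish/subscribe/notify; not a SIP implementation. */
final class PresenceServerSketch {

    // Latest published state per presentity (for example "online" or "busy").
    private final Map<String, String> stateByPresentity = new HashMap<>();

    // Watchers registered per presentity; each callback receives (presentity, state).
    private final Map<String, List<BiConsumer<String, String>>> watchers = new HashMap<>();

    /** A watcher subscribes and immediately learns the current state, if one is known. */
    void subscribe(String presentity, BiConsumer<String, String> watcher) {
        watchers.computeIfAbsent(presentity, k -> new ArrayList<>()).add(watcher);
        String current = stateByPresentity.get(presentity);
        if (current != null) {
            watcher.accept(presentity, current);
        }
    }

    /** A presence source publishes new state; the server stores it and notifies watchers. */
    void publish(String presentity, String state) {
        stateByPresentity.put(presentity, state);
        List<BiConsumer<String, String>> targets =
                watchers.getOrDefault(presentity, Collections.emptyList());
        for (BiConsumer<String, String> watcher : targets) {
            watcher.accept(presentity, state);
        }
    }

    public static void main(String[] args) {
        PresenceServerSketch server = new PresenceServerSketch();
        server.publish("alice@example.com", "online");
        server.subscribe("alice@example.com",
                (who, state) -> System.out.println(who + " is " + state));
        server.publish("alice@example.com", "busy");
    }
}
```

A resource list server would sit in front of this model and fan a single subscription out to every presentity on a user's list; that part is omitted here.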

Strengths and weaknesses of SIMPLE

In this section we will discuss some of the key strengths and weaknesses of SIMPLE:

Strengths:

1. The SIMPLE protocol is also standardized by the IETF. Hence there would be a greater incentive for its deployment.

2. SIMPLE defines both a session-oriented and a pager mode of operation. This makes the protocol flexible, and each application can choose which variant to implement based on the end-user requirements.

3. The distributed nature of the servers makes the protocol less prone to a single point of failure.

Weaknesses:

1. The presence data in SIMPLE being XML based adds a lot of messaging overhead and could potentially consume significant network bandwidth. This can be mitigated by allowing a maximum upper rate at which the UE can publish their data and at which the servers can generate notifications.
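
The rate limit mentioned above can be sketched as a simple minimum-interval throttle. This is an assumption about one possible policy, not something SIMPLE prescribes in this form; the class PublishThrottle and its parameters are hypothetical. Timestamps are passed in explicitly so the behaviour is easy to follow.

```java
/** Illustrative throttle: allows at most one publication per minimum interval. */
final class PublishThrottle {

    private final long minIntervalMillis;
    private long lastAcceptedMillis;
    private boolean acceptedAny;

    PublishThrottle(long minIntervalMillis) {
        if (minIntervalMillis < 0) {
            throw new IllegalArgumentException("interval must not be negative");
        }
        this.minIntervalMillis = minIntervalMillis;
    }

    /** Returns true if a publication at nowMillis may be sent, false if it should be held back. */
    boolean tryPublish(long nowMillis) {
        if (acceptedAny && nowMillis - lastAcceptedMillis < minIntervalMillis) {
            return false;
        }
        acceptedAny = true;
        lastAcceptedMillis = nowMillis;
        return true;
    }

    public static void main(String[] args) {
        PublishThrottle throttle = new PublishThrottle(1000);
        long[] attempts = {0, 500, 1000, 1500};
        for (long t : attempts) {
            System.out.println("t=" + t + " allowed=" + throttle.tryPublish(t));
        }
    }
}
```

A real deployment would also need to decide what happens to suppressed updates, for example sending only the most recent state once the interval has elapsed.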

Comparison of features in XMPP and SIMPLE

With XMPP and SIMPLE both offering similar capabilities, the choice of which protocol to use depends on the features offered by each. A good comparison of the features can be found at http://xmpp.org/about/xmpp-simple.shtml

Proprietary IMP Protocols

Besides the SIMPLE and XMPP protocols, there are a number of proprietary protocols as well. The following is a brief list of companies and products that have their own IMP protocols.

  • Microsoft Office Communicator
  • IBM Sametime
  • Yahoo IM
  • AOL IM
  • Skype

The main reason for each company having their own proprietary protocols is control of the customer. Each company wants to lock their customers into a proprietary protocol which does not inter-work with other protocols, so that users are forced to use the client and servers of that company or service. Hence there is no incentive for them to support the open protocols like XMPP or SIMPLE.

Conclusion

XMPP and SIMPLE provide a similar set of functionality for Presence and IM. Following is a summary of what XMPP and SIMPLE have to offer.

Advantages of using SIMPLE:

  • VoIP today mostly uses SIP as the session setup protocol. Hence when deploying SIMPLE, there is no need to develop a new protocol – SIMPLE can use the base SIP software stack and build upon it.
  • Since SIMPLE is built on top of SIP, it can enjoy all of the features that SIP provides including authorization, authentication and even compression (SigComp).
  • SIMPLE has already been selected for IMS as well. Hence going forward service providers will be using SIMPLE - this includes mobile, cable and fixed networks.

Advantages of using XMPP:

  • XMPP is a lightweight protocol – there are a small set of RFCs that need to be implemented for a functional XMPP system. However, since SIMPLE uses SIP, the number of RFCs to be supported is large, and it may be expensive to deploy these stacks on thin clients like mobile devices
  • XMPP has been around for a long time and is used by a number of IM and presence applications. Hence the penetration of XMPP in the market is higher. For example, Google uses XMPP in the GTalk application
  • Scale: XMPP based systems can scale better than SIMPLE since XMPP is more lightweight
  • Additional features can easily be added via extensions to the protocol which makes it flexible

Considering the points discussed above, both XMPP and SIMPLE will continue to be deployed in the market. There seems to be no clear winner – both will co-exist depending on which application or service is being deployed.

Note that there are also a number of proprietary protocols in use and federation mechanisms will be used in the market to interwork between these variants of IMP protocols. Federation is the process of interworking between the different IM standards, so that users of different IM systems or service providers are able to communicate with each other. This typically involves the use of a gateway server which can convert from one IM format to another. Some examples of federation can be found at the following:

Oct 2, 2009

Code Access Security - .Net Vs. J2EE

Introduction

This article will act as an introduction to the concept of .Net’s Code Access Security and try to do a comparison between .Net’s Code Access Security and J2EE’s Code Access Security.

In .Net, the Common Language Runtime (CLR) allows code to perform only those operations that the code has permission to perform. So CAS is the CLR's security system that enforces security policies by preventing unauthorized access to protected resources and operations. Using the Code Access Security, you can do the following:
  • Restrict what your code can do
  • Restrict which code can call your code
  • Identify code

We will discuss each of these throughout this article. The following terms make up CAS.

Elements of Code Access Security

Code access security consists of the following elements:
  • Permissions
  • Permission sets
  • Code groups
  • Policy
Permissions
Permissions represent access to a protected resource or the ability to perform a protected operation. The .NET Framework provides several permission classes, like FileIOPermission (when working with files), UIPermission (permission to use a user interface), SecurityPermission (this is needed to execute the code and can even be used to bypass security) etc. I won't list all the permission classes here; they are listed below.

Permission sets

A permission set is a collection of permissions. You can put FileIOPermission and UIPermission into your own permission set and call it "My_PermissionSet". A permission set can include any number of permissions. FullTrust, LocalIntranet, Internet, Execution and Nothing are some of the built-in permission sets in the .NET Framework. FullTrust has all the permissions in the world, while Nothing has no permissions at all, not even the right to execute.
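
Since this article compares .NET with Java, the same idea can be sketched on the Java side, where java.security.Permissions plays the role of a heterogeneous collection of permissions. This is an analogy rather than .NET code; the class name PermissionSetSketch, the directory /tmp/reports, and the chosen permissions are illustrative assumptions.

```java
import java.io.FilePermission;
import java.security.Permissions;
import java.util.PropertyPermission;

/** Illustrative Java analogue of a custom permission set. */
final class PermissionSetSketch {

    /** Builds a small set: read access below one directory plus one system property. */
    static Permissions myPermissionSet() {
        Permissions set = new Permissions();
        // "/-" means all files below the directory, recursively.
        set.add(new FilePermission("/tmp/reports/-", "read"));
        set.add(new PropertyPermission("user.name", "read"));
        return set;
    }

    public static void main(String[] args) {
        Permissions set = myPermissionSet();
        // implies() answers whether the set covers a requested permission.
        System.out.println(set.implies(new FilePermission("/tmp/reports/q1.txt", "read")));
        System.out.println(set.implies(new FilePermission("/tmp/reports/q1.txt", "write")));
        System.out.println(set.implies(new PropertyPermission("user.home", "read")));
    }
}
```

The collection only describes what is granted; nothing is enforced until a security check asks whether a requested permission is implied by it.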

Code groups

A code group is a logical grouping of code that has a specified condition for membership. Code from http://www.somewebsite.com/ can belong to one code group, code containing a specific strong name can belong to another code group and code from a specific assembly can belong to another code group. There are built-in code groups like My_Computer_Zone, LocalIntranet_Zone, Internet_Zone etc. Like permission sets, we can create code groups to meet our requirements based on the evidence provided by the .NET Framework. Site, Strong Name, Zone, URL are some of the types of evidence.

Policy

Security policy is the configurable set of rules that the CLR follows when determining the permissions to grant to code. There are four policy levels - Enterprise, Machine, User and Application Domain, each operating independently from each other. Each level has its own code groups and permission sets. They have the hierarchy given below.



Functions of Code Access Security

Code Access Security performs the following functions:
  • Defines permissions and permission sets that represent the right to access various system resources.
  • Enables administrators to configure security policy by associating sets of permissions with groups of code (code groups).
  • Enables code to request the permissions it requires in order to run, as well as the permissions that would be useful to have, and specifies which permissions the code must never have.
  • Grants permissions to each assembly that is loaded, based on the permissions requested by the code and on the operations permitted by security policy.
  • Enables code to demand that its callers have specific permissions. Enables code to demand that its callers possess a digital signature, thus allowing only callers from a particular organization or site to call the protected code.
  • Enforces restrictions on code at run time by comparing the granted permissions of every caller on the call stack to the permissions that callers must have. There's a separate namespace - System.Security, which is dedicated for implementing security.
Security Namespace

These are the main classes in System.Security namespace:
Class | Description
CodeAccessPermission | Defines the underlying structure of all code access permissions.
PermissionSet | Represents a collection that can contain many different types of permissions.
SecurityException | The exception that is thrown when a security error is detected.


These are the main classes in System.Security.Permissions namespace:
Class | Description
EnvironmentPermission | Controls access to system and user environment variables.
FileDialogPermission | Controls the ability to access files or folders through a file dialog.
FileIOPermission | Controls the ability to access files and folders.
IsolatedStorageFilePermission | Specifies the allowed usage of a private virtual file system.
IsolatedStoragePermission | Represents access to generic isolated storage capabilities.
ReflectionPermission | Controls access to metadata through the System.Reflection APIs.
RegistryPermission | Controls the ability to access registry variables.
SecurityPermission | Describes a set of security permissions applied to code.
UIPermission | Controls the permissions related to user interfaces and the clipboard.

.Net Vs J2EE – A Comparison

Code Access Security: Permissions 


Code-access permissions represent authorization to access a protected resource or perform a dangerous operation, and form a foundation of CAS. They have to be explicitly requested from the caller either by the system or by application code, and their presence or absence determines the appropriate course of action.

Both Java and .NET supply an ample choice of permissions for a variety of system operations. The runtime systems carry out appropriate checks when a resource is accessed or an operation is requested. Additionally, both platforms provide the ability to augment those standard permission sets with custom permissions for protection of application-specific resources. Once developed, custom permissions have to be explicitly checked for (demanded) by the application's code, because the platform's libraries are not going to check for them.

.NET defines a richer selection here, providing permissions for role-based checks and evidence-based checks. An interesting feature of the latter is the family of Identity permissions, which are used to identify an assembly by one of its traits -- for instance, its strong name (StrongNameIdentityPermission). Also, some of its permissions reflect close binding between the .NET platform and the underlying Windows OS (EventLogPermission, RegistryPermission). IsolatedStoragePermission is unique to .NET, and it allows low-trust code (Internet controls, for instance) to save and load a persistent state without revealing details of a computer's file and directory structure.

Adding a custom code access permission requires several steps. Note that if a custom permission is not designed for code access, it will not trigger a stack walk.

The steps are:
  • Optionally, inherit from CodeAccessPermission (to trigger a stack walk).
  • Implement IPermission and IUnrestrictedPermission.
  • Optionally, implement ISerializable.
  • Implement XML encoding and decoding.
  • Optionally, add declarative security support through an Attribute class.
  • Add the new permission to CAS Policy by assigning it to a code group.
  • Make the permission's assembly trusted by .NET framework. 

.NET permissions are grouped into NamedPermissionSets. The platform includes the following non-modifiable built-in sets: Nothing, Execution, FullTrust, Internet, LocalIntranet, SkipVerification. The FullTrust set is a special case, as it declares that this code does not have any restrictions and passes any permission check, even for custom permissions. By default, all local code (found in the local computer directories) is granted this privilege.

The above fixed permission sets can be demanded instead of regular permissions:

[assembly:PermissionSetAttribute(
SecurityAction.RequestMinimum,
Name="LocalIntranet")]

In addition to those, custom permission sets may be defined, and a built-in Everything set can be modified. However, imperative code-access checks cannot be applied to varying permission sets (i.e., custom ones and Everything). This restriction is present because they may represent different permissions at different times, and .NET does not support dynamic policies, as it would require re-evaluation of the granted permissions.

Permissions defined in Java cover all important system features: file access, socket, display, reflection, security policy, etc. While the list is not as exhaustive as in .NET, it is complete enough to protect the underlying system from ill-behaving code.

Developing a custom permission in Java is not a complicated process at all. The following steps are required:
  • Extend java.security.Permission or java.security.BasicPermission.
  • Add new permission to the JVM's policy by creating a grant entry.
Obviously, the custom permission's class or JAR file must be in the CLASSPATH (or in one of the standard JVM directories), so that the JVM can locate it.

Below is a simple example of defining a custom permission:

//permission class
import java.security.BasicPermission;

public class CustomResPermission extends BasicPermission {
    public CustomResPermission(String name, String action) {
        //BasicPermission supplies implies(), equals() and
        //hashCode(); it accepts the action string but ignores it
        super(name, action);
    }
}

//library class
public class AccessCustomResource {
    public String getCustomRes() {
        SecurityManager mgr = System.getSecurityManager();
        if (mgr == null) {
            //shouldn't run without security!!!
            throw new SecurityException();
        } else {
            //see if read access to the resource
            //was granted
            mgr.checkPermission(
                new CustomResPermission("ResAccess", "read"));
        }
        //access the resource here
        String res = "Resource";
        return res;
    }
}

//client class
public class CustomResourceClient {
    public void useCustomRes() {
        AccessCustomResource accessor = new AccessCustomResource();
        try {
            //assuming a SecurityManager has been
            //installed earlier
            String res = accessor.getCustomRes();
        } catch (SecurityException ex) {
            //insufficient access rights
        }
    }
}

J2EE reuses Java's permissions mechanism for code-access security. Its specification defines a minimal subset of permissions, the so-called J2EE Security Permissions Set (see section 6.2 of the J2EE 1.4 specification). This is the minimal subset of permissions that a J2EE-compliant application might expect from a J2EE container (i.e., the application does not attempt to call functions requiring other permissions). Of course, it is up to individual vendors to extend it, and most commercially available J2EE application servers allow for much more extensive application security sets.

Note: .NET defines a richer, set-based permission structure than Java. At the same time, .NET permissions reveal their binding to the Windows OS.

Code Access Security: Policies

Code Access Security is evidence-based. Each application carries some evidence about its origin: location, signer, etc. This evidence can be discovered either by examining the application itself, or by a trusted entity: a class loader or a trusted host. Note that some forms of evidence are weaker than others, and, correspondingly, should be less trusted -- for instance, URL evidence, which can be susceptible to a number of attacks. Publisher evidence, on the other hand, is PKI-based and very robust, and it is not a likely target of an attack, unless the publisher's key has been compromised. A policy, maintained by a system administrator, groups applications based on their evidence, and assigns appropriate permissions to each group of applications.


Evidence for the .NET platform consists of various assembly properties. The set of evidence that the CLR can obtain for an assembly defines its group memberships. Usually, each type of evidence corresponds to a unique MembershipCondition, and each condition is represented by a .NET class. See MSDN for the complete listing of standard conditions. They all represent types of evidence acceptable to the CLR. For completeness, here is the list of the standard evidence types for the initial release: AppDirectory, Hash, Publisher, Site, Strong Name, URL, and Zone.


.NET's policy is hierarchical: it groups all applications into so-called Code Groups. An application is placed into a group by matching its Membership Condition (one per code group) with the evidence about the application's assembly. Those conditions are either derived from the evidence or custom-defined. Each group is assigned one of the pre-defined (standard or custom) NamedPermissionSets. Since an assembly can possess more than one type of evidence, it can be a member of multiple code groups. In this case, its total permission set will be a union of the sets from all groups (of a particular level) for which this assembly qualifies. The figure below depicts the code-group hierarchy in the default machine policy:



Custom groups may be added under any existing group (there is always a single root). Properly choosing the parent is an important task: because the policy is hierarchical, it is navigated top-down, and the search never reaches a descendant node if its parent's MembershipCondition was not satisfied.
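
The top-down traversal can be sketched as a small tree walk. This is a simplified model written in Java for consistency with the other examples in this article, not the CLR's implementation: the CodeGroup class, the evidence keys, the Vendor_Strong_Name group, and the permission names are all illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Predicate;

/** Illustrative model of hierarchical code-group resolution. */
final class CodeGroupSketch {

    /** A code group: a membership condition, a permission set, and child groups. */
    static final class CodeGroup {
        final String name;
        final Predicate<Map<String, String>> membership;
        final Set<String> permissions;
        final List<CodeGroup> children = new ArrayList<>();

        CodeGroup(String name, Predicate<Map<String, String>> membership, String... permissions) {
            this.name = name;
            this.membership = membership;
            this.permissions = new TreeSet<>(Arrays.asList(permissions));
        }

        CodeGroup add(CodeGroup child) {
            children.add(child);
            return this;
        }
    }

    /** Unions the permissions of every matching group; children of a non-matching group are skipped. */
    static Set<String> resolve(CodeGroup group, Map<String, String> evidence) {
        Set<String> granted = new TreeSet<>();
        if (!group.membership.test(evidence)) {
            return granted; // descendants are never evaluated
        }
        granted.addAll(group.permissions);
        for (CodeGroup child : group.children) {
            granted.addAll(resolve(child, evidence));
        }
        return granted;
    }

    /** Hypothetical tree: a root, two zone groups, and a strong-name group under the local zone. */
    static CodeGroup sampleTree() {
        CodeGroup vendor = new CodeGroup("Vendor_Strong_Name",
                e -> "VendorKey".equals(e.get("strongName")), "Registry");
        CodeGroup local = new CodeGroup("My_Computer_Zone",
                e -> "MyComputer".equals(e.get("zone")), "FileIO", "UI").add(vendor);
        CodeGroup internet = new CodeGroup("Internet_Zone",
                e -> "Internet".equals(e.get("zone")), "Execution");
        return new CodeGroup("All_Code", e -> true).add(local).add(internet);
    }

    static Map<String, String> evidence(String zone, String strongName) {
        Map<String, String> e = new HashMap<>();
        e.put("zone", zone);
        e.put("strongName", strongName);
        return e;
    }

    public static void main(String[] args) {
        System.out.println(resolve(sampleTree(), evidence("MyComputer", "VendorKey")));
        System.out.println(resolve(sampleTree(), evidence("Internet", "VendorKey")));
    }
}
```

In the second call the assembly carries the vendor's strong name, but the strong-name group sits under the local zone group, so it is never reached for Internet code. This is why the choice of parent matters.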

In the figure, the Microsoft and ECMA nodes are not evaluated at all for non-local assemblies.

Built-in nodes can be modified or even deleted, but this should be done with care, as it may destabilize the system. Zones, which identify code, are defined by Windows and managed from Internet Explorer, which allows adding whole sites or directories to the groups or removing them. All code in non-local groups has special access rights back to the originating site, and assemblies from the intranet zone can also access their originating directory shares.

To add a custom code group using an existing NamedPermissionSet with an associated MembershipCondition, one only needs to run the caspol.exe tool. Note that this tool operates on groups' ordinal numbers rather than names:

caspol -addgroup 1.3 -site www.MySite.com LocalIntranet

Actually, .NET has three independent policies, called Levels: Enterprise, Machine, and User. As a rule, a majority of the configuration process takes place on the Machine level; the other two levels grant FullTrust to everybody by default. An application can be a member of several groups on each level, depending on its evidence. As a minimum, all assemblies are members of the AllCode root group. Policy traversal is performed in the following order: Enterprise, Machine, and then User, and from the top down. On each level, granted permission sets are determined as follows:

Level Set = Set1 ∪ Set2 ∪ ... ∪ SetN
where 1..N are the code groups matching the assembly's evidence. System configuration determines whether the union or intersection operation is used on the sets.

The final set of permissions is calculated as follows:

Final Set = Enterprise ∩ Machine ∩ User

Effectively, this is the least common denominator of all involved sets. However, the traversal order can be altered by using the Exclusive and LevelFinal policy attributes. The former allows stopping intra-level traversal for an assembly; the latter, inter-level traversal. For instance, this can be used to ensure, on the Enterprise level, that a particular assembly always has enough rights to execute.
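
The two-step calculation can be sketched with plain sets. The sketch treats the per-level result as a union of the matching code groups' sets and the cross-level combination as set intersection, in line with the "least common denominator" description; it ignores the Exclusive and LevelFinal attributes. The class PolicyLevelsSketch and the permission names are hypothetical, and the code is Java rather than .NET to stay consistent with the other examples here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Illustrative model of combining permission sets within and across policy levels. */
final class PolicyLevelsSketch {

    static Set<String> setOf(String... names) {
        return new TreeSet<>(Arrays.asList(names));
    }

    /** Level set: union of the sets of all code groups that matched on one level. */
    static Set<String> levelSet(List<Set<String>> matchingGroupSets) {
        Set<String> result = new TreeSet<>();
        for (Set<String> groupSet : matchingGroupSets) {
            result.addAll(groupSet);
        }
        return result;
    }

    /** Final set: only permissions granted by every level survive. */
    static Set<String> finalSet(Set<String> enterprise, Set<String> machine, Set<String> user) {
        Set<String> result = new TreeSet<>(enterprise);
        result.retainAll(machine);
        result.retainAll(user);
        return result;
    }

    public static void main(String[] args) {
        // Enterprise and User grant everything in this toy universe, as a stand-in for FullTrust.
        Set<String> everything = setOf("Execution", "FileIO", "Registry", "UI");

        // On the Machine level the assembly matched two code groups.
        List<Set<String>> machineGroups = new ArrayList<>();
        machineGroups.add(setOf("Execution", "UI"));
        machineGroups.add(setOf("Execution", "FileIO"));
        Set<String> machine = levelSet(machineGroups);

        System.out.println(finalSet(everything, machine, everything));
    }
}
```

With the other two levels granting everything, the Machine level alone decides the outcome, which matches the remark that most configuration happens there.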

Each policy maintains a special list of assemblies, called trusted assemblies -- they have FullTrust for that policy level. Those assemblies are either part of CLR, or are specified in the CLR configuration entries, so CLR will try to use them. They all must have strong names, and have to be placed into the Global Assembly Cache (GAC) and explicitly added to the policy (GAC can be found in the %WINDIR%\assembly directory):

gacutil /i MyGreatAssembly.dll
caspol -user -addfulltrust MyGreatAssembly.dll

The figure below shows the Machine-level trusted assemblies:

For Java, two types of code evidence are accepted by the JVM -- codebase (URL, either web or local), from where it is accessed, and signer (effectively, the publisher of the code). Both types of evidence are optional: if omitted, all code is implied. Again, publisher evidence is more reliable, as it is less prone to attacks. However, up until JDK 1.2.1, there was a bug in the SecurityManager's implementation that allowed replacing classes in a signed JAR file and then continuing to execute it, thus effectively stealing the signer's permissions.

Policy links together permissions and evidence by assigning proper rights to code, grouped by similar criteria. A JVM can use multiple policy files; two are defined in the default java.security:



policy.url.1=file:${java.home}/lib/security/java.policy
policy.url.2=file:${user.home}/.java.policy


This structure allows creating multi-level policy sets: network, machine, user. The resulting policy is computed as follows:

Policy = Policy.1 ∪ Policy.2 ∪ ... ∪ Policy.N

The JVM uses an extremely flexible approach to providing policy: the default setting can be overridden by specifying a command-line parameter to the JVM:

//adds MyPolicyFile to the list of policies
java -Djava.security.policy=MyPolicyFile.txt
// replaces the existing policy with MyPolicyFile
java -Djava.security.policy==MyPolicyFile.txt


Java policy has a flat structure: it is a series of grant statements, optionally followed by evidence, and a list of granted permissions. A piece of code may satisfy more than one clause's condition — the final set of granted permissions is a union of all matches:


grant [signedBy "signer1,...,signerN"] [, codeBase "URL"] {
    permission PermissionClassName "TargetName", "Action"
        [, signedBy "signer1,...,signerN"];
    ...
};


Even locally installed classes are granted different trust levels, depending on their location:

· Boot classpath: $JAVA_HOME/lib, $JAVA_HOME/classes. These classes automatically have full trust and no security restrictions. The boot classpath can be changed both for compilation and runtime, using the command-line parameters -bootclasspath and -Xbootclasspath, respectively.
· Extensions: $JAVA_HOME/lib/ext. Any code (JAR or class files) in that directory is given full trust in the default java.policy:

grant codeBase "file:${java.home}/lib/ext/*" {
    permission java.security.AllPermission;
};

· Standard classpath: $CLASSPATH ("." by default). By default, these classes have only a few permissions, to establish certain network connections and read environment properties. Again, the SecurityManager has to be installed (either from the command line using the -Djava.security.manager switch, or by calling System.setSecurityManager) in order to enforce those permissions.

Policy-based security causes problems for applets. It's unlikely that a web site's users will be editing their policy files before accessing a site. Java does not allow runtime modification to the policy, so the code writers (especially applet writers) simply cannot obtain the required execution permissions. IE and Netscape have incompatible (with Sun's JVM, too!) approaches to handling applet security. JavaSoft's Java plug-in is supposed to solve this problem by providing a common JRE, instead of the browser-provided VM.

If the applet code needs to step outside of the sandbox, the policy file has to be edited anyway, unless it is an RSA-signed applet. Those applets will either be given full trust (with the user's blessing, of course), or if policy has an entry for the signer, it will be used. The following clause in the policy file will always prevent granting full trust to any RSA-signed applet:

grant {
    permission java.lang.RuntimePermission "usePolicy";
};

Note: Policy in .NET has a much more sophisticated structure than in Java, and it also works with many more types of evidence. Java defines a very flexible approach to adding and overriding default policies -- something that .NET lacks completely.

Code Access Security: Access Checks

Code access checks are performed explicitly; the code (either an application, or a system library acting on its behalf) calls the appropriate Security Manager to verify that the application has the required permissions to perform an operation. This check results in an operation known as a stack walk: the Runtime verifies that each caller up the call tree has the required permissions to execute the requested operation. This operation is aimed to protect against a luring attack, where a privileged component is misled by a caller into executing dangerous operations on its behalf. When a stack walk is performed prior to executing an operation, the system can detect that the caller is not allowed to do what it is requesting, and abort the execution with an exception.

Privileged code may be used to deal with luring attacks without compromising overall system security, and yet provide useful functionality. Normally, the most restrictive set of permissions for all of the code on the current thread stack determines the effective permission set. To bypass this restriction, a special permission can be assigned to a small portion of code to perform a reduced set of restricted actions on behalf of under-trusted callers. All of the clients can now access the protected resource in a safe manner using that privileged component, without compromising security. For instance, an application may be using fonts, which requires opening font files in protected system areas. Only trusted code has to be given permissions for file I/O, but any caller, even without this permission, can safely access the component itself and use fonts.
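
The stack walk and the effect of a privileged frame can be sketched with a toy model. This does not use either platform's security classes: StackWalkSketch, Frame, and the permission name "FileIO" are hypothetical, and the call stack is represented as a plain list with the outermost caller first.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Toy model of a permission stack walk with an optional privileged frame. */
final class StackWalkSketch {

    /** One caller on the stack, with the permissions granted to its code. */
    static final class Frame {
        final String name;
        final boolean privileged; // true if this frame vouches for its callers
        final Set<String> granted;

        Frame(String name, boolean privileged, String... granted) {
            this.name = name;
            this.privileged = privileged;
            this.granted = new TreeSet<>(Arrays.asList(granted));
        }
    }

    /**
     * Walks from the most recent caller (end of the list) towards the outermost one.
     * Every frame visited must hold the permission; the walk stops at a privileged frame.
     */
    static boolean isAllowed(List<Frame> callStack, String permission) {
        for (int i = callStack.size() - 1; i >= 0; i--) {
            Frame frame = callStack.get(i);
            if (!frame.granted.contains(permission)) {
                return false; // an under-privileged caller was found
            }
            if (frame.privileged) {
                return true; // callers further up are not examined
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Frame applet = new Frame("UntrustedApplet", false);
        Frame plainFontLib = new Frame("FontLibrary", false, "FileIO");
        Frame privilegedFontLib = new Frame("FontLibrary", true, "FileIO");

        // Without a privileged frame the applet's missing permission fails the check.
        System.out.println(isAllowed(Arrays.asList(applet, plainFontLib), "FileIO"));
        // With the font library marked privileged, the walk stops there and succeeds.
        System.out.println(isAllowed(Arrays.asList(applet, privilegedFontLib), "FileIO"));
    }
}
```

Note that a privileged frame must itself hold the permission; marking code as privileged does not grant it anything it was not already given by policy.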

Finally, one has to keep in mind that code access security mechanisms of both platforms sit on top of the corresponding OS access controls, which are usually role or identity-based. So, for example, even if Java/.NET's access control allows a particular component to read all of the files on the system drive, the requests might still be denied at the OS level.

A .NET assembly has a choice of using either imperative or declarative checks (demands) for individual permissions. Declarative (attribute) checks have the added benefit of being stored in metadata, and thus are available for analyzing and reporting by .NET tools like permview.exe. In either case, the check results in a stack walk. Declarative checks can be used from the assembly level down to the level of an individual property.

//this declaration requests FullTrust as the
//minimum permission set for this assembly
[assembly:PermissionSetAttribute(
    SecurityAction.RequestMinimum,
    Name = "FullTrust")]

//An example of a declarative permission
//demand on a method
[CustomPermissionAttribute(SecurityAction.Demand,
    Unrestricted = true)]
public static string ReadData()
{
    //Read from a custom resource.
    return string.Empty; //placeholder result
}

//Performing the same check imperatively
//(an alternative to the declarative version above)
public static void ReadData()
{
    CustomPermission MyPermission = new
        CustomPermission(PermissionState.Unrestricted);
    MyPermission.Demand();
    //Read from a custom resource.
}
 
In addition to ordinary code access checks, an application can declaratively specify LinkDemand or InheritanceDemand actions, which allow a type to require that anybody trying to reference it or inherit from it possess particular permission(s). The former applies to the immediate requestor only, while the latter applies to the entire inheritance chain. Presence of those demands in the managed code triggers a check for the appropriate permission(s) at JIT time.

LinkDemand has a special application with strong-named assemblies in .NET, because such assemblies may have a higher level of trust from the user. To avoid their unintended malicious usage, .NET places an implicit LinkDemand for their callers to have been granted FullTrust; otherwise, a SecurityException is thrown during JIT compilation, when an under-privileged assembly tries to reference the strong-named assembly. The following implicit declarations are inserted by CLR:

[PermissionSet(SecurityAction.LinkDemand,
Name="FullTrust")]
[PermissionSet(SecurityAction.InheritanceDemand,
Name="FullTrust")]


Consequently, if a strong-named assembly is intended for use by partially trusted assemblies (i.e., from code without FullTrust), it has to be marked by a special attribute -- [assembly:AllowPartiallyTrustedCallers], which effectively removes implicit LinkDemand checks for FullTrust. All other assembly/class/method level security checks are still in place and enforceable, so it is possible that a caller may still not possess enough privileges to utilize a strong-named assembly decorated with this attribute.

.NET assemblies have an option to specify their security requirements at the assembly load time. Here, in addition to individual permissions, they can operate on one of the built-in non-modifiable PermissionSets. There are three options for those requirements: RequestMinimum, RequestOptional, and RequestRefuse.

If the Minimum requirement cannot be satisfied, the assembly will not load at all. Optional permissions may enable certain features. Application of the RequestOptional modifier limits the permission set granted to the assembly to only optional and minimal permissions (see the formula in the following paragraph). RequestRefuse explicitly deprives the assembly of certain permissions (in case they were granted) in order to minimize chances that an assembly can be tricked into doing something harmful.
//Requesting minimum assembly permissions
//The request is placed on the assembly level.
using System.Security.Permissions;
[assembly:SecurityPermissionAttribute(
SecurityAction.RequestMinimum,
Flags = SecurityPermissionFlag.UnmanagedCode)]
namespace MyNamespace
{
...
}
CLR determines the final set of assembly permissions using the granted permissions, as specified in .NET CAS policy, plus the load-time modifiers described earlier. The formula applied is (P - policy-derived permissions): G = M + (O ∩ P) - R, where M, O, and R are the minimum, optional, and refused permission sets.

The Assert option explicitly succeeds the stack walk (for the given PermissionSet or any subset of it, as determined by the Intersect function), even if the upstream callers do not have the required permissions (it fails if sets intersections are not empty). Deny and PermitOnly effectively restrict the available permission sets for the downstream callers. The figure below represents an overview of the Code Access Security permission grants and checks in .NET:
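
The effect of the three load-time modifiers can be sketched with plain sets. The rule used here is an assumed reading of the prose description of RequestMinimum, RequestOptional, and RequestRefuse: the grant is the minimum set, plus whatever optional permissions policy allows, minus the refused ones, and loading fails if policy does not cover the minimum. The class LoadTimeGrantSketch and the permission names are hypothetical, and the code is Java rather than .NET.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

/** Illustrative model of combining policy with minimum, optional and refused requests. */
final class LoadTimeGrantSketch {

    static Set<String> setOf(String... names) {
        return new TreeSet<>(Arrays.asList(names));
    }

    /**
     * Assumed rule: grant = minimum + (optional intersected with policy) - refused.
     * Throws SecurityException if policy does not cover the minimum request.
     */
    static Set<String> grant(Set<String> policy, Set<String> minimum,
                             Set<String> optional, Set<String> refused) {
        if (!policy.containsAll(minimum)) {
            throw new SecurityException("minimum request not satisfied; assembly does not load");
        }
        Set<String> granted = new TreeSet<>(optional);
        granted.retainAll(policy);   // optional permissions only if policy allows them
        granted.addAll(minimum);     // minimum permissions are always part of the grant
        granted.removeAll(refused);  // refused permissions are taken away again
        return granted;
    }

    public static void main(String[] args) {
        Set<String> policy = setOf("Execution", "FileIO", "UI");
        Set<String> granted = grant(policy,
                setOf("Execution"),          // minimum
                setOf("FileIO", "Registry"), // optional
                setOf("UI"));                // refused
        System.out.println(granted);
    }
}
```

In the sample run, Registry is requested as optional but not allowed by policy, so it is dropped, while UI is allowed by policy but never requested, so it is not granted either.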

In Java, permissions are normally checked by the SecurityManager (or installed derivative), by using the checkPermission function. It defines a helper for each major group of permissions, such as checkWrite for the write action of FilePermission. All checks are imperative; there are no declarative code access checks in the Java language. Each JVM can have at most one SecurityManager (or derivative) installed -- once set, it cannot be replaced, for security reasons.

Browsers always start SecurityManager, so any Internet Java application executes with enabled security. Locally started JVMs have to install a SecurityManager before exercising the first sensitive operation; this can also be done programmatically:

System.setSecurityManager(new SecurityManager());

or using a command-line option:

java -Djava.security.manager MyClass

In Java 2, when determining application permissions, SecurityManager delegates the call to java.security.AccessController, which obtains a current snapshot of the AccessControlContext to determine which permissions are present. SecurityManager's operations may be influenced by a java.security.DomainCombiner implementation, if one exists. It instructs an existing SecurityManager to perform additional operations before security checks, thus allowing security system extensibility without re-implementing its core classes. JAAS uses this functionality to add principal-based security checks to the original code-based Java security.

When making access control decisions, the checkPermission method stops checking if it reaches a caller that was marked as "privileged" via a doPrivileged call without a context argument. If that caller's domain has the specified permission, no further checking is done and checkPermission returns quietly, indicating that the requested access is allowed. If that domain does not have the specified permission, an exception is thrown, as usual. Writing privileged code in Java is achieved by implementing the java.security.PrivilegedAction or PrivilegedExceptionAction interfaces. This approach is somewhat limiting, as it does not allow specifying the exact permissions to be asserted, while still requiring the callers to possess others -- it is an "all or nothing" proposition.
public class PrivilegedClass implements PrivilegedAction {
public Object run() {
//perform privileged operation
...
return null;
}
}
Suppose the current thread traversed m callers, in the order of caller 1 to caller 2 to caller M, which invoked the checkPermission method. This method determines whether access is granted or denied based on the following algorithm:
i = m;
while (i > 0) {
if (caller i's domain
does not have the permission)
throw AccessControlException
else if (caller i is marked as privileged) {
if (a context was specified
in the call to doPrivileged)
context.checkPermission(permission)
return;
}
i = i - 1;
}
// Next, check the context inherited when
// the thread was created. Whenever a new thread
// is created, the AccessControlContext at that
// time is stored and associated with the new
// thread, as the "inherited" context.
inheritedContext.checkPermission(permission);
Note: .NET arms developers with an impressive arsenal of various features for access checks, easily surpassing Java in this respect.

Conclusions

In this article, Code Access Security features of Java and .NET platforms were reviewed. CAS features i n .NET are significantly better than the ones Java can offer, with a single exception -- flexibility. Java, as it is often the case, offers ease and configurability in policy handling that .NET cannot match.

Code Access Security - .Net Vs. J2EE

Introduction

This article introduces .Net’s Code Access Security (CAS) and compares it with J2EE’s Code Access Security.

In .Net, the Common Language Runtime (CLR) allows code to perform only those operations that it has permission to perform. CAS is the CLR's security system: it enforces security policies by preventing unauthorized access to protected resources and operations. Using Code Access Security, you can do the following:
  • Restrict what your code can do
  • Restrict which code can call your code
  • Identify code

These capabilities are discussed throughout this article. The following terms make up the vocabulary of CAS.

Elements of Code Access Security

Code access security consists of the following elements:
  • Permissions
  • Permission sets
  • Code groups
  • Policy
Permissions
Permissions represent access to a protected resource or the ability to perform a protected operation. The .NET Framework provides several permission classes, such as FileIOPermission (for working with files), UIPermission (permission to use a user interface), and SecurityPermission (needed to execute code at all, and can even be used to bypass security). The main permission classes are listed in the Security Namespace section below.

Permission sets

A permission set is a collection of permissions. You can put FileIOPermission and UIPermission into your own permission set and call it "My_PermissionSet". A permission set can include any number of permissions. FullTrust, LocalIntranet, Internet, Execution and Nothing are some of the built-in permission sets in the .NET Framework. FullTrust has all the permissions in the world, while Nothing has no permissions at all, not even the right to execute.

Code groups

A code group is a logical grouping of code that has a specified condition for membership. Code from http://www.somewebsite.com/ can belong to one code group, code containing a specific strong name can belong to another code group and code from a specific assembly can belong to yet another. There are built-in code groups like My_Computer_Zone, LocalIntranet_Zone, Internet_Zone etc. As with permission sets, we can create code groups to meet our requirements based on the evidence provided by the .NET Framework. Site, Strong Name, Zone and URL are some of the types of evidence.

Policy

Security policy is the configurable set of rules that the CLR follows when determining the permissions to grant to code. There are four policy levels - Enterprise, Machine, User and Application Domain - each operating independently of the others. Each level has its own code groups and permission sets. They have the hierarchy given below.



Functions of Code Access Security

Code Access Security performs the following functions:
  • Defines permissions and permission sets that represent the right to access various system resources.
  • Enables administrators to configure security policy by associating sets of permissions with groups of code (code groups).
  • Enables code to request the permissions it requires in order to run, as well as the permissions that would be useful to have, and specifies which permissions the code must never have.
  • Grants permissions to each assembly that is loaded, based on the permissions requested by the code and on the operations permitted by security policy.
  • Enables code to demand that its callers have specific permissions.
  • Enables code to demand that its callers possess a digital signature, thus allowing only callers from a particular organization or site to call the protected code.
  • Enforces restrictions on code at run time by comparing the granted permissions of every caller on the call stack to the permissions that callers must have.

A separate namespace, System.Security, is dedicated to implementing security.

Security Namespace

These are the main classes in System.Security namespace:
Class                   Description
CodeAccessPermission    Defines the underlying structure of all code access permissions.
PermissionSet           Represents a collection that can contain many different types of permissions.
SecurityException       The exception that is thrown when a security error is detected.


These are the main classes in System.Security.Permissions namespace:
Class                           Description
EnvironmentPermission           Controls access to system and user environment variables.
FileDialogPermission            Controls the ability to access files or folders through a file dialog.
FileIOPermission                Controls the ability to access files and folders.
IsolatedStorageFilePermission   Specifies the allowed usage of a private virtual file system.
IsolatedStoragePermission       Represents access to generic isolated storage capabilities.
ReflectionPermission            Controls access to metadata through the System.Reflection APIs.
RegistryPermission              Controls the ability to access registry variables.
SecurityPermission              Describes a set of security permissions applied to code.
UIPermission                    Controls the permissions related to user interfaces and the clipboard.

.Net Vs J2EE – A Comparison

Code Access Security: Permissions 


Code-access permissions represent authorization to access a protected resource or perform a dangerous operation, and form a foundation of CAS. They have to be explicitly requested from the caller either by the system or by application code, and their presence or absence determines the appropriate course of action.

Both Java and .NET supply an ample choice of permissions for a variety of system operations. The runtime systems carry out appropriate checks when a resource is accessed or an operation is requested. Additionally, both platforms provide the ability to augment those standard permission sets with custom permissions for protection of application-specific resources. Once developed, custom permissions have to be explicitly checked for (demanded) by the application's code, because the platform's libraries are not going to check for them.

.NET defines a richer selection here, providing permissions for role-based checks and evidence-based checks. An interesting feature of the latter is the family of Identity permissions, which are used to identify an assembly by one of its traits -- for instance, its strong name (StrongNameIdentityPermission). Also, some of its permissions reflect the close binding between the .NET platform and the underlying Windows OS (EventLogPermission, RegistryPermission). IsolatedStoragePermission is unique to .NET, and it allows low-trust code (Internet controls, for instance) to save and load a persistent state without revealing details of a computer's file and directory structure.

Adding a custom code access permission requires several steps. Note that if a custom permission is not designed for code access, it will not trigger a stack walk.

The steps are:
  • Optionally, inherit from CodeAccessPermission (to trigger a stack walk).
  • Implement IPermission and IUnrestrictedPermission.
  • Optionally, implement ISerializable.
  • Implement XML encoding and decoding.
  • Optionally, add declarative security support through an Attribute class.
  • Add the new permission to CAS Policy by assigning it to a code group.
  • Make the permission's assembly trusted by .NET framework. 

.NET permissions are grouped into NamedPermissionSets. The platform includes the following non-modifiable built-in sets: Nothing, Execution, FullTrust, Internet, LocalIntranet, SkipVerification. The FullTrust set is a special case, as it declares that this code does not have any restrictions and passes any permission check, even for custom permissions. By default, all local code (found in the local computer directories) is granted this privilege.

The above fixed permission sets can be demanded instead of regular permissions:

[assembly:PermissionSetAttribute(
SecurityAction.RequestMinimum,
Name="LocalIntranet")]

In addition to those, custom permission sets may be defined, and a built-in Everything set can be modified. However, imperative code-access checks cannot be applied to varying permission sets (i.e., custom ones and Everything). This restriction exists because such sets may represent different permissions at different times, and .NET does not support dynamic policies, which would require re-evaluating the granted permissions.

The permissions defined in Java cover all important system features: file access, sockets, display, reflection, security policy, etc. While the list is not as exhaustive as in .NET, it is complete enough to protect the underlying system from ill-behaved code.

Developing a custom permission in Java is not a complicated process at all. The following steps are required:
  • Extend java.security.Permission or java.security.BasicPermission.
  • Add new permission to the JVM's policy by creating a grant entry.
Obviously, the custom permission's class or JAR file must be in the CLASSPATH (or in one of the standard JVM directories), so that the JVM can locate it.

Below is a simple example of defining a custom permission:

//permission class (CustomResPermission.java)
import java.security.BasicPermission;

public class CustomResPermission extends BasicPermission {
    public CustomResPermission(String name, String actions) {
        //BasicPermission accepts an actions string but ignores it
        super(name, actions);
    }
}

//library class (AccessCustomResource.java)
public class AccessCustomResource {
    public String getCustomRes() {
        SecurityManager mgr = System.getSecurityManager();
        if (mgr == null) {
            //shouldn't run without security!!!
            throw new SecurityException();
        } else {
            //see if read access to the resource
            //was granted
            mgr.checkPermission(
                new CustomResPermission("ResAccess", "read"));
        }
        //access the resource here
        String res = "Resource";
        return res;
    }
}

//client class (CustomResourceClient.java)
public class CustomResourceClient {
    public void useCustomRes() {
        AccessCustomResource accessor = new AccessCustomResource();
        try {
            //assuming a SecurityManager has been
            //installed earlier
            String res = accessor.getCustomRes();
        } catch (SecurityException ex) {
            //insufficient access rights
        }
    }
}

J2EE reuses Java's permissions mechanism for code-access security. Its specification defines a minimal subset of permissions, the so-called J2EE Security Permissions Set (see section 6.2 of the J2EE 1.4 specification). This is the minimal subset of permissions that a J2EE-compliant application might expect from a J2EE container (i.e., the application does not attempt to call functions requiring other permissions). Of course, it is up to individual vendors to extend it, and most commercially available J2EE application servers allow for much more extensive application security sets.

Note: .NET defines a richer, set-based permission structure than Java. At the same time, .NET permissions reveal their binding to the Windows OS.

Code Access Security: Policies

Code Access Security is evidence-based. Each application carries some evidence about its origin: location, signer, etc. This evidence can be discovered either by examining the application itself, or by a trusted entity: a class loader or a trusted host. Note that some forms of evidence are weaker than others, and, correspondingly, should be less trusted -- for instance, URL evidence, which can be susceptible to a number of attacks. Publisher evidence, on the other hand, is PKI-based and very robust, and it is not a likely target of an attack, unless the publisher's key has been compromised. A policy, maintained by a system administrator, groups applications based on their evidence, and assigns appropriate permissions to each group of applications.


Evidence for the .NET platform consists of various assembly properties. The set of evidence that the CLR can obtain for an assembly defines its group memberships. Usually, each type of evidence corresponds to a unique MembershipCondition, represented by a .NET class. See MSDN for the complete listing of standard conditions. They all represent types of evidence acceptable to the CLR. For completeness, here is the list of the standard evidence types for the initial release: AppDirectory, Hash, Publisher, Site, Strong Name, URL, and Zone.


.NET's policy is hierarchical: it groups all applications into so-called Code Groups. An application is placed into a group by matching the group's Membership Condition (one per code group) with the evidence about the application's assembly. Those conditions are either derived from the evidence or custom-defined. Each group is assigned one of the pre-defined (standard or custom) NamedPermissionSets. Since an assembly can possess more than one type of evidence, it can be a member of multiple code groups. In this case, its total permission set will be a union of the sets from all groups (of a particular level) for which this assembly qualifies. The figure below depicts the code-group hierarchy in the default machine policy:



Custom groups may be added under any existing group (there is always a single root). Choosing the parent properly is important: because the policy is hierarchical, it is navigated top-down, and the search never reaches a descendant node if its parent's MembershipCondition was not satisfied.
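
The effect of the parent's condition on its descendants is easy to see in a small model. The Java sketch below is illustrative only: the CodeGroup record, the evidence strings and the tree built in sampleTree are assumptions made for this example (loosely modeled on the default machine policy), not a CLR API.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical model of a code-group tree; not a real .NET or Java API.
final class CodeGroupSketch {

    // A code group: a membership condition over evidence, the name of its
    // permission set, and its child groups.
    record CodeGroup(String name,
                     Predicate<Set<String>> membershipCondition,
                     String permissionSet,
                     List<CodeGroup> children) {}

    // Collects the permission sets of every matching group, top-down.
    // Children are visited only when their parent's condition matched.
    static Set<String> resolve(CodeGroup group, Set<String> evidence) {
        Set<String> granted = new LinkedHashSet<>();
        if (!group.membershipCondition().test(evidence)) {
            return granted; // the whole subtree is skipped
        }
        granted.add(group.permissionSet());
        for (CodeGroup child : group.children()) {
            granted.addAll(resolve(child, evidence));
        }
        return granted;
    }

    // Builds a small tree: root -> (local zone -> strong-name group), Internet zone.
    static CodeGroup sampleTree() {
        CodeGroup microsoft = new CodeGroup("Microsoft_Strong_Name",
                e -> e.contains("strongname:Microsoft"), "FullTrust", List.of());
        CodeGroup myComputer = new CodeGroup("My_Computer_Zone",
                e -> e.contains("zone:MyComputer"), "FullTrust", List.of(microsoft));
        CodeGroup internet = new CodeGroup("Internet_Zone",
                e -> e.contains("zone:Internet"), "Internet", List.of());
        return new CodeGroup("All_Code", e -> true, "Nothing",
                List.of(myComputer, internet));
    }

    public static void main(String[] args) {
        // Non-local assembly that happens to carry the strong-name evidence
        Set<String> evidence = Set.of("zone:Internet", "strongname:Microsoft");
        System.out.println(resolve(sampleTree(), evidence));
    }
}
```

With Internet-zone evidence, the strong-name group is never consulted, because its parent My_Computer_Zone does not match; main prints [Nothing, Internet].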

In the figure, the Microsoft and ECMA nodes are not evaluated at all for non-local assemblies.

Built-in nodes can be modified or even deleted, but this should be done with care, as it may destabilize the system. Zones, which identify code, are defined by Windows and managed from Internet Explorer, which allows adding whole sites or directories to the groups or removing them. All code in non-local groups has special access rights back to its originating site, and assemblies from the intranet zone can also access their originating directory shares.

To add a custom code group using an existing NamedPermissionSet with an associated MembershipCondition, one only needs to run the caspol.exe tool. Note that this tool operates on groups' ordinal numbers rather than names:

caspol -addgroup 1.3 -site www.MySite.com LocalIntranet

Actually, .NET has three independent administrator-configured policies, called Levels: Enterprise, Machine, and User (the fourth level mentioned earlier, Application Domain, is set programmatically by the hosting application). As a rule, the majority of the configuration process takes place on the Machine level; the other two levels grant FullTrust to everybody by default. An application can be a member of several groups on each level, depending on its evidence. At a minimum, all assemblies are members of the AllCode root group. Policy traversal is performed in the following order: Enterprise, Machine, and then User, and from the top down within each level. On each level, the granted permission set is determined as follows:

Level Set = Set1 ∪ Set2 ∪ ... ∪ SetN

where 1..N are the code groups matching the assembly's evidence. System configuration determines whether a union or an intersection operation is used on the sets.

The final set of permissions is calculated as follows:

Final Set = Enterprise ∩ Machine ∩ User

Effectively, this is the least common denominator of all the sets involved. However, the traversal order can be altered by using the Exclusive and LevelFinal policy attributes. The former stops intra-level traversal for an assembly; the latter, inter-level traversal. For instance, this can be used to ensure, on the Enterprise level, that a particular assembly always has enough rights to execute.
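
The two formulas are plain set arithmetic, as the following Java sketch shows. It is a minimal model under stated assumptions: permissions are represented as strings, the permission names are made up for the example, and the Exclusive and LevelFinal attributes are not modeled.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model of policy-level resolution; permissions are plain strings.
final class PolicyLevelSketch {

    // Level set: the union of the sets of all code groups matching the evidence.
    static Set<String> levelSet(List<Set<String>> matchingGroupSets) {
        Set<String> union = new TreeSet<>();
        for (Set<String> groupSet : matchingGroupSets) {
            union.addAll(groupSet);
        }
        return union;
    }

    // Final set: the intersection of the Enterprise, Machine and User level sets.
    static Set<String> finalSet(Set<String> enterprise,
                                Set<String> machine,
                                Set<String> user) {
        Set<String> result = new TreeSet<>(enterprise);
        result.retainAll(machine);
        result.retainAll(user);
        return result;
    }

    public static void main(String[] args) {
        // Enterprise and User grant everything, as in the default configuration.
        Set<String> everything = Set.of("Execute", "FileRead", "FileWrite", "UI");
        // On the Machine level the assembly matches two code groups.
        Set<String> machine = levelSet(List.of(
                Set.of("Execute"),
                Set.of("FileRead", "UI")));
        System.out.println(finalSet(everything, machine, everything));
    }
}
```

Because the other two levels grant everything, the Machine level alone decides the outcome here, and main prints [Execute, FileRead, UI].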

Each policy maintains a special list of assemblies, called trusted assemblies -- they have FullTrust for that policy level. Those assemblies are either part of CLR, or are specified in the CLR configuration entries, so CLR will try to use them. They all must have strong names, and have to be placed into the Global Assembly Cache (GAC) and explicitly added to the policy (GAC can be found in the %WINDIR%\assembly directory):

gacutil /i MyGreatAssembly.dll
caspol -user -addfulltrust MyGreatAssembly.dll

The figure below shows the Machine-level trusted assemblies:

For Java, two types of code evidence are accepted by the JVM -- the codebase (a URL, either web or local) from which the code is accessed, and the signer (effectively, the publisher of the code). Both kinds of evidence are optional: if omitted, all code is implied. Again, publisher evidence is more reliable, as it is less prone to attacks. However, up until JDK 1.2.1, there was a bug in the SecurityManager's implementation that allowed replacing classes in a signed JAR file and then continuing to execute it, thus effectively stealing the signer's permissions.

Policy links together permissions and evidence by assigning proper rights to code, grouped by similar criteria. A JVM can use multiple policy files; two are defined in the default java.security:



policy.url.1=file:${java.home}/lib/security/java.policy
policy.url.2=file:${user.home}/.java.policy


This structure allows creating multi-level policy sets: network, machine, user. The resulting policy is computed as follows:

Policy = Policy.1 ∪ Policy.2 ∪ ... ∪ Policy.N

The JVM uses an extremely flexible approach to providing policy: the default setting can be overridden by specifying a command-line parameter to the JVM:

//adds MyPolicyFile to the list of policies
java -Djava.security.policy=MyPolicyFile.txt
// replaces the existing policy with MyPolicyFile
java -Djava.security.policy==MyPolicyFile.txt


Java policy has a flat structure: it is a series of grant statements, optionally followed by evidence, and a list of granted permissions. A piece of code may satisfy more than one clause's condition — the final set of granted permissions is a union of all matches:


grant [signedBy "signer1", ..., "signerN"] [codeBase "URL"] {
    permission PermissionClassName "TargetName", "Action"
        [signedBy "signer1", ..., "signerN"];
    ...
};
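
How several matching grant clauses add up can be sketched in Java. The sketch uses the real java.security.Permissions and java.util.PropertyPermission classes, but the Grant record, the sample policy and the plain prefix match on the codeBase are assumptions made for illustration; the actual policy implementation compares code sources by URL rules and also takes signers into account.

```java
import java.security.Permission;
import java.security.Permissions;
import java.util.List;
import java.util.PropertyPermission;

// Simplified model of a flat policy: every grant entry whose codeBase
// matches contributes its permissions to the result (a union).
final class FlatPolicySketch {

    // One grant entry; a null codeBase means "all code". (Hypothetical type.)
    record Grant(String codeBase, List<Permission> permissions) {
        boolean matches(String codeLocation) {
            // Assumption: plain prefix match instead of the real URL rules
            return codeBase == null || codeLocation.startsWith(codeBase);
        }
    }

    // Union of the permissions of all grant entries matching the location.
    static Permissions permissionsFor(String codeLocation, List<Grant> policy) {
        Permissions granted = new Permissions();
        for (Grant grant : policy) {
            if (grant.matches(codeLocation)) {
                grant.permissions().forEach(granted::add);
            }
        }
        return granted;
    }

    // Two clauses: one for all code, one for code under file:/opt/app/.
    static List<Grant> samplePolicy() {
        return List.of(
                new Grant(null,
                        List.of(new PropertyPermission("java.version", "read"))),
                new Grant("file:/opt/app/",
                        List.of(new PropertyPermission("user.*", "read"))));
    }

    public static void main(String[] args) {
        Permissions p = permissionsFor("file:/opt/app/lib/tool.jar", samplePolicy());
        // Granted by the first clause (all code)
        System.out.println(p.implies(new PropertyPermission("java.version", "read")));
        // Granted by the second clause through the "user.*" wildcard
        System.out.println(p.implies(new PropertyPermission("user.home", "read")));
        // Granted by neither clause
        System.out.println(p.implies(new PropertyPermission("user.home", "write")));
    }
}
```

Code loaded from file:/opt/app/ satisfies both clauses and receives the union of their permissions; code from any other location would receive only what the first clause grants.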


Even locally installed classes are granted different trust levels, depending on their location:

· Boot classpath: $JAVA_HOME/lib, $JAVA_HOME/classes. These classes automatically have full trust and no security restrictions. The boot classpath can be changed for both compilation and runtime, using the command-line parameters -bootclasspath and -Xbootclasspath, respectively.
· Extensions: $JAVA_HOME/lib/ext. Any code (JAR or class files) in that directory is given full trust in the default java.policy:

grant codeBase "file:${java.home}/lib/ext/*" {
    permission java.security.AllPermission;
};

· Standard classpath: $CLASSPATH ("." by default). By default, these classes have only a few permissions, to establish certain network connections and to read environment properties. Again, a SecurityManager has to be installed (either from the command line using the -Djava.security.manager switch, or by calling System.setSecurityManager) for those permissions to be enforced.

Policy-based security causes problems for applets. It's unlikely that a web site's users will edit their policy files before accessing a site. Java does not allow runtime modification of the policy, so code writers (especially of applets) simply cannot obtain the required execution permissions. IE and Netscape have approaches to handling applet security that are incompatible with each other and with Sun's JVM. JavaSoft's Java plug-in is supposed to solve this problem by providing a common JRE, instead of the browser-provided VM.

If the applet code needs to step outside of the sandbox, the policy file has to be edited anyway, unless it is an RSA-signed applet. Such applets will either be given full trust (with the user's blessing, of course), or, if the policy has an entry for the signer, that entry will be used. The following clause in the policy file will always prevent granting full trust to any RSA-signed applet:

grant {
permission java.lang.RuntimePermission "usePolicy";
};

Note: Policy in .NET has a much more sophisticated structure than in Java, and it also works with many more types of evidence. Java defines a very flexible approach to adding and overriding default policies -- something that .NET lacks completely.

Code Access Security: Access Checks

Code access checks are performed explicitly; the code (either an application, or a system library acting on its behalf) calls the appropriate Security Manager to verify that the application has the required permissions to perform an operation. This check results in an operation known as a stack walk: the runtime verifies that each caller up the call tree has the required permissions to execute the requested operation. The stack walk is intended to protect against a luring attack, where a privileged component is misled by a caller into executing dangerous operations on its behalf. When a stack walk is performed prior to executing an operation, the system can detect that the caller is not allowed to do what it is requesting, and abort the execution with an exception.

Privileged code may be used to deal with luring attacks without compromising overall system security, and yet provide useful functionality. Normally, the most restrictive set of permissions for all of the code on the current thread stack determines the effective permission set. To bypass this restriction, a special permission can be assigned to a small portion of code to perform a reduced set of restricted actions on behalf of under-trusted callers. All of the clients can now access the protected resource in a safe manner using that privileged component, without compromising security. For instance, an application may be using fonts, which requires opening font files in protected system areas. Only trusted code has to be given permissions for file I/O, but any caller, even without this permission, can safely access the component itself and use fonts.

Finally, one has to keep in mind that the code access security mechanisms of both platforms sit on top of the corresponding OS access controls, which are usually role or identity-based. So, for example, even if Java/.NET's access control allows a particular component to read all of the files on the system drive, the requests might still be denied at the OS level.

A .NET assembly has a choice of using either imperative or declarative checks (demands) for individual permissions. Declarative (attribute) checks have the added benefit of being stored in metadata, and thus are available for analysis and reporting by .NET tools like permview.exe. In either case, the check results in a stack walk. Declarative checks can be applied at any level, from an assembly down to an individual property.

//This declaration requests FullTrust as the
//minimum permission set for this assembly
[assembly:PermissionSetAttribute(
    SecurityAction.RequestMinimum,
    Name = "FullTrust")]

//An example of a declarative permission
//demand on a method
[CustomPermissionAttribute(SecurityAction.Demand,
    Unrestricted = true)]
public static void ReadData()
{
    //Read from a custom resource.
}

//Performing the same check imperatively
//(an alternative to the method above)
public static void ReadData()
{
    CustomPermission MyPermission = new
        CustomPermission(PermissionState.Unrestricted);
    MyPermission.Demand();
    //Read from a custom resource.
}
 
In addition to ordinary code access checks, an application can declaratively specify LinkDemand or InheritanceDemand actions, which allow a type to require that anybody trying to reference it or inherit from it possess particular permission(s). The former applies to the immediate requestor only, while the latter applies to the whole inheritance chain. The presence of those demands in managed code triggers a check for the appropriate permission(s) at JIT time.
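
The difference in scope between an ordinary demand and a LinkDemand can be modeled in a few lines of Java. This is only a sketch of which callers are examined: permissions are plain strings, the method names demand and linkDemand are made up for the example, and the JIT-time timing of the real LinkDemand is not modeled.

```java
import java.util.List;
import java.util.Set;

// Hypothetical model contrasting a full stack walk with a link-time check.
final class DemandKindsSketch {

    // callers.get(0) is the immediate caller; later entries are further up the stack.
    // An ordinary demand requires every caller on the stack to hold the permission.
    static boolean demand(List<Set<String>> callers, String permission) {
        return callers.stream().allMatch(granted -> granted.contains(permission));
    }

    // A link demand looks at the immediate caller only.
    static boolean linkDemand(List<Set<String>> callers, String permission) {
        return !callers.isEmpty() && callers.get(0).contains(permission);
    }

    public static void main(String[] args) {
        Set<String> trustedLibrary = Set.of("FileIO");
        Set<String> untrustedApp = Set.of();
        // The untrusted application calls the protected code through the library.
        List<Set<String>> stack = List.of(trustedLibrary, untrustedApp);

        System.out.println(demand(stack, "FileIO"));     // false
        System.out.println(linkDemand(stack, "FileIO")); // true
    }
}
```

The second result shows why a check limited to the immediate caller offers no protection against a caller further up the stack that goes through a trusted intermediary.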

LinkDemand has a special application with strong-named assemblies in .NET, because such assemblies may have a higher level of trust from the user. To avoid their unintended malicious usage, .NET places an implicit LinkDemand for their callers to have been granted FullTrust; otherwise, a SecurityException is thrown during JIT compilation, when an under-privileged assembly tries to reference the strong-named assembly. The following implicit declarations are inserted by the CLR:

[PermissionSet(SecurityAction.LinkDemand,
Name="FullTrust")]
[PermissionSet(SecurityAction.InheritanceDemand,
Name="FullTrust")]


Consequently, if a strong-named assembly is intended for use by partially trusted assemblies (i.e., from code without FullTrust), it has to be marked by a special attribute -- [assembly:AllowPartiallyTrustedCallers], which effectively removes implicit LinkDemand checks for FullTrust. All other assembly/class/method level security checks are still in place and enforceable, so it is possible that a caller may still not possess enough privileges to utilize a strong-named assembly decorated with this attribute.

.NET assemblies have an option to specify their security requirements at the assembly load time. Here, in addition to individual permissions, they can operate on one of the built-in non-modifiable PermissionSets. There are three options for those requirements: RequestMinimum, RequestOptional, and RequestRefuse.

If the Minimum requirement cannot be satisfied, the assembly will not load at all. Optional permissions may enable certain features. Application of the RequestOptional modifier limits the permission set granted to the assembly to only optional and minimal permissions (see the formula in the following paragraph). RequestRefuse explicitly deprives the assembly of certain permissions (in case they were granted) in order to minimize chances that an assembly can be tricked into doing something harmful.
//Requesting minimum assembly permissions
//The request is placed on the assembly level.
using System.Security.Permissions;
[assembly:SecurityPermissionAttribute(
SecurityAction.RequestMinimum,
Flags = SecurityPermissionFlag.UnmanagedCode)]
namespace MyNamespace
{
...
} 
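
The interplay of the three request types with the policy can be sketched as set arithmetic in Java. The sketch assumes that the grant formula reads G = M + (O ∩ P) - R, with + as union and - as set difference; permissions are plain strings, the names are invented for the example, and the defaults used when a request type is omitted are not modeled.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model of load-time permission requests; not a .NET API.
final class GrantFormulaSketch {

    // G = M + (O ∩ P) - R, where P is what policy allows, M the minimum
    // request, O the optional request and R the refused permissions.
    static Set<String> granted(Set<String> policy, Set<String> minimum,
                               Set<String> optional, Set<String> refused) {
        if (!policy.containsAll(minimum)) {
            // Models "the assembly will not load at all"
            throw new SecurityException("minimum request not satisfied by policy");
        }
        Set<String> result = new TreeSet<>(optional);
        result.retainAll(policy);   // O ∩ P
        result.addAll(minimum);     // M + (O ∩ P)
        result.removeAll(refused);  // ... - R
        return result;
    }

    public static void main(String[] args) {
        Set<String> policy = Set.of("Execute", "FileRead", "FileWrite", "UI");
        Set<String> minimum = Set.of("Execute");
        Set<String> optional = Set.of("FileRead", "Registry");
        Set<String> refused = Set.of("FileWrite");
        System.out.println(granted(policy, minimum, optional, refused));
    }
}
```

Here main prints [Execute, FileRead]: Registry is requested but not allowed by policy, and UI is allowed by policy but never requested, which illustrates how RequestOptional narrows the grant.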
The CLR determines the final set of assembly permissions using the granted permissions, as specified in the .NET CAS policy, plus the load-time modifiers described earlier. The formula applied is as follows, where P is the set of policy-derived permissions, M the minimum request, O the optional request, and R the refused set:

G = M + (O ∩ P) - R

At run time, code can further modify the stack walk using the Assert, Deny, and PermitOnly actions. The Assert option explicitly succeeds the stack walk (for the given PermissionSet or any subset of it, as determined by the Intersect function), even if the upstream callers do not have the required permissions (it fails if sets intersections are not empty). Deny and PermitOnly effectively restrict the available permission sets for the downstream callers. The figure below represents an overview of the Code Access Security permission grants and checks in .NET:

In Java, permissions are normally checked by the SecurityManager (or an installed derivative), using the checkPermission method. It defines a helper for each major group of permissions, such as checkWrite for the write action of FilePermission. All checks are imperative; there are no declarative code access checks in the Java language. Each JVM can have at most one SecurityManager (or derivative) installed at a time -- once set, it cannot be replaced unless the policy grants the caller RuntimePermission "setSecurityManager".

Browsers always install a SecurityManager, so any Java application loaded from the Internet executes with security enabled. Locally started JVMs have to install a SecurityManager before exercising the first sensitive operation; this can be done programmatically:

System.setSecurityManager(new SecurityManager());

or using a command-line option:

java -Djava.security.manager MyClass

In Java 2, when determining application permissions, SecurityManager delegates the call to java.security.AccessController, which obtains a current snapshot of the AccessControlContext to determine which permissions are present. SecurityManager's operations may be influenced by a java.security.DomainCombiner implementation, if one exists. It instructs an existing SecurityManager to perform additional operations before security checks, thus allowing security system extensibility without re-implementing its core classes. JAAS uses this functionality to add principal-based security checks to the original code-based Java security.

When making access control decisions, the checkPermission method stops checking if it reaches a caller that was marked as "privileged" via a doPrivileged call without a context argument. If that caller's domain has the specified permission, no further checking is done and checkPermission returns quietly, indicating that the requested access is allowed. If that domain does not have the specified permission, an exception is thrown, as usual. Writing privileged code in Java is achieved by implementing the java.security.PrivilegedAction or PrivilegedExceptionAction interfaces. This approach is somewhat limiting, as it does not allow specifying the exact permissions to be asserted, while still requiring the callers to possess others -- it is an "all or nothing" proposition.
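
import java.security.PrivilegedAction;

//invoked via AccessController.doPrivileged(new PrivilegedClass())
public class PrivilegedClass implements PrivilegedAction<Object> {
    public Object run() {
        //perform privileged operation
        ...
        return null;
    }
}
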
Suppose the current thread traversed m callers, in the order of caller 1 to caller 2 to caller m, and caller m invoked the checkPermission method. This method determines whether access is granted or denied based on the following algorithm:

i = m;
while (i > 0) {
    if (caller i's domain
            does not have the permission)
        throw AccessControlException
    else if (caller i is marked as privileged) {
        if (a context was specified
                in the call to doPrivileged)
            context.checkPermission(permission)
        return;
    }
    i = i - 1;
}
// Next, check the context inherited when
// the thread was created. Whenever a new thread
// is created, the AccessControlContext at that
// time is stored and associated with the new
// thread, as the "inherited" context.
inheritedContext.checkPermission(permission);
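
The algorithm can be tried out with a small simulation. The Java sketch below is a simplified model, not the JVM's implementation: permissions are plain strings, the Frame record stands in for a caller and its protection domain, the optional context argument of doPrivileged is left out, and SecurityException is thrown in place of AccessControlException.

```java
import java.util.List;
import java.util.Set;

// Simplified simulation of the stack walk described above.
final class StackWalkSketch {

    // Hypothetical stand-in for a caller and the permissions of its domain.
    record Frame(String caller, Set<String> domainPermissions, boolean privileged) {}

    // frames holds caller 1 .. caller m in call order; the walk starts at caller m.
    static void checkPermission(List<Frame> frames, Set<String> inheritedContext,
                                String permission) {
        for (int i = frames.size() - 1; i >= 0; i--) {
            Frame frame = frames.get(i);
            if (!frame.domainPermissions().contains(permission)) {
                throw new SecurityException("access denied at " + frame.caller());
            }
            if (frame.privileged()) {
                return; // earlier callers are not examined
            }
        }
        // No privileged frame was found: consult the inherited context as well.
        if (!inheritedContext.contains(permission)) {
            throw new SecurityException("access denied by inherited context");
        }
    }

    // Convenience wrapper that reports the outcome as a boolean.
    static boolean isAllowed(List<Frame> frames, Set<String> inheritedContext,
                             String permission) {
        try {
            checkPermission(frames, inheritedContext, permission);
            return true;
        } catch (SecurityException denied) {
            return false;
        }
    }

    public static void main(String[] args) {
        Set<String> inherited = Set.of("readFontFile");
        Frame applet = new Frame("applet", Set.of(), false);
        Frame plainLibrary = new Frame("fontLibrary", Set.of("readFontFile"), false);
        Frame privilegedLibrary = new Frame("fontLibrary", Set.of("readFontFile"), true);

        // The applet calls the library; the walk reaches the applet and fails.
        System.out.println(isAllowed(List.of(applet, plainLibrary), inherited, "readFontFile"));
        // The library is marked privileged; the walk stops before the applet.
        System.out.println(isAllowed(List.of(applet, privilegedLibrary), inherited, "readFontFile"));
    }
}
```

This mirrors the font example given earlier: the first call is denied because of the unprivileged applet on the stack, while the second succeeds because the privileged library frame ends the walk.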
Note: .NET arms developers with an impressive arsenal of various features for access checks, easily surpassing Java in this respect.

Conclusions

In this article, the Code Access Security features of the Java and .NET platforms were reviewed. CAS features in .NET are significantly better than the ones Java can offer, with a single exception -- flexibility. Java, as is often the case, offers ease and configurability in policy handling that .NET cannot match.


Copyright © Vinay's Blog | Powered by Blogger
