SOAP

XML-RPC was in large part invented by a single person who really didn’t know a lot about XML. Consequently he made many very questionable choices; and since XML-RPC did not go through any sort of standardization process, there was nobody to fix his mistakes. For instance, in XML-RPC the string type is defined as an “ASCII string”. Now frankly, this is just plain dumb, as well as not a little ethnocentric. XML documents are Unicode, not ASCII. Modern programming languages like Java can handle Unicode without any trouble. Indeed a language that can’t process Unicode really isn’t suitable for processing XML. There is no good reason to limit XML-RPC strings to ASCII. I certainly wouldn’t say you have to use non-ASCII characters in your XML-RPC documents, but if you want to they should certainly be allowed. However, the inventor of XML-RPC also happened to be the vendor of an ASCII-limited database, so he inserted the ASCII-only constraint into XML-RPC so he wouldn’t have to upgrade his database to support Unicode.

There are a lot of other issues like that with XML-RPC, some equally obvious, some more subtle. Nonetheless, clearly XML-RPC was a good idea in principle if not execution. Consequently work began on a more serious effort to enable remote procedure calls by passing XML documents over HTTP. This effort is known as the Simple Object Access Protocol, or just SOAP. Whereas XML-RPC was a quick hack by one developer, SOAP has been developed by a committee of XML experts from various companies including IBM and Microsoft.

You’ve undoubtedly heard the old saw about a camel being a horse designed by committee. However, the fact is a camel is actually superbly adapted to its environment. SOAP is a much more robust protocol than XML-RPC. It is much better designed from an XML standpoint as well. It takes advantage of numerous features of XML such as attributes, Unicode, and namespaces that XML-RPC either ignores or actively opposes. XML-RPC is adequate for simple tasks. However, if you get serious with it you rapidly hit a wall. SOAP can take you a lot farther. Although there are some basic services available using XML-RPC, the future clearly lies with SOAP.

The biggest conceptual difference between SOAP and XML-RPC is that XML-RPC exchanges a limited number of parameters of six fixed types, plus structs and arrays. However, SOAP allows you to send the server arbitrary XML elements. This is a much more flexible approach.

A SOAP Example

Let’s investigate how the stock quote example would likely be implemented in SOAP. Encoded as a SOAP document, the request document looks quite different, but the same information is present as demonstrated in Example 2.15.

Example 2.15. A SOAP document requesting the current stock price of Red Hat

<?xml version="1.0"?>
<SOAP-ENV:Envelope
 xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" >
  <SOAP-ENV:Body>
    <getQuote xmlns="http://namespaces.cafeconleche.org/xmljava/ch2/">
      <symbol>RHAT</symbol>
    </getQuote>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>

The most obvious difference between this document and the XML-RPC equivalent in Example 2.6 is the use of namespaces. Namespaces allow the method request to be an arbitrary XML element. This goes way beyond passing just a method name and some argument values. SOAP permits much more complex XML messages than does XML-RPC.

The server’s response is equally flexible. Example 2.16 demonstrates:

Example 2.16. A SOAP Response

<?xml version="1.0"?>
<SOAP-ENV:Envelope
 xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
  <SOAP-ENV:Body>
    <Quote xmlns="http://namespaces.cafeconleche.org/xmljava/ch2/">
      <Price>4.12</Price>
    </Quote>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>

These two examples are minimal SOAP documents. The root element of every SOAP document is Envelope, which must be in the http://schemas.xmlsoap.org/soap/envelope/ namespace, at least in SOAP 1.1. (SOAP 1.2 uses a different namespace URI.) Normally a prefix is used, and as always you can pick any prefix as long as the URI stays the same. In this chapter, I always assume that the prefix SOAP-ENV is mapped to that namespace URI. (This is the prefix the SOAP 1.1 specification uses.)

Each SOAP-ENV:Envelope element contains exactly one SOAP-ENV:Body element. The content of this element is one or more XML elements specific to the service. These examples use Quote, getQuote, and Price elements in the http://namespaces.cafeconleche.org/xmljava/ch2/ namespace. Other services will use other elements from other namespaces. It’s also permissible to use elements from no namespace at all, though using namespaces is highly recommended.
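To make the envelope structure concrete, here is a minimal sketch of how a Java program might dig the service-specific element out of a SOAP 1.1 document using the standard DOM API. The class name SoapEnvelopeReader and its methods are my own invention for illustration; they are not part of any SOAP toolkit. Note that the code compares namespace URIs and local names, never prefixes.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class, not part of any SOAP toolkit.
final class SoapEnvelopeReader {

  static final String SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/";

  /** Returns the first element inside SOAP-ENV:Body of a SOAP 1.1 document. */
  static Element firstBodyElement(String xml) {
    Element envelope = parse(xml).getDocumentElement();
    if (!isSoapElement(envelope, "Envelope")) {
      throw new IllegalArgumentException("Root element is not a SOAP 1.1 Envelope");
    }
    for (Node child = envelope.getFirstChild(); child != null; child = child.getNextSibling()) {
      if (isSoapElement(child, "Body")) {
        for (Node n = child.getFirstChild(); n != null; n = n.getNextSibling()) {
          if (n instanceof Element) return (Element) n;
        }
      }
    }
    throw new IllegalArgumentException("No element found inside SOAP-ENV:Body");
  }

  // The prefix is irrelevant; only the namespace URI and local name are compared.
  private static boolean isSoapElement(Node node, String localName) {
    return node instanceof Element
        && SOAP_ENV.equals(node.getNamespaceURI())
        && localName.equals(node.getLocalName());
  }

  static Document parse(String xml) {
    try {
      DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
      factory.setNamespaceAware(true); // without this, namespace URIs are not reported
      return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    } catch (ParserConfigurationException | SAXException | IOException e) {
      throw new IllegalArgumentException("Could not parse SOAP document", e);
    }
  }
}
```

Given Example 2.15 as input, firstBodyElement() would return the getQuote element, and a server could dispatch on its namespace URI and local name.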

Posting SOAP documents

Currently, most SOAP messages are passed over HTTP using POST, just like XML-RPC messages. Other transport protocols such as SMTP, BEEP, and Jabber can be supported as well. However, there are a couple of crucial differences in the HTTP headers used for SOAP:

  • The HTTP request header must contain a SOAPAction field.

  • If the SOAP request fails, the server should return an HTTP 500 Internal Server Error rather than 200 OK.

The SOAPAction field alerts web servers and firewalls that they’re dealing with a SOAP message. This allows firewalls to more easily filter SOAP requests without looking at the request body. The value of the SOAPAction field is a double quoted URI that somehow indicates the intent of the message. For instance, if Example 2.15 were POSTed to a servlet running on www.ibiblio.org under the control of the user elharo, then you might use the SOAPAction http://www.ibiblio.org/#elharo to indicate to the server and firewall who was responsible for processing this request. This is shown in Example 2.17.

Example 2.17. A SOAP document requesting the current stock price of Red Hat

POST /xml/cgi-bin/SOAPHandler HTTP/1.1
Content-Type: text/xml; charset="utf-8"
Content-Length: 267
SOAPAction: "http://www.ibiblio.org/#elharo"

<?xml version="1.0"?>
<SOAP-ENV:Envelope
 xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" >
  <SOAP-ENV:Body>
    <getQuote xmlns="http://namespaces.cafeconleche.org/xmljava/ch2/">
      <symbol>RHAT</symbol>
    </getQuote>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>

Conceptually, SOAPAction URIs are very similar to namespace URIs because they aren’t meant to be resolved. They’re just a convenient way of assigning unique identifiers to certain classes of SOAP messages. There’s no particular standard for how they’re chosen. You might use the full absolute URL that receives the SOAP request or you might use some previously agreed upon URI. You can even use nothing at all. But the SOAPAction header must be present in order for the request to be identified as a SOAP request.

The server will normally send the response back to the client over the same socket the client used to send the request and then close the connection. Like any other HTTP response, a SOAP response begins with an HTTP return code, message, and header. Assuming the request was successful, then the response code is 200 OK. Unlike the request, the response does not use any special header fields beyond those used by regular web browsers and servers. Example 2.18 demonstrates:

Example 2.18. A SOAP document returning the current stock price of Red Hat

HTTP/1.0 200 OK
Content-Type: text/xml; charset="utf-8"
Content-Length: 260

<?xml version="1.0"?>
<SOAP-ENV:Envelope
 xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
  <SOAP-ENV:Body>
    <Quote xmlns="http://namespaces.cafeconleche.org/xmljava/ch2/">
      <Price>4.12</Price>
    </Quote>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
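The request side of this exchange can be sketched in a few lines of Java. The following is only an illustration of how the header in Example 2.17 is assembled; the class and method names are hypothetical. It adds a Host field, which Example 2.17 omits but HTTP 1.1 requires, and it computes Content-Length from the UTF-8 bytes of the envelope rather than from the number of characters.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of any SOAP toolkit.
final class SoapHttpRequest {

  /** Builds the raw text of an HTTP POST that carries a SOAP 1.1 envelope. */
  static String build(String path, String host, String soapAction, String envelope) {
    // Content-Length counts bytes, not characters, so encode before measuring.
    int length = envelope.getBytes(StandardCharsets.UTF_8).length;
    return "POST " + path + " HTTP/1.1\r\n"
         + "Host: " + host + "\r\n"
         + "Content-Type: text/xml; charset=\"utf-8\"\r\n"
         + "Content-Length: " + length + "\r\n"
         + "SOAPAction: \"" + soapAction + "\"\r\n" // the value is a double quoted URI
         + "\r\n"
         + envelope;
  }
}
```

In practice you would more likely hand the envelope to an HTTP client library and set the SOAPAction field through its API, but the bytes on the wire would look much the same.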

Faults

It’s a fact of life that requests fail. They may fail for reasons beyond the control of the SOAP provider. For instance, you may launch your SOAP request into the ether just before the phone company severs the wire connecting you to the Internet while hooking up your neighbor’s new DSL line. That sort of failure would make itself manifest at a lower layer, below XML and SOAP, probably as a SocketException if you’re working in Java.

It’s also possible that your request will successfully arrive at the server, only to find that the server doesn’t recognize the URL you’re posting to. In fact, the server might not even be configured to support SOAP requests. This sort of error would not throw an exception, but would return a 404 Not Found page rather than the expected SOAP response. Your code should be prepared to handle such events.

Finally, it’s also possible that the SOAP responder itself can be reached and correctly invoked, but that it cannot process the request. This may be because the request contained bad data (e.g. a symbol for a stock that doesn’t exist) or simply because the server code is buggy and encountered a problem. In these cases the SOAP server itself is responsible for producing the correct error message. This error message is a SOAP response with a SOAP-ENV:Envelope and a SOAP-ENV:Body, just like a normal response. However, the SOAP-ENV:Body must contain exactly one SOAP-ENV:Fault element and may not contain anything else.

The SOAP-ENV:Fault element contains up to four child elements:

faultcode

A faultcode element contains a qualified name such as SOAP-ENV:VersionMismatch that identifies the fault.

faultstring

A faultstring element contains a plain text message for human readers describing the fault.

faultactor

The faultactor element contains a URI identifying the node that generated the fault. It’s used when a SOAP request is passed through a chain of handlers. This element is optional.

detail

A detail element contains child elements describing a fault that is specifically related to the body of the request (e.g. the stock symbol was not recognized). It is present if and only if the fault was related to the SOAP body as opposed to the SOAP header.

Caution

These four elements are not namespace qualified, which is a little surprising. They are not in the http://schemas.xmlsoap.org/soap/envelope/ namespace. They are not in some other namespace. They are in no namespace at all.

SOAP defines four specific fault codes in the http://schemas.xmlsoap.org/soap/envelope/ namespace to indicate common conditions in a generic way. These are:

SOAP-ENV:VersionMismatch

The namespace of the SOAP-ENV:Envelope element indicates that this message is intended for a server implementing a different version of the SOAP protocol; e.g. a SOAP 1.2 message has been sent to a SOAP 1.1 server.

SOAP-ENV:MustUnderstand

There’s something in the header that the message says the server has to understand before acting, but the server does not recognize. (I’ll talk about this soon in the section on SOAP headers.)

SOAP-ENV:Client

The client sent a message that is somehow defective. Perhaps it omitted a key piece of information the server needs. For instance, the getQuote message was sent and understood, but the getQuote element did not have a symbol child. The client is to blame for the problem.

SOAP-ENV:Server

The client sent a correctly formed message with all the necessary information, but some error prevented the server from processing it. For example, the server may have needed to connect to a remote database to retrieve some information, and the database server had crashed. The server is to blame for the problem.

Example 2.19 is a fault that might be returned in response to a request for the non-existent stock ABCD. The faultcode element is set to SOAP-ENV:Client to indicate that the client’s request was incorrect. The faultstring element just contains a brief string of unmarked up text that can be used to more fully describe the problem to a human reader. The detail content includes elements in the same namespace as the successful response, http://namespaces.cafeconleche.org/xmljava/ch2/. Since this request was processed by a single node, no faultactor element is necessary.

Example 2.19. A SOAP fault response

HTTP/1.0 500 Internal Server Error
Content-Type: text/xml; charset="utf-8"
Content-Length: 488

<?xml version="1.0"?>
<SOAP-ENV:Envelope
 xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" 
 xmlns:stock="http://namespaces.cafeconleche.org/xmljava/ch2/">
  <SOAP-ENV:Body>
    <SOAP-ENV:Fault>
      <faultcode>SOAP-ENV:Client</faultcode>
      <faultstring>
        There is no stock with the symbol ABCD.
      </faultstring>
      <detail>
        <stock:InvalidSymbol>ABCD</stock:InvalidSymbol>
      </detail>
    </SOAP-ENV:Fault>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
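On the client side, a fault like this has to be detected before the response is treated as a result. The following is a rough sketch, using only the standard DOM API, of how a client might pull the fault code and fault string out of a response body. The SoapFaults class is hypothetical. Notice that the SOAP-ENV:Fault element is matched by namespace URI, while its children are matched as elements in no namespace at all.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class, not part of any SOAP toolkit.
final class SoapFaults {

  static final String SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/";

  /** Returns {faultcode, faultstring} if the response is a fault, or null if it is not. */
  static String[] readFault(String responseXml) {
    Document doc = parse(responseXml);
    NodeList faults = doc.getElementsByTagNameNS(SOAP_ENV, "Fault");
    if (faults.getLength() == 0) return null; // an ordinary response
    Element fault = (Element) faults.item(0);
    return new String[] {
      unqualifiedChildText(fault, "faultcode"),
      unqualifiedChildText(fault, "faultstring")
    };
  }

  // The children of SOAP-ENV:Fault are in no namespace, so getNamespaceURI() must be null.
  private static String unqualifiedChildText(Element parent, String name) {
    for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
      if (n instanceof Element && n.getNamespaceURI() == null
          && name.equals(n.getLocalName())) {
        return n.getTextContent().trim();
      }
    }
    return null;
  }

  static Document parse(String xml) {
    try {
      DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
      factory.setNamespaceAware(true);
      return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    } catch (ParserConfigurationException | SAXException | IOException e) {
      throw new IllegalArgumentException("Could not parse SOAP document", e);
    }
  }
}
```

One practical wrinkle if you fetch the response with java.net.HttpURLConnection: because a fault arrives with a 500 status code, getInputStream() throws an IOException, and the fault document has to be read from getErrorStream() instead.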

Encoding Styles

The information encoded in the SOAP documents you’ve seen so far has all been nothing more than Unicode text strings. When you want to encode other types such as integers, arrays, and objects, you need to specify how the characters that make up the XML document should be deserialized into the local platform’s understanding of those types. For example, if a Java program encounters the element <Price>4.12</Price> should it convert this into a double? a float? a java.lang.String? a java.math.BigDecimal? a custom Price class? something else?

Any element in a SOAP document can have a SOAP-ENV:encodingStyle attribute whose value is a URI pointing to some kind of schema that says what types are assigned to which elements. The most common language to use for this schema is the W3C XML Schema Language. However, other schema languages such as RELAX NG are allowed too.

Example 2.20 uses the SOAP-ENV:encodingStyle attribute on the getQuote element to point to a schema at the relative URL trading.xsd. This schema defines the symbol element as having the custom type StockSymbol, and is shown in Example 2.21. This schema is just used for assigning types. It is not used for validation, though with a little extra work it could be.

Example 2.20. A SOAP document that specifies the encoding style

<?xml version="1.0"?>
<SOAP-ENV:Envelope
 xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
  <SOAP-ENV:Body>
    <getQuote xmlns="http://namespaces.cafeconleche.org/xmljava/ch2/"
              SOAP-ENV:encodingStyle="trading.xsd">
      <symbol>RHAT</symbol>
    </getQuote>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>

Example 2.21. A schema that assigns types to elements in the http://namespaces.cafeconleche.org/xmljava/ch2/ namespace

<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://namespaces.cafeconleche.org/xmljava/ch2/"
xmlns="http://namespaces.cafeconleche.org/xmljava/ch2/"
elementFormDefault="qualified">

  <xsd:element name="getQuote">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="symbol" type="StockSymbol" 
                     maxOccurs="unbounded"/>
      </xsd:sequence>
    </xsd:complexType>  
  </xsd:element>

  <xsd:simpleType name="StockSymbol">
    <xsd:restriction base="xsd:string">
      <!-- two to six upper case letters -->
      <xsd:pattern value="[A-Z][A-Z][A-Z]?[A-Z]?[A-Z]?[A-Z]?"/>
    </xsd:restriction>
  </xsd:simpleType>

</xsd:schema>

The SOAP-ENV:encodingStyle attribute can be placed on any element in the document. It applies to that element and its descendants, and overrides the schemas declared on any ancestor. It is commonly placed on the root SOAP-ENV:Envelope element.
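The scoping rule amounts to a walk up the tree: the nearest SOAP-ENV:encodingStyle attribute on the element or one of its ancestors wins. Here is a small sketch of that lookup with the DOM API. The EncodingStyles class is hypothetical, and the code only finds the attribute value; what a program then does with the schema it names is a separate matter.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class, not part of any SOAP toolkit.
final class EncodingStyles {

  static final String SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/";

  /** Returns the encodingStyle value in scope for the element, or null if there is none. */
  static String inScope(Element element) {
    // Start at the element itself, then climb. The loop stops at the Document node,
    // which is the parent of the root element but is not itself an Element.
    for (Node n = element; n instanceof Element; n = n.getParentNode()) {
      Element e = (Element) n;
      if (e.hasAttributeNS(SOAP_ENV, "encodingStyle")) {
        return e.getAttributeNS(SOAP_ENV, "encodingStyle");
      }
    }
    return null;
  }

  static Document parse(String xml) {
    try {
      DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
      factory.setNamespaceAware(true);
      return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    } catch (ParserConfigurationException | SAXException | IOException e) {
      throw new IllegalArgumentException("Could not parse SOAP document", e);
    }
  }
}
```

Applied to Example 2.20, inScope() would report trading.xsd for both getQuote and symbol, but nothing for SOAP-ENV:Body or SOAP-ENV:Envelope, since they lie outside the element that carries the attribute.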

SOAP singles out one encoding style for special treatment. If the SOAP-ENV:encodingStyle attribute has the value http://schemas.xmlsoap.org/soap/encoding/, then a predefined set of types is available that includes one element for each simple type defined in the W3C XML Schema Language and listed in Table 2.1. For instance, assuming the SOAP-ENC prefix is bound to the http://schemas.xmlsoap.org/soap/encoding/ URI (note that this is not the same as the namespace URI or prefix for the SOAP envelope), an int can be placed in a SOAP-ENC:int element like this:

<SOAP-ENC:int>12</SOAP-ENC:int>

The complete list of types and their normal Java semantics is given in Table 2.2, though this really just mirrors Table 2.1. In some cases the mapping is obvious. For instance, an xsd:int is exactly a Java int, and an xsd:double is as close to a Java double as it’s possible for a base-10 string to be. In many cases, however, Java does not have a type that exactly matches one of the derived types, so it uses a broader type instead. For example, Java does not have an unsigned integer type, but all values of type xsd:unsignedInt can fit into a Java long. Java does not have a PositiveInteger class, but all xsd:positiveIntegers can be represented by a java.math.BigInteger. In still other cases, different programs may use different Java types and objects to deserialize the same values. An xsd:anyURI, for example, could reasonably be converted to a java.net.URL, a java.lang.String, or some custom URI class.

Table 2.2. Simple Value Elements defined in SOAP

SOAP type                     Java type
SOAP-ENC:string               java.lang.String
SOAP-ENC:boolean              boolean
SOAP-ENC:decimal              java.math.BigDecimal
SOAP-ENC:float                float
SOAP-ENC:double               double
SOAP-ENC:integer              java.math.BigInteger
SOAP-ENC:positiveInteger      java.math.BigInteger
SOAP-ENC:nonPositiveInteger   java.math.BigInteger
SOAP-ENC:negativeInteger      java.math.BigInteger
SOAP-ENC:nonNegativeInteger   java.math.BigInteger
SOAP-ENC:long                 long
SOAP-ENC:int                  int
SOAP-ENC:short                short
SOAP-ENC:byte                 byte
SOAP-ENC:unsignedLong         double or java.math.BigInteger
SOAP-ENC:unsignedInt          long
SOAP-ENC:unsignedShort        int
SOAP-ENC:unsignedByte         int
SOAP-ENC:duration             custom class
SOAP-ENC:dateTime             java.util.Date
SOAP-ENC:time                 java.sql.Time
SOAP-ENC:date                 java.sql.Date
SOAP-ENC:gYearMonth           custom class
SOAP-ENC:gYear                custom class, int, or java.math.BigInteger
SOAP-ENC:gMonthDay            custom class
SOAP-ENC:gDay                 custom class or int
SOAP-ENC:gMonth               custom class or int
SOAP-ENC:hexBinary            byte[]
SOAP-ENC:base64Binary         byte[]
SOAP-ENC:anyURI               java.net.URL, java.lang.String, or a custom class
SOAP-ENC:QName                java.lang.String or a custom class
SOAP-ENC:NOTATION             org.w3c.dom.Notation
SOAP-ENC:normalizedString     java.lang.String
SOAP-ENC:token                java.lang.String
SOAP-ENC:language             java.lang.String or a custom class
SOAP-ENC:NMTOKEN              java.lang.String or a custom class
SOAP-ENC:NMTOKENS             java.lang.String or a custom class
SOAP-ENC:Name                 java.lang.String
SOAP-ENC:NCName               java.lang.String
SOAP-ENC:ID                   java.lang.String
SOAP-ENC:IDREF                java.lang.String
SOAP-ENC:IDREFS               an array or list of java.lang.Strings or a custom class
SOAP-ENC:ENTITY               org.w3c.dom.Entity
SOAP-ENC:ENTITIES             an org.w3c.dom.NodeList containing org.w3c.dom.Entity objects

These mappings are not written in stone. Some of the XMLish types like SOAP-ENC:ENTITY and SOAP-ENC:IDREFS are particularly uncertain, and may be implemented in different ways in different environments. However, this should give you a pretty good idea of the sorts of mappings that are possible between SOAP types and Java types.
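To give a feel for what such a mapping looks like in code, here is a sketch that converts the text of a simple value element into a Java object, choosing the target type from the element’s local name. It covers only a handful of rows from Table 2.2, and the SoapSimpleTypes class is hypothetical. It also takes some shortcuts I should be upfront about: it does no range checking on the unsigned types, and it relies on Java’s own number parsers, which do not accept every lexical form the schema types allow (the INF spelling of infinity, for instance).

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Hypothetical helper class, not part of any SOAP toolkit.
final class SoapSimpleTypes {

  /** Converts element text to a Java object based on the SOAP-ENC element's local name. */
  static Object decode(String localName, String text) {
    String value = text.trim();
    switch (localName) {
      case "int":     return Integer.valueOf(value);
      case "long":    return Long.valueOf(value);
      case "short":   return Short.valueOf(value);
      case "byte":    return Byte.valueOf(value);
      case "float":   return Float.valueOf(value);
      case "double":  return Double.valueOf(value);
      case "decimal": return new BigDecimal(value);
      case "boolean":
        // The schema boolean type allows both true/false and 1/0.
        return value.equals("true") || value.equals("1");
      case "integer":
      case "positiveInteger":
      case "nonPositiveInteger":
      case "negativeInteger":
      case "nonNegativeInteger":
        return new BigInteger(value);
      case "unsignedInt":
        return Long.valueOf(value);    // widened: Java has no unsigned int
      case "unsignedShort":
      case "unsignedByte":
        return Integer.valueOf(value); // widened for the same reason
      default:
        return value;                  // everything else falls back to a String
    }
  }
}
```

For example, decode("int", "12") would produce an Integer, while decode("unsignedInt", "4294967295") would produce a Long, since that value does not fit in a Java int.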

Besides this list of simple types, the http://schemas.xmlsoap.org/soap/encoding/ encoding also defines concepts of structs, references, byte arrays, and arrays.

Structs

A struct is just an element that contains child elements but no mixed content. For example, this is a Quote struct that contains Symbol and Price members:

<Quote xmlns="http://namespaces.cafeconleche.org/xmljava/ch2/">
  <Symbol>RHAT</Symbol>
  <Price>4.12</Price>
</Quote>

In Java terms, by using the http://schemas.xmlsoap.org/soap/encoding/ encoding style, you’re indicating that you would like this element deserialized into an object of type Quote, which has two properties named Symbol and Price. In other words, the class definition looks something like this:

public class Quote {

  private String symbol;
  private double price;

  public String getSymbol() { return this.symbol; }
  public double getPrice()  { return this.price; }

}

You may or may not have such a class in your system. If the SOAP request began its life as a Quote object which was then converted to XML, transmitted across the Internet, and then turned back into a Java object, then perhaps you do have such a class. On the other hand, perhaps the object began its life as a C struct or a C++ object; or perhaps it was never anything except an XML document. In these cases there may not be a convenient Quote class into which you can deserialize this compound object. Another possibility is to decode the name-value pairs into some form of Hashtable or HashMap. The names of the fields would be the keys and the values of the fields would be the values.

What this encoding really tells you is roughly how the author intended this document to be handled. However if you have some other way of making sense of this data, you can use it. You are not limited to any one deserialization form.
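When there is no matching class on hand, the map-based approach is easy to sketch. The following hypothetical SoapStructs class reads a struct element into a java.util.Map whose keys are the member names and whose values are the members’ text. It assumes a flat struct whose members are all simple values; nested structs would need recursion.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class, not part of any SOAP toolkit.
final class SoapStructs {

  /** Decodes a flat struct element into a map from member name to member text. */
  static Map<String, String> toMap(String structXml) {
    Element struct = parseRoot(structXml);
    // LinkedHashMap keeps the members in document order.
    Map<String, String> members = new LinkedHashMap<>();
    for (Node n = struct.getFirstChild(); n != null; n = n.getNextSibling()) {
      if (n instanceof Element) { // skip the white space between members
        members.put(n.getLocalName(), n.getTextContent());
      }
    }
    return members;
  }

  private static Element parseRoot(String xml) {
    try {
      DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
      factory.setNamespaceAware(true);
      return factory.newDocumentBuilder()
          .parse(new InputSource(new StringReader(xml)))
          .getDocumentElement();
    } catch (ParserConfigurationException | SAXException | IOException e) {
      throw new IllegalArgumentException("Could not parse struct", e);
    }
  }
}
```

For the Quote struct shown earlier, the resulting map would have two entries, Symbol and Price. The values are still strings; converting the price to a double or a BigDecimal is a decision left to the program.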

References

A reference type uses an href attribute to point to a value stored elsewhere in the SOAP request. This mirrors the structure when two objects both have to contain the same object. For example, consider this trade request:

<Bid xmlns="http://namespaces.cafeconleche.org/xmljava/ch2/">
  <Symbol>RHAT</Symbol>
  <Price>4.12</Price>
  <Account>777-7777</Account>
</Bid>
<Bid xmlns="http://namespaces.cafeconleche.org/xmljava/ch2/">
  <Symbol>YHOO</Symbol>
  <Price>4.12</Price>
  <Account>777-7777</Account>
</Bid>

In both cases the account number is the same. Furthermore, it’s not just that the two numbers are equal. They indicate the same object. In Java terms it’s the difference between the equals() method and the == operator. The first tests for equality while the second tests for identity. If the local semantics demand that each Account element be deserialized as an Account object, perhaps with other fields filled in from a database rather than the XML document, then you want some means of saying this document should produce one Account object rather than two. This is done with a reference. Give the first Account element a unique id attribute and use an href attribute in the second element to point to it like this:

<Bid xmlns="http://namespaces.cafeconleche.org/xmljava/ch2/">
  <Symbol>RHAT</Symbol>
  <Price>4.12</Price>
  <Account id="a1">777-7777</Account>
</Bid>
<Bid xmlns="http://namespaces.cafeconleche.org/xmljava/ch2/">
  <Symbol>YHOO</Symbol>
  <Price>4.12</Price>
  <Account href="#a1"/>
</Bid>

This document represents two Bid objects. Each has three properties: Symbol, Price, and Account. The Symbols are completely different. The prices are equal but not identical; that is, one can change without changing the other. There are two separate prices here that coincidentally have the same value. The Accounts, however, are identical. There is only one account here, used in two different places.
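Resolving such a reference in Java might look something like the sketch below. The SoapReferences class is hypothetical. It searches the tree by hand rather than calling Document.getElementById(), because without a DTD or schema the parser has no way of knowing that the id attribute has type ID. It also only understands local references of the form #name.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class, not part of any SOAP toolkit.
final class SoapReferences {

  /** Follows href="#id" to the element with that id; returns the element itself if it has no href. */
  static Element resolve(Element element) {
    String href = element.getAttribute("href"); // "" when the attribute is absent
    if (!href.startsWith("#")) return element;
    Element root = element.getOwnerDocument().getDocumentElement();
    return findById(root, href.substring(1));
  }

  // Depth-first search for an element whose id attribute matches; null if none does.
  private static Element findById(Element current, String id) {
    if (id.equals(current.getAttribute("id"))) return current;
    for (Node n = current.getFirstChild(); n != null; n = n.getNextSibling()) {
      if (n instanceof Element) {
        Element found = findById((Element) n, id);
        if (found != null) return found;
      }
    }
    return null;
  }

  static Document parse(String xml) {
    try {
      DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
      factory.setNamespaceAware(true);
      return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    } catch (ParserConfigurationException | SAXException | IOException e) {
      throw new IllegalArgumentException("Could not parse SOAP document", e);
    }
  }
}
```

Because both Account elements resolve to the same DOM node, a deserializer that caches one Java object per resolved node would naturally produce a single shared Account object rather than two equal ones.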

Arrays

In Java, arrays are a funny kind of object, and in SOAP they are too. An array is represented as an element whose type is SOAP-ENC:Array. For example, this is an array of three numbers:

<Bid xsi:type="SOAP-ENC:Array">
  <Price>4.52</Price>
  <Price>0.35</Price>
  <Price>34.68</Price>
</Bid>

In an array the names of the elements don’t really mean anything. Only the positions matter. If the name of the array doesn’t matter either, you can use a SOAP-ENC:Array element instead. For example, this is an array of three doubles, with no extra information:

<SOAP-ENC:Array>
  <SOAP-ENC:double>4.52</SOAP-ENC:double>
  <SOAP-ENC:double>0.35</SOAP-ENC:double>
  <SOAP-ENC:double>34.68</SOAP-ENC:double>
</SOAP-ENC:Array>

SOAP arrays are not as strongly typed as Java arrays, at least by default. Whereas each array in Java must contain just ints or just strings or just objects, a SOAP array can contain data of varying types. For example, this array contains three items, each with a different name and type:

<Bid xsi:type="SOAP-ENC:Array">
  <Symbol  xsi:type="xsd:token">RHAT</Symbol>
  <Price   xsi:type="xsd:double">4.12</Price>
  <Account xsi:type="xsd:string">777-7777</Account>
</Bid>

Given this possibility, it can be difficult to decode a SOAP array into a Java array. The closest Java equivalent is an Object[] array. However, primitive types like double would need to be replaced by an instance of the matching type wrapper class such as java.lang.Double instead. Another possibility is to use a java.util.Vector or java.util.ArrayList instead of a straight array, though this still doesn’t remove the need for the type wrapper classes.

If you want to restrict the type of array components, you can add a SOAP-ENC:arrayType attribute to the array element. The value of this attribute is the type of the individual component followed by square brackets containing the length of the array. This is more similar to C’s array declaration syntax than Java’s. For example, this array must contain exactly three doubles:

<Bid xsi:type="SOAP-ENC:Array" SOAP-ENC:arrayType="xsd:double[3]">
  <Price>4.52</Price>
  <Price>0.35</Price>
  <Price>34.68</Price>
</Bid>

Any array component not specifically typed otherwise can be a struct. Furthermore any array component can be another array. However, this does not produce a multidimensional array. Instead, multidimensional arrays are created by stringing together the values from the second row after the values from the first, the values from the third row after the values from the second, and so on. The SOAP-ENC:arrayType attribute gives the size of each dimension, rows first and then columns, separated by a comma. For example, this is a three row by two column array of doubles:

<SOAP-ENC:Array SOAP-ENC:arrayType="xsd:double[3,2]">
  <SOAP-ENC:double>1.1</SOAP-ENC:double>
  <SOAP-ENC:double>1.2</SOAP-ENC:double>
  <SOAP-ENC:double>2.1</SOAP-ENC:double>
  <SOAP-ENC:double>2.2</SOAP-ENC:double>
  <SOAP-ENC:double>3.1</SOAP-ENC:double>
  <SOAP-ENC:double>3.2</SOAP-ENC:double>
</SOAP-ENC:Array>

Although the XML representation is one-dimensional, the Java interpretation is two-dimensional. When deserialized, this forms the following Java array:

double[][] array = {
  {1.1, 1.2},
  {2.1, 2.2},
  {3.1, 3.2}
};
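The reshaping step is simple enough to sketch directly. The hypothetical MultiDimArrays class below pulls the dimensions out of an arrayType value such as xsd:double[3,2] and then folds a flat, row-by-row list of values into a two-dimensional Java array. It assumes that every dimension is given explicitly as a number.

```java
// Hypothetical helper class, not part of any SOAP toolkit.
final class MultiDimArrays {

  /** Extracts the dimensions from an arrayType value such as "xsd:double[3,2]". */
  static int[] dimensions(String arrayType) {
    int open = arrayType.lastIndexOf('[');
    int close = arrayType.lastIndexOf(']');
    if (open < 0 || close < open) {
      throw new IllegalArgumentException("No dimensions in " + arrayType);
    }
    String[] parts = arrayType.substring(open + 1, close).split(",");
    int[] dims = new int[parts.length];
    for (int i = 0; i < parts.length; i++) {
      dims[i] = Integer.parseInt(parts[i].trim());
    }
    return dims;
  }

  /** Folds values listed row after row into a rows-by-cols array. */
  static double[][] reshape(double[] flat, int rows, int cols) {
    if (flat.length != rows * cols) {
      throw new IllegalArgumentException(
          "Expected " + (rows * cols) + " values but found " + flat.length);
    }
    double[][] result = new double[rows][cols];
    for (int i = 0; i < flat.length; i++) {
      result[i / cols][i % cols] = flat[i]; // row = i / cols, column = i % cols
    }
    return result;
  }
}
```

The division and remainder in reshape() are the whole trick: the value at position i in the XML document lands in row i / cols and column i % cols.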

In the interest of efficiency over potentially slow networks, SOAP allows partially transmitted and sparse arrays. A partially transmitted array (also known as a varying array) does not begin with position 0. Instead it begins at a specified index. For instance, it might have ten components indexed from 3 to 12 inclusive. In SOAP you indicate the position at which a partially transmitted array begins with a SOAP-ENC:offset attribute. The value of this attribute is the zero-based index of the first transmitted component, enclosed in square brackets. For example, this array begins at 3:

<SOAP-ENC:Array SOAP-ENC:offset="[3]">
  <SOAP-ENC:string>Component 3</SOAP-ENC:string>   
  <SOAP-ENC:string>Component 4</SOAP-ENC:string>   
  <SOAP-ENC:string>Component 5</SOAP-ENC:string>   
  <SOAP-ENC:string>Component 6</SOAP-ENC:string>   
  <SOAP-ENC:string>Component 7</SOAP-ENC:string>   
  <SOAP-ENC:string>Component 8</SOAP-ENC:string>   
  <SOAP-ENC:string>...</SOAP-ENC:string>   
</SOAP-ENC:Array>

Java doesn’t have such arrays, though Pascal and some other languages do. In Java you’d probably deserialize such an array by putting null values or zeroes in the places before the beginning of the array.
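Here is one way that padding might be done. This is only a sketch, and the PartialArrays class is hypothetical. Since the example above has no SOAP-ENC:arrayType attribute to give the full size, the code assumes the array ends with the last transmitted component.

```java
import java.util.List;

// Hypothetical helper class, not part of any SOAP toolkit.
final class PartialArrays {

  /** Parses an offset attribute value such as "[3]" into the int 3. */
  static int parseOffset(String offsetAttribute) {
    String trimmed = offsetAttribute.trim();
    if (!trimmed.startsWith("[") || !trimmed.endsWith("]")) {
      throw new IllegalArgumentException("Not an offset: " + offsetAttribute);
    }
    return Integer.parseInt(trimmed.substring(1, trimmed.length() - 1).trim());
  }

  /** Places the transmitted components at the offset; earlier slots are left null. */
  static String[] expand(String offsetAttribute, List<String> transmitted) {
    int offset = parseOffset(offsetAttribute);
    // Assumption: the array ends at the last transmitted component.
    String[] result = new String[offset + transmitted.size()];
    for (int i = 0; i < transmitted.size(); i++) {
      result[offset + i] = transmitted.get(i);
    }
    return result;
  }
}
```

An array of a primitive type such as double[] would be filled with zeroes instead of nulls, which Java does automatically when the array is allocated.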

A sparse array is one in which a very large percentage of the components are 0 or null. In SOAP a sparse array would only pass the non-zero/non-null components. However, when the array was deserialized, these would be filled in with zeroes or nulls. For a sparse array, the number of elements in the array must be specified by a SOAP-ENC:arrayType attribute. The position of each element that is provided is given by a SOAP-ENC:position attribute. Positions, like Java array indexes, are counted from zero. For example, this is a 10-element array that only provides the components at positions 2, 3, and 5:

<SOAP-ENC:Array SOAP-ENC:arrayType="xsd:string[10]">
  <SOAP-ENC:string SOAP-ENC:position="[2]">
    2nd component
  </SOAP-ENC:string>
  <SOAP-ENC:string SOAP-ENC:position="[3]">
    3rd component
  </SOAP-ENC:string>
  <SOAP-ENC:string SOAP-ENC:position="[5]">
    5th component
  </SOAP-ENC:string>   
</SOAP-ENC:Array>

The equivalent Java code looks like this:

   String[] array = new String[10];
   array[2] = "\n    2nd component\n  ";
   array[3] = "\n    3rd component\n  ";
   array[5] = "\n    5th component\n  ";

Byte Arrays

A byte array is just a string that somehow encodes binary data. The most common such encoding is Base-64. A schema or an xsi:type attribute is needed to identify the encoding. For example, this is a Base-64 encoded byte array that provides a SHA-1 based digital signature for a document. The raw signature here is 40 bytes, which becomes 56 characters when translated to Base-64:

<SignatureValue>
AgGOvkMdqdKT7QyMuXPsuomkOqqEhGukKkj4Em7OKKQxYzheuseS8Q==
</SignatureValue>

In Java this would normally be deserialized into a byte array.
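In Java 8 and later, the decoding itself can be handed to java.util.Base64. The sketch below uses the MIME flavor of the decoder because it skips characters outside the Base-64 alphabet, including the line breaks that surround the value inside the element. The ByteArrayValues class name is hypothetical.

```java
import java.util.Base64;

// Hypothetical helper class, not part of any SOAP toolkit.
final class ByteArrayValues {

  /** Decodes the Base-64 text content of an element, ignoring surrounding white space. */
  static byte[] decode(String elementText) {
    // The MIME decoder ignores line separators and other characters
    // that are not part of the Base-64 alphabet.
    return Base64.getMimeDecoder().decode(elementText);
  }
}
```

Passing the text content of the SignatureValue element above to decode() would yield a 40-byte array.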

SOAP Headers

As well as the body of the request, each SOAP document can contain a header. This is not an HTTP header. Rather it is an additional child of the SOAP-ENV:Envelope element, specifically a SOAP-ENV:Header element. If a SOAP request is an envelope, then the body is the letter inside the envelope, and the header is the writing on the outside of the envelope that tells the Post Office where to deliver it to, where to send it back if they can’t deliver it, and how much you paid to get the letter delivered. In other words, a SOAP header provides meta-information about the request.

The sort of meta-information provided varies from request to request and SOAP application to SOAP application. Some things that can be exchanged in headers include:

  • Protocols the server must understand to process the request.

  • A digital signature for the body of the message

  • A schema for the XML application used in the body

  • Credit card info to pay for the processing

  • A public key to be used to encrypt the response

For instance, Example 2.22 shows a bid document in which the header carries credit card information to pay for the request. In this case, the syntax used for the Payment element is specific to the XML application used in the body and even comes from the same namespace.

Example 2.22. A SOAP Request with payment information in the header

<?xml version="1.0"?>
<SOAP-ENV:Envelope
 xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" >
  <SOAP-ENV:Header>
    <Payment xmlns="http://namespaces.cafeconleche.org/xmljava/ch2/">
      <Name>Elliotte Harold</Name>
      <Issuer>VISA</Issuer>
      <Number>5125456787651230</Number>
      <Expires>2005-12</Expires>
    </Payment>
  </SOAP-ENV:Header>
  <SOAP-ENV:Body>
    <buy id="buy1"
         xmlns="http://namespaces.cafeconleche.org/xmljava/ch2/">
      <symbol>MRBA</symbol> 
      <shares>100</shares> 
      <account>777-7777</account> 
    </buy>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>

Like the SOAP body, the SOAP header can use any XML application it cares to use to encode the data. It is not limited to a fixed vocabulary. Indeed it can use more than one such vocabulary. The SOAP-ENV:Header element can contain multiple child elements from a hodge-podge of different namespaces. Each one of these elements is called a header entry, and may be treated independently of the other header entries. For instance, Example 2.23 adds a second header entry containing a digital signature for the request body. The syntax used for the Signature element is defined by XML-Signature Syntax and Processing.

Example 2.23. A SOAP Request with two header entries

<?xml version="1.0"?>
<SOAP-ENV:Envelope
 xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" >
  <SOAP-ENV:Header>
    <Payment xmlns="http://namespaces.cafeconleche.org/xmljava/ch2/">
      <Name>Elliotte Harold</Name>
      <Issuer>VISA</Issuer>
      <Number>5125456787651230</Number>
      <Expires>2005-12</Expires>
    </Payment>
    <Signature xmlns="http://www.w3.org/2000/09/xmldsig#">
      <SignedInfo>
        <CanonicalizationMethod
          Algorithm="http://www.w3.org/TR/2001/REC-xml-c14n-20010315"/>
        <SignatureMethod
          Algorithm="http://www.w3.org/2000/09/xmldsig#dsa-sha1" />
        <Reference URI="file://J/xss4j/requestbody.xml">
          <DigestMethod
            Algorithm="http://www.w3.org/2000/09/xmldsig#sha1" />
          <DigestValue>3UxhLrdPpK3faRms5FOS6kAoeZI=</DigestValue>
        </Reference>
      </SignedInfo>
      <SignatureValue>
        ZeW/PYGT6A9iOqOrbMmeKOq1aQk+ars/QOC95Bj0xYrNAnLo/WK7+g==
      </SignatureValue>
    </Signature>
  </SOAP-ENV:Header>
  <SOAP-ENV:Body>
    <buy id="buy1"
         xmlns="http://namespaces.cafeconleche.org/xmljava/ch2/">
      <symbol>MRBA</symbol> 
      <shares>100</shares> 
      <account>777-7777</account> 
    </buy>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>

The mustUnderstand attribute

An individual SOAP document tends to be tied pretty closely to the service it plans to talk to. You can’t send a request for a stock quote to a server that’s designed to provide basketball scores and expect to get sensible results back. To indicate what it requires of a server, a SOAP request can place a SOAP-ENV:mustUnderstand attribute on any header entry. If this attribute has the value 1, then the service that receives the SOAP request must process the header entry. If it cannot, whether because it does not understand the header entry or for some other reason, then it must fail the request and return a fault. If the SOAP-ENV:mustUnderstand attribute has the value 0, then processing the header entry is optional. The service should process it if it can, but failing to do so does not automatically lead to a fault. The default is 0.

For example, Example 2.24 is a buy order that requires the receiver to understand the Payment header entry. If the server does not recognize that entry, it must not attempt to fulfill the order.

Example 2.24. A SOAP Request with a mustUnderstand attribute

<?xml version="1.0"?>
<SOAP-ENV:Envelope
 xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" >
  <SOAP-ENV:Header>
    <Payment xmlns="http://namespaces.cafeconleche.org/xmljava/ch2/"
             SOAP-ENV:mustUnderstand="1">
      <Name>Elliotte Harold</Name>
      <Issuer>VISA</Issuer>
      <Number>5125456787651230</Number>
      <Expires>2005-12</Expires>
    </Payment>
  </SOAP-ENV:Header>
  <SOAP-ENV:Body>
    <buy xmlns="http://namespaces.cafeconleche.org/xmljava/ch2/">
      <symbol>MRBA</symbol> 
      <shares>100</shares> 
      <account>777-7777</account> 
    </buy>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
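
If a server that did not recognize the Payment header entry received Example 2.24, its response might look something like the following sketch. The SOAP-ENV:MustUnderstand fault code is defined by SOAP 1.1; the faultstring text shown here is only illustrative, and a real server may word it differently or include additional detail.

```xml
<?xml version="1.0"?>
<SOAP-ENV:Envelope
 xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" >
  <SOAP-ENV:Body>
    <SOAP-ENV:Fault>
      <!-- Standard SOAP 1.1 fault code for a mandatory header entry
           that was not understood -->
      <faultcode>SOAP-ENV:MustUnderstand</faultcode>
      <!-- Illustrative human-readable message; the wording is up to the server -->
      <faultstring>Payment header entry not understood</faultstring>
    </SOAP-ENV:Fault>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
```

The important point is that the order is not partially processed: the server reports the fault instead of buying the shares.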

The actor attribute

Although this book mostly focuses on SOAP messages that go straight from the sending system to the receiver that will process them, not all systems are this simple. A SOAP message can be forwarded from one SOAP processor to the next until it reaches its ultimate destination. By default, the headers are only read by the last processor. However, you can indicate that a header entry is intended for a closer processor by placing a SOAP-ENV:actor attribute on the header entry element. The value of this attribute is a URI identifying the processor the header entry is intended for.

When a processor receives a SOAP message, it searches the header for header entries addressed to it. It acts on and deletes these header entries. It may also add new header entries intended for processors later in the chain. Then it forwards the message on to the next processor in the chain.

Every processor except the last one acts only on header entries that are specifically addressed to it. In addition, the special URI http://schemas.xmlsoap.org/soap/actor/next addresses a header entry to the first processor that sees it, which processes and deletes that entry. The last processor in the chain processes any header entries that are not addressed to any processor in particular, as well as any header entries that are addressed specifically to it.
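
As a rough sketch of how this looks in a message, the following request carries one header entry addressed to the next processor and one with no actor attribute at all. The Authorization element, its Token child, and the http://namespaces.example.com/gateway/ namespace are hypothetical names invented for this illustration; only the SOAP-ENV:actor attribute and the next URI come from SOAP 1.1.

```xml
<?xml version="1.0"?>
<SOAP-ENV:Envelope
 xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" >
  <SOAP-ENV:Header>
    <!-- Hypothetical entry: processed and deleted by the first
         processor that sees the message -->
    <Authorization xmlns="http://namespaces.example.com/gateway/"
      SOAP-ENV:actor="http://schemas.xmlsoap.org/soap/actor/next">
      <Token>hypothetical-token-value</Token>
    </Authorization>
    <!-- No actor attribute: left untouched by intermediaries and
         handled by the last processor in the chain -->
    <Payment xmlns="http://namespaces.cafeconleche.org/xmljava/ch2/">
      <Name>Elliotte Harold</Name>
      <Issuer>VISA</Issuer>
      <Number>5125456787651230</Number>
      <Expires>2005-12</Expires>
    </Payment>
  </SOAP-ENV:Header>
  <SOAP-ENV:Body>
    <buy xmlns="http://namespaces.cafeconleche.org/xmljava/ch2/">
      <symbol>MRBA</symbol>
      <shares>100</shares>
      <account>777-7777</account>
    </buy>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
```

By the time this message reaches the final processor, the Authorization entry would normally be gone, and only the Payment entry and the body would remain.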

The exact scheme for forwarding SOAP messages from one processor to the next is system dependent. For instance, you might set up a gateway server outside the firewall that verified certain characteristics of a SOAP message before it forwarded it to a processor inside the firewall. Such a gateway would either block or forward each message. A switching processor might inspect the body of the message and forward the request to different SOAP processors depending on what it saw there. Some systems might even use routing included in the messages themselves.

SOAP Limitations

Regrettably, in my opinion, SOAP does not allow developers to take full advantage of XML’s expressiveness and extensibility. First of all, according to the SOAP 1.1 specification, “A SOAP message MUST NOT contain a Document Type Declaration.” This allows non-validating parsers and parsers that cannot resolve external entities to be used to process SOAP messages without concern that they may be misinterpreting them because they don’t apply default attribute values or resolve external entities. However, it also means the document can’t be validated against a DTD.

Also according to the SOAP 1.1 specification, “A SOAP message MUST NOT contain Processing Instructions.” Honestly, this makes no sense to me; I see little reason for forbidding them. It does mean that all information in a SOAP request must be passed through the defined SOAP structure, but it also means that it’s difficult to include other useful features beyond the SOAP structure. The most obvious is that you can’t easily apply a style sheet to a SOAP document, though that’s not a huge loss since SOAP documents aren’t meant for humans to read in the first place. However, it also means it’s difficult to serve SOAP documents out of the Cocoon application server. There are probably many other environment-specific instances where this becomes inconvenient.
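
To make the two prohibitions concrete, here is a sketch of a document that is perfectly well-formed XML but is not a legal SOAP 1.1 message on both counts. The style sheet name order.xsl and the DTD name envelope.dtd are hypothetical file names chosen only for illustration.

```xml
<?xml version="1.0"?>
<!-- Not allowed in SOAP 1.1: a processing instruction -->
<?xml-stylesheet type="text/xsl" href="order.xsl"?>
<!-- Not allowed in SOAP 1.1: a document type declaration -->
<!DOCTYPE SOAP-ENV:Envelope SYSTEM "envelope.dtd">
<SOAP-ENV:Envelope
 xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" >
  <SOAP-ENV:Body>
    <buy xmlns="http://namespaces.cafeconleche.org/xmljava/ch2/">
      <symbol>MRBA</symbol>
      <shares>100</shares>
      <account>777-7777</account>
    </buy>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
```

Removing the two marked lines leaves an ordinary SOAP request.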

Validating SOAP

SOAP is actively hostile to DTDs. The SOAP specification specifically forbids a SOAP request from containing a document type declaration. Thus you really have to use a schema to validate your documents, if you validate them at all.

Unlike XML-RPC, SOAP does have an official schema. In fact it has two, which you can download from the SOAP namespace URLs. The envelope schema at http://schemas.xmlsoap.org/soap/envelope/ describes the SOAP complex types: SOAP-ENV:Envelope, SOAP-ENV:Header, SOAP-ENV:Body, etc. The encoding schema at http://schemas.xmlsoap.org/soap/encoding/ defines the SOAP data types shown in Table 2.2: SOAP-ENC:int, SOAP-ENC:NMTOKENS, SOAP-ENC:gYear, SOAP-ENC:Array, etc. These schemas are a little too long to reproduce here, but you can find them in Appendix B.

XML-RPC is a monolithic XML application not designed to be integrated with other XML applications. SOAP, by contrast, is incomplete without some other XML application to form the body of the SOAP request. Thus the SOAP schema cannot be monolithic. It must rely on some other XML application in its own namespace (or perhaps no namespace at all, though this is not recommended). Consequently, the SOAP schema cannot validate any SOAP documents on its own. You also need to provide a separate schema for the bodies of your documents, and then you need to merge the two together using xsd:import elements.

For example, Example 2.25 is a master schema for quote request documents such as Example 2.15. This schema declares no elements of its own but does import both SOAP schemas, as well as the schema for getQuote elements seen earlier in Example 2.21. This schema can be used to validate a complete SOAP request that has a getQuote body element. If you wanted to validate the other SOAP documents in this chapter that use other elements in the header and body, you’d just need to write declarations for those elements too. They could be placed in the master schema, trading.xsd, or their own schema documents, whichever seems most convenient.

Example 2.25. A Master Schema for SOAP Trading documents

<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Standard SOAP schemas -->
  <xsd:import 
    schemaLocation="http://schemas.xmlsoap.org/soap/envelope/"
    namespace="http://schemas.xmlsoap.org/soap/envelope/" 
  />
  <xsd:import 
    schemaLocation="http://schemas.xmlsoap.org/soap/encoding/"
    namespace="http://schemas.xmlsoap.org/soap/encoding/" 
  />

  <!-- Local schema -->
  <xsd:import schemaLocation="trading.xsd"
    namespace="http://namespaces.cafeconleche.org/xmljava/ch2/"
  />  
  
</xsd:schema>
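
For instance, declarations for the buy element used in this chapter’s order examples might look something like the following sketch, which could be added to trading.xsd. The simple types chosen here (xsd:string, xsd:positiveInteger, and xsd:ID) are assumptions made for illustration; the examples themselves do not say what constraints the trading service actually imposes.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
  targetNamespace="http://namespaces.cafeconleche.org/xmljava/ch2/"
  elementFormDefault="qualified">

  <!-- elementFormDefault="qualified" is needed because the child
       elements inherit the default namespace in the instance documents -->
  <xsd:element name="buy">
    <xsd:complexType>
      <xsd:sequence>
        <!-- Assumed types; adjust to the real service's rules -->
        <xsd:element name="symbol"  type="xsd:string"/>
        <xsd:element name="shares"  type="xsd:positiveInteger"/>
        <xsd:element name="account" type="xsd:string"/>
      </xsd:sequence>
      <!-- Optional id attribute, as used in some of the examples -->
      <xsd:attribute name="id" type="xsd:ID"/>
    </xsd:complexType>
  </xsd:element>

</xsd:schema>
```

The Payment header entry would need a similar declaration before messages such as Example 2.24 could be validated in full.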

Copyright 2001, 2002 Elliotte Rusty Harold
elharo@metalab.unc.edu
Last Modified July 20, 2001