Writing XML

Now suppose you don’t just want to dump out a bunch of raw numbers. Instead you want to produce a well-formed XML document such as Example 3.2.

Example 3.2. The first 10 Fibonacci numbers in an XML document

<?xml version="1.0"?>
<Fibonacci_Numbers>
  <fibonacci>1</fibonacci>
  <fibonacci>1</fibonacci>
  <fibonacci>2</fibonacci>
  <fibonacci>3</fibonacci>
  <fibonacci>5</fibonacci>
  <fibonacci>8</fibonacci>
  <fibonacci>13</fibonacci>
  <fibonacci>21</fibonacci>
  <fibonacci>34</fibonacci>
  <fibonacci>55</fibonacci>
</Fibonacci_Numbers>

To produce this, just add string literals for the <fibonacci> and </fibonacci> tags inside the print statements, as well as a few extra print statements to produce the XML declaration and the root element start- and end-tags. XML documents are just text, and you can output them any way you’d output any other text document. Example 3.3 demonstrates.

Example 3.3. A program that outputs the Fibonacci numbers as an XML document

import java.math.BigInteger;


public class FibonacciXML {

  public static void main(String[] args) {
   
      BigInteger low  = BigInteger.ONE;
      BigInteger high = BigInteger.ONE;      
      
      System.out.println("<?xml version=\"1.0\"?>");  
      System.out.println("<Fibonacci_Numbers>");  
      for (int i = 0; i < 10; i++) {
        System.out.print("  <fibonacci>");
        System.out.print(low);
        System.out.println("</fibonacci>");
        BigInteger temp = high;
        high = high.add(low);
        low = temp;
      }
      System.out.println("</Fibonacci_Numbers>");  

  }

}

Better Coding Practices

I’m going to keep the examples in this book as small and simple as I can, and save my lines of code for the points that are relevant to XML. Don’t take that as an excuse to forget the good programming practices you’ve learned in the past, even when I don’t follow them here. For instance, scattering string literals throughout the code makes a program hard to localize and maintain. You may wish to move element names and other constants into static final fields as shown in Example 3.4. You might even want to read the values of these fields from a resource bundle or system properties so they can be edited independently of the source code. I’m not going to do this here because, for teaching purposes, short, self-contained examples that make the key points easy to grasp matter more. In larger, more complex programs, however, these techniques are essential.

Example 3.4. Using named constants for element names

import java.math.BigInteger;


public class FibonacciConstants {

  public final static String rootElementName 
   = "Fibonacci_Numbers";
  public final static String fibonacciElementName = "fibonacci";
  public final static String xmlDeclaration 
   = "<?xml version=\"1.0\"?>";

  public static void main(String[] args) {
   
      BigInteger low  = BigInteger.ONE;
      BigInteger high = BigInteger.ONE;      
      
      System.out.println(xmlDeclaration);  
      System.out.println("<" + rootElementName + ">");  
      for (int i = 0; i < 10; i++) {
        System.out.print("  <" + fibonacciElementName + ">");
        System.out.print(low);
        System.out.println("</" + fibonacciElementName + ">");
        BigInteger temp = high;
        high = high.add(low);
        low = temp;
      }
      System.out.println("</" + rootElementName + ">");  

  }

}
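
Reading the element names from system properties, as suggested above, might look something like the following sketch. The property names fibonacci.rootElementName and fibonacci.elementName are invented for this illustration, as is the build() method. The defaults match the constants in Example 3.4.

```java
import java.math.BigInteger;


public class FibonacciProperties {

  // Builds the document as a string so the element names can be
  // supplied by the caller instead of being hard-coded.
  public static String build(String rootName, String itemName,
   int count) {

      StringBuilder out = new StringBuilder();
      BigInteger low  = BigInteger.ONE;
      BigInteger high = BigInteger.ONE;

      out.append("<?xml version=\"1.0\"?>\n");
      out.append("<" + rootName + ">\n");
      for (int i = 0; i < count; i++) {
        out.append("  <" + itemName + ">" + low
         + "</" + itemName + ">\n");
        BigInteger temp = high;
        high = high.add(low);
        low = temp;
      }
      out.append("</" + rootName + ">\n");
      return out.toString();

  }

  public static void main(String[] args) {

      // Hypothetical property names; the defaults match Example 3.4.
      String root = System.getProperty(
       "fibonacci.rootElementName", "Fibonacci_Numbers");
      String item = System.getProperty(
       "fibonacci.elementName", "fibonacci");
      System.out.print(build(root, item, 10));

  }

}
```

With this in place, running java -Dfibonacci.elementName=number FibonacciProperties would rename the child elements without recompiling. Note that the sketch does not check that the supplied values are legal XML names. That remains your responsibility.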

Attributes

Now let’s suppose you want to add some attributes to the elements. For instance, you might want to give each fibonacci element an index attribute specifying which Fibonacci number it is (the first, the second, the third, and so forth).

To do this, just add the extra strings to the source code, and calculate the index value from the loop index. Example 3.5 demonstrates.

Example 3.5. A Java program that writes an XML document that uses attributes

import java.math.BigInteger;


public class FibonacciAttributes {

  public static void main(String[] args) {
   
      BigInteger low  = BigInteger.ONE;
      BigInteger high = BigInteger.ONE;      
      
      System.out.println("<?xml version=\"1.0\"?>");  
      System.out.println("<Fibonacci_Numbers>");  
      for (int i = 1; i <= 10; i++) {
        System.out.print("  <fibonacci index=\"" + i + "\">");
        System.out.print(low);
        System.out.println("</fibonacci>");
        BigInteger temp = high;
        high = high.add(low);
        low = temp;
      }
      System.out.println("</Fibonacci_Numbers>");  

  }

}

The only even remotely difficult part of this is realizing you have to escape the double quotes used to delimit the attribute values as \" in the Java source code. An alternative is to use single quotes to delimit the attribute values. These don’t need to be escaped inside Java string literals. For example,

        System.out.print("  <fibonacci index='" + i + "'>");

Here’s the output from Example 3.5:

<?xml version="1.0"?>
<Fibonacci_Numbers>
  <fibonacci index="1">1</fibonacci>
  <fibonacci index="2">1</fibonacci>
  <fibonacci index="3">2</fibonacci>
  <fibonacci index="4">3</fibonacci>
  <fibonacci index="5">5</fibonacci>
  <fibonacci index="6">8</fibonacci>
  <fibonacci index="7">13</fibonacci>
  <fibonacci index="8">21</fibonacci>
  <fibonacci index="9">34</fibonacci>
  <fibonacci index="10">55</fibonacci>
</Fibonacci_Numbers>

Producing Valid XML

So far the Java programs have produced XML documents that were well-formed but not valid. Making them valid is not difficult. Simply print a document type declaration that either includes an internal DTD subset or points to an appropriate external DTD subset. Example 3.6 takes the internal subset approach:

Example 3.6. A Java program that generates a valid document

import java.math.BigInteger;


public class ValidFibonacci {

  public static void main(String[] args) {
   
      BigInteger low  = BigInteger.ONE;
      BigInteger high = BigInteger.ONE;      
      
      System.out.println("<?xml version=\"1.0\"?>");  
      System.out.println("<!DOCTYPE Fibonacci_Numbers [");  
      System.out.println(
       "  <!ELEMENT Fibonacci_Numbers (fibonacci)*>");  
      System.out.println("  <!ELEMENT fibonacci (#PCDATA)>");  
      System.out.println(
       "  <!ATTLIST fibonacci index NMTOKEN #REQUIRED>");  
      System.out.println("]>");  
      System.out.println("<Fibonacci_Numbers>");  
      for (int i = 0; i < 10; i++) {
        System.out.print("  <fibonacci index=\"" + i + "\">");
        System.out.print(low);
        System.out.println("</fibonacci>");
        BigInteger temp = high;
        high = high.add(low);
        low = temp;
      }
      System.out.println("</Fibonacci_Numbers>");  

  }

}

Here’s the output from this program, including the document type declaration:

<?xml version="1.0"?>
<!DOCTYPE Fibonacci_Numbers [
  <!ELEMENT Fibonacci_Numbers (fibonacci)*>
  <!ELEMENT fibonacci (#PCDATA)>
  <!ATTLIST fibonacci index NMTOKEN #REQUIRED>
]>
<Fibonacci_Numbers>
  <fibonacci index="0">1</fibonacci>
  <fibonacci index="1">1</fibonacci>
  <fibonacci index="2">2</fibonacci>
  <fibonacci index="3">3</fibonacci>
  <fibonacci index="4">5</fibonacci>
  <fibonacci index="5">8</fibonacci>
  <fibonacci index="6">13</fibonacci>
  <fibonacci index="7">21</fibonacci>
  <fibonacci index="8">34</fibonacci>
  <fibonacci index="9">55</fibonacci>
</Fibonacci_Numbers>
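
The external subset approach needs only a one-line document type declaration. The following is a minimal sketch, assuming the three declarations from Example 3.6 have been saved in a file named fibonacci.dtd next to the output document. Both that file name and the docType() helper method are hypothetical.

```java
import java.math.BigInteger;


public class ExternalDTDFibonacci {

  // Returns a document type declaration that points to an external
  // DTD subset. systemID is the URL or relative path of the DTD.
  public static String docType(String rootName, String systemID) {
    return "<!DOCTYPE " + rootName + " SYSTEM \"" + systemID + "\">";
  }

  public static void main(String[] args) {

      BigInteger low  = BigInteger.ONE;
      BigInteger high = BigInteger.ONE;

      System.out.println("<?xml version=\"1.0\"?>");
      // fibonacci.dtd is a hypothetical file name
      System.out.println(docType("Fibonacci_Numbers", "fibonacci.dtd"));
      System.out.println("<Fibonacci_Numbers>");
      for (int i = 1; i <= 10; i++) {
        System.out.print("  <fibonacci index=\"" + i + "\">");
        System.out.print(low);
        System.out.println("</fibonacci>");
        BigInteger temp = high;
        high = high.add(low);
        low = temp;
      }
      System.out.println("</Fibonacci_Numbers>");

  }

}
```

The document is only valid if a parser can actually find the DTD at the location given by the system identifier, so the DTD file has to be deployed along with the output.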

Attaching a schema is no harder. Just place the necessary xmlns:xsi and xsi:noNamespaceSchemaLocation attributes on the root element.
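
As a rough sketch, the root element start-tag with these two attributes could be written like this. The schema file name fibonacci.xsd and the rootStartTag() helper method are assumptions made for the illustration. The schema itself is not shown.

```java
import java.math.BigInteger;


public class SchemaFibonacci {

  // Returns the root element start-tag carrying the two schema
  // instance attributes. schemaLocation is the URL or relative
  // path of the schema document.
  public static String rootStartTag(String schemaLocation) {
    return "<Fibonacci_Numbers"
     + " xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\""
     + " xsi:noNamespaceSchemaLocation=\"" + schemaLocation + "\">";
  }

  public static void main(String[] args) {

      BigInteger low  = BigInteger.ONE;
      BigInteger high = BigInteger.ONE;

      System.out.println("<?xml version=\"1.0\"?>");
      // fibonacci.xsd is a hypothetical file name
      System.out.println(rootStartTag("fibonacci.xsd"));
      for (int i = 1; i <= 10; i++) {
        System.out.print("  <fibonacci index=\"" + i + "\">");
        System.out.print(low);
        System.out.println("</fibonacci>");
        BigInteger temp = high;
        high = high.add(low);
        low = temp;
      }
      System.out.println("</Fibonacci_Numbers>");

  }

}
```

The noNamespaceSchemaLocation attribute is the right choice here because the Fibonacci_Numbers and fibonacci elements are not in any namespace.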

Namespaces

Suppose that instead of using a custom Fibonacci number vocabulary, you want to use the standard MathML vocabulary as shown in Example 3.7. The element names all have prefixes, and an xmlns:mathml attribute on the root element binds the mathml prefix to the http://www.w3.org/1998/Math/MathML namespace URI. Each Fibonacci number is placed in a mathml:mrow element that contains a mathml:mi element, a mathml:mo element, and a mathml:mn element.

Example 3.7. A MathML document containing Fibonacci numbers

<?xml version="1.0"?>
<mathml:math xmlns:mathml="http://www.w3.org/1998/Math/MathML">
  <mathml:mrow>
    <mathml:mi>f(1)</mathml:mi>
    <mathml:mo>=</mathml:mo>
    <mathml:mn>1</mathml:mn>
  </mathml:mrow>
  <mathml:mrow>
    <mathml:mi>f(2)</mathml:mi>
    <mathml:mo>=</mathml:mo>
    <mathml:mn>1</mathml:mn>
  </mathml:mrow>
  <mathml:mrow>
    <mathml:mi>f(3)</mathml:mi>
    <mathml:mo>=</mathml:mo>
    <mathml:mn>2</mathml:mn>
  </mathml:mrow>
  <mathml:mrow>
    <mathml:mi>f(4)</mathml:mi>
    <mathml:mo>=</mathml:mo>
    <mathml:mn>3</mathml:mn>
  </mathml:mrow>
</mathml:math>

The markup is somewhat more complex, but the Java code is not significantly more so, as Example 3.8 demonstrates.

Example 3.8. A Java program that generates a MathML document

import java.math.BigInteger;


public class MathMLFibonacci {

  public static void main(String[] args) {
   
      BigInteger low  = BigInteger.ONE;
      BigInteger high = BigInteger.ONE;      
      
      System.out.println("<?xml version=\"1.0\"?>");  
      System.out.println(
        "<mathml:math "
        + "xmlns:mathml=\"http://www.w3.org/1998/Math/MathML\">"
      );  
      for (int i = 1; i <= 10; i++) {
        System.out.println("  <mathml:mrow>");
        System.out.println("    <mathml:mi>f(" + i 
         + ")</mathml:mi>");
        System.out.println("    <mathml:mo>=</mathml:mo>");
        System.out.println("    <mathml:mn>" + low 
         + "</mathml:mn>");
        System.out.println("  </mathml:mrow>");
        BigInteger temp = high;
        high = high.add(low);
        low = temp;
      }
      System.out.println("</mathml:math>");  

  }

}

I could continue, showing you how to add comments, processing instructions, CDATA sections and other features of XML; but I think by now you’re getting the idea. XML documents are just text. They can be represented inside Java programs as various combinations of String literals and String variables. Other data types like int and BigInteger can be converted to their string representations either implicitly by concatenating them with strings or explicitly by invoking methods such as toString() and String.valueOf(). The fact that XML documents are text makes it very easy to organize XML output this way.
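
For the curious, here is one brief sketch of those other features, written the same way as everything else in this section. The style sheet name fibonacci.css and the rule element are invented for the illustration. They are not part of the earlier examples.

```java
public class FibonacciExtras {

  // Builds a small document containing a processing instruction,
  // a comment, and a CDATA section, all from string literals.
  public static String build() {

      StringBuilder out = new StringBuilder();
      out.append("<?xml version=\"1.0\"?>\n");
      // Processing instruction; fibonacci.css is a hypothetical file
      out.append("<?xml-stylesheet type=\"text/css\""
       + " href=\"fibonacci.css\"?>\n");
      // Comment; the text must not contain two consecutive hyphens
      out.append("<!-- The first two Fibonacci numbers -->\n");
      out.append("<Fibonacci_Numbers>\n");
      out.append("  <fibonacci>1</fibonacci>\n");
      out.append("  <fibonacci>1</fibonacci>\n");
      // CDATA section; < and & need no escaping inside it,
      // but the text must not contain ]]>
      out.append("  <rule><![CDATA[f(n) = f(n-1) + f(n-2)"
       + " for 2 < n]]></rule>\n");
      out.append("</Fibonacci_Numbers>\n");
      return out.toString();

  }

  public static void main(String[] args) {
      System.out.print(build());
  }

}
```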

When you use this technique, you are responsible for following all the well-formedness, namespace well-formedness, and validity rules of XML. Nothing prevents you from producing incorrect XML. In fact, your first few efforts are likely to be malformed. You’ll want to run your output through a tool such as Xerces’s sax.Counter to check it for well-formedness and perhaps validity. Making sure the output is correct is simply one part of testing and debugging your code. The same would still be true if you were outputting a non-XML format. In fact, if anything, XML makes these tests easier because it’s straightforward to write declarative schemas that express exactly what is and is not legal and to compare your output against these schemas.
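
If you’d rather check well-formedness from inside a test than from the command line, a sketch along the following lines should work. It uses the JAXP SAX parser that ships with the JDK rather than sax.Counter, and it checks well-formedness only, not validity. The class and method names are invented for the illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;


public class WellFormednessCheck {

  // Returns true if the parser reads the whole document without
  // reporting a fatal error, false otherwise.
  public static boolean isWellFormed(String xml) {

      try {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        SAXParser parser = factory.newSAXParser();
        // DefaultHandler ignores the content but rethrows
        // fatal errors as SAXParseExceptions
        parser.parse(new InputSource(new StringReader(xml)),
         new DefaultHandler());
        return true;
      }
      catch (SAXException e) {
        return false; // the document is malformed
      }
      catch (ParserConfigurationException | IOException e) {
        // A problem with the parser setup, not with the document
        throw new IllegalStateException(e);
      }

  }

  public static void main(String[] args) {
      // Matching tags: prints true
      System.out.println(isWellFormed(
       "<Fibonacci_Numbers><fibonacci>1</fibonacci>"
       + "</Fibonacci_Numbers>"));
      // Missing </fibonacci> end-tag: prints false
      System.out.println(isWellFormed(
       "<Fibonacci_Numbers><fibonacci>1</Fibonacci_Numbers>"));
  }

}
```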

Eventually we’ll take up some alternatives to the direct string approach such as DOM and JDOM that do allow you to automatically maintain well-formedness and sometimes even validity. However, for many simple cases, these are vast overkill. It can be much simpler to just write a few strings onto an output stream. In the next section, we’ll look at how you can change the characteristics of the output stream you’re writing on.


Copyright 2001, 2002 Elliotte Rusty Harold
elharo@metalab.unc.edu
Last Modified August 10, 2001