Relational Databases

Relational databases are another common source of data for XML documents. Some databases such as FileMaker Pro have built-in support for outputting tables as XML documents. Others do not. However, even if your database can export tables as XML documents, its XML format may not be the XML format you want. Fortunately, as long as there’s a JDBC driver for your database of choice, it’s not hard to extract the information from it and write that information into an XML document in the desired form.

For this example, I’ll use the same budget data previously read out of a CSV file in a single relational table that reflects the original flat structure of the files distributed by the Office of Management and Budget. Doubtless they have the data in their own relational databases, probably divided up into multiple tables; but they don’t publish it that way. They do state that “If you plan to use these data in a relational database, you should designate the following fields as ‘primary’ to uniquely identify each row of data: agency code, bureau code, account code, subfunction code, BEA category, Grant/Nongrant, and On- Off-budget field.” The SQL CREATE TABLE statement that initializes this table is:

CREATE TABLE BudgetAuthorizationTable (
  AgencyCode             CHAR(3),
  AgencyName             VARCHAR(89),
  BureauCode             CHAR(2),
  BureauName             VARCHAR(89),
  AccountCode            VARCHAR(6),
  AccountName            VARCHAR(160),
  TreasuryAgencyCode     CHAR(2),
  SubfunctionCode        CHAR(3),
  SubfunctionTitle       VARCHAR(72),
  BEACategory            VARCHAR(13),
  On-Off-BudgetIndicator VARCHAR(10),
  FY1976                 INTEGER,
  TransitionQuarter      INTEGER,
  FY1977                 INTEGER,
  FY1978                 INTEGER,
  FY1979                 INTEGER,
  FY1980                 INTEGER,
  FY1981                 INTEGER,
  FY1982                 INTEGER,
  FY1983                 INTEGER,
  FY1984                 INTEGER,
  FY1985                 INTEGER,
  FY1986                 INTEGER,
  FY1987                 INTEGER,
  FY1988                 INTEGER,
  FY1989                 INTEGER,
  FY1990                 INTEGER,
  FY1991                 INTEGER,
  FY1992                 INTEGER,
  FY1993                 INTEGER,
  FY1994                 INTEGER,
  FY1995                 INTEGER,
  FY1996                 INTEGER,
  FY1997                 INTEGER,
  FY1998                 INTEGER,
  FY1999                 INTEGER,
  FY2000                 INTEGER,
  FY2001                 INTEGER,
  FY2002                 INTEGER,
  FY2003                 INTEGER,
  FY2004                 INTEGER,
  FY2005                 INTEGER,
  FY2006                 INTEGER,
  PRIMARY KEY (AgencyCode, BureauCode, AccountCode, 
               SubfunctionCode, BEACategory, On-Off-BudgetIndicator)
);

The specific database I chose for this example is Microsoft Excel, mostly because it could very easily read the comma delimited files I was starting with. Excel isn’t the best example of a relational database. In fact, it isn’t a relational database at all. However, it does allow you to define a range of cells as a table, and then associate that table with an ODBC data source. This data source can then be read with SQL using JDBC through the JdbcOdbcDriver, which is all I really want to show here. Aside from the choice of JDBC driver, all statements will be completely database independent. The name of the ODBC data source is budauth. The name of the table in that source is BudgetAuthorizationTable.

Extracting the necessary data from the database is just a matter of SQL which, as a Java programmer, you access via JDBC. To some extent you can let the database do the hard work for you by executing the right sequence of SQL commands. In this case, this is just a long sequence of nested SELECT statements. SQL’s DISTINCT operator will be particularly helpful. The contortions of the Muenchian method in Example 4.12 were a roundabout way of providing a distinct operation in XSLT.

Example 4.14. A program that connects to a relational database using JDBC and converts the table to hierarchical XML

import java.sql.*;
import java.io.*;


public class SQLToXML {
        
  public static void main(String[] args ) {
        
    // Load the ODBC driver
    try {
      Class.forName( "sun.jdbc.odbc.JdbcOdbcDriver" );
    }
    catch (ClassNotFoundException e) {
      System.err.println("Could not load the JDBC-ODBC Bridge");
      return;
    }
    
    try {      
      Writer out = new OutputStreamWriter(System.out, "UTF8"); 
      out.write("<?xml version=\"1.0\"?>\r\n");
      out.write("<Budget>\r\n");
      writeAgencies(out);      
      out.write("</Budget>\r\n");
      out.close();      
    }
    catch (IOException e) {
      System.err.println(e);
    }

    
  }
  
  private static void writeAgencies(Writer out) 
   throws IOException {

    Connection conn = null;
    Statement stmnt = null;
    try {
      conn = DriverManager.getConnection(
       "jdbc:odbc:budauth", "", "");
      stmnt = conn.createStatement();
      String query = "SELECT DISTINCT AgencyName, AgencyCode"
       + " FROM BudgetAuthorizationTable;";
      ResultSet agencies = stmnt.executeQuery( query );

      while( agencies.next() ) {
        
        String agencyName = agencies.getString("AgencyName");
        agencyName = escapeText(agencyName);
        String agencyCode = agencies.getString("AgencyCode");
        out.write("  <Agency>\r\n");
        out.write("    <Name>" + agencyName + "</Name>\r\n");
        out.write("    <Code>" + agencyCode + "</Code>\r\n");
        writeBureaus(out, conn, agencyCode);
        out.write("  </Agency>\r\n");
         
      }
    }
    catch (SQLException e) {
      System.err.println(e);
      e.printStackTrace();       
    }
    finally {
      try {
        stmnt.close();
        conn.close();
      }
      catch(SQLException e) {
        System.err.println(e);
      }
    }
              
  }
  
  private static void writeBureaus(Writer out, Connection conn, 
   String agencyCode) throws IOException, SQLException {

    String query 
     = "SELECT DISTINCT BureauName, BureauCode "
     + "FROM BudgetAuthorizationTable WHERE AgencyCode='" 
     + agencyCode + "';";
    Statement stmnt = conn.createStatement();
    ResultSet bureaus = stmnt.executeQuery(query);

    while( bureaus.next() ) {
      String bureauName = bureaus.getString("BureauName");
      bureauName = escapeText(bureauName);
      String bureauCode = bureaus.getString("BureauCode");
      out.write("    <Bureau>\r\n");
      out.write("      <Name>" + bureauName + "</Name>\r\n");
      out.write("      <Code>" + bureauCode + "</Code>\r\n");
      writeAccounts(out, conn, agencyCode, bureauCode);
      out.write("    </Bureau>\r\n");
      out.flush();
    }        

  }
  
  private static void writeAccounts(Writer out, Connection conn, 
   String agencyCode, String bureauCode)
   throws IOException, SQLException {

    String query = "SELECT DISTINCT AccountName, AccountCode "
     + "FROM BudgetAuthorizationTable WHERE AgencyCode='" 
     + agencyCode + "' AND BureauCode='" + bureauCode + "';";
    Statement stmnt = conn.createStatement();
    ResultSet accounts = stmnt.executeQuery(query);

    while( accounts.next() ) {
      String accountName = accounts.getString("AccountName");
      accountName = escapeText(accountName);
      String accountCode = accounts.getString("AccountCode");
      out.write("      <Account>\r\n");
      out.write("        <Name>" + accountName + "</Name>\r\n");
      out.write("        <Code>" + accountCode + "</Code>\r\n");
      writeSubfunctions(
       out, conn, agencyCode, bureauCode, accountCode
      );
      out.write("      </Account>\r\n");
      out.flush();
    }        
        
  }
  
  private static void writeSubfunctions(Writer out,  
   Connection conn, String agencyCode, String bureauCode, 
   String accountCode) throws IOException, SQLException {

    String query = "SELECT * FROM BudgetAuthorizationTable"
     + " WHERE AgencyCode='" + agencyCode + "' AND BureauCode='" 
     + bureauCode + "' AND AccountCode='" + accountCode + "';";
    Statement stmnt = conn.createStatement();
    ResultSet subfunctions = stmnt.executeQuery(query);

    while( subfunctions.next() ) {
      String subfunctionTitle 
       = subfunctions.getString("SubfunctionTitle");
      subfunctionTitle = escapeText(subfunctionTitle);
      String subfunctionCode 
       = subfunctions.getString("SubfunctionCode");
      out.write("        <Subfunction>\r\n");
      out.write("          <Name>");
      out.write(subfunctionTitle);
      out.write("</Name>\r\n");
      out.write("          <Code>");
      out.write(subfunctionCode);
      out.write("</Code>\r\n");
      out.write("          <Amount year='TransitionQuarter'>"); 
      out.write(subfunctions.getInt("TransitionQuarter") 
       + "</Amount>\r\n");      
      for (int year = 1976; year <= 2006; year++) {
        String name = "FY" + year;
        long amt = subfunctions.getInt(name) * 1000L;
        out.write("          <Amount year='" + year + "'>");
        out.write(amt + "</Amount>\r\n");      
      }
      out.write("        </Subfunction>\r\n");
      out.flush();
    }        
        
  }
  
  public static String escapeText(String s) {
   
    if (s.indexOf('&') != -1 || s.indexOf('<') != -1
     || s.indexOf('>') != -1 || s.indexOf('"') != -1
     || s.indexOf('\'') != -1 ) {
      StringBuffer result = new StringBuffer(s.length() + 6);
      for (int i = 0; i < s.length(); i++) {
        char c = s.charAt(i);
        if (c == '&') result.append("&amp;");
        else if (c == '<') result.append("&lt;");
        else if (c == '"') result.append("&quot;");
        else if (c == '\'') result.append("&apos;");
        else if (c == '>') result.append("&gt;");
        else result.append(c);
      }
      return result.toString();  
    }
    else {
      return s;   
    }
        
  }
  
}

The basic approach here should be quite familiar by now. Tags are stored in string literals. These tags are written onto a Writer along with element content and attribute values that have been read from the input. The difference in this case is that since the input comes from a relational database, the program can use SQL to get the input it wants when it wants it. The Java program does not need to put itself out to accommodate the order of the input data. In essence this program is nothing more than four nested loops. The outermost loop iterates over the different agencies. This contains a loop that iterates over the bureaus within that agency. This contains a loop that iterates over the accounts within that bureau. This contains the innermost loop that iterates over the subfunctions within the account. Compare this to the contortions the other programs had to perform. Only the XQuery solution was as straightforward as this, which is not surprising since it also allows the client to specify when it wants to receive which data.


Copyright 2001, 2002 Elliotte Rusty Haroldelharo@metalab.unc.eduLast Modified July 25, 2002
Up To Cafe con Leche