Sponsored by BMBF

A Transformation from XML to RDF via XSLT



Introduction

The Resource Description Framework (RDF) was developed as a data model for embedding information in a schematic document structure to make it more machine-readable.

In practice, however, data is often found in XML format. We therefore developed a generic XSLT transformation that can be applied to any XML document to convert it into an RDF-conformant structure. This work was motivated by needs in semantic astronomy encountered in the AstroGrid-D project. There, for example, the monitoring of robotic telescopes, compute resources and individual jobs is based on XML, whereas Stellaris, the information service of AstroGrid-D, uses RDF.

RDF describes data through a hierarchical structure of resources, which are represented by uniform resource identifiers (URIs). Resources can be composed of other resources, in analogy to the real-world objects they describe. For example, a telescope can be composed of a camera, which is composed of a filter wheel, which is composed of filters, etc. This concept makes RDF an interesting choice for metadata management in heterogeneous software environments, where automated interaction between different components is desirable.

However, data is usually not provided in RDF, and developing an individual conversion for each XML format is often not affordable.

Here our generic XML to RDF transformation does the trick.

Design Goals

There are different ways to represent XML in RDF. Earlier solutions are listed in the Version History section below. The latest transformation achieves the following design goals:

  1. avoidance of blank nodes,
  2. one-to-one mapping for bidirectional extension,
  3. independence of XML schema.

Blank nodes are subjects without a name. Access to them is therefore more difficult, and some operations, such as the direct replacement of nodes, cannot be performed. Avoiding blank nodes removes these complications.
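
The idea of naming every node can be illustrated with a minimal Python sketch. It assigns each XML element a URI built from its position in the element hierarchy, so no node has to remain anonymous. The naming scheme (parent URI, tag name, running index) and the function name element_uris are assumptions made for illustration; they are not taken from xml2rdf3.xsl, and the namespace declaration of the RTML example is omitted to keep the tag names short.

```python
import xml.etree.ElementTree as ET

# Reduced STELLA-I fragment (namespace declaration omitted for brevity).
RTML = """<RTML uid="rtml://www.opentel.net/STELLA-I">
  <Telescope>
    <Camera>
      <FilterWheel>
        <Filter name="U"/>
        <Filter name="B"/>
      </FilterWheel>
    </Camera>
  </Telescope>
</RTML>"""


def element_uris(root, base_uri):
    """Return (uri, element) pairs with one URI per element, in document order."""
    result = []

    def walk(elem, uri):
        result.append((uri, elem))
        seen = {}  # per-tag counters among the children of elem
        for child in elem:
            seen[child.tag] = seen.get(child.tag, 0) + 1
            # Hypothetical scheme: parent URI + "/" + tag + "_" + index
            walk(child, f"{uri}/{child.tag}_{seen[child.tag]}")

    walk(root, base_uri)
    return result


root = ET.fromstring(RTML)
uris = [uri for uri, _ in element_uris(root, root.get("uid"))]
for uri in uris:
    print(uri)
```

Because siblings with the same tag receive different indices, the two Filter elements get distinct URIs and can be addressed or replaced directly.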

A one-to-one mapping is necessary for the inverse transformation. A bidirectional transformation can be important, e.g. in a robotic telescope network where information about scheduled observations is stored in RDF but rescheduling requires the original RTML observation request. For this reason, the RTML observation requests in AstroGrid-D were also stored along with the RDF. The inverse transformation could make this additional service unnecessary. A unique reconstruction of the original XML requires, among other things, preserving the distinction between attributes and elements. As shown below, this is accomplished by transforming attributes and elements differently.

The last goal makes the transformation independent of the underlying XML schema, so that the structure of the RDF is completely determined by the XML document. This requires that the order of elements is preserved.
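
Why these goals matter for the inverse transformation can be seen in a small round-trip sketch in Python. It flattens an XML fragment into (subject, predicate, object) triples and rebuilds the fragment from the triples alone. The predicates are assumptions chosen for illustration: "@name" marks attributes, rdf:value carries text, rdf:_1, rdf:_2, ... record child order, and the hypothetical ex:element stores the tag name. The sketch does not reproduce the output of xml2rdf3.xsl and ignores namespaces, comments and mixed content.

```python
import xml.etree.ElementTree as ET

SOURCE = ('<Telescope><FocalLength units="meters">9.6</FocalLength>'
          '<Camera><FilterWheel>'
          '<Filter type="Johnson_U" name="U"/>'
          '<Filter type="Johnson_B" name="B"/>'
          '</FilterWheel></Camera></Telescope>')


def to_triples(elem, uri, triples):
    """Flatten an element tree into (subject, predicate, object) triples."""
    triples.append((uri, "ex:element", elem.tag))      # hypothetical predicate
    for name, value in elem.attrib.items():
        triples.append((uri, "@" + name, value))       # attributes stay distinguishable
    text = (elem.text or "").strip()
    if text:
        triples.append((uri, "rdf:value", text))
    for pos, child in enumerate(elem, start=1):
        child_uri = f"{uri}/{pos}"
        triples.append((uri, f"rdf:_{pos}", child_uri))  # order predicate
        to_triples(child, child_uri, triples)
    return triples


def from_triples(uri, triples):
    """Rebuild the element named by uri using only the triples."""
    own = [(p, o) for s, p, o in triples if s == uri]
    elem = ET.Element(next(o for p, o in own if p == "ex:element"))
    children = []
    for p, o in own:
        if p.startswith("@"):
            elem.set(p[1:], o)
        elif p == "rdf:value":
            elem.text = o
        elif p.startswith("rdf:_"):
            children.append((int(p[len("rdf:_"):]), o))
    for _, child_uri in sorted(children):  # restore the original element order
        elem.append(from_triples(child_uri, triples))
    return elem


original = ET.fromstring(SOURCE)
triples = to_triples(original, "http://example.org/STELLA-I", [])
rebuilt = from_triples("http://example.org/STELLA-I", triples)
round_trip_ok = ET.tostring(rebuilt) == ET.tostring(original)
print(len(triples), round_trip_ok)
```

If attributes and child elements were mapped to the same kind of triple, or if the order predicates were dropped, from_triples could no longer decide how to rebuild the document without consulting a schema.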

Transformation / Conversion

The transformation is accomplished via XSLT. The latest stylesheet (xml2rdf3.xsl) and some earlier versions are found below. As an example we show the transformation of a reduced description of the robotic telescope STELLA-I from its XML dialect in RTML (STELLA-I.rtml) into RDF (STELLA-I_3.rdf, STELLA-I_3.ttl). The description of STELLA-I is shown below.

<?xml version="1.0" encoding="UTF-8"?>
<RTML version="3.2" mode="resource" uid="rtml://www.opentel.net/STELLA-I"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns="http://www.rtml.org/v3.2" xsi:schemaLocation="RTML-v3.2.xsd">
<!-- This is only a fragment -->
<Telescope>
<FocalLength units="meters">9.6</FocalLength>
<Camera>
<FilterWheel>
<Filter type="Johnson_U" name="U"/>
<Filter type="Johnson_B" name="B"/>
</FilterWheel>
</Camera>
</Telescope>
</RTML>

The transformation is executed with an XSLT processor like xsltproc as follows:

  xsltproc xml2rdf3.xsl STELLA-I.rtml > STELLA-I_3.rdf
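
When the conversion has to be triggered from a script, for example for many monitoring documents in a row, the same call can be wrapped in Python. The following is a minimal sketch; the function names are illustrative, and it assumes that xsltproc is installed and that the stylesheet and input file exist. It uses the -o option of xsltproc instead of shell redirection to write the output file.

```python
import shutil
import subprocess


def xsltproc_command(stylesheet, source, output):
    """Build the xsltproc argument list; -o writes the result to a file."""
    return ["xsltproc", "-o", output, stylesheet, source]


def transform(stylesheet, source, output):
    """Run the transformation; raises if xsltproc is missing or fails."""
    if shutil.which("xsltproc") is None:
        raise FileNotFoundError("xsltproc not found on PATH")
    subprocess.run(xsltproc_command(stylesheet, source, output), check=True)


# Only the command is built here; call transform(...) to actually run it.
command = xsltproc_command("xml2rdf3.xsl", "STELLA-I.rtml", "STELLA-I_3.rdf")
print(" ".join(command))
```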

The resulting RDF has the structure shown in the graphic below. The graphic was produced with the RDF visualization tool RDF Gravity.




Applications

This transformation is used to convert XML into RDF format.

In AstroGrid-D, for example, this transformation is used for monitoring with the information service Stellaris. More precisely, it is used for converting

- RTML metadata of robotic telescopes

- Monitoring & Discovery System (MDS) information of the Globus Toolkit

- Usage Records created from Audit logging of the Globus Toolkit

Version History

The table below lists the different versions of the transformation. The STELLA-I.rtml file was slightly different for older versions.
A graphical overview can be found here.

Release | Version/Date | Changes | Graph | RDF/XML
xml2rdf3.xsl | 3.0 / 2009-05-19 | rdf:value for every text, no attribute triples, order predicates, comments as triples | STELLA-I_3.png | STELLA-I_3.rdf
xml2rdf25.xsl | 2.5 / 2009-05-19 | added BaseURI variable, keep comments as comments | STELLA-I_2.5.png | STELLA-I_2.5.rdf
xml2rdf24.xsl | 2.4 / 2008-09-30 | no rdf:type information used (simpler); attributes are distinguished from elements by an additional xs:attribute triple | STELLA-I_2.4.png | -
xml2rdf23.xsl | 2.3 / 2008-09-25 | distinction of elements from attributes by an rdf:type xsl:element | - | -
xml2rdf22.xsl | 2.2 / 2008-09-23 | distinction of attributes from elements by an rdf:type xsl:attribute | STELLA-I_2.2.png | -
xml2rdf21.xsl | 2.1 / 2008-03-14 | resources have an rdf:type information | STELLA-I_2.1.png | -
xml2rdf2.xsl | 2.0 / 2007-11-05 | blank nodes are replaced by URIs constructed from the hierarchy of XML elements | STELLA-I_2.0.png | STELLA-I_2.0.rdf
xml2rdf1.xsl | 1.0 / 2007-03-26 | elements and attributes become literals connected by blank nodes, similar to the Java tool OwlMap | STELLA-I_1.0.png | -


Contact

Frank Breitling
fbreitling (at) aip.de
http://www.aip.de/People/fbreitling/