Introducing Tripliser

I recently had to take XML in a predefined format and create RDF representing the semantics of the data. I began with XSLT, but the edge cases needed to handle inconsistencies in the input XML gradually made the XSLT verbose and incomprehensible (a mix of syntax handling and business logic). Errors were hard to diagnose, and failures were not recovered from effectively. I decided to write a library to help me with this problem, called Tripliser…

>> Homepage  |  >> GitHub

Tripliser is a Java library and command-line tool for creating triple graphs, and RDF serialisations, from XML source data. It is particularly suitable for data exhibiting any of the following characteristics:

  • Messy – missing data, badly formatted data, changeable structure
  • Bulky – large volumes of data
  • Volatile – ongoing changes to data and structure, e.g. feeds

Other non-RDF source data, such as CSV and SQL databases, may be supported in future.

It is designed as an alternative to XSLT conversion, providing the following advantages:

  • Easy-to-read mapping format – concisely describing each mapping
  • Robust – tolerant of errors and partial failures
  • Detailed reporting – comprehensive feedback on the successes and failures of the conversion process
  • Extensible – custom functions, flexible API
  • Efficient – facilities for processing data in large volumes with minimal memory usage

XML files are read in, and XPath is used to extract values which can be inserted into a triple graph. The graph can be serialised in various RDF formats and is accompanied by meta-data and a property-by-property report to indicate how successful or unsuccessful the mapping process was.

Data flow in Tripliser
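
To make that flow concrete, here is a minimal sketch of the general idea using only the XPath support built into the JDK. It is not Tripliser's implementation or API: the class `XPathTripleSketch`, its `convert` method and the format of the report lines are all made up for this illustration. It selects resource nodes with one XPath query, extracts each property with a relative query, emits an N-Triples-style line per value, and records a report entry instead of failing when a value is missing.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical illustration only; none of these names come from Tripliser.
final class XPathTripleSketch {

    /** Triples produced so far, as N-Triples-style lines. */
    final List<String> triples = new ArrayList<>();
    /** One entry per property that could not be extracted. */
    final List<String> report = new ArrayList<>();

    /**
     * @param resourceQuery   XPath selecting one node per resource
     * @param idQuery         XPath, relative to a resource node, giving its id
     * @param uriPrefix       prefix used to build each subject URI
     * @param propertyQueries predicate URI mapped to a relative XPath query
     */
    void convert(String xml, String resourceQuery, String idQuery,
                 String uriPrefix, Map<String, String> propertyQueries) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList resources =
                    (NodeList) xpath.evaluate(resourceQuery, doc, XPathConstants.NODESET);
            for (int i = 0; i < resources.getLength(); i++) {
                Node resource = resources.item(i);
                String subject = "<" + uriPrefix + xpath.evaluate(idQuery, resource) + ">";
                for (Map.Entry<String, String> property : propertyQueries.entrySet()) {
                    // A query that matches nothing evaluates to the empty string.
                    String value = xpath.evaluate(property.getValue(), resource);
                    if (value.isEmpty()) {
                        // Partial failure: note it and carry on with the next property.
                        report.add("MISSING " + property.getKey() + " for " + subject);
                        continue;
                    }
                    // Escape backslashes and quotes so the literal stays well-formed.
                    String literal = value.replace("\\", "\\\\").replace("\"", "\\\"");
                    triples.add(subject + " <" + property.getKey() + "> \"" + literal + "\" .");
                }
            }
        } catch (Exception e) {
            throw new IllegalStateException("Could not convert XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<universe-objects><stars>"
                + "<star id=\"sun\"><name>Sun</name><spectralClass>G2V</spectralClass></star>"
                + "<star id=\"x1\"><name>Unknown</name></star>"
                + "</stars></universe-objects>";
        Map<String, String> properties = new LinkedHashMap<>();
        properties.put("http://purl.org/dc/elements/1.1/title", "name");
        properties.put("http://theuniverse.org/spectralClass", "spectralClass");

        XPathTripleSketch sketch = new XPathTripleSketch();
        sketch.convert(xml, "//stars/star", "@id", "http://objects.theuniverse.org/", properties);
        sketch.triples.forEach(System.out::println);
        sketch.report.forEach(System.out::println);
    }
}
```

The second star in the sample input has no `spectralClass`, so it still produces a title triple and the gap ends up in the report rather than aborting the run.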

Here’s what a typical mapping format looks like…

<?xml version="1.0" encoding="UTF-8"?>
<rdf-mapping xmlns="http://www.daverog.org/rdf-mapping" strict="false">
	<constants>
		<constant name="objectsUri" value="http://objects.theuniverse.org/" />
	</constants>
	<namespaces>
		<namespace prefix="xsd" url="http://www.w3.org/2001/XMLSchema#" />
		<namespace prefix="rdfs" url="http://www.w3.org/2000/01/rdf-schema#" />
		<namespace prefix="dc" url="http://purl.org/dc/elements/1.1/" />
		<namespace prefix="universe" url="http://theuniverse.org/" />
	</namespaces>
	<graph query="//universe-objects" name="universe-objects" comment="A graph for objects in the universe">
		<resource query="stars/star">
			<about prepend="${objectsUri}" append="#star" query="@id" />
			<properties>
				<property name="rdf:type" resource="true" value="universe:Star"/>
				<property name="dc:title" query="name" />
				<property name="universe:id" query="@id" />
				<property name="universe:spectralClass" query="spectralClass" />
			</properties>
		</resource>
		<resource query="planets/planet">
			<about prepend="${objectsUri}" append="#planet" query="@id" />
			<properties>
				<property name="rdf:type" resource="true" value="universe:Planet"/>
				<property name="dc:title" query="name" />
				<property name="universe:id" query="@id" />
				<property name="universe:adjective" query="adjective" />
				<property name="universe:numberOfSatellites" dataType="xsd:int" query="satellites" />
			</properties>
		</resource>
	</graph>
</rdf-mapping>
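
One way to read the `about` elements above is that the resource URI is the `prepend` value, then the queried value, then the `append` value, with `${...}` placeholders filled in from the `constants` block. The sketch below spells out that reading in plain Java. It is an assumption drawn from the attribute names, not Tripliser's code, and `AboutUriSketch`, `substitute` and `aboutUri` are hypothetical names.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of an assumed reading of prepend/append; not Tripliser's code.
final class AboutUriSketch {

    // Matches placeholders of the form ${name}.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]+)\\}");

    /** Replaces every ${name} in the template with the matching constant. */
    static String substitute(String template, Map<String, String> constants) {
        Matcher matcher = PLACEHOLDER.matcher(template);
        StringBuilder out = new StringBuilder();
        while (matcher.find()) {
            String value = constants.get(matcher.group(1));
            if (value == null) {
                throw new IllegalArgumentException("Unknown constant: " + matcher.group(1));
            }
            // quoteReplacement stops '$' and '\' in the value being treated specially.
            matcher.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    /** Assumed composition: prepend + queried value + append. */
    static String aboutUri(String prepend, String queried, String append,
                           Map<String, String> constants) {
        return substitute(prepend, constants) + queried + substitute(append, constants);
    }

    public static void main(String[] args) {
        Map<String, String> constants = Map.of("objectsUri", "http://objects.theuniverse.org/");
        // A star whose @id is "sun", using the attributes from the mapping above.
        System.out.println(aboutUri("${objectsUri}", "sun", "#star", constants));
        // prints http://objects.theuniverse.org/sun#star
    }
}
```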

Go to the Homepage or to GitHub to find out more.

5 thoughts on “Introducing Tripliser”

    • Hi there,

      Thanks for suggesting this comparison.

      I had not come across Krextor, but it is similar to several other approaches I have seen. It suffers, in my opinion, from the same issue as those approaches, which is the overuse of XSLT. This was one of the main drivers for creating this library. There are several outcomes from modelling the mapping process in an imperative language, as opposed to directly translating the XML using XSLT. One such outcome is the simplicity of the mapping. Using the Krextor example extraction I can show a comparison of the two mapping styles:

      Source XML

      <person friends="http...van-houten.name/milhouse">
        <name>Bart Simpson</name>
      </person>
      

      Krextor mapping

      <template match="person" mode="krextor:main">
        <call-template name="krextor:create-resource">
          <with-param name="type" select="'&foaf;Person'"/>
        </call-template>
      </template>
      
      <template match="person/@friends" mode="krextor:main">
        <call-template name="krextor:add-uri-property">
          <with-param name="property" select="'&foaf;knows'"/>
        </call-template>
      </template>
      
      <template match="person/name" mode="krextor:main">
        <call-template name="krextor:add-literal-property">
          <with-param name="property" select="'&foaf;name'"/>
        </call-template>
      </template>
      

      Tripliser mapping

      <resource query="person">  
          <properties>  
              <property name="rdf:type" resource="true" value="foaf:Person"/>
              <property name="foaf:name" query="name" />  
              <property name="foaf:knows" query="@friends" />  
          </properties>  
      </resource>
      

      I would argue that not only is the syntax more concise, it is also considerably more readable. The difference will only become more pronounced as mappings become more complex. There are other areas where the Tripliser approach is simpler for the user, such as reporting, with varying levels of failure, and graph meta-data.

      Dave

    • There do seem to be a number of solutions that use XSLT. This was the approach I was trying to avoid, as I quickly found that the XSLTs became unmanageable. I must disclose that I have a more general dislike for XSLT as a technology, and would recommend a kind of intermediate modelling in most XML conversion scenarios I have come across. XPath is the genuinely useful part of XSLT.
