Introducing Tripliser

I recently had to take XML in a predefined format and create RDF representing the semantics of the data. I began with XSLT, but the edge cases needed to handle inconsistencies in the input XML gradually made the XSLT verbose and incomprehensible, since it mixed syntax handling with business logic. Errors were hard to diagnose, and there was no effective way to recover from failures. I decided to write a library to help me with this problem, called Tripliser…


Tripliser is a Java library and command-line tool for creating triple graphs, and RDF serialisations, from XML source data. It is particularly suitable for data exhibiting any of the following characteristics:

  • Messy – missing data, badly formatted data, changeable structure
  • Bulky – large volumes of data
  • Volatile – ongoing changes to data and structure, e.g. feeds

Other non-RDF source formats, such as CSV and SQL databases, may be supported in future.

It is designed as an alternative to XSLT conversion, providing the following advantages:

  • Easy-to-read mapping format – concisely describing each mapping
  • Robust – tolerant of errors and partial failures
  • Detailed reporting – comprehensive feedback on the successes and failures of the conversion process
  • Extensible – custom functions, flexible API
  • Efficient – facilities for processing data in large volumes with minimal memory usage

XML files are read in, and XPath is used to extract values, which are then inserted into a triple graph. The graph can be serialised in various RDF formats and is accompanied by metadata and a property-by-property report indicating how successful the mapping process was.
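To make that flow concrete, here is a minimal sketch of the same idea using only the JDK's XPath support. It is not Tripliser's API: the class name, the sample XML and the simplified "id title name" output lines are all invented for illustration. It shows values being pulled out with XPath and a small report being kept when a value is missing, rather than the whole conversion failing.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Illustration only: plain JDK XPath, not Tripliser's API.
class XPathExtractionSketch {

    // Hypothetical input; the element names are invented for this example.
    static final String SAMPLE_XML =
          "<stars>"
        + "<star id=\"sun\"><name>Sun</name></star>"
        + "<star id=\"x1\"/>"   // messy record: no <name> element
        + "</stars>";

    /** Returns one line per extracted value, followed by one report line. */
    static List<String> extract(String xml) {
        List<String> lines = new ArrayList<>();
        int mapped = 0;
        int missing = 0;
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList stars = (NodeList) xpath.evaluate(
                "/stars/star",
                new InputSource(new StringReader(xml)),
                XPathConstants.NODESET);
            for (int i = 0; i < stars.getLength(); i++) {
                Node star = stars.item(i);
                String id = xpath.evaluate("@id", star);
                // An XPath that matches nothing evaluates to the empty string.
                String name = xpath.evaluate("name", star);
                if (name.isEmpty()) {
                    missing++;   // record the failure and carry on
                    continue;
                }
                lines.add(id + " title \"" + name + "\"");
                mapped++;
            }
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("Could not evaluate XPath", e);
        }
        lines.add("report: " + mapped + " mapped, " + missing + " missing");
        return lines;
    }

    public static void main(String[] args) {
        extract(SAMPLE_XML).forEach(System.out::println);
    }
}
```

Hand-written code like this quickly picks up the same mix of syntax handling and business logic that made the XSLT hard to read, which is the motivation for moving the rules into a declarative mapping file.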

[Figure: Data flow in Tripliser]

Here’s what a typical mapping file looks like:

<?xml version="1.0" encoding="UTF-8"?>
<rdf-mapping xmlns="http://www.daverog.org/rdf-mapping" strict="false">
	<constants>
		<constant name="objectsUri" value="http://objects.theuniverse.org/" />
	</constants>
	<namespaces>
		<namespace prefix="xsd" url="http://www.w3.org/2001/XMLSchema#" />
		<namespace prefix="rdfs" url="http://www.w3.org/2000/01/rdf-schema#" />
		<namespace prefix="dc" url="http://purl.org/dc/elements/1.1/" />
		<namespace prefix="universe" url="http://theuniverse.org/" />
	</namespaces>
	<graph query="//universe-objects" name="universe-objects" comment="A graph for objects in the universe">
		<resource query="stars/star">
			<about prepend="${objectsUri}" append="#star" query="@id" />
			<properties>
				<property name="rdf:type" resource="true" value="universe:Star"/>
				<property name="dc:title" query="name" />
				<property name="universe:id" query="@id" />
				<property name="universe:spectralClass" query="spectralClass" />
			</properties>
		</resource>
		<resource query="planets/planet">
			<about prepend="${objectsUri}" append="#planet" query="@id" />
			<properties>
				<property name="rdf:type" resource="true" value="universe:Planet"/>
				<property name="dc:title" query="name" />
				<property name="universe:id" query="@id" />
				<property name="universe:adjective" query="adjective" />
				<property name="universe:numberOfSatellites" dataType="xsd:int" query="satellites" />
			</properties>
		</resource>
	</graph>
</rdf-mapping>
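As a rough guide to reading the mapping, the sketch below applies the planet resource by hand to a made-up input document and prints N-Triples. It rests on assumptions rather than on Tripliser's documented behaviour: that the `about` URI is the `prepend` value, then the `query` result, then the `append` value; that `dataType` produces a typed literal; and that the input is shaped as in `SAMPLE_XML`. The class name and sample data are hypothetical, and only JDK classes are used.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hand-rolled reading of the planet mapping; not Tripliser's implementation.
class PlanetMappingSketch {

    static final String OBJECTS_URI = "http://objects.theuniverse.org/"; // the objectsUri constant
    static final String UNIVERSE = "http://theuniverse.org/";
    static final String RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";
    static final String DC_TITLE = "http://purl.org/dc/elements/1.1/title";
    static final String XSD_INT = "http://www.w3.org/2001/XMLSchema#int";

    // Hypothetical source document matching the queries in the mapping.
    static final String SAMPLE_XML =
          "<universe-objects><planets>"
        + "<planet id=\"earth\"><name>Earth</name>"
        + "<adjective>Terran</adjective><satellites>1</satellites></planet>"
        + "</planets></universe-objects>";

    static String toNTriples(String xml) {
        StringBuilder out = new StringBuilder();
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            // Graph query "//universe-objects" plus resource query "planets/planet".
            NodeList planets = (NodeList) xpath.evaluate(
                "//universe-objects/planets/planet",
                new InputSource(new StringReader(xml)),
                XPathConstants.NODESET);
            for (int i = 0; i < planets.getLength(); i++) {
                Node planet = planets.item(i);
                // Assumed reading of <about>: prepend + query result + append.
                String subject = "<" + OBJECTS_URI + xpath.evaluate("@id", planet) + "#planet>";
                out.append(subject).append(" <").append(RDF_TYPE).append("> <")
                   .append(UNIVERSE).append("Planet> .\n");
                // Literals are not escaped here; real data would need escaping.
                out.append(subject).append(" <").append(DC_TITLE).append("> \"")
                   .append(xpath.evaluate("name", planet)).append("\" .\n");
                out.append(subject).append(" <").append(UNIVERSE).append("numberOfSatellites> \"")
                   .append(xpath.evaluate("satellites", planet))
                   .append("\"^^<").append(XSD_INT).append("> .\n");
            }
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("Could not evaluate XPath", e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(toNTriples(SAMPLE_XML));
    }
}
```

Under those assumptions, the planet with id `earth` gets the subject URI `http://objects.theuniverse.org/earth#planet`. The mapping file states the same rules in a handful of attributes, with none of the traversal and string handling.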

Go to the Homepage or to GitHub to find out more.