Saturday 2:40 p.m.–3:10 p.m.
Python and the Semantic Web: Building a Linked Data Fragment Server with asyncio and Redis
Jeremy Nelson
- Audience level:
- Intermediate
- Category:
- Industry Uses
Description
A barrier to wider adoption of the semantic web is the difficulty of scaling SPARQL endpoints for large Linked Datasets. A promising approach, Linked Data Fragments, allows clients to issue many small, simple queries instead of complex SPARQL. This talk is about a Python-based Linked Data Fragments server project built with the new asyncio Python module and Redis. Testing is being done with datasets from the Library of Congress and the DPLA.
Abstract
In 2001, Sir Tim Berners-Lee and others originated the idea of the [semantic web][SW], an internet where web content is understandable and actionable by computers. Berners-Lee's work, along with that of many others, resulted in a large number of specifications from the W3C such as [RDF][RDF], [OWL][OWL], [SKOS][SKOS], and other vocabularies for describing and enhancing content on the web through RDF graphs made up of large numbers of triples, each a **subject-predicate-object** statement.
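As a minimal illustration of the triple model (using the rdflib library and a hypothetical `http://example.org/` namespace, neither of which is necessarily part of the project described here), a single **subject-predicate-object** statement can be built like this:

```python
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDFS

# Hypothetical namespace, used only for this illustration
EX = Namespace("http://example.org/")

graph = Graph()
# One subject-predicate-object statement
graph.add((EX.LinkedDataFragmentsServer,            # subject
           RDFS.label,                              # predicate
           Literal("Linked Data Fragments server")))  # object

# Iterate over the graph to show the raw triple
for subject, predicate, obj in graph:
    print(subject, predicate, obj)
```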
Traditionally there have been three methods for organizations to publish their RDF data on the web: providing a data-dump download, creating subject pages with linked data, or providing a SPARQL endpoint for querying their RDF data. Each approach has problems: data dumps and subject pages require extensive client-side programming to be usable, while a SPARQL endpoint imposes high overhead on the server.
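To make the trade-off concrete, here is a sketch of the SPARQL-endpoint approach using the SPARQLWrapper library against DBpedia's public endpoint (both chosen only as examples, not components of the server discussed in this talk); every such query is planned and executed entirely on the endpoint's server:

```python
from SPARQLWrapper import SPARQLWrapper, JSON

# DBpedia's public SPARQL endpoint, used here only as an example
sparql = SPARQLWrapper("http://dbpedia.org/sparql")
sparql.setQuery("""
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT ?label WHERE {
        <http://dbpedia.org/resource/Resource_Description_Framework>
            rdfs:label ?label .
    } LIMIT 5
""")
sparql.setReturnFormat(JSON)

results = sparql.query().convert()
for binding in results["results"]["bindings"]:
    print(binding["label"]["value"])
```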
[Ruben Verborgh][RV] of [Ghent University][GHENT] in Belgium originated the concept of Linked Data Fragments, which offers a middle ground between these options. Instead of requiring heavy client-side or server-side processing, Linked Data Fragments offer a lightweight querying pattern called the **Triple Pattern Fragment**, made up of data matching a **subject-predicate-object** triple pattern, metadata consisting of the total triple count, and controls for browsing all of the other fragments in the same dataset. A Triple Pattern Fragments server is a server that offers this service; a Triple Pattern Fragments client connects to such a server and can also decompose SPARQL queries into the simpler triple-pattern queries it sends to the server.
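As a rough sketch of this request pattern (the endpoint URL below is DBpedia's public Triple Pattern Fragments interface, used only as an example; the `subject`, `predicate`, and `object` parameters follow the Triple Pattern Fragments specification):

```python
import requests

# Example public Triple Pattern Fragments endpoint; not the server in this talk
ENDPOINT = "http://fragments.dbpedia.org/2014/en"

# A triple pattern with a fixed predicate and object;
# the unspecified subject acts as a wildcard
params = {
    "predicate": "http://www.w3.org/2000/01/rdf-schema#label",
    "object": '"Python"@en',
}

# Ask for the fragment as Turtle; the response contains the matching triples,
# a total-count metadata triple, and hypermedia controls for paging
response = requests.get(ENDPOINT, params=params,
                        headers={"Accept": "text/turtle"})
print(response.text[:1000])
```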
At a recent digital library conference, a group made up of representatives from a variety of academic, public, and non-profit organizations, including the [Digital Public Library of America](http://dp.la), [Amherst College](https://www.amherst.edu/), [Boston Public Library](http://www.bpl.org/), and [Colorado College](https://www.coloradocollege.edu), came together and organized what eventually became a Python-based implementation of an open-source [Linked Data Fragments server](https://github.com/jermnelson/linked-data-fragments). It uses the new Python 3.4 asyncio module to build a fast, lightweight network server, with Redis, a NoSQL datastore, caching results.
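The project's actual code lives in the repository linked above; purely as a hedged sketch of the asyncio-plus-Redis idea (the protocol, cache-key scheme, and `lookup_fragment` stub below are hypothetical, not the project's API), a caching fragment server might look roughly like this under Python 3.4:

```python
import asyncio
import json

import redis  # synchronous Redis client, used here for simplicity

cache = redis.StrictRedis(host="localhost", port=6379)


def lookup_fragment(pattern):
    """Hypothetical stand-in for resolving a triple pattern in the datastore."""
    return {"pattern": pattern, "triples": [], "totalItems": 0}


@asyncio.coroutine
def handle_request(reader, writer):
    # Read one JSON-encoded triple pattern per line, e.g.
    # {"subject": null, "predicate": "rdfs:label", "object": null}
    raw = yield from reader.readline()
    pattern = raw.decode().strip()

    cached = cache.get(pattern)
    if cached is not None:
        fragment = cached.decode()       # cache hit: reuse the stored fragment
    else:
        fragment = json.dumps(lookup_fragment(json.loads(pattern)))
        cache.set(pattern, fragment)     # cache miss: store for the next request

    writer.write(fragment.encode() + b"\n")
    yield from writer.drain()
    writer.close()


loop = asyncio.get_event_loop()
server = loop.run_until_complete(
    asyncio.start_server(handle_request, "127.0.0.1", 8000))
loop.run_forever()
```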
This talk will present preliminary results from testing the Linked Data Fragments server with large datasets from college library catalogs as well as datasets from the Library of Congress and the DPLA.
[OWL]: http://www.w3.org/TR/owl2-syntax/
[SW]: http://www.cs.umd.edu/~golbeck/LBSC690/SemanticWeb.html
[RDF]: http://www.w3.org/RDF/
[RV]: http://ruben.verborgh.org/
[SKOS]: http://www.w3.org/2004/02/skos/
[GHENT]: http://www.ugent.be/