Developer(s): Daniel Krech
Type: Library
Initial release: June 4, 2002
Stable release: 4.2.0 (February 19, 2015)
Repository: github.com/RDFLib/rdflib
RDFLib is a Python library for working with RDF, a simple yet powerful language for representing information. The library contains an RDF/XML parser/serializer that conforms to the RDF/XML Syntax Specification (Revised). The library also contains both in-memory and persistent Graph backends. It is being developed by a number of contributors and was created by Daniel Krech who continues to maintain it.
RDFLib and Python Idioms
RDFLib's use of various Python idioms makes those idioms an appropriate way to introduce the library to a Python programmer who has not used it before.
RDFLib Graphs redefine certain built-in Python methods in order to behave in a predictable way. RDFLib graphs emulate container types and are best thought of as a set of 3-item triples:
set([(subject,predicate,object),(subject1,predicate1,object1),... (subjectN,predicateN,objectN)])
RDFLib graphs are not sorted containers; they have ordinary set operations, e.g. add to add a triple, and methods that search triples and return them in arbitrary order.
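The container behavior described above can be illustrated without RDFLib itself. The following is a minimal plain-Python sketch, not RDFLib's implementation: the class name ToyGraph and the "ex:" strings are hypothetical stand-ins for a graph and its RDF terms.

```python
class ToyGraph:
    """Hypothetical stand-in for a graph: an unordered set of 3-item triples."""

    def __init__(self):
        self._triples = set()

    def add(self, triple):
        # Unpacking enforces the 3-item (subject, predicate, object) shape.
        subject, predicate, obj = triple
        self._triples.add((subject, predicate, obj))

    def __len__(self):
        # len(graph) reports the number of distinct triples.
        return len(self._triples)

    def __contains__(self, triple):
        # "triple in graph" is a set membership test.
        return tuple(triple) in self._triples


g = ToyGraph()
g.add(("ex:alice", "ex:knows", "ex:bob"))
g.add(("ex:alice", "ex:knows", "ex:bob"))  # duplicate: a set keeps one copy
g.add(("ex:bob", "ex:name", "Bob"))

print(len(g))
print(("ex:alice", "ex:knows", "ex:bob") in g)
```

Because the underlying container is a set, adding the same triple twice leaves the graph unchanged, and no ordering of triples is guaranteed.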
RDF Graph Terms
The following RDFLib classes model RDF terms in a graph and inherit from a common Identifier class, which extends Python unicode. Instances of these classes are nodes in an RDF graph.
Namespace Utilities
RDFLib provides mechanisms for managing Namespaces. In particular, there is a Namespace class which takes (as its only argument) the Base URI of the namespace. Fully qualified URIs in the namespace can be constructed by attribute / dictionary access on Namespace instances:
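As an illustration of attribute and dictionary access on a namespace object, here is a minimal plain-Python sketch. ToyNamespace and the example.org URI are hypothetical and only mimic the behavior described above; the sketch returns plain strings rather than RDF terms.

```python
class ToyNamespace(str):
    """Hypothetical namespace: a base URI string that builds full URIs."""

    def __getitem__(self, name):
        # Dictionary-style access: ns["name"] appends the local name.
        return str(self) + name

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails, so str methods
        # keep working. Dunder lookups are refused to stay well-behaved.
        if name.startswith("__"):
            raise AttributeError(name)
        return self[name]


EX = ToyNamespace("http://example.org/ns#")

print(EX.Person)          # attribute access
print(EX["first-name"])   # dictionary access, needed for non-identifier names
```

One consequence of subclassing a string type in this sketch is that names which collide with string methods (for example title) resolve to the method, so dictionary access is the safer form for such names.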
Graphs as Iterators
RDFLib graphs also override __iter__ in order to support iteration over the contained triples:
Set Operations on RDFLib Graphs
__iadd__ and __isub__ are overridden to support adding and subtracting Graphs to/from each other (in place):
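The iteration and in-place set operations described above can be modeled in a few lines of plain Python. This is a sketch under the assumption that a graph is backed by a set; MergeableGraph is a hypothetical name, not an RDFLib class.

```python
class MergeableGraph:
    """Hypothetical graph supporting iteration, += and -=."""

    def __init__(self, triples=()):
        self._triples = set(triples)

    def __iter__(self):
        # Iterating the graph yields its triples in arbitrary order.
        return iter(self._triples)

    def __len__(self):
        return len(self._triples)

    def __iadd__(self, other):
        # graph += other: add every triple of other, in place.
        self._triples.update(other)
        return self

    def __isub__(self, other):
        # graph -= other: drop every triple that also occurs in other.
        self._triples.difference_update(other)
        return self


a = ("ex:alice", "ex:knows", "ex:bob")
b = ("ex:bob", "ex:knows", "ex:carol")
c = ("ex:carol", "ex:knows", "ex:dave")

g1 = MergeableGraph([a, b])
g2 = MergeableGraph([b, c])

same_object = g1
g1 += g2                      # g1 now holds a, b and c
g1 -= MergeableGraph([a])     # g1 now holds b and c

for subject, predicate, obj in g1:
    print(subject, predicate, obj)
```

Returning self from both methods is what keeps the operation in place: the name g1 still refers to the original object after each statement.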
Basic Triple Matching
RDFLib graphs support basic triple pattern matching with a triples((subject,predicate,object)) function. This function is a generator of triples that match the pattern given by its argument. The elements of the pattern are RDF terms that restrict which triples are returned. Terms that are None are treated as wildcards.
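The wildcard semantics can be sketched as a plain-Python generator over a set of tuples. This is an illustrative model only: the free function triples and the sample data are hypothetical, and a real store would typically use indexes rather than a linear scan.

```python
def triples(store, pattern):
    """Yield every triple in store matching pattern; None matches anything."""
    for triple in store:
        if all(want is None or want == got
               for want, got in zip(pattern, triple)):
            yield triple


data = {
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:alice", "ex:name", "Alice"),
    ("ex:bob", "ex:knows", "ex:carol"),
}

# Every "knows" statement, whatever the subject and object are.
knows = sorted(triples(data, (None, "ex:knows", None)))

# Everything said about ex:alice.
about_alice = sorted(triples(data, ("ex:alice", None, None)))

# A pattern of three wildcards matches the whole store.
everything = list(triples(data, (None, None, None)))
```

The results are sorted here only to make them deterministic; the generator itself yields matches in arbitrary order.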
Adding Triples
Triples can be added in two ways:
Removing Triples
Similarly, triples can be removed by a call to remove: remove((subject, predicate, object))
RDF Literal Support
RDFLib Literals essentially behave like Unicode strings with an XML Schema datatype or language attribute. The class provides a mechanism both to convert Python literals (and built-ins such as time/date/datetime) into equivalent RDF Literals and, conversely, to convert Literals to their Python equivalents. There is some support for taking datatypes into account when comparing Literal instances, implemented as an override of __eq__. This mapping to and from Python literals is achieved with the following dictionaries:
Maps Python instances to WXS datatyped Literals
Maps WXS datatyped Literals to Python. This mapping is used by the toPython() method defined on all Literal instances.
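The two-dictionary approach can be sketched in plain Python as follows. The dictionary names, the helper functions, and the small set of covered types are assumptions made for illustration and are not RDFLib's own tables; only the XML Schema datatype URIs are standard.

```python
import datetime

XSD = "http://www.w3.org/2001/XMLSchema#"

# Python type -> (datatype URI, function producing the lexical form).
PYTHON_TO_XSD = {
    bool: (XSD + "boolean", lambda v: "true" if v else "false"),
    int: (XSD + "integer", str),
    float: (XSD + "double", repr),
    datetime.date: (XSD + "date", datetime.date.isoformat),
}

# Datatype URI -> function parsing the lexical form back into Python.
XSD_TO_PYTHON = {
    XSD + "boolean": lambda s: s == "true",
    XSD + "integer": int,
    XSD + "double": float,
    XSD + "date": datetime.date.fromisoformat,
}


def to_literal(value):
    """Return (lexical form, datatype URI) for a supported Python value."""
    # Exact type lookup keeps bool from being treated as int.
    datatype, make_lexical = PYTHON_TO_XSD[type(value)]
    return make_lexical(value), datatype


def to_python(lexical, datatype):
    """Convert back to Python; unknown datatypes stay as plain strings."""
    convert = XSD_TO_PYTHON.get(datatype)
    return convert(lexical) if convert else lexical


release = datetime.date(2002, 6, 4)
lexical, datatype = to_literal(release)
round_tripped = to_python(lexical, datatype)
```

Keeping the two directions in separate dictionaries means a new datatype can be supported by adding one entry to each table, without touching the conversion functions.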
SPARQL Querying
RDFLib supports a majority of the current SPARQL specification and includes a harness for the publicly available RDF DAWG test suite. Support for SPARQL is provided by two methods:
The first method parses a stream object with the SPARQL syntax. It uses a Python/C parser generated by BisonGen, which builds a hierarchy of parsed objects. This parsed object can be passed to the second function which evaluates the query against an RDFLib Store instance using the (optional) initial bindings.
Using Parse:
p is an instance of rdflib.sparql.bison.Query.Query
The RDF Store API
A Universal RDF Store Interface
This document attempts to summarize some fundamental components of an RDF store. The motivation is to outline a standard set of interfaces for providing the necessary support needed in order to persist an RDF Graph in a way that is universal and not tied to any specific implementation. For the most part, the core RDF model is adhered to as well as terminology that is consistent with the RDF Model specifications. However, this suggested interface also extends an RDF store with additional requirements necessary to facilitate the aspects of Notation 3 that go beyond the RDF model to provide a framework for First Order Predicate Logic processing and persistence.
Terminology
Chimezie said "higher-order statements are complicated"
Which can be written as (in N3):
Interpreting Syntax
The following Notation 3 document:
Could cause the following statements to be asserted in the store:
This statement would be asserted in the partition associated with quoted statements (in a formula named _:a)
Finally, these statements would be asserted in the same partition (in a formula named _:b)
Formulae and Variables as Terms
Formulae and variables are distinguishable from URI references, Literals, and BNodes by the following syntax:
They must also be distinguishable in persistence to ensure they can be round-tripped. There are other issues regarding the persistence of N3 terms.
Database Management
An RDF store should provide standard interfaces for the management of database connections. Such interfaces are standard to most database management systems (Oracle, MySQL, Berkeley DB, Postgres, etc.). The following methods are defined to provide this capability:
The configuration string is understood by the store implementation and represents all the necessary parameters needed to locate an individual instance of a store. This could be similar to an ODBC string, or in fact be an ODBC string if the connection protocol to the underlying database is ODBC. The open function needs to fail intelligently in order to clearly express that a store (identified by the given configuration string) already exists or that there is no store (at the location specified by the configuration string) depending on the value of create.
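The requirement that open "fail intelligently" can be illustrated with a small plain-Python model. Everything here is hypothetical: ToyStore, StoreError, the in-memory registry standing in for a database, and the "toy://" configuration strings. Signalling failure by raising an exception is also an assumption of this sketch; an implementation could equally report a status code.

```python
class StoreError(Exception):
    """Raised when open() cannot honor the requested create flag."""


class ToyStore:
    # Configuration string -> stored triples; stands in for real databases.
    _existing = {}

    def __init__(self):
        self._triples = None

    def open(self, configuration, create=False):
        exists = configuration in ToyStore._existing
        if create and exists:
            raise StoreError("store already exists: %r" % configuration)
        if not create and not exists:
            raise StoreError("no store found at: %r" % configuration)
        if create:
            ToyStore._existing[configuration] = set()
        self._triples = ToyStore._existing[configuration]

    def close(self):
        # Drop the connection; the stored data is left intact.
        self._triples = None

    def destroy(self, configuration):
        # Remove the stored data entirely.
        ToyStore._existing.pop(configuration, None)


store = ToyStore()
store.open("toy://example", create=True)   # creates a new store
store.close()
store.open("toy://example")                # reopens the existing store
```

The two error messages distinguish the two failure cases named above: asking to create a store that already exists, and asking to open one that does not.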
Triple Interfaces
An RDF store could provide a standard set of interfaces for the manipulation, management, and/or retrieval of its contained triples (asserted or quoted):
This function can be thought of as the primary mechanism for producing triples with nodes that match the corresponding terms and term pattern provided. A conjunctive query can be indicated by providing either a null value (NULL, None, or the empty string) for context or the identifier associated with the Conjunctive Graph.
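A context-aware version of pattern matching can be sketched in plain Python as below. ToyContextStore, its method signatures, and the context names are hypothetical; the sketch only models the rule that a missing context means a conjunctive query over every context.

```python
class ToyContextStore:
    """Hypothetical store that files each triple under a context identifier."""

    def __init__(self):
        self._contexts = {}  # context identifier -> set of triples

    def add(self, triple, context):
        self._contexts.setdefault(context, set()).add(tuple(triple))

    def triples(self, pattern, context=None):
        if context is None:
            # Conjunctive query: search every context in the store.
            graphs = list(self._contexts.values())
        else:
            graphs = [self._contexts.get(context, set())]
        seen = set()
        for graph in graphs:
            for triple in graph:
                if triple in seen:
                    continue  # report a triple once even if in two contexts
                if all(want is None or want == got
                       for want, got in zip(pattern, triple)):
                    seen.add(triple)
                    yield triple


store = ToyContextStore()
store.add(("ex:alice", "ex:knows", "ex:bob"), context="ex:graph1")
store.add(("ex:alice", "ex:knows", "ex:bob"), context="ex:graph2")
store.add(("ex:bob", "ex:knows", "ex:carol"), context="ex:graph2")

in_graph1 = sorted(store.triples((None, "ex:knows", None), context="ex:graph1"))
conjunctive = sorted(store.triples((None, "ex:knows", None)))
```

Deduplicating across contexts is a design choice of this sketch; a store could instead report each match together with the context it was found in.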
Formula / Context Interfaces
These interfaces work on contexts and formulae (for stores that are formula-aware) interchangeably.
Named Graphs / Conjunctive Graphs
RDFLib defines the following kinds of Graphs:
A Conjunctive Graph is the most relevant collection of graphs that are considered to be the boundary for closed world assumptions. This boundary is equivalent to that of the store instance (which is itself uniquely identified and distinct from other instances of Store that signify other Conjunctive Graphs). It is equivalent to all the named graphs within it and is associated with a default graph, which is automatically assigned a BNode as its identifier if one is not given.
Formulae
RDFLib graphs support an additional extension of RDF semantics for formulae. For the academically inclined, Graham Klyne's 'formal' extension (see external links) is probably a good read.
Formulae are represented formally by the 'QuotedGraph' class and are disjoint from regular RDF graphs in that their statements are quoted.
Persistence
RDFLib provides an abstracted Store API for persistence of RDF and Notation 3. The Graph class works with instances of this API (as the first argument to its constructor) for triple-based management of an RDF store including: garbage collection, transaction management, update, pattern matching, removal, length, and database management (open / close / destroy). Additional persistence mechanisms can be supported by implementing this API for a different store. Currently supported databases:
Store instances can be created with the plugin function:
'Higher-order' Idioms
There are a few high-level APIs that extend RDFLib graphs into other Pythonic idioms. For a more explicit Python binding, there are Sparta and SuRF.
Support
There is a #rdflib IRC channel on freenode for anyone who wants to chat about rdflib or redfoot. RDFLib and related projects are hosted on GitHub, which also includes an issue tracker. A mailing list and documentation are also available.