Linked Data Caching
The Linked Data Caching library offers caching facilities for Linked Data resources. In particular, it provides the following two major functionalities:
- it caches the triple results of LDClient requests, taking into account expiry times, and dynamically refreshes results if needed
- it offers transparent access to resources in the Linked Data Cloud by wrapping the repository connection: when a call to getStatements(...) has as its subject a resource that is considered external, the remote triples are fetched as needed and included in the repository results
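The expiry-aware caching described in the first item can be sketched as follows. This is a minimal illustration, not the LDCache implementation: the class ExpiringTripleCache, its fetcher function (standing in for an LDClient request), the injectable clock, and the fixed time-to-live are all assumptions made for the example, and triples are represented as plain strings.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Illustrative only; this class is not part of the LDCache API. */
final class ExpiringTripleCache {

    /** Triples cached for one resource, plus the instant at which they expire. */
    private static final class Entry {
        final List<String> triples;
        final long expiresAtMillis;

        Entry(List<String> triples, long expiresAtMillis) {
            this.triples = triples;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();
    private final Function<String, List<String>> fetcher; // stands in for an LDClient request
    private final LongSupplier clock;                      // injectable so expiry can be tested
    private final long ttlMillis;                          // assumed fixed time-to-live

    ExpiringTripleCache(Function<String, List<String>> fetcher, LongSupplier clock, long ttlMillis) {
        this.fetcher = fetcher;
        this.clock = clock;
        this.ttlMillis = ttlMillis;
    }

    /** Returns the cached triples, fetching them again if they are missing or expired. */
    List<String> get(String resourceUri) {
        long now = clock.getAsLong();
        Entry entry = entries.get(resourceUri);
        if (entry == null || entry.expiresAtMillis <= now) {
            entry = new Entry(List.copyOf(fetcher.apply(resourceUri)), now + ttlMillis);
            entries.put(resourceUri, entry);
        }
        return entry.triples;
    }
}
```

A fixed time-to-live keeps the sketch short; a real cache would more likely take the expiry time from the remote response for each resource.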
The Linked Data Caching library consists of a number of modules that can be combined as needed by the project using it. In particular, it offers different caching backends and different implementations for hooking into the repository connection as needed.
Like all Apache Marmotta libraries, the LDCache core consists of two modules:
- ldcache-api: contains interfaces and model needed for implementing backends and wrappers
- ldcache-core: contains the main implementation of the caching functionality
Both modules need to be included in order to use the library (see usage).
Caching backends provide different means for storing the cached triples and corresponding metadata like expiry time. Currently, LDCache offers the following caching backends:
- ldcache-backend-kiwi: caches the triples in the underlying KiWi triple store in a separate named graph, and stores the caching metadata into a new database table; this implementation is used by the Apache Marmotta platform and provides very efficient transparent caching
- ldcache-backend-ehcache: caches the triples and caching metadata in a (volatile) EHCache cache; since the cache is purely in-memory (the EHCache Open Source edition does not offer persistent caching), the cache will be lost when the system is stopped (Note: this cache backend is not yet completed.)
- ldcache-backend-mapdb: caches the triples and caching metadata in a (persistent) MapDB cache; the cache is persisted to disk and will be restored when the system is restarted (Note: this cache backend is not yet completed.)
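The separation between caching logic and storage that these backends rely on can be sketched with a small storage contract and a volatile implementation. All names here (CachedResource, TripleCacheBackend, InMemoryBackend) are hypothetical and chosen for illustration; the actual interfaces in ldcache-api may look different.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical cache entry: the triples of one resource plus its caching metadata. */
final class CachedResource {
    final List<String> triples;
    final long expiresAtMillis;

    CachedResource(List<String> triples, long expiresAtMillis) {
        this.triples = List.copyOf(triples);
        this.expiresAtMillis = expiresAtMillis;
    }
}

/** Hypothetical storage contract; the real ldcache-api interfaces may differ. */
interface TripleCacheBackend {
    Optional<CachedResource> load(String resourceUri);

    void store(String resourceUri, CachedResource entry);

    void remove(String resourceUri);
}

/** Volatile backend: all entries are lost when the JVM stops. */
final class InMemoryBackend implements TripleCacheBackend {
    private final Map<String, CachedResource> entries = new ConcurrentHashMap<>();

    @Override
    public Optional<CachedResource> load(String resourceUri) {
        return Optional.ofNullable(entries.get(resourceUri));
    }

    @Override
    public void store(String resourceUri, CachedResource entry) {
        entries.put(resourceUri, entry);
    }

    @Override
    public void remove(String resourceUri) {
        entries.remove(resourceUri);
    }
}
```

A persistent backend would implement the same contract on top of a database table or an on-disk store, so the caching logic would not need to change.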
Connection wrappers override the getStatements(...) method of a repository connection and include the triples from the Linked Data cache in the result. Currently, LDCache offers the following connection wrappers:
- ldcache-sail-generic: implements a connection wrapper for any kind of Sesame Sail repository and any kind of backend; when getStatements(...) is called, it also retrieves the triples from the cache and merges the two result sets before returning (using a UnionIteration) (Note: this connection wrapper is not yet completed.)
- ldcache-sail-kiwi: implements a connection wrapper for a KiWi repository using a KiWi LDCache backend; this is a special case that allows certain optimisations that are not possible for generic wrappers, as the cached triples are already stored in the same triple store; it merely triggers a refresh of the cache when getStatements is called
Since it is not possible to query the whole Linked Data Cloud, connection wrappers will typically only work when the subject parameter of getStatements(...) is a URI resource and not a wildcard. While this seems like a strong restriction, it actually gives rise to many interesting use cases, e.g. with the query language LDPath.
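The wrapping behaviour and the subject restriction can be sketched as follows. This is a simplified illustration that does not use the Sesame API: StatementSource and LinkedDataAwareSource are hypothetical names, statements are plain strings, a null subject stands for a wildcard, and the results are concatenated eagerly rather than merged lazily.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical stand-in for a repository connection; a null subject means "wildcard". */
interface StatementSource {
    List<String> getStatements(String subject);
}

/** Illustrative wrapper that adds cached Linked Data triples to the local results. */
final class LinkedDataAwareSource implements StatementSource {
    private final StatementSource local;        // the wrapped repository
    private final StatementSource cache;        // the Linked Data cache
    private final Predicate<String> isExternal; // decides which subjects are remote resources

    LinkedDataAwareSource(StatementSource local, StatementSource cache, Predicate<String> isExternal) {
        this.local = local;
        this.cache = cache;
        this.isExternal = isExternal;
    }

    @Override
    public List<String> getStatements(String subject) {
        List<String> result = new ArrayList<>(local.getStatements(subject));
        // The cache is only consulted for a concrete, external subject;
        // a wildcard cannot be resolved against the whole Linked Data Cloud.
        if (subject != null && isExternal.test(subject)) {
            result.addAll(cache.getStatements(subject));
        }
        return result;
    }
}
```

With a wildcard subject or a local resource, the wrapper behaves exactly like the wrapped source; only lookups for a specific external resource reach the cache.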