This project has retired. For details please refer to its Attic page.
Apache Marmotta - LDCache Usage

Currently, all libraries of the Linked Data Cache are available only via Maven or as part of the Marmotta source code.

Maven Artifacts

To use the Linked Data Cache in your own projects, please add the following Maven dependencies to your project build:

<dependency>
    <groupId>org.apache.marmotta</groupId>
    <artifactId>ldcache-api</artifactId>
    <version>3.3.0</version>
</dependency>
<dependency>
    <groupId>org.apache.marmotta</groupId>
    <artifactId>ldcache-core</artifactId>
    <version>3.3.0</version>
</dependency>

Because LDCache uses LDClient internally to retrieve resource data, you should also add the LDClient backends you need. The LDClient documentation lists the available modules and explains how to configure and use them.

In order to use the Linked Data Cache, you’ll also need to add at least one caching backend to the project. Assuming you are using the KiWi triple store, you would add the KiWi caching backend as follows:

<dependency>
    <groupId>org.apache.marmotta</groupId>
    <artifactId>ldcache-backend-kiwi</artifactId>
    <version>3.3.0</version>
</dependency>

A number of additional caching backends are currently under development. We plan to release at least the following two backends:

  • ldcache-backend-ehcache will store caching information in an EHCache in-memory cache; cache information will be lost on restart
  • ldcache-backend-mapdb will store caching information in a persistent hash map; cache information will survive a restart, but this implementation will be less scalable

If you also want to use the transparent Linked Data cache in your repository, you need to include one of the connection wrappers in your project as well. If you are using the KiWi triple store, add the following additional artifact:

<dependency>
    <groupId>org.apache.marmotta</groupId>
    <artifactId>ldcache-sail-kiwi</artifactId>
    <version>3.3.0</version>
</dependency>

Retrieving Cache Entries

Retrieving Linked Data resources through the cache works almost exactly like retrieving them through a raw LDClient instance. To initialise an LDCache instance, use the following basic procedure:

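// default cache configuration (LDClient configuration plus caching settings)
CacheConfiguration config = new CacheConfiguration();

// "backend" is an LDCachingBackend instance from the caching backend in use
LDCache ldcache = new LDCache(config, backend);

// do stuff

// release the cache when it is no longer needed
ldcache.shutdown();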

The CacheConfiguration consists of an LDClient configuration and some additional configuration values that are relevant for caching only (currently only the default expiry time, used when none is given in the response). The backend parameter is an instance of LDCachingBackend and depends on the actual backend implementation used (see backends).

To retrieve a resource into the cache, you would add the following statement:

ldcache.refreshResource(resource, false);

The first argument is the Sesame URI of the resource you want to refresh. The second argument is a boolean that controls whether the refresh is forced: if true, the resource is refreshed regardless of its expiry time; if false, the expiry time is respected and the resource is not refreshed unless it has expired.
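The effect of the second argument can be pictured as a small decision function. This is a minimal sketch under stated assumptions, not LDCache's actual implementation: the class and method names are hypothetical, and treating a missing cache entry as "needs fetching" is an assumption of the sketch.

```java
import java.time.Instant;

// Hypothetical illustration only; not part of the LDCache API.
class RefreshDecisionSketch {

    /**
     * Returns true if the resource would be fetched from its source again.
     * expiresAt is null when no cache entry exists yet (an assumption of this sketch).
     */
    static boolean shouldRefresh(Instant expiresAt, Instant now, boolean force) {
        if (force) {
            return true;                  // forced refresh ignores the expiry time
        }
        if (expiresAt == null) {
            return true;                  // nothing cached yet, so fetch
        }
        return !now.isBefore(expiresAt);  // fetch only once the entry has expired
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2015-01-01T12:00:00Z");
        Instant inOneHour = now.plusSeconds(3600);

        System.out.println(shouldRefresh(inOneHour, now, false)); // prints "false"
        System.out.println(shouldRefresh(inOneHour, now, true));  // prints "true"
    }
}
```

Passing false is therefore the cheap default for repeated calls, while true is useful when you know the remote data has changed.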

In order to access the cached triple content of a resource, you can request a Sesame RepositoryConnection to the cached content as follows:

LDCachingConnection con = ldcache.getCacheConnection(resource_uri);

LDCachingConnection is a standard RepositoryConnection with some additional methods for getting cache information. Note that the repository might also be shared between many resources, so when querying for cached triples, set the subject parameter to the requested resource.
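To see why the subject restriction matters when the repository is shared, consider the following sketch. It models triples as plain string arrays rather than Sesame statements; the class name, helper methods, and sample URIs are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; triples are plain {subject, predicate, object} arrays.
class SharedCacheSketch {

    /** Returns only the triples whose subject equals the requested resource. */
    static List<String[]> triplesAbout(List<String[]> cache, String resource) {
        List<String[]> result = new ArrayList<>();
        for (String[] triple : cache) {
            if (triple[0].equals(resource)) {
                result.add(triple);
            }
        }
        return result;
    }

    /** A shared cache holding triples for two different (made-up) resources. */
    static List<String[]> sampleCache() {
        return Arrays.asList(
            new String[] {"http://remote/a", "http://example.org/name", "A"},
            new String[] {"http://remote/a", "http://example.org/knows", "http://remote/b"},
            new String[] {"http://remote/b", "http://example.org/name", "B"});
    }

    public static void main(String[] args) {
        List<String[]> cache = sampleCache();
        // Without a subject restriction all three triples are visible;
        // restricting to one resource yields only its own triples.
        System.out.println(cache.size());                                  // prints "3"
        System.out.println(triplesAbout(cache, "http://remote/a").size()); // prints "2"
    }
}
```

On a real connection, the equivalent restriction is to pass the requested resource as the subject argument of getStatements instead of null.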

In addition to the basic functionality, LDCache offers a number of additional methods to support typical operations. The most important methods are:

  • listCacheEntries() returns a (lazy, closeable) iterator over all cache entries managed by the LDCache instance
  • listExpiredEntries() returns a (lazy, closeable) iterator over all expired cache entries managed by the LDCache instance
  • expire(URI resource) forces the expiry of the cache entry for the resource given as argument
  • expireAll() forces the expiry of all cache entries managed by the LDCache instance
  • refreshExpired() updates all expired entries with the latest content from the source
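The expiry-related operations in the list above can be pictured with a small in-memory model. This is only a sketch: the real LDCache delegates to its caching backend, and the class, its methods, and the explicit "now" parameter are hypothetical simplifications introduced here.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model; the real LDCache delegates to its caching backend.
class ExpiryTableSketch {

    // resource URI -> time at which the cached copy expires
    private final Map<String, Instant> expiry = new LinkedHashMap<>();

    void put(String resource, Instant expiresAt) {
        expiry.put(resource, expiresAt);
    }

    /** Resources whose expiry time is not after "now". */
    List<String> expiredAt(Instant now) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Instant> e : expiry.entrySet()) {
            if (!e.getValue().isAfter(now)) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    /** Forces expiry of one entry by moving its expiry time to "now". */
    void expire(String resource, Instant now) {
        if (expiry.containsKey(resource)) {
            expiry.put(resource, now);
        }
    }

    /** Forces expiry of every entry. */
    void expireAll(Instant now) {
        for (Map.Entry<String, Instant> e : expiry.entrySet()) {
            e.setValue(now);
        }
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2015-01-01T12:00:00Z");
        ExpiryTableSketch table = new ExpiryTableSketch();
        table.put("http://remote/a", now.plusSeconds(3600));
        table.put("http://remote/b", now.minusSeconds(60));

        System.out.println(table.expiredAt(now)); // prints "[http://remote/b]"
        table.expire("http://remote/a", now);
        System.out.println(table.expiredAt(now)); // prints "[http://remote/a, http://remote/b]"
    }
}
```

In this model, refreshing the expired entries would amount to iterating over the result of expiredAt and fetching each resource again.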

Transparent Linked Data Access

Probably the most attractive feature of LDCache is transparent Linked Data access. It lets you access Linked Data resources through a Sesame RepositoryConnection as if they were stored in a local triple store. This essentially gives you a Sesame Repository view of the Linked Data Cloud. To use this feature, add one of the LDCache sail connection wrappers to your sail stack, e.g. as follows (example for the KiWi LDCache sail, see backends for more details):

// create LDClient client configuration
ClientConfiguration config = new ClientConfiguration();

// add LDClient endpoints if needed
config.addEndpoint(...);

// configure a filter to indicate which resources are considered "remote"
ResourceFilter cacheFilter = new UriPrefixFilter("http://remote/");

KiWiStore store = new KiWiStore("test", jdbcUrl, jdbcUser, jdbcPass, dialect,
        "http://localhost/context/default", "http://localhost/context/inferred");
KiWiLinkedDataSail lsail = new KiWiLinkedDataSail(store, cacheFilter, CACHE_CONTEXT, config);

Repository repository = new SailRepository(lsail);
repository.initialize();

RepositoryConnection con = repository.getConnection();
try {
    URI subject = repository.getValueFactory().createURI("http://remote/testresource");

    // transparently access the triples of "subject"
    RepositoryResult<Statement> triples = con.getStatements(subject, null, null, true);
    try {
        while (triples.hasNext()) {
            Statement t = triples.next();

            // do something
            ...
        }
    } finally {
        // release the result even if processing a statement fails
        triples.close();
    }
} finally {
    con.close();
}

repository.shutdown();

Obviously, there are some restrictions. The most important is that the subject of a triple query cannot be a wildcard. So, sorry folks, no full SPARQL over the Linked Data Cloud. However, in combination with the LDPath query language, transparent caching gives you a very powerful tool for accessing Linked Data resources.
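One way to picture the subject restriction: a Linked Data resource is retrieved by dereferencing its URI, and a wildcard subject names no URI. The sketch below is an illustration under stated assumptions, not the logic of any actual LDCache sail. The class, enum, and method are hypothetical, and the assumption that a wildcard query is answered from locally stored triples only is a simplification made for this sketch.

```java
// Hypothetical illustration only; not the logic of any actual LDCache sail.
class SubjectPatternSketch {

    enum Route { LOCAL_ONLY, REFRESH_THEN_LOCAL }

    /**
     * Decides how a triple pattern would be served.
     * subject is null for a wildcard; remotePrefix plays the role of the URI prefix filter.
     */
    static Route route(String subject, String remotePrefix) {
        if (subject == null) {
            // a wildcard names no resource, so there is no URI to dereference
            return Route.LOCAL_ONLY;
        }
        if (subject.startsWith(remotePrefix)) {
            // a concrete remote resource can be refreshed before the query runs
            return Route.REFRESH_THEN_LOCAL;
        }
        return Route.LOCAL_ONLY;
    }

    public static void main(String[] args) {
        String prefix = "http://remote/";
        System.out.println(route("http://remote/testresource", prefix));  // prints "REFRESH_THEN_LOCAL"
        System.out.println(route("http://localhost/resource/1", prefix)); // prints "LOCAL_ONLY"
        System.out.println(route(null, prefix));                          // prints "LOCAL_ONLY"
    }
}
```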