Eclipse rdf4j: documentation

RDF4J Server, Workbench, and Console

1. Installing RDF4J Server and RDF4J Workbench

In this chapter, we explain how you can install RDF4J Server (the actual database server and SPARQL endpoint service) and RDF4J Workbench (a web-based client UI for managing databases and executing queries).

1.1. Required software

RDF4J Server requires the following software:

  • Java 8 Runtime Environment

  • A Java Servlet Container that supports Java Servlet API 2.5 and Java Server Pages (JSP) 2.0, or newer. We recommend using a recent, stable version of Apache Tomcat.

1.2. RDF4J Server and RDF4J Workbench

RDF4J Server is a database management application: it provides HTTP access to RDF4J repositories, exposing them as SPARQL endpoints. RDF4J Server is meant to be accessed by other applications. Apart from some functionality to view the server’s log messages, it doesn’t provide any user oriented functionality. Instead, the user oriented functionality is part of RDF4J Workbench. The Workbench provides a web interface for querying, updating and exploring the repositories of an RDF4J Server.

If you have not done so already, you will first need to download the RDF4J SDK. Both RDF4J Server and RDF4J Workbench can be found in the war directory of the SDK. The WAR files in this directory need to be deployed in a Java Servlet Container. The deployment process is container-specific; please consult the documentation for your container on how to deploy a web application.

After you have deployed the RDF4J Server webapp, you should be able to access it, by default, at path /rdf4j-server. You can point your browser at this location to verify that the deployment succeeded.
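The browser check described above can also be scripted. The following is a minimal sketch, assuming a container listening on http://localhost:8080; the helper names server_url and is_deployed are illustrative and not part of RDF4J.

```python
import urllib.request


def server_url(base: str, path: str = "/rdf4j-server") -> str:
    """Join the container's base URL with the webapp's context path."""
    return base.rstrip("/") + "/" + path.strip("/")


def is_deployed(url: str, timeout: float = 5.0) -> bool:
    """Return True if an HTTP GET on the webapp URL answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # URLError and HTTPError are both subclasses of OSError, so this
        # covers refused connections, timeouts and 4xx/5xx responses.
        return False


# Example call (performs a real HTTP request, so it is left commented out):
# print(is_deployed(server_url("http://localhost:8080")))
```

A False result only says that the URL did not answer successfully; the container's own log is still the place to look for the cause.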

1.2.1. Configuring RDF4J Workbench for UTF-8 Support

UTF-8 in the Request URI (GET)

There is a known issue (SES-1768) that affects exploring resources that use an extended character set. Workbench client-side code generates URIs assuming an ISO-8859-1 character encoding, while Tomcat often comes pre-configured to expect UTF-8 encoded URIs. It will be necessary to change the HTTP Connector configuration, or to add a separate HTTP Connector that uses ISO-8859-1. For details, see the documentation for Tomcat 6 or Tomcat 7.
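To see why the two encodings clash, it helps to look at how one non-ASCII character is percent-encoded under each of them. This sketch uses only the Python standard library and is independent of Tomcat and Workbench; the helper name is illustrative.

```python
from urllib.parse import quote, unquote


def encode_uri_component(text: str, charset: str) -> str:
    """Percent-encode text using the bytes of the given character set."""
    return quote(text, safe="", encoding=charset)


utf8_form = encode_uri_component("é", "utf-8")         # two bytes
latin1_form = encode_uri_component("é", "iso-8859-1")  # one byte

# A receiver that decodes with a different charset than the sender used
# sees a different string.
misread = unquote(utf8_form, encoding="iso-8859-1")

print(utf8_form, latin1_form, misread)  # prints: %C3%A9 %E9 Ã©
```

Client and server therefore have to agree on one encoding for the request URI, which is what the Connector setting controls.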

UTF-8 in the Request Body (POST)

To resolve issues where the request body is not properly interpreted as UTF-8, configure Tomcat to use its built-in SetCharacterEncodingFilter. Some details are at https://wiki.apache.org/tomcat/FAQ/CharacterEncoding#Q3. With Tomcat 6, version 6.0.36 or later is required. On Tomcat 7, un-commenting the <filter> and <filter-mapping> elements for setCharacterEncodingFilter in $CATALINA_BASE/conf/web.xml and restarting the server were the only necessary steps.

1.2.2. Logging Configuration

Both RDF4J Server and RDF4J Workbench use the Logback logging framework. In the default configuration, all RDF4J Server log messages are sent to the log file [RDF4J_DATA]/Server/logs/main.log, and Workbench log messages to logs/main.log under [RDF4J_DATA]/Workbench.

The default log level is INFO, indicating that only important status messages, warnings and errors are logged. The log level and logging behaviour can be adjusted by modifying the [RDF4J_DATA]/Server/conf/logback.xml file. This file will be generated when the server is first run. Please consult the Logback manual for configuration instructions.

1.2.3. Repository Configuration

A clean installation of RDF4J Server has a single repository by default: the SYSTEM repository. This SYSTEM repository contains all configuration data for the server, including data on which other repositories exist and (in future releases) the access rights on these repositories. The SYSTEM repository should not be used to store data that is not related to the server configuration.

The best way to create and manage repositories in a SYSTEM repository is to use the RDF4J Console or RDF4J Workbench. The RDF4J Console is a command-line application for interacting with RDF4J Server (see the following chapter for details).

1.2.4. Setting up Users and Permissions

It is possible to set up your RDF4J Server to authenticate named users and restrict their permissions. See the blog post at http://rivuli-development.com/further-reading/sesame-cookbook/basic-security-with-http-authentication/ for a tutorial on how to do so when using Tomcat as the application container.

1.3. Application directory configuration

RDF4J Server, Workbench, and Console all store configuration files and repository data in a single directory (with subdirectories). On Windows machines, this directory is %APPDATA%\RDF4J\ by default, where %APPDATA% is the application data directory of the user that runs the application. For example, if the application runs under the ‘LocalService’ user account on Windows XP, the directory is C:\Documents and Settings\LocalService\Application Data\RDF4J\. On Linux/UNIX, the default location is $HOME/.rdf4j/, for example /home/tomcat/.rdf4j/. We will refer to this data directory as [RDF4J_DATA] in the rest of this manual.

The location of this data directory can be reconfigured using the Java system property org.eclipse.rdf4j.appdata.basedir. When you are using Tomcat as the servlet container then you can set this property using the JAVA_OPTS parameter, for example:

set JAVA_OPTS=-Dorg.eclipse.rdf4j.appdata.basedir=\path\to\other\dir\ (on Windows)
export JAVA_OPTS='-Dorg.eclipse.rdf4j.appdata.basedir=/path/to/other/dir/' (on Linux/UNIX)

If you are using Apache Tomcat as a Windows Service you should use the Windows Services configuration tool to set this property. Other users can either edit the Tomcat startup script or set the property some other way.

An easy way to find the data directory of a running RDF4J Server instance is to go to http://localhost:8080/rdf4j-server/home/overview.view in your browser and click on ‘System’ in the navigation menu on the left. The data directory will be listed as one of the configuration settings of the current server.
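The lookup rules in this section can be summarised in a few lines. This is a simplified model of only the cases described above (Windows, Linux/UNIX, and the system property override); the function resolve_data_dir and its arguments are hypothetical, and the actual RDF4J logic may handle further cases.

```python
import ntpath
import posixpath
from typing import Mapping, Optional

PROPERTY = "org.eclipse.rdf4j.appdata.basedir"


def resolve_data_dir(os_name: str,
                     env: Mapping[str, str],
                     system_properties: Optional[Mapping[str, str]] = None) -> str:
    """Return the data directory following the rules described in the text."""
    props = system_properties or {}
    if PROPERTY in props:
        # An explicit system property overrides the platform default.
        return props[PROPERTY]
    if os_name == "windows":
        return ntpath.join(env["APPDATA"], "RDF4J")
    # Linux/UNIX default
    return posixpath.join(env["HOME"], ".rdf4j")


print(resolve_data_dir("linux", {"HOME": "/home/tomcat"}))  # /home/tomcat/.rdf4j
```

Passing the environment and properties as arguments keeps the sketch testable without touching the real process environment.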

2. RDF4J Console

The RDF4J Console is a command-line application for interacting with RDF4J. It can be used to create and use local RDF databases, or it can connect to a running RDF4J Server.

2.1. Getting started

RDF4J Console can be started using the console.bat/.sh scripts that can be found in the bin directory of the RDF4J SDK. By default, the console will connect to the “default data directory”, which contains the console’s own set of repositories.

The console can be operated by typing commands. For example, to get an overview of the available commands, type:

help

To get help for a specific command, type ‘help’ followed by the command name, e.g.:

help connect

2.1.1. Connecting to RDF4J Server

As indicated in the previous section, the console connects to its own set of repositories by default. Using the connect command you can make the console connect to an RDF4J Server or to a set of repositories on your file system. For example, to connect to an RDF4J Server that is listening on port 8080 on localhost, enter the following command:

connect http://localhost:8080/rdf4j-server

2.1.2. Repository list

To get an overview of the repositories that are available in the set that your console is connected to, use the show command:

show repositories
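Outside the console, a similar overview can be requested from an RDF4J Server over HTTP. The sketch below assumes the server's REST API exposes the repository list at the /repositories path as a SPARQL JSON result with an "id" binding per repository; the function names and the sample payload are illustrative only.

```python
import json
import urllib.request


def repository_list_request(server_url: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for the server's repository list."""
    return urllib.request.Request(
        server_url.rstrip("/") + "/repositories",
        headers={"Accept": "application/sparql-results+json"},
    )


def parse_repository_ids(payload: str) -> list:
    """Extract the 'id' binding from a SPARQL JSON result document."""
    bindings = json.loads(payload)["results"]["bindings"]
    return [row["id"]["value"] for row in bindings]


# Hand-written sample response, for illustration only.
SAMPLE = json.dumps({
    "head": {"vars": ["uri", "id", "title", "readable", "writable"]},
    "results": {"bindings": [
        {"id": {"type": "literal", "value": "SYSTEM"}},
        {"id": {"type": "literal", "value": "myRepo"}},
    ]},
})

print(parse_repository_ids(SAMPLE))  # ['SYSTEM', 'myRepo']
```

To query a live server, pass the request to urllib.request.urlopen and feed the decoded response body to parse_repository_ids.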

2.1.3. Creating a repository

The create command can be used to add new repositories to the set that the console is connected to. This command expects the name of a template that describes the repository’s configuration. The console includes a number of templates by default, among them:

  • memory: a memory-based RDF repository

  • memory-rdfs: a main-memory repository with RDF Schema inferencing

  • memory-rdfs-dt: a main-memory repository with RDF Schema and direct type hierarchy inferencing

  • native: a repository that uses an on-disk data structure

  • native-rdfs: a native repository with RDF Schema inferencing

  • native-rdfs-dt: a native repository with RDF Schema and direct type hierarchy inferencing

  • remote: a repository that serves as a proxy for a repository on an RDF4J Server

When the create command is executed, the console will ask you to fill in a number of parameters for the type of repository that you chose. For example, to create a native repository, you execute the following command:

create native

The console will then ask you to provide an ID and title for the repository, as well as the triple indexes that need to be created for this kind of store. The values between square brackets indicate default values which you can select by simply hitting enter. The output of this dialogue looks something like this:

Please specify values for the following variables:
Repository ID [native]: myRepo
Repository title [Native store]: My repository
Triple indexes [spoc,posc]:
Repository created

2.1.4. Other commands

Please check the documentation that is provided by the console itself for help on how to use the other commands. Most commands should be self-explanatory.

2.2. Repository configuration

2.2.1. Memory store configuration

A memory store is an RDF repository that stores its data in main memory. Apart from the standard ID and title parameters, this type of repository has Persist and Sync delay parameters.

Memory Store persistence

The Persist parameter controls whether the memory store will use a data file for persistence over sessions. Persistent memory stores write their data to disk before being shut down and read this data back in the next time they are initialized. Non-persistent memory stores are always empty upon initialization.

Synchronization delay

By default, the memory store persistence mechanism synchronizes the disk backup directly upon any change to the contents of the store. That means that directly after an update operation (upload, removal) completes, the disk backup is updated. It is, however, possible to configure a synchronization delay. This can be useful if your application performs several transactions in sequence and you want to prevent disk synchronization in the middle of this sequence to improve update performance.

The synchronization delay is specified by a number, indicating the time in milliseconds that the store will wait before it synchronizes changes to disk. The value 0 indicates that there should be no delay. Negative values can be used to postpone the synchronization indefinitely, i.e. until the store is shut down.
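The three cases for the Sync delay value can be written down as a small decision function. This only restates the rules above; describe_sync_delay is a hypothetical helper, not an RDF4J API.

```python
def describe_sync_delay(delay_ms: int) -> str:
    """Explain what a given Sync delay value means for a persistent memory store."""
    if delay_ms == 0:
        return "synchronize to disk immediately after each update"
    if delay_ms > 0:
        return f"synchronize to disk {delay_ms} ms after an update"
    # Any negative value postpones synchronization indefinitely.
    return "synchronize to disk only when the store is shut down"


for value in (0, 1000, -1):
    print(value, "->", describe_sync_delay(value))
```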

2.2.2. Native store configuration

A native store stores and retrieves its data directly to/from disk. The advantage of this over the memory store is that it scales much better as it is not limited to the size of available memory. Of course, since it has to access the disk, it is also slower than the in-memory store, but it is a good solution for larger data sets.

Native store indexes

The native store uses on-disk indexes to speed up querying. It uses B-Trees for indexing statements, where the index key consists of four fields: subject (s), predicate (p), object (o) and context (c). The order in which these fields are used in the key determines how usable an index is for a specific statement query pattern: searching statements with a specific subject in an index that has the subject as the first field is significantly faster than searching these same statements in an index where the subject field is second or third. In the worst case, the ‘wrong’ statement pattern will result in a sequential scan over the entire set of statements.
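One way to picture this is to count how many leading fields of an index key are fixed by a query pattern: the longer that prefix, the narrower the range of the B-Tree that has to be scanned. The sketch below is a simplified model for illustration and is not the native store's actual index selection code.

```python
def usable_prefix_length(index: str, bound: set) -> int:
    """Count the leading fields of the index key that the pattern binds."""
    count = 0
    for field in index:
        if field not in bound:
            break
        count += 1
    return count


def best_index(indexes: list, bound: set) -> str:
    """Pick the index with the longest bound prefix (first one wins on ties)."""
    return max(indexes, key=lambda ix: usable_prefix_length(ix, bound))


indexes = ["spoc", "posc"]
print(best_index(indexes, {"s"}))              # spoc: subject is the first field
print(best_index(indexes, {"p", "o"}))         # posc: predicate and object lead the key
print(usable_prefix_length("spoc", {"o"}), usable_prefix_length("posc", {"o"}))  # 0 0
```

In the last case neither default index starts with the object field, so in this model both fall back to a scan over all statements.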

By default, the native repository only uses two indexes, one with a subject-predicate-object-context (spoc) key pattern and one with a predicate-object-subject-context (posc) key pattern. However, it is possible to define more or other indexes for the native repository, using the Triple indexes parameter. This can be used to optimize performance for query patterns that occur frequently.

The subject, predicate, object and context fields are represented by the characters ‘s’, ‘p’, ‘o’ and ‘c’ respectively. Indexes can be specified by creating 4-letter words from these four characters. Multiple indexes can be specified by separating these words with commas, spaces and/or tabs. For example, the string “spoc, posc” specifies two indexes: a subject-predicate-object-context index and a predicate-object-subject-context index.
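A Triple indexes value can be checked before it is entered. This sketch splits on commas, spaces and tabs as described above; that each word must use every one of the four letters exactly once is an assumption made here, and parse_index_spec is a hypothetical helper.

```python
import re


def parse_index_spec(spec: str) -> list:
    """Split a 'Triple indexes' string into index names and validate each one."""
    words = [word for word in re.split(r"[,\s]+", spec.strip()) if word]
    for word in words:
        # Assumption: an index name is a permutation of the letters s, p, o, c.
        if sorted(word) != sorted("spoc"):
            raise ValueError(f"not a valid index name: {word!r}")
    return words


print(parse_index_spec("spoc, posc"))        # ['spoc', 'posc']
print(parse_index_spec("spoc\tposc cspo"))   # ['spoc', 'posc', 'cspo']
```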

Creating more indexes potentially speeds up querying (a lot), but also adds overhead for maintaining the indexes. Also, every added index takes up additional disk space.

The native store automatically creates/drops indexes upon (re)initialization, so the parameter can be adjusted and upon the first refresh of the configuration the native store will change its indexing strategy, without loss of data.

2.2.3. HTTP repository configuration

An HTTP repository is not an actual store by itself, but serves as a proxy for a store on a (remote) RDF4J Server. Apart from the standard ID and title parameters, this type of repository has an RDF4J Server location and a Remote repository ID parameter.

RDF4J Server location

This parameter specifies the URL of the RDF4J Server instance that the repository should communicate with. Default value is http://localhost:8080/rdf4j-server, which corresponds to an RDF4J Server instance that is running on your own machine.

Remote repository ID

This is the ID of the remote repository that the HTTP repository should communicate with. Please note that an HTTP repository in the Console has two repository ID parameters: one identifying the remote repository and one that specifies the HTTP repository’s own ID.
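The two IDs are easy to confuse, so the sketch below keeps them as separate fields. It assumes that a repository on an RDF4J Server is addressed as the server location followed by /repositories/ and the remote ID; the class and field names are illustrative.

```python
from dataclasses import dataclass
from urllib.parse import quote


@dataclass
class HttpRepositoryConfig:
    local_id: str         # the HTTP repository's own ID in the console
    title: str
    server_location: str  # URL of the RDF4J Server instance
    remote_id: str        # ID of the repository on that server

    @property
    def remote_url(self) -> str:
        """URL of the remote repository this proxy talks to."""
        base = self.server_location.rstrip("/")
        return f"{base}/repositories/{quote(self.remote_id, safe='')}"


config = HttpRepositoryConfig(
    local_id="proxy",
    title="Proxy for myRepo",
    server_location="http://localhost:8080/rdf4j-server",
    remote_id="myRepo",
)
print(config.remote_url)  # http://localhost:8080/rdf4j-server/repositories/myRepo
```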

2.2.4. Repository configuration templates (advanced)

In RDF4J Server, repository configurations with all their parameters are modeled in RDF and stored in the SYSTEM repository. So, in order to create a new repository, the Console needs to create such an RDF document and submit it to the SYSTEM repository. The Console uses so-called repository configuration templates to accomplish this.

Repository configuration templates are simple Turtle RDF files that describe a repository configuration, where some of the parameters are replaced with variables. The Console parses these templates and asks the user to supply values for the variables. The variables are then substituted with the specified values, which produces the required configuration data.

The RDF4J Console comes with a number of default templates. The Console tries to resolve the parameter specified with the ‘create’ command (e.g. “memory”) to a template file with the same name (e.g. “memory.ttl”). The default templates are included in the Console library, but the Console also looks in the templates subdirectory of [RDF4J_DATA]. You can define your own templates by placing template files in this directory.

To create your own templates, it’s easiest to start with an existing template and modify that to your needs. The default “memory.ttl” template looks like this:

#
# RDF4J configuration template for a main-memory repository
#
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
@prefix rep: <http://www.openrdf.org/config/repository#>.
@prefix sr: <http://www.openrdf.org/config/repository/sail#>.
@prefix sail: <http://www.openrdf.org/config/sail#>.
@prefix ms: <http://www.openrdf.org/config/sail/memory#>.

[] a rep:Repository ;
   rep:repositoryID "{%Repository ID|memory%}" ;
   rdfs:label "{%Repository title|Memory store%}" ;
   rep:repositoryImpl [
      rep:repositoryType "openrdf:SailRepository" ;
      sr:sailImpl [
         sail:sailType "openrdf:MemoryStore" ;
         ms:persist {%Persist|true|false%} ;
         ms:syncDelay {%Sync delay|0%}
      ]
   ].

Template variables are written down as {%var name%} and can specify zero or more values, separated by vertical bars (“|”). If one value is specified then this value is interpreted as the default value for the variable. The Console will use this default value when the user simply hits the Enter key. If multiple variable values are specified, e.g. {%Persist|true|false%}, then this is interpreted as the set of all possible values. If the user enters an unspecified value then that is considered to be an error. The value that is specified first is used as the default value.
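The substitution rules can be illustrated with a short script that fills in a template from a dictionary of answers. It is a sketch of the behaviour described above, not the Console's implementation; fill_template is a hypothetical helper, and rejecting an empty answer for a variable without a default is an assumption.

```python
import re

VARIABLE = re.compile(r"\{%(.+?)%\}")


def fill_template(template: str, answers: dict) -> str:
    """Replace each {%name|value|...%} variable with the user's answer or its default."""
    def substitute(match):
        name, *values = match.group(1).split("|")
        answer = answers.get(name, "")
        if answer == "":
            if not values:
                # Assumption: a variable without a default needs an explicit value.
                raise ValueError(f"no value given for {name!r}")
            return values[0]  # the first listed value is the default
        if len(values) > 1 and answer not in values:
            raise ValueError(f"{answer!r} is not an allowed value for {name!r}")
        return answer

    return VARIABLE.sub(substitute, template)


TEMPLATE = ('rep:repositoryID "{%Repository ID|memory%}" ; '
            'ms:persist {%Persist|true|false%} ; '
            'ms:syncDelay {%Sync delay|0%}')

print(fill_template(TEMPLATE, {"Repository ID": "myRepo"}))
# rep:repositoryID "myRepo" ; ms:persist true ; ms:syncDelay 0
```

A variable with a single listed value accepts any answer, while a variable with several listed values accepts only those values.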

The URIs that are used in the templates are the URIs that are specified by the RepositoryConfig and SailConfig classes of RDF4J’s repository configuration mechanism. The relevant namespaces and URIs can be found in the javadoc of these classes.

3. RDF4J Workbench

This chapter describes the RDF4J Workbench, a web application for interacting with RDF4J and/or other SPARQL endpoints.

This chapter will refer to URLs on a local server served from port 8080, which is possibly the most common “out-of-the-box” configuration. That is, Workbench URLs will start with http://localhost:8080/.

3.1. Browser Client Support

Table 1. Browser Client Support Matrix

Browser | Fully Working Versions | Non-Working Versions
Firefox | Any recent |
Chrome | Any recent |
Internet Explorer | 8.0, 9.0 (Compatibility View), 10.0 (Compatibility View) |

3.2. Getting Started

See Installing RDF4J Server and Workbench for instructions on how to install Workbench.

To start using Workbench for the first time, point your browser to http://localhost:8080/rdf4j-workbench. Your browser will be automatically redirected to http://localhost:8080/rdf4j-workbench/repositories/NONE/repositories. This page will display all repositories in the default server, as indicated by the “default-server” property in WEB-INF/web.xml. Normally this is set to “/rdf4j-server”. That is, the default server for Workbench is usually the RDF4J Server instance at the path “/rdf4j-server” on the same web server. To view information about the RDF4J Server instance, click on “RDF4J Server” at the top of the side menu.

If the RDF4J Server instance has never been used, then the only repository displayed will be the SYSTEM repository.

3.2.1. Setting the Server, Repository and User Credentials

A “current selection” section sits at the top right in the Workbench, informing you of the URL of the server you are using, the repository you are currently using, and the user name used when accessing the server. Each of these items can be changed by clicking the “change” link immediately to the right of them. Since the Workbench is generally used for prototyping and exploration, “user” is commonly set to “none”. In this case, the Workbench is connecting to the RDF4J Server without authenticating, and below we refer to the user in this mode as the anonymous user.

3.2.2. Setting the Server and User Credentials

There are two ways to reach the “Change Server” page, which allows you to enter a URL for the server and, optionally, user credentials:

  1. Clicking on “change” for either the server or the user.

  2. Clicking on “RDF4J Server” on the sidebar menu.

A full URL is expected in the “Change Server” field. You may enter a file:/// URL to access a local repository on the Workbench server, but you need to make sure that the Workbench server process has permission to access the given folder.

3.2.3. Important Security Consideration

Workbench stores user name and password credentials in plain-text cookies in the browser. You will need to configure your Workbench server for HTTPS to properly protect these credentials. See https://tomcat.apache.org/tomcat-6.0-doc/ssl-howto.html or https://tomcat.apache.org/tomcat-7.0-doc/ssl-howto.html for more information.

3.2.4. Setting the Repository

Prerequisites: You have already connected to a server.

Even a new server contains at least the SYSTEM repository, which is the special repository used by RDF4J Server to track its repositories. There are two ways to change the current repository:

  1. Clicking on “change” for the current repository in the “current selection” section.

  2. Clicking on “Repositories” in the sidebar menu.

You will be presented with a table listing of all the repositories available on the current server, with the following columns:

  • Readable

  • Writable

  • Id

  • Description

  • Location

“Location” is the URL of the repository, useful for accessing it via the RDF4J REST API. “Id” is presented as a clickable hyperlink that will open that repository in the Workbench, bringing the user to a summary page for the repository.

3.2.5. Creating a Repository

Prerequisites: You have already connected to a server.

Click on “New repository” in the sidebar menu. This brings up the “New Repository” page. You are presented with a simple form that provides a “Type:” selector with several repository types:

1-4 | In Memory Store | Simple, RDF Schema, RDF Schema and Direct Type Inferencing, or Custom Graph Query Inference
5-8 | Native Store | Simple, RDF Schema, RDF Schema and Direct Type Inferencing, or Custom Graph Query Inference
11 | Remote RDF Store | References an RDF4J repository external to the present server.
12 | SPARQL Endpoint Proxy | References a SPARQL Endpoint (see SPARQL 1.1 Protocol).
13

The “ID:” and “Title:” fields are optional in this form. Clicking “Next” brings up a form with more fields specific to the repository type selected. On that form, it will be necessary to enter something in the “ID:” field before the “Create” button may be clicked. If creation is successful, the new repository is also opened and its “Summary” page is presented.

3.2.6. Modifying the Data Contents of a Repository

Prerequisites: You have already connected to a server. You have opened a repository.

Data may be added to or removed from the current repository using any of the sidebar menu items under “Modify”. After all successful operations, the user is presented with the repository “Summary” page.

3.2.7. Add

The “Add” page allows you to specify a URL with RDF data, a local file on your client system, or to enter serialized RDF data into its text area for loading into the present repository. It is also possible to specify the Base URI and a Context for the triples. Think of the Context as a 4th element of each RDF statement, specifying a graph within the repository. You may specify one of eight serialization formats, or select “auto-detect” to let the server make a best guess at the format.

Remove

The “Remove Statements” page presents you with a form where you may enter values for subject, predicate, object or context. Clicking on “Remove” then removes all statements from the repository which match the given values. Leaving an item blank means that any value will match. If all values are left blank, clicking “Remove” will not do anything except present a warning message.
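The matching rule (a blank field matches anything, and an entirely blank form removes nothing) can be modelled on an in-memory list of statements. This is an illustration of the semantics only; Quad and remove_matching are hypothetical names, and the prefixed names stand in for full resource values.

```python
from typing import NamedTuple


class Quad(NamedTuple):
    subject: str
    predicate: str
    obj: str
    context: str


def remove_matching(quads, subject="", predicate="", obj="", context=""):
    """Return the statements that remain after removing those matching the pattern."""
    pattern = (subject, predicate, obj, context)
    if not any(pattern):
        # All fields blank: nothing is removed (the Workbench shows a warning).
        return list(quads)

    def matches(quad):
        return all(want == "" or want == have for want, have in zip(pattern, quad))

    return [quad for quad in quads if not matches(quad)]


quads = [
    Quad("ex:a", "rdf:type", "ex:T", "ex:g1"),
    Quad("ex:a", "ex:p", "ex:b", "ex:g2"),
    Quad("ex:c", "rdf:type", "ex:T", "ex:g1"),
]
print(remove_matching(quads, predicate="rdf:type"))  # only the ex:p statement remains
```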

Clear

The “Clear Repository” page is powerful. Leaving the lone “Context:” field blank and clicking “Clear Context(s)” will remove all statements from all graphs in the repository. It is also possible to enter a resource value corresponding to a context that exists in the repository, and the statements for that graph only will be removed.

SPARQL Update

The “Execute SPARQL Update on Repository” page gives a text area where you enter a SPARQL 1.1 Update command. SPARQL Update is an extension to the SPARQL query language that provides full CRUD (Create Read Update Delete) capabilities. For more information see the W3C Recommendation for SPARQL 1.1 Update. Clicking “Execute” executes the specified SPARQL Update operation.
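The same kind of operation can be sent to a repository without the Workbench. The sketch below builds a SPARQL 1.1 Protocol style request with the update text in an "update" form parameter, and assumes that the repository accepts updates at its /statements path; the repository URL and function name are placeholders.

```python
import urllib.parse
import urllib.request


def sparql_update_request(repository_url: str, update: str) -> urllib.request.Request:
    """Build (but do not send) a form-encoded POST carrying a SPARQL Update."""
    body = urllib.parse.urlencode({"update": update}).encode("utf-8")
    return urllib.request.Request(
        repository_url.rstrip("/") + "/statements",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded; charset=UTF-8"},
        method="POST",
    )


UPDATE = 'INSERT DATA { <http://example.org/a> <http://example.org/p> "v" }'
request = sparql_update_request(
    "http://localhost:8080/rdf4j-server/repositories/myRepo", UPDATE
)
print(request.get_method(), request.full_url)
# To execute it against a running server: urllib.request.urlopen(request)
```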

3.2.8. Exploring a Repository

Prerequisites: You have already connected to a server. You have opened a repository.

Summary Page

Click on “Summary” on the sidebar menu. A simple summary is displayed with the repository’s id, description, URL for remote access and the associated server’s URL for remote access. Many operations that create or update repositories display this page afterwards.

Namespaces Page

Namespace-prefix pairings can be defined within a repository, so that URIs can be displayed in shorthand form as a qualified name. To edit them, click on “Namespaces” on the sidebar menu. A page is displayed with a table of all presently defined pairs. Existing namespaces may be edited by selecting them in the drop-down list, which populates the text fields. The text fields may then be edited, and the “Update” button will make the change on the repository. The “Delete” button will remove whichever pair has been selected.
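How a prefix table turns a full URI into a qualified name, and back, can be shown in a few lines. This is a generic illustration rather than Workbench code; expand_qname and shorten are hypothetical helpers.

```python
def expand_qname(qname: str, namespaces: dict) -> str:
    """Turn prefix:local into a full URI using the prefix table."""
    prefix, separator, local = qname.partition(":")
    if not separator or prefix not in namespaces:
        raise KeyError(f"no namespace defined for {qname!r}")
    return namespaces[prefix] + local


def shorten(uri: str, namespaces: dict) -> str:
    """Display a URI as a qualified name if a namespace matches, else in angle brackets."""
    candidates = [(p, ns) for p, ns in namespaces.items() if uri.startswith(ns)]
    if not candidates:
        return f"<{uri}>"
    prefix, namespace = max(candidates, key=lambda item: len(item[1]))
    return f"{prefix}:{uri[len(namespace):]}"


namespaces = {"rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#"}
print(expand_qname("rdf:type", namespaces))
print(shorten("http://www.w3.org/1999/02/22-rdf-syntax-ns#type", namespaces))  # rdf:type
```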

Contexts Page

“Context” is the RDF4J construct for implementing RDF Named Graphs, which allow a repository to group data into separately addressable graphs. The Explore page always displays the context (always a URI or blank node) with each triple, the combination of which is often referred to as a quad.

To view all the contexts for the present repository, click on “Contexts” on the sidebar menu. Each context is clickable, bringing you to the “Explore” page for that context value.

Types Page

Click on “Types” on the sidebar menu. A list of types is displayed. These types are the resulting output from this SPARQL query:

SELECT DISTINCT ?type WHERE { ?subj a ?type }
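The same query can be sent to the repository directly. As a sketch, the function below builds a SPARQL Protocol GET URL with the query in a "query" parameter; the repository URL is a placeholder for the “Location” value shown in the repository list, and query_url is an illustrative name.

```python
from urllib.parse import urlencode

TYPES_QUERY = "SELECT DISTINCT ?type WHERE { ?subj a ?type }"


def query_url(repository_url: str, query: str) -> str:
    """Build a GET URL that carries the query as a URL-encoded parameter."""
    return repository_url + "?" + urlencode({"query": query})


url = query_url("http://localhost:8080/rdf4j-server/repositories/myRepo", TYPES_QUERY)
print(url)
```

Fetching this URL with an Accept header such as application/sparql-results+json would return the result set in JSON form.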

Explore Page

Click on “Explore” in the sidebar menu. You are presented with an “Explore” page. Type a resource value into the empty “Resource” field, and hit Enter. You will be presented with a table listing all triples where your given resource is a part of the statement, or is the context (graph) name. Currently allowable resource values are:

  • URIs enclosed in angle brackets, e.g., <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>

  • Qualified Names (qnames), e.g. rdf:type, where the prefix “rdf” is associated with the namespace “http://www.w3.org/1999/02/22-rdf-syntax-ns#” in the repository.

  • Literal values with an explicit datatype or language tag, e.g., "dog"@en or "hund"@de or "1"^^xsd:integer or "9.99"^^<http://www.w3.org/2001/XMLSchema#decimal>

Data types expressed with qnames also need to have their namespace defined in the repository.
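The three accepted forms can be told apart with simple patterns. The regular expressions below are a rough approximation written for this illustration; they are not the grammar the Workbench uses, and classify is a hypothetical helper.

```python
import re

URI = re.compile(r"^<[^<>\s]+>$")
QNAME = re.compile(r"^[A-Za-z][\w.-]*:[\w.-]*$")
LITERAL = re.compile(
    r'^"(?:[^"\\]|\\.)*"'                      # quoted lexical form
    r'(?:@[A-Za-z]+(?:-[A-Za-z0-9]+)*'         # language tag, or
    r'|\^\^(?:<[^<>\s]+>|[A-Za-z][\w.-]*:[\w.-]+))$'  # datatype as URI or qname
)


def classify(value: str) -> str:
    """Return 'uri', 'literal' or 'qname' for a value typed into the Resource field."""
    if URI.match(value):
        return "uri"
    if LITERAL.match(value):
        return "literal"
    if QNAME.match(value):
        return "qname"
    raise ValueError(f"not a recognised resource value: {value!r}")


for example in ('<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>',
                'rdf:type', '"dog"@en', '"1"^^xsd:integer'):
    print(example, "->", classify(example))
```

Note that a bare URI without angle brackets, or a quoted string without a language tag or datatype, is rejected by this sketch, in line with the list above.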

By using the “Results per page” setting and the “Previous …” and “Next …” buttons, you may page through a long set of results, or display all of the results at once. There is also a “Show data types & language tags” checkbox which, when un-checked, allows a less verbose table view.

3.2.9. Querying a Repository

Clicking on “Query” on the sidebar menu brings you to Workbench’s querying interface. Here, you may enter queries in the SPARQL or SeRQL query languages, save them for future access, and execute them against your repository.

If you have executed queries previously, the query text area will show the most recently executed query. If not, it will be pre-populated with a prefix header (SPARQL) or footer (SeRQL) containing all the defined namespaces for the repository. The “Clear” button below the text area gives you the option to restore this pre-populated state for the currently selected query language.

The two other action buttons are “Save Query” and “Execute”:

  • “Save Query” is only enabled when a name has been entered into the adjacent text field. Once clicked, your query is saved under the given name. An option to back out or overwrite is given if the name already exists. Saved queries are associated with the current repository and user name. If the “Save privately (do not share)” option is checked, then the saved query will only be visible to the current user.

  • “Execute” attempts to execute the given query text, and then you are presented with a query results page. Values are clickable, and clicking on a value brings you to its “Explore” page. The display options are similar to those on the “Explore” page.

3.2.10. Working with Saved Queries

Clicking “Saved queries” on the sidebar menu brings you to the Workbench’s interface for working with previously saved queries. All saved queries accessible to the current user are listed in alphabetical order by:

  • the user that saved them, then

  • the query name

The query name is displayed as a clickable link that will execute the query, followed by 3 buttons:

Button | Action Description
Show | Toggles the display of the query metadata and query text. When the “Saved Queries” page loads, this information is hidden to conserve screen real estate.
Edit | Brings you to the query entry page, pre-populated with the query text.
Delete… | Deletes the saved query, with a confirmation dialog provided for safety. Users may only delete their own queries or queries that were saved anonymously.

The query metadata fields, aside from query name and user, are:

  • Query Language: either SPARQL or SeRQL

  • Include Inferred Statements: whether to use any inferencing defined on the repository to expand the result set

  • Rows per page: how many results to display per page initially

  • Shared: whether this query is visible to users other than the one that saved it, restricted to always be true for the “anonymous” user

Note that it is only possible to save queries as the present user. If you edit another user’s query and save it with the same query name, a new saved query will be created associated with your user name.

3.2.11. Viewing all Triples and Exporting the Data

The “Export” link on the sidebar menu is convenient for bringing up a paged view of all quads in your triple store. As with other result views, resources are displayed as clickable links that bring you to that resource’s “Explore” page. In addition, it is possible to select from a number of serialization formats to download the entire contents of the triple store in:

  • TriG

  • BinaryRDF

  • TriX

  • N-Triples

  • N-Quads

  • N3

  • RDF/XML

  • RDF/JSON

  • Turtle
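An export can also be requested over HTTP by asking the repository's statements path for a particular serialization. The sketch below pairs each format in the list above with a commonly used media type; these media types and the /statements path are assumptions to verify against your server version, and export_request is an illustrative name.

```python
import urllib.request

# Commonly used media types for the formats listed above (assumed, not
# taken from the Workbench itself).
MIME_TYPES = {
    "TriG": "application/trig",
    "BinaryRDF": "application/x-binary-rdf",
    "TriX": "application/trix",
    "N-Triples": "application/n-triples",
    "N-Quads": "application/n-quads",
    "N3": "text/n3",
    "RDF/XML": "application/rdf+xml",
    "RDF/JSON": "application/rdf+json",
    "Turtle": "text/turtle",
}


def export_request(repository_url: str, rdf_format: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for all statements in one format."""
    return urllib.request.Request(
        repository_url.rstrip("/") + "/statements",
        headers={"Accept": MIME_TYPES[rdf_format]},
    )


request = export_request(
    "http://localhost:8080/rdf4j-server/repositories/myRepo", "N-Quads"
)
print(request.full_url, request.get_header("Accept"))
```

Formats that can carry context information, such as TriG, TriX and N-Quads, are the natural choice when the repository uses named graphs.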