Class ParsedIRI

java.lang.Object
org.eclipse.rdf4j.common.net.ParsedIRI
All Implemented Interfaces:
Serializable, Cloneable

public class ParsedIRI extends Object implements Cloneable, Serializable
Represents an Internationalized Resource Identifier (IRI) reference.

Aside from some minor deviations noted below, an instance of this class represents an IRI reference as defined by RFC 3987: Internationalized Resource Identifiers (IRI): IRI Syntax. This class provides constructors for creating IRI instances from their components or by parsing their string forms, methods for accessing the various components of an instance, and methods for normalizing, resolving, and relativizing IRI instances. Instances of this class are immutable.
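The three operations named above (normalizing, resolving, and relativizing) can be sketched with the JDK's java.net.URI, used here only as a stand-in so that the example runs without RDF4J on the classpath. The class name IriOperationsSketch and its helper methods are hypothetical, and it is an assumption that ParsedIRI gives the same results for these simple ASCII inputs.

```java
import java.net.URI;

// Hypothetical helper class; java.net.URI is a stand-in for ParsedIRI.
final class IriOperationsSketch {

    // Removes "." and ".." segments from the path.
    static String normalized(String reference) {
        return URI.create(reference).normalize().toString();
    }

    // Resolves a (possibly relative) reference against a base.
    static String resolved(String base, String reference) {
        return URI.create(base).resolve(reference).toString();
    }

    // Expresses target relative to base (the inverse of resolution).
    static String relativized(String base, String target) {
        return URI.create(base).relativize(URI.create(target)).toString();
    }

    public static void main(String[] args) {
        System.out.println(normalized("http://example.org/a/b/../c"));
        System.out.println(resolved("http://example.org/a/b", "../c"));
        System.out.println(relativized("http://example.org/a/", "http://example.org/a/b/c"));
    }
}
```

Each helper returns a new value and leaves its inputs untouched, which mirrors the immutability described above.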

An IRI instance has the following seven components, which in string form have the syntax

[scheme:][//[user-info@]host[:port]][path][?query][#fragment]

In a given instance any particular component is either undefined or defined with a distinct value. Undefined string components are represented by null, while undefined integer components are represented by -1. A string component may be defined to have the empty string as its value; this is not equivalent to that component being undefined.
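The difference between an undefined component and an empty one can be illustrated with a minimal sketch. It uses java.net.URI from the JDK as a stand-in, since that class follows the same null and -1 conventions; it is an assumption that the corresponding ParsedIRI accessors behave the same way for these inputs. The class ComponentSketch and its methods are hypothetical names.

```java
import java.net.URI;

// Hypothetical helper class; java.net.URI is a stand-in for ParsedIRI.
final class ComponentSketch {

    // Returns null when no "?" is present, and "" when "?" is present but empty.
    static String queryOf(String reference) {
        return URI.create(reference).getQuery();
    }

    // Returns -1 when no port is given.
    static int portOf(String reference) {
        return URI.create(reference).getPort();
    }

    // Returns null when no user-info is given.
    static String userInfoOf(String reference) {
        return URI.create(reference).getUserInfo();
    }

    public static void main(String[] args) {
        System.out.println(queryOf("http://example.org/path"));   // undefined query
        System.out.println(queryOf("http://example.org/path?"));  // defined but empty query
        System.out.println(portOf("http://example.org/path"));    // undefined port
        System.out.println(portOf("http://example.org:8080/path"));
    }
}
```

Callers that test a component should therefore compare against null (or -1 for the port) rather than checking for an empty string.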

Whether a particular component is or is not defined in an instance depends upon the type of the IRI being represented. An absolute IRI has a scheme component. An opaque IRI has a scheme, a scheme-specific part, and possibly a fragment, but has no other components. A hierarchical IRI always has a path (though it may be empty) and a scheme-specific part (which at least contains the path), and may have any of the other components.
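The classification above can be sketched as a small helper. Again java.net.URI stands in for ParsedIRI, on the assumption that both classes distinguish absolute, opaque, and hierarchical references in the same way for these inputs; IriKindSketch and classify are hypothetical names.

```java
import java.net.URI;

// Hypothetical helper class; java.net.URI is a stand-in for ParsedIRI.
final class IriKindSketch {

    // Labels a reference by the categories described in the text.
    static String classify(String reference) {
        URI uri = URI.create(reference);
        if (!uri.isAbsolute()) {
            return "relative hierarchical"; // no scheme component
        }
        if (uri.isOpaque()) {
            return "absolute opaque";       // scheme-specific part does not start with "/"
        }
        return "absolute hierarchical";
    }

    // An opaque reference has no path component, so this returns null for it.
    static String pathOf(String reference) {
        return URI.create(reference).getPath();
    }

    public static void main(String[] args) {
        System.out.println(classify("mailto:someone@example.org"));
        System.out.println(classify("http://example.org/a/b"));
        System.out.println(classify("../a/b"));
        System.out.println(pathOf("mailto:someone@example.org"));
    }
}
```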

IRIs, URIs, URLs, and URNs

IRIs are meant to replace URIs in identifying resources for protocols, formats, and software components that use a UCS-based character repertoire.

The Internationalized Resource Identifier (IRI) is a complement to the Uniform Resource Identifier (URI). An IRI is a sequence of characters from the Universal Character Set (Unicode/ISO 10646). A mapping from IRIs to URIs is defined using toASCIIString(), which means that IRIs can be used instead of URIs, where appropriate, to identify resources. All URIs are also IRIs, and the normalize() method can be used to convert a URI back into a normalized IRI.
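As a rough illustration of the IRI-to-URI mapping, the sketch below percent-encodes a non-ASCII path character as UTF-8 octets. It calls java.net.URI.toASCIIString() from the JDK as a stand-in; it is an assumption that ParsedIRI.toASCIIString() yields the same result for this simple input, and the reverse conversion through normalize() is specific to ParsedIRI and is not shown. AsciiMappingSketch and toAscii are hypothetical names.

```java
import java.net.URI;

// Hypothetical helper class; java.net.URI is a stand-in for ParsedIRI.
final class AsciiMappingSketch {

    // Maps a reference containing non-ASCII characters to its ASCII-only form.
    static String toAscii(String reference) {
        return URI.create(reference).toASCIIString();
    }

    public static void main(String[] args) {
        // U+00E9 (e with acute accent) is encoded in UTF-8 as the octets C3 A9.
        System.out.println(toAscii("http://example.org/caf\u00e9"));
        // A reference that is already ASCII is returned unchanged.
        System.out.println(toAscii("http://example.org/cafe"));
    }
}
```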

A URI is a uniform resource identifier while a URL is a uniform resource locator. Hence every URL is a URI, abstractly speaking, but not every URI is a URL. This is because there is another subcategory of URIs, uniform resource names (URNs), which name resources but do not specify how to locate them. URIs with the mailto, news, and isbn schemes are examples of URNs.

Deviations

jar: This implementation treats the first colon as part of the scheme if the scheme starts with "jar:". For example, the IRI jar:http://www.foo.com/bar/jar.jar!/baz/entry.txt is parsed with the scheme jar:http and the path /bar/jar.jar!/baz/entry.txt.
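For contrast, the sketch below shows how the JDK's java.net.URI parses the same string without this deviation: the scheme is just jar, and the remainder is an opaque scheme-specific part with no path. It does not exercise ParsedIRI itself; JarSchemeContrast and its methods are hypothetical names introduced for illustration.

```java
import java.net.URI;

// Hypothetical helper class showing java.net.URI behaviour, not ParsedIRI behaviour.
final class JarSchemeContrast {

    static final String JAR_REFERENCE = "jar:http://www.foo.com/bar/jar.jar!/baz/entry.txt";

    static String schemeOf(String reference) {
        return URI.create(reference).getScheme();
    }

    static boolean isOpaque(String reference) {
        return URI.create(reference).isOpaque();
    }

    static String schemeSpecificPartOf(String reference) {
        return URI.create(reference).getSchemeSpecificPart();
    }

    public static void main(String[] args) {
        System.out.println(schemeOf(JAR_REFERENCE));             // scheme stops at the first colon
        System.out.println(isOpaque(JAR_REFERENCE));             // no hierarchical components
        System.out.println(schemeSpecificPartOf(JAR_REFERENCE)); // everything after "jar:"
    }
}
```

Treating jar:http as the scheme, as described above, is what lets the nested reference expose a host and path instead of a single opaque string.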

Since:
2.3
Author:
James Leigh
See Also: