Source types

Comunica SPARQL enables query execution over one or more sources, both on the command line and when calling Comunica from a JavaScript application.

Usually, sources are passed as URLs that point to Web resources. Based on what is returned when dereferencing such a URL, Comunica can apply different query algorithms.

Instead of relying on Comunica's detection algorithms, you can enforce the use of a certain type.

Some SPARQL endpoints do not support SPARQL Service Description and may therefore be recognised as a file instead of a SPARQL endpoint, which may produce incorrect results. For these cases, the sparql type MUST be set.

When the info logger is enabled, you can see which type Comunica has determined for each source.

Setting source type on the command line

On the command line, source types can optionally be enforced by prefixing the URL with <typeName>@, such as:

$ comunica-sparql sparql@https://dbpedia.org/sparql \
    "CONSTRUCT WHERE { ?s ?p ?o } LIMIT 100"

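To illustrate how a <typeName>@ prefix corresponds to the type and value record used in JavaScript applications, here is a minimal sketch. The parseSourceArgument helper is hypothetical and is not Comunica's own parser. It assumes that a type name consists only of letters and digits, so that an @ in the userinfo part of a URL is not mistaken for a type prefix.

```javascript
// Hypothetical helper (not part of Comunica): splits "<typeName>@<url>"
// into a { type, value } record. Strings without a prefix get no type.
function parseSourceArgument(argument) {
  // Assumption: type names consist of letters and digits only.
  const match = /^([A-Za-z][A-Za-z0-9]*)@(.+)$/.exec(argument);
  if (!match) {
    return { value: argument };
  }
  return { type: match[1], value: match[2] };
}

// A prefixed source yields an explicit type.
console.log(parseSourceArgument('sparql@https://dbpedia.org/sparql'));
// A plain URL has no prefix, so no type is set.
console.log(parseSourceArgument('https://dbpedia.org/sparql'));
```
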
Setting source type in an application

In a JavaScript application, the source type can be set by using a record containing type and value:

const bindingsStream = await myEngine.queryBindings(`...`, {
  sources: [
    { type: 'sparql', value: 'https://dbpedia.org/sparql' },
  ],
});

This record may optionally contain a source-specific context within the "context" field.

Supported source types

The table below summarizes the different source types that Comunica supports by default:

Type name	Description
`file`	Plain RDF file in any RDF serialization, such as Turtle, TriG, JSON-LD, RDFa, ...
`sparql`	SPARQL endpoint
`hypermedia`	Sources that expose query capabilities via hypermedia metadata, such as Triple Pattern Fragments and Quad Pattern Fragments
`qpf`	A hypermedia source that is enforced as Triple Pattern Fragments or Quad Pattern Fragments
`brtpf`	A hypermedia source that is enforced as bindings-restricted Triple Pattern Fragments
`rdfjs`	JavaScript objects implementing the RDF/JS `source` interface
`serialized`	An RDF dataset serialized as a string in a certain format
`hdt`	HDT files
`ostrichFile`	Versioned OSTRICH archives

The default source type is auto, which will automatically detect the proper source type. For example, if a SPARQL Service Description is detected, the sparql type is used.
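
A misspelled type name is easy to overlook, so an application may want to check source entries before passing them to the engine. The following is a minimal sketch of such a check. The requestedSourceType helper and the SUPPORTED_TYPES set are hypothetical names and not part of Comunica. The sketch assumes that a plain URL string and a record without a type both fall back to auto.

```javascript
// Type names taken from the table above, plus the default "auto".
const SUPPORTED_TYPES = new Set([
  'auto', 'file', 'sparql', 'hypermedia', 'qpf', 'brtpf',
  'rdfjs', 'serialized', 'hdt', 'ostrichFile',
]);

// Hypothetical helper (not part of Comunica): returns the type that a
// source entry asks for, or throws if the type name is not in the table.
function requestedSourceType(source) {
  // A plain URL string carries no type, so detection is left to "auto".
  if (typeof source === 'string') {
    return 'auto';
  }
  // Assumption: a record without a type also falls back to "auto".
  const type = source.type ?? 'auto';
  if (!SUPPORTED_TYPES.has(type)) {
    throw new Error(`Unknown source type: ${type}`);
  }
  return type;
}

console.log(requestedSourceType('https://dbpedia.org/sparql'));
console.log(requestedSourceType({ type: 'sparql', value: 'https://dbpedia.org/sparql' }));
```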

RDF serializations

Comunica will interpret the Content-Type header of HTTP responses to determine which RDF serialization is used. If the server did not provide such a header, Comunica will attempt to derive the serialization from the file extension.

The following RDF serializations are supported:

Name	Content type	Extensions
TriG	`application/trig`	`.trig`
N-Quads	`application/n-quads`	`.nq`, `.nquads`
Turtle	`text/turtle`	`.ttl`, `.turtle`
N-Triples	`application/n-triples`	`.nt`, `.ntriples`
Notation3	`text/n3`	`.n3`
JSON-LD	`application/ld+json`, `application/json`	`.json`, `.jsonld`
RDF/XML	`application/rdf+xml`	`.rdf`, `.rdfxml`, `.owl`
RDFa and script RDF data tags in HTML/XHTML	`text/html`, `application/xhtml+xml`	`.html`, `.htm`, `.xhtml`, `.xht`
RDFa in SVG/XML	`image/svg+xml`, `application/xml`	`.xml`, `.svg`, `.svgz`
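
The order described above, Content-Type header first and file extension second, can be sketched as follows. This is an illustration only and not Comunica's implementation. The guessSerialization helper and the EXTENSION_TO_CONTENT_TYPE map are hypothetical names. Where a table row lists two content types, the pairing of extension to content type (for example .xml to application/xml) is an assumption.

```javascript
// Extension-to-content-type pairs based on the table above.
// Assumption: for rows with two content types, each extension is paired
// with the content type that matches it most naturally.
const EXTENSION_TO_CONTENT_TYPE = new Map([
  ['trig', 'application/trig'],
  ['nq', 'application/n-quads'], ['nquads', 'application/n-quads'],
  ['ttl', 'text/turtle'], ['turtle', 'text/turtle'],
  ['nt', 'application/n-triples'], ['ntriples', 'application/n-triples'],
  ['n3', 'text/n3'],
  ['json', 'application/ld+json'], ['jsonld', 'application/ld+json'],
  ['rdf', 'application/rdf+xml'], ['rdfxml', 'application/rdf+xml'],
  ['owl', 'application/rdf+xml'],
  ['html', 'text/html'], ['htm', 'text/html'],
  ['xhtml', 'application/xhtml+xml'], ['xht', 'application/xhtml+xml'],
  ['xml', 'application/xml'],
  ['svg', 'image/svg+xml'], ['svgz', 'image/svg+xml'],
]);

// Hypothetical helper (not part of Comunica): prefers the Content-Type
// header and falls back to the file extension in the URL path.
function guessSerialization(contentTypeHeader, url) {
  if (contentTypeHeader) {
    // Drop parameters such as "; charset=utf-8".
    return contentTypeHeader.split(';')[0].trim().toLowerCase();
  }
  const path = new URL(url).pathname;
  const dot = path.lastIndexOf('.');
  if (dot === -1) {
    return undefined;
  }
  // Unknown extensions yield undefined.
  return EXTENSION_TO_CONTENT_TYPE.get(path.slice(dot + 1).toLowerCase());
}

console.log(guessSerialization('text/turtle; charset=utf-8', 'https://example.org/data'));
console.log(guessSerialization(undefined, 'https://example.org/data.nq'));
```

The sketch returns undefined when neither the header nor the extension identifies a serialization, leaving it to the caller to decide how to proceed.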

String source

String-based sources allow you to query over sources that are represented as a string in a certain RDF serialization.

For example, querying over a Turtle-based datasource:

const bindingsStream = await myEngine.queryBindings(`...`, {
  sources: [
    {
      type: 'serialized',
      value: '<ex:s> <ex:p> <ex:o>. <ex:s> <ex:p2> <ex:o2>.',
      mediaType: 'text/turtle',
      baseIRI: 'http://example.org/',
    },
  ],
});