Source types



    Comunica SPARQL enables query execution over one or more sources on both the command line and when calling Comunica from a JavaScript application.

    Usually, sources are passed as URLs that point to Web resources. Based on what is returned when dereferencing this URL, Comunica can apply different query algorithms.

    Instead of relying on Comunica's detection algorithms, you can enforce the use of a certain type.

    Some SPARQL endpoints may be recognised as a file instead of a SPARQL endpoint because they do not support SPARQL Service Description, which may produce incorrect results. For these cases, the sparql type MUST be set.
    With the info logger enabled, you can see which type Comunica has determined for each source.

    Setting source type on the command line

    On the command line, source types can optionally be enforced by prefixing the URL with <typeName>@, such as:

    $ comunica-sparql sparql@https://dbpedia.org/sparql \
        "CONSTRUCT WHERE { ?s ?p ?o } LIMIT 100"
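
    To make the prefix convention concrete, the sketch below splits a command-line source argument into the type and value fields that the JavaScript API expects. The helper name parseSourceArg is hypothetical and this is not Comunica's own argument parser; it only illustrates the <typeName>@ syntax, and it assumes type names consist of letters only.

```javascript
// Illustrative helper (hypothetical, not part of Comunica): split a
// command-line source argument of the form "<typeName>@<url>".
function parseSourceArg(arg) {
  // A type prefix is assumed to be a run of letters directly followed by "@".
  // A URL that starts with a scheme such as "https:" does not match,
  // because a ":" appears before any "@".
  const match = /^([A-Za-z]+)@(.+)$/.exec(arg);
  if (!match) {
    // No prefix: return the URL unchanged so the type can be auto-detected.
    return arg;
  }
  return { type: match[1], value: match[2] };
}

const sources = [
  'sparql@https://dbpedia.org/sparql',
  'https://example.org/data.ttl',
].map(parseSourceArg);

console.log(sources);
```

    The first entry becomes an object with type 'sparql', while the second stays a plain URL string because it carries no prefix.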
    

    Setting source type in an application

    In a JavaScript application, the source type can be set by passing an object with type and value fields:

    const bindingsStream = await myEngine.queryBindings(`...`, {
      sources: [
        { type: 'sparql', value: 'https://dbpedia.org/sparql' },
      ],
    });
    

    Supported source types

    The table below summarizes the different source types that Comunica supports by default:

    Type name    Description
    file         plain RDF file in any RDF serialization, such as Turtle, TriG, JSON-LD, RDFa, ...
    sparql       SPARQL endpoint
    hypermedia   Sources that expose query capabilities via hypermedia metadata, such as Triple Pattern Fragments and Quad Pattern Fragments
    qpf          A hypermedia source that is enforced as Triple Pattern Fragments or Quad Pattern Fragments
    brtpf        A hypermedia source that is enforced as bindings-restricted Triple Pattern Fragments
    rdfjs        JavaScript objects implementing the RDF/JS source interface
    serialized   An RDF dataset serialized as a string in a certain format.
    hdt          HDT files
    ostrichFile  Versioned OSTRICH archives

    The default source type is auto, which will automatically detect the proper source type. For example, if a SPARQL Service Description is detected, the sparql type is used.

    RDF serializations

    Comunica interprets the Content-Type header of HTTP responses to determine which RDF serialization is used. If the server does not provide such a header, Comunica attempts to derive the serialization from the file extension.

    The following RDF serializations are supported:

    Name                                      Content type                           Extensions
    TriG                                      application/trig                       .trig
    N-Quads                                   application/n-quads                    .nq, .nquads
    Turtle                                    text/turtle                            .ttl, .turtle
    N-Triples                                 application/n-triples                  .nt, .ntriples
    Notation3                                 text/n3                                .n3
    JSON-LD                                   application/ld+json, application/json  .json, .jsonld
    RDF/XML                                   application/rdf+xml                    .rdf, .rdfxml, .owl
    RDFa and script RDF data tags HTML/XHTML  text/html, application/xhtml+xml       .html, .htm, .xhtml, .xht
    RDFa in SVG/XML                           image/svg+xml, application/xml         .xml, .svg, .svgz
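
    The lookup order described in this section (Content-Type header first, file extension as a fallback) can be sketched as follows. This is a minimal illustration and not Comunica's actual implementation. The function name guessMediaType is hypothetical, and where the table lists several content types for one group of extensions, the pairing of each extension with a single content type is an assumption made for this example.

```javascript
// Illustrative extension table derived from the list of supported
// serializations. Where a row has several content types, the choice of
// one type per extension is an assumption of this sketch.
const EXTENSION_TO_MEDIA_TYPE = new Map([
  ['trig', 'application/trig'],
  ['nq', 'application/n-quads'], ['nquads', 'application/n-quads'],
  ['ttl', 'text/turtle'], ['turtle', 'text/turtle'],
  ['nt', 'application/n-triples'], ['ntriples', 'application/n-triples'],
  ['n3', 'text/n3'],
  ['json', 'application/ld+json'], ['jsonld', 'application/ld+json'],
  ['rdf', 'application/rdf+xml'], ['rdfxml', 'application/rdf+xml'],
  ['owl', 'application/rdf+xml'],
  ['html', 'text/html'], ['htm', 'text/html'],
  ['xhtml', 'application/xhtml+xml'], ['xht', 'application/xhtml+xml'],
  ['xml', 'application/xml'],
  ['svg', 'image/svg+xml'], ['svgz', 'image/svg+xml'],
]);

// Hypothetical helper: prefer the Content-Type header, else use the extension.
function guessMediaType(contentTypeHeader, url) {
  if (contentTypeHeader) {
    // Drop parameters such as "; charset=utf-8".
    return contentTypeHeader.split(';')[0].trim().toLowerCase();
  }
  // Ignore query string and fragment, then inspect the last path segment.
  const path = url.split(/[?#]/)[0];
  const lastSegment = path.slice(path.lastIndexOf('/') + 1);
  const dot = lastSegment.lastIndexOf('.');
  const extension = dot === -1 ? '' : lastSegment.slice(dot + 1).toLowerCase();
  // Returns undefined when the extension is missing or unknown.
  return EXTENSION_TO_MEDIA_TYPE.get(extension);
}

console.log(guessMediaType('text/turtle; charset=utf-8', 'https://example.org/data'));
// → text/turtle
console.log(guessMediaType(undefined, 'https://example.org/data.nq?page=1'));
// → application/n-quads
```

    When neither the header nor the extension identifies a serialization, the sketch returns undefined, leaving it to the caller to decide how to proceed.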

    String source

    String-based sources allow you to query over sources that are represented as a string in a certain RDF serialization.

    For example, querying over a Turtle-based datasource:

    const bindingsStream = await myEngine.queryBindings(`...`, {
      sources: [
        {
          type: 'serialized',
          value: '<ex:s> <ex:p> <ex:o>. <ex:s> <ex:p2> <ex:o2>.',
          mediaType: 'text/turtle',
          baseIRI: 'http://example.org/',
        },
      ],
    });