Querying in a JavaScript app
The default Comunica query engine that exposes most standard features is Comunica SPARQL,
which uses the package name @comunica/query-sparql.
In this guide, we will install it as a dependency in a Node.js JavaScript application,
and show how it can be used to execute queries.
1. Installation
This guide assumes you already have an npm package. If you do not have one yet, you can create one with npm init.
You will also need a JavaScript file to write in, such as main.js.
To add Comunica SPARQL as a dependency to your Node.js application, run the following command:
$ npm install @comunica/query-sparql
2. Creating a new query engine
The easiest way to create an engine is as follows:
const QueryEngine = require('@comunica/query-sparql').QueryEngine;
const myEngine = new QueryEngine();
You can reuse the same engine for as many queries as you like, and doing so is recommended. This is especially valuable if you repeatedly query over the same sources, as caching can be performed.
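
One way to make reuse the default is to create the engine lazily and hand out the same instance everywhere. The sketch below is only an illustration: createSharedEngine is a hypothetical helper, not part of Comunica, and it takes a factory function so that it does not depend on how the engine is constructed.

```javascript
// Hypothetical helper (not part of Comunica): wraps an engine factory so that
// the engine is created on first use and the same instance is returned afterwards.
function createSharedEngine(factory) {
  let engine = null;
  return function getEngine() {
    if (engine === null) {
      engine = factory();
    }
    return engine;
  };
}

// Usage, assuming @comunica/query-sparql is installed as described in section 1:
// const { QueryEngine } = require('@comunica/query-sparql');
// const getEngine = createSharedEngine(() => new QueryEngine());
// const myEngine = getEngine(); // every call returns the same engine
```

Because every caller receives the same instance, any caching the engine performs can be shared across queries.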
3. Executing SPARQL SELECT queries
Once your engine has been created, you can use it to execute any SPARQL query, such as a SELECT query:
const bindingsStream = await myEngine.queryBindings(`
  SELECT ?s ?p ?o WHERE {
    ?s ?p <http://dbpedia.org/resource/Belgium>.
    ?s ?p ?o
  } LIMIT 100`, {
  sources: ['https://fragments.dbpedia.org/2015/en'],
});
The first argument of queryBindings() is a SPARQL query string,
and the second argument is a query context containing options,
which must at least contain an array of sources to query over.
The resulting bindingsStream is a stream of bindings,
where each binding contains values for the selected variables (?s ?p ?o).
sources is the only required option in the query context;
additional options can be passed
to tweak how the engine executes the query.
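
As an example of an additional option, the sketch below enables lenient mode next to sources. It assumes the lenient context option of Comunica SPARQL, which asks the engine to ignore HTTP and parsing failures instead of aborting the query. Check the query context documentation for the options supported by your version.

```javascript
// Query context with one extra option besides the required `sources`.
// Assumption: `lenient: true` makes the engine ignore HTTP and parsing failures.
const context = {
  sources: ['https://fragments.dbpedia.org/2015/en'],
  lenient: true,
};

// The context is passed as the second argument, for example:
// const bindingsStream = await myEngine.queryBindings(query, context);
```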
3.1 Consuming binding results as a stream
The most efficient way to consume the results
is to attach a data-listener to the bindingsStream:
bindingsStream.on('data', (binding) => {
  console.log(binding.toString()); // Quick way to print bindings for testing

  console.log(binding.has('s')); // Will be true

  // Obtaining values
  console.log(binding.get('s').value);
  console.log(binding.get('s').termType);
  console.log(binding.get('p').value);
  console.log(binding.get('o').value);
});
The data-listener will be invoked for each resulting binding, as soon as the query engine has detected it. This means that the data-listener can be invoked many times during query execution, even if not all results are available yet.
Each binding is an RDF/JS Bindings object
that contains mappings from variables to RDF terms.
Variables can be looked up either by string label (without the ? prefix) or via RDF/JS variable objects,
and bound RDF terms are represented as RDF/JS terms.
Learn more about the usage of these bindings objects in the bindings guide.
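
If plain JavaScript objects are easier to work with downstream, a binding can be flattened by looking up each variable by its string label. This is a minimal sketch that relies only on the has() and get() calls shown above; bindingToObject is a hypothetical helper name, not a Comunica or RDF/JS function.

```javascript
// Hypothetical helper: turns one bindings object into a plain object that maps
// variable names (without the ? prefix) to the string value of the bound term.
function bindingToObject(binding, variableNames) {
  const row = {};
  for (const name of variableNames) {
    // has() guards against variables that are not bound in this particular result.
    row[name] = binding.has(name) ? binding.get(name).value : undefined;
  }
  return row;
}

// Usage inside a data-listener:
// bindingsStream.on('data', (binding) => {
//   console.log(bindingToObject(binding, ['s', 'p', 'o']));
// });
```

Note that this keeps only the value of each term and drops its termType, so it is best suited to quick inspection or display.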
To find out when query execution has ended and all results have been passed to the data-listener, an end-listener can be attached as well:
bindingsStream.on('end', () => {
  // The data-listener will not be called anymore once we get here.
});
It is also considered good practice to add an error-listener, so you can detect any problems that have occurred during query execution:
bindingsStream.on('error', (error) => {
  console.error(error);
});
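
The data, end, and error listeners can be combined into a single promise, which is convenient when the surrounding code uses async/await but should still process results one by one. The sketch below assumes only that the stream exposes on() for these three events; consumeStream is a hypothetical helper name.

```javascript
// Hypothetical helper: invokes onItem for every result as it arrives, resolves with
// the number of results once the stream ends, and rejects if the stream errors.
function consumeStream(stream, onItem) {
  return new Promise((resolve, reject) => {
    let count = 0;
    stream.on('data', (item) => {
      count += 1;
      onItem(item);
    });
    stream.on('end', () => resolve(count));
    stream.on('error', reject);
  });
}

// Usage:
// const total = await consumeStream(bindingsStream, (binding) => {
//   console.log(binding.toString());
// });
```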
3.2 Consuming binding results as an async iterable
Using a for-await loop, you can consume bindings as an async iterable. While this is more compact than the stream-based approach, it may lead to a slightly lower level of performance:
for await (const bindings of bindingsStream) {
  console.log(bindings.get('s').value);
  console.log(bindings.get('s').termType);
}
3.3 Consuming binding results as an array
If performance is not an issue in your application,
or you just want the results in a simple array,
then you can call the asynchronous toArray() method on the bindingsStream:
const bindings = await bindingsStream.toArray();

console.log(bindings[0].get('s').value);
console.log(bindings[0].get('s').termType);
This method returns asynchronously (hence the await): its promise resolves once all results have been found.
If you have many results, it is recommended to consume results iteratively via a data-listener instead.
Each binding in the array is again an RDF/JS Bindings object.
If you want to limit the number of results in the array, you can optionally pass a limit:
const bindings = await bindingsStream.toArray({ limit: 100 });
4. Executing queries over multiple sources
Querying over more than one source is trivial, as any number of sources can easily be passed via an array:
const bindingsStream = await myEngine.queryBindings(`
  SELECT ?s ?p ?o WHERE {
    ?s ?p <http://dbpedia.org/resource/Belgium>.
    ?s ?p ?o
  } LIMIT 100`, {
  sources: [
    'http://fragments.dbpedia.org/2015/en',
    'https://www.rubensworks.net',
    'https://ruben.verborgh.org/profile/',
  ],
});
5. Executing SPARQL CONSTRUCT queries
Next to SELECT queries, you can also execute a CONSTRUCT query to generate RDF quads/triples:
const quadStream = await myEngine.queryQuads(`
  CONSTRUCT WHERE {
    ?s ?p ?o
  } LIMIT 100`, {
  sources: ['http://fragments.dbpedia.org/2015/en'],
});
5.1 Consuming quad results as a stream
The most efficient way to consume the resulting RDF quads
is to attach a data-listener to the quadStream:
quadStream.on('data', (quad) => {
  console.log(quad.subject.value);
  console.log(quad.predicate.value);
  console.log(quad.object.value);
  console.log(quad.graph.value);
});
The data-listener will be invoked for each constructed RDF triple/quad, as soon as the query engine has created it. This means that the data-listener can be invoked many times during query execution, even if not all results are available yet.
Each quad is an RDF/JS quad,
which contains subject, predicate, object and graph terms.
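
When printing quads, keep in mind that in RDF/JS a quad in the default graph has a graph term with termType 'DefaultGraph' and an empty string as value, so logging quad.graph.value prints nothing for plain triples. The sketch below formats a quad on one line and makes the default graph explicit; describeQuad is a hypothetical helper name.

```javascript
// Hypothetical helper: renders the four terms of an RDF/JS quad as a single line,
// labelling the default graph explicitly instead of printing an empty string.
function describeQuad(quad) {
  const graph = quad.graph.termType === 'DefaultGraph'
    ? '(default graph)'
    : quad.graph.value;
  return `${quad.subject.value} ${quad.predicate.value} ${quad.object.value} [${graph}]`;
}

// Usage inside a data-listener:
// quadStream.on('data', (quad) => console.log(describeQuad(quad)));
```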
Just like with bindingsStream, an end-listener and an error-listener can also be attached:
quadStream.on('end', () => {
  // The data-listener will not be called anymore once we get here.
});
quadStream.on('error', (error) => {
  console.error(error);
});
5.2 Consuming quad results as an async iterable
Just like with binding results, quads can also be consumed using for-await:
for await (const quad of quadStream) {
  console.log(quad.subject.value);
  console.log(quad.predicate.value);
  console.log(quad.object.value);
  console.log(quad.graph.value);
}
5.3 Consuming quad results as an array
Just like with binding results,
if performance is not an issue in your application,
or you just want the results in a simple array,
then you can call the asynchronous toArray() method on the quadStream:
const quads = await quadStream.toArray();

console.log(quads[0].subject.value);
console.log(quads[0].predicate.value);
console.log(quads[0].object.value);
console.log(quads[0].graph.value);
This method returns asynchronously (hence the await): its promise resolves once all results have been found.
If you have many results, it is recommended to consume results iteratively via a data-listener instead.
Each quad is again an RDF/JS quad,
which contains subject, predicate, object and graph terms.
6. Executing SPARQL ASK queries
One of the simplest forms of SPARQL query is the ASK query, which can be executed in Comunica as follows:
const hasMatches = await myEngine.queryBoolean(`
  ASK {
    ?s ?p <http://dbpedia.org/resource/Belgium>
  }`, {
  sources: ['http://fragments.dbpedia.org/2015/en'],
});
The value of hasMatches indicates if the query has at least one result.
7. Executing a generic query
If you don't know beforehand if your query is a SELECT, CONSTRUCT, or ASK
(e.g. if your app accepts queries via user input),
then you can make use of the generic query method that supports all query types:
const result = await myEngine.query(`
  SELECT ?s ?p ?o WHERE {
    ?s ?p <http://dbpedia.org/resource/Belgium>.
    ?s ?p ?o
  } LIMIT 100`, {
  sources: ['http://fragments.dbpedia.org/2015/en'],
});

if (result.resultType === 'bindings') {
  const bindingsStream = await result.execute();

  bindingsStream.on('data', (binding) => {
    console.log(binding.toString());
  });
}
The resulting object represents a future for the query results.
It has a field resultType that indicates the query and result type, which can be 'bindings', 'quads', 'boolean', or 'void'.
The asynchronous execute method actually executes the query, and returns a result depending on the resultType, corresponding to the queryBindings, queryQuads, ... methods.
For example, if the result type is 'bindings', then the return type of execute will be a bindings stream.
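
To handle every result type in one place, you can branch on resultType. The sketch below is one possible shape, not a prescribed pattern: summarizeResult is a hypothetical helper, it collects streams with toArray() for brevity, and it assumes that execute() for 'boolean' resolves to a boolean and for 'void' resolves once the query has finished.

```javascript
// Hypothetical helper: executes a generic query result and returns a short summary
// string, branching on the four result types listed above.
async function summarizeResult(result) {
  switch (result.resultType) {
    case 'bindings':
    case 'quads': {
      // Both types yield a stream; toArray() waits until all results are in.
      const stream = await result.execute();
      const items = await stream.toArray();
      return `${result.resultType}: ${items.length} results`;
    }
    case 'boolean':
      return `boolean: ${await result.execute()}`;
    case 'void':
      // Assumption: nothing is returned, the promise only signals completion.
      await result.execute();
      return 'void: done';
    default:
      throw new Error(`Unsupported result type: ${result.resultType}`);
  }
}

// Usage:
// const result = await myEngine.query(userQuery, { sources });
// console.log(await summarizeResult(result));
```

For large result sets, the toArray() calls can be swapped for a data-listener or a for-await loop.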
Optionally, you can also obtain metadata about the results via this query method for the 'bindings' and 'quads' result types:
const result = await myEngine.query(`
  SELECT ?s ?p ?o WHERE {
    ?s ?p <http://dbpedia.org/resource/Belgium>.
    ?s ?p ?o
  } LIMIT 100`, {
  sources: ['http://fragments.dbpedia.org/2015/en'],
});

if (result.resultType === 'bindings') {
  const metadata = await result.metadata();
  console.log(metadata.cardinality);
  console.log(metadata.canContainUndefs);
}
8. Serializing to a specific result format
If you want your application to output query results in a certain text-based format,
just like executing Comunica on the command line,
then you can make use of the resultToString() method.
For example, serializing to SPARQL JSON can be done as follows:
const result = await myEngine.query(`
  SELECT ?s ?p ?o WHERE {
    ?s ?p <http://dbpedia.org/resource/Belgium>.
    ?s ?p ?o
  } LIMIT 100`, {
  sources: ['http://fragments.dbpedia.org/2015/en'],
});
const { data } = await myEngine.resultToString(result, 'application/sparql-results+json');
data.pipe(process.stdout); // Print to standard output
The resultToString() method accepts a query result and a result format media type.
The media type is optional, and will default to application/json for bindings, application/trig for quads, and simple for booleans.
The available media types can be retrieved via the getResultMediaTypes() method.
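
If you need the serialized result as a string rather than piping it to standard output, the data stream returned by resultToString() can be collected first. This is a minimal sketch assuming data is a Node.js readable stream (or any async iterable of strings or Buffers); streamToString is a hypothetical helper name.

```javascript
// Hypothetical helper: reads a stream to completion and returns its contents as text.
// Chunks are gathered as Buffers and decoded once at the end, so multi-byte
// characters that are split across chunks are decoded correctly.
async function streamToString(readable) {
  const chunks = [];
  for await (const chunk of readable) {
    chunks.push(typeof chunk === 'string' ? Buffer.from(chunk, 'utf8') : Buffer.from(chunk));
  }
  return Buffer.concat(chunks).toString('utf8');
}

// Usage:
// const { data } = await myEngine.resultToString(result, 'application/sparql-results+json');
// const json = JSON.parse(await streamToString(data));
```

This buffers the whole serialization in memory, so piping remains the better choice for large results.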