Release 5.0: 🧩 Increased parsing modularity for SPARQL 1.2 and beyond
Wednesday, January 7, 2026
More than one year ago, we released Comunica 4.0, which focused on increased expression modularization and performance. The current 5.0 release focuses primarily on the introduction of RDF and SPARQL 1.2 support, and the migration to a new parsing and algebra framework. For end-users, this release introduces no significant breaking changes. For module developers, the primary breaking change is related to this algebra migration.
🪆 RDF 1.2 and SPARQL 1.2 support
The RDF & SPARQL W3C Working Group is nearing completion of the RDF 1.2 and SPARQL 1.2 specifications. Besides various smaller changes, the main addition within these specifications is the concept of triple terms and reification, which allows you to make statements about other statements. This also impacts SPARQL, as it allows you to write queries such as:
    SELECT ?authority {
      << :employee38 :jobTitle "Assistant Designer" >> :accordingTo ?authority .
    }
While these specifications are not yet official W3C recommendations at the time of writing, they are in a near-final state. With this release, Comunica is fully compliant with the RDF 1.2 and SPARQL 1.2 specifications, which includes efficient in-memory indexing of triple terms, and parsing and serializing of SPARQL 1.2, Turtle 1.2, N-Quads 1.2, SPARQL/JSON 1.2, and other formats. This compliance supersedes the RDF-star support that was added in Comunica 2.8, as RDF 1.2 is the successor to RDF-star. If further changes are made to the 1.2 specifications, Comunica will integrate them as soon as possible.
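To make "statements about statements" more concrete, here is a minimal TypeScript sketch of how the query above could be evaluated over hand-rolled data. In the RDF 1.2 drafts, `<< s p o >>` is shorthand for a reifier that points to the triple term `<<( s p o )>>` via `rdf:reifies`. Everything else is an assumption made for illustration: the `Term` and `Triple` types, the helper names, the `http://example.org/` namespace for the `:` prefix, and the `:employee22` authority. None of this is the RDF/JS data model or Comunica's API.

```typescript
// Hand-rolled term types for illustration only (NOT the RDF/JS or Comunica types).
type Term =
  | { kind: 'iri'; value: string }
  | { kind: 'literal'; value: string }
  | { kind: 'blank'; value: string }
  | { kind: 'triple'; subject: Term; predicate: Term; object: Term };

interface Triple { subject: Term; predicate: Term; object: Term }

const iri = (value: string): Term => ({ kind: 'iri', value });
const literal = (value: string): Term => ({ kind: 'literal', value });
const blank = (value: string): Term => ({ kind: 'blank', value });
const tripleTerm = (subject: Term, predicate: Term, object: Term): Term =>
  ({ kind: 'triple', subject, predicate, object });

// Structural equality that recurses into triple terms.
function termEquals(a: Term, b: Term): boolean {
  if (a.kind === 'triple' && b.kind === 'triple') {
    return termEquals(a.subject, b.subject) &&
      termEquals(a.predicate, b.predicate) &&
      termEquals(a.object, b.object);
  }
  if (a.kind !== 'triple' && b.kind !== 'triple') {
    return a.kind === b.kind && a.value === b.value;
  }
  return false;
}

const RDF_REIFIES = iri('http://www.w3.org/1999/02/22-rdf-syntax-ns#reifies');
// Assumed namespace for the ":" prefix used in the query.
const ex = (local: string): Term => iri(`http://example.org/${local}`);

// << :employee38 :jobTitle "Assistant Designer" >> :accordingTo :employee22 .
// expanded into a reifier plus a triple term (hypothetical example data).
const statement = tripleTerm(ex('employee38'), ex('jobTitle'), literal('Assistant Designer'));
const reifier = blank('r1');
const exampleTriples: Triple[] = [
  { subject: reifier, predicate: RDF_REIFIES, object: statement },
  { subject: reifier, predicate: ex('accordingTo'), object: ex('employee22') },
];

// Naive evaluation of the query: find every ?authority attached to a reifier of `target`.
function authoritiesFor(triples: Triple[], target: Term): Term[] {
  const reifiers = triples
    .filter(t => termEquals(t.predicate, RDF_REIFIES) && termEquals(t.object, target))
    .map(t => t.subject);
  return triples
    .filter(t => termEquals(t.predicate, ex('accordingTo')) &&
      reifiers.some(r => termEquals(r, t.subject)))
    .map(t => t.object);
}
```

Calling `authoritiesFor(exampleTriples, statement)` yields the single `:employee22` term. A real engine would use indexes rather than linear scans, which is what the in-memory indexing of triple terms mentioned above is for.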
🔀 Modularized query parsing and algebra with Traqula
Comunica previously used the SPARQL.js library for parsing SPARQL 1.1 queries, and the SPARQLAlgebra.js library for algebra handling. While these have served us well over the years, they were not designed with high modularity in mind, which became very apparent when updating to SPARQL 1.2.
As such, we developed a new modular query parsing and algebra framework, called Traqula. Besides offering both SPARQL 1.2 and SPARQL 1.1 parsing and algebra, its modular nature allows you to easily plug in support for parsing custom operators. This is particularly useful for example if you want to experiment with a new operator that is not part of the SPARQL standard, without having to write a completely new parser from scratch. We have a new dedicated tutorial to explain how a non-standard operator can be implemented into Comunica using Traqula.
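The general idea behind such a modular parser can be sketched in a few lines of TypeScript. The sketch below is a toy and does not use Traqula's actual API: the `ParserBuilder`, `addRule`, and `patchRule` names, the tiny grammar, and the non-standard `TOPK` operator are all hypothetical. It only illustrates how a parser assembled from named rules can be extended with a new operator without rewriting the existing rules.

```typescript
// Toy rule-based parser; all names are hypothetical and NOT Traqula's API.
type AstNode = { type: string; [key: string]: unknown };
type ParseResult = { node: AstNode; rest: string } | undefined;
type Rule = (input: string, parser: Parser) => ParseResult;

class Parser {
  private readonly rules: Map<string, Rule>;

  constructor(rules: Map<string, Rule>) {
    // Copy the rules so later changes to the builder do not affect this parser.
    this.rules = new Map(rules);
  }

  parse(ruleName: string, input: string): ParseResult {
    const rule = this.rules.get(ruleName);
    if (!rule) throw new Error(`Unknown rule: ${ruleName}`);
    return rule(input.replace(/^\s+/, ''), this);
  }
}

class ParserBuilder {
  private readonly rules = new Map<string, Rule>();

  // Register a new rule under a name that is not yet in use.
  addRule(name: string, rule: Rule): this {
    if (this.rules.has(name)) throw new Error(`Rule already exists: ${name}`);
    this.rules.set(name, rule);
    return this;
  }

  // Replace the implementation of an existing rule.
  patchRule(name: string, rule: Rule): this {
    if (!this.rules.has(name)) throw new Error(`Cannot patch missing rule: ${name}`);
    this.rules.set(name, rule);
    return this;
  }

  build(): Parser {
    return new Parser(this.rules);
  }
}

// Base grammar: a pattern is a single triple pattern such as "?s ?p ?o .".
const builder = new ParserBuilder()
  .addRule('triplePattern', input => {
    const m = /^(\S+)\s+(\S+)\s+(\S+)\s*\./.exec(input);
    if (!m) return undefined;
    return {
      node: { type: 'bgp', subject: m[1], predicate: m[2], object: m[3] },
      rest: input.slice(m[0].length),
    };
  })
  .addRule('pattern', (input, parser) => parser.parse('triplePattern', input));
const standardParser = builder.build();

// Extension: plug in a made-up "TOPK <n> <triple pattern>" operator.
const extendedParser = builder
  .addRule('topK', (input, parser) => {
    const m = /^TOPK\s+(\d+)\s+/.exec(input);
    if (!m) return undefined;
    const inner = parser.parse('triplePattern', input.slice(m[0].length));
    if (!inner) return undefined;
    return { node: { type: 'topk', k: Number(m[1]), input: inner.node }, rest: inner.rest };
  })
  .patchRule('pattern', (input, parser) =>
    parser.parse('topK', input) ?? parser.parse('triplePattern', input))
  .build();
```

Here `standardParser` rejects `TOPK 3 ?s ?p ?o .`, while `extendedParser` accepts it and still parses plain triple patterns. For the real mechanism, refer to the dedicated tutorial mentioned above.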
As of this release, Comunica has fully migrated to Traqula, which involves a breaking change for Comunica module developers. As a developer of a Comunica module, the changes you will have to make will mostly look like this:
    - import type { Factory } from 'sparqlalgebrajs';
    - import { Algebra } from 'sparqlalgebrajs';
    + import { Algebra } from '@comunica/utils-algebra';
    + import type { AlgebraFactory } from '@comunica/utils-algebra';
🗃️ Improved HTTP caching
Up until now, Comunica engines already cached various things to improve performance within and across query executions, such as indexed representations of fetched RDF documents. However, these caches were agnostic to HTTP caching headers, and required manual cache invalidation if the underlying HTTP resource changed while the same query engine was being reused.
With this release, all caching within Comunica is done according to the HTTP caching semantics (e.g. RFC 7234/9111). This means that Comunica will automatically invalidate intermediary caches if the HTTP caching headers specify this.
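As a rough illustration of what following HTTP caching semantics means, the TypeScript sketch below decides whether a stored response may be reused without contacting the server. It is a simplified reading of the freshness rules in RFC 9111 and not Comunica's implementation: the `CachedEntry` shape and both function names are hypothetical, the storage time stands in for the `Date` header, and heuristic freshness, `s-maxage`, and revalidation are left out.

```typescript
// Simplified freshness check in the spirit of RFC 9111.
// All names are hypothetical; this is NOT Comunica's cache implementation.
interface CachedEntry {
  storedAtMs: number; // when the response was stored (epoch milliseconds)
  headers: Record<string, string>; // response headers with lower-cased names
}

// Parse a Cache-Control header value into a map of directives.
function parseCacheControl(value: string | undefined): Map<string, string | true> {
  const directives = new Map<string, string | true>();
  for (const part of (value ?? '').split(',')) {
    const [rawName, rawValue] = part.trim().split('=');
    if (!rawName) continue;
    directives.set(
      rawName.toLowerCase(),
      rawValue === undefined ? true : rawValue.replace(/^"|"$/g, ''),
    );
  }
  return directives;
}

// Returns true if the cached entry can be reused without revalidation.
function isFresh(entry: CachedEntry, nowMs: number): boolean {
  const directives = parseCacheControl(entry.headers['cache-control']);
  // no-store: should not have been stored; no-cache: must be revalidated first.
  if (directives.has('no-store') || directives.has('no-cache')) return false;

  // Freshness lifetime: max-age takes precedence over Expires.
  let lifetimeSec: number | undefined;
  const maxAge = directives.get('max-age');
  const expires = entry.headers['expires'];
  if (typeof maxAge === 'string' && /^\d+$/.test(maxAge)) {
    lifetimeSec = Number(maxAge);
  } else if (expires !== undefined) {
    const expiresMs = Date.parse(expires);
    // Assumption: the storage time approximates the response's Date header.
    if (!isNaN(expiresMs)) lifetimeSec = (expiresMs - entry.storedAtMs) / 1000;
  }
  // No explicit freshness information: treat as stale in this sketch.
  if (lifetimeSec === undefined) return false;

  // Current age: the Age header (if any) plus the time spent in our cache.
  const ageHeader = entry.headers['age'];
  const initialAgeSec = ageHeader === undefined ? 0 : Number(ageHeader);
  const currentAgeSec = initialAgeSec + (nowMs - entry.storedAtMs) / 1000;
  return currentAgeSec < lifetimeSec;
}
```

With an entry stored at time 0 and `cache-control: max-age=60`, `isFresh` returns `true` after 30 seconds and `false` after 61 seconds, at which point a cache would refetch or revalidate the resource instead of reusing its indexed representation.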
Additionally, it is now possible to enable caching of HTTP responses directly using the httpCache flag.
Learn more about caching in Comunica.
🗜️ Improved SERVICE handling and performance
While Comunica is able to automatically assign sources to parts of your SPARQL query, it is also possible to manually assign subqueries to sources using SERVICE clauses in your query. Up until now, these manually placed SERVICE clauses were managed differently in the query plan compared to automatically assigned sources, which sometimes caused SERVICE clauses to lead to worse query plans and slower queries. In this update, manual and automatic source assignments have been architecturally aligned, so that all optimizations apply to both.
Related to this change, we now have a new explain mode to inspect the SPARQL query after optimization, including SERVICE clauses after source assignment. This is useful to quickly see how Comunica assigned sources, or to pass a source-assigned query to another SPARQL engine.
The following shows an example of how this can be used to explain a federated query across one SPARQL endpoint and two TPF interfaces:
    $ comunica-sparql https://dbpedia.org/sparql https://data.linkeddatafragments.org/viaf https://data.linkeddatafragments.org/harvard \
      -q 'SELECT ?person ?name ?book ?title {
        ?person dbpedia-owl:birthPlace [ rdfs:label "San Francisco"@en ].
        ?viafID schema:sameAs ?person;
                schema:name ?name.
        ?book dc:contributor [ foaf:name ?name ];
              dc:title ?title.
      }' --explain query

    SELECT ?person ?name ?book ?title WHERE {
      { SERVICE <https://dbpedia.org/sparql> { ?viafID <http://schema.org/sameAs> ?person . } }
      UNION
      { SERVICE <https://data.linkeddatafragments.org/viaf> { ?viafID <http://schema.org/sameAs> ?person . } }
      { SERVICE <https://dbpedia.org/sparql> { ?g_1 <http://xmlns.com/foaf/0.1/name> ?name . } }
      UNION
      { SERVICE <https://data.linkeddatafragments.org/harvard> { ?g_1 <http://xmlns.com/foaf/0.1/name> ?name . } }
      { SERVICE <https://dbpedia.org/sparql> { ?book <http://purl.org/dc/terms/title> ?title . } }
      UNION
      { SERVICE <https://data.linkeddatafragments.org/harvard> { ?book <http://purl.org/dc/terms/title> ?title . } }
      SERVICE <https://dbpedia.org/sparql> {
        ?g_0 <http://www.w3.org/2000/01/rdf-schema#label> "San Francisco"@en .
        ?person <http://dbpedia.org/ontology/birthPlace> ?g_0 .
      }
      SERVICE <https://data.linkeddatafragments.org/viaf> { ?viafID <http://schema.org/name> ?name . }
      SERVICE <https://data.linkeddatafragments.org/harvard> { ?book <http://purl.org/dc/terms/contributor> ?g_1 . }
    }
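The shape of the explained query above follows a simple pattern: a triple pattern that a single source can answer becomes a plain SERVICE clause, while a pattern with several candidate sources becomes a UNION of SERVICE clauses. The TypeScript sketch below reproduces only that textual shape. The `assignSources` function is hypothetical and works on strings; how Comunica actually decides which sources are relevant, and how it represents this in the algebra, is not shown here.

```typescript
// Hypothetical helper that mimics the SHAPE of the explained query above.
// It is NOT Comunica's source assignment logic, which operates on the query algebra.
function assignSources(pattern: string, sources: string[]): string {
  const clauses = sources.map(source => `SERVICE <${source}> { ${pattern} }`);
  const [first, ...others] = clauses;
  if (first === undefined) {
    throw new Error('A pattern needs at least one candidate source');
  }
  // A single candidate source: emit a plain SERVICE clause.
  if (others.length === 0) {
    return first;
  }
  // Several candidate sources: emit a UNION over one SERVICE clause per source.
  return clauses.map(clause => `{ ${clause} }`).join(' UNION ');
}

const sameAsAssignment = assignSources(
  '?viafID <http://schema.org/sameAs> ?person .',
  ['https://dbpedia.org/sparql', 'https://data.linkeddatafragments.org/viaf'],
);
```

For the `schema:sameAs` pattern and its two candidate sources, `sameAsAssignment` holds the same UNION of two SERVICE clauses that opens the explained query.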
⛓️ Improved link traversal management
Comunica has a dedicated repository for Link Traversal Query Processing, which is used for querying over decentralized environments such as Solid. While most of the traversal logic existed in this dedicated repository, some parts existed in the Comunica base repository. With this release, these parts have been removed from base Comunica, and properly integrated into Comunica Link Traversal.
This change also introduces the concept of a Link Traversal Manager into Comunica Link Traversal, which makes link queue management more convenient. Concretely, all passed sources that are marked with the traverse: true flag will be grouped into one Link Traversal Source, and have a dedicated Link Traversal Manager. This manager holds the link queue, and allows the traversal process to start and stop.
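The responsibilities described above can be pictured with a small TypeScript sketch. It is a conceptual toy under stated assumptions, not the Comunica Link Traversal API: the `SourceConfig` shape, the `groupSources` helper, and the `LinkTraversalManagerSketch` class with its method names are all hypothetical, and the example URLs are made up.

```typescript
// Conceptual sketch only; all names are hypothetical and NOT the Comunica Link Traversal API.
interface SourceConfig {
  value: string;
  traverse?: boolean;
}

class LinkTraversalManagerSketch {
  private readonly queue: string[] = [];
  private readonly seen = new Set<string>();
  private running = false;

  constructor(seeds: string[]) {
    for (const seed of seeds) this.push(seed);
  }

  // Add a discovered link to the queue; returns false if it was already seen.
  push(url: string): boolean {
    if (this.seen.has(url)) return false;
    this.seen.add(url);
    this.queue.push(url);
    return true;
  }

  startTraversal(): void {
    this.running = true;
  }

  stopTraversal(): void {
    this.running = false;
  }

  // Next link to dereference, or undefined when stopped or when the queue is empty.
  pop(): string | undefined {
    return this.running ? this.queue.shift() : undefined;
  }
}

// Group all sources flagged with traverse: true under one manager; keep the rest as-is.
function groupSources(sources: SourceConfig[]): {
  manager: LinkTraversalManagerSketch;
  regularSources: string[];
} {
  const seeds = sources.filter(s => s.traverse === true).map(s => s.value);
  const regularSources = sources.filter(s => s.traverse !== true).map(s => s.value);
  return { manager: new LinkTraversalManagerSketch(seeds), regularSources };
}
```

A consumer would call `startTraversal()`, repeatedly `pop()` a link, dereference it, and `push()` any newly discovered links, until `pop()` returns `undefined` or `stopTraversal()` is called. Keeping the queue and the running state in one object is what makes starting and stopping traversal straightforward.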
🤝 Contributors
This release has been made possible thanks to the help of the following contributors (in no particular order):
If you would like to contribute yourself, be sure to have a look at our contribution guide. We even have some new bounties that allow you to get paid for your contribution!
- Run all SPARQL spec tests (Variable budget)
- Retry requests with failure during body response (€1632)
- Prefetch sources (€272)
- Compact similar filter warnings in logs (€1088)
Full changelog
If you want to learn more about the other changes in Comunica 5.0, check out the full changelog.