Release 4.2: 🍇 Towards real-world federation improvements
Tuesday, April 29, 2025
In this release, we mainly focused on solving practical problems that come up when executing federated SPARQL queries in real-world use cases. In addition, a variety of low-level performance improvements were applied.
🍇 Real-world federation improvements
While many algorithms have been proposed for federating over SPARQL endpoints, we notice that many practical problems arise when running these algorithms in the real world. For example, endpoints restrict their usage by setting timeouts, result limits, and more.
In this release, we started working towards coping with these practical problems, to make federation more stable in realistic use cases. This includes the introduction of a smart HTTP rate limit actor, HTTP Retry-After header support, and optimizations for SPARQL endpoints that expose VoID descriptions. This is just the beginning, so expect more improvements in upcoming releases.
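To make the Retry-After support more concrete: the header can carry either a number of seconds or an HTTP date, and a client has to turn both forms into a waiting time. The snippet below is a minimal sketch of that conversion, not Comunica's actual implementation; the function name `retryAfterToDelayMs` and its signature are hypothetical.

```typescript
// Hypothetical helper (not part of Comunica's API): converts the value of an
// HTTP Retry-After header into a number of milliseconds to wait.
function retryAfterToDelayMs(headerValue: string, now: Date = new Date()): number | undefined {
  const value = headerValue.trim();

  // Form 1: delay in seconds, a non-negative integer such as "120".
  if (/^\d+$/.test(value)) {
    return Number.parseInt(value, 10) * 1000;
  }

  // Form 2: an HTTP date such as "Wed, 21 Oct 2015 07:28:00 GMT".
  const retryAt = Date.parse(value);
  if (Number.isNaN(retryAt)) {
    // Unparseable header: leave the backoff decision to the caller.
    return undefined;
  }

  // A date in the past means the request may be retried immediately.
  return Math.max(0, retryAt - now.getTime());
}
```

A caller could wait for the returned delay before re-sending the request, and fall back to its own backoff strategy when the result is `undefined`.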
For example, the query below over two Wikidata SPARQL endpoints (that expose VoID descriptions) can now be executed in 1.2 seconds, down from roughly 7 seconds before:
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wd: <http://www.wikidata.org/entity/>
SELECT ?subject ?subjectType {
  wd:Q59458901 wdt:P921 ?subject .
  ?subject wdt:P31 ?subjectType
}
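One reason VoID descriptions can help with queries like this is that they may publish triple counts per predicate (via `void:propertyPartition` and `void:triples`), so a client can estimate how many results a pattern yields in each endpoint without sending extra requests. The sketch below illustrates that idea only; it is an assumption about how such statistics could be used, not a description of Comunica's internals, and the `VoidSummary` type and both functions are hypothetical.

```typescript
// Hypothetical in-memory summary of one endpoint's VoID description.
interface VoidSummary {
  endpoint: string;
  // Value of void:triples for the whole dataset.
  totalTriples: number;
  // Predicate IRI -> void:triples of the matching void:propertyPartition.
  triplesPerProperty: Map<string, number>;
}

// Estimate how many triples with the given predicate an endpoint holds.
function estimateCardinality(summary: VoidSummary, predicate: string): number {
  // Without property partitions, the dataset size is the only safe upper bound.
  if (summary.triplesPerProperty.size === 0) {
    return summary.totalTriples;
  }
  return summary.triplesPerProperty.get(predicate) ?? 0;
}

// Drop endpoints that cannot contribute, and order the rest from the
// smallest to the largest estimate.
function selectSources(summaries: VoidSummary[], predicate: string): string[] {
  return summaries
    .filter(summary => estimateCardinality(summary, predicate) > 0)
    .sort((a, b) => estimateCardinality(a, predicate) - estimateCardinality(b, predicate))
    .map(summary => summary.endpoint);
}
```

With estimates like these, a pattern such as `?subject wdt:P31 ?subjectType` only needs to be sent to endpoints that report triples for `wdt:P31`.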
If you run into other practical issues when executing federated queries, be sure to report them on our issue tracker!
🚄 Performance improvements
Besides the changes mentioned above, a number of smaller changes with a positive impact on performance are worth mentioning:
- Fix bad plans sometimes being chosen due to requestTime in files
- Fix queries with complex property paths not terminating
- Skip COUNT queries to a single SPARQL source
- Skip unnecessary SPARQL SD requests for a single source
- Allow Bind Join more for local data sources
- Shorten code path on empty join operations
- Remove uncommon variables handling in join entry sort
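For readers unfamiliar with the bind join mentioned in the list above: the general technique takes each solution of the left-hand side, substitutes its values into the right-hand pattern, and only then evaluates that pattern against the source. The following is a minimal in-memory sketch of that general idea, not Comunica's bind join actor; all types and function names are hypothetical, and terms are plain strings where a leading `?` marks a variable.

```typescript
type Bindings = Record<string, string>;
type Triple = [string, string, string];
// A triple pattern; terms starting with "?" are variables.
type Pattern = [string, string, string];

// Replace variables in the pattern by the values already bound.
function bindPattern(pattern: Pattern, bindings: Bindings): Pattern {
  return pattern.map(term =>
    (term.startsWith('?') && term in bindings ? bindings[term] : term)) as Pattern;
}

// Evaluate a pattern against a local list of triples.
function matchPattern(pattern: Pattern, triples: Triple[]): Bindings[] {
  const results: Bindings[] = [];
  for (const triple of triples) {
    const found: Bindings = {};
    let matches = true;
    for (let i = 0; i < 3 && matches; i++) {
      const term = pattern[i];
      if (term.startsWith('?')) {
        // A variable that occurs twice must bind to the same value.
        if (term in found && found[term] !== triple[i]) {
          matches = false;
        } else {
          found[term] = triple[i];
        }
      } else if (term !== triple[i]) {
        matches = false;
      }
    }
    if (matches) {
      results.push(found);
    }
  }
  return results;
}

// Bind join: evaluate the right pattern once per left solution,
// with the left solution's values substituted in.
function bindJoin(left: Bindings[], rightPattern: Pattern, triples: Triple[]): Bindings[] {
  const joined: Bindings[] = [];
  for (const leftBindings of left) {
    const bound = bindPattern(rightPattern, leftBindings);
    for (const rightBindings of matchPattern(bound, triples)) {
      joined.push({ ...leftBindings, ...rightBindings });
    }
  }
  return joined;
}
```

Because every lookup on the right is already restricted by the left solution, this strategy tends to pay off when the left side produces few results and lookups in the source are cheap.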
The state of our overall performance is now available on our website.
🤝 Contributors
This release has been made possible thanks to the help of the following contributors (in no particular order):
- Bryan-Elliott Tam
- Jonni Hanski
- Elias Crum
- Jitse De Smet
- Ruben Eschauzier
- Maarten Vandenbrande
- Ruben Taelman
- Karel Klíma
Full changelog
While this blog post explained the primary changes in Comunica 4.2, there are many more small internal changes that will make your life easier. If you want to learn more about these changes, check out the full changelog.