RDF Parsing and Serializing
On this page
Parsing from and serializing to RDF is of great importance within Comunica, as Comunica needs to be able to query over RDF files in different formats, and produce RDF query results in different formats.
For this, Comunica provides the
RDF Parse (@comunica/bus-rdf-parse
)
and
RDF Serialize (@comunica/bus-rdf-serialize
) bus.
These buses respectively contain spec-compliant streaming parsers
and serializers for the most important RDF formats.
Calling a parser
RDF parsing actors implement the ActorRdfParse
abstract class,
which can handle two types of actions:
- Retrieval of supported media types (
mediaTypes
), such as'text/turtle'
,application/ld+json
, ... - Parsing for a given media type (
handle
).
While the first action can be used to determine all available media types that can be parsed across all actors in a bus, the second action is typically used afterwards to parse RDF for a specific media type.
Since there are two types of actions, calling an RDF parser involves two respective mediators.
An example of such two mediators can be found in dereference-rdf/actors.json
.
In TypeScript, these mediators will correspond to the following fields:
public readonly mediatorRdfParseMediatypes: MediateMediaTypes; public readonly mediatorRdfParseHandle: MediateMediaTyped< IActionParse<IActionRdfParseMetadata>, IActorTest, IActorParseOutput<RDF.Stream, IActorRdfParseOutputMetadata> >;
All available media types can be retrieved as follows:
const { mediaTypes } = await this.mediatorRdfParseMediatypes.mediate( { context, mediaTypes: true }, );
Parsing for a specific media type can be done as follows:
const { quads } = (await this.mediatorRdfParseHandle.mediate( { context, handle: { context, headers: undefined, // Optional HTTP fetch headers input: textStream, metadata: { baseIRI: 'http://example.org/' }, }, handleMediaType: 'text/turtle', }, )).handle;
Input quadStream
must always be a text stream,
output quads
is am RDF/JS stream.
More examples on how these parses are used can be found in actors on the Dereference RDF bus or in the rdf-parse.js package.
Calling a serializer
RDF serialzation actors implement the ActorRdfSerialize
,
which can handle two types of actions:
- Retrieval of supported media types (
mediaTypes
), such as'text/turtle'
,application/ld+json
, ... - Retrieval of supported media types as URLs (
mediaTypeFormats
), such ashttp://www.w3.org/ns/formats/N3
,http://www.w3.org/ns/formats/JSON-LD
, ... - Parsing for a given media type (
handle
).
The first action can be used to determine all available media types that can be parsed across all actors in a bus, the second action is used to identify media types by URL in things like SPARQL service descriptions, and the third action is typically used afterwards to parse RDF for a specific media type.
Since there are three types of actions, calling an RDF serializer involves three respective mediators.
An example of such two mediators can be found in sparql-serializers.json
.
In TypeScript, these mediators will correspond to the following fields:
public readonly mediatorRdfSerialize: MediatorRdfSerializeHandle; public readonly mediatorMediaTypeCombiner: MediatorRdfSerializeMediaTypes; public readonly mediatorMediaTypeFormatCombiner: MediatorRdfSerializeMediaTypeFormats;
All available media types can be retrieved as follows:
const { mediaTypes } = await this.mediatorMediaTypeCombiner.mediate( { context, mediaTypes: true }, );
All available media type URLs can be retrieved as follows:
const { mediaTypeFormats } = await this.mediatorMediaTypeFormatCombiner.mediate( { context, mediaTypeFormats: true }, );
Serializing for a specific media type can be done as follows:
const { data } = (await this.mediatorRdfSerialize.mediate({ context, handle: { type: 'quads', quadStream, // An RDF/JS Stream of RDF/JS quads. }, handleMediaType: 'text/turtle', })).handle
Input quadStream
must always be an RDF/JS stream,
output data
is a text stream.
More examples on how these parses are used can be found in the SPARQL RDF Serialize actor.