Overview
enRichMyData delivers capabilities as a set of interoperable tools and services forming the enRichMyData Toolbox. At the center is the concept of a data enrichment pipeline that receives input data to be enriched and generates enriched data.
Functional Tools
A set of tools providing functional capabilities needed to support the design of pipelines.
Infrastructure Services
Services providing non-functional capabilities needed to support the effective deployment and execution of pipelines.
enRichMyData features loosely coupled but interoperable tools and services designed to handle complex data enrichment scenarios, where tools and services can be combined and customized as needed.
DiscoverR
DiscoverR assists users in searching datasets, ontologies, and enrichment services and provides insights on their content to support their use in enrichment pipelines. Users can search by keyword over descriptions of catalogued datasets, ontologies, and services, or browse specific descriptions from a visual interface. It offers semantic data profiling techniques that enrich basic metadata-based descriptions with ontology usage patterns and statistics, supporting the FAIR principles.
ABSTAT
Scalable data profiling tool for RDF data (knowledge graphs) based on: 1) pattern extraction (class - predicate - class), possibly with the support of the data ontology; 2) calculation of different statistics.
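The pattern extraction idea can be illustrated with a minimal sketch. It is not ABSTAT's implementation: the triples, the entity-to-class map, and the function name `extract_patterns` are all hypothetical, and the only statistic computed is pattern frequency.

```python
from collections import Counter

# Hypothetical toy data: (subject, predicate, object) triples plus a map
# from each entity to its class. This is not ABSTAT's actual input format.
TRIPLES = [
    ("ex:alice", "ex:worksFor", "ex:acme"),
    ("ex:bob", "ex:worksFor", "ex:acme"),
    ("ex:acme", "ex:locatedIn", "ex:oslo"),
]
TYPES = {
    "ex:alice": "ex:Person",
    "ex:bob": "ex:Person",
    "ex:acme": "ex:Company",
    "ex:oslo": "ex:City",
}


def extract_patterns(triples, types, default="owl:Thing"):
    """Count (subject class, predicate, object class) patterns.

    Entities without a known class fall back to `default`.
    """
    patterns = Counter()
    for subject, predicate, obj in triples:
        key = (types.get(subject, default), predicate, types.get(obj, default))
        patterns[key] += 1
    return patterns


patterns = extract_patterns(TRIPLES, TYPES)
for (subject_class, predicate, object_class), frequency in patterns.most_common():
    print(f"{subject_class} - {predicate} - {object_class}: {frequency}")
```

Each distinct pattern summarizes many triples, which is why a profile of this kind stays compact even when the underlying graph is large.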
SemTUI
Table reconciliation and extension service: semantic annotation of tabular data by external services, entity reconciliation and linking, and table extension with data extracted from external datasets and knowledge graphs.
WrappR
WrappR provides secure data access through a virtual semantic layer. It is delivered as a semantic graph database with efficient reasoning, cluster support, and external index synchronization. It offers a variety of APIs and access methods as well as different types of data federation and virtualization. Through semantic data access and integration, WrappR provides a practical, robust and versatile tool to improve access to data.
Ontotext GraphDB
Ontotext GraphDB is a highly efficient and robust graph database with RDF and SPARQL support. It supports a number of plugins and connectors, such as the MongoDB connector for JSON store access, JDBC for exposing RDF as a virtual relational database, and Ontop for virtual SPARQL access.
Ontotext Semantic Objects
Semantic Objects is a declaratively configurable service for querying and mutating knowledge graphs. It automatically transpiles GraphQL queries and mutations into optimized SPARQL queries.
Ontotext Semantic Search
Semantic Search provides a way to index data from GraphDB in Elasticsearch and run queries against it.
CleanR
CleanR supports the specification of data manipulation transformations, including data cleaning operations and the generation of knowledge graphs from various data formats. Users specify transformations interactively in a user interface, and the specifications are stored in a machine-readable format so they can be replicated and reused. CleanR provides a broad set of AI-enabled data transformations and integrates them with the generic linking and extension functionalities provided by ResourcR.
Ontotext Refine
Ontotext Refine (OntoRefine) is a free application for automating the conversion of messy string data into a knowledge graph.
RMLMapper
RMLMapper executes RDF Mapping Language (RML) rules to generate Linked Data from multiple structured and semi-structured data sources.
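The idea behind rule-driven mapping can be sketched in a few lines. This is a simplified illustration and not RML syntax or the RMLMapper API: the `RULE` dictionary is a hypothetical stand-in for a mapping rule, and the CSV data, the example.org subject template, and the function name `apply_rule` are assumptions.

```python
import csv
import io

# Hypothetical input: a small CSV source held in memory.
CSV_DATA = "id,name\n1,Alice\n2,Bob\n"

# Hypothetical rule, standing in for a declarative mapping rule:
# how to build the subject IRI, which class to assert, and which
# columns become literal values of which predicates.
RULE = {
    "subject_template": "http://example.org/person/{id}",
    "class": "http://xmlns.com/foaf/0.1/Person",
    "predicate_object": [("http://xmlns.com/foaf/0.1/name", "name")],
}

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def apply_rule(rows, rule):
    """Turn each input row into N-Triples-style lines according to the rule.

    Literal escaping is deliberately omitted, so values containing quotes
    or backslashes would need extra handling.
    """
    triples = []
    for row in rows:
        subject = rule["subject_template"].format(**row)
        triples.append(f"<{subject}> <{RDF_TYPE}> <{rule['class']}> .")
        for predicate, column in rule["predicate_object"]:
            triples.append(f'<{subject}> <{predicate}> "{row[column]}" .')
    return triples


rows = list(csv.DictReader(io.StringIO(CSV_DATA)))
triples = apply_rule(rows, RULE)
print("\n".join(triples))
```

Keeping the rule as data rather than code is what makes the same mapping reusable across sources that share a structure.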
LinkR
LinkR provides capabilities for semantic annotation of structured and semi-structured data using reference knowledge graphs and category schemes. Annotations consist of links from elements of the input data to elements of well-established knowledge bases and ontologies, or of user-defined knowledge graphs made available through ResourcR. LinkR supports annotation through ML algorithms that recommend annotations, combined with a human-in-the-loop approach.
SemTUI
Table reconciliation and extension services: semantic annotation of tabular data using external services. The UI supports entity linking and schema annotations to support full-fledged mapping, and the specification of data extension operations using external datasets and knowledge graphs.
selBat
Table interpretation service: semantic annotation of tabular data by an unsupervised approach based on heuristics. It covers schema types and properties as well as entity reconciliation and linking. Target knowledge graph: Wikidata.
Ontotext Refine
Ontotext Refine (OntoRefine) is a free application for automating the conversion of messy string data into a knowledge graph. It allows reconciliation against any endpoint supporting the reconcile API protocol.
Ontotext Reconciliation
Ontotext Reconciliation generates a reconciliation API endpoint on top of an RDF knowledge graph.
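As a rough sketch of how a client talks to a reconciliation endpoint, the snippet below builds a query batch in the shape commonly used by the Reconciliation Service API (a JSON object keyed `q0`, `q1`, ... sent as a `queries` form field) and picks the top-scoring candidate from a response. No request is sent; the identifiers `ex:City` and `ex:oslo`, the sample response, and the helper names are hypothetical, and the exact fields supported by a given endpoint should be checked against its own documentation.

```python
import json


def build_queries(names, type_id=None, limit=3):
    """Build a batch of reconciliation queries keyed q0, q1, ..."""
    queries = {}
    for index, name in enumerate(names):
        query = {"query": name, "limit": limit}
        if type_id is not None:
            query["type"] = type_id  # restrict candidates to one class
        queries[f"q{index}"] = query
    return queries


def best_matches(response):
    """Return the id of the highest-scoring candidate per query, or None."""
    best = {}
    for key, payload in response.items():
        candidates = payload.get("result", [])
        if candidates:
            best[key] = max(candidates, key=lambda c: c["score"])["id"]
        else:
            best[key] = None
    return best


queries = build_queries(["Oslo", "Milan"], type_id="ex:City")
# The batch is typically sent as a form field; the endpoint URL is omitted here.
form_body = {"queries": json.dumps(queries)}

# Hypothetical response in the usual shape: one result list per query key.
sample_response = {
    "q0": {"result": [
        {"id": "ex:oslo", "name": "Oslo", "score": 98.0, "match": True},
        {"id": "ex:oslo-county", "name": "Oslo County", "score": 61.5, "match": False},
    ]},
    "q1": {"result": []},
}
print(best_matches(sample_response))
```

Queries with no candidates map to `None`, which lets a human-in-the-loop step pick up exactly the cells that automatic reconciliation could not resolve.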
StructR
StructR is the counterpart of LinkR for unstructured data. It generates structured data from unstructured input text through semantic annotation, linking and extension. The text is processed by linguistic and semantic tools, which identify concept mentions and disambiguate them using context. Extension with custom annotation services is supported through a labeling interface for creating and editing text annotations.
Wikifier
The JSI Wikifier is a web service that takes a text document as input and annotates it with links to relevant Wikipedia concepts (entities).
Expert AI Platform Document Analyser
With the Natural Language API's document analysis capabilities, you can perform deep linguistic analysis, keyphrase extraction, named entity recognition, relation extraction and sentiment analysis.
Event Registry Relation Classifier
The Event Registry Relation Classifier.
Infrastructure Services
Infrastructure Services provide non-functional capabilities needed to support the effective deployment and execution of pipelines. These services enable scaling, reuse, streaming, and environmental impact monitoring of data enrichment processes.
ResourcR
Provides infrastructure components to support the creation of linking services for a given dataset from a data provider as well as access mechanisms such as search and query. Enables performant linking and search functionalities with limited effort.
ScalR
Provides infrastructure components for executing cleaning, transformation and linking at large scale using software containers. Supports management of data enrichment pipelines on heterogeneous computing infrastructures.
StreamR
Provides infrastructure components for streaming support in data enrichment pipelines. Pipes data streams from and to the appropriate endpoints with high throughput, and supports setting up custom streams for new applications.
GreenR
Provides infrastructure components to support monitoring of data enrichment pipelines in terms of their environmental impact. Monitors the carbon footprint of pipeline components.
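To make the notion of a per-component carbon footprint concrete, here is a minimal sketch of one common estimate: energy consumed multiplied by the carbon intensity of the electricity grid. The source does not state how GreenR computes its figures, so the formula, the component names, and all numbers below are assumptions for illustration only.

```python
def estimate_emissions_g(power_watts, runtime_seconds, intensity_g_per_kwh):
    """Estimate emissions in grams of CO2-equivalent.

    energy (kWh) = watts * seconds / 3,600,000
    emissions (g) = energy (kWh) * grid carbon intensity (g/kWh)
    """
    energy_kwh = power_watts * runtime_seconds / 3_600_000
    return energy_kwh * intensity_g_per_kwh


# Hypothetical pipeline components: (average power in W, runtime in s).
COMPONENTS = {
    "cleaning": (200, 1800),
    "linking": (350, 3600),
}
GRID_INTENSITY = 400  # hypothetical grid intensity in gCO2e per kWh

footprint = {
    name: estimate_emissions_g(power, runtime, GRID_INTENSITY)
    for name, (power, runtime) in COMPONENTS.items()
}
for name, grams in footprint.items():
    print(f"{name}: {grams:.1f} gCO2e")
print(f"total: {sum(footprint.values()):.1f} gCO2e")
```

Reporting the estimate per component rather than per pipeline shows which step dominates the footprint and is therefore the first candidate for optimization.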