This tutorial introduces semantic data enrichment, covering both theoretical and practical considerations. In particular, it explains the role that semantics plays in data enrichment for downstream AI-based applications, reviews the advantages and limitations of the tools, methodologies, and techniques for semantic data enrichment available today, and offers a hands-on look at creating data transformations for enriching data.
Provide evidence for the hypothesis that semantic approaches can play an important role in supporting high-quality, controllable, and scalable tabular data enrichment processes. This objective positions data enrichment as a key perspective on semantic solutions that support AI-based applications and respond to strong market demand.
Present a link & extend paradigm for data enrichment as a unifying abstraction to develop semantic data enrichment solutions.
Present key aspects of semantic data enrichment, with a particular focus on:
Provide a practical guide to creating a semantic data enrichment pipeline for real-world data enrichment tasks, using an extensible toolkit (under development in the context of the enRichMyData project) that covers key aspects of semantic enrichment.
Discuss open research questions to stimulate further research on data enrichment.
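As a rough illustration of the link & extend paradigm named in the objectives above, the following minimal sketch assumes that "link" means reconciling a table cell to an entity in a reference knowledge base, and "extend" means adding properties of the linked entity as new columns. The knowledge base `KB`, the entity identifiers, and the function names `link`, `extend`, and `link_and_extend` are all hypothetical and are not taken from the tutorial's toolkit.

```python
from typing import Optional

# Hypothetical in-memory knowledge base: entity id -> properties.
# A real pipeline would query an external reference source instead.
KB = {
    "kb:berlin": {"label": "Berlin", "country": "Germany"},
    "kb:paris": {"label": "Paris", "country": "France"},
}


def link(value: str) -> Optional[str]:
    """Link step: reconcile a cell value to an entity id by normalized label."""
    key = value.strip().casefold()
    for entity_id, props in KB.items():
        if props["label"].casefold() == key:
            return entity_id
    return None  # no match found; the cell stays unlinked


def extend(row: dict, entity_id: Optional[str], properties: list[str]) -> dict:
    """Extend step: copy the row and add the requested entity properties."""
    enriched = dict(row)  # leave the input row untouched
    for prop in properties:
        enriched[prop] = KB[entity_id].get(prop) if entity_id is not None else None
    return enriched


def link_and_extend(rows: list[dict], column: str, properties: list[str]) -> list[dict]:
    """Apply link, then extend, to every row of a table."""
    result = []
    for row in rows:
        entity_id = link(row[column])
        enriched = extend(row, entity_id, properties)
        enriched[f"{column}_id"] = entity_id  # keep the link itself as a column
        result.append(enriched)
    return result


rows = [
    {"city": " berlin ", "sales": 10},
    {"city": "Paris", "sales": 7},
    {"city": "Atlantis", "sales": 1},
]
enriched = link_and_extend(rows, "city", ["country"])
for row in enriched:
    print(row)
```

Keeping the link as its own column (here `city_id`) separates the uncertain reconciliation step from the extension step, so unlinked cells remain visible and can be reviewed before further enrichment.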
The tutorial is a half-day session of approximately three hours, including presentations and a hands-on section. The draft schedule is the following:
Topics: challenges; semantic technologies as key enablers; analytics with enriched data: use cases from industry-driven data science projects.
Topics: semantic table annotation tools and techniques; doing it with OpenRefine: lessons learned and limitations; data enrichment with batch pipelines: lessons learned and limitations; user-driven data enrichment at scale: the Grafterizer/ASIA approach; QMiner and data analytics on top of enriched data.
Topics: SemTUI, a fully modular framework for extending, annotating and enriching data; Expert AI tools: category classification on top of enriched data.
Environment setup; design of data transformation workflows with SemTUI; batch execution of the transformations on a larger data set; design of transformation workflows for larger data sets with TAO.
You should attend the tutorial if you:
See our tools in action!
WORK IN PROGRESS