About This Course
The Linguistic Linked Data (LLD) field studies techniques and tools for modelling and publishing language resources on the Web in ways that enable the interoperability and reuse of their data. During this course you will acquire the fundamental notions of LLD and gain practical experience with its main tools and techniques.
LLD builds on Semantic Web techniques. One of the first lessons of this course is therefore a quick overview of Linked Data and the Semantic Web, to refresh some basic concepts. After that lesson, we will introduce the notion of LLD and then look at one of its most relevant foundational models, OntoLex-Lemon. Lemon is aimed at representing lexical content as Linked Data on the Web. Right after that, we will move on to more practical aspects, learning how to build lemon lexicons with VocBench and how to use general frameworks for semantic processing such as Jena. The SPARQL query language is key to querying and updating linked data; we will learn its main capabilities, applied to LLD examples. The following lesson will tackle the problem of how to generate and publish linked data from other data formats. Then, another lesson will discuss the particular representation needs of corpora and text annotation.
Course Staff
Jorge Gracia, University of Zaragoza
Jorge Gracia is a senior research fellow (“Ramón y Cajal” postdoctoral position) at the Department of Computer Science and Systems Engineering (University of Zaragoza, Spain). He is a member of the Aragon Institute of Engineering Research (I3A) and of the Distributed Information Systems (SID) research group. His main research interests include the Semantic Web, Ontology Matching, Linguistic Linked Data and Natural Language Processing. He has been chair of NexusLinguarum, the “European network for Web-centred linguistic data science”, a COST Action that joined the efforts of researchers from 42 countries.
Max Ionov, University of Cologne
Christian Chiarcos, University of Augsburg
Armando Stellato, University of Rome Tor Vergata
Armando Stellato, PhD, is Associate Professor at the University of Rome, Tor Vergata, where he researches and teaches in the fields of Knowledge Engineering and Knowledge Based Systems, especially in the area of the Semantic Web. Under two projects funded by the ISA2 program, he is currently leading the development of three platforms: VocBench, a Platform for Collaborative Management of Ontologies, Thesauri, Lexicons and Datasets; ShowVoc, a dataset/metadata catalog and fruition platform; and SEPIA, a platform for semantic elicitation from unstructured content. He is also active in the field of Legal Informatics: he is involved in a collaboration with the Italian government on the semantic organization of Italian laws, and in the realization of a semantic representation model for legal acts, currently funded and under adoption by the Publications Office of the EU.
John McCrae, University of Galway
John McCrae is a lecturer above-the-bar at the Data Science Institute, Insight Centre for Data Analytics and ADAPT centre at the National University of Ireland Galway and the leader of the Unit for Linguistic Data. He is the coordinator of the Prêt-à-LLOD project and a work package leader in the ELEXIS infrastructure. His research interests include ontologies, lexicography and the lexicon-ontology interface, collaborative development and publishing of language resources, big data and data science, linked data and the Semantic Web, machine translation and multilingualism, machine learning methods for NLP, digital humanities, and under-resourced languages. He obtained his PhD from the National Institute of Informatics in Tokyo under the supervision of Nigel Collier, and until 2015 he was a post-doctoral researcher at the University of Bielefeld in Bielefeld, Germany, in Prof. Philipp Cimiano's group, AG Semantic Computing.
Slavko Žitnik, University of Ljubljana (Course coordinator)
Slavko Žitnik is Associate Professor and Vice Dean for Education at the University of Ljubljana, Faculty of Computer and Information Science. His research is in the areas of natural language processing, information retrieval, information extraction, the Semantic Web, and information systems, and counts more than 100 bibliographic items. He actively collaborates with researchers from Université Paris 1 - Sorbonne, University of Belgrade, University of South Florida, and Harvard University. He has received multiple shared task awards and the University of Ljubljana award for extraordinary pedagogical, research, and artistic achievements. He teaches courses related to data science, databases, semantics, and natural language processing.
Andon Tchechmedjiev, IMT Mines Ales
Andon Tchechmedjiev is Associate Professor at Institut Mines Telecom in the EuroMov Digital Health in Motion (EDHM) interdisciplinary lab (IMT Mines Alès, University of Montpellier), which works at the crossroads of artificial intelligence, human movement science and embodiment, and medicine. He is co-PI of the Semantics and Taxonomy of Human Movement research axis and a member of the lab steering committee. He also co-leads the data ecosystem and governance theme in the Data & AI scientific community at Institut Mines Telecom.
Collaborators
- Katerina Gkirtzou, Athena Research Center
- Panagiotis Karioris, Athena Research Center
- Gokhan Ozkan, Kırklareli University