Nexus Linguarum DGN_linkeddata-essentials
Linguistic Linked Data – Essentials

The Linguistic Linked Data field studies techniques and tools for modelling and publishing language resources on the Web in ways that enable data interoperability and reuse. During this course you will acquire the fundamental notions of linguistic linked data and gain practical experience with its main tools and techniques. This first course covers the essentials of linguistic linked data and has a continuation (Linguistic Linked Data – Advanced Topics), which we encourage you to take in order to acquire a complete picture of the field.

  1. Course Start: Self-paced
  2. Estimated Effort: 15 hours total

About This Course

The Linguistic Linked Data (LLD) field studies techniques and tools for modelling and publishing language resources on the Web in ways that enable data interoperability and reuse. During this course you will acquire the fundamental notions of LLD and gain practical experience with its main tools and techniques.

LLD builds on Semantic Web techniques, so one of the first lessons of this course is a quick overview of Linked Data and the Semantic Web to refresh some basic concepts. We then introduce the notion of LLD and look at one of its most relevant foundational models, OntoLex-Lemon, which is aimed at representing lexical content as Linked Data on the Web. Next we turn to more practical aspects: how to build lemon lexicons with VocBench, and how to use other general frameworks for semantic processing such as Jena. The SPARQL query language is key to querying and updating linked data; we will learn its main capabilities, applied to LLD examples. The following lesson tackles the problem of generating and publishing linked data from other data formats. Finally, another lesson discusses the particular representation needs of corpora and text annotation.

Course Topics

  • Installing the tools
  • Semantic Web and linked data
  • Linguistic linked data
  • Tools for OntoLex lexicon building
  • Modeling: OntoLex-Lemon
  • Linked data tools overview
  • SPARQL
  • Linked data generation
  • Corpora and annotation

Required Skills

General IT knowledge and basics in linguistics.

Course Level

Introductory course.

Target Group

Anyone interested in language technologies who wants to represent and generate linguistic data in standard, interoperable ways on the Web.

Effort

3-4 hours per week

License

CC-BY-SA 4.0

Course Staff

Jorge Gracia, University of Zaragoza

Jorge Gracia is a senior research fellow (“Ramón y Cajal” postdoctoral position) at the Department of Computer Science and Systems Engineering (University of Zaragoza, Spain). He is a member of the Aragon Institute of Engineering Research (I3A) and of the Distributed Information Systems (SID) research group. His main research interests include the Semantic Web, Ontology Matching, Linguistic Linked Data and Natural Language Processing. He has been chair of NexusLinguarum, the “European network for Web-centred linguistic data science”, a COST Action that joined the efforts of researchers from 42 countries.

Max Ionov, University of Cologne

Christian Chiarcos, University of Augsburg

Armando Stellato, University of Rome Tor Vergata

Armando Stellato, PhD, is Associate Professor at the University of Rome, Tor Vergata, where he researches and teaches in the fields of Knowledge Engineering and Knowledge Based Systems, especially in the area of the Semantic Web. He is currently leading, under two projects funded by the ISA2 program, the development of VocBench, a platform for collaborative management of ontologies, thesauri, lexicons and datasets; ShowVoc, a dataset/metadata catalog and fruition platform; and SEPIA, a platform for semantic elicitation from unstructured content. He is also active in the field of Legal Informatics, where he collaborates with the Italian government on the semantic organization of Italian laws and works on a semantic representation model for legal acts, currently funded and under adoption by the Publications Office of the EU.

John McCrae, University of Galway

John McCrae is a lecturer above-the-bar at the Data Science Institute, Insight Centre for Data Analytics and ADAPT centre at the National University of Ireland Galway and the leader of the Unit for Linguistic Data. He is the coordinator of the Prêt-à-LLOD project and a work package leader in the ELEXIS infrastructure. His research interests include ontologies, lexicography and the lexicon-ontology interface; collaborative development and publishing of language resources; big data and data science; linked data and the Semantic Web; machine translation and multilingualism; machine learning methods for NLP; digital humanities; and under-resourced languages. He obtained his PhD from the National Institute of Informatics in Tokyo under the supervision of Nigel Collier, and until 2015 he was a post-doctoral researcher at the University of Bielefeld, Germany, in Prof. Philipp Cimiano's group, AG Semantic Computing.

Slavko Žitnik, University of Ljubljana (Course coordinator)

Slavko Žitnik is Associate Professor and Vice Dean for Education at the University of Ljubljana, Faculty of Computer and Information Science. His research is in the areas of natural language processing, information retrieval, information extraction, the Semantic Web, and information systems, with more than 100 bibliographic items. He actively collaborates with researchers from Université Paris 1 - Sorbonne, University of Belgrade, University of South Florida, and Harvard University. He has received multiple shared task awards and the University of Ljubljana award for extraordinary pedagogical, research, and artistic achievements. He teaches courses related to data science, databases, semantics, and natural language processing.

Andon Tchechmedjiev, IMT Mines Alès

Andon Tchechmedjiev is Associate Professor at Institut Mines Telecom in the EuroMov Digital Health in Motion (EDHM) interdisciplinary lab (IMT Mines Alès, University of Montpellier), at the crossroads of artificial intelligence, human movement science and embodiment, and medicine. He is co-PI of the Semantics and Taxonomy of Human Movement research axis and a member of the lab steering committee. He also co-leads the data ecosystem and governance theme in the Data & AI scientific community at Institut Mines Telecom.

Collaborators

  • Katerina Gkirtzou, Athena Research Center
  • Panagiotis Karioris, Athena Research Center
  • Gokhan Ozkan, Kırklareli University

Frequently Asked Questions

What web browser should I use?

Our German-UDS.academy platform works best with current versions of Chrome, Edge, Firefox, or Safari.

See our list of supported browsers for the most up-to-date information.