LD4IE 2014: 2nd International Workshop on Linked Data for Information Extraction.

This workshop focuses on the exploitation of Linked Data for Web-scale Information Extraction (IE), which concerns extracting structured knowledge from unstructured and semi-structured documents on the Web.
  • When: 20/10/2014, from 08:00 to 18:00 (Europe/Berlin, UTC+2)
  • Where: Riva del Garda - Trento, Italy

LD4IE 2014 is the second international workshop on Linked Data for Information Extraction, following a successful edition last year in Sydney.

The World Wide Web provides access to tens of billions of pages, most of which contain largely unstructured information intended only for human readers. Linked Data, on the other hand, provide billions of pieces of information that are linked together and made available for automated processing. However, there is a lack of interconnection between the information in Web pages and Linked Data. A number of initiatives, like RDFa (supported by W3C) or Microdata (used by schema.org and supported by major search engines), are trying to enable machines to make sense of the information contained in human-readable pages by providing the ability to annotate webpage content with Linked Data.
This creates a large knowledge base of entities and concepts, connected by semantic relations. Such resources can be valuable seed data for IE tasks. Furthermore, the annotated web pages can be considered as training data in the traditional machine learning paradigm.

However, powering Web-scale IE using Linked Data faces major challenges, including discovering relevant learning materials, which is non-trivial due to the heterogeneity of vocabularies, the imbalanced coverage of different domains, and the presence of noise, errors, imprecision and spam.

Addressing these challenges requires a collaborative, multidisciplinary research effort covering topics such as modelling IE tasks with respect to Linked Data; efficient and robust learning algorithms that scale to large data and cope with noise; measures for assessing the quality of learning material; and methods for selecting and optimizing training seeds.

Published 02/10/2014, last modified 05/10/2022
Contributors: claudia.damato