Next Web: web 3.0, semantic web and the future of the internet > Data Vocabularies




    Published on 13.2.2020 by Equipo GNOSS

    Press release from SEGITTUR announcing that it has just published a Manual of Good Practices in Semantics Applied to Tourism.

    The manual, prepared by SEGITTUR, is based on the recently published standard UNE 178503:2019, Destinos turísticos inteligentes. Semántica aplicada al turismo (Smart tourist destinations. Semantics applied to tourism). It is aimed at local tourism managers and at the technology development teams that support destinations; more information is available in the press release.

    At GNOSS we had the opportunity to present the Manual at the 2020 edition of FITUR, in a panel discussion with SEGITTUR and Globaldit. It is now available online and can be downloaded from the SEGITTUR website at the following link:

    Manual_BBPP_semántica_webok (Format: pdf, Size: 1295 KB)




    Published on 17.1.2018 by Equipo GNOSS

    Improvements to Linked Open Vocabularies (LOV)

    The Linked Open Vocabularies (LOV) project continues its work of removing the barriers that vocabulary selection can create for data publishers when developing their Linked Data projects.

    A recent paper, Linked Open Vocabularies (LOV): a gateway to reusable semantic vocabularies on the Web, which received the Semantic Web Outstanding Paper Award 2017, describes LOV as a catalogue of high-quality reusable vocabularies for describing data on the Web. The LOV initiative gathers and exposes indicators that had not previously been collected, such as the interconnections between vocabularies and their version history.




    Published on 2.6.2016 by Equipo GNOSS

    What happened to the Semantic Web? - Kingsley Idehen

    Kingsley Idehen, CEO of OpenLink Software, the creators of Virtuoso, sets out in this post his view of the current state of the Semantic Web.

    The post's provocative title is the starting point for rebutting the idea that the Semantic Web is an unfulfilled technological promise. Rather, what has happened is that its arrival lacked the spectacle some were expecting. In Kingsley Idehen's words: "In this post, I will demonstrate that as expected [1][2], its arrival was without fanfare, but we are inarguably there."

    The author gives two examples, both related to the search experience, particularly on Google.

    The first is the creation of the shared vocabulary by Google, Microsoft, Yahoo!, Yandex, and others.

    The second is the creation of Google's Knowledge Graph, applied indirectly in ordinary searches and directly in special searches (Custom Search Engine).

    These examples show that the basic goals of the Semantic Web have already been reached:

    • The web is full of HTML documents that include semantically enriched data.
    • These documents create a new Web dimension in which links no longer run only between documents but act as disambiguated names for any entity, allowing sentences to be built, in natural-language form, that encode and decode information (contextualised data) and are understandable by both users and machines (bots).
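
    As a minimal sketch of what an HTML document with "semantically enriched data" can look like, the Python snippet below wraps a JSON-LD description in the script tag commonly used to embed it in a page. The choice of JSON-LD and of schema.org terms is an assumption made for illustration, since the post summarised above does not prescribe a syntax; the example.org URLs and the `jsonld_script` helper are hypothetical.

```python
import json


def jsonld_script(entity: dict) -> str:
    """Wrap a JSON-LD description in the <script> tag used to embed it in HTML."""
    payload = json.dumps(entity, indent=2, ensure_ascii=False)
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Hypothetical description: the example.org URLs are placeholders, not real identifiers.
post = {
    "@context": "https://schema.org",
    "@type": "BlogPosting",
    "@id": "https://example.org/posts/semantic-web",
    "headline": "What happened to the Semantic Web?",
    "author": {
        "@type": "Person",
        "@id": "https://example.org/people/author",
        "name": "Kingsley Idehen",
    },
}

html_snippet = jsonld_script(post)
print(html_snippet)
```

    Each `@id` plays the role of a link used as a name: the post and its author are identified by URLs, and the `author` key states a relationship between the two that both people and bots can read.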

    In the author's words: "The fundamental goal of the Semantic Web Project has already been achieved. Like the initial introduction of the Web, there wasn't an official release date — it just happened!"




    Published on 28.5.2015 by Ricardo Alonso Maturana

    Semantic SEO for the Automotive Industry

    Cars are typically characterized by many technical features. Also, the location of offer and demand matters, both for used and new cars. This in combination with the vast amount of possibilities to configure a certain car model makes it very difficult to articulate the exact strengths and features of a certain make and model to potential customers, and makes the matchmaking process very complex.
    This talk shows examples of how leading brands in the automotive segment can combine GoodRelations, the Vehicle Sales Ontology, and brand-specific extensions to articulate their value proposition to both traditional Web search engines and to novel applications.

    This is a video recording of my talk at the London Semantic Tech & Business Conference 2011. For more information, see




    Published on 28.5.2015 by Ricardo Alonso Maturana

    Metaweb video - Freebase

    On July 16th 2010, when Metaweb announced their acquisition by Google, they also launched a video that explains what Metaweb/Freebase does, what entities are, etc.

    Video Transcript

    You know what drives me crazy about words? They have a million different meanings.

    Like, check this out: someone says, "I love Boston." Now, they probably mean, "I love Boston, the big city in Massachusetts", but they could be referring to one of the twenty-six other Bostons that are scattered around the globe. But, if it's during the playoffs, they're probably referring to the Celtics [basketball team]. Of course, you and I both hope that they're talking about the Boston. You know. [Image of rock band, sounds of electric guitar.]

    But, I guess there's really no way of knowing. The problem is that the same word can mean so many different things. Because of that, when it comes to finding, linking, reconciling, or organising multiple layers of information, words are not the best solution. The guys at grocery stores figured this out back in the sixties when they started putting barcodes on everything, so that products with the same name wouldn't get confused.

    So how come on the web, so many sites still try to organise stuff with words? Say you're a product guy at a big music site and you want to pull in feeds of lyrics and videos and photos from all of your data suppliers. But everyone uses different names for things, and a lot of the feeds don't even match up, so you've got to reconcile them, and pull in updates, and deal with merges and deletes and splits. It's a nightmare.

    But what if there was a better way?

    Welcome to Metaweb. Metaweb is a service that helps you build your website around entities, and not just words. Whoa, what's an entity? Well the simple answer is, it's a singular person, place, or thing.

    OK, well, let's compare that to text. Did you know that on the web there are more than 50 different ways people write "U. C. Berkeley"? [Examples listed: Cal Berkeley, Berkeley University, UCB, California, U of Cal, etc.] And they're really just talking about one single place, one entity. By mapping all those words to a single entity, as if it had its own barcode, you can combine all that information about U. C. Berkeley into one place.

    But that's just the beginning. Because entities represent unique, real-life things, we can build a map that shows how they're related. So, you can look for things that share certain attributes, like "actresses under 20 from New York". Can you imagine trying to find that with a keyword search? [Shows typical keyword search results, with keywords highlighted: "NY blogger under fire for criticizing actress", "March 3 2004: New! 20 steps to be an actress", "Kid actress eats 20 York peppermints".] Entities are just smarter than words.

    So, Metaweb's been in the process of identifying millions of these entities and mapping out how they're related, and what words other sites use to refer to them. And it's really cool because they have a totally collaborative process that involves the online community. This thing will always be expanding and improving.

    So, how is this going to help you? Well let's say you're that guy writing the movie review. If you tag the review with an entity in Metaweb, it's like you're looking at a menu saying, "Hey, Metaweb, give me the movie poster and a trailer and some links and maybe some other information like the release date and who was in it." And BAM, it'd be right there. And now, your page looks awesome!

    Or, say you're that product guy at the music site. Instead of spending months doing messy integrations and maintaining all those feeds, you can just plug in to Metaweb, and suddenly everything just works. It's like a switchboard for content on the web. [Various logos related to web content: eg. Twitter, Facebook, Audio Scrobbler, Wordpress.] And not only that! When your site's built on entities, new things get magically connected. Like, if one of your users adds a band to her profile page, or tags them in a comment, that can show up on the band page, because they're all linked under the hood to the same entity.

    Are you kidding me? This stuff sounds impossible! Well, that's what they said about the barcode.

    And it's not just movies and bands. Metaweb has millions of entities in thousands of categories: twelve million and counting!

    Metaweb makes your site smarter. It's time to connect to the web.
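
    The transcript's central idea, mapping many spellings to one entity and then looking entities up by attribute, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the identifiers, the alias table and the `resolve` and `find` helpers are hypothetical and do not reflect Metaweb's actual service or API.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    entity_id: str  # stable identifier, playing the role of the "barcode"
    name: str
    attributes: dict = field(default_factory=dict)


# Hypothetical data: identifiers and aliases are invented for this sketch.
ENTITIES = {
    "e1": Entity("e1", "University of California, Berkeley", {"type": "university"}),
    "e2": Entity("e2", "Boston", {"type": "city"}),
}
ALIASES = {
    "u. c. berkeley": "e1",
    "cal berkeley": "e1",
    "berkeley university": "e1",
    "ucb": "e1",
    "boston": "e2",
}


def resolve(text: str):
    """Map a free-text name to its entity, or None if the name is unknown."""
    entity_id = ALIASES.get(text.strip().lower())
    return ENTITIES.get(entity_id) if entity_id else None


def find(**wanted):
    """Return the entities whose attributes match every requested key/value."""
    return [
        e for e in ENTITIES.values()
        if all(e.attributes.get(k) == v for k, v in wanted.items())
    ]


print(resolve("UCB").name)
print([e.name for e in find(type="city")])
```

    Because every alias points at the same identifier, anything attached to that identifier is shared by all the spellings, which is the property the transcript compares to a barcode.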




    Published on 16.1.2015 by Equipo GNOSS

    New version of the Linked Open Vocabularies (LOV) application

    A new version of the Linked Open Vocabularies (LOV) application has been released after substantial re-engineering. It uses MongoDB and ElasticSearch to provide fast access to the data, and NodeJS to deliver a clean, fast user interface.

    The LOV project, now almost 4 years old, incorporates the following improvements:

    • Tags for vocabularies instead of hierarchical categories (e.g. "Time").
    • Fast free-text search over 469 vocabularies, more than 46,000 terms, and 462 agents (creators, contributors, publishers).
    • A set of APIs for accessing LOV data.
    • A SPARQL endpoint over the LOV records and the latest version of each vocabulary.
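
    As a rough sketch of how a SPARQL endpoint like the one listed above is typically queried, the Python snippet below builds a SPARQL 1.1 Protocol GET request URL. The endpoint address is a placeholder, not LOV's real URL, and the query assumes that vocabularies are typed as `voaf:Vocabulary` and carry a `vann:preferredNamespacePrefix`; both points should be checked against the LOV documentation before use.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Placeholder endpoint: substitute the SPARQL endpoint URL published by LOV.
ENDPOINT = "https://example.org/sparql"

# Assumed data model: vocabularies typed voaf:Vocabulary with a preferred prefix.
QUERY = """
PREFIX voaf: <http://purl.org/vocommons/voaf#>
PREFIX vann: <http://purl.org/vocab/vann/>
SELECT ?vocab ?prefix WHERE {
  ?vocab a voaf:Vocabulary ;
         vann:preferredNamespacePrefix ?prefix .
}
ORDER BY ?prefix
LIMIT 10
"""


def sparql_get_url(endpoint: str, query: str) -> str:
    """Build a SPARQL Protocol GET URL; the query travels in the 'query' parameter."""
    return f"{endpoint}?{urlencode({'query': query.strip()})}"


url = sparql_get_url(ENDPOINT, QUERY)
print(url)
```

    The snippet only constructs the URL and performs no network request. A client would normally send it with an `Accept: application/sparql-results+json` header to receive the result bindings as JSON.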




    Published on 30.9.2014 by Pablo Hermoso de Mendoza González

    This article by Harry Halpin, Ivan Herman and Patrick Hayes analyses the use of owl:sameAs to link data across different datasets within the global Linked Open Data project.

    The authors state that the community working on the project considers that owl:sameAs sometimes establishes incorrect links and identities, and they propose other approaches to defining identity. In fact, owl:sameAs can be regarded as just one kind of "identity link", a link declaring that two items are identical in some respect.

    They propose four alternative readings of owl:sameAs, depending on the case:

    • Same Thing As but Referentially Opaque
    • Same Thing As but Different Context
    • Represents
    • Very Similar To
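
    To see why a single incorrect owl:sameAs link matters, recall that under its strict reading owl:sameAs is symmetric and transitive, so every linked URI ends up in one identity cluster. The Python sketch below computes those clusters with a small union-find; the URIs are hypothetical and the code is an illustration, not something taken from the article.

```python
from collections import defaultdict


def same_as_clusters(links):
    """Group URIs into identity clusters, treating sameAs as symmetric and transitive."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps the trees shallow
            x = parent[x]
        return x

    for a, b in links:
        parent[find(a)] = find(b)  # merge the two clusters

    clusters = defaultdict(set)
    for uri in list(parent):
        clusters[find(uri)].add(uri)
    return list(clusters.values())


# Hypothetical links: the second one wrongly equates a city with a rock band.
links = [
    ("a:Boston", "b:Boston_Massachusetts"),
    ("b:Boston_Massachusetts", "c:Boston_band"),
]
print(same_as_clusters(links))
```

    With the strict reading, the one bad link places the city and the band in the same cluster, so every statement about one would also hold for the other. Weaker readings such as "Very Similar To" avoid that merge.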




    Published on 10.3.2014 by Pablo Hermoso de Mendoza González

    Library Linked Data Incubator Group: Datasets, Value Vocabularies, and Metadata Element Sets

    The aim of the W3C Library Linked Data Incubator Group, which ran from May 2010 to August 2011, was "to help increase the global interoperability of library data on the Web by bringing together people involved in Semantic Web activities, focused on Linked Data, in libraries and related institutions, by examining ongoing initiatives and identifying future avenues for collaboration". Linked Data is expressed using standards such as the Resource Description Framework (RDF), which specifies relationships between things, and Uniform Resource Identifiers (URIs, or "Web addresses").

    This report on Datasets, Value Vocabularies, and Metadata Element Sets supplements the group's main report. Drawing on the data gathered in the Use Cases and on contributions from the group's experts, it summarises the current state of the structural components of Linked Data, especially those most closely related to efforts in the library field.
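
    As a small illustration of how RDF relates things identified by URIs, the Python sketch below writes two statements about a book in N-Triples syntax. The example.org URIs and the `ntriple` helper are hypothetical; the Dublin Core properties are used as an assumption about what a library record might contain, and the literal escaping is deliberately simplified.

```python
def ntriple(subject: str, predicate: str, obj: str, literal: bool = False) -> str:
    """Serialise one RDF statement as an N-Triples line (simplified escaping)."""
    if literal:
        escaped = (
            obj.replace("\\", "\\\\")
            .replace('"', '\\"')
            .replace("\n", "\\n")
            .replace("\r", "\\r")
        )
        target = f'"{escaped}"'
    else:
        target = f"<{obj}>"
    return f"<{subject}> <{predicate}> {target} ."


# Hypothetical identifiers for a book and its author.
book = "https://example.org/book/1"
lines = [
    ntriple(book, "http://purl.org/dc/terms/creator", "https://example.org/person/7"),
    ntriple(book, "http://purl.org/dc/terms/title", "Don Quixote", literal=True),
]
print("\n".join(lines))
```

    The first statement links two URIs, which is what lets datasets from different libraries point at the same author; the second attaches a plain text value to the book.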




    Published on 5.3.2014 by Equipo GNOSS

    Three Linked Data Vocabularies are W3C Recommendations

    • The Data Catalog (DCAT) Vocabulary is used to provide information about available data sources. When data sources are described using DCAT, it becomes much easier to create high-quality integrated and customized catalogs including entries from many different providers. Many national data portals are already using DCAT.
    • The Data Cube Vocabulary brings the cube model underlying SDMX (Statistical Data and Metadata eXchange, a popular ISO standard) to Linked Data. This vocabulary enables statistical and other regular data, such as measurements, to be published and then integrated and analyzed with RDF-based tools.
    • The Organization Ontology provides a powerful and flexible vocabulary for expressing the official relationships and roles within an organization. This allows for interoperation of personnel tools and will support emerging socially-aware software.
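
    To make the DCAT item above more concrete, here is a minimal sketch of a catalog entry written as JSON-LD from Python. The class and property names (`dcat:Catalog`, `dcat:Dataset`, `dcat:Distribution`, `dcat:downloadURL`) come from the DCAT vocabulary, while the example.org URLs, the titles and the `download_urls` helper are hypothetical.

```python
import json

# Hypothetical catalog: all example.org URLs and titles are invented.
catalog = {
    "@context": {
        "dcat": "http://www.w3.org/ns/dcat#",
        "dct": "http://purl.org/dc/terms/",
    },
    "@id": "https://example.org/catalog",
    "@type": "dcat:Catalog",
    "dct:title": "Example open data catalog",
    "dcat:dataset": [
        {
            "@id": "https://example.org/dataset/air-quality",
            "@type": "dcat:Dataset",
            "dct:title": "Air quality measurements",
            "dcat:distribution": [
                {
                    "@type": "dcat:Distribution",
                    "dcat:downloadURL": {"@id": "https://example.org/files/air-quality.csv"},
                    "dcat:mediaType": "text/csv",
                }
            ],
        }
    ],
}


def download_urls(cat: dict) -> list:
    """Collect every distribution download URL listed in the catalog."""
    return [
        dist["dcat:downloadURL"]["@id"]
        for dataset in cat.get("dcat:dataset", [])
        for dist in dataset.get("dcat:distribution", [])
    ]


print(json.dumps(catalog, indent=2))
print(download_urls(catalog))
```

    Because every provider uses the same property names, a harvester can walk entries from many catalogs with one function like `download_urls`, which is the integration benefit the announcement describes.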


    The Getty Vocabularies: project to publish as Linked Open Data (Getty Research Institute)

    The Getty has a project to publish its vocabularies as Linked Open Data.

    At the LODLAM Summit 2013, they said that planning for the publication of all four Getty vocabularies as Linked Open Data (LOD) is well underway. It is anticipated that the data will be published under the ODC-BY 1.0 license. They will begin with AAT and then move on to TGN; ULAN will follow from AAT and TGN, and CONA from all three. They also intend to publish LOD versions of their lookup lists (e.g., languages, roles, nationalities, place types, and bibliographic sources).

    What is the history of the Getty vocabularies?
    Work on the AAT began in the late 1970s in response to a need expressed by art libraries, art journal indexing services, and catalogers of museum objects and visual resource collections for a controlled vocabulary to encourage consistency in cataloging and more efficient retrieval of information. While controlled headings and terminology were already common in the field of bibliographic cataloging, and thesauri for cataloging in the sciences were by then well established, the use of a thesaurus for indexing was not welcomed by art catalogers prior to the advent of computerized cataloging. The original core AAT terms were derived from scattered local lists and other sources, in consultation with a panel of experts in architecture and art. The AAT was first published, in print form, in 1990.

    Work on the ULAN began in 1984, when the Getty merged and coordinated controlled vocabulary resources for use by the J. Paul Getty Trust's many automated documentation projects. The AAT was already being managed by the Getty at this time, and the Getty attempted to respond to requests from Getty projects for additional controlled vocabularies for artists' names (ULAN) and geographic names (TGN). In 1987 the Getty created a department dedicated to compiling and distributing terminology. Although originally intended only for use by Getty projects, in response to requests from the broader community, the ULAN was first published in 1991, in print form, according to the tenets previously established for the construction and maintenance of the AAT.

    Work on the TGN began in 1987. Its development was informed by an international study completed by the Thesaurus Artis Universalis (TAU), a working group of the Comité International d'Histoire de l'Art (CIHA), and by the consensus reached at a colloquium held in 1991, attended by the spectrum of potential users of geographic vocabulary in cataloging and scholarship of art and architectural history and archaeology. The TGN was first published, on the Web, in 1997.

    Work on CONA began in 2004, when detailed discussions were undertaken regarding the Getty Vocabulary Program compiling a vocabulary comprising unique numeric identifiers and brief records for art works. CONA is scheduled to be available for user contributions in 2011; the online "browser" is scheduled to launch in early 2012.

    Learn more about the scope and history of each vocabulary at About the AAT, About the ULAN, About the TGN, and About CONA.
