CaLiGraph v3.1

CaLiGraph is a large semantic knowledge graph with a rich ontology compiled from the DBpedia ontology and Wikipedia categories & list pages. The ontology is enriched with fine-grained value restrictions on its classes that are discovered with the Cat2Ax approach. A large number of CaLiGraph's entities is extracted from Wikipedia listings through a combination of the ontological information and transformer-based extractors.

Check out the core ideas of the extraction

Statistics

How deep is your graph?

CaLiGraph has a rich ontology with more than a million classes and over 260 thousand restrictions. Additionally, it contains information about almost 15 million entities.

Show more

Resources

.. and how to use them

We provide you with all the data that you need to use CaLiGraph. Furthermore, we point you to the research publications that explain how the graph is created.

Show more

News

CaLiGraph paper accepted at ESWC'23

The paper NASTyLinker: NIL-Aware Scalable Transformer-based Entity Linker has been accepted at the 20th edition of the Extended Semantic Web Conference (ESWC'23).

In this paper we describe how to reliably identify and disambiguate novel subject entities in listings all over Wikipedia in order to extend the number of entities in CaLiGraph. The ideas described in the paper are incorporated in version 3.0 of CaLiGraph.

CaLiGraph paper accepted at DL4KG Workshop @ ISWC'22

The paper Transformer-based Subject Entity Detection in Wikipedia Listings has been accepted at Deep Learning 4 Knowledge Graphs Workshop at ISWC'22.

In this paper we describe how to reliably identify subject entities in listings all over Wikipedia in order to extend the number of entities in CaLiGraph. The ideas described in the paper are incorporated in version 2.1 of CaLiGraph.

CaLiGraph wins Task 1 of SemREC @ ISWC'21

CaLiGraph wins Task 1 of the Semantic Reasoning Evaluation Challenge collocated with the 20th International Semantic Web Conference (ISWC'21).

The Semantic Reasoning Evaluation Challenge (SemREC) is an effort to foster the development of new reasoners that can keep up with the complexity and size of recent ontologies. In Task 1, the participants were asked to submit an ontology that poses a challenge for reasoners. The CaLiGraph ontology has been awarded the 1st place in this category. Find details about the submission here.

CaLiGraph v2.0 Release

CaLiGraph now contains entities from arbitrary listings in Wikipedia.

With this release, we incorporate the results of the paper "Information Extraction from Co-Occurring Similar Entities" into the CaLiGraph extraction framework. Through the extraction of new entities and facts from arbitrary listings (i.e. tables or enumerations) in Wikipedia, CaLiGraph now describes more than 10 million entities.

CaLiGraph paper accepted at WWW'21

The paper Information Extraction from Co-Occurring Similar Entities has been accepted at The Web Conference 2021.

In this paper we describe how to extract information from listings all over Wikipedia to enrich CaLiGraph with many additional entities and facts. The ideas described in the paper will be incorporated in version 2.0 of CaLiGraph, which is soon to be published.

CaLiGraph v1.3 Release

CaLiGraph now incorporates the most recent data from Wikipedia and DBpedia.

We updated the CaLiGraph extractors to work with recent dumps from Wikipedia and DBpedia using the DBpedia Databus. With the release of CaLiGraph 1.3, we publish a version that is based on a Wikipedia dump from November 2020. This means that CaLiGraph now contains more than 1 million classes and almost 9 million entities!

CaLiGraph v1.0 Release

The first version of CaLiGraph is released and the website is launched.

On October 29, 2019 the paper Uncovering the Semantics of Wikipedia Categories, which describes the foundations used for the extraction of CaLiGraph, is presented at the International Semantic Web Conference.