CaLiGraph v2.1

CaLiGraph is a large semantic knowledge graph with a rich ontology compiled from the DBpedia ontology and Wikipedia categories and list pages. The ontology is enriched with fine-grained value restrictions on its classes, which are discovered with the Cat2Ax approach. A large number of CaLiGraph's entities are extracted from Wikipedia listings through a combination of ontological information and transformer-based extractors.

Check out the core ideas of the extraction

Statistics

How deep is your graph?

CaLiGraph has a rich ontology with more than a million classes and over 200,000 restrictions. Additionally, it contains information about more than 10 million entities.

Resources

... and how to use them

We provide you with all the data that you need to use CaLiGraph. Furthermore, we point you to the research publications that explain how the graph is created.

News

CaLiGraph v2.0 Release

CaLiGraph now contains entities from arbitrary listings in Wikipedia.

With this release, we incorporate the results of the paper "Information Extraction from Co-Occurring Similar Entities" into the CaLiGraph extraction framework. Through the extraction of new entities and facts from arbitrary listings (i.e. tables or enumerations) in Wikipedia, CaLiGraph now describes more than 10 million entities.

CaLiGraph paper accepted at WWW'21

The paper "Information Extraction from Co-Occurring Similar Entities" has been accepted at The Web Conference 2021.

In this paper, we describe how to extract information from listings all over Wikipedia to enrich CaLiGraph with many additional entities and facts. The ideas described in the paper will be incorporated into version 2.0 of CaLiGraph, which is soon to be published.

CaLiGraph v1.3 Release

CaLiGraph now incorporates the most recent data from Wikipedia and DBpedia.

We updated the CaLiGraph extractors to work with recent dumps from Wikipedia and DBpedia using the DBpedia Databus. With the release of CaLiGraph 1.3, we publish a version that is based on a Wikipedia dump from November 2020. This means that CaLiGraph now contains more than 1 million classes and almost 9 million entities!

CaLiGraph v1.0 Release

The first version of CaLiGraph is released and the website is launched.

On October 29, 2019, the paper "Uncovering the Semantics of Wikipedia Categories", which describes the foundations used for the extraction of CaLiGraph, is presented at the International Semantic Web Conference.