Analysing Debian packages with Neo4j

Speaker: Norbert Preining

Track: Packaging, policy, and Debian infrastructure

Type: Long talk (45 minutes)

Room: Yushan (玉山)

Time: Jul 29 (Sun), 15:00

Duration: 0:45

We present our work towards representing Debian's packages, including their history and releases, as well as other components of the Debian environment, in a graph database.

The Ultimate Debian Database (UDD) collects a variety of data about Debian and Ubuntu: packages and sources, bugs, and the history of uploads, to name just a few. Its database schema is that of a highly denormalized relational database. In this ongoing work we extract (some) data from UDD and represent it as a graph database.

The presentation will give a short introduction to the life cycle and structure of Debian packages, followed by the graph database schema (nodes and relationships). After going through some of the queries used on the UDD web pages, we will show how they can be translated to Cypher.
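
To make the idea of nodes and relationships concrete, here is a minimal Python sketch of a tiny in-memory property graph and one query over it. The labels (SourcePackage, BinaryPackage, Suite), the relationship types (BUILDS, CONTAINED_IN), and the sample data are assumptions made for illustration; they are not the schema presented in the talk, and the Cypher pattern in the comment is only a rough equivalent under those assumed names.

```python
from collections import defaultdict


class Graph:
    """Tiny in-memory property graph: labelled nodes, typed directed edges."""

    def __init__(self):
        self.nodes = {}                 # node id -> (label, properties)
        self.edges = defaultdict(list)  # (source id, rel type) -> [target ids]

    def add_node(self, node_id, label, **props):
        self.nodes[node_id] = (label, props)

    def add_edge(self, src, rel_type, dst):
        self.edges[(src, rel_type)].append(dst)

    def neighbours(self, src, rel_type):
        return list(self.edges.get((src, rel_type), []))


# Hypothetical sample data; labels and relationship types are illustrative.
g = Graph()
g.add_node("src:hello", "SourcePackage", name="hello")
g.add_node("bin:hello", "BinaryPackage", name="hello")
g.add_node("bin:hello-doc", "BinaryPackage", name="hello-doc")
g.add_node("suite:sid", "Suite", name="sid")
g.add_edge("src:hello", "BUILDS", "bin:hello")
g.add_edge("src:hello", "BUILDS", "bin:hello-doc")
g.add_edge("src:hello", "CONTAINED_IN", "suite:sid")


def binaries_built_from(graph, source_name):
    """Names of binary packages built from the named source package.

    Roughly what a Cypher pattern such as
      MATCH (s:SourcePackage {name: 'hello'})-[:BUILDS]->(b:BinaryPackage)
      RETURN b.name
    would express, under the assumed labels above.
    """
    result = []
    for node_id, (label, props) in graph.nodes.items():
        if label == "SourcePackage" and props.get("name") == source_name:
            for target in graph.neighbours(node_id, "BUILDS"):
                target_label, target_props = graph.nodes[target]
                if target_label == "BinaryPackage":
                    result.append(target_props["name"])
    return sorted(result)


print(binaries_built_from(g, "hello"))  # ['hello', 'hello-doc']
```

In a relational layout the same question would typically need a join between a sources table and a packages table; in the graph form it is a single traversal along one relationship type.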

We close with an outlook on our future plans and open problems.
