Representing Hierarchies in Relational Databases
2018-05-23, 14:00–14:30, 🏰

In this talk, I’ll explain the fundamental problem representing deep hierarchies in relational databases. To address this problem, we can use a database design pattern, named Materialized Path Trees.

Many data structures require a representation, where one parent node can have any arbitrary number of children. Inside relational databases, this typically is represented by a foreign key onto its own table. In Django’s ORM, we use models.ForeignKey('self', ...), to create this kind of recursive relationship.
The major problem with this kind of representation is, that it doesn’t scale for deep trees. Whenever we have to traverse the tree from a given starting node, our code has to perform one database query per hierarchy level.
To circumvent this, some database vendors implemented SQL dialects, to fetch a whole subtree with one query. Long time ago, Oracle for instance implemented 'CONNECT BY', which is proprietary and not part of the SQL standard. Nowadays, newer releases of most major database vendors implemented the 'WITH RECURSIVE' clause, which has been added to the SQL-99 standard. This allows us to build recursive queries.

Fortunately there is a clever recipe to represent hierarchies in relational databases using standard SQL techniques, but without the mentioned scaling problem: Materialized Path Trees, discovered by Vadim Tropashko. Django’s ecosystem offers two libraries, which implement this design pattern: django-mptt and django-treebeard. I also would like to mention django-tree, which only works on Postgres, using their SQL extension mentioned before.
In this talk I’ll explain the design patterns for Materialized Path Trees. Furthermore I’ll show the pros and cons of both libraries.

After graduating from the University of Innsbruck in CS and Physics, Jacob worked for different companies as a system architect and software developer, where he built and scaled websites in Perl, PHP and Javascript. After dealing with a lot of unmaintainable and poorly written code, Jacob fell in love with Python and Django and hasn’t looked back since. Jacob is a strong believer in sharing knowledge and maintains a dozen of quite popular OSS project on his GitHub account.