Hierarchy Map [HM] Pattern
How to store variable depth hierarchies in a BI query friendly format.
Variable Depth Hierarchy Problem
Company ownership is an example of a variable depth hierarchy. In the organisation chart above each node at each level in this hierarchy is a business customer, all ultimately owned by Pomegranate Corp. The hierarchy can vary in depth and change shape at any time as companies are acquired, merged or sold off.
A consulting firm working with each of the above corporate clients individually might not consider Pomegranate or POM Computing significant based on direct billing alone. However, if ownership data is available in an analytically friendly form KPIs can easily be recalculated by rolling up billing metrics to any level in the hierarchy and a true picture of client importance might emerge along with opportunities to cross/up-sell services.
The ownership above can be described as a 'many to one (M:1) recursive relationship' in which each customer can own many other customers and many customers can all be owned by one other customer. The relationships are optional as a customer can be independent, privately owned and not own any other customers. On an Entity Relationship Diagram (ERD) the relationships can be visualised like this:
This would typically be implemented in a relational database by adding a parent customer foreign key column. Foreign keys normally point to the primary key in another table but in this case it would be a recursive key pointing back to the primary key of the same table as in the example below taken from chapter 6 of Agile Data Warehouse Design:
Unfortunately, while the above design succinctly stores the information it fails to publish it in a form that can easily be used by ad hoc BI tools. It is possible to answers questions by writing custom applications using loops and recursion code to aggregate can only generate non-procedural SQL. Perhaps this is why the M:1 recursive relationship is often nicknamed a 'head-scratcher' when drawn as in the above ERD. (Be careful how you draw it or you might be scratching another part of your anatomy.)
The Hierarchy Map (HM) pattern (aka hierarchy helper) solves the ad hoc query problem by storing every company to subsidiary relationship no matter how distant. I.e. not just the direct parent:child relationships found in the raw data but the derived indirect relationships between parent:grandchild, parent:great-grandchild etc as show in the Company Structure example below.
The design of this table, particularly the importance of the sequence number column for displaying the hierarchy correctly and how it works with type 2 slowly changing dimensions (SCD) is discussed on pages 176-185 of Agile Data Warehouse Design.
Download Example Code
Hierarchy maps can be build and maintained by stored procedures such as the downloadable example here:
Company Structure M:1 HM (Microsoft SQL Server) [ZIP] Author: Man Mohan Singh
The solution pack contains:
- DDL for building a Company source table, Company HV dimension and Company Structure HM
- Simple SQL scripts for loading and updating the source table with Pomegranate example data
- Stored procedure for maintaining the type 2 SCD
- Store procedure for loading and rebuilding the Company Structure hierarchy map
- Instructions for running the code, a spreadsheet of test results for various update scenarios and contact details for feedback and more information
(coming shortly) Company Structure M:1 HM (Oracle) [ZIP] Author: Lawrence Corr
Please note: the above code is offered freely without warranty.
The solution above is designed to cope with the M:1 recursive relationship. The hierarchy map pattern can be expanded to solve Many to Many (M:M) recursive relationships that represent variable depth multi-parent hierarchies such as bill of materials, HR dotted line relationships and partial ownership. This involves storing additional rows for the multiple parent relationships and weighting factors for allocating child facts across multiple parent values.
Using a Hierarchy Map
Hierarchy maps may appear complex but once built, using them in a query is simple. By joining a dimension to a fact table through a hierarchy map as below it is possible to roll up all child facts to any parent level by performing only one additional join.