Flattening a parent-child hierarchy

Posted on January 16, 2011

1



Flattening a parent-child hierarchy can be very useful for both SQL and SQL Server Analysis Services, since SSAS cannot index Parent-Child hierarchies. It should be noted that BIDS Helper, the excellent download available at http://bidshelper.codeplex.com/ provides functionality to “naturalize” a parent-child hierarchy. We are going to build on what we did with the Enumerated Path method.

What I will present here is how to do this to load a table.

First we create our products table as per the previous examples:

IF OBJECT_ID('dbo.Product') IS NOT NULL DROP TABLE dbo.Product
GO
CREATE TABLE dbo.Product(ProductID int not null PRIMARY KEY
, ProductName varchar(100)
, ParentProductID INT)
GO
INSERT INTO dbo.Product(ProductID, ProductName, ParentProductID)
VALUES (0,'Baby Goods',NULL)
, (10,'Baby Food',0)
, (20,'Nappies',0)
, (100,'All Ages Baby Food',10)
, (150,'Beginners',10)
, (200,'Strawberry Yoghurt',100)
, (250,'Baby Cereal',100)
, (300,'Formula Milk',150)
, (310,'Heinz Formula',300)
, (400,'Small Pack Nappies',20)
, (450,'Bulk Pack Nappies',20)
GO

I’m going to load the results of the recursive CTE into a temp table for readability and again use an enumerated path method to provide a TreePath.

IF OBJECT_ID('tempdb..#ProductList') IS NOT NULL
DROP TABLE #ProductList
GO
;WITH RecursiveCTE
As 
( SELECT  p.ProductName
 , p.ProductID
 , CAST(p.ProductName AS VARCHAR(MAX)) AS TreePath
 , ParentProductID  
 FROM dbo.Product p  
 WHERE p.ParentProductID IS NULL
 UNION ALL
 SELECT p.ProductName
  , p.ProductID
  , CAST(H.TreePath + ','
   + CAST(p.ProductName AS VARCHAR(100)) AS VARCHAR(MAX))
   AS TreePath
  , p.ParentProductID  
 FROM dbo.Product p  
 INNER JOIN RecursiveCTE H
 ON H.ProductID = p.ParentProductID  
)  
SELECT ProductName
 , ProductID
 , ParentProductID 
 , TreePath
INTO #ProductList
FROM RecursiveCTE

We then convert the TreePath column to XML:

;WITH CTE AS (
SELECT ProductID
, ProductName
, CONVERT(XML,'<Product><Lvl>'
 + REPLACE([TreePath],',', '</Lvl><Lvl>') + '</Lvl></Product>')
 AS Prod_Hierarchy
FROM #ProductList a
WHERE NOT EXISTS (SELECT 1
  FROM #ProductList b
  WHERE a.ProductID = b.ParentProductID)
)
SELECT *
FROM CTE

This produces:

Now, to ensure that we end up with a uniform hierarchy, we shred the XML and use a COALESCE down the levels. So that the child gets repeated up the hierarchy until it meets the parent at the right level. This is important when using this in SSAS. See this blog by Chris Webb:

http://cwebbbi.wordpress.com/2009/11/11/ragged-hierarchies-hidememberif-and-mdx-compatibility/

Thanks to Boyan Penev for pointing out the importance of this to me when I had shown him example.

So our final SQL looks like this:

;WITH CTE AS (
SELECT ProductID, ProductName
 ,CONVERT(XML,'<Product><Lvl>'
        + REPLACE([TreePath],',', '</Lvl><Lvl>') + '</Lvl></Product>')
 AS Prod_Hierarchy
FROM #ProductList a
WHERE NOT EXISTS (SELECT 1
  FROM #ProductList b
  WHERE a.ProductID = b.ParentProductID)
)
, LevelBreakdown AS (
SELECT    ProductID
 , ProductName
 , Prod_Hierarchy.value('/Product[1]/Lvl[1]','varchar(25)') AS [Level1]
 , Prod_Hierarchy.value('/Product[1]/Lvl[2]','varchar(25)') AS [Level2]
 , Prod_Hierarchy.value('/Product[1]/Lvl[3]','varchar(25)') AS [Level3]
 , Prod_Hierarchy.value('/Product[1]/Lvl[4]','varchar(25)') AS [Level4]
 FROM cte)
SELECT
  Level1
, COALESCE(Level2, Level1) AS Level2
, COALESCE(Level3, Level2, Level1) AS Level3
, COALESCE(Level4, Level3, Level2, Level1) AS Level4
, ProductName
, ProductID
FROM LevelBreakdown

This produces:

Thus

Becomes

The obvious downside with method is the static nature of the levels, especially when being referenced from a cube and/or in a report.  So if you have a dimension that is changing the number of levels often, you may want to give this a miss. If you are in the situation where you have an organizational change only once a year, this method of dealing with hierarchies could work well. While the BIDS Helper will create a view in your database, the solution provided above allows for easy indexing.

Advertisements
Posted in: SQL, SSAS