What is the HierarchyID Data Type?

Hierarchical data is a common type of data that represents the relationships between entities in a tree-like structure. For example, you may have hierarchical data for your organizational structure, where each employee has a manager and a department; or for your file system, where each folder can contain subfolders and files; or for your website, where each page can have subpages and links.

Storing and querying hierarchical data can be challenging in a relational database, where the data is organized in tables with rows and columns. You may need to use complex joins or recursive queries to traverse the hierarchy or find the ancestors or descendants of a given node. You may also need to use additional columns or tables to store the level or order of the nodes in the hierarchy.

SQL Server (2008 and later) provides a solution for these challenges with the HierarchyID data type. The HierarchyID data type is a variable-length system data type that represents a position in a hierarchy. It has a special way of representing the relationships between the nodes in a hierarchy from top to bottom levels and from left to right among the children nodes of a parent node.

The HierarchyID data type uses a compact binary format that encodes the path from the root node to any node in the hierarchy. For example, the root node has a HierarchyID value of /, while its first child has a value of /1/, and its second child has a value of /2/. The first child of the first child has a value of /1/1/, and so on. This format allows you to compare and sort HierarchyID values easily using simple operators such as <, >, =, or <>.

The HierarchyID data type also provides several methods that allow you to perform common hierarchical operations on HierarchyID values.

For example, you can use the GetAncestor() method to find the parent node of a given node; or the GetDescendant() method to find the next child node of a given node; or the IsDescendantOf() method to check if one node is a descendant of another node.

Let’s say we have a table called Employees that stores the hierarchy of employees in an organization. 

The table has the following columns:

  • Id: int, the primary key of the table
  • Name: varchar(50), the name of the employee
  • Position: hierarchyid, the position of the employee in the hierarchy
  • Level: as computed, the level of the employee in the hierarchy

The table has the following data:

Id

Name

Position

Level

1

Alice

/

0

2

Bob

/1/

1

3

Carol

/2/

1

4

Dave

/1/1/

2

5

Eve

/1/2/

2

6

Frank

/2/1/

2

7

Grace

/1/1/1/

3

8

Harry

/2/1/1/

3

The hierarchy looks like this:

Alice

├── Bob

│   ├── Dave

│   │   └── Grace

│   └── Eve

└── Carol

    └── Frank

        └── Harry

Querying Hierarchical Data Using the HierarchyID Data Type

The HierarchyID data type provides several methods that allow you to perform common hierarchical operations on HierarchyID values.

To find the parent node of a given node, you can use the GetAncestor() method with an argument of 1. For example, if you want to find the parent node of Eve, you can use GetAncestor (1) to return `/1/`, which is Bob’s position. 

The code would look like this:

DECLARE @CurrentNode hierarchyid;
SELECT @CurrentNode = Position FROM Employees WHERE Name = 'Eve';
SELECT Name FROM Employees WHERE Position = @CurrentNode.GetAncestor(1); 
-- returns Bob

To find the next child node of a given node, you can use the GetDescendant() method with two arguments that specify the left and right boundaries of the child node. For example, if you want to find the next child node of Bob, you can use GetDescendant (NULL, NULL) to return `/1/3/`, which is a new position that does not exist yet, or GetDescendant (‘/1/2/’, NULL) to return `/1/3/`, which is also a new position. 

The code would look like this:

DECLARE @CurrentNode hierarchyid;
SELECT @CurrentNode = Position FROM Employees WHERE Name = 'Bob';
SELECT @CurrentNode.GetDescendant(NULL, NULL); -- returns /1/3/
SELECT @CurrentNode.GetDescendant('/1/2/', NULL); -- returns /1/3/

To check if one node is a descendant of another node, you can use the IsDescendantOf() method with an argument that specifies the parent node. For example, if you want to check if Harry is a descendant of Alice, you can use IsDescendantOf (‘/’) to return 1 (true), which means that Harry is a descendant of Alice; or IsDescendantOf (‘/2/’) to return 0 (false), which means that Harry is not a descendant of Carol. 

The code would look like this:

DECLARE @ChildNode hierarchyid;
DECLARE @ParentNode hierarchyid;
SELECT @ChildNode = Position FROM Employees WHERE Name = 'Harry';
SELECT @ParentNode = '/';
SELECT @ChildNode.IsDescendantOf(@ParentNode); -- returns 1 (true)
SELECT @ParentNode = '/2/';
SELECT @ChildNode.IsDescendantOf(@ParentNode); -- returns 0 (false)

Examples of “Hierarchical Data”

Data that would benefit from the HierarchyID data type is more common than you might think. Here are some real-world examples of hierarchy structures that could potentially be in your SQL Server database:

  1. Organizational structure: You can use HierarchyID to store the reporting relationships between employees in your organization. You can also use it to query the hierarchy for various purposes, such as finding the direct reports of a manager, finding the managers of an employee, finding the employees in a department, etc.
  2. Folders, subfolders, and files: You can use HierarchyID to store the structure of your file system. You can also use it to query the hierarchy for various purposes, such as finding the files in a folder, finding the subfolders of a folder, finding the path of a file, etc.
  3. Tasks and subtasks in a project: You can use HierarchyID to store the breakdown of tasks and subtasks in your project management system. You can also use it to query the hierarchy for various purposes, such as finding the progress of a task, finding the dependencies of a task, finding the subtasks of a task, etc.
  4. Pages and subpages of a website: You can use HierarchyID to store the structure of your website. You can also use it to query the hierarchy for various purposes, such as finding the subpages of a page, finding the links of a page, finding the breadcrumbs of a page, etc.
  5. Geographical data with countries, regions, and cities: You can use HierarchyID to store the hierarchy of geographical entities such as countries, regions, and cities. You can also use it to query the hierarchy for various purposes, such as finding the countries in a region, finding the regions in a country, finding the cities in a region, etc.

When Should I Use the HierarchyID Data Type?

You should use HierarchyID when…

  • You have hierarchical data that you want to store in a database table.
  • You want to perform hierarchical operations such as traversing the tree structure or finding the parent or child nodes of a given node.
  • You want to store hierarchical data in a compact format that is easy to query.

You should NOT use HierarchyID when…

  • You have non-hierarchical data that does not fit into a tree structure.
  • You have very large trees with many levels or nodes. In this case, it may be more efficient to use other methods, such as nested sets.

Conclusion

In conclusion, the HierarchyID data type is a useful feature in SQL Server that can help you store and query hierarchical data efficiently and easily.

For more information, please contact us!