PostgreSQL 은 특별합니다.
단순한 데이터베이스가 아니라, 하나의 어플리케이션 플랫폼이기 때문입니다.

공지사항

뉴스 및 기사

	총 게시물 177건, 최근 0 건

Traversing from the bottom-up in a hierarchical query

글쓴이 : 모델광 날짜 : 2023-06-25 (일) 19:07 조회 : 2668

A few days ago, I was watching a YouTube video titled "Think Different":

https://www.youtube.com/watch?v=4fcb8eu20SQ

The message in the video reminded me of a performance issue that I couldn't resolve, but an excellent developer resolved it by thinking differently.

While working, sometimes I feel caught between a rock and a hard place. However, it is during these moments that we must think differently in order to find a way out of the difficult situation.

In this note, I will show you how to resolve a performance issue related to a hierarchical query. Writing hierarchical queries can be cumbersome, and when dealing with large data sizes, it can become a performance bottleneck. Hierarchical queries are commonly used for company organization charts, employee-manager relationships, and geographic region hierarchies.

Here is a little data to model the performance issue:

CREATE TABLE emp (
    empno integer NOT NULL,
    ename character varying(10),
    job character varying(9),
    mgr integer,
    hiredate timestamp without time zone,
    sal numeric(7,2),
    comm numeric(7,2),
    deptno integer
);

INSERT INTO EMP VALUES
(7369,'SMITH','CLERK',7902,to_date('17-12-1980','dd-mm-yyyy'),800,NULL,20),
(7499,'ALLEN','SALESMAN',7698,to_date('20-2-1981','dd-mm-yyyy'),1600,300,30),
(7521,'WARD','SALESMAN',7698,to_date('22-2-1981','dd-mm-yyyy'),1250,500,30),
(7566,'JONES','MANAGER',7839,to_date('2-4-1981','dd-mm-yyyy'),2975,NULL,20),
(7654,'MARTIN','SALESMAN',7698,to_date('28-9-1981','dd-mm-yyyy'),1250,1400,30),
(7698,'BLAKE','MANAGER',7839,to_date('1-5-1981','dd-mm-yyyy'),2850,NULL,30),
(7782,'CLARK','MANAGER',7839,to_date('9-6-1981','dd-mm-yyyy'),2450,NULL,10),
(7788,'SCOTT','ANALYST',7566,to_date('13-7-1987','dd-mm-yyyy'),3000,NULL,20),
(7839,'KING','PRESIDENT',NULL,to_date('17-11-1981','dd-mm-yyyy'),5000,NULL,10),
(7844,'TURNER','SALESMAN',7698,to_date('8-9-1981','dd-mm-yyyy'),1500,0,30),
(7876,'ADAMS','CLERK',7788,to_date('13-7-1987', 'dd-mm-yyyy'),1100,NULL,20),
(7900,'JAMES','CLERK',7698,to_date('3-12-1981','dd-mm-yyyy'),950,NULL,30),
(7902,'FORD','ANALYST',7566,to_date('3-12-1981','dd-mm-yyyy'),3000,NULL,20),
(7934,'MILLER','CLERK',7782,to_date('23-1-1982','dd-mm-yyyy'),1300,NULL,10);

I have rigged the data so that the EMP2 table takes up as many blocks as possible.

CREATE TABLE EMP2

SELECT A.*, REPEAT('Z',1500) AS DUMMY FROM EMP A;

VACUUM EMP2;
CREATE UNIQUE INDEX EMP2_PK ON EMP2(EMPNO);

Note that the table has only one index on the EMPNO column.

Take also note that the table takes up three blocks.

SELECT * FROM PG_CLASS where relname='emp2';

The following recursive CTE is a canonical exmaple of retrieving the complete path from the root node to the node of EMPNO = 7788.

WITH RECURSIVE W AS (
SELECT EMPNO, ENAME::TEXT
FROM EMP2
WHERE MGR IS NULL ---the starting point
UNION ALL
SELECT A.EMPNO, W.ENAME||'>'||A.ENAME
FROM EMP2 A, W
WHERE A.MGR = W.EMPNO
)
SELECT *
FROM W
WHERE EMPNO = 7788;

The resulting data of the above query is:

KING>JONES>SCOTT

In the table EMP2, a row is dependent on another row. The logical structure of the EMP2 table forms a tree structure. The pain point with the query is the abscence of an index on the MGR column, resulting in a full table scan of the EMP2 table.

This is the plan of the above query:

CTE Scan on w (cost=41.84..44.11 rows=1 width=36) (actual time=0.030..0.044 rows=1 loops=1)
Filter: (empno = 7788)
Rows Removed by Filter: 13
Buffers: shared hit=15
CTE w
    -> Recursive Union (cost=0.00..41.84 rows=101 width=36) (actual time=0.008..0.040 rows=14 loops=1)
          Buffers: shared hit=15
          -> Seq Scan on emp2 (cost=0.00..3.14 rows=1 width=36) (actual time=0.007..0.009 rows=1 loops=1)
                Filter: (mgr IS NULL)
                Rows Removed by Filter: 13
                Buffers: shared hit=3
          -> Hash Join (cost=0.33..3.67 rows=10 width=36) (actual time=0.004..0.006 rows=3 loops=4)
                Hash Cond: (a.mgr = w_1.empno)
                Buffers: shared hit=12
                -> Seq Scan on emp2 a (cost=0.00..3.14 rows=14 width=46) (actual time=0.000..0.001 rows=14 loops=4)
                      Buffers: shared hit=12
                -> Hash (cost=0.20..0.20 rows=10 width=36) (actual time=0.001..0.002 rows=4 loops=4)
                      Buckets: 1024 Batches: 1 Memory Usage: 9kB
                      -> WorkTable Scan on w w_1 (cost=0.00..0.20 rows=10 width=36) (actual time=0.000..0.001 rows=4 loops=4)
Planning:
Buffers: shared hit=15 read=1
Planning Time: 0.174 ms
Execution Time: 0.062 ms

Here are some details to note:

The execution plan illustrates a query that utilizes a CTE and recursive union to perform hierarchical traversal in PostgreSQL.

1) The first node, CTE Scan on w, calls the second node, CTE w.

2) The CTE w node invokes the Recursive Union node.

3) The Recursive Union node calls the Seq Scan on emp2 node and the Hash Join node.

4) The Seq Scan on emp2 node sequentially scans the EMP2 table. It filters out rows where mgr is null, removing 13 rows. It retrieves 1 row with a toal of three block I/O operations.

5) The Hash Join node determines that the WorkTable Scan on w should be the build table. It builds a hash table using the temporary result set from the WorkTable Scan on w node. Notably, while accessing the temporary working table, there is no block I/O.

6) After constructing the hash table, the Hash Join performs a sequential scan on the EMP2 table to locate matching rows.

7) The Hash Join operation is executed 4 times (loops=4), indicating that the recursive traversal involves four levels or iterations to construct the hierarchical relationship in the result set. The total number of block I/O operations is 12.

8) The Recursive Union node combines the results of the Seq Scan on emp2 and Hash Join operations,. The Recursive Union passes the resulting data, consisting of 14 rows, up to the CTE w node.

9) The CTE w node passes the result up to the CTE Scan on w node.

10) The CTE Scan on w node retrieves one row (empno=7788) after filtering out 13 rows.

The only thing that really matters in this plan is the full table scan on the EMP2 table, and obviously we could consider creating an index on the MGR column as a possible way to reduce the block I/O and workload. In this case, though, there are only three columns from EMP2 appearing in the query, so we could create an index that included those columns so that PostgreSQL doesn't have to visit the table at all. For instance, an index like (MGR, EMPNO, ENAME) could be beneficial.

Let us suppose that it is not a viable option to add an index on the EMP2 table. In that case, finding a solution to reduce block I/O operations becomes more challenging. Unfortunately, I couldn't devise a solution using conventional strategies of traversing rows in a top-down direction.

Working out a solution to reduce the block I/O is left as an exercise for the readers. I will return in a week or so with the ideas of a smart co-worker on reducing the block I/O of the query, and I will update the title of this post to reflect the solution.

Added on JULY 15, 2023
When I sought help to optimize the above query, a skilled developer with a "Think Different" mindset was able to improve its performance by changing the traversal direction from top to bottom to bottom to top.
The following is the modified query that traverses the rows from the bottom-up.

WITH RECURSIVE W AS (
SELECT EMPNO, ENAME::TEXT AS ENAME, MGR, 0 AS LEVEL
    FROM EMP2 E
   WHERE EMPNO = 7788      --the bottom point
   UNION ALL
   SELECT A.EMPNO, A.ENAME||'>'||W.ENAME, A.MGR, W.LEVEL+1
     FROM EMP2 A, W
    WHERE A.EMPNO = W.MGR
)
SELECT ENAME
FROM (SELECT *

                     , CASE WHEN NOT EXISTS (SELECT 1
                                                            FROM W A
                                                           WHERE A.EMPNO = W.MGR)
                                        THEN 1
                                         ELSE 0
                                 END AS ISLEAF    --check whether the node is the bottom leaf
          FROM W) FOO
WHERE ISLEAF = 1;

Here is a breakdown of how the query works:
1) The query starts by defining a recursive CTE called "W". The initial part of the CTE is the anchor member, which selects the starting point of the recursion based on the condition EMPNO = 7788.
2) The UNION ALL clause is used to specify the recursive member of the CTE. It joins the EMP2 table with the previous result of CTE W to traverse the hierarchical structure from the bottom to the top.
3) The recursive member continues to iterate until no more rows are returned. In this case, it stops when it reaches a row where the MGR column value is null. We can observe that the CTE W returns 3 rows when we query the CTE W.
4) The inline view FOO then selects the ENAME from the CTE, along with an addtional column ISLEAF that indicates whether the node is a bottom-level leaf node (ISLEAF=1) or not (ISLEAF=0). The ENAME column in the inline view FOO contains the ENAMES from the top-level leaf to the bottom-level leaf separated by the delimiter '>'.
5) Finally, the outer query filters the results by selecting only the row where ISLEAF is equal to 1, which gives us the names from the top-level leaf to the bottom-level leaf separated by the delimiter '>'.

And I have run the query on PostgreSQL 15.1. Since the table is tiny, I adjusted the random_page_cost parameter to 0.01 to lower the cost of accessing the index before running the query.

SET RANDOM_PAGE_COST = 0.01;

Here is the plan generated by this query - and the first thing I'd like you to note is that there is no Seq Scan on emp2.

CTE Scan on w (cost=11.02..242.81 rows=1 width=32) (actual time=0.038..0.039 rows=1 loops=1)
Filter: (CASE WHEN (NOT (hashed SubPlan 3)) THEN 1 ELSE 0 END = 1)
Rows Removed by Filter: 2
Buffers: shared hit=6
CTE w
    -> Recursive Union (cost=0.14..11.02 rows=101 width=44) (actual time=0.012..0.031 rows=3 loops=1)
          Buffers: shared hit=6
          -> Index Scan using emp2_pk on emp2 e (cost=0.14..0.17 rows=1 width=44) (actual time=0.011..0.012 rows=1 loops=1)
                Index Cond: (empno = 7788)
                Buffers: shared hit=2
          -> Hash Join (cost=0.57..0.88 rows=10 width=44) (actual time=0.005..0.005 rows=1 loops=3)
                Hash Cond: (w_1.mgr = a.empno)
                Buffers: shared hit=4
                -> WorkTable Scan on w w_1 (cost=0.00..0.20 rows=10 width=40) (actual time=0.000..0.000 rows=1 loops=3)
                -> Hash (cost=0.40..0.40 rows=14 width=46) (actual time=0.009..0.009 rows=14 loops=1)
                      Buckets: 1024 Batches: 1 Memory Usage: 9kB
                      Buffers: shared hit=4
                      -> Index Scan using emp2_pk on emp2 a (cost=0.14..0.40 rows=14 width=46) (actual time=0.002..0.006 rows=14 loops=1)
                            Buffers: shared hit=4
SubPlan 3
    -> CTE Scan on w a_1 (cost=0.00..2.02 rows=101 width=4) (actual time=0.000..0.019 rows=3 loops=1)
          Buffers: shared hit=4
Planning Time: 0.124 ms
Execution Time: 0.066 ms

In this plan we can see that the RANDOM_PAGE_COST parameter has been applied - PostgreSQL is walking the emp2_pk index. Take note that the number of block I/Os dropped from 15 to 6. I am not going to say anything about interpreting the plan because I have already done it in the previous plan.
Retrieving information from a hierarchical structure in a database often presents challenges.
I hope this note has shed some light on how to obtain the complete path efficiently.

Conclusion
A basic strategy for retrieving the complete hierarchical path would be to start from the top. This approach can be further optimized by creating suitable indexes. However, there are several benefits and drawbacks associated with creating an index.
To achieve better performance in hierarchical queries, it is essential to challenge conventional optimization practices and think outside the box. Considering an alternative strategy, such as traversing the rows from the bottom, can be advantageous in extracting hierarchical data.

E-MAIL: root@postgresdba.com