Incremental Sort 4

Author : 모델광   Date : 2023-06-10 (Sat) 06:58   Views : 555
In the previous notes titled "Incremental Sort 1", "Incremental Sort 2", and "Incremental Sort 3", I showed you how performant PostgreSQL can be when employing the incremental sort operation. The purpose of the incremental sort operation is to efficiently sort the result set according to the specified sort key, taking advantage of any presorted data and minimizing memory usage.
So you might assume that a plan is optimal if you see an Incremental Sort node in it. However, that is not always the case. Sometimes you need to push the optimizer to its limits.
Read through the following demonstration, which indicates that incremental sort can be suboptimal.

Here is a small script to demonstrate my point:
(tested on PostgreSQL 15)

DROP TABLE ORDERS;

create table ORDERS (
ord_no    bigint,
cust_id    varchar(20),
comment varchar(100),
ord_date  varchar(8)
);

ALTER TABLE ORDERS ADD CONSTRAINT PK_ORDERS PRIMARY KEY(ORD_NO);

DROP TABLE ORDERS_DETAIL;

CREATE TABLE ORDERS_DETAIL(
ORD_LINE_NO   BIGINT NOT NULL
,ORD_NO         BIGINT NOT NULL
,PROD_ID         VARCHAR(10) NOT NULL
,COMMENT       VARCHAR(100)
,ORD_AMT        BIGINT
);

insert into ORDERS
select i as ord_no
      , 'C'||mod(i,10) as cust_id
      , lpad('X',10,'Y') as comment
      , to_char(to_date('20191001','YYYYMMDD')+mod(i,600),'yyyymmdd') as ord_date
from generate_series(1,1000000) a(i);

CREATE INDEX ORDERS_X02 ON ORDERS(ORD_DATE);

INSERT INTO ORDERS_DETAIL
SELECT i as ORD_LINE_NO
      , mod(i,1000000) AS ORD_NO
      , 'PP'||MOD(i,5) AS PROD_ID
      , lpad('X',10,'Y') as comment
      , case when i < 1000 then i*100 else i end as ord_amt
FROM generate_series(1,10000000) a(i);

ALTER TABLE ORDERS_DETAIL ADD CONSTRAINT PK_ORDERS_DETAIL PRIMARY KEY(ORD_LINE_NO);
CREATE INDEX ORDERS_DETAIL_X02 ON ORDERS_DETAIL(ORD_NO, ORD_AMT);
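
The "Heap Fetches: 0" in the Index Only Scan of the last plan below suggests the tables had up-to-date visibility maps and statistics when the tests were run. The original script does not show such a step, so treat the following as an optional addition of mine if you are reproducing the test yourself:

-- Optional (not part of the original script): refresh visibility maps and
-- planner statistics after the bulk load so the plans below are reproducible.
VACUUM ANALYZE ORDERS;
VACUUM ANALYZE ORDERS_DETAIL;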

I have disabled parallel processing for the sake of simplicity and clarity.

set max_parallel_workers_per_gather = 0;
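
All of the plans in this note show actual timings and shared-buffer counts without cost estimates. The original post does not include the EXPLAIN invocation itself, so the option list below is my assumption about how such output is obtained; prepend it to any query you want to examine:

-- Assumed reproduction step: capture run-time statistics and buffer usage.
-- The query here is only an illustrative probe against the ORDERS table.
EXPLAIN (ANALYZE, BUFFERS, COSTS OFF)
SELECT COUNT(*) FROM ORDERS WHERE ORD_DATE < '20191213';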

Here is the query and plan that produced the Incremental Sort operation:

SELECT *
  FROM ORDERS A, ORDERS_DETAIL B
 WHERE A.ORD_NO = B.ORD_NO
    AND A.ORD_DATE < '20191213'
    AND B.PROD_ID = 'PP2'
 ORDER BY A.ORD_DATE DESC, B.ORD_AMT
 OFFSET 10 ROWS FETCH NEXT 10 ROWS ONLY;

Limit (actual time=259.516..259.519 rows=10 loops=1)
  Buffers: shared hit=16685 read=100312
  ->  Incremental Sort (actual time=259.514..259.516 rows=20 loops=1)
        Sort Key: a.ord_date DESC, b.ord_amt
        Presorted Key: a.ord_date
        Full-sort Groups: 1  Sort Method: top-N heapsort  Average Memory: 29kB  Peak Memory: 29kB
        Pre-sorted Groups: 1  Sort Method: top-N heapsort  Average Memory: 28kB  Peak Memory: 28kB
        Buffers: shared hit=16685 read=100312
        ->  Nested Loop (actual time=0.023..254.942 rows=16671 loops=1)
              Buffers: shared hit=16685 read=100312
              ->  Index Scan Backward using orders_x02 on orders a (actual time=0.015..21.416 rows=8336 loops=1)
                    Index Cond: ((ord_date)::text < '20191213'::text)
                    Buffers: shared hit=9 read=8344
              ->  Index Scan using orders_detail_x02 on orders_detail b (actual time=0.022..0.027 rows=2 loops=8336)
                    Index Cond: (ord_no = a.ord_no)
                    Filter: ((prod_id)::text = 'PP2'::text)
                    Rows Removed by Filter: 8
                    Buffers: shared hit=16676 read=91968
Planning:
  Buffers: shared hit=6 read=10
Planning Time: 0.222 ms
Execution Time: 259.540 ms


Here are some key points about the Incremental Sort operation in the plan:

 * Sort Key : The sort key defines the order in which the rows are sorted. In this case, the sort key consists of two columns: a.ord_date in descending order and b.ord_amt.
 * Presorted Key : The plan mentions a presorted key for the incremental sort. This indicates that some initial sorting has already been performed before the incremental sort operation. In this case, it states that the rows are presorted on the a.ord_date column.
 * Sort Method : The plan specifies that the sort method used is "top-N heapsort." Top-N heapsort is an efficient sorting algorithm that works well when only a limited number of rows are required from the sorted result set.
 * Average Memory and Peak Memory : The plan provides information about the memory usage during the sort operation. It states that the average memory used is 29kB and the peak memory usage is also 29kB. This indicates that the sort operation is memory-efficient and does not require excessive memory allocation. (A quick way to compare this against a plain full sort is sketched right after this list.)
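
To see how much the presorted key is actually saving, you can temporarily disable the feature in your session and compare the resulting plain Sort node. enable_incremental_sort is a standard planner parameter; the comparison step itself is my suggestion, not something the original post performs:

-- Session-level experiment: force a full sort instead of an incremental sort,
-- then re-run the EXPLAIN above and compare memory usage and timing.
SET enable_incremental_sort = off;
-- ... re-run the query above ...
RESET enable_incremental_sort;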

PostgreSQL is taking advantage of the data presorted by the ORD_DATE column, minimizing memory consumption. However, it is important to note that it is walking the ORDERS_DETAIL_X02 index 8336 times. The majority of the elapsed time is spent on accessing the index (0.027 ms x 8336 = 225.072 ms). So the following question arises:
Is it necessary to access the index 8336 times?

If you run the following query, you will get the same result as the original query:


SELECT *
  FROM ORDERS A, ORDERS_DETAIL B
 WHERE A.ORD_NO = B.ORD_NO
    AND A.ORD_DATE = '20191212'
    AND B.PROD_ID = 'PP2'
ORDER BY A.ORD_DATE DESC, B.ORD_AMT
OFFSET 10 ROWS FETCH NEXT 10 ROWS ONLY;

Limit (actual time=74.343..74.347 rows=10 loops=1)
  Buffers: shared hit=7960 read=15439
  ->  Sort (actual time=74.341..74.344 rows=20 loops=1)
        Sort Key: b.ord_amt
        Sort Method: top-N heapsort  Memory: 29kB
        Buffers: shared hit=7960 read=15439
        ->  Nested Loop (actual time=0.394..70.482 rows=16670 loops=1)
              Buffers: shared hit=7960 read=15439
              ->  Bitmap Heap Scan on orders a (actual time=0.379..7.005 rows=1667 loops=1)
                    Recheck Cond: ((ord_date)::text = '20191212'::text)
                    Heap Blocks: exact=1667
                    Buffers: shared hit=385 read=1286
                    ->  Bitmap Index Scan on orders_x02 (actual time=0.207..0.207 rows=1667 loops=1)
                          Index Cond: ((ord_date)::text = '20191212'::text)
                          Buffers: shared read=4
              ->  Index Scan using orders_detail_x02 on orders_detail b (actual time=0.008..0.036 rows=10 loops=1667)
                    Index Cond: (ord_no = a.ord_no)
                    Filter: ((prod_id)::text = 'PP2'::text)
                    Buffers: shared hit=7575 read=14153
Planning:
  Buffers: shared hit=8 read=8
Planning Time: 0.354 ms
Execution Time: 74.372 ms


Note that I replaced A.ORD_DATE < '20191213' with A.ORD_DATE = '20191212'. We can also see that this plan accesses the ORDERS_DETAIL_X02 index just 1667 times, which is why the elapsed time dropped from 259 ms to 74 ms.
So it is safe to say that the original query and its execution plan are not optimal. And I am afraid I do not know why PostgreSQL accessed the ORDERS_DETAIL_X02 index 8336 times in the original query.
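
The substitution is safe because the query needs only the first 20 rows of the ORD_DATE DESC ordering, and '20191212' is the highest ORD_DATE below '20191213' in this data set. Here is a quick check of mine, not part of the original post:

-- Confirm that '20191212' is the largest ORD_DATE below the original bound,
-- so the top rows of the ORD_DATE DESC ordering all carry this single date.
SELECT MAX(ORD_DATE) FROM ORDERS WHERE ORD_DATE < '20191213';
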
Since the original query is not performant, we need to rewrite it. Here is my version of the modified query:


SELECT *  
   FROM (SELECT ORD_DATE         
           FROM (SELECT ORD_DATE
                   FROM ORDERS A, ORDERS_DETAIL B
                  WHERE A.ORD_NO = B.ORD_NO
                    AND A.ORD_DATE < '20191213'
                    AND EXISTS (SELECT 1
                                  FROM ORDERS_DETAIL B
                                 WHERE A.ORD_NO = B.ORD_NO
                                   AND B.PROD_ID = 'PP2')
                  ORDER BY A.ORD_DATE DESC
                 OFFSET 0 ROWS FETCH NEXT 20 ROWS ONLY
                 ) K        
         GROUP BY ORD_DATE
        ) V, ORDERS A, ORDERS_DETAIL B
  WHERE V.ORD_DATE = A.ORD_DATE
        AND A.ORD_NO = B.ORD_NO
        AND A.ORD_DATE < '20191213'
        AND B.PROD_ID = 'PP2'
ORDER BY A.ORD_DATE DESC, B.ORD_AMT
OFFSET 10 ROWS FETCH NEXT 10 ROWS ONLY;

Limit (actual time=43.613..43.618 rows=10 loops=1)
  Buffers: shared hit=13183 read=10245
  ->  Incremental Sort (actual time=43.611..43.616 rows=20 loops=1)
        Sort Key: a_1.ord_date DESC, b.ord_amt
        Presorted Key: a_1.ord_date
        Full-sort Groups: 1  Sort Method: top-N heapsort  Average Memory: 29kB  Peak Memory: 29kB
        Pre-sorted Groups: 1  Sort Method: top-N heapsort  Average Memory: 28kB  Peak Memory: 28kB
        Buffers: shared hit=13183 read=10245
        ->  Nested Loop (actual time=0.048..39.728 rows=16670 loops=1)
              Buffers: shared hit=13183 read=10245
              ->  Merge Join (actual time=0.045..3.388 rows=1667 loops=1)
                    Merge Cond: ((a.ord_date)::text = (a_1.ord_date)::text)
                    Buffers: shared hit=849 read=851
                    ->  Index Scan Backward using orders_x02 on orders a (actual time=0.009..2.926 rows=1668 loops=1)
                          Index Cond: ((ord_date)::text < '20191213'::text)
                          Buffers: shared hit=824 read=851
                    ->  Sort (actual time=0.034..0.036 rows=1 loops=1)
                          Sort Key: a_1.ord_date DESC
                          Sort Method: quicksort  Memory: 25kB
                          Buffers: shared hit=25
                          ->  HashAggregate (actual time=0.030..0.033 rows=1 loops=1)
                                Group Key: a_1.ord_date
                                Batches: 1  Memory Usage: 24kB
                                Buffers: shared hit=25
                                ->  Limit (actual time=0.013..0.027 rows=20 loops=1)
                                      Buffers: shared hit=25
                                      ->  Nested Loop (actual time=0.013..0.025 rows=20 loops=1)
                                            Join Filter: (a_1.ord_no = b_1.ord_no)
                                            Buffers: shared hit=25
                                            ->  Nested Loop Semi Join (actual time=0.009..0.014 rows=2 loops=1)
                                                  Buffers: shared hit=13
                                                  ->  Index Scan Backward using orders_x02 on orders a_1
                                                                     (actual time=0.003..0.004 rows=2 loops=1)
                                                        Index Cond: ((ord_date)::text < '20191213'::text)
                                                        Buffers: shared hit=5
                                                  ->  Index Scan using orders_detail_x02 on orders_detail b_2
                                                                     (actual time=0.004..0.004 rows=1 loops=2)
                                                        Index Cond: (ord_no = a_1.ord_no)
                                                        Filter: ((prod_id)::text = 'PP2'::text)
                                                        Buffers: shared hit=8
                                            ->  Index Only Scan using orders_detail_x02 on orders_detail b_1
                                                                                 (actual time=0.002..0.003 rows=10 loops=2)
                                                  Index Cond: (ord_no = b_2.ord_no)
                                                  Heap Fetches: 0
                                                  Buffers: shared hit=12
              ->  Index Scan using orders_detail_x02 on orders_detail b (actual time=0.005..0.020 rows=10 loops=1667)
                    Index Cond: (ord_no = a.ord_no)
                    Filter: ((prod_id)::text = 'PP2'::text)
                    Buffers: shared hit=12334 read=9394
Planning:
  Buffers: shared hit=80
Planning Time: 0.514 ms
Execution Time: 43.664 ms


Take note that the block I/O dropped from (16685+100312=116,997) to (13183+10245=23,428). The elapsed time decreased from 259 ms to 44 ms.
In the K subquery, I tried to extract the top 20 ORDERS.ORD_DATE values, sorted by ORD_DATE, that satisfy the condition ORDERS_DETAIL.PROD_ID = 'PP2'. Then I grouped them by ORD_DATE to fetch distinct ORD_DATE values, which helps avoid multiplying the result set when the derived table is joined back to the ORDERS table. With this query, we access the ORDERS_DETAIL_X02 index 1667 times (compared to 8336 times in the original query).
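
If you run the V derived table on its own, you can see why the join back to ORDERS does not multiply rows: with this data set it returns the single value '20191212' (the HashAggregate in the plan above likewise shows rows=1). This standalone check is mine, not part of the original post:

-- The V derived table in isolation; it should return just '20191212'.
SELECT ORD_DATE
  FROM (SELECT ORD_DATE
          FROM ORDERS A, ORDERS_DETAIL B
         WHERE A.ORD_NO = B.ORD_NO
           AND A.ORD_DATE < '20191213'
           AND EXISTS (SELECT 1
                         FROM ORDERS_DETAIL B
                        WHERE A.ORD_NO = B.ORD_NO
                          AND B.PROD_ID = 'PP2')
         ORDER BY A.ORD_DATE DESC
        OFFSET 0 ROWS FETCH NEXT 20 ROWS ONLY
       ) K
 GROUP BY ORD_DATE;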

It is generally recommended to write queries that are straightforward and easy to understand. The rewritten query has a convoluted structure, but this complex approach is worth trying when the performance benefit is significant.

Conclusion
Incremental Sort does not guarantee we have obtained the optimal plan. Do not stop tuning a query when you see Incremental Sort in the plan.

Added the day after this publication
While investigating the execution plan of the original query, I mentioned that I did not know why PostgreSQL accessed the ORDERS_DETAIL_X02 index 8336 times. It appears that the incremental sort algorithm in PostgreSQL 15 is somewhat clumsy.
We can observe that PostgreSQL did not scan the entire ORDERS_X02 index on the ORDERS table; the backward scan stopped at the date '20191208'.

To validate this, run the following query:

SELECT COUNT(*)
  FROM ORDERS
 WHERE ORD_DATE < '20191213'
   AND ORD_DATE > '20191207';

The result of the query is:

COUNT
----------
8335

From a logical perspective, it should have ceased accessing the ORDERS_DETAIL_X02 index after encountering the date '20191212', since by that point it could already retrieve the 20 rows that satisfy all the conditions in the WHERE clause. I anticipate that PostgreSQL's incremental sort mechanism will address this issue in future versions.
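
One way to see that the scan could have stopped at '20191212' is to count how many qualifying rows that single date already supplies. According to the Nested Loop in the second plan above it is about 16,670 rows, far more than the 20 rows the query ultimately needs; this check is mine, not part of the original post:

-- Qualifying rows available from the date '20191212' alone; anything read
-- beyond the first 20 of them is unnecessary work for this query.
SELECT COUNT(*)
  FROM ORDERS A, ORDERS_DETAIL B
 WHERE A.ORD_NO = B.ORD_NO
   AND A.ORD_DATE = '20191212'
   AND B.PROD_ID = 'PP2';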

   
