Обсуждение: Degression (PG10 > 11, 12 or 13)

Поиск
Список
Период
Сортировка

Degression (PG10 > 11, 12 or 13)

От
Johannes Graën
Дата:
Hi,

When trying to upgrade an existing database from version 10 to 13 I came
across a degression in some existing code used by clients. Further
investigations showed that performance measures are similar in versions
11 to 13, while in the original database on version 10 it's around 100
times faster. I could boil it down to perl functions used for sorting.

>From the real data that I don't own, I created a test case that is
sufficient to observe the degression: http://ix.io/3o7f


These are the numbers on PG 10:

> test=# explain (analyze, verbose, buffers)
> select attr from tab order by func(attr);
>                                                       QUERY PLAN
>
----------------------------------------------------------------------------------------------------------------------
>  Sort  (cost=3269.68..3294.36 rows=9869 width=40) (actual time=179.374..180.558 rows=9869 loops=1)
>    Output: attr, (func(attr))
>    Sort Key: (func(tab.attr))
>    Sort Method: quicksort  Memory: 1436kB
>    Buffers: shared hit=49
>    ->  Seq Scan on public.tab  (cost=0.00..2614.94 rows=9869 width=40) (actual time=2.293..169.060 rows=9869
loops=1)
>          Output: attr, func(attr)
>          Buffers: shared hit=49
>  Planning time: 0.318 ms
>  Execution time: 182.061 ms
> (10 rows)
> 
> test=# explain (analyze, verbose, buffers)
> select attr from tab;
>                                                  QUERY PLAN
> ------------------------------------------------------------------------------------------------------------
>  Seq Scan on public.tab  (cost=0.00..147.69 rows=9869 width=8) (actual time=0.045..3.975 rows=9869 loops=1)
>    Output: attr
>    Buffers: shared hit=49
>  Planning time: 0.069 ms
>  Execution time: 6.020 ms
> (5 rows)


And here we have PG 11:

> test=# explain (analyze, verbose, buffers)
> select attr from tab order by func(attr);
>                                                       QUERY PLAN
>
----------------------------------------------------------------------------------------------------------------------
>  Sort  (cost=3269.68..3294.36 rows=9869 width=40) (actual time=597.877..599.805 rows=9869 loops=1)
>    Output: attr, (func(attr))
>    Sort Key: (func(tab.attr))
>    Sort Method: quicksort  Memory: 1436kB
>    Buffers: shared hit=49
>    ->  Seq Scan on public.tab  (cost=0.00..2614.94 rows=9869 width=40) (actual time=0.878..214.188 rows=9869
loops=1)
>          Output: attr, func(attr)
>          Buffers: shared hit=49
>  Planning Time: 0.151 ms
>  Execution Time: 601.767 ms
> (10 rows)
> 
> test=# explain (analyze, verbose, buffers)
> select attr from tab;
>                                                  QUERY PLAN
> ------------------------------------------------------------------------------------------------------------
>  Seq Scan on public.tab  (cost=0.00..147.69 rows=9869 width=8) (actual time=0.033..1.628 rows=9869 loops=1)
>    Output: attr
>    Buffers: shared hit=49
>  Planning Time: 0.043 ms
>  Execution Time: 2.581 ms
> (5 rows)


In the real scenario it's 500ms vs. 50s. The reason is obviously the
perl function used as sort key. All different versions have been tested
with an unmodified config and one tunes with pgtune. Creating a
functional index does not help in the original database as the planner
doesn't use it, while it *is* used in the test case. But the question
what causes that noticeable difference in performance is untouched by
the fact that it could be circumvented in some cases.

The perl version used is v5.24.1.

Best
  Johannes




Re: Degression (PG10 > 11, 12 or 13)

От
Tomas Vondra
Дата:
On 5/28/21 4:12 PM, Johannes Graën wrote:
> Hi,
> 
> When trying to upgrade an existing database from version 10 to 13 I came
> across a degression in some existing code used by clients. Further
> investigations showed that performance measures are similar in versions
> 11 to 13, while in the original database on version 10 it's around 100
> times faster. I could boil it down to perl functions used for sorting.
> 
>>From the real data that I don't own, I created a test case that is
> sufficient to observe the degression: http://ix.io/3o7f
> 

That function is pretty much just a sequence of ~120 regular
expressions, doing something similar to unaccent(). I wonder if we're
calling the function much more often, perhaps due to some changes in the
sort code (the function is immutable, but that does not guarantee it's
called just once).

It'd be interesting to see profiles from perf, both from 10 and 11.

Also, maybe try materializing the function results before doing the
sort, perhaps like this:

SELECT * FROM (select attr, func(attr) as fattr from tab offset 0) foo
ORDER BY fattr;

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Degression (PG10 > 11, 12 or 13)

От
Pavel Stehule
Дата:
hI

pá 28. 5. 2021 v 16:12 odesílatel Johannes Graën <johannes@selfnet.de> napsal:
Hi,

When trying to upgrade an existing database from version 10 to 13 I came
across a degression in some existing code used by clients. Further
investigations showed that performance measures are similar in versions
11 to 13, while in the original database on version 10 it's around 100
times faster. I could boil it down to perl functions used for sorting.

>From the real data that I don't own, I created a test case that is
sufficient to observe the degression: http://ix.io/3o7f


These are the numbers on PG 10:

> test=# explain (analyze, verbose, buffers)
> select attr from tab order by func(attr);
>                                                       QUERY PLAN
> ----------------------------------------------------------------------------------------------------------------------
>  Sort  (cost=3269.68..3294.36 rows=9869 width=40) (actual time=179.374..180.558 rows=9869 loops=1)
>    Output: attr, (func(attr))
>    Sort Key: (func(tab.attr))
>    Sort Method: quicksort  Memory: 1436kB
>    Buffers: shared hit=49
>    ->  Seq Scan on public.tab  (cost=0.00..2614.94 rows=9869 width=40) (actual time=2.293..169.060 rows=9869 loops=1)
>          Output: attr, func(attr)
>          Buffers: shared hit=49
>  Planning time: 0.318 ms
>  Execution time: 182.061 ms
> (10 rows)
>
> test=# explain (analyze, verbose, buffers)
> select attr from tab;
>                                                  QUERY PLAN
> ------------------------------------------------------------------------------------------------------------
>  Seq Scan on public.tab  (cost=0.00..147.69 rows=9869 width=8) (actual time=0.045..3.975 rows=9869 loops=1)
>    Output: attr
>    Buffers: shared hit=49
>  Planning time: 0.069 ms
>  Execution time: 6.020 ms
> (5 rows)


And here we have PG 11:

> test=# explain (analyze, verbose, buffers)
> select attr from tab order by func(attr);
>                                                       QUERY PLAN
> ----------------------------------------------------------------------------------------------------------------------
>  Sort  (cost=3269.68..3294.36 rows=9869 width=40) (actual time=597.877..599.805 rows=9869 loops=1)
>    Output: attr, (func(attr))
>    Sort Key: (func(tab.attr))
>    Sort Method: quicksort  Memory: 1436kB
>    Buffers: shared hit=49
>    ->  Seq Scan on public.tab  (cost=0.00..2614.94 rows=9869 width=40) (actual time=0.878..214.188 rows=9869 loops=1)
>          Output: attr, func(attr)
>          Buffers: shared hit=49
>  Planning Time: 0.151 ms
>  Execution Time: 601.767 ms
> (10 rows)
>
> test=# explain (analyze, verbose, buffers)
> select attr from tab;
>                                                  QUERY PLAN
> ------------------------------------------------------------------------------------------------------------
>  Seq Scan on public.tab  (cost=0.00..147.69 rows=9869 width=8) (actual time=0.033..1.628 rows=9869 loops=1)
>    Output: attr
>    Buffers: shared hit=49
>  Planning Time: 0.043 ms
>  Execution Time: 2.581 ms
> (5 rows)


In the real scenario it's 500ms vs. 50s. The reason is obviously the
perl function used as sort key. All different versions have been tested
with an unmodified config and one tunes with pgtune. Creating a
functional index does not help in the original database as the planner
doesn't use it, while it *is* used in the test case. But the question
what causes that noticeable difference in performance is untouched by
the fact that it could be circumvented in some cases.

The perl version used is v5.24.1.

 I looked on profile - Postgres 14

   5,67%  libperl.so.5.32.1                     [.] Perl_utf8_length
   5,44%  libc-2.33.so                          [.] __strcoll_l
   4,73%  libperl.so.5.32.1                     [.] Perl_pp_subst
   4,33%  libperl.so.5.32.1                     [.] Perl_re_intuit_start
   3,25%  libperl.so.5.32.1                     [.] Perl_fbm_instr
   1,92%  libperl.so.5.32.1                     [.] Perl_regexec_flags
   1,79%  libperl.so.5.32.1                     [.] Perl_runops_standard
   1,16%  libperl.so.5.32.1                     [.] Perl_pp_const
   0,97%  perf                                  [.] 0x00000000002fcf93
   0,94%  libperl.so.5.32.1                     [.] Perl_pp_nextstate
   0,68%  libperl.so.5.32.1                     [.] Perl_do_trans
   0,54%  perf                                  [.] 0x00000000003dd0c5

and Postgres - 10

   5,45%  libperl.so.5.32.1                      [.] Perl_utf8_length
   4,78%  libc-2.33.so                           [.] __strcoll_l
   4,15%  libperl.so.5.32.1                      [.] Perl_re_intuit_start
   3,92%  libperl.so.5.32.1                      [.] Perl_pp_subst
   2,99%  libperl.so.5.32.1                      [.] Perl_fbm_instr
   1,77%  libperl.so.5.32.1                      [.] Perl_regexec_flags
   1,59%  libperl.so.5.32.1                      [.] Perl_runops_standard
   1,02%  libperl.so.5.32.1                      [.] Perl_pp_const
   0,99%  [kernel]                               [k] psi_group_change
   0,85%  [kernel]                               [k] switch_mm_irqs_off

and it doesn't look too strange.

-- postgres 14
postgres=# explain (analyze, verbose, buffers)
 select attr from tab order by func(attr);
┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                                      QUERY PLAN                                                      │
╞══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
│ Sort  (cost=3269.68..3294.36 rows=9869 width=40) (actual time=246.612..247.292 rows=9869 loops=1)                    │
│   Output: attr, (func(attr))                                                                                         │
│   Sort Key: (func(tab.attr))                                                                                         │
│   Sort Method: quicksort  Memory: 1436kB                                                                             │
│   Buffers: shared hit=49                                                                                             │
│   ->  Seq Scan on public.tab  (cost=0.00..2614.94 rows=9869 width=40) (actual time=0.102..201.863 rows=9869 loops=1) │
│         Output: attr, func(attr)                                                                                     │
│         Buffers: shared hit=49                                                                                       │
│ Planning Time: 0.057 ms                                                                                              │
│ Execution Time: 248.386 ms                                                                                           │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(10 rows)

-- postgres 10
postgres=# explain (analyze, verbose, buffers)
 select attr from tab order by func(attr);
┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                                      QUERY PLAN                                                      │
╞══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
│ Sort  (cost=3269.68..3294.36 rows=9869 width=40) (actual time=267.779..268.366 rows=9869 loops=1)                    │
│   Output: attr, (func(attr))                                                                                         │
│   Sort Key: (func(tab.attr))                                                                                         │
│   Sort Method: quicksort  Memory: 1436kB                                                                             │
│   Buffers: shared hit=49                                                                                             │
│   ->  Seq Scan on public.tab  (cost=0.00..2614.94 rows=9869 width=40) (actual time=0.278..222.606 rows=9869 loops=1) │
│         Output: attr, func(attr)                                                                                     │
│         Buffers: shared hit=49                                                                                       │
│ Planning time: 0.132 ms                                                                                              │
│ Execution time: 269.258 ms                                                                                           │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(10 rows)

This is tested on my laptop - both version uses same locale

Are you sure, so all databases use the same encoding and same locale?

Regards

Pavel

 
Best
  Johannes



Re: Degression (PG10 > 11, 12 or 13)

От
Tom Lane
Дата:
Pavel Stehule <pavel.stehule@gmail.com> writes:
> pá 28. 5. 2021 v 16:12 odesílatel Johannes Graën <johannes@selfnet.de>
> napsal:
>> When trying to upgrade an existing database from version 10 to 13 I came
>> across a degression in some existing code used by clients. Further
>> investigations showed that performance measures are similar in versions
>> 11 to 13, while in the original database on version 10 it's around 100
>> times faster. I could boil it down to perl functions used for sorting.

> Are you sure, so all databases use the same encoding and same locale?

Yeah ... I don't know too much about the performance of Perl regexps,
but it'd be plausible that it varies depending on locale setting.

            regards, tom lane



Re: Degression (PG10 > 11, 12 or 13)

От
Johannes Graën
Дата:
On 28/05/2021 18.24, Tom Lane wrote:
> Pavel Stehule <pavel.stehule@gmail.com> writes:
>> pá 28. 5. 2021 v 16:12 odesílatel Johannes Graën <johannes@selfnet.de>
>> napsal:
>>> When trying to upgrade an existing database from version 10 to 13 I came
>>> across a degression in some existing code used by clients. Further
>>> investigations showed that performance measures are similar in versions
>>> 11 to 13, while in the original database on version 10 it's around 100
>>> times faster. I could boil it down to perl functions used for sorting.
> 
>> Are you sure, so all databases use the same encoding and same locale?
> 
> Yeah ... I don't know too much about the performance of Perl regexps,
> but it'd be plausible that it varies depending on locale setting.

It probably wasn't Perl at all. Thanks to the hint I checked the initial
database again and, while encoding and ctype are set to UTF8, the
collation is C, which makes a huge difference:

... order by tab(attr) => Execution Time: 51429.875 ms
... order by tab(attr collate "C") => Execution Time: 537.757 ms

in the original database. Any other version yields similar times.


On 28/05/2021 17.47, Tomas Vondra wrote:
> That function is pretty much just a sequence of ~120 regular
> expressions, doing something similar to unaccent(). I wonder if we're
> calling the function much more often, perhaps due to some changes in the
> sort code (the function is immutable, but that does not guarantee it's
> called just once).

> Also, maybe try materializing the function results before doing the
> sort, perhaps like this:
>
> SELECT * FROM (select attr, func(attr) as fattr from tab offset 0) foo
> ORDER BY fattr;

I was expecting it to be called once in the process of sorting, and it
seems that this is actually true for all version and different
collations, but sorting for a collation that is not C requires
considerable more resources (that still needs to be shown for other
collations, but I see the overhead of having more or less complex
definitions vs. just comparing numbers).

That being said, I would have used unaccent or, if that wasn't an
option, maybe have those values calculated by a trigger function when
the corresponding rows are changed. But I don't control the code.

Now what keeps me wondering is how the sorting works internally and if
we could conclude that using the C collation in order expressions and
indexes is a general way to speed up queries - if the actual order is of
less importance.

Best
  Johannes