Re: should check interrupts in BuildRelationExtStatistics ?

Поиск
Список
Период
Сортировка
От Justin Pryzby
Тема Re: should check interrupts in BuildRelationExtStatistics ?
Дата
Msg-id 20220605014233.GR29853@telsasoft.com
обсуждение исходный текст
Ответ на Re: should check interrupts in BuildRelationExtStatistics ?  (Justin Pryzby <pryzby@telsasoft.com>)
Ответы Re: should check interrupts in BuildRelationExtStatistics ?  (Michael Paquier <michael@paquier.xyz>)
Список pgsql-hackers
On Fri, Jun 03, 2022 at 10:28:37AM -0500, Justin Pryzby wrote:
> Maybe this should actually call vacuum_delay_point(), like
> compute_scalar_stats().

I think vacuum_delay_point() would be wrong for these cases, since they don't
call "fetchfunc()", like the other places which use vacuum_delay_point.

> For MCV, there seems to be no issue, since those
> functions are being called (but only for expressional stats).  But maybe I've
> just failed to make a large enough, non-expressional MCV list for the problem
> to be apparent.

I reproduced the issue with MCV like this:

DROP TABLE IF EXISTS t; CREATE TABLE t AS SELECT a::text,b::text,c::text,d::text,e::text,f::text,g::text FROM
generate_series(1000001,1000006)a,generate_series(1000001,1000006)b,generate_series(1000001,1000006)c,generate_series(1000001,1000006)d,generate_series(1000001,1000006)e,generate_series(1000001,1000006)f,generate_series(1000001,1000006)g,generate_series(1000001,1000006)h;
VACUUMt; 
 
DROP STATISTICS IF EXISTS stxx; CREATE STATISTICS stxx (mcv) ON a,b,c,d,e,f FROM t; ALTER STATISTICS stxx SET
STATISTICS9999; ANALYZE VERBOSE t;
 

This is slow (25 seconds) inside qsort:

(gdb) bt
#0  __memcmp_sse4_1 () at ../sysdeps/x86_64/multiarch/memcmp-sse4.S:1020
#1  0x00005653d8686fac in varstrfastcmp_locale (a1p=0x5653dce67d54 "1000004~", len1=7, a2p=0x5653e895ffa4 "1000004~",
len2=7,ssup=ssup@entry=0x5653d98c37b8) at varlena.c:2444
 
#2  0x00005653d8687161 in varlenafastcmp_locale (x=94918188367184, y=94918384418720, ssup=0x5653d98c37b8) at
varlena.c:2270
#3  0x00005653d85134d8 in ApplySortComparator (ssup=0x5653d98c37b8, isNull2=<optimized out>, datum2=<optimized out>,
isNull1=<optimizedout>, datum1=<optimized out>) at ../../../src/include/utils/sortsupport.h:224
 
#4  multi_sort_compare (a=0x7fa587b44e58, b=0x7fa5875f0dd0, arg=0x5653d98c37b0) at extended_stats.c:903
#5  0x00005653d8712eed in qsort_arg (data=data@entry=0x7fa5875f0050, n=<optimized out>, n@entry=1679616,
element_size=element_size@entry=24,compare=compare@entry=0x5653d8513483 <multi_sort_compare>, 
 
    arg=arg@entry=0x5653d98c37b0) at ../../src/include/lib/sort_template.h:349
#6  0x00005653d851415f in build_sorted_items (data=data@entry=0x7fa58f2e1050, nitems=nitems@entry=0x7ffe4f764e5c,
mss=mss@entry=0x5653d98c37b0,numattrs=6, attnums=0x7fa58f2e1078) at extended_stats.c:1134
 
#7  0x00005653d8515d84 in statext_mcv_build (data=data@entry=0x7fa58f2e1050, totalrows=totalrows@entry=1679616,
stattarget=stattarget@entry=9999)at mcv.c:204
 
#8  0x00005653d8513819 in BuildRelationExtStatistics (onerel=onerel@entry=0x7fa5b26ef658, inh=inh@entry=false,
totalrows=1679616,numrows=numrows@entry=1679616, rows=rows@entry=0x7fa5a4103050, natts=natts@entry=7, 
 
    vacattrstats=vacattrstats@entry=0x5653d98b76b0) at extended_stats.c:213

The fix seems to be to CHECK_FOR_INTERRUPTS() within multi_sort_compare().
That would supercede the other two CHECK_FOR_INTERRUPTS I'd proposed, and
handle mcv, depends, and ndistinct all at once.

Does that sound right ?

For MCV, there's also ~0.6sec spent in build_column_frequencies(), which (if
needed) would be addressed by adding CFI in sort_item_compare.



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Michael Paquier
Дата:
Сообщение: Re: [v15 beta] pg_upgrade failed if earlier executed with -c switch
Следующее
От: Guillaume Lelarge
Дата:
Сообщение: Re: Proposal: adding a better description in psql command about large objects