BUG #17619: AllocSizeIsValid violation in parallel hash join

From: PG Bug reporting form
Subject: BUG #17619: AllocSizeIsValid violation in parallel hash join
Date:
Msg-id: 17619-0de62ceda812b8b5@postgresql.org
Replies: Re: BUG #17619: AllocSizeIsValid violation in parallel hash join  (Thomas Munro <thomas.munro@gmail.com>)
List: pgsql-bugs
The following bug has been logged on the website:

Bug reference:      17619
Logged by:          Dmitry Astapov
Email address:      dastapov@gmail.com
PostgreSQL version: 13.8
Operating system:   Linux (CentOS 7)
Description:

This is how the issue looks from the user's perspective (it happens with both
"explain analyze" and plain "select"):

# explain analyze select n, id, rhs from lhs left join (select id, date1,
date2, source,payload from rhs) as rhs using (id) where n > bigint
E'3240754568';
ERROR:  XX000: invalid memory alloc request size 1702125924
LOCATION:  MemoryContextAlloc, mcxt.c:804

The query plan (as shown by explain) is rather simple:
# explain select n, id, rhs from lhs left join (select id, date1, date2,
source,payload from rhs) as rhs using (id) where n > bigint E'3240754568';
                                       QUERY PLAN
─────────────────────────────────────────────────────────────────────────────────────────
 Gather  (cost=186066.70..198955.08 rows=13 width=52)
   Workers Planned: 2
   ->  Parallel Hash Left Join  (cost=185066.70..197953.78 rows=5 width=52)
         Hash Cond: (lhs.id = rhs.id)
         ->  Parallel Seq Scan on lhs  (cost=0.00..12887.04 rows=5 width=20)
               Filter: (n > '3240754568'::bigint)
         ->  Parallel Hash  (cost=180233.53..180233.53 rows=386653 width=44)
               ->  Parallel Seq Scan on rhs  (cost=0.00..180233.53 rows=386653 width=44)


I managed to add "print pid and sleep" into that function and attach gdb,
and this is the backtrace I got:

(gdb) bt
#0  0x00007fbb83da59e0 in __nanosleep_nocancel () from /lib64/libc.so.6
#1  0x00007fbb83da5894 in sleep () from /lib64/libc.so.6
#2  0x0000000000926584 in MemoryContextAlloc (context=0x134d190, size=size@entry=1702125924) at mcxt.c:806
#3  0x000000000092cd9d in sts_read_tuple (meta_data=0x7ffc88d1264c, accessor=0x1360e00) at sharedtuplestore.c:450
#4  sts_parallel_scan_next (accessor=accessor@entry=0x1360e00, meta_data=meta_data@entry=0x7ffc88d1264c) at sharedtuplestore.c:530
#5  0x0000000000682911 in ExecParallelHashJoinNewBatch (hjstate=<optimized out>, hjstate=<optimized out>) at nodeHashjoin.c:1153
#6  ExecHashJoinImpl (parallel=true, pstate=0x13112a0) at nodeHashjoin.c:560
#7  ExecParallelHashJoin (pstate=0x13112a0) at nodeHashjoin.c:607
#8  0x0000000000665f88 in ExecProcNodeInstr (node=0x13112a0) at execProcnode.c:466
#9  0x000000000067b8a4 in ExecProcNode (node=0x13112a0) at ../../../src/include/executor/executor.h:248
#10 gather_getnext (gatherstate=0x1311098) at nodeGather.c:295
#11 ExecGather (pstate=0x1311098) at nodeGather.c:227
#12 0x0000000000665f88 in ExecProcNodeInstr (node=0x1311098) at execProcnode.c:466
#13 0x000000000065e542 in ExecProcNode (node=0x1311098) at ../../../src/include/executor/executor.h:248
#14 ExecutePlan (execute_once=<optimized out>, dest=0xa79fe0 <donothingDR>, direction=<optimized out>, numberTuples=0, sendTuples=true, operation=CMD_SELECT, use_parallel_mode=<optimized out>, planstate=0x1311098, estate=0x1310e10) at execMain.c:1632
#15 standard_ExecutorRun (queryDesc=0x130f980, direction=<optimized out>, count=0, execute_once=<optimized out>) at execMain.c:350
#16 0x00000000005f281f in ExplainOnePlan (plannedstmt=plannedstmt@entry=0x13056d8, into=into@entry=0x0, es=es@entry=0x12f63b0, queryString=queryString@entry=0x122daa0 "explain analyze select n, id, rhs from lhs left join (select id, date1, date2, source,payload from rhs) as rhs using (id) where n > bigint E'3240754568';", params=params@entry=0x0, queryEnv=queryEnv@entry=0x0, planduration=planduration@entry=0x7ffc88d12820, bufusage=0x0) at explain.c:571
#17 0x00000000005f2b8d in ExplainOneQuery (query=<optimized out>, cursorOptions=<optimized out>, into=0x0, es=0x12f63b0, queryString=0x122daa0 "explain analyze select n, id, rhs from lhs left join (select id, date1, date2, source,payload from rhs) as rhs using (id) where n > bigint E'3240754568';", params=0x0, queryEnv=0x0) at explain.c:404
#18 0x00000000005f31f3 in ExplainQuery (pstate=pstate@entry=0x12517c0, stmt=stmt@entry=0x122f848, params=params@entry=0x0, dest=dest@entry=0x12f6318) at explain.c:275
#19 0x00000000007e2785 in standard_ProcessUtility (pstmt=0x12d6168, queryString=0x122daa0 "explain analyze select n, id, rhs from lhs left join (select id, date1, date2, source,payload from rhs) as rhs using (id) where n > bigint E'3240754568';", context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, dest=0x12f6318, qc=0x7ffc88d12ac0) at utility.c:846
#20 0x00000000007e082e in PortalRunUtility (portal=portal@entry=0x128fb10, pstmt=0x12d6168, isTopLevel=<optimized out>, setHoldSnapshot=setHoldSnapshot@entry=true, dest=dest@entry=0x12f6318, qc=qc@entry=0x7ffc88d12ac0) at pquery.c:1153
#21 0x00000000007e0ce5 in FillPortalStore (portal=portal@entry=0x128fb10, isTopLevel=isTopLevel@entry=true) at pquery.c:1026
#22 0x00000000007e105c in PortalRun (portal=portal@entry=0x128fb10, count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=true, run_once=run_once@entry=true, dest=dest@entry=0x12ef350, altdest=altdest@entry=0x12ef350, qc=qc@entry=0x7ffc88d12ca0) at pquery.c:760
#23 0x00000000007dc774 in exec_simple_query (query_string=0x122daa0 "explain analyze select n, id, rhs from lhs left join (select id, date1, date2, source,payload from rhs) as rhs using (id) where n > bigint E'3240754568';") at postgres.c:1238
#24 0x00000000007dda1a in PostgresMain (argc=<optimized out>, argv=argv@entry=0x1257bd8, dbname=0x1257af0 "dastapov", username=<optimized out>) at postgres.c:4347
#25 0x000000000048409f in BackendRun (port=<optimized out>, port=<optimized out>) at postmaster.c:4550
#26 BackendStartup (port=0x124fad0) at postmaster.c:4234
#27 ServerLoop () at postmaster.c:1739
#28 0x0000000000758e05 in PostmasterMain (argc=argc@entry=9, argv=argv@entry=0x1228420) at postmaster.c:1412
#29 0x000000000048546e in main (argc=9, argv=0x1228420) at main.c:210

If I go to frame 3 (sts_read_tuple), it is stopped in this part of
sts_read_tuple:

    if (size > accessor->read_buffer_size)
    {
        size_t        new_read_buffer_size;

        if (accessor->read_buffer != NULL)
            pfree(accessor->read_buffer);
        new_read_buffer_size = Max(size, accessor->read_buffer_size * 2);
        accessor->read_buffer =
            MemoryContextAlloc(accessor->context, new_read_buffer_size);   /* <<<<----- HERE is where the memory allocation happens */
        accessor->read_buffer_size = new_read_buffer_size;
    }

At this point:
(gdb) frame 3
#3  0x000000000092cd9d in sts_read_tuple (meta_data=0x7ffc88d1264c,
accessor=0x1360e00) at sharedtuplestore.c:450
450                MemoryContextAlloc(accessor->context, new_read_buffer_size);
(gdb) p accessor->read_buffer_size
$1 = 5641470
(gdb) p size
$2 = 1702125924
(gdb) p accessor->read_file->name
$5 = 0x1336af8 "i12of16.p1"
(gdb) p accessor->read_file->numFiles
$6 = 1
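
Curiously, the bogus request size is printable ASCII if you treat it as a
little-endian 32-bit word: it spells "date", which looks like rhs column
content rather than a tuple length. A quick decoding check (my own sketch,
assuming little-endian byte order as on this x86-64 box):

```python
import struct

# The failing allocation size from the backtrace.
bogus_size = 1702125924

# Repack it as the four raw bytes sts_read_tuple would have read
# from the batch file (little-endian, as on x86-64).
raw_bytes = struct.pack("<I", bogus_size)

print(hex(bogus_size))  # 0x65746164
print(raw_bytes)        # b'date'
```

This makes me suspect that the reader is interpreting tuple payload bytes as
a size word, rather than the size word itself.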


Looking in base/pgsql_tmp, I see pgsql_tmp4511.0.sharedfileset there, which
is 5.1 GB in size and contains:
$ ls -ltr
-rw------- 1 user group 94M Sep 21 16:05 i7of8.p2.0
-rw------- 1 user group 90M Sep 21 16:05 i7of8.p1.0
-rw------- 1 user group 88M Sep 21 16:05 i7of8.p0.0
-rw------- 1 user group 96M Sep 21 16:05 i6of8.p2.0
-rw------- 1 user group 74M Sep 21 16:05 i6of8.p1.0
-rw------- 1 user group 75M Sep 21 16:05 i6of8.p0.0
-rw------- 1 user group 84M Sep 21 16:05 i5of8.p2.0
-rw------- 1 user group 84M Sep 21 16:05 i5of8.p1.0
-rw------- 1 user group 77M Sep 21 16:05 i5of8.p0.0
-rw------- 1 user group 66M Sep 21 16:05 i4of8.p2.0
-rw------- 1 user group 78M Sep 21 16:05 i4of8.p1.0
-rw------- 1 user group 78M Sep 21 16:05 i4of8.p0.0
-rw------- 1 user group 69M Sep 21 16:05 i3of8.p2.0
-rw------- 1 user group 86M Sep 21 16:05 i3of8.p1.0
-rw------- 1 user group 83M Sep 21 16:05 i3of8.p0.0
-rw------- 1 user group 80M Sep 21 16:05 i2of8.p2.0
-rw------- 1 user group 87M Sep 21 16:05 i2of8.p1.0
-rw------- 1 user group 88M Sep 21 16:05 i2of8.p0.0
-rw------- 1 user group 97M Sep 21 16:05 i1of8.p2.0
-rw------- 1 user group 77M Sep 21 16:05 i1of8.p1.0
-rw------- 1 user group 90M Sep 21 16:05 i1of8.p0.0
-rw------- 1 user group 86M Sep 21 16:05 i9of16.p2.0
-rw------- 1 user group 78M Sep 21 16:05 i9of16.p1.0
-rw------- 1 user group 82M Sep 21 16:05 i9of16.p0.0
-rw------- 1 user group 78M Sep 21 16:05 i8of16.p2.0
-rw------- 1 user group 70M Sep 21 16:05 i8of16.p1.0
-rw------- 1 user group 73M Sep 21 16:05 i8of16.p0.0
-rw------- 1 user group 78M Sep 21 16:05 i7of16.p2.0
-rw------- 1 user group 72M Sep 21 16:05 i7of16.p1.0
-rw------- 1 user group 73M Sep 21 16:05 i7of16.p0.0
-rw------- 1 user group 71M Sep 21 16:05 i6of16.p2.0
-rw------- 1 user group 75M Sep 21 16:05 i6of16.p1.0
-rw------- 1 user group 74M Sep 21 16:05 i6of16.p0.0
-rw------- 1 user group 77M Sep 21 16:05 i5of16.p2.0
-rw------- 1 user group 74M Sep 21 16:05 i5of16.p1.0
-rw------- 1 user group 74M Sep 21 16:05 i5of16.p0.0
-rw------- 1 user group 73M Sep 21 16:05 i4of16.p2.0
-rw------- 1 user group 74M Sep 21 16:05 i4of16.p1.0
-rw------- 1 user group 67M Sep 21 16:05 i4of16.p0.0
-rw------- 1 user group 74M Sep 21 16:05 i3of16.p2.0
-rw------- 1 user group 78M Sep 21 16:05 i3of16.p1.0
-rw------- 1 user group 72M Sep 21 16:05 i3of16.p0.0
-rw------- 1 user group 79M Sep 21 16:05 i2of16.p2.0
-rw------- 1 user group 73M Sep 21 16:05 i2of16.p1.0
-rw------- 1 user group 74M Sep 21 16:05 i2of16.p0.0
-rw------- 1 user group 70M Sep 21 16:05 i1of16.p2.0
-rw------- 1 user group 78M Sep 21 16:05 i1of16.p1.0
-rw------- 1 user group 80M Sep 21 16:05 i1of16.p0.0
-rw------- 1 user group 86M Sep 21 16:05 i15of16.p2.0
-rw------- 1 user group 85M Sep 21 16:05 i15of16.p1.0
-rw------- 1 user group 86M Sep 21 16:05 i15of16.p0.0
-rw------- 1 user group 78M Sep 21 16:05 i14of16.p2.0
-rw------- 1 user group 80M Sep 21 16:05 i14of16.p1.0
-rw------- 1 user group 71M Sep 21 16:05 i14of16.p0.0
-rw------- 1 user group 75M Sep 21 16:05 i13of16.p2.0
-rw------- 1 user group 71M Sep 21 16:05 i13of16.p1.0
-rw------- 1 user group 78M Sep 21 16:05 i13of16.p0.0
-rw------- 1 user group 69M Sep 21 16:05 i12of16.p2.0
-rw------- 1 user group 78M Sep 21 16:05 i12of16.p1.0
-rw------- 1 user group 80M Sep 21 16:05 i12of16.p0.0
-rw------- 1 user group 73M Sep 21 16:05 i11of16.p2.0
-rw------- 1 user group 72M Sep 21 16:05 i11of16.p1.0
-rw------- 1 user group 75M Sep 21 16:05 i11of16.p0.0
-rw------- 1 user group 79M Sep 21 16:05 i10of16.p2.0
-rw------- 1 user group 77M Sep 21 16:05 i10of16.p1.0
-rw------- 1 user group 80M Sep 21 16:05 i10of16.p0.0

So it looks like we are reading i12of16.p1.0, which is only 78M in size. I
have retained a copy of the files in this location for further analysis, if
needed.
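
To illustrate the failure mode I suspect (a toy sketch, not PostgreSQL code):
if the reader's position in a batch file drifts into tuple payload, the next
four bytes of column data get interpreted as a tuple-size word, producing
exactly this kind of huge allocation request:

```python
import io
import struct

# Toy batch-file layout: a 4-byte little-endian size word, then tuple bytes.
tuple_bytes = b"some date1 payload"
batch_file = io.BytesIO(struct.pack("<I", len(tuple_bytes)) + tuple_bytes)

# Correctly positioned read: the size word is sensible.
good_size = struct.unpack("<I", batch_file.read(4))[0]
assert good_size == len(tuple_bytes)

# Misaligned read: start in the middle of the payload instead.
batch_file.seek(4 + 5)  # lands on the bytes b"date"
bogus_size = struct.unpack("<I", batch_file.read(4))[0]
print(bogus_size)  # 1702125924, the exact value from the backtrace
```

Whether the real cause is a misaligned read position or an overwritten size
word in the file, I cannot tell from here.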

Unfortunately, I cannot share my dataset with you, and my attempts to come
up with a synthetic dataset that reproduces this issue have failed so far. I
am very confident, though, that the issue is not caused by database
corruption, as I can take my dataset as two csv files, \copy them into a
freshly initdb-ed database, and immediately reproduce the issue.


The tables in question (lhs and rhs) have 953201 and 927968 rows
respectively and are this large:
\dt+
                              List of relations
 Schema │ Name │ Type  │  Owner   │ Persistence │ Access method │  Size   │ Description
────────┼──────┼───────┼──────────┼─────────────┼───────────────┼─────────┼─────────────
 public │ lhs  │ table │ dastapov │ permanent   │ heap          │ 49 MB   │
 public │ rhs  │ table │ dastapov │ permanent   │ heap          │ 1747 MB │


\d+ lhs
                                Table "public.lhs"
 Column │  Type  │ Collation │ Nullable │ Default │ Storage  │ Stats target │ Description
────────┼────────┼───────────┼──────────┼─────────┼──────────┼──────────────┼─────────────
 id     │ text   │           │          │         │ extended │              │
 n      │ bigint │           │          │         │ plain    │              │
Access method: heap


\d+ rhs
                                Table "public.rhs"
 Column  │ Type │ Collation │ Nullable │ Default │ Storage  │ Stats target │ Description
─────────┼──────┼───────────┼──────────┼─────────┼──────────┼──────────────┼─────────────
 id      │ text │           │          │         │ extended │              │
 date1   │ date │           │          │         │ plain    │              │
 date2   │ date │           │          │         │ plain    │              │
 source  │ text │           │          │         │ extended │              │
 payload │ text │           │          │         │ extended │              │
Access method: heap

# select tablename, attname, avg_width, n_distinct from pg_stats where
tablename in ('lhs', 'rhs') order by 1;
 tablename │ attname │ avg_width │ n_distinct  
───────────┼─────────┼───────────┼─────────────
 lhs       │ id      │        12 │ -0.95371073
 lhs       │ n       │         8 │          -1
 rhs       │ payload │      1383 │          -1
 rhs       │ id      │        12 │          -1
 rhs       │ date1   │         4 │         609
 rhs       │ date2   │         4 │          11
 rhs       │ source  │        17 │        4275


A potentially interesting piece of the puzzle is that there are some long
outliers in rhs.payload and rhs.source, but the rest of the columns have
values of exactly avg_width bytes:

# select log_len, count(*) from (select log(length(payload))::int as log_len
from rhs) foo group by 1 order by 2 desc;
 log_len │ count  
─────────┼────────
       3 │ 840852
       4 │  77776
       5 │   8003
       6 │   1317
       7 │     20
(5 rows)

# select log_len, count(*) from (select log(length(source))::int as log_len
from rhs) foo group by 1 order by 2 desc;
 log_len │ count  
─────────┼────────
       1 │ 907692
       3 │  11793
       2 │   4109
       4 │   2920
       5 │   1454
(5 rows)

As I said, I tried to create tables with similar data using generate_series,
but failed to reproduce the issue. However, I am happy to provide additional
information or do additional debugging as necessary, given that I am unable
to share the offending dataset.

pg_settings:
# select name, setting from pg_settings where setting <> boot_val;
            name            │            setting            
────────────────────────────┼───────────────────────────────
 application_name           │ psql
 archive_command            │ (disabled)
 client_encoding            │ UTF8
 data_directory_mode        │ 0700
 default_text_search_config │ pg_catalog.english
 lc_collate                 │ en_US.utf8
 lc_ctype                   │ en_US.utf8
 lc_messages                │ en_US.utf8
 lc_monetary                │ en_US.utf8
 lc_numeric                 │ en_US.utf8
 lc_time                    │ en_US.utf8
 log_file_mode              │ 0600
 log_filename               │ postgresql-%a.log
 log_rotation_size          │ 0
 log_timezone               │ Europe/London
 log_truncate_on_rotation   │ on
 logging_collector          │ on
 max_locks_per_transaction  │ 128
 max_stack_depth            │ 2048
 server_encoding            │ UTF8
 shared_buffers             │ 4194304
 synchronous_commit         │ off
 temp_buffers               │ 32768
 TimeZone                   │ Europe/London
 unix_socket_directories    │ /home/dastapov/local/pg13-bug
 unix_socket_permissions    │ 0777
 wal_buffers                │ 2048
 work_mem                   │ 262144
(28 rows)

Let me know how I can assist further.

