BUG #18556: Parallel operation error during CREATE TABLE AS statement
От | PG Bug reporting form |
---|---|
Тема | BUG #18556: Parallel operation error during CREATE TABLE AS statement |
Дата | |
Msg-id | 18556-06f52f915c19f7c0@postgresql.org обсуждение исходный текст |
Ответы |
Re: BUG #18556: Parallel operation error during CREATE TABLE AS statement
|
Список | pgsql-bugs |
The following bug has been logged on the website: Bug reference: 18556 Logged by: Shaun Thomas Email address: shaun@tembo.io PostgreSQL version: 16.3 Operating system: Ubuntu 22.04.4 LTS Description: We have been witnessing an intermittent error from a Postgres 16.3 instance if a table is created using a SELECT statement. The error is: ERROR: cannot start commands during a parallel operation The statement itself is very innocuous: CREATE TEMP TABLE parallel_bonk AS ( SELECT document_key, full_document FROM some_schema.source_table WHERE document_key ->> '_id' IN ('xyz', 'pdq') ); Just to eliminate some potential variables, this is a 54-million row HEAP table with the following definition: Column | Type | Collation | Nullable | Default -----------------------------------+--------------------------+-----------+----------+--------- _id | jsonb | | not null | operation_type | text | | | full_document | jsonb | | | ns_db | text | | | ns_coll | text | | | to_db | text | | | to_coll | text | | | document_key | jsonb | | not null | update_description_updated_fields | jsonb | | | update_description_removed_fields | text[] | | | cluster_time | jsonb | | | txn_number | bigint | | | lsid_id | text | | | lsid_uid | text | | | kafka_topic | text | | not null | kafka_partition | integer | | not null | kafka_offset | bigint | | not null | kafka_timestamp | timestamp with time zone | | not null | Indexes: "source_table_pkey" PRIMARY KEY, btree (document_key) "source_table_kafka_offset_idx" UNIQUE, btree (kafka_offset DESC) Note that there are no triggers, rules, constraints, partitions, or any parallel-unsafe entities associated with this table. The associated query plan looks like this: QUERY PLAN -------------------------------------------------------------------------------------------- Gather (cost=1000.00..4045627.93 rows=544751 width=479) Workers Planned: 8 -> Parallel Seq Scan on source_table (cost=0.00..3990152.83 rows=68094 width=479) Filter: ((document_key ->> '_id'::text) = ANY ('{xyz,pdq}'::text[])) JIT: Functions: 4 Options: Inlining true, Optimization true, Expressions true, Deforming true Note that the problem persists regardless of the JIT setting. This appears to be related to the parallel workers in some manner. We haven't been able to distill this down to a simple or reliable repro case yet, as the problem seems to occur entirely at random, suggesting it's some kind of parallel operation race condition. There have been other similar discussions regarding the dbt Postgres ETL toolkit. It appears this ETL kit utilizes a loading procedure that is more likely to trigger this edge case.
В списке pgsql-bugs по дате отправления: