BUG #16300: Text line order corruption with COPY command

From: PG Bug reporting form
Subject: BUG #16300: Text line order corruption with COPY command
Date:
Msg-id: 16300-b952db3f81f7f40d@postgresql.org
Responses: Re: BUG #16300: Text line order corruption with COPY command  ("David G. Johnston" <david.g.johnston@gmail.com>)
List: pgsql-bugs
The following bug has been logged on the website:

Bug reference:      16300
Logged by:          Hans Buschmann
Email address:      buschmann@nidsa.net
PostgreSQL version: 12.2
Operating system:   Windows Server 2019 64bit
Description:

A reproducible line order corruption occurs when copying a fairly large test
file into Postgres.

I was trying to import and parse a big .xml file (about 41 MB, 643407 lines)
into a simple import table using the following sequence:


create database x86db template=template0 encoding 'UTF8' lc_collate='C';

\c x86db

create table uops_imp2 (
cline varchar
)
;

copy uops_imp2 from 'N:/downloads/uops_info_instructions_200226.xml';
or
copy uops_imp2 from '/usr/local/hb/uops_info_instructions_200226.xml';

This was tested on different machines, under Windows Server 2019 64bit with
Postgres 12.2 and under Fedora 31 x86-64 with Postgres 12.1:

x86db=# select version ();
                          version
------------------------------------------------------------
 PostgreSQL 12.2, compiled by Visual C++ build 1914, 64-bit
(1 row)

x86db=# select version ();
                                                version
--------------------------------------------------------------------------------------------------------
 PostgreSQL 12.1 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 9.2.1 20190827 (Red Hat 9.2.1-1), 64-bit
(1 row)

The order of the input lines in the original file was verified
in two different editors on Windows:

notepad++ 7.8.5 x64
notepad (built in), with the status line turned on to show line numbers

Lines 627365 through 627392 of the original file (the correct order) are shown here:

        <doc TP="1.0"/>
      </architecture>
    </instruction>
    <instruction asm="VPMADDWD" category="AVX512" cpl="3" evex="1"
extension="AVX512EVEX" iclass="VPMADDWD"
iform="VPMADDWD_ZMMi32_MASKmskw_ZMMi16_MEMi16_AVX512" isa-set="AVX512BW_512"
mask="0" string="VPMADDWD (ZMM, ZMM, M512)" zeroing="0">
      <operand idx="1" name="REG0" type="reg" w="1" width="512"

xtype="i32">ZMM0,ZMM1,ZMM2,ZMM3,ZMM4,ZMM5,ZMM6,ZMM7,ZMM8,ZMM9,ZMM10,ZMM11,ZMM12,ZMM13,ZMM14,ZMM15,ZMM16,ZMM17,ZMM18,ZMM19,ZMM20,ZMM21,ZMM22,ZMM23,ZMM24,ZMM25,ZMM26,ZMM27,ZMM28,ZMM29,ZMM30,ZMM31</operand>
      <operand idx="2" name="REG2" r="1" type="reg" width="512"

xtype="i16">ZMM0,ZMM1,ZMM2,ZMM3,ZMM4,ZMM5,ZMM6,ZMM7,ZMM8,ZMM9,ZMM10,ZMM11,ZMM12,ZMM13,ZMM14,ZMM15,ZMM16,ZMM17,ZMM18,ZMM19,ZMM20,ZMM21,ZMM22,ZMM23,ZMM24,ZMM25,ZMM26,ZMM27,ZMM28,ZMM29,ZMM30,ZMM31</operand>
      <operand idx="3" memory-prefix="zmmword ptr" name="MEM0" r="1"
type="mem" width="512" xtype="i16"/>
      <architecture name="SKX">
        <IACA TP="0.50" TP_ports="0.50" fusion_occurred="1"
ports="1*p05+1*p23" uops="2" version="2.3"/>
        <IACA TP="0.50" TP_ports="0.50" fusion_occurred="1"
ports="1*p05+1*p23" uops="2" version="3.0"/>
        <measurement TP="0.54" TP_ports="0.50" ports="1*p05+1*p23" uops="2"
uops_retire_slots="1">
          <latency cycles="5" start_op="2" target_op="1"/>
          <latency cycles_addr="13" cycles_addr_is_upper_bound="1"
cycles_addr_same_reg="14" cycles_addr_same_reg_is_upper_bound="1"
cycles_mem="10" cycles_mem_is_upper_bound="1" start_op="3" target_op="1"/>
        </measurement>
      </architecture>
      <architecture name="CNL">
        <measurement TP="1.00" TP_ports="1.00" ports="1*p0+1*p23" uops="2"
uops_retire_slots="1">
          <latency cycles="5" start_op="2" target_op="1"/>
          <latency cycles_addr="13" cycles_addr_is_upper_bound="1"
cycles_mem="10" cycles_mem_is_upper_bound="1" start_op="3" target_op="1"/>
        </measurement>
      </architecture>
      <architecture name="ICL">
        <measurement TP="1.00" TP_ports="1.00" ports="1*p0+1*p23" uops="2"
uops_retire_slots="1">
          <latency cycles="5" start_op="2" target_op="1"/>
          <latency cycles_addr="13" cycles_addr_is_upper_bound="1"
cycles_mem="10" cycles_mem_is_upper_bound="1" start_op="3" target_op="1"/>
        </measurement>
        <doc TP="1.0"/>
      </architecture>
    </instruction>

When querying the table with

select * from uops_imp2 offset 627365 limit 27;

I get a different portion of the original lines, with a line from elsewhere
interleaved (marked with ###):
x86db=#
x86db=# select * from uops_imp2 offset 627365 limit 27;

                                                       cline
                                              

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
           <latency cycles="5" start_op="4" target_op="1"/>
         </measurement>
         <doc TP="1.0"/>
       </architecture>
     </instruction>
     <instruction asm="VPMADDWD" category="AVX512" cpl="3" evex="1"
extension="AVX512EVEX" iclass="VPMADDWD"
iform="VPMADDWD_ZMMi32_MASKmskw_ZMMi16_MEMi16_AVX512" isa-set="AVX512BW_512"
mask="0" string="VPMADDWD (ZMM, ZMM, M512)" zeroing="0">
       <operand idx="1" name="REG0" type="reg" w="1" width="512"

xtype="i32">ZMM0,ZMM1,ZMM2,ZMM3,ZMM4,ZMM5,ZMM6,ZMM7,ZMM8,ZMM9,ZMM10,ZMM11,ZMM12,ZMM13,ZMM14,ZMM15,ZMM16,ZMM17,ZMM18,ZMM19,ZMM20,ZMM21,ZMM22,ZMM23,ZMM24,ZMM25,ZMM26,ZMM27,ZMM28,ZMM29,ZMM30,ZMM31</operand>
###           <latency cycles="6" start_op="2" target_op="1"/>
       <operand idx="2" name="REG2" r="1" type="reg" width="512"

xtype="i16">ZMM0,ZMM1,ZMM2,ZMM3,ZMM4,ZMM5,ZMM6,ZMM7,ZMM8,ZMM9,ZMM10,ZMM11,ZMM12,ZMM13,ZMM14,ZMM15,ZMM16,ZMM17,ZMM18,ZMM19,ZMM20,ZMM21,ZMM22,ZMM23,ZMM24,ZMM25,ZMM26,ZMM27,ZMM28,ZMM29,ZMM30,ZMM31</operand>
       <operand idx="3" memory-prefix="zmmword ptr" name="MEM0" r="1"
type="mem" width="512" xtype="i16"/>
       <architecture name="SKX">
         <IACA TP="0.50" TP_ports="0.50" fusion_occurred="1"
ports="1*p05+1*p23" uops="2" version="2.3"/>
         <IACA TP="0.50" TP_ports="0.50" fusion_occurred="1"
ports="1*p05+1*p23" uops="2" version="3.0"/>
         <measurement TP="0.54" TP_ports="0.50" ports="1*p05+1*p23" uops="2"
uops_retire_slots="1">
           <latency cycles="5" start_op="2" target_op="1"/>
           <latency cycles_addr="13" cycles_addr_is_upper_bound="1"
cycles_addr_same_reg="14" cycles_addr_same_reg_is_upper_bound="1"
cycles_mem="10" cycles_mem_is_upper_bound="1" start_op="3" target_op="1"/>
         </measurement>
       </architecture>
       <architecture name="CNL">
         <measurement TP="1.00" TP_ports="1.00" ports="1*p0+1*p23" uops="2"
uops_retire_slots="1">
           <latency cycles="5" start_op="2" target_op="1"/>
           <latency cycles_addr="13" cycles_addr_is_upper_bound="1"
cycles_mem="10" cycles_mem_is_upper_bound="1" start_op="3" target_op="1"/>
         </measurement>
       </architecture>
       <architecture name="ICL">
         <measurement TP="1.00" TP_ports="1.00" ports="1*p0+1*p23" uops="2"
uops_retire_slots="1">
           <latency cycles="5" start_op="2" target_op="1"/>
(27 rows)
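
As a diagnostic sketch (not part of the original test), the same window can be fetched together with the ctid system column, which reports each row's physical location as (block number, item number). Without an ORDER BY clause SQL does not promise any particular row order, so this only shows where the returned rows are stored; it assumes nothing beyond the table defined above.

```sql
-- ctid is a PostgreSQL system column: (block number, item index within the block).
-- Same OFFSET/LIMIT window as the query above; there is no ORDER BY,
-- so the rows come back in whatever order the scan yields them.
select ctid, cline
from uops_imp2
offset 627365 limit 27;
```

If the line marked ### shares a block number with its neighbours, it was stored next to them on disk rather than next to its neighbours from the file.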


In all cases I tried, the original order of the lines was not preserved, and
the disorder was the same each time.
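
One way to compare the imported lines with the file independently of physical row order is to record an explicit line number during the import and sort by it. The following is a minimal sketch under stated assumptions: the table name uops_imp2_numbered and the column line_no are hypothetical and not part of the original test, the table is freshly created so its sequence starts at 1, and every file line becomes exactly one row.

```sql
-- Hypothetical variant of the import table with an explicit line number.
create table uops_imp2_numbered (
    line_no bigserial,   -- filled from a sequence default, one value per input row
    cline   varchar
);

-- Only cline is listed, so COPY fills line_no from its default
-- while it reads the file row by row.
copy uops_imp2_numbered (cline)
    from '/usr/local/hb/uops_info_instructions_200226.xml';

-- Lines 627365 through 627392 of the file, sorted by input position.
select cline
from uops_imp2_numbered
where line_no between 627365 and 627392
order by line_no;
```

With the ORDER BY in place, the result no longer depends on how the rows happen to be stored or scanned, so any remaining difference from the file would point at the import itself.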

The count of all lines seems correct:

x86db=# select count(*) from uops_imp2;
 count
--------
 643407
(1 row)

The same error occurred when using \copy on the psql client side.

To reproduce, the XML file can be downloaded directly from the following
address:

https://uops.info/xml.html

by choosing the file instructions.xml

I have not analyzed other regions of line order corruption any further,
because that is very difficult when you can't rely on Postgres COPY.

I fear similar problems could occur when restoring a pg_dump file, which
also relies on copy commands.

Thanks in advance

Hans Buschmann

