Re: Parallel copy
From | Tomas Vondra |
---|---|
Subject | Re: Parallel copy |
Date | |
Msg-id | 20201030203730.eicjk6542pwoicvb@development |
In response to | Re: Parallel copy (vignesh C <vignesh21@gmail.com>) |
Responses | Re: Parallel copy |
List | pgsql-hackers |
Hi,

I've done a bit more testing today, and I think the parsing is busted in some way. Consider this:

    test=# create extension random;
    CREATE EXTENSION

    test=# create table t (a text);
    CREATE TABLE

    test=# insert into t select random_string(random_int(10, 256*1024)) from generate_series(1,10000);
    INSERT 0 10000

    test=# copy t to '/mnt/data/t.csv';
    COPY 10000

    test=# truncate t;
    TRUNCATE TABLE

    test=# copy t from '/mnt/data/t.csv';
    COPY 10000

    test=# truncate t;
    TRUNCATE TABLE

    test=# copy t from '/mnt/data/t.csv' with (parallel 2);
    ERROR:  invalid byte sequence for encoding "UTF8": 0x00
    CONTEXT:  COPY t, line 485: "m&\nh%_a"%r]>qtCl:Q5ltvF~;2oS6@HB>F>og,bD$Lw'nZY\tYl#BH\t{(j~ryoZ08"SGU~.}8CcTRk1\ts$@U3szCC+U1U3i@P..."
    parallel worker

The functions come from an extension I use to generate random data; I've pushed it to github [1]. random_string() generates a random string made of ASCII characters, symbols and a couple of special characters (\r\n\t). The intent was to try loading data where a field may span multiple 64kB blocks and may contain newlines etc.

The non-parallel copy works fine, the parallel one fails. I haven't investigated the details, but I guess it gets confused about where a string starts/ends, or something like that.

[1] https://github.com/tvondra/random

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
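
For anyone who wants comparable input without installing the extension, here is a rough Python sketch that produces rows in COPY's text format. It is not the extension's implementation: the alphabet, the helper names (`random_string`, `copy_escape`, `copy_lines`, `write_copy_file`) and the seed handling are all assumptions made for illustration. Only the length range (10 to 256*1024) and the presence of \r, \n and \t are taken from the description above.

```python
import random
import string

# Assumed alphabet: ASCII letters, digits, punctuation, space, plus \r \n \t.
# The real extension may use a different character set.
ALPHABET = string.ascii_letters + string.digits + string.punctuation + " \r\n\t"

# Backslash escapes that COPY's text format uses for these characters.
ESCAPES = {"\\": "\\\\", "\t": "\\t", "\n": "\\n", "\r": "\\r"}


def random_string(rng, min_len, max_len):
    """Return a random string whose length is uniform in [min_len, max_len]."""
    length = rng.randint(min_len, max_len)
    return "".join(rng.choices(ALPHABET, k=length))


def copy_escape(value):
    """Escape one field value so it fits on a single COPY text-format line."""
    return "".join(ESCAPES.get(ch, ch) for ch in value)


def copy_lines(rows, min_len=10, max_len=256 * 1024, seed=0):
    """Yield `rows` escaped single-column lines (without the trailing newline)."""
    rng = random.Random(seed)
    for _ in range(rows):
        yield copy_escape(random_string(rng, min_len, max_len))


def write_copy_file(path, rows, **kwargs):
    """Write the generated lines to `path`, one row per line."""
    # newline="\n" keeps Python from translating line endings on write.
    with open(path, "w", encoding="ascii", newline="\n") as f:
        for line in copy_lines(rows, **kwargs):
            f.write(line + "\n")
```

Calling `write_copy_file` with 10000 rows and the default lengths gives a file of roughly the same shape as the one in the session above (well over a gigabyte), which can then be loaded with the same `copy t from ...` commands. Whether it triggers the same failure is untested.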