Discussion: select limit error in file_fdw


select limit error in file_fdw

From:
Erik Rijkers
Date:
Hello,

I have noticed that since ffa4cbd623, a foreign table that pulls data
from a PROGRAM (in this case an unzip call) will fail if there is a 
LIMIT on the SELECT
(while succeeding without LIMIT). Below is an example.
(Table size matters, so larger machines than mine may need more than 
those 100000 rows.)

A pre-ffa4cbd623 instance did not have this problem (and neither does 11.1).

This seems like a bug to me.

--------------- 8< ------------------------------
#!/bin/bash

# service=prod
   service=dev11  # this is latest dev11 (in 11.1 it still worked;
                  # commit ffa4cbd623 looks pertinent)

csv_file=/tmp/t.txt

echo "select n from generate_series(1, 100000) as f(n)" | psql -qtAX 
service=$service > $csv_file

zip ~/t.zip $csv_file

echo "
drop   server if     exists test_server cascade;
create server if not exists test_server foreign data wrapper file_fdw;
create schema if not exists tmp;
drop   foreign table if exists tmp.t cascade;
create foreign table           tmp.t (n int)
server test_server
options (
     program       'unzip -p \"/home/aardvark/t.zip\" \"tmp/t.txt\"'
   , format 'csv'
   , header 'TRUE'
   , delimiter E'\t'
);
" | psql -X service=$service

# this works OK:
echo "table tmp.t;" | psql -Xa  service=$service | head

# this fails in latest dev11 :
echo "table tmp.t limit 10;" | psql -Xa service=$service
--------------- 8< ------------------------------

Output:

updating: tmp/t.txt (deflated 63%)
DROP SERVER
CREATE SERVER
CREATE SCHEMA
DROP FOREIGN TABLE
CREATE FOREIGN TABLE
table tmp.t;
    n
--------
       2
       3
       4
       5
       6
       7
       8
table tmp.t limit 10;
ERROR:  program "unzip -p "/home/aardvark/t.zip" "tmp/t.txt"" failed
DETAIL:  child process exited with exit code 141

It would be nice to get this working again.


Thanks,

Erik Rijkers



Re: select limit error in file_fdw

From:
Tom Lane
Date:
Erik Rijkers <er@xs4all.nl> writes:
> I have noticed that since ffa4cbd623, a foreign table that pulls data
> from a PROGRAM (in this case an unzip call) will fail if there is a 
> LIMIT on the SELECT
> (while succeeding without LIMIT). Below is an example.

Um ... this example works for me, in both HEAD and v11 branch tip.
Moreover, the behavior you describe is exactly what ffa4cbd623 was
intended to fix.  Is there any chance that you got 11.1 and v11
branch tip mixed up?

If not, there must be some platform-specific behavior involved.
What are you testing on, exactly?

            regards, tom lane


Re: select limit error in file_fdw

From:
Erik Rijkers
Date:
On 2018-12-16 07:03, Tom Lane wrote:
> Erik Rijkers <er@xs4all.nl> writes:
>> I have noticed that since ffa4cbd623, a foreign table that pulls data
>> from a PROGRAM (in this case an unzip call) will fail if there is a
>> LIMIT on the SELECT
>> (while succeeding without LIMIT). Below is an example.
> 
> Um ... this example works for me, in both HEAD and v11 branch tip.
> Moreover, the behavior you describe is exactly what ffa4cbd623 was
> intended to fix.  Is there any chance that you got 11.1 and v11
> branch tip mixed up?

I admit it's suspicious. I am assuming I pull the latest from
REL_11_STABLE, but I will have another hard look at my build stuff.

On the other hand, in that test.sh, have you tried enlarging the test 
table? It works for me too with small enough values in that 
generate_series.

> If not, there must be some platform-specific behavior involved.
> What are you testing on, exactly?

This is debian 9/Stretch:

/etc/os-release:
"Debian GNU/Linux 9 (stretch)"

uname -a
Linux gulo 4.9.0-8-amd64 #1 SMP Debian 4.9.130-2 (2018-10-27) x86_64 
GNU/Linux

/proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 42
model name      : Intel(R) Core(TM) i5-2400 CPU @ 3.10GHz
stepping        : 7
microcode       : 0x25
cpu MHz         : 2299.645
cache size      : 6144 KB
physical id     : 0
siblings        : 4
core id         : 0
cpu cores       : 4
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge 
mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe 
syscall nx rdtscp l
bugs            : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass 
l1tf
bogomips        : 6185.58
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:


$ (PGSERVICE=dev11 psql -c "select version()")
  PostgreSQL 11.1_REL_11_STABLE_20181216_0458_171c on 
x86_64-pc-linux-gnu, compiled by gcc (Debian 6.3.0-18+deb9u1) 6.3.0 
20170516, 64-bit
(1 row)

(note the '171c' at the end, tacked on via --with-extra-version)


What other info could be useful?





Re: select limit error in file_fdw

From:
Erik Rijkers
Date:
On 2018-12-16 11:19, Erik Rijkers wrote:
> On 2018-12-16 07:03, Tom Lane wrote:
>> Erik Rijkers <er@xs4all.nl> writes:
>>> I have noticed that since ffa4cbd623, a foreign table that pulls data
>>> from a PROGRAM (in this case an unzip call) will fail if there is a
>>> LIMIT on the SELECT
>>> (while succeeding without LIMIT). Below is an example.
>> 
>> Um ... this example works for me, in both HEAD and v11 branch tip.
>> Moreover, the behavior you describe is exactly what ffa4cbd623 was
>> intended to fix.  Is there any chance that you got 11.1 and v11
>> branch tip mixed up?
> 
> I admit it's suspicious. I am assuming I pull the latest, from
> REL_11_STABLE, but I will have another hard look at my build stuff.

To circumvent a possible bug in my normal build stuff, I built
an instance from scratch; maybe someone could check it for any errors
that I may have overlooked?

The instance built with this script still has the LIMIT error.

I did notice that the error (as provoked by the earlier posted test.sh)
can be avoided by adding 'count(*) over ()' to the select list.
Not really surprising, I suppose, since that presumably forces the whole
result to be read, so the pipe is never closed early.

Here is my scratch_build.sh:

------------
#!/bin/bash

git --version

project=scratch

# shutdown - just in case it's running
/home/aardvark/pg_stuff/pg_installations/pgsql.$project/bin.fast/pg_ctl \
   -D /home/aardvark/pg_stuff/pg_installations/pgsql.scratch/data \
   -l logfile -w stop

cd ~/tmp/

# if [[ 0 -eq 1 ]]; then
   git clone https://git.postgresql.org/git/postgresql.git
   cd postgresql
   git checkout REL_11_STABLE
# else
#   cd postgresql
# # git pull
# fi

echo "deleting stuff"
make distclean &> /dev/null

echo "rebuilding stuff"
time ( ./configure \
   --prefix=/home/aardvark/pg_stuff/pg_installations/pgsql.$project \
   --bindir=/home/aardvark/pg_stuff/pg_installations/pgsql.$project/bin.fast \
   --libdir=/home/aardvark/pg_stuff/pg_installations/pgsql.$project/lib.fast \
   --with-pgport=6011 --quiet --enable-depend --with-openssl --with-libxml \
   --with-libxslt --with-zlib --enable-tap-tests \
   --with-extra-version=_$project \
  && make -j 6 && ( cd contrib; make ) \
  && make check \
  && make install && ( cd contrib; make install ) \
  && /home/aardvark/pg_stuff/pg_installations/pgsql.$project/bin.fast/initdb \
     -D /home/aardvark/pg_stuff/pg_installations/pgsql.$project/data -E UTF8 \
     -A scram-sha-256 --pwfile=/home/aardvark/pg_stuff/.11devel --data-checksums \
     --waldir=/home/aardvark/pg_stuff/pg_installations_wal/pgsql.$project
   rc=$?
   echo "rc [$rc]"
)

/home/aardvark/pg_stuff/pg_installations/pgsql.$project/bin.fast/pg_ctl \
   -D /home/aardvark/pg_stuff/pg_installations/pgsql.scratch/data \
   -l logfile -w start

echo "select current_setting('port'), version()" \
   | /home/aardvark/pg_stuff/pg_installations/pgsql.$project/bin.fast/psql -qX service=scratch

echo "
    create extension file_fdw;
    \\dx
" | /home/aardvark/pg_stuff/pg_installations/pgsql.$project/bin.fast/psql -qX service=scratch

/home/aardvark/pg_stuff/pg_installations/pgsql.$project/bin.fast/pg_ctl \
   -D /home/aardvark/pg_stuff/pg_installations/pgsql.scratch/data \
   -l logfile -w stop

------------

Comments welcome.


thanks,

  Erik Rijkers


Re: select limit error in file_fdw

From:
Tom Lane
Date:
Erik Rijkers <er@xs4all.nl> writes:
> On 2018-12-16 07:03, Tom Lane wrote:
>> Um ... this example works for me, in both HEAD and v11 branch tip.
>> Moreover, the behavior you describe is exactly what ffa4cbd623 was
>> intended to fix.  Is there any chance that you got 11.1 and v11
>> branch tip mixed up?

> [ nope ]
> On the other hand, in that test.sh, have you tried enlarging the test 
> table?

Yeah, I tried sizes ranging up to 1m tuples without success.

However, something else occurred to me this morning, and a bit
later I can reproduce the problem!  I did it by changing the
table's definition to use a shell pipeline:

     program       'unzip -p \"/tmp/t.zip\" \"tmp/t.txt\" | cat'

ERROR:  program "unzip -p "/tmp/t.zip" "tmp/t.txt" | cat" failed
DETAIL:  child process exited with exit code 141

What is happening for me, I think, is that if the PROGRAM is
just "unzip" then the shell exec's it directly, and so the
SIGPIPE result is reported directly to the PG backend.  But
if the PROGRAM is too complicated to handle that way, then
unzip and cat are children of a shell process, and one or both
of them get SIGPIPE, and the shell reports that as exit status
128 + SIGPIPE.  So we need to consider that result as indicating
a sigpipe failure, too.
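
For illustration, here is a minimal standalone C sketch of that mechanism
(this is not the PostgreSQL code; the 'seq' commands are just stand-ins for
the unzip call): it reads a little from a popen()'d command, closes the
pipe early as a LIMIT would, and inspects the status returned by pclose().
With a simple command the shell typically exec's it directly, so pclose()
reports death by SIGPIPE; with a pipeline the shell stays around as the
parent and reports a normal exit with status 128 + SIGPIPE = 141, the code
shown in the DETAIL line above.

------------
/* sigpipe_demo.c -- sketch only; assumes a POSIX system with /bin/sh and seq */
#include <signal.h>
#include <stdio.h>
#include <sys/wait.h>

static void
run_and_report(const char *cmd)
{
    char    buf[16];
    FILE   *fp = popen(cmd, "r");   /* popen() runs cmd via /bin/sh -c */
    int     status;

    if (fp == NULL)
    {
        perror("popen");
        return;
    }

    /* Read only a little, then close the pipe early, like a LIMIT does. */
    (void) fread(buf, 1, sizeof(buf), fp);
    status = pclose(fp);

    if (WIFSIGNALED(status))
        printf("%-25s -> killed by signal %d (SIGPIPE = %d)\n",
               cmd, WTERMSIG(status), SIGPIPE);
    else if (WIFEXITED(status))
        printf("%-25s -> exited with status %d%s\n",
               cmd, WEXITSTATUS(status),
               WEXITSTATUS(status) == 128 + SIGPIPE ? " (= 128 + SIGPIPE)" : "");
}

int
main(void)
{
    run_and_report("seq 1 1000000");        /* simple command, usually exec'd directly */
    run_and_report("seq 1 1000000 | cat");  /* pipeline, wrapped by the shell */
    return 0;
}
------------

If the first case also comes back as a normal exit with status 141 on some
platform, the shell evidently kept itself between popen() and the command
even for a simple invocation, which would explain the original report.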

It remains unclear why you had an intervening shell process when
I didn't, but perhaps that can be chalked up to use of a different
shell?

            regards, tom lane


Re: select limit error in file_fdw

From:
Tom Lane
Date:
I wrote:
> It remains unclear why you had an intervening shell process when
> I didn't, but perhaps that can be chalked up to use of a different
> shell?

To provide some data on that: popen() is presumably invoking /bin/sh,
which on my box is

$ /bin/sh --version
GNU bash, version 4.1.2(1)-release (x86_64-redhat-linux-gnu)

$ rpm -qf /bin/sh
bash-4.1.2-48.el6.x86_64

            regards, tom lane


Re: select limit error in file_fdw

From:
Erik Rijkers
Date:
On 2018-12-16 16:52, Tom Lane wrote:
> Erik Rijkers <er@xs4all.nl> writes:
>> On 2018-12-16 07:03, Tom Lane wrote:
>>> Um ... this example works for me, in both HEAD and v11 branch tip.
>>> Moreover, the behavior you describe is exactly what ffa4cbd623 was
>>> intended to fix.  Is there any chance that you got 11.1 and v11
>>> branch tip mixed up?
> 
>> [ nope ]
>> On the other hand, in that test.sh, have you tried enlarging the test
>> table?
> 
> Yeah, I tried sizes ranging up to 1m tuples without success.
> 
> However, something else occurred to me this morning, and a bit
> later I can reproduce the problem!  I did it by changing the
> table's definition to use a shell pipeline:
> 
>      program       'unzip -p \"/tmp/t.zip\" \"tmp/t.txt\" | cat'
> 
> ERROR:  program "unzip -p "/tmp/t.zip" "tmp/t.txt" | cat" failed
> DETAIL:  child process exited with exit code 141

Curious...

Adding that ' | cat' tail makes all 3 instances (that I have tried) fail:
   11.1 as released,
   REL_11_STABLE, and
   an instance built from d56e0fde

/bin/sh seems to be dash here.

bash version is:
$ /bin/bash --version
GNU bash, version 4.4.12(1)-release (x86_64-pc-linux-gnu)



Re: select limit error in file_fdw

From:
Tom Lane
Date:
Erik Rijkers <er@xs4all.nl> writes:
> On 2018-12-16 16:52, Tom Lane wrote:
>> However, something else occurred to me this morning, and a bit
>> later I can reproduce the problem!  I did it by changing the
>> table's definition to use a shell pipeline:

> /bin/sh seems to be dash, here.

Hm.  That must be the relevant difference.  I just repeated the experiment
on a Fedora 28 box with reasonably up-to-date bash:

$ /bin/sh --version
GNU bash, version 4.4.23(1)-release (x86_64-redhat-linux-gnu)

and it behaves the same as my RHEL6 box: no problem with the direct
invocation of unzip, but a problem if I use a pipeline.

Anyway, we know what to do, so I'll go do it.

            regards, tom lane


Re: select limit error in file_fdw

From:
Erik Rijkers
Date:
On 2018-12-16 19:10, Tom Lane wrote:
> 
> Anyway, we know what to do, so I'll go do it.
> 

Thank you very much. I've now also tested with the original, much larger
file, which no longer gives any problem. I am really glad this works again;
we use this stuff a lot.


Erik Rijkers





Re: select limit error in file_fdw

From:
Tom Lane
Date:
Erik Rijkers <er@xs4all.nl> writes:
> Thank you very much. I've now also tested with the original, much larger 
> file, which gives no problem anymore. I am really glad this works again, 
> we use this stuff a lot.

We appreciate you noticing the problem before 11.2 got out ...

            regards, tom lane