Re: Add on_error and log_verbosity options to file_fdw

Поиск
Список
Период
Сортировка
От Masahiko Sawada
Тема Re: Add on_error and log_verbosity options to file_fdw
Дата
Msg-id CAD21AoBij_2cy5T7ihk9ArRQzDcXVHoK13vzy34zAUAMLgrKkQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Add on_error and log_verbosity options to file_fdw  (torikoshia <torikoshia@oss.nttdata.com>)
Ответы Re: Add on_error and log_verbosity options to file_fdw
Список pgsql-hackers
Hi,

On Thu, Jul 18, 2024 at 6:38 PM torikoshia <torikoshia@oss.nttdata.com> wrote:
>
> On 2024-07-05 00:27, torikoshia wrote:
> > Hi,
> >
> > With the current file_fdw, if even one line of data conversion fails,
> > the contents of the file cannot be referenced at all:
> >
> >   =# \! cat data/test.data
> >   1,a
> >   2,b
> >   a,c
> >   =# create foreign table f_fdw_test_1 (i int, t text) server f_fdw
> > options (filename 'test.data', format 'csv');
> >   CREATE FOREIGN TABLE
> >
> >   =# table f_fdw_test_1;
> >   ERROR:  invalid input syntax for type integer: "a"
> >   CONTEXT:  COPY f_fdw_test, line 3, column i: "a"
> >
> > Since we'll support ON_ERROR option which tolerates data conversion
> > errors in COPY FROM and LOG_VERBOSITY option at v17[1], how about
> > supporting them on file_fdw?

+1

> >
> > This idea comes from Fujii-san[2], and I think it'd be useful when
> > reading a bit dirty data.
> >
> > Attached PoC patch works like below:
> >
> >   =# create foreign table f_fdw_test_2 (i int, t text) server f_fdw
> > options (filename 'test.data', format 'csv', on_error 'ignore');
> >   CREATE FOREIGN TABLE
> >
> >   =# table f_fdw_test_2;
> >   NOTICE:  1 row was skipped due to data type incompatibility
> >    i | t
> >   ---+---
> >    1 | a
> >    2 | b
> >   (2 rows)

I'm slightly concerned that users might not want to see the NOTICE
message for every scan. Unlike COPY FROM, scanning a file via file_fdw
could be frequent. An alternative idea of place to write the
information of the number of malformed rows would be the EXPLAIN
command as follow:

                           QUERY PLAN
----------------------------------------------------------------
 Foreign Scan on public.test  (cost=0.00..1.10 rows=1 width=12)
   Output: a, b, c
   Foreign File: test.csv
   Foreign File Size: 12 b
   Skipped Rows: 10

> >
> >
> >   =# create foreign table f_fdw_test_3 (i int, t text) server f_fdw
> > options (filename 'test.data', format 'csv', on_error 'ignore',
> > log_verbosity 'verbose');
> > CREATE FOREIGN TABLE
> >
> >   =# table f_fdw_test_3 ;
> >   NOTICE:  skipping row due to data type incompatibility at line 3 for
> > column i: "a"
> >   NOTICE:  1 row was skipped due to data type incompatibility
> >    i | t
> >   ---+---
> >    1 | a
> >    2 | b
> >   (2 rows)

IIUC we have to execute ALTER FOREIGN TABLE to change the
log_verbosity value and which requires to be the owner. Which seems
not to be user-friendly.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Matthew Kim
Дата:
Сообщение: Re: Remove dependence on integer wrapping
Следующее
От: Nathan Bossart
Дата:
Сообщение: Re: Remove dependence on integer wrapping