Re: Add tuples_skipped to pg_stat_progress_copy

From torikoshia
Subject Re: Add tuples_skipped to pg_stat_progress_copy
Date
Msg-id bfd3fedea94e564a45547ad854b533f1@oss.nttdata.com
In response to Re: Add tuples_skipped to pg_stat_progress_copy  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: Add tuples_skipped to pg_stat_progress_copy
List pgsql-hackers
On 2024-01-24 17:05, Masahiko Sawada wrote:
> On Tue, Jan 23, 2024 at 1:02 AM torikoshia <torikoshia@oss.nttdata.com> 
> wrote:
>> 
>> On 2024-01-17 14:47, Masahiko Sawada wrote:
>> > On Wed, Jan 17, 2024 at 2:22 PM torikoshia <torikoshia@oss.nttdata.com>
>> > wrote:
>> >>
>> >> Hi,
>> >>
>> >> 132de9968840c introduced the SAVE_ERROR_TO option to COPY, which makes
>> >> it possible to skip malformed data, but there is no way to watch the
>> >> number of skipped rows during COPY.
>> >>
>> >> The attached patch adds tuples_skipped to pg_stat_progress_copy, which
>> >> counts the tuples skipped because the source data is malformed.
>> >> If SAVE_ERROR_TO is not specified, this column remains zero.
>> >>
>> >> The advantage would be that users can quickly notice and stop COPYing
>> >> when there is a larger amount of skipped data than expected, for
>> >> example.
>> >>
>> >> As described in the commit log, more choices for SAVE_ERROR_TO, such
>> >> as 'log', are expected to be added, and such options may let us know
>> >> the number of skipped tuples during COPY, but exposing the count in
>> >> pg_stat_progress_copy would be easier to monitor.
>> >>
>> >>
>> >> What do you think?
>> >
>> > +1
>> >
>> > The patch is pretty simple. Here is a comment:
>> >
>> > +       (if <literal>SAVE_ERROR_TO</literal> is specified, otherwise
>> > zero).
>> > +      </para></entry>
>> > +     </row>
>> >
>> > To be precise, this counter only advances when a value other than
>> > 'ERROR' is specified to SAVE_ERROR_TO option.
>> 
>> Thanks for your comment and review!
>> 
>> Updated the patch according to your comment and the option name change
>> in b725b7eec.
> 
> Thanks! The patch looks good to me. I'm going to push it tomorrow,
> barring any objections.

Thanks!
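
For illustration, the monitoring use case from the original proposal (noticing an unexpectedly large number of skipped rows and stopping the COPY) could be sketched as below. This is a minimal sketch, not part of the patch: the function name `should_cancel`, the `max_skip_ratio` threshold, and the `min_rows` floor are hypothetical, and the query string is shown only for context. It assumes the `tuples_skipped` column proposed in this thread and that skipped rows are not counted in `tuples_processed`.

```python
# Query a monitoring session might poll; assumes the tuples_skipped column
# proposed in this thread. Shown for context only, it is not executed here.
PROGRESS_QUERY = """
SELECT pid, tuples_processed, tuples_skipped
FROM pg_stat_progress_copy
"""


def should_cancel(tuples_processed, tuples_skipped,
                  max_skip_ratio=0.01, min_rows=1000):
    """Return True when the skipped share of rows seen so far is too high.

    Hypothetical policy: wait until at least min_rows rows have been seen,
    so that a few early bad rows do not trigger a cancel.
    Assumption: skipped rows are not included in tuples_processed.
    """
    seen = tuples_processed + tuples_skipped
    if seen < min_rows:
        return False
    return tuples_skipped / seen > max_skip_ratio
```

A monitoring session could feed each row returned by the query into `should_cancel` and, when it returns True, stop that COPY with `pg_cancel_backend(pid)`.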

>> 
>> BTW, based on this patch, I think we can add another option which
>> specifies the maximum tolerable number of malformed rows.
>> I remember this was discussed in [1], and feel it would be useful when
>> loading 'dirty' data but there is a limit to how dirty it can be.
>> Attached 0002 is a WIP patch for this (I haven't added docs yet).
> 
> Yeah, it could be a good option.
> 
>> This may be better discussed in another thread, but any comments (e.g.
>> necessity of this option, option name) are welcome.
> 
> I'd recommend forking a new thread for this option. As far as I
> remember, there also was an opinion that "reject limit" stuff is not
> very useful.

OK, I'll make another thread for this.
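
To make the idea of a "maximum tolerable number of malformed rows" concrete, the behavior can be modeled outside the server. The following is a toy Python model under assumed semantics (abort as soon as the number of skipped rows exceeds the limit). It is not the attached 0002 patch, and `load_rows`, `reject_limit`, and `RejectLimitExceeded` are hypothetical names chosen for this sketch.

```python
class RejectLimitExceeded(Exception):
    """Raised when more rows were skipped than the caller tolerates."""


def load_rows(lines, reject_limit):
    """Parse 'id,value' lines into (int, str) pairs, skipping malformed ones.

    Returns (loaded_rows, tuples_skipped). Raises RejectLimitExceeded as soon
    as the skipped count exceeds reject_limit (assumed semantics).
    """
    loaded, skipped = [], 0
    for line in lines:
        try:
            raw_id, value = line.split(",", 1)
            loaded.append((int(raw_id), value))
        except ValueError:
            # Malformed row: missing comma or non-integer id.
            skipped += 1
            if skipped > reject_limit:
                raise RejectLimitExceeded(
                    f"skipped {skipped} rows, limit is {reject_limit}"
                )
    return loaded, skipped
```

With a limit of 1, a single bad row among good ones is skipped and counted, while a limit of 0 turns the first malformed row into an error, which matches the default behavior of failing on bad input.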


-- 
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation


