Re: proposal: possibility to read dumped table's name from file

From	Pavel Stehule
Subject	Re: proposal: possibility to read dumped table's name from file
Date	5 July 2020, 23:08:09
Msg-id	CAFj8pRCsZuKRRdqZoYYo_wW-YjpWGA_ie9nhwJRd9E+GmsShrQ@mail.gmail.com
In reply to	Re: proposal: possibility to read dumped table's name from file (Justin Pryzby <pryzby@telsasoft.com>)
Responses	Re: proposal: possibility to read dumped table's name from file (Justin Pryzby <pryzby@telsasoft.com>)
List	pgsql-hackers

On Wed, Jul 1, 2020 at 23:24, Justin Pryzby <pryzby@telsasoft.com> wrote:

On Thu, Jun 11, 2020 at 09:36:18AM +0200, Pavel Stehule wrote:
> On Wed, Jun 10, 2020 at 0:30, Justin Pryzby <pryzby@telsasoft.com> wrote:
> > > + /* ignore empty rows */
> > > + if (*line != '\0')
> >
> > Maybe: if line=='\0': continue
> > We should also support comments.

Comment support is still missing but easily added :)

I tried this patch and it works for my purposes.

Also, your getline is dynamically re-allocating lines of arbitrary length.
Possibly that's not needed. We'll typically read "+t schema.relname", which is
132 chars. Maybe it's sufficient to do
char buf[1024];
fgets(buf);
if strchr(buf, '\n') == NULL: error();
ret = pstrdup(buf);

63 bytes is the maximum effective identifier length, but it is not the maximum length of an identifier as written. A 1024-byte buffer would very probably be enough for everything, but I do not want to introduce a new magic limit, especially when the dynamic implementation is not too hard.

A table name can be very long. Sometimes the names are kept in external storage at full length, and it would not be practical to require truncating them in the filter file.

The dynamic approach is very efficient here, because the resized (enlarged) buffer is reused for the following rows, so realloc should not be called often. So when I have to choose between two implementations of similar complexity, I prefer the more dynamic code without hardcoded limits. The dynamic version adds no overhead.
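The buffer reuse described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the patch: the function name `read_filter_line`, its return convention, and the initial capacity of 64 bytes are all hypothetical choices made for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not taken from the patch): read one line from fp
 * into *buf, enlarging *buf and *cap only when the line does not fit.
 * The trailing newline is stripped.
 *
 * Returns 1 when a line was read, 0 at end of input, -1 on out of memory.
 */
static int
read_filter_line(FILE *fp, char **buf, size_t *cap)
{
    size_t len = 0;
    int    c;

    assert(fp != NULL && buf != NULL && cap != NULL);

    while ((c = fgetc(fp)) != EOF)
    {
        /* keep room for this character plus the terminating NUL */
        if (len + 2 > *cap)
        {
            size_t newcap = (*cap == 0) ? 64 : *cap * 2;
            char  *tmp = realloc(*buf, newcap);

            if (tmp == NULL)
                return -1;      /* caller still owns the old *buf */
            *buf = tmp;
            *cap = newcap;
        }

        if (c == '\n')
            break;

        (*buf)[len++] = (char) c;
    }

    /* nothing read at all: end of input */
    if (c == EOF && len == 0)
        return 0;

    (*buf)[len] = '\0';
    return 1;
}
```

A caller would start with `char *buf = NULL; size_t cap = 0;`, call the function in a loop, and free `buf` once after the loop. After one long line has enlarged the buffer, later lines of the same or smaller length go through without any further realloc.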

In any case, you could have getline return a char* and (rather than following
GNU) no need to take char**, int* parameters to conflate inputs and outputs.

No, it has a special benefit: it eliminates the short malloc/free cycle. When some line is longer, the buffer (and its recorded size limit) is increased, and for later rows of the same or smaller size no realloc is necessary.

I realized that --filter has an advantage over the previous implementation
(with multiple --exclude-* and --include-*) in that it's possible to use stdin
for includes *and* excludes.

Yes, it looks like the better choice.
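To illustrate how a single filter stream could carry both includes and excludes, here is a minimal sketch of parsing one filter line. Only the "+t schema.relname" form is mentioned in this thread. Everything else is an assumption made for the example: that "-" marks an exclude, that "#" starts a comment, and the names `FilterRule` and `parse_filter_line`.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical representation of one filter rule (not from the patch). */
typedef struct FilterRule
{
    bool        include;    /* true for '+', false for '-' (assumed syntax) */
    char        objtype;    /* e.g. 't' for table */
    const char *pattern;    /* points into the line passed by the caller */
} FilterRule;

/*
 * Parse a line such as "+t schema.relname".
 *
 * Returns 1 when a rule was parsed, 0 when the line is empty or a
 * '#' comment (assumed comment syntax), -1 when the line is malformed.
 */
static int
parse_filter_line(const char *line, FilterRule *rule)
{
    const char *p = line;

    assert(line != NULL && rule != NULL);

    /* skip leading whitespace */
    while (isspace((unsigned char) *p))
        p++;

    /* ignore empty rows and comments */
    if (*p == '\0' || *p == '#')
        return 0;

    if (*p != '+' && *p != '-')
        return -1;
    rule->include = (*p == '+');
    p++;

    /* one-character object type must follow the sign directly */
    if (*p == '\0' || isspace((unsigned char) *p))
        return -1;
    rule->objtype = *p++;

    /* at least one whitespace character, then the name pattern */
    if (!isspace((unsigned char) *p))
        return -1;
    while (isspace((unsigned char) *p))
        p++;
    if (*p == '\0')
        return -1;

    rule->pattern = p;
    return 1;
}
```

With a format like this, the same file or stdin stream can mix "+t" and "-t" lines, which is not possible when includes and excludes each need their own command-line option.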

By chance, I had the opportunity yesterday to re-use with rsync a regex that
I'd previously been using with pg_dump and grep. What this patch calls
"--filter" in rsync is called "--filter-from". rsync's --filter-from rejects
filters of length longer than max filename, so I had to split it up into
multiple lines instead of using regex alternation ("|"). This option is a
close parallel in pg_dump.

We can talk about the option name; maybe "--filter-from" is better than just "--filter".

Regards

Pavel

--
Justin
