Re: Bug in pg_dump --filter? - Invalid object types can be misinterpreted as valid
From | Xuneng Zhou |
---|---|
Subject | Re: Bug in pg_dump --filter? - Invalid object types can be misinterpreted as valid |
Date | |
Msg-id | CABPTF7WSYRKqZcedtHxVK3JCmPuNwu9MU14Wzz0TJTeKx+wGJQ@mail.gmail.com |
In response to | Bug in pg_dump --filter? - Invalid object types can be misinterpreted as valid (Fujii Masao <masao.fujii@gmail.com>) |
List | pgsql-hackers |
Hi Fujii-san,

Thanks for working on this.

On Sat, Aug 2, 2025 at 5:48 PM Fujii Masao <masao.fujii@gmail.com> wrote:
>
> Hi,
>
> It looks like pg_dump --filter can mistakenly treat invalid object types
> in the filter file as valid ones. For example, the invalid type "table-data"
> (probably a typo for "table_data") is incorrectly recognized as "table",
> and pg_dump runs without error when it should fail.
>
> --------------------------------------------
> $ cat filter.txt
> exclude table-data one
>
> $ pg_dump --filter filter.txt
> --
> -- PostgreSQL database dump
> --
> ...
>
> $ echo $?
> 0
> --------------------------------------------
>
> This happens because pg_dump (filter_get_keyword() in pg_dump/filter.c)
> identifies tokens as sequences of ASCII alphabetic characters, treating
> non-alphabetic characters (like hyphens) as token boundaries. As a result,
> "table-data" is parsed as "table".
>
> To fix this, I've attached the patch that updates pg_dump --filter so that
> it treats tokens as strings of non-space characters separated by spaces
> or line endings, ensuring invalid types like "table-data" are correctly
> rejected. Thought?
>
> With the patch:
> --------------------------------------------
> $ cat filter.txt
> exclude table-data one
>
> $ pg_dump --filter filter.txt
> pg_dump: error: invalid format in filter read from file "filter.txt"
> on line 1: unsupported filter object type: "table-data"
> --------------------------------------------

After testing, the patch LGTM. I noticed two very small possible nits:

1) Comment wording

The loop now calls isspace((unsigned char)*ptr), so a token ends at any
whitespace character, not just at an ASCII space (0x20). Could we revise the
comment from “strings of non-space characters bounded by space characters”
to something like “strings of non-space characters bounded by whitespace”,
so that it matches the behavior?

2) Variable name

    const char *keyword = filter_get_token(&str, &size);
    keyword = filter_get_token(&str, &size);

After the patch, filter_get_token() no longer returns a keyword (a
letters-only identifier); it now returns any non-whitespace token. Renaming
the variable from keyword to token (or similar) might make the intent clearer.

Best,
Xuneng
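For readers following along, the difference between the two tokenization rules discussed in this message can be sketched in standalone C. This is not the code from pg_dump/filter.c or from the patch: the function names alpha_token(), nonspace_token(), and token_equals() are hypothetical and exist only for this illustration, and the logic is a simplified guess at the behavior described in the thread.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true only for ASCII letters, regardless of locale. */
static int
is_ascii_alpha(char c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

/*
 * Hypothetical stand-in for the old rule: skip leading whitespace, then
 * take the longest run of ASCII letters. Any other character (such as a
 * hyphen) silently ends the token.
 */
static const char *
alpha_token(const char *str, size_t *len)
{
    const char *start;

    assert(str != NULL && len != NULL);
    while (isspace((unsigned char) *str))
        str++;
    start = str;
    while (is_ascii_alpha(*str))
        str++;
    *len = (size_t) (str - start);
    return start;
}

/*
 * Hypothetical stand-in for the new rule: skip leading whitespace, then
 * take everything up to the next whitespace character or end of string.
 */
static const char *
nonspace_token(const char *str, size_t *len)
{
    const char *start;

    assert(str != NULL && len != NULL);
    while (isspace((unsigned char) *str))
        str++;
    start = str;
    while (*str != '\0' && !isspace((unsigned char) *str))
        str++;
    *len = (size_t) (str - start);
    return start;
}

/* Compare a (pointer, length) token against a NUL-terminated word. */
static int
token_equals(const char *tok, size_t len, const char *word)
{
    return strlen(word) == len && strncmp(tok, word, len) == 0;
}
```

With the input "table-data one", alpha_token() stops at the hyphen and yields a token that compares equal to "table", while nonspace_token() yields the whole string "table-data", which matches no valid object type and can therefore be rejected. Because the second function relies on isspace(), a tab ends a token just as a space does, which is the point behind the comment-wording nit above.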