Re: Test instability when pg_dump orders by OID
От | Robert Haas |
---|---|
Тема | Re: Test instability when pg_dump orders by OID |
Дата | |
Msg-id | CA+TgmoYnAiypU5sYstJ0nCqr+VVubf5wczANg=KqEeOYmdVeAw@mail.gmail.com обсуждение исходный текст |
Ответ на | Test instability when pg_dump orders by OID (Noah Misch <noah@leadboat.com>) |
Список | pgsql-hackers |
On Mon, Jul 7, 2025 at 3:27 PM Noah Misch <noah@leadboat.com> wrote: > Let's get rid of pg_dump's need to sort by OID, apart from catalog corruption > scenarios. +1. I had at one point believed that sorting by OID was a good way to make dumps stable, but this disproves that theory. Sorting by logical properties of the object is better. > Adding an assert found a total of seven affected object types. > See the second attached patch. The drawback is storing five more fields in > pg_dump memory: oprleft, oprright, opcmethod, opfmethod, and collencoding. > That seems minor relative to existing pg_dump memory efficiency. Since this > is a source of test flakes in v18, I'd like to back-patch to v18. I'm not > sure why the buildfarm hasn't seen the above diff, but I expect the diff could > happen there. This is another nice win for the new test from commit > 172259afb563d35001410dc6daad78b250924038. The order instability was always > bad for users, but the test brought it to the forefront. One might argue for > back-patching $SUBJECT further, too. I agree with back-patching it at least as far as v18. I think it probably wouldn't hurt anything to back-patch further, and it might avoid future buildfarm failures. Against that, there's a remote possibility that someone who is currently saving pg_dump output for later comparison, say in a case where OIDs are always stable in practice, could be displeased to see the pg_dump order change in a minor release. But that seems like a very weak argument against back-patching. I can't see us ever deciding to put up with buildfarm instability on such grounds. Reviewing: + * Sort by name. This differs from "Name:" in plain format output, which + * is a _tocEntry.tag. For example, DumpableObject.name of a constraint + * is pg_constraint.conname, but _tocEntry.tag of a constraint is relname + * and conname joined with a space. This comment is useful, but if I were to be critical, it does a better job saying what this field isn't than what it is. + * Sort by encoding, per pg_collation_name_enc_nsp_index. This is + * mostly academic, because CREATE COLLATION has restrictions to make + * (nspname, collname) uniquely identify a collation within a given + * DatabaseEncoding. pg_import_system_collations() bypasses those + * restrictions, but pg_dump+restore fails after a + * pg_import_system_collations('my_schema') that creates collations + * for a blend of encodings. This comment is also useful, but if I were to be critical again, it does a better job saying why we shouldn't do what the code then does than why we should. Neither of those issues seem like must-fix problems to me. -- Robert Haas EDB: http://www.enterprisedb.com
В списке pgsql-hackers по дате отправления: