Re: Test instability when pg_dump orders by OID

Поиск
Список
Период
Сортировка
От Robert Haas
Тема Re: Test instability when pg_dump orders by OID
Дата
Msg-id CA+TgmoYnAiypU5sYstJ0nCqr+VVubf5wczANg=KqEeOYmdVeAw@mail.gmail.com
обсуждение исходный текст
Ответ на Test instability when pg_dump orders by OID  (Noah Misch <noah@leadboat.com>)
Список pgsql-hackers
On Mon, Jul 7, 2025 at 3:27 PM Noah Misch <noah@leadboat.com> wrote:
> Let's get rid of pg_dump's need to sort by OID, apart from catalog corruption
> scenarios.

+1. I had at one point believed that sorting by OID was a good way to
make dumps stable, but this disproves that theory. Sorting by logical
properties of the object is better.

> Adding an assert found a total of seven affected object types.
> See the second attached patch.  The drawback is storing five more fields in
> pg_dump memory: oprleft, oprright, opcmethod, opfmethod, and collencoding.
> That seems minor relative to existing pg_dump memory efficiency.  Since this
> is a source of test flakes in v18, I'd like to back-patch to v18.  I'm not
> sure why the buildfarm hasn't seen the above diff, but I expect the diff could
> happen there.  This is another nice win for the new test from commit
> 172259afb563d35001410dc6daad78b250924038.  The order instability was always
> bad for users, but the test brought it to the forefront.  One might argue for
> back-patching $SUBJECT further, too.

I agree with back-patching it at least as far as v18. I think it
probably wouldn't hurt anything to back-patch further, and it might
avoid future buildfarm failures. Against that, there's a remote
possibility that someone who is currently saving pg_dump output for
later comparison, say in a case where OIDs are always stable in
practice, could be displeased to see the pg_dump order change in a
minor release. But that seems like a very weak argument against
back-patching. I can't see us ever deciding to put up with buildfarm
instability on such grounds.

Reviewing:

+ * Sort by name.  This differs from "Name:" in plain format output, which
+ * is a _tocEntry.tag.  For example, DumpableObject.name of a constraint
+ * is pg_constraint.conname, but _tocEntry.tag of a constraint is relname
+ * and conname joined with a space.

This comment is useful, but if I were to be critical, it does a better
job saying what this field isn't than what it is.

+ * Sort by encoding, per pg_collation_name_enc_nsp_index.  This is
+ * mostly academic, because CREATE COLLATION has restrictions to make
+ * (nspname, collname) uniquely identify a collation within a given
+ * DatabaseEncoding.  pg_import_system_collations() bypasses those
+ * restrictions, but pg_dump+restore fails after a
+ * pg_import_system_collations('my_schema') that creates collations
+ * for a blend of encodings.

This comment is also useful, but if I were to be critical again, it
does a better job saying why we shouldn't do what the code then does
than why we should.

Neither of those issues seem like must-fix problems to me.

--
Robert Haas
EDB: http://www.enterprisedb.com



В списке pgsql-hackers по дате отправления: