Обсуждение: pg_upgrading to 10.1 corrupts (varchar,varchar) UNIQUE indices
Hi there, this is a reproducible error. We recently pg_upgraded our production database to 10.1 from 9.6.6. The upgrade runs fine with the suggestion to analyze all data. [rihad@postgres-10-test]$ cat analyze_new_cluster.sh #!/bin/sh echo 'This script will generate minimal optimizer statistics rapidly' echo 'so your system is usable, and then gather statistics twice more' echo 'with increasing accuracy. When it is done, your system will' echo 'have the default level of optimizer statistics.' echo echo 'If you have used ALTER TABLE to modify the statistics target for' echo 'any tables, you might want to remove them and restore them after' echo 'running this script because they will delay fast statistics generation.' echo echo 'If you would like default statistics as quickly as possible, cancel' echo 'this script and run:' echo ' "/10.1/bin/vacuumdb" --all --analyze-only' echo "/10.1/bin/vacuumdb" --all --analyze-in-stages echo echo 'Done' which we run after the upgrade. It doesn't matter if we do the analyze so in a test environment with no activity or run it concurrently with the already started production queries. The problem: Some of our tables have indices of the form: "index_translations_on_locale_and_key" UNIQUE, btree (locale, key) locale | character varying(255) | | | key | character varying(255) | | | or "index_users_on_email_and_type" UNIQUE, btree (email, type) email | character varying(255) | | not null | ''::character varying type | character varying | | not null | (these are different tables) Trying to find data using the specified indices fails to find matching rows: foo=# select * from translations where locale='de' and key='extranet.options.places.age_brackets_hints.a'; id | locale | key | value | created_at | updated_at | resolved ----+--------+-----+-------+------------+------------+---------- (0 rows) foo=# explain select * from translations where locale='de' and key='extranet.options.places.age_brackets_hints.a'; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------ Index Scan using index_translations_on_locale_and_key on translations (cost=0.41..2.63 rows=1 width=234) Index Cond: (((locale)::text = 'de'::text) AND ((key)::text = 'extranet.options.places.age_brackets_hints.a'::text)) (2 rows) reindexing the table fixes the issue: foo=# reindex index index_translations_on_locale_and_key ; REINDEX foo=# select * from translations where locale='de' and key='extranet.options.places.age_brackets_hints.a'; id | locale | key | value | created_at | updated_at | resolved --------+--------+----------------------------------------------+-----------------------------------------------------+----------------------------+----------------------------+---------- 136373 | de | extranet.options.places.age_brackets_hints.a | Alter für einen vollen Gast-Tarif, z.B ab 12 Jahre | 2017-08-22 11:27:27.774259 | 2017-09-02 09:05:45.244927 | f (1 row) foo=# explain select * from translations where locale='de' and key='extranet.options.places.age_brackets_hints.a'; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------ Index Scan using index_translations_on_locale_and_key on translations (cost=0.41..2.63 rows=1 width=234) Index Cond: (((locale)::text = 'de'::text) AND ((key)::text = 'extranet.options.places.age_brackets_hints.a'::text)) (2 rows) The upgrade guide states that only hash indices should be rebuilt after the upgrade, not btree ones.
On Fri, Dec 22, 2017 at 12:53 PM, rihad <rihad@mail.ru> wrote:
Hi there, this is a reproducible error. We recently pg_upgraded our production database to 10.1 from 9.6.6. The upgrade runs fine with the suggestion to analyze all data.
[rihad@postgres-10-test]$ cat analyze_new_cluster.sh
#!/bin/sh
echo 'This script will generate minimal optimizer statistics rapidly'
echo 'so your system is usable, and then gather statistics twice more'
echo 'with increasing accuracy. When it is done, your system will'
echo 'have the default level of optimizer statistics.'
echo
echo 'If you have used ALTER TABLE to modify the statistics target for'
echo 'any tables, you might want to remove them and restore them after'
echo 'running this script because they will delay fast statistics generation.'
echo
echo 'If you would like default statistics as quickly as possible, cancel'
echo 'this script and run:'
echo ' "/10.1/bin/vacuumdb" --all --analyze-only'
echo
"/10.1/bin/vacuumdb" --all --analyze-in-stages
echo
echo 'Done'
which we run after the upgrade. It doesn't matter if we do the analyze so in a test environment with no activity or run it concurrently with the already started production queries.
The problem:
Some of our tables have indices of the form:
"index_translations_on_locale_and_key" UNIQUE, btree (locale, key)
locale | character varying(255) | | |
key | character varying(255) | | |
or
"index_users_on_email_and_type" UNIQUE, btree (email, type)
email | character varying(255) | | not null | ''::character varying
type | character varying | | not null |
(these are different tables)
Trying to find data using the specified indices fails to find matching rows:
foo=# select * from translations where locale='de' and key='extranet.options.places.age_brackets_hints.a';
id | locale | key | value | created_at | updated_at | resolved
----+--------+-----+-------+------------+------------+------ ----
(0 rows)
foo=# explain select * from translations where locale='de' and key='extranet.options.places.age_brackets_hints.a';
QUERY PLAN
------------------------------------------------------------ ------------------------------ ------------------------------
Index Scan using index_translations_on_locale_and_key on translations (cost=0.41..2.63 rows=1 width=234)
Index Cond: (((locale)::text = 'de'::text) AND ((key)::text = 'extranet.options.places.age_brackets_hints.a'::text))
(2 rows)
reindexing the table fixes the issue:
foo=# reindex index index_translations_on_locale_and_key ;
REINDEX
foo=# select * from translations where locale='de' and key='extranet.options.places.age_brackets_hints.a';
id | locale | key | value | created_at | updated_at | resolved
--------+--------+------------------------------------------ ----+------------------------- ----------------------------+- ---------------------------+-- --------------------------+--- -------
136373 | de | extranet.options.places.age_brackets_hints.a | Alter für einen vollen Gast-Tarif, z.B ab 12 Jahre | 2017-08-22 11:27:27.774259 | 2017-09-02 09:05:45.244927 | f
(1 row)
foo=# explain select * from translations where locale='de' and key='extranet.options.places.age_brackets_hints.a';
QUERY PLAN
------------------------------------------------------------ ------------------------------ ------------------------------
Index Scan using index_translations_on_locale_and_key on translations (cost=0.41..2.63 rows=1 width=234)
Index Cond: (((locale)::text = 'de'::text) AND ((key)::text = 'extranet.options.places.age_brackets_hints.a'::text))
(2 rows)
The upgrade guide states that only hash indices should be rebuilt after the upgrade, not btree ones.
rihad wrote: > Hi there, this is a reproducible error. We recently pg_upgraded our > production database to 10.1 from 9.6.6. The upgrade runs fine with the > suggestion to analyze all data. [...] > Trying to find data using the specified indices fails to find matching rows: [...] > reindexing the table fixes the issue: Did you also upgrade the operating system or at least libc? Then the problem could come from a modified collation. Compare https://www.postgresql.org/message-id/CB4D1C6BAA80CF146CB0D4F2%40eje.credativ.lan Yours, Laurenz Albe
On 12/22/2017 05:09 PM, Magnus Hagander wrote:
FreeBSD 10.3-RELEASE-p26 amd64What platform are you on
pkg install., how was PostgreSQL installed
Additionally, 10.1 needed for pg_upgrade to run was installed using portmaster -m PREFIX=/10.1, because FreeBSD doesn't allow more than one major version to be installed simultaneously.
After the upgrade took place, 9.6.6 was pkg deleted and 10.1 was pkg installed.
and exactly how was pg_upgrade executed? (or is, since it's reproducible)
sudo -u postgres /10.1/bin/pg_upgrade -j2 --link -r -b /usr/local/bin -B /10.1/bin/ -d /var/db/postgres/data96/ -D /var/db/postgres/data10/
On 12/22/2017 05:09 PM, Laurenz Albe wrote: > rihad wrote: >> Hi there, this is a reproducible error. We recently pg_upgraded our >> production database to 10.1 from 9.6.6. The upgrade runs fine with the >> suggestion to analyze all data. > [...] > >> Trying to find data using the specified indices fails to find matching rows: > [...] > >> reindexing the table fixes the issue: > Did you also upgrade the operating system or at least libc? It looks like it. $ ll /lib/libc.so.7 -r--r--r-- 1 root wheel 1647104 Jun 19 2017 /lib/libc.so.7 this never was a major/minor OS upgrade, just whatever FreeBSD patch releases included. > > Then the problem could come from a modified collation. > Compare https://www.postgresql.org/message-id/CB4D1C6BAA80CF146CB0D4F2%40eje.credativ.lan de (German) was just an example, the inability to find matching rows affected other languages like fr or even en, too. It does look like a collation issue, though.
rihad wrote: > On 12/22/2017 05:09 PM, Laurenz Albe wrote: > > rihad wrote: > > > Hi there, this is a reproducible error. We recently pg_upgraded our > > > production database to 10.1 from 9.6.6. The upgrade runs fine with the > > > suggestion to analyze all data. > > > > [...] > > > > > Trying to find data using the specified indices fails to find matching rows: > > > > [...] > > > > > reindexing the table fixes the issue: > > > > Did you also upgrade the operating system or at least libc? > > It looks like it. > > $ ll /lib/libc.so.7 > -r--r--r-- 1 root wheel 1647104 Jun 19 2017 /lib/libc.so.7 > > this never was a major/minor OS upgrade, just whatever FreeBSD patch > releases included. > > > Then the problem could come from a modified collation. > > Compare https://www.postgresql.org/message-id/CB4D1C6BAA80CF146CB0D4F2%40eje.credativ.lan > > de (German) was just an example, the inability to find matching rows > affected other languages like fr or even en, too. > It does look like a collation issue, though. That would be the best thing, because it would mean that there is no PostgreSQL bug or hardware problem, and REINDEX will fix the problem. Yours, Laurenz Albe