Discussion: vacuum analyze corrupts db with larger tuples (< 8k)
ok, here is what I have found out on 6.5.3, Linux 2.2.10:

DROP TABLE buf;
CREATE TABLE buf (s varchar(5000)); -- type TEXT is the same prob
COPY buf FROM stdin;
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\
[... almost 5k ...]
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\
[ 10 times ]
\.

# vacuumdb --analyze test
ERROR:  Tuple is too big: size 9604
vacuumdb: database vacuum failed on test.

- this is repeatable
- in this test the db isn't corrupted, but in our production db :(
- Tom, you made a remark that you found a bug in access/hio.c.
  Does the bug also hit here?

Dirk
Dirk Lutzebaeck writes:
> - Tom, you made a remark that you found a bug in access/hio.c.
>   Does the bug also hit here?

I applied this patch and the corruption is gone, but vacuumdb --analyze
still complains about the tuple being too large.

Dirk
Dirk Lutzebaeck <lutzeb@aeccom.com> writes:
> ok, here is what I have found out on 6.5.3, Linux 2.2.10:
> [ make table with a bunch of almost-5K varchar fields ]
> # vacuumdb --analyze test
> ERROR: Tuple is too big: size 9604
> vacuumdb: database vacuum failed on test.
Ohhh ... I know what's going on. The oversize tuple is the one that
VACUUM is attempting to store in pg_statistic, containing the min and
max values for your varchar column. In this example, both the min and
max are just shy of 5K characters, so the pg_statistic tuple is too
big to fit on a page.
I had already patched this in current sources, by the expedient of not
trying to store a pg_statistic tuple at all if it's too big. (Then
you don't get stats for that particular column, but the stats probably
wouldn't be useful anyway.)
I suppose I should make up a back-patch for REL6_5 with this fix.
regards, tom lane
I wrote:
> Ohhh ... I know what's going on. The oversize tuple is the one that
> VACUUM is attempting to store in pg_statistic, containing the min and
> max values for your varchar column. In this example, both the min and
> max are just shy of 5K characters, so the pg_statistic tuple is too
> big to fit on a page.
I have applied the attached patch to the REL6_5 branch. (There is
already similar code in the current branch.) I'm afraid it won't
help you recover from an already-corrupted pg_statistic table, though.
regards, tom lane
*** src/backend/commands/vacuum.c.orig Wed Aug 25 08:01:45 1999
--- src/backend/commands/vacuum.c Tue Jan 4 12:15:17 2000
***************
*** 2405,2414 ****
stup = heap_formtuple(sd->rd_att, values, nulls);
/* ----------------
! * insert the tuple in the relation and get the tuple's oid.
* ----------------
*/
! heap_insert(sd, stup);
pfree(DatumGetPointer(values[3]));
pfree(DatumGetPointer(values[4]));
pfree(stup);
--- 2405,2425 ----
stup = heap_formtuple(sd->rd_att, values, nulls);
/* ----------------
! * Watch out for oversize tuple, which can happen if
! * both of the saved data values are long.
! * Our fallback strategy is just to not store the
! * pg_statistic tuple at all in that case. (We could
! * replace the values by NULLs and still store the
! * numeric stats, but presently selfuncs.c couldn't
! * do anything useful with that case anyway.)
* ----------------
*/
! if (MAXALIGN(stup->t_len) <= MaxTupleSize)
! {
! /* OK to store tuple */
! heap_insert(sd, stup);
! }
!
pfree(DatumGetPointer(values[3]));
pfree(DatumGetPointer(values[4]));
pfree(stup);
> Dirk Lutzebaeck <lutzeb@aeccom.com> writes:
> > ok, here is what I have found out on 6.5.3, Linux 2.2.10:
> > [ make table with a bunch of almost-5K varchar fields ]
> > # vacuumdb --analyze test
> > ERROR:  Tuple is too big: size 9604
> > vacuumdb: database vacuum failed on test.
>
> Ohhh ... I know what's going on.  The oversize tuple is the one that
> VACUUM is attempting to store in pg_statistic, containing the min and
> max values for your varchar column.  In this example, both the min and
> max are just shy of 5K characters, so the pg_statistic tuple is too
> big to fit on a page.
>
> I had already patched this in current sources, by the expedient of not
> trying to store a pg_statistic tuple at all if it's too big.  (Then
> you don't get stats for that particular column, but the stats probably
> wouldn't be useful anyway.)
>
> I suppose I should make up a back-patch for REL6_5 with this fix.

Oh, good, we know the cause.  Seems we should wait for 7.0 for this.

--
Bruce Momjian                     |  http://www.op.net/~candle
maillist@candle.pha.pa.us         |  (610) 853-3000
+  If your life is a hard drive,  |  830 Blythe Avenue
+  Christ can be your backup.     |  Drexel Hill, Pennsylvania 19026
Tom Lane writes:
> I have applied the attached patch to the REL6_5 branch.  (There is
> already similar code in the current branch.)  I'm afraid it won't
> help you recover from an already-corrupted pg_statistic table, though.

How can I recover from this?  Simply delete the entries in pg_statistic?
Is this possible?

Dirk
Dirk Lutzebaeck <lutzeb@aeccom.com> writes:
> Tom Lane writes:
>> I have applied the attached patch to the REL6_5 branch. (There is
>> already similar code in the current branch.) I'm afraid it won't
>> help you recover from an already-corrupted pg_statistic table, though.
> How can I recover from this? Simply delete the entries in pg_statistic?
> Is this possible?
It's worth a try, anyway: just DELETE FROM pg_statistic
and see what happens... but, depending on how badly pg_statistic is
messed up, that might itself crash...
regards, tom lane
> Tom Lane writes:
> > I have applied the attached patch to the REL6_5 branch.  (There is
> > already similar code in the current branch.)  I'm afraid it won't
> > help you recover from an already-corrupted pg_statistic table, though.
>
> How can I recover from this?  Simply delete the entries in pg_statistic?
> Is this possible?

The only fix I know is pg_upgrade.  You may have to enable the script.