Fix up for BTP_CHAIN problems

Поиск
Список
Период
Сортировка
От Wayne Piekarski
Тема Fix up for BTP_CHAIN problems
Дата
Msg-id 199907170340.NAA10098@helpdesk.senet.com.au
обсуждение исходный текст
Ответы Re: [HACKERS] Fix up for BTP_CHAIN problems  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
Hi,

A few weeks ago I sent an email out about getting BTP_CHAIN faults when
trying to perform operations with tables. My colleague Matt Altus was
trawling the mailing lists looking for information about this, and he
found some articles previously discussing problems with Btree indices and
how they sometimes can have problems handling tables with massive
duplicate entries in them, as the tree becomes unbalanced, and mentioned
other things like leaf nodes and so on. The postings talked about how
fixing up the problem was tricky and was still there, and Oracle solved it
by including the tid in with the index to make it more unique.

Well, we thought about this, and had a look at every table and index we'd
ever had BTP_CHAIN problems with, and all had massive duplication of
values in the particular columns. Ie, one table has 1.5 million rows, and
one of the columns with an index on it (snum) has only 20000 unique values
- this particular table was very troublesome, whereas others weren't so
bad because they were a lot smaller. Each table we looked at were all the
same problem, and we thought wow, this is really neat because all our
problem tables were explained by these postings. None of our other indexes
caused problems, because they were more unique.

Each one of our tables has a column called id which is very similar to an
oid except we generate it ourselves, and so we put in a reference to the
id column after all the other columns in our indexes. ie,

create index sessions_snum_index on sessions using btree (snum);

became:

create index sessions_snum_index on sessions using btree (snum, id);

The indexes grew a little bit, but now we have not had *ANY* BTP_CHAIN
faults at all, and to test it we really thrashed the machine to see if we
could cause it to die. It worked perfectly and we were all really happy
because BTP_CHAIN was very annoying to fix up. It was occuring a lot when
the machine was under high load.

So I can definitely recommend this to anyone who has problems like this,
or tables with lots of rows but not many unique values. The problem does
not occur under simple circumstances, only under cases where many backends
are all running and the system is under a high load.

Would a solution to the problem be to automatically include the row OID
when creating an index? This would fix the problem for everyone
automatically without having to do the hack manually. Is it ok to include
the OID in an index? I wasn't sure about this which is why I included my
own ID value instead so someone might want to comment on this.

Just thought I'd share this with everyone so we can all benefit from it.
This is a problem which really caused us to doubt the ability of Postgres
to be used in a high load environment and so I think it should be
mentioned somewhere. Maybe in the docs?


BTW, since getting around BTP_CHAIN our only remaining problem is the
backends waiting thing, and we are upgrading to 6.5 tomorrow which we hope
will fix this up forever. We did some testing of 6.5 and it runs a *lot*
faster, is more reliable, and the load of the machine is very much lower
than it normally is with 6.4.2 with our thrash testing program. I assume
that 6.4 style code will work unchanged in 6.5? Ie, we've used a lot of
LOCK TABLE xxx; code everywhere, which we hope will work untouched in 6.5. 

We'll report back after our upgrade once we know that everything works
really well.

Regards,
Wayne

------------------------------------------------------------------------------
Wayne Piekarski                               Tel:     (08) 8221 5221
Research & Development Manager                Fax:     (08) 8221 5220
SE Network Access Pty Ltd                     Mob:     0407 395 889
222 Grote Street                              Email:   wayne@senet.com.au
Adelaide SA 5000                              WWW:     http://www.senet.com.au


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Wayne Piekarski
Дата:
Сообщение: Re: Oversize proc sources (was Re: [BUGS] Backend dies creating plpgsql procedures (with reproducible example!))
Следующее
От: Bruce Momjian
Дата:
Сообщение: Re: include-file cleanup