Re: [GENERAL] Cost: Big Tables vs. Organized Separation of Data

From: Alex P. Rudnev
Subject: Re: [GENERAL] Cost: Big Tables vs. Organized Separation of Data
Message-ID: Pine.SUN.3.91.990203182100.18173f-100000@virgin.relcom.eu.net
In reply to: Re: [GENERAL] Cost: Big Tables vs. Organized Separation of Data (Jeff Hoffmann <jeff@remapcorp.com>)
List: pgsql-general

Have you tried running 'vacuum analyze' and creating indexes on the keys
that are _most important_?

It's very important. I can't speak for the PostgreSQL parser/optimiser
specifically, but it can't choose an appropriate join order without
knowing the key distribution. Well-designed indexes and 'vacuum analyze'
can improve your joins dramatically.

Simple rules:
- create indexes on all unique attributes you use for joining.
- don't create an index on an attribute if each of its values is shared
by a lot of records (that is, the attribute has only a few distinct values).

For example:

 CREATE TABLE employee (enum int, branch int);
 CREATE TABLE empinfo (enum int, surname text, sex int);

Create an index on employee.branch (if you need to select employees by
branch) and on empinfo.enum. The worst thing you can do is create an index
on 'empinfo.sex', because the misled system may think there are 10,000
different sexes when there are only two of them.
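
A rough sketch of the rules applied to the two example tables. The index names (employee_branch_idx, empinfo_enum_idx) are made up for illustration, and the count query is just one simple way to see how few distinct values a column has before deciding to index it:

```sql
-- How selective is the column? Few distinct values spread over many rows
-- means an index on it is unlikely to help.
SELECT count(*) AS total_rows,
       count(DISTINCT sex) AS distinct_values
FROM empinfo;

-- Index the join key and the column used for selection.
-- Index names are hypothetical.
CREATE INDEX employee_branch_idx ON employee (branch);
CREATE INDEX empinfo_enum_idx ON empinfo (enum);

-- Deliberately no index on empinfo.sex.
```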

And use the 'explain' statement to check whether the system's behaviour is right.
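
For instance, assuming the employee and empinfo tables from the example above, a check might look roughly like this. The join query and the branch value 10 are only illustrations, and the plan printed will depend on your data and server version:

```sql
-- Refresh the statistics the optimiser uses for its estimates.
VACUUM ANALYZE employee;
VACUUM ANALYZE empinfo;

-- Show the plan the optimiser picks, without running the query.
-- The branch value 10 is an arbitrary example.
EXPLAIN
SELECT e.enum, i.surname
FROM employee e, empinfo i
WHERE e.enum = i.enum
  AND e.branch = 10;
```

If the plan shows sequential scans on large tables where you expected the indexes to be used, the statistics are probably stale or the indexed column is not selective enough.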


 On Wed, 3 Feb 1999, Jeff Hoffmann wrote:

> Date: Wed, 03 Feb 1999 09:14:37 -0600
> From: Jeff Hoffmann <jeff@remapcorp.com>
> To: Bruce Momjian <maillist@candle.pha.pa.us>
> Cc: pgsql-general@postgreSQL.org
> Subject: Re: [GENERAL] Cost: Big Tables vs. Organized Separation of Data
>
> Bruce Momjian wrote:
> >
> > > 1. If I re-organize the data, I would be able to perform my queries
> > > without executing joins on multiple tables per query.
> > >
> > > 2. As I re-organize the data, the database becomes less and less
> > > intuitive and (seemingly) less "normal".
> > >
> > > So, I guess my question is:  how costly are joins?  I've heard that
> > > Postgres pretty much "pukes" (in terms of speed) when you're trying
> > > to do anything more than 6 table joins in one query.  This leads
> > > me to believe that joins are fairly costly... ????
> > >
> > > Does anyone have any words of advice for me as I battle this?
> >
> > We are working speeding up large table joins right now.  Try doing SET
> > GEQO to set the value lower and see if that helps.
>
> i've noticed a pretty drastic slowdown going from a 4 table join
> (instantaneous) to a 5 table join (15-20 seconds) i don't think i've
> tried 6 tables yet with the same database.  is there a lower limit to
> where using GEQO doesn't really make sense in terms of speed (i.e.,
> would setting GEQO to 5 be a toss up and 4 not make sense?)  i'm
> guessing that the number of plans the optimizer checks without GEQO goes
> up factorially, while GEQO goes up fairly linearly (from my limited
> knowledge of genetic algorithms, basically you make a series of first
> guesses and gradually refine them and throw away the losers until you
> end up with the right one.)  if my join is based purely on primary keys,
> shouldn't just about any plan work well, or at least well enough that it
> doesn't pay to make an exhaustive search of the plan space, making GEQO
> the best choice?  i guess my question is "is there a rule of thumb for
> setting GEQO?"  is there a reason it was set to 8 by default?  does GEQO
> work better in some cases than others? (small tables, joins on
> non-indexed fields, etc.)
>
> it seems like i'm always learning something new with postgres -- i've
> never thought about this before; it's just one more thing for me to play
> around with...
>
>

Aleksei Roudnev, Network Operations Center, Relcom, Moscow
(+7 095) 194-19-95 (Network Operations Center Hot Line),(+7 095) 230-41-41, N 13729 (pager)
(+7 095) 196-72-12 (Support), (+7 095) 194-33-28 (Fax)

