Re: partitioned table set and indexes

Поиск
Список
Период
Сортировка
От Rick Otten
Тема Re: partitioned table set and indexes
Дата
Msg-id CAMAYy4KC0DupF=0CtAK+jxruMUC6ssK36uC6wg60Qs3vFmCY1w@mail.gmail.com
обсуждение исходный текст
Ответ на Re: partitioned table set and indexes  (Andreas Kretschmer <andreas@a-kretschmer.de>)
Ответы Re: partitioned table set and indexes
Список pgsql-performance
Ok, here is the first case where I select on the column:


Here is the second case where I try a join:


And here is the third case where I add a filter on the parent table:


The primary use case for partitioning is for performance gains when working with very large tables.  I agree these are not that large and by itself it does not justify the extra complexity of working with partitioning.

However there are other use cases for the partitioning model.  In our case we have legacy business processes that swap out the child tables and operate on them independently from each other.  They could be refactored to work together within one big table, but that is a project for another day.   The segmentation of the data into structurally consistent but related separate tables is the first step in that direction.  (previously they were all different from each other, but similar too)   Some of these children tables will hit 1M rows by the end of 2016, but it will take a while for them to grow to that size.

I do have another table with many millions of rows that could use partitioning, and eventually I'll split that one up - probably around the time I merge this one into a single table.  First I have to finish getting everything off of MySQL...

The query performance hit for sequence scanning isn't all that terrible, but I'd rather understand and get rid of the issue if I can, now, before I run into it again in a situation where it is crippling.

Thank you for your help with this!

--

ps:  You don't have to believe me about the bulk index adding thing.  I hardly believe it myself. It is just something to keep an eye out for.  If it is a real issue, I ought to be able to build a reproducible test case to share - at that time I'll see if I can open it up as a real bug.  For now I'd rather focus on understanding why my select uses an index and a join won't.



On Fri, Dec 11, 2015 at 4:44 PM, Andreas Kretschmer <andreas@a-kretschmer.de> wrote:


> Rick Otten <rottenwindfish@gmail.com> hat am 11. Dezember 2015 um 21:40
> geschrieben:
>
>
> I do not know why if I blast a new index creation on the 20 or so children
> all at once some of them fail, but then if I go back and do a few at a time
> they all work.  It has happened to me 3 times now, so I'm pretty sure I'm
> not imagining it.

don't believe that, sorry.


>
> What specifically in the explain analyze output tells you that it is using
> a sequence scan instead of an index scan _because_ there are too few rows?
> I can see where it chooses a sequence scan over an index and I know there
> are only a few rows in those tables, but I'm not sure how the explain
> output tells you that it made that choice on purpose.

a sequentiell scan over a small table are cheaper than an index-scan. Imageine a
small table,
only 3 rows. Fits in one page. It's cheaper to read just this page than read the
index
plus read the table to put out the result row.


Why are you using partitioning? That's make only sense with large child-tables
(more than 1 million rows or so)
and if you have a proper partitioning schema.



>
> Why would the select statement use the index, but not the join?
>
> There used to be an explain output anonymizer tool, if I can find that
> again, I'll send along the output.  It has been a few years since I posted
> a question to this list so I don't think I have a bookmark for it any
> more.... Hmmm.  I'll look around.


http://explain.depesz.com/

В списке pgsql-performance по дате отправления:

Предыдущее
От: Andreas Kretschmer
Дата:
Сообщение: Re: partitioned table set and indexes
Следующее
От: Jim Nasby
Дата:
Сообщение: Re: checkpoints, proper config