Re: All Taxi Services need Index Clustered Heap Append

Поиск
Список
Период
Сортировка
От David Rowley
Тема Re: All Taxi Services need Index Clustered Heap Append
Дата
Msg-id CAKJS1f-Gim0cLBidKTpkGfRr-guWvELNt7JoO7pryCYsY63c7A@mail.gmail.com
обсуждение исходный текст
Ответ на Re: All Taxi Services need Index Clustered Heap Append  (Ants Aasma <ants.aasma@eesti.ee>)
Список pgsql-hackers
On 4 March 2018 at 23:05, Ants Aasma <ants.aasma@eesti.ee> wrote:
> On Sat, Mar 3, 2018 at 4:53 PM, David Rowley
>> It's a good job someone invented HASH partitioning then.
>>
>> It would be interesting to hear how your benchmarks go using current
>> master + the faster partition pruning patchset [1].  Currently, HASH
>> partitioning does exist in master, just there's no partition pruning
>> for the non-matching partitions, which is why you need [1].
>>
>> I think trying with something like 500-1000 partitions might be a good
>> place to start.
>
> I don't think that will actually help much. 1000 partitions means each
> partition gets data from ~50 vehicles. A 60 tuples per page each page
> in the partitioned able will contain on average 1.2 interesting
> tuples. So you still have almost one page read per row.

hmm, I missed that part about only 60 tuples per page.

It may be worth an experiment with two table, one to hold the day's
worth of data, and a holding table which stores about 1-2 minutes of
data. Each minute or two the holding table could be flushed like:

WITH del AS (DELETE FROM driver_pos_holding RETURNING *)
INSERT INTO driver_pos SELECT * FROM del ORDER BY id,ts;

then perhaps a manual VACUUM of driver_pos_holding... or leave it up
to auto-vacuum...

both tables could be inherited by a single parent to allow queries to
return all rows, or be wrapped up in a UNION ALL view, although an
inherited table should provide better plans than the view in some
cases. Although using an inherited parent would disallow you to use
partitioning if you ever wanted to partition by ts to make the job of
removing old data easier.

Hopefully having 60-120 seconds of driver data will in the holding
table will mean that the tuples for each driver only span 2-3 pages
for that 1-2 minute period in the main table You might then have not
much more than 240 pages to load for a driver after a 4-hour run.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Fabien COELHO
Дата:
Сообщение: Re: 2018-03 Commitfest Summary (Andres #1)
Следующее
От: Fabien COELHO
Дата:
Сообщение: Re: 2018-03 Commitfest Summary (Andres #1)