Re: Default setting for enable_hashagg_disk
От | Tomas Vondra |
---|---|
Тема | Re: Default setting for enable_hashagg_disk |
Дата | |
Msg-id | 20200625231154.uknvrnqwmahv6rtu@development обсуждение исходный текст |
Ответ на | Re: Default setting for enable_hashagg_disk (Jeff Davis <pgsql@j-davis.com>) |
Список | pgsql-hackers |
On Thu, Jun 25, 2020 at 02:28:02PM -0700, Jeff Davis wrote: >On Thu, 2020-06-25 at 15:56 -0400, Bruce Momjian wrote: >> It is my understanding that spill of sorts is mostly read >> sequentially, >> while hash reads are random. Is that right? Is that not being >> costed >> properly? > >I don't think there's a major problem with the cost model, but it could >probably use some tweaking. > >Hash writes are random. The hash reads should be mostly sequential (for >large partitions it will be 128-block extents, or 1MB). The cost model >assumes 50% sequential and 50% random. > The important bit here is that while the logical writes are random, those are effectively combined in page cache and the physical writes are pretty sequential. So I think the cost model is fairly reasonable. Note: Judging by iosnoop stats shared in the thread linked by Jeff. >Sorts are written sequentially and read randomly, but there's >prefetching to keep the reads from being too random. The cost model >assumes 75% sequential and 25% random. > >Overall, the IO pattern is better for Sort, but not dramatically so. >Tomas Vondra did some nice analysis here: > > >https://www.postgresql.org/message-id/20200525021045.dilgcsmgiu4l5jpa@development > >That resulted in getting the prealloc and projection patches in. > >Regards, > Jeff Davis > > regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
В списке pgsql-hackers по дате отправления: