HASH INDEX builds seems confused
| От | John Naylor |
|---|---|
| Тема | HASH INDEX builds seems confused |
| Дата | |
| Msg-id | CANWCAZZnunCRic4AiMqRp6b8vQ_h3uOYcaQ5Omp7xLeMX-u3dA@mail.gmail.com обсуждение исходный текст |
| Ответы |
Re: HASH INDEX builds seems confused
|
| Список | pgsql-hackers |
hashbuild() says: * If we just insert the tuples into the index in scan order, then * (assuming their hash codes are pretty random) there will be no locality * of access to the index, and if the index is bigger than available RAM * then we'll thrash horribly. To prevent that scenario, we can sort the * tuples by (expected) bucket number. However, such a sort is useless * overhead when the index does fit in RAM. We choose to sort if the * initial index size exceeds maintenance_work_mem, or the number of * buffers usable for the index, whichever is less. (Limiting by the However, since commit e09d7a126 it's harder to believe sorts are ever useless, since we then decided that sorts should have a more strict sort order for the sake of sequential access. Further, d09dbeb9b built upon that to remove wasteful binary search when inserting into the page. Looking at some of the numbers in the linked threads, I wonder if all test environments were actually hitting the sort path at all, since you'd have to exceed m_w_m or s_b to take advantage. Unless I'm missing something, it seems like we should just sort unconditionally. That would be a nice simplification, and might speed up index builds even when there's plenty of memory. (If I am in fact missing something, maybe comments need updating) Now that I'm looking, I'm also wondering how hard it would be to have datum1 contain both the bucket (high bits) and hash (lower bits), since we can now count on Datums being 8 bytes on all platforms. It might be harder in turn to hack things so that the appropriate sort specialization could be applied (it'd need a fake sortKey at least), but that would be a possible future project. -- John Naylor Amazon Web Services
В списке pgsql-hackers по дате отправления: