Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)

From Pavan Deolasee
Subject Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)
Date
Msg-id CABOikdNigDQ59DyAk1hQh6PpDJwaVqs3VV4ZqqFtDHZiz9-2-Q@mail.gmail.com
In response to Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)  (Pavan Deolasee <pavan.deolasee@gmail.com>)
Responses Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)  (Alvaro Herrera <alvherre@2ndquadrant.com>)
List pgsql-hackers

On Thu, Jan 19, 2017 at 6:35 PM, Pavan Deolasee <pavan.deolasee@gmail.com> wrote:


Revised patch is attached. 

I've now also rebased the main WARM patch against the current master (3eaf03b5d331b7a06d79 to be precise). I'm attaching Alvaro's patch to get interesting attributes (prefixed with 0000 since the other two patches are based on that). The changes to support system tables are now merged with the main patch. I could separate them if it helps in review.

I am also including a stress test workload that I am currently running to test WARM's correctness, since Robert raised a valid concern about that. The idea is to include a few more columns in the pgbench_accounts table and have a few more indexes. The additional indexed columns share a relationship with the "aid" column, but instead of holding a fixed value, each of these columns can vary within a fixed, non-overlapping range. For example, for aid = 1, aid1's original value will be 10 and it can vary between 8 and 12. Similarly, aid2's original value will be 20 and it can vary between 16 and 24. This setup allows us to update these additional columns (thus forcing WARM updates), but still ensure that we can do some sanity checks on the results.
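The column scheme above can be sketched in a few lines of Python. This is only an illustration: the aid = 1 figures (10 within 8 to 12, 20 within 16 to 24) come from the description above, while the generalization to other aid values and the names base_value, allowed_range and ranges_overlap are assumptions that may not match the attached workload.

```python
from typing import Tuple

# Hypothetical model of the extra columns. Only the aid = 1 numbers come
# from the mail; the formulas below are an assumed generalization.


def base_value(aid: int, k: int) -> int:
    """Original value of column aid<k> for a given aid (assumed: aid * 10 * k)."""
    return aid * 10 * k


def allowed_range(aid: int, k: int) -> Tuple[int, int]:
    """Inclusive range in which aid<k> may vary (assumed: base +/- 2 * k)."""
    base = base_value(aid, k)
    return base - 2 * k, base + 2 * k


def ranges_overlap(a: Tuple[int, int], b: Tuple[int, int]) -> bool:
    """True if two inclusive ranges share at least one value."""
    return a[0] <= b[1] and b[0] <= a[1]
```

Under these assumptions, consecutive aid values are 10 * k apart in column aid<k> while each range is only 4 * k wide, so the ranges for a given column never overlap and a value always identifies a single aid.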

The test contains a bunch of UPDATE, FOR UPDATE and FOR SHARE transactions. Some of these transactions commit and some roll back. Checks are in place to ensure that we always find exactly one tuple, irrespective of which column we use to fetch the row. Of course, when the aid[1-4] columns are used to fetch tuples, we need to scan with a range instead of an equality. Then we do a bunch of operations like CREATE INDEX, DROP INDEX, CIC, long-running transactions, VACUUM FULL etc. while the tests are running, and ensure that the sanity checks always pass. We could do a few other things, like maybe marking these indexes as UNIQUE or keeping a long transaction open while doing updates and other operations. I'll add some of those to the test, but suggestions are welcome.
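As a rough illustration of the "exactly one tuple" check, here is a toy in-memory model in Python. It is not the attached test: the table is a plain list of dicts, the range formula (base aid * 10 * k, plus or minus 2 * k) is an assumed generalization of the aid = 1 example, and every function name is hypothetical.

```python
import random


def allowed_range(aid, k):
    """Assumed inclusive range for column aid<k>: base aid*10*k, +/- 2*k."""
    base = aid * 10 * k
    return base - 2 * k, base + 2 * k


def make_table(n_rows, n_cols=4):
    """Build rows with aid and aid1..aid<n_cols> set to their original values."""
    return [
        {"aid": aid, **{f"aid{k}": aid * 10 * k for k in range(1, n_cols + 1)}}
        for aid in range(1, n_rows + 1)
    ]


def update_row(row, rng, n_cols=4):
    """Move one randomly chosen extra column to a new value inside its range."""
    k = rng.randint(1, n_cols)
    lo, hi = allowed_range(row["aid"], k)
    row[f"aid{k}"] = rng.randint(lo, hi)


def fetch_by_column(table, aid, k):
    """Range 'scan' on aid<k>: return rows whose value lies in aid's range."""
    lo, hi = allowed_range(aid, k)
    return [r for r in table if lo <= r[f"aid{k}"] <= hi]


def sanity_check(table, n_cols=4):
    """Every row must be found exactly once, whichever column is used."""
    for row in table:
        assert sum(1 for r in table if r["aid"] == row["aid"]) == 1
        for k in range(1, n_cols + 1):
            matches = fetch_by_column(table, row["aid"], k)
            assert len(matches) == 1 and matches[0]["aid"] == row["aid"]
    return True
```

A short driver would call update_row on random rows and run sanity_check after each batch; in the real workload the same check would be run through each index while the DDL and VACUUM operations are in flight.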

I do see a problem with CREATE INDEX CONCURRENTLY with these tests, though everything else has run ok so far (I have yet to do very long-running tests; probably just a few hours of testing today).

I'm trying to understand why CIC fails to build a consistent index, and I think I now have some clue about why it could be happening. With HOT, we don't need to worry about broken chains, since at the very beginning we add the index tuple and all subsequent updates will honour the new index while deciding on HOT updates, i.e. we won't create any new broken HOT chains once we start building the index. Later, during the validation phase, we only need to insert tuples that are not already in the index. But with WARM, I think the check needs to be more elaborate. Even if the TID (we always look at its root line pointer etc.) exists in the index, we will need to ensure that the index key matches the heap tuple we are dealing with. That looks a bit tricky.

Maybe we can look up the index using the key from the current heap tuple and then see if we get a tuple with the same TID back. Of course, we need to do this only if the tuple is a WARM tuple. The other option is to collect not only TIDs but also keys while scanning the index. That might increase the size of the state information for very wide indexes. Or maybe we just turn WARM off if there exists a build-in-progress index.
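To make the difference between the two validation checks concrete, here is a toy Python model. It is not PostgreSQL code and not part of the patch: the index is modelled as a set of (key, root TID) pairs, and HeapTuple, missing_by_tid and missing_by_tid_and_key are hypothetical names chosen for this sketch.

```python
from typing import List, NamedTuple, Set, Tuple

Tid = Tuple[int, int]          # (block, offset) of the chain's root line pointer
IndexEntry = Tuple[int, Tid]   # (index key, root TID)


class HeapTuple(NamedTuple):
    root_tid: Tid
    key: int        # current value of the indexed column
    is_warm: bool   # True if the tuple belongs to a WARM chain


def missing_by_tid(heap: List[HeapTuple], index: Set[IndexEntry]) -> List[HeapTuple]:
    """HOT-style check: a tuple is covered if its root TID is in the index."""
    tids = {tid for _, tid in index}
    return [t for t in heap if t.root_tid not in tids]


def missing_by_tid_and_key(heap: List[HeapTuple], index: Set[IndexEntry]) -> List[HeapTuple]:
    """WARM-aware check: WARM tuples also need an entry with a matching key."""
    tids = {tid for _, tid in index}
    missing = []
    for t in heap:
        if t.is_warm:
            covered = (t.key, t.root_tid) in index
        else:
            covered = t.root_tid in tids
        if not covered:
            missing.append(t)
    return missing


# The index holds key 5 for root TID (0, 1), but a WARM update has since
# changed the indexed column of that chain to 7.
index = {(5, (0, 1))}
heap = [HeapTuple(root_tid=(0, 1), key=7, is_warm=True)]
```

With this data the TID-only check reports nothing to insert, while the key-aware check flags the tuple. The sketch corresponds to the "collect keys as well as TIDs" option, and its cost shows up as the larger set of (key, TID) pairs that has to be kept.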

Suggestions/reviews/tests welcome.

Thanks,
Pavan


--
 Pavan Deolasee                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
