Re: [HACKERS] WARM and indirect indexes

From Pavan Deolasee
Subject Re: [HACKERS] WARM and indirect indexes
Date
Msg-id CABOikdOxe0yDiRk6GTY3ystZROyEpap7ZNL4qkXVEatynX3KPQ@mail.gmail.com
In reply to Re: [HACKERS] WARM and indirect indexes  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: [HACKERS] WARM and indirect indexes  (Robert Haas <robertmhaas@gmail.com>)
Re: [HACKERS] WARM and indirect indexes  (Jim Nasby <Jim.Nasby@BlueTreble.com>)
List pgsql-hackers


On Thu, Jan 12, 2017 at 3:08 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Jan 10, 2017 at 2:24 PM, Alvaro Herrera
> <alvherre@2ndquadrant.com> wrote:
> > The big advantage of WARM is that it works automatically, like HOT: the
> > user doesn't need to do anything different than today to get the
> > benefit.  With indirect indexes, the user needs to create the index as
> > indirect explicitly.
>
> However, this cuts both ways.  If the WARM implementation has bugs --
> either data-corrupting bugs or crash bugs or returns-wrong-answer bugs
> or performance-in-corner-cases bugs -- everyone will be exposed to
> them.

IMHO WARM is way less complicated or intrusive than HOT was. It does not change any of the MVCC mechanics, nor does it change when and how tuples are marked dead or when and how dead tuples are removed. What it changes is how tuples are indexed and accessed via index methods. So I believe bugs in this area can possibly corrupt indexes or return wrong results, which is bad, but the same could have happened with many other patches we have done in the recent past.

The other thing the patch changes is how the update chain is maintained. In order to quickly find the root offset while updating a tuple, we now store the root offset in the t_ctid field of the last tuple in the chain and use a separate bit to mark the end of the chain (instead of relying on the t_ctid = t_self check). That can lead to problems if chains are not maintained or followed correctly. These changes are in the first patch of the patch series, and if you have any suggestions on how to improve that or solidify chain following, please let me know. I was looking for some way to hide the t_ctid field to ensure that the links are only accessed via some standard API.
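
To make the chain-maintenance idea concrete, here is a toy Python model of it. This is only a sketch and not the patch's code: the names `TupleVersion`, `Page`, `is_last`, `latest` and `root_of` are hypothetical, the "bit" is modelled as a boolean, and every version is assumed to live on the same page. It shows the one point described above: the last tuple in a chain carries the root offset in its link field, and a separate flag (not a self-pointing link) marks the end of the chain.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class TupleVersion:
    """Hypothetical stand-in for one tuple version on a heap page."""
    offset: int    # this version's own offset on the page
    ctid: int      # next version's offset, or the ROOT offset if is_last
    is_last: bool  # models the separate end-of-chain bit


class Page:
    """Toy single-page heap; all versions of a row stay on this page."""

    def __init__(self) -> None:
        self.tuples: Dict[int, TupleVersion] = {}
        self.next_offset = 1

    def insert(self) -> int:
        off = self.next_offset
        self.next_offset += 1
        # A single-version chain is its own root and its own end.
        self.tuples[off] = TupleVersion(off, ctid=off, is_last=True)
        return off

    def update(self, old_off: int) -> int:
        old = self.tuples[old_off]
        if not old.is_last:
            raise ValueError("only the latest version can be updated")
        root = old.ctid  # root offset is available without walking the chain
        new_off = self.next_offset
        self.next_offset += 1
        # The new last version inherits the root offset in its link field.
        self.tuples[new_off] = TupleVersion(new_off, ctid=root, is_last=True)
        # The old version now points forward and loses the end-of-chain mark.
        old.ctid = new_off
        old.is_last = False
        return new_off

    def latest(self, off: int) -> int:
        """Follow forward links until the end-of-chain flag is seen."""
        t = self.tuples[off]
        while not t.is_last:
            t = self.tuples[t.ctid]
        return t.offset

    def root_of(self, off: int) -> int:
        """Root offset of the chain containing the version at `off`."""
        return self.tuples[self.latest(off)].ctid


page = Page()
root = page.insert()
v2 = page.update(root)
v3 = page.update(v2)
print(page.root_of(v3), page.latest(root))
```

In this model a broken invariant (for example, forgetting to clear `is_last` on the old version) would immediately make `latest` stop early, which is the kind of chain-following mistake a single accessor API would help to contain.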

As the developer of the patch, what I would like to know is what we can do to address the concerns you raised. What kind of tests would you like to see to gain confidence in the patch? What I have done so far is to rely on the existing tests such as regression, isolation and pgbench. After adding support for system tables, the code gets exercised even more during regression tests, which is good. I also performed a few tests where I turned sequential scans off, ran "make installcheck", and compared the regression diffs between master and the patched code. That helps because the index access paths are used even more often. I did not find any bugs in those tests.

My favourite test during HOT development was to run pgbench with a large number of clients and periodically check for data consistency while the tests were running, by comparing the sum(tbalance), sum(bbalance) and sum(abalance) values. I have yet to do that kind of test with WARM because it would require a slightly different test setup (more indexes and more update statements), but I intend to do those tests too. I have also started writing regression test cases that exercise some corner cases, and I will share them for inclusion irrespective of WARM.
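
For readers unfamiliar with that check, the following is a minimal sketch of the invariant it relies on. It assumes the tables were freshly initialized (all balances zero) and that only pgbench's built-in TPC-B-like script has run, so each transaction adds the same delta to one account, one teller and one branch. The query text assumes the standard pgbench table names; `sums_consistent` and `simulate` are hypothetical helpers, and the simulation is an in-memory stand-in for a real database.

```python
import random

# Query one could run periodically against a pgbench database (assumed
# standard table names); all three sums should be equal at any instant.
CONSISTENCY_QUERY = """
SELECT (SELECT sum(abalance) FROM pgbench_accounts) AS a,
       (SELECT sum(tbalance) FROM pgbench_tellers)  AS t,
       (SELECT sum(bbalance) FROM pgbench_branches) AS b
"""


def sums_consistent(sum_a: int, sum_t: int, sum_b: int) -> bool:
    """True when the three balance totals agree."""
    return sum_a == sum_t == sum_b


def simulate(n_tx: int, seed: int = 0):
    """In-memory imitation of TPC-B-like updates starting from zero."""
    rng = random.Random(seed)
    accounts = [0] * 100
    tellers = [0] * 10
    branches = [0] * 1
    for _ in range(n_tx):
        delta = rng.randint(-5000, 5000)
        # Every transaction applies the same delta to all three tables.
        accounts[rng.randrange(len(accounts))] += delta
        tellers[rng.randrange(len(tellers))] += delta
        branches[rng.randrange(len(branches))] += delta
    return sum(accounts), sum(tellers), sum(branches)


totals = simulate(1000)
print(sums_consistent(*totals))
```

Against a real server the same comparison would be applied to the single row returned by `CONSISTENCY_QUERY`; a mismatch while the benchmark is running would point at lost or duplicated row versions.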

Please share your thoughts on what more can be and should be done.

Thanks,
Pavan
--
 Pavan Deolasee                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
