Re: [HACKERS] hash index on unlogged tables doesn't behave as expected

Поиск
Список
Период
Сортировка
От Ashutosh Sharma
Тема Re: [HACKERS] hash index on unlogged tables doesn't behave as expected
Дата
Msg-id CAE9k0P=iqPtHzGObikRjhxaz4tWFW+XZmgGnwBwz3=nRgxGSAg@mail.gmail.com
обсуждение исходный текст
Ответ на [HACKERS] hash index on unlogged tables doesn't behave as expected  (Amit Kapila <amit.kapila16@gmail.com>)
Ответы Re: [HACKERS] hash index on unlogged tables doesn't behave as expected  (Amit Kapila <amit.kapila16@gmail.com>)
Список pgsql-hackers
Hi,

On Mon, Jul 3, 2017 at 9:12 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> While discussing the behavior of hash indexes with Bruce in the nearby
> thread [1], it has been noticed that hash index on unlogged tables
> doesn't behave as expected.  Prior to 10, it has the different set of
> problems (mainly because hash indexes are not WAL-logged) which were
> discussed on that thread [1], however when I checked, it doesn't work
> even for 10.  Below are steps to reproduce the problem.
>
> 1. Setup master and standby
> 2. On the master, create unlogged table and hash index.
>    2A.  Create unlogged table t1(c1 int);
>    2B.  Create hash index idx_t1_hash on t1 using hash(c1);
> 3. On Standby, try selecting data,
>    select * from t1;
>    ERROR:  cannot access temporary or unlogged relations during recovery
> ---Till here everything works as expected
> 4. Promote standby to master (I have just stopped the standby and
> master and removed recovery.conf file from the standby database
> location) and try starting the new master, it
> gives below error and doesn't get started.
> FATAL:  could not create file "base/12700/16387": File exists
>
> The basic issue was that the WAL logging for Create Index operation
> was oblivion of the fact that for unlogged tables only INIT forks need
> to be logged.  Another point which we need to consider is that while
> replaying the WAL for the create index operation, we need to flush the
> buffer if it is for init fork.  This needs to be done only for pages
> that can be part of init fork file (like metapage, bitmappage).
> Attached patch fixes the issue.
>
> Another approach to fix the issue could be that always log complete
> pages for the create index operation on unlogged tables (in
> hashbuildempty).  However, the logic for initial hash index pages
> needs us to perform certain other actions (like update metapage after
> the creation of bitmappage) which make it difficult to follow that
> approach.
>

Thanks for sharing the steps to reproduce the issue in detail. I could easily reproduce this issue. I also had a quick look into the patch and the fix looks fine to me. However, it would be good if we can add at least one test for unlogged table with hash index in the regression test suite.

Apart from that i have tested the patch and the patch seems to be working fine. Here are the steps that i have followed to verify the fix,

With Patch:
========
postgres=# SELECT pg_relation_filepath('t1');
 pg_relation_filepath
----------------------
 base/14915/16384
(1 row)

postgres=# SELECT pg_relation_filepath('idx_t1_hash');
 pg_relation_filepath
----------------------
 base/14915/16387
(1 row)

Master:
=====
[ashu@localhost bin]$ ls -l ../master/base/14915/1638*
-rw-------. 1 ashu ashu 18128896 Jul  3 14:29 ../master/base/14915/16384
-rw-------. 1 ashu ashu    24576 Jul  3 14:29 ../master/base/14915/16384_fsm
-rw-------. 1 ashu ashu        0 Jul  3 14:28 ../master/base/14915/16384_init
-rw-------. 1 ashu ashu 22339584 Jul  3 14:29 ../master/base/14915/16387
-rw-------. 1 ashu ashu    32768 Jul  3 14:28 ../master/base/14915/16387_init


Standby:
======
[ashu@localhost bin]$ ls -l ../standby/base/14915/1638*
-rw-------. 1 ashu ashu     0 Jul  3 14:28 ../standby/base/14915/16384_init
-rw-------. 1 ashu ashu 32768 Jul  3 14:28 ../standby/base/14915/16387_init


Without patch:
==========
postgres=# SELECT pg_relation_filepath('t1');
 pg_relation_filepath
----------------------
 base/14915/16384
(1 row)

postgres=# SELECT pg_relation_filepath('idx_t1_hash');
 pg_relation_filepath
----------------------
 base/14915/16387
(1 row)

Master:
=====
[ashu@localhost bin]$ ls -l ../master/base/14915/1638*
-rw-------. 1 ashu ashu 18128896 Jul  3 14:36 ../master/base/14915/16384
-rw-------. 1 ashu ashu    24576 Jul  3 14:36 ../master/base/14915/16384_fsm
-rw-------. 1 ashu ashu        0 Jul  3 14:35 ../master/base/14915/16384_init
-rw-------. 1 ashu ashu 22339584 Jul  3 14:36 ../master/base/14915/16387
-rw-------. 1 ashu ashu    32768 Jul  3 14:35 ../master/base/14915/16387_init

Standby:
======
[ashu@localhost bin]$ ls -l ../standby/base/14915/1638*
-rw-------. 1 ashu ashu     0 Jul  3 14:35 ../standby/base/14915/16384_init
-rw-------. 1 ashu ashu 73728 Jul  3 14:35 ../standby/base/14915/16387
-rw-------. 1 ashu ashu 24576 Jul  3 14:35 ../standby/base/14915/16387_init


As seen from the test results above, it is clear that without patch, both main fork and init fork were getting WAL logged as the create hash index WAL logging was being done without checking forkNum type.

_hash_init()
{
................
................
        log_newpage(&rel->rd_node,
                    forkNum,
                    blkno,
                    BufferGetPage(buf),
                    true);
        _hash_relbuf(rel, buf);
....................
....................
}

I have also ran the WAL consistency check tool to verify the things and didn't find any issues. Thanks.

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Amit Langote
Дата:
Сообщение: Re: [HACKERS] Multi column range partition table
Следующее
От: Amit Langote
Дата:
Сообщение: Re: [HACKERS] UPDATE of partition key