Thread: Postgres Replaying WAL slowly

Postgres Replaying WAL slowly

From: Soni M
Date:
Hello Everyone ...

We have six PG 9.1.12 installations: one master (Ubuntu 10.04), one Slony slave (Ubuntu 10.04), and four streaming replicas (2 on Ubuntu 10.04 and 2 on RHEL 6.5 (Santiago), which sit in a different datacenter). All the Ubuntu machines are in the same datacenter. The master sends its WAL archive to the Slony slave.

This is the infrastructure description:
200Mbit link between data centers, ESX 5.5 on HP blade chassis, ProLiant Gen7 blades. Postgres servers are dedicated to their ESX hosts (no other VMs on those ESX hosts). 3PAR disk backends with 4 and 8 Gbit Fibre Channel connections, 10Gbit Ethernet Virtual Connects on the HP chassis, Cisco fabric and network switches.

All Postgres instances are installed from the Ubuntu/RHEL packages.

Everything worked fine until Thursday, when we had high load on the master; since then every streaming replica has been lagging further and further behind the master, even at night and on the weekend when all server load is low. The Slony slave, however, is not affected at all.

We thought it was due to the network, so we decided to copy the WAL files to local disk on one streaming server and replay the WAL from there. After a PG restart it replays WAL at a good speed, about 3 seconds per WAL file, but as time goes on the speed decreases. We got to 30 seconds per WAL file; the worst we have seen is 3 minutes to replay a single WAL file.

The rate of WAL produced by the master is normal, as usual. Also, on Thursday we had WAL files in pg_xlog on the streaming replica server, but no other WAL files.
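
(For reference, replay progress on a 9.1 standby can be watched with a query along these lines; a minimal sketch, not the exact monitoring used here:)

    SELECT pg_last_xlog_receive_location() AS receive_location,  -- may be NULL when not streaming
           pg_last_xlog_replay_location()  AS replay_location,
           now() - pg_last_xact_replay_timestamp() AS replay_delay;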

This is the configuration:
SELECT name, current_setting(name)
  FROM pg_settings
  WHERE source NOT IN ('default', 'override');
              name              |                  current_setting
--------------------------------+---------------------------------------------------
 application_name               | psql
 archive_command                | /var/lib/postgresql/scripts/wal_archive "%p" "%f"
 archive_mode                   | on
 checkpoint_completion_target   | 0.7
 checkpoint_segments            | 30
 client_encoding                | UTF8
 DateStyle                      | ISO, MDY
 default_text_search_config     | pg_catalog.english
 effective_cache_size           | 125GB
 effective_io_concurrency       | 3
 external_pid_file              | /var/run/postgresql/9.1-main.pid
 hot_standby                    | on
 hot_standby_feedback           | on
 lc_messages                    | en_US.UTF-8
 lc_monetary                    | en_US.UTF-8
 lc_numeric                     | en_US.UTF-8
 lc_time                        | en_US.UTF-8
 listen_addresses               | *
 log_checkpoints                | on
 log_connections                | on
 log_destination                | csvlog
 log_directory                  | pg_log
 log_disconnections             | on
 log_filename                   | postgresql-%a.log
 log_line_prefix                | %t
 log_lock_waits                 | on
 log_rotation_age               | 1d
 log_rotation_size              | 0
 log_temp_files                 | 100kB
 log_timezone                   | localtime
 log_truncate_on_rotation       | on
 logging_collector              | on
 maintenance_work_mem           | 1GB
 max_connections                | 750
 max_locks_per_transaction      | 900
 max_pred_locks_per_transaction | 900
 max_stack_depth                | 2MB
 max_wal_senders                | 6
 port                           | 5432
 shared_buffers                 | 8GB
 ssl                            | on
 temp_buffers                   | 64MB
 TimeZone                       | America/Chicago
 unix_socket_directory          | /var/run/postgresql
 wal_keep_segments              | 50
 wal_level                      | hot_standby
 work_mem                       | 256MB
(47 rows)

Thanks for any help

--
Regards,

Soni Maula Harriz

Re: Postgres Replaying WAL slowly

From: Heikki Linnakangas
Date:
On 06/29/2014 11:14 AM, Soni M wrote:
> Everything worked fine until Thursday, when we had high load on the master;
> since then every streaming replica has been lagging further and further behind
> the master, even at night and on the weekend when all server load is low. The
> Slony slave, however, is not affected at all.

What does 'top' on the standby say? Is the startup process using 100% of
(one) CPU replaying records, or is it waiting for I/O? How large is the
database, does it fit in RAM? Any clues in the system or PostgreSQL logs?
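
(A quick way to get the size figure asked about here — just a sketch:)

    SELECT datname, pg_size_pretty(pg_database_size(datname)) AS size
      FROM pg_database
     ORDER BY pg_database_size(datname) DESC;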

- Heikki



Re: Postgres Replaying WAL slowly

From: Soni M
Date:
top and sar say 100% CPU usage of one core, no sign of I/O wait. The database is 1.5TB in size. RAM on the master is 145GB; on the slaves it differs, some have about 16GB and another also has 145GB.

Nothing suspicious in the standby's Postgres log.

In the master's Postgres log:
WARNING,01000,"pgstat wait timeout",,,,,,,,,""
ERROR,57014,"canceling autovacuum task",,,,,"automatic vacuum of table ""consprod._consprod_replication.sl_event""",,,,""
ERROR,57014,"canceling statement due to statement timeout",,,,,,"
"PARSE",2014-06-26 00:39:35 CDT,91/0,0,ERROR,25P02,"current transaction is aborted, commands ignored until end of transaction block",,,,,,"select 1",,,""
"could not receive data from client: Connection reset by peer",,,,,,,,,""

The log files are big anyway. If you can specify some pattern to look for in the logs, that would really help.


On Sun, Jun 29, 2014 at 3:31 PM, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:
On 06/29/2014 11:14 AM, Soni M wrote:
Everything worked fine until Thursday, when we had high load on the master;
since then every streaming replica has been lagging further and further behind
the master, even at night and on the weekend when all server load is low. The
Slony slave, however, is not affected at all.

What does 'top' on the standby say? Is the startup process using 100% of (one) CPU replaying records, or is it waiting for I/O? How large is the database, does it fit in RAM? Any clues in the system or PostgreSQL logs?

- Heikki




--
Regards,

Soni Maula Harriz

Re: Postgres Replaying WAL slowly

From: Soni M
Date:
Here's some data from the standby (this is when replaying takes 44 seconds per WAL file):

2014-06-29 00:07:36.513 CDT,,,16682,,53af6f46.412a,44,,2014-06-28 20:43:34 CDT,,0,LOG,00000,"restartpoint complete: wrote 63187 buffers (3.0%); 0 transaction log file(s) added, 0 removed, 0 recycled; write=209.170 s, sync=0.482 s, total=209.667 s; sync files=644, longest=0.036 s, average=0.000 s",,,,,,,,,""
2014-06-29 00:07:36.513 CDT,,,16682,,53af6f46.412a,45,,2014-06-28 20:43:34 CDT,,0,LOG,00000,"recovery restart point at 27CE/170056A8","last completed transaction was at log time 2014-06-27 13:39:00.542624-05",,,,,,,,""
2014-06-29 00:28:59.678 CDT,,,16682,,53af6f46.412a,47,,2014-06-28 20:43:34 CDT,,0,LOG,00000,"restartpoint complete: wrote 70942 buffers (3.4%); 0 transaction log file(s) added, 0 removed, 0 recycled; write=209.981 s, sync=0.493 s, total=210.486 s; sync files=723, longest=0.156 s, average=0.000 s",,,,,,,,,""
2014-06-29 00:28:59.678 CDT,,,16682,,53af6f46.412a,48,,2014-06-28 20:43:34 CDT,,0,LOG,00000,"recovery restart point at 27CE/35002678","last completed transaction was at log time 2014-06-27 13:42:05.121121-05",,,,,,,,""

--
Regards,

Soni Maula Harriz

Re: Postgres Replaying WAL slowly

From: Heikki Linnakangas
Date:
On 06/29/2014 03:43 PM, Soni M wrote:
> top and sar say 100% CPU usage of one core, no sign of I/O wait.

Hmm, I wonder what it's doing then... If you have "perf" installed on
the system, you can run "perf top" to get a quick overview of where the
CPU time is spent.

- Heikki



Re: Postgres Replaying WAL slowly

From: Soni M
Date:
Here's what 'perf top' said on a streaming replica:

Samples: 26K of event 'cpu-clock', Event count (approx.): 19781
 95.97%  postgres                             [.] 0x00000000002210f3
  0.41%  perf                                 [.] 0x000000000005f225
  0.39%  libc-2.12.so                         [.] __strstr_sse2
  0.22%  libc-2.12.so                         [.] memchr
  0.22%  [kernel]                             [k] kallsyms_expand_symbol
  0.18%  perf                                 [.] symbols__insert
  0.18%  [kernel]                             [k] format_decode
  0.15%  libc-2.12.so                         [.] __GI___strcmp_ssse3
  0.13%  [kernel]                             [k] string
  0.12%  [kernel]                             [k] number
  0.12%  [kernel]                             [k] vsnprintf
  0.12%  libc-2.12.so                         [.] _IO_vfscanf
  0.11%  perf                                 [.] dso__find_symbol
  0.11%  [kernel]                             [k] _spin_unlock_irqrestore
  0.10%  perf                                 [.] hex2u64
  0.10%  postgres                             [.] hash_search_with_hash_value
  0.09%  perf                                 [.] rb_next
  0.08%  libc-2.12.so                         [.] memcpy
  0.07%  libc-2.12.so                         [.] __strchr_sse2
  0.07%  [kernel]                             [k] clear_page
  0.06%  [kernel]                             [k] strnlen
  0.05%  perf                                 [.] perf_evsel__parse_sample
  0.05%  perf                                 [.] rb_insert_color
  0.05%  [kernel]                             [k] pointer



On Mon, Jun 30, 2014 at 2:05 PM, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:
On 06/29/2014 03:43 PM, Soni M wrote:
top and sar say 100% CPU usage of one core, no sign of I/O wait.

Hmm, I wonder what it's doing then... If you have "perf" installed on the system, you can run "perf top" to get a quick overview of where the CPU time is spent.

- Heikki




--
Regards,

Soni Maula Harriz

Re: Postgres Replaying WAL slowly

From: Heikki Linnakangas
Date:
On 06/30/2014 05:46 PM, Soni M wrote:
> Here's what 'perf top' said on streaming replica :
>
> Samples: 26K of event 'cpu-clock', Event count (approx.): 19781
>   95.97%  postgres                             [.] 0x00000000002210f3

Ok, so it's stuck doing something... Can you get a build with debug symbols
installed, so that we can see the function name?
- Heikki



Re: Postgres Replaying WAL slowly

From: Jeff Frost
Date:
On Jun 30, 2014, at 9:14 AM, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:

> On 06/30/2014 05:46 PM, Soni M wrote:
>> Here's what 'perf top' said on streaming replica :
>>
>> Samples: 26K of event 'cpu-clock', Event count (approx.): 19781
>>  95.97%  postgres                             [.] 0x00000000002210f3
>
> Ok, so it's stuck doing something... Can you get a build with debug symbols installed, so that we can see the function
> name?
> - Heikki
>

Looks like StandbyReleaseLocks:

Samples: 10K of event 'cpu-clock', Event count (approx.): 8507
 89.21%  postgres                  [.] StandbyReleaseLocks
  0.89%  libc-2.12.so              [.] __strstr_sse2
  0.83%  perf                      [.] 0x000000000005f1e5
  0.74%  [kernel]                  [k] kallsyms_expand_symbol
  0.52%  libc-2.12.so              [.] memchr
  0.47%  perf                      [.] symbols__insert
  0.47%  [kernel]                  [k] format_decode

Re: Postgres Replaying WAL slowly

From: Andres Freund
Date:
On 2014-06-30 19:14:24 +0300, Heikki Linnakangas wrote:
> On 06/30/2014 05:46 PM, Soni M wrote:
> >Here's what 'perf top' said on streaming replica :
> >
> >Samples: 26K of event 'cpu-clock', Event count (approx.): 19781
> >  95.97%  postgres                             [.] 0x00000000002210f3
>
> Ok, so it's stuck doing something... Can you get a build with debug symbols
> installed, so that we can see the function name?

My guess is it's a spinlock, probably xlogctl->info_lck via
RecoveryInProgress(). Unfortunately inline assembler doesn't always seem
to show up correctly in profiles...

What worked for me was to build with -fno-omit-frame-pointer - that
normally shows the callers, even if it can't generate a proper symbol
name.

Soni: Do you use Hot Standby? Are there connections active while you
have that problem? Any other processes with high cpu load?
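
(On 9.1, the "connections active" part can be answered from SQL on the standby with something like the following sketch; idle sessions report '<IDLE>' in current_query:)

    -- run on the hot standby: client backends currently doing something
    SELECT count(*) AS active_connections
      FROM pg_stat_activity
     WHERE current_query <> '<IDLE>';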

Greetings,

Andres Freund

--
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: Postgres Replaying WAL slowly

From: Soni M
Date:



On Tue, Jul 1, 2014 at 12:14 AM, Andres Freund <andres@2ndquadrant.com> wrote:

My guess is it's a spinlock, probably xlogctl->info_lck via
RecoveryInProgress(). Unfortunately inline assembler doesn't always seem
to show up correctly in profiles...

What worked for me was to build with -fno-omit-frame-pointer - that
normally shows the callers, even if it can't generate a proper symbol
name.

Soni: Do you use Hot Standby? Are there connections active while you
have that problem? Any other processes with high cpu load?

Greetings,

Andres Freund

--
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

It is
 96.62%  postgres             [.] StandbyReleaseLocks
as Jeff said. It runs for quite a long time, more than 5 minutes I think.

I also use hot standby. We have 4 streaming replicas; some of them have active connections, some do not. This issue has lasted more than 4 days. On one of the standbys, the above postgres process is the only process that consumes high CPU load.

--
Regards,

Soni Maula Harriz

Re: Postgres Replaying WAL slowly

From: Jeff Frost
Date:
On Jun 30, 2014, at 10:29 AM, Soni M <diptatapa@gmail.com> wrote:




On Tue, Jul 1, 2014 at 12:14 AM, Andres Freund <andres@2ndquadrant.com> wrote:

My guess is it's a spinlock, probably xlogctl->info_lck via
RecoveryInProgress(). Unfortunately inline assembler doesn't always seem
to show up correctly in profiles...

What worked for me was to build with -fno-omit-frame-pointer - that
normally shows the callers, even if it can't generate a proper symbol
name.

Soni: Do you use Hot Standby? Are there connections active while you
have that problem? Any other processes with high cpu load?

Greetings,

Andres Freund

--
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

It is
 96.62%  postgres             [.] StandbyReleaseLocks
as Jeff said. It runs for quite a long time, more than 5 minutes I think.

I also use hot standby. We have 4 streaming replicas; some of them have active connections, some do not. This issue has lasted more than 4 days. On one of the standbys, the above postgres process is the only process that consumes high CPU load.

compiled with -fno-omit-frame-pointer doesn't yield much more info:

 76.24%  postgres                               [.] StandbyReleaseLocks
  2.64%  libcrypto.so.1.0.1e                    [.] md5_block_asm_data_order
  2.19%  libcrypto.so.1.0.1e                    [.] RC4
  2.17%  postgres                               [.] RecordIsValid
  1.20%  [kernel]                               [k] copy_user_generic_unrolled
  1.18%  [kernel]                               [k] _spin_unlock_irqrestore
  0.97%  [vmxnet3]                              [k] vmxnet3_poll_rx_only
  0.87%  [kernel]                               [k] __do_softirq
  0.77%  [vmxnet3]                              [k] vmxnet3_xmit_frame
  0.69%  postgres                               [.] hash_search_with_hash_value
  0.68%  [kernel]                               [k] fin

However, this server started progressing through the WAL files quite a bit better before I finished compiling, so we'll leave it running with this version and see if there's more info available the next time it starts replaying slowly.


Re: Postgres Replaying WAL slowly

From: Andres Freund
Date:
On 2014-06-30 11:34:52 -0700, Jeff Frost wrote:
> On Jun 30, 2014, at 10:29 AM, Soni M <diptatapa@gmail.com> wrote:

> > It is
> >  96.62%  postgres             [.] StandbyReleaseLocks
> > as Jeff said. It runs for quite a long time, more than 5 minutes I think.
> >
> > I also use hot standby. We have 4 streaming replicas; some of them have active connections, some do not. This issue
> > has lasted more than 4 days. On one of the standbys, the above postgres process is the only process that consumes high CPU load.
>
> compiled with -fno-omit-frame-pointer doesn't yield much more info:

You'd need to do perf record -ga instead of perf record -a to see
additional information.

But:

>  76.24%  postgres                               [.] StandbyReleaseLocks

already is quite helpful.

What are you doing on that system? Is there anything requiring large
amounts of access exclusive locks on the primary? Possibly large amounts
of temporary relations?
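
(Two numbers that help answer this, sampled on the primary — a rough sketch rather than a complete picture, since locks and temp tables come and go quickly:)

    SELECT (SELECT count(*) FROM pg_locks
             WHERE mode = 'AccessExclusiveLock')  AS access_exclusive_locks,
           (SELECT count(*) FROM pg_class
             WHERE relpersistence = 't')          AS temp_relations;  -- 't' = temporary in 9.1+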

Greetings,

Andres Freund

--
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: Postgres Replaying WAL slowly

From: Pavel Stehule
Date:



2014-06-30 20:34 GMT+02:00 Jeff Frost <jeff@pgexperts.com>:
On Jun 30, 2014, at 10:29 AM, Soni M <diptatapa@gmail.com> wrote:




On Tue, Jul 1, 2014 at 12:14 AM, Andres Freund <andres@2ndquadrant.com> wrote:

My guess is it's a spinlock, probably xlogctl->info_lck via
RecoveryInProgress(). Unfortunately inline assembler doesn't always seem
to show up correctly in profiles...


For this kind of issue, systemtap or dtrace can be useful:

http://postgres.cz/wiki/Monitorov%C3%A1n%C3%AD_lwlocku_pomoc%C3%AD_systemtapu

With it you can identify which lock is the problem - please use Google Translate for the page.

Regards

Pavel
 
What worked for me was to build with -fno-omit-frame-pointer - that
normally shows the callers, even if it can't generate a proper symbol
name.

Soni: Do you use Hot Standby? Are there connections active while you
have that problem? Any other processes with high cpu load?

Greetings,

Andres Freund

--
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

It is
 96.62%  postgres             [.] StandbyReleaseLocks
as Jeff said. It runs for quite a long time, more than 5 minutes I think.

I also use hot standby. We have 4 streaming replicas; some of them have active connections, some do not. This issue has lasted more than 4 days. On one of the standbys, the above postgres process is the only process that consumes high CPU load.

compiled with -fno-omit-frame-pointer doesn't yield much more info:

 76.24%  postgres                               [.] StandbyReleaseLocks
  2.64%  libcrypto.so.1.0.1e                    [.] md5_block_asm_data_order
  2.19%  libcrypto.so.1.0.1e                    [.] RC4
  2.17%  postgres                               [.] RecordIsValid
  1.20%  [kernel]                               [k] copy_user_generic_unrolled
  1.18%  [kernel]                               [k] _spin_unlock_irqrestore
  0.97%  [vmxnet3]                              [k] vmxnet3_poll_rx_only
  0.87%  [kernel]                               [k] __do_softirq
  0.77%  [vmxnet3]                              [k] vmxnet3_xmit_frame
  0.69%  postgres                               [.] hash_search_with_hash_value
  0.68%  [kernel]                               [k] fin

However, this server started progressing through the WAL files quite a bit better before I finished compiling, so we'll leave it running with this version and see if there's more info available the next time it starts replaying slowly.



Re: Postgres Replaying WAL slowly

From: Jeff Frost
Date:
On Jun 30, 2014, at 11:39 AM, Andres Freund <andres@2ndquadrant.com> wrote:

> On 2014-06-30 11:34:52 -0700, Jeff Frost wrote:
>> On Jun 30, 2014, at 10:29 AM, Soni M <diptatapa@gmail.com> wrote:
>
>>> It is
>>> 96.62%  postgres             [.] StandbyReleaseLocks
>>> as Jeff said. It runs for quite a long time, more than 5 minutes I think.
>>>
>>> I also use hot standby. We have 4 streaming replicas; some of them have active connections, some do not. This issue
>>> has lasted more than 4 days. On one of the standbys, the above postgres process is the only process that consumes high CPU load.
>>
>> compiled with -fno-omit-frame-pointer doesn't yield much more info:
>
> You'd need to do perf record -ga instead of perf record -a to see
> additional information.
>

Ah! That's right.

Here's how that looks:

Samples: 473K of event 'cpu-clock', Event count (approx.): 473738
+  68.42%          init  [kernel.kallsyms]                         [k] native_safe_halt
+  26.07%      postgres  postgres                                  [.] StandbyReleaseLocks
+   2.82%       swapper  [kernel.kallsyms]                         [k] native_safe_halt
+   0.19%           ssh  libcrypto.so.1.0.1e                       [.] md5_block_asm_data_order
+   0.19%      postgres  postgres                                  [.] RecordIsValid
+   0.16%           ssh  libcrypto.so.1.0.1e                       [.] RC4
+   0.10%      postgres  postgres                                  [.] hash_search_with_hash_value
+   0.06%      postgres  [kernel.kallsyms]                         [k] _spin_unlock_irqrestore
+   0.05%          init  [vmxnet3]                                 [k] vmxnet3_poll_rx_only
+   0.04%      postgres  [kernel.kallsyms]                         [k] copy_user_generic_unrolled
+   0.04%          init  [kernel.kallsyms]                         [k] finish_task_switch
+   0.04%          init  [kernel.kallsyms]                         [k] __do_softirq
+   0.04%           ssh  [kernel.kallsyms]                         [k] _spin_unlock_irqrestore
+   0.04%           ssh  [vmxnet3]                                 [k] vmxnet3_xmit_frame
+   0.03%      postgres  postgres                                  [.] PinBuffer
+   0.03%          init  [vmxnet3]                                 [k] vmxnet3_xmit_frame
+   0.03%           ssh  [kernel.kallsyms]                         [k] copy_user_generic_unrolled
+   0.03%      postgres  postgres                                  [.] XLogReadBufferExtended
+   0.03%           ssh  ssh                                       [.] 0x000000000002aa07
+   0.03%          init  [kernel.kallsyms]                         [k] _spin_unlock_irqrestore
+   0.03%           ssh  [vmxnet3]                                 [k] vmxnet3_poll_rx_only
+   0.02%           ssh  [kernel.kallsyms]                         [k] __do_softirq
+   0.02%      postgres  libc-2.12.so                              [.] _wordcopy_bwd_dest_aligned
+   0.02%      postgres  postgres                                  [.] mdnblocks
+   0.02%           ssh  libcrypto.so.1.0.1e                       [.] 0x00000000000e25a1
+   0.02%           scp  [kernel.kallsyms]                         [k] copy_user_generic_unrolled
+   0.02%           ssh  libc-2.12.so                              [.] memcpy
+   0.02%      postgres  libc-2.12.so                              [.] memcpy


> But:
>
>> 76.24%  postgres                               [.] StandbyReleaseLocks
>
> already is quite helpful.
>
> What are you doing on that system? Is there anything requiring large
> amounts of access exclusive locks on the primary? Possibly large amounts
> of temporary relations?


The last time we did a 100% logging run, the peak temp table creation was something like 120k/hr, but the replicas
seemed able to keep up with that just fine.

Hopefully Soni can answer whether that has increased significantly since May.



Re: Postgres Replaying WAL slowly

From: Jeff Frost
Date:
On Jun 30, 2014, at 12:17 PM, Jeff Frost <jeff@pgexperts.com> wrote:


already is quite helpful.

What are you doing on that system? Is there anything requiring large
amounts of access exclusive locks on the primary? Possibly large amounts
of temporary relations?


The last time we did a 100% logging run, the peak temp table creation was something like 120k/hr, but the replicas seemed able to keep up with that just fine.


Sampling pg_locks on the primary shows ~50 locks with ExclusiveLock mode:

           mode           | count
--------------------------+-------
 AccessExclusiveLock      |    11
 AccessShareLock          |  2089
 ExclusiveLock            |    46
 RowExclusiveLock         |    81
 RowShareLock             |    17
 ShareLock                |     4
 ShareUpdateExclusiveLock |     5

Seems to be relatively consistent.  Of course, it's hard to say what it looked like back when the issue began.
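
(The per-mode breakdown above was presumably gathered with a query along these lines:)

    SELECT mode, count(*)
      FROM pg_locks
     GROUP BY mode
     ORDER BY mode;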



Re: Postgres Replaying WAL slowly

From: Tom Lane
Date:
Jeff Frost <jeff@pgexperts.com> writes:
> Sampling pg_locks on the primary shows ~50 locks with ExclusiveLock mode:

>            mode           | count
> --------------------------+-------
>  AccessExclusiveLock      |    11
>  AccessShareLock          |  2089
>  ExclusiveLock            |    46
>  RowExclusiveLock         |    81
>  RowShareLock             |    17
>  ShareLock                |     4
>  ShareUpdateExclusiveLock |     5

That's not too helpful if you don't pay attention to what the lock is on;
it's likely that all the ExclusiveLocks are on transactions' own XIDs,
which isn't relevant to the standby's behavior.  The AccessExclusiveLocks
are probably interesting though --- you should look to see what those
are on.

            regards, tom lane


Re: Postgres Replaying WAL slowly

From: Jeff Frost
Date:
On Jun 30, 2014, at 12:32 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Jeff Frost <jeff@pgexperts.com> writes:
>> Sampling pg_locks on the primary shows ~50 locks with ExclusiveLock mode:
>
>>           mode           | count
>> --------------------------+-------
>> AccessExclusiveLock      |    11
>> AccessShareLock          |  2089
>> ExclusiveLock            |    46
>> RowExclusiveLock         |    81
>> RowShareLock             |    17
>> ShareLock                |     4
>> ShareUpdateExclusiveLock |     5
>
> That's not too helpful if you don't pay attention to what the lock is on;
> it's likely that all the ExclusiveLocks are on transactions' own XIDs,
> which isn't relevant to the standby's behavior.  The AccessExclusiveLocks
> are probably interesting though --- you should look to see what those
> are on.

You're right about the ExclusiveLocks.

Here's how the AccessExclusiveLocks look:

 locktype | database |  relation  | page | tuple | virtualxid | transactionid | classid |   objid    | objsubid | virtualtransaction |  pid  |        mode         | granted
----------+----------+------------+------+-------+------------+---------------+---------+------------+----------+--------------------+-------+---------------------+---------
 relation |   111285 | 3245291551 |      |       |            |               |         |            |          | 233/170813         | 23509 | AccessExclusiveLock | t
 relation |   111285 | 3245292820 |      |       |            |               |         |            |          | 5/22498235         | 23427 | AccessExclusiveLock | t
 relation |   111285 | 3245292833 |      |       |            |               |         |            |          | 173/1723993        | 23407 | AccessExclusiveLock | t
 relation |   111285 | 3245287874 |      |       |            |               |         |            |          | 133/3818415        | 23348 | AccessExclusiveLock | t
 relation |   111285 | 3245292836 |      |       |            |               |         |            |          | 173/1723993        | 23407 | AccessExclusiveLock | t
 relation |   111285 | 3245292774 |      |       |            |               |         |            |          | 5/22498235         | 23427 | AccessExclusiveLock | t
 relation |   111285 | 3245292734 |      |       |            |               |         |            |          | 5/22498235         | 23427 | AccessExclusiveLock | t
 relation |   111285 | 3245292827 |      |       |            |               |         |            |          | 173/1723993        | 23407 | AccessExclusiveLock | t
 relation |   111285 | 3245288540 |      |       |            |               |         |            |          | 133/3818415        | 23348 | AccessExclusiveLock | t
 relation |   111285 | 3245292773 |      |       |            |               |         |            |          | 5/22498235         | 23427 | AccessExclusiveLock | t
 relation |   111285 | 3245292775 |      |       |            |               |         |            |          | 5/22498235         | 23427 | AccessExclusiveLock | t
 relation |   111285 | 3245292743 |      |       |            |               |         |            |          | 5/22498235         | 23427 | AccessExclusiveLock | t
 relation |   111285 | 3245292751 |      |       |            |               |         |            |          | 5/22498235         | 23427 | AccessExclusiveLock | t
 relation |   111285 | 3245288669 |      |       |            |               |         |            |          | 133/3818415        | 23348 | AccessExclusiveLock | t
 relation |   111285 | 3245292817 |      |       |            |               |         |            |          | 5/22498235         | 23427 | AccessExclusiveLock | t
 relation |   111285 | 3245288657 |      |       |            |               |         |            |          | 133/3818415        | 23348 | AccessExclusiveLock | t
 object   |   111285 |            |      |       |            |               |    2615 | 1246019760 |        0 | 233/170813         | 23509 | AccessExclusiveLock | t
 relation |   111285 | 3245292746 |      |       |            |               |         |            |          | 5/22498235         | 23427 | AccessExclusiveLock | t
 relation |   111285 | 3245287876 |      |       |            |               |         |            |          | 133/3818415        | 23348 | AccessExclusiveLock | t
 relation |   111285 | 3245292739 |      |       |            |               |         |            |          | 5/22498235         | 23427 | AccessExclusiveLock | t
 relation |   111285 | 3245292826 |      |       |            |               |         |            |          | 5/22498235         | 23427 | AccessExclusiveLock | t
 relation |   111285 | 3245292825 |      |       |            |               |         |            |          | 5/22498235         | 23427 | AccessExclusiveLock | t
 relation |   111285 | 3245292832 |      |       |            |               |         |            |          | 173/1723993        | 23407 | AccessExclusiveLock | t
 relation |   111285 | 3245292740 |      |       |            |               |         |            |          | 5/22498235         | 23427 | AccessExclusiveLock | t
 relation |   111285 | 3245287871 |      |       |            |               |         |            |          | 133/3818415        | 23348 | AccessExclusiveLock | t
(25 rows)

And if you go fishing in pg_class for any of the oids, you don't find anything:

SELECT s.procpid,
       s.query_start,
       n.nspname,
       c.relname,
       l.mode,
       l.granted,
       s.current_query
       FROM pg_locks l,
            pg_class c,
            pg_stat_activity s,
            pg_namespace n
      WHERE l.relation = c.oid
      AND l.pid = s.procpid
      AND c.relnamespace = n.oid
      AND l.mode = 'AccessExclusiveLock';
 procpid | query_start | nspname | relname | mode | granted | current_query
---------+-------------+---------+---------+------+---------+---------------
(0 rows)

Temp tables maybe?

Re: Postgres Replaying WAL slowly

From: Matheus de Oliveira
Date:

On Mon, Jun 30, 2014 at 4:42 PM, Jeff Frost <jeff@pgexperts.com> wrote:
And if you go fishing in pg_class for any of the oids, you don't find anything:

That is probably because you are connected to the wrong database. Once you connect to the database of interest, you don't even need to query pg_class; just cast the relation attribute to regclass:

    SELECT relation::regclass, ...
    FROM pg_locks WHERE database = (SELECT oid FROM pg_database WHERE datname = current_database());


Regards,
--
Matheus de Oliveira
Analista de Banco de Dados
Dextra Sistemas - MPS.Br nível F!
www.dextra.com.br/postgres

Re: Postgres Replaying WAL slowly

From: Jeff Frost
Date:

On Jun 30, 2014, at 12:54 PM, Matheus de Oliveira <matioli.matheus@gmail.com> wrote:


On Mon, Jun 30, 2014 at 4:42 PM, Jeff Frost <jeff@pgexperts.com> wrote:
And if you go fishing in pg_class for any of the oids, you don't find anything:

That is probably because you are connected to the wrong database. Once you connect to the database of interest, you don't even need to query pg_class; just cast the relation attribute to regclass:

    SELECT relation::regclass, ...
    FROM pg_locks WHERE database = (SELECT oid FROM pg_database WHERE datname = current_database());



Yah, I thought about that too, but verified I am in the correct DB.  Just for clarity's sake:

SELECT relation::regclass
    FROM pg_locks WHERE database = (SELECT oid FROM pg_database WHERE datname = current_database()) and mode = 'AccessExclusiveLock';

  relation
------------


 3245508214
 3245508273
 3245508272
 3245508257
 3245508469
 3245508274
 3245508373
 3245508468
 3245508210
 3245508463
 3245508205
 3245508260
 3245508265
 3245508434
(16 rows)

Re: Postgres Replaying WAL slowly

From: Andres Freund
Date:
On 2014-06-30 12:57:56 -0700, Jeff Frost wrote:
>
> On Jun 30, 2014, at 12:54 PM, Matheus de Oliveira <matioli.matheus@gmail.com> wrote:
>
> >
> > On Mon, Jun 30, 2014 at 4:42 PM, Jeff Frost <jeff@pgexperts.com> wrote:
> > And if you go fishing in pg_class for any of the oids, you don't find anything:
> >
> > That is probably because you are connected to the wrong database. Once you connect to the database of interest, you
> > don't even need to query pg_class; just cast the relation attribute to regclass:
> >
> >     SELECT relation::regclass, ...
> >     FROM pg_locks WHERE database = (SELECT oid FROM pg_database WHERE datname = current_database());
> >
>
> Yah, I thought about that too, but verified I am in the correct DB.  Just for clarity's sake:

So these are probably relations created in uncommitted
transactions. Possibly ON COMMIT DROP temp tables?
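
(That hypothesis is easy to reproduce in a scratch session — hypothetical table name; while the transaction is open, the lock shows up in pg_locks but the oid does not resolve from other sessions:)

    BEGIN;
    CREATE TEMP TABLE scratch_demo (id int) ON COMMIT DROP;  -- new relation, AccessExclusiveLock held until commit
    SELECT relation, mode, granted
      FROM pg_locks
     WHERE pid = pg_backend_pid()
       AND mode = 'AccessExclusiveLock';
    COMMIT;  -- ON COMMIT DROP removes the table again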

Greetings,

Andres Freund

--
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: Postgres Replaying WAL slowly

From: Jeff Frost
Date:

On Jun 30, 2014, at 1:15 PM, Andres Freund <andres@2ndquadrant.com> wrote:

On 2014-06-30 12:57:56 -0700, Jeff Frost wrote:

On Jun 30, 2014, at 12:54 PM, Matheus de Oliveira <matioli.matheus@gmail.com> wrote:


On Mon, Jun 30, 2014 at 4:42 PM, Jeff Frost <jeff@pgexperts.com> wrote:
And if you go fishing in pg_class for any of the oids, you don't find anything:

That is probably because you are connected to the wrong database. Once you connect to the database of interest, you don't even need to query pg_class; just cast the relation attribute to regclass:

   SELECT relation::regclass, ...
   FROM pg_locks WHERE database = (SELECT oid FROM pg_database WHERE datname = current_database());


Yah, I thought about that too, but verified I am in the correct DB.  Just for clarity's sake:

So these are probably relations created in uncommitted
transactions. Possibly ON COMMIT DROP temp tables?


That would make sense.  There are definitely quite a few of those being used.

Another item of note is the system catalogs are quite bloated:

 schemaname |  tablename   | tbloat | wastedmb | idxbloat | wastedidxmb
------------+--------------+--------+----------+----------+-------------
 pg_catalog | pg_attribute |   3945 |   106.51 |     2770 |      611.28
 pg_catalog | pg_class     |   8940 |    45.26 |     4420 |       47.89
 pg_catalog | pg_type      |   4921 |    18.45 |     5850 |       81.16
 pg_catalog | pg_depend    |    933 |    10.23 |    11730 |      274.37
 pg_catalog | pg_index     |   3429 |     8.36 |     3920 |       24.24
 pg_catalog | pg_shdepend  |    983 |     2.67 |     9360 |       30.56
(6 rows)

Would that cause the replica to spin on StandbyReleaseLocks?


Re: Postgres Replaying WAL slowly

From: Tom Lane
Date:
Jeff Frost <jeff@pgexperts.com> writes:
> On Jun 30, 2014, at 1:15 PM, Andres Freund <andres@2ndquadrant.com> wrote:
>> So these are probably relations created in uncommitted
>> transactions. Possibly ON COMMIT DROP temp tables?

> That would make sense.  There are definitely quite a few of those being used.

Uh-huh.  I doubt that the mechanism that handles propagation of
AccessExclusiveLocks to the standby is smart enough to ignore locks
on temp tables :-(

> Another item of note is the system catalogs are quite bloated:
> Would that cause the replica to spin on StandbyReleaseLocks?

AFAIK, no.  It's an unsurprising consequence of heavy use of short-lived
temp tables though.

So it seems like we have a candidate explanation.  I'm a bit surprised
that StandbyReleaseLocks would get this slow if there are only a dozen
AccessExclusiveLocks in place at any one time, though.  Perhaps that
was a low point and there are often many more?

            regards, tom lane


Re: Postgres Replaying WAL slowly

From: Jeff Frost
Date:
On Jun 30, 2014, at 1:39 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

>
>
>> Another item of note is the system catalogs are quite bloated:
>> Would that cause the replica to spin on StandbyReleaseLocks?
>
> AFAIK, no.  It's an unsurprising consequence of heavy use of short-lived
> temp tables though.
>

Yah, this has been an issue in the past, so we tend to cluster them regularly during off-hours to minimize the issue.

> So it seems like we have a candidate explanation.  I'm a bit surprised
> that StandbyReleaseLocks would get this slow if there are only a dozen
> AccessExclusiveLocks in place at any one time, though.  Perhaps that
> was a low point and there are often many more?
>
>

Entirely possible that it was a low point.  We'll set up some monitoring to track the number of AccessExclusiveLocks and see how much variance there is throughout the day.
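
(A minimal version of such monitoring, assuming the samples are collected into a scratch table — the table name is made up:)

    CREATE TABLE ael_samples (sample_time timestamptz, ael_count int);

    -- run from cron every minute, e.g. via psql -c
    INSERT INTO ael_samples
    SELECT now(), count(*)
      FROM pg_locks
     WHERE mode = 'AccessExclusiveLock';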




Re: Postgres Replaying WAL slowly

From: Jeff Frost
Date:

On Jun 30, 2014, at 1:46 PM, Jeff Frost <jeff@pgexperts.com> wrote:

So it seems like we have a candidate explanation.  I'm a bit surprised
that StandbyReleaseLocks would get this slow if there are only a dozen
AccessExclusiveLocks in place at any one time, though.  Perhaps that
was a low point and there are often many more?



Entirely possible that it was a low point.  We'll set up some monitoring to track the number of AccessExclusiveLocks and see how much variance there is throughout the day.


Since we turned on the monitoring for that, we had a peak of 13,550 AccessExclusiveLocks.  So far most of the samples have been in the double digits, with that and two other outliers: 6,118 and 12,747.

Re: Postgres Replaying WAL slowly

From: Tom Lane
Date:
Jeff Frost <jeff@pgexperts.com> writes:
>>> So it seems like we have a candidate explanation.  I'm a bit surprised
>>> that StandbyReleaseLocks would get this slow if there are only a dozen
>>> AccessExclusiveLocks in place at any one time, though.  Perhaps that
>>> was a low point and there are often many more?

> Since we turned on the monitoring for that, we had a peak of 13,550
> AccessExclusiveLocks.

Ah ... that's more like a number I can believe something would have
trouble coping with.  Did you see a noticeable slowdown with this?
Now that we've seen that number, of course it's possible there was an
even higher peak occurring when you saw the trouble.

Perhaps there's an O(N^2) behavior in StandbyReleaseLocks, or maybe
it just takes awhile to handle that many locks.

Did you check whether the locks were all on temp tables of the
ON COMMIT DROP persuasion?
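
(ON COMMIT DROP status itself is not visible in the catalogs, but whether a locked relation is a temp table can be checked for any lock whose relation already resolves in pg_class — a sketch; relations still pending in uncommitted transactions simply won't join:)

    SELECT l.relation::regclass AS relation, c.relpersistence  -- 't' = temporary
      FROM pg_locks l
      JOIN pg_class c ON c.oid = l.relation
     WHERE l.mode = 'AccessExclusiveLock';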

            regards, tom lane


Re: Postgres Replaying WAL slowly

From: Andres Freund
Date:
On 2014-06-30 19:04:20 -0400, Tom Lane wrote:
> Jeff Frost <jeff@pgexperts.com> writes:
> >>> So it seems like we have a candidate explanation.  I'm a bit surprised
> >>> that StandbyReleaseLocks would get this slow if there are only a dozen
> >>> AccessExclusiveLocks in place at any one time, though.  Perhaps that
> >>> was a low point and there are often many more?
>
> > Since we turned on the monitoring for that, we had a peak of 13,550
> > AccessExclusiveLocks.

Any chance the workload also uses lots of subtransactions?

> Ah ... that's more like a number I can believe something would have
> trouble coping with.  Did you see a noticeable slowdown with this?
> Now that we've seen that number, of course it's possible there was an
> even higher peak occurring when you saw the trouble.
>
> Perhaps there's an O(N^2) behavior in StandbyReleaseLocks, or maybe
> it just takes awhile to handle that many locks.

I don't think there's a O(n^2) in StandbyReleaseLocks() itself, but in
combination with StandbyReleaseLockTree() it looks possibly bad. The
latter will call StandbyReleaseLocks() for every xid/subxid, and each of
the StandbyReleaseLocks() will then trawl the entire RecoveryLockList...

It'd probably be better to implement ReleaseLocksTree() by sorting the
subxid list and bsearch that while iterating RecoveryLockList.

Greetings,

Andres Freund

--
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: Postgres Replaying WAL slowly

From: Jeff Frost
Date:

On Jun 30, 2014, at 4:04 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Ah ... that's more like a number I can believe something would have
trouble coping with.  Did you see a noticeable slowdown with this?
Now that we've seen that number, of course it's possible there was an
even higher peak occurring when you saw the trouble.

Perhaps there's an O(N^2) behavior in StandbyReleaseLocks, or maybe
it just takes awhile to handle that many locks.

Did you check whether the locks were all on temp tables of the
ON COMMIT DROP persuasion?


Unfortunately not, because I went for a poor man's: SELECT count(*) FROM pg_locks WHERE mode = 'AccessExclusiveLock' 
run in cron every minute.

That said, I'd bet it was mostly ON COMMIT DROP temp tables.

The unfortunate thing is I wouldn't know how to correlate that spike with the corresponding slowdown because the replica is about 5.5hrs lagged at the moment.

Hopefully it will get caught up tonight and we can see if there's a correlation tomorrow.

Re: Postgres Replaying WAL slowly

From: Jeff Frost
Date:
On Jun 30, 2014, at 4:57 PM, Jeff Frost <jeff@pgexperts.com> wrote:


On Jun 30, 2014, at 4:04 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Ah ... that's more like a number I can believe something would have
trouble coping with.  Did you see a noticeable slowdown with this?
Now that we've seen that number, of course it's possible there was an
even higher peak occurring when you saw the trouble.

Perhaps there's an O(N^2) behavior in StandbyReleaseLocks, or maybe
it just takes awhile to handle that many locks.

Did you check whether the locks were all on temp tables of the
ON COMMIT DROP persuasion?


Unfortunately not, because I went for a poor man's: SELECT count(*) FROM pg_locks WHERE mode = 'AccessExclusiveLock' 
run in cron every minute.

That said, I'd bet it was mostly ON COMMIT DROP temp tables.

The unfortunate thing is I wouldn't know how to correlate that spike with the corresponding slowdown because the replica is about 5.5hrs lagged at the moment.

Hopefully it will get caught up tonight and we can see if there's a correlation tomorrow.

And indeed it did catch up overnight and the lag increased shortly after a correlating spike in AccessExclusiveLocks that were generated by temp table creation with on commit drop.


Re: Postgres Replaying WAL slowly

From: Tom Lane
Date:
Jeff Frost <jeff@pgexperts.com> writes:
>> On Jun 30, 2014, at 4:04 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Did you check whether the locks were all on temp tables of the
>>> ON COMMIT DROP persuasion?

> And indeed it did catch up overnight and the lag increased shortly after a correlating spike in AccessExclusiveLocks
> that were generated by temp table creation with on commit drop.

OK, so we have a pretty clear idea of where the problem is now.

It seems like there are three, not mutually exclusive, ways we might
address this:

1. Local revisions inside StandbyReleaseLocks to make it perform better in
the presence of many locks.  This would only be likely to improve matters
much if there's a fixable O(N^2) algorithmic issue; but there might well
be one.

2. Avoid WAL-logging AccessExclusiveLocks associated with temp tables, on
the grounds that no standby should be touching them.  I'm not entirely
sure that that argument is bulletproof though; in particular, even though
a standby couldn't access the table's data, it's possible that it would be
interested in seeing consistent catalog entries.

3. Avoid WAL-logging AccessExclusiveLocks associated with
new-in-transaction tables, temp or not, on the grounds that no standby
could even see such tables until they're committed.  We could go a bit
further and not take out any locks on a new-in-transaction table in the
first place, on the grounds that other transactions on the master can't
see 'em either.

It sounded like Andres had taken a preliminary look at #1 and found a
possible avenue for improvement, which I'd encourage him to pursue.

For both #2 and the conservative version of #3, the main implementation
problem would be whether the lock WAL-logging code has cheap access to
the necessary information.  I suspect it doesn't.

The radical version of #3 might be pretty easy to do, at least to the
extent of removing locks taken out during CREATE TABLE.  I suspect there
are some assertions or other consistency checks that would get unhappy if
we manipulate relations without locks, though, so those would have to be
taught about the exception.  Also, we sometimes forget new-in-transaction
status during relcache flush events; it's not clear if that would be a
problem for this.

I don't plan to work on this myself, but perhaps someone with more
motivation will want to run with these ideas.

            regards, tom lane


Re: Postgres Replaying WAL slowly

From: Andres Freund
Date:
On 2014-07-01 15:20:37 -0400, Tom Lane wrote:
> Jeff Frost <jeff@pgexperts.com> writes:
> >> On Jun 30, 2014, at 4:04 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >>> Did you check whether the locks were all on temp tables of the
> >>> ON COMMIT DROP persuasion?
>
> > And indeed it did catch up overnight and the lag increased shortly after a correlating spike in
> > AccessExclusiveLocks that were generated by temp table creation with on commit drop.
>
> OK, so we have a pretty clear idea of where the problem is now.
>
> It seems like there are three, not mutually exclusive, ways we might
> address this:
>
> 1. Local revisions inside StandbyReleaseLocks to make it perform better in
> the presence of many locks.  This would only be likely to improve matters
> much if there's a fixable O(N^2) algorithmic issue; but there might well
> be one.
>
> It sounded like Andres had taken a preliminary look at #1 and found a
> possible avenue for improvement, which I'd encourage him to pursue.
>

I don't have the resources to do this right now, but yes, I think we can
relatively easily get rid of the O(num_locks * num_subtransactions)
behaviour.

> 2. Avoid WAL-logging AccessExclusiveLocks associated with temp tables, on
> the grounds that no standby should be touching them.  I'm not entirely
> sure that that argument is bulletproof though; in particular, even though
> a standby couldn't access the table's data, it's possible that it would be
> interested in seeing consistent catalog entries.

Hm. We definitely perform checks surprisingly late for those. It's
possible to do SELECT * FROM pg_temp_<nn>.whatever; without an error if
there are no rows or if the rest of the plan doesn't access that
table. The check prohibiting access is only in bufmgr.c...
So yeah, I don't think we can do this, at least for < 9.4. And there
it'll still be hard.

> 3. Avoid WAL-logging AccessExclusiveLocks associated with
> new-in-transaction tables, temp or not, on the grounds that no standby
> could even see such tables until they're committed.  We could go a bit
> further and not take out any locks on a new-in-transaction table in the
> first place, on the grounds that other transactions on the master can't
> see 'em either.
>
> For both #2 and the conservative version of #3, the main implementation
> problem would be whether the lock WAL-logging code has cheap access to
> the necessary information.  I suspect it doesn't.

Not trivially. It's logged directly in LockAcquireExtended(). We could
add the information into locktags as there are unused fields for relation
locktags, but brrr.

> The radical version of #3 might be pretty easy to do, at least to the
> extent of removing locks taken out during CREATE TABLE.  I suspect there
> are some assertions or other consistency checks that would get unhappy if
> we manipulate relations without locks, though, so those would have to be
> taught about the exception.
>
> Also, we sometimes forget new-in-transaction
> status during relcache flush events; it's not clear if that would be a
> problem for this.

I think that hole is actually pluggable in newer releases - at least
there's no code around that assumes rd_createSubid now is persistent,
even across cache resets.

But I think more importantly it's probably quite possible to hit a
similar problem without ON COMMIT DROP relations. Say DISCARD TEMP
inside a transaction (with several subxacts) or so? So we probably really
should fix the bad scaling.
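
(A sketch of the kind of sequence meant here, with made-up table names: the temp-table DDL takes WAL-logged AccessExclusiveLocks, and the savepoints mean StandbyReleaseLockTree() has several subxids to walk at commit.)

    BEGIN;
    SAVEPOINT s1;
    CREATE TEMP TABLE scratch_a (id int);  -- AccessExclusiveLock on the new relation, WAL-logged for the standby
    SAVEPOINT s2;
    CREATE TEMP TABLE scratch_b (id int);
    RELEASE SAVEPOINT s2;
    RELEASE SAVEPOINT s1;
    DISCARD TEMP;                          -- drops both temp tables, taking more AccessExclusiveLocks
    COMMIT;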

Greetings,

Andres Freund

--
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: Postgres Replaying WAL slowly

From: Tom Lane
Date:
Andres Freund <andres@2ndquadrant.com> writes:
> On 2014-07-01 15:20:37 -0400, Tom Lane wrote:
>> It seems like there are three, not mutually exclusive, ways we might
>> address this:

> But I think more importantly it's probably quite possible to hit a
> similar problem without ON COMMIT DROP relations. Say DISCARD TEMP
> inside a transaction (with several subxacts) or so? So we probaly really
> should fix the bad scaling.

Well, my thought was that these approaches would address somewhat
different sets of use-cases, and we might well want to do more than one.
Even if StandbyReleaseLocks were zero-cost, not emitting the WAL in the
first place is surely considerably cheaper yet.

            regards, tom lane


Re: Postgres Replaying WAL slowly

From: Simon Riggs
Date:
On 1 July 2014 20:20, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> I don't plan to work on this myself, but perhaps someone with more
> motivation will want to run with these ideas.

I was planning to work on improving performance of replication apply
over the summer, mid July - Aug, so I'll add this to the list.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: Postgres Replaying WAL slowly

From: Simon Riggs
Date:
On 1 July 2014 20:20, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Jeff Frost <jeff@pgexperts.com> writes:
>>> On Jun 30, 2014, at 4:04 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>>> Did you check whether the locks were all on temp tables of the
>>>> ON COMMIT DROP persuasion?
>
>> And indeed it did catch up overnight and the lag increased shortly after a correlating spike in AccessExclusiveLocks
>> that were generated by temp table creation with on commit drop.
>
> OK, so we have a pretty clear idea of where the problem is now.
>
> It seems like there are three, not mutually exclusive, ways we might
> address this:
>
> 1. Local revisions inside StandbyReleaseLocks to make it perform better in
> the presence of many locks.  This would only be likely to improve matters
> much if there's a fixable O(N^2) algorithmic issue; but there might well
> be one.
>
> 2. Avoid WAL-logging AccessExclusiveLocks associated with temp tables, on
> the grounds that no standby should be touching them.  I'm not entirely
> sure that that argument is bulletproof though; in particular, even though
> a standby couldn't access the table's data, it's possible that it would be
> interested in seeing consistent catalog entries.
>
> 3. Avoid WAL-logging AccessExclusiveLocks associated with
> new-in-transaction tables, temp or not, on the grounds that no standby
> could even see such tables until they're committed.  We could go a bit
> further and not take out any locks on a new-in-transaction table in the
> first place, on the grounds that other transactions on the master can't
> see 'em either.
>
> It sounded like Andres had taken a preliminary look at #1 and found a
> possible avenue for improvement, which I'd encourage him to pursue.
>
> For both #2 and the conservative version of #3, the main implementation
> problem would be whether the lock WAL-logging code has cheap access to
> the necessary information.  I suspect it doesn't.
>
> The radical version of #3 might be pretty easy to do, at least to the
> extent of removing locks taken out during CREATE TABLE.  I suspect there
> are some assertions or other consistency checks that would get unhappy if
> we manipulate relations without locks, though, so those would have to be
> taught about the exception.  Also, we sometimes forget new-in-transaction
> status during relcache flush events; it's not clear if that would be a
> problem for this.
>
> I don't plan to work on this myself, but perhaps someone with more
> motivation will want to run with these ideas.

The patch implements option 2 above.

Skipping the locks entirely seems like it opens a can of worms.

Skipping the lock for temp tables is valid since locks don't need to
exist on the standby. Any catalog entries for them will exist, but the
rows will show them as temp and nobody would expect them to be valid
outside of the original session.

The patch implements a special case that takes the lock normally but
skips WAL-logging the lock info.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachments