Обсуждение: Postgresql ShareLock problems.

Поиск
Список
Период
Сортировка

Postgresql ShareLock problems.

От
"Svancara, Randall"
Дата:

We are using a third party monitoring solution called Zabbix with Postgresql 9.1.We are observing lots of sharelock problems possibly as a result of multiple, concurrent updates to the specific table called items.


First of all, what can I do to reduce lock contention in Postgresql?  Surely someone is using postgresql in a high transaction environment supporting concurrent updates to the same table?  Secondly how do I determine what would be appropriate in terms of settings for the postgresql.conf to deal with this issue.  Third, what you suggest to the Zabbix developers to fix such a problem?

 

2012-11-01 00:14:23 PDT zabbix zabbix 127.0.0.1 - DETAIL:  Process 5983 waits for ShareLock on transaction 48349033; blocked by process 5992.

        Process 5992 waits for ShareLock on transaction 48349079; blocked by process 5983.

        Process 5983: update items set lastclock=1351754051,lastns=230190945,prevvalue=lastvalue,lastvalue='2.692934' where itemid=200589;

        update items set lastclock=1351754050,lastns=329344694,prevvalue=lastvalue,lastvalue='2080' where itemid=200650;

        update items set lastclock=1351754051,lastns=378134858,prevvalue=lastvalue,lastvalue='1' where itemid=200710;

        update items set lastclock=1351754051,lastns=381995730,prevorgvalue='56',prevvalue=lastvalue,lastvalue='0' where itemid=201150;

        update items set lastclock=1351754051,lastns=378331407,prevvalue=lastvalue,lastvalue='Unit: 0 Slot: 0 Port: 3 Gigabit - Level 0x6000001' where itemid=201300;

        update items set lastclock=1351754051,lastns=380237518,prevorgvalue='0',prevvalue=lastvalue,lastvalue='0' where itemid=201350;

        update items set lastclock=1351754051,lastns=378222316,prevvalue=lastvalue,lastvalue='0' where itemid=201670;

        update items set lastclock=1351754050,lastns=319926666,prevvalue=lastvalue,lastvalue='893' where itemid=201970;

        update items set lastclock=1351754051,lastns=393

        Process 5992: update items set lastclock=1351754043,lastns=980316865,prevvalue=lastvalue,lastvalue='65381720064' where itemid=23343;

        update items set lastclock=1351754044,lastns=392778204,prevvalue=lastvalue,lastvalue='887279616' where itemid=23344;

        update items set lastclock=1351754044,lastns=3196233,prevvalue=lastvalue,lastvalue='1351754044' where itemid=23763;

        update items set lastclock=1351754044,lastns=867856940,prevvalue=lastvalue,lastvalue='92.097214' where itemid=23764;

        update items set lastclock=1351754043,lastns=998890879,prevorgvalue='94217753500',prevvalue=lastvalue,lastvalue='14024' where itemid=23943;

        update items set lastclock=1351754044,lastns=865008900,prevorgvalue='0',prevvalue=lastvalue,lastvalue='0' where itemid=23944;

        update items set lastclock=1351754043,lastns=970666655,prevorgvalue='66951216662',prevvalue=lastvalue,lastvalue='605816' where itemid=24003;

        update items set lastclock=1351754044,lastns=871362232,prevvalue=lastvalue,lastvalue='2715914240' where itemid=24004;

        update items set lastclock=13

2012-11-01 00:14:23 PDT zabbix zabbix 127.0.0.1 - HINT:  See server log for query details.

 

Here is our modifications to the default postgresql.conf file:

listen_addresses = '*'

max_connections = 300                   # (change requires restart)

shared_buffers = 40GB                   # min 128kB  #  WSUIT

work_mem = 1024MB                         # WSU IT

maintenance_work_mem = 512MB            # WSU IT

effective_io_concurrency = 4

synchronous_commit = off                # immediate fsync at commit  #  WSUIT

wal_buffers = 1024MB

checkpoint_segments = 512               # in logfile segments, min 1, 16MB each  #  WSUIT

checkpoint_completion_target = 0.9      # checkpoint target duration, 0.0 - 1.0  #  WSUIT

effective_cache_size = 48GB                                         #  WSUIT

logging_collector = on                  # Enable capturing of stderr and csvlog

log_directory = '/var/log/pg_log'       # directory where log files are written, WSUIT

log_filename = 'postgresql-%a.log'      # log file name pattern,

log_truncate_on_rotation = on           # If on, an existing log file of the

log_rotation_size = 100MB               # Automatic rotation of logfiles will

log_min_error_statement = error         # values in order of decreasing detail:

log_min_duration_statement = 1000       # WSUIT -- Log statements over ten seconds

log_line_prefix = '%t %u %d %h - '                      # special values:

log_lock_waits = on                     # WSUIT - Turn on logging waits

autovacuum_max_workers = 6              # WSUIT -- Increase these for our larger tables

datestyle = 'iso, mdy'

lc_messages = 'en_US.UTF-8'                     # locale for system error message

lc_monetary = 'en_US.UTF-8'                     # locale for monetary formatting

lc_numeric = 'en_US.UTF-8'                      # locale for number formatting

lc_time = 'en_US.UTF-8'                         # locale for time formatting

 

postgres=# select version();

                                                   version

--------------------------------------------------------------------------------------------------------------

PostgreSQL 9.1.6 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.6 20120305 (Red Hat 4.4.6-4), 64-bit

 

Hardware:
12 cpu cores,

96 GB of RAM

4-15K RPM drives in a RAID 1+0

 

TOP output:

 

top - 12:42:20 up 97 days, 21:17,  8 users,  load average: 1.33, 1.19, 1.14

Tasks: 718 total,   7 running, 711 sleeping,   0 stopped,   0 zombie

Cpu(s): 18.7%us,  1.1%sy,  0.0%ni, 80.1%id,  0.0%wa,  0.0%hi,  0.1%si,  0.0%st

Mem:  99060040k total, 96891404k used,  2168636k free,   220860k buffers

Swap: 16777208k total,  1467392k used, 15309816k free, 85556104k cached

 

You can see the load is really low and virtually no I/O waits. 

 

Re: Postgresql ShareLock problems.

От
Jeff Janes
Дата:
On Thu, Nov 1, 2012 at 12:43 PM, Svancara, Randall <rsvancara@wsu.edu> wrote:
> We are using a third party monitoring solution called Zabbix with Postgresql
> 9.1.We are observing lots of sharelock problems possibly as a result of
> multiple, concurrent updates to the specific table called items.
>
>
> First of all, what can I do to reduce lock contention in Postgresql?

The problem you are seeing is with deadlocks, not with sharelocks.
(The fact that transaction waits are implemented as ShareLocks is an
internal implementation, and perhaps the error message should be
changed so as not to leak that to the user).

The problem is not with contention in general, but just the special
type of contention that leads to cyclic dependencies, i.e. deadlocks.

> Surely
> someone is using postgresql in a high transaction environment supporting
> concurrent updates to the same table?  Secondly how do I determine what
> would be appropriate in terms of settings for the postgresql.conf to deal
> with this issue.  Third, what you suggest to the Zabbix developers to fix
> such a problem?

I don't think there is anything you can do about it in the config file.

If Zabbix retries the transaction when it gets thrown a deadlock
error, then there is not really a problem, just a cluttering of your
log file.  Unless they are happening extremely frequently and creating
performance problems.

It is hard to know why Zabbix is doing what they are doing, so hard to
know what to recommend instead.  One possibility is that they are
bunching unrelated real-world transactions into fewer database
transactions in an attempt to improve "group commit" performance.  In
which case, they should probably not do that and instead turn
synchronous commit off.

Cheers,

Jeff