Discussion: Partitions and max_locks_per_transaction
Hi All,

I have a stats collection system where I collect stats at specific intervals (from network monitoring nodes) and stuff them into a PostgreSQL DB. To make the retrieval faster, I'm using a partitioning scheme as follows:

stats_300: data gathered at 5 mins, child tables named stats_300_t1_t2 (where t2 - t1 = 2 hrs), i.e. 12 tables in one day
stats_3600: data gathered / calculated over 1 hour, child tables similar to the above - stats_3600_t1_t2, where (t2 - t1) is 2 days (i.e. 15 tables a month)
stats_86400: data gathered / calculated over 1 day, stored as stats_86400_t1_t2 where (t2 - t1) is 30 days (i.e. 12 tables a year)

The child tables have 4 indexes each (including a unique index, also used for CLUSTER). No indexes are defined on the parent tables. Data insert / load happens directly into the child tables (no stored procs involved).

I'm running into the error:

ERROR: out of shared memory
HINT: You might need to increase max_locks_per_transaction.

Looking back, it seems acceptable to have max_locks_per_transaction in the thousands (with a corresponding shared_buffers setting so I don't overflow SHMMAX). However, what I find strange is that I only have 32 tables so far (some at 5-min, some at 1-hour). I'm doing some data preloading, and even that ran into this problem. I'm running this on a shared server with 4GB total RAM, so I don't want PG to use too much. (Eventually, the system is designed to have

I tried increasing max_locks_per_transaction, but that just seems to delay the inevitable. Any ideas what I might be doing wrong? In case this may be a programmatic issue: I'm using Python PygreSQL to load the data as prepared statements. I have one connection to the DB, create and release a cursor, and commit transactions when I'm done.
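[For readers following along: the scheme described above corresponds to pre-declarative-partitioning inheritance tables of roughly this shape. All table, column, and index names below are illustrative guesses, not the poster's actual schema; the epoch boundaries are made-up examples.]

```sql
-- Parent table for the 5-minute stats; holds no rows and no indexes itself.
CREATE TABLE stats_300 (
    node_id integer NOT NULL,   -- monitoring node (assumed column)
    ts      integer NOT NULL,   -- sample time as a Unix epoch (assumed)
    val     bigint
);

-- One two-hour child: t1 = 1258587000, t2 = t1 + 7200 (example epochs).
CREATE TABLE stats_300_1258587000_1258594200 (
    CHECK (ts >= 1258587000 AND ts < 1258594200)
) INHERITS (stats_300);

-- The unique index also used for CLUSTER; the post says each child
-- carries three more secondary indexes besides this one.
CREATE UNIQUE INDEX stats_300_1258587000_1258594200_uniq
    ON stats_300_1258587000_1258594200 (node_id, ts);
```

With constraint_exclusion = on (as in the posted conf), the CHECK constraints let the planner skip children whose range cannot match a WHERE clause on ts.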
--- begin postgresql.conf ---
data_directory = '/data/pg'
hba_file = '/etc/pg_hba.conf'
ident_file = '/etc/pg_ident.conf'
external_pid_file = '/data/pg/8.4-main.pid'
port = 5432
max_connections = 8
unix_socket_directory = '/tmp'
ssl = false
shared_buffers = 128MB              # used to be 500
work_mem = 64MB
maintenance_work_mem = 64MB
wal_buffers = 1MB
checkpoint_segments = 30
checkpoint_timeout = 15min
effective_cache_size = 1024MB
default_statistics_target = 800
constraint_exclusion = on
log_destination = 'syslog'
syslog_facility = 'LOCAL1'
syslog_ident = 'postgres'
client_min_messages = error
log_min_messages = error
log_line_prefix = '%t '
log_temp_files = 0
datestyle = 'iso, mdy'
lc_messages = 'C'
lc_monetary = 'C'
lc_numeric = 'C'
lc_time = 'C'
default_text_search_config = 'pg_catalog.english'
max_locks_per_transaction = 8000    # Originally 500, tried 1k and 2k also

Thanks
Hrishikesh
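[Aside for readers: the size of the shared lock table follows directly from these settings. Per the PostgreSQL documentation, the server allocates room for max_locks_per_transaction * (max_connections + max_prepared_transactions) object locks; the pool is shared, so a single transaction may use more than max_locks_per_transaction entries as long as the total fits. A quick back-of-the-envelope with the posted values (max_prepared_transactions is not set in the conf; 0 is the 8.4 default):]

```python
# Shared lock table sizing under the posted postgresql.conf.
max_connections = 8
max_prepared_transactions = 0    # not in the posted conf; 8.4 default
max_locks_per_transaction = 8000

# Total object-lock slots available across ALL backends combined.
lock_table_slots = max_locks_per_transaction * (
    max_connections + max_prepared_transactions
)
print(lock_table_slots)  # 64000 with the posted settings; 4000 at the original 500
```

So at the original max_locks_per_transaction = 500 the whole cluster had 4000 slots to share.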
Hrishikesh (हृषीकेश मेहेंदळे) <hashinclude@gmail.com> writes:
> To make the retrieval faster, I'm using a
> partitioning scheme as follows:
> stats_300: data gathered at 5 mins, child tables named stats_300_t1_t2
> (where t2 - t1 = 2 hrs), i.e. 12 tables in one day
> stats_3600: data gathered / calculated over 1 hour, child tables
> similar to the above - stats_3600_t1_t2, where (t2 - t1) is 2 days
> (i.e. 15 tables a month)
> stats_86400: data gathered / calculated over 1 day, stored as
> stats_86400_t1_t2 where (t2 - t1) is 30 days (i.e. 12 tables a year).

So you've got, um, something less than a hundred rows in any one child table? This is carrying partitioning to an insane degree, and your performance is NOT going to be improved by it.

I'd suggest partitioning on boundaries that will give you order of a million rows per child. That could be argued an order of magnitude or two either way, but what you've got is well outside the useful range.

> I'm running into the error "ERROR: out of shared memory HINT: You
> might need to increase max_locks_per_transaction."

No surprise given the number of tables and indexes you're forcing the system to deal with ...

regards, tom lane
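[Aside for readers: here is what Tom's million-rows-per-child rule of thumb works out to for this workload. The 2000-endpoint figure comes from later in the thread; the arithmetic is mine, not Tom's.]

```python
# Sizing the 5-minute (stats_300) partitions against a ~1M-rows-per-child
# target, assuming one row per endpoint per 5-minute sample.
endpoints = 2000                   # worst case cited later in the thread
samples_per_day = 24 * 60 // 5     # 288 five-minute samples per day
rows_per_day = endpoints * samples_per_day
print(rows_per_day)                # 576000 rows/day at worst case

# Current 2-hour children hold endpoints * 24 = 48,000 rows at worst case
# (far fewer on a lightly loaded system), while a ~2-day child would hold
# about 1.15M rows -- inside Tom's suggested range.
rows_per_two_day_child = rows_per_day * 2
print(rows_per_two_day_child)      # 1152000
```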
It was Thursday 19 November 2009 11:08:10 pm that the wise Tom Lane thus wrote:
> <hashinclude@gmail.com> writes:
> > To make the retrieval faster, I'm using a
> > partitioning scheme as follows:
> >
> > stats_300: data gathered at 5 mins, child tables named stats_300_t1_t2
> > (where t2 - t1 = 2 hrs), i.e. 12 tables in one day
> > stats_3600: data gathered / calculated over 1 hour, child tables
> > similar to the above - stats_3600_t1_t2, where (t2 - t1) is 2 days
> > (i.e. 15 tables a month)
> > stats_86400: data gathered / calculated over 1 day, stored as
> > stats_86400_t1_t2 where (t2 - t1) is 30 days (i.e. 12 tables a year).
>
> So you've got, um, something less than a hundred rows in any one child
> table? This is carrying partitioning to an insane degree, and your
> performance is NOT going to be improved by it.

Sorry, I forgot to mention: in the "normal" case, each of those tables will have a few hundred thousand records, and in the worst case (the tables store info on up to 2000 endpoints) it can be around 5 million. Also, the partitioning is not final yet (we might move it to 6 hours / 12 hours per partition), which is why I need to run the load test :)

> I'd suggest partitioning on boundaries that will give you order of a
> million rows per child. That could be argued an order of magnitude or
> two either way, but what you've got is well outside the useful range.
>
> > I'm running into the error "ERROR: out of shared memory HINT: You
> > might need to increase max_locks_per_transaction."
>
> No surprise given the number of tables and indexes you're forcing
> the system to deal with ...

How many locks per table/index does PG require? Even with my current state (<50 tables, <250 tables + indexes), is it reasonable to expect 2000 locks to run out?

Thanks,
Hrishi
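[Aside for readers: a rough answer to the closing question. Each relation a transaction opens -- heap or index -- takes at least one lock-table entry, held until commit, and the entries come out of the shared pool sized by max_locks_per_transaction * max_connections. A sketch of the arithmetic (my own estimate, not from the thread):]

```python
# Estimated lock-table entries for one transaction that touches every
# partition and all of its indexes (upper bound from the poster's figures).
tables = 50             # poster's "<50 tables"
indexes_per_table = 4   # 4 indexes per child, per the original post
locks_per_txn = tables * (1 + indexes_per_table)
print(locks_per_txn)    # 250 -- comfortably under 2000

# So a single such transaction shouldn't exhaust 2000 slots on its own.
# Running out anyway points at locks accumulating: e.g. one long
# transaction that creates, loads, and indexes many children before
# committing, or several backends doing the same against the shared pool.
```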