WARNING: pgstat wait timeout
От | Sergey E. Koposov |
---|---|
Тема | WARNING: pgstat wait timeout |
Дата | |
Msg-id | alpine.LRH.2.00.1001210139500.15783@lnfm1.sai.msu.ru обсуждение исходный текст |
Ответы |
Re: WARNING: pgstat wait timeout
(Jaime Casanova <jcasanov@systemguards.com.ec>)
|
Список | pgsql-hackers |
Hello hackers, I've recently hit the message "WARNING: pgstat wait timeout" with PG 8.4.2. I saw some reports about that message in the -bugs mailing list http://archives.postgresql.org/pgsql-bugs/2009-12/msg00175.php http://archives.postgresql.org/pgsql-bugs/2009-07/msg00081.php where the backtrace from the statisctic collector was requested. Although I don't have any other bad sympthoms in the system, I still obtained a backtrace from the statistics collector process. Since I'm not 100% sure that the message is really a bug, feel free to ignore. But if needed I have PG still running, so I can check something else if needed. Here is the (rather innocent IMHO) backtrace of the statistic collector process: (gdb) bt #0 0x00007f31ddfc4b1f in poll () from /lib/libc.so.6 #1 0x00000000005bf7da in PgstatCollectorMain (argc=<value optimized out>, # argv=<value optimized out>) at pgstat.c:2718 #2 0x00000000005c0131 in pgstat_start () at pgstat.c:631 #3 0x00000000005c474d in reaper (postgres_signal_arg=<value optimized out>) #at postmaster.c:2322 #4 <signal handler called> #5 0x00007f31ddfc6c83 in select () from /lib/libc.so.6 #6 0x00000000005c20fc in ServerLoop () at postmaster.c:1347 #7 0x00000000005c34a7 in PostmasterMain (argc=3, argv=0x144fd20) at #postmaster.c:1040 #8 0x000000000056cdc8 in main (argc=3, argv=0x144fd20) at main.c:188 (gdb) quit The program is running. Quit anyway (and detach it)? (y or n) y Detaching from program: /opt/pgsql/bin/postgres, process 24677 ------------- Bt full: (gdb) bt full #0 0x00007f31ddfc4b1f in poll () from /lib/libc.so.6 No symbol table info available. #1 0x00000000005bf7da in PgstatCollectorMain (argc=<value optimized out>, argv=<value optimized out>) at pgstat.c:2718 len = 64 msg = {msg_hdr = {m_type = PGSTAT_MTYPE_BGWRITER, m_size = 64}, msg_dummy = {m_hdr ={ m_type = PGSTAT_MTYPE_BGWRITER, m_size = 64}}, msg_inquiry = {m_hdr = {m_type = PGSTAT_MTYPE_BGWRITER, m_size= 64}, inquiry_time = 0}, msg_tabstat = {m_hdr = {m_type = PGSTAT_MTYPE_BGWRITER, m_size = 64}, m_databaseid = 0,m_nentries = 0, m_xact_commit = 0, m_xact_rollback = 0, m_entry = {{t_id = 0, t_counts = { t_numscans = 0, t_tuples_returned= 0, t_tuples_fetched = 138, t_tuples_inserted = 138, t_tuples_updated = 0, t_tuples_deleted =0, t_tuples_hot_updated = 0, t_new_live_tuples = 0, t_new_dead_tuples = 0, t_blocks_fetched = 2, t_blocks_hit =2}}, {t_id = 2672, t_counts = {t_numscans = 1, t_tuples_returned = 1, t_tuples_fetched = 1, t_tuples_inserted =0, t_tuples_updated = 0, t_tuples_deleted = 0, t_tuples_hot_updated = 0, t_new_live_tuples = 0, t_new_dead_tuples= 0, t_blocks_fetched = 2, t_blocks_hit = 2}}, {t_id = 1259, t_counts = {t_numscans = 4, t_tuples_returned = 553, t_tuples_fetched = 0, t_tuples_inserted = 0, t_tuples_updated = 0, t_tuples_deleted= 0, t_tuples_hot_updated = 0, t_new_live_tuples = 0, t_new_dead_tuples = 0, t_blocks_fetched =24, t_blocks_hit = 24}}, {t_id = 2615, t_counts = {t_numscans = 0, t_tuples_returned = 0, t_tuples_fetched = 0,t_tuples_inserted = 0, t_tuples_updated = 0, t_tuples_deleted = 0, t_tuples_hot_updated = 0, t_new_live_tuples= 0, t_new_dead_tuples = 0, t_blocks_fetched = 1, t_blocks_hit = 1}}, {t_id = 2685, t_counts ={t_numscans = 1, t_tuples_returned = 1, t_tuples_fetched = 1, t_tuples_inserted = 0, t_tuples_updated = 0, t_tuples_deleted= 0, t_tuples_hot_updated = 0, t_new_live_tuples = 0, t_new_dead_tuples = 0, t_blocks_fetched =2, t_blocks_hit = 2}}, {t_id = 1172815, t_counts = {t_numscans = 0, t_tuples_returned = 0, t_tuples_fetched= 0, t_tuples_inserted = 500000, t_tuples_updated = 0, t_tuples_deleted = 0, t_tuples_hot_updated= 0, t_new_live_tuples = 500000, t_new_dead_tuples = 0, t_blocks_fetched = 23077, t_blocks_hit= 15381}}, {t_id = 1172684, t_counts = {t_numscans = 0, t_tuples_returned = 0, msg = {msg_hdr = {m_type= PGSTAT_MTYPE_BGWRITER, m_size = 64}, msg_dummy = {m_hdr = { m_type = PGSTAT_MTYPE_BGWRITER, m_size = 64}},msg_inquiry = {m_hdr = {m_type = PGSTAT_MTYPE_BGWRITER, m_size = 64}, inquiry_time = 0}, msg_tabstat = {m_hdr ={m_type = PGSTAT_MTYPE_BGWRITER, m_size = 64}, m_databaseid = 0, m_nentries = 0, m_xact_commit = 0, m_xact_rollback =0, m_entry = {{t_id = 0, t_counts = { t_numscans = 0, t_tuples_returned = 0, t_tuples_fetched = 138, t_tuples_inserted= 138, t_tuples_updated = 0, t_tuples_deleted = 0, t_tuples_hot_updated = 0, t_new_live_tuples= 0, t_new_dead_tuples = 0, t_blocks_fetched = 2, t_blocks_hit = 2}}, {t_id = 2672, t_counts ={t_numscans = 1, t_tuples_returned = 1, t_tuples_fetched = 1, t_tuples_inserted = 0, t_tuples_updated = 0, t_tuples_deleted = 0, t_tuples_hot_updated = 0, t_new_live_tuples = 0, t_new_dead_tuples = 0, t_blocks_fetched= 2, t_blocks_hit = 2}}, {t_id = 1259, t_counts = {t_numscans = 4, t_tuples_returned = 553, t_tuples_fetched= 0, t_tuples_inserted = 0, t_tuples_updated = 0, t_tuples_deleted = 0, t_tuples_hot_updated = 0,t_new_live_tuples = 0, t_new_dead_tuples = 0, t_blocks_fetched = 24, t_blocks_hit = 24}}, {t_id = 2615, t_counts= {t_numscans = 0, t_tuples_returned = 0, t_tuples_fetched = 0, t_tuples_inserted = 0, t_tuples_updated= 0, t_tuples_deleted = 0, t_tuples_hot_updated = 0, t_new_live_tuples = 0, t_new_dead_tuples =0, t_blocks_fetched = 1, t_blocks_hit = 1}}, {t_id = 2685, t_counts = {t_numscans = 1, t_tuples_returned = 1, t_tuples_fetched = 1, t_tuples_inserted = 0, t_tuples_updated = 0, t_tuples_deleted = 0, t_tuples_hot_updated= 0, t_new_live_tuples = 0, t_new_dead_tuples = 0, t_blocks_fetched = 2, t_blocks_hit = 2}},{t_id = 1172815, t_counts = {t_numscans = 0, t_tuples_returned = 0, t_tuples_fetched = 0, t_tuples_inserted= 500000, t_tuples_updated = 0, t_tuples_deleted = 0, t_tuples_hot_updated = 0, t_new_live_tuples= 500000, t_new_dead_tuples = 0, t_blocks_fetched = 23077, t_blocks_hit = 15381}}, {t_id = 1172684,t_counts = {t_numscans = 0, t_tuples_returned = 0, t_tuples_fetched = 0, t_tuples_inserted = 0, t_tuples_updated= 0, t_tuples_deleted = 0, t_tuples_hot_updated = 0, t_new_live_tuples = 0, t_new_dead_tuples =0, t_blocks_fetched = 1018, t_blocks_hit = 1017}}, {t_id = 2678, t_counts = {t_numscans = 5, t_tuples_returned= 1, t_tuples_fetched = 1, t_tuples_inserted = 0, t_tuples_updated = 0, t_tuples_deleted = 0, t_tuples_hot_updated = 0, t_new_live_tuples = 0, t_new_dead_tuples = 0, t_blocks_fetched = 5, t_blocks_hit= 5}}, {t_id = 2619, t_counts = {t_numscans = 0, t_tuples_returned = 0, t_tuples_fetched = 0, t_tuples_inserted= 0, t_tuples_updated = 24, t_tuples_deleted = 0, t_tuples_hot_updated = 24, t_new_live_tuples= 0, t_new_dead_tuples = 9, t_blocks_fetched = 72, t_blocks_hit = 72}}, {t_id = 2696, t_counts ={t_numscans = 48, t_tuples_returned = 48, t_tuples_fetched = 48, t_tuples_inserted = 0, t_tuples_updated = 0, t_tuples_deleted= 0, t_tuples_hot_updated = 0, t_new_live_tuples = 0, t_new_dead_tuples = 0, t_blocks_fetched =96, t_blocks_hit = 96}}}}, msg_tabpurge = {m_hdr = { m_type = PGSTAT_MTYPE_BGWRITER, m_size = 64}, m_databaseid = 0,m_nentries = 0, m_tableid = {0, 0, 0, 0, 0, 0, 0, 0, 138, 0, 138, 0 <repeats 11 times>, 2, 0, 2, 0, 2672, 32767, 1,0, 1, 0, 1, 0 <repeats 13 times>, 2, 0, 2, 0, 1259, 0, 4, 0, 553, 0 <repeats 15 times>, 24, 0, 24, 0, 2615, 0 <repeats19 times>, 1, 0, 1, 0, 2685, 0, 1, 0, 1, 0, 1, 0 <repeats 13 times>, 2, 0, 2, 0, 1172815, 32561, 0, 0, 0, 0,0, 0, 500000, 0, 0, 0, 0, 0, 0, 0, 500000, 0, 0, 0, 23077, 0, 15381, 0, 1172684, 0 <repeats 19 times>, 1018, 0, 1017,0, 2678, 0, 5, 0, 1, 0, 1, 0 <repeats 13 times>, 5, 0, 5, 0, 2619, 3, 0, 0, 0, 0, 0, 0, 0, 0, 24, 0, 0, 0, 24,0, 0, 0, 9, 0, 72, 0, ---Type <return> to continue, or q <return> to quit--- 72, 0, 2696, 32767, 48, 0, 48, 0, 48, 0 <repeats 13 times>...}},msg_dropdb = {m_hdr = { m_type = PGSTAT_MTYPE_BGWRITER, m_size = 64}, m_databaseid = 0}, msg_resetcounter= {m_hdr = { m_type = PGSTAT_MTYPE_BGWRITER, m_size = 64}, m_databaseid = 0}, msg_autovacuum = {m_hdr= { m_type = PGSTAT_MTYPE_BGWRITER, m_size = 64}, m_databaseid = 0, m_start_time = 0}, msg_vacuum = {m_hdr = { m_type = PGSTAT_MTYPE_BGWRITER, m_size = 64}, m_databaseid = 0, m_tableoid = 0, m_analyze = 0 '\0', m_autovacuum= 0 '\0', m_scanned_all = 0 '\0', m_vacuumtime = 0, m_tuples = 0}, msg_analyze = {m_hdr = { m_type = PGSTAT_MTYPE_BGWRITER,m_size = 64}, m_databaseid = 0, m_tableoid = 0, m_autovacuum = 0 '\0', m_analyzetime = 0, m_live_tuples= 0, m_dead_tuples = 0}, msg_bgwriter = {m_hdr = { m_type = PGSTAT_MTYPE_BGWRITER, m_size = 64}, m_timed_checkpoints= 0, m_requested_checkpoints = 0, m_buf_written_checkpoints = 0, m_buf_written_clean = 0, m_maxwritten_clean= 0, m_buf_written_backend = 138, m_buf_alloc = 138}, msg_funcstat = {m_hdr = {m_type = PGSTAT_MTYPE_BGWRITER,m_size = 64}, m_databaseid = 0, m_nentries = 0, m_entry = {{f_id = 0, f_numcalls = 0, f_time = 0,f_time_self = 0}, {f_id = 138, f_numcalls = 138, f_time = 0, f_time_self = 0}, {f_id = 0, f_numcalls = 0, f_time= 0, f_time_self = 2}, { f_id = 2, f_numcalls = 140733193390704, f_time = 1, f_time_self = 1}, {f_id = 1, f_numcalls= 0, f_time = 0, f_time_self = 0}, {f_id = 0, f_numcalls = 0, f_time = 0, f_time_self = 2}, {f_id = 2, f_numcalls= 1259, f_time = 4, f_time_self = 553}, {f_id = 0, f_numcalls = 0, f_time = 0, f_time_self = 0}, {f_id =0, f_numcalls = 0, f_time = 0, f_time_self = 24}, {f_id = 24, f_numcalls = 2615, f_time = 0, f_time_self = 0}, { f_id = 0, f_numcalls = 0, f_time = 0, f_time_self = 0}, {f_id = 0, f_numcalls = 0, f_time = 0, f_time_self= 1}, {f_id = 1, f_numcalls = 2685, f_time = 1, f_time_self = 1}, {f_id = 1, f_numcalls = 0, f_time = 0,f_time_self = 0}, {f_id = 0, f_numcalls = 0, f_time = 0, f_time_self = 2}, {f_id = 2, f_numcalls = 139848431297871,f_time = 0, f_time_self = 0}, {f_id = 0, f_numcalls = 500000, f_time = 0, f_time_self = 0}, {f_id= 0, f_numcalls = 500000, f_time = 0, f_time_self = 23077}, {f_id = 15381, f_numcalls = 1172684, f_time = 0,f_time_self = 0}, {f_id = 0, f_numcalls = 0, f_time = 0, f_time_self = 0}, {f_id = 0, f_numcalls = 0, f_time = 0,f_time_self = 1018}, {f_id = 1017, f_numcalls = 2678, f_time = 5, f_time_self = 1}, {f_id = 1, f_numcalls = 0, f_time= 0, f_time_self = 0}, {f_id = 0, f_numcalls = 0, f_time = 0, f_time_self = 5}, {f_id = 5, f_numcalls = 12884904507,f_time = 0, f_time_self = 0}, {f_id = 0, f_numcalls = 0, f_time = 24, f_time_self = 0}, {f_id = 24, f_numcalls= 0, f_time = 9, f_time_self = 72}, { f_id = 72, f_numcalls = 140733193390728, f_time = 48, f_time_self= 48}, {f_id = 48, f_numcalls = 0, f_time = 0, f_time_self = 0}, {f_id = 0, f_numcalls = 0, f_time = 0,f_time_self = 96}}}, msg_funcpurge = { m_hdr = {m_type = PGSTAT_MTYPE_BGWRITER, m_size = 64}, m_databaseid = 0, m_nentries= 0, m_functionid = {0, 0, 0, 0, 0, 0, 0, 0, 138, 0, 138, 0 <repeats 11 times>, 2, 0, 2, 0, 2672, 32767, 1,0, 1, 0, 1, 0 <repeats 13 times>, 2, 0, 2, 0, 1259, 0, 4, 0, 553, 0 <repeats 15 times>, 24, 0, 24, 0, 2615, 0<repeats 19 times>, 1, 0, 1, 0, 2685, 0, 1, 0, 1, 0, 1, 0 <repeats 13 times>, 2, 0, 2, 0, 1172815, 32561, 0, 0, 0,0, 0, 0, 500000, 0, 0, 0, 0, 0, 0, 0, 500000, 0, 0, 0, 23077, 0, 15381, 0, 1172684, 0 <repeats 19 times>, 1018, 0,1017, 0, 2678, 0, 5, 0, 1, 0, 1, 0 <repeats 13 times>, 5, 0, 5, 0, 2619, 3, 0, 0, 0, 0, 0, 0, 0, 0, 24, 0, 0, 0, 24,0, 0, 0, 9, 0, 72, 0, 72, 0, 2696, 32767, 48, 0, 48, 0, 48, 0 <repeats 13 times>...}}} input_fd = {fd = 8,events = 9, revents = 0} __func__ = "PgstatCollectorMain" #2 0x00000000005c0131 in pgstat_start () at pgstat.c:631 curtime = <value optimized out> pgStatPid = 0 __func__ = "pgstat_start" #3 0x00000000005c474d in reaper (postgres_signal_arg=<value optimized out>) at postmaster.c:2322 ---Type <return> to continue, or q <return> to quit--- save_errno = 17 pid = 24673 exitstatus = 0 status = 0 __func__ = "reaper" #4 <signal handler called> #4 <signal handler called> No symbol table info available. #5 0x00007f31ddfc6c83 in select () from /lib/libc.so.6 No symbol table info available. #6 0x00000000005c20fc in ServerLoop () at postmaster.c:1347 timeout = {tv_sec = 59, tv_usec = 916269} rmask= {fds_bits = {120, 0 <repeats 15 times>}} selres = <value optimized out> readmask = {fds_bits = {120,0 <repeats 15 times>}} nSockets = 7 now = 64 last_touch_time = 1264024914 __func__ = "ServerLoop" #7 0x00000000005c34a7 in PostmasterMain (argc=3, argv=0x144fd20) at postmaster.c:1040 opt = <value optimized out> status = <value optimized out> userDoption = <value optimized out> __func__ = "PostmasterMain" #8 0x000000000056cdc8 in main (argc=3, argv=0x144fd20) at main.c:188 ---------- The message occurs with PG 8.4.2 on a powerful 16-CPU, 12Gb RAM machine with RAID50 during heavy COPY/INSERTs/UPDATEs from one Java client (some of the INSERTS actually take up to 100 sec). The OS is Linux Debian 5.0 with 2.6.30 kernel. I see these WARNINGs appearing for the second time and as before I saw them staring after I did a manual vacuum of one of the tables in which the data is inserted. But I have no idea whether that's really a cause or just a coincidence. Regards, S. ******************************************************************* Sergey E. Koposov, PhD Institute for Astronomy, Cambridge, UK Tel: +44-1223-337-551 Web: http://lnfm1.sai.msu.ru/~math E-mail: math@sai.msu.ru
В списке pgsql-hackers по дате отправления: