Re: WAL recycling, ext3, Linux 2.4.18
От | Doug Fields |
---|---|
Тема | Re: WAL recycling, ext3, Linux 2.4.18 |
Дата | |
Msg-id | 5.1.0.14.2.20020708120620.01f39538@pop.pexicom.com обсуждение исходный текст |
Ответ на | Re: WAL recycling, ext3, Linux 2.4.18 (Tom Lane <tgl@sss.pgh.pa.us>) |
Ответы |
Re: WAL recycling, ext3, Linux 2.4.18
(Tom Lane <tgl@sss.pgh.pa.us>)
Re: WAL recycling, ext3, Linux 2.4.18 (Curt Sampson <cjs@cynic.net>) |
Список | pgsql-general |
At 11:57 AM 7/8/2002, Tom Lane wrote: >Doug Fields <dfields-pg-general@pexicom.com> writes: > > I'm still trying to track down my very odd periodic pauses/hangs in > > PostgreSQL 7.2.1. > > I've localized it to what seems to be the "recycled transaction log file" > > lines in the log file. Whenever this happens, a whole bunch of queries > > which were "on hold" (just sitting there, as can be seen in > > pg_stat_activity, when they usually execute in fractions of a second) come > > back to life and finish very quickly. > >Hm. That would be from the tail end of a checkpoint operation. If you >do a CHECKPOINT command manually, do you get a similar hangup? In fact, yes. (I never did a CHECKPOINT ever before now.) It took less time (probably because I did it around half way through the 6.5-minute WAL recycling period), but had almost identical effects. pg_stat_activity showed a whole bunch of queries stacked up, and vmstat exhibited the same pathology: the other postgres processes blocked during the checkpoint (it's pretty plain in the vmstat output below). procs memory swap io system cpu r b w swpd free buff cache si so bi bo in cs us sy id 0 0 0 0 3851828 121732 3913552 0 0 2 79 56 62 49 2 49 1 0 0 0 3850724 121732 3913576 0 0 0 1171 214 1023 4 0 95 1 0 0 0 3851544 121732 3913576 0 0 0 266 179 148 1 0 99 0 0 0 0 3851780 121732 3913600 0 0 0 1407 266 364 4 1 95 0 2 1 0 3843020 121732 3913616 0 0 0 4182 195 186 2 4 94 0 2 3 0 3841608 121732 3913616 0 0 0 2722 197 191 0 1 99 0 2 3 0 3840984 121732 3913616 0 0 0 2774 232 298 1 1 99 0 2 3 0 3840056 121732 3913616 0 0 0 2739 200 204 1 1 99 0 2 3 0 3839280 121732 3913616 0 0 0 1634 183 201 1 0 99 0 2 4 0 3839424 121732 3913616 0 0 0 4098 202 143 0 1 99 0 2 4 0 3837672 121732 3913616 0 0 0 2710 194 170 1 1 99 0 2 3 0 3836992 121732 3913616 0 0 0 2139 156 82 0 0 99 0 2 4 0 3827844 121732 3913632 0 0 0 5362 170 3260 3 1 95 0 2 4 0 3828592 121732 3913632 0 0 0 2134 143 53 0 1 99 1 1 0 0 3830892 121744 3913668 0 0 0 4926 152 6087 8 3 89 0 0 0 0 3850360 121744 3913772 0 0 0 7851 610 16106 61 5 34 0 0 0 0 3850360 121744 3913772 0 0 0 20 189 284 1 0 99 0 0 0 0 3850360 121744 3913772 0 0 0 14 147 157 0 0 100 >What are the stuck queries doing exactly? Can you attach to a few of the >stuck backends with gdb and get stack traces? The stuck queries are of the following nature, typically either: 1) UPDATE x SET somedate=now() WHERE primarykey IN (1,2,3,4...300 to 450 IDs) 2) DELETE FROM x WHERE key1 = 1 AND key2 IN (1,2,3,4...300 to 450 IDs) In my pg_stats display below you can see some of them. Here is a stack trace. I did "where" about every second during the "pause" and received the same stack trace. This is on PID 3456 per the pg_stat_activity listing below. After things clear up, I also did a stack trace; it's blocked on recv, presumably waiting for more commands to come down the socket. (I tried a few other PIDs with similar stack traces, all stuck on the semop call.) (gdb) c Continuing. Program received signal SIGINT, Interrupt. 0x4028913f in semop () from /lib/libc.so.6 (gdb) where #0 0x4028913f in semop () from /lib/libc.so.6 #1 0x08106dec in IpcSemaphoreLock () #2 0x0810bce1 in LWLockAcquire () #3 0x0808433b in XLogInsert () #4 0x08079033 in _bt_doinsert () #5 0x08078ea9 in _bt_doinsert () #6 0x080786ed in _bt_doinsert () #7 0x0807be9d in btinsert () #8 0x08152822 in OidFunctionCall5 () #9 0x08077cf8 in index_insert () #10 0x080c674a in ExecInsertIndexTuples () #11 0x080c373e in ExecutorEnd () #12 0x080c3374 in ExecutorEnd () #13 0x080c2797 in ExecutorRun () #14 0x081104de in ProcessQuery () #15 0x0810ed70 in pg_exec_query_string () #16 0x0810fd5e in PostgresMain () #17 0x080f6d4e in ClosePostmasterPorts () #18 0x080f669f in ClosePostmasterPorts () #19 0x080f5882 in PostmasterMain () #20 0x080f5391 in PostmasterMain () #21 0x080d4e18 in main () #22 0x401d114f in __libc_start_main () from /lib/libc.so.6 (gdb) c Continuing. And then later... (gdb) c Continuing. Program received signal SIGINT, Interrupt. 0x402887f2 in recv () from /lib/libc.so.6 (gdb) where #0 0x402887f2 in recv () from /lib/libc.so.6 #1 0x080d42bc in StreamClose () #2 0x080d430d in pq_getbyte () #3 0x0810e7c8 in HandleFunctionRequest () #4 0x0810e837 in HandleFunctionRequest () #5 0x0810fc5e in PostgresMain () #6 0x080f6d4e in ClosePostmasterPorts () #7 0x080f669f in ClosePostmasterPorts () #8 0x080f5882 in PostmasterMain () #9 0x080f5391 in PostmasterMain () #10 0x080d4e18 in main () #11 0x401d114f in __libc_start_main () from /lib/libc.so.6 (gdb) c Continuing. > > 1) Is there any known bad interactions with ext3fs and PostgreSQL? > >Dunno. The CHECKPOINT would probably create a significant number of >disk write requests, followed by a sync() request. If that could >monopolize an ext3 filesystem for a long time, perhaps that would >explain your problem. But I haven't heard similar complaints before. > >What do you have shared_buffers set to? pexicast_lg=# show shared_buffers; NOTICE: shared_buffers is 65536 SHOW VARIABLE I believe this is about 512 megs. I have 8 gigs RAM on this server and tried 65536 and before it 32768. Either one seems to work fine - I didn't notice any significant performance difference yet between them. See below for pg_stat_activity... I appreciate the help! Cheers, Doug pexicast_lg=# select * from pg_stat_activity where datname='pexicast_lg'; select now(); datid | datname | procpid | usesysid | usename | current_query ---------+-------------+---------+----------+---------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 3926366 | pexicast_lg | 3393 | 100 | tomcat | UPDATE list_entries SET last_mail=now() WHERE list_entry_id IN (463196,463216,463236,463256,463276,463296,461176,461196,461216,461236,461256,461276,461296,461316,461336,461356,461376,461396,461416,461436,461456,461476,461496,461516,461536,461556,461576,46 3926366 | pexicast_lg | 3412 | 100 | tomcat | UPDATE list_entries SET last_mail=now() WHERE list_entry_id IN (425066,425086,425106,425126,425146,425166,425186,425206,425226,425246,425266,425286,425306,425326,425346,425366,425386,425406,425426,425446,425466,425486,425506,425526,425546,425566,425586,42 3926366 | pexicast_lg | 3443 | 100 | tomcat | UPDATE list_entries SET last_mail=now() WHERE list_entry_id IN (442093,442113,442133,442153,442173,442193,442213,442233,442253,442273,442293,442313,442333,442353,442373,442393,442413,442433,442453,442473,442493,442513,442533,442553,442573,442593,442613,44 3926366 | pexicast_lg | 3444 | 100 | tomcat | UPDATE list_entries SET last_mail=now() WHERE list_entry_id IN (438107,438127,438147,438167,438187,438207,438227,438247,436147,436167,436187,436207,436227,436247,436267,436287,436307,436327,436347,436367,436387,436407,436427,436447,436467,436487,436507,43 3926366 | pexicast_lg | 3446 | 100 | tomcat | UPDATE list_entries SET last_mail=now() WHERE list_entry_id IN (429470,429490,429510,429530,429550,429570,429590,429610,429630,429650,429670,429690,429710,429730,429750,429770,429790,429810,429830,429850,429870,429890,429910,429930,427830,427850,427870,42 3926366 | pexicast_lg | 3447 | 100 | tomcat | UPDATE list_entries SET last_mail=now() WHERE list_entry_id IN (463658,463678,463698,463718,463738,463758,463778,463798,463818,463838,463858,463878,463898,463918,460378,461838,461858,461878,461898,461918,461938,461958,461978,461998,462018,462038,462058,46 3926366 | pexicast_lg | 3448 | 100 | tomcat | UPDATE list_entries SET last_mail=now() WHERE list_entry_id IN (407502,407522,407542,407562,407582,407602,407622,407642,407662,407682,407702,407722,407742,407762,407782,407802,407822,407842,407862,407882,405802,405822,405842,405862,405882,405902,405922,40 3926366 | pexicast_lg | 3449 | 100 | tomcat | UPDATE list_entries SET last_mail=now() WHERE list_entry_id IN (475717,475737,475757,475777,475797,475817,475837,475857,475877,475897,475917,475937,475957,475977,475997,476017,476037,476057,476077,476097,476117,476137,476157,474077,474097,474117,474137,47 3926366 | pexicast_lg | 3450 | 100 | tomcat | <IDLE> 3926366 | pexicast_lg | 3451 | 100 | tomcat | UPDATE list_entries SET last_mail=now() WHERE list_entry_id IN (438472,438492,438512,438532,438552,438572,438592,438612,438632,438652,438672,436572,436592,436612,436632,436652,436672,436692,436712,436732,436752,436772,436792,436812,436832,436852,436872,43 3926366 | pexicast_lg | 3452 | 100 | tomcat | UPDATE list_entries SET last_mail=now() WHERE list_entry_id IN (415401,415421,415441,415461,415481,415501,415521,415541,415561,415581,415601,415621,415641,415661,415681,413581,413601,413621,413641,413661,413681,413701,413721,413741,413761,413781,413801,41 3926366 | pexicast_lg | 3453 | 100 | tomcat | UPDATE list_entries SET last_mail=now() WHERE list_entry_id IN (442908,442928,442948,442968,440868,440888,440908,440928,440948,440968,440988,441008,441028,441048,441068,441088,441108,441128,441148,441168,441188,441208,441228,441248,441268,441288,441308,44 3926366 | pexicast_lg | 3454 | 100 | tomcat | UPDATE list_entries SET last_mail=now() WHERE list_entry_id IN (410524,410544,410564,410584,410604,410624,410644,410664,410684,410704,410724,410744,408644,408664,408684,408704,408724,408744,408764,408784,408804,408824,408844,408864,408884,408904,408924,40 3926366 | pexicast_lg | 3455 | 100 | tomcat | <IDLE> 3926366 | pexicast_lg | 3456 | 100 | tomcat | UPDATE list_entries SET last_mail=now() WHERE list_entry_id IN (451115,451135,451155,451175,451195,451215,451235,451255,451275,451295,451315,451335,451355,451375,451395,451415,451435,451455,451475,451495,451515,451535,451555,449475,449495,449515,449535,44 3926366 | pexicast_lg | 3457 | 100 | tomcat | UPDATE list_entries SET last_mail=now() WHERE list_entry_id IN (408325,408345,408365,408385,408405,408425,408445,408465,408485,408505,408525,408545,406465,406485,406505,406525,406545,406565,406585,406605,406625,406645,406665,406685,406705,406725,406745,40 3926366 | pexicast_lg | 3458 | 100 | tomcat | UPDATE list_entries SET last_mail=now() WHERE list_entry_id IN (397080,397100,397120,397140,397160,397180,397200,397220,397240,397260,397280,397300,397320,397340,397360,397380,397400,397420,397440,397460,397480,397500,397520,397540,397560,397580,397600,39 3926366 | pexicast_lg | 3459 | 100 | tomcat | UPDATE list_entries SET last_mail=now() WHERE list_entry_id IN (476459,476479,476499,476519,476539,476559,476579,476599,476619,476639,476659,476679,476699,476719,476739,476759,474659,474679,474699,474719,474739,474759,474779,474799,474819,474839,474859,47 3926366 | pexicast_lg | 3460 | 100 | tomcat | UPDATE list_entries SET last_mail=now() WHERE list_entry_id IN (411003,411023,411043,411063,411083,411103,411123,411143,411163,411183,411203,411223,411243,411263,411283,411303,411323,411343,411363,411383,411403,411423,411443,411463,411483,411503,411523,41 3926366 | pexicast_lg | 3461 | 100 | tomcat | UPDATE list_entries SET last_mail=now() WHERE list_entry_id IN (432589,432609,432629,432649,432669,432689,432709,432729,432749,432769,432789,432809,432829,432849,432869,432889,432909,430809,430829,430849,430869,430889,430909,430929,430949,430969,430989,43 3926366 | pexicast_lg | 3462 | 100 | tomcat | UPDATE list_entries SET last_mail=now() WHERE list_entry_id IN (470654,470674,470694,470714,470734,470754,470774,470794,470814,470834,470854,470874,470894,470914,470934,470954,470974,470994,471014,471034,471054,471074,471094,471114,471134,471154,471174,47 3926366 | pexicast_lg | 3497 | 100 | tomcat | UPDATE list_entries SET last_mail=now() WHERE list_entry_id IN (440551,440571,440591,440611,440631,440651,440671,440691,440711,440731,440751,440771,440791,440811,440831,440851,440871,440891,440911,440931,440951,440971,440991,441011,441031,441051,441071,44 3926366 | pexicast_lg | 4235 | 101 | dfields | <IDLE> (23 rows)
В списке pgsql-general по дате отправления:
Следующее
От: Manfred KoizarДата:
Сообщение: Re: PostgreSQL doesn't use indexes even is enable_seqscan = off