Re: WAL recycling, ext3, Linux 2.4.18

From: Doug Fields
Subject: Re: WAL recycling, ext3, Linux 2.4.18
Date:
Msg-id: 5.1.0.14.2.20020708115449.01f5cc20@pop.pexicom.com
In reply to: WAL recycling, ext3, Linux 2.4.18  (Doug Fields <dfields-pg-general@pexicom.com>)
List: pgsql-general
Hello all,

I've collected some very interesting vmstat data on the above problem.

I've now recorded the whole cycle in a vmstat log, using "vmstat 5" on my
server, and I'll annotate it here. Each line shows I/O activity, context
switches, and the counts of running, blocked, and waiting processes over a
5-second interval.

   procs                      memory    swap          io     system         cpu
 r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
 0  0  0      0 3858124 121692 3909904   0   0     0   863  234   298   4   0  96
 1  0  0      0 3858116 121692 3909912   0   0     0   508  185   270   3   0  97
 1  0  0      0 3857744 121692 3909936   0   0     0   940  224   487   6   0  94
 0  0  0      0 3858076 121692 3909952   0   0     0   619  228   232   3   0  97
 0  0  0      0 3858056 121696 3909968   0   0     0   582  194   193   3   0  97
 0  0  0      0 3858024 121696 3910000   0   0     0  1036  249   276   5   0  95
 0  0  0      0 3858024 121696 3910000   0   0     0    26  172   248   0   0 100
 0  0  0      0 3858000 121696 3910024   0   0     0   902  243   458   4   0  96
 1  0  0      0 3857732 121696 3910032   0   0     0   293  159   131   1   0  99
 0  0  0      0 3857968 121696 3910056   0   0     0   742  206   186   3   0  96
 0  0  0      0 3857960 121696 3910064   0   0     0   334  165   143   1   0  99
 0  0  0      0 3857952 121696 3910072   0   0     0   320  164   153   1   0  98
 0  0  0      0 3857936 121696 3910088   0   0     0   514  215   262   3   0  97

Up to this point everything is cranking along quite nicely. With 8gb of
RAM, you'll notice that it's extraordinarily rare for me to ever get any
"blocks in" - everything is always cached by the kernel. On the other hand,
the "blocks out" is fairly consistently high, as I have about 240 rows or
so being updated per second (usually in batches of 300-450) - some being
deleted and others having a single column updated.

Now, in the next 5-second sample, something happens. I believe the WAL
logger begins its "duty cycle" and starts recycling log files. Running
"select * from pg_stat_activity" shows all my normally very fast queries
stuck, not being processed. Indeed, the vmstat log below continues to tell
the story: blocks out increase greatly, and lots of processes (all PostgreSQL,
since that's the _only_ thing running on this server other than system
daemons like syslogd) become blocked or waiting.
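
(For reference, this is roughly the check I run to see the stuck backends; it
is just a sketch against the 7.2 pg_stat_activity view, and it relies on
stats_command_string being on, which it is per the SHOW ALL output below.)

    -- Rough sketch: list what each backend is currently executing.
    -- During a pause, the "stuck" UPDATE/DELETE statements sit here
    -- unchanged until the recycling finishes.
    SELECT procpid, usename, current_query
      FROM pg_stat_activity;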

At the end, the coup de grace comes: a flurry of context switches and tons
of blocks out. The 5-second resolution isn't fine enough for me to say which
of these two is the cause, but it's probably some combination of both:

1) The WAL finishes with a huge amount of I/O all at once
2) All the waiting queries (mostly UPDATE and DELETE) all of a sudden run
simultaneously

But either way, at the end of it, the write amounts jump sky high and so do
the context switches... Note that at 5 seconds per entry, this means about
1:15 (one and a quarter minutes) of time is lost, and as I said previously,
this happens approximately every six and a half minutes.

   procs                      memory    swap          io     system         cpu
 r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
 0  3  1      0 3852388 121696 3910088   0   0     0  4257  205   151   1   7  92
 0  2  3      0 3850944 121696 3910088   0   0     0  3000  222   264   0   1  99
 0  2  3      0 3849920 121696 3910088   0   0     0  2643  176   132   0   1  99
 0  2  3      0 3843964 121696 3910096   0   0     0  4035  208  1480   2   1  97
 0  2  4      0 3840304 121696 3910096   0   0     0  3030  197   165   1   1  98
 0  2  4      0 3839696 121696 3910096   0   0     0  2780  179   124   0   1  99
 0  2  4      0 3838168 121696 3910096   0   0     0  2914  169    82   1   1  98
 0  2  4      0 3838840 121696 3910096   0   0     0  2847  163    55   0   1  99
 0  2  4      0 3839484 121696 3910096   0   0     0  2631  154    46   0   1  99
 0  2  3      0 3840208 121696 3910096   0   0     0  2864  163    63   0   1  99
 0  2  4      0 3840896 121696 3910096   0   0     0  2730  157    60   0   1  99
 0  2  4      0 3841448 121696 3910096   0   0     0  2792  155    70   0   1  99
 0  2  4      0 3841756 121696 3910096   0   0     0  2824  167    88   0   1  99
 0  2  2      0 3841624 121696 3910096   0   0     0  2863  167    89   0   1  99
 0  2  2      0 3841656 121696 3910096   0   0     0  2840  149    59   0   1  99
 0  2  2      0 3841636 121696 3910096   0   0     0  3076  164    82   0   1  99
 0  1  0      0 3827484 121696 3910128   0   0     0  4582  144  5577   6   2  92
 5  0  0      0 3847556 121696 3910252   0   0     0  9406  207 13897  38   6  56

Hereafter, things settle down to normal. Actually, they settle down to
super-normal for a few moments, as all the threads (there are usually about
20-40 or more) now have another group of 300-450 rows to process, and won't
hit the database for about a minute to 90 seconds.

   procs                      memory    swap          io     system         cpu
 r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
 0  0  0      0 3857684 121696 3910252   0   0     0   242  600  6071  26   2  72
 0  0  0      0 3857684 121696 3910252   0   0     0    14  147   152   0   0 100
 0  0  0      0 3857684 121696 3910252   0   0     0    14  144   145   0   0 100
 0  0  0      0 3857672 121696 3910256   0   0     0    43  106    21   0   0 100

I welcome additional thoughts. My "SHOW ALL" command output is below.

Cheers,

Doug

pexicast_lg=# show all;
NOTICE:  enable_seqscan is on
NOTICE:  enable_indexscan is on
NOTICE:  enable_tidscan is on
NOTICE:  enable_sort is on
NOTICE:  enable_nestloop is on
NOTICE:  enable_mergejoin is on
NOTICE:  enable_hashjoin is on
NOTICE:  ksqo is off
NOTICE:  geqo is on
NOTICE:  tcpip_socket is on
NOTICE:  ssl is off
NOTICE:  fsync is on
NOTICE:  silent_mode is off
NOTICE:  log_connections is on
NOTICE:  log_timestamp is on
NOTICE:  log_pid is on
NOTICE:  debug_print_query is off
NOTICE:  debug_print_parse is off
NOTICE:  debug_print_rewritten is off
NOTICE:  debug_print_plan is off
NOTICE:  debug_pretty_print is off
NOTICE:  show_parser_stats is off
NOTICE:  show_planner_stats is off
NOTICE:  show_executor_stats is off
NOTICE:  show_query_stats is off
NOTICE:  stats_start_collector is on
NOTICE:  stats_reset_on_server_start is on
NOTICE:  stats_command_string is on
NOTICE:  stats_row_level is off
NOTICE:  stats_block_level is off
NOTICE:  trace_notify is off
NOTICE:  hostname_lookup is off
NOTICE:  show_source_port is off
NOTICE:  sql_inheritance is on
NOTICE:  australian_timezones is off
NOTICE:  fixbtree is on
NOTICE:  password_encryption is off
NOTICE:  transform_null_equals is off
NOTICE:  geqo_threshold is 11
NOTICE:  geqo_pool_size is 0
NOTICE:  geqo_effort is 1
NOTICE:  geqo_generations is 0
NOTICE:  geqo_random_seed is -1
NOTICE:  deadlock_timeout is 1000
NOTICE:  syslog is 2
NOTICE:  max_connections is 256
NOTICE:  shared_buffers is 65536
NOTICE:  port is 5432
NOTICE:  unix_socket_permissions is 511
NOTICE:  sort_mem is 65536
NOTICE:  vacuum_mem is 65536
NOTICE:  max_files_per_process is 1000
NOTICE:  debug_level is 0
NOTICE:  max_expr_depth is 10000
NOTICE:  max_fsm_relations is 100
NOTICE:  max_fsm_pages is 10000
NOTICE:  max_locks_per_transaction is 64
NOTICE:  authentication_timeout is 60
NOTICE:  pre_auth_delay is 0
NOTICE:  checkpoint_segments is 30
NOTICE:  checkpoint_timeout is 300
NOTICE:  wal_buffers is 16
NOTICE:  wal_files is 48
NOTICE:  wal_debug is 0
NOTICE:  commit_delay is 0
NOTICE:  commit_siblings is 5
NOTICE:  effective_cache_size is 625000
NOTICE:  random_page_cost is 4
NOTICE:  cpu_tuple_cost is 0.01
NOTICE:  cpu_index_tuple_cost is 0.001
NOTICE:  cpu_operator_cost is 0.0025
NOTICE:  geqo_selection_bias is 2
NOTICE:  default_transaction_isolation is read committed
NOTICE:  dynamic_library_path is $libdir
NOTICE:  krb_server_keyfile is unset
NOTICE:  syslog_facility is LOCAL0
NOTICE:  syslog_ident is postgres
NOTICE:  unix_socket_group is unset
NOTICE:  unix_socket_directory is unset
NOTICE:  virtual_host is unset
NOTICE:  wal_sync_method is fdatasync
NOTICE:  DateStyle is ISO with European conventions
NOTICE:  Time zone is unset
NOTICE:  TRANSACTION ISOLATION LEVEL is READ COMMITTED
NOTICE:  Current client encoding is 'SQL_ASCII'
NOTICE:  Current server encoding is 'SQL_ASCII'
NOTICE:  Seed for random number generator is unavailable
SHOW VARIABLE
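
(One thing I may try next, as a partial answer to my own question 2 in the
quoted message below about logging WAL activity: wal_debug is currently 0, as
shown above. My assumption, not yet verified, is that a nonzero value makes
the backends emit extra XLOG/checkpoint debugging to the server log, e.g. from
a superuser session:)

    -- Assumed sketch only: turn on extra WAL debugging output for this session.
    -- Server-wide, the equivalent would presumably be wal_debug = 1 in postgresql.conf.
    SET wal_debug TO 1;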

At 02:36 AM 7/8/2002, Doug Fields wrote:
>Hello all,
>
>I'm still trying to track down my very odd periodic pauses/hangs in
>PostgreSQL 7.2.1.
>
>I've localized it to what seems to be the "recycled transaction log file"
>lines in the log file. Whenever this happens, a whole bunch of queries
>which were "on hold" (just sitting there, as can be seen in
>pg_stat_activity, when they usually execute in fractions of a second) come
>back to life and finish very quickly.
>
>Unfortunately, PostgreSQL doesn't seem to log when it starts doing this
>recycling, only when it's done.
>
>However, it seems to be taking about 1.5 minutes (yes, around 90 seconds)
>to do this recycling on about sixteen of these WAL files at a time.
>(Deduction from the logs from the application that uses the database.) I
>currently have about 102 of these WAL files (I don't mind; I have 50 gigs
>set aside for pg_xlog). My postgresql.conf settings are:
>
>WAL_FILES = 48
>WAL_BUFFERS = 16
>CHECKPOINT_SEGMENTS = 30
>
>With this, during my heavy load period, I get those 16 WAL recycling
>messages every 6.5 minutes. During heavy vacuuming, the recycling happens
>every 3 minutes, and that was my goal (no more than every three minutes,
>per Bruce Momjian's PDF on tuning).
>
>My server specs:
>Dual P4 Xeon 2.4
>8gb RAM
>RAID-1 drive for pg_xlog - running ext3
>RAID-5 drive dedicated to PostgreSQL for everything else - running ext3
>Debian 3.0 (woody) kernel 2.4.18
>
>Some questions:
>
>1) Is there any known bad interactions with ext3fs and PostgreSQL? My
>hardware vendor (Pogo Linux, recommended) seemed to suggest that ext3fs
>has problems in multi-threading.
>2) Any ideas on how to get it to log more info on WAL usage?
>3) Which process in PostgreSQL should I attach to using gdb to check out
>this WAL stuff?
>
>Putting my application on hold for 1.5 minutes out of every 6.5 is of
>course very bad... I'm stumped. Any ideas are welcome; I am willing to
>provide any additional information and run any other tests.
>
>Thanks,
>
>Doug




