Обсуждение: My Experiment of PG crash when dealing with huge amount of data

Поиск

Список

Период

Сортировка

My Experiment of PG crash when dealing with huge amount of data

От

高健

Дата:

30 августа 2013 г., 09:10:46

Hello:

I have done the following experiment to test :

PG's activity when dealing with data which is bigger in size than total memory of the whole os system.

The result is:

PG says:

----------------------------

WARNING: terminating connection because of crash of another server process

DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.

HINT: In a moment you should be able to reconnect to the database and repeat your command.

The connection to the server was lost. Attempting reset: Failed.

-----------------------------

In log, I can see the following:

LOG: background writer process (PID 3221) was terminated by signal 9: Killed

But why it need to kill the bgwriter ?

And that PG will use as much resource as it can to finish a sql dml , till it used almost all the resource and then crash?

I used the default checkpoint_segments(3) and shard_buffers(32MB) settings. And my machine has only 1024MB memory.

The process is as :

1) create table whose one tuple is more than 1KB:

postgres=# create table test01(id integer, val char(1024));

2) insert into the table 2457600 records, which makes totally more than 2400MB:

postgres=# insert into test01 values(generate_series(1,2457600),repeat( chr(int4(random()*26)+65),1024));

It really took a few minutes, When the sql statement run, I can see that the server process is consuming 80% of total memory of the os.

----------------------------------------------------------------------------------------------------------------------------------

[root@server ~]# ps aux | grep post

root 3180 0.0 0.0 105296 712 pts/1 S 16:31 0:00 su - postgres

postgres 3181 0.0 0.0 70304 676 pts/1 S+ 16:31 0:00 -bash

postgres 3219 0.0 0.2 113644 2864 pts/1 S 16:32 0:00 /usr/local/pgsql/bin/postgres -D /gao/data

postgres 3221 0.4 3.0 113724 35252 ? Ss 16:32 0:01 postgres: writer process

postgres 3222 0.2 0.1 113644 1616 ? Ds 16:32 0:00 postgres: wal writer process

postgres 3223 0.0 0.0 114380 1148 ? Ss 16:32 0:00 postgres: autovacuum launcher process

postgres 3224 0.0 0.0 73332 472 ? Ss 16:32 0:00 postgres: stats collector process

root 3252 0.0 0.0 105296 712 pts/2 S 16:32 0:00 su - postgres

postgres 3253 0.0 0.0 70304 676 pts/2 S 16:32 0:00 -bash

postgres 3285 0.0 0.0 83488 740 pts/2 S+ 16:32 0:00 ./psql

postgres 3286 14.8 80.2 2598332 924308 ? Ds 16:32 0:35 postgres: postgres postgres [local] INSERT

root 3333 0.0 0.0 65424 812 pts/3 S+ 16:36 0:00 grep post

--------------------------------------------------------

After a while, I found in the PG's log , the following information:

--------------------------------------------------------

LOG: autovacuum launcher started

LOG: database system is ready to accept connections

LOG: checkpoints are occurring too frequently (2 seconds apart)