Re: memory problems and crash of db when deleting data from tablewith thousands of partitions

Поиск
Список
Период
Сортировка
От Josef Machytka
Тема Re: memory problems and crash of db when deleting data from tablewith thousands of partitions
Дата
Msg-id CAGvVEFsTp8cqALqPaOVxXaXMXx9V6TXtCAGobvWBb-ezjDbRtA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: memory problems and crash of db when deleting data from tablewith thousands of partitions  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Ответы Re: memory problems and crash of db when deleting data from tablewith thousands of partitions  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Список pgsql-bugs
Thank you for your email, FYI - we now did tests with PostgreSQL 12 and unfortunately it is also not able to handle to our case. Only difference is that PG 12 is not killed by OOM killer and even does not crash - which is good. But it reports error "out of memory" and stops the statement. So at least it looks like much more stable then PG 11.

Regards

Josef Machytka

from postgresql log:

TopMemoryContext: 6685688 total in 213 blocks; 1229064 free (32 chunks); 5456624 used
  pgstat TabStatusArray lookup hash table: 1048576 total in 8 blocks; 230568 free (17 chunks); 818008 used
  TopTransactionContext: 8192 total in 1 blocks; 7744 free (2 chunks); 448 used
  Type information cache: 24352 total in 2 blocks; 2624 free (0 chunks); 21728 used
  HandleParallelMessages: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
  TableSpace cache: 8192 total in 1 blocks; 2096 free (0 chunks); 6096 used
  Operator lookup cache: 24576 total in 2 blocks; 10760 free (3 chunks); 13816 used
  RowDescriptionContext: 8192 total in 1 blocks; 6896 free (0 chunks); 1296 used
  MessageContext: 52197329840 total in 8274 blocks; 75904 free (9 chunks); 52197253936 used
    partition directory: 32768 total in 3 blocks; 8544 free (7 chunks); 24224 used
  Operator class cache: 8192 total in 1 blocks; 560 free (0 chunks); 7632 used
  smgr relation table: 4194304 total in 10 blocks; 1597192 free (37 chunks); 2597112 used
  TransactionAbortContext: 32768 total in 1 blocks; 32512 free (0 chunks); 256 used
  Portal hash: 8192 total in 1 blocks; 560 free (0 chunks); 7632 used
  TopPortalContext: 8192 total in 1 blocks; 7936 free (1 chunks); 256 used
  Relcache by OID: 1048576 total in 8 blocks; 208888 free (16 chunks); 839688 used
  CacheMemoryContext: 136577192 total in 29 blocks; 5037352 free (18 chunks); 131539840 used
    index info: 3072 total in 2 blocks; 1064 free (1 chunks); 2008 used: test_out_of_stock_default_v2__start_date_end_date_country_g_idx
    index info: 3072 total in 2 blocks; 1064 free (1 chunks); 2008 used: test_out_of_stock_zoot_defaul_start_date_end_date_country_g_idx
....
    index info: 3072 total in 2 blocks; 1064 free (1 chunks); 2008 used: test_out_of_stock_zalora_2018_start_date_end_date_country__idx1
    19998 more child contexts containing 51415472 total in 39616 blocks; 12386152 free (10423 chunks); 39029320 used
  WAL record construction: 49768 total in 2 blocks; 6368 free (0 chunks); 43400 used
  PrivateRefCount: 8192 total in 1 blocks; 2624 free (0 chunks); 5568 used
  MdSmgr: 524288 total in 7 blocks; 42464 free (2 chunks); 481824 used
  LOCALLOCK hash: 2097152 total in 9 blocks; 100416 free (30 chunks); 1996736 used
  Timezones: 104120 total in 2 blocks; 2624 free (0 chunks); 101496 used
  ErrorContext: 8192 total in 1 blocks; 7936 free (3 chunks); 256 used
Grand total: 52401570368 bytes in 48395 blocks; 21121960 free (10700 chunks); 52380448408 used
2019-10-28 15:11:44.276 UTC [2746] upcload@queries ERROR:  out of memory
2019-10-28 15:11:44.276 UTC [2746] upcload@queries DETAIL:  Failed on request of size 16 in memory context "MessageContext".
2019-10-28 15:11:44.276 UTC [2746] upcload@queries STATEMENT:  ......

On Wed, 23 Oct 2019 at 14:47, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
On Wed, Oct 23, 2019 at 01:46:23PM +0200, Josef Machytka wrote:
>Hi people,
>
>I know this is actually known problem (
>https://stackoverflow.com/questions/49291451/postgres-10-3-heavily-partitioned-table-and-cannot-delete-any-records/58521850#58521850).
>
>
>I would just like to add my voice and description of use case to this
>topic. If this could be repaired it would be amazing because we use new
>native partitioning really a lot in our company and we like it - well, not
>counting this problem into it....
>
>I have this problem on PostgreSQL 11 (PostgreSQL 11.5 (Debian
>11.5-3.pgdg90+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian
>6.3.0-18+deb9u1) 6.3.0 20170516, 64-bit) with partitioned table having
>more levels of partitioning. Main table is partitioned by shops and each
>shops months (year-month) over several past years. Together several
>thousands of partitions and number is still growing.
>
>When we simply insert new data (which is usual operation we do) all is
>perfect. But lately we needed to delete some wrong data over all partitions
>and PostgreSQL started to crash during this operation.
>
>Crashes are always the same - PostgreSQL starts to use more and more memory
>and eventually is killed by OOM killer. I tried to fiddle with work_mem and
>other settings - nothing, database just crashes a bit later but crashes
>anyway.
>

Yeah, I think this is a known issue - there are cases where we're not
smart enough and end up opening/locking all the partitions, resulting in
excessive memory consumption (and OOM). I suppose this is one of those
cases, but I'd have to see memory context stats to know for sure.

Unfortunately, that's a design issue, and it's not going to be fixed in
backbranches. We're improving this - perhaps PostgreSQL 12 would
improve the behavior in your case, and hopefully 13 will do even better.

>Of course workaround works - I can use script to do deletion or update over
>each shop partition separately. There are only several dozens of monthly
>partitions for each shop so it work perfectly. But anyway if problem would
>be repaired and simple delete/update over top main table would be possible
>this would be much better.
>

Right, that's a reasonable workaround.


regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

В списке pgsql-bugs по дате отправления:

Предыдущее
От: william allen
Дата:
Сообщение: RE: BUG #15858: could not stat file - over 4GB
Следующее
От: Stephen Frost
Дата:
Сообщение: Re: BUG #16079: Question Regarding the BUG #16064