Re: Creating foreign key on partitioned table is too slow

Поиск
Список
Период
Сортировка
On Thu, Oct 24, 2019 at 03:48:57PM -0300, Alvaro Herrera wrote:
>On 2019-Oct-23, kato-sho@fujitsu.com wrote:
>
>> Hello
>>
>> To benchmark with tpcb model, I tried to create a foreign key in the partitioned history table, but backend process
killedby OOM.
 
>> the number of partitions is 8192. I tried in master(commit: ad4b7aeb84).
>>
>> I did the same thing in another server which has 200GB memory, but creating foreign key did not end in 24 hours.
>
>Thanks for reporting.  It sounds like there must be a memory leak here.
>I am fairly pressed for time at present so I won't be able to
>investigate this until, at least, mid November.
>

I've briefly looked into this, and I think the main memory leak is in
RelationBuildPartitionDesc. It gets called with PortalContext, it
allocates a lot of memory building the descriptor, copies it into
CacheContext but does not even try to free anything. So we end up with
something like this:

TopMemoryContext: 215344 total in 11 blocks; 47720 free (12 chunks); 167624 used
  pgstat TabStatusArray lookup hash table: 32768 total in 3 blocks; 9160 free (4 chunks); 23608 used
  TopTransactionContext: 4194304 total in 10 blocks; 1992968 free (18 chunks); 2201336 used
  RowDescriptionContext: 8192 total in 1 blocks; 6880 free (0 chunks); 1312 used
  MessageContext: 8192 total in 1 blocks; 3256 free (1 chunks); 4936 used
  Operator class cache: 8192 total in 1 blocks; 512 free (0 chunks); 7680 used
  smgr relation table: 32768 total in 3 blocks; 16768 free (8 chunks); 16000 used
  TransactionAbortContext: 32768 total in 1 blocks; 32504 free (0 chunks); 264 used
  Portal hash: 8192 total in 1 blocks; 512 free (0 chunks); 7680 used
  TopPortalContext: 8192 total in 1 blocks; 7648 free (0 chunks); 544 used
    PortalContext: 1557985728 total in 177490 blocks; 9038656 free (167645 chunks); 1548947072 used: 
  Relcache by OID: 16384 total in 2 blocks; 3424 free (3 chunks); 12960 used
  CacheMemoryContext: 17039424 total in 13 blocks; 7181480 free (9 chunks); 9857944 used
    partition key: 1024 total in 1 blocks; 168 free (0 chunks); 856 used: history
    index info: 2048 total in 2 blocks; 568 free (1 chunks); 1480 used: pg_class_tblspc_relfilenode_index
  ...
    index info: 2048 total in 2 blocks; 872 free (0 chunks); 1176 used: pg_class_oid_index
  WAL record construction: 49776 total in 2 blocks; 6344 free (0 chunks); 43432 used
  PrivateRefCount: 8192 total in 1 blocks; 2584 free (0 chunks); 5608 used
  MdSmgr: 8192 total in 1 blocks; 5976 free (0 chunks); 2216 used
  LOCALLOCK hash: 65536 total in 4 blocks; 18584 free (12 chunks); 46952 used
  Timezones: 104128 total in 2 blocks; 2584 free (0 chunks); 101544 used
  ErrorContext: 8192 total in 1 blocks; 6840 free (4 chunks); 1352 used
Grand total: 1580997216 bytes in 177834 blocks; 18482808 free (167857 chunks); 1562514408 used

(At which point I simply interrupted the query, it'd allocate more and
more memory until an OOM).

The attached patch trivially fixes that by adding a memory context
tracking all the temporary data, and then just deletes it as a whole at
the end of the function. This significantly reduces the memory usage for
me, not sure it's 100% correct.

FWIW, even with this fix it still takes an awful lot to create the
foreign key, because the CPU is stuck doing this

    60.78%    60.78%  postgres  postgres            [.] bms_equal
    32.58%    32.58%  postgres  postgres            [.] get_eclass_for_sort_expr
     3.83%     3.83%  postgres  postgres            [.] add_child_rel_equivalences
     0.23%     0.00%  postgres  [unknown]           [.] 0x0000000000000005
     0.22%     0.00%  postgres  [unknown]           [.] 0000000000000000
     0.18%     0.18%  postgres  postgres            [.] AllocSetCheck
   ...

Haven't looked into the details yet.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Soumyadeep Chakraborty
Дата:
Сообщение: Re: WIP: expression evaluation improvements
Следующее
От: Andres Freund
Дата:
Сообщение: Re: WIP: expression evaluation improvements