Re: Get memory contexts of an arbitrary backend process

Поиск
Список
Период
Сортировка
От torikoshia
Тема Re: Get memory contexts of an arbitrary backend process
Дата
Msg-id ab8998e5af7977133cee43f9d53abd23@oss.nttdata.com
обсуждение исходный текст
Ответ на Re: Get memory contexts of an arbitrary backend process  (Michael Paquier <michael@paquier.xyz>)
Ответы Re: Get memory contexts of an arbitrary backend process
Список pgsql-hackers
Hi,

Thanks for all your comments, I updated the patch.


On Tue, Sep 1, 2020 at 12:03 AM Kasahara Tatsuhito 
<kasahara.tatsuhito@gmail.com> wrote:

> - How about managing the status of signal send/receive and dump
> operations on a shared hash or others ?
> Sending and receiving signals, dumping memory information, and
> referencing dump information all work asynchronously.
> Therefore, it would be good to have management information to
> check
> the status of each process.
> A simple idea is that ..
> - send a signal to dump to a PID, it first record following
> information into the shared hash.
> pid (specified pid)
> loc (dump location, currently might be ASAP)
> recv (did the pid process receive a signal? first false)
> dumped (did the pid process dump a mem information?  first
> false)
> - specified process receive the signal, update the status in the
> shared hash, then dumped at specified location.
> - specified process finish dump mem information,  update the
> status
> in the shared hash.

I added a shared hash table consisted of minimal members
mainly for managing whether the file is dumped or not.
Some members like 'loc' seem useful in the future, but I
haven't added them since it's not essential at this point.



On 2020-09-01 10:54, Andres Freund wrote:

>>  CREATE VIEW pg_backend_memory_contexts AS
>> -    SELECT * FROM pg_get_backend_memory_contexts();
>> +    SELECT * FROM pg_get_backend_memory_contexts(-1);
> 
> -1 is odd. Why not use NULL or even 0?

Changed it from -1 to NULL.

>> +             rc = fwrite(&name_len, sizeof(int), 1, fpout);
>> +             rc = fwrite(name, name_len, 1, fpout);
>> +             rc = fwrite(&idlen, sizeof(int), 1, fpout);
>> +             rc = fwrite(clipped_ident, idlen, 1, fpout);
>> +             rc = fwrite(&level, sizeof(int), 1, fpout);
>> +             rc = fwrite(&parent_len, sizeof(int), 1, fpout);
>> +             rc = fwrite(parent, parent_len, 1, fpout);
>> +             (void) rc;                              /* we'll check 
>> for error with ferror */
>> +
>> +     }
> This format is not descriptive. How about serializing to json or
> something? Or at least having field names?

Added field names for each value.

>> +             while (true)
>> +             {
>> +                     CHECK_FOR_INTERRUPTS();
>> +
>> +                     pg_usleep(10000L);
>> +
> 
> Need better signalling back/forth here.

Added a shared hash table that has a flag for managing whether the file 
is dumped or not.
I modified it to use this flag.

>> +     /*
>> +      * Open a temp file to dump the current memory context.
>> +      */
>> +     fpout = AllocateFile(tmpfile, PG_BINARY_W);
>> +     if (fpout == NULL)
>> +     {
>> +             ereport(LOG,
>> +                             (errcode_for_file_access(),
>> +                              errmsg("could not write temporary 
>> memory context file \"%s\": %m",
>> +                                             tmpfile)));
>> +             return;
>> +     }
> 
> Probably should be opened with O_CREAT | O_TRUNC?

AllocateFile() calls fopen() and AFAIU fopen() with mode "w" corresponds 
to open() with "O_WRONLY | O_CREAT | O_TRUNC".

Do you mean I should not use fopen() here?

On 2020-09-24 13:01, Michael Paquier wrote:
> On Thu, Sep 10, 2020 at 09:11:21PM +0900, Kasahara Tatsuhito wrote:
>> I think it's fine to have an interface to delete in an emergency, but
>> I agree that
>> users shouldn't be made aware of the existence or deletion of dump
>> files, basically.
> 
> Per the CF bot, the number of tests needs to be tweaked, because we
> test each entry filtered out with is_deeply(), meaning that the number
> of tests needs to be updated to reflect that if the filtered list is
> changed:
> t/010_pg_basebackup.pl ... 104/109 # Looks like you planned 109 tests
> but ran 110.
> t/010_pg_basebackup.pl ... Dubious, test returned 255 (wstat 65280, 
> 0xff00)
> All 109 subtests passed
> 
> Simple enough to fix.

Incremented the number of tests.


Any thoughts?

Regards,

-- 
Atsushi Torikoshi
NTT DATA CORPORATION

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Thomas Munro
Дата:
Сообщение: Re: Handing off SLRU fsyncs to the checkpointer
Следующее
От: Craig Ringer
Дата:
Сообщение: [PATCH] Runtime control of CLOBBER_CACHE_ALWAYS