Re: Resetting spilled txn statistics in pg_stat_replication

Поиск
Список
Период
Сортировка
От Masahiko Sawada
Тема Re: Resetting spilled txn statistics in pg_stat_replication
Дата
Msg-id CA+fd4k5+bH1MfFDgw6JGfHPmaXf0OrTjKvjqiw4QDLt0ZZvC8A@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Resetting spilled txn statistics in pg_stat_replication  (Amit Kapila <amit.kapila16@gmail.com>)
Ответы Re: Resetting spilled txn statistics in pg_stat_replication  (Amit Kapila <amit.kapila16@gmail.com>)
Список pgsql-hackers
On Thu, 16 Jul 2020 at 18:16, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Jul 16, 2020 at 1:45 PM Masahiko Sawada
> <masahiko.sawada@2ndquadrant.com> wrote:
> >
> > On Thu, 16 Jul 2020 at 15:53, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Fri, Jul 10, 2020 at 2:42 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > On Fri, Jul 10, 2020 at 7:23 AM Masahiko Sawada
> > > > <masahiko.sawada@2ndquadrant.com> wrote:
> > > > >
> > > > > On Thu, 9 Jul 2020 at 12:11, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > > >
> > > > > > On Wed, Jul 8, 2020 at 1:14 PM Masahiko Sawada
> > > > > > <masahiko.sawada@2ndquadrant.com> wrote:
> > > > > > >
> > > > > > >
> > > > > > > I think that using oids has another benefit that we don't need to send
> > > > > > > slot name to the stats collector along with the stats. Since the
> > > > > > > maximum size of slot name is NAMEDATALEN and we don't support the
> > > > > > > pgstat message larger than PGSTAT_MAX_MSG_SIZE (1000 bytes), if the
> > > > > > > user wants to increase NAMEDATALEN they might not be able to build.
> > > > > > >
> > > > > >
> > > > > > I think NAMEDATALEN is used for many other objects as well and I don't
> > > > > > think we want to change it in foreseeable future, so that doesn't
> > > > > > sound to be a good reason to invent OIDs for slots.  OTOH, I do
> > > > > > understand it would be better to send OIDs than names for slots but I
> > > > > > am just not sure if it is a good idea to invent a new way to generate
> > > > > > OIDs (which is different from how we do it for other objects in the
> > > > > > system) for this purpose.
> > > > >
> > > > > I'm concerned that there might be users who are using custom
> > > > > PostgreSQL that increased NAMEDATALEN for some reason. But indeed, I
> > > > > also agree with your concerns. So perhaps we can go with the current
> > > > > PoC patch approach as the first version (i.g., sending slot drop
> > > > > message to stats collector). When we need such a unique identifier
> > > > > also for other purposes, we will be able to change this feature so
> > > > > that it uses that identifier for this statistics reporting purpose.
> > > > >
> > > >
> > > > Okay, feel to submit the version atop my revert patch.
> > > >
> > >
> > > Attached, please find the rebased version.  I have kept prorows as 10
> > > instead of 100 for pg_stat_get_replication_slots because I don't see
> > > much reason for keeping the value more than the default value of
> > > max_replication_slots.
> > >
> >
> > Thank you for rebasing the patch! Agreed.
> >
> > > As we are targeting this patch for PG14, so I think we can now add the
> > > functionality to reset the stats as well.  What do you think?
> > >
> >
> > Yeah, I was also updating the patch while adding the reset functions.
> >
> > However, I'm concerned about the following part:
> >
> > +static int
> > +pgstat_replslot_index(const char *name)
> > +{
> > +   int     i;
> > +
> > +   Assert(nReplSlotStats <= max_replication_slots);
> > +   for (i = 0; i < nReplSlotStats; i++)
> > +   {
> > +       if (strcmp(replSlotStats[i].slotname, name) == 0)
> > +           return i; /* found */
> > +   }
> > +
> > +   /* not found, register new slot */
> > +   memset(&replSlotStats[nReplSlotStats], 0, sizeof(PgStat_ReplSlotStats));
> > +   memcpy(&replSlotStats[nReplSlotStats].slotname, name, NAMEDATALEN);
> > +   return nReplSlotStats++;
> > +}
> >
> > +static void
> > +pgstat_recv_replslot(PgStat_MsgReplSlot *msg, int len)
> > +{
> > +   int idx;
> > +
> > +   idx = pgstat_replslot_index(msg->m_slotname);
> > +   Assert(idx >= 0 && idx < max_replication_slots);
> >
> > As long as we cannot rely on message ordering, the above assertion
> > could be false. For example, suppose that there is no unused
> > replication slots and the user:
> >
> > 1. drops the existing slot.
> > 2. creates a new slot.
> >
> > If the stats messages arrive in order of 2 and 1, the above assertion
> > is false or leads to memory corruption when assertions are not
> > enabled.
> >
> > A possible solution would be to add an in-use flag to
> > PgStat_ReplSlotStats indicating whether the stats for slot is used or
> > not. When receiving a drop message for a slot, the stats collector
> > just marks the corresponding stats as unused. When receiving the stats
> > report for a new slot but there is no unused stats slot, ignore it.
> > What do you think?
> >
>
> As of now, you have a boolean flag msg.m_drop to distinguish the drop
> message but we don't have a similar way to distinguish the 'create'
> message.  What if have a way to distinguish 'create' message (we can
> probably keep some sort of flag to indicate the type of message
> (create, drop, update)) and then if the slot with the same name
> already exists, we ignore such a message.  Now, we also need a way to
> create the entry for a slot for a normal stats update message as well
> to accommodate for the lost 'create' message.  Does that make sense?

I might be missing your point, but even if we have 'create' message,
the problem can happen if when slots are full the user drops slot
‘slot_a’, creates slot ‘slot_b', and messages arrive in the reverse
order?

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Michael Paquier
Дата:
Сообщение: Re: TAP tests and symlinks on Windows
Следующее
От: ilmari@ilmari.org (Dagfinn Ilmari Mannsåker)
Дата:
Сообщение: Re: renaming configure.in to configure.ac