Re: Postgres is not able to handle more than 4k tables!?

Поиск
Список
Период
Сортировка
От Stephen Frost
Тема Re: Postgres is not able to handle more than 4k tables!?
Дата
Msg-id 20200709145700.GA12375@tamriel.snowman.net
обсуждение исходный текст
Ответ на Re: Postgres is not able to handle more than 4k tables!?  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: Postgres is not able to handle more than 4k tables!?  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Postgres is not able to handle more than 4k tables!?  (Konstantin Knizhnik <k.knizhnik@postgrespro.ru>)
Список pgsql-hackers
Greetings,

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Konstantin Knizhnik <k.knizhnik@postgrespro.ru> writes:
> > There are several thousand clients, most of which are executing complex
> > queries.
>
> So, that's really the core of your problem.  We don't promise that
> you can run several thousand backends at once.  Usually it's recommended
> that you stick a connection pooler in front of a server with (at most)
> a few hundred backends.

Sure, but that doesn't mean things should completely fall over when we
do get up to larger numbers of backends, which is definitely pretty
common in larger systems.  I'm pretty sure we all agree that using a
connection pooler is recommended, but if there's things we can do to
make the system work at least a bit better when folks do use lots of
connections, provided we don't materially damage other cases, that's
probably worthwhile.

> > So them are not able to process all this invalidation messages and their
> > invalidation message buffer is overflown.
> > Size of this buffer is hardcoded (MAXNUMMESSAGES = 4096) and can not be
> > changed without recompilation of Postgres.
> > This is problem N1.
>
> No, this isn't a problem.  Or at least you haven't shown a reason to
> think it is.  Sinval overruns are somewhat routine, and we certainly
> test that code path (see CLOBBER_CACHE_ALWAYS buildfarm animals).

Testing that it doesn't outright break and having it be decently
performant are two rather different things.  I think we're talking more
about performance and not so much about if the system is outright broken
in this case.

> > But then we come to the next show stopper: NUM_LOCK_PARTITIONS.
> > It is also hardcoded and can't be changed without recompilation:
>
> > #define LOG2_NUM_LOCK_PARTITIONS  4
> > #define NUM_LOCK_PARTITIONS  (1 << LOG2_NUM_LOCK_PARTITIONS)
>
> > Having just 16 LW-Locks greatly increase conflict probability (taken in
> > account that there are 5k tables and totally about 25k relations).
>
> > It cause huge lw-lock acquisition time for heap_open and planning stage
> > of some queries is increased from milliseconds to several minutes!
>
> Really?

Apparently, given the response down-thread.

> > This is problem number 2. But there is one more flaw we have faced with.
> > We have increased LOG2_NUM_LOCK_PARTITIONS to 8
> > and ... clients start to report "too many LWLocks taken" error.
> > There is yet another hardcoded constant MAX_SIMUL_LWLOCKS = 200
> > which relation with NUM_LOCK_PARTITIONS  was not mentioned anywhere.
>
> Seems like self-inflicted damage.  I certainly don't recall anyplace
> in the docs where we suggest that you can alter that constant without
> worrying about consequences.

Perhaps not in the docs, but would be good to make note of it somewhere,
as I don't think it's really appropriate to assume these constants won't
ever change and whomever contemplates changing them would appreciate
knowing about other related values..

> > So looks like NUM_LOCK_PARTITIONS and MAXNUMMESSAGES  constants have to
> > be replaced with GUCs.
>
> I seriously doubt we'd do that.

Making them GUCs does seem like it's a few steps too far... but it'd be
nice if we could arrange to have values that don't result in the system
falling over with large numbers of backends and large numbers of tables.
To get a lot of backends, you'd have to set max_connections up pretty
high to begin with- perhaps we should contemplate allowing these values
to vary based on what max_connections is set to?

Thanks,

Stephen

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andrew Dunstan
Дата:
Сообщение: Re: Is this a bug in pg_current_logfile() on Windows?
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Is this a bug in pg_current_logfile() on Windows?