Re: MVCC and all that...

From: Justin
Subject: Re: MVCC and all that...
Msg-id: CALL-XeOVhXeGTxQ-m=a__YqKrhWROTdv6Ea-zfN2-rFtRQyfSg@mail.gmail.com
In reply to: Re: MVCC and all that...  (Nico Williams <nico@cryptonector.com>)
Responses: Re: MVCC and all that...
           Re: MVCC and all that...
List: pgsql-general


On Wed, Sep 10, 2025 at 5:28 PM Nico Williams <nico@cryptonector.com> wrote:
On Tue, Sep 09, 2025 at 08:41:02PM -0400, Justin wrote:
> The author brings up threaded vs. multi-process. That's a very, very old
> conversation, and it has shown that there is no clearly better way.

This is relevant to the next part:

> Number of open connections: so Firebird can hold 1000 open sessions with a
> smaller memory footprint, but it still cannot have 1000 simultaneously running
> sessions unless we have 1000 CPUs. Where is the win here? We should be
> managing resources better on the application side, not opening thousands of
> connections that sit idle doing nothing.

When a service is written in such a way as to minimize the memory
footprint of each request/client then it can process more of them
assuming it's only memory-bound.  Why?  Because less memory per thing ==
less bandwidth use, and also less thrashing of caches and higher cache
hit ratios.

Minimizing request/client state means not spreading any of it on the
stack, thus not requiring a stack per-client.  This means not
thread-per-client (green or otherwise) or process-per-client.  It means
essentially some flavor of continuation passing style (CPS).  For a
query plan executor the per-client state is really just the query plan,
all the in-flight I/O requests, and all cached data still needed to
continue processing the plan.
If you have a Duff's device style / CPS style implementation, then
nothing on the stack needs to be preserved while waiting for I/Os, and
the state of the query plan is effectively minimized.
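
As a rough illustration of the stackless idea described above (not how PostgreSQL or Firebird actually implement their executors), here is a minimal Python sketch. The names `ScanState`, `resume`, `run`, and the in-memory `storage` dict are all hypothetical, and the I/O is faked as completing immediately. The point is only that each in-flight query is a small explicit state object, and one loop resumes whichever query has a completed I/O.

```python
from collections import deque
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class Step(Enum):
    NEED_PAGE = auto()  # the plan is waiting for an I/O to complete
    DONE = auto()       # the plan has produced its result


@dataclass
class ScanState:
    """All state for one in-flight query; nothing is kept on a call stack."""
    pages: list                    # hypothetical page ids still to be read
    total: int = 0                 # running aggregate (a SUM over all rows)
    pending: Optional[int] = None  # page id of the outstanding I/O, if any


def resume(state, completed_page=None):
    """Advance the plan until it has to wait for I/O, then return."""
    if completed_page is not None:
        state.total += sum(completed_page)
        state.pending = None
    if not state.pages:
        return Step.DONE
    state.pending = state.pages.pop(0)
    return Step.NEED_PAGE


def run(queries, storage):
    """Single loop that interleaves many queries without a stack per query."""
    results = {}
    ready = deque((qid, None) for qid in queries)
    while ready:
        qid, page = ready.popleft()
        state = queries[qid]
        if resume(state, page) is Step.DONE:
            results[qid] = state.total
        else:
            # Assumption: the read completes at once. A real executor would
            # submit an asynchronous I/O and re-queue the query on completion.
            ready.append((qid, storage[state.pending]))
    return results


storage = {0: [1, 2], 1: [3], 2: [4, 5]}
queries = {"a": ScanState(pages=[0, 1]), "b": ScanState(pages=[2])}
results = run(queries, storage)
```

Between calls to `resume`, the only memory a waiting query holds is its `ScanState`, which is the footprint argument being made here.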

But for a database with storage I/O costs the memory footprint doesn't
matter quite so much because most likely it will be I/O bound rather
than CPU- or memory-bound.


I am not following you here. Databases are going to be bound somewhere at some point: disk I/O, network I/O, memory, or CPU. Which one causes the bottleneck just depends on the workload and the size of the database.

The number of idle sessions does not really matter; they just waste resources across the entire application stack.
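
To make the "manage resources on the application side" point concrete, here is a minimal sketch of a bounded connection pool in Python. Everything in it is hypothetical: `FakeConnection` stands in for a real database connection, and `Pool` and `serve_clients` are illustrative names, not any driver's API. Threads are used only to simulate many concurrent clients.

```python
import queue
import threading
from contextlib import contextmanager


class FakeConnection:
    """Stand-in for a real database connection (hypothetical)."""

    def __init__(self, ident):
        self.ident = ident
        self.queries_run = 0

    def execute(self, sql):
        self.queries_run += 1
        return f"conn{self.ident}: {sql}"


class Pool:
    """Fixed number of connections shared by any number of clients."""

    def __init__(self, size):
        self.connections = [FakeConnection(i) for i in range(size)]
        self._idle = queue.Queue()
        for conn in self.connections:
            self._idle.put(conn)

    @contextmanager
    def connection(self):
        conn = self._idle.get()  # blocks while every connection is busy
        try:
            yield conn
        finally:
            self._idle.put(conn)  # hand the connection back for reuse


def serve_clients(pool, n_clients):
    """Run n_clients simulated clients against the shared pool."""
    results = []
    lock = threading.Lock()

    def client(i):
        with pool.connection() as conn:
            out = conn.execute(f"SELECT {i}")
        with lock:
            results.append(out)

    threads = [threading.Thread(target=client, args=(i,)) for i in range(n_clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


pool = Pool(size=4)
results = serve_clients(pool, n_clients=100)
```

Here 100 clients are served over 4 connections, so the database never sees more than 4 sessions, idle or otherwise.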


> "PostgreSQL has a relatively simple, but fast query planning algorithm"
> Compared to what? What feature is PG missing these days? The only
> thing I know it can't do is change the plan in the middle of the
> execution stage, which is not a query planner thing but the execution
> layer saying to itself, "I am taking too long, maybe go back to the planning
> stage." Query hints have been discussed endlessly. Adding hints
> adds its own problems and has become a big mess for databases that support
> them.

I would really like out-of-band hints.  These would be hints not
specified in the SQL itself but sent separately, addressing
table sources or joins by name, like this:

psql> SELECT .. FROM x x1 JOIN y y1 ON .. JOIN y y2 ON .. WHERE ..;
...> \hint y1 indexed by ..
...> \hint y2 indexed by ..
...> ;


I humbly disagree. The point of SQL being a fourth-generation language is that I tell it what I want, not how to go get it.

Thank you,
Justin
