Thread: Decrease MAX_BACKENDS to 2^16

Decrease MAX_BACKENDS to 2^16

From
Andres Freund
Date:
Hi,

Currently the maximum for max_connections (+ bgworkers + autovacuum) is
defined by
#define MAX_BACKENDS    0x7fffff
which unfortunately means that some things like buffer reference counts
need a full integer to store references.

Since there's absolutely no sensible scenario for setting
max_connections that high, I'd like to change the limit to 2^16, so we
can use a uint16 in BufferDesc->refcount.

Does anyone disagree? This clearly is 9.5 material, but I wanted to
raise it early, since I plan to develop some stuff for 9.5 that'd depend
on lowering it.

Greetings,

Andres Freund

-- 
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: Decrease MAX_BACKENDS to 2^16

From
Greg Stark
Date:
On Fri, Apr 25, 2014 at 11:15 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> Since there's absolutely no sensible scenario for setting
> max_connections that high, I'd like to change the limit to 2^16, so we
> can use a uint16 in BufferDesc->refcount.

Clearly there's no sensible way to run 64k backends in the current
architecture. But I don't think it's beyond the realm of possibility
that we'll reduce the overhead in the future with an eye to being able
to do that. Is it that helpful that it's worth baking in more
dependencies on that limitation?


-- 
greg



Re: Decrease MAX_BACKENDS to 2^16

From
David Fetter
Date:
On Sat, Apr 26, 2014 at 12:15:40AM +0200, Andres Freund wrote:
> Hi,
> 
> Currently the maximum for max_connections (+ bgworkers + autovacuum) is
> defined by
> #define MAX_BACKENDS    0x7fffff
> which unfortunately means that some things like buffer reference counts
> need a full integer to store references.

Out of curiosity, where are you finding that a 32-bit integer is
causing problems that a 16-bit one would solve?

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: Decrease MAX_BACKENDS to 2^16

From
Andres Freund
Date:
On 2014-04-26 11:52:44 +0100, Greg Stark wrote:
> On Fri, Apr 25, 2014 at 11:15 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> > Since there's absolutely no sensible scenario for setting
> > max_connections that high, I'd like to change the limit to 2^16, so we
> > can use a uint16 in BufferDesc->refcount.
> 
> Clearly there's no sensible way to run 64k backends in the current
> architecture.

The current limit is 2^23 - 1 (0x7fffff); I am only proposing to lower it to 2^16.

> But I don't think it's beyond the realm of possibility
> that we'll reduce the overhead in the future with an eye to being able
> to do that. Is it that helpful that it's worth baking in more
> dependencies on that limitation?

I don't think it's realistic that we'll ever have more than 2^16 full
blown backends. We might (I hope!) get a builtin pooler, but pooler
connections won't be full backends.
So I really don't see any practical limitation with limiting the max
number of backends to 65k.

What I think it's necessary for is at least:

* Move the buffer content lock inline into the buffer descriptor, while still fitting into one cacheline.
* lockless/atomic Pin/Unpin Buffer.

Imo those are significant scalability advantages...
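
A minimal sketch of what a lockless pin/unpin could look like once the refcount fits in 16 bits and shares one 32-bit word with the flags. This is not PostgreSQL code: the struct, the function names, the bit layout and the "locked" flag are all hypothetical, and C11 atomics stand in for whatever primitives a real patch would use.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout, not PostgreSQL's: the low 16 bits hold the
 * refcount, the bits above hold flags. */
#define PIN_REFCOUNT_MASK  UINT32_C(0x0000FFFF)
#define PIN_FLAG_LOCKED    (UINT32_C(1) << 16)  /* made-up "header locked" flag */

typedef struct PinSketchDesc
{
    _Atomic uint32_t state;     /* flags and refcount in one atomic word */
} PinSketchDesc;

/* Read the current pin count out of the combined word. */
static uint32_t
pin_sketch_refcount(PinSketchDesc *buf)
{
    return atomic_load(&buf->state) & PIN_REFCOUNT_MASK;
}

/* Add one pin without a spinlock. Returns false if the header is flagged
 * as locked or if the refcount would overflow into the flag bits. */
static bool
pin_sketch_pin(PinSketchDesc *buf)
{
    uint32_t old = atomic_load(&buf->state);

    for (;;)
    {
        if (old & PIN_FLAG_LOCKED)
            return false;
        if ((old & PIN_REFCOUNT_MASK) == PIN_REFCOUNT_MASK)
            return false;
        /* On failure 'old' is refreshed with the current value; retry. */
        if (atomic_compare_exchange_weak(&buf->state, &old, old + 1))
            return true;
    }
}

/* Drop one pin; the flag bits are untouched because the count is nonzero. */
static void
pin_sketch_unpin(PinSketchDesc *buf)
{
    uint32_t prev = atomic_fetch_sub(&buf->state, 1);

    assert((prev & PIN_REFCOUNT_MASK) > 0);
    (void) prev;
}
```

The point of the compare-and-swap loop is that the flag check and the refcount increment happen on the same word, so no separate buffer header spinlock is needed for the common pin path.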

Greetings,

Andres Freund




Re: Decrease MAX_BACKENDS to 2^16

From
Andres Freund
Date:
On 2014-04-26 05:40:21 -0700, David Fetter wrote:
> On Sat, Apr 26, 2014 at 12:15:40AM +0200, Andres Freund wrote:
> > Hi,
> > 
> > Currently the maximum for max_connections (+ bgworkers + autovacuum) is
> > defined by
> > #define MAX_BACKENDS    0x7fffff
> > which unfortunately means that some things like buffer reference counts
> > need a full integer to store references.
> 
> Out of curiosity, where are you finding that a 32-bit integer is
> causing problems that a 16-bit one would solve?

Save space? For one it allows to shrink some structs (into one
cacheline!). For another it allows to combine flags and refcount in
buffer descriptors into one variable, manipulated atomically.

Greetings,

Andres Freund




Re: Decrease MAX_BACKENDS to 2^16

From
Tom Lane
Date:
Andres Freund <andres@2ndquadrant.com> writes:
> On 2014-04-26 11:52:44 +0100, Greg Stark wrote:
>> But I don't think it's beyond the realm of possibility
>> that we'll reduce the overhead in the future with an eye to being able
>> to do that. Is it that helpful that it's worth baking in more
>> dependencies on that limitation?

> What I think it's necessary for is at least:

> * Move the buffer content lock inline into the buffer descriptor,
>   while still fitting into one cacheline.
> * lockless/atomic Pin/Unpin Buffer.

TBH, that argument seems darn weak, not to mention probably applicable
only to current-vintage Intel chips.  And you have not proven that
narrowing the backend ID is necessary to either goal, even if we
accepted that these goals were that important.

While I agree with you that it seems somewhat unlikely we'd ever get
past 2^16 backends, these arguments are not nearly good enough to
justify a hard-wired limitation.
        regards, tom lane



Re: Decrease MAX_BACKENDS to 2^16

From
Tom Lane
Date:
Andres Freund <andres@2ndquadrant.com> writes:
> On 2014-04-26 05:40:21 -0700, David Fetter wrote:
>> Out of curiosity, where are you finding that a 32-bit integer is
>> causing problems that a 16-bit one would solve?

> Save space? For one it allows to shrink some structs (into one
> cacheline!).

And next week when we need some other field in a buffer header,
what's going to happen?  If things are so tight that we need to
shave a few bits off backend IDs, the whole thing is a house of
cards anyway.
        regards, tom lane



Re: Decrease MAX_BACKENDS to 2^16

From
David Fetter
Date:
On Sat, Apr 26, 2014 at 11:20:56AM -0400, Tom Lane wrote:
> Andres Freund <andres@2ndquadrant.com> writes:
> > On 2014-04-26 11:52:44 +0100, Greg Stark wrote:
> >> But I don't think it's beyond the realm of possibility
> >> that we'll reduce the overhead in the future with an eye to being able
> >> to do that. Is it that helpful that it's worth baking in more
> >> dependencies on that limitation?
> 
> > What I think it's necessary for is at least:
> 
> > * Move the buffer content lock inline into the buffer descriptor,
> >   while still fitting into one cacheline.
> > * lockless/atomic Pin/Unpin Buffer.
> 
> TBH, that argument seems darn weak, not to mention probably applicable
> only to current-vintage Intel chips.  And you have not proven that
> narrowing the backend ID is necessary to either goal, even if we
> accepted that these goals were that important.
> 
> While I agree with you that it seems somewhat unlikely we'd ever get
> past 2^16 backends, these arguments are not nearly good enough to
> justify a hard-wired limitation.

Rather than hard-wiring one, could we do something clever with
bit-stuffing, or would that tank performance in some terrible ways?

I know we allow for gigantic numbers of backend connections, but I've
never found a win for >2x the number of cores in the box, which at
least in my experience so far tops out in the 8-bit (in extreme cases
unsigned 8-bit) range.

Cheers,
David.



Re: Decrease MAX_BACKENDS to 2^16

From
Andres Freund
Date:
On 2014-04-26 11:20:56 -0400, Tom Lane wrote:
> Andres Freund <andres@2ndquadrant.com> writes:
> > On 2014-04-26 11:52:44 +0100, Greg Stark wrote:
> >> But I don't think it's beyond the realm of possibility
> >> that we'll reduce the overhead in the future with an eye to being able
> >> to do that. Is it that helpful that it's worth baking in more
> >> dependencies on that limitation?
> 
> > What I think it's necessary for is at least:
> 
> > * Move the buffer content lock inline into the buffer descriptor,
> >   while still fitting into one cacheline.
> > * lockless/atomic Pin/Unpin Buffer.
> 
> TBH, that argument seems darn weak, not to mention probably applicable
> only to current-vintage Intel chips.

64 bytes has been the cacheline size for more than a decade and it's not
just x86. ARM has also moved to it, as well as other architectures. And
even if it's 32 or 128 bytes - fitting datastructures to a power of 2 of
the cacheline size is still beneficial.
I don't think many datastructures in pg deserve attention to that, but
the buffer descriptors are one of the few. It's currently one of the top
3 sources of cpu cache issues in pg.

> And you have not proven that
> narrowing the backend ID is necessary to either goal, even if we
> accepted that these goals were that important.

I am pretty sure there are other ways, but since the actual cost of that
restriction imo is just about zero, it seems like a quite sensible
solution.

> While I agree with you that it seems somewhat unlikely we'd ever get
> past 2^16 backends, these arguments are not nearly good enough to
> justify a hard-wired limitation.

Even if you include a lockless pin/unpin buffer? Besides the lwlock's
internal spinlock the buffer spinlocks are the hottest ones in PG by
far.

Greetings,

Andres Freund




Re: Decrease MAX_BACKENDS to 2^16

From
Andres Freund
Date:
On 2014-04-26 11:22:39 -0400, Tom Lane wrote:
> Andres Freund <andres@2ndquadrant.com> writes:
> > On 2014-04-26 05:40:21 -0700, David Fetter wrote:
> >> Out of curiosity, where are you finding that a 32-bit integer is
> >> causing problems that a 16-bit one would solve?
> 
> > Save space? For one it allows to shrink some structs (into one
> > cacheline!).
> 
> And next week when we need some other field in a buffer header,
> what's going to happen?  If things are so tight that we need to
> shave a few bits off backend IDs, the whole thing is a house of
> cards anyway.

The problem isn't so much that we need the individual bits, but that we
need something that has an alignment of two, instead of 4.
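
To illustrate the alignment point, here is a small sketch comparing two made-up header fragments. The field names and their order are hypothetical and do not match the real BufferDesc; the only thing being shown is that a 2-byte-aligned counter can slot in where a 4-byte-aligned one forces padding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header fragments; field names are illustrative only. */
typedef struct WideCountSketch
{
    uint16_t flags;
    uint16_t usage_count;
    uint8_t  hdr_lock;
    uint32_t refcount;      /* usually 4-byte aligned: padding after hdr_lock */
} WideCountSketch;

typedef struct NarrowCountSketch
{
    uint16_t flags;
    uint16_t usage_count;
    uint8_t  hdr_lock;
    uint16_t refcount;      /* usually 2-byte aligned: less padding needed */
} NarrowCountSketch;

/* Bytes saved per descriptor by the narrower counter on this platform. */
static size_t
count_sketch_bytes_saved(void)
{
    assert(sizeof(NarrowCountSketch) <= sizeof(WideCountSketch));
    return sizeof(WideCountSketch) - sizeof(NarrowCountSketch);
}
```

On an ABI such as x86-64 Linux, where uint32_t is 4-byte aligned and uint16_t is 2-byte aligned, the wide fragment occupies 12 bytes and the narrow one 8. Exact numbers depend on the compiler and platform.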

I don't think we need to decide this without benchmarks proving the
benefits. I basically want to know whether somebody has an actual
usecase - even if I really, really, can't think of one - of setting
max_connections even remotely that high. If there's something
fundamental out there that'd make changing the limit impossible, doing
benchmarks wouldn't be worthwhile.

Greetings,

Andres Freund




Re: Decrease MAX_BACKENDS to 2^16

From
Josh Berkus
Date:
On 04/26/2014 11:06 AM, David Fetter wrote:
> I know we allow for gigantic numbers of backend connections, but I've
> never found a win for >2x the number of cores in the box, which at
> least in my experience so far tops out in the 8-bit (in extreme cases
> unsigned 8-bit) range.

For my part, I've found that anything over a few hundred backends on a
commodity server leads to serious performance degradation.  Even 2000 is
enough to make most servers fall over.  And with proper connection
pooling, I can pump 30,000 queries per second through about 45
connections, so the clear path to supporting large numbers of
connections is some form of built-in pooling.

However, I agree with Tom that Andres should "show his hand" before we
decrease MAX_BACKENDS by 128X.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: Decrease MAX_BACKENDS to 2^16

From
Andres Freund
Date:
On 2014-04-26 13:16:38 -0700, Josh Berkus wrote:
> However, I agree with Tom that Andres should "show his hand" before we
> decrease MAX_BACKENDS by 128X.

I just don't want to invest time in developing and benchmarking
something that's not going to be accepted anyway. Thus my question.

Greetings,

Andres Freund




Re: Decrease MAX_BACKENDS to 2^16

From
Noah Misch
Date:
On Sat, Apr 26, 2014 at 11:20:56AM -0400, Tom Lane wrote:
> Andres Freund <andres@2ndquadrant.com> writes:
> > What I think it's necessary for is at least:
> 
> > * Move the buffer content lock inline into the buffer descriptor,
> >   while still fitting into one cacheline.
> > * lockless/atomic Pin/Unpin Buffer.
> 
> TBH, that argument seems darn weak, not to mention probably applicable
> only to current-vintage Intel chips.  And you have not proven that
> narrowing the backend ID is necessary to either goal, even if we
> accepted that these goals were that important.
> 
> While I agree with you that it seems somewhat unlikely we'd ever get
> past 2^16 backends, these arguments are not nearly good enough to
> justify a hard-wired limitation.

I'm satisfied with the arguments Andres presented, which I presume were weak
only because he didn't expect a staunch defense of max_connections=70000 use.
The new restriction will still permit settings an order of magnitude larger
than current *worst* practice and 2-3 orders of magnitude larger than current
good practice.  If the next decade sees database server core counts grow by
two orders of magnitude or sees typical cache architectures change enough to
make the compactness irrelevant, we'll have the usual opportunities to react.
Today, the harm from contention on buffer headers totally eclipses the benefit
of allowing max_connections=70000.  There's no cause to predict a hardware
development radical enough to change that conclusion.

Sure, let's not actually commit a patch to impose this limit until the first
change benefiting from doing so is ready to go.  There remains an opportunity
to evaluate whether that beneficiary change is better done a different way.
By having this thread to first settle that the new max_connections limit is
essentially okay, the eventual thread concerning lock-free pin manipulation
need not inflate from discussion of this side issue.

On Sat, Apr 26, 2014 at 11:22:39AM -0400, Tom Lane wrote:
> And next week when we need some other field in a buffer header,
> what's going to happen?  If things are so tight that we need to
> shave a few bits off backend IDs, the whole thing is a house of
> cards anyway.

The buffer header has seen one change in nine years.  Making it an inviting
site for future patches is not important.

nm

-- 
Noah Misch
EnterpriseDB                                 http://www.enterprisedb.com



Re: Decrease MAX_BACKENDS to 2^16

From
Tom Lane
Date:
Noah Misch <noah@leadboat.com> writes:
> On Sat, Apr 26, 2014 at 11:20:56AM -0400, Tom Lane wrote:
>> While I agree with you that it seems somewhat unlikely we'd ever get
>> past 2^16 backends, these arguments are not nearly good enough to
>> justify a hard-wired limitation.

> I'm satisfied with the arguments Andres presented, which I presume were weak
> only because he didn't expect a staunch defense of max_connections=70000 use.
> The new restriction will still permit settings an order of magnitude larger
> than current *worst* practice and 2-3 orders of magnitude larger than current
> good practice.  If the next decade sees database server core counts grow by
> two orders of magnitude or sees typical cache architectures change enough to
> make the compactness irrelevant, we'll have the usual opportunities to react.
> Today, the harm from contention on buffer headers totally eclipses the benefit
> of allowing max_connections=70000.  There's no cause to predict a hardware
> development radical enough to change that conclusion.

Well, let me clarify my position: I'm not against reducing MAX_BACKENDS
if we get a significant improvement by doing so.  But the case for that
has not been made.

>> And next week when we need some other field in a buffer header,
>> what's going to happen?  If things are so tight that we need to
>> shave a few bits off backend IDs, the whole thing is a house of
>> cards anyway.

> The buffer header has seen one change in nine years.  Making it an inviting
> site for future patches is not important.

We were just a few days ago discussing (again) making changes to the
buffer allocation algorithms.  It hardly seems implausible that any
useful improvements there might need new or different fields in the
buffer headers.
        regards, tom lane



Re: Decrease MAX_BACKENDS to 2^16

From
Peter Geoghegan
Date:
On Sat, Apr 26, 2014 at 1:30 PM, Noah Misch <noah@leadboat.com> wrote:
> Sure, let's not actually commit a patch to impose this limit until the first
> change benefiting from doing so is ready to go.  There remains an opportunity
> to evaluate whether that beneficiary change is better done a different way.
> By having this thread to first settle that the new max_connections limit is
> essentially okay, the eventual thread concerning lock-free pin manipulation
> need not inflate from discussion of this side issue.

I agree with your remarks here. This kind of thing is only going to
become more important.

> On Sat, Apr 26, 2014 at 11:22:39AM -0400, Tom Lane wrote:
>> And next week when we need some other field in a buffer header,
>> what's going to happen?  If things are so tight that we need to
>> shave a few bits off backend IDs, the whole thing is a house of
>> cards anyway.
>
> The buffer header has seen one change in nine years.  Making it an inviting
> site for future patches is not important.

My prototype caching patch, which seems promising to me, adds an
instr_time to the BufferDesc struct. While that's obviously something
that isn't acceptable, and while I obviously could do better, it still
strikes me that that is the natural place to put such a piece of
state. That doesn't mean it's the best place, but it's still a point
worth noting in the context of this discussion.

As I mention on the thread concerning that work, the LRU-K paper
recommends a time-based delay throttling incrementation of usage_count
to address the problem of "correlated references" (5 seconds is
suggested there). At least one other major system implements a
configurable delay defaulting to 3 seconds. The 2Q paper also suggests
a correlated reference period.

-- 
Peter Geoghegan



Re: Decrease MAX_BACKENDS to 2^16

From
Peter Geoghegan
Date:
On Sat, Apr 26, 2014 at 1:58 PM, Peter Geoghegan <pg@heroku.com> wrote:
> The 2Q paper also suggests a correlated reference period.

I withdraw this. 2Q in fact does not have such a parameter, while
LRU-K does. But the other major system I mentioned very explicitly has
a configurable delay that serves this exact purpose. This "prevents a
burst of pins on a buffer counting as many touches". The point is that
this approach is quite feasible, and may even be the best way of
addressing the general problem of correlated references.


-- 
Peter Geoghegan



Re: Decrease MAX_BACKENDS to 2^16

From
Jim Nasby
Date:
On 4/26/14, 1:27 PM, Andres Freund wrote:
> I don't think we need to decide this without benchmarks proving the
> benefits. I basically want to know whether somebody has an actual
> usecase - even if I really, really, can't think of one - of setting
> max_connections even remotely that high. If there's something
> fundamental out there that'd make changing the limit impossible, doing
> benchmarks wouldn't be worthwhile.

Stupid question... how many OSes would actually support 65k active processes, let alone 2^24?
-- 
Jim C. Nasby, Data Architect                       jim@nasby.net
512.569.9461 (cell)                         http://jim.nasby.net



Re: Decrease MAX_BACKENDS to 2^16

From
Heikki Linnakangas
Date:
On 04/26/2014 09:27 PM, Andres Freund wrote:
> I don't think we need to decide this without benchmarks proving the
> benefits. I basically want to know whether somebody has an actual
> usecase - even if I really, really, can't think of one - of setting
> max_connections even remotely that high. If there's something
> fundamental out there that'd make changing the limit impossible, doing
> benchmarks wouldn't be worthwhile.

It doesn't seem unreasonable to have a database with tens of thousands 
of connections. Sure, performance will suffer, but if the connections 
sit idle most of the time so that the total load is low, who cares. 
Sure, you could use a connection pooler, but it's even better if you 
don't have to.

If there are big gains to be had from limiting the number of 
connections, I'm not against it. For the purpose of shrinking BufferDesc 
though, I have a feeling there might be other lower hanging fruit in 
there. For example, wait_backend_pid and freeNext are not used very 
often, so they could be moved elsewhere, to a separate array. And buf_id 
and the LWLock pointers could be calculated from the memory address of 
the struct.
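
As a rough sketch of both ideas (a separate array for rarely used fields, and deriving buf_id from the descriptor's address), assuming the descriptors live in one contiguous array. The struct and function names below are hypothetical; only wait_backend_pid and freeNext are taken from the discussion.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical "hot" descriptor: only frequently touched state, no buf_id. */
typedef struct HotDescSketch
{
    unsigned state;
} HotDescSketch;

/* Hypothetical "cold" side entry for rarely used fields. */
typedef struct ColdDescSketch
{
    int wait_backend_pid;
    int freeNext;
} ColdDescSketch;

/* Recover the buffer index from the descriptor's position in its array. */
static int
desc_sketch_buf_id(const HotDescSketch *base, const HotDescSketch *buf)
{
    ptrdiff_t idx = buf - base;

    assert(idx >= 0);
    return (int) idx;
}

/* Find the cold entry that belongs to a hot descriptor via the same index. */
static ColdDescSketch *
desc_sketch_cold(const HotDescSketch *hot_base, ColdDescSketch *cold_base,
                 const HotDescSketch *buf)
{
    return &cold_base[desc_sketch_buf_id(hot_base, buf)];
}
```

The same index could be used to look up a lock in a parallel lock array instead of storing a pointer in every descriptor, at the cost of a subtraction on each lookup.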

- Heikki



Re: Decrease MAX_BACKENDS to 2^16

From
Andres Freund
Date:
On 2014-04-28 10:48:30 +0300, Heikki Linnakangas wrote:
> On 04/26/2014 09:27 PM, Andres Freund wrote:
> >I don't think we need to decide this without benchmarks proving the
> >benefits. I basically want to know whether somebody has an actual
> >usecase - even if I really, really, can't think of one - of setting
> >max_connections even remotely that high. If there's something
> >fundamental out there that'd make changing the limit impossible, doing
> >benchmarks wouldn't be worthwhile.
> 
> It doesn't seem unreasonable to have a database with tens of thousands of
> connections. Sure, performance will suffer, but if the connections sit idle
> most of the time so that the total load is low, who cares. Sure, you could
> use a connection pooler, but it's even better if you don't have to.

65k connections will be absolutely *disastrous* for performance because
of the big PGPROC et al. I *do* think we have to make life easier for
users here by supplying builtin pooling at some point, but that's just a
separate feature.

> If there are big gains to be had from limiting the number of connections,
> I'm not against it. For the purpose of shrinking BufferDesc though, I have a
> feeling there might be other lower hanging fruit in there. For example,
> wait_backend_pid and freeNext are not used very often, so they could be
> moved elsewhere, to a separate array. And buf_id and the LWLock pointers
> could be calculated from the memory address of the struct.

The main reason I want to shrink it is that I want to make pin/unpin
buffer lockless and all solutions I can come up with for that require
flags to be in the same uint32 as the refcount. For performance
it'd be beneficial if usagecount also fits in there.

I agree that we can move a good part of BufferDesc into a separately
indexed array. io_in_progress_lock, freeNext, wait_backend_pid are imo
good candidates.

Greetings,

Andres Freund




Re: Decrease MAX_BACKENDS to 2^16

From
Heikki Linnakangas
Date:
On 04/28/2014 12:39 PM, Andres Freund wrote:
> On 2014-04-28 10:48:30 +0300, Heikki Linnakangas wrote:
>> On 04/26/2014 09:27 PM, Andres Freund wrote:
>>> I don't think we need to decide this without benchmarks proving the
>>> benefits. I basically want to know whether somebody has an actual
>>> usecase - even if I really, really, can't think of one - of setting
>>> max_connections even remotely that high. If there's something
>>> fundamental out there that'd make changing the limit impossible, doing
>>> benchmarks wouldn't be worthwhile.
>>
>> It doesn't seem unreasonable to have a database with tens of thousands of
>> connections. Sure, performance will suffer, but if the connections sit idle
>> most of the time so that the total load is low, who cares. Sure, you could
>> use a connection pooler, but it's even better if you don't have to.
>
> 65k connections will be absolutely *disastrous* for performance because
> of the big PGPROC et al.

Well, often that's still good enough.

> The main reason I want to shrink it is that I want to make pin/unpin
> buffer lockless and all solutions I can come up with for that require
> flags to be in the same uint32 as the refcount. For performance
> it'd be beneficial if usagecount also fits in there.

Would it be enough to put only some of the flags in the same uint32?

- Heikki



Re: Decrease MAX_BACKENDS to 2^16

From
Andres Freund
Date:
On 2014-04-28 13:32:45 +0300, Heikki Linnakangas wrote:
> On 04/28/2014 12:39 PM, Andres Freund wrote:
> >On 2014-04-28 10:48:30 +0300, Heikki Linnakangas wrote:
> >>On 04/26/2014 09:27 PM, Andres Freund wrote:
> >>>I don't think we need to decide this without benchmarks proving the
> >>>benefits. I basically want to know whether somebody has an actual
> >>>usecase - even if I really, really, can't think of one - of setting
> >>>max_connections even remotely that high. If there's something
> >>>fundamental out there that'd make changing the limit impossible, doing
> >>>benchmarks wouldn't be worthwhile.
> >>
> >>It doesn't seem unreasonable to have a database with tens of thousands of
> >>connections. Sure, performance will suffer, but if the connections sit idle
> >>most of the time so that the total load is low, who cares. Sure, you could
> >>use a connection pooler, but it's even better if you don't have to.
> >
> >65k connections will be absolutely *disastrous* for performance because
> >of the big PGPROC et al.
> 
> Well, often that's still good enough.

That may be true for 2-4k max_connections, but >65k? That won't even
*run*, not to speak of doing something, in most environments because of
the number of processes required.

Even making only 20k connections will probably crash your computer.

> >The main reason I want to shrink it is that I want to make pin/unpin
> >buffer lockless and all solutions I can come up with for that require
> >flags to be in the same uint32 as the refcount. For performance
> >it'd be beneficial if usagecount also fits in there.
> 
> Would it be enough to put only some of the flags in the same uint32?

It's probably possible, but would make things more complicated. For a
"feature" nobody is ever going to use.

Greetings,

Andres Freund




Re: Decrease MAX_BACKENDS to 2^16

From
Robert Haas
Date:
On Mon, Apr 28, 2014 at 7:37 AM, Andres Freund <andres@2ndquadrant.com> wrote:
>> Well, often that's still good enough.
>
> That may be true for 2-4k max_connections, but >65k? That won't even
> *run*, not to speak of doing something, in most environments because of
> the number of processes required.
>
> Even making only 20k connections will probably crash your computer.

I'm of two minds on this topic.  On the one hand, "cat
/proc/sys/kernel/pid_max" on a Linux system I just tested (3.2.6)
returns 65536, so we'll run out of PID space before we run out of 64k
backends.  On the other hand, that value can easily be increased to a
few million via, e.g., sysctl -w kernel.pid_max=4194303, and I imagine
that as machines continue to get bigger there will be more and more
people wanting to do things like that.

I think the fact that making 20k connections might crash your computer
is an artifact of other problems that we really ought to also fix
(like per-backend memory utilization, and lock contention on various
global data structures) rather than baking it into more places.  In
PostgreSQL 25.3, perhaps we'll be able to run distributed PostgreSQL
clusters that can service a million simultaneous connections across
dozens of physical machines.  Then again, there might not be much left
of our current buffer manager by that point, so maybe what we decide
right now isn't that relevant.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Decrease MAX_BACKENDS to 2^16

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> I think the fact that making 20k connections might crash your computer
> is an artifact of other problems that we really ought to also fix
> (like per-backend memory utilization, and lock contention on various
> global data structures) rather than baking it into more places.  In
> PostgreSQL 25.3, perhaps we'll be able to run distributed PostgreSQL
> clusters that can service a million simultaneous connections across
> dozens of physical machines.  Then again, there might not be much left
> of our current buffer manager by that point, so maybe what we decide
> right now isn't that relevant.

Yeah.  I think that useful use of 64K backends is far enough away that
it shouldn't be a showstopper argument, assuming that we get something
good in return for baking that into bufmgr.  What I find much more
worrisome about Andres' proposals is that he seems to be thinking that
there are *no* other changes to the buffer headers on the horizon.
That's untenable.  And I don't want to be told that we can't improve
the buffer management algorithms because adding another field would
make the headers not fit in a cacheline.  (For the same reason, I'm
pretty unimpressed by the nearby suggestions that it'd be okay to put
very tight limits on the number of bits in the buffer header flags or
the usage count.)
        regards, tom lane



Re: Decrease MAX_BACKENDS to 2^16

From
Andres Freund
Date:
On 2014-04-28 10:03:58 -0400, Tom Lane wrote:
> What I find much more worrisome about Andres' proposals is that he
> seems to be thinking that there are *no* other changes to the buffer
> headers on the horizon.

Err. I am not thinking that at all. I am pretty sure I never made that
argument. The reason I want to limit the number of connections is it
allows *both*, shrinking the size of BufferDescs due to less alignment
padding *and* stuffing the refcount and flags into one integer.

> That's untenable.  And I don't want to be told that we can't improve
> the buffer management algorithms because adding another field would
> make the headers not fit in a cacheline.

I think we need to move some less frequently used fields to a separate array
to be future proof. Heikki suggested freeNext and wait_backend_pid; I added
io_in_progress_lock. We could theoretically replace buf_id by
calculating it based on the BufferDescriptors array, but that's probably
not a good idea.

Greetings,

Andres Freund




Re: Decrease MAX_BACKENDS to 2^16

From
Tom Lane
Date:
Andres Freund <andres@2ndquadrant.com> writes:
> On 2014-04-28 10:03:58 -0400, Tom Lane wrote:
>> What I find much more worrisome about Andres' proposals is that he
>> seems to be thinking that there are *no* other changes to the buffer
>> headers on the horizon.

> Err. I am not thinking that at all. I am pretty sure I never made that
> argument. The reason I want to limit the number of connections is it
> allows *both*, shrinking the size of BufferDescs due to less alignment
> padding *and* stuffing the refcount and flags into one integer.

Weren't you saying you also wanted to stuff the usage count into that same
integer?  That's getting a little too tight for my taste, even if it would
fit today.
        regards, tom lane



Re: Decrease MAX_BACKENDS to 2^16

From
Andres Freund
Date:
On 2014-04-28 10:57:12 -0400, Tom Lane wrote:
> Andres Freund <andres@2ndquadrant.com> writes:
> > On 2014-04-28 10:03:58 -0400, Tom Lane wrote:
> >> What I find much more worrisome about Andres' proposals is that he
> >> seems to be thinking that there are *no* other changes to the buffer
> >> headers on the horizon.
> 
> > Err. I am not thinking that at all. I am pretty sure I never made that
> > argument. The reason I want to limit the number of connections is it
> > allows *both*, shrinking the size of BufferDescs due to less alignment
> > padding *and* stuffing the refcount and flags into one integer.
> 
> Weren't you saying you also wanted to stuff the usage count into that same
> integer?  That's getting a little too tight for my taste, even if it would
> fit today.

That's a possible additional optimization that we could use. But it's
certainly not required. Would allow us to use fewer atomic operations...

Right now there'd be enough space for a more precise usagecount and more
flags. ATM there's 9 bits for flags and 3 bits of usagecount...
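
The bit budget can be written down as a quick sketch: a 16-bit refcount plus the 9 flag bits and 3 usagecount bits mentioned above come to 28 bits, leaving 4 spare in a uint32. The macro names, the field order and the helper functions are assumptions made for illustration, not an existing layout.

```c
#include <assert.h>
#include <stdint.h>

/* Field widths from the discussion; the ordering within the word is assumed. */
#define SK_REFCOUNT_BITS  16
#define SK_USAGE_BITS     3
#define SK_FLAG_BITS      9

#define SK_USAGE_SHIFT    SK_REFCOUNT_BITS
#define SK_FLAG_SHIFT     (SK_USAGE_SHIFT + SK_USAGE_BITS)
#define SK_SPARE_BITS     (32 - (SK_FLAG_SHIFT + SK_FLAG_BITS))
#define SK_MASK(bits)     ((UINT32_C(1) << (bits)) - 1)

_Static_assert(SK_SPARE_BITS == 4, "16 + 3 + 9 bits leave 4 spare bits");

/* Combine the three fields into one 32-bit state word. */
static uint32_t
sk_pack(uint32_t refcount, uint32_t usage, uint32_t flags)
{
    assert(refcount <= SK_MASK(SK_REFCOUNT_BITS));
    assert(usage <= SK_MASK(SK_USAGE_BITS));
    assert(flags <= SK_MASK(SK_FLAG_BITS));
    return refcount | (usage << SK_USAGE_SHIFT) | (flags << SK_FLAG_SHIFT);
}

/* Extract each field again. */
static uint32_t
sk_refcount(uint32_t s)
{
    return s & SK_MASK(SK_REFCOUNT_BITS);
}

static uint32_t
sk_usage(uint32_t s)
{
    return (s >> SK_USAGE_SHIFT) & SK_MASK(SK_USAGE_BITS);
}

static uint32_t
sk_flags(uint32_t s)
{
    return (s >> SK_FLAG_SHIFT) & SK_MASK(SK_FLAG_BITS);
}
```

With a wider backend limit the refcount field would have to grow, and the spare bits for future flags or a more precise usagecount would shrink accordingly.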

Greetings,

Andres Freund
