Discussion: Estimating HugePages Requirements?


Estimating HugePages Requirements?

From:
Don Seiler
Date:
Good day,

I'm trying to set up a chef recipe to reserve enough HugePages on a Linux system for our PG servers. A given VM will only host one PG cluster, and that will be the only thing on that host that uses HugePages. Blogs I've seen suggest it would be as simple as taking the shared_buffers setting and dividing it by 2MB (the huge page size); however, I found that I needed more than that.

In my test case, shared_buffers is set to 4003MB (calculated by chef), but PG failed to start until I reserved a few hundred more MB. When I checked VmPeak, it was 4321MB, so I ended up having to reserve over 2161 huge pages, more than a hundred above my original estimate.

I'm told other factors contribute to this additional memory requirement, such as max_connections, wal_buffers, etc. I'm wondering if anyone has been able to come up with a reliable method for determining the HugePages requirements for a PG cluster based on the GUC values (that would be known at deployment time).

Thanks,
Don.

--
Don Seiler
www.seiler.us

Re: Estimating HugePages Requirements?

From:
Julien Rouhaud
Date:
On Thu, Jun 10, 2021 at 12:42 AM Don Seiler <don@seiler.us> wrote:
>
> I'm told other factors contribute to this additional memory requirement, such as max_connections, wal_buffers, etc. I'm wondering if anyone has been able to come up with a reliable method for determining the HugePages requirements for a PG cluster based on the GUC values (that would be known at deployment time).

It also depends on modules like pg_stat_statements and their own
configuration.  I think that you can find the required size that your
current configuration will allocate with:

SELECT sum(allocated_size) FROM pg_shmem_allocations ;
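That total is in bytes; as a rough sketch (assuming the default 2MB huge page size on x86_64; check Hugepagesize in /proc/meminfo on the target host), turning it into a huge page count could look like this:

    # Rough sketch: convert a shared-memory byte total (e.g. the sum returned
    # by the query above) into a count of 2MB huge pages.
    import math

    HUGE_PAGE_BYTES = 2 * 1024 * 1024   # assumed default huge page size

    def huge_pages_needed(shmem_bytes: int) -> int:
        return math.ceil(shmem_bytes / HUGE_PAGE_BYTES)

    # e.g. using the 4321MB VmPeak figure from the first message as a stand-in:
    print(huge_pages_needed(4321 * 1024 * 1024))   # -> 2161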



Re: Estimating HugePages Requirements?

From:
Vijaykumar Jain
Date:
Please ignore if you have already read the blog below; if not, at the end of it there is a GitHub repo which has memory specs for various TPC-C benchmarks. Of course, your workload expectations may vary from the test scenarios used, but just in case.

Settling the Myth of Transparent HugePages for Databases - Percona Database Performance Blog


Re: Estimating HugePages Requirements?

From:
Don Seiler
Date:
On Wed, Jun 9, 2021 at 1:45 PM Vijaykumar Jain <vijaykumarjain.github@gmail.com> wrote:
Please ignore if you have already read the blog below; if not, at the end of it there is a GitHub repo which has memory specs for various TPC-C benchmarks. Of course, your workload expectations may vary from the test scenarios used, but just in case.


That blog post is about transparent huge pages, which is different from the explicit HugePages I'm looking at here. We already disable THP as a matter of course.

--
Don Seiler
www.seiler.us

Re: Estimating HugePages Requirements?

From:
Bruce Momjian
Date:
On Wed, Jun  9, 2021 at 01:52:19PM -0500, Don Seiler wrote:
> On Wed, Jun 9, 2021 at 1:45 PM Vijaykumar Jain <vijaykumarjain.github@gmail.com> wrote:
> 
>     Please ignore, if you have read the blog below, if not, at the end of it
>     there is a github repo which has mem specs for various tpcc benchmarks.
>     Of course, your workload expectations may vary from the test scenarios used,
>     but just in case.
> 
>     Settling the Myth of Transparent HugePages for Databases - Percona Database
>     Performance Blog
> 
> 
> That blog post is about transparent huge pages, which is different than
> HugePages I'm looking at here. We already disable THP as a matter of course.

This blog post talks about sizing huge pages too:

    https://momjian.us/main/blogs/pgblog/2021.html#April_12_2021

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.




Re: Estimating HugePages Requirements?

From:
Magnus Hagander
Date:
On Wed, Jun 9, 2021 at 7:23 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
>
> On Thu, Jun 10, 2021 at 12:42 AM Don Seiler <don@seiler.us> wrote:
> >
> > I'm told other factors contribute to this additional memory requirement, such as max_connections, wal_buffers, etc. I'm wondering if anyone has been able to come up with a reliable method for determining the HugePages requirements for a PG cluster based on the GUC values (that would be known at deployment time).
>
> It also depends on modules like pg_stat_statements and their own
> configuration.  I think that you can find the required size that your
> current configuration will allocate with:
>
> SELECT sum(allocated_size) FROM pg_shmem_allocations ;

I wonder how hard it would be to for example expose that through a
commandline switch or tool.

The point being that in order to run the query you suggest, the server
must already be running. There is no way to use this to estimate the
size that you're going to need after changing the value of
shared_buffers, which is a very common scenario. (You can change it,
restart without using huge pages because it fails, run that query,
change huge pages, and restart again -- but that's not exactly...
convenient)

--
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/



Re: Estimating HugePages Requirements?

From:
Tom Lane
Date:
Magnus Hagander <magnus@hagander.net> writes:
> I wonder how hard it would be to for example expose that through a
> commandline switch or tool.

Just try to start the server and see if it complains.
For instance, with shared_buffers=10000000 I get

2021-06-09 15:08:56.821 EDT [1428121] FATAL:  could not map anonymous shared memory: Cannot allocate memory
2021-06-09 15:08:56.821 EDT [1428121] HINT:  This error usually means that PostgreSQL's request for a shared memory segment exceeded available memory, swap space, or huge pages. To reduce the request size (currently 83720568832 bytes), reduce PostgreSQL's shared memory usage, perhaps by reducing shared_buffers or max_connections.

Of course, if it *does* start, you can do the other thing.

Admittedly, we could make that easier somehow; but if it took
25 years for somebody to ask for this, I'm not sure it's
worth creating a feature to make it a shade easier.

            regards, tom lane



Re: Estimating HugePages Requirements?

From:
Magnus Hagander
Date:
On Wed, Jun 9, 2021 at 9:15 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Magnus Hagander <magnus@hagander.net> writes:
> > I wonder how hard it would be to for example expose that through a
> > commandline switch or tool.
>
> Just try to start the server and see if it complains.
> For instance, with shared_buffers=10000000 I get
>
> 2021-06-09 15:08:56.821 EDT [1428121] FATAL:  could not map anonymous shared memory: Cannot allocate memory
> 2021-06-09 15:08:56.821 EDT [1428121] HINT:  This error usually means that PostgreSQL's request for a shared memory segment exceeded available memory, swap space, or huge pages. To reduce the request size (currently 83720568832 bytes), reduce PostgreSQL's shared memory usage, perhaps by reducing shared_buffers or max_connections.
>
> Of course, if it *does* start, you can do the other thing.

Well, I have to *stop* the existing one first, most likely, otherwise
there won't be enough huge pages (or indeed memory) available. And if
the new one then doesn't start, you're looking at extended downtime.

You can automate this to minimize it (set the value in the conf, stop
old, start new, if new doesn't start then stop new, reconfigure, start
old again), but it's *far* from friendly.

This process works when you're setting up a brand new server with
nobody using it. It doesn't work well, or at all, when you actually
have active users on it.


> Admittedly, we could make that easier somehow; but if it took
> 25 years for somebody to ask for this, I'm not sure it's
> worth creating a feature to make it a shade easier.

We haven't had huge page support for 25 years, "only" since 9.4 so
about 7 years.

And for every year that passes, huge pages become more interesting:
memory sizes generally keep increasing, so the payoff of using them
grows.

Using huge pages *should* be a trivial improvement to set up. But it's
in my experience complicated enough that many just skip it simply for
that reason.

--
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/



Re: Estimating HugePages Requirements?

From:
Tom Lane
Date:
Magnus Hagander <magnus@hagander.net> writes:
> On Wed, Jun 9, 2021 at 9:15 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Just try to start the server and see if it complains.

> Well, I have to *stop* the existing one first, most likely, otherwise
> there won't be enough huge pages (or indeed memory) available.

I'm not following.  If you have a production server running, its
pg_shmem_allocations total should already be a pretty good guide
to what you need to configure HugePages for.  You need to know to
round that up, of course --- but if you aren't building a lot of
slop into the HugePages configuration anyway, you'll get burned
down the road.

            regards, tom lane



Re: Estimating HugePages Requirements?

From:
Magnus Hagander
Date:
On Wed, Jun 9, 2021 at 9:28 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Magnus Hagander <magnus@hagander.net> writes:
> > On Wed, Jun 9, 2021 at 9:15 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> Just try to start the server and see if it complains.
>
> > Well, I have to *stop* the existing one first, most likely, otherwise
> > there won't be enough huge pages (or indeed memory) available.
>
> I'm not following.  If you have a production server running, its
> pg_shmem_allocations total should already be a pretty good guide
> to what you need to configure HugePages for.  You need to know to
> round that up, of course --- but if you aren't building a lot of
> slop into the HugePages configuration anyway, you'll get burned
> down the road.

I'm talking about the case when you want to *change* the value for
shared_buffers (or other parameters that would change the amount of
required huge pages), on a system where you're using huge pages.
pg_shmem_allocations will tell you what you need with the current
value, not what you need with the new value.

But yes, you can do some math around it and make a well educated
guess. But it would be very convenient to have the system able to do
that for you.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/



Re: Estimating HugePages Requirements?

From:
P C
Date:
I agree, it's confusing for many, and that confusion arises from the fact that you usually talk about shared_buffers in MB or GB whereas huge pages have to be configured in units of 2MB. But once people understand that, they realize it's pretty simple.

Don, we have experienced the same not just with Postgres but also with Oracle. I haven't been able to get to the root of it, but what we usually do is add another 100-200 pages, and that works for us. If the SGA or shared_buffers is high, e.g. 96GB, then we add 250-500 pages. Those few hundred MB may be wasted (because the moment you configure huge pages, the operating system considers them used and does not use them for anything else), but nowadays servers easily have 64 or 128GB of RAM, and wasting 500MB to 1GB does not really hurt.
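As a rough sketch of that rule of thumb (the 2MB page size, the 200/500-page padding, and the 96GB threshold are just the numbers mentioned above, nothing official):

    # Sketch of the padding rule of thumb: ceil(shared_buffers / 2MB) plus a
    # few hundred extra huge pages.  All constants are taken from this thread.
    import math

    def padded_huge_pages(shared_buffers_mb: int) -> int:
        base = math.ceil(shared_buffers_mb / 2)
        padding = 500 if shared_buffers_mb >= 96 * 1024 else 200
        return base + padding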

HTH

On Thu, 10 Jun 2021 at 1:01 AM, Magnus Hagander <magnus@hagander.net> wrote:
On Wed, Jun 9, 2021 at 9:28 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Magnus Hagander <magnus@hagander.net> writes:
> > On Wed, Jun 9, 2021 at 9:15 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> Just try to start the server and see if it complains.
>
> > Well, I have to *stop* the existing one first, most likely, otherwise
> > there won't be enough huge pages (or indeed memory) available.
>
> I'm not following.  If you have a production server running, its
> pg_shmem_allocations total should already be a pretty good guide
> to what you need to configure HugePages for.  You need to know to
> round that up, of course --- but if you aren't building a lot of
> slop into the HugePages configuration anyway, you'll get burned
> down the road.

I'm talking about the case when you want to *change* the value for
shared_buffers (or other parameters that would change the amount of
required huge pages), on a system where you're using huge pages.
pg_shmem_allocations will tell you what you need with the current
value, not what you need with the new value.

But yes, you can do some math around it and make a well educated
guess. But it would be very convenient to have the system able to do
that for you.

--
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/


Re: Estimating HugePages Requirements?

From:
Don Seiler
Date:
On Wed, Jun 9, 2021, 21:03 P C <puravc@gmail.com> wrote:
I agree, it's confusing for many, and that confusion arises from the fact that you usually talk about shared_buffers in MB or GB whereas huge pages have to be configured in units of 2MB. But once people understand that, they realize it's pretty simple.

Don, we have experienced the same not just with Postgres but also with Oracle. I haven't been able to get to the root of it, but what we usually do is add another 100-200 pages, and that works for us. If the SGA or shared_buffers is high, e.g. 96GB, then we add 250-500 pages. Those few hundred MB may be wasted (because the moment you configure huge pages, the operating system considers them used and does not use them for anything else), but nowadays servers easily have 64 or 128GB of RAM, and wasting 500MB to 1GB does not really hurt.

I don't have a problem with the math, just wanted to know if it was possible to better estimate what the actual requirements would be at deployment time. My fallback will probably be to do what you did and just pad with an extra 512MB by default.

Don.

Re: Estimating HugePages Requirements?

From:
Don Seiler
Date:
On Thu, Jun 10, 2021 at 7:23 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
On Wed, Jun 09, 2021 at 10:55:08PM -0500, Don Seiler wrote:
> On Wed, Jun 9, 2021, 21:03 P C <puravc@gmail.com> wrote:
>
> > I agree, it's confusing for many, and that confusion arises from the fact
> > that you usually talk about shared_buffers in MB or GB whereas huge pages
> > have to be configured in units of 2MB. But once people understand that,
> > they realize it's pretty simple.
> >
> > Don, we have experienced the same not just with Postgres but also with
> > Oracle. I haven't been able to get to the root of it, but what we usually
> > do is add another 100-200 pages, and that works for us. If the SGA or
> > shared_buffers is high, e.g. 96GB, then we add 250-500 pages. Those few
> > hundred MB may be wasted (because the moment you configure huge pages, the
> > operating system considers them used and does not use them for anything
> > else), but nowadays servers easily have 64 or 128GB of RAM, and wasting
> > 500MB to 1GB does not really hurt.
>
> I don't have a problem with the math, just wanted to know if it was
> possible to better estimate what the actual requirements would be at
> deployment time. My fallback will probably be to do what you did and just
> pad with an extra 512MB by default.

It's because the huge allocation isn't just shared_buffers, but also
wal_buffers:

| The amount of shared memory used for WAL data that has not yet been written to disk.
| The default setting of -1 selects a size equal to 1/32nd (about 3%) of shared_buffers, ...
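(In code form, that documented -1 default works out to roughly the sketch below; the 64kB floor and the one-WAL-segment cap come from the wal_buffers documentation, and the helper name is just illustrative.)

    # Sketch of the documented wal_buffers = -1 default: 1/32 of shared_buffers,
    # clamped to at least 64kB and at most one WAL segment (16MB by default).
    def default_wal_buffers_bytes(shared_buffers_bytes: int,
                                  wal_segment_bytes: int = 16 * 1024 * 1024) -> int:
        return min(max(shared_buffers_bytes // 32, 64 * 1024), wal_segment_bytes)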

.. and other stuff:

src/backend/storage/ipc/ipci.c
         * Size of the Postgres shared-memory block is estimated via
         * moderately-accurate estimates for the big hogs, plus 100K for the
         * stuff that's too small to bother with estimating.
         *
         * We take some care during this phase to ensure that the total size
         * request doesn't overflow size_t.  If this gets through, we don't
         * need to be so careful during the actual allocation phase.
         */
        size = 100000;
        size = add_size(size, PGSemaphoreShmemSize(numSemas));
        size = add_size(size, SpinlockSemaSize());
        size = add_size(size, hash_estimate_size(SHMEM_INDEX_SIZE,
                                                 sizeof(ShmemIndexEnt)));
        size = add_size(size, dsm_estimate_size());
        size = add_size(size, BufferShmemSize());
        size = add_size(size, LockShmemSize());
        size = add_size(size, PredicateLockShmemSize());
        size = add_size(size, ProcGlobalShmemSize());
        size = add_size(size, XLOGShmemSize());
        size = add_size(size, CLOGShmemSize());
        size = add_size(size, CommitTsShmemSize());
        size = add_size(size, SUBTRANSShmemSize());
        size = add_size(size, TwoPhaseShmemSize());
        size = add_size(size, BackgroundWorkerShmemSize());
        size = add_size(size, MultiXactShmemSize());
        size = add_size(size, LWLockShmemSize());
        size = add_size(size, ProcArrayShmemSize());
        size = add_size(size, BackendStatusShmemSize());
        size = add_size(size, SInvalShmemSize());
        size = add_size(size, PMSignalShmemSize());
        size = add_size(size, ProcSignalShmemSize());
        size = add_size(size, CheckpointerShmemSize());
        size = add_size(size, AutoVacuumShmemSize());
        size = add_size(size, ReplicationSlotsShmemSize());
        size = add_size(size, ReplicationOriginShmemSize());
        size = add_size(size, WalSndShmemSize());
        size = add_size(size, WalRcvShmemSize());
        size = add_size(size, PgArchShmemSize());
        size = add_size(size, ApplyLauncherShmemSize());
        size = add_size(size, SnapMgrShmemSize());
        size = add_size(size, BTreeShmemSize());
        size = add_size(size, SyncScanShmemSize());
        size = add_size(size, AsyncShmemSize());
#ifdef EXEC_BACKEND
        size = add_size(size, ShmemBackendArraySize());
#endif

        /* freeze the addin request size and include it */
        addin_request_allowed = false;
        size = add_size(size, total_addin_request);

        /* might as well round it off to a multiple of a typical page size */
        size = add_size(size, 8192 - (size % 8192));

BTW, I think it'd be nice if this were a NOTICE:
| elog(DEBUG1, "mmap(%zu) with MAP_HUGETLB failed, huge pages disabled: %m", allocsize);

Great detail. I did some trial and error around just a few variables (shared_buffers, wal_buffers, max_connections) and came up with a formula that seems to be "good enough" for at least a rough default estimate.

The pseudo-code is basically:

ceiling((shared_buffers + 200 + (25 * shared_buffers/1024) + 10*(max_connections-100)/200 + wal_buffers-16)/2)
 
This assumes that all values are in MB and, obviously, that wal_buffers is set to a value other than the default of -1. I decided to default wal_buffers to 16MB in our environments, since that's what -1 should resolve to (per the description in the documentation) for instances with shared_buffers of the sizes in our deployments.

This formula did come up a little short (by 2MB) when I had a low shared_buffers value of 2GB; raising that starting 200 value to something like 250 would take care of that. Otherwise it held up in the limited testing I did against the different values we see across our production deployments. Please let me know what you folks think. I know I'm ignoring a lot of other factors, especially given what Justin just shared.

The remaining trick for me now is to calculate this in chef, since the shared_buffers and wal_buffers attributes are strings with the unit ("MB") in them rather than plain numeric values. I'm thinking of changing those attributes to be purely numeric and assume/require MB, to make the calculations easier.
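For reference, a rough sketch of the whole calculation in plain Python (not chef); the mb() helper and the assumption that the attribute strings always carry an "MB" suffix are illustrative only:

    import math

    def mb(value: str) -> int:
        # Parse a chef attribute string like "4003MB" into an integer number
        # of MB.  Assumes the string always ends with an "MB" suffix.
        assert value.endswith("MB"), value
        return int(value[:-2])

    def estimated_huge_pages(shared_buffers_mb: int,
                             max_connections: int,
                             wal_buffers_mb: int = 16) -> int:
        # The formula from above; the result is a count of 2MB huge pages.
        total_mb = (shared_buffers_mb
                    + 200
                    + 25 * shared_buffers_mb / 1024
                    + 10 * (max_connections - 100) / 200
                    + wal_buffers_mb - 16)
        return math.ceil(total_mb / 2)

    # Hypothetical usage with values similar to the ones in this thread:
    print(estimated_huge_pages(mb("4003MB"), max_connections=300))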

--
Don Seiler
www.seiler.us