Discussion: 64-bit size pgbench


64-bit size pgbench

From
Greg Smith
Date:
Attached is a patch that fixes a long standing bug in pgbench:  it won't
handle scale factors above ~4000 (around 60GB) because it uses 32-bit
integers for its computations related to the number of accounts, and it
just crashes badly when you exceed that.  This month I've run into two
systems where that was barely enough to exceed physical RAM, so I'd
expect this to be a significant limiting factor during 9.0's lifetime.
A few people have complained about it already in 8.4.

The index size on the big accounts table has to increase for this to
work; it's a bigint now instead of an int.  That's going to mean a drop
in results for some tests, just because less index will fit in RAM.
I'll quantify that better before submitting something final here.  I
still have some other testing left to do as well:  making sure I didn't
break the new \setshell feature (am suspicious of strtol()), confirming
the random numbers are still as random as they should be (there was a
little bug in the debugging code related to that, too).
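
For anyone who wants to poke at the random-number part in isolation, here is a minimal sketch, not the patch itself. It copies the arithmetic of the patched getrand() into a standalone function and checks that draws stay inside the requested range. The names getrand64 and check_range are hypothetical, and MAX_RANDOM_VALUE is defined locally on the assumption that it matches the 31-bit ceiling of POSIX random().

```c
#include <stdint.h>
#include <stdlib.h>

/* Assumption: same value as the 31-bit ceiling of POSIX random(). */
#define MAX_RANDOM_VALUE 0x7FFFFFFF

/* Same scaling arithmetic as the patched getrand(), with 64-bit bounds. */
static int64_t
getrand64(int64_t min, int64_t max)
{
    return min + (int64_t) (((max - min + 1) * (double) random()) /
                            (MAX_RANDOM_VALUE + 1.0));
}

/* Hypothetical helper: returns 1 if every one of `draws` samples is in [min, max]. */
static int
check_range(int64_t min, int64_t max, int draws)
{
    for (int i = 0; i < draws; i++)
    {
        int64_t v = getrand64(min, max);

        if (v < min || v > max)
            return 0;
    }
    return 1;
}
```

Calling check_range(1, 500000000, 100000) exercises the scale-5000 account range. Since random() yields at most 2^31 distinct inputs, a range wider than that cannot have every value reachable, which is consistent with the patch keeping the max > MAX_RANDOM_VALUE check.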

Was looking for general feedback on whether the way I've converted this
to use 64 bit integers for the account numbers seems appropriate, and to
see if there's any objection to fixing this in general given the
potential downsides.

Here's the patch in action on previously unreachable sizes (this is a
system with 8GB of RAM, so I'm basically just testing seek speed here):

$ ./pgbench -j 4 -c 8 -T 30 -S pgbench
starting vacuum...end.
transaction type: SELECT only
scaling factor: 5000
query mode: simple
number of clients: 8
number of threads: 4
duration: 30 s
number of transactions actually processed: 2466
tps = 82.010509 (including connections establishing)
tps = 82.042946 (excluding connections establishing)

$ psql -x -c "select relname,reltuples from pg_class where relname='pgbench_accounts'" -d pgbench
relname   | pgbench_accounts
reltuples | 5e+08

$ psql -x -c "select pg_size_pretty(pg_table_size('pgbench_accounts'))" -d pgbench
pg_size_pretty | 63 GB

$ psql -x -c "select aid from pgbench_accounts order by aid limit 1" -d pgbench
aid | 1

$ psql -x -c "select aid from pgbench_accounts order by aid desc limit 1" -d pgbench
aid | 500000000

--
Greg Smith    2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com  www.2ndQuadrant.com

diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
index 38086a5..8a7064a 100644
*** a/contrib/pgbench/pgbench.c
--- b/contrib/pgbench/pgbench.c
*************** usage(const char *progname)
*** 313,326 ****
  }

  /* random number generator: uniform distribution from min to max inclusive */
! static int
! getrand(int min, int max)
  {
      /*
       * Odd coding is so that min and max have approximately the same chance of
       * being selected as do numbers between them.
       */
!     return min + (int) (((max - min + 1) * (double) random()) / (MAX_RANDOM_VALUE + 1.0));
  }

  /* call PQexec() and exit() on failure */
--- 313,326 ----
  }

  /* random number generator: uniform distribution from min to max inclusive */
! static int64
! getrand(int64 min, int64 max)
  {
      /*
       * Odd coding is so that min and max have approximately the same chance of
       * being selected as do numbers between them.
       */
!     return min + (int64) (((max - min + 1) * (double) random()) / (MAX_RANDOM_VALUE + 1.0));
  }

  /* call PQexec() and exit() on failure */
*************** runShellCommand(CState *st, char *variab
*** 630,636 ****
      FILE   *fp;
      char    res[64];
      char   *endptr;
!     int        retval;

      /*
       * Join arguments with whilespace separaters. Arguments starting with
--- 630,636 ----
      FILE   *fp;
      char    res[64];
      char   *endptr;
!     int64        retval;

      /*
       * Join arguments with whilespace separaters. Arguments starting with
*************** runShellCommand(CState *st, char *variab
*** 704,710 ****
      }

      /* Check whether the result is an integer and assign it to the variable */
!     retval = (int) strtol(res, &endptr, 10);
      while (*endptr != '\0' && isspace((unsigned char) *endptr))
          endptr++;
      if (*res == '\0' || *endptr != '\0')
--- 704,710 ----
      }

      /* Check whether the result is an integer and assign it to the variable */
!     retval = (int64) strtol(res, &endptr, 19);
      while (*endptr != '\0' && isspace((unsigned char) *endptr))
          endptr++;
      if (*res == '\0' || *endptr != '\0')
*************** runShellCommand(CState *st, char *variab
*** 712,718 ****
          fprintf(stderr, "%s: must return an integer ('%s' returned)\n", argv[0], res);
          return false;
      }
!     snprintf(res, sizeof(res), "%d", retval);
      if (!putVariable(st, "setshell", variable, res))
          return false;

--- 712,718 ----
          fprintf(stderr, "%s: must return an integer ('%s' returned)\n", argv[0], res);
          return false;
      }
!     snprintf(res, sizeof(res), INT64_FORMAT, retval);
      if (!putVariable(st, "setshell", variable, res))
          return false;

*************** top:
*** 959,966 ****
          if (pg_strcasecmp(argv[0], "setrandom") == 0)
          {
              char       *var;
!             int            min,
!                         max;
              char        res[64];

              if (*argv[2] == ':')
--- 959,967 ----
          if (pg_strcasecmp(argv[0], "setrandom") == 0)
          {
              char       *var;
!             int64        min,
!                         max,
!                         rand;
              char        res[64];

              if (*argv[2] == ':')
*************** top:
*** 1000,1014 ****

              if (max < min || max > MAX_RANDOM_VALUE)
              {
!                 fprintf(stderr, "%s: invalid maximum number %d\n", argv[0], max);
                  st->ecnt++;
                  return true;
              }

  #ifdef DEBUG
!             printf("min: %d max: %d random: %d\n", min, max, getrand(min, max));
  #endif
!             snprintf(res, sizeof(res), "%d", getrand(min, max));

              if (!putVariable(st, argv[0], argv[1], res))
              {
--- 1001,1016 ----

              if (max < min || max > MAX_RANDOM_VALUE)
              {
!                 fprintf(stderr, "%s: invalid maximum number " INT64_FORMAT "\n", argv[0], max);
                  st->ecnt++;
                  return true;
              }
+             rand=getrand(min,max);

  #ifdef DEBUG
!             printf("min: " INT64_FORMAT " max: " INT64_FORMAT " random: " INT64_FORMAT "\n", min, max, rand);
  #endif
!             snprintf(res, sizeof(res), INT64_FORMAT, rand);

              if (!putVariable(st, argv[0], argv[1], res))
              {
*************** init(void)
*** 1191,1197 ****
          "drop table if exists pgbench_tellers",
          "create table pgbench_tellers(tid int not null,bid int,tbalance int,filler char(84)) with (fillfactor=%d)",
          "drop table if exists pgbench_accounts",
!         "create table pgbench_accounts(aid int not null,bid int,abalance int,filler char(84)) with (fillfactor=%d)",
          "drop table if exists pgbench_history",
          "create table pgbench_history(tid int,bid int,aid int,delta int,mtime timestamp,filler char(22))"
      };
--- 1193,1199 ----
          "drop table if exists pgbench_tellers",
          "create table pgbench_tellers(tid int not null,bid int,tbalance int,filler char(84)) with (fillfactor=%d)",
          "drop table if exists pgbench_accounts",
!         "create table pgbench_accounts(aid bigint not null,bid int,abalance int,filler char(80)) with (fillfactor=%d)",
          "drop table if exists pgbench_history",
          "create table pgbench_history(tid int,bid int,aid int,delta int,mtime timestamp,filler char(22))"
      };
*************** init(void)
*** 1204,1210 ****
      PGconn       *con;
      PGresult   *res;
      char        sql[256];
!     int            i;

      if ((con = doConnect()) == NULL)
          exit(1);
--- 1206,1212 ----
      PGconn       *con;
      PGresult   *res;
      char        sql[256];
!     int64        i;

      if ((con = doConnect()) == NULL)
          exit(1);
*************** init(void)
*** 1232,1244 ****

      for (i = 0; i < nbranches * scale; i++)
      {
!         snprintf(sql, 256, "insert into pgbench_branches(bid,bbalance) values(%d,0)", i + 1);
          executeStatement(con, sql);
      }

      for (i = 0; i < ntellers * scale; i++)
      {
!         snprintf(sql, 256, "insert into pgbench_tellers(tid,bid,tbalance) values (%d,%d,0)",
                   i + 1, i / ntellers + 1);
          executeStatement(con, sql);
      }
--- 1234,1246 ----

      for (i = 0; i < nbranches * scale; i++)
      {
!         snprintf(sql, 256, "insert into pgbench_branches(bid,bbalance) values(" INT64_FORMAT ",0)", i + 1);
          executeStatement(con, sql);
      }

      for (i = 0; i < ntellers * scale; i++)
      {
!         snprintf(sql, 256, "insert into pgbench_tellers(tid,bid,tbalance) values (" INT64_FORMAT "," INT64_FORMAT ",0)",
                   i + 1, i / ntellers + 1);
          executeStatement(con, sql);
      }
*************** init(void)
*** 1263,1271 ****

      for (i = 0; i < naccounts * scale; i++)
      {
!         int            j = i + 1;

!         snprintf(sql, 256, "%d\t%d\t%d\t\n", j, i / naccounts + 1, 0);
          if (PQputline(con, sql))
          {
              fprintf(stderr, "PQputline failed\n");
--- 1265,1273 ----

      for (i = 0; i < naccounts * scale; i++)
      {
!         int64            j = i + 1;

!         snprintf(sql, 256, INT64_FORMAT "\t" INT64_FORMAT "\t%d\t\n", j, i / naccounts + 1, 0);
          if (PQputline(con, sql))
          {
              fprintf(stderr, "PQputline failed\n");
*************** init(void)
*** 1273,1279 ****
          }

          if (j % 10000 == 0)
!             fprintf(stderr, "%d tuples done.\n", j);
      }
      if (PQputline(con, "\\.\n"))
      {
--- 1275,1281 ----
          }

          if (j % 10000 == 0)
!             fprintf(stderr, INT64_FORMAT " tuples done.\n", j);
      }
      if (PQputline(con, "\\.\n"))
      {
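
The INT64_FORMAT macro used throughout the patch comes from the PostgreSQL headers. To try the same string-splicing pattern outside the source tree, a rough standalone equivalent is the C99 PRId64 macro from <inttypes.h>. The sketch below formats one COPY row the way the patched loop does; format_account_row is a hypothetical name, and the substitution of PRId64 for INT64_FORMAT is an assumption made only for illustration.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: write one tab-separated pgbench_accounts row
 * (aid, bid, abalance = 0, empty filler) into buf.
 * Returns the snprintf() result.
 */
static int
format_account_row(char *buf, size_t len, int64_t aid, int64_t bid)
{
    /* "%" PRId64 is spliced into the literal, as INT64_FORMAT is in the patch. */
    return snprintf(buf, len, "%" PRId64 "\t%" PRId64 "\t%d\t\n", aid, bid, 0);
}
```

With a plain "%d" the same call would be passed a 64-bit argument for a 32-bit conversion, which is undefined behavior, so every format string touching aid has to change together with the variable types.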

Re: 64-bit size pgbench

From
Takahiro Itagaki
Date:
Greg Smith <greg@2ndquadrant.com> wrote:

> Attached is a patch that fixes a long standing bug in pgbench:  it won't 
> handle scale factors above ~4000 (around 60GB) because it uses 32-bit 
> integers for its computations related to the number of accounts, and it 
> just crashes badly when you exceed that.  This month I've run into two 
> systems where that was barely enough to exceed physical RAM, so I'd 
> expect this to be a significant limiting factor during 9.0's lifetime.  
> A few people have complained about it already in 8.4.

+1 for the fix.

Do we also need to adjust "tuples done" messages during dataload?
It would be too verbose for large scale factor. I think a message
every 1% is reasonable.
   if (j % 10000 == 0)
       fprintf(stderr, INT64_FORMAT " tuples done.\n", j);
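
A minimal sketch of the 1% idea, under stated assumptions: the total row count is known before the load starts, and should_report is a hypothetical helper, not something in pgbench. The step is total/100 rounded down, with a floor of 1 so very small tables still report.

```c
#include <stdint.h>

/*
 * Hypothetical helper: returns 1 when row j (1-based) falls on a reporting
 * boundary.  The step is total / 100, rounded down, and never less than 1.
 */
static int
should_report(int64_t j, int64_t total)
{
    int64_t step = total / 100;

    if (step < 1)
        step = 1;
    return j % step == 0;
}
```

In the load loop this would replace the j % 10000 == 0 test. For totals that are not a multiple of 100 the rounded-down step yields slightly more than 100 messages, which seems acceptable for a progress indicator.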

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center




Re: 64-bit size pgbench

From
Tom Lane
Date:
Greg Smith <greg@2ndquadrant.com> writes:
> Was looking for general feedback on whether the way I've converted this 
> to use 64 bit integers for the account numbers seems appropriate, and to 
> see if there's any objection to fixing this in general given the 
> potential downsides.

In the past we've rejected proposed patches for pgbench on the grounds
that they would make results non-comparable to previous results.  So the
key question here is how much this affects the speed.  Please be sure to
test that on a 32-bit machine, not a 64-bit one.

> !     retval = (int64) strtol(res, &endptr, 19);

That bit is merely wishful thinking :-(
        regards, tom lane
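
One reading of the objection above: the third argument of strtol() is the numeric base, so 19 asks for base-19 parsing, and strtol() returns a long, which is only 32 bits wide on many platforms, so the cast to int64 cannot recover digits that were never parsed. A minimal sketch of a 64-bit parse follows, assuming C99 strtoll() is available on the target platform (an assumption; the thread does not say what portability constraints apply). parse_int64 is a hypothetical name.

```c
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a base-10 integer from s into *out, allowing
 * trailing whitespace.  Returns 1 on success; returns 0 on empty input,
 * trailing garbage, or a value outside the range of long long.
 */
static int
parse_int64(const char *s, int64_t *out)
{
    char       *endptr;
    long long   v;

    errno = 0;
    v = strtoll(s, &endptr, 10);    /* base 10, not 19 */
    if (endptr == s || errno == ERANGE)
        return 0;
    while (*endptr != '\0' && isspace((unsigned char) *endptr))
        endptr++;
    if (*endptr != '\0')
        return 0;
    *out = (int64_t) v;
    return 1;
}
```

The whitespace skipping mirrors the loop already present in runShellCommand(), so output such as "500000000000\n" from a shell command would be accepted.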


Re: 64-bit size pgbench

From
Robert Haas
Date:
On Fri, Jan 29, 2010 at 11:09 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Greg Smith <greg@2ndquadrant.com> writes:
>> Was looking for general feedback on whether the way I've converted this
>> to use 64 bit integers for the account numbers seems appropriate, and to
>> see if there's any objection to fixing this in general given the
>> potential downsides.
>
> In the past we've rejected proposed patches for pgbench on the grounds
> that they would make results non-comparable to previous results.

Perhaps we need an option indicating whether or not the use of bigint
columns is OK.

...Robert


Re: 64-bit size pgbench

From
Greg Smith
Date:
Tom Lane wrote:
> In the past we've rejected proposed patches for pgbench on the grounds
> that they would make results non-comparable to previous results.  So the
> key question here is how much this affects the speed.  Please be sure to
> test that on a 32-bit machine, not a 64-bit one.
>   

Sheesh, who has a 32-bit machine anymore?  I'll see what older hardware 
I can dig up.  I've realized there are two separate issues to be 
concerned about:

1) On small scale data sets, what's the impact of the main piece of data 
being shuffled around in memory (the account number in the accounts 
table) now being 64 bits?  That part might be significantly worse on 
32-bit hardware.

2) How does the expansion in size of the related primary key on that 
data impact the breakpoint where the database doesn't fit in RAM anymore?

I just updated my pgbench-tools package this month so that it 
happily runs against either 8.3 or 8.4/9.0, and I've done two rounds of 
extensive test runs lately, so there is plenty of data to compare against here.

>> !     retval = (int64) strtol(res, &endptr, 19);
>>     
>
> That bit is merely wishful thinking :-(
>   

I did specifically say I didn't trust that call one bit.

There is a middle ground position here, similar to what Robert 
suggested: I just add a "large mode" to the program for people who 
need it, without touching the current case.  That might allow me to 
sidestep some of the issues I may not have a good answer to in 
getting the \setshell feature working right in 64 bits; I could just make 
that one specific to "regular mode".

In any case, I think this limitation in what pgbench can do has risen to 
be a full-on bug at this point for the expected users of the next 
version, and I'll sit on this until there's something better we can make 
available.

-- 
Greg Smith    2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com  www.2ndQuadrant.com