Thread: optimizer cost calculation problem
Kenji Sugita has identified a problem with cost_sort() in costsize.c.
In the following code fragment, sortmembytes is defined as long, so

    double nruns = nbytes / (sortmembytes * 2);

may cause an integer overflow if sortmembytes exceeds 2^30, which in
turn makes the optimizer produce a wrong query plan (this actually
happened in a large PostgreSQL installation with tons of memory).
Here is a proposed fix against current:
*** optimizer/path/costsize.c	16 Feb 2003 02:30:38 -0000	1.107
--- optimizer/path/costsize.c	31 Mar 2003 22:25:06 -0000
***************
*** 490,496 ****
  	Cost		startup_cost = input_cost;
  	Cost		run_cost = 0;
  	double		nbytes = relation_byte_size(tuples, width);
! 	long		sortmembytes = SortMem * 1024L;
  
  	if (!enable_sort)
  		startup_cost += disable_cost;
--- 490,496 ----
  	Cost		startup_cost = input_cost;
  	Cost		run_cost = 0;
  	double		nbytes = relation_byte_size(tuples, width);
! 	double		sortmembytes = SortMem * 1024.0;
  
  	if (!enable_sort)
  		startup_cost += disable_cost;
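For context, here is a minimal standalone sketch of the failure mode (not
PostgreSQL source; the values are made up), assuming a platform where long
is 32 bits. Signed overflow is formally undefined, but on such builds the
product typically wraps negative, which is what skews the run estimate; on
machines with a 64-bit long the "broken" and "fixed" results coincide.

    #include <stdio.h>

    int
    main(void)
    {
        int     SortMem = 1048576;      /* 1 GB expressed in KB, the reported setting */
        double  nbytes = 8.0e9;         /* pretend the sort input is about 8 GB */

        /* Old coding: long arithmetic.  With a 32-bit long, 2^30 * 2 wraps. */
        long    sortmembytes_long = SortMem * 1024L;
        double  nruns_broken = nbytes / (sortmembytes_long * 2);

        /* Proposed coding: do the arithmetic in double from the start. */
        double  sortmembytes_dbl = SortMem * 1024.0;
        double  nruns_fixed = nbytes / (sortmembytes_dbl * 2.0);

        printf("broken nruns = %g\n", nruns_broken);    /* negative with a 32-bit long */
        printf("fixed  nruns = %g\n", nruns_fixed);     /* roughly 3.7 */
        return 0;
    }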
Tatsuo Ishii <t-ishii@sra.co.jp> writes:
> Kenji Sugita has identified a problem with cost_sort() in costsize.c.
> In the following code fragment, sortmembytes is defined as long, so
>     double nruns = nbytes / (sortmembytes * 2);
> may cause an integer overflow if sortmembytes exceeds 2^30, which in
> turn makes the optimizer produce a wrong query plan (this actually
> happened in a large PostgreSQL installation with tons of memory).
I find it really really hard to believe that it's wise to run with
sort_mem exceeding 2 gig ;-). Does that installation have so much
RAM that it can afford to run multiple many-Gb sorts concurrently?
This is far from being the only place that multiplies SortMem by 1024.
My inclination is that a safer fix is to alter guc.c's entry for
SortMem to establish a maximum value of INT_MAX/1024 for the variable.
Probably some of the other GUC variables like shared_buffers ought to
have overflow-related maxima established, too.
regards, tom lane
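As a rough sketch of that guardrail (the real change belongs in guc.c's
configuration tables, whose entry layout is not reproduced here; the helper
below is hypothetical), the idea is simply to cap any KB-denominated setting
so that multiplying it by 1024 cannot overflow 32-bit int arithmetic:

    /* Hypothetical sketch only; not the actual guc.c mechanism. */
    #include <limits.h>

    #define MAX_KB_SETTING  (INT_MAX / 1024)    /* 2097151 KB, roughly 2 GB */

    static int
    clamp_kb_setting(int requested_kb)
    {
        /* Cap the value so that requested_kb * 1024 stays within int range. */
        return (requested_kb > MAX_KB_SETTING) ? MAX_KB_SETTING : requested_kb;
    }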
> Tatsuo Ishii <t-ishii@sra.co.jp> writes:
> > Kenji Sugita has identified a problem with cost_sort() in costsize.c.
> > In the following code fragment, sortmembytes is defined as long, so
> >     double nruns = nbytes / (sortmembytes * 2);
> > may cause an integer overflow if sortmembytes exceeds 2^30, which in
> > turn makes the optimizer produce a wrong query plan (this actually
> > happened in a large PostgreSQL installation with tons of memory).
>
> I find it really really hard to believe that it's wise to run with
> sort_mem exceeding 2 gig ;-). Does that installation have so much
> RAM that it can afford to run multiple many-Gb sorts concurrently?

The process is assigned 1 gig of sort mem to speed up a batch job by
using a backend-process-only sort_mem setting, and they do not modify
postgresql.conf for ordinary users. BTW it is not 2 gig, but 1 gig
(remember that we do sortmembytes * 2).

> This is far from being the only place that multiplies SortMem by 1024.
> My inclination is that a safer fix is to alter guc.c's entry for
> SortMem to establish a maximum value of INT_MAX/1024 for the variable.
>
> Probably some of the other GUC variables like shared_buffers ought to
> have overflow-related maxima established, too.
>
> 			regards, tom lane
Tatsuo Ishii <t-ishii@sra.co.jp> writes:
> BTW it is not 2 gig, but 1 gig (remember that we do sortmembytes *
> 2).
Good point. Probably that particular calculation should be
"sortmembytes * 2.0" to force it to double before it can overflow.
But I still think we'd better limit SortMem so that the basic
SortMem*1024 calculation can't overflow (or even come close to overflow,
likely).
regards, tom lane
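As a small aside, the difference between the two spellings can be seen with
a toy comparison (again a sketch assuming a 32-bit long, not backend code):
the integer constant 2 keeps the multiply in long arithmetic, while 2.0
promotes sortmembytes to double before multiplying, so the intermediate
value cannot wrap.

    #include <stdio.h>

    int
    main(void)
    {
        double  nbytes = 8.0e9;                 /* pretend 8 GB of input */
        long    sortmembytes = 1073741824L;     /* 2^30 bytes, i.e. 1 GB */

        /* "* 2" stays in long arithmetic and can wrap on a 32-bit long ... */
        double  risky = nbytes / (sortmembytes * 2);
        /* ... while "* 2.0" promotes sortmembytes to double before multiplying. */
        double  safe = nbytes / (sortmembytes * 2.0);

        printf("risky = %g, safe = %g\n", risky, safe);
        return 0;
    }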
Tom wrote:
>
> I find it really really hard to believe that it's wise to run with
> sort_mem exceeding 2 gig ;-). Does that installation have so much
> RAM that it can afford to run multiple many-Gb sorts concurrently?

I don't do 2 gig... but I found 0.3 gig helped on a not-too-large system.

In a nightly load of a data warehousing application I have a number of
places where I do:

    set sort_mem=300000;
    create table new_whatevers as select distinct whatever from import_table;
    set sort_mem=10000;

and

    set sort_mem=100000;
    select count(*),b from a group by b;
    set sort_mem=10000;

when the optimizer likes hash aggregates. It significantly (8 hours ->
6 hours) reduces the nightly processing of my log-file analysis database.

If my modest system benefited from 1/3 gig, it wouldn't surprise me if a
large system benefits from 2 gig.

If more info's useful, I'd be happy to provide some.

    Ron
On Mon, 31 Mar 2003, Tom Lane wrote:

> Tatsuo Ishii <t-ishii@sra.co.jp> writes:
> > BTW it is not 2 gig, but 1 gig (remember that we do sortmembytes *
> > 2).
>
> Good point.  Probably that particular calculation should be
> "sortmembytes * 2.0" to force it to double before it can overflow.
> But I still think we'd better limit SortMem so that the basic
> SortMem*1024 calculation can't overflow (or even come close to overflow,
> likely).

This isn't really an issue for 64 bit hardware is it?
"scott.marlowe" <scott.marlowe@ihs.com> writes:
> This isn't really an issue for 64 bit hardware is it?
Is "int" 64 bits on such a machine? The ones I've dealt with chose to
set int = 32bits, long = 64bits. If they don't do that then they have
a problem with not having any native C 32bit type (and promoting short
int up to 32 bits just moves the problem...)
At some point we should probably try to regularize the backend code so
that all memory-size-related calcs are consistently done in size_t or
ssize_t arithmetic; but we're a long way from that at present. For now
I think it's prudent to keep sort_mem small enough to avoid overflow in
int32 arithmetic.
regards, tom lane
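A trivial probe (a sketch, not backend code) shows why 64-bit hardware alone
does not settle the question: common LP64 platforms keep int at 32 bits even
though long and size_t are 64 bits, so int arithmetic on byte counts can
still overflow there.

    #include <stdio.h>
    #include <stddef.h>

    int
    main(void)
    {
        /* On typical LP64 systems this prints 4, 8, 8: int is still 32 bits. */
        printf("sizeof(int)    = %zu\n", sizeof(int));
        printf("sizeof(long)   = %zu\n", sizeof(long));
        printf("sizeof(size_t) = %zu\n", sizeof(size_t));
        return 0;
    }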
On Tue, Apr 01, 2003 at 11:24:01AM -0500, Tom Lane wrote:
> "scott.marlowe" <scott.marlowe@ihs.com> writes:
> > This isn't really an issue for 64 bit hardware is it?
>
> Is "int" 64 bits on such a machine?  The ones I've dealt with chose to

It was for the regression tests we did recently on a 64-bit
gcc-3.2-compiled Solaris 8 binary.  Piles of regression failures because
int was not 32 bits.

A

--
----
Andrew Sullivan                         204-4141 Yonge Street
Liberty RMS                             Toronto, Ontario Canada
<andrew@libertyrms.info>                              M2P 2A8
                                        +1 416 646 3304 x110