Discussion: Weird issue with planner choosing seq scan


Weird issue with planner choosing seq scan

From: Sean Leach
Date:
I have a table that, in production, currently has a little over 3
million records.  In development, the same table has about 10
million records (we cleaned production a few weeks ago).  One of my
queries joins this table with another, and in development, the
particular condition uses an IndexScan on the "stamp" column (the
time this record was inserted), which makes it run fast.  In
production, however (a different machine with similar specs and
postgresql settings), the planner uses a seq scan on that table,
causing the query performance to be abysmal (sometimes over 10
seconds in production, around 1 second in development).  What can I
do to tweak or troubleshoot this?  I have another table with a
similar structure that has the same issue.  Thanks!!!

Here is the query:

SELECT node,count(*) AS counts FROM u_counts c,res r WHERE
c.res_id=r.id AND stamp > (current_timestamp - interval '1 day') AND
r.rtype='u' AND r.location=1 GROUP BY node;

The tables have an index on u_counts.res_id, u_counts.stamp,
res.location, and res.rtype
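
For reference, the index definitions would look roughly like this (a
sketch; only u_counts_i2 is actually named in the plans below, the
other index names are illustrative):

CREATE INDEX u_counts_i2 ON u_counts (stamp);
CREATE INDEX u_counts_res_id_idx ON u_counts (res_id);
CREATE INDEX res_location_idx ON res (location);
CREATE INDEX res_rtype_idx ON res (rtype);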

Here is the production explain analyze:

HashAggregate  (cost=472824.67..472824.77 rows=8 width=6) (actual
time=12482.856..12482.872 rows=9 loops=1)
    ->  Hash Join  (cost=16.71..471847.28 rows=195479 width=6) (actual
time=1217.532..10618.930 rows=1035998 loops=1)
          Hash Cond: (c.res_id = r.id)
          ->  Seq Scan on u_counts c  (cost=0.00..466319.96
rows=948218 width=4) (actual time=1217.183..7343.507 rows=1035998
loops=1)
                Filter: (stamp > (now() - '1 day'::interval))
          ->  Hash  (cost=15.88..15.88 rows=67 width=10) (actual
time=0.299..0.299 rows=60 loops=1)
                ->  Seq Scan on res r  (cost=0.00..15.88 rows=67
width=10) (actual time=0.027..0.195 rows=60 loops=1)
                      Filter: (((rtype)::text = 'u'::text) AND
(location = 1))
  Total runtime: 12482.961 ms


Here is the development explain analyze:

  HashAggregate  (cost=72.91..73.02 rows=9 width=6) (actual
time=3108.793..3108.807 rows=9 loops=1)
    ->  Hash Join  (cost=10.42..71.27 rows=327 width=6) (actual
time=0.608..2446.714 rows=392173 loops=1)
          Hash Cond: (c.res_id = r.id)
          ->  Index Scan using u_counts_i2 on u_counts c
(cost=0.00..53.53 rows=1082 width=4) (actual time=0.277..1224.582
rows=392173 loops=1)
                Index Cond: (stamp > (now() - '1 day'::interval))
          ->  Hash  (cost=9.53..9.53 rows=71 width=10) (actual
time=0.310..0.310 rows=78 loops=1)
                ->  Seq Scan on res r  (cost=0.00..9.53 rows=71
width=10) (actual time=0.010..0.189 rows=78 loops=1)
                      Filter: (((rtype)::text = 'u'::text) AND
(location = 1))
  Total runtime: 3108.891 ms


Re: Weird issue with planner choosing seq scan

From: Tom Lane
Date:
Sean Leach <sleach@wiggum.com> writes:
> I have a table that, in production, currently has a little over 3
> million records.  In development, the same table has about 10
> million records (we cleaned production a few weeks ago).

You mean the other way around, to judge by the rowcounts from EXPLAIN.

>           ->  Index Scan using u_counts_i2 on u_counts c
> (cost=0.00..53.53 rows=1082 width=4) (actual time=0.277..1224.582
> rows=392173 loops=1)

I kinda think the devel system wouldn't be using an indexscan either
if it had up-to-date ANALYZE statistics.  But even with the 1082 row
estimate that seems a remarkably low cost estimate.  Have you been
playing games with random_page_cost?  Maybe you forgot to duplicate the
devel system's cost parameters onto the production system?
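
One quick way to check is to dump each machine's non-default settings
and diff the two lists (a sketch; pg_settings is a standard system
view):

SELECT name, setting FROM pg_settings
 WHERE source <> 'default' ORDER BY name;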

            regards, tom lane

Re: Weird issue with planner choosing seq scan

From: Sean Leach
Date:
Nope, seems like that would make sense, but dev is 10 million and
prod is 3 million.  I'm also including random_page_cost below.
Thanks for any help.


Here is dev:

db=> analyze u_counts;
ANALYZE
Time: 15775.161 ms

db=> select count(1) from u_counts;
   count
----------
  10972078
(1 row)

db=> show random_page_cost;
  random_page_cost
------------------
  4
(1 row)

Time: 0.543 ms
db=> explain analyze SELECT node,count(*) AS counts FROM u_counts
c,res r WHERE c.res_id=r.id AND stamp > (current_timestamp - interval
'1 day') AND r.rtype='udns' AND r.location=1 GROUP BY node;
                                                                QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------
  HashAggregate  (cost=12906.12..12906.24 rows=9 width=6) (actual
time=3135.831..3135.845 rows=9 loops=1)
    ->  Hash Join  (cost=10.42..12538.88 rows=73449 width=6) (actual
time=0.746..2475.632 rows=391380 loops=1)
          Hash Cond: (c.res_id = r.id)
          ->  Index Scan using u_counts_i2 on u_counts c
(cost=0.00..10882.33 rows=243105 width=4) (actual time=0.287..1269.651
rows=391380 loops=1)
                Index Cond: (stamp > (now() - '1 day'::interval))
          ->  Hash  (cost=9.53..9.53 rows=71 width=10) (actual
time=0.430..0.430 rows=78 loops=1)
                ->  Seq Scan on res r  (cost=0.00..9.53 rows=71
width=10) (actual time=0.021..0.203 rows=78 loops=1)
                      Filter: (((rtype)::text = 'udns'::text) AND
(location = 1))
  Total runtime: 3136.000 ms




Now - here is prod:


db=> show random_page_cost;
  random_page_cost
------------------
  4
(1 row)

Time: 0.434 ms

db=> analyze u_counts;
ANALYZE
Time: 179.928 ms

db=> select count(1) from u_counts;
   count
---------
  3292215
(1 row)


db=> explain analyze SELECT node,count(*) AS counts FROM u_counts
c,res r WHERE c.res_id=r.id AND stamp > (current_timestamp - interval
'1 day') AND r.rtype='udns' AND r.location=1 GROUP BY node;
                                                            QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
  HashAggregate  (cost=452333.49..452333.59 rows=8 width=6) (actual
time=13200.887..13200.902 rows=9 loops=1)
    ->  Hash Join  (cost=16.71..451192.74 rows=228149 width=6) (actual
time=1430.458..11274.073 rows=1036015 loops=1)
          Hash Cond: (c.res_id = r.id)
          ->  Seq Scan on u_counts c  (cost=0.00..444744.45
rows=1106691 width=4) (actual time=1429.996..7893.178 rows=1036015
loops=1)
                Filter: (stamp > (now() - '1 day'::interval))
          ->  Hash  (cost=15.88..15.88 rows=67 width=10) (actual
time=0.363..0.363 rows=60 loops=1)
                ->  Seq Scan on res r  (cost=0.00..15.88 rows=67
width=10) (actual time=0.046..0.258 rows=60 loops=1)
                      Filter: (((rtype)::text = 'udns'::text) AND
(location = 1))
  Total runtime: 13201.046 ms
(9 rows)

Time: 13204.686 ms

On Feb 24, 2008, at 9:50 AM, Tom Lane wrote:

> Sean Leach <sleach@wiggum.com> writes:
>> I have a table that, in production, currently has a little over 3
>> million records.  In development, the same table has about 10
>> million records (we cleaned production a few weeks ago).
>
> You mean the other way around, to judge by the rowcounts from EXPLAIN.
>
>>          ->  Index Scan using u_counts_i2 on u_counts c
>> (cost=0.00..53.53 rows=1082 width=4) (actual time=0.277..1224.582
>> rows=392173 loops=1)
>
> I kinda think the devel system wouldn't be using an indexscan either
> if it had up-to-date ANALYZE statistics.  But even with the 1082 row
> estimate that seems a remarkably low cost estimate.  Have you been
> playing games with random_page_cost?  Maybe you forgot to duplicate
> the devel system's cost parameters onto the production system?
>
>             regards, tom lane


Re: Weird issue with planner choosing seq scan

From: Tom Lane
Date:
Sean Leach <sleach@wiggum.com> writes:
> Now - here is prod:

> db=> select count(1) from u_counts;
>    count
> ---------
>   3292215
> (1 row)


>           ->  Seq Scan on u_counts c  (cost=0.00..444744.45
> rows=1106691 width=4) (actual time=1429.996..7893.178 rows=1036015
> loops=1)
>                 Filter: (stamp > (now() - '1 day'::interval))

Given that this scan actually is selecting about a third of the table,
I'm not sure that the planner is doing the wrong thing.  It's hard to
see how an indexscan would be an improvement.

[ thinks for a bit... ]  Actually, the problem might be the 3M
executions of now() and interval subtraction that you get in the seqscan
case.  What results do you get if you write it with a sub-select like this:

explain analyze SELECT node,count(*) AS counts FROM u_counts
c,res r WHERE c.res_id=r.id AND stamp > (SELECT current_timestamp - interval
'1 day') AND r.rtype='udns' AND r.location=1 GROUP BY node;

            regards, tom lane

Re: Weird issue with planner choosing seq scan

From: Sean Leach
Date:
On Feb 24, 2008, at 11:10 AM, Tom Lane wrote:

> Sean Leach <sleach@wiggum.com> writes:
>> Now - here is prod:
>
>> db=> select count(1) from u_counts;
>>   count
>> ---------
>>  3292215
>> (1 row)
>
>
>>          ->  Seq Scan on u_counts c  (cost=0.00..444744.45
>> rows=1106691 width=4) (actual time=1429.996..7893.178 rows=1036015
>> loops=1)
>>                Filter: (stamp > (now() - '1 day'::interval))
>
> Given that this scan actually is selecting about a third of the table,
> I'm not sure that the planner is doing the wrong thing.  It's hard to
> see how an indexscan would be an improvement.
>
> [ thinks for a bit... ]  Actually, the problem might be the 3M
> executions of now() and interval subtraction that you get in the
> seqscan
> case.  What results do you get if you write it with a sub-select
> like this:
>
> explain analyze SELECT node,count(*) AS counts FROM u_counts
> c,res r WHERE c.res_id=r.id AND stamp > (SELECT current_timestamp -
> interval
> '1 day') AND r.rtype='udns' AND r.location=1 GROUP BY node;


Unfortunately, the result is the same: dev uses an index scan, prod
uses a seq scan, and prod takes about 4x longer to do the query.  Any
other thoughts on the best way to proceed?  Thanks again Tom.




>
>
>             regards, tom lane


Re: Weird issue with planner choosing seq scan

From: "Stephen Denne"
Date:
Tom Lane wrote
> Sean Leach <sleach@wiggum.com> writes:
> > Now - here is prod:
>
> > db=> select count(1) from u_counts;
> >    count
> > ---------
> >   3292215
> > (1 row)
>
>
> >           ->  Seq Scan on u_counts c  (cost=0.00..444744.45
> > rows=1106691 width=4) (actual time=1429.996..7893.178 rows=1036015
> > loops=1)
> >                 Filter: (stamp > (now() - '1 day'::interval))
>
> Given that this scan actually is selecting about a third of the table,
> I'm not sure that the planner is doing the wrong thing.  It's hard to
> see how an indexscan would be an improvement.

If you always get around a third of the rows in your table written in
the last day, you've got to be deleting about a third of the rows in
your table every day too.  You might have a huge number of dead rows
in your table, slowing down the sequential scan.
(Likewise updating a third of the rows, changing an indexed field.)

What do you get from:
VACUUM VERBOSE u_counts;

Regards,
Stephen Denne.


Re: Weird issue with planner choosing seq scan

From: Sean Leach
Date:
On Feb 24, 2008, at 1:18 PM, Stephen Denne wrote:
> If you always get around a third of the rows in your table written
> in the last day, you've got to be deleting about a third of the rows
> in your table every day too. You might have a huge number of dead
> rows in your table, slowing down the sequential scan.
> (Likewise updating a third of the rows, changing an indexed field.)
>
> What do you get from:
> VACUUM VERBOSE u_counts;


This actually makes sense, as we aggregate the production rows (but
not development); the output of vacuum verbose is below.  We have the
autovacuum daemon on, but after we do our aggregation (we aggregate
rows down to a less granular time scale, similar to what rrdtool
does), we should probably do a 'vacuum full analyze' moving forward
after each aggregation run, right?

It appears I need to do one now, but I am assuming it will take a
_long_ time... I might need to schedule some downtime if it will.
Even without a full vacuum, the query seems to have come down from
20-30s to 5s.
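
For the record, the command I have in mind after each aggregation run
is something like (a sketch; note that VACUUM FULL takes an exclusive
lock on the table for the duration):

VACUUM FULL ANALYZE u_counts;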

db=> VACUUM VERBOSE u_counts;
INFO:  vacuuming "public.u_counts"
INFO:  index "u_counts_pkey" now contains 5569556 row versions in
73992 pages
DETAIL:  0 index row versions were removed.
57922 index pages have been deleted, 57922 are currently reusable.
CPU 0.59s/0.09u sec elapsed 3.73 sec.
INFO:  index "u_counts_i1" now contains 5569556 row versions in 76820
pages
DETAIL:  0 index row versions were removed.
54860 index pages have been deleted, 54860 are currently reusable.
CPU 1.04s/0.16u sec elapsed 20.10 sec.
INFO:  index "u_counts_i2" now contains 5569556 row versions in 77489
pages
DETAIL:  0 index row versions were removed.
53708 index pages have been deleted, 53708 are currently reusable.
CPU 0.70s/0.10u sec elapsed 5.41 sec.
INFO:  index "u_counts_i3" now contains 5569556 row versions in 76900
pages
DETAIL:  0 index row versions were removed.
55564 index pages have been deleted, 55564 are currently reusable.
CPU 0.94s/0.13u sec elapsed 20.34 sec.
INFO:  "u_counts": found 0 removable, 5569556 nonremovable row
versions in 382344 pages
DETAIL:  2085075 dead row versions cannot be removed yet.
There were 15567992 unused item pointers.
281727 pages contain useful free space.
0 pages are entirely empty.
CPU 5.24s/1.77u sec elapsed 53.69 sec.
WARNING:  relation "public.u_counts" contains more than
"max_fsm_pages" pages with useful free space
HINT:  Consider using VACUUM FULL on this relation or increasing the
configuration parameter "max_fsm_pages".
VACUUM
Time: 53758.329 ms



Re: Weird issue with planner choosing seq scan

From: "Scott Marlowe"
Date:
The fact that your indexes are bloated but your table is not makes me
wonder if you're not running a really old version of pgsql that had
problems with monotonically increasing indexes bloating over time and
requiring reindexing.

That problem has been (for the most part) solved by some hacking Tom
Lane did some time back.

What version pgsql is this?  If it's pre 8.0 it might be worth looking
into migrating for performance and maintenance reasons.
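
If it does turn out to be index bloat, rebuilding the indexes is the
usual fix (a sketch; REINDEX locks out writes to the table while it
runs):

REINDEX TABLE u_counts;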

Re: Weird issue with planner choosing seq scan

From: Sean Leach
Date:
On Feb 24, 2008, at 4:03 PM, Scott Marlowe wrote:

> The fact that your indexes are bloated but your table is not makes me
> wonder if you're not running a really old version of pgsql that had
> problems with monotonically increasing indexes bloating over time and
> requiring reindexing.
>
> That problem has been (for the most part) solved by some hacking Tom
> Lane did some time back.
>
> What version pgsql is this?  If it's pre 8.0 it might be worth looking
> into migrating for performance and maintenance reasons.

It's the latest 8.3.0 release :(

Re: Weird issue with planner choosing seq scan

From: "Scott Marlowe"
Date:
On Sun, Feb 24, 2008 at 6:05 PM, Sean Leach <sleach@wiggum.com> wrote:

>  On Feb 24, 2008, at 4:03 PM, Scott Marlowe wrote:
>
>  >
>  > What version pgsql is this?  If it's pre 8.0 it might be worth looking
>  > into migrating for performance and maintenance reasons.
>
>  It's the latest 8.3.0 release :(

Urg.  Then I wonder how your indexes are bloating but your table is
not...  you got autovac running?  No weird lock issues?  It's a side
issue right now since the table is showing as non-bloated (unless
you've got a long running transaction and that number is WAY off from
your vacuum)

Re: Weird issue with planner choosing seq scan

From: Sean Leach
Date:
On Feb 24, 2008, at 4:27 PM, Scott Marlowe wrote:

> On Sun, Feb 24, 2008 at 6:05 PM, Sean Leach <sleach@wiggum.com> wrote:
>
>> On Feb 24, 2008, at 4:03 PM, Scott Marlowe wrote:
>>
>>>
>>> What version pgsql is this?  If it's pre 8.0 it might be worth
>>> looking
>>> into migrating for performance and maintenance reasons.
>>
>> It's the latest 8.3.0 release :(
>
> Urg.  Then I wonder how your indexes are bloating but your table is
> not...  you got autovac running?  No weird lock issues?  It's a side
> issue right now since the table is showing as non-bloated (unless
> you've got a long running transaction and that number is WAY off from
> your vacuum)


Autovac is running, but probably not tuned.  I am looking at upping
my max_fsm_pages setting as vacuum suggests, but am not sure which
value to use (all the posts on the web refer to what looks like an
old vacuum output format).  Is this the line to look at?
INFO:  "u_counts": found 0 removable, 6214708 nonremovable row
versions in 382344 pages
DETAIL:  2085075 dead row versions cannot be removed yet.

I.e. I need 382344 max_fsm_pages?  No weird lock issues that we have
seen.
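
If so, I assume the change would just be this in postgresql.conf (a
sketch; as I understand it, changing max_fsm_pages in 8.3 requires a
server restart, and this value is a placeholder pending advice):

max_fsm_pages = 400000   # hypothetical value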

So should I do a vacuum full and then hope this doesn't happen again?
Or should I run a VACUUM FULL after each aggregation run?

Thanks!
Sean


Re: Weird issue with planner choosing seq scan

From: Matthew
Date:
On Sun, 24 Feb 2008, Tom Lane wrote:
> Sean Leach <sleach@wiggum.com> writes:
>> I have a table that, in production, currently has a little over 3
>> million records.  In development, the same table has about 10
>> million records (we cleaned production a few weeks ago).
>
> You mean the other way around, to judge by the rowcounts from EXPLAIN.
>
>>           ->  Index Scan using u_counts_i2 on u_counts c
>> (cost=0.00..53.53 rows=1082 width=4) (actual time=0.277..1224.582
>> rows=392173 loops=1)
>
> I kinda think the devel system wouldn't be using an indexscan either
> if it had up-to-date ANALYZE statistics.  But even with the 1082 row
> estimate that seems a remarkably low cost estimate.

Seems pretty obvious to me. The table is obviously going to be well
ordered by the timestamp, if that's the time that the entries are inserted
into the table. So the index is going to have a very good correlation with
the order of the table, which is why the estimated cost for the index scan
is so low. The production table will be more active than the development
table, so the entries in it will be more recent. The entries that were
cleaned out a while ago are all irrelevant, because they will be old ones,
and we are specifically searching for new entries. Because the production
table is more active, even though it is smaller, the results of the search
will be bigger (as seen in the explain analyse results), pushing it over
the limit and making a sequential scan more attractive.
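
The correlation the planner sees can be checked directly (a sketch;
pg_stats is a standard system view):

SELECT attname, correlation FROM pg_stats
 WHERE tablename = 'u_counts' AND attname = 'stamp';

A correlation near 1.0 means the table is physically close to stamp
order, which is what makes an index scan over a range of stamps cheap.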

Matthew

--
Failure is not an option. It comes bundled with your Microsoft product.
                                                 -- Ferenc Mantfeld

Re: Weird issue with planner choosing seq scan

From: "Stephen Denne"
Date:
Sean Leach wrote:
> On Feb 24, 2008, at 4:27 PM, Scott Marlowe wrote:
>
> >
> > Urg.  Then I wonder how your indexes are bloating but your table is
> > not...  you got autovac running?  No weird lock issues?  It's a side
> > issue right now since the table is showing as non-bloated (unless
> > you've got a long running transaction and that number is
> WAY off from
> > your vacuum)
>
>
> Autovac is running, but probably not tuned.  I am looking at my
> max_fsm_pages setting to up as vacuum says, but not sure which value
> to use (all the posts on the web refer to what looks like an old
> vacuum output format), is this the line to look at?
>
> INFO:  "u_counts": found 0 removable, 6214708 nonremovable row
> versions in 382344 pages
> DETAIL:  2085075 dead row versions cannot be removed yet.
>
> I.e. I need 382344 max_fsm_pages?  No weird lock issues that we have
> seen.

I think the hint and warning are referring to this line:
> 281727 pages contain useful free space.

But you're likely to have other relations in your database that have useful free space too.

What this warning is saying is that at least some of the useful free
space in that table will not be re-used for new rows or row versions,
because it is impossible for the free space map to have references to
all of the pages with usable space, since it is too small to hold
that much information.

> So should I do a vacuum full and then hope this doesn't
> happen again?
> Or should I run a VACUUM FULL after each aggregation run?

If your usage pattern results in generating all of that unused space
in one transaction, and no further inserts or updates to that table
till next time you run the same process, then my guess is that you
probably should run a vacuum full on that table after each
aggregation run.  In that case you wouldn't have to increase
max_fsm_pages solely to keep track of large amount of unused space in
that table, since you're cleaning it up as soon as you're generating
it.

You earlier had 5.5 million row versions, 2 million of them dead but
not yet removable, and you said (even earlier) that the table had 3.3
million rows in it.
You now say you've got 6.2 million row versions (with the same 2M
dead).  So it looks like you're creating new row versions at quite a
pace, in which case increasing max_fsm_pages, and not worrying about
doing a vacuum full _every_ time is probably a good idea.

Have you checked Scott Marlowe's note:

> > unless you've got a long running transaction

How come those 2 million dead rows are not removable yet?  My guess
(based on a quick search of the mailing lists) would be that they
were generated from your aggregation run, and that a long running
transaction started before your aggregation run committed (possibly
even before it started), and that transaction is still alive.

Alternatively, it may be a different 2 million dead row versions now
than earlier, and may simply be a side effect of your particular
usage, and nothing to worry about.  (Though it is exactly the same
number of rows, which strongly hints at being exactly the same rows.)
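
A quick way to look for such a transaction (a sketch; these columns
are from 8.3's pg_stat_activity):

SELECT procpid, xact_start, current_query
  FROM pg_stat_activity
 ORDER BY xact_start;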

Regards,
Stephen Denne.


Re: Weird issue with planner choosing seq scan

From: Sean Leach
Date:
On Feb 25, 2008, at 1:19 PM, Stephen Denne wrote:
>
>> So should I do a vacuum full and then hope this doesn't
>> happen again?
>> Or should I run a VACUUM FULL after each aggregation run?
>
> If your usage pattern results in generating all of that unused space
> in one transaction, and no further inserts or updates to that table
> till next time you run the same process, then my guess is that you
> probably should run a vacuum full on that table after each
> aggregation run. In that case you wouldn't have to increase
> max_fsm_pages solely to keep track of large amount of unused space
> in that table, since you're cleaning it up as soon as you're
> generating it.
>
> You earlier had 5.5 million row versions, 2 million of them dead but
> not yet removable, and you said (even earlier) that the table had
> 3.3 million rows in it.
> You now say you've got 6.2 million row versions (with the same 2M
> dead). So it looks like you're creating new row versions at quite a
> pace, in which case increasing max_fsm_pages, and not worrying about
> doing a vacuum full _every_ time is probably a good idea.

So 281727 should be the minimum I bump it to, correct?


>
>
> Have you checked Scott Marlowe's note:
>
>>> unless you've got a long running transaction
>
> How come those 2 million dead rows are not removable yet? My guess
> (based on a quick search of the mailing lists) would be that they
> were generated from your aggregation run, and that a long running
> transaction started before your aggregation run committed (possibly
> even before it started), and that transaction is still alive.
>
> Alternatively, it may be a different 2 million dead row versions now
> than earlier, and may simply be a side effect of your particular
> usage, and nothing to worry about. (Though it is exactly the same
> number of rows, which strongly hints at being exactly the same rows.)


Great detective work, you are correct.  We have a daemon that runs
and is constantly adding new data to that table; then we aggregate it
daily (I said weekly before, I was incorrect), which deletes several
rows as it updates a bunch of others.  So it sounds like upping
max_fsm_pages is the best option.

Thanks again everyone!



Re: Weird issue with planner choosing seq scan

From: "Stephen Denne"
Date:
Sean Leach wrote
> On Feb 25, 2008, at 1:19 PM, Stephen Denne wrote:
> >
> >> So should I do a vacuum full and then hope this doesn't
> >> happen again?
> >> Or should I run a VACUUM FULL after each aggregation run?
> >
> > If your usage pattern results in generating all of that
> unused space
> > in one transaction, and no further inserts or updates to
> that table
> > till next time you run the same process, then my guess is that you
> > probably should run a vacuum full on that table after each
> > aggregation run. In that case you wouldn't have to increase
> > max_fsm_pages solely to keep track of large amount of unused space
> > in that table, since you're cleaning it up as soon as you're
> > generating it.
> >
> > You earlier had 5.5 million row versions, 2 million of them
> dead but
> > not yet removable, and you said (even earlier) that the table had
> > 3.3 million rows in it.
> > You now say you've got 6.2 million row versions (with the same 2M
> > dead). So it looks like you're creating new row versions at
> quite a
> > pace, in which case increasing max_fsm_pages, and not
> worrying about
> > doing a vacuum full _every_ time is probably a good idea.
>
> So 281727 should be the minimum I bump it to correct?

Please know that I'm very new at advising PostgreSQL users how they should tune their system...

My understanding of your vacuum verbose output was that it was
pointing out that max_fsm_pages was currently smaller than 281727, so
therefore there was no way it could contain mappings to all the
reusable space.  However I don't think it is hinting at, nor
recommending a value that you should be using.

If you do nothing, then this number of pages with reusable space will
probably continue to grow, therefore, it probably has been growing.

So, for example, if your max_fsm_pages is currently only 20000, then
perhaps 20000 of the 281727 pages with reusable space are in the free
space map.  The remaining 260000 pages _may_ have been generated
through 20 different processes each of which created 13000 more pages
with reusable space than the map could reference.  If that was the
case, then a max_fsm_pages of 33000 might be large enough.

Do you see what I'm getting at?
I think that you should do a vacuum full of that table once, then
monitor the number of pages in it with reusable space for a while
(over a few iterations of your regular processes).  That should give
you information about how much larger your max_fsm_pages should be
than it currently is.
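
For example, something like this after each aggregation run, noting
the trend of the reported free space over a few days (a sketch):

VACUUM VERBOSE u_counts;
-- watch the "NNN pages contain useful free space" line each time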

Regards,
Stephen Denne.


Re: Weird issue with planner choosing seq scan

From: "Stephen Denne"
Date:
Sean Leach wrote
> On Feb 25, 2008, at 1:19 PM, Stephen Denne wrote:
> >
> >
> > Have you checked Scott Marlowe's note:
> >
> >>> unless you've got a long running transaction
> >
> > How come those 2 million dead rows are not removable yet? My guess
> > (based on a quick search of the mailing lists) would be that they
> > were generated from your aggregation run, and that a long running
> > transaction started before your aggregation run committed
> (possibly
> > even before it started), and that transaction is still alive.
> >
> > Alternatively, it may be a different 2 million dead row
> versions now
> > than earlier, and may simply be a side effect of your particular
> > usage, and nothing to worry about. (Though it is exactly the same
> > number of rows, which strongly hints at being exactly the
> same rows.)
>
>
> Great detective work, you are correct.  We have a daemon that
> runs and
> is constantly adding new data to that table, then we aggregated it
> daily (I said weekly before, I was incorrect) - which deletes
> several
> rows as it updates a bunch of others.  So it sounds like upping
> max_fsm_pages is the best option.

but... do you have a long running transaction?  Are you happy having
30% to 40% of your table unusable (needlessly?) and slowing down your
sequential scans?

Regards,
Stephen Denne.


Re: Weird issue with planner choosing seq scan

From: Sean Leach
Date:
On Feb 25, 2008, at 2:59 PM, Stephen Denne wrote:
>>
>
> Please know that I'm very new at advising PostgreSQL users how they
> should tune their system...

I'd never have known it if you hadn't said anything.

>
>
> My understanding of your vacuum verbose output was that it was
> pointing out that max_fsm_pages was currently smaller than 281727,
> so therefore there was no way it could contain mappings to all the
> reusable space. However I don't think it is hinting at, nor
> recommending a value that you should be using.
>
> If you do nothing, then this number of pages with reusable space
> will probably continue to grow, therefore, it probably has been
> growing.
>
> So, for example, if your max_fsm_pages is currently only 20000, then
> perhaps 20000 of the 281727 pages with reusable space are in the
> free space map. The remaining 260000 pages _may_ have been generated
> through 20 different processes each of which created 13000 more
> pages with reusable space than the map could reference. If that was
> the case, then a max_fsm_pages of 33000 might be large enough.
>
> Do you see what I'm getting at?
> I think that you should do a vacuum full of that table once, then
> monitor the number of pages in it with reusable space for a while
> (over a few iterations of your regular processes). That should give
> you information about how much larger your max_fsm_pages should be
> than it currently is.

This sounds sane to me, will do.  Thanks again!