Thread: Slow query with a lot of data


Slow query with a lot of data

From
Moritz Onken
Date:
Hi,

I run this query:

select max(a."user"), b.category, count(1) from result a,
domain_categories b where a."domain" = b."domain" group by b.category;

The table result contains all websites a user visited, and the table
domain_categories contains all categories a domain is in.
result has 20 million rows and domain_categories about 12 million. There
are 500,000 different users.

I have indexes on result.domain, domain_categories.domain,
result.user, domain_categories.category. Clustered result on user and
domain_categories on domain.
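For reference, the setup roughly looks like this (a sketch reconstructed
from the description above; index names and exact DDL are assumptions):

CREATE INDEX result_user_idx ON result ("user");
CREATE INDEX result_domain_idx ON result ("domain");
CREATE INDEX domain_categories_domain ON domain_categories ("domain");
CREATE INDEX domain_categories_category ON domain_categories (category);

CLUSTER result USING result_user_idx;                       -- result clustered on user
CLUSTER domain_categories USING domain_categories_domain;   -- domain_categories clustered on domain
ANALYZE result;
ANALYZE domain_categories;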

explain analyze says (limited to one user with id 1337):

"HashAggregate  (cost=2441577.16..2441614.72 rows=2504 width=8)
(actual time=94667.335..94671.508 rows=3361 loops=1)"
"  ->  Merge Join  (cost=2119158.02..2334105.00 rows=14329622 width=8)
(actual time=63559.938..94621.557 rows=36308 loops=1)"
"        Merge Cond: (a.domain = b.domain)"
"        ->  Sort  (cost=395.52..405.49 rows=3985 width=8) (actual
time=0.189..0.211 rows=19 loops=1)"
"              Sort Key: a.domain"
"              Sort Method:  quicksort  Memory: 27kB"
"              ->  Index Scan using result_user_idx on result a
(cost=0.00..157.21 rows=3985 width=8) (actual time=0.027..0.108
rows=61 loops=1)"
"                    Index Cond: ("user" = 1337)"
"        ->  Materialize  (cost=2118752.28..2270064.64 rows=12104989
width=8) (actual time=46460.599..82336.116 rows=12123161 loops=1)"
"              ->  Sort  (cost=2118752.28..2149014.75 rows=12104989
width=8) (actual time=46460.592..59595.851 rows=12104989 loops=1)"
"                    Sort Key: b.domain"
"                    Sort Method:  external sort  Disk: 283992kB"
"                    ->  Seq Scan on domain_categories b
(cost=0.00..198151.89 rows=12104989 width=8) (actual
time=14.352..22572.869 rows=12104989 loops=1)"
"Total runtime: 94817.058 ms"

This is running on a pretty small server with 1 GB of RAM and a slow
SATA HD. shared_buffers is 312MB, max_fsm_pages = 153600. Everything
else is commented out. PostgreSQL v8.3.3, operating system Ubuntu 8.04.

It would be great if someone could help improve this query. This is
for a research project at my university.

Thanks in advance,

Moritz


Re: Slow query with a lot of data

From
Matthew Wakeling
Date:
On Mon, 18 Aug 2008, Moritz Onken wrote:
> I have indexes on result.domain, domain_categories.domain, result.user,
> domain_categories.category. Clustered result on user and domain_categories on
> domain.

> "        ->  Materialize  (cost=2118752.28..2270064.64 rows=12104989 width=8)
> (actual time=46460.599..82336.116 rows=12123161 loops=1)"
> "              ->  Sort  (cost=2118752.28..2149014.75 rows=12104989 width=8)
> (actual time=46460.592..59595.851 rows=12104989 loops=1)"
> "                    Sort Key: b.domain"
> "                    Sort Method:  external sort  Disk: 283992kB"
> "                    ->  Seq Scan on domain_categories b
> (cost=0.00..198151.89 rows=12104989 width=8) (actual time=14.352..22572.869
> rows=12104989 loops=1)"

This is weird, given you say you have clustered domain_categories on
domain. Have you analysed? You should be able to run:

EXPLAIN SELECT * from domain_categories ORDER BY domain

and have it say "Index Scan" instead of "Seq Scan followed by disc sort".
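Something like this, as a sketch (the index name is an assumption,
substitute whatever the clustered index is actually called):

CLUSTER domain_categories USING domain_categories_domain;  -- skip if it is already clustered
ANALYZE domain_categories;
EXPLAIN SELECT * FROM domain_categories ORDER BY domain;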

Matthew

--
Patron: "I am looking for a globe of the earth."
Librarian: "We have a table-top model over here."
Patron: "No, that's not good enough. Don't you have a life-size?"
Librarian: (pause) "Yes, but it's in use right now."

Re: Slow query with a lot of data

From
Moritz Onken
Date:
On 18.08.2008, at 16:30, Matthew Wakeling wrote:

> On Mon, 18 Aug 2008, Moritz Onken wrote:
>> I have indexes on result.domain, domain_categories.domain,
>> result.user, domain_categories.category. Clustered result on user
>> and domain_categories on domain.
>
>> "        ->  Materialize  (cost=2118752.28..2270064.64
>> rows=12104989 width=8) (actual time=46460.599..82336.116
>> rows=12123161 loops=1)"
>> "              ->  Sort  (cost=2118752.28..2149014.75 rows=12104989
>> width=8) (actual time=46460.592..59595.851 rows=12104989 loops=1)"
>> "                    Sort Key: b.domain"
>> "                    Sort Method:  external sort  Disk: 283992kB"
>> "                    ->  Seq Scan on domain_categories b
>> (cost=0.00..198151.89 rows=12104989 width=8) (actual
>> time=14.352..22572.869 rows=12104989 loops=1)"
>
> This is weird, given you say you have clustered domain_categories on
> domain. Have you analysed? You should be able to run:
>
> EXPLAIN SELECT * from domain_categories ORDER BY domain
>
> and have it say "Index scan" instead of "Seq Scan followed by disc
> sort)".
>
> Matthew
>

Thanks, the index was created but I forgot to run analyze again on
that table.

I had a little mistake in my previous sql query. The corrected version
is this:
explain analyze select a."user", b.category, count(1) from result a,
domain_categories b where a."domain" = b."domain" and a."user" = 1337
group by a."user", b.category;

(notice the additional group by column).

explain analyze:


"HashAggregate  (cost=817397.78..817428.92 rows=2491 width=8) (actual
time=42874.339..42878.419 rows=3361 loops=1)"
"  ->  Merge Join  (cost=748.47..674365.50 rows=19070970 width=8)
(actual time=15702.449..42829.388 rows=36308 loops=1)"
"        Merge Cond: (b.domain = a.domain)"
"        ->  Index Scan using domain_categories_domain on
domain_categories b  (cost=0.00..391453.79 rows=12105014 width=8)
(actual time=39.018..30166.349 rows=12104989 loops=1)"
"        ->  Sort  (cost=395.52..405.49 rows=3985 width=8) (actual
time=0.188..32.345 rows=36309 loops=1)"
"              Sort Key: a.domain"
"              Sort Method:  quicksort  Memory: 27kB"
"              ->  Index Scan using result_user_idx on result a
(cost=0.00..157.21 rows=3985 width=8) (actual time=0.021..0.101
rows=61 loops=1)"
"                    Index Cond: ("user" = 1337)"
"Total runtime: 42881.382 ms"

This is still very slow...



Re: Slow query with a lot of data

From
Matthew Wakeling
Date:
On Mon, 18 Aug 2008, Moritz Onken wrote:
> "HashAggregate  (cost=817397.78..817428.92 rows=2491 width=8) (actual  time=42874.339..42878.419 rows=3361 loops=1)"
> "  ->  Merge Join  (cost=748.47..674365.50 rows=19070970 width=8) (actual > time=15702.449..42829.388 rows=36308
loops=1)"
> "        Merge Cond: (b.domain = a.domain)"
> "        ->  Index Scan using domain_categories_domain on domain_categories b > (cost=0.00..391453.79 rows=12105014
width=8)(actual time=39.018..30166.349 > rows=12104989 loops=1)" 
> "        ->  Sort  (cost=395.52..405.49 rows=3985 width=8) (actual > time=0.188..32.345 rows=36309 loops=1)"
> "              Sort Key: a.domain"
> "              Sort Method:  quicksort  Memory: 27kB"
> "              ->  Index Scan using result_user_idx on result a > (cost=0.00..157.21 rows=3985 width=8) (actual
time=0.021..0.101rows=61 > loops=1)" 
> "                    Index Cond: ("user" = 1337)"
> "Total runtime: 42881.382 ms"
>
> This is still very slow...

Well, you're getting the database to read the entire contents of the
domain_categories table in order. That's 12 million rows - a fair amount
of work.

You may find that removing the "user = 1337" constraint doesn't make the
query much slower - that's where you get a big win by clustering on
domain. You might also want to cluster the results table on domain.
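Roughly like this, as a sketch (assuming an index on result(domain)
exists or is created first; names are assumptions):

CREATE INDEX result_domain_idx ON result ("domain");   -- skip if it already exists
CLUSTER result USING result_domain_idx;
ANALYZE result;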

If you want the results for just one user, it would be very helpful to
have a user column on the domain_categories table, and an index on that
column. However, that will slow down the query for all users a little.

Matthew

--
For every complex problem, there is a solution that is simple, neat, and wrong.
                                                      -- H. L. Mencken

Re: Slow query with a lot of data

From
Moritz Onken
Date:
>>
>
> Well, you're getting the database to read the entire contents of the
> domain_categories table in order. That's 12 million rows - a fair
> amount of work.
>
> You may find that removing the "user = 1337" constraint doesn't make
> the query much slower - that's where you get a big win by clustering
> on domain. You might also want to cluster the results table on domain.

Running the query for more than one user is indeed not much slower.
That's what I need. I'm clustering the results table on domain right
now. But why is this better than clustering it on "user"?

>
>
> If you want the results for just one user, it would be very helpful
> to have a user column on the domain_categories table, and an index
> on that column. However, that will slow down the query for all users
> a little.

A row in domain_categories can belong to more than one user. But I
don't need to run this query for only one user anyway.

Thanks so far,

Re: Slow query with a lot of data

From
Matthew Wakeling
Date:
On Mon, 18 Aug 2008, Moritz Onken wrote:
> Running the query for more than one user is indeed not much slower. That's
> what I need. I'm clustering the results table on domain right now. But why is
> this better than clustering it on "user"?

The reason is the way that the merge join algorithm works. What it does is
take two tables and sort them both by the join fields. Then it can
stream through both tables producing results as it goes. It's the best
join algorithm, but it does require both tables to be sorted by the same
thing, which is domain in this case. The aggregating on user happens after
the join has been done, and the hash aggregate can accept the users in
random order.

If you look at your last EXPLAIN, you can see that it has to sort the result
table on domain, although it can read the domain_categories table in domain
order thanks to the clustered index.

"HashAggregate
"  ->  Merge Join
"        Merge Cond: (b.domain = a.domain)"
"        ->  Index Scan using domain_categories_domain on domain_categories b
"        ->  Sort
"              Sort Key: a.domain"
"              Sort Method:  quicksort  Memory: 27kB"
"              ->  Index Scan using result_user_idx on result a
"                    Index Cond: ("user" = 1337)"

Without the user restriction and re-clustering, this should become:

"HashAggregate
"  ->  Merge Join
"        Merge Cond: (b.domain = a.domain)"
"        ->  Index Scan using domain_categories_domain on domain_categories b
"        ->  Index Scan using result_domain on result a

Matthew

--
Vacuums are nothings. We only mention them to let them know we know
they're there.

Re: Slow query with a lot of data

From
Moritz Onken
Date:
On 18.08.2008, at 18:05, Matthew Wakeling wrote:

> On Mon, 18 Aug 2008, Moritz Onken wrote:
>> Running the query for more than one user is indeed not much slower.
>> That's what I need. I'm clustering the results table on domain
>> right now. But why is this better than clustering it on "user"?
>
> The reason is the way that the merge join algorithm works. What it
> does is takes two tables, and sorts them both by the join fields.
> Then it can stream through both tables producing results as it goes.
> It's the best join algorithm, but it does require both tables to be
> sorted by the same thing, which is domain in this case. The
> aggregating on user happens after the join has been done, and the
> hash aggregate can accept the users in random order.
>
> If you look at your last EXPLAIN, see that it has to sort the result
> table on domain, although it can read the domain_categories in
> domain order due to the clustered index.

explain select
   a."user", b.category, sum(1.0/b.cat_count)::float
   from result a, domain_categories b
   where a."domain" = b."domain"
   group by a."user", b.category;

"GroupAggregate  (cost=21400443313.69..22050401897.13 rows=35049240
width=12)"
"  ->  Sort  (cost=21400443313.69..21562757713.35 rows=64925759864
width=12)"
"        Sort Key: a."user", b.category"
"        ->  Merge Join  (cost=4000210.40..863834009.08
rows=64925759864 width=12)"
"              Merge Cond: (b.domain = a.domain)"
"              ->  Index Scan using domain_categories_domain on
domain_categories b  (cost=0.00..391453.79 rows=12105014 width=12)"
"              ->  Materialize  (cost=3999931.73..4253766.93
rows=20306816 width=8)"
"                    ->  Sort  (cost=3999931.73..4050698.77
rows=20306816 width=8)"
"                          Sort Key: a.domain"
"                          ->  Seq Scan on result a
(cost=0.00..424609.16 rows=20306816 width=8)"

Both result and domain_categories are clustered on domain and analyzed.
It took 50 minutes to run this query for 280 users ("and "user" IN
([280 ids])"); 78,000 rows were returned and stored in a table. Is this
reasonable?
Why is it still sorting on domain? I thought the clustering should
prevent the planner from doing this?

moritz

Re: Slow query with a lot of data

From
Matthew Wakeling
Date:
On Tue, 19 Aug 2008, Moritz Onken wrote:
> explain select
> a."user", b.category, sum(1.0/b.cat_count)::float
> from result a, domain_categories b
> where a."domain" = b."domain"
> group by a."user", b.category;

> Both results and domain_categories are clustered on domain and analyzed.
> Why is it still sorting on domain? I thought the clustering should prevent
> the planner from doing this?

As far as I can tell, it should. If it is clustered on an index on domain,
and then analysed, it should no longer have to sort on domain.

Could you post here the results of running:

select * from pg_stats where attname = 'domain';
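Or, if the full row is too wide to read comfortably, a trimmed-down
sketch with just the columns that matter here:

SELECT tablename, attname, n_distinct, correlation
FROM pg_stats
WHERE attname = 'domain';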

> It took 50 minutes to run this query for 280 users ("and "user" IN ([280
> ids])"), 78000 rows were returned and stored in a table. Is this reasonable?

Sounds like an awfully long time to me. Also, I think restricting it to
280 users is probably not making it any faster.

Matthew

--
It's one of those irregular verbs - "I have an independent mind," "You are
an eccentric," "He is round the twist."
                                      -- Bernard Woolly, Yes Prime Minister

Re: Slow query with a lot of data

From
Moritz Onken
Date:
>>
> As far as I can tell, it should. If it is clustered on an index on
> domain, and then analysed, it should no longer have to sort on domain.
>
> Could you post here the results of running:
>
> select * from pg_stats where attname = 'domain';
>


  schemaname        | public
  tablename         | result
  attname           | domain
  null_frac         | 0
  avg_width         | 4
  n_distinct        | 1642
  most_common_vals  | {3491378,3213829,3316634,3013831,3062500,3242775,3290846,3171997,3412018,3454092}
  most_common_freqs | {0.352333,0.021,0.01,0.00766667,0.00566667,0.00533333,0.00533333,0.005,0.00266667,0.00266667}
  histogram_bounds  | {3001780,3031753,3075043,3129688,3176566,3230067,3286784,3341445,3386233,3444374,3491203}
  correlation       | 1

No idea what that means :)
>>
>
> Sounds like an awfully long time to me. Also, I think restricting it
> to 280 users is probably not making it any faster.

If I hadn't restricted it to 280 users it would have run ~350 days...

Thanks for your help!

moritz

Re: Slow query with a lot of data

From
Matthew Wakeling
Date:
On Tue, 19 Aug 2008, Moritz Onken wrote:
>        tablename        | attname | n_distinct | correlation
>  result                 | domain  |       1642 |           1

Well, the important thing is the correlation, which is 1, indicating that
Postgres knows that the table is clustered. So I have no idea why it is
sorting the entire table.

What happens when you run EXPLAIN SELECT * FROM result ORDER BY domain?

>> Sounds like an awfully long time to me. Also, I think restricting it to 280
>> users is probably not making it any faster.
>
> If I hadn't restricted it to 280 users it would have run ~350days...

What makes you say that? Perhaps you could post EXPLAINs of both of the
queries.

Matthew

--
What goes up must come down. Ask any system administrator.

Re: Slow query with a lot of data

From
Moritz Onken
Date:
On 19.08.2008, at 14:17, Matthew Wakeling wrote:

> On Tue, 19 Aug 2008, Moritz Onken wrote:
>>       tablename        | attname | n_distinct | correlation
>> result                 | domain  |       1642 |           1
>
> Well, the important thing is the correlation, which is 1, indicating
> that Postgres knows that the table is clustered. So I have no idea
> why it is sorting the entire table.
>
> What happens when you run EXPLAIN SELECT * FROM result ORDER BY
> domain?
>

"Index Scan using result_domain_idx on result  (cost=0.00..748720.72
rows=20306816 width=49)"
... as it should be.

>>> Sounds like an awfully long time to me. Also, I think restricting
>>> it to 280 users is probably not making it any faster.
>>
>> If I hadn't restricted it to 280 users it would have run ~350days...
>
> What makes you say that? Perhaps you could post EXPLAINs of both of
> the queries.
>
> Matthew

That was just a guess. The query needs to retrieve the data for about
50,000 users. But it should be fast if I don't retrieve the data for
specific users but let it run through all rows.

explain insert into setup1 (select
   a."user", b.category, sum(1.0/b.cat_count)::float
   from result a, domain_categories b
   where a."domain" = b."domain"
   and b.depth < 4
   and a.results > 100
   and a."user" < 30000
   group by a."user", b.category);


"GroupAggregate  (cost=11745105.66..12277396.81 rows=28704 width=12)"
"  ->  Sort  (cost=11745105.66..11878034.93 rows=53171707 width=12)"
"        Sort Key: a."user", b.category"
"        ->  Merge Join  (cost=149241.25..1287278.89 rows=53171707
width=12)"
"              Merge Cond: (b.domain = a.domain)"
"              ->  Index Scan using domain_categories_domain on
domain_categories b  (cost=0.00..421716.32 rows=5112568 width=12)"
"                    Filter: (depth < 4)"
"              ->  Materialize  (cost=148954.16..149446.36 rows=39376
width=8)"
"                    ->  Sort  (cost=148954.16..149052.60 rows=39376
width=8)"
"                          Sort Key: a.domain"
"                          ->  Bitmap Heap Scan on result a
(cost=1249.93..145409.79 rows=39376 width=8)"
"                                Recheck Cond: ("user" < 30000)"
"                                Filter: (results > 100)"
"                                ->  Bitmap Index Scan on
result_user_idx  (cost=0.00..1240.08 rows=66881 width=0)"
"                                      Index Cond: ("user" < 30000)"


This query limits the number of users to 215 and took about
50 minutes.
I could create two temp tables which hold only those records I
need for this query. Would this be a good idea?


moritz


Re: Slow query with a lot of data

From
"Scott Carey"
Date:
What is your work_mem set to?  The default?

Try increasing it significantly if you have the RAM and see if that affects the explain plan.  You may even want to set it to a number larger than the RAM you have just to see what happens.  In all honesty, it may be faster to overflow to OS swap space than to sort too many rows, but ONLY if it changes the plan to a significantly more efficient one.

Simply type
SET work_mem = '500MB';
before running your explain.  Set it to even more RAM if you have the space for this experiment.

In my experience the performance of aggregates on large tables is significantly affected by work_mem, and the optimizer will choose poorly without enough of it.  It will rule out plans that may be fast enough when overflowing to disk in preference to colossal-sized sorts (which likely also overflow to disk but take hours or days).
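As a concrete sketch (session-local only; the query is the one from
earlier in the thread and 500MB is just a starting point, not a
recommendation):

SET work_mem = '500MB';

EXPLAIN
SELECT a."user", b.category, sum(1.0/b.cat_count)::float
FROM result a, domain_categories b
WHERE a."domain" = b."domain"
GROUP BY a."user", b.category;

RESET work_mem;   -- back to the configured default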

On Tue, Aug 19, 2008 at 5:47 AM, Moritz Onken <onken@houseofdesign.de> wrote:

On 19.08.2008, at 14:17, Matthew Wakeling wrote:


On Tue, 19 Aug 2008, Moritz Onken wrote:
     tablename        | attname | n_distinct | correlation
result                 | domain  |       1642 |           1

Well, the important thing is the correlation, which is 1, indicating that Postgres knows that the table is clustered. So I have no idea why it is sorting the entire table.

What happens when you run EXPLAIN SELECT * FROM result ORDER BY domain?


"Index Scan using result_domain_idx on result  (cost=0.00..748720.72 rows=20306816 width=49)"
... as it should be.


Sounds like an awfully long time to me. Also, I think restricting it to 280 users is probably not making it any faster.

If I hadn't restricted it to 280 users it would have run ~350days...

What makes you say that? Perhaps you could post EXPLAINs of both of the queries.

Matthew

That was just a guess. The query needs to retrieve the data for about 50,000 users. But it should be fast if I don't retrieve the data for specific users but let in run through all rows.

explain insert into setup1 (select

 a."user", b.category, sum(1.0/b.cat_count)::float
 from result a, domain_categories b
 where a."domain" = b."domain"
 and b.depth < 4
 and a.results > 100
 and a."user" < 30000
 group by a."user", b.category);


"GroupAggregate  (cost=11745105.66..12277396.81 rows=28704 width=12)"
"  ->  Sort  (cost=11745105.66..11878034.93 rows=53171707 width=12)"

"        Sort Key: a."user", b.category"
"        ->  Merge Join  (cost=149241.25..1287278.89 rows=53171707 width=12)"

"              Merge Cond: (b.domain = a.domain)"
"              ->  Index Scan using domain_categories_domain on domain_categories b  (cost=0.00..421716.32 rows=5112568 width=12)"
"                    Filter: (depth < 4)"
"              ->  Materialize  (cost=148954.16..149446.36 rows=39376 width=8)"
"                    ->  Sort  (cost=148954.16..149052.60 rows=39376 width=8)"
"                          Sort Key: a.domain"
"                          ->  Bitmap Heap Scan on result a  (cost=1249.93..145409.79 rows=39376 width=8)"
"                                Recheck Cond: ("user" < 30000)"
"                                Filter: (results > 100)"
"                                ->  Bitmap Index Scan on result_user_idx  (cost=0.00..1240.08 rows=66881 width=0)"
"                                      Index Cond: ("user" < 30000)"


This query limits the number of users to 215 and this query took about 50 minutes.
I could create to temp tables which have only those records which I need for this query. Would this be a good idea?


moritz




Re: Slow query with a lot of data

From
Moritz Onken
Date:
On 19.08.2008, at 16:49, Scott Carey wrote:

> What is your work_mem set to?  The default?
>
> Try increasing it significantly if you have the RAM and seeing if
> that affects the explain plan.  You may even want to set it to a
> number larger than the RAM you have just to see what happens.  In
> all honesty, it may be faster to overflow to OS swap space than sort
> too many rows, but ONLY if it changes the plan to a significantly
> more efficient one.
>
> Simply type
> 'SET work_mem = '500MB';
> before running your explain.  Set it to even more RAM if you have
> the space for this experiment.
>
> In my experience the performance of aggregates on large tables is
> significantly affected by work_mem and the optimizer will chosse
> poorly without enough of it.  It will rule out plans that may be
> fast enough when overflowing to disk in preference to colossal sized
> sorts (which likely also overflow to disk but take hours or days).

Thanks for that advice but the explain is not different :-(

moritz

Re: Slow query with a lot of data

From
Moritz Onken
Date:
On 19.08.2008, at 17:23, Moritz Onken wrote:

>
> On 19.08.2008, at 16:49, Scott Carey wrote:
>
>> What is your work_mem set to?  The default?
>>
>> Try increasing it significantly if you have the RAM and seeing if
>> that affects the explain plan.  You may even want to set it to a
>> number larger than the RAM you have just to see what happens.  In
>> all honesty, it may be faster to overflow to OS swap space than
>> sort too many rows, but ONLY if it changes the plan to a
>> significantly more efficient one.
>>
>> Simply type
>> 'SET work_mem = '500MB';
>> before running your explain.  Set it to even more RAM if you have
>> the space for this experiment.
>>
>> In my experience the performance of aggregates on large tables is
>> significantly affected by work_mem and the optimizer will chosse
>> poorly without enough of it.  It will rule out plans that may be
>> fast enough when overflowing to disk in preference to colossal
>> sized sorts (which likely also overflow to disk but take hours or
>> days).
>
> Thanks for that advice but the explain is not different :-(
>
> moritz
>
> --

Hi,

I started the query with work_mem set to 3000MB. The explain output
didn't change but it now runs much faster (about 10 times faster). Swap
isn't being used. How do you explain that?

moritz

Re: Slow query with a lot of data

From
Zoltan Boszormenyi
Date:
Moritz Onken wrote:
>
> On 19.08.2008, at 17:23, Moritz Onken wrote:
>
>>
>> On 19.08.2008, at 16:49, Scott Carey wrote:
>>
>>> What is your work_mem set to?  The default?
>>>
>>> Try increasing it significantly if you have the RAM and seeing if
>>> that affects the explain plan.  You may even want to set it to a
>>> number larger than the RAM you have just to see what happens.  In
>>> all honesty, it may be faster to overflow to OS swap space than sort
>>> too many rows, but ONLY if it changes the plan to a significantly
>>> more efficient one.
>>>
>>> Simply type
>>> 'SET work_mem = '500MB';
>>> before running your explain.  Set it to even more RAM if you have
>>> the space for this experiment.
>>>
>>> In my experience the performance of aggregates on large tables is
>>> significantly affected by work_mem and the optimizer will chosse
>>> poorly without enough of it.  It will rule out plans that may be
>>> fast enough when overflowing to disk in preference to colossal sized
>>> sorts (which likely also overflow to disk but take hours or days).
>>
>> Thanks for that advice but the explain is not different :-(
>>
>> moritz
>>
>> --
>
> Hi,
>
> I started the query with work_mem set to 3000MB. The explain output
> didn't change but it runs now much faster (about 10 times). The swap
> isn't used. How can you explain that?

$ cat /proc/sys/vm/overcommit_memory
0
$ less linux/Documentation/filesystems/proc.txt
...
overcommit_memory
-----------------

Controls overcommit of system memory, possibly allowing processes
to allocate (but not use) more memory than is actually available.


0       -       Heuristic overcommit handling. Obvious overcommits of
                address space are refused. Used for a typical system. It
                ensures a seriously wild allocation fails while allowing
                overcommit to reduce swap usage.  root is allowed to
                allocate slightly more memory in this mode. This is the
                default.

1       -       Always overcommit. Appropriate for some scientific
                applications.

2       -       Don't overcommit. The total address space commit
                for the system is not permitted to exceed swap plus a
                configurable percentage (default is 50) of physical RAM.
                Depending on the percentage you use, in most situations
                this means a process will not be killed while attempting
                to use already-allocated memory but will receive errors
                on memory allocation as appropriate.
...

I guess you are running on 64-bit because "obvious overcommit" exceeds
3GB already.
Or you're running 32-bit and overcommit_memory=1 on your system.

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
http://www.postgresql.at/


Re: Slow query with a lot of data

From
"Scott Carey"
Date:
More work_mem will make the sort fit more in memory and less on disk, even with the same query plan.


On Wed, Aug 20, 2008 at 12:54 AM, Moritz Onken <onken@houseofdesign.de> wrote:

On 19.08.2008, at 17:23, Moritz Onken wrote:



On 19.08.2008, at 16:49, Scott Carey wrote:

What is your work_mem set to?  The default?

Try increasing it significantly if you have the RAM and seeing if that affects the explain plan.  You may even want to set it to a number larger than the RAM you have just to see what happens.  In all honesty, it may be faster to overflow to OS swap space than sort too many rows, but ONLY if it changes the plan to a significantly more efficient one.

Simply type
'SET work_mem = '500MB';
before running your explain.  Set it to even more RAM if you have the space for this experiment.

In my experience the performance of aggregates on large tables is significantly affected by work_mem and the optimizer will chosse poorly without enough of it.  It will rule out plans that may be fast enough when overflowing to disk in preference to colossal sized sorts (which likely also overflow to disk but take hours or days).

Thanks for that advice but the explain is not different :-(

moritz

--

Hi,

I started the query with work_mem set to 3000MB. The explain output didn't change but it runs now much faster (about 10 times). The swap isn't used. How can you explain that?


moritz


Re: Slow query with a lot of data

From
"Scott Carey"
Date:
Ok, so the problem boils down to the sort at the end.

The query up through the merge join on domain is as fast as it's going to get.  The sort at the end, however, ideally should not happen.  There are not that many rows returned, and it should hash-aggregate if it thinks there is enough space to do so.

The query planner is going to choose the sort > agg over the hash-agg if it estimates the total number of resulting rows to be large enough so that the hash won't fit in work_mem.   However, there seems to be another factor here based on this:


GroupAggregate  (cost=11745105.66..12277396.81 rows=28704 width=12)"
"  ->  Sort  (cost=11745105.66..11878034.93 rows=53171707 width=12)"
"        Sort Key: a."user", b.category"
"        ->  Merge Join  (cost=149241.25..1287278.89 rows=53171707 width=12)"
"              Merge Cond: (b.domain = a.domain)"


The planner actually thinks there will only be 28704 rows returned of width 12.  But it chooses to sort 53 million rows before aggregating.  That's either a bug or there's something else wrong here.  That is the wrong way to aggregate those results no matter how much work_mem you have unless I'm completely missing something...

You can try rearranging the query just to see if you can work around this.  What happens if you compare the explain on:

select
 a."user", b.category, sum(1.0/b.cat_count)::float
 from result a, domain_categories b
 where a."domain" = b."domain"
 and b.depth < 4
 and a.results > 100
 and a."user" < 30000
 group by a."user", b.category


to

select
 c."user", c.category, sum(1.0/c.cat_count)::float
 from (select a."user", b.category, b.cat_count
   from result a, domain_categories b
     where a."domain" = b."domain"
       and b.depth < 4
       and a.results > 100
       and a."user" < 30000 ) c
  group by c."user", c.category

It shouldn't make a difference, but I've seen things like this help before so it's worth a try.  Make sure work_mem is reasonably sized for this test.

Another thing that won't be that fast, but may avoid the sort, is to select the subselection above into a temporary table, analyze it, and then do the outer select.  Make sure your settings for temporary space (temp_buffers in 8.3) are large enough for the intermediate results (700MB should do it).  That won't be that fast, but it will most likely be faster than sorting 50 million + rows.  There are lots of problems with this approach but it may be worth the experiment.
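A rough sketch of that experiment (the table name tmp_result is made up;
the filters are copied from your query above):

SET temp_buffers = '700MB';   -- must be set before any temp table is touched in the session

CREATE TEMP TABLE tmp_result AS
SELECT a."user", b.category, b.cat_count
FROM result a, domain_categories b
WHERE a."domain" = b."domain"
  AND b.depth < 4
  AND a.results > 100
  AND a."user" < 30000;

ANALYZE tmp_result;

SELECT "user", category, sum(1.0/cat_count)::float
FROM tmp_result
GROUP BY "user", category;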


On Tue, Aug 19, 2008 at 5:47 AM, Moritz Onken <onken@houseofdesign.de> wrote:

On 19.08.2008, at 14:17, Matthew Wakeling wrote:


On Tue, 19 Aug 2008, Moritz Onken wrote:
     tablename        | attname | n_distinct | correlation
result                 | domain  |       1642 |           1

Well, the important thing is the correlation, which is 1, indicating that Postgres knows that the table is clustered. So I have no idea why it is sorting the entire table.

What happens when you run EXPLAIN SELECT * FROM result ORDER BY domain?


"Index Scan using result_domain_idx on result  (cost=0.00..748720.72 rows=20306816 width=49)"
... as it should be.


Sounds like an awfully long time to me. Also, I think restricting it to 280 users is probably not making it any faster.

If I hadn't restricted it to 280 users it would have run ~350days...

What makes you say that? Perhaps you could post EXPLAINs of both of the queries.

Matthew

That was just a guess. The query needs to retrieve the data for about 50,000 users. But it should be fast if I don't retrieve the data for specific users but let in run through all rows.

explain insert into setup1 (select

 a."user", b.category, sum(1.0/b.cat_count)::float
 from result a, domain_categories b
 where a."domain" = b."domain"
 and b.depth < 4
 and a.results > 100
 and a."user" < 30000
 group by a."user", b.category);


"GroupAggregate  (cost=11745105.66..12277396.81 rows=28704 width=12)"
"  ->  Sort  (cost=11745105.66..11878034.93 rows=53171707 width=12)"

"        Sort Key: a."user", b.category"
"        ->  Merge Join  (cost=149241.25..1287278.89 rows=53171707 width=12)"

"              Merge Cond: (b.domain = a.domain)"
"              ->  Index Scan using domain_categories_domain on domain_categories b  (cost=0.00..421716.32 rows=5112568 width=12)"
"                    Filter: (depth < 4)"
"              ->  Materialize  (cost=148954.16..149446.36 rows=39376 width=8)"
"                    ->  Sort  (cost=148954.16..149052.60 rows=39376 width=8)"
"                          Sort Key: a.domain"
"                          ->  Bitmap Heap Scan on result a  (cost=1249.93..145409.79 rows=39376 width=8)"
"                                Recheck Cond: ("user" < 30000)"
"                                Filter: (results > 100)"
"                                ->  Bitmap Index Scan on result_user_idx  (cost=0.00..1240.08 rows=66881 width=0)"
"                                      Index Cond: ("user" < 30000)"


This query limits the number of users to 215 and this query took about 50 minutes.
I could create to temp tables which have only those records which I need for this query. Would this be a good idea?


moritz




Re: Slow query with a lot of data

From
Tom Lane
Date:
"Scott Carey" <scott@richrelevance.com> writes:
> The planner actually thinks there will only be 28704 rows returned of width
> 12.  But it chooses to sort 53 million rows before aggregating.  Thats
> either a bug or there's something else wrong here.   That is the wrong way
> to aggregate those results no matter how much work_mem you have unless I'm
> completely missing something...

That does look weird.  What are the datatypes of the columns being
grouped by?  Maybe they're not hashable?

Another forcing function that prevents use of HashAgg is DISTINCT
aggregates, but you don't seem to have any in this query...

            regards, tom lane

Re: Slow query with a lot of data

From
Moritz Onken
Date:
On 20.08.2008, at 20:06, Scott Carey wrote:

> Ok, so the problem boils down to the sort at the end.
>
> The query up through the merge join on domain is as fast as its
> going to get.  The sort at the end however, should not happen
> ideally.  There are not that many rows returned, and it should
> hash_aggregate if it thinks there is enough space to do so.
>
> The query planner is going to choose the sort > agg over the hash-
> agg if it estimates the total number of resulting rows to be large
> enough so that the hash won't fit in work_mem.   However, there
> seems to be another factor here based on this:
>
>
> GroupAggregate  (cost=11745105.66..12277396.81 rows=28704 width=12)"
> "  ->  Sort  (cost=11745105.66..11878034.93 rows=53171707 width=12)"
> "        Sort Key: a."user", b.category"
> "        ->  Merge Join  (cost=149241.25..1287278.89 rows=53171707 width=12)"
> "              Merge Cond: (b.domain = a.domain)"
>
>
> The planner actually thinks there will only be 28704 rows returned
> of width 12.  But it chooses to sort 53 million rows before
> aggregating.  Thats either a bug or there's something else wrong
> here.   That is the wrong way to aggregate those results no matter
> how much work_mem you have unless I'm completely missing something...
>
> You can try rearranging the query just to see if you can work around
> this.  What happens if you compare the explain on:
>
> select
>  a."user", b.category, sum(1.0/b.cat_count)::float
>  from result a, domain_categories b
>  where a."domain" = b."domain"
>  and b.depth < 4
>  and a.results > 100
>  and a."user" < 30000
>  group by a."user", b.category
>
>

"HashAggregate  (cost=1685527.69..1686101.77 rows=28704 width=12)"
"  ->  Merge Join  (cost=148702.25..1286739.89 rows=53171707 width=12)"
"        Merge Cond: (b.domain = a.domain)"
"        ->  Index Scan using domain_categories_domain on
domain_categories b  (cost=0.00..421716.32 rows=5112568 width=12)"
"              Filter: (depth < 4)"
"        ->  Sort  (cost=148415.16..148513.60 rows=39376 width=8)"
"              Sort Key: a.domain"
"              ->  Bitmap Heap Scan on result a
(cost=1249.93..145409.79 rows=39376 width=8)"
"                    Recheck Cond: ("user" < 30000)"
"                    Filter: (results > 100)"
"                    ->  Bitmap Index Scan on result_user_idx
(cost=0.00..1240.08 rows=66881 width=0)"
"                          Index Cond: ("user" < 30000)"



> to
>
> select
>  c."user", c.category, sum(1.0/c.cat_count)::float
>  from (select a."user", b.category, b.cat_count
>    from result a, domain_categories b
>      where a."domain" = b."domain"
>        and b.depth < 4
>        and a.results > 100
>        and a."user" < 30000 ) c
>   group by c."user", c.category
>


"HashAggregate  (cost=1685527.69..1686101.77 rows=28704 width=12)"
"  ->  Merge Join  (cost=148702.25..1286739.89 rows=53171707 width=12)"
"        Merge Cond: (b.domain = a.domain)"
"        ->  Index Scan using domain_categories_domain on
domain_categories b  (cost=0.00..421716.32 rows=5112568 width=12)"
"              Filter: (depth < 4)"
"        ->  Sort  (cost=148415.16..148513.60 rows=39376 width=8)"
"              Sort Key: a.domain"
"              ->  Bitmap Heap Scan on result a
(cost=1249.93..145409.79 rows=39376 width=8)"
"                    Recheck Cond: ("user" < 30000)"
"                    Filter: (results > 100)"
"                    ->  Bitmap Index Scan on result_user_idx
(cost=0.00..1240.08 rows=66881 width=0)"
"                          Index Cond: ("user" < 30000)"



> It shouldn't make a difference, but I've seen things like this help
> before so its worth a try.  Make sure work_mem is reasonably sized
> for this test.

It's exactly the same. work_mem was set to 3000MB.

>
>
> Another thing that won't be that fast, but may avoid the sort, is to
> select the subselection above into a temporary table, analyze it,
> and then do the outer select.  Make sure your settings for temporary
> space (temp_buffers in 8.3) are large enough for the intermediate
> results (700MB should do it).  That won't be that fast, but it will
> most likely be faster than sorting 50 million + rows.  There are
> lots of problems with this approach but it may be worth the
> experiment.
>

I'll try this.

Thanks so far!

Re: Slow query with a lot of data

From
Moritz Onken
Date:
On 20.08.2008, at 20:28, Tom Lane wrote:

> "Scott Carey" <scott@richrelevance.com> writes:
>> The planner actually thinks there will only be 28704 rows returned
>> of width
>> 12.  But it chooses to sort 53 million rows before aggregating.
>> Thats
>> either a bug or there's something else wrong here.   That is the
>> wrong way
>> to aggregate those results no matter how much work_mem you have
>> unless I'm
>> completely missing something...
>
> That does look weird.  What are the datatypes of the columns being
> grouped by?  Maybe they're not hashable?
>
> Another forcing function that prevents use of HashAgg is DISTINCT
> aggregates, but you don't seem to have any in this query...
>
>             regards, tom lane

The datatypes are both integers. There is no DISTINCT in this query.
Thanks anyway!



Re: Slow query with a lot of data

From
Moritz Onken
Date:
On 21.08.2008, at 09:04, Moritz Onken wrote:

>
> On 20.08.2008, at 20:28, Tom Lane wrote:
>
>> "Scott Carey" <scott@richrelevance.com> writes:
>>> The planner actually thinks there will only be 28704 rows returned
>>> of width
>>> 12.  But it chooses to sort 53 million rows before aggregating.
>>> Thats
>>> either a bug or there's something else wrong here.   That is the
>>> wrong way
>>> to aggregate those results no matter how much work_mem you have
>>> unless I'm
>>> completely missing something...
>>
>> That does look weird.  What are the datatypes of the columns being
>> grouped by?  Maybe they're not hashable?
>>
>> Another forcing function that prevents use of HashAgg is DISTINCT
>> aggregates, but you don't seem to have any in this query...
>>
>>             regards, tom lane
>
> The datatypes are both integers. There is no DISTINCT in this query.
> Thanks anyway!
>

insert into setup1 (select
   a."user", b.category, sum(1.0/b.cat_count)::float
   from result a, domain_categories b
   where a."domain" = b."domain"
   and b.depth < 4
   and a.results > 100
   group by a."user", b.category);

This query inserted a total of 16,000,000 rows and, with work_mem set
to 3000MB, took about 24 hours.

Any more ideas to speed this up?



Re: Slow query with a lot of data

From
"Scott Carey"
Date:
It looks to me like the work_mem did have an effect. 

Your earlier queries had a sort followed by a group aggregate at the top, and now it's a hash aggregate.  So the query plan DID change.  That is likely where the first 10x performance gain came from. 

The top of the plan was:

GroupAggregate  (cost=11745105.66..12277396.81 rows=28704 width=12)"
  ->  Sort  (cost=11745105.66..11878034.93 rows=53171707 width=12)"
        ->  Merge Join  (cost=149241.25..1287278.89 rows=53171707 width=12)"
              Merge Cond: (b.domain = a.domain)"

and now it is:

"HashAggregate  (cost=1685527.69..1686101.77 rows=28704 width=12)"
  ->  Merge Join  (cost=148702.25..1286739.89 rows=53171707 width=12)"
        Merge Cond: (b.domain = a.domain)"

The HashAggregate replaced the Sort followed by GroupAggregate at about 1/10 the cost.
 
It probably only took the first couple hundred MB of work_mem to do this, or less given that you were at the default originally.
Note how the estimated cost on the latter is 1.6 million, and it is 11 million in the first one.

You won't get a large table aggregate significantly faster than this -- you're asking it to scan through 53 million records and aggregate.  An explain analyze will be somewhat instructive in identifying whether it is more I/O- or CPU-bound overall, since we can compare the estimated cost with the actual times, but this probably won't get very far.

After that, inserting 16M rows requires rather different tuning and bottleneck identification.

On Thu, Aug 21, 2008 at 12:03 AM, Moritz Onken <onken@houseofdesign.de> wrote:

On 20.08.2008, at 20:06, Scott Carey wrote:


Ok, so the problem boils down to the sort at the end.

The query up through the merge join on domain is as fast as its going to get.  The sort at the end however, should not happen ideally.  There are not that many rows returned, and it should hash_aggregate if it thinks there is enough space to do so.

The query planner is going to choose the sort > agg over the hash-agg if it estimates the total number of resulting rows to be large enough so that the hash won't fit in work_mem.   However, there seems to be another factor here based on this:


GroupAggregate  (cost=11745105.66..12277396.81 rows=28704 width=12)"
"  ->  Sort  (cost=11745105.66..11878034.93 rows=53171707 width=12)"
"        Sort Key: a."user", b.category"
"        ->  Merge Join  (cost=149241.25..1287278.89 rows=53171707 width=12)"
"              Merge Cond: (b.domain = a.domain)"


The planner actually thinks there will only be 28704 rows returned of width 12.  But it chooses to sort 53 million rows before aggregating.  Thats either a bug or there's something else wrong here.   That is the wrong way to aggregate those results no matter how much work_mem you have unless I'm completely missing something...

You can try rearranging the query just to see if you can work around this.  What happens if you compare the explain on:

select
 a."user", b.category, sum(1.0/b.cat_count)::float
 from result a, domain_categories b
 where a."domain" = b."domain"
 and b.depth < 4
 and a.results > 100
 and a."user" < 30000
 group by a."user", b.category



"HashAggregate  (cost=1685527.69..1686101.77 rows=28704 width=12)"
"  ->  Merge Join  (cost=148702.25..1286739.89 rows=53171707 width=12)"

"        Merge Cond: (b.domain = a.domain)"
"        ->  Index Scan using domain_categories_domain on domain_categories b  (cost=0.00..421716.32 rows=5112568 width=12)"
"              Filter: (depth < 4)"
"        ->  Sort  (cost=148415.16..148513.60 rows=39376 width=8)"

"              Sort Key: a.domain"
"              ->  Bitmap Heap Scan on result a  (cost=1249.93..145409.79 rows=39376 width=8)"
"                    Recheck Cond: ("user" < 30000)"
"                    Filter: (results > 100)"
"                    ->  Bitmap Index Scan on result_user_idx  (cost=0.00..1240.08 rows=66881 width=0)"
"                          Index Cond: ("user" < 30000)"



to

select
 c."user", c.category, sum(1.0/c.cat_count)::float
 from (select a."user", b.category, b.cat_count
  from result a, domain_categories b
    where a."domain" = b."domain"
      and b.depth < 4
      and a.results > 100
      and a."user" < 30000 ) c
 group by c."user", c.category



"HashAggregate  (cost=1685527.69..1686101.77 rows=28704 width=12)"
"  ->  Merge Join  (cost=148702.25..1286739.89 rows=53171707 width=12)"

"        Merge Cond: (b.domain = a.domain)"
"        ->  Index Scan using domain_categories_domain on domain_categories b  (cost=0.00..421716.32 rows=5112568 width=12)"
"              Filter: (depth < 4)"
"        ->  Sort  (cost=148415.16..148513.60 rows=39376 width=8)"

"              Sort Key: a.domain"
"              ->  Bitmap Heap Scan on result a  (cost=1249.93..145409.79 rows=39376 width=8)"
"                    Recheck Cond: ("user" < 30000)"
"                    Filter: (results > 100)"
"                    ->  Bitmap Index Scan on result_user_idx  (cost=0.00..1240.08 rows=66881 width=0)"
"                          Index Cond: ("user" < 30000)"



It shouldn't make a difference, but I've seen things like this help before so its worth a try.  Make sure work_mem is reasonably sized for this test.

It's exactly the same. work_mem was set to 3000MB.




Another thing that won't be that fast, but may avoid the sort, is to select the subselection above into a temporary table, analyze it, and then do the outer select.  Make sure your settings for temporary space (temp_buffers in 8.3) are large enough for the intermediate results (700MB should do it).  That won't be that fast, but it will most likely be faster than sorting 50 million + rows.  There are lots of problems with this approach but it may be worth the experiment.


I'll try this.

Thanks so far!



Re: Slow query with a lot of data

From
Moritz Onken
Date:
On 21.08.2008, at 16:39, Scott Carey wrote:

> It looks to me like the work_mem did have an effect.
>
> Your earlier queries had a sort followed by group aggregate at the
> top, and now its a hash-aggregate.  So the query plan DID change.
> That is likely where the first 10x performance gain came from.

But it didn't change when I added the sub-select.
Thank you guys very much. The speed is now OK and I hope I can finish
this work soon.

But there is another problem. If I run this query without the
limitation of the user id, postgres consumes about 150GB of disk space
and dies with

ERROR:  could not write block 25305351 of temporary file: No space
left on device

After that the available disk space is back to normal.

Is this normal? The resulting table (setup1) is not bigger than 1.5 GB.

moritz

Re: Slow query with a lot of data

From
"Merlin Moncure"
Date:
On Thu, Aug 21, 2008 at 11:07 AM, Moritz Onken <onken@houseofdesign.de> wrote:
>
> On 21.08.2008, at 16:39, Scott Carey wrote:
>
>> It looks to me like the work_mem did have an effect.
>>
>> Your earlier queries had a sort followed by group aggregate at the top,
>> and now its a hash-aggregate.  So the query plan DID change.  That is likely
>> where the first 10x performance gain came from.
>
> But it didn't change as I added the sub select.
> Thank you guys very much. The speed is now ok and I hope I can finish tihs
> work soon.
>
> But there is another problem. If I run this query without the limitation of
> the user id, postgres consumes about 150GB of disk space and dies with
>
> ERROR:  could not write block 25305351 of temporary file: No space left on
> device
>
> After that the avaiable disk space is back to normal.
>
> Is this normal? The resulting table (setup1) is not bigger than 1.5 GB.

Maybe the result is too big.  If you explain the query, you should get
an estimate of rows returned.  If this is the case, you need to
rethink your query or do something like a cursor to browse the result.
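A minimal sketch of the cursor approach (cursor name and fetch size are
arbitrary; the query is the unrestricted one from earlier in the thread):

BEGIN;

DECLARE setup_cur CURSOR FOR
SELECT a."user", b.category, sum(1.0/b.cat_count)::float
FROM result a, domain_categories b
WHERE a."domain" = b."domain"
  AND b.depth < 4
  AND a.results > 100
GROUP BY a."user", b.category;

FETCH 10000 FROM setup_cur;   -- repeat until it returns no more rows
CLOSE setup_cur;
COMMIT;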

merlin

Re: Slow query with a lot of data

From
Moritz Onken
Date:
On 21.08.2008, at 19:08, Merlin Moncure wrote:

> On Thu, Aug 21, 2008 at 11:07 AM, Moritz Onken
> <onken@houseofdesign.de> wrote:
>>
>> On 21.08.2008, at 16:39, Scott Carey wrote:
>>
>>> It looks to me like the work_mem did have an effect.
>>>
>>> Your earlier queries had a sort followed by group aggregate at the
>>> top,
>>> and now its a hash-aggregate.  So the query plan DID change.  That
>>> is likely
>>> where the first 10x performance gain came from.
>>
>> But it didn't change as I added the sub select.
>> Thank you guys very much. The speed is now ok and I hope I can
>> finish tihs
>> work soon.
>>
>> But there is another problem. If I run this query without the
>> limitation of
>> the user id, postgres consumes about 150GB of disk space and dies
>> with
>>
>> ERROR:  could not write block 25305351 of temporary file: No space
>> left on
>> device
>>
>> After that the avaiable disk space is back to normal.
>>
>> Is this normal? The resulting table (setup1) is not bigger than 1.5
>> GB.
>
> Maybe the result is too big.  if you explain the query, you should get
> an estimate of rows returned.  If this is the case, you need to
> rethink your query or do something like a cursor to browse the result.
>
> merlin

There will be a few million rows. But I don't understand why these rows
bloat up so much. When the query is done, the new table is about 1 GB in
size, but while the query is running it uses >150 GB of disk space.

moritz


Re: Slow query with a lot of data

From
"Merlin Moncure"
Date:
On Fri, Aug 22, 2008 at 2:31 AM, Moritz Onken <onken@houseofdesign.de> wrote:
> On 21.08.2008, at 19:08, Merlin Moncure wrote:
>
>> On Thu, Aug 21, 2008 at 11:07 AM, Moritz Onken <onken@houseofdesign.de>
>> wrote:
>>>
>>> On 21.08.2008, at 16:39, Scott Carey wrote:
>>>
>>>> It looks to me like the work_mem did have an effect.
>>>>
>>>> Your earlier queries had a sort followed by group aggregate at the top,
>>>> and now its a hash-aggregate.  So the query plan DID change.  That is
>>>> likely
>>>> where the first 10x performance gain came from.
>>>
>>> But it didn't change as I added the sub select.
>>> Thank you guys very much. The speed is now ok and I hope I can finish
>>> tihs
>>> work soon.
>>>
>>> But there is another problem. If I run this query without the limitation
>>> of
>>> the user id, postgres consumes about 150GB of disk space and dies with
>>>
>>> ERROR:  could not write block 25305351 of temporary file: No space left
>>> on
>>> device
>>>
>>> After that the avaiable disk space is back to normal.
>>>
>>> Is this normal? The resulting table (setup1) is not bigger than 1.5 GB.
>>
>> Maybe the result is too big.  if you explain the query, you should get
>> an estimate of rows returned.  If this is the case, you need to
>> rethink your query or do something like a cursor to browse the result.
>>
>> merlin
>
> There will be a few million rows. But I don't understand why these rows
> bloat up so much. If the query is done the new table is about 1 GB in size.
> But while the query is running it uses >150GB of disk space.

Can we see the explain?

merlin