Discussion: How to craft a query that uses memory?


How to craft a query that uses memory?

From:
Date:
Hello list,
 
What would be the best way of setting up a query that uses lots of memory, in order to study the effects of resource usage parameters on the performance of the query?
 
I thought that a query on a large table involving JOINs would be a way, but while the backend initially used 1.7 GB of memory, it now uses only a couple dozen MB since I restarted PostgreSQL (the checkpointer process had 2.4 GB of memory in use).  (The table consumes nearly 1.3 GB of disk space, and the query returns a count of one billion, as the join produces one billion rows.)
 
The query I was using is this one:
 
SELECT count(*) FROM pgbench_accounts a JOIN pgbench_accounts b ON (a.abalance = b.abalance);
 
which I ran after initializing the table using pgbench -i -s 100, and populating the abalance field with
 
UPDATE pgbench_accounts SET abalance = aid % 100000 + 1;
 
The query plan reads,
 
postgres=# EXPLAIN ANALYZE SELECT count(*) FROM pgbench_accounts a JOIN pgbench_accounts b ON (a.abalance = b.abalance);
                                                                        QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate  (cost=22479359.09..22479359.10 rows=1 width=0) (actual time=787769.456..787769.456 rows=1 loops=1)
   ->  Merge Join  (cost=3126660.67..19725402.17 rows=1101582767 width=0) (actual time=62254.349..621119.691 rows=1000000000 loops=1)
         Merge Cond: (a.abalance = b.abalance)
         ->  Sort  (cost=1563330.33..1588330.33 rows=10000000 width=4) (actual time=27706.051..42305.488 rows=10000000 loops=1)
               Sort Key: a.abalance
               Sort Method: external merge  Disk: 136624kB
               ->  Seq Scan on pgbench_accounts a  (cost=0.00..263935.00 rows=10000000 width=4) (actual time=0.260..11592.979 rows=10000000 loops=1)
         ->  Materialize  (cost=1563330.33..1613330.33 rows=10000000 width=4) (actual time=34548.224..209123.713 rows=999999901 loops=1)
               ->  Sort  (cost=1563330.33..1588330.33 rows=10000000 width=4) (actual time=34548.202..43988.283 rows=10000000 loops=1)
                     Sort Key: b.abalance
                     Sort Method: external merge  Disk: 136624kB
                     ->  Seq Scan on pgbench_accounts b  (cost=0.00..263935.00 rows=10000000 width=4) (actual time=0.017..16807.894 rows=10000000 loops=1)
Total runtime: 787836.841 ms
(13 rows)

The point of the SELECT count(*) (as opposed to something like SELECT a.*) is, of course, to make sure we are not measuring the impact of the memory consumption of pgbench.
 
There are sorts involved, which I understand may use working memory but can also "spill to disk," as someone put it.  So how do I craft a query that actually does use lots of memory?
 
Cheers,
Holger Friedrich
 
 

Re: How to craft a query that uses memory?

From:
Albe Laurenz
Date:
Holger Friedrich wrote:
> So how do I craft a query that actually does use lots of memory?

You increase the parameter "work_mem".

You can do that globally in postgresql.conf or with SET for one session
or with SET LOCAL for one transaction.
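
As a concrete sketch of those three scopes (the 512MB value is purely illustrative, not a recommendation):

```sql
-- 1. Globally, in postgresql.conf (takes effect after a reload):
--      work_mem = '512MB'

-- 2. For the current session:
SET work_mem = '512MB';

-- 3. For the current transaction only:
BEGIN;
SET LOCAL work_mem = '512MB';
-- ... run the memory-hungry query here ...
COMMIT;
```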

Yours,
Laurenz Albe

Re: How to craft a query that uses memory?

From:
Date:
Albe Laurenz wrote on Friday, June 19, 2015 12:56 PM:
> Holger Friedrich wrote:
> > So how do I craft a query that actually does use lots of memory?

> You increase the parameter "work_mem".

> You can do that globally in postgresql.conf or with SET for one session or with SET LOCAL for one transaction.

Thank you to both Albe Laurenz and Andreas Kretschmer, who both gave the same advice.  It was spot-on: my query
actually seems to be good for a performance study; setting work_mem to various values causes the query planner to either
sort in memory or on disk, and of course this has some impact on performance.
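
A sketch of that experiment (the work_mem values here are illustrative; the exact spill threshold depends on the data set and PostgreSQL version):

```sql
-- Small work_mem: the plan should show "Sort Method: external merge  Disk: ..."
SET work_mem = '4MB';
EXPLAIN ANALYZE SELECT count(*)
FROM pgbench_accounts a JOIN pgbench_accounts b ON (a.abalance = b.abalance);

-- Large work_mem: the plan should show "Sort Method: quicksort  Memory: ..."
SET work_mem = '2GB';
EXPLAIN ANALYZE SELECT count(*)
FROM pgbench_accounts a JOIN pgbench_accounts b ON (a.abalance = b.abalance);
```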
 

(Sorry for erroneously asking for a "better" query, rather than for tuning my configuration, which turned
out to be the sensible thing to do...)
 

Best regards,
Holger Friedrich

Re: How to craft a query that uses memory?

From:
Scott Marlowe
Date:
Note that after cranking up work_mem you then need a query that can
use it. The quickest way to use LOTS of memory is to do a lot of
unconstrained joins.

select * from t1 a join t2 b on (true) join t3 c on (true)
join t4 d on (true); Result set size is rows(a)*rows(b)*rows(c)*rows(d)
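
Applied to the pgbench tables from earlier in the thread, a hedged sketch: the CROSS JOIN inflates the row count, the ORDER BY inside the subquery forces a sort that can consume work_mem on the server side, and the outer SELECT count(*) keeps the client's memory use out of the picture:

```sql
SET work_mem = '2GB';
EXPLAIN ANALYZE
SELECT count(*)
FROM (SELECT a.abalance
      FROM pgbench_accounts a
      CROSS JOIN pgbench_branches b
      ORDER BY a.abalance) s;
```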

On Fri, Jun 19, 2015 at 7:21 AM,
<Holger.Friedrich-Fa-Trivadis@it.nrw.de> wrote:
> Albe Laurenz wrote on Friday, June 19, 2015 12:56 PM:
>> Holger Friedrich wrote:
>> > So how do I craft a query that actually does use lots of memory?
>
>> You increase the parameter "work_mem".
>
>> You can do that globally in postgresql.conf or with SET for one session or with SET LOCAL for one transaction.
>
> Thank you to both Albe Laurenz and Andreas Kretschmer, who both gave the same advice.  It was spot-on: my query
> actually seems to be good for a performance study; setting work_mem to various values causes the query planner to either
> sort in memory or on disk, and of course this has some impact on performance.
>
> (Sorry for erroneously asking for a "better" query, rather than for tuning my configuration, which turned
> out to be the sensible thing to do...)
>
> Best regards,
> Holger Friedrich
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general



--
To understand recursion, one must first understand recursion.