Discussion: Slow query - lots of temporary files.


Slow query - lots of temporary files.

From:
Johann Spies
Date:
I stopped this query after about 16 hours.  At the same time I ran an 'explain analyze' on the same query to find out why it took so long.  These two processes generated 173GB of temporary files in /var/lib/postgresql/9.4/main/base/pgsql_tmp.

COPY
  (SELECT A.ut,
          B.go AS funding_org,
          B.gn AS grant_no,
          C.gt AS thanks,
          D.au
   FROM isi.funding_text C,
        isi.rauthor D,
        isi.africa_uts A
   LEFT JOIN isi.funding_org B ON (B.ut = A.ut)
   WHERE (C.ut IS NOT NULL
          OR B.ut IS NOT NULL)
     AND D.rart_id = C.ut
     AND C.ut = B.ut
   GROUP BY A.ut,
            GO,
            gn,
            gt,
            au
   ORDER BY funding_org) TO '/tmp/africafunding2.csv' WITH csv quote '"' DELIMITER ',';


A modified version of this query finished in 1 min 27 sec:

COPY
  (SELECT 'UT'||A.ut,
          B.go AS funding_org,
          B.gn AS grant_no,
          C.gt AS thanks
   FROM isi.africa_uts A
   LEFT JOIN isi.funding_org B ON (B.ut = A.ut)
   LEFT JOIN isi.funding_text C ON (A.ut = C.ut)
   WHERE (C.ut IS NOT NULL
          OR B.ut IS NOT NULL)
   GROUP BY A.ut,
            GO,
            gn,
            gt) TO '/tmp/africafunding.csv' WITH csv quote '"' DELIMITER ',';
As I said, the 'explain analyze' of the problematic query contributed to the 173GB
of temporary files and had not finished after about 16 hours.
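
Incidentally, the cumulative spill is also visible in pg_stat_database (a generic check, not specific to this query):

SELECT temp_files,
       pg_size_pretty(temp_bytes) AS temp_size
FROM pg_stat_database
WHERE datname = current_database();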

A plain EXPLAIN of the query produces this:

"Sort (cost=4781458203.46..4798118612.44 rows=6664163593 width=390)"
" Output: a.ut, b.go, b.gn, c.gt, (array_to_string(array_agg(d.au), ';'::text)), b.go, b.gn, d.au"
" Sort Key: b.go"
" -> GroupAggregate (cost=2293037801.73..2509623118.51 rows=6664163593 width=390)"
" Output: a.ut, b.go, b.gn, c.gt, array_to_string(array_agg(d.au), ';'::text), b.go, b.gn, d.au"
" Group Key: a.ut, b.go, b.gn, c.gt, d.au"
" -> Sort (cost=2293037801.73..2309698210.72 rows=6664163593 width=390)"
" Output: a.ut, c.gt, b.go, b.gn, d.au"
" Sort Key: a.ut, b.go, b.gn, c.gt, d.au"
" -> Merge Join (cost=4384310.92..21202716.78 rows=6664163593 width=390)"
" Output: a.ut, c.gt, b.go, b.gn, d.au"
" Merge Cond: ((c.ut)::text = (d.rart_id)::text)"
" -> Merge Join (cost=635890.84..1675389.41 rows=6069238 width=412)"
" Output: c.gt, c.ut, a.ut, b.go, b.gn, b.ut"
" Merge Cond: ((c.ut)::text = (b.ut)::text)"
" Join Filter: ((c.ut IS NOT NULL) OR (b.ut IS NOT NULL))"
" -> Merge Join (cost=635476.30..675071.77 rows=1150354 width=348)"
" Output: c.gt, c.ut, a.ut"
" Merge Cond: ((a.ut)::text = (c.ut)::text)"
" -> Index Only Scan using africa_ut_idx on isi.africa_uts a (cost=0.42..19130.19 rows=628918 width=16)"
" Output: a.ut"
" -> Sort (cost=632211.00..640735.23 rows=3409691 width=332)"
" Output: c.gt, c.ut"
" Sort Key: c.ut"
" -> Seq Scan on isi.funding_text c (cost=0.00..262238.91 rows=3409691 width=332)"
" Output: c.gt, c.ut"
" -> Index Scan using funding_org_ut_idx on isi.funding_org b (cost=0.56..912582.50 rows=9835492 width=64)"
" Output: b.id, b.ut, b.go, b.gn"
" -> Materialize (cost=0.57..17914892.46 rows=159086560 width=26)"
" Output: d.id, d.rart_id, d.au, d.ro, d.ln, d.af, d.ras, d.ad, d.aa, d.em, d.ag, d.tsv"
" -> Index Scan using rauthor_rart_id_idx on isi.rauthor d (cost=0.57..17517176.06 rows=159086560 width=26)"
" Output: d.id, d.rart_id, d.au, d.ro, d.ln, d.af, d.ras, d.ad, d.aa, d.em, d.ag, d.tsv"

Any idea why adding the rauthor table to the query is so problematic?

My system:

768 GB RAM
shared_buffers:  32GB
work_mem:  4608MB

Regards
Johann

--
Because experiencing your loyal love is better than life itself,
my lips will praise you.  (Psalm 63:3)

Re: Slow query - lots of temporary files.

From:
Claudio Freire
Date:
On Wed, Jun 10, 2015 at 9:39 AM, Johann Spies <johann.spies@gmail.com> wrote:
> COPY
>   (SELECT A.ut,
>           B.go AS funding_org,
>           B.gn AS grant_no,
>           C.gt AS thanks,
>           D.au
>    FROM isi.funding_text C,
>         isi.rauthor D,
>         isi.africa_uts A
>    LEFT JOIN isi.funding_org B ON (B.ut = A.ut)
>    WHERE (C.ut IS NOT NULL
>           OR B.ut IS NOT NULL)
>      AND D.rart_id = C.ut
>      AND C.ut = B.ut
>    GROUP BY A.ut,
>             GO,
>             gn,
>             gt,
>             au
>    ORDER BY funding_org) TO '/tmp/africafunding2.csv' WITH csv quote '"'
> DELIMITER ',';
>
>
> A modified version of this query finished in 1 min 27 sec:
>
> COPY
>   (SELECT 'UT'||A.ut,
>                 B.go AS funding_org,
>                 B.gn AS grant_no,
>                 C.gt AS thanks
>    FROM isi.africa_uts A
>    LEFT JOIN isi.funding_org B ON (B.ut = A.ut)
>    LEFT JOIN isi.funding_text C ON (A.ut = C.ut)
>    WHERE (C.ut IS NOT NULL
>           OR B.ut IS NOT NULL)
>    GROUP BY A.ut,
>             GO,
>             gn,
>             gt) TO '/tmp/africafunding.csv' WITH csv quote '"' DELIMITER
> ',';
>
>
> As I said, the 'explain analyze' of the problematic query
> contributed to the 173GB
> of temporary files and had not finished after about 16 hours.

The joins are different in the two versions, and the most likely culprit
is the join against D. It's probably wrong, and the first query is
building a cartesian product.

Without more information about the schema it's difficult to be sure though.
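
If the intent is one row per article with the authors aggregated, something along these lines might express it with explicit joins (an untested sketch; whether D should hang off A.ut is exactly what the schema has to decide):

COPY
  (SELECT A.ut,
          B.go AS funding_org,
          B.gn AS grant_no,
          C.gt AS thanks,
          array_to_string(array_agg(D.au), ';') AS authors
   FROM isi.africa_uts A
   LEFT JOIN isi.funding_org B ON (B.ut = A.ut)
   LEFT JOIN isi.funding_text C ON (C.ut = A.ut)
   LEFT JOIN isi.rauthor D ON (D.rart_id = A.ut)
   WHERE (C.ut IS NOT NULL
          OR B.ut IS NOT NULL)
   GROUP BY A.ut,
            B.go,
            B.gn,
            C.gt
   ORDER BY funding_org) TO '/tmp/africafunding2.csv' WITH csv quote '"' DELIMITER ',';

Aggregating D.au instead of grouping by it keeps the per-article row count flat instead of multiplying it by the number of authors.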


Re: Slow query - lots of temporary files.

From:
Johann Spies
Date:
On 10 June 2015 at 15:02, Claudio Freire <klaussfreire@gmail.com> wrote:

> The joins are different in the two versions, and the most likely culprit
> is the join against D. It's probably wrong, and the first query is
> building a cartesian product.
>
> Without more information about the schema it's difficult to be sure though.

Thanks for your reply.  I will experiment further with different joins.

Here is the schema of the involved tables:

nkb=# \d isi.funding_text
                                    Table "isi.funding_text"
 Column |         Type          |                           Modifiers                          
--------+-----------------------+---------------------------------------------------------------
 id     | integer               | not null default nextval('isi.funding_text_id_seq'::regclass)
 ut     | character varying(15) |
 gt     | citext                |
Indexes:
    "funding_text_pkey" PRIMARY KEY, btree (id)
    "funding_text_ut_idx" btree (ut)
Foreign-key constraints:
    "funding_text_ut_fkey" FOREIGN KEY (ut) REFERENCES isi.ritem(ut)

nkb=# \d isi.funding_org
                                    Table "isi.funding_org"
 Column |         Type          |                          Modifiers                          
--------+-----------------------+--------------------------------------------------------------
 id     | integer               | not null default nextval('isi.funding_org_id_seq'::regclass)
 ut     | character varying(15) |
 go     | citext                |
 gn     | character varying     |
Indexes:
    "funding_org_pkey" PRIMARY KEY, btree (id)
    "funding_org_ut_idx" btree (ut)
Foreign-key constraints:
    "funding_org_ut_fkey" FOREIGN KEY (ut) REFERENCES isi.ritem(ut)


                                    Table "isi.africa_uts"
 Column |         Type          |                          Modifiers                         
--------+-----------------------+-------------------------------------------------------------
 ut     | character varying(15) |
 id     | integer               | not null default nextval('isi.africa_uts_id_seq'::regclass)
Indexes:
    "africa_uts_pkey" PRIMARY KEY, btree (id)
    "africa_ut_idx" btree (ut)
Foreign-key constraints:
    "africa_uts_ut_fkey" FOREIGN KEY (ut) REFERENCES isi.ritem(ut)


                                     Table "isi.rauthor"
 Column  |          Type          |                        Modifiers                        
---------+------------------------+----------------------------------------------------------
 id      | integer                | not null default nextval('isi.rauthor_id_seq'::regclass)
 rart_id | character varying(15)  |
 au      | character varying(75)  |
 ro      | character varying(30)  |
 ln      | character varying(200) |
 af      | character varying(200) |
 ras     | character varying(4)   |
 ad      | integer                |
 aa      | text                   |
 em      | character varying(250) |
 ag      | character varying(75)  |
 tsv     | tsvector               |
Indexes:
    "rauthor_pkey" PRIMARY KEY, btree (id) CLUSTER
    "rauthor_ad_idx" btree (ad)
    "rauthor_au_idx" btree (au)
    "rauthor_lower_idx" btree (lower(au::text))
    "rauthor_lower_lower1_idx" btree (lower(ln::text), lower(af::text))
    "rauthor_rart_id_idx" btree (rart_id)
    "rauthor_tsv_idx" gin (tsv)
Referenced by:
    TABLE "level1.person" CONSTRAINT "person_auth_id_fkey" FOREIGN KEY (auth_id) REFERENCES isi.rauthor(id) ON DELETE CASCADE
Triggers:
    tsvectorupdate_for_rauthor BEFORE INSERT OR UPDATE ON isi.rauthor FOR EACH ROW EXECUTE PROCEDURE isi.update_rauthor_tsv()

Regards
Johann
--
Because experiencing your loyal love is better than life itself,
my lips will praise you.  (Psalm 63:3)

Re: Slow query - lots of temporary files.

From:
Tomas Vondra
Date:

On 06/10/15 15:42, Johann Spies wrote:
> On 10 June 2015 at 15:02, Claudio Freire <klaussfreire@gmail.com> wrote:
>
>
>     The joins are different in the two versions, and the most likely culprit
>     is the join against D. It's probably wrong, and the first query is
>     building a cartesian product.
>
>     Without more information about the schema it's difficult to be sure
>     though.
>
>
> Thanks for your reply.  I will experiment further with different joins.

I don't know what you mean by "experimenting with joins" - that should
be determined by the schema.

The problematic piece of the explain plan is this:

  ->  Merge Join  (cost=4384310.92..21202716.78 rows=6664163593
                   width=390)
        Output: a.ut, c.gt, b.go, b.gn, d.au
        Merge Cond: ((c.ut)::text = (d.rart_id)::text)

That is, the planner expects ~6.7 billion rows, each ~390B wide. That's
~2.5TB of data that needs to be stored to disk (so that the sort can
process it).
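
(As a sanity check on that figure: 6,664,163,593 rows × 390 bytes ≈ 2.6 × 10^12 bytes, i.e. roughly 2.4 TiB - far beyond any work_mem setting, hence the spill to pgsql_tmp.)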

The way the schema is designed might be one of the issues - ISTM the
'ut' column is somehow universal, mixing values referencing different
columns in multiple tables. Not only is that utterly misleading for the
planner (and may easily cause issues with huge intermediate results),
but it also makes formulating the queries very difficult. And of course,
the casting between text and int is not very good either.

Fix the schema to follow relational best practices - separate the values
into multiple columns, and most of this will go away.
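
A quick way to see where the estimated row explosion comes from is to measure the fan-out on the join keys, e.g. (a rough sketch, not tailored to your data):

SELECT d.rart_id, count(*) AS rows_per_key
FROM isi.rauthor d
GROUP BY d.rart_id
ORDER BY count(*) DESC
LIMIT 10;

Multiplying the typical per-key counts across the joined tables gives a feel for how large the intermediate result will get.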


regards

--
Tomas Vondra                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Slow query - lots of temporary files.

From:
Johann Spies
Date:


On 10 June 2015 at 16:50, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:


> The problematic piece of the explain plan is this:
>
>   ->  Merge Join  (cost=4384310.92..21202716.78 rows=6664163593
>                    width=390)
>         Output: a.ut, c.gt, b.go, b.gn, d.au
>         Merge Cond: ((c.ut)::text = (d.rart_id)::text)
>
> That is, the planner expects ~6.7 billion rows, each ~390B wide. That's ~2.5TB of data that needs to be stored to disk (so that the sort can process it).
>
> The way the schema is designed might be one of the issues - ISTM the 'ut' column is somehow universal, mixing values referencing different columns in multiple tables. Not only is that utterly misleading for the planner (and may easily cause issues with huge intermediate results), but it also makes formulating the queries very difficult. And of course, the casting between text and int is not very good either.
>
> Fix the schema to follow relational best practices - separate the values into multiple columns, and most of this will go away.

Thanks for your reply, Tomas.

I do not understand what the problem with the 'ut' column is.  It is a unique identifier in the first table (africa_uts), it is used in the other tables to establish joins, and it has the same type definition in all the tables.  Is the problem the similar name?  The 'ut' columns in the different tables all refer to the same data.  I am not casting integers to text in this case, and I don't know why the planner is doing it.  The field 'rart_id' in isi.rauthor is just another name for 'ut' in the other tables and has the same datatype.

I do not understand your remark "separate the values into multiple columns".  I cannot see which values could be separated into different columns in the schema.  Do you mean in the query?  Why?


Johann
--
Because experiencing your loyal love is better than life itself,
my lips will praise you.  (Psalm 63:3)