Re: Inner Join of the same table

Поиск
Список
Период
Сортировка
От Sebastián Baioni
Тема Re: Inner Join of the same table
Дата
Msg-id 20060815185335.14227.qmail@web36114.mail.mud.yahoo.com
обсуждение исходный текст
Ответ на Re: Inner Join of the same table  ("Jim C. Nasby" <jnasby@pervasive.com>)
Ответы Re: Inner Join of the same table
Re: Inner Join of the same table
Список pgsql-performance
Hello Jim, we can't use the Where cuiT='12345678901' in the subquery because we need max(cuiL)
independently of that cuiT:
cuiT cuiL PERI    FAMI
1    a    200608  0
1    a    200601  2
1    b    200607  3
1    c    200605  4
2    a    200605  9
2    c    200604  4
2    b    200608  1
We need:
where cuiT = '1'
cuiT cuiL PERI    FAMI
1    a    200608  9
1    c    200605  4
If we place the Where cuiT = '1' in the subquery we couldn't get the max(FAMI) of cuiL a = 9 and
we couldn't know if that PERI is the max(PERI) of that cuiL independently of that cuiT.

Here is the explain analyze with PG 8.0 for Windows:
Explain Analyze
SELECT DISTINCT T.cuiT,T.cuiL, U.MAXPERI AS  ULT_APORTE_O_DDJJ
        FROM APORTES AS T
        INNER JOIN
        (
        SELECT cuiL, MAX(PERI) AS MAXPERI
        FROM APORTES
        GROUP BY cuiL
        ) AS U ON T.cuiL=U.cuiL AND T.PERI=U.MAXPERI
WHERE T.cuiT='12345678901'
order by T.cuiT, T.cuiL, U.MAXPERI;

QUERY PLAN
 1 Unique  (cost=37478647.41..37478650.53 rows=312 width=62) (actual time=2677209.000..2677520.000
rows=1720 loops=1)
 2  ->  Sort  (cost=37478647.41..37478648.19 rows=312 width=62) (actual
time=2677209.000..2677260.000 rows=3394 loops=1)
 3        Sort Key: t.cuiT, t.cuiL, u.maxperi
 4        ->  Merge Join  (cost=128944.78..37478634.48 rows=312 width=62) (actual
time=74978.000..2677009.000 rows=3394 loops=1)
 5              Merge Cond: ("outer".cuiL = "inner".cuiL)
 6              Join Filter: (("inner".peri)::text = "outer".maxperi)
 7              ->  Subquery Scan u  (cost=0.00..37348434.56 rows=3951 width=47) (actual
time=130.000..2634923.000 rows=254576 loops=1)
 8                    ->  GroupAggregate  (cost=0.00..37348395.05 rows=3951 width=25) (actual
time=130.000..2629617.000 rows=254576 loops=1)
 9                          ->  Index Scan using uesapt001 on APORTES  (cost=0.00..37301678.64
rows=9339331 width=25) (actual time=110.000..2520690.000 rows=9335892 loops=1)
10              ->  Sort  (cost=128944.78..129100.44 rows=62263 width=40) (actual
time=30684.000..36838.000 rows=80471 loops=1)
11                    Sort Key: t.cuiL
12                    ->  Index Scan using uesapt002 on APORTES t  (cost=0.00..122643.90
rows=62263 width=40) (actual time=170.000..25566.000 rows=80471 loops=1)
13                          Index Cond: (cuiT = '12345678901'::bpchar)
Total runtime: 2677640.000 ms

Thanks
     Sebastián Baioni

 --- "Jim C. Nasby" <jnasby@pervasive.com> escribió:

> On Tue, Aug 15, 2006 at 03:43:29PM +0000, Sebasti?n Baioni wrote:
> > Hi Nark, thanks for your answer.
> >
> > It's expected to return 1,720 rows (of 80,471 that match with condition WHERE
> > T.cuiT='12345678901')
> >
> > We have indexes by :
> > uesapt000: cuiT, cuiL, PERI;
> > uesapt001: cuiL, PERI;
> > uesapt002: cuiT, PERI;
> >
> > We usually make a vacuum analyze and reindex of every table, and we are running 8.0 and 8.1
for windows and 7.4 for Linux.
>
> That's EXPLAIN, not EXPLAIN ANALYZE, which doesn't help us much. Best
> bet would be an EXPLAIN ANALYZE from 8.1.x. It would also be useful to
> know how MSSQL is executing this query.
>
> If it would serve your purposes, copying the WHERE clause into the
> subquery would really help things. I think it might also mean you could
> combine everything into one query.
>
> > Thanks
> >      Sebasti?n Baioni
> >
> >  --- Mark Lewis <mark.lewis@mir3.com> escribi?:
> >
> > > Can you provide an EXPLAIN ANALYZE of the query in PG?  Have you
> > > analyzed the PG database?  How many rows is this query expected to
> > > return?  Which version of PG are you running?  What indexes have you
> > > defined?
> > >
> > > -- Mark
> > >
> > > On Tue, 2006-08-15 at 14:38 +0000, Sebasti?n Baioni wrote:
> > > > Hello, I'm migrating from MS SQL Server to PostgreSQL 8.1 and I have a
> > > > serious problem:
> > > > Table: APORTES - Rows: 9,000,000 (9 million)
> > > > *cuiT (char 11)
> > > > *cuiL (char 11)
> > > > *PERI (char 6)
> > > > FAMI (numeric 6)
> > > >
> > > > I need all the cuiLs whose max(PERI) are from a cuiT, and the Max
> > > > (FAMI) of those cuiLs, so the sentence is:
> > > >
> > > > SELECT DISTINCT T.cuiT, T.cuiL. U.MAXPERI, U.MAXFAMI
> > > >        FROM APORTES T
> > > >        INNER JOIN
> > > >        (SELECT cuiL, MAX(PERI) AS MAXPERI,
> > > >                MAX(FAMI) AS MAXFAMI
> > > >         FROM APORTES
> > > >         GROUP BY cuiL) AS U
> > > >        ON T.cuiL = U.cuiL AND T.PERI=U.MAXPERI
> > > > WHERE T.cuiT='12345678901'
> > > >
> > > > In MS SQL Server it lasts 1minute, in PostgreSQL for Windows it lasts
> > > > 40minutes and in PostgreSQL for Linux (FreeBSD) it lasts 20minuts.
> > > >
> > > > Do you know if there is any way to tune the server or optimize this
> > > > sentence?
> > > >
> > > > Thanks
> > > >      Sebastián Baioni





__________________________________________________
Preguntá. Respondé. Descubrí.
Todo lo que querías saber, y lo que ni imaginabas,
está en Yahoo! Respuestas (Beta).
¡Probalo ya!
http://www.yahoo.com.ar/respuestas


В списке pgsql-performance по дате отправления:

Предыдущее
От: "Luke Lonergan"
Дата:
Сообщение: Re: Dell PowerEdge 2950 performance
Следующее
От: Michael Stone
Дата:
Сообщение: Re: Postgresql Performance on an HP DL385 and