Re: [GENERAL] Joining 16 tables seems slow

From: Frank Millman
Subject: Re: [GENERAL] Joining 16 tables seems slow
Date:
Msg-id: 73E7F216A1444578B7AEF815E162EC36@FrankLaptop
In reply to: Re: [GENERAL] Joining 16 tables seems slow  (Chris Travers <chris.travers@gmail.com>)
List: pgsql-general
 
Sent: Tuesday, September 12, 2017 3:36 PM
Subject: Re: [GENERAL] Joining 16 tables seems slow
 
Chris Travers wrote:
 
On Tue, Sep 12, 2017 at 3:15 PM, Frank Millman <frank@chagford.com> wrote:
 
2017-09-12 12:39 GMT+02:00 Pavel Stehule <pavel.stehule@gmail.com>:
 
> please use https://explain.depesz.com/ for both plans (slow, fast)
 
Here are the results -
 
 
 
>
> I am not convinced that the nested loop is a problem here.  I cannot think of a faster join plan than a nested loop when you only have one iteration of the loop (and looking through I did not see any loop counts above 1).
>
> If you read and count the milliseconds carefully you will find that ar_tran_inv is scanned 6 times, and each of those scans takes about 25ms. 25 x 6 is half of your query time right there, and then you have the overhead of the joins on top of that. A quick eyeball estimate is that this is where approx 200ms of your query time comes from.  Looking at this in more detail it doesn't look
>
> This is not a problem of too many tables in the join, but of joining the same tables multiple times, in ways that mean you end up repeatedly sequentially scanning them.
>
> I also don't think an index is going to help unless you have accounting data going way back (since you are looking for about a year's worth of data) or unless 90% of your transactions get marked as deleted.  So I think you are stuck with the sequential scans on this table and optimizing will probably mean reducing the number of times you scan that table.
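 
Regarding the point about deleted transactions: the kind of partial index being suggested would, I take it, look something like this (the 'deleted' column name is my guess, so the actual flag may differ) -
 
-- index tran_date only for transactions that are not marked as deleted
CREATE INDEX ar_tran_inv_not_deleted_idx
    ON ccc.ar_tran_inv (tran_date)
    WHERE NOT deleted;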
 
Ok, I have a bit more information.
 
A couple of general comments first.
 
1. This is now purely an academic exercise. The SQL query that triggered this thread is unnecessarily complex, and I have a better solution. However, I think it is still worth the effort to understand what is going on.
 
2. explain.depesz.com is a brilliant tool – thanks for suggesting it.
 
As Pavel and Chris have pointed out, the problem seems to be that ar_tran_inv is scanned six times. The question is why? I have an idea, but I will need some assistance.
 
I have split my transaction table into three separate tables – ar_tran_inv, ar_tran_crn, ar_tran_rec. I then have a VIEW called ar_trans to view the transactions in total.
 
Each physical table has a primary key called 'row_id', and an index on 'tran_date'.
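 
In other words, each of the three tables is defined roughly along these lines (column types, and any columns beyond the ones mentioned, are indicative only) -
 
CREATE TABLE ccc.ar_tran_inv (
    row_id      serial PRIMARY KEY,
    tran_number varchar,
    tran_date   date
    -- plus the remaining transaction columns, omitted here
);
CREATE INDEX ar_tran_inv_tran_date_idx ON ccc.ar_tran_inv (tran_date);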
 
The view is created like this -
 
CREATE VIEW ccc.ar_trans AS
  SELECT 'ar_inv' AS tran_type, row_id AS tran_row_id, tran_number, tran_date ... FROM ccc.ar_tran_inv
  UNION ALL
  SELECT 'ar_crn' AS tran_type, row_id AS tran_row_id, tran_number, tran_date ... FROM ccc.ar_tran_crn
  UNION ALL
  SELECT 'ar_rec' AS tran_type, row_id AS tran_row_id, tran_number, tran_date ... FROM ccc.ar_tran_rec
 
In my sql_slow query, I have this 5 times, using different dates -
 
LEFT JOIN ccc.ar_trans trans_alloc_curr ON
    trans_alloc_curr.tran_type = alloc_curr.tran_type AND
    trans_alloc_curr.tran_row_id = alloc_curr.tran_row_id AND
    trans_alloc_curr.tran_date <= '2015-09-30'
 
Is it possible that it has to perform a full scan of each of the underlying tables to make the join?
 
If so, is it possible to speed this up?
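 
One way I can think of to check (just a sketch; the literal row_id is purely for illustration) is to query the view with constants and compare that plan against the joined case -
 
EXPLAIN ANALYZE
SELECT *
FROM ccc.ar_trans
WHERE tran_type = 'ar_inv'
  AND tran_row_id = 1            -- any existing row_id, for illustration only
  AND tran_date <= '2015-09-30';
 
If that version uses the indexes on the underlying tables but the joined query still shows sequential scans, then presumably the join condition is not being pushed down into the UNION ALL branches.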
 
Frank
 
