Thread: Strange nested loop for an INSERT


Strange nested loop for an INSERT

From: phb07
Hi all,
 
One of my customers has reported a performance problem when using the E-Maj extension.
It is a tool that logs the updates performed on application tables, with the capability to cancel them.
It is based on classic triggers, with a log table associated with each application table.
The performance issue, encountered in very specific situations, is the time needed to cancel a significant number of insertions.
 
I have built a simple test case that reproduces the problem without needing the extension; it just mimics the behaviour.
Attached are the psql script and its result.
 
The updates cancellation operation is done in 3 steps (sketched just after this list):
- create a temporary table holding each primary key to process
- delete from the application table all rows that are no longer wanted (previously inserted rows and new values of updated rows)
- insert into the application table the old rows we want to see again (previously deleted rows or old values of updated rows)
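For illustration, here is a minimal sketch of the first two steps, reusing the test case's table and column names (the way tmp is built here is an assumption; the statements the extension actually generates are more involved):

-- Step 1: collect the primary keys to process, together with the last
-- log sequence value (gid) recorded for each key (simplified).
CREATE TEMP TABLE tmp AS
  SELECT c1, c2, c3, c4, c5, c6, c7, max(gid) AS gid
    FROM t1_log
    GROUP BY c1, c2, c3, c4, c5, c6, c7;

-- Step 2: delete the rows that must no longer appear in the
-- application table.
DELETE FROM t1
  USING tmp
  WHERE t1.c1 = tmp.c1 AND t1.c2 = tmp.c2 AND t1.c3 = tmp.c3
    AND t1.c4 = tmp.c4 AND t1.c5 = tmp.c5 AND t1.c6 = tmp.c6
    AND t1.c7 = tmp.c7;

-- Step 3 is the INSERT ... SELECT discussed below.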
 
The performance problem concerns only the third statement (the INSERT).
I have run this test case on various recent PostgreSQL versions, from 9.1 to 9.6, with the same results.
 
The problem appears when:
- the application table has a primary key with a large number of columns (at least 7 columns in this test case)
- and nothing but INSERT statements have been executed on the application table
- and the log trigger remains active (to provide a nice feature: cancelling the cancellation!)
 
In the test case, I create a table and populate it with 100,000 rows, create the log mechanism, then insert 10,000 rows and finally cancel those 10,000 insertions.
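To make the setup concrete, here is a hedged sketch of such a log mechanism, with assumed names chosen to match the plans below (E-Maj's real implementation differs in the details):

CREATE SEQUENCE t1_log_gid_seq;

-- the log table carries the application columns plus a sequence value
-- and an 'OLD'/'NEW' marker
CREATE TABLE t1_log (LIKE t1, gid BIGINT, tuple VARCHAR(3));
CREATE INDEX t1_log_gid_tuple_idx ON t1_log (gid, tuple);

CREATE FUNCTION t1_log_fct() RETURNS TRIGGER AS $$
BEGIN
  -- record the old and/or new image of the affected row
  IF TG_OP = 'INSERT' THEN
    INSERT INTO t1_log SELECT NEW.*, nextval('t1_log_gid_seq'), 'NEW';
  ELSIF TG_OP = 'UPDATE' THEN
    INSERT INTO t1_log SELECT OLD.*, nextval('t1_log_gid_seq'), 'OLD';
    INSERT INTO t1_log SELECT NEW.*, currval('t1_log_gid_seq'), 'NEW';
  ELSIF TG_OP = 'DELETE' THEN
    INSERT INTO t1_log SELECT OLD.*, nextval('t1_log_gid_seq'), 'OLD';
  END IF;
  RETURN NULL;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER t1_log_trg
  AFTER INSERT OR UPDATE OR DELETE ON t1
  FOR EACH ROW EXECUTE PROCEDURE t1_log_fct();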
 
The faulting INSERT statement has the following explain:
explain analyze
INSERT INTO t1
  SELECT t1_log.c1,t1_log.c2,t1_log.c3,t1_log.c4,t1_log.c5,t1_log.c6,t1_log.c7,t1_log.c8
    FROM t1_log, tmp
    WHERE t1_log.c1 = tmp.c1 AND t1_log.c2 = tmp.c2 AND t1_log.c3 = tmp.c3
      AND t1_log.c4 = tmp.c4 AND t1_log.c5 = tmp.c5 AND t1_log.c6 = tmp.c6
      AND t1_log.c7 = tmp.c7
      AND t1_log.gid = tmp.gid AND t1_log.tuple = 'OLD';
                                                                                                          QUERY PLAN                                                                                                         
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Insert on t1  (cost=0.00..890.90 rows=1 width=32) (actual time=434571.193..434571.193 rows=0 loops=1)
   ->  Nested Loop  (cost=0.00..890.90 rows=1 width=32) (actual time=434571.187..434571.187 rows=0 loops=1)
         Join Filter: ((t1_log.c1 = tmp.c1) AND (t1_log.c2 = tmp.c2) AND (t1_log.c3 = tmp.c3) AND (t1_log.c4 = tmp.c4) AND (t1_log.c5 = tmp.c5) AND (t1_log.c6 = tmp.c6) AND (t1_log.c7 = tmp.c7) AND (t1_log.gid = tmp.gid))
         Rows Removed by Join Filter: 100000000
         ->  Index Scan using t1_log_gid_tuple_idx on t1_log  (cost=0.00..423.22 rows=1 width=40) (actual time=0.378..69.594 rows=10000 loops=1)
               Index Cond: ((tuple)::text = 'OLD'::text)
         ->  Seq Scan on tmp  (cost=0.00..176.17 rows=9717 width=36) (actual time=0.006..21.678 rows=10000 loops=10000)
Total runtime: 434571.243 ms
(8 rows)
 
Time: 434572,146 ms
 
When the conditions are not exactly met, I get:
explain analyze
INSERT INTO t1
  SELECT t1_log.c1,t1_log.c2,t1_log.c3,t1_log.c4,t1_log.c5,t1_log.c6,t1_log.c7,t1_log.c8
    FROM t1_log, tmp
    WHERE t1_log.c1 = tmp.c1 AND t1_log.c2 = tmp.c2 AND t1_log.c3 = tmp.c3
      AND t1_log.c4 = tmp.c4 AND t1_log.c5 = tmp.c5 AND t1_log.c6 = tmp.c6
      AND t1_log.c7 = tmp.c7
      AND t1_log.gid = tmp.gid AND t1_log.tuple = 'OLD';
                                                                                                         QUERY PLAN                                                                                                        
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Insert on t1  (cost=438.65..906.34 rows=1 width=32) (actual time=111.526..111.526 rows=0 loops=1)
   ->  Hash Join  (cost=438.65..906.34 rows=1 width=32) (actual time=111.521..111.521 rows=0 loops=1)
         Hash Cond: ((tmp.c1 = t1_log.c1) AND (tmp.c2 = t1_log.c2) AND (tmp.c3 = t1_log.c3) AND (tmp.c4 = t1_log.c4) AND (tmp.c5 = t1_log.c5) AND (tmp.c6 = t1_log.c6) AND (tmp.c7 = t1_log.c7) AND (tmp.gid = t1_log.gid))
         ->  Seq Scan on tmp  (cost=0.00..176.17 rows=9717 width=36) (actual time=0.007..22.444 rows=10000 loops=1)
         ->  Hash  (cost=435.68..435.68 rows=99 width=40) (actual time=58.300..58.300 rows=10000 loops=1)
               Buckets: 1024  Batches: 1  Memory Usage: 586kB
               ->  Seq Scan on t1_log  (cost=0.00..435.68 rows=99 width=40) (actual time=2.281..28.430 rows=10000 loops=1)
                     Filter: ((tuple)::text = 'OLD'::text)
                     Rows Removed by Filter: 10000
Total runtime: 111.603 ms
(10 rows)
 
So we get a nested loop in the bad case, instead of a hash join.
But what looks strange to me in this nested loop is that the seq scan on the tmp table is executed 10,000 times (once for each t1_log row), even though the statistics pretend that no row matches the "t1_log.tuple = 'OLD'" condition, leading to a dramatic O(n^2) behaviour.
 
I have also noticed that the problem disappears when:
- an index is added on the temporary table,
- or the log trigger is disabled,
- or enable_nestloop is disabled (currently my favourite workaround; see the example after this list),
- or when I delete from pg_statistic the row concerning the "tuple" column of the log table (which presently says that there is nothing but 'NEW' values).
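The enable_nestloop workaround can be scoped to a single transaction so it does not disturb other statements (a sketch, reusing the INSERT shown above):

BEGIN;
SET LOCAL enable_nestloop = off;  -- reverts automatically at COMMIT/ROLLBACK
INSERT INTO t1
  SELECT t1_log.c1,t1_log.c2,t1_log.c3,t1_log.c4,t1_log.c5,t1_log.c6,t1_log.c7,t1_log.c8
    FROM t1_log, tmp
    WHERE t1_log.c1 = tmp.c1 AND t1_log.c2 = tmp.c2 AND t1_log.c3 = tmp.c3
      AND t1_log.c4 = tmp.c4 AND t1_log.c5 = tmp.c5 AND t1_log.c6 = tmp.c6
      AND t1_log.c7 = tmp.c7
      AND t1_log.gid = tmp.gid AND t1_log.tuple = 'OLD';
COMMIT;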
 
Thanks in advance for any explanation of this case.
Best regards.
Philippe.

Attachments

Re: Strange nested loop for an INSERT

From: Tom Lane
phb07 <phb07@apra.asso.fr> writes:
> The performance issue, encountered in very specific situations, is the
> time needed to cancel a significant number of insertions.
> I have built a simple test case that reproduces the problem without
> needing the extension; it just mimics the behaviour.

At least for this example, the problem is that the DELETE enormously
alters the statistics for the t1_log.tuple column (going from 100% "NEW"
to 50% "NEW" and 50% "OLD"), but the plan for your last command is
generated with stats saying there are no "OLD" entries.  So you get a plan
that would be fast for small numbers of "OLD" entries, but it sucks when
there are lots of them.  The fix I would recommend is to do a manual
"ANALYZE t1_log" after such a large data change.  Auto-ANALYZE would fix
it for you after a minute or so, probably, but if your script doesn't want
to wait around then an extra ANALYZE is the ticket.
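Concretely, for this test case that is a single added statement between the mass DELETE and the INSERT:

-- refresh t1_log's statistics so the INSERT is planned against the real
-- 50/50 NEW/OLD distribution rather than the stale 100%-"NEW" one
ANALYZE t1_log;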

            regards, tom lane


Re: Strange nested loop for an INSERT

From: phb07
Thanks, Tom, for this quick answer.


On 12/09/2016 at 16:41, Tom Lane wrote:
> phb07 <phb07@apra.asso.fr> writes:
>> The performance issue, encountered in very specific situations, is the
>> time needed to cancel a significant number of insertions.
>> I have built a simple test case that reproduces the problem without
>> needing the extension; it just mimics the behaviour.
> At least for this example, the problem is that the DELETE enormously
> alters the statistics for the t1_log.tuple column (going from 100% "NEW"
> to 50% "NEW" and 50% "OLD"), but the plan for your last command is
> generated with stats saying there are no "OLD" entries.  So you get a plan
> that would be fast for small numbers of "OLD" entries, but it sucks when
> there are lots of them.  The fix I would recommend is to do a manual
> "ANALYZE t1_log" after such a large data change.  Auto-ANALYZE would fix
> it for you after a minute or so, probably, but if your script doesn't want
> to wait around then an extra ANALYZE is the ticket.
>
>             regards, tom lane
>
I understand the point (and I now realize that I should have found the
answer by myself...)
Adding an ANALYZE of the log table effectively changes the plan and
brings good performance for the INSERT statement.
The drawback is the overhead of this added ANALYZE statement. With
heavy processing like in this test case, it is worth doing. But for
common cases, it's a little bit expensive.
Still, I will keep the idea and study the best way to implement it.

Regards. Philippe.



Re: Strange nested loop for an INSERT

From: Jim Nasby
On 9/12/16 1:05 PM, phb07 wrote:
> The drawback is the overhead of this added ANALYZE statement. With
> heavy processing like in this test case, it is worth doing. But for
> common cases, it's a little bit expensive.

You could always look at the number of rows affected by a command and
make a decision on whether to ANALYZE based on that, possibly by looking
at pg_stat_all_tables.n_mod_since_analyze.
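For instance (an illustrative query, not from the thread; n_mod_since_analyze is available from PostgreSQL 9.3 on):

-- rows inserted/updated/deleted since the table was last analyzed
SELECT n_mod_since_analyze
  FROM pg_stat_all_tables
  WHERE schemaname = 'public' AND relname = 't1_log';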
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461


Re: Strange nested loop for an INSERT

From: phb07
On 21/09/2016 at 23:42, Jim Nasby wrote:
> On 9/12/16 1:05 PM, phb07 wrote:
>> The drawback is the overhead of this added ANALYZE statement. With
>> heavy processing like in this test case, it is worth doing. But for
>> common cases, it's a little bit expensive.
>
> You could always look at the number of rows affected by a command and
> make a decision on whether to ANALYZE based on that, possibly by
> looking at pg_stat_all_tables.n_mod_since_analyze.
I have solved the issue by adding an ANALYZE between the two statements. To
avoid the associated overhead in cases where it is not worthwhile, the
ANALYZE is only performed when more than 1000 rows have just been
deleted by the first statement (as the logic is embedded in a plpgsql
function, the GET DIAGNOSTICS statement provides the row count). This
threshold is approximately the point where the potential loss due to bad
estimates equals the cost of the ANALYZE.
But the idea of using the n_mod_since_analyze data to also take into
account other recent updates not yet reflected in the statistics is
very interesting.
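A hedged sketch of that logic (the variable names are illustrative and the real code lives in a plpgsql function; E-Maj's actual implementation differs):

DO $$
DECLARE
  v_deleted BIGINT;
BEGIN
  -- step 2: delete the rows to be cancelled
  DELETE FROM t1
    USING tmp
    WHERE t1.c1 = tmp.c1 AND t1.c2 = tmp.c2 AND t1.c3 = tmp.c3
      AND t1.c4 = tmp.c4 AND t1.c5 = tmp.c5 AND t1.c6 = tmp.c6
      AND t1.c7 = tmp.c7;
  GET DIAGNOSTICS v_deleted = ROW_COUNT;
  -- refresh statistics only when enough rows changed for bad estimates
  -- to cost more than the ANALYZE itself
  IF v_deleted > 1000 THEN
    ANALYZE t1_log;
  END IF;
  -- step 3: re-insert the old row images (the INSERT shown earlier)
  INSERT INTO t1
    SELECT l.c1, l.c2, l.c3, l.c4, l.c5, l.c6, l.c7, l.c8
      FROM t1_log l, tmp
      WHERE l.c1 = tmp.c1 AND l.c2 = tmp.c2 AND l.c3 = tmp.c3
        AND l.c4 = tmp.c4 AND l.c5 = tmp.c5 AND l.c6 = tmp.c6
        AND l.c7 = tmp.c7
        AND l.gid = tmp.gid AND l.tuple = 'OLD';
END;
$$;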

Thanks.


Re: Strange nested loop for an INSERT

From: Jim Nasby
On 9/23/16 12:59 PM, phb07 wrote:
>
> Le 21/09/2016 à 23:42, Jim Nasby a écrit :
>> On 9/12/16 1:05 PM, phb07 wrote:
>>> The drawback is the overhead of this added ANALYZE statement. With
>>> heavy processing like in this test case, it is worth doing. But for
>>> common cases, it's a little bit expensive.
>>
>> You could always look at the number of rows affected by a command and
>> make a decision on whether to ANALYZE based on that, possibly by
>> looking at pg_stat_all_tables.n_mod_since_analyze.
> I have solved the issue by adding an ANALYZE between the two statements. To
> avoid the associated overhead in cases where it is not worthwhile, the
> ANALYZE is only performed when more than 1000 rows have just been
> deleted by the first statement (as the logic is embedded in a plpgsql
> function, the GET DIAGNOSTICS statement provides the row count). This
> threshold is approximately the point where the potential loss due to bad
> estimates equals the cost of the ANALYZE.
> But the idea of using the n_mod_since_analyze data to also take into
> account other recent updates not yet reflected in the statistics is
> very interesting.

Another interesting possibility would be to look at
pg_catalog.pg_stat_xact_all_tables; if you add n_tup_ins, n_tup_upd and
n_tup_del together, that will tell you how much n_mod_since_analyze will
increase when your transaction commits, so you can gauge exactly how much
the current transaction has changed things.
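For example (an illustrative query, not from the thread):

-- modifications made by the current transaction, which
-- pg_stat_all_tables will only reflect after commit
SELECT n_tup_ins + n_tup_upd + n_tup_del AS mods_this_xact
  FROM pg_stat_xact_all_tables
  WHERE schemaname = 'public' AND relname = 't1_log';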
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461