Re: Hash join on int takes 8..114 seconds

От: PFC
Тема: Re: Hash join on int takes 8..114 seconds
Дата: ,
Msg-id: op.uk0xtfsrcigqcu@soyouz
(см: обсуждение, исходный текст)
Ответ на: Re: Hash join on int takes 8..114 seconds  (Tom Lane)
Список: pgsql-performance

Скрыть дерево обсуждения

Hash join on int takes 8..114 seconds  ("Andrus", )
 Re: Hash join on int takes 8..114 seconds  (PFC, )
  Re: Hash join on int takes 8..114 seconds  ("Andrus", )
 Re: Hash join on int takes 8..114 seconds  (Richard Huxton, )
  Re: Hash join on int takes 8..114 seconds  ("Andrus", )
   Re: Hash join on int takes 8..114 seconds  (Richard Huxton, )
    Re: Hash join on int takes 8..114 seconds  ("Andrus", )
   Re: Hash join on int takes 8..114 seconds  (PFC, )
    Re: Hash join on int takes 8..114 seconds  ("Andrus", )
     Re: Hash join on int takes 8..114 seconds  (, )
      Re: Hash join on int takes 8..114 seconds  ("Andrus", )
       Re: Hash join on int takes 8..114 seconds  (PFC, )
       Re: Hash join on int takes 8..114 seconds  (Alan Hodgson, )
       Re: Hash join on int takes 8..114 seconds  (Tomas Vondra, )
     Re: Hash join on int takes 8..114 seconds  (Richard Huxton, )
      Re: Hash join on int takes 8..114 seconds  ("Andrus", )
       Re: Hash join on int takes 8..114 seconds  (PFC, )
        Re: Hash join on int takes 8..114 seconds  ("Andrus", )
         Re: Hash join on int takes 8..114 seconds  (Alvaro Herrera, )
          Re: Hash join on int takes 8..114 seconds  ("Andrus", )
           Re: Hash join on int takes 8..114 seconds  (Alvaro Herrera, )
            Re: Hash join on int takes 8..114 seconds  ("Andrus", )
             Re: Hash join on int takes 8..114 seconds  (Alvaro Herrera, )
           Re: Hash join on int takes 8..114 seconds  (Tomas Vondra, )
      Re: Hash join on int takes 8..114 seconds  ("Andrus", )
       Re: Hash join on int takes 8..114 seconds  (Scott Carey, )
     Re: Hash join on int takes 8..114 seconds  (PFC, )
      Re: Hash join on int takes 8..114 seconds  (Tom Lane, )
       Re: Hash join on int takes 8..114 seconds  (PFC, )
      Re: Hash join on int takes 8..114 seconds  ("Andrus", )
       Re: Hash join on int takes 8..114 seconds  (PFC, )
        Re: Hash join on int takes 8..114 seconds  ("Andrus", )
         Re: Hash join on int takes 8..114 seconds  (Tomas Vondra, )
          Re: Hash join on int takes 8..114 seconds  ("Andrus", )
           Re: Hash join on int takes 8..114 seconds  (Tomas Vondra, )
         Re: Hash join on int takes 8..114 seconds  (Scott Carey, )
          Re: Hash join on int takes 8..114 seconds  ("Andrus", )
           Re: Hash join on int takes 8..114 seconds  (Tomas Vondra, )
            Re: Hash join on int takes 8..114 seconds  ("Andrus", )
             Re: Hash join on int takes 8..114 seconds  (, )
              Re: Hash join on int takes 8..114 seconds  ("Andrus", )

On Fri, 21 Nov 2008 21:07:02 +0100, Tom Lane <> wrote:

> PFC <> writes:
>> Index on orders_products( product_id ) and orders_products( order_id ):
>>     => Same plan
>
>>     Note that in this case, a smarter planner would use the new index to
>> perform a BitmapAnd before hitting the heap to get the rows.
>
> Considering that the query has no constraint on
> orders_products.order_id, I'm not sure what you think the extra index is
> supposed to be used *for*.
>
> (Well, we could put orders as the outside of a nestloop and then we'd
> have such a constraint, but with 30000 orders rows to process that plan
> would lose big.)
>
> (And yes, the planner did consider such a plan along the way.
> See choose_bitmap_and.)
>
>             regards, tom lane


    I think I didn't express myself correctly...

    Here the indexes are small (therefore well cached) but the
orders_products table is large (and not cached).
    To reproduce this, I put this table on a crummy slow external USB drive.
    Between each of the following queries, pg was stopped, the USB drive
unmounted, remounted, and pg restarted, to purge orders_products table out
of all caches.
    I also modified the statistical distribution (see init script at bottom
of message).

EXPLAIN ANALYZE SELECT count(*)
 FROM orders
JOIN orders_products USING (order_id)
WHERE orders.order_date BETWEEN '2000-01-01' AND '2000-02-01'
AND orders_products.product_id = 2345;
                                                                         QUERY
PLAN

---------------------------------------------------------------------------------------------------------------------------------------------------------
  Aggregate  (cost=5431.93..5431.94 rows=1 width=0) (actual
time=5176.382..5176.382 rows=1 loops=1)
    ->  Hash Join  (cost=1575.13..5431.84 rows=35 width=0) (actual
time=62.634..5176.332 rows=36 loops=1)
          Hash Cond: (orders_products.order_id = orders.order_id)
          ->  Bitmap Heap Scan on orders_products  (cost=21.27..3864.85
rows=1023 width=4) (actual time=7.041..5118.512 rows=1004 loops=1)
                Recheck Cond: (product_id = 2345)
                ->  Bitmap Index Scan on orders_products_product_order
(cost=0.00..21.02 rows=1023 width=0) (actual time=0.531..0.531 rows=1004
loops=1)
                      Index Cond: (product_id = 2345)
          ->  Hash  (cost=1130.58..1130.58 rows=33862 width=4) (actual
time=55.526..55.526 rows=31999 loops=1)
                ->  Index Scan using orders_date on orders
(cost=0.00..1130.58 rows=33862 width=4) (actual time=0.139..33.466
rows=31999 loops=1)
                      Index Cond: ((order_date >= '2000-01-01'::date) AND
(order_date <= '2000-02-01'::date))
  Total runtime: 5176.659 ms

    This is the original query ; what I don't like about it is that it
bitmapscans orders_products way too much, because it reads all orders for
the specified product, not just orders in the date period we want.

    However, since Postgres scanned all order_id's corresponding to the date
range already, to build the hash, the list of order_ids of interest is
known at no extra cost. In this case, additionnally, correlation is 100%
between order_id and date, so I can do :

test=> SELECT max(order_id), min(order_id) FROM orders WHERE order_date
BETWEEN '2000-01-01' AND '2000-02-01';
   max  | min
-------+-----
  31999 |   1

    And I can add an extra condition to the query, like this :

EXPLAIN ANALYZE SELECT count(*)
 FROM orders
JOIN orders_products USING (order_id)
WHERE orders.order_date BETWEEN '2000-01-01' AND '2000-02-01'
AND orders_products.order_id BETWEEN 1 AND 31999
AND orders_products.product_id = 2345;
                                                                              QUERY
PLAN

-----------------------------------------------------------------------------------------------------------------------------------------------------------------
  Aggregate  (cost=426.80..426.81 rows=1 width=0) (actual
time=179.233..179.233 rows=1 loops=1)
    ->  Nested Loop  (cost=0.00..426.79 rows=1 width=0) (actual
time=6.667..179.190 rows=36 loops=1)
          ->  Index Scan using orders_products_product_order on
orders_products  (cost=0.00..142.11 rows=34 width=4) (actual
time=6.559..177.597 rows=36 loops=1)
                Index Cond: ((product_id = 2345) AND (order_id >= 1) AND
(order_id <= 31999))
          ->  Index Scan using orders_pkey on orders  (cost=0.00..8.36
rows=1 width=4) (actual time=0.039..0.041 rows=1 loops=36)
                Index Cond: (orders.order_id = orders_products.order_id)
                Filter: ((orders.order_date >= '2000-01-01'::date) AND
(orders.order_date <= '2000-02-01'::date))
  Total runtime: 179.392 ms

    This is with no cache on orders_products table. About 30X faster.
    Interestingly, when everything is cached, it's even faster (about 100X)...

    The plan I was thinking about was not a nested loop with 30K loops...
this would be bad as you said. It would have been something like this :

- There is an index on (product_id, order_id)

- Build the hash from orders table (can't avoid it)

->  Hash
    ->  Index Scan using orders_date on orders
        Index Cond: ((order_date >= '2000-01-01'::date) AND (order_date <=
'2000-02-01'::date))

- A slightly twisted bitmap scan form :

->  Bitmap Heap Scan on orders_products
     Recheck Cond: (product_id = 2345) AND order_id IN (hash created above))
     ->  Bitmap Index Scan on orders_products_product_order
           Index Cond: (product_id = 2345 AND order_id IN (hash created
above))

    The Bitmap Index Scan sees the order_ids in the index it is scanning...
they could be checked before checking the visibility in the heap for the
big table.




Test script:


BEGIN;
CREATE TABLE orders (order_id INTEGER NOT NULL, order_date DATE NOT NULL);
CREATE TABLE products (product_id INTEGER NOT NULL, product_name TEXT NOT
NULL);
CREATE TABLE orders_products (order_id INTEGER NOT NULL, product_id
INTEGER NOT NULL, padding1 TEXT, padding2 TEXT) TABLESPACE usb;

INSERT INTO products SELECT n, 'product number ' || n::TEXT FROM
generate_series(1,10001) AS n;
INSERT INTO orders SELECT n,'2000-01-01'::date + (n/1000 * '1
DAY'::interval) FROM generate_series(1,1000000) AS n;

SET work_mem TO '1GB';
INSERT INTO orders_products SELECT
a,b,'aibaifbaurgbyioubyfazierugybfoaybofauez',
'hfohbdsqbhjhqsvdfiuazvfgiurvgazrhbazboifhaoifh'
   FROM (SELECT DISTINCT (1+(n/10))::INTEGER AS a,
(1+(random()*10000))::INTEGER AS b FROM generate_series( 1,9999999 ) AS n)
AS x;

DELETE FROM orders_products WHERE product_id NOT IN (SELECT product_id
 FROM products);
DELETE FROM orders_products WHERE order_id NOT IN (SELECT order_id FROM
orders);
ALTER TABLE orders ADD PRIMARY KEY (order_id);
ALTER TABLE products ADD PRIMARY KEY (product_id);
ALTER TABLE orders_products ADD PRIMARY KEY (order_id,product_id);
ALTER TABLE orders_products ADD FOREIGN KEY (product_id) REFERENCES
products( product_id ) ON DELETE CASCADE;
ALTER TABLE orders_products ADD FOREIGN KEY (order_id) REFERENCES orders(
order_id ) ON DELETE CASCADE;
CREATE INDEX orders_date ON orders( order_date );
CREATE INDEX orders_products_product_order ON orders_products( product_id,
order_id );
COMMIT;
SET work_mem TO DEFAULT;
ANALYZE;


В списке pgsql-performance по дате сообщения:

От: Gregory Stark
Дата:
Сообщение: Re: seq scan over 3.3 million rows instead of single key index access
От: Tom Lane
Дата:
Сообщение: Re: seq scan over 3.3 million rows instead of single key index access