Re: Increasing pattern index query speed

Поиск
Список
Период
Сортировка
От Mario Weilguni
Тема Re: Increasing pattern index query speed
Дата
Msg-id 492D5A01.3060109@sime.com
обсуждение исходный текст
Ответ на Re: Increasing pattern index query speed  ("Andrus" <kobruleht2@hot.ee>)
Список pgsql-performance
Andrus schrieb:
> Richard,
>
>> These are the same but the times are different. I'd be very surprised if
>> you can reproduce these times reliably.
>
> I re-tried today again and got same results: in production database
> pattern query  is many times slower that equality query.
> toode and rid base contain only single product starting with 99000010
> So both queries should scan exactly same numbers of rows.
>
>> Can I give you some wider-ranging suggestions Andrus?
>> 1. Fix the vacuuming issue in your hash-join question.
>
> I have ran VACUUM FULL VERBOSE ANALYSE and set max_fsm_pages=150000
> So issue is fixed before those tests.
>
>> 2. Monitor the system to make sure you know if/when disk activity is
>> high.
>
> I optimized this system. Now there are short (some seconds) sales
> queries about after every 5 - 300 seconds which cause few disk
> activity and add few new rows to some tables.
> I havent seen that this activity affects to this test result.
>
>> 3. *Then* start to profile individual queries and look into their plans.
>> Change the queries one at a time and monitor again.
>
> How to change pattern matching query to faster ?
>
> Andrus.
>
> Btw.
>
> I tried to reproduce this big difference in test server in 8.3 using
> sample data script below and got big difference but in opposite
> direction.
>
> explain analyze   SELECT sum(1)
> FROM   orders
> JOIN orders_products USING (order_id)
> JOIN products USING (product_id)
> WHERE orders.order_date>'2006-01-01' and ...
>
> different where clauses produce different results:
>
> AND orders_products.product_id =     '3370000000000000'  -- 880 .. 926 ms
> AND  orders_products.product_id like '3370000000000000%' -- 41 ..98 ms
>
> So patter index is 10 .. 20 times (!) faster always.
> No idea why.
>
> Test data creation script:
>
> begin;
> CREATE OR REPLACE FUNCTION Counter() RETURNS int IMMUTABLE AS
> $_$
> SELECT 3500000;
> $_$ LANGUAGE SQL;
>
> CREATE TEMP TABLE orders (order_id INTEGER NOT NULL, order_date DATE
> NOT NULL);
> CREATE TEMP TABLE products (product_id CHAR(20) NOT NULL, product_name
> char(70) NOT NULL, quantity numeric(12,2) default 1);
> CREATE TEMP TABLE orders_products (order_id INTEGER NOT NULL,
> product_id CHAR(20),
>  id serial, price numeric(12,2) default 1 );
>
> INSERT INTO products SELECT (n*power( 10,13))::INT8::CHAR(20),
>       'product number ' || n::TEXT FROM generate_series(0,13410) AS n;
>
> INSERT INTO orders
> SELECT n,'2005-01-01'::date + (4000.0 * n/Counter() * '1 DAY'::interval)
> FROM generate_series(0, Counter()/3 ) AS n;
>
> SET work_mem TO 2097151;
>
> INSERT INTO orders_products SELECT
>   generate_series/3 as  order_id,
>   ( (1+ (generate_series % 13410))*power( 10,13))::INT8::CHAR(20) AS
> product_id
> FROM generate_series(1, Counter());
>
> ALTER TABLE orders ADD PRIMARY KEY (order_id);
> ALTER TABLE products ADD PRIMARY KEY (product_id);
> ALTER TABLE orders_products ADD PRIMARY KEY (id);
>
> ALTER TABLE orders_products ADD FOREIGN KEY (product_id) REFERENCES
> products(product_id);
> ALTER TABLE orders_products ADD FOREIGN KEY (order_id) REFERENCES
> orders(order_id) ON DELETE CASCADE;
>
> CREATE INDEX orders_date ON orders( order_date );
> CREATE INDEX order_product_pattern_idx ON orders_products( product_id
> bpchar_pattern_ops );
>
> COMMIT;
> SET work_mem TO DEFAULT;
> ANALYZE;
>
No wonder that = compares bad, you created the index this way:
CREATE INDEX order_product_pattern_idx ON orders_products( product_id
bpchar_pattern_ops );
why not:
CREATE INDEX order_product_pattern_idx ON orders_products( product_id);

explain analyze   SELECT sum(1)
FROM   orders
JOIN orders_products USING (order_id)
JOIN products USING (product_id)
WHERE orders.order_date>'2006-01-01'
AND orders_products.product_id =     '3370000000000000';

                                                                QUERY
PLAN

-------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=3013.68..3013.69 rows=1 width=0) (actual
time=8.206..8.207 rows=1 loops=1)
   ->  Nested Loop  (cost=10.83..3013.21 rows=185 width=0) (actual
time=2.095..7.962 rows=189 loops=1)
         ->  Index Scan using products_pkey on products
(cost=0.00..8.27 rows=1 width=18) (actual time=0.036..0.038 rows=1 loops=1)
               Index Cond: ((product_id)::text = '3370000000000000'::text)
         ->  Nested Loop  (cost=10.83..3003.09 rows=185 width=18)
(actual time=2.052..7.474 rows=189 loops=1)
               ->  Bitmap Heap Scan on orders_products
(cost=10.83..949.68 rows=253 width=22) (actual time=0.161..0.817
rows=261 loops=1)
                     Recheck Cond: ((product_id)::text =
'3370000000000000'::text)
                     ->  Bitmap Index Scan on foo  (cost=0.00..10.76
rows=253 width=0) (actual time=0.116..0.116 rows=261 loops=1)
                           Index Cond: ((product_id)::text =
'3370000000000000'::text)
               ->  Index Scan using orders_pkey on orders
(cost=0.00..8.10 rows=1 width=4) (actual time=0.020..0.021 rows=1 loops=261)
                     Index Cond: (orders.order_id =
orders_products.order_id)
                     Filter: (orders.order_date > '2006-01-01'::date)
 Total runtime: 8.268 ms


В списке pgsql-performance по дате отправления:

Предыдущее
От: Richard Huxton
Дата:
Сообщение: Re: Increasing pattern index query speed
Следующее
От: "Chavdar Kopoev"
Дата:
Сообщение: many to many performance