Обсуждение: Shards + hash = forever running queries
Hello, We are using Postgres 9.1.4. We are struggling with a class of queries that got impossible to run after sharding a large table. Everything like: select small.something, big.anything from small join big on small.big_id = big.id; and variation such as "select * from big where id in (select big_id from small)" Since "big" was sharded, the query plan results in something like: Hash Join (cost=10000000001.23..30038997974.72 rows=10 width=753) Hash Cond: (b.id = i.big_id) -> Append (cost=0.00..20038552251.23 rows=118859245 width=11) -> Index Scan using big_201207_pkey on big_201207 b (cost=0.00..2224100.46 rows=1609634 width=12) -> Index Scan using big_201101_pkey on big_201101 b (cost=0.00..404899.71 rows=5437497 width=12) -> Index Scan using big_201104_pkey on big_201104 b (cost=0.00..349657.58 rows=4625181 width=12) -> [...all the shards] -> Hash (cost=10000000001.10..10000000001.10 rows=10 width=742) -> Seq Scan on small i (cost=10000000000.00..10000000001.10 rows=10 width=742) Postgres ends up in never-ending reads: even if "small" has only three rows I've never seen such query finishing, the time passed being even longer than a full scan on big. The plan looks sub-optimal, as it seems it first does a huge indexscan of all the partitions, then it joins the result against a small hash. 1. Can we fix the queries to work around this problem? 2. Could the planner be fixed for this scenario for PG 9.2 (or 9.3)? Creating the hash beforehand, performing an hash join for each partition and merging the results looks like it would bring it back into the realm of the runnable queries. Am I wrong? Thank you very much. -- Daniele
On Mon, Jul 23, 2012 at 11:03 AM, Daniele Varrazzo <daniele.varrazzo@gmail.com> wrote: > 1. Can we fix the queries to work around this problem? As a stop-gap measure I've defined a get_big(id) function and using it to pull in the details we're interested into from the "big" table: create function get_big (id int) returns big as $$ select * from big where id = $1; $$ language sql stable strict; I'm not completely satisfied by it though: if there's any better solution I'd be happy to know. Thank you, -- Daniele
Daniele Varrazzo <daniele.varrazzo@gmail.com> writes: > Since "big" was sharded, the query plan results in something like: > Hash Join (cost=10000000001.23..30038997974.72 rows=10 width=753) > Hash Cond: (b.id = i.big_id) > -> Append (cost=0.00..20038552251.23 rows=118859245 width=11) > -> Index Scan using big_201207_pkey on big_201207 b > (cost=0.00..2224100.46 rows=1609634 width=12) > -> Index Scan using big_201101_pkey on big_201101 b > (cost=0.00..404899.71 rows=5437497 width=12) > -> Index Scan using big_201104_pkey on big_201104 b > (cost=0.00..349657.58 rows=4625181 width=12) > -> [...all the shards] > -> Hash (cost=10000000001.10..10000000001.10 rows=10 width=742) > -> Seq Scan on small i (cost=10000000000.00..10000000001.10 > rows=10 width=742) [ squint... ] 9.1 certainly ought to be able to find a smarter plan for such a case. For instance, if I try this on 9.1 branch tip: regression=# create table p (id int primary key); NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "p_pkey" for table "p" CREATE TABLE regression=# create table c1 (primary key (id)) inherits(p); NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "c1_pkey" for table "c1" CREATE TABLE regression=# create table c2 (primary key (id)) inherits(p); NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "c2_pkey" for table "c2" CREATE TABLE regression=# explain select * from p,int4_tbl where id=f1; QUERY PLAN -------------------------------------------------------------------------------- Nested Loop (cost=0.00..53.25 rows=120 width=8) Join Filter: (public.p.id = int4_tbl.f1) -> Seq Scan on int4_tbl (cost=0.00..1.05 rows=5 width=4) -> Append (cost=0.00..10.40 rows=3 width=4) -> Index Scan using p_pkey on p (cost=0.00..1.87 rows=1 width=4) Index Cond: (id = int4_tbl.f1) -> Index Scan using c1_pkey on c1 p (cost=0.00..4.27 rows=1 width=4) Index Cond: (id = int4_tbl.f1) -> Index Scan using c2_pkey on c2 p (cost=0.00..4.27 rows=1 width=4) Index Cond: (id = int4_tbl.f1) (10 rows) You have evidently got enable_seqscan turned off, so I wonder whether the cost penalties applied by that are swamping the estimates. Do you get any better results if you re-enable that? regards, tom lane
On Mon, Jul 23, 2012 at 4:07 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Daniele Varrazzo <daniele.varrazzo@gmail.com> writes: >> Since "big" was sharded, the query plan results in something like: >> [ugly] > > [ squint... ] 9.1 certainly ought to be able to find a smarter plan for > such a case. For instance, if I try this on 9.1 branch tip: > [good] > > You have evidently got enable_seqscan turned off, so I wonder whether > the cost penalties applied by that are swamping the estimates. Do you > get any better results if you re-enable that? Hello Tom, thank you for your help. Actually, I don't know what to say. seqscan were most likely enabled when the problem showed up. I may have disabled it for testing in my session and the plan I've copied may have been generated with disabled seqscan, but the original problem (that query never completing) was reported in different sessions by different people, so the only possibility was that seqscans were disabled in the config file... which I have been confirmed was not the case. I hadn't tested in my session whether they were disabled before explicitly disabling them for testing. Matter of fact, after reading your reply, I've tested the query again... and it was fast, the plan being the nested loop of your example. :-\ What can I say, thank you for your radiation... I've tried reverting other schema changes we performed yesterday but I've not been able to reproduce the original slowness. In case we find something that may be any useful to postgres I'll report it back. Have a nice day, -- Daniele