Shiby Thomas wrote:
>
> Hi,
>
> The foll. query did not finish even after running for 20 HOURS.
> I remember some discussion about a self-join bug. Could it be because of that ?
No. Self-join is problem of old optimizer, not executor. You got output
from EXPLAIN ==> optimizer' phase is done.
> This is with the snapshot on Feb 2. Foll. is the output of explain:
> Here data is a table with 2 attributes(tid, item) and about 1.1 million tuples.
> c2(item1, item2) has about 87,000 tuples. I am running on a 8 processor
> sun sparc with a total of 2GB memory and each processor is a
> 248 MHz SUNW,UltraSPARC-II.
> I invoked postmaster as:
> postmaster -B 30000 -o "-s -F -S 16384"
^^^^^
240M of shared memory ?!
> The tables does not have any indices right now. Will it help with indices
> for the hash join plan ?
^^^^^^^^^^^^^^
No. But indices could be used in nestloop plan...
>
> assoc=> explain select item1, item2, count(t1.tid) into table f2_temp from
> data t1, data t2, c2 where t1.item = c2.item1 and t2.item = c2.item2 and
> t1.tid = t2.tid group by item1, item2;
> NOTICE: QUERY PLAN:
>
> Aggregate (cost=211518.56 size=0 width=0)
> -> Group (cost=211518.56 size=0 width=0)
> -> Sort (cost=211518.56 size=0 width=0)
> -> Hash Join (cost=211518.56 size=14 width=24)
> -> Hash Join (cost=89881.04 size=1154369 width=16)
> -> Seq Scan on t2 (cost=45447.18 size=1154369 width=8)
> -> Hash (cost=0.00 size=0 width=0)
> -> Seq Scan on c2 (cost=3447.84 size=87571 width=8)
> -> Hash (cost=0.00 size=0 width=0)
> -> Seq Scan on t1 (cost=45447.18 size=1154369 width=8)
^^^^^^^
Your big table data will be entirely in memory (this is caused
by -B 30000).
Ok. First, try to create indices on both tables.
Does it help ?
EXPLAIN ?
Vadim