Обсуждение: check sql progress
Howdy:
I'm running Postgres 7.1.3 on RedHat Linux 7.2 kernel
version 2.4.7 rel 10.
Basic SQL questions (I think). I'm trying to write
a query:
[script]
--explain
drop table t_bp_stage9;
create table t_bp_stage9 as
select
lpad (a.member_id::text,10,'0') as bp_member_id,
b.bp_disease_id,
b.bp_measure_id,
b.bp_end_period_id,
b.bp_leg_cust_memb_id,
b.bp_source_code
from
t_bp_test1 a, t_bp_stage8 b
where
cast(a.member_id_legacy_cust as char) = b.bp_leg_cust_memb_id
;
grant select on t_bp_stage9 to public;
[/script]
When I do and explain, I get this:
[explain]
psql:./test2:17: NOTICE: QUERY PLAN:
Nested Loop (cost=0.00..32520.00 rows=10000 width=84)
-> Seq Scan on t_bp_test1 a (cost=0.00..20.00 rows=1000 width=24)
-> Seq Scan on t_bp_stage8 b (cost=0.00..20.00 rows=1000 width=60)
EXPLAIN
[/explain]
I started the job from last night (about 6pm) and had to
kill it when I got in this morning (8am). I know the
data is large (t_bp_stage8 = 8183745 records;
t_bp_test1 =1677375 records) but what I haven't seen
is the partition space where the database lives get smaller
(when ever I create large tables, I check the partition
and, sure enough, the file system loses a bit of space).
My questions: How can I check to see if this query is actually
doing something? Is my query even right? Even though there's
temporary, should I index them and drop them later?
Suggestions?
Thanks!
-X
"Johnson, Shaunn" <SJohnson6@bcbsm.com> writes: > When I do and explain, I get this: > Nested Loop (cost=0.00..32520.00 rows=10000 width=84) > -> Seq Scan on t_bp_test1 a (cost=0.00..20.00 rows=1000 width=24) > -> Seq Scan on t_bp_stage8 b (cost=0.00..20.00 rows=1000 width=60) > I started the job from last night (about 6pm) and had to > kill it when I got in this morning (8am). I know the > data is large (t_bp_stage8 = 8183745 records; > t_bp_test1 =1677375 records) The planner evidently doesn't know that. Have you ever done a VACUUM (perhaps better, VACUUM ANALYZE) on these tables? Also, I'd suggest simplifying the where clause: where cast(a.member_id_legacy_cust as char) = b.bp_leg_cust_memb_id; Can't that be just where a.member_id_legacy_cust = b.bp_leg_cust_memb_id; ? I think that the added cast will probably prevent the planner from using any intelligent query plans, like merge or hash join. regards, tom lane
--Thanks for the reply:
--I will try the 'vacuum analyze' on the tables now.
--As far as:
[snip code]
where cast(a.member_id_legacy_cust as char) = b.bp_leg_cust_memb_id;
[/snip]
--The reason I did that was because "a.member_id_legacy_cust" is something
from an MS Access table and the definitions of the columns were different:
[table of t_bp_test1]
Table "t_bp_test1"
Attribute | Type | Modifier
--------------------------+------------------------+----------
prod_id | integer |
member_id | character varying(254) |
member_id_consist | character varying(254) |
member_id_legacy | character varying(254) |
member_id_legacy_consist | character varying(254) |
member_id_legacy_cust | character varying(254) |
members_consist | integer
[/table]
[table of t_bp_stage8]
Table "t_bp_stage8"
Attribute | Type | Modifier
---------------------+-----------+----------
bp_disease_id | text |
bp_measure_id | text |
bp_end_period_id | text |
bp_leg_cust_memb_id | character |
bp_source_code | text |
[/table]
--I just thought it would be easier to CAST to char.
Maybe that's what's the holdup ... but if I didn't have
cast, then I get an error:
[error without using CAST]
psql:./test2:17: ERROR: Unable to identify an operator '='
for types 'varchar' and 'bpchar'
You will have to retype this query using an explicit cast
[/error]
--Maybe I'm going about this the wrong way - could I join
or intersect and get the right results? I wonder ...
--Or, maybe create a new table with new column definitions ...
but that seems to be counterproductive ... I guess ...
--Thanks again!
-X
-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
"Johnson, Shaunn" <SJohnson6@bcbsm.com> writes:
> When I do and explain, I get this:
> Nested Loop (cost=0.00..32520.00 rows=10000 width=84)
> -> Seq Scan on t_bp_test1 a (cost=0.00..20.00 rows=1000 width=24)
> -> Seq Scan on t_bp_stage8 b (cost=0.00..20.00 rows=1000 width=60)
> I started the job from last night (about 6pm) and had to
> kill it when I got in this morning (8am). I know the
> data is large (t_bp_stage8 = 8183745 records;
> t_bp_test1 =1677375 records)
The planner evidently doesn't know that. Have you ever done a VACUUM
(perhaps better, VACUUM ANALYZE) on these tables?
Also, I'd suggest simplifying the where clause:
where cast(a.member_id_legacy_cust as char) = b.bp_leg_cust_memb_id;
Can't that be just
where a.member_id_legacy_cust = b.bp_leg_cust_memb_id;
? I think that the added cast will probably prevent the planner from
using any intelligent query plans, like merge or hash join.
regards, tom lane