Обсуждение: stuck on values in 8.2

Поиск
Список
Период
Сортировка

stuck on values in 8.2

От
Tom Allison
Дата:
OK, after reviewing many emails and what I was trying to do I
upgraded from 8.2.

Seems to work as it did in 8.1 which is a good start.

I'm doing all of this so I can use the 'values'  that was described
as being something like:

select * from (values ('one','two','three')) "foo";

But instead of getting three rows of one value each, I get three
columns of one row.
Not what I was looking for. :(

What I am doing now is run a large list (300+) of values that is
being iterated through a sql statement like:  'select * from mytable
where token = ?' to see if the value exists or not and then "do stuff".

A while back I was told that this would all be fixed if I used values
(...)

Did I misunderstand the proper care and feeding of the VALUES
statement or is there something else I need to do here?

I initially thought that I could do this with:
select t.value, v.value from
values('one','two','three') left outer join mytable using (value)

except
-- not sure how to name the column in values() to come out as name
"value"
-- and quite frankly I can't get anything to actually run other than
my first example.

Ideally the output would have something like:
v.value | t.value
----------+---------
one       |
two        |  two
three    |  three

where the value 'one' is not in the table...

Am I still able to do something like this but don't have the syntax
correct?
Or am I more SOL than RTFM?

Re: stuck on values in 8.2

От
Gregory Stark
Дата:
"Tom Allison" <tom@tacocat.net> writes:

> OK, after reviewing many emails and what I was trying to do I upgraded from 8.2.
>
> Seems to work as it did in 8.1 which is a good start.
>
> I'm doing all of this so I can use the 'values'  that was described as being
> something like:
>
> select * from (values ('one','two','three')) "foo";

SELECT * FROM (VALUES ('one'),('two'),('three')) AS foo(value)

> I initially thought that I could do this with:
> select t.value, v.value from
> values('one','two','three') left outer join mytable using (value)

  postgres=# SELECT *
    FROM (VALUES ('one'),('two'),('three')) AS foo(value)
    LEFT OUTER JOIN mytable ON (foo.value = mytable.value);

   value | value
  -------+-------
   one   |
   two   | two
   three | three
  (3 rows)

"USING" would work too but then you only get one output column rather than two
which is not so helpful in this case.

  postgres=# SELECT *
    FROM (VALUES ('one'),('two'),('three')) AS foo(value)
    LEFT OUTER JOIN mytable USING (value) ;

   value
  -------
   one
   two
   three
  (3 rows)

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com


Re: stuck on values in 8.2

От
Tom Allison
Дата:
This is getting really ugly...
it won't finish in less than .. minutes.

spam=> explain select u2.token_idx, t.token_idx, foo.token from
tokens t left outer join user_token u1 using (token_idx),
users u left outer join user_token u2 using (user_idx),
(values('one'),('want'),('examine'),('three')) as foo(token)
left outer join tokens using (token)
where u.user_idx = 15;
                                                   QUERY PLAN
------------------------------------------------------------------------
--------------------------------------
Nested Loop Left Join  (cost=423.70..4327392.26 rows=216168368 width=48)
    ->  Nested Loop  (cost=250.49..3851.69 rows=30472 width=44)
          ->  Hash Left Join  (cost=249.40..3241.16 rows=30472 width=40)
                Hash Cond: (t.token_idx = u1.token_idx)
                ->  Nested Loop  (cost=0.00..967.34 rows=29908 width=40)
                      ->  Nested Loop Left Join  (cost=0.00..33.18
rows=4 width=32)
                            ->  Values Scan on
"*VALUES*"  (cost=0.00..0.05 rows=4 width=32)
                            ->  Index Scan using tokens_token_key on
tokens  (cost=0.00..8.27 rows=1 width=16)
                                  Index Cond: ("*VALUES*".column1 =
(tokens.token)::text)
                      ->  Seq Scan on tokens t  (cost=0.00..158.77
rows=7477 width=8)
                ->  Hash  (cost=145.29..145.29 rows=8329 width=8)
                      ->  Seq Scan on user_token u1
(cost=0.00..145.29 rows=8329 width=8)
          ->  Materialize  (cost=1.09..1.10 rows=1 width=4)
                ->  Seq Scan on users u  (cost=0.00..1.09 rows=1
width=4)
                      Filter: (user_idx = 15)
    ->  Materialize  (cost=173.21..244.15 rows=7094 width=12)
          ->  Seq Scan on user_token u2  (cost=0.00..166.11 rows=7094
width=12)
                Filter: (user_idx = 15)
(18 rows)

On May 12, 2007, at 11:08 AM, Gregory Stark wrote:

>
> "Tom Allison" <tom@tacocat.net> writes:
>
>> OK, after reviewing many emails and what I was trying to do I
>> upgraded from 8.2.
>>
>> Seems to work as it did in 8.1 which is a good start.
>>
>> I'm doing all of this so I can use the 'values'  that was
>> described as being
>> something like:
>>
>> select * from (values ('one','two','three')) "foo";
>
> SELECT * FROM (VALUES ('one'),('two'),('three')) AS foo(value)
>
>> I initially thought that I could do this with:
>> select t.value, v.value from
>> values('one','two','three') left outer join mytable using (value)
>
>   postgres=# SELECT *
>     FROM (VALUES ('one'),('two'),('three')) AS foo(value)
>     LEFT OUTER JOIN mytable ON (foo.value = mytable.value);
>
>    value | value
>   -------+-------
>    one   |
>    two   | two
>    three | three
>   (3 rows)
>
> "USING" would work too but then you only get one output column
> rather than two
> which is not so helpful in this case.
>
>   postgres=# SELECT *
>     FROM (VALUES ('one'),('two'),('three')) AS foo(value)
>     LEFT OUTER JOIN mytable USING (value) ;
>
>    value
>   -------
>    one
>    two
>    three
>   (3 rows)
>
> --
>   Gregory Stark
>   EnterpriseDB          http://www.enterprisedb.com
>
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 4: Have you searched our list archives?
>
>                http://archives.postgresql.org/


Re: stuck on values in 8.2

От
Richard Broersma Jr
Дата:
--- Tom Allison <tom@tacocat.net> wrote:

> This is getting really ugly...
> it won't finish in less than .. minutes.
>
> spam=> explain select u2.token_idx, t.token_idx, foo.token from
> tokens t left outer join user_token u1 using (token_idx),
> users u left outer join user_token u2 using (user_idx),
> (values('one'),('want'),('examine'),('three')) as foo(token)
> left outer join tokens using (token)
> where u.user_idx = 15;

It looks to me that your query has (3) left joins and (3) implied cross-joins.  Perhaps reforming
your query to eliminate the cross-joins with help performance.  In regard to your tables <tokens>
and <user_tokens>, in this query you are referring to (2) separate instances of these tables when
a single instance these tables would probably work just fine. i.e.

tokens t vs. tokens,
user_token u1 vs user_token u2

Regards,
Richard Broersma Jr.

Re: stuck on values in 8.2

От
Tom Allison
Дата:
I agree I'm going in the wrong direction.

in a Venn sort of way, what I'm trying to do is:

values(...) --> left outer --> tokens --> left outer --> (user_tokens
where user_tokens.user_idx = users.user_idx and users.user_idx = 4)

To give me a list of
all values ||  any matching token || any matching user_token where
user_idx = 4

something like:
SELECT values.token, t.token_idx, ut.token_idx
FROM
(values('one'),('want'),('examine'),('three')) as values(token)
     left outer join tokens t using (token)
     left outer join ( select token_idx from user_token where
user_idx = 14) "ut"
     using (token_idx)
;

That seems to be better.
I think the part I was trying to get my brain around was how, in
postgres, do you do multiple outer joins.
On my day job I do this in Oracle without thinking, but the syntax of
postgres is new to me.
Like doing dates.  Everyone has a different way of doing dates and
they are all weird.

Now I have to go impliment it into my code and see what it actually
does.
I'm hoping to peel 3-5 seconds off each process!

On May 12, 2007, at 12:06 PM, Richard Broersma Jr wrote:

>
> --- Tom Allison <tom@tacocat.net> wrote:
>
>> This is getting really ugly...
>> it won't finish in less than .. minutes.
>>
>> spam=> explain select u2.token_idx, t.token_idx, foo.token from
>> tokens t left outer join user_token u1 using (token_idx),
>> users u left outer join user_token u2 using (user_idx),
>> (values('one'),('want'),('examine'),('three')) as foo(token)
>> left outer join tokens using (token)
>> where u.user_idx = 15;
>
> It looks to me that your query has (3) left joins and (3) implied
> cross-joins.  Perhaps reforming
> your query to eliminate the cross-joins with help performance.  In
> regard to your tables <tokens>
> and <user_tokens>, in this query you are referring to (2) separate
> instances of these tables when
> a single instance these tables would probably work just fine. i.e.
>
> tokens t vs. tokens,
> user_token u1 vs user_token u2
>
> Regards,
> Richard Broersma Jr.
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 6: explain analyze is your friend


Re: stuck on values in 8.2

От
Gregory Stark
Дата:
"Tom Allison" <tom@tacocat.net> writes:

> This is getting really ugly...
> it won't finish in less than .. minutes.
>
> spam=> explain select u2.token_idx, t.token_idx, foo.token from
> tokens t left outer join user_token u1 using (token_idx),
> users u left outer join user_token u2 using (user_idx),
> (values('one'),('want'),('examine'),('three')) as foo(token)
> left outer join tokens using (token)
> where u.user_idx = 15;

You might find it easier to follow your SQL queries if you formatted them to
make their structure clear:

         SELECT u2.token_idx, t.token_idx, foo.token
           FROM tokens t
LEFT OUTER JOIN user_token u1 USING (token_idx),
                users u
LEFT OUTER JOIN user_token u2 USING (user_idx),
                (VALUES('one'),('want'),('examine'),('three')) AS foo(token)
LEFT OUTER JOIN tokens USING (token)
          WHERE u.user_idx = 15;

Note that your query is joining 6 tables and there are two joins that don't
have any join constraint on them. So you're getting the cartesian product of
those joins. Postgres estimates your query will return 216 million records.


--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com


Re: stuck on values in 8.2

От
Tom Allison
Дата:
Thank you very much for all your help!!!

Solved this one rather nicely:

my $glue = q{'),(E'};

my $string = "(E'" . join($glue, map{quotemeta } @$tokens) . "')";

my $sql =<<SQL;
SELECT values.token, t.token_idx, ut.token_idx
FROM
(values TOKEN_LIST_STRING) as values(token)
     left outer join tokens t using (token)
     left outer join
     ( select token_idx from user_token where user_idx = $self->
{user_idx}) "ut"
     using (token_idx)
SQL

$sql =~ s/TOKEN_LIST_STRING/$string/o;

-------------------------
This is something on the order to 10-1000 times faster depending on
the various result sets matching more/less across the various tables.
"very nice..."

I'm thinking I could do the same thing where I have a lot of inserts.
But this is going to be considerably harder because I don't have any
really good experience with doing this with transactional integrity.

 From this query, I eventually want to grab all the values.token and
insert them into the token and user_token table.

I can call:
    insert into tokens(token) values(('one'),('two'),('three'));
then call:
    insert into user_token(user_idx, token_idx)
        select 14, token_idx from tokens where token in ('one','two','three')

And that should work.
HOWEVER:
If I have two simultaneous INSERT INTO token(token) queries with
overlapping values I'm going to get into all kinds of trouble with
the integrity constraint on
my index of unique tokens.  Typically I'll get an integrity violation
error.

Or am I looking at something like:
insert into tokens
select ..
values(..) as values(token)
left outer join tokens using (token)

(Am I getting the hang of this any better?)



On May 12, 2007, at 12:06 PM, Richard Broersma Jr wrote:

>
> --- Tom Allison <tom@tacocat.net> wrote:
>
>> This is getting really ugly...
>> it won't finish in less than .. minutes.
>>
>> spam=> explain select u2.token_idx, t.token_idx, foo.token from
>> tokens t left outer join user_token u1 using (token_idx),
>> users u left outer join user_token u2 using (user_idx),
>> (values('one'),('want'),('examine'),('three')) as foo(token)
>> left outer join tokens using (token)
>> where u.user_idx = 15;
>
> It looks to me that your query has (3) left joins and (3) implied
> cross-joins.  Perhaps reforming
> your query to eliminate the cross-joins with help performance.  In
> regard to your tables <tokens>
> and <user_tokens>, in this query you are referring to (2) separate
> instances of these tables when
> a single instance these tables would probably work just fine. i.e.
>
> tokens t vs. tokens,
> user_token u1 vs user_token u2
>
> Regards,
> Richard Broersma Jr.


Re: stuck on values in 8.2

От
Tom Allison
Дата:
I think I fixed the rest of my sql statements with the following:

insert into tokens (token)select values.token
from (values TOKEN_LIST_STRING ) as values(token)
left outer join tokens t using (token)
where t.token_idx is null

insert into user_token(user_idx, token_idx)select $self->{user_idx},
token_idxfrom tokens left outer join (select token_idx from
user_token where user_idx = $self->{user_idx}) ut using (token_idx)
where ut.token_idx is null
and token_idx in ($string)


I think does what I was trying to accomplish.  At least the little
test sql seems to work.

interestingly, the time to process has gone from >100s to <1s.



On May 12, 2007, at 2:23 PM, Tom Allison wrote:

> Thank you very much for all your help!!!
>
> Solved this one rather nicely:
>
> my $glue = q{'),(E'};
>
> my $string = "(E'" . join($glue, map{quotemeta } @$tokens) . "')";
>
> my $sql =<<SQL;
> SELECT values.token, t.token_idx, ut.token_idx
> FROM
> (values TOKEN_LIST_STRING) as values(token)
>     left outer join tokens t using (token)
>     left outer join
>     ( select token_idx from user_token where user_idx = $self->
> {user_idx}) "ut"
>     using (token_idx)
> SQL
>
> $sql =~ s/TOKEN_LIST_STRING/$string/o;
>
> -------------------------
> This is something on the order to 10-1000 times faster depending on
> the various result sets matching more/less across the various tables.
> "very nice..."
>
> I'm thinking I could do the same thing where I have a lot of inserts.
> But this is going to be considerably harder because I don't have
> any really good experience with doing this with transactional
> integrity.
>
> From this query, I eventually want to grab all the values.token and
> insert them into the token and user_token table.
>
> I can call:
>     insert into tokens(token) values(('one'),('two'),('three'));
> then call:
>     insert into user_token(user_idx, token_idx)
>         select 14, token_idx from tokens where token in
> ('one','two','three')
>
> And that should work.
> HOWEVER:
> If I have two simultaneous INSERT INTO token(token) queries with
> overlapping values I'm going to get into all kinds of trouble with
> the integrity constraint on
> my index of unique tokens.  Typically I'll get an integrity
> violation error.
>
> Or am I looking at something like:
> insert into tokens
> select ..
> values(..) as values(token)
> left outer join tokens using (token)
>
> (Am I getting the hang of this any better?)
>
>
>
> On May 12, 2007, at 12:06 PM, Richard Broersma Jr wrote:
>
>>
>> --- Tom Allison <tom@tacocat.net> wrote:
>>
>>> This is getting really ugly...
>>> it won't finish in less than .. minutes.
>>>
>>> spam=> explain select u2.token_idx, t.token_idx, foo.token from
>>> tokens t left outer join user_token u1 using (token_idx),
>>> users u left outer join user_token u2 using (user_idx),
>>> (values('one'),('want'),('examine'),('three')) as foo(token)
>>> left outer join tokens using (token)
>>> where u.user_idx = 15;
>>
>> It looks to me that your query has (3) left joins and (3) implied
>> cross-joins.  Perhaps reforming
>> your query to eliminate the cross-joins with help performance.  In
>> regard to your tables <tokens>
>> and <user_tokens>, in this query you are referring to (2) separate
>> instances of these tables when
>> a single instance these tables would probably work just fine. i.e.
>>
>> tokens t vs. tokens,
>> user_token u1 vs user_token u2
>>
>> Regards,
>> Richard Broersma Jr.
>
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 2: Don't 'kill -9' the postmaster