Обсуждение: select from pipe-delimited field

Поиск
Список
Период
Сортировка

select from pipe-delimited field

От
san man
Дата:
Hello all,

I am trying to do a SELECT operation with a WHERE condition. However, the column with which I am trying to do the comparison has several values which are pipe-delimited. I want to return a match(true) if the WHERE condition matches any of the bar-delimited values.

For example, SELECT id WHERE synonyms = 'word';

Here synonyms is a pipe-delimited field and I want to match "word" with any of the values of the synonyms fields.

Thanks in advance.

Neel

Re: select from pipe-delimited field

От
David Fetter
Дата:
On Mon, Aug 23, 2010 at 05:44:09PM -0500, san man wrote:
> Hello all,
>
> I am trying to do a SELECT operation with a WHERE condition. However, the
> column with which I am trying to do the comparison has several values which
> are pipe-delimited. I want to return a match(true) if the WHERE condition
> matches any of the bar-delimited values.

You'll want to normalize this table into two or more tables, at some
point.

> For example, SELECT id WHERE synonyms = 'word';
>
> Here synonyms is a pipe-delimited field and I want to match "word" with any
> of the values of the synonyms fields.

Try the LIKE function.

http://www.postgresql.org/docs/current/static/functions-matching.html#FUNCTIONS-LIKE

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Re: select from pipe-delimited field

От
Arjen Nienhuis
Дата:
On Tue, Aug 24, 2010 at 12:44 AM, san man <neelakash21@gmail.com> wrote:
Hello all,

I am trying to do a SELECT operation with a WHERE condition. However, the column with which I am trying to do the comparison has several values which are pipe-delimited. I want to return a match(true) if the WHERE condition matches any of the bar-delimited values.

For example, SELECT id WHERE synonyms = 'word';

Here synonyms is a pipe-delimited field and I want to match "word" with any of the values of the synonyms fields.


SELECT id FROM t WHERE 'word' = ANY(regexp_split_to_array(synonyms, '[|]'));
 
With postgres it's possible to make an index on regexp_split_to_array(synonyms, '[|]')) for some extra speed. But its better to store the synonyms as an array in the first place.

Thanks in advance.

Neel

Re: select from pipe-delimited field

От
san man
Дата:
Thanks for the replies.
David, I would have normalized it to 2 or more tables, but the number of bar-delimited are not fixed and as new data are added the maximum number of these values may change. Also, the problem with like I think is that matching is not strict and thus might give spurious hits.

Arjen's solution(haven't tried yet) looks better to me.

-Neel

On Mon, Aug 23, 2010 at 5:59 PM, Arjen Nienhuis <a.g.nienhuis@gmail.com> wrote:
On Tue, Aug 24, 2010 at 12:44 AM, san man <neelakash21@gmail.com> wrote:
Hello all,

I am trying to do a SELECT operation with a WHERE condition. However, the column with which I am trying to do the comparison has several values which are pipe-delimited. I want to return a match(true) if the WHERE condition matches any of the bar-delimited values.

For example, SELECT id WHERE synonyms = 'word';

Here synonyms is a pipe-delimited field and I want to match "word" with any of the values of the synonyms fields.


SELECT id FROM t WHERE 'word' = ANY(regexp_split_to_array(synonyms, '[|]'));
 
With postgres it's possible to make an index on regexp_split_to_array(synonyms, '[|]')) for some extra speed. But its better to store the synonyms as an array in the first place.

Thanks in advance.

Neel


Re: select from pipe-delimited field

От
Craig Ringer
Дата:
On 08/24/2010 07:07 AM, san man wrote:
> Thanks for the replies.
> David, I would have normalized it to 2 or more tables, but the number of
> bar-delimited are not fixed and as new data are added the maximum number
> of these values may change. Also, the problem with like I think is that
> matching is not strict and thus might give spurious hits.

You are trying to simulate arrays using your own custom setup.

If you switch from using pipe-delimeted text to an array, you can use
the PostgreSQL array operators to do what you want. You even have
(limited) indexing options.

As for normalizing the data out to another table: Sometimes performance
concerns render that undesirable. Arrays can be very useful for fairly
small amounts of data that's tightly associated with a given record,
especially things like search keys.

There's a reason that tsvector is implemented how it is, rather than as
a breakout table full of keyword associations. It has to be fast, and
indexable. It sounds like the OP's problem has the same requirements.

--
Craig Ringer

Re: select from pipe-delimited field

От
Scott Marlowe
Дата:
On Mon, Aug 23, 2010 at 8:21 PM, Craig Ringer
<craig@postnewspapers.com.au> wrote:
> On 08/24/2010 07:07 AM, san man wrote:
>>
>> Thanks for the replies.
>> David, I would have normalized it to 2 or more tables, but the number of
>> bar-delimited are not fixed and as new data are added the maximum number
>> of these values may change. Also, the problem with like I think is that
>> matching is not strict and thus might give spurious hits.
>
> You are trying to simulate arrays using your own custom setup.
>
> If you switch from using pipe-delimeted text to an array, you can use the
> PostgreSQL array operators to do what you want. You even have (limited)
> indexing options.
>
> As for normalizing the data out to another table: Sometimes performance
> concerns render that undesirable. Arrays can be very useful for fairly small
> amounts of data that's tightly associated with a given record, especially
> things like search keys.
>
> There's a reason that tsvector is implemented how it is, rather than as a
> breakout table full of keyword associations. It has to be fast, and
> indexable. It sounds like the OP's problem has the same requirements.

I think if I was going to normalize it I would normalize into a many
to many lookup table with the lookup table being in the middlle for an
it from each table.  It's no full text search, but it would probably
be faster than one big second table with lots of redundant info in it.

Re: select from pipe-delimited field

От
hubert depesz lubaczewski
Дата:
On Mon, Aug 23, 2010 at 06:07:08PM -0500, san man wrote:
> Thanks for the replies.
> David, I would have normalized it to 2 or more tables, but the number of
> bar-delimited are not fixed and as new data are added the maximum number of
> these values may change. Also, the problem with like I think is that
> matching is not strict and thus might give spurious hits.

The fact that it's not fixed is irrelevant. just add table:

create table synonyms (
    word text,
    synonym text,
    primary key (synonymi, word)
);

and you're fine. you can have as many synonyms per word as you want.

depesz