Обсуждение: only best matches with ilike or regex matching

Поиск
Список
Период
Сортировка

only best matches with ilike or regex matching

От
Ron Arts
Дата:
Hi,

this is probably an SQL question instead of PostgreSQL but here it goes.

I have a table containing phone destinations and pricing as follows:

prefix  |  cost   |  timeframe | provider
----------------------------------------
^31     |  0.02   |  1         | 1
^31     |  0.01   |  2         | 1
^31653  |  0.14   |  1         | 1
^31653  |  0.12   |  2         | 1
^31     |  0.03   |  1         | 2
^31     |  0.02   |  2         | 2
^31653  |  0.15   |  1         | 2
^31653  |  0.13   |  2         | 2

where timeframe=2 means weekends.

For a given phonenumber I need to get the list of providers with the
cheapest one first. Suppose the target phonenumber is 31653445566,
and timeframe is 2:

prefix  |  cost   |  timeframe | provider
----------------------------------------
^31653  |  0.12   |  2         | 1
^31653  |  0.13   |  2         | 2

But I cannot find a query to get this result. I only want
the ^31653 rows, and not the ^31 rows, but these both match
a 'where'31653445566' ~ prefix' clause. Using distinct does not
work as well.

Please help,

Thanks,
Ron



Вложения

Re: only best matches with ilike or regex matching

От
"Chad Wagner"
Дата:
On 1/17/07, Ron Arts <ron.arts@neonova.nl> wrote:
this is probably an SQL question instead of PostgreSQL but here it goes.

I have a table containing phone destinations and pricing as follows:

prefix  |  cost   |  timeframe | provider
----------------------------------------
^31     |  0.02   |  1         | 1
^31     |  0.01   |  2         | 1
^31653  |  0.14   |  1         | 1
^31653  |  0.12   |  2         | 1
^31     |  0.03   |  1         | 2
^31     |  0.02   |  2         | 2
^31653  |  0.15   |  1         | 2
^31653  |  0.13   |  2         | 2

where timeframe=2 means weekends.

For a given phonenumber I need to get the list of providers with the
cheapest one first. Suppose the target phonenumber is 31653445566,
and timeframe is 2:

prefix  |  cost   |  timeframe | provider
----------------------------------------
^31653  |  0.12   |  2         | 1
^31653  |  0.13   |  2         | 2

But I cannot find a query to get this result. I only want
the ^31653 rows, and not the ^31 rows, but these both match
a 'where'31653445566' ~ prefix' clause. Using distinct does not
work as well.

This is close...

# select * from phonerates where '^316534455665' like prefix || '%' and timeframe = 2 order by length(prefix) desc, cost asc limit 1;
 prefix | cost | timeframe | provider
--------+------+-----------+----------
 ^31653 | 0.12 |         2 |        1

If you want exactly as you indicated above, then...

select *
  from phonerates
 where timeframe = 2
   and prefix = (select prefix
                   from phonerates
                  where '^316534455665' like prefix || '%'
                 order by length(prefix) desc
                 limit 1)
 order by cost;

 prefix | cost | timeframe | provider
--------+------+-----------+----------
 ^31653 | 0.12 |         2 |        1
 ^31653 | 0.13 |         2 |        2



--
Chad
http://www.postgresqlforums.com/

Re: only best matches with ilike or regex matching

От
Ron Arts
Дата:
hi Chad,

thanks for responding.
Your solution points out to me that I my example
wasn't perfect in one way: it suggests that providers use
the same prefix list. But they don't. Sorry about that.

In reality this would be a more realistic example:

id | prefix  |  cost   |  timeframe | provider
----------------------------------------------
1  | ^31     |  0.02   |  1         | 1
2  | ^31     |  0.01   |  2         | 1
3  | ^31653  |  0.14   |  1         | 1
4  | ^31653  |  0.12   |  2         | 1
5  | ^31     |  0.03   |  1         | 2
6  | ^31     |  0.02   |  2         | 2
7  | ^316    |  0.15   |  1         | 2
8  | ^316    |  0.13   |  2         | 2

As you see, different providers divide up the possible
range of phone numbers in a different way.

Now your last query won't work because the subselect
will return the prefix from row 4, and this will not
match row 8. Can you offer another suggestion?

Ron

Chad Wagner schreef:
> On 1/17/07, *Ron Arts* <ron.arts@neonova.nl
> <mailto:ron.arts@neonova.nl>> wrote:
>
>     this is probably an SQL question instead of PostgreSQL but here it goes.
>
>     I have a table containing phone destinations and pricing as follows:
>
>     prefix  |  cost   |  timeframe | provider
>     ----------------------------------------
>     ^31     |  0.02   |  1         | 1
>     ^31     |  0.01   |  2         | 1
>     ^31653  |  0.14   |  1         | 1
>     ^31653  |  0.12   |  2         | 1
>     ^31     |  0.03   |  1         | 2
>     ^31     |  0.02   |  2         | 2
>     ^31653  |  0.15   |  1         | 2
>     ^31653  |  0.13   |  2         | 2
>
>     where timeframe=2 means weekends.
>
>     For a given phonenumber I need to get the list of providers with the
>     cheapest one first. Suppose the target phonenumber is 31653445566,
>     and timeframe is 2:
>
>     prefix  |  cost   |  timeframe | provider
>     ----------------------------------------
>     ^31653  |  0.12   |  2         | 1
>     ^31653  |  0.13   |  2         | 2
>
>     But I cannot find a query to get this result. I only want
>     the ^31653 rows, and not the ^31 rows, but these both match
>     a 'where'31653445566' ~ prefix' clause. Using distinct does not
>     work as well.
>
>
> This is close...
>
> # select * from phonerates where '^316534455665' like prefix || '%' and
> timeframe = 2 order by length(prefix) desc, cost asc limit 1;
>  prefix | cost | timeframe | provider
> --------+------+-----------+----------
>  ^31653 | 0.12 |         2 |        1
>
> If you want exactly as you indicated above, then...
>
> select *
>   from phonerates
>  where timeframe = 2
>    and prefix = (select prefix
>                    from phonerates
>                   where '^316534455665' like prefix || '%'
>                  order by length(prefix) desc
>                  limit 1)
>  order by cost;
>
>  prefix | cost | timeframe | provider
> --------+------+-----------+----------
>  ^31653 | 0.12 |         2 |        1
>  ^31653 | 0.13 |         2 |        2
>
>
>
> --
> Chad
> http://www.postgresqlforums.com/


Вложения

Re: only best matches with ilike or regex matching

От
"Duncan Garland"
Дата:
>
> I have a table containing phone destinations and pricing as follows:
>

SELECT * FROM tel ORDER BY cost;

SELECT * FROM tel WHERE timeframe = 2 ORDER BY cost;

SELECT * FROM tel t1
WHERE SUBSTR( '31653445566', 1, LENGTH( prefix ) ) = prefix
AND timeframe = 2
AND NOT EXISTS
( SELECT NULL FROM tel t2
  WHERE
  t1.timeframe = t2.timeframe
  AND t1.provider = t2.provider
  AND t1.prefix = SUBSTR( t2.prefix, 1, LENGTH( t1.prefix ) )
  AND LENGTH( t1.prefix ) < LENGTH( t2.prefix )
);

Produces

 id | prefix | cost | timeframe | provider
----+--------+------+-----------+----------
  2 | 31     | 0.01 |         2 |        1
  1 | 31     | 0.02 |         1 |        1
  6 | 31     | 0.02 |         2 |        2
  5 | 31     | 0.03 |         1 |        2
  4 | 31653  | 0.12 |         2 |        1
  8 | 316    | 0.13 |         2 |        2
  3 | 31653  | 0.14 |         1 |        1
  7 | 316    | 0.15 |         1 |        2
(8 rows)

 id | prefix | cost | timeframe | provider
----+--------+------+-----------+----------
  2 | 31     | 0.01 |         2 |        1
  6 | 31     | 0.02 |         2 |        2
  4 | 31653  | 0.12 |         2 |        1
  8 | 316    | 0.13 |         2 |        2
(4 rows)

 id | prefix | cost | timeframe | provider
----+--------+------+-----------+----------
  4 | 31653  | 0.12 |         2 |        1
  8 | 316    | 0.13 |         2 |        2
(2 rows)

Is that what you meant? The longest match for a given provider?

Regards

Duncan
-----Original Message-----
From: pgsql-novice-owner@postgresql.org
[mailto:pgsql-novice-owner@postgresql.org]On Behalf Of Ron Arts
Sent: 18 January 2007 07:48
To: pgsql-novice@postgresql.org
Subject: Re: [NOVICE] only best matches with ilike or regex matching


hi Chad,

thanks for responding.
Your solution points out to me that I my example
wasn't perfect in one way: it suggests that providers use
the same prefix list. But they don't. Sorry about that.

In reality this would be a more realistic example:

id | prefix  |  cost   |  timeframe | provider
----------------------------------------------
1  | ^31     |  0.02   |  1         | 1
2  | ^31     |  0.01   |  2         | 1
3  | ^31653  |  0.14   |  1         | 1
4  | ^31653  |  0.12   |  2         | 1
5  | ^31     |  0.03   |  1         | 2
6  | ^31     |  0.02   |  2         | 2
7  | ^316    |  0.15   |  1         | 2
8  | ^316    |  0.13   |  2         | 2

As you see, different providers divide up the possible
range of phone numbers in a different way.

Now your last query won't work because the subselect
will return the prefix from row 4, and this will not
match row 8. Can you offer another suggestion?

Ron

Chad Wagner schreef:
> On 1/17/07, *Ron Arts* <ron.arts@neonova.nl
> <mailto:ron.arts@neonova.nl>> wrote:
>
>     this is probably an SQL question instead of PostgreSQL but here it
goes.
>
>     I have a table containing phone destinations and pricing as follows:
>
>     prefix  |  cost   |  timeframe | provider
>     ----------------------------------------
>     ^31     |  0.02   |  1         | 1
>     ^31     |  0.01   |  2         | 1
>     ^31653  |  0.14   |  1         | 1
>     ^31653  |  0.12   |  2         | 1
>     ^31     |  0.03   |  1         | 2
>     ^31     |  0.02   |  2         | 2
>     ^31653  |  0.15   |  1         | 2
>     ^31653  |  0.13   |  2         | 2
>
>     where timeframe=2 means weekends.
>
>     For a given phonenumber I need to get the list of providers with the
>     cheapest one first. Suppose the target phonenumber is 31653445566,
>     and timeframe is 2:
>
>     prefix  |  cost   |  timeframe | provider
>     ----------------------------------------
>     ^31653  |  0.12   |  2         | 1
>     ^31653  |  0.13   |  2         | 2
>
>     But I cannot find a query to get this result. I only want
>     the ^31653 rows, and not the ^31 rows, but these both match
>     a 'where'31653445566' ~ prefix' clause. Using distinct does not
>     work as well.
>
>
> This is close...
>
> # select * from phonerates where '^316534455665' like prefix || '%' and
> timeframe = 2 order by length(prefix) desc, cost asc limit 1;
>  prefix | cost | timeframe | provider
> --------+------+-----------+----------
>  ^31653 | 0.12 |         2 |        1
>
> If you want exactly as you indicated above, then...
>
> select *
>   from phonerates
>  where timeframe = 2
>    and prefix = (select prefix
>                    from phonerates
>                   where '^316534455665' like prefix || '%'
>                  order by length(prefix) desc
>                  limit 1)
>  order by cost;
>
>  prefix | cost | timeframe | provider
> --------+------+-----------+----------
>  ^31653 | 0.12 |         2 |        1
>  ^31653 | 0.13 |         2 |        2
>
>
>
> --
> Chad
> http://www.postgresqlforums.com/



Re: only best matches with ilike or regex matching

От
Ron Arts
Дата:
Hi Duncan,

Yes. This works great. Now could this be done
using regex matching as well?

Ron

Duncan Garland schreef:
>> I have a table containing phone destinations and pricing as follows:
>>
>
> SELECT * FROM tel ORDER BY cost;
>
> SELECT * FROM tel WHERE timeframe = 2 ORDER BY cost;
>
> SELECT * FROM tel t1
> WHERE SUBSTR( '31653445566', 1, LENGTH( prefix ) ) = prefix
> AND timeframe = 2
> AND NOT EXISTS
> ( SELECT NULL FROM tel t2
>   WHERE
>   t1.timeframe = t2.timeframe
>   AND t1.provider = t2.provider
>   AND t1.prefix = SUBSTR( t2.prefix, 1, LENGTH( t1.prefix ) )
>   AND LENGTH( t1.prefix ) < LENGTH( t2.prefix )
> );
>
> Produces
>
>  id | prefix | cost | timeframe | provider
> ----+--------+------+-----------+----------
>   2 | 31     | 0.01 |         2 |        1
>   1 | 31     | 0.02 |         1 |        1
>   6 | 31     | 0.02 |         2 |        2
>   5 | 31     | 0.03 |         1 |        2
>   4 | 31653  | 0.12 |         2 |        1
>   8 | 316    | 0.13 |         2 |        2
>   3 | 31653  | 0.14 |         1 |        1
>   7 | 316    | 0.15 |         1 |        2
> (8 rows)
>
>  id | prefix | cost | timeframe | provider
> ----+--------+------+-----------+----------
>   2 | 31     | 0.01 |         2 |        1
>   6 | 31     | 0.02 |         2 |        2
>   4 | 31653  | 0.12 |         2 |        1
>   8 | 316    | 0.13 |         2 |        2
> (4 rows)
>
>  id | prefix | cost | timeframe | provider
> ----+--------+------+-----------+----------
>   4 | 31653  | 0.12 |         2 |        1
>   8 | 316    | 0.13 |         2 |        2
> (2 rows)
>
> Is that what you meant? The longest match for a given provider?
>
> Regards
>
> Duncan
> -----Original Message-----
> From: pgsql-novice-owner@postgresql.org
> [mailto:pgsql-novice-owner@postgresql.org]On Behalf Of Ron Arts
> Sent: 18 January 2007 07:48
> To: pgsql-novice@postgresql.org
> Subject: Re: [NOVICE] only best matches with ilike or regex matching
>
>
> hi Chad,
>
> thanks for responding.
> Your solution points out to me that I my example
> wasn't perfect in one way: it suggests that providers use
> the same prefix list. But they don't. Sorry about that.
>
> In reality this would be a more realistic example:
>
> id | prefix  |  cost   |  timeframe | provider
> ----------------------------------------------
> 1  | ^31     |  0.02   |  1         | 1
> 2  | ^31     |  0.01   |  2         | 1
> 3  | ^31653  |  0.14   |  1         | 1
> 4  | ^31653  |  0.12   |  2         | 1
> 5  | ^31     |  0.03   |  1         | 2
> 6  | ^31     |  0.02   |  2         | 2
> 7  | ^316    |  0.15   |  1         | 2
> 8  | ^316    |  0.13   |  2         | 2
>
> As you see, different providers divide up the possible
> range of phone numbers in a different way.
>
> Now your last query won't work because the subselect
> will return the prefix from row 4, and this will not
> match row 8. Can you offer another suggestion?
>
> Ron
>
> Chad Wagner schreef:
>> On 1/17/07, *Ron Arts* <ron.arts@neonova.nl
>> <mailto:ron.arts@neonova.nl>> wrote:
>>
>>     this is probably an SQL question instead of PostgreSQL but here it
> goes.
>>     I have a table containing phone destinations and pricing as follows:
>>
>>     prefix  |  cost   |  timeframe | provider
>>     ----------------------------------------
>>     ^31     |  0.02   |  1         | 1
>>     ^31     |  0.01   |  2         | 1
>>     ^31653  |  0.14   |  1         | 1
>>     ^31653  |  0.12   |  2         | 1
>>     ^31     |  0.03   |  1         | 2
>>     ^31     |  0.02   |  2         | 2
>>     ^31653  |  0.15   |  1         | 2
>>     ^31653  |  0.13   |  2         | 2
>>
>>     where timeframe=2 means weekends.
>>
>>     For a given phonenumber I need to get the list of providers with the
>>     cheapest one first. Suppose the target phonenumber is 31653445566,
>>     and timeframe is 2:
>>
>>     prefix  |  cost   |  timeframe | provider
>>     ----------------------------------------
>>     ^31653  |  0.12   |  2         | 1
>>     ^31653  |  0.13   |  2         | 2
>>
>>     But I cannot find a query to get this result. I only want
>>     the ^31653 rows, and not the ^31 rows, but these both match
>>     a 'where'31653445566' ~ prefix' clause. Using distinct does not
>>     work as well.
>>
>>
>> This is close...
>>
>> # select * from phonerates where '^316534455665' like prefix || '%' and
>> timeframe = 2 order by length(prefix) desc, cost asc limit 1;
>>  prefix | cost | timeframe | provider
>> --------+------+-----------+----------
>>  ^31653 | 0.12 |         2 |        1
>>
>> If you want exactly as you indicated above, then...
>>
>> select *
>>   from phonerates
>>  where timeframe = 2
>>    and prefix = (select prefix
>>                    from phonerates
>>                   where '^316534455665' like prefix || '%'
>>                  order by length(prefix) desc
>>                  limit 1)
>>  order by cost;
>>
>>  prefix | cost | timeframe | provider
>> --------+------+-----------+----------
>>  ^31653 | 0.12 |         2 |        1
>>  ^31653 | 0.13 |         2 |        2
>>
>>
>>
>> --
>> Chad
>> http://www.postgresqlforums.com/
>
>


--
NeoNova BV, The Netherlands
Professional internet and VoIP solutions

http://www.neonova.nl   Kruislaan 419              1098 VA Amsterdam
info: 020-5628292       servicedesk: 020-5628292   fax: 020-5628291
KvK Amsterdam 34151241

The following disclaimer applies to this email:
http://www.neonova.nl/maildisclaimer

Вложения