Re: Longest Common Subsequence in Postgres - Algorithm Challenge

Поиск
Список
Период
Сортировка
От Michael Paquier
Тема Re: Longest Common Subsequence in Postgres - Algorithm Challenge
Дата
Msg-id CAB7nPqTh29mKnpxjP+CPD3vNV1T_vnLi=_Jkc6W4x-F5SMvEEw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Longest Common Subsequence in Postgres - Algorithm Challenge  (Robert James <srobertjames@gmail.com>)
Ответы Re: Longest Common Subsequence in Postgres - Algorithm Challenge  (Marcin Mańk <marcin.mank@gmail.com>)
Список pgsql-general
On Tue, Jul 9, 2013 at 5:04 AM, Robert James <srobertjames@gmail.com> wrote:
> On 7/8/13, hubert depesz lubaczewski <depesz@depesz.com> wrote:
>> On Mon, Jul 08, 2013 at 09:09:26AM -0400, Robert James wrote:
>>> I have two relations, where each relation has two fields, one
>>> indicating a name and one indicating a position.  That is, each
>>> relation defines a sequence.
>>>
>>> I need to determine their longest common subsequence.  Yes, I can do
>>> this by fetching all the data into Java (or any other language) and
>>> computing it using the standard LCS dynamic programming language.  But
>>> I'd like to stay within Postgres.  Is there any way to do this?
>>
>> I'm not entirely sure I understand. Can you show us some sample data and
>> expected output?
>
> Sure.  Borrowing a good example from
> http://wordaligned.org/articles/longest-common-subsequence :
>
> Table A (val varchar primary key, pos integer):
> 1, "C"
> 2, "H"
> 3, "I"
> 4, "M"
> 5, "P"
> 6, "A"
> 7, "N"
> 8, "Z"
> 9, "E"
> 10, "E"
>
> Table B (val varchar primary key, pos integer):
> 1, "H"
> 2, "U"
> 3, "M"
> 4, "A"
> 5, "N"
>
> SELECT LongestCommonSubsequence(A,B):
> 1, "H"
> 2, "M"
> 3, "A"
> 4, "N"
>
> (Common chars are in upper case:
> cHiMpANzee
> HuMAN
> )
>
> The std dynamic programming algorithm is described at
> http://en.wikipedia.org/wiki/Longest_common_subsequence_problem
Your best bet on that would be to implement using either a procedural
language (plpython, plperl that are in core, or even another one), or
a C function and install it on server side with an extension. I'll do
the latter if I were you. Roughly such a function would take in
arguments the OIDs of the functions to compare (or their relation
name), and return a set of rows describing the longest sequence found.
--
Michael


В списке pgsql-general по дате отправления:

Предыдущее
От: "Daniel Serodio (lists)"
Дата:
Сообщение: Re: replication stops working
Следующее
От: Muhammad Bashir Al-Noimi
Дата:
Сообщение: Force ssl connection