Re: Longest Common Subsequence in Postgres - Algorithm Challenge

Поиск
Список
Период
Сортировка
От Darren Duncan
Тема Re: Longest Common Subsequence in Postgres - Algorithm Challenge
Дата
Msg-id 51DBC50C.3050808@darrenduncan.net
обсуждение исходный текст
Ответ на Re: Longest Common Subsequence in Postgres - Algorithm Challenge  (Robert James <srobertjames@gmail.com>)
Список pgsql-general
Though people talk about doing this in other languages, I think you can solve it
in plain SQL if you wanted to.

For one thing, you could start off using unordered set operations to make the
problem space smaller, such as using set intersection to see what the common
subSETs of values there are from each of the sequences, which SQL does quickly
and easily, and then you can eliminate any subsequences from the original whose
set of values don't match one of these common subsets, and only those would you
then have to compare for order.

-- Darren Duncan

On 2013.07.08 1:04 PM, Robert James wrote:
> On 7/8/13, hubert depesz lubaczewski <depesz@depesz.com> wrote:
>> On Mon, Jul 08, 2013 at 09:09:26AM -0400, Robert James wrote:
>>> I have two relations, where each relation has two fields, one
>>> indicating a name and one indicating a position.  That is, each
>>> relation defines a sequence.
>>>
>>> I need to determine their longest common subsequence.  Yes, I can do
>>> this by fetching all the data into Java (or any other language) and
>>> computing it using the standard LCS dynamic programming language.  But
>>> I'd like to stay within Postgres.  Is there any way to do this?
>>
>> I'm not entirely sure I understand. Can you show us some sample data and
>> expected output?
>
> Sure.  Borrowing a good example from
> http://wordaligned.org/articles/longest-common-subsequence :
>
> Table A (val varchar primary key, pos integer):
> 1, "C"
> 2, "H"
> 3, "I"
> 4, "M"
> 5, "P"
> 6, "A"
> 7, "N"
> 8, "Z"
> 9, "E"
> 10, "E"
>
> Table B (val varchar primary key, pos integer):
> 1, "H"
> 2, "U"
> 3, "M"
> 4, "A"
> 5, "N"
>
> SELECT LongestCommonSubsequence(A,B):
> 1, "H"
> 2, "M"
> 3, "A"
> 4, "N"
>
> (Common chars are in upper case:
> cHiMpANzee
> HuMAN
> )
>
> The std dynamic programming algorithm is described at
> http://en.wikipedia.org/wiki/Longest_common_subsequence_problem
>
>



В списке pgsql-general по дате отправления:

Предыдущее
От: Mike Christensen
Дата:
Сообщение: Re: PERFORM statement
Следующее
От: Misa Simic
Дата:
Сообщение: Re: Longest Common Subsequence in Postgres - Algorithm Challenge