Re: Finding matching words in a word game

Поиск
Список
Период
Сортировка
От Chris Angelico
Тема Re: Finding matching words in a word game
Дата
Msg-id CAPTjJmoQ9sdw81Hfb7eGd+Uvxx66N-62J8BnPdjM8z7upmtSpA@mail.gmail.com
обсуждение исходный текст
Ответ на Finding matching words in a word game  (Alexander Farber <alexander.farber@gmail.com>)
Ответы Re: Finding matching words in a word game
Список pgsql-general
On Tue, Mar 5, 2013 at 8:29 PM, Alexander Farber
<alexander.farber@gmail.com> wrote:
> is there maybe a clever way of finding all possible words
> from a given set of letters by means of PostgreSQL
> (i.e. inside the database vs. scanning all database
> rows by a PHP script, which would take too long) -
> if the dictionary is kept in a simple table like:
>
> create table good_words (
>         word varchar(16) primary key,
>         stamp timestamp default current_timestamp
> );

How many words are you looking at, in your dictionary? I wrote an
anagramming program in C++ that works off a fairly small dictionary of
common words (~60K words) and gives adequate performance without any
indexing - the time is dwarfed by just scrolling the text up the
console. The only thing I'd recommend doing differently is the
language - use one that has a proper hash/tree type, saving you the
trouble of implementing one (I implemented my own non-balancing binary
tree for the task... no wait, on examination, it seems to actually be
a linear search - and yet it has passable performance). PHP can quite
probably do everything you want here; otherwise, I'd recommend
something like Python or Pike.

Simple Python example:

words = {}
for word in dictionary:   # provide a dictionary somehow - maybe from a file/db
    words.setdefault(''.join(sorted(word)),[]).append(word)
# Voila! You now have your mapping. One-off initialization complete.

find_anagrams_of = "stop"
anagrams = words.get(''.join(sorted(find_anagrams_of)),[])
# anagrams is now a list of all known anagrams of the target -
possibly an empty list
print(anagrams)


['opts', 'post', 'pots', 'spot', 'stop', 'tops']

On my laptop, loading ~100K words took about 1 second, and the lookup
took effectively no time. I don't think there's any need for a heavy
database engine here, unless you're working with millions and millions
of words :)

Chris Angelico


В списке pgsql-general по дате отправления:

Предыдущее
От: Alvaro Herrera
Дата:
Сообщение: Re: Change owner for all tables in a database in one batch
Следующее
От: Alexander Farber
Дата:
Сообщение: Re: Finding matching words in a word game