Re: Combine Top-k with similarity search extensions

From	Shmagi Kavtaradze
Subject	Re: Combine Top-k with similarity search extensions
Date	20 November 2015, 19:13:27
Msg-id	CAHY6mawC2GN=W7gRDWgDhv9CTLLTmpU=Wq9jBAsTKtjPCMTEOg@mail.gmail.com
In response to	Re: Combine Top-k with similarity search extensions (tim.child@comcast.net)
Responses	Re: Combine Top-k with similarity search extensions
List	pgsql-novice

That would add complexity, and I have no idea how to do it. Is there an alternative?

On Fri, Nov 20, 2015 at 5:00 PM, <tim.child@comcast.net> wrote:

Shmagi,

Take the first 20 text characters and compute and store the CRC32 or MD5 of that value. That value acts as a signature. You can then find all distinct signatures, or all rows with duplicate signatures, for further analysis. You could even try building a signature on the full text string.
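
For readers unsure how to set this up, here is a minimal Python sketch of the signature idea, outside the database. It assumes the column values have already been loaded into a list of strings; the function names (`signature`, `group_by_signature`, `duplicate_groups`) and the sample rows are hypothetical, and MD5 is used only because it is one of the two hashes suggested above.

```python
import hashlib
from collections import defaultdict


def signature(text, prefix_len=20):
    """MD5 hex digest of the first `prefix_len` characters of `text`."""
    return hashlib.md5(text[:prefix_len].encode("utf-8")).hexdigest()


def group_by_signature(rows, prefix_len=20):
    """Map each signature to the list of row indices that share it."""
    groups = defaultdict(list)
    for idx, text in enumerate(rows):
        groups[signature(text, prefix_len)].append(idx)
    return dict(groups)


def duplicate_groups(rows, prefix_len=20):
    """Return only the index groups that contain more than one row."""
    groups = group_by_signature(rows, prefix_len)
    return [ids for ids in groups.values() if len(ids) > 1]


# Hypothetical sample data standing in for the text column.
rows = [
    "The quick brown fox jumps over the lazy dog",
    "The quick brown fox sleeps all day",
    "A completely different sentence",
]

# Rows 0 and 1 share their first 20 characters, so they share a signature.
candidates = duplicate_groups(rows)
```

Only rows that land in the same group need the expensive similarity check. Passing a larger `prefix_len` (or the full string length) gives the full-text signature mentioned above, at the cost of matching only near-exact duplicates.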

From: "Shmagi Kavtaradze" <kavtaradze.s@gmail.com>
To: pgsql-novice@postgresql.org
Sent: Friday, November 20, 2015 2:21:36 AM
Subject: [NOVICE] Combine Top-k with similarity search extensions

I am performing a similarity check over a column in a table with about 3500 entries. The column is populated with text data from a text file. Comparing every row against every other row yields 3500 * 3500 rows, and it takes forever to compute on my virtual machine. Is there any way to compute only the top-k results, to reduce the amount of work and the time needed? What I mean is, for example, when comparing two sentences, if the first several words do not match, stop checking that pair and move on.
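
One way to read this request is as "blocking" followed by a top-k selection: group sentences by their first few words, score only the pairs inside each group, and keep the k best. The Python sketch below illustrates that reading. It is not taken from the thread: the thread does not say which similarity function or extension is in use, so `difflib.SequenceMatcher` is a stand-in, and the names `block_key` and `top_k_similar_pairs` are hypothetical.

```python
import heapq
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations


def block_key(sentence, n_words=2):
    """First `n_words` lowercased words, used as a cheap blocking key."""
    return tuple(sentence.lower().split()[:n_words])


def top_k_similar_pairs(sentences, k=10, n_words=2):
    """Return up to k (score, i, j) tuples, best first.

    Only pairs whose first `n_words` words match are ever scored,
    so pairs that differ at the start are skipped entirely.
    """
    blocks = defaultdict(list)
    for idx, sentence in enumerate(sentences):
        blocks[block_key(sentence, n_words)].append(idx)

    def scored_pairs():
        for ids in blocks.values():
            for i, j in combinations(ids, 2):
                # Stand-in similarity; swap in the measure actually used.
                score = SequenceMatcher(None, sentences[i], sentences[j]).ratio()
                yield score, i, j

    return heapq.nlargest(k, scored_pairs())


# Hypothetical sample data.
sentences = [
    "the cat sat on the mat",
    "the cat sat on a mat",
    "the cat ran away",
    "a dog barked loudly",
]
best = top_k_similar_pairs(sentences, k=1)
```

The trade-off is recall: two sentences that are similar overall but start with different words are never compared. A smaller `n_words` compares more pairs, a larger one fewer.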
