Re: How to do faster DML

Поиск

Список

Период

Сортировка

От	Peter J. Holzer
Тема	Re: How to do faster DML
Дата	3 февраля 2024 г. 20:29:47
Msg-id	20240203202947.u45hpyiwzuodbav2@hjp.at обсуждение исходный текст
Ответ на	Re: How to do faster DML (Lok P <loknath.73@gmail.com>)
Список	pgsql-general

Дерево обсуждения

On 2024-02-03 19:25:12 +0530, Lok P wrote:
> Apology. One correction, the query is like below. I. E filter will be on on
> ctid which I believe is equivalent of rowid in oracle and we will not need the
> index on Id column then. 
>
>  But, it still runs long, so thinking any other way to make the duplicate
> removal faster? 
>
> Also wondering , the index creation which took ~2.5hrs+ , would that have been
> made faster any possible way by allowing more db resource through some session
> level db parameter setting?
>
> create table TAB1_New
> as
> SELECT  * from TAB1 A
> where CTID in
>       (select min(CTID) from TAB1
>       group by ID having count(ID)>=1 );

That »having count(ID)>=1« seems redundant to me. Surely every id which
occurs in the table occurs at least once?

Since you want ID to be unique I assume that it is already almost
unique - so only a small fraction of the ids will be duplicates. So I
would start with creating a list of duplicates:

create table tab1_dups as
select id, count(*) from tab1 group by id having count(*) > 1;

This will still take some time because it needs to build a temporary
structure large enough to hold a count for each individual id. But at
least then you'll have a much smaller table to use for further cleanup.

        hp

--
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp@hjp.at         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"

Вложения

signature.asc

В списке pgsql-general по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: How to do faster DML

Вложения