Re: duplicate primary index in bayes db from SpamAssassin

Поиск
Список
Период
Сортировка
От Michael Monnerie
Тема Re: duplicate primary index in bayes db from SpamAssassin
Дата
Msg-id 200804150826.37501@zmi.at
обсуждение исходный текст
Ответ на Re: duplicate primary index in bayes db from SpamAssassin  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-admin
On Montag, 14. April 2008 Tom Lane wrote:
> What PG version is this?

I've looked a little more into history, and PostgreSQL was 8.1.5
originally from CD, then 8.1.9 and now 8.1.11.

I'm currently investigating a way to delete duplicates from the table.
Is there any chance to do that? A
SELECT * from bayes_token;
shows no dups, as there's a primary index... at least it works here.

This is the bayes_token table and its index:
CREATE TABLE bayes_token (
    id integer DEFAULT 0 NOT NULL,
    token bytea DEFAULT ''::bytea NOT NULL,
    spam_count integer DEFAULT 0 NOT NULL,
    ham_count integer DEFAULT 0 NOT NULL,
    atime integer DEFAULT 0 NOT NULL
);
ALTER TABLE ONLY bayes_token
    ADD CONSTRAINT bayes_token_pkey PRIMARY KEY (id, token);

Could I create a new db without indices, restore the data there, and
then run a "delete from.. where duplicates"? Sometimes there's 3x the
same content in the primary index, but I'd need to delete all except
the one with the highest spam_count.

I just found I even have a duplicate in bayes_vars:

COPY bayes_vars (id, username, spam_count, ham_count, token_count,
last_expire, last_atime_delta, last_expire_reduce, oldest_token_age,
newest_token_age) FROM stdin;
1       vscan   194393  517531  2602114 1206030039      2764800 15304
1203265204      1206057801
1       vscan   194398  517535  2602799 1206030039      2764800 15304
1203265204      1206064729
4       vscan   18305   25403   2042983 1208143427      1382400 13268
1206501543      1208216540
\.

Definition:
CREATE TABLE bayes_vars (
    id serial NOT NULL,
    username character varying(200) DEFAULT ''::character varying NOT
NULL,
    spam_count integer DEFAULT 0 NOT NULL,
    ham_count integer DEFAULT 0 NOT NULL,
    token_count integer DEFAULT 0 NOT NULL,
    last_expire integer DEFAULT 0 NOT NULL,
    last_atime_delta integer DEFAULT 0 NOT NULL,
    last_expire_reduce integer DEFAULT 0 NOT NULL,
    oldest_token_age integer DEFAULT 2147483647 NOT NULL,
    newest_token_age integer DEFAULT 0 NOT NULL
);
ALTER TABLE ONLY bayes_vars
    ADD CONSTRAINT bayes_vars_pkey PRIMARY KEY (id);

But here, I could see the dups with SELECT, and even delete one record:

# select * from bayes_vars;
 id | username | spam_count | ham_count | token_count | last_expire |
last_atime_delta | last_expire_reduce | oldest_token_age |
newest_token_age

----+----------+------------+-----------+-------------+-------------+------------------+--------------------+------------------+------------------
  1 | vscan    |     194393 |    517531 |     2602114 |  1206030039 |
2764800 |              15304 |       1203265204 |       1206057801
  1 | vscan    |     194398 |    517535 |     2602799 |  1206030039 |
2764800 |              15304 |       1203265204 |       1206064729
  4 | vscan    |      18375 |     25828 |     2050196 |  1208229525 |
1382400 |              13268 |       1206501543 |       1208240610
(3 Zeilen)

bayes_pg_v1=# delete from bayes_vars where spam_count =194393;
DELETE 1
bayes_pg_v1=# select * from bayes_vars;
 id | username | spam_count | ham_count | token_count | last_expire |
last_atime_delta | last_expire_reduce | oldest_token_age |
newest_token_age

----+----------+------------+-----------+-------------+-------------+------------------+--------------------+------------------+------------------
  1 | vscan    |     194398 |    517535 |     2602799 |  1206030039 |
2764800 |              15304 |       1203265204 |       1206064729
  4 | vscan    |      18375 |     25829 |     2050215 |  1208229525 |
1382400 |              13268 |       1206501543 |       1208240637
(2 Zeilen)

mfg zmi
--
// Michael Monnerie, Ing.BSc    -----      http://it-management.at
// Tel: 0676/846 914 666                      .network.your.ideas.
// PGP Key:         "curl -s http://zmi.at/zmi.asc | gpg --import"
// Fingerprint: AC19 F9D5 36ED CD8A EF38  500E CE14 91F7 1C12 09B4
// Keyserver: www.keyserver.net                   Key-ID: 1C1209B4

Вложения

В списке pgsql-admin по дате отправления:

Предыдущее
От: yogesh@banasdairy.coop
Дата:
Сообщение: Restore Database From data folder
Следующее
От: "Mikko Partio"
Дата:
Сообщение: Re: FATAL: could not open relation xxx: No such file or directory