Обсуждение: Unique is non unique; no nulls

Поиск
Список
Период
Сортировка

Unique is non unique; no nulls

От
Razvan Costea-Barlutiu
Дата:
Hello.

Facts:
1. System: Win2003
2. Postgres 8.2
3. Use pgAdmin 1.6.2, rev.5837 to administer the database

3. Table:
CREATE TABLE mt
(
  suid character varying(70) NOT NULL DEFAULT 'suid'::character varying,
  ris_match_row_idx integer DEFAULT -1,
  ris_match_rule smallint DEFAULT 999,
  stulev_match_row_idx integer DEFAULT -1,
  sdate character varying(8)
)
4. Constraint:

ALTER TABLE mt
  ADD CONSTRAINT suid_uniq UNIQUE(suid);

5. Action : using plpgsql  - populate the table with *what I think are*
unique SUID values. No conflicts reported.


Now, here's where the confusion starts
7. Count the number of rows in the table, using pgADmin's "count"
functionality, from the table's context menu. Result: 2,768,862 rows
8. Count the number of rows, by counting the number of SUIDs:
   select count (suid) from migratek.mt - returns 2,768,862
so far so good.
but
9. Count the number of distinct values. This should be the same with the
number of rows, since there is a unique constraint there right?
   select count (distinct suid) from migratek.mt - returns 2,766,333

10. pull hair off my head
11. try to figure for 10 hours what's going on in joins involving this
table. Then realize that the unique entries in the mt table are not unique.


Does anybody have a clue on how a table with an "unique" constraint can
foster duplications? OR is is it a bug in the *distinct* algorithm
postgres uses?
Does the usage of a plpgsql function (1 large transaction) to populate a
table that has constraints has anything to do with the checking of
constraints in that table?

This has way too many ramifications for me to follow so I do appreciate
some guidance.

Razvan



Re: Unique is non unique; no nulls

От
Gregory Stark
Дата:
"Razvan Costea-Barlutiu" <cbrazvan@laitek.com> writes:

> Hello.
>
> Facts:
> 1. System: Win2003
> 2. Postgres 8.2

Is that just 8.2 or 8.2.x? Which x?

> This has way too many ramifications for me to follow so I do appreciate some
> guidance.

What do the outputs of these queries say?

select count(*),count(distinct suid) from mt;

select suid,count(*) from mt group by suid having count(*) > 1;

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com

Re: Unique is non unique; no nulls

От
Tom Lane
Дата:
Gregory Stark <stark@enterprisedb.com> writes:
> "Razvan Costea-Barlutiu" <cbrazvan@laitek.com> writes:
>> This has way too many ramifications for me to follow so I do appreciate some
>> guidance.

> What do the outputs of these queries say?

It would also be interesting to know whether REINDEXing the table
succeeds and if so whether the behavior gets any saner.

            regards, tom lane

Re: Unique is non unique; no nulls

От
cbrazvan@laitek.com
Дата:
Tom, Gregory,

Thank you for your replies. I will try reindexing the table but I am not
very sure that will make any difference.
The reason is that when I built the table, I was building it making sure
there are unique values that get trown into it, in addition to the
constraint defined in the table.

THere is a pgplsql function that says something like:
 for s in select distinct suid from migratek.stuides loop
   insert (...,suid) values (...,s.suid) into migratek.mt
So right from the start, there should be no duplications.

Hmm... maybe this will help: in the same function (i.e. transaction), I am
also doing an update on the same table, using suid as update key, that is:
update migratek.mt set sdate = smth where suid = s.suid;

Can this - performing an update on a table from within the (same) table
populating transaction that was not yet commited - throw off postgress in
some way? Especially given the large number of tuples inserted (2.7
million)
Right now, my inclination is to rewrite the function, separating the
updates from the inserts and see if that works.


Razvan

> Gregory Stark <stark@enterprisedb.com> writes:
>> "Razvan Costea-Barlutiu" <cbrazvan@laitek.com> writes:
>>> This has way too many ramifications for me to follow so I do appreciate
>>> some
>>> guidance.
>
>> What do the outputs of these queries say?
>
> It would also be interesting to know whether REINDEXing the table
> succeeds and if so whether the behavior gets any saner.
>
>             regards, tom lane
>



Re: Unique is non unique; no nulls

От
Gregory Stark
Дата:
<cbrazvan@laitek.com> writes:

> Hmm... maybe this will help: in the same function (i.e. transaction), I am
> also doing an update on the same table, using suid as update key, that is:
> update migratek.mt set sdate = smth where suid = s.suid;

Do the rows in the output to the second query I sent (the one with the large
output you attached) look like the "before" and "after" for one of these
updates?

> Can this - performing an update on a table from within the (same) table
> populating transaction that was not yet commited - throw off postgress in
> some way? Especially given the large number of tuples inserted (2.7
> million)

It certainly shouldn't. I'm still a bit skeptical that you really have a
unique index on the table. Perhaps there was an error when you created it and
you didn't notice? Or perhaps the index was dropped at some point later? Can
you send the table description including the list of indexes to be sure?

> Right now, my inclination is to rewrite the function, separating the
> updates from the inserts and see if that works.

I know the pressure to get something that works can be pretty high, but in the
long term I think it would be better to know exactly what happened here.

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com

Re: Unique is non unique; no nulls

От
Razvan Costea-Barlutiu
Дата:
I chose to rebuild the entire table as I didn't really have the
necessary time to investigate what went wrong beyond of what I already did.
I have tried various scenarios for reproducing the problem but I failed
in doing that. The fact is that there were a large number of PGAdmin
crashes in the last time, including ones that left Postgres in weird
states (connections hanging or unfinished transactions) and I can't
faithfully state where the problem lays.

The second build of the table was OK. I'll also make sure I'll update to
version 8.2.5 asap.

Thank you for your time.
Razvan C..B.

> Gregory Stark <stark@enterprisedb.com> writes:
>
>> "Razvan Costea-Barlutiu" <cbrazvan@laitek.com> writes:
>>
>>> This has way too many ramifications for me to follow so I do appreciate some
>>> guidance.
>>>
>
>
>> What do the outputs of these queries say?
>>
>
> It would also be interesting to know whether REINDEXing the table
> succeeds and if so whether the behavior gets any saner.
>
>             regards, tom lane
>
>