Re: Violation of primary key constraint

Поиск
Список
Период
Сортировка
От Toby Murray
Тема Re: Violation of primary key constraint
Дата
Msg-id CAJeqKgteO5raHeJ-MimhrmgQzkXQEq5FCUP_kc8P4ksANBhpKw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Violation of primary key constraint  (Toby Murray <toby.murray@gmail.com>)
Ответы Re: Violation of primary key constraint  (Toby Murray <toby.murray@gmail.com>)
Список pgsql-bugs
On Thu, Jan 31, 2013 at 10:57 PM, Toby Murray <toby.murray@gmail.com> wrote:
> On Thu, Jan 31, 2013 at 5:43 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Toby Murray <toby.murray@gmail.com> writes:
>>> I just had some interaction with RhodiumToad on IRC about a duplicated
>>> primary key problem I ran into today. After some poking around he
>>> suggested that I send this to -bugs since it seems like an interesting
>>> error.
>>
>> I poked around in the PK index file (thanks for sending that) and could
>> not find anything that looks wrong.  There are a lot of duplicate keys
>> (a few of the keys appear more than a thousand times) but I think this
>> is just the result of update activity that hasn't been vacuumed away
>> yet.  I count 181340233 leaf index tuples bearing 168352931 distinct
>> key values --- that makes for a dead-tuple fraction of 7.7% which is
>> not quite enough to trigger an autovacuum, so it's not terribly
>> surprising that the dups are still present.
>>
>> At this point it seems that it's not the index's fault.  What seems more
>> likely is that somehow the older heap entry failed to get marked "dead"
>> after an UPDATE.
>>
>>> ... Especially the one with the ID
>>> 26709186 since it hasn't been changed in OpenStreetMap in years so
>>> there is no reason for it to have been touched in any way since the
>>> import.
>>
>> Yeah, it's a bit hard to explain that this way unless there was an
>> UPDATE that didn't change the timestamp or version.  How sure are you
>> that the updating process always changes those?
>
> Pretty sure. The minutely change stream coming from OSM is generated
> from all objects that were modified since the last diff was generated,
> based on transaction numbers in their postgresql database. The way
> changes get into that database is through the rails API which enforces
> version number bumps on upload and sets its own timestamp based on
> when the upload came in. The only way to apply an update that doesn't
> change anything would be to apply a minutely diff from the past to an
> up-to-date database. This does happen right when you start updating
> from minutely diffs after a clean import but the overlap is a matter
> of hours, just to ensure there is no gap - not something from 2008.

Ignore this!

I was just talking some more with RhodiumToad on IRC and realized that
I completely forgot about the linestring column. Whenever a node that
is a member of a way gets changed, the way's linestring is updated by
the minutely updates. And checking again, the linestrings on the two
records with an ID of 26709186 are indeed different, but still the
same length. The node was only moved by a few centimeters I think.

Also, the node change happened on January 28th. To be precise, the
timestamp of the node is 2013-01-28 02:38:29. So it is looking like
something may have gone wrong, possibly in a single minutely update,
on January 28th. I'll do a little more digging.

Toby

В списке pgsql-bugs по дате отправления:

Предыдущее
От: Toby Murray
Дата:
Сообщение: Re: Violation of primary key constraint
Следующее
От: Toby Murray
Дата:
Сообщение: Re: Violation of primary key constraint