Re: Post-2018 messages in archives

From: Magnus Hagander
Subject: Re: Post-2018 messages in archives
Date:
Msg-id: CABUevEyzJEEi-zA4zb3p4tgPUpHdsYJvsKgRX=X_GZGdGWS3YQ@mail.gmail.com
In response to: Re: Post-2018 messages in archives  (Noah Misch <noah@leadboat.com>)
Responses: Re: Post-2018 messages in archives  (Noah Misch <noah@leadboat.com>)
List: pgsql-www
On Wed, Dec 5, 2018 at 2:53 AM Noah Misch <noah@leadboat.com> wrote:
> On Mon, Dec 03, 2018 at 10:08:20AM +0100, Magnus Hagander wrote:
> > On Mon, Dec 3, 2018 at 2:40 AM Noah Misch <noah@leadboat.com> wrote:
> > > At some point in the last few months, the archives of many mailing lists added
> > > messages dated far in the future.  For example, pgsql-hackers archives gained
> > > four messages from years 2030, 2032 and 2036:
> > >
> > > https://www.postgresql.org/list/pgsql-hackers/since/203011010000/
>
> > > Perhaps the fix is to set the archive date to the archives ingest time when
> > > the message asserts a date substantially (15min?) earlier or later.  Would
> > > that be an improvement?
>
> > Unfortunately we don't keep the ingest time separately. But for the future,
> > doing so would probably be a good idea, for other reasons as well.  I think
> > 15 minutes might be pushing it a bit given the kind of times we see around,
> > in particular with incorrectly configured timezones. But something like 24h
> > would probably work.
> >
> > Luckily, it's not too terribly bad:
> >
> > archives=# select count(*) from messages where date > now();
> >  count
> > -------
> >     10
> > (1 row)
> >
> > (out of about 1.3M messages).
> >
> > So short-term I will go process those messages manually.
>
> Data looks clean now.  Thanks.  If the problem remains as rare as it has been,
> the automated fix I was contemplating is premature.

Thanks for confirming.

I think it's still needed, in case either (1) it happens again, or (2) we fully reparse the archives, which would reset it all. It's not urgent at this point, but I've left it on my TODO list to make sure we have a safeguard in there.
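
For illustration only, here is a minimal Python sketch of the kind of safeguard discussed above. It is not the archives code: the function name archive_date, the MAX_SKEW constant, and the idea that an ingest timestamp is available at parse time are all assumptions (the thread notes that ingest time is not currently stored). It follows the proposal of falling back to the ingest time when the asserted date is too far off in either direction, using the 24h tolerance suggested above.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

# Hypothetical tolerance: the thread suggests roughly 24h rather than 15 minutes,
# to allow for senders with misconfigured timezones.
MAX_SKEW = timedelta(hours=24)


def archive_date(date_header, ingest_time, max_skew=MAX_SKEW):
    """Pick the date to store for a message (hypothetical helper).

    Returns ingest_time when the Date header is missing, unparseable, or
    differs from ingest_time by more than max_skew; otherwise returns the
    date the message claims. ingest_time must be timezone-aware.
    """
    try:
        claimed = parsedate_to_datetime(date_header)
    except (TypeError, ValueError):
        # Missing or malformed header (the exception type varies by Python version).
        return ingest_time
    if claimed.tzinfo is None:
        # A "-0000" offset yields a naive datetime; treat it as UTC.
        claimed = claimed.replace(tzinfo=timezone.utc)
    if abs(claimed - ingest_time) > max_skew:
        return ingest_time
    return claimed
```

Note that a full reparse would only benefit from a check like this if the original ingest time had been recorded per message; comparing against the time of the reparse would overwrite every old date.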
