Thread: Email not searchable in our archives

Email not searchable in our archives

From
Bruce Momjian
Date:
Would someone please find out why the attached email from Heikki is not
appearing in a search of our archives?  I tried the subject and a line
from the email and neither came up as a hit:

  http://search.postgresql.org/search?q=git+patch+review&m=1&l=&d=-1&s=r

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

Re: Email not searchable in our archives

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> Would someone please find out why the attached email from Heikki is not
> appearing in a search of our archives?

I've noticed some curious omissions lately, too.  For instance I
searched for "Include Lists for Text Search" this morning, and
successfully got hits on yesterday's and today's posts with that
title, but not the ones I wanted from last September.  Perhaps there
are chunks of last year that are missing from the tsearch index
for some reason?
        regards, tom lane


Re: Email not searchable in our archives

From
"Dave Page"
Date:
On Mon, Mar 10, 2008 at 6:55 PM, Bruce Momjian <bruce@momjian.us> wrote:
> Would someone please find out why the attached email from Heikki is not
>  appearing in a search of our archives?  I tried the subject and a line
>  from the email and neither came up as a hit:
>
>   http://search.postgresql.org/search?q=git+patch+review&m=1&l=&d=-1&s=r

Eh? The message you included is at the top of the results when I use
the query above.


-- 
Dave Page
EnterpriseDB UK Ltd: http://www.enterprisedb.com
PostgreSQL UK 2008 Conference: http://www.postgresql.org.uk


Re: Email not searchable in our archives

From
Alvaro Herrera
Date:
BTW, your mail broke our archiver.  If you go to your message page,
http://archives.postgresql.org/pgsql-www/2008-03/msg00183.php
you'll notice it only displays your part of the message -- not the
attached message.  Then you notice that the date index has no
msg00184.php nearby ... until you go to the end of it and you notice
that there's a message from Heikki dated 23 May 2007.

The problem here is that your message contains a text/plain attachment
with the dreaded "^From " line, which causes Mhonarc to think it's a
separate message.  Not sure if there's something we can do about this.
One idea would be making the separator contain the list address in the
"^From " line.

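As a rough illustration of the failure mode described above (this is not Mhonarc's actual code), the sketch below splits mailbox text at every line that starts with "From ". The addresses, dates, and the `naive_split` helper are made up for the example.

```python
# Hypothetical mailbox text holding ONE message whose text/plain attachment
# happens to contain a line starting with "From ".
MBOX = (
    "From bruce@example.org Mon Mar 10 18:55:00 2008\n"
    "Subject: Email not searchable in our archives\n"
    "\n"
    "Would someone please find out why ...\n"
    "\n"
    "From heikki@example.org Wed May 23 10:00:00 2007\n"
    "Subject: attached message\n"
    "\n"
    "Body of the attached message.\n"
)


def naive_split(mbox_text):
    """Start a new message at every line beginning with 'From '."""
    messages, current = [], []
    for line in mbox_text.splitlines(keepends=True):
        if line.startswith("From ") and current:
            messages.append("".join(current))
            current = []
        current.append(line)
    if current:
        messages.append("".join(current))
    return messages


chunks = naive_split(MBOX)
print(len(chunks))  # the single message is reported as two
```

The second chunk is really the attachment from the first message, which would explain a 2007-dated entry turning up at the end of a 2008 index.
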
-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


Re: Email not searchable in our archives

From
Bruce Momjian
Date:
Dave Page wrote:
> On Mon, Mar 10, 2008 at 6:55 PM, Bruce Momjian <bruce@momjian.us> wrote:
> > Would someone please find out why the attached email from Heikki is not
> >  appearing in a search of our archives?  I tried the subject and a line
> >  from the email and neither came up as a hit:
> >
> >   http://search.postgresql.org/search?q=git+patch+review&m=1&l=&d=-1&s=r
> 
> Eh? The message you included is at the top of the results when I use
> the query above.

It shows up now for me too, but did not this morning, about 12 hours
ago.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://postgres.enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: Email not searchable in our archives

From
Bruce Momjian
Date:
Alvaro Herrera wrote:
> BTW, your mail broke our archiver.  If you go to your message page,
> http://archives.postgresql.org/pgsql-www/2008-03/msg00183.php
> you'll notice it only displays your part of the message -- not the
> attached message.  Then you notice that the date index has no
> msg00184.php nearby ... until you go to the end of it and you notice
> that there's a message from Heikki dated 23 May 2007.
> 
> The problem here is that your message contains a text/plain attachment
> with the dreaded "^From " line, which causes Mhonarc to think it's a
> separate message.  Not sure if there's something we can do about this.
> One idea would be making the separator contain the list address in the
> "^From " line.

Yea, I can see how that would happen. Sorry.  I can probably modify my
mailer to escape those "From" lines but that isn't going to fix it for
other posters.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://postgres.enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: Email not searchable in our archives

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> Alvaro Herrera wrote:
>> The problem here is that your message contains a text/plain attachment
>> with the dreaded "^From " line, which causes Mhonarc to think it's a
>> separate message.  Not sure if there's something we can do about this.
>> One idea would be making the separator contain the list address in the
>> "^From " line.

> Yea, I can see how that would happen. Sorry.  I can probably modify my
> mailer to escape those "From" lines but that isn't going to fix it for
> other posters.

The "From " to ">From " kluge is supposed to happen during delivery into
a Unix-format mailbox.  There *is not* any restriction on messages in
flight that they not contain lines starting with "From ".  So if this
broke the archives, the fault is on the archives' side not Bruce's.

http://www.faqs.org/rfcs/rfc822.html
        regards, tom lane


Re: Email not searchable in our archives

From
Andrew Sullivan
Date:
On Mon, Mar 10, 2008 at 11:37:40PM -0400, Tom Lane wrote:
> a Unix-format mailbox.  There *is not* any restriction on messages in
> flight that they not contain lines starting with "From ".  So if this
> broke the archives, the fault is on the archives' side not Bruce's.
> 
> http://www.faqs.org/rfcs/rfc822.html

Well, you probably want to look at 2821 and 2822, also, but you're quite
right.  The way the archiver is using From sounds like a filthy hack to me.

A


Re: Email not searchable in our archives

From
Alvaro Herrera
Date:
Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > Alvaro Herrera wrote:
> >> The problem here is that your message contains a text/plain attachment
> >> with the dreaded "^From " line, which causes Mhonarc to think it's a
> >> separate message.  Not sure if there's something we can do about this.
> >> One idea would be making the separator contain the list address in the
> >> "^From " line.
> 
> > Yea, I can see how that would happen. Sorry.  I can probably modify my
> > mailer to escape those "From" lines but that isn't going to fix it for
> > other posters.
> 
> The "From " to ">From " kluge is supposed to happen during delivery into
> a Unix-format mailbox.  There *is not* any restriction on messages in
> flight that they not contain lines starting with "From ".  So if this
> broke the archives, the fault is on the archives' side not Bruce's.

Well, I'm not sure the problem is the delivery either, because the
"From " line here occurred in an attachment, not the email body itself.
I think this is more a bug in Mhonarc's message separation, which is way
too primitive.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


Re: Email not searchable in our archives

From
Alvaro Herrera
Date:
Bruce Momjian wrote:
> Dave Page wrote:
> > On Mon, Mar 10, 2008 at 6:55 PM, Bruce Momjian <bruce@momjian.us> wrote:
> > > Would someone please find out why the attached email from Heikki is not
> > >  appearing in a search of our archives?  I tried the subject and a line
> > >  from the email and neither came up as a hit:
> > >
> > >   http://search.postgresql.org/search?q=git+patch+review&m=1&l=&d=-1&s=r
> > 
> > Eh? The message you included is at the top of the results when I use
> > the query above.
> 
> It shows up now for me too, but did not this morning, about 12 hours
> ago.

Of course it shows up, but it's the copy added to the 2008-03 mbox.
Note the URL.  The original still doesn't appear.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: Email not searchable in our archives

From
Bruce Momjian
Date:
Alvaro Herrera wrote:
> Bruce Momjian wrote:
> > Dave Page wrote:
> > > On Mon, Mar 10, 2008 at 6:55 PM, Bruce Momjian <bruce@momjian.us> wrote:
> > > > Would someone please find out why the attached email from Heikki is not
> > > >  appearing in a search of our archives?  I tried the subject and a line
> > > >  from the email and neither came up as a hit:
> > > >
> > > >   http://search.postgresql.org/search?q=git+patch+review&m=1&l=&d=-1&s=r
> > > 
> > > Eh? The message you included is at the top of the results when I use
> > > the query above.
> > 
> > It shows up now for me too, but did not this morning, about 12 hours
> > ago.
> 
> Of course it shows up, but it's the copy added to the 2008-03 mbox.
> Note the URL.  The original still doesn't appear.

Oh, I see now, yea.  So who is going to find out why that email is
missing?

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://postgres.enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: Email not searchable in our archives

From
"Dave Page"
Date:
On Tue, Mar 11, 2008 at 1:02 PM, Bruce Momjian <bruce@momjian.us> wrote:
>  > Of course it shows up, but it's the copy added to the 2008-03 mbox.
>  > Note the URL.  The original still doesn't appear.
>
>  Oh, I see now, yea.  So who is going to find out why that email is
>  missing?

Google seems to find the May 07 version, so it must be in the
archives. Any ideas Magnus - being your code 'n' all :-) ?

-- 
Dave Page
EnterpriseDB UK Ltd: http://www.enterprisedb.com
PostgreSQL UK 2008 Conference: http://www.postgresql.org.uk


Re: Email not searchable in our archives

From
Tom Lane
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Tom Lane wrote:
>> The "From " to ">From " kluge is supposed to happen during delivery into
>> a Unix-format mailbox.  There *is not* any restriction on messages in
>> flight that they not contain lines starting with "From ".  So if this
>> broke the archives, the fault is on the archives' side not Bruce's.

> Well, I'm not sure the problem is the delivery either, because the
> "From " line here occurred in an attachment, not the email body itself.
> I think this is more a bug in Mhonarc's message separation, which is way
> too primitive.

Whether it's an attachment or not is irrelevant --- the standards for
this don't even know that there is such a thing as an attachment.
        regards, tom lane


Re: Email not searchable in our archives

From
Alvaro Herrera
Date:
Tom Lane wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:

> > Well, I'm not sure the problem is the delivery either, because the
> > "From " line here occurred in an attachment, not the email body itself.
> > I think this is more a bug in Mhonarc's message separation, which is way
> > too primitive.
> 
> Whether it's an attachment or not is irrelevant --- the standards for
> this don't even know that there is such a thing as an attachment.

Oh, RFC2822 clearly does -- it refers to RFC2045 through RFC2049, which
define MIME.  In any case, that "From " line is not defined by RFC822
either; it is purely an implementation matter.  As such, Mhonarc would
do better to follow the MIME standard, which says that the
message should be split at the terminators defined in the header.  Only
if no such terminators are defined should the "From " line be used.
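
To make the MIME point concrete, here is a small sketch using Python's standard `email` package. It assumes the message has already been isolated as a single unit; the addresses and the `XYZ` boundary are invented for the example. The parser splits on the boundary declared in the header, so a "From " line inside a part stays inside that part.

```python
from email import message_from_string

# Hypothetical multipart message; the second part begins with a "From " line.
RAW = (
    "From: bruce@example.org\n"
    "Subject: Email not searchable in our archives\n"
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; boundary="XYZ"\n'
    "\n"
    "--XYZ\n"
    "Content-Type: text/plain\n"
    "\n"
    "Would someone please find out why ...\n"
    "--XYZ\n"
    "Content-Type: text/plain\n"
    "\n"
    "From heikki@example.org Wed May 23 10:00:00 2007\n"
    "Subject: attached message\n"
    "--XYZ--\n"
)

msg = message_from_string(RAW)
# Collect the payload of every leaf (non-multipart) part.
parts = [p.get_payload() for p in msg.walk() if not p.is_multipart()]
print(len(parts))
print(parts[1].splitlines()[0])
```

This only shows how the parts of one message are delimited. It says nothing about how messages are separated from each other inside a mailbox file, which is a separate question.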

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: Email not searchable in our archives

From
Andrew Sullivan
Date:
On Tue, Mar 11, 2008 at 10:57:43AM -0400, Tom Lane wrote:
> 
> Whether it's an attachment or not is irrelevant --- the standards for
> this don't even know that there is such a thing as an attachment.

That's not quite right: MIME knows about message parts, which is really what
we mean by "attachment".  My bet is that the problem (I haven't looked at
the pieces that are doing this) is that something (mhonarc?) isn't handling
MIME message parts correctly (maybe during the transition to mailbox
format?).

A


Re: Email not searchable in our archives

From
Tom Lane
Date:
Andrew Sullivan <ajs@crankycanuck.ca> writes:
> On Tue, Mar 11, 2008 at 10:57:43AM -0400, Tom Lane wrote:
>> Whether it's an attachment or not is irrelevant --- the standards for
>> this don't even know that there is such a thing as an attachment.

> That's not quite right: MIME knows about message parts, which is really what
> we mean by "attachment".  My bet is that the problem (I haven't looked at
> the pieces that are doing this) is that something (mhonarc?) isn't handling
> MIME message parts correctly (maybe during the transition to mailbox
> format?).

Right, the problem is exactly that Unix mbox format knows about "From "
(and nothing else) as a message separator.  Whatever code dumps messages
into such a file *must* escape data lines beginning with "From ".
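
A minimal sketch of that escaping step, assuming the simplest variant of the convention (only lines that begin with "From " get a ">" prefix). The helper names, addresses, and dates are hypothetical and not taken from any of the tools discussed here.

```python
def escape_from_lines(body: str) -> str:
    """Prefix '>' to every body line that starts with 'From '."""
    escaped = []
    for line in body.splitlines(keepends=True):
        if line.startswith("From "):
            line = ">" + line
        escaped.append(line)
    return "".join(escaped)


def mbox_entry(sender: str, date: str, headers: str, body: str) -> str:
    """Build one mbox entry: separator line, headers, blank line, escaped body."""
    return f"From {sender} {date}\n{headers}\n\n{escape_from_lines(body)}\n"


entry = mbox_entry(
    "bruce@example.org",
    "Mon Mar 10 18:55:00 2008",
    "Subject: Email not searchable in our archives",
    "See the attached mail:\nFrom heikki@example.org Wed May 23 10:00:00 2007\nhello\n",
)
# After escaping, only the separator line still begins with "From ".
print([line for line in entry.splitlines() if line.startswith("From ")])
```

With the body escaped at write time, a reader that splits on "From " sees exactly one message here.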
        regards, tom lane


Re: Email not searchable in our archives

From
Alvaro Herrera
Date:
Tom Lane wrote:
> Andrew Sullivan <ajs@crankycanuck.ca> writes:

> > That's not quite right: MIME knows about message parts, which is really what
> > we mean by "attachment".  My bet is that the problem (I haven't looked at
> > the pieces that are doing this) is that something (mhonarc?) isn't handling
> > MIME message parts correctly (maybe during the transition to mailbox
> > format?).
> 
> Right, the problem is exactly that Unix mbox format knows about "From "
> (and nothing else) as a message separator.  Whatever code dumps messages
> into such a file *must* escape data lines beginning with "From ".

It would be Majordomo's fault then.

However, I think you'd find that if you complain about it to them, they
will tell you that they correctly handle the "From " line in the message
body but they don't touch it inside MIME parts.  And they would be right ...
Still, it would be a good idea to ask.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


Re: Email not searchable in our archives

From
Bruce Momjian
Date:
bruce wrote:
> Would someone please find out why the attached email from Heikki is not
> appearing in a search of our archives?  I tried the subject and a line
> from the email and neither came up as a hit:

OK, it has been 24 hours since I reported some emails are not being
archived and no one has even responded they are looking at the problem.

Are we unable to manage our own archive search?  If we can't, I will
start linking to another archive from the TODO list.  Right now, every
time I need a URL for the TODO list I have to troll through the archives
by date until I find the email.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://postgres.enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: Email not searchable in our archives

From
Alvaro Herrera
Date:
Bruce Momjian wrote:
> bruce wrote:
> > Would someone please find out why the attached email from Heikki is not
> > appearing in a search of our archives?  I tried the subject and a line
> > from the email and neither came up as a hit:
> 
> OK, it has been 24 hours since I reported some emails are not being
> archived and no one has even responded they are looking at the problem.

I can fix the problem by providing URLs with message ids.  Would that
help?  You can find out the Message-Id trivially from the original
email, and the URL would take you to the main message page complete with
thread links and all.

I haven't done it yet because I noticed that I'd need to create a
directory with thousands of files and I'm not sure how it is going to
work.  I've been trying to generate something of the form
msgid/f/e/fedup2007234234@momjian.us
(i.e. creating subdirs for the first letters) but apparently Mhonarc
doesn't let me do that.

Perhaps I should just try without the subdir and see if it works.
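
For reference, the path layout described above can be computed along these lines. This is only a sketch of the naming scheme: `msgid_path` and its parameters are hypothetical, and it does not address whether Mhonarc can be made to write into such subdirectories.

```python
from urllib.parse import quote


def msgid_path(message_id: str, root: str = "msgid", depth: int = 2) -> str:
    """Map a Message-Id to root/<c1>/<c2>/<message-id>."""
    msgid = message_id.strip().strip("<>")
    # Percent-encode anything unsafe in a file name ("/" in particular);
    # "@" is kept readable.
    safe_name = quote(msgid, safe="@")
    # One subdirectory per leading character; non-alphanumerics map to "_".
    shards = [c.lower() if c.isalnum() else "_" for c in msgid[:depth]]
    return "/".join([root, *shards, safe_name])


print(msgid_path("<fedup2007234234@momjian.us>"))
```

Spreading the files over two levels of subdirectories keeps any single directory from holding every message, which is the concern raised above about thousands of files in one place.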

Thoughts?

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: Email not searchable in our archives

From
"Dave Page"
Date:
On Tue, Mar 11, 2008 at 9:35 PM, Bruce Momjian <bruce@momjian.us> wrote:
> bruce wrote:
>  > Would someone please find out why the attached email from Heikki is not
>  > appearing in a search of our archives?  I tried the subject and a line
>  > from the email and neither came up as a hit:
>
>  OK, it has been 24 hours since I reported some emails are not being
>  archived and no one has even responded they are looking at the problem.
>
>  Are we unable to manage our own archive search?  If we can't, I will
>  start linking to another archive from the TODO list.  Right now, every
>  time I need a URL for the TODO list I have to troll through the archives
>  by date until I find the email.

The people that know that code well all have day jobs and limited
spare time. I don't think it's unreasonable for them to take more than
24 hours to respond.

Google seems to work fine on our archives, so there is no need to link
elsewhere.

-- 
Dave Page
EnterpriseDB UK Ltd: http://www.enterprisedb.com
PostgreSQL UK 2008 Conference: http://www.postgresql.org.uk


Re: Email not searchable in our archives

From
Bruce Momjian
Date:
Dave Page wrote:
> On Tue, Mar 11, 2008 at 9:35 PM, Bruce Momjian <bruce@momjian.us> wrote:
> > bruce wrote:
> >  > Would someone please find out why the attached email from Heikki is not
> >  > appearing in a search of our archives?  I tried the subject and a line
> >  > from the email and neither came up as a hit:
> >
> >  OK, it has been 24 hours since I reported some emails are not being
> >  archived and no one has even responded they are looking at the problem.
> >
> >  Are we unable to manage our own archive search?  If we can't, I will
> >  start linking to another archive from the TODO list.  Right now, every
> >  time I need a URL for the TODO list I have to troll through the archives
> >  by date until I find the email.
> 
> The people that know that code well all have day jobs and limited
> spare time. I don't think it's unreasonable for them to take more than
> 24 hours to respond.
> 
> Google seems to work fine on our archives, so there is no need to link
> elsewhere.

Does Google link _into_ our archives ---  ah, that does work and is a
good work-around.  Thanks.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://postgres.enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: Email not searchable in our archives

From
Bruce Momjian
Date:
Alvaro Herrera wrote:
> Bruce Momjian wrote:
> > bruce wrote:
> > > Would someone please find out why the attached email from Heikki is not
> > > appearing in a search of our archives?  I tried the subject and a line
> > > from the email and neither came up as a hit:
> > 
> > OK, it has been 24 hours since I reported some emails are not being
> > archived and no one has even responded they are looking at the problem.
> 
> I can fix the problem by providing URLs with message ids.  Would that
> help?  You can find out the Message-Id trivially from the original
> email, and the URL would take you to the main message page complete with
> thread links and all.

I have been pasting the email subject line into the search and usually
it is the first hit (when search works).

> I haven't done it yet because I noticed that I'd need to create a
> directory with thousands of files and I'm not sure how it is going to
> work.  I've been trying to generate something of the form
> msgid/f/e/fedup2007234234@momjian.us
> (i.e. creating subdirs for the first letters) but apparently Mhonarc
> doesn't let me do that.
> 
> Perhaps I should just try without the subdir and see if it works.

I am thinking we need the searches to actually work.  I can find the
emails eventually, and using Google with site:archives.postgresql.org
works pretty well for the time being.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://postgres.enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: Email not searchable in our archives

From
Bruce Momjian
Date:
Dave Page wrote:
> On Tue, Mar 11, 2008 at 9:35 PM, Bruce Momjian <bruce@momjian.us> wrote:
> > bruce wrote:
> >  > Would someone please find out why the attached email from Heikki is not
> >  > appearing in a search of our archives?  I tried the subject and a line
> >  > from the email and neither came up as a hit:
> >
> >  OK, it has been 24 hours since I reported some emails are not being
> >  archived and no one has even responded they are looking at the problem.
> >
> >  Are we unable to manage our own archive search?  If we can't, I will
> >  start linking to another archive from the TODO list.  Right now, every
> >  time I need a URL for the TODO list I have to troll through the archives
> >  by date until I find the email.
> 
> The people that know that code well all have day jobs and limited
> spare time. I don't think it's unreasonable for them to take more than
> 24 hours to respond.
> 
> Google seems to work fine on our archives, so there is no need to link
> elsewhere.

OK, but also consider I am not the only one who is doing searches, and I
have no idea how long it has been broken.  I am now finding 80% of
emails missing for June, 2007, so it is a massive issue, not just a few
emails.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://postgres.enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: Email not searchable in our archives

From
"Joshua D. Drake"
Date:
On Tue, 11 Mar 2008 19:02:25 -0300
Alvaro Herrera <alvherre@commandprompt.com> wrote:
> Perhaps I should just try without the subdir and see if it works.

I am wondering if we should bail out of Mhonarc altogether. Do we
actually need it? We have the actual mbox files right? Couldn't we
build our own parser for whatever?

As a note, mailman also uses mbox files. We could try its archive
generation capability. I am not suggesting we move to mailman, just
that we use an existing tool that may work better to generate the
archives themselves.

Either way, this is certainly not urgent, although it is important.

Joshua D. Drake



--
The PostgreSQL Company since 1997: http://www.commandprompt.com/
PostgreSQL Community Conference: http://www.postgresqlconference.org/
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL SPI Liaison | SPI Director |  PostgreSQL political pundit


Re: Email not searchable in our archives

From
Bruce Momjian
Date:
Joshua D. Drake wrote:
> On Tue, 11 Mar 2008 19:02:25 -0300
> Alvaro Herrera <alvherre@commandprompt.com> wrote:
>  
> > Perhaps I should just try without the subdir and see if it works.
> 
> I am wondering if we should bail out of Mhonarc altogether. Do we
> actually need it? We have the actual mbox files right? Couldn't we
> build our own parser for whatever?
> 
> As a note, mailman also uses mbox files. We could try its archive
> generation capability. I am not suggesting we move to mailman, just
> that we use an existing tool that may work better to generate the
> archives themselves.
> 
> Either way, this is certainly not urgent, although it is important.

FYI, I have set up a custom Google search site for
archives.postgresql.org:
http://www.google.com/coop/cse?cx=008259951665801127283%3A7jpbk6al2qu


--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://postgres.enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: Email not searchable in our archives

From
Alvaro Herrera
Date:
Bruce Momjian wrote:
> Alvaro Herrera wrote:

> > I can fix the problem by providing URLs with message ids.  Would that
> > help?  You can find out the Message-Id trivially from the original
> > email, and the URL would take you to the main message page complete with
> > thread links and all.
> 
> I have been pasting the email subject line into the search and usually
> it is the first hit (when search works).

I am not saying we should continue to have a broken search -- I only say
that I can fix your particular use case.  If you're not interested in it
I can easily push the issue down in my TODO list.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: Email not searchable in our archives

From
Bruce Momjian
Date:
bruce wrote:
> Dave Page wrote:
> > On Tue, Mar 11, 2008 at 9:35 PM, Bruce Momjian <bruce@momjian.us> wrote:
> > > bruce wrote:
> > >  > Would someone please find out why the attached email from Heikki is not
> > >  > appearing in a search of our archives?  I tried the subject and a line
> > >  > from the email and neither came up as a hit:
> > >
> > >  OK, it has been 24 hours since I reported some emails are not being
> > >  archived and no one has even responded they are looking at the problem.
> > >
> > >  Are we unable to manage our own archive search?  If we can't, I will
> > >  start linking to another archive from the TODO list.  Right now, every
> > >  time I need a URL for the TODO list I have to troll through the archives
> > >  by date until I find the email.
> > 
> > The people that know that code well all have day jobs and limited
> > spare time. I don't think it's unreasonable for them to take more than
> > 24 hours to respond.
> > 
> > Google seems to work fine on our archives, so there is no need to link
> > elsewhere.
> 
> Does Google link _into_ our archives ---  ah, that does work and is a
> good work-around.  Thanks.

OK, now Google search isn't finding this email either:
http://archives.postgresql.org/pgsql-hackers/2007-08/msg00055.php

See this search:

http://www.google.com/search?hl=en&client=firefox-a&rls=com.ubuntu%3Aen-US%3Aofficial&hs=eqZ&q=Re%3A+clog_buffers+to+64+in+8.3+site%3Aarchives.postgresql.org&btnG=Search

It sees this email:
http://archives.postgresql.org/pgsql-hackers/2007-09/msg00636.php

but not the emails from August on the same subject.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://postgres.enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: Email not searchable in our archives

From
Bruce Momjian
Date:
Alvaro Herrera wrote:
> Bruce Momjian wrote:
> > Alvaro Herrera wrote:
> 
> > > I can fix the problem by providing URLs with message ids.  Would that
> > > help?  You can find out the Message-Id trivially from the original
> > > email, and the URL would take you to the main message page complete with
> > > thread links and all.
> > 
> > I have been pasting the email subject line into the search and usually
> > it is the first hit (when search works).
> 
> I am not saying we should continue to have a broken search -- I only say
> that I can fix your particular use case.  If you're not interested in it
> I can easily push the issue down in my TODO list.

I don't feel it is right that you have to push up a TODO item just
because the search is broken.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://postgres.enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: Email not searchable in our archives

From
Alvaro Herrera
Date:
Joshua D. Drake wrote:

> I am wondering if we should bail out of Mhonarc altogether. Do we
> actually need it? We have the actual mbox files right? Couldn't we
> build our own parser for whatever?
> 
> As a note, mailman also uses mbox files. We could try its archive
> generation capability.

Mailman archives are just as crappy, if not crappier.  And they know it.
It's based on Hypermail; I note that Hypermail's latest version was
released in 2003.  The Mailman guys are rethinking the issue; see
http://wiki.list.org/display/DEV/ModernArchiving

Somebody suggests Lurker as one alternative:
http://lurker.sourceforge.net/

It is a very different interface.  Perhaps we could try it as an
experiment.  I have seen the Debian lists under it and it feels really
martian.

I don't want to lose Mhonarc, at least not for the moment.  It is
powerful and customizable and has served us reasonably well for a very
long time.  (Longer than most of us, actually.)

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: Email not searchable in our archives

From
Alvaro Herrera
Date:
Bruce Momjian wrote:
> Alvaro Herrera wrote:

> > I am not saying we should continue to have a broken search -- I only say
> > that I can fix your particular use case.  If you're not interested in it
> > I can easily push the issue down in my TODO list.
> 
> I don't feel it is right that you have to push up a TODO item just
> because the search is broken.

Well, then it means somebody else has to push up the "fix the search"
TODO item ;-)

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


Re: Email not searchable in our archives

From
Tom Lane
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> I don't want to lose Mhonarc, at least not for the moment.  It is
> powerful and customizable and has served us reasonably well for a very
> long time.  (Longer than most of us, actually.)

Agreed.  One of the problems with moving off it is that if we change to
something else that breaks mbox files at different points, we will
invalidate archive URLs.  We went there once already by accident and it
was not fun.

The From-line problem is minor anyway.  I think the real issue is why
the heck is our text search missing some old mail?  AFAIK no one has a
clue where that problem is, so it's premature to blame any particular
component.
        regards, tom lane


Re: Email not searchable in our archives

From
"Marc G. Fournier"
Date:
Alvaro, have you checked with the mhonarc folks about the From-line issue?

--On Tuesday, March 11, 2008 22:03:46 -0300 Alvaro Herrera 
<alvherre@commandprompt.com> wrote:

> Joshua D. Drake wrote:
>
>> I am wondering if we should bail out of Mhonarc altogether. Do we
>> actually need it? We have the actual mbox files right? Couldn't we
>> build our own parser for whatever?
>>
>> As a note, mailman also uses mbox files. We could try its archive
>> generation capability.
>
> Mailman archives are just as crappy, if not crappier.  And they know it.
> It's based on Hypermail; I note that Hypermail's latest version was
> released in 2003.  The Mailman guys are rethinking the issue; see
> http://wiki.list.org/display/DEV/ModernArchiving
>
> Somebody suggests Lurker as one alternative:
> http://lurker.sourceforge.net/
>
> It is a very different interface.  Perhaps we could try it as an
> experiment.  I have seen the Debian lists under it and it feels really
> martian.
>
> I don't want to lose Mhonarc, at least not for the moment.  It is
> powerful and customizable and has served us reasonably well for a very
> long time.  (Longer than most of us, actually.)
>
> --
> Alvaro Herrera                                http://www.CommandPrompt.com/
> PostgreSQL Replication, Consulting, Custom Development, 24x7 support
>



----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email . scrappy@hub.org                              MSN . scrappy@hub.org
Yahoo . yscrappy               Skype: hub.org        ICQ . 7615664
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.4 (FreeBSD)

iD8DBQFH1zEy4QvfyHIvDvMRAiHVAJ9JKBeJueMskm4TwHGyL0y48PIELQCgzC4N
BhGEZe2qt4oyrLnWsjFVy04=
=ZjTJ
-----END PGP SIGNATURE-----



Re: Email not searchable in our archives

От
"Magnus Hagander"
Дата:
> > Perhaps I should just try without the subdir and see if it works.
> 
> I am wondering if we should bail out of Mhonarc all together. Do we
> actually need it? We have the actual mbox files right? Couldn't we
> build our own parser for whatever?

Let's not throw it out until we know that's where the problem is.  A lot of
work has been put into our mhonarc install over the years to make it fit
with our website etc.

Dave has looked at the custom parser thing, but it's a lot more work than you'd initially think. There's a zillion
corner-cases.

/Magnus


Re: Email not searchable in our archives

От
"Magnus Hagander"
Дата:
> bruce wrote:
> > Would someone please fine out why the attached email from Heikki is not
> > appearing in a search of our archives?  I tried the subject and a line
> > from the email and neither came up as a hit:
> 
> OK, it has been 24 hours since I reported some emails are not being
> archived and no one has even responded they are looking at the problem.
> 
> Are we unable to manage our own archive search?  If we can't, I will
> start linking to another archive from the TODO list.  Right now, every
> time I need a URL for the TODO list I have to troll through the archives
> by date until I find the email.

Obviously I will look at this as soon as I can. But as Dave has already
pointed out, most of us have a day job that has to be prioritised, so
everything cannot be done within 24 hours. I was hoping somebody else would
have time to look at it meanwhile, but so far nobody has had the time. If
this is not acceptable then the answer to your question is no, we currently
can't do it.


I notice, however, that when we have a similar issue with, for example, the
patch queue not being updated for many many weeks, that is considered a
*feature*, and not a problem. Are we not able to manage our own patch queue?
If we can't, perhaps we should stop all development until we can be sure
it's always up-to-date?

/Magnus


Re: Email not searchable in our archives

От
"Dave Page"
Дата:
On Tue, Mar 11, 2008 at 10:20 PM, Joshua D. Drake <jd@commandprompt.com> wrote:
> On Tue, 11 Mar 2008 19:02:25 -0300
>  Alvaro Herrera <alvherre@commandprompt.com> wrote:
>
>  > Perhaps I should just try without the subdir and see if it works.
>
>  I am wondering if we should bail out of Mhonarc all together. Do we
>  actually need it? We have the actual mbox files right? Couldn't we
>  build our own parser for whatever?

As I've mentioned a number of times, I've spent quite a bit of time
doing that already - I just don't have the spare cycles to finish
right now. We currently have a parser/archiver that will incrementally
archive messages from the (growing) mboxes into a database, and a web
frontend which resolves many of the problems with the current
archives.

There are optimisation/performance issues to be solved, as well as
some PHP crashes that seem to manifest themselves only on the FreeBSD
production server. The mime handling could also use some improvement
to properly reconstruct multi-part messages, but in reality I'm not
sure we ever have any where that's actually an issue.

If anyone else wants to pick up where I've left off, please let me know.


-- 
Dave Page
EnterpriseDB UK Ltd: http://www.enterprisedb.com
PostgreSQL UK 2008 Conference: http://www.postgresql.org.uk


Re: Email not searchable in our archives

От
"Dave Page"
Дата:
On Wed, Mar 12, 2008 at 1:12 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
>  > I don't want to lose Mhonarc, at least not for the moment.  It is
>  > powerful and customizable and has served us reasonably well for a very
>  > long time.  (Longer than most of us, actually.)
>
>  Agreed.  One of the problems with moving off it is that if we change to
>  something else that breaks mbox files at different points, we will
>  invalidate archive URLs.  We went there once already by accident and it
>  was not fun.

The replacement archives system I've been working on takes its data
from the existing monthly mboxes, and although it uses its own URL
scheme, it does understand and accept the old URLs. That (and fixing
the thread-breaks-over-a-month issue) were pretty much my top
requirements.


-- 
Dave Page
EnterpriseDB UK Ltd: http://www.enterprisedb.com
PostgreSQL UK 2008 Conference: http://www.postgresql.org.uk


Re: Email not searchable in our archives

От
Bruce Momjian
Дата:
Magnus Hagander wrote:
> > bruce wrote:
> > > Would someone please fine out why the attached email from Heikki is not
> > > appearing in a search of our archives?  I tried the subject and a line
> > > from the email and neither came up as a hit:
> >
> > OK, it has been 24 hours since I reported some emails are not being
> > archived and no one has even responded they are looking at the problem.
> >
> > Are we unable to manage our own archive search?  If we can't, I will
> > start linking to another archive from the TODO list.  Right now, every
> > time I need a URL for the TODO list I have to troll through the archives
> > by date until I find the email.
> 
> Obviously I will look at this as soon as I can. But as Dave has
> already pointed out, most of us have a day job that has to be
> prioritised, so everything cannot be done within 24 hours. I
> was hoping somebody else would have time to look at it meanwhile,
> but so far nobody has had the time. If this is not acceptable
> then the answer to your question is no, we currently can't do
> it.

OK, so should we look to outsource our searching?  (Of course, Google
isn't indexing all the emails either so I am worried about outsourcing
too.)

> I notice, however, that when we have a similar issue with for
> example the patch queue not being updated for many many weeks,
> that is considered a *feature*, and not a problem. Are we not
> able to manage our own patch queue? If we can't, perhaps we
> should stop all development until we can be sure it's always
> up-to-date?

The patch emails have always been available and online.  What wasn't
done is processing them as TODO items and applying, and that isn't going
to be done for weeks still, I bet.

We don't have an option to outsource that, but we do have the option for
search.  Also, search is a public infrastructure issue, while the patch
queue is a development tool --- I don't consider them to have the same
reliability requirements.

Also, I have been working on the patch queue for a week, and so has Tom.
The search problem, a more public infrastructure with a higher promise
of reliability, isn't even being worked on yet.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +


Re: Email not searchable in our archives

От
Alvaro Herrera
Дата:
Marc G. Fournier wrote:

> Alvaro, have you checked with the mhonarc folks about the From-line issue?

Nope, not yet.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: Email not searchable in our archives

От
Alvaro Herrera
Дата:
Bruce Momjian wrote:
> bruce wrote:
> > Would someone please fine out why the attached email from Heikki is not
> > appearing in a search of our archives?  I tried the subject and a line
> > from the email and neither came up as a hit:
> 
> OK, it has been 24 hours since I reported some emails are not being
> archived and no one has even responded they are looking at the problem.

I'm looking at the problem.  On a quick glance it is obvious that
there's something bogus going on -- the search database only contains
472 emails for May 2007, but Mhonarc reports 1187.

I have to go chase something at the bank right now, I'll update you
later.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


Re: Email not searchable in our archives

От
"Dave Page"
Дата:
On Wed, Mar 12, 2008 at 1:09 PM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:
>  I'm looking at the problem.  On a quick glance it is obvious that
>  there's something bogus going on -- the search database only contains
>  472 emails for May 2007, but Mhonarc reports 1187.

Confirmed. I added a test mode to a copy of the archives indexer, and
running that it claims it would index a further 715 messages, which
would give us a total of 1187.

So I guess the next step is to try running out of test mode to see if
the data actually makes it into the index now, but I didn't want to do
that and stomp on any testing you're doing.


-- 
Dave Page
EnterpriseDB UK Ltd: http://www.enterprisedb.com
PostgreSQL UK 2008 Conference: http://www.postgresql.org.uk


Re: Email not searchable in our archives

От
"Dave Page"
Дата:
On Wed, Mar 12, 2008 at 2:14 PM, Dave Page <dpage@pgadmin.org> wrote:
>  Confirmed. I added a test mode to a copy of the archives indexer, and
>  running that it claims it would index a further 715 messages, which
>  would give us a total of 1187.
>
>  So I guess the next step is to try running out of test mode to see if
>  the data actually makes it into the index now, but I didn't want to do
>  that and stomp on any testing you're doing.

OK, so running it properly has added those missing 715 messages. I
think we need to run a full index run which should restore any missing
pages, but before we do that, I'd kinda like to gather any ideas on
why this has happened before removing any evidence.

My best guess is simply that the indexer failed for some time and
no one noticed for a few weeks. By the time it was re-run, some
messages that it had missed were outside the timeframe that an
incremental crawl would have picked up (the current, plus last month).
Thoughts?
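
To make the window effect concrete, here is a minimal sketch, assuming the
incremental crawl visits exactly the current month and the one before it, as
described above. The function names and the example outage dates are
hypothetical and are not taken from the real indexer.

```python
from datetime import date


def months_in_incremental_window(today):
    """Return the (year, month) pairs an incremental crawl would visit:
    the previous month and the current month."""
    current = (today.year, today.month)
    if today.month == 1:
        previous = (today.year - 1, 12)
    else:
        previous = (today.year, today.month - 1)
    return [previous, current]


def missed_by_incremental(unindexed_months, resumed_on):
    """Given the months that contain unindexed messages and the date the
    indexer started running again, return the months that an incremental
    crawl would never revisit."""
    window = set(months_in_incremental_window(resumed_on))
    return sorted(m for m in set(unindexed_months) if m not in window)


# Hypothetical outage: May-July 2007 have gaps, indexer resumes on 2007-07-10.
# May falls outside the two-month window, so only a full run can repair it.
stranded = missed_by_incremental([(2007, 5), (2007, 6), (2007, 7)],
                                 date(2007, 7, 10))
```

If this is what happened, any outage longer than the window leaves a
permanent hole until a full index run is done.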

Stefan; any thoughts on how we might monitor that the indexer has been
running correctly? I assume that should be fairly easy if we have it
drop a timestamp someplace?

-- 
Dave Page
EnterpriseDB UK Ltd: http://www.enterprisedb.com
PostgreSQL UK 2008 Conference: http://www.postgresql.org.uk


Re: Email not searchable in our archives

От
Stefan Kaltenbrunner
Дата:
Dave Page wrote:
> On Wed, Mar 12, 2008 at 2:14 PM, Dave Page <dpage@pgadmin.org> wrote:
>>  Confirmed. I added a test mode to a copy of the archives indexer, and
>>  running that it claims it would index a further 715 messages, which
>>  would give us a total of 1187.
>>
>>  So I guess the next step is to try running out of test mode to see if
>>  the data actually makes it into the index now, but I didn't want to do
>>  that and stomp on any testing you're doing.
> 
> OK, so running it properly has added those missing 715 messages. I
> think we need to run a full index run which should restore any missing
> pages, but before we do that, I'd kinda like to gather any ideas on
> why this has happened before removing any evidence.

hmm weird ...

> 
> My best guess is simply that the indexer failed for some time and
> no one noticed for a few weeks. By the time it was re-run, some
> messages that it had missed were outside the timeframe that an
> incremental crawl would have picked up (the current, plus last month).
> Thoughts?
> 
> Stefan; any thoughts on how we might monitor that the indexer has been
> running correctly? I assume that should be fairly easy if we have it
> drop a timestamp someplace?

yes - iirc there is even some discussion on that on pmt - will work 
something out for that in the next days.


Stefan


Re: Email not searchable in our archives

От
Magnus Hagander
Дата:
On Wed, Mar 12, 2008 at 03:25:00PM +0000, Dave Page wrote:
> On Wed, Mar 12, 2008 at 2:14 PM, Dave Page <dpage@pgadmin.org> wrote:
> >  Confirmed. I added a test mode to a copy of the archives indexer, and
> >  running that it claims it would index a further 715 messages, which
> >  would give us a total of 1187.
> >
> >  So I guess the next step is to try running out of test mode to see if
> >  the data actually makes it into the index now, but I didn't want to do
> >  that and stomp on any testing you're doing.
> 
> OK, so running it properly has added those missing 715 messages. I
> think we need to run a full index run which should restore any missing
> pages, but before we do that, I'd kinda like to gather any ideas on
> why this has happened before removing any evidence.
> 
> My best guess is simply that the indexer failed for some time and
> no one noticed for a few weeks. By the time it was re-run, some
> messages that it had missed were outside the timeframe that an
> incremental crawl would have picked up (the current, plus last month).
> Thoughts?
> 
> Stefan; any thoughts on how we might monitor that the indexer has been
> running correctly? I assume that should be fairly easy if we have it
> drop a timestamp someplace?

I admit to having a ticket on pmt to get that set up.

Actually, it might be better to look into the actual database, and find the
latest email indexed? If it's older than <nn> something is wrong. It could
be the archives that are wrong and not the indexer of course, but the point
is we'll get notified and someone can look into it.

Do you think we need to track it on a per-list basis, or just check for the
latest timestamp across all lists?
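
A rough sketch of the two options, assuming the check can be handed a mapping
from list name to the timestamp of the newest indexed message (how that
mapping is read from the search database is left out, and all names and
values here are hypothetical):

```python
from datetime import datetime, timedelta


def stale_lists(latest_indexed, now, max_age):
    """Per-list check: return the lists whose newest indexed message
    is older than max_age."""
    return sorted(name for name, ts in latest_indexed.items()
                  if now - ts > max_age)


def index_is_stale(latest_indexed, now, max_age):
    """Global check: look only at the newest message across all lists."""
    return now - max(latest_indexed.values()) > max_age


# Hypothetical snapshot: one busy list is current, one list has stalled.
now = datetime(2008, 3, 12, 16, 0)
latest = {
    "pgsql-hackers": datetime(2008, 3, 12, 15, 0),
    "pgsql-www": datetime(2008, 2, 1, 0, 0),
}
per_list = stale_lists(latest, now, timedelta(days=2))
overall = index_is_stale(latest, now, timedelta(days=2))
```

The global check stays quiet as long as any one list is fresh, so it would
not notice a single list falling behind. The per-list check catches that, but
would need a much larger threshold (or an exclusion) for low-traffic lists.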

//Magnus


Re: Email not searchable in our archives

От
Tom Lane
Дата:
"Dave Page" <dpage@pgadmin.org> writes:
> My best guess is simply that the indexer failed for some time and
> no one noticed for a few weeks. By the time it was re-run, some
> messages that it had missed were outside the timeframe that an
> incremental crawl would have picked up (the current, plus last month).
> Thoughts?

That would explain a contiguous range of messages that were not indexed,
but is that what we have?  I think the thing to do before you destroy
the old index is make a list of which messages were indexed and which
weren't.
        regards, tom lane
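
One way to produce such a list, as a minimal sketch: it assumes each message
in a month can be identified by its sequence number in the archive and that
the set of indexed numbers can be pulled from the search database. Both
inputs are hypothetical here.

```python
def missing_ranges(all_ids, indexed_ids):
    """Group unindexed message numbers into contiguous (first, last) runs,
    where "contiguous" means adjacent in archive order."""
    indexed = set(indexed_ids)
    ranges = []
    start = prev = None
    for msg_id in sorted(all_ids):
        if msg_id in indexed:
            # An indexed message closes any run of missing ones.
            if start is not None:
                ranges.append((start, prev))
                start = None
            continue
        if start is None:
            start = msg_id
        prev = msg_id
    if start is not None:
        ranges.append((start, prev))
    return ranges


# One long run supports the "indexer was down for a while" theory;
# many short runs scattered through the month would point elsewhere.
one_gap = missing_ranges(range(10), [0, 1, 2, 8, 9])
scattered = missing_ranges(range(6), [0, 2, 4])
```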


Re: Email not searchable in our archives

От
Tom Lane
Дата:
"Dave Page" <dpage@pgadmin.org> writes:
> On Wed, Mar 12, 2008 at 3:41 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> That would explain a contiguous range of messages that were not indexed,
>> but is that what we have?

> Looking at the debug output, the messages that were missed were all contiguous:

OK, that seems to support your theory.  Might as well go ahead and
reindex.  +1 for getting some monitoring in there somewhere.
        regards, tom lane


Re: Email not searchable in our archives

От
"Dave Page"
Дата:
On Wed, Mar 12, 2008 at 4:06 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

>  OK, that seems to support your theory.  Might as well go ahead and
>  reindex.  +1 for getting some monitoring in there somewhere.

Yeah. Tried that - on the first attempt it died with a deadlock:

----------------
[search@community2 search]$ php archives.php -f
Indexing list 42 (atlpug)
Indexing list 43 (lapug)
Indexing list 41 (pdxpug)
Indexing list 40 (persianpug)
PHP Warning:  pg_execute(): Query failed: ERROR:  deadlock detected
DETAIL:  Process 9245 waits for ShareLock on transaction 378698;
blocked by process 9297.
Process 9297 waits for ShareLock on transaction 378690; blocked by process 9245.
CONTEXT:  SQL statement "SELECT 1 FROM ONLY "public"."lists" x WHERE
"id" OPERATOR(pg_catalog.=) $1 FOR SHARE OF x" in
/home/search/portal/tools/search/classes/SearchDB.class.php on line 60
#0  SearchDB::mydie(Query failed: 0.94209700 1205339104
) called at [/home/search/portal/tools/search/classes/SearchDB.class.php:61]
#1  SearchDB::ExecutePrepared(0.94209700 1205339104, Array ([0] =>
0,[1] => 1183059217,[2] => PostgreSQL & our ML,[3] => Mohsen
Pahlevanzadeh <mohsen@pahlevanzadeh.org>,[4] =>

سلام بر دوستان فارسی زبان،
این ایمیل رو زدم تا چند تا نکته رو یاد
آور بشم:
۱. قراره فارسی نوشته بشه.
۲. قراره جایی بشه تا کاربرای pg دیگه
نرن رو ML های زبان‌های دیگه پست
بگذارند.

۳. ترویج DB های FOSSی در ایران.
۴. اگر بشه ترجمه Doc اون به فارسی
--محسن
--
-------------------------
Mohsen Pahlevanzadeh
email address : mohsen ( at ) pahlevanzadeh ( dot ) org
web site : http://pahlevanzadeh.org
IRC IM : m_pahlevanzadeh
yahoo IM : linuxorbsd
----------------------------



,[5] => PostgreSQL & our ML,[6] =>

سلام بر دوستان فارسی زبان،
این ایمیل رو زدم تا چند تا نکته رو یاد
آور بشم:
۱. قراره فارسی نوشته بشه.
۲. قراره جایی بشه تا کاربرای pg دیگه
نرن رو ML های زبان‌های دیگه پست
بگذارند.

۳. ترویج DB های FOSSی در ایران.
۴. اگر بشه ترجمه Doc اون به فارسی
--محسن
--
-------------------------
Mohsen Pahlevanzadeh
email address : mohsen ( at ) pahlevanzadeh ( dot ) org
web site : http://pahlevanzadeh.org
IRC IM : m_pahlevanzadeh
yahoo IM : linuxorbsd
----------------------------



)) called at [/home/search/portal/tools/search/classes/ArchiveIndexer.class.php:114]
#2  ArchiveIndexer->IndexSinglePage(40, persianpug, 2007, 6, 0) called
at [/home/search/portal/tools/search/classes/ArchiveIndexer.class.php:62]
#3  ArchiveIndexer->IndexMonth(40, persianpug, 2007, 6) called at
[/home/search/portal/tools/search/classes/ArchiveIndexer.class.php:40]
#4  ArchiveIndexer->Index(1, , -1, -1) called at
[/home/search/portal/tools/search/archives.php:28]
Query failed: 0.94209700 1205339104
----------------

Running it again now and it's up to 4000+ messages indexed  (ie. ones
that weren't already indexed) and is still working on pgsql-advocacy.
We'll see how it goes....
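
Since simply running it again got past the deadlock, the same thing could be
done automatically. This is only a sketch of the idea in Python, not a change
to the PHP indexer: DeadlockDetected is a stand-in for whatever error the
database driver raises for SQLSTATE 40P01 (deadlock_detected), and the
operation passed in is assumed to redo its whole transaction, because
PostgreSQL aborts the transaction it picks as the deadlock victim.

```python
import time


class DeadlockDetected(Exception):
    """Stand-in for the driver error raised on SQLSTATE 40P01."""


def run_with_retry(operation, attempts=3, delay_seconds=0.0):
    """Call operation(); if it raises DeadlockDetected, wait and try again,
    giving up (and re-raising) after `attempts` tries."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except DeadlockDetected:
            if attempt == attempts:
                raise
            time.sleep(delay_seconds)


# Hypothetical operation that deadlocks once and then succeeds.
calls = []


def index_one_page():
    calls.append(1)
    if len(calls) == 1:
        raise DeadlockDetected()
    return "indexed"


result = run_with_retry(index_one_page)
```

The error shows two backends (9245 and 9297) waiting on each other, so it
would still be worth finding out what the second process was.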

--
Dave Page
EnterpriseDB UK Ltd: http://www.enterprisedb.com
PostgreSQL UK 2008 Conference: http://www.postgresql.org.uk


Re: Email not searchable in our archives

От
Stefan Kaltenbrunner
Дата:
Tom Lane wrote:
> "Dave Page" <dpage@pgadmin.org> writes:
>> On Wed, Mar 12, 2008 at 3:41 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> That would explain a contiguous range of messages that were not indexed,
>>> but is that what we have?
> 
>> Looking at the debug output, the messages that were missed were all contiguous:
> 
> OK, that seems to support your theory.  Might as well go ahead and
> reindex.  +1 for getting some monitoring in there somewhere.

yeah we will work on that and add some new ones to our current list of 
354 active checks ;-)


Stefan


Re: Email not searchable in our archives

От
"Dave Page"
Дата:
On Wed, Mar 12, 2008 at 4:55 PM, Stefan Kaltenbrunner
<stefan@kaltenbrunner.cc> wrote:
>
> Tom Lane wrote:
>  > "Dave Page" <dpage@pgadmin.org> writes:
>  >> On Wed, Mar 12, 2008 at 3:41 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>  >>> That would explain a contiguous range of messages that were not indexed,
>  >>> but is that what we have?
>  >
>  >> Looking at the debug output, the messages that were missed were all contiguous:
>  >
>  > OK, that seems to support your theory.  Might as well go ahead and
>  > reindex.  +1 for getting some monitoring in there somewhere.
>
>  yeah we will work on that and add some new ones to our current list of
>  354 active checks ;-)

One thing that crosses my mind - perhaps we should run a full index
once per week to try to catch this sort of thing in the future?

BTW, up to 13500 messages now...


-- 
Dave Page
EnterpriseDB UK Ltd: http://www.enterprisedb.com
PostgreSQL UK 2008 Conference: http://www.postgresql.org.uk


Re: Email not searchable in our archives

От
Bruce Momjian
Дата:
Dave Page wrote:
> On Wed, Mar 12, 2008 at 4:55 PM, Stefan Kaltenbrunner
> <stefan@kaltenbrunner.cc> wrote:
> >
> > Tom Lane wrote:
> >  > "Dave Page" <dpage@pgadmin.org> writes:
> >  >> On Wed, Mar 12, 2008 at 3:41 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >  >>> That would explain a contiguous range of messages that were not indexed,
> >  >>> but is that what we have?
> >  >
> >  >> Looking at the debug output, the messages that were missed were all contiguous:
> >  >
> >  > OK, that seems to support your theory.  Might as well go ahead and
> >  > reindex.  +1 for getting some monitoring in there somewhere.
> >
> >  yeah we will work on that and add some new ones to our current list of
> >  354 active checks ;-)
> 
> One thing that crosses my mind - perhaps we should run a full index
> once per week to try to catch this sort of thing in the future?

Seems running it weekly would mean failures would disappear and not be
diagnosed.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +


Re: Email not searchable in our archives

От
"Dave Page"
Дата:
On Wed, Mar 12, 2008 at 5:16 PM, Bruce Momjian <bruce@momjian.us> wrote:
> Dave Page wrote:
>  > One thing that crosses my mind - perhaps we should run a full index
>  > once per week to try to catch this sort of thing in the future?
>
>  Seems running it weekly would mean failures would disappear and not be
>  diagnosed.

There is that - but then at least the index should be up to date
within 7 days at all times regardless of any corner cases that we
might otherwise not notice for some time.
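
The two concerns are not incompatible if the weekly run reports what it had
to repair rather than fixing it silently. A minimal sketch, assuming the full
run can compute which archived messages are absent from the index; the
function names and the alert callback are hypothetical:

```python
def full_reindex(archive_ids, index):
    """Add every archived message missing from `index` (a set) and
    return the sorted list of IDs that had to be added."""
    added = sorted(set(archive_ids) - index)
    index.update(added)
    return added


def weekly_run(archive_ids, index, alert):
    """Run a full reindex and raise an alert if it repaired anything,
    so that a failure of the incremental runs still gets diagnosed."""
    added = full_reindex(archive_ids, index)
    if added:
        alert(f"full reindex repaired {len(added)} missing messages")
    return len(added)


# Hypothetical example: messages 3 and 4 were never indexed.
alerts = []
index = {1, 2, 5}
repaired = weekly_run([1, 2, 3, 4, 5], index, alerts.append)
```

A healthy week would report zero repairs, and any non-zero count is a signal
that the incremental indexer missed something.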


BTW, the reindexing just finished - it added 31,269 messages that were
previously missing :-(


-- 
Dave Page
EnterpriseDB UK Ltd: http://www.enterprisedb.com
PostgreSQL UK 2008 Conference: http://www.postgresql.org.uk


Re: Email not searchable in our archives

От
Bruce Momjian
Дата:
Dave Page wrote:
> On Wed, Mar 12, 2008 at 5:16 PM, Bruce Momjian <bruce@momjian.us> wrote:
> > Dave Page wrote:
> >  > One thing that crosses my mind - perhaps we should run a full index
> >  > once per week to try to catch this sort of thing in the future?
> >
> >  Seems running it weekly would mean failures would disappear and not be
> >  diagnosed.
> 
> There is that - but then at least the index should be up to date
> within 7 days at all times regardless of any corner cases that we
> might otherwise not notice for some time.
> 
> 
> BTW, the reindexing just finished - it added 31,269 messages that were
> previously missing :-(

31k emails.  Wow.

Thanks, I just checked an email that used to be missing and it is there
now.  Thanks!

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +