Обсуждение: pgarchives: strip angle brackets when checking for msgid search

Поиск
Список
Период
Сортировка

pgarchives: strip angle brackets when checking for msgid search

От
Amir Rohan
Дата:
In some email-clients, when you copy the Message-ID to the clipboard <br /> it surrounds in (or perhaps, doesn't
strip)angle brackets.<br /> When pasting this into the mail archive search, search fails to find the message<br /> and
thishappens often enough so as to be annoying. The fix is trivial.<br /><br /> Discussion of similar problem in the
commitfestapp:<br /><br /><a class="moz-txt-link-rfc2396E"
href="mailto:CABUevExRGr+Q5gzp-9cDLOqgntsOEcM0zZYg5VkADj3Z_NKz5A@mail.gmail.com"><CABUevExRGr+Q5gzp-9cDLOqgntsOEcM0zZYg5VkADj3Z_NKz5A@mail.gmail.com></a><br
/><br/> patch attached.<br /><br /> Amir<br /> 

Re: pgarchives: strip angle brackets when checking for msgid search

От
Alvaro Herrera
Дата:
Amir Rohan wrote:

>     In some email-clients, when you copy the Message-ID to the clipboard
>     <br>
>     it surrounds in (or perhaps, doesn't strip) angle brackets.<br>
>     When pasting this into the mail archive search, search fails to find
>     the message<br>
>     and this happens often enough so as to be annoying. The fix is
>     trivial.<br>

Sounds reasonable as a feature.  Can't comment on the implemtation.

Can you please not send HTML-only email?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: pgarchives: strip angle brackets when checking for msgid search

От
Amir Rohan
Дата:
On 10/02/2015 09:53 PM, Alvaro Herrera wrote:
> Amir Rohan wrote:
>> 
>> In some email-clients, when you copy the Message-ID to the clipboard
>> it surrounds in (or perhaps, doesn't strip) angle brackets.
>> When pasting this into the mail archive search, search fails to find the message
>> and this happens often enough so as to be annoying. The fix is trivial.
>> 
>> Discussion of similar problem in the commitfest app:
>> 
>> <CABUevExRGr+Q5gzp-9cDLOqgntsOEcM0zZYg5VkADj3Z_NKz5A@mail.gmail.com>
>> 
>> patch attached. 
>
> Sounds reasonable as a feature.  Can't comment on the implemtation.
> 
> Can you please not send HTML-only email?
> 

I am mortified by all my recent email mishaps and had
just switched email provider for that reason.

Hopefully with this, they end.

Amir




Re: pgarchives: strip angle brackets when checking for msgid search

От
Magnus Hagander
Дата:
On Fri, Oct 2, 2015 at 7:29 PM, Amir Rohan <amir.rohan@zoho.com> wrote:
In some email-clients, when you copy the Message-ID to the clipboard
it surrounds in (or perhaps, doesn't strip) angle brackets.
When pasting this into the mail archive search, search fails to find the message
and this happens often enough so as to be annoying. The fix is trivial.

Discussion of similar problem in the commitfest app:

<CABUevExRGr+Q5gzp-9cDLOqgntsOEcM0zZYg5VkADj3Z_NKz5A@mail.gmail.com>

patch attached.

Interesting - we also had a similar fix in 9b7e9b59 for the archives, but that one only deals with the URLs and not the search box. Makes much sense to do it in the search box as well, so applied. Thanks! 

Actually, before I sent this. It's also broken because it redirects to bad URLs if there are spaces in it. Or if there is just one of the two, e.g. it has < but not >. All those now generate 404s, which is not good.

This patch needs to be in sync with the callers of that API as well as the code that actually views entries in the archives before it can be safely applied.

--

Re: pgarchives: strip angle brackets when checking for msgid search

От
Amir Rohan
Дата:
Hi Magnus,

On 10/03/2015 07:31 PM, Magnus Hagander wrote:
> On Fri, Oct 2, 2015 at 7:29 PM, Amir Rohan <amir.rohan@zoho.com
> <mailto:amir.rohan@zoho.com>> wrote:
> 
>     In some email-clients, when you copy the Message-ID to the clipboard
>     it surrounds in (or perhaps, doesn't strip) angle brackets.
>     When pasting this into the mail archive search, search fails to find
>     the message
>     and this happens often enough so as to be annoying. The fix is trivial.
> 
>     Discussion of similar problem in the commitfest app:
> 
>     <CABUevExRGr+Q5gzp-9cDLOqgntsOEcM0zZYg5VkADj3Z_NKz5A@mail.gmail.com>
>     <mailto:CABUevExRGr+Q5gzp-9cDLOqgntsOEcM0zZYg5VkADj3Z_NKz5A@mail.gmail.com>
> 
>     patch attached.
> 
> 
> Interesting - we also had a similar fix in 9b7e9b59 for the archives,
> but that one only deals with the URLs and not the search box. Makes much
> sense to do it in the search box as well, so applied. Thanks! 
> 
> Actually, before I sent this. 
>It's also broken because it redirects to bad URLs if there are spaces
in it.

What is "It"?

> Or if there is just one of the two, > e.g. it has < but not >. 

I'm sorry, I lost you. Let's try and sync...

Prior to this patch:

"msgid" = works
" msgid " = fails
" msgid" = fails
"msgid " = fails
<msgid>" = fails
<msgid" = fails
"msgid>" = fails
" msgid>" = fails
etc'

With this patch -- they should all work.

If there's a search query that currently works and is broken by this
patch, can you give an example?

If you simply mean that searching for "m sgid" is broken, yes it
currently is, and no the patch doesn't fix that.

> All those now generate 404s, which is not good.
> 

"Now" meaning before applying this patch?

> This patch needs to be in sync with the callers of that API as well as
> the code that actually views entries in the archives before it can be
> safely applied.

There is no API change. afaict nothing that currently works and uses
this endpoint should be affected. The only effect should be that
queries that now fail but should succeed, succeed.

I'm happy to address any concerns you have, but I don't understand what
precisely you want done.

Amir





Re: pgarchives: strip angle brackets when checking for msgid search

От
Magnus Hagander
Дата:


On Sat, Oct 3, 2015 at 7:42 PM, Amir Rohan <amir.rohan@zoho.com> wrote:
Hi Magnus,

On 10/03/2015 07:31 PM, Magnus Hagander wrote:
> On Fri, Oct 2, 2015 at 7:29 PM, Amir Rohan <amir.rohan@zoho.com
> <mailto:amir.rohan@zoho.com>> wrote:
>
>     In some email-clients, when you copy the Message-ID to the clipboard
>     it surrounds in (or perhaps, doesn't strip) angle brackets.
>     When pasting this into the mail archive search, search fails to find
>     the message
>     and this happens often enough so as to be annoying. The fix is trivial.
>
>     Discussion of similar problem in the commitfest app:
>
>     <CABUevExRGr+Q5gzp-9cDLOqgntsOEcM0zZYg5VkADj3Z_NKz5A@mail.gmail.com>
>     <mailto:CABUevExRGr+Q5gzp-9cDLOqgntsOEcM0zZYg5VkADj3Z_NKz5A@mail.gmail.com>
>
>     patch attached.
>
>
> Interesting - we also had a similar fix in 9b7e9b59 for the archives,
> but that one only deals with the URLs and not the search box. Makes much
> sense to do it in the search box as well, so applied. Thanks!
>
> Actually, before I sent this.
>It's also broken because it redirects to bad URLs if there are spaces
in it.

What is "It"?

> Or if there is just one of the two, > e.g. it has < but not >.

I'm sorry, I lost you. Let's try and sync...

Prior to this patch:

"msgid" = works
" msgid " = fails
" msgid" = fails
"msgid " = fails
<msgid>" = fails
<msgid" = fails
"msgid>" = fails
" msgid>" = fails
etc'

With this patch -- they should all work.

If there's a search query that currently works and is broken by this
patch, can you give an example?


The search works. Meaning the API returns "this messageid exists", which causes the search client to send a redirect to an URL that still has the non-stripped version of the URL (including the < or space).

The regexp used for parsing URLs only allows either "msgid" or "<msgid>". Any other of those combinations will result in a 404 at the receiving end of that redirect.
 

> All those now generate 404s, which is not good.
>

"Now" meaning before applying this patch?

No, after. Today they render a 200 OK with "search returned no hits".

 

> This patch needs to be in sync with the callers of that API as well as
> the code that actually views entries in the archives before it can be
> safely applied.

There is no API change. afaict nothing that currently works and uses
this endpoint should be affected. The only effect should be that
queries that now fail but should succeed, succeed.


That is exactly what I'm saying it doesn't. I deployed it and tested and it broke.

So either the API needs to be updated to return the actual URL, and the consumers (which at this point I believe are just the main website) need to be updated to actually use that URL, or the consumer also needs to be updated with the same "cleaning" rules, or the URL parsing in the archives code need to also deal with all those cases.

I think either the first, or a combination of first and third, of those are the way to go. The second one (doing the same changes in the pgweb code) seems to just put us in a position where we'll make the same mistake again next time we try to fix these rules.

--

Re: pgarchives: strip angle brackets when checking for msgid search

От
Amir Rohan
Дата:
On 10/05/2015 11:57 AM, Magnus Hagander wrote:
> 
> 
> On Sat, Oct 3, 2015 at 7:42 PM, Amir Rohan <amir.rohan@zoho.com
> <mailto:amir.rohan@zoho.com>> wrote:
> 
>     Hi Magnus,
> 
>     On 10/03/2015 07:31 PM, Magnus Hagander wrote:
>     > On Fri, Oct 2, 2015 at 7:29 PM, Amir Rohan <amir.rohan@zoho.com <mailto:amir.rohan@zoho.com>
>     > <mailto:amir.rohan@zoho.com <mailto:amir.rohan@zoho.com>>> wrote:
>     >
>     >     In some email-clients, when you copy the Message-ID to the clipboard
>     >     it surrounds in (or perhaps, doesn't strip) angle brackets.
>     >     When pasting this into the mail archive search, search fails to find
>     >     the message
>     >     and this happens often enough so as to be annoying. The fix is trivial.
>     >
>     >     Discussion of similar problem in the commitfest app:
>     >
>     >     <CABUevExRGr+Q5gzp-9cDLOqgntsOEcM0zZYg5VkADj3Z_NKz5A@mail.gmail.com
>     <mailto:CABUevExRGr%2BQ5gzp-9cDLOqgntsOEcM0zZYg5VkADj3Z_NKz5A@mail.gmail.com>>
>     >   
>      <mailto:CABUevExRGr+Q5gzp-9cDLOqgntsOEcM0zZYg5VkADj3Z_NKz5A@mail.gmail.com
>     <mailto:CABUevExRGr%2BQ5gzp-9cDLOqgntsOEcM0zZYg5VkADj3Z_NKz5A@mail.gmail.com>>
>     >
>     >     patch attached.
>     >
>     >
>     > Interesting - we also had a similar fix in 9b7e9b59 for the archives,
>     > but that one only deals with the URLs and not the search box. Makes much
>     > sense to do it in the search box as well, so applied. Thanks!
>     >
>     > Actually, before I sent this.
>     >It's also broken because it redirects to bad URLs if there are spaces
>     in it.
> 
>     What is "It"?
> 
>     > Or if there is just one of the two, > e.g. it has < but not >.
> 
>     I'm sorry, I lost you. Let's try and sync...
> 
>     Prior to this patch:
> 
>     "msgid" = works
>     " msgid " = fails
>     " msgid" = fails
>     "msgid " = fails
>     <msgid>" = fails
>     <msgid" = fails
>     "msgid>" = fails
>     " msgid>" = fails
>     etc'
> 
>     With this patch -- they should all work.
> 
>     If there's a search query that currently works and is broken by this
>     patch, can you give an example?
> 
> 
> 
> The search works. Meaning the API returns "this messageid exists", which
> causes the search client to send a redirect to an URL that still has the
> non-stripped version of the URL (including the < or space).
> 
> The regexp used for parsing URLs only allows either "msgid" or
> "<msgid>". Any other of those combinations will result in a 404 at the
> receiving end of that redirect.
>  
> 
>     > All those now generate 404s, which is not good.
>     >
> 
>     "Now" meaning before applying this patch?
> 
> 
> No, after. Today they render a 200 OK with "search returned no hits".
> 
>  
> 
> 
>     > This patch needs to be in sync with the callers of that API as well as
>     > the code that actually views entries in the archives before it can be
>     > safely applied.
> 
>     There is no API change. afaict nothing that currently works and uses
>     this endpoint should be affected. The only effect should be that
>     queries that now fail but should succeed, succeed.
> 
> 
> 
> That is exactly what I'm saying it doesn't. I deployed it and tested and
> it broke.
> 
> So either the API needs to be updated to return the actual URL, and the
> consumers (which at this point I believe are just the main website) need
> to be updated to actually use that URL, or the consumer also needs to be
> updated with the same "cleaning" rules, or the URL parsing in the
> archives code need to also deal with all those cases.
> 
> I think either the first, or a combination of first and third, of those
> are the way to go. The second one (doing the same changes in the pgweb
> code) seems to just put us in a position where we'll make the same
> mistake again next time we try to fix these rules.
> 

Ah, thanks for clearing that up, I understand now and I didn't think to
test that. I'll send an updated patch.

Amir





Re: pgarchives: strip angle brackets when checking for msgid search

От
Amir Rohan
Дата:
On 10/05/2015 11:57 AM, Magnus Hagander wrote:
>
>
> On Sat, Oct 3, 2015 at 7:42 PM, Amir Rohan <amir.rohan@zoho.com
> <mailto:amir.rohan@zoho.com>> wrote:
>
>     Hi Magnus,
>
>     On 10/03/2015 07:31 PM, Magnus Hagander wrote:
>     > On Fri, Oct 2, 2015 at 7:29 PM, Amir Rohan <amir.rohan@zoho.com <mailto:amir.rohan@zoho.com>
>     > <mailto:amir.rohan@zoho.com <mailto:amir.rohan@zoho.com>>> wrote:
>     >
>     >     In some email-clients, when you copy the Message-ID to the clipboard
>     >     it surrounds in (or perhaps, doesn't strip) angle brackets.
>     >     When pasting this into the mail archive search, search fails to find
>     >     the message
>     >     patch attached.
>     >
>
> The search works. Meaning the API returns "this messageid exists", which
> causes the search client to send a redirect to an URL that still has the
> non-stripped version of the URL (including the < or space).
>
> The regexp used for parsing URLs only allows either "msgid" or
> "<msgid>". Any other of those combinations will result in a 404 at the
> receiving end of that redirect.
>
> <...>
>
> So either the API needs to be updated to return the actual URL, and the
> consumers (which at this point I believe are just the main website) need
> to be updated to actually use that URL, or the consumer also needs to be
> updated with the same "cleaning" rules, or the URL parsing in the
> archives code need to also deal with all those cases.
>
> I think either the first, or a combination of first and third, of those
> are the way to go. The second one (doing the same changes in the pgweb
> code) seems to just put us in a position where we'll make the same
> mistake again next time we try to fix these rules.
>

With these patches (now touching pgweb and pgarchives, when a (cleaned)
search query gets a hit on a msgid we now include the canonical
messageid in the JSON response. I didn't have it return a url as you
suggested because archives should know as little as possible
about pgweb's link structure. So pgweb still crafts the redirect url,
only it uses the returned msgid if its present in the response.

A separate patch makes archives notion of where pgweb /search
lives coonfigurable, so devs can test.

As a side note, the ping-pong between archives and pgweb seems too
complicated. The search form should possibly be in pgweb instead of
archives, as all it does is server a form submitting to pgweb's
own /search endpoint, it has nothing to do with archives directly.

I tested as best I could, but there's some stuff missing from
my dev setup and no documentation how get thing running.

Amir

Вложения

Re: pgarchives: strip angle brackets when checking for msgid search

От
Alvaro Herrera
Дата:
Amir Rohan wrote:

> With these patches (now touching pgweb and pgarchives, when a (cleaned)
> search query gets a hit on a msgid we now include the canonical
> messageid in the JSON response. I didn't have it return a url as you
> suggested because archives should know as little as possible
> about pgweb's link structure. So pgweb still crafts the redirect url,
> only it uses the returned msgid if its present in the response.

FWIW the URL structure for archives can never be changed, because many
URLs are in use in permanent storage such as commit messages and
archived emails.  As an example, when we migrated from the old
Mhonarc-generated urls (with the mbox/month/message number structure)
we, or Magnus actually, created a database with redirects from that
structure to the message-id-based URL structure, to avoid breaking
links.

I'm not saying your approach is bad or shouldn't be used, just pointing
out something that the rest of the -www team is probably already aware
of and you might not be.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: pgarchives: strip angle brackets when checking for msgid search

От
Amir Rohan
Дата:
On 10/05/2015 04:59 PM, Alvaro Herrera wrote:
> Amir Rohan wrote:
> 
>> With these patches (now touching pgweb and pgarchives, when a (cleaned)
>> search query gets a hit on a msgid we now include the canonical
>> messageid in the JSON response. I didn't have it return a url as you
>> suggested because archives should know as little as possible
>> about pgweb's link structure. So pgweb still crafts the redirect url,
>> only it uses the returned msgid if its present in the response.
> 
> FWIW the URL structure for archives can never be changed, because many
> URLs are in use in permanent storage such as commit messages and
> archived emails.  As an example, when we migrated from the old
> Mhonarc-generated urls (with the mbox/month/message number structure)
> we, or Magnus actually, created a database with redirects from that
> structure to the message-id-based URL structure, to avoid breaking
> links.
> 
> I'm not saying your approach is bad or shouldn't be used, just pointing
> out something that the rest of the -www team is probably already aware
> of and you might not be.
> 

That's a valid point (two, actually) but it still /logically/
belongs in the the main website instead of the archive "API server",
and keeps with how this worked before.

But if magnus concurs with you, It's easily changed.

Amir




Archive viewer mangles links to archive appearing in messages

От
Amir Rohan
Дата:
Hi,

Related issue to <560EBEF5.6070209@zoho.com> seen here:

http://www.postgresql.org/message-id/CACACo5Sz7G0MFauC082iM=XX_hQ7qQ5ndR4JPo+H-O5vp6iCcQ@mail.gmail.com

msgid in link gets flagged as email address and is rewritten for spam
protection, breaking  and the link.

Amir




Re: pgarchives: strip angle brackets when checking for msgid search

От
Amir Rohan
Дата:
On 10/05/2015 04:01 PM, Amir Rohan wrote:
> On 10/05/2015 11:57 AM, Magnus Hagander wrote:
>>
>> So either the API needs to be updated to return the actual URL, and the
>> consumers (which at this point I believe are just the main website) need
>> to be updated to actually use that URL, or the consumer also needs to be
>> updated with the same "cleaning" rules, or the URL parsing in the
>> archives code need to also deal with all those cases.
>>
>> I think either the first, or a combination of first and third, of those
>> are the way to go. The second one (doing the same changes in the pgweb
>> code) seems to just put us in a position where we'll make the same
>> mistake again next time we try to fix these rules.
>>
> 
> With these patches <...>

Bump. Magnus, is there something more I can do to get this on live
on postgresql.org?




Re: pgarchives: strip angle brackets when checking for msgid search

От
Amir Rohan
Дата:
On 10/21/2015 04:49 PM, Amir Rohan wrote:
> On 10/05/2015 04:01 PM, Amir Rohan wrote:
>> On 10/05/2015 11:57 AM, Magnus Hagander wrote:
>>>
>>> So either the API needs to be updated to return the actual URL, and the
>>> consumers (which at this point I believe are just the main website) need
>>> to be updated to actually use that URL, or the consumer also needs to be
>>> updated with the same "cleaning" rules, or the URL parsing in the
>>> archives code need to also deal with all those cases.
>>>
>>> I think either the first, or a combination of first and third, of those
>>> are the way to go. The second one (doing the same changes in the pgweb
>>> code) seems to just put us in a position where we'll make the same
>>> mistake again next time we try to fix these rules.
>>>
>>
>> With these patches <...>
> 
> Bump. Magnus, is there something more I can do to get this on live
> on postgresql.org?
> 

Bump. Waiting for feedback on the revised patch.

Amir