Thread: Bugs in new announcement system

Bugs in new announcement system

From
David Fetter
Date:
Hi,

I just spent an hour trying to figure out how to post the PostgreSQL
Weekly News through the new web form after I spent this morning and
into this afternoon writing it. It would be an understatement to
describe that latter process as onerous and unpleasant.

The attempt to disallow HTML by checking for < in a regex is not super
handy, and it's probably not secure either.

https://git.postgresql.org/gitweb/?p=pgweb.git;a=commitdiff;h=b3e9a962e4514962a1fdbf86b8cdbae3103e76e9
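
To make the false-positive problem concrete, here is a minimal sketch. The pattern below is only an assumed stand-in for a "reject anything containing <" check, not the actual pgweb regex, and the sample strings are invented for illustration.

```python
import re

# Assumed stand-in for the stop-gap check: refuse any text containing "<".
# This is NOT the real pgweb pattern, just the simplest form of the idea.
NAIVE_HTML_CHECK = re.compile(r"<")


def naive_rejects(text: str) -> bool:
    """Return True if the naive check would refuse this text."""
    return NAIVE_HTML_CHECK.search(text) is not None


# Hypothetical inputs: two pieces of ordinary prose and one real tag.
samples = [
    "Query latency dropped to < 5 ms after the upgrade.",
    "Contact: Jane Doe <jane@example.org>",
    "<script>alert(1)</script>",
]

for sample in samples:
    # All three are refused, although only the last one is markup.
    print(naive_rejects(sample), "|", sample)
```

A check like this cannot tell a comparison operator or an e-mail address from a tag, and it gives the author no hint about which part of the text triggered it.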

I went and found a Python library called Bleach
(https://bleach.readthedocs.io/en/latest/), which should do a much
better job.

Please fix this either by making something that highlights the
offending section(s) so people have some idea what to fix, or renders
them harmless automatically, whichever seems easier. I went to the
trouble of tracking this down because I have a lot of readers each
week who expect me to get it there, but I doubt anyone else who ran
into this bothered.

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: Bugs in new announcement system

From
Magnus Hagander
Date:
On Mon, Nov 2, 2020 at 1:10 AM David Fetter <david@fetter.org> wrote:
>
> Hi,
>
> I just spent an hour trying to figure out how to post the PostgreSQL
> Weekly News through the new web form after I spent this morning and
> into this afternoon writing it. It would be an understatement to
> describe that latter process as onerous and unpleasant.

The expectation that you might need some extra time on it is why we
notified you of the changes ahead of actually making them, and offered
to help with any issues or questions you had around it...

> The attempt to disallow HTML by checking for < in a regex is not super
> handy, and it's probably not secure either.

Fully agreed, that's a quick stop-gap measure put in earlier, that
should've been replaced.


> I went and found a Python library called Bleach
> (https://bleach.readthedocs.io/en/latest/), which should do a much
> better job.

Yeah, that seems a lot more useful.


> Please fix this either by making something that highlights the
> offending section(s) so people have some idea what to fix, or renders
> them harmless automatically, whichever seems easier. I went to the

Do you have any suggestions for how to actually accomplish such highlighting?

There are also some further issues around the preview code for that,
since it uses a different markdown engine, but that one already has
some issues so we should probably try to figure that part out at the
same time.


> trouble of tracking this down because I have a lot of readers each
> week who expect me to get it there, but I doubt anyone else who ran
> into this bothered.

Well, nobody else has reported any problems, but my guess is nobody
else has tried pasting HTML before :)

--
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/



Re: Bugs in new announcement system

From
David Fetter
Date:
On Sun, Nov 08, 2020 at 06:25:17PM +0100, Magnus Hagander wrote:
> On Mon, Nov 2, 2020 at 1:10 AM David Fetter <david@fetter.org> wrote:
> >
> > Hi,
> >
> > I just spent an hour trying to figure out how to post the PostgreSQL
> > Weekly News through the new web form after I spent this morning and
> > into this afternoon writing it. It would be an understatement to
> > describe that latter process as onerous and unpleasant.
> 
> The expectation that you might need some extra time on it is why we
> notified you of the changes ahead of actually making them, and offered
> to help with any issues or questions you had around it...

When was this?

> > The attempt to disallow HTML by checking for < in a regex is not super
> > handy, and it's probably not secure either.
> 
> Fully agreed, that's a quick stop-gap measure put in earlier, that
> should've been replaced.
> 
> > I went and found a Python library called Bleach
> > (https://bleach.readthedocs.io/en/latest/), which should do a much
> > better job.
> 
> Yeah, that seems a lot more useful.

> > Please fix this either by making something that highlights the
> > offending section(s) so people have some idea what to fix, or renders
> > them harmless automatically, whichever seems easier. I went to the
> 
> Do you have any suggestions for how to actually accomplish such highlighting?

I'd imagine that the thing that can tell there's HTML in there can
also tell where it is and hand back a line number at a minimum.
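
One way this could look, as a rough sketch only: Python's standard `html.parser` tracks its position, so a small subclass can report the line on which each tag starts. The names `TagLocator`, `locate_tags`, and `format_report` are hypothetical, and nothing here is taken from pgweb.

```python
from html.parser import HTMLParser


class TagLocator(HTMLParser):
    """Collect (line, tag) for every start tag the parser recognises."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        # getpos() returns (line number, column offset); lines are 1-based.
        line, _column = self.getpos()
        self.found.append((line, tag))


def locate_tags(text):
    """Return a list of (line, tag) pairs for tags found in text."""
    parser = TagLocator()
    parser.feed(text)
    parser.close()
    return parser.found


def format_report(text):
    """Build a human-readable message naming each offending line."""
    return [f"Line {line}: <{tag}> is not allowed" for line, tag in locate_tags(text)]


# Hypothetical newsletter text: the "<" on line 2 is not followed by a
# letter, so the parser treats it as plain text rather than a tag.
body = (
    "PostgreSQL Weekly News\n"
    "Latency is < 5 ms now\n"
    'Click <a href="https://example.org">here</a>\n'
    "<b>bold</b>\n"
)
for message in format_report(body):
    print(message)
```

The resulting messages could be shown next to the input field as an ordinary validation error, which gives the author a line to look at without any special support from the text box itself.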

> There are also some further issues around the preview code for that,
> since it uses a different markdown engine, but that one already has
> some issues so we should probably try to figure that part out at the
> same time.
> 
> 
> > trouble of tracking this down because I have a lot of readers each
> > week who expect me to get it there, but I doubt anyone else who ran
> > into this bothered.
> 
> Well, nobody else has reported any problems, but my guess is nobody
> else has tried pasting HTML before :)

I did not try pasting HTML in there. There was no HTML anywhere in the
newsletter before. What there was was a false positive that I had the
pleasure of tracking down.

What is it precisely that you don't want in HTML? I'm asking because
if you can come up with a list of things you want blocked, a gizmo
that removes same from that AST (er, DOM) seems like the thing that
would actually work and not burden people.

You're inferring that no complaints means no one had problems other
than me. I think a much more likely explanation is survivorship bias,
i.e. lots of people noticed it was buggy and unhelpful, and silently
gave up.

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: Bugs in new announcement system

From
Dave Cramer
Date:


> You're inferring that no complaints means no one had problems other
> than me. I think a much more likely explanation is survivorship bias,
> i.e. lots of people noticed it was buggy and unhelpful, and silently
> gave up.

I just assumed someone was busy fixing it.

Dave

Re: Bugs in new announcement system

From
"Jonathan S. Katz"
Date:
On 11/8/20 10:16 PM, David Fetter wrote:
> On Sun, Nov 08, 2020 at 06:25:17PM +0100, Magnus Hagander wrote:
>> On Mon, Nov 2, 2020 at 1:10 AM David Fetter <david@fetter.org> wrote:
>>>
>>> Hi,
>>>
>>> I just spent an hour trying to figure out how to post the PostgreSQL
>>> Weekly News through the new web form after I spent this morning and
>>> into this afternoon writing it. It would be an understatement to
>>> describe that latter process as onerous and unpleasant.
>>
>> The expectation that you might need some extra time on it is why we
>> notified you of the changes ahead of actually making them, and offered
>> to help with any issues or questions you had around it...
>
> When was this?

2020-09-09, subject was "Announce/PWN changes". It was sent from Magnus
& CC'd to webmaster (which is why I'm aware of the note).

Jonathan

Re: Bugs in new announcement system

From
Magnus Hagander
Date:
On Mon, Nov 9, 2020 at 4:16 AM David Fetter <david@fetter.org> wrote:
>
> On Sun, Nov 08, 2020 at 06:25:17PM +0100, Magnus Hagander wrote:
> > On Mon, Nov 2, 2020 at 1:10 AM David Fetter <david@fetter.org> wrote:
> > Yeah, that seems a lot more useful.
>
> > > Please fix this either by making something that highlights the
> > > offending section(s) so people have some idea what to fix, or renders
> > > them harmless automatically, whichever seems easier. I went to the
> >
> > Do you have any suggestions for how to actually accomplish such highlighting?
>
> I'd imagine that the thing that can tell there's HTML in there can
> also tell where it is and hand back a line number at a minimum.

Oh, that's the easy part -- even getting a regexp to do that is pretty
easy. But how do you get that feedback into a standard HTML input
box? What amount of black magic is needed there?


> > There are also some further issues around the preview code for that,
> > since it uses a different markdown engine, but that one already has
> > some issues so we should probably try to figure that part out at the
> > same time.
> >
> >
> > > trouble of tracking this down because I have a lot of readers each
> > > week who expect me to get it there, but I doubt anyone else who ran
> > > into this bothered.
> >
> > Well, nobody else has reported any problems, but my guess is nobody
> > else has tried pasting HTML before :)
>
> I did not try pasting HTML in there. There was no HTML anywhere in the
> newsletter before. What there was was a false positive that I had the
> pleasure of tracking down.

Oh, gotcha. Would you care to actually share *what* the problematic
match was? If nothing else, that would be good to test against with a
new implementation.


> What is it precisely that you don't want in HTML? I'm asking because
> if you can come up with a list of things you want blocked, a gizmo
> that removes same from that AST (er, DOM) seems like the thing that
> would actually work and not burden people.

We don't want anything in HTML in general, other than what's generated
out of the markdown. So it's really a question of what we *want*,
which is just the basic formatting tags + links.

Looking some more at the bleach thing it does seem to work with this
kind of whitelist model, so that is indeed probably a good way
forward. It will require some bigger hackings around the pgweb code
though, but that will likely pay off.
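
To illustrate the allowlist model, here is a minimal sketch built on the standard library only. It is a stand-in for the idea, not Bleach and not pgweb code: the tag, attribute, and URL-scheme lists are assumptions about what "basic formatting tags + links" might mean, and a real sanitizer such as Bleach handles many more cases (unbalanced tags, for one).

```python
from html import escape
from html.parser import HTMLParser
from urllib.parse import urlsplit

# Assumed allowlists; the real set would be decided by the pgweb maintainers.
ALLOWED_TAGS = {"p", "em", "strong", "code", "ul", "ol", "li", "a"}
ALLOWED_ATTRS = {"a": {"href"}}
ALLOWED_SCHEMES = {"http", "https", "mailto"}


def _safe_href(value):
    """Accept only URLs whose scheme is explicitly allowed."""
    try:
        return urlsplit(value).scheme.lower() in ALLOWED_SCHEMES
    except ValueError:
        return False


class AllowlistFilter(HTMLParser):
    """Re-emit allowed tags; escape everything else so it shows as text."""

    def __init__(self):
        super().__init__()
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED_TAGS:
            self.out.append(escape(self.get_starttag_text()))
            return
        kept = []
        for name, value in attrs:
            if value is None or name not in ALLOWED_ATTRS.get(tag, ()):
                continue  # drop attributes that are not on the allowlist
            if name == "href" and not _safe_href(value):
                continue  # drop links with unknown or missing schemes
            kept.append(f' {name}="{escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_startendtag(self, tag, attrs):
        # Treat self-closing tags like start tags, with no extra end tag.
        self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        closing = f"</{tag}>"
        self.out.append(closing if tag in ALLOWED_TAGS else escape(closing))

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))


def sanitize(text):
    """Return text with only allowlisted markup left active."""
    parser = AllowlistFilter()
    parser.feed(text)
    parser.close()
    return "".join(parser.out)


print(sanitize('<strong>ok</strong> <script>alert(1)</script>'))
print(sanitize('<a href="javascript:alert(1)" onclick="x()">x</a>'))
```

Because unknown markup is escaped instead of rejected, a stray `<` in ordinary prose is rendered harmless and the author does not have to hunt for it.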



> You're inferring that no complaints means no one had problems other
> than me. I think a much more likely explanation is survivorship bias,
> i.e. lots of people noticed it was buggy and unhelpful, and silently
> gave up.

This is certainly possible. But given the number of other people who
have contacted us with questions around *different* things in that
system after the change, I'm willing to guess that the number is
fairly low. And we've generally seen about the same number of posts /
week as we had before, so there has certainly not been a big drop.
Whereas the actual delivery rate has gone up *massively*.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/



Re: Bugs in new announcement system

From
Bruce Momjian
Date:
On Mon, Nov  9, 2020 at 03:32:58PM +0100, Magnus Hagander wrote:
> This is certainly possible. But given the number of other people who
> have contacted us with questions around *different* things in that
> system after the change, I'm willing to guess that the number is
> fairly low. And we've generally seen about the same number of posts /
> week as we had before, so there has certainly not been a big drop.
> Whereas the actual delivery rate has gone up *massively*.

Why has the delivery rate increased?  Was it related to this change?

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EnterpriseDB                             https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee




Re: Bugs in new announcement system

From
Magnus Hagander
Date:
On Wed, Nov 11, 2020 at 10:45 PM Bruce Momjian <bruce@momjian.us> wrote:
>
> On Mon, Nov  9, 2020 at 03:32:58PM +0100, Magnus Hagander wrote:
> > This is certainly possible. But given the number of other people who
> > have contacted us with questions around *different* things in that
> > system after the change, I'm willing to guess that the number is
> > fairly low. And we've generally seen about the same number of posts /
> > week as we had before, so there has certainly not been a big drop.
> > Whereas the actual delivery rate has gone up *massively*.
>
> Why has the delivery rate increased?  Was it related to this change?

Numerous reasons. The main ones are that we are now in control of the
message and can ensure it is properly DKIM signed with no violations,
and that we are also in control of the sending domain, so we can make
sure that DKIM/SPF/DMARC policies are correctly configured to actually
send to a mailing list. Another is having both an HTML and a plaintext
part consistently across messages (yes, some systems do actually
consider emails without an HTML part spam these days -- the opposite of
what it used to be).
And longer term we'd expect the fact that people can
subscribe/unsubscribe to individual topics is going to make it more
likely that they don't just hit "this is spam" -- previously you could
subscribe to -announce to get our release announcements but would also
receive a lot of things entirely unrelated to that, with the only
common topic being postgres.
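
As an illustration of the "both an HTML and a plaintext part" point only, a message with both parts can be built with Python's standard `email` package roughly as below. The function name, addresses, and bodies are hypothetical, and the sketch does not cover DKIM signing or the SPF/DMARC DNS setup, which happen elsewhere.

```python
from email.message import EmailMessage


def build_announcement(subject, sender, recipient, text_body, html_body):
    """Return a multipart/alternative message with text and HTML parts."""
    msg = EmailMessage()
    msg["Subject"] = subject
    msg["From"] = sender
    msg["To"] = recipient
    # The plaintext part goes first; clients pick the last part they support.
    msg.set_content(text_body)
    msg.add_alternative(html_body, subtype="html")
    return msg


# Hypothetical example values.
msg = build_announcement(
    subject="Example announcement",
    sender="announce@example.org",
    recipient="list@example.org",
    text_body="Release notes are available.\n",
    html_body="<p>Release notes are available.</p>\n",
)
print(msg.get_content_type())
print([part.get_content_type() for part in msg.iter_parts()])
```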

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/