Discussion: Bugs in new announcement system
Hi,

I just spent an hour trying to figure out how to post the PostgreSQL Weekly News through the new web form after I spent this morning and into this afternoon writing it. It would be an understatement to describe that latter process as onerous and unpleasant.

The attempt to disallow HTML by checking for < in a regex is not super handy, and it's probably not secure either.

https://git.postgresql.org/gitweb/?p=pgweb.git;a=commitdiff;h=b3e9a962e4514962a1fdbf86b8cdbae3103e76e9

I went and found a library Python provides called Bleach (https://bleach.readthedocs.io/en/latest/), which should do a much better job.

Please fix this either by making something that highlights the offending section(s) so people have some idea what to fix, or renders them harmless automatically, whichever seems easier. I went to the trouble of tracking this down because I have a lot of readers each week who expect me to get it there, but I doubt anyone else who ran into this bothered.

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
On Mon, Nov 2, 2020 at 1:10 AM David Fetter <david@fetter.org> wrote:
>
> Hi,
>
> I just spent an hour trying to figure out how to post the PostgreSQL
> Weekly News through the new web form after I spent this morning and
> into this afternoon writing it. It would be an understatement to
> describe that latter process as onerous and unpleasant.

The expectations that you might need some extra time on it is why we notified you of the changes ahead of actually making them, and offered to help with any issues or questions you had around it...

> The attempt to disallow HTML by checking for < in a regex is not super
> handy, and it's probably not secure either.

Fully agreed, that's a quick stop-gap measure put in earlier, that should've been replaced.

> I went and found a library Python provides called Bleach
> (https://bleach.readthedocs.io/en/latest/), which should do a much
> better job.

Yeah, that seems a lot more useful.

> Please fix this either by making something that highlights the
> offending section(s) so people have some idea what to fix, or renders
> them harmless automatically, whichever seems easier. I went to the

Do you have any suggestions for how to actually accomplish such highlighting?

There are also some further issues around the preview code for that, since it uses a different markdown engine, but that one already has some issues so we should probably try to figure that part out at the same time.

> trouble of tracking this down because I have a lot of readers each
> week who expect me to get it there, but I doubt anyone else who ran
> into this bothered.

Well, nobody else has reported any problems, but my guess is nobody else has tried pasting HTML before :)

--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/
On Sun, Nov 08, 2020 at 06:25:17PM +0100, Magnus Hagander wrote:
> On Mon, Nov 2, 2020 at 1:10 AM David Fetter <david@fetter.org> wrote:
> >
> > Hi,
> >
> > I just spent an hour trying to figure out how to post the PostgreSQL
> > Weekly News through the new web form after I spent this morning and
> > into this afternoon writing it. It would be an understatement to
> > describe that latter process as onerous and unpleasant.
>
> The expectations that you might need some extra time on it is why we
> notified you of the changes ahead of actually making them, and offered
> to help with any issues or questions you had around it...

When was this?

> > The attempt to disallow HTML by checking for < in a regex is not super
> > handy, and it's probably not secure either.
>
> Fully agreed, that's a quick stop-gap measure put in earlier, that
> should've been replaced.
>
> > I went and found a library Python provides called Bleach
> > (https://bleach.readthedocs.io/en/latest/), which should do a much
> > better job.
>
> Yeah, that seems a lot more useful.
>
> > Please fix this either by making something that highlights the
> > offending section(s) so people have some idea what to fix, or renders
> > them harmless automatically, whichever seems easier. I went to the
>
> Do you have any suggestions for how to actually accomplish such highlighting?

I'd imagine that the thing that can tell there's HTML in there can also tell where it is and hand back a line number at a minimum.

> There are also some further issues around the preview code for that,
> since it uses a different markdown engine, but that one already has
> some issues so we should probably try to figure that part out at the
> same time.
>
> > trouble of tracking this down because I have a lot of readers each
> > week who expect me to get it there, but I doubt anyone else who ran
> > into this bothered.
>
> Well, nobody else has reported any problems, but my guess is nobody
> else has tried pasting HTML before :)

I did not try pasting HTML in there. There was no HTML anywhere in the newsletter before. What there was was a false positive that I had the pleasure of tracking down.

What is it precisely that you don't want in HTML? I'm asking because if you can come up with a list of things you want blocked, a gizmo that removes same from that AST (er, DOM) seems like the thing that would actually work and not burden people.

You're inferring that no complaints means no one had problems other than me. I think a much more likely explanation is survivorship bias, i.e. lots of people noticed it was buggy and unhelpful, and silently gave up.

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
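The idea raised in this message, that whatever detects HTML could also report where it found it, can be sketched roughly as below. This is only an illustration and not the pgweb code: the bare `<` pattern stands in for the stop-gap check described in the thread, and `find_suspect_lines` and the sample text are hypothetical names and data.

```python
import re

# Assumption: a bare "<" is what the stop-gap check treats as possible HTML.
SUSPECT = re.compile(r"<")


def find_suspect_lines(text):
    """Return (line_number, column, line) for every '<' found, both 1-based."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in SUSPECT.finditer(line):
            hits.append((lineno, match.start() + 1, line))
    return hits


# Hypothetical newsletter text that contains no HTML at all.
sample = "Release notes\nUse it when x < 5\nNo markup here"

for lineno, col, line in find_suspect_lines(sample):
    print(f"line {lineno}, column {col}: {line}")
```

The sample contains no HTML, yet its second line is flagged. That is the kind of false positive a bare `<` check can produce; the actual text that tripped the form is not given in the thread.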
> You're inferring that no complaints means no one had problems other
> than me. I think a much more likely explanation is survivorship bias,
> i.e. lots of people noticed it was buggy and unhelpful, and silently
> gave up.

I just assumed someone was busy fixing it.
Dave
On 11/8/20 10:16 PM, David Fetter wrote:
> On Sun, Nov 08, 2020 at 06:25:17PM +0100, Magnus Hagander wrote:
>> On Mon, Nov 2, 2020 at 1:10 AM David Fetter <david@fetter.org> wrote:
>>>
>>> Hi,
>>>
>>> I just spent an hour trying to figure out how to post the PostgreSQL
>>> Weekly News through the new web form after I spent this morning and
>>> into this afternoon writing it. It would be an understatement to
>>> describe that latter process as onerous and unpleasant.
>>
>> The expectations that you might need some extra time on it is why we
>> notified you of the changes ahead of actually making them, and offered
>> to help with any issues or questions you had around it...
>
> When was this?

2020-09-09, subject was "Announce/PWN changes". It was sent from Magnus & CC'd to webmaster (which is why I'm aware of the note).

Jonathan
On Mon, Nov 9, 2020 at 4:16 AM David Fetter <david@fetter.org> wrote:
>
> On Sun, Nov 08, 2020 at 06:25:17PM +0100, Magnus Hagander wrote:
> > On Mon, Nov 2, 2020 at 1:10 AM David Fetter <david@fetter.org> wrote:
> > Yeah, that seems a lot more useful.
> > > Please fix this either by making something that highlights the
> > > offending section(s) so people have some idea what to fix, or renders
> > > them harmless automatically, whichever seems easier. I went to the
> >
> > Do you have any suggestions for how to actually accomplish such highlighting?
>
> I'd imagine that the thing that can tell there's HTML in there can
> also tell where it is and hand back a line number at a minimum.

Oh, that's the easy part -- even getting a regexp to do that is pretty easy. But how do you get that feedback into a standard HTML input box, what amount of black magic is needed there?

> > There are also some further issues around the preview code for that,
> > since it uses a different markdown engine, but that one already has
> > some issues so we should probably try to figure that part out at the
> > same time.
> >
> > > trouble of tracking this down because I have a lot of readers each
> > > week who expect me to get it there, but I doubt anyone else who ran
> > > into this bothered.
> >
> > Well, nobody else has reported any problems, but my guess is nobody
> > else has tried pasting HTML before :)
>
> I did not try pasting HTML in there. There was no HTML anywhere in the
> newsletter before. What there was was a false positive that I had the
> pleasure of tracking down.

Oh, gotcha. Would you care to actually share *what* the problematic match was? If nothing else, that would be good to test against with a new implementation.

> What is it precisely that you don't want in HTML? I'm asking because
> if you can come up with a list of things you want blocked, a gizmo
> that removes same from that AST (er, DOM) seems like the thing that
> would actually work and not burden people.

We don't want anything in HTML in general, other than what's generated out of the markdown. So it's really a question of what we *want*, which is just the basic formatting tags + links.

Looking some more at the bleach thing it does seem to work with this kind of whitelist model, so that is indeed probably a good way forward. It will require some bigger hackings around the pgweb code though, but that will likely pay off.

> You're inferring that no complaints means no one had problems other
> than me. I think a much more likely explanation is survivorship bias,
> i.e. lots of people noticed it was buggy and unhelpful, and silently
> gave up.

This is certainly possible. But given the number of other people who have contacted us with questions around *different* things in that system after the change, I'm willing to guess that the number are fairly low. And we've generally seen about the same number of posts / week as we had before, so there has certainly not been a big drop. Whereas the actual delivery rate has gone up *massively*.

--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/
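The whitelist model mentioned above (keep only basic formatting tags and links, neutralize everything else) might look roughly like the sketch below. It deliberately uses only the standard library's `html.parser` so it runs anywhere; it is not Bleach and not the pgweb implementation. The tag list, the attribute list, the allowed URL schemes, and the names `AllowlistFilter` and `clean` are all assumptions made for illustration, and a maintained sanitizer would be the safer choice in production.

```python
from html import escape
from html.parser import HTMLParser

# Assumed allowlist for "basic formatting tags + links"; the real set is not
# specified in the thread.
ALLOWED_TAGS = {"p", "em", "strong", "ul", "ol", "li", "code", "a"}
ALLOWED_ATTRS = {"a": {"href"}}
SAFE_SCHEMES = ("http://", "https://", "mailto:")


class AllowlistFilter(HTMLParser):
    """Re-emit allowed tags; escape everything else so it shows as text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED_TAGS:
            # Disallowed tag: output its source text, escaped.
            self.out.append(escape(self.get_starttag_text()))
            return
        kept = []
        for name, value in attrs:
            allowed = name in ALLOWED_ATTRS.get(tag, ())
            if allowed and value and value.lower().startswith(SAFE_SCHEMES):
                kept.append(f' {name}="{escape(value)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        closing = f"</{tag}>"
        self.out.append(closing if tag in ALLOWED_TAGS else escape(closing))

    def handle_data(self, data):
        # Plain text, including a stray "<", is escaped rather than rejected.
        self.out.append(escape(data))


def clean(text):
    parser = AllowlistFilter()
    parser.feed(text)
    parser.close()
    return "".join(parser.out)


print(clean("<strong>hi</strong> <script>alert(1)</script>"))
print(clean("x < 5"))
```

With this approach nothing is rejected outright, so there is no error to highlight in the input box: allowed markup passes through and anything else is rendered harmless. The sketch does not balance unclosed tags, which is one of several things a real sanitizer handles.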
On Mon, Nov 9, 2020 at 03:32:58PM +0100, Magnus Hagander wrote:
> This is certainly possible. But given the number of other people who
> have contacted us with questions around *different* things in that
> system after the change, I'm willing to guess that the number are
> fairly low. And we've generally seen about the same number of posts /
> week as we had before, so there has certainly not been a big drop.
> Whereas the actual delivery rate has gone up *massively*.

Why has the delivery rate increased? Was it related to this change?

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EnterpriseDB https://enterprisedb.com

The usefulness of a cup is in its emptiness, Bruce Lee
On Wed, Nov 11, 2020 at 10:45 PM Bruce Momjian <bruce@momjian.us> wrote:
>
> On Mon, Nov 9, 2020 at 03:32:58PM +0100, Magnus Hagander wrote:
> > This is certainly possible. But given the number of other people who
> > have contacted us with questions around *different* things in that
> > system after the change, I'm willing to guess that the number are
> > fairly low. And we've generally seen about the same number of posts /
> > week as we had before, so there has certainly not been a big drop.
> > Whereas the actual delivery rate has gone up *massively*.
>
> Why has the delivery rate increased? Was it related to this change?

Numerous reasons. The main ones being we are now in control of the message and can assure it is properly DKIM signed with no violations, and also in control of the sending domain making sure that DKIM/SPF/DMARC policies are correctly configured to actually send to a mailinglist. And having both a html and plaintext part consistently across messages (yes, some systems do actually consider emails without a html part spam these days -- the opposite of what it used to be).

And longer term we'd expect the fact that people can subscribe/unsubscribe to individual topics are going to make it more likely that they don't just hit "this is spam" -- previously you could subscribe to -announce to get our release announcements but would also receive a lot of things entirely unrelated to that, with the only common topic being postgres.

--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/
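One of the points above, sending every message with both a plaintext and an HTML part, can be illustrated with Python's standard `email` package. This is a minimal sketch, not the pgweb mail code: `build_announcement`, the addresses, and the bodies are made up for the example, and DKIM signing and the SPF/DMARC setup are outside its scope.

```python
from email.message import EmailMessage


def build_announcement(subject, sender, recipient, text_body, html_body):
    """Build a multipart/alternative message with plain-text and HTML parts."""
    msg = EmailMessage()
    msg["Subject"] = subject
    msg["From"] = sender
    msg["To"] = recipient
    # The plain-text part goes first; clients that can render HTML will
    # prefer the later alternative.
    msg.set_content(text_body)
    msg.add_alternative(html_body, subtype="html")
    return msg


# Hypothetical addresses and content, for illustration only.
msg = build_announcement(
    "Example announcement",
    "announce@example.org",
    "list@example.org",
    "Plain-text version of the announcement.",
    "<p>HTML version of the announcement.</p>",
)

print(msg.get_content_type())
print([part.get_content_type() for part in msg.iter_parts()])
```

Generating both parts from the same markdown source keeps them consistent across messages, which is the property described above.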