Thread: human validation on post comments

human validation on post comments

From
Travis Hein
Date:
Hi,

My name is Travis, and I am new to pgsql-www.

I have been integrating a component that asks the user to enter the word
shown in a dynamically generated image before their comments can be
submitted.

I have picked up this work in progress from Gevik.

The current setup is on the development site in my sandbox:
http://travis.pgadmin.org

I am looking for feedback and guidance on the following:


- Is the input validator being invoked in all the places where it should be?
Currently I only know about the comment feedback for the documentation.
Validation is invoked from /system/page/form.php
- Currently the page shown when human validation fails is a simple
"validation failed" message, outside the portal look and feel.

And some technical details (do they follow our best practices?):

- There is a single table, session_capture, for managing the remote IP,
the session, and the word used in the generated image.
- The validation scripts that manipulate the dynamic image and the session
table are in a top-level folder, /validation
- The gettext macro is used for the messages displayed
- Modify system/page.php
    to add a validate case to the action handler list
- Modify system/page/form.php
    to send to our /validate handler instead of /system/handleform.php
- Modify the .htaccess
    to pass /validationimage through to /validation/validation_image.php


Then, if things look OK, what is the process for integrating the
enhancements back into the site?

Looking forward to constructive comments,
Travis

Re: human validation on post comments

From
Josh Berkus
Date:
Travis,

> I have been integrating a component that will ask the user to enter the
> word in a dynamic image before their comments can be submitted.

Terrific!  I'm sure the people who clear the comments will have nice things to
say.

The image is generated dynamically?   That's good -- the spammers are already
working on systems that harvest static images from sites and match them
against a database.  Grrrr.

--
Josh Berkus
Aglio Database Solutions
San Francisco

Re: human validation on post comments

From
"Magnus Hagander"
Date:
> my name is Travis, I am new to pgsql-www.
>
> I have been integrating a component that will ask the user to
> enter the word in a dynamic image before their comments can
> be submitted.
>
> I had resumed this work in progress from Gevik.

Great!

> - is the input validator being invoked on all the spots where
> it should be?
> currently the comments feedback for the documentation, i know about.
> validation is invoked from  /system/page/form.php

I believe all form submissions go through there ATM. But it would
probably be good if there was something "API-like" so you could call it
from elsewhere if required, unless that's too much work?

Will the validation image automatically be added to all forms, or is it
something we need to set for each form that needs it? And if it's
automatic, is there a way to turn it off for a form through form.php?


> - currently the invalid human feedback page is a simple
> validation failed message, outside of portal look and feel.

It would be nice if that could be fixed to be a page inside the portal.
IIRC I have code around to fix that for the generic forms, which will go
in any day now :-) It might give you something to build on.


> - the validation script that manipulates dynamic image and
> session table are in a top level folder /validation

Any reason why this is not in /system/? I like the way things are now
where all the code goes in one subdir. If it's many files, it could go
in /system/validation?


> - modify system/page/form.php
>     to send to our /validate handler, instead of
> /system/handleform.php

I assume the validate handler then passes control back into handleform,
or does form-handling move completely into the validator?


Rest looks very good!

> Then if things are looking ok, what and how is the process
> for integrating the enhancements back to the site.

Send a patch through to the list for even more comments, I guess.

//Magnus

Re: human validation on post comments

From
"Dave Page"
Date:

> -----Original Message-----
> From: pgsql-www-owner@postgresql.org
> [mailto:pgsql-www-owner@postgresql.org] On Behalf Of Travis Hein
> Sent: 18 March 2006 15:57
> To: pgsql-www@postgresql.org
> Subject: [pgsql-www] human validation on post comments
>
> Hi,
>
> my name is Travis, I am new to pgsql-www.

Hi Travis,

> I have been integrating a component that will ask the user to
> enter the word
> in a dynamic image before their comments can be submitted.
>
> I had resumed this work in progress from Gevik.
>
> The current setup is the development site in my sandbox
> http://travis.pgadmin.org
>
> I was looking for feedback, and guidance for the following things:
>
>
> - is the input validator being invoked on all the spots where
> it should be?
> currently the comments feedback for the documentation, i know about.
> validation is invoked from  /system/page/form.php

Most likely if it's from there. News, events, professional services,
comments & bug reports all seem to be covered.

> - currently the invalid human feedback page is a simple
> validation failed
> message, outside of portal look and feel.

Yes, this needs to be fixed. Also, during my testing I hit the limit for
uncompleted submissions - it gave me the message actually in the image,
but cropped so it couldn't be fully read, but still asked me to enter
the word and hit submit! Can the message be moved out and into a proper
page please?

Also, we could probably do with a couple of extra words above the image
just to clarify why the user must enter the word.

> - the validation script that manipulates dynamic image and
> session table are
> in a top level folder /validation

Yup - as Magnus said these should be under /system somewhere. The URL
exposed to the user (which will be rewritten in the .htaccess file) can
be in the root directory though.

> Then if things are looking ok, what and how is the process
> for integrating the
> enhancements back to the site.

First off, resolve mine and any other issues raised (unless they result
in any discussion/objections, in which case wait for the outcome of that
first). Secondly, cvs update your code and merge with the current.
Magnus committed a bunch of changes over the weekend so you'll need to
make sure everything lives happily together. Then, post a patch in diff
-c format for review.

Cheers, Dave.

Re: human validation on post comments

From
David Fetter
Date:
On Sat, Mar 18, 2006 at 10:15:12AM -0800, Josh Berkus wrote:
> Travis,
>
> > I have been integrating a component that will ask the user to
> > enter the word in a dynamic image before their comments can be
> > submitted.
>
> Terrific!  I'm sure the people who clear the comments will have nice
> things to say.
>
> The image is generated dynamically?   That's good -- the spammers
> are already working on systems that harvest static images from sites
> and match them against a database.  Grrrr.

Actually, they've already got one, and here's how it works:

1.  Put up a free porn site.
2.  Present somebody else's captcha image as an entry.
3.  Let the person see the porn if they've correctly cracked the
    captcha.
4.  Spam site.

The sad part of this one is that they don't have to crack any single
captcha system.  Instead, they've cracked the entire captcha process.

Cheers,
D
--
David Fetter <david@fetter.org> http://fetter.org/
phone: +1 415 235 3778        AIM: dfetter666
                              Skype: davidfetter

Remember to vote!

Re: human validation on post comments

From
"Dave Page"
Date:

> -----Original Message-----
> From: pgsql-www-owner@postgresql.org
> [mailto:pgsql-www-owner@postgresql.org] On Behalf Of David Fetter
> Sent: 21 March 2006 05:43
> To: PostgreSQL WWW
> Subject: Re: [pgsql-www] human validation on post comments
>
> Actually, they've already got one, and here's how it works:
>
> 1.  Put up a free porn site.
> 2.  Present somebody else's capcha image as an entry.
> 3.  Let the person see the porn if they've correctly cracked the
>     capcha.
> 4.  Spam site.
>
> The sad part of this one is that they don't have to crack any single
> capcha system.  Instead, they've cracked the entire capcha process.

Grrr, where's my baseball bat?

Actually though, that shouldn't be too much of a problem as long as the
images time out after a few minutes, and we still have all the normal
moderation in place.

Regards, Dave.

Re: human validation on post comments

From
Tino Wildenhain
Date:
Dave Page schrieb:
>
>
>
>>-----Original Message-----
>>From: pgsql-www-owner@postgresql.org
>>[mailto:pgsql-www-owner@postgresql.org] On Behalf Of David Fetter
...
>>The sad part of this one is that they don't have to crack any single
>>capcha system.  Instead, they've cracked the entire capcha process.
>
>
> Grrr, where's my baseball bat?
>
> Actually though that shouldn't be too much of a problem as long as the
> images timeout after a few minutes- and we still have all the normal
> moderation in place.
>
I should point out that the whole captcha thing isn't WAI compliant:
http://www.w3.org/WAI/

so better not follow that path if we don't want to drive away
disabled people...

--Tino

Re: human validation on post comments

From
"Dave Page"
Date:

> -----Original Message-----
> From: Tino Wildenhain [mailto:tino@wildenhain.de]
> Sent: 21 March 2006 08:26
> To: Dave Page
> Cc: David Fetter; PostgreSQL WWW
> Subject: Re: [pgsql-www] human validation on post comments
>
> Dave Page schrieb:
> >
> >
> >
> >>-----Original Message-----
> >>From: pgsql-www-owner@postgresql.org
> >>[mailto:pgsql-www-owner@postgresql.org] On Behalf Of David Fetter
> ...
> >>The sad part of this one is that they don't have to crack any single
> >>capcha system.  Instead, they've cracked the entire capcha process.
> >
> >
> > Grrr, where's my baseball bat?
> >
> > Actually though that shouldn't be too much of a problem as
> long as the
> > images timeout after a few minutes- and we still have all the normal
> > moderation in place.
> >
> I should point out, the whole captcha thing isnt WAI compliant:
> http://www.w3.org/WAI/
>
> so better not follow that path if we dont want to shy away
> diabled people...

Hmm, that's a good point (one I should have thought of considering the
BSI's new guidance notes on building accessible websites just landed on
my desk!).

I think we would be covered if we offered an audio equivalent as well
though wouldn't we? Perhaps there is some text-to-speech code that we
can use from within PHP?

Regards, Dave.

Re: human validation on post comments

From
"Magnus Hagander"
Date:
> > > I have been integrating a component that will ask the
> user to enter
> > > the word in a dynamic image before their comments can be
> submitted.
> >
> > Terrific!  I'm sure the people who clear the comments will
> have nice
> > things to say.
> >
> > The image is generated dynamically?   That's good -- the spammers
> > are already working on systems that harvest static images
> from sites
> > and match them against a database.  Grrrr.
>
> Actually, they've already got one, and here's how it works:
>
> 1.  Put up a free porn site.
> 2.  Present somebody else's capcha image as an entry.
> 3.  Let the person see the porn if they've correctly cracked the
>     capcha.
> 4.  Spam site.
>
> The sad part of this one is that they don't have to crack any
> single capcha system.  Instead, they've cracked the entire
> capcha process.

I don't know how this particular system is set up, but how can they
defeat something like:

* Fill in form data. Submit
* Generate verification page containing an image. Along with the code,
store the hash of the form data.
* Validate the image against the hash of the data.

This means you need to put all your data in the form beforehand, so you
have to tailor one page to each set of contents. Or am I thinking
completely wrong here :-)
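
The flow outlined above could be sketched roughly as follows. This is only an illustration in Python (the site itself is PHP); the in-memory dictionary, the function names, and the choice of SHA-256 over a JSON serialisation of the form are all assumptions made for the example, not anything taken from the actual site code.

```python
import hashlib
import hmac
import json
import secrets

# Hypothetical in-memory store standing in for a database table:
# challenge id -> (expected word, hash of the submitted form data)
_pending = {}


def form_digest(form):
    """Hash the form fields in a stable (sorted-key) order."""
    canonical = json.dumps(form, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def issue_challenge(form, word):
    """Step 2: remember the word together with the hash of the form data."""
    challenge_id = secrets.token_hex(16)
    _pending[challenge_id] = (word, form_digest(form))
    return challenge_id


def verify(challenge_id, form, answer):
    """Step 3: accept only if both the word and the form data match."""
    entry = _pending.pop(challenge_id, None)  # each challenge is single use
    if entry is None:
        return False
    word, digest = entry
    same_form = hmac.compare_digest(
        digest.encode("ascii"), form_digest(form).encode("ascii")
    )
    same_word = hmac.compare_digest(
        word.lower().encode("utf-8"), answer.strip().lower().encode("utf-8")
    )
    return same_form and same_word
```

Because the stored hash covers the form contents, an answer obtained for one submission cannot be reused to post different contents. It does not stop a relay that forwards the image for the same submission.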

//Magnus

Re: human validation on post comments

From
David Fetter
Date:
On Tue, Mar 21, 2006 at 08:12:05AM -0000, Dave Page wrote:
> > -----Original Message-----
> > From: pgsql-www-owner@postgresql.org
> > [mailto:pgsql-www-owner@postgresql.org] On Behalf Of David Fetter
> > Sent: 21 March 2006 05:43
> > To: PostgreSQL WWW
> > Subject: Re: [pgsql-www] human validation on post comments
> >
> > Actually, they've already got one, and here's how it works:
> >
> > 1.  Put up a free porn site.
> > 2.  Present somebody else's capcha image as an entry.
> > 3.  Let the person see the porn if they've correctly cracked the
> >     capcha.
> > 4.  Spam site.
> >
> > The sad part of this one is that they don't have to crack any
> > single capcha system.  Instead, they've cracked the entire capcha
> > process.
>
> Grrr, where's my baseball bat?
>
> Actually though that shouldn't be too much of a problem as long as
> the images timeout after a few minutes- and we still have all the
> normal moderation in place.

The porn thing works just fine no matter what the timeout is, as the
spam is queued up already and the captcha gets presented as soon as
it's generated.  The porn surfer will generally not dally when
presented with the captcha.

But apart from its ineffectiveness on spammers, as others have
mentioned, captcha excludes blind people. :(

Cheers,
D
--
David Fetter <david@fetter.org> http://fetter.org/
phone: +1 415 235 3778        AIM: dfetter666
                              Skype: davidfetter

Remember to vote!

Re: human validation on post comments

From
"Dave Page"
Date:

> -----Original Message-----
> From: David Fetter [mailto:david@fetter.org]
> Sent: 21 March 2006 16:45
> To: Dave Page
> Cc: PostgreSQL WWW
> Subject: Re: [pgsql-www] human validation on post comments
>
> The porn thing works just fine no matter what the timeout is, as the
> spam is queued up already and the capcha gets presented as soon as
> it's generated.  The porn surfer will generally not dally when
> presented with the capcha.

Generating enough real traffic to a dummy site to ensure that there is
always a user ready to read a single captcha within a few minutes of it
being generated just to post a single piece of spam seems like a pretty
mean feat. I would think they could generate more revenue from bunging a
few ads on the site than hoping that the spam they manage to get on a
completely unrelated site might actually generate a customer. Still, I'm
only speculating so may be completely wrong.

> But apart from its ineffectiveness on spammers, as others have
> mentioned, capcha excludes blind people. :(

Yes - it's a shame none of us thought about it when Gevik was originally
working on it.

There is the audio option I suggested, which PayPal uses IIRC -
alternatively we could use some sort of puzzle - such as 'enter the
third, second from last and 2nd character from this string'.
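
A puzzle of that kind could look something like the sketch below. It is written in Python purely for illustration; the function names and the (position, from_end) representation are made up for the example.

```python
def ordinal(n):
    """Return 1 -> '1st', 2 -> '2nd', 3 -> '3rd', 4 -> '4th', and so on."""
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"


def expected_answer(text, picks):
    """picks is a list of (n, from_end) pairs, with n counted from 1."""
    return "".join(text[-n] if from_end else text[n - 1] for n, from_end in picks)


def prompt(text, picks):
    """Build the plain-text question shown to the user."""
    parts = [
        f"{ordinal(n)} from last" if from_end else ordinal(n)
        for n, from_end in picks
    ]
    return f"Enter the {', '.join(parts)} characters of: {text}"


# The example from the message: third, second from last, and 2nd character.
picks = [(3, False), (2, True), (2, False)]
question = prompt("postgres", picks)
answer = expected_answer("postgres", picks)  # "seo"
```

Since the question is plain text, it can be read by a screen reader, which is the accessibility gain over an image.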

Regards, Dave.

Re: human validation on post comments

From
David Fetter
Date:
On Tue, Mar 21, 2006 at 04:54:24PM -0000, Dave Page wrote:
>
>
> > -----Original Message-----
> > From: David Fetter [mailto:david@fetter.org]
> > Sent: 21 March 2006 16:45
> > To: Dave Page
> > Cc: PostgreSQL WWW
> > Subject: Re: [pgsql-www] human validation on post comments
> >
> > The porn thing works just fine no matter what the timeout is, as
> > the spam is queued up already and the capcha gets presented as
> > soon as it's generated.  The porn surfer will generally not dally
> > when presented with the capcha.
>
> Generating enough real traffic to a dummy site to ensure that there
> is always user ready to read a single capcha within a few minutes of
> it being generated just to post a single piece of spam seems like a
> pretty mean feat.

I see I didn't explain it well enough.  Here's the flow:

1.  Spammer generates spam and queues it up for sites.
2.  A person arrives at the porn site.
3.  The spam system generates a request including the spam to the
    target site.  Clock starts ticking.
4.  The spam system presents the resulting captcha to the porn surfer.
    Less than a second has elapsed.
5.  Porn surfer types in the string as asked.  Time elapsed is
    probably still under 5 seconds.
6.  Spam system sends the string to the target site.  Time elapsed is
    under 10 seconds for >90% of cases.

> I would think they could generate more revenue from bunging a few
> ads on the site than hoping that the spam they manage to get on a
> completely unrelated site might actually generate a customer. Still,
> I'm only speculating so may be completely wrong.

It's very cheap to set up such a system, and spammers routinely
expect--and profit from--"hit rates" that are less than one in a
million.

> > But apart from its ineffectiveness on spammers, as others have
> > mentioned, capcha excludes blind people. :(
>
> Yes - it's a shame none of us thought about it when Gevik was
> originally working on it.
>
> There is the audio option I suggested which Paypal use IIRC -
> alternatively we could use some sort of puzzle - such as 'enter the
> third, second from last and 2nd character from this string'.

That lends itself to exactly the same attack I sketched out above.

Cheers,
D
--
David Fetter <david@fetter.org> http://fetter.org/
phone: +1 415 235 3778        AIM: dfetter666
                              Skype: davidfetter

Remember to vote!

Re: human validation on post comments

From
"Dave Page"
Date:

> -----Original Message-----
> From: David Fetter [mailto:david@fetter.org]
> Sent: 21 March 2006 17:16
> To: Dave Page
> Cc: PostgreSQL WWW
> Subject: Re: [pgsql-www] human validation on post comments
>
> I see I didn't explain it well enough.  Here's the flow:
>
> 1.  Spammer generates spam and queues it up for sites.
> 2.  A person arrives at the porn site.
> 3.  The spam system generates a request including the spam to the
>     target site.  Clock starts ticking.
> 4.  The spam system presents the resulting capcha to the porn surfer.
>     Less than a second has elapsed.
> 5.  Porn surfer types in the string as asked.  Time elapsed is
>     probably still under 5 seconds.
> 6.  Spam system sends the string to the target site.  Time elapsed is
>     under 10 seconds for >90% of cases.

Ahh, gotcha.

>
> > > But apart from its ineffectiveness on spammers, as others have
> > > mentioned, capcha excludes blind people. :(
> >
> > Yes - it's a shame none of us thought about it when Gevik was
> > originally working on it.
> >
> > There is the audio option I suggested which Paypal use IIRC -
> > alternatively we could use some sort of puzzle - such as 'enter the
> > third, second from last and 2nd character from this string'.
>
> That lends itself to exactly the same attack I sketched out above.

Undoubtedly, but unless they write something specifically to work with
our site, which is a lot of effort... And all we do then is fall back to
how things are now until we've broken whatever they were doing by
modifying the regexps in the auto-reject code or re-jigging the puzzles.
Of course, in doing any of this we mustn't make it too difficult for the
user to submit things.

Regards, Dave.

Re: human validation on post comments

From
"Greg Sabino Mullane"
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


Pardon if this has already been brought up as I am late to the
thread, but why can't we just use a "unseen by default" policy
in which all posts are not made viewable until approved by a
moderator?

- --
Greg Sabino Mullane greg@turnstep.com
PGP Key: 0x14964AC8 200603211227
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-----BEGIN PGP SIGNATURE-----

iD8DBQFEIDjRvJuQZxSWSsgRAtydAKCts/A5xWRpzbxs//VjZqmCrKUQXwCaArh3
tpO4osfGfRUS2GPGJKab2dk=
=Ha6g
-----END PGP SIGNATURE-----



Re: human validation on post comments

From
"Magnus Hagander"
Date:
> Pardon if this has already been brought up as I am late to
> the thread, but why can't we just use a "unseen by default"
> policy in which all posts are not made viewable until
> approved by a moderator?

We already do this, and have done for a while (before that they used to
be shown by default but could be rejected, which missed a lot). This is,
AFAIK, about making life easier for the moderators :-)

//Magnus

Re: human validation on post comments

From
Robert Treat
Date:
On Tuesday 21 March 2006 12:49, Magnus Hagander wrote:
> > Pardon if this has already been brought up as I am late to
> > the thread, but why can't we just use a "unseen by default"
> > policy in which all posts are not made viewable until
> > approved by a moderator?
>
> We already do this, and have done for a while (before that they used to
> be show-by-default, but can be rejected, which missed a lot). This is
> AFAIK about making life for the moderators easie r:-)
>

Honestly I don't find the amount of spam as annoying as the amount of
pointless posts and/or support questions...  but if you really want such a
system, why not have one that switches randomly between captcha, math
problems, and anagrams, and also verifies the referer URL?  This wouldn't be
foolproof, but the point here is only to make it more of a hassle than the
next guy's site.
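
As a rough illustration of that idea, here is a Python sketch that picks a challenge type at random and checks the Referer header. The word list, the function names, and the allowed host are assumptions for the example; the image captcha is left out, and the Referer header is trivially forged, so this only raises the cost a little.

```python
import random
from urllib.parse import urlparse

# Hypothetical word list, chosen to avoid words with obvious alternative anagrams.
WORDS = ("database", "elephant", "postgres")


def math_challenge(rng):
    """Return (question, expected answer) for a small addition problem."""
    a, b = rng.randint(1, 9), rng.randint(1, 9)
    return f"What is {a} + {b}?", str(a + b)


def anagram_challenge(rng):
    """Return (question, expected answer) for a scrambled word."""
    word = rng.choice(WORDS)
    letters = list(word)
    rng.shuffle(letters)
    return f"Unscramble: {''.join(letters)}", word


CHALLENGES = (math_challenge, anagram_challenge)


def pick_challenge(rng=None):
    """Switch randomly between the available challenge types."""
    rng = rng or random.Random()
    return rng.choice(CHALLENGES)(rng)


def referer_ok(referer, allowed_host="www.postgresql.org"):
    """Accept only submissions whose Referer points at the expected host."""
    return urlparse(referer or "").hostname == allowed_host
```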

--
Robert Treat
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL

Re: human validation on post comments

From
Travis Hein
Date:
Hi all, sorry it has been a while on this one; I have been off pondering this.

OK, so far, the way the captcha works is that it uses a text file as a
dictionary of words and randomly picks a word from this file.
Then it uses the PEAR PHP modules to dynamically render the word into an
image with random placement and orientation, along with two random sets of
concentric circles. I am thinking that this would deter bots or anyone
else from using image-match lookups, character recognition, or saved copies
of previous images. (And if someone is creative enough to get through, they
still have to deal with the moderators :) )

There is a simple database table that stores the word when the image is
first generated and links it to your session. You have 3 tries to enter the
right word; after that the image is no longer usable.

Once you do enter the correct word, this image/session binding is marked as
used and cannot be used again.

There is also a timeout period after which the image expires in the session
captcha, assuming you keep your session active long enough.
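
The rules described above (three tries, single use, expiry) can be summarised in a small sketch. This is Python for illustration only, not the PHP code from the sandbox; the class and field names are invented, the dictionary stands in for the session table, and the 300-second timeout is an assumed value.

```python
import time
from dataclasses import dataclass

MAX_TRIES = 3
TTL_SECONDS = 300  # assumed value; the actual timeout is not stated


@dataclass
class Challenge:
    word: str
    created: float
    tries: int = 0
    used: bool = False


class ChallengeStore:
    """Stands in for the session table: one challenge per session id."""

    def __init__(self, now=time.time):
        self._by_session = {}
        self._now = now  # injectable clock, to make expiry testable

    def issue(self, session_id, word):
        """Record the word when the image is first generated."""
        self._by_session[session_id] = Challenge(word, self._now())

    def check(self, session_id, answer):
        c = self._by_session.get(session_id)
        if c is None or c.used:
            return False  # unknown session, or already used once
        if self._now() - c.created > TTL_SECONDS:
            return False  # image has expired
        if c.tries >= MAX_TRIES:
            return False  # too many attempts; image no longer usable
        c.tries += 1
        if answer.strip().lower() == c.word.lower():
            c.used = True  # mark the image/session binding as used
            return True
        return False
```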

This was all Gevik's work, and it seems pretty complete on these parts.


On March 21, 2006 03:43 am, Dave Page wrote:
> I should point out, the whole captcha thing isnt WAI compliant:
> http://www.w3.org/WAI/
>
> so better not follow that path if we dont want to shy away
> diabled people...

> I think we would be covered if we offered an audio equivalent as well
> though wouldn't we? Perhaps there is some text-to-speech code that we
> can use from within PHP?

> But apart from its ineffectiveness on spammers, as others have
> mentioned, capcha excludes blind people. :(

So, what I am thinking is that, for us to be WAI compliant, we need an option
to "click here to receive an audible version of the word in the image that
you must type in".

I have not been able to find a good, universal, works-everywhere
text-to-speech converter.

So how about we (someone with a good voice) pre-record each and every
word in the captcha word dictionary? Eventually we could also do it for all
the languages we support, but I was thinking that if the word was "shovel",
the voice version of the word could be "s" "h" "o" "v" "e" "l". You know,
spell out the letters, so that it is more language independent. (Currently,
the dictionary of captcha words is English only, too.)

I would consider using a small database table to manage all the words in
the dictionary, as opposed to cluttering up a folder in the docroot with
the audio files.
The table would have a column for the word as text, and a column for the
recorded audio of the word spelled out letter by letter, as a .wav (or
whatever the standard acceptable audio format is).

Then, on the captcha image screen, a link such as "click here to get the
audio version of the word shown in the image" would let the user retrieve
the recording as a downloadable/playable attachment, play it, and type the
letters back in. From there, the existing session captcha features would be
used.

I recognise that the challenge is getting someone to vocalize the dictionary
of captcha words.
I have not done this yet, but wanted to get general feedback on whether this
is a good idea.

> --
Tue Mar 28 19:48:12 EST 2006
 19:48:12 up 21 min,  1 user,  load average: 12.87, 9.76, 5.24

Re: human validation on post comments

From
"Dave Page"
Date:

> -----Original Message-----
> From: pgsql-www-owner@postgresql.org
> [mailto:pgsql-www-owner@postgresql.org] On Behalf Of Travis Hein
> Sent: 29 March 2006 02:10
> To: PostgreSQL WWW
> Subject: Re: [pgsql-www] human validation on post comments
>

Hi Travis,

> > But apart from its ineffectiveness on spammers, as others have
> > mentioned, capcha excludes blind people. :(
>
> So, what I am thinking, is for us to be WAI compliant, we
> need an option to
> "click here to receive an audible version of this word in the
> image that you
> must type in"
>
> I have not been able to find a good, universal, works in all
> places text to
> speech converter,

Well, we could always use a 'works in one place' one and record the
output.

> so how about, if we (someone with a good voice) pre-records
> each and every
> word in the capatcha word dictionary, (eventually we could
> also do it for all
> languages we support too, but i was thinking if the word was
> "shovel" that
> the voice version of the word could be "s"  "h"  "o" "v" "e"
> "l" . you  know,
> spell out the letters, so that it is more language
> independent. (currently,
> the dictionary for captacha words is english only now too.)

Yes, that seems the easiest way. Simpler yet, perhaps we should make it
a numeric code.

> I would consider using a small database table, for the
> dictionary, to manage
> all the words for the dictionary, as opposed to cluttering up
> a folder in the
> docroot with the audio files?
> The table would have a column for the word, as text, and a
> column for the
> word, as a .wav (or the standard acceptable audio format)
> recorded, spelled
> out by the letters version of this word.

It might be better to keep it all in the filesystem to be honest,
otherwise it'll add a fair bit of load to wwwmaster. In the fs, at least
it will all come from one of the frontend servers.

> then on the captacha image screen, the link "click here to
> get the audio
> version of the word shown in the image button" would be the
> facility to
> retreive the word as a downloadable / playable attachment for
> the user, to
> play, and type the letters back in. From there, the existing
> session captcha
> features would be used.

Yes.

> I recognise that the challenge is to get someone to vocalize
> the dictionary of
> captcha words.
> I have not done this yet, but wanted to get the general
> feedback to if this
> was a good idea.

It's doable, but is it worth the effort? I'm beginning to think not if
we have to prerecord everything rather than being able to generate it on
the fly. Currently I'm only seeing a few spams per day - it looks like
the latest additions to the regexp reject list are working pretty well.
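
For readers unfamiliar with the approach, a regexp reject list amounts to something like the sketch below. It is in Python for illustration; the patterns are invented examples and are not the ones used on the site.

```python
import re

# Hypothetical patterns; a real list would be tuned to the spam actually seen.
REJECT_PATTERNS = [
    re.compile(pattern, re.IGNORECASE)
    for pattern in (
        r"\bviagra\b",
        r"\bcasino\b",
        r"(https?://\S+\s*){3,}",  # three or more links in a row
    )
]


def auto_reject(comment):
    """Return True if the comment matches any pattern on the reject list."""
    return any(p.search(comment) for p in REJECT_PATTERNS)
```

Anything that is not rejected automatically would still go to the moderators, as it does now.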

Regards, Dave