fíam

(rhymes with liam)

  • Preventing Spam

    May 13, 2008 at 03:00:48 CEST

    Lately I've been hammered with a lot of spam on this blog, so I decided to implement something to prevent it.

    As I've previously mentioned, I don't like Akismet because it's too simple. It only tells you whether it thinks a comment is spam, so the best you can do is skip writing those comments to the database. It would be nice if it returned a probability, so you could act accordingly. For example, consider the following:

    • If spam probability is 50% or below, accept the comment.
    • If it's between 50% and 80%, present some validation method to the user. It could be a CAPTCHA or even something simpler, like a message telling the user to resubmit the form within 30 seconds, since most spam bots wouldn't get that right.
    • If it's more than 80%, discard the comment.
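
    The three rules above can be sketched as a small function. This is only an illustration of the idea: Akismet does not return a probability, and the name classify_comment and the action strings are made up for this example.

```python
def classify_comment(spam_probability):
    """Map a spam probability (0.0 to 1.0) to an action.

    Hypothetical helper: the thresholds follow the list above,
    the function name and return values are illustrative only.
    """
    if not 0.0 <= spam_probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if spam_probability <= 0.5:
        return "accept"      # 50% or below: store the comment
    if spam_probability <= 0.8:
        return "challenge"   # between 50% and 80%: ask for extra validation
    return "discard"         # more than 80%: drop the comment
```

    A view could then store the comment, show a CAPTCHA, or silently drop the submission depending on the returned action.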

    But Akismet can't do that, so I will never use it. My initial idea was implementing my own spam detection system but, since developing ffloat.it keeps me busy enough, that's not something I can do for now. However, after reading the suggestion from Scott Lawton and reading the page he mentioned, I found I could write something to prevent most of the spam in less than an hour.

    My approach uses two form classes, which you must subclass in your application. And that's all you need! Your forms won't even change visually, since those two classes only introduce two hidden fields and the corresponding validation methods. The process is as follows:

    • When you create the form (empty or with data) you need to pass two new variables to it: the remote address which is requesting the page and an identifier. For example, in Blango I use the primary key for the entry.

    • The form encrypts the requester IP, the identifier and the current time using a stream cipher with your settings.SECRET_KEY as the key, and puts the result in a hidden field.

    • The form adds a text field (author_bogus_name) with a maximum length of 0, no label, and its style set to display:none. Users won't see it, but spam bots will try to put something there.

    • Upon form verification, the hidden field is deciphered and the requester address and the identifier are compared against those of the current request. If they match, a time verification is performed: if the user took less than 5 seconds to post (too fast for typing, isn't it?) or more than an hour (preventing bots from reusing the token in the future), the form won't validate.
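
    The checks in the last two steps could look roughly like the sketch below. It is not the code from magicforms.py: it assumes the hidden field has already been deciphered into its three parts, and the function and parameter names are invented for this example.

```python
import time

MIN_SECONDS = 5          # anything faster looks automated
MAX_SECONDS = 60 * 60    # tokens older than an hour are considered expired


def is_probably_human(token_ip, token_id, token_time,
                      request_ip, entry_id, honeypot_value, now=None):
    """Return True when a submission passes all the checks.

    Hypothetical helper: token_* are the values recovered from the
    hidden field, the rest come from the current request.
    """
    if now is None:
        now = time.time()
    # The invisible text field must come back empty.
    if honeypot_value:
        return False
    # The token must belong to this requester and this entry.
    if token_ip != request_ip or token_id != entry_id:
        return False
    # The form must be posted neither too quickly nor too late.
    elapsed = now - token_time
    return MIN_SECONDS <= elapsed <= MAX_SECONDS
```

    In a Django form this logic would normally live in the field validation methods, raising a validation error instead of returning False.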

    I know this method is not perfect, since a spambot could be instructed to circumvent it. But the game consists of staying ahead of the spammers, and currently this technique will get you there.

    As for the code, it's currently committed to the Blango tree, in the file magicforms.py, but for your convenience I've made it available here. Let's see an example from Blango itself:

    Before:

    class CommentForm(forms.ModelForm):
    ...
    ...
    comment_form = CommentForm()
    if request.method == 'POST' and entry.allow_comments:
        comment_form = CommentForm(request.POST)
    

    After:

    from magicforms import MagicModelForm
    class CommentForm(MagicModelForm):
    ...
    ...
    comment_form = CommentForm(request.META['REMOTE_ADDR'], entry.id)
    if request.method == 'POST' and entry.allow_comments:
        comment_form = CommentForm(request.META['REMOTE_ADDR'], entry.id, request.POST)
    

    Just remember to use MagicForm if your form inherits from forms.Form and MagicModelForm if your form inherits from forms.ModelForm. Note also that this code depends on PyCrypto (python-crypto package in Debian and friends).

41 comments for "Preventing Spam"

  • #1 Comment by Simon May 13, 2008 at 11:41:34 CEST

    I like the idea of having a field that is hidden with CSS. I think that will trip most spambots, especially if you apply the CSS to a surrounding element rather than the input field itself.

    It would be good to note that validating the REMOTE_ADDR could cause problems for some users. I believe that AOL users could have a different IP address on subsequent requests. Maybe a way around this would be to use the first two octets. Even without the REMOTE_ADDR element this method would work well.
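
    The "first two octets" idea could be sketched like this. It is a hypothetical helper that only handles dotted IPv4 addresses; nothing like it exists in the original magicforms code.

```python
def same_network_prefix(addr_a, addr_b, octets=2):
    """Compare only the leading octets of two dotted IPv4 addresses.

    Hypothetical helper, IPv4 only: "octets" is how many leading
    parts must match for the addresses to count as the same origin.
    """
    return addr_a.split(".")[:octets] == addr_b.split(".")[:octets]
```

    The trade-off is that a looser comparison also accepts tokens replayed from any other host that shares the same prefix.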

  • #2 Comment by Nick May 13, 2008 at 12:24:11 CEST

    Firefox will try to autofill it (hidden fields too, I think) and fail, so it's not such a good idea.

    Another idea is to add a unique id to every field name: an md5 hash of secret key + time + data. You would need to rewrite the form class to support this.

    That way every field would get a new name every time, which makes bots' pre-collection of field names meaningless.

  • #3 Comment by Simon May 14, 2008 at 11:52:40 CEST

    It could be that Firefox doesn't send the field in the POST data. I believe that the RFC (not sure which one anymore) says that it is up to the browser whether it wants to send it or not. Not sure what spambots would do if it was filled with predetermined data. As it's not a hidden field I suspect they would fill it with junk.

  • #4 Comment by Simon May 14, 2008 at 11:54:25 CEST

    Or Firefox is filling it with a password you have stored for the website.

  • #5 Comment by fiam May 22, 2008 at 13:58:42 CEST

    @Simon, @Nick

    Thanks for your comments. As far as I know, Firefox will try to fill the field only if it has recorded a value for a field with the same DOM id. DOM id for the hidden field in MagicForms is set to bogus_author_name, which IMHO is not a commonly used id. However, I like your idea about setting a random field id and I'll be implementing it.

  • #6 Comment by Antti Kaihola Sept. 23, 2008 at 08:55:35 CEST

    Is there a particular reason why you're using ARC4 encryption instead of a salted MD5 or SHA1 hash?

    MD5 and SHA1 are available in the Python standard library, so the dependency on the PyCrypto library could be dropped.

  • #7 Comment by Antti Kaihola Sept. 23, 2008 at 09:53:39 CEST

    Ah, now I see it. You're encrypting the timestamp as well, so a hash function won't work here since it has to be decrypted. An alternative would be to use the timestamp unencrypted and include its own salted hash to detect tampering.
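
    A rough sketch of that alternative, using only the standard library. It swaps the plain salted hash for an HMAC (a keyed hash) and is not the code from magicforms.py or from any fork of it; make_token and read_token are names invented for this example.

```python
import hashlib
import hmac
import time


def _sign(secret_key, remote_addr, identifier, timestamp):
    # Keyed hash over everything that must not be tampered with.
    message = "|".join([remote_addr, str(identifier), timestamp])
    return hmac.new(secret_key.encode(), message.encode(),
                    hashlib.sha256).hexdigest()


def make_token(secret_key, remote_addr, identifier, now=None):
    """Build "timestamp:signature"; the timestamp stays readable."""
    timestamp = str(int(time.time() if now is None else now))
    return timestamp + ":" + _sign(secret_key, remote_addr, identifier, timestamp)


def read_token(secret_key, remote_addr, identifier, token):
    """Return the timestamp as an int if the token is authentic, else None."""
    timestamp, sep, digest = token.partition(":")
    if not sep or not timestamp.isdigit():
        return None
    expected = _sign(secret_key, remote_addr, identifier, timestamp)
    # Constant-time comparison of the two signatures.
    if not hmac.compare_digest(expected.encode(), digest.encode()):
        return None
    return int(timestamp)
```

    The timestamp returned by read_token can then go through the same 5 seconds / one hour check as before, and the PyCrypto dependency is no longer needed.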

  • #8 Comment by Antti Kaihola Sept. 23, 2008 at 13:53:33 CEST

    Here's a version using salted SHA1 instead of ARC4:

    http://github.com/akaihola/django-magicforms/commits/ditch-pycrypto

  • #9 Comment by Antti Kaihola Sept. 25, 2008 at 09:04:33 CEST

    While hacking on magicforms I noticed that using cPickle is unreliable. See for instance http://dpaste.com/hold/80421/

  • #10 Comment by Antti Kaihola Sept. 26, 2008 at 10:00:42 CEST

    To clarify my point on cPickle: the same data isn't guaranteed to always pickle to the same string. However, this isn't a concern in fíam's original magicforms. I ran into it when I changed from ARC4 to salted SHA1 and started to compare pickled strings.

  • #11 Comment by John Moylan Sept. 18, 2009 at 15:15:43 CEST

    You probably need to track user agent as well as requester IP - to get around NAT'd users with the same IP breaking the 5min rule.

Comments for this entry are currently disabled