Cleanup Time — Spam Filter Free Day
1111 spam comments were published on Justaddwater.dk in the last 24 hours. Just for the record, all of them are now deleted. The many spam comments have had no effect whatsoever with respect to Google PageRank or similar, since we kept rel="nofollow" on all links. Besides, we quickly removed all spam comments at the end of Spam Filter Free Day.
I found 137 new conversations in my mailbox, telling me that 137 articles I wrote were hit by spam comments in the last 24 hours.
Important lesson: Almost none of the recent posts have spam comments
The screenshot below shows that only 2 of the latest 20 posts have received spam comments. So it seems the logic behind it is that spammers need time to harvest links for potential targets.
(click to view entire count on first 40 articles)
No posts newer than 11 days were hit by spam comments. The most recent post hit was “Spam Comments Dropped 95% Overnight” from December 4th, with 2 spam comments.
So on this particular day, spam comments only hit posts that were at least 11 days old, and that should be used to our advantage. There are currently these strategies to consider:
1. Close old posts for comments
2. Close old posts for comments but keep active posts open
3. Hold back comments on old posts for moderation
4. Hold back comments on old posts for moderation unless posted by a user that has a previously approved comment
Thomas and I talked this over yesterday. We would really prefer number 4, and we can live with number 3. Numbers 1 and 2 are implemented by a WordPress plugin (I forget the name).
The real difference is in the user experience. Numbers 1 and 2 are unacceptable for us, since they would block our users’ valid comments if, say, somebody has an update to an old article or wants to send a trackback from a related post. This really works against the wisdom of crowds principle, which works best if everybody is allowed to comment right away.
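To make the difference between the strategies concrete, here is a minimal sketch of the decision each one implies for an incoming comment. It is not WordPress code: the `Comment` class, the `decide` function and the `OLD_POST_AGE_DAYS` constant are hypothetical names, and the 11-day threshold is only the figure observed on this particular day. Number 2 is left out because it would need a definition of an “active” post.

```python
from dataclasses import dataclass
from datetime import date

# Assumption: 11 days is the age observed on Spam Filter Free Day,
# not a WordPress setting.
OLD_POST_AGE_DAYS = 11


@dataclass
class Comment:
    author_email: str
    post_published: date  # when the post being commented on went live
    submitted: date       # when the comment was submitted


def decide(comment, approved_authors, strategy):
    """Return 'publish', 'hold' or 'reject' for strategies 1, 3 and 4."""
    age_days = (comment.submitted - comment.post_published).days
    if age_days < OLD_POST_AGE_DAYS:
        return "publish"  # recent posts stay open in every strategy
    if strategy == 1:
        return "reject"   # old posts are closed for comments
    if strategy == 3:
        return "hold"     # everything on old posts waits for moderation
    if strategy == 4:
        # Known commenters pass, everybody else waits for moderation.
        known = comment.author_email.lower() in approved_authors
        return "publish" if known else "hold"
    raise ValueError("only strategies 1, 3 and 4 are sketched here")


approved = {"regular@example.com"}
on_old_post = Comment("new@example.com", date(2007, 12, 4), date(2007, 12, 17))
for strategy in (1, 3, 4):
    print(strategy, decide(on_old_post, approved, strategy))
```

Only number 1 throws away the valid comment on the old post. Numbers 3 and 4 delay it, and number 4 delays it only for first-time commenters.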
Does anybody know a plugin that can do number 3 or 4?
WordPress flaws and bugs:
- Three comments were incorrectly held for moderation by WordPress even though the link limit was set to 99 and the comments clearly did not contain 99 links.
- The interface for marking comments as spam is unproductive. It took me two hours to mark 1111 comments as spam (and de-spam 6 valid comments). I used the “mass edit” screen, checked all, pressed “mark checked comments as spam”, and then pressed OK to the JavaScript confirmation.
I don’t know the internal procedures here, but it seems as if WordPress sends the 20 comments marked as spam directly to Akismet and waits for the response before showing me the next page. My workaround was to first open pages 1-20 in 20 different tabs, then for each of them check all and press “mark checked comments as spam”.
- “Recheck Queue for spam” is a brilliant feature, but it works on the wrong data. The feature should be copied to the “comments” tab. This way, you could force a rerun if your spam filter has been out of order for a period of time. That would also be the use case for people trying out a spam filter: “Wow, I got more spam comments than I can handle. Let me try and install a spam filter and see what it can do for me.” In this case, the spam filter would work not only forward in time after activation, but also backwards. (Obviously there must be a feature to review the past comments marked as spam.)
- Blank email posts slip through even when the setting does not allow it. I have suspected this to happen in previous versions of WordPress as well. Several of the 1111 comments we received during Spam Filter Free Day had a blank email. That should not be possible because of the WordPress setting “Comment author must fill out name and e-mail” (found at Admin>Options>Discussion).
I would expect these comments to be rejected and not even show up in the administrators’ panels. But for some reason they still appear. Perhaps a tiny bug in WordPress? Or maybe spammers found a hole? Or are these trackbacks/pingbacks that just look like comments? Or maybe the updated Akismet version 2.1.2 actually deals with this because they added separate tabs for trackbacks and pingbacks?
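One of the guesses above is that the blank-email entries are trackbacks or pingbacks. The sketch below shows how a required-fields check that exempts those types would let such entries through while still rejecting ordinary comments with a blank email. Whether WordPress actually behaves this way is an assumption, not something confirmed here, and `passes_required_fields` is a hypothetical function, not a WordPress one.

```python
def passes_required_fields(name, email, comment_type="comment"):
    """Hypothetical check for 'author must fill out name and e-mail'."""
    # Assumption: trackbacks and pingbacks carry no author email,
    # so this sketch exempts them from the check.
    if comment_type in ("trackback", "pingback"):
        return True
    # Ordinary comments need a non-blank name and a non-blank email.
    return bool(name.strip()) and bool(email.strip())


print(passes_required_fields("Bob", ""))                    # ordinary comment, blank email
print(passes_required_fields("Some blog", "", "pingback"))  # pingback, blank email
```

If the blank-email entries in the admin panel all turn out to be trackbacks or pingbacks, the setting itself may be working as intended.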
All in all, we learned a few lessons and found some areas where WordPress could improve in order to deal with comments that slip through. The total time spent removing these comments was 2 hours, which surprised us by being so low. But note also that this was not “just” a day without Akismet. On an ordinary day without it, we would probably have had to keep comments under surveillance and remove them during the day. In that case, the time spent would perhaps have been something like 6 hours during a 24-hour period.
As to whether we would do it again as a recurring event? Most likely not. We don’t want our regular readers to suffer, so we prefer to keep our blog safe and sound and in good shape by keeping the guards up. On the other hand, doing the Spam Filter Free Day has been a learning experience. Our daily spam count has decreased (perhaps because spammers have understood that comments that momentarily slip through have no effect on PageRank or similar). We also wanted to raise awareness of the bandwidth and processing power (machine and human) that spam comments waste every day. And last, we did this day as a thank-you to spam filters such as Akismet. (Heck, comments on other blogs even allege that the Spam Filter Free Day is a publicity trick from Akismet. I can strongly deny that: we, Thomas and Jesper, take full responsibility for inventing this event, and we are in no way affiliated with Akismet.)
I would really appreciate comments about plugins that do not close articles for comments, but instead hold comments back for moderation. I will subsequently update this article with links.
Technorati Tags: wordpress, akismet, spam, comment spam, spam filter free day.
December 18th, 2007 at 20:12 (GMT-1)
Jesper,
In the WordPress dashboard (version 2.3.1 and earlier I presume), under Options, Discussion, there is the ability to hold comments in moderation unless the comment author has a previously approved comment.
I believe this addresses your desire to “hold back comments on old posts for moderation unless posted by a user that has a previously approved comment.” Unless of course, you desire to not include the “new posts” and only have “old posts” affected.
john
December 18th, 2007 at 20:53 (GMT-1)
@ John: I think you have to emphasize the word ‘old’, which would exclude ‘new posts’.
From the stats mentioned, it would be convenient not to hold back comments for moderation from anyone, unless a posting reaches say 10 days of age. Only comment authors with no previously approved comments would subsequently be moderated.
An additional setting to adjust this time-span in combination with the setting you refer to would come in handy, hence the request for such a plug-in.
December 18th, 2007 at 21:01 (GMT-1)
@s1000: Agreed, regarding the “old” v. “new” posts. Nevertheless, I would venture that an implementation of all comments held for moderation from the “beginning of a site” would be as convenient.
Of course, the issue begins when you have an established site with regular commenters. The implementation of the moderation would take some time to get through initially. Perhaps, only a few weeks? I’m not sure as it would depend on your commenter base.
A plugin, as you suggest, would be handy but I question how many sites would implement such a strategy even with the stats mentioned.
December 18th, 2007 at 21:40 (GMT-1)
Thanks John and s1000 (Søren) for your comments and speculations. Here is the reason that we don’t moderate all comments before one successful post. Thomas and I have actually discussed this on several occasions.
The biggest issue is actually that we want an instant discussion to take place even if none of us is able to moderate. It would be unfair to new users if their comments were held back while known hangarounds could just post their comments.
Our concern and our wish to hold back on moderating first comments lead to occasional spam comments that slip through. This is the negative side of the tradeoff, but we really need the discussion to take place instantly.
Consider your own example above. Let’s say you are both new and “unknown” on this blog, just for the example. That way John’s first comment would be held back for moderation. As John’s comment is held back, Søren would not have left a reply to John’s comment.
So in the best case, our moderation of first comments would lead to a delayed response. In the worst case it would lead to a poorer discussion (considering that people often don’t come back).
Blogs that hold first comments for moderation often make mistakes that degrade the user experience:
* they hold back comments, which delays the discussion or makes it poorer
* they neglect to tell users about it upfront
December 18th, 2007 at 21:46 (GMT-1)
Jesper,
Thank you for the excellent explanation. I was in the moderation camp. However, with your excellent example, I’ve turned off moderation on my site. Again, thank you for the insight!
December 18th, 2007 at 21:57 (GMT-1)
Excellent lessons learned on this unscientific approach. It would be interesting to cull such information across numerous blogs all using similar methods.
And yes, thank you for pointing out my favorite rant, that comments and the handling of comment spam within WordPress and Akismet need drastic improvement.
Your PageRank would not change as it doesn’t vary from day to day, nor week to week. One day would have little or no impact on it. A month, maybe. So this isn’t a good comparison or assumption to make.
I do hope that the lessons learned will help benefit others as WordPress and Akismet improve. Thanks for taking this on.
December 18th, 2007 at 22:08 (GMT-1)
I have also found on my blog that it takes a while for spammers to discover a post (about 3-4 weeks in my case), so I could see others wanting a plug-in that turns on moderation for older posts. However, currently I’m doing Strategy 1 and turning off comments after a month to control spam; even sifting through moderated spam is a hassle. In my case, the greatest number of page downloads for a post is on the day it is posted, presumably through feeds, with a rapidly diminishing number of downloads afterward. It seems that just about anyone who wants to read (and therefore comment on) a post does so well before the beginning of the spamming, so I don’t think I lose much by closing comments for old posts. What do the page downloads for justaddwater look like for posts across time?
December 19th, 2007 at 10:17 (GMT-1)
@Lorelle: For this discussion, I don’t care about the PageRank of this blog. What I want to emphasize is that comment spammers won’t get any PageRank boost out of spamming us.
There is an important distinction there.
December 20th, 2007 at 05:15 (GMT-1)
[…] he was turning off Akismet for the day, many thought it was an outrageous thing to do, but now the results are in and Jepser shares the lessons learned about being Akismet free and then cleaning up comment spam manually and with Akismet. I agree with […]
December 22nd, 2007 at 19:22 (GMT-1)
[…] spurred on by a conversation at justaddwater.dk in the comment section, I turned off moderation of my comments. It was Jesper that convinced […]
December 29th, 2007 at 13:15 (GMT-1)
[…] Lessons learnt from Spam Filter Free Day. Like this post?: Subscribe our latest articles via RSS feed or by […]
May 5th, 2008 at 13:39 (GMT-1)
Closing comments on old posts should not be a big disadvantage to the blog, compared to the insecurity and nuisance of spam comments.
Thanks for the great post, pal…!!
May 6th, 2008 at 01:44 (GMT-1)
I would venture to say that not much at all is lost by closing comments on old posts. Ironically, old posts are where I tend to get most of my spam. I tend to turn my comments off after 6 months on most posts, unless there is an ongoing discussion that has consistent contributions. This does not happen often :>( A sophisticated plugin that would moderate them all and learn while doing it would be wonderful... guess that makes us the plugin!
May 6th, 2008 at 22:11 (GMT-1)
I would suggest using Akismet to stop comment spam.
It does a great job,
because besides you, a huge WP community is using Akismet to fight spam too.
May 9th, 2008 at 14:32 (GMT-1)
I use Akismet to filter spam
works fine
May 14th, 2008 at 18:02 (GMT-1)
90% of emails are spam now. As much as I want to help that Nigerian prince, a part of me says no. You can use basic anti-spam software, but people are now going for something more extreme with anti-spam hardware. It’s getting nuts.
May 28th, 2008 at 20:04 (GMT-1)
Akismet is great, but can sometimes be wrong in its results. For example, a competitor of your website could post spam comments and put your website in the “URL” box. If they spam a lot, the URL gets blocked automatically, so you can no longer post comments yourself. They make your URL get associated with spam.
Good thing I contacted akismet and had them put my URL on no-spam, because this is what people have been doing to me!
June 10th, 2008 at 05:58 (GMT-1)
Akismet saves me a lot of time, and I usually only check and approve comments once a week because the spam filter has filtered out most of it already.
As long as the commenter posts useful comments, I leave the comment on and send them an email to let them know that they are welcome to post more on my blog, thus providing more content for viewers.
June 10th, 2008 at 21:52 (GMT-1)
Yup, Akismet still works. Anyhow, it’s still easy to see which comment is spam, for they usually just post a nonsense web link. They can’t even make a real post. So either you install something to protect your site or just visit the comment section once a week to delete them.
July 18th, 2008 at 19:50 (GMT-1)
hey man!
i use this plugin http://sw-guide.de/wordpress/plugins/math-comment-spam-protection/
It works very well; it keeps all the spammers out.
Try it ;)
September 8th, 2008 at 01:36 (GMT-1)
I notice that WP gets a lot of false positives. It totally destroys interaction as far as I’m concerned. From my experience I would rather use Drupal.
September 9th, 2008 at 06:40 (GMT-1)
Using Captcha would be a good option since Akismet sometimes catches real commenters as spam.
September 27th, 2008 at 18:54 (GMT-1)
The problem with Akismet is that it is not that smart about non-English spam. For example, I am using it for Japanese blogs, but it will sometimes catch legitimate comments and ignore spam.
Maybe Captcha is still better for me.
October 10th, 2008 at 18:44 (GMT-1)
Wow, that is a lot of spam in one day. I get about 100 a day on my flagship blog... and that is enough to drive me crazy. But I also find that Akismet catches a lot of real commenters, so I go through them all just in case. I think it mostly affects people who use something other than their actual name with their link but have still left a meaningful comment. It is silly for the blogger to lose those, because they are adding free content to the blog!
December 26th, 2008 at 00:34 (GMT-1)
Hey, I am just fighting the same problem right now. I have 9000 spam comments from last year, when I didn’t touch the blog. You can change the 20 to 200, for example. I tried 500 and 2000 but failed with those numbers.
In wp-admin/edit-comments.php, find this line and change 20 to whatever you’d like:
$comments_per_page = apply_filters('comments_per_page', 20, $comment_status);
Please note that it’s very memory and CPU intensive, and extreme values will probably fail. Firefox almost hangs, but that’s not a problem. The problem will probably be this server error:
Request-URI Too Large: The requested URL’s length exceeds the capacity limit for this server.
this isn’t the best solution but helps me now a bit =)
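As a rough illustration of what the larger page size buys, this sketch counts how many “check all, mark as spam” rounds are needed for a given backlog. The function name `pages_needed` is made up for the example; the numbers are the ones mentioned in this thread.

```python
def pages_needed(total_comments, per_page):
    """Number of admin pages needed to go through the whole backlog."""
    # Integer ceiling division: a partly filled last page still counts.
    return (total_comments + per_page - 1) // per_page


for per_page in (20, 200):
    print(per_page, "per page:", pages_needed(9000, per_page), "pages for 9000 comments")

print("1111 comments at 20 per page:", pages_needed(1111, 20), "pages")
```

Each page is one round trip to the server, so a tenfold page size means roughly a tenth of the clicks, at the cost of the heavier requests described above.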
December 26th, 2008 at 00:38 (GMT-1)
Well, I ended up running this SQL command directly against the database:
DELETE FROM wp_comments WHERE comment_approved = 0
This took like 0.5s =)))
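A word of caution on that command: in WordPress, comment_approved holds '0' for comments awaiting moderation, '1' for approved comments and 'spam' for comments marked as spam. The statement therefore removes everything that is pending, including valid comments still waiting for approval, and leaves comments already marked as spam in place. The sketch below shows this on a toy in-memory SQLite table. The table is a heavily simplified stand-in for the real wp_comments table, and `delete_by_status` is a hypothetical helper; back up the real database before trying anything like it.

```python
import sqlite3

# Toy stand-in for wp_comments; the real table has many more columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE wp_comments ("
    "comment_ID INTEGER PRIMARY KEY, "
    "comment_content TEXT, "
    "comment_approved TEXT)"
)
conn.executemany(
    "INSERT INTO wp_comments (comment_content, comment_approved) VALUES (?, ?)",
    [
        ("Nice post", "1"),                 # approved
        ("cheap pills", "0"),               # pending, actually spam
        ("valid, but still pending", "0"),  # pending, legitimate
        ("casino links", "spam"),           # already marked as spam
    ],
)


def delete_by_status(connection, status):
    """Delete all comments with the given status; return how many were removed."""
    cursor = connection.execute(
        "DELETE FROM wp_comments WHERE comment_approved = ?", (status,)
    )
    connection.commit()
    return cursor.rowcount


deleted = delete_by_status(conn, "0")
remaining = [
    row[0]
    for row in conn.execute(
        "SELECT comment_approved FROM wp_comments ORDER BY comment_ID"
    )
]
print("deleted:", deleted)
print("statuses left:", remaining)
```

To clear out comments that were already marked as spam, the same helper could be called with the status 'spam' instead.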