Now that I have a couple of minutes, let me tell you exactly what I changed in Mephisto to support multiple spam detection engines.

Initially, app/models/comment.rb looked like this:


class Comment < Content
  def check_approval(site, request)
    self.approved = site.approve_comments?
    if valid_comment_system?(site)
      akismet = Akismet.new(site.akismet_key, site.akismet_url)
      # comment_check is negated here: a truthy result means the comment is blocked
      self.approved = !akismet.comment_check(comment_spam_options(site, request))
      logger.info "Checking Akismet (#{site.akismet_key}) for new comment on Article #{article_id}. #{approved? ? 'Approved' : 'Blocked'}"
      logger.warn "Odd Akismet Response: #{akismet.last_response.inspect}" unless Akismet.normal_responses.include?(akismet.last_response)
    end
  end
end

app/models/comment.rb@bugfixing

The original method was intimately tied to Akismet: it instantiated an Akismet client and used it directly. The first thing I needed was the ability to use any kind of spam detection engine without knowing its details. In fact, what I needed was an instance of the Adapter pattern.

The code changed from the above to this:


class Comment < Content
  def check_approval(site, request)
    self.approved = site.approve_comments? || spam_engine(site).ham?(article.permalink_url(site, request), self)
  end

  def spam_engine(site)
    site.spam_engine
  end
end

app/models/comment.rb@multiengine
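To make the Adapter idea concrete, here is a minimal sketch in plain Ruby. Everything in it is hypothetical: the two fake clients, their method names, the 0.5 threshold and the SketchComment struct are stand-ins invented for illustration, not the real Akismet or Defensio APIs. The only thing taken from the code above is the shared #ham?(permalink_url, comment) interface.

```ruby
# Hypothetical stand-ins for two services with different call shapes.
class FakeAkismetClient
  # Returns true when the comment looks like spam.
  def comment_check(options)
    options[:comment_content].include?("viagra")
  end
end

class FakeDefensioClient
  # Returns a "spaminess" score between 0.0 and 1.0.
  def audit(body)
    body.include?("viagra") ? 0.9 : 0.1
  end
end

# Each adapter translates the common #ham? call into its client's own API.
class AkismetAdapter
  def initialize(client)
    @client = client
  end

  def ham?(permalink_url, comment)
    !@client.comment_check({ permalink: permalink_url, comment_content: comment.body })
  end
end

class DefensioAdapter
  def initialize(client)
    @client = client
  end

  def ham?(_permalink_url, comment)
    @client.audit(comment.body) < 0.5 # threshold is an assumption
  end
end

# Minimal stand-in for a comment record.
SketchComment = Struct.new(:body)
```

With this shape, the caller only ever asks an engine whether a comment is ham and never needs to know which service sits behind it.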

So now, I had to get hold of an instance of SpamDetectionEngine. I again turned to the Gang of Four patterns, and this is when I realized I needed a Strategy Pattern. The Site was the most obvious place to put this, as the configuration for storing Akismet details was already there. I simply extended the concept further to allow any engine to store any configuration settings in Site, and made Site return an instance of my spam engine:


class Site < ActiveRecord::Base
  def spam_engine
    klass_name = read_attribute(:spam_detection_engine)
    return Mephisto::SpamDetectionEngines::NullEngine.new(self) if klass_name.blank?
    klass_name.constantize.new(self)
  end
end

app/models/site.rb@multiengine

What’s a NullEngine, you ask? It’s the default engine, and it does nothing. It’s an instance of the Null Object pattern. What does this engine really do? It returns canned data for all requests: it’s always #valid?, and it always accepts comments.
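Here is a rough sketch, outside of Rails, of how the class-name lookup and the Null Object fallback could fit together. The SketchEngines module, RejectAllEngine and SketchSite are hypothetical names made up for this example, and Object.const_get stands in for ActiveSupport's constantize so the snippet runs with plain Ruby. The actual NullEngine in the branch may look different.

```ruby
module SketchEngines
  # Null object: answers every question with a harmless canned value.
  class NullEngine
    def initialize(site)
      @site = site
    end

    def valid?
      true
    end

    def ham?(_permalink_url, _comment)
      true # always accepts comments
    end
  end

  # A second, deliberately silly strategy, only here to show the switch.
  class RejectAllEngine
    def initialize(site)
      @site = site
    end

    def valid?
      true
    end

    def ham?(_permalink_url, _comment)
      false
    end
  end
end

# Stand-in for the Site model: it stores the engine's class name as a string.
SketchSite = Struct.new(:spam_detection_engine) do
  def spam_engine
    name = spam_detection_engine
    # Fall back to the null engine when nothing is configured.
    return SketchEngines::NullEngine.new(self) if name.nil? || name.strip.empty?
    Object.const_get(name).new(self)
  end
end
```

Because the null engine answers the same messages as a real one, the calling code never has to check whether an engine is configured.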

So, I moved the old Akismet code to the AkismetEngine, and created a new DefensioEngine. Both of these engines have real code in place to do the validation. Although I haven’t verified that the AkismetEngine still works, I believe it should. I used Marc-André Cournoyer’s Defensio Rails plugin to actually talk to Defensio.

Next, I needed a way to configure those different engines. What I needed was a way for the engines to provide their own settings templates to the regular Rails views. I made myself a custom Template class, which renders a simple ERB template, and put each template right next to the engine it belongs to. Then it was just a matter of rendering the template. Looking at the AkismetEngine, the #settings_template code looks like this:


module Mephisto
  module SpamDetectionEngines
    class AkismetEngine < Mephisto::SpamDetectionEngine::Base
      class << self
        # Renders the settings form stored next to this file.
        def settings_template(site)
          load_template(File.join(File.dirname(__FILE__), "akismet_settings.html.erb")).render(:site => site, :options => site.spam_engine_options)
        end
      end
    end
  end
end

lib/mephisto/spam_detection_engines/akismet_engine.rb@multiengine
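The Template class itself is not shown here, so the following is only a guess at its general shape: a small wrapper around Ruby's standard ERB that exposes the values passed to #render as instance variables inside the template. SketchTemplate and its behaviour are assumptions for illustration, not the class from the branch.

```ruby
require "erb"

# Hypothetical minimal template wrapper built on the standard library's ERB.
class SketchTemplate
  def initialize(source)
    @erb = ERB.new(source)
  end

  # Renders the template; each key in +locals+ becomes an @instance_variable.
  def render(locals = {})
    context = Object.new
    locals.each do |name, value|
      context.instance_variable_set("@#{name}", value)
    end
    # Evaluate the template with +context+ as self.
    @erb.result(context.instance_eval { binding })
  end
end

template = SketchTemplate.new("Akismet key: <%= @options[:key] %>")
puts template.render(:options => { :key => "abc123" })
```

Loading the source from a file next to the engine would then be a matter of passing File.read(path) to the constructor.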

Finally, Defensio provides statistics through its API. I wanted to show nice graphics, and turned to the Google Chart API to generate them. In the end, what does all of this look like? Well, this:

The Defensio engine's configuration and statistics

I had a bit of discussion with Carl Mercier about the accuracy graph. He said Defensio looked poor because the green bar is very low. I countered that the difference between 95 and 96 percent, if the graph were 100 pixels high, would only be 1 pixel. The graph’s scale is actually 95 to 100%, so now the difference between 95 and 96 percent will be 20 pixels, if the bar is 100 pixels high. Anyway, this is an unresolved issue, and I’d like to know what people feel. You can change this yourself by cloning the repository and editing the Defensio statistics template:


<%
  spam = @statistics.spam.to_i
  ham = @statistics.ham.to_i
  accuracy = @statistics.accuracy.to_f * 100.0
  mapped_accuracy = ((accuracy - 95) * 5.0)
  # to_f avoids integer division, which would truncate the ratio to 0
  spam_pct = spam.to_f / (spam + ham) * 100.0
  ham_pct = 100.0 - spam_pct
  false_positives, false_negatives = @statistics.false_positives.to_i, @statistics.false_negatives.to_i
  false_positives_pct = false_positives.to_f / (false_positives + false_negatives) * 100.0
  false_negatives_pct = 100.0 - false_positives_pct
  stats_chd = sprintf("t:%.1f,%.1f", spam_pct, ham_pct)
  accuracy_chd = sprintf("t:%.1f|%.1f", mapped_accuracy, 100.0)
  retraining_chd = sprintf("t:%.1f,%.1f", false_positives_pct, false_negatives_pct)
%>

lib/mephisto/spam_detection_engines/defensio_statistics.html.erb@multiengine
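If you want to play with the scale outside of the template, here is a small standalone sketch of the arithmetic. SketchStats, rescale and chart_data are hypothetical helpers, and the sample numbers are made up. The sketch maps the 95 to 100% range onto 0 to 100, which gives the 20 units per percentage point mentioned in the discussion; the template above multiplies by 5.0 instead, and which factor is right depends on how the chart itself is scaled, which is not shown here.

```ruby
# Hypothetical stand-in for the statistics object returned by the engine.
SketchStats = Struct.new(:spam, :ham, :accuracy)

# Linearly maps value from [low, high] onto [0, 100], clamping out-of-range input.
def rescale(value, low, high)
  clamped = [[value, low].max, high].min
  (clamped - low) * 100.0 / (high - low)
end

# Builds data strings in the same "t:..." format the template uses.
def chart_data(stats)
  total = stats.spam + stats.ham
  # Guard against dividing by zero when no comments were processed yet.
  spam_pct = total.zero? ? 0.0 : stats.spam.to_f / total * 100.0
  ham_pct = 100.0 - spam_pct
  mapped_accuracy = rescale(stats.accuracy * 100.0, 95.0, 100.0)
  {
    :stats => format("t:%.1f,%.1f", spam_pct, ham_pct),
    :accuracy => format("t:%.1f|%.1f", mapped_accuracy, 100.0)
  }
end

sample = SketchStats.new(75, 25, 0.96) # made-up numbers
puts chart_data(sample).inspect
```

Changing the low argument of rescale from 95.0 to 0.0 gives the unscaled graph, which makes it easy to compare both versions side by side.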

All in all, I really enjoyed refactoring a tool that I use regularly. I didn’t find or fix any major problems along the way, but if I do find some, I’ll be sure to fix them and make the fixes available.

So, what are you waiting for?


git clone git://github.com/francois/mephisto.git mephisto_defensio
cd mephisto_defensio
git checkout multiengine
rake db:bootstrap db:migrate
thin start

NOTE: There is a book called Refactoring to Patterns. I haven’t read it yet, but it seems like a good read.

Your Host

I am François Beausoleil, a Ruby on Rails and Scala developer. During the day, I work on Seevibes, a platform to measure social interactions related to TV shows. At night, I am interested in many things.