Yesterday (Tuesday May 20th, 2008), I presented at Montreal on Rails. I made a short and sweet presentation on Mephisto, and how I refactored it to support both Akismet and Defensio.

You can grab the slides for “Refactoring to Patterns: How Mephisto went from a single engine Lada to a multi-engine jet fighter” (PDF): /2008/05/21/refactoring-to-patterns.pdf

References

Design Patterns

Other interesting patterns that I used in Mephisto, which I briefly talked about, but haven’t mentioned in the slides at all:

Refactoring

  • Refactoring to Patterns

Other things I talked about

Why should you switch from Akismet to Defensio? Because of this:

[Screenshot: the Mephisto comments interface, when connected to Defensio, shows the spaminess of each comment and uses different shades of orange to highlight spammy comments.]

Defensio provides useful statistics to the blogger. A typical response would look like:


    ---
    defensio-result:
      message: ""
      status: success
      signature: awnc057e1a132p1jj3t4x
      spaminess: 0.7
      api-version: "1.1"
      spam: true

Notice how Defensio returned the spaminess and a signature? The signature can be used to retrain Defensio. When a false positive or a false negative comes through, the Defensio API simply accepts a series of signatures and retrains itself. Since the signature is shorthand for the whole comment, all of the data is available for retraining: IP, author email, author name, the comment’s body, etc.
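To make the response format concrete, here is a minimal sketch that parses the sample response with Ruby's standard YAML library and pulls out the three fields that matter for moderation and retraining. The helper name `parse_defensio_response` is an assumption made for illustration; it is not part of Mephisto or of the Defensio plugin.

```ruby
require "yaml"

# Sample response body, copied from the example response shown earlier.
RESPONSE_BODY = <<~YAML
  ---
  defensio-result:
    message: ""
    status: success
    signature: awnc057e1a132p1jj3t4x
    spaminess: 0.7
    api-version: "1.1"
    spam: true
YAML

# Hypothetical helper (not Mephisto code): keeps only the fields needed
# to moderate a comment now and to retrain Defensio later.
def parse_defensio_response(body)
  result = YAML.safe_load(body).fetch("defensio-result")
  {
    :spam      => result["spam"],
    :spaminess => result["spaminess"].to_f,
    :signature => result["signature"]
  }
end

parsed = parse_defensio_response(RESPONSE_BODY)
puts parsed.inspect
```

Storing the signature next to the comment is what would make later retraining possible without resending the whole comment.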

It is not obvious from the screenshot above (as I mostly get spammy comments), but there are actually many shades of orange to highlight the spaminess of comments. Lighter shades are non-spammy, and darker ones, spammier.
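One way to get such shades is to interpolate between a light and a dark orange according to the spaminess. This is only a sketch of the idea: the two colours and the helper name `spaminess_to_color` are assumptions, and the real Mephisto views may well do it differently.

```ruby
# Assumed end points of the gradient (RGB); not taken from Mephisto.
LIGHT_ORANGE = [255, 229, 204]
DARK_ORANGE  = [255, 128, 0]

# Hypothetical helper: maps a spaminess in 0.0..1.0 to a CSS hex colour.
# Values outside the range are clamped.
def spaminess_to_color(spaminess)
  s = [[spaminess.to_f, 0.0].max, 1.0].min
  rgb = LIGHT_ORANGE.zip(DARK_ORANGE).map do |light, dark|
    (light + (dark - light) * s).round
  end
  format("#%02x%02x%02x", *rgb)
end

puts spaminess_to_color(0.0)
puts spaminess_to_color(0.7)
puts spaminess_to_color(1.0)
```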

Now that I have a couple of minutes, let me tell you what I changed exactly in Mephisto to support multiple spam detection engines.

Initially, app/models/comment.rb looked like this:


    class Comment < Content
      def check_approval(site, request)
        self.approved = site.approve_comments?
        if valid_comment_system?(site)
          akismet = Akismet.new(site.akismet_key, site.akismet_url)
          # comment_check answers "is this spam?", so approval is its negation
          self.approved = !akismet.comment_check(comment_spam_options(site, request))
          logger.info "Checking Akismet (#{site.akismet_key}) for new comment on Article #{article_id}. #{approved? ? 'Approved' : 'Blocked'}"
          logger.warn "Odd Akismet Response: #{akismet.last_response.inspect}" unless Akismet.normal_responses.include?(akismet.last_response)
        end
      end
    end

app/models/comment.rb@bugfixing

The original method was intimately tied to Akismet: it would instantiate one, and use it directly. The first thing I needed to do was to be able to use any kind of spam detection engine, without knowing the details of it. In fact, what I needed was an instance of the Adapter pattern.

The code changed from the above to this:


    class Comment < Content
      def check_approval(site, request)
        self.approved = site.approve_comments? || spam_engine(site).ham?(article.permalink_url(site, request), self)
      end

      def spam_engine(site)
        site.spam_engine
      end
    end

app/models/comment.rb@multiengine
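To show what the Adapter pattern buys here, the sketch below wraps two services with different vocabularies behind the same `#ham?(permalink_url, comment)` call that `Comment` uses. Everything in it is hypothetical: the fake clients, the adapter class names, the keyword check and the 0.5 threshold are stand-ins for illustration, not the real Akismet or Defensio libraries and not the actual Mephisto engines.

```ruby
# Hypothetical stand-in for a client that answers "is this spam?" with a boolean.
class FakeAkismetClient
  def comment_check(options)
    options[:comment_content].to_s.include?("viagra")
  end
end

# Hypothetical stand-in for a client that answers with a spaminess score.
class FakeDefensioClient
  def audit(options)
    { "spaminess" => options[:body].to_s.include?("viagra") ? 0.9 : 0.1 }
  end
end

# Each adapter translates the common #ham? call into its client's own API.
class AkismetAdapter
  def initialize(client = FakeAkismetClient.new)
    @client = client
  end

  def ham?(permalink_url, comment)
    !@client.comment_check(:permalink => permalink_url, :comment_content => comment.body)
  end
end

class DefensioAdapter
  THRESHOLD = 0.5 # assumed cut-off, chosen for this sketch only

  def initialize(client = FakeDefensioClient.new)
    @client = client
  end

  def ham?(permalink_url, comment)
    @client.audit(:url => permalink_url, :body => comment.body)["spaminess"] < THRESHOLD
  end
end

SketchComment = Struct.new(:body)

[AkismetAdapter.new, DefensioAdapter.new].each do |engine|
  puts engine.ham?("http://example.com/post", SketchComment.new("Nice article!"))
end
```

The calling code never learns which service sits behind the adapter, which is the property the refactored `check_approval` relies on.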

So now, I had to get hold of an instance of SpamDetectionEngine. I again turned to the Gang of Four patterns, and this is when I realized I needed a Strategy Pattern. The Site was the most obvious place to put this, as the configuration for storing Akismet details was already there. I simply extended the concept further to allow any engine to store any configuration settings in Site, and made Site return an instance of my spam engine:


    class Site < ActiveRecord::Base
      def spam_engine
        klass_name = read_attribute(:spam_detection_engine)
        return Mephisto::SpamDetectionEngines::NullEngine.new(self) if klass_name.blank?
        klass_name.constantize.new(self)
      end
    end

app/models/site.rb@multiengine

What’s a NullEngine, you ask? It’s the default engine, and it does nothing. It’s an instance of the Null Object pattern: it returns canned data for all requests. It’s always #valid?, and it always accepts comments.
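Here is a small, self-contained sketch of how the Strategy lookup and the Null Object fit together. It uses plain Ruby's `Object.const_get` where the Rails code uses ActiveSupport's `constantize`, and every name in it (`SketchEngines`, `KeywordEngine`, `SketchSite`) is made up for the example rather than taken from Mephisto.

```ruby
module SketchEngines
  # Null Object: same interface as a real engine, but it never blocks anything.
  class NullEngine
    def initialize(site)
      @site = site
    end

    def valid?
      true
    end

    def ham?(permalink_url, comment)
      true
    end
  end

  # Toy "real" engine so the lookup has something else to pick.
  class KeywordEngine
    def initialize(site)
      @site = site
    end

    def valid?
      true
    end

    def ham?(permalink_url, comment)
      !comment.to_s.downcase.include?("viagra")
    end
  end
end

# Stand-in for Site: the stored class name selects the strategy.
SketchSite = Struct.new(:spam_detection_engine) do
  def spam_engine
    name = spam_detection_engine.to_s
    return SketchEngines::NullEngine.new(self) if name.strip.empty?
    Object.const_get(name).new(self)
  end
end

unconfigured = SketchSite.new(nil)
configured   = SketchSite.new("SketchEngines::KeywordEngine")

puts unconfigured.spam_engine.class
puts configured.spam_engine.ham?("http://example.com/post", "Buy VIAGRA now")
```

Because the fallback is a real object with the same methods, callers never need a nil check before asking `#ham?`.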

So, I moved the old Akismet code to the AkismetEngine, and created a new DefensioEngine. Both of these engines have real code in place to do the validation. Although I haven’t verified that the AkismetEngine still works, I believe it should. I used Marc-André Cournoyer’s Defensio Rails plugin to actually talk to Defensio.

Next, I needed a way to configure those different engines. What I needed was a way for the engines to provide their own settings templates to the regular Rails views. I made myself a custom Template class, which renders a simple ERB template, and I put each template right next to the engine it belongs to. Then it was just a matter of rendering the template. Looking at the AkismetEngine, the #settings_template code looks like this:


    module Mephisto
      module SpamDetectionEngines
        class AkismetEngine < Mephisto::SpamDetectionEngine::Base
          class << self
            def settings_template(site)
              load_template(File.join(File.dirname(__FILE__), "akismet_settings.html.erb")).render(:site => site, :options => site.spam_engine_options)
            end
          end
        end
      end
    end

lib/mephisto/spam_detection_engines/akismet_engine.rb@multiengine
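For readers curious what such a Template class can look like, here is a minimal sketch built on Ruby's standard ERB library. It is an assumption about the general shape, not the actual Mephisto class: `SketchTemplate` and the inline template string are invented for the example, and `ERB#result_with_hash` requires Ruby 2.5 or later.

```ruby
require "erb"

# Hypothetical stand-in for the custom Template class described above.
class SketchTemplate
  def initialize(source)
    @erb = ERB.new(source)
  end

  # Renders the template; each key in locals becomes a local variable.
  def render(locals = {})
    @erb.result_with_hash(locals)
  end
end

# An engine would normally read this from a file sitting next to its own source.
source = 'API key: <input name="akismet_key" value="<%= options[:akismet_key] %>" /> for <%= site %>'

html = SketchTemplate.new(source).render(:site => "example.com", :options => { :akismet_key => "abc123" })
puts html
```

The view only has to output the returned string, so each engine stays in charge of its own settings form.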

Finally, Defensio provides statistics through its API. I wanted to show nice graphics, and turned to the Google Chart API to generate them. In the end, what does all of this look like? Well, this:

[Screenshot: the Defensio engine's configuration and statistics.]

I had a bit of discussion with Carl Mercier about the accuracy graph. He said Defensio looked poor because the green bar is very low. I countered that the difference between 95 and 96 percent, if the graph were 100 pixels high, would only be 1 pixel. The graph’s scale is actually 95 to 100%, so now the difference between 95 and 96 percent will be 20 pixels, if the bar is 100 pixels high. Anyway, this is an unresolved issue, and I’d like to know what people feel. You can change this yourself by cloning the repository and editing the Defensio statistics template:


    <%
      spam = @statistics.spam.to_i
      ham = @statistics.ham.to_i
      accuracy = @statistics.accuracy.to_f * 100.0
      mapped_accuracy = ((accuracy - 95) * 5.0)
      # to_f avoids integer division, which would always yield 0 or 1 here
      spam_pct = spam.to_f / (spam + ham) * 100.0
      ham_pct = 100.0 - spam_pct
      false_positives, false_negatives = @statistics.false_positives.to_i, @statistics.false_negatives.to_i
      false_positives_pct = false_positives.to_f / (false_positives + false_negatives) * 100.0
      false_negatives_pct = 100.0 - false_positives_pct
      stats_chd = sprintf("t:%.1f,%.1f", spam_pct, ham_pct)
      accuracy_chd = sprintf("t:%.1f|%.1f", mapped_accuracy, 100.0)
      retraining_chd = sprintf("t:%.1f,%.1f", false_positives_pct, false_negatives_pct)
    %>

lib/mephisto/spam_detection_engines/defensio_statistics.html.erb@multiengine
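The scale argument from the discussion with Carl comes down to a linear rescale, which the following sketch works through for a bar that is 100 pixels high. The helper `bar_height_for` is hypothetical and only illustrates the arithmetic quoted above; it is not code from the statistics template.

```ruby
# Hypothetical helper: maps a percentage inside [low, high] onto a bar
# that is bar_height pixels tall. Values outside the range are clamped.
def bar_height_for(pct, low, high, bar_height = 100.0)
  clamped = [[pct, low].max, high].min
  (clamped - low) / (high - low) * bar_height
end

# Pixels gained when accuracy goes from 95% to 96% on each scale.
full_scale = bar_height_for(96.0, 0.0, 100.0) - bar_height_for(95.0, 0.0, 100.0)
zoomed     = bar_height_for(96.0, 95.0, 100.0) - bar_height_for(95.0, 95.0, 100.0)

puts "0-100% scale:  #{full_scale.round(1)} px per point"
puts "95-100% scale: #{zoomed.round(1)} px per point"
```

Zooming in makes small differences visible, at the cost of a bar that looks low whenever accuracy sits near the bottom of the zoomed range.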

All in all, I really enjoyed refactoring a tool that I use regularly. I neither found nor fixed any major problems along the way, but if I do find some, I’ll be sure to fix them and make the fixes available.

So, what are you waiting for?


    git clone git://github.com/francois/mephisto.git mephisto_defensio
    cd mephisto_defensio
    git checkout multiengine
    rake db:bootstrap db:migrate
    thin start

NOTE: There is a book called Refactoring to Patterns. I haven’t read it yet, but it seems like a good read.

My friend, Carl Mercier, unveiled Defensio in November 2007. At the time, I had repeatedly told Carl I would write a plugin for Mephisto to integrate with Defensio. Then, in December 2007, TheWebFellas released a plugin that integrated Mephisto and Defensio.

I installed the plugin, but it didn’t work out for me. And then, I had other stuff to do (like nobody does, duh). And now, a couple of months later, I am just learning about Git, and how Git empowers people to make wholesale changes to an application, and still be able to exchange that code with everyone.

I was on the #github channel on Friday night, having some problems creating a new repository. Along comes halorgium, asking me “why didn’t you name your repository Mephisto?” Good question. So I asked him if Rick Olson (techno-weenie) had a public Git repository for Mephisto. Halorgium replied with Rick’s repository URL, and told me that it was down at the moment. But he had a recent clone, which he pushed to GitHub. I then simply forked his repository, and started coding like mad.

It’s now 4 days later, and I am releasing a refactored Mephisto. Instead of being intimately tied to Akismet, or Defensio for that matter, this version of Mephisto uses the Strategy and Adapter design patterns to enable any spam detection engine to connect to Mephisto.

I don’t know Rick personally, nor do I know Mark Dagget, but if they wish, they can now pull from my repository, and the whole Mephisto community will have a much better Mephisto available to them all.

So:

  • If halorgium hadn’t been on #github on Friday, I wouldn’t have known about Rick’s Mephisto repository;
  • If halorgium hadn’t had a clone, I wouldn’t have started then;
  • If I hadn’t been using Git, I wouldn’t have attempted this (too many changes in too many places for Subversion).

Git empowered me to make big changes to a foreign code base. I’m not afraid of losing any of my changes, and anyone can pull from my repository. This is a completely different working model than Subversion.

If you wish to play with this Mephisto version, you can pull from my public clone URL: git://github.com/francois/mephisto.git

Update 2008-03-03: After discussion with Halorgium on #github, I have pushed a multiengine branch. Use that instead of my master, which has been reset.


    git clone git://github.com/francois/mephisto.git mephisto_defensio
    cd mephisto_defensio
    git checkout multiengine
    rake db:bootstrap db:migrate
    thin start

If you already have a clone of Mephisto’s repository from someone else, add mine as a remote:


    cd mephisto
    git remote add francois git://github.com/francois/mephisto.git
    git fetch francois
    git branch --track multiengine francois/multiengine
    git checkout multiengine
    rake db:migrate
    thin start

In case you want to look at the code first, you can browse the GitHub repository using http://github.com/francois/mephisto/tree/multiengine

I suggest starting with Mephisto::SpamDetectionEngine::Base, and exploring from there.

Your Host

I am François Beausoleil, a Ruby on Rails and Scala developer. During the day, I work on Seevibes, a platform to measure social interactions related to TV shows. At night, I am interested in many things.