Sometimes you need to know what your program is doing, or how long it takes to do something. You could always log to a file, then use a combination of grep, awk and wc to gather the statistics yourself, but why bother? Plenty of tools already do exactly that, so just use one of them: Cacti, Graphite or plain old RRD.
For instance, at Yatter, we need to know how fast our ranking algorithms run, and how the ranking time relates to the number of users and pages we have on hand. Graphing is the perfect solution for that, and Graphite fit the bill just fine for us. But Graphite alone doesn’t do everything we need: we also need a way to instrument our code, hence the Counters library:
    require "counters"
    require "sequel"

    # The PostgreSQL JDBC driver expects the "jdbc:postgresql://" scheme
    DB = Sequel.connect "jdbc:postgresql://127.0.0.1:5432/db"
    Counter = Counters::StatsD.new(:url => "udp://127.0.0.1:8125", :namespace => "ranker")

    # latency times the block and returns the block's value
    users = Counter.latency "fetch.users" do
      DB[:users].all
    end

    pages = Counter.latency "fetch.pages" do
      DB[:pages].all
    end

    # magnitude records an absolute value
    Counter.magnitude "count.users", users.length
    Counter.magnitude "count.pages", pages.length

    Counter.latency "ranking" do
      entropy = 1.0
      while entropy > MIN_ENTROPY
        # hit counts one occurrence, here one ranking iteration
        Counter.hit "iteration"
        # Reduce entropy
      end
    end
At the end of the day, we’ll have hierarchical counters in Graphite which give us the statistics we care about: fetch and ranking latencies, user and page counts, and the number of ranking iterations. From the API above, you can gather that values are stored under hierarchical keys separated by full stops. If you’re interested in the code, make yourself at home with the Counters GitHub repository.
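To make the hierarchical keys a little more concrete, here is a minimal sketch of how a StatsD-style counter could turn those calls into datagrams. It is not the Counters source: the SketchCounter class, its injectable sink, and the mapping of hit, magnitude and latency onto the StatsD types "c", "g" and "ms" are all assumptions made for illustration. Only the `key:value|type` line format is standard StatsD.

```ruby
require "socket"

# Hypothetical stand-in for a StatsD-backed counter. NOT the Counters gem.
class SketchCounter
  # namespace: prefix prepended to every key, e.g. "ranker"
  # sink: anything responding to #call(datagram); defaults to a UDP sender
  def initialize(namespace, sink = nil)
    @namespace = namespace
    @sink = sink || udp_sink("127.0.0.1", 8125)
  end

  # Count one occurrence of an event (assumed StatsD counter, type "c").
  def hit(key)
    emit(key, 1, "c")
  end

  # Record an absolute value (assumed StatsD gauge, type "g").
  def magnitude(key, value)
    emit(key, value, "g")
  end

  # Time the block in milliseconds (assumed StatsD timer, type "ms") and
  # return the block's own result so the caller can keep using it.
  def latency(key)
    started = Time.now
    result = yield
    emit(key, ((Time.now - started) * 1000).round, "ms")
    result
  end

  private

  # Build the dotted key from namespace and key, then hand the line to the sink.
  def emit(key, value, type)
    @sink.call("#{@namespace}.#{key}:#{value}|#{type}")
  end

  def udp_sink(host, port)
    socket = UDPSocket.new
    lambda { |datagram| socket.send(datagram, 0, host, port) }
  end
end

# Collect datagrams in an array instead of sending them over UDP.
sent = []
counter = SketchCounter.new("ranker", lambda { |datagram| sent << datagram })
counter.hit "iteration"
counter.magnitude "count.users", 42
pages = counter.latency("fetch.pages") { [:a, :b] }
```

Under these assumptions, the first two datagrams are `ranker.iteration:1|c` and `ranker.count.users:42|g`, and the third has the form `ranker.fetch.pages:<milliseconds>|ms`. Once StatsD forwards the values, Graphite treats each dot-separated segment as one level of its metric tree, which is where the hierarchy comes from.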
Counters is certified to run on JRuby 1.6.0 in 1.8 mode, and MRI 1.9.2.