Wednesday, 6 February 2008

Making Bio::Graphics extendable

One of the issues in a library like Bio::Graphics is the plethora of glyph types that users will want. Here's a little showcase of what's provided by the library:


Features on a DNA sequence can be represented as filled boxes, open boxes, boxes with arrows, lines, triangles, ... In this post, I'll show you (and remind myself) how I came to a version of the Bio::Graphics code that makes adding glyphs straightforward for both myself and the user. WARNING: this post is going to be rather technical... Sorry about that.

First pass
Suppose we want to make it possible to create a picture like this one:

You basically have to tell your script that marker features should be drawn as triangles, and both scaffold and clone features as coloured boxes. The initial version of doing the actual drawing looked like this (only taking the relevant bits):

class Feature
  def initialize(glyph = :generic)
    @glyph = glyph
  end
  attr_accessor :glyph

  def draw
    case @glyph
    when :generic
      drawing.rectangle(left, top, width, height).fill
    when :line
      drawing.move_to(left, top)
      drawing.line_to(right, top)
      drawing.stroke
    when :triangle
      # code to draw triangle
    end
  end
end


This does work, but you see the issue, right? Whenever I or someone else comes up with another idea on how to represent a particular feature, the library code itself has to be changed. So far from extendable, that is...

Second pass: extracting the glyphs
To handle this issue for Perl's Bio::Graphics, Lincoln Stein uses the Factory pattern: he creates a single GlyphFactory object that spits out different Glyph objects for each feature based on the configuration set at the Feature level. As I didn't know a thing about Design Patterns (i.e. before Russ Olsen's "Design Patterns in Ruby" arrived here at work) I had no idea how to set up something like that and just started coding away. As it turns out, I actually implemented it using a Strategy pattern.

What I basically wanted was to delegate the actual drawing of a feature to a glyph. The Design Patterns in Ruby book gives a good example for formatting text. Here's the code:

class XMLFormatter
  def output_report(title, text)
    puts('<xml>')
    # double quotes are needed for #{...} interpolation
    puts("  <title>#{title}</title>")
    puts("  <text>#{text}</text>")
    puts('</xml>')
  end
end

class PlainTextFormatter
  def output_report(title, text)
    puts("***** #{title} *****")
    puts text
  end
end


This can then be used in e.g. a Report class like this (also from the same book):

class Report
  attr_reader :title, :text
  attr_accessor :formatter

  def initialize(formatter)
    @title = 'Monthly Report'
    @text = 'Things are going pretty well.'
    @formatter = formatter
  end

  def output_report
    @formatter.output_report(@title, @text)
  end
end
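
To make the point of the Strategy pattern concrete, here is a minimal sketch of swapping the formatter on a live Report object. It is an adaptation rather than the book's code: the formatters return a string instead of calling puts (my assumption, to make the result easy to inspect).

```ruby
# Two interchangeable strategies: each one only has to respond to
# output_report(title, text). They return a string here instead of
# printing, which is an assumption made for this sketch.
class PlainTextFormatter
  def output_report(title, text)
    "***** #{title} *****\n#{text}\n"
  end
end

class XMLFormatter
  def output_report(title, text)
    "<xml>\n  <title>#{title}</title>\n  <text>#{text}</text>\n</xml>\n"
  end
end

class Report
  attr_reader :title, :text
  attr_accessor :formatter

  def initialize(formatter)
    @title = 'Monthly Report'
    @text = 'Things are going pretty well.'
    @formatter = formatter
  end

  # The report does not know how to format itself; it delegates.
  def output_report
    @formatter.output_report(@title, @text)
  end
end

report = Report.new(PlainTextFormatter.new)
plain = report.output_report
report.formatter = XMLFormatter.new # swap the strategy at runtime
xml = report.output_report
puts plain
puts xml
```

The Report class never changes when a new formatter is added, which is exactly the property we want for glyphs.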


Looks a lot like what we need, doesn't it? Translating this to our purposes, the library code could look like this:

class Glyph::Common
  def initialize(caller)
    @caller = caller
  end
  attr_accessor :caller
end

class Glyph::Generic < Glyph::Common
  def draw(left, top, width, height)
    @caller.drawing.rectangle(left, top, width, height).fill
  end
end

class Glyph::Line < Glyph::Common
  def draw(left, top, width, height)
    # the right edge of the feature is left + width
    @caller.drawing.move_to(left, top)
    @caller.drawing.line_to(left + width, top)
    @caller.drawing.stroke
  end
end


And use it in the Feature class like this:

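class Feature
  def initialize(glyph_object = Glyph::Generic)
    @glyph_object = glyph_object.new(self)
  end
  attr_accessor :glyph_object

  def draw
    # left, top, width and height are the feature's own coordinates
    # (not shown here, as in the first pass)
    @glyph_object.draw(left, top, width, height)
  end
end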
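
To see the delegation at work without needing cairo, here is a self-contained sketch. The FakeDrawing class is a made-up stand-in that only records which drawing calls it receives, the coordinates are invented, and the glyphs keep their feature in @feature rather than @caller; none of that is part of the library.

```ruby
# Hypothetical stand-in for the real drawing context: it only records calls.
class FakeDrawing
  attr_reader :calls

  def initialize
    @calls = []
  end

  def rectangle(left, top, width, height)
    @calls << [:rectangle, left, top, width, height]
    self # returned so that .fill can be chained, as in the post
  end

  def move_to(x, y)
    @calls << [:move_to, x, y]
  end

  def line_to(x, y)
    @calls << [:line_to, x, y]
  end

  def fill
    @calls << [:fill]
  end

  def stroke
    @calls << [:stroke]
  end
end

module Glyph
  class Common
    def initialize(feature)
      @feature = feature
    end
  end

  class Generic < Common
    def draw(left, top, width, height)
      @feature.drawing.rectangle(left, top, width, height).fill
    end
  end

  class Line < Common
    def draw(left, top, width, _height)
      @feature.drawing.move_to(left, top)
      @feature.drawing.line_to(left + width, top)
      @feature.drawing.stroke
    end
  end
end

class Feature
  attr_reader :drawing

  def initialize(drawing, glyph_class = Glyph::Generic)
    @drawing = drawing
    @glyph_object = glyph_class.new(self)
  end

  def draw
    @glyph_object.draw(10, 20, 100, 5) # made-up coordinates
  end
end

box = Feature.new(FakeDrawing.new)
box.draw
line = Feature.new(FakeDrawing.new, Glyph::Line)
line.draw
p box.drawing.calls
p line.drawing.calls
```

The Feature class is identical for both glyphs; only the class handed to the constructor differs.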


At least this approach splits out the actual drawing into different simple classes. But the extendability still isn't there: the user still has to open the library file containing all glyph definitions and hack away in there.

Third pass: loading glyphs automatically
It'd be nice if we could add new glyph types on the fly just by creating a little file containing the code for that glyph's class. Convention over configuration to the rescue...

What I did was create a folder (/lib/bio/graphics/glyphs/) that contains the description of all glyphs in separate files:
generic.rb

class Glyph::Generic < Glyph::Common
  def draw(left, top, width, height)
    @caller.drawing.rectangle(left, top, width, height).fill
  end
end


line.rb

class Glyph::Line < Glyph::Common
  def draw(left, top, width, height)
    # the right edge of the feature is left + width
    @caller.drawing.move_to(left, top)
    @caller.drawing.line_to(left + width, top)
    @caller.drawing.stroke
  end
end


So ideally, the only thing needed to make a script work that asks for a feature to be drawn as an empty box (feature = Feature.new(:empty_box)) would be to add a file called 'empty_box.rb' to that directory. Several things have to be taken care of to make that happen:
* loading the new file
* translating the :empty_box to EmptyBox

Loading all files in that directory is easy enough. Adding the following code to the main bio-graphics.rb file (which loads the whole library) does the trick:

glyph_dir = File.join(File.dirname(__FILE__), 'bio', 'graphics', 'glyphs')
# common.rb defines the superclass, so it has to be loaded before the other glyphs
require File.join(glyph_dir, 'common.rb')
full_pattern = File.join(glyph_dir, '*.rb')
Dir.glob(full_pattern).each do |file|
  require file
end
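
As a small experiment, the same loop can be tried against a temporary directory. This is only a sketch: the temporary directory stands in for lib/bio/graphics/glyphs/, and the name method on the glyph is invented so there is something to call.

```ruby
require 'tmpdir'

# Minimal namespace and superclass, normally provided by common.rb.
module Glyph
  class Common; end
end

Dir.mktmpdir do |glyph_dir|
  # Simulate a user dropping a new glyph file into the glyphs directory.
  File.write(File.join(glyph_dir, 'empty_box.rb'), <<~RUBY)
    class Glyph::EmptyBox < Glyph::Common
      def name
        'empty box'
      end
    end
  RUBY

  # Same loop as in the library; sorting makes the load order predictable.
  Dir.glob(File.join(glyph_dir, '*.rb')).sort.each do |file|
    require file
  end
end

# The class is available although no library code mentions it by name.
puts Glyph::EmptyBox.new.name
```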


To translate the :empty_box symbol into the EmptyBox class takes a little more work: we need to convert the snake_case symbol into a CamelCase string, and then create an object of the class that has that name. To do that, I extended the String class a bit with these additional methods:

class String
  def snake_case
    return self.to_s.gsub(/::/, '/').gsub(/([A-Z]+)([A-Z][a-z])/, '\1_\2').gsub(/([a-z\d])([A-Z])/, '\1_\2').tr("-", "_").downcase
  end

  def camel_case
    return self.to_s.gsub(/\/(.?)/) { "::" + $1.upcase }.gsub(/(^|_)(.)/) { $2.upcase }
  end

  def to_class
    parts = self.split(/::/)
    klass = Kernel
    parts.each do |part|
      klass = klass.const_get(part)
    end
    return klass
  end
end


Now what happens here? The snake_case and camel_case methods should not be that difficult to understand and are not really where the magic happens. The String#to_class method, however, is a different story. As it happens, every class in Ruby is also represented by a constant (which is why a class name always starts with a capital). To get to the class that has the name MyClass, all you have to do is retrieve the constant with that name: Kernel.const_get("MyClass"). Unfortunately, having namespaces (Bio::Graphics::Glyph::Generic) makes things a bit more difficult. You can't just do Kernel.const_get("Bio::Graphics::Glyph::Generic") (at least not in Ruby 1.8; newer Ruby versions do accept such a qualified name). To get to the Generic class, you have to call the const_get method on the Bio::Graphics::Glyph namespace, which we don't have a reference to yet. Therefore we have to loop through all parts of the namespace and build up the class as we go.
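
The namespace walk can also be tried in isolation. The sketch below is a compact variant, not the library code: constantize is a hypothetical helper name, it starts from Object instead of Kernel, and the empty Generic class exists only so there is something to look up.

```ruby
module Bio
  module Graphics
    module Glyph
      class Generic; end
    end
  end
end

# Walk the namespace one constant at a time: Object -> Bio -> Graphics -> ...
def constantize(name)
  name.split('::').inject(Object) { |scope, part| scope.const_get(part) }
end

klass = constantize('Bio::Graphics::Glyph::Generic')
puts klass
```

The String#to_class method above does the same thing with an explicit loop. Either way, asking for a name that does not exist raises a NameError, so a typo in a glyph symbol fails loudly.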

With this code in place, I rewrote the Feature class to use this functionality:

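class Feature
  def initialize(glyph = :generic)
    @glyph = glyph
  end
  attr_accessor :glyph

  def draw
    # :empty_box => 'Bio::Graphics::Glyph::EmptyBox' => the class itself
    glyph_name = 'Bio::Graphics::Glyph::' + @glyph.to_s.camel_case
    glyph_class = glyph_name.to_class
    glyph_object = glyph_class.new(self)
    # left, top, width and height are the feature's own coordinates
    glyph_object.draw(left, top, width, height)
  end
end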


Now all a user has to do to add a new glyph type to their application is:
* create a file in the lib/bio/graphics/glyphs/ directory that defines the glyph
* make sure that the name given to that class is the CamelCase version of the symbol they want to use (which should be snake_case)
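
Put together, the two steps above can be sketched in a single file. This is a simplified illustration, not the library itself: the glyphs return a string instead of drawing, features get a made-up name attribute, and the symbol is converted inline rather than through String#camel_case.

```ruby
module Bio
  module Graphics
    module Glyph
      class Common
        def initialize(feature)
          @feature = feature
        end
      end

      class Generic < Common
        def draw
          "generic glyph for #{@feature.name}"
        end
      end
    end

    class Feature
      attr_reader :name, :glyph

      def initialize(name, glyph = :generic)
        @name = name
        @glyph = glyph
      end

      def draw
        # :empty_box -> "EmptyBox" -> Bio::Graphics::Glyph::EmptyBox
        class_name = @glyph.to_s.split('_').map(&:capitalize).join
        Glyph.const_get(class_name).new(self).draw
      end
    end
  end
end

# "User" code: normally this would live in glyphs/empty_box.rb.
class Bio::Graphics::Glyph::EmptyBox < Bio::Graphics::Glyph::Common
  def draw
    "empty box for #{@feature.name}"
  end
end

puts Bio::Graphics::Feature.new('clone1', :empty_box).draw
puts Bio::Graphics::Feature.new('marker1').draw
```

Nothing inside the Bio::Graphics module had to change to support :empty_box; defining the class under the expected name was enough.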

There you go. As I warned at the start: technical. At the moment this setup works for what I need the Bio::Graphics library to do. The approach might change in the future, as we need to handle subfeatures, subsubfeatures, subsubsubfeatures, ... more elegantly. But that's something for another post.

3 comments:

  1. Hi Jan,

    I'm thinking about using cairo (and the ruby bindings) to generate some figures, but the lack of (English) rcairo documentation has frustrated my efforts somewhat. How did you get your head around the rcairo library? Any great resources that I've missed?
    Thanks
    -r

  2. Hi Bob,

    I remember it being a pain to get rcairo documentation. Ended up using python documentation at http://tinyurl.com/yrreev

    Getting cairo itself running is also far from simple on a Mac... Have you had a look at processing (http://processing.org) as an alternative to cairo? Depending on what you want to do that might be a better option.

  3. I love processing, but I'm really using this little project as a warm-up before learning how to craft some custom GTK+ widgets, so cairo is sort of necessary. I'm also on ubuntu, so I didn't have much trouble getting cairo installed.

    The python code seems very close to the ruby bindings, thanks!
    Since posting my first comment I've git cloned your biographics repo. Looking through your code has actually answered some of the more critical questions I had about the Context and ImageSurface objects.

    Thanks for the python docs tip.
    -r
