Flying memes

Ruby Spell Doctor: a Ruby binding to Hunspell

I had the idea for this project in my head for a while. Thanks to Gregory Brown and his Ruby Mendicant University, I finally had the opportunity and the incentive to start developing it. Ruby Spell Doctor binds some of the most useful Hunspell functions and makes them available to Ruby developers. For those who don't know it, Hunspell is a widely used open source spell checker library that is built into a lot of software, such as OpenOffice, Google Chrome and Mac OS X.

To achieve this I used FFI. The Ruby snippet that maps the native C functions looks like this:

require 'ffi'

module Hunspell
  module Wrapper
    extend FFI::Library
    ffi_lib(LIBRARY_PATH)
    # Ruby method name, C function name, argument types, return type
    attach_function :create,  :Hunspell_create,  [:string,  :string], :pointer
    attach_function :spell,   :Hunspell_spell,   [:pointer, :string], :int
    attach_function :suggest, :Hunspell_suggest, [:pointer, :pointer, :string], :int
  end
end

where LIBRARY_PATH is the path to the compiled hunspell library.

The next step was to create a Hunspell class that wraps this module and gives the library a nice Ruby-ish interface:

module Hunspell
  class Hunspell

    # load the dictionary configuration and create the Hunspell handle
    # for the default dictionary
    def initialize(config_obj = nil)
      @dicts = bootstrap(config_obj || CONFIGURATION)
      respawn_handler('default')
    end

    def respawn_handler(dictionary)
      # create the Hunspell handle for the given dictionary
      @hunspell = Wrapper.create(
        @dicts[dictionary]['aff'], @dicts[dictionary]['dic']
      )
    end

    # returns true if word is spelled correctly, false otherwise
    def spelled_correctly?(word)
      Wrapper.spell(@hunspell,word) != 0
    end

    # returns an array of suggestions for the given word ([] if there are
    # none)
    def suggest(word)
      # Hunspell_suggest stores a char** list in this pointer and returns
      # the number of entries
      suggestions = FFI::MemoryPointer.new :pointer, 1
      suggestions_number = Wrapper.suggest(@hunspell, suggestions, word)
      suggestion_pointer = suggestions.read_pointer
      if suggestion_pointer.null?
        []
      else
        suggestion_pointer.get_array_of_string(0, suggestions_number).compact
      end
    end

    private

    # get lib directory and default dictionary path
    def bootstrap(config_obj)
      if config_obj.nil?
        raise BootstrapError.new("No configuration info supplied to Hunspell " +
            "constructor and no default configuration file available")
      end
      return config_obj['dictionaries'].merge({
          'default' => config_obj['dictionaries'][
                          config_obj['dictionaries']['default']]
      })
    end

  end
end
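
The least obvious part of the class is probably the merge in bootstrap: it replaces the language name stored under 'default' with that language's own aff/dic pair, so that respawn_handler('default') works like any other lookup. Here is a stand-alone sketch of that step. The resolve_dictionaries name and the file names are placeholders chosen for this example, not part of the library.

```ruby
# Hypothetical stand-alone version of the merge performed in bootstrap.
def resolve_dictionaries(config)
  dictionaries = config.fetch('dictionaries')
  default_name = dictionaries.fetch('default')
  # Overwrite the 'default' entry (a language name) with that language's files
  dictionaries.merge('default' => dictionaries.fetch(default_name))
end

config = {
  'dictionaries' => {
    'default' => 'en',
    'en' => { 'aff' => 'en_US.aff', 'dic' => 'en_US.dic' },
    'it' => { 'aff' => 'it_IT.aff', 'dic' => 'it_IT.dic' }
  }
}

dicts = resolve_dictionaries(config)
puts dicts['default']['aff'] # prints en_US.aff
```

Unlike the original, this sketch uses fetch, so a 'default' entry that names a missing language raises a KeyError right away instead of leaving nil under 'default'.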

Well, that's almost everything, except for some configuration details that let you choose which language dictionary files to load. The paths of these files (one .dic and .aff pair per language) can be specified in a .yml file, which is then loaded and passed to the constructor like this:

YAML (config.yml)

dictionaries:
  default: en
  en:
    aff: '/Users/sandropaganotti/RMU/hunspell/hunspell_en_dict/en_US.aff'
    dic: '/Users/sandropaganotti/RMU/hunspell/hunspell_en_dict/en_US.dic'
  it:
    aff: '/Users/sandropaganotti/RMU/hunspell/hunspell_it_dict/it_IT.aff'
    dic: '/Users/sandropaganotti/RMU/hunspell/hunspell_it_dict/it_IT.dic'
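
Since a missing aff or dic entry only shows up later, when the handle is created, it can help to check the parsed configuration first. The following is a rough sketch of such a check using only Ruby's standard YAML library. The incomplete_dictionaries helper and the shortened paths are assumptions made for this example; the library itself does not ship this check.

```ruby
require 'yaml'

# Inline stand-in for config.yml; the paths are placeholders.
config_text = <<~CONFIG
  dictionaries:
    default: en
    en:
      aff: '/path/to/en_US.aff'
      dic: '/path/to/en_US.dic'
    it:
      aff: '/path/to/it_IT.aff'
CONFIG

# Hypothetical helper: names of the languages missing an 'aff' or 'dic' entry.
def incomplete_dictionaries(config)
  config.fetch('dictionaries')
        .reject { |name, _| name == 'default' }
        .reject { |_, files| files.key?('aff') && files.key?('dic') }
        .keys
end

config = YAML.safe_load(config_text)
missing = incomplete_dictionaries(config)
puts "Incomplete dictionaries: #{missing.join(', ')}" unless missing.empty?
```

With the sample text above, the 'it' entry has no dic path, so it is the one reported.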

Usage Example

require 'yaml'

# LIBRARY_PATH must be defined before the library is required,
# because the wrapper passes it to ffi_lib at load time
module Hunspell
  LIBRARY_PATH = '/path/to/your/hunspell.lib'
end

require './lib/hunspell'

hunspell = Hunspell::Hunspell.new(YAML.load(File.read('config.yml')))
hunspell.respawn_handler 'it'
hunspell.suggest 'spagtti'
=> ["spaghetti", "spaginati", "spagliati", "spaghetto"]

And that's it! You can find the source code of this library on my GitHub account. I hope you find it useful!
