Using Thin with Solaris' SMF Monitoring Framework
- - - - - - - - - - - - - - - - - - - - - - - - -

Solaris uses the Service Management Framework (SMF) at the OS level to manage, monitor, and restart long running processes.  This replaces init scripts, and tools like monit and god.

The sample XML file (thin_solaris_smf.erb) is an example SMF manifest which I use on a Joyent accelerator which runs on OpenSolaris.

This setup will:

- ensure the right dependencies are loaded
- start n instances of Thin, and monitor each individually.  If any single one dies it will be restarted instantly (test it by killing a single thin instance and it will be back alive before you can type 'ps -ef').

This is better than using clustering since if you start the cluster with SMF it will only notice a problem and restart individual Thin's if ALL of them are dead, at which point it will restart the whole cluster.  This approach makes sure that all of your Thins start together and are monitored and managed independant of each other.  This problem likely exists if you are using god or monit to monitor only the start of the master cluster, and don't then monitor the individual processes started.

This example is in .erb format instead of plain XML since I dynamically generate this file as part of a Capistrano deployment.  In my deploy.rb file I define the variables found in this erb.  Of course you don't need to use this with Capistrano.  Just replace the few ERB variables from the xml file, change its extension, and load that directly in Solaris if you prefer.

Here are some examples for usage to get you started with Capistrano, and Thin:

FILE : config/deploy.rb
--

  require 'config/accelerator/accelerator_tasks'

  set :application, "yourapp"
  set :svcadm_bin, "/usr/sbin/svcadm"
  set :svccfg_bin, "/usr/sbin/svccfg"
  set :svcs_bin, "/usr/bin/svcs"

  # gets the list of remote service SMF names that we need to start
  # like (depending on thin_max_instances settings):
  # svc:/network/thin/yourapp-production:i_0
  # svc:/network/thin/yourapp-production:i_1
  # svc:/network/thin/yourapp-production:i_2
  set :service_list, "`svcs -H -o FMRI svc:network/thin/#{application}-production`"

  # how many Thin instances should be setup to run?
  # this affects the generated thin smf file, and the nginx vhost conf
  # need to re-run setup for thin smf and nginx vhost conf when changed
  set :thin_max_instances, 3

  # OVERRIDE STANDARD TASKS
  desc "Restart the entire application"
  deploy.task :restart do
    accelerator.thin.restart
    accelerator.nginx.restart
  end

  desc "Start the entire application"
  deploy.task :start do
    accelerator.thin.restart
    accelerator.nginx.restart
  end

  desc "Stop the entire application"
  deploy.task :stop do
    accelerator.thin.disable
    accelerator.nginx.disable
  end


FILE : config/accelerator/accelerator_tasks.rb
--

    desc "Create and deploy Thin SMF config"
    task :create_thin_smf, :roles => :app do
      service_name = application
      working_directory = current_path
      template = File.read("config/accelerator/thin_solaris_smf.erb")
      buffer = ERB.new(template).result(binding)
      put buffer, "#{shared_path}/#{application}-thin-smf.xml"
      sudo "#{svccfg_bin} import #{shared_path}/#{application}-thin-smf.xml"
    end

    desc "Delete Thin SMF config"
    task :delete_thin_smf, :roles => :app do
      accelerator.thin.disable
      sudo "#{svccfg_bin} delete /network/thin/#{application}-production"
    end

    desc "Show all SMF services"
    task :svcs do
      run "#{svcs_bin} -a" do |ch, st, data|
        puts data
      end
    end

    desc "Shows all non-functional SMF services"
    task :svcs_broken do
      run "#{svcs_bin} -vx" do |ch, st, data|
        puts data
      end
    end


    namespace :thin do

      desc "Disable all Thin servers"
      task :disable, :roles => :app do
        # temporarily disable, until next reboot (-t)
        sudo "#{svcadm_bin} disable -t #{service_list}"
      end

      desc "Enable all Thin servers"
      task :enable, :roles => :app do
        # start the app with all recursive dependencies
        sudo "#{svcadm_bin} enable -r #{service_list}"
      end

      desc "Restart all Thin servers"
      task :restart, :roles => :app do
        # svcadm restart doesn't seem to work right, so we'll brute force it
        disable
        enable
      end

    end # namespace thin


FILE : config/thin.yml
--

---
pid: tmp/pids/thin.pid
socket: /tmp/thin.sock
log: log/thin.log
max_conns: 1024
timeout: 30
chdir: /your/app/dir/rails/root
environment: production
max_persistent_conns: 512
daemonize: true
servers: 3


FILE : config/accelerator/thin_solaris_smf.erb
--
This is of course an example.  It works for me, but YMMV

You may need to change this line to match your environment and config:
  exec='/opt/csw/bin/thin -C config/thin.yml --only <%= instance.to_s %> start'


CONTRIBUTE:

If you see problems or enhancements for this approach please send me an email at glenn [at] rempe dot us.  Sadly, I won't be able to provide support for this example as time and my limited Solaris admin skills won't allow.

Cheers,

Glenn Rempe
2008/03/20
