andy lindeman . com


Callbacks and Background Jobs

TL;DR: Don't use after_{create,update,save} to enqueue a background job; instead, use after_commit, which runs only after the database transaction has been committed.


Rails' callbacks are a popular way to hook into the lifecycle of a model and automatically perform tasks when a model changes.

For instance:

class Image < ActiveRecord::Base
  after_create :generate_thumbnail

  def generate_thumbnail
    # ...
  end
end

In this case, we are generating a thumbnail for an image after it is created.

Because it's likely that generating a thumbnail is an expensive operation that could easily bog down the web server and web requests, many folks would refactor it to enqueue a background job that generates the thumbnail at some time in the future:

class Image < ActiveRecord::Base
  after_create :enqueue_thumbnail_job

  def enqueue_thumbnail_job
    Resque.enqueue(ThumbnailJob, id)
  end
end

In this case, we are using a Resque job to do the heavy lifting. We avoid bogging down web requests with thumbnail generation.

Great, so imagine the Resque job looks something like:

class ThumbnailJob
  def self.perform(id)
    image = Image.find(id)
    image.generate_thumbnail
  end
end

The problem is that after_create, like most other Rails callbacks, is executed within the context of a transaction: other database sessions will not see the changes to the database until after the changes are committed.

If the background worker were incredibly fast at popping off a job, it is possible that Image.find(id) would fail because the database transaction had not yet completed. The row would not yet be visible to other database sessions.

Granted, it is unlikely that this will occur as there is a delay between the time a job is enqueued and when it is started by a worker process; however, it could happen in a small number of cases, and when it does, it would be very hard to debug and reproduce.
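To make the race concrete, here is a minimal plain-Ruby sketch. It does not use Rails, Resque, or a real database: FakeDatabase and run_job_immediately are hypothetical names invented for this illustration, and the "worker" simply runs the job the instant it is enqueued.

```ruby
# Toy stand-in for a database: rows written inside a transaction stay
# invisible to other sessions until commit. All names are hypothetical.
class FakeDatabase
  RecordNotFound = Class.new(StandardError)

  def initialize
    @committed = {} # rows every session can see
    @pending   = {} # rows written by the open transaction
  end

  # Runs the block, then makes the pending rows visible (the "commit").
  def transaction
    yield
    @committed.merge!(@pending)
    @pending.clear
  end

  def insert(id, row)
    @pending[id] = row
  end

  # What another session (e.g., a worker) sees: committed rows only.
  def find(id)
    @committed.fetch(id) { raise RecordNotFound, "no row with id=#{id}" }
  end
end

# A "worker" so fast that it runs the job the instant it is enqueued.
def run_job_immediately(db, id)
  db.find(id)
  :thumbnail_generated
rescue FakeDatabase::RecordNotFound
  :record_not_found
end

db = FakeDatabase.new

# after_create style: enqueue while the transaction is still open.
inside_result = nil
db.transaction do
  db.insert(1, { :name => "cat.png" })
  inside_result = run_job_immediately(db, 1)
end

# after_commit style: enqueue once the transaction has committed.
after_result = run_job_immediately(db, 1)

puts "enqueued inside transaction: #{inside_result}"
puts "enqueued after commit:       #{after_result}"
```

The job enqueued inside the transaction cannot find the row, while the same job enqueued after the commit can. A real worker is rarely this fast, which is exactly why the failure shows up only occasionally.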

So, what's the solution?

The after_commit callback, introduced in Rails 3 and available as a plugin in Rails 2.

class Image < ActiveRecord::Base
  after_commit :enqueue_thumbnail_job

  def enqueue_thumbnail_job
    Resque.enqueue(ThumbnailJob, id) unless destroyed?
  end
end

NOTE: We also check to make sure the object is not destroyed, as after_commit is called upon destruction as well (thanks zimbatm).

Moral of the story: if you expect another process (e.g., background job) to see changes to the database, wait until the database transaction is completed (after_commit).



Testing Devise with RSpec Request Specs and Capybara

Many questions about testing Devise come up regularly in the freenode/#rspec IRC channel. Often, folks ask how to use RSpec request specs to write integration level ("full stack") tests against Devise.

In integration tests, ideally nothing is mocked or stubbed. Using the recommended setup for controller tests, for instance, is not a good idea in request specs. We want to click links and fill in forms exactly as a user would.

This post assumes you have a Devise setup running already. If not, start with the Devise README.

First, RSpec request specs can use either Capybara or Webrat. I prefer Capybara. To install, add this line to your Gemfile and run bundle install:

gem 'capybara', :group => :test

Next, add require 'capybara/rspec' to spec/spec_helper.rb:

ENV["RAILS_ENV"] ||= 'test'
require File.expand_path("../../config/environment", __FILE__)
require 'rspec/rails'
require 'capybara/rspec' ### ADD THIS LINE

Finally, RSpec request specs must be created in the spec/requests directory under your Rails root, so create that directory if it does not already exist.

Now you're ready to start writing specs! For example, here's a simple one I named spec/requests/user_registration_spec.rb:

require "spec_helper"

describe "user registration" do
  it "allows new users to register with an email address and password" do
    visit "/users/sign_up"

    fill_in "Email",                 :with => "alindeman@example.com"
    fill_in "Password",              :with => "ilovegrapes"
    fill_in "Password confirmation", :with => "ilovegrapes"

    click_button "Sign up"

    page.should have_content("Welcome! You have signed up successfully.")
  end
end

And here's one to test user sign in, named spec/requests/user_sign_in_spec.rb:

require "spec_helper"

describe "user sign in" do
  it "allows users to sign in after they have registered" do
    user = User.create(:email    => "alindeman@example.com",
                       :password => "ilovegrapes")

    visit "/users/sign_in"

    fill_in "Email",    :with => "alindeman@example.com"
    fill_in "Password", :with => "ilovegrapes"

    click_button "Sign in"

    page.should have_content("Signed in successfully.")
  end
end

Hopefully these serve as a good starting point for your tests.

As you move beyond these basic specs, check out these resources:



What's The First Release That Contains This Commit?

If the project is using `git` and tags each release, it is really easy to ask `git` which tag is the first to contain a given commit.

For instance, take 58a46bf on Sunspot which added the `Sunspot.optimize` method:

$ cd /path/to/sunspot
$ git describe --contains 58a46bf
v1.2.0~22

In this case, `v1.2.0` was the first tag to contain the commit in its history, and the `~22` suffix means the commit sits `22` commits back from that tag.

To use `Sunspot.optimize`, you’d need at least Sunspot v1.2.0.
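If you want to pull the tag out of that output in a script, a small helper along these lines should do. This is only a sketch: first_release_containing is a made-up name, and it relies on the fact that git ref names cannot contain `~` or `^`, so the tag ends at the first of those characters.

```ruby
# Split `git describe --contains` output into the tag and the trailing
# suffix. Hypothetical helper, not part of git or any library.
def first_release_containing(describe_output)
  text = describe_output.strip
  tag = text[/\A[^~^]+/]          # everything before the first "~" or "^"
  return nil if tag.nil?

  suffix = text[tag.length..-1]   # e.g. "~22"; may be empty or more complex
  match = suffix.match(/\A~(\d+)\z/)

  # :distance is only filled in for the simple "~N" form.
  { :tag => tag, :suffix => suffix, :distance => match ? match[1].to_i : nil }
end

result = first_release_containing("v1.2.0~22\n")
puts result[:tag]      # prints "v1.2.0"
puts result[:distance] # prints "22"
```

Suffixes that mix `~` and `^` (which git can emit when the path crosses a merge) are returned as-is in `:suffix`, with `:distance` left as nil.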



ruby-debug and "Warning: saved frames may be incomplete"

I like using ruby-debug (well, ruby-debug19 for Ruby 1.9), a command line debugger for Ruby.

Occasionally, though, I would run into an issue where the backtrace (bt) command did not give a useful stack trace.

(rdb:1) bt
--> #0 at line /.../config/initializers/omniauth.rb:11
Warning: saved frames may be incomplete; compare with caller(0).

Thanks to a Stack Overflow question, though, I’ve found the solution: require 'ruby-debug' and call Debugger.start as soon as possible in the application being debugged.

For a Rails project, this may look like adding the following to config/environments/development.rb:

require 'ruby-debug'
Debugger.start

Sweet, sweet backtraces return!

(rdb:1) bt
--> #0 OmniAuth::Strategy.fail!(message_key#Symbol)
       at line /.../lib/omniauth/strategy.rb:11
    #1 OmniAuth::Strategy.fail!(message_key#Symbol)
       at line /.../lib/omniauth/strategy.rb:224
    #2 OmniAuth::Strategies::OpenID.start
       at line /.../lib/omniauth/strategies/open_id.rb:71
    #3 OmniAuth::Strategies::OpenID.request_phase
       at line /.../lib/omniauth/strategies/open_id.rb:63
    ...


Optimistic Locking with MongoDB (and mongo_mapper)

(cross posted on the Highgroove Blog)

At Highgroove (where I work), we build database-backed web applications. These days, there are many options when choosing a database backend. We normally start with relational database systems (e.g., PostgreSQL and MySQL) because they are very mature and feature niceties like ACID transactions and locks.

On the other hand, we also work with newer NoSQL database systems that often trade features like transactions, locks, and joins for higher performance and scalability. One popular option is a document-based store called MongoDB. By design, MongoDB (as of this writing) does not support ACID transactions (though many operations are atomic) or traditional locks.

In many cases, MongoDB paired with an object-relational mapper like mongo_mapper is a great solution for Ruby and Rails applications. After the break, I explore a solution that allows a developer to “lock” a MongoDB document while still maintaining the high performance and other features MongoDB is known for.

mongo_mapper loads documents from MongoDB into memory as Ruby objects. Typically, a program then changes a few values and saves the record. Notably, saving the record actually resets every field in the document to the values in the Ruby object.

Unfortunately, data and work can be lost if multiple processes (e.g., different web server processes/threads or background workers) are interacting with the same documents at the same time.

For instance, if two processes execute post = BlogPost.find("4e13cda850b86112c9000001") at roughly the same time and both later save, one will end up overwriting the other's changes because the values it holds in memory are stale.
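A minimal plain-Ruby sketch of this lost update follows. A hash stands in for the MongoDB collection, no mongo_mapper is involved, and the find/save helpers and the :title and :views fields are made up for the illustration.

```ruby
# A hash stands in for the MongoDB collection; no real database is involved.
COLLECTION = {
  "4e13cda850b86112c9000001" => { :title => "Draft", :views => 0 }
}

# Mimics loading a document into memory: returns an independent copy.
def find(id)
  COLLECTION.fetch(id).dup
end

# Mimics a save that rewrites every field from the in-memory copy.
def save(id, doc)
  COLLECTION[id] = doc.dup
end

id = "4e13cda850b86112c9000001"

# Two processes load the same document at roughly the same time.
post_a = find(id)
post_b = find(id)

# Each changes a different field on its own copy.
post_a[:title] = "Published"
post_b[:views] = 1

# Both save; the second write carries a stale :title and clobbers the first.
save(id, post_a)
save(id, post_b)

final = COLLECTION[id]
puts "title: #{final[:title]}, views: #{final[:views]}"
```

The stored title ends up back at "Draft": the first process's change is silently lost, even though the two processes touched different fields.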

This problem is not unique to MongoDB or mongo_mapper, but the potential solutions are different from relational databases because MongoDB documents cannot be exclusively locked.

One way to solve the problem is to only interact with MongoDB through its atomic operations, but many of the conveniences of an object-relational mapper are lost for a problem that may only occur sporadically.

Another way is to use optimistic locking, a technique that reliably determines if a record has been updated by another process between the time it was initially loaded into memory and the time a save is attempted. If the record has been modified by another process, an error is raised and the object must be reloaded. If objects are not in high contention and retrying the operations is easily accomplished, optimistic locking is a good solution that keeps performance high in the average case.

If using optimistic locking with a background job, a reasonable response to a stale document error would be to reload the record, retry the operation, and attempt a save again. On the other hand, if two end users are in contention, the stale document could be presented to one of the users for manual conflict resolution.
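The idea can be sketched in plain Ruby with a version number stored alongside each document. This is an illustration of the general technique under simplifying assumptions, not the gem's implementation or API: VersionedStore, StaleDocumentError, and update_with_retry are hypothetical names, and the store is an in-memory hash.

```ruby
# Hypothetical error class for this sketch (not the gem's API).
class StaleDocumentError < StandardError; end

# In-memory stand-in for a collection that keeps a version per document.
class VersionedStore
  def initialize
    @docs = {}
  end

  def insert(id, fields)
    @docs[id] = fields.merge(:version => 0)
  end

  # Returns an independent copy, version included.
  def find(id)
    @docs.fetch(id).dup
  end

  # Saves only if nobody else has saved since this copy was loaded.
  # A real database would need the check and the write to be one atomic step.
  def save(id, doc)
    current = @docs.fetch(id)
    raise StaleDocumentError, "document #{id} is stale" if current[:version] != doc[:version]
    @docs[id] = doc.merge(:version => doc[:version] + 1)
  end
end

# Background-job style handling: reload, reapply the change, try again.
def update_with_retry(store, id, attempts = 3)
  attempts.times do
    doc = store.find(id)
    yield doc
    begin
      return store.save(id, doc)
    rescue StaleDocumentError
      next # someone else saved first; reload and retry
    end
  end
  raise StaleDocumentError, "gave up on #{id} after #{attempts} attempts"
end

store = VersionedStore.new
store.insert("post-1", :title => "Draft", :views => 0)

stale = store.find("post-1") # loaded by a slow process
update_with_retry(store, "post-1") { |doc| doc[:title] = "Published" }

# The slow process now tries to save its outdated copy and is refused.
stale[:views] = 1
conflict = begin
  store.save("post-1", stale)
  false
rescue StaleDocumentError
  true
end

# Reloading and reapplying the change succeeds and keeps both updates.
update_with_retry(store, "post-1") { |doc| doc[:views] = 1 }
final = store.find("post-1")
puts "conflict detected: #{conflict}"
puts "title: #{final[:title]}, views: #{final[:views]}"
```

The stale save is rejected instead of silently overwriting the title, and the retried update leaves the document with both changes applied.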

We have written a small gem that implements optimistic locking with mongo_mapper called mm-optimistic_locking. It is very straightforward to use and we think it does its job well, but we definitely welcome feedback.

An example usage is shown below (basic mongo_mapper familiarity assumed):

More information is available in the README and its GitHub homepage.




© Andy Lindeman | Icons from silk