making things better, making better things

Saturday, June 27, 2009

deploying Thinking Sphinx on DreamHost PS

The time had come to add a search engine to Touring Machine. I went pretty far down the road with Xapian/Xapit, but:

You may want to trigger [reindexing] via a cron job on a recurring schedule (i.e. every day) to update the Xapian database. However it will only take effect after the Rails application is restarted because the Xapian database is stored in memory.

I see the Xapit Sync project to fix this has since ceased to be vaporware. Well, maybe next time.

Anyway, so RubyTrends tells me the cool kids use Thinking Sphinx, and I want to be cool, but I’m running Touring Machine on the cheap – on a shared DreamHost server – and they don’t want me to run server processes, and although some guy on the Internet says it’s probably fine, I’m leery of defying them. But last week they were running a discount offer on DreamHost PS, their quasi-VPS service – no root, but you can run whatever you want, within the resources (memory, CPU) you pay for.

Sounds like a fine place to run the search engine, but I didn’t want to run the whole Rails app there – since I’ve already got a place to run it that is effectively free. (DreamHost is very cheap, and I run a bunch of sites on it.) So I set out to run a distributed site, with the web app running on DreamHost shared hosting, and the search engine running on DreamHost PS.

It took some setup, but I think I’ve got it working. Here are some notes. As usual, this isn’t a tutorial. You should read everyone else’s instructions – especially the official Thinking Sphinx documentation, and J. Wade Winningham’s post about Capistrano tasks. (I didn’t use his deploy.rb, though.)

posted by erik at 5:47 pm  

Friday, June 12, 2009

where it’s at

With the infrastructure in place for the suggestions feature, it’s easy to add new kinds of suggestions – especially if someone else has already done most of the work. For example, given an address, we should be able to look up the names of businesses at that address, and suggest them as venue names.

Yahoo! and Google both do this automatically if you search for something they recognize as an address. But Google doesn’t seem to provide an API. Yahoo! Local does, and the YM4R gem already provides a nice Ruby wrapper for it. So:

  def make_suggestions_based_on_address
    # Ask Yahoo! Local for businesses within a tiny radius of the address.
    businesses = Ym4r::YahooMaps::BuildingBlock::LocalSearch::get(
      :location => address.block,
      :radius => 0.001,
      :results => 20,
      :query => '*'
    )
    businesses.each do |business|
      add_suggestion(:name, business.title)
      add_suggestion(:url, business.business_url) unless business.business_url.blank?
    end
  end

You might wonder about the radius parameter. Technically, Yahoo! doesn’t provide the service I want; I want “tell me what’s at this address”, it has “tell me what’s near this address”. With a radius of 0, it returns up to 10 results. After a little experimentation, it looks to me like 0.001 is a tight enough radius to get me all the businesses at an address – or at least the top 10 – and none of the ones next door.

This is actually one of two ways I’m subverting the API here. It’s intended for search, not simple lookup, which is why it’s a little awkward to get a simple address. It’s also why Yahoo! provides multiple URLs for each business. The BusinessUrl is the actual URL for the business’s web site. There’s also a BusinessClickUrl, which “contains extra information that helps [them] to optimize [their] search services.” Yahoo! requests that you display the BusinessUrl, but link to the BusinessClickUrl, so they can track usage.
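Here's a minimal sketch of that display-one, link-to-the-other split in plain Ruby. The Business struct and the link_parts helper are hypothetical stand-ins for illustration; they are not part of YM4R, and the business_click_url accessor is an assumption about how the click URL would be exposed.

```ruby
# Hypothetical stand-in for a Yahoo! Local result (not the YM4R class).
Business = Struct.new(:title, :business_url, :business_click_url)

# Returns the text to display and the href to link to, or nil when the
# business has no web site. Falls back to the plain URL if there is no
# click-tracking URL.
def link_parts(business)
  url = business.business_url.to_s
  return nil if url.empty?
  click = business.business_click_url.to_s
  { :text => url, :href => click.empty? ? url : click }
end
```

The view can then render the :text value as the link's label and the :href value as its target.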

The latest code actually uses both URLs, but I’m going to stop here for now.

posted by erik at 12:22 am  

Wednesday, June 10, 2009

how to help

Here’s how the suggestions feature for Touring Machine works so far.


  • As the user enters data in the new venue form, if we’re able to make any guesses, we pop them up in a panel to the right of the form.
  • Each suggestion is a link. If a user clicks a suggestion link, the suggestion is copied into the corresponding form field, and the field is given focus, so the user can edit it. (For example, if the “page title” suggestion has additional text around the venue name.)

There’s also an “accessible” JavaScript-free UI, but it’s bad, so I’m going to pretend it’s not there.

Client and Server

I started drawing an interaction diagram, but it’s really standard Rails stuff:

  • Model, controller, and view produce the new venue form.
  • When any form input’s value changes, we submit the whole form (as an AJAX GET) to the controller’s suggest action.
  • The server responds with an RJS view that populates and shows the suggestions panel (if there are any suggestions) or hides it (if not).
  • The suggestions are links that call a JavaScript function that copies, highlights, and focuses.
  • Once the user is done editing, the create action just works as normal.


In keeping with the Skinny Controller, Fat Model approach, the only interesting code is in the model. It took me a couple of tries to get a design I liked, but eventually I went with Rails-style magic:

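  def make_suggestions
    @suggestions = {}
    methods.grep(/^make_suggestions_based_on_/).each do |method|
      # The method name lists the attributes the suggester depends on.
      attrs = method.to_s.sub('make_suggestions_based_on_', '').split('_and_')
      values = attrs.map { |attr| send(attr) }
      # Run the suggester only when every attribute has a value.
      send(method) unless values.any?(&:blank?)
    end
    @suggestions
  end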

When asked for suggestions:

  • We look through the Venue object for methods of the form make_suggestions_based_on_attr_and_other_attr. (We support multiple attributes so that, for example, given a venue name and vague address – city and state – we can look up the exact address.) Call these “suggesters”.
  • We call each suggester only if we have non-blank values for all the named attributes (because how are you going to base a suggestion on the venue’s address if you don’t know its address?). Note that although I’m saying “attributes”, this code works for attributes, associations, and in fact any method. This is important because a venue’s address is actually a has_one association.
  • The suggesters populate @suggestions, and then we return it.
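
The steps above can be tried outside Rails with a toy class. This is only a sketch: ToyVenue, its two suggesters, and this version of add_suggestion are hypothetical, and a plain emptiness check stands in for ActiveSupport's blank?. The suggesters have to be public, because methods does not list private methods.

```ruby
# Toy stand-in for the Venue model; plain Ruby, no Rails.
class ToyVenue
  attr_accessor :name, :url, :address

  def initialize(attrs = {})
    attrs.each { |key, value| send("#{key}=", value) }
  end

  def make_suggestions
    @suggestions = {}
    methods.grep(/^make_suggestions_based_on_/).each do |method|
      attrs = method.to_s.sub('make_suggestions_based_on_', '').split('_and_')
      values = attrs.map { |attr| send(attr) }
      # Stand-in for blank?: nil and whitespace-only strings count as blank.
      send(method) unless values.any? { |value| value.to_s.strip.empty? }
    end
    @suggestions
  end

  # Hypothetical suggester that needs only a URL.
  def make_suggestions_based_on_url
    add_suggestion(:description, "See #{url}")
  end

  # Hypothetical suggester that needs both a name and an address.
  def make_suggestions_based_on_name_and_address
    add_suggestion(:tags, "#{name} at #{address}")
  end

  private

  # Collects suggestions per form field.
  def add_suggestion(field, value)
    (@suggestions[field] ||= []) << value
  end
end

venue = ToyVenue.new(:url => 'http://example.com/')
suggestions = venue.make_suggestions
```

With only a URL set, the name-and-address suggester is skipped, so suggestions holds a single :description entry. One caveat of the naming trick: an attribute whose own name contains “_and_” would be split in the wrong place.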

A small thing I’m pleased by: The make_suggestions method works both as “create suggestion objects” and in the idiomatic sense of “make a suggestion” as “suggest something”.

posted by erik at 1:55 pm  

Tuesday, June 9, 2009

help a brother out

Lately I’ve been trying to help Touring Machine help me. Adding a new venue is a little tedious: Typing, or copy and pasting, a name, URL, description, address, sometimes phone and email, and tags. It’s not a big deal for one venue, but when I’m researching venues in a new city, it gets tiresome. And it’s only going to get worse as I add more fields to the database.

The frustrating part is, all that information is out there already, and it’s not hard to find. I’m usually just copying from Google results, the first or second hit, or a MySpace page: - Mercury Lounge - 102 - Female - GO-GO-GOLETA, CALIFORNIA -

So why am I doing all the Control-C, Control-V?

If I give Touring Machine a URL, it should be able to guess the venue name (page title) and maybe an address or phone number, if they’re on the page. If I supply a name and city, it should get the address, and maybe URL, from a search engine. If they’re on MySpace, why not grab the “About Me” for a first-draft description? It should even be possible to scan the page for words that match popular tags – bar, cafe, restaurant, jazz, rock, folk.
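
As a rough illustration of two of those ideas (guessing the name from the page title, and scanning for popular tags), here is a minimal sketch in plain Ruby. It assumes the page HTML has already been fetched into a string, uses regular expressions where a real HTML parser would be more robust, and the names guess_name, guess_tags, and POPULAR_TAGS are made up for the example.

```ruby
# Tags to look for; the list is taken from the examples in the text.
POPULAR_TAGS = %w[bar cafe restaurant jazz rock folk]

# Guess a venue name from the page's <title>, or nil if there isn't one.
def guess_name(html)
  title = html[%r{<title[^>]*>(.*?)</title>}mi, 1]
  title && title.strip.gsub(/\s+/, ' ')
end

# Return the popular tags that appear as whole words in the page text.
def guess_tags(html)
  text = html.gsub(/<[^>]*>/, ' ').downcase
  POPULAR_TAGS.select { |tag| text =~ /\b#{Regexp.escape(tag)}\b/ }
end
```

Matching on whole words keeps “barbecue” from counting as “bar”, though it will still miss plurals and accented spellings such as “café”.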

Well, I’ve been working on it. The next few posts – unless I get sidetracked – will describe the design and implementation so far.

posted by erik at 11:39 am  
