The four solutions for a Ruby on Rails system seem to be:
- Stick with a MySQL query, but do something smarter (perhaps with an index, or using the full text search capabilities of MySQL)
- Ferret
- Ultra-Search
- Solr
A bit of Googling turned up a useful looking plugin: acts_as_fulltexttable
(Here are some more search options that I didn't have time to investigate: http://agilewebdevelopment.com/plugins/search?search=search
This little plugin does exactly what I was thinking... build a search index using mysql's full text search capability. If it is fast enough, then it will probably be good enough for my purposes.
Here are the full installation instructions.
== Step 1
Install the plugin:
script/plugin install http://wonsys.googlecode.com/svn/plugins/acts_as_fulltextable/
I had no problems with this... just note the last line that the installation process prints out.
== Step 2
Add the following code to the model that should be included in searches:
acts_as_fulltextable :fields, :to, :include, :in, :index
I wondered if this line should be inserted literally, or whether ':fields', for instance, should be replaced by the fields I actually wanted to index. I decided to follow the instructions literally, and added this line to three models I wanted to search.
Needless to say, this didn't work at all. I really hate overly clever documentation, and this is a classic case. I scratched my head and read the code for about an hour before stumbling upon the answer... read all the arguments to acts_as_fulltextable together, like a sentence:
'fields to include in index'
Duh. A simple example would have saved me, and probably a bunch of other people, a lot of time.
So the right way to do this is to add a line like:
acts_as_fulltextable :name, :description, :author, :etc
== Step 3
Create the migration:
script/generate fulltext_rows model1 model2 model3 ...
'modelx' needs to be the lowercase, singular version of the model name. So if your model is 'Item', use 'item'.
Then execute it:
rake db:migrate
No problems. The migration worked fine, and the index was built, all in one step. Nice. Now, does it work?
== Run searches
You can either run a search on a single model:
Model.find_fulltext('query to run', :limit => 10, :offset => 0)
Or you can run it on more models at once:
FulltextRow.search('query to run', :only => [:only, :this, :models], :limit => 10, :offset => 0)
I imported a database with 50000 records and tried some searches. Immediate gratification. Not only did the acts_as_fulltextable plugin work, but searches were blindingly fast. Things were looking up.
However, I quickly realized the query syntax was a bit quirky. It's documented in the MySQL documentation.
The plugin uses 'in boolean mode'. Here's the documentation.
Will Paginate
The next big step will be to get will_paginate working.
So, it wasn't hard to integrate the plugin with the rest of the website, and will_paginate seemed to work just fine.
However, I did run into a weird problem... Even though I had 50K items in the database, the standard search seemed to be limited to 30. Here's the SQL from the log:
FulltextRow Load (0.382258) SELECT fulltext_rows.*, match(`value`) against('1*' in boolean mode) AS relevancy FROM `fulltext_rows` WHERE (match(value) against('1*' in boolean mode) AND fulltextable_type IN ('Item')) ORDER BY relevancy DESC, value ASC LIMIT 0, 30
Note the 'Limit 0, 30' at the end.
This is limit is not in the plugin code, so I am trying to figure out where it is coming from. Using the limit field during the search does not help... find_fulltext() still returns 30 items, even when the search should return thousands of items... very puzzling...
@collection = Item.find_fulltext(search_terms, :limit => 100)
It seems to have something to do with will_paginate. If I remove the "require 'will_paginate'" from environment.rb, then when I run the above command in script/console, I get 100 items back.
In fulltext_row.rb, the following lines seem to merge paginate options with search options in a way that doesn't work for me:
if defined?(WillPaginate)
self.paginate(:all, search_options.merge(:page => page))
else
self.find(:all, search_options.merge(:limit => limit, :offset => offset))
end
It may be a bug, it may just be my requirements, but somehow that 30 from will_paginate is being used to limit the number of items returned in search. I need a way to override this behavior...
The way I 'fixed' this is to make a small change to the above code:
if defined?(WillPaginate)
self.paginate(:all, search_options.merge(:page => page, :per_page => 500))
else
self.find(:all, search_options.merge(:limit => limit, :offset => offset))
end
By adding that :per_page number, I am able to get 500 items, rather than 30. 500 is probably enough for me.
This seems a bit of a kludge, so I'll need to dig into this more, but for now, that seems to work.
No comments:
Post a Comment