Multi-word facets (ElasticSearch+Tire)

Tag: elasticsearch, tire Author: dazui312 Date: 2012-07-26

My model has a tags field, which is an array of tags. The problem I'm having is that I want each tag to be treated as a single keyword, but ES is somehow splitting the tags on spaces for the purposes of faceting.

The mapping is:

indexes :tags, type: :array

The query for popular tags is:

tire.search do
  facet 'tags' do
    terms :tags, size: 100
  end
end

What comes back is individual words. For example, a record tagged ["retro music", "awesome"] ends up with three separate tags. Similarly, if I run a term query on "retro music" (must { term 'tags', options[:tag] }), it fails, while a query on "retro" or "music" succeeds. The desired behaviour is that each tag should be atomic, so only a search for "retro music" (or "awesome") should succeed.

Best Answer

By default, elasticsearch analyzes strings using the "standard" analyzer, which converts strings to lowercase, splits them into words and removes some frequently occurring words (stopwords). You can prevent elasticsearch from doing all that by turning off analysis for the tags field:

indexes :tags, :type => 'string', :index => :not_analyzed 
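
To illustrate why the facet returns individual words, here is a minimal plain-Ruby sketch. It does not talk to elasticsearch: analyzed_terms is only a rough stand-in for the "standard" analyzer (lowercasing and word splitting, no stopword removal), and all method names here are made up for the illustration.

```ruby
# Rough stand-in for the "standard" analyzer: lowercase, then split into words.
# The real analyzer does more (for example stopword removal).
def analyzed_terms(value)
  value.downcase.scan(/\w+/)
end

# With :index => :not_analyzed the whole value is indexed as one term.
def not_analyzed_terms(value)
  [value]
end

# Count how many documents contain each term, similar in spirit to a terms facet.
# `analyzer` is the name of one of the two methods above.
def terms_facet(docs, analyzer)
  counts = Hash.new(0)
  docs.each do |tags|
    terms = tags.flat_map { |tag| send(analyzer, tag) }.uniq
    terms.each { |term| counts[term] += 1 }
  end
  counts
end

docs = [["retro music", "awesome"], ["retro music"]]

p terms_facet(docs, :analyzed_terms)      # keys: retro, music, awesome
p terms_facet(docs, :not_analyzed_terms)  # keys: "retro music", awesome
```

The same reasoning applies to the term query in the question: a term query is matched against the indexed terms as they are, so "retro music" can only match if the field was indexed without analysis.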

comments:

This makes sense, but doesn't work for some reason. It still breaks up the array items. BTW why "string" rather than "array"? Anyway, I've tried string, keyword, and array, but none of them work unfortunately.
Did you delete the old index and create a new one after making this change? This type of change requires the index to be deleted and recreated. Could you show the complete query? Also, the underlying type of the data is string; the fact that it's an array is handled automatically by elasticsearch.
I did delete the old index. I put the full code here: gist.github.com/3332033. It's essentially what I put in the question.
It's hard to figure out what's going on from the provided information. Could you create a standalone repro of the issue by any chance?
This is what I get when I try to reproduce this issue: gist.github.com/d6e98a1d6faaba2637e7

Other Answer1

For me the solution was :index => :not_analyzed as above, and building the index with Page.create_elasticsearch_index rather than Page.import, as shown here.
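
If the index is created implicitly, elasticsearch assigns string fields a default, analyzed mapping, so it can help to check what the index actually contains. The sketch below only prints the fragment one would hope to find for the tags field when inspecting the index's _mapping endpoint. The type name 'page' and the helper name expected_mapping are assumptions for illustration, and the real response will contain other fields and may be shaped differently depending on the elasticsearch version.

```ruby
require 'json'

# Builds the mapping fragment for a single not_analyzed string field.
# type_name and field are illustrative; adjust them to your model.
def expected_mapping(type_name, field)
  {
    type_name => {
      'properties' => {
        field => { 'type' => 'string', 'index' => 'not_analyzed' }
      }
    }
  }
end

# Print it so it can be compared by eye with the index's actual mapping.
puts JSON.pretty_generate(expected_mapping('page', 'tags'))
```

If the actual mapping for tags lacks "index": "not_analyzed", the index was most likely created before the mapping was applied and needs to be deleted and recreated.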