ElasticSearch multi indexes effect on performance + Tire default config

Tag: elasticsearch , tire Author: goldencredit Date: 2013-06-14

We've recently decided to implement search with elasticsearch. Using Ruby on Rails we went with Tire.

Considering that an elasticsearch index is the equivalent of a database in a relational DB, why does Tire use a different index for each ActiveRecord model? Isn't that the purpose of the "_type" attribute?

I guess I don't understand what you're asking here -- are you interested in historical trivia of why Tire approaches it like this, or are you trying to solve a specific issue? You can configure your model with index_name and document_type methods and store everything in one index just fine.
Thanks for your response. My question was about the why. Why does Tire, by default, index each ActiveRecord model into a different index, and in particular, is it for performance reasons (meaning it's faster with separate indices)? It's our first project with elasticsearch, and I'm asking only because this is something we could not understand from the documentation, which gives no clear picture of it.
It's more of an arbitrary decision -- it was easier to do it this way initially. But on top of that, I figure having a separate index for each model might make more sense for people, and it also allows easy definition of the mapping in the model -- something you can't do otherwise.
Great.. Thanks @karmi!
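
To make the two layouts from the comments above concrete, here is a rough sketch of how a document would be addressed in each case, given that elasticsearch exposes documents at /index/type/id. This is plain Ruby and does not call Tire. The helper names and the "myapp" index name are made up for illustration, and the idea that the default index name is the pluralized model name is an assumption about Tire's defaults, not something stated in this thread.

```ruby
# Hypothetical helper: REST path of a document, in the form /index/type/id.
def document_path(index, type, id)
  "/#{index}/#{type}/#{id}"
end

# Layout A: one index per model (assumed to be roughly Tire's default).
def per_model_path(model, id)
  type = model.downcase
  document_path("#{type}s", type, id) # naive pluralization, illustration only
end

# Layout B: one shared index; models are told apart only by _type.
# "myapp" is a made-up index name.
def shared_index_path(model, id, index = "myapp")
  document_path(index, model.downcase, id)
end

puts per_model_path("Article", 1)    # /articles/article/1
puts shared_index_path("Article", 1) # /myapp/article/1
puts shared_index_path("Comment", 7) # /myapp/comment/7
```

In Tire itself, layout B would presumably be reached by setting index_name and document_type in each model, as mentioned in the first comment.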

Other Answer1

You can have different configurations for things like replication and number of shards at the index level. So, it makes sense to put your active records in different indices since you can have different configurations for these things and probably have different querying and performance needs for them.
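
As an illustration of per-index configuration, here is a minimal sketch that builds the settings bodies for two indices with different shard and replica counts. It only prints the requests and does not talk to a cluster. The index names, the numbers, and the index_settings helper are invented for the example.

```ruby
require 'json'

# Hypothetical helper: body for creating an index with its own
# shard and replica counts.
def index_settings(shards, replicas)
  {
    "settings" => {
      "index" => {
        "number_of_shards"   => shards,
        "number_of_replicas" => replicas
      }
    }
  }
end

# Made-up example: a large, heavily queried index and a small one.
indices = {
  "articles" => index_settings(5, 2),
  "users"    => index_settings(1, 1)
}

indices.each do |name, body|
  # Each body would be sent when the index is created, e.g. PUT /articles
  puts "PUT /#{name}"
  puts JSON.pretty_generate(body)
end
```

With everything in a single index, both models would have to share one set of these values.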

Be careful with the database analogies; they lead to bad schema design and poor performance. A single type in Elasticsearch might correspond to several tables in a database.

comments:

Thanks. I guess you have a point. Still, I think it's weird that the Tire defaults are that way. What is the point of the _type field in that case? Also, most of the time, these kinds of configurations are specified system-wide.
Types are useful if you need a few different variants of the same thing with e.g. different analyzer configs, different sources of data, or different versions of the schema. Elasticsearch also has aliases, which are great if you ever need to do schema migrations. The point of replicas and shards being configurable on a per-index basis is that you can evolve your configuration over time as needed and e.g. increase the number of shards in a new index, move the data over, and point the alias from the old index to the new one.
@JillesvanGurp Thanks for the explanation!
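
The alias switch described in the comments above could look roughly like this. The sketch only builds the body for the _aliases endpoint and prints it. The alias and index names (articles, articles_v1, articles_v2) and the helper name are hypothetical, and creating the new index and copying the data over are assumed to have happened already.

```ruby
require 'json'

# Hypothetical helper: body for POST /_aliases that repoints an alias
# from the old index to the new one in a single request.
def swap_alias_actions(alias_name, old_index, new_index)
  {
    "actions" => [
      { "remove" => { "index" => old_index, "alias" => alias_name } },
      { "add"    => { "index" => new_index, "alias" => alias_name } }
    ]
  }
end

body = swap_alias_actions("articles", "articles_v1", "articles_v2")
puts "POST /_aliases"
puts JSON.pretty_generate(body)
```

If the application only ever queries the alias, it should not need to know which concrete index currently sits behind it.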