Stemmer in Solr

Tag: apache, solr, solrj, stemming Author: cuiyanaaa Date: 2012-09-04

I have been using EnglishPorterFilterFactory in the application I'm currently building in Solr, and things are going fine. I then tried EnglishMinimalStemFilterFactory because I wanted a less aggressive stemmer, but I could not see much difference in the Solr results. What is the difference between the two? Also, could you recommend a less aggressive filter factory that only stems plurals?

Thanks.

Best Answer

I would go with HunspellStemFilterFactory. Because it is based on dictionaries rather than on algorithmic suffix rules, I expect it to be less aggressive.

comments:

Could you please tell me the difference between EnglishPorterFilterFactory and EnglishMinimalStemFilterFactory?
They implement algorithms described by different research papers. See tartarus.org/martin/PorterStemmer and medialab.tfe.umu.se/courses/mdm0506a/material/… (look for "S stemmer")
Thanks a lot! :)
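
To make the "S stemmer" mentioned in the comments more concrete, here is a minimal Python sketch of its three plural-stripping rules as they are usually described (only the first rule whose suffix matches is applied). This is an illustration written for this thread, not the code Solr actually runs: the function name `s_stem` is hypothetical, and edge cases such as very short words may be handled differently by EnglishMinimalStemFilterFactory.

```python
def s_stem(word: str) -> str:
    """Hypothetical sketch of the S stemmer: strip common English plural endings only."""
    w = word.lower()
    # Very short words and words not ending in "s" are left alone.
    if len(w) < 3 or not w.endswith("s"):
        return w
    # Rule 1: "ies" -> "y", except after "a" or "e" ("aies", "eies").
    if w.endswith("ies"):
        return w if w.endswith(("aies", "eies")) else w[:-3] + "y"
    # Rule 2: "es" -> "e", except after "a", "e" or "o" ("aes", "ees", "oes").
    if w.endswith("es"):
        return w if w.endswith(("aes", "ees", "oes")) else w[:-1]
    # Rule 3: drop a final "s", except in "us" and "ss".
    if w.endswith(("us", "ss")):
        return w
    return w[:-1]


if __name__ == "__main__":
    for term in ["ponies", "houses", "dogs", "class", "status", "running"]:
        print(term, "->", s_stem(term))
```

Note that a word such as "running" passes through unchanged here, whereas the Porter algorithm also strips suffixes like "-ing" and would reduce it to "run". If the indexed text and queries mostly differ in plurals, the two stemmers can return very similar results, which may be why the difference was hard to spot.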