[personal profile] qatsi
Book Review: Elasticsearch in Action, by Radu Gheorghe, Matthew Lee Hinman, and Roy Russo
In a previous job I'd come across Solr, but these days, I'm told, Elasticsearch is where it's at, at least where I work now. Ultimately, both are built on top of Lucene, which does make you wonder whether Lucene itself has any competitors.

This book was published in 2016, but one of its weaknesses is that it's clearly out of date already: it refers to the "latest" Elasticsearch being 1.5, with 2.0 in development. (There's been some creative arithmetic in the version numbering, but the current release is 5.x, and over that time there have been significant changes to some APIs.) Uncharacteristically for Manning, I did encounter a few badly worded sections, and examples or diagrams that didn't make sense or had incorrect values; cumulatively these things make me wonder whether the book suffered long delays and then finally got pushed out in a hurry without adequate proofing.

Hopefully the underlying principles remain much the same. There are chapters on basic CRUD operations, index and analyzer configuration, search syntax, and relevance scoring. It always runs against my intuition that many search queries are exact, when my expectation is for the results to have some degree of fuzziness; of course, in practice that's where the scoring comes in. There are also chapters on techniques for modelling parent-child relationships between documents, scaling, performance, and administrative tasks. Somewhat unfairly, several topics are relegated to appendices, including geo-search, hit highlighting, percolation (which was completely new to me but seems potentially very powerful: it discovers which stored queries will match a given document) and completion/suggestion. It seems to me these are at least as important as, if not more important than, parent-child relationships and aggregation functions, but maybe that's just my use case.
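
Since percolation was the idea that was newest to me, here is how I picture it: a toy sketch in plain Python, and emphatically not the Elasticsearch API. Everything in it (the `tokenize` and `percolate` helpers, the query names, the "all terms must appear" matching rule) is my own assumption for illustration. The point is only the inversion: the queries are stored up front, and each incoming document is tested against them.

```python
# Toy illustration only: this is NOT the Elasticsearch API.
# Hypothetical helpers showing the "reverse search" idea behind percolation.

def tokenize(text):
    """Lowercase, split on whitespace, and strip basic punctuation."""
    words = {word.strip(".,;:!?()\"'").lower() for word in text.split()}
    return words - {""}


def percolate(registered_queries, document):
    """Return the names of stored queries whose terms all appear in the document."""
    terms = tokenize(document)
    return sorted(
        name
        for name, required_terms in registered_queries.items()
        if required_terms <= terms  # assumed rule: every query term must be present
    )


# Queries are registered first; documents arrive later.
registered_queries = {
    "lucene-news": {"lucene"},
    "es-scaling": {"elasticsearch", "scaling"},
    "solr-geo": {"solr", "geo-search"},
}

matches = percolate(
    registered_queries,
    "A chapter on scaling Elasticsearch, which is built on Lucene.",
)
print(matches)
```

A real engine would presumably analyze the document and index the queries rather than loop over them, but the shape of the problem is the same: one document in, a list of matching queries out, which is what makes it look useful for alerting.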
September 2017

