A few points that came up in the thread and are worth clarifying:
- We do get compared to Elasticsearch a lot. While we support some of its APIs, Manticore isn't a drop-in replacement. We've focused on performance, simplicity, and keeping things open-source without vendor lock-in. Our own SQL-based query language and REST endpoints are part of that philosophy.
- @mdaniel was right to question the "drop-in" wording — that's not our goal.
- As @sandstrom pointed out, tools like Typesense and Meilisearch are part of this evolving search space. We see Manticore fitting in where users want powerful full-text and vector search with lower resource overhead and SQL support (we support JSON too, though).
We'd love to hear from you:
- What are your main use cases for search or log indexing?
- Which Elasticsearch features (if any) are absolutely essential for you?
- Are there performance comparisons or scaling challenges you'd like to see addressed?
Happy to answer any questions or dive deeper.
To me, storing and searching logs is quite different from most other search use-cases, and it's not obvious that they should be handled by the same piece of software.
For example, tokenization, stemming, language support and many other things are basically useless in log search. Also, log search usually means storing a lot of data and rarely retrieving it (a different usage pattern from many search use-cases, which tend to be less write-heavy and more about reads).
I know ElasticSearch has had success in both, but if I were Manticore/Typesense/Meilisearch I'd probably just skip logs altogether.
Loki, QuickWit and other such tools are likely better suited for logs.
- https://github.com/quickwit-oss/quickwit
- https://github.com/grafana/loki
Camera and camera lens names. I tried Meilisearch (1-2 years ago), and while I loved the simplicity (I barely understand what I threw together with many, many lines of C# code for Elasticsearch; this is partially on ES, but clearly on me as well), it was not good at getting results.
Names like "Lumix DC-S1IIE", "DSC-RX100 VIIA", or "FE 50-150 mm F2 GM (SEL50150GM)" do not quite work with default tokenizers and analyzers. Of course that is for product names, for full text queries still need to use normal language rules… except for product names showing up in the text, so now I need multiple indexes for every field, and searching different sub-fields sometimes with different query analyzers.
It was a lot of trial and error getting ES to both find what was searched for and be typo tolerant. It's very easy to get far too many results and bad scoring for fuzzy queries.
So a bit of a special case, but something the customization capabilities of ES support pretty well.
Luckily, our dataset is rather small, maybe 100k documents, so scalability is not a problem.
- choosing which characters should be treated as token characters, and using the rest as token separators
- defining "blend chars" — for example, the hyphen (-) could make sense as both a separator and a non-separator in your case
- or optionally adding it to the ignore_chars list
- there's also regexp_filter to process tokens when indexing and searching (a rough sketch of these options follows below)
That said, setting things like this up perfectly is always tricky with any search engine, because the words and punctuation in real data often don't follow regular patterns. It's especially difficult when you want to find "abc def" by "ab cd ef" which may be a common situation in your case.
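To make the options above concrete, here's a rough sketch of how they look when defining a table in Manticore's SQL. The table name, field and option values are made up for illustration (the option names are real settings), so check the tokenization section of the manual before copying anything:

    CREATE TABLE products(title text)
      charset_table = 'non_cjk'
      blend_chars = '-, /'
      ignore_chars = 'U+AD'
      regexp_filter = 'Mark II => mk2'

Here blend_chars makes the hyphen and slash behave as both separators and token characters (so "DC-S1IIE" is indexed both as one token and as "DC" + "S1IIE"), and regexp_filter rewrites the raw text the same way at indexing and at search time.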
Does this mean you've at least implemented every API that Kibana requires?
Speaking of features, both Sphinx (as a closed-source project) and Manticore (as an open-source project) have added some nice improvements over the last few years. But again, if you're happy with the 20-year-old version, there's probably nothing to worry about.
They're surely not using a 20-year-old version of Sphinx - it appears that the latest open source version on Github is 8 years old. They have a closed source version which appears to be maintained, though has infrequent releases.
Why not just say these things, which make manticore self-evidently a better choice?
https://github.com/sphinxsearch/sphinx
https://sphinxsearch.com/downloads/current/
Thank you for saying that up front. I read a description of your product and the first thing I thought was, "this looks like a potential alternative to ElasticSearch, but it is not a drop-in replacement for ElasticSearch".
could you ELI5 the query language and TF-IDF?
(Being lazy, I am happy to look into this myself lol.)
Manticore Search's query language is more expressive than Lucene's. While Lucene supports basic boolean logic, field search, phrase queries, wildcards, and proximity, Manticore adds many powerful operators. These include quorum matching (/N), strict word order (<<), NEAR/NOTNEAR proximity, MAYBE (soft OR), exact form (=word), position anchors (^word, word$), full regular expressions, and more. Manticore uses SQL-style syntax with a MATCH() clause for full-text search, making it easier to combine text search with filters and ranking.
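As a hedged illustration (the table and field names here are invented, not from any real schema), a query combining a few of those operators might look like this:

    SELECT id, WEIGHT()
    FROM articles
    WHERE MATCH('@title "full frame mirrorless camera"/3 << review')
      AND year >= 2024
    ORDER BY WEIGHT() DESC
    LIMIT 10;

The /3 quorum requires at least three of the four quoted words in the title field, << requires "review" to appear after them, and the extra WHERE condition plus ORDER BY WEIGHT() shows how full-text matching mixes with ordinary attribute filtering and ranking in the same SQL statement (year is an assumed numeric attribute).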
It sounds like they're really targeting the logging search store part of ELK, which can be a perfectly fine objective, but there's no need to mislead audiences, since they will find out and then you've made an enemy.
Very misleading title
(both are also trying to replace Algolia, because both have cloud offerings)
I have no affiliation.
Plus, given that AWS is currently hosting OpenSearch, they are not incentivized to rest on their laurels when it comes to modern features or stability.
Man, that made me laugh. I'm using that.
Edit: Nevermind, in another part of this thread the maintainer said:
We do get compared to Elasticsearch a lot. While we support some of its APIs, Manticore isn't a drop-in replacement
Which conflicts with the README: "Drop-in replacement for E in the ELK stack"

Sorry for the confusion :)
(Logstash can basically ingest and output to everything…)
it's great to see that the project is alive and adding embeddings-related functions needed for semantic search.
What was the reason for the fork, and in what ways does Manticore Search differ from Sphinx today?
As far as I understand, the apparent death of Sphinx and demand for continued development/support from its big users led to the creation of Manticore.
MIT: https://github.com/meilisearch/meilisearch/blob/v1.15.2/LICE...
from a few months ago: https://news.ycombinator.com/item?id=43680699 and the .com has quite a few submissions, but without any obvious commentary upon them https://news.ycombinator.com/from?site=meilisearch.com
Recently had a look at Tantivy as well, although compared to raw Lucene, their perf is actually inferior. Wonder if there are specific benchmarks here which measure performance, and if they compared tail latencies as opposed to averages.
Manticore has a modern multithreading architecture with efficient query parallelization that fully utilizes all CPU cores. It supports real-time indexing - documents are searchable immediately after insertion, with no need to wait for flushes or refreshes.
It uses row-wise storage optimized for small to large datasets, and for even larger datasets that don’t fit into memory, there's support for columnar storage through the Manticore Columnar Library.
Secondary indexes are built automatically using the PGM-index (Piecewise Geometric Model index), which enables efficient filtering and sorting by mapping keys to their memory locations. The cost-based query optimizer uses statistics about the data to choose the most efficient execution plan for each query.
Manticore is SQL-first: SQL is its native syntax, and it speaks the MySQL protocol, so it works out of the box with MySQL clients.
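For instance, with a default local install you can connect using a stock MySQL client on the SQL port (9306 by default, e.g. "mysql -h127.0.0.1 -P9306") and try something like the following; the table and values are invented for the example:

    CREATE TABLE logs(msg text, level string);
    INSERT INTO logs(msg, level) VALUES ('disk almost full on node-3', 'warn');
    SELECT * FROM logs WHERE MATCH('disk');

The SELECT sees the row immediately after the INSERT, which is the real-time indexing point above: no refresh or flush step in between.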
It's written in C++, starts quickly, uses minimal RAM, and avoids garbage collection — which helps keep latencies low and stable even under load.
As for benchmarks, there's a growing collection of them at https://db-benchmarks.com, where Manticore is compared to Elasticsearch, MySQL, PostgreSQL, Meilisearch, Typesense, and others. The results are open and reproducible.
We built a custom search engine on top of Elasticsearch. Our query builder regularly constructs optimised queries that would be impossible to implement in any of the touted alternatives or replacements, which almost always focus on simple full-text search, because that's all the developers ever used ES for. There's a mindbogglingly huge number of additional features that you need for serious search engines, though, and any contender will have to support at least a subset of these to deserve that title in the first place.
I’m keeping an eye on the space, but so far, I’m less than impressed with everything I’ve seen.
I didn't dig into the docs, but now having seen the "create table whatever(name string)" makes me super paranoid: does your mention of "dynamic mapping" as a missing feature mean that if a document shows up with <<{"name":"Fred","birthday":"1970-12-25"}>> it'll drop the document?
Custom script scoring is available - https://manual.manticoresearch.com/Extensions/UDFs_and_Plugi...
Vectors - yes. Recent blog post on it https://manticoresearch.com/blog/quantization/
Pipelined aggregations - no.
Nested documents - no, but Manticore supports INNER JOIN and LEFT JOIN.
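To give a rough idea of the JOIN route as a stand-in for nested documents (the schema is invented for the example, and Manticore's JOIN support has its own constraints, so treat this as the general shape rather than a recipe):

    CREATE TABLE customers(name string);
    CREATE TABLE orders(customer_id bigint, note text);
    SELECT orders.id, customers.name
    FROM orders
    LEFT JOIN customers ON orders.customer_id = customers.id
    WHERE MATCH('refund', orders);

Instead of nesting order lines inside a customer document, the related rows live in a second table and get pulled in at query time, with the two-argument MATCH() limiting the full-text condition to the orders table.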
The auto-bolding of query terms in responses is quite convenient and has allowed me to skip annoying little regexes many times. Maybe other engines have it too and I never noticed?
Rambling on: Solr is an Apache product, does tandem releases with Apache Lucene, and is the big sister of Elasticsearch. ES used/uses Apache Lucene as its underlying engine too.
ElasticSearch in particular is rather fiddly and demanding, and I've seen exactly one production ES system that impressed me; the rest were log stores that could have used just about any database engine.
With Manticore I just apt it in and start pouring data into it immediately. Similar to how MySQL/MariaDB is almost frictionless when you run it locally with a root account, and neither ever becomes annoying in resource consumption unless I actively mess up. When I have an idea or some ETL task I'd like to implement, I typically choose either of those (or DuckDB) over Postgres and ElasticSearch.