A few points that came up in the thread and are worth clarifying:
- We do get compared to Elasticsearch a lot. While we support some of its APIs, Manticore isn't a drop-in replacement. We've focused on performance, simplicity, and keeping things open-source without vendor lock-in. Our own SQL-based query language and REST endpoints are part of that philosophy.
- @mdaniel was right to question the "drop-in" wording — that's not our goal.
- As @sandstrom pointed out, tools like Typesense and Meilisearch are part of this evolving search space. We see Manticore fitting in where users want powerful full-text and vector search with lower resource overhead and SQL support (we support JSON too, though).
We'd love to hear from you:
- What are your main use cases for search or log indexing?
- Which Elasticsearch features (if any) are absolutely essential for you?
- Are there performance comparisons or scaling challenges you'd like to see addressed?
Happy to answer any questions or dive deeper.
To me, storing and searching logs is quite different from most other search use-cases, and it's not obvious that they should be handled by the same piece of software.
For example, tokenization, stemming, language support and many other things are basically useless in log search. Also, log search usually means storing a lot of data and rarely retrieving it (a different usage pattern from many search use-cases, which tend to be less write-heavy and more about reads).
I know ElasticSearch has had success in both, but if I were Manticore/Typesense/Meilisearch I'd probably just skip logs altogether.
Loki, QuickWit and other such tools are likely better suited for logs.
- https://github.com/quickwit-oss/quickwit
- https://github.com/grafana/loki
Camera and camera lens names. I tried Meilisearch (1-2 years ago), and while I loved the simplicity (I barely understand what I threw together with many, many lines of C# code for Elasticsearch; this is partially on ES, but clearly on me as well), it was not good at getting results.
Names like "Lumix DC-S1IIE", "DSC-RX100 VIIA", or "FE 50-150 mm F2 GM (SEL50150GM)" do not quite work with default tokenizers and analyzers. Of course that is for product names, for full text queries still need to use normal language rules… except for product names showing up in the text, so now I need multiple indexes for every field, and searching different sub-fields sometimes with different query analyzers.
It was a lot of trial and error getting ES to both find what was searched for and be typo tolerant. It's very easy to get far too many results and bad scoring for fuzzy queries.
So a bit of a special case, but something the customization capabilities of ES support pretty well.
Luckily, our dataset is rather small, maybe 100k documents, so scalability is not a problem.
- choosing which characters should be treated as token characters, and using the rest as token separators
- defining "blend chars" — for example, the hyphen (-) could make sense as both a separator and a non-separator in your case
- or optionally adding it to the ignore_chars list
- there's also regexp_filter to process tokens when indexing and searching (a rough sketch of these options follows below)
That said, setting things like this up perfectly is always tricky with any search engine, because the words and punctuation in real data often don't follow regular patterns. It's especially difficult when you want to find "abc def" by "ab cd ef" which may be a common situation in your case.
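To make the options above concrete, here's a rough sketch of how they look when defining a table in Manticore's SQL. The table name, field and option values are made up for illustration (the option names are real settings), so check the tokenization section of the manual before copying anything:

    CREATE TABLE products(title text)
      charset_table = 'non_cjk'
      blend_chars = '-, /'
      ignore_chars = 'U+AD'
      regexp_filter = 'Mark II => mk2'

Here blend_chars makes the hyphen and slash behave as both separators and token characters (so "DC-S1IIE" is indexed both as one token and as "DC" + "S1IIE"), and regexp_filter rewrites the raw text the same way at indexing and at search time.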
Does this mean you've at least implemented every API that Kibana requires?
Speaking of features, both Sphinx (as a closed-source project) and Manticore (as an open-source project) have added some nice improvements over the last few years. But again, if you're happy with the 20-year-old version, there's probably nothing to worry about.
They're surely not using a 20-year-old version of Sphinx - it appears that the latest open source version on Github is 8 years old. They have a closed source version which appears to be maintained, though has infrequent releases.
Why not just say these things, which make manticore self-evidently a better choice?
https://github.com/sphinxsearch/sphinx
https://sphinxsearch.com/downloads/current/
Thank you for saying that up front. I read a description of your product and the first thing I thought was, "this looks like a potential alternative to ElasticSearch, but it is not a drop-in replacement for ElasticSearch".
could you ELI5 the query language and TF-IDF?
(Being lazy, I am happy to look into this myself lol.)
Manticore Search's query language is more expressive than Lucene's. While Lucene supports basic boolean logic, field search, phrase queries, wildcards, and proximity, Manticore adds many powerful operators. These include quorum matching (/N), strict word order (<<), NEAR/NOTNEAR proximity, MAYBE (soft OR), exact form (=word), position anchors (^word, word$), full regular expressions, and more. Manticore uses SQL-style syntax with a MATCH() clause for full-text search, making it easier to combine text search with filters and ranking.
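As a hedged illustration (the table and field names here are invented, not from any real schema), a query combining a few of those operators might look like this:

    SELECT id, WEIGHT()
    FROM articles
    WHERE MATCH('@title "full frame mirrorless camera"/3 << review')
      AND year >= 2024
    ORDER BY WEIGHT() DESC
    LIMIT 10;

The /3 quorum requires at least three of the four quoted words in the title field, << requires "review" to appear after them, and the extra WHERE condition plus ORDER BY WEIGHT() shows how full-text matching mixes with ordinary attribute filtering and ranking in the same SQL statement (year is an assumed numeric attribute).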
It sounds like they're really targeting the logging search store part of ELK, which can be a perfectly fine objective, but there's no need to mislead audiences, since they will find out and then you've made an enemy.
Very misleading title
(both are also trying to replace Algolia, because both have cloud offerings)
I have no affiliation.
Plus, given that AWS is currently hosting OpenSearch, they are not incentivized to rest on their laurels when it comes to modern features or stability.
Man, that made me laugh. I'm using that.
Edit: Nevermind, in another part of this thread the maintainer said:
We do get compared to Elasticsearch a lot. While we support some of its APIs, Manticore isn't a drop-in replacement
Which conflicts with the README: "Drop-in replacement for E in the ELK stack"

Sorry for the confusion :)
(Logstash can basically ingest and output to everything…)
it's great to see that the project is alive and adding embeddings-related functions needed for semantic search.
What was the reason for the fork, and in what ways does Manticore Search differ from Sphinx today?
As far as I understand, the apparent death of Sphinx and demand for continued development/support from its big users led to the creation of Manticore.
MIT: https://github.com/meilisearch/meilisearch/blob/v1.15.2/LICE...
from a few months ago: https://news.ycombinator.com/item?id=43680699 and the .com has quite a few submissions, but without any obvious commentary upon them https://news.ycombinator.com/from?site=meilisearch.com
Recently had a look at Tantivy as well, although compared to raw Lucene, their perf is actually inferior. Wonder if there are specific benchmarks here which measure performance, and if they compared tail latencies as opposed to averages.
Manticore has a modern multithreading architecture with efficient query parallelization that fully utilizes all CPU cores. It supports real-time indexing - documents are searchable immediately after insertion, with no need to wait for flushes or refreshes.
It uses row-wise storage optimized for small to large datasets, and for even larger datasets that don’t fit into memory, there's support for columnar storage through the Manticore Columnar Library.
Secondary indexes are built automatically using the PGM-index (Piecewise Geometric Model index), which enables efficient filtering and sorting by mapping keys to their memory locations. The cost-based query optimizer uses statistics about the data to choose the most efficient execution plan for each query.
Manticore is SQL-first: SQL is its native syntax, and it speaks the MySQL protocol, so it works out of the box with MySQL clients.
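For instance, with a default local install you can connect using a stock MySQL client on the SQL port (9306 by default, e.g. "mysql -h127.0.0.1 -P9306") and try something like the following; the table and values are invented for the example:

    CREATE TABLE logs(msg text, level string);
    INSERT INTO logs(msg, level) VALUES ('disk almost full on node-3', 'warn');
    SELECT * FROM logs WHERE MATCH('disk');

The SELECT sees the row immediately after the INSERT, which is the real-time indexing point above: no refresh or flush step in between.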
It's written in C++, starts quickly, uses minimal RAM, and avoids garbage collection — which helps keep latencies low and stable even under load.
As for benchmarks, there's a growing collection of them at https://db-benchmarks.com, where Manticore is compared to Elasticsearch, MySQL, PostgreSQL, Meilisearch, Typesense, and others. The results are open and reproducible.
We built a custom search engine on top of Elasticsearch. Our query builder regularly constructs optimised queries that would be impossible to implement in any of the touted alternatives or replacements, which almost always focus on simple full-text search, because that's all the developers ever used ES for. There's a mindbogglingly huge number of additional features that you need for serious search engines, though, and any contender will have to support at least a subset of these to deserve that title in the first place.
I’m keeping an eye on the space, but so far, I’m less than impressed with everything I’ve seen.
I didn't dig into the docs, but now having seen the "create table whatever(name string)" makes me super paranoid: does your mention of "dynamic mapping" as a missing feature mean that if a document shows up with <<{"name":"Fred","birthday":"1970-12-25"}>> it'll drop the document?
Custom script scoring is available - https://manual.manticoresearch.com/Extensions/UDFs_and_Plugi...
Vectors - yes. Recent blog post on it https://manticoresearch.com/blog/quantization/
Pipelined aggregations - no.
Nested documents - no, but Manticore supports INNER JOIN and LEFT JOIN.
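To give a rough idea of the JOIN route as a stand-in for nested documents (the schema is invented for the example, and Manticore's JOIN support has its own constraints, so treat this as the general shape rather than a recipe):

    CREATE TABLE customers(name string);
    CREATE TABLE orders(customer_id bigint, note text);
    SELECT orders.id, customers.name
    FROM orders
    LEFT JOIN customers ON orders.customer_id = customers.id
    WHERE MATCH('refund', orders);

Instead of nesting order lines inside a customer document, the related rows live in a second table and get pulled in at query time, with the two-argument MATCH() limiting the full-text condition to the orders table.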
The auto-bolding of query terms in responses is quite convenient and has allowed me to skip annoying little regexes many times. Maybe other engines have it too and I never noticed?
Rambling on: Solr is an Apache product, does tandem releases with Apache Lucene, and is the big sister of Elasticsearch. ES used/uses Apache Lucene as its underlying engine too.
ElasticSearch in particular is rather fiddly and demanding, and I've seen exactly one production ES system that impressed me; the rest were log stores that could have used just about any database engine.
With Manticore I just apt it in and start pouring data into it immediately. Similar to how MySQL/MariaDB is almost frictionless when you run it locally with a root account, and neither ever becomes annoying in resource consumption unless I actively mess up. When I have an idea or some ETL task I'd like to implement, I typically choose either of those (or DuckDB) over Postgres and ElasticSearch.