Question 1

Why is MySQL FULLTEXT so slow?

Accepted Answer

MySQL FULLTEXT is slow because it stores its index on disk in B-tree pages, needs disk I/O whenever those pages are not already cached, and uses uncompressed posting lists. MygramDB addresses this with in-memory N-gram indexing, which delivers consistent sub-millisecond latency.

Question 2

How does MygramDB sync with MySQL?

Accepted Answer

MygramDB uses GTID-based binlog replication to sync with MySQL in real time. It acts as a MySQL replica and receives changes through the binary log, so no ETL pipelines or manual sync are needed. Write to MySQL as usual, and MygramDB updates automatically.

Question 3

How much faster is MygramDB than MySQL FULLTEXT?

Accepted Answer

On a dataset of 1.1M Wikipedia articles, MygramDB delivers sub-millisecond search latency, compared with 500ms-2.5s for MySQL FULLTEXT. COUNT queries are thousands of times faster. With verify_text enabled (v1.5.0), results match MySQL exactly. Benchmarks are reproducible via `make bench-up`.

Question 4

Does MygramDB support Japanese/Chinese/Korean text?

Accepted Answer

Yes. MygramDB supports CJK text through ICU-based Unicode normalization and N-gram tokenization. Japanese, Chinese, and Korean text is searchable without additional plugins or configuration.
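
As a rough illustration of why normalization plus N-gram tokenization suits CJK text, which has no spaces between words, the sketch below uses Python's standard `unicodedata` module as a stand-in for ICU. The NFKC form, the lowercasing step, and the function names are assumptions made for this example, not MygramDB's documented behavior.

```python
import unicodedata


def normalize(text: str) -> str:
    # Assumption: NFKC + lowercase. NFKC folds full-width Latin letters
    # into their ASCII equivalents; MygramDB's actual ICU settings may differ.
    return unicodedata.normalize("NFKC", text).lower()


def ngrams(text: str, n: int = 2) -> list[str]:
    """Split normalized text into overlapping character n-grams."""
    text = normalize(text)
    return [text[i:i + n] for i in range(len(text) - n + 1)]


if __name__ == "__main__":
    print(ngrams("東京都"))      # overlapping bi-grams, no word segmentation needed
    print(ngrams("ＭｙＳＱＬ"))  # full-width input is folded before tokenizing
```

Because every overlapping character pair is indexed, no language-specific word segmenter or dictionary is required.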

Question 5

What is the difference between MygramDB and Elasticsearch?

Accepted Answer

MygramDB is a single-binary deployment with direct MySQL binlog sync, sub-millisecond latency, and low operational complexity. Elasticsearch offers distributed search and advanced features but requires cluster management, ETL pipelines, and JVM tuning. Choose MygramDB for simpler MySQL-based applications; Elasticsearch for large-scale distributed search.

Strategy	When Used	Representation
Delta encoding	Sparse terms (density < 18%)	Sorted IDs stored as varint-encoded deltas
Roaring bitmap	Dense terms (density >= 18%)	Compressed bitmap via CRoaring library
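
The two strategies in the table above can be sketched as follows. This is a minimal illustration, not MygramDB's implementation: it assumes that "density" means the share of all documents that contain the term, and the function names are hypothetical. The Roaring bitmap side is represented only by the selection result, since the real work there is done by the CRoaring library.

```python
DENSITY_THRESHOLD = 0.18  # from the table: below 18% is treated as sparse


def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer in 7-bit groups, low group first."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def delta_encode(sorted_ids: list[int]) -> bytes:
    """Store sorted IDs as varint-encoded gaps between neighbours."""
    out = bytearray()
    prev = 0
    for doc_id in sorted_ids:
        out += encode_varint(doc_id - prev)
        prev = doc_id
    return bytes(out)


def delta_decode(data: bytes) -> list[int]:
    """Reverse delta_encode: rebuild the sorted ID list from the gaps."""
    ids, current, value, shift = [], 0, 0, 0
    for byte in data:
        value |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7
        else:
            current += value
            ids.append(current)
            value, shift = 0, 0
    return ids


def choose_strategy(posting_count: int, total_docs: int) -> str:
    """Pick a representation from the term's density (assumed definition)."""
    density = posting_count / total_docs if total_docs else 0.0
    return "roaring_bitmap" if density >= DENSITY_THRESHOLD else "delta_varint"


if __name__ == "__main__":
    ids = [3, 10, 300, 100000]
    packed = delta_encode(ids)
    print(len(packed), delta_decode(packed) == ids)
    print(choose_strategy(17, 100), choose_strategy(18, 100))
```

Small gaps between neighbouring IDs fit in one byte each, which is why delta encoding suits sparse terms, while dense terms are cheaper to hold as a bitmap.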

Document text	Contains all bi-grams?	Actually contains "quantum"?
"quantum mechanics"	Yes	Yes
"quantify antum"	Yes (`qu`, `ua`, `an`, `nt` from "quantify"; `an`, `nt`, `tu`, `um` from "antum")	No
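
The two rows above can be reproduced with a short sketch. It treats the index stage as a plain set check on bi-grams and the verify_text stage as a substring test. Both are simplifications assumed for illustration, and the function names are hypothetical rather than MygramDB's API.

```python
def bigrams(text: str) -> set[str]:
    """All overlapping two-character substrings of the text."""
    return {text[i:i + 2] for i in range(len(text) - 1)}


def is_candidate(doc: str, query: str) -> bool:
    """Index stage: every bi-gram of the query occurs somewhere in the doc."""
    return bigrams(query) <= bigrams(doc)


def verify_text(doc: str, query: str) -> bool:
    """Post-filter: the doc must contain the query as a contiguous string."""
    return query in doc


def search(docs: list[str], query: str) -> list[str]:
    """Keep bi-gram candidates, then drop the false positives."""
    candidates = [d for d in docs if is_candidate(d, query)]
    return [d for d in candidates if verify_text(d, query)]


if __name__ == "__main__":
    docs = ["quantum mechanics", "quantify antum"]
    for doc in docs:
        print(doc, is_candidate(doc, "quantum"), verify_text(doc, "quantum"))
    print(search(docs, "quantum"))
```

The second document passes the bi-gram check because its bi-grams come from two different words, and only the post-filter removes it.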

How It Works

N-gram Indexing

Posting List Compression

Search Pipeline

verify_text Post-Filter

Cache and Invalidation
