
Benchmarks

Benchmarks use 1.1M Wikipedia articles (1M English + 100K Japanese, avg. 666 chars per article). MySQL 8.4 uses a FULLTEXT index with the ngram parser; MygramDB v1.5.0 runs with verify_text: all and the query cache disabled. All numbers are p50 over 10 iterations.
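As a rough illustration of what "p50 over 10 iterations" means, the sketch below times a callable ten times and takes the median. The `run_query` parameter and the `p50_latency_ms` helper are hypothetical names introduced here; this is not the repository's benchmark harness, only a minimal sketch of the measurement idea.

```python
import statistics
import time
from typing import Callable


def p50_latency_ms(run_query: Callable[[], object], iterations: int = 10) -> float:
    """Return the median (p50) wall-clock latency of run_query in milliseconds.

    run_query is a hypothetical zero-argument callable that executes one query.
    """
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        run_query()
        samples.append((time.perf_counter() - start) * 1000.0)
    # With an even number of samples, statistics.median averages the two middle values.
    return statistics.median(samples)


if __name__ == "__main__":
    # Stand-in workload; a real benchmark would issue a search query here.
    print(p50_latency_ms(lambda: sum(range(10_000))))
```

The median is less sensitive to a single slow iteration (for example a cold cache on the first run) than the mean.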

Reproducible

Run these benchmarks yourself: make bench-up && make bench-run in the repository.

Note on hardware

Measured on Apple M4 Max with 128GB unified memory. Unified memory has higher bandwidth than typical server DDR4/DDR5. On server hardware, absolute latencies will be higher for both engines, but the relative performance difference remains consistent.

Search Latency (SORT id LIMIT 100)

[Chart: Search Latency (p50, log scale)]

| Query Type | Matches | MySQL | MygramDB | Speedup |
|---|---|---|---|---|
| Multi-word ("quantum physics") | 104 | 2,566ms | 0.09ms | 27,600x |
| Medium-freq ("quantum") | 1,961 | 1,874ms | 0.28ms | 6,700x |
| Low-freq ("algorithm") | 2,498 | 507ms | 0.42ms | 1,200x |
| Rare ("fibonacci") | 84 | 936ms | 0.08ms | 11,600x |

CJK Search Latency (SORT id LIMIT 100)

[Chart: CJK Search Latency (p50, log scale)]

| Query | Matches | MySQL | MygramDB | Speedup |
|---|---|---|---|---|
| 日本 | 32,282 | 1,204ms | 1.1ms | 1,100x |
| 東京 | 6,989 | 300ms | 3.9ms | 77x |
| 科学 | 1,551 | 4.2ms | 2.2ms | 1.9x |

COUNT Performance

[Chart: COUNT Latency (p50, log scale)]

| Query Type | Count | MySQL | MygramDB | Speedup |
|---|---|---|---|---|
| Medium-freq ("quantum") | 1,961 | 1,797ms | 0.08ms | 21,600x |
| Low-freq ("algorithm") | 2,498 | 416ms | 0.08ms | 5,500x |

Result Consistency (v1.5.0)

With verify_text: all, MygramDB returns exactly the same number of matches as MySQL FULLTEXT:

| Query | MySQL | MygramDB | Match |
|---|---|---|---|
| quantum | 1,961 | 1,961 | exact |
| algorithm | 2,498 | 2,498 | exact |
| 日本 | 32,282 | 32,282 | exact |
| 科学 | 1,551 | 1,551 | exact |

Without verify_text, n-gram indexes may return false positives (e.g., "quantum" returns 58K matches instead of 1,961). This is inherent to n-gram tokenization and is expected behavior.
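To see why n-gram lookups can over-match, the toy sketch below builds a bigram index, intersects posting lists to get candidates, and then filters candidates with a substring check against the stored text. All function names and the sample documents are made up for illustration; this is a generic n-gram technique under simplified assumptions, not MygramDB's actual implementation of verify_text.

```python
from collections import defaultdict


def ngrams(text: str, n: int = 2) -> set[str]:
    """Return the set of character n-grams in text."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def build_index(docs: dict[int, str], n: int = 2) -> dict[str, set[int]]:
    """Map each n-gram to the set of document ids that contain it."""
    index: dict[str, set[int]] = defaultdict(set)
    for doc_id, text in docs.items():
        for gram in ngrams(text, n):
            index[gram].add(doc_id)
    return index


def candidates(index: dict[str, set[int]], query: str, n: int = 2) -> set[int]:
    """Documents containing every n-gram of the query (may include false positives)."""
    posting_lists = [index.get(gram, set()) for gram in ngrams(query, n)]
    if not posting_lists:
        return set()
    return set.intersection(*posting_lists)


def verified(docs: dict[int, str], index: dict[str, set[int]], query: str, n: int = 2) -> set[int]:
    """Post-filter candidates by checking the stored text for the full query string."""
    return {doc_id for doc_id in candidates(index, query, n) if query in docs[doc_id]}


docs = {
    1: "tokyo tower",
    2: "book kyoto",    # contains "to", "ok", "ky", "yo" but never "tokyo"
    3: "osaka castle",
}
index = build_index(docs)
print(sorted(candidates(index, "tokyo")))      # [1, 2]
print(sorted(verified(docs, index, "tokyo")))  # [1]
```

Document 2 matches every bigram of the query without containing the query itself, which is the kind of false positive that post-filter verification removes, at the cost of keeping the document text in memory.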

Concurrent Throughput

[Chart: Throughput in queries per second (higher is better)]

Query: "algorithm", 10 seconds per level.

| Connections | MySQL QPS | MygramDB QPS | MySQL p50 | MygramDB p50 |
|---|---|---|---|---|
| 1 | 2 | 2,634 | 470ms | 0.35ms |
| 4 | 8 | 11,766 | 495ms | 0.32ms |
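As a sanity check on the throughput figures, a closed-loop client (each connection sends its next query only after the previous one returns) cannot exceed roughly connections divided by per-query latency. The sketch below applies that bound to the p50 values in the table. The `closed_loop_qps` helper is a name introduced here, and using p50 in place of mean latency is an approximation, so treat the results as ballpark figures.

```python
def closed_loop_qps(connections: int, latency_ms: float) -> float:
    """Approximate upper bound on QPS for a closed-loop client.

    Assumes every connection issues queries back to back with no think time.
    """
    return connections / (latency_ms / 1000.0)


# (connections, MySQL p50 ms, MygramDB p50 ms) from the Concurrent Throughput table
rows = [(1, 470.0, 0.35), (4, 495.0, 0.32)]
for conns, mysql_ms, mygram_ms in rows:
    print(
        f"{conns} conn: MySQL <= {closed_loop_qps(conns, mysql_ms):.1f} QPS, "
        f"MygramDB <= {closed_loop_qps(conns, mygram_ms):.0f} QPS"
    )
```

This gives about 2.1 and 8.1 QPS for MySQL, in line with the measured 2 and 8, and about 2,857 and 12,500 QPS for MygramDB against the measured 2,634 and 11,766. The remaining gap is plausibly client-side and round-trip overhead, though the source does not break it down.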

Memory Usage

| Documents | Index | Documents + Text | Total RSS | Per 1M docs |
|---|---|---|---|---|
| 1,100,000 | 813MB | 1.54GB | 2.53GB | ~2.3GB |

verify_text modes

  • off (default): Lower memory, but may include n-gram false positives
  • all: Stores document text for post-filter verification. Exact results at the cost of ~1.5GB additional memory for 1.1M docs.
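The "Per 1M docs" figure is the measured total RSS scaled down to one million documents. The sketch below reproduces that arithmetic and can be used for rough capacity planning. It assumes memory grows linearly with document count and that documents have a similar average length to the benchmark corpus; neither assumption is confirmed by the source. `estimate_rss_gb` is a helper name introduced here.

```python
# Figures from the Memory Usage table (1.1M docs, verify_text: all).
MEASURED_DOCS = 1_100_000
MEASURED_RSS_GB = 2.53


def estimate_rss_gb(doc_count: int) -> float:
    """Linear extrapolation of total RSS; assumes similar average document length."""
    return MEASURED_RSS_GB * doc_count / MEASURED_DOCS


print(f"{estimate_rss_gb(1_000_000):.1f} GB")  # 2.3 GB
```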

Why These Numbers?

See Why MySQL FULLTEXT is Slow for architectural details, and Comparison for how MygramDB compares to Elasticsearch.