Sonic

A fast, lightweight, and schema-less search backend written in Rust: an alternative to Elasticsearch that runs on a few MBs of RAM.

Rather than indexing full documents as Elasticsearch does, Sonic indexes search terms mapped to object identifiers. This keeps its memory footprint small (around 30MB under load) while still answering search queries in under a millisecond on average. It is a compelling alternative to Elasticsearch for use cases where you need fast full-text search without the operational overhead of a JVM-based system.

Sonic was built at Crisp to power search across half a billion objects on a $5/month server, and has been running in production there ever since.

Features

  • Extremely lightweight: uses around 30MB of RAM under load; no JVM, no Node.js, no runtime dependencies beyond the binary
  • Microsecond query latency: benchmarked at ~880μs average query time on 1 million indexed objects
  • Schema-less: no need to define field types or mappings upfront; just push text and identifiers
  • Collections and buckets: organise your index into collections (e.g. messages) and buckets (e.g. per-user indexes) for multi-tenant setups
  • Auto-suggest: built-in SUGGEST command for real-time word completion as users type
  • Typo correction: automatically tries alternate spellings when exact matches are sparse
  • 80+ language support: stop-word removal and text normalisation for over 80 languages
  • Sonic Channel protocol: a simple, lightweight TCP protocol (similar in spirit to Redis's RESP) for all operations
  • Persistent storage: indexes are stored on disk via RocksDB and FST; data survives restarts
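
To make the auto-suggest feature concrete, here is a minimal Python sketch of prefix completion over a sorted word list. It is only an illustration of the idea: the function name `suggest` and the sample words are hypothetical, and Sonic itself relies on an FST rather than a sorted list.

```python
import bisect


def suggest(sorted_words, prefix, limit=5):
    """Return up to `limit` words from a sorted list that start with `prefix`."""
    # Binary search finds the first word that is >= the prefix.
    start = bisect.bisect_left(sorted_words, prefix)
    matches = []
    for word in sorted_words[start:]:
        # Words sharing the prefix are contiguous in sorted order.
        if not word.startswith(prefix) or len(matches) >= limit:
            break
        matches.append(word)
    return matches


# Hypothetical vocabulary, standing in for words seen in pushed text.
words = sorted({"hello", "weather", "weave", "week", "world"})
print(suggest(words, "weat"))  # ['weather']
print(suggest(words, "wea"))   # ['weather', 'weave']
```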

Architecture

Sonic is an identifier index, not a document index. When you push data into Sonic, you provide a search string and an object ID. When you query Sonic, it returns those IDs β€” your application then looks them up in its own database to retrieve the actual documents. This separation keeps Sonic lean and focused.

[Your App] ──push(text, id)──▶ [Sonic]
[Your App] ──query(terms)──▶ [Sonic] ──ids──▶ [Your App] ──lookup(id)──▶ [Your DB]
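
The push, query, and lookup flow can be sketched in a few lines of Python. This toy in-memory index is an assumption-laden stand-in (the class `ToyIdentifierIndex`, the `database` dictionary, and the naive tokenisation are all hypothetical) and says nothing about how Sonic stores data internally. It only shows that the index holds identifiers while the documents stay in the application's own store.

```python
from collections import defaultdict


class ToyIdentifierIndex:
    """Maps (collection, bucket, word) to object IDs; stores no documents."""

    def __init__(self):
        self._terms = defaultdict(set)

    @staticmethod
    def _words(text):
        # Naive tokenisation: lowercase, split on whitespace, trim punctuation.
        return [w.strip(".,!?") for w in text.lower().split()]

    def push(self, collection, bucket, object_id, text):
        for word in self._words(text):
            self._terms[(collection, bucket, word)].add(object_id)

    def query(self, collection, bucket, terms):
        # Return IDs of objects that contain every queried word.
        sets = [self._terms.get((collection, bucket, w), set())
                for w in self._words(terms)]
        return sorted(set.intersection(*sets)) if sets else []


# The application's own database keeps the actual documents.
database = {
    "msg:17": "Hello, how are you doing today?",
    "msg:42": "The weather is great in Nantes",
}

index = ToyIdentifierIndex()
for object_id, text in database.items():
    index.push("messages", "user:1", object_id, text)

ids = index.query("messages", "user:1", "great weather")
documents = [database[object_id] for object_id in ids]
print(ids, documents)
```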

Installation

cargo install sonic-server

Or download a pre-built binary from the releases page.

For Debian/Ubuntu, official packages are available:

echo "deb [signed-by=/usr/share/keyrings/valeriansaliou_sonic.gpg] https://packagecloud.io/valeriansaliou/sonic/debian/ bookworm main" \
  > /etc/apt/sources.list.d/valeriansaliou_sonic.list
curl -fsSL https://packagecloud.io/valeriansaliou/sonic/gpgkey \
  | gpg --dearmor -o /usr/share/keyrings/valeriansaliou_sonic.gpg
apt update && apt install sonic

On Fedora, install with cargo install or use a pre-built binary from the releases page:

https://github.com/valeriansaliou/sonic/releases

On macOS, install with Homebrew:

brew install sonic

Configuration

Sonic reads from a config.cfg file (TOML format):

[server]
log_level = "error"

[channel]
inet = "0.0.0.0:1491"
tcp_timeout = 300
auth_password = "yourpassword"

[store]
[store.kv]
path = "/var/lib/sonic/store/kv/"
retain_word_objects = 1000

[store.fst]
path = "/var/lib/sonic/store/fst/"

Start the server:

sonic -c /path/to/config.cfg

Sonic Channel Protocol

Sonic communicates over a simple line-oriented TCP protocol. There are three channel modes:

  • search: query the index
  • ingest: push data into the index
  • control: administrative commands

Connecting (example with netcat)

nc localhost 1491
# Server responds:
# CONNECTED <sonic-server v1.4.9>

START search yourpassword
# STARTED search protocol(1) buffer(20000)

QUERY messages user:1 "hello world" LIMIT(5)
# PENDING abcdef
# EVENT QUERY abcdef msg:42 msg:17

QUIT
# ENDED quit
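
The same exchange can be driven from code. The following Python sketch uses only the standard library and follows the session above: read the greeting, start search mode, send a QUERY, then wait for the EVENT line carrying the same marker as the PENDING reply. The function names (`run_query`, `query_server`, `quote`) are hypothetical, error handling is minimal, and an existing client library is likely a better choice for production use.

```python
import socket


def read_line(reader):
    """Read one protocol line and strip its line terminator."""
    line = reader.readline()
    if not line:
        raise ConnectionError("server closed the connection")
    return line.rstrip("\r\n")


def send_line(writer, line):
    """Send one command line, terminated by a newline."""
    writer.write(line + "\n")
    writer.flush()


def quote(text):
    """Wrap text in double quotes, escaping backslashes and quotes."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def run_query(reader, writer, password, collection, bucket, terms, limit=5):
    """Run one QUERY over an open connection and return the object IDs."""
    read_line(reader)  # greeting line, e.g. CONNECTED <sonic-server ...>
    send_line(writer, f"START search {password}")
    reply = read_line(reader)
    if reply.startswith(("ERR", "ENDED")):
        raise RuntimeError(f"START rejected: {reply}")

    send_line(writer, f"QUERY {collection} {bucket} {quote(terms)} LIMIT({limit})")
    pending = read_line(reader)
    fields = pending.split()
    if len(fields) != 2 or fields[0] != "PENDING":
        raise RuntimeError(f"unexpected reply: {pending}")
    marker = fields[1]

    # Results arrive as an EVENT line that repeats the PENDING marker.
    while True:
        line = read_line(reader)
        parts = line.split()
        if parts[:1] == ["ERR"]:
            raise RuntimeError(f"query failed: {line}")
        if parts[:3] == ["EVENT", "QUERY", marker]:
            ids = [part.strip('"') for part in parts[3:]]  # tolerate quoted IDs
            break

    send_line(writer, "QUIT")
    return ids


def query_server(host, port, password, collection, bucket, terms, limit=5):
    """Open a TCP connection and run a single QUERY (needs a running server)."""
    with socket.create_connection((host, port), timeout=5) as sock:
        reader = sock.makefile("r", encoding="utf-8", newline="\n")
        writer = sock.makefile("w", encoding="utf-8", newline="\n")
        return run_query(reader, writer, password, collection, bucket, terms, limit)
```

With a server listening locally, `query_server("localhost", 1491, "yourpassword", "messages", "user:1", "hello world")` would return the matching IDs as a list of strings. Keeping `run_query` separate from the socket setup lets the protocol logic be exercised with in-memory streams.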

Common commands

# Ingest channel
PUSH messages user:1 msg:1 "Hello, how are you doing today?"
PUSH messages user:1 msg:2 "The weather is great in Nantes"
FLUSHB messages user:1              → clear a bucket
FLUSHC messages                     → clear a collection

# Search channel
QUERY messages user:1 "weather"     → returns object IDs
SUGGEST messages user:1 "weat"      → returns matching words, e.g. weather

# Control channel
TRIGGER consolidate                 → consolidate pending changes into the FST index
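
When building ingest commands programmatically, the text has to fit on a single protocol line and any embedded quotes need escaping. A minimal Python sketch of a PUSH line builder follows, assuming the argument order collection, bucket, object identifier, text. The helper names `quote` and `push_command` are hypothetical.

```python
def quote(text):
    """Wrap text in double quotes, escaping backslashes and quotes."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def push_command(collection, bucket, object_id, text):
    """Build a single-line PUSH command for the ingest channel."""
    for name in (collection, bucket, object_id):
        # Names are space-separated on the wire, so they cannot contain blanks.
        if not name or any(ch.isspace() for ch in name):
            raise ValueError(f"invalid name: {name!r}")
    # Collapse newlines and repeated blanks: the protocol is line-oriented.
    single_line = " ".join(text.split())
    return f"PUSH {collection} {bucket} {object_id} {quote(single_line)}"


print(push_command("messages", "user:1", "msg:2",
                   "The weather is\ngreat in Nantes"))
# PUSH messages user:1 msg:2 "The weather is great in Nantes"
```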

Client Libraries

Official and community libraries are available for most languages:

  • Node.js: node-sonic-channel
  • PHP: psonic
  • Rust: sonic-channel
  • Python: asonic, python-sonic-client
  • Go: go-sonic
  • Ruby: sonic-ruby
  • Java: java-sonic
  • Elixir: sonix

Performance

On a single-threaded benchmark with ~1 million indexed messages:

Operation   Average   Best     Worst
PUSH        275μs     190μs    363μs
QUERY       880μs     852μs    1ms

Sonic can theoretically scale to 32,000 PUSH operations/second and 8,000 QUERY operations/second on a hyper-threaded 4-core CPU by load-balancing across multiple Sonic Channel connections.
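
The scaling figures can be related to the average latencies with simple arithmetic. The Python sketch below assumes that a hyper-threaded 4-core CPU provides 8 hardware threads and that throughput scales linearly with threads. Both are simplifying assumptions, and the helper name `ops_per_second` is hypothetical.

```python
def ops_per_second(avg_latency_us, threads=1):
    """Operations per second if each thread runs operations back to back."""
    return threads * 1_000_000 / avg_latency_us


for name, latency_us in (("PUSH", 275), ("QUERY", 880)):
    single = ops_per_second(latency_us)
    scaled = ops_per_second(latency_us, threads=8)
    print(f"{name}: {single:.0f}/s on one thread, {scaled:.0f}/s on eight")
# PUSH: 3636/s on one thread, 29091/s on eight
# QUERY: 1136/s on one thread, 9091/s on eight
```

Rounding the single-thread rates to roughly 4,000 and 1,000 operations per second before multiplying by 8 gives the 32,000 and 8,000 figures quoted above. That reading of how the numbers were derived is an assumption.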

When to use Sonic

Sonic is the right choice when:

  • You need fast full-text search without running a JVM
  • Your data naturally maps to an identifier-based model (messages, articles, products)
  • You want per-tenant search buckets without complexity
  • Resource constraints rule out Elasticsearch or Meilisearch
  • You need auto-suggest / word completion

If you need to search and retrieve the full document content from the search engine itself (rather than an external DB), or need more advanced relevance tuning, consider Meilisearch or Quickwit instead.