RisingWave

A Postgres-compatible streaming database written in Rust — ingest, process, and serve real-time event data with sub-second freshness using familiar SQL.

Screenshot of RisingWave

RisingWave is an open-source streaming database written in Rust. It ingests data from streaming sources such as Kafka, processes it continuously with SQL, and serves the results at low latency, all through a PostgreSQL-compatible interface. Instead of stitching together a stream processor (e.g., Flink), a serving database (e.g., Postgres), and the glue code between them, RisingWave collapses processing and serving into a single system.

Features

  • PostgreSQL-compatible — connect with psql, any JDBC/ODBC driver, or your favourite Postgres client library; no new query language to learn
  • Materialized views — define a SQL query once as a materialized view and RisingWave keeps it continuously and incrementally up to date as new data arrives
  • Sub-second end-to-end freshness — results are typically queryable well under a second after event ingestion
  • Rich source support — ingest from Kafka, Kinesis, Pulsar, Redpanda, CDC (PostgreSQL, MySQL), S3, and more
  • Rich sink support — deliver results to Kafka, Iceberg, Delta Lake, Postgres, MySQL, ClickHouse, Elasticsearch, and others
  • Apache Iceberg native — first-class support for reading, writing, and managing Iceberg tables including compaction and snapshot management
  • S3-backed storage — stores all state in cloud object storage, enabling near-infinite scale and fast recovery from failures
  • Elastic scaling — stateless compute nodes scale in and out in seconds without restarting pipelines
  • Python DataFrame API — in addition to SQL, supports a DataFrame-style interface for Python users

Installation

The quickest way to try RisingWave locally is the standalone binary:

# macOS / Linux — official install script
curl -L https://risingwave.com/sh | sh

# Start RisingWave in standalone mode
risingwave

# Connect with psql
psql -h localhost -p 4566 -d dev -U root
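Once connected, ordinary Postgres queries work as-is; a quick sanity check:

```sql
-- Confirm the connection; RisingWave reports a PostgreSQL-compatible version string
SELECT version();
```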

Or via Docker:

docker run -it --pull=always \
  -p 4566:4566 \
  -p 5691:5691 \
  risingwavelabs/risingwave:latest \
  playground

For production deployments, use the Helm chart or RisingWave Cloud.

Core Concepts

Sources

A source connects RisingWave to an upstream data stream:

-- Create a source from a Kafka topic
CREATE SOURCE orders (
    order_id     BIGINT,
    customer_id  BIGINT,
    product_id   BIGINT,
    amount       DECIMAL,
    placed_at    TIMESTAMPTZ
)
WITH (
    connector     = 'kafka',
    topic         = 'orders',
    properties.bootstrap.server = 'localhost:9092'
)
FORMAT PLAIN ENCODE JSON;
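A SOURCE only describes how to read the stream; the raw events themselves are not persisted. If the ingested rows should also be stored and directly queryable, the same definition can be declared as a table instead (same connector options as above):

```sql
-- Same Kafka feed, but ingested rows are persisted for ad-hoc batch queries
CREATE TABLE orders_table (
    order_id     BIGINT,
    customer_id  BIGINT,
    product_id   BIGINT,
    amount       DECIMAL,
    placed_at    TIMESTAMPTZ
)
WITH (
    connector     = 'kafka',
    topic         = 'orders',
    properties.bootstrap.server = 'localhost:9092'
)
FORMAT PLAIN ENCODE JSON;
```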

Materialized Views

Materialized views are the core primitive — they define a continuous query that RisingWave maintains incrementally:

-- Revenue per product, updated continuously as orders arrive
CREATE MATERIALIZED VIEW revenue_by_product AS
SELECT
    product_id,
    SUM(amount)  AS total_revenue,
    COUNT(*)     AS order_count,
    MAX(placed_at) AS last_order_at
FROM orders
GROUP BY product_id;

Query it at any time for the current, up-to-date result:

SELECT * FROM revenue_by_product
ORDER BY total_revenue DESC
LIMIT 10;

Windowed Aggregations

RisingWave supports tumbling, hopping, and session windows for time-based aggregations:

-- Orders per minute using a 1-minute tumbling window
CREATE MATERIALIZED VIEW orders_per_minute AS
SELECT
    window_start,
    window_end,
    COUNT(*) AS order_count,
    SUM(amount) AS total_amount
FROM TUMBLE(orders, placed_at, INTERVAL '1 minute')
GROUP BY window_start, window_end;
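Hopping (sliding) windows use HOP, which takes both a hop size and a window size; a sketch of a 5-minute window advancing every minute:

```sql
-- 5-minute window recomputed every minute (windows overlap)
CREATE MATERIALIZED VIEW orders_5min_sliding AS
SELECT
    window_start,
    window_end,
    COUNT(*)    AS order_count,
    SUM(amount) AS total_amount
FROM HOP(orders, placed_at, INTERVAL '1 minute', INTERVAL '5 minutes')
GROUP BY window_start, window_end;
```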

Joins Across Streams

-- Enrich order events with customer data from a CDC source
CREATE MATERIALIZED VIEW enriched_orders AS
SELECT
    o.order_id,
    o.amount,
    c.name   AS customer_name,
    c.region AS customer_region
FROM orders o
JOIN customers c ON o.customer_id = c.id;
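The join above reads from a customers table fed by CDC. With the Postgres CDC connector, such a table could be created along these lines (hostname, credentials, and upstream table names below are placeholders):

```sql
-- Customers table kept in sync from an upstream Postgres via CDC
CREATE TABLE customers (
    id     BIGINT PRIMARY KEY,
    name   VARCHAR,
    region VARCHAR
)
WITH (
    connector     = 'postgres-cdc',
    hostname      = 'localhost',
    port          = '5432',
    username      = 'postgres',
    password      = 'secret',
    database.name = 'mydb',
    schema.name   = 'public',
    table.name    = 'customers'
);
```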

Sinks

Push materialized view results downstream as they update:

-- Continuously sink results to a Kafka topic
CREATE SINK high_value_orders_sink AS
    SELECT * FROM enriched_orders WHERE amount > 1000
WITH (
    connector = 'kafka',
    properties.bootstrap.server = 'localhost:9092',
    topic = 'high-value-orders'
)
FORMAT PLAIN ENCODE JSON;

Use Cases

  • Live operational dashboards — sub-second data freshness for trading, logistics, IoT, and monitoring UIs
  • Fraud and anomaly detection — stream-join events with historical profiles and alert in real time
  • Real-time feature engineering — generate and serve ML features from live event streams
  • CDC pipelines — capture database changes and continuously transform and deliver them to analytics systems
  • Iceberg lakehouses — stream data from Kafka directly into an open Iceberg lakehouse with automatic compaction
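As a sketch of the Iceberg path, a sink can stream a materialized view into an Iceberg table. The catalog and warehouse settings below are illustrative and vary by deployment:

```sql
-- Stream enriched orders into an Iceberg table (append-only delivery)
CREATE SINK orders_iceberg_sink
FROM enriched_orders
WITH (
    connector         = 'iceberg',
    type              = 'append-only',
    force_append_only = 'true',
    warehouse.path    = 's3://my-bucket/warehouse',
    database.name     = 'analytics',
    table.name        = 'enriched_orders',
    catalog.type      = 'storage'
);
```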

Comparison with Apache Flink

Apache Flink is a powerful stream processing engine, but operating it requires significant expertise: managing state backends, tuning checkpointing, writing Java/Scala jobs, and running a separate serving layer for query results. RisingWave replaces the Flink-plus-serving-database stack with a single PostgreSQL-compatible system, dramatically reducing operational complexity and letting teams that already know SQL build real-time pipelines without learning a new programming model.