Multi-Database Replication on 2 GB RAM — Case Study

Problem

Replicating large hospital datasets from Oracle, MSSQL and MySQL into a PostgreSQL warehouse under a hard 2 GB RAM budget, without data loss or cross-database mismatches.

Solution

Engineered a streaming-read, batched-write pipeline in Spring Boot with per-source connection tuning and careful JDBC cursor management. The result was reliable cross-vendor replication that stayed within the 2 GB memory budget.

Architecture

$ render architecture.mmd

flowchart LR
  subgraph SRC[Hospital Sources]
    O[(Oracle)]
    M[(MSSQL)]
    MY[(MySQL)]
  end
  SRC --> Reader[Streaming Reader<br/>JDBC cursor]
  Reader --> Buf{{Bounded Buffer<br/>backpressure}}
  Buf --> Mapper[Schema Mapper<br/>Spring Boot]
  Mapper --> Writer[Batched Writer]
  Writer --> PG[(PostgreSQL<br/>warehouse)]
  Mapper -. metrics .-> Mon[Health · Prometheus]
  classDef src fill:#0c4a6e,stroke:#0ea5e9,color:#fff
  classDef pg fill:#14532d,stroke:#22c55e,color:#fff
  class O,M,MY src
  class PG pg

Technical decisions

$ git log --oneline decisions/

#01

Streaming reads, never full table loads

The JDBC fetchSize was tuned per vendor (Oracle 500, MSSQL 1000, MySQL forward-only cursors), so memory use stayed flat regardless of table size.
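
A minimal sketch of how per-vendor fetch sizes might be applied to a forward-only, read-only statement. The class, enum and method names are hypothetical. The Oracle and MSSQL values come from the text above; the MySQL value is an assumption based on MySQL Connector/J, which streams rows one at a time when a forward-only, read-only statement is given a fetch size of Integer.MIN_VALUE. Depending on the driver, connection properties may also be needed for the fetch size to take effect.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

final class StreamingReads {

    enum Vendor { ORACLE, MSSQL, MYSQL }

    /** Callback invoked once per row while the cursor is still open. */
    interface RowHandler {
        void accept(ResultSet row) throws SQLException;
    }

    /** Fetch size hint per vendor; the MySQL value is an assumption (Connector/J streaming mode). */
    static int fetchSizeFor(Vendor vendor) {
        switch (vendor) {
            case ORACLE: return 500;
            case MSSQL:  return 1000;
            case MYSQL:  return Integer.MIN_VALUE;
            default: throw new IllegalArgumentException("Unknown vendor: " + vendor);
        }
    }

    /** Streams the query result row by row and returns the number of rows handled. */
    static long streamRows(Connection conn, Vendor vendor, String sql, RowHandler handler)
            throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(
                sql, ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY)) {
            ps.setFetchSize(fetchSizeFor(vendor));
            long rows = 0;
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    handler.accept(rs); // only the current fetch window is held in memory
                    rows++;
                }
            }
            return rows;
        }
    }
}
```

The handler sees each row while the cursor is open, so nothing in this sketch ever collects a full table into a list.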

#02

Bounded buffer with backpressure

A fixed-size queue between reader and writer kept the JVM heap under the 2 GB cap when the downstream PostgreSQL writer slowed down: once the queue is full, the reader blocks instead of accumulating rows.
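
The backpressure idea can be sketched with a standard ArrayBlockingQueue. This is an illustration under assumptions, not the project's code: the class name, the END sentinel and the use of integers as stand-in rows are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

final class BoundedPipeline {

    // Hypothetical sentinel that tells the writer the reader is done.
    private static final Object END = new Object();

    /** Moves rowCount stand-in rows through a queue that never holds more than capacity items. */
    static List<Integer> replicate(int rowCount, int capacity) {
        BlockingQueue<Object> buffer = new ArrayBlockingQueue<>(capacity);

        Thread reader = new Thread(() -> {
            try {
                for (int i = 0; i < rowCount; i++) {
                    buffer.put(i); // blocks while the queue is full: this is the backpressure
                }
                buffer.put(END);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "reader");
        reader.start();

        List<Integer> written = new ArrayList<>();
        try {
            // The calling thread plays the writer and drains rows in arrival order.
            for (Object item = buffer.take(); item != END; item = buffer.take()) {
                written.add((Integer) item);
            }
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while draining the buffer", e);
        }
        return written;
    }
}
```

Because put blocks rather than growing the queue, the number of in-flight rows is capped by the capacity argument no matter how slow the writer is.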

#03

Batched writes, deterministic commits

The writer committed rows in 500-row batches, each inside an explicit transaction and tagged with idempotency keys, so replays after a crash were safe and produced no duplicate rows.
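
One way such a writer could look, as a sketch under assumptions: the table replica_rows, its columns, the Row type and the Accumulator helper are hypothetical, and the duplicate handling assumes a unique constraint on idempotency_key so that PostgreSQL's ON CONFLICT DO NOTHING can skip rows that were already written before a crash.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

final class BatchedWriter {

    static final int BATCH_SIZE = 500; // batch size from the case study

    // Hypothetical table and columns; assumes a unique constraint on idempotency_key.
    static final String INSERT_SQL =
        "INSERT INTO replica_rows (idempotency_key, payload) VALUES (?, ?) "
      + "ON CONFLICT (idempotency_key) DO NOTHING";

    static final class Row {
        final String idempotencyKey;
        final String payload;
        Row(String idempotencyKey, String payload) {
            this.idempotencyKey = idempotencyKey;
            this.payload = payload;
        }
    }

    /** Collects rows as they arrive and hands each full batch to the sink. */
    static final class Accumulator<T> {
        private final int batchSize;
        private final Consumer<List<T>> sink;
        private List<T> current = new ArrayList<>();

        Accumulator(int batchSize, Consumer<List<T>> sink) {
            if (batchSize <= 0) throw new IllegalArgumentException("batchSize must be positive");
            this.batchSize = batchSize;
            this.sink = sink;
        }

        void add(T row) {
            current.add(row);
            if (current.size() == batchSize) flush();
        }

        /** Emits any remaining rows; call once at end of stream. */
        void flush() {
            if (current.isEmpty()) return;
            List<T> batch = current;
            current = new ArrayList<>();
            sink.accept(batch);
        }
    }

    /** Writes one batch in a single transaction: either every row commits or none does. */
    static void writeBatch(Connection conn, List<Row> batch) throws SQLException {
        boolean previousAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false);
        try (PreparedStatement ps = conn.prepareStatement(INSERT_SQL)) {
            for (Row row : batch) {
                ps.setString(1, row.idempotencyKey);
                ps.setString(2, row.payload);
                ps.addBatch();
            }
            ps.executeBatch();
            conn.commit();
        } catch (SQLException e) {
            conn.rollback(); // a replay of this batch is safe because of the conflict clause
            throw e;
        } finally {
            conn.setAutoCommit(previousAutoCommit);
        }
    }
}
```

With 1,201 incoming rows, the accumulator would emit batches of 500, 500 and 201; replaying any of them after a failure inserts only the rows whose keys are not yet present.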

#04

Per-source schema mapping

Type quirks were isolated per vendor (Oracle NUMBER, MSSQL DATETIME2, MySQL JSON) rather than hidden behind a leaky common abstraction.
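
A sketch of what per-vendor isolation might look like: one small mapper per source, each handling only its own quirks. The interface, the class name and the chosen PostgreSQL target types are assumptions for illustration, not the project's actual mapping rules. The DATETIME2 case relies on the fact that SQL Server allows up to 7 fractional-second digits while PostgreSQL timestamps allow at most 6; in this sketch the scale argument carries those digits.

```java
final class SchemaMapping {

    /** Maps one source column type to a PostgreSQL type name. */
    interface TypeMapper {
        String toPostgres(String sourceType, int precision, int scale);
    }

    // Oracle: NUMBER covers integers and decimals, so the target depends on precision and scale.
    static final TypeMapper ORACLE = (type, precision, scale) -> {
        if (type.equalsIgnoreCase("NUMBER")) {
            if (precision <= 0) return "NUMERIC";               // unconstrained NUMBER
            if (scale == 0 && precision <= 9) return "INTEGER"; // always fits in 32 bits
            if (scale == 0 && precision <= 18) return "BIGINT"; // always fits in 64 bits
            return "NUMERIC(" + precision + "," + scale + ")";
        }
        throw new IllegalArgumentException("Unmapped Oracle type: " + type);
    };

    // MSSQL: DATETIME2 fractional digits are clamped to PostgreSQL's maximum of 6.
    static final TypeMapper MSSQL = (type, precision, scale) -> {
        if (type.equalsIgnoreCase("DATETIME2")) {
            return "TIMESTAMP(" + Math.min(scale, 6) + ")";
        }
        throw new IllegalArgumentException("Unmapped MSSQL type: " + type);
    };

    // MySQL: JSON is mapped to JSONB here; plain JSON would be an equally valid choice.
    static final TypeMapper MYSQL = (type, precision, scale) -> {
        if (type.equalsIgnoreCase("JSON")) {
            return "JSONB";
        }
        throw new IllegalArgumentException("Unmapped MySQL type: " + type);
    };
}
```

Unmapped types fail loudly instead of falling through to a generic default, which keeps a vendor-specific surprise from silently producing a mismatched column.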
