Reliable cross-vendor clinical data replication (Oracle · MSSQL · MySQL · PostgreSQL) on a hard 2 GB RAM budget.
Replicating large hospital datasets across Oracle, MSSQL, MySQL and PostgreSQL under a hard 2 GB RAM budget without data loss or cross-database mismatches.
Engineered a streaming-read and batched-write pipeline in Spring Boot with per-source connection tuning and careful JDBC cursor management, delivering reliable cross-vendor replication within tight memory constraints.
$ render architecture.mmd
flowchart LR
subgraph SRC[Hospital Sources]
O[(Oracle)]
M[(MSSQL)]
MY[(MySQL)]
end
SRC --> Reader[Streaming Reader<br/>JDBC cursor]
Reader --> Buf{{Bounded Buffer<br/>backpressure}}
Buf --> Mapper[Schema Mapper<br/>Spring Boot]
Mapper --> Writer[Batched Writer]
Writer --> PG[(PostgreSQL<br/>warehouse)]
Mapper -. metrics .-> Mon[Health · Prometheus]
classDef src fill:#0c4a6e,stroke:#0ea5e9,color:#fff
classDef pg fill:#14532d,stroke:#22c55e,color:#fff
class O,M,MY src
class PG pg
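The Reader, Bounded Buffer, and Writer stages in the diagram can be sketched in plain Java as follows. This is a minimal illustration under stated assumptions, not the project's code: the class name `BoundedPipelineSketch`, the `replicate` method, and the end-of-stream sentinel are all hypothetical, the mapper stage is left out, and error propagation from the reader thread is omitted for brevity.

```java
import java.util.Iterator;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Consumer;

// Hypothetical sketch: names and structure are illustrative only.
final class BoundedPipelineSketch {

    // Sentinel that tells the consumer the reader has finished.
    private static final Object END = new Object();

    /**
     * Streams non-null rows from source to sink through a queue holding at most
     * `capacity` rows. Returns the number of rows delivered to the sink.
     */
    static <T> int replicate(Iterator<T> source, int capacity, Consumer<T> sink) {
        BlockingQueue<Object> buffer = new ArrayBlockingQueue<>(capacity);

        Thread reader = new Thread(() -> {
            try {
                while (source.hasNext()) {
                    // put() blocks while the buffer is full: this is the backpressure.
                    buffer.put(source.next());
                }
                buffer.put(END);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "reader");
        reader.start();

        int delivered = 0;
        try {
            while (true) {
                Object item = buffer.take(); // blocks while the buffer is empty
                if (item == END) {
                    break;
                }
                @SuppressWarnings("unchecked")
                T row = (T) item;
                sink.accept(row); // a slow sink stalls the reader via the full queue
                delivered++;
            }
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("pipeline interrupted", e);
        }
        return delivered;
    }
}
```

Because the reader can never be more than `capacity` rows ahead of the sink, heap use for in-flight rows is bounded by capacity times row size, independent of how large the source table is.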
$ git log --oneline decisions/
JDBC reads tuned per vendor (fetchSize 500 for Oracle, 1000 for MSSQL; forward-only cursors for MySQL) so memory use stayed flat regardless of table size.
A fixed-size queue between reader and writer kept the JVM heap under the 2 GB cap when the downstream PostgreSQL writer slowed down.
Writer batched 500-row chunks with explicit transactions and idempotency keys, so replays after a crash were safe and produced no duplicate rows.
Type quirks isolated per vendor (Oracle NUMBER, MSSQL DATETIME2, MySQL JSON) instead of a leaky common abstraction.
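The read-tuning and batched-write decisions above might look roughly like this in JDBC. It is a sketch under assumptions, not the actual implementation: `JdbcTuningSketch`, the `patient_event` table, and its columns are hypothetical, and using `Integer.MIN_VALUE` as the MySQL fetch size (Connector/J's row-by-row streaming signal) is one plausible reading of "forward-only cursors". The upsert relies on PostgreSQL's `ON CONFLICT` clause and a unique constraint on the idempotency key.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Iterator;

// Illustrative sketch only; class, table, and column names are hypothetical.
final class JdbcTuningSketch {

    // Fetch sizes from the decisions above. Integer.MIN_VALUE asks MySQL
    // Connector/J to stream row by row (an assumption about the setup).
    enum Vendor {
        ORACLE(500), MSSQL(1000), MYSQL(Integer.MIN_VALUE);

        final int fetchSize;

        Vendor(int fetchSize) {
            this.fetchSize = fetchSize;
        }
    }

    // A replayed row hits the unique idempotency key and becomes a no-op.
    static final String UPSERT_SQL =
        "INSERT INTO patient_event (idempotency_key, payload) VALUES (?, ?) "
      + "ON CONFLICT (idempotency_key) DO NOTHING";

    /** Opens a forward-only, read-only statement so the driver can stream rows. */
    static PreparedStatement openStreamingRead(Connection source, Vendor vendor, String sql)
            throws SQLException {
        PreparedStatement ps = source.prepareStatement(
                sql, ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
        ps.setFetchSize(vendor.fetchSize);
        return ps;
    }

    /**
     * Writes rows in chunks of batchSize, one transaction per chunk.
     * Each row is {idempotencyKey, payload}. Returns the number of committed chunks.
     */
    static int writeInBatches(Connection target, Iterator<String[]> rows, int batchSize)
            throws SQLException {
        target.setAutoCommit(false); // explicit transactions
        int chunks = 0;
        try (PreparedStatement ps = target.prepareStatement(UPSERT_SQL)) {
            int pending = 0;
            while (rows.hasNext()) {
                String[] row = rows.next();
                ps.setString(1, row[0]);
                ps.setString(2, row[1]);
                ps.addBatch();
                if (++pending == batchSize) {
                    ps.executeBatch();
                    target.commit();
                    pending = 0;
                    chunks++;
                }
            }
            if (pending > 0) { // flush the final partial chunk
                ps.executeBatch();
                target.commit();
                chunks++;
            }
        } catch (SQLException e) {
            target.rollback(); // drop the uncommitted chunk; replaying it later is safe
            throw e;
        }
        return chunks;
    }
}
```

With a batch size of 500, a crash loses at most one uncommitted chunk, and re-sending that chunk cannot create duplicates because rows whose key already exists are skipped.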