Vol. I · No. 1
The Portfolio Ledger
Late Edition · May 2026
A single-edition broadsheet on a four-service stack for real-time market data and limit-order-book matching.
PORTFOLIO · LIVE ON AWS · MAY 2026

Real-Time
Market Data Pipeline.
Plus an Order Book.

Four services, two runtimes, one Postgres. A Python ingestor holding a persistent WebSocket to Finnhub. A rolling z-score anomaly detector in O(1) per update. A Java limit-order-book matching engine with price-time priority, O(log n) submit and O(1) cancel. Wired with Docker, deployed on AWS Lightsail, exposed through a single Streamlit dashboard. By Sam Shulman.

Live URL serves a real Streamlit app over plain HTTP from a Lightsail instance in us-east-1.
§ 01
The Position Summary

By the numbers.

The headline data. Every figure below maps to a specific file in the repo. No abstractions, no hand-waves.

Codebase
2 · runtimes & build systems
Services
4 · plus a Postgres instance
Submit cost
O(log n) · k-level TreeMap sweep
DS&A pieces
2 · rolling-z and LOB
Cloud
AWS · Lightsail, us-east-1
Cancel cost
O(1) · HashMap + intrusive unlink
§ 02
Across the stack

Twelve capabilities, one repo.

What the project actually demonstrates, named. Each cell below points at a specific subsystem and the file it lives in. The repo backs every claim.

Java
Matching engine · Java 17 + Maven + Javalin, ~30 source files, 25 JUnit tests
Python
Three services · FastAPI, Streamlit, async ingestor on Python 3.14
SQL, RDB
Postgres 16 · Raw psycopg, no ORM, schema is six tables
Data structures, algorithms
Rolling-z + LOB · O(1) anomaly detect, O(log n) submit, O(1) cancel
Microservices
Four-service mesh · HTTP and Postgres are the only protocols
Low-latency
Sub-second ticks · Vendor WebSocket plus prepared-statement caching
Docker
One image per service · docker compose builds and runs the stack locally and on AWS
AWS
Live on Lightsail · Ubuntu 24.04 instance, 2 GB, public endpoint open
CI / CD
GitHub Actions · Three parallel jobs, lint, pytest, mvn test, compose build
Automated tests
25 + 21 = 46 · Real Postgres via testcontainers, no mocks
Financial markets
Anomaly alerts · plus a synthetic exchange that fills by price-time priority
Distributed systems
Auto-reconnect, watchdog · Postgres as the bus, no queue, no broker
§ 03
Topology

Four services. One Postgres.

The ingestor never talks to the API. The dashboard never talks to the database. HTTP and Postgres are the only seams. Read the diagram and the whole system fits in your head.

[Diagram: service topology]
EXTERNAL · WSS: Finnhub.io (wss://ws.finnhub.io)
SERVICE · PYTHON: ingestor (async WebSocket loop, RollingZScore detector, exp backoff, watchdog)
DATA: postgres 16 (ticks · alerts · me_orders · me_trades · me_book_snapshots)
SERVICE · PYTHON: api (GET /health, GET /prices/{symbol}, GET /alerts)
SERVICE · JAVA 17: matching-engine (TreeMap order book per symbol, synthetic generator @ 50 ms, HTTP via Javalin)
SERVICE · STREAMLIT: dashboard (page 1 · Finnhub view, page 2 · matching engine, 10 s autorefresh / 2 Hz)
Edge labels: TRADES, INSERT, SELECT, JOURNAL, HTTP, HTTP
Legend: Python service · JVM service · UI surface · External
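The ingestor box lists exponential backoff among its parts. A minimal sketch of one common shape for a reconnect schedule follows. The function name `backoff_ceilings`, the one-second base, the sixty-second cap, and the use of full jitter are all assumptions for illustration, not values taken from the repo.

```python
import random
from typing import List


def backoff_ceilings(base: float = 1.0, cap: float = 60.0, attempts: int = 8) -> List[float]:
    """Upper bound on the wait before each reconnect attempt (hypothetical values).

    The bound doubles per attempt and is clamped at `cap`.
    """
    return [min(cap, base * 2 ** n) for n in range(attempts)]


def jittered_delay(ceiling: float, rng: random.Random) -> float:
    # Full jitter: wait a random amount between 0 and the ceiling so that
    # many clients do not reconnect in lockstep.
    return rng.uniform(0.0, ceiling)


ceilings = backoff_ceilings()
rng = random.Random(0)
delays = [jittered_delay(c, rng) for c in ceilings]
```

The ceilings grow 1, 2, 4, 8, 16, 32 seconds and then hold at the cap. A watchdog would typically reset the attempt counter once messages flow again.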
§ 04
In depth

Three pieces, in detail.

The substance, not the surface. Each piece names the data structure, names the cost in big-O, and shows the code at the file and line it lives in.

Case I · ingestor / anomaly.py

The rolling z-score,
in O(1) per tick.

"Naive recompute would be O(window). Running sum and running sum-of-squares over a fixed-size deque is two scalar updates per tick."

Each symbol gets its own detector. Every trade tick produces a log-return; the return goes into a deque of length sixty, the running totals get the new value added and the soon-to-be-evicted value subtracted, and the z-score falls out of mean and standard deviation in two divides and a square root. An alert fires when |z| exceeds 2.5 and the window is warm.

No NumPy. No SciPy. The point of writing it this way is precisely to demonstrate the algorithmic technique, not to outsource it.

| Operation | Cost |
| --- | --- |
| update(price) | O(1) |
| variance recompute | O(1) |
| window eviction | O(1) |
| warm-up tick count | 60 |
ingestor / anomaly.py · RollingZScore.update
```python
def update(self, price: float) -> AnomalyResult:
    if price <= 0:
        raise ValueError("price must be positive")
    if self._last_price is None:
        self._last_price = price
        return AnomalyResult(z_score=None, is_anomaly=False)

    r = math.log(price / self._last_price)
    self._last_price = price

    # subtract the about-to-evict value BEFORE append.
    # keeps running totals in sync with the deque, O(1).
    if len(self._returns) == self.window_size:
        evicted = self._returns[0]
        self._sum     -= evicted
        self._sum_sq  -= evicted * evicted
    self._returns.append(r)
    self._sum    += r
    self._sum_sq += r * r

    if len(self._returns) < self.window_size:
        return AnomalyResult(z_score=None, is_anomaly=False)

    n        = self.window_size
    mean     = self._sum / n
    variance = self._sum_sq / n - mean * mean
    if variance <= 0:
        return AnomalyResult(z_score=None, is_anomaly=False)
    z = (r - mean) / math.sqrt(variance)
    return AnomalyResult(z_score=z, is_anomaly=abs(z) > self.threshold)
```
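The running-totals idea can be checked against the naive O(window) recompute. The sketch below is a stripped-down stand-in, not the repo's class: `RollingStats`, `naive_mean_var`, the window of five, and the random inputs are all hypothetical. It assumes the deque is created with `maxlen` so that `append` evicts the oldest value on its own.

```python
import random
from collections import deque
from typing import Tuple


class RollingStats:
    """Hypothetical helper: running mean and variance over the last `window` values."""

    def __init__(self, window: int) -> None:
        self.window = window
        self.values = deque(maxlen=window)
        self.total = 0.0
        self.total_sq = 0.0

    def push(self, x: float) -> None:
        # Subtract the value that append() is about to evict, then add the new one.
        if len(self.values) == self.window:
            old = self.values[0]
            self.total -= old
            self.total_sq -= old * old
        self.values.append(x)
        self.total += x
        self.total_sq += x * x

    def mean_var(self) -> Tuple[float, float]:
        n = len(self.values)
        mean = self.total / n
        return mean, self.total_sq / n - mean * mean


def naive_mean_var(values) -> Tuple[float, float]:
    # O(window) reference: recompute the population variance from scratch.
    n = len(values)
    mean = sum(values) / n
    return mean, sum((v - mean) ** 2 for v in values) / n


rng = random.Random(7)
stats = RollingStats(window=5)
max_diff = 0.0
for _ in range(200):
    stats.push(rng.gauss(0.0, 0.01))
    fast = stats.mean_var()
    slow = naive_mean_var(list(stats.values))
    max_diff = max(max_diff, abs(fast[0] - slow[0]), abs(fast[1] - slow[1]))
```

The two paths agree to within floating-point rounding. The sum-of-squares form can lose precision through cancellation when the variance is tiny relative to the mean, which is one reason a guard such as `variance <= 0` is useful.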
Case II · matching-engine / OrderBook.java

A price-time priority order book.

"Two TreeMaps for the price ladders, intrusive linked-list FIFOs at each level, and a HashMap order-id index so cancel is O(1) lookup, O(1) unlink, O(log n) only if the level empties."

The book is two NavigableMaps: bids reverse-ordered so the best bid is firstEntry, asks natural-ordered so the best ask is firstEntry. Each map value is a PriceLevel: a linked list of orders plus a running total quantity. Submit sweeps the opposite side, walks the FIFO at each price level, and posts what does not fill. Cancel goes through the HashMap, unlinks from its level, and removes the level only if empty.

Every public method on OrderBook is synchronized. Matching must be linearizable; one coarse lock is the right tool because the engine is single-threaded per book on the producer side.
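The cancel path described above can be sketched in a few lines. This is a Python illustration of the idea, not the Java source: the class shapes, field names, and the module-level `orders_by_id` dict are assumptions. Each order carries its own `prev` and `next` pointers, so removing it never requires scanning the queue. Dropping an emptied level from the sorted price map is left out.

```python
from typing import Dict, Optional


class Order:
    """Resting order that carries its own list pointers (the 'intrusive' part)."""

    def __init__(self, order_id: int, quantity: int) -> None:
        self.order_id = order_id
        self.quantity = quantity
        self.prev: Optional["Order"] = None
        self.next: Optional["Order"] = None
        self.level: Optional["PriceLevel"] = None


class PriceLevel:
    """FIFO of orders at one price, plus a running total quantity."""

    def __init__(self) -> None:
        self.head: Optional[Order] = None
        self.tail: Optional[Order] = None
        self.total_quantity = 0

    def append(self, order: Order) -> None:
        order.level, order.prev, order.next = self, self.tail, None
        if self.tail is None:
            self.head = order
        else:
            self.tail.next = order
        self.tail = order
        self.total_quantity += order.quantity

    def unlink(self, order: Order) -> None:
        # Patch the neighbours around the order; no traversal needed.
        if order.prev is None:
            self.head = order.next
        else:
            order.prev.next = order.next
        if order.next is None:
            self.tail = order.prev
        else:
            order.next.prev = order.prev
        order.prev = order.next = order.level = None
        self.total_quantity -= order.quantity


orders_by_id: Dict[int, Order] = {}
level = PriceLevel()
for oid, qty in [(1, 10), (2, 20), (3, 30)]:
    resting = Order(oid, qty)
    orders_by_id[oid] = resting
    level.append(resting)


def cancel(order_id: int) -> bool:
    order = orders_by_id.pop(order_id, None)   # average-case O(1) lookup
    if order is None:
        return False
    order.level.unlink(order)                  # O(1) unlink
    return True


first_cancel = cancel(2)
second_cancel = cancel(2)   # already gone

remaining_ids = []
node = level.head
while node is not None:
    remaining_ids.append(node.order_id)
    node = node.next
```

Cancelling the middle order leaves the other two in their original arrival order, which is what preserves time priority at the level.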

| Operation | Cost |
| --- | --- |
| submitLimit | O(k log n) |
| submitMarket | O(k log n) |
| cancel | O(1) + O(log n) if level empties |
| topOfBook | O(1) |
matching-engine / OrderBook.java · match
```java
private long match(
        long takerId,
        Side takerSide,
        BigDecimal limitPrice,
        long quantity,
        long ts,
        List<Trade> out) {
    NavigableMap<BigDecimal, PriceLevel> opposite =
            (takerSide == Side.BUY) ? asks : bids;
    long remaining = quantity;
    while (remaining > 0 && !opposite.isEmpty()) {
        Map.Entry<BigDecimal, PriceLevel> best = opposite.firstEntry();
        BigDecimal restingPrice = best.getKey();
        if (limitPrice != null) {
            int cmp = restingPrice.compareTo(limitPrice);
            boolean acceptable =
                    (takerSide == Side.BUY) ? cmp <= 0 : cmp >= 0;
            if (!acceptable) break;
        }
        PriceLevel level = best.getValue();
        while (remaining > 0 && level.head != null) {
            Order maker = level.head;
            long fillQty = Math.min(remaining, maker.remainingQuantity());
            out.add(new Trade(maker.id(), takerId, symbol, restingPrice, fillQty, ts));
            remaining -= fillQty;
            maker.decreaseQuantity(fillQty);
            level.recordFill(fillQty);
            lastTradePrice = restingPrice;
            if (maker.remainingQuantity() == 0) {
                level.unlink(maker);
                ordersById.remove(maker.id());
            }
        }
        if (level.isEmpty()) opposite.remove(restingPrice);
    }
    return remaining;
}
```
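To see price-time priority produce concrete fills, here is a small Python sketch of the same sweep. It is not a port of the Java class. A `heapq` of prices stands in for the TreeMap ladder, a `deque` stands in for the linked-list FIFO, prices are integer ticks, and the names `BookSide` and `match` with these signatures are assumptions for illustration.

```python
import heapq
from collections import deque
from typing import Deque, Dict, List, Optional, Tuple


class BookSide:
    """One side of a book: a heap of prices plus a FIFO queue per price.

    `sign` is +1 for asks (lowest price first) and -1 for bids (highest first).
    """

    def __init__(self, sign: int) -> None:
        self.sign = sign
        self.heap: List[int] = []
        self.levels: Dict[int, Deque[List[int]]] = {}   # price -> [[order_id, qty], ...]

    def add(self, order_id: int, price: int, qty: int) -> None:
        if price not in self.levels:
            self.levels[price] = deque()
            heapq.heappush(self.heap, self.sign * price)
        self.levels[price].append([order_id, qty])

    def best_price(self) -> Optional[int]:
        return self.sign * self.heap[0] if self.heap else None

    def drop_best(self) -> None:
        price = self.sign * heapq.heappop(self.heap)
        del self.levels[price]


def match(opposite: BookSide, taker_id: int, taker_is_buy: bool,
          limit: Optional[int], qty: int) -> Tuple[int, List[Tuple[int, int, int, int]]]:
    """Sweep `opposite`; return (unfilled qty, [(maker, taker, price, qty), ...])."""
    trades: List[Tuple[int, int, int, int]] = []
    remaining = qty
    while remaining > 0 and opposite.best_price() is not None:
        price = opposite.best_price()
        # A limit taker stops at the first level worse than its limit.
        if limit is not None and (price > limit if taker_is_buy else price < limit):
            break
        queue = opposite.levels[price]
        while remaining > 0 and queue:
            maker = queue[0]                      # oldest order at this price
            fill = min(remaining, maker[1])
            trades.append((maker[0], taker_id, price, fill))
            remaining -= fill
            maker[1] -= fill
            if maker[1] == 0:
                queue.popleft()
        if not queue:
            opposite.drop_best()
    return remaining, trades


asks = BookSide(sign=+1)
asks.add(order_id=1, price=101, qty=5)
asks.add(order_id=2, price=100, qty=5)   # better price, arrives second
asks.add(order_id=3, price=100, qty=5)   # same price, arrives third

left_a, fills_a = match(asks, taker_id=9, taker_is_buy=True, limit=100, qty=8)
left_b, fills_b = match(asks, taker_id=10, taker_is_buy=True, limit=100, qty=10)
```

The first taker fills order 2 in full and order 3 in part: price wins over arrival time, and arrival time breaks the tie at 100. The second taker takes the last two units at 100 and stops, because the next level at 101 is above its limit.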
Case III · service boundaries

Four services, two protocols.

"The dashboard never touches the database. The ingestor never touches the API. Two protocols carry the whole stack: HTTP and Postgres."

Four services, but the rules between them are short. The ingestor writes ticks and alerts with raw psycopg in an autocommit connection, wrapping each tick and its derived alert inside one conn.transaction() so a crash mid-trade cannot leave a chart spike without its overlay marker.

The API reads with an AsyncConnectionPool in its lifespan, exposing three endpoints, and the dashboard polls those endpoints. The matching engine writes its own me_* tables on a HikariCP pool with prepareThreshold=1 so the very first execute uses a server-side named plan; subsequent inserts skip parse and plan.

No message queue. No ORM. No FastAPI WebSocket. Postgres is the bus.

| Boundary | Protocol |
| --- | --- |
| ingestor → postgres | SQL · INSERT |
| matching-engine → postgres | SQL · INSERT (HikariCP) |
| api → postgres | SQL · SELECT |
| dashboard → api | HTTP · poll @ 10 s |
| dashboard → matching-engine | HTTP · poll @ 2 Hz |
ingestor / main.py · _handle_trade
```python
# A tick and its derived alert ship in one transaction so a crash
# mid-trade cannot leave a chart spike without its alert marker.
async with conn.transaction():
    await conn.execute(
        "INSERT INTO ticks (symbol, ts, price, volume) "
        "VALUES (%s, %s, %s, %s) "
        "ON CONFLICT (symbol, ts) DO NOTHING",
        (symbol, ts, price, volume),
    )
    if result.is_anomaly:
        await conn.execute(
            "INSERT INTO alerts (symbol, ts, price, z_score, message) "
            "VALUES (%s, %s, %s, %s, %s)",
            (symbol, ts, price, result.z_score, message),
        )
```
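The both-or-neither guarantee can be tried without a running Postgres. The sketch below swaps in the standard-library `sqlite3` module as a stand-in for psycopg, with a cut-down two-table schema. The function `write_tick`, the `NOT NULL` constraint used to force a failure, and the sample values are all assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ticks (symbol TEXT, ts INTEGER, price REAL, PRIMARY KEY (symbol, ts))"
)
conn.execute("CREATE TABLE alerts (symbol TEXT, ts INTEGER, z_score REAL NOT NULL)")
conn.commit()


def write_tick(symbol, ts, price, z_score, is_anomaly):
    """Insert a tick and, if flagged, its alert; commit both or neither."""
    try:
        with conn:  # commits on success, rolls back if the body raises
            conn.execute(
                "INSERT INTO ticks (symbol, ts, price) VALUES (?, ?, ?) "
                "ON CONFLICT (symbol, ts) DO NOTHING",
                (symbol, ts, price),
            )
            if is_anomaly:
                conn.execute(
                    "INSERT INTO alerts (symbol, ts, z_score) VALUES (?, ?, ?)",
                    (symbol, ts, z_score),
                )
        return True
    except sqlite3.Error:
        return False


ok = write_tick("AAPL", 1, 190.0, 0.4, is_anomaly=False)
# The alert insert violates NOT NULL, so the tick written just before it is rolled back.
failed = write_tick("AAPL", 2, 250.0, None, is_anomaly=True)

tick_count = conn.execute("SELECT COUNT(*) FROM ticks").fetchone()[0]
alert_count = conn.execute("SELECT COUNT(*) FROM alerts").fetchone()[0]
```

Only the first tick survives. The second tick never becomes visible because its alert failed inside the same transaction, which is the property that keeps a chart spike from appearing without its marker.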
§ 05
Try It Yourself

Three commands. One stack.

A free Finnhub key gets the Python side running. The Java side runs against synthetic flow with no external dependencies. Docker brings the lot up in one command.

```shell
# 1. Clone
$ git clone https://github.com/shulman33/market-pipeline.git
$ cd market-pipeline

# 2. Configure (free key, no credit card)
$ cp .env.example .env
$ edit .env   # set FINNHUB_API_KEY=...

# 3. Bring up all four services + Postgres
$ docker compose up -d --build

# Open three URLs:
#   http://localhost:8501          dashboard
#   http://localhost:8000/health   api
#   http://localhost:8080/health   matching engine

# Run the tests (no mocks, real Postgres via testcontainers)
$ pytest -v
$ cd matching-engine && mvn -B test
```