Anh D Tran trnahnh

About

I build systems that operate at the edge of what hardware allows.

My work lives in the space between nanoseconds and profit — low-latency matching engines,
AI inference pipelines, and real-time infrastructure designed to handle production load
without flinching. I treat performance as a feature, not an afterthought.

Currently studying Computer Engineering at the University of Cincinnati (3.71 GPA),
targeting a Fall 2026 co-op at quantitative finance and big tech firms. My background
spans systems programming in Rust and Go, ML pipelines with XGBoost and LangGraph,
and full-stack product engineering across multiple deployed applications.

I don't build apps. I build engines.

Open To: Fall 2026 Co-op · Quant Finance Internships · Big Tech SWE Roles · Systems Engineering · ML Infrastructure

Featured Projects

Commma — Developer Activity Tracker

A full-stack developer activity platform turning the editor into a logbook — pace, splits, streaks, and leaderboards as rituals of a real sport, applied to code. A VSCode extension captures editor activity in real time; a Hono API ingests and aggregates sessions; a React web app surfaces session detail, streaks, leaderboards, and shareable keyboard heatmap cards.

Attribute	Detail
Stack	TypeScript, React 19, Vite, Tailwind v4, Hono, Node.js, PostgreSQL, Redis
Extension	VSCode API · Key-label tracking · Privacy modes · Offline queue
Performance	Sub-second session ingestion · Redis sorted sets for leaderboard queries
Auth	GitHub OAuth · JWT access tokens · HTTP-only rotating refresh tokens
Billing	Stripe Pro/Team subscriptions · Signature-verified webhooks
Deployment	EC2 t4g (Graviton) + PM2 · S3 + CloudFront · Neon PostgreSQL · Upstash Redis · Terraform
Repository	github.com/NauriFive/commma-coding-progress-tracker
Live	commma.dev

The extension captures key labels — never key content — across three configurable privacy modes. The API aggregates raw events into session records with pace, line delta, and per-language breakdowns. The Canvas heatmap layer renders per-session key frequency as a transparent PNG exportable in three aspect ratios (9:16, 1:1, 16:9) for social sharing. Redis sorted sets back real-time weekly, monthly, and all-time leaderboards filterable by language.
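The aggregation step described above can be sketched roughly as follows. This is a minimal illustration in Python, not the Commma API code (which is TypeScript): the `KeyEvent` fields, the `aggregate_session` name, and the definition of pace as keystrokes per minute are all assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class KeyEvent:
    # All field names here are hypothetical, chosen for illustration.
    timestamp_s: float  # seconds since the session started
    key_label: str      # a key label such as "KeyA" or "Enter", never typed content
    language: str       # editor language id, e.g. "typescript"
    line_delta: int     # net lines added (+) or removed (-) by this event


def aggregate_session(events):
    """Fold raw key events into a single session summary."""
    keystrokes = len(events)
    if keystrokes == 0:
        duration_s = 0.0
    else:
        times = [e.timestamp_s for e in events]
        duration_s = max(times) - min(times)
    # Assumption: "pace" means keystrokes per minute over the session span.
    pace_kpm = keystrokes / (duration_s / 60.0) if duration_s > 0 else 0.0
    return {
        "keystrokes": keystrokes,
        "pace_kpm": pace_kpm,
        "line_delta": sum(e.line_delta for e in events),
        "by_language": dict(Counter(e.language for e in events)),
        "key_frequency": dict(Counter(e.key_label for e in events)),
    }
```

The `key_frequency` map is the kind of per-session data a heatmap renderer could consume, and `by_language` is the kind of breakdown a language-filtered leaderboard could be scored from.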

Ferrox — Order Matching Engine

A production-grade central limit order book matching engine written in Rust, architected for sub-microsecond execution in high-frequency trading environments. Designed around zero-cost abstractions, lock-free concurrency primitives, and memory-mapped persistence with crash recovery guarantees.

Attribute	Detail
Stack	Rust, Atomics, mmap WAL, Criterion, HdrHistogram
Scale	4.7M orders/second sustained throughput
Performance	500ns P99 tick-to-trade latency · Zero hot-path heap allocations
Memory	1M-slot pre-allocated arena · Doubly linked list order book
Concurrency	Lock-free SPSC ring buffer · Acquire/Release memory ordering · 64B cache-line padding
Reliability	mmap write-ahead log · Crash recovery under 1.4ms
Throughput Gain	8.8x improvement from lock-free SPSC design
Repository	github.com/trnahnh/ferrox

Built to match or exceed the performance profile of institutional-grade matching engines. The design eliminates all dynamic memory allocation on the critical execution path, using a pre-allocated arena and stack-pinned message passing throughout. The WAL layer provides durability while keeping crash recovery under 1.4ms, with latency verified end-to-end via HdrHistogram measurement under Criterion benchmarks.
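The arena idea can be illustrated with a small index-linked queue. This is a sketch in Python for readability only; Ferrox itself is written in Rust, and the class and method names below are hypothetical. Python lists do not reproduce the cache behavior or latency of the real design; the point is only that every slot is allocated once up front and orders are linked by slot index, so insert and cancel never allocate.

```python
class OrderArena:
    """Fixed-capacity slot pool holding one FIFO queue of resting orders.

    Orders are linked by slot index instead of by pointer, and freed slots
    are chained into a free list, so push and remove are both O(1).
    """

    NIL = -1

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.qty = [0] * capacity
        self.prev = [self.NIL] * capacity
        # Initially every slot is free; `next` doubles as the free-list link.
        self.next = list(range(1, capacity)) + [self.NIL]
        self.free_head = 0
        self.head = self.NIL
        self.tail = self.NIL

    def push_back(self, qty):
        """Take a free slot and append it to the tail of the queue."""
        slot = self.free_head
        if slot == self.NIL:
            raise MemoryError("arena exhausted")
        self.free_head = self.next[slot]
        self.qty[slot] = qty
        self.prev[slot] = self.tail
        self.next[slot] = self.NIL
        if self.tail == self.NIL:
            self.head = slot
        else:
            self.next[self.tail] = slot
        self.tail = slot
        return slot

    def remove(self, slot):
        """Unlink a slot (e.g. on cancel) and return it to the free list."""
        p, n = self.prev[slot], self.next[slot]
        if p == self.NIL:
            self.head = n
        else:
            self.next[p] = n
        if n == self.NIL:
            self.tail = p
        else:
            self.prev[n] = p
        self.prev[slot] = self.NIL
        self.next[slot] = self.free_head
        self.free_head = slot

    def iter_fifo(self):
        """Yield (slot, qty) pairs in time priority order."""
        cur = self.head
        while cur != self.NIL:
            yield cur, self.qty[cur]
            cur = self.next[cur]
```

A cancelled order's slot is reused by the next insert, which is what keeps the working set fixed in size.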

Draft-Thinker — Cost-Aware LLM Gateway

A high-performance LLM routing gateway written in Go that cuts inference costs by 91.6% while holding 98.2% accuracy on a 518-prompt benchmark. Routes requests dynamically using real-time Shannon entropy analysis of token log probabilities, executes speculative drafts via goroutines, and serves repeated semantic queries from a vector-backed cache layer in under 50ms.

Attribute	Detail
Stack	Go, OpenAI API, Qdrant, Redis, Prometheus, Grafana, Docker
Cost Reduction	91.6% inference cost savings · 94% of requests handled by drafter model
Routing	Real-time Shannon entropy analysis of token log probabilities
Accuracy	98.2% accuracy on 518-prompt benchmark
Cache	Vector search + TTL eviction · Cosine similarity ≥ 0.95 · Sub-50ms cache hits
Observability	Prometheus metrics · Grafana dashboards
Repository	github.com/trnahnh/draft-thinker

The entropy router evaluates token-level confidence distributions from the drafter model before deciding whether to escalate to a more capable, more expensive model. The semantic cache layer uses vector search to detect repeated queries above a cosine similarity threshold, so cache hits skip inference entirely. Full observability via Prometheus and Grafana covers routing decisions, cache hit rates, and per-model cost attribution.
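One way an entropy-based escalation check could look is sketched below. This is an illustration in Python, not the gateway's Go code: the function names, the 1.0-bit threshold, the choice to renormalize over the returned top-k log probabilities, and the use of a mean over token positions are all assumptions.

```python
import math


def token_entropy(top_logprobs):
    """Shannon entropy in bits for one token position.

    `top_logprobs` holds natural-log probabilities of the top-k candidates.
    Top-k mass rarely sums to 1, so the values are renormalized first
    (an assumption of this sketch).
    """
    probs = [math.exp(lp) for lp in top_logprobs]
    total = sum(probs)
    if total <= 0:
        return 0.0
    entropy = 0.0
    for p in probs:
        q = p / total
        if q > 0:
            entropy -= q * math.log2(q)
    return entropy


def should_escalate(per_token_top_logprobs, threshold_bits=1.0):
    """Escalate when the drafter's mean per-token entropy exceeds the threshold."""
    if not per_token_top_logprobs:
        return True  # no confidence signal at all: fall back to the larger model
    mean_entropy = sum(
        token_entropy(t) for t in per_token_top_logprobs
    ) / len(per_token_top_logprobs)
    return mean_entropy > threshold_bits
```

A draft whose tokens are each dominated by one candidate stays with the drafter, while a draft spread evenly across four candidates (2 bits per token) would be escalated under this threshold.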

Inyeon — Agentic AI Git Assistant

A multi-agent AI assistant for software engineering workflows, built on a LangGraph orchestration backbone with a FastAPI runtime and ChromaDB vector memory. Handles the full spectrum of developer requests through a 7-agent pipeline with 100ms median response time and 100% test coverage across 245+ cases.

Attribute	Detail
Stack	Python, FastAPI, LangGraph, ChromaDB, scikit-learn, NumPy, Typer
Architecture	7-agent orchestration pipeline with cost-optimized caching and short-circuiting
Performance	100ms median response latency
Test Coverage	245+ test cases · 100% unit and integration branch coverage
Build Speed	95% Docker build time reduction (49s → 2.1s)
Memory	ChromaDB-backed RAG memory · 4 clustering strategies via scikit-learn
Repository	github.com/trnahnh/inyeon

Each agent in the pipeline is scoped to a discrete responsibility: intent classification, repository context retrieval, code analysis, diff generation, review synthesis, test suggestion, and response formatting. LangGraph manages state transitions and conditional routing between agents, enabling complex multi-hop workflows without brittle prompt chaining.

KatanaID — AI Branding Toolkit

A production-deployed AI branding platform written in Go, generating brand identities through high-concurrency API orchestration. Integrates Google Gemini AI for creative generation with a trust score engine and browser fingerprinting for session security, delivering complete brand packages under 200ms via concurrent goroutine execution. Validated under 2,300+ requests/day via k6 stress testing.

Attribute	Detail
Stack	Go, React, Gemini AI, Ent ORM, PostgreSQL, goroutines, Railway, Vercel
Concurrency	19+ parallel API calls per request via goroutine fan-out
Performance	Sub-200ms response times · 2,300+ requests/day in production
Security	Trust score engine · Browser fingerprinting · k6 stress tested
Data Layer	Ent ORM type-safe PostgreSQL · Zero schema-related runtime errors
Deployment	Production · katanaid.com

The fan-out architecture dispatches all generative API calls simultaneously at request ingestion, collapsing serial latency chains into a single parallel wait window. Trust scoring evaluates session signals in real time, gating generation behind lightweight anomaly detection before touching paid API quota. Ent ORM enforces strict type safety on the PostgreSQL layer, enabling reliable large-scale brand data synchronization across distributed API sources.
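The fan-out pattern can be sketched as follows. KatanaID does this with goroutines in Go; the version below uses Python's `asyncio` purely for illustration, and `call_provider`, `fan_out`, the provider names, and the timeout value are hypothetical stand-ins with no real network calls.

```python
import asyncio


async def call_provider(name, delay_s):
    """Stand-in for one external API call (simulated with a sleep)."""
    await asyncio.sleep(delay_s)
    return f"{name}:ok"


async def fan_out(calls, timeout_s=1.0):
    """Start every call at once and wait for all of them together.

    Total latency is bounded by the slowest call (or the timeout),
    not by the sum of all calls. Failures are returned as exception
    objects so one slow provider does not sink the whole request.
    """
    tasks = [
        asyncio.wait_for(call_provider(name, delay_s), timeout=timeout_s)
        for name, delay_s in calls
    ]
    results = await asyncio.gather(*tasks, return_exceptions=True)
    return {name: result for (name, _), result in zip(calls, results)}
```

For example, `asyncio.run(fan_out([("logo", 0.01), ("palette", 0.02)]))` returns a result per provider after roughly the longest single delay.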

Caphne — Real-Time Study Matching Platform

A real-time peer study matching platform serving 400+ active users at FPT University, built on a Socket.IO event bus with PostgreSQL persistence and Redis caching. Engineered to sustain 1,700+ requests per minute under concurrent session load with p50 response times under 400ms.

Attribute	Detail
Stack	Nuxt 3, Vue 3, shadcn-vue, Tailwind, Express.js, Socket.IO, PostgreSQL, Drizzle ORM, Redis
Scale	1,700+ requests/minute · 400+ active users · 30 days production traffic
Performance	Median (p50) response under 400ms · 60% API response time reduction
Frontend	Vue optimistic updates · Lazy loading · Input debouncing
Auth	Email OTP via Resend · OAuth via Passport.js (Google, GitHub)
Infra	EC2 (SSM deploys) · S3 + CloudFront · Terraform · GitHub Actions CI/CD
Security	JWT · OAuth 2.0 · Typebox schema validation
Deployment	Production · caphne.co

The Redis layer serves presence state and match candidates from memory, keeping the hot path away from PostgreSQL except for durable writes. Socket.IO manages bidirectional session state across the matching lifecycle, from availability broadcast through confirmation handshake to session teardown. Led 6 engineers through sprint planning and code reviews across the full product lifecycle.

Dasi — End-to-End Encrypted Journal

A privacy-first journaling application with end-to-end encryption — thoughts are encrypted on-device before leaving the client, ensuring the server never has access to plaintext content. Daily writing prompts eliminate the blank-page problem and drive consistent engagement.

Attribute	Detail
Stack	Go, Chi, PostgreSQL, React, TypeScript, AWS Lambda, Resend
Security	On-device encryption before sync · Server sees only ciphertext
Infrastructure	AWS Lambda serverless compute
Notifications	Resend transactional email for daily prompts
Repository	github.com/NauriFive/dasi-encrypted-journal

The encryption model ensures that even a full database compromise exposes no user content — all plaintext remains on the client. The daily prompt system is designed to reduce activation energy for writing, routing prompts through Resend at scheduled intervals to nudge users back into the habit loop.

AnyuDock — S3 File Storage & Config Sharing

A brutalist-by-design S3 file storage platform for sharing files and environment configs between machines. Private by default, public on demand — files stay locked to the owner until explicitly toggled, with share links available for public files only.

Attribute	Detail
Stack	Hono, Bun, Drizzle ORM, PostgreSQL, React, TanStack Router/Query, Tailwind, Vite
Storage	Any S3-compatible provider · Privacy toggle per file · Share link generation
Auth	Email OTP via Resend · JWT session cookies
API	File upload, list, preview, download, privacy toggle, share links
Deployment	Production · anyudock.cloud
Repository	github.com/NauriFive/anyudock

Files are private by default on upload: private files are owner-only for preview and management, while public files are downloadable by anyone with the file ID. Share links are generated only for public files, so a private file cannot be exposed through a link by accident.
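The access rules above can be expressed as a small policy sketch. This is Python for illustration only (the project itself uses Hono and Bun); `StoredFile`, `can_download`, `create_share_link`, the placeholder base URL, and the rule that only the owner may create a link are assumptions of the example.

```python
from dataclasses import dataclass


@dataclass
class StoredFile:
    file_id: str
    owner_id: str
    is_public: bool = False  # private by default on upload


def can_download(file, requester_id):
    """Public files are open to anyone; private files only to their owner."""
    return file.is_public or requester_id == file.owner_id


def create_share_link(file, requester_id, base_url="https://example.invalid/f"):
    """Return a share link, refusing to create one for a private file."""
    if requester_id != file.owner_id:
        # Assumption: only the owner may create links.
        raise PermissionError("only the owner can create share links")
    if not file.is_public:
        raise ValueError("share links are only available for public files")
    return f"{base_url}/{file.file_id}"
```

Because link creation checks `is_public` itself, the owner has to toggle a file public explicitly before any link can exist.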

Experience

Founder & CTO · Commma · commma.dev · May 2026 – Present · Cincinnati, OH

Founding and building an end-to-end developer activity tracking platform from zero — VSCode extension, REST API, and web app — shipping across the full monorepo as sole technical decision-maker. Phases 1–4 complete and live in production; Phase 5 (mobile, PWA, JetBrains/Neovim plugins, CLI, self-hosted Docker stack) in progress.

Architected a pnpm monorepo across three apps (extension, API, web) and two shared packages (schema, DB) with strict TypeScript boundaries
Built the VSCode extension with key-label tracking, three configurable privacy modes, and an offline queue for resilient event delivery
Designed and implemented GitHub OAuth, JWT access tokens, HTTP-only rotating refresh tokens, and Stripe Pro/Team billing with webhook signature verification
Engineered Canvas-based keyboard heatmap rendering with PNG export in three aspect ratios for social sharing, plus a server-side OG image variant via sharp
Deployed on EC2 t4g (Graviton) with PM2, S3 + CloudFront for the web layer, Neon PostgreSQL, Upstash Redis, and Terraform for infra-as-code with S3-locked remote state

Lead Software Engineer · Caphne · Jan 2026 – Present · Ho Chi Minh City, Vietnam

Leading 6 engineers building an end-to-end distributed real-time messaging and study-matching platform across the full product lifecycle, from architecture to production operations.

Engineered a distributed real-time messaging system using Socket.IO, PostgreSQL, and Redis, cutting average API response times by 60% and sustaining 1,700+ requests/minute over 30 days of production traffic
Scaled the study-matching platform (Nuxt 3, Node.js) to 400+ active users while running sprint planning and code reviews for the team
Achieved median (p50) response times under 400ms by architecting a Vue-based frontend with optimistic updates, lazy loading, and input debouncing; hardened the platform with JWT, OAuth 2.0, and Typebox schema validation

Founding Engineer · KatanaID · Dec 2025 – Present · Remote

Co-founding and leading 5 engineers to architect a scalable AI branding intelligence platform in Go, from initial design through production deployment and ongoing operations.

Architected a scalable end-to-end branding intelligence backend in Go, leveraging goroutines to orchestrate 19+ external API calls for real-time asset aggregation and PDF generation
Integrated Google Gemini AI for brand asset generation alongside a trust score engine with browser fingerprinting, sustaining 2,300+ requests/day in production and validating capacity via k6 stress testing
Eliminated schema-related runtime errors by leveraging Ent ORM for a strictly type-safe PostgreSQL layer, enabling reliable large-scale brand data synchronization across distributed API sources

Technical Assistant Intern · Vietcombank · May 2025 – Jul 2025 · Hue, Vietnam

Maintained banking infrastructure reliability and drove digital adoption across branch operations over a 3-month engagement.

Maintained 99% uptime across 30+ networked branch systems by diagnosing hardware, network, and software faults in coordination with IT teams
Reduced monthly in-branch transactions by 15% by onboarding 100+ customers biweekly onto digital banking services including mobile app setup, online payments, and account management workflows

Achievements

Recognition	Details
GPA Honor Roll	3.71 / 4.0 · University of Cincinnati · Computer Engineering
Production Deployment	Commma — live at commma.dev · EC2 t4g + S3 + CloudFront + Terraform · Phases 1–4 complete
Production Deployment	KatanaID — 2,300+ req/day · AI branding platform · 5-engineer team
Production Deployment	Caphne — 400+ users · 1,700+ req/min · FPT University network
LLM Cost Engineering	91.6% inference cost reduction · 98.2% accuracy · 518-prompt benchmark
Systems Performance	500ns P99 matching engine latency · 4.7M orders/sec · 8.8x throughput gain
Full Test Coverage	245+ test cases · 100% branch coverage · 95% Docker build time reduction
Team Leadership	Led 6-engineer team (Caphne) · 5-engineer team (KatanaID)

The bottleneck is never the algorithm. It's the engineer who stops measuring.
