🏗️ System Design Hard 2025–26

Design a Networking SDK (like Retrofit)

A system design walkthrough of building a type-safe HTTP client library from scratch — covering annotation-driven API design, the request pipeline, interceptor chain, converters, auth token refresh, caching, and testing architecture.

1 — Understanding the Problem

Before drawing boxes, align with the interviewer on which perspective is being asked. This question comes in two flavours:

Functional requirements
In scope
Out of scope
Non-functional requirements
2 — Core Entities

Nine building blocks compose the entire SDK. Understanding what each one owns is the foundation of the design.

Entity | Responsibility | Owned by
--- | --- | ---
NetworkClient | Builder-configured root object. Creates service proxies. Holds all registered interceptors, converters, and the engine. | SDK caller (singleton)
ServiceProxy | JVM dynamic proxy that intercepts method calls on your annotated interface and turns them into HTTP requests. | SDK (created via client.create<T>())
ServiceMethod | Parsed, cached representation of one interface method: HTTP verb, URL template, which arg maps to path/query/body. | SDK (cached on first call)
HttpRequest | Immutable value object: method, full URL, headers, serialised body. Passed through the interceptor chain. | SDK internal
HttpResponse | Immutable value object: status code, headers, response body (bytes or stream). | SDK internal
Interceptor | Single-method interface. Can read/mutate the request before forwarding, and read/mutate the response on the way back. | SDK + caller custom
Converter | Serialises a Kotlin object → RequestBody (outbound) or ResponseBody → Kotlin object (inbound). | SDK + caller (e.g. JSON factory)
CallAdapter | Bridges a raw HttpResponse to whatever return type the method declares: plain T, Result<T>, or Flow<T>. | SDK built-in
HttpEngine | Single-method interface: execute(HttpRequest) → HttpResponse. OkHttp in production; MockEngine in tests. | SDK interface, OkHttp default
3 — High-Level Architecture

The architecture has four distinct layers. Application code lives in the caller layer and depends only on the SDK's public API; the request pipeline and the engine are implementation details it never sees.

CALLER LAYER: UserApi interface  ·  ViewModel  ·  Repository
SDK PUBLIC API: NetworkClient.Builder  ·  client.create<T>()  ·  ServiceProxy
REQUEST PIPELINE: Interceptor Chain  ·  Converters  ·  CallAdapters  ·  TokenManager  ·  Cache
ENGINE LAYER: HttpEngine interface  ·  OkHttpEngine (prod)  ·  MockEngine (test)
The four layers of the SDK — callers only depend on the top two

Key design principle: Each layer depends only on the layer below it via a stable interface. The caller never imports OkHttp. The pipeline never imports OkHttp. Only the engine layer does. This is what makes the engine swappable and the entire stack testable.

4 — The Caller's View (API Design)

The best SDK feels like magic to the caller. They write a plain Kotlin interface with annotations and get a fully working HTTP client back. The guiding principle is zero boilerplate — no request builders, no callback hell, no serialisation glue in application code.

What the caller writes

An annotated interface declares what to call, not how to call it. The SDK generates all the wiring at runtime (or compile time with KSP).

    // Caller writes this
    interface UserApi {
        @GET("/users/{id}")
        suspend fun getUser(@Path("id") id: String): User

        @POST("/users")
        suspend fun createUser(@Body req: CreateReq): User

        suspend fun getUsers(@Query("page") p: Int): List<User>
    }

Behind create<T>(), the SDK generates the rest:
  • Reads annotations
  • Builds URL from template
  • Serialises @Body to JSON
  • Runs interceptor chain
  • Deserialises response
The caller writes a plain interface — the SDK generates all HTTP wiring
Annotation vocabulary
Annotation | Where used | Purpose
--- | --- | ---
@GET, @POST, @PUT, @DELETE | Function | HTTP method + path template, e.g. "/users/{id}"
@Path("id") | Parameter | Replaces {id} placeholder in the URL; URL-encoded automatically
@Query("page") | Parameter | Appended as ?page=value; null values are skipped
@Body | Parameter | Serialised to request body via the active Converter
@Header("X-Trace") | Parameter | Added as a per-call request header
@Headers("Accept: ...") | Function | Static headers added to every call of this method
@Multipart | Function | Switches body encoding to multipart/form-data
@Streaming | Function | Response body streamed rather than buffered; for large downloads
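The @Path and @Query rules above (encode path values, skip null query values) can be sketched as a small URL builder. This is a minimal illustration, not the SDK's actual RequestBuilder: `buildUrl` and `encodeComponent` are hypothetical names, and it uses `java.net.URLEncoder` from the JDK (with `+` rewritten to `%20`) as a stand-in for a proper path-segment encoder such as Android's `Uri.encode`.

```kotlin
import java.net.URLEncoder

// Percent-encode one URL component. URLEncoder targets form encoding and
// emits '+' for spaces, so the sketch rewrites that to %20.
fun encodeComponent(value: String): String =
    URLEncoder.encode(value, "UTF-8").replace("+", "%20")

// Expand {name} placeholders from `path`, then append non-null query parameters.
fun buildUrl(
    baseUrl: String,
    template: String,
    path: Map<String, Any>,
    query: Map<String, Any?>
): String {
    var expanded = template
    for ((name, value) in path) {
        expanded = expanded.replace("{$name}", encodeComponent(value.toString()))
    }
    // Encoded values never contain '{', so any remaining brace is a missing @Path argument.
    require(!expanded.contains('{')) { "Unresolved placeholder in $expanded" }

    val queryString = query
        .filterValues { it != null }   // null @Query values are skipped
        .map { (key, value) -> "${encodeComponent(key)}=${encodeComponent(value.toString())}" }
        .joinToString("&")

    val suffix = if (queryString.isEmpty()) "" else "?$queryString"
    return baseUrl.trimEnd('/') + expanded + suffix
}
```

Encoding the path value means a caller-supplied ID such as `a/b c` stays inside its segment instead of changing the route.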
Supported return types
Return type | Behaviour | Best for
--- | --- | ---
suspend fun … : T | Throws HttpException on non-2xx. Deserialises body to T. | Simple happy-path calls in a try/catch or runCatching
suspend fun … : Result<T> | Never throws. Success or failure wrapped in Result. | ViewModel / UI layer that handles errors with fold
suspend fun … : Unit | Discards body. Throws on non-2xx. Typical for DELETE / fire-and-forget POST. | Mutations where the response body is irrelevant
fun … : Flow<T> | Wraps repeated polling calls in a Flow that emits on each interval. | Live-updating data (status polling, score tickers)
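The difference between the throwing `T` form and the `Result<T>` form can be sketched with two small adapter functions. The names `RawResponse`, `adaptThrowing` and `adaptResult` are hypothetical, and the functions are deliberately non-suspend so the example needs only the standard library.

```kotlin
// Hypothetical minimal types for illustration only.
class HttpException(val code: Int) : RuntimeException("HTTP $code")
data class RawResponse(val code: Int, val body: String)

// "suspend fun ... : T" behaviour: non-2xx becomes an exception.
fun <T> adaptThrowing(response: RawResponse, convert: (String) -> T): T {
    if (response.code !in 200..299) throw HttpException(response.code)
    return convert(response.body)
}

// "suspend fun ... : Result<T>" behaviour: the same logic, but failures are captured.
// Note: in coroutine code a real adapter should rethrow CancellationException
// instead of wrapping it, which plain runCatching does not do.
fun <T> adaptResult(response: RawResponse, convert: (String) -> T): Result<T> =
    runCatching { adaptThrowing(response, convert) }
```

Building the `Result` variant on top of the throwing variant keeps the two return types consistent: both treat the same status codes and the same conversion errors as failures.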
5 — LLD: The Request Pipeline

When the caller invokes userApi.getUser("u1"), the SDK runs through a deterministic sequence of steps before the network is ever touched — and another sequence on the way back. This is the core of the LLD.

Request path
  ① Method call intercepted: ServiceProxy.InvocationHandler fires
  ② ServiceMethod lookup: ConcurrentHashMap cache (reflect once)
  ③ RequestBuilder: expand URL · @Query · @Body → JSON bytes
  ④ Interceptor chain: Auth → Cache → Retry → Logging →
  ⑤ HttpEngine.execute(): OkHttp makes the real network call
Response path
  ⑥ Interceptors (return path): each interceptor sees the response
  ⑦ CallAdapter: Converter deserialises → wraps Result/Flow
  ⑧ Caller coroutine resumes: User / Result<User> returned
Design decisions at each step
  • ① Dynamic proxy (JVM) or KSP codegen
  • ② ConcurrentHashMap: reflection amortised
  • ③ Uri.encode on @Path to prevent injection
  • ③ Null @Query params silently skipped
  • ④ Auth interceptor goes FIRST always
  • ④ Logging goes LAST (sees final request)
  • ⑤ Engine is an interface: swappable
  • ⑤ suspendCancellableCoroutine bridges to OkHttp
  • ⑥ On 401 → AuthInterceptor retries once
  • ⑥ On 5xx → RetryInterceptor backs off
  • ⑦ Converter.Factory registry: first match wins
  • ⑧ CancellationException cancels OkHttp call
Full request → response journey through the SDK pipeline. Grey arrows = outbound path. Green arrows = response path.

Why cache ServiceMethod? Parsing annotations with reflection is expensive (~1ms). The ConcurrentHashMap<Method, ServiceMethod> cache means that cost is paid exactly once per method across the entire app lifetime, no matter how many calls are made.
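A minimal sketch of the proxy plus cache, under simplifying assumptions: the annotations, `UserApi`, `ServiceMethodCache` and `createService` below are hypothetical stand-ins, the interface method is non-suspend, and the handler returns a description string instead of running the pipeline. `parseCount` exists only to show that parsing happens once per method.

```kotlin
import java.lang.reflect.InvocationHandler
import java.lang.reflect.Method
import java.lang.reflect.Proxy
import java.util.concurrent.ConcurrentHashMap
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical annotations; RUNTIME retention so reflection can read them.
@Retention(AnnotationRetention.RUNTIME)
@Target(AnnotationTarget.FUNCTION)
annotation class GET(val path: String)

@Retention(AnnotationRetention.RUNTIME)
@Target(AnnotationTarget.VALUE_PARAMETER)
annotation class Path(val name: String)

// Simplified, non-suspend service interface used only for illustration.
interface UserApi {
    @GET("/users/{id}")
    fun getUser(@Path("id") id: String): String
}

// Parsed form of one method: verb, URL template, and the @Path name (if any) per argument.
data class ServiceMethod(val verb: String, val template: String, val pathNames: List<String?>)

class ServiceMethodCache {
    private val cache = ConcurrentHashMap<Method, ServiceMethod>()
    private val parses = AtomicInteger(0)
    val parseCount: Int get() = parses.get()   // instrumentation for the sketch only

    // computeIfAbsent runs the parser at most once per Method key.
    fun get(method: Method): ServiceMethod = cache.computeIfAbsent(method) { parse(it) }

    private fun parse(method: Method): ServiceMethod {
        parses.incrementAndGet()
        val annotation = method.getAnnotation(GET::class.java)
            ?: throw IllegalArgumentException("${method.name} has no HTTP annotation")
        val names = method.parameterAnnotations.map { anns ->
            anns.filterIsInstance<Path>().firstOrNull()?.name
        }
        return ServiceMethod("GET", annotation.path, names)
    }
}

fun <T : Any> createService(api: Class<T>, cache: ServiceMethodCache): T {
    val handler = InvocationHandler { proxy, method, args ->
        val actual: Array<out Any?> = args ?: emptyArray<Any?>()
        if (method.declaringClass == Any::class.java) {
            // equals/hashCode/toString also arrive here; answer them without parsing.
            return@InvocationHandler when (method.name) {
                "toString" -> "ServiceProxy(${api.simpleName})"
                "hashCode" -> System.identityHashCode(proxy)
                else -> proxy === actual[0]
            }
        }
        val serviceMethod = cache.get(method)
        var url = serviceMethod.template
        serviceMethod.pathNames.forEachIndexed { i, name ->
            if (name != null) url = url.replace("{$name}", actual[i].toString())
        }
        "${serviceMethod.verb} $url"   // a real SDK would build an HttpRequest and run the pipeline
    }
    return api.cast(Proxy.newProxyInstance(api.classLoader, arrayOf<Class<*>>(api), handler))
}
```

Calling `getUser` twice on a service created with `createService(UserApi::class.java, cache)` parses the annotations once; the second call is served from the map.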

6 — Interceptor Chain Design

The interceptor chain is the SDK's most powerful extension point. It follows the Chain of Responsibility pattern: each interceptor receives the request, can modify it, calls chain.proceed() to pass it forward, and then sees the response on the way back. This symmetric in/out design is what makes auth, logging, retry, and caching all implementable without touching each other.

Caller coroutine → Auth → Cache → Retry → Logging → HttpEngine (OkHttp / Mock)
  • Auth: add Bearer token; handle 401 retry
  • Cache: check disk cache; store on 200
  • Retry: retry 5xx; exponential back-off
  • Logging: log method/url; log latency/code
Request flows left to right; response flows right to left. Each interceptor owns a symmetric before/after around chain.proceed().
Interceptor chain — request flows right, response flows left through the same interceptors
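The before/after symmetry around chain.proceed() can be sketched in a few lines. This is an illustrative, non-suspend version: `RealChain`, `runChain` and the two example interceptors are hypothetical names, and the request/response types are trimmed to what the example needs.

```kotlin
data class HttpRequest(val method: String, val url: String, val headers: Map<String, String> = emptyMap())
data class HttpResponse(val code: Int, val body: String = "")

fun interface HttpEngine { fun execute(request: HttpRequest): HttpResponse }

interface Chain {
    val request: HttpRequest
    fun proceed(request: HttpRequest): HttpResponse
}

fun interface Interceptor { fun intercept(chain: Chain): HttpResponse }

// Each proceed() hands the request to the next interceptor, or to the engine at the end.
class RealChain(
    private val interceptors: List<Interceptor>,
    private val index: Int,
    override val request: HttpRequest,
    private val engine: HttpEngine
) : Chain {
    override fun proceed(request: HttpRequest): HttpResponse =
        if (index == interceptors.size) engine.execute(request)
        else interceptors[index].intercept(RealChain(interceptors, index + 1, request, engine))
}

fun runChain(interceptors: List<Interceptor>, engine: HttpEngine, request: HttpRequest): HttpResponse =
    RealChain(interceptors, 0, request, engine).proceed(request)

// "Before" work only: decorate the request, then forward it.
class AuthInterceptor(private val token: () -> String) : Interceptor {
    override fun intercept(chain: Chain): HttpResponse {
        val headers = chain.request.headers + ("Authorization" to "Bearer ${token()}")
        return chain.proceed(chain.request.copy(headers = headers))
    }
}

// "Before" and "after" work: log the outgoing request, then the response code.
class LoggingInterceptor(private val lines: MutableList<String>) : Interceptor {
    override fun intercept(chain: Chain): HttpResponse {
        val request = chain.request
        val hasAuth = "Authorization" in request.headers
        lines.add("--> ${request.method} ${request.url} auth=$hasAuth")
        val response = chain.proceed(request)
        lines.add("<-- ${response.code}")
        return response
    }
}
```

With the list `[AuthInterceptor, LoggingInterceptor]`, the logger sits closer to the engine, so the request it records already carries the Authorization header.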
Interceptor ordering rules

Interview trap: Interviewers often ask "what if two requests both get a 401 at the same time?" The naive answer, that both refresh the token, causes two token refresh calls, and if the server rotates refresh tokens the second call fails because the first one already invalidated the refresh token it sends. The correct answer is a Mutex inside TokenManager.refreshToken(); see Section 8.

7 — Converter & Serialisation Design

Converters are the answer to "how does the SDK avoid being coupled to any particular JSON library?" They use the Abstract Factory pattern — the SDK defines the contract, callers plug in implementations, and the registry picks the right one at runtime.

ConverterRegistry: List<Converter.Factory>
  • KotlinxJson factory (highest priority)
  • GsonFactory
  • ProtobufFactory
  • ScalarFactory (String / ByteArray)
The registry walks factories in order; the first factory that claims the type wins. addConverter() prepends to the list, so the most recently added factory has highest priority.
Converter.Factory registry — open for extension, the SDK never imports a JSON library

Each Converter.Factory handles two directions: serialising a Kotlin object to a RequestBody (outbound), and deserialising a ResponseBody back to a Kotlin object (inbound). A factory returns null when it cannot handle a given type — the registry moves to the next one.
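A minimal sketch of such a registry, assuming simplified shapes: `Converter`, `ConverterFactory` and `ConverterRegistry` here are hypothetical, bodies are plain byte arrays, and types are matched on `Class<*>` rather than full generic type information.

```kotlin
interface Converter<T : Any> {
    fun toBody(value: T): ByteArray      // outbound
    fun fromBody(bytes: ByteArray): T    // inbound
}

// A factory returns null when it cannot handle the requested type.
fun interface ConverterFactory {
    fun create(type: Class<*>): Converter<*>?
}

class ConverterRegistry {
    private val factories = mutableListOf<ConverterFactory>()

    // Prepend, so the most recently added factory is consulted first.
    fun addConverter(factory: ConverterFactory): ConverterRegistry = apply { factories.add(0, factory) }

    @Suppress("UNCHECKED_CAST")
    fun <T : Any> converterFor(type: Class<T>): Converter<T> {
        for (factory in factories) {
            val converter = factory.create(type) ?: continue   // not claimed: try the next factory
            return converter as Converter<T>
        }
        throw IllegalArgumentException("No converter registered for ${type.simpleName}")
    }
}

// Example converters for String, standing in for real JSON/Protobuf factories.
object PlainStringConverter : Converter<String> {
    override fun toBody(value: String) = value.encodeToByteArray()
    override fun fromBody(bytes: ByteArray) = bytes.decodeToString()
}

object UpperCaseStringConverter : Converter<String> {
    override fun toBody(value: String) = value.uppercase().encodeToByteArray()
    override fun fromBody(bytes: ByteArray) = bytes.decodeToString().uppercase()
}

val scalarFactory = ConverterFactory { type -> if (type == String::class.java) PlainStringConverter else null }
val upperCaseFactory = ConverterFactory { type -> if (type == String::class.java) UpperCaseStringConverter else null }
```

Registering `upperCaseFactory` after `scalarFactory` makes it win for `String`, because both claim the type and the later registration sits at the front of the list.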

8 — Auth & Token Refresh

Token management is where most networking SDKs get it wrong. There are three problems to solve: where to store the token safely, how to attach it transparently, and how to handle concurrent refresh without hitting the auth endpoint multiple times.

The concurrent refresh race
Participants: Coroutine A, Coroutine B, Coroutine C, TokenManager, Auth API
  1. A gets a 401 and calls refreshToken(); B and C also call refreshToken().
  2. A acquires the mutex; B and C are suspended, waiting for it.
  3. A sends POST /oauth/token and receives a new access + refresh token.
  4. A stores the tokens and increments tokenVersion.
  5. A releases the mutex and retries its call.
  6. B acquires the mutex, sees that the version changed, and returns the existing fresh token.
  7. C does the same: zero extra API calls.
Result: 3 concurrent 401s → exactly 1 token refresh call to the auth endpoint
Mutex + version counter prevents the concurrent refresh stampede
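The version check can be sketched as follows. To stay runnable on the standard library alone, this illustration swaps in a blocking `ReentrantLock` and plain threads for the coroutine `Mutex` the design calls for; the version comparison is the same idea. `TokenManager`'s shape and the `refreshCall` parameter are assumptions made for the example.

```kotlin
import java.util.concurrent.locks.ReentrantLock
import kotlin.concurrent.withLock

class TokenManager(private val refreshCall: () -> String) {
    private val lock = ReentrantLock()   // stand-in for kotlinx.coroutines' Mutex
    private var token: String = "expired"
    private var version: Int = 0

    // Snapshot of the token and the version it belongs to, taken before sending a request.
    fun current(): Pair<String, Int> = lock.withLock { token to version }

    // Called by a request that got a 401 while using the token of `seenVersion`.
    fun refreshToken(seenVersion: Int): String = lock.withLock {
        if (version != seenVersion) {
            token                    // someone else refreshed while we waited: reuse it
        } else {
            token = refreshCall()    // the one and only call to the auth endpoint
            version++
            token
        }
    }
}
```

Every caller passes the version it observed before its request failed, so a caller that wakes up after another one has refreshed sees a newer version and returns immediately.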
Token storage
Storage option | Risk | Verdict
--- | --- | ---
In-memory only | Token lost on process death; user re-logs on every app restart | ❌ Bad UX
Plain SharedPreferences | XML file readable by root or via backup exploit | ❌ Insecure
EncryptedSharedPreferences | AES-256-GCM; key in Android Keystore (API 23+, hardware-backed where the device supports it) | ✅ Recommended
Keystore direct (no value storage) | Keystore stores keys, not arbitrary strings; must pair with EncryptedSharedPreferences | ⚠ Pair with above

Never use the main NetworkClient for token refresh. The refresh request would pass through the AuthInterceptor, and if the refresh call itself returned a 401 the interceptor would try to refresh again: an infinite loop, or a deadlock if the refresh is already holding a non-reentrant Mutex. Use a separate raw HttpEngine instance with no interceptors for the auth endpoint.

9 — HTTP Caching Strategy

The cache interceptor implements standard HTTP cache semantics — the same logic browsers use. A cache hit means zero network usage; a conditional request (304) means network metadata only, not body bytes.

  1. A GET request arrives.
  2. Cache lookup by URL; key = sha256(method + url + vary).
  3. Not cached: fetch from the network, store in the cache, return the response.
  4. Cached and still fresh (max-age not exceeded): cache hit, return the cached response.
  5. Cached but stale: send a conditional request (If-None-Match: ETag / If-Modified-Since).
     • 304 Not Modified → refresh TTL, use cache
     • 200 with a new body → overwrite cache
Cache decision tree — conditional requests save bandwidth even when cache is stale
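The decision tree can be sketched as a pure function over a cache entry and the current time. The types and names below (`CacheEntry`, `CacheDecision`, `decide`, `afterRevalidation`) are hypothetical, freshness is reduced to a single max-age value, and only the ETag validator is handled; real HTTP caching has more rules than this.

```kotlin
import java.security.MessageDigest

data class CacheEntry(val body: String, val etag: String?, val storedAtMs: Long, val maxAgeMs: Long)

sealed class CacheDecision {
    object FetchFromNetwork : CacheDecision()
    data class ServeFromCache(val body: String) : CacheDecision()
    data class Revalidate(val conditionalHeaders: Map<String, String>) : CacheDecision()
}

// Cache key: SHA-256 over method, URL and the Vary-relevant header values, as lowercase hex.
fun cacheKey(method: String, url: String, vary: String = ""): String =
    MessageDigest.getInstance("SHA-256")
        .digest("$method $url $vary".toByteArray())
        .joinToString("") { "%02x".format(it) }

fun decide(entry: CacheEntry?, nowMs: Long): CacheDecision = when {
    entry == null -> CacheDecision.FetchFromNetwork
    nowMs - entry.storedAtMs < entry.maxAgeMs -> CacheDecision.ServeFromCache(entry.body)
    entry.etag != null -> CacheDecision.Revalidate(mapOf("If-None-Match" to entry.etag))
    else -> CacheDecision.FetchFromNetwork   // stale and nothing to validate against
}

// 304: keep the stored body and restart the freshness clock. Otherwise: overwrite the entry.
fun afterRevalidation(entry: CacheEntry, code: Int, body: String, etag: String?, nowMs: Long): CacheEntry =
    if (code == 304) entry.copy(storedAtMs = nowMs)
    else entry.copy(body = body, etag = etag, storedAtMs = nowMs)
```

Keeping the decision free of I/O makes each branch of the tree testable with nothing more than a hand-built `CacheEntry` and a timestamp.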
10 — Testing Architecture

The HttpEngine interface is what makes the entire SDK testable without a real server. In tests you swap in MockEngine — a simple queue of canned responses. No mocking frameworks, no HTTP stubs on a local port, no network flakiness.

PRODUCTION: NetworkClient + OkHttpEngine. All interceptors + converters run against the real network. engine("okhttp") in Builder.
UNIT TEST: NetworkClient + MockEngine. Same interceptors + converters. Canned JSON responses, no network. engine(mockEngine) in Builder.
Swapping OkHttpEngine → MockEngine is a one-line Builder change — all other pipeline logic is identical
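A queue-backed MockEngine might look like the sketch below. The `enqueue` and `recorded` members are assumptions about its shape, the engine interface is shown non-suspend, and the request/response types are trimmed to the fields the example uses.

```kotlin
data class HttpRequest(val method: String, val url: String)
data class HttpResponse(val code: Int, val body: String = "")

fun interface HttpEngine { fun execute(request: HttpRequest): HttpResponse }

// Test double: returns canned responses in FIFO order and records every request it saw.
class MockEngine : HttpEngine {
    private val queue = ArrayDeque<HttpResponse>()
    val recorded = mutableListOf<HttpRequest>()

    fun enqueue(response: HttpResponse): MockEngine = apply { queue.addLast(response) }

    override fun execute(request: HttpRequest): HttpResponse {
        recorded.add(request)
        // Failing loudly on an empty queue surfaces unexpected extra calls in a test.
        return queue.removeFirstOrNull()
            ?: error("MockEngine: no response queued for ${request.method} ${request.url}")
    }
}
```

A test enqueues the responses it expects the SDK to consume, runs the call through the normal pipeline, and then asserts on both the result and `recorded`.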

What makes this design powerful for testing:

11 — Dynamic Proxy vs KSP Code Generation

This is a classic "how does it actually work?" follow-up at Staff level. There are two fundamentally different approaches to turning the annotation-decorated interface into working code.

Dynamic Proxy (Runtime)
  1. App starts, calls client.create<UserApi>()
  2. JVM generates synthetic class at runtime
  3. First method call → reflection + cache
  4. Subsequent calls → cached ServiceMethod
  Easy to set up. Needs keep rules for R8.
KSP Code Generation (Compile-time)
  1. Build runs KSP processor
  2. Processor visits annotated interfaces
  3. Emits UserApiImpl.kt, a plain Kotlin class
  4. App uses generated class, zero reflection
  Errors at build time. R8-safe. More setup.
Two approaches to service proxy generation — both produce identical runtime behaviour
Dimension | Dynamic Proxy | KSP Codegen
--- | --- | ---
When errors are caught | Runtime: first method call | Compile time: build fails immediately
R8 / ProGuard | Needs keep rules for all annotated interfaces | ✅ Generated code is plain Kotlin, fully shrinkable
Reflection cost | Once per method (cached), negligible after | Zero: no reflection at all
Supports inline / reified | ❌ Proxy can't use them | ✅ Generated code can be inlined
Build complexity | None | Requires KSP Gradle plugin + processor module
KMP (Kotlin Multiplatform) | ❌ JVM-only; iOS/WASM don't have dynamic proxies | ✅ Works on all targets
Choose when | Internal tools, prototype, JVM-only app | Public SDK, KMP, aggressive minification
12 — What Interviewers Expect at Each Level
SDE-II
  • Explain the caller-facing API: annotations, interface, create<T>()
  • Describe dynamic proxy mechanics at a high level
  • Name 2–3 interceptors and what they do
  • Know the difference between suspend T and Result<T>
  • Understand why HttpEngine is an interface (testability)
Senior
  • Design the full interceptor chain with correct ordering and reasons
  • Explain the concurrent 401 race and the Mutex + version counter fix
  • Design the Converter.Factory registry and explain factory ordering
  • Explain suspendCancellableCoroutine and OkHttp call cancellation
  • Design the caching layer: ETag, 304, TTL, cache-only vs network-only modes
Staff+
  • KSP processor design: symbol resolution, incremental processing, code emission
  • Full thread-safety analysis across every SDK class
  • SDK versioning: binary compatibility, shim layers, deprecation strategy
  • KMP strategy: why dynamic proxy fails on non-JVM targets
  • Observability: metrics interceptor, URL pattern normalization for low-cardinality
  • Certificate pinning: where it lives, how to rotate pins without a release
13 — Interview Q&A
How does client.create<UserApi>() actually work at runtime?

It calls Proxy.newProxyInstance() with your interface's class. The JVM generates a synthetic class. Every method call is routed to an InvocationHandler that looks up the cached ServiceMethod, runs RequestBuilder to build the HttpRequest, then dispatches through the interceptor chain via the CallAdapter. Reflection is only paid on the first call; subsequent calls hit the ConcurrentHashMap cache.

Three coroutines all get 401 simultaneously — how do you prevent three token refresh calls?

Use a Mutex inside TokenManager.refreshToken(). The first coroutine acquires the lock and calls the auth endpoint. The other two suspend at the mutex. To avoid a second refresh when they wake up, use a tokenVersion counter. Before acquiring the lock, each coroutine reads the current version. After acquiring, if the version has changed, someone else already refreshed — return the existing fresh token without hitting the network again.

How is a request cancelled when the user leaves the screen?

The ViewModel launches the call inside viewModelScope, which is cancelled when the ViewModel is cleared (onCleared()). Because every SDK method is suspend, the CancellationException propagates naturally. Inside OkHttpEngine, the OkHttp Call is started with suspendCancellableCoroutine { cont -> cont.invokeOnCancellation { call.cancel() } }, so the in-flight call is actually cancelled and its socket released, not just abandoned.

Why does the auth interceptor go first, and logging go last?

Auth must go first because it may need to retry the entire request after a token refresh — it wraps the whole downstream chain. Logging goes last (just before the engine) because it should see the final, fully-decorated request exactly as it will be sent over the wire, including the Authorization header added by auth. If logging were first, it would see a request without credentials and give misleading logs.

How do you add Protobuf support without changing the SDK?

Implement Converter.Factory and return non-null converters for types that have a Protobuf descriptor. Register it via Builder.addConverter(ProtobufFactory()). The registry walks factories in priority order (most recently added first); your factory claims proto types and returns null for everything else, so the JSON factory still handles the rest. Zero SDK changes: this is the Open/Closed principle in action.

When would you choose KSP over dynamic proxy?

Use KSP when: (1) shipping a public SDK where callers may use aggressive R8 rules — generated code survives shrinking without keep rules; (2) targeting Kotlin Multiplatform — iOS and WASM don't have JVM dynamic proxies; (3) you want annotation errors caught at build time rather than crashing on first call in production. Use dynamic proxy when building an internal tool quickly — no KSP processor module to maintain.

How does a 304 Not Modified response flow through the cache interceptor?

The cache interceptor detects a stale entry (max-age exceeded, but an ETag is stored). It sends a conditional request with If-None-Match: <etag>. If the server responds 304, the cache interceptor handles it before the response reaches the call adapter: it fetches the previously stored body from disk, bumps the TTL, and returns the cached HttpResponse. The call adapter never sees a 304; it sees a 200 carrying the revalidated cached body.

Compare your design to Retrofit — what would you keep and what would you change?

Keep: annotation-based interface (zero boilerplate), the Converter.Factory registry (open for extension), and the CallAdapter pattern (any return type without SDK changes).

Change: Remove the Call<T> wrapper; it predates coroutines (it models callback-style execute/enqueue) and adds noise in modern coroutine code. Add a first-class HttpEngine interface so OkHttp is swappable (KMP support). Add built-in Result<T> and unified error types rather than leaving error handling to each caller. Ship MockEngine as a first-class artifact of the SDK, so tests need no extra tooling.

How would you add request metrics (latency, error rate) without changing call sites?

Add a MetricsInterceptor that records System.nanoTime() before and after chain.proceed(), then calls a pluggable MetricsReporter interface. For the URL label, use the original pathTemplate from ServiceMethod stored in HttpRequest.tag — e.g. /users/{id} instead of /users/abc123. This keeps metric cardinality low (one series per endpoint, not one per user ID).
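The low-cardinality labelling can be sketched as a wrapper around the downstream call. `MetricsReporter`'s single `record` method, the `tag` field as a plain string, and the `withMetrics` helper are all assumptions for illustration; in the SDK this logic would sit inside an interceptor around chain.proceed().

```kotlin
// Hypothetical reporter contract: one data point per completed call.
fun interface MetricsReporter {
    fun record(endpoint: String, code: Int, latencyNanos: Long)
}

// `tag` carries the original path template, e.g. "/users/{id}".
data class HttpRequest(val method: String, val url: String, val tag: String? = null)
data class HttpResponse(val code: Int)

// Times the downstream call and reports it under the template, not the concrete URL.
// Calls that throw are not recorded in this sketch.
fun withMetrics(
    request: HttpRequest,
    reporter: MetricsReporter,
    proceed: (HttpRequest) -> HttpResponse
): HttpResponse {
    val label = request.method + " " + (request.tag ?: "unknown")
    val start = System.nanoTime()
    val response = proceed(request)
    reporter.record(label, response.code, System.nanoTime() - start)
    return response
}
```

Two requests for `/users/abc123` and `/users/xyz789` are both reported as `GET /users/{id}`, so the metrics backend sees one series per endpoint.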

Where does certificate pinning live in this architecture?

At the OkHttpEngine layer: it's a TLS concern, not an application-layer one. A CertificatePinner is passed to the OkHttpClient.Builder. Always ship two pins: the active leaf and a backup (the next leaf or the signing CA). With a single pin, rotating the server certificate locks out every installed client until it updates. Ship the backup pin in SDK v1, then once all clients have updated, rotate the server certificate to the backup; all v1+ clients already trust it.