🏗️ System Design Hard 2025–26

Design a Real-Time Chat Application

A mobile-first system design breakdown of a WhatsApp-style chat app — covering offline-first architecture, WebSocket lifecycle management, delivery receipts, and media handling on Android.

⏱ 45 min interview

📖 ~12 min read

🎯 Senior / Staff+

🔥 Asked at Google, Swiggy, Flipkart

Understanding the Problem

Start your interview by defining the functional and non-functional requirements. For a chat application, functional requirements describe what the system does, while non-functional requirements describe the system qualities — things like "messages should be delivered in under 500ms" or "the app should work fully offline."

Prioritise the top 3 functional requirements. Everything else shows your product thinking, but clearly note it as "below the line" so the interviewer knows you won't be including them in your design. Check in to see if the interviewer wants to move anything.

Functional Requirements

Core Requirements

Users should be able to send and receive text messages in real time (1:1 chat)
Users should see delivery receipts — Sent (✓), Delivered (✓✓), Read (✓✓ blue)
Messages should queue and deliver even when the user is temporarily offline

Below the line (out of scope):

Group chat (>2 participants)
End-to-end encryption
Voice / video calling
Message reactions and threads

Non-Functional Requirements

Core Requirements

The system should deliver messages in under 500ms on a stable network connection
The system should work fully offline — messages queue locally and sync on reconnect
The system should be battery-efficient — no aggressive polling or persistent foreground services
The app is read-heavy; local DB is the single source of truth for the UI at all times

Below the line (out of scope):

Sub-50ms latency (a local DB + WebSocket gives us <200ms realistically)
Message search and full-text indexing
Compliance / data retention policies

Here's how it might look on your whiteboard: write "Functional Requirements" and "Non-Functional Requirements" with a horizontal line separating core from out-of-scope. Tell the interviewer explicitly what you're deprioritising and why.

The Set Up

Planning the Approach

On Android, the mobile client owns significantly more responsibility than in a web app. The device must manage its own connection state, persist data locally for offline access, handle background sync, and deliver push notifications when the process is killed. Before drawing architecture boxes, establish this contract clearly.

The core insight that separates strong candidates: the local Room database is the single source of truth. The UI never reads from the network directly. It only observes the DB. The repository layer syncs the DB with the server silently in the background.

Defining the Core Entities

Before designing the architecture, align on the data model. The two central entities are Message and Conversation. The trickiest design decision here is the dual-ID system on Message:

// Two-ID pattern — the most important design decision in this system

@Entity(tableName = "messages")
data class MessageEntity(
    @PrimaryKey val localId: String,       // UUID assigned by device at send time
    val serverId: String?,                  // null until server ACKs; used for ordering
    val conversationId: String,
    val senderId: String,
    val text: String?,
    val mediaUri: String?,                 // local path initially, then CDN URL
    val status: MessageStatus,            // PENDING | SENT | DELIVERED | READ | FAILED
    val createdAt: Long,                   // client timestamp (ms)
    val serverTs: Long?                    // used for canonical ordering after sync
)

@Entity(tableName = "conversations")
data class ConversationEntity(
    @PrimaryKey val id: String,
    val participantIds: String,           // JSON array
    val lastMessageId: String?,
    val unreadCount: Int,
    val updatedAt: Long
)

The localId is generated by the device the instant the user taps Send. This allows the message bubble to render immediately without waiting for any network response. The serverId arrives later via the WebSocket ACK frame and is written back to the same DB row.

🔷 Pattern: Dual-ID / Optimistic UI

The dual-ID system enables optimistic UI — the message appears instantly, before any server confirmation. This is how WhatsApp, iMessage, and Telegram all work. The localId acts as an idempotency key: even if the network request is retried, the server deduplicates by localId and never creates a duplicate message.

High-Level Design

1) The Android Client Architecture

When the app launches, it connects to the chat server via a persistent WebSocket and opens a reactive stream from Room that the UI observes. All writes — both from the local user and from the server — go through Room first, so the UI always renders from a consistent local state.

Android Client Architecture — UI always reads from Room 🔍

Let's walk through exactly what each layer does:

UI Layer: Jetpack Compose screens observe StateFlow from ViewModels. They never call network methods directly — they only call ViewModel functions.
Domain Layer: Use cases encapsulate business logic. SendMessageUseCase assigns a localId, writes to Room, and triggers the outbox worker.
ChatRepository: The central coordinator. It writes incoming WebSocket frames to Room and reads from Room for all UI queries. Room → UI is a reactive Flow.
Room Database: The only source the UI reads from. This makes the UI identical whether online or offline — it always renders from local data.
WebSocketManager: Maintains the persistent connection. Routes incoming frames to the Repository. Handles reconnection with exponential backoff.
OutboxWorker: A WorkManager task constrained to NETWORK_CONNECTED. Drains the local outbox table and sends pending messages. Auto-retries on failure.

2) How a Message is Sent (Offline-First Flow)

When a user taps Send, we want the message to appear in the UI immediately — regardless of network state. Here's the exact sequence:

Message Send Flow — local DB first, network second 🔍

// OutboxWorker — WorkManager drains pending messages
class OutboxWorker(ctx: Context, params: WorkerParameters) :
    CoroutineWorker(ctx, params) {

    override suspend fun doWork(): Result {
        val pending = messageDao.getPendingMessages()
        pending.forEach { msg ->
            try {
                webSocketManager.send(msg.toWsFrame())
                // Don't mark SENT here — wait for server ACK via WebSocket
            } catch (e: Exception) {
                if (runAttemptCount >= 3) {
                    messageDao.updateStatus(msg.localId, MessageStatus.FAILED)
                    return Result.failure()
                }
                return Result.retry()  // WorkManager uses exponential backoff
            }
        }
        return Result.success()
    }
}

3) How the WebSocket Connection is Managed

The critical question isn't just "use WebSockets" — it's how to manage the connection lifecycle on Android, where the OS aggressively kills background processes. Here's the technology trade-off:

Option	Latency	Battery	Complexity	Decision
HTTP Polling (5s interval)	~5s	🔴 High	Low	Rejected
HTTP Long Polling	<1s	Medium	Medium	Rejected
Server-Sent Events (SSE)	<200ms	Low	Low	Acceptable
WebSocket (OkHttp)	<100ms	Low	Medium	✅ Chosen
gRPC Streaming	<100ms	Low	High	Good alternative

The lifecycle rule: connect when the app comes to foreground, disconnect cleanly on background. When the app is backgrounded or killed, FCM push notifications wake it up for new messages. This avoids the battery drain of a persistent foreground service.

🔷 Pattern: Exponential Backoff Reconnect

On connection failure, do not retry immediately — this thundering herd problem can DDoS your own server when millions of clients lose connectivity at once (e.g. server deploy). Instead, use exponential backoff with jitter: delay = min(2ⁿ × 1000ms, 30000ms) + random(0, 1000ms).

4) How Image and Video Sharing Works

The key principle: never block the message send on media upload. The image renders immediately from the local file. Upload happens in the background. This is the same approach used by WhatsApp, Telegram, and Signal.

User picks image: Copy to app-internal storage. Compress to ≤300KB with target 1080px width. Generate a localUri.
Render from local URI: Message bubble shows the image from disk immediately. Status = PENDING.
MediaUploadWorker runs: Chunked multipart upload to S3 via a pre-signed URL. Stores the last uploaded byte so resumable uploads survive network interruptions.
Upload complete: DB row updated with the CDN URL. Coil swaps the image source transparently — no flicker, no reload.

⚠️ Common pitfall: Don't use the original file from MediaStore for upload — the URI may become invalid if the user deletes the photo mid-upload. Always copy to your app's own internal storage first, then upload from there.

Low-Level Design

The HLD shows what components exist. The LLD shows exactly how they talk to each other — method calls, data transformations, and state changes at every step. Below are three precise flows you should be able to draw and explain in an interview.

Whiteboard Overview

Start here. Draw this on the whiteboard in the first 5 minutes to anchor the entire conversation. Every other flow is a zoom-in of one arrow on this diagram.

Full System Whiteboard — Sender → Server → Receiver 🔍

Flow 1 — Sending a Message (Detailed)

This is the most important flow to master. Every method call, every state change, in exact order.

ViewModel

UseCase

Repository

Room DB

WorkManager

WebSocket

Server

→sendMessage(text)

User taps Send. ViewModel receives the call.

→execute(SendMessageParams)

ViewModel delegates to SendMessageUseCase. Assigns localId = UUID.randomUUID(), status = PENDING.

→saveMessage(MessageEntity)

UseCase calls Repository with a fully formed entity. No network involved yet.

→messageDao.insert(entity)

Room writes the row. The UI's Flow<List<Message>> fires immediately — bubble appears on screen with a clock icon (PENDING).

→WorkManager.enqueue(OutboxWorkRequest)

Constraint: NETWORK_CONNECTED. If offline, WorkManager queues it and waits. If online, it runs immediately.

→webSocket.send(WsFrame.json)

OutboxWorker fetches all PENDING rows, serialises each to a JSON frame, and calls webSocket.send(). Does not mark as SENT yet.

→WsFrame{ localId, text, conversationId }

Frame hits the server's WebSocket Gateway.

←ACK{ localId, serverId, serverTs }

Server persists the message, generates a serverId, and sends back an ACK frame on the same WebSocket connection.

←incomingFrames.emit(ack)

WebSocketManager emits the frame onto its SharedFlow. Repository collects it.

→messageDao.updateAck(localId, serverId, SENT)

Room row updated: serverId set, status → SENT. UI's Flow fires again — clock icon becomes ✓ (single tick).

Flow 2 — Receiving a Message

The receive path is simpler because all the heavy lifting (outbox, retry) is on the sender side. The receiver's job is: decode the frame → write to Room → let the UI react.

Server

WebSocket

Repository

Room DB

ViewModel

→onMessage(WsFrame.MSG)

Server pushes an incoming message frame. WebSocketListener.onMessage() fires on OkHttp's internal thread.

→incomingFrames.emit(frame)

Decoded and emitted on Dispatchers.IO. Repository is already collecting this SharedFlow in a coroutineScope.

→messageDao.insertOrIgnore(entity)

insertOrIgnore is idempotent — if the same message arrives twice (reconnect scenario), no duplicate is created.

→Flow<List<Message>> emits

Room detects the new row and emits on the reactive Flow. No polling. No manual notify.

→uiState updates → recomposition

ViewModel maps the list to UI models. Compose recomposes only the new message bubble. No full list redraw.

←DELIVERED_ACK{ messageId, recipientId }

Repository sends a DELIVERED receipt back to the server after writing to DB. Sender's app receives this and updates their row to DELIVERED (✓✓).

Flow 3 — Delivery Receipt State Machine

Each message row follows a strict one-way state machine. The receipt flows are the most commonly asked follow-up in interviews — draw this clearly.

MessageStatus State Machine — strictly one-directional, never goes backwards 🔍

// Repository collecting WebSocket frames and updating status

class ChatRepository @Inject constructor(
    private val dao: MessageDao,
    private val wsManager: WebSocketManager,
    private val scope: CoroutineScope
) {
    init {
        scope.launch(Dispatchers.IO) {
            wsManager.incomingFrames.collect { frame ->
                when (frame.type) {
                    WsFrameType.MESSAGE  -> handleIncoming(frame)
                    WsFrameType.ACK       -> dao.updateAck(frame.localId, frame.serverId, MessageStatus.SENT)
                    WsFrameType.DELIVERED -> dao.updateStatus(frame.serverId, MessageStatus.DELIVERED)
                    WsFrameType.READ      -> dao.updateStatus(frame.serverId, MessageStatus.READ)
                }
            }
        }
    }

    private suspend fun handleIncoming(frame: WsFrame) {
        dao.insertOrIgnore(frame.toEntity())
        // Send DELIVERED receipt back immediately after DB write
        wsManager.send(WsFrame.deliveredAck(frame.serverId))
    }

    // UI observes this — reactive, no manual refresh needed
    fun messages(conversationId: String): Flow<List<MessageEntity>> =
        dao.getMessages(conversationId)  // Room returns Flow automatically
}

🔷 Pattern: Idempotent Writes with insertOrIgnore

On reconnect, the server may re-deliver messages the client already received. Using INSERT OR IGNORE (Room's OnConflictStrategy.IGNORE) means duplicate frames are silently dropped. The UI never shows a duplicate bubble — no extra deduplication logic needed anywhere else.

Potential Deep Dives

1) How do we paginate old messages?

Use Paging 3 with RemoteMediator. Recent messages load from Room instantly. When the user scrolls up past the local cache boundary, RemoteMediator fetches older pages from the REST API, writes them to Room, and Paging 3 re-emits. The UI always reads from Room — the pagination source switch is invisible to the user.

2) How do delivery receipts work at scale?

Don't send a receipt event per message — this floods the WebSocket. Instead, batch them: when the user opens a chat screen, send a single READ_ACK frame with the newest messageId they've seen. The server marks all messages up to that ID as read. This reduces receipt traffic by ~90%.

3) How does push notification work when the app is killed?

Firebase Cloud Messaging (FCM) delivers a data-only push (not a notification push) to the device. Android wakes the app's FirebaseMessagingService, which connects the WebSocket, fetches missed messages via REST, writes them to Room, and shows a local notification. This is battery-safe because FCM uses the system-level push channel — no background process required.

4) How would you add end-to-end encryption?

Use the Signal Protocol (also used by WhatsApp). Each device generates a key pair on first launch. The public key is uploaded to the server. On first message, the sender fetches the recipient's public key and performs an X3DH key exchange to derive a shared secret. From that point, every message is encrypted on-device using the Double Ratchet algorithm. The server never sees plaintext.

🔷 Pattern: Dealing with Key Distribution

E2E encryption introduces a key distribution problem: what if a user installs the app on a new device? They need to re-establish keys with every contact. WhatsApp handles this by requiring the user to re-verify contacts' safety numbers. This is a UX trade-off you should surface in the interview.

5) How would you share logic with an iOS client?

Use Kotlin Multiplatform (KMM). The Repository layer, use cases, data models, and Room queries (via SQLDelight on iOS) can be shared. Only the UI layer (Compose on Android, SwiftUI on iOS) and platform-specific code (WorkManager, FCM) stay separate. This is a strong signal at Staff+ level interviews.

What is Expected at Each Level

Your answer to this question will be evaluated differently depending on the role you're interviewing for. Here's what each level needs to demonstrate:

Mid-level

Clean MVVM architecture with Repository pattern
Room as local DB, Retrofit for REST
Basic offline support with WorkManager
FCM for push notifications
Can articulate delivery receipt states

Senior

Dual-ID / outbox pattern explained clearly
WebSocket lifecycle tied to app foreground state
Chunked resumable media upload
Paging 3 + RemoteMediator for history
Batched read receipts to reduce traffic

Staff+

E2E encryption trade-offs (Signal Protocol)
KMM for cross-platform code sharing
Performance monitoring + regression alerting
Group chat fan-out strategies
Multi-device session management

20 Must-Know Interview Questions

These are the most frequently asked follow-up questions in real Chat App system design interviews at Google, Swiggy, Flipkart, and CRED. Each one is a potential 10-minute rabbit hole — know them cold.

Q1 Easy Why is the Room database the single source of truth? Why not read from the network directly in the ViewModel?

▾

Reading from the network directly in the ViewModel breaks offline support, introduces loading states in the UI, and makes the UI depend on network availability. Room acts as a local cache that the UI always reads from — the same query works whether you're online or offline. The Repository syncs Room with the server silently in the background. This is the repository pattern and the foundation of offline-first architecture. The UI is always fast because local DB reads are microseconds, not hundreds of milliseconds.

Q2 Easy What is the dual-ID system (localId + serverId)? Why not just use the server-generated ID?

▾

If you wait for the server to generate an ID before inserting into Room, the user sees a loading spinner after tapping Send — which feels sluggish. The localId (a UUID generated on-device) lets you insert into Room immediately, render the bubble, and send to the server in the background. When the server ACK arrives, you write back the serverId to the same row. The localId also acts as an idempotency key — if the network request is retried, the server ignores duplicate frames with the same localId.

Q3 Medium Why WorkManager for the outbox instead of just calling the API directly from the ViewModel?

▾

If you call the API directly from the ViewModel and the user closes the app mid-send, the coroutine is cancelled and the message is lost. WorkManager survives process death — it stores the work request in its own SQLite DB and re-executes it when the app restarts or connectivity returns. It also handles NETWORK_CONNECTED constraints automatically, so you don't need to manage connectivity callbacks yourself. For anything that must complete eventually regardless of app lifecycle, WorkManager is the right tool.

Q4 Medium How do you handle WebSocket reconnection without draining the battery?

▾

Use exponential backoff with jitter: delay = min(2ⁿ × 1000ms, 30000ms) + random(0–1000ms). The jitter prevents thousands of clients from reconnecting simultaneously after a server restart (thundering herd). Tie the connection to the app's foreground state using ProcessLifecycleOwner — connect on ON_START, disconnect on ON_STOP. When backgrounded, rely on FCM push to wake the app instead of keeping a persistent connection. Also register a ConnectivityManager.NetworkCallback to reconnect immediately when network becomes available, rather than waiting for the next backoff interval.

Q5 Medium What happens to messages when the user is offline for 3 days and then comes back online?

▾

On reconnect: (1) The WebSocket connects and the server pushes any missed messages that arrived while the client was offline. (2) For a 3-day gap, the server may have too many messages to push via WebSocket — instead, the client calls a REST GET /messages?since={lastKnownServerTs} endpoint to fetch the backlog and writes them all to Room. (3) Any outbox messages (PENDING rows in Room) are immediately picked up by WorkManager and sent. The key is storing lastKnownServerTs persistently in DataStore so the sync-from point survives app restarts.

Q6 Medium How do delivery receipts work at scale? What if you send a receipt per message?

▾

Sending a receipt per message is fine for low-traffic chats but doesn't scale. If a conversation has 100 unread messages, opening it would trigger 100 READ receipt frames simultaneously. Instead, batch them: send a single READ_ACK{ conversationId, upToMessageId } frame when the chat screen becomes visible. The server marks all messages up to that ID as read. This reduces receipt traffic by ~90%. For DELIVERED receipts, send one per incoming message immediately after writing to Room — this is unavoidable since delivery is per-message.

Q7 Medium How do you prevent duplicate messages from appearing in the UI?

▾

Use INSERT OR IGNORE (Room's OnConflictStrategy.IGNORE) when inserting incoming messages. The localId is the primary key — if the same frame arrives twice (reconnect scenario, server retry), the second insert is silently dropped. On the sender side, the outbox pattern ensures the message is in Room before any network call, so the bubble is never duplicated regardless of how many times WorkManager retries. The key insight: make every write idempotent at the DB level rather than trying to deduplicate at the UI level.

Q8 Hard How would you design group chat? What changes in the architecture?

▾

Group chat introduces fan-out: one message must be delivered to N recipients. Two strategies: (1) Fan-out on write (server-side) — when the server receives a message, it immediately pushes to all online group members' WebSocket connections and queues FCM for offline ones. Simple for the client, scales to ~100 members. (2) Fan-out on read (client-side) — the server stores one copy and clients pull. Simpler server, but more complex client sync logic. For the client, group messages add a groupId field and read receipts become per-member (you need to track who has read, not just whether the message was read).

Q9 Hard How do you handle media (image/video) sending without blocking the message flow?

▾

Never block the message send on media upload. The flow: (1) Copy image to app-internal storage, compress to ≤300KB. (2) Insert a message row with mediaUri = localPath, status = PENDING — bubble renders immediately from local file. (3) MediaUploadWorker uploads to S3 via a pre-signed URL in the background using chunked multipart upload. Store the last uploaded byte offset in DataStore so uploads resume after interruption. (4) On success, update the row with the CDN URL. Coil swaps the image source transparently. The receiver downloads from CDN on first open and caches to disk.

Q10 Hard How does push notification work when the app is completely killed?

▾

When the app is killed, the WebSocket is gone. The server detects the disconnect and switches to FCM. Send a data-only push (not a notification push) — this wakes the app's FirebaseMessagingService even when the app is killed. In onMessageReceived(): (1) Connect the WebSocket briefly. (2) Fetch missed messages via REST GET /messages?since=lastTs. (3) Write to Room. (4) Show a local notification using NotificationManager. Use data push not notification push so you control the notification content (unread count, sender name) rather than FCM controlling it. Handle notification grouping for multiple conversations using NotificationCompat.InboxStyle.

Q11 Medium How do you paginate message history? The user scrolls up and there are 10,000 old messages.

▾

Use Paging 3 with RemoteMediator. The PagingSource reads from Room (fast, local). When the user scrolls past the oldest locally cached message, RemoteMediator.load() fires and fetches the next page from REST GET /messages?before={oldestLocalServerId}&limit=50. Write the fetched page to Room. Paging 3 automatically emits the updated list — the UI scrolls smoothly with no manual handling. Use cursor-based pagination (by serverId or serverTs), never offset-based — offset pagination breaks when new messages are inserted during scrolling.

Q12 Medium How do you handle message ordering? What if client clock is wrong?

▾

Never use the client timestamp (createdAt) as the canonical ordering key. Client clocks can be wrong by minutes or days. Use serverTs — the timestamp assigned by the server when it persists the message — for ordering. For display, show the client timestamp (so "just now" is accurate), but sort by serverTs. In Room: ORDER BY serverTs ASC, localId ASC — the localId as a tiebreaker handles the brief window where serverTs is null (PENDING messages). PENDING messages always appear at the bottom since their serverTs is null.

Q13 Hard How would you add end-to-end encryption? What changes in your architecture?

▾

Use the Signal Protocol (X3DH + Double Ratchet). Key changes: (1) On first launch, generate an identity key pair and a set of one-time pre-keys. Upload public keys to the server's key distribution service. (2) Before the first message to a user, fetch their public keys and run X3DH to derive a shared session key. (3) Encrypt every message on-device before sending. The server only ever sees ciphertext. (4) Room stores encrypted blobs — you decrypt on read, in the ViewModel before mapping to UI models. Major trade-off: key management becomes complex. New device onboarding, key rotation, and message backup all require careful design. The server cannot moderate content.

Q14 Medium How do you implement typing indicators (User is typing...) efficiently?

▾

Typing indicators are ephemeral — never persist them to Room. The flow: (1) When the user types, send a TYPING{ conversationId } WebSocket frame. (2) Debounce the send — only fire after 500ms of inactivity stops (don't send on every keystroke). (3) The server forwards the event to the recipient's WebSocket. (4) On receipt, show the indicator and start a 5-second timer. If another TYPING frame arrives, reset the timer. If it expires, hide the indicator. (5) Send a TYPING_STOP frame when the user clears the input or sends the message. Store the typing state in a simple MutableStateFlow<Boolean> in the ViewModel — no DB involved.

Q15 Hard How do you support multiple devices for the same account (phone + tablet)?

▾

Each device maintains its own WebSocket connection identified by a deviceId. The server maintains a mapping of userId → [deviceId1, deviceId2, ...]. When a message is sent, the server fans out to all active WebSocket connections for that user. For offline devices, FCM is registered per-device so each gets its own push token. The client-side Room DB is per-device — messages sync independently on each device using lastSyncedTs. The trickiest part: read receipts. If a user reads a message on their tablet, the phone should also mark it as read. The server broadcasts a READ_SYNC event to all other devices of the same user.

Q16 Medium How do you keep the conversation list updated in real time (unread count, last message preview)?

▾

The conversation list is a Room query with a reactive Flow. Use a Room @Query with a JOIN between conversations and messages tables:
SELECT c.*, m.text as lastMessageText FROM conversations c LEFT JOIN messages m ON m.localId = c.lastMessageId ORDER BY c.updatedAt DESC. Every time a new message is inserted, Room emits on this Flow automatically — the conversation list reorders and shows the new preview without any manual refresh. Unread count is a column on the conversations table, incremented on incoming message insert and reset to 0 when the chat screen opens.

Q17 Hard How would you test this architecture? What are the key test layers?

▾

Three layers: (1) Unit tests — Test SendMessageUseCase with a fake Repository. Test ChatViewModel using runTest + Turbine to assert StateFlow emissions. Test ChatRepository with a fake DAO and fake WebSocketManager. (2) Integration tests — Test Room DAO with an in-memory Room database (Room.inMemoryDatabaseBuilder) using runTest. Test the outbox flow with a TestListenableWorkerBuilder for WorkManager. (3) UI tests — Use MockWebServer (OkHttp) to simulate WebSocket frames and assert Compose UI reactions. Use Hilt's @UninstallModules + @TestInstallIn to replace real dependencies with fakes.

Q18 Medium How do you handle message deletion? Both "delete for me" and "delete for everyone".

▾

Delete for me: Soft-delete in the local Room DB — add a deletedForMe: Boolean flag. The Room query filters these out. Never actually delete the row (it may be needed for receipt sync). Delete for everyone: Send a DELETE{ serverId } WebSocket frame. The server marks the message as deleted in its DB and fans out a DELETE event to all recipients' devices. On receipt, the client sets deletedForEveryone = true in Room. The UI shows "This message was deleted" in place of the content. The content itself can be nulled out in Room. Hard limit: WhatsApp allows delete-for-everyone only within 60 hours — enforce this on the server.

Q19 Hard How do you design the system to handle 1 million concurrent users on the Android client side?

▾

The Android client itself doesn't change at scale — it still manages one WebSocket and one Room DB. The client-side concerns at scale are: (1) Reconnect storms — exponential backoff with jitter prevents all 1M clients hitting the server simultaneously after an outage. (2) Battery efficiency — the foreground/background WebSocket lifecycle pattern keeps battery impact minimal. (3) DB growth — implement a message retention policy: delete messages older than 30 days from Room (keep on server). Use a periodic WorkManager task for DB housekeeping. (4) Memory — Paging 3 ensures only the visible window of messages is in memory, not all 10,000.

Q20 Hard If you had to share business logic between Android and iOS, how would you approach it?

▾

Use Kotlin Multiplatform (KMM). The shareable layer includes: Repository, Use Cases, data models, and the outbox pattern logic. Room is replaced by SQLDelight (which generates type-safe Kotlin code from SQL for both Android and iOS). kotlinx.coroutines works on both platforms. The WebSocket client can use Ktor's WebSocket client (multiplatform). What stays platform-specific: UI (Compose on Android, SwiftUI on iOS), WorkManager (use BackgroundTasks on iOS), and FCM (use APNs on iOS). This approach typically reduces business logic duplication by 60–70%, but adds build complexity and requires the team to be comfortable with KMM's maturity limitations.