โš™๏ธ Core Concept ยท ~30 min read ยท Intermediate โ†’ Advanced

Coroutines & Flow

How Kotlin conquered Android concurrency โ€” from suspending functions and structured lifecycles to cold streams, hot state, and back-pressure.

The problem with threads

Before coroutines, Android concurrency was a battlefield. You had AsyncTask โ€” a class so error-prone it was deprecated in API 30 with no fanfare, just a quiet acknowledgment that it had caused enough damage. You had raw Threads, which you could create freely but couldn't cancel, coordinate, or automatically clean up. You had callbacks nested inside callbacks nested inside callbacks โ€” what the community affectionately called "callback hell." And every time you wanted to do something after a network call, you'd write code like this:

The old way โ€” callback hell
fetchUser(userId) { user -> fetchOrders(user.id) { orders -> fetchProducts(orders.first().id) { product -> updateUI(user, orders, product) // 3 levels deep โ€” and this is the happy path } } }

This code is hard to read, nearly impossible to handle errors in cleanly, and leaks memory if the user navigates away before all three calls complete. The callback for fetchProducts holds a reference to the enclosing Activity โ€” which might have been destroyed by the time it fires.

Threads themselves are expensive. Creating a thread allocates roughly 512KBโ€“1MB of stack memory. Android devices, especially mid-range ones, can't sustain hundreds of concurrent threads. And once you create a thread, you own it โ€” if you forget to stop it, it keeps running long after the work is done, burning CPU and battery.

Kotlin coroutines solve all of this. But to understand how, you first need to understand what a coroutine actually is.


What is a coroutine?

A coroutine is a unit of work that can be suspended and resumed without blocking a thread. When it hits a waiting operation โ€” a network call, a disk read, a timer โ€” it suspends, releases the thread it was running on, and lets that thread pick up other work. When the operation completes, the coroutine resumes โ€” picking up exactly where it left off, on the same or a different thread.

The important distinction is between blocking and suspending. When a thread blocks, it stops doing anything โ€” it just sits there consuming OS resources while waiting. When a coroutine suspends, the thread is freed entirely. Ten thousand coroutines can be simultaneously waiting on network responses while sharing just a handful of threads, because none of them are blocking while they wait.

This is what the suspend keyword means in Kotlin. Marking a function suspend tells the compiler: "this function may pause execution at some point." The compiler then transforms it into a state machine that can be paused and resumed. To the developer writing it, it reads like normal, sequential, synchronous code โ€” because it is. There are no callbacks, no .then() chains, no thread management. The concurrency is handled for you.

Three async calls โ€” reads like sequential code, blocks no thread
suspend fun loadDashboard(): Dashboard { val user = fetchUser() // suspends โ€” thread freed during network call val orders = fetchOrders() // resumes, then suspends again val products = fetchProducts() // resumes, then suspends again return Dashboard(user, orders, products) } // Error handling works like normal code โ€” no .catch() chaining try { val dashboard = loadDashboard() } catch (e: IOException) { showError(e.message) }

Compare this to the callback version at the top. Same logic, same three calls โ€” but here you can read it top to bottom, handle errors in one place, and return a value naturally. A suspend function behaves exactly like a regular function from the caller's perspective. The only difference is that it can pause โ€” and when it does, it does so without wasting a thread.


How coroutines work under the hood

Writing suspend fun feels magical โ€” the code looks synchronous but somehow doesn't block a thread. Most articles stop at the analogy. This section goes deeper: what the Kotlin compiler actually generates, what a Continuation is, and the exact moment a thread gets freed.

The Continuation โ€” the secret extra parameter

Every suspend function you write gets silently transformed by the Kotlin compiler. The transformation is called Continuation-Passing Style (CPS). The rule is simple: every suspend fun gets an extra parameter added at the end โ€” a Continuation.

What you write vs what the compiler generates
// What YOU write suspend fun fetchUser(id: String): User // What the COMPILER generates (simplified) fun fetchUser(id: String, continuation: Continuation<User>): Any

What is a Continuation? It's an interface with one method:

The Continuation interface โ€” from kotlinx.coroutines source
interface Continuation<in T> { val context: CoroutineContext // which dispatcher, job, etc. fun resumeWith(result: Result<T>) // called when the suspended work completes }

Think of a Continuation as a packaged-up "what to do next." When a coroutine suspends, it takes everything it needs to resume later โ€” its local variables, which line to return to โ€” and bundles it into a heap-allocated object. The call stack is replaced by a chain of Continuation objects on the heap. That's how a coroutine can be suspended without occupying a thread โ€” there's no stack frame holding memory. The coroutine's state costs a few hundred bytes on the heap, not 512KBโ€“2MB of thread stack.

๐Ÿ’ก The analogy

Imagine you're reading a book and get interrupted. You don't freeze in place holding the book open. You fold a bookmark into the page, put the book on the shelf, and go do something else. When you come back, you pick up the book, open to the bookmark, and continue reading exactly where you left off.

The Continuation is the bookmark. "Which page" (current state label), "what I was holding" (local variables saved as fields), and "what to do when I'm done" (the caller's continuation) are all encoded in it. The thread is the reader โ€” free to do other things while the bookmark waits on the shelf.

The state machine the compiler generates

CPS alone isn't the whole story. A function with multiple suspension points needs to remember which suspension point it's at when it resumes. The compiler transforms the function body into a state machine โ€” a when block with a numbered label that tracks progress. Each suspension point gets its own state number. Let's trace through a real example:

Your suspend function with two suspension points
suspend fun loadDashboard(): Dashboard { val user = fetchUser() // suspension point 1 val orders = fetchOrders() // suspension point 2 return Dashboard(user, orders) }
What the compiler generates โ€” decompiled to readable Kotlin/Java
// The compiler creates an anonymous class that IS the state machine class LoadDashboardStateMachine( val completion: Continuation<Dashboard> // the caller's continuation โ€” called when we're done ) : Continuation<Any> { var label = 0 // STATE: 0=start, 1=after fetchUser, 2=after fetchOrders var user: User? = null // local variables moved from stack โ†’ heap fields var orders: List<Order>? = null override val context = completion.context override fun resumeWith(result: Result<Any>) { when (label) { 0 -> { // โ”€โ”€ STATE 0: First call โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€ label = 1 // advance state BEFORE suspending val res = fetchUser(this) // pass ourselves as the Continuation if (res == COROUTINE_SUSPENDED) // fetchUser went async return // โ† THREAD IS FREED RIGHT HERE user = res as User // fetchUser returned synchronously (e.g. cache) // fall through to label 1 logic immediately } 1 -> { // โ”€โ”€ STATE 1: fetchUser completed, start fetchOrders โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€ user = result.getOrThrow() as User // unwrap the result from the resume call label = 2 val res = fetchOrders(this) if (res == COROUTINE_SUSPENDED) return // โ† THREAD IS FREED AGAIN orders = res as List<Order> } 2 -> { // โ”€โ”€ STATE 2: fetchOrders completed โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€ orders = result.getOrThrow() as List<Order> } } // All states done โ€” resume the CALLER with the final result completion.resumeWith(Result.success(Dashboard(user!!, orders!!))) } }

The critical lines are the two if (res == COROUTINE_SUSPENDED) return exits. That return unwinds the entire call stack โ€” every frame between this state machine and the thread's event loop returns. The thread is now free. The state machine object, with label = 1 and user still null, sits quietly on the heap consuming a few hundred bytes.

When fetchUser eventually completes on a background thread, it calls this.resumeWith(Result.success(user)). The Dispatcher intercepts this, wraps it in a Runnable, and posts it to the appropriate thread. That thread calls resumeWith() again โ€” this time label == 1, so the state machine picks up exactly where it left off.

suspendCancellableCoroutine โ€” the lowest level primitive

suspendCancellableCoroutine is the function that actually performs a suspension. Everything else โ€” delay(), Room queries, Retrofit calls โ€” ultimately calls this. It gives you the raw Continuation and lets you store it somewhere. When the async work completes, you call resume() on it.

How delay() actually suspends โ€” from the coroutines source
suspend fun delay(timeMillis: Long) { if (timeMillis <= 0) return // fast path โ€” no suspension return suspendCancellableCoroutine { continuation -> // Schedule a callback on the event loop AFTER timeMillis ms val handle = DefaultDelay.scheduleAfter(timeMillis) { continuation.resume(Unit) // โ† wakes the coroutine when the timer fires } // If the coroutine is cancelled while waiting, cancel the timer too continuation.invokeOnCancellation { handle.cancel() } // suspendCancellableCoroutine returns COROUTINE_SUSPENDED to its caller // โ† the thread is freed at this exact point } }

This is the pattern for every async integration. Firebase, Bluetooth, Camera, Play Services โ€” any callback-based API can be made suspendable with this same skeleton:

Wrapping any callback API into a suspend function
// Legacy callback API from a third-party SDK fun getLastLocation(callback: (Location?) -> Unit) // Wrapped as a suspend function โ€” the full pattern suspend fun FusedLocationProviderClient.awaitLastLocation(): Location? = suspendCancellableCoroutine { continuation -> getLastLocation() .addOnSuccessListener { location -> continuation.resume(location) // success โ†’ resume with value } .addOnFailureListener { exception -> continuation.resumeWithException(exception) // failure โ†’ resume with exception } continuation.invokeOnCancellation { // Called if the coroutine is cancelled while waiting for location // Clean up here: cancel the pending task, remove listeners, etc. } } // Usage โ€” reads exactly like synchronous code val location = fusedLocationClient.awaitLastLocation() updateMap(location)

How dispatchers schedule the resumed work

When continuation.resume(value) is called โ€” say, by the timer thread after a delay() โ€” the coroutine doesn't jump back to executing immediately. The coroutine's CoroutineContext contains a Dispatcher, and the dispatcher decides when and where to run the resumed work. For Dispatchers.Main on Android, it posts a Runnable to the main thread's Looper. For Dispatchers.IO, it submits the work to a shared thread pool.

What happens step by step when a coroutine resumes
// Step-by-step: what happens after delay(1000) completes // 1. Timer thread fires after 1000ms, calls continuation.resume(Unit) // 2. The coroutine's dispatcher intercepts this call: dispatcher.dispatch(context, Runnable { continuation.resumeWith(Result.success(Unit)) }) // 3. For Dispatchers.Main โ€” posts to main thread Looper: Handler(Looper.getMainLooper()).post(block) // 4. For Dispatchers.IO โ€” submits to thread pool: threadPool.submit(block) // 5. The target thread picks up the Runnable, executes resumeWith() // 6. The state machine advances to the next label, runs until the next suspension // 7. Either suspends again (returns COROUTINE_SUSPENDED) or finishes completely

Why 100,000 coroutines is fine but 100,000 threads crashes everything

Now the numbers make visceral sense. A suspended coroutine = one state machine object on the heap = ~200โ€“400 bytes. A thread = 512KBโ€“2MB of OS-allocated stack memory + kernel thread object. You can have a million suspended coroutines on a device that can sustain maybe 500 threads before the OS starts struggling.

100,000 coroutines โ€” runs in ~1 second, uses ~40MB heap
fun main() = runBlocking { val jobs = (1..100_000).map { launch { delay(1000) // each coroutine suspends for 1 second // while suspended: ~200 bytes on heap, 0 bytes of thread stack } } jobs.joinAll() println("All 100,000 coroutines completed") } // Finishes in ~1 second. The 100,000 delay() calls are backed by a single // event-loop timer queue โ€” not 100,000 threads. // โŒ Equivalent with threads โ€” DO NOT RUN: // repeat(100_000) { Thread { Thread.sleep(1000) }.start() } // This would try to allocate ~50GB of thread stacks โ†’ OutOfMemoryError in seconds
โœ… The mental model to keep

A running coroutine occupies a thread. A suspended coroutine occupies only a heap object. The Dispatcher is just a queue of "continuations waiting to run." When your IO thread pool is full, new coroutines don't create new threads โ€” their continuations queue up and run as threads become available. This is why coroutines scale โ€” the number of actual OS threads stays small and bounded, while the number of logical concurrent tasks can be enormous.


Starting a coroutine: the three builders

You can't call a suspend function from regular code โ€” you need to be inside a coroutine to call one. This is by design. Kotlin's coroutine system enforces that someone is responsible for the coroutine's lifecycle. That someone is always a CoroutineScope, and the bridge between regular code and the coroutine world is a coroutine builder.

launch โ€” fire and forget

launch starts a new coroutine and returns a Job. You use it when you want to kick off some work and you don't need a result back. The calling code continues immediately โ€” launch doesn't wait for the coroutine to finish.

launch โ€” for side effects
val job = scope.launch { saveToDatabase(user) // runs in the background sendAnalyticsEvent("user_saved") } job.cancel() // you can cancel it any time job.join() // or wait for it to finish

async โ€” when you need a result

async starts a coroutine and returns a Deferred<T> โ€” think of it as a promise of a future value. You call await() to get the result, which suspends the calling coroutine until the result is ready. The real power of async is running things in parallel.

async + await โ€” parallel execution
// Sequential: takes userTime + ordersTime = ~600ms total val user = fetchUser(id) // 300ms val orders = fetchOrders(id) // 300ms // Parallel with async: takes max(userTime, ordersTime) = ~300ms total val userDeferred = scope.async { fetchUser(id) } val ordersDeferred = scope.async { fetchOrders(id) } val user = userDeferred.await() // suspends until ready val orders = ordersDeferred.await() // likely already done by now

runBlocking โ€” the testing bridge

runBlocking starts a coroutine and blocks the current thread until it completes. It's the escape hatch that lets regular code (like a unit test's main function or a test body) enter the coroutine world. In production code, you should almost never use it โ€” if you're blocking a thread, you've defeated the purpose of coroutines.

โš ๏ธ Common mistake

Calling runBlocking on the main thread in production code will freeze your UI. It's for tests and main functions only. In a ViewModel or repository, always use launch or async within an appropriate scope.


Dispatchers โ€” where your coroutine runs

When you launch a coroutine, you need to tell Kotlin which thread or thread pool it should run on. This is what Dispatchers are for. Think of a Dispatcher as a job board โ€” it decides which worker (thread) picks up a piece of work.

DispatcherThread PoolUse ForNever Use For
Main Single main thread UI updates, ViewModel logic, anything touching Views Network calls, disk I/O, heavy computation
IO Up to 64 threads (or CPU count, whichever is higher) Network requests, file reads/writes, Room DB queries CPU-heavy work (use Default instead)
Default CPU core count threads JSON parsing, image decoding, list sorting, heavy computation Blocking I/O (wastes CPU threads waiting)
Unconfined Inherits from caller, then resumes wherever Testing, very specific library use-cases Production application code

A crucial point: Room, Retrofit (with suspend functions), and Kotlin's own IO functions are already cooperative โ€” they suspend the coroutine without occupying a thread. When Room executes a query, it uses Dispatchers.IO internally and suspends your coroutine while the disk read is in progress. You don't need to manually switch dispatchers for suspend-aware libraries.

withContext โ€” switching dispatchers within a coroutine
// Start in IO for the network call suspend fun loadAndParseImage(url: String): Bitmap = withContext(Dispatchers.IO) { val bytes = downloadBytes(url) // blocking download on IO thread withContext(Dispatchers.Default) { BitmapFactory.decodeByteArray(bytes, 0, bytes.size) // CPU-heavy decode on Default } } // Back in the ViewModel โ€” switch to Main to update UI fun loadImage(url: String) = viewModelScope.launch { val bitmap = loadAndParseImage(url) // off main thread _uiState.value = UiState.Loaded(bitmap) // back on Main automatically (viewModelScope) }

Structured Concurrency โ€” the rule that prevents leaks

This is the concept that separates developers who use coroutines from those who understand them. Structured concurrency is a simple rule with profound consequences:

Every coroutine must exist within a scope. When the scope is cancelled, all coroutines within it are cancelled automatically.

๐Ÿ’ก The analogy

Imagine a project manager who starts sub-tasks. If the project manager quits (scope is cancelled), every sub-task they started is automatically cancelled โ€” nobody keeps working on a cancelled project. There are no orphaned tasks running in the background burning resources with no one checking on them.

Without structured concurrency, you'd have to manually track every coroutine you ever started and remember to cancel them all. You'd always forget one. That forgotten coroutine would silently leak memory โ€” holding a reference to your Activity or Fragment long after it was destroyed.

The three scopes you'll use every day

viewModelScope is tied to a ViewModel's lifecycle. When the ViewModel is cleared (which happens when the screen it belongs to is permanently gone), all coroutines in viewModelScope are automatically cancelled. This is the scope you'll use for 90% of coroutines in a well-architected app.

lifecycleScope is tied to a Fragment or Activity's lifecycle. It's useful for UI-layer work โ€” observing Flows, starting one-shot operations that should stop if the user navigates away. Be careful: it's also cancelled on rotation, so don't use it for work you want to survive configuration changes (use viewModelScope for that instead).

applicationScope / custom CoroutineScope is for work that should outlive any single screen โ€” uploading a photo, syncing in the background, an analytics flush. You create this manually and tie it to your Application class or a singleton service.

Scope lifetimes
class UserViewModel : ViewModel() { fun saveUser(user: User) { viewModelScope.launch { // auto-cancelled when ViewModel is cleared repository.save(user) } } } class UserFragment : Fragment() { override fun onViewCreated(...) { lifecycleScope.launch { // auto-cancelled when Fragment view is destroyed viewModel.uiState.collect { render(it) } } } } class MyApplication : Application() { val appScope = CoroutineScope(SupervisorJob() + Dispatchers.Default) // Lives as long as the Application process โ€” for background work }

Job vs SupervisorJob โ€” failure propagation

Every coroutine is associated with a Job. Jobs form a tree โ€” a parent job and its children. How a failure in one child affects the tree depends on whether you're using a regular Job or a SupervisorJob.

๐Ÿ’ก The analogy

With a regular Job, think of a team of mountaineers roped together. If one falls, they all fall. One coroutine throws an uncaught exception โ†’ the whole scope is cancelled โ†’ all sibling coroutines are cancelled too.

With a SupervisorJob, each mountaineer has their own safety harness. If one falls, the others continue climbing. One coroutine fails โ†’ only that coroutine is affected โ†’ siblings keep running.

Job vs SupervisorJob failure propagation
// Regular Job โ€” one failure cancels all siblings val scope = CoroutineScope(Job()) scope.launch { throw Exception("I failed") } // cancels the entire scope scope.launch { doOtherWork() } // โ† also cancelled! Never runs. // SupervisorJob โ€” failures are isolated val scope = CoroutineScope(SupervisorJob()) scope.launch { throw Exception("I failed") } // only THIS coroutine fails scope.launch { doOtherWork() } // โ† still runs normally // viewModelScope uses SupervisorJob internally โ€” that's why ViewModel coroutines // don't cancel each other on failure

This is why viewModelScope uses a SupervisorJob under the hood. If you have a ViewModel that loads a user profile and also tracks analytics, you don't want the analytics coroutine to cancel the profile-loading coroutine if it throws an exception.


Cancellation โ€” cooperative, not forceful

Coroutine cancellation is cooperative. Cancelling a coroutine doesn't immediately stop it โ€” it sends a signal that the coroutine should stop. The coroutine's code is responsible for checking this signal and stopping itself. This is a crucial distinction.

Cancellation only works at suspension points
// This coroutine CAN be cancelled โ€” it checks at the delay() call scope.launch { repeat(1000) { i -> delay(100) // โ† suspension point โ€” cancellation checked here println("Processed item $i") } } // This coroutine CANNOT be cancelled โ€” no suspension points scope.launch { for (i in 0..1_000_000) { heavyComputation(i) // โ† blocks the thread, never checks cancellation } } // Fix: use isActive to periodically check scope.launch { for (i in 0..1_000_000) { if (!isActive) return@launch // โ† cooperative check heavyComputation(i) } }

When a coroutine is cancelled, any subsequent call to a suspend function will throw a CancellationException. This exception is special โ€” it's silently ignored by the coroutine machinery (it's not an error, it's a controlled shutdown). If you catch Exception in your coroutine, make sure you don't swallow CancellationException:

Don't swallow CancellationException
// โŒ Wrong โ€” swallows cancellation, coroutine keeps running after cancel() scope.launch { try { doSomething() } catch (e: Exception) { log.e("Error: $e") // catches CancellationException and ignores it! } } // โœ… Correct โ€” re-throw CancellationException scope.launch { try { doSomething() } catch (e: CancellationException) { throw e // always re-throw! } catch (e: Exception) { log.e("Error: $e") } }

What is Flow?

A suspend function returns one value. But what if you need multiple values over time โ€” a stream of database updates, a sequence of network pages, live UI events? That's what Flow is for.

๐Ÿ’ก The analogy

A regular suspend function is like ordering a pizza โ€” you place the order, you wait, you get one pizza. A Flow is like a pizza subscription โ€” you subscribe once, and pizzas keep arriving over time whenever they're ready.

There's a crucial detail though: a Flow is cold โ€” the subscription doesn't exist until a collector asks for it. The pizzeria only starts making pizzas when someone subscribes. If nobody subscribes, no pizzas are made. This matters for understanding when your code actually runs.

Cold means the code inside a flow { } builder doesn't run until you call collect(). Every new call to collect() starts a fresh execution from the beginning. This is fundamentally different from the hot streams we'll see with StateFlow and SharedFlow.

Creating and collecting a Flow
// Define a flow โ€” nothing runs yet fun tickerFlow(period: Duration): Flow<Int> = flow { var i = 0 while (true) { emit(i++) // emit a value downstream delay(period) // suspend until next tick } } // Start the flow โ€” this is when the code above actually runs viewModelScope.launch { tickerFlow(1.seconds).collect { value -> println("Tick: $value") // called once per second } } // When the scope is cancelled, collect() cancels the flow โ€” no leaks

Flow operators โ€” building pipelines

The real power of Flow comes from its operators โ€” functions that transform, filter, combine, and reshape streams. They're lazy: an operator doesn't process a value until a collector downstream asks for it. This creates a pipeline where data flows only when consumed.

Essential operators
searchQuery .debounce(300) // wait 300ms of silence before emitting โ€” perfect for search .filter { it.length >= 2 } // ignore single-character queries .distinctUntilChanged() // skip if query didn't change (e.g., add+delete space) .flatMapLatest { query -> // cancel previous search when a new query arrives repository.search(query) } .catch { e -> // handle errors without stopping the flow emit(SearchResult.Error(e.message)) } .collect { result -> _uiState.value = result }

Pay close attention to flatMapLatest โ€” it's one of the most important operators for real-world use. When a new value arrives, it cancels any ongoing collection of the previous inner flow and starts collecting the new one. For a search box, this means: if the user types "cor", starts a search, then immediately types "coro" โ€” the "cor" search is cancelled and "coro" takes over. You never see stale results from a superseded query.

Combining multiple flows

combine and zip merge two flows into one, but they behave differently:

combine vs zip
// combine: emits whenever EITHER flow has a new value // Like: "give me the latest value from both, whenever either updates" combine(userFlow, settingsFlow) { user, settings -> UiState(user.name, settings.theme) }.collect { uiState -> render(uiState) } // zip: pairs values one-to-one, waits for both // Like: "pair up the 1st from A with the 1st from B, the 2nd with 2nd, etc." flowA.zip(flowB) { a, b -> Pair(a, b) }.collect { (a, b) -> process(a, b) }

In most UI use-cases, combine is what you want โ€” you want the UI to re-render whenever any of its data sources update. zip is for one-to-one processing, like pairing events with timestamps.


StateFlow & SharedFlow โ€” hot streams

Remember: regular Flow is cold โ€” it only runs when collected, and each collector gets an independent stream. But for UI state, you want something different: a value that exists and updates regardless of whether anyone is currently collecting it, and that new collectors can immediately get the current value from. That's what hot streams are for.

StateFlow โ€” the UI state container

๐Ÿ’ก The analogy

A StateFlow is like a scoreboard at a sports game. The score exists and updates whether anyone is looking at it or not. When you sit down and look at the scoreboard for the first time, you see the current score โ€” you don't have to wait for the next goal. Everyone looking at the scoreboard always sees the same current value.

StateFlow is the modern replacement for LiveData. It always has a value, collectors always get the latest value immediately on subscription, and it only emits when the value actually changes (it skips duplicate values with structural equality).

StateFlow in a ViewModel
class UserViewModel : ViewModel() { // Private mutable version โ€” only ViewModel can change it private val _uiState = MutableStateFlow<UserUiState>(UserUiState.Loading) // Public read-only version โ€” UI observes this val uiState: StateFlow<UserUiState> = _uiState.asStateFlow() fun loadUser(id: String) = viewModelScope.launch { _uiState.value = UserUiState.Loading try { val user = repository.getUser(id) _uiState.value = UserUiState.Success(user) } catch (e: Exception) { _uiState.value = UserUiState.Error(e.message) } } } // In Fragment โ€” collect safely with Lifecycle awareness lifecycleScope.launch { viewLifecycleOwner.repeatOnLifecycle(Lifecycle.State.STARTED) { viewModel.uiState.collect { state -> render(state) } } }

Notice repeatOnLifecycle(STARTED). This is critical โ€” it stops collecting when the app goes to the background (Fragment is stopped) and restarts when it comes back to the foreground. Without it, you'd be processing UI updates and doing unnecessary work while your app is in the background.

SharedFlow โ€” for events

SharedFlow is a more flexible hot stream. Unlike StateFlow, it doesn't require an initial value and can replay multiple past values to new subscribers. This makes it perfect for one-shot events: showing a snackbar, navigating to a screen, displaying a dialog.

โ„น๏ธ StateFlow vs SharedFlow โ€” quick decision rule

StateFlow: use when you're modeling state โ€” something that has a current value that persists over time (isLoading, user data, list contents). New subscribers should always get the current value.

SharedFlow: use when you're modeling events โ€” one-shot things that happen and shouldn't be replayed to late subscribers (navigation, snackbars, toasts). A "show snackbar" event collected twice would show two snackbars.

SharedFlow for navigation events
class LoginViewModel : ViewModel() { private val _events = MutableSharedFlow<LoginEvent>() val events: SharedFlow<LoginEvent> = _events.asSharedFlow() fun login(email: String, pass: String) = viewModelScope.launch { val result = authRepository.login(email, pass) if (result.success) { _events.emit(LoginEvent.NavigateToDashboard) // one-shot event } else { _events.emit(LoginEvent.ShowError(result.message)) } } } // In Fragment repeatOnLifecycle(Lifecycle.State.STARTED) { viewModel.events.collect { event -> when (event) { is LoginEvent.NavigateToDashboard -> navigateToDashboard() is LoginEvent.ShowError -> showSnackbar(event.message) } } }

Channel โ€” the coroutine mailbox

A Channel is a communication primitive for sending values between coroutines. Think of it as a pipe: one coroutine sends values in one end, another coroutine receives them from the other end. Unlike SharedFlow, a channel item is consumed by exactly one receiver โ€” it's not broadcast to multiple collectors.

๐Ÿ’ก The analogy

A Channel is a mailbox between two people. You can drop letters in it (send), and the recipient picks them up one by one (receive). Each letter is picked up by exactly one person โ€” it's not photocopied for everyone. And if the mailbox is full (buffer is full), the sender either waits, drops the letter, or drops the oldest letter, depending on the overflow policy.

Channels are particularly useful when you have a producer-consumer relationship: one part of your code generates work (events, tasks, data), and another part processes it. The channel buffers work between them and naturally handles back-pressure.

Channel overflow strategies
// SUSPEND (default): sender suspends when buffer is full // Good when you can't afford to lose any item val channel = Channel<Task>(capacity = 100) // DROP_OLDEST: drops the oldest buffered item when full // Good for telemetry/analytics โ€” prefer fresh data over stale val channel = Channel<AnalyticsEvent>( capacity = 2000, onBufferOverflow = BufferOverflow.DROP_OLDEST ) // DROP_LATEST: drops the incoming item if buffer is full // Good for UI events where you don't want to queue up too many val channel = Channel<ClickEvent>( capacity = 1, onBufferOverflow = BufferOverflow.DROP_LATEST )

Production patterns you'll use every day

Observing Room as a Flow

Room's DAO supports returning Flow<T> from queries. This creates a live query โ€” whenever the data in the database changes, Room automatically re-runs the query and emits the new result. Combined with StateFlow in the ViewModel, this creates a fully reactive data pipeline with zero polling.

Room โ†’ Flow โ†’ StateFlow โ†’ UI
// DAO โ€” returns Flow, not suspend fun @Dao interface UserDao { @Query("SELECT * FROM users WHERE id = :id") fun observeUser(id: String): Flow<User> // NOT suspend โ€” returns a live stream } // Repository โ€” pass the flow through unchanged fun userFlow(id: String): Flow<User> = userDao.observeUser(id) // ViewModel โ€” convert to StateFlow for the UI val user: StateFlow<User?> = repository .userFlow(userId) .stateIn( scope = viewModelScope, started = SharingStarted.WhileSubscribed(5_000), // keep alive 5s after last subscriber initialValue = null )

stateIn with WhileSubscribed(5000) is a pattern worth memorising. The 5-second timeout means: if all subscribers unsubscribe (e.g., because the user rotated the screen), keep the upstream flow alive for 5 seconds. If a new subscriber arrives within that window (the new Activity after rotation), they immediately get the last cached value without re-running the database query. This avoids the flash of an empty state on rotation.

Handling errors in a Flow pipeline

Errors in a Flow cancel the entire flow by default. The catch operator intercepts exceptions and lets you emit a fallback value โ€” keeping the stream alive rather than terminating it.

Graceful error handling in flows
repository.userFlow(id) .map { user -> UiState.Success(user) } .catch { e -> // Log the error, emit an error state โ€” flow continues running analytics.logError(e) emit(UiState.Error(e.localizedMessage)) } .onEach { state -> // Runs for every emission, success or error recovery hideLoadingIndicator() } .collect { state -> _uiState.value = state }

Testing coroutines

The biggest challenge when testing coroutines is time. Real code uses delay(), but in tests you don't want to wait real seconds. Kotlin provides TestCoroutineDispatcher (now called UnconfinedTestDispatcher and StandardTestDispatcher in the current API) and the runTest builder that lets you control virtual time.

Testing with runTest and virtual time
@Test fun `search debounces correctly`() = runTest { val results = mutableListOf<String>() val job = launch { searchFlow .debounce(300) .collect { results.add(it) } } searchInput.value = "c" advanceTimeBy(100) // 100ms passes virtually โ€” no actual waiting searchInput.value = "co" advanceTimeBy(100) // another 100ms searchInput.value = "cor" advanceTimeBy(400) // 400ms โ€” debounce triggers assertEquals(listOf("cor"), results) // only the final query went through job.cancel() }

For testing Flows, the Turbine library by Cash App is the gold standard. It turns the async nature of collecting a flow into a synchronous, readable test API:

Testing Flows with Turbine
@Test fun `loading state transitions correctly`() = runTest { viewModel.uiState.test { assertEquals(UiState.Loading, awaitItem()) // first emission viewModel.loadUser("user123") assertEquals(UiState.Loading, awaitItem()) // loading starts val success = awaitItem() as UiState.Success assertEquals("Alice", success.user.name) // data loads cancelAndIgnoreRemainingEvents() } }

How this connects to system design interviews

Every Android system design question you'll face at a senior level involves coroutines and Flow. Understanding them deeply lets you explain your choices with precision rather than hand-waving.

When designing an Analytics SDK, you explain: "The track() function calls channel.trySend() โ€” non-blocking because the Channel is bounded with DROP_OLDEST overflow. A background coroutine in a SupervisorJob-backed scope drains the channel to Room. The scope is tied to the application lifecycle so it outlives any single Activity."

When designing a Payment System, you explain: "The CheckoutOrchestrator uses a Mutex with tryLock() to prevent double-submission โ€” if the mutex is already held, the second startCheckout() call returns immediately. The checkout state is a StateFlow<CheckoutState> that the UI layer collects with repeatOnLifecycle(STARTED)."

The pattern is always the same: choose the right primitive (Channel, StateFlow, SharedFlow, Mutex), wire it into the right scope, and explain why each choice is correct. That's what senior-level answers look like.