Design an Offline-First News App
1. Understanding the Problem
Design an Android news app (think Inshorts, Google News, or The Hindu app) that works seamlessly without internet. Users should be able to open the app on the subway, in flight mode, or in a low-connectivity region and still read articles. The core challenge: keeping local content fresh while minimising data usage and battery drain.
The defining principle: Room DB is always the single source of truth. The UI never reads from the network directly; it only reads from Room. The network exists solely to keep Room up to date. This pattern guarantees the app works regardless of connectivity.
Functional Requirements
- Browse a categorised news feed (Top Stories, Tech, Sports, …)
- Full article read, including images
- Bookmark articles for later offline reading
- Background sync: articles download automatically on a schedule
- Offline banner shown when no connectivity
- Search across locally cached articles
- Staleness indicator on articles older than X hours
Non-Functional Requirements
- Cold start: Feed visible instantly (0 network calls)
- Freshness: Sync every 1 to 4 h; immediate pull-to-refresh
- Storage: Evict articles older than 48 h (except bookmarks)
- Battery: Sync only on Wi-Fi or when charging (configurable)
- Data: Sync images lazily; text first
- Conflict: Server data always wins on refresh
Clarifying Questions
- How many articles per category? (Affects storage budget)
- Is background sync required, or only on app open?
- Should bookmarked articles have their images pre-downloaded?
- Is full-text search required, or just title/headline search?
2. The Set Up
Sync Strategy Comparison
| Strategy | Behaviour |
|---|---|
| Fetch then Cache | Always hit the network; fall back to cache on failure. |
| Serve from Cache | Serve cached data; refresh in background. |
| Room as SSoT | Room always drives the UI. WorkManager syncs periodically and on demand. |
Core Components
3. High-Level Design
WorkManager Sync Strategy
| Trigger | Type | Constraint | Priority |
|---|---|---|---|
| Periodic (every 4 h) | PeriodicWorkRequest | NetworkType.UNMETERED (Wi-Fi) | Normal |
| App foreground / pull-to-refresh | OneTimeWorkRequest | NetworkType.CONNECTED | High |
| FCM breaking news push | OneTimeWorkRequest (expedited) | NetworkType.CONNECTED | Expedited |
| App install / first launch | OneTimeWorkRequest | NetworkType.CONNECTED | High |
Room Schema
import androidx.room.Dao
import androidx.room.Entity
import androidx.room.Insert
import androidx.room.OnConflictStrategy
import androidx.room.PrimaryKey
import androidx.room.Query
import kotlinx.coroutines.flow.Flow

// Room's @Entity has no positional argument; the table name must be passed as tableName.
@Entity(tableName = "articles")
data class ArticleEntity(
    @PrimaryKey val id: String,
    val title: String,
    val summary: String,
    val body: String,           // full article text
    val imageUrl: String,
    val category: String,
    val source: String,
    val publishedAt: Long,      // epoch ms
    val syncedAt: Long,         // when we saved it
    val isBookmarked: Boolean = false,
    val isRead: Boolean = false
)

@Entity(tableName = "sync_metadata")
data class SyncMetadata(
    @PrimaryKey val category: String,
    val lastSyncAt: Long,
    val etag: String?           // HTTP ETag for conditional GET
)

@Dao
interface ArticleDao {
    @Query("SELECT * FROM articles WHERE category = :cat ORDER BY publishedAt DESC")
    fun observeByCategory(cat: String): Flow<List<ArticleEntity>>

    @Query("SELECT * FROM articles WHERE isBookmarked = 1 ORDER BY publishedAt DESC")
    fun observeBookmarks(): Flow<List<ArticleEntity>>

    @Query("SELECT * FROM articles WHERE title LIKE '%' || :q || '%' OR summary LIKE '%' || :q || '%'")
    fun search(q: String): Flow<List<ArticleEntity>>

    // Eviction: delete stale non-bookmarked articles older than cutoff
    @Query("DELETE FROM articles WHERE syncedAt < :cutoff AND isBookmarked = 0")
    suspend fun evictStale(cutoff: Long)

    // REPLACE rewrites the whole row, including isBookmarked and isRead
    @Insert(onConflict = OnConflictStrategy.REPLACE)
    suspend fun upsertAll(articles: List<ArticleEntity>)
}
4. Low-Level Design
Whiteboard: Sync Pipeline
Flow 1: Cold Start (Offline)
Participants: FeedFragment, NewsViewModel, NewsRepository, Room DB, WorkManager
| Step | Action |
|---|---|
| 1 | onViewCreated → collectLatest(viewModel.articles) |
| 2 | Exposes repository.observeCategory(selectedCat) |
| 3 | Returns articleDao.observeByCategory(cat); no network call |
| 4 | Room emits cached articles (from last sync) immediately |
| 5 | RecyclerView renders cached feed. User reads articles. |
| 6 | WorkManager queues a sync and waits for network (constraint not met) |
| 7 | Offline banner shown (ConnectivityManager Flow emits NO_NETWORK) |
Flow 2: Background Sync (WorkManager)
Participants: WorkManager, NewsSyncWorker, News API, Room DB, Feed UI
| Step | Action |
|---|---|
| 1 | PeriodicWorkRequest fires (4 h interval, Wi-Fi constraint met) |
| 2 | Read sync_metadata for each category → get stored ETag |
| 3 | GET /headlines?category=tech with If-None-Match: "abc123" |
| 4 | If no new articles: 304 Not Modified; zero bytes of body transferred |
| 5 | If updated: 200 with new articles + new ETag |
| 6 | db.withTransaction: upsertAll + evictStale(cutoff = now - 48 h) + update ETag |
| 7 | Room invalidates query → Flow emits updated list |
| 8 | RecyclerView animates new articles in via DiffUtil. No code in Fragment changed. |
Flow 3: Bookmark for Offline Reading
Participants: ArticleFragment, NewsViewModel, Room DB, Coil (Image Prefetch)
| Step | Action |
|---|---|
| 1 | User taps bookmark icon → heart animation |
| 2 | bookmarkArticle(articleId) |
| 3 | UPDATE articles SET isBookmarked=1 WHERE id=:id |
| 4 | Trigger image pre-fetch for article's imageUrl |
| 5 | Headless Coil request: download full image to disk cache (not just thumbnail) |
| 6 | Bookmarked articles are exempt from the evictStale DELETE and persist indefinitely |
| 7 | BookmarksFragment reads via observeBookmarks(); available fully offline |
Key Code: NewsSyncWorker
import android.content.Context
import androidx.room.withTransaction
import androidx.work.CoroutineWorker
import androidx.work.WorkerParameters
import coil.request.CachePolicy   // Coil 2.x package names
import coil.request.ImageRequest
import java.io.IOException

// db, api and imageLoader are assumed to be injected (for example via a WorkerFactory).
class NewsSyncWorker(ctx: Context, params: WorkerParameters) : CoroutineWorker(ctx, params) {

    override suspend fun doWork(): Result {
        val categories = listOf("top", "tech", "sports", "business")
        return try {
            categories.forEach { category ->
                syncCategory(category)
            }
            Result.success()
        } catch (e: IOException) {
            // Transient network error: retry with backoff, then give up
            if (runAttemptCount < 3) Result.retry()
            else Result.failure()
        }
    }

    private suspend fun syncCategory(category: String) {
        val meta = db.syncMetadataDao().get(category)
        // Conditional GET: only download if server data changed
        val response = api.getHeadlines(
            category = category,
            ifNoneMatch = meta?.etag // null on first sync
        )
        if (response.code() == 304) return // nothing changed, no body downloaded
        // Surface other non-2xx responses as retryable errors instead of crashing on a null body
        if (!response.isSuccessful) throw IOException("HTTP ${response.code()} for $category")
        val articles = response.body() ?: return
        val newEtag = response.headers()["ETag"]
        val cutoff = System.currentTimeMillis() - 48L * 3600 * 1000
        db.withTransaction {
            // Note: REPLACE overwrites isBookmarked/isRead, so merge local flags into the entities first
            db.articleDao().upsertAll(articles.map { it.toEntity(category) })
            db.articleDao().evictStale(cutoff) // bookmarks are excluded by query
            db.syncMetadataDao().upsert(SyncMetadata(
                category = category,
                lastSyncAt = System.currentTimeMillis(),
                etag = newEtag
            ))
        }
        // Warm image cache for top 10 articles (disk only, no memory waste)
        articles.take(10).forEach { article ->
            val req = ImageRequest.Builder(applicationContext)
                .data(article.thumbnailUrl)
                .memoryCachePolicy(CachePolicy.DISABLED)
                .diskCachePolicy(CachePolicy.ENABLED)
                .build()
            imageLoader.enqueue(req)
        }
    }
}
Key Code: Scheduling WorkManager
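import android.content.Context
import androidx.work.BackoffPolicy
import androidx.work.Constraints
import androidx.work.ExistingPeriodicWorkPolicy
import androidx.work.ExistingWorkPolicy
import androidx.work.NetworkType
import androidx.work.OneTimeWorkRequestBuilder
import androidx.work.OutOfQuotaPolicy
import androidx.work.PeriodicWorkRequestBuilder
import androidx.work.WorkManager
import java.util.concurrent.TimeUnit

// Called once from Application.onCreate()
fun schedulePeriodicSync(context: Context) {
    val constraints = Constraints.Builder()
        .setRequiredNetworkType(NetworkType.UNMETERED) // Wi-Fi only
        .setRequiresBatteryNotLow(true)
        .build()
    val syncRequest = PeriodicWorkRequestBuilder<NewsSyncWorker>(4, TimeUnit.HOURS)
        .setConstraints(constraints)
        .setBackoffCriteria(BackoffPolicy.EXPONENTIAL, 15, TimeUnit.MINUTES)
        .build()
    WorkManager.getInstance(context).enqueueUniquePeriodicWork(
        "news_sync",
        ExistingPeriodicWorkPolicy.KEEP, // don't re-schedule if already queued
        syncRequest
    )
}

// On-demand sync (pull-to-refresh)
fun triggerImmediateSync(context: Context) {
    // The builder form works across WorkManager versions
    val constraints = Constraints.Builder()
        .setRequiredNetworkType(NetworkType.CONNECTED)
        .build()
    val req = OneTimeWorkRequestBuilder<NewsSyncWorker>()
        .setExpedited(OutOfQuotaPolicy.RUN_AS_NON_EXPEDITED_WORK_REQUEST)
        .setConstraints(constraints)
        .build()
    WorkManager.getInstance(context).enqueueUniqueWork(
        "news_sync_now", ExistingWorkPolicy.REPLACE, req
    )
}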
5. Potential Deep Dives
ETag / Conditional GET for Zero Data Waste
The server returns an ETag header with each response. Store it in the sync_metadata table. On the next sync request, send If-None-Match: <stored_etag>. If nothing changed, the server returns 304 Not Modified with an empty body, so only headers cross the wire. This is critical for the periodic background sync, which fires frequently: most syncs in a stable news cycle cost almost nothing.
ETag (an opaque validator, typically a content hash) and Last-Modified (timestamp) are the two HTTP cache validators. ETags are preferred because they're content-based, not time-based: a re-processed article with the same content won't look "changed" to ETag, but would to Last-Modified.
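The exchange can be modelled without any networking. The sketch below is a minimal simulation, not the app's real client: `FakeHeadlinesServer`, `HttpResult`, and `CategoryCache` are hypothetical names, and hashing the payload with SHA-256 is only one assumed way for a server to derive an ETag.

```kotlin
import java.security.MessageDigest

// Hypothetical stand-in for an HTTP response; only the fields the sync logic needs.
data class HttpResult(val code: Int, val etag: String?, val body: String?)

// Hypothetical in-memory server. Assumption: the ETag is a truncated SHA-256 of the payload.
class FakeHeadlinesServer(var payload: String) {
    private fun etagOf(content: String): String {
        val digest = MessageDigest.getInstance("SHA-256").digest(content.toByteArray())
        return "\"" + digest.take(8).joinToString("") { "%02x".format(it) } + "\""
    }

    // Mirrors If-None-Match handling: 304 with no body when the validator still matches.
    fun get(ifNoneMatch: String?): HttpResult {
        val current = etagOf(payload)
        return if (ifNoneMatch == current) HttpResult(304, current, null)
        else HttpResult(200, current, payload)
    }
}

// Plays the role of one sync_metadata row plus the cached content for a category.
class CategoryCache {
    var etag: String? = null
    var body: String? = null

    // Returns true when new content was downloaded and stored, false on 304.
    fun sync(server: FakeHeadlinesServer): Boolean {
        val res = server.get(etag)
        if (res.code == 304) return false
        etag = res.etag
        body = res.body
        return true
    }
}
```

Calling `sync` twice against unchanged content downloads the payload once and then takes the 304 path; changing `payload` on the fake server makes the next `sync` download again.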
Storage Budget & Eviction Policy
Unbounded growth is a real problem. Strategy:
- Text: Delete articles older than 48 h that aren't bookmarked (the evictStale query runs inside every sync transaction)
- Images: Coil's DiskLruCache has a configurable max size (e.g., 250 MB). Coil handles eviction automatically using LRU, so you don't need custom logic
- Bookmarks: Never evicted automatically. Show a "Manage storage" setting so users can clear bookmarks manually
- onTrimMemory: Implement ComponentCallbacks2.onTrimMemory() to clear the in-memory Coil cache on memory pressure without losing the disk cache
FCM-Triggered Breaking News Sync
For urgent breaking news, the server sends an FCM data message (not a notification message, so the app's own handler receives it in the background). The FirebaseMessagingService.onMessageReceived() handler enqueues an expedited OneTimeWorkRequest. Expedited work starts as soon as possible, subject to the system's expedited-work quota, instead of waiting for the normal scheduling delay. The user sees a notification once the article is saved to Room; no custom notification sound or copy is needed in the client.
Full-Text Search
Room supports SQLite FTS3/FTS4 virtual tables through the @Fts3 and @Fts4 annotations (there is no Room annotation for FTS5). Create an ArticleFts entity that shadows ArticleEntity with @Fts4(contentEntity = ArticleEntity::class). FTS uses an inverted index for MATCH queries, which is dramatically faster than LIKE '%query%' on large tables. Room generates triggers that keep the shadow table in sync. Expose search via a Flow so results update reactively as the user types (with a 300 ms debounce in the ViewModel).
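To see why an inverted index beats a LIKE scan, here is a toy in-memory version. It is an illustration of the idea only, not how SQLite implements FTS; `InvertedIndex` and its tokenizer are hypothetical, and multi-word queries are treated as an implicit AND of their terms.

```kotlin
// Toy inverted index: maps each token to the set of article ids containing it.
class InvertedIndex {
    private val postings = HashMap<String, MutableSet<String>>()

    // Assumption: lowercase ASCII tokenisation is good enough for the illustration.
    private fun tokenize(text: String): List<String> =
        text.lowercase().split(Regex("[^a-z0-9]+")).filter { it.isNotEmpty() }

    fun add(articleId: String, text: String) {
        for (token in tokenize(text)) {
            postings.getOrPut(token) { mutableSetOf() }.add(articleId)
        }
    }

    // Looks up each query term directly and intersects the id sets;
    // no article text is scanned at query time.
    fun match(query: String): Set<String> {
        val terms = tokenize(query)
        if (terms.isEmpty()) return emptySet()
        val perTerm: List<Set<String>> = terms.map { postings[it] ?: emptySet() }
        return perTerm.reduce { acc, ids -> acc intersect ids }
    }
}
```

A LIKE '%query%' search has to read every row's text, while `match` costs one hash lookup per term plus a set intersection.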
Connectivity Banner
Observe network state in the ViewModel using a callbackFlow wrapping ConnectivityManager.registerDefaultNetworkCallback. Map the callback to a Boolean isOnline StateFlow. The Fragment observes this and shows/hides a "You're offline" banner at the top of the feed with a smooth slide animation. When it goes back online, trigger an immediate sync automatically.
6. What is Expected at Each Level
- Knows Room + Retrofit basics
- Caches API response to Room manually
- WorkManager for background fetch
- Handles no-internet with try/catch
- Shows offline toast/message
- Basic UPSERT on article refresh
- Room as SSoT: UI only reads from Room
- Reactive Flow from Room: zero polling
- Conditional GET with ETag (zero-byte syncs)
- Atomic transaction: upsert + evict + updateETag
- Stale eviction policy with bookmark exemption
- FCM → expedited WorkManager trigger
- Coil disk pre-cache for bookmarked images
- FTS4 virtual table for instant full-text search
- Differential sync: server sends only changed article IDs
- Per-category sync schedules (sports sync more during events)
- Storage budget enforcement per category
- WorkManager chain: sync → pre-fetch images → notify
- A/B testing article card layouts via RemoteConfig
- Analytics for offline read depth (how far users read offline)
7. Interview Questions
Regular caching is a fallback: the app tries the network first and falls back to cache only on failure. Offline-first inverts this: the local database is always the source of truth. The UI always reads from Room regardless of connectivity. The network exists only to keep the local database updated. This guarantees the app behaves identically whether or not internet is available, because no read path depends on the network. It also means near-instant cold starts: a local Room read typically completes in milliseconds, whereas a network round trip often takes hundreds of milliseconds.
WorkManager is the correct modern solution because: (1) it survives process death, since work is persisted to a SQLite database and rescheduled on boot; (2) it respects Doze mode and battery restrictions, deferring work appropriately instead of draining the battery; (3) it supports constraints (Wi-Fi only, battery not low) declaratively; (4) it handles retries with backoff automatically. AlarmManager can fire at an exact time but does not handle constraints, retries, or battery policy for you. Background Services are subject to background execution limits on Android 8.0+ and can be stopped soon after the app leaves the foreground. WorkManager handles the edge cases that would require hundreds of lines of custom code with the older APIs.
Run an eviction query inside every sync transaction: DELETE FROM articles WHERE syncedAt < :cutoff AND isBookmarked = 0. The cutoff is typically now - 48 hours. Doing this inside the same transaction as the upsert is crucial because it makes the operation atomic: either both succeed or neither does, preventing partial states where old articles persist alongside failed new insertions. Bookmarked articles are explicitly excluded so user-saved content survives eviction. For images, Coil's DiskLruCache handles eviction automatically via LRU once it reaches the configured size cap.
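The eviction rule can be checked in isolation with an in-memory model. This is a sketch that mirrors the SQL predicate rather than the Room DAO itself; `CachedArticle`, `RETENTION_MS`, and `retainAfterEviction` are hypothetical names.

```kotlin
// Only the columns the eviction predicate looks at.
data class CachedArticle(val id: String, val syncedAt: Long, val isBookmarked: Boolean)

// 48 hours in milliseconds, matching the cutoff used in the sync worker.
val RETENTION_MS: Long = 48L * 60 * 60 * 1000

// Keeps exactly the rows that the DELETE ... WHERE syncedAt < :cutoff AND isBookmarked = 0
// statement would leave behind.
fun retainAfterEviction(
    articles: List<CachedArticle>,
    nowMs: Long,
    retentionMs: Long = RETENTION_MS
): List<CachedArticle> {
    val cutoff = nowMs - retentionMs
    return articles.filter { it.isBookmarked || it.syncedAt >= cutoff }
}
```

An article synced 49 hours ago is dropped unless it is bookmarked; a bookmarked article survives regardless of age.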
An ETag is an opaque validator, typically a hash of the response content, that the server includes in the response header (ETag: "abc123"). The client stores it. On the next request, it sends If-None-Match: "abc123". If the content hasn't changed, the server returns 304 Not Modified with an empty body: only headers, zero bytes of content transferred. This matters a great deal for a periodic sync that fires every 4 hours: if news is slow on a Sunday morning, all syncs cost near-zero data. The client only pays the full data cost when content actually changes.
Room DAO methods that return Flow are reactive. Room tracks writes per table through its InvalidationTracker: any write to the articles table triggers an invalidation, which causes the active Flow to re-run the query and re-emit. The Fragment's collectLatest picks this up and the RecyclerView updates via DiffUtil. There is no polling and no manual refresh call from the Fragment. The sync worker writes to Room (from a background thread), and the UI thread receives the update through the coroutine Flow pipeline automatically. This is the elegance of Room + Kotlin Flow.
The server's article data (title, body, imageUrl, publishedAt) should always win, so use @Insert(onConflict = OnConflictStrategy.REPLACE). But local flags (isRead, isBookmarked) are client-only state that must be preserved. Strategy: before upserting, read existing local flags; after mapping the server response to an entity, merge the local flags back in. Or better: use @Update / @Query to update only the server-driven columns, leaving isRead and isBookmarked untouched. A clean pattern: separate the entity into an ArticleContent table (server-owned) and an ArticleUserState table (client-owned), joined by article ID.
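The read-then-merge strategy can be sketched as a pure function. The names here (`Article`, `mergeServerIntoLocal`) are hypothetical and the model is trimmed to a few fields; in the app the result would be what gets passed to the upsert.

```kotlin
// Trimmed-down article: title and body are server-owned, the two flags are client-owned.
data class Article(
    val id: String,
    val title: String,
    val body: String,
    val isBookmarked: Boolean = false,
    val isRead: Boolean = false
)

// Server content always wins; client flags are carried over from the existing local row.
fun mergeServerIntoLocal(local: List<Article>, fromServer: List<Article>): List<Article> {
    val localById = local.associateBy { it.id }
    return fromServer.map { incoming ->
        val existing = localById[incoming.id]
        if (existing == null) incoming
        else incoming.copy(isBookmarked = existing.isBookmarked, isRead = existing.isRead)
    }
}
```

An article the user bookmarked keeps its flag even when the server sends a corrected title, while brand-new articles arrive with the default flags.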
enqueueUniquePeriodicWork ensures only one instance of the periodic sync exists by name. ExistingPeriodicWorkPolicy.KEEP means: if a work request with this name already exists, do nothing and keep the existing schedule. REPLACE would cancel the existing one and restart the interval, potentially causing extra syncs or losing the existing schedule. Using KEEP with a call in Application.onCreate() means the sync is safely registered on every app launch, but the actual periodic schedule only runs once and is never duplicated.
On SwipeRefreshLayout swipe: (1) Call triggerImmediateSync(), which enqueues a OneTimeWorkRequest with ExistingWorkPolicy.REPLACE (so multiple swipes don't stack up); (2) Observe the WorkInfo state from WorkManager: when State.RUNNING, show the spinner; when SUCCEEDED or FAILED, hide it. Do not hide the spinner when Room emits new data; hide it when the WorkManager job completes. This handles the case where the sync ran but found no new articles (304 response): the spinner correctly stops even though Room didn't emit.
Use an FTS4 virtual table, which Room supports through the @Fts4 annotation (Room also offers @Fts3, but has no FTS5 annotation). Annotate a data class with @Fts4(contentEntity = ArticleEntity::class). Room creates a shadow FTS table and SQLite triggers to keep it in sync with articles. Query with: SELECT articles.* FROM articles JOIN articlesFts ON articles.rowid = articlesFts.rowid WHERE articlesFts MATCH :query. FTS uses an inverted index, so "climate change" searches the index rather than scanning every row. It supports prefix matching (clim*), boolean operators, and near-word matching. In the ViewModel, debounce the search input by 300 ms to avoid hitting Room on every keystroke.
The server sends an FCM data message (not a notification message) when breaking news hits. FirebaseMessagingService.onMessageReceived() receives it even when the app is in the background or killed. It enqueues an expedited OneTimeWorkRequest. The worker syncs the relevant article, saves it to Room, then uses NotificationManager to build and show a push notification with the headline and a PendingIntent deep-linking to the article. Using data messages instead of notification messages gives full control over notification content and timing: the notification fires only after the article is saved locally, so tapping it always works offline.
Wrap ConnectivityManager.registerDefaultNetworkCallback in a callbackFlow: the onAvailable callback emits true and onLost emits false. Expose this as a StateFlow<Boolean> in the ViewModel. The Fragment observes it and animates a banner view (using TransitionManager.beginDelayedTransition) in/out. When the network returns, trigger an immediate sync automatically so the user doesn't need to pull-to-refresh. Use distinctUntilChanged() on the Flow so rapid toggle/untoggle events don't cause banner flicker.
Use WorkManager's TestListenableWorkerBuilder: val worker = TestListenableWorkerBuilder<NewsSyncWorker>(context).build(). Use an in-memory Room database and a mock Retrofit API (returns pre-built responses). Call worker.doWork() (suspending). Assert: (1) Worker returned Result.success(); (2) Room contains the expected articles; (3) sync_metadata.etag was updated; (4) Old articles were evicted. For the 304 case: mock API to return an empty body with 304 status, assert Room is unchanged. For retry: mock a network exception, assert Result.retry() on attempt 1 and Result.failure() on attempt 4.
Delete-all then insert destroys local state (isRead, isBookmarked) on every sync. It can also cause a momentary empty state between the delete and the insert, so the UI briefly shows nothing. An upsert keeps existing rows in place: new articles are inserted and existing ones are updated. Note that OnConflictStrategy.REPLACE rewrites the whole row, so on its own it would also reset client flags; pair it with a separate table for client state or a @Query that updates only server-owned columns. This maintains both freshness and local user state without gaps in the UI.
Set Constraints.setRequiredNetworkType(NetworkType.UNMETERED) on the PeriodicWorkRequest. UNMETERED matches Wi-Fi and Ethernet, that is, any connection without data caps. The periodic sync (every 4 h) and image pre-caching can transfer several MB. Running this on cellular would burn the user's data plan silently. However, the on-demand pull-to-refresh sync uses NetworkType.CONNECTED so it works on any network, because the user explicitly requested it. FCM-triggered breaking news also uses CONNECTED since the user expects timely breaking news even on cellular.
Add a staleness computed property to the UI model: val isStale = (now - syncedAt).milliseconds > 6.hours. Map this in the ViewModel when converting ArticleEntity to ArticleUiModel. In the article card ViewHolder, show a small "6h ago" chip in a muted colour when isStale == true. Alternatively, dim the card slightly or show a yellow border. This gives the user transparency about data freshness without blocking them from reading. The staleness threshold should be a configurable constant, not a magic number.
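A minimal sketch of the staleness check using kotlin.time, with the threshold pulled out as a named value. `STALE_AFTER`, `isStale`, and `ageLabel` are hypothetical names; timestamps are epoch milliseconds as in ArticleEntity.

```kotlin
import kotlin.time.Duration
import kotlin.time.Duration.Companion.hours
import kotlin.time.Duration.Companion.milliseconds

// Configurable threshold instead of a magic number scattered through the UI code.
val STALE_AFTER: Duration = 6.hours

// True when the article was last synced strictly longer ago than the threshold.
fun isStale(syncedAtMs: Long, nowMs: Long, threshold: Duration = STALE_AFTER): Boolean =
    (nowMs - syncedAtMs).milliseconds > threshold

// Label for the chip shown on stale cards, in whole hours.
fun ageLabel(syncedAtMs: Long, nowMs: Long): String =
    "${(nowMs - syncedAtMs).milliseconds.inWholeHours}h ago"
```

Passing `nowMs` in explicitly, rather than reading the clock inside the function, keeps the mapping easy to unit test.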
On first launch, Room is empty. The ViewModel exposes a uiState that can be Loading | Success(articles) | Empty. The Fragment shows a loading shimmer (skeleton UI) while the first sync runs. The sync is triggered immediately on first launch as a high-priority OneTimeWorkRequest with CONNECTED constraint. Once the worker writes articles to Room, the Flow emits and the shimmer is replaced by the feed. If there's no network on first launch, the Fragment shows a "No articles yet, check your connection" empty state with a retry button, which triggers the immediate sync.
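The three states can be derived from two inputs: what Room currently holds and whether a sync is in flight. The sketch below is one possible shape, with hypothetical names (`FeedUiState`, `feedUiState`) and headlines reduced to strings.

```kotlin
// Hypothetical UI state for the feed screen.
sealed interface FeedUiState {
    object Loading : FeedUiState
    object Empty : FeedUiState
    data class Success(val headlines: List<String>) : FeedUiState
}

// Cached content always wins; the shimmer only shows when there is nothing to render yet.
fun feedUiState(cachedHeadlines: List<String>, syncInFlight: Boolean): FeedUiState = when {
    cachedHeadlines.isNotEmpty() -> FeedUiState.Success(cachedHeadlines)
    syncInFlight -> FeedUiState.Loading
    else -> FeedUiState.Empty
}
```

Because cached content takes priority, a background sync on later launches never replaces a readable feed with a loading screen.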
The sync_metadata table has one row per category with its own lastSyncAt and etag. The sync worker iterates over all enabled categories in the user's preferences. This allows per-category sync policies: Sports could sync every 30 minutes during a cricket match (detected via a server flag in the response), while Opinion syncs once a day. A categories table can also store the user's subscription state, so unsubscribed categories are skipped entirely in the sync loop, saving bandwidth. When the user changes their category subscriptions in Settings, that triggers an immediate OneTimeWorkRequest for newly subscribed categories.
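Deciding which categories to sync on a given run reduces to a filter over per-category rows. In this sketch `CategorySchedule` and `categoriesDue` are hypothetical; the interval and subscription fields are assumptions layered on top of the sync_metadata columns described above.

```kotlin
// One row per category: last sync time plus an assumed per-category interval and subscription flag.
data class CategorySchedule(
    val category: String,
    val lastSyncAt: Long,   // epoch ms
    val intervalMs: Long,   // how often this category should sync
    val subscribed: Boolean
)

// Returns the categories the worker should sync now: subscribed and past their interval.
fun categoriesDue(all: List<CategorySchedule>, nowMs: Long): List<String> =
    all.filter { it.subscribed && nowMs - it.lastSyncAt >= it.intervalMs }
        .map { it.category }
```

The worker would loop over the returned names instead of a hard-coded category list, so unsubscribed or recently synced categories cost no requests.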
Track these metrics in analytics: (1) Offline read rate: % of article opens that happened with no connectivity; a high number validates the investment. (2) Cold start to first article: P50/P95 time from app open to first article visible; this should be <300 ms. (3) Sync success rate: % of WorkManager syncs that return SUCCESS vs RETRY/FAILURE. (4) 304 rate: % of sync API calls that return 304; high is good (efficient). (5) Eviction loss: track when a user tries to open a bookmarked article that was accidentally evicted; this indicates a bug in the eviction exemption logic.
Add a readProgressPercent: Int column to ArticleEntity (0 to 100). As the user scrolls through the article body, a RecyclerView.OnScrollListener computes the scroll percentage and updates the ViewModel. Debounce writes to Room by 2 seconds to avoid constant DB writes. On article open, the ViewModel loads the stored readProgressPercent and scrolls the RecyclerView to that position. Show a reading progress bar at the bottom of the article card in the feed. Mark isRead = true once readProgressPercent >= 80. This entirely client-side feature requires no server changes and works completely offline.
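The percentage calculation itself is plain arithmetic. The sketch below assumes the three inputs come from RecyclerView's computeVerticalScrollOffset(), computeVerticalScrollRange(), and computeVerticalScrollExtent(); the function names and the handling of short articles are illustrative choices.

```kotlin
// Converts scroll metrics (in pixels) into a 0..100 progress value.
fun readProgressPercent(scrollOffset: Int, scrollRange: Int, viewportExtent: Int): Int {
    val scrollable = scrollRange - viewportExtent
    // Assumption: an article that fits on one screen counts as fully read once opened.
    if (scrollable <= 0) return 100
    return (scrollOffset * 100L / scrollable).toInt().coerceIn(0, 100)
}

// Threshold from the text: an article counts as read at 80 percent.
fun isRead(progressPercent: Int, threshold: Int = 80): Boolean = progressPercent >= threshold
```

Subtracting the viewport extent matters: without it the user could never reach 100 percent, because the offset stops one screen short of the full range.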
Room requires a migration strategy whenever the schema version changes. Options: (1) Manual migration: val MIGRATION_1_2 = object : Migration(1, 2) { override fun migrate(db: SupportSQLiteDatabase) { db.execSQL("ALTER TABLE articles ADD COLUMN readProgressPercent INTEGER NOT NULL DEFAULT 0") }}, registered with databaseBuilder.addMigrations(MIGRATION_1_2). (2) Destructive migration: .fallbackToDestructiveMigration(), which wipes and recreates all tables. This is only acceptable in development or if the data is purely a cache. Here users would lose their bookmarks with this approach, so use option 1 in production. Always write a migration test using MigrationTestHelper to verify the schema transition doesn't corrupt existing data.