Scalability for Mobile and Web Products
Scalability is the architectural discipline that decides how your product behaves as traffic and user count grow. A system that works at 100 users collapses at 100,000 — and predicting that silent transition is the tech lead's job. This article covers the four principles that keep mobile and web architectures stable in the long run.
Infrastructure Layer: Elastic from Day One
The first principle of scalability is to start "not distributed, but distributable." Beginning with a monolithic backend is often the right call — but the database, cache, and file storage should live as separate services from day one. This keeps the path open for service decomposition in year two without a rewrite.
Practical infrastructure choices:
- Compute: PaaS like Heroku/Railway is enough for MVP, but keep the migration path to AWS ECS/GCP Cloud Run ready
- Database: PostgreSQL primary + read replica should be wired before hitting 50,000 users
- Object storage: Files uploaded to S3/GCS must be decoupled from app servers
- CDN: Cloudflare/CloudFront as default for static assets
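One way to prepare for the primary plus read replica setup is a small routing function in the data layer, so that adding the replica later is a configuration change rather than a refactor. The sketch below is only an illustration under stated assumptions: `chooseTarget`, `QueryTarget`, and `RoutingOptions` are hypothetical names, no real database driver is involved, and the keyword check is a deliberate simplification.

```typescript
// Hypothetical routing helper; it decides where a statement should run,
// but does not talk to any database itself.
type QueryTarget = "primary" | "replica";

interface RoutingOptions {
  // True when the caller must see its own just-committed writes
  // (replicas can lag behind the primary).
  requiresFreshRead?: boolean;
  // True when the statement runs inside an open transaction.
  inTransaction?: boolean;
}

// Statements starting with these keywords are treated as reads.
// Simplification: a SELECT that calls a writing function would be misrouted.
const READ_ONLY_PREFIXES = ["select", "show"];

function chooseTarget(sql: string, options: RoutingOptions = {}): QueryTarget {
  const normalized = sql.trim().toLowerCase();
  const isRead = READ_ONLY_PREFIXES.some((prefix) => normalized.startsWith(prefix));
  // SELECT ... FOR UPDATE / FOR SHARE takes row locks, so it belongs on the primary.
  const takesLocks = /\bfor\s+(update|share)\b/.test(normalized);

  if (!isRead || takesLocks || options.inTransaction || options.requiresFreshRead) {
    return "primary";
  }
  return "replica";
}
```

Anything the function cannot clearly classify as a plain read falls through to the primary, which is the safe default: a misrouted read only costs some primary capacity, while a misrouted write fails.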
Caching Strategies
Caching is the highest-ROI tool for performance and cost — but when managed poorly, it becomes the source of the hardest-to-debug bugs. A three-layer approach works well in practice:
- CDN cache: Static assets and public API responses for 5-60 minutes
- Application cache (Redis): User sessions, expensive DB query results, rate-limit counters
- Client cache: React Query on web, cached_network_image on Flutter
Three invalidation strategies are usually mixed: TTL (time-based), event-based (invalidate on DB change), and manual (admin flush). One rule: if you're not sure the cache is working, there is no cache. Instrument it — measure hit/miss ratios.
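The three invalidation strategies and the hit/miss measurement can be sketched together in one small cache-aside wrapper. This is a minimal illustration, assuming an in-memory `Map` as a stand-in for Redis and a synchronous loader to keep it short; `InstrumentedTtlCache` and its method names are hypothetical, not part of any library.

```typescript
// In-memory stand-in for an application cache such as Redis.
interface CacheEntry<V> {
  value: V;
  expiresAt: number; // epoch milliseconds
}

class InstrumentedTtlCache<V> {
  private entries = new Map<string, CacheEntry<V>>();
  private hits = 0;
  private misses = 0;
  private ttlMs: number;
  private now: () => number;

  // The clock is injectable so expiry can be tested without sleeping.
  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Cache-aside read: return the cached value, or load it and store it.
  // TTL invalidation happens here: expired entries count as misses.
  getOrLoad(key: string, load: () => V): V {
    const entry = this.entries.get(key);
    if (entry !== undefined && entry.expiresAt > this.now()) {
      this.hits += 1;
      return entry.value;
    }
    this.misses += 1;
    const value = load();
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
    return value;
  }

  // Event-based invalidation: call this when the underlying record changes.
  invalidate(key: string): void {
    this.entries.delete(key);
  }

  // Manual invalidation: an admin-style full flush.
  flush(): void {
    this.entries.clear();
  }

  // Fraction of lookups served from cache; 0 when nothing was looked up yet.
  hitRatio(): number {
    const total = this.hits + this.misses;
    return total === 0 ? 0 : this.hits / total;
  }
}
```

Exporting `hitRatio()` to a metrics dashboard is what turns "we have a cache" into something verifiable: a ratio near zero means the keys or TTLs are wrong, even if the code path is running.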
Service Decomposition: When and From Where?
Jumping to microservices early is one of the most common startup mistakes. The decomposition call isn't technical — it's organizational, triggered when two teams collide on the same codebase. Practical thresholds:
- 10+ developers on the same monolith: Deploy queues and merge conflicts force decomposition
- Divergent SLA requirements: Payment at 99.99%, notifications fine at 99.5% — separate them
- Different scaling profiles: Image processing is CPU-bound, chat is memory-bound — different infra profiles need separate services
The first decomposition is usually lifting side flows like "notifications + email + scheduler" out. Decouple edge services before splitting the core product.
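The divergent SLA threshold is easier to judge once the percentages are converted into downtime. The arithmetic below is a small sketch; `allowedDowntimeMinutes` is a hypothetical helper and the 30-day window is an assumption chosen for illustration.

```typescript
// Allowed downtime, in minutes, for an availability target over a window of days.
function allowedDowntimeMinutes(availabilityPercent: number, windowDays: number = 30): number {
  if (availabilityPercent < 0 || availabilityPercent > 100) {
    throw new RangeError("availabilityPercent must be between 0 and 100");
  }
  const totalMinutes = windowDays * 24 * 60;
  return totalMinutes * (1 - availabilityPercent / 100);
}

// Payment at 99.99%: roughly 4.3 minutes per 30 days.
const paymentBudgetMinutes = allowedDowntimeMinutes(99.99);

// Notifications at 99.5%: roughly 216 minutes (3.6 hours) per 30 days.
const notificationBudgetMinutes = allowedDowntimeMinutes(99.5);
```

The two budgets differ by a factor of about 50. While both flows share one deployable, every notification bug or risky deploy spends from the payment budget, which is the practical argument for separating them.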
Observability: Removing the Blindfold
The fourth scalability principle is knowing what slowed down. Observability stands on three legs:
- Logs: Structured (JSON) logs in one place — Datadog, Grafana Loki, or self-hosted ELK
- Metrics: Request count, latency (p50/p95/p99), error rate. Prometheus + Grafana or cloud-native alternatives
- Tracing: Distributed tracing (OpenTelemetry) to show which step is slow when a request crosses multiple services
Sentry + a simple Grafana dashboard is enough at first. But once you cross ~50,000 users, APM becomes mandatory — New Relic, Datadog APM, or Grafana Tempo.
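To make the p50/p95/p99 latency figures concrete, here is a minimal sketch of how they can be derived from raw request timings using the nearest-rank method. In practice Prometheus or an APM tool computes these (usually from histograms, not raw samples); `percentile` and `summarizeLatency` are hypothetical helper names used only for illustration.

```typescript
interface LatencySummary {
  p50: number;
  p95: number;
  p99: number;
}

// Nearest-rank percentile: the smallest sample such that at least
// p percent of all samples are less than or equal to it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) {
    throw new Error("cannot compute a percentile of zero samples");
  }
  if (p <= 0 || p > 100) {
    throw new RangeError("p must be in the range (0, 100]");
  }
  // Copy before sorting so the caller's array is not mutated.
  const sorted = [...samples].sort((a, b) => a - b);
  // Multiply before dividing to avoid floating-point error in the rank.
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[rank - 1] as number;
}

function summarizeLatency(samplesMs: number[]): LatencySummary {
  return {
    p50: percentile(samplesMs, 50),
    p95: percentile(samplesMs, 95),
    p99: percentile(samplesMs, 99),
  };
}
```

The reason to track p95 and p99 rather than the average is that a handful of very slow requests barely moves the mean but shows up immediately in the tail, and the tail is what individual users experience.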
Scalability on the Mobile Side
Less-discussed but equally critical: scalability inside the mobile codebase. To keep Flutter or React Native maintainable into year three:
- Feature-first folder structure: features/auth/, features/chat/ modular organization, not a monolithic screens/ directory
- Design system: A component library cuts new-screen build time from 4-5 hours to 1
- Offline-first architecture: Redux/Riverpod persistence + sync. Users should be able to operate offline
- Performance budget: Set CI limits for app size, cold start time, and FPS metrics
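A performance budget in CI can be as simple as comparing measured numbers against fixed limits and failing the build when any limit is broken. The sketch below assumes the measurements are already collected by an earlier CI step; `BuildMetrics`, `EXAMPLE_BUDGET`, and `findBudgetViolations` are hypothetical names, and the limit values are placeholders rather than recommendations.

```typescript
interface BuildMetrics {
  appSizeMb: number;   // size of the release artifact
  coldStartMs: number; // time from launch to first frame
  averageFps: number;  // average frame rate in a scripted scroll test
}

// Placeholder limits; derive real ones from your own baseline measurements.
const EXAMPLE_BUDGET: BuildMetrics = {
  appSizeMb: 40,
  coldStartMs: 2000,
  averageFps: 55,
};

// Returns one human-readable message per broken limit; empty means the build passes.
function findBudgetViolations(
  measured: BuildMetrics,
  budget: BuildMetrics = EXAMPLE_BUDGET
): string[] {
  const violations: string[] = [];
  if (measured.appSizeMb > budget.appSizeMb) {
    violations.push(`app size ${measured.appSizeMb} MB exceeds ${budget.appSizeMb} MB`);
  }
  if (measured.coldStartMs > budget.coldStartMs) {
    violations.push(`cold start ${measured.coldStartMs} ms exceeds ${budget.coldStartMs} ms`);
  }
  // FPS is a floor, not a ceiling: lower is worse.
  if (measured.averageFps < budget.averageFps) {
    violations.push(`average FPS ${measured.averageFps} is below ${budget.averageFps}`);
  }
  return violations;
}
```

A CI step would print the returned messages and exit with a non-zero status when the list is not empty, so a regression is caught in the pull request that introduced it.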
A Real Scaling Scenario
A B2C mobile + web SaaS went from 500 users to 50,000 in 18 months. Key inflection points:
- 2,000 users: Added Redis cache + queue system (BullMQ); moved email sending from sync to async
- 10,000 users: Read replica + image CDN; offline cache on mobile
- 30,000 users: Notifications + search extracted as separate microservices
- 50,000 users: APM investment, performance budget in CI, database sharding plan drafted
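The sync-to-async email move at the 2,000-user mark follows a common pattern: the request handler only enqueues a job, and a separate worker sends it with retries. The sketch below illustrates that shape with an in-memory array instead of BullMQ and Redis; `InMemoryEmailQueue`, `EmailJob`, and `handleSignup` are hypothetical names and do not reflect the BullMQ API.

```typescript
interface EmailJob {
  to: string;
  template: string;
  attempts: number;
}

class InMemoryEmailQueue {
  private jobs: EmailJob[] = [];
  private maxAttempts: number;

  constructor(maxAttempts: number = 3) {
    this.maxAttempts = maxAttempts;
  }

  // Called from the request path: no call to the mail provider happens here.
  enqueue(to: string, template: string): void {
    this.jobs.push({ to, template, attempts: 0 });
  }

  pendingCount(): number {
    return this.jobs.length;
  }

  // Called from a worker loop: runs one job and re-queues it on failure
  // until maxAttempts is reached. Returns false when the queue is empty.
  processNext(send: (job: EmailJob) => void): boolean {
    const job = this.jobs.shift();
    if (job === undefined) {
      return false;
    }
    job.attempts += 1;
    try {
      send(job);
    } catch {
      if (job.attempts < this.maxAttempts) {
        this.jobs.push(job);
      }
    }
    return true;
  }
}

// The request handler responds as soon as the job is queued.
function handleSignup(queue: InMemoryEmailQueue, email: string): { status: number } {
  queue.enqueue(email, "welcome");
  return { status: 201 };
}
```

With this split, a slow or failing mail provider delays the email but no longer delays or fails the signup response. A production queue adds what this sketch leaves out: persistence across restarts, backoff between retries, and a dead-letter list for jobs that exhaust their attempts.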
Tolga Ege - Senior Mobile & Web Developer, Founder of CreativeCode