Caching
Storing frequently used data in fast storage to reduce latency and load
Overview
Caching stores copies of frequently accessed data in a faster, closer location so repeated requests avoid slow work. It matters because it cuts latency for repeated reads and reduces load on databases and services. The trade-off is that cached data can become stale, so each entry needs an expiry or invalidation rule.
Syntax / Usage
A cache sits between a requester and a slower data source. On a cache hit, the data is returned from the cache without touching the source; on a miss, the system fetches from the source, stores the result in the cache, and returns it.
Request --> [ Cache ] --hit--> return fast
                |
                | miss
                v
          [ Database ] --> store in cache --> return
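The hit and miss paths above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the class name CacheAside, the slow_lookup function, and the plain dict used as the cache are all assumptions made for the example.

```python
from typing import Callable, Dict


class CacheAside:
    """Check the cache first; fall back to the source on a miss."""

    def __init__(self, load_from_source: Callable[[str], str]) -> None:
        self._load = load_from_source      # hypothetical slow lookup (e.g. a database query)
        self._store: Dict[str, str] = {}   # in-memory stand-in for the cache
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> str:
        if key in self._store:             # hit: answer from the cache
            self.hits += 1
            return self._store[key]
        self.misses += 1                   # miss: fetch from the source...
        value = self._load(key)
        self._store[key] = value           # ...and store it for next time
        return value


def slow_lookup(key: str) -> str:
    # Stand-in for the database; a real one would do I/O here.
    return f"row-for-{key}"


cache = CacheAside(slow_lookup)
cache.get("product:1")   # miss: goes to slow_lookup
cache.get("product:1")   # hit: served from the dict
print(cache.hits, cache.misses)
```

Entries in this sketch never expire and the dict grows without bound, which is what TTLs and eviction policies address.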
Key concepts: TTL (time-to-live) controls how long an entry stays valid before it must be refetched, and an eviction policy such as LRU (least recently used) decides which entries to remove when the cache is full.
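To make TTL and LRU concrete, here is one possible sketch that combines both on top of Python's collections.OrderedDict. The class name TTLLRUCache, its parameters, and the injectable clock are assumptions for illustration; real caches such as Redis implement expiry and eviction differently.

```python
import time
from collections import OrderedDict
from typing import Any, Callable, Optional


class TTLLRUCache:
    """Bounded cache: entries expire after ttl_seconds, and the least
    recently used entry is evicted when capacity is exceeded."""

    def __init__(self, capacity: int, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity
        self.ttl = ttl_seconds
        self._clock = clock            # injectable so expiry can be tested without sleeping
        self._data = OrderedDict()     # key -> (expires_at, value), oldest use first

    def get(self, key: str) -> Optional[Any]:
        entry = self._data.get(key)
        if entry is None:
            return None                # miss
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._data[key]        # TTL elapsed: treat as a miss
            return None
        self._data.move_to_end(key)    # mark as most recently used
        return value

    def put(self, key: str, value: Any) -> None:
        self._data[key] = (self._clock() + self.ttl, value)
        self._data.move_to_end(key)
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)   # evict the least recently used entry


now = [0.0]                            # fake clock for the demo
demo = TTLLRUCache(capacity=2, ttl_seconds=60, clock=lambda: now[0])
demo.put("a", 1)
demo.put("b", 2)
demo.get("a")                          # touch "a", so "b" is now least recently used
demo.put("c", 3)                       # over capacity: "b" is evicted
now[0] = 61.0                          # move past the TTL: "a" and "c" are now expired
```

Expired entries here are only removed when they are next read; a cache that must reclaim memory promptly would also need a background sweep.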
Examples
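A product page reads from Redis first; on a miss it queries Postgres, then writes the result to Redis with a 60-second TTL. If the page is requested much more often than once a minute, most requests are served from memory and Postgres sees roughly one query per product per minute.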
An API sets Cache-Control: max-age=300 on responses so browsers and CDNs reuse them for five minutes, cutting server traffic for popular endpoints.
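As a rough illustration of how a client-side cache might act on that header, the sketch below extracts max-age and decides whether a stored response is still fresh. The function names parse_max_age and is_fresh are hypothetical, and the logic is deliberately simplified: it ignores other directives such as no-store, no-cache, and s-maxage, as well as the Expires header.

```python
from typing import Optional


def parse_max_age(cache_control: str) -> Optional[int]:
    """Return the max-age value in seconds, or None if the directive is absent."""
    for directive in cache_control.split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age" and value.strip().isdigit():
            return int(value.strip())
    return None


def is_fresh(cache_control: str, age_seconds: int) -> bool:
    """A stored response is fresh while its age is below max-age."""
    max_age = parse_max_age(cache_control)
    return max_age is not None and age_seconds < max_age


header = "max-age=300"
print(is_fresh(header, 120))   # stored two minutes ago: can be reused
print(is_fresh(header, 300))   # five minutes old: must be revalidated or refetched
```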
Common Mistakes
- Caching data that changes often and serving stale results
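- Never setting a TTL, so entries never expire and stale values linger until something evicts them
- Cache stampede: many concurrent misses for the same key (for example, when a hot entry expires) all hitting the database at once
- Forgetting to invalidate the cache after a write (write-through, which updates the cache as part of the write, helps)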
- Caching sensitive per-user data in a shared cache
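Two of the mistakes above, cache stampede and missing invalidation, can be addressed in code. The sketch below is one possible approach, not the only one: it uses a per-key lock so that concurrent misses for the same key trigger a single load, and exposes an invalidate method to call after a write. The names SingleFlightCache and slow_load are assumptions for illustration.

```python
import threading
import time
from collections import defaultdict
from typing import Callable, Dict, List


class SingleFlightCache:
    """Cache where concurrent misses for one key result in one load."""

    def __init__(self, load: Callable[[str], str]) -> None:
        self._load = load                                # hypothetical slow source lookup
        self._store: Dict[str, str] = {}
        self._key_locks = defaultdict(threading.Lock)    # one lock per key
        self._guard = threading.Lock()                   # protects _key_locks itself

    def get(self, key: str) -> str:
        value = self._store.get(key)
        if value is not None:
            return value                                 # hit: no locking needed
        with self._guard:
            lock = self._key_locks[key]
        with lock:
            # Re-check: another thread may have filled the entry while we waited.
            value = self._store.get(key)
            if value is None:
                value = self._load(key)
                self._store[key] = value
            return value

    def invalidate(self, key: str) -> None:
        """Call after writing to the source so the next read refetches."""
        self._store.pop(key, None)


calls: List[str] = []


def slow_load(key: str) -> str:
    calls.append(key)          # record each trip to the "database"
    time.sleep(0.05)           # simulate a slow query
    return f"row-for-{key}"


cache = SingleFlightCache(slow_load)
threads = [threading.Thread(target=cache.get, args=("product:1",)) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(calls))              # number of loads caused by the eight concurrent requests
```

This only coordinates threads within one process. Across several application servers, the same idea needs a shared lock or a technique such as staggered TTLs, which this sketch does not cover.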
See Also
system-design-scalability system-design-databases system-design-apis