Many applications today are data-intensive, as opposed to compute-intensive. Raw CPU power is rarely a limiting factor of these applications - bigger problems are usually the amount of data, the complexity of data, and the speed at which it is changing. A data-intensive application is typically built from standard building blocks that provide commonly needed functionality. For example, many applications need to:

  • Store data so that they, or another application, can find it again later (databases)
  • Remember the result of an expensive operation, to speed up reads (caches)
  • Allow users to search data by keywords or filter it in various ways (search indexes)
  • Send a message to another process, to be handled asynchronously (stream processing)
  • Periodically crunch a large amount of accumulated data (batch processing)

When building an application, most engineers wouldn’t dream of writing a new data storage engine from scratch, because databases are perfectly good tool for the job.

There are many database systems with different characteristics, because different applications have different requirements. There are various approaches to caching, several ways of building search indexes, and so on.