One size does not fit all
Remember when the only database in town was relational? Things have changed in 20 years. Today, the venerable old relational database management system (RDBMS) still presides, but the market is also filled with new database types designed for different kinds of jobs.

Database concepts predate the RDBMS, in the form of hierarchical and network databases, and even further back in the form of punched card collections. But it was really E. F. Codd's relational concept, published in 1970, that ushered in the era of modern database computing. His concept of tables and rows separated the logical data model neatly from the underlying physical storage, and it led to a flurry of database engines from the mid-seventies onwards.

The RDBMS was perfectly adequate for most applications for decades, but the seeds of change had already been planted. Just a year before Codd published his first paper, the Stanford Research Institute and UCLA exchanged the first-ever internet message. That exchange would eventually change everything, seeding an internet that would reshape computing and expand applications' scale and scope.

Over the last few decades, the internet ushered in new data and speed requirements that made relational systems less appropriate for many applications. Today, many applications need to work with terabytes of data while supporting millions of global users. Traditional relational systems struggle to cope with that scale while still maintaining performance.

Hyperscale operators were among the first to notice as relational products strained at the seams. Amazon, which had relied on Oracle's RDBMS in its early days, began noticing the strain. Amazon's relational databases began hitting their limits in 2004 as the ecommerce giant's transaction volumes ballooned, explains Edin Zulich, senior solutions architect manager and NoSQL database specialist at AWS.

"A closer look at the data access patterns and how those databases were used revealed that in most cases, we used a fairly straightforward key-value access pattern," he recalls. "This gave rise to the idea that maybe we can look into creating a more scalable database that would work well for these use cases."

The company began developing its own key-value database, Dynamo, that year, as Oracle started running out of steam. It published a paper documenting its experiences in 2007, then continued to refine the database internally before releasing it as the Amazon DynamoDB key-value database service in 2012.

Focusing on a key-value structure enabled AWS to break away from the rigid structure on which relational tables are based. "Data has weight," Zulich says, adding that it's harder to scale out systems organized around rigid tables. Why take the performance hit of complex joins when you could just store the records that you need together?

"It's basically about organizing data in a way that's efficient for your read and write patterns," he explains. "If I'm storing shopping cart data, I'll always make sure that a given cart's data is in the same location. That way, I don't have to send a request to several different nodes and then put that shopping cart together. That is what would happen if you did it with a relational database."

With a range of databases suited for different use cases, there's no reason for companies to force a single database to do every job, Zulich says.
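To make Zulich's shopping-cart example concrete, here is a minimal sketch of that key-value access pattern using DynamoDB's Python SDK (boto3). The table name, key schema, and attribute names are illustrative assumptions, not details from the article: the idea is simply that every line item in a cart shares one partition key, so the whole cart can be read back with a single request instead of a join.

```python
# Hypothetical DynamoDB table "carts" with partition key cart_id and sort
# key sku (both assumptions for illustration). Because every line item in a
# cart shares the same partition key, the entire cart is co-located and can
# be fetched with a single query -- no joins, no fan-out to multiple nodes.
import boto3
from boto3.dynamodb.conditions import Key

dynamodb = boto3.resource("dynamodb")
carts = dynamodb.Table("carts")  # table name is a placeholder


def add_to_cart(cart_id: str, sku: str, qty: int) -> None:
    """Write one line item; it lands in the same partition as the rest of the cart."""
    carts.put_item(Item={"cart_id": cart_id, "sku": sku, "qty": qty})


def load_cart(cart_id: str) -> list[dict]:
    """Fetch the entire cart with one request to one partition."""
    resp = carts.query(KeyConditionExpression=Key("cart_id").eq(cart_id))
    return resp["Items"]
```

In a normalized relational design, the same cart would typically be spread across several tables and reassembled with a join at read time; here the partition key does that grouping up front, which is what lets the data scale out across nodes without sacrificing read performance.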