Book Review #2 — Designing Data-Intensive Applications (part 1)

About the Book

Image created by the author using Canva

My Assessment

Image created by the author using Canva

What to Expect

This is a very well-written book that is popular among software engineers. It was first released in 2017, and I am re-reading it now, having started a new job after a pandemic break. It feels like a good place to revisit key concepts. The book is organized into three parts:

  1. Foundations of Data Systems

  2. Distributed Data

  3. Derived Data

Hence, my review will also be in three parts. In Foundations of Data Systems, the author first describes what makes applications reliable, scalable, and maintainable. The data models chapter compares traditional relational databases with NoSQL alternatives such as document and graph databases. Next comes storage and retrieval, which is my personal favorite. As a data engineer, I find it really helps me understand how data is laid out on disk and read back, and the role that indexes built on B-trees and LSM-trees play in both. Finally, the encoding chapter discusses JSON, XML, and newer binary formats such as Thrift and Avro in detail, with examples, including how schemas evolve over time. Part one focuses on the fundamentals of data systems that apply even when the data is stored on a single machine.
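To give a flavor of the storage and retrieval material, here is a minimal sketch of one of the simplest ideas in that area: an append-only log file with an in-memory hash index that remembers where the latest value for each key lives. This is my own toy illustration, not code from the book. The class name `AppendOnlyStore`, the comma-separated record format, and the file name are all assumptions made for the example, and it assumes keys contain no commas and neither keys nor values contain newlines.

```python
import os
import tempfile


class AppendOnlyStore:
    """Toy key-value store: an append-only log file plus an in-memory hash index.

    Hypothetical illustration only. Assumes keys contain no commas and
    neither keys nor values contain newlines.
    """

    def __init__(self, path):
        self.path = path
        self.index = {}  # key -> byte offset of the latest record for that key
        open(self.path, "ab").close()  # create the log file if it is missing

    def set(self, key, value):
        record = f"{key},{value}\n".encode("utf-8")
        with open(self.path, "ab") as log:
            # Writes always go to the end of the file; remember where this one starts.
            offset = log.seek(0, os.SEEK_END)
            log.write(record)
        self.index[key] = offset

    def get(self, key):
        offset = self.index.get(key)
        if offset is None:
            return None
        with open(self.path, "rb") as log:
            # Jump straight to the record instead of scanning the whole log.
            log.seek(offset)
            line = log.readline().decode("utf-8").rstrip("\n")
        _, value = line.split(",", 1)
        return value


# Usage: older records stay in the log, but the index points at the newest one.
path = os.path.join(tempfile.mkdtemp(), "data.log")
store = AppendOnlyStore(path)
store.set("user:1", "Alice")
store.set("user:2", "Bob")
store.set("user:1", "Alicia")
print(store.get("user:1"))  # Alicia
```

Writes are cheap because they only append, and reads are cheap because the index gives the exact offset. The obvious costs are that the log grows forever and the index must fit in memory, which is roughly where ideas like compaction, LSM-trees, and B-trees come in.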

In my view, this book does a good job of explaining the concepts and principles needed to design and maintain data-driven applications. It’s a proper technical book that assumes at least mid-level engineering expertise. What I love most about it is that it helps you identify gaps in your understanding and provides references for further reading. You can explore, read, practice, and solidify whichever principles you feel you still need to wrap your head around.

What Not to Expect

This book is not a tutorial for any specific language or tool. The author says so explicitly at the beginning, which sets the right expectations. Don’t expect to become a coding-level expert just by reading it; its purpose is to give you a deeper understanding of the components that must work together in a well-designed, data-driven system. It does a good job of explaining the critical considerations in each area that need to be thought through for an application to function properly.

My Final Thoughts

This book is quite popular, often appearing in ‘top N’ lists of recommended reads. I wholeheartedly endorse this recommendation. However, be aware that it is a substantial read that requires discipline to complete. Personally, it took me two months to finish the first part. With over a decade of experience in the industry, nothing I’ve encountered in it so far is truly ‘unknown’ to me; I might have come across the concepts in my reading or in project discussions. What is new to me, though, is how this book either reaffirms my understanding or corrects my previous interpretations. In either case, it’s a bonus, and the book serves as a valuable reference for the future.

I am considering writing a few articles based on the most engrossing topics from the first part. If you like this idea, please consider following to stay connected.

Happy reading!

Support Aruna Das by becoming a sponsor. Any amount is appreciated!