Querying Beyond Tables: NoSQL’s Distributed Adventure — 3

I am still working through Designing Data-Intensive Applications. If you have been following along on this journey so far, I want to share that I have started to find a rhythm and am enjoying the process. The deeper I go into the concepts, the more they make sense to me in a way they never did before.

In this article I will try to explain my understanding of NoSQL. I remember first hearing the term around a decade ago. My brain went: ah, from now on we will not be querying databases anymore; humanity has figured out how to process data without querying at all. That imaginary world without queries shattered pretty quickly.

So let’s address the term first. According to internet folklore, an engineer in London who wanted to organize a tech meetup in San Francisco in 2009 used the Twitter hashtag #nosql for the event, and the term has stuck ever since. This story reminds me of how ‘Hadoop’ was named after a toy elephant that belonged to its creator’s son. I am spending this much time on the term to advise you: if you’re someone like me who tries to find meaning in names, don’t. Because, at least in this case, it will only confuse you further.

Now that we’ve discussed the term NoSQL, let’s delve deeper into its history. Understanding the origins often sheds light on its significance. Here’s a brief timeline of how NoSQL databases evolved:

Image created by Author using Canva

1980

Let’s take some time to understand each of these. In the early 1980s, the industry witnessed the rise of relational databases, which would go on to dominate the field for two decades. There were several reasons for their widespread adoption and the industry’s continued investment in them. Below, I highlight some properties that made relational databases so attractive:

  1. SQL (Structured Query Language): SQL provided a somewhat standard and easy-to-learn syntax, akin to English. It was simpler to grasp compared to other programming languages used for querying databases.

  2. Consistency: Over the years, relational databases invested heavily in maintaining transaction consistency. This commitment to consistency lent them greater credibility.

  3. Integration: Relational databases made it easy to integrate systems or applications: several of them could share one database and combine its tables with joins.
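To make the three properties above concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `customers` and `orders` tables and the helper function names are assumptions made up for this illustration. It shows a readable SQL query, a join across tables, and a transaction that is applied fully or not at all.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two small, hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customers(id),
            price REAL NOT NULL CHECK (price >= 0)
        );
        INSERT INTO customers VALUES (1, 'Ada'), (2, 'Linus');
        INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 15.0), (12, 2, 40.0);
    """)
    return conn


def total_per_customer(conn):
    """SQL reads close to English: join two tables and sum prices per customer."""
    rows = conn.execute("""
        SELECT c.name, SUM(o.price)
        FROM customers AS c
        JOIN orders AS o ON o.customer_id = c.id
        GROUP BY c.name
        ORDER BY c.name
    """).fetchall()
    return dict(rows)


def add_orders_atomically(conn, new_orders):
    """Insert every (customer_id, price) pair, or none of them."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.executemany(
                "INSERT INTO orders (customer_id, price) VALUES (?, ?)",
                new_orders,
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

If one row in a batch violates the CHECK constraint (for example a negative price), the whole batch is rolled back and the totals stay unchanged.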

1990

Relational databases did have the problem of scattering a single piece of information across several tables, which added the overhead of object-relational mapping (ORM) frameworks. To address this, object databases emerged. They aimed to reduce the overhead of mapping individual components from memory to persistent storage. However, despite their intentions, object databases were not successful and eventually faded away.

Below are some of the reasons why object databases didn’t become popular:
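A small sketch can show what that mapping overhead looks like. The `Order` and `LineItem` classes and the two-table layout below are hypothetical, chosen only for illustration: one object in memory has to be taken apart into rows for separate tables, then stitched back together when it is read.

```python
from dataclasses import dataclass, field


@dataclass
class LineItem:
    product: str
    quantity: int
    price: float


@dataclass
class Order:
    order_id: int
    customer: str
    items: list[LineItem] = field(default_factory=list)


def to_rows(order):
    """Split one in-memory object into rows for two hypothetical tables."""
    order_row = (order.order_id, order.customer)
    item_rows = [
        (order.order_id, item.product, item.quantity, item.price)
        for item in order.items
    ]
    return order_row, item_rows


def from_rows(order_row, item_rows):
    """Reassemble the object from its scattered rows (the work an ORM does)."""
    order_id, customer = order_row
    items = [
        LineItem(product, quantity, price)
        for (row_order_id, product, quantity, price) in item_rows
        if row_order_id == order_id
    ]
    return Order(order_id, customer, items)
```

Every new nested field means touching both `to_rows` and `from_rows`, which is the kind of bookkeeping mapping frameworks were built to hide.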

1. Mismatch with Object-Oriented Programming:

  • Object databases aimed to bridge the gap between object-oriented programming (OOP) and data persistence.

  • However, the tight coupling between objects in memory and their representation in the database led to challenges:

a — Complexity: Mapping complex object graphs to database tables was intricate and often required custom code.

b — Inheritance and Polymorphism: Handling inheritance hierarchies and polymorphic relationships was cumbersome.

c — Schema Evolution: Changes to object classes (adding/removing fields) impacted the database schema.

2. Lack of Standardization:

  • Unlike SQL, which had a well-defined standard (SQL-92 and later versions), object databases lacked a unified standard.

  • This lack of standardization hindered interoperability and made it difficult for developers to switch between different object databases.

3. Performance and Scalability Issues:

  • Object databases struggled with performance, especially when dealing with large datasets.

  • Complex object graphs led to inefficient queries, and indexing mechanisms were less effective than in relational databases.

  • Horizontal scaling (sharding) was challenging due to the tight coupling of objects.

4. Persistence Ignorance:

  • Object-relational impedance mismatch: Object-oriented models and relational databases have different paradigms.

  • Developers often preferred to keep their domain objects “persistence-agnostic” (persistence ignorance).

  • Object databases forced developers to embed persistence logic within the domain objects, violating this principle.

5. Market Dominance of Relational Databases:

  • By the time object databases gained traction, relational databases (RDBMS) were already entrenched.

  • RDBMS had decades of maturity, robust tooling, and widespread adoption.

  • Organizations hesitated to switch from proven RDBMS solutions to relatively untested object databases.

6. Emergence of NoSQL Databases:

  • As the need for handling diverse data types and massive scalability grew, NoSQL databases emerged.

  • NoSQL databases addressed specific use cases (e.g., document stores, key-value stores, column-family databases) more effectively.

  • Object databases lost relevance in this evolving landscape.

7. Legacy Systems and Vendor Support:

  • Some organizations had invested heavily in relational databases and were reluctant to migrate.

  • Object database vendors faced challenges in providing robust support, maintenance, and upgrades.

2010

In the years leading up to 2010, companies like Amazon, Facebook, and Google grappled with the challenges posed by their massive internet traffic. Their unique business requirements demanded innovative solutions. Consequently, systems such as Google’s Bigtable, Amazon’s Dynamo, and Facebook’s Cassandra were born. These systems were initially built for internal use. Google and Amazon later described theirs in published papers, and Facebook released Cassandra as open source. Together they shaped the open-source projects that followed, such as HBase, which is modeled on Bigtable.

During this period, the concept of NoSQL gained prominence. NoSQL databases emerged to address issues related to big data and queries distributed across many servers. Running a relational database across many nodes while ensuring data consistency and integrity was a formidable task, then and now. NoSQL introduced different data models that changed how data is stored and retrieved, simplified persistence, and often required less development effort.

The arrival of NoSQL was a pivotal moment in the world of data management, enabling efficient storage and retrieval across distributed systems.

The term NoSQL is commonly read as “not only SQL” or “non-relational”. It refers to databases that store data in formats other than traditional relational tables. Unlike relational databases (RDBMS), which rely on structured schemas, NoSQL databases provide more flexibility in handling unstructured, semi-structured, and polymorphic data.

NoSQL databases emerged in the late 2000s, as storage costs fell significantly. With cheap storage, the need for rigid, complex data models designed mainly to avoid data duplication faded. NoSQL databases were optimized for developer productivity instead, letting developers handle diverse data shapes efficiently. Agile development practices and the rise of cloud computing further fueled their adoption. NoSQL databases address some limitations of RDBMS, especially around diverse data types and scalability, and they let developers iterate quickly and adapt to changing requirements.
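To give a feel for what spreading data across servers involves, here is a deliberately naive sketch of hash partitioning. The node names and function names are assumptions for illustration, and real systems typically use more careful schemes such as consistent hashing or range partitioning. The point is only that a self-contained record can be placed on a node using nothing but its key.

```python
import hashlib


def node_for_key(key, nodes):
    """Pick a node for a key by hashing it (naive modulo placement)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


def partition(records, nodes):
    """Group a dict of key -> record by the node each key hashes to."""
    placement = {node: {} for node in nodes}
    for key, value in records.items():
        placement[node_for_key(key, nodes)][key] = value
    return placement
```

Because each record travels as one unit, a lookup by key only needs to contact one node. A join across records on different nodes is the part that becomes expensive.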

NoSQL databases don’t enforce fixed schemas, so developers can adapt data structures without predefined tables. You will often hear that NoSQL databases are schemaless, but there is no such thing as truly schemaless. The schema is implicit: your code still expects a field such as order["price"] to exist. NoSQL databases scale out by adding more nodes to a cluster, accommodating large amounts of data and high user loads. The data model allows for efficient queries, especially when dealing with unstructured or semi-structured data. NoSQL databases simplify development by providing flexibility and adaptability.
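Here is a minimal sketch of that implicit schema, using plain Python dictionaries as stand-ins for documents. The field names and helper functions are assumptions for illustration. The database would happily store all three documents, but the reading code still depends on `price` being present.

```python
def order_total(order):
    """The reader assumes 'price' exists: that assumption is the implicit schema."""
    return order["price"] * order.get("quantity", 1)


# Three documents with different shapes, as a schema-free store would allow.
orders = [
    {"id": 1, "price": 9.5, "quantity": 2},
    {"id": 2, "price": 4.0},   # no quantity: the code defaults it to 1
    {"id": 3, "cost": 7.0},    # field named differently: the code breaks
]


def safe_totals(docs):
    """Compute totals, recording None where a document violates the implicit schema."""
    totals = {}
    for doc in docs:
        try:
            totals[doc["id"]] = order_total(doc)
        except KeyError:
            totals[doc["id"]] = None
    return totals
```

The schema did not go away. It moved from the database into the application code, which now has to cope with every shape of document ever written.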

Choose NoSQL when you need flexibility, scalability, and efficient handling of unstructured data. Use cases include social networks, real-time analytics, content management, and IoT applications.

NoSQL doesn’t mean abandoning SQL altogether; it complements traditional databases. NoSQL databases are not a one-size-fits-all solution; choose based on your specific requirements.

I’m diving deeper into Designing Data-Intensive Applications and will be sharing insights on specific whitepapers, concepts, and design patterns that capture my attention. If you’d like to join me on this exploration, consider following me to receive automatic notifications about my next article!

Did you find this article valuable?

Support Aruna Das by becoming a sponsor. Any amount is appreciated!