STUDY MATERIAL
MODULE 2: NOSQL DATABASE
DISTRIBUTION MODELS
4.1. Single Server
The simplest distribution option is no distribution at all: run the database on a single server. This suits graph databases, which typically work more efficiently when run on a single server.
Similarly, if your application primarily involves working with aggregates (e.g.,
collections of related data that are retrieved and processed together), then using a
single-server document or key-value store can be a simpler and effective solution. It
reduces the complexity of managing distributed systems and eases the workload for
application developers.
Although more complex distribution strategies are discussed throughout this
chapter, single-server setups are often preferable. If the application's requirements
allow it, avoiding data distribution is the first choice: it minimizes potential issues
and keeps the system simpler.
4.2. Sharding
Sharding is a technique used to achieve horizontal scalability by distributing
different parts of a dataset across multiple servers (or nodes). This method can greatly
enhance performance, particularly for systems where different users are accessing
different parts of the data. In an ideal scenario, each user interacts with a single server
node, which handles both reading and writing. This balances the load evenly between
the servers, so if there are ten servers, each would handle approximately 10% of the
overall load.
Figure 4.1. Sharding puts different data on separate nodes, each of which does its own
reads and writes.
However, the ideal scenario is rare. To approach it, you need to ensure that data
frequently accessed together is stored on the same node. This is where aggregate
orientation becomes useful. Aggregates are collections of data designed to be
accessed together, making them a natural unit for distribution across different servers.
When deciding how to arrange data on nodes, several factors can improve
performance:
1. Physical location-based distribution: If data is primarily accessed from specific regions,
placing the data near the users’ locations can reduce access time. For example, orders
from Boston customers could be stored in a data center on the U.S. East Coast.
2. Even load distribution: Aggregates should be spread evenly across nodes to ensure that
no single node is overloaded. This distribution might need adjustments over time based on
changes in data access patterns.
3. Sequential reading optimization: If certain data is accessed sequentially, grouping it
together can improve efficiency. For instance, organizing web pages by reversed domain
names allows multiple pages to be processed together.
Historically, manual sharding was done by embedding logic into the application,
such as assigning customers whose last names begin with A–D to one shard, and so
on. However, this approach complicates programming and requires significant
changes when data needs rebalancing across shards. To simplify this, many modern
NoSQL databases offer auto-sharding, where the database itself handles data
allocation to shards and directs queries to the appropriate shard.
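To make the contrast concrete, here is a minimal Python sketch of the two placement schemes just described. The shard count, key format, and name ranges are illustrative assumptions, not the behavior of any particular database.

import hashlib

NUM_SHARDS = 4  # illustrative; real deployments choose this per cluster

def shard_for(key: str) -> int:
    # Hash-based placement, in the spirit of auto-sharding: the database
    # maps each aggregate's key to a shard and routes queries accordingly.
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_SHARDS

def shard_for_manual(last_name: str) -> int:
    # Range-based placement, like the manual "last names A-D" scheme:
    # simple to reason about, but rebalancing means rewriting these ranges.
    first = last_name[0].upper()
    if first <= "D":
        return 0
    if first <= "K":
        return 1
    if first <= "R":
        return 2
    return 3

print(shard_for("order:boston:1042"))  # routed by hash
print(shard_for_manual("Anderson"))    # 0, the A-D shard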
Sharding offers significant performance benefits, especially for write-heavy
applications. While replication and caching can improve read performance, sharding
is one of the few ways to horizontally scale writes. However, sharding does not
inherently improve resilience. If a node containing a shard fails, that shard’s data
becomes unavailable, affecting all users who rely on that data. Although only a
portion of the database is affected in a sharded system, partial data loss is still
problematic. In fact, sharded clusters often use less reliable machines, making node
failure more likely than in single-server systems, thus potentially reducing resilience.
Because sharding can be complex, it should not be implemented hastily. Some
databases are designed to use sharding from the start, and these should be run on
a cluster from the very beginning.

4.3. Master-Slave Replication
With master-slave replication, data is replicated across multiple nodes. One node
is designated the master, the authoritative source that handles all writes; the other
nodes are slaves, which synchronize with the master and can serve read requests.
This provides read resilience: slaves can continue serving read requests even if the
master fails.

Figure 4.2. Data is replicated from master to slaves. The master services all writes;
reads may come from either master or slaves.

A weakness of master-slave replication is the risk of losing updates if the master
fails. If updates haven't fully propagated to all slaves, data loss can occur, as the most
recent updates might not have been replicated. This is something to be aware of when
using replication, and later discussions on consistency can help address these
concerns.
4.4. Peer-to-Peer Replication
Master-slave replication helps with read scalability but does not help with write
scalability, and the master remains a single point of failure for writes. Peer-to-peer
replication addresses this by having no master: all replicas have equal weight, and
any node can accept reads and writes.

Figure 4.3. Peer-to-peer replication has all nodes applying reads and writes to all the
data.
However, the primary challenge with peer-to-peer replication is ensuring consistency.
When multiple nodes are allowed to write to the same dataset, the risk of write-write
conflicts arises. This occurs when two users or processes attempt to update the same
record on different nodes at the same time. Inconsistent read operations—where
different nodes may show slightly different versions of data due to delays in
replication—are less concerning because they are usually temporary and can be
resolved as the system synchronizes. But inconsistent writes are more problematic,
as they can result in permanent data inconsistencies that need to be resolved manually
or through automated conflict resolution mechanisms.
To handle these inconsistencies, two broad strategies are typically
used:
Coordination of writes: In this approach, before a write is confirmed, the replicas
communicate with each other to ensure consistency. They can agree on the validity of
a write, preventing conflicts from happening in the first place. This coordination
process can be achieved by using a majority vote among the nodes (i.e., as long as a
majority of the replicas agree on a write, it is considered valid). While this strategy
guarantees consistency similar to that of a master-slave model, it comes at the cost of
increased network traffic and slower write performance due to the time required for
coordination.
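As a rough illustration of this idea, the Python sketch below accepts a write only when a majority of three replicas confirm it. The in-memory replicas and the random acknowledgment simulation are assumptions for illustration only.

import random

replicas = [{}, {}, {}]  # three in-memory stand-ins for replica nodes

def coordinated_write(key, value):
    acks = 0
    for replica in replicas:
        if random.random() < 0.9:  # simulate a replica acknowledging
            replica[key] = value
            acks += 1
    # The write counts as valid only if a majority of replicas agreed.
    return acks > len(replicas) // 2

if coordinated_write("room:101", "booked"):
    print("write accepted by a majority")
else:
    print("write rejected; the client can retry")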
Handling inconsistent writes: At the opposite end of the spectrum, the system may
allow inconsistent writes to occur and then rely on conflict resolution policies to
merge or resolve these inconsistencies later. This strategy provides the full
performance benefit of peer-to-peer replication, as there is no need to coordinate
between nodes before writing, but it requires a solid plan for dealing with conflicting
data. Certain applications or domains can tolerate these inconsistencies if appropriate
merge strategies are in place.
Ultimately, peer-to-peer replication involves a trade-off between consistency and
availability. You can opt for stronger consistency by coordinating writes, but this
sacrifices some of the system's availability and performance. Alternatively, you can
prioritize availability and performance by allowing writes to any replica, but this
increases the likelihood of inconsistencies that need to be resolved. The choice
depends on the specific needs of the application and the importance of data
consistency versus system availability.
CONSISTENCY
When transitioning from a centralized relational database to a cluster-based
NoSQL database, one of the most significant changes is how you approach
consistency. In relational databases, the goal is often to maintain strong consistency,
ensuring that inconsistencies are minimized or avoided altogether. However, in the
world of NoSQL, terms like the CAP theorem and eventual consistency become
more prominent, and you’re quickly faced with decisions about the level of
consistency your system requires.
Consistency in databases comes in different forms, and the term "consistency" can
refer to a wide range of potential issues that may arise. So, it's important to first
understand the different types of consistency models that exist. Then, we can explore
why you might choose to relax consistency or even durability under certain
conditions, especially in NoSQL systems.
A classic example of update consistency involves two users, Martin and Pramod,
who both try to update the same record (say, a phone number) at the same time,
creating a write-write conflict. There are two broad approaches to handling it:
Pessimistic approach: This approach prevents conflicts from occurring, typically
by acquiring a write lock, so that only one writer (Martin in this case) can obtain the
lock and update the value at a time. Once Martin has the lock and updates the value,
Pramod would see the result of Martin's update before attempting his own change.
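A minimal sketch of the pessimistic approach, assuming a simple in-process lock as a stand-in for a database write lock:

import threading

phone = "555-0123"
lock = threading.Lock()

def update_phone(new_value):
    global phone
    with lock:  # a second writer blocks here until the lock is released
        phone = new_value

update_phone("555-0199")  # Martin's update
update_phone("555-0142")  # Pramod's update, applied only after Martin's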
Optimistic approach: This approach allows conflicts to occur but detects them
and resolves them afterward. For example, after Martin's update succeeds,
Pramod's update would fail because the system would check if the data had
changed before Pramod's update was applied. In this case, Pramod would receive
an error, prompting him to check the updated value and decide whether to retry
his update.
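A minimal sketch of the optimistic approach, assuming an illustrative Record class that carries a version number checked on every update:

class ConflictError(Exception):
    pass

class Record:
    def __init__(self, value):
        self.value = value
        self.version = 0

    def update(self, new_value, read_version):
        # Reject the write if the data changed since it was read.
        if self.version != read_version:
            raise ConflictError("data changed since it was read")
        self.value = new_value
        self.version += 1

record = Record("555-0123")
seen = record.version              # Martin and Pramod both read version 0
record.update("555-0199", seen)    # Martin's update succeeds
try:
    record.update("555-0142", seen)  # Pramod's update is detected as stale
except ConflictError:
    print("conflict: re-read the value and decide whether to retry")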
Both the pessimistic and optimistic approaches rely on consistent serialization of
updates. With a single server, this is easy since the server can only apply one update
at a time. However, in distributed systems with peer-to-peer replication, different
nodes might apply the updates in a different order, leading to inconsistent data across
nodes.
Another optimistic solution to handle write-write conflicts is to save both updates
and record the conflict. This is similar to how version control systems handle
conflicting commits. The user can then be asked to resolve the conflict manually (e.g.,
deciding which phone number format to keep), or the system might attempt to
automatically merge the updates (e.g., standardizing the phone number format).
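A sketch of this record-the-conflict strategy, assuming an illustrative store that keeps conflicting values side by side until someone resolves them:

store = {"phone": {"version": 1, "values": ["555-0123"]}}

def write(key, value, read_version):
    entry = store[key]
    if read_version == entry["version"]:
        entry["version"] += 1
        entry["values"] = [value]      # conflict-free update
    else:
        entry["values"].append(value)  # conflict: keep both versions

write("phone", "555-0199", read_version=1)  # clean update -> version 2
write("phone", "555-0142", read_version=1)  # stale write: both values kept
print(store["phone"]["values"])  # ['555-0199', '555-0142'], to be merged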
Many people, when first encountering these types of conflicts, instinctively prefer
pessimistic concurrency to avoid them. However, there is always a trade-off
between safety (avoiding errors like lost updates) and liveness (quick system
responses). Pessimistic approaches can slow down a system significantly, especially
when multiple users are trying to update the same data. Additionally, pessimistic
concurrency can introduce other problems, such as deadlocks, which are difficult to
debug.
In distributed systems, replication increases the likelihood of write-write conflicts
because different copies of the same data can be independently updated. One way to
manage this is to have a single node handle all writes, which simplifies maintaining
update consistency. Many distribution models (except peer-to-peer replication)
follow this pattern to avoid conflicts.
Replication also introduces read-consistency problems, because updates reach
different replicas at different times. This leads to eventual consistency: updates will
eventually reach all replicas, but in the meantime, clients may see stale (outdated)
data. The inconsistency will be resolved after a short period, but there's still a
window during which inconsistent data may be read.
4. Inconsistency Windows and Different Levels of Consistency
Replication can worsen logical inconsistencies by extending the length of the
inconsistency window. For example, the inconsistency window on the master node
might be very short, but network delays can cause it to last much longer on slave
nodes.
Applications often allow developers to choose different levels of consistency for
specific operations. Weak consistency can be acceptable most of the time, but for
critical operations, strong consistency might be required.
Figure 5.3. With two breaks in the communication lines, the network partitions into two groups.
The CAP theorem states that, of the following three properties, a distributed system
can guarantee at most two at once:
Consistency: All nodes in the system see the same data at the same time.
Availability: Every request to a non-failing node receives a response, either with
the requested data or an error.
Partition Tolerance: The system continues to operate despite network partitions,
which separate nodes from each other.
Given a partition, the system must choose: sacrifice availability (refuse requests
until the partition heals) or sacrifice consistency (let both sides keep accepting
writes). In scenarios where both nodes accept bookings, there is a chance of
overbooking (e.g., both Martin and Pramod booking the same room). Some businesses
accept this risk due to their operational models.
In a master-slave distribution model, if the master node fails before updates are
sent to the slaves, those updates will be lost.
When the master node comes back online, it could have conflicting updates with
those made to the slaves while it was down. This scenario highlights a durability
issue since the master acknowledges the update to the client, leading the client to
believe the update was successful when it actually wasn't fully replicated.
One way to mitigate the risk of losing updates during replication is to require the
master node to wait for acknowledgments from some replicas before confirming an
update to the client. This strategy can improve durability, but the trade-off is slower
updates, since every write must wait on network round-trips to those replicas.
As with general durability, it's advantageous to allow individual calls to specify their
desired level of durability. This flexibility enables developers to balance performance
and reliability based on the specific needs of each operation.
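As an illustration, the sketch below assumes a hypothetical write() call with a per-call durability argument; real databases expose similar knobs under names such as write concerns or consistency levels.

ACKS_REQUIRED = {"none": 0, "one": 1, "quorum": 2}  # assuming 3 replicas

def write(key, value, durability="one"):
    acks = ACKS_REQUIRED[durability]
    # A real client would send the write to the replicas and block until
    # `acks` of them confirm the data is safely stored.
    print(f"write {key!r}: waiting for {acks} acknowledgment(s)")

write("session:42", "temp-token", durability="none")  # speed over safety
write("order:1001", "total=99", durability="quorum")  # safety over speed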
In summary, while durability is a critical property of reliable data storage, there are
cases where relaxing this requirement can lead to improved performance and
responsiveness in database systems. Understanding the trade-offs involved—especially
in high-traffic applications or when managing replicated data—enables developers to
make informed decisions about how best to handle data consistency and persistence.
This approach can enhance system efficiency while accommodating the potential for
data loss in less critical scenarios.
5.5. Quorums
Not All or Nothing: When considering consistency and durability, it's important to
understand that you don't have to fully sacrifice one for the other. Instead, you can
adjust the number of nodes involved in operations to find a suitable balance.
Replication Factor: When data is replicated across multiple nodes, the replication
factor (N) represents the total number of copies of the data in the system.
Write Quorum (W): The write quorum is the number of nodes that must acknowledge
a write before it is confirmed. To prevent two conflicting writes from both
succeeding, the write quorum must be more than half the replicas (W > N/2).
Read Quorum (R): The read quorum determines how many nodes you must contact to
ensure you receive the most recent write. The read quorum you need depends on the
write quorum. Assuming a replication factor of N = 3:
If W = 2 (two nodes must confirm a write), then R must be at least 2 to guarantee
that you read the latest data.
If W = 1 (only one node confirms a write), you must contact all 3 nodes (R = 3)
to ensure you retrieve the most current update. In this case, if a write quorum isn't
met, conflicts can occur, but contacting enough nodes for reads helps detect any
inconsistencies.
The relationship between reads, writes, and replication can be summarized with the
inequality: R + W > N. This means that the total number of nodes contacted for reads
and writes must exceed the total number of replicas to guarantee strong consistency in
reads.
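These rules are easy to check mechanically; the small sketch below encodes both quorum conditions:

def strongly_consistent_read(n, w, r):
    # R + W > N: a read quorum always overlaps the latest write quorum.
    return r + w > n

def write_conflict_free(n, w):
    # W > N/2: two conflicting writes cannot both reach a quorum.
    return w > n / 2

N = 3
print(strongly_consistent_read(N, w=2, r=2))  # True: 2 + 2 > 3
print(strongly_consistent_read(N, w=1, r=3))  # True: reads contact all nodes
print(strongly_consistent_read(N, w=1, r=1))  # False: a read may miss the write
print(write_conflict_free(N, w=1))            # False: conflicts can occur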
The number of nodes contacted during an operation can vary with the operation's
requirements, and it is possible to configure the system to favor different aspects
depending on the context. For example:
For fast, strongly consistent reads, you might set a high write quorum (W
= 3) and allow reads from just one node (R = 1). This configuration means
every write must be confirmed by all nodes, making writes slower but
allowing rapid reads.
In scenarios where you can tolerate some staleness or slower reads, you can
adjust the number of nodes involved in writes and reads to optimize
performance and availability.
The key takeaway is that there are multiple strategies and combinations of node
interactions that can be tailored to the specific needs of an application. This flexibility
stands in contrast to the simplistic view of a binary trade-off between consistency and
availability often discussed in NoSQL literature.
Key Points
• Write-write conflicts occur when two clients try to write the same data at the same
time. Read-write conflicts occur when one client reads inconsistent data in the middle
of another client’s write.
• Pessimistic approaches lock data records to prevent conflicts. Optimistic approaches
detect conflicts and fix them.
• Distributed systems see read-write conflicts due to some nodes having received
updates while other nodes have not. Eventual consistency means that at some point
the system will become consistent once all the writes have propagated to all the nodes.
• Clients usually want read-your-writes consistency, which means a client can write
and then immediately read the new value. This can be difficult if the read and the
write happen on different nodes.
• To get good consistency, you need to involve many nodes in data operations, but
this increases latency. So you often have to trade off consistency versus latency.
• The CAP theorem states that if you get a network partition, you have to trade off
availability of data versus consistency.
• Durability can also be traded off against latency, particularly if you want to survive
failures with replicated data.
• You do not need to contact all replicas to preserve strong consistency with
replication; you just need a large enough quorum.
Version Stamps
Critics of NoSQL databases often highlight their lack of support for transactions, which
are essential for maintaining data consistency. Transactions allow developers to bundle
multiple operations into a single, atomic unit, ensuring that either all operations
succeed or none do. However, many advocates of NoSQL argue that the absence of
traditional transactions is not as concerning as it seems. This is largely because
aggregate-oriented NoSQL databases provide atomic updates within aggregates, which
are designed to handle related data as a cohesive unit.
A common tool for detecting such problems is a version stamp: a field attached to the
data that changes every time the underlying data changes, so a client can tell whether
the data has been updated since it was last read. This approach is similar to updating
resources in HTTP, where servers use etag headers.
An etag is a unique string representing the version of a resource. When updating a
resource, the client can supply the previous etag. If the resource has changed on the
server, the etags will not match, and the server will return a "412 Precondition Failed"
response, preventing outdated updates.
Some databases offer conditional update mechanisms that check the version stamp
before applying an update, ensuring that updates are not based on stale data; this is
known as a compare-and-set (CAS) operation. You can implement the check manually,
but then you must ensure that no other process can modify the resource between your
read and your update. In a database, the comparison is made against a version stamp
rather than a value.
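A minimal compare-and-set sketch, mirroring the etag flow described above; the in-memory store and the PreconditionFailed name are illustrative assumptions (the name echoes HTTP's 412 response):

class PreconditionFailed(Exception):
    pass

store = {"profile:7": {"stamp": 3, "value": "555-0123"}}

def conditional_update(key, new_value, expected_stamp):
    entry = store[key]
    if entry["stamp"] != expected_stamp:  # the data changed underneath us
        raise PreconditionFailed("resource changed since it was read")
    entry["value"] = new_value
    entry["stamp"] += 1                   # a new stamp marks a new version

stamp = store["profile:7"]["stamp"]                 # read value and stamp
conditional_update("profile:7", "555-0199", stamp)  # succeeds
try:
    conditional_update("profile:7", "555-0142", stamp)  # stale stamp
except PreconditionFailed:
    print("update rejected: fetch the latest version and retry")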
There are several ways to construct version stamps:
Counter: A counter that increments with each update is compact and easy to compare
for recency, but it requires a single authority (such as a master) to generate it.
GUID: A globally unique identifier can be generated by any system and is unlikely to
produce duplicates. While GUIDs are unique, they are large and cannot be directly
compared for recency.
Content Hash: This involves creating a hash of the resource's content. It provides
uniqueness and can be generated by any node. However, like GUIDs, hashes cannot be
compared for recency and may be lengthy.
Timestamps: Using the last update's timestamp allows for easy comparisons for
recency and requires no central management. However, synchronized clocks across
multiple machines are essential to prevent corruption and ensure uniqueness.
Combining these methods can yield a composite version stamp, leveraging the
strengths of each approach. For instance, CouchDB employs a combination of counters
and content hashes, enabling effective conflict detection during peer-to-peer replication.
In a traditional setup, such as a master-slave model, the master node generates and
manages version stamps, ensuring that all updates follow a consistent order. However,
in a peer-to-peer model, multiple nodes can independently update data, leading to
potential discrepancies. If two nodes return different version stamps when queried, it
indicates that they may have different data states.
Counters: The simplest form of version stamping involves using a counter that
increments with each update. For instance, if Node Blue has a version stamp of 4 and
Node Green has a stamp of 6, you can easily determine that Green's data is more
current. This method works well for single-master scenarios but does not suffice in
cases where multiple nodes can make updates independently.
Timestamps: While timestamps are another common approach, they can lead to
complications in ensuring all nodes maintain synchronized clocks. Clock drift can
result in inconsistencies, making it challenging to resolve conflicts effectively.
Timestamps also do not provide sufficient information to detect write-write conflicts,
which limits their utility to scenarios with a single master.
Vector Stamps: The most robust approach in peer-to-peer NoSQL systems is the use
of vector stamps. A vector stamp consists of a set of counters—one for each
participating node. For example, a vector stamp for three nodes (Blue, Green, Black)
might look like this: [blue: 43, green: 54, black: 12]. Each node updates its own counter
independently upon making a change, which allows for tracking the versioning of data
across the system.
Whenever two nodes communicate, they synchronize their vector stamps. This process
helps determine the state of data across the network. The comparison of vector stamps
provides insight into which version is newer:
Newer Version: If all counters in one vector stamp are greater than or equal to
those in another, then it can be considered the newer version.
Conflict Detection: If both version stamps show a counter greater than the other for
different nodes, a write-write conflict has occurred, indicating that both nodes made
changes that need to be reconciled.
When nodes are added or if a node has not been involved in recent updates, its counter
may be missing from the vector. Missing values are treated as zero, allowing the
system to accommodate new nodes without invalidating the existing version stamps.
For example, [blue: 6, black: 2] is interpreted as [blue: 6, green: 0, black: 2].
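The comparison rules above translate directly into code. The sketch below (illustrative names, vector stamps as plain dicts) reports which stamp is newer or flags a conflict, treating missing counters as zero:

def compare(a, b):
    nodes = set(a) | set(b)
    a_ahead = any(a.get(n, 0) > b.get(n, 0) for n in nodes)
    b_ahead = any(b.get(n, 0) > a.get(n, 0) for n in nodes)
    if a_ahead and b_ahead:
        return "conflict"   # each side saw updates the other did not
    if a_ahead:
        return "a is newer"
    if b_ahead:
        return "b is newer"
    return "equal"

print(compare({"blue": 43, "green": 54, "black": 12},
              {"blue": 42, "green": 54, "black": 12}))  # a is newer
print(compare({"blue": 6, "black": 2},
              {"blue": 5, "green": 1, "black": 2}))     # conflict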
While vector stamps are effective for identifying inconsistencies, they do not resolve
conflicts; resolution depends on the specific application domain and its requirements.
This is part of the consistency/availability trade-off:
Network Partitions: In the event of a network partition, you must choose between
keeping the system available and keeping it consistent.
Conflict Management: You can either handle inconsistencies when they arise or make
the system unavailable during such partitions.
Key Points
• Version stamps help you detect concurrency conflicts. When you read data, then
update it, you can check the version stamp to ensure nobody updated the data between
your read and write.
• Version stamps can be implemented using counters, GUIDs, content hashes,
timestamps, or a combination of these.
• With distributed systems, a vector of version stamps allows you to detect when
different nodes have conflicting updates.