Resource Replication
in Cloud Computing
Dr. Hitesh Mohapatra
School of Computer Engineering
KIIT Deemed to be University
Definition
Resource replication in cloud
computing is the process of
making multiple copies of the
same IT resource.
This is done to improve the
availability and performance of
the resource.
Why is resource replication important?
• Reliability
• Resource replication helps ensure that users can access their
resources consistently, even if there are hardware failures or
network issues.
• Disaster recovery
• Resource replication can help with disaster recovery by creating
redundant copies of data in multiple locations.
• Application performance
• Resource replication can help applications run faster, especially
mobile applications.
How is resource replication done?
• Virtualization technology
• Virtualization technology is used to create multiple instances of the same
resource. For example, a hypervisor can use a virtual server image to create
multiple virtual server instances.
• Synchronous replication
• Data is saved to both the primary and secondary storage platforms at the
same time. This provides a more accurate backup, but it can impact network
performance.
• Asynchronous replication
• Data is saved to the primary storage first, then to the secondary storage. This
method puts less strain on systems, but there is a lag between storage
operations.
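The two write paths above can be sketched in a few lines of Python. This is an illustrative toy, not a real storage API: dictionaries stand in for the primary and secondary storage platforms, and the function names are invented. The synchronous write returns only after both stores hold the data, while the asynchronous write returns after the local store alone and leaves replication to a background worker.

```python
import queue
import threading

local, remote = {}, {}          # stand-ins for primary and secondary storage
pending = queue.Queue()         # replication backlog for the async path

def write_sync(key, value):
    """Synchronous: the write is finished only after BOTH stores hold it."""
    local[key] = value
    remote[key] = value          # caller waits for the remote copy too
    return "acknowledged by both"

def write_async(key, value):
    """Asynchronous: the write returns after the local ack; remote lags."""
    local[key] = value
    pending.put((key, value))    # replicated later by a background worker
    return "acknowledged locally"

def replicator():
    while True:
        key, value = pending.get()
        remote[key] = value      # applied after a lag
        pending.task_done()

threading.Thread(target=replicator, daemon=True).start()
write_sync("a", 1)
write_async("b", 2)
pending.join()                   # wait until the replication backlog drains
```

The trade-off in the bullets above is visible here: `write_sync` cannot return until the remote assignment completes, while `write_async` returns immediately and the remote copy catches up later.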
Synchronous and Asynchronous
What Is Remote Replication?
• Introduction
• Essential part of data protection and recovery.
• Historical Context
• Initially used for copying and storing application data in off-site locations.
• Technological Advancements
• Expanded capabilities over time.
• Now allows creating a synchronized copy of a VM on a remote target host.
• Functionality
• Replica: Synchronized copy of the VM.
• Functions like a regular VM on the source host.
• Flexibility
• VM replicas can be transferred to and run on any capable hardware.
• Disaster Recovery
• Powered on within seconds if the original VM fails.
• Significantly decreases downtime.
• Risk Mitigation
• Mitigates potential business risks and losses associated with disaster.
Factors to consider
• Distance — the greater the distance between the sites, the more
latency will be experienced.
• Bandwidth — the internet speed and network connectivity should be
sufficient to sustain rapid and secure data transfer.
• Data rate — the data rate should be lower than the available
bandwidth so as not to overload the network.
• Replication technology — replication jobs should be run in parallel
(simultaneously) for efficient network use.
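The data-rate rule above can be turned into a quick feasibility check. This is a minimal sketch; the function names and the 80% headroom factor are assumptions for illustration, not a standard formula.

```python
def replication_feasible(change_rate_mbps: float, bandwidth_mbps: float,
                         headroom: float = 0.8) -> bool:
    """The data rate should stay below the available bandwidth (with some
    headroom) so replication traffic does not overload the network."""
    return change_rate_mbps <= bandwidth_mbps * headroom

def transfer_time_s(data_mb: float, bandwidth_mbps: float) -> float:
    """Seconds to ship one replication job (megabytes -> megabits / Mbps)."""
    return data_mb * 8 / bandwidth_mbps

print(replication_feasible(50, 100))   # 50 Mbps of changes over a 100 Mbps link
print(transfer_time_s(1000, 100))      # shipping 1 GB over the same link
```

A check like this also shows why distance and latency matter: the longer each job takes, the more jobs must run in parallel to keep the replica current.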
Synchronous replication
• Synchronous Replication
• Data replicated to a secondary remote location at the same time as new data is
created/updated in the primary datacenter.
• Near-instant replication: the replica is updated at the same moment as the source, so it never lags behind.
• Both host and target remain synchronized, crucial for successful disaster recovery (DR).
• Impact on Network Performance
• Atomic operations: Sequence of operations completed without interruption.
• Write considered finished only when both local and remote storages acknowledge its
completion.
• Guarantees zero data loss, but can slow down overall performance.
Asynchronous replication
• Replication not performed at the same time as changes are
made in the primary storage.
• Data replicated in predetermined time periods (hourly, daily, or
weekly).
• Replica stored in a remote DR location, not synchronized in real
time with the primary location.
• Write considered complete once local storage acknowledges it.
• Improves network performance and availability without affecting
bandwidth.
• In a disaster scenario, DR site might not contain the most
recent changes, posing a risk of critical data loss.
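The lag described above can be modelled in a few lines of Python. This is a toy model (the class name and the hourly interval are invented for illustration): writes are acknowledged locally, and the replica only catches up when a scheduled replication job runs, so a failure between runs loses the most recent changes.

```python
import time

class AsyncReplica:
    """Toy periodic replication: changes are copied only when a scheduled
    job runs, so the replica can lag the primary by up to `interval_s`
    (this gap is the Recovery Point Objective, RPO)."""
    def __init__(self, interval_s: float):
        self.interval = interval_s
        self.primary, self.replica = {}, {}
        self.last_sync = time.monotonic()

    def write(self, key, value):
        self.primary[key] = value            # local acknowledgement only
        if time.monotonic() - self.last_sync >= self.interval:
            self.sync()

    def sync(self):
        self.replica = dict(self.primary)    # scheduled replication job
        self.last_sync = time.monotonic()

store = AsyncReplica(interval_s=3600)        # hourly schedule
store.write("order-1", "paid")
# If the primary fails now, "order-1" is lost: the replica is still empty.
assert "order-1" not in store.replica
store.sync()                                  # next scheduled run
assert store.replica["order-1"] == "paid"
```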
Synchronous vs. Asynchronous Replication
• Distance
• Synchronous: works better when locations are in close proximity (performance drops in proportion to distance).
• Asynchronous: works over longer distances, as long as a network connection between datacenters is available.
• Cost
• Synchronous: more expensive.
• Asynchronous: more cost-effective.
• Recovery Point Objective (RPO)
• Synchronous: zero.
• Asynchronous: from 15 minutes to a few hours.
• Recovery Time Objective (RTO)
• Short for both.
• Network
• Synchronous: requires more bandwidth and is affected by latency; can be affected by WAN interruptions (the transfer of replicated data cannot be postponed until later).
• Asynchronous: requires less bandwidth and is not affected by latency or WAN interruptions (the copy of data can be held at the local site until WAN service is restored).
• Data loss
• Synchronous: zero.
• Asynchronous: possible loss of the most recent updates to data.
• Resilience
• Synchronous: a single failure could cause loss of service; viruses or other malicious components that lead to data corruption might be replicated to the second copy of the data.
• Asynchronous: loss of service can occur after two failures.
• Performance
• Synchronous: low (waits for network acknowledgement from the secondary location).
• Asynchronous: high (does not wait for network acknowledgement from the secondary location).
• Management
• Synchronous: may require specialized hardware; supported by high-end block-based storage arrays and network-based replication products.
• Asynchronous: more compatible with other products; supported by array-, network- and host-based replication products.
• Use cases
• Synchronous: best solution for immediate disaster recovery and projects that require absolutely no data loss.
• Asynchronous: best solution for storage of less sensitive data and immediate disaster recovery of projects that can tolerate partial data loss.
What is data replication in Cloud Computing?
Data replication is the process of maintaining redundant copies of primary data.
This is important for several reasons, including fault tolerance, high availability,
read-intensive applications, reduced network latency, or supporting data
sovereignty requirements.
Fault Tolerance: Data replication is necessary when applications must preserve data in the
case of hardware or network failure, from causes ranging from someone tripping over a power
cable to a regional disaster such as an earthquake. Any application that must remain resilient
and consistent therefore needs data replication.
High Availability: Data frequently accessed by many users or concurrent sessions needs
data replication. In this case, replicated data must remain consistent with its leader and other
replicas.
Reduce Latency: Data replication also helps modern cloud applications run off distributed
data in different networks or geographic regions that serve the end user better.
In short, it’s not only about backup and disaster management but also about
application performance. Let’s dive into how replication works and understand
these needs a little deeper.
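As a sketch of the reduced-latency case above, here is a toy leader-follower store in Python where writes fan out from the leader and reads are served from the replica in the user's geographic region. All class, method, and region names are invented for illustration; this is not a real cloud API.

```python
class Leader:
    """Toy leader-follower replication: writes go to the leader and are
    fanned out to regional replicas; reads are served from the replica
    nearest the user, reducing network latency."""
    def __init__(self):
        self.data = {}
        self.followers = {}          # region -> replica store

    def add_follower(self, region):
        self.followers[region] = {}

    def write(self, key, value):
        self.data[key] = value
        for replica in self.followers.values():   # fan out to all replicas
            replica[key] = value

    def read(self, key, user_region):
        # Serve from the user's region when a replica exists there,
        # otherwise fall back to the leader.
        replica = self.followers.get(user_region, self.data)
        return replica.get(key)

db = Leader()
db.add_follower("eu-west")
db.add_follower("us-east")
db.write("profile", {"name": "Asha"})
print(db.read("profile", "eu-west"))   # served locally from the eu-west replica
```

In this simple fan-out design every replica stays consistent with the leader, which mirrors the high-availability requirement above that replicated data remain consistent with its leader and other replicas.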
Cloud data replication vs. traditional data replication
• Scope
• Traditional: local — mobile device to PC, PC to networked database.
• Cloud: global — applications to multiple cloud-based data/services, replicating to other cloud resources.
• Primary Use
• Traditional: preserve data in case of failure.
• Cloud: advanced data protection and high availability.
• Accessibility
• Traditional: replicas not directly accessible until primary nodes fail.
• Cloud: near-instant access to replicas.
• Manual Work
• Traditional: requires manual work to reassemble data while offline.
• Cloud: automates replication and management.
• Replication Levels
• Traditional: from local to external network for backup.
• Cloud: multiple cloud-based machines in the same data center, rack-level distribution, cross-data center replication.
• Real-Time Sync
• Traditional: not real-time; only becomes “active” when the primary fails.
• Cloud: real-time or near-real-time replication.
• Disaster Recovery (DR)
• Traditional: relies on manual intervention to activate replicas.
• Cloud: automatic failover and faster recovery times.
• Geographic Distribution
• Traditional: limited, typically within local or external networks.
• Cloud: wide geographic distribution, storing master data and replicas in different regions (e.g., San Francisco, New York, London).
What is cloud-to-cloud data replication?
A modern hybrid cloud option uses your local network as a
master copy and multiple cloud services or varying regions
within one cloud as part of the replication. Ideally, all nodes in
this design are accessible to applications (for reading and
writing) even when no disaster is at play.
Some tools
• AWS Migration Service
• Hevo Data, Carbonite
• Veeam Backup and Replication
• Microsoft Azure
• Google Cloud Storage Snapshots
• Informatica
Traditional Data Replication vs. Cloud Data Replication vs. Data Backup
• Scope
• Traditional: local — mobile device to PC, PC to networked database.
• Cloud: global — applications to multiple cloud-based data/services, replicating to other cloud resources.
• Backup: restores data to a specific point in time.
• Primary Use
• Traditional: preserve data in case of failure.
• Cloud: advanced data protection and high availability.
• Backup: protects data from corruption, system failure, outages, and other data loss events.
• Accessibility
• Traditional: replicas not directly accessible until primary nodes fail.
• Cloud: near-instant access to replicas.
• Backup: data can be restored from save points.
• Manual Work
• Traditional: requires manual work to reassemble data while offline.
• Cloud: automates replication and management.
• Backup: typically scheduled during off-hours to reduce impact on production systems.
• Replication Levels
• Traditional: from local to external network for backup.
• Cloud: multiple cloud-based machines in the same data center, rack-level distribution, cross-data center replication.
• Backup: save points created at periodic intervals.
• Real-Time Sync
• Traditional: not real-time; only becomes “active” when the primary fails.
• Cloud: real-time or near-real-time replication.
• Backup: not real-time; periodic backups can take up to several hours.
• Disaster Recovery (DR)
• Traditional: relies on manual intervention to activate replicas.
• Cloud: automatic failover and faster recovery times.
• Backup: provides a recovery point for restoring data in the event of a disaster.
• Geographic Distribution
• Traditional: limited, typically within local or external networks.
• Cloud: wide geographic distribution, storing master data and replicas in different regions.
• Backup: data can be backed up on a variety of media and locations, both on-premises and in the cloud.
• Performance Impact
• Traditional: may slow down overall performance due to atomic operations.
• Cloud: improves network performance and availability without affecting bandwidth.
• Backup: backups can be time-consuming but are typically scheduled during off-hours to minimize impact on production systems.
• Risk of Data Loss
• Traditional: guarantees zero data loss with synchronous replication, but can slow down performance.
• Cloud: lower risk of data loss due to near-real-time replication.
• Backup: risk of losing data between backups, but suitable for long-term storage of large sets of static data and compliance.