Migration
Data migration can be defined as the one-time movement of data from a source to a target,
where the data will subsequently be accessed only at the target.
Data migration refers to the relocation of data. After a migration operation, applications that
access the data must reference the data in its new location.
Examples:
Movement of data from one storage device to another.
Storage Platform Migration
Movement of applications from one storage device to another.
Migration of operating systems files from one storage device to another.
Consolidation of data or database instances.
Movement of a Database Instance to a New Storage Device
Migration of data centers containing storage infrastructures from one physical
location to another.
Data Migration Applicability:
Storage Acquisition
Storage growth
Storage Management/Consolidation
Data Migration Challenges:
Lack of standardized tools, templates and methodologies
Prolonged service outages
Shortage of skilled resources
Risk and complexity
Data Migration Sources:
Files/Directories
Volumes/Raw Partitions
Databases
SAN/NAS/DAS
Shares/NFS
E-Mail Systems
Consolidation
Data Center Relocation
Traditional Migration:
Backup and restore, outside network
Multiple copy operations across network
Data not fully accessible during migration
Data Migration Issues:
Mapping of data
Validation of migrated data
Maintaining security
Data outage
Recovery
Lack of in-house data migration tools and expertise
Synchronization of Source and Target
User access to migrated data
Tools and methodologies vary by O/S and platform
Network usage
Hardware/Software changes
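One way to approach the "validation of migrated data" item in the list above is to compare content digests of every file under the source and the target. The following is a minimal sketch under stated assumptions, not a vendor tool: the function names `tree_digests` and `validate_migration` are hypothetical, both trees are assumed to be reachable as ordinary directories, and only file content is checked (security attributes are not).

```python
import hashlib
import os


def tree_digests(root):
    """Map each file's path (relative to root) to the SHA-256 of its content."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            sha = hashlib.sha256()
            with open(full, "rb") as handle:
                # Read in chunks so large files do not need to fit in memory.
                for chunk in iter(lambda: handle.read(1024 * 1024), b""):
                    sha.update(chunk)
            digests[os.path.relpath(full, root)] = sha.hexdigest()
    return digests


def validate_migration(source_root, target_root):
    """Return (missing, extra, changed) relative paths; all empty means a match."""
    src = tree_digests(source_root)
    dst = tree_digests(target_root)
    missing = sorted(set(src) - set(dst))
    extra = sorted(set(dst) - set(src))
    changed = sorted(p for p in set(src) & set(dst) if src[p] != dst[p])
    return missing, extra, changed
```

Three empty lists suggest that the target holds the same file content as the source at the time of the scan; files that change during the scan would need a second pass.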
Data Migration Technologies:
Backup/Restore : Slow and cumbersome, Outage/synchronization, No additional cost
Move/Copy : Slow and cumbersome, Outage/synchronization, May lose security
attributes, No additional cost
Mirroring : Attaching to both storage devices, Data synchronized
Replication : Network impact, Synchronization, Server to Server, SAN replication
(copy)
Active data migration: Network impact, Data access from storage device
Data Migration Terms:
Replication : Exact image "replicated"
: Replication of the data image at a specific point in time
Consolidation : Merging Data Sources
: Application - many to "fewer" servers and storage environments
: Databases - many to "fewer" servers and storage environments
: Operating Systems - many types to "fewer" types
: Logical data consolidation
: Hardware consolidation
Conversion : Transforming an Environment
: Upgrading an application, ex: upgrading Oracle applications version 8i to 9i
: Converting database types and versions
: Upgrading operating systems
: Converting or transforming data from one logical structure to another
: Flat file to database instance or vice versa.
Data Migration Tools:
SRDF (Symmetrix Remote Data Facility)
Open Migrator/LM
CLARiiON MirrorView
CLARiiON SAN Copy
Veritas Volume Manager
Open Replicator
PowerPath Migration Enabler
EMC RepliStor
CDMS (Celerra Data Migration Service)
SDMS (Symmetrix Data Migration Services)
RoboCopy
Secure Copy
Rsync
SRDF (Symmetrix Remote Data Facility):
Symmetrix Remote Data Facility (SRDF) is a Symmetrix system based business
continuance, disaster recovery, restart, and data mobility solution. This can be used for
relocating data when two or more Symmetrix frames make up the environment. SRDF is
flexible and can be configured in many different ways, such as multiple sources to a single
target or a single source to multiple targets.
SRDF is an array-based replication application that requires no host resources. When
considering SRDF for data migration, the application, the distance between the sites, and
the amount of data to be migrated must be taken into account. For example, if the SRDF
link is only a few feet long then SRDF/S (synchronous) replication could be used. However,
if the migration is considered to be over a long distance, then SRDF/A (asynchronous)
replication may be required. Whether a migration is considered to be short or long distance
depends on the amount of data that SRDF needs to migrate.
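Because the amount of data drives the planning above, a rough transfer-time estimate can help when sizing a migration window. The sketch below is plain arithmetic and does not model SRDF itself; the function name `estimate_transfer_hours` and the `efficiency` factor (the assumed fraction of nominal link speed actually achieved) are hypothetical.

```python
def estimate_transfer_hours(data_gb, link_mbps, efficiency=0.7):
    """Estimate hours needed to copy data_gb gigabytes over a link.

    data_gb    -- amount of data in decimal gigabytes (1 GB = 8000 megabits)
    link_mbps  -- nominal link speed in megabits per second
    efficiency -- assumed fraction of the nominal speed actually achieved
    """
    if data_gb < 0 or link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("sizes and rates must be positive, efficiency in (0, 1]")
    megabits = data_gb * 8000
    seconds = megabits / (link_mbps * efficiency)
    return seconds / 3600
```

For instance, 4,500 GB over a 1,000 Mbit/s link at full efficiency works out to 10 hours; a lower efficiency lengthens the estimate proportionally.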
Open Migrator/LM for UNIX:
Open Migrator/LM for UNIX is a host based device driver solution with a dynamically
loadable kernel module, and a command line management interface that provides seamless
storage migration within a UNIX environment. It inserts itself into the kernel I/O subsystem
and provides mirroring and background copying functions to synchronize data images on
one or more source and target Logical Volume, LUN or LUN Partitions. Open Migrator/LM
works with the same level block device as the application. It supports:
LUNs as presented to the host by the SAN or direct attached device
Multi-Path Input / Output devices, such as EMC PowerPath or Veritas DMP
Host Logical Volume in a Logical Volume Group in the Host Logical Volume
Manager (LVM)
EMC Open Migrator/LM enables online data migration of Microsoft Windows, UNIX, or Linux
volumes between any source and EMC storage. Open Migrator/LM host-based software
boosts the efficiency of your entire information infrastructure by automating and simplifying
data migration. Whether you're consolidating servers, upgrading storage, or tuning
performance, your volumes stay online and fully available to critical applications during
migration. And your host applications continue operating at peak performance.
With Open Migrator/LM, once the migration process begins the target cannot be used by the
host.
All host read/write I/O to the target device will be rejected. At the time of defining the
target volume, Open Migrator/LM checks whether the volume is in use (for example, whether
it is mounted). Its definition as a target will fail if it is detected as in use.
It is recommended that the target volume be unmounted or marked as not ready to any
other hosts to guarantee that the volume cannot change while copying is in progress.
When the synchronization completes, mirroring continues and maintains source and target
data synchronization.
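The "target must not be in use" rule described above can be illustrated with a small pre-check. This is a sketch only: how Open Migrator/LM actually detects an in-use volume is not described here, the function names are hypothetical, and the input is assumed to be Linux-style mount table text (device, mount point, file system type, and so on, one entry per line).

```python
def mounted_devices(mounts_text):
    """Return {device: mount_point} parsed from Linux-style mount table text."""
    table = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and not fields[0].startswith("#"):
            table[fields[0]] = fields[1]
    return table


def check_target(device, mounts_text):
    """Raise ValueError if the proposed target device is currently mounted."""
    table = mounted_devices(mounts_text)
    if device in table:
        raise ValueError(f"{device} is in use (mounted at {table[device]})")
    return True
```

Rejecting the definition up front, rather than failing midway through the copy, mirrors the behavior the text describes.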
Open Migrator/LM for Windows:
Open Migrator/LM for Windows migrates NTFS data from a source volume to a destination
volume that must be the same size or larger. Upon the next reboot, it expands the file
system on the target volume to match the size of the new volume. It also mounts the new
volume using the drive letter from the source and re-associates any mount points used by
the volume.
Open Migrator/LM for Windows allows a maximum of ten concurrent migrations, while
allowing full read and write access to the source volume. The number of concurrent
migrations from the GUI can be adjusted. The default maximum number of migrations upon
install is set to five.
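The constraints stated above (destination at least as large as the source, a default of five concurrent migrations, and a ceiling of ten) can be expressed as a simple admission check. This is an illustrative sketch; the function `can_start_migration` and its parameters are hypothetical and are not part of the product's interface.

```python
DEFAULT_MAX_MIGRATIONS = 5   # default after install, per the text above
HARD_MAX_MIGRATIONS = 10     # documented upper limit


def can_start_migration(source_bytes, dest_bytes, running,
                        max_concurrent=DEFAULT_MAX_MIGRATIONS):
    """Return (ok, reason) for a proposed volume migration."""
    if not 1 <= max_concurrent <= HARD_MAX_MIGRATIONS:
        return False, "max_concurrent must be between 1 and 10"
    if dest_bytes < source_bytes:
        return False, "destination volume is smaller than the source"
    if running >= max_concurrent:
        return False, "concurrent migration limit reached"
    return True, "ok"
```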
Migrations can be launched, monitored, and verified from a remote machine. The server can
be configured to allow launch and access privileges to desired users that are able to access
the server from a remote client.
When accessing Open Migrator/LM remotely, you must be logged in to a machine in the
same domain or in a trusted domain as the target server.
With Open Migrator/LM the migration process continues across system reboots. If a
migration is in progress and the system reboots or crashes, the migration continues from
where it left off before the reboot.
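Resuming "from where it left off" is commonly achieved by recording progress in persistent storage. The sketch below shows that general idea with a checkpoint file; it is not the product's implementation, and the function name, the checkpoint format (a byte offset stored as text), and the `max_blocks` parameter are all assumptions made for illustration.

```python
import os


def resumable_copy(source, target, checkpoint, block_size=1024 * 1024, max_blocks=None):
    """Copy source to target block by block, recording progress in checkpoint.

    Returns the byte offset reached. max_blocks (hypothetical) limits how many
    blocks this call copies, which makes it easy to simulate an interruption.
    """
    offset = 0
    if os.path.exists(checkpoint):
        with open(checkpoint) as handle:
            offset = int(handle.read().strip() or 0)

    # Open an existing target for in-place update so earlier blocks are kept.
    mode = "r+b" if os.path.exists(target) else "wb"
    copied = 0
    with open(source, "rb") as src, open(target, mode) as dst:
        src.seek(offset)
        dst.seek(offset)
        while max_blocks is None or copied < max_blocks:
            block = src.read(block_size)
            if not block:
                break
            dst.write(block)
            dst.flush()
            os.fsync(dst.fileno())
            offset += len(block)
            copied += 1
            # Persist progress only after the block is safely on the target.
            with open(checkpoint, "w") as handle:
                handle.write(str(offset))
    return offset
```

Calling the function again after an interruption continues from the recorded offset. A production tool would also need to write the checkpoint atomically and account for source blocks that changed after they were copied.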
If the system crashes during migrations, Open Migrator/LM automatically schedules
verification on the volumes that were migrating at the time of the crash. However, this
functionality is not available if the last known good configuration option is accessed during
system boot up.
Open Migrator/LM supports migration of both basic and dynamic disks as well as volumes of
any fault-tolerance type. Source and destination volumes can be of dissimilar types and
fault-tolerance levels. For example, Open Migrator/LM allows migration from a basic to a
dynamic disk or from a striped disk to a spanned disk.
Another good usage of Open Migrator/LM for Windows is migrating a volume that was
initially created misaligned, causing performance degradation, to a volume that is properly
aligned.
After the source and target volumes are synchronized, the migration can be verified.
To verify, Open Migrator/LM for Windows checks that the source and target
volumes are identical at the physical level. Upon reboot, Open Migrator will stop updating the
source volume and will only write new data to the target.
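A physical-level verification of the kind described above amounts to comparing the two volumes block by block. The following sketch assumes the volumes can be opened as files or device nodes; the function name `first_difference` is hypothetical, and because the destination may be larger than the source, only the source's length is compared.

```python
def first_difference(source_path, target_path, block_size=1024 * 1024):
    """Return the start offset of the first differing block, or None if identical.

    The target may be larger than the source; only the first len(source)
    bytes of the target are compared.
    """
    offset = 0
    with open(source_path, "rb") as src, open(target_path, "rb") as dst:
        while True:
            a = src.read(block_size)
            if not a:
                return None
            b = dst.read(len(a))
            if a != b:
                return offset
            offset += len(a)
```

Such a check is only meaningful while writes to the source are quiesced or mirrored; otherwise a block can change between the two reads.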
When Open Migrator/LM is idle, no migrations are taking place and hence it has
no performance penalty. It is used for one-time migrations, so it makes sense to
remove the product afterwards.
Open Migrator/LM is packaged for installation using the native package administration tools
(pkgadd/pkgrm on Solaris, swinstall on HP-UX, and installp/SMIT on AIX).
The tunable migration rate is dynamic and can be throttled forward or backward
depending on the impact on any production application.
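A tunable migration rate can be pictured as a copy loop that pauses after each block so that throughput stays at or below a target. The sketch below illustrates that general technique and is not the product's algorithm; the function name and the injectable `clock` and `sleep` parameters are assumptions. The rate may be a number or a zero-argument callable, so it can be raised or lowered while the copy runs.

```python
import time


def throttled_copy(read_block, write_block, rate_bytes_per_s,
                   clock=time.monotonic, sleep=time.sleep):
    """Copy blocks while pacing each block to the requested rate.

    read_block() returns bytes (empty when done); write_block(data) stores them.
    rate_bytes_per_s is a number or a zero-argument callable returning one.
    Returns the total number of bytes copied.
    """
    total = 0
    while True:
        began = clock()
        block = read_block()
        if not block:
            return total
        write_block(block)
        total += len(block)
        rate = rate_bytes_per_s() if callable(rate_bytes_per_s) else rate_bytes_per_s
        # Sleep for whatever part of the block's time budget is still unused.
        delay = len(block) / rate - (clock() - began)
        if delay > 0:
            sleep(delay)
```

Passing a callable that reads a shared setting lets an operator throttle the copy back when a production application is affected and speed it up again later.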
The compare action is provided for users who wish to double-check that the source and
targets are synchronized. Note, however, that the compare action, when used, causes
the migration to take more than twice as long. It is not automatic and must be explicitly
invoked after the copy action is complete.
Open Migrator/LM can be used to migrate to EMC-based storage from non-EMC storage, for
day-to-day volume-resizing tasks, to install more storage capacity, upgrade existing
hardware, move volumes for load balancing or eliminate hot spots on the fly, as business
operations continue.
Scenarios when migration may be needed include migrating data to a different LUN,
introducing a new or replacement Logical Volume Manager (LVM) Layer, and/or
reconfiguring LUN or LVM size or striping.
MirrorView:
MirrorView can create multiple secondary copies of your production data. This secondary
(replicated) data can be used to support offline applications, or to support testing or
development activity.
Using MirrorView to support data migration also enables the IS (Information Systems)
organization to back up the secondary copies to tape or disk. Using the secondary replicated
copies for backup allows production to keep running with no impact on performance.
SAN Copy:
VNX SAN Copy is a software application for copying data between VNX storage systems,
within VNX storage systems, and between VNX and Symmetrix storage systems.
SAN Copy is designed to work with device replication technology, such as SnapView or
TimeFinder. SAN Copy can use a snapshot, clone, or Symmetrix BCV (Business Continuation
Volume) as its Source LUN, allowing I/O with the snapshot, clone or BCV Source LUN to
continue during the copy process.
Some of the key benefits supplied by SAN Copy software are the off-load of host traffic with
an associated increase in copy performance. Copy operations can be performed without
regard to the host operating system. Because ownership of the logical units or volumes
does not have to be shared, a level of security can be maintained.
SAN Copy is a storage-system based data-mover application that uses the SAN (Storage
Area Network) to copy data between storage systems.
Since SAN copy runs on the storage systems, it eliminates the need to move data to and
from the attached hosts, and reserves host processing resources for users and applications.
Since the host is not involved in the copy process, and the data migration takes place on
the SAN, the copy process is much faster than the LAN-based, host-involved copy process.
In today's business environment, it is common for a company to have multiple data centers
in different regions. Businesses frequently need to distribute data from headquarters to
regional offices, and collect data from local offices to headquarters. Such applications are
defined as content distribution and are supported by VNX SAN Copy. Web content
distribution is also in this category, which involves distributing content to multiple servers
on an internal, or external, website.
VNX SAN Copy has the following requirements:
Either the source logical unit, destination logical units, or both, must reside on a SAN Copy
storage system.
The SAN Copy ports must be correctly zoned to target storage systems in order for SAN
Copy to have access to them.
For Unisphere Manager to provide the drive letter/file system mapping of participating
Symmetrix volumes, the Unisphere Host Agent must be installed on the hosts that own the
volumes.
Logical units participating in a SAN Copy session must be accessible by the participating
SAN Copy port.
Veritas Volume Manager:
When using a volume manager such as Veritas, the system administrator would use
the deport and import commands to migrate data from a source host to a target host. This
method is a host-based data migration solution that requires any application, residing on
the volume being moved, to be disabled.
Before deporting a volume group, the application must be shut down on the host that owns
it. Deporting a volume group disables access to the volume group on the host that owns it.
Importing a volume group enables access to a volume group on the new host.
The umount and deport commands are used to disable access to a volume group on a host
that owns the volume group.
A volume group must be deported from the host that owns it when the volume group is:
Renamed (the volume group must be deported and then re-imported with a new name onto
the same host; Data Mirroring)
Permanently moved to a different host (Point-in-Time Data Relocation)
Data Relocation vs. Data Migration is one of those situations where a customer needs to be
very clear as to what each process entails. When performing a Host Data Relocation with
Veritas, the process entails migrating to a new host. Moving the volume from host 1 to
host 2 represents only one copy of the data. When performing a Data Migration, the process
consists of creating a copy and then moving to a new environment. Again, make sure the
customer clearly understands the differences.
The Import command is used to make the volumes within a volume group accessible.
A volume group can be imported to another host using the original volume group name.
Because a volume group can be active on only one host, the volume group must have been
previously deported from its original host.
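The ownership rules above (shut down the application, deport from the owning host, then import on the new host, optionally under a new name) can be captured in a toy state model. This sketch does not invoke Veritas commands; the class `VolumeGroup`, its methods, and the host and group names used with it are hypothetical.

```python
class VolumeGroup:
    """Toy model of the rule that a volume group is active on at most one host."""

    def __init__(self, name, owner):
        self.name = name
        self.owner = owner  # host that currently has the group imported, or None

    def deport(self, host, application_running=False):
        """Disable access on the owning host; the application must be stopped."""
        if self.owner != host:
            raise RuntimeError(f"{host} does not own {self.name}")
        if application_running:
            raise RuntimeError("shut down the application before deporting")
        self.owner = None

    def import_on(self, host, new_name=None):
        """Enable access on a host; only allowed once the group is deported."""
        if self.owner is not None:
            raise RuntimeError(f"{self.name} is still imported on {self.owner}")
        if new_name:
            self.name = new_name
        self.owner = host
```

Attempting `import_on` before `deport` raises an error, which reflects the requirement that the group be deported from its original host first.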
If a hardware-mirror copy of a volume group is imported on the same host, then a new
name must be provided to the hardware-mirror copy.
The example shown demonstrates how a system administrator would migrate or replicate a
volume set from the source host to a target host using Veritas and EMC's SRDF replication
facility. Once the SRDF device group is fully synchronized, and then split, the target host
(Host 2) could import the Veritas volume and activate it by performing a mount. The benefit
of using SRDF with Veritas (or any volume manager) is that the availability of the
application on the source side is not lost.
Another advantage of using EMC's SRDF is enabling a full mirror copy of the data. The
above data migration scenario is relocating a copy of the data to a target site. In the
event of a problem during the migration process, the customer has the original source copy
to fall back to.
Always have a failback plan with any Data Migration process. Do not start a Data Migration
process until a failback plan has been created and accepted.
Open Replicator:
Using Open Replicator to move data to a Symmetrix array from various types of arrays
allows migration of data from an older storage configuration on a SAN to newer Symmetrix
storage array. A host that is locally attached to a Symmetrix control array can initiate the
copying of data to the Symmetrix from a remote array (for example, a VNX, older
Symmetrix, or third-party storage array).
Open Replicator also allows the copying of data from a Symmetrix control array to other
types of arrays on the SAN. This capability is useful for backups or any situation where data
has to be moved from a high-end storage array to mid-tier or low-end storage array.
Open Replicator can reduce VMware ESX Server data migration downtime from hours to
minutes when compared to using native VMware tools for migration.
One potential use for Open Replicator (OR) within an ESX environment is for the purposes of
Data/Array Migration. Virtual Center and ESX Server (VI3) support their own set of
migration facilities for VMs in the form of cold and hot (VMotion) migrations, but for large
storage environments the use of replication facilities like OR may be more efficient.
PowerPath Migration Enabler:
PowerPath Migration Enabler (PPME) is a host-based migration tool that allows you to
migrate data between storage systems. PowerPath Migration Enabler takes advantage of
PowerPath technology and works in conjunction with another underlying technology, such as
Open Replicator or Invista. PowerPath Migration Enabler is independent of PowerPath multi-
pathing technology and does not require that you use PowerPath for multi-pathing.
This service is a module of the Data Migration All Inclusive service, but none of the Shared
Documents for the Data Migration All Inclusive service are required with this module. All
documents supporting Invista are associated with the Invista Planning, Design, and
Implementation Custom Service.
EMC RepliStor:
RepliStor is host-based software that provides data replication and server failover in a
Windows environment. RepliStor provides real-time data replication and increases the
availability and reliability of Windows servers without the use of proprietary or specialized
hardware.
It is a flexible data replication tool allowing one-to-one, many-to-one, and many-to-many
source-to-target replication. It can also be used in local area and wide area networks.
Before replication, a specification must be created that tells RepliStor software which files,
directories, registry keys, and shares to replicate. After specifications are created, the data
must be synchronized from the source to the target system. Synchronization ensures that
the replicated data on the target system exactly matches the original data on the source
system before the replicating process begins. RepliStor is a Windows host-based remote
replication product which also performs:
File-level full and incremental synchronization
File-level asynchronous mirroring
Data transfer over IP network
Synchronization is the first process performed by RepliStor before data replication occurs. It
is the act of copying all files from the source to the target. Synchronization is done initially
when you create a specification, defining the data replication parameters. Initial
synchronization is called a full sync. A full sync copies all the data and all the attributes of
the data files to the target.
You can perform either a full sync or an incremental sync. An
incremental sync first scans each file on the source and target systems. The source then
sends the appropriate patches to the target and the target then updates the files.
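The scan-then-patch idea behind an incremental sync can be sketched with per-block digests: the target reports a digest for each block it holds, and the source sends only the blocks that differ. RepliStor's actual patch format is not described here, so the functions below (`block_digests`, `make_patch`, `apply_patch`) and the fixed block size are assumptions made purely for illustration.

```python
import hashlib


def block_digests(data, block_size):
    """Return the SHA-256 digest of each fixed-size block of data."""
    return [hashlib.sha256(data[i:i + block_size]).digest()
            for i in range(0, len(data), block_size)]


def make_patch(source, target_digests, block_size):
    """Return (source_length, [(offset, block), ...]) for blocks the target lacks."""
    patch = []
    for index, digest in enumerate(block_digests(source, block_size)):
        if index >= len(target_digests) or target_digests[index] != digest:
            offset = index * block_size
            patch.append((offset, source[offset:offset + block_size]))
    return len(source), patch


def apply_patch(target, length, patch):
    """Apply the patch to the target's bytes and return the updated content."""
    # Truncate or zero-pad the old content to the source's length first.
    updated = bytearray(target[:length].ljust(length, b"\0"))
    for offset, block in patch:
        updated[offset:offset + len(block)] = block
    return bytes(updated)
```

Only the digests and the differing blocks cross the network, which is why an incremental sync is usually much cheaper than a full sync when few files have changed.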
CDMS (Celerra Data Migration Service):
Celerra Data Migration Services (CDMS) allows seamless migration of existing data using
the Common Internet File System (CIFS) protocol from source file servers to the Celerra
Network Server with limited interruption to normal business operations. The CIFS protocol
enables Microsoft Windows clients to map shared file systems on the Celerra Network
Server as network drives.
SDMS (Symmetrix Data Migration Services):
Symmetrix Data Migration Services (SDMS) is a total solution to ensure high availability
when migrating data to a target Symmetrix. The SDMS offering includes pre-migration
planning, the software that facilitates the data movement, Customer Service Engineering to
perform the migration, and post-migration evaluation. SDMS has evolved from its traditional
implementation to include a non-disruptive version (ND-SDMS) that eliminates application
outage. In addition, data can be migrated in an open systems environment when the donor
system is Symmetrix.
The reach of SDMS can be extended to enable data center moves and consolidations with
SRDF. By combining SDMS with SRDF, application availability can be ensured during a
situation typically requiring a significant application outage. Open Systems SDMS benefits
include:
Support of all Symmetrix models
Transparent-to-host operations
Operating system and file system independence
Open Systems SDMS works similarly to traditional SDMS. A short outage must take place to
connect the new Symmetrix to the host and the existing Symmetrix. After the connections
are made, the host is rebooted and data migration begins transparently to the host. After
completion of the data migration, the existing Symmetrix can be non-disruptively removed.
RoboCopy:
RoboCopy, or Robust File Copy, is a directory replication utility available via the Windows
Resource Kits from Microsoft. Currently it is a standard replication utility for Windows Vista,
Windows 7 and Windows Server 2008.
RoboCopy does have a full Graphical User Interface. This interface lets you define the
source and target paths. The RoboCopy GUI also lets you define where to store the
migration (Copy) logs. The major benefit of this utility is the ability to maintain NTFS
security information from source to destination shares. RoboCopy's capabilities generally
include:
File replication between source and destination directories.
Copying of single directories or a recursive copy of a directory and subdirectories
which allows for the complete migration of a share when the starting directory is the
root directory.
Full mirroring option that permits deletion of files in the destination directory if they
no longer exist in the source directory.
Extensive inclusion or exclusion options based on wildcards, paths, or file attributes.
Full logging of the transfer operation, file processing, and error reporting.
Control of retry operation after encountering a recoverable network error; a retry
may be allowed to restart from the point of failure.
Multiple runs of RoboCopy will only copy files that have changed (known as
incremental copying).
Batch scheduling of copy operations.
Capability to move source directories and files. RoboCopy accomplishes this task by
copying files or directories from the source to the destination and then deleting the
source, but this is not a recommended methodology for migration.
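The mirroring, incremental-copy, and exclusion capabilities listed above can be combined into a simple planning step that decides what to copy and what to delete. This is a sketch of the general technique and not of RoboCopy's own change detection; the function `plan_mirror` and the use of (size, modification time) pairs to detect changes are assumptions.

```python
import fnmatch


def plan_mirror(source_files, dest_files, exclude=()):
    """Plan a mirror run.

    source_files / dest_files map relative path -> (size, mtime).
    Returns (to_copy, to_delete) as sorted lists of relative paths.
    """
    def excluded(path):
        return any(fnmatch.fnmatchcase(path, pattern) for pattern in exclude)

    wanted = {p: meta for p, meta in source_files.items() if not excluded(p)}
    # Incremental copy: only files that are new or whose metadata changed.
    to_copy = sorted(p for p, meta in wanted.items() if dest_files.get(p) != meta)
    # Mirroring: remove destination files that no longer exist in the source.
    to_delete = sorted(p for p in dest_files
                       if p not in source_files and not excluded(p))
    return to_copy, to_delete
```

Reviewing the plan before acting on it is a cheap safeguard, since the delete list is the part of a mirror operation that cannot be undone.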
Secure Copy:
Secure Copy is an NTFS File copying application for NT administrators. It allows an
administrator to copy files and directories on NTFS partitions while keeping the security
intact. This functionality uses a graphical user interface (GUI), which updates the copy
progress, as well as handling any errors that may occur. Secure Copy also includes
command line functionality and scheduling. Secure Copy supports:
Multiple copy destinations: Files can be copied from one source server to multiple
destinations, all in the same job. The data can be replicated to multiple machines
around the network with this feature.
Include/Exclude file types: Specific file types can be included and excluded for your
copy job.
Enhanced scheduling using the Windows Task Scheduler: Once a job has been
saved, that job can be scheduled to run at specified intervals.
Rsync:
Rsync is a freely available open source utility that provides fast incremental file transfer.
Rsync is particularly useful for simple, small NFS migrations, or for migrating small amounts
of data where operating system mirroring is not possible.
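For a small migration of this kind, the rsync invocation can be assembled in a script so that a dry run is the default. The sketch below only builds the argument list; the function name and the example paths are hypothetical, while `-a`, `--delete`, and `--dry-run` are standard rsync options.

```python
def build_rsync_command(source, destination, delete=False, dry_run=True):
    """Build an rsync argument list for copying a directory tree.

    -a (archive) preserves permissions, times, and symlinks; --delete removes
    destination files that are gone from the source; --dry-run only reports.
    """
    command = ["rsync", "-a"]
    if delete:
        command.append("--delete")
    if dry_run:
        command.append("--dry-run")
    # A trailing slash on the source copies its contents, not the directory itself.
    command += [source.rstrip("/") + "/", destination]
    return command
```

The resulting list can be passed to `subprocess.run(command, check=True)`. Running once with the dry-run default and reviewing the output before setting `dry_run=False` keeps the first pass non-destructive.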