Efficient Data Storage Management

According to industry analysts, data is growing at 30% to 60% per year, which means managing your storage infrastructure is no easy task.

Thus, organizations need a very efficient storage infrastructure!
http://www.clker.com/cliparts/0/5/e/d/1194985614534001410pulsante_02_architetto_f_01.svg.med.png
• https://www.youtube.com/watch?v=bAyrObl7TYE
• https://www.youtube.com/watch?v=vku2Bw7Vkfs
http://findicons.com/files/icons/552/aqua_candy_revolution/128/movie.png
 Computer data storage
 An electronic holding place for instructions and data that the microprocessor can access quickly
 Refers to the computer components, devices and recording media that retain data used for processing
 2 major types of storage:
• Primary storage
• Secondary storage
 Primary storage – RAM
• Volatile; when the power is off, the data is lost
 Secondary storage – hard disks, magnetic tapes
• Non-volatile; data remains when the power is off
 Primary storage
• Primary storage, also known as main storage or memory, is the area in a computer in which data is stored for quick access by the computer's processor. The terms random access memory (RAM) and memory are often used as synonyms for primary or main storage.

 Secondary storage
• Secondary storage, sometimes called auxiliary storage or external storage, is non-volatile
storage that is not under the direct control of a computer's central processing unit (CPU) or does
not directly interact with an application.

 Off-line storage
• Any storage medium that is non-volatile and whose data cannot be accessed by the computer once the medium is removed. A good example of off-line storage is a USB thumb drive. Off-line storage is used for transport as well as backup protection in the face of unpredictable events, such as hardware failure due to a power outage or files corrupted by a computer virus.
 On-line Storage
• Online data storage is a virtual storage approach that allows users to use the Internet to store recorded data in a remote network. An example of online storage is cloud storage.
• Cloud storage is a model of computer data storage in which digital data is stored in logical pools, said to be on "the cloud". The physical storage spans multiple servers, and the physical environment is typically owned and managed by a hosting company.
 Goal – to define, maintain and track data
and data resources in an organizational
environment
 Industries generally differentiate their data storage into three levels – Hot, Warm and Cold.
 The classifications are based on how crucial the assets are to the current business and how frequently they will be accessed, helping organizations select the most suitable workflow and media.
 Hot data refers to assets that require the fastest storage, as they are accessed most frequently.
 Warm data represents assets that are stored on higher-capacity storage or file servers for reasons of cost efficiency.
 Cold data does not require an online or near-line workflow; it mostly includes archived materials that are rarely accessed and is increasingly stored on low-cost options.
 Let’s identify the example of :
• Hot
• Warm
• Cold
• Frozen
in your life…☺
 https://www.youtube.com/watch?v=Vky3h5_mFYY
 https://www.youtube.com/watch?v=PvI_nzBU6MA
 https://www.thedigitalspeaker.com/future-data-storage-dna/#:~:text=DNA%20molecules%20can%20store%20vast,for%20long%2Dterm%20data%20storage.
https://www.youtube.com/watch?v=bJPSOWJjWqo
 Availability
Management
 Capacity Management
 Performance Management
 Security Management
 Reporting
 The critical task in availability management is establishing proper guidelines for all configurations to ensure availability based on service levels.
 The server must be connected to the storage array
using at least 2 independent fabrics and switches that
have built-in redundancy.
 Storage devices with RAID protection are made
available to the server using at least 2 front-end ports.
 Virtualization technologies have significantly improved
the availability management task.
 With virtualization in place, resources can be
dynamically added or removed to maintain availability.
 The goal of capacity management is to ensure
adequate availability of resources for all services
based on their service level requirements.
 Capacity management provides capacity analysis,
comparing allocated storage to forecasted storage on a
regular basis.
 It also provides trend analysis of actual utilization of
allocated storage and rate of consumption, which must
be rationalized against storage acquisition and
deployment timetables.
 Capacity management also takes into account the future need for resources and sets up monitors and analytics to gather such information.
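The trend analysis described above can be sketched as a simple linear forecast. This is an illustrative sketch under stated assumptions, not a prescribed tool; the function name and the linear growth model are inventions for the example:

```python
def days_until_full(capacity_gb, samples):
    """Estimate days until capacity is exhausted from (day, used_gb) samples.

    Fits the average daily consumption rate between the first and last
    sample and extrapolates linearly -- a deliberately simple model.
    """
    (d0, u0), (d1, u1) = samples[0], samples[-1]
    rate = (u1 - u0) / (d1 - d0)      # GB consumed per day
    if rate <= 0:
        return None                   # no growth: array never fills
    return (capacity_gb - u1) / rate  # days remaining from the last sample

# Usage: a 100 GB array whose usage grew from 40 GB to 60 GB over 10 days
remaining = days_until_full(100, [(0, 40), (10, 60)])
print(remaining)  # 20.0 (days, at 2 GB/day)
```

A real capacity manager would rationalize such forecasts against storage acquisition and deployment timetables, as the slide notes.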
 Performance management ensures the optimal
operational efficiency of all components.
 Performance analysis is an important activity that
helps to identify the performance of storage
infrastructure components.
 Several performance management activities are
initiated for the deployment of an application or
server in the existing storage infrastructure
 The performance management tasks on a SAN include designing sufficient Inter-Switch Links (ISLs) in a multi-switch fabric, with adequate bandwidth to support the required performance levels.
 A fabric is a type of network topology where the
nodes of a network are connected to others via
switches and in a matrix fashion that ensures that
the best bandwidth is allocated to each.
 The term “fabric” is used in the field of
telecommunications, but especially in the field of
SAN storage networks using Fiber Channel
protocol , as well as in high-speed networks
including Infiniband .
 Fabric switches generally offer higher overall
throughput than traditional networks.
Fabric is a preconfigured network switch that was built to connect servers at the
highest bandwidth so that each frame is delivered accurately.
 In Fiber Channel switched fabric topology (called FC-SW), servers and storage devices are connected to each other by FC switches; the 24-bit addressing allows a theoretical maximum of about 16 million devices (2^24 = 16,777,216).
 The reciprocal visibility of nodes is managed by the
definition of logical zones, called “zoning”.
 Most storage networks are configured as dual-fabrics
(two independent parallel fabrics), in order to obtain
high availability through redundancy of access paths
(Refer diagram in the next slide).
 This not only enables fault tolerance, but also
enables seamless hardware maintenance
interventions for applications
Zoning
• Zoning means restricting the scope of an initiator
(host) to a particular target (storage array) in
fabric.
• An initiator can see only the devices from a
particular storage array that is zoned to it.
• Zoning provides security to data by restricting
unauthorized access at the switch level.
• A zone is a logical path between two devices that enables communication between them.
• Zoning is illustrated below with an example.
1. In the provided diagram, Server A and Storage A are in the same zone, called Zone 1. Server A can see devices only from Storage A. It cannot see any devices from Storage B or Storage C, since its scope is restricted to Storage A only.
2. In Zone 2, Server B, Server C, Server D, and Storage B are added, so Server B, Server C, and Server D can access devices from Storage B.
3. Server E, Server F, and Storage C are
zoned together. So these servers can
see devices from Storage C.
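The zone scoping in the example above can be sketched as plain set membership; the zone and device names mirror the diagram, and the helper function is a hypothetical illustration, not a switch API:

```python
# Zones as sets of member devices, mirroring the three-zone example.
ZONES = {
    "Zone 1": {"Server A", "Storage A"},
    "Zone 2": {"Server B", "Server C", "Server D", "Storage B"},
    "Zone 3": {"Server E", "Server F", "Storage C"},
}

def can_access(initiator, target):
    """An initiator may reach a target only if both share a zone."""
    return any(initiator in members and target in members
               for members in ZONES.values())

print(can_access("Server A", "Storage A"))  # True  (same zone)
print(can_access("Server A", "Storage B"))  # False (scope restricted)
```

This mirrors how the switch enforces security: access outside a shared zone is simply not visible to the initiator.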
• Security management prevents unauthorized
access and configuration of storage infrastructure
components.
• The security management tasks in a SAN environment include configuring zoning to restrict a host bus adapter's (HBA's) unauthorized access to specific storage array ports.
• Prevents data corruption on the storage array by restricting host access to a defined set of logical devices.
https://harmonyoo.blogspot.com/2016/06/reading-notes-ism-chapter-5-fc-
san.html
https://www.jetorbit.com/blog/apa-itu-storage-area-network-san/
https://www.ibm.com/topics/storage-area-network
https://www.techjockey.com/blog/san-vs-nas-vs-das
Host bus adapter (HBA): an integrated circuit adapter that connects a host system, such as a server, to a storage or network device. An HBA also provides input/output (I/O) processing to reduce the load on the host's microprocessor when storing and retrieving data, helping to improve the host's overall performance.
• It is difficult for businesses to keep track of the resources they have in their data centres.
• Reporting on a storage infrastructure involves keeping track
and gathering information from various
components/processes.
• This information is compiled to generate reports for trend
analysis, capacity planning, chargeback, performance, and to
illustrate the basic configuration of storage infrastructure
components
• Chargeback reports contain information about the allocation
or utilization of storage infrastructure components by various
departments or user groups.
• Performance reports provide details about the performance of various storage infrastructure components.

 To protect the information system (IS)
 The task of an IS is to process data into information that helps the organization achieve its goals
DNA as a data storage technology
https://www.youtube.com/watch?v=r8qWc9X4f6k
 Shingled Magnetic Recording (SMR) is a new way hard drives record data
https://www.youtube.com/watch?v=3UFUfv9n420

Helium Drives
https://www.youtube.com/watch?v=vKvUcV1cNI0
 A traditional infrastructure is built out of individual units;
separate storage, application servers, networking and
backup appliances that are interlinked
 Each unit must be configured individually
 Each component is managed individually; usually, that
requires a team of IT experts, each specialized in a different
field
 Sometimes, each unit comes from a different vendor,
therefore support and warranty are managed individually
 Use case: Traditional infrastructures are still a good fit for
companies with a stable environment that handle very large
deployments. Tens of petabytes, thousands of applications,
many, many users, as well as a dedicated IT staff with
specialisations in different datacenter fields. Think huge
datacenters and large multi-national companies.
 Application servers, storage, and networking switches are sold as
a single turn-key solution by a vendor
 The entire product stack is pre-configured for a certain workload;
however, it does not offer a lot of flexibility to adapt for workload
changes
 As in the case of traditional infrastructures, more challenging
hardware issues may have to be handled by different providers
 Every appliance in the converged stack needs to be managed
separately in most cases
 Use case: Converged infrastructures are ideal for companies that
need a lot of control over each element in their IT infrastructure, as
each element can be “fine-tuned” individually. They may also be a
good fit for large enterprises who are replacing their entire
infrastructure, as they do not need to browse the market and
purchase every component separately.
 Storage, networking and compute are combined
in a single unit that is centrally managed and
purchased from a single vendor
 All of the technology is integrated and it takes
less time to configure the whole solution
 The software layer gives you flexibility in using
hardware resources and makes the deployment
and management of VMs easy
 Use case: Small and medium enterprises which
require a cost-effective, flexible and agile
infrastructure that can be managed by 1-2 IT
people.
 Managing storage in complex,
heterogeneous (various) physical and
virtual environments
 Low storage utilization
 Wasting storage with multiple and older
versions of the same files
 Applying and enforcing retention
policies across files efficiently
 Managing and finding information as
demanded
To solve the data growth problem, does your
company simply purchase more storage capacity
or devices? Is managing your storage
infrastructure on several platforms with multiple
point tools too complex?
 Maximize storage utilization across
heterogeneous platforms and arrays
 Increase operational efficiency with a storage
manager that offers end-to-end visibility and
centralized control
 Reduce the costs of multiple point tools with
integrated storage management software
 Storage, backup and service recovery
management cover all aspects of
information storage and data restoration
(storage management, storage allocation,
system backups and restoration, information
management, database management and
administration)
 The operational process of storage management involves:
• Data backup
• Restore
• Recovery operations
• Storage resource management
 Backup – the process of periodically saving data on another device, different from its own hard disk
 Moves data from the hard disk to a secondary storage medium for potential retrieval at a later stage
 The secondary storage medium may be a hard disk, CD-ROM, magnetic tape or optical disk
 Process involved:
• Variations in the customer's SLA (changes in business needs, restore and data recovery requirements, etc.)
Backup Strategy
https://www.youtube.com/watch?v=-wXeQ3T497A
https://www.youtube.com/watch?v=gsq0fLQW5LE
 What data do you need to back up?
 How often to back up?
 When to schedule the backup?
 How long to keep the backup data?
https://www.upguard.com/blog/how-to-back-up-your-data
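One common way to answer "what to back up and how often" is a full backup followed by incrementals. The sketch below is an illustration, not a specific backup product; the function name is invented, and selection by modification time is an assumed (typical) strategy:

```python
import os
import time

def files_to_back_up(root, last_backup_ts):
    """Return paths modified since the last backup (an incremental run).

    Passing last_backup_ts=0 selects every file, i.e. a full backup.
    """
    selected = []
    for dirpath, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(dirpath, name)
            if os.path.getmtime(path) > last_backup_ts:
                selected.append(path)
    return selected

# Usage: a full backup first, then incrementals pick up only new changes.
# full = files_to_back_up("/data", 0)
# incr = files_to_back_up("/data", time.time() - 24 * 3600)  # last 24 h
```

How long to keep each backup is then a retention question, covered later in these slides.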
 Multiple user accounts
 Archive file deletion
 Local file deletion
 Archive file identification
 Symbolic file links (Unix)
(eg: Deleting Shortcut on the Desktop)
Classification of Disasters
 Natural disasters – completely preventing them is very difficult, but precautions can minimize losses. Examples: smog, flood, hurricane, fire and earthquake.
 Manmade disasters – due to human error, intentional or unintentional, causing loss of communication and utilities. Examples: virus, walkouts, accidents, intrusion and burglary.
 Referred to as a Business Process Contingency Plan (BPCP).
 Describes strategies of an organization to fight against potential
disasters.
 Subset of Business Continuity Planning (BCP)
 It has to include planning for the resumption of data, hardware, applications, communications and any other computer infrastructure.
 BCP may also include planning for non-IT related aspects such as
facilities, key personnel and crisis communication and reputation
protection.
 Should refer to the DRP for IT-related infra recovery and continuity
 DRP minimizes the effect of a disaster, since a disaster is an unexpected event
 DRP deals with the analysis of business processes and continuity needs – it also focuses on the prevention of disasters

A disaster recovery test is an important
element of an organization's business
continuity and disaster recovery (BCDR)
plan. It essentially involves assessing the
steps within this BCDR plan to be prepared
for operational disasters
 5 methods of testing the DRP:
• Walk-through
• Simulations
• Checklist
• Parallel testing
• Full interruption
https://www.youtube.com/watch?v=6ikkF0JG428
https://www.youtube.com/watch?v=lLbK5Twv6GM
 Performed by teams
 Group discussion of recovery, operations,
resumption plans and procedures
 Brainstorming and discussion brings out
new issues, ideas
 Provide feedback to document owners
 Walkthrough of recovery, operations,
resumption plans and procedures in a
scripted “case study” or “scenario”
 Performed by teams
 Places participants in a mental disaster
setting that helps them detect real issues
more easily
 More passive type of testing, members of
the key units “check off” the task on list
for which they are responsible
 Report accuracy of the list
 Full or partial workload is applied to
recovery systems
 Performed by teams
 Tests actual system readiness and accuracy
of procedures
 Production systems continue to operate and
support actual business processes
 Backup processing occurs in parallel with
production services that never stop. If
testing failed, normal production will not be
affected.
 Production systems are stopped as if a disaster had occurred, to see how the backup services perform
https://acsense.com/blog/disaster-recovery-testing-a-comprehensive-guide/
https://www.itperfection.com/cissp/security-operations-domain/test-disaster-recovery-plans-drp/
https://ussignal.com/blog/disaster-recovery-plan-testing-methods-and-must-haves
It identifies and moves low-activity and inactive files to a lower tier in the storage hierarchy.
Use HSM to migrate unused data files automatically and transparently from the computer's local online storage to offline storage managed by a server.
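The automatic migration step can be sketched as follows. The 90-day idle threshold, the stub-file convention, and the use of last-access time are illustrative assumptions, not the behavior of any particular HSM product:

```python
import os
import shutil
import time

def migrate_inactive(online_dir, offline_dir, max_idle_days=90):
    """Move files not accessed within max_idle_days to offline storage,
    leaving an empty stub file as a placeholder (simplified HSM)."""
    cutoff = time.time() - max_idle_days * 86400
    moved = []
    for name in os.listdir(online_dir):
        src = os.path.join(online_dir, name)
        if os.path.isfile(src) and os.path.getatime(src) < cutoff:
            shutil.move(src, os.path.join(offline_dir, name))
            open(src + ".stub", "w").close()  # stub marks the migrated file
            moved.append(name)
    return moved
```

A real HSM would also recall the file transparently when the stub is opened; this sketch covers only the outbound migration.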
 Supports unlimited online data storage
 Data backup
 Maintains total file system sizes
 Maintains the stub file sizes (A stub file works
as a type of placeholder, reminding the computer — and
the computer user — that the file is available for access.)
 Improves scalability
 Pre-migration
 Release administrators and users from manual
file-pruning tasks which are time-consuming
 It stretches the usability of a given amount of
online storage space for longer periods
 It helps to reduce the need to purchase
increasing amount of online storage, as data
grows
 It automatically integrates and coordinates with
the storage manager for comprehensive data
protection
 It helps to reduce the overall cost of retaining
large number of easily retrievable data files for a
longer period
 Flexible Data Protection
 Archiving
 Faster recovery
 Point-in-time-Recovery
 Continuous logical log backup
 Parallel backup and restore
 Centralized backup operations
 Parallel backup and recovery
 Track backup history
Vulnerability is a cyber-security
term that refers to a flaw in a
system that can leave it open to
attack. A vulnerability may also
refer to any type of weakness in a
computer system itself, in a set of
procedures, or in anything that
leaves information security
exposed to a threat.
 Web and Application Server
 Website and application
 Flexible restore
 Database
• Alternate restoration techniques
• Online backup
• Centralized backup operations
• Robust data integrity
Web Applications Hack Attacks
 The ability to recover the operating system machine to the identical state it was in at a given time.
 Returns the system to the state of the last backup
 It covers customizing, streamlining, and recovering all the
changes in OS
 It eliminates the need for a skilled professional to manually
reinstall hardware, network configurations and patches.
 Documented steps provided to recover a server or a number
of servers in the event of a disaster in a data centre.
 Advantages:
• Simplify the backup process (reinstall hw, nw configuration,
patches)
• Minimizes the storage and network usage
 It schedules regular backup of the operating system, so
that a recovery brings back the latest information
which includes OS, service packs, patches and
hardware drivers.
 Support is offered for dissimilar hardware recovery
with Windows products.
 If a disk or RAID fails, booting can be done from CD or CBMR
 The restoration process takes only a few minutes of human intervention per client
https://www.vembu.com/blog/windows-server-backup-bare-metal-recovery/
 If the storage manager is not available, or the bandwidth is too small to do large restores, BMR can back up to a local network such as Network Attached Storage – Common Internet File System (CIFS) or Network File System (NFS)
 Factors to be considered while building
BMR software:
• Speed of recovery
• Complexity
• Storage requirements
https://www.maple-hosting.com/blog/what-is-a-bare-metal-server/
 The phrase bare metal
server describes a server, usually
a web server that runs directly on the
computer's main operating system.
 Other types of servers, such as
a VPS (virtual private server), operate
in a virtualized operating system.
 Whereas virtualization offers greater
control over many servers on a single
machine, bare metal servers do not
compete for system resources, making
them more performant.
A bare metal computer, also called a bare
machine, is a computer that can be
programmed to execute instructions
directly on its hardware, without an
operating system.
 Today, bare metal computers are usually
found in embedded systems, where a
small, dedicated piece of computer
hardware runs inside of a larger
machine.
 Defines the policies of persistent data and records management for meeting legal and business data archival requirements
 Can help in fighting terrorism and organized crime, and in the maintenance of national security
 In the field of telecommunications, data
retention (or data preservation) generally refers to the storage of
call detail records (CDRs) of telephony and internet traffic and
transaction data (IPDRs) by governments and commercial
organizations.
 Data retention, also called records retention, is the
continued storage of an organization's data
for compliance or business reasons.
 An organization may retain data for several different
reasons. One reason is to comply with state and federal
regulations.
 Another is to provide the organization with the ability
to recover business critical data in the event of a site-
wide data loss, such as a fire or flood.
 Minimum records retention requirements regulations
vary by state and by data type, but typically they range
from three years to permanent.
 To ensure that all necessary data is stored properly,
an organization's IT administrators can work with the
organization's legal team and departmental business
owners to create a data retention policy.
 Such a policy is simply a set of guidelines that
describes which data will be archived and how long
it will be kept.
 Establishing a policy can reduce the
organization's storage costs by allowing
documents that are no longer needed to be deleted
or moving files that aren't accessed as frequently to a
lower-level storage tier in an archive.
 A good data retention policy organizes documents so
they can be searched and accessed when necessary.
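Such a policy can be sketched as a per-class age check; the document classes and retention periods below are invented for illustration, not regulatory guidance:

```python
import os
import time

# Illustrative retention periods (days) per document class.
RETENTION_DAYS = {"logs": 90, "invoices": 7 * 365, "email": 3 * 365}

def expired(path, doc_class, now=None):
    """True if the file is older than its class's retention period."""
    now = time.time() if now is None else now
    age_days = (now - os.path.getmtime(path)) / 86400
    return age_days > RETENTION_DAYS[doc_class]

# A real policy engine would archive or delete expired files and log
# every action for the audit trail; this sketch only flags them.
```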
 Capture & Archive
• Helps to capture and to archive important emails, systems event logs and
database
• Automatically categorizes and captures the needed data and store it
• It compresses archived data, reducing storage space required to retain
large number of email or event logs
 Search & Discovery
• Enables to quickly discover and retrieve evidences related to lawsuit or
internal investigation
• With extensive reporting and ad hoc search capabilities, the system can
quickly identify and retrieve relevant documents in their original form
 Archive Security
• Provides access controls to ensure that archived data remains unaltered
and accessible by authorized personnel only.
• It makes a complete audit trail of all searches and administrative events.
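The compression step in Capture & Archive can be sketched with standard gzip; the function names are hypothetical, but the pattern (compress on capture, decompress on retrieval in original form) matches the bullets above:

```python
import gzip
import shutil

def archive_log(src_path, archive_path):
    """Compress a captured log into the archive, reducing the storage
    needed to retain large numbers of emails or event logs."""
    with open(src_path, "rb") as src, gzip.open(archive_path, "wb") as dst:
        shutil.copyfileobj(src, dst)

def read_archived(archive_path):
    """Retrieve an archived log in its original form (for discovery)."""
    with gzip.open(archive_path, "rb") as f:
        return f.read()
```

Repetitive text such as event logs compresses well, which is why archiving systems apply compression before long-term retention.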
 Steps that can be taken by developers to reduce the risk of data leakage and
privacy issues in the context-aware application:
• Enforce data retention policy – data related to user profiling should be
periodically deleted.
• Perform on device processing – if possible, store data related to user profiling
locally on the device and perform on-device processing to provide
recommendations and analytics to context-aware application
• Obtain user consent before collecting their data – the developer should
provide a user consent screen, where the user can give their consent for their
data to be collected by the context-aware application.
• Give control to the user over which data that could be collected and processed
to enable context-aware applications - If possible, allow the user to continue
using the application with minimal data access.
• Release updates and software maintenance regularly – mobile applications
should always be updated to use the latest API and patches to prevent security
lapse, this also applies to the remote server.
• Only ask for the bare minimum information – the developer should ensure that the mobile application only requires the minimum information to perform the relevant context-aware task. Collecting unnecessary information might accidentally expose the mobile users' personally identifiable information and could lead to legal issues.
https://www.slideshare.net/swatibaiger/context-aware-computing-14084995
Australian Security Intelligence
Organization (ASIO)