Cheat Sheets - 4
We’ll now cover several of these database types that may come up on the exam.
Amazon Relational Database Service (RDS)
Amazon Relational Database Service (Amazon RDS) is a managed service that makes it easy to
set up, operate, and scale a relational database in the cloud.
RDS supports the following database engines:
• SQL Server.
• Oracle.
• MySQL.
• PostgreSQL.
• Aurora.
• MariaDB.
RDS is a fully managed service and you do not have access to the underlying EC2 instance (no
root access).
A DB instance is a database environment in the cloud with the compute and storage resources
you specify.
Encryption:
• You can encrypt your Amazon RDS instances and snapshots at rest by enabling the
encryption option for your Amazon RDS DB instance.
• Encryption at rest is supported for all DB types and uses AWS KMS.
• You cannot encrypt an existing unencrypted DB in place. Instead, create a snapshot, copy it
with encryption enabled, then restore a new encrypted DB from the encrypted copy.
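The snapshot, copy, and restore sequence can be sketched in Python. This is a minimal sketch, assuming `rds` is a boto3 RDS client created elsewhere (for example with `boto3.client("rds")`). The identifier naming scheme and the `encrypt_existing_db` helper are hypothetical and used only for illustration.

```python
def encrypt_existing_db(rds, source_db_id, kms_key_id, suffix="encrypted"):
    """Snapshot -> encrypted copy -> restore, using an injected RDS client.

    `rds` is assumed to behave like a boto3 RDS client. The identifiers
    derived below are illustrative naming choices, not AWS conventions.
    """
    plain_snap = f"{source_db_id}-snap"
    enc_snap = f"{source_db_id}-snap-{suffix}"
    new_db_id = f"{source_db_id}-{suffix}"

    # 1. Snapshot the unencrypted instance and wait until it is available.
    rds.create_db_snapshot(
        DBInstanceIdentifier=source_db_id,
        DBSnapshotIdentifier=plain_snap,
    )
    rds.get_waiter("db_snapshot_available").wait(DBSnapshotIdentifier=plain_snap)

    # 2. Copy the snapshot; supplying a KMS key encrypts the copy.
    rds.copy_db_snapshot(
        SourceDBSnapshotIdentifier=plain_snap,
        TargetDBSnapshotIdentifier=enc_snap,
        KmsKeyId=kms_key_id,
    )
    rds.get_waiter("db_snapshot_available").wait(DBSnapshotIdentifier=enc_snap)

    # 3. Restore a new DB instance from the encrypted copy.
    rds.restore_db_instance_from_db_snapshot(
        DBInstanceIdentifier=new_db_id,
        DBSnapshotIdentifier=enc_snap,
    )
    return new_db_id
```

The original instance is left untouched, so applications still need to be repointed at the new encrypted instance afterwards.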
DB Subnet Groups:
• A DB subnet group is a collection of subnets (typically private) that you create in a VPC
and that you then designate for your DB instances.
• Each DB subnet group should have subnets in at least two Availability Zones in a given
region.
• It is recommended to configure a subnet group with subnets in each AZ (even for
standalone instances).
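The two rules above (at least two AZs, ideally every AZ) can be checked with a small helper before creating a subnet group. This is only a sketch: the `Subnet` type, the `check_subnet_group` function, and the subnet IDs are hypothetical, and the list of AZs in the region is assumed to be supplied by the caller.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subnet:
    subnet_id: str          # hypothetical ID, for example "subnet-a"
    availability_zone: str  # for example "us-east-1a"


def check_subnet_group(subnets, region_azs):
    """Report whether the subnets span >= 2 AZs and which AZs are uncovered."""
    used = {s.availability_zone for s in subnets}
    return {
        "meets_minimum": len(used) >= 2,
        "missing_azs": sorted(set(region_azs) - used),
    }


report = check_subnet_group(
    [Subnet("subnet-a", "us-east-1a"), Subnet("subnet-b", "us-east-1b")],
    region_azs=["us-east-1a", "us-east-1b", "us-east-1c"],
)
```

Here the group meets the two-AZ minimum, but `missing_azs` shows that one AZ is still uncovered relative to the recommendation.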
Scalability:
RDS provides Multi-AZ deployments for disaster recovery, which provide fault tolerance across
Availability Zones.
Amazon DynamoDB
Push-button scaling means that you can scale the DB at any time without incurring downtime.
DynamoDB is a Web service that uses HTTP over SSL (HTTPS) as a transport and JSON as a
message serialisation format.
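To make the HTTPS and JSON point concrete, the sketch below builds (but does not send) a low-level `GetItem` request. The `build_get_item_request` helper, the table name, and the key are hypothetical. A real request must also be signed with AWS Signature Version 4, which is omitted here and is normally handled by the SDK.

```python
import json


def build_get_item_request(region, table_name, key):
    """Return (url, headers, body) for an unsigned DynamoDB GetItem call."""
    url = f"https://dynamodb.{region}.amazonaws.com/"
    headers = {
        "Content-Type": "application/x-amz-json-1.0",
        # The target header names the API version and the operation.
        "X-Amz-Target": "DynamoDB_20120810.GetItem",
    }
    # Attribute values carry a type tag, for example {"N": "101"} for a number.
    body = json.dumps({"TableName": table_name, "Key": key})
    return url, headers, body


url, headers, body = build_get_item_request(
    "us-east-1", "ExampleTable", {"Id": {"N": "101"}}
)
```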
Amazon DynamoDB stores three geographically distributed replicas of each table to enable high
availability and data durability.
• Amazon DynamoDB global tables provide a fully managed solution for deploying a
multi-region, multi-master database.
• When you create a global table, you specify the AWS regions where you want the table
to be available.
• DynamoDB performs all of the necessary tasks to create identical tables in these regions,
and propagate ongoing data changes to all of them.
Scale storage and throughput up or down as needed without code changes or downtime.
DynamoDB is schema-less.
• A strongly consistent read returns a result that reflects all writes that received a successful
response prior to the read (the data returned is always up to date).
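In the SDK, the consistency model is chosen per read. The sketch below assumes `table` behaves like a boto3 DynamoDB `Table` resource, and the `read_item` wrapper is a hypothetical helper. `ConsistentRead` is the request parameter that turns on strongly consistent reads. When it is left as `False`, DynamoDB performs an eventually consistent read.

```python
def read_item(table, key, strong=False):
    """Fetch one item, optionally with a strongly consistent read.

    `table` is assumed to expose get_item() like a boto3 Table resource.
    Returns the item dict, or None if no item matches the key.
    """
    response = table.get_item(Key=key, ConsistentRead=strong)
    return response.get("Item")
```

A caller that has just written an item and must read it back immediately would pass `strong=True`.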
Amazon DynamoDB Accelerator (DAX) is a fully managed, highly available, in-memory cache
for DynamoDB that delivers up to a 10x performance improvement – from milliseconds to
microseconds – even at millions of requests per second.
Amazon Redshift
Amazon Redshift is a fast, fully managed data warehouse that makes it simple and cost-effective
to analyze all your data using standard SQL and existing Business Intelligence (BI) tools.
Redshift is a relational database that is used for Online Analytical Processing (OLAP) use cases.
Redshift is used for running complex analytic queries against petabytes of structured data, using
sophisticated query optimization, columnar storage on high-performance local disks, and
massively parallel query execution.
Redshift is ideal for processing large amounts of data for business intelligence.
• Data is stored sequentially in columns which allows for much better performance and less
storage space.
• Redshift automatically selects the compression scheme.
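A toy example shows why columnar layout helps analytic queries and compression. The sample rows are made up, and run-length encoding is used only as one illustrative compression scheme. This sketch makes no claim about which scheme the service would actually select.

```python
from itertools import groupby

# Row-oriented records: (order_date, region, amount). Sample data only.
rows = [
    ("2024-01-01", "EU", 120),
    ("2024-01-01", "EU", 80),
    ("2024-01-02", "US", 200),
    ("2024-01-02", "US", 50),
]


def to_columns(rows):
    """Pivot row tuples into one list per column."""
    return [list(col) for col in zip(*rows)]


def run_length_encode(values):
    """Collapse runs of equal adjacent values into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


dates, regions, amounts = to_columns(rows)

# An aggregate touches only the one column it needs.
total = sum(amounts)

# Similar values sit next to each other in a column, so they compress well.
encoded_regions = run_length_encode(regions)
```

With these four rows, `total` is 450 and the region column shrinks to two (value, count) pairs.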
Redshift uses replication and continuous backups to enhance availability and improve durability
and can automatically recover from component and node failures.
Redshift always keeps three copies of your data:
• The original.
• A replica on compute nodes (within the cluster).
• A backup copy on S3.
Redshift provides fault tolerance for the following failures:
• Disk failures.
• Node failures.
• Network failures.
• AZ/region level disasters.
Amazon ElastiCache
ElastiCache is a web service that makes it easy to deploy and run Memcached or Redis protocol-
compliant server nodes in the cloud.
The in-memory caching provided by ElastiCache can be used to significantly improve latency
and throughput for many read-heavy application workloads or compute-intensive workloads.
Best for scenarios where the DB load is based on Online Analytical Processing (OLAP)
transactions.
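Read-heavy workloads typically use a cache in front of the database in a cache-aside (lazy loading) pattern. The sketch below is an assumption-laden illustration: a plain dictionary stands in for a Memcached or Redis node, and `CacheAside` and `load_from_db` are hypothetical names.

```python
import time


class CacheAside:
    """Cache-aside reads: check the cache first, fall back to the database."""

    def __init__(self, load_from_db, ttl_seconds=60.0, clock=time.monotonic):
        self._load = load_from_db   # slow lookup, for example a SQL query
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}            # stand-in for the in-memory cache node
        self.hits = 0
        self.misses = 0

    def get(self, key):
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[1] > now:
            self.hits += 1          # served from memory, no DB round trip
            return entry[0]
        self.misses += 1            # absent or expired: go to the database
        value = self._load(key)
        self._store[key] = (value, now + self._ttl)
        return value
```

Repeated reads of the same key within the TTL never reach the database, which is where the latency and throughput gain comes from.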
The following table describes a few typical use cases for ElastiCache:
ElastiCache EC2 nodes cannot be accessed from the Internet, nor can they be accessed by EC2
instances in other VPCs.
• Memcached – simplest model, can run large nodes with multiple cores/threads, can be
scaled in and out, can cache objects such as database query results.
• Redis – complex model, supports encryption, master/slave replication, cross AZ (HA),
automatic failover and backup/restore.
Amazon EMR
Amazon EMR is a web service that enables businesses, researchers, data analysts, and
developers to easily and cost-effectively process vast amounts of data.
EMR utilizes a hosted Hadoop framework running on Amazon EC2 and Amazon S3.
Most commonly used for log analysis, financial analysis, or extract, transform, and load (ETL)
activities.
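Log analysis on Hadoop follows a map and reduce shape, which a single-process toy can illustrate. The log format (status code as the last field) and the function names are assumptions. A real EMR job would distribute the same two steps across the cluster rather than run them in one Python process.

```python
from collections import Counter
from functools import reduce

# Hypothetical access-log lines; the status code is the last field.
logs = [
    "GET /index.html 200",
    "GET /missing 404",
    "POST /login 200",
]


def map_status(line):
    """Map step: emit a count of one for the line's status code."""
    status = line.split()[-1]
    return Counter({status: 1})


def reduce_counts(left, right):
    """Reduce step: merge two partial counts."""
    return left + right


counts = reduce(reduce_counts, map(map_status, logs), Counter())
```

Because merging partial counts is associative, the reduce step can combine results from any number of workers in any order.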