MySQL HADR Solution in Azure
Microsoft Azure
Paige Liu, Feb 2015
Contents
Overview
MySQL High Availability Solutions
Optimizing Percona XtraDB Cluster (PXC) performance in Azure
Virtual machine sizes and storage account considerations
Disks and performance considerations
Disk configurations
Multiple data disks
Premium Storage
Disk warm-up effect
Network performance considerations
Load Balancing
Multiple Network Interface Cards
Replication & Disaster Recovery
Sysbench Comparison with 3rd Party
FromDual
Openlife.cc
Summary
Appendix 1 – Automatically provision a MySQL PXC in Azure
References & Related Resources
(c) 2015 Microsoft Corporation. All rights reserved. This document is provided "as-is."
Information and views expressed in this document, including URL and other Internet Web
site references, may change without notice. You bear the risk of using it.
This document does not provide you with any legal rights to any intellectual property in
any Microsoft product. You may copy and use this document for your internal, reference
purposes.
Overview
This article introduces the key factors to take into account when you design a high availability MySQL
solution running in an Azure Virtual Machine environment. It also provides certain best practices and
techniques for performance tuning when using MySQL cluster in Azure.
MySQL High Availability Solutions
MySQL provides two main architectures to achieve high availability – replication and clustering. MySQL
built-in replication is asynchronous, so there is a lag between the nodes that applications may have to
deal with. Clustering, on the other hand, typically appears to an application as a single database server
but internally balances load among multiple nodes. While replication is often lower in cost and
complexity, it also offers lower levels of uptime than a well-designed clustering solution, due to
the lack of capabilities such as synchronous replication and automatic failover.
Master-Slave Clusters
o MySQL with DRBD (Distributed Replicated Block Device) with Corosync and Pacemaker
o MySQL with Windows Failover Clustering
Multi-Master Clusters
o MySQL NDB Cluster
o MySQL Galera Cluster
o Percona XtraDB Cluster (PXC)
Compared to Master-Slave clusters, Multi-Master clusters have the advantage of fast failover achieved
through the removal of central management. You can lose any node at any time, and the cluster will
continue to function (as long as there are enough healthy nodes left to form a quorum).
MySQL Galera Cluster and PXC are very similar in nature; PXC also uses Galera as its clustering layer.
The main difference is that PXC uses Percona Server and the XtraDB storage engine, which are drop-in
replacements for MySQL Server and the InnoDB storage engine, respectively, with enhanced capabilities.
Additionally, both Galera and Percona clusters are drop-in replacements for stand-alone MySQL. In
other words, each node in the cluster is a regular MySQL database that contains a full copy of the data.
This is the key difference from an NDB cluster, where data is sharded among cluster nodes. This is also
why NDB clusters offer superior performance for writes and primary key based lookups, but existing
applications often need to be modified in order to work in an NDB cluster or take advantage of its
benefits.
For completeness of the discussion, there is also a ClearDB offering in the Azure Marketplace. ClearDB is
a MySQL Software-as-a-Service offering where the service provider takes care of database replication,
high availability, and scalability. You gain ease of use in exchange for less control and a limit on the
maximum database size.
We will focus our discussion on PXC in this article because it is a Multi-Master cluster that provides a
drop-in replacement for MySQL and is a highly popular open source MySQL cluster solution on the
market.
[Figure: A Percona XtraDB Cluster deployment in Azure. An application tier connects through a load
balancer, over a data tier subnet, to three PXC VMs (NIC1) in an availability set; a second NIC (NIC2)
on each VM carries intra-cluster traffic on a separate subnet. Everything resides in one virtual network.]
Optimizing Percona XtraDB Cluster (PXC) performance in Azure
Unless otherwise noted, the cluster is configured with the following optimizations, which are generally
recommended for MySQL clusters with multi-master synchronous replication:
innodb_flush_log_at_trx_commit=0
innodb_support_xa=0
innodb_doublewrite=0
sync_binlog=0
(Note that binary logs are enabled in certain tests and disabled in others. However, wherever a
comparison chart is presented, the compared tests share the same configuration apart from the
variable being compared.) Please note that these test results show general performance trends for
different optimization approaches; results may vary in different environments and can also be affected
by many changing variables in a multi-tenant environment.
Virtual machine sizes and storage account considerations
1. A3 standard is a good starting point for general workloads. It provides 4 cores and 7GB of
memory, and allows up to 8 additional data disks with 500 IOPS each as well as multiple NICs.
Memory plays an important role in MySQL performance: the more of the database that
innodb_buffer_pool_size can hold in memory, the better the performance. Many
performance articles suggest that you may allocate up to 80% of the machine's memory to
MySQL. Below is a comparison of Sysbench tests performed with innodb_buffer_pool_size set
to the default 128MB vs. 2GB.
[Chart: Sysbench TPS vs. number of threads (1 to 512) with innodb_buffer_pool_size set to 128MB vs.
2GB; the 2GB configuration sustains considerably higher TPS.]
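As an illustrative sketch of that guideline, on an A3 VM with 7GB of memory the my.cnf entry might look like the following; the exact value is an assumption here and should be tuned against your actual workload:

```
[mysqld]
# Roughly 80% of the A3 VM's 7GB of memory, leaving headroom for the
# operating system and other MySQL buffers.
innodb_buffer_pool_size = 5G
```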
2. VMs can't easily be moved to an Azure VNet or to other subnets after they are created. So before
provisioning the VMs, create an Azure VNet and subnets, and place the VMs in a subnet. You can
later add an Azure Internal Load Balancer to balance the traffic from the application tier to the
nodes in the cluster.
3. Additional virtual Network Interface Cards (NICs) must be created at VM creation time (at the
time of writing). So if your workload is network bound and can benefit from higher bandwidth
(for example, by separating the data plane for internal cluster communication from the data
plane between the application and the databases), you should create additional NICs when
creating the VMs.
4. The VMs should also be placed in a Cloud Service so that they can belong to an availability set to
reduce the possibility of, for example, simultaneous power or network outages.
5. The VMs should be created with static IPs because their IPs will be referenced in the
configuration file to build the cluster.
Disks and performance considerations
Disk configurations
Since operating system disk and data disks are backed by Azure storage, we recommend that you create
your Azure storage account in the same data center as your MySQL virtual machines to reduce network
latency.
When using disk striping, disable geo-redundancy for your storage account as consistent write order
across multiple disks is not guaranteed.
For performance sensitive applications, we recommend that you use data disks instead of operating
system disks. Attach one or more data disks to your virtual machine with disk caching policy set to
none. Caching can help reduce the number of transactions against Azure Storage by storing frequently
accessed data on the local hard disks of the host machine. However, power loss could result in data
inconsistency if the cached data has not yet been written to storage. You could use read-write caching
without data loss if your application ensures that critical writes are write-through. Be aware that caching
on standard storage is limited to a maximum of 4 disks, and caching isn't available on D-series machines.
MySQL uses temporary files to perform certain operations such as sorting. If your workload heavily
uses temporary files (tmpdir), you can place tmpdir on the SSD-based temporary disk of D-series
machines and up. If you choose to do so, make sure the folder for tmpdir on the temporary disk is
created and configured with the right permissions each time the VM restarts, before MySQL starts.
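As a minimal sketch, a boot-time script along the following lines could recreate the folder before mysqld starts; the path, mode, and ownership are illustrative assumptions:

```shell
# prepare_tmpdir: recreate the MySQL tmpdir with the right permissions.
# The temporary disk is wiped when the VM is restarted or redeployed,
# so this must run on every boot, before mysqld starts.
prepare_tmpdir() {
  dir="$1"
  mkdir -p "$dir"
  chmod 770 "$dir"
  # chown mysql:mysql "$dir"   # uncomment on a real VM (requires the mysql user)
}

# On an Azure D-series VM the temporary disk is typically mounted at
# /mnt/resource; MYSQL_TMPDIR lets the sketch be tried anywhere.
prepare_tmpdir "${MYSQL_TMPDIR:-/tmp/mysqltmp}"
```

The matching my.cnf entry would then be tmpdir = /mnt/resource/mysqltmp.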
Multiple data disks
[Chart: Sysbench TPS vs. number of threads (1 to 512) comparing disk layouts: OS disk vs. data disks,
and a single data disk vs. multiple data disks striped as RAID; the striped data disks sustain the
highest TPS.]
You can find a step-by-step guide to configuring RAID in Azure here. One thing to note when striping
disks is to ensure proper disk alignment. Since this is not specific to Azure, we will simply provide an
example here rather than explain it in detail.
>fdisk -ul
Device Boot Start End Blocks Id System
/dev/md127p1 256 83885823 41942784 83 Linux
You can also place binary logs (log-bin) and data files (datadir) on different data disks to improve IO
throughput.
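For example, assuming the data disks are mounted at the illustrative paths /datadisk1 and /datadisk2, the my.cnf separation could look like:

```
[mysqld]
# Data files on one data disk (or striped volume)...
datadir = /datadisk1/mysql
# ...binary logs on another, so log writes don't compete with data file IO.
log-bin = /datadisk2/binlog/mysql-bin
```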
Premium Storage
With the recent introduction of Azure Premium Storage for Disks, you can attach SSD-based data disks
to a DS-series virtual machine and achieve up to 50,000 IOPS per VM vs. 16,000 with standard storage.
Disk warm-up effect
Newly attached data disks may not deliver their full performance until the data on them has been
accessed for a while, an effect known as disk warm-up. We recommend you consider it during
performance testing or when accessing systems that have been inactive for a while.
Network performance considerations
We recommend that you host virtual machines for the MySQL cluster in the same Azure Virtual Network
to enable direct communication between virtual machines via internal IP addresses.
Load Balancing
Certain database drivers for MySQL have built-in load balancing capabilities. For example, JDBC drivers
can do automatic load balancing by accepting a connection string with multiple hosts. This eliminates
the need for an external load balancer. However, whenever the nodes in the cluster change (for
example, hostname or IP changes, addition or removal of nodes), the configuration for the app using the
cluster must be updated. If this type of configuration change doesn’t meet your requirements, or if you
are using a database driver that doesn’t have built-in load balancing support, then you can use the Azure
Internal Load Balancer.
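For instance, with MySQL Connector/J the load-balancing mode is selected in the connection URL itself; the node addresses and database name below are illustrative:

```
jdbc:mysql:loadbalance://10.0.0.4:3306,10.0.0.5:3306,10.0.0.6:3306/mydb
```

Connector/J then spreads connections across the listed hosts and retries against another host when one becomes unreachable, which is why the application configuration must be updated whenever the node list changes.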
It is important for the load balancer to send traffic only to "Joined" nodes, not to any "Joining" or
"Donor" nodes in PXC. The Azure Internal Load Balancer supports an HTTP-based custom probe, so you
should enable mysqlchk, included in the PXC package, to create a custom probe against
http://nodeip:9200. This probe returns an HTTP 200 status code only when the node is in the "Joined"
state.
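The probe is typically exposed through xinetd, which runs the clustercheck script shipped with PXC on port 9200. A sketch of the service definition (the exact file shipped with your PXC version may differ slightly) looks roughly like:

```
service mysqlchk
{
    disable     = no
    socket_type = stream
    port        = 9200
    wait        = no
    user        = nobody
    server      = /usr/bin/clustercheck
    only_from   = 0.0.0.0/0
    per_source  = UNLIMITED
}
```

In the Azure Internal Load Balancer, the custom probe is then configured with protocol HTTP, port 9200, and path /.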
The following Sysbench tests were performed with the Sysbench built-in load balancer vs. the Azure
Internal Load Balancer. They show no significant overhead from the Azure Internal Load Balancer in
this case.
[Chart: Sysbench TPS vs. number of threads (1 to 512) with the Sysbench built-in load balancer
(sysbenchlb) vs. the Azure Internal Load Balancer (azurelb); the two curves are close.]
Multiple Network Interface Cards
The following Sysbench tests show the difference between running PXC with one NIC vs. two, separating
the inter-cluster data communication from the application-to-cluster communication. Note that even
though a performance improvement was clearly observed in this test, in other tests only marginal
improvements were observed. Test results can be affected by changing network traffic patterns in the
cloud as well as by the workload.
[Chart: Sysbench TPS vs. number of threads (1 to 512) with a single NIC vs. multiple NICs; the
multi-NIC configuration shows higher TPS in this test.]
Replication & Disaster Recovery
[Figure: Cross-region disaster recovery. A PXC cluster in Region A replicates over native MySQL
asynchronous replication to Region B, either to a standalone MySQL replica or to a second PXC cluster.]
Replication relies on binary logs being enabled on the MySQL master, as the slaves replay the logs
from the master to keep themselves up to date. Enabling the binary log does have a performance impact.
However, even if you don't need cross-regional high availability and disaster recovery, you should
probably still enable the binary log, because PXC is not a solution for point-in-time restore. If someone
makes a mistake and modifies the database in a way that corrupts data, a backup is still needed to
recover the database to its previous good state.
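A minimal sketch of the my.cnf settings to enable the binary log follows; the path and retention values are illustrative and should match your backup schedule:

```
[mysqld]
server-id = 1
log-bin = /var/lib/mysql/binlog/mysql-bin
# Automatically purge binary logs older than 10 days.
expire_logs_days = 10
# Roll over to a new binary log file at 100MB.
max_binlog_size = 100M
```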
You can create a backup of a MySQL database the same way you would for your on-premises instances.
The resulting backup file can be stored in Azure blob storage, which provides high durability through
multiple copies.
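As a sketch, a nightly backup job could look like the following; the dump options and container name are illustrative assumptions, and the upload step (using the Azure cross-platform CLI) is shown commented out:

```shell
# Create a date-stamped, compressed logical backup of all databases.
BACKUP_FILE="mysql-backup-$(date +%Y%m%d).sql.gz"
echo "backing up to $BACKUP_FILE"

# Guarded so the sketch also runs on a machine without MySQL installed.
if command -v mysqldump >/dev/null 2>&1; then
  mysqldump --all-databases --single-transaction --master-data=2 2>/dev/null \
    | gzip > "$BACKUP_FILE" || true
fi

# Upload the file to Azure blob storage, for example:
#   azure storage blob upload "$BACKUP_FILE" backups "$BACKUP_FILE"
```

--single-transaction takes a consistent snapshot of InnoDB tables without locking them, and --master-data=2 records the binary log position as a comment so the backup can serve as a starting point for point-in-time recovery.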
Sysbench Comparison with 3rd Party
The benchmark results are compared against two published third-party tests:
FromDual: http://www.fromdual.com/galera-cluster-vs-pxc-vs-mariadb-galera-cluster-benchmarking (published on Aug 7, 2014)
Openlife.cc: http://openlife.cc/blogs/2011/august/running-sysbench-tests-against-galera-cluster (published on Aug 10, 2011)
There are many factors affecting the performance of a cluster. Although we tried to make our
environment as similar as possible given the information in the above articles, there are still many
unknown factors in both published tests, especially the Openlife test. Prices of virtual machines quoted
in these tests are current as of the time of writing.
FromDual
The following table shows the respective environments for the tests:
FromDual Azure
Cluster Nodes (no disk striping, no multiple NICs)
CPU Intel(R) Xeon(R) E5-2680 v2 AMD Opteron(tm) Processor 4171 HE
@ 2.80GHz @ 2.09GHz
(# of cores 8, # of threads 16, HT enabled) (# of cores 8)
Memory 16GB 14GB
Storage HDD 120GB/ 5400RPM 605GB
Price AWS c3.2xlarge ($0.42/hour) Azure A4 ($0.48/hour)
#of nodes 3 3
Load Generator
CPU Intel(R) Xeon(R) CPU E5-2680 v2 Intel(R) Xeon(R) CPU E5-2660 0
@ 2.80GHz @ 2.20GHz
(# of cores 16, # of threads 32, HT enabled) (# of cores 8)
Memory 32GB 56GB
Storage HDD 10GB/ 5400RPM 400GB
Price AWS c3.4xlarge ($0.84/hour) Azure D13 ($0.857/hour)
Load Balancer
CPU Intel(R) Xeon(R) CPU E5-2651 v2 Azure Internal Load Balancer
@ 1.80GHz
(# of cores 4, # of threads 8, HT enabled)
Memory 16GB N/A
Storage HDD 10GB/ 5400RPM N/A
Price AWS m3.xlarge ($0.28/hour) N/A
Software
OS RHEL 6.5 CentOS 6.5
Sysbench 0.5.3 0.5.3
PXC 5.6.19 5.6.20
Test set up (show create table sbtest.sbtest\G)
ENGINE=InnoDB ENGINE=InnoDB
AUTO_INCREMENT=8574 DEFAULT AUTO_INCREMENT=30000 DEFAULT
CHARSET=latin1 CHARSET=latin1
MAX_ROWS=1000000
Test run (run 10 times, and calculate average of TPS)
--num-threads=64 --max-requests=1000 --db-driver=mysql
--mysql-table-engine=InnoDB --mysql-host=load_balancer_ip
--test=/usr/share/doc/sysbench/tests/db/oltp.lua
My.cnf
key_buffer_size = 16M
max_allowed_packet = 16M
thread_stack = 192K
thread_cache_size = 8
innodb_buffer_pool_size = 8G
expire_logs_days=10
max_binlog_size=100M
auto_increment_increment=3
auto_increment_offset=1
log_slave_updates
log-bin=mysql-bin log-bin=/var/lib/mysql/binlog
Only the results for the tests performed on PXC 5.6 with innodb_flush_log_at_trx_commit=0 and
sync_binlog=0 are compared:
[Chart: Average TPS. FromDual: 867.3; Azure: 532.19.]
Openlife.cc
Details of the environments used in this Openlife test are not published in the article. What is published
is also quite different from Azure environment. The following table shows the respective environments
known for the tests. We performed this test on both standard A4 and D4 machines in Azure.
Openlife.cc Azure
Cluster Nodes
#cores 8 8
RAM 32G 14G (A4) /28G (D4)
Storage EMC devices with 2.5GB write cache 2 Azure data disks striped as RAID0 (no
(disk striping not known) caching)
Network Probably app and cluster have Dual NIC separating app and cluster network
separate network
#nodes 3 3
Load Generator
Unknown spec Same as FromDual test
Load Balancer
Sysbench direct IPs Sysbench direct IPs
Software
OS N/A CentOS 6.5
Sysbench 0.5 0.5.3
Cluster Galera MySQL 5.1 PXC 5.6.20
Test set up
--test=/usr/share/doc/sysbench/tests/db/oltp.lua --mysql-table-engine=InnoDB
--mysql-engine-trx=yes --oltp-table-size=2000000 --max-time=300 --max-requests=0
--num-threads=1 --oltp-tables-count=10 --oltp-auto-inc=off prepare
Test run
for i in 1 2 4 8 16 32 64 128 256 512 1023;
--test=/usr/share/doc/sysbench/tests/db/oltp.lua --mysql-table-engine=InnoDB
--mysql-engine-trx=yes --oltp-table-size=2000000 --max-time=60 --max-requests=0
--num-threads=$i --oltp-tables-count=10 --oltp-auto-inc=off
My.cnf
innodb_log_file_size=1500M
innodb_log_files_in_group=2
innodb_locks_unsafe_for_binlog=1
wsrep_slave_threads=32
innodb_flush_log_at_trx_commit=0
innodb_doublewrite=0
sync_binlog=0
innodb_buffer_pool_size=15G innodb_buffer_pool_size=10G(A4)/15G(D4)
(above are the settings the author wsrep_sst_method=xtrabackup-v2
called out, it’s unclear what other innodb_support_xa=0
settings are used in the test) innodb_log_buffer_size=128M
Only the tests on the 3-node cluster (the dark red line in the Openlife charts) were performed.
Summary
Running Percona XtraDB Cluster in Azure VMs is in many ways similar to running it on on-premises
infrastructure. However, a sound design does depend on careful consideration of the characteristics of
the network, storage, and virtual machines available in Azure. Here is a quick checklist:
- Keep the storage account and MySQL VMs in the same region.
- Disable Azure geo-replication on the storage account.
- Avoid using operating system or temporary disks for database storage and logging.
- Avoid using Azure data disk caching options (caching policy = None).
- Stripe multiple Azure data disks to get increased IO throughput.
- Use Azure Internal Load Balancers to load balance PXC nodes, and properly configure probes to check node health.
- Place PXC nodes in a virtual network with static IPs.
- Place PXC nodes in an availability set to improve availability.
- Increase network throughput by separating application traffic from cluster state transfer traffic with multiple network interface cards.
- Apply optimizations for the PXC configuration where appropriate (for example, sync_binlog=0 and innodb_flush_log_at_trx_commit=0).
- Consider using Azure Premium Storage (DS-series VMs, currently in preview) to achieve higher IO throughput.
Appendix 1 – Automatically provision a MySQL PXC in Azure
We published on GitHub a PowerShell script and a Linux VM custom script that work together to
automatically provision a PXC 5.6 cluster with CentOS 6.5 VMs:
https://github.com/liupeirong/Azure/tree/master/DeployPXC. In addition to provisioning VMs, the
scripts also take care of disk striping, multiple NIC configuration, and load balancing. You also have the
flexibility to completely customize MySQL settings in your own configuration file. If you are using a
different Linux version or flavor, chances are you can still use the published example as a reference and
tailor it to your own environment.
If you don't want to run a PowerShell script, we also published an Azure Automation runbook to the
Automation Gallery. You can access it from the Azure Portal by selecting Azure Automation and then
creating from the gallery.
References & Related Resources
1. Percona XtraDB Cluster 5.6 Documentation: http://www.percona.com/doc/percona-xtradb-cluster/5.6/
2. Performance Best Practices for SQL Server in Azure Virtual Machines: http://msdn.microsoft.com/en-us/library/azure/dn133149.aspx
3. Optimizing MySQL Performance on Azure Linux VMs: http://azure.microsoft.com/en-us/documentation/articles/virtual-machines-linux-optimize-mysql-perf/
4. Using load-balanced sets to clusterize MySQL on Linux: http://azure.microsoft.com/en-us/documentation/articles/virtual-machines-linux-mysql-cluster/
5. Installing MariaDB Galera Cluster on Windows Azure: http://blogs.msdn.com/b/tconte/archive/2014/03/20/installing-mariadb-galera-cluster-on-windows-azure.aspx
6. Running a MySQL Galera cluster on Microsoft Azure: http://blogs.msdn.com/b/holgerkenn/archive/2014/08/06/running-a-mysql-galera-cluster-on-microsoft-azure.aspx
7. Highly Available and Scalable Master-Master MySQL on Azure Virtual Machines: http://sabbour.me/highly-available-and-scalable-master-master-mysql-on-azure-virtual-machines/
8. Running an optimized MariaDB(MySQL) cluster on Azure: http://sabbour.me/running-a-mariadb-cluster-on-azure
9. Azure storage secrets and Linux I/O optimizations: http://blogs.msdn.com/b/igorpag/archive/2014/10/23/azure-storage-secrets-and-linux-i-o-optimizations.aspx