CERN Cloud Architecture
Ops Midcycle - High Performance Computing with OpenStack - Manchester, 2016
Belmiro Moreira
[email protected]
@belmiromoreira
What is CERN?
CERN Cloud LHC and Experiments
CMS detector
https://www.google.com/maps/streetview/#cern
CERN Cloud AMS
OpenStack at CERN by numbers
~ 5500 Compute Nodes (~140k cores)
~ 5300 KVM
~ 200 Hyper-V
> 17000 VMs running
~ 2800 Images ( ~ 44 TB in use)
~ 2000 Volumes ( ~ 800 TB allocated)
~ 2200 Users
~ 2500 Projects
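As a rough reading of the figures above, the averages they imply can be worked out directly. This is only a back-of-the-envelope sketch: every input is a rounded value from the slide, "> 17000" is treated as exactly 17000, and the decimal TB-to-GB conversion is an assumption.

```python
# Rounded figures from the slide; treat every result as an estimate.
compute_nodes = 5500
cores = 140_000
running_vms = 17_000            # the slide says "> 17000", so this is a lower bound
images, image_tb = 2800, 44
volumes, volume_tb = 2000, 800

cores_per_node = cores / compute_nodes         # average cores per compute node
vms_per_node = running_vms / compute_nodes     # average running VMs per compute node
gb_per_image = image_tb * 1000 / images        # assumes decimal units (1 TB = 1000 GB)
gb_per_volume = volume_tb * 1000 / volumes     # allocated size, not used size

print(f"cores per node:  {cores_per_node:.1f}")
print(f"VMs per node:    {vms_per_node:.1f}")
print(f"GB per image:    {gb_per_image:.1f}")
print(f"GB per volume:   {gb_per_volume:.1f}")
```

The averages hide a wide spread: the KVM and Hyper-V nodes, and old and new hardware, will differ considerably.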
Number of VMs created (green) and VMs deleted (red) every 30 minutes
OpenStack timeline at CERN
OpenStack releases:
ESSEX: 5 Apr 2012
FOLSOM: 27 Sep 2012
GRIZZLY: 4 Apr 2013
HAVANA: 17 Oct 2013
ICEHOUSE: 17 Apr 2014
JUNO: 16 Oct 2014
KILO: 30 Apr 2015
LIBERTY
CERN production infrastructure:
Guppy: Jun 2012
Hamster: Oct 2012
Ibex: Mar 2013
Grizzly: Jul 2013
Havana: Feb 2014
Icehouse: Oct 2014
Juno: Apr 2015
Kilo: Oct 2015
OpenStack timeline at CERN
Evolution of the number of VMs created since July 2013
Number of VMs running
Number of VMs created (cumulative)
Infrastructure Overview
One region, two data centres, 33 Cells
HA architecture only on Top Cell
Child cell control planes are usually VMs running in the shared infrastructure
nova-network with a custom CERN driver; Neutron in one cell
2 hypervisor types (KVM, Hyper-V)
Scientific Linux CERN 6; CERN CentOS 7; Windows Server 2012 R2
2 Ceph instances
Keystone integrated with CERN account/lifecycle system
Nova; Keystone; Glance; Cinder; Heat; Horizon; Ceilometer; Rally; Magnum; Neutron
Deployment using the OpenStack Puppet modules and RDO
Architecture Overview
Geneva Data Centre: Ceph; DB infrastructure
Budapest Data Centre: Ceph; DB infrastructure
Load Balancer
Nova Top Cell; Keystone; Glance; Cinder; Heat; Ceilometer; Neutron; Horizon; Magnum
Nova Compute Cells (...)
Cells
Diagram labels: availability zones AVZ_A, AVZ_B, AVZ_C; sites GVA, WIG; hypervisors KVM, HyperV; Project: uuid1
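The Cells diagram labels cells with an availability zone, a site, a hypervisor type and, in one case, a project. A minimal sketch of how such attributes could narrow down the candidate cells for a request is shown here. It is not Nova's cells scheduler: the `Cell` class, the cell names, the particular cell layout and the `candidate_cells` helper are all hypothetical, chosen only to illustrate the idea.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Cell:
    name: str                       # hypothetical cell name
    avz: str                        # availability zone label, e.g. "AVZ_A"
    site: str                       # site label, e.g. "GVA" or "WIG"
    hypervisor: str                 # "KVM" or "HyperV"
    project: Optional[str] = None   # None means the cell is open to any project


# Invented layout; the real mapping of cells to zones is not given in the slides.
CELLS = [
    Cell("cell01", "AVZ_A", "GVA", "KVM"),
    Cell("cell02", "AVZ_A", "WIG", "KVM"),
    Cell("cell03", "AVZ_B", "GVA", "HyperV"),
    Cell("cell04", "AVZ_B", "WIG", "KVM"),
    Cell("cell05", "AVZ_C", "GVA", "KVM", project="uuid1"),
]


def candidate_cells(cells: List[Cell], project: str,
                    avz: Optional[str] = None,
                    hypervisor: Optional[str] = None) -> List[Cell]:
    """Return the cells that satisfy the request's constraints."""
    matches = []
    for cell in cells:
        if cell.project is not None and cell.project != project:
            continue    # cell is reserved for another project
        if avz is not None and cell.avz != avz:
            continue    # wrong availability zone
        if hypervisor is not None and cell.hypervisor != hypervisor:
            continue    # wrong hypervisor type
        matches.append(cell)
    return matches


if __name__ == "__main__":
    for cell in candidate_cells(CELLS, project="uuid2", avz="AVZ_A"):
        print(cell.name, cell.site, cell.hypervisor)
```

In this sketch an availability zone can span both sites, so a request for AVZ_A may land in either GVA or WIG.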
Nova Deployment at CERN
Top cell controller: Load Balancer, API node, nova-api, rabbitmq, nova-cells, DB
Child cell controller (shown twice, with (...) for further cells): nova-api, rabbitmq, nova-scheduler, nova-cells, nova-conductor, nova-network, DB
Compute node: nova-compute
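The split between a top cell and child cells shows up in configuration as well as in the service layout. The sketch below renders a `[cells]` section for `nova.conf` with Python's `configparser`. The option names `enable`, `name` and `cell_type` are the Nova cells (v1) options as I recall them and should be checked against the documentation for the deployed release; the cell names and the `render_cells_section` helper are hypothetical and not taken from CERN's Puppet configuration.

```python
import configparser
import io


def render_cells_section(cell_name: str, cell_type: str) -> str:
    """Render a [cells] section for one controller as INI text.

    cell_type is "api" for the top cell and "compute" for a child cell
    (assumed to match the Nova cells v1 option values).
    """
    if cell_type not in ("api", "compute"):
        raise ValueError(f"unexpected cell_type: {cell_type!r}")
    cfg = configparser.ConfigParser()
    cfg["cells"] = {
        "enable": "True",
        "name": cell_name,
        "cell_type": cell_type,
    }
    buf = io.StringIO()
    cfg.write(buf)
    return buf.getvalue()


if __name__ == "__main__":
    print(render_cells_section("top", "api"))          # top cell controller
    print(render_cells_section("child01", "compute"))  # one child cell controller
```

Each child cell has its own rabbitmq and DB, so a fragment like this would be rendered once per controller rather than shared.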
Keystone Deployment at CERN
Load Balancer
Keystone: DB; Service Catalogue (Exposed to Users)
Keystone: DB; Service Catalogue (Dedicated to Ceilometer)
Active Directory
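The two service catalogues suggest that Ceilometer traffic is steered to its own endpoints, so that its polling does not compete with user requests. A minimal sketch of that lookup follows. The hostnames, the `endpoint_for` helper and the idea of keying on a caller name are all assumptions made for illustration; the slides do not show how the selection is actually made.

```python
# Hypothetical endpoints; the slides give no hostnames or ports.
USER_CATALOGUE = {
    "identity": "https://keystone.example.org",
    "image": "https://glance.example.org",
}
CEILOMETER_CATALOGUE = {
    "identity": "https://keystone-ceilometer.example.org",
    "image": "https://glance-ceilometer.example.org",
}


def endpoint_for(service: str, caller: str) -> str:
    """Pick an endpoint from the catalogue that matches the caller."""
    if caller == "ceilometer":
        catalogue = CEILOMETER_CATALOGUE   # dedicated catalogue
    else:
        catalogue = USER_CATALOGUE         # catalogue exposed to users
    try:
        return catalogue[service]
    except KeyError:
        raise LookupError(f"no endpoint for service {service!r}") from None


if __name__ == "__main__":
    print(endpoint_for("image", "ceilometer"))
    print(endpoint_for("image", "some-user"))
```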
Glance Deployment at CERN
Load Balancer
Glance node (Exposed to Users): Glance-api, Glance-registry
Glance node (Only used for Ceilometer calls): Glance-api, Glance-registry
Ceph Geneva
DB
Cinder Deployment at CERN
Load Balancer
Cinder node: rabbitmq, Cinder-api, Cinder-volume, Cinder-scheduler
DB
Storage back ends: Ceph Geneva, Ceph Budapest, NetApp
Ceilometer Deployment at CERN
Compute node: nova-compute, ceilometer-compute
ceilometer-central-agent
Message paths: sample UDP; sample RPC; notifications
Cell rabbitmq
Ceilometer rabbitmq
Ceilometer Notification Agent
Ceilometer Pulling Collector
Ceilometer Notification Collector
Ceilometer UDP Collector
Storage: HBase, MongoDB, MySQL
Ceilometer API (shown twice)
Aodh Evaluator & Notifier
Aodh API
Heat
Challenges
Increase capacity to 200k cores by summer 2016
Live-migrate ~5000 VMs
Upgrade ~800 compute nodes from SLC6 to CC7
Retire old servers
Migrate to Neutron
Identity Federation with different scientific sites
Scale Magnum and container deployments
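To give a sense of scale for two of these challenges, the arithmetic can be sketched as follows. The ~140k current cores and the 200k target come from the slides; the batch size for live migration is a purely hypothetical planning parameter, and `migration_batches` is an illustrative helper, not a tool used at CERN.

```python
import math

current_cores = 140_000    # approximate figure from the "by numbers" slide
target_cores = 200_000     # target stated for summer 2016

extra_cores = target_cores - current_cores
growth_pct = extra_cores / current_cores * 100


def migration_batches(total_vms: int, batch_size: int) -> int:
    """Number of batches needed to live-migrate total_vms, batch_size at a time."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return math.ceil(total_vms / batch_size)


if __name__ == "__main__":
    print(f"additional cores: {extra_cores} (+{growth_pct:.1f}%)")
    # 50 VMs per batch is an arbitrary example value.
    print(f"batches for ~5000 VMs: {migration_batches(5000, 50)}")
```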
[email protected]
@belmiromoreira
http://openstack-in-production.blogspot.com