Predicting Space Weather with Docker
Chris Lauer, Senior Application Software Developer,
NOAA Space Weather Prediction Center
● Embedded in several levels of agency bureaucracy:
Commerce, NOAA, NWS, NCEP, SWPC
● Different subject area from our parent agencies; we have
our own software section.
● FISMA high security system. “No cloud!” they said.
● Cool Mission
The Space Weather
Prediction Center
Space Weather?!
1. Trained our developers in scrum
2. Taught us how to build consensus
3. Started sprinting - for 8 years
4. Customer Collaboration over Contract Negotiation
We saw big improvements in morale and capability.
It was time to improve delivery!
First: Agile
We started our move to Agile in 2012
Giant single MS SQL product database for all data
Views and stored procedures reaching across schemas to combine
disparate data sources into something “tidy”. Lots of business logic in the
database.
No Service layer
Vast majority of applications hit the database directly. Deep coupling was
killing our agility.
The Architectural Challenge
How do you get started with CI with this?
1. Took some Jenkins/CI training, with Docker!
2. Labeled the old way: shared-database integration
3. Read up on a new way: Building Microservices
4. Found a small, high-value project (Verification)
5. Split one team into two: database & microservices teams
6. Developed the Verification service under the new rules
Part 1: Seeking out a new way
How we got started
Scientific Software Issues
Scientific software often has delicate, conflicting, and weird
dependencies, and it often offers little configurability. Docker
captures and isolates these dependencies beautifully.
Easing Deployments
We want to go faster! If it works on Dev, it should work on Staging, and it
should work on Production. Docker can guarantee this if we solve
configuration and data persistence for each project.
Why Docker?
New Rules
Only one thing talks
to a database.
No more deep coupling: only
a service can talk to a
database (sketch below).
Automated end-to-end testing.
Jenkins can stand up all
components and
dependencies, drop test data
into ingest, and query the results.
Microservices.
A collection of loosely
coupled, highly cohesive
services that each solve part
of the problem.
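
To make the first rule concrete, here's a minimal sketch of the kind of Flask + MongoDB service it implies (the routes and the collection name are hypothetical; the MONGO_* variables mirror the env file shown later). Only this service touches the database; everything else talks HTTP.

# Minimal sketch (hypothetical routes/collection): the only thing that
# touches MongoDB is this service; applications and ingest go through HTTP.
import os

from flask import Flask, jsonify, request
from pymongo import MongoClient

app = Flask(__name__)
client = MongoClient(os.environ.get("MONGO_SERVER", "mongodb"))
db = client[os.environ.get("MONGO_DATABASE", "goes")]

@app.route("/xrays/latest", methods=["GET"])
def latest_xray():
    # Applications query the service, never the database.
    doc = db.xrays.find_one(sort=[("time_tag", -1)], projection={"_id": False})
    return jsonify(doc)

@app.route("/xrays", methods=["POST"])
def ingest_xray():
    # Ingest processes POST new records; the service owns all writes.
    db.xrays.insert_one(request.get_json())
    return "", 201

if __name__ == "__main__":
    app.run(host="0.0.0.0")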
New Rules
(Continued)
Event-driven.
Low latency; use
asynchronous messaging for
loose coupling.
New stuff!
NoSQL (MongoDB) for a
developer-friendly database.
Python Flask for RESTful
GETs/POSTs. RabbitMQ for
messaging. Docker
containers for everything.
12factor.net
practices.
The new default. Especially
“store configuration in the
environment” and “log to
standard out”.
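
And a sketch of what the event-driven and 12factor rules look like together in a worker (the queue and variable names are assumptions, not our actual config): configuration from the environment, logs to standard out, RabbitMQ via pika for loose coupling.

# Sketch of an event-driven worker: config from the environment, logs to
# stdout, asynchronous messaging via RabbitMQ (pika). Queue and variable
# names here are hypothetical.
import logging
import os
import sys

import pika

logging.basicConfig(stream=sys.stdout, level=os.environ.get("LOG_LEVEL", "INFO"))
log = logging.getLogger("goes-worker")

params = pika.ConnectionParameters(host=os.environ.get("RABBITMQ_HOST", "rabbitmq"))
channel = pika.BlockingConnection(params).channel()
channel.queue_declare(queue="new-data", durable=True)

def on_message(ch, method, properties, body):
    log.info("new data event: %s", body)           # react to the event here
    ch.basic_ack(delivery_tag=method.delivery_tag)

channel.basic_consume(queue="new-data", on_message_callback=on_message)
channel.start_consuming()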
Developers were happy.
So exciting to be learning contemporary technologies. Pace of learning
was rapid! Iterative changes were much easier. It was faster to learn the
new stuff than to deliver in the old system.
Customer was happy.
Customer was thrilled with the capabilities we built, and how quickly they
were delivered.
How did it go?
Amazing.
The microservices team got way out ahead of the
organization very quickly.
We found we’d made a few mistakes. (Learning time!)
We started hitting some new obstacles, e.g. containers
wouldn't build in the secure environment.
But...
And there’s always a but...
Don't give anything Docker daemon access.
We started monitoring our services with Nagios, via the Docker daemon.
Pen testers were able to compromise Nagios and escalate their
privileges. Instead, monitor a RESTful endpoint of your service (check-script sketch below).
And if you do give that access, lock it down!
We were a little loose on security when we initially set up Jenkins (before
we knew what it was for), which made it another easy target for pen testers.
Some small lessons from our
early security mistakes
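
What "monitor a RESTful endpoint" can look like from the monitoring side, as a sketch: a tiny Nagios-style check that probes the service over HTTP instead of the Docker daemon. The URL is hypothetical; the exit codes follow the usual Nagios plugin convention.

#!/usr/bin/env python3
# Hypothetical health check: probe the service's own REST endpoint,
# not the Docker daemon. Exit codes follow the Nagios plugin convention.
import sys

import requests

STATUS_URL = "http://goes-service:5000/health"   # hypothetical endpoint

try:
    resp = requests.get(STATUS_URL, timeout=5)
    resp.raise_for_status()
except requests.RequestException as exc:
    print(f"CRITICAL - {exc}")
    sys.exit(2)   # CRITICAL

print(f"OK - {STATUS_URL} answered in {resp.elapsed.total_seconds():.2f}s")
sys.exit(0)       # OK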
Minor security missteps aside, initial adoption went
great!
Let’s keep learning this Docker stuff!
Part 2: Replacing a Critical
Data Source
Part 2: GOES-16
Initial adoption went well!
Let’s do it again for real, with a critical
component of our operational
infrastructure: GOES satellite data
GOES drives two out of three of our NOAA Scales
X-ray and particle data impact communications, satellites, and high-altitude
radiation exposure. This has to be low latency and reliable.
Adventures in legacy land
Many of our critical applications depended on GOES data, pulled directly
from the database.
Challenge: Replacing Legacy
And getting better at this Docker stuff!
1. Build a service (in Docker, with end-to-end testing, etc.)
2. That service needs to understand and solve:
a. The nature of the data
b. The way other applications use the data
3. Add the old data to the service (GOES-14 and GOES-15)
4. Port legacy applications to pull from the service (sketch below)
5. Add the new data (GOES-16) to the service. Go live!
Replacing the Old Way
Using Docker in our Strangler Pattern
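
Step 4 as a sketch (hypothetical endpoint and field name): the legacy consumer drops its direct SQL query and pulls the same data from the service over HTTP.

# Sketch of porting a legacy consumer: replace the direct SQL query with a
# GET against the GOES service. URL and field names are hypothetical.
import requests

def latest_xray_flux():
    resp = requests.get("http://goes-service:5000/xrays/latest", timeout=10)
    resp.raise_for_status()
    return resp.json()["flux"]   # hypothetical field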
● GOES Service: accepts POSTs from ingest, GETs (queries) from applications.
● LDM: Unidata software for getting raw data.
● GOES-R ingest: processes data from the LDM, posts to the Service (sketch below).
● GOES-NOP ingest: processes data from the legacy PP, posts to the Service.
● High-cadence ingest: processes 1-second and 2-second data, posts to the Service.
● Fluence/Background: generates daily indices, posts to the Service.
● Plotting: hosts the line-plotting UI for forecasters.
● MongoDB: backing store for the Service.
● RabbitMQ: loose coupling between the ingests, LDM, and legacy PP.
● Imagery ingest: processes solar imagery from the LDM.
● Imagery: imagery displays for forecasters.
GOES: Docker Containers
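
The ingest side, sketched with a hypothetical endpoint, queue, and message shape: post parsed records to the Service, then announce the new data on RabbitMQ so downstream consumers stay loosely coupled. (NetCDF parsing is left out of the sketch.)

# Sketch of an ingest container's core step (hypothetical endpoint, queue,
# and message shape): post records to the GOES Service, then announce the
# new data on RabbitMQ.
import json
import os

import pika
import requests

def publish_records(records, source_file):
    requests.post("http://goes-service:5000/xrays",      # hypothetical endpoint
                  json=records, timeout=30).raise_for_status()

    params = pika.ConnectionParameters(host=os.environ.get("RABBITMQ_HOST", "rabbitmq"))
    with pika.BlockingConnection(params) as conn:
        conn.channel().basic_publish(
            exchange="",
            routing_key="new-data",                       # hypothetical queue
            body=json.dumps({"file": source_file}),
        )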
● docker-compose.yml: defines the relationships between
the services and with the host.
● .env: isolates host differences
● service-name.env: one config file for each service
● Docker image(s): hosted on our internal registry
(Harbor)
Four Deployment Artifacts
The new way starts to solidify
docker-compose.yml
version: '2.1'
services:
  goesr-ingest:
    image: harbor.swpc.noaa.gov/goes/goes-microservices_goesr-ingest:latest
    build: goes_ingest
    volumes:
      # define NETCDF_IN in the .env file, which may be hidden by your OS
      - ${NETCDF_IN}:/goes_netcdf_dir
    env_file:
      - goes_ingest.env
    restart: unless-stopped
.env: different on each host
# This file should be modified and saved as '.env' in the same
# directory as docker-compose.yml. It will be hidden, but
# Docker Compose will source these variables when it runs
NETCDF_IN=/data/goesr/ldm
QUALITY_ROUTER_PERSISTANCE=/home/goes_nop/
service-name.env
MONGO_USER=goesUser
MONGO_PW=***
FLASK_USER=goesUser
FLASK_PW=***
SATELLITES=[13,14,15,16,17,18]
MONGO_SERVER=mongodb
MONGO_DATABASE=goes
IS_FLASK_DEBUG=False
GOES_SERVICE_LOG_LEVEL=INFO
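
For illustration, a sketch of how a service might consume these variables (the names come from the file above; the URI construction and the SATELLITES parsing are assumptions):

# Sketch: read the service configuration from the environment (12factor).
# Variable names match the env file above; parsing choices are assumptions.
import json
import os

from pymongo import MongoClient

mongo_uri = "mongodb://{user}:{pw}@{server}/{db}".format(
    user=os.environ["MONGO_USER"],
    pw=os.environ["MONGO_PW"],
    server=os.environ["MONGO_SERVER"],
    db=os.environ["MONGO_DATABASE"],
)
client = MongoClient(mongo_uri)
satellites = json.loads(os.environ.get("SATELLITES", "[]"))   # e.g. [13, 14, 15, 16, 17, 18]
log_level = os.environ.get("GOES_SERVICE_LOG_LEVEL", "INFO")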
Docker Images from Registry
From our local Harbor registry (itself deployed with
docker-compose). http://goharbor.io
1. Make an account on the target host
2. Set up docker-compose.yml, .env, and service*.env
3. Add a robot account in Harbor; docker login to the
SWPC registry with that account.
4. docker-compose pull
5. docker-compose up -d; docker-compose logs -f
Simple instructions for deploying anything
How did it GOES?
Amazing.
Our biggest mistake
Where we really got it wrong.
We didn’t build CI into our
deployments. Or monitor it.
It turned out we weren't very good at
writing tests yet, so the tests were broken
more often than the code.
Developers learned to ignore failed builds.
A missed opportunity to meet the CI goals
of this new architecture.
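
What we'd add today, sketched with hypothetical endpoints and fields: an end-to-end smoke test a CI job can run against the composed stack, so a red build means broken code rather than a broken test.

# Sketch of an end-to-end smoke test a CI job could run after
# `docker-compose up`: drop test data in ingest, query the result.
# Endpoints and field names are hypothetical.
import requests

BASE = "http://localhost:5000"   # hypothetical service address in CI

def test_service_returns_what_ingest_posted():
    sample = {"time_tag": "2019-01-01T00:00:00Z", "flux": 1.0e-06}
    post = requests.post(f"{BASE}/xrays", json=sample, timeout=10)
    assert post.status_code == 201

    latest = requests.get(f"{BASE}/xrays/latest", timeout=10).json()
    assert latest["time_tag"] == sample["time_tag"]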
● Easy to get services running locally in a dev VM!
● Easy to know it will work in staging/production!
● Easy to work on just the part that needs a fix: you can
stand up the whole stack, or just one component!
● Easier collaboration between component devs!
● Dependencies are well documented in code!
● Data persistence is very intentional!
This Docker stuff is great!
Let’s do it for everything!
ICAO: Short-fuse Expansion of Mission
Within a few months, we needed to be running science code in
production to forecast space weather aviation impacts (GPS,
communication, and radiation).
We had models in varying states of not ready
A mix of FORTRAN from the punch-card era, IDL, Perl, Python, shell, and a
model we'd always been told was never going to production. Hardcoded
paths, messy logging, and hidden dataflow.
Part 3: Science Collaboration
Collaborating with scientists, in Docker!
Let’s Teach Docker
I gave a presentation to the scientists:
Docker, Containers, and Continuous
Integration.
We tried three different ways of bringing
these models to production in
containers, using Docker and Docker
Compose.
Credit: Randall Munroe, xkcd.com
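
One plausible shape for containerizing a model like these (an illustration, not our exact approach): a thin wrapper that moves hardcoded paths into environment variables and lets output flow to the container's standard out.

# Hypothetical thin wrapper around a legacy model executable: paths come
# from the environment instead of being hardcoded, and output inherits the
# container's stdout/stderr so logs land in `docker-compose logs`.
import os
import subprocess

work_dir = os.environ.get("MODEL_WORK_DIR", "/model/run")      # hypothetical
input_dir = os.environ.get("MODEL_INPUT_DIR", "/model/input")  # hypothetical

subprocess.run(
    ["./run_model"],            # hypothetical legacy executable (FORTRAN/IDL/etc.)
    cwd=work_dir,
    env={**os.environ, "INPUT_DIR": input_dir},
    check=True,                 # fail the container if the model fails
)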
CTIPe: Max Usable Frequency
Global Total Electron Content
CARI7: High Altitude Radiation
Benefits
We're going faster!
● 1 month to onboard new devs (used to be about a year)
● 5 minutes to deploy to production (used to be an hour, with luck)
● 1 minute to roll back a deployment (and another 5 to try again!)
● 1 hour to stand up a local dev environment (used to be about a week)
Tamara Bledsoe
Ratina Dodani
Marcus England
Kelvin Fedrick
Kiley Gray
Michael Husler
Chris Lauer
Ben Rowells
David Stone
