EPUB is an open, industry-standard format for e-books. However, support for EPUB and its many
features varies across reading devices and applications. Use your device or app settings to customize
the presentation to your liking. Settings that you can customize often include font, font size, single or
double column, landscape or portrait mode, and figures that you can click or tap to enlarge. For
additional information about the settings and features on your reading device or app, visit the device
manufacturer’s Web site.
Many titles include programming code or configuration examples. To optimize the presentation of
these elements, view the e-book in single-column, landscape mode and adjust the font size to the
smallest setting. In addition to presenting code and configurations in the reflowable text format, we
have included images of the code that mimic the presentation found in the print book; therefore, where
the reflowable format may compromise the presentation of the code listing, you will see a “Click here
to view code image” link. Click the link to view the print-fidelity code image. To return to the
previous page viewed, click the Back button on your device or app.
Microservices with Docker on Microsoft
Azure™
Boris Scholl
Trent Swanson
Daniel Fernandez
Foreword
Preface
Acknowledgments
About the Authors
1 Microservices
2 Containers on Azure Basics
3 Designing the Application
4 Setting Up Your Development Environment
5 Service Orchestration and Connectivity
6 DevOps and Continuous Delivery
7 Monitoring
8 Azure Service Fabric
A ASP.NET Core 1.0 and Microservices
Index
Contents
Foreword
Preface
Acknowledgments
About the Authors
1 Microservices
What are Microservices?
Autonomous Services
Small Services
Benefits of Microservices
Independent Deployments
Continuous Innovation
Improved Scale and Resource Utilization
Technology Diversity
Small Focused Teams
Fault Isolation
Challenges
Complexity
Network Congestion and Latency
Data Consistency
Testing
Integration and Versioning
Service Discovery and Routing
Monitoring and Logging
Skillset and Experience
Uptime Service Level Agreement
Best Practices
Encapsulation
DevOps Principles and Culture
Automation
Monitoring
Fault Tolerance
Summary
2 Containers on Azure Basics
VMs, Containers, and Processes
When Would We Use a Container Over a Virtual Machine or a Process?
Containers on Azure
Creating an Azure VM with Docker
Generating an SSH Public Key on Windows
Generating an SSH Public Key on Mac OS X
Choosing a Virtual Machine Image
Connecting to the VM Using SSH and Git Bash on Windows
Connecting to the VM Using SSH and Git Bash on Mac OS X
Docker Container Basics
Summary
3 Designing the Application
Determining Where to Start
Coarse-Grained Services
Starting with Microservices
Defining Services and Interfaces
Decomposing the Application
Service Design
Service to Service Communication
Synchronous Request/Response
Asynchronous Messaging
Monolith to Microservices
Flak.io e-Commerce Sample
Flak.io
Requirements
Architecture Overview
Considerations
Summary
4 Setting Up Your Development Environment
Using Docker for Local Development
Docker for Local Development
Docker for Production Validation
Docker as a Build/Test Host
Developer Configurations
Local Development
Local and Cloud
Cloud Only
Managing Docker Authentication
Choosing a Base Image
Build a Hierarchy of Images
Setting up your Local Dev Environment
Install Docker Tools
Install Developer Tools
Install Windows Utilities
Install OSX Utilities
Docker for Local Development
Local Development Settings
Starting your Local Docker Host
Connecting to a Docker Host
Cloning Samples
Enabling Live Reload in a Docker Container
Volumes
Preparing your Microservice for Production
Docker Compose
Debugging Docker Issues
Unable to Connect to the Docker Host
Containers That Won’t Start
Diagnosing a Running Container
Summary
5 Service Orchestration and Connectivity
Orchestration
Provisioning
Infrastructure as Code
Azure Resource Manager
Azure Container Service
Multivendor Provisioning
Scheduling and Cluster Management
Challenges
A Scheduling Solution
Docker Swarm
Kubernetes
Apache Mesos
Using Apache Mesos to Run Diverse Workloads
Service Discovery
Service Registration
Service Lookup
Service Registry
Technologies
Other Technologies
Application/API Gateway
Overlay Networking
Summary
6 DevOps and Continuous Delivery
DevOps Overview
Modern DevOps
DevOps Culture
Continuous Integration, Delivery, and Deployment
Creating Environments in Azure
Deploying a Microservice with Continuous Delivery
Application Configuration Changes Across Different Environments
Continuous Integration
Testing in a QA Environment
Deploying to Staging
Testing in Production
Choosing a Continuous Delivery Tool
On-Premises or Hosted?
On-Premises or Hosted Build Agents?
Best-of-breed or Integrated Solution?
Does the Tool Provide the Extensibility You Need?
Comparison of Jenkins, Team Services, Bamboo, and Tutum
Docker Cloud (Formerly Called Tutum)
Summary
7 Monitoring
Monitoring the Host Machine
Monitoring Containers
Monitoring Services
Monitoring Solutions
Azure Diagnostics
Application Insights
Operations Management Suite (OMS)
Recommended Solutions by Docker
Summary
8 Azure Service Fabric
Service Fabric Overview
Service Fabric Subsystems
Cluster Management
Resource Scheduling
Service Fabric Application
Custom Applications (Existing Applications)
Container Integration
Service Discovery
Programming Model
Stateless Services
Stateful Services
Reliable Actors
Reliable Services
Application Lifecycle
Service Updates
Application Upgrades
Testability Framework
Summary
A ASP.NET Core 1.0 and Microservices
A New Version of ASP.NET
Getting Started
Choosing the Right ASP.NET Docker Image
Visual Studio 2015 Tooling
ASP.NET Microservices Best Practices
Index
Foreword
Over the last couple of years, we have seen Azure evolve from a simple .NET-based platform to an
open and flexible platform, supporting the broadest selection of operating systems, programming
languages, frameworks, tools, databases and devices for infrastructure-as-a-service (IaaS), platform-
as-a-service (PaaS), and software-as-a-service (SaaS) workloads. As a result, Azure is growing at
an amazing rate with both existing and new customers.
Today, there is not a single industry that does not consider making use of the cloud in one form or
another, from big compute to Dev/Test to SaaS solutions. For IT and developers, flexibility and
agility are the number one reason for adopting Azure. A typical pattern of customers adopting Azure
is to start with dev/test scenarios, followed by moving existing applications to run IaaS-based hybrid
scenarios, and eventually developing new applications to take full advantage of the cloud platform.
The Azure cloud infrastructure is now in a place where it provides the flexibility to accommodate
almost every scenario. Thus, customers have realized that their application design is now the limiting
factor. Many customers still have a monolithic application design in place that makes it hard to
independently update, version, deploy, and scale individual application components. Therefore,
despite the cloud being extremely agile and flexible, the application itself limits the agility needed to
react quickly to market trends and customer demands.
Over the last couple of months, microservices-based applications have become the most talked-
about new architectural design, enabling previously impossible agility and ease of management.
Docker containers turn out to be a perfect technology to enable microservice-based applications,
from a density, DevOps, and open technology perspective. When coupled with Docker,
microservices-based applications are a game changer when it comes to modern application
development in the cloud.
I’m really excited that Azure offers the foundational technology and higher-level services to
support any type of microservices-based application. You can build applications using Docker
containers on Apache Mesos with Marathon/Chronos/Swarm or you can build applications on our
own native microservices application platform, Service Fabric. Azure offers the right choice for your
scenario.
Whether you have just gotten your feet wet with containers or microservices, or you have already
advanced in that subject, this book will help you understand how to build containerized
microservices-based applications on Azure. Beyond just describing the basics, this book dives into
some of the best practices each aspiring microservices developer or architect should know. Boris,
Trent, and Dan are the very best people to walk through both the basics and the advanced topics! All
of them have deep real-world experience building applications using these models, and amazing
product insight on Azure and the cloud. I am excited to see what you can build using the skills they
share in these pages!
—Corey Sanders
Partner Director of Program Management, Azure
Preface
The three of us have been working with the Microsoft Azure Cloud platform since its first release in
2009. Our collective work with the platform is fairly broad—from building the platform and
applications on the platform to creating the Azure development tools and experiences. In addition, we
have enabled numerous customers and partners to build their large-scale cloud-native applications on
Microsoft Azure. Over the years, we’ve been able to apply many lessons learned from our interactions with customers, ranging from designing applications for resiliency and scale all the way to DevOps best practices, back into Azure platform capabilities, Azure tooling, and technical documentation.
However, some questions and problems continued to persist. Just to name a few, how do we make
sure that what works on my development machine also works in a cloud environment? How should
we think about structuring the application so that we do not need to update everything if there is just a
minor change to one component? How do we deploy updates as fast as possible without downtime?
Finally, how do we handle configuration and environment changes?
In 2013, we began hearing more industry leaders and customers talk about Netflix, Amazon, and
other businesses using microservices as an architectural approach to address those challenges. We
did a head-to-head comparison with our most successful architectures (with both internal and external
customers) and realized that we had already implemented many characteristics of microservices
patterns—for example, designing Cloud Services applications based on workloads or the
decomposition of an application into multiple services with the lifecycle of individual
components/services as the motivation. Clearly, architectures were evolving in this direction, and when the term “microservices” became popular, many architects and developers realized their own designs were already heading that way.
Enter Docker. Docker reduces deployment friction and the cost of placing a single service into a
single host. This reduced deployment friction helped manage the deployment of the growing number
of services common in a microservices architecture and helped standardize the deployment
mechanisms in a polyglot environment. The programmable infrastructure offered by cloud
environments, along with containers, paved the way for microservices architectures.
But having the right architectural approach and tools is just half the equation. Thinking through how to set up development and test environments, how to automate the DevOps flow, how to orchestrate and schedule Docker containers in a cluster of virtual machines, how to make the microservices discoverable by other services, and how to monitor the environments and the services constitutes the critical other half.
We, the author team, have spent the last two years working on microservices and Docker scenarios, whether on the Visual Studio tooling for Docker engineering team, on the Azure Service Fabric compute engineering team, or directly with our customers.
We wrote this book to share the hard-earned lessons we’ve learned and provide you with the tools
you need to succeed building microservices with Docker on Azure.
Acknowledgments
Special thanks to Bin Du for his help with the Monitoring chapter, particularly the Azure diagnostics part. Bin Du is a senior software engineer at Microsoft who has been working on Visual Studio diagnostics tools for the Azure platform for more than five years. His experience and insights in
diagnostic telemetry, performance troubleshooting, and big data analytics made it possible to write
about Azure diagnostics and diagnostics practices from an inside view of a developer who is
working in those areas.
Special thanks also to Den Delimarschi for helping author the Continuous Delivery chapter. Den is
a Program Manager who had previously worked on improving the build and continuous integration
process for Outlook, so his insights and hands-on experience were invaluable.
This book would not have been possible without the help of many people. Special thanks to all of the reviewers for their thoughtful comments and discussion, which helped greatly improve the quality of the book. In particular, thanks to Ahmet Alp Balkan, Bin Du, Chris Patterson, Christopher
Bennage, Corey Sanders, Den Delimarschi, Donovan Brown, Jeffrey Richter, John Gossman, Lubo
Birov, Marc Mercuri, Mark Fussell, Masashi Narumoto, Matt Snider, Vaclav Turecek.
We also thank our families and friends who have contributed in countless nontechnical ways. We
could not have done it without your support!
About the Authors
Boris Scholl is a Principal Program Manager on the Microsoft Azure compute team, where he looks after Service Fabric custom application orchestration, container integration, and Azure’s OSS developer and DevOps story for container-based workloads. Prior to this, he led the Visual Studio Cloud Tools team, focusing on architectural and implementation patterns for large-scale distributed cloud applications, IaaS developer tooling, provisioning of cloud environments, and the entire ALM lifecycle. Boris gained his experience by working as an architect for global cloud and SharePoint solutions with Microsoft Services. In addition to being a speaker at various events, Boris is the author of many articles related to Azure development and diagnosing cloud applications, as well as co-author of the book SharePoint 2010 Development with Visual Studio 2010 (Addison-Wesley Professional, ISBN-10: 0321718313).
Trent Swanson is a typical entrepreneur. As a cofounder and consultant with Full Scale 180, he
works with some of Microsoft’s largest customers, helping them migrate and build applications on the
Microsoft Azure platform. He has been involved in building some of the largest applications running
on Microsoft Azure today, some of which now utilize Docker and a microservices architecture. Trent
often works with the Microsoft Patterns and Practices team developing guidance and best practices
for cloud applications, where he also co-authored a book on cloud design patterns. As a cofounder of
Krillan and Threadsoft, he has built applications based on a microservices architectural style using
Docker, Node.js, Go, and Mesos. As a cofounder of B & S Enterprises, he dabbles with various IoT
technologies for commercial building management.
Dan Fernandez is a Principal Director managing the Developer Content teams for Visual Studio,
Team Services, ASP.NET, and parts of Azure. Prior to this, Dan worked as a Principal Program
Manager managing the developer experience for Docker including Visual Studio, Visual Studio Code,
and Docker continuous integration using Visual Studio Team Services. Dan is also the author of the
Channel 9 video series Docker for .NET Developers. You can find Dan on Twitter at @danielfe.
Tech Editors
Marc Mercuri is currently a Principal Program Manager in the Azure Customer Advisory Team
(CAT) and the functional lead of the team’s strategic accounts program. Marc has been actively
working in the software and services industry for the past 20 years. The majority of this time has been
focused on distributed computing scenarios, with the past 6 years exclusively focused on cloud-
centric (PaaS, SaaS, and IaaS) enterprise projects and commercial products.
Marc has been involved in the architecture of hundreds of high-profile, high-scale public and
hybrid cloud solutions. The result is significant experience in designing, implementing, operating, and
supporting reliable cloud solutions and services at scale. This experience includes both general cloud
architecture best practices as well as nuanced considerations for key scenarios (IoT, Enterprise
Hybrid, Consumer Mobile, Big Data Pipelines, Machine Learning, Deployment Automation, Open
Source Software Deployment/Configuration, etc.).
Marc is the author of four books on services and identity and is a speaker at internal and external
conferences. He has 10 issued patents and 17 patents pending in the areas of cloud, mobile, and
social.
Nigel Poulton is a self-confessed tech addict who is currently living it large in the container world.
He creates the best IT training videos in the world for Pluralsight (those are his words) and co-hosts
the In Tech We Trust Podcast. In his spare time he does a good impression of a husband and a father.
He lives on Twitter @nigelpoulton and he blogs at http://nigelpoulton.com.
1. Microservices
Software is consuming the world and becoming a core competency of nearly all businesses today,
including traditional businesses whose focus is not specifically technical. A growing number of
industries need software to remain relevant, and that software must be able to evolve at a rapid rate
to meet the demanding needs of today’s fast-changing and competitive markets. The way we build and
manage software continues to evolve as new technologies enter the market and we learn of new ways
to build better systems. Many early concepts still hold true although others have simply evolved, or
have been adapted to new technologies. New applications are showing up every day to meet the
growing needs of consumers and businesses alike.
As we begin building a new application, the team and codebase are generally small and agile when
compared to a more mature application. As new features are added to the application, the team and
codebase continue to grow. The increased size of the team and codebase brings some new challenges
and overhead that can begin to slow progress. The code becomes increasingly difficult to understand
and reason about, leading to longer development ramp-up times. Enforcement of modularity becomes challenging, encapsulation easily breaks down, and the application grows increasingly complex and brittle, requiring additional coordination across teams when changing features.
Builds and testing begin to take longer and releases become more fragile, including a larger set of
changes with each deployment. The smallest change to a component can require the redeployment of
the application, which can take a long time to build, test, and deploy. If one component of the
application needs to be scaled, the entire application needs to be scaled. Everything about the
application is tightly coupled: teams, builds, deployments. This might not be a problem for some
applications, but for large-scale applications that must evolve at a rapid rate, this can be a big
problem.
For a long time now we have understood the benefits of componentization and encapsulation in
software, but enforcement can be challenging. Through the recent DevOps shift we now understand
the benefits of smaller incremental releases, and making the entire team accountable for the complete
lifecycle of the application. When building large-scale applications, we understand the benefits of
partitioning, and the cost of coordination on scale. As with applications, some organizations can also
benefit from breaking down into smaller teams, helping to reduce coordination overhead.
Over recent years many organizations have begun to realize the benefits of decomposing
applications into smaller isolated units that are easier to deploy, and structuring teams around those
units. This decomposition of applications into smaller autonomous units is part of what is called a
microservice architecture.
Note
A microservices architecture has also been referred to as “fine-grained SOA.” We will
not discuss the differences and similarities between a microservices architecture and
service-oriented architecture (SOA) in this book. Both are service architectures dealing
with distributed systems communicating over the network. A microservices architecture
tends to have a specific focus on incremental evolution over reusability.
Instead of building one single codebase that all developers touch, which is often large and risky to manage, the application is composed of smaller codebases independently managed by small, agile teams. The only dependency the codebases have on one another is through APIs (application programming interfaces). This architectural style has become popular with large-scale web companies
like Netflix, eBay, and Amazon. These companies were struggling to scale their applications and, at
the same time, bring new features to market rapidly with their existing monolithic architecture. They
realized that instead of building one monolithic application, they would build smaller services to
handle discrete business functions. The services are self-contained, and the boundaries between the
services are well-defined APIs, exposing the business capabilities of the service.
In Figure 1.1, we see a monolithic application with all features and components of the application
built and deployed together as a single unit. The monolithic application can be scaled out to multiple
instances, but it’s still deployed as a single application.
Autonomous Services
Microservices is an architecture that is optimized around autonomy. Autonomy and loose coupling
between the services are important characteristics of a microservices architecture. Loose coupling
means that we can independently update one service without making changes to another. The loose coupling extends throughout the entire service, across the teams, and right down to the data store; coupling at the data store is a type that is often overlooked.
Microservices interact through APIs, and they do not share database schema, data structures, or
internal representations of the data. Teams that have been building web applications might be familiar
with some of these concepts and have consumed other third-party services when building their
application. If we are consuming a geocoding service from a third party, we would not integrate through the database or a shared service bus that couples our system to the third-party service. Instead, we would integrate through an API, and that API would be well documented,
versioned, and easy to consume. If the service owner needs to fix a bug in the API, they can make and
deploy the change without coordinating this effort with all the consumers of the API. These changes
should not affect consumers of the application and if for some reason a breaking change is needed,
then a new version of the interface is deployed and consumers can move to this version when they are
ready.
You can think of each service as its very own application, having its own team, testing, build,
database, and deployments. These services are built and managed like any other small application.
They must remain independently deployable and will generally service a single purpose. This brings
us to the next topic, the size of the individual services.
Small Services
The individual services in a microservices architecture generally adhere to the single responsibility principle. The services are meant to be simple, doing one thing and doing it well. In theory, if a service has more than one responsibility, then we should consider breaking it up into separate services.
In reality, following this principle might not make sense at times, and there can actually be good
reasons to keep some of the services more coarsely grained. Some functions of the system may
inherently be more tightly coupled or it may be too costly to decompose. This could be due to the
need for transactional consistency, or it may just be that the features are highly cohesive and become
too chatty.
The question still remains as to how big or small a service must be for it to be a properly sized
microservice. There are no specific rules to measure the size of a service, but there are some general
guidelines the microservices community has identified, which work well in most situations. Many
start with Amazon’s principle that a service should be no bigger than one that can be developed and
managed by a two-pizza team (the entire team can be fed with two pizzas). Others will use lines of
code or the ability for team members to understand the entire service. Whichever metrics are used,
the goal is that the individual services have a single purpose and can be easily understood and
managed by small teams. As engineers, we will need to be pragmatic and consider various factors
when defining service granularity and boundaries, like cohesion, coupling, communications, and data
architecture.
Defining service boundaries can be one of the most challenging tasks with a microservices
architecture, and doing so requires a strong understanding of the business domain. Whether
decomposing an existing monolith or starting a new application, we start from the business
perspective, considering the business capabilities. A good approach to identify the boundaries that
separate services is using a bounded context from domain-driven design (DDD). We will cover more
of the details for defining service boundaries in Chapter 3, “Designing the Application.”
Benefits of Microservices
Now that we know a bit of what these microservices are about, let’s have a look at the benefits
provided through this approach and why we might consider using this strategy in our application.
Independent Deployments
With a large monolithic application, fast reliable deployments can be problematic. When updating a
feature or fixing a bug in a monolithic application, we typically need to build and deploy the entire
application. Even the smallest change to a library or component will require a redeployment of the
entire application, including all those other components that have not changed. Depending on the size,
technologies, and processes used, building and deploying an update can take a significant amount of
time. In some situations it can take many hours to push out a new build for even small changes or bug
fixes. Not only is our update potentially held up by other changes that need to be deployed with it,
those other changes may cause deployment failures and require a rollback of the deployment. If we
could decouple changes to one feature of the application from others, these problems could be
avoided.
As a monolithic application grows, so does the fear of making changes, and as a result, all
development slows down. Businesses and IT departments avoid touching large monolithic
applications due to the increased risk of failure and long roll-back processes. This results in a
stalemate situation where an update is required but is not implemented due to the perceived risks. By
decoupling changes to one component of the application from other components, only the parts that
changed need to be released.
In a microservices-based architecture, each microservice is built and deployed independently of
the other services, provided we are able to maintain loose coupling between the services. This means
that we can deploy a critical bug fix to our payment processing component (or microservice)
completely independently of other changes needed by the catalog service. If the changes to the catalog
need to be rolled back, they can be rolled back while our payment processing bug fix remains in
production. This makes it possible for large applications to remain agile, and to deploy updates more
quickly and more often. A microservices design removes the coordination overhead, as well as the
fear factor of making changes. Quite often change advisory boards are no longer needed and changes
happen far more quickly and efficiently.
Note
Fast, reliable deployment through automation is an important factor to reducing Mean
Time to Resolution (MTTR).
Not only do the smaller services enable these separate deployments and updates, given that we
only have to build a small part of the application, build times can be much faster. Depending on the
size of the application, this can be significant, even with parallel builds and tests.
Continuous Innovation
Everything in business and technology is changing at a rapid pace, and companies can no longer seek
and rely on sustainable competitive advantage. Companies instead need to innovate continuously to
stay competitive, or even just to remain in business. Organizations must remain agile and able to
respond to fast-changing market conditions very quickly, and software has to be able to quickly adapt
to these changing business needs.
It’s not uncommon for organizations to refrain from pushing updates during their busiest seasons,
unless they absolutely have to. For example, ecommerce sites have generally stopped deploying
noncritical updates to systems during the busy holiday season. The thinking is, “Don’t touch anything
unless we absolutely need to because we dare not break something right now.” Downtime during
these critical periods can be extremely costly to the business—even a few minutes could mean
significant losses in sales.
Through independent deployments and rigorous DevOps practices, a microservices architecture
can make it possible for organizations to release updates during these critical times without adversely
impacting the business. The software and services can continue to evolve, and valuable feedback can
be gathered, enabling a business to innovate at a very rapid rate.
For example, during the holiday season a new recommendation service can be deployed without
affecting the rest of the application. A/B testing can be performed during these times of heavy usage,
and knowledge gained in this way can be used to improve conversions, or the user experience in the
application. Fixes and new features can be deployed using release management strategies, like blue-
green deployments and canary releases. If there is a problem with the changes, they can be quickly
rolled back. Even if the updates temporarily break recommendations or performance is a problem, the
components handling the purchase of goods remain untouched. Even if we break something, a good
deployment strategy would impact a small number of customers, the problem would be detected
early, and we would roll back. Deployments in these situations cease to become an event, and are
simply daily routine. See Chapter 6, “DevOps and Continuous Delivery,” in this book for more
information on microservices release management strategies.
The capability for a business to continuously innovate in today’s competitive markets is often
reason enough to consider a microservices architecture approach.
Technology Diversity
In the monolithic application, everyone needs to agree on a language, stack, and often a specific
version of a stack. If one component of an application can benefit from some new features in the most
recent .NET framework release, we may not be able to adopt it because another component in the
application might be unable to support it.
Unlike the libraries and components of a monolithic application, with a microservices approach
the application can be composed of independent services that leverage different frameworks,
versions of libraries or frameworks, and even completely different OS platforms.
Technology Diversity Benefits:
• Avoid long-term commitment to a stack
• Enable the best technology for the team
• Avoid stack version conflicts between features or libraries
• Employ evolutionary design
Developers are passionate about technology and often have very strong opinions for or against a
technology. These debates can go on for days, months, and even longer. All this discussion occurs
before we can even begin building an application, and then we have to wait on the framework team to
complete development.
With microservices we can use the best tool for the job. We can select the same or different
technology stack for each individual microservice we are building. Different languages, stacks, data
stores, and platforms can be used with each service. One of the services could be implemented using
Node.js, and another service can be written in Java or .NET. Components of an application can
require different versions of a framework or stack. One component of an application might not be able to use the latest version of a framework because of another component’s incompatibility with that version.
As we can see in Figure 1.3, services can be implemented using a number of different languages
and different data stores. A catalog service can be implemented in Java and benefit from
Elasticsearch features, while the payment processing service can use ASP.NET and a transactional
database with strong consistency and reporting features like SQL Server.
Note
Just because we can use different languages and technologies across the various services
does not mean we necessarily should.
The technology diversity and small size of the services enables evolutionary design. Technology is
moving at an extremely rapid rate and leveraging new technologies in a monolith can be slow. If a
decision is made to move to a new platform or technology, the entire application needs to be changed
to support it. With the microservices architecture individual services can quickly adopt technologies
that benefit the needs of that service without having to worry about the impact to other features in the
application.
Fault Isolation
In a monolithic application, a memory leak in one component, or even a third-party library, can affect
the entire application and be hard to track down.
One of the benefits of a microservices architecture is the isolation of components or features. If one
service crashes, it’s quite possible the rest of the application can continue to operate until that service
recovers. This of course assumes the service is implemented to handle the faults. Also, if one service
is suffering from growing memory pressure due to a leak, it can be much easier to identify the service
causing the issue. We might still be able to handle requests for the application even though one of its
services is down. As depicted in Figure 1.4, even though one service is unhealthy and can be causing
the process to crash, other services remain unaffected. As long as they are implemented correctly they
can remain healthy.
Challenges
Despite the many benefits of microservices architectures, they come with their own set of challenges.
Although our complex application is decomposed into smaller, simple services, the architecture
needed to run and manage them can present some big challenges. All these services need to work
together to deliver the total functionality of our application.
Complexity
Distributed systems are inherently complex. Although the individual services themselves remain
small and generally simple, the communication between the services and the management of the
services becomes much more complex.
Keeping a monolithic application running on a small cluster can be a full time job in itself. In a
microservices architecture, instead of one deployment, we are now deploying tens or hundreds of
services, all of which need to be built, tested, and deployed, and that need to work seamlessly
together. We need to ensure that all the services are up and running, have sufficient disk space and
other resources, and remain performant. These services can even be implemented in multiple
languages. All the services generally require load balancing and communication over synchronous
and asynchronous messaging. A lot of the complexity in a microservices architecture is shifted into
the infrastructure and may require experienced infrastructure developers.
Data Consistency
In our monolithic application, all data can be stored in a single database, and even if we partitioned
it, we probably partitioned the data in such a way that we could still use the database to enforce
referential integrity and use transactions. In our microservices application, the data is probably
distributed across multiple databases, and in addition, they may be entirely different types of
databases.
As we decentralize our data and move to a model where each service has its own schema or
database, this causes data consistency and integrity challenges that can be hard to solve. Data in one
service can reference data in another service where maintaining integrity between both is important,
and if data changes in one database, it must change in the other.
What if we reference some data in another service that is deleted, or maybe the data changes and
we need to be made aware of this? If we replicate data, how do we keep it consistent and who owns
it? If we cache it, how do we invalidate caches?
Dealing with these consistency challenges, and concepts like eventual consistency, can be
extremely hard to get right. Although dealing with these data consistency and integrity concerns can be
quite challenging, there are some techniques we can use. We can use a notification service to publish
changes to data where consumers can subscribe to these changes and updates. Another approach
would be to set proper TTLs where some defined and acceptable inconsistency can be tolerated. We
can implement the services in a way that they can deal with eventually consistent or inconsistent data.
Many of these challenges are not new to microservices, as we commonly need to partition and
replicate data in large systems as a way to increase scale and availability.
Testing
Service dependencies can complicate testing and generally require a more formal effort describing
the service communication interfaces.
It can be difficult to recreate environments in a consistent way when you have multiple services evolving at different rates. Whatever we deploy into a staging environment will not exactly reflect what’s in production. Setting up that many instances for the staging environment may not be worth the
return. We can perform integration tests using mocks, service stubs, and consumer-driven contracts. A
tool called Pact can be useful (https://github.com/realestate-com-au/pact).
A release should not end with the deployment of the service update. Testing in production is
generally the way to go, and by placing more emphasis on monitoring we can detect anomalies and
issues very quickly and roll back. Development teams should invest in automation and use strategies
like blue-green deployments, canaries, dark canaries, A/B testing, and feature flags. This requires advanced DevOps, automation, and monitoring.
If each of our services has an SLA of 99.9%, which means an individual service can be down for
roughly forty-three minutes every month, we need to consider the fact that they could and are
statistically likely to go down at different times. Now let’s say the application is dependent on three
services, each of which can be down for forty-three minutes a month at different times and still meet
their SLA. This means the combined effective SLA for our application is now approximately two
hours of outage each month.
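To see where that figure comes from, here is a rough back-of-the-envelope check (assuming a 30-day month): a month has 30 × 24 × 60 = 43,200 minutes, and 0.1% of that is about 43 minutes per service. With three such dependencies the combined availability is roughly 0.999 × 0.999 × 0.999 ≈ 0.997, which works out to about 0.003 × 43,200 ≈ 130 minutes, or a little over two hours, of potential downtime each month.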
Detecting the failure of a dependent service and gracefully dealing with it can help keep the
application’s overall SLA higher. The application might be able to temporarily provide service with
limited functionality in the event that one of its service dependencies is experiencing issues. These
are some important considerations when designing the application.
Best Practices
Several best practices for the design and implementation of microservices architectures have been
established by organizations working with microservices over the years. Before diving into the
design and development of our microservices, we will cover some of the best practices and some of
the challenges in building microservices.
Encapsulation
Encapsulating the internals of a service behind a carefully defined service boundary that does not expose those internals is important. Maintaining loose coupling in microservices is important
to realizing the benefits of a microservices approach. If we are not careful, our microservices can
become coupled. Once these services become coupled we can lose a lot of the value in a
microservices architecture, like independent lifecycles. Deploying and managing a microservices-
based application with tightly coupled services can be more complex than deploying and managing a
monolith. If the deployment of a service needs to be closely coordinated with other services, there is
likely coupling as a result.
Common causes of service coupling:
• Shared Database Schema
• Message Bus or API Gateway
• API design
• Rigid Protocols
• Exposing service internals in the API
• Organizational coupling (Conway’s Law)
Conway’s Law
Conway’s Law generally states that the design of a system reflects the communication
structure of the organization.
Shared databases and schemas are common causes of coupling. We start out building our
microservices, and because our microservices all use a SQL database, and there is some overlap in
the schema, we simply have a bunch of our services share a database. All of our data is related in
some way, so why not? It would appear that we can deploy and manage our services independently
until we need to change something in the database, and then we need to coordinate a deployment with
the other services and their teams. It might sometimes happen that a database schema never changes; more often than not, however, it does need to change over time. There can be good reasons for two or
more microservices to share database schema and still remain separate microservices, but it’s
important to understand what the trade-offs are.
Another common cause of coupling that was prominent in past implementations of service-oriented
architecture was an Enterprise Service Bus (ESB). The service bus often contained schemas,
transforms, and business logic that were tightly coupled with the services that integrated with them.
Not only do we need to ensure the design and implementation enables us to reduce or eliminate the
friction between our services so we can move quickly, it’s important that the teams themselves do not
become coupled.
Automation
Use of automation is recommended when building a monolithic application, but it is absolutely
necessary for microservices applications. Given the increased number of services that need to be
built and deployed, it’s even more important to automate the build and deployment of the services. A
widely accepted principle is that the first thing a team needs to build when starting a microservices
project is the continuous delivery pipeline. Manual deployment takes a lot of time, and is the primary
cause of failures in production. Multiply this issue by the number of services, and the chances of
human error go up significantly. It is much better to write code or build a deployment pipeline than be
responsible for manual deployments.
The automated pipeline can make these functions smoother and less error-prone:
• Builds
• Tests
• Deployments
• Rollback
• Routing & Registry
• Environments
All things should be automated and there should be no need for a deployment manual. Not only
should we be concerned with automating the delivery of the application, we should consider
automation for development. It should be easy for developers to provision environments, and friction
due to a lack of automation in this area can impede development.
Monitoring
Good monitoring is important to keeping any application healthy and available. In a microservices
architecture good monitoring is even more important. We not only need to be able to monitor what’s
going on inside a service, but also the interactions between the services and the operations that span
them.
• Use activity or correlation IDs: These are used to associate events to a specific activity or
transaction.
• Use a common log format: This enables better correlation of events across services and
instances.
• Collect and analyze logs: By collecting logs from the various services and instances, we can
analyze and query across multiple services and instances of a service.
• Consider using a sidecar: A sidecar is simply an out-of-process agent that is responsible for
collecting metrics and logs. We can run this in a linked container.
In addition to these points, all the standard monitoring tools and techniques should be used where
appropriate. This includes endpoint monitoring and synthetic user monitoring.
See Chapter 7 for more information on best practices for monitoring and logging with
microservices architectures.
Fault Tolerance
Just because we move to a microservices architecture and have isolated one component from another
in either a process, container, or instance, it does not necessarily mean we get fault tolerance for free.
Faults can sometimes be more prevalent and more of a challenge in a microservices architecture. We
split our application into services and put a network in between them, which can cause failures.
Although services are running in their own processes or containers and other services should not
directly affect them, one bad microservice can bring the entire application down. For instance, a
service could take too long to respond, exhausting all the threads in the calling service. This can
cause cascading failures, but there are some techniques we can employ to help address this.
• Timeouts: Properly configured timeouts can reduce the chance of resources being tied up
waiting on service calls that fail slowly.
• Circuit breaker: If we can know a request to a service is going to result in a failure then we
should not call it.
• Bulkheads: Isolate the resources used to call dependent services.
• Retry: In a cloud environment, some failed service operations are transient and we can simply
detect these and retry the operation.
Let’s have a look at these patterns and what they are.
If an application or update is going to fail, it’s better to fail fast. We cannot allow a service we are
calling to cause a failure that cascades through the system. We can often avoid cascading failures by
configuring proper timeouts and circuit breakers. Sometimes, not receiving a response from a service
can be worse than receiving an exception. Resources are then tied up waiting for a response, and as more requests pile up, we start to run out of resources. We can’t wait forever for a response that
we may never receive. It would be far better if we never send requests to a service that is
unavailable. In Figure 1.6, Service E is failing to respond in a timely manner, which consumes all the
resources of the upstream services waiting for a response, causing a cascading failure across the
system.
Summary
There are a lot of advantages to a microservices architecture. The autonomy of services that it
provides can be worth the cost of segregating the services.
• Each service is simple.
• Services have independent lifecycles.
• Services have independent scaling and performance.
• It is easier to optimally tune the performance of each service.
A microservices architecture does come at a cost and there are a number of challenges and trade-
offs that must be considered.
• More services must be deployed and managed.
• Many small repositories must be administered.
• A stronger need exists for sophisticated tooling and dependency management.
• Network latencies are a risk.
Organizations need to consider whether they are ready to adopt this approach and what they need to
change to be successful with it.
• Can the organization adopt a DevOps culture of freedom and accountability?
• Does the team have the skills and experience?
• Is continuous delivery possible?
• Is the organization utilizing cloud technology (private or public) so it can easily provision
resources?
In the following chapters we will discuss containers and how they can be used in a microservices
architecture. We will also work through designing microservices and all the important processes for
developers, operations, and releasing an application based on a microservices architectural style.
2. Containers on Azure Basics
In this chapter, we will lay down the foundation of basic container knowledge that we will need
throughout the book. This chapter will start with a comparison of VMs, containers, and processes,
and will talk about containers on Azure. We will create an Azure VM with containers on it and use it
as our training environment to make ourselves familiar with basic container operations.
In the subsequent chapters, we will expand our container knowledge based on real world
examples.
As we have seen, virtual machines are not ideal for scenarios where new instances need to be spun
up quickly.
The next idea that comes to mind is: why not just run a process on one of the virtual machines in the existing compute cluster? (A cluster is a group of VMs.) Processes start fast, dynamically allocate resources such as RAM, and are efficient at sharing them, so this looks like a good fit. The
downside with processes is that they are not well-isolated from the rest of the environment, and can
quickly result in noisy neighbor situations, potentially compromising the entire virtual machine if the
code was not written well.
Noisy Neighbor
The noisy neighbor problem describes a situation where one application takes up or
blocks much of the resources that are shared amongst other applications on the same
infrastructure. This negatively affects the execution of other processes and applications.
Based on what we know about containers, they seem to be the ideal choice for this scenario. We
can spin up a background-processing task quickly, run it isolated, and scale it out quickly.
We can also see the difference between running the rating service in a VM and running it in a
container by looking at the figures following. Figure 2.1 shows the high-level topology for a machine
(Host OS) hosting two virtual machines (Guest OS) with applications running on virtual machines.
The box “Rating Service” represents our rating service on a virtual machine. As we can see, it shares
resources with application App on the same virtual machine. If there is a problem with the
background task, it can jeopardize the entire machine.
FIGURE 2.1: Rating service running in a virtual machine on a host with two virtual machines
Figure 2.2 shows the rating service running in a container. The container encapsulates everything
that is needed for the background task including all libraries, dependencies, and files. It runs as an
isolated user process on the same Host OS, sharing the same kernel as other containers. If there is a
problem with the rating service it would have no impact on the other applications running in
containers as they are isolated from each other.
FIGURE 2.2: Rating service running in a container on a host with multiple containers
We used the example of the background-processing task to illustrate how to think about virtual
machines vs. containers and processes, but there are other scenarios where containers can be a good
fit.
Containers are a great choice in scenarios that require reactive security. As containers are isolated
units, the attack surface is limited, and due to the speedy boot and shutdown of containers, we can kill
them and replace them quickly.
Reactive Security
Reactive security refers to actions that are taken after discovering that some systems
have been compromised by a malicious program or script.
As mentioned previously, by using containers we make our applications portable. The very same
application container can run on a laptop, a physical or hypervisor server, or in a cloud environment.
In particular, dev-test and DevOps benefit hugely from this kind of portability. Chapter 6 covers
DevOps practices in detail.
Besides dev-test, high density and microservices scenarios benefit the most from containers, and
this is what we will learn in this book.
Containers on Azure
Within Microsoft Azure, virtual machines offer many platforms to choose from. These include many
Linux distributions such as CoreOS, SUSE, Ubuntu and Red Hat, as well as Windows-based systems.
As mentioned before, we will focus on Docker containers in this book. Docker is currently the most
popular container technology, and the Docker ecosystem is growing at a very fast pace. Microsoft has entered into a partnership with Docker, Inc., under which the open-source Docker engine that builds and runs Docker containers is made available in Windows Server 2016.
Windows Server containers use the Docker runtime and enable developers who are familiar with the
Windows ecosystem to develop container-based solutions. At the time of writing, Windows
containers are still in preview. We will have an online chapter about Windows containers after they
debut in Windows Server 2016.
VM Extension
A virtual machine extension is a software component enabling in-guest management
customization tasks such as adding software or enabling debugging and diagnostics
scenarios. Extensions are authored by Microsoft and trusted third parties like Puppet
Labs and Chef. The set of available extensions is growing all the time. The Docker
extension installs the Docker daemon on supported Linux-based virtual machines.
Extensions can be installed through the portal, command-line tools (such as the Azure CLI),
PowerShell, or Visual Studio.
In this chapter we use the easiest way possible: we provision an Azure VM from an image that will
install Docker as part of the virtual machine provisioning process through the Azure Portal.
Before we create the Azure VM we need to generate an SSH key that we will use to connect to the
Azure VM.
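The exact command from the original walkthrough is not reproduced in this text; a minimal sketch that works in both Git Bash and the Mac OS X terminal (the key file name here is only an assumption, any name will do) is:

ssh-keygen -t rsa -b 2048 -f ~/.ssh/azure_docker_vm

This creates a private key file (azure_docker_vm) and the corresponding public key (azure_docker_vm.pub) in the ~/.ssh directory.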
Next we are asked to enter a passphrase. We don’t have to enter one, but using a passphrase
provides a little more security. It prevents anyone obtaining the key from using it without the
passphrase.
The Git Bash console should now look similar to the one shown in Figure 2.3.
Our terminal window should now look similar to the one shown in Figure 2.4.
FIGURE 2.4: SSH key generation in the Mac OS X terminal
Clicking the “Create” button opens up the Create VM blade as shown in Figure 2.7.
Note
We use the identity file parameter “-i” to specify the path to our private key file. We use the port parameter “-p,” which is optional when SSH is listening on the default port 22. Azure
sometimes provisions the Azure VM with a random SSH port. We can find the SSH
information at the end of the Properties blade of our Azure VM.
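Putting these parameters together, a connection command looks roughly like the following; the key path, user name, and host name below are placeholders, not the values used in the book’s walkthrough:

ssh -i ~/.ssh/azure_docker_vm -p 22 azureuser@<your-vm-name>.<region>.cloudapp.azure.com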
Once we hit “Enter,” we are asked for the passphrase that we entered when creating the public key
(if you chose to enter a passphrase during creation). After a successful authentication, our Git Bash
shell should look similar to Figure 2.10.
FIGURE 2.10: Git Bash shell connected to Azure VM
The SSH command is identical to the one used with Git Bash on Windows. Figure 2.11 shows the
terminal window after successful connection to the Azure VM.
FIGURE 2.11: Mac OS X terminal session connected to the Azure VM
Before we move on and get to the Docker basics, let’s recap what we have done during the last
couple of steps.
• We used the Azure Portal to provision a virtual machine based on the “Docker on Ubuntu
Server (preview)” image.
• We created an SSH public key on Windows (using Git Bash) and on Mac OS X (using the
terminal).
• We connected to the VM using SSH.
These steps show that it is very easy to set up an Azure virtual machine with the Docker daemon on
it and connect to it.
Docker Info
To make sure that the extension has successfully installed Docker on the machine, we can check
whether Docker is installed by typing
docker info
Note
The following commands are executed on the Docker host via SSH. Therefore, they are
exactly the same whether you are connecting to the Docker host from a Windows or Mac
OS X machine.
The first two lines indicate that we do not have any containers or images yet, as they show
Containers: 0 and Images: 0.
Figure 2.13 shows a logical view of the current state and components of our Azure VM after the
successful installation of Docker.
FIGURE 2.13: Azure VM state after provisioning
Now we can create our first container. As in many other examples, we start with a simple scenario.
We want to create a container that hosts a simple web site. As a first step, we need a Docker image.
We can think of an image as a template for containers, which could contain an operating system such
as Ubuntu, web servers, databases, and applications. Later in this chapter, we will learn how to
create our own images, but for now we will start with an existing image.
Docker has the notion of image repositories. The public place for Docker repositories is Docker
Hub (https://hub.docker.com). Docker Hub can host public and private repositories, as well as
official repositories which are always public. The official repositories contain certified images from
vendors such as Microsoft or Canonical, and can be consumed by everyone. Private repositories are
only accessible to authenticated users of those repositories. Typical scenarios for private repositories
are companies that do not want to share their images with others for security reasons, or companies
who are working on new services or applications that should not be publicly known.
The Docker command line also supports searching Docker Hub. The following command returns
all images that have NGINX included.
docker search nginx
Docker Images
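The image is downloaded with the docker pull command; for the official NGINX image this is simply:
docker pull nginx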
During the pull, there is a lot of output in our command window. Running this command causes
Docker to check locally for an image called NGINX. If Docker cannot find the image locally, it pulls it from
Docker Hub and puts it in the local image store on the VM. We can now run
docker images
to check all the images that are on our VM. Figure 2.14 shows the shell after pulling down the image
and running the “docker images” command.
FIGURE 2.14: NGINX image on dockerhost
Figure 2.15 illustrates the state of the Azure VM after downloading the image from Docker Hub.
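The run command discussed next is not reproduced in the reflowable text; based on the parameters described below, it is along the lines of:
docker run --name webcontainer -p 80:80 -d nginx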
This command starts a container called “webcontainer” based on the “NGINX” image that you
downloaded in the previous section. The parameter “-p” maps ports directly. Port 80 and 443 are
exposed by the NGINX image and by using “-p 80:80” we map port 80 on the Docker host to port 80
of the running container. We could have also used the “-P” parameter, which dynamically maps the
ports of a container to ports of the host. In Chapter 5, “Service Orchestration and Connectivity,” we’ll
see how to use static port mapping because it makes our lives easier when running and exposing
multiple containers on virtual machines in a cluster. Finally, the “-d” parameter tells Docker to run
that container in the background.
Note
Docker pull is not required to download an image. Docker run will download an image
automatically if the image is not found locally.
We can now run the docker ps command to check our running container. Figure 2.16 shows the
output of the command.
Figure 2.18 shows the welcome web page for NGINX, telling us that the web service is working.
FIGURE 2.18: Working NGINX web server
We just created our first container based on an image that we pulled from Docker Hub and
familiarized ourselves with some basic Docker commands.
Next, we’ll look a bit deeper into the Docker basics and learn more about volumes and images,
which are important Docker concepts.
We will store the sources for our custom web page in the directory /home/src on the Docker host.
We can use the following commands to create the directory, assuming we are already in the /home
directory.
mkdir src
cd src
Next, let’s create a simple HTML page called index.html in the src directory. We can use the nano
editor to create that file. Type nano index.html and hit “Enter” to open the editor.
The content of the HTML page is very simple and is shown below:
Click here to view code image
<html>
<head>
</head>
<body>
This is a Hello from a website in a container!
</body>
</html>
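The full run command referenced below is not reproduced here; a sketch based on the parameters discussed (note that an existing container named webcontainer would have to be stopped and removed first) is:
docker run --name webcontainer -v /home/src:/usr/share/nginx/html:ro -d -p 80:80 nginx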
We have already learned that this command will create a container called webcontainer based on the
NGINX image and map port 80 on the Docker host to port 80 in the container.
The new part is
Click here to view code image
-v /home/src:/usr/share/nginx/html:ro
The “-v” parameter mounts the “/home/src” directory created earlier to the “/usr/share/nginx/html”
mount point in the container.
Note
The NGINX image uses the default NGINX configuration, so the root directory for the
container is /usr/share/nginx/html.
Once the container is up and running, we can check our changes by entering curl
http://localhost. The output should now look like Figure 2.19.
FIGURE 2.19: Web site with custom content in simple webcontainer
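The command described next starts a container interactively; a likely form is:
docker run -it nginx /bin/bash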
This will create a new container based on the “NGINX” image and drop you into the container’s
shell, which will look similar to
root@67337e2dbcbb:/#
Now we can go ahead and install software and make other changes within the container. In our
example, we resynchronize the package index files using apt-get update.
Once the container is in the state we want it to be, we can exit the container by entering
root@67337e2dbcbb:/# exit
Finally, we can commit a copy of that container to a new image using the following command from
the Docker host:
Click here to view code image
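The command is not reproduced in the reflowable text; a likely form, using the container ID from above and the repository name that appears later in this chapter, is:
docker commit 67337e2dbcbb bscholl/nginx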
Figure 2.20 shows the entire flow of dynamically creating a new image.
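The Dockerfile under discussion is not reproduced in the reflowable text; a reconstruction based on the instructions described below looks similar to:
# Custom NGINX image serving the sample web site
FROM nginx
MAINTAINER <your name>
COPY web /usr/share/nginx/html
EXPOSE 80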
The first line represents a comment as it is prefixed with the “#” sign. The “FROM” instruction
tells Docker which image we want to base our new image on. In our case, it is the NGINX image that
we pulled down earlier in the chapter. The “MAINTAINER” instruction is to specify who maintains
the image. As we will see in a later chapter, that is quite important information when we deal with
several teams and many images. The “COPY” instruction tells Docker to copy the contents of the
“web” directory on the Azure VM (Docker host) to the directory “/usr/share/nginx/html” in the
container. Below is the folder structure on the Azure VM:
|-/src
|-Dockerfile
|-web
|-index.html
Note
When copying files in the Dockerfile, the path to the local directory is relative to the
build context where the Dockerfile is located. For this example, the content to copy is in
the “src” directory. The Dockerfile is in the same directory.
Finally, we use the “EXPOSE” instruction to expose port 80. Now that we have the Dockerfile, we
can build our image. The following command builds the image, and by using the “-t” parameter, we
tag the image with the “customnginx” repository name.
docker build -t customnginx .
The period (“.”) at the end of the command tells Docker that the build context is the current
directory. The build context is where the Dockerfile is located, and all “COPY” instructions are
relative to the build context.
Once the build has been successful, we can run the docker images command again to see what
images we now have in our local repository. Figure 2.21 shows the output of the docker build
and docker images commands:
FIGURE 2.21: Docker build and docker images output
As we can see, there are now three images on the Azure VM.
• nginx: This is the official NGINX image we pulled from Docker Hub.
• bscholl/nginx: This is the image we created using docker commit.
• customnginx: This is the image we created from the Dockerfile using docker build.
This was just a small example to demonstrate how to create a Docker image using a Dockerfile.
Table 2.1 provides a list of the most common instructions to use with Dockerfiles to build an image.
TABLE 2.1: Common commands
To finish our Dockerfile exercise, we should test if we can create a new container based on our
new image customnginx.
First, we should delete the webcontainer that we have created in our mounting exercise by
executing:
docker stop webcontainer
docker rm webcontainer
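Then we can start a fresh container from the new image. A sketch of the command (no volume mount is needed this time, because the content was copied into the image at build time):
docker run --name webcontainer -d -p 80:80 customnginx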
Executing curl http://localhost should return the same page as previously shown in
Figure 2.19.
In this chapter, we have pulled down the NGINX image from Docker Hub and created two new
Docker images. Figure 2.22 shows the logical view of the Azure VM.
Image Layering
If we look closer at Figure 2.22, we can see another great advantage of Docker. Docker “layers”
images on top of each other. In fact, a Docker image is made of filesystems layered on top of each
other.
So what does that mean and why is this a good thing? Let’s look at the layers of our recently
created image by using the
docker history customnginx
command.
Figure 2.23 shows that the image has 15 layers.
FIGURE 2.23: Layers of the image customnginx
Let’s look at the history of the NGINX image by executing
docker history nginx
This means that Docker adds a layer for each Dockerfile instruction executed. This comes with
many benefits, such as faster builds (unchanged layers are cached and reused) and rollback
capabilities. As every image contains all its building steps, we can easily go back to a previous step.
We can do this by tagging a certain layer. To tag a layer we can simply use the
docker tag <imageid> <repositoryname>:<tag>
command.
For the purpose of this book, we do not need to go into the details of how the various Linux
filesystems work and how Docker takes advantage of them. Chapter 4 covers image layers from a
development perspective.
If you are interested in advanced reading on that topic, you can check out the Docker layers chapter
on https://docs.docker.com/engine/userguide/storagedriver/imagesandcontainers/.
If we wanted to continue to see live updates to the logs as they happen, we can add the “--follow”
option to the docker logs command as shown below.
Click here to view code image
docker logs --follow webcontainer
The docker logs command also offers the parameters --since, --timestamps, and --tail to filter the
logs.
Container Networking
Docker provides rich networking features to provide complete isolation of containers. Docker creates
three networks by default when it’s installed.
• Bridge: This is the default network that all containers are attached to. It is usually called
docker0. If we create a container without the --net flag, the Docker daemon connects the
container to this network. We can see the network by executing ifconfig on the Docker host.
• None: This instructs the Docker daemon not to attach the container to any part of the Docker
host’s network stack. In this case, we can create our own networking configuration.
• Host: Adds a container on the Azure VM’s network stack. The network configuration inside the
container is identical to the Azure VM.
To choose a network other than “bridge,” for example “host,” we need to execute the command
below:
Click here to view code image
docker run --name webcontainer --net=host -d -p 80:80 customnginx
In addition to the default networks, the --net parameter also supports the following options:
• 'container:<name|id>': reuses another container’s network stack.
• 'NETWORK': connects the container to a user-created network using the 'docker network
create' command. Docker provides default network drivers for creating a new bridge
network or overlay network. We can also create a network plugin or remote network written to
our specifications, but this is beyond the scope of this chapter.
Overlay Network
An overlay network is a network that is built on top of another network. Overlay
networks massively simplify container networking and are the way to deal with container
networking going forward. In Chapter 5, “Service Orchestration and Connectivity,” we
discuss clusters, which are collections of multiple Azure VMs. The cluster uses an Azure
virtual network (VNET) to connect all the Azure VMs, and an overlay network would be
built on top of that VNET. The overlay network requires a valid key-value store service,
such as Zookeeper, Consul or Etcd. Chapter 5 also covers key-value stores. Chapter 5
covers how to set up an overlay network for our sample application.
Let’s have a closer look at the bridge network as it enables us to link containers, which is a basic
concept that we should be aware of. By linking containers, we provide a secure channel for Docker
containers to communicate with each other.
Start the first container.
Click here to view code image
docker run --name webcontainer -d -p 80:80 customnginx
Now we can start a second container and link it to the first one.
Click here to view code image
docker run --name webcontainer2 --link webcontainer:weblink -d -p 85:80 customnginx
The --link flag uses this format: sourcecontainername:linkaliasname. In this case, the source
container is webcontainer and we call the link alias weblink.
Next we enter our running container webcontainer2, to see how Docker set up the link between the
containers. We can use the exec command as shown below:
docker exec -it fcb9 bash
fcb9 is the first four characters of the container ID for webcontainer2.
Once we are inside the container, we can issue a ping command to the webcontainer. As we can
see in Figure 2.25 we can ping the webcontainer by its name.
FIGURE 2.25: Pinging the webcontainer
Note that the IP address for webcontainer is 172.17.0.2. During startup, Docker created a host
entry in the /etc/hosts file of webcontainer2 with the IP address for webcontainer as shown in Figure
2.26. We can get the host entries by executing
more /etc/hosts
FIGURE 2.26: Linked container entry in /etc/hosts of webcontainer2
In addition to the host entry, Docker also set environment variables during the start of
webcontainer2 that hold information about the linked container. If we execute printenv we get the
output shown in Figure 2.27. The environment variables that start with WEBLINK are the ones
containing information about the linked containers.
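The exact values depend on the containers, but with the link alias weblink the injected variables typically look similar to the following (the IP address is the one we saw for webcontainer):
WEBLINK_NAME=/webcontainer2/weblink
WEBLINK_PORT=tcp://172.17.0.2:80
WEBLINK_PORT_80_TCP=tcp://172.17.0.2:80
WEBLINK_PORT_80_TCP_ADDR=172.17.0.2
WEBLINK_PORT_80_TCP_PORT=80
WEBLINK_PORT_80_TCP_PROTO=tcp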
Environment Variables
Environment variables are critical when we start thinking about abstracting services in containers.
Good examples are configuration and connection string information. Environment variables can be set
by using the “-e” flag in docker run. Below is an example that creates an environment variable
“SQL_CONNECTION” and sets the value to staging.
Click here to view code image
docker run --name webcontainer2 --link webcontainer:weblink -d -p
85:80 -e SQL_CONNECTION='staging' customnginx
Summary
In this chapter, we covered some basic concepts of Docker containers starting from how we can think
about the difference between containers, virtual machines, and processes. We also learned how to
create virtual machines on Azure that serve as Docker hosts. We went on and learned about basic
Docker concepts like images, containers, data volumes, Dockerfiles, and layers. Throughout the
chapter, we also familiarized ourselves with basic Docker commands. As we have seen, containers
are a powerful technology that can make us rethink how we build applications in the future.
Over the next chapters, we will use this knowledge to dive deeper into how to work with
containers and how to build microservices architectures on top of containers.
3. Designing the Application
In this chapter we will cover some considerations for architecting and designing an application using
a microservice architectural style, as well as the paths to a microservices architecture. How do we
go about defining the boundaries for the various services and how big should each service be? Before
we dive into defining boundaries, let’s pause to consider whether or not this is the best approach for
the project currently. Sometimes the path to a microservices architecture actually starts with
something closer to a monolith.
Most of the successful microservices examples we have to draw experiences from today actually
started out as monoliths that evolved into a microservices architecture. That doesn’t necessarily mean
we can’t start a project with a microservices architectural approach, but it’s something we will need
to carefully consider. If the team doesn’t have a lot of experience with this approach, that can cause
additional risk. Microservices architecture has a cost that the project might not be ready to assume at
the start. We will cover more of these considerations in detail along with some thoughts on defining
service boundaries.
Note
As the tools and technologies available in the market for building and managing
microservices architecture mature, much of this will change. As the technologies become
more advanced, the overhead and complexities in developing and managing a
microservices application will be reduced.
Coarse-Grained Services
A microservices architecture introduces a lot of moving parts, and the initial costs will be higher. It’s
often better to start with a few coarse-grained, self-contained services, and then decompose them into
more fine-grained services as the application matures. With a new project, starting with a monolith or
more coarse-grained services can enable us to initially bring a product to market more quickly today.
Architects, developers, and operations are often more familiar with the approach, and the tools we
use today have been created to work well with this method. As we bring the new application to
market, we can further develop the skills necessary for managing a microservices architecture. The
application boundaries will be much more stable and we can better understand where our boundaries
should be.
If we do decide to start with something on the monolithic end of the spectrum, there are some
considerations to take into account for approaching the design if we plan to transition to a
microservices architecture. We could carefully design the application, while paying close attention to
the modularity of the application to simplify the migration to microservices. This is great in theory,
but in practice, maintaining modularity in a monolith can be challenging. It requires a lot of discipline
to build a monolith in a way that it can easily be refactored into a microservices architecture.
Note
Data for each component of the monolith can be deployed to separate schemas in the
same database or even separate databases. This can help enforce autonomy in the data
store and ease the transition to a microservices architecture.
In some situations, there is absolutely nothing wrong with planning to build a monolith that gets us
to market quickly, then replacing it with a microservices architecture and simply discarding the
monolith. This requires some planning for how the monolith will eventually be displaced when the
time comes.
Depending on the business and technical requirements, as well as the experience and knowledge of
the team, we can start at or somewhere between a monolith and fine-grained microservices, as we see
in Figure 3.1. Applications with requirements that have more to gain from microservices can start
closer to the right on this graph, and teams with less experience can start closer to the left, with plans
to further decompose as needed.
Note
It’s important that the domain is well understood before you begin partitioning it, as
refactoring boundaries can be costly and complex.
By starting with a microservices architecture, we can avoid the cost of refactoring later on, and
potentially reap the benefits of microservices earlier. We ensure our carefully designed components
and boundaries don’t become tightly coupled, which history suggests they generally do to some degree
in a single codebase. The team becomes very familiar with building and managing a microservices-
based application from the start. They are able to develop the necessary experience, and build out the
necessary infrastructure and tooling to support a microservices architecture. There is no need to
worry about whether or not we re-architect it someday, and we avoid some potential technical debt.
As the tools and technologies mature, it can be easier to start with this approach.
Once we have made this decision, we will either have a monolith we need to refactor, or a new
application we need to build using some combination of coarse- and fine-grained services. Either
way, we need to think about approaches to breaking down an application into the parts suitable for a
microservices architecture.
We need to identify the boundaries in the application that we will use to define the individual
services and their interfaces. As we mentioned previously, those boundaries should ensure closely
related things are grouped together and unrelated things are someplace else. Depending on the size
and complexity of the application this can be a matter of identifying the nouns and verbs used in the
application and grouping them.
We can use Domain-Driven Design (DDD) concepts to help us define the boundaries within our
application that we will use to break it down into individual services. A useful concept in DDD is
the bounded context. The context represents a specific responsibility of the application which can be
used in decomposing and organizing the problem space. It has a very well-defined boundary which is
our service interface, or API.
Bounded Context
A bounded context essentially defines a specific responsibility with an explicit
boundary. The specific responsibility is defined in terms of “things” or models within
some context. A bounded context also has an explicit interface which defines the
interactions and models to share with another context.
When identifying bounded contexts in our domain, think about the business capabilities and
terminology. Both will be used to identify and validate the bounded contexts in the domain. A deep
dive into Domain-Driven Design (DDD) is out of the scope of this book, but there are a number of
fantastic books in the market that cover this in great depth. We recommend “Domain-Driven Design” by
Eric Evans and “Patterns, Principles, and Practices of Domain-Driven Design” by Scott Millett and
Nick Tune. Defining these boundaries in a large complex domain can be challenging, especially if the
domain is not well understood.
We can further partition components within a bounded context into their own services and still
share a database. The partitioning can be to meet some nonfunctional requirements like scalability of
the application or the need to use a different technology for a feature we need to implement. For
example, we might have decided our product catalog will be a single service, and then we realized
the search functionality has much different resource and scale requirements than the rest of the
application. We can decide to further partition that feature into its own individual service for reasons
of security, availability, management, or deployment.
Service Design
When building a service, we want to ensure our service does not become coupled to another team’s
service, requiring a coordinated release. We want to maintain our independence. We also want to
ensure we are not breaking our consumer when deploying updates, including breaking changes. To
achieve this, we will need to carefully design the interfaces, tests, and versioning strategies, and
document our services while we do so.
When defining the interfaces, we need to ensure we are not exposing unnecessary information in the
model or internals of the services. We cannot make any assumptions of how the data being returned is
used, and removing a property or changing the name of an internal property that is inadvertently
exposed in the API can break a consumer. We need to be careful not to expose more than what is
needed. It’s easier to add to the model that’s returned than it is to remove or change what is returned.
Integration tests can be used when releasing an update to identify any potential breaking changes.
One of the challenges with testing our services is that we might not be able to test with the actual
versions that will be used in production. Consumers of our service are constantly evolving their
services, and we can have dependencies on services that have dependencies on others. We can use
consumer-driven contracts, mocks, and stubs for testing consumer services and service dependencies.
This topic is covered in Chapter 6, “DevOps and Continuous Delivery.”
There will come a time when we need to introduce breaking changes to the consumer and when we
do this, having a versioning strategy in place across the services will be important.
There are a number of different approaches to versioning services. We could put a version in the
header, query string, or simply run multiple versions of our service in parallel. If we are deploying
multiple versions in parallel, be aware of the fact that this will involve maintaining two branches. A
high-priority security update might need to be applied to multiple versions of a service. The
Microsoft Patterns & Practices team has provided some fantastic guidance and best practices for
general API design and versioning here: https://azure.microsoft.com/en-
us/documentation/articles/best-practices-api-design/.
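To make these options concrete, here is a sketch of what the approaches can look like for a hypothetical products resource (the routes, header, and media type below are illustrative only, not part of the flak.io sample):
GET /api/products/101?api-version=2.0 (version passed in the query string)
GET /v2/api/products/101 (versions running in parallel behind different routes)
GET /api/products/101 with the header Accept: application/vnd.flakio.v2+json (version passed in a header)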
It’s also important that the services are documented. This will not only help consumers get started
with the API quickly, but can also provide best practices for consuming and working with the
API. The API can include batch features that can be useful for reducing chattiness across services, but
if the consumer is not aware of them, these batch features do not help. Swagger (http://swagger.io) is a tool
we can use for interactive documentation of our API, as well as client SDK generation and
discoverability.
Serialization Overhead
Serialization often represents the largest cost with interservice communications. A
number of serialization formats are available with various size, serialization,
deserialization, and schema trade-offs. If the cost can be avoided, this can help increase
performance of the system. We can sometimes pass data through, from upstream to
downstream services, and instead add data to headers.
When designing our microservices architecture, we need to consider how our services will
communicate. We want to make sure we can avoid unnecessary communication across services and
make the communication as efficient as we possibly can.
Whatever we select for our interface must be a well-defined interface that hopefully does not
introduce breaking changes for consumers of a specific version. It’s not that we can’t change the
interface; the key difference is breaking changes. Two different communication styles used between
services are synchronous request/response and asynchronous messaging. This means the services implemented
will need to use some kind of versioning strategy for the eventual breaking change to an interface.
Synchronous Request/Response
Many developers are very familiar with the synchronous messaging approach. A client sends a
request to a back-end and then waits for a response. This style makes a lot of sense: it’s easy to
reason about and it’s been around for a long time. The implementation details might have evolved
over time.
One of the challenges with synchronous messaging at scale is that we can tie up resources waiting
for a response. If a thread makes a request, quite often it sits and waits for a response, utilizing
precious resources and doing nothing while it waits for the response. Many clients can use
asynchronous event-based implementations that enable the thread to continue doing work, and then
execute a callback when a response is returned. If we need to make synchronous requests to a service,
this is the way to do it. The client does, however, expect a timely response from the service. See
Appendix A in this book for best practices when implementing and consuming APIs with ASP.NET.
Many of the concepts discussed are transferrable to other languages and frameworks.
Asynchronous Messaging
Another useful approach for inter-service communications in distributed systems is asynchronous
messaging. Using an asynchronous messaging approach, the client makes a request by sending a
message. The client can then continue doing other things, and if the service is expected to send a
response it does this by sending another message back. The client will generally wait for
acknowledgment that the message was received and queued, but will not have to wait for the request
to be processed with a response. In some cases the client does not need a response or will check back
with a service at a later time on the status of the request.
There are a number of benefits to asynchronous messaging, but also some trade-offs and challenges
that need to be considered. A great introduction to asynchronous messaging is available here from the
Microsoft Patterns and Practices team at https://msdn.microsoft.com/en-us/library/dn589781.aspx.
This material is a great primer to help those new to asynchronous messaging concepts.
As we mentioned previously, when designing an application using a microservices approach, there
are a number of things to consider with regard to communication style, protocols, and serialization
formats used.
Considerations with inter-service communications:
• Select a communication approach that is best suited for the interaction, and if possible use
asynchronous messaging.
• Consider protocol and serialization formats and the various trade-offs. Performance is
important, but so is interoperability, flexibility, and ease of use.
• Always consider how coupling is affected by any decision, because as our services become
coupled we tend to lose many of the benefits of a microservices architecture.
• Always consider caching when making requests on external systems.
• Always implement the resiliency patterns covered in Chapter 1 in the Best Practices subsection
when calling other services.
• When one service takes a dependency on another service, it should not require the dependent
service to be available at deployment or impact its ability to start. There should never be a need
to deploy and start services in a specific order.
• Consider alternate approaches to reduce overhead. For example, if one service uses
information received in a request to call another with additional information there might be no
need to reserialize the original message. The original message can be passed with the
additional information in the header.
There’s a place for both synchronous request/response and asynchronous messaging in today’s
systems. We really just touched the surface on some inter-service communications concepts in this
section. For more information on routing requests to services see the Service Discovery section in
Chapter 5. Inter-service communication is an important consideration when designing an application
based on microservices architecture.
Monolith to Microservices
We might be starting from an existing monolithic application that we want to move to a microservices
architecture. This can even be the result of a decision to start with a monolith that we would
eventually move to a microservices architecture. Regardless of how we ended up with a monolith, we
now want to refactor this monolithic application into a microservices architecture because the
benefits outweigh the costs of moving and managing a microservices application. There are a number
of different ways to approach this, and we discuss some of those in this section. We have the benefit
that we have a fairly well-established business model, and data to help define our service
boundaries.
The challenge we often face when breaking down a monolith into microservices is the tight
coupling within the system that contains all kinds of unrelated code that is intertwined throughout the
application. It can be very difficult to untangle the pieces of functionality we wish to break out of the
application into separate services.
We should first determine the motivations for refactoring a monolithic application into
microservices, as this will affect the approach and priorities. If it’s because we want to be able to
add new features using a different technology, then maybe we don’t need to refactor the monolith, and
instead can add the feature alongside the monolith. If there’s a feature causing some pain or the rest of
the monolith is holding back the capability of implementing some feature, then maybe we need to start
with moving that one feature first.
Refactoring a monolith is often a process of breaking out one microservice at a time. Parts of the
monolith’s functionality are replaced with a microservice, and over time the entire monolith has been
completely decomposed into a microservices architecture. As the monolith is reduced in size, it will
eventually become easier to work through those last few challenging items at the end. Where to start
and what to slice off first is something we need to think about very carefully.
Collect Data
Adequate logging and telemetry from a monolith can be extremely useful information
when approaching the task of breaking it down into a microservices architecture. This
information can be especially useful when we need to partition for scale, or with
identifying coupling in a data store. We might want to ensure our monolith is
instrumented to gather additional information that will help identify the boundaries for its
decomposition into microservices.
We can start by identifying the business capabilities and bounded contexts within the application’s
domain, then begin analyzing the code. We can find seams in the code that make one feature easier to
decouple than another. Dependency tools and profilers can be useful in better understanding
the structure and behavior of the application. A feature might exist that has a particular need that is
satisfied by a microservices architecture today, like the need to release very quickly; or maybe this
feature is fragile and breaks often because of releases to the monolith. It might even be that we want
to start with easier-to-partition features and experiment with microservices. Below is a list of things
we need to consider and think about when we approach partitioning a monolith.
Considerations for partitioning and prioritization:
• Rate of changes: Features that are changing and need to be released often, or those that are
very stable and never change.
• Scale: Features that require very different scale requirements than the rest of the application.
• Technology: A feature can leverage a new technology, making it a good candidate to be
partitioned out of the monolith.
• Organizational structure: A team working on a feature could be located in a different region.
• Ease: There can be some seams and features in the monolith that are easier to partition out and
experiment with.
• Security: There can be features in the application that deal with very sensitive information and
require additional security.
• Availability: A feature can have different availability requirements and it can be easier to meet
these requirements by breaking it out into its own service. This enables us to more effectively
isolate costs to targeted areas of the solution.
• Compliance: Certain aspects of the application can fall under compliance regulations and
would benefit from being separated from the rest of the application so that the rest of the
application is not subject to the compliance rules.
Also consider the fact that we don’t have to break out every feature into microservices. It might be
better to divide and conquer only if it makes sense for the application. Break up the monolith into
some coarse-grained services, and then continue to chip away at one or two large services in
parallel.
In addition to splitting out the code into another service, we also have to think about data migration
and consider data consistency as we decentralize the data and place it in separate data stores. There
will often be some coupling between components and features in the database. The database can be
full of challenges like shared data and tables, transactions, foreign key constraints, and reporting
needs. We might want to break out the behavior as a first step and enable the monolith and our new
service to share the database for some time.
As services are broken out from the monolith, they need to continue to collaborate and integrate
with the monolith. A proxy can be placed in front of the monolith to route traffic to the new services
as they are broken out. Features partitioned out of the monolith might need to be replaced with client
proxies used to call the new service. Also, where services need to interact with the existing
monolithic application, consider adding an anti-corruption layer. The anti-corruption layer is
introduced in Domain-Driven Design (DDD), by Eric Evans. This approach creates a façade over the
monolithic application and ensures the domain model of the monolith does not corrupt the
microservices we are building.
Flak.io
Flak.io is an advanced technology reseller that sells unique products online for the modern-day space
explorers. Space exploration is growing fast, and the demand for equipment is rapidly growing with
it. Flak.io needs to release an ecommerce solution that can scale to handle the expected demands. The
online retail market for space exploration equipment is getting very competitive, and flak.io must
remain agile and capable of continuous innovation. Flak.io has decided to build their new storefront
technology using microservices architecture. The team has some high-level requirements that align
well with the benefits of a microservices architecture.
Initially users will simply need to be able to browse a catalog of products and purchase them
online. Initially users will not need to log in or create accounts, but eventually this feature will be
added. The catalog will include basic browse and search capabilities, enabling a user to find a
product by entering search terms and browsing products by category. When viewing products on the
site, the user will be displayed a list of related and recommended products based on the analysis of
past orders and browsing histories. The user will add products that he or she wishes to purchase to a
shopping cart and then make the purchase. The user will then receive an email notification with a
receipt and details of the purchase.
Requirements
• Continuous Innovation: To keep their position in the market, the flak.io team needs to be able
to experiment with new features and release them into the market immediately.
• Reduce Mean Time to Recovery (MTTR): The DevOps team needs to be able to quickly
identify and release fixes to the service.
• Technology Choice: New technologies are constantly being introduced into the market that can
be leveraged in key areas of the application, and it’s important that the team is able to quickly
adopt them if it makes sense for the business.
• Optimize resources: It’s important that the application is optimized to reduce cloud
infrastructure costs.
• Application availability: It’s important that some features of the application are highly
available.
• Reduce developer ramp-up: The team has been growing quickly, and it’s important that new
developers are able to ramp up and begin contributing immediately.
It’s apparent the Flak.io ecommerce application will benefit from a microservices architecture and
the costs are worth the return. The team has enough experience in DevOps and service design to handle
it, and the application has been decomposed into the following set of services.
Architecture Overview
As we see in Figure 3.2, the application is decomposed into four services.
Considerations
Some things to consider when decomposing an application into individual services:
• Right now the client makes one request to retrieve catalog information and another for
recommendations. We might want to reduce chattiness in the client and aggregate the request in
the proxy or make one of the services responsible for that. For example, we can send a request
to the catalog service which could request recommendations from the recommendation service,
or have the edge proxy fan out and call both and return the combined result.
• We might want to further decompose a bounded context around business components,
components, or other things. For example, depending on how our application scales, we might
want to separate the search function from the other catalog features. We will keep it more
coarsely grained until we determine there is a need to further partition.
• Elasticsearch is used in place of other graph databases for providing recommendations for a
couple of reasons. Elasticsearch includes an extremely powerful graph API, and because it’s
used for other areas, we can reduce the number of technologies in use and share best
practices.
• We could place the recommendations and products into different indexes of the same
Elasticsearch cluster. As long as one service is not permitted to directly access the other
service’s index, we should be fine, and we might be able to reduce costs by not having to run
multiple clusters.
• It’s quite possible that many of these services could be further decomposed into more
microservices. For example, the team had considered further decomposing the order service
into more services, by moving some of the functionality out to a payment/checkout service.
However, the team had determined that refactoring it at a later time would require minimal
work, and they were not ready to manage more services or the complexities it involves at this
time.
• Domain-Driven Design is an approach and a tool. It’s a great tool in this situation, but it’s not
the only approach to decomposing a domain into separate services.
• At the moment, customers are anonymous and simply enter payment information every time they
make a purchase. In the future, a feature will be added to the application that enables customer
accounts.
Summary
One of the biggest challenges in designing an application based on a microservices architecture is in
defining the boundaries. We can leverage techniques from Domain-Driven Design (DDD) for help
defining boundaries. We have also learned that the best path to a microservices architecture can
actually start with something a bit more monolithic in nature today. As the experiences and
technologies in the industry mature, this can change. Regardless of how we start, we know that a deep
understanding of the business domain is necessary. As with any architectural approach, microservices
architecture is loaded with trade-offs big and small. Now that we have an application and design,
let’s move along to building, deploying, and managing it.
4. Setting Up Your Development Environment
In this chapter, we’ll discuss the ways we can use Docker in a development environment including
local development, Docker image management, local development versus production-ready
development, using Docker Compose to manage multiple containers, and common techniques to
diagnose Docker issues.
Developer Configurations
One of the first considerations for developing with Docker is how to set up your development team’s
configuration. Below are three of the most common usage patterns.
Local Development
In this configuration, all development is done locally on a laptop, typically using containers running
inside of a virtual machine. Some developers find they are more productive writing code on their
local machine without using Docker, and then, once primary development is done, they test and
integrate their code running in a Docker container against other services.
Cloud Only
In this configuration, all development is done with containers on top of virtual machines in a public or
private cloud. Development teams sometimes need to choose this option because their local PCs
aren’t capable of running virtualization software, or a corporate policy doesn’t permit them to run
virtualization on their PCs. To work around these restrictions, each developer gets a bare-bones
virtual machine hosted in the cloud to run their containers.
set DOCKER_CERT_PATH=c:\users\<username>\.docker\machine\machines\
finance-dev
or on a Mac:
Click here to view code image
export DOCKER_CERT_PATH=~/.docker/machine/machines/finance-dev
As each Docker host (virtual machine) needs its own set of keys, you want to make sure you have
an easy way to manage a set of authentication keys across multiple developers and multiple virtual
machines. Remember not to include your Docker certs in your source control system, as by doing that,
you are effectively giving full trust and control to anyone who has access to your source control
system. Further, hackers can use automated sniffers that recursively search source control systems for
passwords and certificates, so instead of only gaining access to your source control system, hackers
would then be able to fully control your deployment servers.
• For local development, use the built-in Docker keys that are unique to each developer. These
are stored in the default directory, located at C:\users\
<username>\.docker\machine\machines\default or ~/.docker/machine/machines/default on a
Mac.
• For a shared development environment, developers typically share a set of Docker keys across
the team, as developers need to directly manage and monitor containers, including
viewing logs, stats, and performance.
• Promotion of source code between environments, like dev to staging or production is typically
handled by release management tools like Atlassian, or Visual Studio Team Services using an
automated process. Docker certificates in staging and production are typically managed by your
ops team instead of developers.
FIGURE 4.1: Image tags for the official Node Docker image
Let’s review these tags, as you’ll find many of them are commonly used in other Docker Hub
images:
• latest: The latest version of the image. If you do not specify a tag when pulling an image, this is
the default.
• slim: A minimalist image that doesn’t include common packages or utilities that are included in
the default image. Choose this image if the size of the image is important to your team, as the
slim image is often half the size of the full image. Another reason to use the slim image is that
smaller images inherently have a smaller attack surface, so they are generally considered more
secure.
• jessie/wheezy/sid/stretch: These tags represent codenames for either versions or branches of
the Debian operating system, named after characters from the Toy Story movies. wheezy and
jessie represent Debian OS versions 7 and 8 respectively, sid is the codename for the
unstable trunk, and stretch is the codename for the testing branch.
• precise/trusty/vivid/wily: These tags represent codenames for versions 12.04 LTS, 14.04 LTS,
15.04, and 15.10 of the Ubuntu operating system.
• onbuild: The onbuild tag represents a version of the image that includes onbuild Dockerfile
commands which are designed to be used as a base image for dev and test. In normal Dockerfile
commands, the execution happens when the image is created. Onbuild Dockerfile commands are
different in that execution is deferred, and instead is executed with the downstream build.
Automated Builds
To make this process easier, and to ensure only quality builds end up in your Docker Hub repository,
you can automate the process of building images using continuous integration (CI) tools. For example,
the official ASP.NET image uses CircleCI and GitHub hooks to automate updating the official image
at http://bit.ly/aspnetci. Similarly, Docker Hub adds the capability to automate building and updating
images from GitHub or Bitbucket by adding a commit hook so that when you push a commit, a Docker
image build is triggered.
What happens if your base image is updated with a security fix? You can use a repository link
which links your image repository to another repository, like the one for your base image. Doing this
enables you to automatically rebuild your images based on changes to the underlying base image.
Managing Images
Ongoing maintenance of images is also something to consider. Image maintenance tasks include
understanding updates to software packages and updates to operating systems such as applying
security patches or deploying bug fixes. For large enterprises where IT decisions are typically
centralized, management of base images is handled by a central team that is responsible for
maintaining the images, ensuring quality, security, and consistency across a team or an organization.
In other organizations, where decision making is decentralized, each team or feature crew is
responsible for their own image management.
Although Docker security tools really deserve their own chapter in a book, we quickly wanted to
mention Docker Bench (https://dockerbench.com), a script created by Docker that is largely based on
the Center for Internet Security’s (CIS) Docker 1.6 Benchmark (https://bit.ly/ch4security) that checks
for common best practices for deploying Docker containers in production.
Another common maintenance task that all organizations face is image bloat on your Docker hosts.
As each image can be 500MB to 1GB, you can easily end up with hundreds of old or unused images
in your shared dev or staging environments. To help fix this, you can set up a maintenance script that
deletes images that reach a certain age using the docker rmi command.
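A minimal sketch of such a cleanup script (this variant removes dangling, untagged images rather than filtering strictly by age, which would require a bit more scripting):
docker images -q --filter "dangling=true" | xargs -r docker rmi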
Setting up your Local Dev Environment
Now that we’ve reviewed some of the considerations when setting up Docker for your teams, let’s get
your local development environment set up and ready to use Docker. Below is a quick list of common
software to install for development.
set DOCKER_CERT_PATH=C:\Users\danielfe\.docker\machine\machines\
default
set DOCKER_TLS_VERIFY=1
set DOCKER_HOST=tcp://192.168.99.100:2376
Cloning Samples
To get started, we are going to clone the product catalog from https://Github.com/flakio/catalog to
your local machine. Because later in the chapter we will use local volume mounting, it’s important
that you run your clone command from within a user folder like the “Documents” (C:\users\
<username>\Documents) folder. This is because VirtualBox only mounts user folders by default.
From a Mac terminal, type the following:
Click here to view code image
cd ~
mkdir DockerBook
cd DockerBook
git clone https://github.com/flakio/catalog
From the Windows command prompt, type the following (replace the username as appropriate) to
create a new directory named “DockerBook” and clone our repository into it as shown in Figure 4.4:
Click here to view code image
cd c:\Users\<username>\Documents\
mkdir DockerBook
cd DockerBook
git clone https://github.com/flakio/catalog
FIGURE 4.4: Cloning the Product Catalog project
Now that we have the code locally, the first thing we’re going to do is set up our project using live
reload.
You’ll learn how to do this in a simpler way using linked containers and Docker Compose later in
the chapter. With the data store now up, you can start the product catalog service by running:
Click here to view code image
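The command is not reproduced in the reflowable text; based on the description that follows, its detached form looks similar to the line below, typed as a single command with the username in the path replaced as appropriate:
docker run -d -t -p 80:80 -e "server.urls=http://*:80" -v /c/Users/<username>/Documents/DockerBook/catalog/ProductCatalog/src/ProductCatalog:/app thedanfernandez/productcatalog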
Note
The above command is wrapped over several lines due to formatting in the printed book.
You should type it as a single command on a single line.
This docker run command creates a detached container (-d) with tty (-t) support where port 80
of the VirtualBox VM will listen and forward a request to port 80 in the container. It uses the -e flag
to set an environment variable to tell the ASP.NET web server to listen on port 80, mounts a volume
with the -v flag, and bases the container on the publicly available
thedanfernandez/productcatalog Docker image.
Volumes
The Docker volume command works just like the “-v” command line parameter we saw in Chapter 2,
with the left side of the colon (:) representing the Windows/OSX/Linux directory to mount and the
right side representing the destination directory inside the Docker container. Your source code from
Windows/OSX/Linux is available to the VirtualBox Linux Docker host using the shared folder
feature. It is then further shared from the Docker host to the container using the volume command. As
shown below, VirtualBox shared folders convert “C:\” into /c/ and convert all Windows path
backslashes into Linux-compatible forward slashes. For example, here are the before and after
directories for our source code in the Docker host:
Windows Path
Click here to view code image
c:\Users\<name>\Documents\DockerBook\catalog\ProductCatalog\src\
ProductCatalog
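Docker Host Path (after conversion by the VirtualBox shared folder feature)
/c/Users/<name>/Documents/DockerBook/catalog/ProductCatalog/src/ProductCatalog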
Finally, we use the “thedanfernandez/productcatalog” image, which is an image that when started,
restores NuGet packages and starts dnx-watch to listen for changes. This image itself inherits from the
“microsoft/aspnet” official image but is designed to ensure it works with the latest set of tooling by
the time this book is released.
Another option for developers is to start a container interactively (giving you a standard Linux
command prompt inside the running container). One key benefit to this, as mentioned in the third
bullet above, is it enables you to download any new packages you might not have included in the
Docker image by manually running a package restore whenever you need to. In the example below,
we add the “-i” flag to start the service interactively, which will start the bash command prompt
running in the container because we specified the entrypoint switch as part of the command as shown
in Figure 4.5.
Click here to view code image
docker run -i -t -p 80:80 -e "server.urls=http://*:80" --entrypoint /bin/bash -v /c/Users/danielfe/Documents/DockerBook/catalog/ProductCatalog/src/ProductCatalog:/app thedanfernandez/productcatalog
Once your web browser is up, you can navigate to the Docker Host IP address and the
ProductsController route by going to http://ipaddress/api/products. By default, the IP address would
be 192.168.99.100, which with the URL below, returns JSON product information as shown in Figure
4.6.
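http://192.168.99.100/api/products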
Although we’ve gotten the code ready for production, that doesn’t mean the code works correctly,
or is high-performing, or can handle load. We’ll look at a full suite of tests for measuring code quality
as part of a continuous delivery pipeline in Chapter 6.
Docker Compose
Docker Compose is a tool that enables you to define multiple containers and their dependencies in a
declarative yml file. YAML (YAML Ain’t Markup Language) is a declarative language similar to
JSON but aimed at making it more readable by humans. Rather than surrounding everything in quotes
and brackets like JSON does, it uses indentation for hierarchy. Developers like Docker Compose
because it’s more readable than the Docker command line and makes it easy to compose and connect
multiple containers in a declarative way. The best way to understand Compose is by opening the
Docker Compose file at:
Click here to view code image
c:\Users\<username>\Documents\DockerBook\catalog\ProductCatalog\src\
ProductCatalog\docker-compose.yml
or on a Mac
Click here to view code image
~/DockerBook/catalog/ProductCatalog/src/ProductCatalog/docker-
compose.yml
As you can see, this compose file does many of the same things we did on the Docker command
line. Read through the file and we’ll review in more detail below:
Click here to view code image
productcatalog:
  image: "thedanfernandez/productcatalog"
  ports:
    - "80:80"
  tty: true
  links:
    - elasticsearch
  environment:
    - server.urls=http://*:80
elasticsearch:
  image: "elasticsearch"
  ports:
    - "9200:9200"
At a high level, this compose file creates two containers using the arbitrary labels,
“productcatalog” and “elasticsearch.” The elasticsearch container is simple to explain as it does what
our previous Docker run instruction did: it uses the official elasticsearch image and listens on
port 9200.
The “productcatalog” container represents the ASP.NET Core 1.0 microservice and has many
similarities to the Docker run command we used before. It uses a base image from
thedanfernandez/productcatalog which includes the source code “baked” into the image using the
Dockerfile ADD command, which adds the source code into the image. It also does the common steps
you’ve seen before such as opening port 80 for the host and container, turning on tty, and setting the
server.urls environment variable to tell the ASP.NET web server to listen on port 80.
Linked Containers
Container linking is a special Docker feature that creates a connection between two running
containers. productcatalog has a link defined to the elasticsearch container. Instead of
having to define a brittle connection string in our app to a specific IP address to the elasticsearch data
store, container linking will inject a set of environment variables into the productcatalog
container that includes a connection string to the elasticsearch service so that two containers can
easily communicate to each other without a hardcoded connection string.
One thing to keep in mind is that you cannot link containers across multiple Docker hosts as both
containers must run on the same host. To achieve that, you can use Docker’s overlay networking
feature which provides a way to connect multiple Docker hosts using a discovery service like Consul.
We will discuss Docker networking in more detail in Chapter 5.
Container Dependencies
Container linking also defines a dependency and order of instantiation for containers. Because the
productcatalog container is linked to and depends on the elasticsearch container to run,
Docker Compose will create the elasticsearch container first, and then create the
productcatalog container.
Smart Restart
Docker Compose is also smart in that, where possible, it will detect the current state of containers
and only change/restart if there is something newer. For example, if you change the Catalog
microservice and don’t change the Elasticsearch container, and then run a docker-compose up
command, Docker Compose won’t stop and recreate the Elasticsearch container because it can detect
there are no changes. This is shown in Figure 4.8.
FIGURE 4.8: Docker Compose recreating only changed containers
Once you have the container ID, view the logs by typing
docker logs <id>
If the answer isn’t immediately obvious, you can connect to the container directly as we discussed
earlier in the chapter by overriding the container’s entrypoint in the Docker run command as shown
below:
Click here to view code image
docker run -t -i -p 80:80 --entrypoint=/bin/bash thedanfernandez/productcatalog
This snippet replaces the “-d” (detached) flag with the “-i” flag to run interactively and overrides
the Dockerfile entrypoint to instead start the bash command prompt. Once it runs, you should see a
bash command prompt that you can use to further diagnose issues such as:
• ensuring all application dependencies are in the base image.
• ensuring any apps or dependencies (e.g., NPM) are installed.
• ensuring any environment variables your app needs are correctly configured.
When you type “exit” to exit the interactive shell, the container will automatically stop, but you
can go back and use the Docker logs command to view all the commands you typed into the bash
shell and the corresponding command line output.
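You can also run a single command inside a running container without opening an interactive shell by using docker exec. A minimal sketch, assuming the container ID reported by docker ps and that the entrypoint script sits in the container's working directory:
docker exec <container id> cat docker-entrypoint.sh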
This will execute the command in the container and return the output, in this case the contents of the
docker-entrypoint.sh file as shown in Figure 4.9.
FIGURE 4.9: Using Docker Exec to see the contents of a file in a running container
Summary
In this chapter, we covered a lot of ground: setting up local development with Docker, image management, Docker registry options, local versus production-ready development, Docker Compose, and how to diagnose issues when creating Docker images and containers.
5. Service Orchestration and Connectivity
It’s likely our microservices-based application is composed of more than one service, and multiple
instances of each service. So how do we deploy and manage multiple services, multiple instances of
each service, and the connectivity between the services? With a microservices architecture, it’s even
more important that we automate everything. Orchestration tools are used to deploy and manage
multiple sets of containers in production, and service discovery tools are used to help route traffic to
the appropriate services.
The services used to compose our application could be deployed into a growing number of cloud-
hosted compute offerings available today. Using some combination of Infrastructure as a Service
(IaaS) or Platform as a Service (PaaS), the microservices can be deployed and run in their own
dedicated machines or a set of shared machines. In this chapter we will discuss using a cluster of
machines to host our Dockerized microservices.
Before jumping into the details of orchestration, scheduling, cluster management, and service
discovery, let’s start with a conceptual overview of a typical cluster used to host a microservice-
based application. Figure 5.1 shows a conceptual diagram of a typical environment. In addition to the
machines hosting our services, we have management and orchestration systems used to deploy and
manage the services that compose our application.
Orchestration
In the context of infrastructure and systems management, orchestration is a fairly general term that is
often used to refer to cluster management, scheduling of compute tasks, and the provisioning and de-
provisioning of host machines. It includes automating resource allocation and distribution, with the
goal of optimizing the process of building, deploying, and destroying computing tasks. In this case, the
tasks we’re referring to are microservices, such as those deployed in Docker containers.
In this context orchestration consists of provisioning nodes, cluster management, and container
scheduling.
Provisioning is the process of bringing new nodes online and getting them ready to perform work.
In addition to creating the virtual machine, this often involves initializing the node with cluster
management software and adding it into a cluster. Provisioning would also commonly include
resources other than compute, like networking, data storage services, monitoring services, and other
cloud provider services. Provisioning can be the result of manual administration, or of an automated scaling implementation used to increase the cluster pool size.
Cluster management involves sending tasks to nodes, adding and removing nodes, and managing
active processes. Typically there is at least one machine that acts as cluster manager. This machine(s)
is responsible for delegating tasks, identifying failures, and synchronizing changes to the state of the
application. Cluster management is very closely related to scheduling, and in many cases the same
tool is used for both.
Scheduling is the process of running specific application tasks and services against specific nodes
in the cluster. Scheduling defines how a service should be executed. Schedulers are responsible for
comparing the service definition with the resources available in the cluster and determining the best
way to run the service. Schedulers integrate closely with cluster managers because schedulers need to
be aware of each host and its available resources. For a more detailed explanation, see the section on
Scheduling.
Service Connectivity
Although service discovery, application gateways, and network overlays are not
necessarily a part of orchestration, we will cover them in this chapter as they are closely
related to setting up a cluster and running services in a cluster.
Service Discovery and Application Gateways are used to route traffic to the service instances
deployed in the cluster. We typically have multiple instances of our service running on nodes within
the cluster, as determined by the scheduler, and we need a way to discover their location and route
traffic to them. Some container orchestration tools will provide this functionality.
Overlay networks enable us to create a single network that sits on top of multiple networks
beneath. We can deploy containers and services to the overlay network, and they can communicate
with other containers on the same overlay network without having to worry about the complexities of
the networks beneath.
Let’s start with provisioning and bootstrapping the virtual machine with the necessary cluster
management and orchestration tools.
Provisioning
We are going to need some machines to run our application, and they will need to be initialized with
cluster management software. In addition to the virtual machines needed to run our application, we
will need to provision storage accounts, availability sets, virtual networks, and load balancers. We
might even need to provision some additional Microsoft Azure hosted services like a Service Bus,
Azure DocumentDB, or Azure SQL Database.
We can break provisioning down into two high-level concerns: outside the box and inside the box. Provisioning outside the box includes creating all the needed resources, like networking, virtual machines, and similar. Inside the box covers anything that needs to be set up inside the machine, such as cluster management software.
A big challenge with infrastructure has been the capability to reliably recreate an environment.
Machines and networks are provisioned, configurations are changed, and other machines are added or
removed. Even if all this is documented, tracking the changes and working through manual steps is
error-prone, and rarely ends with exactly the same environment setup. Automating infrastructure
provisioning can help here, and we can apply a lot of the best practices for managing application
code to managing our infrastructure.
Let’s start with a quick overview of infrastructure as code and the services used in Microsoft
Azure to provision and manage our resources.
Infrastructure as Code
Infrastructure as code, also known as programmable infrastructure, is a way of building and managing
an application’s entire environment through predefined code. It enables developers and
administrators to recreate the same environment on multiple physical machines. An environment that
can be reproduced is crucial to efficient development, testing, and deployment. As the size and
complexity of a service increases, so does the difficulty in reproducing it consistently.
Administrators often rely on scripts to automate this process. Infrastructure as code performs a
similar service while also adding automation, version control, true cross-platform support, and other
features. Infrastructure as code typically consists of one or more definition files written in a high-
level language, which contain instructions that are read and interpreted by a separate tool.
Maintaining Consistency
Consistency is key when it comes to microservices. When it’s time to deploy, having a way of
accurately reproducing an environment on another system greatly reduces the chance of an error.
Because infrastructure as code relies on predefined code to create the environment, each instance of
the environment is guaranteed to run the same. This also makes the environment easy to share with
developers, quality assurance, or operations.
Tracking Changes
Because infrastructure as code relies on definitions instead of actual environments, changes can
easily be tracked using a version control system such as Git, Subversion, or Mercurial. Not only does
this help keep track of changes to the environment, but it also helps administrators maintain different
versions of the environment for different versions of the application.
Test-Driven Infrastructure
Infrastructure as code changes the way we need to look at testing. One approach to doing
this is through a test-driven infrastructure. Similar to test-driven development, a test-
driven infrastructure is designed around a predetermined set of test cases that must pass
successfully.
Let’s have a closer look at each section of an ARM template. Before we do that, we first need to cover template functions and expressions.
Functions
Expressions and functions extend the JSON available in the template. This enables us to create values
that are not strictly literal. Expressions are enclosed in square brackets “[],” and they are evaluated
when the template is deployed. Template functions can be used for things like referencing parameters,
variables, and value manipulation such as type conversions, addition, math calculations, and string
concatenation.
As in JavaScript, function calls are formatted as functionName(arg1, arg2, arg3). For example:
Click here to view code image
"variables": {
"location": "[resourceGroup().location]",
"masterDNSPrefix":"[concat(parameters('dnsNamePrefix'),'mgmt')]",
"agentCount": "[parameters('agentCount')]"
}
Parameters
The parameters section is used to define values that can be set when deploying the resources. We can
then use the parameter values throughout the template to set values for the deployed resources.
The following JSON example defines some parameters, starting with “agentCount,” followed by “masterCount.” For each parameter we must define a type, and we can optionally define default values and some other constraints, as well as some metadata that can contain additional information for others consuming the template. As we see in the following, we have set a default value for the “agentCount” parameter and declared a minimum value of 1 and a maximum value of 40 for it.
Click here to view code image
"parameters": {
"agentCount": {
"type": "int",
"defaultValue": 1,
"metadata": {
"description": "The number of Mesos agents for the cluster."
},
"minValue":1,
"maxValue":40
},
"masterCount": {
"type": "int",
"defaultValue": 1,
"allowedValues": [1, 3, 5],
"metadata": {
"description": "The number of Mesos masters for the cluster."
}
}
Variables
In the variables section we can construct values that can be used to simplify template language
expressions. These variables are commonly based on values from the parameters as we saw with
“agentCount.”
The values can be transformed using ARM template language expressions like the “concat”
function. We can see this with the “masterDNSPrefix” variable in the following example. The value
for “masterDNSPrefix” will contain a string value of whatever was passed in the parameter
“dnsNamePrefix” with “mgmt” concatenated at the end of it. Now we can simply reference this
variable in the template instead of having to perform this concatenation. This improves the readability
and maintenance of the template.
Click here to view code image
"variables": {
"masterDNSPrefix":"[concat(parameters('dnsNamePrefix'),'mgmt')]",
"agentDNSNamePrefix":"[concat(parameters('dnsNamePrefix'),'agents')]"
"agentCount": "[parameters('agentCount')]",
"masterCount": "[parameters('masterCount')]",
"agentVMSize": "[parameters('agentVMSize')]",
"sshRSAPublicKey": "[parameters('sshRSAPublicKey')]",
"adminUsername": "azureuser"
}
Resources
In the resources section we define the resources that are deployed and updated. This section is the
bulk of most templates and contains a list of resources. The following template snippet defines an Azure Container Service (ACS) resource and some of its properties. The “apiVersion,” “type,” and “name”
elements are required elements for every resource. The apiVersion must match a supported version
for the “type” of resource defined, and the name defines a unique name for the resource that can be
used to reference it. There are some additional documented elements not shown, like the
“dependsOn” element used to define resource dependencies that need to be provisioned first. The
“properties” element contains all the resource properties. For example, the “orchestratorProfile” is a
property of the container service as we have defined it here.
Click here to view code image
"resources": [
{
"apiVersion": "2015-11-01-preview",
"type": "Microsoft.ContainerService/containerServices",
"location": "[resourceGroup().location]",
"name":"[concat('containerservice-',resourceGroup().name)]",
"properties": {
"orchestratorProfile": {
"orchestratorType": "Swarm"
}
}
}
]
Outputs
In the optional outputs section we can specify values that are returned from deployment. For example,
we could return the Fully Qualified Domain Name (FQDN) for the master and agent nodes created in
the deployment. This could then be used by a user or a script to connect to these endpoints using this
domain name.
Click here to view code image
"outputs": {
"masterFQDN": {
"type": "string",
"value":
"[reference(concat('Microsoft.ContainerService/containerServices/',
'containerservice-', resourceGroup().name)).masterProfile.fqdn]"
},
"agentFQDN": {
"type": "string",
"value": "[reference(concat('Microsoft.ContainerService/
containerServices/', 'containerservice-', resourceGroup().name)).
agentPoolProfiles[0].fqdn]"
}
}
Using an ARM template, we can repeatedly deploy our infrastructure through the entire lifecycle
and have confidence that the resources are deployed in a consistent manner. We define our
infrastructure in a template that is managed in source control, where we can track changes to the
infrastructure. When the template is changed we can apply the new template to existing deployments
and the Azure Resource Manager will determine what needs to change and make the necessary
changes based on the current state of the template and the existing state of the deployment.
Linked Templates
As templates grow, we can compose templates from other templates through linking. This
is a great technique used in managing templates, especially as they grow in size and
complexity. We could create a template responsible for deploying Consul, for example,
and then reference that template in our Docker Swarm template.
Parameters for each environment can be maintained in source control or some other configuration
store. Like environment configuration for an application, we would generally store this in the
operations source control repository or a special database. Parameters would generally be things like
the names, locations, and number of instances of the various resources. We might want to deploy a cluster into a staging or test environment that has fewer or smaller nodes. Instead of hard-coding the number of nodes in the cluster in the template, we can make it a parameter. Then,
when we are deploying a template we can pass the parameter value containing the number of nodes in
the deployment suitable for the environment. The deployments of our cluster are consistent and the
same across environments, with some variation that can be controlled with parameters and tested,
like the number of nodes in this case. The following resource definition shows the master and agent node counts being set from template variables:
{
  "apiVersion": "2015-11-01-preview",
  "type": "Microsoft.ContainerService/containerServices",
  "location": "[resourceGroup().location]",
  "name": "[concat('containerservice-',resourceGroup().name)]",
  "properties": {
    "orchestratorProfile": {
      "orchestratorType": "Mesos"
    },
    "masterProfile": {
      "count": "[variables('masterCount')]",
      "dnsPrefix": "[variables('mastersEndpointDNSNamePrefix')]"
    },
    "agentPoolProfiles": [
      {
        "name": "agentpools",
        "count": "[variables('agentCount')]",
        "vmSize": "[variables('agentVMSize')]",
        "dnsPrefix": "[variables('agentsEndpointDNSNamePrefix')]"
      }
    ],
    "linuxProfile": {
      "adminUsername": "[variables('adminUsername')]",
      "ssh": {
        "publicKeys": [
          {
            "keyData": "[variables('sshRSAPublicKey')]"
          }
        ]
      }
    }
  }
}
As we can see here, Microsoft Azure makes it very easy to provision production-ready Mesos and
Docker Swarm clusters.
The flak.io sample application includes detailed steps for provisioning a new cluster and
deploying the flak.io ecommerce sample into the cluster at http://flak.io.
Azure CLI:
Click here to view code image
For example, the Azure CLI command for deploying a new storage account using the Azure GitHub
repository would look similar to the following:
Click here to view code image
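A minimal sketch of what such a deployment might look like with the cross-platform Azure CLI of the time, assuming ARM mode and one of the quickstart templates from the Azure GitHub repository (the resource group name, deployment name, and template URI are illustrative, and exact flags vary between CLI versions):
azure config mode arm
azure group create -n flakio-demo -l "West US"
azure group deployment create -g flakio-demo -n storage-deployment \
  --template-uri https://raw.githubusercontent.com/Azure/azure-quickstart-templates/master/101-storage-account-create/azuredeploy.json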
For more information on Azure Resource Manager, including guidance and best practices, take a
look at the product documentation at https://azure.microsoft.com/en-
us/documentation/articles/resourcegroup-overview/.
Multivendor Provisioning
The Azure Resource Manager (ARM) is the recommended approach to provisioning infrastructure
resources in Azure. Sometimes there is a need to provision all or some of the resources for an
application outside Azure. These can be resources in a private cloud, another public cloud, or even
some other hosted services like the hosted Elasticsearch services offered by Found (http://found.no).
There are a lot of different ways to approach this. We could use a combination of custom scripts for provisioning the other resources and then deploy the Azure resources through ARM using either the declarative or imperative model. We could transform the ARM templates as needed, or even have an ARM template invoke custom provisioning scripts to configure these resources. In addition to those options, there are some other infrastructure provisioning orchestration tools available in the market that enable us to define our infrastructure in a cloud platform vendor-agnostic way, and then use cloud-specific providers to do the actual work of provisioning. An example of one such tool is
Terraform (http://terraform.io). Terraform is capable of provisioning Azure infrastructure through an
ARM-based provider, along with some other popular cloud platforms.
Terraform, by HashiCorp, is a tool for building, changing, and versioning infrastructure in a safe,
efficient, and consistent way. Using configuration files that describe the final state of an application,
Terraform generates and runs an execution plan that builds the necessary infrastructure. Terraform can
be used to manage tasks as small as individual computing resource allocations or as large as
multivendor cloud deployments.
Terraform integrates with different resources through the use of providers. Providers form a bridge
between the Terraform configuration and any underlying APIs, services, or platforms. The Azure
provider enables you to provision and manage Azure resources, including virtual machine instances,
security groups, storage devices, databases, virtual networks, and more. This tool can be useful when
you need to define and provision infrastructure on multiple cloud platforms.
Challenges
Scheduling services can quickly become complicated, especially as our application and infrastructure
grow. We need to balance efficiency, isolation, scalability, and performance, all while accounting for
each application’s various requirements. We need a service that can automate the decisions and all
the little complexities involved in determining what machines in our cluster should run each instance
of a service.
Efficiency/Density
Efficiency measures how well our infrastructure schedules services based on the resources available.
In an ideal environment, services would be evenly distributed across multiple servers and there
would be no wasted resources.
In the real world, each service has its own unique resource requirements, and all nodes might not
provide the same resources. A service can have low processor and low memory requirements, but it
can require lots of storage. Another service can require a small amount of high-throughput storage—in other words, it would need solid state storage or a RAM disk instead of a
traditional hard drive.
The scheduler needs to quickly identify optimal placement of our services alongside other services
on nodes in the cluster. On top of this, the scheduler needs to constantly account for changing
resources due to hardware provisioning and node failures.
Isolation
In contrast to the distributed nature of scheduling, services rely heavily on isolation. Our services
are designed to be created, deployed, and destroyed repeatedly without affecting the performance or
availability of other services. Although services can communicate with each other, removing
isolation and creating dependencies between them essentially defeats the purpose of a microservices
architecture.
As an example, container-based solutions such as Docker use the Linux kernel’s cgroups feature to
control resource consumption by specific processes. They also make use of kernel namespaces, which limit the scope of a process. This can greatly improve fault and resource isolation of services
in a microservices architecture. In the event of an unexpected failure, a single service would not
compromise the entire node.
Scalability
As application complexity grows, so does the complexity of the data center. Not only do we need to
design our infrastructure around existing services, but we also need to consider how our
infrastructure will scale to meet the demands of future services. The scheduler might need to manage
a growing number of machines. Some are even able to increase and decrease the pool of virtual
machines to match demand.
Performance
Performance problems can be indicative of a poor scheduling solution. The scheduler has to manage
an extremely dynamic environment in which resources are changing, the services running on those resources are changing, and the load on the services is changing all the time. This can be complex, and
maintaining optimal performance can often require a great monitoring solution.
Identifying the most optimal resource for a task can take time and sometimes it’s important that the
task is scheduled quickly to respond to an increase in demand or a node failure.
A Scheduling Solution
As we just learned, the simple job of scheduling tasks to run on a node can quickly become complex.
Resource schedulers in operating systems have evolved over time and we have had a lot of
experience scheduling local resources. Solutions for scheduling work across a cluster of machines are not nearly as mature or advanced. However, some great solutions are available to help in
addressing this need, and they are evolving at a very rapid rate.
Many of the cluster scheduling solutions today provide a lot of similar features and approaches to
container placement. Before covering some of the more popular schedulers, we will first discuss the features they commonly provide.
Availability Sets
In Azure, virtual machines are organized into availability sets which assign fault and
update domains to the virtual machine instances. Update domains indicate groups of
hardware that can be rebooted together. Fault domains indicate groups of machines that
can share a common power source and network switches, and so on.
Dependencies
A scheduler needs a way to define a grouping of containers, their dependencies, and connectivity
requirements. A service can be composed of multiple containers that need to remain in close
proximity. Kubernetes enables you to group a set of containers that make up a service into pods. A
good example of this is a node.js service that uses NGINX for its static content and Redis as a
“local” cache. Other services might need to be near each other for efficiency reasons.
Replication
A scheduler needs to deal with optimally scheduling multiple instances of a service across the
cluster. It’s likely we don’t want to place all the instances of a service on a single node or in the same
fault domain. The scheduler often needs to be aware of these factors and distribute the instances
across multiple nodes in the cluster.
Reallocation
Cluster resources can change as nodes are added, removed, or even fail. In addition to the cluster
changing, service load can change as well. A good scheduling solution needs to be able to constantly
monitor the nodes and services, and then make decisions on how to best reallocate resources based
on availability as well as the most optimal performance, scale, and efficiency needs of the system.
Azure Healing
In Azure, if the hardware running the services fails, Azure will handle moving the virtual
machine the services are running on. This situation, where a machine goes away and
comes back, can be challenging for schedulers to manage. Also note that Azure works at
the machine level and will not help us if an individual container fails.
Highly Available
It could be important that our orchestration and scheduling services are highly available. Most of the
orchestration tools enable us to deploy multiple management instances, so that in the event one is
unavailable for planned or unplanned maintenance, we can connect to another. Depending on the
application requirements and cluster management features, it can be okay to run a single instance of
that management and orchestration service. If for some reason the management node is temporarily
unavailable, it can simply mean we are unable to reschedule services until it’s available again, and
the application is not impacted.
Rolling Updates
As we roll out updates to our services, the scheduler might need to scale down an existing version of
a service while scaling up the new version and routing traffic across both. The scheduler should
monitor the health and status of the deployment and automatically roll back the update if necessary.
Autoscaling
The cluster scheduler might be able to schedule instances of a service based on time of day, or based
on monitoring metrics to match the load on the system. In addition to being able to monitor and
schedule task instances within a cluster, there could be a need to automatically scale the cluster nodes
to reduce the cost of idle resources or to respond to increased capacity needs. The scheduler can
raise an event for the provisioning systems to add new nodes to the cluster or remove some nodes
from the cluster.
API
Continuous Integration/Deployment systems might need to schedule new services or update an
existing service. An API will be necessary to make this possible. The API can even expose events
which can be used to integrate with other services in the infrastructure. This can be useful for
monitoring, scheduler tuning, and the integration of other systems or even cloud platform
infrastructure.
There are a growing number of technologies available in the market that are used for the
orchestration and scheduling of services. Some of them are still not necessarily feature-complete, but
they are moving extremely fast. We will cover a few of the more popular ones in use as of this
writing. The first we will look at is Docker Swarm.
Docker Swarm
Docker, Inc. provides a native clustering solution called Docker Swarm. A Swarm cluster is a pool
of Docker nodes that can be managed as if they were a single machine. Swarm uses the standard
Docker API, which means existing tools are fully compatible, including the Docker client. Docker
Swarm is a monolithic scheduler where a single Swarm Master that’s aware of the entire cluster state
is responsible for scheduling.
As shown in Figure 5.2, a Docker Swarm cluster will contain one or more Swarm Masters, a
number of nodes running the Docker daemon, and a discovery backend, not to be confused with
container discovery services covered later in the networking section of this chapter.
FIGURE 5.2: Docker Swarm cluster overview
Master Nodes
A Swarm cluster will contain one or more master nodes. Only one master node will be performing
cluster scheduling work, but additional nodes can be running to provide high availability. The masters
will elect a leader, and if for some reason the elected master is unavailable another master will be
elected and take over the task of handling client requests and scheduling work. A supported service
that can provide the necessary leader election feature is required. Services like Consul, Zookeeper,
and etcd are commonly used for this purpose, as well as the discovery backend. The master uses the
list of nodes in the discovery backend to manage containers on the nodes. The master communicates
with the nodes using the standard Docker protocol, the same one the client uses.
Discovery Backend
The Docker Swarm discovery backend is a pluggable service that is used as a cluster discovery mechanism and manages cluster state. When a node joins the cluster, it uses this service to register itself. The backend is pluggable, with a cloud-hosted option available, as well as many other options.
Swarm Strategies
The Docker Swarm scheduler supports multiple strategies that determine how Swarm computes
ranking for container placement. When we run a new container, Docker Swarm will schedule it on the
node with the highest computed ranking for the selected strategy. Swarm currently supports three strategies: spread, binpack, and random. The spread and binpack strategies consider each node’s available CPU, RAM, and number of running containers. The random strategy simply selects a node
and is primarily used for debugging. The spread strategy optimizes for nodes with the least number of
containers, trying to keep a more even distribution. The binpack strategy optimizes for nodes with the
most containers, attempting to fill nodes up.
An API is available for creating new Swarm strategies, so if a strategy with a needed algorithm
does not exist, we can create one.
Swarm Filters
Docker Swarm comes with multiple filters that can be used to schedule containers on a subset of
nodes. Constraint filters are key/value pairs associated with nodes, and can be used for selecting
specific nodes. Affinity filters can be used to schedule containers that need to be close to other
containers. Port filters are used to handle host port conflicts, and are considered unique filters. A
health filter will prevent scheduling of containers on an unhealthy node.
Secure Communications
For situations that require the Docker client to connect to a daemon over a public
network, it is recommended that Transport Layer Security (TLS) is configured. More
information on configuring TLS in Docker can be found here:
https://docs.docker.com/engine/security/https/. Alternatively, an SSH session can be
established to the Swarm Manager and the local Docker client can be used.
One of the nice things about working with Docker Swarm is the simplicity. The Docker Swarm API
is like working with a single instance of Docker, only there is an entire cluster behind it.
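To illustrate, a minimal sketch of pointing the standard Docker client at a Swarm manager and scheduling a container with a constraint filter (the manager address, port, and node name are illustrative):
export DOCKER_HOST=tcp://<swarm-manager-ip>:2375
docker info                                   # shows the nodes in the cluster
docker run -d -p 80:80 -e constraint:node==agent1 thedanfernandez/productcatalog
The same docker commands we used against a single host now schedule work across the cluster.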
Kubernetes
Similar to Docker Swarm is Kubernetes (often abbreviated as k8s), an open-source Google project
for managing containers in a cluster.
Kubernetes acts as a framework for deploying, scheduling, maintaining, updating, and scaling
microservices. In essence, it abstracts the process of deploying the service from the user and actively
monitors the state of the application infrastructure to ensure integrity. If a problem occurs, Kubernetes
can automatically rebuild and redeploy a failed container.
A key difference between Kubernetes and Docker Swarm is how they treat clusters of containers.
With Docker Swarm, individual containers are unified under a single API. With Kubernetes,
containers are grouped into logical units which are then managed, monitored, and assigned resources
by the cluster master. One of the key features of Kubernetes is that it enables users to define the
ultimate state of the cluster, while managing the cluster’s various components to match that state.
In addition to the Scheduler, the Kubernetes master includes an API Server and a Controller
Manager as we see in Figure 5.3. The kubectl client talks to the API Server on the master node and
sends a configuration for a replication controller, with a pod template and desired replicas.
Kubernetes uses this information to create a number of pods. The scheduler then looks at the cluster
state and schedules work on the nodes. The kubelet, an agent running on each node, monitors changes
in the set of pods assigned to it and then starts or kills pods as needed.
Components
A basic Kubernetes cluster consists of node agents (known as kubelets) managed by a Kubernetes
Control Plane running on a master node.
Pods
Kubernetes groups related containers into logical units called pods. Pods are the smallest components
that can be created, deployed, or managed by Kubernetes. However, they generally consist of
containers that perform complementary services. For example, a pod that provides a website could
consist of a web server container and data storage container. Containers within a pod can view other
containers’ processes, access shared volumes, and communicate via message queues.
Replication Controllers
When running multiple instances of a pod, Kubernetes controls the number of instances by using a
replication controller. A replication controller ensures that a specified number of pods are available
at any given time. For instance, if you have two instances of a pod running for load balancing
purposes and an instance fails, a replication controller will automatically start a new instance.
Services
Similar to how pods define a set of related containers, services define a set of related pods. Services
are a stable abstraction in the cluster that provide routing, service discovery, load balancing, and zero
downtime deployments. Applications consuming the service can use either the host name or IP
address of the service and the requests will be routed and load-balanced across the correct pods.
When a service is created, Kubernetes assigns it a unique IP address, and although pods will come
and go, services are more static.
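To give a feel for how these pieces fit together, here is a minimal sketch using kubectl against a Kubernetes 1.1-era cluster (the image is the catalog image from Chapter 4; names and flags are illustrative and vary between versions):
kubectl run catalog --image=thedanfernandez/productcatalog --replicas=2 --port=80
kubectl expose rc catalog --port=80          # create a service in front of the pods
kubectl scale rc catalog --replicas=4        # change the desired replica count
kubectl get pods                             # list the pods the controller created
Here kubectl run creates a replication controller, and kubectl expose creates a service that routes to whatever pods that controller is managing.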
Volumes
Containers are stateless by design. When a container is deleted, crashes, or refreshed, any changes
made to it are lost. With volumes, containers can preserve data in a directory that exists outside of the
container. Volumes also enable containers to share data with other containers in the same pod.
Kubernetes supports several types of volumes, including local folders, network-attached folders,
cloud storage volumes, and even Git repositories.
Other Components
Kubernetes supports other features such as annotations for attaching non-identifying metadata to
objects, secrets for storing sensitive data, and more. For additional information, see the Kubernetes
User’s Guide at http://kubernetes.io/v1.1/docs/user-guide/README.html.
Kubernetes on Azure
Kubernetes can be deployed on Microsoft Azure, and a set of scripts for easily deploying to Azure
are maintained in the Kubernetes project. These scripts can be found on the Kubernetes site
(https://kubernetes.io) in the Getting Started section of the documentation.
Apache Mesos
Apache Mesos is an open-source cluster manager and scheduling framework for pooling and sharing
resources like CPU, memory, and storage across multiple nodes. Instead of having a specific node for
a web server or database server, Mesos provides a pool of machines for running multiple services
simultaneously.
In Figure 5.4 we see an Apache Mesos Master with a standby node for high availability.
Zookeeper is used for cluster state and leader election, and frameworks are used for scheduling work
in the cluster.
Core Components
An Apache Mesos cluster consists of three major components: masters, agents, and frameworks.
Masters
A master daemon manages one or more agent daemons. It’s responsible for tracking each agent’s
available resources, tracking active applications, and delegating tasks to agents.
Agents
Agent daemons run on each cluster node. Agents provide the physical resources that the master tracks, pools, and distributes to each application/framework.
Frameworks
Frameworks are applications that run on an Apache Mesos cluster. Each framework is split into
tasks, which are individual work units that run on Apache Mesos agents. Unlike many clustering
solutions, Mesos enables frameworks to control much of the delegation of tasks. When Apache Mesos
allocates resources to a framework, it makes an offer to the framework; if the offer doesn’t satisfy the
framework’s constraints, the framework can reject it. Frameworks can be configured to accept or
reject resources based on certain constraints.
Frameworks themselves consist of two components: schedulers and executors. Schedulers register
with the master and handle resource offers. Executors are processes on the agent nodes that run tasks.
Mesosphere DCOS
Mesosphere Data Center Operating System (DCOS) is a commercial product from
Mesosphere, Inc. (http://mesosphere.com). DCOS is based on Apache Mesos and includes
enterprise-grade security in addition to many other core system services, like Mesos-
DNS, a Command Line Interface (CLI), an API, and more.
Marathon
Marathon is an open-source framework from Mesosphere’s DCOS, and is used for long-running
tasks. The microservices we will deploy are generally long running, and we will need to ensure they
continue running until we decide we need to stop them for some reason, like reducing the count or
updating them with a new version. Marathon can be used to ensure a defined number of our services
remain running in the cluster, even if a node on which one of our service instances is running goes
away—provided there are enough resources, of course. In addition to scheduling containers, Marathon can schedule and run commands in the cluster. For example, we could bootstrap a node.js
application that pulls down the necessary artifacts and runs. Marathon comes with a UI and an API
that can be used to schedule tasks.
We will actually use Marathon to deploy and run our sample application in a Microsoft Azure
Container Service.
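As a taste of the API, a minimal sketch of posting an application definition to Marathon’s /v2/apps endpoint (the Marathon address and the application values are illustrative):
curl -X POST http://<marathon-host>:8080/v2/apps \
  -H "Content-Type: application/json" \
  -d '{
        "id": "/catalog",
        "instances": 2,
        "cpus": 0.5,
        "mem": 256,
        "container": {
          "type": "DOCKER",
          "docker": {
            "image": "thedanfernandez/productcatalog",
            "network": "BRIDGE",
            "portMappings": [ { "containerPort": 80, "hostPort": 0 } ]
          }
        }
      }'
Marathon will then keep two instances of the container running somewhere in the cluster.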
Chronos
Chronos is an open-source framework from Mesosphere’s DCOS, and is used as a replacement for
running cron jobs in a cluster. It’s a distributed, fault-tolerant scheduler that supports custom Mesos
executors as well as default command executors. Chronos does a lot more than cron, and can schedule
jobs that run inside Docker containers, use repeating interval notation, and can even trigger jobs by
the completion of other jobs. Chronos includes a UI and API that can be used to schedule tasks.
Almost all large distributed applications include a growing number of scheduled jobs based on
time or interval.
Service Discovery
In more traditional distributed systems, services run at fixed, well-known locations, and there are
generally fewer dependent or collaborating services. A microservices-based application often runs a
large number of services in an environment where the service locations and instances are very
dynamic. As we discussed in the resource scheduling section, services are often deployed in a shared
cluster of virtual machines. Services are scaled up or out across a cluster of resources to meet the
demand. Nodes come and go, and services are reallocated to optimize resource usage. Services
provisioned in these environments require some mechanism to enable them to be easily discovered
for requests to be sent to them.
Our services will need to communicate with each other, and inbound requests into the system need
to be routed to the appropriate services. For this we simply maintain a list of services and their
locations, so that we can look up the endpoint location when we need to call it. This is referred to as
service discovery, which can be implemented a number of different ways, using a growing number of
technologies.
Figure 5.5 shows a conceptual service discovery service with service announcement and lookups.
Service Registration
Also known as service announcements, this is simply the process where changes to a service status
are written to a service registry. These changes are typically when a service is available to handle
requests, or is no longer available to handle requests. When a new service instance has been
scheduled on a node and is ready to receive requests, we need to let consumers of the service know that it’s ready and where it is. Multiple instances of a service can be scheduled across the
cluster, all listening on different IP addresses and ports. It’s part of a service announcement’s job to
simply report to the service registry, “I’m an Order service ready to receive requests and I am
listening on 10.0.0.4 at port 9001.” The service registry then maintains this information, and when the
service is shut down the service needs to let the registry know that it’s going away. We can also
configure a time-to-live (TTL) on the records and expect the service to routinely check back in, or we
can perform health checks on the service to make sure it has not silently disappeared.
A lot of different ways exist to implement this seemingly simple task. Common approaches are to
make this the job of the service orchestration tools, make it part of the service initialization, or
include it as a node management sidecar process. Kubernetes, for example, will manage this
information in a service registry store like etcd when scheduling pods to run on nodes. Projects like
Airbnb Nerve or Glider Labs Registrator will automatically register and deregister services for
Docker containers by inspecting containers on a node as they come online. Another approach would
be to have the service register itself as part of the service startup, and to implement this in a
framework.
Service Announcement Implementation Options:
• Orchestration tooling is responsible for maintaining this information. The orchestrator will
maintain a list of scheduled and running service instances in the cluster as well as the health of
each instance.
• The service itself implements announcements, and on service startup it will make a call to
register itself. This can be a simple approach and it can be implemented in a shared library for
each supported language.
• A sidecar service running on each node monitors services and performs the registration and
deregistration. Externalizing registration to a sidecar process means the service implementation does not need to be concerned with the task of registration or deregistration, as shown in the sketch below.
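As an example of the sidecar approach, a minimal sketch of running Glider Labs Registrator on a node so that it registers containers with a Consul agent (the Consul address is illustrative, and flags vary between Registrator versions):
docker run -d --name=registrator --net=host \
  -v /var/run/docker.sock:/tmp/docker.sock \
  gliderlabs/registrator consul://localhost:8500
Registrator watches the Docker socket and registers or deregisters a service entry whenever a container starts or stops.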
Service Lookup
Also referred to as service discovery, service lookup is simply the process of finding services and their related
endpoint information. Clients and load balancers need to be able to look up the location of services to
send requests. When one service needs to call another service, it will look up endpoint information,
and quite possibly health status, when calling the service. Centralized load balancers will often
monitor and read this information to maintain a list of back ends for routing and load balancing traffic
across multiple instances of a service.
Service Lookup Implementation Options:
• DNS-based lookups can be an easy option that requires few changes.
• A client framework in the service can be used to look up endpoints and to load balance.
• A local proxy on each host can maintain lookup information and proxy requests to the correct
service, as well as load balance requests.
• The client uses a centralized proxy load balancer that performs lookups. The client will then
access the service through a shared load balancer that is responsible for maintaining a list of
back-end service instances.
In Figure 5.6 we see a configuration that uses a local proxy, such as NGINX, to route requests. The
order service talks to a well-known endpoint like localhost:9001 to call a dependent service, and the
local proxy handles the lookup. This can simplify application design, moving the service lookup to a
proxy. This is sometimes referred to as an arbiter.
FIGURE 5.6: Diagram showing a local proxy in each host to perform service routing
This approach would also remove the need to develop language-specific frameworks for each
language you need to support.
Service Registry
A service registry is simply a database that maintains a list of services, their endpoints, and quite
often additional service metadata which is used when consuming the service. Some service discovery
technologies, like Consul, will perform health checks and maintain health state of service instances.
Features
There are a number of different technologies commonly used as a service registry. Each technology
offers a similar set of features. Let’s first have a look at the features of a conceptual service
discovery store, and then the common technologies used.
Notifications
When a service changes, such as when instances are added, removed, moved, and so on, it’s
sometimes necessary to be notified of these changes. Technologies like Confd
(https://github.com/kelseyhightower/confd) or Consul Template (https://github.com/hashicorp/consul-
template) can monitor changes and update configuration based on the data. This is useful when
maintaining a list of back ends in something like NGINX or HAProxy.
Health Checks
Some solutions, like Consul, can be configured to perform health checks and maintain health metrics
about the services that are registered. This can be used for monitoring and also for routing logic.
Shared circuit breaker state might be maintained here, and if the service is having problems, the client
might not call it.
Technologies
Many of the technologies discussed here do more than just store service discovery data, and are often
used for coordination as well as a configuration store. In addition to the features and libraries that
integrate with these services, you should consider the following: if you are going to deploy Docker Swarm and use Consul for cluster coordination, you might want to use it for service discovery as well, so you don’t have to deploy and maintain another system. You do, however, need to be careful and consider how this could affect the scalability and performance of the cluster, as well as the different
requirements and features of the technology, like the consistency models.
Note
When sharing deployments of a service, be careful that one workload does not adversely
affect the other. Using the Zookeeper deployment that Apache Mesos uses for maintaining
cluster state as a service discovery store can impact the performance of Apache Mesos.
In a production environment it is generally recommended that Apache Mesos should use
a dedicated Zookeeper deployment.
DNS
The simplest solution would be to use DNS and create a name for each service; then clients can
discover service instances using standard DNS. DNS for service discovery does have its limitations.
First of all, DNS does not support pushing changes, so we need to poll for changes. Caching and
propagation delays can cause challenges and latency in updating state. Some applications will even
cache DNS when they start up.
If planning to use DNS for service discovery, consider some of the more modern DNS services.
Mesosphere has created a project called Mesos-DNS which provides DNS-based discovery in
Apache Mesos (https://github.com/mesosphere/mesos-dns). SkyDNS
(https://github.com/skynetservices/skydns) is another popular option built on top of etcd. Consul is
also capable of exposing service lookup through DNS as well as its standard API. The Docker and
Weave networking features that are discussed further in the following overlay section also provide
DNS-based container discovery.
DNS-based service discovery can be adequate and readily available in the platform, as with the
Azure Container Service. When using DNS for the discovery of endpoints that are very dynamic, we
need to take care in setting appropriate TTL and to understand how our clients are caching DNS. The
argument against DNS is that it was designed for a different purpose and the caching can cause a lot
of challenges in production.
Consul
Consul is an open-source project from the folks at Hashicorp, the same people that brought us
Terraform, Vagrant, and Vault. Consul provides an HTTP API that we can use for service
announcement and discovery as well as a DNS interface. In addition to service discovery, Consul can
perform service health checks, which can be used for monitoring or service discovery routing. Consul
also has a very nice templating client, which can be used to monitor service announcement changes
and generate proxy client configurations. Other features include a key/value store and multiple data
center support.
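A minimal sketch of registering a service with a local Consul agent over its HTTP API and then looking it up through the DNS interface (the service name, address, and port are illustrative):
curl -X PUT http://localhost:8500/v1/agent/service/register \
  -d '{ "Name": "catalog", "Address": "10.0.0.4", "Port": 9001 }'
dig @127.0.0.1 -p 8600 catalog.service.consul SRV
The SRV response returns the address and port of each healthy instance of the catalog service.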
etcd
etcd is a distributed key value store from the folks at CoreOS. It provides reliable storage of data
across a cluster of machines. It is often used as a service discovery store when building service
discovery systems. It originated in the CoreOS project, is included in the distribution, and can be
installed on other distributions.
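When used as a registry, services (or a sidecar) typically write their endpoint under a well-known key prefix. A minimal sketch with the etcd v2 etcdctl client (the key layout and values are illustrative):
etcdctl set /services/catalog/instance-1 '{"host":"10.0.0.4","port":9001}' --ttl 60
etcdctl ls /services/catalog                 # list registered catalog instances
etcdctl get /services/catalog/instance-1     # read one instance's endpoint
The TTL means the entry expires unless the instance refreshes it, which keeps the registry from accumulating dead endpoints.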
Zookeeper
Zookeeper is an open-source Apache project, and provides a distributed, eventually consistent
hierarchical key/value store. It’s a popular option for a service discovery store when building service
discovery systems. Mesos uses Zookeeper for coordination, and you might want to consider whether
or not the technology is already used in your infrastructure when selecting your service discovery
tool.
Eureka
Eureka is an open-source service discovery solution developed by Netflix. It’s built for availability
and resiliency, and is used at Netflix for locating services for the purpose of load balancing and
failover of middle-tier servers. The project includes a Java-based client that offers simple round-
robin load balancing. Additional application-specific metadata about a service can also be stored in
Eureka.
Other Technologies
There are a number of other options that could be used for this purpose, but at the time of writing,
these are the most popular. A number of factors will need to be considered when selecting a
technology to use. You might already be using one of these to maintain cluster state or for leader
election, or you can have a preference for another service that has better integrations with one of
these.
Application/API Gateway
An application gateway, also commonly referred to as an API Gateway, is used in a microservices
architecture for traffic aggregation and request routing of inbound client requests to the necessary
services. These gateways are also quite commonly used for authentication offload and SSL offload,
as well as quality of service throttling, monitoring, and many other things. Even very basic load
balancing and routing requirements will often require a reverse gateway, as most load balancers in
public cloud environments are not dynamic or flexible enough to meet the needs of modern
microservices deployments.
Common Gateway Responsibilities:
• Request routing and load balancing: Perform service lookup and route requests to the
appropriate service instances.
• Request aggregation: To reduce chattiness between the client and back-end services, the
gateway will sometimes be responsible for handling a single request from the client, and then
sending requests to multiple services and returning the aggregate results as a response to the
client.
• SSL offload: A secure SSL connection is established between the client and the gateway, and
then non-SSL connections are used internally, or different internal certificates are used to
encrypt communications.
• Authentication: A gateway can be responsible for authenticating requests, and then passing
customer information to the services behind them.
• Protocol transformation: The gateway can be responsible for converting protocols the client
might need to those used by the internal services.
NGINX and HAProxy are two of the more popular technologies used as application gateways
today. Both of these applications support dynamically configurable back-end pools. Tools like confd
or Consul Templates can be used to maintain the back-end pools, replicating data from the service
registry into configurations.
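A minimal sketch of wiring this together with Consul Template, assuming a template file that renders the NGINX upstream block and a local Consul agent (the paths and reload command are illustrative, and flags vary between versions):
consul-template -consul localhost:8500 \
  -template "/etc/consul-templates/upstreams.ctmpl:/etc/nginx/conf.d/upstreams.conf:nginx -s reload"
Whenever the set of registered service instances changes, Consul Template rewrites the upstream configuration and triggers NGINX to reload it.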
Figure 5.7 shows two or more instances of an application gateway running behind an Azure load
balancer for high availability. Services will announce changes and the endpoint information to the
service discovery registry. The gateway will update its upstream pools based on the services and
service instance information. The gateway can then route requests using the path, so that requests to
the “/catalog” path are routed to instances of the catalog service. The gateway can load balance across multiple instances of the catalog service, even though they are on different machines and ports.
Requests can be routed based on other factors such as the destination port, IP address, or some header
value.
FIGURE 5.7: Application/API Gateway
We don’t necessarily need to deploy our gateway onto separate dedicated nodes. We could instead
configure the Azure load balancer to route traffic to a couple of nodes in the cluster, and then use
tagging and constraints to deploy a gateway container to appropriate hosts. We could also configure
the load balancer to route traffic to every node in the cluster and set up a gateway on every node in
the cluster. A gateway on every node in the cluster can work well for small clusters, but as the cluster
size increases you might want to use a dedicated set of nodes. These nodes can be part of the cluster
or they could be special virtual machines outside the cluster manager, even in a different edge
network.
Deployment Considerations:
• Dedicated gateway: Use a dedicated set of gateway virtual machines managed independently
of the scheduler. The Azure load balancer is then configured to load balance across this
specific set of machines.
• Peer gateway request routing: With a smaller cluster, we can place a gateway on each node
in the cluster and route requests. By doing this we can save management and compute costs by
eliminating a couple of virtual machines. When adding a node to the cluster, we will need to
update the load balancer.
• Dedicated cluster nodes: We can configure the Azure load balancer to route traffic to a subset
of the nodes in the cluster and tag the nodes appropriately so that we can schedule a gateway to
run on them. We will need to ensure the Azure load balancer configuration and node
configuration are in sync.
Overlay Networking
It would be nice if all the containers across the cluster were addressable with their own IP address
and we didn’t have to deal with dynamic ports or port conflicts. Kubernetes even requires each pod to have its own IP address, which is assigned from a range of IPs on the node where the pod is placed. We
still need some service discovery mechanism to know what IP address the service instance is at, but
we don’t have to worry about port conflicts on the hosts. In some environments, it’s not possible to
assign enough IP addresses to a host, and we need to manage the assignment of the IP address ranges
in the environment.
We can create an overlay network on top of the existing infrastructure network that can route
requests between containers that are distributed across multiple nodes. This enables us to assign an IP
address to each container and connect to each service on a standard, well-known port. This reduces
the complexity of port mapping and the need to treat ports as a resource on host machines when
scheduling work in the cluster.
Benefits:
• Basic DNS use: We can use DNS features to find containers in the network, and thus do not
need to write additional code to discover the assigned host ports the service is running on.
• Avoids host port resource conflicts: This can eliminate port conflicts in situations where we
might want to schedule tasks on a node needing to expose the same port.
• Simpler support for connectivity with legacy code: In some situations legacy code can make
it difficult or nearly impossible to use a different port.
• Networking management: Although it is necessary to configure and manage an overlay
network, it is often easier to manage the deployment and configuration of an overlay
network than to deal with host IP ranges and service port mappings.
An overlay network can be extremely useful, especially with a large cluster or clusters spanning
multiple data centers. There are, however, concerns with the additional performance overhead and
the need to install and manage another service on the cluster nodes.
Figure 5.8 provides a visual representation of an overlay network created on top of the existing
infrastructure network, enabling us to route traffic to services within a node. As we can see, the host
machine is running a multi-homed gateway service that is bound to port 80 of the host machine on
10.0.0.4 and connected to the overlay network. The gateway service can proxy inbound requests to
the order service on port 80 at 172.16.20.1, and the order service can connect to an instance of the
catalog service at either 172.16.20.2 or 172.16.30.1 on well-known port 80. Any of the name
resolution options can be used for service discovery including those that come with some of the
overlay technologies we will cover here.
FIGURE 5.8: Service lookup using a proxy
The nice thing about this approach is that each container is now directly addressable, and although
it can add a little complexity in the networking configuration, it significantly simplifies container
lookup and management. All our services can be deployed and listening on well-known HTTP port
80 at their very own IP addresses. We can use well-known ports for the various services; it’s easy to
expose multiple ports on a container; and it can simplify service discovery.
Technologies
There are a number of technologies available in the market that can be used to create and manage an
overlay network. Each provides a different set of features, with trade-offs that need to be
considered.
Docker Networking
The Docker engine includes a built-in multi-host networking feature that provides Software Defined
Networking (SDN) for containers. The Docker networking feature creates an overlay network using
kernel-mode Open Virtual Switching (OVS) and Virtual Extensible LAN (VXLAN) encapsulation.
The Docker networking feature requires a key/value (KV) store to create and manage the VXLAN
mesh between the various nodes. The KV store is pluggable and currently supports the popular
Zookeeper, etcd, and Consul stores.
The Docker networking feature also provides service discovery features that make all containers
on the same overlay network aware of each other. Because multi-host networking is built into the
Docker engine, we would not have to deal with deploying a network overlay to all the host nodes. In
true Docker fashion, we can replace this with something like Weave, which offers more advanced
features.
Weave
Weaveworks Weave Net (http://weave.works) is a platform for connecting Docker containers,
regardless of where they’re located. Weave uses a peering system to discover and network containers
running on separate hosts, without the need to manually configure networking. Weave creates two
containers on the host machine: a router container, which captures traffic intended for containers
managed by Weave; and a DNS discovery container, which provides automatic DNS discovery for
Weave containers.
Registering a Docker container with Weave assigns it a DNS entry and makes it available to other
containers on the host. You can reference a Weave container from another simply by using its Weave-
assigned DNS name. Weave also enables multiple containers to share the same name for load
balancing, fault tolerance, hot-swappable containers, and redundancy. Weave also supports
additional features such as encryption of traffic, host network integration, and application isolation.
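A minimal sketch of that workflow, assuming Weave Net is already installed on each host (the peer address, container name, and image are illustrative):
# Start the Weave router and DNS containers on this host, peering with another host
weave launch 10.0.0.5

# Point the Docker client at the Weave proxy so new containers join the Weave network
eval $(weave env)

# The container is registered in WeaveDNS under its hostname (catalog.weave.local)
docker run -d --name catalog -h catalog.weave.local flakio/catalog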
There are some benefits to running Weave Net in place of Docker networking. Weave Net does
not require us to install and manage additional software such as a KV store. It's more resilient to
network partitions, offers simpler support for cross-site deployments, and provides a more robust
service discovery option. It's also a great option for Kubernetes, Mesos, and other container-centric
schedulers.
Flannel
Flannel is a virtual mesh network that assigns a subnet to each container host. Flannel removes the
need to map ports to containers by giving each host a pool of IP addresses, which can be allocated to
individual containers. Flannel is a CoreOS project, but can be built for multiple Linux distributions.
A basic Flannel network encapsulates IP packets in UDP. Flannel can use different back
ends, including VXLAN, Amazon VPC routes, and Google Compute Engine routes.
Summary
In this chapter we covered a lot of material for building an environment on which to deploy and run
our microservices applications. We covered cluster provisioning and options for scheduling services
in the cluster. We use discovery services for routing and connectivity, and possibly additional
networking tools to meet our dynamic and configurable networking requirements. In the subsequent
chapters we will cover monitoring the services to ensure they continue to operate as expected, as
well as addressing configuration and management of the services.
6. DevOps and Continuous Delivery
One of the keys to the successful use of a microservice architecture is ensuring you have an
automated, well-defined workflow where development and operations work together to produce
agile, high-quality releases. This is the essence of DevOps. In this chapter, we’ll provide an
overview of DevOps, its benefits, and one of the most important facets of DevOps: building a culture
of DevOps in your organization. Next we’ll discuss creating environments in Azure for a continuous
delivery (CD) pipeline and how a microservice is validated through a series of tests from code
check-in to deployment in production. Finally, we’ll discuss key criteria for choosing a continuous
delivery tool.
DevOps Overview
DevOps is the combination of development and operations teams working together toward a unified
goal: Ship the highest-quality code and infrastructure in the shortest span of time to deliver value to
customers faster. With DevOps, operations are a core part of every step in the development pipeline.
This includes the developers writing the code as well as the engineering teams that provision the
hosting infrastructure and build and manage the release pipeline. It includes release engineers,
database operations, network operations, security operations, and many others. The Microsoft model
for DevOps shows how teams go through four phases, as shown in Figure 6.1.
Modern DevOps
Organizations that have fully embraced DevOps are redefining what it means to be agile. Teams are
not waiting until the end of the sprint to ship—they are shipping updates to their microservices dozens
or hundreds of times a day! While it might seem counterintuitive, high-performing DevOps companies
shipped thirty times more frequently and had fifty percent fewer failures by leveraging automation
(Puppet Labs 2014 State of DevOps). The Mean Time to Repair (MTTR), which is the average time
taken to repair an issue, was twelve times faster for companies that deploy small, more frequent
releases than companies that do large, less frequent releases.
When your organization embraces DevOps, deployment stops being an “event,” meaning there are
no special meetings required to deploy a new version of an app. It’s just something that happens
whenever it’s needed, using an automated pipeline that is managed and monitored from check-in to
production.
To summarize, let’s look at what a world with and without DevOps might look like in Table 6.1.
TABLE 6.1: Comparing Teams With and Without DevOps
DevOps Culture
While the majority of this book is about technologies, the key to success with DevOps is people. All
of the technology in the world won’t help you be successful if your team doesn’t work together.
Below are just some DevOps culture principles.
Demystify deployments
Rolling out a new deployment is often described by the team as “terrifying” or “scary.” You need
to find a way to ensure the entire team knows how to deploy, and that every member of the team is
trusted to contribute code to the production environment. The systems put in place to prevent large-
scale mistakes should be good enough to ensure that nobody can break the existing experience. To
ingrain this in the culture, startups like Lyft have new employees deploy to production on their first
day on the job. If something significant breaks, this leads to the next point, the “no blame” rule.
FIGURE 6.2: Each microservice has its own defined continuous delivery pipeline
Immutable Infrastructure
Before we discuss ways to create our infrastructure, let’s discuss two common ways to define your
infrastructure and deployments. The first is the classic model, where a server is running for a
relatively long time, with patches and updates applied regularly. Any breaking changes result in
tweaks to the server configuration that must be rolled out and tested, resulting in regular maintenance
and patching. The longer the server runs and the more deployments that happen against that
environment, the higher the risk of encountering issues caused by the state of the machine being
different from when it was originally provisioned. One example is a log file that, after weeks of running
smoothly, has filled a local hard drive so the application now throws exceptions because of a lack of disk
space. Don’t be fooled by server uptime as an indicator of system health—it is not. A system
being able to run for an extended period of time is not directly correlated to its inherent stability and
ability to properly handle incoming deployments.
Another much more efficient model is defined by the term “Immutable Infrastructure,” coined by
Chad Fowler of 6Wunderkinder (now part of Microsoft). The term implies that all infrastructure,
once deployed, cannot be changed. Therefore, whenever a new configuration or operating system
patch is required, it should never be applied to running infrastructure. It must always be a new,
reimaged instance, started from scratch. This makes things much more predictable and stable as it
eliminates the state factors that might negatively impact future deployments. Immutable Infrastructure
heavily relies on the assumption that your environments are fully automated in a repeatable and
reliable process, from testing to configuration to deployment and monitoring.
There are three key issues related to the classic model:
• No or minimal automation: This means that everything that breaks will require dedicated
attention—increasing operational complexity that results in higher maintenance costs that are
hard to offset through mutable reconfigurations and updates.
• Slower and buggier deployments: The more moving pieces, the more likely that one of them
will break and bring the entire system down. It’s not uncommon that the perceived modular
architecture of the classic model is in fact a monolithic house of cards where an inadvertent
change can have damaging consequences.
• Expensive diagnostics: Once something actually fails, it will be both time- and resource-
consuming to pinpoint the real reason. The fix is rarely a long-term one, but rather a Band-Aid
that targets just one potential symptom.
Infrastructure as Code
Another key aspect of automation is automating the creation of environments. As we discussed in
Chapter 5, “Service Orchestration and Connectivity,” Azure provides an automated way to create
your application topology using Azure Resource Manager (ARM) templates. For DevOps, we’ll want
to ensure that the creation of all environments is fully automated using ARM templates, including any
necessary installation or configuration scripts. Having infrastructure definitions available as code
also means any team member can instantly run a script and provision a private instance of your team’s
environment painlessly.
FIGURE 6.3: Each microservice pipeline defines its own private Azure resources
Figure 6.3 should be intuitive in that the closer we get to production, the larger and more
realistic the preproduction environment becomes. The one thing to note is that the QA environment can
be of variable size. Depending on the type of test being run in QA, you can easily
scale the number of virtual machines up or down as needed. In other words, you only pay for what
each test actually needs.
In the semi-shared model shown in Figure 6.4, each team still manages its development
environment privately, but all other environments use a shared pool of resources using a clustering
technology like Azure Container Service, Docker Swarm, Mesosphere, or Service Fabric as
discussed in Chapter 5 under “Orchestration.”
FIGURE 6.4: Microservices using a combination of private and shared resources
One common per-environment configuration setting missing in this parameter file is the capability
to configure VM size. When creating environments, you probably have smaller VM sizes in
development, like an A1 Basic VM with 1 core and 1.75GB of RAM, but your production
environment would have a more powerful VM size configuration like the D4 size that includes 8
cores, 28GB RAM, and a 400GB SSD drive. We can parameterize the VM size by defining it as a
new parameter in the azuredeploy.parameters.json file as shown here:
"vmSize": {
"value": "Standard_D4"
},
Next, you will need to open the azuredeploy.json file and under the hardwareProfile property,
change the vmSize property to read the value from the newly added vmSize parameter.
Click here to view code image
"properties": {
"hardwareProfile": {
"vmSize": "[parameters('vmSize')]"
},
...
In this simple configuration, you can use one ARM template to define the virtual machine and have
four parameter files that represent the different per-environment configuration settings.
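For example, with the Azure cross-platform CLI of the time, each environment could be deployed from the same template by passing a different parameter file; the resource group, deployment name, and file names below are illustrative, and the exact flag names can vary between CLI versions:
azure group deployment create flakio-qa qa-deployment \
  --template-file azuredeploy.json \
  --parameters-file azuredeploy.parameters.qa.json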
For more advanced configuration scenarios, there are also templates to take and customize for
Docker Swarm, Mesosphere or the Azure Container Service used in the examples for this book.
• Docker Swarm: https://github.com/Azure/azure-quickstart-templates/tree/master/docker-
swarm-cluster
• Mesos with Swarm and Marathon: https://github.com/Azure/azure-quickstart-
templates/tree/master/mesos-swarm-marathon
• Azure Container Service using the example from this book:
https://github.com/flakio/infrastructure
All these examples provide parameter files that you can use to change the total number of VMs
included in your cluster (represented as the nodes parameter in the Docker Swarm template, the
agents parameter in the Mesos template, and the agentCount parameter in the Azure Container Service
template).
Next, in the azuredeploy.json file, we will add a tags section to include metadata about the
environment, location, and department. Instead of hard-coding these values, notice that the tag values
are read from the parameters we created previously.
Click here to view code image
{
"apiVersion": "2015-05-01-preview",
"type": "Microsoft.Compute/virtualMachines",
"name": "[variables('vmName')]",
"location": "[parameters('location')]",
"tags": {
"environment": "[parameters('environment)]"
"location": "[parameters(location)]"
"dept": "[parameters(dept)]"
}
...
Doing this enables you to easily find and filter your ARM resources by tag from the Azure portal or
the command line. Instead of looking through a list of hundreds of virtual machines, you can filter the
list to just the finance department’s production VMs.
For Docker, we can use labels to set key/value metadata on the Docker daemon (meaning the host),
on a Docker image definition, or when a Docker container is created. For example,
you can set a label on the Docker daemon running on the host to advertise capabilities such as an SSD
drive, using reverse domain name notation as shown below:
Click here to view code image
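The daemon invocation might look roughly like this (the label key follows the flak.io sample’s naming and is purely illustrative):
docker daemon --label io.flak.storage="ssd"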
Labels on Docker images can be set using the LABEL instruction in the Dockerfile. Remember that
if you add a label in the Dockerfile, it is hard-coded into the image itself. For that reason, it’s
best to only add labels to a Docker image for metadata that will not change based on the runtime
environment. For per-environment labels, use the “--label” switch on the docker run command as
shown:
Click here to view code image
docker run -d \
--label io.flak.environment="dev" \
--label io.flak.dept="finance" \
--label io.flak.location="westus" \
nginx
Once you define the labels for your Docker containers, you can use standard Docker commands to
filter based on specific label values. This example will show only those running containers that are in
the “dev” environment.
Click here to view code image
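A sketch of such a filter, reusing the labels from the docker run example above:
docker ps --filter "label=io.flak.environment=dev"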
Continuous Integration
The continuous integration process starts when a developer checks in code or merges code from a
feature branch into the main branch. This triggers a series of automated tests, most commonly unit
tests, to validate the quality of the code being checked in.
Continuous integration defines a build workflow that includes all the steps required to build and
run tests for your application. A typical build workflow could contain steps like the following:
• Download source code and any package dependencies like Maven for Java, NPM for Node, or
NuGet for .NET applications.
• Build the code using build tools like Ant, Gradle, or MSBuild.
• Run a set of tasks using JavaScript task runners like Grunt or Gulp to optimize images, or
bundle and minify JavaScript and CSS.
• Run unit tests using tools like Junit for Java, Mocha for Node, or xUnit for .NET applications.
• Run static analysis tools like SonarQube to analyze your source and code coverage reports, or
run specialized tools like PageSpeed for web performance.
• If the tests were successful, push the new image into your Docker registry.
Now that we’ve discussed what a CI workflow might look like, let’s discuss some of the testing
and analysis tools mentioned previously.
Unit Testing
A unit test is designed to test code on a functional level. Let’s take an example of a simple Add()
method that takes two numbers and returns the sum. A unit test could run a number of tests to ensure
the method worked (1 + 1 = 2), and didn’t work (0 + “cat” throws an exception) as expected. One of
the main premises of unit testing is code isolation, where a function can be tested independently of
any other moving parts. Unit tests also help with regression testing, ensuring that a code change does not
inadvertently break the expected behavior verified by an existing test.
Testing in a QA Environment
DevOps is fundamentally about ensuring high-quality code. At any point in time your code, which is
being pushed from a developer machine into source control, is tested against a number of criteria
including scalability, interoperability, performance, and others. Because microservices are commonly
derived from a monolithic app that has been broken down into a set of independent services, it is
important to test and validate that the services interoperate with each other and that none is in
a state that can break the larger system.
Integration Testing
When you have several microservices as part of a larger project, it is important to make sure that
these services work together given their interdependencies. There are several techniques that can
be leveraged for integration testing. For example, Big Bang Integration Testing tests all the services
running together at the same time to ensure that the entire system works as expected. Another option is
bottom-up integration testing, where services at lower hierarchy levels are exposed to a set of tests,
and subsequent services that depend on those services are then tested.
Coded UI Testing
Coded UI testing helps validate that end-to-end scenarios using multiple services work together. For
an ecommerce site, this would include searching for a product, adding it to a shopping cart, and
checking out to place an order. Coded UI tests help ensure that the integrated multiservice scenario
doesn’t break.
Selenium (http://docs.seleniumhq.org) is an example of a cross-platform open-source web UI
testing framework that integrates directly with Docker (http://bit.ly/dockerselenium). Selenium tests
can either be written in a programming language like Java, C#, or Ruby, or you can use the Selenium
IDE which records everything you do in your browser, like the ecommerce browse-to-checkout
example. One of the great benefits of Selenium is that it is independent of the language in
which the web app was written, and it can run simultaneously against multiple browsers on
different systems. This significantly reduces the time needed to test the same project
in different configurations.
FIGURE 6.9: Selecting the Azure region to run for a load test
By running the test you can start seeing real-time results as shown in Figure 6.10. You can find
more information on how to configure and customize load tests at http://bit.ly/azureloadtest. Because
these tests are scripted, they can also be integrated directly into your continuous delivery pipeline.
FIGURE 6.10: Viewing the results of a load test
This code will create 100 containers with a lifespan of fifteen seconds each. Docker-stress also
enables you to monitor the status of the stress test using docker-monitor. The docker-monitor
command shown performs a health check over a fixed time interval (500 seconds) and sends any
failure results to the specified email address.
Click here to view code image
Deploying to Staging
A staging environment is the deployment target where the release candidate resides. It is considered
the “ready-to-go” code that needs to go through any final validation processes, like user acceptance
testing. User acceptance testing is where a user, such as a business owner for a process, validates that
the service is working correctly. There can be a number of additional differences to test related to the
environment configuration as well, such as the target databases and services that the application is
connected to. Whereas in the development and QA environments the database and service hooks point
to independent test endpoints, a staging environment can point to your production database.
Manual/Exploratory Testing
This is one of the most primitive test types, but at the same time it can produce first-hand results on
what the end user will go through, and is useful for your development team to quickly validate or
reproduce reported bugs. In the case of microservices, manual testing might involve trying to see
whether a specific REST call can cause unhandled exceptions and impact the overall system on a
larger scale. However, in DevOps manual testing is rarely used, if at all—it is inefficient in a
dynamic environment and prone to missed cases. What can be done manually can often be easily
automated.
Testing in Production
As we mentioned in the beginning of this chapter, the process of deploying an application to
production can be downright frightening. But there are a number of techniques you can use to reduce the
risk and fear of pushing a new version out to production. Even when your application is fully rolled out,
you’re still not done testing, as there are a number of other tests you can do to ensure you have
resilient and reliable services.
Canary Testing
Canary testing is a technique used to deploy a new version of your microservice to a small percentage
of your user base to ensure there aren’t any bugs or bottlenecks in the new service. To do canary
testing, you can use tools like NGINX’s split_clients module to split traffic based on your routing
rules. Another option is Netflix’s Zuul (http://bit.ly/netflixzuul), which is an Edge service that can be
used for canary testing new services based on a set of routing rules. Facebook uses Gatekeeper, a tool
that gives developers the capability to only deploy their changes to a specific user group, region, or
demographic segment. Similarly, at Microsoft, the Bing team has a routing rule where all employees
on the corporate network get “dogfood” experimental versions of the Bing search engine before they are
shipped to customers.
For example, let’s say we have a new version of a microservice to canary test. You can split
traffic as shown in Figure 6.11, with 99% of initial traffic going to the current release (v1.0) and 1%
going to the new release (v1.1). If the new release is performing well (see Chapter 7,
“Monitoring,” for more information on what to measure), then you can increase the amount of traffic
incrementally until 100% of traffic is using the new release. If at any point during the rollout, the new
release is failing, you can easily toggle all traffic to use v1.0 without needing to do a rollback and
redeployment.
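As a rough illustration of the NGINX approach, the split_clients module hashes a request property into weighted buckets, which can then drive the upstream selection (the upstream names, addresses, and percentages are illustrative):
# In the http configuration block: hash the client address into weighted buckets
split_clients "${remote_addr}" $catalog_backend {
    1%    catalog_v1_1;   # canary release
    *     catalog_v1_0;   # current release
}

upstream catalog_v1_0 { server 10.0.0.5:8080; }
upstream catalog_v1_1 { server 10.0.0.6:8080; }

server {
    listen 80;
    location /catalog {
        proxy_pass http://$catalog_backend;
    }
}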
A/B Testing
A/B testing is a way to measure how two versions of a feature stack up against each other in terms of
performance, discoverability, and usability. When setting up an A/B test, the test manager sets up two
groups of users: One is the control group, and the other is the treatment group, which is normally
significantly smaller than the control group. As an example, in Outlook.com, we created an
experiment where we moved the built-in Unsubscribe bar from the bottom of an email message body
to the top, driven by the hypothesis that users simply were not aware of the “Unsubscribe” feature in
the product. We rolled out a new treatment that moved the unsubscribe feature to a more discoverable
location in the UI, and rolled it out to ten percent of worldwide Outlook.com users. Over a fixed
period of time we analyzed the data, which showed higher usage of the Unsubscribe feature, obtained
simply by moving it to a more discoverable location.
On-Premises or Hosted?
The first option to consider is whether you want to use an on-premises continuous delivery pipeline
or use a hosted service. For on-premises, the organization becomes the maintenance and service
provider for the tools and infrastructure, including hardware, networking, upgrades to newer
versions, and support issues. Organizations that typically use on-premises tools do so either because
of an existing investment in on-premises tools, or because of a corporate policy that forbids source
code hosting outside of the company. Beyond compliance, there are benefits to having a local solution
—one of them being the capability to fully control and customize the continuous delivery pipeline as
needed, whether that includes integration into custom tools, authentication mechanisms, or existing
workflows.
Hosted infrastructure is the alternative to on-premises deployment. Infrastructure is hosted on
external servers, usually optimized for scale and geographical region distribution. The host is
responsible for all maintenance and management, usually with guarantees of availability and
performance through a Service Level Agreement (SLA). Updates to the service are managed by the
hosting company, but you give up the full control that comes with hosting the service yourself.
While many companies are using on-premises tools today, Software-as-a-Service (SaaS) solutions
continue to grow in popularity and richness. International Data Corporation (IDC) predicts that by
2018, almost 28% of the worldwide enterprise application market will be SaaS-based
(http://www.idc.com/getdoc.jsp?containerId=252568).
Does the Product or Service Include Tools to Provision and Deploy using Azure and/or Docker?
Microsoft Azure provides a set of services, such as VM provisioning through Azure Resource
Manager templates, extensible storage, and API hooks for virtualized resources, tremendously
simplifying the DevOps process. It is important to consider whether the tool you select already
supports Azure as a first class deployment target to ensure that not only can you deploy, but that you
receive rich diagnostic information when the deployment fails. Similarly, what set of tools or
services exist for Docker? Are common workflows, like building Docker images, using Docker
Compose for deployments, or pushing/pulling images to a Docker registry included? How difficult is
it to integrate orchestration tools like Kubernetes or Mesosphere in your deployments?
Does the Tool Include Ways to Manually or Automatically Promote between Environments?
Depending on your organization’s process, microservice deployments could be fully automated from
check-in to production (continuous deployment), mostly automated where the process is automated up
to production but production deployments are manual (continuous delivery), or fully manual where
promotion of each environment is done and signed off by a QA team. Whatever tool you select should
ideally have the option for all three or be able to switch between manual and automated environment
promotion. One caveat to consider for full automation is the scenario where a QA engineer is still
validating an existing build while another build is being pushed into the same environment. It would
be inconvenient to simply swap the bits, thereby invalidating all the work done so far. That, of
course, is just one scenario, and automated approval is just as important. To move code between
environments, you need to decide what balance of manual versus automated testing makes
sense for your services.
Summary
Adopting DevOps practices like automation, continuous delivery, unit testing, integration testing, and
performance testing will all have a clear benefit in improving the agility and quality of your releases.
While these tools will help make creating a continuous delivery pipeline easier, this process can only
succeed based on the people and the collaboration culture you set for your developer and operations
teams.
7. Monitoring
In the last couple of chapters, we have learned how to design and develop a containerized
microservice-based application. If we recall some of the core concepts about microservices, we
know that the services should communicate with each other through APIs so that we have a loosely
coupled and flexible architecture. An environment like this introduces a set of challenges when it
comes to monitoring and operational management. Questions like the following arise: how does the
system know that it needs to start another instance of a service because one can no longer handle the load,
or how does the system know that it needs to spin up another host VM because the ones in use are
running into resource constraints? To answer all those questions we need to have effective monitoring
in place. In fact, monitoring is one of the most important aspects of microservices architectures. In this
chapter, we will have a closer look at what monitoring means for each component, what the
challenges and best practices for each component are, and we will look at some of the Microsoft
monitoring solutions available.
To illustrate that monitoring all components of a typical microservice-based application can be
quite a challenge, it is worth looking again at a conceptual view of an environment hosting such an
application. Figure 7.1 shows such an environment, which we already know from Chapter 5.
Monitoring Containers
In a typical containerized environment, one host machine can run many containers. From a macro
monitoring perspective, we first want to have an overview of the container status—for example, how
many healthy containers vs. broken containers are on that host VM, or how many web containers vs.
database containers are running on the host VM? This data is particularly important if we have more
than one VM in a cluster, which is the case in almost any real world scenario. It gives us insights into
how and what types of containers and services are distributed across a cluster, so that we can correct
unwanted behaviors. For example, we can learn over time that our containers hosting databases use
up way more resources than the ones that only serve as gateways. We could then further use the data
to tell our scheduler (see more information on orchestration and scheduling in Chapter 5) to only
place the database containers on bigger VMs that offer more RAM and CPU, and the gateway
containers on smaller VMs.
Therefore, when it comes down to what we should monitor at an individual container level, the
classical runtime metrics including CPU, Memory, Network, and Disk are still the important
indicators. For example, we can monitor the memory usage trend to detect the potential memory leak
caused by a service in the container. Monitoring those metrics on containers is very important as we
can combine them with other metrics, such as the ones coming from the host VM. There are situations
when only the combined real-time runtime metrics enable us to make the right decisions.
Let’s think about the following scenario. We detect that one of our containers has a very high CPU
usage. As we have learned, we can spin up a container quickly, so we might think just to spin up
another one to add more CPU capacity. However, if the current host environment is running low on
CPU, we are unable to put a new container on it. In this case, we would need to add a new host VM
first, and then put the container on it.
So how and where does Docker emit the data needed for monitoring? The answer is that Docker
relies on two Linux kernel mechanisms, control groups and namespaces (we have discussed control
groups and namespaces in Chapter 2), to create the isolated container environment.
Those two features also provide the basic container runtime metrics.
• Control groups expose metrics about CPU, Memory, and Disk usage through a pseudo-
filesystem. In most of the latest Linux distributions using Linux kernel 3.x or later, such as
Ubuntu 14.04, CentOS 7, Red Hat Enterprise Linux 7, and so on, they are mounted under
“/sys/fs/cgroup/”, with each control group having its own sub-directory. For example,
memory metrics can be found in the “memory” control group sub-directory. On some older
systems, it might be mounted on /cgroup and the file hierarchies are also different.
• Namespaces expose network metrics. We can utilize the setns system call to switch the current
monitoring agent process into the container’s network namespace and then read the metrics
data from “/proc/net/dev”, as shown in the sketch after this list.
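A brief sketch of reading both sources directly on a host (the container name is illustrative, and the cgroup path assumes a default Ubuntu 14.04-style layout; nsenter wraps the setns system call):
# Resolve the full container ID and its main process ID
ID=$(docker inspect --format '{{.Id}}' catalog)
PID=$(docker inspect --format '{{.State.Pid}}' catalog)

# Memory usage from the memory control group's pseudo-filesystem
cat /sys/fs/cgroup/memory/docker/$ID/memory.usage_in_bytes

# Network counters read from inside the container's network namespace
sudo nsenter --target $PID --net cat /proc/net/dev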
Docker Runmetrics
The Docker web site http://docs.docker.com/articles/runmetrics/ provides more
information on Docker runtime metrics.
Now that we know where to find the data, we need to have an easy way to read that information.
There are actually two choices.
• Read the data directly from the control groups and namespaces
• Use the Docker Remote API
The Docker Remote API is a set of RESTful APIs using JSON and GET/POST methods. Since
Docker Remote API v1.17, it can expose key performance metrics for containers running on the
host. It provides a programmable way for external monitoring agents to query Docker information such as
container metadata and lifecycle events. The information returned by the APIs provides a
comprehensive overview of the containers and their host VM. Below is a list of calls relevant to
monitoring.
• GET /info: provides system-wide information; for example, total memory of the host, total
number of containers on the host, total number of images, and so on.
• GET /version: provides the Docker version.
• GET /events: provides container lifecycle and runtime events with timestamps. Table 7.1
provides an overview of all events and their respective Docker commands. Monitoring
container events is crucial for the overall monitoring strategy in automated environments, as it
provides insights into the lifecycle of the containers.
TABLE 7.1: Overview of container events
As an example, the single API GET /containers/(container_id)/stats can return all CPU, Memory,
Disk, and Network usage in the unified JSON format.
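For example, the stats endpoint can be queried over the local Unix socket; this sketch assumes a curl build with --unix-socket support and a newer API version where stream=false returns a single snapshot (the container ID is a placeholder):
curl --unix-socket /var/run/docker.sock \
  "http://localhost/containers/<container_id>/stats?stream=false"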
Note
The Docker Remote API website
https://docs.docker.com/reference/api/docker_remote_api/ offers a complete list of all
the different Remote API versions and methods.
Now that we know what we need to monitor from a container perspective, where to find the data,
and how to access the data, we need to find a way to collect it. The good thing is that, although we
could build our own agent to collect the data, we do not really need to, as there are already many
monitoring solutions available.
An important question, however, is where to run the monitoring agent. Most monitoring solutions
offer either a monitoring agent that runs on the host VM, or a container that contains the monitoring
agent. While the preferred way is to containerize the agent as well, the answer is really that it
depends on the scenario and host VM.
If the VM already hosts other applications, we can extend the monitoring agent running on each host
to support Docker as well. It is very doable, irrespective of whether you choose the native Linux
solution (control groups and namespaces) or the Docker Remote API. In fact, many existing server
monitoring solutions have enabled Docker monitoring in their host-based agents. The Azure Diagnostics
agent, for example, runs on the host VM, collects all the data from the directories on the host VM, and
transfers it to a different location, such as Azure storage.
If our host VMs are only hosting containerized applications, we need a consolidated solution to
deploy and manage all the applications, including the monitoring agent. The preferred way is to
containerize the agent as well. With the agent and all its dependencies packaged into a single image,
we can deploy and run the monitoring agent on any Host/OS and integrate with other container
orchestration tools.
Monitoring Agents
Some specialized Docker environments, such as CoreOS, do not even permit third-party
packages to be installed on the host VM, so using a “monitoring” container is the only option.
Monitoring Services
The last piece in the puzzle is monitoring the services themselves. Having good monitoring in place is
important to keeping any application healthy and available. In a microservices architecture good
monitoring is even more important. We not only need to be able to monitor what is going on inside a
service, but also all the interactions between the services and the operations that span them. When an
anomaly occurs, this inter-service information is needed to understand causality and find the
root cause. It’s important to embrace the following design principles to achieve this:
• Log aggregation and analytics
• Use activity or correlation IDs
• Consider an agent as an operations adapter
• Use a common log format
In addition to these points, common standard monitoring tools and techniques should be utilized
where appropriate, such as endpoint monitoring and synthetic user monitoring.
Log Aggregation
A request into the system will often span multiple services, and it is important that we are able to
easily view metrics and events for a request across all the systems. There is always the question of
how much to log to avoid over- or under-logging. A good starting point is to log at least the following:
• Requestor name/ID: If a user initiates the request, it should be the user name. If a service
initiates the request, it should be the service name.
• Correlation ID: For more information, see the paragraph on Correlation ID.
• Service flow: Log entry and exit points of a service for a given request.
• Metrics: Log runtime performance of the service and its methods.
With all the data, can you imagine finding something in the logs of one service, and then having to
go to another system and try to find the related logs? For even just a handful of services, this would
be painful. This is not something we should spend our time doing, so it should be easy to query and
view logs across all the systems.
There are tools that collect and aggregate logs across all the VMs and transfer them to a centralized
store. For example, Logstash, an OSS tool by Elastic.co, or the Microsoft Azure diagnostics agent
(which we will discuss later in the chapter) can be used to collect logs from all the nodes running our
services, and put them in a centralized store. There are also many good tools that help us visualize
and analyze the data. One of the more popular end-to-end solutions is the ELK stack. It uses Elastic
Search as the data store, Logstash to transfer the logs, and Kibana to view the logs.
Correlation ID
In addition to collecting the logs, we need to be able to correlate logs; basically we need to be able to
find associated logs. When a request lands on the application, we can generate an activity or
correlation ID that represents that unique request on the application. This ID is then passed to all
downstream service calls, and each service includes this ID in its logs. This makes it easy to find
logs across all the services and systems used to process the request. If an operation fails, we can
trace it back through the systems and services to help identify the source. In Figure 7.2, we can see
how a correlation ID can be used to build a waterfall chart of requests to visualize the end-to-end
processing time of an operation. This can be used to optimize transactions or identify bottlenecks in
the transactions.
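As a simple illustration, an edge service can generate an ID and pass it downstream in a header; the header name and service URL here are just conventions made up for this example, not a standard:
# Generate a unique correlation ID for the incoming request
CORRELATION_ID=$(uuidgen)

# Pass it to a downstream service; that service forwards the same header on its
# own outbound calls and includes the value in every log entry it writes
curl -H "X-Correlation-Id: $CORRELATION_ID" http://order-service/orders/1234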
Operational Consistency
Freedom to use the technology of choice for each service has its benefits, but can present some
operational challenges as well. The operations tools and teams now need to deal with a growing
number of stacks and data stores, which can make it difficult to provide a common view of system
health and monitoring. Every technology tends to deal with configuration, logging, telemetry, and
other operational data a bit differently. Consider providing some consistency and standard
operational interfaces for things like service registration, logging, and configuration management.
Netflix, for example, uses a project called Prana and a sidecar pattern, also sometimes referred to
as a sidekick pattern, to ensure that type of consistency. The Prana service is an operations agent that
is deployed to each virtual machine. The operations agent can manage things like configuration in a
consistent manner across all the various services. Then the teams implementing the services can
integrate with the agent through an adapter and still use whatever technology they want.
Note
The sidecar pattern refers to an application that is deployed alongside a microservice.
Generally, the sidecar application is attached to its microservice just as a sidecar would
be to its motorcycle, thus the name “sidecar pattern.” For more information on sidecar
pattern and Netflix Prana visit http://techblog.netflix.com/2014/11/prana-sidecar-for-
your-netflix-paas.html.
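Common Log Format
For example, consider two services that log the same request using different conventions (the keys and values here are purely illustrative). The first service might emit:
{"level":"info","cid":"1","eventdate":"2016-02-08T17:31:02Z","msg":"my message"}
while the second emits: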
{type:"info",thread:"2345",activityId:"1",message:"my message"}
There are a few things wrong with these logs. The differing format of the logs is one problem, and the
keys vary across the logs. When we query the logs for cid=1 we are going to miss the logs from
the second microservice because, although it logs the correlation ID, it calls it something else.
The same is true for the event timestamp. If one service logs it as “timestamp”, another “eventdate”,
and yet another “@timestamp”, it can become difficult to correlate these events, and time is a
common property to correlate events on. Thus, for some of those critical events, we need to make
sure that every team is using the same key name, or at least we must consider processing events with
something like Logstash.
Note
Logstash is a very popular data pipeline for processing logs and other event data from
various systems such as Windows or Linux. Logstash offers many plug-ins for other
systems, like Elasticsearch. This makes the log data easily searchable and consumable,
making Logstash a great data pipeline for many scenarios.
In addition to the key, the format and meaning of the event message need to be consistent for
analysis. For example, timestamp can refer to the time the event was written, or when the event was
raised. As timestamp is commonly used to correlate events and analysis, not having consistency could
skew things quite a bit.
Further, we need to determine what events should be consistent across the entire organization, and
use them across all the services in the company. We should think of this as a schema for our logs that
has multiple parts that are defined at different scopes, to facilitate analysis of log events across the
various organizational, application, and microservices boundaries.
For example, we might include something like the following in all our log events.
• Timestamp with a key of ‘timestamp’, the value in ISO 8601 format, and as close as possible to the
time the event happened
• Correlation Identifier with a key of ‘activityId’ and a unique string
• Activity start and end times, like ‘activity.start’ and ‘activity.end’
• Severity/Level (warning, error) with a key of ‘level’
• Event Identifier with a key of ‘eventid’
• Process or services identifier enabling us to track the event across services
• Service name to identify the service that logged that event
• Host Identifier with a key of ‘nodeId’ and a unique string value of the machine name
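Put together, a single log event following such a schema might look roughly like this (all values are illustrative):
{
  "timestamp": "2016-02-08T17:31:02.123Z",
  "activityId": "a0f6dce2-3f7b-4f9e-9a7d-2f5a1c3b9e21",
  "activity.start": "2016-02-08T17:31:01.998Z",
  "activity.end": "2016-02-08T17:31:02.120Z",
  "level": "warning",
  "eventid": "catalog.search.slow",
  "service": "catalog",
  "nodeId": "swarm-agent-03",
  "message": "Catalog search took longer than 100 ms"
}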
As services will be developed by multiple teams across an organization, having these agreed upon,
documented, shared, and evolved together is important.
A good example of the importance of enforcing a common log format is Azure itself. Some Azure
services depend on each other. For example, Azure Virtual Machines rely on Azure storage. If there
is an issue with Azure storage, it can affect Azure Virtual Machines. It is important for Azure Virtual
Machine engineers to be able to trace the issue back to its root cause. Azure storage components need to
log the data to the overall Azure diagnostics system following a common format and correlation rules.
The Azure Virtual Machine engineer can now easily trace back the issue to Azure storage by just
searching for the correlation ID.
Note
Find more information on supported logging drivers at
http://docs.docker.com/reference/logging/overview.
Monitoring Solutions
By now, we have gained enough knowledge to know how to build our own logging system. However,
building custom monitoring solutions is not an easy task, and it only makes sense if there is a
requirement that cannot be met by an existing monitoring solution. In fact, most monitoring solutions
can be customized in a way to meet almost any requirement. In this section, we will have a closer
look at the monitoring solutions offered by Microsoft Azure.
Azure Diagnostics
Azure offers a free basic monitoring and diagnostics framework, called Azure Diagnostics, that collects
data from all kinds of components and services on a virtual machine. Azure Diagnostics is really
more like a log data collector than a complete solution, as it does not offer any user interface to view
the data. Some monitoring solutions and tools use the data collected from Azure Diagnostics to
provide log analysis and alerting and other features.
To get started with Azure Diagnostics, we only need to install the Azure Diagnostics extension on a
VM. This can be dynamically applied to a VM at any given time. We can enable the diagnostics
extension through
• CLI tools
• PowerShell
• Azure Resource Manager
• Visual Studio 2013 with Azure SDK 2.5 and higher
• Azure Portal. The diagnostics extension is turned on by default if a VM is created through the
portal.
Note
The diagnostics extension for Windows contains the Windows diagnostics agent and the
diagnostics extension for Linux contains the Linux diagnostics agent. The Linux
diagnostics agent is open source and available on GitHub.
https://github.com/Azure/WALinuxAgent.
When we enable the extension through any of the methods mentioned previously, the Linux
diagnostics agent is installed on the host VM. As part of the installation, the extension also installs
rsyslog, which is important for data logging. Figure 7.3 illustrates the basic components and data
flow.
FIGURE 7.3: Linux and Windows diagnostics agent architecture
To enable the Linux diagnostics agent, we need to pass in a configuration that contains a collection
plan. The collection plan defines what metrics need to be collected and where to store the data. The
data is usually stored in an Azure storage account from which it can be consumed by any client.
As we focus on Docker and Linux in this book, we will not discuss monitoring on Windows.
Please see https://msdn.microsoft.com/en-us/library/azure/dn782207.aspx for Windows configuration
details.
The following example shows how to collect a core set of basic system data (CPU, Disk, and
Memory) and all rsyslog information on a Linux virtual machine.
• Create a file named PrivateConfig.json with the following content:
Click here to view code image
{
"storageAccountName":"the name of the Azure storage account
where the data is being persisted",
"storageAccountKey":"the key of the account"
}
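The extension also expects a public configuration file containing the collection plan (not shown here). A sketch of applying it with the classic cross-platform Azure CLI follows; the resource group, VM name, extension version, and exact flag names are assumptions that may differ in your environment:
azure vm extension set my-resource-group my-vm \
  LinuxDiagnostic Microsoft.OSTCExtensions '2.*' \
  --private-config-path PrivateConfig.json \
  --public-config-path PublicConfig.json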
After we have applied the extension, it takes about 5 minutes for the Linux diagnostics agent to be
operational. This is very different from a monitoring container, as a container would spin up in
a matter of seconds and start transferring data.
The Linux diagnostics agent creates the following tables in the storage account specified in the
PrivateConfig.json:
Figure 7.7 shows log entries from a microservice inside a container, as well as from the container
itself. The value of the facility field is “daemon” as it refers to the Docker daemon that is logging the
message. The Msg field contains the log entry itself, starting with the container ID that looks like
“docker/e15bca350018.”
At the time of writing, the Azure Linux diagnostics agent is certainly a very basic way to collect
microservices and container log data. One of the reasons for this is that it stores all log data in one
table and does not offer a column to filter based on container ID or name. Another drawback is that
Azure table storage might not be the best solution for high-throughput logging requirements due to its
indexing limitations. That said, Azure Diagnostics is certainly a good starting point for testing logging
strategies and potentially small microservices-based applications. The following Azure website
offers more details on the Linux Diagnostics extension and its configuration options:
https://azure.microsoft.com/en-us/documentation/articles/virtual-machines-linux-diagnostic-
extension/.
Application Insights
Application Insights is a Microsoft Service that collects telemetry information for mobile, browser-
based, or server applications. Application Insights collects the data and moves it into the cloud to
process and store. The telemetry and diagnostics data can be viewed, as well as sliced and diced,
through an integration with the Azure Portal. The following link provides more information on general
Application Insights functionality: https://azure.microsoft.com/en-us/services/application-insights/.
Recently, Application Insights added full support for Docker containers by adding an Application
Insights image to Docker Hub. Application Insights follows the approach of a “monitoring container,”
as we mentioned in the container monitoring section. As a result, we can run a single instance of
an Application Insights container on our Docker host VM. The service in the container talks to the
Docker agent and sends the telemetry data back to Application Insights.
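Running the monitoring container is roughly a one-liner; the image name and argument below follow the Application Insights Docker documentation of the time and may have changed since, and the instrumentation key is a placeholder:
docker run -d --name appinsights \
  -v /var/run/docker.sock:/docker.sock \
  microsoft/applicationinsights ikey=<your-instrumentation-key>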
Application Insights supports two models for Docker.
• Capture telemetry for apps not instrumented with Application Insights. This means there is no
instrumentation code in the microservice, resulting in only the following Docker related data
being captured:
Performance counters with Docker context (Docker host, image, and container)
Container events
Container error information
This data by itself is already very helpful, as we have learned before. We can get even more out of
Application Insights if we instrument the microservice code.
• Capture telemetry for instrumented apps.
Gets all the data mentioned before.
Adds the Docker context (Docker host, image, and container) to the captured telemetry data.
Note
See https://azure.microsoft.com/en-us/documentation/articles/app-insights-get-started/
for more information on how to instrument applications with Application Insights.
As mentioned previously, Application Insights offers a great UI for looking at and analyzing log
data. The experience enables us to drill down into the Docker aspects of our microservices
application. Application Insights offers an overview Docker blade with information for:
• Container activity across Docker hosts and images
CPU
Memory
Network in
Network out
Block I/O
• Activity by Docker host
• Activity by Docker image
• Active images
• Active containers
In Figure 7.8, we can see the Docker overview blade for our book companion application Flak.io,
showing that we currently have four Docker host machines in this cluster, the activity of the four
Docker images, and the number of active containers.
FIGURE 7.8: Application Insights Docker overview blade
The Docker overview blade is a good starting point for drilling deeper into each component
(individual Docker host, images, and containers). Figure 7.9, for example, shows the Docker by host
blade.
FIGURE 7.9: Docker by host blade in Application Insights
All these features make Application Insights a very powerful and developer-focused monitoring
solution for containerized microservices applications on Azure.
Summary
In this chapter, we have covered why it is important to monitor the entire microservices environment,
starting from host VMs in a cluster all the way to the services running in containers. We have seen
best practices and guidance for VMs, containers, and services. In addition, we looked at some
monitoring and diagnostics solutions offered by Microsoft. The biggest takeaway should be that
monitoring is a key part of microservices-based architectures, and that we need to include it very
early in our planning.
8. Azure Service Fabric
Over the last couple of chapters, we have seen that building highly available, resilient and performant
microservices requires more than just containers. We need additional system services for cluster
management, services for container or service orchestration, service registration, service discovery,
and service monitoring. Mesosphere DCOS or Mesos proper, Zookeeper, Marathon, Docker Swarm,
and Kubernetes are currently the most popular services to help address those areas and are frequently
used together. Separately, Microsoft has built Azure Service Fabric as a platform for building
exactly these types of highly available, resilient, and performant microservices. It provides
cluster management, service orchestration, service registration, service discovery, service
monitoring, and more in a single cohesive platform, thereby eliminating the need to manage multiple
systems. Many Microsoft services such as Azure DB, Cortana and Intune, to name just a few, have
been using Azure Service Fabric as their platform for the last five years. Azure Service Fabric is
currently available as a public preview and will be made generally available in the first half of 2016.
Service Fabric plays an important part in Microsoft’s Azure strategy, and thus we need to
understand what Service Fabric is and how we can use it. This chapter provides a high-level
overview of Service Fabric with the goal of making us aware of its features, and how we can think
about Service Fabric concepts in the context of what we have learned so far.
Linux Support
Per Microsoft’s current plans, Linux support will be available in 2016. The Service
Fabric functionality on Linux is expected to be nearly identical to that on Windows.
Service Fabric includes system services that take care of everything that is important for resilient
and scalable microservices environments. Cluster management, resource scheduling, service registry
and discovery, failover, deployment, no-downtime rolling upgrades, and safe rollbacks are just some
of the platform’s features. The key here is that those services are part of Service Fabric itself and do
not need to be installed or configured separately—the integration has been done for us.
Cluster Management
In Service Fabric as well as in a containerized microservices architecture, there is a notion of a
cluster, which consists of a number of nodes. In the context of this chapter, a node relates to a
physical or virtual machine instance running Service Fabric.
In Service Fabric, a cluster is defined through the cluster manifest described in an Azure Resource
Manager (ARM) JSON document. The cluster manifest specifies infrastructure metadata such as the number
of nodes, node names and types, network topology, cluster security, and so on. One big difference
compared to a Docker Swarm or Mesos cluster is that Service Fabric does not require us to set up a
separate service for managing cluster availability with tools like Zookeeper or Consul. These are
normally needed for leader election among several machines, which then manage the cluster. Service
Fabric is a single, integrated stack rather than a coordinated set of open-source projects that are used
together. As a result, we only need to specify the nodes needed to run our microservices in the ARM
template, and the Service Fabric cluster bootstraps itself from there. To ensure highly available
clustering, Service Fabric uses a quorum model and requires at least five nodes for a minimal cluster
when setting up a cluster in Azure.
What is Quorum?
Quorum is a term used in high-availability clustering and means that there needs to be a
minimum number of nodes online to be able to communicate with each other. Of those, in
most cases a majority need to agree on a piece of information for it to be considered “the
truth.” In the case of Service Fabric, quorum has to be established among the replicas
of a stateful service.
At the time of writing, Azure Service Fabric was still in preview mode and was using Azure
virtual machines for its nodes.
To make sure that not all virtual machines can go down at the same time—for example, for planned
maintenance—Service Fabric uses Azure availability sets. Every virtual machine in an
availability set is put into an upgrade domain and is spread across fault domains by the underlying
Azure platform.
Figure 8.2 shows a simplified model of how Service Fabric sets up a five-node cluster across
upgrade and fault domains.
FIGURE 8.2: Five-node cluster across upgrade and fault domains
For clusters with more than five nodes, Azure continues to place the nodes into unique fault domain and
upgrade domain combinations, using more than just three fault domains. Azure follows this pattern
for any number of nodes. Figure 8.3 shows an example distribution of VMs across fault and upgrade
domains in a seven-node cluster.
FIGURE 8.3: Seven-node cluster across upgrade and fault domains
Now we know what a Service Fabric cluster is and how it works at a high level; we just need to
know how to set one up. Service Fabric clusters in Azure can be set up like any other Azure Resource
Manager-based environment. We can use the Azure portal, CLI, or PowerShell to provision a Service
Fabric cluster. Figure 8.4 shows how to create a Service Fabric cluster through the Azure portal.
FIGURE 8.4: Create a cluster through the Azure portal
Resource Scheduling
As we know from our experience with Mesos, we can use schedulers such as Marathon, Kubernetes,
or Docker Swarm to orchestrate containers that contain services and place them on the nodes in the
cluster. Each scheduler has its pros and cons: for example, at the time of writing Swarm does not
automatically place containers on a different node if you scale down your cluster. As most schedulers
or orchestrators are open-source, we can expect those functionalities to improve over time. Placing
containers or services on machines is also often referred to as resource scheduling.
Application Manifest
An application is defined through its manifest file. At the time of writing, the application manifest is a
definition file authored by the application developer, which describes the physical layout, or
package, of a Service Fabric application. The application manifest references the service manifests
of the constituent services from which the application is composed. From a conceptual point of view,
the application manifest can be compared to the docker-compose.yml file, which is used to define and
run multicontainer applications with Docker. In addition to referencing the services in an application,
developers can use the application manifest to configure the “run-as” policy as well as the “security
access” (that is, read/write access to resources) for each imported service when the application is
instantiated. This is typically very useful for controlling and restricting the access of services to
different resources to minimize any potential security risk.
Service Manifest
The services referenced in the application manifest are defined through a service manifest authored
by the service developer. The service manifest specifies independently upgradable code,
configuration, and data packages that together implement a specified set of services that can then be
run in the cluster once it is deployed.
The service manifest is used to define what services or containers need to be executed, what data
belongs to a service, what endpoints to expose, and so on, which is very similar to a Dockerfile
where developers define OS, frameworks, code, and endpoints to build a Docker image which is
used for a container. In the service manifest, developers can configure additional things like service
types (stateful or stateless), load metrics, and placement constraints.
Note
With placement constraints, developers can make sure that services or containers are
only placed on a certain node type. Node types can be configured in the ARM templates
and contain the name of the node type and its hardware configurations. A good example
is that developers can make sure that their web services only run on nodes that are tagged
with the node type “webservers.” The nodes of type “webserver” could be based on
virtual machines with a hardware configuration optimized for web server workloads.
See https://azure.microsoft.com/en-us/documentation/articles/service-fabric-application-model/
for more information on application and service configuration.
Once the application is running, Service Fabric automatically starts resource-balancing the
services across the underlying shared pool of cluster nodes to achieve the optimal load distribution,
based on a set of load metrics using the services available in the Hosting subsystem. In case of a node
failure, Service Fabric moves the services that were running on that node onto the remaining nodes.
The placement happens based on available resources and service metrics. Another great capability
of Service Fabric is that it rebalances the services across the cluster in the case of a cluster
scale-out. We will learn more details about that later in the chapter when we look at stateful services.
Custom Applications (Existing Applications)
A custom application is an application that does not integrate with Service Fabric, meaning it does
not reference any Service Fabric binaries. Examples of custom applications are Node.js applications,
Java applications, MongoDB, and so on. Benefits of running custom applications in Service Fabric
are high availability of the application, health reporting, rolling upgrade support, and higher density.
• High availability: Applications that run in Service Fabric are highly available out of the box.
Service Fabric makes sure that one instance of an application is always up and running.
• Health monitoring: Out-of-the-box Service Fabric health monitoring detects if the application
is up and running and provides diagnostics information in the case of a failure.
• Application lifecycle management: Besides no downtime upgrades, Service Fabric also
enables rolling back to the previous version if there is an issue during upgrade.
• Density: You can run multiple applications in a cluster, which eliminates the need for each
application to run on its own hardware.
Custom applications are packaged and deployed in the same way as Service Fabric applications.
The service manifest contains additional metadata that tells Service Fabric where it can find the
binaries of the application and how to execute it. Once the package is copied into the image store,
Service Fabric pulls down the application package to the nodes and executes the custom applications.
This article provides more information for deploying custom applications:
https://azure.microsoft.com/en-us/documentation/articles/service-fabric-deploy-existing-app/. One
can easily see that instead of starting custom applications inside processes and Windows Job Objects,
as it does today, Service Fabric could spin up Docker or Windows containers.
Container Integration
Container support will be available in preview around the first half of 2016; Service Fabric will
support Windows, Hyper-V, and Docker containers. There will be two levels of integration:
• Bring your own container: In this scenario, Service Fabric will handle containers the same
way as custom applications. The service manifest defines all the settings, such as image, ports,
volumes, and others that are needed to launch a container. In this scenario, Service Fabric
becomes a first-class orchestrator for containers.
• Develop and package Service Fabric services inside containers. In this scenario, developers
will be able to create new stateful and stateless services and package them inside a container.
Service Discovery
As we have seen in previous chapters, there needs to be some sort of service registry and discovery
service that enables services to make themselves known to the system, lets them find the other
services they need to talk to, and exposes them to the outside world. Tools and services like Eureka,
Zookeeper, or Consul, coupled with things like HAProxy, are popular in the Docker microservices
world. Service Fabric includes service registration and discovery through its Naming service. As
mentioned before, the Naming service provides name resolution for services running somewhere in the
cluster. Service instances and replicas register whatever address they want with the Naming service,
and name resolution takes the stable service name and transparently maps it to that address, so that
the service is always addressable no matter which node in the cluster it runs on.
Programming Model
Service Fabric does not stop at the cluster and container/service management level; it also comes
with its own programming models. Before we look at the programming models, we should understand
that Service Fabric supports both stateless and stateful services.
Stateless Services
A service is truly stateless when its data is kept only in memory; a calculator is a good example of a
stateless application. Stateless can also mean that the data is persisted in an external storage
solution, such as SQL DB or Azure Tables. This is the standard
model for distributed systems today, with tiered architectures being the most popular. To be able to
enable our service to scale out by adding more instances, we avoid storing any state on our service
itself. To make stateless services resilient, we need to implement lots of patterns when accessing
state in the backend, starting with proper queuing patterns to pass messages between the tiers, through
caching patterns to improve performance, and ending with sophisticated retry patterns to make sure
we can retrieve the data from external systems.
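As a rough illustration of the last of those patterns, here is a minimal retry-with-backoff sketch for calls to an external state store; the attempt count and delays are arbitrary, and in practice we would typically use a library such as Polly or the retry policies built into many Azure client SDKs.
using System;
using System.Threading.Tasks;

public static class Retry
{
    // Retries an operation a few times with a linearly growing delay.
    public static async Task<T> ExecuteAsync<T>(Func<Task<T>> operation, int maxAttempts = 3)
    {
        for (var attempt = 1; ; attempt++)
        {
            try
            {
                return await operation();
            }
            catch (Exception) when (attempt < maxAttempts)
            {
                // In real code, only retry on transient failures.
                await Task.Delay(TimeSpan.FromMilliseconds(200 * attempt));
            }
        }
    }
}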
Stateless services in Service Fabric are pretty much the same as any other type of stateless service,
such as Azure Cloud Services or web apps. We can make those services highly available
and scalable by starting more instances of the service. A typical scenario for a stateless service is a
gateway service, which accepts the incoming traffic and routes the requests to the stateful services.
Another example is a simple web front end that provides a UI for any type of user interaction.
Stateful Services
One important pattern we need to implement for high-performing services is the colocation pattern.
Colocation means that the data needs to be as close as possible to the service to avoid latency when
accessing the data. In addition, the data is usually partitioned to avoid too much traffic on a single
data store, which then becomes a bottleneck.
This is exactly where we can use stateful services in Service Fabric. Think of stateful services as
those where the data is partitioned and colocated directly within the service itself. Service Fabric
ensures reliability of data through replication and local persistence.
There are three important concepts when it comes to stateful services replication.
• Partition: A partition is a logical construct and can be seen as a scale unit that is highly
reliable through replicas, which are spread across the cluster.
• Replica: A replica is an instance of the code of the service that has a copy of the data. Read
and Write operations are performed at one replica (called the Primary). Changes to data due to
write operations are replicated to multiple other replicas (called the Active Secondary
replicas). This combination of Primary and Active Secondary replicas is the replica set of the
service. Service Fabric places each replica in a set on a different node across fault and upgrade
domains in the cluster to improve scalability and availability. It also redistributes them in the
case of a cluster scale-out or scale-in to ensure an optimal distribution.
• Replication: Replication is the process of applying state changes to the primary and secondary
replicas. A replica is an object that encapsulates the state of a failover unit; in other words, it
contains a copy of the service's code and state. All replicas that back a stateful service form its
replica set, which, as noted above, Service Fabric spreads across fault and upgrade domains in the
cluster to improve scalability and availability.
Figure 8.6 shows a cluster with two partitions and their replicas spread across the nodes.
Reliable Actors
Reliable Actors API: Actors were introduced in the 1970s to simplify programming of concurrent
computation in parallel and distributed computing systems. From a logical point of view, actors are
isolated, single-threaded, and independent units of compute and state.
Concurrency Problems
In computer science, the dining philosophers’ problem is a great example of the
concurrency and synchronization issues that can be solved using actors. Wikipedia has a
good article on this subject at
https://en.wikipedia.org/wiki/Dining_philosophers_problem, which is worth reading if
you are not familiar with those types of problems.
With the advent of cloud computing, there are more and more use cases where actor-based
applications offer a great advantage. Almost every application with multiple independent units of
state, such as online games or IoT (Internet of Things) scenarios, is a great use case for actors. In
online games, an actor most often represents a player; in IoT, an actor represents a device.
Service Fabric implements reliable actors as virtual actors, which means that their lifetime is not
tied to their objects in memory. The advantage of virtual actors is that we do not need to explicitly
create or destroy them. The Service Fabric actor runtime automatically activates the actors when
needed and garbage collects them if they have not been used for a while. Once garbage collected, the
actors are not really “gone,” as the runtime still maintains knowledge about their existence. The
runtime also distributes the actors across the nodes in the cluster, and with that achieves high
availability in the same way as we have seen earlier in this chapter. So how does that help with
concurrency? Service Fabric allows only one active thread inside an instance of the actor code at
any time. The runtime requires an actor method, for example one invoked in response to a request
from another actor or a client, to run to completion before the next one starts, even though the
communication is asynchronous. This complete execution is called a turn, and thus we speak of
turn-based concurrency. Service Fabric makes it easy to use actors because the runtime does all the
heavy lifting, such as synchronization and making the actor's state reliable. As developers, we just
need to define the actor interfaces, implement them, register the actor implementation, and connect
using an actor proxy.
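As a rough sketch of what this looks like in code (using the shape the Reliable Actors API took after the preview; the PlayerActor type, its score state, and the fabric:/ URI are made up, and the actor registration in Program.cs is omitted):
using System;
using System.Threading.Tasks;
using Microsoft.ServiceFabric.Actors;
using Microsoft.ServiceFabric.Actors.Client;
using Microsoft.ServiceFabric.Actors.Runtime;

// The actor interface is what clients program against.
public interface IPlayerActor : IActor
{
    Task AddScoreAsync(int points);
    Task<int> GetScoreAsync();
}

// The implementation; state saved through the StateManager is made
// reliable (persisted and replicated) by the actor runtime.
public class PlayerActor : Actor, IPlayerActor
{
    public PlayerActor(ActorService actorService, ActorId actorId)
        : base(actorService, actorId) { }

    public Task AddScoreAsync(int points) =>
        this.StateManager.AddOrUpdateStateAsync("score", points, (key, current) => current + points);

    public Task<int> GetScoreAsync() =>
        this.StateManager.GetOrAddStateAsync("score", 0);
}

// A client connects through an actor proxy; the runtime resolves the
// actor's current location in the cluster for us.
public static class PlayerClient
{
    public static async Task<int> AddAndReadScoreAsync(string playerId)
    {
        IPlayerActor player = ActorProxy.Create<IPlayerActor>(
            new ActorId(playerId), new Uri("fabric:/Flakio/PlayerActorService"));
        await player.AddScoreAsync(10);
        return await player.GetScoreAsync();
    }
}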
Reliable Services
Reliable Services API: We have learned that developing reliable services in Service Fabric is
different from developing traditional cloud services, as Service Fabric services are highly available
and scalable “out of the box” without writing any additional code. How to develop stateless services
based on the reliable services APIs is pretty straightforward and is outside the scope of this chapter.
However, when developing stateful services, Service Fabric offers a programming model that is
unique: the Reliable Collections API. Reliable collections enable us to develop highly available
microservices. The local and replicated state is managed by the reliable collections themselves. This
is a big deal, as developers can simply program against reliable collections as if they were ordinary
collection objects targeting a single machine.
The biggest difference from other high-availability technologies, such as Redis cache or Service Bus
queues, is that the state is kept locally in the service instance, which eliminates the latency of
reads across a network. As we have learned before, the state is also replicated, so our service
remains highly available when using reliable collections. More information on reliable collections can be
found here: https://azure.microsoft.com/en-us/documentation/articles/service-fabric-reliable-
services-reliable-collections/.
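A minimal sketch of a stateful service using a reliable dictionary might look like the following (API shape from the released SDK; the service name and the visits counter are illustrative):
using System;
using System.Fabric;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.ServiceFabric.Data.Collections;
using Microsoft.ServiceFabric.Services.Runtime;

public class CounterService : StatefulService
{
    public CounterService(StatefulServiceContext context) : base(context) { }

    protected override async Task RunAsync(CancellationToken cancellationToken)
    {
        // The reliable dictionary lives inside the service; reads are local.
        var counters = await this.StateManager
            .GetOrAddAsync<IReliableDictionary<string, long>>("counters");

        while (!cancellationToken.IsCancellationRequested)
        {
            using (var tx = this.StateManager.CreateTransaction())
            {
                await counters.AddOrUpdateAsync(tx, "visits", 1, (key, current) => current + 1);

                // The commit does not complete until the change has been
                // replicated to a quorum of the replica set.
                await tx.CommitAsync();
            }

            await Task.Delay(TimeSpan.FromSeconds(30), cancellationToken);
        }
    }
}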
To conclude the Programming Model section, it is worth mentioning that Service Fabric ships an
SDK that offers full Visual Studio integration, a local development experience and great diagnostics
capabilities for .NET development. At the time of writing, the development experience was restricted
to .NET development, but the team is working on supporting Java and other languages. We can expect
a preview in the first half of 2016. Figure 8.7 shows the create new Service Fabric application
experience in Visual Studio.
FIGURE 8.7: Create new Service Fabric application in Visual Studio 2015
In addition to the development experience, Service Fabric provides a modern Service Fabric
Explorer that gives us insight into cluster, application, and service status. Service Fabric
Explorer is part of the SDK, and you can launch it from a Windows tray icon. Figure 8.8 shows
Service Fabric Explorer on a development machine.
Application Lifecycle
Throughout this book, we have learned that one of the key advantages of microservices is that one can
update and upgrade microservices with no downtime of the running version. Service Fabric comes
with great application lifecycle capabilities that enable us to update, upgrade, and test services.
Service Updates
Properties of running services can be updated through PowerShell, REST endpoints, or the object
model. A good example of updating a service is adding more instances to a running stateless service
or changing the replica count of a running stateful service.
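As an example of using the object model, the following rough sketch raises the instance count of a running stateless service through FabricClient; the service URI is illustrative.
using System;
using System.Fabric;
using System.Fabric.Description;
using System.Threading.Tasks;

public static class ServiceUpdater
{
    public static async Task ScaleOutFrontEndAsync()
    {
        // Connects to the local cluster by default.
        var fabricClient = new FabricClient();

        var update = new StatelessServiceUpdateDescription { InstanceCount = 5 };
        await fabricClient.ServiceManager.UpdateServiceAsync(
            new Uri("fabric:/Flakio/FrontEnd"), update);
    }
}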
Application Upgrades
Service Fabric enables us to upgrade each service in an application independently, which is right in
line with the notion of independently versionable and deployable microservices. Let’s assume we
have an application with a stateless service that serves as the web front end, and a stateful service
that serves as our back-end service. As we have learned, both services are referenced in the
application manifest and each service has its own service manifest. Assume further that we have
made updates to the back-end service. In this case, we would only need to change the version number
of the application itself and the version of the back-end service. The front-end service can remain
unchanged and in its initial version.
The upgrade itself is done as a rolling upgrade. This means that the service is upgraded in stages.
In each stage, the upgrade is applied to the nodes in a specific upgrade domain. By continuing
upgrades in that manner, Service Fabric ensures that the service remains available throughout the
process. Figure 8.9 shows the upgrade process of an application in Service Fabric Explorer, with the
service in upgrade domain 0 completed and in upgrade domain 1 in progress.
Testability Framework
As we have seen in Chapter 6, testing plays an important role in the DevOps cycle of microservices
development. Whereas code-centric tests like unit tests are still up to the developer, Service Fabric
supports testing applications and services for faults in a number of ways.
The simplest and easiest way to test our service in case of node failures is to use Service Fabric
Explorer. It provides an easy way to shut down, stop, and restart nodes. Those types of tests enable us
to see how a service behaves under those circumstances and thus provide valuable insights. When
testing stateful services, we can usually observe how Service Fabric promotes a secondary to
become the primary while the service is still available and keeps its state. Figure 8.10 shows the
action menu that enables us to shut down and restart the node.
Summary
As we have seen, Service Fabric is a complete framework for building microservices-based
applications. In addition to its great cluster management and resource scheduling capabilities, it
offers rich programming models that make it easy for developers to create highly available,
resilient, and high-performing services. Perhaps the biggest and most compelling aspect of Service
Fabric is that it democratizes the development of stateful and highly available services. At the time of
writing, Service Fabric was still in preview and some changes will certainly happen, but
the core concepts of Service Fabric described in this chapter will remain the same.
A. ASP.NET Core 1.0 and Microservices
This appendix provides an overview of the key design changes coming with the newest version of
ASP.NET, including a new execution environment, cross-platform support, configurable environment
variables, and more. We’ll then review some common best practices and considerations for building
and designing containerized microservices architectures.
Getting Started
You can find installation and setup instructions for Mac, Linux, and Windows as well as additional
documentation at http://docs.asp.net.
.NET Core
.NET Core is a lightweight, cross-platform, modular version of the full .NET Framework that
includes CoreFX, a set of .NET libraries (collections, threading, data access, common data types, and
so on), and the CoreCLR, the underlying runtime and execution environment that handles assembly
loading, garbage collection, type safety, and other features. One of the key benefits of .NET Core’s
modularity is that all your app’s dependencies, including the CoreCLR, can be packaged and
distributed as standalone items. That means you no longer have to install the full .NET Framework on
a server to run ASP.NET apps. You can find more information on .NET Core at
http://bit.ly/aspdotnetcore.
.NET Core Side-by-Side Support
.NET Core’s architecture also alleviates one of the most common difficulties reported by customers,
where their server environment is locked into a specific version of the .NET Framework that cannot
be upgraded because of the potential to break legacy apps. With .NET Core, all the dependencies of
the app are included in the app itself. This enables you to have three different apps that reference
three different versions of the Core CLR without causing side-by-side compatibility issues. Your
legacy apps can continue to use an older version of the Core CLR without restricting your newer apps
to an outdated version of the Core CLR.
It isn’t just the .NET Framework that is packaged using NuGet; any class library that you build for
.NET Core is also packaged and distributed as a NuGet library. In fact, if you use Visual Studio’s
Add Reference... feature to reference a class library in your solution, what ASP.NET Core 1.0
actually does is add a new dependency to the list of dependencies in your project.json file.
In the sample config.json file below, notice that you can go beyond just the key/value pair and build
a set of nested configuration settings like the ConnectionStrings section that includes a child database
connection string.
Click here to view code image
{
  "AppSettings": {
    "SiteTitle": "My Website"
  },
  "ConnectionStrings": {
    "SqlDbConnection": "Server=(localdb)\\mssqllocaldb;Database=Products;Trusted_Connection=True;MultipleActiveResultSets=true"
  }
}
To load the connection string, you can then just use the Configuration object that is created in
the Startup constructor and pass in a string that follows the nested hierarchy for the configuration
setting as shown below.
Click here to view code image
var conn =
    Configuration.Get<string>("ConnectionStrings:SqlDbConnection");
You definitely don’t want to add a production database string to source control for the production
environment, so you instead set the connection string using an environment variable that is set when a
Docker container is created. You can see an example of using ASP.NET and Environment variables
in a Docker container in Chapter 4, “Setting Up Your Development Environment,” or refer to Chapter
6, “DevOps and Continuous Delivery” to read about more advanced options for handling environment
configuration changes using services like Consul or Zookeeper.
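As a rough sketch of how this fits together in later ASP.NET Core releases (the earlier DNX betas used slightly different package and namespace names), environment variables are added last to the configuration builder, so a variable such as ConnectionStrings__SqlDbConnection set on the container overrides the value in config.json:
using Microsoft.Extensions.Configuration;

public class Startup
{
    public IConfigurationRoot Configuration { get; }

    public Startup()
    {
        // Sources added later win, so an environment variable set when the
        // Docker container is created (docker run -e "ConnectionStrings__SqlDbConnection=...")
        // overrides the development value from config.json.
        Configuration = new ConfigurationBuilder()
            .AddJsonFile("config.json", optional: true)
            .AddEnvironmentVariables()
            .Build();
    }
}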
Command-line Driven
ASP.NET is also command-line driven, meaning that common tasks, from scaffolding to build and
packaging, to deployment, can be done from either the command line or Visual Studio. Doing this also
enables you to use your favorite code editor, including cross-platform code editors for ASP.NET
development, instead of requiring Visual Studio for development. At the time of this writing, the
.NET and ASP.NET teams are rebranding their command line tools from “DNX” (DotNet eXecution)
to “dotnet.” The .NET Core Command Line Interface (CLI) includes the following commands for
managing .NET projects:
• Compile: Compiles your project.
• New: Creates a new “Hello World” .NET console project template.
• Publish: Prepares your project for deployment by downloading any dependencies, including the
.NET Core Framework that the project requires to run.
• Restore: Restores NuGet dependencies for your project.
• Run: Compiles and runs your project.
The min:js task uses the uglify() method to convert readable JavaScript into minified, obfuscated
code. The min:css task, on the other hand, minifies the CSS contents to reduce the number of bytes sent
over the wire. You don’t have to understand how these tasks work, but the important part is to ensure
that these tasks run as part of the Docker image creation, either as part of the normal Visual Studio
publishing process or via an automated build and continuous integration process.
Click here to view code image
gulp.task("min:js", function () {
gulp.src([paths.js, "!" + paths.minJs], { base: "." })
.pipe(concat(paths.concatJsDest))
.pipe(uglify())
.pipe(gulp.dest("."));
});
gulp.task("min:css", function () {
gulp.src([paths.css, "!" + paths.minCss])
.pipe(concat(paths.concatCssDest))
.pipe(cssmin())
.pipe(gulp.dest("."));
});
An IActionResult return type like ObjectResult behaves much like a plain C# return type, with the
difference that with an IActionResult you can return RESTful values using built-in helpers like
HttpNotFound(), which translates to an HTTP 404 error, or return any HTTP status code using the
HttpStatusCodeResult class. The snippet below shows the same API returning 25 objects, but this
time using an IActionResult return type.
Click here to view code image
[HttpGet]
public async Task<IActionResult> Get()
{
    var result = await _context.Products
        .Take(25).ToArrayAsync();
    return new ObjectResult(result);
}
One common question is when to use which return type. There are no hard-and-fast rules, and
although both are commonly used, the IActionResult is often used due to the additional flexibility and
support for HTTP status codes it provides.
Another consideration is whether to allow XML as a valid return type. Many online
services like Facebook and Twitter have switched to providing only JSON results because of the
performance overhead of XML. Because XML is more verbose, it also results in
higher egress costs from Azure and is potentially more expensive for clients consuming services over
metered connections, such as a mobile phone. If you don’t already have legacy apps that depend on XML,
it’s likely best to switch to supporting only JsonResult types.
if (product == null)
{
return HttpNotFound();
}
You can then ensure that your APIs are factored to provide support for sorting, paging, and item
count for further control by your callers.
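For example, a paged version of the earlier products API might look like the following sketch (it assumes the same _context from the previous snippets and a Name property on the product type; the default page size is arbitrary):
[HttpGet]
public async Task<IActionResult> Get(int page = 1, int pageSize = 25)
{
    // Returning the total item count lets callers page through the full result set.
    var total = await _context.Products.CountAsync();

    var items = await _context.Products
        .OrderBy(p => p.Name)
        .Skip((page - 1) * pageSize)
        .Take(pageSize)
        .ToArrayAsync();

    return new ObjectResult(new { total, page, pageSize, items });
}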
Beyond API design, you also want to keep a close eye on how your API is used: who the callers are
and what calling patterns they use. Ensure that you have monitoring and diagnostic tools in place to
track and understand who is using your API and how they are using it. Depending on how your API is
used (or abused), one potential option is to rate-limit your API to restrict certain callers to a
fixed number of API calls per day.
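As a very rough sketch of what such a limit could look like in ASP.NET Core middleware (the X-Api-Key header, the quota, and the in-memory counters are all illustrative; a real implementation would identify callers properly and keep counters in a shared store with a rolling window):
using System.Collections.Concurrent;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Http;

public class Startup
{
    // Naive per-caller counters; these reset only when the process restarts.
    private static readonly ConcurrentDictionary<string, int> Calls =
        new ConcurrentDictionary<string, int>();

    public void Configure(IApplicationBuilder app)
    {
        app.Use(async (context, next) =>
        {
            var caller = context.Request.Headers["X-Api-Key"].ToString();
            var count = Calls.AddOrUpdate(caller, 1, (key, current) => current + 1);

            if (count > 10000) // illustrative daily quota
            {
                context.Response.StatusCode = 429; // Too Many Requests
                return;
            }

            await next();
        });

        // ... app.UseMvc() and the rest of the pipeline ...
    }
}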
A
A/B testing, 198
access to websites through the Internet, 52
ACS (Azure Container Service), 129–130, 132–133
ACS Resource Provider, 121–134
ADD, 60
adding
content to containers, using volumes, 53–54
dependencies to microservices, 111
agents, Apache Mesos, 148
aggregation, log aggregation, 217–218
Alpine, 93
announcements, service discovery, 151
anti-corruption layer, 83
Apache Mesos, 147–148
agents, 148
components, 148
diverse workloads, 150
frameworks, 148–149
masters, 148
Mesosphere DCOS (Data Center Operating System), 149
service discovery, 150–152
API Gateway, 159–161
APIs
authentication, OpenID, 270
Batch APIs, ASP.NET, 270
HTTP error codes, 268
reliable actors API, 247
reliable services API, 249–251
RESTful APIs, 267–268
schedulers, 141
application configuration changes, across different environments, 184–185
application data, logging from within containers, 222
application dependencies, project.json (ASP.NET), 257
application gateways, 121, 122, 159–161
Application Insights, 176, 227–231
application lifecycle, Azure Service Fabric, 251
application manifest, Azure Service Fabric, 241
application upgrades, Azure Service Fabric, 252–253
applications
Azure Service Fabric, 240–241
application manifest, 241
custom applications, 242–243
service manifest, 241–242
decomposing, 74–75
considerations for, 86–87
designing
bounded context, 75–76
coarse-grained services, 70–72
common versioning strategy, 77
data collection, 81–83
determining where to start, 70
refactoring across boundaries, 75
serialization, 78
service design, 76–77
service to service communication, 78
architects, skillsets and experience, 18
architecture
flak.io e-commerce sample, 85–86
microservices architecture, 2
ARM (Azure Resource Manager)
creating environments, 177–179
infrastructure as code, 126–127
inside the box provisioning, 131–132
multivendor provisioning, 135–136
tracking deployments, with tags and labels, 179–181
ARM (Azure Resource Manager) templates, 36
deploying
to Azure, 134
from version control, 135
ARM templating language, 127–128
outputs, 130
parameters, 128–129
resources, 129–130
variables, 129
artifact stores, 121
ASP.NET
application dependencies, project.json, 257
Async, 265–266
Batch APIs, 270
choosing Docker images, 262–263
cloud-ready environment-based configurations, 258–259
command-line driven, 260
concurrency, 269
cross-platform apps, 256
cross-platform console applications, 261–262
cross-platform data and nonrelational data, 261
cross-platform web servers, 261
dependency injection (DI), 260–261
designing,
microservices, 269
for mobile clients, 270–271
environment variables, 114–115
front-end web development practices, 263–264
.NET Core, 256
NuGet, 257–258
OData, 274
open source stack, 256
OpenID, 271
RAD development, Roslyn, 261
REST services, 264–265
RESTful APIs, 267–268
stateless APIs, 266
Swagger, 272–273
unification of MVC and Web API, 260
uptime services, 273–274
Web API, return types, 266–267
ASP.NET Core 1.0, 255
changes, 109–110
Async, ASP.NET, 265–266
asynchronous messaging, 79–80
authentication
Docker, 91–92
OpenID, ASP.NET, 271
automated builds, Docker images, 98–99
automation
Azure environments, 174
microservices, 22–23
autonomous services, 4–5
autoscaling, schedulers, 141
availability sets, 139
Azure Service Fabric, 237
Azure
availability sets, 139
common log format, 221
containers, 34–35
creating environments, 173
with ARM (Azure Resource Manager), 177–179
automation, 174
immutable infrastructure, 173–174
infrastructure, 176
infrastructure as code, 174–175
private versus shared, 175–176
third-party configuration and deployment tools, 181
deploying ARM (Azure Resource Manager) templates, 134
deployment models, 42–44
Kubernetes, 147
load testing, 193–194
swarms, creating, 143–144
tracking deployments with tags and labels, 179–181
virtual machines, creating with Docker, 35–37
Azure Container Service (ACS), 129–130, 132–133
Azure Diagnostics, 222
Azure healing, schedulers, 140
Azure Portal, 36
Azure Resource Manager. See ARM (Azure Resource Manager)
Azure Resource Manager (ARM) templates, 36
Azure Service Fabric, 233, 234
application lifecycle, 251
application upgrades, 252–253
applications, 240–241
application manifest, 241
custom applications, 242–243
service manifest, 241–242
cluster management, 236–237
container integration, 243–244
fault domains, 238–240
Linux support, 234
programming model, 244
Quorum, 237
reliable actors API, 247
reliable services, reliable services API, 249–251
resource scheduling, 240
service discovery, 244
service updates, 251
stateless services, 244–245, 245–247
subsystems, 234–236
testability framework, 253–254
update domains, 238–240
virtual machine scale sets, 237
B
Bamboo, 205
base images, Docker, 92–95
Batch APIs, ASP.NET, 270
benefits of
DevOps, 22
microservices, 6
continuous innovation, 8–9
fault isolation, 12–13
independent deployments, 6–8
resource utilization, 9–10
scaling, 9–10
small teams, 12
technology diversity, 10–12
best practices, microservices, 19
automation, 22–23
Conway’s Law, 20–21
DevOps, 21–22
encapsulation, 20
fault tolerance, 23–26
monitoring, 23
best-of-breed solutions, continuous delivery tools, 201
Big Bang Integration Testing, 192
bin packing, 138
bots, 185
boundaries, refactoring across, 75
bounded context, 75–76
bridge network, linking containers, 65
build agents, 201
build controllers, 201
Build Job, 205
build/test host, Docker, 90
bulkheads, 24
C
Calico, 165
canary testing, 196–197
cascading failures, 25
certificates, managing, in Docker, 91–92
challenges of
manual deployments, 124
microservices, 13–14
complexity, 14
data consistency, 15–16
network congestion, 14–15
scheduling, 136
changes to code, ASP.NET Core 1.0 changes, 109–110
Chaos Monkey, 199
choosing images for virtual machines, 40–41
Chronos, 149–150
CI (continuous integration), 171–172
circuit breakers, 24
CLAbot, 186
cloning samples, product catalog microservice, 106–107
cloud only, Docker, 91
cloud-config files, 132
cloud-ready environment-based configurations, ASP.NET, 258–259
cluster coordination, 156
cluster host environments, overview, 120
cluster management, 122
Azure Service Fabric, 236–237
cluster resource manager, Azure Service Fabric, 235
cluster schedulers, autoscaling, 141
cluster scheduling, 138
cluster state store, 120
clusters
application gateways, 159–161
Docker Swarm, 141–142
Swarm cluster, 141–142
master nodes, 142
CMD, 60
coarse-grained services, 70–72
decomposing, 72
defining services and interfaces, 73–74
microservices, 72–73
code analysis with SonarQube, 190
code samples, flak.io microservices sample code, 4
coded UI testing, 192–193
cohesion, 74–75
collaboration, DevOps, 170–171
command-line driven, ASP.NET, 260
commands
ADD, 60
CMD, 60
COPY, 60
ENTRYPOINT, 60
ENV, 60
EXPOSE, 60
FROM, 59
MAINTAINER, 59
ONBUILD, 60
RUN, 59
USER, 60
VOLUME, 60
WORKDIR, 60
committing images, 56–61
common log format, 219–221
common versioning strategy, 77
communication, service to service communication, 78
communication subsystem, Azure Service Fabric, 235
complexity, microservices, 14
concurrency, ASP.NET, 269
connecting
to Docker hosts, 105
to swarms, Docker Swarm, 144
to virtual machines
with SSH and Git Bash on Mac OS X, 46
with SSH and Git Bash on Windows, 44
connectivity issues, unable to connect to the host, Docker, 116–117
consistency, infrastructure as code, 125
console applications, cross-platform console applications (ASP.NET), 261–262
constraints, placement constraints, 242
Consul, 158
consumer-driven contract testing, service dependencies, 187–189
container events, overview, 214
container ids, deleting containers, 54
container integration, Azure Service Fabric, 243–244
container linking, 114
container logs, viewing, 63–64
container networking, 64
containers, 29–30
adding content to with volumes, 53–54
Azure, 34–35
creating new, 61
deleting, 60
with container ids, 54
dependencies, 114
Docker, 30, 34–35
Docker issues, containers that won’t start, 117
enabling live reload, 107–108
images, updating and committing, 56–61
linking, 65
Docker, 114
logging application data, 222
monitoring, 212–213, 216
Docker Remote API, 215–216
Docker runtime metrics, 213–215
noisy neighbor problem, 32–34
versus processes, 30–32
reactive security, 34
resource scheduling, 240
schedulers, 139
versus virtual machines, 30–32
continuous delivery, 171–172, 200
deploying microservices, 182–184
continuous delivery tools
best-of-breed solutions, 201
considerations when choosing, 203–204
extensibility, 202–203
hybrid pipelines, 202
on-premises or hosted service, 201
on-premises or hosted tools, 200
continuous deployment, 171–172
continuous innovation, microservices, 8–9
continuous integration (CI), 185
code analysis with SonarQube, 190
improving quality through pull request validation, 185–186
public third-party services, 190
testing service dependencies with consumer-driven contract testing, 187–189
unit testing, 186–187
website performance, 190–191
Conway’s Law, 20–21
COPY, 60
CoreOS, etcd, 158
correlation ID, monitoring services, 218
coupling, 74
cross-platform apps, ASP.NET, 256
cross-platform console applications, ASP.NET, 261–262
cross-platform data and nonrelational data, ASP.NET, 261
cross-platform web servers, ASP.NET, 261
culture, DevOps, 170–171
collaboration, 170–171
demystifying deployments, 170
no blame rule, 170
custom applications, Azure Service Fabric, 242–243
D
data collection, 81–83
data consistency, microservices, 15–16
DDD (Domain Driven Design), 6, 75
debugging Docker issues
containers that won’t start, 117
diagnosing running containers, 118
unable to connect to the host, 116–117
decomposing
applications, 74–75
considerations for, 86–87
coarse-grained services, 72
defining services and interfaces, 73–74
microservices, 72–73
dedicated cluster nodes, 161
dedicated gateways, 161
deleting containers, 60
with container ids, 54
demystifying deployments, DevOps, 170
dependencies
adding to microservices, 111
containers, 114
schedulers, 139
dependency injection (DI), 260–261
deploying
ARM (Azure Resource Manager) templates
to Azure, 134
from version control, 135
microservices, with continuous delivery, 182–184
to staging, 195
deployment models, 42–44
deployments
demystifying deployments, DevOps, 170
independent deployments, microservices, 6–8
manual deployments, challenges of, 124
sharing, 157
tracking changes with tags and labels, 179–181
designing
applications
bounded context, 75–76
coarse-grained services, 70–72
common versioning strategy, 77
data collection, 81–83
determining where to start, 70
refactoring across boundaries, 75
serialization, 78
service design, 76–77
service to service communication, 78
microservices, ASP.NET, 269
for mobile clients, ASP.NET, 270–271
developer configurations, Docker
cloud only, 91
local and cloud, 91
local development, 90
developer tools, installing, 102
development machines, installing (Docker), 37
DevOps
continuous delivery, 200
on-premises or hosted tools, 200
culture, 170–171
collaboration, 170–171
demystifying deployments, 170
no blame rule, 170
support, 170
Dockerizing, 181–182
microservices, 21–22
overview, 167–168, 169
testing, 192
A/B testing, 198
canary testing, 196–197
coded UI testing, 192–193
Docker stress testing, 195
fault tolerance testing, 198–199
integration testing, 192
load and stress testing, 193
manual/exploratory testing, 196
resiliency testing, 198–199
DI (dependency injection), 260–261
diagnosing running containers, 118
directories, creating with Docker, 54–55
discovery backend, Docker Swarm, 142
diverse workloads, Apache Mesos, 150
DNS, service discovery, 157
DNS protocol integrations, service registry, 156
Docker, XVIII
authentication, 91–92
build/test host, 90
container linking, 114
containers, 30, 34–35
creating Azure virtual machines, 35–37
developer configurations
cloud only, 91
local and cloud, 91
local development, 90
directories, creating, 54–55
images, 49–50
automated builds, 98–99
building a hierarchy, 95–98
choosing base images, 92–95
choosing with ASP.NET, 262–263
sharing, 99–100
tags, 99
installing
on Azure virtual machines, 36
on development machines, 37
local development, 89, 103
settings, 103–104
monitoring recommended solutions, 232
networking features, overlay networks, 163–164
product validation, 90
runtime metrics, 213–215
stress testing, 195
TLS (Transport Layer Security), 144
tracking deployments with labels, 180–181
verifying installation, 47–48
Docker Bench, 100
Docker Cloud, 206–207
Docker Compose, 102, 112
smart restart, 115–116
Docker Engine, 101
Docker Exec, 118
Docker for Mac, 102
Docker for Windows, 102
Docker hosts, connecting to, 105
Docker Hub, 48, 93–94
images, 94–95
Docker issues, debugging
containers that won’t start, 117
diagnosing running containers, 118
unable to connect to the host, 116–117
Docker Kitematic, 101
docker logs, viewing, 63–64
Docker Machine, 101
docker ps, 50–52
docker pull, 50
docker pull nginx, 49
Docker Quickstart Terminal, 104–105
Docker Remote API, 213, 215–216
docker run, 50–52
Docker run command, 181
docker search nginx, 49
Docker Swarm, 141–142
connecting to swarms, 144
creating swarms, on Azure, 143–144
discovery backend, 142
master nodes, 142
strategies, 142–143
swarm filters, 143
Docker Swarm Cluster Template, 143–144
Docker tools, installing, 101–102
Docker Trusted Registry (DTR), 100
Dockerfiles, 98
building images, 59–60
dockerhostkey, 38
dockerhostkey.pub, 38
Dockerizing, 181–182
Domain Driven Design (DDD), 6, 75
DTR (Docker Trusted Registry), 100
E
efficiency, scheduling, 137
elasticsearch, 105
encapsulation, microservices, 20
Enterprise Service Bus (ESB), 21
ENTRYPOINT, 60
ENV, 60
environment variables, 68
ASP.NET, 114–115
environments
application configuration changes, 184–185
Azure, 173
automation, 174
creating with ARM (Azure Resource Manager), 177–179
immutable infrastructure, 173–174
infrastructure, 176
infrastructure as code, 174–175
private versus shared, 175–176
third-party configuration and deployment tools, 181
removing, 125
updating, infrastructure as code, 124–125
ESB (Enterprise Service Bus), 21
etcd, 158
Eureka, 158
exception processing, 198
expectations, consumer-driven contract testing, 187
experience, architects, 18
exploratory testing, 196
EXPOSE, 60
expressions, templates, 128
extensibility, continuous delivery tools, 202–203
extensions, virtual machines, 37
F
failover manager service, Azure Service Fabric, 235
failures, cascading failures, 25
fault domains, Azure Service Fabric, 238–240
fault isolation, microservices, 12–13
fault tolerance, microservices, 23–26
fault tolerance testing, 198–199
features, service registry, 155
federation subsystem, Azure Service Fabric, 235
files
dockerhostkey, 38
dockerhostkey.pub, 38
filters, swarm filters, Docker Swarm, 143
fine-grained SOA. See microservices architecture
flak.io e-commerce sample, 83
architecture overview, 85–86
requirements, 84–85
flak.io microservices sample code, 4
Flannel, 164–165
format of logs, 219–221
frameworks
Apache Mesos, 148–149
testability framework, Azure Service Fabric, 253–254
FROM, 59
front-end web development practices, ASP.NET and Visual Studio 2015, 263–264
functions, templates, 128
G
Gatekeeper, 196–197
gateways, application gateways, 159–161
generating SSH public keys
on Mac OS X, 39
on Windows, 37–39
Git Bash, 37
connecting virtual machines, on Mac OS X, 46
Git command line, 102
Git Credential Manager, 102
Google, Page Speed Insights API, 190–191
Grunt, 111
phantomas task, 190–191
Gulp, 111
H
HAProxy, 160
HashiCorp
Consul, 158
Terraform, 135–136
health checks, service registry, 156
health subsystem, Azure Service Fabric, 234–235
hierarchies, images hierarchies (Docker), 95–98
high availability
schedulers, 140
service registry, 155
horizontal scaling, 32
host machines, monitoring, 210–211
host nodes, 120
host-agnostic, 112
hosted infrastructure, continuous delivery tools, 200
hosting subsystem, Azure Service Fabric, 235
HTTP error codes, APIs, 268
hybrid pipelines, 202
hypervisor virtualization technologies, 30–31
I
-i (identify file parameter), 45
IaaS (Infrastructure as a Service), 119
identify file parameter (-i), 45
image layering, 61–63
image registry, 121
image repositories, Docker, 48
images
choosing
ASP.NET Docker images, 262–263
for virtual machines, 40–41
Docker, 49–50
automated builds, 98–99
building a hierarchy, 95–98
choosing base images, 92–95
sharing, 99–100
tags, 99
managing, 100–101
updating and committing, 56–61
immutable infrastructure, environments (Azure), 173–174
independent deployments, microservices, 6–8
Information blade, 40–41
infrastructure
Azure environments, 176
hosted infrastructure, 200
immutable infrastructure, 173–174
Infrastructure as a Service. See IaaS (Infrastructure as a Service)
infrastructure as code, 123–124
ARM (Azure Resource Manager), 126–127
Azure, 174–175
consistency, 125
removing, old environments, 125
test-driven infrastructure, 126
tracking changes, 125
updating environments, 124–125
innovation, continuous innovation (microservices), 8–9
inside the box provisioning, 123
ARM (Azure Resource Manager), 131–132
installing
developer tools, 102
Docker, on Azure virtual machines, 36
Docker tools, 101–102
OSX utilities, 103
Windows utilities, 102–103
integrated solutions, continuous delivery tools, 201–202
integration
container integration, Azure Service Fabric, 243–244
continuous integration, 185
code analysis with SonarQube, 190
improving quality through pull request validation, 185–186
public third-party services, 190
testing service dependencies with consumer-driven contract testing, 187–189
unit testing, 186–187
website performance, 190–191
microservices, 17
integration tests/stress testing, 76–77
interfaces, defining for coarse-grained services, 73–74
inter-service communications, 80
isolation, scheduling, 137
J
javascript task runners, 111–112
Jenkins, 205
JSON, parameters, 128–129
JSON-based documents, ARM templating language, 127–128
K
Kanban Tool, 202–203
key vault, 176
kubelets, 145
Kubernetes, 144–145
Azure, 147
components, 145, 147
labels, 146
names, 147
namespaces, 147
pods, 145–146
replication controllers, 146
selectors, 146
services, 146
volumes, 146–147
L
labels
Kubernetes, 146
tracking deployments, 179–181
Docker, 180–181
languages, ARM templating language, 127–128
layering, images, 61–63
linked templates, 131
linking containers, 65
Docker, 114
Linux
Azure Service Fabric, 234
Windows and, 35
Linux CPU, 225
Linux diagnostics event, 223–225
Linux VirtualBox Path, 108–109
LinuxDisk, 225–226
LinuxMemory, 226
ListAsync() method, 265
live reload, enabling, 107–108
load balancers, 153, 160, 176
load testing, 193
with Azure and Visual Studio, 193–194
local and cloud, Docker, 91
local development
Docker, 89, 90, 103
settings, 103–104
installing Docker tools, 101–102
local Docker hosts, starting, 104–105
settings, 103–104
local Docker hosts, starting, 104–105
log aggregation, 217–218
logging, application data, from within containers, 222
logs
common log format, 219–221
considerations for, 221–222
Logstash, 220
lookup API, service registry, 155
lookups, service discovery, 151
ls command, 109
M
Mac computers, cloning samples, 106
Mac OS X
connecting virtual machines with SSH and Git Bash, 46
generating SSH public keys, 39
MAINTAINER, 59
management services, 120
management subsystem, Azure Service Fabric, 234
managing images, 100–101
manual deployments, challenges of, 124
manual testing, 196
as a service, 196
Marathon, 149
master nodes, 142
masters, Apache Mesos, 148
Mean Time to Repair (MTTR), 169
Mean Time to Resolution (MTTR), 7
Meso-DNS, 157
Mesosphere DCOS (Data Center Operating System), 149
Chronos, 149–150
Marathon, 149
microservices, XVII–XVIII, 2–4
autonomous services, 4–5
benefits of, 6
continuous innovation, 8–9
fault isolation, 12–13
independent deployments, 6–8
resource utilization, 9–10
scaling, 9–10
small teams, 12
technology diversity, 10–12
best practices, 19
automation, 22–23
Conway’s Law, 20–21
DevOps, 21–22
encapsulation, 20
fault tolerance, 23–26
monitoring, 23
challenges of, 13–14
complexity, 14
data consistency, 15–16
network congestion, 14–15
changing monoliths to, 80–81
coarse-grained services, 70–72
decomposing coarse-grained services, 72–73
defining service boundaries, 6
deploying with continuous delivery, 182–184
designing, ASP.NET, 269
integration, 17
monitoring, 18
preparing for production, 110
adding dependencies, 111
javascript task runners, 111–112
optimizing source code, 111
product catalog microservice, 105
cloning samples, 106–107
enabling live reload, 107–108
Linux VirtualBox Path, 108–109
volumes, 108
retry patterns, 26
routing, 17–18
service dependencies, 25
service discovery, 17–18
single responsibility principle, 5–6
skillsets for architects, 18
SLA (service level agreement), 19
small services, 5
testing, 16–17, 112
time to live (TTL), 16
versioning, 17
microservices architecture, 2
monitoring, 209–210
Application Insights, 227–231
Azure Diagnostics, 222
containers, 212–213, 216
Docker Remote API, 215–216
Docker runtime metrics, 213–215
host machines, 210–211
microservices, 18, 23
OMS (Operations Management Suite), 231–232
recommended solutions by Docker, 232
services, 216–217
common log format, 219–221
correlation ID, 218
log aggregation, 217–218
operational consistency, 218–219
solutions, 222
syslog drivers, 227
monoliths, 71
changing to microservices, 80–81
partitioning, 82
refactoring, 81
MTTR (Mean Time to Repair), 169
MTTR (Mean Time to Resolution), 7
multivendor provisioning, ARM (Azure Resource Manager), 135–136
N
names, Kubernetes, 147
namespaces, Kubernetes, 147
.NET Core
ASP.NET, 256
support, 256–257
Netflix, Prana, 219
network congestion, microservices, 14–15
networking
container networking, 64
overlay networks, 65–67
networks, overlay networks, 161–163
Docker networking feature, 163–164
Flannel, 164–165
Project Calico, 165
Weave Net, 164
NGINX, 160
Docker, images, 49–50
no blame rule, DevOps, 170
Node Docker image, 94
nodes, master nodes, 142
noisy neighbor problem, containers, 32–34
notifications, service registry, 156
NuGet, 109–110
ASP.NET, 257–258
O
OData, 274
OMS (Operations Management Suite), 231–232
ONBUILD, 60
on-premises tools, 200
open source stack, ASP.NET, 256
OpenID, authentication (ASP.NET), 271
operational consistency, monitoring services, 218–219
Operations Management Suite (OMS), 231–232
optimizing source code, microservices, 111
Oracle VirtualBox, 101
orchestration, 121–122
Apache Mesos. See Apache Mesos
cluster management, 122
Docker Swarm. See Docker Swarm
discovery backend, 142
master nodes, 142
Kubernetes. See Kubernetes
master nodes, 142
provisioning, 121, 123
schedulers. See schedulers
scheduling, 122, 136
challenges of, 136
efficiency, 137
isolation, 137
performance, 138
scalability, 137
solutions for, 138
service discovery, 150–152
service lookup, 153–155
service registration, 152–153
organizing images (Docker), with tags, 99
OSX utilities, installing, 103
outputs, ARM templating language, 130
outside the box provisioning, 123
overlay networks, 65–67, 122–123, 161–163
Docker networking feature, 163–164
Flannel, 164–165
Project Calico, 165
Weave Net, 164
P
-p (port parameter), 45
PaaS (Platform as a Service), 119
Pact, 16
consumer-driven contract testing, 189
PageSpeed task, 190–191
parameters
ARM templating language, 128–129
-i (identify file parameter), 45
linked templates, 131
-p (port parameter), 45
partitioning, 82
monoliths, 82
partitions, stateless services, 245
patterns
retry patterns, microservices, 26
sidecar patterns, 219
peer gateway request routing, 161
performance
scheduling, 138
website performance, 190–191
pets and cattle metaphor, 125
phantomas Grunt task, 190–191
placement constraints, 242
Platform as a Service (PaaS), 119
pods, Kubernetes, 145–146
port parameter (-p), 45
Prana, 219
preparing for production, microservices, 110
adding dependencies, 111
javascript task runners, 111–112
optimizing source code, 111
prioritization, 82
private environments versus shared environments, 175–176
processes versus containers, 30–32
product catalog microservice, 105
cloning samples, 106–107
enabling, live reload, 107–108
Linux VirtualBox Path, 108–109
volumes, 108
product validation, Docker, 90
production, testing in, 196
programmable infrastructure, 123–124
programming model, Azure Service Fabric, 244
Project Calico, 165
project.json, ASP.NET, 257
provisioning, 121–122, 123
inside the box provisioning, ARM (Azure Resource Manager), 131–132
multivendor provisioning, ARM (Azure Resource Manager), 135–136
proxies, routing proxies, 155
public third-party services, continuous integration, 190
pull request validation, improving quality through, 185–186
Putty client, 102
Q
QA environments
deploying to staging, 195
testing, 192
Quay Enterprise, 100
Quorum, Azure Service Fabric, 237
R
RAD development, with Roslyn, ASP.NET, 261
RainforestQA, 196
random approach, 138
rating services, containers, 32–34
reactive security, 34
reallocation, schedulers, 140
refactoring, monoliths, 81
refactoring across boundaries, 75
registry, service discovery, 151
reliability subsystem, Azure Service Fabric, 235
reliable actors API, Azure Service Fabric, 247
reliable services API, Azure Service Fabric, 249–251
removing old environments, infrastructure as code, 125
replicas, stateless services, 245–246
replication
schedulers, 139–140
stateless services, 246
replication controllers, Kubernetes, 146
requirements, flak.io e-commerce sample, 84–85
resiliency testing, 198–199
resource scheduling, Azure Service Fabric, 240
resource utilization, microservices, 9–10
resources, ARM templating language, 129–130
REST services, ASP.NET, 264–265
RESTful APIs, ASP.NET, 267–268
retry, 24
retry patterns, microservices, 26
return types, ASP.NET Web API, 266–267
Robinson, Ian, 188
rolling updates, schedulers, 140
Roslyn, RAD development, ASP.NET, 261
routing, microservices, 17–18
routing proxies, 155
RUN, 59
runtime metrics, Docker, 213–215
S
scalability
scheduling, 137
service registry, 155
scaling
autoscaling, 141
microservices, 9–10
virtual machines, 31
scaling out, 32
scaling up, 32
schedulers
API, 141
autoscaling, 141
availability sets, 139
Azure healing, 140
bin packing, 138
constraints, 139
dependencies, 139
high availability, 140
random approach, 138
reallocation, 140
replication, 139–140
rolling updates, 140
spread approach, 138
scheduling, 122, 136
challenges of, 136
efficiency, 137
isolation, 137
performance, 138
scalability, 137
solutions for, 138
security, reactive security, 34
security tools, Docker Bench, 101
selectors, Kubernetes, 146
Selenium, 192–193
serialization, 78
asynchronous messaging, 79–80
synchronous request/response, 78–79
service, defining for coarse-grained services, 73–74
service announcements, 152–153
service registry, 155
service boundaries, microservices, 6
service decomposition, coarse-grained services, 72
defining services and interfaces, 73–74
microservices, 72–73
service dependencies
microservices, 25
testing with consumer-driven contract testing, 187–189
service design, 76–77
service discovery, 122
Apache Mesos, 150–152
Azure Service Fabric, 244
Consul, 158
DNS, 157
etcd, 158
Eureka, 158
microservices, 17–18
Zookeeper, 158
service discovery store, 120
Service Fabric. See Azure Service Fabric
Service Fabric Replicator, 235
service level agreements. See SLA (service level agreement)
service lookup, 153–155
service manifest, Azure Service Fabric, 241–242
service registration, 152–153
service registry, 155
DNS protocol integrations, 156
features, 155
health checks, 156
high availability, 155
lookup API, 155
notifications, 156
scalability, 155
service announcements, 155
service to service communication, 78
service updates, Azure Service Fabric, 251
services
Kubernetes, 146
manual testing, 196
monitoring, 216–217
common log format, 219–221
correlation ID, 218
log aggregation, 217–218
operational consistency, 218–219
set OSX environment variables, 105
set Windows environment variables, 105
shared environments versus private environments, 175–176
sharing
deployments, 157
images (Docker), 99–100
sidecar patterns, 219
single responsibility principle, 5–6
SkyDNS, 157
SLA (service level agreement), microservices, 19
Slack, 202–203
small services, microservices, 5
small teams, microservices, 12
smart restart, Docker Compose, 115–116
solutions, monitoring, 222
solutions for scheduling, 138
SonarQube, code analysis, 190
Spotify, 195
spread approach, 138
SSH, connecting virtual machines, on Mac OS X, 46
SSH public keys, generating
on Mac OS X, 39
on Windows, 37–39
staging, deploying to, 195
starting local Docker hosts, 104–105
stateless APIs, ASP.NET, 266
stateless services, Azure Service Fabric, 244–247
STDERR, 222
STDOUT, 222
storage, 176
stress testing, 193
Docker, 195
subsystems, Azure Service Fabric, 234–236
support, .NET Core, 256–257
Swagger, 264–265
ASP.NET, 272–273
Swarm cluster, 141–142
master nodes, 142
swarm filters, Docker Swarm, 143
swarm strategies, Docker Swarm, 142–143
swarms
connecting to, Docker Swarm, 144
creating with Docker, on Azure, 143–144
synchronous request/response, 78–79
syslog drivers, 227
T
tags
images, Docker, 99
tracking deployments, 179–181
teams, microservices, 12
technical debt, 171
technologies
application gateways, 159–161
Consul, service discovery, 158
DNS, service discovery, 157
etcd, 158
Eureka, 158
Zookeeper, 158
technology diversity, microservices, 10–12
templates
ARM (Azure Resource Manager) templates, 36
deploying from version control, 135
deploying to Azure, 134
Docker Swarm Cluster Template, 143–144
expressions, 128
functions, 128
linked templates, 131
Terraform, 135–136
testability framework, Azure Service Fabric, 253–254
test-driven infrastructure, 126
testing
DevOps, 192
A/B testing, 198
canary testing, 196–197
coded UI testing, 192–193
Docker stress testing, 195
fault tolerance testing, 198–199
integration tests, 192
load and stress testing, 193
manual/exploratory testing, 196
resiliency testing, 198–199
integration tests, 76–77
load testing with Azure and Visual Studio, 193–194
manual testing, as a service, 196
microservices, 16–17, 112
in production, 196
service dependencies with consumer-driven contract testing, 187–189
unit testing, 186–187
user acceptance testing, 195
third-party configuration and deployment tools, Azure, 181
third-party services, continuous integration, 190
time to live (TTL), 16
timeouts, 24
TLS (Transport Layer Security), Docker, 144
tools
developer tools, installing, 102
Docker Compose, 112
Docker tools, installing, 101–102
Toxiproxy, 199
tracking changes, infrastructure as code, 125
tracking deployments, with tags and labels, 179–181
Transport Layer Security (TLS), Docker, 144
transport subsystem, Azure Service Fabric, 235
tribal knowledge, 170–171
TTL (time to live), 16
turns, Azure Service Fabric, 248
Tutum, 206–207
U
unit testing, 186–187
update domains, Azure Service Fabric, 238–240
updating
environments, infrastructure as code, 124–125
images, 56–61
upgrades, application upgrades, Azure Service Fabric, 252–253
uptime services, ASP.NET, 273–274
USER, 60
user acceptance testing, 195
UserVoice, 202–203
V
variables
ARM templating language, 129
environment variables, 68
ASP.NET and, 114–115
set OSX environment variables, 105
set Windows environment variables, 105
verifying Docker installation, 47–48
version control, deploying ARM (Azure Resource Manager) templates, 135
versioning
common versioning strategy, 77
microservices, 17
vertical scaling, 32
viewing container logs, 63–64
virtual machine scale sets, 237
virtual machines
choosing images, 40–41
connecting
with SSH and Git Bash on Mac OS X, 46
with SSH and Git Bash on Windows, 44
versus containers, 30–32
creating with Docker, 35–37
extensions, 37
installing Docker, 36
scaling, 31
virtual networks, 176
virtual private networks (VPNs), 176
VirtualBox, 105
Visual Studio, load testing, 193–194
Visual Studio 2015, front-end web development practices, 263–264
Visual Studio Code, 102, 112–113
Visual Studio Team Services (VSTS), 201, 205
VOLUME, 60
volumes
adding content to containers, 53–54
Kubernetes, 146–147
product catalog microservice, 108
VPNs (virtual private networks), 176
VSTS (Visual Studio Team Services), 201
W
Weave Net, 164
Web API, ASP.NET, return types, 266–267
web servers, cross-platform web servers, ASP.NET, 261
website performance, continuous integration, 190–191
Windows
cloning samples, 106
connecting virtual machines, with SSH and Git Bash on Windows, 44
diagnostics extension, 223
generating SSH public keys, 37–39
Linux and, 35
Windows Server 2016 containers, 179
Windows Server Containers, 179
Windows utilities, installing, 102–103
WORKDIR, 60
workloads, Apache Mesos, 150
X-Y
XML, ASP.NET, 267
Z
Zookeeper, 158
Zuul, 196–197
Microservices with DOCKER on Microsoft® Azure
This book is part of InformIT’s exciting new Content Update Program, which provides automatic
content updates for major technology improvements!
As significant updates are made to Docker and Azure, sections of this book will be updated or
new sections will be added to match the updates to the technologies.
The updates will be delivered to you via a free Web Edition of this book, which can be
accessed with any Internet connection.
This means your purchase is protected from immediately outdated information!
For more information on InformIT’s Content Update program, see the inside back cover or go to
informit.com/cup
If you have additional questions, please email our Customer Service department at
[email protected].
Microservices with DOCKER on Microsoft® Azure
Instructions to access your free copy of Microservices with Docker on Microsoft Azure Web
Edition as part of the Content Update Program:
If you purchased your book from informit.com, your free Web Edition can be found under the Digital
Purchases tab on your Account page.
If you have not registered your book, follow these steps:
1. Go to informit.com/register.
2. Sign in or create a new account.
3. Enter ISBN: 9780672337499.
4. Answer the questions as proof of purchase.
5. Click on the “Digital Purchases” tab on your Account page to access your free Web Edition.
For more information about the Content Update Program, visit informit.com/cup or
email our Customer Service department at [email protected].
Code Snippets