Network in Python
1 Introduction
The software development process is also known as the software development life cycle (SDLC).
This process is more than just coding. It also includes gathering requirements, creating a proof of
concept, testing, and fixing bugs.
In this module, we discuss phases in the software development life cycle followed by
methodologies for managing the real-world requirements of software projects. The
methodologies discussed here begin with the waterfall method, which emphasizes up-front
planning. Other methodologies in this topic, such as Agile and Lean, are more dynamic and
adaptive than the waterfall method.
The software development life cycle (SDLC) is the process of developing software, starting from
an idea and ending with delivery. This process consists of six phases. Each phase takes input
from the results of the previous phase. There is no standard SDLC, so the exact phases can vary,
but the most common are:
Phase 1. Requirements and Analysis
Phase 2. Design
Phase 3. Implementation
Phase 4. Testing
Phase 5. Deployment
Phase 6. Maintenance
Historically, development teams usually followed these phases in order in the waterfall method.
The goal of waterfall was to complete each phase of SDLC down to the last detail before
proceeding to the next, never returning to a prior phase, and writing everything down along the
way.
Although the waterfall method is still widely used today, it's gradually being superseded by
more adaptive, flexible methods that produce better software, faster, with less pain. These
methods are collectively known as “Agile development.”
It is important to understand that the SDLC can be applied many different ways. Its phases can
be repeated, and the order reversed. Individual phases can be performed at many levels in
parallel (for example, requirements can be gathered separately for user interface details, back-
end integrations, operating and performance parameters, etc.).
We'll look at the phases of the SDLC in greater detail, and in their classic order (just remember:
this is a description, not a prescription).
Requirements and Analysis
The goal of the requirements and analysis phase is to answer several tiers of questions. These
begin with exploring stakeholders' current situation, needs and constraints, present infrastructure,
etc. They are determining what problem(s) this new software needs to solve.
After the problem is better-defined, more concrete issues can be explored to determine where,
and in what context, the new software will need to live.
When answers to such questions are compiled, it's time to begin exploring more precise
requirements by focusing on desired features and user experience (UX).
Finally, the team begins assessing architectural options for building the software itself. For most
enterprise applications, this means many iterations of defining requirements for the software's
front end and back end. You will also need to provide points of integration for other applications,
as well as services for lifecycle management.
After gathering the requirements, the team analyzes the results to determine the following:
Is it possible to develop the software according to these requirements, and can it be done
on-budget?
Are there any risks to the development schedule, and if so, what are they?
How will the software be tested?
When and how will the software be delivered?
At the conclusion of this phase, the classic waterfall method suggests creating a Software Requirement Specification (SRS) document, which states the software requirements and scope and is confirmed meticulously with stakeholders.
Design
The design phase classically takes the SRS document from the Requirements & Analysis phase
as input. During the design phase, the software architects and developers design the software
based on the provided SRS.
At the conclusion of the design phase, the team creates High-Level Design (HLD) and Low-
Level Design (LLD) documents. The HLD gives a "10,000-foot view" of the proposed software.
It describes the general architecture, components and their relationships, and may provide
additional detail. The LLD, based on the HLD document, describes in much greater detail the
architecture of individual components, the protocols used to communicate among them, and
enumerates required classes and other aspects of the design.
Implementation
The implementation phase classically takes the HLD and the LLD from the design phase as an
input.
The implementation phase is often called the coding or development phase. During this phase,
the developers take the design documentation and develop the code according to that design. All
of the components and modules are built during this phase, which makes implementation the
longest phase of the life cycle. During this phase, testing engineers are also writing the test plan.
At the conclusion of the implementation phase, functional code that implements all of the
customer's requirements is ready to be tested.
Testing
The testing phase classically takes the software code from the implementation phase as input.
During this phase, the test engineers take the code and install it into the testing environment so
they can follow the test plan. The test plan is a document that includes a list of every single test
to be performed in order to cover all of the features and functionality of the software, as specified
by the customer requirements. In addition to functional testing, the test engineers also perform:
Integration testing
Performance testing
Security testing
When code doesn’t pass a test, the test engineer identifies the bug, which gets passed to the
developers. After the bug is fixed, the test engineers will re-test the software. This back and forth
between the test and development engineers continues until all of the code passes all of the tests.
At the conclusion of the testing phase, a high quality, bug-free, working piece of software is
ready for production, in theory. In practice, this rarely happens. Developers have learned how to
test more efficiently, how to build testing into automated workflows, and how to test software at
many different levels of detail and abstraction: from the tiniest of low-level function definitions
to large-scale component aggregations. They've also learned that software is never bug-free, and
must instead be made observable, tested in production, and made resilient so it can remain
available and perform adequately, despite issues.
Deployment
The deployment phase takes the software from the testing phase as input. During the deployment
phase, the software is installed into the production environment. If there are no deployment
issues, the product manager works with the architects and qualified engineers to decide whether
the software is ready to be released.
At the end of the deployment phase, the final piece of software is released to customers and other
end users.
Maintenance
The maintenance phase takes the deployed software as input. During this phase, the team provides support for customers, fixes bugs found in production, works on improvements, and gathers new requests from the customer.
At the conclusion of the maintenance phase, the team is ready to work on the next iteration and version of the software, which brings the process back to the beginning of the SDLC and the requirements and analysis phase.
Several methodologies have been developed to manage the SDLC in practice. The three covered here are:
Waterfall
Agile
Lean
Each methodology has its own pros and cons. Deciding on which to use depends on many
factors, such as the type of project, the length of the project, and the size of the team.
Waterfall is the traditional software development model, and is still practiced to this day. The
waterfall model is nearly identical to the software development life cycle because each phase
depends on the results of the previous phase.
With waterfalls, the water flows in one direction only. With the waterfall method, the process
goes in one direction, and can never go backwards. Think of it like a relay race where one runner
has to finish their distance before handing the baton off to the next person, who is waiting for
them. The baton always goes in a forward motion.
It is said that the original waterfall model was created by Winston W. Royce. His original model
consisted of seven phases:
System requirements
Software requirements
Analysis
Program Design
Coding
Testing
Operations
As you can see, the waterfall model is really just one iteration of the software development life
cycle. There are now many variations of the phases in the waterfall model, but the idea that each
phase cannot overlap and must be completed before moving on remains the same.
Because the outcome of each phase is critical for the next, one wrong decision can derail the
whole iteration; therefore, most implementations of the waterfall model require documentation
summarizing the findings of each phase as the input for the next phase. If the requirements
change during the current iteration, those new requirements cannot be incorporated until the next
waterfall iteration, which can get costly for large software projects, and cause significant delays
before requested features are made available to users.
The Agile method is flexible and customer-focused. Although methodologies similar to Agile
were already being practiced, the Agile model wasn't official until 2001, when seventeen
software developers joined together to figure out a solution to their frustrations with the current
options and came up with the Manifesto for Agile Software Development, also known as the
Agile Manifesto.
Agile Manifesto
1. Customer focus - Our highest priority is to satisfy the customer through early and
continuous delivery of valuable software.
2. Embrace change and adapt - Welcome changing requirements, even late in
development. Agile processes harness change for the customer's competitive advantage.
3. Frequent delivery of working software - Deliver working software frequently, from a
couple of weeks to a couple of months, with a preference to the shorter timescale.
4. Collaboration - Business people and developers must work together daily throughout the
project.
5. Motivated teams - Build projects around motivated individuals. Give them the
environment and support they need, and trust them to get the job done.
6. Face-to-face conversations - The most efficient and effective method of conveying
information to and within a development team is face-to-face conversation.
7. Working software - Working software is the primary measure of progress.
8. Work at a sustainable pace - Agile processes promote sustainable development. The
sponsors, developers, and users should be able to maintain a constant pace indefinitely.
9. Agile environment - Continuous attention to technical excellence and good design
enhances agility.
10. Simplicity - The art of maximizing the amount of work not done is essential.
11. Self-organizing teams - The best architectures, requirements, and designs emerge from
self-organizing teams.
12. Continuous Improvement - At regular intervals, the team reflects on how to become
more effective, then tunes and adjusts its behavior accordingly.
The Agile Manifesto, by design, wasn't precise about how Agile should work. After forging the
Manifesto, its originators (and many others) kept evolving these ideas, absorbing new ideas from
many sources, and testing them in practice. As a result, over the past few decades, many takes on
Agile have emerged, and some have become widely popular. These include:
Agile Scrum - In rugby, the term scrum describes a point in gameplay where players crowd together and try to gain possession of the ball. Scrum focuses on small, self-organizing teams that meet daily for short periods and work in iterative sprints, constantly adapting deliverables to meet changing requirements.
Lean - Based on Lean Manufacturing, the Lean method emphasizes elimination of
wasted effort in planning and execution, and reduction of programmer cognitive load.
Extreme Programming (XP) - Compared with Scrum, XP is more prescriptive about
software engineering best-practices, and more deliberately addresses the specific kinds of
quality-of-life issues facing software development teams.
Feature-Driven Development (FDD) - FDD prescribes that software development
should proceed in terms of an overall model, broken out, planned, designed, and built
feature-by-feature. It specifies a process of modeling used to estimate and plan feature
delivery, and a detailed set of roles for both the core development team and support
people.
Of the methodologies described above, Agile Scrum is probably the most popular. We'll discuss
some Scrum terms and concepts below that have been more or less universally adopted by the
Agile community across all methodologies.
Sprints
In the Agile model, the software development life cycle still applies. Unlike the waterfall
method, where there is one long iteration of the SDLC, Agile is many quick iterations of the
SDLC.
These quick iterations are called sprints, and the purpose of sprints is to accomplish the frequent
delivery of working software principle of the Agile manifesto. A sprint is a specific period of
time (time-boxed) which is usually between two weeks and four weeks, but preferably as short as
possible. The duration of the sprint should be determined before the process begins and should
rarely change.
During a sprint, each team takes on as many tasks, also known as user stories, as they feel they
can accomplish within the time-boxed duration of the sprint. When the sprint is over, the
software should be working and deliverable, but that doesn't necessarily mean that it will be
delivered; a sprint doesn't always lead to a release, but Agile requires the software to remain
deliverable.
Backlog
It is the role of the product owner to create the backlog. This backlog is made up of all of the
features for the software, in a prioritized list. The features in the backlog are a result of the
Requirements & Analysis phase, and include features that won't necessarily be in the immediate
release. New features can be added to the backlog at any time, and the product owner can
reprioritize the backlog based on customer feedback.
User stories
When a feature gets close to the top of the priority list, it gets broken down into smaller tasks
called user stories. Each user story should be small enough that a single team can finish it within
a single sprint. If it's too large to be completed in a single sprint, the team should break it down
further. Because the software must be deliverable at the end of each sprint, a user story must also
abide by those rules.
A user story is a simple statement of what a user (or a role) needs, and why. The suggested template for a user story is:
As a <user|role>, I want <goal>, so that <reason>.
Completing a user story requires completing all of the phases of the SDLC. The user story itself
should already have the requirements defined by the product owner, and the team taking on the
user story needs to come up with a design for the task, implement it and test it.
Scrum Teams
Every day, each scrum team should have a daily standup. A standup is a meeting that should last
no longer than 15 minutes, and should take place at the same time every day. In fact, it's called a
"standup" because ideally it should be short enough for the team to accomplish it without having
to sit down.
The goal of the daily standup is to keep all team members in sync with what each person has
accomplished since the last standup, what they are going to work on until the next standup, and
what obstacles they have that may be preventing them from finishing their task. The scrum
master facilitates these standups, and their job is to report and/or help remove obstacles.
Lean software development is based on Lean Manufacturing principles, which are focused on
minimizing waste and maximizing value to the customer.
In its simplest form, Lean Software Development delivers only what the customers want. In the
book Lean Software Development: An Agile Toolkit, there are seven principles for Lean:
Eliminate waste
Amplify learning
Decide as late as possible
Deliver as fast as possible
Empower the team
Build integrity in
Optimize the whole
Eliminate waste
Waste is anything that does not add value for customers. The definition of value is subjective,
however, because it's determined by the customer. Eliminating waste is the most fundamental
lean principle, the one from which all the other principles follow.
To eliminate waste, you must be able to understand what waste is. Waste is anything that does
not add direct value to the customer. There are seven wastes of software development:
Partially done work
Partially done work does not bring any value to the customer until it is finished, and it can become obsolete before it is ever completed. It also ties up resources that could be spent on work that does deliver value.
Extra processes
Extra processes, such as unnecessary paperwork and approvals, consume effort without adding value for the customer; they are a waste for much the same reasons as partially done work.
Extra features
If the customer didn't ask for it, it doesn't bring them value. It might be nice to have, but it's
better to use the resources to build exactly what customers want.
Task switching
Humans need time to switch their mind to focus on another task, and that time spent switching
contexts is a waste. Task switching wastes a resource's (person’s) time, so it's a waste to assign a
resource to multiple projects.
Waiting
Many people would agree that by definition, waiting is a big waste of time, so any type of delay is a waste. Examples of delays in software development include delays in starting a project, delays in staffing, delays waiting for reviews and approvals, and delays in testing and deployment.
Motion
Lean software development defines motion for two things: people and artifacts. Motion for
people is when people need to physically walk from their desk to another team member to ask a
question, collaborate, and so on. When they move from their desk, it is not only the time it takes
for them to get to the destination that is a waste, but also the task switching time.
The motion of artifacts is when documents or code are moved from one person to another. Most
of the time, the document doesn't contain all of the information for the next person, so either that
person has to gather the information again, or the hand-off requires time, which is a waste.
Defects
Unnoticed defects (otherwise known as bugs) are a waste because of the impact of the defect.
The defect can cause a snowball effect with other features, so the time it takes to debug it is a
waste. Also, for a customer, the value of the software is reduced when they run into issues, so the
feature ends up being a waste.
Amplify learning
To be able to fine-tune software, there should be frequent, short iterations of working software. With more iterations, feedback arrives sooner, problems are caught earlier, and the team can apply what it learns to the next iteration.
Decide as late as possible
When there is uncertainty, it is best to delay the decision-making until as late as possible in the
process, because it's better to base decisions on facts rather than opinion or speculation.
Also, when a decision isn't yet made, the software is built to be flexible in order to accommodate
the uncertainty. This flexibility enables developers to make changes when a decision is made--or
in the future, if requirements change.
Deliver as fast as possible
The faster working software is delivered, the sooner the customer can give feedback and the less partially done work piles up waiting for a release. You'll notice that each of these reasons practices at least one of the previously discussed Lean principles.
Empower the team
Each person has their own expertise, so let them make decisions in their area of expertise. When
combined with the other principles such as eliminating waste, making late decisions, and fast
deliveries, there isn't time for others to make decisions for the team.
Build Integrity In
Integrity for software is when the software addresses exactly what the customer needs. Another
level of integrity is that the software maintains its usefulness for the customer.
Optimize the whole
Although one of the principles is empowering the team, each expert must take a step back and
see the big picture. The software must be built cohesively. The value of the software will suffer
if each expert only focuses on their expertise and doesn't consider the ramifications of their
decisions on the rest of the software.
In this lab, you review Python installation, PIP, and Python virtual environments.
3.2.1 Introduction
Software design patterns are best practice solutions for solving common problems in software
development. Design patterns are language-independent. This means that they can be
implemented in any contemporary, general-purpose computing language, or in any language that
supports object-oriented programming. Often, popular design patterns encourage creation of add-
on frameworks that simplify implementation in widely-used languages and paradigms.
Artisans have always shared time-tested methods and techniques for problem-solving. Calling
these things "design patterns" was first done systematically in the field of architecture and urban
planning. Architectural patterns were organized by abstract class of problem solved, urging
designers to recognize common core themes shared across numerous divergent contexts. For
example, a bus stop and a hospital waiting room are both places in which people wait, so both
can usefully implement features of the pattern A PLACE TO WAIT.
This way of thinking about patterns was quickly picked up by pioneers in object-oriented coding
and Agile software development. In 1994, Erich Gamma, Richard Helm, Ralph Johnson, and
John Vlissides (known collectively as the Gang of Four (GoF)) published a book called Design
Patterns - Elements of Reusable Object-Oriented Software. We'll offer a broad view of the
patterns they identified and documented.
Program to an interface, not an implementation. Tightly coupling mainline program logic with
implementations of specific functionality tends to make code hard to understand and maintain.
Experience has shown that it works better to loosely-couple logical layers by using abstract
interfaces. For instance, the mainline code calls functions and methods in a generic way. Lower-
level functions implement matching interfaces for the functionality they provide, ensuring, for
example, that all serialization functions used in a program are called in similar fashion.
Object-oriented languages like Java formalize these ideas. They enable explicit declaration of
interfaces that classes can implement. An interface definition is basically a collection of function
prototypes, defining names and types for functions and parameters that higher-level logic might
use to invoke a range of classes. For example, the interface to a range of 'vehicle' classes (e.g.,
class 'car,' 'motorcycle,' 'tractor') might include start_engine(), stop_engine(), accelerate(), and
brake() prototypes.
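To make the idea concrete, here is a minimal Python sketch of programming to an interface, assuming hypothetical Vehicle, Car, and Motorcycle classes (these names are illustrative, not from any particular library):

from abc import ABC, abstractmethod

class Vehicle(ABC):
    # Abstract interface: every vehicle class agrees to provide these methods.
    @abstractmethod
    def start_engine(self): ...

    @abstractmethod
    def stop_engine(self): ...

class Car(Vehicle):
    def start_engine(self):
        return "Car engine started"

    def stop_engine(self):
        return "Car engine stopped"

class Motorcycle(Vehicle):
    def start_engine(self):
        return "Motorcycle engine started"

    def stop_engine(self):
        return "Motorcycle engine stopped"

def test_drive(vehicle: Vehicle):
    # The mainline code depends only on the Vehicle interface,
    # not on any concrete implementation.
    print(vehicle.start_engine())
    print(vehicle.stop_engine())

test_drive(Car())
test_drive(Motorcycle())

Because test_drive() relies only on the interface, a new vehicle class can be added later without changing the mainline code.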
Favor object composition over class inheritance. Object-oriented languages enable inheritance:
more generalized base classes can be inherited by derived classes. Thus a 'duck' class might
inherit substantial functionality from a 'bird' class. This requires, however, that the bird class
implement a very wide range of methods, most of which may not be used by a specific derived
class.
The principle of favoring composition over inheritance suggests that a better idea may be to
favor implementing a specific class (class - duck) by creating only required unique subclasses
(class - quack) along with abstract interfaces (interface - 'ducklike') to classes (class - fly, class -
swim) that can be shared widely in similar fashion (class - penguin implements interface
'penguinlike,' enabling sharing of class 'swim,' but not 'fly'). Organizing software in this way has
proven to be most flexible, ultimately easier to maintain, and encourages reuse of code.
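As a rough Python sketch of composition over inheritance (the class names here are illustrative assumptions based on the duck and penguin example above), shared behaviors are small classes that each bird composes rather than inherits from one large base class:

class Fly:
    def move(self):
        return "flying"

class Swim:
    def move(self):
        return "swimming"

class Quack:
    def speak(self):
        return "quack"

class Duck:
    # Composes only the behaviors it needs instead of inheriting a huge 'bird' class.
    def __init__(self):
        self.flyer = Fly()
        self.swimmer = Swim()
        self.voice = Quack()

class Penguin:
    # Reuses Swim, but deliberately has no Fly behavior.
    def __init__(self):
        self.swimmer = Swim()

duck = Duck()
penguin = Penguin()
print(duck.flyer.move(), duck.swimmer.move(), duck.voice.speak())
print(penguin.swimmer.move())

Sharing behavior classes this way keeps each piece small and lets new classes mix and match only the capabilities they actually have.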
Software design patterns have already been proven to be successful, so using them can speed up
development because developers don't need to come up with new solutions and go through a
proof of concept to make sure they work.
In their Design Patterns book, the Gang of Four divided patterns into three main categories:
Creational - Patterns used to guide, simplify, and abstract software object creation at
scale.
Structural - Patterns describing reliable ways of using objects and classes for different
kinds of software projects.
Behavioral - Patterns detailing how objects can communicate and work together to meet
familiar challenges in software engineering.
They listed a total of 23 design patterns, which are now considered the foundation of newer
design patterns. Most of these patterns, at some level, express basic principles of good object-
oriented software design.
Let's dive deeper into two of the most commonly used design patterns: the Observer design
pattern (a Behavioral design pattern), and Model-View-Controller (MVC).
The observer design pattern is a subscription notification design that lets objects (observers or
subscribers) receive events when there are changes to an object (subject or publisher) they are
observing. Examples of the observer design pattern abound in today's applications. Think about
social media. Users (observers) follow other users (subjects). When the subject posts something
on their social media, it notifies all of the observers that there is a post, and the observers look at
that update.
To implement this subscription mechanism:
1. The subject must have the ability to store a list of all of its observers.
2. The subject must have methods to add and remove observers.
3. All observers must implement a callback to invoke when the publisher sends a
notification, preferably using a standard interface to simplify matters for the publisher.
This interface needs to declare the notification mechanism, and it needs to have
parameters for the publisher to send the necessary data to the observer.
The figure shows the subscription sequence: observers call subscribe on the subject, and when the subject's state changes, the subject calls notifyObservers, which invokes notify on each observer.
1. An observer adds itself to the subject's list of observers by invoking the subject's method
to add an observer.
2. When there is a change to the subject, the subject notifies all of the observers on the list
by invoking each observer's callback and passing in the necessary data.
3. The observer's callback is triggered, and therefore executed, to process the notification.
4. Steps 2 and 3 continue whenever there is a change to the subject.
5. When the observer is done receiving notifications, it removes itself from the subject's list
of observers by invoking the subject's method to remove an observer.
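A minimal Python sketch of this mechanism is shown below; the Subject and Observer class names and the update() callback are illustrative choices, not a prescribed API:

class Observer:
    def __init__(self, name):
        self.name = name

    def update(self, data):
        # Callback the subject invokes when it sends a notification.
        print(f"{self.name} received: {data}")

class Subject:
    def __init__(self):
        self._observers = []              # 1. store the list of observers

    def subscribe(self, observer):        # 2. add an observer
        self._observers.append(observer)

    def unsubscribe(self, observer):      # 2. remove an observer
        self._observers.remove(observer)

    def notify_observers(self, data):     # 3. invoke each observer's callback
        for observer in self._observers:
            observer.update(data)

subject = Subject()
alice = Observer("alice")
bob = Observer("bob")
subject.subscribe(alice)
subject.subscribe(bob)
subject.notify_observers("new post published")     # both observers are notified
subject.unsubscribe(bob)
subject.notify_observers("second post published")  # only alice is notified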
The benefit of the observer design pattern is that observers receive data from the subject as soon as a change occurs. A subscription mechanism generally performs better than alternatives such as polling, because data is sent only when there is actually a change.
The Model-View-Controller (MVC) design pattern separates an application into three components: the model, the view, and the controller. The figure shows the flow between them: user input goes from the user to the controller, the controller passes manipulated data to the model, the model passes updated data to the view, and the view displays the data to the user.
Components
Model - The model is the application's data structure and is responsible for managing the
data, logic and rules of the application. It gets input from the controller.
View - The view is the visual representation of the data. There can be multiple
representations of the same data.
Controller - The controller is like the middleman between the model and view. It takes
in user input and manipulates it to fit the format for the model or view.
The benefit of the Model-View-Controller design pattern is that each component can be built in
parallel. Because each component is abstracted, the only information each component needs is
the input and output interface for the other two components. Components don't need to know
about the implementation within the other components. What's more, because each component is
only dependent on the input it receives, components can be reused as long as the other
components provide the data according to the correct interface.
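The following is a deliberately simplified Python sketch of the three components working together; the TaskModel, TaskView, and TaskController names and the task-list example are assumptions made for illustration, not a standard framework:

class TaskModel:
    # Manages the application's data and rules.
    def __init__(self):
        self.tasks = []

    def add_task(self, task):
        self.tasks.append(task)
        return self.tasks

class TaskView:
    # Visual representation of the data (plain text output here).
    def render(self, tasks):
        print("Current tasks:")
        for number, task in enumerate(tasks, start=1):
            print(f"  {number}. {task}")

class TaskController:
    # Middleman: takes user input, updates the model, and asks the view to display.
    def __init__(self, model, view):
        self.model = model
        self.view = view

    def handle_user_input(self, raw_input):
        updated_data = self.model.add_task(raw_input.strip())
        self.view.render(updated_data)

controller = TaskController(TaskModel(), TaskView())
controller.handle_user_input("  write unit tests ")
controller.handle_user_input("review the pull request")

Because the view only needs a list of tasks, it could be swapped for a web page or GUI without changing the model or controller.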
Version Control Systems
Version control, also called version control systems, revision control or source control, is a way
to manage changes to a set of files in order to keep a history of those changes. Think of all the
times you have made a copy of a file before modifying it, just in case you want to revert to the
original. Version control handles all of that for you.
Version control systems store the master set of files and the history of changes in a repository,
also known as a repo. In order to make a change to a file, an individual must get a working copy
of the repository onto their local system. The working copy is the individual's personal copy of
the files, where they can make changes without affecting others. Some of the benefits of version
control are:
It enables collaboration - Multiple people can work on a project (a set of files) at the
same time without overriding each other's changes.
Accountability and visibility - Know who made what changes, when they made those
changes and why.
Work in isolation - Build new features independently without affecting the existing
software.
Safety - Files can be reverted when a mistake is made.
Work anywhere - Files are stored in a repository, so any device can have a working
copy.
There are three types of version control systems:
Local
Centralized
Distributed
Just like the name states, a Local Version Control System (LVCS) tracks files within a local
system. A local version control system replaces the "make a copy of the file before editing
further" scenario. The focus of a local version control system is mostly to be able to revert back
to a previous version. This type of version control isn't meant to address most of the benefits
listed above.
Local version control systems use a simple database to keep track of all of the changes to the file.
In most cases, the system stores the delta between the two versions of the file, as opposed to the
file itself. When the user wants to revert the file, the delta is reversed to get to the requested
version.
The figure shows a local version control system: the local computer's file system contains File A, and a version database alongside it tracks versions 1, 2, and 3 of that file.
Centralized Version Control System
The figure for the centralized version control system shows a server host containing the single version database (versions 1, 2, and 3), with two client computers (Computer 1 and Computer 2) that check files out from the server into their own file systems.
A Centralized Version Control System (CVCS) uses a server-client model. The repository (also
known as the repo), which is the only copy of the set of files and history, is stored in a
centralized location, on a server. Whenever an individual wants to make a change to a file, they
must first get a working copy of the file from the repository to their own system, the client.
In a centralized version control system, only one individual at a time can work on a particular
file. In order to enforce that restriction, an individual must checkout the file, which locks the file
and prevents it from being modified by anyone else. When the individual is done making
changes, they must checkin the file, which applies the individual's changes to the master copy in
the repo, tags a new version, and unlocks the file for others to make changes.
Distributed Version Control System
The figure for the distributed version control system shows a server host and two client computers (Computer 1 and Computer 2), each holding its own full copy of the version database (versions 1, 2, and 3) alongside its file system.
A Distributed Version Control System (DVCS) is a peer-to-peer model. The repository can be
stored on a client system, but it is usually stored in a repository hosting service. When an
individual wants to make a change to a file, they must first clone the full repository to their own
system. This includes the set of files as well as all of the file history. The benefit of this model is
that the full repository will be on multiple systems and can be used to restore the repository in
the repository hosting service if an event such as data corruption occurs.
In a distributed version control system, every individual can work on any file, even at the same
time, because the local file in the working copy is what is being modified. As a result, locking
the file is not necessary. When the individual is done making the changes, they push the file to
the main repository that is in the repository hosting service, and the version control system
detects any conflicts between file changes.
3.3.2 Git
At the time of this writing, the most popular version control system in use is Git. Git is an open
source implementation of a distributed version control system that is currently the latest trend in
software development. Git:
Is easy to learn
Can handle all types of projects, including large enterprise projects
Has fast performance
Is built for collaborative projects
Is flexible
Has a small footprint
Has all the benefits of a distributed version control system
Is free
A Git client must be installed on a client machine. It is available for MacOS, Windows, and
Linux/Unix. Though some Git clients come with a basic GUI, Git's focus is on the command line
interface, about which we will go into detail later.
One key difference between Git and other version control systems is that Git stores data as
snapshots instead of differences (the delta between the current file and the previous version). If
the file does not change, git uses a reference link to the last snapshot in the system instead of
taking a new and identical snapshot.
The figure compares a file view and a version view of the same repository across four check-ins. In the file view, File A changes in versions 2 and 4 (A1, A2, A3), File B changes in versions 3 and 4 (B1, B2, B3), and File C never changes after version 1 (C1). In the version view, Git stores a new snapshot for each file that changed in a given version and a reference link to the previous snapshot for each file that did not: version 3 references A2, version 2 references B1, and versions 2, 3, and 4 all reference C1.
Git's Three Stages
The figure shows Git's three stages: the repository (.git directory), the staging area, and the working directory. The typical flow is to get a copy of the repository (clone), make changes to the files in the working directory, add the changes to the staging area, and then update the repository with those changes (commit).
REPOSITORY
Because Git is a distributed version control system, each client has a full copy of the repository.
When a project becomes a Git repository, a hidden .git directory is created, and it is essentially
the repository. The .git directory holds metadata such as the files (compressed), commits, and
logs (commit history).
WORKING DIRECTORY
The working directory is the folder that is visible in the filesystem. It is a copy of the files in the
repository. These files can be modified, and the changes are only visible to the user of the client.
If the client's filesystem gets corrupted, these changes will be lost, but the main repository
remains intact.
STAGING AREA
The staging area stores the information about what the user wants added/updated/deleted in the
repository. The user does not need to add all of their modified files to the stage/repo; they can
select specific files. Although it is called an area, it is actually just an index file located in the .git
directory.
Three States
Since there are three stages in Git, there are three matching states for a Git file:
committed - This version of the file has been saved in the repository (.git directory).
modified - The file has changed but has not been added to the staging area or committed
to the repository.
staged - The modified file is ready to be committed to the repository.
3.3.3 Local vs. Remote Repositories
A local repository is stored on the filesystem of a client machine, which is the same one on
which the git commands are being executed.
A remote repository is stored somewhere other than the client machine, usually a server or
repository hosting service. Remote repositories are optional and are typically used when a project
requires collaboration between a team with multiple users and client machines.
Remote repositories can be viewed as the "centralized" repository for Git, but that does not make
it a CVCS. A remote repository with Git continues to be a DVCS because the remote repository
will contain the full repository, which includes the code and the file history. When a client
machine clones the repository, it gets the full repository without needing to lock it, as in a CVCS.
After the local repository is cloned from the remote repository or the remote repository is created
from the local repository, the two repositories are independent of each other until the content
changes are applied to the other branch through a manual Git command execution.
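For example, assuming a hypothetical remote URL and the default remote alias origin, connecting a local repository to a remote one and exchanging changes might look like this:

$ git remote add origin https://github.com/example-user/example-repo.git
$ git push origin master     # apply local commits to the remote repository
$ git pull origin master     # apply remote commits to the local repository

The git push and git pull commands used here are covered in more detail later in this section.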
Branches
The figure shows four branches: Master, Development, Feature 1, and Feature 2. Commit C1 on the Master branch is the starting point of the Development branch (commit C2). From Development, commit C3 starts the Feature 1 branch (commits C4 and C5), which is merged back into Development before the v1.1 release, and commit C6 starts the Feature 2 branch (commit C7). Development is merged into Master at the v1.0 and v1.1 release points.
Branching enables users to work on code independently without affecting the main code in the
repository. When a repository is created, the code is automatically put on a branch called Master.
Users can have multiple branches and those are independent of each other. Branching enables
users to:
Work on a feature independently while still benefitting from a distributed version control
system
Work on multiple features concurrently
Experiment with code ideas
Keep production, development, and feature code separately
Keep the main line of code stable
Branches can be local or remote, and they can be deleted. Local branches make it easy to try
different code implementations because a branch can be used if it is successful and deleted if it is
not. Merging a branch back to the parent branch is not mandatory.
Unlike other version control systems, Git's branch creation is lightweight, and switching between
branches is almost instantaneous. Although branches are often visually drawn as separate paths,
Git branches are essentially just pointers to the appropriate commit.
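For example, creating a branch and switching to it takes two quick commands (feature-1 is a hypothetical branch name):

$ git branch feature-1       # create a new branch pointing at the current commit
$ git checkout feature-1     # switch the working directory to that branch

Deleting the branch later with git branch -d feature-1 is just as lightweight.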
Branches are like a fork in the road, where it starts with the code and history at the point of
diversion, then builds its own path with new commits independently. As a result, branches have
their own history, staging area, and working directory. When a user goes from one branch to
another, the code in their working directory and the files in the staging area change accordingly,
but the repository (.git) directories remain unchanged.
Wherever possible, you should try to use branches rather than updating the code directly to the
master branch in order to prevent accidental updates that break the code.
Working with Git is often associated with GitHub, but Git and GitHub are not the same. Git is an implementation of distributed version control and provides a command line interface. GitHub, which is owned by Microsoft, is a repository hosting service built around Git.
In addition to providing the distributed version control and source code management
functionality of Git, GitHub also provides additional features such as:
code review
documentation
project management
bug tracking
feature requests
To enable project owners to manage in such widely-disparate scenarios, GitHub introduced the
concept of the "pull request". A pull request is a way of formalizing a request by a contributor to
review changes such as new code, edits to existing code, etc., in the contributor's branch for
inclusion in the project's main or other curated branches. The pull request idiom is now
universally-implemented in Git hosting services.
GitHub is not the only repository hosting service using Git; others include GitLab and Bitbucket.
Setting up Git
After installing Git to the client machine, you must configure it. Git provides a git config
command to get and set Git's global settings, or a repository's options.
To configure Git, use the --global option to set the initial global settings.
Using the --global option will write to the global ~/.gitconfig file.
For each user to be accountable for their code changes, each Git installation must set the user's
name and email. To do so, use the following commands, where <user's name> and <user's email> are the user's name and email address, respectively:
$ git config --global user.name "<user's name>"
$ git config --global user.email "<user's email>"
Any project (folder) in a client's local filesystem can become a Git repository. Git provides a git
init command to create an empty Git repository, or make an existing folder a Git repository.
When a new or existing project becomes a Git repository, a hidden .git directory is created in that
project folder. Remember that the .git directory is the repository that holds the metadata such as
the compressed files, the commit history, and the staging area. In addition to creating the .git
directory, Git also creates the master branch.
To make a new or existing project a Git repository, use the following command, where <project directory> is the absolute or relative path to the new or existing project:
$ git init <project directory>
For a new Git repository, the directory in the provided path will be created first, followed by the
creation of the .git directory.
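For example, assuming a hypothetical project folder named devnet_scripts, either of the following would turn it into a Git repository:

$ git init devnet_scripts

or, from inside the existing folder:

$ cd devnet_scripts
$ git init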
Creating a Git repository doesn't automatically track the files in the project. Files need to be
explicitly added to the newly created repository in order to be tracked. Details on how to add
files to a repository will be covered later.
The figure shows a local computer whose file system contains the project folder (Repo dir). Inside the project folder, the hidden .git directory contains items such as HEAD, index, objects, ref, config, and branches.
With Git, it is easy to get a copy of and contribute to existing repositories. Git provides a git
clone command that clones an existing repository to the local filesystem. Because Git is a
DVCS, it clones the full repository, which includes the file history and remote-tracking branches.
$ git clone <repository> [target directory]
where <repository> is the location of the repository to clone. Git supports four major transport
protocols for accessing the <repository>: Local, Secure Shell (SSH), Git, and HTTP. The
[target directory] is optional and is the absolute or relative path of where to store the cloned files.
If you don't provide the project directory, git copies the repository to the location where you
executed the command.
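For example, cloning over HTTPS from a hypothetical hosting service URL into a folder named my-clone would look like this:

$ git clone https://github.com/example-user/example-repo.git my-clone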
The figure shows a server host and two existing clients (Computer 1 and Computer 2), each with a full copy of the version database (versions 1, 2, and 3) and its own file system. A third computer (Computer 3) clones the repository from the server into a new Repo dir that contains its own .git directory.
When the git clone command is executed, it performs the following steps:
1. Creates the working directory on the local filesystem with the name of the repository or
the specified name, if provided.
2. Creates a .git directory inside the newly created folder.
3. Copies the metadata of the repository to the newly created .git directory.
4. Creates the working copy of the latest version of the project files.
5. Duplicates the branch structure of the cloned, remote repository and enables tracking of
changes made to each branch, locally and remotely — this includes creating and checking
out a local active branch, "forked" from the cloned repository's current active branch.
Please see the official git clone documentation for more details and command line options.
View the Modified Files in the Working Directory
What has been changed in the working directory? What files were added in the staging area? Git
provides a git status command to get a list of files that have differences between the working
directory and the parent branch. This includes newly added untracked files and deleted files. It
also provides a list of files that are in the staging area. Note that the difference is calculated
based on the last commit that the local clone copied from the parent branch in the Git repository,
not necessarily the latest version in the remote repository. If changes have been made since the
repository was cloned, Git won't take those changes into account.
In addition to providing the list of files, the output of the git status command provides additional information such as the branch you are currently working on, whether that branch is ahead of or behind its remote counterpart, and suggested commands for staging, unstaging, or discarding changes.
Please see the official git status documentation for more details and command line options.
Compare Changes Between Files
Want to know what was changed in a file, or the difference between two files? Git provides a git
diff command that is essentially a generic file comparison tool.
Because this command is a generic file difference tool, it includes many options for file
comparison. When using this command, the file does not need to be a Git tracked file.
1. Show changes between the version of the file in the working directory and the last commit that the local clone copied from the parent branch in the Git repository:
$ git diff <file path>
2. Show changes between the version of the file in the working directory and a particular commit from the file history:
$ git diff <commit id> <file path>
3. Show changes between two commits of a file from the file history, where <file path> is the absolute or relative path of the file to compare and each <commit id> is the id of a version of the file to compare:
$ git diff <commit id 1> <commit id 2> <file path>
4. Show the changes between two files in the working directory or on disk:
$ git diff --no-index <file path 1> <file path 2>
Please see the official git diff documentation for more details and command line options.
3.3.7 Adding and Removing Files
After changes have been made to a file in the working directory, it must first go to the staging
area before it can be updated in the Git repository. Git provides a git add command to add file(s)
to the staging area. These files being added to staging can include newly untracked files, existing
tracked files that have been changed, or even tracked files that need to be deleted from the
repository. Modified files don't need to be added to the staging area unless the changes need to be added to the repository.
To add one or more files to the staging area, use the following command, where <file path> is the absolute or relative path of a file to be added to the staging area and can accept wildcards:
$ git add <file path 1> ... <file path n>
To add all of the changes in the current directory and its subdirectories to the staging area, use:
$ git add .
Remember that Git has three stages, so adding files to the staging area is just the first step of the
two-step process to update the Git repository.
Please see the official git add documentation for more details and command line options.
The figure shows the three areas before and after the git add file1 file4 file5 command. Before the add, the working directory contains file 1 through file 6, with file 5 marked for deletion. After the add, the staging area contains file 1, file 4, and file 5 (still marked for deletion), and the working directory shows the remaining files file 2, file 3, and file 6.
There are two ways to remove files from the Git repository.
OPTION 1
Git provides a git rm command to remove files from the Git repository. This command will add
the removal of the specified file(s) to the staging area. It does not perform the second step of
updating the Git repository itself.
Command: git rm
To remove the specified file(s) from the working directory and add this change to the staging area, use the following command, where <file path> is the absolute or relative path of the file to be deleted from the Git repository:
$ git rm <file path 1> ... <file path n>
To add the specified file(s) to be removed to the staging area without removing the file(s) from the working directory, use the following command:
$ git rm --cached <file path 1> ... <file path n>
This command will not work if the file is already in the staging area with changes.
Please see the official git rm documentation for more details and command line options.
The figure shows the three areas before and after a git rm command that removes file 3. Before the remove, the working directory contains file 1 through file 6 and the staging area is empty. After the remove, file 3 no longer appears in the working directory and its removal is recorded in the staging area.
OPTION 2
This option is a two step process. First, use the regular filesystem command to remove the file(s).
Then, add the file to stage using the Git command git add that was discussed earlier.
This two step process is equivalent to using the git rm <file path 1> ... <file path
n> command. Using this option does not allow the file to be preserved in the working directory.
Updating the Local Repository with the Changes in the Staging Area
Remember that in Git, changes to a file go through three stages: working directory, staging area,
and repository. Getting the content changes from the working directory to the staging area can be
accomplished with the git add command, but how do the updates get to the repository? Git
provides a git commit command to update the local repository with the changes that have been
added in the staging area.
This command combines all of the content changes in the staging area into a single commit and
updates the local Git repository. This new commit becomes the latest change in the Git
repository. If there is a remote Git repository, it does not get modified with this command.
To commit the changes from the staging area, use the following command:
$ git commit
It is good software development practice to add a note to the commit to explain the reason for the
changes. To commit the changes from the staging area with a message, use the following
command:
$ git commit -m "<message>"
If the git commit command is executed without any content in the staging area, Git will return a
message, and nothing will happen to the Git repository. This command only updates the Git
repository with the content in the staging area. It will not take any changes from the working
directory.
Please see the official git commit documentation for more details and command line options.
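As a usage example, the following hypothetical sequence stages a new file and then commits it to the local repository with a message (the file name is a placeholder):

$ git add device_inventory.py
$ git commit -m "Add device inventory script"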
The figure shows the three areas before and after the git commit -m "Adding new files" command. Before the commit, the staging area contains file 1, file 4, and file 5 (marked for deletion), and the working directory contains file 2, file 3, and file 6. After the commit, the repository contains file 1 and file 4 (file 5 has been removed), the staging area is empty, and the working directory still contains file 2, file 3, and file 6.
Git provides a git push command to update the remote Git repository with the commits from the local repository. This command will not execute successfully if there is a conflict when adding the changes from the local Git repository to the remote Git repository. Conflicts occur when two people edit the same part of the same file. For example, if you clone the repository, and someone else pushes changes before you, your push may create a conflict. The conflicts must be resolved first before the git push will be successful.
To update the contents from the local repository to a particular branch in the remote repository, use the following command:
$ git push origin <branch name>
To update the contents from the local repository to the master branch of the remote repository, use the following command:
$ git push origin master
Please see the official git push documentation for more details and command line options.
The figure shows the local master branch and the remote origin/master branch before and after a git push command. Before the push, the local master contains commits that the remote does not yet have; after the push, the remote origin/master contains the same commits as the local master.
Local copies of the Git repository do not automatically get updated when another contributor
makes an update to the remote Git repository. Updating the local copy of the repository is a
manual step. Git provides a git pull command to get updates from a branch or repository. This
command can also be used to integrate the local copy with a non-parent branch.
In more detail, when the git pull command is executed, the following steps happen:
1. The local repository (.git directory) is updated with the latest commit, file history, and so
on from the remote Git repository. (This is equivalent to the Git command git fetch.)
2. The working directory and branch are updated with the latest content from step 1. (This is
equivalent to the Git command git merge.)
3. A single commit is created on the local branch with the changes from step 1. If there is a
merge conflict, it will need to be resolved.
4. The working directory is updated with the latest content.
To update the local copy of the Git repository from the parent branch, use the following
command:
$ git pull
Or, to pull from a specific remote and branch:
$ git pull <remote name> <branch name>
Please see the official git pull documentation for more details and command line options.
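Based on the steps described above, a git pull from the parent branch is roughly equivalent to running the following two commands (shown as a sketch, assuming the remote is named origin and the branch is master):
$ git fetch origin
$ git merge origin/master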
The figure shows the local master branch and the remote origin/master branch before and after a git pull command. Before the pull, the local repository only knows about an earlier commit on the remote; after the pull, the local master contains the latest commits from origin/master.
3.3.9 Branching Features
Branches are a very useful feature of Git. As discussed earlier, there are many benefits of using
branches, but one major benefit is that it allows features and code changes to be made
independent of the main code (the master branch).
OPTION 1
Git provides a git branch command to create a new branch. To create a branch, use the following command:
$ git branch <branch name> [<parent branch>]
where <parent branch> is the branch to branch off of and <branch name> is the name to
call the new branch.
When using this command to create a branch, Git will create the branch, but it will not
automatically switch the working directory to this branch. You must use the git switch <branch
name> command to switch the working directory to the new branch.
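For example, using a hypothetical branch named feature-login created from master:
$ git branch feature-login master
$ git switch feature-login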
OPTION 2
Git provides a git checkout command to switch branches by updating the working directory with
the contents of the branch.
To create a branch and switch the working directory to that branch, use the following command:
$ git checkout -b <branch name> [<parent branch>]
where <parent branch> is the branch to branch off of and <branch name> is the name to
call the new branch.
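For example, the same hypothetical feature-login branch can be created and checked out in one step:
$ git checkout -b feature-login master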
Deleting a Branch
To delete a branch, use the following command:
$ git branch -d <branch name>
where <branch name> is the name of the branch to delete. Git will refuse to delete a branch with unmerged changes unless the uppercase -D option is used instead.
Please see the official git branch and git checkout documentation for more details and command
line options.
To get a list of all the local branches, use the following command:
$ git branch
Or:
$ git branch --list
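The output marks the current branch with an asterisk; for example (branch names are illustrative):
  feature-login
* master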
Merging Branches
Branches diverge from one another when they are modified after they are created. To get the
changes from one branch (source) into another (target), you must merge the source branch into
the target branch. When Git merges the branch, it takes the changes/commits from the source
branch and applies them to the target branch. During a merge, only the target branch is modified.
The source branch is untouched and remains the same.
For example:
The figure has a box labeled Before merge command and two lines with arrows to the right
labeled Branch A and Branch B. On Branch A there is a C1 textbox that connects to the C3
textbox on the Branch B line about halfway down the line. C1 connects to C2 and then almost to
the end is the C5 textbox. On the Branch B line, the C3 textbox connects to C4. Below is a
textbox with the words: git merge BranchA. Underneath is another diagram with Branch A and
Branch B lines. Textboxes C1, C2, and C5 connect one after another. C5 connects to C3 on the
Branch B line and C3 connects to C4. At the bottom are the words After merge command.
FAST-FORWARD MERGE
A fast-forward merge is one in which Git can apply the commits from the source branch to the
target branch automatically and without conflicts, which is possible when the target branch has
not received any new commits since the source branch was created. Automatic merges in general
succeed when different files are changed in the branches being merged, and even when the same
file is changed, provided different lines of the file have been modified. A fast-forward merge is
the best-case scenario when performing a merge.
In a fast-forward merge, Git integrates the different commits from the source branch into the
target branch. Because branches are essentially just pointers to commits in the backend, a fast-
forward merge simply moves the pointer that represents the HEAD of the target branch, rather
than adding a new commit.
Note that in order to do a fast-forward merge, Git has to be able to merge all of the existing
commits without encountering any conflicts.
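For example, merging a hypothetical feature-login branch whose commits sit directly ahead of master produces a fast-forward, which Git reports in its output (the commit hashes and file statistics shown here are placeholders):
$ git checkout master
$ git merge feature-login
Updating abc1234..def5678
Fast-forward
 login.py | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)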
MERGE CONFLICTS
Modifying the same file on different branches increases the chances of a merge conflict. A merge
conflict occurs when Git cannot merge automatically because it does not know how to combine
the changes from the two branches for the affected file(s).
When this occurs, the user must manually fix these conflicts before the branches can be merged
together. Manually fixing the conflict adds a new commit to the target branch containing the
commits from the source branch, as well as the fixed merge conflict(s).
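When a conflict occurs, Git marks the conflicting region inside the affected file with conflict markers, and you edit the file to keep the correct content. For example (the file content shown is illustrative):
<<<<<<< HEAD
hostname router-primary
=======
hostname router-backup
>>>>>>> BranchA
After editing the file to resolve the conflict, stage it with git add and complete the merge with git commit.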
The figure shows Branch A and Branch B before and after a merge in which the same file was changed on both branches. After the merge, the target branch contains the commits from the source branch plus an additional commit containing the merge conflict fixes.
Git provides a git merge command to join two or more branches together.
To merge a branch into the client's current branch/repository, use the following command:
$ git merge <branch name>
where <branch name> is the source branch that is being merged into the current branch.
When using the git merge command, the target branch must be the current branch/repository, so
to merge a branch into a branch that is not the client's current branch/repository, first switch to
the target branch and then merge, using the following commands:
$ git checkout <target branch name>
$ git merge <source branch name>
where <target branch name> is the target branch and <source branch name> is the
source branch.
To merge more than one branch into the client's current branch/repository, use the following
command:
$ git merge <branch name 1> <branch name 2> ... <branch name n>
Please see the official git merge documentation for more details and command line options.
The symbols and meanings in a unified diff file are shown below:
--- identifies the original version of the file
+++ identifies the changed version of the file
@@ ... @@ marks a hunk header, showing the starting line number and length of the change in each version
- marks a line removed from the original file
+ marks a line added in the changed file
The signal can be a "+" or a "-" depending on the order of the hashes.
In this format, there are three lines shown above and below the exact changed line for context,
but you can spot the differences by comparing the - line with the + line. One of the changes in
this patch is to change the snapshot file name, replacing ...bgp.json with ...routes.json .
- snapshot_file: "{{ inventory_hostname }}_bgp.json"
+ snapshot_file: "{{ inventory_hostname }}_routes.json"
You can always look at the difference between two files from a GitHub Pull Request as a unified
diff by adding .diff to the GitHub URL.
In this lab, you will explore the fundamentals of the distributed version control system Git,
including most of the features you'd need to know in order to collaborate on a software project.
You will also integrate your local Git repository with a cloud-based GitHub repository.
Coding Basics
It may seem easy to throw code in a file and make it work. But as project size and complexity
grows, and as other developers (and stakeholders) get involved, disciplined methods and best
practices are needed to help developers write better code and collaborate around it more easily.
One thing that most developers agree on is trying to write clean code. But, what is clean code?
Clean code is the result of developers trying to make their code easy to read and understand for
other developers. What constitutes "clean code" is somewhat subjective, but here are some
common principles:
Is the formatting of the code neat, and does it follow generally-accepted formatting
practices for the computer language(s) involved, and/or meet specific requirements of the
institutional, project, and/or team "stylebook"?
o Does it stick with ALL tabs or ALL spaces?
o Does it use the same number of tabs or spaces per indentation level, throughout
the file? (Some languages, such as Python, make this a requirement.)
o Does it have the indentation in the right places?
o Does it use consistent formatting for syntax, such as the location of curly braces
({})?
Are variable, object, and other names used in the code intuitive?
Is the code organized in a way that it makes sense? For example, are declarations
grouped, with functions up top, mainline code at the bottom, or otherwise, depending on
language and context?
Is the code internally documented with appropriate comments?
Does each line of code serve a purpose? Have you removed all unused code?
Is the code written so that common code can be reused, and so that all code can be easily
unit-tested?
In short, clean code features:
standardized formatting and intuitive naming for ease of reading, understanding, and
searching
overall organization to communicate intention and make things easier to find
modularity to facilitate testing and reuse
inline and other comments
other characteristics that help make code self-documenting
In theory, other developers should be able to understand, use, and modify your code without
being able to ask you questions. This accelerates development enormously, enabling reuse,
debugging, security analysis, code review and merging, along with other processes. It lets
longstanding projects (for example, open source projects) incorporate your code with greater
confidence. And it lets organizations keep using your code efficiently, after you have moved on
to other projects or jobs.
By contrast, code that is not clean quickly becomes technical debt. It may be unmaintainable, or
unusable with complete confidence. It is code that needs to be refactored, or removed and
rewritten, which is expensive and time-consuming.
What are some other reasons developers want to write clean code?
1. Clean code is easier to understand, more compact, and better-organized, which tends to
result in code that works correctly (fewer bugs) and performs as required.
2. Clean code, being modular, tends to be easier to test using automated methods such as
unit testing frameworks.
3. Clean code, being standardized, is easier to scan and check using automated tools such as
linters, or command-line tools like grep, awk, and sed.
4. It just looks nicer.
Now that you understand the goals of writing clean code, you can dive deeper into coding best
practices. Specifically, looking at how to break code into methods and functions, modules, and
classes.
Methods and functions share the same concept; they are blocks of code that perform tasks when
executed. If the method or function is not executed, those tasks will not be performed. Although
there are no absolute rules, here are some standard best-practices for determining whether a piece
of code should be encapsulated (in a method or function):
Code that performs a discrete task, even if it happens only once, may be a candidate for
encapsulation. Classic examples include utility functions that evaluate inputs and return a
logical result (for example, compare two strings for length), perform I/O operations (for
example, read a file from disk), or translate data into other forms (for example, parse and
process data). In these cases, you encapsulate for clarity and testability, as well as for
possible future re-use or extension.
Task code that is used more than once should probably be encapsulated. If you find
yourself copying several lines of code around, it probably needs to be a method or
function. You can accommodate slight variations in usage using logic to assess passed
parameters (see below).
The most powerful thing about methods and functions is that they can be written once and
executed as many times as you like. If used correctly, methods and functions will simplify your
code, and therefore reduce the potential for bugs.
Another feature of methods and functions is the ability to execute the code based on the values of
variables passed in on execution. These are called arguments. In order to use arguments when
calling a method or function, the method or function needs to be written to accept these variables
as parameters.
Parameters can be any data type and each parameter in a method or function can have a different
data type. Arguments passed to the method or function must match the data type(s) expected by
the method or function.
Depending on the coding language, some languages require the data type to be defined in the
parameter (so-called 'strongly typed' languages), while some permit this optionally.
Even when parameter type specification is not required, it is usually a good idea. It makes code
easier to reuse because you can more easily see what kind of parameters a method or function
expects. It also makes error messages clearer. Type mismatch errors are easy to fix, whereas a
wrong type passed to a function may cause errors that are difficult to understand, deeper in the
code.
Parameters and arguments add flexibility to methods and functions. Sometimes the parameter is
just a boolean flag that determines whether certain lines of code should be executed in the
method or function. Think of parameters as being the input to the method or function.
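The original example is not reproduced here; a minimal Python sketch of a function that accepts a string, a number, and an object (a dictionary in Python) might look like this, with all names hypothetical:
def add_device(hostname, port, options):
    # hostname is a string, port is an integer, options is an object (dictionary)
    if options.get("verbose"):
        print("Adding " + hostname + " on port " + str(port))
    return {"hostname": hostname, "port": port}

add_device("router1", 830, {"verbose": True})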
The example above is passing this function a string, an integer (or number), and an object
containing a key and a value.
Return Statements
Methods and functions perform tasks, and can also return a value. In many languages, the return
value is specified using the keyword return followed by a variable or expression. This is called
the return statement. When a return statement is executed, the value of the return statement is
returned and any code below it gets skipped. It is the job of the line of code calling the method or
function to grab the value of the return, but it is not mandatory.
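The original comparison example is not reproduced here; a minimal sketch of the idea (the values are hypothetical) contrasts repeating a calculation with calling a function that returns its result:
# Without a function: the same block of code is repeated for each circle.
area1 = 3.14 * 2 * 2
print("Area: " + str(area1))
area2 = 3.14 * 5 * 5
print("Area: " + str(area2))
area3 = 3.14 * 7 * 7
print("Area: " + str(area3))

# With a function: the block is written once, and the return statement
# hands the result back to the caller. Code after return would be skipped.
def circle_area(radius):
    return 3.14 * radius * radius

for radius in [2, 5, 7]:
    print("Area: " + str(circle_area(radius)))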
Notice that the version of code that uses functions, parameters, and arguments results in no
duplicate code. Also, by using functions, you are able to label blocks of code, which make their
purposes more understandable. If this example were more complicated and there were a lot of
lines of code within each function, having the blocks of code duplicated three times in the file
would make it much harder to understand.
If methods and functions share the same concept, why are they named differently? The
difference between methods and functions is that functions are standalone code blocks while
methods are code blocks associated with an object, typically for object-oriented programming.
Method example
The code from the function example can be modified to turn the function into a method,
producing the same result:
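The original method example is not reproduced here; a minimal sketch of the same idea, turning a standalone function into a method on a class (names are hypothetical), could look like this:
# Function version: a standalone block of code.
def get_circumference(radius):
    return 3.14 * radius * 2

# Method version: the same block of code is now associated with an object.
class Circle:
    def __init__(self, radius):
        self.radius = radius

    def get_circumference(self):
        return 3.14 * self.radius * 2

print(get_circumference(2))           # calling the function
print(Circle(2).get_circumference())  # calling the method on a Circle object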
Modules are a way to build independent and self-contained chunks of code that can be reused.
Developers typically use modules to divide a large project into smaller parts. This way the code
is easier to read and understand, and each module can be developed in parallel without conflicts.
A module is packaged as a single file. In addition to being available for integration with other
modules, it should work independently.
A module consists of a set of functions and typically contains an interface for other modules to
integrate with. It is essentially, a library, and cannot be instantiated.
Module example
Below is a module with a set of functions saved in a script called circleModule.py. You will
see this script again later in the lab for this topic.
class Circle:
    def __init__(self, radius):
        # Store the radius on the object when it is instantiated.
        self.radius = radius

    def circumference(self):
        pi = 3.14
        circumferenceValue = pi * self.radius * 2
        return circumferenceValue

    def printCircumference(self):
        myCircumference = self.circumference()
        print("Circumference of a circle with a radius of " +
              str(self.radius) + " is " + str(myCircumference))
An application that exists in the same directory as circleModule.py could use this module by
importing it, instantiating the class, and then using dot notation to call its functions, as follows:
from circleModule import Circle

# First instantiation and method call for the Circle class (example radius value).
circle1 = Circle(2)
circle1.printCircumference()

# Two more instantiations and method calls for the Circle class.
circle2 = Circle(5)
circle2.printCircumference()
circle3 = Circle(7)
circle3.printCircumference()
3.4.5 Classes
In most OOP languages, and in Python, classes are a means of bundling data and functionality.
Each class declaration defines a new object type.
Encapsulating functionality together with data storage in a single structure also accomplishes one
aspect of data abstraction. Functions defined within a class are known as class methods. Classes
may have class variables and object variables. As a new class object is created, new class data
members and object data members (variables) are created. New classes may be defined, based on
existing, previously defined classes, so that they inherit the properties, data members, and
functionality (methods).
As with other Python data structures and variables, objects are instantiated (created) as they are
first used, rather than being predeclared. A class may be instantiated (created) multiple times,
and each with its own object-specific data attribute values. (Python classes also support class
variables that are shared by all objects of a class.) Outside of the class name scope, class methods
and data attributes are referenced using the dot notation: [class instance].[method name].
Note: Unlike other OOP languages, in Python, there is no means of creating 'private' class
variables or internal methods. However, by convention, methods and variables with a single
preceding underscore ( _ ) are considered private and not to be used or referenced outside of the
class.
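As a short sketch of these ideas (the class name and values are hypothetical), the class below has a class variable shared by all instances, object variables set per instance, a method referenced with dot notation, and a conventionally private attribute marked with a leading underscore:
class Device:
    device_count = 0   # class variable, shared by all Device objects

    def __init__(self, hostname):
        self.hostname = hostname         # object (instance) variable
        self._secret = "not-for-export"  # 'private' by convention only
        Device.device_count += 1

    def describe(self):
        return self.hostname + " is 1 of " + str(Device.device_count) + " devices"

r1 = Device("router1")
r2 = Device("switch1")
print(r1.describe())   # dot notation: [class instance].[method name]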
A code review is when developers look over the codebase, a subset of code, or specific code
changes and provide feedback. These developers are often called reviewers. It is better to have
more than one reviewer when possible.
It is best to have reviewers who understand the purpose of the code so that they can give quality
and relevant feedback. The reviewers will provide comments, typically to the author of the code,
on what they think needs to be fixed. Because a lot of comments can be subjective, it is up to the
author to decide if the comment needs to be addressed, but it is good to have agreement from the
reviewer(s) if it will not be fixed. This code review process only happens after the code changes
are complete and tested.
The goal of code reviews is to make sure that the final code:
is easy to read
is easy to understand
follows coding best practices
uses correct formatting
is free of bugs
has proper comments and documentation
is clean
Doing code reviews has benefits for the whole team. For the author, they get input from the
reviewers and learn additional best practices, other ways the code could have been implemented,
and different coding styles. As a result of the review, the author learns from their mistakes and
can write better code the next time. Code reviews aren’t just for junior developers, they are a
great learning process for all developers.
Code reviews also transfer knowledge about the code between developers. If the reviewers have
to work on that piece of code in the future, they will have a better understanding of how it works.
Code reviews are a way to refine working code or spot potential bugs, which increases the
quality of the code. In general, having another set of eyes on the code is never a bad thing.
There are many ways to do code reviews. Each one has its own benefits. The most
common types of code review processes include:
A formal code review is where developers have a series of meetings to review the whole
codebase. In this meeting, they go over the code line by line, discussing each one in detail. This
type of code review process promotes discussion between all of the reviewers.
A formal code review enables reviewers to reach a consensus, which may result in better
feedback. You might do a new code review every time the comments are addressed.
Details of the code review meetings, such as the attendees, the comments, and comments that
will be addressed, are documented. This type of code review is often called Fagan inspection and
is common for projects that use the waterfall software development methodology.
A modern adaptation of the formal code review is to have a single meeting to review only the
code changes. This way, the code can benefit from the live discussion amongst reviewers. This is
sometimes known as a walkthrough.
A change-based code review, also known as a tool-assisted code review, reviews code that was
changed as a result of a bug, user story, feature, commit, etc.
In order to determine the code changes that need to be reviewed, a peer code review tool that
highlights the code changes is typically used. This type of code review is initiated by the
developers who made the code changes and are responsible for addressing the agreed upon
comments. In this type of code review process, the reviewers usually perform the review
independently and provide the comments via the peer code review tool.
A change-based code review makes it easy to determine the actual code changes to be reviewed
and enables multiple reviewers to get a diverse look into the code.
An over-the-shoulder code review is exactly what it sounds like. A reviewer looks over the
shoulder of the developer who wrote the code. The developer who wrote the code goes through
the code changes line by line and the reviewer provides feedback.
With this method, if the fix is not difficult, the code may be changed on the spot so that the
reviewer can re-review it immediately. The benefit of an over-the-shoulder code review is that
there is direct interaction between the author of the code and the reviewer, which allows for
discussion about what is the right fix. The downside of this type of code review is that it
typically involves only one reviewer, so the comments can be one-sided.
Email Pass-Around
An email pass-around review can occur following the automatic emails sent by source code
management systems when a checkin is made. When the emails are sent, it is up to the other
developers to review the code changes that were made in that checkin. The downside of this type
of code review is that sometimes a single checkin can be just a piece of the whole code change,
so it may not include the proper context to fully understand the code changes.
3.5.3 Testing
Why do coders test software? The simple answer is to make sure it works the way it is supposed
to work. This answer conceals a wealth of nuance and detail.
To begin with, software testing is classically subdivided into two general categories:
Functional testing seeks to determine whether software works correctly. Does it behave
as intended in a logical sense, from the lowest levels of detail examined with Unit
Testing, to higher levels of complexity explored in Integration Testing?
Non-functional testing examines usability, performance, security, resiliency, compliance,
localization, and many other issues. This type of testing finds out if software is fit for its
purpose, provides the intended value, and minimizes risk.
You might think that functional testing happens early in the development cycle, and non-
functional testing begins after parts of the software are built or even finalized. This is incorrect.
Some types of non-functional testing (for example, determining whether a particular language,
open source library, or component meets requirements of a design, or a standard) need to happen
well before design is fixed.
In fact, as you’ll see towards the end of this unit, some developers advocate using testing as a
framework for guiding software development. This means capturing design requirements as
tests, then writing software to pass those tests. This is called Test-Driven Development (TDD).
Let’s look at some methods and tools for testing the lines of code, blocks, functions, and classes.
Detailed functional testing of small pieces of code (lines, blocks, functions, classes, and other
components in isolation) is usually called Unit Testing. Modern developers usually automate this
kind of testing using unit test frameworks. These test frameworks are software that lets you make
assertions about testable conditions and determine if these assertions are valid at a point in
execution. For example:
a = 2 + 2
assert a == 4
The assert keyword is actually native to Python. In this case, the assertion will return true
because 2 + 2 does, in fact, equal 4. On the other hand, if you were to have:
assert a == 5
the assertion would fail, Python would raise an AssertionError, and the test containing it would be reported as failing.
Collecting assertions and reporting on tests is made easier with testing frameworks. Some
examples of test frameworks for Python include:
unittest — This is a framework included in Python by default. It lets you create test
collections as methods extending a default TestCase class.
PyTest — This is a framework that is easily added to Python (from pip repositories: pip3
install pytest). PyTest can run unittest tests without modification, but it also
simplifies testing by letting coders build tests as simple functions rather than class
methods. PyTest is used by certain more-specialized test suites, like PyATS from Cisco.
Both are used in this part, so you can see some of the differences between them.
PyTest is handy because it automatically executes any scripts that start with test_ or end
with _test.py, and within those scripts, automatically executes any functions beginning
with test_ or tests_. So we can unit test a piece of code (such as a function) by copying it into
a file, importing pytest, adding appropriately-named testing functions (names that begin with
tests_ ), saving the file under a filename that also begins with tests_, and running it with
PyTest.
Suppose we want to test the function add5() , which adds 5 to a passed value, and returns the
result:
def add5(v):
    myval = v + 5
    return myval
We can save the function in a file called tests_mytest.py. Then import pytest and write a
function to contain our tests, called tests_add5() :
# in file tests_mytest.py
import pytest
def add5(v):
    myval = v + 5
    return myval

def tests_add5():
    r = add5(1)
    assert r == 6
    r = add5(5)
    assert r == 10
    r = add5(10.102645)
    assert r == 15.102645
The tests in our testing function use the standard Python assert keyword. PyTest will compile and
report on those results, both when collecting test elements from the file (a preliminary step where
PyTest examines Python's own code analysis and reports on proper type use and other issues that
emerge prior to runtime), and while running the tests_add5() function.
You can then run the tests using:
pytest tests_mytest.py
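If all assertions pass, the run ends with a summary line similar to the following (the exact timing and formatting will vary):
=================== 1 passed in 0.02s ===================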
Note that while the function under test is certainly trivial, many real-world programs contain
functions that, like this one, perform math on their arguments. Typically, these functions are
called by higher-level functions, which then do additional processing on the returned values.
If there is a mistake in a lower-level function, causing it to return a bad result, this will likely be
reflected in higher-level output. But because of all the intermediary processing, it might be
difficult or impossible to find the source of an error (or even note whether an error occurred) by
looking at output of these higher-level functions, or at program output in general.
That is one reason why detailed unit testing is essential for developing reliable software. And it
is a reason why unit tests should be added, each time you add something significant to code at
any level, and then re-run with every change you make. We recommend that, when concluding a
work session, you write a deliberately-broken unit test as a placeholder, then use a start-of-
session unit test run to remind you where you left off.
The unittest framework demands a different syntax than PyTest. For unittest , you need to
subclass the built-in TestCase class and test by overriding its built-in methods or adding new
methods whose names begin with test_. The example unit test script, above, could be modified
to work with unittest like this:
import unittest
def add5(v):
    myval = v + 5
    return myval

class tests_add5(unittest.TestCase):
    def test_add5(self):
        self.assertEqual(add5(1), 6)
        self.assertEqual(add5(5), 10)
        self.assertEqual(add5(10.102645), 15.102645)

if __name__ == '__main__':
    unittest.main()
As with PyTest, you import the unittest module to start. Your function follows.
To subclass the TestCase class, pass it to your own (derived) test class (again
called tests_add5, though this is now a class, rather than a function), causing the latter to
inherit all characteristics of the former. For more on Python object-oriented programming
(OOP), see the documentation.
Next, use unittest's assertEqual method (this is one of a wide range of built-in test methods) in
the same way that you used Python's native assert in the PyTest example. Basically, you are
running your function with different arguments, and checking to see if returned values match
expectations.
The last stanza is a standard way of enabling command-line execution of our program, by calling
its main function; which, in this case, is defined by unittest.
Save this file (again as tests_mytest.py ), ensure that it is executable (for example, in Linux,
using chmod +x tests_mytest.py ) and execute it, adding the -v argument to provide a verbose
report:
python3 tests_mytest.py -v
test_add5 (__main__.tests_add5) ... ok
----------------------------------------------------------------------
Ran 1 test in 0.000s
OK
3.5.5 Integration Testing
After unit testing comes integration testing, which makes sure that all of those individual units
you have been building fit together properly to make a complete application. For example,
suppose an application that you are writing needs to consult a local web service to obtain
configuration data, including the name of a relevant database host. You might want to test the
values of variables set when these functions are called. If you were using PyTest, you could do
that like this:
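The original example code is not shown here. A minimal sketch consistent with the test output below, assuming a local web service exposing /get_config and /config_action endpoints on localhost (as the captured log suggests), might look like this; the assertion is shown in its corrected form, while the failing run below was produced with the expected value misspelled as "ESTVAL":
# in file test_sample_app.py (sketch; endpoint names are assumptions)
import requests

def get_config():
    # Ask the local web service for the currently configured database host.
    return requests.get("http://localhost/get_config").text

def set_config(dbhost):
    # Tell the local web service to use a new database host.
    return requests.get("http://localhost/config_action", params={"dbhost": dbhost}).text

def setUp():
    # Remember the original configuration so it can be restored later.
    global original_db_host
    original_db_host = get_config()

def tearDown():
    set_config(original_db_host)

def test_setconfig():
    setUp()
    set_config("TESTVAL")
    assert get_config() == "TESTVAL"
    tearDown()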
Note that your test_setconfig() method deliberately calls your setUp() function before
running tests, and your tearDown() function afterward. In unittest, methods
called setUp() and tearDown() are provided by the TestCase class, can be overridden in your
defined subclass, and are executed automatically.
Running this code with PyTest might produce output like this:
test_sample_app.py
F [100%]
==================================== FAILURES ====================================
_________________________________ test_setconfig _________________________________
def test_setconfig():
setUp()
set_config("TESTVAL")
> assert get_config() == "ESTVAL"
E AssertionError: assert 'TESTVAL' == 'ESTVAL'
E - TESTVAL
E ? -
E + ESTVAL
test_sample_app.py:21: AssertionError
------------------------------- Captured log call --------------------------------
connectionpool.py 225 DEBUG Starting new HTTP connection (1): localhost:80
connectionpool.py 437 DEBUG http://localhost:80 "GET /get_config HTTP/1.1" 200 7
connectionpool.py 225 DEBUG Starting new HTTP connection (1): localhost:80
connectionpool.py 437 DEBUG http://localhost:80 "GET /config_action?dbhost=TESTVAL HTTP/1.1" 200 30
connectionpool.py 225 DEBUG Starting new HTTP connection (1): localhost:80
connectionpool.py 437 DEBUG http://localhost:80 "GET /get_config HTTP/1.1" 200 7
============================ 1 failed in 0.09 seconds ============================
If you fix the broken test, you can see that everything runs perfectly:
test_sample_app.py
. [100%]
============================ 1 passed in 0.07 seconds ============================
Again, you should run your integration tests before you make any changes for the day, whenever
you make significant changes, and before you close out for the day. If you are using Continuous
Integration, any errors you find must be corrected before you do anything else.
Note: You can run this script on your VM using pytest. However, understanding the output and
fixing any errors is beyond the scope of this course.
Building small, simple unit and integration tests around small bits of code helps in two ways:
It ensures that units are fit for purpose. In other words, you make sure that units are doing
what requirements dictate, within the context of your evolving solution.
It catches bugs locally and fixes them early, saving trouble later on when testing or using
higher-order parts of your solution that depend on these components.
The first of these activities is as important as the second, because it lets testing validate system
design or, failing that, guide local refactoring, broader redesign, or renegotiation of requirements.
Testing to validate design intention in light of requirements implies that you should write testing
code before you write application code . Having expressed requirements in your testing code,
you can then write application code until it passes the tests you have created in the testing code.
1. Create a new test (adding it to existing tests, if they already exist). The idea here is to
capture some requirement of the unit of application code you want to produce.
2. Run tests to see if any fail for unexpected reasons. If this happens, correct the tests. Note
that expected failures, here, are acceptable (for example, if your new test fails because the
function it is designed to test does not yet exist, that is an acceptable failure at this point).
3. Write application code to pass the new test. The rule here is to add nothing more to the
application besides what is required to pass the test.
4. Run tests to see if any fail. If they do, correct the application code and try again.
5. Refactor and improve application code. Each time you do, re-run the tests and correct
application code if you encounter any failures.
By proceeding this way, the test harness leads and grows in lockstep with your application. This
may be on a line-by-line basis, providing very high test coverage and high assurance that both
the test harness and the application are correct at any given stopping-point. Co-evolving test and
application code this way:
Obliges developers to consistently think about requirements (and how to capture them in
tests).
Helps clarify and constrain what code needs to do (because it just has to pass tests),
speeding development and encouraging simplicity and good use of design patterns.
Mandates creation of highly-testable code. This is code that, for example, breaks
operations down into pure functions that can be tested in isolation, in any order, etc.
Rest APIs, which you'll learn about in the next module, let you exchange information with
remote services and equipment. So do interfaces built on these APIs, including purpose-
dedicated command-line interface tools and integration software development kits (SDKs) for
popular programming languages.
When controlling these APIs through software, it is helpful to be able to receive and transmit
information in forms that are standards-compliant, and machine- and human-readable. This lets
you:
Easily use off-the-shelf software components and/or built-in language tools to convert
messages into forms that are easy for you to manipulate and extract data from, such as
data structures native to the programming language(s) you are using. You can also
convert them into other standard formats that you may need for various purposes.
Easily write code to compose messages that remote entities can consume.
Read and interpret received messages yourself to confirm that your software is handling
them correctly, and compose test messages by hand to send to remote entities.
More easily detect "malformed" messages caused by transmission or other errors
interfering with communication.
Today, the three most popular standard formats for exchanging information with remote APIs are
XML, JSON, and YAML. The YAML standard was created as a superset of JSON, so any legal
JSON document can be parsed and converted to equivalent YAML, and (with some limitations
and exceptions) vice-versa. XML, an older standard, is not as simple to parse, and in some cases,
it is only partly (or not at all) convertible to the other formats. Because XML is older, the tools
for working with it are quite mature.
3.6.2 XML
Extensible Markup Language (XML) is a derivative of Structured, Generalized Markup
Language (SGML), and also the parent of HyperText Markup Language (HTML). XML is a
generic methodology for wrapping textual data in symmetrical tags to indicate semantics. XML
filenames typically end in ".xml".
This example simulates information you might receive from a cloud computing management API
that is listing virtual machine instances.
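The original listing is not reproduced here; a reconstruction consistent with the values discussed later in this section would look something like this:
<?xml version="1.0" encoding="UTF-8"?>
<!-- Instance list -->
<vms>
  <vm>
    <vmid>0101af9811012</vmid>
    <type>t1.nano</type>
  </vm>
  <vm>
    <vmid>0102bg8908023</vmid>
    <type>t1.micro</type>
  </vm>
</vms>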
For the moment, ignore the first line of the document, which is a special part known as the
prologue (more on this below), and the second line, which contains a comment. The remainder of
the document is called the body.
Notice how individual data elements within the body (readable character strings) are surrounded
by symmetrical pairs of tags, the opening tag surrounded by < and > symbols, and the closing
tag, which is similar, but with a "/" (slash) preceding the closing tag name.
Notice also that some tag pairs surround multiple instances of tagged data (for example,
the <vm> and corresponding </vm> tags). The main body of the document, as a whole, is always
surrounded by an outermost tag pair (for example, the <vms>...</vms> tag pair), also known as the root tag pair.
The structure of the document body is like a tree, with branches coming off the root, containing
possible further branches, and finally leaf nodes, containing actual data. Moving back up the tree,
each tag pair in an XML document has a parent tag pair, and so on, until you reach the root tag
pair.
When consuming XML from an API, tag names and their meanings are generally documented by
the API provider, and may be representative of usage defined in a public namespace schema.
Data is conveyed in XML as readable text. As in most programming languages, encoding special
characters in XML data fields presents certain challenges.
For example, a data field cannot contain text that includes the < or > symbols, used by XML to
demarcate tags. If writing your own XML (without a schema), it is common to use HTML entity
encodings to encode such characters. In this case the characters can be replaced with their
equivalent &lt; and &gt; entity encodings. You can use a similar strategy to represent a wide range of
special symbols, ligature characters, and other entities.
Note that if you are using XML according to the requirements of a schema, or defined
vocabulary and hierarchy of tag names, (this is often the case when interacting with APIs such as
NETCONF) you are not permitted to use HTML entities. In the rare case when special characters
are required, you can use the characters' numeric representations, which for the less-than and
greater-than signs are &#60; and &#62; respectively.
To avoid having to individually find and convert special characters, it is possible to incorporate
entire raw character strings in XML files by surrounding them with so-called CDATA blocks.
Here is an example:
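For example (the tag name and content are illustrative):
<description><![CDATA[Raw text can contain <, >, & and other special characters unescaped.]]></description>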
XML Prologue
The XML prologue is the first line in an XML file. It has a special format, bracketed by <? and ?>.
It contains the tag name xml and attributes stating the version and a character encoding.
Normally the version is "1.0", and the character encoding is "UTF-8" in most cases; otherwise,
"UTF-16". Including the prologue and encoding can be important in making your XML
documents reliably interpretable by parsers, editors, and other software.
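A typical prologue looks like this:
<?xml version="1.0" encoding="UTF-8"?>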
Comments in XML
XML files can include comments, using the same commenting convention used in HTML
documents. For example:
<!-- This is an XML comment. It can go anywhere -->
XML Attributes
XML lets you embed attributes within tags to convey additional information. In the following
example, the XML version number and character encoding are both inside the xml tag. However,
the vmid and type elements could also be included as attributes in the vm tag:
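For example, a sketch of one vm element rewritten to carry its data as attributes instead of child elements:
<vm vmid="0101af9811012" type="t1.nano"></vm>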
XML Namespaces
Some XML messages and documents must incorporate a reference to specific namespaces to
specify particular tagnames and how they should be used in various contexts. Namespaces are
defined by the IETF and other internet authorities, by organizations, and other entities, and their
schemas are typically hosted as public documents on the web. They are identified by Uniform
Resource Names (URNs), used to make persistent documents reachable without the seeker
needing to be concerned about their location.
The code example below shows use of a namespace, defined as the value of an xmlns attribute,
to assert that the content of an XML remote procedure call should be interpreted according to the
legacy NETCONF 1.0 standard. This code-sample shows a NETCONF remote procedure call
instruction in XML. Attributes in the opening rpc tag denote the message ID and the XML
namespace that must be used to interpret the meaning of contained tags. In this case, you are
asking that the remote entity kill a particular session. The NETCONF XML schema is
documented by IETF.
<rpc message-id="101"
xmlns="urn:ietf:params:xml:ns:netconf:base:1.0">
<kill-session>
<session-id>4</session-id>
</kill-session>
</rpc>
Interpreting XML
In the VM instance example shown earlier, the data represents a list or one-dimensional array
(vms) of objects (each bracketed by <vm> tags). Each object contains two key-value pairs
denoting a unique instance ID and VM server type. A semantically-equivalent Python data
structure might be declared like this:
vms = [
    {
        "vmid": "0101af9811012",
        "type": "t1.nano"
    },
    {
        "vmid": "0102bg8908023",
        "type": "t1.micro"
    }
]
The problem is that XML has no way of deliberately indicating that a certain arrangement of tags
and data should be interpreted as a list. So we need to interpret the XML-writer's intention in
making the translation. Mappings between XML tree structures and more-efficient
representations possible in various computer languages require understanding data semantics.
This is less true of more-modern data formats, like JSON and YAML, which are structured to
map well onto common simple and compound data types in popular programming languages.
In this case, the <vm> tags bracketing each instance's data (id and type) are collapsed in favor of
using plain braces ( {} ) for grouping. This leaves you with a Python list of objects (which you
can call 'vm objects,' but which are not explicitly named in this declaration), each containing two
key/value pairs.
3.6.3 JSON
JSON, or JavaScript Object Notation, is a data format derived from the way complex object
literals are written in JavaScript (which is in turn, similar to how object literals are written in
Python). JSON filenames typically end in ".json".
Here is a sample JSON file, containing some key/value pairs. Notice that two values are text
strings, one is a boolean value, and two are arrays:
{
    "edit-config": {
        "default-operation": "merge",
        "test-operation": "set",
        "some-integers": [2, 3, 5, 7, 9],
        "a-boolean": true,
        "more-numbers": [2.25E+2, -1.0735]
    }
}
JSON basic data types include numbers (written as positive and negative integers, as floats with
decimal, or in scientific notation), strings, Booleans ('true' and 'false'), or nulls (value left blank).
JSON Objects
As in JavaScript, individual objects in JSON comprise key/value pairs, which may be surrounded
by braces, individually:
{"keyname": "value"}
This example depicts an object with a string value (for example, the word 'value'). A number or
Boolean would not be quoted.
Objects may also contain multiple key/value pairs, separated by commas, creating structures
equivalent to complex JavaScript objects, or Python dictionaries. In this case, each individual
key/value pair does not need its own set of brackets, but the entire object does. In the above
example, the key "edit-config" identifies an object containing five key/value pairs.
JSON can also express JavaScript ordered arrays (or 'lists') of data or objects. In the above
example, the keys "some-integers" and "more-numbers" identify such arrays.
No Comments in JSON
Unlike XML and YAML, JSON does not support any kind of standard method for including
unparsed comments in code.
Whitespace Insignificant
Whitespace in JSON is not significant, and files can be indented using tabs or spaces as
preferred, or not at all (which is a bad idea). This makes the format efficient (whitespace can be
removed without harming meaning) and robust for use on command-lines and in other scenarios
where whitespace characters can easily be lost in cut-and-paste operations.
3.6.4 YAML
YAML, an acronym for "YAML Ain't Markup Language", is a superset of JSON designed for
even easier human readability. It is becoming more common as a format for configuration files,
and particularly for writing declarative automation templates for tools like Ansible.
As a superset of JSON, YAML parsers can generally parse JSON documents (but not vice-
versa). Because of this, YAML is better than JSON at some tasks, including the ability to embed
JSON directly (including quotes) in YAML files. JSON can be embedded in JSON files too, but
quotes must be escaped with backslashes (\") or encoded as HTML character entities (&quot;).
Here is a version of the JSON file from the JSON subsection, expressed in YAML. Use this as an
example to understand how YAML works:
---
edit-config:
a-boolean: true
default-operation: merge
more-numbers:
- 225.0
- -1.0735
some-integers:
- 2
- 3
- 5
- 7
- 9
test-operation: set
...
As shown in the example, YAML files conventionally open with three dashes ( --- alone on a
line) and end with three dots ( ... also alone on a line). YAML also accommodates the notion of
multiple "documents" within a single physical file, in this case, separating each document with
three dashes on its own line.
YAML basic data types include numbers (written as positive and negative integers, as floats with
a decimal, or in scientific notation), strings, Booleans ( true and false ), or nulls (value left
blank).
String values in YAML are often left unquoted. Quotes are only required when strings contain
characters that have meaning in YAML. For example, "{ " (an opening brace followed by a space)
indicates the beginning of a map. Backslashes and other special characters or strings also need to
be considered. If you surround your text with double quotes, you can escape special characters in
a string using backslash expressions, such as \n for newline.
YAML also offers convenient ways of encoding multi-line string literals (more below).
Basic Objects
In YAML, basic (and complex) data types are equated to keys. Keys are normally unquoted,
though they may be quoted if they contain colons (:) or certain other special characters. Keys
also do not need to begin with a letter, though both these features conflict with the requirements
of most programming languages, so it is best to stay away from them if possible.
my_integer: 2
my_float: 2.1
my_exponent: 2e+5
'my_complex:key' : "my quoted string value\n"
0.2 : "can you believe that's a key?"
my_boolean: true
my_null: null # leaving the value blank is also interpreted as null
YAML does not use brackets or containing tag pairs, but instead indicates its hierarchy using
indentation. Items indented below a label are "members" of that labeled element.
The indentation amount is up to you. As little as a single space can be used where indentation is
required, though a best-practice is to use two spaces per indent level. The important thing is to be
absolutely consistent, and to use spaces rather than tabs.
YAML easily represents more complex data types, such as maps containing multiple key/value
pairs (equivalent to dictionaries in Python) and ordered lists.
Maps are generally expressed over multiple lines, beginning with a label key and a colon,
followed by members, indented on subsequent lines:
mymap:
myfirstkey: 5
mysecondkey: The quick brown fox
Lists (arrays) are represented in a similar way, but with optionally-indented members preceded
by a single dash and space:
mylist:
- 1
- 2
- 3
Maps and lists can also be represented in a so-called "flow syntax," which looks very much like
JavaScript or Python:
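For example, the map and list above can be written in flow syntax as:
mymap: {myfirstkey: 5, mysecondkey: The quick brown fox}
mylist: [1, 2, 3]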
Long Strings
You can represent long strings in YAML using a 'folding' syntax, where linebreaks are presumed
to be replaced by spaces when the file is parsed/consumed, or in a non-folding syntax. Long
strings cannot contain escaped special characters, but may (in theory) contain colons, though
some software does not observe this rule.
mylongstring: >
This is my long string
which will end up with no linebreaks in it
myotherlongstring: |
This is my other long string
which will end up with linebreaks as in the original
Note the difference in the two examples above. The greater than ( > ) indicator gives us the
folding syntax, where the pipe ( | ) does not.
Comments
Comments in YAML can be inserted anywhere except in a long string literal, and are preceded
by the hash sign and a space:
# this is a comment
YAML has many more features, most often encountered when using it in the context of specific
languages, like Python, or when converting to JSON or other formats. For example, YAML 1.2
supports schemas and tags, which can be used to disambiguate interpretation of values. For
example, to force a number to be interpreted as a string, you could use the !!str string, which is
part of the YAML "Failsafe" schema:
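For example (key names and values are illustrative):
not-a-number: !!str 0.618
also-a-string: !!str true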
Parsing means analyzing a message, breaking it into its component parts, and understanding their
purposes in context. When messages are transmitted between computers, they travel as a stream
of characters, which is effectively a string. This needs to be parsed into a semantically-equivalent
data-structure containing data of recognized types (such as integers, floats, strings, etc.) before
the application can interpret and act upon the data.
An Example
For example, imagine you wanted to send an initial query to some remote REST endpoint,
inquiring about the status of running services. To do this, typically, you would need to
authenticate to the REST API, providing your username (email), plus a permission key obtained
via an earlier transaction. You might have stored username and key in a Python dictionary, like
this:
auth = {
"user": {
"username": "[email protected]",
"key": "90823ff08409408aebcf4320384"
}
}
But the REST API requires these values to be presented as XML in string form, appended to
your query as the value of a key/value pair called "auth".
The XML itself might need to take this format, with Python key values converted to same-name
tag pairs, enclosing data values:
<user>
<username>[email protected]</username>
<key>90823ff08409408aebcf4320384</key>
</user>
You would typically use a serialization function (from a Python library) to output your auth data
structure as a string in XML format, adding it to your query:
At this point, the service might reply, setting the variable myresponse to contain a string like the
following, containing service names and statuses in XML format:
<services>
<service>
<name>Service A</name>
<status>Running</status>
</service>
<service>
<name>Service B</name>
<status>Idle</status>
</service>
</services>
You would then need to parse the XML to extract information into a form that Python could
access conveniently.
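A sketch of that parsing step using the untangle library (installable with pip3 install untangle) follows; it prints one line per service, ending with the line shown below:
import untangle

# Parse the XML reply string into a Python object tree.
obj = untangle.parse(myresponse)

# Walk the list of <service> elements and read the text (cdata) of each leaf node.
for service in obj.services.service:
    print(service.name.cdata + " " + service.status.cdata)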
In this case, the untangle library would parse the XML into a dictionary whose root element
(services) contains a list (service[]) of pairs of key/value object elements denoting the name and
status of each service. You could then access the 'cdata' value of elements to obtain the text
content of each XML leaf node. The above code would print:
Service B Idle
In this lab, you will use Python to parse each data format in turn: XML, JSON, and YAML.
You will walk through code examples and investigate how each parser works.
You will complete the following objectives:
Software Development
The software development life cycle (SDLC) is the process of developing software, starting from
an idea and ending with delivery. This process consists of six phases. Each phase takes input
from the results of the previous phase: 1. Requirements & Analysis, 2. Design, 3.
Implementation, 4. Testing, 5. Deployment, and 6. Maintenance. Three popular software
development models are waterfall, Agile, and Lean:
Waterfall - This is the traditional software development model. Each phase cannot
overlap and must be completed before moving on to the next phase.
Agile Scrum - In rugby, the term scrum describes a point in gameplay where players
crowd together and try to gain possession of the ball. The Scrum methodology focuses on
small, self-organizing teams that meet daily for short periods and work in iterative sprints
, constantly adapting deliverables to meet changing requirements.
Lean - Based on Lean Manufacturing, the Lean method emphasizes elimination of
wasted effort in planning and execution, and reduction of programmer cognitive load.
Software design patterns are best practice solutions for solving common problems in software
development. Design patterns are language-independent. In their Design Patterns book, the Gang
of Four divided patterns into three main categories:
• Creational - Patterns used to guide, simplify, and abstract software object creation at scale.
• Structural - Patterns describing reliable ways of using objects and classes for different kinds of
software projects.
• Behavioral - Patterns detailing how objects can communicate and work together to meet
familiar challenges in software engineering.
The observer design pattern is a subscription notification design that lets objects (observers or
subscribers) receive events when there are changes to an object (subject or publisher) they are
observing.
The Model-View-Controller (MVC) design pattern is sometimes considered an architectural
design pattern. Its goal is to simplify development of applications that depend on graphic user
interfaces.
Version control is a way to manage changes to a set of files to keep a history of those changes.
There are three types of version control systems: Local, Centralized, and Distributed.
Git is an open source implementation of a distributed version control system. Git has two types
of repositories, local and remote. Branching enables users to work on code independently
without affecting the main code in the repository. In addition to providing the distributed version
control and source code management functionality of Git, GitHub also provides additional
features such as: code review, documentation, project management, bug tracking, and feature
requests. After installing Git to the client machine, you must configure it. Git provides a git
config command to get and set Git's global settings, or a repository's options. Git has many other
commands that you can use, including a host of branching options. Developers use a .diff file to
show how two different versions of a file have changed.
Coding Basics
Clean code is the result of developers trying to make their code easy to read and understand for
other developers. Methods and functions share the same concept; they are blocks of code that
perform tasks when executed. If the method or function is not executed, those tasks will not be
performed. Modules are a way to build independent and self-contained chunks of code that can
be reused. In most OOP languages, and in Python, classes are a means of bundling data and
functionality. Each class declaration defines a new object type.
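As a minimal, hypothetical sketch of these ideas in Python:
class Device:
    # A class bundles data (attributes) and functionality (methods) into a new object type
    def __init__(self, hostname, ip):
        self.hostname = hostname      # data
        self.ip = ip

    def summary(self):                # functionality
        return self.hostname + " (" + self.ip + ")"

router = Device("edge-router-1", "10.0.0.1")
print(router.summary())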
A code review is when developers look over the codebase, a subset of code, or specific code
changes and provide feedback. The most common types of code review processes include:
Formal code review, Change-based code review, Over-the-shoulder code review, and Email
pass-around.
Software testing is subdivided into two general categories: functional, and non-functional.
Detailed functional testing of small pieces of code (lines, blocks, functions, classes, and other
components in isolation) is usually called Unit Testing. After unit testing comes Integration
Testing, which makes sure that all of those individual units fit together properly to make a
complete application. Test-Driven Development (sometimes called Test-First Development) is
testing to validate the intent of the design in light of requirements. This means writing testing
code before writing application code. Having expressed requirements in testing code, developers
can then write application code until it passes the tests.
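As a hedged sketch of that test-first idea, using Python's built-in unittest module (the add_vlan function and its requirements are hypothetical; in practice the test class would be written first and the function implemented until the tests pass):
import unittest

def add_vlan(vlans, vlan_id):
    # Hypothetical function under test: add a VLAN id to a list, avoiding duplicates
    if vlan_id not in vlans:
        vlans.append(vlan_id)
    return vlans

class TestAddVlan(unittest.TestCase):
    def test_adds_new_vlan(self):
        self.assertEqual(add_vlan([10], 20), [10, 20])

    def test_ignores_duplicate(self):
        self.assertEqual(add_vlan([10], 10), [10])

if __name__ == "__main__":
    unittest.main()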
JavaScript Object Notation (JSON) is a data format derived from the way complex object literals
are written in JavaScript. JSON filenames typically end in ".json".
YAML Ain't Markup Language (YAML) is a superset of JSON designed for even easier human
readability.
Parsing means analyzing a message, breaking it into its component parts, and understanding their
purposes in context. Serializing is roughly the opposite of parsing.
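For example, a minimal sketch using Python's built-in json module (the third-party PyYAML library provides the analogous yaml.safe_load() and yaml.dump() for YAML):
import json

text = '{"services": [{"name": "Service A", "status": "Running"}]}'
data = json.loads(text)                  # parse the JSON string into Python objects
print(data["services"][0]["status"])     # Running
print(json.dumps(data, indent=2))        # serialize the objects back to a JSON string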
If you plan to automate your network (or even just parts of it), you will want to be able to create
and troubleshoot Application Programming Interfaces (APIs). APIs define the way users,
developers, and other applications interact with an application's components. With so many APIs
to work with, you have to be sure that you understand the foundations. This module concentrates
on REST APIs. Cisco has many product lines with these interfaces for multiple use cases,
whether collecting data, calling services, or listening to events.
Module Objective: Create REST API requests over HTTPS to securely integrate services.
An API allows one piece of software to talk to another. An API is analogous to a power outlet.
Without a power outlet, what would you have to do to power your laptop?
An API defines how a programmer can write one piece of software to talk to an existing
application’s features or even build entirely new applications.
An API can use common web-based interactions or communication protocols, and it can also use
its own proprietary standards. A good example of the power of using an API is a restaurant
recommendation app that returns a list of relevant restaurants in the area. Instead of creating a
mapping function from scratch, the app integrates a third-party API to provide map functionality.
The creator of the API specifies how and under what circumstances programmers can access the
interface.
As part of this process, the API also determines what type of data, services, and functionality the
application exposes to third parties; if it is not exposed by the API, it is not exposed, period.
(That is, assuming security is properly configured!) By providing APIs, applications can control
what they expose in a secure way.
Think of it like the buttons on the dashboard of your car. When you turn the key or push the
Engine ON button to start the car, what you see is that the car starts. You do not see or even care
that the engine starts the electronic ignition system which causes the gas to enter the engine and
the pistons to start moving. All you know is that if you want the car to start, you turn the key or
push the ON Engine button. If you push a different button, such as the radio ON button, the car
will not start because that was not the definition that the car (application) defined for starting the
engine. The car (application) itself can do many more things than start the engine, but those
things are not exposed to the driver (third-party user).
APIs are usually built to be consumed programmatically by other applications, which is why
they are called Application Programming Interfaces. But they do not have to be used
programmatically; they can also be used by humans who want to interact with the application
manually.
Here are just a few of the many use cases for APIs:
Automation tasks - Build a script that performs your manual tasks automatically and
programmatically. Example: You are a manager and on the last day of every pay period,
you need to manually log into the portal and download the timecard for each individual
who reports to you. Then you need to manually add up the number of hours each person
worked so that you can see how much to pay them. If you create an automated script that
calls the portal's API to get the data from each timecard and calculate the total from that
data, your script can print out a list of total hours worked per person in a user friendly
format. Think about how much time that would save!
Data integration - An application can consume or react to data provided by another
application. Example: Many e-commerce websites use payment services which are
accessed via a REST API. The payment interface lets a vendor site transmit relatively
low-value data, such as a description of the object purchased and the price. The user then
independently authenticates to the payment service to confirm the payment. This allows
the vendor to receive payment without needing direct access to customer credit card data.
The vendor gets back a confirmation code for use by accounting.
Functionality - An application can integrate another application's functionality into its
product. Example: Many online services, such as Yelp and Uber, exchange data with
Google Maps to create and optimize travel routes. They also embed functionality from
Google Maps within their own apps and websites to present realtime route maps.
APIs have existed for decades, but exposure and consumption of APIs has grown exponentially
in the last 10 years or so. In the past, applications were locked down and the only integration
between them was through predetermined partnerships. As the software industry has grown, so
has the demand for integration. As a result, more and more applications have started to expose
bits and pieces of themselves for third-party applications or individuals to use.
Most modern APIs are designed into the product rather than being an afterthought. These APIs
are usually thoroughly tested, just like any other component of the product. These APIs are
reliable and are sometimes even used by the product itself. For example, sometimes an
application's user interface is built on the same APIs that are provided for third parties.
The popularity of easier and more simplified coding languages such as Python has made it
possible for non-software engineers to build applications and consume these APIs. They get the
results they want without having to hire expensive development talent.
APIs can be delivered in one of two ways: synchronously or asynchronously. You need to know
the difference between the two, because the application that is consuming the API manages the
response differently depending on the API design. Each design has its own purpose, but also its
own set of complexities on the client and/or server side. A product’s set of APIs may consist of
both synchronous and asynchronous designs, where each API’s design is independent of the
others. For best practices, however, the logic behind the design should be consistent.
APIs are usually designed to be synchronous when the data for the request is readily available,
such as when the data is stored in a database or in internal memory. The server can instantly
fetch this data and respond back immediately.
Synchronous APIs enable the application to receive data immediately. If the API is designed
correctly, the application will have better performance because everything happens quickly.
However, if it is not designed correctly, the API request will be a bottleneck because the
application has to wait for the response.
Client side processing
The application making the API request must wait for the response before performing any
additional code execution tasks.
The figure shows a restaurant counter with an ORDER HERE sign, a PICK UP HERE sign, and
an order ticket numbered 18.
Asynchronous APIs provide a response to signify that the request has been received, but that
response does not have any actual data. The server processes the request, which may take time,
and sends a notification (or triggers a callback) with the data after the request has been
processed. The client can then act on that returned data.
APIs are usually designed to be asynchronous when the request is an action that takes some time
for the server to process, or if the data is not readily available. For example, if the server has to
make a request to a remote service to fetch the data, it cannot guarantee that it will receive the
data immediately to send back to the client. Just because an API is asynchronous does not
necessarily mean that the client will not get the data immediately. It only means that an
immediate response with data is not guaranteed.
Benefits of an asynchronous API design
Asynchronous APIs let the application continue execution without being blocked for the amount
of time it takes for the server to process the request. As a result, the application may have better
performance because it can multi-task and make other requests. However, unnecessary or
excessive use of asynchronous calls can have the opposite effect on performance.
Client-side processing
With asynchronous processing, the design of the API on the server side defines what you want to
do on the client side. Sometimes the client can establish a listener or callback mechanism to
receive these notifications and process them when they are received. Depending on the design of
the application, your client may also need a queue to store the requests to maintain the order for
processing. Other API designs need the client to have a polling mechanism to find out the status
and progress of a given request.
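As a hedged illustration of the polling approach, the endpoints, field names, and task states below are hypothetical rather than taken from any particular product:
import time
import requests

# The POST returns immediately with a task handle; the client polls until the work completes
resp = requests.post("https://myservice.example.com/api/v1/devices", json={"name": "sw1"})
task_id = resp.json()["taskId"]

while True:
    task = requests.get("https://myservice.example.com/api/v1/tasks/" + task_id).json()
    if task["state"] in ("COMPLETED", "FAILED"):
        break
    time.sleep(5)                        # wait before polling again
print(task["state"])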
4.3.1 Common Architectural Styles
The application defines how third parties interact with it, which means there's no "standard" way
to create an API. However, even though an application technically can expose a haphazard
interface, the best practice is to follow standards, protocols, and specific architectural styles. This
makes it much easier for consumers of the API to learn and understand the API, because the
concepts will already be familiar.
The three most popular types of API architectural styles are RPC, SOAP, and REST.
4.3.2 RPC
Remote Procedure Call (RPC) is a request-response model that lets an application (acting as a
client) make a procedure call to another application (acting as a server). The "server" application
is typically located on another system within the network.
With RPC, the client is usually unaware that the procedure request is being executed remotely
because the request is made to a layer that hides those details. As far as the client is concerned,
these procedure calls are simply actions that it wants to perform. In other words, to a client, a
Remote Procedure Call is just a method with arguments. When it's called, the method gets
executed and the results get returned.
RPC is an API style that can be applied to different transport protocols. Example
implementations include:
XML-RPC
JSON-RPC
NFS (Network File System)
Simple Object Access Protocol (SOAP)
4.3.3 SOAP
SOAP is a messaging protocol. It is used for communicating between applications that may be
on different platforms or built with different programming languages. It is an XML-based
protocol that was developed by Microsoft. SOAP is commonly used with HyperText Transfer
Protocol (HTTP) transport, but can be applied to other protocols. SOAP is independent,
extensible, and neutral.
Independent
SOAP was designed so that all types of applications can communicate with each other. The
applications can be built using different programming languages, can run on different operating
systems, and can be as different as possible.
Extensible
SOAP itself is considered an application of XML, so extensions can be built on top of it. This
extensibility means you can add features such as reliability and security.
Neutral
SOAP can be used over any protocol, including HTTP, SMTP, TCP, UDP, or JMS.
SOAP messages
A SOAP message is just an XML document that may contain four elements:
Envelope
Header
Body
Fault
<?xml version="1.0"?>
<soap:Envelope
xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
<soap:Header/>
<soap:Body>
<soap:Fault>
<faultcode>soap:Server</faultcode>
<faultstring>Query request too large.</faultstring>
</soap:Fault>
</soap:Body>
</soap:Envelope>
Envelope
The Envelope must be the root element of the XML document. In the Envelope, the namespace
provided tells you that the XML document is a SOAP message.
Header
The header is an optional element, but if a header is present, it must be the first child of the
Envelope element. Just like most other headers, it contains application-specific information such
as authorization, SOAP specific attributes, or any attributes defined by the application.
Body
The body contains the data to be transported to the recipient. This data must be in XML format,
and in its own namespace.
Fault
The fault is an optional element, but must be a child element of the Body. There can only be one
fault element in a SOAP message. The fault element provides error and/or status information.
4.3.4 REST
Note: Search for Fielding’s dissertation on the internet for more information.
We'll examine REST in detail in the next two sections, but let's look at the basics. REST
(REpresentational State Transfer) is an architectural style, described in Fielding's dissertation,
that defines six constraints:
Client-Server
Stateless
Cache
Uniform Interface
Layered System
Code-On-Demand
These six constraints can be applied to any protocol, and when they are applied, you will often
hear that it is RESTful.
Client-Server
The client and server should be independent of each other, enabling the client to be built for
multiple platforms and simplifying the server side components.
Stateless
Requests from the client to the server must contain all of the information the server needs to
fulfill the request. The server cannot contain session states.
Cache
Responses from the server must state whether the response is cacheable or non-cacheable. If it is
cacheable, the client can use the data from the response for later requests.
Uniform interface
The interface between the client and the server must adhere to these four principles:
Identification of resources
Manipulation of resources through representations
Self-descriptive messages
Hypermedia as the engine of application state
Layered system
The system is made up of hierarchical layers, in which each layer provides services only to the
layer above it and consumes services from the layer below.
Code-on-demand
This constraint is optional, and references the fact that information returned by a REST service
can include executable code (e.g., JavaScript) or links to such code, intended to usefully extend
client functionality. For example, a payment service might use REST to make available links to
its published JavaScript libraries for making payments. These JavaScript files could then be
downloaded and (if judged trustworthy) executed by a client application. This eliminates the
need for client developers to create and maintain separate payment-processing code, and manage
dependency changes that might break such code from time to time.
The constraint is optional because executing third-party code introduces potential security risks,
and because firewalls and other policy-management tools may make third-party code execution
impossible in some cases.
The figure shows a client box on the left and a computer icon within a cloud labeled API. An
arrow goes from the client to the API with the words request (HTTP). Another arrow below the
top one goes from the cloud to the client with the words response (HTTP).
A REST web service API (REST API) is a programming interface that communicates
over HTTP while adhering to the principles of the REST architectural style.
To refresh your memory, the six principles of the REST architectural style are:
1. Client-Server
2. Stateless
3. Cache
4. Uniform Interface
5. Layered System
6. Code-On-Demand (Optional)
Because REST APIs communicate over HTTP, they use the same concepts as the
HTTP protocol:
HTTP requests/responses
HTTP verbs
HTTP status codes
HTTP headers/body
REST API requests are essentially HTTP requests that follow the REST principles.
These requests are a way for an application (client) to ask the server to perform a
function. Because it is an API, these functions are predefined by the server and must
follow the provided specification.
A URI is essentially the same format as the URL you use in a browser to go to a
webpage. The syntax consists of the following components, in this particular order:
Scheme
Authority
Path
Query
When you piece the components together, a URI will look like this:
scheme:[//authority][/path][?query]
Scheme
The scheme specifies which protocol should be used. For a REST API, the two
options are:
http
https
Authority
The authority, or destination, consists of two parts that are preceded with two forward
slashes ( // ):
Host
Port
The host is the hostname or IP address of the server that is providing the REST API
(web service). The port is the communication endpoint, or the port number, that is
associated to the host. The port is always preceded with a colon ( : ). Note that if the
server is using the default port -- 80 for HTTP and 443 for HTTPS -- the port may be
omitted from the URI.
Path
For a REST API, the path is usually known as the resource path, and represents the
location of the resource, the data or object, to be manipulated on the server. The path is
preceded by a slash ( / ) and can consist of multiple segments that are separated by a
slash ( / ).
Query
The query, which includes the query parameters, is optional. The query provides
additional details for scope, for filtering, or to clarify a request. If the query is present, it
is preceded with a question mark ( ? ). There isn't a specific syntax for query
parameters, but it is typically defined as a set of key-value pairs that are separated by
an ampersand ( & ). For example:
http://example.com/update/person?id=42&email=person%40example.com
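As a small illustration, the requests library (used in the examples later in this module) can build that same query string from a dictionary of key-value pairs:
import requests

params = {"id": 42, "email": "person@example.com"}
resp = requests.get("http://example.com/update/person", params=params)
# requests percent-encodes the values and appends them to the path
print(resp.url)    # http://example.com/update/person?id=42&email=person%40example.com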
HTTP method
REST APIs use the standard HTTP methods, also known as HTTP verbs, as a way to
tell the web service which action is being requested for the given resource. There isn't a
standard that defines which HTTP method is mapped to which action, but the suggested
mapping looks like this:
POST - Create a new resource
GET - Read (retrieve) a resource
PUT - Update (replace) a resource
PATCH - Update (modify) part of a resource
DELETE - Delete a resource
Header
REST APIs use the standard HTTP header format to communicate additional
information between the client and the server, but this additional information is optional.
HTTP headers are formatted as name-value pairs that are separated by a colon ( : ),
[name]:[value]. Some standard HTTP headers are defined, but the web service
accepting the REST API request can define custom headers to accept.
There are two types of headers: request headers and entity headers.
Request headers
Request headers include additional information that doesn't relate to the content of the
message.
For example, here is a typical request header you may find for a REST API request:
Entity headers
Entity headers are additional information that describe the content of the body of the
message.
Here is a typical entity header you may find for a REST API request:
Body
The body of the REST API request contains the data pertaining to the resource that the
client wants to manipulate. REST API requests that use the HTTP method POST, PUT,
and PATCH typically include a body. Depending on the HTTP method, the body is
optional, but if data is provided in the body, the data type must be specified in the
header using the Content-Type key. Some APIs are built to accept multiple data types
in the request.
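A minimal sketch of such a request with the requests library; the endpoint and JSON payload are hypothetical:
import requests

url = "https://myservice.example.com/api/v1/devices"
body = {"name": "edge-router-1", "location": "branch-3"}
headers = {"Content-Type": "application/json"}    # tells the server how to interpret the body

resp = requests.post(url, json=body, headers=headers)
print(resp.status_code)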
REST API responses are essentially HTTP responses. These responses communicate
the results of a client's HTTP request. The response may contain the data that was
requested, signify that the server has received its request, or even inform the client that
there was a problem with their request.
REST API responses are similar to the requests, but are made up of three major
components:
HTTP Status
Header
Body
HTTP status
REST APIs use the standard HTTP status codes in the response to inform the client
whether the request was successful or unsuccessful. The HTTP status code itself can
help the client determine the reason for the error and can sometimes provide
suggestions for fixing the problem.
HTTP status codes are always three digits. The first digit is the category of the
response. The other two digits do not have meaning, but are typically assigned in
numerical order. There are five different categories:
1xx - Informational
2xx - Success
3xx - Redirection
4xx - Client Error
5xx - Server Error
1xx - informational
Responses with a 1xx code are for informational purposes, indicating that the server
received the request but is not done processing it. The client should expect a full
response later. These responses typically do not contain a body.
2xx - success
Responses with a 2xx code mean that the server received and accepted the request.
For synchronous APIs, these responses contain the requested data in the body (if
applicable). For asynchronous APIs, the responses typically do not contain a body and
the 2xx status code is a confirmation that the request was received but still needs to be
fulfilled.
3xx - redirection
Responses with a 3xx code mean that the client has an additional action to take in order
for the request to be completed. Most of the time a different URL needs to be used.
Depending on how the REST API was invoked, the user might be automatically
redirected without any manual action.
4xx - client error
Responses with a 4xx code mean that the request contains an error, such as bad
syntax or invalid input, which prevents the request from being completed. The client
must take action to fix these issues before resending the request.
5xx - server error
Responses with a 5xx code mean that the server is unable to fulfill the request even
though the request itself is valid. Depending on which particular 5xx status code it is, the
client may want to retry the request at a later time.
Header
Just like the request, the response's header uses the standard HTTP header format and is also
optional. The header in the response provides additional information from the server to the
client in name-value pair format, separated by a colon ( : ), [name]:[value].
There are two types of headers: response headers and entity headers.
Response headers
Response headers contain additional information that doesn't relate to the content of the message.
Some typical response headers you may find for a REST API request include:
Entity headers
Entity headers are additional information that describes the content of the body of the
message.
One common entity header specifies the type of data being returned: Content-Type.
Body
The body of the REST API response is the data that the client requested in the REST API
request. The body is optional, but if data is provided in the body, the data type is specified in the
header using the Content-Type key. If the REST API request was unsuccessful, the body
may provide additional information about the issue or an action that needs to be taken for the
request to be successful.
Response pagination
Some APIs, such as a search API, may need to send a huge amount of data in the response. To
reduce the bandwidth usage on the network, these APIs will paginate the response data.
Response pagination enables the data to be broken up into chunks. Most APIs that implement
pagination will enable the requester to specify how many items they want in the response.
Because there are multiple chunks, the API also has to allow the requester to specify which
chunk it wants. There isn't a standard way for an API to implement pagination, but most
implementations use the query parameter to specify which page to return in the response. Take a
look at the API's documentation to get the pagination details for the specific API you're using.
When the server needs to send very large amounts of data that cannot be paginated, compressed
data is another way to reduce the bandwidth.
This data compression can be requested by the client through the API request itself. To request a
data compression, the request must add the Accept-Encoding field to the request header.
The accepted values are:
gzip
compress
deflate
br
identity
If the server cannot provide any of the requested compression types, it will send a response back
with a status code of 406 -- Not acceptable.
If the server fulfills the compression, it will send the response back with the compressed data and
add the Content-Encoding field to the response header. The value of the Content-
Encoding is the type of compression that was used, enabling the client to decompress the data
appropriately.
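A minimal sketch of requesting compressed data with the requests library (hypothetical endpoint):
import requests

headers = {"Accept-Encoding": "gzip, deflate"}
resp = requests.get("https://myservice.example.com/api/v1/devices", headers=headers)

print(resp.headers.get("Content-Encoding"))   # e.g. 'gzip' if the server compressed the body
# requests transparently decompresses gzip/deflate bodies, so resp.text is already readable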
Sequence diagrams are used to explain a sequence of exchanges or events. They provide a
scenario of an ordered set of events. They are also referred to as event diagrams. While a single
REST API request may serve to obtain information or to initiate a change to a system, more
commonly, interaction with a particular REST API service will be a sequence of requests. For
that reason, sequence diagrams are frequently used to explain REST API request/response and
asynchronous activity.
Formalized sequence diagrams are closely linked to and are considered a subset of a standardized
modeling system known as Unified Modeling Language (UML). UML includes standardized
approaches to other aspects of static resources and process, including standardized ways to
diagram and explain user interfaces, class definitions and objects, and interaction behavior.
Sequence diagrams are one way to diagram interaction behavior.
In a standard sequence diagram, such as the one below, the Y-axis is ordered and unscaled time,
with zero time (t=0) at the top and time increasing towards the bottom. If an arrow or an
exchange is lower down in the diagram, it occurs after those that are above.
The X-axis is comprised of lifelines, represented by titled vertical lines, and exchanges or
messages represented by horizontal arrows. A lifeline is any element that can interact by
receiving or generating a message. By convention, front-end initiators of a sequence such as a
user, a client, or a web browser are placed on the left side of the diagram. Elements such as file
systems, databases, persistent storage, and so on, are placed to the right side. Intermediate
services, such as a webserver or API endpoints, are arranged in the middle.
One of the useful aspects of sequence diagrams is that users of the diagrams can focus on the
interaction between just two lifeline elements. For example, while the rest of a diagram might
help set context, REST API users can focus on the interaction between the client and the front-
end, or as shown below, the Host HTTP/S Service API endpoint handler.
API Services may handle some requests directly or may interpret and then forward some or all of
a request onto Core. The rightmost column, shown here as Configuration Database, might be
any persistent storage - or even a messaging or enqueuing process to communicate with another
system.
In this example, there are three separate sequences shown: create session, get devices, and create
device.
Create Session: The starting request is labeled HTTPS: Create Session w/credentials. Here,
the logic to create an HTTPS session is in the front-end, Host HTTP/S Service API endpoint
handler. (This is shown graphically by use of the arrow that loops back.)
Get Devices: The second request from the client is to request a list of devices from the platform.
In this sequence, the request is forwarded to the API Services module, which then contains the
logic to query the configuration database directly, obtain a list of devices, and return the list to
endpoint handler. The handler then wraps the content into an HTTPS response, and returns it to
the client, along with an HTTP status code indicating Success.
These first two sequences demonstrate synchronous exchanges, in which a request is followed by
a response, and the task is fully completed with Success or Failure.
Create Device: The third sequence starts with a POST request to create a device. In this case,
the request migrates to API Services, which then forwards the request to the Core logic, which
then starts to work on the request. The API Services wait only for a message from
the Core acknowledging that the device creation process has begun. The Core provides a handle
(TaskId) that identifies the work and allows follow up to see if the work was completed. The
TaskId value propagates in a response back to the client. The HTTP response tells the client only
the handle (TaskId) and that the request has been initiated with a response code of 202
(Accepted). This indicates that the request was accepted, and that the work is now in progress.
The Core logic continues to execute. It updates the Configuration Database and then informs
the API Services when it is complete. At some later time, the client may choose to confirm that
the task completed. The client does this with a Task Status query. API Services can then respond
with information about the completion status, and the success or failure status for that task.
Because the actual work requested was not completed prior to the response back to the client,
this interaction is considered asynchronous.
4.5.1 REST API Authentication
Merriam-Webster defines authentication as “…an act, process, or method of showing something
(such as an identity) to be real, true, or genuine.” For security reasons, most REST APIs require
authentication so that random users cannot create, update or delete information incorrectly or
maliciously, or access information that shouldn't be public. Without authentication, a REST API
permits anyone to access the features and system services that have been exposed through the
interface. Requiring authentication is the same concept as requiring a username/password
credential to access the administration page of an application.
Some APIs don't require authentication. These APIs are usually read-only and don't contain
critical or confidential information.
It's important to understand the difference between authentication and authorization when
working with REST APIs, because the two terms are often used interchangeably, or incorrectly.
Knowing the difference will help you troubleshoot any issues regarding the security surrounding
your REST API request.
Authentication
The figure shows a person behind a desk accepting an I D with a check on it from a person with
luggage. Words at the bottom: Authentication proves the user's identity.
Authentication is the act of verifying the user's identity. The user is proving that they
are who they say they are. For example, when you go to the airport, you have to show
your government-issued identification or use biometrics to prove that you are the person
you claim to be.
Authorization
Authorization is the user proving that they have the permissions to perform the requested action
on that resource. For example, when you go to a concert, all you need to show is your ticket to
prove that you are allowed in. You do not necessarily have to prove your identity.
Basic authentication
Basic Authentication, also known as Basic Auth, uses the standard Basic HTTP authentication
scheme. Basic Auth transmits credentials as username/password pairs separated with a colon
( : ) and encoded using Base64.
In a REST API request, the Basic Auth information will be provided in the header:
Basic Auth is the simplest authentication mechanism. It is extremely insecure unless it is paired
with requests using HTTPS rather than HTTP. Although the credentials are encoded, they are not
encrypted. It is simple to decode the credentials and get the username/password pair.
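A minimal sketch, with a placeholder endpoint and a placeholder user:password pair, of how the Authorization header is built and of the equivalent auth= shortcut in the requests library:
import base64
import requests

creds = base64.b64encode(b"user:password").decode()    # 'dXNlcjpwYXNzd29yZA=='
headers = {"Authorization": "Basic " + creds}
resp = requests.get("https://myservice.example.com/api/v1/devices", headers=headers, verify=False)

# Equivalent shortcut: requests builds the same Basic header from the auth tuple
resp = requests.get("https://myservice.example.com/api/v1/devices",
                    auth=("user", "password"), verify=False)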
Bearer authentication
Bearer Authentication, also known as Token Authentication, uses the standard Bearer HTTP
authentication scheme. It is more secure than Basic Authentication and is typically used with
OAuth (discussed later) and Single Sign-On (SSO). Bearer Authentication uses a bearer token,
which is a string generated by an authentication server such as an Identity Service (IdS).
In a REST API request, the Bearer Auth information will be provided in the header:
Just like Basic Authentication, Bearer Authentication should be used with HTTPS.
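A minimal sketch (the token value and endpoint are placeholders; a real token would be obtained from the authentication server first):
import requests

token = "eyJhbGciOiJIUzI1NiJ9.placeholder"     # placeholder bearer token
headers = {"Authorization": "Bearer " + token}
resp = requests.get("https://myservice.example.com/api/v1/devices", headers=headers, verify=False)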
API key
An API key, also referred to as an API Token, is a unique alphanumeric string generated by the
server and assigned to a user. To obtain a unique API key, the user typically logs into a portal
using their credentials. This key is usually assigned one time and will not be regenerated. All
REST API requests for this user must provide the assigned API key as the form of
authentication.
Just as with the other types of authentication, API keys are only secure when used with HTTPS.
API keys are intended to be an authentication mechanism, but are commonly misused as an
authorization mechanism.
A public API key can be shared and enables that user to access a subset of data and APIs. Do not
share a private key, because it is similar to your username and password. Most API keys do not
expire, and unless the key can be revoked or regenerated, if it is distributed or compromised,
anyone with that key can indefinitely access the system as you.
A REST API request can provide an API key in a few different ways:
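For illustration only, here are two common placements, a custom request header and a query parameter; the names X-API-Key and api_key are hypothetical, because each API defines its own:
import requests

api_key = "a1b2c3d4e5"    # placeholder key obtained from the provider's portal

# Placement 1: a custom request header
resp = requests.get("https://myservice.example.com/api/v1/devices",
                    headers={"X-API-Key": api_key})

# Placement 2: a query parameter
resp = requests.get("https://myservice.example.com/api/v1/devices",
                    params={"api_key": api_key})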
Open Authorization, also known as OAuth, combines authentication with authorization. OAuth
was developed as a solution to insecure authentication mechanisms. With increased security
compared to the other options, it is usually the recommended form of
authentication/authorization for REST APIs.
There are two versions of OAuth, simply named OAuth 1.0 and OAuth 2.0. Most of today's
REST APIs implement OAuth 2.0. Note that OAuth 2.0 is not backwards compatible with OAuth 1.0.
As defined in the OAuth 2.0 Authorization Framework (RFC 6749), “The OAuth 2.0 authorization
framework enables a third-party application to obtain limited access to an HTTP service, either on
behalf of a resource owner by orchestrating an approval interaction between the resource owner
and the HTTP service, or by allowing the third-party application to obtain access on its own behalf.”
Essentially, OAuth 2.0 enables pre-registered applications to get authorization to perform REST
API requests on a user's behalf without the user needing to share their credentials with the
application itself. OAuth lets the user provide credentials directly to the authorization server,
typically an Identity Provider (IdP) or an Identity Service (IdS), to obtain an access token that
can be shared with the application.
This process of obtaining the token is called a flow. The application then uses this token in the
REST API as a Bearer Authentication. The web service for the REST API then checks the
Authorization server to make sure that the token is valid, and that the requester is authorized to
perform the request.
4.6.1 What are Rate Limits?
REST APIs make it possible to build complex interactions.
Using an API rate limit is a way for a web service to control the number of requests a user or
application can make per defined unit of time. Implementing rate limits is best practice for public
and unrestricted APIs. Rate limiting helps:
Consumers of the API should understand the different algorithm implementations for rate
limiting to avoid hitting the limits, but applications should also gracefully handle situations in
which those limits are exceeded.
There isn't a standard way to implement rate limiting, but common algorithms include:
Leaky bucket
Token bucket
Fixed window counter
Sliding window counter
Leaky bucket
The leaky bucket algorithm puts all incoming requests into a request queue in the order
in which they were received. The incoming requests can come in at any rate, but the
server will process the requests from the queue at a fixed rate. If the request queue is
full, the request is rejected.
With this algorithm, the client must be prepared for delayed responses or rejected
requests.
Token bucket
The token bucket algorithm gives each user a defined number of tokens they can use within a
certain increment of time, and those tokens accumulate until they're used. When the client does
make a request, the server checks the bucket to make sure that it contains at least one token. If
so, it removes that token and processes the request. If there isn't a token available, it rejects the
request.
All requests made before token replenishment will be rejected. After the tokens are replenished,
the user can make requests again.
For example, an API using the Token Bucket algorithm sets a rate limit of 10 requests per client
per hour. If a client makes 11 requests within an hour, the 11th request will be rejected because
there are no tokens left. On the other hand, if the client then makes no requests for 6 hours, it can
then make 60 requests at once, because those tokens have accumulated.
With this algorithm, the client must calculate how many tokens it currently has in order to avoid
rejected requests. It also needs to handle the potential rejected requests by building in a retry
mechanism for when the tokens are replenished.
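A simplified, hypothetical sketch of how a server might implement a token bucket; the numbers mirror the 10-requests-per-hour example above, and a real implementation would add per-user state and persistence:
import time

class TokenBucket:
    def __init__(self, rate, per, capacity):
        # 'rate' tokens are replenished every 'per' seconds, accumulating up to 'capacity'
        self.rate, self.per, self.capacity = rate, per, capacity
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate / self.per)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True       # process the request
        return False          # reject the request

bucket = TokenBucket(rate=10, per=3600, capacity=60)   # 10 tokens per hour, up to 60 accumulated
print(bucket.allow())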
The figure shows a faucet with water droplets falling into a container that is half full. Three
droplets come from the bottom of the container. Words at the bottom: Visual representation of
the leaky bucket algorithm.
The fixed window counter algorithm is similar to the token bucket, except for two major
differences:
For this algorithm, a fixed window of time is assigned a counter to represent how many
requests can be processed during that period. When the server receives a request, the
counter for the current window of time is checked to make sure it is not zero. When the
request is processed, the counter is decremented. If the limit for that window of time is met,
all subsequent requests within that window of time will be rejected. When the next
window of time begins, the counter will be set back to the pre-determined count and
requests can be processed again.
To go back to our previous example of 10 requests per hour using this algorithm, the
11th request in an hour will still be rejected, but after 6 hours with no requests, the client
can still only make 10 requests in a single hour because those "unused" requests were
not accumulated.
With this algorithm, the client must know when the window of time starts and ends so
that it knows how many requests can be made within that duration of time. Just like the
token bucket algorithm, the client must build a retry mechanism so that it can retry the
requests when the next window of time has started.
The sliding window counter algorithm allows a fixed number of requests to be made in a set
duration of time. This duration of time is not a fixed window and the counter is not replenished
when the window begins again. In this algorithm, the server stores the timestamp when the
request is made. When a new request is made, the server counts how many requests have already
been made from the beginning of the window to the current time in order to determine if the
request should be processed or rejected. For example, if the rate is five requests per minute,
when the server receives a new request, it checks how many requests have been made in the last
60 seconds. If five requests have already been made, then the new request will be rejected.
With this algorithm, the client does not need to know when the window of time starts and ends. It
just needs to make sure that the rate limit has not been exceeded at the time of the request. The
client also needs to design a way to delay requests if necessary so that it doesn't exceed the
allowed rate, and, of course, accommodate rejected requests.
4.6.3 Knowing the Rate Limit
An API's documentation usually provides details of the rate limit and unit of time. In addition,
many rate limiting APIs add details about the rate limit in the response's header. Because there
isn't a standard, the key-value pair used in the header may differ between APIs. Some commonly
used keys include X-RateLimit-Limit, X-RateLimit-Remaining, and X-RateLimit-Reset.
The client can use this information to keep track of how many more API requests they can make
in the current window, helping the client avoid hitting the rate limit.
When the rate limit has been exceeded, the server automatically rejects the request and sends
back an HTTP response informing the user. In addition, it is common for the response containing
the "rate limit exceeded" error to also include a meaningful HTTP status code. Unfortunately,
because there isn't a standard for this interaction, the server can choose which status code to
send. The most commonly used HTTP status codes are 429: Too Many Requests or 403:
Forbidden; make sure your client is coded for the specific API it is using.
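As a hedged sketch of a rate-limit-aware client, using the commonly seen X-RateLimit-Remaining and Retry-After headers (a given API may use different names) and a hypothetical endpoint:
import time
import requests

url = "https://myservice.example.com/api/v1/devices"
resp = requests.get(url)
print(resp.headers.get("X-RateLimit-Remaining"))     # requests left in the current window, if provided

if resp.status_code == 429:
    wait = int(resp.headers.get("Retry-After", 60))  # honor Retry-After, else fall back to 60 seconds
    time.sleep(wait)
    resp = requests.get(url)                         # retry once after the limit window has passed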
A webhook is an HTTP callback, or an HTTP POST, to a specified URL that notifies your
application when a particular activity or “event” has occurred in one of your resources on the
platform. The concept is simple. Think of asking someone "tell me right away if X happens".
That "someone" is the webhook provider, and you are the application.
Webhooks enable applications to get real-time data, because they are triggered by particular
activities or events. With webhooks, applications are more efficient because they no longer need
to have a polling mechanism. A polling mechanism is a way of repeatedly requesting
information from the service, until a condition is met. Imagine needing to ask someone, again
and again, "Has X happened yet? Has X happened yet?" Annoying, right? That's polling. Polling
degrades the performance of both the client and server due to repeatedly processing requests and
responses. Furthermore, polling isn't real-time, because polls happen at fixed intervals. If an
event occurs right after the last time you polled, your application won't learn about the changed
status until the poll interval expires and the next poll occurs.
Webhooks are also known as reverse APIs, because applications subscribe to a webhook server
by registering with the webhook provider. During this registration process, the application
provides a URI to be called by the server when the target activity or event occurs. This URI is
typically an API on the application side that the server calls when the webhook is triggered.
When the webhook is triggered, the server sends a notification by becoming the caller and makes
a request to the provided URI. This URI represents the API for the application, and the
application becomes the callee and consumes the request. As a result, for webhooks, the roles are
reversed; the server becomes the client and the client becomes the server. Multiple applications
can subscribe to a single webhook server.
Examples:
The Cisco DNA Center platform provides webhooks that enable third-party applications
to receive network data when specified events occur. You can have your application
registered with a particular REST endpoint URI that receives a message from Cisco
DNAC when a particular event occurs. For example, if a network device becomes
unreachable, the Cisco DNAC webhook can send an HTTP POST to the URI your app
has registered on the Cisco DNAC. Your application then receives all the details of the
outage in a JSON object from that HTTP POST so it can take action as appropriate. In
this case, Cisco DNAC is the webhook provider.
You can create a webhook to have Cisco Webex Teams notify you whenever new
messages are posted in a particular room. This way, instead of your app making repeated
calls to the Teams API to determine whether a new message has been posted, the
webhook automatically notifies you of each message. In this case, Cisco Webex Teams is
the webhook provider.
Consuming a webhook
In order to receive a notification from a webhook provider, the application must meet certain
requirements:
The application must be running at all times to receive HTTP POST requests.
The application must register a URI on the webhook provider so that the provider knows
where to send a notification when target events occur.
In addition to these two requirements, the application must handle the incoming notifications
from the webhook server.
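A minimal sketch of such a receiver, assuming the third-party Flask framework and a hypothetical /webhook URI that has been registered with the provider:
from flask import Flask, request

app = Flask(__name__)

@app.route("/webhook", methods=["POST"])
def webhook():
    event = request.get_json()    # notification payload POSTed by the webhook provider
    print("Received event:", event)
    return "", 204                # acknowledge receipt quickly

if __name__ == "__main__":
    app.run(port=5000)            # the registered URI must be reachable by the provider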
Because working with webhooks involves third parties, it can sometimes be challenging to
ensure everything is working properly. There are many free online tools that ensure your
application can receive notifications from a webhook. Many can also give you a preview of the
content provided in the webhook notification. These tools can be found by searching for
"webhook tester", and can be useful when designing and implementing your application.
4.8.1 Troubleshooting REST API Requests
You have learned about REST APIs and explored them in a lab. In the provided scenarios,
everything was working properly and was always successful. Of course, in real life, that is not
always the way things work. At some point, you will find yourself making REST API requests
and not getting the response that you expected. In this part, you will learn how to troubleshoot
the most common REST API issues.
Note that when you apply these concepts and troubleshooting tips to a particular REST API,
make sure you have the API reference guide and API authentication information handy; it will
make your job a lot easier.
4.8.2 No Response and HTTP Status Code from the API server
True or false: every request that is sent to the REST API server will return a response with a
status code.
The answer is false. While one might expect to always receive an HTTP status code such as 2xx,
4xx and 5xx, there are many cases where the API server cannot be reached or fails to respond. In
those scenarios, you will not receive an HTTP status code because the API server is unable to
send a response.
While a non-responsive server can be a big issue, the root cause of the unresponsiveness can be
simple to identify. You can usually identify what went wrong from the error messages received
as a result of the request.
Let's look at some troubleshooting tips to help you debug why the API server didn't send a
response and HTTP status code, as well as some potential ways to fix the problem.
The first thing to check is whether you have a client-side error; it is here that you have more
control in terms of fixing the issue.
When using a REST API for the first time, mistyping the URI is common. Check the API
reference guide for that particular API to verify that the request URI is correct.
Invalid URI example
To test the invalid URI condition, run a script such as this one, which simply makes the request
to a URI that is missing the scheme. You can create a Python file or run it directly in a Python
interpreter.
import requests
uri = "sandboxdnac.cisco.com/dna/intent/api/v1/network-device"
resp = requests.get(uri, verify = False)
To test the wrong domain name condition, run a script such as this one, which simply makes the
request to a URI that has the wrong domain name.
import requests
url = "https://sandboxdnac123.cisco.com/dna/intent/api/v1/network-device"
resp = requests.get(url, verify = False)
....requests.exceptions.ConnectionError:
HTTPSConnectionPool(host='sandboxdnac123.cisco.com', port=443):
Max retries exceeded with url: /dna/intent/api/v1/network-device
(Caused by
NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection
object at 0x109541080>: Failed to establish a new connection:
[Errno 8] nodename nor servname provided, or not known'))....
If the URI is correct, it may still be inaccessible. Ask yourself these questions:
When the scheme of the URI is HTTPS, the connection will perform an SSL handshake between
the client and the server in order to authenticate one another. This handshake needs to be
successful before the REST API request is even sent to the API server.
When using the requests library in Python to make the REST API request, if the SSL
handshake fails, the traceback will contain requests.exceptions.SSLError.
For example, if the SSL handshake fails due to the certificate verification failing, the traceback
will look like this:
requests.exceptions.SSLError: HTTPSConnectionPool(host='xxxx',
port=443): Max retries exceeded with url:
/dna/intent/api/v1/network-device (Caused by
SSLError(SSLCertVerificationError(1, '[SSL:
CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed
certificate (_ssl.c:1108)')))
In this situation, you must fix the invalid certificate. But, if you are working in a lab environment
where the certificates aren't valid yet, you can turn off the certificate verification setting.
To turn it off for the requests library in Python, set the verify parameter to False in the request.
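For example, reusing the sandbox URL from earlier in this section (suppressing urllib3's warning is optional and appropriate only in a lab):
import requests
import urllib3

urllib3.disable_warnings()    # silence the InsecureRequestWarning in a lab environment
url = "https://sandboxdnac.cisco.com/dna/intent/api/v1/network-device"
resp = requests.get(url, verify = False)    # skip certificate verification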
Resolution: Client errors are usually simple to fix, especially when the error messages in the
traceback indicate what may be wrong and the possible solution. Analyze the logs carefully for
the root cause.
After you've verified that there aren't any client-side errors, the next thing to check is server-side
errors.
The most obvious place to start is to make sure that the API server itself is functioning properly.
There are a few things you will want to ask yourself:
import requests
url = "https://209.165.209.225/dna/intent/api/v1/network-device"
resp = requests.get(url, verify = False)
Note: The IP address in this script is bogus, but the script produces the same result as a server
that is unreachable.
If the API server is not functioning, you'll get a long silence followed by a traceback that looks
like this:
....requests.exceptions.ConnectionError:
HTTPSConnectionPool(host='209.165.209.225', port=443): Max
retries exceeded with url: /dna/intent/api/v1/network-device
(Caused by
NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection
object at 0x10502fe20>: Failed to establish a new connection:
[Errno 60] Operation timed out'))....
Is there a communication issue between the API server and the client?
If the server is functioning properly, you'll need to determine whether there's a reason the client
isn't receiving a response. Ask yourself:
Are the IP address and domain name accessible from the client's location on the network?
Is the API server sending a response, but the client isn't receiving it?
To test this condition, use a network capturing tool to see if the response from the API server is
lost in the communication between the API server and the client. If you have access, take a look
at the API server logs to determine if the request was received, if it was processed, and if a
response was sent.
Resolution: Server side issues cannot be resolved from the API client side. Contact the
administrator of the API server to resolve this issue.
Because REST APIs are HTTP-based, they also use HTTP status codes to notify API clients of
request results.
Unless otherwise stated, the status code is part of the HTTP/1.1 standard (RFC 7231), which
means the first digit of the status code defines the class of response. (The last two digits do not
have any class or categorization role, but help distinguish between results.) As listed earlier,
there are five of these categories: 1xx, 2xx, 3xx, 4xx, and 5xx.
Typically we see 2xx, 4xx and 5xx from the REST API server. Usually you can find the root
cause of an error when you understand the meaning of the response codes. Sometimes the API
server also provides additional information in the response body.
For all requests that have a return status code, perform these steps to troubleshoot errors:
Step 1: Check the return code. It can help to output the return code in your script during the
development phase.
Step 2: Check the response body. Output the response body during development; most of the
time you can find what went wrong in the response message sent along with the status code.
Step 3: If you can't resolve the issue using the above two steps, use the status code reference to
understand the definition of the status code.
2xx - Success
When the client receives a 2xx response code, it means the client's request was successfully
received, understood, and accepted. However, you should always verify that the response
indicates success of the right action and that the script is doing what you think it should.
A 4xx response means that the error is on the client side. Some servers may include an entity
containing an explanation of the error. If not, here are some general guidelines for
troubleshooting common 4xx errors:
400 - Bad Request
The request could not be understood by the server due to malformed syntax. Check your API
syntax.
One cause of a 400 Bad Request error is the resource itself. Double-check the endpoint and
resource you are calling. Did you misspell one of the resources or forget the "s" to make it plural,
such as /device versus /devices or /interface versus /interfaces? Is the URI well-formed
and complete?
Another cause of a 400 Bad Request error might be a syntax issue in the JSON object that
represents your POST request.
import requests
url = "http://myservice/api/v1/resources/house/rooms/id/"
resp = requests.get(url,auth=("person2","super"),verify = False)
print (resp.status_code)
print (resp.text)
This example returns a status code of 400. The server side also tells you "No id field provided",
because the id is mandatory for this API request. You would have to look that up in the
documentation to be sure.
import requests
url =
"http://myservice/api/v1/resources/house/rooms/id/1001001027331"
resp = requests.get(url,auth=("person2","super"),verify = False)
print (resp.status_code)
print (resp.text)
401 - Unauthorized
This error message means the server could not authenticate the request.
Check your credentials, including username, password, API key, token, and so on. If there are no
issues with those items, you may want to check the request URI again, because the server may
reject access in the case of an improper request URI.
import requests
url = "http://myservice/api/v1/resources/house/rooms/all"
resp = requests.get(url,verify = False)
print (resp.status_code)
print (resp.text)
This example returns a status code of 401. Can you guess what is wrong just by reading the code?
The request isn't providing a credential. The authentication auth=("person1","great") should
be added to the code.
import requests
url = "http://myservice/api/v1/resources/house/rooms/all"
resp = requests.get(url,auth=("person1","great"),verify = False)
print (resp.status_code)
print (resp.text)
403 - Forbidden
In this case, the server recognizes the authentication credentials, but the client is not authorized
to perform the request. Some APIs, such as Cisco DNA Center, have Role Based Access
Control, and require a super-admin role to execute certain APIs. Again, the API reference guide
may provide additional information.
import requests
url = "http://myservice/api/v1/resources/house/rooms/id/1"
resp = requests.get(url,auth=("person1","great"),verify = False)
print (resp.status_code)
print (resp.text)
Why is this imaginary code snippet not working? We just used the username person1 and the
password great, and it worked in the last example.
The status code 403 is not an authentication issue; the server trusts the user's identity, it is
just that the user does not have enough privileges to use that particular API. If we
use person2/super as the username/password for the authentication, it will work, because person2
has the privilege to execute this API. Can you envision how to make it work?
import requests
url = "http://myservice/api/v1/resources/house/rooms/id/1"
resp = requests.get(url,auth=("person2","super"),verify = False)
print (resp.status_code)
print (resp.text)
404 - Not Found
The server has not found anything matching the request URI; check the request URI to make sure it is correct. If the code used to work, you may want to check the latest API reference guide, as an API's syntax can change over time.
Consider the Cisco DNA Center "get all interfaces" API. The title says, "all interfaces", so you
try to use api/v1/interfaces, but you get a 404 error because the API request is
actually api/v1/interface.
import requests
url = "http://myservice/api/v1/resources/house/room/all"
resp = requests.get(url,auth=("person1","great"),verify = False)
print (resp.status_code)
print (resp.text)
Why did this script return a 404 status code, and can you guess how to fix it? The resource is rooms, not room, so correcting the URI resolves the error:
import requests
url = "http://myservice/api/v1/resources/house/rooms/all"
resp = requests.get(url,auth=("person1","great"),verify = False)
print (resp.status_code)
print (resp.text)
405 - Method Not Allowed
In this case, the request was recognized by the server, but the method specified in the request has been rejected by the server. You may want to check the API reference guide to see which methods the server expects. The response from the server may also include an Allow header containing a list of valid methods for the requested resource.
For example, if you mistakenly use the POST method for an API that expects the GET method, you will receive a 405 error.
import requests
url = "http://myservice/api/v1/resources/house/rooms/id/1"
resp = requests.post(url,auth=("person2","super"),verify = False)
print (resp.status_code)
print (resp.text)
406 - Not Acceptable
This error indicates that the target resource does not have a current representation that would be acceptable to the client. The server has the data, but cannot represent it using any of the options listed in the client's Accept header.
For example, the client is asking for SVG images: Accept: image/svg+xml
import requests
headers = {'Accept': 'image/svg+xml'}
url = "http://myservice/api/v1/resources/house/rooms/id/1"
resp = requests.get(url,headers=headers,auth=("person2","super"),verify = False)
print (resp.status_code)
print (resp.text)
In this case, the server cannot return the requested representation, image/svg+xml, so it responds with a 406 error.
407 - Proxy Authentication Required
This code is similar to 401 (Unauthorized), but it indicates that the client must first authenticate itself with the proxy. In this scenario, there is a proxy server between the client and the server, and the 407 response code indicates that the client needs to authenticate with the proxy server first.
409 - Conflict
The request could not be completed due to a conflict with the current state of the target resource. For example, an edit conflict where a resource is being edited by multiple users would cause a 409 error. Retrying the request later might succeed, as long as the conflict is resolved by the server.
415 - Unsupported Media Type
In this case, the client sent a request body in a format that the server does not support. For example, if the client sends XML to a server that only accepts JSON, the server would return a 415 error.
import requests
headers = {"content-type":"application/xml"}
url = "http://myservice/api/v1/resources/house/rooms/id/1"
resp =
requests.get(url,headers=headers,auth=("person2","super"),verify
= False)
print (resp.status_code)
print (resp.text)
From the error message, can you guess what would fix the code?
import requests
headers = {"content-type":"application/json"}
url = "http://myservice/api/v1/resources/house/rooms/id/1"
resp = requests.get(url,headers=headers,auth=("person2","super"),verify = False)
print (resp.status_code)
print (resp.text)
These are just the most common 4xx response codes. If you encounter other 4xx response codes, you can refer to RFC 2616, section 6.1 (Status-Line), or RFC 7231, section 6 (Response Status Codes), for more information on what they mean.
500 - Internal Server Error
This error means that the server encountered an unexpected condition that prevented it from fulfilling the request.
501 - Not Implemented
This error means that the server does not support the functionality required to fulfill this request. For example, the server will respond with a 501 code when it does not recognize the request method and is therefore incapable of supporting it for any resource.
502 - Bad Gateway
This error means that the server, while acting as a gateway or proxy, received an invalid response from an inbound server it accessed while attempting to fulfill the request.
503 - Service Unavailable
This code indicates that the server is currently unable to handle the request due to a temporary overload or scheduled maintenance, which will likely be resolved after a delay.
504 - Gateway Timeout
This error means that the server, while acting as a gateway or proxy, did not receive a timely response from an upstream server it needed to access in order to complete the request.
If you get a 500 or 501 error, check the API reference guide to make sure the request is valid. For
other 5xx errors, check with your API server administrator to resolve the issue.
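For transient 5xx errors such as 502, 503, or 504, a simple client-side mitigation is to retry after a short delay. A rough sketch, again using the imaginary service from the earlier examples:
import time
import requests
url = "http://myservice/api/v1/resources/house/rooms/all"
for attempt in range(3):
    resp = requests.get(url, auth=("person1", "great"), verify=False)
    if resp.status_code not in (502, 503, 504):
        break    # success, or an error that a retry will not fix
    # Honor Retry-After if the server provides it; otherwise back off exponentially.
    wait = int(resp.headers.get("Retry-After", 2 ** attempt))
    time.sleep(wait)
print(resp.status_code)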
Before you start troubleshooting your API, it is crucial to have your API reference guide and status code references in hand. If you do not receive a response code at all, you can usually identify the cause from the error message or traceback produced by your script. If you do receive a response code, you can identify the root cause of the error, including whether the error is on the client or server side, by interpreting that response code.
Introducing APIs
An Application Programming Interface (API) defines the ways users, developers, and other
applications can interact with an application's components. An API can use common web-based
interactions or communication protocols, or it can use its own proprietary standards. Common use cases include automating tasks, integrating data between systems, and extending a product's functionality. Most modern APIs are designed into the product. These APIs are usually thoroughly tested, are reliable, and are sometimes even used by the product itself.
The three most popular types of API architectural styles are RPC, SOAP, and REST.
Remote Procedure Call (RPC) is a request-response model that lets an application (acting as a
client) make a procedure call to another application (acting as a server). The "server" application
is typically located on another system within the network.
Simple Object Access Protocol (SOAP) is a messaging protocol. It is used for communicating
between applications that may be on different platforms or built with different programming
languages. It is an XML-based protocol and is commonly used with HyperText Transfer Protocol
(HTTP) transport, but can be applied to other protocols. SOAP is independent, extensible, and
neutral.
REpresentational State Transfer (REST) is an architectural style that was created as a hybrid
style, derived from several network-based architectural styles that are combined with additional
constraints that define a uniform connector interface. There are six constraints applied to
elements within the architecture: Client-Server, Stateless, Cache, Uniform Interface, Layered
System, and Code-On-Demand. These six constraints can be applied to any protocol, and when
they are applied, the API is commonly described as RESTful.
A REST web service API (REST API) is a programming interface that communicates over
HTTP while adhering to the principles of the REST architectural style. REST API requests are
essentially HTTP requests that follow the REST principles. These requests are a way for an
application (client) to ask the server to perform a function. REST API responses are essentially
HTTP responses. These responses communicate the results of a client's HTTP request. The
response may contain the data that was requested, signify that the server has received its request,
or even inform the client that there was a problem with their request. Sequence diagrams are
used to explain a sequence of exchanges or events. They provide a scenario of an ordered set of
events. They are also referred to as event diagrams.
For security reasons, most REST APIs require authentication so that random users cannot create,
update or delete information incorrectly or maliciously, or access information that shouldn't be
public. Without authentication, a REST API permits anyone to access the features and system
services that have been exposed through the interface. Authentication is the act of verifying the
user's identity: the user proves that they are who they say they are. Authorization determines whether the authenticated user has permission to perform the requested action on that resource.
Common types of authentication mechanisms include Basic, Bearer, and API Key. Open
Authorization, also known as OAuth, combines authentication with authorization.
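As a rough illustration with the requests library (the endpoint is the imaginary service used earlier, and the exact header name for an API key varies between services):
import requests
url = "http://myservice/api/v1/resources/house/rooms/all"
# Basic authentication: username and password carried in the Authorization header
resp = requests.get(url, auth=("person1", "great"), verify=False)
# Bearer token: the token itself is placed in the Authorization header
resp = requests.get(url, headers={"Authorization": "Bearer <access-token>"}, verify=False)
# API key: many services expect a custom header such as X-Api-Key (name is service-specific)
resp = requests.get(url, headers={"X-Api-Key": "<api-key>"}, verify=False)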
Using an API rate limit is a way for a web service to control the number of requests a user or
application can make per defined unit of time. Common rate limit algorithms include: Leaky
bucket, Token bucket, Fixed window counter, and Sliding window counter. Many rate limiting
APIs add details about the rate limit in the response's header. Because there is no standard, the
key-value pair used in the header may differ between APIs. Some commonly used keys include:
● X-RateLimit-Limit: The maximum number of requests that can be made in a specified unit of
time
● X-RateLimit-Remaining: The number of requests remaining that the requester can make in
the current rate limit window
When the rate limit has been exceeded, the server automatically rejects the request and sends back an HTTP response informing the user. It is also common for the "rate limit exceeded" response to carry a meaningful HTTP status code, typically 429 (Too Many Requests).
Working with Webhooks
A webhook is an HTTP callback, or an HTTP POST, to a specified URL that notifies your
application when a particular activity or “event” has occurred in one of your resources on the
platform. Webhooks enable applications to get real-time data, because they are triggered by
particular activities or events. With webhooks, applications are more efficient because they no
longer need to have a polling mechanism. In order to receive a notification from a webhook
provider, the application must be running at all times to receive HTTP POST requests, and it
must register a URI on the webhook provider so that the provider knows where to send a
notification when target events occur. In addition to these two requirements, the application must
handle the incoming notifications from the webhook server.
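As an illustration only, a minimal webhook receiver could be built with a lightweight web framework such as Flask; the /webhook path and the eventType field are assumptions, because the real contract is defined by the webhook provider:
from flask import Flask, request

app = Flask(__name__)

# The provider is registered with https://<your-host>/webhook as the notification URI.
@app.route("/webhook", methods=["POST"])
def handle_webhook():
    event = request.get_json(silent=True) or {}
    print("Received event:", event.get("eventType", "unknown"))
    return "", 204    # acknowledge quickly so the provider does not retry

if __name__ == "__main__":
    app.run(port=8080)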
Before you troubleshoot, make sure you have the API reference guide and API authentication
information handy.
It is not true that every request that is sent to the REST API server will return a response with a
status code. While one might expect to always receive a HTTP status code such as 2xx, 4xx and
5xx, there are many cases where the API server cannot be reached or fails to respond. In those
scenarios, you will not receive an HTTP status code because the API server is unable to send a
response.
While a non-responsive server can be a big issue, the root cause of the unresponsiveness is often simple to identify. You can usually determine what went wrong from the error messages produced by the request. The problem could be a client-side error, user error, a wrong URI or domain, a connectivity issue, an invalid certificate, a server-side error, or a communication problem between the server and the client. How do you narrow it down? By correctly interpreting status codes: 4xx codes indicate client-side errors and 5xx codes indicate server-side errors.
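A sketch of how a script can separate "no response at all" from "a response with an error status", using the exception classes raised by the requests library:
import requests
url = "http://myservice/api/v1/resources/house/rooms/all"
try:
    resp = requests.get(url, auth=("person1", "great"), verify=False, timeout=5)
except requests.exceptions.SSLError as err:
    print("Certificate problem, no status code received:", err)
except requests.exceptions.ConnectionError as err:
    print("Could not reach the server (DNS, wrong domain, connectivity):", err)
except requests.exceptions.Timeout as err:
    print("No timely response, no status code received:", err)
else:
    if 400 <= resp.status_code < 500:
        print("Client-side error:", resp.status_code)
    elif resp.status_code >= 500:
        print("Server-side error:", resp.status_code)
    else:
        print("Success:", resp.status_code)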
In this lab, you will create an application that retrieves JSON data from the Graphhopper
Directions API, parses the data, and formats it for output to the user. You will use the GET
Route request from the Graphhopper Directions API.
As a NetAcad student, you may already be well-versed in network operations. If you have been
paying attention to the changing world of networks, then more of your time recently may have
been spent learning about coding. If so, good for you! You might need this module to refresh
your memory about network fundamentals.
Why does a developer need to know about the network that all application traffic travels through? Why does a network engineer, already familiar with the intricate details and architecture decisions behind network traffic, need to know about application development? One without the other doesn't quite get the job done when the job is managing and deploying applications at scale while meeting performance requirements and expectations. Let's lay the network foundation and study beyond the basics.
Module Objective: Apply the processes and devices that support network connectivity.
5.1.1 Overview
End users of a network just want it to work. Developers are more curious and often willing to troubleshoot their own connectivity issues. Network administrators benefit from methods that automatically and programmatically manage and deploy network configurations, including day zero scenarios.
Performance is top of mind for everyone, regardless of their perspective. With automation you can deploy faster. With application monitoring you can troubleshoot faster. Knowing how to troubleshoot network connectivity is crucial to both developers and administrators, because quicker resolution of problems benefits everyone.
This topic looks at the fundamental pieces of a network. You want to know what standards are
used for networks to make sure you have the right vocabulary to talk about network problems or
solutions with anyone on any team. A high-level understanding of the layers that network traffic
goes through gives you a head start on the knowledge you need to work on networks,
applications, and automation.
A network consists of end devices such as computers, mobile devices, and printers. These
devices are connected by networking devices such as switches and routers. The network enables
the devices to communicate with one another and share data. There are many ways to connect to
the network. The most common local area network (LAN) methods, specified by the Institute of
Electrical and Electronics Engineers (IEEE), are wired Ethernet LANs (IEEE 802.3) and wireless
LANs (IEEE 802.11). These end-devices connect to the network using an Ethernet or wireless
network interface card (NIC).
Ethernet NICs connect to the network via registered jack 45 (RJ-45) ports and twisted pair
Ethernet cables. Wireless NICs connect to the network via wireless radio signals in the 2.4 GHz
or more commonly 5 GHz frequency bands.
Protocol Suites
A protocol suite is a set of protocols that work together to provide comprehensive network communication services. Since the 1970s there have been several different protocol suites, some developed by standards organizations and others by various vendors, and several of these suites competed with one another during the evolution of network communications and the internet.
Today, the OSI model and the TCP/IP model, shown in the figure, are used to describe network operations.
Both the OSI and TCP/IP models use layers to describe the functions and services that can occur at each layer. These models provide consistency across all types of network protocols and services by describing what must be done at a particular layer, without prescribing how it should be accomplished. They also describe how each layer interacts with the layers directly above and below it.
Either model can be used to describe network operations, although they differ in the number and naming of their layers.
The form that a piece of data takes at any layer is called a protocol data unit (PDU). During
encapsulation, each succeeding layer encapsulates the PDU that it receives from the layer
above in accordance with the protocol being used. When messages are sent on a network, the
encapsulation process works from top to bottom, as shown in the figure.
Data - The general term for the PDU used at the application layer
Segment - transport layer PDU
Packet - network layer PDU
Frame - data link layer PDU
Bits - physical layer PDU used when physically transmitting data over the medium
At each layer, the upper layer information is considered data within the encapsulated protocol.
For example, the transport layer segment is considered data within the internet layer packet.
The packet is then considered data within the link layer frame.
An advantage with layering the data transmission process is the abstraction that can be
implemented with it. Different protocols can be developed for each layer and interchanged as
needed. As long as the protocol provides the functions expected by the layer above, the
implementation can be abstracted and hidden from the other layers. Abstraction of the protocol
and services in these models is done through encapsulation.
In general, an application uses a set of protocols to send data from one host to another. On the sending host, the data is encapsulated at each layer as it moves down the stack from the top layer to the bottom layer; on the receiving host, it travels the reverse path and is de-encapsulated at each layer from the bottom layer back up to the top.
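Purely as a mental model (real protocol stacks are implemented by the operating system, not by application code), encapsulation can be pictured as each layer wrapping whatever it receives from the layer above; the header fields below are invented placeholders:
data = "GET /index.html"    # application data
segment = {"src_port": 49152, "dst_port": 80, "payload": data}    # transport layer
packet = {"src_ip": "192.168.2.10", "dst_ip": "203.0.113.5", "payload": segment}    # internet layer
frame = {"src_mac": "00:50:56:c0:00:01", "dst_mac": "00:50:56:c0:00:02", "payload": packet}    # link layer
# De-encapsulation on the receiving host unwraps the same layers in reverse.
print(frame["payload"]["payload"]["payload"])    # GET /index.html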
At each layer, protocols perform the functionality required by that specific layer. The following
describes the functionality of each layer of the OSI model, starting from layer 1.
Note: An OSI model layer is often referred to by its number.
Physical Layer (Layer 1)
This layer is responsible for the transmission and reception of raw bit streams. At this layer, the
data to be transmitted is converted into electrical, radio, or optical signals. Physical layer
specifications define voltage levels, physical data rates, modulation scheme, pin layouts for cable
connectors, cable specification, and more. Ethernet, Bluetooth, and Universal Serial Bus (USB)
are examples of protocols that have specifications for the physical layer.
Data Link Layer (Layer 2)
This layer provides NIC-to-NIC communications on the same network. The data link layer
specification defines the protocols to establish and terminate connections, as well as the flow
control between two physically connected devices. The IEEE has several protocols defined for
the data link layer. The IEEE 802 family of protocols, which includes Ethernet and wireless
LANs (WLANs), subdivide this layer into two sublayers:
Medium Access Control (MAC) sublayer - The MAC sublayer is responsible for controlling how
devices in a network gain access to the transmission medium and obtain permission to transmit
data.
Logical Link Control (LLC) sublayer - The LLC sublayer is responsible for identifying and
encapsulating network layer protocols, error checking controls, and frame synchronization. IEEE
802.3 Ethernet, 802.11 Wi-Fi, and 802.15.4 ZigBee protocols operate at the data link layer. The
MAC sublayer within the data link layer is critically important in broadcast environments (such as wireless transmission) in which access to the transmission medium must be carefully controlled.
Network Layer (Layer 3)
This layer provides services that allow end devices to exchange data across networks. Its two main functions are addressing and routing:
Addressing - All devices must be configured with a unique IP address for identification on the network.
Routing - Routing protocols provide services to direct the packets to a destination host on
another network. To travel to other networks, the packet must be processed by a router. The
role of the router is to select the best path and forward packets to the destination host in a
process known as routing. A packet may cross many routers before reaching the destination
host. Each router a packet crosses to reach the destination host is called a hop.
The network layer also includes the Internet Control Message Protocol (ICMP) to provide
messaging services such as to verify connectivity with the ping command or discover the path
between source and destination with the traceroute command.
Transport Layer (Layer 4)
The transport layer defines services to segment, transfer, and reassemble the data for individual
communications between the end devices. This layer has two protocols: Transmission Control
Protocol (TCP) and User Datagram Protocol (UDP).
TCP provides reliability and flow control using these basic operations:
Number and track data segments transmitted to a specific host from a specific application.
Acknowledge received data.
Retransmit any unacknowledged data after a certain amount of time.
Sequence data that might arrive in the wrong order.
Send data at an efficient rate that is acceptable by the receiver.
TCP is used with applications such as databases, web browsers, and email clients. TCP requires
that all data that is sent arrives at the destination in its original condition. Any missing data could
corrupt a communication, making it either incomplete or unreadable.
UDP is a simpler transport layer protocol than TCP. It does not provide reliability and flow
control, which means it requires fewer header fields. UDP datagrams can be processed faster
than TCP segments.
UDP is preferable for applications such as Voice over IP (VoIP). Acknowledgments and
retransmission would slow down delivery and make the voice conversation unacceptable. UDP is
also used by request-and-reply applications where the data is minimal, and retransmission can be
done quickly. Domain Name System (DNS) uses UDP for this type of transaction.
Application developers must choose which transport protocol type is appropriate based on the
requirements of the applications. Video may be sent over TCP or UDP. Applications that stream
stored audio and video typically use TCP. The application uses TCP to perform buffering,
bandwidth probing, and congestion control, in order to better control the user experience.
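To make the choice concrete, here is a minimal sketch of both options using Python's standard socket module; 192.0.2.10 is a documentation placeholder address, so the snippet assumes a server is actually listening there:
import socket

# TCP: connection-oriented, reliable, ordered delivery (web, email, databases).
tcp_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
tcp_sock.connect(("192.0.2.10", 80))    # three-way handshake happens here
tcp_sock.sendall(b"GET / HTTP/1.0\r\n\r\n")    # lost segments are retransmitted
tcp_sock.close()

# UDP: connectionless, no reliability or flow control (DNS, VoIP).
udp_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
udp_sock.sendto(b"hello", ("192.0.2.10", 5000))    # fire and forget, no handshake
udp_sock.close()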
Session Layer (Layer 5)
The session layer provides mechanisms for applications to establish sessions between two hosts.
Over these end-to-end sessions, different services can be offered. Session layer functions keep
track of whose turn it is to transmit data, make sure two parties are not attempting to perform the
same operation simultaneously, pick up a transmission that failed from the point it failed, and
end the transmission. The session layer is explicitly implemented in applications that use remote
procedure calls (RPCs).
Presentation Layer (Layer 6)
The presentation layer specifies context between application-layer entities. The OSI model layers discussed so far have mostly dealt with moving bits from a source host to a destination host. The presentation layer is concerned with the syntax and semantics of the transmitted information and how that information is organized. It is at this layer that the type of data being encoded for transmission is differentiated, for example text files, binaries, or video files.
Application Layer (Layer 7)
The application layer is the OSI layer that is closest to the end user and contains a variety of
protocols usually needed by users. One application protocol that is widely used is HyperText
Transfer Protocol (HTTP) and its secure version HTTPS. HTTP/HTTPS is at the foundation of
the World Wide Web (WWW). Exchanging information between a client browser and a web
server is done using HTTP. When a client browser wants to display a web page, it sends the
name of the page to the server hosting the page using HTTP. The server sends the web page back over HTTP. Other protocols, for file transfer, email, and other services, have been developed over the years.
Some other examples of protocols that operate at the application layer include File Transfer
Protocol (FTP) used for transferring files between hosts and Dynamic Host Configuration
Protocol (DHCP) used for dynamically assigning IP addresses to hosts.
Data Flow in Layered Models
End devices implement protocols for the entire "stack" of layers. The source of
the message (data) encapsulates the data with the appropriate protocol header/trailer at
each layer, while the final destination de-encapsulates each protocol header/trailer to
receive the message (data).
The network access layer (shown as "Link" in the figure above) operates at the local
network connection to which an end-device is connected. It deals with moving frames
from one NIC to another NIC on the same network. Ethernet switches operate at this
layer.
The internet layer is responsible for sending data across potentially multiple distant
networks. Connecting physically disparate networks is referred to as internetworking.
Routing protocols are responsible for sending data from a source network to a
destination network. Routers are devices that operate at the internet layer and perform
the routing function. Routers are discussed in more detail later in this module. IP
operates at the internet layer in the TCP/IP reference model and performs the two basic
functions, addressing and routing.
Hosts are identified by their IP address. To identify host computers and locate them on the network, two addressing systems are currently supported. IPv4 uses 32-bit addresses, which means that approximately 4.3 billion devices can be identified. Today
there are many more than 4.3 billion hosts attached to the internet, so a new addressing
system was developed in the late 1990s. IPv6 uses 128-bit addresses. It was
standardized in 1998 and implementation started in 2006. The IPv6 128-bit address
space provides 340 undecillion addresses. Both IPv4 and IPv6 addressed hosts are
currently supported on the internet.
The second function of the internet layer is routing packets. This function means
sending packets from source to destination by forwarding them to the next router that is
closer to the final destination. With this functionality, the internet layer makes possible
internetworking, connecting different IP networks, and essentially establishing the
internet. The IP packet transmission at the internet layer is best effort and unreliable.
Any retransmission or error corrections are to be implemented by higher layers at the
end devices, typically TCP.
Planes of a Router
The logic of a router is managed by three functional planes: the management plane,
control plane, and data plane. Each provides different functionality:
Management Plane - The management plane manages traffic destined for the
network device itself. Examples include Secure Shell (SSH) and Simple Network
Management Protocol (SNMP).
Control Plane - The control plane of a network device processes the traffic that
is required to maintain the functionality of the network infrastructure. The control
plane consists of applications and protocols between network devices, such as
routing protocols OSPF, BGP, and Enhanced Interior Gateway Routing Protocol
(EIGRP). The control plane processes data in software.
Data Plane - The data plane is the forwarding plane, which is responsible for the
switching of packets in hardware, using information from the control plane. The
data plane processes data in hardware.
A network consists of end devices such as computers, mobile devices, and printers that are
connected by networking devices such as switches and routers. The network enables the devices
to communicate with one another and share data, as shown in the figure.
The network topology figure shows 20 administration hosts connected to a switch, which also connects to a server and, through Fa0/0, to router RTR1. RTR1 connects over a serial link (S0/0 on both ends) to RTR2. On one side, RTR2 connects through Fa0/0 to a block of 4 switches serving 64 instructor hosts; on the other side, it connects through Fa1/0 to a block of 20 switches serving 460 student hosts.
In the figure above, data from the student computer to the instructor computer travels through the
switch to the router (FastEthernet 1/0 interface), then to the next switch (FastEthernet 0/0
interface), and finally to the instructor computer.
All hosts and network devices that are interconnected, within a small physical area, form a LAN.
Network devices that connect LANs, over large distances, form a wide area network (WAN).
5.2.2 Ethernet
Connecting devices within a LAN requires a collection of technologies. The most common LAN
technology is Ethernet. Ethernet is not just a type of cable or protocol. It is a network standard
published by the IEEE. Ethernet is a set of guidelines and rules that enable various network
components to work together. These guidelines specify cabling and signaling at the physical and
data link layers of the OSI model. For example, Ethernet standards recommend different types of
cable and specify maximum segment lengths for each type.
There are several types of media that the Ethernet protocol works with: coaxial cable, twisted-pair copper cable, and single-mode and multimode fiber-optic cable.
Bits that are transmitted over an Ethernet LAN are organized into frames. The Ethernet frame
format is shown in the figure.
In Ethernet terminology, the container into which data is placed for transmission is called a
frame. The frame contains header information, trailer information, and the actual data that is
being transmitted.
The figure above shows the most important fields of the Ethernet frame:
Preamble - This field consists of seven bytes of alternating 1s and 0s that are used to
synchronize the signals of the communicating computers.
Start of frame delimiter (SFD) - This is a 1-byte field that marks the end of the
preamble and indicates the beginning of the Ethernet frame.
Destination MAC Address - The destination address field is six bytes (48 bits) long and
contains the address of the NIC on the local network to which the encapsulated data is
being sent.
Source MAC Address - The source address field is six bytes (48 bits) long and contains
the address of the NIC of the sending device.
Type - This field contains a code that identifies the network layer protocol. For example, if the network layer protocol is IPv4, this field has a value of 0x0800; for IPv6 it has a value of 0x86DD.
Data - This field contains the data that is received from the network layer on the
transmitting computer. This data is then sent to the same protocol on the destination
computer. If the data is shorter than the minimum length of 46 bytes, a string of
extraneous bits is used to pad the field.
Frame Check Sequence (FCS) - The FCS field includes a checking mechanism to
ensure that the packet of data has been transmitted without corruption.
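To make the header layout concrete, here is a rough sketch that assembles the MAC addresses, Type field, and padded data field (the preamble, SFD, and FCS are normally added by the NIC hardware, so they are omitted here):
import struct

def build_ethernet_frame(dst_mac, src_mac, ethertype, payload):
    # 6-byte destination MAC + 6-byte source MAC + 2-byte Type field
    header = bytes.fromhex(dst_mac) + bytes.fromhex(src_mac) + struct.pack("!H", ethertype)
    # Pad the data field to the 46-byte minimum if necessary.
    if len(payload) < 46:
        payload = payload + b"\x00" * (46 - len(payload))
    return header + payload

frame = build_ethernet_frame("ffffffffffff", "005056c00001", 0x0800, b"hello")
print(len(frame))    # 60 bytes, before the NIC adds preamble, SFD, and FCS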
MAC addresses are used in transporting a frame across a shared local media. These are NIC-to-
NIC communications on the same network. If the data (encapsulated IP packet) is for a device on
another network, the destination MAC address will be that of the local router (default gateway).
The Ethernet header and trailer will be de-encapsulated by the router. The packet will be
encapsulated in a new Ethernet header and trailer using the MAC address of the router's egress
interface as the source MAC address. If the next hop is another router, then the destination MAC
address will be that of the next hop router. If the router is on the same network as the destination
of the packet, the destination MAC address will be that of the end device.
All network devices on the same network must have a unique MAC address. The MAC address
is the means by which data is directed to the proper destination device. The MAC address of a
device is an address that is burned into the NIC. Therefore, it is also referred to as the physical
address or burned in address (BIA).
A MAC address is composed of 12 hexadecimal digits, which means it is 48 bits long. There are two main components of a MAC address. The first 24 bits constitute the OUI, and the last 24 bits constitute the vendor-assigned, end-station address, as shown in the figure.
24-bit OUI - The OUI identifies the manufacturer of the NIC. The IEEE regulates the assignment of OUI numbers. Within the OUI, there are 2 bits that have meaning only when used in the destination address (DA) field of the Ethernet header.
24-bit, vendor-assigned, end-station address - This portion uniquely identifies the
Ethernet hardware.
A MAC address can be written in any of the following equivalent notations:
0050.56c0.0001
00:50:56:c0:00:01
00-50-56-c0-00-01
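All three notations represent the same address; a small sketch that normalizes any of them and splits out the OUI:
def parse_mac(mac):
    # Strip the separators used in the common notations (".", ":", "-").
    digits = mac.replace(".", "").replace(":", "").replace("-", "").lower()
    return digits[:6], digits[6:]    # first 24 bits (OUI), last 24 bits (vendor-assigned)

for notation in ("0050.56c0.0001", "00:50:56:c0:00:01", "00-50-56-c0-00-01"):
    print(parse_mac(notation))    # every line prints ('005056', 'c00001')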
Destination MAC addresses include the three major types of network communications:
Unicast - Communication in which a frame is sent from one host and is addressed to one
specific destination. In a unicast transmission, there is only one sender and one receiver.
Unicast transmission is the predominant form of transmission on LANs and within the
internet.
Broadcast - Communication in which a frame is sent from one address to all other
addresses. In this case, there is only one sender, but the information is sent to all of the
connected receivers. Broadcast transmission is essential for sending the same message to
all devices on the LAN. Broadcasts are typically used when a device is looking for the MAC address of the destination.
Multicast - Communication in which information is sent to a specific group of devices or
clients. Unlike broadcast transmission, in multicast transmission, clients must be
members of a multicast group to receive the information.
5.2.4 Switching
The switch builds and maintains a table (called the MAC address table) that matches the
destination MAC address with the port that is used to connect to a node. The MAC address table
is stored in the Content Addressable Memory (CAM), which enables very fast lookups.
The switch dynamically builds the MAC address table by examining the source MAC address of
frames received on a port. The switch forwards frames by searching for a match between the
destination MAC address in the frame and an entry in the MAC address table. Depending on the
result, the switch decides whether to filter or flood the frame. If the destination MAC address is in the MAC address table, the switch forwards the frame out the corresponding port. Otherwise, it floods the frame out all ports except the incoming port.
In the figure, four topologies are shown. Each topology has a switch and three hosts
(HOST A, HOST B, and HOST C). The following describes the switching process illustrated in
the figure as Host A sends a frame to Host B:
1. In the first topology, top left, the switch receives a frame from Host A on port 1.
2. The switch enters the source MAC address and the switch port that received the frame
into the MAC address table.
3. The switch checks the table for the destination MAC address. Because the destination
address is not known, the switch floods the frame to all of the ports except the port on
which it received the frame. In the second topology, top right, Host B, the destination
MAC address, receives the Ethernet frame.
4. In the third topology, bottom left, Host B replies to the Host A with the destination MAC
address of Host A.
5. The switch enters the source MAC address of Host B and the port number of the switch
port that received the frame into the MAC table. The destination address of the frame and
its associated port is known in the MAC address table.
6. In the fourth topology, bottom right, the switch can now directly forward this frame to
Host A out port 1. Frames between the source and destination devices are sent without
flooding because the switch has entries in the MAC address table that identify the
associated ports.
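The learn, filter, and flood logic can be sketched in a few lines of Python; this is a toy model for illustration, not how switch hardware is implemented:
mac_table = {}    # MAC address -> switch port

def switch_receive(src_mac, dst_mac, in_port):
    mac_table[src_mac] = in_port    # learn the source MAC address
    out_port = mac_table.get(dst_mac)
    if out_port is not None and out_port != in_port:
        print("Forward frame out port", out_port)    # filter: destination is known
    else:
        print("Flood frame out all ports except", in_port)    # destination unknown

switch_receive("00aa", "00bb", in_port=1)    # 00bb unknown: flood
switch_receive("00bb", "00aa", in_port=2)    # 00aa known: forward out port 1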
A virtual LAN (VLAN) is used to segment different Layer 2 broadcast domains on one or more
switches. A VLAN groups devices on one or more LANs that are configured to communicate as
if they were attached to the same wire, when in fact they are located on a number of different
LAN segments. Because VLANs are based on logical instead of physical connections, they are
extremely flexible.
For example, in the figure, the network administrator created three VLANs based on the function
of its users: engineering, marketing, and accounting. Notice that the devices do not need to be on
the same floor.
VLANs
VLANs define Layer 2 broadcast domains. Broadcast domains are typically bounded by
routers because routers do not forward broadcast frames. VLANs on Layer 2 switches
create broadcast domains based on the configuration of the switch. Switch ports are
assigned to a VLAN. A Layer 2 broadcast received on a switch port is only flooded out
onto other ports belonging to the same VLAN.
You can define one or many VLANs within a switch. Each VLAN you create in the
switch defines a new broadcast domain. Traffic cannot pass directly to another VLAN
(between broadcast domains) within the switch or between two switches. To
interconnect two different VLANs, you must use a router or Layer 3 switch.
VLANs are often associated with IP networks or subnets. For example, all of the end
stations in a particular IP subnet belong to the same VLAN. Traffic between VLANs
must be routed. You must assign a VLAN membership (VLAN ID) to a switch port on an
port-by-port basis (this is known as interface-based or static VLAN membership). You
can set various parameters when you create a VLAN on a switch, including VLAN
number (VLAN ID) and VLAN name.
Switches support 4096 VLANs in compliance with the IEEE 802.1Q standard which
specifies 12 bits (2^12=4096) for the VLAN ID.
A trunk is a point-to-point link between two network devices that carries more than one
VLAN. A VLAN trunk extends VLANs across an entire network. IEEE 802.1Q defines a
"tag" that is inserted in the frame containing the VLAN ID. This tag is inserted when the
frame is forwarded by the switch on its egress interface. The tag is removed by the
switch that receives the frame. This is how switches know which VLAN a frame belongs to.
The 4096 VLAN IDs are organized into three ranges: reserved, normal, and extended. Some
of these VLANs are propagated to other switches in the network when you use the
VLAN Trunking Protocol (VTP).
Internetwork Layer
Interconnected networks have to have ways to communicate. Internetworking provides that
"between" (inter) networks communication method. This topic describes addressing and routing.
Every device on a network has a unique IP address. An IP address and a MAC address are used
for access and communication across all network devices. Without IP addresses there would be
no internet.
Despite the introduction of IPv6, IPv4 continues to route most internet traffic today. During
recent years, more traffic is being sent over IPv6 due to the exhaustion of IPv4 addresses and the
proliferation of mobile and Internet of Things (IoT) devices.
An IPv4 address is 32 bits, with each octet (8 bits) represented as a decimal value separated by a
dot. This representation is called dotted decimal notation. For example, 192.168.48.64 and
64.100.36.254 are IPv4 addresses represented in dotted decimal notation. The table shows the
binary value for each octet.
The IPv4 subnet mask (or prefix length) is used to differentiate the network portion from
the host portion of an IPv4 address. A subnet mask contains four bytes and can be
written in the same format as an IP address. In a valid subnet mask, the most significant (leftmost) bits must be set to 1; these bits are the network portion of the subnet mask. The bits set to 0 are the host portion of the mask.
For this example, look at 203.0.113.0/24. The network's IPv4 address is 203.0.113.0 with a subnet mask of 255.255.255.0. The last octet of the subnet mask leaves all 8 bits available for host IPv4 addresses, which means that on the network 203.0.113.0/24 there can be up to 2^8 (256) addresses.
Two IPv4 addresses are in use by default and cannot be assigned to devices:
203.0.113.0 is the network address
203.0.113.255 is the broadcast address
Therefore, there are 254 (256 - 2) host IP addresses available, and the range of
addresses available for hosts would be 203.0.113.1 to 203.0.113.254.
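Python's standard ipaddress module can confirm this arithmetic:
import ipaddress

net = ipaddress.ip_network("203.0.113.0/24")
print(net.network_address)     # 203.0.113.0
print(net.broadcast_address)   # 203.0.113.255
print(net.num_addresses - 2)   # 254 assignable host addresses
hosts = list(net.hosts())      # 203.0.113.1 through 203.0.113.254
print(hosts[0], hosts[-1])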
A network can be divided into smaller networks called subnets. Subnets can be
provided to individual organizational units, such as teams or business departments, to
simplify the network and potentially make departmental data private. The subnet
provides a specific range of IP addresses for a group of hosts to use. Every network is
typically a subnet of a larger network.
For example, the network IPv4 network address is 192.168.2.0/24. The /24
(255.255.255.0) subnet mask means that the last octet has 8 bits available for host
addresses. You can borrow from the host portion to create subnets. For example, you
need to use three bits to create eight subnets (2^3 = 8). This leaves the remaining five bits for the hosts (2^5 = 32).
This can be more easily visualized when showing the subnet mask in binary format.
Because you need to create eight subnets, you designate three bits in the last octet for
subnet use. The remaining five bits are for the hosts, and provide each subnet with 32
IP addresses.
The following table lists the network address, broadcast address, and available host
address range for each subnet.
Notice the allocation for subnets and hosts specified for each row. You should now
understand how designating bits to create subnets reduces the number of hosts
available for each subnet. The number of hosts available per subnet takes into account
that network and broadcast addresses each require an IP address. The more bits you
use to create subnets, the fewer bits you have for hosts per subnet.
The table below shows the various options if you have a /24 subnet mask.
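The same options can be generated with the ipaddress module; a sketch for a /24 network such as 192.168.2.0/24:
import ipaddress

base = ipaddress.ip_network("192.168.2.0/24")
for borrowed in range(1, 7):    # borrow 1 to 6 host bits
    new_prefix = 24 + borrowed
    subnets = list(base.subnets(new_prefix=new_prefix))
    hosts_per_subnet = subnets[0].num_addresses - 2    # minus network and broadcast
    print(f"/{new_prefix}: {len(subnets)} subnets, {hosts_per_subnet} hosts each")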
Due to the depletion of IPv4 addresses, most internal networks use private IPv4 addresses (RFC 1918). Variable-length subnet masking (VLSM), originally introduced when IPv4 addresses were classful (Class A, B, C), also supports more efficient use of IPv4 address space. VLSM is a method of dividing a single network (or subnet) using different subnet masks to provide subnets with different numbers of host addresses.
Devices using private IPv4 addresses are able to access the internet via Network Address
Translation (NAT) and Port Address Translation (PAT). Outgoing data from your device is sent
through a router, which maps your device's private IPv4 address to a public IPv4 address. When return traffic arrives at that router, the router translates the public address back to your device's private IPv4 address and forwards the data to your device.
When the IETF began its development of a successor to IPv4, it used this opportunity to fix the
limitations of IPv4 and included enhancements. One example is Internet Control Message
Protocol version 6 (ICMPv6), which includes address resolution and address autoconfiguration
features not found in ICMP for IPv4 (ICMPv4).
The depletion of IPv4 address space has been the motivating factor for moving to IPv6. As
Africa, Asia, and other areas of the world become more connected to the internet, there are not
enough IPv4 addresses to accommodate this growth.
IPv6 is described initially in RFC 2460. Further RFCs describe the architecture and services
supported by IPv6.
The architecture of IPv6 has been designed to allow existing IPv4 users to transition easily to
IPv6 while providing services such as end-to-end security, quality of service (QoS), and globally
unique addresses. The larger IPv6 address space allows networks to scale and provide global
reachability. The simplified IPv6 packet header format handles packets more efficiently. IPv6
prefix aggregation, simplified network renumbering, and IPv6 site multihoming capabilities
provide an IPv6 addressing hierarchy that allows for more efficient routing. IPv6 supports widely
deployed routing protocols such as Routing Information Protocol (RIP), Integrated Intermediate
System-to-Intermediate System (IS-IS), OSPF, and multiprotocol BGP (mBGP). Other available
features include stateless autoconfiguration and an increased number of multicast addresses.
Private addresses in combination with Network Address Translation (NAT) have been
instrumental in slowing the depletion of IPv4 address space. However, NAT is problematic for
many applications, creates latency, and has limitations that severely impede peer-to-peer
communications. IPv6 address space eliminates the need for private addresses; therefore, IPv6
enables new application protocols that do not require special processing by border devices at the
edge of networks.
With the ever-increasing number of mobile devices, mobile providers have been leading the way
with the transition to IPv6. The top two mobile providers in the United States report that over
90% of their traffic is over IPv6. Most top ISPs and content providers such as YouTube,
Facebook, and Netflix, have also made the transition. Many companies like Microsoft,
Facebook, and LinkedIn are transitioning to IPv6-only internally. In 2018, broadband ISP
Comcast reported a deployment of over 65% and British Sky Broadcasting over 86%.
IPv6 addresses are represented as a series of 16-bit hexadecimal fields (hextet) separated by
colons (:) in the format: x:x:x:x:x:x:x:x. The preferred format includes all the hexadecimal
values. There are two rules that can be used to reduce the representation of the IPv6 address:
Preferred
2001:0db8:0000:1111:0000:0000:0000:0200
No leading 0s
2001:db8:0:1111:0:0:0:200
IPv6 addresses commonly contain successive hexadecimal fields of zeros. Two colons (::) may
be used to compress successive hexadecimal fields of zeros at the beginning, middle, or end of an
IPv6 address (the colons represent successive hexadecimal fields of zeros).
A double colon (::) can replace any single, contiguous string of one or more 16-bit hextets
consisting of all zeros. For example, the following preferred IPv6 address can be formatted with
no leading zeros.
Preferred
2001:0db8:0000:1111:0000:0000:0000:0200
No leading 0s
2001:db8:0:1111::200
Two colons (::) can be used only once in an IPv6 address to represent the longest successive
hexadecimal fields of zeros. Hexadecimal letters in IPv6 addresses are not case-sensitive
according to RFC 5952. The table below lists compressed IPv6 address formats:
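The ipaddress module applies both compression rules automatically, which is a convenient way to check an address by hand:
import ipaddress

addr = ipaddress.ip_address("2001:0db8:0000:1111:0000:0000:0000:0200")
print(addr.compressed)    # 2001:db8:0:1111::200 (leading zeros and longest zero run removed)
print(addr.exploded)      # 2001:0db8:0000:1111:0000:0000:0000:0200 (preferred format)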
The unspecified address listed in the table above indicates the absence of an IPv6
address or when the IPv6 address does not need to be known. For example, a newly
initialized device on an IPv6 network may use the unspecified address as the source
address in its packets until it receives or creates its own IPv6 address.
Note: There are other types of IPv6 unicast addresses, but these four are the most
significant to our discussion.
Global Unicast Addresses
A global unicast address (GUA) is an IPv6 address similar to a public IPv4 address. IPv6 global unicast addresses are globally unique and routable on the IPv6 internet. The Internet Corporation for Assigned Names and Numbers (ICANN), the operator of the Internet Assigned Numbers Authority (IANA), allocates IPv6 address blocks to the five Regional
Internet Registries (RIRs). Currently, only GUAs with the first three bits (001), which
converts to 2000::/3, are being assigned, as shown in the figure.
IPv6 GUA Format
The parts of the GUA in the figure above are as follows:
Global Routing Prefix - The global routing prefix is the prefix, or network,
portion of the address that is assigned by the provider such as an ISP, to a
customer or site. It is common for some ISPs to assign a /48 global routing prefix
to its customers, which always includes the first 3 bits (001) shown in the figure.
The global routing prefix will usually vary depending on the policies of the ISP.
For example, the IPv6 address 2001:db8:acad::/48 has a global routing prefix
that indicates that the first 48 bits (3 hextets or 2001:db8:acad) is how the ISP
knows of this prefix (network). The double colon (::) following the /48 prefix length
means the rest of the address contains all 0s. The size of the global routing prefix
determines the size of the subnet ID.
Subnet ID - The Subnet ID field is the area between the Global Routing Prefix
and the Interface ID. Unlike IPv4, where you must borrow bits from the host
portion to create subnets, IPv6 was designed with subnetting in mind. The
Subnet ID is used by an organization to identify subnets within its site. The larger
the Subnet ID, the more subnets available. For example, if the prefix has a /48
Global Routing Prefix, and using a typical /64 prefix length, the first four hextets
are for the network portion of the address, with the fourth hextet indicating the
Subnet ID. The remaining four hextets are for the Interface ID.
Interface ID - The IPv6 Interface ID is equivalent to the host portion of an IPv4
address. The term Interface ID is used because a single device may have
multiple interfaces, each having one or more IPv6 addresses. It is strongly
recommended that in most cases /64 subnets should be used, which creates a
64-bit interface ID. A 64-bit interface ID allows for 18 quintillion devices or hosts
per subnet. A /64 subnet or prefix (Global Routing Prefix + Subnet ID) leaves 64
bits for the interface ID. This is recommended to allow devices enabled with
Stateless Address Autoconfiguration (SLAAC) to create their own 64-bit interface
ID. It also makes developing an IPv6 addressing plan simple and effective.
The GUA is not a requirement; however, every IPv6-enabled network interface must have a link-local address (LLA).
Link-Local Addresses
An IPv6 Link-local Address (LLA) enables a device to communicate with other IPv6-
enabled devices on the same link and only on that link (subnet). Packets with a source
or destination LLA cannot be routed beyond the link from which the packet originated.
If an LLA is not configured manually on an interface, the device will automatically create
its own without communicating with a DHCP server. IPv6-enabled hosts create an IPv6
LLA even if the device has not been assigned a global unicast IPv6 address. This
allows IPv6-enabled devices to communicate with other IPv6-enabled devices on the
same subnet. This includes communication with the default gateway (router).
IPv6 devices must not forward packets that have source or destination LLAs to other
links.
Unique local addresses (in the fc00::/7 range, which spans fc00:: through fdff::) are not yet commonly implemented.
However, unique local addresses may eventually be used to address devices that
should not be accessible from the outside, such as internal servers and printers.
The IPv6 unique local addresses have some similarity to RFC 1918 private addresses
for IPv4, but there are significant differences:
Unique local addresses are used for local addressing within a site or between a
limited number of sites.
Unique local addresses can be used for devices that will never need to access
another network.
Unique local addresses are not globally routed or translated to a global IPv6
address.
Note: Many sites also use the private nature of RFC 1918 addresses to attempt to
secure or hide their network from potential security risks. However, this was never the
intended use of these technologies, and the IETF has always recommended that sites
take the proper security precautions on their internet-facing router.
The figure shows the structure of a unique local address.
Multicast Addresses
There are no broadcast addresses in IPv6. IPv6 multicast addresses are used instead of broadcast
addresses. IPv6 multicast addresses are similar to IPv4 multicast addresses. Recall that a
multicast address is used to send a single packet to one or more destinations (multicast group).
IPv6 multicast addresses have the prefix ff00::/8.
Note: Multicast addresses can only be destination addresses and not source addresses.
There are two types of IPv6 multicast addresses:
Well-known IPv6 multicast addresses are assigned. Assigned multicast addresses are reserved
multicast addresses for predefined groups of devices. An assigned multicast address is a single
address used to reach a group of devices running a common protocol or service. Assigned
multicast addresses are used in context with specific protocols such as DHCPv6.
These are two common IPv6 assigned multicast groups:
ff02::1 All-nodes multicast group - This is a multicast group that all IPv6-enabled devices join. A
packet sent to this group is received and processed by all IPv6 interfaces on the link or network.
This has the same effect as a broadcast address in IPv4.
ff02::2 All-routers multicast group - This is a multicast group that all IPv6 routers join. A router
becomes a member of this group when it is enabled as an IPv6 router with the ipv6 unicast-
routing global configuration command. A packet sent to this group is received and processed by
all IPv6 routers on the link or network.
A solicited-node multicast address is similar to the all-nodes multicast address. The advantage of
a solicited-node multicast address is that it is mapped to a special Ethernet multicast address.
This allows the Ethernet NIC to filter the frame by examining the destination MAC address
without sending it to the IPv6 process to see if the device is the intended target of the IPv6
packet.
The format for an IPv6 solicited-node multicast address is shown in the figure.
IPv6 Solicited-Node Address Format
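As a sketch of that mapping, the solicited-node address keeps only the low-order 24 bits of the unicast address and appends them to the well-known ff02::1:ff00:0/104 prefix:
import ipaddress

def solicited_node(unicast):
    low24 = int(ipaddress.ip_address(unicast)) & 0xFFFFFF    # last 24 bits of the unicast address
    return ipaddress.ip_address(int(ipaddress.ip_address("ff02::1:ff00:0")) | low24)

print(solicited_node("2001:db8:acad::200:1234:5678"))    # ff02::1:ff34:5678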
Recall that a router is a networking device that functions at the internet layer of the
TCP/IP model or the Layer 3 network layer of the OSI model. Routing involves forwarding packets between different networks. Routers use a routing table to route
between networks. A router generally has two main functions: Path determination, and
Packet routing or forwarding.
Path Determination
Path determination is the process through which a router uses its routing table to
determine where to forward packets. Each router maintains its own local routing table,
which contains a list of all the destinations that are known to the router and how to
reach those destinations. When a router receives an incoming packet on one of its
interfaces, it checks the destination IP address in the packet and looks up the best
match between the destination address and the network addresses in its routing table.
A matching entry indicates that the destination is directly connected to the router or that
it can be reached by forwarding the packet to another router. That router becomes the
next-hop router towards the final destination of the packet. If there is no matching entry,
the router sends the packet to the default route. If there is no default route, the router
drops the packet.
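Path determination boils down to a longest-prefix match, which can be sketched with the ipaddress module (the routing table entries below are invented):
import ipaddress

# Toy routing table: destination prefix -> next hop or exit interface
routing_table = {
    ipaddress.ip_network("203.0.113.0/24"): "GigabitEthernet0/0",
    ipaddress.ip_network("198.51.100.0/25"): "10.0.0.2",
    ipaddress.ip_network("0.0.0.0/0"): "10.0.0.1",    # default route
}

def lookup(dest_ip):
    dest = ipaddress.ip_address(dest_ip)
    matches = [net for net in routing_table if dest in net]
    if not matches:
        return None    # no route and no default route: drop the packet
    best = max(matches, key=lambda net: net.prefixlen)    # most specific route wins
    return routing_table[best]

print(lookup("203.0.113.25"))    # GigabitEthernet0/0
print(lookup("8.8.8.8"))         # 10.0.0.1 via the default route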
Packet Forwarding
After the router determines the correct path for a packet, it forwards the packet through
a network interface towards the destination network.
A routing table can contain the following types of entries:
Directly connected networks - These network route entries are active router interfaces.
Routers add a directly connected route when an interface is configured with an IP address
and is activated. Each router interface is connected to a different network segment.
Static routes - These are routes that are manually configured by the network
administrator. Static routes work relatively well for small networks that do not change over time, but in large dynamic networks they have many shortcomings.
Dynamic routes - These are routes learned automatically when a dynamic routing
protocol is configured and a neighbor relationship to other routers is established. The
reachability information in this case is dynamically updated when a change in the
network occurs. Several routing protocols with different advantages and shortcomings
have been developed through the years. Routing protocols are extensively used
throughout networks deployed all over the world. Examples of routing protocols include
OSPF, EIGRP, IS-IS, and BGP.
Default routes - Default routes are either manually entered, or learned through a
dynamic routing protocol. Default routes are used when no explicit path to a destination
is found in the routing table. They are a gateway of last resort option instead of just
dropping the packet.
5.4 Network Devices
5.4.1 Ethernet Switches
Earlier in this module, you explored switching and routing functions. In this topic, you will explore in more detail the networking devices that perform those switching and routing functions.
In legacy shared Ethernet, a device connected to a port on a hub can either transmit or
receive data at a given time. It cannot transmit and receive data at the same time. This
is referred to as half-duplex transmission. Half-duplex communication is similar to
communication with walkie-talkies in which only one person can talk at a time. In half-
duplex environments, if two devices do transmit at the same time, there is a collision.
The area of the network in which collisions can occur is called a collision domain.
One of the main features of Ethernet switches over legacy Ethernet hubs is that they
provide full-duplex communications, which eliminates collision domains. Ethernet
switches can simultaneously transmit and receive data. This mode is called full-duplex.
Full-duplex communication is similar to the telephone communication, in which each
person can talk and hear what the other person says simultaneously.
Ethernet switches have the following characteristics:
Operate at the network access layer of the TCP/IP model and the Layer 2 data link layer of the OSI model
Filter or flood frames based on entries in the MAC address table
Have a large number of high speed and full-duplex ports
The figure shows an example of switches with multiple high speed and full-duplex ports.
The switch dynamically learns which devices, and their MAC addresses, are connected to which
switch ports. It builds the MAC address table and filters or floods frames based on that table.
Each entry in the MAC address table associates a learned MAC address (and its VLAN) with the
switch port on which it was learned.
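The learn-then-filter-or-flood behavior can be sketched in a few lines of Python. This is a simplified model for illustration only; the port names and MAC addresses below are invented, and a real switch keeps this table in hardware.
# Simplified model of how a switch learns MAC addresses and decides where to send a frame.
mac_address_table = {}  # learned MAC address -> switch port

def switch_frame(in_port, src_mac, dst_mac):
    # Learn: remember which port the source MAC address was seen on.
    mac_address_table[src_mac] = in_port

    # Filter or flood: forward out a single known port, or flood to all other ports.
    if dst_mac in mac_address_table:
        out_port = mac_address_table[dst_mac]
        print(f"Forward frame to {dst_mac} out port {out_port}")
    else:
        print(f"Unknown destination {dst_mac}: flood out all ports except {in_port}")

switch_frame("Fa0/1", "00:0a:00:00:00:01", "00:0a:00:00:00:02")  # destination unknown -> flood
switch_frame("Fa0/2", "00:0a:00:00:00:02", "00:0a:00:00:00:01")  # destination known -> forward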
The switching mode determines whether the switch begins forwarding a frame as soon as it has
read the destination details in the frame header (cut-through switching), or waits until the entire
frame has been received and checked for errors, by calculating the cyclic redundancy check (CRC)
value, before forwarding it on the network (store-and-forward switching). The switching mode
applies to all packets being switched or routed through the hardware and can be saved
persistently across reboots and restarts.
Other characteristics of switches include the following:
High port density - Switches have a large number of ports, from 24 to 48 ports per
switch in smaller devices, to hundreds of ports per switch chassis in larger modular
switches. Switch ports usually operate at 100 Mbps, 1 Gbps, and 10 Gbps.
Large frame buffers - Switches have the ability to store received frames when there may
be congested ports on servers or other devices in the network.
Fast internal switching - Switches have very fast internal switching. They are able to
switch user traffic from the ingress port to the egress port extremely fast. Different
methods are used to interconnect the ports, including a fast internal bus, shared memory,
or an integrated crossbar switch fabric, and the method used affects the overall
performance of the switch.
5.4.2 Routers
While switches are used to connect devices on a LAN and exchange data frames, routers are
needed to reach devices that are not on the same LAN. Routers use routing tables to route traffic
between different networks. Routers are attached to different networks (or subnets) through their
interfaces and have the ability to route the data traffic between them.
They operate at the internet layer of TCP/IP model and Layer 3 network layer of the OSI
model.
They route packets between networks based on entries in the routing table.
They have support for a large variety of network ports, including various LAN and WAN
media ports which may be copper or fiber. The number of interfaces on routers is usually
much smaller than switches but the variety of interfaces supported is greater. IP addresses
are configured on the interfaces.
Routers support several packet-forwarding mechanisms: process switching, fast switching, and
Cisco Express Forwarding (CEF). The following analogy helps describe them:
Process switching solves a problem by doing math long hand, even if it is the identical
problem that was just solved.
Fast switching solves a problem by doing math long hand one time and remembering the
answer for subsequent identical problems.
CEF solves every possible problem ahead of time in a spreadsheet.
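The same analogy can be expressed as a short Python sketch, where a "problem" is simply a destination lookup. This only illustrates the analogy, not actual router internals; the destinations and the long_hand function are made up for the example.
from functools import lru_cache

def long_hand(destination):
    # Stand-in for a full routing-table lookup done "long hand".
    return f"next hop for {destination}"

# Process switching: do the full lookup for every packet, even identical ones.
def process_switch(destination):
    return long_hand(destination)

# Fast switching: do the full lookup once, then reuse the cached answer.
@lru_cache(maxsize=None)
def fast_switch(destination):
    return long_hand(destination)

# CEF: precompute the answers ahead of time into a table (the FIB "spreadsheet").
fib = {dest: long_hand(dest) for dest in ["10.1.1.0/24", "10.2.0.0/16"]}
def cef_switch(destination):
    return fib[destination]

print(process_switch("10.1.1.0/24"), fast_switch("10.1.1.0/24"), cef_switch("10.1.1.0/24"))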
5.4.3 Firewalls
A firewall is a hardware or software system that prevents unauthorized access into or out of a
network. Typically, firewalls are used to prevent unauthorized internet users from accessing
internal networks. Therefore, all data leaving or entering the protected internal network must
pass through the firewall to reach its destination, and any unauthorized data is blocked. The role
of the firewall in any network is critical. Additional details on how firewalls interact with
applications are presented in the Application Development and Security module of the course.
The most basic (and the original) type of firewall is a stateless packet filtering firewall. You
create static rules that permit or deny packets, based on packet header information. The firewall
examines packets as they traverse the firewall, compares them to static rules, and permits or
denies traffic accordingly. This stateless packet filtering can be based on several packet header
fields, including the following:
Source IP address
Destination IP address
Source port
Destination port
TCP flags, such as SYN and ACK
The static rules are fairly simple, but they do not work well for applications that dynamically use
different sets of TCP and/or UDP port numbers. This is because they cannot track the state of
TCP or UDP sessions as they transition from initial request, to fulfilling that request, and then the
closing of the session. Also, these static rules are built using a restrictive approach. In other
words, you write explicit rules to permit the specific traffic deemed acceptable, and deny
everything else.
Static rules are transparent to end systems, which are not aware that they are communicating
with a destination through a high-performance firewall. However, implementing static rules
requires deep understanding of packet headers and application processes.
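A stateless rule check is essentially a comparison of header fields against a static list. The Python sketch below is a simplified illustration only; the rule format, field names, and addresses are assumptions made for the example, not the syntax of any real firewall.
# Static, stateless rules evaluated top-down; the final rule denies everything else.
rules = [
    {"action": "permit", "protocol": "tcp", "dst_ip": "203.0.113.10", "dst_port": 80},
    {"action": "permit", "protocol": "udp", "dst_ip": "203.0.113.53", "dst_port": 53},
    {"action": "deny"},  # explicit deny-all at the end (restrictive approach)
]

def filter_packet(packet):
    for rule in rules:
        if all(packet.get(field) == value for field, value in rule.items() if field != "action"):
            return rule["action"]
    return "deny"

print(filter_packet({"protocol": "tcp", "dst_ip": "203.0.113.10", "dst_port": 80}))  # permit
print(filter_packet({"protocol": "tcp", "dst_ip": "203.0.113.10", "dst_port": 22}))  # deny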
The stateful packet filtering firewall performs the same header inspection as the stateless packet
filtering firewall but also keeps track of the connection state. This is a critical difference. To keep
track of the state, these firewalls maintain a state table.
A typical simple configuration works as follows. Any sessions or traffic initiated by devices on
trusted, inside networks are permitted through the firewall. This includes the TCP connection
request for destination port 80. The firewall keeps track of this outbound request in its state table.
The firewall understands that this is an initial request, and so an appropriate response from the
server is allowed back in through the firewall. The firewall tracks the specific source port used
and other key information about this request. This includes various IP and TCP flags and other
header fields. This adds a certain amount of intelligence to the firewall.
It will allow only valid response packets that come from the specific server. The response
packets must have all the appropriate source and destination IP addresses, ports, and flags set.
The stateful packet filtering firewall understands standard TCP/IP packet flow including the
coordinated change of information between inside and outside hosts that occurs during the life of
the connection. The firewall allows untrusted outside servers to respond to inside host requests,
but will not allow untrusted servers to initiate requests.
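The state table idea can be pictured with a very small Python model. This is purely illustrative; the addresses, ports, and the tuple-based table are assumptions for the example, and a real firewall tracks far more per-session information (sequence numbers, flags, timers, and so on).
# Simplified model of a stateful firewall's state table.
state_table = set()  # entries: (src_ip, src_port, dst_ip, dst_port)

def outbound(src_ip, src_port, dst_ip, dst_port):
    # Traffic initiated from the trusted inside network is permitted and tracked.
    state_table.add((src_ip, src_port, dst_ip, dst_port))
    return "permit"

def inbound(src_ip, src_port, dst_ip, dst_port):
    # A reply is permitted only if it matches an existing outbound session.
    if (dst_ip, dst_port, src_ip, src_port) in state_table:
        return "permit"
    return "deny"  # untrusted outside hosts cannot initiate sessions

outbound("10.1.1.10", 49152, "203.0.113.10", 80)        # inside host opens a web session
print(inbound("203.0.113.10", 80, "10.1.1.10", 49152))  # matching reply -> permit
print(inbound("198.51.100.99", 80, "10.1.1.10", 49152)) # unsolicited -> deny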
Of course, you can create exceptions to this basic policy. Your company might consider certain
applications to be inappropriate during work hours. You might want to block inside users from
initiating connections to those applications. However, with traditional stateful packet filtering,
this capability is limited. These traditional firewalls are not fully application-aware.
Also, you might have a web server hosted on the premises of a corporation. Of course, you
would like everyone in the world to access your web server and purchase your products or
services. You can write rules that allow anyone on the untrusted internet to form appropriate
inbound connections to the web server.
These stateful firewalls are more adept at handling Layer 3 and Layer 4 security than a stateless
device. However, like stateless packet filters, they have little to no insight into what happens at
OSI model Layers 5-7; they are “blind” to these layers.
The most advanced type of firewall is the application layer firewall which can perform deep
inspection of the packet all the way up to the OSI model’s Layer 7. This gives you more reliable
and capable access control for OSI Layers 3-7, with simpler configuration.
This additional inspection capability can impact performance. Limited buffering space can
hinder deep content analysis.
The application layer firewall can identify a File Transfer Protocol (FTP) session, just like a
stateless or stateful firewall can. However, this firewall can look deeper into the application
layer and see that this is specifically an FTP “put” operation to upload a file. You could have rules
that deny all FTP uploads, or you can configure a more granular rule, such as one that denies all
FTP uploads except those from a specific source IP and only if the filename is “os.bin”.
The deeper packet inspection capability of the application layer firewall enables it to verify
adherence to standard HTTP protocol functionality. It can deny requests that do not conform to
these standards, or otherwise meet criteria established by the security team.
Load balancing improves the distribution of workloads across multiple computing resources,
such as servers, cluster of servers, network links, and more. Server load balancing helps ensure
the availability, scalability, and security of applications and services by distributing the work of a
single server across multiple servers.
The load balancer decides which server should receive a client request such as a web page or a
file. The load balancer selects a server that can successfully fulfill the client request most
effectively, without overloading the selected server or the overall network.
At the device level, the load balancer provides the following features to support high network
availability:
Device redundancy — Redundancy allows you to set up a peer load balancer device in
the configuration so that if one load balancer becomes inoperative, the other load
balancer can take its place immediately.
Scalability — Virtualization allows running the load balancers as independent virtual
devices, each with its own resource allocation.
Security — Access control lists restrict access from certain clients or to certain network
resources.
At the network service level, a load balancer provides the following advanced services:
High services availability — High-performance server load balancing allows
distribution of client requests among physical servers and server farms. In addition,
health monitoring occurs at the server and server farm levels through implicit and explicit
health probes.
Scalability — Virtualization allows the use of advanced load-balancing algorithms
(predictors) to distribute client requests among the virtual devices configured in the load
balancer. Each virtual device includes multiple virtual servers. Each server forwards
client requests to one of the server farms. Each server farm can contain multiple physical
servers.
Services-level security — This allows establishment and maintenance of a Secure
Sockets Layer (SSL) session between the load balancer and its peer, which provides
secure data transactions between clients and servers.
Although the load balancer can distribute client requests among hundreds or even thousands of
physical servers, it can also maintain server persistence. With some e-commerce applications, all
client requests within a session are directed to the same physical server so that all the items in
one shopping cart are contained on one server.
You can configure a virtual server to intercept web traffic to a website and allow multiple real
servers (physical servers) to appear as a single server for load-balancing purposes.
A virtual server is bound to physical hardware and software resources that run on a real, physical
server in a server farm. Virtual servers can be configured to provide client services or to act as
backup servers.
Physical servers that all perform the same or similar functions are grouped into server farms.
Servers in the same server farm often contain identical content (referred to as mirrored content)
so that if one server becomes inoperative, another server can take over its functions immediately.
Mirrored content also allows several servers to share the load during times of increased demand.
You can distribute incoming client requests among the servers in a server farm by defining load-
balancing rules called predictors using IP address and port information.
When a client requests an application service, the load balancer performs server load balancing
by deciding which server can successfully fulfill the client request in the shortest amount of time
without overloading the server or server farm. Some sophisticated predictors take into account
factors such as the server load, response time, or availability, allowing you to adjust load
balancing to match the behavior of each particular application.
You can configure the load balancer to allow the same client to maintain multiple simultaneous
or subsequent TCP or IP connections with the same real server for the duration of a session. A
session is defined as a series of interactions between a client and a server over some finite period
of time (from several minutes to several hours). This server persistence feature is called
stickiness.
Depending on how you have configured server load balancing, the load balancer connects a
client to an appropriate server after it has determined which load-balancing method to use. If the
load balancer determines that a client is already stuck to a particular server, then the load
balancer sends subsequent client requests to that server, regardless of the load-balancing criteria.
If the load balancer determines that the client is not stuck to a particular server, it applies the
normal load-balancing rules to the request.
The combination of the predictor and stickiness enables the application to have scalability,
availability, and performance as well as persistence for transaction processing.
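A very small sketch of a predictor combined with stickiness might look like the following Python. The server names, the round-robin predictor, and the session table are all assumptions made for illustration; production load balancers implement far more sophisticated algorithms and track much more state.
import itertools

servers = ["server-a", "server-b", "server-c"]   # hypothetical server farm
round_robin = itertools.cycle(servers)           # a simple round-robin predictor
sticky_table = {}                                # client IP -> server ("stickiness")

def pick_server(client_ip):
    # If the client is already stuck to a server, reuse it; otherwise apply the predictor.
    if client_ip not in sticky_table:
        sticky_table[client_ip] = next(round_robin)
    return sticky_table[client_ip]

print(pick_server("198.51.100.7"))  # first request: predictor chooses a server
print(pick_server("198.51.100.7"))  # same client: same server (sticky)
print(pick_server("198.51.100.8"))  # new client: next server in the rotation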
SSL configuration in a load balancer establishes and maintains an SSL session between the load
balancer and its peer, enabling the load balancer to perform its load-balancing tasks on the SSL
traffic. These SSL functions include server authentication, private-key and public-key
generation, certificate management, and data packet encryption and decryption. Depending on
how the load balancer is configured, it can also perform SSL offloading by terminating the SSL
session from the client on the load balancer itself. This way, the resource-intensive SSL
processing is handled by the load balancer instead of by the backend servers.
Application services require monitoring to ensure availability and performance. Load balancers
can be configured to track the health and performance of servers and server farms by creating
health probes. Each health probe created can be associated with multiple real servers or server
farms.
When the load balancer health monitoring is enabled, the load balancer periodically sends
messages to the server to determine the server status. The load balancer verifies the server
response to ensure that a client can access that server. The load balancer can use the server
response to place the server in or out of service. In addition, the load balancer can use the health
of servers in a server farm to make reliable load-balancing decisions.
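An explicit health probe can be as simple as a periodic TCP connection attempt. The Python sketch below illustrates that idea using only the standard socket module; the server addresses and port are placeholders invented for the example.
import socket

def tcp_probe(host, port, timeout=2):
    # Return True if a TCP connection to host:port succeeds within the timeout.
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Hypothetical real servers behind the load balancer
for server in ["192.0.2.11", "192.0.2.12"]:
    state = "in service" if tcp_probe(server, 80) else "out of service"
    print(f"{server}: {state}")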
Additional details on how load balancers interact with applications and load-balancing
algorithms are covered in the Application Development and Security module of the course.
It is very important to document your code, not only to make it easier to understand and follow
by other people who will be reading and reviewing it, but also for yourself. Six months down the
road, when you come back and look at your code, you might find it very difficult and time
consuming to remember what exactly went through your mind when you wrote that amazing and
aptly named f() function.
Network diagrams are part of the documentation that goes with a network deployment and play
just as important a role as the documentation steps in programming code. Network diagrams
typically display a visual and intuitive representation of the network, depicting how all the
devices are connected, in which buildings, floors, and closets they are located, and which
interface connects to each device.
Imagine being dropped into a place you have never been to, without GPS, without a map, with
the instruction to find the closest grocery store. This is what it feels like to manage a network of
devices without a network diagram and network documentation. Instead of finding the grocery
store, you have to figure out why a large number of devices are no longer connected to the
network. You might be able to find the grocery store eventually, if you set off in the right
direction. Similarly, you also might be able to figure out the network problem. But it would take
you a lot less time if you had access to a map, a network diagram.
As networks get built and configured and go through their lifecycle of ordering the devices,
receiving them on site, bringing them online and configuring them, maintaining and monitoring
them, upgrading them, all the way to decommissioning them, and starting the process over again,
network diagrams need to be updated and maintained to document all these changes.
Layer 2, or physical connectivity diagrams are network diagrams representing how devices are
physically connected in the network. It is basically a visual representation of which network port
on a network device connects to which network port on another network device. Protocols like
Cisco Discovery Protocol (CDP) or Link Layer Discovery Protocol (LLDP) can be used to
display the physical network port connectivity between two or more devices. This type of network
diagram is especially useful when troubleshooting direct network connectivity issues.
Layer 3, or logical connectivity diagrams are network diagrams that display the IP connectivity
between devices on the network. Switches and Layer 2 devices are usually not even displayed in
these diagrams as they do not perform any Layer 3 functions and from a routing perspective,
they are the equivalent of a physical wire. This type of network diagram is useful when
troubleshooting routing problems. Redundant connections and routing protocols are usually
present in networks that require high availability.
An example of a simplified Layer 2 network diagram is displayed in the figure. Notice that there
is no Layer 3 information documented, such as IP addresses or routing protocols.
Looking at this diagram you can get a general idea of how the clients connect to the network and
how the network devices connect to each other so that end to end connectivity between all clients
is accomplished. Router RTR1 has two active interfaces in this topology: FastEthernet 0/0 and
Serial 0/0. Router RTR2 has three active interfaces: FastEthernet0/0, FastEthernet 1/0, and Serial
0/0.
Most Cisco routers have network slots that support modular network interfaces. This means that
the routers are a future proof investment in the sense that when upgrading the capacity of the
network, for example from 100 Mbps FastEthernet to 1 Gbps GigabitEthernet and further to 10
Gbps TenGigabitEthernet, you can simply swap between modular interfaces and still use the
same router. Modular Ethernet cards for Cisco routers usually have multiple Ethernet ports on
each card.
In order to uniquely identify the modular cards and the ports on each one of these cards, a
naming convention is used. In the figure above, FastEthernet 0/0 specifies that this FastEthernet
modular card is inserted in the first network module on the router (module 0, represented by the
first 0 in 0/0) and is the first port on that card (port 0, represented by the second 0 in 0/0).
Following this logic, FastEthernet 0/1, references the second FastEthernet port on the first
FastEthernet module and FastEthernet 1/2, references the third FastEthernet port on the second
FastEthernet module. Cisco routers support a large number of network modules implementing
different technologies including the following: FastEthernet (rarely used these days),
GigabitEthernet, 10GigabitEthernet, 100GigabitEthernet, point to point Serial, and more.
Going back to the network diagram above, we see two routers, RTR1 and RTR2, connected
through a serial network connection. Interface FastEthernet 0/0 on RTR1
connects to a switch that provides network connectivity to a server and 20 hosts in the
Administration organization. Interface FastEthernet 0/0 on router RTR2 connects to 4 switches
that provide network connectivity to 64 hosts in the Instructor group. Interface FastEthernet 1/0
on Router RTR2 connects to 20 switches that provide network connectivity to 460 hosts in the
Student group.
Packet Tracer is a great tool for building and testing networks and network equipment. As a
developer, it is important that you are familiar with network devices and how they communicate
with each other. The simple network in this Packet Tracer activity is pre-configured to give you
an opportunity to explore the devices.
The internet was built on various standards. You should understand the standard
network protocols so you can communicate and troubleshoot effectively.
Each protocol meets a need and uses standard port values. You should know when to
use a particular protocol and know the standard port for connections. Many developers
have been puzzled by a mismatched port value; therefore, checking these values can
be a first line of attack when troubleshooting.
SSH connections can use a public key for authentication, rather than sending a
username and password over the network. This authentication method means that SSH
is a good choice to connect to network devices, to cloud devices, and to containers.
By default, SSH uses port 22 and Telnet uses port 23. Telnet can use port 992 when
creating a session over Transport Layer Security (TLS) or SSL.
HTTP and its secure version, HTTPS, are both protocols recognized by web browsers
and are used to connect to web sites. HTTPS uses TLS or SSL to make a secure
connection. You can see the http: or https: in the address bar on your browser. Many
browsers also recognize ssh: and ftp: protocols and allow you to connect to remote
servers in that way as well.
Later in this course, you will use NETCONF and RESTCONF to manage a Cisco router.
NETCONF uses port 830. RESTCONF does not have a reserved port value. You may
see various implementations of different values. Commonly the port value is in the
8000s.
To support multiple network operations at the same time, each protocol is assigned a default
port, and following these standards helps avoid conflicts. TCP and UDP traffic requires that a
destination port be specified for each packet. The source port is automatically
generated by the sending device. The following table shows some common, well-known
port values for protocols used in this course. System port numbers are in the range 0 to
1023, though you may see others in use for different reasons.
Note: For a more complete list of ports, search the internet for TCP and UDP port
numbers.
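You can also look up well-known port numbers from Python using the standard socket module, which reads the local services database on your machine. The service names below are examples; availability of each entry depends on the operating system's services file.
import socket

# Look up the well-known port registered for each service name.
for service in ["ssh", "telnet", "http", "https", "ntp", "snmp"]:
    try:
        print(service, socket.getservbyname(service))
    except OSError:
        print(service, "not found in the local services database")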
5.5.2 DHCP
As you have seen previously in this module, IP addresses are needed by all devices
connected to a network in order for them to be able to communicate. Assigning these IP
addresses manually and one at a time for each device on the network is cumbersome
and time consuming. DHCP was designed to dynamically configure devices with IP
addressing information. DHCP works within a client/server model, where designated
DHCP servers allocate IP addresses and deliver configuration information to devices
that are configured to dynamically request addressing information.
In addition to the IP address for the device itself, a DHCP server can also provide
additional information, like the IP address of the DNS server, default router, and other
configuration parameters. For example, Cisco wireless access points use option 43 in
DHCP requests to obtain the IP address of the Wireless LAN Controller that they need
to connect to for management purposes.
DHCP defines a process by which the DHCP server knows the IP subnet in which the
client resides, and it can assign an IP address from a pool of available addresses in that
subnet. The rest of the network configuration parameters that a DHCP server supplies,
like the default router IP address or the IP address of the DNS server, are usually the
same for the whole subnet so the DHCP server can have these configurations per
subnet rather than per host.
The specifications for the IPv4 DHCP protocol are described in RFC 2131 - Dynamic
Host Configuration Protocol and RFC 2132 - DHCP options and BOOTP Vendor
Extensions. DHCP for IPv6 was initially described in RFC 3315 - Dynamic Host
Configuration Protocol for IPv6 (DHCPv6) in 2003, but this has been updated by several
subsequent RFCs. RFC 3633 - IPv6 Prefix Options for Dynamic Host Configuration
Protocol (DHCP) version 6 added a DHCPv6 mechanism for prefix delegation, and RFC
3736 - Stateless Dynamic Host Configuration Protocol (DHCP) Service for IPv6 defined a
stateless DHCPv6 service. The main difference between DHCP for IPv4 and DHCP for IPv6 is that DHCP
for IPv6 does not include the default gateway address. The default gateway address
can only be obtained automatically in IPv6 from the Router Advertisement message.
DHCP Relay
In cases in which the DHCP client and server are located in different subnets, a DHCP
relay agent can be used. A relay agent is any host that forwards DHCP packets
between clients and servers. Relay agent forwarding is different from the normal
forwarding that an IP router performs, where IP packets are routed between networks
transparently. Relay agents receive inbound DHCP messages and then generate new
DHCP messages on another interface, as shown in the figure.
Clients send DHCP messages to DHCP servers using destination UDP port 67. DHCP servers
send DHCP messages to clients using destination UDP port 68.
DHCP operation includes four messages exchanged between the client and the server:
DHCPDISCOVER, DHCPOFFER, DHCPREQUEST, and DHCPACK.
The figure shows how these messages are sent between the client and server.
In the figure, the client broadcasts a DHCPDISCOVER message looking for a DHCP server. The
server responds with a unicast DHCPOFFER. If there is more than one DHCP server on the local
network, the client may receive multiple DHCPOFFER messages. Therefore, it must choose
between them, and broadcast a DHCPREQUEST message that identifies the explicit server and
lease offer that the client is accepting. The message is sent as a broadcast so that any other
DHCP servers on the local network will know the client has requested configuration from
another DHCP server.
A client may also choose to request an address that it had previously been allocated by the
server. Assuming that the IPv4 address requested by the client, or offered by the server, is still
available, the server sends a unicast DHCP acknowledgment (DHCPACK) message that
acknowledges to the client that the lease has been finalized. If the offer is no longer valid, then
the selected server responds with a DHCP negative acknowledgment (DHCPNAK) message. If a
DHCPNAK message is returned, then the selection process must begin again with a new
DHCPDISCOVER message. After the client has the lease, it must be renewed prior to the lease
expiration through another DHCPREQUEST message.
The DHCP server ensures that all IP addresses are unique (the same IP address cannot be
assigned to two different network devices simultaneously). Most ISPs use DHCP to allocate
addresses to their customers.
DHCPv6 has a set of messages that is similar to those for DHCPv4. The DHCPv6 messages are
SOLICIT, ADVERTISE, INFORMATION REQUEST, and REPLY.
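The four-message exchange can be modeled with a toy Python sketch. This is purely illustrative: the address pool, the message names as plain strings, and the single-server assumption are simplifications invented for the example, not a real DHCP implementation.
# Toy model of the DHCPDISCOVER / DHCPOFFER / DHCPREQUEST / DHCPACK exchange.
pool = ["10.1.1.50", "10.1.1.51", "10.1.1.52"]   # hypothetical per-subnet address pool
leases = {}                                      # client MAC -> leased IP address

def dhcp_server(message, client_mac, requested_ip=None):
    if message == "DHCPDISCOVER":
        return ("DHCPOFFER", pool[0])             # offer the first free address
    if message == "DHCPREQUEST":
        if requested_ip in pool:
            pool.remove(requested_ip)
            leases[client_mac] = requested_ip
            return ("DHCPACK", requested_ip)      # lease finalized
        return ("DHCPNAK", None)                  # offer no longer valid

offer_type, offered_ip = dhcp_server("DHCPDISCOVER", "00:0a:00:00:00:01")
ack_type, leased_ip = dhcp_server("DHCPREQUEST", "00:0a:00:00:00:01", offered_ip)
print(offer_type, offered_ip, ack_type, leased_ip)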
5.5.3 DNS
In data networks, devices are labeled with numeric IP addresses to send and receive data over
networks. Domain names were created to convert the numeric address into a simple,
recognizable name.
Note: You will not be able to access www.cisco.com by simply entering an IP address such as
198.133.219.25 in your web browser.
The DNS protocol defines an automated service that matches domain names to IP addresses. It
includes the format for queries, responses, and data. DNS uses a single format called a DNS
message. This message format is used for all types of client queries and server responses, error
messages, and the transfer of resource record information between servers.
The DNS server stores different types of resource records that are used to resolve names. These
records contain the name, address, and type of record. Some of these record types are as follows:
A - an end device IPv4 address
NS - an authoritative name server
AAAA - an end device IPv6 address
MX - a mail exchange record
When a client makes a query to its configured DNS server, the DNS server first looks at its own
records to resolve the name. If it is unable to resolve the name by using its stored records, it
contacts other servers to resolve the name. After a match is found and returned to the original
requesting server, the server temporarily stores the numbered address in the event that the same
name is requested again.
The DNS client service on Windows PCs also stores previously resolved names in memory.
The ipconfig /displaydns command displays all of the cached DNS entries.
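You can also trigger name resolution from Python with the standard socket module; the resolver configured on your machine (and its cache) does the actual work. The hostname below is just an example and any resolvable name can be used.
import socket

# Resolve a domain name to its IP addresses using the system's configured DNS resolver.
hostname = "www.cisco.com"
for family, _type, _proto, _canon, sockaddr in socket.getaddrinfo(hostname, None):
    print(family.name, sockaddr[0])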
As shown in the table, DNS uses the same message format between servers. It consists of a
question, answer, authority, and additional information for all types of client queries and server
responses, error messages, and transfer of resource record information.
DNS Hierarchy
The different top-level domains represent either the type of organization or the country of origin.
Examples of top-level domains are the following:
.com - a business or industry
.org - a non-profit organization
.au - Australia
.co - Colombia
Note: For more examples, search the internet for a list of all the top-level domains.
5.5.4 SNMP
SNMP was developed to allow administrators to manage devices such as servers, workstations,
routers, switches, and security appliances. It enables network administrators to monitor and
manage network performance, find and solve network problems, and plan for network growth.
SNMP is an application layer protocol that provides a message format for communication
between managers and agents.
There are several versions of SNMP that have been developed through the years:
SNMPv1 - the original version of the protocol
SNMPv2c - a revision that uses a community-string-based administrative framework and adds protocol operations
SNMPv3 - the latest version, which adds security features such as authentication, encryption, and message integrity
SNMP version 1 is rarely used anymore, but versions 2c and 3 are still extensively used. In
comparison to previous versions, SNMPv2c includes additional protocol operations and 64-bit
performance monitoring support. SNMPv3 focused primarily on improving the security of the
protocol. SNMPv3 includes authentication, encryption, and message integrity.
The figure shows the relationships among the SNMP manager, agents, and managed devices.
To configure SNMP on a networking device, it is first necessary to define the
relationship between the SNMP manager and the device (the agent).
The SNMP manager is part of a network management system (NMS). The SNMP
manager runs SNMP management software. As shown in the figure, the SNMP
manager can collect information from an SNMP agent by using the “get” action. It can
also change configurations on an agent by using the “set” action. In addition, SNMP
agents can forward information directly to the SNMP manager by using “traps”.
SNMP Operation
An SNMP agent running on a device collects and stores information about the device and its
operation. This information is stored locally by the agent in the Management Information Base
(MIB). The SNMP manager then uses the SNMP agent to access information within the MIB and
to make changes to the device configuration.
There are two primary SNMP manager requests, get and set. A get request is used by the SNMP
manager to query the device for data. A set request is used by the SNMP manager to change
configuration variables in the agent device. A set request can also initiate actions within a
device. For example, a set can cause a router to reboot, send a configuration file, or receive a
configuration file.
SNMP Polling
The NMS can be configured to periodically have the SNMP managers poll the SNMP agents that
are residing on managed devices using the get request. The SNMP manager queries the device
for data. Using this process, a network management application can collect information to
monitor traffic loads and to verify the device configurations of managed devices. The
information can be displayed via a GUI on the NMS. Averages, minimums, or maximums can be
calculated. The data can be graphed, or thresholds can be established to trigger a notification
process when the thresholds are exceeded. For example, an NMS can monitor CPU utilization of
a Cisco router. The SNMP manager samples the value periodically and presents this information
in a graph for the network administrator to use in creating a baseline, creating a report, or
viewing real time information.
SNMP Traps
Periodic SNMP polling does have disadvantages. First, there is a delay between the time that an
event occurs and the time that it is noticed (via polling) by the NMS. Second, there is a trade-off
between polling frequency and bandwidth usage.
To mitigate these disadvantages, it is possible for SNMP agents to generate and send traps to
inform the NMS immediately of certain events. Traps are unsolicited messages alerting the
SNMP manager to a condition or event on the network. Examples of trap conditions include, but
are not limited to, improper user authentication, restarts, link status (up or down), MAC address
tracking, closing of a TCP connection, loss of connection to a neighbor, or other significant
events. Trap notifications reduce network and agent resource usage by eliminating the need for
some of the SNMP polling requests.
For SNMP to operate, the NMS must have access to the MIB. To ensure that access requests are
valid, some form of authentication must be in place.
SNMPv1 and SNMPv2c use community strings to control access to the MIB. Community
strings are plaintext passwords that authenticate access to MIB objects. There are two types of
community strings:
Read-only (ro) - This type provides access to the MIB variables, but does not allow these
variables to be changed. Because security is minimal in version 2c, many organizations
use SNMPv2c in read-only mode.
Read-write (rw) - This type provides read and write access to all objects in the MIB.
To get or set MIB variables, the user must specify the appropriate community string for read or
write access.
The agent captures data from MIBs, which are data structures that describe SNMP network
elements as a list of data objects. Think of the MIB as a "map" of all the components of a device
that are being managed by SNMP. To monitor devices, the SNMP manager must compile the
MIB file for each equipment type in the network. Given an appropriate MIB, the agent and
SNMP manager can use a relatively small number of commands to exchange a wide range of
information with one another.
The MIB is organized in a tree-like structure with unique variables represented as terminal
leaves. An Object IDentifier (OID) is a long numeric tag. It is used to distinguish each variable
uniquely in the MIB and in the SNMP messages. Variables that measure things such as CPU
temperature, inbound packets on an interface, fan speed, and other metrics, all have associated
OID values. The MIB associates each OID with a human-readable label and other parameters,
serving as a dictionary or codebook. To obtain a metric (such as the state of an alarm, the host
name, or the device uptime), the SNMP manager puts together a get packet that includes the OID
for each object of interest. The SNMP agent on the device receives the request and looks up each
OID in its MIB. If the OID is found, a response packet is assembled and sent with the current
value of the object included. If the OID is not found, an error response is sent that identifies the
unmanaged object.
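As a developer, you would typically perform this kind of get from a script. The sketch below assumes the third-party pysnmp library is installed and that a device at a placeholder address allows SNMPv2c read access with the community string public; adjust the address, community, and OID for your own environment.
from pysnmp.hlapi import (SnmpEngine, CommunityData, UdpTransportTarget,
                          ContextData, ObjectType, ObjectIdentity, getCmd)

# SNMPv2c get of sysName (OID 1.3.6.1.2.1.1.5.0) from a hypothetical managed device.
iterator = getCmd(
    SnmpEngine(),
    CommunityData("public", mpModel=1),            # v2c read-only community
    UdpTransportTarget(("198.51.100.20", 161)),    # placeholder agent address
    ContextData(),
    ObjectType(ObjectIdentity("1.3.6.1.2.1.1.5.0")),
)

error_indication, error_status, error_index, var_binds = next(iterator)
if error_indication or error_status:
    print("SNMP error:", error_indication or error_status.prettyPrint())
else:
    for oid, value in var_binds:
        print(oid.prettyPrint(), "=", value.prettyPrint())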
SNMP traps are used to generate alarms and events that are happening on the device. Traps
contain:
OIDs that identify each event and match it with the entity that generated the event
Severity of the alarm (critical, major, minor, informational or event)
A date and time stamp
SNMP Communities
SNMP community names are used to group SNMP trap destinations. When community names
are assigned to SNMP traps, the request from the SNMP manager is considered valid if the
community name matches one configured on the managed device. If so, all agent-managed MIB
variables are made accessible.
If the community name does not match, however, SNMP drops the request. New devices are
often preconfigured with SNMP enabled and with basic communities named public for read-only
access and private for read/write access to the system. From a security perspective, it is very
important to either rename these communities, remove them completely (and disable SNMP if it
is not used), or apply an access list to the community that limits access to only the IP address or
hostname of the SNMP manager station.
SNMP Messages
SNMP uses the following messages to communicate between the manager and the agent:
Get
GetNext
GetResponse
Set
Trap
The Get and GetNext messages are used when the manager requests information for a specific
variable. When the agent receives a Get or GetNext message it will issue a GetResponse message
back to the manager. The response message will contain either the information requested or an
error message indicating why the request cannot be processed.
A Set message is used by the manager to request that a change be made to the value of a
specific variable. Similar to the Get and GetNext requests, the agent responds with a
GetResponse message indicating either that the change has been made successfully or containing
an error message indicating why the requested change cannot be implemented.
The Trap message is used by the agent to inform the manager when important events take place.
An SNMP Trap is a change of state message. This means it could be one of the following:
an alarm
a clear
a status message
Several Requests for Comments (RFCs) have been published throughout the years concerning
SNMP. Some notable ones are: RFC 1155 - Structure and Identification of Management
Information for the TCP/IP-based Internets, RFC 1213 - Management Information Base for
Network Management of TCP/IP-based Internets: MIB-II and RFC 2578 - Structure of
Management Information Version 2 (SNMP).
5.5.5 NTP
Accurate time, and making sure all devices in the network have a uniform and correct view of
time, has always been a critical component of ensuring smooth operation of the infrastructure.
IT infrastructure, including the network, compute, and storage, has become critical for the
success of nearly all businesses today. Every second of downtime or unavailability of services
over the network can be extremely expensive. In cases where these issues extend over hours or
days it can mean bankruptcy and going out of business. Service Level Agreements (SLAs) are
contracts between parties that consume infrastructure services and parties that provide these
services. Both parties depend on accurate and consistent timing from a networking perspective.
Time is fundamental to measuring SLAs and enforcing contracts.
The system clock on each device is the heart of the time service. The system clock runs from the
second the operating system starts and keeps track of the date and time. The system clock can be
set to update from different sources and can be used to distribute time to other systems through
various mechanisms. Most network devices contain a battery-powered clock to initialize the
system clock. The battery-powered clock tracks date and time across restarts and power outages.
The system clock keeps track of time based on Coordinated Universal Time (UTC), which is
equivalent to Greenwich Mean Time (GMT). Information about the local time zone and regional
daylight saving time can be configured to enable display of local time and date wherever the
server is located.
NTP Overview
Network Time Protocol (NTP) enables a device to update its clock from a trusted network time
source, compensating for local clock drift. A device receiving authoritative time can be
configured to serve time to other machines, enabling groups of devices to be closely
synchronized.
NTP uses UDP port 123 as source and destination. RFC 5905 contains the definition of NTP
Version 4, which is the latest version of the protocol. NTP is used to distribute and synchronize
time among distributed time servers and clients. A group of devices on a network that are
configured to distribute NTP and the devices that are updating their local time from these time
servers form a synchronization subnet. Multiple NTP time masters (primary servers) can exist in
the same synchronization subnet at the same time. NTP does not specify any election mechanism
between multiple NTP servers in the same synchronization subnet. All available NTP servers can
be used for time synchronization at the same time.
An authoritative time source is usually a radio clock, or an atomic clock attached to a time
server. Authoritative server in NTP lingo just means a very accurate time source. It is the role of
NTP to distribute the time across the network. NTP clients poll groups of time servers at
intervals managed dynamically to reflect changing network conditions (primarily latency) and
the judged accuracy of each time server consulted (determined by comparison with local clock
time). Only one NTP transaction per minute is needed to synchronize the time between two
machines.
NTP uses the concept of strata (layers) to describe how far away a host is from an authoritative
time source. The most authoritative sources are in stratum 1. These are generally servers
connected directly to a very accurate time source, like a rubidium atomic clock. A stratum 2 time
server receives time from a stratum 1 server, and so on. When a device is configured to
communicate with multiple NTP servers, it will automatically pick the lowest stratum number
device as its time source. This strategy builds a self-organizing tree of NTP speakers. NTP
performs well over packet-switched networks like the internet, because it makes correct
estimates of the following three variables in the relationship between a client and a time server:
Network delay
Dispersion of time packet exchanges - a measure of maximum clock error between the two hosts
Clock offset - the correction applied to a client's clock to synchronize it
It is not uncommon to see NTP clock synchronization at the 10 millisecond level over long
distance WANs with devices as far apart as 2000 miles, and at the 1 millisecond level for LANs.
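You can observe the offset and delay estimates from a script. The sketch below assumes the third-party ntplib package is installed (it is not part of the standard library); the public pool.ntp.org server name is used only as an example.
import ntplib
from time import ctime

client = ntplib.NTPClient()
response = client.request("pool.ntp.org", version=3)   # a single NTP transaction

print("Server time :", ctime(response.tx_time))   # transmit timestamp from the server
print("Offset (s)  :", response.offset)           # estimated local clock offset
print("Delay (s)   :", response.delay)            # round-trip network delay
print("Stratum     :", response.stratum)          # how far from an authoritative source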
NTP avoids synchronizing with upstream servers whose time is not accurate. It does this in two
ways:
NTP never synchronizes with an NTP server that is not itself synchronized.
NTP compares time reported by several NTP servers, and will not synchronize to a server
whose time is an outlier, even if its stratum is lower than the other servers' stratum.
Communications between devices running NTP, also known as associations, are usually statically
configured. Each device is given the IP address or hostname of all NTP servers it should
associate with, and it connects with them directly to solicit time updates. In a LAN environment,
NTP can be configured to use IP broadcast messages instead. Configuration complexity is
reduced in this case because each device can be configured to send or receive broadcast
messages. The downside with this situation is that the accuracy of timekeeping is a bit reduced
because the information flow is only one-way.
The time kept on a device is a critical resource. It is strongly recommended to use the security
features that come with NTP to avoid the accidental or malicious configuration of incorrect time.
The two security features usually used are:
An access list-based restriction scheme in which NTP traffic is allowed in the network
only from specific sources.
An encrypted authentication mechanism in which both the clients and the servers
authenticate each other securely.
Clients usually synchronize with the lowest stratum server they can access. But NTP
incorporates safeguards as well: it prefers to have access to at least three lower-stratum time
sources (giving it a quorum), because this helps it determine if any single source is incorrect.
When all servers are well synchronized, NTP chooses the best server based on a range of
variables: lowest stratum, network distance (latency), and precision claimed. This suggests that
while one should aim to provide each client with three or more sources of lower stratum time, it
is not necessary that all these sources be of highest quality. For example, good backup service
can be provided by a same-stratum peer that receives time from different lower stratum sources.
In order to determine if a server is reliable, the client applies many sanity checks:
Timeouts to prevent trap transmissions if the monitoring program does not renew this
information after a lengthy interval.
Checks on authentication, range bounds, and to avoid use of very old data.
Checks warn that the server's oscillator (local clock tick-source) has gone too long
without update from a reference source.
Recent additions to avoid instabilities when a reference source changes rapidly due to
severe network congestion.
If any one of these checks fails, the device declares the source insane.
NTP supports three association modes:
Client/Server
Symmetric Active/Passive
Broadcast
Client/Server Mode
Client/server mode is the most common. In this mode, a client or dependent server can
synchronize with a group member, but not the reverse, which protects against protocol attacks or
malfunctions.
Client-to-server requests are made via asynchronous remote procedure calls, where the client
sends a request and expects a reply at some future time (unspecified). This is sometimes
described as "polling". On the client side, client/server mode can be turned on with a single
command or config-file change, followed by a restart of the NTP service on the host. No
additional configuration is required on the NTP server.
In this mode, a client requests time from one or more servers and processes replies as received.
The server changes addresses and ports, overwrites certain message fields, recalculates the
checksum, and returns the message immediately. Information included in the NTP message lets
the client determine the skew between server and local time, enabling clock adjustment. The
message also includes information to calculate the expected timekeeping accuracy and reliability,
as well as help the client select the best server.
Servers that provide time to many clients normally operate as a three-server cluster, each
deriving time from three or more stratum 1 or 2 servers as well as all other members of the
group. This protects against situations where one or more servers fail or become inaccurate. NTP
algorithms are designed to identify and ignore wrongly-functioning time sources and are even
resilient against attacks where NTP servers are subverted deliberately and made to send incorrect
time to clients. As backup, local hosts can be equipped with external clocks and made to serve
time temporarily, in case normal time sources, or communications paths used to reach them, are
disrupted.
Symmetric Active/Passive Mode
In this mode, a group of low stratum peers work as backups for one another. Each peer derives
time from one or more primary reference sources or from reliable secondary servers. As an
example, a reference source could be a radio clock, receiving a corrected time signal from an
atomic clock. Should a peer lose all reference sources or stop working, the other peers
automatically reconfigure to support one another. This is called a 'push-pull' operation in some
contexts: peers either pull or push time, depending on self-configuration.
Symmetric/active mode is usually configured by declaring a peer in the configuration file, telling
the peer that one wishes to obtain time from it and to provide time back to it if necessary. This
mode works well in configurations where redundant time servers are interconnected via diverse
network paths, which is a good description of how most stratum 1 and stratum 2 servers on the
internet are set up today.
Symmetric modes are most often used to interconnect two or more servers that work as a
mutually redundant group. Group members arrange their synch paths to minimize network jitter
and propagation delay.
Configuring a peer in symmetric/active mode is done with the peer command, and then providing
the DNS name or address of the other peer. The other peer may also be configured in symmetric
active mode in this way.
Broadcast Mode
When only modest requirements for accuracy exist, clients can use NTP broadcast and/or
multicast modes, where many clients are configured the same way, and one broadcast server (on
the same subnet) provides time for them all. Broadcast messages are not propagated by routers,
meaning that this mode cannot be used beyond a single subnet.
Configuring a broadcast server is done using the broadcast command, and then providing a local
subnet address. The broadcast client command lets the broadcast client respond to broadcast
messages received on any interface. This mode should always be authenticated, because an
intruder can impersonate a broadcast server and propagate false time values.
5.5.6 NAT
Although implementation of IPv6, with its roughly 340 undecillion addresses, is proceeding, IPv4
is still widely used. IPv4 can only accommodate a maximum of slightly over 4 billion unique
addresses (2 to the 32nd power). This creates problems. Given the necessarily limited range of public or
external IPv4 addresses an organization (or subnetwork) can control, how can many more
devices use these addresses to communicate outside?
Network Address Translation (NAT) helps with the problem of IPv4 address depletion. NAT
works by mapping many private internal IPv4 addresses to a range of public addresses or to one
single address (as is done in most home networks). NAT identifies traffic to and from a specific
device, translating between external/public and internal/private IPv4 addresses.
NAT also hides clients on the internal network behind a range of public addresses, providing a
"sense of security" against these devices being directly attacked from outside. As mentioned
previously, the IETF does not consider private IPv4 addresses or NAT as effective security
measures.
NAT is supported on a large number of routers from different vendors for IPv4 address
simplification and conservation. In addition, with NAT you can select which internal hosts are
available for NAT and hence external access.
NAT can be configured on hosts and routers requiring it, without requiring any changes to hosts
or routers that do not need NAT. This is an important advantage.
Purpose of NAT
By mapping between external and internal IPv4 addresses, NAT allows an organization with
non-globally-routable IPv4 addresses to connect to the internet by translating addresses into a
globally-routable IPv4 address. Non-globally-routable addresses, or private addresses, are defined
by RFC 1918 (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16). These addresses are private and cannot
be routed on the internet. With the exception of a few other types of IPv4 addresses, other IPv4
addresses are globally-routable on the internet. They are known as public IPv4 addresses.
NAT also enables an organization to easily change service providers or voluntarily renumber
network resources without affecting their public IPv4 address space. NAT is an IETF standard
and is described in RFC 1631 - The IP Network Address Translator (NAT).
NAT is typically configured at the point of connection between the internal network and the
outside network or the internet. For each packet exiting the domain, NAT translates the source
address into a globally unique address, and vice-versa. Networks with more than one point of
entrance/exit require multiple NATs, sharing the same translation table. If NAT runs out of
addresses from the pool, it drops the packet and sends an ICMP host unreachable message to the
destination.
Used in the context of NAT, the term "inside" usually means networks controlled by an
organization, with addresses in one local address space. "Outside" refers to networks to which
the network connects, and which may not be under the organization's control. Hosts in outside
networks may also be subject to translation, with their own local and global IPv4 addresses.
Types of NAT
NAT typically runs on a router. Before packets are forwarded between networks, NAT translates
the private (inside local) addresses within the internal network into public (inside global)
addresses. This functionality gives the option to configure NAT so that it advertises only a single
address to the outside world, for the entire internal network. By so doing, NAT effectively hides
the internal network from the world.
Static address translation (static NAT) - This is one-to-one mapping between global
and local IPv4 addresses.
Dynamic address translation (dynamic NAT) - This maps unregistered (private) IPv4
addresses to registered (public) IPv4 addresses drawn from a pool.
Overloading (also called Port Address Translation or PAT) - This maps many
unregistered IPv4 addresses to a single registered address (many to one) on different
ports. Through overloading, thousands of users can be connected to the internet by using
only one real global IP address.
IPv6 was developed with the intention of making NAT unnecessary. However, IPv6 does include
its own IPv6 private address space called unique local addresses (ULAs). IPv6 ULAs are similar
to RFC 1918 private addresses in IPv4 but have a different purpose. ULAs are meant for only
local communications within a site. ULAs are not meant to provide additional IPv6 address
space, nor to provide a level of security.
IPv6 does provide for protocol translation between IPv4 and IPv6. This is known as NAT64.
NAT for IPv6 is used in a much different context than NAT for IPv4. The varieties of NAT for
IPv6 are used to transparently provide access between IPv6-only and IPv4-only networks. It is
not used as a form of private IPv6 to global IPv6 translation.
Four NAT Addresses
When determining which type of address is used, it is important to remember that NAT
terminology is always applied from the perspective of the device with the translated address:
Inside address - This is the address of the device which is being translated by NAT.
Outside address - This is the address of the destination device.
NAT also uses the concept of local or global with respect to addresses:
Local address - This is any address that appears on the inside portion of the network.
Global address - This is any address that appears on the outside portion of the network.
IPv4 addresses can be translated into globally-unique IPv4 addresses when communicating
outside the internal network. There are two options to accomplish this:
Static translation - This method sets up a one-to-one mapping between an inside local
address and an inside global address. This is useful when a host on the inside must be
accessed from a fixed outside address.
Dynamic translation - This method maps between inside local addresses and a global
address pool.
The figure shows a device translating a source address inside a network to a source address
outside the network.
As shown in the above figure, inside source address translation works as follows:
1. The user at host 10.1.1.1 opens a connection to Host B. The NAT device checks its NAT
table and, if no translation entry exists for this inside local address, creates one that maps
10.1.1.1 to an inside global address.
2. The NAT device swaps the inside local source address of host 10.1.1.1 with the global
address of the translation entry, then forwards the packet.
3. Host B receives the packet and replies to host 10.1.1.1 using the inside global IP
destination address (DA) 203.0.113.20.
4. The NAT device uses the inside global address as a key, performs a NAT table lookup,
and then translates it to the inside local address of host 10.1.1.1 before forwarding the
packet.
5. Host 10.1.1.1 receives the packet and continues the exchange, beginning with Step 1
above.
PAT
PAT works as follows. Both Host B and Host C think they are communicating with a single host
at address 203.0.113.20. They are actually communicating with different hosts, differentiated by
port number:
1. The user at host 10.1.1.1 opens a connection to Host B. The NAT device checks its NAT
table:
If no translation entry exists, the device translates the inside local address to a global
address from the available pool.
If another translation is ongoing (presuming overloading is enabled), the device reuses
that translation's global address and saves information in the NAT table that can be used
to reverse the process, translating the global address back to the proper local address.
This is called an extended entry.
2. The NAT device swaps source address 10.1.1.1 with the global address, then forwards
the packet.
3. Host B receives the packet and responds by using the inside global IP address
203.0.113.20.
4. The NAT device then uses the inside and outside addresses and port numbers to perform
a NAT table lookup. It translates to the inside local address 10.1.1.1 and forwards the
packet.
5. Host 10.1.1.1 receives the packet and continues the exchange, beginning with Step 1
above.
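The extended (overloaded) translation entries can be pictured as a small table keyed by address and port. The following Python sketch is a simplified illustration; the single public address comes from the example above, while the translated port range and helper names are invented for the sketch.
import itertools

inside_global = "203.0.113.20"        # single public address shared by all inside hosts
next_port = itertools.count(30000)    # hypothetical pool of translated source ports
pat_table = {}                        # (global address, global port) -> (local address, local port)

def translate_outbound(local_ip, local_port):
    global_port = next(next_port)
    pat_table[(inside_global, global_port)] = (local_ip, local_port)
    return inside_global, global_port  # source address/port as seen by the outside host

def translate_inbound(global_ip, global_port):
    return pat_table.get((global_ip, global_port))  # reverse lookup for returning traffic

src = translate_outbound("10.1.1.1", 49152)   # inside host talks to Host B
print("Outside sees source:", src)
print("Reply maps back to :", translate_inbound(*src))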
Many services run on networks behind the scene to make things happen reliably and efficiently.
As a developer, you should understand what services are available and how they can help you.
You should also understand the basics of how the most useful and popular services are
configured. In Packet Tracer, these services are simulated, and the configuration is simple and
straightforward. However, Packet Tracer does a very good job at simulating the actual traffic. As
you work through this lab and send traffic, we encourage you to switch to Simulation mode to
explore the contents of the various types of packets that the network is generating.
Network troubleshooting usually follows the OSI layers. You can start from the top, beginning at
the application layer and making your way down to the physical layer, or you can go from the
bottom to the top. In this example, we will cover a typical troubleshooting session starting from
the physical layer and making our way up the stack of layers towards the application layer.
First and foremost, from a client perspective, it is very important to determine how the client
connects to the network. Is it a wired or wireless connection?
If the client connects via an Ethernet cable, make sure the NIC comes online and there are
electrical signals being exchanged with the switch port to which the cable is connected.
Depending on the operating system that the client is running, the status of the network
connection will show as a solid green, or it will display "enabled" or "connected" text next to the
network interface card in the operating system settings. If the NIC shows as connected, you
know the physical layer is working as expected and can move the troubleshooting up the stack. If
the NIC on the client does not show up as connected or enabled, then check the configuration on
the switch. The port to which the client is connecting might be shut down, or maybe the cable
connecting the client to the network port in the wall is defective, or the cable connecting the
network port from the wall all the way to the switch might be defective. Troubleshooting at the
physical layer basically boils down to making sure there are four uninterrupted pairs of twisted
copper cables between the network client and the switch port.
If the client wirelessly connects to the network, make sure that the wireless network interface is
turned on and it can send and receive wireless signals to and from the nearest wireless access
point. Also, make sure you stay in the range of the wireless access point as long as you need to
be connected to the network.
Moving up to the data link layer, or Layer 2, make sure the client is able to learn destination
MAC addresses (using ARP) and also that the switch to which the client is connecting is able to
learn the MAC addresses received on its ports. On most operating systems you can view the ARP table with a form of the arp command (such as arp -a on a Windows 10 PC), and on Cisco switches you can verify the MAC address table with the command show mac address-table. If you can verify that both of these tables are accurate, then you can move to the next layer. If the client cannot see any MAC addresses in its local ARP table, check for any Layer 2
access control lists on the switch port that might block this traffic. Also make sure that the switch
port is configured for the correct client VLAN.
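If you are scripting these checks, you can capture the same information from Python. The sketch below assumes the arp utility (or, on Linux, the iproute2 ip command) is installed on the client; it simply shells out and prints the neighbor/ARP table.

# Display the local ARP/neighbor table from Python.
# Assumes the 'arp' utility is present (or 'ip neigh' on Linux).
import platform
import subprocess

cmd = ["arp", "-a"]
if platform.system() == "Linux":
    cmd = ["ip", "neigh"]            # iproute2 replacement for arp on Linux

result = subprocess.run(cmd, capture_output=True, text=True)
print(result.stdout)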
At the network layer, or Layer 3, make sure the client obtains the correct IP address from the
DHCP server, or is manually configured with the correct IP address and the correct default
gateway. If the destination of your traffic is in a different subnet than the subnet you are
connected to, that means the traffic will have to be sent to the local router (default gateway).
Check Layer 3 connectivity one hop at a time. First check connectivity to the first Layer 3 hop in
the path of the traffic, which is the default gateway, to make sure you can reach it. If Layer 3
connectivity can be established all the way from the client to the destination, move on with
troubleshooting to the transport layer, or Layer 4. If Layer 3 connectivity cannot be established,
check IP access lists on the router interfaces, check the routing table on both the client and the
default gateway router and make sure the traffic is routed correctly. Routing protocol issues and
access control lists blocking IP traffic are some of the usual problems encountered at this layer.
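A rough sketch of this hop-by-hop check in Python is shown below; the gateway address 192.168.1.1 and the destination are examples only, and the script assumes the system ping command is available on the client.

# Check Layer 3 connectivity one hop at a time: first the default gateway,
# then the final destination. Addresses below are examples only.
import platform
import subprocess

def reachable(host, count=2):
    flag = "-n" if platform.system() == "Windows" else "-c"
    result = subprocess.run(["ping", flag, str(count), host],
                            capture_output=True, text=True)
    return result.returncode == 0

for hop in ["192.168.1.1", "www.cisco.com"]:     # default gateway, then destination
    print(hop, "reachable" if reachable(hop) else "NOT reachable")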
You have verified end to end communication between the source and destination of the traffic in
the previous step. This is a major milestone, so give yourself a pat on the back. You almost
cleared the network team from the responsibility of that application outage. Before blaming the
application support team there is one more thing to verify. Make sure the client can access the
port on which the application is running. If the destination of the traffic is a webpage served by a
web server, make sure TCP ports 80 (HTTP) and 443 (HTTPS) are accessible and reachable
from the client. It could be that the web server is running on a nonstandard port like 8080, so make sure you know the correct port on which the application you are trying to connect to is running.
Networking tools like curl or a custom telnet command specifying the application port can be
used to ensure the transport layer works end-to-end between the source and destination of the
traffic. If a transport connection cannot be established, verify firewalls and security appliances
that are placed on the path of traffic for rules that are blocking the traffic based on TCP and UDP
ports. Verify if any load balancing is enabled and if the load balancer is working as expected, or
if any proxy servers intercepting the traffic are filtering and denying the connection.
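A quick way to test transport-layer reachability from Python is to attempt a TCP connection to the application port, as in the sketch below. The hostname and ports are examples only; adjust them to the application you are troubleshooting.

# Verify that the transport layer works end to end by attempting a TCP
# connection to the application ports (80/443/8080 here; adjust as needed).
import socket

def port_open(host, port, timeout=3):
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

for port in (80, 443, 8080):
    state = "open" if port_open("www.cisco.com", port) else "closed/filtered"
    print(f"www.cisco.com:{port} -> {state}")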
So you got this far, checked end-to-end connectivity, you can connect to the port on which the
application is running, so the network team is in the clear, right? Almost. One additional thing to
verify is traffic load and network delay. Networking tools like iperf can generate traffic and load
stress the network to ensure that large amounts of data can be transported between the source and
destination. These issues are the most difficult to troubleshoot because they can be difficult to
reproduce. They can be temporarily caused by a spike in network traffic, or could be outside your control altogether. Implementing QoS throughout the network can help with these issues. With
QoS, traffic is categorized into different buckets and each bucket gets separate treatment from
the network. For example, real time traffic, like voice and video can be classified as such by
changing QoS fields in the Layer 2 and Layer 3 packet headers so that when switches and routers
process this type of traffic, it gets a higher priority and guaranteed bandwidth if necessary.
Or maybe you are lucky to begin with: you are verifying access to a web server or a REST API endpoint and the server returns a 500 status code. In that case, you can start troubleshooting the web server and skip all of the network troubleshooting steps.
If you got this far in your network troubleshooting, there is a good chance that the problem is not
with the network and a closer look at the application server is in order. Slow or no responses
from the application could also mean an overloaded backend database, or just faulty code
introduced through new features. In this case, solutions like Cisco AppDynamics can offer a
deeper view into application performance and root cause analysis of application issues.
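As a rough illustration, the following Python snippet (standard library only) requests a URL and reports the HTTP status code; a 5xx response points at the application or its backend rather than the network. The URL is an example only.

# Check the HTTP status code returned by a web server or REST API endpoint.
import urllib.request
import urllib.error

url = "https://www.cisco.com/"          # example endpoint
try:
    with urllib.request.urlopen(url, timeout=5) as response:
        print(url, "returned", response.status)
except urllib.error.HTTPError as err:   # 4xx/5xx responses raise HTTPError
    print(url, "returned", err.code)
except urllib.error.URLError as err:    # network-level failure (DNS, TCP, TLS)
    print("Could not connect:", err.reason)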
5.6.2 Networking Tools - Using ifconfig
ifconfig is a software utility for UNIX-based operating systems. There is also a similar utility
for Microsoft Windows-based operating systems called ipconfig. The main purpose of this
utility is to manage, configure, and monitor network interfaces and their
parameters. ifconfig runs as a command-line interface tool and comes installed by default with most operating systems.
Issuing the ifconfig --help command in the command line interface will display all the
options that are available with this version of ifconfig. The output should look similar to the
following.
From this output we can see that ifconfig gives us the option to add (add) or delete (del) IP addresses and their subnet masks (prefix lengths) on a specific network interface. The hw ether option lets us change the Ethernet MAC address. Care should be taken especially
when shutting down interfaces, because, if you are remotely connected to that host via the
interface you are shutting down, you have just disconnected your session and possibly have to
physically walk up to the device to bring it back online. That is not such a big problem when the
device is in the room next door, but it can be quite daunting when it is physically in a data center
hundreds of miles away and you have to drive for hours or even take a flight to bring it back
online.
If ifconfig is issued without any parameters, it just returns the status of all the network
interfaces on that host. For your DEVASC VM, the output should look similar to the following:
devasc@labvm:~$ ifconfig
dummy0: flags=195<UP,BROADCAST,RUNNING,NOARP>  mtu 1500
        inet 192.0.2.1  netmask 255.255.255.255  broadcast 0.0.0.0
        inet6 fe80::48db:6aff:fe27:4849  prefixlen 64  scopeid 0x20<link>
        ether 4a:db:6a:27:48:49  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 12293  bytes 2544763 (2.5 MB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0
<output omitted>
devasc@labvm:~$
MTU is the Maximum Transmission Unit and specifies the maximum number of bytes that can be carried in a frame on this medium before the frame must be fragmented.
The RX packets and RX bytes fields show the number of packets and bytes, respectively, received on that interface. For the enp0s3 interface (omitted from the output above), there were 280055 packets received, containing 281.9 MB of data. The TX packets and TX bytes fields show the number of packets and bytes transmitted on that specific interface. In the example, there were 112889 packets transmitted on enp0s3, which accounted for 10.1 MB of data.
ifconfig is still used extensively for both configuration and monitoring purposes. There are
also GUI clients for most operating systems that take this functionality into a graphical interface
and make it more visual for end users to configure network interfaces.
Note: The ifconfig command has been used within Linux for many years. However, some
Linux distributions have deprecated the ifconfig command. The ip address command is
becoming the new alternative. You will see the ip address command used in some of the
labs in this course.
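If you want to gather the same interface information from a script, one possible approach on the DEVASC VM is sketched below. It assumes a Linux host with the iproute2 ip command installed; socket.if_nameindex() is part of the Python standard library on Linux.

# List network interface names and their addresses from Python (Linux).
import socket
import subprocess

print([name for _, name in socket.if_nameindex()])      # interface names
print(subprocess.run(["ip", "-brief", "address"],
                     capture_output=True, text=True).stdout)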
Similar to ifconfig, ping is a software utility used to test IP network reachability for hosts
and devices connected to a specific network. It is also available on virtually all operating systems
and is extremely useful for troubleshooting connectivity issues. The ping utility uses Internet
Control Message Protocol (ICMP) to send packets to the target host and then waits for ICMP
echo replies. Based on this exchange of ICMP packets, ping reports errors, packet loss,
roundtrip time, time to live (TTL) for received packets, and more.
On Windows 10, enter the ping command to view its usage information. The output should
look similar to the following:
C:\> ping

Usage: ping [-t] [-a] [-n count] [-l size] [-f] [-i TTL] [-v TOS]
            [-r count] [-s count] [[-j host-list] | [-k host-list]]
            [-w timeout] [-R] [-S srcaddr] [-c compartment] [-p]
            [-4] [-6] target_name

Options:
    -t             Ping the specified host until stopped.
                   To see statistics and continue - type Control-Break;
                   To stop - type Control-C.
    -a             Resolve addresses to hostnames.
    -n count       Number of echo requests to send.
    -l size        Send buffer size.
    -f             Set Don't Fragment flag in packet (IPv4-only).
    -i TTL         Time To Live.
    -v TOS         Type Of Service (IPv4-only. This setting has been
                   deprecated and has no effect on the type of service
                   field in the IP Header).
    -r count       Record route for count hops (IPv4-only).
    -s count       Timestamp for count hops (IPv4-only).
    -j host-list   Loose source route along host-list (IPv4-only).
    -k host-list   Strict source route along host-list (IPv4-only).
    -w timeout     Timeout in milliseconds to wait for each reply.
    -R             Use routing header to test reverse route also (IPv6-only).
                   Per RFC 5095 the use of this routing header has been
                   deprecated. Some systems may drop echo requests if
                   this header is used.
    -S srcaddr     Source address to use.
    -c compartment Routing compartment identifier.
    -p             Ping a Hyper-V Network Virtualization provider address.
    -4             Force using IPv4.
    -6             Force using IPv6.

C:\>
On MacOS Catalina, enter the ping command to view its usage information. The output should
look similar to the following:
$ ping
usage: ping [-AaDdfnoQqRrv] [-c count] [-G sweepmaxsize]
            [-g sweepminsize] [-h sweepincrsize] [-i wait]
            [-l preload] [-M mask | time] [-m ttl] [-p pattern]
            [-S src_addr] [-s packetsize] [-t timeout] [-W waittime]
            [-z tos] host
       ping [-AaDdfLnoQqRrv] [-c count] [-I iface] [-i wait]
            [-l preload] [-M mask | time] [-m ttl] [-p pattern]
            [-S src_addr] [-s packetsize] [-T ttl] [-t timeout] [-W waittime]
            [-z tos] mcast-group
Apple specific options (to be specified before mcast-group or host like all options)
            -b boundif           # bind the socket to the interface
            -k traffic_class     # set traffic class socket option
            -K net_service_type  # set traffic class socket options
            -apple-connect       # call connect(2) in the socket
            -apple-time          # display current time
On your DEVASC VM, add the -help option to view its usage information. The output should
look similar to the following:
Usage
  ping [options] <destination>

Options:
  <destination>      dns name or ip address
  -a                 use audible ping
  -A                 use adaptive ping
  -B                 sticky source address
  -c <count>         stop after <count> replies
  -D                 print timestamps
  -d                 use SO_DEBUG socket option
  -f                 flood ping
  -h                 print help and exit
<output omitted>
IPv4 options:
  -4                 use IPv4
  -b                 allow pinging broadcast
  -R                 record route
  -T <timestamp>     define timestamp, can be one of <tsonly|tsandaddr|tsprespec>
IPv6 options:
  -6                 use IPv6
  -F <flowlabel>     define flow label, default is random
  -N <nodeinfo opt>  use icmp6 node info query, try <help> as argument
By default, ping (or ping -help in Linux) will display all the options it has available. Some of the options you can specify include the number of echo requests to send, the source address, and the packet size.
The output of the command in your environment will most probably look a bit different, but the
major components should be the same. We specified a count of 5 ICMP echo request packets to
be sent to www.cisco.com. The ping utility automatically does the DNS resolution and in this
case it resolved the www.cisco.com name to the 23.204.11.200 IPv4 address. The packets are
sent, and responses are received from 23.204.11.200. TTL for the received echo replies and
round trip times are calculated and displayed. The final statistics confirm that 5 ICMP echo-request packets have been transmitted and 5 ICMP echo-reply packets have been received, hence
there is a 0.0% packet loss. Statistics about the minimum, average, maximum and standard
deviation of the time it took for the packets to get to the destination and back are also displayed.
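If you need to run this check from a script, the following sketch shells out to ping and extracts the packet-loss percentage from the summary line. The regular expression matches Linux/macOS output and may need adjusting for Windows; the destination is an example only.

# Send five ICMP echo requests and pull the packet-loss figure out of the
# Linux/macOS ping summary line (exact wording varies by platform).
import re
import subprocess

result = subprocess.run(["ping", "-c", "5", "www.cisco.com"],
                        capture_output=True, text=True)
match = re.search(r"(\d+(?:\.\d+)?)% packet loss", result.stdout)
if match:
    print("packet loss:", match.group(1) + "%")
else:
    print(result.stdout)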
Keep in mind that if you do not receive any replies from the destination you are trying to reach
with ping it does not mean that the host is offline or not reachable. It could simply mean that
ICMP echo-request packets are filtered by a firewall and are not allowed to reach the destination
host. It is actually a best practice to expose only the services needed to be available on the hosts
in the network. For example, a web server would only expose TCP port 443 for secure HTTP
traffic and deny any other types of traffic either through a local firewall on the web server itself
or a network firewall.
For IPv6, Linux and MacOS provide a similar utility called ping6, and equivalents are available on most operating systems. Windows and Cisco IOS use the same ping command for both IPv4 and IPv6.
You have seen how ping can display host reachability on the network. traceroute builds on
top of that functionality and displays the route that the packets take on their way to the
destination. The Microsoft Windows alternative is also a command-line utility and is
called tracert. Observing the path the network traffic takes from its source to the destination
is extremely important from a troubleshooting perspective, as routing loops and non-optimal
paths can be detected and then remedied.
traceroute uses ICMP packets to determine the path to the destination. The Time to Live
(TTL) field in the IP packet header is used primarily to avoid infinite loops in the network. For
each hop or router that an IP packet goes through, the TTL field is decremented by one. When
the TTL field value reaches 0, the packet is discarded, avoiding the dreaded infinite loops.
Usually, the TTL field is set to its maximum value, 255, on the host that is the source of the
traffic, as the host is trying to maximize the chances of that packet getting to its
destination. traceroute reverses this logic, and gradually increments the TTL value of the
packet it is sending, from 1 and keeps adding 1 to the TTL field on the next packet and so on.
Setting a TTL value of 1 for the first packet, means the packet will be discarded on the first
router. By default, most routers send back to the source of the traffic an ICMP Time Exceeded
packet informing it that the packet has reached a TTL value of 0 and had to be
discarded. traceroute uses the information received from the router to figure out its IP
address and hostname and also round trip times.
Note: Instead of ICMP, by default, Linux uses UDP and a high port range (33434 - 33534).
Destinations along the path respond with ICMP port unreachable messages instead of the echo
replies sent in ICMP-based traceroutes.
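To make the mechanism concrete, here is a heavily simplified traceroute written in Python along the lines described in the note above (UDP probes out, ICMP Time Exceeded back). It is a sketch only: it requires root privileges for the raw ICMP socket, sends a single probe per TTL, and does not match replies to individual probes.

# Simplified Linux-style traceroute: UDP probes with increasing TTL,
# ICMP Time Exceeded (or Port Unreachable) replies read from a raw socket.
# Requires root privileges to open the raw ICMP socket.
import socket

def traceroute(dest_name, max_hops=30, port=33434, timeout=2.0):
    dest_addr = socket.gethostbyname(dest_name)
    for ttl in range(1, max_hops + 1):
        recv_sock = socket.socket(socket.AF_INET, socket.SOCK_RAW,
                                  socket.getprotobyname("icmp"))
        recv_sock.settimeout(timeout)
        send_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        send_sock.setsockopt(socket.IPPROTO_IP, socket.IP_TTL, ttl)
        send_sock.sendto(b"", (dest_addr, port))         # probe with this TTL
        try:
            _, (hop_addr, _) = recv_sock.recvfrom(512)   # router that dropped it
        except socket.timeout:
            hop_addr = None
        finally:
            send_sock.close()
            recv_sock.close()
        print(ttl, hop_addr or "*")
        if hop_addr == dest_addr:                        # reached the destination
            break

traceroute("www.cisco.com")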
On Windows 10, use tracert to see the available options as shown in the following output:
C:\> tracert

Options:
    -d                 Do not resolve addresses to hostnames.
    -h maximum_hops    Maximum number of hops to search for target.
    -j host-list       Loose source route along host-list (IPv4-only).
    -w timeout         Wait timeout milliseconds for each reply.
    -R                 Trace round-trip path (IPv6-only).
    -S srcaddr         Source address to use (IPv6-only).
    -4                 Force using IPv4.
    -6                 Force using IPv6.

C:\>
On MacOS, use traceroute to see the available options as shown in the following output:
$ traceroute
Version 1.4a12+Darwin
Usage: traceroute [-adDeFInrSvx] [-A as_server] [-f first_ttl] [-g gateway] [-i iface]
        [-M first_ttl] [-m max_ttl] [-p port] [-P proto] [-q nqueries] [-s src_addr]
        [-t tos] [-w waittime] [-z pausemsecs] host [packetlen]
On your DEVASC VM, use traceroute --help to see the available options. Among them are options for:
Specifying the TTL value of the first packet sent. By default this is 1.
Specifying the maximum TTL value. By default, it will increase the TTL value up to 64
or until the destination is reached.
Specifying the source address in case there are multiple interfaces on the host.
Specifying QoS value in the IP header.
Specifying the packet length.
Because of the way VirtualBox implements a NAT network, you cannot trace outside of your
DEVASC VM. You would need to change your VM to Bridged mode. But then, you would not
be able to communicate with the CSR1000v in other labs. Therefore, we recommend leaving
your VM in NAT mode.
However, you can tracert from your Windows device or traceroute from your MacOS
device. The following output is from a MacOS device inside the corporate Cisco network tracing
the route to one of Yahoo’s web servers.
$ traceroute www.yahoo.com
traceroute: Warning: www.yahoo.com has multiple addresses; using 98.138.219.232
traceroute to atsv2-fp-shed.wg1.b.yahoo.com (98.138.219.232), 64 hops max, 52 byte packets
 1  sjc2x-dtbb.cisco.com (10.1x.y.z)  2.422 ms  1.916 ms  1.773 ms
 2  sjc2x-dt5.cisco.com (12x.1y.1z.1ww)  2.045 ms
    sjc2x-dt5-01.cisco.com (12x.1y.1z.15w)  2.099 ms  1.968 ms
 3  sjc2x-sbb5.cisco.com (1xx.1x.1xx.4y)  1.713 ms  1.984 ms
    sjc2x-sbb5-10.cisco.com (1xx.1x.1y.4w)  1.665 ms
 4  sjc2x-rbb.cisco.com (1xx.1y.zz.yyy)  1.836 ms  1.804 ms  1.696 ms
 5  sjc1x-rbb-7.cisco.com (1xx.zz.y.ww)  68.448 ms  1.880 ms  1.939 ms
 6  sjc1x-corp-0.cisco.com (1xx.yy.z.w)  1.890 ms  2.660 ms  2.793 ms
 7  * * *
 8  * * *
 9  * * *
...
61  * * *
62  * * *
63  * * *
64  * * *
Note: The output above has been altered for security reasons, but your output should actually
have both valid hostnames and IP addresses.
From this output, we can see the first 6 hops, or routers, on the path toward www.yahoo.com. The entries for hops 2 and 3 each show two routers, suggesting there is load balancing implemented on this specific path. Round trip times are also included in the output. In this output you can also see that the traceroute traffic is not allowed outside of the corporate Cisco network, so the complete path to the destination is not available. By filtering ICMP Time Exceeded messages with firewalls, or simply disabling them at the host and router level, visibility of the path with traceroute is greatly limited. Still, even with these limitations, traceroute is an extremely useful utility to have in your tool belt for troubleshooting network-related issues.
nslookup is another command-line utility used for querying DNS to obtain domain name to IP address mappings. Like the other tools mentioned in this section, nslookup is widely available on almost all operating systems. This tool is useful to determine whether the DNS server configured on a specific host is working as expected and actually resolving hostnames to IP addresses. It could also be that a DNS server is not configured at all on the host, so make sure you check /etc/resolv.conf on UNIX-like operating systems and verify that you have at least a nameserver defined.
The DEVASC VM Linux OS does not implement a help option for the nslookup command.
However, you can enter man nslookup to learn more about the available options.
In the terminal, execute the command nslookup www.cisco.com 8.8.8.8 to resolve the
IP address or addresses for Cisco’s web server and specify that you want to use Google’s DNS
server at 8.8.8.8 to do the resolution.
Non-authoritative answer:
www.cisco.com            canonical name = www.cisco.com.akadns.net.
www.cisco.com.akadns.net             canonical name = wwwds.cisco.com.edgekey.net.
wwwds.cisco.com.edgekey.net          canonical name = wwwds.cisco.com.edgekey.net.globalredir.akadns.net.
wwwds.cisco.com.edgekey.net.globalredir.akadns.net   canonical name = e2867.dsca.akamaiedge.net.
Name:    e2867.dsca.akamaiedge.net
Address: 23.204.11.200
Name:    e2867.dsca.akamaiedge.net
Address: 2600:1404:5800:392::b33
Name:    e2867.dsca.akamaiedge.net
Address: 2600:1404:5800:39a::b33
devasc@labvm:~$
The DNS service running on server 8.8.8.8 resolved the www.cisco.com domain to three IP addresses (one IPv4 and two IPv6), as you can see above. This resolution from names to IP addresses is critically
important to the functioning of any network. It is much easier to remember www.cisco.com than
an IPv4 or IPv6 address every time you are trying to access the Cisco website.
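The same name-to-address resolution can be performed from Python using the resolver configured on the host, as in the short example below (the hostname is an example only).

# Resolve a hostname to its IPv4 and IPv6 addresses using the host's
# configured DNS resolver (the one listed in /etc/resolv.conf on Linux).
import socket

addresses = {info[4][0] for info in socket.getaddrinfo("www.cisco.com", None)}
for address in sorted(addresses):
    print(address)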
Networks have a lot of components working together to ensure connectivity and data delivery.
Often, these components may not work properly. This may be due to a simple device
misconfiguration, or many, seemingly unrelated problems that must be systematically resolved.
As a developer, you may need to troubleshoot network issues to regain connectivity. To
troubleshoot network issues, it is necessary to take a step-by-step methodical approach, using
clues to determine the problem and implement a solution. You may often find more than one
problem preventing a connection from working.
In the effort to fix network connection issues, it is important for a developer to understand how
to use basic network troubleshooting tools. These tools are used to determine what the
connection problem might be.
A network consists of end devices such as computers, mobile devices, and printers that
are connected by networking devices such as switches and routers. The network
enables the devices to communicate with one another and share data. A protocol suite
is a set of protocols that work together to provide comprehensive network
communication services. Both the OSI and the TCP/IP reference models use layers to
describe the functions and services that can occur at that layer. The form that a piece of
data takes at any layer is called a protocol data unit (PDU). At each stage of the
encapsulation process, a PDU has a different name to reflect its new functions: data,
segment, packet, frame, and bits.
The OSI reference model layers are described here from bottom to top:
1. The physical layer is responsible for the transmission and reception of raw bit streams.
2. The data link layer provides NIC-to-NIC communications on the same network.
3. The network layer provides services to allow end devices to exchange data
across networks.
4. The transport layer provides the possibility of reliability and flow control.
5. The session layer allows hosts to establish sessions between them.
6. The presentation layer specifies context between application-layer entities.
7. The application layer is the OSI layer that is closest to the end user and contains
a variety of protocols usually needed by users.
End devices implement protocols for the entire "stack", all layers. The source of the
message (data) encapsulates the data with the appropriate protocols, while the final
destination de-encapsulates each protocol header/trailer to receive the message (data).
Ethernet is a set of guidelines and rules that enable various network components to
work together. These guidelines specify cabling and signaling at the physical and data
link layers of the OSI model. In Ethernet terminology, the container into which data is
placed for transmission is called a frame. The frame contains header information, trailer
information, and the actual data that is being transmitted. Important fields of an Ethernet frame include the preamble, SFD, destination MAC address, source MAC address, type, data, and FCS. Each NIC has a unique Media Access Control (MAC) address that identifies the physical device, also known as a physical address. The MAC address
identifies the location of a specific end device or router on a LAN. The three major types
of network communications are: unicast, broadcast, and multicast.
The switch builds and maintains a table (called the MAC address table) that matches
the destination MAC address with the port that is used to connect to a node. The switch
forwards frames by searching for a match between the destination MAC address in the
frame and an entry in the MAC address table. Depending on the result, the switch will
decide whether to filter or flood the frame. If the destination MAC address is in the MAC
address table, it will send it out the specified port. Otherwise, it will flood it out all ports
except the incoming port.
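The learn-then-filter-or-flood decision can be illustrated with a toy Python model of the MAC address table; this is purely conceptual and not how switch hardware is implemented.

# Toy model of switch forwarding: learn the source MAC per port,
# then filter (known destination) or flood (unknown destination).
mac_table = {}   # MAC address -> port

def handle_frame(in_port, src_mac, dst_mac, ports):
    mac_table[src_mac] = in_port                 # learn the source MAC
    if dst_mac in mac_table:
        return [mac_table[dst_mac]]              # filter: forward out one port
    return [p for p in ports if p != in_port]    # flood: all ports except incoming

ports = [1, 2, 3, 4]
print(handle_frame(1, "aa:aa:aa:aa:aa:aa", "bb:bb:bb:bb:bb:bb", ports))  # flood
print(handle_frame(2, "bb:bb:bb:bb:bb:bb", "aa:aa:aa:aa:aa:aa", ports))  # -> [1]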
A VLAN groups devices on one or more LANs that are configured to communicate as if
they were attached to the same wire, when in fact they are located on a number of
different LAN segments. VLANs define Layer 2 broadcast domains. VLANs are often
associated with IP networks or subnets. A trunk is a point-to-point link between two
network devices that carries more than one VLAN. A VLAN trunk extends VLANs
across an entire network. VLANs are organized into three ranges: reserved, normal,
and extended.
Internetwork Layer
IPv6 is designed to be the successor to IPv4. IPv6 has a larger 128-bit address space,
providing 340 undecillion possible addresses. IPv6 prefix aggregation, simplified
network renumbering, and IPv6 site multihoming capabilities provide an IPv6
addressing hierarchy that allows for more efficient routing. IPv6 addresses are
represented as a series of 16-bit hexadecimal fields (hextet) separated by colons (:) in
the format: x:x:x:x:x:x:x:x. The preferred format includes all the hexadecimal values.
There are two rules that can be used to reduce the representation of the IPv6 address:
1. Omit leading zeros in each hextet, and 2. Replace a single string of all-zero hextets
with a double colon (::).
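Python's ipaddress module applies both rules for you, which is a handy way to check the compressed form of an address; the address below is just an example from the 2001:db8::/32 documentation prefix.

# The ipaddress module compresses and expands IPv6 addresses automatically.
import ipaddress

address = ipaddress.ip_address("2001:0db8:0000:0000:0000:0000:0000:0001")
print(address.compressed)   # 2001:db8::1
print(address.exploded)     # 2001:0db8:0000:0000:0000:0000:0000:0001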
An IPv6 unicast address is an identifier for a single interface, on a single node. A global unicast address (GUA), or aggregatable global unicast address, is an IPv6 address similar to a public IPv4 address. The global routing prefix is the prefix, or network, portion of the
address that is assigned by the provider such as an ISP, to a customer or site. The
Subnet ID field is the area between the Global Routing Prefix and the Interface ID. The
IPv6 interface ID is equivalent to the host portion of an IPv4 address. An IPv6 link-local
address (LLA) enables a device to communicate with other IPv6-enabled devices on the
same link and only on that link (subnet). IPv6 multicast addresses are similar to IPv4
multicast addresses. Recall that a multicast address is used to send a single packet to
one or more destinations (multicast group). These are two common IPv6 assigned
multicast groups: ff02::1 All-nodes multicast group, and ff02::2 All-routers multicast
group.
A router is a networking device that functions at the internet layer of the TCP/IP model
or Layer 3, the network layer, of the OSI model. Routing involves forwarding packets between different networks. Routers use a routing table to route between networks. A
router generally has two main functions: Path determination, and Packet routing or
forwarding. A routing table may contain the following types of entries: directly connected
networks, static routes, default routes, and dynamic routes.
Network Devices
Switches have the following characteristics:
They operate at the network access layer of the TCP/IP model and the Layer 2 data link layer of the OSI model.
They filter or flood frames based on entries in the MAC address table.
They have a large number of high-speed and full-duplex ports.
The switch operates in either of the following switching modes: cut-through, and store-
and-forward. LAN switches have high port density, large frame buffers, and fast internal
switching.
Routers are needed to reach devices that are not on the same local LAN. Routers use
routing tables to route traffic between different networks. Routers are attached to
different networks (or subnets) through their interfaces and have the ability to route the
data traffic between them.
They operate at the internet layer of TCP/IP model and Layer 3 network layer of
the OSI model.
They route packets between networks based on entries in the routing table.
They have support for a large variety of network ports, including various LAN and
WAN media ports, which may be copper or fiber. The number of interfaces on routers is usually much smaller than on switches, but the variety of interfaces
supported is greater. IP addresses are configured on the interfaces.
Network diagrams display a visual and intuitive representation of the network: how all the devices are connected; in which buildings, floors, and closets they are located; which interface connects to which end device; and so on. There are generally two types of network
diagrams: Layer 2 physical connectivity diagrams, and Layer 3 logical connectivity
diagrams. Layer 2, or physical connectivity diagrams are network diagrams representing
the port connectivity between the devices in the network. It is basically a visual
representation of which network port on a network device connects to which network
port on another network device. Layer 3, or logical connectivity diagrams are network
diagrams that display the IP connectivity between devices on the network.
Networking Protocols
Telnet and SSH, or Secure SHell, are both used to connect to a remote computer and
log in to that system using credentials. Telnet is less prevalent today because SSH uses
encryption to protect data going over the network connection and data security is a top
priority. HTTP stands for Hyper Text Transfer Protocol, and HTTPS adds the "Secure"
keyword to the end of the acronym. This protocol is recognizable in web browsers as
the one to use to connect to web sites. NETCONF does have a standardized port value,
830. RESTCONF does not have a reserved port value, so you may see various
implementations of different values.
Dynamic Host Configuration Protocol (DHCP) is used to pass configuration information
to hosts on a TCP/IP network. DHCP allocates IP addresses in three ways: automatic,
dynamic, and manual. DHCP operations includes four messages between the client and
the server: server discovery, IP lease offer, IP lease request, and IP lease
acknowledgment.
The DNS protocol defines an automated service that matches resource names with the
required numeric network address. It includes the format for queries, responses, and
data. The DNS protocol communications use a single format called a DNS message.
The DNS server stores different types of resource records that are used to resolve
names. These records contain the name, address, and type of record.
There are two primary SNMP manager requests, get and set. A get request is used by
the NMS to query the device for data. A set request is used by the NMS to change
configuration variables in the agent device. Traps are unsolicited messages alerting the
SNMP manager to a condition or event on the network. SNMPv1 and SNMPv2c use
community strings that control access to the MIB. SNMP community strings (including
read-only and read-write) authenticate access to MIB objects. Think of the MIB as a
"map" of all the components of a device that are being managed by SNMP.
NTP is used to distribute and synchronize time among distributed time servers and
clients. An authoritative time source is usually a radio clock, or an atomic clock attached
to a time server. NTP servers can associate in several modes, including: client/server,
symmetric active/passive, and broadcast.
Network Address Translation (NAT) helps with the problem of IPv4 address depletion.
NAT works by mapping thousands of private internal addresses to a range of public
addresses. By mapping between external and internal IPv4 addresses, NAT allows an
organization with non-globally-routable addresses to connect to the internet by translating addresses into a globally-routable address space. NAT includes four types of addresses: inside local, inside global, outside local, and outside global addresses.
Types of NAT include: static NAT, dynamic NAT, and port address translation (PAT).
ping is a software utility used to test IP network reachability for hosts and devices
connected to a specific network. It is also available on virtually all operating systems
and is extremely useful for troubleshooting connectivity issues. The ping utility uses
Internet Control Message Protocol (ICMP) to send packets to the target host and then
waits for ICMP echo replies. Based on this exchange of ICMP packets, ping reports
errors, packet loss, roundtrip time, time to live (TTL) for received packets, and so on.
traceroute uses ICMP packets to determine the path to the destination. The Time to
Live (TTL) field in the IP packet header is used primarily to avoid infinite loops in the
network. For each hop or router that an IP packet goes through, the TTL field is
decremented by one. When the TTL field value reaches 0, the packet is discarded.
Usually, the TTL field is set to its maximum value, 255, on the host that is the source of
the traffic, as the host is trying to maximize the chances of that packet getting to its
destination. traceroute reverses this logic, and gradually increments the TTL value of
the packet it is sending, from 1 and keeps adding 1 to the TTL field on the next packet
and so on. Setting a TTL value of 1 for the first packet, means the packet will be
discarded on the first router. By default, most routers send back to the source of the
traffic an ICMP Time Exceeded packet informing it that the packet has reached a TTL
value of 0 and had to be discarded.
nslookup is another command-line utility used for querying DNS to obtain domain name
to IP address mapping. This tool is useful to determine if the DNS server configured on
a specific host is working as expected and actually resolving hostnames to IP
addresses. It could be that maybe a DNS server is not configured at all on the host, so
make sure you check /etc/resolv.conf on UNIX-like operating systems and that you have
at least a nameserver defined.
Even if you are a solo developer building an application for yourself, when deploying your
application you must account for a number of different factors, from creating the appropriate
environments, to properly defining the infrastructure, to basic security concepts. This simply
means that developers need to do more than deliver application code: they need to concern
themselves with how applications are deployed, secured, operated, monitored, scaled, and
maintained.
Meanwhile, the physical and virtual infrastructure and platforms on which applications are being
developed and deployed are quickly evolving. Part of this rapid evolution is aimed at making life
easier for developers and operators. For example, platform paradigms such as Containers as a
Service and Serverless Computing are designed to let developers focus on building core
application functionality, without having to worry about handling underlying platform
configuration, mechanics, scaling, and other operations.
But not all current development takes place on these platforms. Developers are confronted with
an expanding "stack" of platform options: bare metal, virtual machines, containers, and others,
are all hosted on infrastructures and frameworks of increasing flexibility and complexity.
This module discusses some of the places today's software "lives". It goes on to cover basic
techniques for deploying and testing applications, plus workflow techniques and tools for
delivering software (to development platforms, test environments, staging, and production)
quickly and efficiently. Finally, it covers some networking and security basics. All developers
should be familiar with these concepts.
Some chaos in the early stages of development is normal, but code should be well tested by the
time it gets to users. To make that happen, code needs to go through a series of steps to improve
its reliability. Code passes through a number of environments, and as it does, its quality and
reliability increases. These environments are self-contained, and intended to mimic the ultimate
environment in which the code will ‘live’.
Typically, large organizations use a four-tier structure: development, testing, staging, and
production.
Development environment
The development environment is where you do your coding. Usually, your development
environment bears little resemblance to the final environment. The development environment is
typically just enough for you to manage fundamental aspects of your infrastructure, such as
containers or cloud networking. You may use an Integrated Development Environment (IDE) or
other tool to make deployment easier.
This environment may also include “mock” resources that provide the form of the real resources,
but not the content. For example, you might have a database with a minimal number of test
records, or an application that mimics the output of a remote service. Each developer typically
has their own development environment.
Testing environment
When you believe your code is finished, you may move on to a second environment that has
been set aside for testing the code, though when working on small projects, the development and
testing environments are often combined. This testing environment should be structurally similar
to the final production environment, even if it is on a much smaller scale.
The testing environment often includes automated testing tools such as Jenkins, CircleCI, or
Travis CI, as well as integration with a version control system. It should be shared among the
entire team. It may also include code review tools such as Gerrit.
Staging environment
After the code has been tested, it moves to the staging environment. Staging should be as close
as possible to the actual production environment, so that the code can undergo final acceptance
testing in a realistic environment. Instead of maintaining a smaller-scale staging environment,
some organizations maintain two matching production environments, one of which hosts the
current release of an application, the other standing by to receive a new release. In this case,
when a new version is deployed, traffic is shifted (gradually or suddenly, as in "cut over") from
the current production environment to the other one. With the next release, the process is done in
reverse.
This is, of course, much more affordable in clouds, where an unused, virtualized environment
can be torn down and rebuilt automatically when needed.
Production environment
Finally, the code arrives at the production environment, where end users interact with it. At this
point it has been tested multiple times, and should be error free. The production environment
itself must be sized and constructed to handle expected traffic, including surges that might come
seasonally or with a particular event.
Handling those surges is something you can plan for when designing your infrastructure. Before
looking at infrastructure, however, you need to know about different models that you can use for
deploying your software.
In the early days of computers, there were no choices regarding how to deploy your software;
you simply installed it on the computer itself. Today this model is known as “bare metal,” but it
is only one of a variety of options available to you. These options include virtual machines,
containers, and newer options such as serverless computing.
Bare metal
The most familiar, and the most basic way to deploy software is by installing it directly on the
target computer, or the “bare metal.” In addition to this being the simplest method, bare metal
deployment has other advantages, such as the fact that software can access the operating system
and hardware directly. This is particularly useful for situations in which you need access to
specialized hardware, or for High Performance Computing (HPC) applications in which every bit
of speed counts.
One place where bare metal can be a disadvantage, however, is in isolating different
workloads from each other. In a bare metal environment, every application on the machine is using the same kernel, operating system, storage, etc. There are things you
can do to isolate some resources, but if this is an issue, other models are likely a better
choice. Additionally, bare metal is not very flexible in terms of resources; a machine with
64 GB of RAM is not going to get larger or smaller unless someone physically takes it
apart to add or remove hardware.
Virtual machines
One way to solve the flexibility and isolation problems is through the use of Virtual
Machines, or VMs. A virtual machine is like a computer within your computer; it has its
own computing power, network interfaces, and storage.
A hypervisor is software that creates and manages VMs. Hypervisors are available as
open-source (OpenStack, Linux KVM, XEN), and also from commercial vendors such
as Oracle (VirtualBox), VMware (Horizon, vSphere, Fusion), Microsoft (Hyper-V), and
others. Hypervisors are generally classified as either 'Type 1', which run directly on the
physical hardware ('bare metal'), and 'Type 2', which run, usually as an application,
under an existing operating system.
The use of VMs overcomes a number of restrictions. For example, if you had three
workloads you wanted to isolate from each other, you could create three separate
virtual machines on one bare metal server.
VMs run on top of a hypervisor, such as KVM, QEMU, or VMware, which provides them
with simulated hardware, or with controlled access to underlying physical hardware. The
hypervisor sits on top of the operating system and manages the VMs.
VMs can be convenient for several reasons, not the least of which is that a VM image can be saved for future use, or shared so that others can instantiate and use it. This enables you to distribute a VM, or at least the means to use it. Applications that run VMs, such as
VirtualBox and VMware, can also take snapshots, or backups, of a VM so that you can
return it to its previous state, if necessary.
Because they are so much like physical machines, VMs can host a wide range of
software, even legacy software. Newer application environments, like containers, may
not be "real machine-like" enough to host applications that are not written with their
limitations in mind.
Container-based infrastructure
Moving up the abstraction ladder from VMs, you will find containers. Software to create
and manage or orchestrate containers is available from Docker, AWS (Elastic Container Service), Microsoft (Azure Container Service), and others.
Containers were designed to provide many of the same benefits as VMs, such as
workload isolation and the ability to run multiple workloads on a single machine, but
they are architected a bit differently.
For one thing, containers are designed to start up quickly, and as such, they do not
include as much underlying software infrastructure. A VM contains an entire guest
operating system, but a container shares the operating system of the host machine and
uses container-specific binaries and libraries.
Where VMs emulate an entire computer, a container typically represents just an
application or a group of applications. The value of using containers is that all of the
libraries and binaries you need to run the application are included, so the user does not
have to take that additional installation step.
An important distinction between a Docker container and a VM is that each VM has its
own complete operating system. Containers only contain part of the operating system.
For example, you may have an Ubuntu Linux host computer running a CentOS Linux
VM, an Ubuntu Linux VM, and a Windows 10 VM. Each of these VMs has its own
complete OS. This can be very resource intensive for the host computer.
With Docker, containers share the same kernel of their host computer. For example, on
the Ubuntu Linux host computer you may have an Ubuntu Linux container and a CentOS Linux container. Both of these containers share the same Linux kernel. However,
you could not have a container running Windows 10 on this same Ubuntu Linux host
computer, because Windows uses a different kernel. Sharing the same kernel requires
far fewer resources than using separate VMs, each with its own kernel.
Containers also solve the problem that arises when multiple applications need different
versions of the same library in order to run. Because each application is in its own
container, it is isolated from any conflicting libraries and binaries.
Containers are also useful because of the ecosystem of tools around them. Tools such
as Kubernetes make fairly sophisticated orchestration of containers possible, and the
fact that containers are often designed to be stateless and to start up quickly means that
you can save resources by not running them unless you need to.
Containers are also the foundation of cloud native computing, in which applications are
generally stateless. This statelessness makes it possible for any instance of a particular
container to handle a request. When you add this to another aspect of cloud computing
that emphasizes services, serverless computing becomes possible.
Serverless computing
Let’s start with this important point: to say that applications are “serverless” is great for
marketing, but it is not technically true. Of course your application is running on a
server. It is just running on a server that you do not control, and do not have to think
about. Hence the name “serverless”.
Serverless computing takes advantage of a modern trend towards applications that are
built around services. That is, the application makes a call to another program or
workload to accomplish a particular task, to create an environment where applications
are made available on an “as needed” basis.
Step 2. You deploy your application as a container, so that it can run easily in any
appropriate environment.
Step 3. You deploy that container to a serverless computing provider, such as AWS
Lambda, Google Cloud functions, or even an internal Function as a Service
infrastructure. This deployment includes a specification of how long the function should
remain inactive before it is spun down.
Step 5. The provider spins up an instance of the container, performs the needed task,
and returns the result.
What is important to notice here is that if the serverless app is not needed, it is not running, and
you are not getting charged for it. On the other hand, if you are typically calling it multiple times,
the provider might spin up multiple instances to handle the traffic. You do not have to worry
about any of that.
Because the capacity goes up and down with need, it is generally referred to as “elastic” rather
than “scalable.”
There is a huge advantage in the fact that you are only paying for the resources that are actually in use, as opposed to a virtual machine that may be running all the time, even when its capacity is not needed. However, the serverless computing model means that you have zero control over the host machine, so it may not be appropriate from a security perspective.
In the early days of computers, infrastructure was pretty straightforward. The software you so
carefully wrote ran on a single computer. Eventually, you had a network that could link multiple
computers together. From there, things just got more and more complicated. Let’s look at the
various options for designing your infrastructure, such as different types of clouds, and what
each does and does not do well.
6.1.5 On-Premises
On-premises
Technically speaking, “on-premises” means any system that is literally within the confines of
your building. In this case we are talking about traditional data centers that house individual
machines which are provisioned for applications, rather than clouds, external or otherwise.
These traditional infrastructures are data centers with servers dedicated to individual
applications, or to VMs, which essentially enable a single computer to act like multiple
computers.
Operating a traditional on-premises data center requires servers, storage devices, and network
equipment to be ordered, received, assembled in racks ("racked and stacked"), moved to a location, and cabled for power and data. This equipment must be provided with environmental
services such as power protection, cooling, and fire prevention. Servers then need to be logically
configured for their roles, operating systems and software must be installed, and all of it needs to
be maintained and monitored.
All of this infrastructure work takes time and effort. Requests for resources need to go through
the operations team, which can lead to delays of days, weeks, or even months while new
hardware is obtained, prepared, and provisioned.
In addition, scaling an application typically means moving it to a larger server, which makes
scaling up or down a major event. That means an application is almost always either wasting
money with excess capacity that is not being used, or underperforming because it does not have
enough resources.
The downside of on-premises infrastructure can be easily solved by cloud computing. A cloud is
a system that provides self-service provisioning for compute resources, networking, and storage.
A cloud consists of a control plane, which enables you to perform requests. You can
create a new VM, attach a storage volume, even create a new network and compute
resources.
What distinguishes a private cloud from other types of clouds is that all resources within
the cloud are under the control of your organization. In most cases, a private cloud will
be located in your data center, but that is not technically a requirement to be called
“private.” The important part is that all resources that run on the hardware belong to the
owner organization.
The advantage of a private cloud is that you have complete control over where it is
located, which is important in situations where there are specific compliance
regulations, and that you do not typically have to worry about other workloads on the
system.
On the downside, you do have to have an operations team that can manage the cloud
and keep it running.
A public cloud is essentially the same as a private cloud, but it is managed by a public
cloud provider. Public clouds can also run systems such as OpenStack or Kubernetes,
or they can be specific proprietary clouds such as Amazon Web Services or Azure.
Public cloud customers may share resources with other organizations: your VM may run
on the same host as a VM belonging to someone else. Alternatively, public cloud
providers may provide customers with dedicated infrastructure. Most provide several
geographically-separate cloud 'regions' in which workloads can be hosted. This lets
workloads be placed close to users (minimizing latency), supporting geographic
redundancy (the East Coast and West Coast regions are unlikely to be offline at the
same time), and enabling jurisdictional control over where data is stored.
Public clouds can be useful because you do not have to pay for hardware you are not
going to use, so you can scale up virtually indefinitely as long as the load requires it,
then scale down when traffic is slow. Because you only pay for the resources you are
actually using, this solution can be most economical because your application never
runs out of resources, and you do not pay for resources you are not using. You also do
not have to worry about maintaining or operating the hardware; the public cloud provider
handles that. However, in practice, when your cloud gets to be a certain size, the cost
advantages tend to disappear, and you are better off with a private cloud.
There is one disadvantage of public cloud. Because you are sharing the cloud with
other users, you may have to contend with situations in which other workloads take up
more than their share of resources.
This problem is worse when the cloud provider is overcommitting. The provider
assumes not all resources will be in use at the same time, and allocates more "virtual"
resources than "physical" resources. For example, it is not unusual to see an
overcommit ratio of 16:1 for CPUs, which means that for every physical CPU, there may
be 16 virtual CPUs allocated to VMs. Memory can be overcommitted as well. With a
ratio of 2:1 for memory, a server with 128GB of RAM might be hosting 256GB of
workloads. With a public cloud you have no control over that (save for paying more for
dedicated instances or other services that help guarantee service levels).
A hybrid cloud combines two or more types of cloud, most commonly private and public. For example, you might have an application that runs on your private cloud, but “bursts” to a public cloud if it runs out of resources. In this way, you can save money by not overbuying for your private cloud, but still have the resources when you need them.
You might also go in the other direction, and have an application that primarily runs on the
public cloud, but uses resources in the private cloud for security or control. For example, you
might have a web application that serves most of its content from the public cloud, but stores
user information in a database within the private cloud.
Hybrid cloud is often confused with multi-cloud, in which an organization uses multiple clouds
for different purposes. What distinguishes hybrid cloud is the use of more than one cloud within
a single application. As such, a hybrid cloud application has to be much more aware of its
environment than an application that lives in a single cloud.
A non-hybrid cloud application and its cloud are like a fish and the ocean; the fish does not need
to be aware of the ocean because the ocean is just there, all around the fish. When you start
adding hybrid cloud capabilities to an application, that application has to be aware of what
resources are available and from where.
It is best if the application itself does not have to handle these things directly. It is a better
practice to have some sort of interface that the application can call when it needs more resources,
and that interface makes the decision regarding where to run those resources and passes them
back to the application. This way the resource mapping logic can be controlled independently of
the application itself, and you can adjust it for different situations. For example, you may keep
all resources internal during the testing and debugging phase, then slowly ramp up public cloud
use.
One way to accomplish this is through a tool such as Cisco Hybrid Cloud Platform for Google
Cloud, which manages networking, security, management, data center, open-source and API
software and tools. This provides you with a single, consistent, secure environment for your
application, enabling it to work across both on-premises data centers and the Google Cloud.
In addition, container orchestrators have become very popular with companies employing
hybrid-cloud deployments. The orchestrators provide a cloud-vendor agnostic layer which the
application can consume to request necessary resources, reducing the environmental awareness
needed in the application itself.
The newest type of cloud is edge cloud. Edge cloud is gaining popularity because of the growth
of the Internet of Things (IoT). These connected devices, such as connected cameras,
autonomous vehicles, and even smartphones, increasingly benefit from computing power that
exists closer to them on the network.
The two primary reasons that closer computing power helps IoT devices are speed and
bandwidth. For example, if you are playing a first-person shooter game, even half a second of
latency between when you pull the trigger and when the shot registers is unacceptable. Another
instance where latency may be literally fatal is with self-driving vehicles. At 55 miles per hour,
a car travels more than 40 feet in just 500 ms. If a pedestrian steps off the curb, the car cannot
wait for instructions on what to do.
The second issue is bandwidth. A self-driving car typically avoids the latency problem by making its
own decisions, but that creates a new problem: these vehicles use machine learning, which
requires enormous amounts of data to be passed to and from the vehicle. It is estimated that such a
vehicle generates more than 4 TB of data every hour, and most networks cannot handle that kind
of traffic (especially with the anticipated growth of these vehicles in the market).
To solve both of these problems, an edge cloud moves computing closer to where it is needed.
Instead of transactions making their way from an end user in Cleveland, to the main cloud in
Oregon, there may be an intermediary cloud, an edge cloud, in Cleveland. The edge cloud
processes the data or transaction. It then either sends a response back to the client, or does
preliminary analysis of the data and sends the results on to a regional cloud that may be farther
away.
Edge cloud computing comprises one or more central clouds that act as a hub for the edge clouds
themselves. Hardware for the edge clouds is located as close as possible to the user. For
example, you might have edge hardware on the actual cell tower handling the signals to and from
a user’s mobile phone.
Another area where you may see edge computing is in retail, where you have multiple stores.
Each store might have its own internal cloud. This is an edge cloud which feeds into the regional
cloud, which in turn might feed into a central cloud. This architecture gives local offices the
benefits of having their own cloud (such as consistent deployment of APIs to ensure each store
can be managed, updated, and monitored efficiently).
There is nothing "special" about edge clouds. They are just typical clouds. What makes them
"edge" is where they are, and that they are connected to each other. There is one more thing
about edge clouds, however. Because they often run on much smaller hardware than "typical"
clouds, they may be more resource-constrained. In addition, edge cloud hardware must be
reliable, efficient in terms of power usage, and preferably remotely manageable, because it may
be located in a remote area, such as a cell tower in the middle of the desert, where servicing the
hardware is difficult.
A Docker image is a set of read-only files which has no state. A Docker Image contains
source code, libraries, and other dependencies needed to run an application. A Docker
container is the run-time instance of a Docker image. You can have many running
containers of the same Docker image. A Docker image is like a recipe for a cake, and
you can make as many cakes (Docker containers) as you wish.
Images can in turn be stored in registries such as Docker Hub. Overall, the system consists of
images, registries that store them, and the containers that run from them.
So a simplified version of the workflow of creating a container looks like this:
Step 1. Either create a new image using docker build or pull a copy of an existing image from
a registry using docker pull. (Depending on the circumstances, this step is optional. See step 3.)
Step 2. Run a container based on the image using docker run or docker container create.
Step 3. The Docker daemon checks to see if it has a local copy of the image. If it does not, it
pulls the image from the registry.
Step 4. The Docker daemon creates a container based on the image and, if docker run was
used, starts it and executes the requested command.
As you can see, if you are going to create a container-based deployment of the sample
application, you are going to have to create an image. To do that, you need a Dockerfile.
6.2.2 What is a Dockerfile?
If you have used a compiled language such as C, you know that it requires you to compile your
code, and you may be familiar with the concept of a "makefile." This is the file that the make
utility uses to compile and build all the pieces of the application.
That is what a Dockerfile does for Docker. It is a simple text file, named Dockerfile. It
defines the steps that the docker build command needs to take to create an image that can then
be used to create the target container.
You can create a very simple Dockerfile that creates an Ubuntu container. Use the cat
command to create a file named Dockerfile in your current directory, add the single line
FROM ubuntu to it, and then enter Ctrl+D to save and exit:
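FROM ubuntu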
That is all it takes, just that one line. Now you can use the docker build command to build the
image, as shown in the following example. The -t option is used to name (tag) the image, and the
period (.) at the end of the command tells Docker to use the current directory as the build
context. Use docker build --help to see all the available options.
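For example (the image name myubuntu here is just an illustrative choice):

$ docker build -t myubuntu .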
Enter the command docker images to see your image in the list of images on the DEVASC
VM:
Now that you have the image, use the docker run command to run it. You are now in a bash
shell INSIDE the Docker container you created. Change to the home directory and enter ls to see
that it is empty and ready for use. Enter exit to leave the Docker container and return to your
DEVASC VM main operating system.
Of course, if all you could do with a Dockerfile was to start a clean operating system, that
would be useful, but what you need is a way to start with a template and build from there.
Note: The steps shown in the rest of this topic are for instructional purposes only. Additional
details that you would need to complete these commands in your DEVASC VM are not
provided. However, you will complete similar steps in the lab Build a Sample Web App in a
Docker Container later in the topic.
FROM python
WORKDIR /home/ubuntu
COPY ./sample-app.py /home/ubuntu/.
RUN pip install flask
CMD python /home/ubuntu/sample-app.py
EXPOSE 8080
The FROM command specifies the base image. Here it pulls the default Python image from
Docker Hub, which is based on Debian Linux and includes the latest version of Python.
The WORKDIR command tells Docker to use /home/ubuntu as the working directory.
The COPY command tells Docker to copy the sample-app.py file from the Dockerfile's
current directory into /home/ubuntu.
The RUN command allows you to run commands directly in the container during the build. In this
example, it installs Flask, the framework that serves your app as a web application.
The CMD command will start the server when you run the actual container. Here, you use
the python command to run the sample-app.py inside the container.
The EXPOSE command tells Docker that you want to expose port 8080. Note that this is
the port on which Flask is listening. If you have configured your web server to listen
somewhere else (such as https requests on port 443) this is the place to note it.
Use the docker build command to build the image. In the following output, the image was
previously built. Therefore, Docker takes advantage of what is stored in cache to speed up the
process.
As you can see, Docker goes through each step in the Dockerfile, starting with the base image,
Python. If this image does not exist on your system, Docker pulls it from the registry. The default
registry is Docker Hub. However, in a secure environment, you might set up your own registry
of trusted container images. Notice that the image is actually a number of different images
layered on top of each other, just as you are layering your own commands on top of the base
image.
Notice that between steps such as executing a command, Docker actually creates a new container
and builds an intermediate image, a new layer, by saving that container. In fact, you can do that
yourself by creating a container, making the changes you want, then saving that container as a
new image.
In the previous example, only a small number of the available Dockerfile commands were used.
The complete list is available in the Docker documentation in the Dockerfile reference. Currently
a list of available commands looks like this:
FROM
MAINTAINER
RUN
CMD
EXPOSE
ENV
COPY
ENTRYPOINT
VOLUME
USER
WORKDIR
ARG
ONBUILD
STOPSIGNAL
LABEL
Enter the command docker images to view a list of images. Notice that there are actually two
images that are now cached on the machine. The first is the Python image, which you used as
your base. Docker has stored it so that if you were to rebuild your image, you will not have to
download it again.
$ docker images
REPOSITORY         TAG      IMAGE ID       CREATED             SIZE
sample-app-image   latest   7b1fd666ae4c   About an hour ago   410MB
python             latest   daddc1037fdf   2 days ago          410MB
$
6.2.4 Start a Docker Container Locally
Now that the image is created, use it to create a new container and actually do some work by
entering the docker run command, as shown in the following output. In this case, several
parameters are specified. The -d parameter is short for --detach and says you want to run it in
the background. The -P parameter tells Docker to publish it on the ports that you exposed (in this
case, 8080).
$ docker run -d -P sample-app-image
1688a2c34c9e7725c38e3d9262117f1124f54685841e97c3c5225af88e30bfc5
$
$ docker ps
CONTAINER ID   IMAGE              COMMAND                  CREATED         STATUS         PORTS                     NAMES
90edd03a9511   sample-app-image   "/bin/sh -c 'python …"   5 seconds ago   Up 3 seconds   0.0.0.0:32774->8080/tcp   jovial_sammet
$
There are a few things to note here. Working backwards, notice that Docker has assigned the
container a name, jovial_sammet. You could also have named it yourself with the
--name option. For example:
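A command along these lines names the container explicitly (my-sample-app is just an example name):

$ docker run -d -P --name my-sample-app sample-app-image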
Notice also that, even though the container is listening on port 8080, that is just an internal port.
Docker has specified an external port, in this case 32774, that will forward to that internal port.
This lets you run multiple containers that listen on the same port without having conflicts. If you
want to pull up your sample app website, you can use the public IP address for the host server
and that port. Alternatively, if you were to call it from the host machine itself, you would still use
that externalized port, as shown with the following curl command.
$ curl localhost:32774
You are calling me from 172.17.0.1
$
Docker also lets you specify a particular port to forward, so that you can create a more
predictable system:
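For example, the following maps host port 8080 directly to the container's port 8080 and gives the container a fixed name (this creates the pythontest container that appears again later in this topic):

$ docker run -d -p 8080:8080 --name pythontest sample-app-image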
To stop and remove a running container, you can call it by its name:
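$ docker stop pythontest
$ docker rm pythontest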
Now if you look at the running processes again, you can see that it is gone.
$ docker ps
CONTAINER ID   IMAGE              COMMAND                  CREATED          STATUS          PORTS                     NAMES
90edd03a9511   sample-app-image   "/bin/sh -c 'python …"   25 minutes ago   Up 25 minutes   0.0.0.0:32774->8080/tcp   jovial_sammet
$
6.2.5 Save a Docker Image to a Registry
Now that you know how to create and use your image, it is time to make it available for other
people to use. One way to do this is by storing it in an image registry.
By default, Docker uses the Docker Hub registry, though you can create and use your own
registry. You will need to start by logging in to the registry:
$ docker login
Login with your Docker ID to push and pull images from Docker
Hub. If you don't have a Docker ID, head over to
https://hub.docker.com to create one.
Username: devnetstudent # This would be your username
Password: # This would be your password
WARNING! Your password will be stored unencrypted in
/home/ubuntu/.docker/config.json.
Configure a credential helper to remove this warning. See
https://docs.docker.com/engine/reference/commandline/login/
#credentials-store
Login Succeeded
$
Next, you commit a running container instance of your image. For example,
the pythontest container is running in this example. Commit the container with the docker
commit command.
$ docker ps
CONTAINER ID   IMAGE              COMMAND                  CREATED         STATUS         PORTS                    NAMES
54c44606344c   sample-app-image   "/bin/sh -c 'python …"   4 seconds ago   Up 2 seconds   0.0.0.0:8080->8080/tcp   pythontest
$ docker commit pythontest sample-app
sha256:bddc326383032598a1c1c2916ce5a944849d90e4db0a34b139eb315af266e68b
$
Next, use the docker tag command to give the image you committed a tag. The tag takes the
following form:
<repository>/<imagename>:<tag>
The first part, the repository, is usually the username of the account storing the image. In this
example, it is devnetstudent. Next is the image name, and then finally the optional tag.
(Remember, if you do not specify it, it will come up as latest.)
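In this example, the command would look something like this:

$ docker tag sample-app devnetstudent/sample-app:v1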
From here you can see that the new image is stored locally:
$ docker images
REPOSITORY                 TAG      IMAGE ID       CREATED              SIZE
sample-app                 latest   bddc32638303   About a minute ago   410MB
devnetstudent/sample-app   v1       bddc32638303   About a minute ago   410MB
$
6.2.6 Create a Development Environment
As you may recall, there are four different environments in a typical workflow: development, testing, staging, and production.
The development environment is meant to be convenient to the developer; it only needs to match
the production environment where it is relevant. For example, if the developer is working on
functionality that has nothing to do with the database, the development environment does not
need a replica of the production database, or any database at all.
A typical development environment can consist of any number of tools, from Integrated
Development Environments (IDEs) such as Eclipse to databases to object storage. The important
part here is that it has to be comfortable for the developer.
In this case, you are going to build a simple Python app with tools available from the basic
command line, Bash. You can also use Bash to perform testing and deployment tasks, so start
with a Bash refresher.
Have you ever done a lot of work on an application and when you tried to merge it back into the
main application, there were many merge conflicts, any one of which carried the potential to
introduce major bugs? Continuous Integration is intended to eliminate this problem.
The idea behind Continuous Integration is that you, and all other developers on the project,
continually merge your changes with the main branch of the existing application. This means
that any given change set is small and the potential for problems is low. If everyone is using the
main branch, anyone who checks out code is going to have the latest version of what everyone
else is developing.
As part of this process, developers are expected to perform extensive, and usually automated,
testing on their code before merging back into the main branch. Doing this, the idea is that most
issues are caught before they become a more serious problem.
The Continuous Integration process provides a number of additional benefits, as every commit
provides an opportunity for the system to perform additional tasks. For example, the pipeline
might be set up to perform these tasks:
Code compilation
Unit test execution
Static code analysis
Integration testing
Packaging and versioning
Publishing the version package to Docker Hub or other package repositories
Note that there are some situations that involve large and complicated changes, such as the
addition of new features, in which changes must be grouped together. In this case, every commit
may trigger only part of the CI pipeline, with the packaging and versioning steps running only
when the entire feature is merged to the master.
In some cases, adjusting to this way of working requires a change in thinking on the part of the
organization, or on the part of individual developers who may be used to working in their own
branch, or on feature branches. This change is necessary, however, if you are going to achieve
the next step: Continuous Delivery, which is not quite the same as Continuous Deployment.
Continuous Delivery
Continuous Delivery is the process of developing in sprints that are short enough so that the code
is always in a deployable state. With Continuous Integration, small change sets are continuously
integrated into the main code branch. Continuous Delivery means that those changes are
engineered to be self-contained to the point where at any given time, you could deploy a working
application.
Step 1. Start with the version artifact that was created as part of the Continuous Integration
process.
Step 2. Deploy that artifact to a testing or staging environment.
Step 3. Run integration tests, security tests, performance tests, scale tests, or other tests identified
by the team or organization. These are known as gating tests because they determine whether this
version of the software can be promoted further in the deployment process.
Step 4. If all gating tests pass, tag this build as suitable for production.
Note that Continuous Delivery does not mean you deploy constantly; that process is called
Continuous Deployment. Continuous Delivery ensures that you always have a version that
you can deploy.
This process tells us two things:
You must think about testing in advance. In the module on Software Development we
discussed "Test Driven Development,” and the general idea is that you write automated
test routines that can be run by the CI/CD infrastructure.
If something breaks, everything stops. The idea behind this concept is that if a bug is
discovered, all other development stops until it has been fixed, returning the system to a
deployable state. This might be accomplished through finding and fixing the bug, or it
might be accomplished by rolling back changes until the error disappears, but the
important part is that the system must stay deployable. In practice, most organizations do
not actually follow this procedure, but it is the primary idea behind CI/CD.
Continuous Deployment
Continuous Deployment is the ultimate expression of CI/CD. When changes are made, tested,
integrated with the main branch, and tested again, they are deployed to production using
automation. This means that code is being deployed to production constantly, which means your
users are going to be your final testers. In other words, Continuous Deployment is a special type
of Continuous Delivery, in which every build that is marked as ready for production gets
deployed.
Some organizations favor this type of deployment because it means that users always have the
most up to date code. Most organizations take a more cautious approach that requires a human to
push the code to production.
Companies are willing to make such a seemingly “drastic” change in their processes because of
the benefits that come with using CI/CD for development. These benefits include:
Integration with agile methodologies - Agile development is built around the idea of
short sprints, after which the developer team delivers a functional application with some
subset of the required features. CI/CD works within that same short sprint framework.
Every commit is a version of the “deliver a working version of the software” concept.
Shorter Mean Time To Resolution (MTTR) - Because change sets are small, it
becomes much easier to isolate faults when they do occur, and to either fix them or roll
them back and resolve any issues.
Automated deployment - With automated testing and predictable deployment comes the
ability to do automated deployments. This means it is possible to use deployment
strategies such as canary release pipeline deployment, in which one set of users gets the
new feature set and the rest gets the old. This process enables you to get live testing of
the new feature to ensure it is functioning as expected before rolling it out to the entire
user base.
Less disruptive feature releases - With development proceeding in small chunks that
always result in a deployable artifact, it is possible to present users with incremental
changes rather than large-scale changes that can be disorienting to users.
Improved quality - All of these benefits add up to higher quality software because it has
been thoroughly tested before wide scale adoption. And because error resolution is easier,
it is more likely to be handled in a timely manner rather than accruing technical debt.
Improved time to market - Because features can be rolled out individually, they can be
offered to users much more quickly than if they had to be deployed all at the same time.
In this part we show a deployment pipeline, which is normally created with a build tool such as
Jenkins. These pipelines can handle tasks such as gathering and compiling source code, testing,
and compiling artifacts such as tar files or other packages. All these examples show screenshots
from an existing Jenkins server.
The fundamental unit of Jenkins is the project, also known as the job. You can create jobs that do
all sorts of things, from retrieving code from a source code management repo such as GitHub, to
building an application using a script or build tool, to packaging it up and running it on a server.
Here is a simple job that retrieves a version of the sample application from GitHub and runs the
build script. Then you create a second job that tests the build to ensure that it is working properly.
First, create a New Item in the Jenkins interface by clicking the "create new jobs" link on the
welcome page:
Enter a name, choose Freestyle project (so that you have the most flexibility) and click OK.
NETWORKS FOR APPLICATION DEVELOPMENT
AND SECURITY
6.4.1 Introduction
These days, you must take networking into account for all but the simplest of use cases.
This is especially true when it comes to cloud and container deployments. Here are
some of the network services you need to consider when it comes to cloud deployment:
Firewalls
Load balancers
DNS
Reverse proxies
6.4.2 Firewall
Firewalls are a computer’s most basic defense against unauthorized access by
individuals or applications. They can take any number of forms, from a dedicated
hardware device to a setting within an individual computer’s operating system.
At its most basic level, a firewall accepts or rejects packets based on the IP
addresses and ports to which they're addressed. For example, consider a web
server. This server has on it the actual web server software, as well as the
application that represents the site and the database that contains the content
the site displays. Without a firewall, the server could be accessed in multiple
ways:
A web browser can access the web application with an HTTP request to port 80
or an HTTPS request to port 443
A database client can access the database with a TCP request to port 5000
An SSH client can log into the server itself with a TCP request to port 22
But is that really what you want? You definitely want to access the web application,
though perhaps you only want HTTPS requests. You definitely do NOT want anyone to
access the database directly; in this case, only the web application really needs that
access. You might want to be able to log in to the server using an SSH client, rather
than having to have physical access to the machine.
To accomplish this, set up a firewall with specific "rules", which are layered on top of
each other. So, for example, your rules might look something like this (the exact rule syntax
depends on your firewall):
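Allow TCP port 443 (HTTPS) from any address
Deny TCP port 80 (HTTP) from any address
Deny TCP port 5000 (database) from any address
Allow TCP port 22 (SSH) from any address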
In some cases, you do want to enable access, but not from just anybody. For example,
you might set up your systems so that logins to sensitive systems can only come from a
single machine. This is called a "jump box"; everyone must log into that server first,
and then log into the target machine from there. A jump box provides access while still
adding a layer of security.
For example, if your jump box had an internal IP address of 172.0.30.42, your firewall rules
might look like this:
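Allow TCP port 22 (SSH) from 172.0.30.42
Deny TCP port 22 (SSH) from all other addresses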
Firewalls should keep any outside access to the untested application from occurring.
Firewalls need to be configured so that the application can be appropriately tested. For
example, if the application needs to access a development version of a database, the
firewall rules will need to allow that.
The environment should be as close a replica of production as possible in order to catch
any firewall-related configuration issues quickly.
Note that firewalls do not just keep traffic from coming in; they can also be configured to keep
traffic from getting out. For example, schools often have firewalls set up that keep students from
accessing all but a small handful of educational sites using the school network.
A load balancer does exactly what it says; it takes requests and “balances” them by spreading
them out among multiple servers. For example, if you have 10 servers hosting your web
application, requests will come first to the load balancer, which will then parcel them out among
those 10 hosts.
Load balancers can decide which server should get a particular
request in a few different ways, as sketched in the code example after this list:
Round robin - With round robin load balancing, the server simply sends each request to the
“next” server on the list.
Least connections - Often it makes sense to send requests to the server that is the least
"busy", that is, the one with the fewest active connections. In the figure, Server 3 receives the first
request because it is currently not handling any transactions. Server 3 receives the second request
because it has the fewest active transactions. Server 1 and Server 3 both now have two active
transactions, so the load balancer falls back to the round robin method: Server 1 receives the third
request, Server 3 the fourth request, and Server 1 the fifth request.
IP Hash - With this algorithm, the load balancer makes a decision based on a hash (an encoded
value based on the IP address of the request). You can think of this as similar to when you
attend an event and lines are formed for different stations based on the first letter of your last
name. This is also a simple way to maintain consistent sessions.
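A minimal Python sketch of these selection methods, assuming a simple list of server names (the connection counts and client IP address are made-up values):

import hashlib
import itertools

servers = ["server1", "server2", "server3"]

# Round robin: hand each request to the next server in the list, wrapping around.
round_robin = itertools.cycle(servers)
first, second, third = next(round_robin), next(round_robin), next(round_robin)

# Least connections: pick the server with the fewest active connections.
active_connections = {"server1": 2, "server2": 4, "server3": 0}
least_loaded = min(active_connections, key=active_connections.get)  # "server3"

# IP hash: hash the client address so the same client always lands on the same server.
client_ip = "203.0.113.25"
ip_hash_index = int(hashlib.md5(client_ip.encode()).hexdigest(), 16) % len(servers)
chosen = servers[ip_hash_index]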
Other, more complicated algorithms can be used for deployment purposes. Some of these
examples include:
1. Blue-green deployment - Recall that this kind of deployment applies changes to a new
production environment (blue) rather than making the changes on the existing production
environment (green). A load balancer sends traffic to the blue environment when it is
ready, and if issues arise, the load balancer can send traffic back to the green environment
and changes can be rolled back.
2. Canary deployment - This deployment starts by diverting a small fraction of your traffic
to the blue environment. A load balancer can then increase the amount of traffic diverted
to the blue environment until issues are detected and traffic goes back to the old
environment, or all servers and users are on the new environment, and the old one is
retired or used for the next push.
6.4.4 DNS
DNS, or the Domain Name System, is how servers on the internet translate human-readable
names (such as developer.cisco.com or www.example.com) into machine-routable IP addresses
such as 74.125.157.99 (for Google) or 208.80.152.201 (for Wikipedia). These IP addresses are
necessary to actually navigate the internet.
In software deployment, this system is beneficial because you can change the meaning of these
addresses. In this example, the application is coded to look for the database at
database.example.com:5000, which lives at the IP address of 203.0.113.25.
In another example, you might create a development version of the application, and you
would want it to hit a development version of the database, which lives at 172.24.18.36.
You can set the development machine to use a DNS server that lists
database.example.com as 172.24.18.36. You can test the application against the test
database without actually making any changes to the application.
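For instance, an application can simply resolve the name at run time and let whichever DNS server the machine is configured to use decide what address comes back (database.example.com is just an illustrative name):

import socket

# On a development machine pointed at the development DNS server, this returns the
# test database's address; in production it returns the production address.
db_address = socket.gethostbyname("database.example.com")
print(db_address)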
Another way to use DNS as part of software deployment is to emulate some of the functions
that might be performed by a load balancer. Do this by changing the IP address of the target
server when you are ready to go “live”. (This is not necessarily a good option because DNS
changes can take a day or more to propagate through the internet at large.)
A reverse proxy is similar to a regular proxy; however, while a regular (forward) proxy
makes requests from multiple computers look like they all come from the same source, a
reverse proxy makes responses from multiple servers look like they all come from the same
server.
A reverse proxy enables requests to a single IP to be parceled out to different destinations inside the network.
All requests to the network come to the proxy, where they are evaluated and sent to the
appropriate internal server for processing. Like a forward proxy, a reverse proxy can evaluate
traffic and act accordingly. In this way, it is similar to, and can be used as, a firewall or a load
balancer.
Because it is so much like these functions, a reverse proxy can also be used for software
deployment in similar ways.
SECURING APPLICATIONS
It is no secret that security is a major issue in today’s world. That applies to both data
and applications. If one is secure and the other is not, both are vulnerable.
In this part, you will look at some of the issues involved in securing both your data and
your application, starting with data.
Data is not just the heart of your application; it is often called the new priceless resource,
and it has to be protected, for both practical and legal reasons. That applies whether
data is being stored (also known as data at rest) or transferred from one server to
another (also known as data in flight, or in motion).
When it comes to protecting data at rest, there are a few things you need to take into
consideration.
Encrypting data
You have probably seen plenty of news stories about data breaches. These are typically
a matter of individuals accessing data that is stored but not protected. In this context,
this means that data is stored in such a way that not only can an individual gain access,
but that when they do, the data is readily visible and usable.
Ideally, unauthorized persons or applications will never gain access to your systems,
but obviously you cannot guarantee that. So when a person with bad intentions (who
could just as easily be a disgruntled employee with legitimate access) gets into
your database, you do not want them to find your data sitting there in readable plaintext.
There are two methods for encrypting data: one-way encryption, and two-way
encryption.
Two-way encryption is literally what it sounds like; you encrypt the data using a key, and
then you can use that key (or a variation on it) to decrypt the data to get it back in
plaintext. You would use this for information you would need to access in its original
form, such as medical records or social security numbers.
One-way encryption is simpler, in that you can easily create an encrypted value without
necessarily using a specific key, but you cannot decrypt it. You would use it for
information you do not need to retrieve, only to compare, such as passwords. For
example, suppose you have a user, bob, who has a password of munich. You could store
the data as:
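A minimal sketch of that idea using Python's hashlib (the function names are illustrative, and a salt should also be added, as discussed later in this module): instead of saving munich itself, you save its hash, and at login you hash whatever the user typed and compare the two digests.

import hashlib

password = "munich"
stored_digest = hashlib.sha256(password.encode()).hexdigest()
# The row for bob would hold this digest instead of the word "munich".

def check_password(submitted, stored):
    # Hash whatever the user typed at login and compare the two digests.
    return hashlib.sha256(submitted.encode()).hexdigest() == stored

print(check_password("munich", stored_digest))   # True
print(check_password("berlin", stored_digest))   # False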
Of course, the question then becomes, if you are going to encrypt your data using a key, where
do you store the key safely? You have a number of different options, from specialized hardware
(good, but difficult and expensive), to using a key management service such as Amazon Key
Management Service (uses specialty hardware but is easier and less expensive), to storing it in
the database itself (which is not best practice, has no specialty hardware or physical
characteristics, and is vulnerable to attack).
Software vulnerabilities
When it comes to software vulnerabilities, you need to worry about two different types: your
own, and everyone else’s.
Most developers are not experts in security, so it is not uncommon to inadvertently code security
vulnerabilities into your application. For this reason, there are a number of different code
scanning tools, such as Bandit, Brakeman, and VisualCodeGrepper, that will scan your code
looking for well-known issues. These issues may be embedded in code you have written
yourself, or they may involve the use of other libraries.
These other libraries are how you end up with everyone else’s vulnerabilities. Even software that
has been in use for decades may have issues, such as the Heartbleed bug discovered in OpenSSL,
the software that forms the basis of much of the internet. The software has been around since
1998, but the bug was introduced in 2012 and sat, undetected, for two years before it was found
and patched.
Make sure that someone in your organization is responsible for keeping up with the latest
vulnerabilities and patching them as appropriate.
Remember that hackers cannot get what you do not store. For example, if you only need a credit
card authorization code for recurring billing, there is no reason to store the entire credit card
number. This is particularly important when it comes to personally identifying information such
as social security numbers and birthdays, and other information that could be considered
“private”, such as a user’s history.
Unless you need data for an essential function, do not store it.
Remember that when you store data in “the cloud” you are, by definition, storing it on someone
else’s computer. While in many cases a cloud vendor’s security may be better than that of most
enterprises, you still have the issue that those servers are completely outside of your control. You
do not know which employees are accessing them, or even what happens to hard drives that are
decommissioned. This is particularly true when using SSDs as storage, because the architectural
structure of an SSD makes it difficult or impossible to truly wipe every sector. Make sure that
your cloud data is encrypted or otherwise protected.
Roaming devices
In May of 2006, the United States Department of Veterans Affairs lost a laptop that contained a
database of personal information on 26.5 million veterans and service members. The laptop was
eventually recovered, but it is still a great example of why information must be stored in a secure
way, particularly because the world’s workforce is much more mobile now than it was in 2006.
In addition, apps are increasingly on devices that are even more portable than laptops, such as your
tablet and especially your mobile phone, and these are simply easier to lose. These might not even be
traditional apps such as databases, but apps targeted at the end user. Be sure you are not leaving
your data vulnerable; encrypt it whenever possible.
Data is also vulnerable when it is being transmitted. In fact, it may be even more vulnerable
because of the way the internet is designed, where packets pass through multiple servers (that
may or may not belong to you) on their way to their final destination.
This structure makes your data vulnerable to “man in the middle” attacks, in which a server
along the way can observe, steal, and even change the data as it goes by. To prevent these
problems you can use:
SSH - When connecting to your servers, always use a secure protocol such as SSH, or
secure shell, rather than an insecure protocol such as Telnet. SSH provides for
authentication and encryption of messages between the source and target machines,
making it difficult or impossible to snoop on your actions.
TLS - These days, the vast majority of requests to and from the browser use
the https:// protocol (rather than http://). This protocol was originally called SSL, or
Secured Sockets Layer, but over the years it has been gradually replaced with TLS, or
Transport Layer Security. TLS provides message authentication and stronger ciphers than
its predecessor. Whenever possible you should be using TLS.
VPN - Virtual Private Networks, are perhaps the most important means for protecting
your application. A VPN makes it possible to keep all application-related traffic inside
your network, even when working with remote employees. The remote employee
connects to a VPN server, which then acts as a proxy and encrypts all traffic to and from
the user.
Using a VPN has several benefits. First, traffic to and from the user is not vulnerable to snooping
or manipulation, so nobody can use that connection to damage your application or network.
Second, because the user is essentially inside the private network, you can restrict access to
development and deployment resources, as well as resources that do not need to be accessible to
end users, such as raw databases.
SQL injection is a code injection technique that is used to attack data-driven applications, in
which malicious SQL statements are inserted into an entry field for execution (e.g. to dump the
database contents to the attacker). SQL injection must exploit a security vulnerability in an
application's software. Two examples are when user input is either incorrectly filtered for string
literal escape characters embedded in SQL statements, or user input is not strongly typed and
unexpectedly executed. SQL injection is mostly known as an attack vector for websites but can
be used to attack any type of SQL database.
SQL injection attacks allow attackers to spoof identity, tamper with existing data, cause
repudiation issues such as voiding transactions or changing balances, allow the complete
disclosure of all data on the system, destroy the data or make it otherwise unavailable, and
become administrators of the database server.
SQL injection is one of the most common web hacking techniques. It is the placement of
malicious code in SQL statements, via web page input. It usually occurs when you ask a user for
input, like their username/userid, and instead of a name/id, the user gives you an SQL statement
that you will unknowingly run on your database.
Look at the following example which creates a SELECT statement by adding a variable ( uid) to
a select string. The variable is fetched from user input using request.args("uid"):
uid = request.args("uid");
str_sql = "SELECT * FROM Users WHERE UserId = " + uid;
One example is SQL injection based on the fact that 1=1 is always true (in SQL-speak).
Take a look at the code that creates an SQL statement to select a user profile for a given
UID:
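uid = request.args("uid");
str_sql = "SELECT * FROM UserProfiles WHERE UID = " + uid;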
If there is not an input validator to prevent a user from entering "wrong" input, the user can enter
input like this:
UID:
2019 OR 1=1
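The SQL statement that the server builds then becomes:

SELECT * FROM UserProfiles WHERE UID = 2019 OR 1=1;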
The SQL statement above is valid, but will return all rows from the "UserProfiles" table, because
OR 1=1 is always TRUE.
What will happen if the "UserProfiles" table contains names, emails, addresses, and passwords?
A malware creator or hacker might get access to all user profiles in database, by simply
typing 2019 OR 1=1 into the input field.
Another example is SQL Injection based on ""="" is always true. Here is that example.
Example:
u_name = request.args("uid");
u_pass = request.args("password");
sql = 'SELECT * FROM UserProfiles WHERE Name ="' + u_name + '"
AND Pass ="' + u_pass + '"'
Here is the expected SQL statement:
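With ordinary input (say, a name of devnet_user and a password of secret123), the statement would be:

SELECT * FROM UserProfiles WHERE Name ="devnet_user" AND Pass ="secret123"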
But the hacker can get access to user names and passwords in a database by simply inserting "
OR ""=" into the user name or password text box:
User Name:
" OR ""="
Password:
" OR ""="
The output code will create a valid SQL statement at server side, like this:
Output:
SELECT * FROM UserProfiles WHERE Name ="" OR ""="" AND Pass =""
OR ""=""
The SQL above is valid and will return all rows from the "UserProfiles" table, because OR ""="" is
always TRUE.
Most databases support batched SQL statements. A batch of SQL statements is a group of two or
more SQL statements, separated by semicolons.
The SQL statement below will return all rows from the "UserProfiles" table, then delete the
"UserImages" table.
Example:
uid = request.args("uid");
strSQL = "SELECT * FROM UserProfiles WHERE UID = " + uid;
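If the user supplies something like 2019; DROP TABLE UserImages as the UID, the server ends up executing a batch of two statements:

SELECT * FROM UserProfiles WHERE UID = 2019; DROP TABLE UserImages;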
Hopefully these examples influence you to design your data intake forms to avoid these common
security hacks.
SQL injection vulnerability exists because some developers do not care about data validation and
security. There are tools that can help detect flaws and analyze code.
To make detecting a SQL injection attack easy, developers have created good detection engines.
Some examples are SQLmap or SQLninja.
Source code analysis tools, also referred to as Static Application Security Testing (SAST) Tools,
are designed to analyze source code and/or compiled versions of code to help find security flaws.
These tools can automatically find flaws such as buffer overflows, SQL Injection flaws, and
others.
You can detect and prevent SQL injection by using a database firewall. Database Firewalls are a
type of Web Application Firewall that monitor databases to identify and protect against database
specific attacks. These attacks mostly seek to access sensitive information stored in the
databases.
SQL injection filtering works in a way similar to email spam filters. Database firewalls detect
SQL injections based on signals such as the number of invalid queries from a host and the
presence of OR and UNION blocks inside a request.
The use of prepared statements with variable binding (also known as parameterized queries) is
how all developers should first be taught how to write database queries. They are simple to write,
and easier to understand than dynamic queries. Parameterized queries force the developer to first
define all the SQL code, and then pass in each parameter to the query later. This coding style
allows the database to distinguish between code and data, regardless of what user input is
supplied.
Prepared statements ensure that an attacker is not able to change the intent of a query, even if
SQL commands are inserted by an attacker. In the safe example below, if an attacker were to
enter the userID of tom' or '1'='1, the parameterized query would not be vulnerable and would
instead look for a username which literally matched the entire string tom' or '1'='1.
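A minimal sketch of a parameterized query in Python, using sqlite3's ? placeholder (the database path and Users table are illustrative):

import sqlite3

def get_user(db_path, user_id):
    conn = sqlite3.connect(db_path)
    cur = conn.cursor()
    # The ? placeholder binds user_id as data, never as SQL text, so input such
    # as tom' or '1'='1 is matched as a literal string instead of being executed.
    cur.execute("SELECT * FROM Users WHERE UserId = ?", (user_id,))
    rows = cur.fetchall()
    conn.close()
    return rows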
Stored procedures are not always safe from SQL injection. However, certain standard stored
procedure programming constructs have the same effect as the use of parameterized queries
when implemented safely. This is the norm for most stored procedure languages.
They require the developer to just build SQL statements with parameters, which are
automatically parameterized unless the developer does something largely out of the norm. The
difference between prepared statements and stored procedures is that the SQL code for a stored
procedure is defined and stored in the database itself, and then called from the application. Both
of these techniques have the same effectiveness in preventing SQL injection so your organization
should choose which approach makes the most sense for you.
Note: 'Implemented safely' means the stored procedure does not include any unsafe dynamic
SQL generation. Developers do not usually generate dynamic SQL inside stored procedures.
However, it can be done, but should be avoided. If it cannot be avoided, the stored procedure
must use input validation or proper escaping. This is to make sure that all user supplied input to
the stored procedure cannot be used to inject SQL code into the dynamically generated query.
Auditors should always look for uses of sp_execute, execute or exec within SQL Server
stored procedures. Similar audit guidelines are necessary for similar functions for other vendors.
There are also several cases where stored procedures can increase risk. For example, on MS SQL
server, you have three main default roles: db_datareader, db_datawriter and db_owner.
Before stored procedures came into use, DBAs would
give db_datareader or db_datawriter rights to the web service's user, depending on the
requirements. However, stored procedures require execute rights, a role that is not available by
default. Some setups where the user management has been centralized, but is limited to those
three roles, cause all web apps to run under db_owner rights so that stored procedures can work.
That means that if a server is breached, the attacker has full rights to the database, where
previously they might only have had read-access.
Whitelist Input Validation
Various parts of SQL queries are not legal locations for the use of bind variables, such as the
names of tables or columns, and the sort order indicator (ASC or DESC). In such situations,
input validation or query redesign is the most appropriate defense. For the names of tables or
columns, ideally those values come from the code, and not from user parameters.
But if user parameter values are used for targeting different table names and column names, then
the parameter values should be mapped to the legal/expected table or column names to make
sure unvalidated user input does not end up in the query. Please note, this is a symptom of poor
design and a full re-write should be considered.
String tableName;
switch(PARAM):
  case "Value1": tableName = "fooTable";
                 break;
  case "Value2": tableName = "barTable";
                 break;
  ...
  default: throw new InputValidationException("unexpected value provided for table name");
The tableName can then be directly appended to the SQL query because it is now known to be
one of the legal and expected values for a table name in this query. Keep in mind that generic
table validation functions can lead to data loss because table names are used in queries where
they are not expected.
For something simple like a sort order, it would be best if the user supplied input is converted to
a boolean, and then that boolean is used to select the safe value to append to the query. This is a
standard need in dynamic query creation.
For example:
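A small Python sketch of that idea (the query and parameter name are illustrative):

def build_query(order_param):
    # Convert the untrusted parameter to a boolean first...
    ascending = (order_param == "asc")
    # ...then use the boolean to choose one of two known-safe values.
    sort_order = "ASC" if ascending else "DESC"
    return "SELECT * FROM UserProfiles ORDER BY Name " + sort_order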
Any user input can be converted to a non-string (such as a date, numeric, boolean, or enumerated
type). If you convert before the input is appended to a query, or used to select a value to append
to the query, this conversion ensures you can append to a query safely.
Input validation is also recommended as a secondary defense in ALL cases. More techniques for
implementing strong whitelist input validation are described in the Open Web Application
Security Project (OWASP) Input Validation Cheat Sheet.
Escaping All User-Supplied Input
This technique should only be used as a last resort, when none of the above are feasible. Input
validation is probably a better choice, because this methodology is frail compared to other defenses,
and it cannot be guaranteed to prevent SQL injection in all situations.
This technique is to escape user input before putting it in a query. Its implementation is very
database-specific. It is usually only recommended to retrofit legacy code when implementing
input validation is not cost effective. Applications built from scratch, or applications requiring
low risk tolerance should be built or re-written using parameterized queries, stored procedures,
or some kind of Object Relational Mapper (ORM) that builds your queries for you.
Escaping works like this. Each DBMS supports one or more character escaping schemes
specific to certain kinds of queries. If you then escape all user supplied input using the proper
escaping scheme for the database you are using, the DBMS will not confuse that input with SQL
code written by the developer, thus avoiding any possible SQL injection vulnerabilities.
There are some libraries and tools that can be used for Input Escaping. For example, OWASP
Enterprise Security API or ESAPI is a free, open source, web application security control library
that makes it easier for programmers to write lower-risk applications.
The ESAPI libraries are designed to make it easier for programmers to retrofit security into
existing applications and serve as a solid foundation for new development.
Additional defenses
Beyond adopting one of the four primary defenses, we also recommend adopting all of these
additional defenses in order to provide defense in depth. These additional defenses can be:
Least privilege
To minimize the potential damage of a successful SQL injection attack, you should minimize the
privileges assigned to every database account in your environment. Do not assign DBA or admin
type access rights to your application accounts. We understand that this is easy, and everything
just 'works' when you do it this way, but it is very dangerous. Start from the ground up to
determine what access rights your application accounts require, rather than trying to figure out
what access rights you need to take away. Make sure that accounts that only need read access are
only granted read access to the tables for which they need access. If an account only needs access
to portions of a table, consider creating a view that limits access to that portion of the data and
assigning the account access to the view instead, rather than the underlying table. Rarely, if ever,
grant create or delete access to database accounts.
If you adopt a policy where you use stored procedures everywhere, and do not allow application
accounts to directly execute their own queries, then restrict those accounts to only be able to
execute the stored procedures they need. Do not grant them any rights directly to the tables in the
database.
SQL injection is not the only threat to your database data. Attackers can simply change the
parameter values from one of the legal values they are presented with, to a value that is
unauthorized for them, but the application itself might be authorized to access. Minimizing the
privileges granted to your application will reduce the likelihood of such unauthorized access
attempts, even when an attacker is not trying to use SQL injection as part of their exploit.
You should also minimize the privileges of the operating system account that the DBMS runs
under. Do not run your DBMS as root or system! Most DBMSs run out of the box with a very
powerful system account. For example, MySQL runs as system on Windows by default. Change
the DBMS's OS account to something more appropriate, with restricted privileges.
Web applications designers should avoid using the same owner/admin account in the web
applications to connect to the database. Different DB users could be used for different web
applications.
In general, each separate web application that requires access to the database could have a
designated database user account that the web-app uses to connect to the DB. That way, the
designer of the application can have detailed access control, thus reducing the privileges as much
as possible. Each DB user will then have select access to what it needs only, and write-access as
needed.
As an example, a login page requires read access to the username and password fields of a table,
but no write access of any form (no insert, update, or delete). However, the sign-up page
certainly requires insert privilege to that table; this restriction can only be enforced if these web
apps use different DB users to connect to the database.
SQL views
You can use SQL views to further increase the granularity of access by limiting read access to specific
fields of a table or to joins of tables. Views can potentially have additional benefits. For example,
suppose that the system is required to store the passwords of the users, instead of salted-hashed
passwords. The designer could use views to compensate for this limitation; revoke all access to
the table (from all database users except the owner or admin), and create a view that outputs the
hash of the password field and not the field itself. Any SQL injection attack that succeeds in
stealing DB information will be restricted to stealing the hash of the passwords (even a keyed
hash), because no database user for any of the web applications has access to the table itself.
What is OWASP?
The Open Web Application Security Project (OWASP) is focused on providing education, tools,
and other resources to help developers avoid some of the most common security problems in
web-based applications. Resources provided by OWASP include:
Tools - OWASP produces tools such as the OWASP Zed Attack Proxy (ZAP), which
looks for vulnerabilities during development, OWASP Dependency Check, which looks
for known vulnerabilities in your code, and OWASP DefectDojo, which streamlines the
testing process.
Code projects - OWASP produces the OWASP ModSecurity Core Rule Set (CRS),
generic attack detection rules that can be used with web application firewalls, and
OWASP CSRFGuard, which helps prevent Cross-Site Request Forgery (CSRF) attacks.
Documentation projects - OWASP is perhaps best known for its documentation
projects, which include the OWASP Application Security Verification Standard, the
OWASP Top Ten, which describes the 10 most common security issues in web
applications, and the OWASP Cheat Sheet Series, which explains how to mitigate those
issues.
Let’s look at some of the most common of those Top Ten issues.
SQL injection
You have learned about using data in your application, and how to protect it. One of the issues
with using data in your application is that if you incorporate user interaction, you can create a
potentially dangerous situation.
For example, would you ever want to execute a statement like this?
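drop table products;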
Of course not. Because if you did, you would have deleted your products table. But if you are not
careful, you could do exactly that. How? Consider this example. Let’s say you have a form:
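Suppose the form asks for a username and password, and the user types something like this:

Username:
bob
Password:
'; drop table products; --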
That is an odd thing to enter, but follow the example through. If you have code that simply
integrates what the user typed, you will get the equivalent of this:
username = "bob"
userpass = "'; drop table products; --"
sqlstatement = "select * from users where username='" + username + "' and password='" + userpass + "'"
In this case, the hacker does not even have to enter a valid password for bob (or any username,
for that matter); the important part is that the dangerous statement, drop table products, gets
executed no matter what, and that double-dash (--) is a comment that prevents anything after it
from causing an error and preventing the statement from running.
How do you prevent it? While it is tempting to think that you can simply “sanitize” the inputs by
removing single quotes (‘), that is a losing battle. Instead, prevent it from happening by using
parameterized statements. How to achieve this is different for every language and database, but
here is how to do it in Python:
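A sketch in the same style as the other examples in this part (the cur cursor object is assumed to already exist, as in the search example that follows):

username = request.args.get("username")
userpass = request.args.get("userpass")
# The values are bound as data by the database driver, not pasted into the SQL text.
cur.execute("select * from users where username = %(username)s "
            "and password = %(userpass)s",
            {"username": username, "userpass": userpass})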
By doing it this way, the values of username and userpass are passed to the database as data and
bound into the statement, so the user cannot use them to break out of the string and create
additional statements.
One place this situation often appears is in regard to search, where by definition the user is
entering what will become part of the database statement. Consider this code:
...
@sample.route("/search")
def search():
    db = getdb()
    cur = db.cursor()
    search_term = request.args.get('search_term')
    # Here you are ensuring that the search term is treated as a string value
    cur.execute("select * from products where title like %(search_term)s",
                {'search_term': search_term})
    output = ""
    for row in cur.fetchall():
        output = output + str(row[0]) + " -- " + str(row[1]) + "\n"
    db.close()
    return output

if __name__ == "__main__":
    sample.run(host="0.0.0.0", port=80)
6.5.5 Cross-Site Scripting (XSS)
Cross site scripting attacks happen when user-submitted content that has not been sanitized is
displayed to other users. The most obvious version of this exploit is where one user submits a
comment that includes a script that performs a malicious action, and anyone who views the
comments page has that script executed on their machine.
...
@sample.route("/product_comments")
def search():
    db = getdb()
    cur = db.cursor()
    prod_id = request.args.get('prod_id')
    cur.execute("select * from products where id = %(prod_id)s",
                {'prod_id': prod_id})
    output = ""
    for row in cur.fetchall():
        output = output + str(row[0]) + ": " + str(row[1]) + "\n"
    db.close()
    return output
...
This code simply extracts comment data from the database and displays it on the page. Suppose a user
named Robin were to submit content such as:
<script type="text/javascript">alert("Gotcha!")</script>
Then any user coming to the page would receive content that includes Robin's script tag verbatim.
When that user loads the page, they would see an alert box, triggered by the inserted JavaScript.
Now, in this case, we are just displaying an alert, which is harmless. But that script could just as
easily have done something malicious, such as stealing cookies, or worse.
The bigger problem is that you are dealing with more than just the data stored in your
database, which is the source of "Stored XSS attacks." For example, consider this page, which displays content
from a request parameter:
...
<h1>Search results for {{ request.args['search_term'] }}</h1>
{% for item in cursor %}
...
A hacker could trick someone into visiting your page with a link in an email that provides
malicious code in a parameter:
http://www.example.com?search_term=%3Cscript%3Ealert%28%27Gotcha
%21%27%29%3C%2Fscript%3E
This link, which includes a “url encoded” version of the script, would result in an unsuspecting
user seeing a page of:
...
<h1>Search results for <script>alert('Gotcha!')</script></h1>
...
The main strategy is to sanitize content where possible, and if it cannot be sanitized, do not
display it.
Experienced web developers usually know to check for malicious content in comments, but there
are other places you must check for "untrusted" content. OWASP recommends never placing
untrusted content in some locations at all, such as directly inside a script block, inside an HTML
comment, or in an attribute or tag name. Other locations, such as ordinary HTML element content
or quoted attribute values, can display untrusted content only if it is sanitized or escaped first.
Sanitizing content can be a complicated process to get right, as you can see from the wide variety
of options an attacker has. It is worth it to use a tool that is built just for sanitizing content, such
as OWASP Java HTML Sanitizer, HtmlSanitizer, or Python Bleach.
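As an illustration (not from the original text), here is roughly how Python Bleach might be used to sanitize a user-submitted comment before displaying it; by default, bleach.clean() escapes tags that are not on its allowlist:

import bleach

comment = '<script type="text/javascript">alert("Gotcha!")</script>'
# Disallowed tags are escaped, so the script is rendered as harmless text
# instead of being executed by the browser.
safe_comment = bleach.clean(comment)
print(safe_comment)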
Another type of attack that shares some aspects of XSS attacks is Cross Site Request Forgery
(CSRF), sometimes pronounced “Sea Surf.” In both cases, the attacker intends for the user to
execute the attacker’s code, usually without even knowing it. The difference is that CSRF attacks
are typically aimed not at the target site, but rather at a different site, one into which the user has
already authenticated.
Here is an example. Let’s say the user logs into their bank
website, http://greatbank.example.com. In another window, they are on a discussion page
that includes an interesting looking link, and they click it.
...
<form action="https://greatbank.example.com" method="POST">
Username: <input type="text" name="username" style="width: 200px" />
Password: <input type="text" name="password" style="width: 200px" />
...
Now that you know about three of the most well-known attacks,
here is the entire OWASP Top 10 list.
Storing plaintext passwords is very simple: you just store them. You have a database with a
table for all your users, which would probably look something like (id, username, password).
After the account is created, you store the username and password in these fields as plaintext.
On login, you extract the row associated with the submitted username and compare the submitted
password with the password from the database. If it matches, you let your user in. Perfectly
simple and easy to implement.
##################### Plain Text #####################
@app.route('/signup/v1', methods=['POST'])
def signup_v1():
    conn = sqlite3.connect(db_name)
    c = conn.cursor()
    c.execute('''CREATE TABLE IF NOT EXISTS USER_PLAIN
           (USERNAME TEXT PRIMARY KEY NOT NULL,
            PASSWORD TEXT NOT NULL);''')
    conn.commit()
    try:
        # Note: building the SQL with .format() is itself open to SQL injection;
        # it is shown here only to illustrate plaintext password storage.
        c.execute("INSERT INTO USER_PLAIN (USERNAME,PASSWORD) "
                  "VALUES ('{0}', '{1}')".format(request.form['username'],
                                                 request.form['password']))
        conn.commit()
    except sqlite3.IntegrityError:
        return "username has been registered."
    print('username: ', request.form['username'], ' password: ',
          request.form['password'])
    return "signup success"
Verify the newly signed-up account with the login function, and you can see that all of the user
profile's data is stored in plaintext.
Password Hashing
Hashing
Hashing example
a75e46e47a3c4cf3aaefe1e549949c90e90e0fe306a2e37d2880702a62b0ff31
Salted password
Many passwords are hashed but can still be guessed: common passwords have been mined from
hacked sites and placed into lists, and these lists have made hashed passwords much easier to
crack. To guarantee the uniqueness of the passwords, increase their complexity, and prevent
password attacks even when the inputs are the same, a salt (which is simply random data) is
added to the input of a hash function. This is known as a salted password.
Sample
Hashed passwords are not unique in themselves, due to the deterministic nature of hash
functions: given the same input, the same output is always produced.
If devnet_alice and devnet_bob both choose devnetpassword1 as a password, their hashes would
be the same. This example reveals that devnet_alice and devnet_bob have the same password,
because both share the same
hash: 0e8438ea39227b83229f78d9e53ce58b7f468278c2ffcf45f9316150bd8e5201.
Now let's say devnet_alice and devnet_bob both decide to use the same
password, devnetpassword1, but with salts. For devnet_alice, we'll
use salt706173776f726473616c74a as the salt. However,
for devnet_bob, we'll use salt706173776f726473616253b as the salt:
User: devnet_alice
Password: devnetpassword1
Salt: salt706173776f726473616c74a
Hash (SHA-256): cefee7f060ed49766d75bd4ca2fd119d7fcabe795b9425f4fa9d7115f355ab8c

User: devnet_bob
Password: devnetpassword1
Salt: salt706173776f726473616253b
Hash (SHA-256): 41fffe05d7aca370abaff6762443d9326ce22107783b8ff5bb0cf576020fc1d5
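To make the mechanics concrete, here is a minimal Python sketch (not from the original text) of hashing a password with a salt using hashlib. Production systems would use a purpose-built password-hashing function such as bcrypt, scrypt, or PBKDF2 (for example, hashlib.pbkdf2_hmac):

import hashlib
import os

def hash_password(password, salt=None):
    # Generate a random 16-byte salt if one was not supplied.
    if salt is None:
        salt = os.urandom(16)
    # The salt is combined with the password before hashing, so two users
    # with the same password end up with different hashes.
    digest = hashlib.sha256(salt + password.encode('utf-8')).hexdigest()
    return salt, digest

salt, digest = hash_password('devnetpassword1')
print(salt.hex(), digest)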
Common techniques for attacking passwords include password guessing, dictionary attacks, and
social engineering. Password strength also matters; compare, for example, the
passwords password, passw0rd123, and #W)rdPass1$.
In this example, the password used is #W)rdPass1$, which is strong. One estimate is that it
would take about 21 years to crack it.
Best practices
There are a few best practices to secure user login attempts. These include notifying users of
suspicious behavior, and limiting the number of password and username login attempts.
You can read more about NIST Special Publication 800-63B on the National Institute of
Standards and Technology site.
In this lab, you will create an application that stores a username and password in plaintext in a
database using Python code. You will then test the server to ensure not only that the
credentials were stored correctly, but also that a user can use them to log in. You will then
perform the same actions, but with a hashed password, so that the credentials cannot be read. It
is important to securely store credentials and other data to prevent servers and systems from
being compromised.
A container is a way of encapsulating everything you need to run your application, so that
it can easily be deployed in a variety of environments. Docker is a way of creating and
running that container. A "makefile" is the file the make utility uses to compile and build
all the pieces of an application. In Docker, the equivalent is a simple text file called a
Dockerfile. It defines the steps that the docker build command needs to take to create an
image that can then be used to create the target container. Some of the available
Dockerfile commands are: FROM, MAINTAINER, RUN, CMD, EXPOSE, ENV, COPY,
ENTRYPOINT, VOLUME, USER, WORKDIR, ARG, ONBUILD, STOPSIGNAL, and
LABEL. You can use your image to create a new container and actually do some work.
To do that, you want to run the image:
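For example, a command along these lines would do it (the image name sample-app is an illustrative placeholder, not taken from the original text):

docker run -d -P sample-app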
In this case, you've specified several parameters. The -d parameter is short for --detach
and says you want to run it in the background, and -P tells Docker to publish it
on the ports that you exposed (in this case, 8080).
You can make your image available for other people to use by storing it in an image
registry. By default, Docker uses the Docker Hub registry, though you can create and
use your own registry.
CI/CD is a philosophy for software deployment that figures prominently in the field of
DevOps. DevOps itself is about communication and making certain that all members of
the team are working together to ensure smooth operation. The idea behind Continuous
Integration is that you, and all other developers on the project, continually merge your
changes with the main branch of the existing application. This means that any given
change set is small and the potential for problems is low. If everyone is using the main
branch, anyone who checks out code is going to have the latest version of what
everyone else is developing. Here are some benefits that come with using CI/CD for
development:
A deployment pipeline can be created with a build tool such as Jenkins. These
pipelines can handle tasks such as gathering and compiling source code, testing, and
compiling artifacts such as tar files or other packages. The fundamental unit of Jenkins
is the project, also known as the job. You can create jobs that do all sorts of things, from
retrieving code from a source code management repo such as GitHub, to building an
application using a script or build tool, to packaging it up and running it on a server.
These days, you must consider networking for all but the simplest of use cases. This is
especially true when it comes to cloud and container deployments. Some of the networking
components you need to consider for cloud deployment include firewalls, load balancers, DNS,
and reverse proxies. At its most basic level, a firewall accepts or
rejects packets based on the IP addresses and ports to which they're addressed. A load
balancer takes requests and balances them by spreading them out among multiple
servers. DNS is how servers on the internet translate human-readable names into
machine-routable IP addresses. IP addresses are required to navigate the internet. A
reverse proxy is similar to a regular proxy; however, while a regular proxy works to
make requests from multiple computers look like they all come from the same client, a
reverse proxy works to make sure responses look like they all come from the same
server.
Securing Applications
You must secure data when it is at rest. There are two methods for encrypting data:
one-way encryption, and two-way encryption. Data is also vulnerable when it’s being
transmitted. When your data is in motion it is vulnerable to “man in the middle” attacks,
in which a server along the way can observe, steal, and even change the data as it
goes by. To prevent this, you can use several techniques including: SSH, TLS, and
VPN. SQL injection is a code injection technique that is used to attack data-driven
applications, in which malicious SQL statements are inserted into an entry field for
execution. SQL injection must exploit a security vulnerability in an application's
software. SQL injection vulnerabilities exist because some applications do not properly
validate input data. There are tools that can help detect flaws and analyze
code. OWASP is focused on providing education, tools, and other resources to help
developers avoid some of the most common security problems in web-based
applications. Resources provided by OWASP include: tools, code projects, and
documentation projects.
XSS attacks happen when user-submitted content that hasn’t been sanitized is
displayed to other users. The most obvious version of this exploit is where one user
submits a comment that includes a script that performs a malicious action, and anyone
who views the comments page has that script executed on their machine. Another type
of attack that shares some aspects of XSS attacks is CSRF. In both cases, the attacker
intends for the user to execute the attacker’s code, usually without even knowing it. The
difference is that CSRF attacks are typically aimed not at the target site, but rather at a
different site, one into which the user has already authenticated.
Injection
Broken Authentication
Sensitive Data Exposure
XML External Entities (XXE)
Broken Access Control
Security Misconfiguration
Cross-Site Scripting (XSS)
Insecure Deserialization
Using Components with Known Vulnerabilities
Insufficient Logging and Monitoring
The first passwords were simple plaintext ones stored in databases. A more secure way
to store a password is to transform it into data that cannot be converted back to the
original password, known as hashing. To guarantee the uniqueness of the passwords,
increase their complexity, and prevent password attacks even when the inputs are the
same, a salt (which is simply random data) is added to the input of a hash function. 2FA
uses the same password/username combination, but with the addition of being asked to
verify who a person is by using something only he or she owns, such as a mobile
device. With MFA, a user is only granted access after successfully presenting several
separate pieces of evidence to an authentication mechanism. Typically at least two of
the following categories are required for MFA: knowledge (something they know),
possession (something they have), and inherence (something they are).
Enterprises compete and control costs by operating quickly and being able to scale their
operations. Speed and agility enable the business to explore, experiment with, and exploit
opportunities ahead of their competition. Scaling operations lets the business capture market
share efficiently and match capacity to demand.
Developers need to accelerate every phase of software building: coding and iterating, testing, and
staging. DevOps practices require developers to deploy and manage apps in production, so
developers should also automate those activities.
Below are some of the risks that can be incurred in manually-deployed and -managed
environments.
Disadvantages of manual operations
Building up a simple, monolithic web application server can take a practiced IT operator 30
minutes or more, especially when preparing for production environments. When this process is
multiplied by dozens or hundreds of enterprise applications, multiple physical locations, data
centers, and/or clouds, manual processes will, at some point, cause a break or even a network
failure. This adds costs and slows down the business.
Manual processes such as waiting for infrastructure availability, manual app configuration and
deployment, and production system maintenance, are slow and very hard to scale. They can
prevent your team from delivering new capabilities to colleagues and customers. Manual
processes are always subject to human error, and documentation meant for humans is often
incomplete and ambiguous, hard to test, and quickly outdated. This makes it difficult to encode
and leverage hard-won knowledge about known-good configurations and best practices across
large organizations and their disparate infrastructures.
Financial costs
Outages and breaches are most often caused when systems are misconfigured. This is frequently
due to human error while making manual changes. An often-quoted Gartner statistic (from 2014)
places the average cost of an IT outage at upwards of $5,600 USD per minute, or over $300,000
USD per hour. The cost of a security breach can be even greater; in the worst cases, it represents
an existential threat to human life, property, business reputation, and/or organizational survival.
Financial Costs of Server Outages
Dependency risks
Today's software ecosystem is decentralized. Developers no longer need to build and manage
monolithic, full-stack solutions. Instead, they specialize by building individual components
according to their needs and interests. Developers can mix and match the other components,
infrastructure, and services needed to enable complete solutions and operate them efficiently at
scale.
This modern software ecosystem aggregates the work of hundreds of thousands of independent
contributors, all of whom share the benefits of participating in this vast collaboration.
Participants are free to update their own work as needs and opportunities dictate, letting them
bring new features to market quickly, fix bugs, and improve security.
Responsible developers attempt to anticipate and minimize the impact of updates and new
releases on users by hewing closely to standards, deliberately engineering backwards
compatibility, committing to provide long-term support for key product versions (e.g., the "LTS"
versions of the Ubuntu Linux distribution), and other best practices.
Components need to be able to work alongside many other components in many different
situations (this is known as being flexibly configurable), showing no more preference for
specific companion components or architectures than absolutely necessary (this is known
as being unopinionated).
Component developers may abandon support for obsolete features and rarely-
encountered integrations. This disrupts processes that depend on those features. It is also
difficult or impossible to test a release exhaustively, accounting for every configuration.
Dependency-ridden application setups tend to get locked into fragile and increasingly
insecure deployment stacks. They effectively become monoliths that cannot easily be
managed, improved, scaled, or migrated to new, perhaps more cost-effective
infrastructures. Updates and patches may be postponed because changes are risky to
apply and difficult to roll back.
Infrastructure automation can deliver many benefits. These are summarized as speed,
repeatability, and the ability to work at scale, with reduced risk.
Self-service
Scale on demand
Apps and platforms need to be able to scale up and down in response to traffic and workload
requirements and to use heterogeneous capacity. An example is burst-scaling from private to
public cloud, and appropriate traffic shaping. Cloud platforms may provide the ability to
automatically scale (autoscale) VMs, containers, or workloads on a serverless framework.
Observability
An observable system enables users to infer the internal state of a complex system from its
outputs. Observability (sometimes abbreviated as o11y) can be achieved through platform and
application monitoring. Observability can also be achieved through proactive production testing
for failure modes and performance issues. But, in a dynamic operation that includes autoscaling
and other application behaviors, complexity increases, and entities become ephemeral. A
report by observability provider Datadog states that the average lifetime of a
container under orchestration is only 12 hours; microservices and functions may only live for
seconds. Making ephemeral entities observable and testing in production are only possible with
automation.
Some software makers and observability experts recommend what is known as Chaos
Engineering. This philosophy is based on the assertion that failure is normal: as applications
scale, some parts are always failing. Because of this, apps and platforms should be engineered to:
Minimize the effects of issues: Recognize problems quickly and route traffic to
alternative capacity, ensuring that end users are not severely impacted, and that on-call
operations personnel are not unnecessarily paged.
Self-heal: Allocate resources according to policy and automatically redeploy failed
components as needed to return the application to a healthy state in current conditions.
Monitor events: Remember everything that led to the incident, so that fixes can be
scheduled, and post-mortems can be performed.
Some advocates of Chaos Engineering even advocate using automation tools to cause controlled
(or random) failures in production systems. This continually challenges Dev and Ops to
anticipate issues and build in more resilience and self-healing ability. Open source projects like
Chaos Monkey and "Failure-as-a-Service" platforms like Gremlin are purpose-built for breaking
things, both at random and in much more controlled and empirical ways. An emerging discipline
is called "failure injection testing."
7.1.5 Software-Defined Infrastructure - A Case for Automation
Software-defined infrastructure, also known as cloud computing, lets developers and operators
use software to requisition, configure, deploy, and manage bare-metal and virtualized compute,
storage, and network resources.
Cloud computing also enables more abstract platforms and services, such as Database-as-
a-Service (DBaaS), Platform-as-a-Service (PaaS), serverless computing, container
orchestration, and more.
Private clouds let businesses use expensive on-premises hardware much more efficiently.
Public and hosted private clouds let businesses rent capacity at need, letting them move
and grow (or shrink) faster, simplifying planning and avoiding fixed capital investment.
Developers must pay close attention to platform design, architecture, and security. Cloud
environments make new demands on applications. Public or private cloud frameworks have
varying UIs, APIs, and quirks. This means that users cannot always treat cloud resources as the
commodities they really should be, especially when trying to manage clouds manually.
Access control is critical, because cloud users with the wrong permissions can do a lot of damage
to their organization's assets. Cloud permissions can be also challenging to manage, particularly
in manually operated scenarios.
When cloud resources can be self-served quickly via manual operations, consumption can be
hard to manage, and costs are difficult to calculate. Private clouds require frequent auditing and
procedures for retiring unused virtual infrastructure. Public cloud users can be surprised by
unexpected costs when pay-by-use resources are abandoned, but not torn down.
Large-scale enterprise and public-facing applications may need to manage heavy and variable
loads of traffic, computation, and storage.
For these and other reasons, modern application architectures are increasingly distributed. They
are built up out of small and relatively light components that are sometimes called microservices.
These components may be isolated in containers, connected via discovery and messaging
services (which abstract network connectivity) and backed by resilient, scalable databases (which
maintain state).
Monolithic applications can only be scaled by duplicating the whole application on additional machines.
Benefits of microservices
Challenges of microservices
Increased complexity - Microservices mean that there are many moving parts to
configure and deploy, and operations become more demanding, including scaling on
demand, self-healing, and other features.
Automation is a requirement - Manual methods cannot realistically cope with the
complexity of deploying and managing dynamic applications and their orchestrator
platforms, with their high-speed, autonomous operations and their transitory and
ephemeral bits and pieces.
Manage all phases of app building, configuration, deployment and lifecycle management.
This includes coding, testing, staging, and production.
Manage software-defined infrastructures on behalf of the applications you build.
Alongside your applications, to preserve, update, and continually improve the automation
code. This code helps you develop, test, stage, monitor, and operate your apps at
production scale and in various environments. You can increasingly treat all this code
as one work-product.
Legacy bottlenecks
Traditionally, project resourcing or system scaling would be plan-driven, rather than demand-
driven. Requisitioning, purchasing, installing, provisioning, and configuring servers or network
capacity and services for a project could take months. With limited physical resources, resource-
sharing was common.
The lack of simple ways to set up and tear down isolated environments and connectivity meant
that organizations tended to create long-running systems that became single points of failure.
Mixed-use networks were difficult to secure and required meticulous ongoing management.
Colloquially, such long-running, elaborate systems were referred to as "pets" because people
would name them and care for that one system. The alternative is "cattle", which are stripped-
down, ephemeral workloads and virtual infrastructure built and torn down by automation. This
method ensures that there is a new system (or “cow”) available to take over the work.
Sophisticated tech organizations have been migrating away from these historic extremes for
generations. The process accelerated with widespread adoption of server virtualization, cloud,
and Agile software development. In the early 2000s, there began a movement to treat Dev and
Ops as a single entity:
DevOps evolved and continues to evolve in many places in parallel. Some key events have
shaped the discipline as we know it today.
By 2003, the world's biggest and most-advanced internet companies had significantly adopted
virtualization. They were dealing with large data centers and applications operated on a massive
scale. There were failures that resulted in Dev vs. Ops finger-pointing, fast-growing and
perpetually insufficient organizational headcounts, and on-call stress.
Google was among the first companies to understand and institutionalize a new kind of hybrid
Dev+Ops job description. This was the Site Reliability Engineer (SRE). The role of the SRE is
intended to fuse the disciplines and skills of Dev and Ops, creating a new specialty and best-
practices playbook for doing Ops with software methods.
The SRE approach was adopted by many other companies. This approach is based on:
Shared responsibility
Embracing risk
Acknowledgment of failure as normal
Commitment to use automation to reduce or eliminate "toil"
Measurement of everything
Qualifying success in terms of meeting quantitative service-level objectives
At Agile 2008, Belgian developer Patrick Debois gave a presentation called Agile Infrastructure
and Operations. His presentation discussed how to apply Developer methods to Ops while
maintaining that Dev and Ops should remain separate. Nevertheless, the following year, Debois
went on to found the DevOpsDays event series.
Debois's presentation was influential in advancing discussions around automating virtual (and
physical) infrastructure, using version-control (such as Git) to store infrastructure deployment
code (procedural or declarative), and applying Agile methods to the development and
maintenance of infrastructure-level solutions.
Automated infrastructure
Shared version control
Single-step builds and deployments
A focus on automation
The idea that "failure is normal"
A reframing of "availability" in terms of what a business can tolerate
Just as Agile Development can be seen as a method for defining and controlling management
expectations for software projects, DevOps can be viewed as a way of structuring a healthy
working culture for the technology-based parts of businesses. It can also be viewed as a way of
reducing costs. By making failure normal and automating mitigation, work can move away from
expensive and stressful on-call hours and into planned workday schedules.
Automation delivers speed and repeatability while eliminating manual, repetitive labor. This
enables technical talent to spend time solving new problems, increasing the value to the business
in the time spent.
DevOps/SRE practitioners are often expected to devote a significant fraction of working hours
(50% or more in some scenarios) to delivering new operational capabilities and engineering
reliable scaling, including development of automation tooling. This reduces hours spent on-call
and intervening manually to understand and fix issues.
Acquisition and retention of technical talent requires organizations to cooperate with their
innovators to minimize the boredom and stress of low-value labor and on-call work, and the risk
of confronting inevitable technology failures in "fire drill" mode.
This is especially critical for businesses with intrinsically low margins, that profit by rapid
growth and large scale and need to strictly control expenditures, particularly for skilled
headcount. For all kinds of organizations, however, it is important that information technology
work be perceived as a profit-center, rather than as a cost-center.
Failure is normal
The assumption that failures will occur does influence software design methodology. DevOps
must build products and platforms with greater resiliency, higher latency tolerance where
possible, better self-monitoring, logging, end-user error messaging, and self-healing.
When failures do occur and DevOps must intervene, the resulting activities should be viewed not
simply as repair work, but as research to identify and rank procedural candidates for new rounds
of automation.
Critical to DevOps/SRE culture are two linked ideas: first, DevOps must deliver measurable,
agreed-upon business value; and second, doing so perfectly is statistically impossible.
These ideas are codified in a Service Level Objective (SLO) that is defined in terms of real
metrics called Service Level Indicators (SLIs).
SLIs are engineered to map to the practical reality of delivering a service to customers: they may
represent a single threshold or provide more sophisticated bracketing to further classify outlier
results. For example, an SLI might state that 99% of requests will be served within 50
milliseconds, and may also require capturing information such as whether a single >50msec
request completes at all, or whether a particular request has failed for your biggest customer.
SLO/SLI methodology permits cheaper, more rapid delivery of business value by removing the
obligation to seek perfection in favor of building what is "good enough". It can also influence the
pace, scope, and other aspects of development to ensure and improve adequacy.
One way of modeling SLO/SLI results requires establishing a so-called "error budget" for a
service for a given period of time (day, week, month, quarter, etc.), and then subtracting failures
to achieve SLO from this value. If error budgets are exceeded, reasonable decisions can be made,
such as slowing the pace of releases until sources of error are determined, and specific fixes are
made and tested.
For a given service, it makes sense to commit to delivering well within capacity, but then
overdeliver. SLAs (external agreements) are best set to where they will be easy to fulfill. SLOs
(targets for actual performance) can be set higher. The error budget is the difference between
SLO and 100% availability.
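As a simple illustration (not from the original text): with an SLO of 99.9% availability over a 30-day month (43,200 minutes), the error budget is 0.1% of that time, or about 43 minutes. If outages consume those 43 minutes before the month ends, the budget is exhausted, and the team might reasonably slow the pace of releases until reliability improves.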
You have seen how DevOps/SRE is co-evolving with technologies like virtualization and
containerization, enabling a unified approach and unified tool set to support coordinated
application and infrastructure engineering.
Next, you will learn about some of the mechanics of infrastructure automation.
Powerful automation tools like Ansible, Puppet, and Chef bring ease of use, predictability,
discipline, and the ability to work at scale to DevOps work. But that does not mean that you
cannot do some automation with more basic tools like Bash and Python. Automation tooling
partly works by wrapping shell functionality, operating system utilities, API functions and other
control plane elements for simplicity, uniformity, feature enrichment, and compatibility in
DevOps scenarios. But tools still do not solve every problem of deployment and configuration.
That is why every automation tool has one or more functions that execute basic commands and
scripts on targets and return results. For example, in Ansible, these functions
include command, shell, and raw.
Sometimes it can be faster and simpler to use shell commands or scripts. Often, this is because
many tool implementations begin by translating automation originally written in Bash, Python,
or other languages, and you want to transfer that functionality quickly and accurately into the
tool before porting and refactoring.
In summary, it is rare to look deep down into tool-maintained infra-as-code repos without
finding some scripting. So having these skills is important!
Bash
In Linux (and other operating systems), the shell interoperates with interactive I/O, the file
system, and interprocess communication. This provides ways to issue commands, supply input
for processing, and pipe outputs to chains of powerful utilities.
The Bourne Again Shell (BASH) is the default on most Linux distributions. Because of its
ubiquity, the terms "Bash" and "shell" are generally used interchangeably.
Using commands in a Bash script is much the same as using them directly from the command
line. Very basic script development can simply be a matter of copying command-line expressions
into a file after testing the CLI commands to see if they work.
Sophisticated languages improve on Bash when complexity and scale requirements increase.
They are particularly useful when building and operating virtualized infrastructure in cloud
environments, using SDKs like the AWS SDK for Python or the AWS SDK for JavaScript in
Node.js. While Bash can be used to script access to the AWS CLI, you can use the built-in
features and libraries of more sophisticated languages to parse complex returned datasets (such
as JSON), manage many parallel operations, process errors, handle asynchronous responses to
commands, and more.
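For example, here is a rough sketch (an illustration, not from the original text) of calling a CLI from Python and parsing its JSON output into native data structures; it assumes the AWS CLI is installed and configured:

import json
import subprocess

# Run a CLI command that returns JSON, then parse it with the json module.
result = subprocess.run(
    ["aws", "ec2", "describe-instances", "--output", "json"],
    capture_output=True, text=True, check=True)

data = json.loads(result.stdout)
for reservation in data.get("Reservations", []):
    for instance in reservation.get("Instances", []):
        print(instance.get("InstanceId"), instance.get("State", {}).get("Name"))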
To develop and execute scripts in your desired language, you may need to install and configure
that language on your development system and on any remote target machines. Accessing
system-level utility functions may require invoking libraries (such as the os library in Python),
then wrapping what are Bash CLI commands in additional syntax for execution. You also need
to handle return codes, timeouts, and other conditions in your preferred language environment.
Using Bash, Python, or other conventional languages for automation usually means writing an
imperative procedure. An imperative procedure is an ordered sequence of commands aimed at
achieving a goal. The sequence may include flow-control, conditions, functional structure,
classes, and more.
Such procedural automation can be very powerful. But it stays simple only if you are
knowledgeable about how system utilities, CLIs/SDKs, and other interfaces work. You must also
know about target system state.
Developing a procedure
As you know, if you make a little script to install and configure a piece of software on a remote
target system, it may run okay the first time. Run it a second time, however, and your simple
script might make a mess. It might throw an error and stop when it finds the application already
installed, or worse, ignore such an error and go on to make redundant changes in config files.
To make this script safer, easier to use, more flexible, and reusable, you need to make it smarter
and more elaborate. For example, you could enhance it to:
As you develop and refine the scripts further, you will want them to accomplish some of the
following tasks:
Discover, inventory, and compile information about target systems, and ensure the scripts
do this by default.
Encapsulate the complexity of safely installing applications. Make config file backups
and changes, and restart services into reusable forms, such as subsidiary scripts
containing parameters, function libraries, and other information.
This type of scripting tends to be dangerous if starting state is not completely known
and controlled. Applying the same changes again to a correctly-configured system may
even break it.
Ultimately, the goal of almost any script is to achieve a desired state in a system,
regardless of starting conditions. Carefully-written procedural scripts and declarative
configuration tools examine targets before performing tasks on them, only performing
those tasks needed to achieve the desired state.
Idempotency is the property of operations whereby no matter how many times they are
executed, they produce the same result. There are a few basic principles of
idempotency to follow (a small script sketch follows this list):
Ensure the change you want to make has not already been made - Also
known as "First, do no harm". Doing nothing is almost always a better choice
than doing something wrong and possibly unrecoverable.
Get to a known-good state, if possible, before making changes - For
example, you may need to remove and purge earlier versions of applications
before installing later versions. In production infra-as-code environments, this
principle becomes the basis for immutability. Immutability is the idea that
changes are never made on live systems; instead, you change the automation and use it
to build brand-new, known-good components from scratch.
Test for idempotency - Be scrupulous about building automation free from side
effects.
All components of a procedure must be idempotent - Only if all components
of a procedure are known to be idempotent can that procedure as a whole be
idempotent.
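Here is a minimal Python sketch (illustrative only, with hypothetical paths and package names) of the "check before you change" principle behind idempotent scripts:

import os
import subprocess

def ensure_line_in_file(path, line):
    """Append a config line only if it is not already present (idempotent)."""
    if os.path.exists(path):
        with open(path) as f:
            if line in (existing.rstrip("\n") for existing in f):
                return  # already configured; do nothing
    with open(path, "a") as f:
        f.write(line + "\n")

def ensure_package_installed(package):
    """Install a package only if it is missing (Debian/Ubuntu example)."""
    check = subprocess.run(["dpkg", "-s", package], capture_output=True)
    if check.returncode != 0:
        subprocess.run(["apt-get", "install", "-y", package], check=True)

ensure_package_installed("apache2")
ensure_line_in_file("/etc/example.conf", "max_connections=100")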
You can store scripts locally, transmit them to target machines with a shell utility
like scp, then log into the remote machine using ssh and execute them.
You can pipe scripts to a remote machine using cat | ssh and execute them in
sequence with other commands, capturing and returning results to your terminal,
all in one command.
You can install a general-purpose secure file-transfer client like SFTP, then use
that utility to connect to the remote machine, transfer, set appropriate
permissions, then execute your script file.
You can store scripts on a webserver and then, from the remote machine, retrieve
them with wget, curl, or other utilities; or you can store the scripts in a Git repository,
install git on the remote machine, clone the repo, check out a branch, and
execute the scripts found there.
You can install a full remote-operations solution like VNC or NoMachine locally,
install its server on the target (this usually requires also installing a graphical
desktop environment), transmit/copy and then execute scripts.
If your target devices are provisioned on a cloud framework, there is usually a
way to inject a configuration script via the same CLI command or WebUI action
that manifests the platform.
Most public cloud services let you inject configuration scripts directly into VM instances for
execution at boot time.
Almost every developer will end up using these and other methods at one point or another,
depending on the task(s), environmental limitations, access to internet and other security
restrictions, and institutional policy.
Understanding these methods and practicing them is important, because procedurally automating
certain manual processes can still be useful, even when advanced deployment tools handle the
majority of a DevOps task. To be clear, relying on ad hoc scripts is not ideal practice, but the
state of the art in tooling is not yet comprehensive enough to solve every problem you may encounter.
Cloud providers and open source communities often provide specialized subsystems for popular
deployment tools. These subsystems extract a complete inventory of resources from a cloud
framework and keep it updated in real time while automation makes changes, which enables you
to more easily write automation to manage these resources.
You can also manage cloud resources using scripts written in Bash, Python, or other languages.
Such scripts are helped along by many tools that simplify access to automation targets. These
include:
CLIs and SDKs that wrap the REST and other interfaces of hardware, virtual
infrastructure entities, higher-order control planes, and cloud APIs. This makes their
features accessible from shells (and via Bash scripts) and within Python programs.
Command-line tools and Python's built-in parsers can parse JSON and YAML output
returned by CLIs and SDKs into pretty, easy-to-read formats and into native Python data
structures for easy manipulation.
IaaS and other types of infrastructure cloud also provide CLIs and SDKs that enable easy
connection to their underlying interfaces, which are usually REST-based.
If you are familiar with Cisco Compute products, including Unified Computing System (UCS),
HyperFlex, UCS Manager, and the Intersight infrastructure management system, you know that
Intersight is effectively a gateway to global SaaS management of an organization's UCS/HyperFlex
infrastructure.
Cisco's main API for this infrastructure is the Cisco Intersight RESTful API. This is an
OpenAPI-compatible API that can be interrogated with Swagger and other open source OpenAPI
tools. These enable you to generate specialized SDKs for arbitrary languages and environments,
and simplify the task of documenting the API (and maintaining SDKs).
Cisco provides and maintains a range of SDKs for the Intersight RESTful API, including ones
for Python and Microsoft PowerShell. They also provide a range of Ansible modules.
VMware
VMware's main CLI is now Datacenter CLI, which enables command-line operation of vCenter
Server API and VMware Cloud on AWS. It is written in Python and runs on Linux, Mac, and
Windows.
VMware also provides vSphere CLI for Linux and Windows, which lets you manage ESXi
virtualization hosts and vCenter servers, and offers a subset of DCLI commands. It also offers
PowerCLI for Windows PowerShell, which provides cmdlets for vSphere, vCloud, vRealize
Operations Manager, vSAN, NSX-T, VMware Cloud on AWS, VMware HCX, VMware Site
Recovery Manager, and VMware Horizon environments.
VMware also offers a host of SDKs for popular languages (including Python), aimed at vSphere
Automation, vCloud Suite, vSAN, and other products.
OpenStack
The OpenStack project provides the OpenStack Client (OSC), which is written in Python. The
OSC lets you access OpenStack Compute, Identity, Image, Object Storage, and Block Storage
APIs.
Installing the command-line clients also installs the bundled OpenStack Python SDK, enabling a
host of OpenStack commands in Python.
OpenStack Toolkits are also available for many other popular languages.
Summary
Basic automation scripting techniques are great to have in your toolbox, and understanding them
will improve your facility as an operator and user of mature automation platforms.
You will also have the option to install one or more of these automation tools on your local
workstation. If you want to try this, ensure that you have access to a Linux or Unix-like
workstation, such as Ubuntu or macOS. You should always refer to the tool's own installation
documentation for your operating system.
Automation tools like Ansible, Puppet, or Chef offer powerful capabilities compared to ad-hoc
automation strategies using BASH, Python, or other programming languages.
Automation tools "wrap" operating system utilities and API functions to simplify and standardize
access. Often, they also establish intelligent defaults that speed code drafting and testing. They
make tool-centric code less verbose and easier to understand than scripts.
You can still access deeper underlying functionality with built-in shell access, which enables
you to issue "raw" shell commands and inject shell and other scripts into remote systems for
delicate configuration. You can reuse legacy configuration code, and add functionality to the
tool itself by composing modules and plugins in languages like Python or Ruby.
Automation tool modules enable best practices that make code safer and idempotency easier to
achieve. For example, many Ansible functions can back up configuration files on target systems
or retrieve copies and store them locally on a deployer machine before making changes. This
helps to enable recovery if a deployment breaks, or is interrupted.
Automation tools typically provide some very powerful functionality to accelerate development.
For example, by default, Ansible 2.4+ provides functionality that lets you easily retrieve
configuration snapshots from Cisco ACI network fabrics. It also has complementary
functionality to help you enable rollback of ACI configurations to a prior snapshotted state.
Automation tools typically gather information from target devices as a default in the normal
course of operations. This information includes hardware configuration, BIOS settings, operating
system, configuration of network and other peripheral cards and subsystems, installed
applications, and other details.
Some tools, like Cisco ACI, can also gather configuration details from individual devices and
higher-order virtualization frameworks. Others feature dynamic inventory systems that enable
automated extraction, compilation, and realtime updating of data structures, describing all
resources configured in a private or public cloud estate.
Handle scale
Most automation tools can work in a local mode, as well as a client/server or distributed agent
mode. This lets the tool manage thousands or tens of thousands of nodes.
Engage community
Most popular tools are available in open source core versions, helping the community to
accelerate development, and find and fix bugs. Users of these tools also share deployment
playbooks, manifests, recipes, and more. These are designed for use with the tool and may be
distributed via GitHub and other public repositories, and on tool-provider-maintained
repositories like Ansible Galaxy.
Idempotency: a review
Idempotent software produces the same desirable result each time that it is run. In deployment
software, idempotency enables convergence and composability. Idempotent deployment
components let you:
More easily gather components in collections that build new kinds of infrastructure and
perform new operations tasks
Execute whole build/deploy collections to safely repair small problems with
infrastructure, perform incremental upgrades, modify configuration, or manage scaling.
Procedural code can achieve idempotency, but many infrastructure management, deployment,
and orchestration tools have adopted another method: a declarative model. A
declarative model is a static description of the desired end state. This model is used by
middleware that incorporates deployment-specific details, examines present circumstances, and
brings real infrastructure into alignment with the model via the least disruptive, and usually least
time-consuming, path.
These days, most popular automation tools are characterized as inherently procedural or
declarative. Ansible and Puppet, for example, are often described as employing declarative
Domain-Specific Languages (DSLs), whereas Chef is said to be more inherently procedural.
This is a somewhat artificial distinction, because all these platforms (as well as BASH, Python,
etc.) are procedural at the lowest level; Ansible is based on Python, Puppet and Chef are built on
Ruby. All can make use of both declarative and procedural techniques as needed, and many real-
world automation tasks require both approaches.
Figure 1. Typical terse Ansible declaration, which installs the Apache web server on an Ubuntu
host
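The figure itself is not reproduced here; a minimal sketch of such a declaration might look like the following (the host group and task name are illustrative):

- hosts: webservers
  become: true
  tasks:
    - name: install apache
      apt:
        name: apache2
        state: present
        update_cache: yes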
In this example, state: present could be replaced with state: absent to remove the
package if found. The update_cache: yes setting performs the equivalent of apt-get
update before attempting the installation.
Operations people tend to think in terms of a hierarchy of infrastructure layers and associated
task-domains:
People who come to operations from software development tend to have a looser perspective on
how these terms should be used. They tend to use the term deployment for anything that is not
orchestration. They make the strongest distinction between "things you need to do to make a
system ready for testing/use" and "adjustments the system needs to make automatically, or that
you may be asked to make for it."
People also use the phrase "configuration management" when describing IT automation tools in
general. This can mean one of two things:
Statelessness
Automation works best when applications can be made stateless. This means that redeploying
them in place does not destroy, or lose track of, data that users or operators need.
Not stateless - An app that saves important information in files, or in a database on the
local file system.
Stateless - An app that persists its state to a separate database, or that provides a service
that requires no memory of state between invocations.
The discussion of full-stack (infrastructure + applications) automation in this topic assumes that
the applications being discussed are stateless and/or that you, the developer, have figured out
how to persist state in your application so that your automation can work non-destructively.
Examples of Statelessness and Statefulness
Stateless / No state to store - This app requires only atomic/synchronous interactions between
client and server. Each request from client to server returns a result wholly independent of prior
and subsequent requests. An example of this application is a public web server that returns an
HTML page, image, or other data on request from a browser. The application can be scaled by
duplicating servers and data behind a simple load balancer.
Stateless / State stored in database - User state is stored in a database accessible to any
webserver in the middle tier. An example of this application is a web server that needs to be
aware of the correspondence between a user ID and user cookie. New webservers and copies of
the website can be added freely without disrupting user sessions in progress and without
requiring that each request from a given user be routed to the specific server that maintains
their session.
Stateful / State stored on server - A record of user state must be maintained across a series of
transactions. An example of this application is a website that requires authentication. The app is
not allowed to serve pages to a user who is not logged in. User state is typically persisted by
giving the client an identifying cookie that is returned to the server with each new request and
used to match an ID stored there. This application cannot be scaled just by adding servers. If a
logged-in user is routed to a server that has not stored an ID matching the user's cookie, that
server will not recognize them as being logged in, and will refuse their request.
Apps that need to maintain state are inconvenient candidates for full-stack automation, because
state will be destroyed by an ad hoc rebuild of their supporting infrastructure. They also cannot
be efficiently migrated away from one pool of resources (for example, one set of application
servers or hosts) to another.
7.4.4 Popular Automation Tools
The first modern automation tool was probably Puppet, introduced in 2005 as open source, and
then commercialized as Puppet Enterprise by Puppet Labs in 2011.
Currently, the most popular tools are Ansible, Puppet, and Chef. They share the following
characteristics:
Many other solutions also exist. Private and public cloud providers often endorse their own tools
for use on their platforms, for example, OpenStack's HEAT project, and AWS' CloudFormation.
Other solutions, many aimed at the fast-growing market for container orchestration, pure
infrastructure-as-code, and continuous delivery of infrastructure+applications, include SaltStack
and Terraform.
7.4.5 Ansible
Ansible is probably the most broadly popular of current automation solutions. It is available as
open source, and in a version with added features from IBM/Red Hat that is called Ansible
Tower. Its name comes from the novels of speculative fiction author Ursula K. LeGuin, in which
an "ansible" is a future technology enabling instant communication at cosmic distances.
Ansible's basic architecture is very simple and lightweight.
Ansible's control node runs on virtually any Linux machine running Python 2 or 3, including a
laptop, a Linux VM residing on a laptop of any kind, or on a small virtual machine adjacent to
cloud-resident resources under management. All system updates are performed on the control
node.
The control node connects to managed resources over SSH. Through this connection, Ansible
can:
Run shell commands on a remote server, or transact with a remote router or other
network entity via its REST interface.
Inject Python scripts into targets and remove them after they run.
Install Python on target machines if required.
Plugins enable Ansible to gather facts from and perform operations on infrastructure that
cannot run Python locally, such as cloud provider REST interfaces.
Ansible is substantially managed from the Bash command line, with automation code developed
and maintained using any standard text editor. Atom is a good choice, because it permits easy
remote work with code stored in nested systems of directories.
Ansible Architecture
Installing Ansible
The Ansible control node application is installed on a Linux machine (often a virtual machine)
from its public package repository. To install Ansible on your workstation, refer to the
installation documentation appropriate to your device.
In the Ansible code structure, work is separated into YAML (.yml) files that contain a sequence
of tasks, executed in top-down order. A typical task names and parameterizes a module that
performs work, similar to a function call with parameters.
Ansible has hundreds of pre-built Python modules that wrap operating-system-level functions
and meta-functions. Some modules like raw only do one thing; they present a command in string
form to the shell, capture a return code and any console output, and return it in accessible
variables. The module apt can be used to install, remove, upgrade, and modify individual
packages or lists of packages on a Linux web server running a Debian Linux variant. If you want
to learn more, the apt documentation will give you a sense of the scope and power of Ansible
modules.
An Ansible playbook (or "series of plays") can be written as a monolithic document with a series
of modular, named tasks. More often, developers will build a model of a complex DevOps task
out of low-level playbook task sequences (called "roles"), then reference these in higher-level
playbooks, sometimes adding additional tasks at the playbook level.
Clarity - Given a little context, almost anyone can interpret a higher-level playbook
referencing clearly-named roles.
Reusability and shareability - Roles are reusable, even though they may be fairly closely
bound to infrastructure specifics. Roles are also potentially shareable. The Ansible project
maintains a repository for open source role definitions, called Ansible Galaxy.
Example playbook
The following sample playbook builds a three-node Kubernetes cluster on a collection of servers:
It installs Python 2 on all servers and performs common configuration steps on all nodes,
via a role called configure-nodes, installing the Kubernetes software and Docker as a
container engine, and configuring Docker to work with Kubernetes. The actual Ansible
commands are not shown.
It designates one node as master, installing the Weave container network (one of many
network frameworks that work with Kubernetes), and performing wrapup tasks.
It joins the k8sworker nodes to the k8smaster node.
The statement become: true gives Ansible root privileges (via sudo) before attempting
an operation.
The line gather_facts: false in the first stanza prevents the automatic system-
facts interrogator from executing on a target machine before Python is installed. When
subsequent stanzas are executed, facts will be compiled automatically, by default.
---
- hosts: all
  become: true
  gather_facts: False
  tasks:
    - name: install python 2
      raw: test -e /usr/bin/python || (apt -y update && apt install -y python-minimal)

- hosts: all
  become: true
  roles:
    - configure-nodes

- hosts: k8smaster
  become: true
  roles:
    - create-k8s-master
    - install-weave
    - wrapup

- hosts:
    - k8sworker1
    - k8sworker2
  become: true
  roles:
    - join-workers
...
Ansible projects are typically organized in a nested directory structure as shown below. The
hierarchy is easily placed under version control and used for GitOps-style infrastructure as code.
For an example, refer to “Directory Layout” in the Ansible documentation.
Inventory files - Also called hostfiles. These organize your inventory of resources (e.g.,
servers) under management. This enables you to aim deployments at a sequence of
environments such as dev, test, staging, production. For more information about
inventory files, refer to “How to build your inventory” in the Ansible documentation.
Variable files - These files describe variable values that are pertinent to groups of hosts
and individual hosts.
Library and utility files - These optional files contain Python code for custom modules
and the utilities they may require. You may wish to write custom modules and utilities
yourself, or obtain them from Ansible Galaxy or other sources. For example, Ansible
ships with a large number of modules already present for controlling main features of
Cisco ACI, but also provides tutorials on how to compose additional custom modules for
ACI features currently lacking coverage.
Main playbook files - Written in YAML, these files may reference one another, or
lower-level roles.
Role folders and files - Each role folder tree aggregates resources that collectively
enable a phase of detailed configuration. A role folder contains a /tasks folder with
a main.yml tasks file. It also contains a folder of asynchronous handler task files. For
more information about roles, refer to “Roles” in the Ansible documentation.
Ansible at scale
Ansible's control node is designed to sit close to the infrastructure that it manages. For example,
it may reside on a VM, or in a container, running in the same subnet as managed resources.
Enterprises and organizations with many hosts under management tend to operate many Ansible
control nodes, distributing across infrastructure pools as required.
If you are not doing rapid-fire continuous delivery, Ansible control nodes do not even need to be
maintained between deployments. If your Ansible deployment code is stored in version control,
control nodes can be launched or scratch-built as needed.
There are scaling challenges for large organizations, such as managing and controlling access to
many Ansible nodes flexibly and securely. This also includes putting remote controllers
seamlessly and safely under control of centralized enterprise automation. For this, there are two
control-plane solutions:
The commercial Red Hat Ansible Tower product provides a sophisticated web interface,
REST API, and rich, role-based access control options.
The open-source AWX project, a feature-comparable alternative of which Ansible Tower is a
value-added distribution. AWX, however, is said to represent a development branch that
undergoes minimal testing and is not made available as signed binaries, which may be a problem
for many enterprises.
Continuous delivery around Ansible deployment can be performed with any general-purpose
CI/CD automation tool such as Jenkins or Spinnaker. Larger, more complex projects often use
the Zuul open source gating framework, originally developed by the OpenStack Project and spun
off independently in 2018. The AWX Project, among many others, is a Zuul-gated project.
Larger-scale Ansible implementations will also benefit from Ansible Vault, a built-in feature that
enables encryption of passwords and other sensitive information. It provides a straightforward
and easily-administered alternative to storing sensitive information in playbooks, roles, or
elsewhere as plaintext.
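As a rough sketch of how Vault is used (the file name here is only an illustration, not part of this walkthrough), you might encrypt a variables file and then supply the vault password at deploy time:
ansible-vault encrypt group_vars/all/vault.yml
ansible-vault view group_vars/all/vault.yml
ansible-playbook -i inventory site.yml --ask-vault-pass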
Cisco Ansible resources
Cisco and the Ansible community maintain extensive libraries of Ansible modules for
automating Cisco compute and network hardware including:
This exercise will let you view the structure of a simple Ansible playbook, which retrieves
information about the container the demo environment resides in. Note that Ansible normally
uses ssh to connect with remote hosts and execute commands. In this example, a line in the top-
level playbook.yml file instructs Ansible to run this playbook locally, without requiring ssh.
connection: local
Normally, Ansible is used to perform deployment and configuration tasks. For example, you
might use it to create a simple website on a remote host. Let's see how this might work.
Prerequisites
You can walk through this exercise yourself, or you can simply read along. If you want to
complete this exercise, you will need:
A target host running a compatible operating system (such as Ubuntu 18.04 server)
SSH and keywise authentication configured on that host
Ansible installed on your local workstation
This is typically how new virtual machines are delivered on private or public cloud frameworks.
For the target host, you can create one using a desktop virtualization tool like VirtualBox.
For the purposes of this exercise, the target machine's (DNS-resolvable) hostname is
simply target. If you have your own target host set up and configured, substitute your host
name when you create your files.
With your target machine SSH-accessible, begin building a base folder structure for your Ansible
project.
mkdir myproject
cd myproject
Within this folder, you will create three things:
An inventory file, containing information about the machine(s) on which you want to
deploy.
A top level site.yml file, containing the most abstract level of instructions for carrying
out your deployment.
A role folder structure to contain your webserver role.
touch inventory
touch site.yml
mkdir roles
cd roles
ansible-galaxy init webservers
Your inventory file for this project can be very simple. Make it the DNS-resolvable hostname of
your target machine:
[webservers]
target # can also be IP address
You are defining a group called webservers and putting your target machine's hostname (or IP)
in it. You could add new hostnames/IPs to this group block, or add additional group blocks, to
assign hosts for more complex deployments. The name webservers is entirely arbitrary. For
example, if you had six servers and wanted to configure three as webservers and three as
database servers, your inventory might look like this:
[webservers]
target1
target2
target3
[dbservers]
target4
target5
target6
You don't actually need to create a common group, because Ansible provides means to apply a
common configuration to all servers in an inventory, which you'll see in a moment.
Creating your top level playbook file
A top-level playbook typically describes the order, permissions, and other details under which
lower-level configuration actions, defined in roles, are applied. For this example
project, site.yml looks like this:
---
- hosts: webservers
  become: true
  roles:
    - webservers
site.yml identifies which hosts you want to perform an operation on, and which roles you want
to apply to these hosts. The line become: true tells Ansible that you want to perform the roles
as root, via sudo.
Note that instead of hosts: webservers, you could apply this role to all target hosts (which, in
this case, would work fine, because you only have one target) by substituting the line:
- hosts: all
The next step is to create the role that installs and configures your web server. You've already
created the folder structure for the role using ansible-galaxy. Code for the role is
conventionally contained in a file called main.yml in the role's /tasks directory. Edit
the roles/webservers/tasks/main.yml file directly, to look like this:
---
- name: Perform updates and install apache2
  apt:
    name: apache2
    state: present
    update_cache: yes

- name: Insert new homepage index.html
  copy:
    src: index.html
    dest: /var/www/html
    owner: myname
    mode: '0444'
This role performs two tasks:
Deploy Apache2.
Copy a new index.html file into the Apache2 HTML root, replacing the
default index.html page.
In the apt: stanza, you name the package, its required state, and instruct the apt module to
update its cache. You are basically performing a sudo apt update before the installation
happens.
In the second stanza, Ansible's copy routine moves a file from your local system to a directory
on the target and also changes its owner and permissions. This is the equivalent of:
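Roughly the following manual commands run on the target host (an illustrative approximation, using the owner and mode from the task above):
sudo cp index.html /var/www/html/index.html
sudo chown myname /var/www/html/index.html
sudo chmod 0444 /var/www/html/index.html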
Of course, you'll need to create a new index.html file as well. The Ansible copy command
assumes that such files will be stored in the /files directory of the role calling them, unless
otherwise specified. Navigate to that directory and create the following index.html file, saving
your changes afterward:
<html>
<head>
<title>My Website</title>
</head>
<body>
<h1>Hello!</h1>
</body>
</html>
Now you're ready to run your deployment. From the top level directory of your project, you can
do this with the statement:
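A typical invocation (assuming the inventory file and site.yml names used above) looks like this:
ansible-playbook -i inventory site.yml --ask-become-pass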
If all is well, Ansible should ask you for your BECOME password (sudo password), then return
results similar to the following:
BECOME password:
PLAY [webservers] **************************************************************
TASK [Gathering Facts] *********************************************************
ok: [192.168.1.33]
TASK [webservers : Perform updates and install apache2] ***********************
changed: [192.168.1.33]
TASK [webservers : Insert new homepage index.html] *****************************
changed: [192.168.1.33]
PLAY RECAP *********************************************************************
192.168.1.33 : ok=3 changed=2 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
And now, if you visit the IP address of your target machine in a browser, you should see your
new homepage.
Note that Ansible gives back a full report on each execution, noting whether a step was actually
performed, or whether Ansible determined that its desired goal was already reached (In that case,
nothing happened, but the step was considered to have completed 'ok'.). This is an example of
how Ansible maintains idempotency. You can typically run an Ansible deployment as many
times as needed without putting a target system into an unknown state.
Let's walk through the example as if it were part of a CI/CD pipeline.
A developer collaborating with you on GitHub commits a change to the website, such as in
the index.html file.
Next, automated checks in the repository run syntax and sanity tests, as well as code review
rules, against the pull request.
If the checks pass: the commit is accepted and the CI/CD server is notified to run its tests.
If the checks fail: the commit is rejected based on the failed checks and the developer is asked to resubmit.
Next, the CI/CD system, such as Jenkins, prepares an environment and runs predefined tests
against the Ansible playbook. The pipeline should state the Ansible version it expects each time
and install it. Here's an example pipeline:
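As a rough sketch, the steps such a pipeline stage might run, expressed as shell commands (the pinned version, lint step, and inventory names are placeholders, not part of this course's pipeline):
python3 -m pip install "ansible==2.9.6"                   # pin the expected Ansible version
ansible-playbook --syntax-check -i inventory site.yml      # basic syntax check
ansible-lint site.yml                                      # optional static checks
ansible-playbook -i staging_inventory site.yml --check     # dry run against a staging inventory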
After Jenkins finishes running the job, you can get a notification that everything is ready for
staging, and you can then push these changes to production with another pipeline. This is the
power of combining version control, Ansible, and promotion through multiple environments
using CI/CD.
In this lab, you will explore the fundamentals of how to use Ansible to automate some basic
device management tasks. First, you will configure Ansible in your DEVASC VM. Next, you will
use Ansible to connect to the CSR1000v and back up its configuration. Finally, you will
configure the CSR1000v with IPv6 addressing.
In this lab, you will first configure Ansible so that it can communicate with a webserver
application. You will then create a playbook that will automate the process of installing Apache
on the webserver. You will also create a customized playbook that installs Apache with specific
instructions.
Puppet was founded as open source in 2005 and commercialized as Puppet Enterprise by Puppet
Labs in 2011.
Operators communicate with the Puppet Server largely via SSH and the command line.
The Puppet Server can be a VM or even a Docker container for small self-teaching
implementations, and Puppet provides a compact Docker install for this purpose, called
Pupperware. Standard packages are available for building a Puppet Server on Linux, which
is currently the only option for a Server install. Puppet Agents (also called Clients) are available
for Linux, Windows, and MacOS.
Puppet Architecture
Installing Puppet
Puppet Server requires fairly powerful hardware (or a big VM), and also requires a Network
Time Protocol client to be installed, configured, and tested. You can find a wide range of how-to
blog posts about Puppet installation, but the Puppet project is fast-moving and third-party posts
and articles may be outdated or unreliable.
When you have Puppet Server running, you can begin installing Puppet Agents on hosts you
wish to manage. The agents will then need to have the puppet.conf file configured to
communicate with a Puppet Server. After the client service is started, it will have its certificate
signed by the server. The Server will now be able to gather facts from the client and update the
client state with any configuration changes.
Like Ansible, Puppet lets you store components of a project or discrete configuration in a
directory tree (/etc/puppetlabs/code/environments). Subsidiary folders are created
according to the configuration in puppet.conf, or by the operator.
To begin a small project, you might create a folder inside this directory, and then within that
folder, create another called manifests, where you would store the manifest files declaring
operational classes. These are units of code describing a configuration operation. Manifest files
typically end in the extension .pp, and are written in Puppet's declarative language, which looks
something like Ruby, and was inspired by the Nagios configuration file format.
Like Ansible and other configuration tools, Puppet provides a host of resources that can be
invoked to define configuration actions to be performed on hosts and connected infrastructure.
The basic idea is very similar to Ansible's practice, where a class that invokes a resource will be
parameterized to function idempotently, and will be applied in context to produce the same
desired result every time it runs.
Puppet comes with a set of basic resources (templates for performing configuration actions) built
in. Many additional resources for performing all sorts of operations on all kinds of infrastructure
can be downloaded and installed from Puppet Forge using the puppet module command.
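For example, a widely used Apache module can be pulled from the Forge with a single command (the module name here is just an illustration):
puppet module install puppetlabs-apache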
Puppet at scale
The first recommended step to accommodate more hosts is to create additional "compile
masters," which compile catalogs for client agents; these masters are placed behind a load
balancer to distribute the work.
Puppet Enterprise customers can further expand capacity by replacing PuppetDB with a stand-
alone, customized database called PE-PostgreSQL. The Puppet Enterprise product offers many
other conveniences as well, including a web-based console that provides access to reports and
logs, and enables certain kinds of point-and-click configuration.
Cisco and the Puppet community maintain extensive libraries of modules for automating Cisco
compute and network hardware including:
This example describes how to install Puppet and then use Puppet to install Apache 2 on a
device. You can simply read along to better understand Puppet.
This approximates the normal workflow for Puppet operations in an automated client/server
environment. Note that modules can be completely generic and free of site-specific information,
then separately and re-usably invoked to configure any number of hosts or infrastructure
components. Because modules and manifests are composed as text files, they can easily be stored
in coordinated fashion in a version control repository, such as Git.
Puppet Server requires fairly powerful hardware (or a big VM), and also requires a Network
Time Protocol client to be installed, configured, and tested. Instructions for installing the server
can be found in Puppet's documentation.
When you have Puppet Server running, you can install Puppet Agents on a host. For example, on
a Debian-type Linux system, you can install Puppet Agent using a single command:
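On Debian or Ubuntu, that command is typically something like the following (the exact package name depends on the release and on whether Puppet's package repository has been added):
sudo apt-get install -y puppet-agent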
Modify puppet.conf
When installed, the Puppet Agent needs to be configured to seek a Puppet Server. Add the
following lines to the file /etc/puppet/puppet.conf:
[main]
certname = puppetclient
server = puppetserver
environment = production
runinterval = 15m
This tells the Client the hostname of your server (resolved via /etc/hosts) and the name of the
authentication certificate that you will generate in the next step.
Certificate signing
Puppet Agents use certificates to authenticate with the server before retrieving their
configurations. When the Client service starts for the first time, it sends a request to its assigned
server to have its certificate signed, enabling communication.
On the Server, issuing the puppetserver ca list command returns a list of pending certificate requests:
Requested Certificates:
puppetclient
(SHA256) 44:9B:9C:02:2E:B5:80:87:17:90:7E:DC:1A:01:FD:35:C7:DB:43:B6:34:6F:1F:CC:DC:C2:E9:DD:72:61:E6:B2
You can then sign the certificate, enabling management of the remote node:
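With current Puppet releases, the signing command looks like this (the certname matches the one configured in puppet.conf above):
sudo puppetserver ca sign --certname puppetclient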
The response:
The Server and Client are now securely bound and able to communicate. This will enable the
Server to gather facts from the Client, and let you create configurations on the Server that are
obtained by the client and used to converge its state (every 15 minutes).
Creating a configuration
Like Ansible, Puppet lets you store components of a project or discrete configuration in a
directory tree:
/etc/puppetlabs/code/environments
Subsidiary folders are created according to the configuration in puppet.conf or by the operator.
In this example, having declared environment = production in puppet.conf, Puppet has
already created a directory for this default site, containing a modules subdirectory in which we
can store subsidiary projects and manifests for things we need to build and configure.
/etc/puppetlabs/code/environments/production/modules
You will now install Apache2 on your managed client. Puppet operations are typically performed
as root, so become root on the Server temporarily by entering:
sudo su -
cd /etc/puppetlabs/code/environments/production/modules
mkdir -p apache2/manifests
cd apache2/manifests
Inside this manifests folder, create a file called init.pp, which is a reserved filename for the
initialization step in a module:
class apache2 {
  package { 'apache2':
    ensure => installed,
  }
  service { 'apache2':
    ensure => true,
    enable => true,
    require => Package['apache2'],
  }
}
Step 1. Invoke the package resource to install the named package. If you wanted to remove the
package, you could change ensure => installed to read ensure => absent.
Step 2. Invoke the service resource to run if its requirement (in this case, that Apache2 is
present) is met. Instruct it to ensure that the service is available, and then enable it to restart
automatically when the server reboots.
Next, navigate to the environment's top-level manifests directory:
cd /etc/puppetlabs/code/environments/production/manifests
There, create a site.pp file that invokes the module and applies it to the target machine:
node 'puppetclient' {
  include apache2
}
Deploying the configuration
Restarting the Puppet Server will now cause the manifests to be compiled and made
available to the Puppet Agent on the named device. The agent will retrieve and apply
them, installing Apache2 with the next update cycle:
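On a systemd-based server, the restart can be done with a command like this (assuming the standard puppetserver service name):
sudo systemctl restart puppetserver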
For development and debugging, you can invoke Puppet Agent on a target machine (in
this case our Puppet Client machine):
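As the output below shows, the command is:
puppet agent -t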
The agent will immediately interrogate the server, download its catalog (the
configurations that reference it) and apply it. The results will be similar to the following:
root@target:/etc/puppetlabs/code/environments/production/manifests# puppet agent -t
Info: Using configured environment 'production'
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Retrieving locales
Info: Caching catalog for puppetagent
Info: Applying configuration version '1575907251'
Notice: /Stage[main]/Apache2/Package[apache2]/ensure: created
Notice: Applied catalog in 129.88 seconds
After the application has been successfully deployed, enter the target machine's IP address in
your browser. This should bring up the Apache2 default homepage.
7.4.11 Chef
Chef provides a complete system for treating infrastructure as code. Chef products are partly
licensed, but free for personal use (in Chef Infra Server's case, for fewer than 25 managed
nodes).
Chef's products and solutions enable infra-as-code creation, testing, organization, repository
storage, and execution on remote targets, either from a stand-alone Chef Workstation, or
indirectly from a central Chef Infra Server. You should be aware of the main Chef components:
Chef Workstation - A standalone operator workstation, which may be all that smaller
operations need.
Chef Infra Client (the host agent) - Chef Infra Clients run on hosts and retrieve
configuration templates and implement required changes. Cookbooks (and proxy Clients)
enable control of hardware and resources that cannot run a Chef Infra Client locally (such
as network devices).
Chef Infra Server - Responds to queries from Chef Infra Clients on validated hosts with
configuration updates, which the Clients then use to converge host configuration.
Most configuration tasks can also be carried out directly between Chef Workstation and
managed nodes and devices.
Chef Architecture
Chef provides hundreds of resources for performing common configuration tasks in idempotent
fashion, as well as Chef Supermarket, a community-maintained sharing site for Cookbooks,
custom resources, and other solutions.
Code is maintained in a local repository format called chef-repo, which can be synchronized
with Git for enterprise-wide infra-as-code efforts. Within a repo, code is organized in
"cookbook" folders, comprising "recipes" (actual linear configuration code, written in Chef's
extended Ruby syntax), segregated attributes, custom resources, test code, and metadata.
Chef's domain-specific language (DSL) enables you to address configuration tasks by authoring
a sequence of small, bracketed templates, each of which declares a resource and parameterizes it.
Chef resources tend to be more abstract than Ansible's or Puppet's, which helps address cross-
platform concerns. For example, the package resource can determine the kind of Linux, MacOS,
or Windows environment that it is running on and complete a required installation in a platform-
specific way.
Installing Chef Workstation
To begin using Chef, a good first step is to install Chef Workstation, which provides a complete
operations environment. Workstation is available for Linux and Windows. Refer to the Chef
Workstation downloads page for more information.
When Workstation is installed, you can use it immediately to start making configuration changes
on accessible hosts. Some node preparation is helpful before trying to manage a target node with
Chef. You should configure SSH keywise access to the host rather than using passwords. And it
helps (if you are not running DNS) to include the IP address and hostname of the target machine
in your Workstation machine's /etc/hosts file.
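For example, a single line like the following in /etc/hosts is enough (the address shown is a placeholder; substitute your node's real IP and hostname):
192.0.2.50    target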
Running Chef at scale
Chef Infra Server was rewritten several years back in Erlang, to increase its capacity, enabling
management of up to about 10,000 hosts. It can be configured for high availability by deploying
its front-end services (including NGINX and stateless application logic) into an array of load-
balanced proxies, which connect to a three-server active/active cluster supporting back-end
services like elasticsearch, etcd, and PostgreSQL.
Chef also provides an array of products that together solve most of the problems enterprises face
in dealing with increasingly-complex, large-scale, hybrid infrastructures. Its on-
Workstation chef-repo structures harmonize with Git, enabling convenient version control and
collaboration on DevOps code, and simplifying transitions to infra-as-code regimes. Its core
philosophy of continuous configuration management dovetails well with the goal of continuous
IT delivery.
Chef's testing framework Test Kitchen, pre-deployment simulators, and the companion auditing
and security assessor InSpec provide the rest of a purpose-built DevOps test-driven development
framework.
Cisco Chef Resources
Cisco has developed modified Chef Infra Agents that run in the guest shell of NX-OS switch
equipment, enabling this hardware to work with Chef as if it were a managed host. Cisco has
also developed, and maintains, a Cisco Chef Cookbook for NX-OS infrastructure, available on
Chef Supermarket.
A GitHub public repo of cookbook and recipe code is also maintained, to enable control of a
wide range of Cisco products.
Cisco UCS infrastructure is easily managed with Chef, via a cookbook enabling integration with
Integrated Management Controllers. Management via UCS Manager and Intersight is possible
via Python and/or PowerShell SDKs.
7.4.12 Chef Example - Install and Use Chef
This example describes how to install Chef and use it to install Apache 2 on a device. You can
simply read along to better understand Chef.
Installing Chef Workstation
Chef Workstation provides a complete operations environment. Workstation is available for
Linux and Windows. The following example assumes you're installing on an Ubuntu 18.04 LTS
virtual machine.
If your machine is set up with a standard desktop, you can browse to the Chef Workstation
downloads page, find the download for Ubuntu 18.04, and install it automatically with the
Debian package manager.
Alternatively, you can install from the command line by copying the URL of the .deb package
and using the following steps:
wget https://packages.chef.io/files/stable/chef-workstation/0.12.20/ubuntu/18.04/chef-workstation_0.12.20-1_amd64.deb
sudo dpkg -i chef-workstation_0.12.20-1_amd64.deb
Basic configuration management
After Workstation is installed, you can use it immediately to start making configuration changes
on accessible hosts. You will use the chef-run command for this. It is a subsystem that takes
care of bootstrapping a Chef Infra Agent onto the target host and then executes whatever
commands you reference in files or provide in arguments.
The first time you use chef-run (or other Chef tools), you may be asked to accept licensing
terms (type yes) for the utility you're using, or for subsystems it invokes.
For the first configuration exercise, you will provide the information Chef needs to install
the ntp package. In the process, you will provide the remote username, their sudo (become root)
password, the name of the remote host target (This is in your /etc/hosts file. Otherwise, you
would use its IP address here.) and the name of the resource verb:
chef-run -U myname -sudo <password> target package ntp action=install
Chef connects to the node, initially via SSH, and bootstraps the Chef Infra Client onto it (if not
already present). This can take a while, but chef-run helpfully shows you activity indicators.
When the client is installed, the task is handed to it, and the process completes. You get back
something that looks like this:
[✔] Packaging cookbook... done!
[✔] Generating local policyfile... exporting... done!
[✔] Applying package[ntp] from resource to target.
└── [✔] [target] Successfully converged package[ntp].
Note the vocabulary Chef uses to describe what it is doing. The configuration action that you
request is treated as a policy that partially describes the target machine's desired state. Chef does
what is required to make that policy happen, converging the machine to its desired state, where
NTP is installed and its time-synchronization service is activated by default.
Installing Chef Infra Client
Chef Infra Client runs locally on conventional compute nodes. It authenticates to Chef Infra
Server using public keypairs, which are generated and signed when a node is registered with a
Chef Infra Server. This ensures that rogue nodes cannot request configuration information from
the Server. Communications between authorized Clients and their Server are safely encrypted
with TLS.
Chef Infra Client includes a discovery subsystem called Ohai, which collects system facts and
uses them to determine whether (and how) the target system has drifted from its configuration,
and needs to be converged.
Chef Workstation can bootstrap Infra Client onto target nodes. You can also preinstall Infra
Client on nodes, for example, while creating new nodes on a public cloud. Below is an example
script you might run on a target host to do this. Note that user data scripts run as root, so sudo is
not required here. However, if you log into a remote host manually as a user (perhaps in
the sudoers group) rather than as root, you would need to assume root privileges (using sudo
su -) before creating and running this script locally.
The script uses a Chef-provided installer called Omnitruck to do this. The Omnitruck shell script
figures out which kind of Linux distribution you are using and otherwise enables a safe,
predictable installation of Chef software (you can also use it to install other Chef products). A
Windows version of this script is also available that runs on PowerShell:
#!/bin/bash
apt-get update
apt-get install -y curl
curl -L https://omnitruck.chef.io/install.sh | bash -s once -c current -p chef
Note that the parameters shown above will install the latest version of the Chef client, and do not
pin the version. This is dangerous for production work, because it permits updates to occur
without warning, possibly introducing an incompatibility between the Client and Workstation or
Server. The -v option lets you install a specific version of the Client, and pins it automatically.
Bootstrapping a node with Chef installs the latest compatible version and pins it.
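For instance, a pinned install with the Omnitruck script might look like this (the version number shown is only an example):
curl -L https://omnitruck.chef.io/install.sh | sudo bash -s -- -v 15.5.17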
Chef Infra Server prerequisites
Chef Infra Server stores configuration and provides it to Clients automatically, when polled,
enabling Clients to converge themselves to a desired state. Downloadable packages are listed and
linked on the Chef downloads page. The server is free to use for fewer than 25 hosts.
Before installing Chef Infra Server, install openssh-server and enable keywise access. You also
need to install NTP for time synchronization. You can do this with Chef, or you can do it
manually:
sudo apt-get install ntp ntpdate net-tools
On an Ubuntu system, turn off the default timedatectl synchronization service to prevent it
from interfering with NTP synchronization:
sudo timedatectl set-ntp 0
After NTP is installed, ensure that it is synchronizing with a timeserver in its default pool. This
may take a few minutes, so repeat the command until you see something like this:
ntpstat
synchronised to NTP server (198.60.22.240) at stratum 2
time correct to within 108 ms
polling server every 256 s
When this shows up, you can install Chef Infra Server.
Installing Chef Infra Server
To install Chef Infra Server on Ubuntu 18.04, you can perform steps similar to the manual
Workstation install, above, after obtaining the URL of the .deb package. At time of writing, the
current stable version was 13.1.13-1.
wget https://packages.chef.io/files/stable/chef-server/13.1.13/ubuntu/18.04/chef-server-core_13.1.13-1_amd64.deb
sudo dpkg -i chef-server-core_13.1.13-1_amd64.deb
After Chef Infra Server is installed, issue the following command to tell it to read its default
configuration, initialize, and start all services. This is a long process, and is done by Chef itself,
giving you an opportunity to see some of Chef's logic used to apply configuration details in
converging a complex application to a desired state:
sudo chef-server-ctl reconfigure
The configuration process may initially fail on low-powered or otherwise constrained VMs,
but because this is Chef (thus idempotent) it can be run more than once in an attempt to get it to
complete. This sometimes works. If it does not work, it is a sign that you are trying to run on
hardware (or virtualware) that is not powerful enough, and should be upgraded before trying
again.
When chef-server-ctl begins the reconfigure process on an initial run, you will be asked to
accept several product licenses. Type yes at the prompt.
Create an Infra Server user:
sudo chef-server-ctl user-create <username> <firstname> <lastname> <email> '<password>' --filename <key_file_path_and_name.pem>
Provide your own preferred user information for the <>-bracketed terms (removing
the <> brackets from your actual responses) and include a password. The argument to --
filename provides a pre-existing and accessible path and filename for the .pem private key file
that Chef will create for this user. This key file will need to be downloaded from the server and
established on the Workstation to enable server management. It makes sense to store this key in a
folder that is readily accessible from your OS user's (myname's) home
directory. IMPORTANT: Remember the key file path and filename!
Next, you create an organization, which is a structure Chef uses to isolate different bodies of
configuration policy from one another. These can be actual organizations, or a concept more like
'sites'. Chef issues each organization an RSA key which is used to authenticate hosts to Server
and organization, thus providing a multi-tenant infrastructure.
Provide a short name for the organization, a full descriptive name, the username you created in
the last step to associate with the organization, and an accessible path to store the validation key.
By convention (though this is optional) the key can be called <ORGANIZATION>-validator.pem:
sudo chef-server-ctl org-create <short_name> '<full_organization_name>' --association_user <username_you_just_created> --filename <key_file_path_and_name.pem>
It makes sense to store this key in the same directory that you used to store the user key
generated in the prior step.
Install Chef-Manage
You can also install the web interface for Chef server. This can be done by entering:
sudo chef-server-ctl install chef-manage
When the process completes, restart the server and manage components. These are Chef
operations, and may take a while, as before.
sudo chef-server-ctl reconfigure
(lots of output)
sudo chef-manage-ctl reconfigure --accept-license
(lots of output)
The argument --accept-license prevents chef-manage-ctl from stopping to ask you about
the unique licenses for this product. When this process is complete, you can visit the console in a
browser via https://<IP_OF_CHEF_SERVER>. Note that most browsers will return an error about
the server's self-signed certificate, and you will need to give permission to connect. Use the
username/password you created above with chef-server-ctl user-create.
Initially, there is not much to see, but this will change when you register a node.
Finish configuring Workstation
Before Chef Workstation can talk to your Infra Server, you need to do a little configuration. To
begin, retrieve the keys generated during Server configuration, and store them in the
folder /home/myname/.chef, created during Workstation installation:
cd /home/myname/.chef
scp myname@chefserver:./path/*.pem .
/path/ is the path from your (myname) home directory on the Server to the directory in which
the Server stored keys.
If you are not using keywise authentication to your Server, scp will ask for your password (your
original user password, not your Chef Server user password). The . after user@host: refers to
your original user's home directory, from which the path is figured. The wildcard expression
finds files ending in .pem at that path. The closing dot means "copy to the current working
directory" (which should be the .chef folder). Run the ls command from within
the .chef folder to see if your keys made it.
7.4.13 Chef Example - Prepare to Use Knife
Now you will use everything together to create an actual recipe, push it to the server,
and tell the target machine's client to requisition and converge on the new configuration.
This is similar to the way Chef is used in production, but on a smaller scale.
Take a look around the cookbook folder structure. There are folders already prepared
for recipes and attributes. Add an optional directory and subdirectory for holding files
your recipe needs:
mkdir -p files/default
cd files/default
Files in the /default subdirectory of a /files directory within a cookbook can be referenced in
recipes by file name alone; no paths are required.
vi index.html
<html>
<head>
<title>Hello!</title>
</head>
<body>
<h1>HELLO!</h1>
</body>
</html>
Save the file and exit, then navigate to the recipes directory, where Chef has already
created a default.rb file for you. The default.rb recipe is executed by default
when the cookbook is run.
cd ../../recipes
Add some stanzas to the default.rb file. Again, edit the file:
vi default.rb
#
# Cookbook:: apache2
# Recipe:: default
#
# Copyright:: 2019, The Authors, All Rights Reserved.

apt_update do
  action :update
end

package 'apache2' do
  action :install
end

cookbook_file "/var/www/html/index.html" do
  source "index.html"
  mode "0644"
end
Beneath the comment header, the recipe performs three actions. The first resource you are
invoking, apt_update, handles the apt package manager on Debian. You would use
this to force the equivalent of sudo apt-get update on your target server, before
installing the Apache2 package. The apt_update resource's action parameter can
take several other values, letting you perform updates only under controlled conditions,
which you would specify elsewhere.
The package function is used to install the apache2 package from public repositories.
Alternative actions include :remove, which would uninstall the package, if found.
Finally, you use the cookbook_file resource to copy the index.html file
from /files/default into a directory on the target server (Apache's default web root
directory). What actually happens is that the cookbook, including this file, gets copied
into a corresponding cookbook structure on the server, then served to the client, which
executes the actions. The mode attribute performs the equivalent of chmod 644 on
the file which, when it reaches its destination, makes it universally readable and
writable only by root.
Save the default.rb file, then upload the cookbook to the server:
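A typical upload command (assuming the cookbook is named apache2, as above) is:
knife cookbook upload apache2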
You can then confirm that the server is managing your target node:
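For example (the node name here assumes the target was registered as 'target'):
knife node list
knife node show target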
The Knife application can interoperate with your favorite editor. To enable this, perform
the following export with your editor's name:
export EDITOR=vi
This lets the next command execute interactively, putting the node definition into vi to let
you alter it manually.
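The command in question is the node editor (the node name 'target' is assumed); inside the editor, add "recipe[apache2]" to the node's run_list:
knife node edit target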
As you can see, the expression "recipe[apache2]" has been added into
the run_list array, which contains an ordered list of the recipes you want to apply to
this node.
Save the file in the usual manner. Knife immediately pushes the change to the Infra
Server, keeping everything in sync.
Finally, you can use the knife ssh command to identify the node, log into it non-
interactively using SSH, and execute the chef-client application. This causes the
node to immediately reload its state from the server (which has changed) and
implement the new runlist on its host.
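A typical invocation looks something like the following (the search query, SSH user, and use of sudo here are assumptions based on the setup above):
knife ssh 'name:target' 'sudo chef-client' -x myname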
In this case, you would need to provide your sudo password for the target machine,
when Knife asks for it. In a real production environment, you would automate this so you
could update many nodes at once, storing secrets separately.
If all goes well, Knife gives you back a very long log that shows you exactly the content
of the file that was overwritten (potentially enabling rollback), and confirms each step of
the recipe as it executes.
target
target Starting Chef Infra Client, version 15.5.17
target resolving cookbooks for run list: ["apache2"]
target Synchronizing Cookbooks:
target - apache2 (0.1.0)
target Installing Cookbook Gems:
target Compiling Cookbooks...
target Converging 3 resources
target Recipe: apache2::default
target * apt_update[] action update
target - force update new lists of packages
target * directory[/var/lib/apt/periodic] action create (up
to date)
target * directory[/etc/apt/apt.conf.d] action create (up to
date)
target * file[/etc/apt/apt.conf.d/15update-stamp] action
create_if_missing (up to date)
target * execute[apt-get -q update] action run
target - execute ["apt-get", "-q", "update"]
target
target * apt_package[apache2] action install
target - install version 2.4.29-1ubuntu4.11 of package
apache2
target * cookbook_file[/var/www/html/index.html] action create
target - update content in file /var/www/html/index.html from
b66332 to 3137ae
target --- /var/www/html/index.html 2019-12-10
16:48:41.039633762 -0500
target +++ /var/www/html/.chef-index20191210-4245-
1kusby3.html 2019-12-10 16:48:54.411858482 -0500
target @@ -1,376 +1,10 @@
target -
target -<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0
Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-
transitional.dtd">
target -<html xmlns="http://www.w3.org/1999/xhtml">
target - <!--
target - Modified from the Debian original for Ubuntu
target - Last updated: 2016-11-16
target - See: https://launchpad.net/bugs/1288690
target - -->
target - <head>
target - <meta http-equiv="Content-Type"
content="text/html; charset=UTF-8" />
target - <title>Apache2 Ubuntu Default Page: It
works</title>
target - <style type="text/css" media="screen">
target - * {
target - margin: 0px 0px 0px 0px;
target - padding: 0px 0px 0px 0px;
target - }
target -
target - body, html {
target - padding: 3px 3px 3px 3px;
target -
target - background-color: #D8DBE2;
target -
target - font-family: Verdana, sans-serif;
target - font-size: 11pt;
target - text-align: center;
target - }
target -
target - div.main_page {
target - position: relative;
target - display: table;
target -
target - width: 800px;
target -
target - margin-bottom: 3px;
target - margin-left: auto;
target - margin-right: auto;
target - padding: 0px 0px 0px 0px;
target -
target - border-width: 2px;
target - border-color: #212738;
target - border-style: solid;
target -
target - background-color: #FFFFFF;
target -
target - text-align: center;
target - }
target -
target - div.page_header {
target - height: 99px;
target - width: 100%;
target -
target - background-color: #F5F6F7;
target - }
target -
target - div.page_header span {
target - margin: 15px 0px 0px 50px;
target -
target - font-size: 180%;
target - font-weight: bold;
target - }
target -
target - div.page_header img {
target - margin: 3px 0px 0px 40px;
target -
target - border: 0px 0px 0px;
target - }
target -
target - div.table_of_contents {
target - clear: left;
target -
target - min-width: 200px;
target -
target - margin: 3px 3px 3px 3px;
target -
target - background-color: #FFFFFF;
target -
target - text-align: left;
target - }
target -
target - div.table_of_contents_item {
target - clear: left;
target -
target - width: 100%;
target -
target - margin: 4px 0px 0px 0px;
target -
target - background-color: #FFFFFF;
target -
target - color: #000000;
target - text-align: left;
target - }
target -
target - div.table_of_contents_item a {
target - margin: 6px 0px 0px 6px;
target - }
target -
target - div.content_section {
target - margin: 3px 3px 3px 3px;
target -
target - background-color: #FFFFFF;
target -
target - text-align: left;
target - }
target -
target - div.content_section_text {
target - padding: 4px 8px 4px 8px;
target -
target - color: #000000;
target - font-size: 100%;
target - }
target -
target - div.content_section_text pre {
target - margin: 8px 0px 8px 0px;
target - padding: 8px 8px 8px 8px;
target -
target - border-width: 1px;
target - border-style: dotted;
target - border-color: #000000;
target -
target - background-color: #F5F6F7;
target -
target - font-style: italic;
target - }
target -
target - div.content_section_text p {
target - margin-bottom: 6px;
target - }
target -
target - div.content_section_text ul,
div.content_section_text li {
target - padding: 4px 8px 4px 16px;
target - }
target -
target - div.section_header {
target - padding: 3px 6px 3px 6px;
target -
target - background-color: #8E9CB2;
target -
target - color: #FFFFFF;
target - font-weight: bold;
target - font-size: 112%;
target - text-align: center;
target - }
target -
target - div.section_header_red {
target - background-color: #CD214F;
target - }
target -
target - div.section_header_grey {
target - background-color: #9F9386;
target - }
target -
target - .floating_element {
target - position: relative;
target - float: left;
target - }
target -
target - div.table_of_contents_item a,
target - div.content_section_text a {
target - text-decoration: none;
target - font-weight: bold;
target - }
target -
target - div.table_of_contents_item a:link,
target - div.table_of_contents_item a:visited,
target - div.table_of_contents_item a:active {
target - color: #000000;
target - }
target -
target - div.table_of_contents_item a:hover {
target - background-color: #000000;
target -
target - color: #FFFFFF;
target - }
target -
target - div.content_section_text a:link,
target - div.content_section_text a:visited,
target - div.content_section_text a:active {
target - background-color: #DCDFE6;
target -
target - color: #000000;
target - }
target -
target - div.content_section_text a:hover {
target - background-color: #000000;
target -
target - color: #DCDFE6;
target - }
target -
target - div.validator {
target - }
target - </style>
target - </head>
target - <body>
target - <div class="main_page">
target - <div class="page_header floating_element">
target - <img src="/icons/ubuntu-logo.png" alt="Ubuntu
Logo" class="floating_element"/>
target - <span class="floating_element">
target - Apache2 Ubuntu Default Page
target - </span>
target - </div>
target -<!-- <div class="table_of_contents
floating_element">
target - <div class="section_header
section_header_grey">
target - TABLE OF CONTENTS
target - </div>
target - <div class="table_of_contents_item
floating_element">
target - <a href="#about">About</a>
target - </div>
target - <div class="table_of_contents_item
floating_element">
target - <a href="#changes">Changes</a>
target - </div>
target - <div class="table_of_contents_item
floating_element">
target - <a href="#scope">Scope</a>
target - </div>
target - <div class="table_of_contents_item
floating_element">
target - <a href="#files">Config files</a>
target - </div>
target - </div>
target --->
target - <div class="content_section floating_element">
target -
target -
target - <div class="section_header
section_header_red">
target - <div id="about"></div>
target - It works!
target - </div>
target - <div class="content_section_text">
target - <p>
target - This is the default welcome page used
to test the correct
target - operation of the Apache2 server after
installation on Ubuntu systems.
target - It is based on the equivalent page on
Debian, from which the Ubuntu Apache
target - packaging is derived.
target - If you can read this page, it means
that the Apache HTTP server installed at
target - this site is working properly. You
should replace this file (located at
target - <tt>/var/www/html/index.html</tt>)
before continuing to operate your HTTP server.
target - </p>
target -
target -
target - <p>
target - If you are a normal user of this web
site and don't know what this page is
target - about, this probably means that the
site is currently unavailable due to
target - maintenance.
target - If the problem persists, please
contact the site's administrator.
target - </p>
target -
target - </div>
target - <div class="section_header">
target - <div id="changes"></div>
target - Configuration Overview
target - </div>
target - <div class="content_section_text">
target - <p>
target - Ubuntu's Apache2 default
configuration is different from the
target - upstream default configuration, and
split into several files optimized for
target - interaction with Ubuntu tools. The
configuration system is
target - fully documented in
target
- /usr/share/doc/apache2/README.Debian.gz. Refer
to this for the full
target - documentation. Documentation for the
web server itself can be
target - found by accessing the <a
href="/manual">manual</a> if the <tt>apache2-doc</tt>
target - package was installed on this server.
target -
target - </p>
target - <p>
target - The configuration layout for an
Apache2 web server installation on Ubuntu systems is as follows:
target - </p>
target - <pre>
target -/etc/apache2/
target -|-- apache2.conf
target -| `-- ports.conf
target -|-- mods-enabled
target -| |-- *.load
target -| `-- *.conf
target -|-- conf-enabled
target -| `-- *.conf
target -|-- sites-enabled
target -| `-- *.conf
target - </pre>
target -
target -
target - <tt>apache2.conf</tt>
is the main configuration
target - file. It puts the pieces
together by including all remaining configuration
target - files when starting up the
web server.
target -
target -
target -
target - <tt>ports.conf</tt> is
always included from the
target - main configuration file.
It is used to determine the listening ports for
target - incoming connections, and
this file can be customized anytime.
target -
target -
target -
target - Configuration files in
the <tt>mods-enabled/</tt>,
target - <tt>conf-enabled/</tt> and
<tt>sites-enabled/</tt> directories contain
target - particular configuration
snippets which manage modules, global configuration
target - fragments, or virtual host
configurations, respectively.
target -
target -
target -
target - They are activated by
symlinking available
target - configuration files from
their respective
target - *-available/ counterparts.
These should be managed
target - by using your helpers
target - <tt>
target - a2enmod,
target - a2dismod,
target - </tt>
target - <tt>
target - a2ensite,
target - a2dissite,
target - </tt>
target - and
target - <tt>
target - a2enconf,
target - a2disconf
target - </tt>. See their
respective man pages for detailed information.
target -
target -
target -
target - The binary is called
apache2. Due to the use of
target - environment variables, in
the default configuration, apache2 needs to be
target - started/stopped with
<tt>/etc/init.d/apache2</tt> or <tt>apache2ctl</tt>.
target - Calling
<tt>/usr/bin/apache2</tt> directly will not work with the
target - default configuration.
target -
target -
target - </div>
target -
target - <div class="section_header">
target - <div id="docroot"></div>
target - Document Roots
target - </div>
target -
target - <div class="content_section_text">
target - <p>
target - By default, Ubuntu does not allow
access through the web browser to
target - <em>any</em> file apart of those
located in <tt>/var/www</tt>,
target - <a
href="http://httpd.apache.org/docs/2.4/mod/mod_userdir.html"
rel="nofollow">public_html</a>
target - directories (when enabled) and
<tt>/usr/share</tt> (for web
target - applications). If your site is using
a web document root
target - located elsewhere (such as in
<tt>/srv</tt>) you may need to whitelist your
target - document root directory in
<tt>/etc/apache2/apache2.conf</tt>.
target - </p>
target - <p>
target - The default Ubuntu document root is
<tt>/var/www/html</tt>. You
target - can make your own virtual hosts under
/var/www. This is different
target - to previous releases which provides
better security out of the box.
target - </p>
target - </div>
target -
target - <div class="section_header">
target - <div id="bugs"></div>
target - Reporting Problems
target - </div>
target - <div class="content_section_text">
target - <p>
target - Please use the <tt>ubuntu-bug</tt>
tool to report bugs in the
target - Apache2 package with Ubuntu. However,
check <a
target
- href="https://bugs.launchpad.net/ubuntu/+source/
apache2"
target - rel="nofollow">existing bug
reports</a> before reporting a new bug.
target - </p>
target - <p>
target - Please report bugs specific to
modules (such as PHP and others)
target - to respective packages, not to the
web server itself.
target - </p>
target - </div>
target -
target -
target -
target -
target - </div>
target - </div>
target - <div class="validator">
target - </div>
target - </body>
target +<html>
target +<head>
target +<title>Hello!</title>
target +</head>
target +<body>
target +<h1>HELLO!</h1>
target +</body>
target </html>
target
target
target Running handlers:
target Running handlers complete
target Chef Infra Client finished, 4/7 resources updated in 02
minutes 31 seconds
At this point, you should be able to point a browser at the target machine's IP address, and see
your new index page.
Chef will work to maintain this configuration, preventing drift. If you were to log into the target
server and change the index.html file in /var/www/html (for example, changing the word
"HELLO" to "GOODBYE"), Chef will revert the change the next time the agent runs (by default,
within 30 minutes).
Summary
This has been a high-level introduction to three modern DevOps toolkits. You should now be
ready to:
Deploy and integrate free versions of the major components of Ansible, Puppet, and/or
Chef on a range of substrates, from desktop virtual machines (such as VirtualBox VMs)
to cloud-based VMs on Azure, AWS, GCP or other IaaS platforms.
Experience each platform's declarative language, style of infra-as-code building and
organizing, and get a sense of the scope of its library of resources, plugins, and
integrations.
Get practice automating some of the common IT tasks you may do at work, or solve
deployment and lifecycle management challenges you set yourself, in your home lab.
Hands-on exercises and work will give you a complete sense of how each platform
addresses configuration themes, and help you overcome everyday IT gotchas.
If you are building up your reputation with community, know that almost nothing impresses IT
peers so much as a well-executed, insightfully automated deploy or manage codebase for a
complex, head-scratching piece of software. Entire companies are built on the bedrock of
knowing how to deploy complicated systems in robust configurations, such as specialized
databases, container orchestration and cloud frameworks like Kubernetes and Openstack.
Be realistic, though. These are each extremely complex and sophisticated platforms that take
users years to master, so do not get discouraged if you find them confusing! Reach out to the
communities of the products you enjoy using (or that your workplace endorses) and you will
learn more quickly.
At this point, it is time to introduce a new term: immutability. This literally means "the state of
being unchangeable," but in DevOps parlance, it refers to maintaining systems entirely as code,
performing no manual operations on them at all.
These topics have touched several times on the concept of treating infrastructure as code. But
thus far, it has mostly been preoccupied with the mechanics. You know that it makes sense to
automate deployment of full stacks, which are virtual infrastructures (compute/storage/network)
plus applications. You have seen several approaches to writing basic automation code, looked at
the mechanics of automation tools, and learned about storing code safely and retrieving it from
version control repositories.
You are now familiar with the idea of idempotency and related topics, and have seen how it
might be possible to compose automation code that is very safe to run. This is code that does not
break things, but instead puts things right and converges on the desired state described by a
(partly or wholly) declarative model of the deployed system. You have also learned how code
like this can be used to speed up operations, solving problems by brute force rather than detailed
forensics and error-prone, time-consuming manual operations.
GitOps: modern infrastructure-as-code
Committing to immutability enables you to treat your automation codebase the way you would
any application code:
You can trust that the codebase describes what is actually running on bare metal or cloud
servers.
You can manage the codebase with Agile procedures and structured use of version control to keep
things clear and simple.
This is "GitOps", also referred to as "operations by pull request." In a typical GitOps setup, you
might maintain a repository, such as a private repo on GitHub, with several branches called
"Development," "Testing/UAT," and "Production."
GitOps is "Operations by Pull Request"
Development - Developers make changes in the Development branch, filing commits and
making pull requests. These requests are fulfilled by an automated gating process and queued
for automated testing. The operators see results of automated testing, and developers iterate
until tests pass. The changes are then merged into the Test branch for the next level of review.
Test - When changes are merged from Development into Test, the code is deployed to a larger
test environment and is subjected to a more extensive set of automated tests.
Production - Tested changes are again reviewed and merged to Production, from which release
deployments are made.
By the time tested code is committed, evaluated, iterated, and merged to the Production branch,
it has gone through at least two complete cycles of testing on successively more production-like
infrastructure, and has also had conscious evaluation by several experts. The result is code that is
reasonably free of bugs and works well. This operational model can be used to develop extensive
self-service capability and self-healing compute/storage/network/PaaS, or implement large-scale
distributed applications under advanced container orchestration.
Where can GitOps take you?
When your GitOps procedures and workflow, gating/CI-CD, automated testing and other
components are in place, you can begin experimenting with elite deployment strategies that
require a bedrock of flawless automation.
Blue/green deployments
Blue/green deployment is a method for reducing or eliminating downtime in production
environments. It requires you to maintain two identical production environments. (You do not
have to call them Blue and Green; any two colors, such as Red and Black, will do.) You also
develop the capability to quickly redirect application traffic to one or the other (through ACI
automation, load balancing, programmable DNS, or other means).
A release is deployed to the environment not currently in use (Green). After acceptance testing,
redirect traffic to this environment. If problems are encountered, you can switch traffic back to
the original environment (Blue). If the Green deployment is judged adequate, resources owned
by the Blue deployment can be relinquished, and roles swapped for the next release.
Blue/Green Deployment
Note: Some DevOps practitioners differentiate between blue/green and red/black strategies.
They say that in blue/green, that traffic is gradually migrated from one environment to the other,
so it hits both systems for some period; whereas in red/black, traffic is cut over all at once. Some
advanced DevOps practitioners, like Netflix, practice a server-level version of red/black they call
"rolling red/black." With rolling red/black, servers in both environments are gradually updated,
so servers may temporarily run different versions and can be rolled back individually. This
means that three (not two) versions of the application or stack may be running across both
environments at any given time.
Canary testing
Canary testing is similar to rolling blue/green deployment, but somewhat more delicate. The
migration between old and new deployments is performed on a customer-by-customer (or even
user-by-user) basis, and migrations are made intentionally to reduce risk and improve the quality
of feedback. Some customers may be very motivated to gain access to new features and will be
grateful for the opportunity. They will happily provide feedback on releases through efficient
channels.
In this topic, you will learn about a suite of products and practices created by Cisco and its user
community to extend test automation to software-driven network configuration, and to reduce or
eliminate uncertainty about how prospective network architectures will function and perform
when fully implemented.
Automation tools like Ansible, Puppet, Chef, and others solve part of the problem by turning
infrastructure into code. But DevOps typically needs more fine-grained ways to define and
implement infrastructures, certify that deployed infrastructures are working as required, and
proactively ensure their smooth operation. DevOps also needs ways to take preemptive action
when failures are imminent, and to find and fix issues when errors occur.
When you use unit-testing tools like pytest in tandem with higher-order automation and in
concert with continuous delivery (CI/CD), you can build environments where code can be
automatically tested when changes are made.
Unit-testing frameworks make tests a part of your codebase, following the code through
developer commits, pull requests, and code-review gates to QA/test and Production. This is
especially useful in test-driven development (TDD) environments, where writing tests is a
continuous process that actually leads development, automatically encouraging very high levels
of test coverage.
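As a minimal sketch of what such a unit test looks like (the file and function names here are hypothetical, not part of any Cisco tool), a pytest test is simply a Python function whose name starts with test_ and whose assertions determine the result:

# test_math_utils.py -- hypothetical example; run it with the `pytest` command
def add(a, b):
    '''The function under test (normally imported from your codebase).'''
    return a + b


def test_add_returns_sum():
    # pytest collects any function named test_* and reports the assert outcome
    assert add(2, 3) == 5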
Network misconfigurations are often discovered only indirectly, when computers cannot talk to
one another. As networks become more complex and carry more diverse and performance-
sensitive traffic, security risks and performance degradations, which may be difficult both to
discover and to quantify, become increasingly important consequences of misconfiguration.
Network management and testing are still complex even when network devices and connectivity
become software-addressable and virtualized. Methods of building and configuring networks
certainly change, but you still need to create a collective architecture for safe connectivity by
touching numerous device interfaces in detailed ways.
Cisco has made huge strides in developing Software-Defined Networking (SDN) and middleware
that let engineers address a physical network collective as a single programmable entity. In
Cisco's case, this includes solutions like ACI, which manage the whole network by converging
models (often written in a declarative modeling language called YANG) that represent desired
states of functionality and connectivity.
The middleware enables a model to work harmoniously with other models that are currently
defining system state and resource requirements. Engineers interact less often with individual
devices directly, though the models still need to provide enough detail to enable configuration.
The complex, fast-evolving state of large infrastructures can be maintained as code, enabling:
Rapid re-convergence to desired states when needed. If a device breaks and is replaced, it can
be rapidly reintegrated with an existing network, and its programmed functionality,
network behavior, and performance quickly restored.
Portability, so that when a core application moves from one data center to another, its
required network configuration accompanies it.
Version control, CI/CD, and other tools to maintain, evolve, and apply the network
codebase.
These innovations are increasingly popular with larger organizations, carriers, and other scaled-
up entities. In many cases, however, networks still comprise several generations of diverse,
hybrid, multi-vendor physical and virtual infrastructures, so the problem of deliberate, device-
level configuration still looms.
And even when sophisticated SDN is available, SDN controllers/orchestrators cannot always
prevent misconfiguration. They can reject flawed code, perform sanity checks before applying
changes, and warn when models make demands that exceed resource thresholds, but seemingly
legitimate changes can still be applied and cause problems.
Python Automated Test System (pyATS) is a Python-based network device test and validation
solution, originally developed by Cisco for internal use, then made available to the public and
partially open-sourced. pyATS can be used to help check whether your changes work before
putting them into production, and continue validation and monitoring in production to ensure
smooth operations.
pyATS originated as the low-level Python underpinning for the test system as a whole. Its
higher-level library system, Genie, provides the necessary APIs and libraries that drive and
interact with network devices, and perform the actual testing. The two together form the Cisco
test solution we know as pyATS.
The pyATS framework and libraries can be leveraged within any Python code.
pyATS is modular, and includes components such as:
AEtest, which executes the test scripts.
Easypy, the runtime engine that enables parallel execution of multiple scripts, collects logs
in one place, and provides a central point from which to inject changes to the topology under test.
A CLI that enables rapid interrogation of live networks, extraction of facts, and automated
running of test scripts and other forensics. This enables very rapid 'no-code' debugging and
correction of issues in network topologies created and maintained using these tools.
In SDN/cloud/virtual network environments, setup can involve actually building a topology, and
cleanup can involve retiring it and reclaiming platform resources. This setup and cleanup can be
done directly using pyATS code. pyATS provides an extensive library for working with Cisco and
other infrastructure via a range of interfaces, including low-level CLI and REST APIs, as well as
connectors to ACI and other higher-order SDN management frameworks.
pyATS can consume, parse, and implement topologies described in JSON, as YANG models,
and from other sources, even from Excel spreadsheets.
pyATS can also be integrated with automation tools like Ansible for building, provisioning, and
teardown. However, it may be better practice to do the reverse: use Ansible, Puppet, or Chef to
manage the infrastructure's entire codebase and have those products invoke Python (and pyATS)
to deal with the details of network implementation. These tools can also leverage ACI or other
middleware to simplify the task, and they permit segregated storage and versioning of YANG or
other models that define concrete topologies.
Alternatively, you can invoke pyATS indirectly in several ways (including ways requiring
minimal Python programming knowledge).
The following content shows how to use pyATS to create and apply tests. You will need to be
familiar with this information to help you complete the lab on the next page. Simply read along
with this example to better understand pyATS.
Virtual environments
The pyATS tool is best installed for personal work inside a Python virtual environment (venv). A
venv is an isolated Python environment, created from your base installation but kept separate
from it. This enables you to avoid installing software that might permanently change the state of
your system.
Virtual environments exist in folders in your file system. When they are created, they can be
activated, configured at will, and components installed in them can be updated or modified
without changes being reflected in your host's configuration. The ability to create virtual
environments is native to Python 3, but Ubuntu 18.04 may require you to install a python3-
venv package separately.
The following instructions describe how to create a venv on Ubuntu 18.04 (where python3 is the
default command). If you are using a different operating system, refer to the appropriate
documentation for pip and virtual environments.
Ensure that python3-pip, the Python 3 package manager, is in place. You will also need to install
git, which is used later:
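On Ubuntu 18.04, one way to do this (the exact package commands are an assumption for that platform) is:
sudo apt update
sudo apt install python3-pip python3-venv git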
Create a new virtual environment in the directory of your choice. In this example, it is
called myproject.
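Using Python 3's built-in venv module, the command looks like this:
python3 -m venv myproject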
Venv creates the specified working directory (myproject) and corresponding folder structure
containing convenience functions and artifacts describing this environment's configuration. At
this point, you can cd to the myproject folder and activate the venv:
cd myproject
source bin/activate
Installing pyATS
You can install pyATS from the public Pip package repository (PyPI).
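A typical installation command, run inside the activated venv (the pyats[full] target is described below), looks like this:
pip install "pyats[full]"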
Note: You may see "Failed building wheel for ...<wheelname>" errors while installing pyATS
through pip. You can safely ignore these errors; pip falls back to another build method for those
packages, and the dependencies are installed despite the reported errors.
Once the installation completes, confirm that the pyats CLI is available:
pyats --help
Clone the pyATS sample scripts repo, maintained by Cisco DevNet, which contains sample files
you can examine:
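One way to do this is shown below; the repository URL is an assumption, while the pyats-sample-scripts directory name matches the paths used later in this topic:
git clone https://github.com/CiscoDevNet/pyats-sample-scripts.git
cd pyats-sample-scripts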
The installed target, pyats[full], includes the low-level underpinnings, various components and
dependencies, and the high-level Genie libraries.
The test declaration syntax for pyATS is inspired by that of popular Python unit-testing
frameworks like pytest. It supports basic testing statements, such as an assertion that a variable
has a given value, and adds to that the ability to explicitly provide results (including result,
reason, and data) via specific APIs. This is demonstrated in the following excerpt from a basic
test script. The pyATS test script can be found in /basics/pyats-sample-script.py from the
repository that you cloned previously. A portion of the script is shown below.
from pyats import aetest
import logging

# imports and logging setup used by this excerpt (defined at the top of the full script)
log = logging.getLogger(__name__)


class MyTestcase(aetest.Testcase):

    @aetest.setup
    def setup(self, section):
        '''setup section

        Create a setup section by defining a method and decorating it with the
        @aetest.setup decorator. By convention, the method should be named
        'setup'.

        Setup sections are optional within a testcase, and always run first.
        '''
        log.info("%s testcase setup/preparation" % self.uid)

        # set some variables
        self.a = 1
        self.b = 2

    @aetest.test
    def test_1(self, section):
        '''test section

        Create a test section by defining a method and decorating it with the
        @aetest.test decorator. The name of the method becomes the unique id
        labelling this test. There may be an arbitrary number of tests within
        a testcase.

        Test sections run in the order they appear within the testcase body.
        '''
        log.info("test section: %s in testcase %s" % (section.uid, self.uid))

        # the testcase instance is preserved between sections, eg
        assert self.a == 1

    @aetest.test
    def test_2(self, section):
        '''
        You can also provide explicit results, reason and data using the
        result APIs. This information will be captured in the result summary.
        '''
        log.info("test section: %s in testcase %s" % (section.uid, self.uid))

        if self.b == 2:
            self.passed('variable b contains the expected value',
                        data={'b': self.b})
        else:
            self.failed('variable b did not contain the expected value',
                        data={'b': self.b})
If you examine the entire test script, you will see that it contains several sections: a Common
Setup section, one or more Testing blocks (testcases), and a Common Cleanup section.
These blocks contain statements that prepare and/or determine readiness of the test topology (a
process that can include problem injection), perform tests, and then return the topology to a
known state.
The Testing blocks, which are often referred to in pyATS documentation as the Test Cases, can
each contain multiple tests, with their own Setup and Cleanup code. Best practice suggests that
the common Cleanup section, at the end, be designed for idempotency. This means that it should
check for and revert all changes made by Setup and the tests, restoring the topology to its
original, desired state.
pyATS scripts and jobs
A pyATS script is a Python file where pyATS tests are declared. It can be run directly as a
standalone Python script file, generating output only to your terminal window. Alternatively, one
or more pyATS scripts can be compiled into a "job" and run together as a batch, through the
pyATS EasyPy module. This enables parallel execution of multiple scripts, collects logs in one
place, and provides a central point from which to inject changes to the topology under test.
The pyATS job file can be found in /basics/pyats-sample-job.py from the repository that
you cloned previously. A portion of the job file is shown below.
import os
from pyats.easypy import run


def main():
    '''
    main() function is the default pyATS job file entry point
    that Easypy module consumes
    '''
    # find the location of the script in relation to the job file
    script_path = os.path.dirname(os.path.abspath(__file__))
    testscript = os.path.join(script_path, 'basic_example_script.py')

    # execute the test script
    run(testscript=testscript)
If you have performed the installation steps and are now in a virtual environment containing the
cloned repo, you can run this job manually to invoke the basic test case:
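The invocation below matches the CLI arguments recorded in the report output that follows:
pyats run job basic/basic_example_job.py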
If you see an error like RuntimeError: Jobfile 'basic_example_script' did not define
main(), it means you have run the basic_example_script.py file rather than
the basic_example_job.py file. If instead you see The provided jobfile 'pyats-sample-
scripts/basic/basic_example_job.py' does not exist, double-check which directory you are
working in; you may have already changed into the pyats-sample-scripts repository directory.
Output
2020-03-01T12:38:50: %EASYPY-INFO: Starting job run: basic_example_job
2020-03-01T12:38:50: %EASYPY-INFO: Runinfo directory: /Users/agentle/.pyats/runinfo/basic_example_job.2020Mar01_12:38:48.974991
2020-03-01T12:38:50: %EASYPY-INFO: --------------------------------------------------------------------------------
2020-03-01T12:38:51: %EASYPY-INFO: Starting task execution: Task-1
2020-03-01T12:38:51: %EASYPY-INFO:     test harness = pyats.aetest
2020-03-01T12:38:51: %EASYPY-INFO:     testscript   = /Users/agentle/src/pyats-sample-scripts/basic/basic_example_script.py
2020-03-01T12:38:51: %AETEST-INFO: +------------------------------------------------------------------------------+
2020-03-01T12:38:51: %AETEST-INFO: |                             Starting common setup                            |
2020-03-01T12:38:51: %AETEST-INFO: +------------------------------------------------------------------------------+
2020-03-01T12:38:51: %AETEST-INFO: +------------------------------------------------------------------------------+
2020-03-01T12:38:51: %AETEST-INFO: |                      Starting subsection subsection_1                       |
2020-03-01T12:38:51: %AETEST-INFO: +------------------------------------------------------------------------------+
2020-03-01T12:38:51: %SCRIPT-INFO: hello world!
2020-03-01T12:38:51: %AETEST-INFO: The result of subsection subsection_1 is => PASSED
2020-03-01T12:38:51: %AETEST-INFO: +------------------------------------------------------------------------------+
2020-03-01T12:38:51: %AETEST-INFO: |                      Starting subsection subsection_2                       |
2020-03-01T12:38:51: %AETEST-INFO: +------------------------------------------------------------------------------+
2020-03-01T12:38:51: %SCRIPT-INFO: inside subsection subsection_2
2020-03-01T12:38:51: %AETEST-INFO: The result of subsection subsection_2 is => PASSED
2020-03-01T12:38:51: %AETEST-INFO: The result of common setup is => PASSED
2020-03-01T12:38:51: %AETEST-INFO: +------------------------------------------------------------------------------+
2020-03-01T12:38:51: %AETEST-INFO: |                        Starting testcase Testcase_One                       |
2020-03-01T12:38:51: %AETEST-INFO: +------------------------------------------------------------------------------+
2020-03-01T12:38:51: %AETEST-INFO: +------------------------------------------------------------------------------+
2020-03-01T12:38:51: %AETEST-INFO: |                            Starting section setup                           |
2020-03-01T12:38:51: %AETEST-INFO: +------------------------------------------------------------------------------+
2020-03-01T12:38:51: %SCRIPT-INFO: Testcase_One testcase setup/preparation
2020-03-01T12:38:51: %AETEST-INFO: The result of section setup is => PASSED
2020-03-01T12:38:51: %AETEST-INFO: +------------------------------------------------------------------------------+
2020-03-01T12:38:51: %AETEST-INFO: |                           Starting section test_1                           |
2020-03-01T12:38:51: %AETEST-INFO: +------------------------------------------------------------------------------+
2020-03-01T12:38:51: %SCRIPT-INFO: test section: test_1 in testcase Testcase_One
2020-03-01T12:38:51: %AETEST-INFO: The result of section test_1 is => PASSED
2020-03-01T12:38:51: %AETEST-INFO: +------------------------------------------------------------------------------+
2020-03-01T12:38:51: %AETEST-INFO: |                           Starting section test_2                           |
2020-03-01T12:38:51: %AETEST-INFO: +------------------------------------------------------------------------------+
2020-03-01T12:38:51: %SCRIPT-INFO: test section: test_2 in testcase Testcase_One
2020-03-01T12:38:51: %AETEST-INFO: Passed reason: variable b contains the expected value
2020-03-01T12:38:51: %AETEST-INFO: The result of section test_2 is => PASSED
2020-03-01T12:38:51: %AETEST-INFO: +------------------------------------------------------------------------------+
2020-03-01T12:38:51: %AETEST-INFO: |                           Starting section cleanup                          |
2020-03-01T12:38:51: %AETEST-INFO: +------------------------------------------------------------------------------+
2020-03-01T12:38:51: %SCRIPT-INFO: Testcase_One testcase cleanup/teardown
2020-03-01T12:38:51: %AETEST-INFO: The result of section cleanup is => PASSED
2020-03-01T12:38:51: %AETEST-INFO: The result of testcase Testcase_One is => PASSED
2020-03-01T12:38:51: %AETEST-INFO: +------------------------------------------------------------------------------+
2020-03-01T12:38:51: %AETEST-INFO: |                            Starting common cleanup                          |
2020-03-01T12:38:51: %AETEST-INFO: +------------------------------------------------------------------------------+
2020-03-01T12:38:51: %AETEST-INFO: +------------------------------------------------------------------------------+
2020-03-01T12:38:51: %AETEST-INFO: |                     Starting subsection clean_everything                    |
2020-03-01T12:38:51: %AETEST-INFO: +------------------------------------------------------------------------------+
2020-03-01T12:38:51: %SCRIPT-INFO: goodbye world
2020-03-01T12:38:51: %AETEST-INFO: The result of subsection clean_everything is => PASSED
2020-03-01T12:38:51: %AETEST-INFO: The result of common cleanup is => PASSED
2020-03-01T12:38:51: %EASYPY-INFO: --------------------------------------------------------------------------------
2020-03-01T12:38:51: %EASYPY-INFO: Job finished. Wrapping up...
2020-03-01T12:38:52: %EASYPY-INFO: Creating archive file: /Users/agentle/.pyats/archive/20-Mar/basic_example_job.2020Mar01_12:38:48.974991.zip
2020-03-01T12:38:52: %EASYPY-INFO: +------------------------------------------------------------------------------+
2020-03-01T12:38:52: %EASYPY-INFO: |                                 Easypy Report                               |
2020-03-01T12:38:52: %EASYPY-INFO: +------------------------------------------------------------------------------+
2020-03-01T12:38:52: %EASYPY-INFO: pyATS Instance   : /Users/agentle/.local/share/virtualenvs/pyats-sample-scripts-b4vw68FQ/bin/..
2020-03-01T12:38:52: %EASYPY-INFO: Python Version   : cpython-3.8.1 (64bit)
2020-03-01T12:38:52: %EASYPY-INFO: CLI Arguments    : /Users/agentle/.local/share/virtualenvs/pyats-sample-scripts-b4vw68FQ/bin/pyats run job basic/basic_example_job.py
2020-03-01T12:38:52: %EASYPY-INFO: User             : agentle
2020-03-01T12:38:52: %EASYPY-INFO: Host Server      : AGENTLE-M-339A
2020-03-01T12:38:52: %EASYPY-INFO: Host OS Version  : Mac OSX 10.14.6 (x86_64)
2020-03-01T12:38:52: %EASYPY-INFO:
2020-03-01T12:38:52: %EASYPY-INFO: Job Information
2020-03-01T12:38:52: %EASYPY-INFO:     Name         : basic_example_job
2020-03-01T12:38:52: %EASYPY-INFO:     Start time   : 2020-03-01 12:38:50.019013
2020-03-01T12:38:52: %EASYPY-INFO:     Stop time    : 2020-03-01 12:38:51.732162
2020-03-01T12:38:52: %EASYPY-INFO:     Elapsed time : 0:00:01.713149
2020-03-01T12:38:52: %EASYPY-INFO:     Archive      : /Users/agentle/.pyats/archive/20-Mar/basic_example_job.2020Mar01_12:38:48.974991.zip
2020-03-01T12:38:52: %EASYPY-INFO:
2020-03-01T12:38:52: %EASYPY-INFO: Total Tasks      : 1
2020-03-01T12:38:52: %EASYPY-INFO:
2020-03-01T12:38:52: %EASYPY-INFO: Overall Stats
2020-03-01T12:38:52: %EASYPY-INFO:     Passed     : 3
2020-03-01T12:38:52: %EASYPY-INFO:     Passx      : 0
2020-03-01T12:38:52: %EASYPY-INFO:     Failed     : 0
2020-03-01T12:38:52: %EASYPY-INFO:     Aborted    : 0
2020-03-01T12:38:52: %EASYPY-INFO:     Blocked    : 0
2020-03-01T12:38:52: %EASYPY-INFO:     Skipped    : 0
2020-03-01T12:38:52: %EASYPY-INFO:     Errored    : 0
2020-03-01T12:38:52: %EASYPY-INFO:
2020-03-01T12:38:52: %EASYPY-INFO:     TOTAL      : 3
2020-03-01T12:38:52: %EASYPY-INFO:
2020-03-01T12:38:52: %EASYPY-INFO: Success Rate     : 100.00 %
2020-03-01T12:38:52: %EASYPY-INFO:
2020-03-01T12:38:52: %EASYPY-INFO: +------------------------------------------------------------------------------+
2020-03-01T12:38:52: %EASYPY-INFO: |                              Task Result Summary                             |
2020-03-01T12:38:52: %EASYPY-INFO: +------------------------------------------------------------------------------+
2020-03-01T12:38:52: %EASYPY-INFO: Task-1: basic_example_script.common_setup                                PASSED
2020-03-01T12:38:52: %EASYPY-INFO: Task-1: basic_example_script.Testcase_One                                PASSED
2020-03-01T12:38:52: %EASYPY-INFO: Task-1: basic_example_script.common_cleanup                              PASSED
2020-03-01T12:38:52: %EASYPY-INFO:
2020-03-01T12:38:52: %EASYPY-INFO: +------------------------------------------------------------------------------+
2020-03-01T12:38:52: %EASYPY-INFO: |                              Task Result Details                             |
2020-03-01T12:38:52: %EASYPY-INFO: +------------------------------------------------------------------------------+
2020-03-01T12:38:52: %EASYPY-INFO: Task-1: basic_example_script
2020-03-01T12:38:52: %EASYPY-INFO: |-- common_setup                                                          PASSED
2020-03-01T12:38:52: %EASYPY-INFO: |   |-- subsection_1                                                      PASSED
2020-03-01T12:38:52: %EASYPY-INFO: |   `-- subsection_2                                                      PASSED
2020-03-01T12:38:52: %EASYPY-INFO: |-- Testcase_One                                                          PASSED
2020-03-01T12:38:52: %EASYPY-INFO: |   |-- setup                                                             PASSED
2020-03-01T12:38:52: %EASYPY-INFO: |   |-- test_1                                                            PASSED
2020-03-01T12:38:52: %EASYPY-INFO: |   |-- test_2                                                            PASSED
2020-03-01T12:38:52: %EASYPY-INFO: |   `-- cleanup                                                           PASSED
2020-03-01T12:38:52: %EASYPY-INFO: `-- common_cleanup                                                        PASSED
2020-03-01T12:38:52: %EASYPY-INFO:     `-- clean_everything                                                  PASSED
2020-03-01T12:38:52: %EASYPY-INFO: Sending report email...
2020-03-01T12:38:52: %EASYPY-INFO: Missing SMTP server configuration, or failed to reach/authenticate/send mail. Result notification email failed to send.
2020-03-01T12:38:52: %EASYPY-INFO: Done!
Pro Tip
Use the following command to view your logs locally: pyats logs view. This command
automatically opens a page in your web browser displaying the pyATS test results in a GUI
format.
This is a very simple example that uses the most basic pyATS functionality. There is no actual
topology or testbed on which to run network-type tests. However, the output shows you the kind
of detailed test log pyATS creates, including a section-by-section run log of the whole
process, from setup to teardown, and appended comprehensive report sections:
2020-03-01T12:38:52: %EASYPY-INFO: +------------------------------------------------------------------------------+
2020-03-01T12:38:52: %EASYPY-INFO: |                              Task Result Summary                             |
2020-03-01T12:38:52: %EASYPY-INFO: +------------------------------------------------------------------------------+
2020-03-01T12:38:52: %EASYPY-INFO: Task-1: basic_example_script.common_setup                                PASSED
2020-03-01T12:38:52: %EASYPY-INFO: Task-1: basic_example_script.Testcase_One                                PASSED
2020-03-01T12:38:52: %EASYPY-INFO: Task-1: basic_example_script.common_cleanup                              PASSED
Each job run generates an archive .zip file, stored by default under your user's home
directory in ~/.pyats/archive. You can list the files and unzip each archive file to view its
contents (as regular text files), or use the built-in, web-browser-based log viewer, which serves
the results over a localhost web server:
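For example, using the command from the Pro Tip above:
pyats logs view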
A testbed can be a single YAML file, or can be programmatically assembled from YAML
(establishing basic structure) and Python files that make use of pyATS (and potentially Genie)
library modules to:
Define the testbed devices (routers, switches, servers, etc.), subsystems (such as ports,
network cards) and their interconnections.
Establish management connections with them, using the pyATS ConnectionManager class to
create connections and operate on them. Inside the pyATS topology model, devices are
created as objects that include a connectionmanager attribute, containing an instance of
the ConnectionManager class that manages active connections to the real-world device.
The testbed file is an essential input to the rest of the pyATS library and ecosystem. It tells the
framework which set of library APIs (such as parsers) to load for each device, and how to
communicate with each device effectively.
Real testbed files for large topologies can be long, deeply-nested, and complex. A
simple testbed.yaml file with one device, identified with <device_ip> below might look like
this example. To run the example, you would need to enter a real IP address for a device that
matches the type and os settings.
Note: This is an example and it will not work with pyATS unless you enter real values for the
username, password, and connection IP address.
devices:
  router1:
    type: router
    os: nxos
    platform: n9kv
    alias: under_test
    credentials:
      default:
        username: "%ENV{MY_USERNAME}"
        password: "%ENV{MY_PASSWORD}"
    connections:
      cli:
        protocol: ssh
        ip: "<device_ip>"
This example defines a router whose hostname is router1, with a supported OS.
To validate that your testbed YAML file meets the pyATS requirements (and conforms to the
standard schema), replace the username, password, and ip values, and then run a pyats
validate command like so:
pyats validate testbed testbed.yaml
This command checks the content of your file, loads it, and displays any errors in the schema or
format.
Note: It is possible to leverage pyATS libraries without a testbed file input, where you can elect
to define devices, connections, and other testbed elements programmatically.
Genie is the pyATS higher-level library system that provides APIs for interacting with devices,
and a powerful CLI for topology and device management and interrogation.
When installed, it adds its features and functionalities into the pyATS framework.
For example, Genie features parsers for a wide range of network operating systems and
infrastructure. Parsers are APIs that convert device output into Python structured data. To
exercise parsers, enter the pyATS interactive shell. This is effectively the same as the Python
interactive interpreter, except that it provides extra conveniences, such as automatically loading
your testbed file:
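For example, assuming the testbed.yaml file shown earlier, the shell can be started like this (the exact flags may vary between releases; check pyats shell --help):
pyats shell --testbed-file testbed.yaml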
Note: If you are using a pyATS version older than v20.1, the equivalent command is
instead genie shell.
Now you can access the loaded testbed's devices, establish connectivity, and parse device
command outputs like this:
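The snippet below is a rough sketch of such an interaction, assuming the testbed.yaml shown earlier, a reachable device, and the credential environment variables set; the parsed command ('show version') is just one example of a supported parser:

# inside `pyats shell`, the testbed object is already loaded;
# in a standalone script you could load it yourself, for example:
#   from genie.testbed import load
#   testbed = load('testbed.yaml')

device = testbed.devices['router1']    # the device defined in testbed.yaml
device.connect()                       # opens the SSH CLI connection declared in the testbed
output = device.parse('show version')  # returns structured data as a Python dictionary
print(output)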
The Device.parse() API returns the processed structured data (Python dictionary), enabling
you to build your own business logic and/or tests on top. To see the list of all available platforms
and supported parser commands in the Genie library, visit Available Parsers in the Genie
documentation.
In addition, it is also possible to exercise the library's functionality through shell CLIs
(pyats commands). You can interactively extract a comprehensive text description of the
configuration and operational states of various protocols, features and hardware information in a
given topology. For example:
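A sketch of such a command, which collects device configurations into an output directory named my_network (the exact arguments are an assumption; check pyats learn --help on your installation):
pyats learn conf --testbed-file testbed.yaml --output my_network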
The underlying Genie mechanism connects in parallel to all the devices defined in the testbed
and collects their configurations (conf) in a human-readable file (in this case,
called my_network). The output provides details about network state, including interface setup,
VLANs, spanning-tree configuration, and many other features.
Now that the output file exists, it serves as a "gold standard" for this topology's configuration. At
any subsequent point in time, if configuration drift seems to have taken hold and broken
something, you can run Genie again:
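For example, you might learn the configuration a second time into a new directory and compare it with the original (again, the exact arguments are an assumption):
pyats learn conf --testbed-file testbed.yaml --output my_network_changed
pyats diff my_network my_network_changed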
This API returns a set of diff files, detailing any changes, and letting you quickly discover the
cause of problems.
To see the list of "features" that pyATS Genie currently supports (and can learn), refer to Available
Models in the Genie documentation.
Many of pyATS's functions, such as parse and learn, can be exercised directly in Python (either
in the interactive shell or in your .py script files, intended for programmers), and through the CLI
interface (for non-programmers). You can learn more about this topic in the pyATS Getting
Started Guide.
This topic has provided a quick introduction to pyATS and its companion solutions. The next
topic introduces VIRL, a Cisco tool for faithfully simulating networks on a server platform,
along with some associated utilities.
Note: Windows platforms are not yet supported for using pyATS.
The visualization has clickable elements that let you explore configuration of entities and make
changes via the WebUI, or by connecting to network elements via console. You can also extract
individual device configurations, or entire simulated network configs, as .virl files.
VIRL files
VIRL also enables you to define simulations as code, enabling both-ways integration with other
software platforms for network management and testing.
VIRL's native configuration format is the .virl file, which is a human-readable YAML file.
The .virl file contains complete descriptions of the IOS routers, their interface configurations
and connections (plus other configuration information), credentials for accessing them, and other
details. These files can be used to launch simulations via the VIRL REST API, and you can
convert .virl files to and from "testbed" files for use with pyATS and Genie.
In the VIRL UI, you select a simulation, make VIRL read the device's configuration, and then it
composes a .virl file to represent it. VIRL offers to save the topology in a new file that you can
then open in an editor for review.
The .virl file provides a method for determining if configuration drift has occurred on the
simulation. A simple diff command can compare a newly-extracted .virl file with the
original .virl file used to launch the simulation, and differences will be apparent.
Automation is using code to configure, deploy, and manage applications together with
the compute, storage, and network infrastructures and services on which they run. You
have choices in how to programmatically control your network configurations and
infrastructure: Walk (read-only automation), Run (activate policies and provide self-
service across multiple domains), and Fly (deploy applications, network configurations,
and more through CI/CD). Manual processes are always subject to human error, and
documentation meant for humans is often incomplete and ambiguous, hard to test, and
quickly outdated. Automation is the answer to these problems. Benefits of full-stack
automation include self-service, scale on demand, observability, and automated problem
mitigation. Software-defined infrastructure, also known as cloud computing, lets
developers and operators use software to requisition, configure, deploy, and manage
bare-metal and virtualized compute, storage, and network resources. Modern
application architectures are increasingly distributed. They are built up out of small and
relatively light components that are sometimes called microservices.
Defining Moments 1: Site Reliability Engineering (SRE). The role of the SRE is intended
to fuse the disciplines and skills of Dev and Ops. Key SRE principles include:
A focus on automation
The idea that "failure is normal"
A reframing of "availability" in terms of what a business can tolerate
Automation Tools
Three of the most popular automation tools are Ansible, Puppet, and Chef. Automation
tools like Ansible, Puppet, or Chef offer powerful capabilities compared to ad-hoc
automation strategies using BASH, Python, or other programming languages.
Idempotent software produces the same desirable result each time that it is run. In
deployment software, idempotency enables convergence and composability. Procedural
code can achieve idempotency, but many infrastructure management, deployment, and
orchestration tools have adopted another method: creating a declarative model. A
declarative model is a static model that represents the desired end product.
Ansible's basic architecture is very simple and lightweight. Ansible's control node runs
on virtually any Linux machine running Python 2 or 3. All system updates are performed
on the control node. Plugins enable Ansible to gather facts from and perform operations
on infrastructure that can't run Python locally, such as cloud provider REST interfaces.
Ansible is substantially managed from the Bash command line, with automation code
developed and maintained using any standard text editor.
Puppet's core architecture has the following characteristics: a designated server that hosts
the main application components, called the Puppet Server; Facter, the fact-gathering
service; PuppetDB, which can store facts, node catalogs, and recent configuration event
history; and a secure client, the Puppet Agent, installed and configured on target
machines. Operators communicate with the Puppet Server largely via SSH and the
command line.
Chef's main components are the Chef Workstation, a standalone operator workstation;
the Chef Infra Client (the host agent), which runs on hosts, retrieves configuration
templates, and implements required changes; and the Chef Infra Server, which replies to
queries from Chef Infra Clients on validated hosts and responds with configuration
updates, upon which the clients then converge host configuration.
Infrastructure as Code
Immutability literally means "the state of being unchangeable," but in DevOps parlance,
it refers to maintaining systems entirely as code, performing no manual operations on
them at all. Committing to immutability enables you to treat your automation codebase
the way you would any application code:
● You can trust that the codebase describes what's actually running on bare metal or
cloud servers.
● You can manage the codebase using Agile procedures and structured use of version control
to keep things clear and simple.
Automating Testing
DevOps typically needs more fine-grained ways to define and implement infrastructures,
certify that deployed infrastructures are working as required, proactively ensure their
smooth operation, take preemptive action when failures are imminent, and find and
fix issues when errors occur.
When you use unit-testing tools like pytest in tandem with higher-order automation and
in concert with continuous delivery (CI/CD), you can build environments where code can
be automatically tested when changes are made.
Unit-testing frameworks make tests a part of your codebase, following the code through
developer commits, pull requests, and code-review gates to QA/test and Production.
This is especially useful in test-driven development (TDD) environments, where writing
tests is a continuous process that actually leads development, automatically
encouraging very high levels of test coverage.
Network Simulation
Cisco Virtual Internet Routing Lab (VIRL) can run on bare metal, or on
large virtual machines on several hypervisor platforms.
From an end user's point of view, an SDK is quite different from an API. Most SDKs are a
package, integrated with libraries, documents, code examples, etc. Most SDKs require
installation. In comparison, an API is essentially a documented set of URIs. Developers require
only a reference guide and a resource address to get started.
For example, if you want to develop a mapping application for an iPhone, you need to download
and install the iOS SDK. In contrast, to try the Google Maps API, you need only an API reference
and initial authentication credentials.
An SDK can provide simpler authentication methods as well as enabling token refreshes as part
of the package. SDKs often help with pagination or rate limiting constraints on responses for a
particular API. Also, it can be easier to read examples in a programming language you are
already familiar with, so code examples in an SDK can be helpful.
Cisco SDKs
Cisco provides a wide range of SDKs for different Cisco platforms. The following list is not
exhaustive:
Webex Teams Android SDK - Use the Webex Teams Android SDK to customize your
app and to access powerful Webex Teams collaboration features without making your
users leave the mobile app. See Webex Teams Android SDK.
Jabber Web SDK - This SDK is used for developing web applications on Cisco Unified
Communications, including voice and video, IM and Presence, voice messaging, and
conferencing. See Jabber Web SDK.
Jabber Guest SDK for Web - The Cisco Jabber Guest SDK for Web is primarily a call
widget that is embedded in an iframe within another web page. See Jabber Guest SDK for
Web.
Jabber Guest SDK for iOS - The Jabber Guest SDK for iOS coordinates and simplifies
the implementation, use, and quality of two-way video calls from within an application.
See Cisco Jabber Guest for iOS.
Jabber Guest SDK for Android - With the Jabber Guest Android SDK, you can enable
your application to instantiate a two-way video call via the internet. The call is between
the user's device and a video endpoint registered with a CUCM inside an enterprise
firewall. The SDK handles all aspects of establishing and maintaining the two-way video
call within your application. See Cisco Jabber Guest for Android.
Cisco DNA Center Multivendor SDK - For partners and customers who have a mix of
Cisco and non-Cisco devices in their network, this SDK builds support directly in Cisco
DNA Center. See Cisco DNA Center Multivendor SDK.
UCS Python SDK - The Cisco UCS Python SDK is a Python module used to automate
all aspects of Cisco UCS management, including server, network, storage, and hypervisor
management. See Official Documentation for the UCS Python SDK.
Cisco APIC Python SDK (Cobra SDK) - The Cobra SDK is a native Python language
binding for the APIC REST API to interact with the APIC controller. The installation of
Cobra requires installing two .egg files: acicobra.egg and acimodel.egg. See Installing
the Cisco APIC Python SDK.
Cisco IMC Python SDK - Cisco IMC Python SDK is a Python module supporting the
automation of all Cisco IMC management servers (C-Series and E-Series). See
Instructions for installing IMC Python SDK and Official Documentation for the IMC
Python SDK.
Cisco Instant Connect SDK - Cisco Instant Connect has an Android, Windows, and
Apple iOS Software Development Kit, providing tools for partners to embed Push-To-
Talk in mobile or desktop applications. See Cisco Instant Connect SDK Downloads and
Introducing Cisco Instant Connect.
Webex Teams Python SDK example - You can work with the Webex Teams APIs
using a familiar language, in this case Python, with the webexteamssdk. This SDK is
available on GitHub. This SDK handles pagination for you automatically, simplifies
authentication, provides built-in error reporting, and manages file attachments. Here is a
simple Python example to retrieve a user's account information:
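The script itself is not reproduced in this text, so the following is a reconstruction based on the description below; the token value is a placeholder you must replace with your own:

# reconstructed example using the webexteamssdk package
from webexteamssdk import ApiError, WebexTeamsAPI

access_token = 'YOUR_ACCESS_TOKEN_HERE'   # placeholder; or rely on WEBEX_TEAMS_ACCESS_TOKEN

api = WebexTeamsAPI(access_token=access_token)

try:
    me = api.people.me()        # retrieve the account associated with the token
    print(me.emails)            # print only the emails field from the response
except ApiError as e:
    print(e)                    # the SDK raises ApiError for failed API calls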
The SDK needs a Webex Teams access token to make API calls. You can either set it
with a WEBEX_TEAMS_ACCESS_TOKEN environment variable, or pass
the access_token argument into the function call. This example uses
an access_token variable and then passes it in as an argument.
The SDK provides error reporting with the ApiError exception. Notice
that ApiError is imported in the first line.
The SDK gives you access to precise parts of the return body, such as me.emails. In the
example above, it returns only the emails data from the response JSON.
YANG
NETCONF
RESTCONF
YANG is a modeling language. NETCONF and RESTCONF are protocols used for data model
programmable interfaces.
8.3.2 What is YANG?
YANG, an acronym for Yet Another Next Generation, as defined in RFC 6020, is "a data
modeling language used to model configuration and state data manipulated by the Network
Configuration Protocol (NETCONF), NETCONF remote procedure calls, and NETCONF
notifications."
YANG in the Model-Driven Programmability Stack: the stack layers, from top to bottom, are
Apps, APIs, Protocol, Encoding, Transport, and Models. Applications use model-driven APIs,
such as the YANG Development Kit (YDK), which are built on YANG models (native and open).
A YANG module defines hierarchies of data that can be used for NETCONF-based operations,
including configuration, state data, RPCs, and notifications. This allows a complete description
of all data sent between a NETCONF client and server. YANG can also be used with protocols
other than NETCONF.
Although YANG can describe any data model, it was originally designed for networking data
models.
In the real world there are two types of YANG models, open and native.
Open YANG Models: Developed by vendors and standards bodies, such as the IETF, ITU, and
OpenConfig. They are designed to be independent of the underlying platform and normalize the
per-vendor configuration of network devices.
Native Models: Developed by vendors, such as Cisco. They relate to, and are designed to integrate
with, features or configuration relevant only to that platform.
anyxml: A data node that can contain an unknown chunk of XML data.
augment: Adds new schema nodes to a previously defined schema node.
container: An interior data node that exists in, at most, one instance in the data tree. A
container has no value, but rather a set of child nodes.
data model: A data model describes how data is represented and accessed.
data node: A node in the schema tree that can be instantiated in a data tree. One of container,
leaf, leaf-list, list, and anyxml.
data tree: The instantiated tree of configuration and state data on a device.
derived type: A type that is derived from a built-in type (such as uint32), or another derived
type.
grouping: A reusable set of schema nodes. Grouping may be used locally in the module, in
modules that include it, and by other modules that import from it. The grouping statement is
not a data definition statement and, as such, does not define any nodes in the schema tree.
identifier: Used to identify different kinds of YANG items by name.
leaf: A data node that exists in at most one instance in the data tree. A leaf has a value but no
child nodes.
leaf-list: Like the leaf node but defines a set of uniquely identifiable nodes rather than a single
node. Each node has a value but no child nodes.
list: An interior data node that may exist in multiple instances in the data tree. A list has no
value, but rather a set of child nodes.
module: A YANG module defines a hierarchy of nodes that can be used for NETCONF-based
operations. With its definitions and the definitions it imports or includes from elsewhere, a
module is self-contained and “compilable”.
RPC: A Remote Procedure Call, as used within the NETCONF protocol.
state data: The additional data on a system that is not configuration data, such as read-only
status information and collected statistics [RFC4741].
YANG defines four types of nodes for data modeling. The details are described in RFC 6020
section 4.2.2 and RFC 7950 section 4.2.2.
Leaf Nodes
Leaf-List Nodes
Container Nodes
List Nodes
A YANG module contains a sequence of statements. Each statement starts with a keyword,
followed by zero or one argument, followed either by a semicolon (";") or a block of sub-
statements enclosed within braces ("{ }").
statement = keyword [argument] (";" / "{" *statement "}")
There are four major statements in a YANG module: container, leaf, leaf-list, and list.
YANG in action
Now let's take a look at an example of a YANG file in real life, ietf-interfaces.yang. This
YANG open module contains a collection of YANG definitions for managing network
interfaces.
When you know the terminology and structure of YANG, it is not hard to understand the content
of a YANG file, which has very detailed comments and descriptions. But those descriptions also
make the file very long. Fortunately, there are tools to extract the content in a more readable and
concise way, and the pyang tool is one of them.
You can install pyang using the pip command in a virtual environment.
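For example, inside an activated venv:
pip install pyang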
As you can see below, the pyang tool can convert a YANG file into an easy-to-follow tree
structure.
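A typical invocation uses pyang's tree output format against the module discussed above (assuming the ietf-interfaces.yang file is in your current directory):
pyang -f tree ietf-interfaces.yang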
The NETCONF protocol uses XML-based data encoding for both the configuration data
and the protocol messages.
The NETCONF protocol provides a small set of operations to manage device
configurations and retrieve device state information. The base protocol provides
operations to retrieve, configure, copy, and delete configuration data stores.
NETCONF Protocol Operations
The NETCONF protocol provides a set of operations to manage device configurations
and retrieve device state information. The base protocol includes protocol operations
such as <get>, <get-config>, <edit-config>, <copy-config>, <delete-config>, <lock>,
<unlock>, <close-session>, and <kill-session>.
NETCONF and SNMP protocols are both defined to remotely configure devices.
Feature comparison
SNMP:
Uses a pull model when retrieving data from a device, which does not scale well
for high-density platforms
Does not have a discovery process for finding the Management Information Bases
(MIBs) supported by a device
Does not support the concept of transactions
Lacks backup-and-restore of element configuration
Has limited industry support for configuration MIBs
The NETCONF RFC defines three distinct datastores that are the targets of configuration
reads and writes. These are:
running (mandatory)
candidate (optional)
startup (optional)
The RESTCONF RFC 8040 defines a protocol and mechanism for REST-like access to
configuration information and control. Similar to NETCONF, it uses the datastore
models and command verbs defined in the Network Configuration Protocol
(NETCONF), encapsulated in HTTP messages. As with NETCONF, the YANG
language is used to define the command structure syntax, as well as the semantics of the
configuration datastores, state data, and events.
RESTCONF in the Model-Driven Programmability Stack
RESTCONF uses structured data (XML or JSON) and YANG to provide REST-like
APIs, enabling programmatic access to devices. HTTP commands GET, POST, PUT,
PATCH, and DELETE are directed to a RESTCONF API to access data resources
represented by YANG data models. These basic edit operations allow the running
configuration to be viewed and altered by a RESTCONF client.
Each device RESTCONF server is accessed via API methods. Where can you find the
RESTCONF API? The answer is that individual device APIs are rarely published. Instead the
method URLs are determined dynamically.
The RESTCONF RFC 8040 states that RESTCONF base URI syntax is /restconf/<resource-
type>/<yang-module:resource>. <resource-type> and <yang-module:resource> are
variables and the values are obtained using specific YANG model files.
The basic format of a RESTCONF URL
is https://<hostURL>/restconf<resource><container><leaf><options> where any
portion after restconf could be omitted.
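As a hedged illustration, a reply like the one shown below (the RESTCONF API root resource) might be retrieved with a Python requests call along these lines; the host, credentials, and certificate handling are assumptions, not values from this course:

import requests

# Hypothetical device address and credentials; replace with real values.
url = "https://<hostURL>/restconf/"
headers = {"Accept": "application/yang-data+json"}

response = requests.get(url,
                        auth=("admin", "password"),  # assumed credentials
                        headers=headers,
                        verify=False)                # many lab devices use self-signed certificates
print(response.status_code)
print(response.json())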
HTTP/1.1 200 OK
Date: Thu, 26 Jan 2017 20:56:30 GMT
Server: example-server
Content-Type: application/yang-data+json
{
  "ietf-restconf:restconf" : {
    "data" : {},
    "operations" : {},
    "yang-library-version" : "2016-06-21"
  }
}
Note: You will learn how to use RESTCONF in the lab, Use RESTCONF
to Access an IOS XE Device.
In this lab, you will learn how to use the open source pyang tool
to transform YANG data models from YANG-language files into a much
easier-to-read format. Using the "tree" view transformation, you
will identify the key elements of the ietf-interfaces YANG model.
In the first half of this lab, you will use the Postman program
to construct and send API requests to the RESTCONF service that
is running on R1. In the second half of the lab, you will create
Python scripts to perform the same tasks as your Postman program.
Cisco DNA Center is a foundational controller and analytics platform for large and midsize
organizations. It is at the heart of a Cisco intent-based network. It provides a single dashboard for
network management, network automation, network assurance, monitoring, analytics, and
security.
Cisco DNA Center provides both a web-based GUI dashboard and the RESTful Intent API used
to programmatically access its services and functions.
It supports full 360-degree services and integration:
Northbound - The Intent API provides specific capabilities of the Cisco DNA Center platform.
Eastbound - These are services for asynchronous event notification through webhooks and email
notification.
Southbound - The Multivendor SDK is used to integrate non-Cisco devices into the Cisco DNA
network.
Westbound - Integration with Assurance and IT Service Management (ITSM) services, such as
ServiceNow.
Here the focus is on using the Intent API, but you will also get a quick overview of services
directly available through the GUI.
Cisco DNA Center dashboard (GUI)
The GUI organizes services and activities into Design, Policy, Provision, Assurance and
Platform.
Design - Design your network using intuitive workflows, starting with locations where your
network devices will be deployed. Existing network designs created in Cisco Prime Infrastructure
or Application Policy Infrastructure can be imported into Cisco DNA Center.
Policy - Define and manage user and device profiles to facilitate highly secure access and
network segmentation.
Provision - Deployment and policy-based automation to deliver services to the network based
on business priority and to simplify device deployment. Zero-touch device provisioning and
software image management features reduce device installation or upgrade time.
Assurance - Telemetry and notification for application performance and user connectivity events
in real-time.
Platform - Information and documentation for the DNA Center Intent API supporting the use of
third-party applications and processes for data gathering and control via the API. This is the
means to improve and automate IT operations, establish customized and automated workflow
processes, and integrate network operations into third-party applications.
Site Management - Enterprise network provisioning, onboarding, deployment, and software
image management, covering Site Design, Network Settings, Software Image Management,
Device Onboarding (PnP), and Configuration Templates.
Advanced Malware Protection (AMP) for Endpoints - AMP for Endpoints provides API access to
automate security workflows and includes advanced sandboxing capabilities to inspect any file
that looks like malware in a safe and isolated way. AMP works with Windows, Mac, Linux,
Android, and iOS devices through public or private cloud deployments.
Cisco Firepower Management Center (FMC) - FMC is a central management console for the
Firepower Threat Defense (FTD) Next-Generation Firewall. This console can configure all aspects
of your FTD including key features like access control rules and policy object configuration, such
as network objects. FMC provides a central configuration database enabling efficient sharing of
objects and policies between devices. It provides a REST API to configure a subset of its
functionality.
Cisco Firepower Threat Defense (FTD) - FTD configuration with Firepower Device Manager also
provides protective services, including tracking, backing up, and protecting CA certificates;
managing, backing up, encrypting, and protecting private keys; IKE key management; and ACLs
to select traffic for services.
Cisco Identity Services Engine (ISE) - ISE provides a rule-based engine for enabling policy-based
network access to users and devices. It enables you to enforce compliance and streamline user
network access operations. With the ISE APIs, you can automate threat containment when a
threat is detected. It integrates with existing identity deployments.
Cisco Threat Grid - Threat Grid is a malware analysis platform that combines static and dynamic
malware analysis with threat intelligence from global sources. You can add a Threat Grid
appliance to your network, or use the Threat Grid service in the cloud. It can also be integrated
into other security technologies such as AMP.
Cisco Umbrella - Umbrella uses DNS to enforce security on the network. You configure your DNS
to direct traffic to Umbrella, and Umbrella applies security settings on their global domain name
list based on your organization's policies.
8.8.2 Packet Tracer - Compare using CLI and an SDN Controller to Manage a
Network
In this Packet Tracer activity, you will compare the differences between managing a network
from the command line interface (CLI) and using a software-defined networking (SDN)
controller to manage the network.
You will complete these objectives: