
Computer Vision News
The magazine of the algorithm community
A publication by RSIP Vision
March 2019

In this issue:
Artificial Intelligence Spotlight News
Bay Vision Meetup
Women in Science: Chip Huyen - NVIDIA
Research Review: Kalman Filter Based Classifier Fusion for Affective State Recognition
Focus On: PyOD (with codes!)
Project Management: When the Client Says: "Do the Best You Can!"
Upcoming Events
Computer Vision Project: 3D Stereo Vision Calibration and Triangulation
Contents

03 Editorial by Ralph Anzarouth
04 Research Paper Review: Kalman Filter Based Classifier Fusion, reviewed by A. Spanier
10 Project Management in Computer Vision: When the client says "Do the best you can!", with A. Minkov
12 AI Challenge Review: BreastPathQ, with N. Petrick
15 AI Spotlight News: From Elsewhere on the Web
16 Project by RSIP Vision: 3D Stereo Vision Calibration and Triangulation
18 Focus on: PyOD, Python open source toolbox for Outlier Detection, by A. Spanier
24 Women in Science: Chip Huyen (Stanford, NVIDIA)
30 Computer Vision Events: Bay Vision Meetup (Deepen.ai, CRadar.Ai); Upcoming Events Mar-May 2019
34 Subscribe for Free
Welcome

Dear reader,

Following our issue dedicated mostly to the field of Autonomous Vehicles (February 2019), we keep informing you about the most interesting developments in that area. For instance, the Bay Vision Meetup that we sponsored on February 27 in Cupertino: see the photos and the full video of the discussions on pages 30-31 of this magazine.

We did not forget the other key subjects normally featured in Computer Vision News: we reviewed for you a very important medical segmentation challenge on page 12, thanks to help from Nicholas Petrick of the FDA. The quantitative task of the BreastPathQ challenge is to determine cancer cellularity from H&E patches of breast cancer tumors. Read also what the winners of the challenge say.

Finally, we know how much our readers love code! If you relate, don't miss the technical articles published in this issue, in particular the "Focus on: PyOD" on page 18. As usual, lovers of Artificial Intelligence will find plenty of code to work with.

Enjoy the reading!

Ralph Anzarouth
Editor, Computer Vision News
RSIP Vision

Did you subscribe to Computer Vision News? It's free, click here!

Editor: Ralph Anzarouth
Engineering Editor: Assaf Spanier
Publisher: RSIP Vision

Contact us | Give us feedback | Free subscription | Read previous magazines

Copyright: RSIP Vision. All rights reserved. Unauthorized reproduction is strictly forbidden.

Did you miss our "AI in Pharma" webinar? Deep Learning for the Segmentation, Classification, and Quantification of Dendritic Cells. Here it is again for you!

Follow us
Research: Kalman Filter Based Classifier Fusion for Affective State Recognition

by Assaf Spanier

Every month, Computer Vision News reviews a research paper from our field. This month we have chosen Kalman Filter Based Classifier Fusion for Affective State Recognition. The authors are Michael Glodek, Stephan Reuter, Martin Schels, Klaus C. J. Dietmayer and Friedhelm Schwenker. The paper is here.

An implementation of Kalman Filters for fusing the decisions of a number of classifiers
Introduction
Kalman Filters are in widespread use in the fields of object tracking and autonomous driving (navigation). They are highly efficient at fusing measurements, thanks to their Markov chain based design. The idea of Kalman Filters is to reduce measurement "noise" by fusing measurements from a variety of sources, even if some have missing values in certain instances, and computing what weight to give each source in each instance. The model can handle missing measurements by raising the level of uncertainty. The Kalman Filter works in two stages: the prediction stage and the update stage. The prediction stage estimates a scalar, the fusion of the classifier outputs. The update stage (also known as the correction step) combines this estimate with the current measurements z.

The classifier fusion method proposed in this paper can be used for any classifier type, as long as the following assumptions hold: 1) the Markov assumption: future states are independent of past states, given the present state; and 2) the data is sequentially structured.

The figure above illustrates the model proposed in this paper: classifier fusion of a number of simple classifiers (base classifiers), with a reject option for each classifier. At every time point, the classification decision and confidence measure of every classifier are collected and fused by a Kalman Filter, which then outputs a fused classification decision with a confidence measure for that decision.
Method
As mentioned, the Kalman Filter consists of two stages: the prediction stage and the update stage. The prediction stage computes an estimated scalar $\hat{x}_t$. The update stage (also known as the correction step) fuses this estimate $\hat{x}_t$ with the latest measurements $z_{m,t}$, where $t \in \{1,\dots,T\}$ is the time and $m \in \{1,\dots,M\}$ indexes the classifiers.

The prediction stage: the new prediction $\hat{x}_t$ is estimated from the last estimate $x_{t-1}$, according to the formula

$\hat{x}_t = f \cdot x_{t-1}$

The original Kalman Filter formula includes a control input u; however, it is meaningless here, as it refers to a command or instruction given to a robot or autonomous vehicle, and the paper deals with classifiers.

$\hat{p}_t$ is the covariance of the prediction:

$\hat{p}_t = f \cdot p_{t-1} \cdot f + q_m$

It is obtained by combining the a posteriori covariance of the previous step with an additional covariance $q_m$, which models the process noise.

The update stage is performed for each classifier m and consists of the following three computations:

$y = z_{m,t} - h \cdot \hat{x}_t$
$s = h \cdot \hat{p}_t \cdot h + r_m$
$k = \hat{p}_t \cdot h \cdot s^{-1}$

where h is the observation model mapping the prediction into the measurement space, and $r_m$ is the observation noise of classifier m. The outcomes of this stage are used to update the estimate and the covariance for the next step:

$x_t = \hat{x}_t + k \cdot y$
$p_t = \hat{p}_t - k \cdot s \cdot k$
When there are missing values, that is, when some classifier gives no decision, the output of that classifier ($z_{m,t}$) is set to 0.5 and the observation noise $r_m$ of that classifier is raised. The authors expanded the classifiers' decision range to include the possibility of rejection: a classifier may return a no-classification output, rather than its estimate, when its confidence measure is too low. Thanks to the temporal structure of the method (the fact that there are a number of classifications at every point in time), missing classification results can be estimated based on the results of the other classifiers. The architecture proposed in the paper is as follows: at every point in time, each classifier m produces a classification decision and a corresponding confidence measure, which serves for making rejection decisions.
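To make the two stages concrete, here is a minimal NumPy sketch of the fusion loop described above. It is our own illustration, not the authors' code: the scalar models f and h, the noise levels q and r, and the missing-value convention (0.5 with inflated noise) follow the description in the paper, while all numeric values are arbitrary.

import numpy as np

def kalman_fusion(Z, f=1.0, h=1.0, q=0.01, r=0.1, r_missing=10.0):
    """Fuse per-classifier decisions Z (T x M, values in [0, 1],
    np.nan for a missing/rejected decision) into one score per time step."""
    T, M = Z.shape
    x, p = 0.5, 1.0                      # initial estimate and covariance
    fused = np.empty(T)
    for t in range(T):
        # prediction stage: x_hat = f * x_{t-1}, p_hat = f * p * f + q
        x_hat = f * x
        p_hat = f * p * f + q
        # update stage: one correction per classifier m
        for m in range(M):
            z = Z[t, m]
            if np.isnan(z):              # missing decision: neutral value,
                z, r_m = 0.5, r_missing  # raised observation noise
            else:
                r_m = r
            y = z - h * x_hat            # innovation
            s = h * p_hat * h + r_m      # innovation covariance
            k = p_hat * h / s            # Kalman gain
            x_hat = x_hat + k * y
            p_hat = p_hat - k * s * k
        x, p = x_hat, p_hat
        fused[t] = x
    return fused

# toy usage: three noisy classifiers, one of which sometimes rejects
rng = np.random.default_rng(0)
truth = (np.sin(np.linspace(0, 6, 50)) > 0).astype(float)
Z = truth[:, None] + 0.2 * rng.standard_normal((50, 3))
Z[::7, 2] = np.nan
print(np.round(kalman_fusion(Z), 2))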

Dataset
The paper used the AVEC dataset, presented in 2011 as a benchmark for user emotional states. The data are audio and video recordings of a human communicating with a virtual agent designed to stimulate one of four emotional states: Arousal, Expectancy, Power and Valence. Ground truth was determined by having eight human evaluators rate each recording on a continuous scale, with the final binary labeling determined by applying a threshold to the average evaluation.

The Base Classifiers

Below are the features used as input by the base classifiers. Each classifier was trained either on audio recordings or on video recordings, and optimized through cross-validation with other classifiers for the same recording type.

Audio:
To arrive at a fixed-length input vector (from a long, continuously varying input), an HMM-based transformation is used. Classification is done by five bags of random forests, and the final decision is determined by averaging the five, with the standard deviation used to compute the confidence measure. Audio classification is conducted per word, using three bags of words composed of the following features:
● Fundamental frequency, energy and linear predictive coding (LPC)
● Mel frequency cepstral coefficients (MFCC)
● Relative spectral transform - perceptual linear prediction (RASTA-PLP)

Video:
Video channel features were acquired from the Computer Expression Recognition Toolbox (CERT), dedicated to facial expression recognition. Four models from the CERT toolbox were used: Basic Smile Detector, Unilaterals, FACS 4.4 and Emotions 4.4.3. The outputs of all four models were concatenated to create a length-36 vector for every video frame.

The overall classification decisions and confidence measures were determined similarly to the audio, by using five bags of random forests. In 8% of cases facial recognition failed, leading to missing results for the base classifier, which the Kalman Filter handled. The hyperparameters of all base classifiers were optimized using the training and validation datasets. To optimize the hyperparameters of the classifier fusion algorithm (the Kalman Filter), the same training and validation datasets were used, this time treating the classification decisions and confidence measures of each base classifier as features.

"The model fuses a large number of measurements and can handle missing values by increasing the noise level for that classifier"
Results
The first table shows the pre-fusion performance of the audio and video classifiers separately, without classifier fusion. The results are percentages with standard deviation; the video results are per frame. Images without a classification decision, such as when the person is not speaking or facial recognition failed, were excluded for the purpose of this evaluation. The table presents the precision and the F1 rate, $F_1 = \frac{2PR}{P+R}$ (where P is the precision and R is the recall), for the four emotional state categories. Compared to the best performance previously achieved on the benchmark, these results are already impressive (for instance, the best precision previously achieved for predicting 'Arousal' was 61%).

The next two tables present performance for the four categories with uni-modal classifier fusion, that is, fusion of the audio base classifiers separately and of the video base classifiers separately.

Audio classification performance after fusion using the Kalman Filter:

Video classification performance after fusion using the Kalman Filter: the table below presents the video-channel results. Here, all categories show improvement compared to the pre-fusion results. Note, however, that Arousal achieved better results using the audio channel.
Reject indicates the percentage of decisions rejected; $q_{AUDIO}$ and r correspond to the process noise and the observation noise, respectively. For missing decisions, $z_{m,t}$ is set to 0.5 and the observation noise $r_m$ is set to a higher rate than originally assigned to r.

At first glance, comparing the results of the second table ("audio fusion") to the first, no-classifier-fusion table, it may seem that only Valence and Arousal show improvement. However, the most important improvement is that there is now a classification decision for every video frame.

The table below presents results for the fusion of audio and video channel data. This multi-modal fusion improves performance for Power and Expectancy. However, Arousal performance was best when fusing only the audio data, and Valence performance was best when fusing only the video data. The lowered performance of Arousal is likely caused by an imbalance between the audio and video: audio channel decisions aren't always available, while video channel decisions are almost always available. In the case of the Valence category, the lowered performance can be attributed to the relatively low F1 of the audio channel.
Conclusion
The paper presents an implementation of Kalman Filters for fusing the decisions of a number of classifiers. The authors show the feasibility of fusing multi-modal classifier outputs. The model fuses a large number of measurements and can handle missing values by increasing the noise level for that classifier. The authors used the Audio/Visual Emotion Challenge (AVEC) 2011 dataset to evaluate the performance of the fused classifier. Fusion clearly improved performance for all four emotional state categories measured in the challenge. Moreover, using the Kalman Filter enabled the system to estimate the missing classification results, so that it was able to classify all data (all video frames). Even though this Kalman Filter can be considered the simplest instance of a time-series model (with no control matrix and assuming an identity matrix for the dynamics), the results presented in the paper are excellent.
Project Management Tip: When the client says "Do the best you can!"

RSIP Vision's CEO Ron Soferman has launched a series of lectures to provide a robust yet simple overview of how to ensure that computer vision projects respect goals, budget and deadlines. This month Aliza Minkov tells us what to do when the client says: "Do the best you can!". It's another tip by RSIP Vision for Project Management in Computer Vision.

"…a good way of knowing if you are going in the right direction…"

Besides scientific and algorithmic challenges, computer vision work also presents project management challenges: we are going to discuss in this article the specific circumstance of projects that do not have well-defined quantitative goals. Since these are research projects, neither client nor supplier can predict results a priori. One of the first tasks which needs to be dealt with in every project is the need to define our goals, in terms that would be accepted and understood by both parties, the client and RSIP Vision.

In projects of this kind, customers start with some idea of what they want to achieve but, for lack of knowledge of what is possible to achieve, they often expect the service provider to supply the specific operative goals of the project: in other words, how to measure achievements and success at the end of the work. When the objectives are clear and understood, it is easier to verify whether we are getting closer or not. When this does not happen, the lack of operative goals has an effect on the whole process. For instance, if you define clearly the goals that you want, you can define the constraints upon your work. Some of them are straightforward and some are not: time, amount and quality of data needed. Ideally, the goals of a deep learning project in computer vision should include the level of accuracy that we want to achieve. Sometimes, at the time of the POC, the client has no clear idea of the desired accuracy. That's equivalent to saying: "Do the best you can (with the given constraints)!". That absence of ground truth might lead to overshooting, i.e. working more than necessary and spending more time than necessary to further improve the results.

Without it, gaps may be found in the design: for instance, we have the deep neural network, but how do we translate the output of the neural network to exactly what we need? When you have little time for experimenting, you cannot try everything: you need instead to focus your effort. You define an end-to-end procedure and can admit some errors in the procedure, if the final result is accurate enough. Again, the lack of ground truth denies the chance to evaluate the quality of the result. With no early estimate of the performance, the team might not know how they are doing until it's almost too late to rectify the project and refocus the efforts.

"… you need to develop some tools that give you an objective assessment!"

In that case, you need to have regular contact with the client: exposing what you do, with all the steps that you have taken, is key to keeping the focus as close as possible to the real expectations of the client. Measuring performance as you progress and reflecting it to the client is a good way of knowing if you are going in the right direction to provide what they really want.

There's something that we learn from most computer vision tasks: you look at the image, you look at the network output and you decide if it's good enough or not. But sometimes you need to develop some tools that give you an objective assessment. Doing that for all the parts of the project helps you decide where your efforts should be applied.

More articles on Project Management
Challenge: BreastPathQ

The SPIE (the international society for optics and photonics), the American Association of Physicists in Medicine (AAPM), and the National Cancer Institute (NCI) developed a Grand Challenge for the quantitative task of determining cancer cellularity from pathology hematoxylin and eosin (H&E) slide patches of breast cancer tumors. Cancer cellularity is the percent area (0-100%) of the tumor bed that is comprised of invasive or in situ tumor cells; it is one part of the MD Anderson residual cancer burden assessment approach, with applicability, for example, in assessing neoadjuvant treatment of breast cancer. This intro and almost all of the article are courtesy of Nicholas Petrick, Deputy Director of the Division of Imaging, Diagnostics and Software Reliability at the Center for Devices and Radiological Health, U.S. Food and Drug Administration. Nicholas led this challenge effort together with Kenny Cha (also from the FDA), Shazia Akbar and Anne Martel (both from Sunnybrook Research Institute), among others.

Challenge synopsis
The challenge included a training phase (2579 512x512 patches from 46 patients), a validation phase (185 512x512 patches from 4 patients) and a test phase (1121 512x512 patches from 18 patients), using digitized H&E stained breast cancer slides. The data was collected at the Sunnybrook Research Institute, Toronto, as part of a research project funded by the Canadian Cancer Society, and IRB approval was obtained to allow the anonymized data to be shared. The training and validation phases included feedback to the participants on their algorithm's performance and had visible leaderboards allowing participants to compare their performance with other algorithms. The test phase was blinded, without any performance feedback available to participants until after the challenge closed.

The performance metric selected was prediction probability (pk), which assesses the ordering of the patches between the method and the reference, but not the absolute cellularity scores for each patch, and ranges between 0.0 and 1.0. The reference standard ("truth") was the cellularity scores from one pathologist for the training/validation patches and the cellularity scores from two pathologists for the test patches.

A total of 100 qualified algorithms were submitted from 37 different groups. The winning algorithm achieved pk=0.94, with algorithm performance ranging from pk=0.21 to pk=0.94.

Nicholas Petrick
SPIE-AAPM-NCI BreastPathQ
What did we learn?

• We learned that participant engagement was critical. The BreastPathQ challenge included a forum section allowing participants and organizers to ask questions, answer questions, form teams and provide feedback to all participants at once. One example of how the forum was used was when a participant identified a problem with the original challenge performance metric. Once this was identified on the forum, we, as the challenge organizers, were able to assess the problem, determine it was a concern and implement a revised performance metric to address the issue. This is quite a nice example of how participant engagement directly led to an improved challenge.

• We learned that paying attention to the performance criteria for comparing algorithms is crucial. We found that a rank-based performance metric, prediction probability, would work best in our challenge because of the high variability in human pathologists' absolute cellularity scoring of individual patches. A rank-based performance metric assesses how the algorithm orders patches from lowest cellularity to highest cellularity. The human pathologist "truthers" had smaller variability in ranking patches compared with determining absolute percent cellularity scores. By taking this into account, we believe we implemented a more relevant approach for comparing the ability of algorithms to estimate patch cellularity as part of our challenge.

The figure above shows different patches within the pathology slide with various levels of cellularity scoring by a pathologist. Courtesy of Anne Martel, Sunnybrook Research Institute, Toronto (and used as part of the BreastPathQ Challenge logo).
• We learned that the submitted algorithms generally performed quite well in assessing cancer cellularity for H&E breast cancer tumor patches, with the majority of submitted algorithms having a prediction probability value (pk) greater than 0.90 on a scale of 0.0 to 1.0. This is somewhat comparable to the human pathologist readers, who achieved prediction probability values of 0.96 and 0.93 on our test dataset. While algorithm performance still needs to be confirmed on a much larger and more diverse patient dataset to verify these results, the challenge suggests that automated cancer cellularity scoring may be a reasonable approach to consider in order to reduce variability among pathologists and potentially streamline the assessment of residual cancer burden in breast and other cancers.

• We learned that a well-organized challenge in the medical field requires people with many kinds of expertise working together. For this challenge, we had experts in the clinical task (to define the task to study and its clinical relevance), in statistical analysis (to determine the evaluation of the results), in logistics support (to ensure that the challenge runs smoothly), as well as the support of organizations for this challenge (to advertise the challenge and to have a venue for reporting the results). Without these experts working together, the challenge would not have been successful.

We have collected comments from the two winning teams. David Chambers, Senior Research Engineer at the Southwest Research Institute (SwRI):

"The cellularity is a measure of the area of the patch that is occupied by malignant cells, so the problem can be thought of as a segmentation problem underneath. The Southwest Research Institute and Univ. of Texas Health Sciences Center at San Antonio team approached the problem as a weakly-labeled segmentation problem, and iteratively refined our algorithm by providing strong labels for hard examples and retraining. Pathologist expertise played a large role in our success. Because global context is important in this problem, we used a network with 'Squeeze-and-Excitation' architectural units, a recent development in neural network architectures that allows for reweighting features according to global content."

Co-winner Mamada Naoya, a master's course student at the Tokyo Institute of Technology, concludes:

"I study deep learning applications for material science and I know next to nothing about pathology and cancer. So my winning shows the versatility of deep learning, I believe. And I was surprised to hear that top models' performances are as good as expert pathologists'. With a larger amount and more variety of data (annotation by many pathologists, different optical devices, rare clinical cases, etc.), super-human models will be possible."
Artificial Intelligence Spotlight News

Computer Vision News has found great new stories, written somewhere else
by somebody else. We share them with you, adding a short comment. Enjoy!
Delivery robots put sidewalks in the spotlight:
How ready is robotics technology to navigate the surprisingly perilous world of the neighborhood sidewalk? Now that Amazon has announced a pilot of autonomous delivery robots, what are the main technological hurdles that need to be solved? Read More…
‘Fauxtography’ is now a fact of life:
Unedited photos are today no more than a romantic
ideal. Theverge.com tells us what image processing
edits are silently done by our smartphones even before
we see the pic; after that, manual photo retouching
has become more a habit than an option. Whitening
and flattening the skin, enlarging the eyes, shrinking or
straightening the nose… the article is well written and
fun to read. Enjoy!
Don't Let Artificial Intelligence Pick Your Employees:
Are algorithms sophisticated enough to make strategic decisions like hiring? Even though a Stanford organizational behavior teacher thinks they are not, the use of machines in hiring became widespread long ago, and now between 65% and 70% of all job applications are touched first by a machine. Read More…
Exploring PlaNet, Google-DeepMind's Solution for Long Term Planning in Reinforcement Learning Agents:
Google and DeepMind AI researchers work on a Deep Planning Network (PlaNet), a new model that can learn about the world using images and utilize that knowledge for long-term planning, with two new concepts: a Recurrent State Space Model and a Latent Overshooting Objective. Read About It Here…
Artificial Intelligence Should Start With Artificial Joints:
Artificial intelligence is converging with medicine and
one of the areas involved is joint replacement: large
high quality patient data may feed a sophisticated
algorithm that predicts the risks of a joint replacement
in the future as well as expected costs and recovery
info. Read More and What RSIP Vision Offers in This Field
What's Behind JPMorgan Chase's Big Bet on Artificial Intelligence?
America's biggest bank bets on AI to shape its future strategies. Read More… Listen…
3D Stereo Vision Calibration and Triangulation

by Aliza Minkov

Every month, Computer Vision News reviews a successful project. Our main purpose is to show how diverse image processing techniques contribute to solving technical challenges and real-world constraints. This month we review a computer vision project by RSIP Vision: 3D Stereo Vision Calibration and Triangulation.
Our client's main goal in this project was to find the 3D location of certain points of interest. They had automated operations where they need to know the exact location of objects in space. The original system was not accurate enough for the client's ambitions. For this purpose, we used two cameras to find the exact 3D location out of 2D image points. At the Proof of Concept phase, the client defined some of the camera specifics, like the distance and the height of the objects of interest; the specs were of course expected to be different in the real production run. The two-camera system was needed to solve the depth part, transforming the 2D image into 3D points. Before starting the project, we did a simulation to verify whether additional cameras would enhance the performance in that environment: this was not the case, so for simplicity we decided to keep working with two cameras, optimizing the choice of angles, lenses and focus to be used in order to see the object with both cameras.

There were two main challenges to obtaining accurate results in this project, which we solved using computer vision techniques. The first one was the stereo vision and triangulation part between the two 2D points: theoretically, this sounds like something that simple linear algebra equations can solve. In practice, these two points need to correspond: it has to be the same point in 3D that you see in both 2D cameras. In real life this is not always the case, due to distortions of the lenses, which we solved by doing an appropriate calibration at the beginning of the project. This is a one-time process that fixes the distortion. We also needed to find the camera geometry, to do a correct triangulation. That means the camera's intrinsic parameters, as well as its rotation and translation, to get to the coordinate system that we want and the relation between the cameras. Of course, in this process of finding the parameters you also have some errors, and you need to be very organized and precise in all the steps of the calibration, which involves repeating some of the parts to get better results. In this case, we used an object with a calibration pattern in the form of a known geometry: a chessboard was chosen, in this and in many other cases, since it has a grid with boxes of the same size and a repeated pattern with straight lines, so that you can fix the distortion (by finding the distortion coefficients, so that all the corners lying on straight lines in reality will appear on straight lines in the image as well). By taking many images of the chessboard in different locations and using optimization algorithms, we were able to fine-tune the parameters and find the best ones to satisfy these conditions, solving the calibration problem. The need to cover most of the field of view with a single chessboard was challenging, since the field of view was pretty large and the chessboard limited in size.
We solved this challenge by moving around and taking many images; in this way, we gave many 2D-3D correspondences to the algorithm, as if we were taking images of a big chessboard. It's actually good to use more than one image in any case for the optimization algorithm, even if most of the field of view is covered by one image. We even had to replace the chessboard, since the first one was printed on a slightly bending material, bending the straight lines with it, which contributed to the error. In addition, for the extrinsic calibration part (the calibration between the two cameras), you want the cameras to see exactly the same 3D points, and in some of the images we saw that there was a slight movement due to the humans physically holding the chessboard in their hands. This slight difference can have a huge influence on the error, because it is actually not the same point that you are showing. The conditions for stereo calibration require that you use a grid with straight lines, with no movement between the two pictures. This solution per se is not very difficult: it requires organization and precision in order to make it work well the first time, localizing exact 3D point images to give as input to the navigation system.

The second challenge consisted in automatically extracting the same feature points from both images. For that purpose, we used a deep learning neural network, which we trained to accurately localize these points. Of course, this requires a lot of data, which must be collected without delaying the Proof of Concept's schedule. Trying to triangulate image points that do not correspond exactly will cause an error. That's why great accuracy is required, even after the training of the network; hence some algorithms were used to enhance the accuracy of the point correspondence. For instance, if the point of interest is on some plane seen by both cameras, you may want to get the homography matrix, a matrix that relates the transformation between two planes. Using the homography matrix, you can transform one image point to its exact corresponding point in the second image. How do you get the homography? Again, there are several traditional computer vision algorithms. One way is using SIFT, matching similar feature points between the two images, after the neural network has identified the area of interest. It must be noted that SIFT might not work very well when the orientation of the cameras is too different; in addition, it may fail for pattern-like repeating structures, because the algorithm can mistakenly match similar but wrong feature points. In that case, another approach can be taken, called dense stereo matching. In this method we find the depth map of our area of interest. When you have a depth map, you can better match each point with its corresponding point in the second image. With enough corresponding points, you can use algorithms like RANSAC to get the best-fitting plane equation for the inlier point correspondences, and the plane equation will give you the homography.

There are many challenges in computer vision projects like this, and it is key that you work with experts like RSIP Vision's engineers to solve them in the optimal way and reach your goals.
Take us along for your next Deep Learning project! Request a call here
Focus on: PyOD
by Assaf Spanier
What happened? Is this a legitimate transaction, an unusual purchase, a change in behavior? Or is something wrong? What is the reason for this outlier?

Outlier detection, also known as anomaly detection, refers to identifying rare occurrences, observations, or, in the most general sense, data points that show a distinct variance from the general population. Ground truth labeling is scarce, especially for outlier detection tasks. Outlier detection is especially valuable in fields processing big data. Industry applications include financial fraud detection (Ahmed et al.), mechanical failure detection (Shin et al.), network infiltration detection (Garcia-Teodoro et al.) and pathology detection in medical imaging (Baur et al.).

An outlier is any data point that is very different from the rest of the observations in a set of data. Some examples: an 8th-grade class includes one student who is 1.85 m tall, when all the other students are between 1.55 m and 1.70 m; or a client's purchasing patterns are analyzed and it turns out that while most of his purchases are under $100, a single purchase of over $20,000 shows up out of nowhere. What happened? Is this a legitimate transaction, an unusual purchase, a change in behavior, or is something wrong? What is the reason for this outlier?

There are many reasons for the occurrence of outliers. Perhaps there was an error in data entry, or a measuring error; incorrect data can even be given purposefully: individuals who don't want to reveal the real data about themselves may feed made-up data as input (for instance into online forms). Of course, we must remember that outliers may also represent real unusual occurrences.

Why do we need to discover outliers?

Outliers can dramatically affect the results of our analysis and our statistical models. Let's look at the following example to realize what happens to our model when outliers are included in our dataset versus when they have been discovered and removed from it. Let's assume our model is linear regression, and we are trying to fit a line to a given dataset. Look at the difference between the line on the left (fitted to the dataset including the outliers) and the line on the right (fitted after the outliers were removed).
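To see the effect numerically, here is a small scikit-learn sketch (ours, with made-up toy numbers): a single extreme outlier drags the fitted slope far away from the true trend.

import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(42)
X = np.arange(20, dtype=float).reshape(-1, 1)
y = 2.0 * X.ravel() + rng.normal(0, 1, 20)   # true slope is 2

y_outlier = y.copy()
y_outlier[10] += 200.0                       # one extreme outlier

for label, target in [('clean', y), ('with outlier', y_outlier)]:
    slope = LinearRegression().fit(X, target).coef_[0]
    print(f'{label}: fitted slope = {slope:.2f}')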

PyOD provides a wide variety of outlier detection algorithms, including established outlier ensembles and more modern neural-network-based approaches, all available through a single, well-documented API made with both industry users and researchers in mind. PyOD was implemented with emphasis on unit testing, continuous integration, code coverage, maintainability checks, interactive examples and parallelization. Here is the link to download PyOD (it is compatible with both Python 2 and 3).

PyOD offers a number of advantages over previously available libraries:
1. PyOD includes over 20 algorithms, covering both classic techniques such as Local Outlier Factor and the latest neural network architectures like autoencoders or adversarial models.
2. It implements a number of methods for merging / combining the results of numerous outlier detectors.
3. It uses a unified API and includes detailed documentation with interactive examples of every method and algorithm, for clarity and ease of use.
4. All functions and methods include code testing and continuous integration, parallel processing and just-in-time (JIT) compilation.
5. It can be run on both Python 2 and Python 3, and on all the major operating systems (Linux, Windows and macOS).
6. It includes many popular detection algorithms, some of which are reviewed below.
Let's review some of the algorithms included:

Isolation Forest - A set of trees is used to partition the data, and outliers are determined by looking at the partitioning and seeing how isolated a leaf is in the overall structure. Isolation Forest handles multidimensional data well.

Histogram-based Outlier Detection - An effective method for unsupervised data; it assumes feature independence. Outliers are detected by constructing histograms and measuring the distance of each point from the histograms. It's much faster than multivariate approaches, but at the cost of lower accuracy.

Angle-Based Outlier Detection (ABOD) - The method measures the distance of every data point from its neighbors, taking into account the distance between those neighbors; the variance of the cosine scores is the metric used for outlier detection. ABOD handles multidimensional data well. PyOD includes two versions: 1) a fast version, using only the k nearest neighbors, and 2) a version taking all data points into account.

k Nearest Neighbors Detector - For each data point, the distances to its k nearest neighbors are used for outlier detection. PyOD supports three versions of kNN: 1) using the distance to the k-th nearest neighbor as the metric for outlier detection, 2) using the average of the k nearest neighbor distances as the metric, and 3) using the median of the k nearest neighbor distances as the metric.
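As a quick illustration, the three kNN variants can be selected with a minimal sketch like this (assuming PyOD's method parameter, which takes the values 'largest', 'mean' and 'median'):

from pyod.models.knn import KNN
from pyod.utils.data import generate_data

X_train, y_train = generate_data(n_train=200, train_only=True, n_features=2)

# the three kNN variants described above
for method in ('largest', 'mean', 'median'):
    clf = KNN(n_neighbors=5, method=method, contamination=0.1)
    clf.fit(X_train)
    print(method, 'flagged', clf.labels_.sum(), 'outliers')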
Local Correlation Integral (LOCI) - LOCI is very effective at detecting both individual outliers and clusters of outliers. The method produces a LOCI plot for every data point, which summarizes the information about the data points in the area surrounding that point. It determines clusters, micro-clusters, the diameters of clusters and the distances between clusters, and from these measurements determines the degree of anomaly of the data point.

Feature Bagging - Feature bagging fuses a number of base detectors to improve prediction accuracy; fusion can use simple methods like averaging or taking the median, or more sophisticated ones. Local Outlier Factor is used as the default base detector, but any other outlier detection algorithm, such as kNN or ABOD, may be substituted.
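For instance, swapping the default base detector for kNN looks like this (a minimal sketch, assuming the base_estimator argument of PyOD's FeatureBagging):

from pyod.models.feature_bagging import FeatureBagging
from pyod.models.knn import KNN
from pyod.utils.data import generate_data

X_train, y_train = generate_data(n_train=200, train_only=True, n_features=2)

# ten kNN detectors, each trained on a random subset of the features
clf = FeatureBagging(base_estimator=KNN(n_neighbors=5),
                     n_estimators=10, contamination=0.1)
clf.fit(X_train)
print('flagged', clf.labels_.sum(), 'outliers')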
As stated, PyOD can be run on either Python 2 or 3, using the package six. It's based on NumPy, SciPy and scikit-learn, and uses Keras for advanced neural network methods such as autoencoders and SO_GAAL. To improve scalability, the algorithms were optimized with JIT compilation through Numba, and the library supports parallel processing on multi-processor computers.

The PyOD API is inspired by the scikit-learn API and closely mirrors that well-known interface. In particular, it includes the following:
1) fit, which trains the model and collects the appropriate statistics;
2) decision_function, which ranks outliers for every new data point, once the model is trained;
3) predict, which returns a binary label for each data point;
4) predict_proba, which delivers the result as a probability measure;
5) fit_predict, which corresponds to calling predict after performing fit.

The package is available as a Python open source toolbox and it can be easily extended to implement new methods. New models are very easy to train within this framework (using the unified and well-known API), and you can use polymorphism to easily inherit any function and implement it for your own needs.
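For example, a custom detector can inherit from PyOD's base class and override just the two core methods (a minimal sketch, assuming the BaseDetector interface with its decision_scores_ attribute and _process_decision_scores helper, as in PyOD's documentation; the z-score logic is our own toy choice):

import numpy as np
from pyod.models.base import BaseDetector

class ZScoreDetector(BaseDetector):
    """Toy detector: anomaly score = largest absolute z-score per sample."""

    def __init__(self, contamination=0.1):
        super(ZScoreDetector, self).__init__(contamination=contamination)

    def fit(self, X, y=None):
        X = np.asarray(X, dtype=float)
        self.mean_ = X.mean(axis=0)
        self.std_ = X.std(axis=0) + 1e-12
        self.decision_scores_ = self.decision_function(X)
        self._process_decision_scores()   # derives threshold_ and labels_
        return self

    def decision_function(self, X):
        z = np.abs((np.asarray(X, dtype=float) - self.mean_) / self.std_)
        return z.max(axis=1)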
How to install the package - as simple as:

pip install pyod

Example: testing four methods on a randomly generated dataset
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

from pyod.models.abod import ABOD
from pyod.models.knn import KNN
from pyod.models.loci import LOCI
from pyod.models.iforest import IForest
from pyod.models.feature_bagging import FeatureBagging

from pyod.utils.data import generate_data, get_outliers_inliers

# generate random 2D data, 20% of which are outliers
X_train, Y_train = generate_data(n_train=500, train_only=True, n_features=2)

# store outliers and inliers in different numpy arrays
x_outliers, x_inliers = get_outliers_inliers(X_train, Y_train)
n_outliers, n_inliers = len(x_outliers), len(x_inliers)

# grid on which the anomaly scores will be visualized
xx, yy = np.meshgrid(np.linspace(-10, 10, 200), np.linspace(-10, 10, 200))
plt.figure(figsize=(10, 10))

# test 4 different methods
contamination_ = 0.2
classifiers = {
    'ABOD': ABOD(contamination=contamination_),
    'KNN': KNN(contamination=contamination_),
    'Bag': FeatureBagging(contamination=contamination_),
    'IForest': IForest(contamination=contamination_)
}

for i, (clf_name, clf) in enumerate(classifiers.items()):
    clf.fit(X_train)

    # raw outlier scores (negated: higher now means more normal)
    scores_pred = clf.decision_function(X_train) * -1

    # binary outlier predictions and the number of labeling errors
    y_pred = clf.predict(X_train)
    n_errors = (y_pred != Y_train).sum()

    # threshold value separating outliers from inliers
    threshold = stats.scoreatpercentile(scores_pred, 75 * contamination_)

    # calculate the raw anomaly score over the whole grid
    t = clf.decision_function(np.c_[xx.ravel(), yy.ravel()]) * -1
    ZZ = t.reshape(xx.shape)
    # plot the scored contours and a red line at the threshold
    subplot = plt.subplot(2, 2, i + 1)
    subplot.contourf(xx, yy, ZZ, levels=np.linspace(ZZ.min(), threshold, 15))
    subplot.contour(xx, yy, ZZ, levels=[threshold], linewidths=2, colors='red')

    # fill the region where the anomaly score runs from the
    # threshold up to the maximum anomaly score
    subplot.contourf(xx, yy, ZZ, levels=[threshold, ZZ.max()], colors='blue')

    # scatter plot of inliers with white dots
    subplot.scatter(X_train[:-n_outliers, 0], X_train[:-n_outliers, 1],
                    c='white', s=12, edgecolor='g')
    # scatter plot of outliers with black dots
    subplot.scatter(X_train[-n_outliers:, 0], X_train[-n_outliers:, 1],
                    c='black', s=12, edgecolor='g')
    subplot.axis('tight')
    subplot.set_title(clf_name)
    subplot.set_xlim((-15, 15))
    subplot.set_ylim((-15, 15))

plt.show()

The output is a 2x2 grid of contour plots, one per method. We clearly see that ABOD, Isolation Forest and KNN found the main clusters (red contour) and left the outliers out, while Feature Bagging failed to find the outliers.
Women in Science

Chip Huyen

Chip Huyen works as a deep learning engineer with the Artificial Intelligence Applications team at NVIDIA, where she develops new tools to make it easier for companies to bring the latest deep learning research into production. Our attentive readers might remember her as the author of SOTAWHAT. She has recently released lazynlp, a library that allows easy scraping, cleaning, and de-duplicating of webpages to create massive monolingual text datasets. Originally from Vietnam, Chip graduated from Stanford University with a BS and MS in Computer Science. More interviews with women scientists

"Physical distance is not that big of a deal!" (photo: Timothy Archibald)

Chip, I think that you have many things going on in your life already, even at such a young age. You grew up in Vietnam, traveled all over the world for three years, and studied at Stanford in California. Which one do you want to talk about first?
I don't know! I think all of them are in the past.
So tell me about your current work. What is it?
I'm working with language models mostly. My team is building a toolkit to help companies bring AI research into production. My team focuses more on speech recognition, but I am focused more on the research side: the language modeling.
"… the language we speak helps form who we are!" (photo: Dan Taylor/Heisenberg Media)

How did you get there?
I have always been interested in languages. My background is in writing, so I find languages fascinating. And also, in my time traveling, I realized that language is a big part of any culture, and the language we speak helps form who we are. When I came to Stanford, I got super interested in computer science, programming, and NLP. I could combine my love for languages and computer science.
Once Amy Bearman told me about the duck syndrome at Stanford, where students are like ducks: on the surface very placid and self-confident, but under the surface they are paddling like mad. Do you know about it?
Yeah, it's pretty common to hear about it.
Is it particular to Stanford?
I think there's this certain perception of California that makes it worse. I feel like people here are very outgoing, healthy, and have a very positive outlook on life, which is great, but it also creates a pressure to do well and be happy all of the time. I live in this amazing place with amazing weather, like 300 days of sunshine. When somebody asks me, "How are you doing?", the only correct answer is to say, "Oh, I'm doing great!" because, how can you not? So I just feel like if I'm not happy, what's wrong with me? Why am I not happy in such a perfect place? And you have to pretend to be happy, and I think that leads to the duck syndrome.
So tell me something about California. What do you like there which is better than Vietnam?
Woah [laughs]. I think the weather is nice. Have you been to Vietnam?
No, but I'm going there in a few months. You will have to give me tips.
Well, it is very humid.
I expect to be there in October.
You are likely going to sweat a lot. In California, it's never really too hot or really too cold. Even in the winter, you can still have sun, go out, and play sports. The air quality is really nice, and there's a blue sky. When I went back to Vietnam the last time, the sky seemed to be foggy because of the pollution.
Tell me something nice about Vietnam that we don't know.
There are definitely more human interactions, personal interactions. I think it's great that everyone here is really motivated, and I am also motivated talking to them. I also feel like I have to live up to all of these
26 Women in Science
Computer Vision News

expectations, and sometimes I feel like The 365Project shut down to make
people judge me for my potential. room for Stanford in Real Life.
They become my friends because they What happened to it?
think I’m going to do great in the
future or do something for them. It's Nothing, I’m just changing. I’ve been
working on the Learn365Project for a
not because of my personality. So while, and it’s just like something to
there are some things that are a bit
transactional about human scribble on every day. I tried, but I
didn't do it every day. Sometimes I
Women in Science

interactions instead of companions. I would do it and not publish it. Stanford


miss that a lot.
IRL is more focused. It’s solely about
A friend of mine told me a few people: what they are doing, the
months ago that Vietnam is heaven struggles that they are facing after
on earth because of the kindness of college, and the career paths they are
people. Would you agree with that? choosing. I’m working a lot on my full-
I think that each place has its own time job at NVIDIA. I really like my job.
characteristics, and some are pros. I’m working on my book, and the
Some are cons. The thing is there are Stanford in Real Life series.
people living in both places. That You do a lot of things! What drives
means that no place can be that bad if you?
people still live there. I’m actually
writing a book on Vietnam. I’m trying What drives you, Ralph?
to figure out what makes Vietnam, I try to live according to my values. I
Vietnam. try to make sure that I do things so I
will be happy with myself a few years
That’s very nice. This one is in English? ahead of now. My formula is a mix of
I hope so. Yes, it is in English three things. One is that I do things
Tell me about AI in Vietnam. I know that I like. The second is to do things
that you are working on something. that I think are useful. The third is to
I’m a member of this organization do things for which I don’t have to
called VietAI. It’s a nonprofit compromise my standards, my inner
organization that hopes to help create values.
a new generation of AI engineers and What are your inner values?
researchers in Vietnam. We have
courses that only cost 250 dollars for
ten weeks of learning. We charge
enough for operations, and we have
events. Last December we had this AI
Summit. I think it was the very first AI-
centric conference in Vietnam.
You do a lot of stuff! I tried to see how
many things you do and I lost count.
You have your blog. You have VietAI.
You have the new Stanford in Real Life
posts. You have your Learn365Project…
Mainly truth, honesty, kindness... this kind of stuff. The most important one is truth. What about you?
I think for me it's just a desire to contribute, to create content.
For other people to benefit from?
Yes, some sort of value in myself. When you contribute something, you value yourself and feel useful. You're making a difference, even if it is small. I write a lot in Vietnamese. I contribute to a couple of columns in Vietnamese newspapers. I feel super, super privileged. I had a chance to travel and see different countries first hand. There are a lot of things that I learned that I would like to share with other people in Vietnam, so I write about it.
What should you get out of it? It sounds perfect.
I just really feel like maybe I'm just obsessed with this desire to share!

"When you contribute something, you value yourself and feel useful."

That includes the free hugs campaign…
Yeah!
Tell our readers.
It started in high school. I lived away from home. It was a great chance for me to experience life. I don't think many people have the chance to stand on their own feet at 15. It gave me a taste of adulthood. But sometimes, when I got sick, it got really lonely. I realized, in Vietnam, parents don't really tell you that they love you. There's no hugging. It's a different way to express feelings and love to each other, but sometimes I felt like if somebody hugged me, I would feel so much better, but I didn't know how. Then I went online, and I saw this free hugs campaign. I think it was by an Australian guy. I was like, "Wow! This is great!" Strangers can also hug each other! A hug is not just a hug. It's a symbol of connection, of both emotional and physical connection. So I wrote this very emotional blog post about how everyone should hug each other. It got very popular. We decided to do a campaign in Vietnam. It was one of the biggest youth-organized events in the country. A thousand young people went into the street all across the country, and we just went around and hugged people. I remember there was this lady who was selling things on the street. She was really old. We went to hug her, and she just started crying. She had never felt that before. It made me feel really sad, but also happy at the same time. It's weird. It was very emotional. After that, we already had a
huge network of volunteers. We started this organization. We called it Free Hugs Vietnam. At first, we tried to help foreigners, like foreign NGOs. When they come to Vietnam, we help them liaise with the local organizations. We give them volunteers. We have a lot of events for young people, like English-speaking clubs and planned competitions. We raised money. We bought meals. We went around and gave meals to homeless people during the holidays.
How nice!
It's really nice. I'm really happy. I'm not actually involved with it anymore. There are other young people in charge.
You have traveled a lot and met a lot of people, so you have plenty of friends almost everywhere. And of course, your family is very far away. By being in different places all of the time, you certainly miss people. How do you handle this?
I feel like the fact that I was able to travel makes me realize that physical distance is not that big of a deal. For me, when I don't see my friends regularly, I know that if we see each other again, we're going to be friends again. I'm not traveling that much anymore. I'm mostly in California now.
Are you planning a PhD?
I'm not doing a PhD.
Can you tell me why?
I think PhDs are great. I just don't think that it's for me.
That's fair enough. Tell me something about NLP. What do you like about it?
NLP still has a lot of challenges. I feel like the way that we approach NLP nowadays is not going to solve everything. I see a lot of big, important papers coming out, but I don't see a fundamental approach to it yet. Actually, I'm very interested in transfer learning for NLP. I'm also very interested in something that I see as curriculum learning. I feel like children who learn languages have a very small vocabulary and they gradually build it up over time. I'm trying to see whether it's possible with a model: you start with a dataset of simplified English, and gradually learn more complex English. So I created a set of children's books with very simple English, and I'm not sure it's going to work. It's an open research problem.
Do you believe that human knowledge is going upwards?
We're having less and less unknowns every day, but how far are we from being completely unignorant? I don't know!

Read more interviews like this
"I had a chance to travel and see different countries first hand!"

Feedback of the Month
We contracted RSIP Vision to develop algorithms for our leading product, intended for SEM image quality improvement. They applied a high level of computer vision expertise to create and deliver effective solutions. I highly value their work and would definitely recommend their services!
Ishai Schwarzband
Advanced Computations and Measurements Group Manager, Applied Materials
Bay Vision Meetup

Among many other initiatives, RSIP Vision sponsors the Bay Vision Meetup, for which we invite great speakers in the Silicon Valley. The February 27 meeting held in Cupertino focused on the topic of Developments in Artificial Intelligence for Autonomous Vehicles. In these two images, Shmulik Shpiro (RSIP Vision) says a few words of welcome and introduces the two speakers.
On top, the first speaker: Mohammad Musa, Founder and CEO at Deepen.ai, speaking about "3D Point level segmentation". Below, the second speaker: Eran Dor, Co-Founder & CEO at CRadar.Ai, discussing how to utilize radar signals to improve and change the way radar sensors detect, classify and perceive the car's surroundings. Thank you both for your great talks! The full video is here!
Upcoming Events

TRI-CON - International Molecular Medicine Tri-Conference
San Francisco, CA, Mar 10-15. Website and Registration

AAOS Annual Meeting - American Academy of Orthopaedic Surgeons
Detroit, MI, Mar 12-16. Website and Registration

ACC - American College of Cardiology's 68th Annual Session
New Orleans, LA, Mar 16-18. Website and Registration

ADAS Sensors 2019
Detroit, MI, Mar 20-21. Website and Registration

IAPR Computational Color Imaging Workshop
Chiba, Japan, Mar 27-29. Website and Registration

ISBI 2019 - IEEE International Symposium on Biomedical Imaging
Venice, Italy, Apr 8-11. Website and Registration

European Symposium on Artificial Neural Networks, Computational Intelligence and ML
Bruges, Belgium, Apr 24-26. Website and Registration

AI & Big Data Expo Global
London, UK, Apr 25-26. Website and Registration

Automatic Face and Gesture Recognition
Lille, France, May 14-18. Website and Registration

ICRA - International Conference on Robotics and Automation
Montreal, Canada, May 20-24. Website and Registration

Did we forget an event? Tell us: [email protected]

FREE SUBSCRIPTION
Dear reader,
Do you enjoy reading Computer Vision News? Would you like to receive it for free in your mailbox every month? You will fill the Subscription Form in less than 1 minute. Join many other computer vision professionals and receive all issues of Computer Vision News as soon as we publish them. You can also read Computer Vision News in PDF version and find new and old issues in our archive.
Subscription Form (click here, it's free)
We hate SPAM and promise to keep your email address safe, always.

Did you read the Feedback of the Month? It's on page 29!

FEEDBACK
Dear reader,
If you like Computer Vision News (and also if you don't like it) we would love to hear from you: give us feedback, please (click here). It will take you only 2 minutes. Please tell us and we will do our best to improve. Thank you!
IMPROVE YOUR VISION WITH Computer Vision News
The only magazine covering all the fields of the computer vision and image processing industry
SUBSCRIBE: CLICK HERE, IT'S FREE
A PUBLICATION BY RSIP Vision