Sanity 2

The document discusses the importance of sanity checks and testing when writing code and conducting experiments. It notes that bugs are very common in machine learning code and provides examples of bugs found in published papers. It recommends practices like writing unit tests, comparing results to simple baselines, checking invariances, and plotting details to catch bugs early and avoid retractions. The key message is that a few basic sanity checks can go a long way in validating results.

Sanity Checks

David Duvenaud

Cambridge University
Computational and Biological Learning Lab

April 24, 2013


A Simple Example: Comparing Models of Prawn Minds

• Paper: Comparing evidence for different models of prawn behavior.
• Requires inference conditioned on results of experiments.
• Author's highest-impact publication venue so far.

Actual experiments: 102 repetitions of prawn flocking.
function [logP, samples] = logP_mc_ring_memory(theta, direction, N, modelidx, type)

if nargin < 5
    type = 1;
    if nargin < 4
        modelidx = 1;
    end
end

% downsample inputs, coreelation length is ~10 frames
for i = 1:numel(theta)
    theta = theta(:, 1:2:end);
    direction = direction(:, 1:2:end);
end

logP = zeros(N, 1, 'double');
samples = zeros(N, 6, 'double');

priormin = [0, 1, -2, -2, 0, -7.5];
priormax = [pi, 5, 2, 2, 1, -7.49];
priorrange = priormax - priormin;

switch modelidx
    case 0
        log_l_pdf = @(x) logP_ring_null(theta, direction, x(1), x(2), x(3:4), x(5), x(6));
    case 1
        log_l_pdf = @(x) logP_ring_mf(theta, direction, x(1), x(2), x(3:4), x(5), x(6));
    case 2
        ...
Is anything amiss?

% downsample inputs,
% coreelation length is ~10 frames
for i = 1:numel(theta)
    theta = theta(:, 1:2:end);
    direction = direction(:, 1:2:end);
end

• theta is a cell array, one cell per experiment; each iteration discards half the experiments!
• At the end of the loop, only 1 of 102 experiments left.
• So many pointless experiments!
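The bug fits in a few lines: the loop body reassigns the whole collection instead of indexing into one cell, so each pass halves the number of experiments rather than downsampling frames within one experiment. A minimal sketch of the same mistake (in Python rather than the talk's MATLAB; all names are illustrative):

```python
# Each "experiment" is a list of frames; the goal is to downsample
# every experiment by keeping every other frame.
experiments = [list(range(100)) for _ in range(102)]

# Buggy version (mirrors the slide): the loop reassigns the whole
# collection, so each iteration discards half the *experiments*.
buggy = [list(range(100)) for _ in range(102)]
for _ in range(len(buggy)):
    buggy = buggy[::2]
print(len(buggy))  # -> 1 experiment survives out of 102

# Fixed version: downsample frames inside each experiment.
fixed = [exp[::2] for exp in experiments]
print(len(fixed), len(fixed[0]))  # -> 102 experiments, 50 frames each
```

Repeated halving takes 102 down to 1 within seven iterations, after which slicing a one-element list leaves it unchanged: exactly the "1 of 102 experiments left" on the slide.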

The lesson: never release your code

[update: Was fixed and re-published:
www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1002961]
Very Common

In Machine Learning

• 2009: Oxford vision group retraction after including test cases in training set.
• An unnamed lab member almost didn't include results in NIPS paper because of a sign error in plots.
• Retraction Watch Blog: retractionwatch.wordpress.com

In General

• Your code will have bugs!
• My rate: about 1 per line of matlab.

How to trust anything?


When writing code

Standard Advice

• Write unit tests.
• Physicists have good protocols.
• Compute the same thing in different ways.
• Use checkgrad!

Carl's Advice

• Re-write your code until it uses the right data structure.
• Keep your code short and simple, no corner cases.
• Ideally, everything fits on one page.

These methods work but slow you down
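A gradient check in the spirit of checkgrad compares a hand-derived gradient against central finite differences. A minimal sketch of the idea (Python; the objective, tolerance, and test point are illustrative, not from the talk):

```python
import math

def f(x):
    # Example objective: f(x) = x0^2 + sin(x1)
    return x[0] ** 2 + math.sin(x[1])

def grad_f(x):
    # Hand-derived gradient to be checked.
    return [2 * x[0], math.cos(x[1])]

def check_grad(f, grad, x, eps=1e-6, tol=1e-4):
    """Return True if the analytic gradient matches
    central finite differences at x."""
    g = grad(x)
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += eps
        xm[i] -= eps
        fd = (f(xp) - f(xm)) / (2 * eps)
        if abs(fd - g[i]) > tol:
            return False
    return True

ok = check_grad(f, grad_f, [0.7, -1.3])
print(ok)  # a correct gradient passes the check
```

A sign error or dropped factor in `grad_f` makes the finite-difference comparison fail immediately, which is exactly the class of bug the slide is warning about.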


When running experiments

Things to always compare against:

• A random guesser (finds bugs in evaluation code)
• Always guesses mean/mode (finds too-easy problems)
• 1-nearest neighbour (finds bugs in train/test splitting)

Datasets to include:

• A trivial-to-predict dataset (finds major bugs in any method)
• A dataset with no signal (finds bugs in evaluation code)
• A translated, scaled version of the dataset (finds bugs in implementation of the model)

Can detect problems without looking at code
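These baselines are cheap to script. A sketch of the first two on a no-signal dataset (Python; the data, split, and accuracy metric are illustrative):

```python
import random
from collections import Counter

random.seed(0)

# A dataset with no signal: labels independent of any inputs.
labels = [random.choice([0, 1]) for _ in range(1000)]
train, test = labels[:800], labels[800:]

def accuracy(preds, truth):
    return sum(p == t for p, t in zip(preds, truth)) / len(truth)

# Baseline 1: a random guesser.
rand_acc = accuracy([random.choice([0, 1]) for _ in test], test)

# Baseline 2: always guess the training-set mode.
mode = Counter(train).most_common(1)[0][0]
mode_acc = accuracy([mode] * len(test), test)

# On a no-signal dataset, any method that beats these baselines by a
# wide margin points to a bug in the evaluation code.
print(rand_acc, mode_acc)  # both should hover around 0.5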


In General

Notice Confusion

• Notice when you're confused
• Notice when you're rationalizing
• Red flag: Looking at only one number and making up a story about why it goes up or down (i.e. cog sci)

Empirical Rates

Look at details until they aren't surprising


Main Takeaways

To keep in mind

• You probably have bugs
• Finding them early saves time
• Finding them before you publish saves retractions

To practice

• Include simple baselines
• Check invariants
• Plot everything
• Keep things simple

A few sanity checks go a long way
