Adventures in Azure Machine Learning
@deejaygraham
Derek Graham
principal developer @sage
special product responsibility for Azure bits
Sorry!!
live coding!
live portal-ing
Machine learning is not telling computers what to do but letting them learn from examples or past experience
Which means you can…
• Analyse historic or current data
• Find patterns and trends
• Make predictions about future events
Project Adam
Azure Machine Learning
• A "new" cloud-based service from Microsoft
• Integrates with existing Cloud technologies
• Use ready-made algorithms
• Program custom algorithms tuned to your problem
• You can evaluate it for free
http://studio.azureml.net
• Browser based
• Drag n Drop
• Flowchart-y
• Example data sets
• Use R or Python
• Excellent intro wizard
ML Studio
Provides tools to:
• Import data
• Filter and aggregate data
• Create machine learning models
• Run experiments
• Publish finished model
The Learning Process
• Define a problem you want to solve
• Design a solution
• Experiment!
• Identify your data
• Train the model with the data
• Evaluate against expected results (speed and accuracy)
• Adapt data or algorithm (or both)
• Repeat
• Save the best model
• Publish
• Run with live data
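The train, evaluate, adapt, repeat loop above can be sketched in a few lines of Python. This is a minimal illustration, not the workflow ML Studio runs internally: the "models" are simple moving-average predictors, and the function names and CPU readings are hypothetical.

```python
from statistics import mean

def moving_average_model(window):
    """Return a predictor that forecasts the mean of the last `window` values."""
    def predict(history):
        return mean(history[-window:])
    return predict

def evaluate(predict, series, start):
    """Mean absolute error of one-step-ahead forecasts from index `start` on."""
    errors = [abs(predict(series[:i]) - series[i]) for i in range(start, len(series))]
    return mean(errors)

def best_model(series, windows, start):
    """Try each candidate window, score it, and keep the most accurate one."""
    scored = [(evaluate(moving_average_model(w), series, start), w) for w in windows]
    return min(scored)

# Hypothetical CPU% readings, for illustration only.
cpu = [20, 22, 21, 60, 62, 61, 20, 21, 22, 61, 60, 62]
error, window = best_model(cpu, windows=[1, 2, 3], start=3)
print(f"best window={window}, mean absolute error={error:.1f}")
```

Swapping the candidate list for different algorithms or different data subsets is the "adapt" step; the loop itself stays the same.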
Proof!
Imagine…
• "Business" Software
• Azure hosted
• PaaS
• Multi-tenanted
Open-ended Workflow
• Monday Morning Login
• Friday Reports
• In between?
• Weekends?
• Holidays?
Balancing
• User demand
• User experience
• Compute resources
• Cost
Scaling
• Instances auto-scale based on the CPU% metric using Azure’s standard scaling model
• Azure standard scaling is slow
• Once the autoscaler notices we need more capacity, the demand has often disappeared!
• Not a good user experience
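A toy simulation shows why a reactive scaler can miss a short spike. Everything here is assumed for illustration: the threshold, the 10-minute startup delay, the load figures, and the simplification that the scaler never scales down. It is not a model of Azure's actual autoscaler.

```python
def simulate_reactive(load, threshold=70, startup_delay=10, base=2):
    """Return (minute, cpu_percent, instances) tuples for a reactive CPU% scaler.

    load: total work per minute, where one instance at 100% CPU handles 100 units.
    """
    instances = base
    pending = []   # minutes at which requested instances become ready
    timeline = []
    for minute, work in enumerate(load):
        # Bring online any instance whose startup delay has elapsed.
        instances += sum(1 for ready_at in pending if ready_at <= minute)
        pending = [ready_at for ready_at in pending if ready_at > minute]
        cpu = min(100.0, work / instances)
        # React only after CPU is already high, and only one request at a time.
        if cpu > threshold and not pending:
            pending.append(minute + startup_delay)
        timeline.append((minute, cpu, instances))
    return timeline

# Hypothetical load: quiet, an 8-minute spike, then quiet again.
load = [80] * 5 + [190] * 8 + [80] * 10
timeline = simulate_reactive(load)
for minute, cpu, instances in timeline:
    print(minute, round(cpu, 1), instances)
```

In this run the extra instance arrives at minute 15, two minutes after the spike has ended, so users sit at 95% CPU for the whole spike and the new capacity is paid for while idle.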
Experiment
• Customer use is not regular...
• ...but, is it predictable?
Hackathon!
• Can we build a better autoscaler?
• Spin-up before high demand
• Tear-down when idle
• Better Cost vs UX
Requirements
• What will "we" need on a given date or time?
• Do "we" need to take action now to compensate for what will happen in 20 minutes' time?
• Number of instances
• Predicted CPU
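One way to turn a predicted CPU figure into a number of instances is sketched below. The 60% target and the assumption that load spreads evenly across instances are illustrative choices, not values from the hackathon; the function names are hypothetical.

```python
import math

def instances_needed(predicted_cpu, current_instances, target_cpu=60.0, minimum=1):
    """Instances required to bring the predicted average CPU% down to target_cpu.

    predicted_cpu is the forecast average CPU% across current_instances.
    Assumes load is spread evenly over however many instances are running.
    """
    total_load = predicted_cpu * current_instances
    return max(minimum, math.ceil(total_load / target_cpu))

def scaling_action(predicted_cpu, current_instances, target_cpu=60.0, minimum=1):
    """Positive result: spin up that many instances. Negative: tear down."""
    needed = instances_needed(predicted_cpu, current_instances, target_cpu, minimum)
    return needed - current_instances

print(scaling_action(90, 2))   # forecast of 90% on 2 instances
print(scaling_action(10, 4))   # forecast of 10% on 4 instances
```

Because new instances take time to start, the forecast fed in here would be for a point far enough ahead (the 20 minutes mentioned above) to cover that startup time.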
Best Predictor of Demand?
• Sessions?
• Instance Memory Use?
• Instance CPU?
Table Storage Diagnostics
• Too slow
• Purging
• ML queries all or nothing
• ML Data Reader stops after 4GB
• GB !!!!
• ML times out after ~3 hours
CPU
Event Hubs
• Application log sink
• Low overhead
• Highly scalable
• Time-based
• Disposable
Neural Net Experiments
• Feed Forward NN
• Written using R libraries
• Good predictor for 10-20 minute window
• Too inaccurate after that
• Best compromise between precision and speed
• Recurrent NN better at forecasting
• RNN execution time too long
• Need to reduce data to optimal subset
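Forecasting with a feed-forward network usually means reframing the time series as fixed-size input windows paired with a target some steps ahead. The sketch below shows only that framing step, in Python rather than the R used in the experiments; `make_windows`, `lags`, and `horizon` are hypothetical names.

```python
def make_windows(series, lags, horizon):
    """Turn a time series into (inputs, target) training pairs.

    inputs: the `lags` most recent readings.
    target: the reading `horizon` steps after the last input.
    """
    pairs = []
    for end in range(lags, len(series) - horizon + 1):
        inputs = series[end - lags:end]
        target = series[end + horizon - 1]
        pairs.append((inputs, target))
    return pairs

# Hypothetical one-reading-per-minute series, for illustration only.
readings = [1, 2, 3, 4, 5, 6]
for inputs, target in make_windows(readings, lags=3, horizon=2):
    print(inputs, "->", target)
```

A larger `horizon` corresponds to the longer forecast windows where accuracy dropped off, and a smaller `lags` is one way of reducing the data fed to the model.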
Stream Analytics
• Real-time data analysis
• Fast
• SQL-like syntax
• Range of inputs and outputs
• Interesting development
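A typical real-time aggregation is an average over fixed, non-overlapping time windows, similar in spirit to grouping by a tumbling window in a Stream Analytics query. The Python sketch below illustrates the idea only; the event shape (timestamp in seconds, CPU%) is an assumption, and this is not Stream Analytics query syntax.

```python
from collections import defaultdict

def tumbling_average(events, window_seconds):
    """Average CPU% per fixed, non-overlapping time window.

    events: iterable of (timestamp_seconds, cpu_percent) pairs.
    Returns {window_start_seconds: average_cpu}.
    """
    buckets = defaultdict(list)
    for timestamp, cpu in events:
        # Each event belongs to exactly one window, keyed by its start time.
        window_start = timestamp - timestamp % window_seconds
        buckets[window_start].append(cpu)
    return {start: sum(values) / len(values)
            for start, values in sorted(buckets.items())}

# Hypothetical readings, for illustration only.
events = [(0, 10), (30, 20), (60, 40), (119, 60), (120, 50)]
print(tumbling_average(events, window_seconds=60))
```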
Anomalies
• Dev Process is painful
• Syntax Errors
• “Test” Import Behaviour
• Starting and Stopping and Starting and Stopping
Compromise
Closing the
Loop
Publish…
• REST Web service
• Client Worker Role
• Management Service API
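A client of the published web service needs to build an authenticated JSON POST. The sketch below only constructs the request and does not send it. The body layout (`Inputs`, `input1`, `ColumnNames`, `Values`, `GlobalParameters`) is an assumption modelled on the request shape commonly shown for ML Studio web services, so check it against the API help page of your own service. The URL, key, and column names are placeholders.

```python
import json
import urllib.request

def build_scoring_request(url, api_key, column_names, rows):
    """Build (but do not send) a JSON POST request for a scoring web service."""
    body = {
        # Assumed payload layout; verify against your service's API help page.
        "Inputs": {"input1": {"ColumnNames": column_names, "Values": rows}},
        "GlobalParameters": {},
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + api_key,
        },
        method="POST",
    )

# Placeholder endpoint, key, and columns, for illustration only.
request = build_scoring_request(
    "https://example.invalid/score",
    "YOUR-API-KEY",
    ["minute_of_week", "sessions"],
    [["540", "120"]],
)
print(request.get_method(), request.full_url)
```

Sending it would be a call to `urllib.request.urlopen(request)`, wrapped in error handling for HTTP failures such as the 500 responses described on the next slide.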
…& Be Damned
• Too much data crashes model
• Fine in ML Studio
• 500
• Out of memory?
Finished!
Result!
What we
learned
Bugs
• We were pushing the environment quite hard
• YMMV
• ML studio has bugs
• Parallel tasks !Parallel
• ML portal is missing functionality, which prevents it from being production-ready
#DevOps
• Sharing models is "public" - Gallery
• No export support
• No support (yet) for model deployment
• Still Drag n Drop
• PowerShell for Event Hubs and Stream Analytics
Machine Learning
• Parallel R processing library would help
• Finding an appropriate solution often requires a data science specialist
• Solution is only as good as your data
• You may need to compromise on accuracy for speed
• Cost
• Hosting
• Each call to the service
References
http://studio.azureml.net/
E-Book
• Microsoft Azure Essentials: Azure Machine Learning
• Download from: https://mva.microsoft.com/ebooks
Titanic
• Jennifer Marsman - https://blogs.msdn.microsoft.com/jennifer/2016/02/19/using-azure-machine-learning-to-predict-who-will-survive-the-titanic/
• Data Science! https://www.kaggle.com/
• Amy Nicholson @AmyKateNicho https://blogs.technet.microsoft.com/amykatenicho/
#DevOps
• https://azure.microsoft.com/en-gb/documentation/articles/event-hubs-programming-guide/
• https://azure.microsoft.com/en-gb/documentation/articles/service-bus-event-hubs-manage-with-ps/
• https://azure.microsoft.com/en-us/documentation/articles/stream-analytics-dotnet-management-sdk/
• https://azure.microsoft.com/en-us/documentation/articles/stream-analytics-monitor-and-manage-jobs-use-powershell/
Questions?

Adventures in Azure Machine Learning from NE Bytes