Title: Machine Learning & AI Foundations: A Guide to Predictive Modeling
1 Machine Learning & AI Foundations: A Guide to Predictive Modeling
NetCom Learning
Tom Goodheart NetCom Learning
www.netcomlearning.com info_at_netcomlearning.com
(888) 563 8266
2 Agenda
- Evaluating the Proper Amount of Data
- Assessing the Data Quality and Quantity
- Data Preparation and Modeling Challenges
- Scoring Machine Learning Models
- Deploying Models and Adjusting Data Prep / Scoring
- Monitor and Maintenance
4 Evaluating the Proper Amount of Data
- The Proper Amount of Data
- It Depends, No One Can Really Tell You
- Complexity of the Problem
- Complexity of the Algorithms Chosen
- Reason by Analogy: Learning Curves
- Study how model performance changes as the data scales, for your problem as well as your algorithm. Averaging over many runs at each data size can give you a good enough idea
- Domain Expertise
- Delphi Technique
- Statistical Heuristics
- Factor the number of classes
- Factor the number of input features
- Factor the number of model parameters
- As data sets used in the field of Machine Learning grow bigger and more complex, we enter the realm of big data: massive amounts of data, flowing from a variety of sources, that grow exponentially.
- Information needs to be stored efficiently and rapidly in order to provide business value.
- Just because we have data doesn't mean it's quality data or that it will provide the types of answers we need.
5 Evaluating the Proper Amount of Data
Learning Curves
These are statistical tools built on heuristics and
domain expertise. As the number of training examples
increases, the model's scores approach an asymptote.
Past that point, no matter how much more data you
throw at your model, performance will not
meaningfully improve.
http://acl.ldc.upenn.edu/P/P01/P01-1005.pdf
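The learning-curve idea can be sketched in a few lines of plain Python. This is a minimal illustration, not the method used in the webinar demos: the synthetic two-class data, the nearest-centroid "model", and every function name below are assumptions chosen to keep the example self-contained. It trains on growing sample sizes, averages the test accuracy over many repeats per size, and prints the curve so the plateau is visible.

```python
import random
import statistics

def make_dataset(n, rng):
    """Synthetic data (an assumption): two overlapping 2-D Gaussian blobs, labels alternate."""
    data = []
    for i in range(n):
        label = i % 2
        center = 1.0 if label else -1.0
        data.append(((rng.gauss(center, 1.0), rng.gauss(center, 1.0)), label))
    return data

def fit_centroids(train):
    """Nearest-centroid 'model': the mean point of each class."""
    centroids = {}
    for label in (0, 1):
        points = [x for x, y in train if y == label]
        centroids[label] = tuple(statistics.mean(axis) for axis in zip(*points))
    return centroids

def predict(centroids, x):
    """Return the label whose centroid is closest to x."""
    return min(
        centroids,
        key=lambda lab: (x[0] - centroids[lab][0]) ** 2 + (x[1] - centroids[lab][1]) ** 2,
    )

def accuracy(centroids, data):
    return sum(predict(centroids, x) == y for x, y in data) / len(data)

def learning_curve(sizes, repeats=30, seed=0):
    """Mean test accuracy per training-set size, averaged over `repeats` fresh samples."""
    rng = random.Random(seed)
    test = make_dataset(2000, rng)
    curve = []
    for n in sizes:
        scores = [accuracy(fit_centroids(make_dataset(n, rng)), test) for _ in range(repeats)]
        curve.append((n, statistics.mean(scores)))
    return curve

if __name__ == "__main__":
    for n, score in learning_curve([4, 16, 64, 256, 1024]):
        print(f"{n:>5} training examples -> mean accuracy {score:.3f}")
```

With a real data set and model, scikit-learn's `learning_curve` helper in `sklearn.model_selection` serves the same purpose; the hand-rolled loop here only makes the averaging step explicit.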
6 Evaluating the Proper Amount of Data
Domain Expertise
We reach out to people who know more than we do,
we search Google, and we continue our growth toward
domain expertise by exposing ourselves to experts
and information in the field.
7 Evaluating the Proper Amount of Data
Statistical Heuristics
The Ten Times Rule: a rule of thumb, along with
many others, that can help set the tone for your
exploratory analysis.
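The Ten Times Rule is commonly stated as "collect roughly ten times as many training examples as the model has parameters". A minimal sketch of that arithmetic follows; the helper names and the linear-classifier parameter count (one weight per input feature plus a bias, for each class) are assumptions for illustration, and the result is a starting estimate rather than a guarantee.

```python
def linear_model_parameters(n_features, n_classes):
    """Assumed parameter count for a linear classifier: (weights + bias) per class."""
    return (n_features + 1) * n_classes

def ten_times_rule(n_parameters, factor=10):
    """Rough minimum training-set size: `factor` examples per model parameter."""
    if n_parameters < 1:
        raise ValueError("n_parameters must be at least 1")
    return factor * n_parameters

if __name__ == "__main__":
    params = linear_model_parameters(n_features=20, n_classes=3)
    print(f"{params} parameters -> about {ten_times_rule(params)} examples")
```

The example ties together the three factors listed earlier in the deck: the number of classes and input features determine the number of model parameters, which in turn drives the estimate.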
8 Assessing Quantity and Quality of Data
- The Data Quality Cheat Sheet
- Clarify your objectives and collected data
- Is my question a categorical or numerical prediction?
- Have I collected the right data source?
- Take the time to execute proper collection methods
- Machine Learning is the best example of Garbage In, Garbage Out that you can compute expensively.
- Maintain a rigorous audit trail
- git blame
- Scoring Engines/Scoring Files Logging
- Proper CI/CD Pipelines
- Designate a Data Quality Team
- Their responsibility is the maintenance, testing, and assurance of the data.
- Independent Quality Assurance
- Can someone else reproduce your results?
- https://hbr.org/2018/04/if-your-data-is-bad-your-machine-learning-tools-are-useless
- One of the major setbacks in the availability of data present today is that you have to slice your way through copious amounts of questionable data sets.
Timoelliott.com
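A few of the cheat-sheet checks can be automated. The sketch below is one possible shape for such a check, not a standard tool: the function name, the report fields, and the sample records are hypothetical. It counts missing values and duplicate rows, guesses whether the target calls for a categorical or numerical prediction, and computes a fingerprint of the data that can be stored in an audit trail so someone else can confirm they are reproducing results on the same data.

```python
import hashlib
import json
from collections import Counter

def quality_report(rows, target):
    """Summarize basic quality signals for a list of dict records."""
    columns = sorted({key for row in rows for key in row})
    missing = {c: sum(1 for r in rows if r.get(c) in (None, "")) for c in columns}
    # Serialize each row canonically so identical records compare equal.
    serialized = [json.dumps(r, sort_keys=True, default=str) for r in rows]
    duplicates = sum(count - 1 for count in Counter(serialized).values())
    targets = [r.get(target) for r in rows if r.get(target) not in (None, "")]
    # Heuristic only: an all-numeric target suggests a numerical prediction.
    numeric = bool(targets) and all(
        isinstance(t, (int, float)) and not isinstance(t, bool) for t in targets
    )
    # Order-independent SHA-256 fingerprint of the data set for the audit trail.
    fingerprint = hashlib.sha256("\n".join(sorted(serialized)).encode("utf-8")).hexdigest()
    return {
        "rows": len(rows),
        "missing": missing,
        "duplicate_rows": duplicates,
        "prediction_type": "numerical" if numeric else "categorical",
        "fingerprint": fingerprint,
    }

if __name__ == "__main__":
    sample = [
        {"age": 31, "churned": "yes"},
        {"age": None, "churned": "no"},
        {"age": 31, "churned": "yes"},
    ]
    print(json.dumps(quality_report(sample, target="churned"), indent=2))
```

A numeric target with only a handful of distinct values may still be a categorical problem, so the `prediction_type` field is a prompt for the question on the slide rather than an answer to it.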
9 Data Preparation and Modeling Challenges
10 Data Preparation and Modeling Challenges
"To create a model, then, we make choices about
what's important enough to include, simplifying
the world into a toy version that can be easily
understood and from which we can infer important
facts and actions. We expect it to handle only
one job and accept that it will occasionally act
like a clueless machine, one with enormous blind
spots." - Cathy O'Neil
11 Scoring Machine Learning Models
Scoring is widely used in machine learning to
mean the process of generating new values, given
a model and some new input. The generic term
"score" is used, rather than "prediction,"
because the scoring process can generate so many
different types of values:
- A list of recommended items and a similarity score.
- Numeric values, for time series models and regression models.
- A probability value, indicating the likelihood that a new input belongs to some existing category.
- The name of a category or cluster to which a new item is most similar.
- A predicted class or outcome, for classification models.
Scoring is also called prediction, and is the
process of generating values based on a trained
machine learning model, given some new input
data.
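To make the different kinds of score concrete, here is a minimal sketch that produces three of them from one new input: a probability, a predicted class, and the name of the nearest cluster. Everything in it is hypothetical: the weights, bias, feature names, class labels, and cluster centroids are hand-picked stand-ins for what a trained model would supply.

```python
import math

# Hypothetical, hand-picked values standing in for a trained model.
WEIGHTS = {"tenure_months": -0.08, "support_tickets": 0.6}
BIAS = -0.5
CENTROIDS = {"low_usage": (1.0, 2.0), "high_usage": (8.0, 9.0)}

def score_probability(features):
    """Probability score: likelihood that the input belongs to the positive class."""
    z = BIAS + sum(WEIGHTS[name] * features[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))

def score_class(features, threshold=0.5):
    """Predicted class: the probability score turned into a label."""
    return "churn" if score_probability(features) >= threshold else "stay"

def score_cluster(point):
    """Cluster score: name of the cluster whose centroid is closest to the point."""
    return min(CENTROIDS, key=lambda name: math.dist(point, CENTROIDS[name]))

if __name__ == "__main__":
    customer = {"tenure_months": 3, "support_tickets": 4}
    print("probability:", round(score_probability(customer), 3))
    print("class:", score_class(customer))
    print("cluster:", score_cluster((7.5, 8.0)))
```

The same input yields a different type of value depending on which scoring function is called, which is why the generic term "score" is preferred over "prediction".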
12 Deploying Machine Learning Models
[Figure: Azure Machine Learning model management overview diagram, from docs.microsoft.com]
13 Monitor and Maintenance
Microsoft Azure ML Studio has a built-in tool
that allows you to glean insights from your
running models, such as how they are scoring over
time and how fast they respond to customer inputs.
Remember: If It Isn't Logged, It Doesn't Exist.
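Outside of a managed monitoring tool, the same habit can be sketched with Python's standard `logging` module. This is a generic illustration, not the Azure ML Studio feature: the `logged_score` wrapper, the logger name, and the toy model are assumptions. Each call records the input, the score, and the response time, which are the two things the slide suggests watching over time.

```python
import logging
import time

logger = logging.getLogger("scoring")

def logged_score(model, features):
    """Call `model(features)` and log the input, the score, and the latency."""
    start = time.perf_counter()
    score = model(features)
    latency_ms = (time.perf_counter() - start) * 1000.0
    logger.info("features=%r score=%r latency_ms=%.2f", features, score, latency_ms)
    return score

if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO, format="%(asctime)s %(name)s %(message)s")
    # Toy stand-in for a deployed model: any callable taking a feature dict works.
    logged_score(lambda f: 0.5 * f["x"], {"x": 4})
```

Wrapping the model call, rather than logging inside the model, keeps the audit record consistent even when the model behind the endpoint is swapped out.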
14 Recorded Webinar Video
To watch the recorded webinar video for live
demos, please access the link: http://bit.ly/2GIpZYc
15 About NetCom Learning
16 Recommended Courses & Marketing Assets
- Courses
- 20774: Perform Cloud Data Science with Azure Machine Learning (class scheduled on June 03)
- Artificial Intelligence (AI) for Beginners
- AI-100T01: Designing and Implementing an Azure AI Solution
- DP-200T01: Implementing an Azure Data Solution
- (Data Scientist) Master Program - Artificial Intelligence
- Marketing Assets
- Blog - Game Changers of 2019: Top 8 In-Demand IT Skills
- Free On-Demand Training - Preparing and Architecting for Machine Learning
- Free On-Demand Training - Develop Your AI Strategy with These Trends in Mind
21 THANK YOU !!!