Version: 1.0

Overview

AI vs Machine Learning

ml-vs-ai.png

Misconceptions

uncovering-ai-meme.png

what-x-thinks-i-do.png

data-engineer-data-scientist.png

tableau-data-science.png

Machine Learning

machine-learning.png

Source: https://subscription.packtpub.com/book/big_data_and_business_intelligence/9781789537550/1/ch01lvl1sec10/differences-between-classification-and-regression

Feeling Lost? Machine Learning Estimators Map

Supervised Learning

  • Classification
    • Binary:
      • Should this loan be approved?
      • Is this a picture of a cat or a dog?
    • Multi-class: What bird species is in this picture?
  • Regression
    • Many-to-one:
      • Predicting the price of a second-hand car
    • Many-to-many:
      • Forecasting sales for the next 3 months

Unsupervised Learning

  • Clustering, Topic Modelling
  • Dimensionality Reduction

Optimization & Reinforcement Learning

  • Convex Optimization, Genetic Algorithms
  • Deep Reinforcement Learning

Intuition

function.png decision-tree-regression.png

decision-tree-regression-profit-production-cost.png

What are some limitations of linear models (y = mx + b), or even of multiple linear regression (y = m1x1 + m2x2 + … + b)?

  • They assume a **monotonic relationship** between the target variable (y) and each feature (e.g. x1): the slope never changes sign
  • They assume a **constant slope** between the target variable (y) and each feature (e.g. x1) over the entire domain of that feature

Machine Learning models can often provide the flexibility to overcome this. Think of it as automatic curve fitting and if-else!

  • You don’t need to hard-code the predicates
  • The algorithm will determine and optimize your if-conditions

However, note the issue happening with the green line in the picture 😵
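The "automatic if-else" intuition can be sketched without any library. The snippet below is a minimal, dependency-free illustration, not how scikit-learn or XGBoost implement trees: `fit_line`, `fit_tree`, `show_rules`, and the V-shaped toy data are all assumptions made for this example. It fits a straight line and a depth-2 regression tree to data whose slope changes sign, then prints the thresholds the tree picked as if/else rules.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = m*x + b with one feature."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    m = sxy / sxx
    return m, y_bar - m * x_bar


def sse(ys):
    """Squared error left over when a leaf predicts the mean of ys."""
    y_bar = mean(ys)
    return sum((y - y_bar) ** 2 for y in ys)


def fit_tree(points, depth):
    """Return a leaf value (float) or a node (threshold, left, right)."""
    xs = sorted({x for x, _ in points})
    if depth == 0 or len(xs) < 2:
        return mean(y for _, y in points)
    best_cost, best_t = None, None
    for lo, hi in zip(xs, xs[1:]):
        t = (lo + hi) / 2  # candidate threshold between two neighbouring values
        left = [y for x, y in points if x <= t]
        right = [y for x, y in points if x > t]
        cost = sse(left) + sse(right)
        if best_cost is None or cost < best_cost:
            best_cost, best_t = cost, t
    left = [p for p in points if p[0] <= best_t]
    right = [p for p in points if p[0] > best_t]
    return (best_t, fit_tree(left, depth - 1), fit_tree(right, depth - 1))


def predict_tree(tree, x):
    """Walk the learned if/else conditions down to a leaf value."""
    while isinstance(tree, tuple):
        threshold, left, right = tree
        tree = left if x <= threshold else right
    return tree


def show_rules(tree, indent=""):
    """Print the tree as the if/else code you did not have to write by hand."""
    if not isinstance(tree, tuple):
        print(f"{indent}return {tree:.2f}")
        return
    threshold, left, right = tree
    print(f"{indent}if x <= {threshold}:")
    show_rules(left, indent + "    ")
    print(f"{indent}else:")
    show_rules(right, indent + "    ")


def mse(preds, ys):
    return mean((p - y) ** 2 for p, y in zip(preds, ys))


# Toy data (assumption): y falls until x = 5, then rises again.
xs = [float(i) for i in range(11)]
ys = [abs(x - 5.0) for x in xs]

m, b = fit_line(xs, ys)
tree = fit_tree(list(zip(xs, ys)), depth=2)

line_mse = mse([m * x + b for x in xs], ys)
tree_mse = mse([predict_tree(tree, x) for x in xs], ys)

show_rules(tree)
print(f"line MSE: {line_mse:.3f}  tree MSE: {tree_mse:.3f}")
```

On this toy data the fitted slope is essentially zero, so the line can do no better than predicting the average, while the tree recovers part of the V shape from thresholds it chose itself. Letting `depth` grow until every point has its own leaf would drive the training error to zero, which is exactly the kind of flexibility that needs to be kept in check.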

Data Science

Trade-offs in modelling

underfitting-overfitting.png

accuracy-vs-intelligibility.png

  • Flexibility vs Generalization
  • Overfitting vs Underfitting
    • How can we avoid overfitting in particular?
  • Complexity vs Interpretability
    • Is that dichotomy still strictly true today?
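One way to see the trade-off, and one common answer to the overfitting question, is to compare the error on training data with the error on held-out data. The sketch below assumes a toy sine curve with Gaussian noise and uses k-nearest-neighbour regression purely as a convenient knob for flexibility (small k is flexible, large k is rigid); neither the data nor the model choice comes from the slides.

```python
import math
import random


def knn_predict(train, x, k):
    """Average the targets of the k training points closest to x."""
    nearest = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def knn_mse(train, points, k):
    """Mean squared error of the k-NN model on the given (x, y) points."""
    return sum((knn_predict(train, x, k) - y) ** 2 for x, y in points) / len(points)


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 40
step = 2 * math.pi / n


def noisy_sine(x):
    return math.sin(x) + rng.gauss(0, 0.3)


# Training points on a grid, held-out points halfway between them.
train = [(i * step, noisy_sine(i * step)) for i in range(n)]
held_out = [((i + 0.5) * step, noisy_sine((i + 0.5) * step)) for i in range(n)]

results = {}
for k in (1, 5, n):
    results[k] = (knn_mse(train, train, k), knn_mse(train, held_out, k))
    print(f"k={k:>2}  train MSE={results[k][0]:.3f}  held-out MSE={results[k][1]:.3f}")
```

With k = 1 the model reproduces the training set exactly (training error 0) yet its error on unseen points is clearly above zero, which is overfitting. With k = 40 it predicts the global average everywhere, which is underfitting. Choosing k by the held-out error rather than the training error is the simplest guard against both.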

Modelling in the real world

confidence-intervals.png

  • ML & Deep Learning training
    • Can be expensive and difficult to debug, so start simple!
  • Explainability
  • Are point estimates always enough?
    • Some stakeholders require uncertainties / confidence intervals
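As a hedged illustration of reporting more than a point estimate, the sketch below puts a percentile-bootstrap interval around a model's average error. Bootstrapping is just one common technique, swapped in here for illustration; the `abs_errors` values are made-up hold-out errors and `bootstrap_ci` is a hypothetical helper, not part of any library mentioned in these slides.

```python
import random
from statistics import mean


def bootstrap_ci(values, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean of `values`."""
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement and record the mean of each resample.
    estimates = sorted(mean(rng.choices(values, k=n)) for _ in range(n_resamples))
    lower = estimates[int((alpha / 2) * n_resamples)]
    upper = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Hypothetical absolute errors of a model on 12 held-out examples.
abs_errors = [1.2, 0.4, 2.3, 0.9, 1.7, 0.3, 1.1, 2.8, 0.6, 1.5, 0.8, 1.9]

point_estimate = mean(abs_errors)
lower, upper = bootstrap_ci(abs_errors)
print(f"mean absolute error: {point_estimate:.2f} (95% interval: {lower:.2f} to {upper:.2f})")
```

Note that this interval describes the uncertainty of an averaged metric, not of an individual prediction; per-prediction intervals need a different approach, such as quantile regression.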

What problem are you actually trying to solve?

roc-curve.png

true-class-predicted-class.png
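The two figures above correspond to quantities that are easy to compute by hand. The following sketch uses made-up labels and scores (an assumption, not data from the slides) to build the true-class vs predicted-class counts at a given threshold and the area under the ROC curve, computed here as the fraction of (positive, negative) pairs that the scores rank correctly.

```python
def confusion(y_true, scores, threshold):
    """Return (tp, fp, fn, tn) when predicting positive for score >= threshold."""
    pairs = list(zip(y_true, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    return tp, fp, fn, tn


def roc_auc(y_true, scores):
    """Probability that a random positive outranks a random negative (ties count half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = positive class) and model scores.
y_true = [0, 0, 1, 1, 0, 1, 1, 0]
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.55, 0.6]

print(confusion(y_true, scores, threshold=0.5))  # (3, 1, 1, 3)
print(confusion(y_true, scores, threshold=0.3))  # (4, 2, 0, 2)
print(roc_auc(y_true, scores))                   # 0.8125
```

Lowering the threshold from 0.5 to 0.3 catches every positive but doubles the false positives. Which of the two is preferable depends on the relative cost of each kind of mistake, which is why the problem definition has to come before the choice of metric.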

Tools

tensor-flow.png pytorch.png xg-boost.png scikid-learn.png

  • Forecasting
  • Machine Learning
    • scikit-learn
      • plenty of models and preprocessing methods to choose from (except for large-scale deep learning)
    • XGBoost (gradient-boosted decision trees: tree ensembles on steroids)
      • Popular choice for winning Kaggle competitions!
    • InterpretML (see next slides)
  • Deep Learning
    • PyTorch
      • Super flexible, create any model architecture
    • TensorFlow
      • the Keras API is super easy to use
      • if not using Keras, probably better to go with PyTorch
  • MLOps, CD4ML

Interpretability

i-am-dog.png

Why is interpretability crucial?

variety-of-techniques glass-box

Christoph Molnar: [Interpretable Machine Learning](https://christophm.github.io/interpretable-ml-book/)

  • Great introduction to concepts and theory

Fundamentals: [LIME](https://christophm.github.io/interpretable-ml-book/lime.html) & [SHAP](https://shap.readthedocs.io/en/latest/index.html)

Modern Toolkits

  • InterpretML
    • Intro Video, Deep Dive
    • Includes various explainability techniques
    • You can even build advanced yet interpretable models from the ground up
  • Fairlearn
    • Assess and (automatically) mitigate unfairness in your current models

CD4ML - Business Applications