# Book: Mastering Machine Learning with Python in Six Steps

A Practical Implementation Guide to Predictive Data Analytics Using Python

Covers basic to advanced topics in an easy step-oriented manner

Concise on theory, strong focus on practical and hands-on approach

Explores advanced topics such as hyperparameter tuning, deep natural language processing, neural networks, and deep learning

Describes state-of-the-art best practices for tuning models for better accuracy

About The Book:

This book is your practical guide for moving from novice to master in machine learning with Python in six steps. The six-step path is designed around the “six degrees of separation” theory, which states that everyone and everything is a maximum of six steps away. Note that the theory deals with the quality of connections rather than their existence. Accordingly, great effort has gone into designing six simple yet thorough steps that gradually cover fundamentals through advanced topics, helping a beginner progress from little or no knowledge of machine learning in Python all the way to becoming a master practitioner. The book is also helpful for current machine learning practitioners who want to learn advanced topics such as hyperparameter tuning, various ensemble techniques, Natural Language Processing (NLP), deep learning, and the basics of reinforcement learning.

Each topic has two parts: the first covers the theoretical concepts, and the second covers practical implementation with different Python packages. The traditional math-first approach to machine learning, i.e., learning all the mathematics and then working out how to apply it to solve problems, takes a great deal of time and effort, which has proven inefficient for working professionals looking to switch careers. Hence the focus in this book is on simplification: the theory and math behind the algorithms are covered only to the extent required to get you started.

I recommend that you work through the book rather than just read it. Real learning happens only through active participation. Hence, all the code presented in the book is available as IPython notebooks, so that you can try out the examples yourself and later extend them to suit your needs or interests.

What You’ll Learn:

Examine the fundamentals of the Python programming language

Review machine learning history and evolution

Learn various machine learning system development frameworks

Learn fundamentals to advanced text mining techniques

Learn and implement deep learning frameworks

Who This Book Is For:

This book will serve as a great resource for learning machine learning concepts and implementation techniques for:

Python developers or data engineers looking to expand their knowledge or career into machine learning.

Current non-Python (R, SAS, SPSS, Matlab, or any other language) machine learning practitioners looking to expand their implementation skills in Python.

Novice machine learning practitioners looking to learn advanced topics such as hyperparameter tuning, various ensemble techniques, Natural Language Processing (NLP), deep learning, and the basics of reinforcement learning.

Contents at a Glance

Introduction

Chapter 1: Step 1 – Getting Started in Python

Chapter 2: Step 2 – Introduction to Machine Learning

Chapter 3: Step 3 – Fundamentals of Machine Learning

Chapter 4: Step 4 – Model Diagnosis and Tuning

Chapter 5: Step 5 – Text Mining and Recommender Systems

Chapter 6: Step 6 – Deep and Reinforcement Learning

Chapter 7: Conclusion

Table of Contents

INTRODUCTION

CHAPTER 1: STEP 1 – GETTING STARTED IN PYTHON

The Best Things in Life Are Free

The Rising Star

Python 2.7.x or Python 3.4.x?

Windows Installation

OSX Installation

Linux Installation

Python from Official Website

Running Python

Key Concepts

Python Identifiers

Keywords

My First Python Program

Code Blocks (Indentation & Suites)

Basic Object Types

When to Use List vs. Tuples vs. Set vs. Dictionary

Comments in Python

Multiline Statement

Basic Operators

Control Structure

Lists

Tuple

Sets

Dictionary

User-Defined Functions

Module

File Input/Output

Exception Handling

Endnotes

CHAPTER 2: STEP 2 – INTRODUCTION TO MACHINE LEARNING

History and Evolution

Artificial Intelligence Evolution

Different Forms

Statistics

Data Mining

Data Analytics

Data Science

Statistics vs. Data Mining vs. Data Analytics vs. Data Science

Machine Learning Categories

Supervised Learning

Unsupervised Learning

Reinforcement Learning

Frameworks for Building Machine Learning Systems

Knowledge Discovery Databases (KDD)

Cross-Industry Standard Process for Data Mining

SEMMA (Sample, Explore, Modify, Model, Assess)

KDD vs. CRISP-DM vs. SEMMA

Machine Learning Python Packages

Data Analysis Packages

NumPy

Pandas

Matplotlib

Machine Learning Core Libraries

Endnotes

CHAPTER 3: STEP 3 – FUNDAMENTALS OF MACHINE LEARNING

Machine Learning Perspective of Data

Scales of Measurement

Nominal Scale of Measurement

Ordinal Scale of Measurement

Interval Scale of Measurement

Ratio Scale of Measurement

Feature Engineering

Dealing with Missing Data

Handling Categorical Data

Normalizing Data

Feature Construction or Generation

Exploratory Data Analysis (EDA)

Univariate Analysis

Multivariate Analysis

Supervised Learning – Regression

Correlation and Causation

Fitting a Slope

How Good Is Your Model?

Polynomial Regression

Multivariate Regression

Multicollinearity and Variance Inflation Factor (VIF)

Interpreting the OLS Regression Results

Regression Diagnosis

Regularization

Nonlinear Regression

Supervised Learning – Classification

Logistic Regression

Evaluating a Classification Model Performance

ROC Curve

Fitting Line

Stochastic Gradient Descent

Regularization

Multiclass Logistic Regression

Generalized Linear Models

Supervised Learning – Process Flow

Decision Trees

Support Vector Machine (SVM)

k Nearest Neighbors (kNN)

Time-Series Forecasting

Unsupervised Learning Process Flow

Clustering

K-means

Finding Value of k

Hierarchical Clustering

Principal Component Analysis (PCA)

Endnotes

CHAPTER 4: STEP 4 – MODEL DIAGNOSIS AND TUNING

Optimal Probability Cutoff Point

Which Error Is Costly?

Rare Event or Imbalanced Dataset

Known Disadvantages

Which Resampling Technique Is the Best?

Bias and Variance

Bias

Variance

K-Fold Cross-Validation

Stratified K-Fold Cross-Validation

Ensemble Methods

Bagging

Feature Importance

RandomForest

Extremely Randomized Trees (ExtraTree)

How Does the Decision Boundary Look?

Bagging – Essential Tuning Parameters

Boosting

Example Illustration for AdaBoost

Gradient Boosting

Boosting – Essential Tuning Parameters

Xgboost (eXtreme Gradient Boosting)

Ensemble Voting – Machine Learning’s Biggest Heroes United

Hard Voting vs. Soft Voting

Stacking

Hyperparameter Tuning

GridSearch

RandomSearch

Endnotes

CHAPTER 5: STEP 5 – TEXT MINING AND RECOMMENDER SYSTEMS

Text Mining Process Overview

Data Assemble (Text)

Social Media

Step 1 – Get Access Key (One-Time Activity)

Step 2 – Fetching Tweets

Data Preprocessing (Text)

Convert to Lower Case and Tokenize

Removing Noise

Part of Speech (PoS) Tagging

Stemming

Lemmatization

N-grams

Bag of Words (BoW)

Term Frequency-Inverse Document Frequency (TF-IDF)

Data Exploration (Text)

Frequency Chart

Word Cloud

Lexical Dispersion Plot

Co-occurrence Matrix

Model Building

Text Similarity

Text Clustering

Latent Semantic Analysis (LSA)

Topic Modeling

Latent Dirichlet Allocation (LDA)

Non-negative Matrix Factorization

Text Classification

Sentiment Analysis

Deep Natural Language Processing (DNLP)

Recommender Systems

Content-Based Filtering

Collaborative Filtering (CF)

Endnotes

CHAPTER 6: STEP 6 – DEEP AND REINFORCEMENT LEARNING

Artificial Neural Network (ANN)

What Goes Behind, When Computers Look at an Image?

Why Not a Simple Classification Model for Images?

Perceptron – Single Artificial Neuron

Multilayer Perceptrons (Feedforward Neural Network)

Load MNIST Data

Key Parameters for scikit-learn MLP

Restricted Boltzmann Machines (RBM)

MLP Using Keras

Autoencoders

Dimension Reduction Using Autoencoder

De-noise Image Using Autoencoder

Convolution Neural Network (CNN)

CNN on CIFAR10 Dataset

CNN on MNIST Dataset

Recurrent Neural Network (RNN)

Long Short-Term Memory (LSTM)

Transfer Learning

Reinforcement Learning

Endnotes

CHAPTER 7: CONCLUSION

Summary

Tips

Start with Questions/Hypothesis Then Move to Data!

Don’t Reinvent the Wheels from Scratch

Start with Simple Models

Focus on Feature Engineering

Beware of Common ML Imposters

Happy Machine Learning

Links:

Apress Link: Click here!

Amazon links by location: US, United Kingdom, India, Brazil, Canada, France, Germany, Italy, Japan, Mexico, Netherlands, Spain
