# Essential Math for Data Science

This article was written by Tirthajyoti Sarkar. Below is a summary. The full article (accessible from the link at the bottom) also lists courses you can take to learn the topics below, as well as numerous comments. We have also added a few topics that we think are important and were missing from the original article.

Statistics

Data summaries and descriptive statistics, central tendency, variance, covariance, correlation
Basic probability: basic idea, expectation, probability calculus, Bayes' theorem, conditional probability
Probability distribution functions: uniform, normal, binomial, chi-square, Student's t-distribution; central limit theorem
Sampling, measurement, error, random number generation
Hypothesis testing, A/B testing, confidence intervals, p-values
ANOVA, t-test
Linear and logistic regression, regularization
Decision trees
Robust and non-parametric statistics
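
As a small illustration of the Bayes' theorem and conditional probability items above, here is a minimal Python sketch. The function name `posterior` and the numbers (1% prevalence, 95% sensitivity, 5% false-positive rate) are hypothetical, chosen only for the example and not taken from the original article.

```python
def posterior(prior, likelihood, false_positive_rate):
    """Return P(H | E) given P(H), P(E | H), and P(E | not H)."""
    # Law of total probability: P(E) = P(E|H) P(H) + P(E|not H) P(not H)
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    # Bayes' theorem: P(H|E) = P(E|H) P(H) / P(E)
    return likelihood * prior / evidence


# Hypothetical screening test: rare condition, fairly accurate test.
p = posterior(prior=0.01, likelihood=0.95, false_positive_rate=0.05)
print(round(p, 3))  # 0.161
```

Even with a fairly accurate test, the posterior probability stays low because the prior is small, which is the usual motivation for studying conditional probability carefully.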

Linear Algebra

Basic properties of matrices and vectors: scalar multiplication, linear transformation, transpose, conjugate, rank, determinant
Inner and outer products, matrix multiplication rule and various algorithms, matrix inverse
Special matrices: square matrix, identity matrix, triangular matrix, sparse and dense matrices, unit vectors, symmetric matrix, Hermitian, skew-Hermitian, and unitary matrices
Matrix factorization concept and LU decomposition, Gaussian/Gauss-Jordan elimination, solving the linear system Ax = b
Vector space, basis, span, orthogonality, orthonormality, linear least squares
Eigenvalues, eigenvectors, and diagonalization, singular value decomposition (SVD)
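
To make the Gaussian elimination and Ax = b items above concrete, the following is a rough pure-Python sketch of elimination with partial pivoting followed by back substitution. The function name `solve` and the 2x2 system are illustrative assumptions; in practice a library routine would normally be used instead.

```python
def solve(A, b):
    """Solve Ax = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    # Work on an augmented copy [A | b] so the inputs are left untouched.
    M = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        # Choose the row with the largest pivot to limit rounding error.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular or nearly singular")
        M[col], M[pivot] = M[pivot], M[col]
        # Eliminate the entries below the pivot.
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution on the resulting upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


# Illustrative system: 2x + y = 3 and x + 3y = 5.
x = solve([[2, 1], [1, 3]], [3, 5])
print([round(v, 6) for v in x])  # [0.8, 1.4]
```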

Calculus

Functions of a single variable, limit, continuity and differentiability
Mean value theorems, indeterminate forms and L'Hôpital's rule
Maxima and minima
Product and chain rule
Taylor series, infinite series summation/integration concepts
Fundamental and mean value theorems of integral calculus, evaluation of definite and improper integrals
Beta and Gamma functions
Functions of multiple variables, limit, continuity, partial derivatives
Basics of ordinary and partial differential equations (not too advanced)

Discrete Math

Sets, subsets, power sets
Counting functions, combinatorics, countability
Basic proof techniques: induction, proof by contradiction
Basics of inductive, deductive, and propositional logic
Basic data structures: stacks, queues, graphs, arrays, hash tables, trees
Graph properties: connected components, degree, maximum flow/minimum cut concepts, graph coloring
Recurrence relations and equations
Growth of functions and Big-O notation

Optimization, Operations Research

Basics of optimization: how to formulate the problem
Maxima, minima, convex function, global solution
Linear programming, simplex algorithm
Integer programming
Constraint programming, knapsack problem
Randomized optimization techniques: hill climbing, simulated annealing, genetic algorithms
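
As one example of how a problem from this list can be formulated, here is a minimal sketch of the 0/1 knapsack problem solved with dynamic programming. The function name `knapsack` and the item values and weights are hypothetical, and dynamic programming is only one of several possible approaches (integer programming is another).

```python
def knapsack(values, weights, capacity):
    """Maximum total value of items that fit in capacity (each item used at most once)."""
    # best[c] is the best value achievable with total weight at most c.
    best = [0] * (capacity + 1)
    for value, weight in zip(values, weights):
        # Iterate capacities downward so each item is counted at most once.
        for cap in range(capacity, weight - 1, -1):
            best[cap] = max(best[cap], best[cap - weight] + value)
    return best[capacity]


# Hypothetical items: values 60, 100, 120 with weights 10, 20, 30; capacity 50.
print(knapsack([60, 100, 120], [10, 20, 30], 50))  # 220
```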

To read the full article, click here.