# Essential Math for Data Science

This article was written by Tirthajyoti Sarkar. Below is a summary. The full article (accessible from the link at the bottom) also features courses that you can take to learn the topics listed below, as well as numerous comments. We also added a few topics that we think are important and missing from the original article.

## Statistics

Data summaries and descriptive statistics, central tendency, variance, covariance, correlation

Basic probability: basic idea, expectation, probability calculus, Bayes’ theorem, conditional probability

Probability distribution functions: uniform, normal, binomial, chi-square, Student’s t-distribution; central limit theorem

Sampling, measurement, error, random number generation

Hypothesis testing, A/B testing, confidence intervals, p-values

ANOVA, t-test

Linear and logistic regression, regularization

Decision trees

Robust and non-parametric statistics
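
Among the probability topics listed above, Bayes’ theorem is the one most easily shown with a few lines of arithmetic. The sketch below computes the probability of having a condition given a positive test result. The function name `posterior` and the numbers (1% prevalence, 95% sensitivity, 90% specificity) are made up for illustration and are not from the original article.

```python
def posterior(prior, sensitivity, specificity):
    """Return P(condition | positive test) using Bayes' theorem."""
    # P(positive) from the law of total probability:
    # true positives plus false positives.
    p_positive = sensitivity * prior + (1.0 - specificity) * (1.0 - prior)
    # Bayes' theorem: P(C | +) = P(+ | C) * P(C) / P(+)
    return sensitivity * prior / p_positive


# Hypothetical numbers, chosen only to illustrate the calculation.
p = posterior(prior=0.01, sensitivity=0.95, specificity=0.90)
print(f"P(condition | positive) = {p:.4f}")
```

With these assumed inputs the posterior stays below 10%, because false positives from the large healthy group outnumber true positives from the small affected group.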

## Linear Algebra

Basic properties of matrices and vectors: scalar multiplication, linear transformation, transpose, conjugate, rank, determinant

Inner and outer products, matrix multiplication rule and various algorithms, matrix inverse

Special matrices: square matrix, identity matrix, triangular matrix, the idea of sparse and dense matrices, unit vectors, symmetric matrix, Hermitian, skew-Hermitian and unitary matrices

Matrix factorization concept/LU decomposition, Gaussian/Gauss-Jordan elimination, solving the linear system of equations Ax = b

Vector space, basis, span, orthogonality, orthonormality, linear least squares

Eigenvalues, eigenvectors, and diagonalization, singular value decomposition (SVD)
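
To make the "Gaussian elimination" and "solving Ax = b" items above concrete, here is a minimal pure-Python sketch of elimination with partial pivoting followed by back substitution. The function name `solve` and the 3x3 example system are illustrative assumptions. In practice a library routine would normally be used instead.

```python
def solve(A, b):
    """Solve Ax = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    # Build the augmented matrix [A | b]; copying leaves the inputs untouched.
    M = [[float(v) for v in row] + [float(bi)] for row, bi in zip(A, b)]
    for col in range(n):
        # Partial pivoting: use the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular (or nearly so)")
        M[col], M[pivot] = M[pivot], M[col]
        # Eliminate the entries below the pivot.
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution on the resulting upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


# Illustrative system whose exact solution is x = (2, 3, -1).
A = [[2, 1, -1], [-3, -1, 2], [-2, 1, 2]]
b = [8, -11, -3]
x = solve(A, b)
print(x)
```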

## Calculus

Functions of a single variable, limit, continuity and differentiability

Mean value theorems, indeterminate forms and L’Hôpital’s rule

Maxima and minima

Product and chain rule

Taylor series, infinite series summation/integration concepts

Fundamental and mean value theorems of integral calculus, evaluation of definite and improper integrals

Beta and Gamma functions

Functions of multiple variables, limit, continuity, partial derivatives

Basics of ordinary and partial differential equations (not too advanced)
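
As a small illustration of the Taylor series item above, the following sketch sums the first few terms of the series for e^x about 0 and compares the result with `math.exp`. The helper name `taylor_exp` and the choice of 15 terms are assumptions made for this example.

```python
import math


def taylor_exp(x, n_terms):
    """Partial sum of the Taylor series of e**x about 0, using n_terms terms."""
    total = 0.0
    term = 1.0  # the k = 0 term: x**0 / 0!
    for k in range(n_terms):
        total += term
        term *= x / (k + 1)  # next term: x**(k+1) / (k+1)!
    return total


approx = taylor_exp(1.0, 15)
error = abs(approx - math.exp(1.0))
print(approx, error)
```

Each term is built from the previous one, which avoids recomputing powers and factorials. The error shrinks quickly as terms are added because the factorial in the denominator grows faster than the power in the numerator.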

## Discrete Math

Sets, subsets, power sets

Counting functions, combinatorics, countability

Basic proof techniques: induction, proof by contradiction

Basics of inductive, deductive, and propositional logic

Basic data structures: stacks, queues, graphs, arrays, hash tables, trees

Graph properties: connected components, degree, maximum flow/minimum cut concepts, graph coloring

Recurrence relations and equations

Growth of functions and the concept of big-O notation
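
Several of the items above (queues, graphs, connected components) come together in a breadth-first search. The sketch below finds the connected components of a small undirected graph stored as an adjacency dictionary. The function name and the example graph are hypothetical. Each node and edge is visited a bounded number of times, so the running time grows linearly with the size of the graph.

```python
from collections import deque


def connected_components(graph):
    """Return the connected components of an undirected graph.

    graph: dict mapping each node to a list of its neighbours.
    """
    seen = set()
    components = []
    for start in graph:
        if start in seen:
            continue
        # Breadth-first search from an unvisited node collects one component.
        seen.add(start)
        queue = deque([start])
        component = []
        while queue:
            node = queue.popleft()
            component.append(node)
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        components.append(sorted(component))
    return components


# Illustrative graph: a path 1-2-3, an edge 4-5, and an isolated node 6.
graph = {1: [2], 2: [1, 3], 3: [2], 4: [5], 5: [4], 6: []}
components = connected_components(graph)
print(components)
```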

## Optimization, Operations Research

Basics of optimization: how to formulate the problem

Maxima, minima, convex function, global solution

Linear programming, simplex algorithm

Integer programming

Constraint programming, knapsack problem

Randomized optimization techniques: hill climbing, simulated annealing, genetic algorithms
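
The knapsack problem listed above is a compact example of formulating an optimization problem: choose items to maximize total value without exceeding a weight capacity. The sketch below solves the 0/1 variant with dynamic programming, a technique chosen here for illustration rather than one named in the list. The function name and the item data are assumptions.

```python
def knapsack(weights, values, capacity):
    """Maximum total value of items that fit in the capacity (0/1 knapsack).

    Assumes non-negative integer weights and an integer capacity.
    """
    # best[c] is the best value achievable with total weight at most c.
    best = [0] * (capacity + 1)
    for w, v in zip(weights, values):
        # Go through capacities downwards so each item is used at most once.
        for c in range(capacity, w - 1, -1):
            best[c] = max(best[c], best[c - w] + v)
    return best[capacity]


# Hypothetical items: (weight, value) pairs (1, 1), (3, 4), (4, 5), (5, 7).
result = knapsack(weights=[1, 3, 4, 5], values=[1, 4, 5, 7], capacity=7)
print(result)
```

For this data the best choice is the items of weight 3 and 4, which fill the capacity exactly.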

To read the full article, click here.
