
Essential Math for Data Science: Take Control of Your Data with Fundamental Linear Algebra, Probability, and Statistics
- Length: 350 pages
- Edition: 1
- Language: English
- Publisher: O'Reilly Media
- Publication Date: 2022-07-12
- ISBN-10: 1098102932
- ISBN-13: 9781098102937
- Sales Rank: #279418
Master the math needed to excel in data science, machine learning, and statistics. In this book, author Thomas Nield guides you through areas like calculus, probability, linear algebra, and statistics, and shows how they apply to techniques like linear regression, logistic regression, and neural networks. Along the way you’ll also gain practical insights into the state of data science and how to use those insights to maximize your career.
Learn how to:
- Use Python code and libraries like SymPy, NumPy, and scikit-learn to explore essential mathematical concepts like calculus, linear algebra, statistics, and machine learning
- Understand techniques like linear regression, logistic regression, and neural networks in plain English, with minimal mathematical notation and jargon
- Perform descriptive statistics and hypothesis testing on a dataset to interpret p-values and statistical significance
- Manipulate vectors and matrices and perform matrix decomposition
- Integrate and build upon incremental knowledge of calculus, probability, statistics, and linear algebra, and apply it to regression models including neural networks
- Navigate practically through a data science career and avoid common pitfalls, assumptions, and biases while tuning your skill set to stand out in the job market
Preface: Conventions Used in This Book; Using Code Examples; O’Reilly Online Learning; How to Contact Us; Acknowledgments
1. Basic Math and Calculus Review: Number Theory; Order of Operations; Variables; Functions; Summations; Exponents; Logarithms; Euler’s Number and Natural Logarithms; Euler’s Number; Natural Logarithms; Limits; Derivatives; Partial Derivatives; The Chain Rule; Integrals; Conclusion; Exercises
2. Probability: Understanding Probability; Probability Versus Statistics; Probability Math; Joint Probabilities; Union Probabilities; Conditional Probability and Bayes’ Theorem; Joint and Union Conditional Probabilities; Binomial Distribution; Beta Distribution; Conclusion; Exercises
3. Descriptive and Inferential Statistics: What Is Data?; Descriptive Versus Inferential Statistics; Populations, Samples, and Bias; Descriptive Statistics; Mean and Weighted Mean; Median; Mode; Variance and Standard Deviation; The Normal Distribution; The Inverse CDF; Z-Scores; Inferential Statistics; The Central Limit Theorem; Confidence Intervals; Understanding P-Values; Hypothesis Testing; The T-Distribution: Dealing with Small Samples; Big Data Considerations and the Texas Sharpshooter Fallacy; Conclusion; Exercises
4. Linear Algebra: What Is a Vector?; Adding and Combining Vectors; Scaling Vectors; Span and Linear Dependence; Linear Transformations; Basis Vectors; Matrix Vector Multiplication; Matrix Multiplication; Determinants; Special Types of Matrices; Square Matrix; Identity Matrix; Inverse Matrix; Diagonal Matrix; Triangular Matrix; Sparse Matrix; Systems of Equations and Inverse Matrices; Eigenvectors and Eigenvalues; Conclusion; Exercises
5. Linear Regression: A Basic Linear Regression; Residuals and Squared Errors; Finding the Best Fit Line; Closed Form Equation; Inverse Matrix Techniques; Gradient Descent; Overfitting and Variance; Stochastic Gradient Descent; The Correlation Coefficient; Statistical Significance; Coefficient of Determination; Standard Error of the Estimate; Prediction Intervals; Train/Test Splits; Multiple Linear Regression; Conclusion; Exercises
6. Logistic Regression and Classification: Understanding Logistic Regression; Performing a Logistic Regression; Logistic Function; Fitting the Logistic Curve; Multivariable Logistic Regression; Understanding the Log-Odds; R-Squared; P-Values; Train/Test Splits; Confusion Matrices; Bayes’ Theorem and Classification; Receiver Operator Characteristics/Area Under Curve; Class Imbalance; Conclusion; Exercises
7. Neural Networks: When to Use Neural Networks and Deep Learning; A Simple Neural Network; Activation Functions; Forward Propagation; Backpropagation; Calculating the Weight and Bias Derivatives; Stochastic Gradient Descent; Using scikit-learn; Limitations of Neural Networks and Deep Learning; Conclusion; Exercise
8. Career Advice and the Path Forward: Redefining Data Science; A Brief History of Data Science; Finding Your Edge; SQL Proficiency; Programming Proficiency; Data Visualization; Knowing Your Industry; Productive Learning; Practitioner Versus Advisor; What to Watch Out For in Data Science Jobs; Role Definition; Organizational Focus and Buy-In; Adequate Resources; Reasonable Objectives; Competing with Existing Systems; A Role Is Not What You Expected; Does Your Dream Job Not Exist?; Where Do I Go Now?; Conclusion
A. Supplemental Topics: Using LaTeX Rendering with SymPy; Binomial Distribution from Scratch; Beta Distribution from Scratch; Deriving Bayes’ Theorem; CDF and Inverse CDF from Scratch; Use e to Predict Event Probability Over Time; Hill Climbing and Linear Regression; Hill Climbing and Logistic Regression; A Brief Intro to Linear Programming; MNIST Classifier Using scikit-learn
B. Exercise Answers: Chapter 1; Chapter 2; Chapter 3; Chapter 4; Chapter 5; Chapter 6; Chapter 7
Index
About the Author
How to download source code?
1. Go to: https://www.oreilly.com/
2. Search for the book title: Essential Math for Data Science: Take Control of Your Data with Fundamental Linear Algebra, Probability, and Statistics. If the full title returns no results, search for the main title only.
3. Click the book title in the search results.
4. In the Publisher resources section, click Download Example Code.
1. Disable the AdBlock plugin. Otherwise, you may not get any links.
2. Solve the CAPTCHA.
3. Click the download link.
4. You will be redirected to the download server to complete the download.