50-Days-Data-Science

Welcome to the Data Science course! Over the next 50 days, you will learn a wide range of topics related to Python programming, data science, and machine learning. These topics will be covered in a variety of posts, so be sure to bookmark this page and follow me here and on GitHub for updates.

Throughout the course, you will have the opportunity to work with real-world data sets and apply the concepts you have learned to solve practical problems. You will also find exercises in each post that you can practice to further solidify your understanding of the material. All materials and exercises will be available on the GitHub repository linked below.

By the end of the course, you will have a strong foundation in data science and be well-prepared to pursue further study or a career in the field. So let’s get started!

Day Content Article Links
Day 1 Python Basics Link
Day 2 Python Data Structures Link
Day 3 OOPs in Python Link
Day 4 NumPy Link
Day 5 Pandas Link
Day 6 Data Visualization: Matplotlib and Seaborn Link
Day 7 DBMS (SQL and SQLite) Link
Day 8 Linear Algebra/Matrices Link
Day 9 Statistics Link
Day 10 Probability Link
Day 11 Calculus Link
Day 12 EDA (Exploratory Data Analysis) Link
Day 13 Introduction to Machine Learning Link
Day 14 Supervised Learning Link
Day 15 Unsupervised Learning Link
Day 16 Reinforcement Learning Link
Day 17 Linear Regression in Python: From Data to Model Link
Day 18 Encoding Techniques: Transforming Categorical Data Link
Day 19 Multivariate Linear Regression Link
Day 20 Bias vs Variance Link
Day 21 Evaluation Metrics for Classification and Regression Link
Day 22 Heuristic Search Techniques Link
Day 23 Project 1: Predicting Boston Housing Prices using Regression Models Link
Day 24 Project 2: Email Spam Classification Link
Day 25 KNN (K-nearest neighbors) Link
Day 26 Project 3: KNN (K-nearest neighbors) Classification Link
Day 27 Logistic Regression Link
Day 28 Support Vector Machines (SVM) Link
Day 29 Decision Trees Link
Day 30 Time Series Prediction Link
Day 31 Clustering Algorithms Link
Day 32 Centroid-based Clustering Link
Day 33 Project 4: Sentiment Analysis of Twitter Link
Day 34 Project 5: Hotel Reservations Dataset: Best Machine Learning Link
Day 35 GridSearchCV in scikit-learn Link
Day 36 Project 5 (Improved): Hotel Reservations Dataset: Best Machine Learning Link
Day 37 Project 6: Drug classification Link
Day 38 Random Forest Link
Day 39 Dimensionality Reduction Link
Day 40 Overfitting and Underfitting Link
  1. Python

    1. Python basics

      1. Input/Output

        • Printing to the console

        • Getting input from the user

      2. Operators

        • Arithmetic operators (e.g. +, -, *, /)

        • Comparison operators (e.g. ==, !=, >, <)

        • Logical operators (e.g. and, or, not)

      3. Operations

        • Working with variables

        • Data types (e.g. int, float, str)

        • Type conversion

        • Basic string manipulation (e.g. indexing, slicing, concatenation)

    2. Python data structures

      1. list

        • Creating and accessing lists

        • Modifying lists (e.g. adding, removing, and sorting elements)

        • Looping through lists

      2. tuple

        • Creating and accessing tuples

        • Tuple immutability (e.g. why elements cannot be added or removed in place, and how to build a new tuple instead)

        • Looping through tuples

      3. set

        • Creating and accessing sets

        • Modifying sets (e.g. adding, removing, and intersecting elements)

        • Looping through sets

      4. dictionary

        • Creating and accessing dictionaries

        • Modifying dictionaries (e.g. adding, removing, and updating key-value pairs)

        • Looping through dictionaries

    3. Python fundamentals

      1. loops

        • For loops

        • While loops

        • Break and continue statements

      2. functions

        • Defining and calling functions

        • Parameters and arguments

        • Return values

      3. objects and classes

        • Defining classes and objects

        • Constructors and destructors

        • Inheritance

        • Method overloading and overriding

    4. Pandas

      • Introduction to Pandas library

      • Loading and saving data with Pandas

      • Working with DataFrames and Series

      • Manipulating and cleaning data with Pandas

    5. NumPy

      • Introduction to NumPy library

      • Creating and accessing arrays

      • Array operations (e.g. reshaping, slicing, and element-wise operations)

      • Mathematical and statistical functions

    6. Matplotlib

      • Introduction to Matplotlib library

      • Creating basic plots (e.g. line, scatter, and bar plots)

      • Customizing plots (e.g. labels, titles, and legends)

      • Saving and showing plots

  2. SQL

    • Introduction to Structured Query Language (SQL)

    • Creating and modifying databases and tables

    • Selecting, filtering, and sorting data

    • Grouping and aggregating data

    • Joining tables

    • Subqueries and views

  3. Maths Refresher

    1. Statistics

      • Mean, median, mode

      • Range, variance, standard deviation

      • Percentiles and quartiles

      • Z-scores

    2. Probability

      • Basic probability concepts (e.g. events, sample space, and probability)

      • Conditional probability and independence

    3. Linear algebra

      • Vectors and matrices

      • Matrix operations (e.g. addition, multiplication, and transposition)

    4. Calculus

      • Limits and continuity

      • Derivatives

      • Integrals

      • Fundamental theorem of calculus

  4. Python for data science

    1. Jupyter Notebook and Google Colab walkthrough

      • Introduction to Jupyter notebooks and Google Colab

      • Creating and running cells

      • Importing and exporting notebooks

    2. Python data science libraries

      • Introduction to popular data science libraries (e.g. Scikit-learn, TensorFlow, and Keras)

      • Installing and importing libraries

    3. Exploratory data analysis

      1. Visualization

        • Introduction to Matplotlib and Seaborn

        • Plotting distributions, scatterplots, and boxplots

        • Customizing plots

      2. Summary statistics

        • Calculating basic statistics (e.g. mean, median, and standard deviation)

        • Generating descriptive statistics with Pandas

      3. Correlation analysis

        • Calculating and interpreting correlations

        • Visualizing correlations with scatterplots

      4. Data cleaning

        • Handling missing values

        • Removing outliers

        • Normalizing and standardizing data

      5. Dimension reduction

        • Introduction to dimension reduction techniques (e.g. PCA and t-SNE)

        • Implementing and interpreting dimension reduction in Python

      6. Anomaly detection

        • Introduction to anomaly detection techniques (e.g. isolation forests and local outlier factor)

        • Implementing and interpreting anomaly detection in Python

      7. Feature engineering

        • Introduction to feature engineering

        • Creating new features from existing data

        • Selecting relevant features for model building

  5. Machine learning

    1. Introduction

      • Definition and types of machine learning

      • Differences between supervised, unsupervised, and reinforcement learning

    2. Supervised learning

      • Regression and classification algorithms

      • Evaluation metrics for regression and classification models (e.g. mean squared error and accuracy)

    3. Classification

      • K-nearest neighbors (KNN)

      • Logistic regression

      • Support vector machines (SVM)

    4. Decision trees

      • Introduction to decision trees

      • Implementing decision trees in Python

      • Visualizing decision trees

    5. Time series prediction

      • Introduction to time series data

      • Moving average and exponential smoothing models

      • Autoregressive integrated moving average (ARIMA) model

    6. Unsupervised learning

      • Clustering algorithms (e.g. k-means and hierarchical clustering)

      • Evaluation metrics for clustering (e.g. silhouette score and Calinski-Harabasz index)

    7. Some projects (5-8)

      • Suggested projects to apply machine learning concepts (e.g. building a spam detector or a customer segmentation model)
  6. Tableau

    • Connecting to and importing data

    • Working with data in Tableau

    • Creating and customizing visualizations

    • Dashboarding and storytelling with Tableau

    • Advanced techniques (e.g. calculated fields, parameters, and table calculations)

    • Exporting and publishing dashboards
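
As a small taste of the "Python data structures" module in the outline above, here is a minimal sketch of how lists, tuples, sets, and dictionaries behave when you try to modify them. All values and variable names are hypothetical and chosen only for illustration; they are not taken from the course posts.

```python
# Hypothetical example values, used only for illustration.

# list: ordered and mutable
days = ["Python Basics", "NumPy", "Pandas"]
days.append("Statistics")          # lists can grow in place

# tuple: ordered and immutable
point = (3, 4)
try:
    point[0] = 10                  # item assignment is not allowed on tuples
    tuple_is_mutable = True
except TypeError:
    tuple_is_mutable = False
extended = point + (5,)            # "adding" an element builds a new tuple

# set: unordered collection of unique elements
tags = {"python", "sql"}
tags.add("python")                 # adding a duplicate changes nothing
common = tags & {"sql", "tableau"} # intersection of two sets

# dictionary: key-value pairs
progress = {"Day 1": "done"}
progress["Day 2"] = "in progress"  # add a new key (or update an existing one)

print(days)
print(tuple_is_mutable, extended)
print(common)
print(progress)
```

The tuple case is the one that most often surprises beginners: the original `point` is left unchanged, and `extended` is a separate object.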

The same syllabus in tabular form:

Module Topic Sub-Topic Content
Python Python basics Input/Output Printing to the console
      Getting input from the user
    Operators Arithmetic operators (e.g. +, -, *, /)
      Comparison operators (e.g. ==, !=, >, <)
      Logical operators (e.g. and, or, not)
    Operations Working with variables
      Data types (e.g. int, float, str)
      Type conversion
      Basic string manipulation (e.g. indexing, slicing, concatenation)
  Python data structures list Creating and accessing lists
      Modifying lists (e.g. adding, removing, and sorting elements)
      Looping through lists
    tuple Creating and accessing tuples
      Tuple immutability (e.g. why elements cannot be added or removed in place, and how to build a new tuple instead)
      Looping through tuples
    set Creating and accessing sets
      Modifying sets (e.g. adding, removing, and intersecting elements)
      Looping through sets
    dictionary Creating and accessing dictionaries
      Modifying dictionaries (e.g. adding, removing, and updating key-value pairs)
      Looping through dictionaries
  Python fundamentals loops For loops
      While loops
      Break and continue statements
    functions Defining and calling functions
      Parameters and arguments
      Return values
    objects and classes Defining classes and objects
      Constructors and destructors
      Inheritance
      Method overloading and overriding
  Pandas Introduction to Pandas library  
      Loading and saving data with Pandas
      Working with DataFrames and Series
      Manipulating and cleaning data with Pandas
  NumPy Introduction to NumPy library  
      Creating and accessing arrays
      Array operations (e.g. reshaping, slicing, and element-wise operations)
      Mathematical and statistical functions
  Matplotlib Introduction to Matplotlib library  
      Creating basic plots (e.g. line, scatter, and bar plots)
      Customizing plots (e.g. labels, titles, and legends)
      Saving and showing plots
SQL Introduction to Structured Query Language (SQL)    
      Creating and modifying databases and tables
      Selecting, filtering, and sorting data
      Grouping and aggregating data
  Joining tables    
      Subqueries and views
Maths Refresher Statistics Mean, median, mode  
      Range, variance, standard deviation
      Percentiles and quartiles
      Z-scores
  Probability Basic probability concepts (e.g. events, sample space, and probability)  
      Conditional probability and independence
      Bayes’ theorem
  Linear algebra Vectors and matrices  
      Matrix operations (e.g. addition, multiplication, and transposition)
      Determinants and inverses
  Calculus Limits and continuity  
      Derivatives
      Integrals
      Fundamental theorem of calculus
Python for data science Jupyter Notebook and Google Colab walkthrough Introduction to Jupyter notebooks and Google Colab  
      Creating and running cells
      Importing and exporting notebooks
  Python data science libraries Introduction to popular data science libraries (e.g. Scikit-learn, TensorFlow, and Keras)  
      Installing and importing libraries
  Exploratory data analysis Visualization Introduction to Matplotlib and Seaborn
      Plotting distributions, scatterplots, and boxplots
      Customizing plots
    Summary statistics Calculating basic statistics (e.g. mean, median, and standard deviation)
      Generating descriptive statistics with Pandas
    Correlation analysis Calculating and interpreting correlations
      Visualizing correlations with scatterplots
    Data cleaning Handling missing values
      Removing outliers
      Normalizing and standardizing data
    Dimension reduction Introduction to dimension reduction techniques (e.g. PCA and t-SNE)
      Implementing and interpreting dimension reduction in Python
    Anomaly detection Introduction to anomaly detection techniques (e.g. isolation forests and local outlier factor)
      Implementing and interpreting anomaly detection in Python
    Feature engineering Introduction to feature engineering
      Creating new features from existing data
      Selecting relevant features for model building
Machine learning Introduction Definition and types of machine learning  
      Differences between supervised, unsupervised, and reinforcement learning
  Supervised learning Regression and classification algorithms  
      Evaluation metrics for regression and classification models (e.g. mean squared error and accuracy)
  Classification K-nearest neighbors (KNN)  
      Logistic regression
      Support vector machines (SVM)
  Decision trees Introduction to decision trees  
      Implementing decision trees in Python
      Visualizing decision trees
  Time series prediction Introduction to time series data  
      Moving average and exponential smoothing models
      Autoregressive integrated moving average (ARIMA) model
  Unsupervised learning Clustering algorithms (e.g. k-means and hierarchical clustering)  
      Evaluation metrics for clustering (e.g. silhouette score and Calinski-Harabasz index)
  Some projects (5-8) Suggested projects to apply machine learning concepts (e.g. building a spam detector or a customer segmentation model)  
Tableau Introduction to Tableau    
      Connecting to and importing data
      Working with data
  Working with data in Tableau    
      Creating and customizing visualizations
      Dashboarding and storytelling with Tableau
    Advanced techniques Calculated fields, parameters, and table calculations
      Exporting and publishing dashboards
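
To make the "Statistics" rows of the Maths Refresher module a little more concrete, the sketch below computes the listed quantities (mean, median, mode, range, variance, standard deviation, and z-scores) with Python's standard `statistics` module. The sample data is hypothetical, and the choice of population (rather than sample) variance is an assumption made only to keep the numbers simple.

```python
import statistics

# Hypothetical sample data, chosen only for illustration.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(scores)            # arithmetic mean
median = statistics.median(scores)        # middle value of the sorted data
mode = statistics.mode(scores)            # most frequent value
data_range = max(scores) - min(scores)    # distance between the extremes
variance = statistics.pvariance(scores)   # population variance (assumption)
std_dev = statistics.pstdev(scores)       # population standard deviation

# A z-score says how many standard deviations a value lies from the mean.
z_scores = [(x - mean) / std_dev for x in scores]

print("mean:", mean, "median:", median, "mode:", mode)
print("range:", data_range, "variance:", variance, "std dev:", std_dev)
print("z-scores:", z_scores)
```

For sample (rather than population) estimates, `statistics.variance` and `statistics.stdev` can be swapped in; the z-score formula stays the same.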

I hope you enjoy learning data science with me! By completing this course, you will have a strong foundation in Python programming, SQL, the mathematics behind data science, Python for data science, machine learning, and Tableau. You will be well-prepared to pursue further study or a career in the field, and I encourage you to keep learning and stay up to date on new developments in data science.

Thank you for joining me on this journey. I hope you will keep following along for future updates and learning opportunities. Don’t forget to check out the GitHub repository linked below for all materials and exercises. I look forward to seeing what you accomplish with your new skills!

To suggest additional topics, open a PR or comment on this post: Complete-Data-Science-Bootcamp
