Practical Data Science With Jupyter

The world runs on data, but who’s analyzing it? That could be you. Start your data science journey with Jupyter today.


About This Course

Enroll in our Practical Data Science with Jupyter course to tackle real-world data challenges using Python and Jupyter.

In this hands-on course, you'll explore data cleaning, feature engineering, and machine learning techniques through practical examples and interactive exercises. From setting up your environment to building predictive models, you'll gain the skills needed to analyze data effectively and make informed decisions.

Skills You’ll Get

  • Python Programming for Data Science: Master Python fundamentals, including data structures, functions, and libraries like NumPy and pandas, to efficiently manipulate and analyze data.
  • Data Cleaning and Preprocessing: Learn techniques to clean, normalize, and prepare diverse datasets, ensuring they're analysis-ready for accurate insights.
  • Data Visualization: Create compelling visualizations using tools like Matplotlib and Seaborn to effectively communicate data-driven stories.
  • Statistical Analysis and Feature Engineering: Understand statistical concepts and apply feature engineering methods to enhance model performance.
  • Machine Learning Techniques: Gain hands-on experience with supervised and unsupervised machine learning algorithms, including regression, classification, and clustering.
  • Time-Series Analysis: Develop skills to analyze and forecast time-series data, applying models like ARIMA for predictive analytics.

1

Preface

2

Data Science Fundamentals

  • What is data?
  • What is data science?
  • What does a data scientist do?
  • Real-world use cases of data science
  • Why Python for data science?
  • Conclusion
3

Installing Software and System Setup

  • System requirements
  • Downloading Anaconda
  • Installing Anaconda on Windows
  • Installing Anaconda on Linux
  • How to install a new Python library in Anaconda?
  • Open your notebook - Jupyter
  • Know your notebook
  • Conclusion
4

Lists and Dictionaries

  • What is a list?
  • How to create a list?
  • Different list manipulation operations
  • Difference between Lists and Tuples
  • What is a Dictionary?
  • How to create a dictionary?
  • Some operations with dictionaries
  • Conclusion
5

Package, Function, and Loop

  • The help() function in Python
  • How to import a Python package?
  • How to create and call a function?
  • Passing parameters to a function
  • Default parameters in a function
  • How to use unknown parameters in a function?
  • Global and local variables in a function
  • What is a Lambda function?
  • Understanding main in Python
  • while and for loops in Python
  • Conclusion
6

NumPy Foundation

  • Importing a NumPy package
  • Why use NumPy array over list?
  • NumPy array attributes
  • Creating NumPy arrays
  • Accessing an element of a NumPy array
  • Slicing in NumPy array
  • Array concatenation
  • Conclusion
7

Pandas and DataFrame

  • Importing Pandas
  • Pandas data structures
  • .loc[] and .iloc[]
  • Some Useful DataFrame Functions
  • Handling missing values in DataFrame
  • Conclusion
8

Interacting with Databases

  • What is SQLAlchemy?
  • Installing SQLAlchemy package
  • How to use SQLAlchemy?
  • SQLAlchemy engine configuration
  • Creating a table in a database
  • Inserting data in a table
  • Update a record
  • How to join two tables
  • Conclusion
9

Thinking Statistically in Data Science

  • Statistics in data science
  • Types of statistical data/variables
  • Basics of probability
  • Statistical distributions
  • Pearson correlation coefficient
  • Probability Density Function (PDF)
  • Real-world example
  • Statistical inference and hypothesis testing
  • Conclusion
10

Cleaning of Imported Data

  • Know your data
  • Analyzing missing values
  • Dropping missing values
  • Automatically fill missing values
  • How to scale and normalize data?
  • How to parse dates?
  • How to apply character encoding?
  • Cleaning inconsistent data
  • Conclusion
11

Data Visualization

  • Bar chart
  • Line chart
  • Histograms
  • Scatter plot
  • Stacked plot
  • Box plot
  • Conclusion
12

Data Pre-processing

  • About the case-study
  • Importing the dataset
  • Exploratory data analysis
  • Data cleaning and pre-processing
  • Feature Engineering
  • Conclusion
13

Supervised Machine Learning

  • Some common ML terms
  • Introduction to machine learning (ML)
  • Unsupervised learning
  • List of common ML algorithms
  • Supervised ML fundamentals
  • Solving a classification ML problem
  • Solving a regression ML problem
  • How to tune your ML model?
  • How to handle categorical variables in sklearn?
  • The advanced technique to handle missing data
  • Conclusion
14

Unsupervised Machine Learning

  • Why unsupervised learning?
  • Unsupervised learning techniques
  • Principal Component Analysis (PCA)
  • Case study
  • Validation of unsupervised ML
  • Conclusion
15

Handling Time-Series Data

  • Why is time-series data important?
  • How to handle date and time?
  • Transforming time-series data
  • Manipulating time-series data
  • Comparing time-series growth rates
  • How to change time-series frequency?
  • Conclusion
16

Time-Series Methods

  • What is time-series forecasting?
  • Basic steps in forecasting
  • Time-series forecasting techniques
  • Autoregression (AR)
  • Moving Average (MA)
  • Forecast future traffic to a web page
  • Conclusion
17

Case Study-1

  • Predict whether or not an applicant will be able to repay a loan
  • Conclusion
18

Case Study-2

  • Build a prediction model that will accurately classify which text messages are spam
  • Conclusion
19

Case Study-3

  • Build a film recommendation engine
  • Conclusion
20

Case Study-4

  • Predict house sales in King County, Washington State, USA, using regression
  • Conclusion
21

Python Virtual Environment

22

Introduction to An Advanced Algorithm - CatBoost

  • What is a Gradient Boosting algorithm?
  • Introduction to CatBoost
  • Install CatBoost in Python virtual environment
  • How to solve a classification problem with CatBoost?
  • Push your notebook to your GitHub repository
  • Conclusion
23

Revision of All Lessons' Learning

  • Conclusion
24

How to Import Data in Python?

  • Importing text data
  • Importing CSV data
  • Importing Excel data
  • Importing JSON data
  • Importing pickled data
  • Importing compressed data
  • Conclusion


Designed for beginners and professionals seeking to upskill or reskill, this Jupyter Notebook training is suitable for anyone interested in data science. A basic understanding of Python is beneficial but not mandatory.

To use Jupyter Notebook for data analysis, follow these steps:

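  • Install Jupyter: Use pip install jupyterlab or conda install jupyterlab (for the classic interface, install the notebook package instead).
  • Launch: Run jupyter lab (or jupyter notebook for the classic interface) in your terminal/command prompt.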
  • Create a Notebook: Click New → Python 3 (or your preferred kernel).
  • Import Libraries: Use pandas, numpy, matplotlib, or seaborn for analysis.
  • Load Data: Read datasets (e.g., CSV, Excel) using pd.read_csv().
  • Explore Data: Use df.head(), df.describe(), and df.info().
  • Clean & Analyze: Handle missing data, filter, group, and visualize.
  • Visualize: Create plots (e.g., df.plot(), plt.scatter()).
  • Save & Share: Download as .ipynb or export to HTML/PDF.
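
The load, explore, and clean steps above can be sketched in a single notebook cell. This is a minimal illustration and not taken from the course material: the inline CSV text, the column names (city, month, sales), and the choice to fill missing values with the median are all assumptions made for the example. Plotting is left out so the cell runs with pandas alone.

```python
import io

import pandas as pd

# Hypothetical inline data standing in for a real CSV file on disk.
CSV_TEXT = """city,month,sales
Austin,Jan,120
Austin,Feb,
Boston,Jan,90
Boston,Feb,110
"""

# Load Data: pd.read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(CSV_TEXT))

# Explore Data: preview rows, summary statistics, and column types.
print(df.head())
print(df.describe())
df.info()

# Clean: count missing values, then fill the gap with the column median
# (an assumed strategy; dropping the row is another common choice).
missing_before = int(df["sales"].isna().sum())
df["sales"] = df["sales"].fillna(df["sales"].median())

# Analyze: filter rows and aggregate by group.
january = df[df["month"] == "Jan"]
sales_by_city = df.groupby("city")["sales"].sum()
print(sales_by_city)
```

With a real dataset, replacing io.StringIO(CSV_TEXT) with a file path is usually the only change needed for the loading step. A call such as sales_by_city.plot(kind="bar") would cover the visualize step once matplotlib is installed.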
