These are the tools you are almost certain to use when doing data science in Python. Many other tools are built on top of them.
Summary
- NumPy
- Pandas
- SciPy
- Scikit-Learn
- Matplotlib
- Jupyter
Multi-Tool Focus
- Mirko Stojiljkovic. NumPy, SciPy, and Pandas: Correlation with Python. realpython, 12/23/19.*
NumPy
- NumPy – Repository – Stars: 22.3k – Updated: 12/2022 – Checked: 12/2022 – “The fundamental package for scientific computing with Python.”
- Mirko Stojiljkovic. NumPy arange(): How to use np.arange(). realpython, 7/22/2019.
- Beau Carnes. Learn NumPy and Start Doing Scientific Computing in Python. freecodecamp, 8/9/19.
- Stephen Gruppetta. np.linspace(): Create Evenly or Non-Evenly Spaced Arrays. realpython, 11/2020.
- Jay Alammar. A Visual Intro to NumPy and Data Representation. 2019.
Pandas
- Pandas – Repository – Stars: 36.4k – Updated: 12/2022 – Checked: 12/2022 – “Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more.”
- Bryan Weber. Pandas Project: Make a Gradebook with Python & Pandas. realpython, 7/2020.*
- Nick McCullum. The Ultimate Guide to the Pandas Library for Data Science in Python. freecodecamp, 7/2020.
- Zachary Wilson. The Pandas Library for Python. kite, 3/25/19.
- Bhavani Ravi. Python Pandas–Basics to Beyond. hackernoon, 2019.
- Bhavani Ravi. Learn Python Pandas in 5 Mins.
- Part 2. 2/7/19.
- Parul Pandey. Time Series Analysis with Pandas. kite, 2019.
- Alex DeBrie. How to Use Pandas GroupBy, Counts and Value Counts. kite, 7/18/2019.
- Mokhtar Ebrahim. Python Pandas Tutorial: Getting Started with DataFrames. like geeks, 2/2019.
- Brad Solomon. Pandas GroupBy: Your Guide to Grouping Data in Python. realpython, 11/18/19.
- Alex DeBrie. Pandas Pivot: A Guide with Examples. kite, 6/29/19.
- T.J. Simmons. The Quickest Ways to Sort Pandas DataFrame Values. kite, 6/25/19.
- Alex DeBrie. Pandas Merge, Join, and Concat: How To and Examples. kite, 5/3/2019.
- Alex DeBrie. Guide: Pandas DataFrames for Data Analysis. kite, 4/3/19.
- Dataframe Visualization with Pandas Plot. kanoki, 2019.
- Brad Solomon. Python Pandas: Tricks & Features You May Not Know. realpython.
- Chris Moffitt. Effectively Using Matplotlib. practical business python, 4/25/17.
- Peter Nistrup. Exploring Your Data with Just 1 Line of Python. towardsdatascience, 9/25/19.
- Reka Horvath. Using Pandas and Python to Explore Your Dataset. realpython, 1/6/20.
- Mirko Stojiljkovic. Pandas: How to Read and Write Files. realpython, 12/2/19.
- Malay Agarwal. Pythonic Data Cleaning with Pandas and NumPy. realpython.
- Joe Wyndham. Fast, Flexible, Easy and Intuitive: How to Speed Up Your Pandas Projects. realpython.
- Mirko Stojiljković. SettingWithCopyWarning in Pandas: Views vs Copies. realpython, 6/2020.
- Kyle Stratis. Combining Data in Pandas with merge(), .join(), and concat(). realpython, 4/2020.
- Reka Horvath. Plot with Pandas: Python Data Visualization for Beginners. realpython, 9/2020.
SciPy
- SciPy – Repository – Stars: 10.6k – Updated: 12/2022 – Checked: 12/2022 – Data Science and Analysis toolset. Includes NumPy, SciPy, Matplotlib, IPython, pandas, SymPy, nose.
- Bryan Weber. Scientific Python: Using SciPy for Optimization. realpython, 1/20/20.
Scikit-Learn
- Scikit-Learn – Repository – Stars: 52.4k – Updated: 12/2022 – Checked: 12/2022 – “Simple and efficient tools for predictive data analysis…built on NumPy, SciPy, and matplotlib.”
- Mirko Stojiljkovic. Split Your Dataset with scikit-learn’s train_test_split(). realpython, 11/2020.
Matplotlib
- Matplotlib – Repository – Stars: 16.6k – Updated: 12/2022 – Checked: 12/2022
- Shaumik Daityari. How to Plot Charts in Python with Matplotlib. sitepoint, 7/10/2019.
- Hennadii Madan. Matplotlib Explained. kite, 3/5/19.
- Brad Solomon. Python Plotting with Matplotlib (Guide). realpython.
Jupyter
- Mike Driscoll. Jupyter Notebook: An Introduction. realpython.