Fundamental Python Data Science Tooling

These are the tools you are almost certain to use when doing data science in Python. Many other tools are built on top of them.

Summary

  • NumPy
  • Pandas
  • SciPy
  • Scikit-Learn
  • Matplotlib
  • Jupyter

Multi-Tool Focus

NumPy

  • NumPy Repository – Stars: 22.3k – Updated: 12/2022 – Checked: 12/2022 – “The fundamental package for scientific computing with Python.”

Pandas

  • Pandas Repository – Stars: 36.4k – Updated: 12/2022 – Checked: 12/2022 – “Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more.”
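As a rough illustration of what "labeled data structures" and "statistical functions" mean in practice, here is a minimal sketch. The column names and values are invented for the example and are not taken from the pandas project; it also assumes pandas and NumPy are installed.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, invented purely for illustration.
df = pd.DataFrame(
    {"height_cm": [150.0, 160.0, 170.0], "weight_kg": [50.0, 60.0, 70.0]},
    index=["a", "b", "c"],
)

# Columns and rows are addressed by label, much like an R data.frame.
mean_height = df["height_cm"].mean()
row_b = df.loc["b"]

# Built-in statistical helpers summarize each numeric column.
summary = df.describe()

# For plain numeric columns like these, the data can be pulled out
# as a NumPy array, which is how pandas builds on NumPy.
values = df.to_numpy()

print(mean_height)
print(summary)
print(type(values), values.shape)
```

The last step is one small view of how these libraries stack: a DataFrame of numeric columns converts directly to a NumPy array that other libraries can consume.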

SciPy

  • SciPy Repository – Stars: 10.6k – Updated: 12/2022 – Checked: 12/2022 – Data science and analysis toolset. The wider SciPy ecosystem includes NumPy, SciPy, Matplotlib, IPython, pandas, SymPy, and nose.

  • Bryan Weber. Scientific Python: Using SciPy for Optimization. Real Python, 1/20/20.
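For a sense of what optimization with SciPy looks like, here is a minimal sketch using `scipy.optimize.minimize_scalar`. The `cost` function is a made-up example, not one taken from the article above, and the sketch assumes SciPy is installed.

```python
from scipy.optimize import minimize_scalar


def cost(x):
    """Hypothetical one-dimensional cost with its minimum at x = 2."""
    return (x - 2.0) ** 2 + 1.0


# With no bounds given, minimize_scalar uses its default method (Brent's).
result = minimize_scalar(cost)

# result.x is the estimated minimizer, result.fun the cost at that point.
print(result.x, result.fun)
```

For functions of several variables, `scipy.optimize.minimize` plays the same role and takes an initial guess as its second argument.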

Scikit-Learn

  • Scikit-Learn Repository – Stars: 52.4k – Updated: 12/2022 – Checked: 12/2022 – “Simple and efficient tools for predictive data analysis…built on NumPy, SciPy, and matplotlib.”

Matplotlib

Jupyter