This page will eventually collect a number of resources related to data science as practiced in Python. Right now I’m working on pandas.
An article I have found quite helpful as an introductory, opinionated guide is Minimally Sufficient Pandas by Ted Petrou (1/2019).
- We have a separate page dedicated to the fundamental tools of Python data science (NumPy, Pandas, SciPy, Scikit-Learn, and Matplotlib).
- We have a separate page dedicated to the core tools of Python data science (PyTorch, TensorFlow, Keras, Seaborn, NLTK).
Articles
- Kirit Thadaka. Data Science, the Good, the Bad, and the…Future. kite, 2019.
- Pranathi V.N. Vemuri. Image Segmentation with Python. kite, 7/18/19.
- T.J. Simmons. 10 Essential Data Science Packages for Python. kite, 5/27/19.
- Joseph Lee Wei En. How to Get Started with Python for Deep Learning and Data Science. freecodecamp, 3/6/19.
- Rahul Agarwal. Data Scientists, the 5 Graph Algorithms that You Should Know. towardsdatascience, 2019.
- Chris Moffitt. Python Tools for Record Linking and Fuzzy Matching.* pbpython, 2/18/2020.
- Real Python Staff. Python for Social Scientists. realpython.
Statistics
- Mirko Stojiljkovic. Python Statistics Fundamentals: How to Describe Your Data. real python, 12/16/19.
- Tirtha Sarkar. Statistical Modeling with Python: How-to & Top Libraries. kite, 2019.
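As a small taste of what "describing your data" means in practice, here is a minimal sketch using only Python's standard-library `statistics` module. The sample values and the `describe` helper are hypothetical, made up for illustration; the articles above cover NumPy, SciPy, and pandas equivalents in far more depth.

```python
import statistics

# Hypothetical sample data, chosen only for illustration.
sample = [2.5, 3.1, 4.0, 4.0, 5.2, 6.8, 9.4]


def describe(values):
    """Return a few common descriptive statistics as a dict."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "stdev": statistics.stdev(values),  # sample standard deviation (n - 1)
        "min": min(values),
        "max": max(values),
    }


summary = describe(sample)
for name, value in summary.items():
    print(f"{name}: {value}")
```

For tabular data, pandas offers a comparable one-call summary through `DataFrame.describe()`.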
PySpark
- Luke Lee. First Steps with PySpark and Big Data Processing. realpython, 7/31/2019.
Linear Regression
- Mirko Stojiljkovic. Linear Regression in Python. realpython.
Natural Language Processing (NLP)
- Shaumik Daityari. Getting Started with Natural Language Processing in Python. 3/2019.
- spaCy – NLP
- Taranjeet Singh. Natural Language Processing with spaCy in Python. realpython.
Neural Networks
- Padmaja Bhagwat. Introduction to Artificial Neural Networks in Python. kite, 7/18/19.
- CNTK – Stars: 17k – Updated: 3/2020 – Checked: 1/2021 – “The Microsoft Cognitive Toolkit (https://cntk.ai) is a unified deep learning toolkit that describes neural networks as a series of computational steps via a directed graph.”
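To make the idea of "a series of computational steps via a directed graph" concrete, here is a minimal plain-Python sketch. It is not CNTK's API: the `Node` class, the `evaluate` function, and the input values are all hypothetical names invented for illustration. It models a single neuron, y = sigmoid(w * x + b), as a small graph and evaluates it.

```python
import math


class Node:
    """One computational step: an operation applied to the outputs of its input nodes."""

    def __init__(self, name, op=None, inputs=()):
        self.name = name
        self.op = op              # None marks an input (leaf) node
        self.inputs = tuple(inputs)


def evaluate(node, feed, cache=None):
    """Evaluate a node by first evaluating its inputs, caching each result by name."""
    if cache is None:
        cache = {}
    if node.name in cache:
        return cache[node.name]
    if node.op is None:
        value = feed[node.name]   # leaf: read the value supplied by the caller
    else:
        args = [evaluate(parent, feed, cache) for parent in node.inputs]
        value = node.op(*args)
    cache[node.name] = value
    return value


# Build the graph for a single neuron: y = sigmoid(w * x + b).
x = Node("x")
w = Node("w")
b = Node("b")
product = Node("product", lambda a, c: a * c, [w, x])
total = Node("total", lambda a, c: a + c, [product, b])
y = Node("y", lambda z: 1.0 / (1.0 + math.exp(-z)), [total])

# Hypothetical input values: 0.5 * 2.0 + (-1.0) = 0.0, and sigmoid(0.0) = 0.5.
result = evaluate(y, {"x": 2.0, "w": 0.5, "b": -1.0})
print(result)
```

Real toolkits add automatic differentiation and hardware acceleration on top of this kind of graph; the sketch only shows the forward evaluation.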
Books
- Jake VanderPlas. Python Data Science Handbook. O’Reilly, 2018. – Stars: 28k – Updated: 11/2018 – Checked: 2/2021.
- The full text is available as Jupyter notebooks in the GitHub repo.
Tools
- Renato Candido. Setting Up Python for Machine Learning on Windows. realpython.
- Data Science Python Notebooks – 18.5k Stars – 2019. Covers deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop, MapReduce, HDFS), Matplotlib, pandas, NumPy, SciPy, etc.
- Homemade Machine Learning* – 14k Stars – 2020 – “Python examples of popular machine learning algorithms with interactive Jupyter demos and math being explained.”
- Streamlit – “open-source app framework is the easiest way for data scientists and machine learning engineers to create beautiful, performant apps in only a few hours!”
- Spyder – Scientific Python Development Environment.
- TextBlob – Stars: 7.5k – Updated: 1/2021 – Checked: 2/2021 – Text processing including sentiment analysis.