Pandas 101 – Foundations in Data Science
What is Data Science?
Data science is the field of applying advanced analytics techniques and scientific principles to extract valuable information from data for business decision-making, strategic planning and other uses. A data scientist’s duties can include developing strategies for analyzing data; preparing data for analysis; exploring, analyzing, and visualizing data; building models with data using programming languages such as Python; and deploying models into applications.
Introducing Pandas for Data Science
pandas is a fast, powerful, flexible and easy to use open source data analysis and manipulation tool, built on top of the Python programming language.
Pandas is an open source Python package that is widely used for data science, data analysis and machine learning tasks. It is built on top of another package named NumPy, which provides support for multi-dimensional arrays. As one of the most popular data wrangling packages, Pandas works well with many other data science modules inside the Python ecosystem. It is bundled with data-science-oriented distributions such as Anaconda; with a standard Python installation it is installed separately, for example with pip.
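To make the relationship with NumPy concrete, here is a minimal sketch. The array values and the column names "a" and "b" are made up for illustration, and the example assumes both packages are installed.

```python
import numpy as np
import pandas as pd

# Hypothetical 2x2 array, made up for illustration.
values = np.array([[1.0, 2.0], [3.0, 4.0]])

# A DataFrame can be built directly from a NumPy array;
# pandas adds row and column labels on top of the raw values.
df = pd.DataFrame(values, columns=["a", "b"])

# Going the other way: get the underlying values back as a NumPy array.
arr = df.to_numpy()

# NumPy functions can operate on pandas columns as well.
mean_a = np.mean(df["a"])

print(type(arr))
print(mean_a)
```

In practice you rarely need to convert back and forth by hand, but knowing that the data lives in arrays helps explain why column-wise operations in Pandas are fast.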

What can you do with Pandas?
Pandas makes it simple to do many of the time-consuming, repetitive tasks associated with working with data, including:
- Data cleansing
- Data fill (filling in missing values)
- Data normalization
- Merges and joins
- Data visualization
- Statistical analysis
- Data inspection
- Loading and saving data
- And much more
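
Several of the tasks listed above can be sketched in a few lines. This is only an illustrative example, not course material: the `sales` and `stores` tables, their column names and their values are all hypothetical. Plotting is left out because it needs an extra plotting library such as matplotlib.

```python
import io

import pandas as pd

# Hypothetical sample data, made up for illustration only.
sales = pd.DataFrame({
    "store_id": [1, 2, 2, 3],
    "revenue": [100.0, 200.0, 200.0, None],
})
stores = pd.DataFrame({
    "store_id": [1, 2, 3],
    "city": ["Cape Town", "Durban", "Pretoria"],
})

# Data cleansing: drop the repeated row for store 2.
sales = sales.drop_duplicates().reset_index(drop=True)

# Data fill: replace the missing revenue with the column mean.
sales["revenue"] = sales["revenue"].fillna(sales["revenue"].mean())

# Data normalization: min-max scale revenue into the range 0 to 1.
rev = sales["revenue"]
sales["revenue_scaled"] = (rev - rev.min()) / (rev.max() - rev.min())

# Merges and joins: attach the city of each store.
merged = sales.merge(stores, on="store_id", how="left")

# Data inspection and statistical analysis.
print(merged.head())
print(merged["revenue"].describe())

# Loading and saving data: round-trip through CSV text held in memory.
buffer = io.StringIO()
merged.to_csv(buffer, index=False)
buffer.seek(0)
reloaded = pd.read_csv(buffer)
```

With real data you would usually pass a file path to `to_csv` and `read_csv` instead of an in-memory buffer.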

Prerequisites
- Completed Python 101: https://www.kode2go.co.za/courses/python-101/
- Write and test all code in a VS Code environment
What will you learn?
You will be learning the fundamentals of Pandas for Data Science tasks. It will serve as a foundation for learning further Data Science topics.
Chapter 1: Exploring Datasets in Pandas
Lesson 1.1: Creating and Reading Data
Lesson 1.2: Cleaning your Dataset
Lesson 1.3: Validating and Sorting your Data
Lesson 1.4: Plotting in Pandas
