Data is data, so why should it be / Python and Excel go awkwardly?
OK, I’m no lyricist, but there is a beauty in build one skill set off of another.
I believe that every data analyst should have in their “stack” of tools a spreadsheet application, BI/dashboarding program, database and programming language. For that last category, Python is a solid choice.
Data analysts generally come to programming from spreadsheets. Too often, programming languages are seen as the spreadsheet-killer. This should not be the case: after all, spreadsheets are a valued plank of the analytics stack!
Not only that, but “spreadsheet smarts” put the analyst at an advantage for Python mastery. After all, data is data — once the tasks and method become second nature to you, it’s easy to shift production into another tool.
I built the below learning guide with this shift in mind. By starting with the “mental model” of data in Excel, my hope is that analysts can augment their knowledge by picking up Python.
Lesson 1: Python and Excel for data analytics
Objective: Student can compare and contrast uses of Excel and Python for data analytics
Description:
- Welcome to Planet Python
- What is Python and when would you use it?
- Working in Jupyter
Exercises: “Hello world” in Jupyter
Assets needed: None
Time: 30 minutes
Lesson 2: From Excel cells to Python lists and dictionaries
Objective: Student can create, inspect and manipulate lists and dictionaries
Description:
- Variable assignment
- Data types
- Indexing and subsetting lists
- Indexing and accessing dictionaries
Exercises: Drills on lists and dictionaries
Assets needed: None
Time: 45 minutes
Lesson 3: From Excel tables to Python DataFrames
Objective: Student can create, inspect and manipulate DataFrames
Description:
- From lists to NumPy arrays
- From NumPy arrays to Pandas DataFrames
- Importing and inspecting a DataFrame
Exercises: Drills
Assets needed: Baseball records
Time: 45 minutes
Lesson 4: From Excel lookups and PivotTables to Pandas manipulation
Objective: Student can manipulate tabular data with Pandas
Description:
- Sorting, filtering, summarizing, renaming
- Merging
- Un-pivoting and re-shaping
- Exporting results
Exercises: Drills
Assets needed: Baseball records
Time: 50 minutes
Lesson 5: Data visualization with seaborn
Objective: Student can visualize univariate distributions
Description:
- Bar charts
- Line charts
- Histograms
- Custom plots & themes
Exercises: Drills
Assets needed: Baseball records
Time: 30 minutes
Lesson 6: From “That’s hard in Excel” to “That’s easy in Python!”
Objective: Student can conduct end-to-end data analysis project
Description:
- Append, transpose, summarize and visualize a set of csv files
Exercises: Drills
Assets needed: Retail sales dataset
Time: 30 minutes
Leave a Reply