Tableau's Superstore Dataset Exploratory Data Analysis (EDA)

This is an example of how I approach EDA when looking at a new set of data. In this exercise, I explore Tableau's Superstore dataset to prepare for building data visualizations.

I started by importing the data into a Microsoft SQL Server instance on my home computer. This lets me simulate a real-world situation where data must be pulled from a SQL database or warehouse such as Snowflake or Teradata. From there, I used the code below to understand the structure of the data and how it might be leveraged.

Note: in this exercise, I do not have a data dictionary, so I must make a few assumptions. In the real world, I would likely have access to a dictionary and/or a data engineer whom I could ask for help in understanding the schema.

Load Dependencies and Data


Dependencies and connection to SQL
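
As a minimal sketch of what this connection step might look like, the snippet below builds a SQLAlchemy-style connection URL for a local SQL Server instance using only the standard library. The server name, database name, ODBC driver version, and the use of Windows authentication are all assumptions for illustration, not details taken from the original setup.

```python
from urllib.parse import quote_plus


def build_connection_url(server, database, driver="ODBC Driver 17 for SQL Server"):
    """Build a mssql+pyodbc URL from a raw ODBC connection string.

    The default driver name and Trusted_Connection (Windows auth) are
    assumptions; adjust them to match the local installation.
    """
    odbc = (
        f"DRIVER={{{driver}}};"
        f"SERVER={server};"
        f"DATABASE={database};"
        "Trusted_Connection=yes;"
    )
    # The ODBC string must be URL-encoded before being passed through.
    return "mssql+pyodbc:///?odbc_connect=" + quote_plus(odbc)


# Hypothetical server and database names.
url = build_connection_url("localhost", "Superstore")
print(url)

# With sqlalchemy and pyodbc installed, an engine could then be created with:
#   engine = sqlalchemy.create_engine(url)
```

Keeping the URL construction in its own function makes it easy to swap in a different server or driver without touching the rest of the notebook.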

Load data from SQL

I added a DATEFROMPARTS call to the SQL query so that dates are truncated in SQL, since I find date truncation in Python cumbersome.
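
A query along these lines is one way the load step could be written. This is a sketch under assumptions: the table name `dbo.Superstore`, the `order_month` alias, and truncating `Order Date` to the first of the month are all hypothetical choices, and `load_superstore` is an illustrative helper rather than the original code.

```python
import pandas as pd

# Assumed table name (dbo.Superstore) and alias (order_month).
# DATEFROMPARTS(year, month, day) is a T-SQL function; here it rebuilds
# each order date as the first day of its month.
QUERY = """
SELECT
    *,
    DATEFROMPARTS(YEAR([Order Date]), MONTH([Order Date]), 1) AS order_month
FROM dbo.Superstore
"""


def load_superstore(connection):
    """Run the query and return the result as a DataFrame.

    `connection` is expected to be a SQLAlchemy engine or connection
    pointing at the SQL Server instance.
    """
    return pd.read_sql(QUERY, connection)
```

Doing the truncation in the query means the DataFrame arrives with a ready-made grouping column, so no extra date handling is needed on the Python side.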

Exploratory Data Analysis


Describe Columns

For each numeric column, we can see the quartiles, mean, standard deviation, and other summary statistics.
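
To illustrate the kind of summary described here, the sketch below calls pandas' `describe()` on a tiny stand-in DataFrame. The four rows are made-up placeholder values, not Superstore data; only the column names (`Sales`, `Quantity`) come from the dataset.

```python
import pandas as pd

# Placeholder rows for illustration only; these are NOT real Superstore values.
df = pd.DataFrame(
    {
        "Sales": [10.0, 20.0, 30.0, 40.0],
        "Quantity": [1, 2, 3, 4],
    }
)

# describe() summarizes numeric columns: count, mean, std, min,
# the 25%/50%/75% quartiles, and max.
summary = df.describe()
print(summary)
```

On the real data, the same call would be made on the DataFrame loaded from SQL, and passing `include="all"` would add basic statistics for the text columns as well.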