I will be sharing a very interesting project with you all, where I will use Power BI to analyze the data of a Meal Delivery organization and answer a few of their business questions. While creating the dashboard, I will touch upon a few concepts named 'Calculated Column', 'Calculated Measures', etc. Wherever needed, I will mention some GCPs (Good Case Practices) in the process.
ABOUT DATA
The data used in this analysis has been taken from the internet and has no direct reference to any organization. It consists of three CSV files -
1. DEMAND DATA - It contains the historical data demand of various products by the centers.
2. MEAL LOOKUP - It contains the details about individual meals.
3. CENTER LOOKUP - It contains information about the various centers delivering the meals.
SOLUTION
The very first step before we jump into any analysis is cleaning the data and creating the schema. I won't be covering the data cleaning and schema generation steps here as it goes beyond the scope of this blog. One good thing about the data is that it is already in a normalized form, so we don't have to worry much to create a start schema out of it.
GCP - Always keep your dimension tables at top of the fact table. It gives you a visual interpretation of the flow of the filters (and looks nice too).
Q1. What is the total number of orders catered by the firm? Add a date filter in the dashboard.
Information related to orders can be found in the Demand Data table. I will create a Calculated Measure for this. I can use the SUM function of DAX. It comes out to be 120M.
Total Orders =
SUM(
'Demand Data'[num_orders]
)
GCP - Always format your Measures properly so that it is easier for others or even us to comprehend them later.
Q2. What is the revenue earned by the firm?
The revenue can be calculated by multiplying the checkout price with num of orders. Total Revenue can be calculated in two ways. Few people would like to create a Calculated Column, which will compute the revenue of each center-meal combinationfor each week. One more approach is by creating a calculated measure and using the magic of the SUMX function. SUMX will also compute row-wise revenue but won't store the output in any column. It comes out to be $31.92bn.
Total Revenue =
SUMX(
'Demand Data',
'Demand Data'[checkout_price] * 'Demand Data'[num_orders]
)
GCP - Always format your currency values and specify their decimal points.
Q3. What is the total discounted value? What is the overall discount percentage?
The total discounted value can be calculated by subtracting the base price from the checkout price and multiplying the whole with number of orders. I will create another calculated measure for this.
Total Discount =
SUMX(
'Demand Data',
('Demand Data'[base_price] - 'Demand Data'[checkout_price]) * 'Demand Data'[num_orders]
)
To calculate the overall discount percentage, I will create another measure. Total discounted percentage is calculated by dividing total discount with the total base price.
Total Discount % =
VAR Total_BasePrice =
SUMX(
'Demand Data',
'Demand Data'[base_price] * 'Demand Data'[num_orders]
)
VAR discount_pct =
DIVIDE(
[Total Discount],
Total_BasePrice
)
RETURN discount_pct
Q4. Is the business following the Pareto principle?
Please read about Pareto's Principle here. For this, I will be re-using the Total Revenue measure. I will need a measure to calculate the cumulative sales based on center IDs.
Cumulative Revenue (centre_id) =
VAR ctrrnk =
RANKX(
ALL(
'Center Lookup'[center_id]
),
[Total Revenue]
)
VAR RT =
CALCULATE(
[Total Revenue],
FILTER(
ALL(
'Center Lookup'[center_id]
),
ctrrnk >= RANKX(
ALL(
'Center Lookup'[center_id]
),
[Total Revenue]
)
)
RETURN RT
I will create another measure that contains the sum of total revenue irrespective of any filters being applied.
(ALL) Total Revenue =
CALCULATE(
[Total Revenue],
ALL()
)
Now, let's create a measure that will contain the cumulative percentage of revenue by each center.
CP of Revenue (by center) =
DIVIDE(
[Cummulative Revenue (centre_id)],
[(ALL) Total Revenue]
)
This is how it looks when I plot it in a matrix.
I will use a line and clustered column chart to visualize this one. I will take the center IDs on the x-axis, total revenue on the y-axis (column), and cumulative percentage on another y-axis (line). I will also add an 80% reference line on the secondary y-axis.
Once we plot, we can clearly see that there are more than 20% of the centers generate 80% of the revenue. Hence, the business is not following Pareto's Principle.
Q5. What are the top 5 selling categories?
I will reuse the 'total orders' measure in this. I will use the funnel chart to plot the total orders based on categories. I will apply a page filter to select only the top 5 categories.
Q6. Does the operational area of the store influence number of orders?
Since both the parameters are continuous values. I will use a scatter plot here. I will keep the number of orders on the y-axis and the operation area on the x-axis. I will also add a trend line to visualize the trend.
It is quite evident from the trend line that with the increase in the operational area, the number of orders increases.
Q7. Are Emails & product features on the Homepage an effective mode of promotion?
We don't have information about the cost of the promotion, so assuming it to be negligible. We need to find a metric which can show the impact of promotions.
I will find the percentage of weekly center-meal combinations which have been promoted by email and the corresponding percentage of revenue generated by it. I can use a tornado chart to vizualise this.
Here I can see that 8% of email promotions are generating 20% of the revenue, which indicates that email is an effective mode.
We can do the same for homepage features.
Q8. What will be the next 5 weeks forecasted weekly quantity sold for each center, city, and meal category?
I can plot the weeks and quantity sold in a line chart. I will also use the forecasting ability of Power BI to forecast the total orders for the future 5 weeks.
Hope it was informative. Feel free to leave your comments. ๐
Comments
Post a Comment