Case Study: Maven Market

Scenario

As a data analyst for the fictional Maven Market, my goal was to create an interactive Power BI dashboard that highlights sales and profitability trends for December 1998.

The dashboard provides stakeholders with actionable insights into store and product performance across various dimensions, including geography, sales volume, and profit margins.

This project encompassed all phases of the data analysis lifecycle: data cleaning, transformation, modeling, and visualization.

Summary

This interactive dashboard was designed to address key business questions that stakeholders might typically pose when analyzing sales performance.

Key Performance Indicators (KPIs):

Revenue and Profit Trends

The dashboard facilitates the dynamic analysis of revenue and profit trends across various dimensions:

Geographic Level: Enables comparison of performance across countries and individual stores.

Product Level: Allows for in-depth analysis of brand performance, identifying top and bottom performers.

Profitability Analysis

Store-Level Profitability: Identifies stores with the highest and lowest profit margins, offering actionable insights for optimization strategies such as renovations or closures.
- Example: In December 1998, the Hidalgo store in Zacatecas, Mexico, stood out with a remarkable profit of $6,401, reflecting strong operational efficiency and customer demand. This performance highlights its potential as a model for other locations to emulate through tailored strategies such as targeted marketing or inventory alignment.

A list of top stores by the current month’s profit shows Hidalgo as the clear winner.

In contrast, the Walla Walla store in Washington State struggled, earning just $241 in profit and experiencing a significant 30.11% decline from the prior month. This underperformance signals potential challenges, such as insufficient local demand or operational inefficiencies, warranting further investigation and potential corrective actions like staff training, promotional campaigns, or inventory adjustments.

Walla Walla performs worst both in terms of profit and month-over-month change.

Product Brand Performance

Sales Volume: Identifies high-selling brands (e.g., Hermanos with 5,342 transactions) to inform inventory management and promotional strategies.
Profitability and Return Rates: Evaluates the profitability and customer satisfaction of individual brands.

Example: The Hermanos brand achieved high sales volume with 5,342 transactions; however, further analysis revealed a moderate profit margin of 58.64% and a relatively high return rate of 0.95%. This indicates potential customer satisfaction issues or inefficiencies in pricing strategy, suggesting a need for quality control assessments or price optimization to maximize profitability.

Example: The King brand was a significant underperformer, with just 98 transactions and a low profit margin of 52.53%. This combination suggests a need for a comprehensive review of its product lineup, marketing efforts, and pricing strategy to address low customer demand and profitability.

Example: Several brands with high return rates, such as Genteel, Dollar, and Framton, also demonstrated poor sales and profit margins, indicating potential quality or customer satisfaction challenges.

Profit Margin vs. Sales Volume: Highlights the importance of considering both factors.

ADJ demonstrated the highest profit margin at 68.84%, but its low sales volume of 140 transactions and elevated return rate of 1.32% suggest untapped market potential. Targeted marketing campaigns or product enhancements could help drive sales while maintaining profitability.

By providing these insights, this dashboard empowers stakeholders to make informed decisions regarding:

Inventory Management: Optimize stock levels based on product demand and profitability.
Marketing Campaigns: Target promotions and marketing efforts towards high-potential stores and product brands.
Store Operations: Identify underperforming stores and implement necessary improvements (e.g., renovations, staff training).
Product Development: Analyze customer preferences and identify opportunities for new product introductions or improvements to existing products.

Creating the report

1. Connecting to the data

The project began with loading datasets into Power BI and standardizing table names to ensure clarity and consistency. Next, I validated data types for each column.

IDs were formatted as integers.
Phone numbers and zip codes were formatted as text strings.
Dollar amounts were assigned the decimal data type.

I then enriched the datasets by creating several new columns:

In the Customers table:

"Full Name" was created by concatenating the "First Name" and "Last Name" columns.
"Birth Year" was extracted from the "Birthdate" column.
A "Has Children" flag was created, assigning "Y" if the "Total Children" value exceeded zero, and "N" otherwise.

In the Products table:

"Discount Price" was calculated as 90% of the "Retail Price."
"Avg Retail Price" was calculated for each "Product Brand" using the "Group By" function.
Null values in the "Recyclable" and "Low-Fat" columns were replaced with "0."

In the Stores table:

"Full Address" was created by concatenating the "Store City," "Store State," and "Store Country" fields.
"Area Code" was extracted from the "Store Phone" field.

I then connected to the calendar table and added fields for "Start of Week," "Name of Day," "Start of Month," "Name of Month," "Quarter of Year," and "Year."

I proceeded to connect to the "Regions" and "Returns" tables, ensuring all columns were formatted with the correct data types.

Finally, I merged transaction data from two separate files: "MavenMarket_Transactions_1997" and "MavenMarket_Transactions_1998." I connected Power BI directly to the folder containing these files and successfully merged them into a single table titled "Transaction_Data."

In a later phase of building the dashboard, I identified several quality issues in the geographical data that required attention:

The city of Camacho had an incomplete name, resulting in its mislocation on the map. The correct name, "Estación Camacho," was applied.
Similarly, the city of San Andres was incorrectly mapped due to an incomplete name. The correct name, "San Andres Mixquic," was subsequently applied.
Mexico City was erroneously mapped to Brazil due to an incorrect state designation ("DF" instead of "Mexico City").
Inconsistent state abbreviations were observed (e.g., "WA" and "Yucatan"). To ensure uniformity, all state names were standardized to their full spellings.
A location in Salem, OR, was incorrectly mapped to Salem, MA. To address this, a location data hierarchy was established within Power BI, prioritizing country and state names before city names.

2. Creating the Data model

A well-defined data model is crucial in Power BI, as it forms the foundation for accurate and insightful analysis. By establishing clear relationships between tables, the data model ensures data integrity and facilitates efficient query performance.

I began by visually arranging the tables on the canvas, strategically positioning lookup tables above the data tables, and then connecting to them based on their primary keys.

The "Transaction_Data" table was connected to:

"Customers" table, using "customer_id."
"Products" table, using "product_id."
"Calendar" table, using "date."
"Stores" table, using "store_id."

The "Return_Data" table was connected to:

"Products" table using "product_id."
"Calendar" table using "date."

The "Stores" and "Regions" tables were connected in a snowflake schema.

Foreign keys in the data tables were hidden to improve visual clarity and reduce potential confusion. I then verified that all table relationships adhered to many-to-one cardinality. Additionally, I ensured one-way filters were applied to all relationships. I then confirmed that data tables were not directly connected to each other but rather linked through shared lookup tables.

Finally, I performed some final column formatting:

All date formats were standardized to m/d/yyyy.
Currency columns were appropriately formatted.
Geographical information, including country, state, and street addresses, was formatted consistently.

3. Adding DAX measures

To enrich data analysis, I developed new calculated columns and DAX measures.

Calculated Columns:

Calendar Table:
- Weekend (Identifies if a particular date falls on a weekend):
```
Weekend = IF(WEEKDAY('Calendar'[date], 2) < 6, "N", "Y")
```
- End of Month (Specifies the last day of the month a particular date is in):
```
End of Month = ENDOFMONTH('Calendar'[date])
```

Customers Table:

Current Age (returns a customer’s current age in years as of today):
```
Current Age = DATEDIFF(Customers[birthdate], TODAY(), YEAR) 
```

Priority (identifies customers as high priority if they are homeowners and own a Golden membership card, lists other customers as standard priority):

Priority =

IF(

    AND(

        Customers[homeowner] = "Y",

        Customers[member_card] = "Golden"

),

    "High",

    "Standard"

)

Short_Country (returns a three letter abbreviation of the customer’s home country):
```
Short_Country = UPPER(LEFT(Customers[customer_country], 3))
```

House Number (returns just the house number of a customer’s street address):

House Number =

LEFT(

    Customers[customer_address],

    SEARCH(" ", Customers[customer_address], 1, 0)

Products Table:
- Price_Tier (sorts products into “high,” “mid,” and “low” price brackets based on whether they are above $3, below $1, or in between):
```
Price_Tier = 
SWITCH(
    TRUE,
    Products[product_retail_price] > 3, "High",
    Products[product_retail_price] > 1, "Mid",
    "Low"
)
```

Stores Table:
- Years_Since_Remodel (determines how many years have passed since a store's last remodel, helping prioritize renovation schedules based on age and performance):
```
Years_Since_Remodel = DATEDIFF(Stores[last_remodel_date], TODAY(), YEAR)
```

DAX Measures:

Key performance indicators:
- Quantity Sold (calculates the total quantity of products sold across all transactions. It provides a foundational metric for evaluating overall sales performance):
```
Quantity Sold = SUM(Transaction_Data[quantity])
```
- Quantity Returned:
```
Quantity Returned = SUM(Return_Data[quantity])
```
- Total Transactions:
```
Total Transactions = COUNTROWS(Transaction_Data)
```
- Total Returns:
```
Total Returns = COUNTROWS(Return_Data)
```
- Return Rate (calculates the percentage of products returned relative to the total quantity sold, helping identify potential quality or satisfaction issues with specific products or categories):
```
Return Rate = [Quantity Returned]/[Quantity Sold]
```
- Weekend Transactions (calculates the number of transactions that occurred on weekends, providing insights into customer shopping behaviors and peak sales periods):
```
Weekend Transactions = 
```
```
CALCULATE([Total Transactions], 'Calendar'[Weekend] = "Y")
```
- % Weekend Transactions:
```
% Weekend Transactions = [Weekend Transactions]/[Total Transactions]
```
- All Transactions (like “Total Transactions” but is not affected by filter context):
```
All Transactions = COUNTROWS(ALL(Transaction_Data))
```
- All Returns (like “Total Returns” but is not affected by filter context):
```
All Returns = COUNTROWS(ALL(Return_Data))
```

Financial Metrics:

Total Revenue:

Total Revenue =

SUMX(

    Transaction_Data,

    Transaction_Data[quantity] * RELATED(Products[product_retail_price])

Total Cost:

Total Cost =

SUMX(

    Transaction_Data,

    Transaction_Data[quantity] * RELATED(Products[product_cost])

Total Profit:

Total Profit = [Total Revenue] - [Total Cost]

Profit Margin (determines the profitability of sales by expressing profit as a percentage of revenue, enabling comparison across products, stores, and time periods):
```
Profit Margin = [Total Profit] / [Total Revenue]
```

Product Analysis:

Unique Products:

Unique Products = DISTINCTCOUNT(Products[product_name])

Time-series Analysis:

YTD Revenue:

YTD Revenue = TOTALYTD([Total Revenue], 'Calendar'[date])

60-Day Revenue:

60-Day Revenue =

CALCULATE(

    [Total Revenue],

    DATESINPERIOD(

        'Calendar'[date],

        MAX('Calendar'[date]),

        -60,

DAY

Month-Over-Month Analysis:

Last Month Transactions:

Last Month Transactions =

CALCULATE(

    [Total Transactions],

    DATEADD('Calendar'[date], -1, MONTH)

Last Month Revenue:

Last Month Revenue =

CALCULATE(

    [Total Revenue],

    DATEADD('Calendar'[date], -1, MONTH)

Last Month Profit:

Last Month Profit =

CALCULATE(

    [Total Profit],

    DATEADD('Calendar'[date], -1, MONTH)

Last Month Returns:

Last Month Returns =

CALCULATE(

    [Total Returns],

    DATEADD('Calendar'[date], -1, MONTH)

Target Setting:
- Revenue Target (sets a revenue target for the current month by applying a 5% growth rate to the revenue from the previous month, helping track performance against goals):
```
Revenue Target = [Last Month Revenue] * 1.05
```

4. Building the dashboard

This section outlines the development of an interactive dashboard to effectively visualize and analyze the Maven Market data.

Dashboard Design & Layout:

The default report tab was named "Topline Performance" and branded with the Maven Market logo in the top left corner.
A matrix displayed key performance indicators (KPIs) for each product brand, such as total transactions, profit, profit margin, and return rate. This format allowed stakeholders to compare brands side by side, identify high-performing brands, and pinpoint those needing further attention due to low margins or high return rates.

Conditional formatting was applied to enhance readability:

Profit: Data bars for visual comparison.
Profit margin: White-to-green gradient, with deeper green indicating higher margins.
Return rate: White-to-red gradient, with deeper red indicating higher return rates.

Key Performance Indicators (KPIs):

KPI cards were created for key metrics:

Transactions: Current month's transactions with a trend line and target based on the previous month.
Profit: Current month's profit with a trend line and target, color-coded green for positive performance.
Returns: Current month's returns with a trend line and target, color-coded red for negative performance.

Geographical Analysis:

A map visualized total transactions by store city, enabling stakeholders to quickly identify geographic patterns and high-performing regions. The inclusion of a country-level slicer allowed for interactive zoom, providing detailed insights into city-level performance within specific countries.
A treemap offered a hierarchical visualization of total transactions, grouped by store country. This layout enabled stakeholders to quickly assess which countries contributed the most to overall sales and to drill down into specific regions for further analysis.
A matrix displayed store-level profit and month-over-month profit change for detailed performance comparisons.

Time-Series Analysis:

A column chart visualized weekly revenue trends for 1998, enabling stakeholders to identify seasonal fluctuations and assess the impact of sales initiatives over time.
A gauge chart visualized total revenue against the target revenue for the latest start of month.

Interactive Elements:

Bookmarks were created for specific insights:

Portland store reaching 1000 sales in December.
Performance of Horatio brand products.
Plato brand as the highest profit margin contributor.

Buttons were incorporated for seamless navigation between bookmarked views,

Conclusion

This case study demonstrates the successful application of Power BI to analyze sales data for Maven Market. Through comprehensive data cleaning, modeling, and visualization, I developed an interactive dashboard that empowers stakeholders with actionable insights into key business metrics.

The analysis uncovered critical trends, such as exceptional store performance in specific regions and underperforming product brands, providing clear opportunities for targeted interventions. This project reinforced the importance of data quality and thoughtful visualization in communicating complex insights effectively.