Final Course Project

Community Detection for Customer Segmentation

Published

December 10, 2025

Project Overview

This final project requires you to design a comprehensive analytical approach to solve a real-world business problem: customer segmentation for a music streaming platform. You will develop a research design document that addresses how Elena Martinez, Chief Data Scientist at a European music streaming platform, should approach a network-based customer segmentation strategy.

Note: Key Focus on Design, Not Implementation

This project emphasizes analytical design and strategic thinking rather than coding and model implementation. You are creating a blueprint for analysis, not conducting the full analysis itself.

Background

Traditional demographic segmentation (age, location, subscription tier) has proven inadequate for music streaming platforms. Music preferences transcend demographic boundaries—a 45-year-old executive might share the same taste in electronic music as a 22-year-old student. The platform needs a better way to segment 6 million active listeners based on their genre preferences and social connections.

The Business Challenge

Elena Martinez’s team at a major European music streaming platform faces a critical decision: should they move from demographic segmentation to a genre-based, network-driven approach? If so, how should they implement it?

Elena’s hypothesis is that genres cluster naturally based on shared audiences and social connections. The question is how to identify these clusters and use them for customer segmentation.

Your Task

You will develop a research design document that proposes how to analyze the Deezer platform data and address Elena’s challenge. Your design should articulate:

  1. The analytical approach to understanding genre preferences and social connections
  2. The methodological rationale for your proposed methods
  3. A comparison framework for evaluating different segmentation approaches
  4. Strategic recommendations for business implementation

Data

You have been provided with real data from 54,573 Deezer users in Croatia:

  • Genre preferences (HR_genres.json): User listening habits across 84 music genres
  • Social network (HR_edges.csv): 498,202 friendship connections between users

These two datasets can help you understand users’ preferences and interaction patterns. Consider using network analysis and conventional exploratory data analysis (EDA) to ground the premises of your recommendations.
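As a starting point for exploration, the sketch below computes a few descriptive statistics. It is a minimal illustration under stated assumptions, not a prescribed workflow: it assumes HR_edges.csv has a header row followed by two columns of user IDs, and that HR_genres.json maps each user ID to a list of genre names. The function names and the toy values are hypothetical; check the actual file layout before relying on it.

```python
import csv
import io
import json
from collections import Counter


def read_edges(handle):
    """Read friendship pairs from a CSV file object (assumed: header row, two ID columns)."""
    reader = csv.reader(handle)
    next(reader, None)  # skip the assumed header row
    return [(int(row[0]), int(row[1])) for row in reader if len(row) >= 2]


def read_genres(handle):
    """Read genre lists from a JSON file object (assumed layout: {user_id: [genre, ...]})."""
    return {int(user): list(liked) for user, liked in json.load(handle).items()}


def summarize(edges, genres):
    """Return basic descriptive statistics for the friendship network and genre lists."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    genre_counts = Counter(g for liked in genres.values() for g in liked)
    n_users = len(set(degree) | set(genres))
    return {
        "users": n_users,
        "edges": len(edges),
        "mean_degree": 2 * len(edges) / n_users if n_users else 0.0,
        "max_degree": max(degree.values(), default=0),
        "distinct_genres": len(genre_counts),
        "top_genres": genre_counts.most_common(3),
    }


# Toy stand-ins for the two files (invented values, not real Deezer data).
toy_edges = io.StringIO("node_1,node_2\n0,1\n0,2\n1,2\n2,3\n")
toy_genres = io.StringIO(
    '{"0": ["Pop", "Rock"], "1": ["Pop"], "2": ["Pop", "Jazz"], "3": ["Jazz"]}'
)

summary = summarize(read_edges(toy_edges), read_genres(toy_genres))
print(summary)
```

With the real files, the same functions could be called on handles opened with `open("HR_edges.csv", newline="")` and `open("HR_genres.json")`, provided the assumed layouts hold.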

Tip: Data Location

Required Document Components

Your research design document should address the following:

1. Problem Framing and Context

  • Explain the rationale for a network-based approach
  • Define the scope and objectives of your proposed analysis

2. Data Representation

  • Describe your approach to representing the available data
  • Justify your representational choices
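To make the idea of a representational choice concrete, here is a minimal sketch of one option among several: treating users and genres as a two-mode (bipartite) structure and projecting it onto a genre-genre network. The function names and toy data are hypothetical, and the choice of raw co-occurrence counts versus Jaccard-normalized weights is exactly the kind of decision your document should justify rather than take from this example.

```python
from collections import Counter
from itertools import combinations


def genre_cooccurrence(user_genres):
    """Project user-genre memberships onto a weighted genre-genre network.

    The weight of the pair (a, b) is the number of users who list both genres.
    """
    weights = Counter()
    for liked in user_genres.values():
        # sorted(set(...)) removes duplicates and gives each pair one canonical order
        for a, b in combinations(sorted(set(liked)), 2):
            weights[(a, b)] += 1
    return weights


def jaccard(user_genres, a, b):
    """Share of users listing either genre who list both (0.0 if neither occurs)."""
    fans_a = {u for u, liked in user_genres.items() if a in liked}
    fans_b = {u for u, liked in user_genres.items() if b in liked}
    union = fans_a | fans_b
    return len(fans_a & fans_b) / len(union) if union else 0.0


# Toy user-genre data (invented values, not real Deezer data).
toy = {0: ["Pop", "Rock"], 1: ["Pop", "Rock", "Jazz"], 2: ["Jazz"], 3: ["Pop", "Jazz"]}

cooc = genre_cooccurrence(toy)
print(dict(cooc))
print(jaccard(toy, "Pop", "Rock"))
```

Raw counts favor popular genres, while a normalized measure such as Jaccard adjusts for audience size; the user-user friendship network is a separate representation with its own trade-offs.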

3. Methodological Design

  • Propose methods for measuring relationships between genres and/or users
  • Propose strategies for identifying customer segments
  • Explain how you would evaluate the quality of your segmentation
  • Justify your methodological choices
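One common way to evaluate a candidate segmentation of a network is modularity, which compares the share of edges falling inside segments with the share expected if edges were placed at random given node degrees. The sketch below is illustrative only: it assumes an undirected, unweighted edge list without self-loops or duplicate edges, the function name and toy graph are hypothetical, and modularity is only one of several quality measures you might propose.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman modularity Q of a partition of an undirected, unweighted graph.

    edges: list of (u, v) pairs; community_of: dict mapping node -> community label.
    """
    m = len(edges)
    if m == 0:
        return 0.0
    internal = defaultdict(int)    # edges with both endpoints in the community
    degree_sum = defaultdict(int)  # total degree of the community's nodes
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum)


# Toy graph: two triangles joined by a single bridge edge (invented example).
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
split = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
single = {node: "A" for node in range(6)}

q_split = modularity(toy_edges, split)
q_single = modularity(toy_edges, single)
print(q_split, q_single)
```

On this toy graph the two-triangle split scores higher than lumping every node into one segment, which is the intuition behind using modularity to compare candidate partitions. The partitions themselves would come from whichever community detection method you propose and justify.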

4. Comparative Analysis

  • Compare different analytical approaches you could take
  • Discuss trade-offs, data requirements, and expected outcomes
  • Recommend which approach is most suitable and why

5. Business Strategy and Implementation

  • Translate your technical approach into business value
  • Propose validation strategies to assess impact
  • Address practical implementation considerations

Deliverable

Warning: Single Document Submission

You must submit one discursive research design document that addresses all case questions.

Document Specifications

  • Format: PDF
  • Length: Approximately 3,000 words (±10% acceptable)
  • Structure: Essay format with clear sections corresponding to the required components
  • Style: Professional business-academic writing
  • References: Cite course materials, academic literature, and the teaching case
  • Figures/Tables: Include conceptual diagrams where helpful (optional; these do not count toward the word limit)

Distribution of Words Across Document Components

Your document should flow as a cohesive narrative, but should address:

  1. Problem Framing and Context (200-300 words)
  2. Data Representation (400-500 words)
  3. Methodological Design (1,200-1,500 words)
  4. Comparative Analysis (600-800 words)
  5. Business Strategy and Implementation (400-600 words)
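
The section ranges above can be checked against the overall length limit with a few lines of arithmetic. This is a small illustrative sketch; the variable names are hypothetical and the figures are taken directly from the specifications in this brief.

```python
# Overall target and tolerance from the Document Specifications.
target_words = 3000
tolerance_pct = 10

# Per-section ranges from the list above.
section_ranges = {
    "Problem Framing and Context": (200, 300),
    "Data Representation": (400, 500),
    "Methodological Design": (1200, 1500),
    "Comparative Analysis": (600, 800),
    "Business Strategy and Implementation": (400, 600),
}

low_total = sum(low for low, _ in section_ranges.values())
high_total = sum(high for _, high in section_ranges.values())
allowed_min = target_words * (100 - tolerance_pct) // 100
allowed_max = target_words * (100 + tolerance_pct) // 100

print(f"Section minimums sum to {low_total}, maximums to {high_total}")
print(f"Allowed total length: {allowed_min} to {allowed_max} words")
```

The section minimums fit within the allowed total, but the maximums together exceed it, so not every section can be written at the top of its range.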

Note on Implementation

You are not required to write code or implement the analysis. This is a design document that proposes how the analysis should be conducted. Focus on demonstrating your understanding of network analysis concepts and their business application.

That said, familiarizing yourself with the data through EDA is recommended.

Evaluation Criteria

Your submission and individual presentation will be evaluated based on:

Criterion | Weight | Description
Conceptual Understanding | 35% | Depth of understanding of network analysis concepts; appropriateness of proposed methods; theoretical justification
Analytical Design | 30% | Quality of research design; comparison of multiple approaches; understanding of trade-offs and limitations
Business Insight | 25% | Translation of technical concepts into business value; strategic thinking; practical implementation considerations
Communication | 10% | Clarity and coherence of writing; logical structure; professional presentation

Writing Guidance

Tips for Design Documents

  1. Be specific: Identify and name the specific methods you propose
  2. Justify your choices: Explain why your chosen methods are appropriate for this context
  3. Consider trade-offs: Discuss advantages and limitations of different approaches
  4. Link technical to business: Always connect analytical choices to business objectives
  5. Be realistic: Acknowledge data quality, computational, and organizational constraints
  6. Structure logically: Each section should flow naturally into the next
  7. Write professionally: Address your audience (Elena and the advisory board)

Resources

  • Course materials: Weeks 1-8 lectures and readings from SMM638

Submission Details

Warning: Submission Requirements

  • Due date: November 28, 2025 (15:59)
  • Format: Single PDF document
  • Filename: SMM638_FinalProject_[YourName].pdf
  • Submission method: Via Moodle

Academic Integrity

  • This is an individual project
  • You may discuss high-level approaches with classmates, but your written document must be your own work
  • Properly cite all sources, including course materials, academic papers, and the teaching case
  • Use of AI tools (e.g., ChatGPT) for writing assistance must be disclosed in your document
  • Plagiarism or inappropriate collaboration will result in academic penalties

Tips for Success

  1. Read the teaching case carefully: Understand Elena’s challenge and the business context
  2. Review course materials: Draw on relevant lectures and concepts from the course
  3. Be specific: Name specific methods and explain your reasoning
  4. Think strategically: This is a business decision, not just a technical exercise
  5. Consider constraints: Address real-world implementation challenges
  6. Write clearly: Use logical structure and clear transitions
  7. Proofread: Check for errors before submitting
  8. Stay within word count: Conciseness is a professional skill

Questions? Contact the instructor or post in the course discussion forum.

Good luck! This project is your opportunity to demonstrate mastery of network analysis methods while solving a realistic business problem.