
Megaline Statistical Analysis

A statistical analysis comparing two prepaid mobile plans (Surf and Ultimate) for the telecom operator Megaline, to determine which plan generates more revenue and to guide advertising budget allocation.

Python, Pandas, NumPy, SciPy, Statistics

Project Overview

Business Context: Determining which mobile plan generates more revenue to guide marketing investment decisions.

This project analyzes customer behavior and revenue patterns from 500 Megaline clients during 2018. The analysis examines usage patterns (calls, messages, and internet data) across two prepaid plans to provide data-driven insights for the commercial department.

Hypothesis tests are used to check whether the observed differences are statistically significant before they inform any recommendation.

What This Demonstrates

Learning Challenge

  • Learning statistical hypothesis testing, a new concept at the time
  • Understanding telecommunications metrics (calls, messages, data usage)
  • Learning probability distributions and significance testing

Problem-Solving Process

  1. Domain Research: Studied telecom business models to understand what metrics matter
  2. Statistical Planning: Used GitHub Copilot to identify which statistical tests were appropriate
  3. Systematic Analysis: Analyzed calls, messages, and internet usage across five datasets
  4. Business Translation: Converted statistical findings (p-values, confidence intervals) into an answer to the business question "Should we invest more in Surf or Ultimate?"

Professional Outcome

  • Provided a clear, data-driven recommendation that executives could use for budget allocation
  • Demonstrated ability to apply academic concepts (statistics) to practical business decisions
  • Created reproducible analysis that could be updated with new data

Tools Utilized

  • VS Code with GitHub Copilot for development
  • Jupyter Notebook for interactive analysis
  • Git/GitHub for version control

Key Questions Addressed

Revenue Comparison

Which prepaid plan (Surf or Ultimate) generates more revenue for the company?

Usage Patterns

How do customer usage patterns differ between the two plans in terms of calls, messages, and data?

Geographic Analysis

Are there significant revenue differences between geographic regions?

Marketing Strategy

What data-driven recommendations can guide advertising budget allocation?

Dataset

The analysis uses five interconnected datasets:

  • megaline_plans.csv - Plan specifications and pricing structures
  • megaline_users.csv - Customer demographic information
  • megaline_calls.csv - Call duration records
  • megaline_messages.csv - SMS message counts
  • megaline_internet.csv - Mobile data usage records

Analysis Workflow

  1. Data Loading & Exploration - Import and examine datasets from multiple sources
  2. Data Cleaning - Handle missing values, correct data types, and prepare data for analysis
  3. Feature Engineering - Calculate monthly usage metrics and revenue per customer
  4. Exploratory Analysis - Analyze customer behavior patterns and usage distributions
  5. Statistical Testing - Hypothesis testing to validate findings with statistical rigor
  6. Conclusions - Data-driven recommendations for business strategy
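
As a rough illustration of the feature engineering step, the sketch below aggregates call records to one row per user per month and derives revenue as the plan fee plus overage charges. The column names (`user_id`, `call_date`, `duration`, `monthly_fee`, `minutes_included`, `usd_per_minute`) and all plan values are assumptions made for this example, not the project's actual schema or pricing; messages and data usage would presumably follow the same pattern.

```python
import pandas as pd

# Hypothetical plan terms, for illustration only.
# In the project these would come from megaline_plans.csv.
plans = pd.DataFrame({
    "plan": ["surf", "ultimate"],
    "monthly_fee": [20.0, 70.0],
    "minutes_included": [500, 3000],
    "usd_per_minute": [0.03, 0.01],
})

# Hypothetical user and call records (one row per call).
users = pd.DataFrame({"user_id": [1, 2], "plan": ["surf", "ultimate"]})
calls = pd.DataFrame({
    "user_id": [1, 1, 1, 2],
    "call_date": pd.to_datetime(
        ["2018-01-05", "2018-01-20", "2018-02-03", "2018-01-11"]
    ),
    "duration": [300.0, 250.0, 100.0, 40.0],  # minutes
})

# Aggregate to one row per user per calendar month.
monthly = (
    calls.assign(month=calls["call_date"].dt.to_period("M"))
    .groupby(["user_id", "month"], as_index=False)["duration"]
    .sum()
    .rename(columns={"duration": "minutes"})
)

# Attach each user's plan and that plan's terms.
monthly = monthly.merge(users, on="user_id", how="left")
monthly = monthly.merge(plans, on="plan", how="left")

# Revenue = flat monthly fee + charge for minutes beyond the allowance.
overage_minutes = (monthly["minutes"] - monthly["minutes_included"]).clip(lower=0)
monthly["revenue"] = (
    monthly["monthly_fee"] + overage_minutes * monthly["usd_per_minute"]
)

print(monthly[["user_id", "month", "plan", "minutes", "revenue"]])
```

Clipping the overage at zero keeps users who stay within their allowance at the flat fee rather than producing a negative charge.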

Statistical Methodology

Conclusions rest on hypothesis tests set up as follows:

  • Test Type: Independent two-sample t-tests comparing mean revenue between groups
  • Significance Level (α): 0.05 (95% confidence level)
  • Null Hypothesis (H₀): Mean revenue is equal in the compared groups
  • Alternative Hypothesis (H₁): Mean revenue differs between the groups
  • Decision Rule: Reject H₀ when the p-value is below α; otherwise, fail to reject it
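
A minimal sketch of such a test with SciPy is shown below. The two revenue samples are synthetic, generated only to make the example runnable, and do not reflect the project's data or results. Passing `equal_var=False` selects Welch's variant of the t-test, which is an assumption here; the notebook may have used the pooled-variance form instead.

```python
import numpy as np
from scipy import stats

# Synthetic monthly revenue samples, for illustration only.
rng = np.random.default_rng(seed=0)
surf_revenue = rng.normal(loc=60.0, scale=20.0, size=200)
ultimate_revenue = rng.normal(loc=70.0, scale=20.0, size=200)

alpha = 0.05  # significance level

# Two-sided independent-samples t-test.
# equal_var=False (Welch's t-test) does not assume equal group variances.
result = stats.ttest_ind(surf_revenue, ultimate_revenue, equal_var=False)
t_stat, p_value = result.statistic, result.pvalue

# Decision rule: reject H0 (equal mean revenue) when p < alpha.
reject_null = bool(p_value < alpha)

print(f"t = {t_stat:.3f}, p = {p_value:.4g}")
print("Reject H0" if reject_null else "Fail to reject H0")
```

Failing to reject H₀ does not show that the means are equal; it only means the sample does not provide enough evidence of a difference at the chosen α.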

Key Findings

The analysis finds a statistically significant difference in revenue between the Surf and Ultimate plans, which informs the marketing recommendations. The findings cover:

  • Revenue comparison between Surf and Ultimate plan subscribers
  • Usage pattern differences across plan types
  • Regional revenue variations and their statistical significance
  • Customer behavior segmentation by plan type
  • Recommendations for targeted marketing and resource allocation

Note: Detailed statistical results and visualizations are available in the complete Jupyter notebook.

Skills Demonstrated

  • Statistical hypothesis testing (t-tests)
  • Multi-table data integration and joins
  • Feature engineering for revenue calculation
  • Data cleaning and preprocessing
  • Exploratory data analysis (EDA)
  • Data visualization for statistical insights
  • Business intelligence and strategic recommendations
  • P-value interpretation and statistical significance assessment

Technologies Used

Python 3.x, Pandas, NumPy, Matplotlib, SciPy, Jupyter Notebook

Business Impact

This analysis provides data-driven insights for:

  • Advertising Budget Allocation: Optimize marketing spend based on revenue-generating plan performance
  • Plan Optimization: Adjust pricing and features based on usage patterns
  • Regional Strategy: Tailor marketing approaches to geographic revenue patterns
  • Customer Segmentation: Target high-value customer segments more effectively
  • Revenue Forecasting: Predict future revenue based on customer behavior patterns