Project Spotlight

Maven Return to Space Challenge - Power BI Dashboard

This project is part of the maven-return-to-space-challenge. The goal is to create a single-page Power BI dashboard that answers:
The Ultimate Rocket Ranking: Which rocket (by name, cost, or size) is the true champion of space?
The Golden Era of Space: Which decade had the highest success rate, most launches, or best cost efficiency?
The Winning Team: Which country or organization has dominated the space race, and what does their strategy look like?

Live interactive demo of the Power BI dashboard

Detailed Analysis of Issues and Solutions

    This section outlines key data issues in analysis and provides practical solutions to ensure accurate, unbiased comparisons
  • Price Data Missing
    • Issue: A large portion of rows have no price information, making cost-based rankings unreliable.
    • Why it matters: Missing data reduces accuracy and biases analysis.
    • Implemented Solution: Impute missing values using machine learning model (Random Forest Regressor)

  • Outliers in Price
    • Issue: Extreme values for price in some rows.
    • Why it matters: Outliers distort predicted price and reduce accuracy in comparisons and rankings.
    • Implemented Solution: The code uses the Interquartile Range (IQR) to identify outliers, delete those outliers, considering them as missing.

  • Time Format Issues
    • Issue: Time stored as fractional values or inconsistent formats and missing values.
    • Why it matters: Having time in the dataset doesn’t provide any value to the data and increases the size of dataset
    • Implemented Solution: Remove Time from the dataset

  • Success Rate Bias
    • Issue: Rockets with very few launches (e.g., 1 with 1 success) show misleading success rates.
    • Why it matters: Small sample sizes exaggerate performance.
    • Implemented Solution: Apply Bayesian adjustment because one launch vs. >1000 launches creates very different certainty levels.
      Based on the Beta-binomial model with a uniform prior (α = 1, β = 1).
      Bayesian Estimate= (Success Count+α)/(Launch Count +α + β)

  • Era Bias
    • Issue: Older rockets dominate launch count due to historical context
    • Why it matters: Historical dominance can overshadow modern efficiency.
    • Implemented Solution: Segment analysis by era (1950s, 1960s, etc.). Compute era-adjusted scores combined with data for fair comparison.

  • Duplicate Records
    • Issue: Duplicate rows can distort analysis.
    • Why it matters: Duplicates inflate counts and skew metrics.
    • Implemented Solution: Remove or consolidate duplicates before analysis.

  • Price Comparability Across Time
    • Issue: Prices from different decades are not comparable due to inflation, technological changes, and mission complexity.
    • Why it matters: Comparing raw prices across eras misrepresents cost efficiency and can lead to incorrect rankings.
    • Implemented Solution: Apply inflation adjustment using CPI or GDP deflator to normalize costs to present-day values.
      Formula: Adjusted Price = Original Price × Inflation Factor.
      ➔ Pulls data from World Bank API for GDP Deflator or CPI.
      ➔ Calculates Inflation Factor using 2022 as the base year.
      ➔ Inflation Factor available from 1960 onwards so for 1957-1959 the inflation factor of 1960 was considered

Random Forest-Based Price Imputation and Outlier Handling

    This section demonstrates a Python-based solution for cleaning and preparing rocket launch data. The script detects and handles outliers, imputes missing price values using a Random Forest Regressor, and ensures categorical variables are properly encoded for modeling. By automating these steps, the process improves data quality and enables accurate, unbiased analysis.
    The full code is available in my GitHub repository under the file "Random forest resgressor.py"
  • 1. Copy dataset
      data = dataset.copy()

  • 2. Detect outliers using IQR
      Q1 = data['Price'].quantile(0.25)
      Q3 = data['Price'].quantile(0.75)
      IQR = Q3 - Q1
      lower_bound = Q1 - 1.5 * IQR
      upper_bound = Q3 + 1.5 * IQR

  • 3. Mark and remove outlier
      outlier_mask = (data['Price'] < lower_bound) | (data['Price']> upper_bound)
      data.loc[outlier_mask, 'Price'] = None

  • 4. Encode categorical variables and train Random Forest
  • 5. Predict missing prices and restore categories

Python Script for Inflation Adjustment Using World Bank API

    This section shows how to fetch historical inflation data from the World Bank API and compute inflation adjustment factors for rocket launch prices. The script retrieves GDP Deflator or CPI values for a given country and time range, calculates inflation factors using a base year (2022), and saves the results as a CSV file for further analysis or visualization in tools like Power BI.
    The full code is available in my GitHub repository under the file "Inflation data.py"
  • Import Required Libraries
  • Define Parameters
  • Fetch Data from World Bank API
  • Convert API Data to DataFrame
  • Compute Inflation Factor
  • Save Results as CSV

📐 DAX Measures for KPI and Visualization


  • Golden Era of Space

    Calculates key metrics for each decade, including launch count, success count, success rate, and average adjusted price. Normalizes these metrics for fair comparison.
    
    GoldenEraSpace =
    VAR SummaryTable =
        ADDCOLUMNS(
            SUMMARIZE(
                space_mission,
                space_mission[Decade]
            ),
            "Launch Count",
                CALCULATE(
                    COUNT(space_mission[Mission]),
                    FILTER(space_mission, space_mission[Decade] = EARLIER(space_mission[Decade]))
                ),
            "Success Count",
                CALCULATE(
                    COUNT(space_mission[MissionStatus]),
                    space_mission[MissionStatus] = "Success",
                    FILTER(space_mission, space_mission[Decade] = EARLIER(space_mission[Decade]))
                ),
            "Success Rate",
                DIVIDE(
                    CALCULATE(
                        COUNT(space_mission[MissionStatus]),
                        space_mission[MissionStatus] = "Success",
                        FILTER(space_mission, space_mission[Decade] = EARLIER(space_mission[Decade]))
                    ),
                    CALCULATE(
                        COUNT(space_mission[Mission]),
                        FILTER(space_mission, space_mission[Decade] = EARLIER(space_mission[Decade]))
                    )
                ),
            "Avg Adjusted Price",
                CALCULATE(
                    AVERAGE(space_mission[Adjusted Price]),
                    FILTER(space_mission, space_mission[Decade] = EARLIER(space_mission[Decade]))
                )
        )
    
    VAR MaxLaunchCount = MAXX(SummaryTable, [Launch Count])
    VAR MaxSuccessCount = MAXX(SummaryTable, [Success Count])
    VAR MinAvgAdjPrice = MINX(SummaryTable, [Avg Adjusted Price])
    
    RETURN
        ADDCOLUMNS(
            SummaryTable,
            "Normalized Launch Count", DIVIDE([Launch Count], MaxLaunchCount),
             "Normalized Success Count", DIVIDE([Success Count], MaxSuccessCount),
            "Normalized Avg Adjusted Price", DIVIDE(MinAvgAdjPrice, [Avg Adjusted Price])
        )
  • Ultimate Rocket Ranking

    Ranks rockets by launch count, average adjusted price, and success rate, with normalization for fair comparison.
    
    Ultimate Rocket Ranking =
    VAR SummaryTable= FILTER(
        SUMMARIZE(
            space_mission,
            space_mission[Rocket],
            "Launch Count", COUNT(space_mission[Rocket]),
            "Avg Adjusted Price", AVERAGE(space_mission[Adjusted Price]),
            "Success Count", CALCULATE(COUNT(space_mission[MissionStatus]), space_mission[MissionStatus] = "Success"),
            "Success Rate", DIVIDE(
                CALCULATE(COUNT(space_mission[MissionStatus]), space_mission[MissionStatus] = "Success"),
                COUNT(space_mission[Mission])
            )
        ),
        [Success Count] > 0
    )
    VAR MaxLaunchCount = MAXX(SummaryTable, [Launch Count])
    VAR MaxSuccessCount = MAXX(SummaryTable, [Success Count])
    VAR MinAvgAdjPrice = MINX(SummaryTable, [Avg Adjusted Price])
    
    RETURN
        ADDCOLUMNS(
            SummaryTable,
            "Normalized Launch Count", DIVIDE([Launch Count], MaxLaunchCount),
             "Normalized Success Count", DIVIDE([Success Count], MaxSuccessCount),
            "Normalized Avg Adjusted Price", DIVIDE(MinAvgAdjPrice, [Avg Adjusted Price])
        )
    
    			    
  • Winning Team Ranking

    Evaluates countries and organizations by launch count, success rate (Bayesian estimate), and cost efficiency, then combines these into a final score for ranking.
    
    Winning Team Ranking =
    VAR SummaryTable =
        SUMMARIZE(
            space_mission,
            space_mission[Company],
            space_mission[Country],
            "Launch Count", COUNT(space_mission[Mission]),
            "Success Count", CALCULATE(
                COUNT(space_mission[MissionStatus]),
                space_mission[MissionStatus] = "Success"
            ),
            "Avg Adjusted Price", AVERAGE(space_mission[Adjusted Price])
        )
    VAR MaxLaunchCount = MAXX(SummaryTable, [Launch Count])
    VAR MinAvgAdjPrice = MINX(SummaryTable, [Avg Adjusted Price])
    
    RETURN
        ADDCOLUMNS(
            SummaryTable,
            "Bayesian Estimate", DIVIDE([Success Count] + 1, [Launch Count] + 2),
            "Normalized Launch Count", DIVIDE([Launch Count], MaxLaunchCount),
            "Normalized Avg Adjusted Price", DIVIDE(MinAvgAdjPrice, [Avg Adjusted Price]),
            "Final Score",
                0.3 * DIVIDE([Success Count] + 1, [Launch Count] + 2) +
                0.3 * DIVIDE([Launch Count], MaxLaunchCount) +
                0.3 * DIVIDE(MinAvgAdjPrice, [Avg Adjusted Price])
        )
    			    

💡 Key Insights from Dashboard

  • Ultimate Rocket Ranking:
    Cosmos-3M (11K65M) dominates with 406 launches and a 93.95% success rate, ranking #1.
    Other notable rockets: Vostok: 226 launches, 94.40% success.
    Molniya-M: 125 launches, 95.11% success.
    Insight: High launch counts correlate with strong reliability, but Cosmos-3M leads in both volume and success.

  • Top 5 Winning Countries Russia leads with 36.8% share of total scores.
    USA follows at 20.8%, then China (15.4%), France (12.3%), and Kazakhstan (15.4%).
    Insight: Russia remains the dominant player historically, with USA and China as strong contenders

  • Top Winning Companies RVSN USSR ranks #1 with a Bayesian score of 1.00.
    Other top companies: NASA (USA), Astra (USA), Galactic Energy (China).
    Insight: Soviet-era organizations still hold top positions due to historical dominance.

  • Golden Era of Space 1960s had the highest launch count (780.27) and success rate (31.57% era score).
    Decline in launches after 1980s, but success rates improved.
    Insight: The 1960s were the most active decade, marking the space race peak.

  • Overall Patterns Historical dominance by Russia and USSR-era rockets.
    USA and China emerging strongly in modern times.
    1960s = peak activity; modern era = fewer launches but higher reliability.