
Marka Pranay
Transforming Fragmented Data into Executive Intelligence
Hi, my name is Marka Pranay. I hold a Bachelor of Technology (B.Tech) in Artificial Intelligence and Machine Learning from JNTUH, and I am looking for an entry-level Data Analyst position where I can apply that technical foundation. I focus on finding patterns in complex data and presenting them in clear visual formats that support organizational decision-making.
Data Pipeline
End-to-end workflow from raw data to executive dashboards
Data Preprocessing
Ingesting raw datasets, handling missing values, removing duplicates, feature engineering, and transforming data for ML readiness.
Inference Engine
Building predictive models to generate probabilistic forecasts, churn scores, and trend signals from structured data.
Executive Delivery
Translating model outputs into Power BI dashboards, SQL reports, and Excel summaries for C-suite decision-making.
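A minimal sketch of the preprocessing stage, assuming pandas and a small made-up table. The column names, the median fill, and the `lifetime_spend` feature are illustrative assumptions, not the actual project schema.

```python
import numpy as np
import pandas as pd

# Hypothetical raw records: one exact duplicate row and two missing values.
raw = pd.DataFrame({
    "customer_id": ["C001", "C002", "C002", "C003", "C004"],
    "monthly_spend": [20.0, 50.0, 50.0, np.nan, 35.0],
    "tenure_months": [3, 36, 36, 12, np.nan],
})

# 1. Remove exact duplicate rows.
clean = raw.drop_duplicates().copy()

# 2. Fill missing spend with the column median (one of several possible strategies).
clean["monthly_spend"] = clean["monthly_spend"].fillna(clean["monthly_spend"].median())

# 3. Drop rows where tenure is unknown, since it cannot be guessed safely here.
clean = clean.dropna(subset=["tenure_months"])

# 4. Feature engineering: an assumed derived feature for illustration only.
clean["lifetime_spend"] = clean["monthly_spend"] * clean["tenure_months"]

print(clean)
```

Whether to fill or drop a missing value depends on the column. The choices above are only one reasonable default.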
Internship
I completed a Data Science & Analytics internship at Zideo Development Startup Company in Bangalore, where I worked on data cleaning, data analysis, and statistical analysis, and then explored machine learning basics to draw insights from large datasets. I undertook two major projects: Time Series Forecasting with Apple Stock Market Data, and Customer Churn Prediction with an XGBoost model. Along the way I strengthened my skills in Python, SQL, Power BI, and Excel for data analysis, data cleaning, and dashboard building to support effective decisions.
Projects
Built with real datasets · STAR Methodology
Time Series Forecasting — Apple Stock Market
During my internship, I used an Apple stock market dataset to identify future trends and seasonal patterns, and to explain them from past price movements.
The task was to handle missing and null values, remove duplicates, convert the date column into a datetime index, and implement ARIMA and SARIMA time-series models.
Conducted EDA, cleaned the dataset, converted the date column into the index, and implemented ARIMA and SARIMA forecasting models to generate predictions, trends, and seasonality insights.
Developed forecasting models and dashboards that present trend predictions, seasonal insights, and model-based explanations to support stakeholders in data-driven decision-making.
| Date | Actual Close ($) | ARIMA Pred ($) | SARIMA Pred ($) | ARIMA Error ($, Actual - Pred) | SARIMA Error ($, Actual - Pred) | Trend |
|---|---|---|---|---|---|---|
| 31 Mar 2022 | $177.84 | $179.66 | $187.44 | -1.82 | -9.60 | ↓ BEARISH |
| 01 Apr 2022 | $174.03 | $179.66 | $187.50 | -5.63 | -13.47 | ↓ BEARISH |
| 04 Apr 2022 | $174.57 | $179.66 | $187.40 | -5.09 | -12.83 | ↓ BEARISH |
| 05 Apr 2022 | $177.50 | $179.66 | $187.54 | -2.16 | -10.04 | ↓ BEARISH |
| 06 Apr 2022 | $172.36 | $179.66 | $187.92 | -7.30 | -15.56 | ↓ BEARISH |
| 07 Apr 2022 | $171.16 | $179.66 | $187.82 | -8.50 | -16.66 | ↓ BEARISH |
| 08 Apr 2022 | $171.78 | $179.66 | $188.10 | -7.88 | -16.32 | ↓ BEARISH |
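The error columns in the table are the actual close minus the predicted close. A short sketch that recomputes them from the table values and summarises each model with MAE and RMSE (the helper names are illustrative):

```python
from math import sqrt

# Values copied from the table above.
actual = [177.84, 174.03, 174.57, 177.50, 172.36, 171.16, 171.78]
arima_pred = [179.66] * 7
sarima_pred = [187.44, 187.50, 187.40, 187.54, 187.92, 187.82, 188.10]


def errors(actual_values, predicted_values):
    """Per-day error as actual minus predicted, rounded to cents."""
    return [round(a - p, 2) for a, p in zip(actual_values, predicted_values)]


def mae(errs):
    """Mean absolute error."""
    return sum(abs(e) for e in errs) / len(errs)


def rmse(errs):
    """Root mean squared error."""
    return sqrt(sum(e * e for e in errs) / len(errs))


arima_err = errors(actual, arima_pred)
sarima_err = errors(actual, sarima_pred)

print("ARIMA  errors:", arima_err, "MAE:", round(mae(arima_err), 2))
print("SARIMA errors:", sarima_err, "MAE:", round(mae(sarima_err), 2))
```

Over these seven rows the ARIMA MAE is about 5.48 and the SARIMA MAE about 13.50. Every error is negative, so both models over-predicted the close in this window.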
Customer Churn Predictions — XGBoost Retention Model
Worked with transaction and customer churn datasets to identify customers at high risk of churning and to analyse behaviour, engagement, and subscription tenure.
The task was to handle missing (NaN/null) values, engineer features such as engagement score and subscription tenure, and implement XGBoost for churn probability prediction.
Conducted EDA, evaluated the XGBoost model using recall, precision, accuracy, and F1-score, then identified high-risk customers and analysed engagement scores and customer behaviour.
Created a dashboard displaying high-risk customers, engagement scores, subscription tenure, and recommendations based on XGBoost churn probabilities to support sales growth and future analysis.
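For reference, the four evaluation metrics named above can be computed from true and predicted churn labels as in this sketch. The labels are made up for illustration and are not results from the project.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 means churn."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def churn_metrics(y_true, y_pred):
    """Precision, recall, accuracy, and F1 for the churn (positive) class."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = (tp + tn) / len(y_true)
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"precision": precision, "recall": recall,
            "accuracy": accuracy, "f1": f1}


# Hypothetical labels: 1 = churned, 0 = retained.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 1]

report = churn_metrics(y_true, y_pred)
print(report)
```

For churn work, recall on the churn class usually matters most, because a missed churner gets no retention action at all.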
| Customer ID | Engagement Score | Subscription Tenure | Fragmentation | Churn Probability | Risk Level | Recommendation |
|---|---|---|---|---|---|---|
| C001 | 5.4 | 3 months | High Fragmentation | 87% | Critical Risk | Retention Call |
| C002 | 21.7 | 36 months | Low Fragmentation | 12% | Low Risk | Loyalty Rewards |
| C003 | 11.2 | 12 months | Medium Fragmentation | 58% | Medium Risk | Email Campaign |
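One way to turn a churn probability into the risk level and recommendation shown in the table is a simple threshold rule. The cut-offs below (0.75 and 0.40) are assumptions chosen to be consistent with the three sample rows. The thresholds actually used in the dashboard are not stated.

```python
def risk_tier(churn_probability):
    """Map a churn probability in [0, 1] to (risk level, recommendation).

    The 0.75 and 0.40 thresholds are hypothetical.
    """
    if not 0.0 <= churn_probability <= 1.0:
        raise ValueError("churn_probability must be between 0 and 1")
    if churn_probability >= 0.75:
        return "Critical Risk", "Retention Call"
    if churn_probability >= 0.40:
        return "Medium Risk", "Email Campaign"
    return "Low Risk", "Loyalty Rewards"


# Probabilities taken from the sample rows above.
for customer_id, probability in [("C001", 0.87), ("C002", 0.12), ("C003", 0.58)]:
    level, action = risk_tier(probability)
    print(customer_id, f"{probability:.0%}", level, action)
```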
Code
Actual Python notebooks
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.statespace.sarimax import SARIMAX
from sklearn.metrics import mean_squared_error
import os
# Load dataset
df = pd.read_excel('aapl_2014_2023.csv.xlsx')
df.info()

import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.metrics import accuracy_score, roc_auc_score
from xgboost import XGBClassifier
# Load datasets
customers_churn_data = pd.read_csv("WA_Fn-UseC_-Telco-Customer-Churn.csv")
Transactions = pd.read_csv("Daily Household Transactions.csv")
Education
B.Tech — Artificial Intelligence & Machine Learning
Jawaharlal Nehru Technological University, Hyderabad (JNTUH)
Specialized coursework in machine learning algorithms, deep learning fundamentals, statistical analysis, and data engineering — directly applied to real-world internship projects.
Let's Connect
Let's connect to achieve data-driven goals!