I am a data analyst/scientist/enthusiast/monkey. "Data for life" is the motto I live by. So what exactly do I do?
I help business managers make important decisions through the power of data analytics. With data, I answer questions like:
- Which marketing campaign yields the highest profit?
- Who are the best customers?
- How can we preemptively identify unsatisfied customers and assist them further?
- How can we optimize the prices?
At work, I primarily use tools such as R, Python, SQL, MongoDB, and Linux commands.
Statistical Analysis Pipeline:
- Cleaning and preprocessing
- Data mining and pattern discovery
- Statistical analysis
- Dashboard integrated with database
- Visualizations via the 'ggplot2', 'googleVis', and 'rCharts' packages
- Geospatial map visualization
- Interactive data visualization applications with the 'Shiny' package
- Automated data analysis and report generation through scheduled cron jobs
- Multivariate linear and logistic regression
- Feature/variable selection
- Classification and regression trees (CART)
- Linear and quadratic discriminant analysis (LDA and QDA)
- Random forest
Clustering and segmentation:
- Data normalization
- Hierarchical and k-means clustering
- Independent and dependent t-tests
- Mann-Whitney U and Wilcoxon tests
- Chi-square and Fisher's exact tests
- Analysis of variance (ANOVA) and p-value adjustment: Tukey's HSD and Bonferroni correction
- Energy Usage Profile Project: parsed energy usage data obtained from remote sensors to perform clustering, feature selection, classification, interactive visualization, and predictive modeling; the goal was to profile usage signatures/patterns and identify different types of buildings (residential, retail, manufacturing, bank, etc.)
- Customer Satisfaction Project: through prediction modeling, we identified potentially unsatisfied/unhappy...