Bonjour; I'm Sugumaran Balasubramaniyan

AI/ML Engineer

AI/ML Engineer with 7+ years of experience building production-grade machine learning systems on AWS and Azure. Focused on LLM-powered workflows, agentic AI, MLOps, and end-to-end ML systems that move from prototype to production.

Delivered measurable outcomes across supply chain, healthcare, and risk domains — including 21% cost reduction, 45% throughput improvement, and 81% AUC-ROC on multimodal healthcare prediction.

</AboutMe>

I'm an AI/ML Engineer with 7+ years of experience building production-grade data and machine learning systems across cloud, analytics, and enterprise environments. My recent work focuses on LLM-powered workflows, agentic AI, model deployment, and ML governance — systems that move from prototype to production.


I've built AWS-based data pipelines, implemented MLflow governance on SageMaker, deployed models with Docker and Kubernetes, and delivered measurable outcomes across supply chain, healthcare, and risk domains. Alongside industry work, I teach data analytics and machine learning at SKEMA Business School — bridging technical depth with real-world business impact.


Sugumaran Balasubramaniyan

</Experience>

Professor (External)
SKEMA Business School
Paris, Île-de-France, France | Jan 2026 - Present

• Lead instructor for undergraduate and graduate courses in Data Analytics, Business Intelligence, and SQL, with a focus on Power BI, Python, and machine learning applications in real-world business contexts.
• Design and deliver curriculum that bridges theoretical knowledge with hands-on industry applications, enabling students to develop end-to-end data solutions—from data extraction and modeling to visualization and storytelling.
• Mentor students in capstone projects and research initiatives, often collaborating with industry partners to solve practical challenges in marketing, finance, and customer analytics.
• Recognized for fostering an engaging and inclusive learning environment, consistently receiving high student satisfaction ratings for clarity, relevance, and applied teaching methods.
• Continuously update course content to reflect the latest tools and trends in data analytics, including SQL databases, cloud platforms (AWS/Azure), and AI-driven business intelligence.
• Hold the Microsoft Power BI Data Analyst Associate and AWS Machine Learning Engineer Associate certifications, and integrate these industry-recognized credentials and best practices into the academic experience.

Generative AI Data Scientist & ML Engineer
The Validate
Paris, Île-de-France, France | Jan 2024 - Present

AI-Powered Document Processing Pipeline
• Architected and deployed an end-to-end OCR pipeline leveraging n8n workflow automation, PyTorch for deep learning, and GPT-4 for intelligent data extraction.
• Achieved 45% reduction in manual data entry workload and recovered 20+ hours weekly for the operations team.
• Implemented error handling, data validation, and automated quality checks, ensuring 94% accuracy in document processing.
• Technologies: n8n, PyTorch, GPT-4, Python, Docker, REST APIs

Pharma Supply Chain Optimization
• Designed and built a full-stack supply chain optimization solution combining cloud data warehousing with an interactive web application.
• Improved forecast accuracy by 15% using SARIMA time-series models, enabling proactive inventory risk identification.
• Reduced stockout incidents by 12% through predictive analytics and a real-time alerting system.
• Technologies: Python, SQL, Tableau, AWS (S3, RDS), Snowflake, predictive modeling

MLOps & Deployment
• Containerized ML models using Docker and established CI/CD pipelines for automated model testing, validation, and deployment, reducing model deployment time from 5 days to 6 hours.
• Implemented model monitoring and drift detection for deployed models.
• Technologies: Docker, Kubernetes, Git, CI/CD, MLflow, model versioning

Data Analyst / Data Engineer
Joubert-Associes
Paris, Île-de-France, France | Jul 2023 - Dec 2023

Executive Dashboard Development & SQL Optimization
• Designed and developed interactive Power BI dashboards tracking 50+ construction projects, providing leadership with real-time visibility into project health, timelines, and resource allocation.
• Optimized SQL query performance via indexes, window functions, and query refactoring — achieving 36% reduction in dashboard refresh time (8.5 min → 5.4 min).
• Enabled data-driven resource reallocation decisions that improved project delivery timelines by 14%.
• Technologies: Power BI, SQL Server, DAX, data modeling, ETL

Marketing Campaign Optimization through A/B Testing
• Planned, executed, and analyzed A/B tests for digital marketing campaigns targeting construction industry clients.
• Increased lead conversion rate from 8% to 12% within 3 months through data-driven campaign optimization.
• Technologies: Python (pandas, scipy), Excel, statistical analysis, hypothesis testing

Data Pipeline & Automation
• Automated weekly reporting workflows, reducing manual reporting time by 6 hours per week.
• Technologies: Python, SQL, Power BI, Jira, Confluence

Data Analyst Associate
Capgemini
Chennai, India | Apr 2021 - Jul 2022

Cloud-Based ETL Pipeline Development
• Architected and implemented automated ETL pipelines using AWS Glue and Python, processing 2TB+ of supply chain data daily.
• Accelerated data processing speed by 45% through optimized PySpark transformations and parallel processing.
• Eliminated 15 hours/week of manual data transformation work, freeing the team for higher-value analytics.

Machine Learning for Inventory Optimization (Client: Philip Morris International)
• Developed and deployed SARIMA and LSTM time-series forecasting models predicting demand across 150+ SKUs and 30+ distribution centers.
• Achieved $2.3M annual cost reduction (21% decrease) through optimized inventory management while maintaining 96% service levels.
• Reduced forecasting error from 18% to 7% through model refinement and hyperparameter tuning.
• Performed extensive feature engineering, creating 50+ predictive features from historical sales, seasonality, and external factors.

MLOps & Model Governance
• Established MLflow-based model governance framework to track experiments, manage versions, and ensure reproducibility.
• Implemented automated model monitoring and drift detection, reducing drift incidents by 18%.
• Deployed models to AWS SageMaker with automated retraining pipelines; managed 12+ production models with A/B testing and rollback.

Risk Modeling Data Analyst Executive
Infosys BPM
Chennai, India | Mar 2020 - Apr 2021

Risk Management & Credit Risk Analytics (Client: Citizens Bank, USA)
• Conducted comprehensive risk management analysis for a retail banking portfolio using Python-based statistical modeling and Monte Carlo simulations.
• Developed risk scoring models to predict loan default probability with 82% accuracy, enabling proactive risk mitigation strategies.
• Reduced portfolio risk exposure by 9% through a data-driven early warning system.
• Automated monthly risk reports using Python and SQL, reducing generation time from 2 days to 4 hours.
• Analyzed 10M+ records to identify trends, anomalies, and actionable insights for banking operations.

Team Leadership & Mentorship
• Mentored 2 junior analysts in Python, SQL best practices, and statistical analysis techniques.
• Improved team productivity and delivery timelines by 25% through knowledge sharing and collaborative problem-solving.

Manufacturing Analytics Specialist
Arimalytics
Pondicherry, India | Jun 2018 - Feb 2020

Predictive Maintenance & Manufacturing Analytics
• Built predictive models to forecast inventory requirements and machine failures, improving forecast accuracy by 12%.
• Enabled proactive maintenance scheduling that reduced unplanned downtime by 18% and extended equipment lifespan.
• Developed anomaly detection algorithms to identify equipment behavior deviations in real-time.

Dashboard Development & Business Intelligence
• Designed and deployed interactive Tableau dashboards visualizing production KPIs, risk indicators, and operational metrics.
• Reduced manual reporting time by 30% through automated data refresh and self-service analytics.
• Analyzed supply chain data to optimize inventory levels, reducing carrying costs by 8%.

</Education>

MSc Data Science and AI Strategy
emlyon business school
Paris, Île-de-France, France | Aug 2022 - Feb 2024

• This program uniquely bridged the gap between AI technology and business strategy, equipping me with the skills to design and deploy AI applications with a focus on responsible data governance and transparent practices. I gained a practical, action-oriented understanding of both the technical fundamentals and the human/business impacts of AI.

International Exchange Program
McGill University
Montréal, Québec, Canada | May 2023 - Jul 2023

• This program provided me with a comprehensive skillset in emerging technologies, including: developing and deploying Internet of Things (IoT) solutions, understanding North American business practices with a focus on Montreal's tech ecosystem, and designing and implementing advanced recommender systems. Through hands-on projects and theoretical study, I gained expertise in data analysis, hardware/software integration, and collaborative filtering techniques.

Post Graduate Program in Data Science
Great Learning
Chennai, India | Jul 2018 - Mar 2019

• This intensive program immersed me in key data science and analytics disciplines, including data analysis, machine learning (supervised and unsupervised), and text mining. I developed proficiency in essential tools and technologies like Python, R, Tableau, and database management, applying these skills through real-world industry case studies.

Master of Business Administration
Pondicherry University
Pondicherry, India | Jul 2016 - May 2018

• This specialized program offered in-depth training in key areas of Operations and Human Resources. I developed proficiency in Supply Chain Management, Operations Research, Service Operations Management, and Quality Management, alongside expertise in HR Analytics, Strategic Human Resource Management, and Human Resources Management. I also gained valuable knowledge in Strategic Management and Project Management.

Bachelor of Technology
Pondicherry University
Pondicherry, India | Jul 2012 - May 2016

• This four-year program equipped me with a deep understanding of mechanical engineering principles, covering a wide range of subjects including heat and mass transfer, kinematics, and automobile engineering. I developed proficiency in both theoretical concepts and practical applications, preparing me for roles in design, simulation, and control across various industries.

</Certifications>

AWS Machine Learning Engineer Associate
Amazon Web Services

• Proficient in designing, implementing, and deploying machine learning solutions on AWS.
• Expertise in SageMaker, feature engineering, and model optimization.
• Skilled in ML pipeline orchestration and automated model training workflows.

AWS Cloud Practitioner
Amazon Web Services

• Validated expertise in AWS cloud architecture and foundational services.
• Demonstrated knowledge of AWS pricing models and cost optimization strategies.
• Proficient in deploying scalable and secure cloud infrastructure on AWS.

Databricks AI Agents
Databricks Academy

• Mastered building autonomous AI agents using the Databricks platform.
• Implemented LLM-based agents for complex task automation and reasoning.
• Expertise in prompt engineering and agentic workflow orchestration.

Snowflake Data Warehousing
Snowflake University

• Proficient in designing and managing cloud-based data warehouses using Snowflake.
• Expertise in data modeling, query optimization, and Snowflake governance.
• Skilled in data sharing and Snowflake collaboration features.

AWS GenAI Practitioner
Amazon Web Services

• Expert in building generative AI applications using AWS services.
• Proficient with Amazon Bedrock, SageMaker JumpStart, and generative AI tools.
• Skilled in prompt engineering and responsible AI practices.

Dataiku ML Practitioner
Dataiku Academy

• Proficient in end-to-end machine learning projects using the Dataiku platform.
• Expertise in visual machine learning workflows and automated model selection.
• Skilled in model deployment and monitoring within the Dataiku ecosystem.

Dataiku Developer
Dataiku Academy

• Expert in developing custom plugins and extensions for the Dataiku platform.
• Skilled in Python development within Dataiku's recipe and custom component frameworks.
• Proficient in integrating external APIs and data sources with Dataiku.

Atlassian Agile Project Management Professional
Atlassian Academy

• Expert in agile project management using Jira and Confluence platforms.
• Proficient in sprint planning, backlog management, and team collaboration workflows.
• Skilled in implementing agile methodologies and scaling agile practices across teams.

Microsoft Power BI Data Analyst Associate (PL-300)
Microsoft

• Certified in designing and building scalable data models, cleaning and transforming data, and enabling advanced analytic capabilities in Power BI.
• Proficient in DAX, Power Query, and publishing reports and dashboards for business decision-making.
• Applied directly in academic and professional contexts, including teaching BI at SKEMA Business School.

NVIDIA Certified Professional: Agentic AI
NVIDIA

• Certified in designing and deploying production-grade agentic AI systems using LLMs and tool-calling frameworks.
• Proficient in multi-agent orchestration, memory management, and RAG pipeline architectures.
• Skilled in building reliable, evaluable AI agents for enterprise environments.

NVIDIA Certified Associate: Generative AI LLMs
NVIDIA

• Certified in the fundamentals of large language models, transformer architectures, and generative AI techniques.
• Proficient in fine-tuning, prompt engineering, and deploying LLM-based solutions.
• Skilled in applying generative AI to real-world NLP and multimodal use cases.

Databricks AI Agent Fundamentals
Databricks Academy

• Proficient in building and evaluating AI agents using the Databricks platform and Unity Catalog.
• Skilled in integrating LLMs with tool use, retrieval augmentation, and structured outputs.
• Experienced in deploying agent workflows for enterprise-scale data and AI pipelines.

</Languages>

English
Native or Bilingual
French
Professional Working
Tamil
Native or Bilingual
Hindi
Limited Working

</Skills>

Tech Stack

  • Python
  • PyTorch
  • R
  • Scala
  • Azure SQL
  • MySQL
  • Redis
  • PostgreSQL
  • GitHub
  • HuggingFace
  • Git
  • Anaconda
  • Apache Spark
  • Apache Airflow
  • Apache Hadoop
  • Apache Cassandra
  • Apache Kafka
  • AWS
  • Azure
  • GCP
  • Power BI
  • Tableau
  • NumPy
  • Pandas
  • Scikit-learn
  • Matplotlib
  • Plotly
  • Streamlit
  • Flask
  • Docker
  • Kubernetes
  • TensorFlow
  • HTML5
  • CSS3
  • JavaScript
  • React
  • Node.js
  • MongoDB
  • GraphQL
  • Confluence
  • Jira
  • Excel
  • FastAPI
  • OpenCV
  • Databricks
  • Snowflake
  • Dataiku
  • MLflow
  • LangChain
  • LangGraph
  • n8n

</Projects>

Patient Mortality Rate and Readmission Prediction

Project 1

Customer Churn

Project 2

Heart stroke prediction

Project 3

Sentiment analyzer

Project 4

Sleep disorder prediction

Project 5

Fraud detection using R

Project 6

Big Data analysis using Databricks

Project 7

Medical cost prediction

</Publications>

Stir Casting of Aluminium with Fly Ash
Academic Publication

• Research on the fabrication and mechanical characterization of aluminium metal matrix composites reinforced with fly ash using the stir casting process.

Multi-Objective Optimization in Wire-Cut EDM of Al 6063/Al₂O₃ Metal Matrix Composite
Academic Publication · Response Surface Methodology

• Applied Response Surface Methodology to optimize Wire-Cut Electric Discharge Machining parameters for Al 6063/Al₂O₃ composites, minimizing surface roughness and maximizing material removal rate.

Comparative Performance of Puducherry-Based Cooperative Societies
Academic Publication · Data Envelopment Analysis

• Comparative analysis of the operational efficiency of five Puducherry-based cooperative societies using Data Envelopment Analysis (DEA).