Aspiring Machine Learning Engineer
Building machine learning systems and documenting the engineering journey.
"A living engineering notebook documenting my journey toward becoming a Machine Learning Engineer."
Every project, engineering note, and case study reflects real work, real learning, and continuous improvement.
Machine learning models are only one component of a larger software system. Focus on deployment, APIs, infrastructure, validation, and maintainability.
Prioritize modular code, reproducible experiments, clean interfaces, validation, and maintainable pipelines.
Document experiments, failures, trade-offs, and engineering decisions to create a transparent learning record.
Loan Approval Prediction System
My development as an engineer centers on moving from mathematical concepts to production software.
Built foundational Python scripting skills, learning data processing libraries (Pandas, NumPy) and basic statistical classification methods.
Learned scikit-learn pipeline engineering to prevent data leakage and training-serving skew. Began exploring REST API architectures using FastAPI.
Started a QA Internship at Panacee, translating validation concepts into structured test scripts. Currently learning Terraform configurations for serverless deployment environments.
NIMS B.Tech AIML
Interactive breakdowns of machine learning systems.
A classification model and API designed to predict loan approval outcomes using applicant financial profiles.
Manual credit evaluation is slow and subjective. Automating classifications requires preprocessing numeric/categorical features and exposing predictions as a fast, type-safe API.
Built an end-to-end preprocessing and model pipeline with scikit-learn, validated inputs with Pydantic, and served inference requests through a FastAPI web server.
Selected scikit-learn for training to leverage its native Pipeline interface. Chose Pydantic schema validation inside FastAPI to reject invalid client inputs at the API entry point.
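A minimal sketch of rejecting invalid inputs at the entry point with a Pydantic schema. The field names and constraints are hypothetical, since the real applicant schema is not listed here, and `parse_application` is an illustrative helper rather than part of the project.

```python
from pydantic import BaseModel, Field, ValidationError


class LoanApplication(BaseModel):
    # Hypothetical fields and constraints, chosen only for illustration.
    income: float = Field(gt=0)
    loan_amount: float = Field(gt=0)
    credit_history_years: int = Field(ge=0)
    employment_type: str


def parse_application(payload: dict):
    """Return (application, None) on success or (None, error_count) on failure."""
    try:
        return LoanApplication(**payload), None
    except ValidationError as exc:
        # exc.errors() holds one entry per field that failed validation.
        return None, len(exc.errors())


valid, _ = parse_application(
    {"income": 52000, "loan_amount": 15000,
     "credit_history_years": 6, "employment_type": "salaried"}
)

# Negative income and a non-numeric loan amount both fail validation.
invalid, error_count = parse_application(
    {"income": -10, "loan_amount": "a lot",
     "credit_history_years": 6, "employment_type": "salaried"}
)
```

Declaring the constraints on the schema keeps malformed requests away from the model code, so the pipeline only ever sees values of the expected type and range.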
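Preprocessing categorical and numerical features separately, outside the model pipeline, leaked information between cross-validation folds. Solved by encapsulating all preprocessing in a single ColumnTransformer inside a Pipeline, so every transformer is fit only on the training portion of each fold.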
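A minimal sketch of the unified flow, assuming a small synthetic dataset. The column names, the label rule, and the choice of LogisticRegression are made up for illustration and are not the project's actual data or model.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in data (hypothetical columns and values).
rng = np.random.default_rng(0)
n = 200
X = pd.DataFrame({
    "income": rng.normal(50_000, 15_000, n),
    "loan_amount": rng.normal(20_000, 5_000, n),
    "employment_type": rng.choice(["salaried", "self_employed", "unemployed"], n),
})
# Made-up label rule so the example has something to learn.
y = (X["income"] > 2.5 * X["loan_amount"]).astype(int)
# Simulate missing values in a numeric column.
X.loc[rng.choice(n, 10, replace=False), "income"] = np.nan

# Numeric columns: impute, then scale.
numeric = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])
# One ColumnTransformer routes each column group to its own preprocessing.
preprocess = ColumnTransformer([
    ("num", numeric, ["income", "loan_amount"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["employment_type"]),
])
model = Pipeline([
    ("preprocess", preprocess),
    ("clf", LogisticRegression(max_iter=1000)),
])

# cross_val_score refits the whole pipeline on each training fold, so the
# imputer, scaler, and encoder never see that fold's held-out rows.
scores = cross_val_score(model, X, y, cv=5)
```

Because the preprocessing lives inside the estimator, the same fitted object can later be used for inference, which keeps training and serving transformations identical.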
Biggest Lesson (Model vs. System): Model accuracy wasn't the hardest problem. Building a reliable preprocessing pipeline and validating inputs correctly took significantly more engineering effort than training the model itself.
Using Terraform configurations to deploy the FastAPI microservice automatically to AWS Lambda.
Not available (Local REST prediction API only).
Learning in public: technical documentation logs detailing ML workflows, code investigations, and system architectures.
Documenting how to construct formal ML preprocessing pipelines. Bundling scalers, imputers, and encoders inside Pipeline and ColumnTransformer objects so that preprocessing is fit only on training data, which prevents data leakage across train/test splits.
Building HTTP REST APIs for local model inference. Using FastAPI POST endpoints and Pydantic BaseModel schemas to validate incoming JSON payloads, with FastAPI generating self-documenting OpenAPI endpoints automatically.
Investigating Infrastructure as Code (IaC) architectures. Writing basic Terraform scripts to define and automate AWS Lambda serverless functions, and testing endpoint deployment and IAM role configuration.
A structured summary of my immediate focus and technical progression path.
I'm always interested in discussing machine learning systems, software engineering, internships, and collaborative projects. Feel free to reach out through the contact form or connect with me directly.