Alexandru Paicu

Machine Learning Engineer | Data Scientist

Iași, Romania

Profile

Machine Learning Engineer with 5+ years of experience in artificial intelligence and data science. Specialized in predictive modeling, computer vision, generative AI, RAG systems, Agentic AI, and AWS-based MLOps. Proven ability to build production-ready ML systems, optimize pipelines, and align technical solutions with business goals through cross-functional collaboration.

Skills

Programming

  • Python
  • JavaScript
  • HTML/CSS
  • C++

Machine Learning & AI

  • TensorFlow
  • PyTorch
  • Scikit-Learn
  • XGBoost
  • Keras
  • Neural Networks
  • Deep Learning
  • Computer Vision
  • NLP

Generative AI & RAG

  • LangChain
  • LangSmith
  • RAG Systems
  • Agentic AI
  • Prompt Engineering
  • QLoRA
  • PiSSA
  • Large Language Models
  • Cohere
  • Pinecone
  • GraphDB
  • OpenAI
  • Hugging Face

Cloud & MLOps

  • AWS SageMaker
  • AWS Bedrock
  • Lambda
  • S3
  • ECS
  • EC2
  • Fargate
  • CloudFormation
  • MLflow
  • Model Deployment
  • CI/CD
  • Docker
  • GitHub Actions

Web Development

  • FastAPI
  • Flask
  • REST APIs
  • Streamlit
  • Gradio

Web & 3D Visualization

  • Three.js
  • React-Three/Drei

Data Science

  • Pandas
  • NumPy
  • Matplotlib
  • Seaborn
  • Feature Engineering
  • Statistical Analysis
  • SQL
  • NoSQL
  • Data Pipelines
  • ETL Processes

Projects

RAG • Llama 4 Maverick via NVIDIA NIM

Docling multi-format parsing + Qdrant + NVIDIA NIM

This project showcases a working RAG application built on Hugging Face. It uses Docling for multi-format document parsing, Qdrant as the vector database, and a preconfigured NVIDIA API to serve the LLM, Llama 4 Maverick.

Parsing: Docling DocumentConverter — PDF, DOCX, PPTX, HTML, Markdown, TXT

Chunking: Docling HybridChunker — structure-aware, tokenization-bounded

Embedding: sentence-transformers/all-MiniLM-L6-v2 (local, loaded once)

Retrieval: Qdrant in-memory — direct cosine search, no subprocess overhead

Generation: NVIDIA NIM — meta/llama-4-maverick
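
The retrieval step above can be illustrated with a minimal, dependency-free sketch of cosine top-k search over an in-memory store. This is not the project's code and does not use the Qdrant client; the function names (`cosine`, `top_k`), the chunk store layout, and the toy 3-dimensional vectors are assumptions made for illustration (all-MiniLM-L6-v2 itself produces 384-dimensional embeddings).

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def top_k(query_vec, chunks, k=3):
    """Return the k (score, text) pairs most similar to the query, best first.

    chunks is a hypothetical in-memory store: a list of (text, vector) pairs.
    """
    scored = [(cosine(query_vec, vec), text) for text, vec in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy vectors stand in for real sentence embeddings.
chunks = [
    ("intro", [1.0, 0.0, 0.0]),
    ("methods", [0.0, 1.0, 0.0]),
    ("results", [0.9, 0.1, 0.0]),
]
hits = top_k([1.0, 0.0, 0.0], chunks, k=2)
```

In the application itself this search is delegated to Qdrant; the sketch only shows what "direct cosine search" computes.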

Sleep Disorder Risk Model – End-to-End Classification

XGBoost, Optuna, SHAP, Streamlit, NHANES Data

Binary classification model predicting sleep disorder risk using NHANES survey data (2005-2016 cycles). Engineered features from lifestyle and health variables including BMI, exercise, diet, sleep duration, and cardiovascular metrics. Implemented XGBoost as baseline model with Optuna hyperparameter tuning. Integrated SHAP explainability for global and local feature importance analysis and counterfactual recommendations. Deployed with Streamlit UI for real-time risk prediction and personalized health recommendations.

Hybrid RAG - Document Parsing and Retrieval System

A system for parsing documents and retrieving relevant information using a hybrid pipeline: dense + BM25 retrieval → RRF → CrossEncoder rerank.

The Qdrant-Docling-RAG system parses documents with Docling DocumentConverter (PDF, DOCX, PPTX, HTML, Markdown, TXT) and splits them into manageable chunks. Retrieval combines dense and BM25 search, fuses the two result lists with Reciprocal Rank Fusion (RRF), and reranks the candidates by relevance to the query with a CrossEncoder. The selected chunks are then reordered for better contextual understanding, and queries are rewritten via NVIDIA NIM to improve dense retrieval.
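
The fusion step can be sketched in a few lines. This is a minimal illustration of Reciprocal Rank Fusion, not the project's implementation: the function name `rrf_fuse`, the chunk ids, and the constant `k=60` (a commonly used default) are assumptions.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of ids with Reciprocal Rank Fusion.

    Each id scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1. Ties are broken by id for determinism.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical chunk ids returned by the two retrievers, best first.
dense = ["c2", "c1", "c3"]
bm25 = ["c1", "c4", "c2"]
fused = rrf_fuse([dense, bm25])
```

Because RRF uses only ranks, the dense and BM25 scores never need to be normalized against each other; the fused list would then be handed to the CrossEncoder for reranking.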

Housing Regression UI – End-to-End

XGBoost, Optuna, MLflow, FastAPI, Gradio, Docker, AWS

End-to-end housing-price regression pipeline built with production ML engineering best practices: time-aware splits, robust preprocessing & feature engineering, XGBoost with Optuna hyperparameter tuning, MLflow experiment tracking and containerized deployment. Includes an interactive Gradio UI and a REST API, plus CI/CD and AWS-ready task definitions.
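
A time-aware split can be sketched as follows. This is an assumption-laden illustration rather than the pipeline's code: the function name `time_aware_split`, the row layout, the cutoff date, and the prices are all made up for the example.

```python
from datetime import date


def time_aware_split(rows, cutoff):
    """Split rows so every training row predates every test row.

    Rows dated before the cutoff go to train, the rest to test, which
    avoids leaking future information into training.
    """
    train = [r for r in rows if r["date"] < cutoff]
    test = [r for r in rows if r["date"] >= cutoff]
    return train, test


# Made-up sales records used only to exercise the function.
rows = [
    {"date": date(2021, 3, 1), "price": 210_000},
    {"date": date(2022, 7, 15), "price": 245_000},
    {"date": date(2023, 1, 10), "price": 260_000},
    {"date": date(2023, 9, 5), "price": 275_000},
]
train, test = time_aware_split(rows, cutoff=date(2023, 1, 1))
```

A random shuffle would mix later sales into the training set, so a chronological cutoff gives a more honest estimate of how the model performs on future listings.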

Healthcare Analytics - Heart Disease Prediction

Python, Pandas, NumPy, Matplotlib, Seaborn, Jupyter, Feature Engineering, Statistical Analysis

Developed classification models using Random Forest, SVM, and Neural Networks on UCI heart disease dataset. Achieved 87% accuracy through feature engineering and hyperparameter optimization. Implemented cross-validation and ROC analysis for model validation.

Automotive Pricing Intelligence - Car Sales Prediction

Python, Scikit-Learn, Ensemble Methods

Created regression models predicting vehicle prices using ensemble methods. Processed 15K+ vehicle records with feature engineering on categorical and numerical data. Reduced prediction error by 23% through advanced feature selection techniques.

Industrial Equipment Valuation - Bulldozer Price Forecasting

Python, Time-Series Regression

Built time-series regression models for heavy equipment auction price prediction. Achieved RMSE reduction of 15% compared to baseline linear models. Implemented model retraining pipeline for continuous learning.

Computer Vision - Dog Breed Classification System

TensorFlow, CNN, Transfer Learning, ResNet

Developed CNN using TensorFlow and Transfer Learning with pre-trained ResNet models. Achieved 85% classification accuracy across 120 dog breeds. Created web interface for real-time image classification.

LightDrift - Astronomical Data Processing

Python, Gaia DR3, SDSS, Data Visualization, Statistical Models

Developed Python scripts for processing Gaia DR3 and SDSS astronomical datasets. Created data visualization tools for stellar motion analysis and cosmic event pattern recognition. Implemented statistical models for anomaly detection in large-scale astronomical data.

Code Generator AI

AWS Bedrock, FastAPI, S3, Lambda

Engineered a serverless AI code generation tool using Anthropic Claude models via AWS Bedrock, integrated with FastAPI for real-time API access.

Image Generation AI

AWS Bedrock, FastAPI, S3

Developed an AI-powered image generation tool using foundation models via AWS Bedrock, enabling rapid creation of high-quality visuals.

Summarization AI

AWS Bedrock, FastAPI, S3, Lambda

Built a serverless text summarization tool using Anthropic Claude models via AWS Bedrock, integrated with FastAPI for API-driven access.

Key Achievements

  • Delivered 8+ production ML systems end-to-end
  • Cut model training time by 35% through pipeline optimization
  • Built reliable AWS ML infrastructure with high availability and sub-second inference
  • Maintained >95% accuracy across deployed models in multiple domains

Experience & Education

2020 – Present

Freelance Machine Learning Engineer

Self-Employed

  • Built and deployed production ML models using TensorFlow, PyTorch, and Scikit-Learn for predictive analytics
  • Designed and implemented RAG systems of varying complexity, combining retrieval techniques with LLMs for enhanced context-aware responses
  • Developed Agentic AI solutions using LangChain and LangSmith, integrated with vector and graph databases (Pinecone, GraphDB) and AI platforms (Cohere, OpenAI, Hugging Face)
  • Optimized model hyperparameters with Optuna's Bayesian optimization, improving performance by 35%
  • Implemented MLflow for experiment tracking, model versioning, and deployment pipeline management
  • Developed FastAPI microservices with custom UIs for model inference and real-time predictions
  • Containerized applications with Docker and automated CI/CD pipelines using GitHub Actions
  • Deployed scalable cloud solutions on AWS (SageMaker, Lambda, S3) with sub-second inference times
  • Engineered various AI tools powered by transformer architectures and open-source models from Hugging Face, Kaggle, and Keras

2021 – 2025

Data Systems Specialist

Conduent | Enterprise Data Management & Automation, Iași, Romania

  • Automated data processes with Python, reducing manual effort by 30%
  • Built GDPR-compliant ETL pipelines for sensitive authentication data
  • Created reporting dashboards for KPI tracking across departments
  • Partnered with cross-functional IT and business teams to align reporting with operational needs

2013 – Present

Freelance Audio Engineer

Self-Employed, International

  • Developed audio plugins using JUCE modules and C++
  • Performed internationally with hardware synthesizers
  • Created custom audio projects and effects

Psychology

Alexandru Ioan Cuza University, Iași

AWS Bedrock: Build & Scale Generative AI

Amazon Web Services

AI Engineering Bootcamp

Zero To Mastery

Machine Learning & Data Science Bootcamp

Zero To Mastery

Prompt Engineering for Developers

Zero To Mastery

Complete Python Developer

Zero To Mastery

Hobbies & Interests

🔭 Astronomy
🧠 Neuroaesthetics
🎨 3D Visualization
🔬 Scientific Computing
🧪 Health Technology