Senior ML / Data Engineer
Machine learning · MLOps · Real-time systems · Co-founder
Engineer with 15+ years of experience building data-intensive systems, from large-scale distributed pipelines to real-time ML inference. Comfortable across the full stack — from data crunching and model training to production deployment, observability, and mobile apps. Co-founded MadCap, a live tracking platform for ultra-cycling events. Passionate about sport, performance engineering, and making things run at scale.
Technical Skills
Machine Learning & Data
- Python data stack (numpy, scipy, matplotlib, pandas, scikit-learn)
- TensorFlow, diffusion models (production deployment)
- NLP, NLTK
- MLOps: CI/CD, monitoring, observability, alerting
Languages
- Python / C++ (proficient)
- TypeScript (React Native / Expo)
- Java, Shell Script, Erlang, Ruby, … (strong knowledge)
Infrastructure & Cloud
- Cloud: GCP (GKE, Pub/Sub, Cloud Run, Bigtable), AWS (EC2, S3, SQS, Lambda)
- Docker, Kubernetes (GKE), CI/CD (GitHub Actions)
- PostgreSQL, MySQL, MongoDB, Cassandra, Bigtable
- RabbitMQ, Redis, Elasticsearch, Pub/Sub
- FastAPI, real-time & distributed systems, batch processing
Visualization
- Kepler.gl (geospatial), Matplotlib, custom dashboards
Professional Experience
MadCap
2023–present
Co-founder — Real-time tracking platform for ultra-cycling events
MadCap provides turnkey real-time GPS tracking for ultra-cycling races. Co-founded with Hugo, a fellow cyclist and engineer; we both raced the TCR and Three Peaks Bike Race before building the tool we wished existed. The platform pairs a dedicated GPS device with mobile (iOS/Android) and web apps, so riders can be tracked independently of their phone while spectators ("dotwatchers") follow live.
Stack:
- IoT ingestion pipeline processing data points from thousands of concurrent GPS devices.
- Pub/Sub based real-time processing (Python) feeding a PostgreSQL store; deployed on GKE.
- Python / FastAPI backend serving a React Native (Expo) app — exported as a web app too.
- Geospatial visualizations with Kepler.gl; 3D terrain rendering.
Highlights:
- Full product ownership: hardware integration, backend, mobile app, web app, ops.
- Features: live tracking, time-rewind, group filtering, shareable links, 3D terrain view.
- Scales to peaks of 1,000 simultaneous users, with up to 50,000 users on major events.
Freelance (auto-entrepreneur)
2021–2023
Voggt:
- Real-time re-encoding of videos to landscape format for live streaming distribution (YouTube Live, Facebook, Twitch).
- Data analysis and user behaviour tracking.
Yokai:
- MLOps: ML backend development, CI/CD, monitoring, observability, alerting.
- Production deployment of diffusion models: scaling, error handling, queue management, logging, out-of-memory (OOM) handling.
Yubo
2017–2021
Lead Data Scientist — Social live-streaming app
Yubo is a social app to meet new friends around the world via live video streams.
Projects:
- Define, develop, and deploy a system to automatically flag profile pictures and video streams that violate platform rules (TensorFlow).
- Improve live streaming quality by exploring various approaches (WebRTC, network stack).
- Prototype, develop, and deploy a new version of the swipe feature (10k req/s on an Elasticsearch cluster).
- Analyse all application data (billions of rows) to improve usage.
- Model revenue given usage to help fundraising.
- Plan and execute CIR (Crédit d'Impôt Recherche, the French research tax credit) projects, and write the corresponding report.
Management:
- Create, recruit, and manage a team of 4 data scientists.
- Define two new positions, prepare technical tests, and participate in two recruitments.
Zento
2016–2017
Data Scientist — Brand insights at sporting events
Zento analysed finish-line pictures from mass sporting events to generate shoe brand market share reports for running brands.
Project:
- Detect and count shoes in race pictures (TensorFlow).
- Link shoe brand & model with bib number; segment by time, age, gender.
- Deliver reports via a web app (Django, Angular, Elasticsearch).
C-Radar
2012–2016
Data Engineer — B2B predictive marketing
C-Radar links company registry data to their websites to enrich prospect lists and predict new prospects via machine learning.
Project:
- Collect, normalize, analyze, and output data for client companies (Python, MongoDB, RabbitMQ, Cassandra, PostgreSQL, Elasticsearch).
- Web, data, and text mining; web scraping; machine learning (scikit-learn): classification, regression, clustering.
- Distributed system, scalability, batch processing — millions of pages crawled daily, billions of company data points processed.
- System administration, DevOps (Salt), pre-sales (proposals, specifications).
Plizy
2011–2012
Data Scientist — Video recommendation
Project:
- Build recommendation systems (similar videos, user-based and item-based) using clustering, data mining, and machine learning.
- Python, MySQL, Cassandra, MongoDB, Redis, Hadoop, Elasticsearch (distributed, real-time, large-scale).
- 10M Facebook profiles and 400M Facebook likes scraped via the Real-Time API.
Management:
- Recruit and manage another data scientist.
Twenga
2008–2011
Data Engineer — Automatic product extraction from retail websites
Project:
- Develop a tool to extract product name, price, description, category, and picture from retail websites using structural and semantic analysis.
- C / C++, Shell Script, Python, MySQL — distributed, high-performance.
- 200,000 websites crawled, 300M products extracted.
Google
2008
6-month internship — Text orientation, script & language detection (Tesseract)
Project:
- Built tools to detect orientation, script, and language in images using Tesseract (open-source OCR) and clustering / energy-minimization techniques (C / C++).
Education
EPITA — Master of Science in Computer Science and Engineering
2005–2008
Specialization in Scientific Computing and Image Processing. Le Kremlin-Bicêtre, France.
LRDE — EPITA Research and Development Laboratory
2005–2008
Student researcher on Decision Diagrams distribution and generic Decision Diagrams library design.
High school diploma (Baccalauréat S)
2003
Major in Mathematics with honors.
Publications
polyDD: Towards a Framework Generalizing Decision Diagrams
ACSD 2010 — doi:10.1109/ACSD.2010.17
Decision Diagrams and Homomorphisms & Distribution
Technical reports, LRDE, EPITA — 2006–2008.
Miscellaneous
Languages
- French: native.
- English: fluent (TOEIC: 850).
Hobbies
- Sports: long-distance cycling, triathlon, swimming, running.
- Reading: mainly Sci-Fi.
- Cooking.