Senior ML / Data Engineer

Machine learning · MLOps · Real-time systems · Co-founder

Engineer with 15+ years of experience building data-intensive systems, from large-scale distributed pipelines to real-time ML inference. Comfortable across the full stack — from data crunching and model training to production deployment, observability, and mobile apps. Co-founded MadCap, a live tracking platform for ultra-cycling events. Passionate about sport, performance engineering, and making things run at scale.

Technical Skills

Machine Learning & Data

Languages

Infrastructure & Cloud

Visualization

Professional Experience

MadCap

2023–present

Co-founder — Real-time tracking platform for ultra-cycling events

MadCap provides turnkey real-time GPS tracking for ultra-cycling races. Co-founded with Hugo; both founders are cyclists and engineers who raced the TCR and the Three Peaks Bike Race before building the tool they wished existed. The platform pairs a dedicated GPS device with mobile (iOS/Android) and web apps, letting riders be tracked independently of their phone while spectators ("dotwatchers") follow live.

Stack:
  • IoT ingestion pipeline processing data points from thousands of concurrent GPS devices.
  • Pub/Sub-based real-time processing (Python) feeding a PostgreSQL store; deployed on GKE.
  • Python / FastAPI backend serving a React Native (Expo) app — exported as a web app too.
  • Geospatial visualizations with Kepler.gl; 3D terrain rendering.
Highlights:
  • Full product ownership: hardware integration, backend, mobile app, web app, ops.
  • Features: live tracking, time-rewind, group filtering, shareable links, 3D terrain view.
  • Scaled to peaks of 1,000 simultaneous users, with up to 50,000 users during major events.

Freelance (auto-entrepreneur)

2021–2023
Voggt:
  • Real-time re-encoding of videos to landscape format for live streaming distribution (YouTube Live, Facebook, Twitch).
  • Data analysis and user behaviour tracking.
Yokai:
  • MLOps — ML backend development, CI/CD, Monitoring / Observability / Alerting.
  • Production deployment of diffusion models: scaling, error handling, queue management, logging, OOM management.

Yubo

2017–2021

Lead Data Scientist — Social live-streaming app

Yubo is a social app to meet new friends around the world via live video streams.

Projects:
  • Define, develop, and deploy a system to automatically flag profile pictures and video streams that violate platform rules (TensorFlow).
  • Improve live streaming quality by exploring various approaches (WebRTC, network stack).
  • Prototype, develop, and deploy a new version of the swipe feature (10k req/s on an Elasticsearch cluster).
  • Crunch all data (billions of rows) to improve application usage.
  • Model revenue given usage to help fundraising.
  • Plan, execute, and write the report for CIR (French research tax credit) projects.
Management:
  • Build, recruit, and manage a team of four data scientists.
  • Define two new positions, prepare technical tests, and take part in both hires.

Zento

2016–2017

Data Scientist — Brand insights at sporting events

Zento analysed finish-line pictures from mass sporting events to generate shoe brand market share reports for running brands.

Project:
  • Detect and count shoes in race pictures (TensorFlow).
  • Link shoe brand & model with bib number; segment by time, age, gender.
  • Deliver reports via a web app (Django, Angular, Elasticsearch).

C-Radar

2012–2016

Data Engineer — B2B predictive marketing

C-Radar links company registry data to company websites to enrich prospect lists and predict new prospects via machine learning.

Project:
  • Collect, normalize, analyze, and export data for client companies (Python, MongoDB, RabbitMQ, Cassandra, PostgreSQL, Elasticsearch).
  • Web, data, and text mining; web scraping; machine learning (scikit-learn): classification, regression, clustering.
  • Distributed system, scalability, batch processing — millions of pages crawled daily, billions of company data points processed.
  • System administration, DevOps (Salt), pre-sales (proposals, specifications).
Management:
  • Supervise two interns.

Plizy

2011–2012

Data Scientist — Video recommendation

Project:
  • Build recommendation systems (similar videos, user-based and item-based) using clustering, data mining, and machine learning.
  • Python, MySQL, Cassandra, MongoDB, Redis, Hadoop, Elasticsearch; distributed, real-time, large-scale.
  • 10M Facebook profiles and 400M Facebook likes scraped via the Real-Time API.
Management:
  • Recruit and manage another data scientist.

Twenga

2008–2011

Data Engineer — Automatic product extraction from retail websites

Project:
  • Develop a tool to extract product name, price, description, category, and picture from retail websites using structural and semantic analysis.
  • C / C++, Shell Script, Python, MySQL — distributed, high-performance.
  • 200,000 websites crawled, 300M products extracted.

Google

2008

6-month internship — Text orientation, script & language detection (Tesseract)

Project:
  • Build tools to detect orientation, script, and language in images using Tesseract (open-source OCR) and clustering / energy-minimization techniques (C / C++).

Education

EPITA — Master of Science in Computer Science and Engineering

2005–2008

Specialization in Scientific Computing and Image Processing. Le Kremlin-Bicêtre, France.

LRDE — EPITA Research and Development Laboratory

2005–2008

Student researcher on Decision Diagrams distribution and generic Decision Diagrams library design.

High school diploma (Baccalauréat S)

2003

Major in Mathematics with honors.

Publications

polyDD: Towards a Framework Generalizing Decision Diagrams

ACSD 2010 — doi:10.1109/ACSD.2010.17

Decision Diagrams and Homomorphisms & Distribution

Technical reports, LRDE, EPITA — 2006–2008.

Miscellaneous

Languages

Hobbies