My textbook Introduction au Machine Learning, aimed at engineering and master's students, has just been published by Dunod.

My ANR proposal for a project on "Methods for discovering SNP Combinations Associated with a PHEnotype from genome-wide data" (SCAPHE) was accepted! The project will run from January 2019 to December 2021.

Our proposal for a "Machine Learning Frontiers in Precision Medicine" Innovative Training Network was funded by the European Commission! I'm looking forward to working with 13 other European labs and training young researchers.

I am a researcher at the Centre for Computational Biology (CBIO) of ARMINES/MINES ParisTech, Institut Curie and INSERM (which are all part of PSL Research University). My research interests revolve around machine learning techniques for therapeutic research.

Research interests

I focus on the development of methods for efficient multi-locus biomarker discovery. Essentially, my goal is to make sense of data with a small number of samples and a large number of variables. These variables can be clinical variables (such as age, cholesterol levels or smoking history) or genetic variables (such as gene expression, mutations, or epigenetic markers). How can we find out which of them play a role in a particular biological process or pathology? My work has numerous applications, in particular in precision medicine, where we try to develop treatments that are adapted to the (genetic) specificities of patients, by contrast with a classical one-size-fits-all approach.

I am interested in the incorporation of additional (structured) information, for example as biological networks; in multi-task approaches, where one addresses multiple related problems simultaneously; and in the development of fast but accurate techniques to address these issues. In terms of machine learning, a lot of my work is linked to structured sparsity. This has led for example to the development of SConES (Selecting CONected Explanatory SNPs), a method for network-guided multi-locus association mapping based on graph cuts.

I am also working on projects involving the analysis of several types of biological networks, differential privacy, and the prediction of molecule-protein interactions.


Short bio

I previously worked in chemoinformatics and drug design (more particularly, virtual high-throughput screening) during my PhD at UC Irvine with Pierre Baldi. I switched my focus to the other side of the coin during my postdoctoral stay at the Max Planck Institutes for Developmental Biology and Intelligent Systems in Tübingen, where I worked with Karsten Borgwardt as a member of the Machine Learning and Computational Biology (MLCB) research group on methods for genome-wide association studies. I joined CBIO in December 2013.

Women in Machine Learning and Data Science

I am the co-founder of the Parisian chapter of Women in Machine Learning and Data Science. We host meetups where all invited speakers identify as female. People of every gender are welcome to attend! We also have a public Slack channel.

Are you trying to organize a gender-balanced machine learning event, but find yourself unable to find women speakers? WiML has a great list of women active in machine learning. You may also be interested in Request a Woman Scientist and, for French speakers, in Les Expertes.

Students


  • Weiyi Zhang
    Convolutional neural networks for multiplex biological networks
    Intern at Institut Imagine. Joint supervision with Antonio Rausell.
  • Christophe Le Priol
    Systemic analysis of the micro-RNAs involved in epithelial cancers
    PhD student at CEA since January 2016.
    Joint supervision with Xavier Gidrol.
  • Héctor Climente González
    Integrating structural constraints in multi-locus genome-wide association studies
    PhD student since October 2016.
    Joint supervision with Véronique Stoven.
  • Lotfi Slim
    Detection of epistasis in genome-wide association studies with machine learning methods for biomarkers and therapeutic target identification
    PhD student since December 2016.
    Joint supervision with Jean-Philippe Vert and Clément Chatelain.

Teaching



I give lectures on bioinformatics, machine learning and drug discovery at several institutions, including MINES ParisTech, Centrale Paris and Paris-Diderot. I also teach online courses (in French) at OpenClassrooms. See my teaching page for more details.