Languages spoken: French (native), English (fluent), German (conversational).

Research Experience

2013 — current: Researcher
ARMINES/Mines Paris, Institut Curie and INSERM, Paris (France)
Centre for Computational Biology

2011 — 2013: Research Scientist
Max Planck Institutes, Tübingen (Germany)
Machine Learning and Computational Biology research group, headed by Karsten Borgwardt
Statistical methods and machine learning for GWAS analysis, epistasis detection, disease gene prediction.

2009: Summer Research Intern
IBM R&D Labs Tel-Aviv (Israel)
Statistical analysis of SNP data for the HyperGenes project, under the direction of Michal Rosen-Zvi.

2005 — 2010: Graduate Student Researcher
UC Irvine, Irvine, CA (United States)
Advisor: Pierre Baldi
Prediction of molecular properties (kernel methods), virtual high-throughput screening (kernel methods, neural networks), prediction of chemical reactions, docking.

2005: MSc Research Intern
Ecole des Mines de Paris, Fontainebleau (France)
Advisor: Jean-Philippe Vert
Implementation and validation of kernels for protein sequences, web interface for kernel testing.

Teaching Experience

Online lectures

Data Scientist and Machine Learning Engineer paths at OpenClassrooms.

Lecturer

2014 — current Mines ParisTech and Paris Sciences et Lettres (France)
I'm involved in various courses in computer science, applied mathematics and bioinformatics. See my Teaching page for details.

2015 — 2018 Centrale Paris (France)
Foundations of Machine Learning. See my Teaching page for details.

2012 — 2013 Eberhard Karls Universität Tübingen (Germany)
Data Mining in der Bioinformatik (2012, 2013).

Teaching Assistant

2019 — current Mines ParisTech
I'm involved in several courses as a teaching assistant, including Numerical Tools for Mathematics (essentially about scientific Python, Git and Jupyter) and Probabilities. I also participate in oral examinations for short research or computational projects.

2012 — 2013 Eberhard Karls Universität Tübingen (Germany)
Seminar Bioinformatik (2012, 2013).

2006 — 2008 UC Irvine, Irvine, CA (United States)
Introduction to Probabilities and Statistics for Computer Science (2008).
Introduction to Artificial Intelligence (2006).

Supervision

Postdoctoral researchers

2022 — current Gwenaëlle Lemoine, working on network-guided analysis of whole genome biobank data.

2021 — 2022 Adeline Fermanian, working on high-dimensional inference in genome-wide association data.

2020 — 2021 Vivien Goepp, working on the detection of complex epistasis patterns in genome-wide association data.

2018 — 2020 Antoine Recanati, working on classification of diseases from electronic health records (EHRs), automated patient pre-screening for clinical trials, and privacy-preserving training of distributed models.

PhD students

2022 — current Gwenn Guichaoua, Development of machine-learning approaches for identification of a molecular signature, therapeutic targets, and drugs for ATIP3-low Triple Negative Breast Cancer (PSL Mines Paris, joint supervision with Olivier Collier, Clara Nahmias and Véronique Stoven).

2019 — 2022 Élise Dumas, Evaluation of the interactions between comedications and recurrence-free survival in breast cancer from SNDS data (Paris-Saclay, joint supervision with Fabien Reyal).

2019 — 2022 Marc Michel, Determining the potential of methylation of circulating tumor DNA as a pan-cancer biomarker (Paris-Saclay, joint supervision with Charlotte Proudhon).

2019 — current Ndèye Maguette Mbaye, Learning from multi-modal data to improve cancer treatment (PSL Mines Paris).

2019 — 2022 Asma Nouira, Stable feature selection for genome-wide association studies (PSL Mines ParisTech, joint supervision with Véronique Stoven).

2016 — 2020 Lotfi Slim, Detection of epistasis in genome-wide association studies with machine learning methods for biomarkers and therapeutic target identification (PSL Mines ParisTech, joint supervision with Jean-Philippe Vert and Clément Chatelain).

2016 — 2020 Héctor Climente González, Integrating structural constraints in multi-locus genome-wide association studies (PSL Mines ParisTech, joint supervision with Véronique Stoven).

2016 — 2019 Christophe Le Priol, Systemic analysis of the micro-RNAs involved in epithelial cancers (Grenoble Alpes, joint supervision with Xavier Gidrol).

2014 — 2017 Vìctor Bellòn, Adverse drug reaction discovery (PSL Mines ParisTech, joint supervision with Véronique Stoven).

MSc students

2021 Sophia Chirrane, Feature selection for biomarker discovery: a novel graph-guided approach to classification. (Joint supervision with Vivien Goepp.)

2021 Gwenn Guichaoua, Machine learning and systems biology to identify therapeutic targets for triple negative breast cancer. (Joint supervision with Véronique Stoven.)

2019 Thibaud Martinez, Development of a graph-convolutional neural network approach for the identification of candidate disease genes in multiplex biological networks. (Joint supervision with Antonio Rausell at Institut Imagine).

2018 Weiyi Zhang, Convolutional neural networks for multiplex biological networks (Joint supervision with Antonio Rausell at Institut Imagine).

2016 Athénaïs Vaginay, Multi-phenotype identification of biomarkers in a biological network (BiB Paris-Diderot Intern).

2013 — 2014 Udo Gieraths, Machine Learning for identification of autosomal recessive genomic variants (Joint supervision with Karsten Borgwardt and Hilal Kazan, Eberhard Karls Universität Tübingen).

2012 — 2013 Fabian Aicheler, Improving the functional annotation of genomic variants via machine learning (Joint supervision with Karsten Borgwardt, Eberhard Karls Universität Tübingen).

2011 — 2012 Valeri Velkov, Mining correlated loci at a genome-wide scale (Joint supervision with Karsten Borgwardt, Eberhard Karls Universität Tübingen).

Interns

2022 Sylvain Caillaud. Investigating the genomic features underpinning the spreading of DNA methylation from transposable element (TE) sequences into adjacent regions in Arabidopsis thaliana (M1, joint supervision with Pierre Baduel and Vincent Colot at ENS).

2021 — 2022 Antoine Poirier. Stability of GWAS methods (M1, joint supervision with Asma Nouira at Mines ParisTech).

2020 — 2021 Paul Dhalluin and Léopold Moeneclaey. Machine learning identification of therapeutic targets of drugs active against breast cancer cell lines (M1, joint supervision with Véronique Stoven at Mines ParisTech).

2020 — 2021 Pierre-François Saunier. Combining GWAS and biological networks in breast cancer (M1, joint supervision with Héctor Climente-González at RIKEN, Japan, and Asma Nouira at Mines ParisTech).

2018 Victor Sorreau. Machine learning and prediction of variant-induced RNA splicing defects (L3, joint supervision with Alexandra Martins at UFR Médecine et Pharmacie, Rouen).

2018 Liyang Sun. Convolutional neural networks for protein representation (Projet Innovation at CentraleSupelec, joint supervision with Benoît Playe at MINES ParisTech).

2017 Adrien Galamez, Paul Magon de la Villehuchet, Olivier Pham and Manon Revel. Identifying recurrence in electronic health records (Projet Innovation at CentraleSupelec, joint supervision with Jean-Philippe Vert at MINES ParisTech and Julien Guérin at Institut Curie).

2015 Killian Poulaud, Multitask feature selection in a graph (Supinfo Intern).

2014 Jean-Daniel Granet, Development and parallelization of the SConES tool for graph-guided GWAS (42 Intern).

Education

2020: Habilitation à Diriger des Recherches
Sorbonne Université, Paris (France)
Machine learning tools for biomarkers discovery.

2010: PhD in Computer Science
UC Irvine, Irvine, CA (United States)
Advisor: Pierre Baldi
Statistical data mining and machine learning for chemoinformatics and drug discovery.

2005: MSc in Mathematics and Computer Science
ENST Bretagne (now IMT Atlantique) (France)
Specialization: Software and Formal Methods.

2005: Master of Engineering
ENST Bretagne (now IMT Atlantique) (France)
Specialization: Computer Science for Telecommunications.

Research funding

Advanced statistical machine learning methods for determining genotype-phenotype associations from genome-wide biobank data. Collaboration with Janssen Research & Development, 2022 — 2024.

STEVE: Advancing genotype to phenotype Studies by considering Transposable Elements Variability and Epivariability. ANR PRC, 2021 — 2025 (~250k€, part of a 530k€ project between two labs).

PRAIRIE Springboard Chair (~186k€) since 2019.

MLFPM: Machine Learning Frontiers in Precision Medicine. H2020 Innovative Training Network, 2019 — 2023 (~250k€, part of a 3.6M€ project involving 14 labs).

SCAPHE: Methods for discovering SNP Combinations Associated with a PHEnotype from genome-wide data. ANR JCJC, 2019 — 2022 (~250k€).

Training distributed models. Collaboration with SANCARE, 2018 — 2020.

Machine learning for genome-wide association studies. Collaboration with SANOFI, 2016 — 2019.

Scholarships & Awards

2021: Young AI Woman Engineer award.

2014: Second-best performing team
Phase I of the DREAM 8.5 Rheumatoid Arthritis Responder Challenge.

2013: Second-best performing team
Subchallenge 2 of the DREAM 8 NIEHS-NCATS-UNC DREAM Toxicogenetics Challenge.

July 2013: Travel fellowship from ISCB to attend ISMB 2013.

June 2013: Travel fellowship from JOBIM 2013 to attend the conference.

July 2012: Young researcher participant of the Lindau Nobel Laureate Meeting.

2011—2013: Alexander von Humboldt Research Fellowship.

2009—2010: IBM PhD Fellowship.

2009: CINF-Symyx Scholarship for Scientific Excellence.

2008—2010: Honorary Pre-Doctoral Trainee
Biomedical Informatics Training Program
UC Irvine Institute for Genomics and Bioinformatics.

2007: First Prize and Invited Presentation
Agnostic Learning vs. Prior Knowledge Challenge at IJCNN 2007 (HIVA dataset).

2006: UC Irvine School of Information and Computer Science Scholarship
to attend the Grace Hopper Celebration for Women in Computing.

2005—2009: Ted & Janice Smith Graduate Fellowship.

Professional Services

Professional Societies

Meetings organized

PhD defense committees

PhD thesis committees (comités de suivi de thèse)

  • 2021: Nicolas Captier, Institut Curie; Éric Daoud, Institut Curie; Élie El Hachem, Sorbonne Université; Antoine Favier, Institut Imagine; Pelin Gündoğdu, University of Sevilla; Arthur Imbert, Mines ParisTech; Surabhi Jagtap, Université Paris-Saclay; Tristan Lazard, Mines ParisTech; Matthieu Najm, Mines ParisTech; Philippe Pinel, Mines ParisTech; Rime Raissouni, LMU; Clémence Réda, Université de Paris; Antoine Villié, Lyon I.
  • 2020: Antoine Favier, Institut Imagine; Pelin Gündoğdu, University of Sevilla; Arthur Imbert, Mines ParisTech; Rime Raissouni, LMU; Antoine Villié, Lyon I.
  • 2019: Romain Ménégaux, Mines ParisTech.
  • 2018: Dexiong Cheng, Université Grenoble Alpes; Vincent Cabeli, Institut Curie; Judith Abecassis, Mines ParisTech.

Recruitment committees

  • Jury d'admissibilité MCF: ENSIMAG 2018, Montpellier 2021.
  • Jury d'admissibilité CR2 & CR1, Inria Bordeaux — Sud-Ouest 2016.

Reviews and Program Committees

Editor: I'm an editor for Computo. I'm a regular guest editor for PLoS Computational Biology.

Referee: AOAS, BMC Bioinformatics, Briefings in Bioinformatics, Bioinformatics, Frontiers in Bioinformatics, IJCV, IEEE JBHI, JCIM, Journal of Chemoinformatics, JMLR, KAIS, Molecular Biosystems, Molecular Oncology, Nature Genetics, Neural Networks, New Journal of Chemistry, IEEE TCBB, IEEE TPAMI.

More on my Publons profile.

Nowadays I only review for RoMEO green journals.

I used to be a member of the F1000 Prime "Bioinformatics, Biomedical Informatics & Computational Biology" Faculty, under the Cataloguing & Benchmarking Computational Methods Section. I was June 2017's featured Faculty Member of the Month.

Conference reviewer: BCB (2017); ECML/PKDD (2013, 2014, 2015); CD-MAKE (2017); CAp (2018, 2020, 2021); ICLR (2021); ICML (2013, 2015, 2017, 2018); ISMB (2021, 2022); JOBIM (2018); KDD (2016); MLCB (2014, 2021); MOD (2015, 2016, 2017); NeurIPS (2013, 2014, 2017); PSB-PM (2015, 2016, 2018); SciPy (2015, 2018); SIMBAD (2015); WiML (2017).

Area/Program chair: NeurIPS 2016, WiML 2017, la Conférence d'Apprentissage (CAp) 2018, JOBIM 2018 and 2019, SMPGD 2020.

Publication chair: AISTATS 2018.

Reviewer for funding agencies: ANR (France), BSF (US-Israel), CQDM (Canada), FRSQ (Canada), FWO (Belgium), Ligue contre le cancer (France), NSERC (Canada), Mitacs (Canada).

Outreach

2021 Took part in an event organized by Cahier Vert.

2019—2020 Scientific advisory board of Objectif IA, a course (in French) about artificial intelligence aimed at the general public, with no prerequisites.

2020 Reviewed the chapter on Artificial Intelligence for Hatier's high-school science textbook.

Since 2017 Cofounder of the Paris branch of Women in Machine Learning and Data Science. In 2019 Paris WiMLDS won the Women in Tech Challenge in the "Science and Research" category.

2018 Jury member at the young mathematicians' tournament TFJM² (Strasbourg and National).

2017 Jury member at the young mathematicians' tournament TFJM² (Strasbourg).

2014 Participated in a Women in Mathematics awareness day (aimed at high-school students), « Filles et maths : une équation lumineuse ».

Media

Successful Applications