Languages spoken: French (native), English (fluent), German (conversational)

Research Experience

2013—current: Researcher
ARMINES/Mines ParisTech, Institut Curie and INSERM, Paris (France)
Centre for Computational Biology

2011—2013: Research Scientist
Max Planck Institutes, Tübingen (Germany)
Machine Learning and Computational Biology research group, headed by Karsten Borgwardt
Statistical methods and machine learning for GWAS analysis, epistasis detection, disease gene prediction.

2009: Summer Research Intern
IBM R&D Labs Tel-Aviv (Israel)
Statistical analysis of SNP data for the HyperGenes project, under the direction of Michal Rosen-Zvi.

2005—2010: Graduate Student Researcher
UC Irvine, Irvine, CA (United States)
Advisor: Pierre Baldi
Prediction of molecular properties (kernel methods), virtual high-throughput screening (kernel methods, neural networks), prediction of chemical reactions, docking.

2005: MSc Research Intern
Ecole des Mines de Paris, Fontainebleau (France)
Advisor: Jean-Philippe Vert
Implementation and validation of kernels for protein sequences, web interface for kernel testing.

Teaching Experience

Online lectures

Data Scientist path at OpenClassrooms.

Lecturer

2014—current Mines ParisTech (France)
I'm involved in various courses in computer science, applied mathematics and bioinformatics. See my Teaching page for details.

2015—2018 Centrale Paris (France)
Foundations of Machine Learning.

2012—2013 Eberhard Karls Universität Tübingen (Germany)
Data Mining in der Bioinformatik (2012, 2013).

Teaching Assistant

2019—current Mines ParisTech
I'm involved in several courses as a teaching assistant, including Numerical Tools for Mathematics (essentially about scientific Python, Git and Jupyter) and Probabilities. I also participate in oral examinations for short research or computational projects.

2012—2013 Eberhard Karls Universität Tübingen (Germany)
Seminar Bioinformatik (2012, 2013).

2006—2008 UC Irvine, Irvine, CA (United States)
Introduction to Probabilities and Statistics for Computer Science (2008).
Introduction to Artificial Intelligence (2006).

Supervision

Postdoctoral researchers

2020—current Vivien Goepp, working on the detection of complex epistasis patterns in genome-wide association data.

2018—2020 Antoine Recanati, working on classification of diseases from electronic health records (EHRs), automated patient pre-screening for clinical trials, and privacy-preserving training of distributed models.

PhD students

2019—current Élise Dumas, Evaluation of the interactions between comedications and recurrence-free survival in breast cancer from SNDS data (Paris-Saclay, joint supervision with Fabien Reyal).

2019—current Marc Michel, Determining the potential of methylation of circulating tumor DNA as a pan-cancer biomarker (Paris-Saclay, joint supervision with Charlotte Proudhon).

2019—current Ndèye Maguette Mbaye, Learning from multi-modal data to improve cancer treatment (PSL Mines ParisTech).

2019—current Asma Nouira, Stable feature selection for genome-wide association studies (PSL Mines ParisTech, joint supervision with Véronique Stoven).

2016—2020 Lotfi Slim, Detection of epistasis in genome-wide association studies with machine learning methods for biomarkers and therapeutic target identification (PSL Mines ParisTech, joint supervision with Jean-Philippe Vert and Clément Chatelain).

2016—2020 Héctor Climente González, Integrating structural constraints in multi-locus genome-wide association studies (PSL Mines ParisTech, joint supervision with Véronique Stoven).

2016—2019 Christophe Le Priol, Systemic analysis of the micro-RNAs involved in epithelial cancers (Grenoble Alpes, joint supervision with Xavier Gidrol).

2014—2017 Víctor Bellón, Adverse drug reaction discovery (PSL Mines ParisTech, joint supervision with Véronique Stoven).

MSc students

2021 Sophia Chirrane, Feature selection for biomarker discovery: a novel graph-guided approach to classification. (Joint supervision with Vivien Goepp.)

2021 Gwenn Guichaoua, Machine learning and systems biology to identify therapeutic targets for triple negative breast cancer. (Joint supervision with Véronique Stoven.)

2019 Thibaud Martinez, Development of a graph-convolutional neural network approach for the identification of candidate disease genes in multiplex biological networks. (Joint supervision with Antonio Rausell at Institut Imagine).

2018 Weiyi Zhang, Convolutional neural networks for multiplex biological networks (Joint supervision with Antonio Rausell at Institut Imagine).

2016 Athénaïs Vaginay, Multi-phenotype identification of biomarkers in a biological network (BiB Paris-Diderot Intern).

2013—2014 Udo Gieraths, Machine Learning for identification of autosomal recessive genomic variants (Joint supervision with Karsten Borgwardt and Hilal Kazan, Eberhard Karls Universität Tübingen).

2012—2013 Fabian Aicheler, Improving the functional annotation of genomic variants via machine learning (Joint supervision with Karsten Borgwardt, Eberhard Karls Universität Tübingen).

2011—2012 Valeri Velkov, Mining correlated loci at a genome-wide scale (Joint supervision with Karsten Borgwardt, Eberhard Karls Universität Tübingen).

Interns

2020—2021 Paul Dhalluin and Léopold Moeneclaey. Machine learning identification of therapeutic targets of drugs active against breast cancer cell lines (M1, joint supervision with Véronique Stoven at Mines ParisTech).

2020—2021 Pierre-François Saunier. Combining GWAS and biological networks in breast cancer (M1, joint supervision with Héctor Climente-González at RIKEN, Japan, and Asma Nouira at Mines ParisTech).

2018 Victor Sorreau. Machine learning and prediction of variant-induced RNA splicing defects (L3, joint supervision with Alexandra Martins at UFR Médecine et Pharmacie, Rouen).

2018 Liyang Sun. Convolutional neural networks for protein representation (Projet Innovation at CentraleSupelec, joint supervision with Benoît Playe at MINES ParisTech).

2017 Adrien Galamez, Paul Magon de la Villehuchet, Olivier Pham and Manon Revel. Identifying recurrence in electronic health records (Projet Innovation at CentraleSupelec, joint supervision with Jean-Philippe Vert at MINES ParisTech and Julien Guérin at Institut Curie).

2015 Killian Poulaud, Multitask feature selection in a graph (Supinfo Intern).

2014 Jean-Daniel Granet, Development and parallelization of the SConES tool for graph-guided GWAS (42 Intern).

Education

2010: PhD in Computer Science
UC Irvine, Irvine, CA (United States)
Advisor: Pierre Baldi
Statistical data mining and machine learning for chemoinformatics and drug discovery.

2005: MSc in Mathematics and Computer Science
ENST Bretagne (now IMT Atlantique) (France)
Specialization: Software and Formal Methods.

2005: Master of Engineering
ENST Bretagne (now IMT Atlantique) (France)
Specialization: Computer Science for Telecommunications.

Research funding

PRAIRIE Springboard Chair (~186k€).

MLFPM: Machine Learning Frontiers in Precision Medicine. H2020 Innovative Training Network, 2019 – 2022 (~250k€, part of a 3.6M€ project involving 14 labs).

SCAPHE: Methods for discovering SNP Combinations Associated with a PHEnotype from genome-wide data. ANR JCJC, 2019 – 2021 (~250k€).

Training distributed models. Collaboration with SANCARE, 2018 – 2020.

Machine learning for genome-wide association studies. Collaboration with SANOFI, 2016 – 2019.

Scholarships & Awards

2014: Second-best performing team
Phase I of the DREAM 8.5 Rheumatoid Arthritis Responder Challenge.

2013: Second-best performing team
Subchallenge 2 of the DREAM 8 NIEHS-NCATS-UNC DREAM Toxicogenetics Challenge.

July 2013: Travel fellowship from ISCB to attend ISMB 2013.

June 2013: Travel fellowship from JOBIM 2013 to attend the conference.

July 2012: Young researcher participant of the Lindau Nobel Laureate Meeting.

2011—2013: Alexander von Humboldt Research Fellowship.

2009—2010: IBM PhD Fellowship.

2009: CINF-Symyx Scholarship for Scientific Excellence.

2008—2010: Honorary Pre-Doctoral Trainee
Biomedical Informatics Training Program
UC Irvine Institute for Genomics and Bioinformatics.

2007: First Prize and Invited Presentation
Agnostic Learning vs. Prior Knowledge Challenge at IJCNN 2007 (HIVA dataset).

2006: UC Irvine School of Information and Computer Science Scholarship
to attend the Grace Hopper Celebration for Women in Computing.

2005—2009: Ted & Janice Smith Graduate Fellowship.

Professional Services

Professional Societies

Meetings organized

PhD defense committees

  • 2020: Martina Sundqvist, Stability and selection of the number of groups in unsupervised clustering: application to the classification of triple negative breast cancers, Université Paris-Saclay (France).
  • 2020: Barthélémy Caron, Computational assessment of human non-coding genetic variants with a potential clinical impact, Université de Paris (France) (rapporteuse).
  • 2020: Dexiong Cheng, Structured data modeling with deep kernel machines and applications in computational biology, Université Grenoble-Alpes (France) (rapporteuse).
  • 2020: Jérôme-Alexis Chevalier, Statistical control of sparse models in high dimension, Université Paris-Saclay (France) (rapporteuse).
  • 2020: Ahmed Debit, An in-depth study of random forests methodologies for short biomarker signature discovery, Université de Liège (Belgium).
  • 2020: Anne Jouinot, Marqueurs moléculaires cibles pour le pronostic et le traitement des corticosurrénalomes : de la génomique à la médecine personnalisée, Université de Paris (France).
  • 2019: France Rose, Analysis of phenotypical and spatial cellular heterogeneity from large scale microscopy data, Paris Sciences et Lettres (France).
  • 2019: Alexandre Drouin, Inferring phenotypes from genotypes with machine learning, Université Laval (Canada).
  • 2018: Eugène Ndiaye, Safe optimization algorithms for variable selection and hyperparameter tuning, Université Paris-Saclay (France).
  • 2018: Ramouna Fouladi, From statistical to biological interactions: towards an omics-integrated MB-MDR framework, Université de Liège (Belgium).
  • 2017: Yunlong Jiao, Rank-based molecular prognosis and network-guided biomarker discovery for breast cancer, PSL Mines ParisTech.
  • 2016: Flore Harlé, Multiple change-point detection in multivariate time series: application to the inference of dependency networks, Grenoble Alpes.

PhD thesis committees (comités de suivi de thèse)

  • 2020: Antoine Favier, Institut Imagine; Antoine Villié, Lyon I; Pelin Gündoğdu, University of Sevilla; Rime Raissouni, LMU; Arthur Imbert, Mines ParisTech.
  • 2019: Romain Ménégaux, Mines ParisTech.
  • 2018: Dexiong Cheng, Université Grenoble Alpes; Vincent Cabeli, Institut Curie; Judith Abecassis, Mines ParisTech.

Recruitment committees

  • Jury d'admissibilité MCF, ENSIMAG 2018.
  • Jury d'admissibilité CR2 & CR1, Inria Bordeaux — Sud-Ouest 2016.

Reviews and Program Committees

Referee: AOAS, BMC Bioinformatics, Bioinformatics, IJCV, IEEE JBHI, JCIM, Journal of Chemoinformatics, JMLR, KAIS, Molecular Biosystems, Nature Genetics, Neural Networks, New Journal of Chemistry, IEEE TCBB, IEEE TPAMI.

More on my Publons profile.

Nowadays I only review for RoMEO green journals.

I used to be a member of the F1000 Prime "Bioinformatics, Biomedical Informatics & Computational Biology" Faculty, under the Cataloguing & Benchmarking Computational Methods Section. I was June 2017's featured Faculty Member of the Month.

Conference reviewer: BCB (2017); ECML/PKDD (2013, 2014, 2015); CD-MAKE (2017); CAp (2018, 2020); ICLR (2021); ICML (2013, 2015, 2017, 2018); JOBIM (2018); KDD (2016); MLCB (2014); MOD (2015, 2016, 2017); NeurIPS (2013, 2014, 2017); PSB-PM (2015, 2016, 2018); SciPy (2015, 2018); SIMBAD (2015); WiML (2017).

Area/Program chair: NeurIPS 2016, WiML 2017, la Conférence d'Apprentissage (CAp) 2018, JOBIM 2018 and 2019, SMPGD 2020.

Publication chair: AISTATS 2018.

Reviewer for funding agencies: ANR (France), BSF (US-Israel), CQDM (Canada), FRSQ (Canada), FWO (Belgium), Ligue contre le cancer (France), NSERC (Canada), Mitacs (Canada).

Outreach

2019—2020 Scientific advisory board of Objectif IA, a course (in French) about artificial intelligence aimed at the general public, with no prerequisites.

2020 Reviewed the chapter on Artificial Intelligence for Hatier's high-school science textbook.

Since 2017 Cofounder of the Paris branch of Women in Machine Learning and Data Science. In 2019 Paris WiMLDS won the Women in Tech Challenge in the "Science and Research" category.

2018 Jury member at the young mathematicians' tournament TFJM² (Strasbourg and National).

2017 Jury member at the young mathematicians' tournament TFJM² (Strasbourg).

2014 Participated in a Women in Mathematics awareness day (targeted towards high-schoolers), « Filles et maths : une équation lumineuse ».

Media

Successful Applications