This is a list of tools and working practices I am trying to develop for myself and for the people working under my supervision. They are not set in stone and are meant to evolve with the people and projects involved.
GitHub offers both private and public repositories, and supports free accounts for academics.
- create a GitHub account (and apply for the academic discount to get additional functionalities for free, such as unlimited private repositories)
- prepare their work environment following the Set Up Git guide
- create a private repository for their electronic lab notebook, and share it with me by adding me (chagaz) as a collaborator (see Settings > Collaborators). What I'm interested in is a diary of key points in reverse chronological order, probably in Markdown format, keeping track of specific tasks to accomplish, results, analyses, conclusions, next steps, overall goals, etc., as discussed in our meetings. One can also envision adding Jupyter notebooks to this repository.
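As an illustration only, here is a minimal sketch of how a reverse-chronological Markdown diary could be maintained: each new entry is placed at the top of the notebook text. The function name `prepend_entry`, the section titles, and the heading levels are assumptions made for this example, not a required layout.

```python
from datetime import date


def prepend_entry(notebook_text, entry_date, tasks, results, next_steps):
    """Return the notebook text with a new dated entry placed at the top.

    Parameters
    ----------
    notebook_text : str
        Current content of the Markdown notebook (may be empty).
    entry_date : datetime.date
        Date of the new entry.
    tasks, results, next_steps : list of str
        Key points to record under each (hypothetical) section.

    Returns
    -------
    str
        The updated notebook, newest entry first.
    """
    lines = [f"## {entry_date.isoformat()}", ""]
    sections = (("Tasks", tasks), ("Results", results), ("Next steps", next_steps))
    for title, items in sections:
        lines.append(f"### {title}")
        lines.extend(f"- {item}" for item in items)
        lines.append("")
    # The new entry goes first, so the diary stays in reverse chronological order.
    return "\n".join(lines) + "\n" + notebook_text
```

Reading the notebook file, calling this function, and writing the result back (followed by a commit) would keep the most recent meeting notes at the top of the file.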
The goal is to also use public repositories for code and papers, with one repository per project.
Jupyter notebooks support a variety of programming languages, including R, Scala, Julia, and Python. Notebooks allow you to run commands from a web browser, keep track of the commands you typed and the results you obtained (be they printouts or images), organize them in sections, and introduce, comment on, and annotate your work (with formatting). They are a great way to track your work and produce technical reports, and a good tool for reproducible research.
Resources about reproducible research:
- Reproducible Research
- Posts tagged "reproducibility" at Titus Brown's blog Living in an Ivory Basement
- Ten simple rules for reproducible computational research by Geir Kjetil Sandve, Anton Nekrutenko, James Taylor, and Eivind Hovig.
Documenting code is important for yourself, for your reviewers, and for maximizing impact (by making it easier for others to reuse your work). For each project, we will attempt to respect the following rules:
- The repository contains a README.md file, in Markdown syntax (easily displayed within GitHub and readable as plain text), which describes:
  - The goal of the software
  - Who created it
  - How to contact authors in case of issues
  - How to install it (be specific, list all dependencies)
  - How to use it (give specific examples, document each functionality)
- The repository contains a LICENSE file (plain text). I am partial to the MIT License; check out ChooseALicense for more possibilities.
- Options, parameters, variables, methods, classes must be documented. In Python, we will follow the NumPy style guide as well as PEP 0257 regarding docstrings.
- Each script/program that can be called from the command line must give useful information when called without arguments (or with the
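The two documentation rules above can be sketched as follows: a function documented with a NumPy-style docstring, and a command-line entry point that prints its help message when called without arguments. This is a minimal sketch using the standard `argparse` module; the script name `scale_values.py`, the function names, and the `--factor` option are hypothetical and only serve the example.

```python
import argparse
import sys


def scale_values(values, factor=1.0):
    """Multiply each value by a constant factor.

    Parameters
    ----------
    values : list of float
        Numbers to rescale.
    factor : float, optional
        Multiplicative factor (default: 1.0).

    Returns
    -------
    list of float
        The rescaled numbers, in the same order as `values`.
    """
    return [factor * value for value in values]


def build_parser():
    """Create the command-line parser for this (hypothetical) script."""
    parser = argparse.ArgumentParser(
        prog="scale_values.py",
        description="Multiply a list of numbers by a constant factor.",
    )
    parser.add_argument("values", nargs="+", type=float,
                        help="numbers to rescale")
    parser.add_argument("-f", "--factor", type=float, default=1.0,
                        help="multiplicative factor (default: 1.0)")
    return parser


def main(argv=None):
    """Run the script; print the help message when called without arguments."""
    argv = sys.argv[1:] if argv is None else argv
    parser = build_parser()
    if not argv:
        # No arguments: show usage information instead of failing silently.
        parser.print_help()
        return 1
    args = parser.parse_args(argv)
    print(" ".join(str(v) for v in scale_values(args.values, args.factor)))
    return 0
```

In an actual script, the file would end with the usual `if __name__ == "__main__": sys.exit(main())` guard, so that the module can be both imported and run from the command line.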
We will endeavor to follow PEP 0008. In particular:
- Variable names and comments must be in English.
- Indentations are done by blocks of 4 spaces (and not with tabs).
- For spacing, check out the Pet Peeves section of PEP 0008.
- CamelCase only applies to class names.
- lowercase_with_underscores applies to:
  - package names
  - module names
  - function names
  - method names
  - class instance names
  - variables, parameters, arguments.
- UPPER_CASE_WITH_UNDERSCORES only applies to constant names.
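A short sketch of these naming conventions in practice: CamelCase for the class, UPPER_CASE_WITH_UNDERSCORES for the constant, and lowercase names with underscores for everything else, as PEP 0008 recommends for functions, methods, and variables. All names below (`SequenceRecord`, `gc_content`, `MAX_SEQUENCE_LENGTH`, and so on) are invented for the example.

```python
"""Hypothetical module (it could be saved as gene_stats.py, a lowercase module name)."""

MAX_SEQUENCE_LENGTH = 1000  # constant: UPPER_CASE_WITH_UNDERSCORES


class SequenceRecord:  # class name: CamelCase
    """A named sequence (hypothetical example class)."""

    def __init__(self, name, sequence):
        self.name = name
        self.sequence = sequence

    def gc_content(self):  # method name: lowercase with underscores
        """Return the fraction of G and C characters in the sequence."""
        if not self.sequence:
            return 0.0
        gc_count = sum(1 for base in self.sequence.upper() if base in "GC")
        return gc_count / len(self.sequence)


def is_too_long(sequence_record, max_length=MAX_SEQUENCE_LENGTH):
    """Return True if the record's sequence exceeds `max_length`."""
    # Function, parameter, and argument names are lowercase with underscores.
    return len(sequence_record.sequence) > max_length


short_record = SequenceRecord("example", "GATTACA")  # instance name: lowercase
```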
I also recommend working with the interactive shell IPython. Some functionalities of IPython:
- tab completion
- inline help (with
- magic functions (predefined functions starting with %), such as run script.py, or
If you're an emacs user, I recommend the emacs-for-python package.
We write papers in LaTeX, a document preparation system much used in technical and scientific domains. Unlike What You See Is What You Get (WYSIWYG) software such as LibreOffice or Microsoft Word, LaTeX encourages you to focus on logical structure rather than format, and makes it easy to typeset mathematical formulas. As LaTeX documents are written in plain text, this also makes version control much easier.
A good place to start with LaTeX is Overleaf, a collaborative LaTeX editing platform.
Do take some time to set up a nice working environment for LaTeX on your own computer. For emacs users I recommend AUCTeX with the following configuration in your

    ;; auto-complete for latex
    (require 'auto-complete-auctex)
    ;; make auctex use pdflatex to compile (when C-c C-c)
    (TeX-global-PDF-mode t)
    (setq TeX-engine 'pdflatex)
    ;; make auctex use evince and firefox for visualization (when C-c C-v)
    (setq TeX-output-view-style
          (quote (("^pdf$" "." "evince -f %o")
                  ("^html?$" "." "firefox %o"))))

Chances are that at some point you will want to manage your bibliographical references with something more advanced than a mere BibTeX (.bib) file. I use Zotero; another popular option in the lab is Mendeley. Wikipedia has a good list of other options.