Plasma, from the French “Plateforme d’eLearning pour l’Analyse de données Scientifiques MAssives” (e-learning platform for the analysis of massive scientific data), aims to provide an interactive tool for teaching the computational analysis of massive scientific data. Plasma was born out of the need to offer our students a reproducible and high-performance analysis environment.
Our previous experience of teaching genomics was unsatisfactory. Because computational resources were limited, the samples studied were restricted to very small datasets, far from what is now routinely analyzed in research labs. Furthermore, remote access to computational resources was not always possible, and the classical Unix terminal was somewhat intimidating for students.
Plasma aims to provide an authentic experience of the bioinformatic analyses actually performed in research labs. Jupyter notebooks are used to describe, implement and teach such analyses. These interactive computational notebooks combine computer code in several programming languages (Python, R, Bash, C++…), text, mathematical equations and the visualization of analysis results as graphics or tables. This technology is now a standard for data analysis, as evidenced by the millions of notebooks available on the GitHub collaborative development platform.
We also wanted a web-based solution that could be easily deployed on bare-metal servers or virtual machines, and that could handle numerous, simultaneous and specific analysis environments (supporting any programming language) through a simple and intuitive management interface.
This project is carried out in collaboration with QuantStack, a company strongly involved in the development of the Jupyter ecosystem. Notebooks are hosted on high-performance servers using JupyterHub, an open-source and highly customizable technology. Students can connect remotely and carry out their analyses in a user-friendly and powerful environment. Data are centralized on the servers and readily available for analysis.
The first instance of Plasma was designed for the needs of teachers and students of the European Master of Genetics at Université Paris Cité. It was fully operational in September 2020. Since then, it has been used by more than 250 life-science and medical students and 20 teachers at Université Paris Cité.
The first stage of this project, Plasma 1.0, was a proof of concept. The implemented solution is fully documented and freely available to the community. It has already been successfully deployed at the Université de Rouen Normandie for first-year medical students. tljh-repo2docker, the core part of Plasma, is also being used by the CNAM for 40 courses reaching more than 1000 students.
Thanks to the success of Plasma 1.0, we are now expanding the project and working on a future version, Plasma 2.0, designed for large-scale teaching, with automated management of user accounts and automatic grading of notebooks. These new developments will notably be implemented for genomics teaching at the National University of Singapore (NUS).
PlasmaBio is the leading organization of the Plasma project. It is composed of three associate professors at Université Paris Cité:
- Research interests: genetics of autoimmune and inflammatory diseases, regulation of immune gene expression
- Teaching: human genetics, genomics, biostatistics, bioinformatics for students in medical school and biology department of Université Paris Cité
- Twitter ~ GitHub
- Research interests: proteomics, molecular dynamics, open data, scientific software development
- Teaching: Python programming, Unix, data management
- Twitter ~ GitHub ~ website
- Research interests: genetics of human infertility
- Teaching: genomics, human genetics, bioinformatics
- Twitter ~ GitHub
- QuantStack is a team of developers and contributors of major open-source projects for scientific computing, who are passionate about science and technology.
- Greg Tucker-Kellogg is the director of the Computer Biology Program at the Faculty of Science of the National University of Singapore (NUS).
In addition to the leading organization of Plasma, a board meets twice a year. This board includes a representative of the graduate school on Genetics and Epigenetics G.E.N.E., an associate professor at Université Sorbonne Paris Nord, two students (one at the master's level and one at the doctoral level), a representative of the Région Île-de-France, the co-head of the National Network of Computational Resources at the French Institute of Bioinformatics and the CEO of our partner QuantStack.
Sponsors of the Plasma initiative include:
For Plasma 1.0 (2019-2022):
- Région Île-de-France, via the “Trophées franciliens de l’innovation numérique dans le supérieur” (EdTech 2018) grant program,
- Université Paris Cité, via the Initiative of Excellence (IdEx) Label and its “innovating teaching” grant program,
- EUR G.E.N.E., the graduate school on Genetics and Epigenetics,
- the university training “Création, analyse et valorisation de données biologiques omiques” (DU Omiques).
For Plasma 2.0 (2022-):
The overall budget of the project is 180 k€.
Developments and achievements
The core of Plasma is tljh-repo2docker, a repo2docker plugin for The Littlest JupyterHub.
The first version of Plasma was released in May 2020 (see Plasma: A learning platform powered by Jupyter).
To ease the creation of new environments in Plasma, we have created templates for Python, R, Bash and Octave.
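As a rough illustration of what such an environment repository can contain: repo2docker builds an environment from configuration files found in a repository, one of which is a conda `environment.yml`. The Python sketch below generates a minimal file of that kind. The package list, the Python version, the environment name and the helper names `environment_files` and `write_environment` are assumptions made for this example; they are not taken from the actual Plasma templates.

```python
from pathlib import Path
import tempfile


def environment_files(packages, python_version="3.11"):
    """Return {file name: content} for a minimal environment repository.

    Hypothetical helper: the layout follows the conda environment.yml
    format that repo2docker can read, not the real Plasma templates.
    """
    lines = [
        "name: example-environment",  # assumed name, for illustration only
        "channels:",
        "  - conda-forge",
        "dependencies:",
        f"  - python={python_version}",
    ]
    lines += [f"  - {package}" for package in packages]
    return {"environment.yml": "\n".join(lines) + "\n"}


def write_environment(target_dir, files):
    """Write the files into target_dir and return the sorted file names."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    for name, content in files.items():
        (target / name).write_text(content, encoding="utf-8")
    return sorted(path.name for path in target.iterdir())


if __name__ == "__main__":
    files = environment_files(["numpy", "pandas"])
    with tempfile.TemporaryDirectory() as tmp:
        print(write_environment(tmp, files))
    print(files["environment.yml"])
```

A directory produced this way would then be pushed to a Git repository, which is the kind of input repo2docker works from; the templates mentioned above remain the reference for real environments.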
Beyond the e-learning platform itself, the project has funded or co-funded a number of scientific developments to expand the Jupyter ecosystem in genomics and pedagogy.
- ipycytoscape brings interactive graph visualization to Jupyter. See also the blog post Interactive Graph Visualization in Jupyter with ipycytoscape (2020).
- ipyigv is a Jupyter widget to render genomics data. See also the blog post Genomic data visualization in Jupyter (2021).
- nbgrader is a system for assigning and grading notebooks. See also the blog post Upgrading Nbgrader (2022).
Resources and communication
Plasma has been presented at computer science and genomics conferences:
- JupyterCon 2020. Plasma: versatile e-learning platform powered by The Littlest JupyterHub (2-minute video)
- Journées Ouvertes de Biologie, Informatique et Mathématique (JOBIM) 2020 [in French]. Plasma: e-learning platform for massive data analysis.
- European Society of Human Genetics (ESHG) 2021. P17.038.A Plasma: a versatile e-learning platform for teaching interactively genomic and genetic data analysis with Jupyter notebooks
- Journées Réseaux de l’Enseignement Supérieur (JRES) 2021 [in French]. Plasma : plateforme d’e-learning pour l’analyse interactive de données (21-minute video + paper).
- Hybridation pédagogique : découvrez et expérimentez PLASMA, 2021, Université Paris Cité newsletter.