Plasma

e-learning Jupyter-based platform for massive data analysis

Plasma, short for the French “Plateforme d’eLearning pour l’Analyse de données Scientifiques MAssives” (e-learning platform for the analysis of massive scientific data), aims to create an interactive tool for teaching the computational analysis of massive scientific data. Plasma was born out of the need to offer our students a reproducible and high-performance analysis environment.

Our previous experiences of teaching genomics were not satisfactory. Because computational resources were limited, the samples studied were restricted to very small datasets, far from what is now routinely analyzed in research labs. Furthermore, remote access to computational resources was not always possible, and students found the classical Unix terminal somewhat intimidating.

Plasma aims to provide an authentic experience of the bioinformatic analyses actually performed in research labs. Jupyter notebooks are used to describe, implement and teach such analyses. These notebooks are interactive documents that combine computer code in several programming languages (Python, R, Bash, C++…), text, mathematical equations and the visualization of analysis results as graphics or tables. This technology is now a standard for data analysis, as evidenced by the millions of notebooks available on the GitHub collaborative development platform.

We also wanted a web-based solution that could be easily deployed on bare-metal servers or virtual machines, and that could handle numerous, simultaneous and specific analysis environments (supporting any programming language), with a simple and intuitive management interface.

This project is carried out in collaboration with QuantStack, a company strongly involved in the development of the Jupyter ecosystem. Notebooks are hosted on high-performance computer servers using JupyterHub, an open-source and highly customizable technology. Students can connect remotely and carry out their analyses in a user-friendly and powerful environment. Data are centralized on the servers and readily available for analysis.

The first instance of Plasma was designed for the needs of teachers and students of the European Master of Genetics at Université Paris Cité. It was fully operational in September 2020. Since then, it has been used by more than 250 life-science and medical students and 20 teachers at Université Paris Cité.

The first stage of this project, Plasma 1.0, was a proof of concept. The implemented solution is fully documented and freely available to the community. It has already been successfully deployed at the Université de Rouen Normandie for first-year medical students. tljh-repo2docker, the core part of Plasma, is also being used by the CNAM for 40 courses reaching more than 1000 students.

Thanks to the success of Plasma 1.0, we are now expanding the project and working on a future version, Plasma 2.0, designed for large-scale teaching, with automated management of user accounts and automatic grading of notebooks. These new developments will notably be implemented for genomics teaching at the National University of Singapore (NUS).

PlasmaBio

PlasmaBio is the leading organization of the Plasma project. It is composed of three associate professors at Université Paris Cité:

Claire Vandiedonck
Pierre Poulain
Sandrine Caburet

Partners

Governance

In addition to the leading organization of Plasma, a board meets twice a year. This board includes a representative of the graduate school on Genetics and Epigenetics G.E.N.E., an associate professor at Université Sorbonne Paris Nord, two students at the master's and doctoral levels, a representative of the Région Île-de-France, the co-head of the National Network of Computational Resources at the French Institute of Bioinformatics and the CEO of our partner QuantStack.

Sponsors

Sponsors to the Plasma initiative include:

The overall budget of the project is 180 k€.

Developments and achievements

Plasma

The core of Plasma is tljh-repo2docker, a repo2docker plugin for The Littlest JupyterHub.

Plasma was first released in May 2020: Plasma: A learning platform powered by Jupyter.

To ease the creation of new environments in Plasma, we have created several templates, for Python, R, Bash and Octave.
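As an illustration of what such an environment definition can look like, here is a minimal sketch in Python that writes an `environment.yml` file, one of the configuration files that repo2docker recognizes. The package list, the environment name and the `write_environment_repo` helper are assumptions made for this example and are not taken from the Plasma templates.

```python
from pathlib import Path
import tempfile

# Hypothetical conda environment for illustration only;
# the actual Plasma templates may list different packages.
ENVIRONMENT_YML = """\
name: example-python-env
channels:
  - conda-forge
dependencies:
  - python=3.11
  - numpy
  - pandas
  - matplotlib
"""


def write_environment_repo(target_dir: Path) -> Path:
    """Create a folder holding an environment.yml file and return the file path."""
    target_dir.mkdir(parents=True, exist_ok=True)
    env_file = target_dir / "environment.yml"
    env_file.write_text(ENVIRONMENT_YML, encoding="utf-8")
    return env_file


if __name__ == "__main__":
    # Write the example into a temporary folder and display the result.
    with tempfile.TemporaryDirectory() as tmp:
        env_file = write_environment_repo(Path(tmp) / "example-env")
        print(env_file.read_text(encoding="utf-8"))
```

A folder like this, once pushed to a Git repository, is the kind of input that repo2docker turns into a container image.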

Beyond the e-learning platform itself, the project has funded or co-funded a number of scientific developments to expand the Jupyter ecosystem in genomics and pedagogy.

Resources and communication

Plasma has been presented in computer science and genomics conferences:

Press release: