Plasma

Jupyter-based e-learning platform for massive data analysis

Plasma

Plasma (in French, “Plateforme d’eLearning pour l’Analyse de données Scientifiques MAssives”, i.e. an e-learning platform for the analysis of massive scientific data) aims to create an interactive tool for teaching the computational analysis of massive scientific data. Plasma was born out of the need to offer our students a reproducible, high-performance analysis environment.

Our previous experience of teaching genomics was unsatisfactory. Because computational resources were limited, students could only work on very small datasets, far from what research labs now analyze routinely. Furthermore, remote access to computational resources was not always possible, and students found the classical Unix terminal somewhat intimidating.

Plasma aims to provide an authentic experience of the bioinformatics analyses actually performed in research labs. Jupyter notebooks will be used to describe, implement and teach such analyses. These notebooks are interactive documents that combine computer code in several programming languages (Python, R, Bash, C++…), text, mathematical equations and the visualization of analysis results as plots or tables. This technology is gradually becoming a standard for data analysis, as evidenced by the millions of notebooks available on the GitHub collaborative development platform.
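
To make the mix of text, equations and code concrete, here is a minimal sketch of how a notebook is stored on disk: a JSON document holding a list of cells. It uses only the Python standard library and the version 4 notebook layout; the helper names (markdown_cell, code_cell), the GC-content example and the file name mentioned afterwards are illustrative assumptions, not part of Plasma.

```python
import json


def markdown_cell(text):
    """Return a text cell; Markdown may embed LaTeX equations between $ signs."""
    return {"cell_type": "markdown", "metadata": {}, "source": text}


def code_cell(source):
    """Return a code cell that has not been executed yet (no outputs)."""
    return {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {},
        "outputs": [],
        "source": source,
    }


# A two-cell notebook: an explanation with an equation, then the matching code.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},  # real notebooks usually also record the kernel here
    "cells": [
        markdown_cell(
            "# GC content\n\n"
            "The GC fraction of a sequence is $\\frac{G + C}{A + C + G + T}$."
        ),
        code_cell(
            "seq = 'ATGCGC'\n"
            "(seq.count('G') + seq.count('C')) / len(seq)"
        ),
    ],
}

# Serialize to the JSON text that would be saved in an .ipynb file.
notebook_json = json.dumps(notebook, indent=1)
```

Writing notebook_json to a file such as gc_content.ipynb (a hypothetical name) should give a document that Jupyter can open, with the explanation rendered above the runnable code.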

We also wanted a web-based solution that could be easily deployed on bare-metal servers or virtual machines, that could serve many simultaneous, customized analysis environments (supporting any programming language), and that offered a simple and intuitive management interface.

This project is carried out in collaboration with QuantStack, a company strongly involved in the development of the Jupyter ecosystem. Notebooks will be hosted on high-performance servers using JupyterHub, an open-source and highly customizable technology. Students will be able to connect remotely and carry out their analyses in a user-friendly and powerful environment. Data will be centralized on the servers and readily available for analysis.

The first instance of Plasma is designed for the needs of teachers and students of the European Master of Genetics at Université de Paris.

Ultimately, this project is a proof of concept and the implemented solution will be fully documented and freely available to the community.

PlasmaBio

PlasmaBio is the organization leading the Plasma project. It is composed of three associate professors at Université de Paris:

Claire Vandiedonck
Pierre Poulain
Sandrine Caburet

Partners

QuantStack is a team of developers of, and contributors to, major open-source projects for scientific computing, who are passionate about science and technology.

Sponsors

Sponsors of the Plasma initiative include:

The overall budget of the project is 140 k€.

Developments & achievements

Plasma

Plasma uses tljh-repo2docker, a plugin for The Littlest JupyterHub that builds user environments with repo2docker.

ipycytoscape

The new ipycytoscape package for interactive graph visualization in Jupyter is also part of the Plasma project: https://blog.jupyter.org/interactive-graph-visualization-in-jupyter-with-ipycytoscape-a8828a54ab63
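
As a rough illustration of what such a graph looks like in code, the sketch below builds a small network in the Cytoscape-style JSON layout (nodes and edges, each wrapped in a "data" object) and, if ipycytoscape happens to be installed, loads it into a widget. The gene names and the build_elements helper are hypothetical; the widget calls reflect the package's documented basic usage as we understand it and may differ between versions.

```python
def build_elements(edges):
    """Convert (source, target) pairs into Cytoscape-style JSON elements."""
    node_ids = sorted({name for edge in edges for name in edge})
    return {
        "nodes": [{"data": {"id": node_id}} for node_id in node_ids],
        "edges": [
            {"data": {"source": source, "target": target}}
            for source, target in edges
        ],
    }


# Hypothetical interaction network used purely for illustration.
interactions = [("geneA", "geneB"), ("geneB", "geneC")]
elements = build_elements(interactions)

try:
    import ipycytoscape
except ImportError:
    # The data structure above is still valid without the package.
    widget = None
else:
    widget = ipycytoscape.CytoscapeWidget()
    widget.graph.add_graph_from_json(elements)
```

In a notebook, leaving widget as the last expression of a cell should render the interactive graph when the package is available.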