Community Article #1


My name is Luca Beltrame. I was born in Milan, the second-largest city in Italy, which also hosts a large number of scientific institutions. I got a degree in Biotechnology in 2002 and moved into scientific research. I started out as a bench scientist, but in 2005, during my Ph.D., I moved to computational biology and in particular to genomics (the study of large-scale changes in DNA and RNA), and I have been working in this field ever since.

Currently, I work as a senior scientist at the Mario Negri Institute for Pharmacological Research (a non-profit research institute) in the Department of Oncology.

I speak Italian, Japanese, and English, and I program mainly in Python for my day job and in C++ for side projects.

In 2013, the group I work in bought new instruments for high-throughput DNA and RNA sequencing, which can generate very large amounts of data. Our first objective was to build an analysis infrastructure that would allow us to handle this data (which can be massive) and to produce output that could be properly interpreted in a biological sense.

To this end, we built two HPC clusters (as of today, we have three) as a side activity to my main research projects, using resources graciously donated to the institute. We then started thinking about how to store the data reliably and efficiently: efficiently because we would not only need to archive the data but also be able to access it with at least acceptable performance.

The data is also precious: the experiments that produce it are quite expensive, costing up to $4,000 for a single run of the instruments. Therefore, consistency and integrity were also top priorities.

For this reason, we tried a number of different solutions. Since exporting over NFS was neither reliable nor efficient, we looked at distributed file systems and evaluated GlusterFS, MooseFS (prior to the LizardFS fork), and RozoFS. We ultimately settled on LizardFS, which we have now been using for about two and a half years.

The reasons we chose LizardFS are its ease of administration, ease of setup, robustness (it survived several power outages without a single file getting lost), EC(2,2) erasure coding (not all distributed file systems offer it), and high resilience to heavy I/O. We set up about 400 TB of storage with minimal hassle. It has many features we like: administration via a web CGI interface, the fact that you can create a storage cluster relatively easily, and access through standard tools in a POSIX-compliant way (FUSE). The right FUSE options can also considerably increase performance.
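To make this concrete, here is a minimal sketch of the kind of setup described above: mounting the cluster over FUSE and applying an erasure-coding goal to a data directory. It assumes the standard LizardFS 3.x tooling (mfsmount, the lizardfs CLI, and a goal such as "ec22" defined as $ec(2,2) in mfsgoals.cfg on the master); the host names, paths, goal name, and options are illustrative and may differ between versions and sites.

```python
import subprocess

# Illustrative placeholders: adjust the master host, mount point, and data
# directory to match the actual deployment.
MASTER = "lizardfs-master.example.org"
MOUNTPOINT = "/mnt/lizardfs"
DATA_DIR = "/mnt/lizardfs/sequencing_runs"

# Mount the cluster over FUSE. big_writes is a standard FUSE option that
# allows larger write requests and is one of the options that can help
# throughput on heavy I/O workloads.
subprocess.run(
    ["mfsmount", MOUNTPOINT, "-o", f"mfsmaster={MASTER}", "-o", "big_writes"],
    check=True,
)

# Apply an erasure-coding goal recursively to the directory holding the
# sequencing data. "ec22" is assumed to be defined on the master in
# mfsgoals.cfg, e.g.:  20 ec22 : $ec(2,2)
subprocess.run(
    ["lizardfs", "setgoal", "-r", "ec22", DATA_DIR],
    check=True,
)
```

In practice the mount and the goal definition are one-time administrative steps; once they are in place, the storage is accessed like any other POSIX file system.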

Of course, nothing is perfect, and what I would like to see in the future is a way to automatically detect replaced disks (as opposed to editing configuration files and restarting services), as well as automatic removal of failed disks (somewhat like what md-raid does on Linux). Lastly, NFS support in LizardFS could use an updated Ganesha plugin to work in more modern setups.