Community Article #2


Hi, my name is Tony Travis.

I was born in Preston, Lancashire, UK, an old mill town that used to be famous for weaving cotton. It's "up north", as the southerners say, although when I moved to Aberdeen they called me a southerner. Confused? Me too!
I’m a computer-literate biologist, fluent in 6502 assembly, Forth, C/C++, Fortran, Ratfor, Java, Perl, Python, R and awk, with more than thirty years' experience of developing and using high-performance computing techniques for biological research, including the analysis of NGS RNA-seq data, de novo assembly and annotation of first- and second-generation DNA sequence data, and the analysis of biological images.

I’m experienced in the construction and administration of Beowulf clusters for computationally demanding work, and in creating distributed data-sharing and bioinformatics infrastructure within geographically dispersed virtual organizations. For one such organization I used Bio-Linux with automounted SSHFS folders over the WAN, linking 32 hosts at 26 partner sites in different EU countries.
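
For anyone who wants to try something similar, one common way to automount SSHFS folders is with autofs. Here's a minimal sketch of the idea (the map name, host, user, key file and remote path below are made up for illustration, not the real partner configuration):

```
# /etc/auto.master -- hand WAN mounts over to a dedicated map
/net/partners  /etc/auto.partners  --timeout=300

# /etc/auto.partners -- one SSHFS entry per remote partner site
# ('\#' and '\:' are escaped because '#' and ':' are special in autofs maps)
siteA  -fstype=fuse,rw,allow_other,IdentityFile=/etc/sshfs/id_rsa  :sshfs\#bio@siteA.example.org\:/data
```

Accessing /net/partners/siteA then mounts the remote folder on demand, and autofs unmounts it again after the idle timeout; note that allow_other needs user_allow_other enabled in /etc/fuse.conf.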
I originally used Sandia National Laboratories' oneSIS with NFS, but quickly became aware that the lack of cache coherence in NFS causes problems with multiple writers, and started to look for alternatives. Having tried almost all the FLOSS distributed filesystems at one time or another, I came across RozoFS, which uses the Mojette Transform instead of Reed-Solomon codes and is incredibly efficient. I built a six-node RozoFS cluster with 256TB of storage for my colleague Luca Beltrame at the non-profit Mario Negri Institute in Milan. It worked well but was very difficult to administer, despite very good support from the RozoFS developers. So I looked for alternatives again, found MooseFS/LizardFS, and decided to evaluate it. I've been using LizardFS for three years now.

I built another six-node storage cluster at the Mario Negri Institute to evaluate LizardFS and compare it with RozoFS. Although RozoFS is an extremely interesting, leading-edge software-defined storage (SDS) system, on balance I found LizardFS much easier to configure and administer, so I installed LizardFS on both of the Mario Negri SDS clusters to provide a combined total of 400TB of storage, which could be upgraded to 1,056TB if all the servers were fully populated with disks.

What I like about LizardFS is that it's easy to install and configure, has good admin tools including the Web GUI, offers great resilience to hardware failures, and performs well even on a 1Gb network.
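
As a taste of how little configuration it needs, this is roughly what defining and applying replication goals looks like (a sketch, not taken from the Mario Negri setup; the goal names and mount point are just examples):

```
# /etc/mfs/mfsgoals.cfg on the master -- define custom goals
# 'important' keeps three whole copies; 'ec32' is erasure-coded (3 data + 2 parity)
10 important : _ _ _
11 ec32 : $ec(3,2)

# on a client mount, apply and inspect goals
lizardfs setgoal -r important /mnt/lizardfs/projects
lizardfs getgoal /mnt/lizardfs/projects
```

Goals can be changed per file or per directory at any time, and the cluster rebalances in the background.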

Things I dislike about it: recovery from disk failures is manual, involving editing config files and reloading the service by hand; failed disks are not automatically removed from the configuration; and a cold start requires a lot of manual intervention. There were also some bad vibes on the LizardFS blog about the future of the project, but that seems to have been addressed recently.
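
To give a concrete idea of the manual recovery: after a disk dies, you edit the chunkserver's disk list and reload the service yourself. Roughly (a sketch assuming a stock install with configs under /etc/mfs; the mount points are examples):

```
# /etc/mfs/mfshdd.cfg on the affected chunkserver
/mnt/chunks1
*/mnt/chunks2    # '*' marks this (failed) disk for removal
/mnt/chunks3

# make the chunkserver re-read its disk list
systemctl reload lizardfs-chunkserver    # or: mfschunkserver reload
```

Once the marked disk's chunks have been replicated elsewhere, the line can be dropped entirely; this is exactly the kind of thing I'd like the Web GUI to handle.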

I would love to see admin tools in the Web GUI for adding and removing disks and for other admin tasks, more automatic reconfiguration after disk or host failures, and automatic re-acceptance of hosts that are configured back into the storage cluster.