LizardFS Community Article #4

My name is Navid Malek Ghaini.

I was born in Tehran, Iran. Iran is a very beautiful country, especially from a natural and historical perspective (have a look). I recently graduated from Sharif University of Technology, the highest-ranking university in the country, which is also recognized internationally (most graduates go on to apply to top-50 universities around the world). My major was computer engineering; although it is called engineering, it is closer to computer science (in Iran, CS and CE are largely the same). Currently, I’m a research fellow at INL-LAB, where we work on encrypted traffic analysis; I work mainly on the network and OS side.

My mother tongue is Farsi (also known as Persian), and I’m also fluent in English. My preferred programming languages are Python, C, JavaScript, and Bash (if it counts as a programming language!).

About a year ago, I was a DevOps engineer at a company based in Iran called Nopayar. I was part of the Yumcoder project, where we developed a scalable and extensible cross-platform infrastructure with the help of open-source technologies. At some point in the project, we naturally needed a distributed network file system/storage. I evaluated many of the available options at the time, and in the end I chose LizardFS for our purposes. The other well-known options I considered were Ceph, BeeGFS, GlusterFS, and ZFS.

I considered LizardFS to be the best open-source solution available. Its main advantage was its simplicity compared with its open-source rivals. Configuring and deploying each node, and even the whole cluster, is not only easy but also relatively fast. The web GUI for cluster management is another great benefit that made the work much faster and easier (especially for team leaders and managers who want a GUI and consolidated information about the nodes and the cluster).

We needed an open-source platform that we could further configure, and even modify, for our scalable infrastructure. My main work with LizardFS was scaling it and tuning its performance, and I have to say I didn’t run into any major problems along the way; the parallel file-serving system was extremely well coded, flexible, and easy both to tune and to alter. The community and the developers were also responsive and friendly. Moreover, the replication strategies were not only easy to configure but also highly advanced; for instance, we used a slightly modified version of LizardFS’s erasure-coding (EC) replication, which was one of the most advanced replication strategies among open-source distributed file storage systems at the time.
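For readers who want to experiment with something similar, here is a rough sketch of how a custom erasure-coded goal can be defined and applied in LizardFS; the goal name, directory paths, and exact file locations are illustrative assumptions, so check the syntax against the documentation for your version.

    # /etc/mfs/mfsgoals.cfg on the master (path may differ by packaging)
    # goal id 6, named "ec_3_2": erasure coding with 3 data parts and 2 parity parts
    6 ec_3_2 : $ec(3,2)

    # on a mounted client, apply the goal recursively to a directory
    lizardfs setgoal -r ec_3_2 /mnt/lizardfs/project-data

    # and confirm what a given file ended up with
    lizardfs getgoal /mnt/lizardfs/project-data/some-file

The (3,2) split is only an example; the ratio of data to parity parts trades storage overhead against how many chunkservers can fail before data becomes unavailable.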

I used LizardFS for around four months, and it was a great experience. It had some minor performance issues with SSD storage back then (which are now history), but I believe it was by far the best option available.

The statement on LizardFS’s website, “Get Your Storage Up and Running in 28 Minutes”, captures what is great about it. On top of that, LizardFS is highly configurable and customizable, and it really is fast and easy to scale.
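To give a sense of what a minimal setup involves, here is a rough deployment sketch for a single master, one chunkserver, and one client on a Debian-style system; the package names, service names, hostnames, and config paths are assumptions based on common LizardFS packaging and may differ on your distribution.

    # master node (hypothetical hostname: mfsmaster.example.com)
    apt install lizardfs-master
    # edit /etc/mfs/mfsexports.cfg to allow your client network, then:
    systemctl start lizardfs-master

    # each chunkserver node
    apt install lizardfs-chunkserver
    # point MASTER_HOST in /etc/mfs/mfschunkserver.cfg at the master if needed
    echo "/mnt/disk1" >> /etc/mfs/mfshdd.cfg    # one data directory per line
    systemctl start lizardfs-chunkserver

    # client node: mount the filesystem over FUSE
    apt install lizardfs-client
    mfsmount /mnt/lizardfs -H mfsmaster.example.com

Scaling out is then mostly a matter of repeating the chunkserver step on more machines.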

With the new team and development strategy, I believe the minor issues it had, like the one I mentioned, will be eradicated, and it will become one of the best-known network-defined storage options, if not the best, in the near future.

In the end, I would like to thank Mark Mulrainey for starting this exciting movement of sharing community experiences with LizardFS.

LizardFS Community Article #3


Hi, my name is Nick Coons. I was born in Phoenix, AZ (USA) and have lived here my whole life, though I do like to travel quite a lot. Everyone knows that our summers are hot, and that once the temperature drops below 70°F (21°C) it’s jacket weather. What a lot of people don’t know is that a lot of tech companies from California are opening offices here, or relocating here entirely. We’re becoming known as the Silicon Desert.

While I only speak English and American :), I’ve written code in many languages, including C/C++/C#, assembly, Perl, PHP, JavaScript, and numerous varieties of BASIC over the decades.

Besides LizardFS, I’ve also worked with MooseFS and GlusterFS.

I went with LizardFS primarily for its simplicity and ease of management. LizardFS can be as simple or as complicated as one needs it to be. I’ve been using LizardFS for about five years.

We have a couple of clusters in our data center that we use as the storage pool for Proxmox-based VMs as well as user data.

The things I like most about LizardFS include:

The storage devices can be of differing sizes and unevenly distributed (unlike RAID). This allows a cluster to be easily upgraded by adding larger drives that are cost-effective today but may have been too expensive or even non-existent when the cluster was put into service, without having to scrap the existing drives (a small configuration sketch follows this list).

While the web UI could use some polishing, I think the information that it provides at a glance is very valuable.

The performance out of the box is nearly as fast as direct writes to the storage devices. There appears to be very little performance loss by adding this system.

I like that it’s FUSE-based and detached from the kernel. I can do kernel updates and filesystem updates independently.
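To make the first point in the list above concrete, a chunkserver’s disk list is just a set of data directories, regardless of the size of the disks behind them; the paths below are hypothetical and the config location may vary by packaging.

    # /etc/mfs/mfshdd.cfg on one chunkserver
    # an older 4 TB drive that has been in service for years
    /mnt/disk-4tb
    # a newer, much larger drive added during an upgrade
    /mnt/disk-16tb

    # reload (or restart, depending on the service unit) so the
    # chunkserver rescans its disk list
    systemctl reload lizardfs-chunkserver

LizardFS spreads chunks across whatever space each directory offers, which is what makes the piecemeal upgrades described above possible.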

Other than some things that I’d like to see added or changed (which I’ll describe later), I can’t think of anything that I really don’t like with LizardFS.

There are a few changes I’d like to see made:

The biggest would probably be a multi-master setup rather than just high-availability failover. So I’m thinking something along the lines of a MariaDB Galera Cluster rather than uRaft. It’s one of the things that I like about GlusterFS (though it has too many other downsides for me to consider using it). I can see two benefits to this:

Right now, the uRaft daemon manages the master daemon. If the uRaft daemon fails for some reason, the cluster may continue to run normally because the master daemon is still running. Then, if something happens to the master, the failover malfunctions. Or, if uRaft fails, a different shadow is promoted but the master with the failed uRaft isn’t demoted, so you end up with two masters that don’t know about each other. Both of these scenarios have happened to us.

A cluster distributed across a WAN in a multi-master setup would allow the client accessing the cluster to access the master closest to it, and wouldn’t require that masters on the other side of a WAN link be opened up to clients.

Right now, I can query a file to see how many chunks it has, whether it’s undergoal, and so on (see the example after this list). But I can’t do the reverse. If I see in the UI that there are missing or undergoal chunks, I’d like to be able to get a list of those files. I believe there’s a process that runs periodically and updates a list of these, but it’s not real-time. Real-time access to this information would be valuable.

The ability to have different types of storage within a single cluster and set goals to direct data to it. For instance, I might have a mix of HDDs and SSDs in a cluster, and I would want to define what data goes where. We do this now by running two chunkserver instances on one physical machine and setting goals by chunkserver label (sketched below), but it feels like a hack, and it would be nice to see this configurable without spawning multiple chunkserver instances.
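As a point of reference for the per-file query mentioned above, the forward direction looks roughly like this today (the path is hypothetical):

    # show the chunks, copies, and goal status of a single file
    lizardfs fileinfo /mnt/lizardfs/vm-images/disk0.raw

The reverse lookup, going from “there are undergoal chunks” in the UI to a real-time list of affected files, is the part that is missing.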
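And here is a rough sketch of the two-chunkserver-instances workaround described in the last item; the labels, goal names, goal IDs, and paths are illustrative assumptions.

    # mfschunkserver.cfg for the instance backed by SSDs
    LABEL = ssd

    # mfschunkserver.cfg for the instance backed by HDDs
    LABEL = hdd

    # /etc/mfs/mfsgoals.cfg on the master: two copies on SSD-labelled
    # chunkservers, or two copies on HDD-labelled ones
    10 fast_2 : ssd ssd
    11 slow_2 : hdd hdd

    # steer hot data to the SSD pool and bulk data to the HDD pool
    lizardfs setgoal -r fast_2 /mnt/lizardfs/vm-images
    lizardfs setgoal -r slow_2 /mnt/lizardfs/archive

It works, but as noted above, each label still needs its own chunkserver process on the same box.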

LizardFS Community Article #2


Hi, my name is Tony Travis.

I was born in Preston, Lancashire, UK, an old mill town that used to be famous for weaving cotton. It’s “up north”, as the southerners say, although when I moved to Aberdeen they would call me a southerner. Confused? Me too!

I’m a computer-literate biologist fluent in 6502 assembly, Forth, C/C++, Fortran, Ratfor, Java, Perl, Python, R, and awk, with more than thirty years’ experience of developing and using high-performance computing techniques for biological research, including the analysis of NGS RNA-seq data, de novo assembly and annotation of first- and second-generation DNA sequence data, and the analysis of biological images.

I’m experienced in the construction and administration of Beowulf clusters for computationally demanding work, and in creating distributed data-sharing and bioinformatics infrastructure within geographically dispersed virtual organisations, for which I used Bio-Linux with auto-mounted SSHFS folders over the WAN between 32 hosts at 26 partner sites in different EU countries.

I originally used Sandia National Laboratories’ oneSIS with NFS, but I quickly became aware that the lack of cache coherence in NFS causes problems with multiple writers, and started to look for alternatives. Having tried almost all the FLOSS distributed filesystems at one time or another, I came across RozoFS, which uses the Mojette Transform instead of Reed-Solomon codes and is incredibly efficient. I built a six-node RozoFS cluster with 256TB of storage for my colleague Luca Beltrame at the non-profit Mario Negri Institute in Milan that worked well but was very difficult to administer, despite very good support from the RozoFS developers. I decided to look for alternatives and found MooseFS/LizardFS, so I decided to evaluate it. I’ve been using LizardFS for three years now.

I built another six-node storage cluster at the Mario Negri Institute to evaluate LizardFS and compare it with RozoFS. Although RozoFS is an extremely interesting, leading-edge SDS, on balance I found LizardFS much easier to configure and administer, so I installed LizardFS on both of the Mario Negri SDS clusters to provide a combined total storage of 400TB, which could be upgraded to 1,056TB if all the servers were fully populated with disks.

What I like about LizardFS is that it’s easy to install and configure, has good admin tools (including the web GUI), is very resilient to hardware failures, and performs well even on a 1Gb network.

Things I dislike about it are the need for manual recovery from disk failures, which involves editing config files and reloading the service by hand; failed disks are not automatically removed from the configuration. A cold start also requires a lot of manual intervention. There were some bad vibes on the LizardFS blog about the future of the project, but that seems to have been addressed recently.
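For context, the manual recovery involved looks roughly like this; the '*' removal marker, paths, and service name are assumptions based on the MooseFS-derived mfshdd.cfg format, so check them against the documentation for your version.

    # /etc/mfs/mfshdd.cfg on the affected chunkserver
    # prefix a failed or retiring disk with '*' to mark it for removal
    */mnt/disk3
    /mnt/disk4

    # reload the chunkserver by hand so it re-reads the disk list
    systemctl reload lizardfs-chunkserver

    # then wait for the master to re-replicate the affected chunks
    # before physically pulling the disk

Automating exactly this sequence from the web GUI is the kind of thing mentioned in the wish list below.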

I would love to see admin tools in the web GUI for adding and removing disks and for other admin tasks, along with more automatic reconfiguration after disk or host failures and automatic re-acceptance of hosts that are configured back into the storage cluster.