Community article #10

About me

Hi there, my name is Krzysztof (Chris). I am an enthusiastic IT specialist and a Linux and Free Software hothead. I speak Polish and a little English.


I have been testing different storage systems for years. My first experience was with a SCSI RAID disk array. After a RAID-5 catastrophe (a damaged power cable burned every single disk in the array) I returned to more conventional methods of storing data, having completely lost trust in large single-point-of-failure solutions.

Keeping files at hand plus a backup, maybe, but that approach is neither easy nor pleasant to use. Ten years ago it made some sense, when DVDs were cheap and their capacity seemed large.

Today we use a phone, a tablet, and a Smart TV, and we expect our movie or video game collection to be available on multiple devices simultaneously. So I started building my own solutions to ensure the availability and redundancy of data – for example, a NAS server plus a backup in the cloud. Over time, as I had to keep plugging in yet another 2 TB USB disk, the NAS server started giving up.

I bought a commercial 4 TB NAS server (a fairly big and well-known brand) as an addition to the existing system, to keep my video games on it. That was a huge mistake: after 6 months the HDD failed and repair wasn't possible. I had to admit that I had overpaid three times over for just 4 TB.

Moving to file storage

In 2019 I ran out of storage capacity and had to look for something new. That's when I decided to choose a distributed file system.

The first assumption I made was that CEPH is the only reasonably proven and professional system. Well, that was a huge mistake. CEPH has its requirements – in my case, performance came out 50% lower than expected. I had to look for something else.


  • GlusterFS is quite popular but has a lot of problems scaling an installation – crossed out.
  • Quantcast File System – too much work to deploy on a single system, no compatibility, outdated and problematic – crossed out.
  • Lustre – hardly any documentation beyond the basics, hard to find any information – crossed out.
  • BeeGFS – a step forward, though with a problematic configuration, huge requirements, and a really bad GUI – crossed out.
  • XtreemFS – problems with stability and installation – crossed out.
  • MinIO – the new version supports multiple access points, etc. – could be worth a look sometime in the future.
  • Cloud systems – S3, Swift, etc. – high price, and latency and bandwidth are critical over the LTE connection I have.

And finally LizardFS, which was my choice – a system based on the legendary MooseFS. LizardFS meets all my requirements: direct access, failover, XOR, EC, and so on. It doesn't require a dedicated disk; you can, for instance, dedicate only part of a disk's capacity. It is very transparent for people used to NFS, with the same authorization and sharing model. Configuration and installation are relatively easy.
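As a sketch of the partial-disk feature: a chunkserver's storage is simply a list of directories in mfshdd.cfg, so it can point at a folder on a disk that is also used for other things. The paths below are hypothetical, and the trailing size syntax (reserving space on the disk for other uses) follows the MooseFS-style format, which may vary between versions – check the mfshdd.cfg man page of your release:

```
# /etc/lizardfs/mfshdd.cfg on a chunkserver (paths are examples)
# Any directory can be used - it does not have to be a whole dedicated disk:
/mnt/data/lizardfs

# MooseFS-style entry leaving 100 GiB on this disk for other uses
# (verify against your version's mfshdd.cfg(5) man page):
/srv/shared/lizardfs -100GiB
```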

What’s now

My current LizardFS cluster has one Master Server, one Metalogger, and 5 Chunkservers. Its main purpose is storing media files in a distributed system. The cluster consists of 6 PCs with a default goal of EC(3,2).
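EC(3,2) means each chunk is split into 3 data parts plus 2 parity parts, so any 2 parts can be lost without losing data, at a storage overhead of only 2/3 instead of full replication. A sketch of how such a goal is defined and applied (the goal id 15 and the name "ec32" are my own arbitrary choices):

```shell
# /etc/lizardfs/mfsgoals.cfg on the master defines custom goals, e.g. a 3+2
# erasure-coded goal (id and name are arbitrary):
#
#   15 ec32 : $ec(3,2)
#
# Then, from any client mount, apply it recursively to the media tree:
lizardfs setgoal -r ec32 /mnt/lizardfs/media
# and verify what is set:
lizardfs getgoal /mnt/lizardfs/media
```

These commands are a sketch only – they assume a running cluster and a client mounted at /mnt/lizardfs.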

What I like about LizardFS is the ease of installation; it works with containers and has the lowest requirements. Sharing is as easy as in NFS. A brilliant feature is password protection – something NFS is missing.
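For comparison with NFS's /etc/exports: shares and the password live in mfsexports.cfg on the master, one line per export. The network range and password below are made-up examples:

```
# /etc/lizardfs/mfsexports.cfg on the master (values are examples)
# address         directory   options
192.168.1.0/24    /media      rw,alldirs,maproot=0,password=s3cret

# "." exports the metadata, used by admin and recovery tools
192.168.1.0/24    .           rw
```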

Though sometimes there is some chaos in the documentation – it mixes the mfs and lfs prefixes. It would also be nice to have an open-source Windows client.

But apart from that, LizardFS does everything you'd expect it to do. No idea what could be added – maybe a Windows server or an Alpine Linux port.


At the moment LizardFS is the best solution in its class. I hope it won't let me down in the future. It may also be a very good solution for companies. It is definitely far cheaper and more scalable than CEPH, which has a lot of limitations. LizardFS has no limits.