How To Change The World on a Budget

 

LIFE SCIENCES & OPEN SOURCE STORAGE

The scientific and research community tends to be a budget- and grant-driven environment, and the need to scale storage can be sudden and unexpected. Traditional appliance-based scale-out storage vendors often take advantage of customers' predicaments by charging expansion prices far in excess of the initial purchase price. This forces customers either to accept the ransom or to seek out alternative storage solutions that do not integrate with their existing infrastructure. The result is not only higher data management costs but also a complex and disjointed storage infrastructure.

 

With most Open Source Software-Defined Storage solutions, users are protected from vendor lock-in. Organisations are free to choose the most cost-efficient storage hardware from the vendor of their choice. Combining vendor-agnostic disk, tape, flash, SSD and cloud storage technologies into a single unified data repository eliminates unpredictable storage costs, ‘forklift’ upgrades and unmanageable, disjointed storage infrastructures, while storage costs can fall by up to 90%.

Open Source solutions can provide a greater return on your storage investment by opening the path to competitive technology procurement. Automated data management unifies the end-to-end storage footprint and enables our users to spend a far greater proportion of their time and budgets on accelerating science and research (changing the world).

LizardFS is an example of Open Source Software-Defined Storage. It provides enterprise-class storage using commodity hardware and specialized software to deliver storage services, advanced features, and management capabilities. Compared to traditional enterprise storage that requires proprietary or custom storage systems, open source software-defined storage platforms such as LizardFS have much lower up-front and ongoing operational costs.

This solution removes the complexity that has historically burdened organizations struggling to keep up with massive data growth, and it simplifies storage processes for IT professionals in all types of industries. LizardFS increases IT agility by enabling organizations to use hardware from any vendor, including existing IT infrastructure, to create a custom solution for any storage need. It adapts well to new technologies, giving you the opportunity to tune and increase performance by integrating SSD, flash, NVMe and similar devices. Organizations can achieve massive scale by increasing storage capacity and performance as needed, up to 8192 chunkservers and 1 Exabyte of data.

The life sciences industry is struggling to keep up with massive data growth driven by advancing technologies such as genomics. Processing the data from one genome produces about 1.5 gigabytes, which in turn creates huge storage needs for any genomics organization that stores thousands of sequenced genomes each day. Genome researchers and other life sciences professionals require storage systems that are high-performing, secure, and scalable. Processing large datasets also requires fast storage that can handle many simultaneous write streams as data is processed. Below is a typical data flow involving LizardFS.

LizardFS has had many success stories in the genomics sphere. Some examples:

 

 

Aalborg University Hospital (Aalborg, Denmark) has been using LizardFS for over a year now. They store human exome sequencing data; they currently have 80TB of raw data stored and are adding 80-100GB per week. Vang Qu Le, the bioinformatician in charge of the storage facility, was tasked with finding a replacement for the existing NFS setup. As mentioned earlier, they are bound by the same budget restrictions as most of the scientific and research community, so the replacement had to be scalable but cheap. “We have existing workstations, with internal 10TB storage, and other hardware, we need to make use of them, for cost effectiveness.” Since installing the system and becoming familiar with it, he has had no issues with projects or meeting deadlines.

 

The United States Department of Agriculture (USDA) is currently in the process of setting up its cluster, which it intends to use for 1PB of sequenced swine data. Its use case is somewhat different from Vang’s: the primary concern is hurricanes, so the plan is to use LizardFS to replicate data between two data centers in different parts of the state, just in case one of them gets carried off to Oz. The team also hopes to make use of other LizardFS features, such as erasure coding, to reduce the amount of space their pigs take up in the data centers.
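To give a rough sense of why erasure coding saves space, here is a minimal Python sketch comparing the raw disk needed for 1PB of data under plain replication and under a k+m erasure-coding layout. The 4+2 layout, the three-copy comparison, and the function names are illustrative assumptions only; they are not the USDA configuration, and real clusters carry additional overhead that this arithmetic ignores.

```python
from fractions import Fraction


def raw_tb_replicated(usable_tb, copies):
    """Raw TB consumed when every chunk is stored as `copies` full copies."""
    return usable_tb * copies


def raw_tb_erasure_coded(usable_tb, data_parts, parity_parts):
    """Raw TB consumed by a k+m layout: each chunk is split into
    `data_parts` pieces and `parity_parts` parity pieces are added,
    so the overhead factor is (k + m) / k."""
    overhead = Fraction(data_parts + parity_parts, data_parts)
    return usable_tb * overhead


usable_tb = 1000  # 1PB expressed in decimal TB

# Hypothetical comparison: both layouts survive the loss of any two parts.
three_copies = raw_tb_replicated(usable_tb, copies=3)
ec_4_plus_2 = raw_tb_erasure_coded(usable_tb, data_parts=4, parity_parts=2)

print(f"3 copies: {float(three_copies):.0f} TB raw")
print(f"EC 4+2:   {float(ec_4_plus_2):.0f} TB raw")
```

Under these assumptions, three full copies need 3000TB of raw disk while the 4+2 layout needs 1500TB for the same failure tolerance within a site. Replicating between two data centers is a separate decision and adds to either figure.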

 

Another user of LizardFS is Brent Matthews, the Linux administrator at Complete Genomics, which is based in California and owned by BGI, the world’s largest genomics services company. They do WGS (whole genome sequencing), where each genome can consume from 500GB up to 1.5TB of storage. They have been using LizardFS for some time now and have a 22-node cluster of 3.8PB, which is 80% full. Brent looked at several alternatives to LizardFS but finally settled on it because it was easy to set up, scalable and cost-effective, and it meant reliability would no longer be a concern. Brent loves that he does not have to babysit the system anymore; it just works!
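The figures above lend themselves to a quick back-of-the-envelope capacity check. The Python sketch below estimates how many more genomes would fit in the remaining 20% of a 3.8PB cluster at the two quoted per-genome sizes. It assumes decimal units and treats 3.8PB as usable capacity, which the account above does not specify; the function name is hypothetical.

```python
def genomes_that_fit(free_gb, genome_size_gb):
    """Whole genomes that fit in `free_gb` at a fixed size per genome."""
    return free_gb // genome_size_gb


cluster_gb = 3_800_000        # 3.8PB in decimal GB (assumed to be usable space)
used_gb = cluster_gb * 80 // 100
free_gb = cluster_gb - used_gb

# Per-genome sizes quoted in the text: 500GB at the low end, 1.5TB at the high end.
for genome_size_gb in (500, 1_500):
    count = genomes_that_fit(free_gb, genome_size_gb)
    print(f"{genome_size_gb} GB per genome: room for about {count} more genomes")
```

If the 3.8PB figure is raw capacity, the real headroom is smaller by whatever replication or erasure-coding overhead the cluster uses.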

We have set ourselves a clear goal to deliver the best software defined storage solution on the market. However, we know that even when we reach our current objective, we will not stop innovating.

Some of our latest innovations, currently being tested and about to be launched for LizardFS, are a Hadoop plugin that lets you connect your LizardFS storage to a Hadoop cluster, and NFS 4.1, which gives you full support for pNFS.

So if you are in the business of trying to make the world a better place, you might like to get in touch and discuss in more detail how LizardFS can help you achieve your world-changing goals.

 

Mark Mulrainey

Storage Dilemma Solver

+48 733 187 097

mark.mulrainey@lizardfs.com

https://lizardfs.com/