In the last couple of years, bioinformatics has gone through a revolution in genome sequencing. Growing compute power and the sheer volume of data that can now be generated have opened up new opportunities for leveraging genomic data to deliver meaningful insights in less time.
The falling cost per terabyte of storage has made it practical for bioinformaticians to keep genetic data in its raw form. On top of that, open-source MPP (massively parallel processing) technologies allow highly skilled bioinformatics engineers to apply their scripting skills and run their scripts and queries across multiple nodes.
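As a rough illustration of that pattern, here is a minimal sketch using PySpark, one popular open-source engine for this kind of parallel work. The file path and the column names (chromosome, quality) are hypothetical stand-ins for a real variant dataset, not anything from the interview.

```python
# Sketch: a query over raw genomic records, distributed across a cluster.
# Assumes a PySpark installation; paths and columns are illustrative only.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("variant-counts").getOrCreate()

# Read raw variant records; Spark splits the scan across the nodes.
variants = spark.read.parquet("hdfs:///genomics/variants.parquet")

# Count high-quality variant calls per chromosome; the filter and
# aggregation run in parallel on every node, then the results merge.
counts = (
    variants
    .filter(F.col("quality") >= 30)
    .groupBy("chromosome")
    .count()
    .orderBy("chromosome")
)

counts.show()
spark.stop()
```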
The biggest challenge today is to standardise the querying tools on top of the existing storage formats, and to query the data being generated efficiently. Today, researchers are using open-source and other existing databases to query only a fraction of the data.
Some of the tools already used in the big data world can handle data at this scale. Big data databases such as SQream DB allow genomic researchers to query their data efficiently, saving months or even years of research time.
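To give a flavour of the kind of SQL involved, here is a small self-contained sketch. Python's built-in sqlite3 module stands in for a big data database like SQream DB so the example runs anywhere; the variants table, its columns and the sample rows are invented for illustration.

```python
# Sketch: the sort of analytical SQL a researcher might run.
# sqlite3 is a stand-in; on an MPP database the same query
# would be executed across many nodes automatically.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE variants (sample_id TEXT, gene TEXT, quality REAL)"
)
conn.executemany(
    "INSERT INTO variants VALUES (?, ?, ?)",
    [("s1", "BRCA1", 42.0), ("s2", "BRCA1", 18.5), ("s3", "TP53", 55.1)],
)

# Genes ranked by their number of high-quality variant calls.
for gene, n_calls in conn.execute(
    """
    SELECT gene, COUNT(*) AS n_calls
    FROM variants
    WHERE quality >= 30
    GROUP BY gene
    ORDER BY n_calls DESC
    """
):
    print(gene, n_calls)

conn.close()
```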
The combination of existing enterprise data technologies and the massive quantity of data being created in genomic research is accelerating the evolution of big data analytics.
Carrying those advances into areas such as smart grids, connected cars and other Internet of Things opportunities will allow enterprises, and all of us, to leverage the power of big data.
It is clear that genomic research will allow us to live better and to treat people better. What is equally clear is that big data is a powerful tool for tackling some of the other big mysteries in our world, from weather forecasting and space research to the origins of the universe and perhaps even an understanding of the human brain.
Thanks to Ami for chatting with us. You can follow him on Twitter at @amigic