The year 2013 marks the 60th anniversary of the discovery of the structure of DNA – the famous double helix. Watson & Crick’s discovery of DNA’s structure is often referred to as the most important biological discovery of the last 100 years – and I agree wholeheartedly. DNA carries the hereditary information of all living organisms, and the discovery of its structure has provided the foundation for the progress and innovations we see in genetics, genomics, biology and science in general. For example, we have been able to sequence and map the whole human genome! Most of you may remember hearing about the Human Genome Project (HGP), an international consortium that aimed to map the more than 3 billion nucleotides that make up the entire genome. It took approximately $3 billion US and 13 years to complete the project! The 10th anniversary of the completion of the HGP will also be celebrated this year. Today, we can sequence a full human genome in one day for less than $10,000. As sequencing technology continues to improve and become more efficient, sequencing costs will continue to decline. With the $1,000 genome on the horizon, we will soon see large-scale genomic data sets incorporating hundreds of thousands of fully sequenced genomes ready to be analyzed. As a consequence, the question of how a researcher can most efficiently handle these massive data sets and make sense of them has become the focus of attention.
For personalized medicine (individually tailored treatment and prevention programs based on the patient’s genetic, biological and physical information) to become a reality, we need new tools to obtain clinically actionable results from genomic data as efficiently as possible. This is where SAP HANA can help! Leveraging the power of HANA can dramatically accelerate the genomics pipeline. With HANA, we can potentially run queries on hundreds of thousands of whole genomes to identify genetic patterns in conditions of interest. Integrating other large-scale “omics” data sets or electronic medical records can enable unprecedented in-depth analysis of all the data relevant to the treatment of a particular patient, all of it in real time.
Recently, SAP was a key sponsor of the Personalized Medicine World Conference held in Mountain View, California. Prof. Dr. Hasso Plattner gave an impressive keynote highlighting our vision for the SAP HANA healthcare platform for personalized medicine. Working in multidisciplinary teams, we have run pilot experiments that produced exciting results that will help enable personalized medicine! When a DNA sample is sequenced, we get a very large number of short DNA fragments that need to be matched to their positions in the genome. This step is called “alignment.” As of now, the SAP HANA alignment algorithm runs approximately 8x faster than the leading alignment algorithm (called “BWA-SW”). We have also seen speedups of up to 600x in the last step of the genomics pipeline – annotation and analytics of the variants. This means that researchers can run flexible queries on genomic data in real time. These are just our preliminary results, and we are very excited to continue our collaboration to deliver a game-changing platform to enable personalized medicine.