Crunching the Numbers to Boost Odds Against Cancer
Software engineers are moving to the fore in the war on cancer, designing programs that sift genetic sequencing data at lightning speed and minimal cost to identify patterns in tumors that could lead to the next medical breakthrough.
Their analysis aims to pinpoint the mutations in our genetic code that drive cancers as diverse as breast, ovarian and bowel. The more precise their work is, the better the chance of developing an effective new drug.
Ever since James Watson and Francis Crick discovered the structure of DNA in 1953, scientists have been puzzling over how genes make us who we are. The confluence of computing and medicine is accelerating the pace of genetic research.
But making sense of the swathes of data has become a logjam.
That, in turn, has created an opportunity for computer geeks and tech firms such as Microsoft, SAP and Amazon.
Oncology is the largest therapy area in the global drugs market, and market researcher IMS predicts it will grow to $83-$88 billion by 2016 from $62 billion in 2011. Computational genomics - using computers to decipher a person's genetic instructions and the mutations in cancerous cells - is emerging as the driver of this growth.
Life Technologies Corp and Illumina Inc are among firms developing equipment that can extract a person's entire genetic code - their genome - from a cell sample.
The newest machines are about the size of an office printer and can sequence a genome in a day, compared with six to eight weeks a few years ago. They can read the 3.2 billion chemical "bases" that make up the human genetic code for $1,000, compared with $100,000 in 2008.
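To put those figures in perspective, the short calculation below works out the implied drop in cost, the cost per million bases and the smallest possible size of one genome on disk. It is a back-of-the-envelope sketch rather than anything the sequencing firms publish: the 2-bits-per-base encoding is an assumption (four possible bases, no quality scores or repeated reads), and the variable names are purely illustrative.

```python
# Figures taken from the article; the 2-bit encoding below is an assumption.
BASES_PER_GENOME = 3_200_000_000   # 3.2 billion chemical bases
COST_NOW_USD = 1_000               # cost quoted for the newest machines
COST_2008_USD = 100_000            # cost quoted for 2008

# How many times cheaper sequencing has become since 2008.
cost_drop_factor = COST_2008_USD / COST_NOW_USD

# Implied cost of reading one million bases at today's price.
cost_per_million_bases = COST_NOW_USD / BASES_PER_GENOME * 1_000_000

# Assumption: each base (A, C, G or T) fits in 2 bits, with no metadata.
raw_size_gigabytes = BASES_PER_GENOME * 2 / 8 / 1e9

print(f"Cost fell by a factor of {cost_drop_factor:.0f}")
print(f"Cost per million bases: ${cost_per_million_bases:.4f}")
print(f"Minimum raw size of one genome: {raw_size_gigabytes:.1f} GB")
```

In practice the files are far larger than this minimum, because sequencing machines read each region of the genome many times and record quality information alongside every base.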
Growing numbers of software engineers are needed to help make sense of all this data.
"Many labs can now generate the data but fewer people or labs have the expertise and infrastructure to analyze it - this is becoming the bottleneck," said Gad Getz, who heads the Cancer Genome Analysis group at the Broad Institute in Cambridge, Massachusetts, jointly run by MIT and Harvard.
Getz is one of a new generation of computational biologists who develop algorithms to parse data from tens of thousands of cell samples, shared with research institutes around the globe.
He and his team of 30 are trying to establish recurring patterns in the mutations and how they are linked to tumor growth. They are using some 1,200 processing units, each with 4-8 gigabytes of random access memory, so each unit is roughly as powerful as a typical desktop PC.
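As a rough illustration of what looking for recurring patterns can mean in practice, the sketch below counts how many tumor samples carry a mutation in each gene and flags the genes that recur. It is a toy example under stated assumptions, not the Broad Institute's method: the sample data, the function name `recurrent_genes` and the `min_fraction` threshold are all hypothetical, and real analyses typically also account for factors such as how often a gene mutates by chance.

```python
from collections import Counter


def recurrent_genes(samples, min_fraction=0.5):
    """Return genes mutated in at least min_fraction of samples, with counts.

    `samples` maps a sample ID to the list of genes found mutated in it.
    Both the data layout and the threshold are illustrative assumptions.
    """
    counts = Counter()
    for mutated in samples.values():
        # Count each gene once per sample, even if it is listed repeatedly.
        counts.update(set(mutated))
    threshold = min_fraction * len(samples)
    return {gene: n for gene, n in counts.items() if n >= threshold}


# Hypothetical data: sample ID -> genes found mutated in that tumor.
samples = {
    "tumor_1": ["TP53", "KRAS", "GENE_X"],
    "tumor_2": ["TP53", "GENE_Y"],
    "tumor_3": ["KRAS", "TP53", "TP53"],
    "tumor_4": ["GENE_Z"],
}

print(recurrent_genes(samples))
```

With these four made-up samples, only the genes mutated in at least two tumors are reported. Scaling the same idea to tens of thousands of samples and hundreds of mutations per tumor is what makes the analysis, rather than the sequencing, the bottleneck.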
HARVESTING KNOWLEDGE
Eli Lilly CEO John Lechleiter sees potential for progress.
"We are starting to harvest the knowledge that we gained through the sequencing of the human genome, our understanding of human genetics, disease pathways. We've got new tools that we can use in the laboratory to help us get to an answer much, much faster," said Lechleiter, whose firm is co-owner of the rights to bowel cancer drug Erbitux.
Approved drugs that take genetic information into account include Amgen's Vectibix and AstraZeneca's Iressa. But each of these drugs is based on a single mutation. Sequencing has laid bare many more mutant genes - often hundreds in any given tumor - and highlighted the need for a subtler approach to cancer treatment.
Roche, the world's largest maker of cancer medicines, has spent several million euros on information technology for a pilot scheme examining how cancer cells in petri dishes react to new drugs. The scheme involves crunching hundreds of terabytes of gene sequences.
"It's the first large-scale in-house sequencing project for Roche and we expect more to follow in the near future," said Bryn Roberts, Roche's head of informatics in drug research and early development.
Roberts said the project, which uses processing power equivalent to hundreds of high-end desktop PCs, was self-contained but there were plans to draw in external data. This would require advances in cloud computing - using software and computing power from remote data centers - but Roberts said the technology would soon be available.
"The scale of the problem means the solution will be on an international collaborative scale," he said.
OPPORTUNITIES IN CLOUDS
The trend of using cloud computing networks to allow commercial and public researchers to share cancer data is promising for the likes of IBM and Google, which, according to GBI Research, are already established providers of cloud computing for drug makers' research efforts.
Amazon, with its cloud computing unit AWS, said it is benefiting as life science researchers rethink how data is stored, analyzed and shared. "We are happy with the growth we are seeing," a spokesman said, declining to provide figures.
Microsoft said it was dedicating "significant resources" to the expansion of cloud computing in the health and life sciences markets.
"Pharma R&D will be working with other technology companies, like Microsoft, in developing new algorithms, methodologies and indeed even therapies themselves," said Les Jordan, chief technology strategist at Microsoft's Life Sciences unit.
The world's largest business software company, SAP, has teamed up with German genetic testing specialist Qiagen. They are modifying SAP database software so that certain cancer diagnostic tests, which now keep a network of supercomputers busy for days, can be run on a desktop PC within hours.
Genetic analysis has revealed that cancers now treated as a single disease, because they occur in the same organ and look the same under the microscope, are driven by different genetic mutations.
Hans Lehrach at the Max Planck Institute for Molecular Genetics in Berlin says every single tumor should be seen as an "orphan disease", using a term for rare illnesses that typically prompt drug regulators to make drug approval easier.
He has designed software that he describes as a virtual patient. It suggests a drug or a mix of drugs based on each tumor's genetic fingerprint. A single case can take several days to be processed.
Lehrach, a geneticist who says he has written software code throughout his scientific career, likens his approach to that of a meteorologist who regards every day's set of readings as unique.
Taking the analogy further, he says the convention of stratifying cancer patients is equivalent to a weather forecast based on simple rules such as 'red sky in the morning, sailor take warning'.
At a unit of Berlin's Charite university hospital, 20 patients left with no other treatment options for their aggressive type of skin cancer are being diagnosed based on Lehrach's computer model.
The trial is exploratory and there are no results yet on the overall treatment success, but the project, like many others, is driven by the hope that cancer can be wrestled down by sheer computing power.