Scientists using new mathematical and computational techniques have identified six influenza A viruses that have particularly close genetic relationships to the H1N1 "swine" flu virus that swept through the United States beginning in the spring of 2009. That virus eventually killed almost 18,000 people worldwide.

Biological studies focused on these strains of influenza virus could shed light on how the 2009 pandemic strain of influenza emerged, aiding in efforts to forestall another pandemic, the researchers say.

Five of these viruses were isolated from pigs, and the sixth had infected a human who worked with hogs.

The researchers arrived at these strains by using powerful computers to analyze the relationships between the genomes of more than 5,000 strains of influenza A that have been isolated over several decades and recently sequenced. Rather than using the conventional approach of constructing phylogenetic trees that illustrate organisms' hypothetical ancestors, these scientists set up a network that captured paths leading from previously observed viruses to contemporary viruses.

Biologists for years have used a tree to trace how viruses change over time by undergoing mutations, some of which allow them to resist attacks from immune systems or drugs, jump from host to host and keep surviving. But viruses also exchange genetic material with contemporaries, most commonly when two or more strains infect the same host. This process is called reassortment, and this computer-based networking model is a novel way to see how it has played out in influenza over time.

"It's not unlike a social network, except that it's tracking an exchange of genetic material rather than gossip," said Daniel Janies, associate professor of biomedical informatics at Ohio State University and a co-author of the study. "This network gives us an explicit historical and molecular map of how influenza A viruses evolved from several ancestors to modern-day viruses."

Janies conducted the work with Ohio State co-authors Shahid Bokhari, a research professor of biomedical informatics, and Laura Pomeroy, a postdoctoral researcher in veterinary preventive medicine.

The research is published online in the journal IEEE/ACM Transactions on Computational Biology and Bioinformatics.

The scientists obtained data on all fully sequenced influenza A viruses available in a National Institutes of Health database as of Oct. 15, 2009. There were 5,016 viruses in all, with isolates dating as far back as 1931.

They then used supercomputers to create a database outlining the differences and similarities in the genomes among all of these strains, coming up with about 100 million ways in which the genomes differed from each other. Finally, running their own algorithms on the highly specialized Cray XMT supercomputer at the Pacific Northwest National Laboratory, the Ohio State scientists produced a network that efficiently tracked how all of these influenza A viruses are related to each other and which paths through the network led to the pandemic H1N1 of 2009.
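The article does not spell out how the roughly 100 million differences were tallied. As a rough illustration only, the sketch below assumes (this is not stated by the researchers) that every pair of strains is compared segment by segment across the eight genome segments of influenza A, counting mismatched positions. The function name, strain data and toy sequences are made up for the example.

```python
from math import comb

# The eight genome segments of influenza A.
SEGMENTS = ("PB2", "PB1", "PA", "HA", "NP", "NA", "M", "NS")

def segment_differences(a, b):
    """Count mismatched positions per segment between two strains.

    Assumes the sequences are already aligned and of equal length.
    """
    return {seg: sum(x != y for x, y in zip(a[seg], b[seg])) for seg in SEGMENTS}

# Hypothetical toy strains; real segments are hundreds to thousands of bases long.
strain_a = {seg: "ACGTAC" for seg in SEGMENTS}
strain_b = dict(strain_a, HA="ACGTTC", NA="TCGTAA")

diffs = segment_differences(strain_a, strain_b)
print(diffs)

# Number of segment-level comparisons if all 5,016 strains are compared pairwise.
pairwise_segment_comparisons = comb(5016, 2) * len(SEGMENTS)
print(pairwise_segment_comparisons)
```

Under that assumption, 5,016 strains give about 12.6 million pairs and about 100.6 million segment-level comparisons, which is at least consistent in scale with the figure reported.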

Discovering commonly traveled paths in the network helped the researchers identify what they called the "bottleneck" viruses – the six strains of flu most closely related to the pandemic H1N1 flu that emerged in 2009.
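The idea of a bottleneck can be illustrated with a toy network. The sketch below is not the researchers' algorithm; it simply enumerates every path from a set of older strains to a target strain in a small, made-up network and reports what share of those paths passes through each intermediate strain. All strain names and edges are hypothetical.

```python
from collections import Counter

# Hypothetical toy network. An edge A -> B means "B received genetic material from A".
EDGES = {
    "old_1": ["mid_1", "mid_2"],
    "old_2": ["mid_2"],
    "old_3": ["mid_2", "mid_3"],
    "mid_1": ["hub"],
    "mid_2": ["hub"],
    "mid_3": ["pandemic"],
    "hub": ["pandemic"],
}

def all_paths(edges, start, target, prefix=()):
    """Yield every path from start to target (the network must be acyclic)."""
    path = prefix + (start,)
    if start == target:
        yield path
        return
    for nxt in edges.get(start, []):
        yield from all_paths(edges, nxt, target, path)

def path_share(edges, sources, target):
    """Return the total path count and, per intermediate node, the fraction of paths through it."""
    paths = [p for s in sources for p in all_paths(edges, s, target)]
    # p[1:-1] drops the source and the target, leaving only intermediate strains.
    counts = Counter(node for p in paths for node in p[1:-1])
    return len(paths), {node: c / len(paths) for node, c in counts.items()}

total, share = path_share(EDGES, ["old_1", "old_2", "old_3"], "pandemic")
for node, frac in sorted(share.items(), key=lambda kv: -kv[1]):
    print(f"{node}: {frac:.0%} of {total} paths")
```

In this toy case the strain labeled "hub" sits on four of the five paths to the target, which is the kind of concentration that would mark it as a bottleneck. Enumerating paths one by one is only practical for small networks; a network built from thousands of genomes would need a more scalable method.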

"These viruses just popped out at us," Bokhari said. "We found that 3,600 out of almost 4,000 paths to pandemic H1N1 were passing through these six bottlenecks. That threw up a red flag and said there is something unusual about these six.

"These bottlenecks are the culprits just a few steps removed from the virus that caused the pandemic among humans. Biologists are encouraged to look at these viruses to determine what underlies the emergence of animal diseases in humans."

Four of the bottleneck viruses were H1N2 strains isolated from pigs in China and Hong Kong between 2005 and 2007. The other two were H1N1 strains: one isolated from a pig in Kansas in 2007, and one isolated in Iowa in 2005 from a human who worked with swine.

All of these viruses reassorted with a Hong Kong-based H1N1 strain isolated in 2009 before the pandemic strain emerged.

Reassortment refers to an exchange of genetic material between two organisms. When viruses infect a host, they land on a cell surface and release their genetic material into the cell; in influenza, that material is RNA carried in separate segments. This effectively hijacks the cell's internal replication machinery and forces the cell to make a new version of the virus rather than a copy of itself. If two or more viruses infect the same cell, then the genome segments inside the cell are mixed, or reassorted, and the cell is tricked into creating an altogether new virus that contains genetic material from the original set of viruses.
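The mixing described above can be made concrete with a small sketch. The influenza A genome consists of eight RNA segments, so when two strains infect the same cell, each segment of a new virus can in principle come from either parent. The code below only counts those combinations; it assumes every combination is equally possible, which real biology does not guarantee, and the parent labels are made up.

```python
from itertools import product

# The eight genome segments of influenza A.
SEGMENTS = ("PB2", "PB1", "PA", "HA", "NP", "NA", "M", "NS")

def possible_reassortants(parent_a, parent_b):
    """Return every way of taking each segment from one of two parent strains."""
    choices = [(parent_a[seg], parent_b[seg]) for seg in SEGMENTS]
    return [dict(zip(SEGMENTS, combo)) for combo in product(*choices)]

# Hypothetical parents; each segment is tagged with its origin.
parent_a = {seg: f"swine_{seg}" for seg in SEGMENTS}
parent_b = {seg: f"human_{seg}" for seg in SEGMENTS}

offspring = possible_reassortants(parent_a, parent_b)
# Exclude the two combinations that simply reproduce a parent.
novel = [v for v in offspring if v != parent_a and v != parent_b]

print(len(offspring), len(novel))
```

Two parents give 2^8 = 256 segment combinations, 254 of which differ from both parents.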

"Reassortment is another way for viruses to evade our immune system," Janies said. "They shuffle the deck, and there are few controls on it. This sloppy process is perhaps an advantage for the viruses. The genes regroup in a new virus to get to the next pig, or person, to survive."

This network approach to studying the evolutionary history of organisms that exchange genes can be applied to other species as long as the genomic data are available, Bokhari noted. The researchers hope to use the model next to study how bacteria become resistant to antibiotics or become highly pathogenic.

"Bacteria will exchange genetic material, and that material is often something that encodes toxins or genes for resistance to a specific antibiotic. The same approach will be used in that context to find out where bacteria, such as those that recently emerged in Europe, pick up these changes," he said.

This work was supported by the Pacific Northwest National Laboratory, the Ohio Supercomputer Center and the U.S. Army Research Laboratory and Office.