“Big data” is becoming a big problem for researchers across the board. But at Virginia Tech, computer science and engineering researchers are meeting this 21st century challenge head-on.
Computer science associate professor Wu Feng works with HokieSpeed, a supercomputer he created to be both incredibly powerful and energy efficient, to address the challenge big data poses to researchers.
Big data refers to data sets so large that they are difficult to collect, search and analyze.
Fields such as meteorology and biology require the ability to capture and process incredibly vast amounts of information, like how many stars are in the galaxy or how many cells are in an organism.
Feng, along with colleagues at Stanford and Iowa State, recently received a $2 million grant from the federal government to develop new ways of handling big data, specifically as it relates to DNA sequencing — a classic big data problem.
“(Computer) performance will double roughly every 24 months,” Feng said. “But the problem is that the amount of genetic sequencing data is doubling every nine months, perhaps even faster. We’re producing data at a faster rate than we can compute it.”
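The gap Feng describes can be made concrete with a little arithmetic. The sketch below is only an illustration: it takes the two doubling periods from his quote and assumes, for simplicity, that both quantities start from the same baseline and double smoothly. The function names are hypothetical and not part of Feng's work.

```python
# Illustrative sketch only. The doubling periods come from Feng's quote;
# the equal starting baseline and smooth doubling are simplifying assumptions.
COMPUTE_DOUBLING_MONTHS = 24  # computer performance doubles roughly every 24 months
DATA_DOUBLING_MONTHS = 9      # sequencing data doubles roughly every nine months


def growth_factor(months, doubling_months):
    """Return how many times larger a quantity is after `months` of steady doubling."""
    return 2 ** (months / doubling_months)


def data_to_compute_gap(months):
    """Return the ratio of data growth to compute growth after `months`."""
    data = growth_factor(months, DATA_DOUBLING_MONTHS)
    compute = growth_factor(months, COMPUTE_DOUBLING_MONTHS)
    return data / compute


if __name__ == "__main__":
    for years in (1, 3, 6):
        months = 12 * years
        print(
            f"After {years} year(s): data x{growth_factor(months, DATA_DOUBLING_MONTHS):.1f}, "
            f"compute x{growth_factor(months, COMPUTE_DOUBLING_MONTHS):.1f}, "
            f"gap x{data_to_compute_gap(months):.1f}"
        )
```

Under these assumptions, six years of growth multiplies the data 256-fold but computing performance only eightfold, leaving the data 32 times further ahead than when it started.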
HokieSpeed offers a solution.
Traditionally, computer scientists and researchers relied on an increasing number of transistors in the central processing unit, or CPU, the “brain” of a computer, to increase computational speed. But this also means increasing power consumption and heat production.
“We hit a powerwall,” Feng said. “We couldn’t keep going along that path, it was making the processor way too hot ... You can probably find pictures of eggs frying on processors.”
Six years ago, researchers began working on multi-core processors in an attempt to address the problem. Rather than adding more transistors to one CPU, computers were made to utilize two, four or more CPUs at the same time.
Feng’s HokieSpeed supercomputer takes this concept a step further. By tapping into graphics processing units, or GPUs, in addition to traditional CPUs, HokieSpeed is able to handle big data sets more efficiently than previous systems.
Feng compares these graphics units to a drag race car. While they do not have the flexibility of a CPU, they are able to update the image on a display incredibly quickly. So quickly, in fact, that the human eye cannot detect the changes.
“Every 30 milliseconds, a million pixels get updated,” Feng said. “That’s bloody fast.”
Feng’s work isn’t simple. The graphics processors were not produced to handle information like a CPU. They operate in different languages, and Feng has to make a translator.
“We now have a processor called AMD Fusion that fuses the CPUs and the GPUs together,” Feng said.
Researchers like Feng must work to determine how best to divide the labor between processors, which Feng likens to the left and right side of the brain.
“How do you make use of these brains that people haven’t used in the past? And once you’ve programmed them, how do you extract performance?” Feng said.
These questions drive Feng’s research.
HokieSpeed is listed on The Green500, a website that ranks the world’s most energy efficient supercomputers. And it’s fast.
“I’d probably say it’s at least 10,000 times more powerful (than a standard personal computer),” said Feng, using his iPhone calculator to come up with a quick estimate. “I guess I’m being a little conservative; it would probably end up being a lot more than that.”
HokieSpeed has delivered for Tech researchers.
Associate Professor Alexey Onufriev came to Feng with a molecular modeling program. It took 10 hours to run before, but with the HokieSpeed supercomputer the program could run in less than a second.
Feng sees a bright future for HokieSpeed and other supercomputers grappling with big data. DNA sequencing, drug design at the molecular level and the study of disease are some of the fields where this generation of information processing is important.
Feng’s goal is to provide all researchers with quicker access to data and information relevant to their fields.
“We’re looking to empower scientists, engineers, even humanities and business people to accelerate the discovery process (and) create innovations that will contribute to the betterment of society,” Feng said.
And for Hokies looking for a job when they leave Tech, big data processing may have another solution to offer.
“Big data is a big deal right now. My problem now is that I cannot provide enough students for the demand,” Feng said. “People worry about computer science, computer engineering, that all these jobs are being outsourced. The higher-order thinking ... that’s where there’s a fairly significant shortage of qualified people in the workforce.”
Feng estimates that within three to five years, more private businesses like Twitter, Facebook and Amazon may require big data solutions to help them index and keep track of their users and customers, in addition to current applications in research and development.
For those with research experience, big data could mean big money.
Follow the writer on Twitter: @KulakCT