NevadaToday

Researchers save money using central research grid

November 16, 2007

One year ago, the Information Technology (IT) Division announced plans to house and administer a centralized university research grid. Today, that grid is alive and well, and campus researchers are using it to run complicated simulations involving massive amounts of data at a much lower cost than setting up their own computer clusters.
According to John Kenyon, who was hired as Research Grid Systems Administrator early this summer, the arrangement works like this: IT provides the server rack, an air-conditioned server room, a head node, a file server, and Kenyon's server administration, while researchers who want guaranteed capacity buy computation nodes to add to the grid.
At present, the cluster contains 63 computation nodes, with a combined 312 gigabytes of RAM and 252 processor cores, supplemented by a 24-terabyte file server. As more researchers join the grid, those numbers will continue to climb. Currently, the cluster is used predominantly for artificial intelligence (AI) simulations, but it is also running some protein analysis and network simulations.

Kenyon, who gave an update on the research grid at last month's TLT Showcase of Technology Tools, said that scientists can solve important problems by harnessing this kind of computing power. For example, during World War II, the British were able to crack the German Enigma code by using an early computing machine called the "Turing Bombe." Today's computers are much cheaper, smaller, and easier to use; they just need to be linked together to put their potential power to good use.

So far, two researchers are contributing members of the grid. Dr. Sushil Louis of the Computer Sciences (CS) Department, who is researching genetic algorithms, owns 26 nodes. Dr. Bobby Bryant, also of CS, who is researching neuro-evolutionary algorithms, owns 23 nodes.
Others using the grid as guest researchers include Dr. Kevin Facemyer of Biochemistry, researching protein interaction and evolution; Fares Quedan of Mathematics, researching atmospherics; Murat Yuksel of CS, researching computer networking and communication; and George Bebis of CS, researching computer vision. Two other researchers have expressed interest in joining once the grid acquires the specific software they need.

Kenyon noted that IT is using an enterprise-grade queuing system, the Sun Grid Engine, to make sure jobs are run in the fairest and most efficient way possible. It chooses whose job to run based on past usage and priority. For example, those who own nodes have top priority on the nodes they own. If someone does not own nodes, their job will first attempt to run in the common pool. If the common pool is unavailable, the job will run on any owner's idle nodes. However, if the owner of those nodes submits a job, the "guest" jobs are put on hold until the owner's job is complete.
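
The placement rules Kenyon describes can be illustrated with a short sketch. This is not how the Sun Grid Engine is implemented or configured; it is a simplified Python model of the policy as stated in this article, and the names `Node` and `place_job` are hypothetical. It also assumes that an owner whose own nodes are all busy with the owner's own jobs falls back to the common pool like anyone else, which the article does not spell out.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    name: str
    owner: Optional[str] = None    # None means the node belongs to the common pool
    running: Optional[str] = None  # user whose job currently occupies the node
    held: list = field(default_factory=list)  # guests whose jobs are on hold here


def place_job(user, nodes):
    """Pick a node for `user`'s job, or return None if the job must wait."""
    own = [n for n in nodes if n.owner == user]

    # 1. Owners have top priority on their own nodes: use an idle one if possible.
    for n in own:
        if n.running is None:
            n.running = user
            return n

    # 2. Otherwise put a guest job on hold so the owner's job can run.
    for n in own:
        if n.running != user:
            n.held.append(n.running)
            n.running = user
            return n

    # 3. Try the common pool.
    for n in nodes:
        if n.owner is None and n.running is None:
            n.running = user
            return n

    # 4. Fall back to idle nodes that belong to someone else (guest job).
    for n in nodes:
        if n.owner is not None and n.running is None:
            n.running = user
            return n

    return None  # nothing available; the job stays queued


# Hypothetical example: one owned node and one common-pool node.
nodes = [Node("a1", owner="alice"), Node("c1")]
place_job("bob", nodes)    # bob owns nothing, so he lands in the common pool (c1)
place_job("carol", nodes)  # pool is full, so carol runs as a guest on alice's a1
place_job("alice", nodes)  # alice reclaims a1; carol's job goes on hold
```

In this model a held guest job simply waits in the node's `held` list; resuming it after the owner's job finishes is left out for brevity.
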
"Let me be clear," Kenyon said in his October presentation. "You always have 100 percent access to the nodes that you buy!" However, he noted, any contribution to the grid helps all groups doing computational research.
"The common pool is not owned by anybody and no one gets priority there," he said. To make sure those resources are fairly distributed, all departments get an equal share of time; all groups within a department get an equal share of the department's time, unless the department chair directs otherwise; and all users in a group get an equal share of the group's time, unless the professor directs otherwise.
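
To see what that three-level split means for an individual user, here is a minimal sketch of the arithmetic in Python. The function name `common_pool_shares` and the sample departments, groups, and users are hypothetical. The sketch models only the default equal split, not any override by a department chair or professor, and it assumes every department and group listed has at least one member.

```python
from fractions import Fraction


def common_pool_shares(tree):
    """Map each user to a fraction of common-pool time.

    `tree` has the form {department: {group: [users]}}.
    """
    shares = {}
    dept_share = Fraction(1, len(tree))           # equal share per department
    for groups in tree.values():
        group_share = dept_share / len(groups)    # equal share per group
        for users in groups.values():
            user_share = group_share / len(users) # equal share per user
            for user in users:
                shares[user] = shares.get(user, 0) + user_share
    return shares


# Hypothetical layout: two departments, three groups, four users.
example = {
    "CS": {"group_a": ["u1", "u2"], "group_b": ["u3"]},
    "Math": {"group_c": ["u4"]},
}
shares = common_pool_shares(example)
# u1 and u2 each get 1/8, u3 gets 1/4, and u4 gets 1/2 of common-pool time.
```

The example shows a consequence of the policy: a lone user in a small department can end up with a larger share of the common pool than any single user in a department with many groups.
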

Having monitored the campus research grid for the past six months, Kenyon says his first priorities are to add more computers and to make sure the grid is well known in the University research community so that it lives up to its potential.
He also thinks it would be helpful to create a curriculum that teaches people how to manipulate, analyze, and generally deal with massive amounts of data. For example, he would like to see classes or workshops that teach University researchers about computer programming, how to do a proper statistical analysis of research data, and how best to present that data graphically for maximum impact.
