PhD Students in ITI and CS Win IBM Fellowships

3/22/2010 5:46:00 AM ITI Staff

Information Trust Institute graduate student Eric W. D. Rozier is one of two Computer Science students at Illinois who have just received IBM Fellowships in recognition of their innovative research work.

Rozier was honored for his work on rare-event simulation, and Jing Gao was honored for her work in data mining and knowledge integration.

IBM Ph.D. Fellowships are awarded based on a competitive worldwide program. According to IBM, the program "honors exceptional Ph.D. students who have an interest in solving problems that are important to IBM and fundamental to innovation in many academic disciplines and areas of study." Fellows are awarded a stipend and educational allowance to cover one academic year of study.

Eric Rozier

Eric W. D. Rozier is pursuing research on simulation of rare events in large-scale systems. Such simulation addresses systems in which important events occur at a variety of vastly different rates, such that the fastest and slowest rates in the system may differ by orders of magnitude.

Those systems are challenging to simulate, because in order to model system behavior over a span of time long enough to include an adequate number of rare event occurrences, it may be necessary to model an immense number of frequently occurring events. Since the computation time involved in solving a system model is generally determined by the number of events fired, systems with rare events typically result in "stiff" models that are too large and complex to solve. The research challenge is to find a way to make solution of such systems mathematically tractable.
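To get a feel for the scale of the problem, the following is a minimal sketch, not Rozier's method. It assumes a toy system with just two kinds of events, one frequent and one rare, each arriving as an independent Poisson process. The function names and the rates in the usage note are hypothetical and chosen only for illustration.

```python
import random


def expected_total_events(fast_rate, rare_rate, rare_samples_needed):
    """Expected number of events fired before seeing the requested rare events.

    Assumption: two independent Poisson processes race each other, so each
    fired event is the rare one with probability rare_rate / (fast + rare).
    """
    p_rare = rare_rate / (fast_rate + rare_rate)
    return rare_samples_needed / p_rare


def simulate_until_rare(fast_rate, rare_rate, rare_samples_needed, seed=0):
    """Naive simulation: fire events one by one until enough rare ones occur.

    Returns the total number of events fired (frequent and rare together).
    """
    rng = random.Random(seed)
    p_rare = rare_rate / (fast_rate + rare_rate)
    fired = 0
    rare_seen = 0
    while rare_seen < rare_samples_needed:
        fired += 1
        if rng.random() < p_rare:
            rare_seen += 1
    return fired
```

With hypothetical rates of 1,000 frequent events for every rare one, `expected_total_events(1000.0, 1.0, 100)` shows that collecting 100 rare occurrences requires firing on the order of 100,000 events. When the rates differ by many more orders of magnitude, a naive simulation of this kind quickly becomes impractical.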

Storage systems, an application area in which Rozier has been collaborating with IBM researchers, are a notable real-world example of systems with rare events. In addition to their "normal" faults, such as hard disk failures, storage systems are affected by potentially devastating, but exceedingly rare, events called "undetected disk errors." In those errors, the disk writes to the wrong sector, or reports that it has written to the right sector when it actually wrote nothing.

"These are very hard to catch," says Rozier, "and they can cause massive problems. And they become more relevant with systems like Blue Waters. Because in large petascale systems, these rare events that before you might not expect to see for maybe 50 years of runtime, now will occur roughly every 100 days."

Rozier earned a bachelor's degree in Computer Science from the College of William and Mary in 2003, and since 2004 has been a graduate student in the research group of Prof. William H. Sanders at Illinois. In 2008 and 2009 he did two research internships at the IBM Almaden Research Center in San Jose, California. He is also a key member of the development team for Möbius (www.mobius.illinois.edu), a simulation and modeling tool licensed to over 400 academic and industry users.

Jing Gao

As the saying goes, "two heads are better than one." This principle is the central theme of Jing Gao's research in knowledge integration. Her work draws on data mining and machine learning to integrate knowledge from multiple information sources. It combines multiple base models into ensembles for better predictions.

While an abundance of data sources is available to users today, relying on any single model for decision making is risky, because no one model is certain to lead to the best choice. Combining sources is an attractive way to reduce this risk and make more accurate predictions.

"But, the data is so huge, that you can't really combine it at the data level," cautions Gao. "It's only feasible to do at the model level. One of the key trends in information technology is for increasingly large and disparate sources of data, and that's why it's so important that we get this [knowledge integration] right."
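As a rough illustration of combining at the model level, the sketch below uses simple majority voting over the labels predicted by several already-trained models. Majority voting is a basic textbook ensemble technique used here only as a stand-in. It is not Gao's method, and the function name and labels are hypothetical.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one label per item by majority vote.

    predictions_per_model: one list of predicted labels per model, all of the
    same length and in the same item order. Only the models' outputs are
    needed, never the raw data each model was trained on.
    """
    combined = []
    # zip(*...) groups together the labels every model assigned to one item.
    for labels_for_item in zip(*predictions_per_model):
        winner, _count = Counter(labels_for_item).most_common(1)[0]
        combined.append(winner)
    return combined
```

For example, if three hypothetical models label two items as `[1, 0]`, `[1, 0]`, and `[0, 1]`, the combined prediction follows the two models that agree on each item. Using an odd number of models with two possible labels avoids ties.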

Knowledge integration holds the key to intelligent decision making in many emerging applications, and has already shown its power in multiple areas, including recommendation systems (such as those developed for the $1,000,000 Netflix Prize), anomaly detection, stream mining, and web applications.

Gao plans to extend the scope of her ensemble methods to several other data mining functions, including ranking, anomaly detection, and veracity analysis. She also plans to investigate knowledge synthesis over multiple heterogeneous information networks, as Web 2.0 technologies continue to expand the amount and types of information available.

Gao works with Prof. Jiawei Han in the Database and Information Systems Laboratory. She received her B.E. and M.E. in Computer Science from the Harbin Institute of Technology in Harbin, China. In 2009, Gao completed an internship at the IBM T.J. Watson Research Center, where she designed a supervised discriminative pattern-mining algorithm in conjunction with the IBM InfoSphere Warehouse Intelligent Miner.

Writers: Jennifer La Montagne, Department of Computer Science, and Jenny Applequist, Information Trust Institute, University of Illinois


This story was published March 22, 2010.