skip to main content

An algorithm for understanding

Computer science and engineering researcher uses computational models to help computers understand what they see

Mircea Nicolescu

Most people can look at an image, a photograph for example, and instantly draw a large number of conclusions and inferences about what they see. You might be able to recognize famous individuals, interpret what the people in the image are trying to accomplish, and draw conclusions about the state of mind of those people.

That reasoning can happen in a split-second for you or I, but developing computer systems that are capable of doing the same thing is a life's work for Mircea Nicolescu.

"Robots, to be really useful, need to see and understand what they see," Nicolescu said. "Vision is the most natural way of sensing the environment."

Nicolescu is working on a number of research problems that can unlock ways to help computers better interpret the stream of visual data they capture, including ways for computers to segment an image into meaningful regions, group related objects or activities, or recognize certain activities and even be able to infer intent from those activities.

All of these problems are directly related to Nicolescu's main focus on surveillance-related applications. By giving computers the ability to recognize certain activities, such as leaving a bag unattended in a sensitive area like an airport, he hopes to be able to develop systems that don't require constant monitoring but can instead flag activities of interest.

"The ultimate goal for these surveillance-related applications would be to replace or reduce the need for human operators," he said.

For Nicolescu a key challenge is developing solutions that allow computers to deal with the natural variation in the world around us.

"I can always model a certain action in a very restricted way so that I can detect it," he said. "But then when things don't happen exactly like that, it turns out I need a more general description."

In addition to challenges posed by variation in human action - getting a computer to recognize an activity despite differences in how it is performed - Nicolescu is also tackling the daunting problem of developing computer systems that can understand those actions in context.

"That means not just recognizing an activity but based on what you see trying to infer the intentions of the persons or the agents involved in that," Nicolescu said. "Essentially trying to predict what they are trying to do or what they are going to do next. This has various applications to the military or for security applications to detect threats."

Nicolescu is collaborating with researchers in the robotics lab on the intent recognition work, which is funded by the Office of Naval Research.

Nicolescu is also working on ways to help computers distinguish between inherent background changes to a scene—say rain falling or a tree moving in the wind—and foreground changes, due to people or objects of interest. Such segmentation, according to Nicolescu, may be a prerequisite for computer vision to be able to achieve higher-level goals such as activity recognition and intent recognition.

While the algorithmic solutions to computer vision remain a research problem, Nicolescu said the field has made many advances as computing power has improved.

"In computer vision, historically the limitations were both due to the algorithm side, solving those problems that I mentioned, as well as due to the limited processing speed available," he said. "Vision is quite expensive computationally. Now with faster computers being readily available, that also helped with better and better algorithms that can realistically be used."

While Nicolescu's work is heavily mathematical, his interest in the field stems from a fascination with the visual.

"I always found it fascinating to look at an image or a video and then be able to automatically extract things of interest to understand what is in there," he said. "Essentially we're trying to emulate what we do with our brain and our eyes in the human vision system, with a camera instead of eyes and computer instead of brain."

Take the next step...