Mon, 18 Dec 2000

Watch out, the gaze-tracking-equipped computer is watching you!

By Zatni Arbi

ALMADEN, California (JP): "Try to fix your gaze on the opposite corner of the screen," said David Koons, an MIT Media Lab PhD, to Tony Waltham of the Bangkok Post. The cursor was sitting in the bottom left corner of the screen, so Tony fixed his eyes on a point in the top right-hand corner.

The moment he touched the mouse, the cursor jumped across the wide 18-inch TFT monitor to the spot he was gazing at. He did not have to drag the mouse at all to move the cursor, thanks to a technique called Manual And Gaze Input Cascaded pointing, or MAGIC.
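For readers who like to see how such a trick might work, here is a minimal sketch in Python of the MAGIC idea: the cursor warps toward the gaze point only when the hand touches the mouse, and the hand then finishes the fine positioning. The helper functions and the threshold value are my own assumptions, not IBM's actual code.

    import math

    WARP_THRESHOLD = 120  # pixels; only warp on long jumps (assumed value)

    def on_mouse_activity(cursor_xy, gaze_xy, set_cursor):
        # Called the moment the user touches the mouse. If the gaze point
        # is far from the cursor, warp the cursor there; the user's hand
        # then finishes the last few pixels of positioning as usual.
        if math.dist(cursor_xy, gaze_xy) > WARP_THRESHOLD:
            set_cursor(gaze_xy)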

Tony, I and about a dozen other IT journalists from Asia-Pacific countries were trying out a gaze-tracking technology called BlueEyes that the researchers at the IBM Almaden Research Center had been working on for some time. We were visiting the lab, which is the second largest of IBM's eight research facilities throughout the world. It was the second day of my recent trip to the U.S.

The eye-tracking technology uses a video camera and two infrared light sources that illuminate the eyes. One light source is placed on the axis of the camera's lens, while the other source is placed a bit off-axis.

When the on-axis source illuminates the eye, it produces a bright image of the pupil, rather like the red-eye effect you get when you photograph someone's face with the flash on and forget to turn on the camera's red-eye reduction. The off-axis light source, which needs to be calibrated for each user, creates an ordinary, dark-pupil image of the eyes. The system then uses the difference between the two images to determine where the eyes are looking.
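In code, the core of this trick could look something like the following sketch, which subtracts the dark-pupil frame from the bright-pupil frame and takes the centroid of what remains. Frame capture, per-user calibration and the actual gaze mapping are omitted, and the threshold value is an assumption.

    import numpy as np

    def find_pupil(bright_frame, dark_frame, threshold=40):
        # The pupil glows in the on-axis frame (the red-eye effect) and
        # stays dark in the off-axis frame, so subtraction isolates it.
        diff = bright_frame.astype(np.int16) - dark_frame.astype(np.int16)
        mask = diff > threshold
        if not mask.any():
            return None  # no pupil found in this pair of frames
        rows, cols = np.nonzero(mask)
        return rows.mean(), cols.mean()  # centroid of the bright blob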

A number of possible applications immediately come to mind. Could this eye-tracking technology, for example, be used by a quadriplegic to control the screen cursor and type messages on the computer just by using his eyes? What about stroke patients who have become paralyzed? Could this technology let them communicate with doctors and family members simply by gazing at the keys of a keyboard image on a computer screen?

There is a challenge here: many movements of the eyes are involuntary, and the resulting jitter makes it impossible for the cursor to land precisely on a narrow spot on the screen.

Still, there are things that can be done to improve the odds. A quadriplegic can use a chin rest to hold his head steady, and the on-screen keyboard image can be made large enough that the cursor has some leeway to accommodate the jitter. So, in theory, the possibility is there.
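The two workarounds can be expressed in a few lines of Python: average the most recent gaze samples to damp the jitter, then snap the result to the nearest oversized key. The key layout and the window size here are invented for illustration.

    def smooth(samples, window=8):
        # Average the last few gaze samples to damp involuntary jitter.
        recent = samples[-window:]
        return (sum(x for x, _ in recent) / len(recent),
                sum(y for _, y in recent) / len(recent))

    def nearest_key(gaze_xy, keys):
        # Snap the smoothed gaze point to the closest key center; the
        # larger the keys, the more leeway the jitter is given.
        gx, gy = gaze_xy
        return min(keys, key=lambda k: (k["x"] - gx) ** 2 + (k["y"] - gy) ** 2)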

Other possibilities

Would it not be better if our toll roads and highways were free of sleep-deprived drivers? We would certainly have far fewer traffic accidents, and a lot of lives would not be lost so tragically. Incidentally, just the night before the Almaden visit I had been irritated by the Indian cab driver who was taking me back to my hotel from Radio Shack because, while driving, he constantly turned his head to look at me as he talked proudly about the successes of Indian entrepreneurs in Silicon Valley. So I asked Dave whether automobile manufacturers would be interested in incorporating this technology into their cars to improve road safety.

"Mercedes has shown their interest in this technology. BlueEyes can certainly help monitor whether the driver is paying attention to the road in front of him or whether he is drowsy and is falling asleep on the wheel. It can also be combined with other sensors in the exterior of the car that, for example, can determine whether he is coming too close to the car in front of him for the speed he is traveling at," Dave said. "The issue is then whether the car should take action by slowing down or simply warn the driver." Whichever it does, I cannot wait until these technologies become commonplace in our cars and trucks.

In this column last week, I wrote about one of the applications of this technology, human face recognition, as shown during the alphaWorks presentation. Now imagine you are in a smart apartment, surrounded by hordes of voice-activated appliances and control devices.

When you say "Start" to your microwave oven, your DVD player, your WebTV set, your juice extractor, your coffeemaker, your food dispenser, your robot pet, your washing machine, your Internet appliance and your car key may all hear the same command and may wake up at the same time. You can imagine the commotion.

"In the future, we communicate with these intelligent appliances and devices not by just talking to them, but also by looking at them," said Dave. So, when you say "Start", the robot vacuum cleaner, for example, will first check whether you are looking at it. If you are not, it will know that you are not telling it to start vacuuming.

Attentive Environment

David Koons is the head of the BlueEyes team within the USER group at the Almaden center. The name USER is capitalized here because it stands for User Systems Ergonomics (or Experience) Research. By adding the capability of visual perception to smart machines, Dave's team aims to create an attentive environment in which humans and machines can communicate more like two friends who can read each other's minds.

Is the BlueEyes technology very expensive to implement? "Not really," was the answer. "We do not need high-quality cameras for tracking the human eyes." A low-resolution camera may even be better for safeguarding privacy, as we never know how people may be dressed when they sit in front of their home PCs. "A CMOS camera costs only about US$10.00 apiece, and the 128 by 128-pixel camera that Nintendo uses in its gadgets costs about $6.00," Dave added. Even with these rather crude components, BlueEyes can detect human eyes from up to five meters away, even when the person is wearing glasses. This robust technology can also detect the presence of more than one person at a time.

When are we going to see BlueEyes embedded in ThinkPads and NetVistas? Dave thinks that in two years the technology will be ready for implementation in PCs and notebooks. At that time, the interaction between humans and computers will be richer than ever, as the machines will have the ability to anticipate users' next requirement by following the movements of their eyes.

SUITOR, or Simple User Interest Tracker, is an application developed by Dave's group that provides a good example of how this can be implemented. When the user is browsing, for example, the computer can follow his eyes as they move down the page. The moment the eyes hit a link, the computer anticipates that the user will jump to the Web site the link points to once he finishes reading the entire page. It then proactively caches that Web page and displays it in another window for the user to read, saving a lot of time and increasing efficiency.
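A bare-bones sketch of such prefetching, using only Python's standard library, might look like the following. The gaze-dwell test that decides the eyes have "hit" a link is left out, and the cache here is just a dictionary; none of this is SUITOR's actual code.

    import threading
    import urllib.request

    def prefetch(url, cache):
        # Fetch a page in the background so it is ready the moment the
        # reader finishes the current page.
        def worker():
            with urllib.request.urlopen(url) as response:
                cache[url] = response.read()
        threading.Thread(target=worker, daemon=True).start()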

Already proven to be a very robust technology, BlueEyes has a lot of potential applications, including the detection of the user's emotional state.

"There are just a few key features of the human face that can sufficiently show his or her feelings," said Dave. "Just look at comic strips, you can somehow detect the emotion of the figures although the pictures are not very detailed." Once the computer knows what psychological and emotional states you are in, it can respond accordingly. A computer used in training, for example, can offer more help if it senses that the user is frustrated because he still cannot comprehend a topic.

Next week, we will explore yet another research area of the Almaden Research Center, i.e. intelligent jewelry. You see, when you are in a technology research environment like this one, it is so difficult not to get excited.

By the way, last week I gave you the URL of a machine translation demo from IBM's alphaWorks. We have been out of luck: the site has been experiencing problems and was still unavailable at the time I wrote this article. (zatni@cbn.net.id)