
In the early 1990s, before the World Wide Web, I was involved in research applied to digital libraries. At Bell Labs, we built the RightPages Library, which incorporated novel methods in document image processing and watermarking and introduced several publishers (including IEEE, ACM, and Elsevier) to the complexities of the new digital publishing era. It was a pioneering time in both technology and publishing, which was very exciting. Today, digital libraries are commonplace. (Some of the most impressive are UNESCO libraries for developing countries, built using the Greenstone digital library software; see “Pattern Recognition in Digital Libraries”, IAPR Newsletter, July 2006.)

After digital libraries, in the late 1990s, I was lucky enough to become involved in another pioneering technology era, although this one I’d describe more as the “wild west”. There was an explosion of interest in expanding the application of biometrics beyond its traditional use in criminal investigation and forensics. The first solid-state fingerprint scanner was developed at Bell Labs, and I felt that this was the device that would finally free us from passwords, PINs, and keys. During this time, I came to understand the value and limitations of biometrics and where it fits in with other technologies. I summed up my thoughts in a paper in the Proceedings of the IEEE, “Comparing Passwords, Tokens, and Biometrics”.

I have recently become involved in another technical area that I think has the potential to bring about great positive change in the world. The new area is telepresence. At Bell Labs we call this “immersion at a distance”. I am always self-conscious when talking about this as “new” work, because AT&T first introduced the Picturephone at the New York World’s Fair in 1964. That was supposed to change the world, too. Yet 46 years later, most people still have voice-only telephones in their offices and homes. Granted, about 46% of Skype calls are video, but these small, low bit-rate video windows could not be called immersive.

So why, in half a century, has telepresence not caught on? The answer is multifaceted, and we are working on a few ideas at Bell Labs that we hope will address the issues. One issue is that you, the remote user, have no way to see anything at the other end that lies outside the camera’s range. With remote control of the camera view, you would be able to look around the room as if you were there. Some pattern recognition features can be added to this, such that your remote surrogate would automatically turn your view to a sound source or a visual change in your “peripheral vision”. I was recently on a short trip to Europe. I realized that I could have shipped my surrogate “eyes, and ears, and mouth” by express mail to the meeting, participated remotely, and then discarded all of the equipment, and it still would have cost less than my flight and hotel!

Another reason that telepresence has not caught on may be privacy. It may be that people don’t want a camera constantly transmitting their image. An answer to this may be to capture and store pre-screened images, then to match facial expressions and pose on a real-time basis and send these matched images instead of real-time video. This is like an avatar, but those on the receiving end see the real person. The privacy concern may also relate to the background of the video: fear that video from a home office may look unprofessional, or that video of an office background may reveal proprietary material. In this case, the background could be suppressed. Background suppression algorithms exist, but they currently lack the quality to smoothly replace the actual surroundings with the desired background.
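The matching step in the pre-screened image idea can be pictured as a nearest-neighbor lookup. What follows is only a minimal sketch under assumptions that are not from the article: each stored image and each live frame is assumed to be already reduced to a small feature vector (the three features, the labels, and the function name `closest_stored_image` are all hypothetical), and similarity is assumed to be plain Euclidean distance.

```python
import math

def closest_stored_image(live_features, stored):
    """Return the label of the stored image whose features are nearest to the live frame."""
    best_label = None
    best_dist = math.inf
    for label, features in stored.items():
        # Euclidean distance between the live frame and this stored image.
        dist = math.dist(live_features, features)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

# Hypothetical feature vectors: (mouth openness, brow raise, head yaw in radians).
stored_images = {
    "neutral": (0.1, 0.0, 0.0),
    "smiling": (0.6, 0.1, 0.0),
    "surprised": (0.8, 0.9, 0.0),
    "looking_left": (0.1, 0.0, 0.5),
}

live_frame = (0.55, 0.2, 0.05)
print(closest_stored_image(live_frame, stored_images))  # prints "smiling"
```

Only the label of the chosen image would need to be transmitted, so the live camera picture never leaves the sender's side.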

A third issue that may have prevented telepresence from taking off may be simply that current methods don’t scale well. This is obvious for bandwidth, but it is also true for screen real estate. I’ve recently been on some teleconferences with about 30 other people. To the best of my knowledge, there isn’t a videoconferencing system that scales well to that size. Again, we can use pattern recognition to help address this. Speech and motion analysis can detect state changes, so that only the one or few of the 30 people with audible or visible reactions are displayed, directing the viewers’ attention to them. Speech and vision analytics that are tuned to each person can even eliminate the need to send video by recognizing the emotion or intent of a person and conveying messages such as, “Mr. X has a question.” or, “Ms. Y appears to disagree with what was said.”
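As a rough illustration of choosing whom to display, the sketch below assumes that some upstream analysis has already produced a speech level and a motion level between 0 and 1 for each participant. The weighting, the threshold, and the function names `combined_activity` and `participants_to_display` are hypothetical choices made for this example, not a description of any Bell Labs system.

```python
def combined_activity(speech_level, motion_level, speech_weight=0.6):
    """Blend speech and motion levels (each in 0..1) into one activity score."""
    return speech_weight * speech_level + (1.0 - speech_weight) * motion_level

def participants_to_display(activity, max_shown=3, threshold=0.5):
    """Pick the most active participants whose score exceeds the threshold."""
    active = [(score, name) for name, score in activity.items() if score > threshold]
    # Highest score first; ties are broken alphabetically by name.
    active.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in active[:max_shown]]

# Hypothetical (speech, motion) measurements for three of the participants.
measurements = {
    "Mr. X": (0.9, 0.2),   # speaking
    "Ms. Y": (0.2, 1.0),   # silent but visibly reacting
    "Mr. Z": (0.0, 0.1),   # idle
}
scores = {name: combined_activity(s, m) for name, (s, m) in measurements.items()}
print(participants_to_display(scores))  # prints ['Mr. X', 'Ms. Y']
```

With 30 participants the same selection keeps the screen limited to the few people who are reacting, and video for the others need not be shown at all.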

I don’t know which of these or other ideas will finally make telepresence comfortable and desirable to users. However, I do know that after 46 years, telephony’s halted step into the 21st century is long overdue. Just as for digital libraries and biometrics, I hope to participate in this next step.

Getting to Know…

Larry O’Gorman, IAPR Fellow

By Larry O’Gorman, IAPR Fellow (USA)

Has the time for telepresence finally come?

Other articles in the Getting to Know... series:

Biometrics: The key to the gates of a secure and modern paradise by Nalini K. Ratha, January 2010

Recognition of Human Activities: A Grand Challenge by J.K. Aggarwal, October 2009

Lawrence O'Gorman is a Distinguished Member of Technical Staff at Bell Laboratories working in areas of image and signal processing, pattern recognition, and multimedia security. Before that, he was a Research Scientist at Avaya Labs and Chief Scientist and co-founder of Veridicom (a fingerprint device company), and earlier still he worked at Bell Labs for the first time.

He has written over 70 technical papers and eight book chapters, holds 15 patents, and is co-author of the books "Practical Algorithms for Image Analysis", published by Cambridge University Press, and "Document Image Processing", published by IEEE Press. He is a Fellow of the IEEE and of the International Association for Pattern Recognition. In 1996, he won the Best Industrial Paper Award at the International Conference on Pattern Recognition (ICPR) and an R&D 100 Award for one of "the top 100 innovative technologies of that year."

He has served on US government panels to NIST, NSF, and NAE, and to France's INRIA. He is an adjunct faculty member at Poly/NYU and the Cooper Union.

He received the B.A.Sc., M.S., and Ph.D. degrees in electrical and computer engineering from the University of Ottawa, the University of Washington, and Carnegie Mellon University, respectively.