The new system, which is the first computer system to combine both face and voice cues, has potential for use in security systems, market research and human/computer interaction systems as well as other applications. For example, the system could be used to signal when unauthorized individuals try to enter a restroom, fitting room or dormitory. Law enforcement agents or private citizens could also use it as part of a system designed to scan videotape to find images of a particular individual. The new system could also be used to collect market research data automatically, for example, on how many women and how many men sat behind the wheel of a specific car at a car showroom or selected white versus red chrysanthemums at a garden center.
Sharma adds, "The new system could even be used to provide a richer communication environment for people using human and computer interaction systems." For example, a computer system that helps people locate stores in a mall might be able to provide better service if it could recognize the gender of the person inquiring.
The new system is based on a powerful pattern recognition software technique called support vector machines (SVMs), which can learn from examples. SVMs have previously been used to scan cell samples for abnormalities and in other applications where patterns are very similar and difficult to separate. Sharma and his research group adapted SVMs separately for face and voice recognition. The Penn State researchers trained the software dedicated to faces on 1755 thumbnail images of human faces from a standard database. The thumbnails showed only the section of the face that includes eyes, nose and mouth - no hair, ears or neck. Another SVM was trained on voice samples. The voice samples also came from a standard database and included just fractions of a second of voice data.
When the face software and voice software each had been trained separately to the level of human proficiency at classifying gender, Sharma and his group added an SVM "manager" to fuse the results, make the final gender classification and improve the system's accuracy. For example, if the face software and voice software disagree on a particular gender classification, the SVM manager reviews both decision processes - with a view to the specific weaknesses of the face and voice software - and makes the final decision. The result is a system that classifies gender correctly more often than human beings supplied with the same data.
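The fusion arrangement described above can be illustrated with a minimal sketch. It is not the Penn State group's software: the data are synthetic stand-ins for face and voice features, the linear SVM is a simple hinge-loss trainer written for this illustration, and every name in it (train_linear_svm, fuse_inputs, manager_svm and so on) is hypothetical. It shows only the general idea of two separately trained classifiers whose decision values feed a third "manager" classifier that makes the final call.

```python
import random

def train_linear_svm(X, y, lam=0.01, lr=0.01, epochs=30, seed=0):
    """Fit a linear SVM by subgradient descent on the regularized hinge loss.
    X is a list of feature lists, y a list of labels in {-1, +1}."""
    rng = random.Random(seed)
    w, b = [0.0] * len(X[0]), 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            margin = y[i] * (sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            # The L2 regularizer shrinks the weights a little on every step.
            w = [wj * (1.0 - lr * lam) for wj in w]
            if margin < 1.0:  # inside the margin or misclassified
                w = [wj + lr * y[i] * xj for wj, xj in zip(w, X[i])]
                b += lr * y[i]
    return w, b

def score(model, x):
    """Signed decision value; its sign gives the predicted class."""
    w, b = model
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def predict(model, x):
    return 1 if score(model, x) >= 0 else -1

def make_samples(n, rng):
    """Synthetic stand-in data (an assumption, not the databases in the
    article): each sample is (face_features, voice_features, label)."""
    samples = []
    for _ in range(n):
        label = rng.choice([-1, 1])
        face = [rng.gauss(label, 1.0) for _ in range(2)]
        voice = [rng.gauss(label, 1.0) for _ in range(2)]
        samples.append((face, voice, label))
    return samples

rng = random.Random(42)
base_set = make_samples(200, rng)     # trains the face and voice SVMs
manager_set = make_samples(200, rng)  # held out to train the manager
test_set = make_samples(200, rng)     # used only for evaluation

labels = [s[2] for s in base_set]
face_svm = train_linear_svm([s[0] for s in base_set], labels)
voice_svm = train_linear_svm([s[1] for s in base_set], labels)

def fuse_inputs(sample):
    """The manager sees only the two base classifiers' decision values."""
    return [score(face_svm, sample[0]), score(voice_svm, sample[1])]

manager_svm = train_linear_svm([fuse_inputs(s) for s in manager_set],
                               [s[2] for s in manager_set])

def accuracy(predict_fn, samples):
    return sum(predict_fn(s) == s[2] for s in samples) / len(samples)

face_acc = accuracy(lambda s: predict(face_svm, s[0]), test_set)
voice_acc = accuracy(lambda s: predict(voice_svm, s[1]), test_set)
fused_acc = accuracy(lambda s: predict(manager_svm, fuse_inputs(s)), test_set)
print(f"face {face_acc:.3f}  voice {voice_acc:.3f}  fused {fused_acc:.3f}")
```

In this sketch the manager is trained on samples the face and voice classifiers never saw, a common precaution so that it learns how reliable each one is on new data rather than on data they have already fit. Whether the researchers' manager was trained this way is not stated.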
Besides Sharma, the inventors include his former student, Leena A. Walavalkar, and visiting researcher Dr. Mohammed Yeasin. Penn State has submitted a provisional application for a patent on the invention, which has been licensed to Sharma's company, Advanced Interface Technologies. Sharma thinks the new system will eventually find application as part of a wide variety of "intelligent" systems. He imagines, for example, a passive security checkpoint system for a "smart building," where people would not have to stop to provide identification or even slow down. They could simply walk past the checkpoint, say hi and keep on going - unless they were intruders.
Sharma also notes that the system could be used to help ensure gender equity and equal access to public facilities by alerting officials if one gender predominated.
The research was supported, in part, by a grant from the National Science Foundation.