Image matching software has become widely available. If you haven’t yet played with it, try dragging an image, for example, a photo of yourself, into the Google search bar. Or, for fun, you can try Doggelganger, “human to canine pairing software” promoted by pet food manufacturer Pedigree.
Doggelganger’s purpose is to promote the adoption of homeless dogs in New Zealand by finding dogs that bear some facial resemblance to a potential adopter. And, we’re told, security agencies have sophisticated face recognition systems behind the cameras in airports and at border crossings, programmed to watch out for specific individuals.
Photo comparison brings up one of my favorite questions: what does it mean for two things to be “similar”?
Intuitively, we think we understand the concept well: two faces, for example, are similar if … well … if they look alike! It’s not quite that fuzzy: we can make the definition more rigorous by defining two faces as similar if a significant number of people rate them as looking similar.
Separated at Birth? Secretary of State Hillary Clinton and actress Emma Thompson (from inmirror.com)
Or we can use a performance-based definition and say that two faces are similar if, when asked to find matches among a number of photos, people frequently mistake one for the other.
The problem with these definitions is that, while they are operational, they are not algorithmic. They are of no use to us in programming a computer to judge the similarity between two faces.
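To see what "operational but not algorithmic" means, here is a minimal sketch of how the performance-based definition could be scored once people have done the matching. The trial data, the face labels, and the `confusion_rate` helper are all hypothetical, made up for illustration. Note that the code never looks at a face: it only tallies human answers, so it can't judge a pair nobody has been tested on.

```python
def confusion_rate(trials, face_x, face_y):
    """Fraction of trials showing face_x or face_y in which the viewer picked the other one.

    `trials` is a list of (shown, chosen) label pairs collected from human viewers.
    """
    relevant = 0
    confused = 0
    for shown, chosen in trials:
        if shown in (face_x, face_y):
            relevant += 1
            if chosen in (face_x, face_y) and chosen != shown:
                confused += 1
    return confused / relevant if relevant else 0.0


# Hypothetical matching results: (face shown, face the viewer picked).
trials = [
    ("A", "A"), ("A", "B"), ("B", "A"), ("B", "B"),
    ("C", "C"), ("A", "A"), ("C", "C"), ("B", "B"),
]

rate_ab = confusion_rate(trials, "A", "B")  # A and B get mixed up in 2 of 6 trials
rate_ac = confusion_rate(trials, "A", "C")  # A and C are never mixed up
print(f"A vs B: {rate_ab:.2f}, A vs C: {rate_ac:.2f}")
```

Under this definition, A and B count as more similar than A and C, but only because people said so.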
Digging into the problem in the naïve ways (looking at pixels, colors, geometry) doesn’t help much. From a raw image perspective, there’s a lot more difference between two photos of you, one in which you’re wearing a hat and standing in front of a plain background in daylight and the other in which you’re hatless at night, than there is between a photo of you and one of me, both taken under similar lighting. Nevertheless, humans (and pigeons, it turns out) are very good at recognizing individual faces despite hats, glasses, lighting, viewing angle, and other distracting features.
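A toy example makes the pixel problem concrete. This is only a sketch under loud assumptions: the 4x4 grayscale "faces" are invented, the `relight` helper stands in for a day-to-night lighting change by simply darkening every pixel, and mean absolute pixel difference is just one of many naïve measures you could pick.

```python
def mean_abs_diff(img_a, img_b):
    """Mean absolute per-pixel difference between two equal-sized grayscale images."""
    if len(img_a) != len(img_b) or any(len(ra) != len(rb) for ra, rb in zip(img_a, img_b)):
        raise ValueError("images must have the same dimensions")
    total = sum(abs(pa - pb) for ra, rb in zip(img_a, img_b) for pa, pb in zip(ra, rb))
    count = sum(len(row) for row in img_a)
    return total / count


def relight(img, gain):
    """Crudely simulate a lighting change: scale every pixel, clamped to 0..255."""
    return [[min(255, max(0, round(p * gain))) for p in row] for row in img]


# Hypothetical 4x4 "faces": person_b differs from person_a in just two pixels.
person_a = [
    [200, 200, 200, 200],
    [200,  40,  40, 200],
    [200, 200, 200, 200],
    [200,  80,  80, 200],
]
person_b = [
    [200, 200, 200, 200],
    [200,  40,  40, 200],
    [200, 120, 200, 200],
    [200,  80, 160, 200],
]

person_a_night = relight(person_a, 0.3)  # same "person", much darker photo

same_person_diff = mean_abs_diff(person_a, person_a_night)
different_people_diff = mean_abs_diff(person_a, person_b)
print(same_person_diff, different_people_diff)
```

By this measure the two photos of the same person are far apart (115.5) while the two different people are close (10.0), which is exactly backwards from what we want.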
There’s an entire, well-developed literature on cognitive and computational aspects of image similarity; it’s well beyond the scope of this brief blog entry. Here, I just want to raise two points, each of which I’ll discuss in a future posting:
- There’s a lot more to similarity than first appears: assessing similarity requires deep knowledge. Comparing pictures of faces requires knowledge about people and about how light and shadow work. Comparing musical recordings requires knowledge about music and musical instruments. Comparing paragraphs of text requires knowledge of meaning.
- When we design systems to assess similarity, we need to be very precise about the task. In particular, we need to be clear about what kind of mistakes and near misses we expect the system to make.
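
The second point, about mistakes, can be previewed with a small sketch. Assume, hypothetically, that some system hands us a similarity score between 0 and 1 for each pair of photos, and that we call a pair a match when the score reaches a threshold. The scores, labels, and the `count_errors` helper below are invented for illustration; nothing here describes how any real face recognition system works.

```python
def count_errors(scored_pairs, threshold):
    """Return (false_matches, missed_matches) when pairs scoring >= threshold are called a match.

    `scored_pairs` is a list of (similarity_score, really_same_person) tuples.
    """
    false_matches = 0
    missed_matches = 0
    for score, same_person in scored_pairs:
        predicted_match = score >= threshold
        if predicted_match and not same_person:
            false_matches += 1
        elif not predicted_match and same_person:
            missed_matches += 1
    return false_matches, missed_matches


# Hypothetical scores for six photo pairs, with the true answer for each.
scored_pairs = [
    (0.95, True), (0.80, True), (0.75, False),
    (0.60, True), (0.40, False), (0.20, False),
]

strict = count_errors(scored_pairs, 0.9)   # few false matches, more misses
lenient = count_errors(scored_pairs, 0.5)  # fewer misses, more false matches
print("strict:", strict, "lenient:", lenient)
```

A strict threshold misses two genuine matches and a lenient one lets in a false match. Which mistake is worse depends on the task: a border checkpoint and a dog adoption site would presumably choose differently.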


