At a high level, we will find the eigenfaces of the input data set. We can think of these as a basis set of faces that can be combined linearly to represent other faces. We calculate and store the weight coefficients needed to represent each face in the data set in terms of the eigenfaces. When presented with a new face, we compute the weight coefficients needed to represent it with our eigenfaces. We can then perform rudimentary facial recognition by comparing the computed weights of the new face against the weights of all previously known faces in the database.
For this project, we are using the well-known Olivetti Faces data set, which contains 64x64-pixel grayscale images of 40 individuals. There are 10 images of each individual, exhibiting different facial expressions, angles, and lighting. Here is a sample of the first 4 individuals:
In addition to the standard 400 images, I've also added 10 images of myself to the data set:
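If you want to follow along, the standard images can be fetched with scikit-learn. The snippet below is a sketch; `load_my_images` is a hypothetical helper standing in for however the extra images get loaded:

```python
import numpy as np
from sklearn.datasets import fetch_olivetti_faces

# Fetch the standard Olivetti data set: 400 grayscale images of
# 40 people, 64x64 pixels, with values scaled to [0, 1].
faces = fetch_olivetti_faces()
images = faces.images    # shape: (400, 64, 64)
labels = faces.target    # shape: (400,), person IDs 0-39

# Hypothetical: append 10 images of a new subject with ID 40.
# `load_my_images()` is a placeholder for loading the extra 64x64
# grayscale images as a (10, 64, 64) float array.
# images = np.concatenate([images, load_my_images()])
# labels = np.concatenate([labels, np.full(10, 40)])
```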
We represent each image as one long flattened vector. Our data set is composed of 64x64 images; thus, each image is represented by a 4096-dimensional vector. We pack all of these vectors into a matrix such that each row represents one of the input images.
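With NumPy, this packing step is a single reshape (a sketch building on the `images` array above):

```python
# `images` has shape (n_images, 64, 64); flatten each image into a
# 4096-dimensional row, giving the (n_images, 4096) data matrix.
X = images.reshape(len(images), -1)
```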
Next, we need to calculate the mean (average) face of the data set. This can be done by averaging the pixel values of all images across each dimension. This looks something like:
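In code, this is one reduction over the rows of `X` (a sketch):

```python
# Average over the image axis: each of the 4096 pixel positions is
# averaged across all faces, yielding the mean face.
mean_face = X.mean(axis=0)                    # shape: (4096,)
mean_face_image = mean_face.reshape(64, 64)   # for display
```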
We then subtract this mean face from each face in our matrix. Applying a singular value decomposition (SVD) to the resulting matrix (transposed, so that each column is a face) gives us the principal components of the data.
If we let $A$ denote the transposed, mean-subtracted data matrix (each of its columns is one face), its singular value decomposition is

$$A = U \Sigma V^T$$

The benefit of using the SVD is that it allows us to avoid explicitly computing the covariance matrix

$$C = A A^T$$

The eigenfaces we are searching for are the eigenvectors of $C$. Therefore, it follows that

$$C = A A^T = U \Sigma V^T V \Sigma^T U^T = U \Sigma \Sigma^T U^T$$

so the columns of $U$ are the eigenvectors of $C$, i.e., our eigenfaces, with corresponding eigenvalues $\sigma_i^2$.
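A minimal sketch of this step, reusing `X` and `mean_face` from the earlier snippets:

```python
import numpy as np

# Subtract the mean face from every image, then transpose so that
# each column of A is one mean-subtracted face.
A = (X - mean_face).T    # shape: (4096, n_images)

# Thin SVD: the columns of U are the eigenfaces, and S holds the
# singular values in descending order.
U, S, Vt = np.linalg.svd(A, full_matrices=False)
```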
We determine the number of eigenfaces we want by using the `retention` variable passed to the constructor of our `FaceRecognizer` class. We compute the total energy of the matrix, $\sum_i \sigma_i^2$, and keep the smallest number $k$ of eigenfaces whose cumulative energy accounts for at least that fraction of the total:

$$\frac{\sum_{i=1}^{k} \sigma_i^2}{\sum_{i=1}^{n} \sigma_i^2} \geq \text{retention}$$
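A sketch of how that cutoff might be computed from the singular values (this mirrors the criterion above, not necessarily the exact implementation):

```python
import numpy as np

retention = 0.9

# The energy of each component is its squared singular value. Keep
# the smallest k whose cumulative energy fraction reaches `retention`.
energy = S ** 2
cumulative = np.cumsum(energy) / energy.sum()
k = int(np.searchsorted(cumulative, retention)) + 1

eigenfaces = U[:, :k]    # shape: (4096, k)
```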
In our case, with a desired retention of 0.9, we end up with roughly 270 eigenfaces. Here is a sample of what the first 40 look like when reshaped into 64x64 images:
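A grid like this can be produced with matplotlib (a sketch):

```python
import matplotlib.pyplot as plt

# Display the first 40 eigenfaces in a 5x8 grid.
fig, axes = plt.subplots(5, 8, figsize=(12, 8))
for i, ax in enumerate(axes.flat):
    ax.imshow(eigenfaces[:, i].reshape(64, 64), cmap="gray")
    ax.axis("off")
plt.show()
```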
We can determine the weight $w_i$ of the $i$-th eigenface needed to represent a face $x$ by projecting the mean-subtracted face onto that eigenface:

$$w_i = u_i^T (x - \mu)$$

where $u_i$ is the $i$-th eigenface (the $i$-th column of $U$) and $\mu$ is the mean face.
Internally, the model computes and stores the weight vectors needed to represent each of the training images; these are used for future comparisons.
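In matrix form, all of the training weights come from a single projection (a sketch using the names from the earlier snippets):

```python
# Project every mean-subtracted training image onto the eigenfaces.
# Row i of `train_weights` is the k-dimensional weight vector for
# training image i.
train_weights = (X - mean_face) @ eigenfaces    # shape: (n_images, k)
```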
We can pass in an image that the model has never seen before. The model then computes the weights of the eigenfaces needed to represent that image and compares them against the stored weights of the training images. Our implementation uses Euclidean distance as the metric for comparing weight vectors; the training image that minimizes the Euclidean distance (i.e., the training image whose weights are most similar to the input image's weights) is selected as the match.
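A sketch of the matching step (the real `FaceRecognizer` method may differ in its details):

```python
import numpy as np

def predict(new_image, mean_face, eigenfaces, train_weights, labels):
    """Return the label of the training image closest in weight space."""
    # Project the unseen image onto the eigenfaces.
    w = (new_image.reshape(-1) - mean_face) @ eigenfaces
    # Euclidean distance from w to every stored weight vector.
    distances = np.linalg.norm(train_weights - w, axis=1)
    return labels[np.argmin(distances)]
```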
As a test, we can pass in a previously unseen image of myself to the model:
And in response, the model returns:
Thus, we have a positive identification.