Cascade classification

Objects can be detected with the Haar feature-based cascade classifier. This object detector was first proposed by Paul Viola and Michael J. Jones in their 2001 paper “Rapid Object Detection using a Boosted Cascade of Simple Features” and was later improved by Rainer Lienhart.

The approach is based on machine learning and uses many positive images (with faces) and negative images (without faces) for training. Afterwards, the trained detector is able to find faces in new images.

The approach uses Haar features, which are similar to convolution kernels, to detect a feature in a given image. Each feature is represented by a single value, obtained by subtracting the sum of the pixels under the white rectangle from the sum of the pixels under the black rectangle. Different types of Haar features, e.g. edge, line, and four-rectangle features, are shown in the figure.

Haar features
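The computation of a single feature value can be sketched in a few lines of plain Python. This is only an illustration: the helper names `rect_sum` and `edge_feature` and the small test patches are made up for this example, and the feature is assumed to be a two-rectangle edge feature with the white rectangle on the left and the black rectangle on the right.

```python
def rect_sum(img, x, y, w, h):
    """Sum of the pixels in the rectangle with top-left (x, y), width w, height h."""
    return sum(img[r][c] for r in range(y, y + h) for c in range(x, x + w))


def edge_feature(img, x, y, w, h):
    """Two-rectangle edge feature: white left half, black right half.

    Uses the convention from the text: black sum minus white sum.
    """
    half = w // 2
    white = rect_sum(img, x, y, half, h)
    black = rect_sum(img, x + half, y, half, h)
    return black - white


# Hypothetical 4x4 grayscale patch with a vertical edge in the middle.
patch = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
]
# A uniform patch without any edge, for comparison.
flat = [[50] * 4 for _ in range(4)]

edge_response = edge_feature(patch, 0, 0, 4, 4)
flat_response = edge_feature(flat, 0, 0, 4, 4)
print(edge_response, flat_response)
```

The value is large in magnitude where the image contains the pattern the feature describes and zero on the uniform patch.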

To apply the features, every feature is evaluated at every position in the image, which takes considerable computation time. The next figure shows Haar features on an image.

Haar features on image
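To get a feeling for the effort involved, the number of possible features in one detection window can be counted. The sketch below assumes a 24x24 window and five basic feature shapes (two edge, two line, and one four-rectangle shape), each scaled in whole multiples of its base size. The function name `count_features` and the shape list are chosen for this example only.

```python
def count_features(window, base_w, base_h):
    """Count all placements of one feature shape at every scale and position."""
    total = 0
    for w in range(base_w, window + 1, base_w):      # widths: multiples of the base width
        for h in range(base_h, window + 1, base_h):  # heights: multiples of the base height
            total += (window - w + 1) * (window - h + 1)
    return total


# Assumed feature set, given as (width, height) of the smallest version.
SHAPES = [(2, 1), (1, 2), (3, 1), (1, 3), (2, 2)]

total = sum(count_features(24, bw, bh) for bw, bh in SHAPES)
print(total)  # 162336
```

Under these assumptions a single window already contains more than 160,000 candidate features.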

The Viola-Jones algorithm evaluates the features in a 24x24 window that is moved over the image. Considering all positions, scales, and types of features within this window results in more than 160,000 features. To reduce the computation time, an integral image is used. In the integral image, the value at pixel (x,y) is the sum of all pixels above and to the left of (x,y). With it, the sum of the pixels in any rectangle can be computed from only four values of the integral image, regardless of the size of the rectangle, which makes evaluating a feature much faster. Of the more than 160,000 possible features, only a few are useful.
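The integral image and the four-value rectangle sum can be sketched as follows. This is a minimal pure-Python illustration, not the OpenCV implementation. It assumes an extra row and column of zeros at the top and left, which avoids special cases at the image border; the function names are chosen for this example.

```python
def integral_image(img):
    """Return a (rows+1) x (cols+1) table where ii[y][x] is the sum of
    all pixels img[r][c] with r < y and c < x."""
    rows, cols = len(img), len(img[0])
    ii = [[0] * (cols + 1) for _ in range(rows + 1)]
    for y in range(rows):
        row_total = 0
        for x in range(cols):
            row_total += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_total
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum over the rectangle with top-left (x, y) and size w x h,
    using exactly four lookups in the integral image."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


img = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
ii = integral_image(img)
whole = rect_sum(ii, 0, 0, 3, 3)         # all nine pixels
bottom_right = rect_sum(ii, 1, 1, 2, 2)  # 5 + 6 + 8 + 9
print(whole, bottom_right)
```

The integral image is built once per image; afterwards every rectangle sum costs the same four lookups, whether the rectangle covers four pixels or four hundred.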

To find the best features, AdaBoost, a machine learning algorithm, is used. Every feature is applied to all training images. For each feature, AdaBoost finds the best threshold for classifying the images as positive (face) or negative (non-face). The features with the lowest error rate are then selected, since these are the features that best separate images with faces from images without faces.
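The threshold search for a single feature can be sketched like this. The function `best_stump` and the sample data are hypothetical, and the search is deliberately brute force for readability. Each training image carries a weight, and the error of a threshold is the total weight of the images it misclassifies.

```python
def best_stump(values, labels, weights):
    """Find the threshold and polarity with the lowest weighted error.

    values:  feature value of each training image
    labels:  1 for face, 0 for non-face
    weights: non-negative sample weights that sum to 1
    An image is predicted as a face when polarity * value < polarity * threshold.
    """
    best = (float("inf"), None, None)  # (error, threshold, polarity)
    for threshold in sorted(set(values)):
        for polarity in (1, -1):
            error = 0.0
            for v, y, w in zip(values, labels, weights):
                predicted = 1 if polarity * v < polarity * threshold else 0
                if predicted != y:
                    error += w
            if error < best[0]:
                best = (error, threshold, polarity)
    return best


# Hypothetical feature values for six training images.
values = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
labels = [1, 1, 1, 0, 0, 0]
weights = [1 / 6] * 6

error, threshold, polarity = best_stump(values, labels, weights)
print(error, threshold, polarity)
```

Running this search for every feature and keeping the feature with the smallest error gives one round of feature selection. Between rounds, AdaBoost increases the weights of the misclassified images so that the next feature concentrates on them.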

AdaBoost constructs a strong classifier as a weighted linear combination of weak classifiers.
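A sketch of this combination is shown below. It assumes that each weak classifier outputs 1 (face) or 0 (non-face), that its weight is derived from its weighted training error as log((1 - error) / error), and that the strong classifier reports a face when the weighted vote reaches half of the total weight. The error values are invented for the example.

```python
import math


def alpha_from_error(error):
    """Weight of a weak classifier: the lower its error, the larger its vote."""
    return math.log((1.0 - error) / error)


def strong_classify(weak_outputs, alphas):
    """Weighted vote: 1 (face) if the weighted sum reaches half of the total weight."""
    score = sum(a * h for a, h in zip(alphas, weak_outputs))
    return 1 if score >= 0.5 * sum(alphas) else 0


# Hypothetical weighted errors of three selected weak classifiers.
errors = [0.1, 0.3, 0.4]
alphas = [alpha_from_error(e) for e in errors]

# Only the most accurate weak classifier votes "face".
decision_a = strong_classify([1, 0, 0], alphas)
# Only the two less accurate weak classifiers vote "face".
decision_b = strong_classify([0, 1, 1], alphas)
print(decision_a, decision_b)
```

In this example the single accurate classifier outvotes the two weaker ones, which shows that the combination is weighted by accuracy rather than being a simple majority.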

To make the algorithm even more efficient, Viola and Jones applied a simple idea. Most of an image is non-face region, so the algorithm first checks whether a region is clearly not a face. Instead of applying all features to a region at once, the features are grouped into stages that are applied one after another. If a region fails a stage, it is discarded immediately and not processed again; only regions that pass all stages are reported as faces. This quickly reduces the number of candidate regions and makes the algorithm very efficient. The method is called a cascade of classifiers.
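The early rejection can be sketched as a loop over stages. The stages below are toy scoring functions on a list of pixel values, invented for this example; in a real cascade each stage is a boosted classifier built from Haar features. The counter only serves to show how many stage evaluations are saved.

```python
def run_cascade(window, stages):
    """Return True only if the window passes every stage.

    stages: list of (score_fn, threshold); a stage is passed
    when score_fn(window) >= threshold.
    """
    for score_fn, threshold in stages:
        if score_fn(window) < threshold:
            return False  # rejected early, later stages are never evaluated
    return True


calls = {"count": 0}


def counted(fn):
    """Wrap a scoring function so that every evaluation is counted."""
    def wrapped(window):
        calls["count"] += 1
        return fn(window)
    return wrapped


# Hypothetical stages, ordered from cheap and coarse to more specific.
stages = [
    (counted(lambda w: sum(w) / len(w)), 50),   # mean brightness
    (counted(lambda w: max(w) - min(w)), 100),  # contrast
    (counted(lambda w: w[-1] - w[0]), 20),      # crude left-to-right gradient
]

windows = [
    [5, 5, 5, 5],        # fails stage 1
    [60, 60, 60, 60],    # passes stage 1, fails stage 2
    [10, 40, 120, 200],  # passes all stages
]
results = [run_cascade(w, stages) for w in windows]
print(results, calls["count"])
```

Here six stage evaluations are needed instead of nine. In a real image, where almost all windows are rejected by the first stages, the saving is far larger.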

Detected faces

The picture shows the detected faces marked with rectangles. Only one face was not found.


OpenCV provides many pretrained models. They can be downloaded from the OpenCV GitHub repository and are also located in the data folder of the OpenCV installation.

To use the classifier, it first has to be created and the required XML file loaded. The detection is then done with the detectMultiScale method, which returns a rectangle around each detected face.

This is an example of how the cascade classifier can be used.

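import cv2

image = "image.png"
path = "haarcascade_frontalface_default.xml"

# Create the classifier and load the pretrained model from the XML file
faceCascade = cv2.CascadeClassifier(path)

# Read the image and convert it to grayscale for the detection
img = cv2.imread(image)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Returns one (x, y, w, h) rectangle per detected face
faces = faceCascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)

cv2.imshow("Faces found", img)
cv2.waitKey(0)  # keep the window open until a key is pressed
cv2.destroyAllWindows()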

For more information, see the OpenCV documentation.

Author: Dietlinde Dierks
Last modified: 21.06.2019