Pose-Invariant Object Recognition

The Problem

Robust scene interpretation by means of machine vision is a key factor in various new applications in robotics. Part of this problem is the efficient recognition and classification of previously known three-dimensional (3D) shapes in arbitrary scenes. So far, solutions have either relied on heavily constrained conditions or have not run in real time.

With the availability of ever faster computers and 3D-sensing technology (real-time stereo processing, laser range-scanner, etc.), more general approaches become feasible. They allow for weaker scene restrictions and hence facilitate new scenarios. Fundamental to visual object recognition are descriptions of general free-form shapes.

In computer graphics, surface meshes are a popular description of free forms. They are also useful for recognition purposes, and the Internet makes them accessible to everybody for testing and comparing algorithms. A major drawback, however, is their large memory requirement. Furthermore, surface meshes are defined with respect to a global coordinate system. Thus, time-consuming registration is necessary to align the object of interest with the frame of the reference object model before matching is possible. The same problems apply to voxel-based descriptions of shape.

Representations based on superquadrics, generalized cylinders, and splines all suffer from a high sensitivity to noise and outliers in the sensed data. A significant effort is required to obtain a robust fitting procedure and to select the model order so as to avoid over-fitting.


It is most desirable to develop a shape representation that

  • is compact,
  • is robust,
  • does not depend on a global coordinate frame, and
  • has the descriptive capacity to distinguish arbitrary shapes.

A promising approach is to analyze the statistical occurrence of features on a surface in 3D space. We introduce a statistical representation of 3D shape, based on a novel 4D feature. The feature parameterizes the intrinsic geometrical relation of a pair of surface points and respective surface normals, generalizing the concept of surface curvature. The set of all such features represents both local and global characteristics of the surface.
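As an illustration of how a pair of oriented surface points can be reduced to four pose-independent numbers, here is a minimal sketch. The specific parameterization (a local frame u, v, w built from one normal and the connecting line, giving three orientation terms and the point distance) is an assumption made for this example; the text above does not spell out the definition, and all function names are hypothetical.

```python
import math

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(a):
    return math.sqrt(dot(a, a))

def pair_feature(p1, n1, p2, n2):
    """Return (alpha, beta, gamma, delta) for two points with unit normals.

    Assumed parameterization, for illustration only.
    """
    d = sub(p2, p1)
    # Assumed convention: the origin is the point whose normal is closer to
    # perpendicular to the connecting line, so the result does not depend
    # on the order in which the two points are passed in.
    if abs(dot(n1, d)) > abs(dot(n2, d)):
        p1, n1, p2, n2 = p2, n2, p1, n1
        d = sub(p2, p1)
    delta = norm(d)                      # distance between the two points
    u = n1                               # first axis: normal at the origin
    v = cross(d, u)                      # second axis: perpendicular to u and d
    v_len = norm(v)
    if v_len < 1e-12:
        raise ValueError("degenerate pair: coincident points or normal parallel to connecting line")
    v = (v[0] / v_len, v[1] / v_len, v[2] / v_len)
    w = cross(u, v)                      # third axis completes the local frame
    alpha = math.atan2(dot(w, n2), dot(u, n2))  # angle of n2 in the u-w plane
    beta = dot(v, n2)                    # component of n2 along v
    gamma = dot(u, d) / delta            # cosine between n1 and the connecting line
    return alpha, beta, gamma, delta
```

Because every quantity is built from dot and cross products of the input vectors themselves, rotating and translating both points together leaves the four values unchanged, which is the property that removes the need for a global coordinate frame.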

We compress the distribution of the 4D features into a histogram. A database of histograms, one per object, is sampled in a training phase. During recognition, sensed surface data, as may be acquired by stereo vision, a laser range-scanner, etc., are processed and compared to the stored histograms. We evaluate the match quality by the following six criteria:

  • the intersection (as often used in fuzzy-logic approaches)
  • the squared Euclidean distance
  • two forms of the chi-squared test
  • the Kullback-Leibler divergence
  • the likelihood
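The histogram compression and the match criteria listed above can be sketched as follows. This is a minimal illustration under stated assumptions: the bin count (5 per dimension), the feature value ranges, the smoothing constant, and the exact form of each criterion are textbook choices picked for the example, not details given in this text, and all names are hypothetical.

```python
import math

EPS = 1e-9  # assumed smoothing constant to avoid division by zero and log(0)

def quantize(feature, delta_max, bins=5):
    """Map a 4D feature (alpha, beta, gamma, delta) to a flat bin index."""
    alpha, beta, gamma, delta = feature
    # Assumed ranges: alpha in [-pi, pi], beta and gamma in [-1, 1],
    # delta in [0, delta_max]. Each value is scaled to [0, 1].
    unit = ((alpha + math.pi) / (2 * math.pi),
            (beta + 1) / 2, (gamma + 1) / 2, delta / delta_max)
    index = 0
    for x in unit:
        k = min(bins - 1, max(0, int(x * bins)))  # clip to a valid bin
        index = index * bins + k
    return index

def build_histogram(features, delta_max, bins=5):
    """Normalized histogram (sums to 1) over bins**4 cells."""
    counts = [0.0] * bins ** 4
    for f in features:
        counts[quantize(f, delta_max, bins)] += 1.0
    total = sum(counts) or 1.0
    return [c / total for c in counts]

# Each criterion compares a query histogram h with a stored model histogram m.
def intersection(h, m):
    return sum(min(a, b) for a, b in zip(h, m))

def squared_euclidean(h, m):
    return sum((a - b) ** 2 for a, b in zip(h, m))

def chi2_model(h, m):
    # Chi-squared form 1: the model histogram is the expected distribution.
    return sum((a - b) ** 2 / (b + EPS) for a, b in zip(h, m))

def chi2_symmetric(h, m):
    # Chi-squared form 2: symmetric in query and model.
    return sum((a - b) ** 2 / (a + b) for a, b in zip(h, m) if a + b > 0)

def kl_divergence(h, m):
    return sum(a * math.log((a + EPS) / (b + EPS)) for a, b in zip(h, m) if a > 0)

def log_likelihood(h, m):
    return sum(a * math.log(b + EPS) for a, b in zip(h, m) if a > 0)

# (criterion, larger_is_better)
CRITERIA = [(intersection, True), (squared_euclidean, False),
            (chi2_model, False), (chi2_symmetric, False),
            (kl_divergence, False), (log_likelihood, True)]

def best_match(query, database, score, larger_is_better):
    """Return the name of the stored histogram that best matches the query."""
    pick = max if larger_is_better else min
    return pick(database, key=lambda name: score(query, database[name]))
```

A recognition step would then be `best_match(query_hist, database, kl_divergence, False)`, where `database` maps object names to their stored histograms.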


For each match criterion, we have investigated:

  • the recognition rate and processing time under ideal conditions
  • the recognition rate depending on surface-mesh resolution
  • the recognition rate depending on surface noise
  • the recognition rate depending on surface visibility
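The degradation experiments listed above could be set up roughly as in the following sketch. The degradation models here (isotropic Gaussian jitter for noise, random subsampling as a stand-in for lower mesh resolution, and keeping only the points nearest the viewer as a crude stand-in for partial visibility) are assumptions for illustration, not the procedures actually used, and all names are hypothetical.

```python
import random

def add_noise(points, sigma, rng):
    """Perturb every coordinate with zero-mean Gaussian noise of std sigma."""
    return [tuple(c + rng.gauss(0.0, sigma) for c in p) for p in points]

def subsample(points, keep_fraction, rng):
    """Keep a random fraction of the points (stand-in for lower resolution)."""
    k = max(1, round(keep_fraction * len(points)))
    return rng.sample(points, k)

def keep_front(points, view_dir, fraction):
    """Keep the fraction of points lying furthest along view_dir,
    a crude stand-in for a partially visible surface."""
    ranked = sorted(points,
                    key=lambda p: sum(a * b for a, b in zip(p, view_dir)),
                    reverse=True)
    k = max(1, round(fraction * len(points)))
    return ranked[:k]

def recognition_rate(predicted, actual):
    """Fraction of test cases in which the predicted label is correct."""
    hits = sum(p == a for p, a in zip(predicted, actual))
    return hits / len(actual)
```

Passing a seeded generator such as `random.Random(0)` keeps each degraded test set reproducible, so the same data can be scored under every match criterion.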

The recognition-rate curves are shown in the figure for a database of 20 objects, along with an illustrative example of an object as stored in the database (top left), and the same object degraded by noise (top center) and by missing data (top right).

[Figure: recognition-rate curves]

All experiments show that Kullback-Leibler and likelihood matching yield robust recognition rates and outperform the other criteria. Hence, our novel 4D feature histogram demonstrates high representation capacity for free-form recognition.


Eric Wahl, Ulrich Hillenbrand, and Gerd Hirzinger. Surflet-Pair-Relation Histograms: A Statistical 3D-Shape Representation for Rapid Classification. 3-D Digital Imaging and Modeling (3DIM 2003), IEEE Computer Society Press, pp. 474-481.

Eric Wahl and Gerd Hirzinger. Cluster-Based Point Cloud Analysis for Rapid Scene Interpretation. 27th Annual Meeting of the German Association of Pattern Recognition DAGM 2005, Vienna.

Eric Wahl and Gerd Hirzinger. A Method for Fast Search of Variable Regions on Dynamic 3D Point Clouds. 27th Annual Meeting of the German Association of Pattern Recognition DAGM 2005, Vienna.
