ISPRS Workshop on Laser Scanning 2007 and SilviLaser 2007, Espoo, September 12-14, 2007, Finland
EXTENDING GENERALIZED HOUGH TRANSFORM TO DETECT 3D OBJECTS IN
LASER RANGE DATA
Kourosh Khoshelham
Optical and Laser Remote Sensing Research Group, Delft University of Technology, Kluyverweg 1, 2629 HS Delft, The
Netherlands
KEY WORDS: Laser scanning, Point cloud, Object Recognition, 3D Generalized Hough Transform, Automation
ABSTRACT:
Automated detection and 3D modelling of objects in laser range data is of great importance in many applications. Existing
approaches to object detection in range data are limited to either 2.5D data (e.g. range images) or simple objects with a parametric
form (e.g. spheres). This paper describes a new approach to the detection of 3D objects with arbitrary shapes in a point cloud. We
present an extension of the generalized Hough transform to 3D data, which can be used to detect instances of an object model in laser
range data, independent of the scale and orientation of the object. We also discuss the computational complexity of the method and
provide cost-reduction strategies that can be employed to improve the efficiency of the method.
1. INTRODUCTION

Automated extraction of objects from laser range data is of great importance in a wide range of applications. Reverse engineering, 3D visualisation, industrial design monitoring and environmental planning are a few examples of the applications that require 3D models of objects extracted from images or laser range data. A 3D model provides an abstract description of the object, which can be processed and visualised more easily and efficiently. The process of object extraction consists of two main tasks. The first task is detection, in which the presence of an object in the data is verified and its approximate location is found (usually by labeling the data points that belong to the object). The second task is modeling, where the detected object is represented with a 3D geometric model that is most adequate in terms of such criteria as accuracy, compactness, the domain of the object and the application requirements. The detection step plays a key role in the successful modeling of the object. If the object is properly detected in the data, the modeling can be carried out more reliably and accurately.

Existing approaches to the detection of objects in range data can be divided into two major categories: data-driven approaches and model-driven approaches. Data-driven approaches are mainly based on segmentation (Khoshelham, 2006; Rottensteiner and Briese, 2003; Sithole, 2005), clustering (Filin, 2002; Vosselman, 1999) and classification (Forlani et al., 2006; Oude Elberink and Maas, 2000). While these methods have been commonly applied to the laser range data of 2.5D surfaces, their application to more complex 3D scenes is not always possible. For instance, in laser range data of industrial installations many objects are partially occluded, and data-driven methods fail to correctly detect these objects in the data. Model-driven approaches, on the contrary, are more robust in the presence of partial occlusion, since they incorporate some form of knowledge about the shape of the object. The object model can be represented, among other representations, as a set of voxel templates (Greenspan and Boulanger, 1999) or spin images (Johnson and Hebert, 1999), which are matched against the data, or as a set of parameters that mathematically define the object. In the latter case, the Hough transform (Duda and Hart, 1972; Hough, 1962) has been used to determine the model parameters as well as the data points that belong to the object (Olson, 2001).

The application of the Hough transform is restricted to simple objects that can be represented with few parameters, such as planes, spheres and cylinders. Vosselman et al. (2004) describe a Hough-based method for the detection of planes and spheres in a point cloud. Rabbani (2006) developed an extension of this method that can be used for the detection of cylinders. Figure 1 demonstrates the application of the Hough transform to the detection of cylinders in a point cloud. As can be seen, the curved parts joining the cylinders have not been extracted, because these parts cannot be expressed in parametric forms with few parameters.

This paper concentrates on the detection of 3D objects with arbitrary shapes in a point cloud. The objective of this paper is to develop a new extension of the Hough transform, which can be used to detect instances of a complex object model in laser range data, independent of the scale and orientation of the object.

The paper has five sections. Section 2 provides an overview of the standard and generalized Hough transform as applied to 2D images. In section 3, the principles of the 3D generalized Hough transform are described. A discussion on the computational complexity of the method is presented in section 4. Conclusions appear in section 5.
IAPRS Volume XXXVI, Part 3 / W52, 2007
Figure 1. Detection of cylinders in a point cloud using Hough transform (from Rabbani (2006)). The curved parts
joining the cylinders cannot be extracted using this method.
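To make the contrast with such parametric methods concrete, Hough-based detection of a plane in a point cloud can be sketched as follows. This is only an illustrative toy implementation, not the method of Vosselman et al. (2004) or Rabbani (2006): the discretization, bounds and function names are our own choices. Each point votes for all discretized planes n(θ, φ)·p = ρ that pass through it, and the accumulator peak yields the plane parameters.

```python
import numpy as np

def hough_planes(points, n_theta=30, n_phi=60, n_rho=50):
    """Vote for planes n(theta, phi) . p = rho and return the peak parameters."""
    thetas = np.linspace(0.0, np.pi, n_theta)            # polar angle of the normal
    phis = np.linspace(0.0, 2 * np.pi, n_phi, endpoint=False)
    rho_max = np.linalg.norm(points, axis=1).max()       # bound on the plane distance
    acc = np.zeros((n_theta, n_phi, n_rho), dtype=int)   # 3D accumulator array
    for i, th in enumerate(thetas):
        for j, ph in enumerate(phis):
            n = np.array([np.sin(th) * np.cos(ph),
                          np.sin(th) * np.sin(ph),
                          np.cos(th)])                   # candidate unit normal
            rho = points @ n                             # signed distance of each point
            k = ((rho + rho_max) / (2 * rho_max) * (n_rho - 1)).astype(int)
            np.add.at(acc, (i, j, k), 1)                 # one vote per point
    i, j, k = np.unravel_index(acc.argmax(), acc.shape)
    rho = 2 * rho_max * k / (n_rho - 1) - rho_max
    return thetas[i], phis[j], rho, acc.max()
```

A set of coplanar points concentrates its votes in a single (θ, φ, ρ) bin. For curved, non-parametric surfaces, such as the joints in Figure 1, no comparable low-dimensional parameterization exists, which motivates the generalized transform described next.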
2. AN OVERVIEW OF THE STANDARD AND GENERALIZED HOUGH TRANSFORM

The Hough transform is a well known method for the detection of objects in 2D intensity images. The standard Hough transform is applicable to objects with an analytical shape, such as straight lines, circles and ellipses, whereas with the generalized Hough transform any arbitrary curve can be detected in a 2D image. The following sections briefly describe the standard and generalized Hough transform.

2.1 The standard Hough transform

The idea of the Hough transform for detecting straight lines in images was first introduced by Hough (1962). In the original Hough transform, a straight line is parameterized as y = mx + b with two parameters m and b. According to the number of parameters, a 2D parameter space is formed in which every point in the image space corresponds to a line b = -xm + y. A set of image points that lie on the same line y = mx + b in image space corresponds to a number of lines in the parameter space, which intersect at the point (m, b). Finding this intersection point is, therefore, the basis for line detection in the Hough transform. The parameter space is realized in the form of a discrete accumulator array consisting of a number of bins that receive votes from edge pixels in the image space. The intersection point is determined by finding the bin that receives a maximum number of votes.

In addition to straight lines, the Hough transform has also been used to detect other analytical shapes, such as circles and ellipses, in 2D images. The underlying principle for the detection of other analytical shapes is the same as for straight line detection, and is based on constructing a duality between edge pixels in the image and object parameters in the parameter space. The dimensions of the parameter space, however, vary with respect to the parameterization of the object.

2.2 The generalized Hough transform

Ballard (1981) proposed a generalization of the Hough transform to detect non-parametric objects with arbitrary shapes in 2D intensity images. In the generalized Hough transform, the object model is stored in a so-called R-table format. An arbitrary reference point is selected for the object, and for every pixel on the object boundary the gradient direction as well as the length and direction of a vector connecting the boundary pixel to the reference point are computed (Figure 2). The gradient directions, φ, serve as indices in the R-table to look up the length, r, and direction, α, of the connecting vectors. Table 1 illustrates a general form of an R-table.

Figure 2: Parameters involved in the generalized Hough transform.

Table 1: R-table

  φ0    (r, α)01, (r, α)02, (r, α)03, ...
  φ1    (r, α)11, (r, α)12, (r, α)13, ...
  φ2    (r, α)21, (r, α)22, (r, α)23, ...
  ...

The reconstruction of the object model from the R-table is straightforward:

  x_p = x_c - r cos α
  y_p = y_c - r sin α        (1)

where (x_c, y_c) and (x_p, y_p) are respectively the coordinates of the reference point and a point on the boundary of the object. For the detection of the object model in the image, however, the coordinates of the reference point are not known. A 2D accumulator array is, therefore, constructed with the two parameters of the reference point as the axes. At every image edge pixel the gradient direction is obtained and then looked up in the R-table. The corresponding sets of r and α values are used to evaluate Equation 1, and the resulting x_c and y_c values indicate the accumulator array bins that should receive a vote. Once this process is complete for all edge pixels, the bin with the maximum vote indicates the reference point, and the edge pixels that cast votes for this bin belong to an instance of the object in the image.
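The R-table construction and the voting procedure described above can be sketched as follows. This is an illustrative minimal implementation in which the bin count, the data layout and the function names are our own assumptions; edge pixels with known gradient directions are taken as given.

```python
import math
from collections import defaultdict

N_BINS = 36  # quantization of the gradient direction (10-degree bins)

def phi_bin(phi):
    return int(phi / (2 * math.pi) * N_BINS) % N_BINS

def build_r_table(boundary, ref):
    """boundary: iterable of (x, y, gradient_direction); ref: reference point."""
    table = defaultdict(list)
    for x, y, phi in boundary:
        r = math.hypot(ref[0] - x, ref[1] - y)       # length of the connecting vector
        alpha = math.atan2(ref[1] - y, ref[0] - x)   # its direction (towards ref)
        table[phi_bin(phi)].append((r, alpha))
    return table

def vote(edges, table, shape):
    """Cast one vote per (edge pixel, R-table entry) pair; return the accumulator."""
    acc = defaultdict(int)
    for x, y, phi in edges:
        for r, alpha in table[phi_bin(phi)]:
            xc = round(x + r * math.cos(alpha))      # hypothesised reference point
            yc = round(y + r * math.sin(alpha))
            if 0 <= xc < shape[0] and 0 <= yc < shape[1]:
                acc[(xc, yc)] += 1
    return acc
```

Since the connecting vector here points from the boundary pixel towards the reference point, each edge pixel votes with c = p + r, and the bin with the maximum count recovers the reference point of the detected instance.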
The generalized Hough transform can also be used to detect a rotated and scaled version of a model in an image. This is achieved by supplementing Equation 1 with a scale factor and a rotation angle, and the parameter space is expanded to a 4D accumulator array. The peak of the accumulator array then determines the scale and rotation parameters in addition to the coordinates of the reference point, although at the price of a higher computational expense.

Figure 3: Parameters involved in the 3D GHT method.

2.3 Modifications to Hough transform

Several modified variations of the Hough transform have been proposed to improve the performance of the method. Illingworth and Kittler (1988) provide a survey of these methods. Duda and Hart (1972) suggested a modification of the standard Hough transform by substituting the original slope-intercept parameterization of straight lines with a polar, angle-radius, parameterization. The polar parameterization leads to a bounded parameter space, unlike the original parameterization, and is, consequently, more computationally efficient. They also showed that the standard Hough transform can be used to detect more general curves in an image. The gradient-weighted Hough transform, which appears in Ballard's generalization, was first introduced by O'Gorman and Clowes (1976). The derivation of edge orientation information imposes very little computational cost, but greatly increases the efficiency of the method. Other methods that have been shown to improve the performance of the Hough transform include the adaptive Hough transform (Illingworth and Kittler, 1987), the hierarchical Hough transform (Princen et al., 1990), and the randomized Hough transform (Xu et al., 1990).
3. EXTENSION OF GENERALIZED HOUGH TRANSFORM TO 3D DATA

In this section we present an extension of the generalized Hough transform to 3D data. The method will be referred to as 3D GHT in the subsequent parts of the paper. The 3D GHT follows the same principle as the generalized Hough transform outlined in Section 2.2. The main difference is that the gradient vector is replaced with a surface normal vector. The normal vectors can be obtained by triangulating the surface of the object or by fitting planar surfaces to small sets of points in a local neighbourhood. Vectors connecting each triangle to an arbitrary reference point are stored in the R-table as a function of the normal vector coordinates. A normal vector is constrained to be of unit length and is, therefore, defined by two orientation angles, α and β, as depicted in Figure 3. A connecting vector is defined by two orientation angles, θ and φ, as well as its length r. These parameters can be derived from the coordinates of the reference point and the object boundary point:

  r = [(x_p - x_c)^2 + (y_p - y_c)^2 + (z_p - z_c)^2]^(1/2)
  θ = arccos((z_c - z_p) / r)        (2)
  φ = arccos((x_c - x_p) / (r sin θ))

This formulation results in a 2D R-table where all the connecting vectors, r, are stored in cells whose coordinates are the orientation angles of the normal vectors. Figure 4 demonstrates how such a 2D R-table is constructed.

Figure 4: Storing r vectors in a 2D R-table.

The reconstruction of the object model from the R-table is carried out by extending Equation 1 to 3D:

  x_p = x_c - r sin θ cos φ
  y_p = y_c - r sin θ sin φ        (3)
  z_p = z_c - r cos θ

where θ and φ denote the orientation angles of the vector that connects a point p to the reference point c. For the detection of the 3D object model in a point cloud, the three coordinates of the reference point are unknown parameters. Thus, the equations given in (3) are rearranged so as to express the unknown parameters as functions of the known variables:

  x_c = x_p + r sin θ cos φ
  y_c = y_p + r sin θ sin φ        (4)
  z_c = z_p + r cos θ

Having obtained the object model in the form of the R-table, an algorithm for the detection of instances of this model in a point cloud can be outlined as follows:

1. Construct a 3D accumulator array with the three parameters of the reference point as the axes;
2. Compute the normal vector for every point in the point cloud and look up the r vectors at coordinates (α, β) of the 2D R-table;
3. Evaluate Equation (4) with the corresponding sets of r, θ and φ values to obtain x_c, y_c and z_c;
4. Cast a vote (an increment) to the accumulator array bin corresponding to each set of x_c, y_c and z_c values;
5. Repeat the voting process for all the points in the point cloud;
6. The bin with the maximum vote indicates the reference point, and the 3D points that cast votes for this bin belong to an instance of the object in the point cloud.
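For the fixed-scale, fixed-orientation case, these six steps can be sketched as follows. In this illustrative sketch each connecting vector is stored in Cartesian form, which carries the same information as the (r, θ, φ) parameterization of Equation 2; a hash map stands in for the dense 3D accumulator array, and all names and binning choices are our own assumptions rather than part of the method as published.

```python
import numpy as np
from collections import defaultdict

N_ANG = 18  # bins for each orientation angle of the normal

def normal_bin(n):
    """Quantize a unit normal by its two orientation angles."""
    alpha = np.arccos(np.clip(n[2], -1.0, 1.0))      # polar angle
    beta = np.arctan2(n[1], n[0]) % (2 * np.pi)      # azimuth
    return (int(alpha / np.pi * (N_ANG - 1)),
            int(beta / (2 * np.pi) * N_ANG) % N_ANG)

def build_r_table(points, normals, ref):
    """Store connecting vectors r = c - p, indexed by normal orientation."""
    table = defaultdict(list)
    for p, n in zip(points, normals):
        table[normal_bin(n)].append(ref - p)         # Cartesian form of (r, theta, phi)
    return table

def detect(points, normals, table, res=1.0):
    """Vote for the reference point (Equation 4) and return the peak bin."""
    acc = defaultdict(int)
    for p, n in zip(points, normals):
        for r in table[normal_bin(n)]:
            c = p + r                                # candidate reference point
            acc[tuple(np.round(c / res).astype(int))] += 1
    peak = max(acc, key=acc.get)
    return np.array(peak) * res, acc[peak]
```

Each data point retrieves all connecting vectors stored for its normal orientation and casts one vote per vector; points belonging to an instance of the model agree on a single reference-point bin, while the remaining votes scatter.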
In practice, the object appears in range data with an arbitrary rotation and scale. To account for the additional rotation and scale parameters, Equation (4) is modified as:

  c = p + s Mz · My · Mx · r        (5)

where c = (x_c, y_c, z_c)^T, p = (x_p, y_p, z_p)^T, r = (r sin θ cos φ, r sin θ sin φ, r cos θ)^T, s is a scale factor, and Mx, My and Mz are rotation matrices around the x, y and z axes respectively. The incorporation of a scale factor and three rotation parameters results in an expansion of the Hough space to seven dimensions. To evaluate Equation 5 and cast votes for the accumulator bins, a 4D subset of the parameter space covering the entire range of scale factors and rotation angles must be exhausted. This implies that a crude application of the 3D GHT method to object detection can be very expensive. Therefore, cost-reduction strategies such as adaptive, hierarchical and randomized voting schemes are of great importance in the 3D GHT algorithm.

4. IMPLEMENTATION ASPECTS

The 3D GHT method as described in Section 3 is computationally expensive when the object appears in data with an arbitrary scale and rotation with respect to the model. The development of a cost-reduction strategy is thus the main challenge in the application of 3D GHT. In general, the execution time of the Hough transform is dominated by the voting process rather than by the search for a peak in the accumulator. In the absence of arbitrary scale and rotation, the number of required operations in the voting process is O(M), where M is the number of points in the dataset. Thus, a desirable cost-reduction strategy must aim to reduce the number of points that are involved in the voting process. Randomized (Xu et al., 1990) and probabilistic (Kiryati et al., 1991) variations of the Hough transform work based on a random selection of a small number of data points, and are, therefore, suitable options for controlling the computational cost of the voting process.

In the presence of arbitrary scale and rotation, a 4D subset of the parameter space covering the entire range of scale factors and rotation angles is exhausted during the voting process. Consequently, the number of operations required in the voting process is O(M·N^4), where N is the number of intervals along each axis of the accumulator array. Clearly, a desirable cost-reduction strategy in this case must concentrate on the N^4 factor. The adaptive Hough transform (Illingworth and Kittler, 1987) reduces the number of intervals along the axes, since it begins with a coarse-resolution parameter space and increases the resolution only in the vicinity of the peak. The randomized Hough transform (Xu et al., 1990) also provides an efficient strategy to reduce the number of bins that receive votes in the parameter space. In randomized voting, instead of working with one point at a time, a number of points sufficient for the computation of all parameters are selected from the data. Once all the parameters are computed, only one bin in the accumulator array receives a vote. In the case of a 3D object with seven parameters, a set of three points must be selected from the data at each time. These points, along with their respective r vectors, form nine equations of the form given in Equation 5, which can be solved for the seven parameters. Thus, for each randomly selected set only one vote is cast for a bin in the 7D accumulator array.

5. CONCLUSIONS

In this paper we presented an extension of the generalized Hough transform to detect arbitrary 3D objects in laser range data. The procedure of storing a 3D model in a 2D R-table was demonstrated, and a method for the detection of instances of the model in a point cloud, based on a voting process, was described. It was discussed that the voting process can be computationally expensive in the case that the object appears in the data with an arbitrary scale and rotation with respect to the model. The employment of a voting process based on the randomized Hough transform was, therefore, suggested to reduce the computational cost of the method.

REFERENCES

Ballard, D.H., 1981. Generalizing the Hough transform to detect arbitrary shapes. Pattern Recognition, 13(2): 111-122.

Duda, R.O. and Hart, P.E., 1972. Use of the Hough transformation to detect lines and curves in pictures. Communications of the ACM, 15: 11-15.

Filin, S., 2002. Surface clustering from airborne laser scanning data, Proceedings of the Photogrammetric Computer Vision, ISPRS Commission III Symposium, Graz, Austria, pp. 119-124.

Forlani, G., Nardinocchi, C., Scaioni, M. and Zingaretti, P., 2006. Complete classification of raw LIDAR data and 3D reconstruction of buildings. Pattern Analysis & Applications, 8(4): 357-374.

Greenspan, M. and Boulanger, P., 1999. Efficient and reliable template set matching for 3D object recognition, 2nd International Conference on 3-D Digital Imaging and Modeling, Ottawa, Canada, pp. 230-239.

Hough, P.V.C., 1962. Methods and means for recognizing complex patterns, US patent 3069654.

Illingworth, J. and Kittler, J., 1987. The adaptive Hough transform. IEEE Transactions on Pattern Analysis and Machine Intelligence, 9(5): 690-698.

Illingworth, J. and Kittler, J., 1988. A survey of the Hough transform. Computer Vision, Graphics and Image Processing, 44: 87-116.

Johnson, A.E. and Hebert, M., 1999. Using spin images for efficient object recognition in cluttered 3D scenes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 21(5): 433-449.

Khoshelham, K., 2006. Automated 3D modelling of buildings in suburban areas based on integration of image and height data, International Workshop on 3D Geoinformation (3DGeoInfo '06), Kuala Lumpur, Malaysia, pp. 381-393.

Kiryati, N., Eldar, Y. and Bruckstein, A.M., 1991. A probabilistic Hough transform. Pattern Recognition, 24(4): 303-316.
O'Gorman, F. and Clowes, M.B., 1976. Finding picture edges through collinearity of feature points. IEEE Transactions on Computers, 25(4): 449-456.

Olson, C.F., 2001. A general method for geometric feature matching and model extraction. International Journal of Computer Vision, 45(1): 39-54.

Oude Elberink, S. and Maas, H.G., 2000. The use of anisotropic height texture measures for the segmentation of airborne laser scanner data, International Archives of Photogrammetry and Remote Sensing, 33 (Part B3/2), pp. 678-684.

Princen, J., Illingworth, J. and Kittler, J., 1990. A hierarchical approach to line extraction based on the Hough transform. Computer Vision, Graphics and Image Processing, 52(1): 57-77.

Rabbani Shah, T., 2006. Automatic reconstruction of industrial installations using point clouds and images. PhD Thesis, Delft University of Technology, Delft, 154 pp.

Rottensteiner, F. and Briese, C., 2003. Automatic generation of building models from Lidar data and the integration of aerial images, International Archives of Photogrammetry and Remote Sensing, Volume XXXIV/3-W13, Dresden, pp. 170-184.

Sithole, G., 2005. Segmentation and classification of airborne laser scanner data. PhD Thesis, Delft University of Technology, Delft, 185 pp.

Vosselman, G., 1999. Building reconstruction using planar faces in very high density height data, ISPRS Conference on Automatic Extraction of GIS Objects from Digital Imagery, Munich, pp. 87-92.

Vosselman, G., Gorte, B.G., Sithole, G. and Rabbani Shah, T., 2004. Recognising structure in laser scanner point clouds, International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences, Vol. 46, Part 8/W2, Freiburg, Germany, pp. 33-38.

Xu, L., Oja, E. and Kultanen, P., 1990. A new curve detection method: randomized Hough transform (RHT). Pattern Recognition Letters, 11(5): 331-338.