Optic Flow Computation Using Interpolating Thin-Plate Splines
Alireza Bab-Hadiashar, David Suter and Ray Jarvis Intelligent Robotics Research Centre Department of Electrical & Computer Systems Engineering Monash University, Clayton Vic. 3168, Australia. Email: email@example.com
Optic flow computation is one of the most fundamental problems in the realm of visual motion. In this contribution, we present a novel optic flow computation method based on a thin-plate spline representation of image brightness data. Using a spline-based description of image brightness data removes the need for presmoothing and provides an algebraic means to compute the optic flow (explicitly). This method can be generalised to solve a wide range of image registration problems. A set of results on a number of standard motion sequences is also presented.
During the last two decades there has been an increasing interest in analysing image sequences and, in particular, in recovering the optic flow field. Although many methods for estimating the flow field have been proposed, a practical real-time solution to this problem remains a challenge. The current existing methods for the optic flow computation are mainly based on the following approaches: - Correlation techniques - Phase or energy techniques - Differential techniques. The comparison study of Barron et al.  showed that the phase based techniques usually provide the most accurate estimates of the flow field. Recently, a few new differential techniques have been published and their accuracy is comparable with the phase based techniques (Szeliski & Coughlan 1994 and Weber & Malik 1995). Differential techniques are often the prime candidates for real time applications due to computational ease and speed. The accuracy of any differential technique mainly depends on the accuracy of estimating derivatives of the image brightness function. Although the finite difference method, due to its simplicity, has been the most popular method for estimating the derivatives, it suffers from certain draw-backs. The finite difference method has no means to distinguish between noise and the true data: thus the resulting estimates can be corrupted by noise.
To eliminate the noise problem, most of the proposed optic flow methods rely on pre-smoothing the image functions using Gaussian filters. Successful implementation of this remedy requires some prior knowledge about the noise and the visual data, often acquired through an expensive trial and error process which is simply not affordable in any real-time application. Moreover, an algorithm for automatically deriving the smoothing parameters of the Gaussian filters used has not yet been found. It is interesting to note that a similar problem also exists in phase based techniques in that their performance largely depends on the method of tuning the frequency response of the different filters and determining these also requires information not known in real-time applications. In this paper, a new framework, based on interpolating thin-plate splines, for estimating the optic flow field is proposed. The key difference between this method and other optic flow methods is that we explicitly recover the underlying function of brightness data by using interpolating thin-plate splines. In doing this, the derivatives can be calculated symbolically (off-line) and their subsequent numerical evaluation is consequently very fast. This proposition should not be mistaken for optic flow methods using a spline to represent the velocity field [2,3], or the standard Horn and Schunck method  which uses a membrane spline representing the velocity field. The proposed method does not require any pre-smoothing of images. This contributes to fast (real-time) computation of optic flow field. The major contribution of this paper is the proposal of a novel method for estimating the derivatives of the brightness intensity function. We also derive the necessary formulation for computing the optic flow field explicitly, assuming a simple translational model. The extension of this work to include any model of motion (eg. affine) is straight forward and will be addressed in future. The proposed method uses only two images (as opposed to most methods using a sequence of more than two images) and, unlike most of the existing methods, does not have several tuning parameters. All the necessary parameters in our framework are explicitly deduced from the input data. The patch size is the only parameter left to the user's judgment and, as we show later, it dictates the trade off between accuracy and speed.
2 Spline Representation
The data fitting problem in high dimensions has attracted much attention and many approaches have been proposed. Among these methods, the spline based approaches provide satisfactory answers to regression problems of recovering unknown functions from noisy data. The recovered function provides a valuable means for further studies of the data ranging from calculating the derivatives to statistical analysis. Many techniques for constructing the spline functions have been proposed but elaborating on them is beyond the scope of this paper. The image brightness function can be recovered by fitting a surface to a discrete set of available visual data. One of the common approaches for solving this regression problem is known as the variational approach. The variational approach is adopted in this paper because of its ingenious way of computing the parameters of the resulting spline. This approach is also capable of modelling the noise which is essential for accurate estimation of derivatives. Also, to keep the formulation simple, we consider the simplest member of surface spline family known as the thin-plate spline. The next sections are dedicated to describing an elegant method of characterising thin-plate splines by a reproducing kernel Hilbert space approach.
The simplest member of this family, the thin-plate spline, corresponds to l=2 and takes the form of:
f ( x, y) =
∑ a K ( x , y ) +a
i i i =1
m +1+a m + 2 x
+a m + 3 y
K i( x , y ) =
1 2 r ln(r 2 ) 16 π
r 2 = ( x ?x i )2 + ( y ?yi )2
The spline parameters ai in equation (4) are determined by solving the following system of linear simultaneous equations:
Aa = f
LMK + mλI N P
2.1 Thin-plate Splines
Thin-plate splines are the famous members of a family of surface or Dl splines which result from the solution of a variational problem. Meinguet  and Wahba  solved the following minimisation problem. Considering the following model (1) for a set of noisy data; the problem is to find the function f(x,y) ∈ H (a suitable space of sufficiently differentiable functions s) which minimises: with
LM a MM a PO 0P Q , a = MM a MM a Na
m+2 m +3
OP LM f OP PP MM f PP PP , f = MM 0 PP PP MM 0 PP Q N0Q
ξ ( s ) = δ ( s ) + λη( s )
where λ is the smoothing parameter and δ(s) and η(s) are given as:
LM K (x ,y ) K=M MN K (x ,y ) LM1 x y OP P= M MN1 x y PPQ
1 1 1
OP P K (x ,y ) P Q
δ( s ) = η( s ) =
i =1 l i=0
1 (f i? s (x i ,yi )))2 m
and I is the m by m identity matrix. (2)
FG l IJ ∑H jK
?l s ( x , y ) 2 ) dxdy (3) ?x j ?yl ? j
2.2 Numerical Computation
There are some issues to be addressed, from the numerical point of view, about calculating thin-plate splines. First of all, the number of coefficients that needs to be determined is directly proportional to the number of data points and therefore, for large data points (greater than a few hundreds), the computation would be very expensive. Secondly, the coefficient matrix A is dense
in which (xi,yi) is the location of the data point, fi is the corresponding data and m is the number of data points. The smoothing parameter, λ, controls the trade off between the fidelity to the data measured by δ(s) and the roughness of the solution measured by η(s). Interpolating
and ill-conditioned (the diagonal elements are zero for interpolating spline - λ = 0 - while the off-diagonal elements can be quite large). Although these issues can cause serious problems for fitting the scattered data, they do not have any serious consequences for the data over a rectangular grid (luckily, computer vision problems are often set on a rectangular grid). In fact, a simple LUdecomposition can easily and reliably solve the problem where the data are on a rectangular grid. The interesting point, not to be overlooked, is that the coefficient matrix A is only a function of x and y and not of the value of the function at the data points (assuming that the smoothing parameter λ is known). Therefore, to solve for the parameters of different splines over the same grids, the inverse of the A has to be calculated once and this can be done prior to the start of optic flow computation. So, in recovering the motion field, one must choose the same size for all the patches to exploit the advantage of this feature (it is, in any case, common to have patches of equal size in most of the optic flow methods). In view of the above considerations, a matrix (m by m) by vector (m by 1) multiplication is the only on-line computation required to calculate the parameters of a spline function representing the brightness data of a patch in an image.
given image, there exists a small spatial neighbourhood over which the motion can be approximated by planer translation. This is the most simple and yet most common model of motion. For small patches, a translational model is a relatively good approximation for the flow field. The well known maximum likelihood estimation gives the solution of the stated optic flow problem by minimising, over ?X and ?Y, the ordinary least squared error E defined as:
∑ (I (x + ?X ,y + ?Y ) ?I (x ,y ))
1 i i 2 i i i =1
The least square approximation is a maximum likelihood estimation if the errors are independent and normally distributed with constant standard deviation. Also, by using the ordinary least squares approach, one assumes that the measurements in x and y coordinates are error free and the noise is only distributed in the intensity measurement. This minimisation problem usually has many local solutions and different methods can be employed to find the optimal solution. In the following section the methodology for solving this minimisation problem is explained.
3 Problem Definition
The optic flow problem for two sequential images is often formulated as follows. Given two images, we assume the second image I2(x,y) is formed by locally displacing the first image I1(x,y) by (?X,?Y) and therefore we have:
3.1 Explicit Solution
One of the advantage of representing the image intensity data using splines is that it provides opportunity to explicitly solve the least square problem (13) for the unknown translational components. Since the mathematical description of I1(xi,yi) is expressed by its thin-plate spline representation (equation 4), we can set the partial derivatives of the ordinary least square error (equation 13) with respect to ?X and ?Y to zero. This results in a system of two non-linear simultaneous equations which can be solved using the Newton-Raphson method. It is important to note here that the second order derivatives are defined everywhere except on the data points. Therefore, to use the Newton-Raphson method, which requires the calculation of the second order derivatives, the computation has to be performed on some points in the spatial neighbourhood of the grid points but not on the data points. In our implementation, we have calculated the derivatives on a grid shifted up and right by half the horizontal (or vertical) distance between two data points.
I1( x i + ?X ,yi + ?Y) =I 2( x i ,yi )
The general problem is to recover the displacement field ?X and ?Y. Expanding I1(x,y) in its first order Taylor series form results in the well-known image brightness constraint written in terms of displacements instead of velocities (Fennema & Thompson, 1979):
?I1( x , y ) ?I ( x , y ) |i + ?Y 1 |i =I 2(xi ,yi ) ?I1(xi ,yi ) ?x ?y
This equation shows that for every point in the image function, there exists two unknowns and one constraint, and, consequently, the number of solutions are unlimited (ill-posed problem). In order to solve this problem, an extra assumption has to be made which differs in various optic flow algorithms. One way of reducing the number of solutions to the above problem is to assume that, for every pixel in a
Any algorithm for estimating the flow field of a 3-D structured scene is likely to fail due to occluding boundaries and or lack of texture in a specific patch. Therefore, a reliable algorithm has to have some means for measuring the confidence associated with every estimates. Although investigating a more suitable confidence measure is the subject of our current work,
we have used a very simple confidence measure in the results presented here. According to this measure (CM see equation 18), all the estimates for which their least square error E (equation 15) divided by the sum of all the data points in a patch is more than 0.04 are regarded as outliers.
Figure 4.1 A sample image of Sinusoid1 sequence (18) Algorithm (patch size) Avg. Std. Dev. Density Error Nagel 100% 2.55° 0.93° Fleet & Jepson 100% 0.03° 0.01° Spline Based (5x5) 2.714° 1.300° 100% Spline Based (10x10) 0.989° 0.434° 100% Spline Based (15x15) 0.532° 0.177° 100% Table 4.1 Sinusoid1 results on translational speeds
I1(x i ,yi )
The proposed confidence measure is only one of many (perhaps the most simple one) that come to mind. A more suitable and effective confidence measure will contribute to better results both in terms of average error and density. We will compare our results with the results for the Fleet and Jepson (1990) algorithm (as the most accurate method) and the Nagel (1983) algorithm (because it theoretically is the closest method to ours).
4.2 Real Data Performance
In this section, the calculated optic flow field for four well-known real image sequences are presented. Although the exact movement of different objects is not known, the number of moving objects and their motion directions are given. A sample image with its calculated flow field over a uniform rectangular grid and a brief description of each image sequences will be presented in the following subsections. All of the image sequences used in this survey are publicly available (see Barron et al. 1994 for their origin). 4.2.1 SRI Sequence: SRI tree sequence is one of the most challenging sequences due to its low contrast, lots of occlusion and relatively poor resolution. The camera in this sequence moves in front of a cluster of trees parallel to the ground plane (normal to its line of sight). Maximum velocity is about 2 pixels/frame. The uniform speed of the ground and relatively large velocities associated with the near by tree is clearly shown in the estimated flow field.
4.1 Theoretical Performance
The theoretical performance of every algorithm can be measured by applying the algorithm to a set of synthetic inputs. The true motion field for these inputs is known accurately and, therefore, the quantitative performance can be easily studied. As indicated by Barron et al. , the measure of performance on synthetic images should be taken as an optimistic bound on the expected errors with real image sequences. Following Barron's paper , to study the theoretical performance of the presented algorithm, we present and analyse the results an image sequences known as Sinusoid1 . The Sinusoid1 image sequence is created by superimposing two sinusoidal plane waves with spatial wavelengths of 6 pixels, orientation of 54° and -27°(with horizontal axis) and speeds of 1.63 and 1.02 pixels/frame, respectively. Therefore, the resulting plaid pattern translates with velocity of 1.585 in horizontal and 0.863 in vertical directions. The following figures show a sample of this sequence, followed by the error analysis on the results which are tabled for the ease of comparison. The errors are measured in degrees which represent the angle between the estimated and the true motion vector in the homogeneous coordinates .
Figure 4.3 A sample image of SRI image sequence and its estimated optic flow field 4.2.2 Hamburg Taxi: This image sequence shows the traffic in a street cross junction where a taxi turning toward right, a car moves from right to left in the lower left opposite to a van driving right to left and a pedestrian walking in the upper left of the scene. The speeds of these moving objects are roughly 1,3,3 and 0.3
pixels/frame, respectively. The estimated flow field contains most of the above mentioned motions.
expression for image intensity function which can then be differentiated explicitly. Using splines also allows the degree of smoothing to be chosen by well-founded methods (such as generalised cross-validation) and will be addressed elsewhere.
During the course of this work, the first author was fully supported by a scholarship from the Ministry of Culture and Higher Education, Iran. The authors would like to gratefully acknowledge Dr. H. Alehossein for an early discussion on cubic spline smoothing.
Figure 4.5 A sample of Hamburg Taxi image sequence and its estimated optic flow field 4.2.3 Rubik Cube: In this image sequence, a Rubik cube is placed on a turntable which is rotating in front of stationary background. The velocities on the turntable is around 1.2 to 1.4 pixels/frame and on the cube itself is around 0.2 to 0.5 pixels/frame. Existence of a few arrows which are representing the velocity on the back ground (where there is no distinct texture) shows that the used confidence measure is unreliable.
 Barron J.L., Fleet D.J., Beauchemin S.S. 1994 “Systems and experiment performance of optical flow techniques” Intern. J. Comput. Vis. 12: 43-77.  Suter D. 1994 “Motion estimation and vector splines” Procd. CVPR'94, pp 939-948, IEEE Computer Society, Seattle, Washington.  Szeliski R., Coughlan J. 1994 “Hierarchical splinebased image registration” Procd. CVPR'94, pp 194201, IEEE Computer Society, Seattle, Washington.  Meinguet J. 1979 “Multivaraite interpolation at arbitrary points made simple” Journ. of Applied Mathematical Physics, 30 292-304.  Wahba G. 1979 “How to smooth curves and surfaces with splines and cross-validation”, Procd. of 24th Conf. on Design of Experiments, pp 167-192, US army Research Office.  Fleet D.J., Jepson A.D. 1990 “Computation of component image velocity from local phase information” Intern. J. Comput. Vis. 5: 77-104.  Nagel H.H. 1983 “Displacement vectors derived from second order intensity variations in image sequences” Comput. Graph. Image Process. 21: 85117.
Figure 4.7 A sample image of Rubik Cube sequence and its estimated optic flow field 4.2.4 NASA Sequence: The motion flow field in this image sequence is mainly dilatational. The camera moves toward the Coca Cola can, near the centre of image. The divergence flow field can be seen in the following estimated motion fields.
Figure 4.9. A sample image of NASA sequence and its estimated optic flow field
We have demonstrated that good accuracy can be achieved by replacing an ad hoc Gaussian smoothing of grey level values (prior to differentiation) by spline fitting. The spline fitting method provides an algebraic