When we look at an object, its image depends upon its shape "s",
its pose "p" and its motion "m(t)".
In particular, we measure the perspective projection "pr" of
the 3-D object relative to the camera-frame (or eye-frame):

    y(t) = pr(m(t) . p . s)    or    y(t) = pr(m(t) . p . S(s))

depending on whether we represent the world as a collection of points,
or as a collection of parametric surfaces.
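
As a rough illustration of the point-based measurement equation, the sketch below evaluates y = pr(m . p . s) at time 0 for a handful of points. The function names "project" and "apply_rigid", the unit focal length, the 4x4 homogeneous-matrix representation of SE(3), and the numerical values of "s" and "p" are all assumptions made for the example; they do not come from the text.

```python
import numpy as np

def project(points_cam):
    """Perspective projection pr: divide x and y by the depth z.
    A unit focal length is assumed."""
    return points_cam[:, :2] / points_cam[:, 2:3]

def apply_rigid(g, points):
    """Apply a 4x4 rigid transformation g (an element of SE(3))
    to an (N, 3) array of 3-D points."""
    return points @ g[:3, :3].T + g[:3, 3]

# Hypothetical shape s: three points in the object frame.
s = np.array([[0.0, 0.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])

# Hypothetical pose p: the object sits 4 units in front of the camera.
p = np.eye(4)
p[:3, 3] = [0.0, 0.0, 4.0]

# Motion at time 0 is the identity transformation.
m0 = np.eye(4)

# Measurement y(0) = pr(m(0) . p . s), one image point per row.
y0 = project(apply_rigid(m0 @ p, s))
print(y0)
```

Each row of y0 is the image of one point; moving the object farther from the camera (a larger z in "p") shrinks the image coordinates, which is the depth dependence that the projection introduces.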
Our goal is to reconstruct the shape "s" from the moving
image "y(t)".
We may collect the unknown parameters as the state of
a nonlinear dynamical model:

    s' = 0                    shape is a constant on the shape space
                              (or on the surface parameter space)
    p' = 0                    pose is a constant on the Lie group SE(3)
    m' = v^m ; m(0) = I       motion is the integral of velocity in SE(3);
                              motion at time 0 is the identity transformation
    v' = 0                    velocity is approximately constant in the
                              Lie algebra se(3)
    y(t) = pr(m(t) . p . s) + n
                              (or S(s) in place of s for a surface model)

and then try to observe the initial conditions from the
measurements y.
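
To make the model concrete, here is a minimal simulation sketch that generates measurements from given initial conditions. It reads "m' = v^m" as m' = hat(v) m, where hat(v) is the 4x4 matrix form of the velocity in se(3), and it treats "n" as additive Gaussian noise; both readings are assumptions. The helper names ("hat", "expm_series", "simulate"), the ordering of the velocity as (rotation, translation), the unit focal length, and all numerical values are likewise hypothetical.

```python
import numpy as np

def hat(v):
    """Map v = (wx, wy, wz, tx, ty, tz) to its 4x4 matrix form in se(3)."""
    wx, wy, wz, tx, ty, tz = v
    return np.array([[0.0, -wz,  wy, tx],
                     [ wz, 0.0, -wx, ty],
                     [-wy,  wx, 0.0, tz],
                     [0.0, 0.0, 0.0, 0.0]])

def expm_series(A, terms=20):
    """Matrix exponential by truncated power series (fine for small A)."""
    result = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k
        result = result + term
    return result

def project(points_cam):
    """Perspective projection pr with an assumed unit focal length."""
    return points_cam[:, :2] / points_cam[:, 2:3]

def simulate(s, p, v, dt, steps, noise_std=0.0, seed=0):
    """Integrate s' = 0, p' = 0, v' = 0, m' = hat(v) m with m(0) = I,
    and record y = pr(m . p . s) + n at each step."""
    rng = np.random.default_rng(seed)
    m = np.eye(4)                        # m(0) = I
    step = expm_series(hat(v) * dt)      # exact update for constant v
    ys = []
    for _ in range(steps):
        g = m @ p
        pts = s @ g[:3, :3].T + g[:3, 3]
        y = project(pts)
        y = y + noise_std * rng.standard_normal(y.shape)  # noise n
        ys.append(y)
        m = step @ m                     # shape, pose, velocity stay fixed
    return m, ys

# Hypothetical initial conditions: three points, 4 units from the camera,
# translating along x at unit speed with no rotation.
s = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
p = np.eye(4)
p[:3, 3] = [0.0, 0.0, 4.0]
v = np.array([0.0, 0.0, 0.0, 1.0, 0.0, 0.0])

m_final, ys = simulate(s, p, v, dt=0.1, steps=10)
print(m_final[:3, 3])   # accumulated translation
print(ys[-1])           # last recorded image
```

Observing the initial conditions means running this map in reverse: given only the sequence of images, recover the s, p and v that produced it. The sketch covers the forward direction only.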
Even though our main concern is shape, what we measure
also depends upon motion and pose, both of which are unknown.
Should we then ...
(A) try to estimate all unknown parameters and then concentrate on the
ones we are most interested in, or ...
(B) devise models that are invariant with respect to
unknown and undesired parameters?