## Homework 4: Curvature Flow

In this homework we’ll take a closer look at curvature flow. We already saw one example of curvature flow (mean curvature flow) while studying the Poisson equation. The general idea behind curvature flow is that we have an energy $E$ measuring the smoothness of our geometry, and can reduce this energy by traveling along the direction of steepest descent. Conceptually, you can imagine that $E$ is some kind of potential — surfaces with many wrinkles have a lot of energy, and want to reduce this energy by “relaxing” into a smoother state. This description suggests a sort of “energy landscape,” where high peaks correspond to wrinkly surfaces and low valleys correspond to smooth ones. Here’s a two-dimensional cartoon of what this landscape might look like:

To smooth out the geometry we can “ski” downhill, discovering a sequence of smoother and smoother surfaces along the way. Visually the effect is akin to a pat of butter slowly melting away on a hot piece of toast, or a water droplet buckling into a perfectly round sphere.

To be more concrete, let $f$ be an immersion of a manifold $M$ (e.g., a curve or surface) into Euclidean space, and suppose that $E$ is a real-valued function of $f$. Then a curvature flow is the solution to the partial differential equation
$\dot{f} = -\nabla E(f)$
starting with some initial immersion $f_0$, where $\dot{f}$ denotes the derivative in time. In words, this equation just says that the difference in position of the surface at two consecutive points in time is equal to the change in position that reduces the energy quickest.

For surfaces, two common energies are the Dirichlet energy
$E_D(f) = \frac{1}{4} \int_M |\nabla f|^2 dA$
and the Willmore energy
$E_W(f) = \frac{1}{4} \int_M (\kappa_1 - \kappa_2)^2 dA.$
where as usual $\kappa_1$ and $\kappa_2$ are the principal curvatures induced by $f$. Both energies somehow measure the “wrinkliness” of a surface, but how are they related?

Exercise 4.1
For a surface $M$ without boundary show that, up to an additive constant, the Willmore energy can be expressed as
$E_W = \int_M H^2\ dA$
and explain why this constant does not matter in the context of curvature flow. Hint: Gauss-Bonnet.

In other words, Dirichlet energy looks like the (squared) $L^2$ norm of the gradient, whereas Willmore energy looks like the (squared) $L^2$ norm of mean curvature. Superficially these quantities look quite different, but in fact they are quite similar!

Exercise 4.2
For a surface $M$ without boundary show that (again up to constant factors)
$E_D = \langle \Delta f, f \rangle,$
$E_W = \langle \Delta^2 f, f \rangle,$
Hint: Green’s first identity and the definition of the mean curvature normal.

From these expressions, it may appear that Dirichlet and Willmore energy are nice, simple, quadratic functions of $f$. Don’t be fooled! The Laplace-Beltrami operator $\Delta$ depends on the immersion $f$ itself, which means that the corresponding gradient flows are rather nasty and nonlinear. Later on we’ll look at a couple ways to deal with this nonlinearity.

Now that have a couple energies to work with, how do we derive gradient flow? Previously we defined the gradient of a function $\phi: \mathbb{R}^n \rightarrow \mathbb{R}$ as
$\nabla\phi = \left[ \begin{array}{c} \partial\phi/\partial x^1 \\ \vdots \\ \partial\phi/\partial x^n \end{array} \right],$
i.e., as just a list of all the partial derivatives. This definition works pretty well when $\phi$ is defined over a nice finite-dimensional vector space like $\mathbb{R}^n$, but what about something more exotic like the Willmore energy, which operates on an infinite-dimensional vector space of functions? In general, the gradient of a function $\phi$ at a point $x$ can be defined as the unique vector $\nabla \phi(x)$ satisfying
$\langle \nabla \phi(x), u \rangle = \lim_{h \rightarrow 0} \frac{\phi(x+hu)-\phi(x)}{h},$
for all vectors $u$, where $\langle \cdot, \cdot \rangle$ denotes the inner product on the vector space. In other words, taking the inner product with the gradient should yield the directional derivative in the specified direction. Notice that this definition actually serves as a definition of differentiability: a function is differentiable at $x$ if and only if all directional derivatives can be characterized by a single vector $\nabla\phi(x)$. Geometrically, then, differentiability means that if we “zoom in” far enough the function looks almost completely flat.

Exercise 4.3
Explain why the gradient is the direction of steepest ascent.

Exercise 4.4
Consider the function
$\phi: \mathbb{R}^2 \rightarrow \mathbb{R}; (x_1, x_2) \mapsto x_1^2 - x_2^2.$
Confirm that the gradient found using the expression above agrees with the usual gradient found via partial derivatives.

Exercise 4.5
Let $M$ be a surface without boundary. Assume that the Laplace-Beltrami operator $\Delta$ is constant with respect to the immersion $f$ and use our definition of $\nabla$ above to show that
$\nabla E_D(f) \approx HN.$
In other words, gradient flow on Dirichlet energy looks roughly like the mean curvature flow $\dot{f} = -HN$ that we studied in the previous assignment.

The approximate gradient $HN$ might be called a linearization of the true gradient — in general the idea is that we keep some quantity or some piece of an equation fixed so that the rest comes out to be a nice linear expression. This trick can be quite helpful in a practical setting, but it is typically worth understanding where and how the approximation affects the final result.

One final thing to mull over is the fact that the gradient depends on our particular choice of inner product $\langle \cdot, \cdot \rangle$, which appears on the left-hand side in our definition. Why does the inner product matter? Intuitively, the gradient picks out the direction in which the energy increases fastest. But what does “fastest” mean? For instance, if we use a vector of real numbers $\mathsf{x} \in \mathbb{R}^m$ to encode the vertices of a discrete curve, then what we really care about is the energy increase with respect to a change in the length of the curve — not the Euclidean length of the vector $\mathsf{x}$ itself. In terms of our energy landscape we end up with a picture like the one below — you can imagine, for instance, that arrows on the left have unit norm with respect to the standard Euclidean inner product whereas arrows on the right have unit norm with respect to the $L^2$ inner product on our discrete curve. As a result, gradient descent will proceed along two different trajectories:

Exercise 4.6
Consider an inner product $\langle \cdot, \cdot \rangle$ on $\mathbb{R}^n$ defined by a positive definite matrix $\mathsf{B} \in \mathbb{R}^{n \times n}$, i.e., $\langle u, v \rangle = \mathsf{u^T B v}$. Show that the gradient $\nabla_B$ induced by this inner product is related to the standard gradient $\nabla$ via
$\nabla_\mathsf{B} = \mathsf{B}^{-1} \nabla.$

In the discrete setting, the matrix $\mathsf{B}$ is sometimes referred to as the mass matrix, because it encodes the amount of “mass” each degree of freedom contributes to the total. When working with discrete differential forms, one possible choice mass matrix is given by (an appropriate constant multiple of) the diagonal Hodge star. This choice corresponds to applying piecewise-constant interpolation and then taking the usual $L^2$ inner product. For instance, here’s what piecewise constant interpolation looks like for a primal 1-form on a triangulated surface — the integrated value stored on a given edge gets “spread out” over the so-called diamond region associated with that edge:

In general, let $\star_k$ be a real diagonal matrix with one entry for each $k$-dimensional simplex $\sigma_i$. The nonzero entries are
$\left( \star_k \right)_{ii} = \frac{|\sigma_i^\star|}{|\sigma_i|},$
where $\sigma_i^\star$ is the circumcentric dual of $\sigma$, and $|\cdot|$ denotes the (unsigned) volume. The corresponding mass matrix on primal discrete $k$-forms in $n$ dimensions is then
$B_k = \left( \begin{array}{c} n \\ k \end{array} \right) \star_k,$
i.e., a binomial coefficient times a (primal) diagonal Hodge star. The mass matrices for dual $k$-forms can likewise be expressed as constant multiples of the inverse:
$B^\star_k = \left( \begin{array}{c} n \\ k \end{array} \right) \star_{n-k}^{-1}.$
These matrices will come in handy when deriving equations for discrete curvature flow.

Flow on Curves

For the remainder of this assignment, we’re going to make life simpler by working with planar curves instead of surfaces. As discussed earlier, we can describe the geometry of a curve via an immersion
$\gamma: I \rightarrow \mathbb{R}^2; s \mapsto \gamma(s)$
of some interval $I = [0,L] \subset \mathbb{R}$ into the Euclidean plane $\mathbb{R}^2$. A common energy for curves is simply the integral of the curvature $\kappa$, squared:
$E(\gamma) = \int_0^L \kappa^2\ ds.$
Let’s first establish some facts about curvature in the smooth setting.

Exercise 4.7
The unit tangent field on a smooth curve $\gamma$ can be expressed as $T = (\cos\theta,\sin\theta)$ for some function $\theta: I \rightarrow \mathbb{R}$. Show that the normal curvature can be expressed as
$\kappa = d\theta(X)$
where $X$ is a positively-oriented unit vector field. In other words, the scalar curvature is change in the direction of the tangent.

Exercise 4.8
Explain in words why the total curvature of any closed immersed curve $\gamma$ (whether discrete or not) is an integer multiple of $2\pi$, i.e., $\gamma(0) = \gamma(L)$
$\int_0^L \kappa ds = 2\pi k,\ k \in \mathbb{Z}.$
(The number $k$ is called the turning number of the curve.)

A stronger statement is the Whitney-Graustein theorem which says that the turning number of a curve will be preserved by any regular homotopy, i.e., by any continuous motion that keeps the curve immersed. For instance, here’s an example of a motion that is not a regular homotopy — note that the curve gets “pinched” into a sharp cusp halfway thorugh the motion, at which point the turning number goes from $k=2$ to $k=1$:

We’ll keep these ideas in mind as we develop algorithms for curvature flow.

Discrete Curves

In the discrete setting, $\gamma$ is simply a collection of line segments connecting a sequence of vertices with coordinates $\gamma_1, \ldots, \gamma_n \in \mathbb{R}^2$:

Note that in the provided code framework, a curve is represented by a half edge mesh consisting of a single polygon. Therefore, to iterate over the curve you might write something like

 

   FaceIter gamma = mesh.faces.begin();
HalfEdgeIter he = gamma->he;
do {
// do something interesting here!
he = he->next;
}
while( he != gamma->he );

 

As with surfaces, we can consider both a primal and dual “mesh” associated with this curve — this time, each primal edge is associated with a dual vertex at its midpoint, and each primal vertex is associated with a dual edge connecting the two adjacent dual vertices:

In the language of discrete exterior calculus, then, $\gamma \in (\mathbb{R}^2)^n$ is a $\mathbb{R}^2$-valued primal 0-form, i.e., a value associated with each vertex.

Exercise 4.9
Show that the nonzero entries of the diagonal Hodge star on primal 0-forms are given by
$(\star_0)_{ii} = L_i,$
where
$L_i = \frac{1}{2}( |\gamma_{i+1}-\gamma_i| + |\gamma_i-\gamma_{i-1}| ).$

Coding 4.1
Implement the methods Edge::length() and Vertex::dualLength(), which should return the primal edge length and the circumcentric dual edge length, respectively. (The latter should be a one-liner that just calls the former!) Finally, implement the method IsometricWillmoreFlow1D::buildMassMatrix(), which builds the diagonal Hodge star on primal 0-forms.

Exercise 4.10
Show that on a discrete curve the total curvature along a dual edge $e^\star_{ij}$ is equal to the exterior angle $\varphi_{ij} \in \mathbb{R}$ at the corresponding vertex, i.e., the difference in angle between the two consecutive tangents:
$\varphi_{ij} = \theta_j - \theta_i = \int_{e^\star_{ij}} \kappa\ ds.$
(Hint: Stokes’ theorem!)

In other words, the exterior angle $\varphi$ gives us the integrated curvature. Applying the discrete Hodge star yields pointwise curvatures $\kappa$, which we will use as the degrees of freedom in our numerical curvature flow:
$\kappa = \star\varphi.$

Coding 4.2
Implement the method Vertex::curvature(), which returns the pointwise curvature as defined above. Hint: in the language of discrete exterior calculus, what kind of quantity is $\varphi$? And what kind of quantity is $\kappa$?

The next exercise should solidify your understanding of where all these different quantities live, and which operators take you back and forth from one space to another!

Exercise 4.11
Show that for a discrete curve $E(\gamma)$ can be written explicitly as
$E(\gamma) = \sum_i \varphi_i^2 / L_i,$
assuming we use piecewise constant interpolation of curvature.

Now that we have a discrete curvature energy, let’s derive an expression for the gradient.

Exercise 4.12

Let $\varphi$ be the angle made by two vectors $u,v \in \mathbb{R}^2$. Show that the gradient of $\varphi$ with respect to $u$ can be expressed as
$\nabla_u \varphi = -\frac{v_{\perp u}}{2A}$

where $v_{\perp u}$ denotes the component of $v$ orthogonal to $u$ and $A$ is the area of a triangle with sides $u$ and $v$.

Exercise 4.13
Let $L$ be the length of a vector $u = b-a$, where $a$ and $b$ are a pair of points in $\mathbb{R}^2$. Show that
$\nabla_a L = -\hat{u}$
and
$\nabla_b L = \hat{u}.$

Exercise 4.14
Collecting the results of the past few exercises, show that the gradient of the $i$th term of our curvature energy
$E_i = \varphi_i^2 / L_i$
with respect to vertex coordinates $\gamma_{i-1}$, $\gamma_i$, and $\gamma_{i+1}$ is given explicitly by
$\begin{array}{rcr} \nabla_{\gamma_{i-1}} E_i &=& \frac{\varphi_i}{L_i L_{i-1}} \left( \frac{v_{\perp u}}{A_i} + \frac{\varphi_i}{2L_i} \hat{u} \right) \\ \nabla_{\gamma_{i+1}} E_i &=& \frac{\varphi_i}{L_i^2} \left( \frac{u_{\perp v}-v_{\perp u}}{A_i} + \frac{\varphi_i}{2L_i} (\hat{v}-\hat{u}) \right) \\ \nabla_{\gamma_i} E_i &=& -\frac{\varphi_i}{L_i L_{i+1}} \left( \frac{u_{\perp v}}{A_i} + \frac{\varphi_i}{2L_i} \hat{v} \right) \\ \end{array}$

where $\varphi_i$ is the exterior angle at vertex $i$, $L_i$ is the dual edge length, and $A_i$ is the area of a triangle with edges $u = \gamma_i-\gamma_{i-1}$ and $v = \gamma_{i+1}-\gamma_i$. The gradient of $E_i$ with respect to all other vertices $\gamma_j$ is zero. Why? (Hint: remember to take the gradient with respect to the right metric!)

Coding 4.3
Implement the method WillmoreFlow1D::computeGradient() using the expressions above. The gradient of energy with respect to a given vertex should be stored in the member Vertex::energyGradient. Remember that the overall energy is a sum over the terms $E_i$, which means you will need to add up the contributions to the gradient at each vertex.

Coding 4.4
Implement the method WillmoreFlow1D::integrate(), which should integrate the flow equation
$\dot{\gamma} = -\nabla E(\gamma)$
using the forward Euler scheme. (See the end of the previous assignment for a brief discussion of time integration.) Run the code on the provided meshes, and report the maximum stable time step in each case, i.e., the largest time step for which the flow succeeds at smoothing out the curve. (The time step size can be adjusted using the keys ‘-‘, ‘=‘, ‘\_‘, and ‘+.’)

Curvature Flow in Curvature Space

If you feel exhausted at this point in the assignment, you’re not alone! Taking derivatives by hand can be a royal pain (and it gets even worse when you want second derivatives, which are required for more sophisticated algorithms like Newton descent). But it’s worth grinding out this kind of expression at least once in your life so that you really understand what’s involved. In practice there are a variety of alternatives, including numerical differentiation, automatic differentiation, and symbolic differentiation — these methods all have their place, and it’s well-worth understanding the tradeoffs they offer in terms of accuracy, efficiency, and code complexity.

But before getting mired in the bedraggled business of computer-based derivatives, it’s worth realizing that there is a tantalizing fourth alternative: come up with a simpler formulation of your problem! In particular, the mention of a convex-quadratic energy should make your mouth water and your heart beat faster, since these things make your life easier in a number of ways. “Convex-quadratic” means that your energy can be expressed as a real-valued homogeneous quadratic polynomial, i.e., as
$E(x) = \langle Ax, x \rangle$
for some positive-semidefinite self-adjoint linear operator $A$ that does not depend whatsoever on the argument $x$. For instance, suppose that in the discrete setting the degrees of freedom $x$ of our system are encoded by a vector $\mathsf{x} \in \mathbb{R}^n$. Then a quadratic energy can always be represented as
$E(x) = \mathsf{x^T A x}$
for some fixed symmetric positive-semidefinite matrix $\mathsf{A} \in \mathbb{R}^{n \times n}$. Earlier we visualized definiteness in terms of the graph of the energy in two dimensions:

Independent of definiteness, the gradient of a quadratic energy has a simple linear expression
$\nabla E(x) = 2\mathsf{B^{-1}Ax},$
where the matrix $\mathsf{B} \in \mathbb{R}^{n \times n}$ encodes the inner product. This setup not only simplifies the business of taking derivatives, but also makes things inexpensive to evaluate at the numerical level — for instance, we can apply the backward Euler method by just solving a linear system, instead of performing some kind of nasty nonlinear root finding. Moreover, since the matrix $\mathsf{A}$ is constant, we can save a lot of computation by prefactoring it once and applying backsubstitution many times. This setup also has some nice analytical features. For one thing, any local minimum of a convex energy is guaranteed to be a global minimum, which means that gradient descent will ultimately lead to an optimal solution. For another, there is an extremely well-established theory of linear PDEs, which allows one to easily answer questions about things like numerical stability. In contrast, the general theory of nonlinear PDEs is kind of a zoo.

Ok, enough religion! Let’s see how a quadratic formulation can help us with the specific problem of curvature flow. Actually, we already have a quadratic energy — it’s
$E(\kappa) = \int_0^L \kappa^2 ds.$
The only difference between this energy and the one we’ve been working with all along is that it’s a function of the curvature $\kappa$ rather than the immersion $f$ — as a result, we avoid all the nonlinearity associated with expressing $\kappa$ in terms of $f$. More concretely, at the discrete level we’re going to store and manipulate a single number $\kappa_i$ at each vertex, rather than computing it indirectly from the vertex coordinates $\gamma_i \in \mathbb{R}^2$.

Coding 4.5
Implement the method IsometricWillmoreFlow1D::getCurvature(), which simply evaluates the (pointwise) curvature at each vertex and stores it in the member Vertex::kappa. This method should call Vertex::curvature().

One nice consequence of this setup is that the gradient becomes extremely simple! Taking the gradient with respect to the $L^2$ inner product on 0-forms, we get
$\nabla E(\kappa) = -2\kappa.$
Gradient flow then becomes a simple, linear equation involving no spatial derivatives:
$\dot{\kappa} = -2\kappa.$

Coding 4.6
Implement the methods IsometricWillmoreFlow1D::computeFlowDirection() and IsometricWillmoreFlow1D::integrate(), which integrate the above flow equation using the forward Euler scheme. Hint: this step should be very easy!

If we want to actually draw the curve, we can integrate curvature to get tangents, then integrate tangents to get positions. In other words, we can recover the direction $\theta$ of the tangent via
$\theta(s) = \theta_0 + \int_0^s d\theta = \theta_0 + \int_0^s \kappa\ ds,$
where $\theta_0$ specifies the direction of the first tangent on our curve. The tangent vectors themselves are given by $T(s) = (\cos\theta(s),\sin\theta(s))$ as before. Once we have the tangents, we can recover the immersion itself via
$f(s) = f_0 + \int_I T(s)\ ds,$
where again $f_0$ specifies a “starting point” for the curve. In the discrete setting, these two steps correspond to a very simple reconstruction procedure: start out at some initial vertex and join the edges end to end, rotating by the exterior angles $\varphi_i$ at each step:

More explicitly, let
$\theta_i = \sum_{k=0}^i \varphi_k$
and let
$T_i = L_i (\cos\theta_i,\sin\theta_i),$
where $L_i$ is the length of the $i$th primal edge. Then the vertex positions along the curve can be recovered via
$\gamma_i = \sum_{k=0}^i T_k,$
mirroring the continuous formulae above. (Some questions to ponder: can you interpret these sums as piecewise integrals? What kind of quantity is $(\cos\varphi_i,\sin\varphi_i)$? What kind of quantity is $T_i$? What’s the relationship between the two?)

Coding 4.7
Implement the methods IsometricWillmoreFlow1D::recoverTangents() and IsometricWillmoreFlow1D::recoverPositions(), which should compute the values $T_i$ and $\gamma_i$, respectively. If you use an $O(n^2)$ algorithm to implement either of these methods you will get zero points! In other words, do not just evaluate the whole sum once for each vertex — there is obviously a better way to do it!

Note that the length of each edge is preserved by construction — after all, we build the curve out of segments that have the same length as in the previous curve! In other words, we get not only a curvature flow but an isometric curvature flow (in the smooth case, isometry is reflected in the fact that $(\cos\alpha,\sin\alpha)$ is always a unit vector).

Ok, sounds pretty good so far: we simply subtract some fraction of the curvature each vertex, and compute a couple cumulative sums. As an added bonus, we preserve length. Why haven’t people been doing this all along? The answer is: when something sounds too good to be true, it probably is!

In particular, let’s take a look at what happens when we work with a closed curve, i.e., a loop in the plane. If we make a completely arbitrary change to the curvature $\kappa$, there’s no reason to expect that segments joined end-to-end will close back up. In other words, the final vertex may be somewhere different from where the inital vertex appeared:

A fancy way of describing this situation is to say that the tangents we recover from this procedure are not integrable — they do not “integrate up” to form a closed loop. Similarly, the curvature itself is not integrable: the cumulative curvature function $\alpha$ does not describe the tangent direction of any closed loop. Why did this happen? Well, let’s go back and take a look at our condition on total curvature. We said that the curvature $\kappa$ of any closed loop $\gamma$ satisfies
$\int_{0}^L \kappa\ ds = 2\pi k$
for some turning number $k \in \mathbb{Z}$. Another way of saying the same thing is that the first and last tangents of our curve must match up: $T(0) = T(L)$. But if we change $\kappa$ arbitrarily, this condition will no longer hold.

Exercise 4.15
Suppose that at time zero, the curvature function $\kappa$ on a smooth curve $\gamma$ satisfies our condition on total curvature. Show that any change in curvature $\dot{\kappa}$ orthogonal to the constant function $1: [0,L] \rightarrow \mathbb{R}; s \mapsto 1$ with respect to the $L^2$ inner product will preserve this condition.

We also need a condition that ensures the endpoints will meet up, i.e., $\gamma(0) = \gamma(L)$. Although we will not derive it here, this condition again turns out to have a simple form:
$\int_0^L \kappa \gamma = 0;$
equivalently, $\dot{\kappa}$ must be ($L^2$-)orthogonal to the $x-$ and $y-$ coordinate functions of the immersion. Overall, then, we’re saying that the change in curvature must avoid a three-dimensional linear subspace of directions:
$\langle \dot{\kappa}, 1 \rangle = \langle \dot{\kappa}, \gamma_x \rangle = \langle \dot{\kappa}, \gamma_y \rangle = 0.$
Like convex-quadratic energies, linear constraints are particularly easy to work with — in the case of our flow, we can simply remove the component of $\dot{\kappa}$ that sits in this “forbidden” space. More specifically, suppose that this space is spanned by an orthonormal basis $\{c_i\}$. Then we can simply travel in the augmented direction
$\dot{\kappa}_c = \dot{\kappa} - \sum_{i=1}^3 \langle \dot{\kappa}, \hat{c}_i \rangle \hat{c}_i.$

Coding 4.8
Implement the method IsometricWillmoreFlow1D::buildConstraints(), which constructs the three constraint directions $1$, $\gamma_x$, $\gamma_y$ as dense column vectors.

Coding 4.9
Implement the method IsometricWillmoreFlow1D::orthogonalizeConstraints(), which builds an orthonormal basis $\{\hat{c}_1,\hat{c}_2,\hat{c}_3\}$ spanning the same space as the three constraint directions. Hint: use the Gram-Schmidt process — remember to use the correct inner product!

Coding 4.10
Implement the method IsometricWillmoreFlow::enforceConstraints(), which removes the forbidden directions from the flow using the orthogonal basis and the procedure outlined above. Try running the isometric Willmore flow (you can switch to this flow in the GUI either by right-clicking to get the contextual menu, or by hitting the ‘i‘ key). Report the maximum stable time step that can be achieved for each of the provided meshes. Does the flow preserve the turning number of each of the input curves? (In other words, did we faithfully capture the Whitney-Graustein theorem in the discrete setting?) Try running the flow with and without constraints (by modifying IsometricWillmoreFlow::enforceConstraints()). What happens if you turn off all the constraints? Is there a strict subset of the constraints that is sufficient to keep the loop closed, or are they all needed?

One thing you might have noticed about this new flow is that, while it still smooths out the curve, it looks very different from the one you implemented in the WillmoreFlow1D class. Why is there a difference? In either case, aren’t we doing gradient descent on the same energy? Well, if you paid close attention, you might already know the answer: yes, the energy stays the same, but the metric we used to define the gradient is different! (And if you really paid close attention, you may even know how to modify the second flow to make it look like the first one — and how to implement it!) Beyond that, there are all sorts of nice ways to improve the algorithm that involve discrete Laplacians and Poisson equations and… you know what? You’ve worked hard enough already. Enjoy the break, and see you next year!

Skeleton code: ddg_hw4_skeleton.zip

## Homework 3: The Discrete Laplacian

In the course notes we mentioned that the Laplace-Beltrami operator (more commonly known as just the Laplacian) plays a fundamental role in a variety of geometric and physical equations. In this homework we’ll put the Laplacian to work by coming up with a discrete version for triangulated surfaces. Similar to the homework on vertex normals, we’ll see that the same discrete expression for the Laplacian (via the cotan formula) arises from two very different ways of looking at the problem: using test functions (often known as Galerkin projection), or by integrating differential forms (often called discrete exterior calculus).

Poisson Equations

Before we start talking about discretization, let’s establish a few basic facts about the Laplace operator $\Delta$ and the standard Poisson problem

$\Delta \phi = \rho.$

Poisson equations show up all over the place — for instance, in physics $\rho$ might represent a mass density in which case the solution $\phi$ would (up to suitable constants) give the corresponding gravitational potential. Similarly, if $\rho$ describes an charge density then $\phi$ gives the corresponding electric potential (you’ll get to play around with these equations in the code portion of this assignment). In geometry processing a surprising number of things can be done by solving a Poisson equation (e.g., smoothing a surface, computing a vector field with prescribed singularities, or even computing the geodesic distance on a surface).

Often we’ll be interested in solving Poisson equations on a compact surface $M$ without boundary.

Exercise 3.1: A twice-differentiable function $\phi: M \rightarrow \mathbb{R}$ is called harmonic if it sits in the kernel of the Laplacian, i.e., $\Delta \phi = 0$. Argue that the only harmonic functions on a compact domain without boundary are the constant functions. (This argument does not have to be incredibly formal — there are a just couple simple observations that capture the main idea.)

This fact is quite important because it means we can add a constant to any solution to a Poisson equation. In other words, if $\phi$ satisfies $\Delta \phi = \rho$, then so does $\phi+c$ since $\Delta(\phi+c) = \Delta\phi + \Delta c = \Delta\phi + 0 = \rho$.

Exercise 3.2: A separate fact is that on a compact domain without boundary, constant functions are not in the image of $\Delta$. In other words, there is no function $\phi$ such that $\Delta \phi = c$. Why?

This fact is also important because it tells us when a given Poisson equation admits a solution. In particular, if $\rho$ has a constant component then the problem is not well-posed. In some situations, however, it may make sense to simply remove the constant component. I.e., instead of trying to solve $\Delta \phi = \rho$ one can solve $\Delta \phi = \rho – \bar{\rho}$, where $\bar{\rho} := \int_M \rho\ dV / |M|$ and $|M|$ is the total volume of $M$. However, you must be certain that this trick makes sense in the context of your particular problem!

When working with PDEs like the Poisson equation, it’s often useful to have an inner product between functions. An extremely common inner product is the $L^2$ inner product $\langle \cdot, \cdot \rangle$, which takes the integral of the pointwise product of two functions over the entire domain $\Omega$:

$\langle f, g \rangle := \int_\Omega f(x) g(x) dx.$

In spirit, this operation is similar to the usual dot product on $\mathbb{R}^n$: it measures the degree to which two functions “line up.” For instance, the top two functions have a large inner product; the bottom two have a smaller inner product (as indicated by the dark blue regions):

Similarly, for two vector fields $X$ and $Y$ we can define an $L^2$ inner product

$\langle X, Y \rangle := \int_\Omega X(x) \cdot Y(x) dx$

which measures how much the two fields “line up” at each point.

Using the $L^2$ inner product we can express an important relationship known as Green’s first identity. Green’s identity says that for any sufficiently differentiable functions $f$ and $g$

$\langle \Delta f, g \rangle = -\langle \nabla f, \nabla g \rangle + \langle N \cdot \nabla f, g \rangle_\partial,$

where $\langle \cdot, \cdot \rangle_\partial$ denotes the inner product on the boundary and $N$ is the outward unit normal.

Exercise 3.3: Using exterior calculus, show that Green’s identity holds. Hint: apply Stokes’ theorem to the 1-form $g df$.

One last key fact about the Laplacian is that it is positive-semidefinite, i.e., $\Delta$ satisfies

$\langle \Delta \phi, \phi \rangle \geq 0$

for all functions $\phi$. (By the way, why isn’t this quantity strictly greater than zero?) Words cannot express the importance of positive-(semi)definiteness. Let’s think about a very simple example: functions of the form $\phi(x,y) = ax^2 + bxy + cy^2$ in the plane. Any such function can also be expressed in matrix form:

$\phi(x,y) = \underbrace{\left[ \begin{array}{cc} x & y \end{array} \right]}_{\mathbf{x}^T} \underbrace{\left[ \begin{array}{cc} a & b/2 \\ b/2 & c \end{array} \right]}_{A} \underbrace{\left[ \begin{array}{c} x \\ y \end{array} \right]}_{\mathbf{x}} = ax^2 + bxy + cy^2,$

and we can likewise define positive-semidefiniteness for $A$. But what does it actually look like? It looks like this — positive-definite matrices ($\mathbf{x}^T A \mathbf{x} > 0$) look like a bowl, positive-semidefinite matrices ($\mathbf{x}^T A \mathbf{x} \geq 0$) look like a half-cylinder, and indefinite matrices ($\mathbf{x}^T A \mathbf{x}$ might be positive or negative depending on $\mathbf{x}$) look like a saddle:

Now suppose you’re a back country skiier riding down this kind of terrain in the middle of a howling blizzard. You’re cold and exhausted, and you know you parked your truck in a place where the ground levels out, but where exactly is it? The snow is blowing hard and visibility is low — all you can do is keep your fingers crossed and follow the slope of the mountain as you make your descent. (Trust me: this is really how one feels when doing numerical optimization!) If you were smart and went skiing in Pos Def Valley then you can just keep heading down and will soon arrive safely back at the truck. But maybe you were feeling a bit more adventurous that day and took a trip to Semi Def Valley. In that case you’ll still get to the bottom, but may have to hike back and forth along the length of the valley before you find your car. Finally, if your motto is “safety second” then you threw caution to the wind and took a wild ride in Indef Valley. In this case you may never make it home!

In short: positive-definite matrices are nice because it’s easy to find the minimum of the quadratic functions they describe — many tools in numerical linear algebra are based on this idea. Same goes for positive definite linear operators like the Laplacian, which can often be thought of as sort of infinite-dimensional matrices (if you took some time to read about the spectral theorem, you’ll know that this analogy runs even deeper, especially on compact domains). Given the ubiquity of Poisson equations in geometry and physics, it’s a damn good thing $\Delta$ is positive-semidefinite!

Exercise 3.4: Using Green’s first identity, show that $\Delta$ is negative-semidefinite on any compact surface $M$ without boundary. From a practical perspective, why are negative semi-definite operators just as good as positive semi-definite ones?

Test Functions

The solution to a geometric or physical problem is often described by a function: the temperature at each point on the Earth, the curvature at each point on a surface, the amount of light hitting each point of your retina, etc. Yet the space of all possible functions is mind-bogglingly large — too large to be represented on a computer. The basic idea behind the finite element method is to pick a smaller set of functions and try to find the best possible solution from within this set. More specifically, if $u$ is the true solution to a problem and $\{\phi_i\}$ is a collection of basis functions, then we seek the linear combination of these functions

$\tilde{u} = \sum_i x_i \phi_i,\ x_i \in \mathbb{R}$

such that the difference $||\tilde{u}-u||$ is as small as possible with respect to some norm. (Above we see a detailed curve $u$ and its best approximation $\tilde{u}$ by a collection of bump-like basis functions $\phi_i$.)

Let’s start out with a very simple question: suppose we have a vector $v \in \mathbb{R}^3$, and want to find the best approximation $\tilde{v}$ within a plane spanned by two basis vectors $e_1, e_2 \in \mathbb{R}^3$:

Since $\tilde{v}$ is forced to stay in the plane, the best we can do is make sure there’s error only in the normal direction. In other words, we want the error $\tilde{v} – v$ to be orthogonal to both basis vectors $e_1$ and $e_2$:

$\begin{array}{rcl} (\tilde{v} - v) \cdot e_1 &=& 0, \\ (\tilde{v} - v) \cdot e_2 &=& 0. \\ \end{array}$

In this case we get a system of two linear equations for two unknowns, and can easily compute the optimal vector $\tilde{v}$.

Now a harder question: suppose we want to solve a standard Poisson problem

$\Delta u = f.$

How do we check whether a given function $\tilde{u}$ is the best possible solution? The basic picture still applies, except that our bases are now functions $\phi$ instead of finite-dimensional vectors $e_i$, and the simple vector dot product $\cdot$ gets replaced by the $L^2$ inner product. Unfortunately, when trying to solve a Poisson equation we don’t know what the correct solution $u$ looks like (otherwise we’d be done already!). So instead of the error $\tilde{u} – u$, we’ll have to look at the residual $\Delta \tilde{u} – f$, which measures how closely $\tilde{u}$ satisfies our original equation. In particular, we want to “test” that the residual vanishes along each basis direction $\phi_j$:

$\langle \Delta \tilde{u} - f, \phi_j \rangle = 0,$

again resulting in a system of linear equations. This condition ensures that the solution behaves just as the true solution would over a large collection of possible “measurements.”

Next, let’s work out the details of this system for a triangulated surface. The most natural choice of basis functions are the piecewise linear hat functions $\phi_i$, which equal one at their associated vertex and zero at all other vertices:

At this point you might object: if all our functions are piecewise linear, and $\Delta$ is a second derivative, aren’t we just going to get zero every time we evaluate $\Delta u$? Fortunately we’re saved by Green’s identity — let’s see what happens if we apply this identity to our triangle mesh, by breaking up the integral into a sum over individual triangles $\sigma$:

$\begin{array}{rcl} \langle \Delta u, \phi_j \rangle &=& \sum_k \langle \Delta u, \phi_j \rangle_{\sigma_k} \\ &=& \sum_k \langle \nabla u, \nabla \phi_j \rangle_{\sigma_k} + \sum_k \langle N \cdot \nabla u, \phi_j \rangle_{\partial\sigma_k}. \\ \end{array}$

If the mesh has no boundary then this final sum will disappear since the normals of adjacent triangles are oppositely oriented, hence the boundary integrals along shared edges cancel each-other out:

In this case, we’re left with simply

$\langle \nabla u, \nabla \phi_j \rangle$

in each triangle $\sigma_k$. In other words, we can “test” $\Delta u$ as long as we can compute the gradients of both the candidate solution $u$ and each basis function $\phi_j$. But remember that $u$ is itself a linear combination of the bases $\phi_i$, so we have

$\langle \nabla u, \nabla \phi_j \rangle = \left\langle \nabla \sum_i x_i \phi_i, \nabla \phi_j \right\rangle = \sum_i x_i \langle \nabla \phi_i, \nabla \phi_j \rangle.$

The critical thing now becomes the inner product between the gradients of the basis functions in each triangle. If we can compute these, then we can simply build the matrix

$A_{ij} := \langle \nabla \phi_i, \nabla \phi_j \rangle$

and solve the problem

$A x = b$

for the coefficients $x$, where the entries on the right-hand side are given by $b_i = \langle f, \phi_i \rangle$ (i.e., we just take the same “measurements” on the right-hand side).

Exercise 3.5: Show that the aspect ratio of a triangle can be expressed as the sum of the cotangents of the interior angles at its base, i.e.,

$\frac{w}{h} = \cot{\alpha} + \cot{\beta}.$

Exercise 3.6: Let $e$ be the edge vector along the base of a triangle. Show that the gradient of the hat function $\phi$ on the opposite vertex is given by

$\nabla \phi = \frac{e^\perp}{2A},$

where $e^\perp$ is the vector $e$ rotated by a quarter turn in the counter-clockwise direction and $A$ is the area of the triangle.

Exercise 3.7: Show that for any hat function $\phi$ associated with a given vertex

$\langle \nabla \phi, \nabla \phi \rangle = \frac{1}{2}( \cot\alpha + \cot\beta )$

within a given triangle, where $\alpha$ and $\beta$ are the interior angles at the remaining two vertices.

Exercise 3.8: Show that for the hat functions $\phi_i$ and $\phi_j$ associated with vertices $i$ and $j$ (respectively) of the same triangle, we have

$\langle \nabla \phi_i, \nabla \phi_j \rangle = -\frac{1}{2} \cot \theta,$

where $\theta$ is the angle between the opposing edge vectors.

Putting all these facts together, we find that we can express the Laplacian of $u$ at vertex $i$ via the infamous cotan formula

$(\Delta u)_i = \frac{1}{2} \sum_j (\cot \alpha_j + \cot \beta_j )( u_j - u_i ),$

where we sum over the immediate neighbors of vertex $i$.

Differential Forms

The “Galerkin” approach taken above reflects a fairly standard way to discretize partial differential equations. But let’s try a different approach, based on exterior calculus. Again we want to solve the Poisson equation $\Delta u = f$, which (if you remember our discussion of differential operators) can also be expressed as

$\star d \star d u = f.$

We already outlined how to discretize this kind of expression in the notes on discrete exterior calculus, but let’s walk through it step by step. We start out with a 0-form $u$, which is specified as a number $u_i$ at each vertex:

Next, we compute the discrete exterior derivative $d u$, which just means that we want to integrate the derivative along each edge:

$(du)_{ij} = \int_{e_{ij}}\!du = \int_{\partial e_{ij}}\!u = u_j - u_i.$

(Note that the boundary $\partial e_{ij}$ of the edge is simply its two endpoints $v_i$ and $v_j$.) The Hodge star converts a circulation along the edge $e_{ij}$ into the flux through the corresponding dual edge $e^\star_{ij}$. In particular, we take the total circulation along the primal edge, divide it by the edge length to get the average pointwise circulation, then multiply by the dual edge length to get the total flux through the dual edge:

$(\star du)_{ij} = \frac{|e^\star_{ij}|}{e_{ij}}(u_j - u_i).$

Here $|e_{ij}|$ and $e^\star_{ij}$ denote the length of the primal and dual edges, respectively. Next, we take the derivative of $\star d u$ and integrate over the whole dual cell:

$(d \star d u)_i = \int_{C_i} d \star d u = \int_{\partial C_i} \star d u = \sum_j \frac{|e^\star_{ij}|}{|e_{ij}|}( u_j - u_i ).$

The final Hodge star simply divides this quantity by the area of $C_i$ to get the average value over the cell, and we end up with a system of linear equations

$(\star d \star d u)_i = \frac{1}{|C_i|} \sum_j \frac{|e^\star_{ij}|}{|e_{ij}|}( u_j - u_i ) = f_i$

where $f_i$ is the value of the right-hand side at vertex $i$. In practice, however, it’s often preferable to move the area factor $|C_i|$ to the right hand side, since the resulting system

$(\star d \star d u)_i = \sum_j \frac{|e^\star_{ij}|}{|e_{ij}|}( u_j - u_i ) = |C_i| f_i$

can be represented by a symmetric matrix. (Symmetric matrices are often easier to deal with numerically and lead to more efficient algorithms.) Another way of looking at this transformation is to imagine that we discretized the system

$d \star d u = \star f.$

In other words, we converted an equation in terms of 0-forms into an equation in terms of $n$-forms. When working with surfaces, the operator $d \star d$ is sometimes referred to as the conformal Laplacian, because it does not change when we subject our surface to a conformal transformation. Alternatively, we can think of $d \star d$ as simply an operator that gives us the value of the Laplacian integrated over each dual cell of the mesh (instead of the pointwise value).

Exercise 3.9: Consider a simplicial surface and suppose we place the vertices of the dual mesh at the circumcenters of the triangles (i.e., the center of the unique circle containing all three vertices):

Demonstrate that the dual edge $e^\star$ (i.e., the line between the two circumcenters) bisects the primal edge orthogonally, and use this fact to show that

$\frac{|e^\star_{ij}|}{|e_{ij}|} = \frac{1}{2}(\cot \alpha_j + \cot \beta_j).$

Hence the DEC discretization yields precisely the same “cotan-Laplace” operator as the Galerkin discretization.

Meshes and Matrices

So far we’ve been giving a sort of “algorithmic” description of operators like Laplace. For instance, we determined that the Laplacian of a scalar function $u$ at a vertex $i$ can be approximated as

$(\Delta u)_i = \frac{1}{2} \sum_j (\cot\alpha_j+\cot\beta_j)(u_j - u_i),$

where the sum is taken over the immediate neighbors $j$ of $i$. In code, this sum could easily be expressed as a loop and evaluated at any vertex. However, a key aspect of working with discrete differential operators is building their matrix representation. The motivation for encoding an operator as a matrix is so that we can solve systems like

$\Delta u = f$

using a standard numerical linear algebra package. (To make matters even more complicated, some linear solvers are perfectly happy to work with algorithmic representations of operators called callback functions — in general, however, we’ll need a matrix.)

In the case of the Poisson equation, we want to construct a matrix $L \in \mathbb{R}^{|V| \times |V|}$ (where $|V|$ is the number of mesh vertices) such that for any vector $u \in \mathbb{R}^{|V|}$ of values at vertices, the expression $Lu$ effectively evaluates the formula above. But let’s start with something simpler — consider an operator $B$ that computes the sum of all neighboring values:

$(Bu)_i = \sum_j u_j$

How do we build the matrix representation of this operator? Think of $B$ as a machine that takes a vector $u$ of input values at each vertex and spits out another vector $Bu$ of output values. In order for this story to make sense, we need to know which values correspond to which vertices. In other words, we need to assign a unique index to each vertex of our mesh, in the range $1, \ldots, |V|$:

It doesn’t matter which numbers we assign to which vertices, so long as there’s one number for every vertex and one vertex for every number. This mesh has twelve vertices and vertex 1 is next to vertices 2, 3, 4, 5, and 9. So we could compute the sum of the neighboring values as

$(Bu)_1 = \left[ \begin{array}{cccccccccccc} 0 & 1 & 1 & 1 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \end{array} \right]\left[ \begin{array}{c} u_1 \\ u_2 \\ u_3 \\ u_4 \\ u_5 \\ u_6 \\ u_7 \\ u_8 \\ u_9 \\ u_{10} \\ u_{11} \\ u_{12} \end{array} \right].$

Here we’ve put a “1″ in the $j$th place whenever vertex $j$ is a neighbor of vertex $i$ and a “0″ otherwise. Since this row gives the “output” value at the first vertex, we’ll make it the first row of the matrix $B$. The entire matrix looks like this:

$B = \left[ \begin{array}{cccccccccccc} 0 & 1 & 1 & 1 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 1 & 0 & 1 & 0 & 0 & 1 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 \\ 1 & 1 & 0 & 0 & 1 & 1 & 0 & 1 & 0 & 0 & 0 & 0 \\ 1 & 0 & 1 & 1 & 0 & 0 & 1 & 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 0 & 1 & 0 & 0 & 1 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & 1 \\ 1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 0 \\ 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 1 & 0 & 1 & 1 \\ 0 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 1 & 1 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 & 0 & 1 & 1 & 0 \\ \end{array} \right]$

(You could verify that this matrix is correct, or you could go outside and play in the sunshine. Your choice.) In practice, fortunately, we don’t have to build matrices “by hand” — we can simply start with a matrix of zeros and fill in the nonzero entries by looping over local neighborhoods of our mesh.

Finally, one important thing to notice here is that many of the entries of $B$ are zero. In fact, for most discrete operators the number of zeros far outnumbers the number of nonzeros. For this reason, it’s usually a good idea to use a sparse matrix, i.e., a data structure that stores only the location and value of nonzero entries (rather than explicitly storing every single entry of the matrix). The design of sparse matrix data structures is an interesting question all on its own, but conceptually you can imagine that a sparse matrix is simply a list of triples $(i,j,x)$ where $i,j \in \mathbb{N}$ specify the row and column index of a nonzero entry and $x \in \mathbb{R}$ gives its value.

Scalar Poisson Equation

In the first part of the coding assignment you’ll build the cotan-Laplace operator and use it to solve the scalar Poisson equation

$\Delta \phi = \rho$

on a triangle mesh, where $\rho$ can be thought of as a (mass or charge) density and $\phi$ can be thought of as a (gravitational or electric) potential. Once you’ve implemented the methods below, you can visualize the results via the libDDG GUI. (If you want to play with the density function $\rho$, take a look at the method Viewer::updatePotential.)

Coding 3.1: Implement the method Mesh::indexVertices() which assigns a unique ID to each vertex in the range $0, \ldots, |V|-1$.

Coding 3.2: Derive an expression for the cotangent of a given angle purely in terms of the two incident edge vectors and the standard Euclidean dot product $(\cdot)$ and cross product $(\times)$. Implement the method HalfEdge::cotan(), which computes the cotangent of the angle across from a given halfedge.

Coding 3.3: Implement the methods Face::area() and Vertex::dualArea(). For the dual area of a vertex you can simply use one-third the area of the incident faces — you do not need to compute the area of the circumcentric dual cell. (This choice of area will not affect the order of convergence.)

Coding 3.4: Using the methods you’ve written so far, implement the method Mesh::buildLaplacian() which builds a sparse matrix representing the cotan-Laplace operator. (Remember to initialize the matrix to the correct size!)

Coding 3.5: Implement the method Mesh::solveScalarPoissonProblem() which solves the problem $\Delta\phi = \rho$ where $\rho$ is a scalar density on vertices (stored in Vertex::rho). You can use the method solve from SparseMatrix.h; $\rho$ and $\phi$ should be represented by instances of DenseMatrix of the appropriate size. Be careful about appropriately incorporating dual areas into your computations; also remember that the right-hand side cannot have a constant component!

You should verify that your code produces results that look something like these two images (density on the left; corresponding potential on the right):

(Take a look at the new menu items in the GUI.)

Implicit Mean Curvature Flow

Next, you’ll use nearly identical code to smooth out geometric detail on a surface mesh (also known as fairing or curvature flow). The basic idea is captured by the heat equation, which describes the way heat diffuses over a domain. For instance, if $u$ is a scalar function describing the temperature at every point on the real line, then the heat equation is given by

$\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2}.$

Geometrically this equation simply says that concave bumps get pushed down and convex bumps get pushed up — after a long time the heat distribution becomes completely flat. We also could have written this equation using the Laplacian: $\frac{\partial u}{\partial t} = \Delta u$. In fact, this equation is exactly the one we’ll use to smooth out a surface, except that instead of considering the evolution of temperature, we consider the flow of the surface $f: M \rightarrow \mathbb{R}^3$ itself:

$\frac{\partial f}{\partial t} = \Delta f.$

Remember from our discussion of vertex normals that $\Delta f = 2HN,$ i.e., the Laplacian of position yields (twice) the mean curvature times the unit normal. Therefore, the equation above reads, “move the surface in the direction of the normal, with strength proportional to mean curvature.” In other words, it describes a mean curvature flow.

So how do we compute this flow? We already know how to discretize the term $\Delta f$ — just use the cotangent discretization of Laplace. But what about the time derivative $\frac{\partial f}{\partial t}$? There are all sorts of interesting things to say about discretizing time, but for now let’s use a very simple idea: the change over time can be approximated by the difference of two consecutive states:

$\frac{\partial f}{\partial t} \approx \frac{f_h - f_0}{h},$

where $f_0$ is the initial state of our system (here the initial configuration of our mesh) and $f_h$ is the configuration after a mean curvature flow of some duration $h > 0$. Our discrete mean curvature flow then becomes

$\frac{f_h - f_0}{h} = \Delta f.$

The only remaining question is: which values of $f$ do we use on the right-hand side? One idea is to use $f_0$, which results in the system

$f_h = f_0 + h\Delta f_0.$

This scheme, called forward Euler, is attractive because it can be evaluated directly using the known data $f_0$ — we don’t have to solve a linear system. Unfortunately, forward Euler is not numerically stable, which means we can take only very small time steps $h$. One attractive alternative is to use $f_h$ as the argument to $\Delta$, leading to the system

$\underbrace{(I - h \Delta)}_A f_h = f_0,$

where $I$ is the identity matrix (try the derivation yourself!) This scheme, called backward Euler, is far more stable, though we now have to solve a linear system $A f_h = f_0$. Fortunately this system is highly sparse, which means it’s not too expensive to solve in practice. (Note that this system is not much different from the Poisson system.)

Coding 3.6: Implement the method Mesh::buildFlowOperator(), which should look very similar to Mesh::buildLaplacian.

Coding 3.7: Implement the method Mesh::computeImplicitMeanCurvatureFlow(). Note that you can treat each of the components of $f$ ($x$, $y$, and $z$) as separate scalar quantities.

You should verify that your code produces results that look something like these two images (original mesh on the left; smoothed mesh on the right):

Skeleton Code and Handin

Skeleton code for this assignment can be found here:

ddg_poisson.zip

(You may want to use your Makefile from the last project!) You can easily locate all the methods that need to be implemented by searching for the string “TODO.” Please submit the entire project this time as a single zip file (i.e., not just the files you modify).

## Homework 2: Normals of Discrete Surfaces

[Homework 2 has been released (below) and is due next Thursday by 5pm. We suspect there will be a lot of questions about C++, libDDG, etc., so please do not hesitate to ask questions either in the comments (preferred) or via email. Finally, some of the exercises require knowledge of exterior calculus that we haven't quite seen yet in class -- we will cover this material next time, but for those who want to get an early start I will post some additional notes that you can read through at your leisure. Good luck!]

For a smooth surface in $\mathbb{R}^3$, the normal direction is easy to define: it is the unique direction orthogonal to all tangent vectors — in other words, it’s the direction sticking “straight out” of the surface. For discrete surfaces the story is not so simple. If a mesh has planar faces (all vertices lie in a common plane) then of course the normal is well-defined: it is simply the normal of the plane. But if the polygon is nonplanar, or if we ask for the normal at a vertex, then it is not as clear how the normal should be defined.

In practice there are a number of different possibilities, which arise from different ways of looking at the smooth geometry. But before jumping in, let’s establish a few basic geometric facts.

Area of a Polygon

Here’s a simple question: how do you compute the area of a polygon in the plane? Suppose our polygon has vertices $p_1, p_2, \ldots, p_n$. One way to compute the area is to stick another point $q$ in the middle and sum up the areas of triangles $q,p_i,p_{i+1}$ as done on the left:

A cute fact about life is that if we place $q$ anywhere and sum up the signed triangle areas, we still recover the polygon area! (Signed area just means negative if our vertices are oriented clockwise; positive if they’re counter-clockwise.) You can get an idea of why this happens just by looking at the picture: positive triangles that cover “too much” area get accounted for by negative triangles.

The proof is an application of Stokes’ theorem — consider a different expression for the area $A$ of a planar polygon $P$:

$A = \int_P dx \wedge dy.$

Noting that $dx \wedge dy = d(xdy) = -d(ydx)$, we can also express the area as

$A = \frac{1}{2} \int_P d(xdy) - d(ydx) = \frac{1}{2} \int_{\partial P} xdy - ydx,$

where we’ve applied Stokes’ theorem in the final step to convert our integral over the entire surface into an integral over just the boundary. Now suppose that our polygon vertices have coordinates $p_i = (x_i,y_i)$. From here we can explicitly work out the boundary integral by summing up the integrals over each edge $e_{ij}$:

$\int_{\partial P} xdy - ydx = \sum \int_{e_{ij}} xdy - ydx.$

Since the coordinate functions $x$ and $y$ are linear along each edge (and their differentials $dx$ and $dy$ are therefore constant), we can write these integrals as

$\begin{array}{rcl} \sum \int_{e_{ij}} xdy - ydx &=& \sum \frac{x_i+x_j}{2}(y_j-y_i) - \frac{y_i+y_j}{2}(x_j-x_i) \\ &=& \frac{1}{2} \sum (p_i + p_j) \times (p_j - p_i) \\ &=& \frac{1}{2} \sum p_i \times p_j - p_i \times p_i - p_j \times p_j - p_j \times p_i \\ &=& \sum p_i \times p_j. \end{array}$

In short, we’ve shown that the area of a polygon can be written as simply

$A = \frac{1}{2} \sum_i p_i \times p_j.$

Exercise 2.1 Complete the proof by showing that for any point $q$ the signed areas of triangles $(q,p_i,p_{i+1})$ sum to precisely the expression above.

Vector Area

A more general version of the situation we just looked at with polygon areas is the vector area of a surface patch $f: M \rightarrow \mathbb{R}^3$, which is defined as the integral of the surface normal over the entire domain:

$N_\mathcal{V} := \int_M N dA.$

A very nice property of the vector area is that it depends only on the shape of the boundary $\partial M$ (as you will demonstrate in the next exercise). As a result, two surfaces that look very different (such as the ones above) can still have the same vector area — the physical intuition here is that the vector area measures the total flux through the boundary curve.

For a flat region the normal is constant over the surface and we get just the usual area times the unit normal vector. Things get more interesting when the surface is not flat — for instance, the vector area of a circular band is zero since opposing normals cancel each-other out:

Exercise 2.2 Using Stokes’ theorem, show that the vector area can be written as

$N_\mathcal{V} = \frac{1}{2} \int_{\partial M} f \wedge df,$

where the product of two vectors in $\mathbb{R}^3$ is given by the usual cross product $\times$.

Here’s another fairly basic question: consider a triangle sitting in $\mathbb{R}^3$, and imagine that we’re allowed to pull on one of its vertices $p$. What’s the quickest way to increase its area $A$? In other words, what’s the gradient of $A$ with respect to $p$?

Exercise 2.3 Show that the area gradient is given by

$\nabla_p A_\sigma = \frac{1}{2} \mathbf{u}^{\perp}$

where $\mathbf{u^\perp}$ is the edge vector across from $p$ rotated by an angle $\pi/2$ in the plane of the triangle (such that it points toward $p$). You should require only a few very simple geometric arguments — there’s no need to write things out in coordinates, etc.

Ok, with these facts out of the way let’s take a look at some different ways to define vertex normals. There are essentially only two definitions that arise naturally from the smooth picture: the area gradient and the volume gradient; we’ll start with the former.

The area gradient asks, “which direction should we `push’ the surface in order to increase its total area $A$ as quickly as possible?” Sliding all points tangentially along the surface clearly doesn’t change anything: we just end up with the same surface. In fact, the only thing we can do to increase surface area is move the surface in the normal direction. The idea, then, is to define the vertex normal as the gradient of area with respect to a given vertex.

Since we already know how to express the area gradient for a single triangle $\sigma$, we can easily express the area gradient for the entire surface:

$\nabla_p A = \sum_\sigma \nabla A_\sigma.$

Of course, a given vertex $p$ influences only the areas of the triangles touching $p$. So we can just sum up the area gradients over this small collection of triangles.

Exercise 2.4 Show that the gradient of surface area with respect to vertex $p_i$ can be expressed as

$\nabla_p A = \frac{1}{2}\sum_j (\cot\alpha_j + \cot\beta_j)( p_j - p_i )$

where $p_j$ is the coordinate of the $j$th neighbor of $p_i$ and $\alpha_j$ and $\beta_j$ are the angles across from edge $(p_i,p_j)$.

Mean Curvature Vector

The expression for the area gradient derived in the last exercise shows up all over discrete differential geometry, and is often referred to as the cotan formula. Interestingly enough this same expression appears when taking a completely different approach to defining vertex normals, by way of the mean curvature vector $HN$. In particular, for a smooth surface $f: M \rightarrow \mathbb{R}^3$ we have

$\Delta f = 2HN$

where $H$ is the mean curvature, $N$ is the unit surface normal (which we’d like to compute), and $\Delta$ is the Laplace-Beltrami operator (see below). Therefore, another way to define vertex normals for a discrete surface is to simply apply a discrete Laplace operator to the vertex positions and normalize the resulting vector.

The question now becomes, “how do you discretize the Laplacian?” We’ll take a closer look at this question in the future, but the remarkable fact is that the most straightforward discretization of $\Delta$ leads us right back to the cotan formula! In other words, the vertex normals we get from the mean curvature vector are precisely the same as the ones we get from the area gradient.

This whole story also helps us get better intuition for the Laplace-Beltrami operator $\Delta$ itself. Unfortunately, there’s no really nice way to write $\Delta$ — the standard coordinate formula you’ll find in a textbook on differential geometry is $\Delta \phi = \frac{1}{\sqrt{|g|}} \frac{\partial}{\partial x^i}(\sqrt{|g|} g^{ij} \frac{\partial}{\partial x^j}\phi)$, where $g$ is the metric. However, this obfuscated expression provides little intuition about what $\Delta$ really does, and is damn-near useless when it comes to discretization since for a triangle mesh we never have a coordinate representation of $g$! Later, using exterior calculus, we’ll see that the (0-form) Laplacian can be expressed as $\Delta = \star d \star d$, which leads to a fairly straightforward discretization. But for now, we’ll make use of a nice tool we learned about earlier: conformal parameterization. Remember that if $f$ is a conformal map, then lengths on $M$ and lengths on $f(M)$ are related by a positive scaling $e^{u}$. In other words, $|df(X)| = e^{u}|X|$ for some real-valued function $u$ on $M$. Moreover, a conformal parameterization always exists — in other words, we don’t have to make any special assumptions about our geometry in order to use conformal coordinates in proofs or other calculations. The reason conformal coordinates are useful when talking about Laplace-Beltrami is that we can write $\Delta$ as simply a rescaling of the standard Laplacian in the plane, i.e., as the sum of second partial derivatives divided by the metric scaling factor $e^{2u}$:

$\Delta \phi = \frac{d(d\phi(X))(X) + d(d\phi(Y))(Y)}{e^{2u}},$

where $X$ and $Y$ are any pair of unit, orthogonal directions.

What’s the geometric meaning here? Remember that for a good old-fashioned function $\phi: \mathbb{R} \rightarrow \mathbb{R}$ in 1D, second derivatives basically tell us about the curvature of a function, e.g., is it concave or convex?

Well, since $\Delta$ is a sum of second derivatives, it’s no surprise that it tells us something about the mean curvature!

Exercise 2.5 Show that the relationship $\Delta f = 2HN$ holds.

An alternative way to come up with normals is to look at the volume gradient. Suppose that our surface encloses some region of space with total volume $\mathcal{V}$. As before, we know that sliding the surface along itself tangentially doesn’t really change anything: we end up with the same surface, which encloses the same region of space. Therefore, the quickest way to increase $\mathcal{V}$ is to again move the surface in the normal direction. A somewhat surprising fact is that, in the discrete case, the volume gradient actually yields a different definition for vertex normals than the one we got from the area gradient. To express this gradient, we’ll use three-dimensional versions of of our “basic facts” from above.

First, much like we broke the area of a polygon into triangles, we’re going to decompose the volume enclosed by our surface into a collection of tetrahedra. Each tetrahedron includes exactly one face of our discrete surface, along with a new point $q$. For instance, here’s what the volume might look like in the vicinity of a vertex $p$:

Just as in the polygon case the location of $q$ makes no difference, as long as we work with the signed volume of the tetrahedra. (Can you prove it?)

Next, what’s the volume gradient for a single tetrahedron? One way to write the volume of a tet is as

$\mathcal{V} = \frac{1}{3} A h,$

where $A$ is the area of the base triangle and $h$ is the height. Then using the same kind of geometric reasoning as in the triangle case, we know that

$\nabla_p \mathcal{V} = \frac{1}{3} A N,$

where $N$ is the unit normal to the base.

To express the gradient of the enclosed volume with respect to a given vertex $p$, we simply sum up the gradients for the tetrahedra containing $p$:

$\nabla_p \mathcal{V} = \sum_i \mathcal{V}_i = \frac{1}{3} \sum_i A_i N_i.$

At first glance this sum does not lead to a nice expression for $\Delta_p \mathcal{V}$ — for instance, it uses the normals $N_i$ of faces that have little to do with our surface geometry. However, remember that we can place $q$ anywhere we please and still get the same expression for volume. In particular, if we put $q$ directly on top of $p$, then the $N_i$ and $A_i$ coincide with the normals and areas (respectively) of the faces containing $p$ from our original surface:

Exercise 2.6 Show that the volume gradient points in the same direction as the vector area $N_\mathcal{V}$ (i.e., show that they are the same up to a constant factor).

Other Definitions

So far we’ve only looked at definitions for vertex normals that arise from some smooth definition. This way of thinking captures the essential spirit of discrete differential geometry: relationships from the smooth setting should persist unperturbed in the discrete setting (e.g., $\Delta f = 2HN$ should be true independent of whether $\Delta$, $H$, and $N$ are smooth objects or discrete ones). Nonetheless, there are a number of common definitions for vertex normals that do not have a known origin in the smooth world. (Perhaps you can find one?)

Uniform Weighting

Perhaps the simplest way to get vertex normals is to just add up the neighboring face normals:

$N_U := \sum_i N_i$

The main drawback to this approach is that two different tessellations of the same geometry can produce very different vertex normals, as illustrated above.

Tip-Angle Weights

A simple way to reduce dependence on the tessellation is to weigh face normals by their corresponding tip angles $\theta$, i.e., the interior angles incident on the vertex of interest:

$N_\theta := \sum_i \theta_i N_i$

Sphere-Inscribed Polytope

Here’s another interesting approach to vertex normals: consider the sphere $S^2$ consisting of all points unit distance from the origin in $\mathbb{R}^3$. A nice fact about the sphere is that the unit normal $N$ at a point $x \in S^2$ is simply the point itself! I.e., $N(x) = x$. So if we start out with a polytope whose vertices all sit on the sphere, one reasonable way to define vertex normals is to simply use the vertex positions.

In fact, it’s not too hard to show that the direction of the normal at a vertex $p_i$ can be expressed purely in terms of the edge vectors $e_j = p_j – p_i$, where $p_j$ are the immediate neighbors of $p_i$. In particular, we have

$N_S = \frac{1}{c} \sum_{j=0}^{n-1} \frac{e_j \times e_{j+1}}{|e_j|^2 |e_{j+1}|^2}$

where the constant $c \in \mathbb{R}$ can be ignored since we’re only interested in the direction of the normal. (For a detailed derivation of this expression, see Max, “Weights for Computing Vertex Normals from Facet Normals.”) Of course, since this expression depends only on the edge vectors it can be evaluated on any mesh (not just those inscribed in a sphere).

Coding Assignment

Implement the following methods in libDDG:

• Vertex::normalEquallyWeighted()
Purpose: returns unit vertex normal using uniform weights $N_U$
• Vertex::normalAreaWeighted()
Purpose: returns unit vertex normal using face area weights $N_\mathcal{V}$
• Vertex::normalAngleWeighted()
Purpose: returns unit vertex normal using tip angle weights $N_\theta$
• Vertex::normalMeanCurvature()
Purpose: returns unit mean curvature normal $\Delta f$
• Vertex::normalSphereInscribed()
Purpose: returns unit sphere-inscribed normal $N_S$

(The definitions for these methods can be found in libddg/src/Vertex.cpp.)

Once you’ve successfully implemented these methods, test them out on the provided meshes. For convenience you can flip through the different methods using the keys [1]-[5]. You can also see what the mesh looks like by viewing it in “wireframe” mode. Do you notice that some definitions work better than others? When? Why? The only thing you need to submit for the coding assignment (via email) is your modified version of Vertex.cpp. (Of course, if you modify other files you should submit these too! For instance, you might find it convenient to add a method HalfEdge::cotan() that computes the cotangent of the angle across from a given half edge.)

Example meshes can be found here: ddg_hw2_meshes.zip

## Code for Next Assignment Posted

ddg_normals.zip

We’ll provide more details about the actual assignment later this week (including the due date), but for now just try to see if you can get the code to compile. Instructions on getting started can be found in the user guide — this precise sequence of steps may not work for you, but it should at least get you started. If you’re having trouble (or finding wild success!), feel free to gripe about it in the comments (below).

If you’ve never seen C++ before, don’t freak out — the path is pretty clearly laid out for you in the assignment.

Finally, we will be arranging a session (probably some evening later this week — with food!) to help setup the code and get everything compiling/linking. Please leave a comment below indicating which evenings/times work for you.