## Homework 4: Curvature Flow

In this homework we’ll take a closer look at curvature flow. We already saw one example of curvature flow (mean curvature flow) while studying the Poisson equation. The general idea behind curvature flow is that we have an energy $E$ measuring the smoothness of our geometry, and can reduce this energy by traveling along the direction of steepest descent. Conceptually, you can imagine that $E$ is some kind of potential — surfaces with many wrinkles have a lot of energy, and want to reduce this energy by “relaxing” into a smoother state. This description suggests a sort of “energy landscape,” where high peaks correspond to wrinkly surfaces and low valleys correspond to smooth ones. Here’s a two-dimensional cartoon of what this landscape might look like:

To smooth out the geometry we can “ski” downhill, discovering a sequence of smoother and smoother surfaces along the way. Visually the effect is akin to a pat of butter slowly melting away on a hot piece of toast, or a water droplet buckling into a perfectly round sphere.

To be more concrete, let $f$ be an immersion of a manifold $M$ (e.g., a curve or surface) into Euclidean space, and suppose that $E$ is a real-valued function of $f$. Then a curvature flow is the solution to the partial differential equation
$\dot{f} = -\nabla E(f)$
starting with some initial immersion $f_0$, where $\dot{f}$ denotes the derivative in time. In words, this equation just says that the difference in position of the surface at two consecutive points in time is equal to the change in position that reduces the energy quickest.

For surfaces, two common energies are the Dirichlet energy
$E_D(f) = \frac{1}{4} \int_M |\nabla f|^2 dA$
and the Willmore energy
$E_W(f) = \frac{1}{4} \int_M (\kappa_1 - \kappa_2)^2 dA.$
where as usual $\kappa_1$ and $\kappa_2$ are the principal curvatures induced by $f$. Both energies somehow measure the “wrinkliness” of a surface, but how are they related?

Exercise 4.1
For a surface $M$ without boundary show that, up to an additive constant, the Willmore energy can be expressed as
$E_W = \int_M H^2\ dA$
and explain why this constant does not matter in the context of curvature flow. Hint: Gauss-Bonnet.

In other words, Dirichlet energy looks like the (squared) $L^2$ norm of the gradient, whereas Willmore energy looks like the (squared) $L^2$ norm of mean curvature. Superficially these quantities look quite different, but in fact they are quite similar!

Exercise 4.2
For a surface $M$ without boundary show that (again up to constant factors)
$E_D = \langle \Delta f, f \rangle,$
$E_W = \langle \Delta^2 f, f \rangle,$
Hint: Green’s first identity and the definition of the mean curvature normal.

From these expressions, it may appear that Dirichlet and Willmore energy are nice, simple, quadratic functions of $f$. Don’t be fooled! The Laplace-Beltrami operator $\Delta$ depends on the immersion $f$ itself, which means that the corresponding gradient flows are rather nasty and nonlinear. Later on we’ll look at a couple ways to deal with this nonlinearity.

Now that have a couple energies to work with, how do we derive gradient flow? Previously we defined the gradient of a function $\phi: \mathbb{R}^n \rightarrow \mathbb{R}$ as
$\nabla\phi = \left[ \begin{array}{c} \partial\phi/\partial x^1 \\ \vdots \\ \partial\phi/\partial x^n \end{array} \right],$
i.e., as just a list of all the partial derivatives. This definition works pretty well when $\phi$ is defined over a nice finite-dimensional vector space like $\mathbb{R}^n$, but what about something more exotic like the Willmore energy, which operates on an infinite-dimensional vector space of functions? In general, the gradient of a function $\phi$ at a point $x$ can be defined as the unique vector $\nabla \phi(x)$ satisfying
$\langle \nabla \phi(x), u \rangle = \lim_{h \rightarrow 0} \frac{\phi(x+hu)-\phi(x)}{h},$
for all vectors $u$, where $\langle \cdot, \cdot \rangle$ denotes the inner product on the vector space. In other words, taking the inner product with the gradient should yield the directional derivative in the specified direction. Notice that this definition actually serves as a definition of differentiability: a function is differentiable at $x$ if and only if all directional derivatives can be characterized by a single vector $\nabla\phi(x)$. Geometrically, then, differentiability means that if we “zoom in” far enough the function looks almost completely flat.

Exercise 4.3
Explain why the gradient is the direction of steepest ascent.

Exercise 4.4
Consider the function
$\phi: \mathbb{R}^2 \rightarrow \mathbb{R}; (x_1, x_2) \mapsto x_1^2 - x_2^2.$
Confirm that the gradient found using the expression above agrees with the usual gradient found via partial derivatives.

Exercise 4.5
Let $M$ be a surface without boundary. Assume that the Laplace-Beltrami operator $\Delta$ is constant with respect to the immersion $f$ and use our definition of $\nabla$ above to show that
$\nabla E_D(f) \approx HN.$
In other words, gradient flow on Dirichlet energy looks roughly like the mean curvature flow $\dot{f} = -HN$ that we studied in the previous assignment.

The approximate gradient $HN$ might be called a linearization of the true gradient — in general the idea is that we keep some quantity or some piece of an equation fixed so that the rest comes out to be a nice linear expression. This trick can be quite helpful in a practical setting, but it is typically worth understanding where and how the approximation affects the final result.

One final thing to mull over is the fact that the gradient depends on our particular choice of inner product $\langle \cdot, \cdot \rangle$, which appears on the left-hand side in our definition. Why does the inner product matter? Intuitively, the gradient picks out the direction in which the energy increases fastest. But what does “fastest” mean? For instance, if we use a vector of real numbers $\mathsf{x} \in \mathbb{R}^m$ to encode the vertices of a discrete curve, then what we really care about is the energy increase with respect to a change in the length of the curve — not the Euclidean length of the vector $\mathsf{x}$ itself. In terms of our energy landscape we end up with a picture like the one below — you can imagine, for instance, that arrows on the left have unit norm with respect to the standard Euclidean inner product whereas arrows on the right have unit norm with respect to the $L^2$ inner product on our discrete curve. As a result, gradient descent will proceed along two different trajectories:

Exercise 4.6
Consider an inner product $\langle \cdot, \cdot \rangle$ on $\mathbb{R}^n$ defined by a positive definite matrix $\mathsf{B} \in \mathbb{R}^{n \times n}$, i.e., $\langle u, v \rangle = \mathsf{u^T B v}$. Show that the gradient $\nabla_B$ induced by this inner product is related to the standard gradient $\nabla$ via
$\nabla_\mathsf{B} = \mathsf{B}^{-1} \nabla.$

In the discrete setting, the matrix $\mathsf{B}$ is sometimes referred to as the mass matrix, because it encodes the amount of “mass” each degree of freedom contributes to the total. When working with discrete differential forms, one possible choice mass matrix is given by (an appropriate constant multiple of) the diagonal Hodge star. This choice corresponds to applying piecewise-constant interpolation and then taking the usual $L^2$ inner product. For instance, here’s what piecewise constant interpolation looks like for a primal 1-form on a triangulated surface — the integrated value stored on a given edge gets “spread out” over the so-called diamond region associated with that edge:

In general, let $\star_k$ be a real diagonal matrix with one entry for each $k$-dimensional simplex $\sigma_i$. The nonzero entries are
$\left( \star_k \right)_{ii} = \frac{|\sigma_i^\star|}{|\sigma_i|},$
where $\sigma_i^\star$ is the circumcentric dual of $\sigma$, and $|\cdot|$ denotes the (unsigned) volume. The corresponding mass matrix on primal discrete $k$-forms in $n$ dimensions is then
$B_k = \left( \begin{array}{c} n \\ k \end{array} \right) \star_k,$
i.e., a binomial coefficient times a (primal) diagonal Hodge star. The mass matrices for dual $k$-forms can likewise be expressed as constant multiples of the inverse:
$B^\star_k = \left( \begin{array}{c} n \\ k \end{array} \right) \star_{n-k}^{-1}.$
These matrices will come in handy when deriving equations for discrete curvature flow.

Flow on Curves

For the remainder of this assignment, we’re going to make life simpler by working with planar curves instead of surfaces. As discussed earlier, we can describe the geometry of a curve via an immersion
$\gamma: I \rightarrow \mathbb{R}^2; s \mapsto \gamma(s)$
of some interval $I = [0,L] \subset \mathbb{R}$ into the Euclidean plane $\mathbb{R}^2$. A common energy for curves is simply the integral of the curvature $\kappa$, squared:
$E(\gamma) = \int_0^L \kappa^2\ ds.$
Let’s first establish some facts about curvature in the smooth setting.

Exercise 4.7
The unit tangent field on a smooth curve $\gamma$ can be expressed as $T = (\cos\theta,\sin\theta)$ for some function $\theta: I \rightarrow \mathbb{R}$. Show that the normal curvature can be expressed as
$\kappa = d\theta(X)$
where $X$ is a positively-oriented unit vector field. In other words, the scalar curvature is change in the direction of the tangent.

Exercise 4.8
Explain in words why the total curvature of any closed immersed curve $\gamma$ (whether discrete or not) is an integer multiple of $2\pi$, i.e., $\gamma(0) = \gamma(L)$
$\int_0^L \kappa ds = 2\pi k,\ k \in \mathbb{Z}.$
(The number $k$ is called the turning number of the curve.)

A stronger statement is the Whitney-Graustein theorem which says that the turning number of a curve will be preserved by any regular homotopy, i.e., by any continuous motion that keeps the curve immersed. For instance, here’s an example of a motion that is not a regular homotopy — note that the curve gets “pinched” into a sharp cusp halfway thorugh the motion, at which point the turning number goes from $k=2$ to $k=1$:

We’ll keep these ideas in mind as we develop algorithms for curvature flow.

Discrete Curves

In the discrete setting, $\gamma$ is simply a collection of line segments connecting a sequence of vertices with coordinates $\gamma_1, \ldots, \gamma_n \in \mathbb{R}^2$:

Note that in the provided code framework, a curve is represented by a half edge mesh consisting of a single polygon. Therefore, to iterate over the curve you might write something like

 

   FaceIter gamma = mesh.faces.begin();
HalfEdgeIter he = gamma->he;
do {
// do something interesting here!
he = he->next;
}
while( he != gamma->he );

 

As with surfaces, we can consider both a primal and dual “mesh” associated with this curve — this time, each primal edge is associated with a dual vertex at its midpoint, and each primal vertex is associated with a dual edge connecting the two adjacent dual vertices:

In the language of discrete exterior calculus, then, $\gamma \in (\mathbb{R}^2)^n$ is a $\mathbb{R}^2$-valued primal 0-form, i.e., a value associated with each vertex.

Exercise 4.9
Show that the nonzero entries of the diagonal Hodge star on primal 0-forms are given by
$(\star_0)_{ii} = L_i,$
where
$L_i = \frac{1}{2}( |\gamma_{i+1}-\gamma_i| + |\gamma_i-\gamma_{i-1}| ).$

Coding 4.1
Implement the methods Edge::length() and Vertex::dualLength(), which should return the primal edge length and the circumcentric dual edge length, respectively. (The latter should be a one-liner that just calls the former!) Finally, implement the method IsometricWillmoreFlow1D::buildMassMatrix(), which builds the diagonal Hodge star on primal 0-forms.

Exercise 4.10
Show that on a discrete curve the total curvature along a dual edge $e^\star_{ij}$ is equal to the exterior angle $\varphi_{ij} \in \mathbb{R}$ at the corresponding vertex, i.e., the difference in angle between the two consecutive tangents:
$\varphi_{ij} = \theta_j - \theta_i = \int_{e^\star_{ij}} \kappa\ ds.$
(Hint: Stokes’ theorem!)

In other words, the exterior angle $\varphi$ gives us the integrated curvature. Applying the discrete Hodge star yields pointwise curvatures $\kappa$, which we will use as the degrees of freedom in our numerical curvature flow:
$\kappa = \star\varphi.$

Coding 4.2
Implement the method Vertex::curvature(), which returns the pointwise curvature as defined above. Hint: in the language of discrete exterior calculus, what kind of quantity is $\varphi$? And what kind of quantity is $\kappa$?

The next exercise should solidify your understanding of where all these different quantities live, and which operators take you back and forth from one space to another!

Exercise 4.11
Show that for a discrete curve $E(\gamma)$ can be written explicitly as
$E(\gamma) = \sum_i \varphi_i^2 / L_i,$
assuming we use piecewise constant interpolation of curvature.

Now that we have a discrete curvature energy, let’s derive an expression for the gradient.

Exercise 4.12

Let $\varphi$ be the angle made by two vectors $u,v \in \mathbb{R}^2$. Show that the gradient of $\varphi$ with respect to $u$ can be expressed as
$\nabla_u \varphi = -\frac{v_{\perp u}}{2A}$

where $v_{\perp u}$ denotes the component of $v$ orthogonal to $u$ and $A$ is the area of a triangle with sides $u$ and $v$.

Exercise 4.13
Let $L$ be the length of a vector $u = b-a$, where $a$ and $b$ are a pair of points in $\mathbb{R}^2$. Show that
$\nabla_a L = -\hat{u}$
and
$\nabla_b L = \hat{u}.$

Exercise 4.14
Collecting the results of the past few exercises, show that the gradient of the $i$th term of our curvature energy
$E_i = \varphi_i^2 / L_i$
with respect to vertex coordinates $\gamma_{i-1}$, $\gamma_i$, and $\gamma_{i+1}$ is given explicitly by
$\begin{array}{rcr} \nabla_{\gamma_{i-1}} E_i &=& \frac{\varphi_i}{L_i L_{i-1}} \left( \frac{v_{\perp u}}{A_i} + \frac{\varphi_i}{2L_i} \hat{u} \right) \\ \nabla_{\gamma_{i+1}} E_i &=& \frac{\varphi_i}{L_i^2} \left( \frac{u_{\perp v}-v_{\perp u}}{A_i} + \frac{\varphi_i}{2L_i} (\hat{v}-\hat{u}) \right) \\ \nabla_{\gamma_i} E_i &=& -\frac{\varphi_i}{L_i L_{i+1}} \left( \frac{u_{\perp v}}{A_i} + \frac{\varphi_i}{2L_i} \hat{v} \right) \\ \end{array}$

where $\varphi_i$ is the exterior angle at vertex $i$, $L_i$ is the dual edge length, and $A_i$ is the area of a triangle with edges $u = \gamma_i-\gamma_{i-1}$ and $v = \gamma_{i+1}-\gamma_i$. The gradient of $E_i$ with respect to all other vertices $\gamma_j$ is zero. Why? (Hint: remember to take the gradient with respect to the right metric!)

Coding 4.3
Implement the method WillmoreFlow1D::computeGradient() using the expressions above. The gradient of energy with respect to a given vertex should be stored in the member Vertex::energyGradient. Remember that the overall energy is a sum over the terms $E_i$, which means you will need to add up the contributions to the gradient at each vertex.

Coding 4.4
Implement the method WillmoreFlow1D::integrate(), which should integrate the flow equation
$\dot{\gamma} = -\nabla E(\gamma)$
using the forward Euler scheme. (See the end of the previous assignment for a brief discussion of time integration.) Run the code on the provided meshes, and report the maximum stable time step in each case, i.e., the largest time step for which the flow succeeds at smoothing out the curve. (The time step size can be adjusted using the keys ‘-‘, ‘=‘, ‘\_‘, and ‘+.’)

Curvature Flow in Curvature Space

If you feel exhausted at this point in the assignment, you’re not alone! Taking derivatives by hand can be a royal pain (and it gets even worse when you want second derivatives, which are required for more sophisticated algorithms like Newton descent). But it’s worth grinding out this kind of expression at least once in your life so that you really understand what’s involved. In practice there are a variety of alternatives, including numerical differentiation, automatic differentiation, and symbolic differentiation — these methods all have their place, and it’s well-worth understanding the tradeoffs they offer in terms of accuracy, efficiency, and code complexity.

But before getting mired in the bedraggled business of computer-based derivatives, it’s worth realizing that there is a tantalizing fourth alternative: come up with a simpler formulation of your problem! In particular, the mention of a convex-quadratic energy should make your mouth water and your heart beat faster, since these things make your life easier in a number of ways. “Convex-quadratic” means that your energy can be expressed as a real-valued homogeneous quadratic polynomial, i.e., as
$E(x) = \langle Ax, x \rangle$
for some positive-semidefinite self-adjoint linear operator $A$ that does not depend whatsoever on the argument $x$. For instance, suppose that in the discrete setting the degrees of freedom $x$ of our system are encoded by a vector $\mathsf{x} \in \mathbb{R}^n$. Then a quadratic energy can always be represented as
$E(x) = \mathsf{x^T A x}$
for some fixed symmetric positive-semidefinite matrix $\mathsf{A} \in \mathbb{R}^{n \times n}$. Earlier we visualized definiteness in terms of the graph of the energy in two dimensions:

Independent of definiteness, the gradient of a quadratic energy has a simple linear expression
$\nabla E(x) = 2\mathsf{B^{-1}Ax},$
where the matrix $\mathsf{B} \in \mathbb{R}^{n \times n}$ encodes the inner product. This setup not only simplifies the business of taking derivatives, but also makes things inexpensive to evaluate at the numerical level — for instance, we can apply the backward Euler method by just solving a linear system, instead of performing some kind of nasty nonlinear root finding. Moreover, since the matrix $\mathsf{A}$ is constant, we can save a lot of computation by prefactoring it once and applying backsubstitution many times. This setup also has some nice analytical features. For one thing, any local minimum of a convex energy is guaranteed to be a global minimum, which means that gradient descent will ultimately lead to an optimal solution. For another, there is an extremely well-established theory of linear PDEs, which allows one to easily answer questions about things like numerical stability. In contrast, the general theory of nonlinear PDEs is kind of a zoo.

Ok, enough religion! Let’s see how a quadratic formulation can help us with the specific problem of curvature flow. Actually, we already have a quadratic energy — it’s
$E(\kappa) = \int_0^L \kappa^2 ds.$
The only difference between this energy and the one we’ve been working with all along is that it’s a function of the curvature $\kappa$ rather than the immersion $f$ — as a result, we avoid all the nonlinearity associated with expressing $\kappa$ in terms of $f$. More concretely, at the discrete level we’re going to store and manipulate a single number $\kappa_i$ at each vertex, rather than computing it indirectly from the vertex coordinates $\gamma_i \in \mathbb{R}^2$.

Coding 4.5
Implement the method IsometricWillmoreFlow1D::getCurvature(), which simply evaluates the (pointwise) curvature at each vertex and stores it in the member Vertex::kappa. This method should call Vertex::curvature().

One nice consequence of this setup is that the gradient becomes extremely simple! Taking the gradient with respect to the $L^2$ inner product on 0-forms, we get
$\nabla E(\kappa) = -2\kappa.$
Gradient flow then becomes a simple, linear equation involving no spatial derivatives:
$\dot{\kappa} = -2\kappa.$

Coding 4.6
Implement the methods IsometricWillmoreFlow1D::computeFlowDirection() and IsometricWillmoreFlow1D::integrate(), which integrate the above flow equation using the forward Euler scheme. Hint: this step should be very easy!

If we want to actually draw the curve, we can integrate curvature to get tangents, then integrate tangents to get positions. In other words, we can recover the direction $\theta$ of the tangent via
$\theta(s) = \theta_0 + \int_0^s d\theta = \theta_0 + \int_0^s \kappa\ ds,$
where $\theta_0$ specifies the direction of the first tangent on our curve. The tangent vectors themselves are given by $T(s) = (\cos\theta(s),\sin\theta(s))$ as before. Once we have the tangents, we can recover the immersion itself via
$f(s) = f_0 + \int_I T(s)\ ds,$
where again $f_0$ specifies a “starting point” for the curve. In the discrete setting, these two steps correspond to a very simple reconstruction procedure: start out at some initial vertex and join the edges end to end, rotating by the exterior angles $\varphi_i$ at each step:

More explicitly, let
$\theta_i = \sum_{k=0}^i \varphi_k$
and let
$T_i = L_i (\cos\theta_i,\sin\theta_i),$
where $L_i$ is the length of the $i$th primal edge. Then the vertex positions along the curve can be recovered via
$\gamma_i = \sum_{k=0}^i T_k,$
mirroring the continuous formulae above. (Some questions to ponder: can you interpret these sums as piecewise integrals? What kind of quantity is $(\cos\varphi_i,\sin\varphi_i)$? What kind of quantity is $T_i$? What’s the relationship between the two?)

Coding 4.7
Implement the methods IsometricWillmoreFlow1D::recoverTangents() and IsometricWillmoreFlow1D::recoverPositions(), which should compute the values $T_i$ and $\gamma_i$, respectively. If you use an $O(n^2)$ algorithm to implement either of these methods you will get zero points! In other words, do not just evaluate the whole sum once for each vertex — there is obviously a better way to do it!

Note that the length of each edge is preserved by construction — after all, we build the curve out of segments that have the same length as in the previous curve! In other words, we get not only a curvature flow but an isometric curvature flow (in the smooth case, isometry is reflected in the fact that $(\cos\alpha,\sin\alpha)$ is always a unit vector).

Ok, sounds pretty good so far: we simply subtract some fraction of the curvature each vertex, and compute a couple cumulative sums. As an added bonus, we preserve length. Why haven’t people been doing this all along? The answer is: when something sounds too good to be true, it probably is!

In particular, let’s take a look at what happens when we work with a closed curve, i.e., a loop in the plane. If we make a completely arbitrary change to the curvature $\kappa$, there’s no reason to expect that segments joined end-to-end will close back up. In other words, the final vertex may be somewhere different from where the inital vertex appeared:

A fancy way of describing this situation is to say that the tangents we recover from this procedure are not integrable — they do not “integrate up” to form a closed loop. Similarly, the curvature itself is not integrable: the cumulative curvature function $\alpha$ does not describe the tangent direction of any closed loop. Why did this happen? Well, let’s go back and take a look at our condition on total curvature. We said that the curvature $\kappa$ of any closed loop $\gamma$ satisfies
$\int_{0}^L \kappa\ ds = 2\pi k$
for some turning number $k \in \mathbb{Z}$. Another way of saying the same thing is that the first and last tangents of our curve must match up: $T(0) = T(L)$. But if we change $\kappa$ arbitrarily, this condition will no longer hold.

Exercise 4.15
Suppose that at time zero, the curvature function $\kappa$ on a smooth curve $\gamma$ satisfies our condition on total curvature. Show that any change in curvature $\dot{\kappa}$ orthogonal to the constant function $1: [0,L] \rightarrow \mathbb{R}; s \mapsto 1$ with respect to the $L^2$ inner product will preserve this condition.

We also need a condition that ensures the endpoints will meet up, i.e., $\gamma(0) = \gamma(L)$. Although we will not derive it here, this condition again turns out to have a simple form:
$\int_0^L \kappa \gamma = 0;$
equivalently, $\dot{\kappa}$ must be ($L^2$-)orthogonal to the $x-$ and $y-$ coordinate functions of the immersion. Overall, then, we’re saying that the change in curvature must avoid a three-dimensional linear subspace of directions:
$\langle \dot{\kappa}, 1 \rangle = \langle \dot{\kappa}, \gamma_x \rangle = \langle \dot{\kappa}, \gamma_y \rangle = 0.$
Like convex-quadratic energies, linear constraints are particularly easy to work with — in the case of our flow, we can simply remove the component of $\dot{\kappa}$ that sits in this “forbidden” space. More specifically, suppose that this space is spanned by an orthonormal basis $\{c_i\}$. Then we can simply travel in the augmented direction
$\dot{\kappa}_c = \dot{\kappa} - \sum_{i=1}^3 \langle \dot{\kappa}, \hat{c}_i \rangle \hat{c}_i.$

Coding 4.8
Implement the method IsometricWillmoreFlow1D::buildConstraints(), which constructs the three constraint directions $1$, $\gamma_x$, $\gamma_y$ as dense column vectors.

Coding 4.9
Implement the method IsometricWillmoreFlow1D::orthogonalizeConstraints(), which builds an orthonormal basis $\{\hat{c}_1,\hat{c}_2,\hat{c}_3\}$ spanning the same space as the three constraint directions. Hint: use the Gram-Schmidt process — remember to use the correct inner product!

Coding 4.10
Implement the method IsometricWillmoreFlow::enforceConstraints(), which removes the forbidden directions from the flow using the orthogonal basis and the procedure outlined above. Try running the isometric Willmore flow (you can switch to this flow in the GUI either by right-clicking to get the contextual menu, or by hitting the ‘i‘ key). Report the maximum stable time step that can be achieved for each of the provided meshes. Does the flow preserve the turning number of each of the input curves? (In other words, did we faithfully capture the Whitney-Graustein theorem in the discrete setting?) Try running the flow with and without constraints (by modifying IsometricWillmoreFlow::enforceConstraints()). What happens if you turn off all the constraints? Is there a strict subset of the constraints that is sufficient to keep the loop closed, or are they all needed?

One thing you might have noticed about this new flow is that, while it still smooths out the curve, it looks very different from the one you implemented in the WillmoreFlow1D class. Why is there a difference? In either case, aren’t we doing gradient descent on the same energy? Well, if you paid close attention, you might already know the answer: yes, the energy stays the same, but the metric we used to define the gradient is different! (And if you really paid close attention, you may even know how to modify the second flow to make it look like the first one — and how to implement it!) Beyond that, there are all sorts of nice ways to improve the algorithm that involve discrete Laplacians and Poisson equations and… you know what? You’ve worked hard enough already. Enjoy the break, and see you next year!

Skeleton code: ddg_hw4_skeleton.zip

## Discrete Exterior Calculus

So far we’ve been exploring exterior calculus purely in the smooth setting. Unfortunately this theory was developed by some old-timers who did not know anything about computers, hence it cannot be used directly by machines that store only a finite amount of information. For instance, if we have a smooth vector field or a smooth 1-form we can’t possibly store the direction of every little “arrow” at each point — there are far too many of them! Instead, we need to keep track of a discrete (or really, finite) number of pieces of information that capture the essential behavior of the objects we’re working with; we call this scheme discrete exterior calculus (or DEC for short). The big secret about DEC is that it’s literally nothing more than the good-old fashioned (continuous) exterior calculus we’ve been learning about, except that we integrate differential forms over elements of our mesh.

Discrete Differential Forms

One way to encode a 1-form might be to store a finite collection of “arrows” associated with some subset of points. Instead, we’re going to do something a bit different: we’re going to integrate our 1-form over each edge of a mesh, and store the resulting numbers (remember that the integral of an $n$-form always spits out a single number) on the corresponding edges. In other words, if $\alpha$ is a 1-form and $e$ is an edge, then we’ll associate the number

$\hat{\alpha}_e := \int_e \alpha$

with $e$, where the use of the hat ($\ \hat{}\$) is supposed to suggest a discrete quantity (not to be confused with a unit-length vector).

Does this procedure seem a bit abstract to you? It shouldn’t! Think about what this integral represents: it tells us how strongly the 1-form $\alpha$ “flows along” the edge $e$ on average. More specifically, remember how integration of a 1-form works: at each point along the edge we take the vector tangent to the edge, stick it into the 1-form $\alpha$, and sum up the resulting values — each value tells us something about how well $\alpha$ “lines up” with the direction of the edge. For instance, we could approximate the integral via the sum

$\int_e \alpha \approx |e|\left(\frac{1}{N} \sum_{i=1}^N \alpha_{p_i}(\hat{e})\right),$

where $|e|$ denotes the length of the edge, $\{p_i\}$ is a sequence of points along the edge, and $\hat{e} := e/|e|$ is a unit vector tangent to the edge:

Of course, this quantity tells us absolutely nothing about the strength of the “flow” orthogonal to the edge: it could be zero, it could be enormous! We don’t really know, because we didn’t take any measurements along the orthogonal direction. However, the hope is that some of this information will still be captured by nearby edges (which are most likely not parallel to $e$).

More generally, a $k$-form that has been integrated over each $k$-dimensional cell (edges in 1D, faces in 2D, etc.) is called a discrete differential $k$-form. (If you ever find the distinction confusing, you might find it helpful to substitute the word “integrated” for the word “discrete.”) In practice, however, not every discrete differential form has to originate from a continuous one — for instance, a bunch of arbitrary values assigned to each edge of a mesh is a perfectly good discrete 1-form.

Orientation

One thing you may have noticed in all of our illustrations so far is that each edge is marked with a little arrow. Why? Well, one thing to remember is that direction matters when you integrate. For instance, the fundamental theorem of calculus (and common sense) tells us that the total change as you go from $a$ to $b$ is the opposite of the total change as you go from $b$ to $a$:

$\int_a^b \frac{\partial\phi}{\partial x} dx = \phi(b)-\phi(a) = -(\phi(a)-\phi(b)) = -\int_b^a \frac{\partial\phi}{\partial x} dx.$

Said in a much less fancy way: the elevation gain as you go from Pasadena to Altadena is 151 meters, so of the elevation “gain” in the other direction must be -151 meters! Just keeping track of the number 151 does you little good — you have to say what that quantity represents.

Therefore, when we store a discrete differential form it’s not enough to just store a number: we also have to specify a canonical orientation for each element of our mesh, corresponding to the orientation we used during integration. For an edge we’ve already seen that we can think about orientation as a little arrow pointing from one vertex to another — we could also just think of an edge as an ordered pair $(i,j)$, meaning that we always integrate from $i$ to $j$.

More generally, suppose that each element of our mesh is an oriented $k$-simplex $\sigma$, i.e., a collection of $k+1$ vertices $p_i \in \mathbb{R}^n$ given in some fixed order $(p_1, \ldots, p_{k+1})$. The geometry associated with $\sigma$ is the convex combination of these points:

$\left\{ \sum_{i=1}^{k+1} \lambda_i p_i \left| \sum_{i=1}^{k+1} \lambda_i = 1 \right. \right\} \subset \mathbb{R}^n$

(Convince yourself that a 0-simplex is a vertex, a 1-simplex is an edge, a 2-simplex is a triangle, and a 3-simplex is a tetrahedron.)

Two oriented $k$-simplices have the same orientation if and only if the vertices of one are an even permutation of the vertices of another. For instance, the triangles $(p_1, p_2, p_3)$ and $(p_2, p_3, p_1)$ have the same orientation; $(p_1, p_2, p_3)$ and $(p_2, p_1, p_3)$ have opposite orientation.

If a simplex $\sigma_1$ is a (not necessarily proper) subset of another simplex $\sigma_2$, then we say that $\sigma_1$ is a face of $\sigma_2$. For instance, every vertex, edge, and triangle of a tetrahedron $\sigma$ is a face of $\sigma$; as is $\sigma$ itself! Moreover, the orientation of a simplex agrees with the orientation of one of its faces as long as we see an even permutation on the shared vertices. For instance, the orientations of the edge $(p_2,p_1)$ and the triangle $(p_1,p_3,p_2)$ agree. Geometrically all we’re saying is that the two “point” in the same direction (as depicted above). To keep yourself sane while working with meshes, the most important thing is to pick and orientation and stick with it!

So in general, how do we integrate a $k$-form over an oriented $k$-simplex? Remember that a $k$-form is going to “eat” $k$ vectors at each point and spit out a number — a good canonical choice is to take the ordered collection of edge vectors $(p_2 – p_1, \ldots, p_{k+1}-p_1)$ and orthogonalize them (using, say the Gram-Schmidt algorithm) to get vectors $(u_1, \ldots, u_n)$. This way the sign of the integrand changes whenever the orientation changes. Numerically, we can then approximate the integral via a sum

$\int_\sigma \alpha \approx \frac{|\sigma|}{N} \sum_{i=1}^N \alpha_{p_i}(u_1, \ldots, u_n)$

where $\{p_i\}$ is a (usually carefully-chosen) collection of sample points. (Can you see why the orientation of $\sigma$ affects the sign of the integrand?) Sounds like a lot of work, but in practice one rarely constructs discrete differential forms via integration: more often, discrete forms are constructed via input data that is already discrete (e.g., vertex positions in a triangle mesh).

By the way, what’s a discrete 0-form? Give up? Well, it must be a 0-form (i.e., a function) that’s been integrated over every 0-simplex (i.e., vertex) of a mesh:

$\hat{\phi}_i = \int_{v_i} \phi$

By convention, the integral of a function over a zero-dimensional set is simply the value of the function at that point: $\hat{\phi}_i = \phi(v_i)$. In other words, in the case of 0-forms there is no difference between storing point samples and storing integrated quantities: the two coincide.

It’s also important to remember that differential forms don’t have to be real-valued. For instance, we can think of a map $f: M \rightarrow \mathbb{R}^3$ that encodes the geometry of a surface as an $\mathbb{R}^3$-valued 0-form; its differential $df$ is then an $\mathbb{R}^3$-valued 1-form, etc. Likewise, when we say that a discrete differential form is a number stored on every mesh element, the word “number” is used in a fairly loose sense: a number could be a real value, a vector, a complex number, a quaternion, etc. For instance, the collection of $(x,y,z)$ vertex coordinates of a mesh can be viewed as an $\mathbb{R}^3$-valued discrete 0-form (namely, one that discretizes the map $f$). The only requirement, of course, is that we store the same type of number on each mesh element.

The Discrete Exterior Derivative

One of the main advantages of working with integrated (i.e., “discrete”) differential forms instead of point samples is that we can easily take advantage of Stokes’ theorem. Remember that Stokes’ theorem says

$\int_\Omega d\alpha = \int_{\partial\Omega} \alpha,$

for any $k$-form $\alpha$ and $k+1$-dimensional domain $\Omega$. In other words, we can integrate the derivative of a differential form as long as we know its integral along the boundary. But that’s exactly the kind of information encoded by a discrete differential form! For instance, if $\hat{\alpha}$ is a discrete 1-form stored on the three edges of a triangle $\sigma$, then we have

$\int_\sigma d\alpha = \int_{\partial\sigma} \alpha = \sum_{i=1}^3 \int_{e_i} \alpha = \sum_{i=1}^3 \hat{\alpha}_i.$

In other words, we can exactly evaluate the integral on the left by just adding up three numbers. Pretty cool! In fact, the thing on the left is also a discrete differential form: it’s the 2-form $d\alpha$ integrated over the only triangle in our mesh. So for convenience, we’ll call this guy “$\hat{d}\hat{\alpha}$”, and we’ll call the operation $\hat{d}$ the discrete exterior derivative. (In the future we will drop the hats from our notation when the meaning is clear from context.) In other words, the discrete exterior derivative takes a $k$-form that has already been integrated over each $k$-simplex and applies Stokes’ theorem to get the integral of the derivative over each $k+1$-simplex.

In practice (i.e., in code) you can see how this operation might be implemented by simply taking local sums over the appropriate mesh elements. However, in the example above we made life particularly easy on ourselves by giving each edge an orientation that agrees with the orientation of the triangle. Unfortunately assigning a consistent orientation to every simplex is not always possible, and in general we need to be more careful about sign when adding up our piecewise integrals. For instance, in the example below we’d have

$(\hat{d}\hat{\alpha})_1 = \hat{\alpha}_1 + \hat{\alpha}_2 + \hat{\alpha}_3$

and

$(\hat{d}\hat{\alpha})_2 = \hat{\alpha}_4 + \hat{\alpha}_5 - \hat{\alpha}_2.$

Discrete Hodge Star

As hinted at above, a discrete $k$-form captures the behavior of a continuous $k$-form along $k$ directions, but not along the remaining $n-k$ directions — for instance, a discrete 1-form in 2D captures the flow along edges but not in the orthogonal direction. If you paid attention to our discussion of Hodge duality, this story starts to sound familiar! To capture Hodge duality in the discrete setting, we’ll need to define a dual mesh. In general, the dual of an $n$-dimensional simplicial mesh identifies every $k$-simplex in the primal (i.e., original) mesh with a unique $(n-k)$-cell in the dual mesh. In a two-dimensional simplicial mesh, for instance, primal vertices are identified with dual faces, primal edges are identified with dual edges, and primal faces are identified with dual vertices. Note, however, that the dual cells are not always simplices! (See above.)

So how do we talk about Hodge duality in discrete exterior calculus? Quite simply, the discrete Hodge dual of a (discrete) $k$-form on the primal mesh is an $(n-k)$-form on the dual mesh. Similarly, the Hodge dual of an $k$-form on the dual mesh is a $k$-form on the primal mesh. Discrete forms on the primal mesh are called primal forms and discrete forms on the dual mesh are called dual forms. Given a discrete form $\hat{\alpha}$ (whether primal or dual), its Hodge dual is typically written as $\hat{\star} \hat{\alpha}$.

Unlike continuous forms, discrete primal and dual forms live in different places (so for instance, discrete primal $k$-forms and dual $k$-forms cannot be added to each other). In fact, primal and dual forms often have different physical interpretations. For instance, a primal 1-form might represent the total circulation along edges of the primal mesh, whereas in the same context a dual 1-form might represent the total flux through the corresponding dual edges (see illustration above).

Of course, these two quantities (flux and circulation) are closely related, and naturally leads into one definition for a discrete Hodge star called the diagonal Hodge star. Consider a primal $k$-form $\alpha$. If $\hat{\alpha}_i$ is the value of $\hat{\alpha}$ on the $k$-simplex $\sigma_i$, then the diagonal Hodge star is defined by

$\hat{\star} \hat{\alpha}_i = \frac{|\sigma_i^\star|}{|\sigma_i|} \hat{\alpha}_i$

for all $i$, where $|\sigma|$ indicates the (unsigned) volume of $\sigma$ (which by convention equals one for a vertex!) and $|\sigma^\star|$ is the volume of the corresponding dual cell. In other words, to compute the dual form we simply multiply the scalar value stored on each cell by the ratio of corresponding dual and primal volumes.

If we remember that a discrete form can be thought of as a continuous form integrated over each cell, this definition for the Hodge star makes perfect sense: the primal and dual quantities should have the same density, but we need to account for the fact that they are integrated over cells of different volume. We therefore normalize by a ratio of volumes when mapping between primal and dual. This particular Hodge star is called diagonal since the $i$th element of the dual differential form depends only on the $i$th element of the primal differential form. It’s not hard to see, then, that Hodge star taking dual forms to primal forms (the dual Hodge star) is the inverse of the one that takes primal to dual (the primal Hodge star).

That’s All, Folks!

Hey, wait a minute, what about our other operations, like the wedge product ($\wedge$)? These operations can certainly be defined in the discrete setting, but we won’t go into detail here — the basic recipe is to integrate, integrate, integrate. Actually, even in continuous exterior calculus we omitted a couple operations like the Lie derivative ($\mathcal{L}_X$) and the interior product ($i_\alpha$). Coming up with a complete discrete calculus where the whole cast of characters $d$, $\wedge$, $\star$, $\mathcal{L}_X$, $i_\alpha$, etc., plays well together is an active and ongoing area of research, which may be of interest to aspiring young researchers like you (yes, you)!

## Homework 3: The Discrete Laplacian

In the course notes we mentioned that the Laplace-Beltrami operator (more commonly known as just the Laplacian) plays a fundamental role in a variety of geometric and physical equations. In this homework we’ll put the Laplacian to work by coming up with a discrete version for triangulated surfaces. Similar to the homework on vertex normals, we’ll see that the same discrete expression for the Laplacian (via the cotan formula) arises from two very different ways of looking at the problem: using test functions (often known as Galerkin projection), or by integrating differential forms (often called discrete exterior calculus).

Poisson Equations

Before we start talking about discretization, let’s establish a few basic facts about the Laplace operator $\Delta$ and the standard Poisson problem

$\Delta \phi = \rho.$

Poisson equations show up all over the place — for instance, in physics $\rho$ might represent a mass density in which case the solution $\phi$ would (up to suitable constants) give the corresponding gravitational potential. Similarly, if $\rho$ describes an charge density then $\phi$ gives the corresponding electric potential (you’ll get to play around with these equations in the code portion of this assignment). In geometry processing a surprising number of things can be done by solving a Poisson equation (e.g., smoothing a surface, computing a vector field with prescribed singularities, or even computing the geodesic distance on a surface).

Often we’ll be interested in solving Poisson equations on a compact surface $M$ without boundary.

Exercise 3.1: A twice-differentiable function $\phi: M \rightarrow \mathbb{R}$ is called harmonic if it sits in the kernel of the Laplacian, i.e., $\Delta \phi = 0$. Argue that the only harmonic functions on a compact domain without boundary are the constant functions. (This argument does not have to be incredibly formal — there are a just couple simple observations that capture the main idea.)

This fact is quite important because it means we can add a constant to any solution to a Poisson equation. In other words, if $\phi$ satisfies $\Delta \phi = \rho$, then so does $\phi+c$ since $\Delta(\phi+c) = \Delta\phi + \Delta c = \Delta\phi + 0 = \rho$.

Exercise 3.2: A separate fact is that on a compact domain without boundary, constant functions are not in the image of $\Delta$. In other words, there is no function $\phi$ such that $\Delta \phi = c$. Why?

This fact is also important because it tells us when a given Poisson equation admits a solution. In particular, if $\rho$ has a constant component then the problem is not well-posed. In some situations, however, it may make sense to simply remove the constant component. I.e., instead of trying to solve $\Delta \phi = \rho$ one can solve $\Delta \phi = \rho – \bar{\rho}$, where $\bar{\rho} := \int_M \rho\ dV / |M|$ and $|M|$ is the total volume of $M$. However, you must be certain that this trick makes sense in the context of your particular problem!

When working with PDEs like the Poisson equation, it’s often useful to have an inner product between functions. An extremely common inner product is the $L^2$ inner product $\langle \cdot, \cdot \rangle$, which takes the integral of the pointwise product of two functions over the entire domain $\Omega$:

$\langle f, g \rangle := \int_\Omega f(x) g(x) dx.$

In spirit, this operation is similar to the usual dot product on $\mathbb{R}^n$: it measures the degree to which two functions “line up.” For instance, the top two functions have a large inner product; the bottom two have a smaller inner product (as indicated by the dark blue regions):

Similarly, for two vector fields $X$ and $Y$ we can define an $L^2$ inner product

$\langle X, Y \rangle := \int_\Omega X(x) \cdot Y(x) dx$

which measures how much the two fields “line up” at each point.

Using the $L^2$ inner product we can express an important relationship known as Green’s first identity. Green’s identity says that for any sufficiently differentiable functions $f$ and $g$

$\langle \Delta f, g \rangle = -\langle \nabla f, \nabla g \rangle + \langle N \cdot \nabla f, g \rangle_\partial,$

where $\langle \cdot, \cdot \rangle_\partial$ denotes the inner product on the boundary and $N$ is the outward unit normal.

Exercise 3.3: Using exterior calculus, show that Green’s identity holds. Hint: apply Stokes’ theorem to the 1-form $g df$.

One last key fact about the Laplacian is that it is positive-semidefinite, i.e., $\Delta$ satisfies

$\langle \Delta \phi, \phi \rangle \geq 0$

for all functions $\phi$. (By the way, why isn’t this quantity strictly greater than zero?) Words cannot express the importance of positive-(semi)definiteness. Let’s think about a very simple example: functions of the form $\phi(x,y) = ax^2 + bxy + cy^2$ in the plane. Any such function can also be expressed in matrix form:

$\phi(x,y) = \underbrace{\left[ \begin{array}{cc} x & y \end{array} \right]}_{\mathbf{x}^T} \underbrace{\left[ \begin{array}{cc} a & b/2 \\ b/2 & c \end{array} \right]}_{A} \underbrace{\left[ \begin{array}{c} x \\ y \end{array} \right]}_{\mathbf{x}} = ax^2 + bxy + cy^2,$

and we can likewise define positive-semidefiniteness for $A$. But what does it actually look like? It looks like this — positive-definite matrices ($\mathbf{x}^T A \mathbf{x} > 0$) look like a bowl, positive-semidefinite matrices ($\mathbf{x}^T A \mathbf{x} \geq 0$) look like a half-cylinder, and indefinite matrices ($\mathbf{x}^T A \mathbf{x}$ might be positive or negative depending on $\mathbf{x}$) look like a saddle:

Now suppose you’re a back country skiier riding down this kind of terrain in the middle of a howling blizzard. You’re cold and exhausted, and you know you parked your truck in a place where the ground levels out, but where exactly is it? The snow is blowing hard and visibility is low — all you can do is keep your fingers crossed and follow the slope of the mountain as you make your descent. (Trust me: this is really how one feels when doing numerical optimization!) If you were smart and went skiing in Pos Def Valley then you can just keep heading down and will soon arrive safely back at the truck. But maybe you were feeling a bit more adventurous that day and took a trip to Semi Def Valley. In that case you’ll still get to the bottom, but may have to hike back and forth along the length of the valley before you find your car. Finally, if your motto is “safety second” then you threw caution to the wind and took a wild ride in Indef Valley. In this case you may never make it home!

In short: positive-definite matrices are nice because it’s easy to find the minimum of the quadratic functions they describe — many tools in numerical linear algebra are based on this idea. Same goes for positive definite linear operators like the Laplacian, which can often be thought of as sort of infinite-dimensional matrices (if you took some time to read about the spectral theorem, you’ll know that this analogy runs even deeper, especially on compact domains). Given the ubiquity of Poisson equations in geometry and physics, it’s a damn good thing $\Delta$ is positive-semidefinite!

Exercise 3.4: Using Green’s first identity, show that $\Delta$ is negative-semidefinite on any compact surface $M$ without boundary. From a practical perspective, why are negative semi-definite operators just as good as positive semi-definite ones?

Test Functions

The solution to a geometric or physical problem is often described by a function: the temperature at each point on the Earth, the curvature at each point on a surface, the amount of light hitting each point of your retina, etc. Yet the space of all possible functions is mind-bogglingly large — too large to be represented on a computer. The basic idea behind the finite element method is to pick a smaller set of functions and try to find the best possible solution from within this set. More specifically, if $u$ is the true solution to a problem and $\{\phi_i\}$ is a collection of basis functions, then we seek the linear combination of these functions

$\tilde{u} = \sum_i x_i \phi_i,\ x_i \in \mathbb{R}$

such that the difference $||\tilde{u}-u||$ is as small as possible with respect to some norm. (Above we see a detailed curve $u$ and its best approximation $\tilde{u}$ by a collection of bump-like basis functions $\phi_i$.)

Let’s start out with a very simple question: suppose we have a vector $v \in \mathbb{R}^3$, and want to find the best approximation $\tilde{v}$ within a plane spanned by two basis vectors $e_1, e_2 \in \mathbb{R}^3$:

Since $\tilde{v}$ is forced to stay in the plane, the best we can do is make sure there’s error only in the normal direction. In other words, we want the error $\tilde{v} – v$ to be orthogonal to both basis vectors $e_1$ and $e_2$:

$\begin{array}{rcl} (\tilde{v} - v) \cdot e_1 &=& 0, \\ (\tilde{v} - v) \cdot e_2 &=& 0. \\ \end{array}$

In this case we get a system of two linear equations for two unknowns, and can easily compute the optimal vector $\tilde{v}$.

Now a harder question: suppose we want to solve a standard Poisson problem

$\Delta u = f.$

How do we check whether a given function $\tilde{u}$ is the best possible solution? The basic picture still applies, except that our bases are now functions $\phi$ instead of finite-dimensional vectors $e_i$, and the simple vector dot product $\cdot$ gets replaced by the $L^2$ inner product. Unfortunately, when trying to solve a Poisson equation we don’t know what the correct solution $u$ looks like (otherwise we’d be done already!). So instead of the error $\tilde{u} – u$, we’ll have to look at the residual $\Delta \tilde{u} – f$, which measures how closely $\tilde{u}$ satisfies our original equation. In particular, we want to “test” that the residual vanishes along each basis direction $\phi_j$:

$\langle \Delta \tilde{u} - f, \phi_j \rangle = 0,$

again resulting in a system of linear equations. This condition ensures that the solution behaves just as the true solution would over a large collection of possible “measurements.”

Next, let’s work out the details of this system for a triangulated surface. The most natural choice of basis functions are the piecewise linear hat functions $\phi_i$, which equal one at their associated vertex and zero at all other vertices:

At this point you might object: if all our functions are piecewise linear, and $\Delta$ is a second derivative, aren’t we just going to get zero every time we evaluate $\Delta u$? Fortunately we’re saved by Green’s identity — let’s see what happens if we apply this identity to our triangle mesh, by breaking up the integral into a sum over individual triangles $\sigma$:

$\begin{array}{rcl} \langle \Delta u, \phi_j \rangle &=& \sum_k \langle \Delta u, \phi_j \rangle_{\sigma_k} \\ &=& \sum_k \langle \nabla u, \nabla \phi_j \rangle_{\sigma_k} + \sum_k \langle N \cdot \nabla u, \phi_j \rangle_{\partial\sigma_k}. \\ \end{array}$

If the mesh has no boundary then this final sum will disappear since the normals of adjacent triangles are oppositely oriented, hence the boundary integrals along shared edges cancel each-other out:

In this case, we’re left with simply

$\langle \nabla u, \nabla \phi_j \rangle$

in each triangle $\sigma_k$. In other words, we can “test” $\Delta u$ as long as we can compute the gradients of both the candidate solution $u$ and each basis function $\phi_j$. But remember that $u$ is itself a linear combination of the bases $\phi_i$, so we have

$\langle \nabla u, \nabla \phi_j \rangle = \left\langle \nabla \sum_i x_i \phi_i, \nabla \phi_j \right\rangle = \sum_i x_i \langle \nabla \phi_i, \nabla \phi_j \rangle.$

The critical thing now becomes the inner product between the gradients of the basis functions in each triangle. If we can compute these, then we can simply build the matrix

$A_{ij} := \langle \nabla \phi_i, \nabla \phi_j \rangle$

and solve the problem

$A x = b$

for the coefficients $x$, where the entries on the right-hand side are given by $b_i = \langle f, \phi_i \rangle$ (i.e., we just take the same “measurements” on the right-hand side).

Exercise 3.5: Show that the aspect ratio of a triangle can be expressed as the sum of the cotangents of the interior angles at its base, i.e.,

$\frac{w}{h} = \cot{\alpha} + \cot{\beta}.$

Exercise 3.6: Let $e$ be the edge vector along the base of a triangle. Show that the gradient of the hat function $\phi$ on the opposite vertex is given by

$\nabla \phi = \frac{e^\perp}{2A},$

where $e^\perp$ is the vector $e$ rotated by a quarter turn in the counter-clockwise direction and $A$ is the area of the triangle.

Exercise 3.7: Show that for any hat function $\phi$ associated with a given vertex

$\langle \nabla \phi, \nabla \phi \rangle = \frac{1}{2}( \cot\alpha + \cot\beta )$

within a given triangle, where $\alpha$ and $\beta$ are the interior angles at the remaining two vertices.

Exercise 3.8: Show that for the hat functions $\phi_i$ and $\phi_j$ associated with vertices $i$ and $j$ (respectively) of the same triangle, we have

$\langle \nabla \phi_i, \nabla \phi_j \rangle = -\frac{1}{2} \cot \theta,$

where $\theta$ is the angle between the opposing edge vectors.

Putting all these facts together, we find that we can express the Laplacian of $u$ at vertex $i$ via the infamous cotan formula

$(\Delta u)_i = \frac{1}{2} \sum_j (\cot \alpha_j + \cot \beta_j )( u_j - u_i ),$

where we sum over the immediate neighbors of vertex $i$.

Differential Forms

The “Galerkin” approach taken above reflects a fairly standard way to discretize partial differential equations. But let’s try a different approach, based on exterior calculus. Again we want to solve the Poisson equation $\Delta u = f$, which (if you remember our discussion of differential operators) can also be expressed as

$\star d \star d u = f.$

We already outlined how to discretize this kind of expression in the notes on discrete exterior calculus, but let’s walk through it step by step. We start out with a 0-form $u$, which is specified as a number $u_i$ at each vertex:

Next, we compute the discrete exterior derivative $d u$, which just means that we want to integrate the derivative along each edge:

$(du)_{ij} = \int_{e_{ij}}\!du = \int_{\partial e_{ij}}\!u = u_j - u_i.$

(Note that the boundary $\partial e_{ij}$ of the edge is simply its two endpoints $v_i$ and $v_j$.) The Hodge star converts a circulation along the edge $e_{ij}$ into the flux through the corresponding dual edge $e^\star_{ij}$. In particular, we take the total circulation along the primal edge, divide it by the edge length to get the average pointwise circulation, then multiply by the dual edge length to get the total flux through the dual edge:

$(\star du)_{ij} = \frac{|e^\star_{ij}|}{e_{ij}}(u_j - u_i).$

Here $|e_{ij}|$ and $e^\star_{ij}$ denote the length of the primal and dual edges, respectively. Next, we take the derivative of $\star d u$ and integrate over the whole dual cell:

$(d \star d u)_i = \int_{C_i} d \star d u = \int_{\partial C_i} \star d u = \sum_j \frac{|e^\star_{ij}|}{|e_{ij}|}( u_j - u_i ).$

The final Hodge star simply divides this quantity by the area of $C_i$ to get the average value over the cell, and we end up with a system of linear equations

$(\star d \star d u)_i = \frac{1}{|C_i|} \sum_j \frac{|e^\star_{ij}|}{|e_{ij}|}( u_j - u_i ) = f_i$

where $f_i$ is the value of the right-hand side at vertex $i$. In practice, however, it’s often preferable to move the area factor $|C_i|$ to the right hand side, since the resulting system

$(\star d \star d u)_i = \sum_j \frac{|e^\star_{ij}|}{|e_{ij}|}( u_j - u_i ) = |C_i| f_i$

can be represented by a symmetric matrix. (Symmetric matrices are often easier to deal with numerically and lead to more efficient algorithms.) Another way of looking at this transformation is to imagine that we discretized the system

$d \star d u = \star f.$

In other words, we converted an equation in terms of 0-forms into an equation in terms of $n$-forms. When working with surfaces, the operator $d \star d$ is sometimes referred to as the conformal Laplacian, because it does not change when we subject our surface to a conformal transformation. Alternatively, we can think of $d \star d$ as simply an operator that gives us the value of the Laplacian integrated over each dual cell of the mesh (instead of the pointwise value).

Exercise 3.9: Consider a simplicial surface and suppose we place the vertices of the dual mesh at the circumcenters of the triangles (i.e., the center of the unique circle containing all three vertices):

Demonstrate that the dual edge $e^\star$ (i.e., the line between the two circumcenters) bisects the primal edge orthogonally, and use this fact to show that

$\frac{|e^\star_{ij}|}{|e_{ij}|} = \frac{1}{2}(\cot \alpha_j + \cot \beta_j).$

Hence the DEC discretization yields precisely the same “cotan-Laplace” operator as the Galerkin discretization.

Meshes and Matrices

So far we’ve been giving a sort of “algorithmic” description of operators like Laplace. For instance, we determined that the Laplacian of a scalar function $u$ at a vertex $i$ can be approximated as

$(\Delta u)_i = \frac{1}{2} \sum_j (\cot\alpha_j+\cot\beta_j)(u_j - u_i),$

where the sum is taken over the immediate neighbors $j$ of $i$. In code, this sum could easily be expressed as a loop and evaluated at any vertex. However, a key aspect of working with discrete differential operators is building their matrix representation. The motivation for encoding an operator as a matrix is so that we can solve systems like

$\Delta u = f$

using a standard numerical linear algebra package. (To make matters even more complicated, some linear solvers are perfectly happy to work with algorithmic representations of operators called callback functions — in general, however, we’ll need a matrix.)

In the case of the Poisson equation, we want to construct a matrix $L \in \mathbb{R}^{|V| \times |V|}$ (where $|V|$ is the number of mesh vertices) such that for any vector $u \in \mathbb{R}^{|V|}$ of values at vertices, the expression $Lu$ effectively evaluates the formula above. But let’s start with something simpler — consider an operator $B$ that computes the sum of all neighboring values:

$(Bu)_i = \sum_j u_j$

How do we build the matrix representation of this operator? Think of $B$ as a machine that takes a vector $u$ of input values at each vertex and spits out another vector $Bu$ of output values. In order for this story to make sense, we need to know which values correspond to which vertices. In other words, we need to assign a unique index to each vertex of our mesh, in the range $1, \ldots, |V|$:

It doesn’t matter which numbers we assign to which vertices, so long as there’s one number for every vertex and one vertex for every number. This mesh has twelve vertices and vertex 1 is next to vertices 2, 3, 4, 5, and 9. So we could compute the sum of the neighboring values as

$(Bu)_1 = \left[ \begin{array}{cccccccccccc} 0 & 1 & 1 & 1 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \end{array} \right]\left[ \begin{array}{c} u_1 \\ u_2 \\ u_3 \\ u_4 \\ u_5 \\ u_6 \\ u_7 \\ u_8 \\ u_9 \\ u_{10} \\ u_{11} \\ u_{12} \end{array} \right].$

Here we’ve put a “1″ in the $j$th place whenever vertex $j$ is a neighbor of vertex $i$ and a “0″ otherwise. Since this row gives the “output” value at the first vertex, we’ll make it the first row of the matrix $B$. The entire matrix looks like this:

$B = \left[ \begin{array}{cccccccccccc} 0 & 1 & 1 & 1 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 1 & 0 & 1 & 0 & 0 & 1 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 \\ 1 & 1 & 0 & 0 & 1 & 1 & 0 & 1 & 0 & 0 & 0 & 0 \\ 1 & 0 & 1 & 1 & 0 & 0 & 1 & 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 0 & 1 & 0 & 0 & 1 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & 1 \\ 1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 0 \\ 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 1 & 0 & 1 & 1 \\ 0 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 1 & 1 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 & 0 & 1 & 1 & 0 \\ \end{array} \right]$

(You could verify that this matrix is correct, or you could go outside and play in the sunshine. Your choice.) In practice, fortunately, we don’t have to build matrices “by hand” — we can simply start with a matrix of zeros and fill in the nonzero entries by looping over local neighborhoods of our mesh.

Finally, one important thing to notice here is that many of the entries of $B$ are zero. In fact, for most discrete operators the number of zeros far outnumbers the number of nonzeros. For this reason, it’s usually a good idea to use a sparse matrix, i.e., a data structure that stores only the location and value of nonzero entries (rather than explicitly storing every single entry of the matrix). The design of sparse matrix data structures is an interesting question all on its own, but conceptually you can imagine that a sparse matrix is simply a list of triples $(i,j,x)$ where $i,j \in \mathbb{N}$ specify the row and column index of a nonzero entry and $x \in \mathbb{R}$ gives its value.

Scalar Poisson Equation

In the first part of the coding assignment you’ll build the cotan-Laplace operator and use it to solve the scalar Poisson equation

$\Delta \phi = \rho$

on a triangle mesh, where $\rho$ can be thought of as a (mass or charge) density and $\phi$ can be thought of as a (gravitational or electric) potential. Once you’ve implemented the methods below, you can visualize the results via the libDDG GUI. (If you want to play with the density function $\rho$, take a look at the method Viewer::updatePotential.)

Coding 3.1: Implement the method Mesh::indexVertices() which assigns a unique ID to each vertex in the range $0, \ldots, |V|-1$.

Coding 3.2: Derive an expression for the cotangent of a given angle purely in terms of the two incident edge vectors and the standard Euclidean dot product $(\cdot)$ and cross product $(\times)$. Implement the method HalfEdge::cotan(), which computes the cotangent of the angle across from a given halfedge.

Coding 3.3: Implement the methods Face::area() and Vertex::dualArea(). For the dual area of a vertex you can simply use one-third the area of the incident faces — you do not need to compute the area of the circumcentric dual cell. (This choice of area will not affect the order of convergence.)

Coding 3.4: Using the methods you’ve written so far, implement the method Mesh::buildLaplacian() which builds a sparse matrix representing the cotan-Laplace operator. (Remember to initialize the matrix to the correct size!)

Coding 3.5: Implement the method Mesh::solveScalarPoissonProblem() which solves the problem $\Delta\phi = \rho$ where $\rho$ is a scalar density on vertices (stored in Vertex::rho). You can use the method solve from SparseMatrix.h; $\rho$ and $\phi$ should be represented by instances of DenseMatrix of the appropriate size. Be careful about appropriately incorporating dual areas into your computations; also remember that the right-hand side cannot have a constant component!

You should verify that your code produces results that look something like these two images (density on the left; corresponding potential on the right):

(Take a look at the new menu items in the GUI.)

Implicit Mean Curvature Flow

Next, you’ll use nearly identical code to smooth out geometric detail on a surface mesh (also known as fairing or curvature flow). The basic idea is captured by the heat equation, which describes the way heat diffuses over a domain. For instance, if $u$ is a scalar function describing the temperature at every point on the real line, then the heat equation is given by

$\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2}.$

Geometrically this equation simply says that concave bumps get pushed down and convex bumps get pushed up — after a long time the heat distribution becomes completely flat. We also could have written this equation using the Laplacian: $\frac{\partial u}{\partial t} = \Delta u$. In fact, this equation is exactly the one we’ll use to smooth out a surface, except that instead of considering the evolution of temperature, we consider the flow of the surface $f: M \rightarrow \mathbb{R}^3$ itself:

$\frac{\partial f}{\partial t} = \Delta f.$

Remember from our discussion of vertex normals that $\Delta f = 2HN,$ i.e., the Laplacian of position yields (twice) the mean curvature times the unit normal. Therefore, the equation above reads, “move the surface in the direction of the normal, with strength proportional to mean curvature.” In other words, it describes a mean curvature flow.

So how do we compute this flow? We already know how to discretize the term $\Delta f$ — just use the cotangent discretization of Laplace. But what about the time derivative $\frac{\partial f}{\partial t}$? There are all sorts of interesting things to say about discretizing time, but for now let’s use a very simple idea: the change over time can be approximated by the difference of two consecutive states:

$\frac{\partial f}{\partial t} \approx \frac{f_h - f_0}{h},$

where $f_0$ is the initial state of our system (here the initial configuration of our mesh) and $f_h$ is the configuration after a mean curvature flow of some duration $h > 0$. Our discrete mean curvature flow then becomes

$\frac{f_h - f_0}{h} = \Delta f.$

The only remaining question is: which values of $f$ do we use on the right-hand side? One idea is to use $f_0$, which results in the system

$f_h = f_0 + h\Delta f_0.$

This scheme, called forward Euler, is attractive because it can be evaluated directly using the known data $f_0$ — we don’t have to solve a linear system. Unfortunately, forward Euler is not numerically stable, which means we can take only very small time steps $h$. One attractive alternative is to use $f_h$ as the argument to $\Delta$, leading to the system

$\underbrace{(I - h \Delta)}_A f_h = f_0,$

where $I$ is the identity matrix (try the derivation yourself!) This scheme, called backward Euler, is far more stable, though we now have to solve a linear system $A f_h = f_0$. Fortunately this system is highly sparse, which means it’s not too expensive to solve in practice. (Note that this system is not much different from the Poisson system.)

Coding 3.6: Implement the method Mesh::buildFlowOperator(), which should look very similar to Mesh::buildLaplacian.

Coding 3.7: Implement the method Mesh::computeImplicitMeanCurvatureFlow(). Note that you can treat each of the components of $f$ ($x$, $y$, and $z$) as separate scalar quantities.

You should verify that your code produces results that look something like these two images (original mesh on the left; smoothed mesh on the right):

Skeleton Code and Handin

Skeleton code for this assignment can be found here:

ddg_poisson.zip

(You may want to use your Makefile from the last project!) You can easily locate all the methods that need to be implemented by searching for the string “TODO.” Please submit the entire project this time as a single zip file (i.e., not just the files you modify).

## A Quick and Dirty Introduction to Exterior Calculus — Part V: Integration and Stokes’ Theorem

In the last set of notes we talked about how to differentiate $k$-forms using the exterior derivative $d$. We’d also like some way to integrate forms. Actually, there’s surprisingly little to say about integration given the setup we already have. Suppose we want to compute the total area $A_\Omega$ of a region $\Omega$ in the plane:

If you remember back to calculus class, the basic idea was to break up the domain into a bunch of little pieces that are easy to measure (like squares) and add up their areas:

$A_\Omega \approx \sum_i A_i.$

As these squares get smaller and smaller we get a better and better approximation, ultimately achieving the true area

$A_\Omega = \int_\Omega dA.$

Alternatively, we could write the individual areas using differential forms — in particular, $A_i = dx^1 \wedge dx^2(u,v)$. Therefore, the area element $dA$ is really nothing more than the standard volume form $dx^1 \wedge dx^2$ on $\mathbb{R}^2$. (Not too surprising, since the whole point of $k$-forms was to measure volume!)

To make things more interesting, let’s say that the contribution of each little square is weighted by some scalar function $\phi$. In this case we get the quantity

$\int_\Omega \phi\ dA = \int_\Omega \phi\ dx^1 \wedge dx^2.$

Again the integrand $\phi\ dx^1 \wedge dx^2$ can be thought of as a 2-form. In other words, you’ve been working with differential forms your whole life, even if you didn’t realize it! More generally, integrands on an $n$-dimensional space are always $n$-forms, since we need to “plug in” $n$ orthogonal vectors representing the local volume. For now, however, looking at surfaces (i.e., 2-manifolds) will give us all the intuition we need.

Integration on Surfaces

If you think back to our discussion of the Hodge star, you’ll remember the volume form

$\omega = \sqrt{\mathrm{det}(g)} dx^1 \wedge dx^2,$

which measures the area of little parallelograms on our surface. The factor $\sqrt{\mathrm{det}(g)}$ reminds us that we can’t simply measure the volume in the domain $M$ — we also have to take into account any “stretching” induced by the map $f: M \rightarrow \mathbb{R}^2$. Of course, when we integrate a function on a surface, we should also take this stretching into account. For instance, to integrate a function $\phi: M \rightarrow \mathbb{R}$, we would write

$\int_\Omega \phi \omega = \int_\Omega \phi \sqrt{\mathrm{det}(g)}\ dx^1 \wedge dx^2.$

In the case of a conformal parameterization things become even simpler — since $\sqrt{\mathrm{det}(g)} = a$ we have just

$\int_\Omega \phi a\ dx^1 \wedge dx^2,$

where $a: M \rightarrow \mathbb{R}$ is the scaling factor. In other words, we scale the value of $\phi$ up or down depending on the amount by which the surface locally “inflates” or “deflates.” In fact, this whole story gives a nice geometric interpretation to good old-fashioned integrals: you can imagine that $\int_\Omega \phi\ dA$ represents the area of some suitably deformed version of the initially planar region $\Omega$.

Stokes’ Theorem

The main reason for studying integration on manifolds is to take advantage of the world’s most powerful tool: Stokes’ theorem. Without further ado, Stokes’ theorem says that

$\int_\Omega d\alpha = \int_{\partial\Omega} \alpha,$

where $\alpha$ is any $n-1$-form on an $n$-dimensional domain $\Omega$. In other words, integrating a differential form over the boundary of a manifold is the same as integrating its derivative over the entire domain.

If this trick sounds familiar to you, it’s probably because you’ve seen it time and again in different contexts and under different names: the divergence theorem, Green’s theorem, the fundamental theorem of calculus, Cauchy’s integral formula, etc. Picking apart these special cases will really help us understand the more general meaning of Stokes’ theorem.

Divergence Theorem

Let’s start with the divergence theorem from vector calculus, which says that

$\int_\Omega \nabla \cdot X dA = \int_{\partial\Omega} N \cdot X d\ell,$

where $X$ is a vector field on $\Omega$ and $N$ represents the (outward-pointing) unit normals on the boundary of $\Omega$. A better name for this theorem might have been the “what goes in must come out theorem”, because if you think about $X$ as the flow of water throughout the domain $\Omega$ then it’s clear that the amount of water being pumped into $\Omega$ (via pipes in the ground) must be the same as the amount flowing out of its boundary at any moment in time:

Let’s try writing this theorem using exterior calculus. First, remember that we can write the divergence of $X$ as $\nabla \cdot X = \star d \star X^\flat$. It’s a bit harder to see how to write the right-hand side of the divergence theorem, but think about what integration does here: it takes tangents to the boundary and sticks them into a 1-form. For instance, $\int_\Omega X^\flat$ adds up the tangential components of $X$. To get the normal component we could rotate $X^\flat$ by a quarter turn, which conveniently enough is achieved by hitting it with the Hodge star. Overall we get

$\int_\Omega d \star X^\flat = \int_{\partial\Omega} \star X^\flat,$

which, as promised, is just a special case of Stokes’ theorem. Alternatively, we can use Stokes’ theorem to provide a more geometric interpretation of the divergence operator itself: when integrated over any region $\Omega$ — no matter how small — the divergence operator gives the total flux through the region boundary. In the discrete case we’ll see that this boundary flux interpretation is the only notion of divergence — in other words, there’s no concept of divergence at a single point.

By the way, why does $d \star X^\flat$ appear on the left-hand side instead of $\star d \star X^\flat$? The reason is that $\star d \star X^\flat$ is a 0-form, so we have to hit it with another Hodge star to turn it into an object that measures areas (i.e., a 2-form). Applying this transformation is no different from appending $dA$ to $\nabla \cdot X$ — we’re specifying how volume should be measured on our domain.

Fundamental Theorem of Calculus

The fundamental theorem of calculus is in fact so fundamental that you may not even remember what it is. It basically says that for a real-valued function $\phi: \mathbb{R} \rightarrow \mathbb{R}$ on the real line

$\int_a^b \frac{\partial \phi}{\partial x} dx = \phi(b) - \phi(a).$

In other words, the total change over an interval $[a,b]$ is (as you might expect) how much you end up with minus how much you started with. But soft, behold! All we’ve done is written Stokes’ theorem once again:

$\int_{[a,b]} d\phi = \int_{\partial[a,b]} \phi,$

since the boundary of the interval $[a,b]$ consists only of the two endpoints $a$ and $b$.

Hopefully these two examples give you a good feel for what Stokes’ theorem says. In the end, it reads almost like a Zen kōan: what happens on the outside is purely a function of the change within. (Perhaps it is Stokes’ that deserves the name, “fundamental theorem of calculus!”)

October 31, 2012 | Posted in: Notes | Comments Closed