This article documents my presentation of Gromov’s theorem on groups of polynomial growth at the MIT combinatorics reading group. The presentation is based on Gromov’s 1981 paper, *Groups of polynomial growth and expanding maps*, Kleiner’s 2007 paper, *A new proof of Gromov’s theorem on groups of polynomial growth*, and Tao’s 2009 blog post, *A finitary version of Gromov’s polynomial growth theorem*.

#### Introduction

Throughout, we fix a finitely generated group G and a finite symmetric generating set S (that is \forall x \in S. x^{-1}\in S). For every group x \in G, the word length \lVert x \rVert of x is the shortest length n of a word s_1s_2 \dots s_n in S that expresses x.

Gromov’s theorem connects a group property of G with the growth of the cardinality of the ball B(r) := \left\{x \in G : \lVert x \rVert \le r\right\} of radius r. The ball B(r) can be seen as the set of vertices within distance r from the identity element of G in the Cayley graph of (G, S).

**Definition (nilpotent and virtually nilpotent).** A group H is *nilpotent* of class n if there is a *lower central series* H = H_0 \rhd H_1 \rhd \dots \rhd H_n = \{e\}, where H_{i+1} = [H_i, H]. A group G is *virtually nilpotent* if there is a finite index subgroup H of G that is nilpotent.

**Example 1.** When G is abelian, G is nilpotent and \lvert B(r) \rvert = r^{\mathrm{rank}(G)}.

**Example 2.** When G is the discrete Heisenberg group, that is G = \left\{ \begin{pmatrix} 1 & a & b \\ & 1 & c \\ & & 1 \end{pmatrix} : a, b, c \in \mathbb{Z} \right\}, we have the lower central series G > \langle \begin{pmatrix} 1 & & 1 \\ & 1 & \\ & & 1\end{pmatrix} \rangle > \{e\}, and the growth of \lvert B(r) \rvert is bounded by r^4.

**Theorem (Gromov 1981).** For every group G generated by a finite symmetric set S, \lvert B(r)\rvert is at most polynomial in r if and only if G is virtually nilpotent.

Gromov’s proof uses the deep Montgomery–Zippin–Yamabe structure theory of locally compact groups on Hilbert’s fifth problem. Colding and Minicozzi, solving a conjecture of Yau, showed that the space of harmonic functions with polynomial growth on an open manifold with non-negative Ricci curvature. To state (a weak version of) the discrete analog of their result, we need the notion of Lipschitz harmonic on the group G with the finite symmetric generating set S.

**Definition (Lipschitz and harmonic).** A function f\colon G\to\mathbb{R} is *Lipschitz* if \sup_{g \in G, s \in S}\lvert f(gs) - f(g) \rvert is finite, and is *harmonic* if f(g) = \frac{1}{\lvert S \rvert} \sum_{s \in S} f(gs) for all g \in G.

**Theorem (Colding and Minicozzi 1997, Kleiner 2010).** If \vert B(r)\lvert is at most polynomial in r, then the linear space of Lipschitz harmonic functions on G is finite dimensional.

Kleiner provided a new proof of Gromov’s theorem by establishing the Colding–Minicozzi theorem from scratch. Later Shalom and Tao pushed Kleiner’s methods to obtain the following quantitative version of Gromov’s theorem.

**Theorem (Shalom and Tao 2010).** There exists an absolute constant c such that if \lvert B(r) \rvert \le r^d for some r > \exp(\exp(cd^c)) then G contains a finite index subgraph H which is nilpotent of class \le c^d.

In the rest of the article, we will prove the Colding–Minicozzi theorem through the Poincaré inequality and the reverse Poincaré inequality.

#### Poincaré inequality

**Lemma (Poincaré inequality).** For every function f \colon G \to \mathbb{R}, if f has mean 0 on B(r), the l^2-norm of f on B(r) is bounded by the fluctuation of f on B(2r): \sum_{x\in B(r)}f(x)^2 \le \frac{\lvert B(2r) \rvert}{\lvert B(r) \rvert}\cdot 2r^2 \sum_{x,y\in B(2r), x\sim y}(f(x) - f(y))^2.

*Proof.* One can check that the left hand side is equal to \frac{1}{2\lvert B(r)\rvert}\sum_{x,y\in B(r)}(f(x)-f(y))^2.

For each z \in B(2r), we fix a shortest path e = z_0, z_1, \dots, z_{\lVert z \rVert} = z from e to z in the Cayley graph of (G, S). Given x,y\in B(r), let z = x^{-1}y \in B(2r) and get f(x) - f(y) = \sum_{i=1}^{\lVert z \rVert}f(xz_{i-1})-f(xz_i) \\ \implies (f(x)-f(y))^2 \le \lVert z \rVert \sum_{i=1}^{\lVert z \rVert}(f(xz_{i-1})-f(xz_i))^2.

When summing over the last inequality over all x,y\in B(r), we can regroup the summands by z and i as follows: \sum_{z \in B(2r)}\lVert z\rVert\sum_{i=1}^{\lVert z \rVert} \left(\sum_{x\in B(r) : xz \in B(r)}(f(xz_{i-1})-f(xz_i))^2\right).

Fix z and i for a moment. One can show that both xz_{i-1} and xz_i are in B(2r), and moreover the directed edges (xz_{i-1}, xz_{i}) are distinct when x varies in B(r). Thus \sum_{x\in B(r) : xz \in B(r)}(f(xz_{i-1})-f(xz_i))^2 \le \sum_{x,y\in B(2r), x\sim y}(f(x)-f(y))^2.

We obtain the Poincaré inequality by putting everything together and noticing that \lVert z \rVert \le 2r. [qed]

#### Reverse Poincaré inequality

**Lemma (Reverse Poincaré inequality).** For every harmonic function f\colon G \to \mathbb{R}, the fluctuation of f on B(r) is bounded by the l^2-norm of f on B(2r): \sum_{x,y\in B(r), x\sim y}(f(x) - f(y))^2 \le \lvert S \vert \cdot 4r^{-2}\sum_{x\in B(2r)}f(x)^2.

To facilitate the proof, we introduce the following notations. Given a function f\colon G\to \mathbb{R} and s\in S, write f_s(x) := f(xs) and \partial_s f := f_s - f. It is easy to see that

- \sum_{s\in S}\partial_{s^{-1}}\partial_s f = 0 when f is harmonic, and
- \sum_{x\in G}f(x)\partial_s g(x) = \sum_{x\in G}\partial_{s^{-1}}f(x) g(x) when f or g is finitely supported.

*Proof.* Fix the harmonic function f\colon G\to \mathbb{R} and let \phi\colon G \to [0,1] be defined by \phi(x) = \begin{cases} 1 & \text{if }\lVert x\rVert \le r,\\ 2 - \lVert x\rVert/r & \text{if }r < \lVert x\rVert < 2r, \\ 0 & \text{if }\lVert x\rVert \ge 2r.\end{cases}

For every s \in S, note that \partial_s (f\phi^2) = (\partial_s f)\phi^2 + f_s(\partial_s \phi^2) and \partial_s \phi^2 = (\partial_s \phi)\phi + \phi_s (\partial_s\phi) = (\partial_s \phi)(2\phi + \partial_s \phi). We obtain \begin{aligned}\partial_s f \partial_s (f\phi^2) & = (\partial_s f)^2 \phi^2 + (\partial_s f)f_s(\partial_s \phi)(2\phi + \partial_s\phi) \\ & \ge \tfrac{1}{2}(\partial_s f)^2\phi^2 - 2(f_s)^2(\partial_s \phi)^2 + (\partial_s f)f_s(\partial_s\phi)^2 \\ & = \tfrac{1}{2}(\partial_s f)^2\phi^2 - f_s(f_s + f)(\partial_s\phi)^2 \\ & \ge \tfrac{1}{2}(\partial_s f)^2\phi^2 - \tfrac{1}{2}(3f_s^2 + f^2)(\partial_s\phi)^2. \end{aligned}

When summing the above inequality over all s\in S and x \in G, by noticing that \sum_{s\in S}\sum_{x\in G}\partial_s f(x) \partial_s (f(x)\phi(x)^2) = \sum_{s\in S}\sum_{x\in G}(\partial_{s^{-1}}\partial_s f(x)) f(x)\phi(x)^2 \\ = \sum_{x\in G}\left(\sum_{s\in S}\partial_{s^{-1}}\partial_s f(x)\right) f(x)\phi^2(x) = 0, we get \sum_{x\in G}\sum_{s \in S}(\partial_s f(x))^2\phi(x)^2 \le \sum_{x\in G}\sum_{s \in S}(3f_s(xs)^2+f(x)^2)(\partial_s \phi(x))^2 .

The left hand side of the last inequality is at least \sum_{x,y\in B(r), x\sim y}(f(x)-f(y))^2, whereas the right hand side is at most 4\lvert S\rvert \tfrac{1}{r^2}\sum_{r \le \lVert x\rVert \le 2r}f(x)^2 because (\partial_s\phi)^2 \le 1/r^2 and \partial_s\phi(x)^2 > 0 only if both x and xs are in \{x \in G : r\le \lVert x\rVert \le 2r\}. [qed]

#### Colding–Minicozzi theorem

To simplify the presentation, we will assume the doubling constant \lvert B(2r) \rvert / \lvert B(r) \rvert is uniformly bounded at all scales r, which, for example, is indeed the case when \lvert B(r)\rvert = \Theta(r^d). In general, one needs the pigeonhole principle to select the correct radii for the argument below to work.

*Proof assuming the doubling constant is uniformly bounded.* Suppose for the sake of contradiction, the dimension of the linear space consisting of Lipschitz harmonic functions on G is at least n, where the parameter n will be determined later. Denote by V the n-dimensional linear subspace.

Let k be a natural number to be determined soon, and fix r for a moment. Let \mathcal{A}_r be a maximal collection of disjoint balls of radius r/2 with centers in B(kr), and let \mathcal{B}_r be the collection of balls with the same centers of the balls in \mathcal{A}_r, but of radius r. Let V_r be the linear subspace of V consisting of harmonic functions in V that average to 0 on each ball in \mathcal{B}_r. Note that the co-dimension of V_r as a subspace of V is at most \lvert \mathcal{B}_r \rvert = \lvert \mathcal{A}_r \rvert, which is at most \lvert B(kr+r/2) \rvert / \lvert B(r/2) \rvert = O(1) =: C.

For every harmonic function f in V_r, using the fact that B(kr) \subseteq \cup\mathcal{B}_r, the fact that each point in G is covered by 2B for at most \lvert B(2r+r/2) \rvert / \lvert B(r/2) \rvert = O(1) many B \in \mathcal{B}_r, the Poincaré inequality and the reverse Poincaré inequality, we get \begin{aligned}\sum_{x \in B(kr)}f(x)^2 & \le \sum_{B \in \mathcal{B}(r)}\sum_{x\in B}f(x)^2 \\ & \lesssim r^2\sum_{B \in \mathcal{B}(r)}\sum_{x,y \in 2B, x\sim y}(f(x)-f(y))^2 \\ & \lesssim r^2\sum_{x, y \in B(kr+2r), x\sim y}(f(x)-f(y))^2 \\ & \lesssim \frac{1}{(k+2)^2}\sum_{x \in B(2(k+2)r)}f(x)^2.\end{aligned} Now take k large enough (depending only on the group G) so that for all f\in V_r, 3^d\sum_{x\in B(kr)}f(x)^2 \le \sum_{x\in B(3kr)}f(x)^2.

Consider the quadratic form Q_r on V defined by Q_r(f) := \sum_{x\in B(r)}f(x)^2. Since the kernels of Q_r‘s form a descending chain of vector spaces, there exists r_0 such that Q_r is positive-definite for all r \ge r_0.

For every r \ge r_0, let q(r) be the volume of the ellipsoid E_r induced by Q_r. To be more precise, after fixing a basis \{f_1, \dots, f_n\} of V, the ellipsoid is defined by E_r := \{(c_1, \dots, c_n) \in \mathbb{R}^n : Q_r(c_1f_1 + \dots c_nf_n) \le 1\}. By scaling and translation, we may assume that f_i is 1-Lipschitz and f_i(e)=0 (whenever f_i is non-constant). Since \lvert B(r) \rvert is at most polynomial in r, there exists a natural number d (depending only on the group G) such that \sum_{x \in B(r)} f_i(x)^2 \le r^d for all i \in [n]. By Cauchy–Schwarz inequality, we have \begin{aligned}Q_r(c_1f_1 + \dots + c_nf_n) & = \sum_{x\in B(r)}(c_1f_1(x)+\dots+c_nf_n(x))^2 \\ & \le n\sum_{x\in B(r)}c_1^2f_1(x)^2 + \dots c_n^2f_n(x)^2 \\ & \le (c_1^2+\dots+c_n^2)n^2r^d.\end{aligned} Therefore E_r contains the ball of radius (nr^{d/2})^{-1}, hence q_r \ge v_n(nr^{d/2})^{-n}, where v_n is the volume of the n-dimensional Euclidean unit ball.

Although the volume q(r) of the ellipsoid is not intrinsic to Q_r, the ratio between q(r) and q(r') does not depend on the choice of the basis.

After a linear transformation, we may assume the symmetric matrices associated to Q_{kr} and Q_{3kr} are of the form \begin{pmatrix}A_1 & B_1 \\ B_1^T & C_1 \end{pmatrix}, \begin{pmatrix}A_3 & \\ & C_3\end{pmatrix}, where A_1, A_3 act on V_r. Using the Schur complement, we obtain that the volume ratio q_{3kr}/q_{kr} is \frac{\det Q_{kr}}{\det Q_{3kr}} = \frac{\det A_1 \det (C_1 - B_1^TA_1^{-1}B_1)}{\det A_3 \det C_3}. As B_1^TA_1^{-1}B_1 is positive semi-definite, we obtain \det(C_1 - B_1^TA_1^{-1}B_1) \le \det C_1 and so q_{3kr}/q_{kr} is at most \frac{\det A_1\det C_1}{\det A_3\det C_3}. Recall that 3^dQ_{kr}(f) \le Q_{3kr}(f) for all f \in V_r and clearly Q_{kr}(f) \le Q_{3kr}(f) for all f \in V. In other words, 3^dA_1 \preceq A_3 and C_1 \preceq C_3 and so q_{kr}/q_{3kr} \ge (3^d)^{\dim V_r} \ge (3^d)^{n-C}.

Choose n so that (3^d)^{n-C} \ge 2^{dn} hence 2^{dn}q(3kr) \le q(kr). Repeatedly apply the last inequality to obtain q(kr) \ge 2^{mdn}q(3^mkr) \ge 2^{mdn}v_n(n(3^mkr)^{d/2})^{-n} = \Omega_{d,n}((2/\sqrt{3})^{mdn}), which is absurd for m sufficiently large. [qed]