On the Hardy–Littlewood maximal function of a singular measure

In this exposition, we explore the behavior of the Hardy–Littlewood maximal function of measures that are singular with respect to Lebesgue measure.

We are going to prove that for every positive Borel measure s that is singular with respect to the Lebesgue measure m,
m(\{x\in\mathbb{T}:M_s(x)>h\})>\frac{1}{2h}s(\mathbb{T}) for all h>s(\mathbb{T}), where \mathbb{T}=\mathbb{R}/\mathbb{Z} is the torus and M_s is the Hardy–Littlewood maximal function of the measure s defined by
M_s(x)=\sup_{x\in B\subset\mathbb{T}}\frac{s(B)}{m(B)} \text{ for all }x\in\mathbb{T}, where the supremum is taken over all open balls (arcs) in the torus containing the point x.
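For a concrete example (not needed for the proof), take s=\delta_0, the Dirac measure at 0, so that s(\mathbb{T})=1. An open arc containing both x\neq 0 and 0 must have length greater than d(x,0), the distance on the torus, and its length can be taken arbitrarily close to d(x,0); hence M_s(x)=1/d(x,0), and m(\{x\in\mathbb{T}:M_s(x)>h\})=m(\{x:d(x,0)<1/h\})=\min(1,2/h), which indeed exceeds \frac{1}{2h}s(\mathbb{T}) for all h>1.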

Yue Pu told me about this problem a couple of days ago. Though I don't feel confident solving analysis problems, I wanted to see how much I have learnt from Giovanni Leoni. Finally, after staring at the problem for about 6 hours, Yue and I came up with an answer. The proof is modified from Yue's write-up.

For the experts among the readers: our first step is to exploit the Calderón–Zygmund covering lemma. We will explain this in detail.

Since the measures m and s are mutually singular, there exist two disjoint Borel sets M and S whose union is \mathbb{T} such that s(M)=m(S)=0. In other words, m lives on M and s lives on S.

Let A be the set of all atoms of s, i.e., A=\{x: s(\{x\})>0\}\subset\mathbb{T}. Define E(x):=\{x+\frac{k}{2^n}: k,n\in\mathbb{N}\}\subset\mathbb{T} for all x\in\mathbb{T}, where the addition is mod 1. Note that A and each E(x) are countable. Since a\in E(e) if and only if e\in\{a-\frac{k}{2^n}:k,n\in\mathbb{N}\}, the set of points e with A\cap E(e)\neq\emptyset is a countable union of countable sets; hence there exists e\in\mathbb{T} such that A\cap E(e)=\emptyset. Without loss of generality, we may assume e=0, because we may declare a suitable point of the torus to be 0. We abbreviate E(0) to E; it will be the set of endpoints of the dyadic intervals below.

Fix h>s(\mathbb{T}). Now we are ready to apply the Calderón–Zygmund decomposition at height h.

We chop the torus into two disjoint open intervals (0, 1/2) and (1/2, 1), both of length 1/2. At most one of them can have average s(I)/m(I) greater than or equal to h, for if both did, we would get s(\mathbb{T})\geq h\cdot\frac{1}{2}+h\cdot\frac{1}{2}=h, contradicting s(\mathbb{T})<h; if there is such an interval, denote it by I_1. Since s(I_1)\leq s(\mathbb{T}) and m(I_1)=1/2, we also get an upper bound for its average: h\leq s(I_1)/m(I_1)\leq 2s(\mathbb{T})<2h.

We keep chopping the intervals whose average is less than h in half and pick out the sub-intervals whose average is no less than h. Each selected interval has average less than 2h, because its parent had average less than h and halving an interval at most doubles the average. Inductively, we generate a countable sequence of mutually disjoint open intervals \{I_n\}, each satisfying h \leq s(I_n)/m(I_n) < 2h, which implies s(I_n)<2hm(I_n).
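For instance, take s=\delta_{1/3} and h=3. The halves (0,1/2) and (1/2,1) have averages 2 and 0 respectively, both less than 3, so both are bisected; among (0,1/4), (1/4,1/2), (1/2,3/4), (3/4,1), only (1/4,1/2) contains the mass, and its average is \frac{1}{1/4}=4\in[h,2h), so it is selected as I_1. All remaining intervals carry no mass, so nothing else is ever selected, and indeed s(I_1)=1=s(\mathbb{T}), as the claim below predicts.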

Let I=\bigcup_n{I_n}. We claim that s(I)=s(\mathbb{T}). If this is the case, then \begin{aligned}m(\{x\in\mathbb{T}:M_s(x)>h\}) & \geq\sum_nm(I_n)>\frac{1}{2h}\sum_ns(I_n) \\ & =\frac{1}{2h}s(I)=\frac{1}{2h}s(\mathbb{T}).\end{aligned}

It remains to prove the claim. To prove s(I)=s(\mathbb{T}), we only need to show s(S-I)=0, as s lives on S. Note that s(E)=0, since A, the set of the atoms of s, is disjoint from E and E is countable. It suffices to show s(S-I-E)=0.

Let K be a compact set contained in S-I-E. Since a positive Borel measure can always be approximated from inside by compact sets (inner regularity), it is enough to show s(K)=0.

For any \epsilon>0, as m(S)=0, we may choose an open set U such that S\subset U and m(U)<\epsilon. We want to show that for every x\in K\subset S-I-E, we can find two intervals J_x and J_x^+, both containing x, such that 4m(J_x)=m(J_x^+), s(J_x^+)<hm(J_x^+) and 3J_x\subset J_x^+\subset U, where 3J denotes the open interval that has the same center as J but three times the radius.

According to the decomposition process, there is a decreasing sequence of open intervals \{J_n\}_{n=1}^\infty containing x such that J_{n+1} is the left or right half of J_n and s(J_n)<hm(J_n) for all n. This is possible because x\notin I\cup E: since x\notin E, the point x is never an endpoint of a dyadic interval, so exactly one half contains it at each step, and since x\notin I, the decomposition never selects an interval containing x. Note that x cannot lie in the left half of J_n for all sufficiently large n, for otherwise all subsequent intervals would share a common left endpoint, and x, being interior to all of them while their lengths shrink to 0, would have to equal that endpoint; similarly, x cannot lie in the right half of J_n for all sufficiently large n. In other words, infinitely often J_{n+2} is the right half of J_{n+1}, which is the left half of J_n.

Intervals that choose different halves.
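For concreteness, here is the computation behind the containment used below: if J_n=(a,a+4t) and J_{n+2} is the right half of its left half, then J_{n+2}=(a+t,a+2t), whose center is a+\frac{3t}{2} and whose radius is \frac{t}{2}; hence 3J_{n+2}=(a,a+3t)\subset J_n, and m(J_n)=4m(J_{n+2}).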

Moreover, since U is open, x\in U and m(J_n)\to 0, we have J_n\subset U for all sufficiently large n. Therefore we can always find a sufficiently large n such that J_n\subset U and J_{n+2} is the right half of J_{n+1}, which is the left half of J_n. Now for every x\in K\subset U, there exists a pair J_x:=J_{n+2} and J_x^+:=J_n as above such that x\in J_x, 3J_x\subset J_x^+\subset U, m(J_x^+)=4m(J_x), and s(J_x^+)<hm(J_x^+). Hence \{J_x\}_{x\in K} is an open cover of the compact set K. We choose a finite sub-cover and feed it to the Vitali covering lemma, which yields a sub-collection \{J_{x_n}\}_{n=1}^N of pairwise disjoint intervals such that K\subset\bigcup_{n=1}^N 3J_{x_n}. Now we have

\begin{aligned}s(K) & \leq s\left(\bigcup_{n=1}^N 3J_{x_n}\right)\leq s\left(\bigcup_{n=1}^N J_{x_n}^+\right)\leq\sum_{n=1}^N s(J_{x_n}^+)<h\sum_{n=1}^N m(J_{x_n}^+) \\ &=4h\sum_{n=1}^N m(J_{x_n})=4hm\left(\bigcup_{n=1}^N J_{x_n}\right)\leq 4hm(U) < 4h\epsilon.\end{aligned}

As \epsilon>0 was chosen arbitrarily, we have s(K)=0, which yields the claim.

Remark. The fact that s(\mathbb{T})<\infty, which is implicitly required by the problem, is used to prove the countability of A, the set of all atoms of s.

Remark. I have no idea whether the result is sharp in terms of the constant multiplier on the right hand side of the inequality.
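That said, the Dirac measure example after the definition gives m(\{x\in\mathbb{T}:M_s(x)>h\})=\min(1,2/h), so no constant larger than 2 can replace the \frac{1}{2}; the optimal constant therefore lies somewhere in [\frac{1}{2},2].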

Alternating Fourier Coefficients

Suppose f\in L^2 is a periodic function from \mathbb{R} to \mathbb{C} with period 1. Let a(n) be its Fourier coefficients, namely a(n)=\int_0^1{f(x)e^{-i2\pi nx}dx} for all n\in\mathbb{Z}. Prove that for all p\geq 2, the function g(\omega)(x)=\sum_{n\in \mathbb{Z}}\omega(n)a(n)e^{i2\pi nx} is almost surely in L^p, where \omega=(\omega(n))_{n\in\mathbb{Z}} is a sequence of independent and identically distributed random variables with \omega(n) equal to -1 or 1, each with probability 1/2.

I heard this problem from Yue Pu, who heard it from Gautam Iyer. Roughly speaking, given a periodic L^2 function, by alternating the signs of its Fourier coefficients, one almost surely gets an L^p function for all p\geq 2.

Here is the proof Yue and I worked out this afternoon, which I think is pretty neat. However, all convergence arguments are omitted.

We shall prove that for all p\in\mathbb{N}, the expectation of \int_0^1 |g(\omega)(x)|^{2p}dx is at most C_p\left(\int_0^1|f(x)|^2dx\right)^p, where C_p is a constant that depends only on p.

This is sufficient to prove the original problem. For each p\in\mathbb{N}, let A=\{\omega: g(\omega)\notin L^{2p}\}. If \mathbb{P}(A)>0, then, since \int_0^1{|g(\omega)(x)|^{2p}dx}=\infty for all \omega\in A, we would have \infty=\mathbb{E}\left[\int_0^1 |g(\omega)(x)|^{2p}dx\right]\leq C_p\left(\int_0^1|f(x)|^2dx\right)^p<\infty, which is impossible. For a general p\geq 2, pick any p'\in\mathbb{N} such that p\leq 2p'. We know g(\omega) is almost surely in L^{2p'}, hence almost surely in L^{p}, as the torus has finite measure.

For brevity, we write e(n)(x)=e^{i2\pi nx}. We will repeatedly use the facts that e(m)(x)e(n)(x)=e(m+n)(x), \overline{e(m)(x)}=e(-m)(x), and \int_0^1e(n)(x)dx=0 for n\neq 0 while \int_0^1e(0)(x)dx=1. We will also use Parseval's identity, which says \int_0^1|f(x)|^2dx=\sum_{n\in\mathbb{Z}}|a(n)|^2.

For all p\in\mathbb{N}, we have \begin{aligned}|g(\omega)(x)|^2&=g(\omega)(x)\overline{g(\omega)(x)}\\&=\sum_{m,n}\omega(m)\omega(n)a(m)\overline{a(n)}e(m)(x)e(-n)(x).\end{aligned} This gives us |g(\omega)(x)|^{2p}=\sum_{m_1,\ldots,m_p,n_1,\ldots,n_p}\prod_{i=1}^p\omega(m_i)a(m_i)e(m_i)(x)\omega(n_i)\overline{a(n_i)}e(-n_i)(x). Integrating the above formula from 0 to 1, we have
\int_0^1{|g(\omega)(x)|^{2p}}dx=\sum_{m_1+\ldots+m_p=n_1+\ldots+n_p}\prod_{i=1}^p\omega(m_i)\omega(n_i)a(m_i)\overline{a(n_i)}.

Notice that for fixed m_1, \ldots, m_p, n_1, \ldots, n_p, the expectation of the summand is 0 unless there is a perfect matching among m_1,\ldots, m_p, n_1, \ldots, n_p. By a perfect matching, we mean that the 2p numbers can be paired up so that the two numbers in each pair are equal. Hence the expectation of the above is equal to
\sum_{m_1+\ldots+m_p=n_1+\ldots+n_p}^*\prod_{i=1}^pa(m_i)\overline{a(n_i)},
where \sum^* is the summation over every sequence that admits a perfect matching.
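As a sanity check, when p=1 a perfect matching forces m_1=n_1, so the sum is \sum_n a(n)\overline{a(n)}=\sum_n|a(n)|^2=\int_0^1|f(x)|^2dx; in other words, \int_0^1|g(\omega)(x)|^2dx=\int_0^1|f(x)|^2dx for every choice of signs, consistent with Parseval's identity.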

By the triangle inequality, and dropping the constraint m_1+\ldots+m_p=n_1+\ldots+n_p, the above is at most
\sum_{m_1,\ldots, m_p, n_1,\ldots, n_p}^*\prod_{i=1}^p|a(m_i)||a(n_i)|.

By overcounting (summing over all matchings and all assignments of values to the pairs), the expectation is at most
\frac{(2p)!}{p!2^p}\sum_{l_1, \ldots, l_p}\prod_{i=1}^p|a(l_i)|^2=\frac{(2p)!}{p!2^p}\left(\int_0^1|f(x)|^2dx\right)^p,
where \frac{(2p)!}{p!2^p} is the number of perfect matchings that can be formed among 2p numbers.
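For example, when p=2 there are \frac{4!}{2!\cdot 2^2}=3 matchings, so \mathbb{E}\left[\int_0^1|g(\omega)(x)|^4dx\right]\leq 3\left(\int_0^1|f(x)|^2dx\right)^2.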

Remark. The proof is not constructive. Indeed, it does not provide a way to alternate the Fourier coefficients so that the outcome is an L^p function, even though the outcome is almost surely in L^p.
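One can still experiment numerically, though. Below is a small Mathematica sketch (illustrative only; the coefficients a(n)=1/n and all numerical parameters are arbitrary choices of mine, not part of the problem) that samples one random sign pattern and estimates \int_0^1|g(\omega)(x)|^4dx by a Riemann sum.

(* Randomize the signs of the L^2 coefficient sequence a(n) = 1/n. *)
nMax = 128;
a[n_] := If[n == 0, 0., 1./n];
signs = RandomChoice[{-1, 1}, 2 nMax + 1]; (* one sign per n in -nMax..nMax *)
g[x_] := Sum[signs[[n + nMax + 1]] a[n] Exp[2 Pi I n x], {n, -nMax, nMax}];
(* Riemann-sum estimate of Integrate[Abs[g[x]]^4, {x, 0, 1}] on a uniform grid. *)
grid = Range[0., 1. - 1./512, 1./512];
Mean[Abs[g /@ grid]^4]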

Number of non-isomorphic graphs

This expository essay is to test my understanding of the techniques used in More Bricks – More Walls?, a chapter of Jiří Matoušek's Thirty-three Miniatures.

We shall prove that the sequence g_n(0),\ldots, g_n({n\choose 2}) is unimodal, i.e., it is first non-decreasing and then, from some point on, non-increasing, where g_n(k) is the number of non-isomorphic graphs with n vertices and k edges. In particular, we shall prove \begin{gathered}g_n(0)\leq g_n(1)\leq\ldots\leq g_n(\lfloor m/2\rfloor) \\ =g_n(\lceil m/2\rceil)\geq\ldots\geq g_n(m-1)\geq g_n(m),\end{gathered} where m={n\choose 2}.
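For example, when n=4 (so m=6), the sequence g_4(0),\ldots,g_4(6) is 1,1,2,3,2,1,1: unimodal with its peak at m/2=3, where the three graphs are the triangle with an isolated vertex, the path on 4 vertices, and the star K_{1,3}.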

Throughout the article, the number of vertices, denoted by n, is fixed, and m={n\choose 2} always stands for the maximum number of edges on n vertices.

Notice that taking the complement of graphs establishes a bijection between graphs with k edges and graphs with m-k edges. Moreover, two graphs are isomorphic if and only if their complement graphs are isomorphic. Thus we have g_n(k)=g_n(m-k). Hence it is enough to show that g_n(k)\leq g_n(l) for all k\leq l\leq m/2.

Let U, V be the sets of all graphs on the fixed vertex set [n] with k, l edges respectively, and let r, s denote the numbers of non-isomorphic graphs in U, V. By our notation above, r=g_n(k) and s=g_n(l). We shall show r\leq s. Let G be the bipartite graph between U and V with u\sim v if and only if u is a subgraph of v.

Let B=(b_{uv})_{u\in U, v\in V} be the bipartite adjacency matrix of G, where b_{uv}=1 if u and v are adjacent in G and b_{uv}=0 otherwise.

We claim the matrix B is of full rank. Since |U|={m\choose k}\leq{m\choose l}=|V| for k\leq l\leq m/2, this is equivalent to the statement that B is of full row rank, that is, the rows of B are linearly independent.

Suppose not. Then there is a non-zero row vector y\in\mathbb{R}^{U} such that yB=0. Notice the coordinates of y are indexed by the elements of U. Let K^*\in U be such that y_{K^*}\neq 0. (We may also assume k<l, for otherwise there is nothing to prove.)

Now we partition U and V into k+1 classes each, according to the number of edges shared with K^*: for i, j=0,\ldots, k, let K_i be the set of graphs in U that share exactly i edges with K^*, and let L_j be the set of graphs in V that share exactly j edges with K^*. Remember that the number of edges of K^* is k, so K_k contains only one element, K^*. Also, all the K_i and L_j defined above are non-empty because of the assumption k<l\leq m/2.

Note that for any i, j\in \{0,\ldots, k\}, all vertices in K_i have the same degree to L_j in the bipartite graph G. This is because the number of ways to extend a graph with k edges, i of them in common with K^*, to a graph with l edges, j of them in common with K^*, does not depend on the specific graph we start with. Denote this number by d_{ij} and let D=(d_{ij}) be the corresponding (k+1)\times(k+1) matrix. Since adding edges can only increase the number of edges in common with K^*, D is an upper triangular matrix with d_{ii}\neq 0, so D is non-singular. On the other hand, as k_i\cdot d_{ij}=\sum_{K\in K_i, L\in L_j}B_{KL}, where k_i is the size of K_i, we have D=\mathrm{diag}(k_0^{-1},\ldots, k_k^{-1})FBG, where F=(f_{i, u})_{i\in\{0,\ldots, k\}, u\in U} and G=(g_{v, j})_{v\in V, j\in\{0,\ldots, k\}} are the 0-1 matrices with f_{i, u}=1 if and only if u\in K_i and g_{v, j}=1 if and only if v\in L_j.
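Explicitly, extending u\in K_i to a graph in L_j amounts to choosing which j-i of the k-i edges of K^* missing from u to add, and then choosing the remaining (l-k)-(j-i) new edges from the m-(2k-i) potential edges outside E(u)\cup E(K^*); hence d_{ij}={k-i\choose j-i}{m-2k+i\choose (l-k)-(j-i)}, which indeed depends only on i and j, vanishes when j<i, and is positive when j=i.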

Let x be the (k+1)-dimensional row vector x=yE, where E=(e_{ui})_{u\in U, i\in\{0,\ldots, k\}} is the matrix with e_{ui}=1 if u\in K_i and e_{ui}=0 otherwise (so E=F^T). In fact, x_i=\sum_{u\in U}y_ue_{ui}=\sum_{K\in K_i}y_K. Hence x_k=y_{K^*}\neq 0, as K_k contains only one element K^*.

Now consider xD=yED. It is easy to check that BG=ED: for every u\in K_i, the u-th row of BG is exactly the i-th row of D, since u has precisely d_{ij} neighbors in L_j. Together with yB=0, this gives xD=yED=yBG=(yB)G=0, contradicting the non-singularity of D, as x\neq 0. This proves the claim that B has full rank.

According to graph isomorphism, the graphs in U, V are classified into equivalence classes, denoted U_1,\ldots, U_r and V_1, \ldots, V_s respectively.

Similarly, we observe that all vertices in V_j have the same degree to U_i, since the number of ways to restrict isomorphic graphs to subgraphs in a given isomorphism class is the same. Denote this degree by d_{ij} as well, and again we claim that the matrix D=(d_{ij})_{r\times s} (different from the one defined before) is of full row rank. This will finish the proof, since an r\times s matrix of full row rank satisfies r\leq s.

Suppose not. Then we have a non-zero r-dimensional row vector y such that yD=0. Let x\in\mathbb{R}^U be the row vector indexed by U such that x_u=y_i for all u\in U_i. Then for all v\in V_j we have \begin{aligned}(xB)_v &=\sum_{u\in U}x_uB_{uv}=\sum_{i\in[r]}\sum_{u\in U_i}x_uB_{uv} \\ &=\sum_{i\in[r]}y_i\sum_{u\in U_i}B_{uv}=\sum_{i\in[r]}y_id_{ij}=(yD)_j=0.\end{aligned} Notice that x\neq 0 but xB=0, which contradicts the fact that the rows of B are linearly independent.

Remark. This proof strikes me as an intricate variant of finding an injection. To prove that one set is smaller than another, one would usually hope to find an injection from the former to the latter. But this is hard in this problem because the two sets we are considering are sets of equivalence classes. On the other hand, the inclusion mapping works just fine on the level of graphs, though it breaks down on the level of equivalence classes. Thus we need a delicate analysis of the bipartite graph induced by the inclusion mapping, and I think this is the motivation behind the proof.

Fun with Hex

According to folklore,

The Hex game was invented by the Danish mathematician Piet Hein, who introduced it in 1942 at the Niels Bohr Institute. It was independently re-invented in 1947 by the mathematician John Nash at Princeton University. The rules are really simple. Each player has an allocated color, Red and Blue being conventional. Players take turns placing a stone of their color on a single cell within the overall playing board. The goal is to form a connected path of your stones linking the opposing sides of the board marked by your colors, before your opponent connects his or her sides in a similar fashion. The first player to complete his or her connection wins the game. The four corner hexagons each belong to both adjacent sides… The game can never end in a tie, a fact proved by John Nash: the only way to prevent your opponent from forming a connecting path is to form a path yourself. In other words, Hex is a determined game.
– Hex (board game), Wikipedia

For instance, on the following 2 by 2 board, player 1 should aim to build a connected path linking x_1=0 and x_1=2, while player 2 should aim to build a connected path linking x_2=0 and x_2=2.

A 2 by 2 Hex Game Board with a coordinate attached to each cell.

Now, we would like to play the Hex game in higher dimensions. The first question is: how can we set up the game board in n-dimensional space? The answer all depends on your point of view. If we visualize the game board in 2 dimensions as follows, we can view each cell as a cube in 3-dimensional space.

Hex Board from another perspective.

Here is a rework of the 2-dimensional board which can be easily generalized to higher dimensions. Consider all unit cubes in \mathbb{R}^3 that are translations of the canonical unit cube (whose vertices are (0,0,0), (1,0,0), (0,1,0), (0,0,1), (1,1,0), (1,0,1), (0,1,1), (1,1,1)) along a vector (x,y,z), where (x,y,z)\in[n]\times[n]\times\mathbb{Z} ([n]=\{1,2,\ldots, n\}) and x+y+z=0. Basically, those unit cubes are the ones illustrated above. Now project all those cubes onto the 2-dimensional plane x+y+z=0. This gives us the hexagonal board.
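To see why hexagons arise: within a single cube, the vertices (0,0,0) and (1,1,1) project to the same point, since their difference (1,1,1) is exactly the direction we project along, while the other six vertices land on the vertices of a regular hexagon around that common image.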

For higher dimensions, say dimension d, just project all unit cubes in \mathbb{R}^{d+1} that are translations of the (d+1)-dimensional canonical unit cube along a vector (x_0, \ldots, x_d), where (x_0,\ldots, x_d)\in\mathbb{Z}^{d+1} and x_0+\ldots+x_d=0, onto the d-dimensional hyperplane given by the equation x_0+\ldots+x_d=0.
With the help of Mathematica, each cell of the 3D Hex board looks like the following.

Building Block of 3D Hex

Source code is attached here for those who want to play with it a little bit in Mathematica.

(* Orthonormal basis of the hyperplane x + y + z + t == 0 in R^4 *)
v = Orthogonalize[{{1, -1, 0, 0}, {0, 1, -1, 0}, {0, 0, 1, -1}}];
(* Coordinates of the 16 vertices of the 4-dimensional unit cube projected onto that hyperplane *)
data = Flatten[Table[{{x, y, z, t}.v[[1]], {x, y, z, t}.v[[2]], {x, y, z, t}.v[[3]]}, {x, 0, 1}, {y, 0, 1}, {z, 0, 1}, {t, 0, 1}], 3];
(* The cell is the convex hull of the projected vertices (a rhombic dodecahedron) *)
Needs["TetGenLink`"];
{p, s} = TetGenConvexHull[data];
Graphics3D[GraphicsComplex[p, Polygon[s]]]

I forgot to mention: in the d-dimensional Hex game, we have at most d players. Again, the game can never end in a tie. For those who are interested beyond this mere fact, please refer to David Gale's great exposition on this topic [DG] (David Gale, The Game of Hex and the Brouwer Fixed-Point Theorem, The American Mathematical Monthly, vol. 86, 1979, pp. 818-827).

However, for a paper-and-pencil game, drawing a real Hex board is time-consuming. Here is another version of Hex with an easy-to-draw game board. The vertices represent the hexagonal regions of the original board and the edges represent shared borders. Two players take turns coloring the vertices, each trying to form a monochromatic path linking his or her two opposite sides.

A 3 by 3 dual game board.

Indeed, mathematically speaking, this is the dual diagram of the original board. Its generalization to higher dimensions is easy to imagine. Define a graph whose vertex set is V=[n]^d. Two distinct vertices u, v\in V are adjacent if u_i-v_i\in\{0,1\} for all i\in[d] or, the other way around, v_i-u_i\in\{0,1\} for all i\in[d].
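To make the adjacency rule concrete, here is a short Mathematica sketch (illustrative only; the small values of d and n are my own choices) that builds this graph:

(* Dual Hex graph on the vertex set [n]^d: two distinct vertices are adjacent
   iff their coordinate-wise difference lies in {0,1}^d or in {0,-1}^d. *)
d = 2; n = 3;
vertices = Tuples[Range[n], d];
adjacentQ[u_, v_] := u =!= v && (AllTrue[u - v, 0 <= # <= 1 &] || AllTrue[v - u, 0 <= # <= 1 &]);
edges = UndirectedEdge @@@ Select[Subsets[vertices, {2}], adjacentQ @@ # &];
Graph[vertices, edges, VertexLabels -> Automatic]

For d=2 and n=3, this reproduces the 3 by 3 dual board pictured above.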