Crazy Telescoping

In mathematics, a telescoping series is a series whose partial sums eventually have only a fixed number of terms after cancellation. I learnt this technique when I was doing maths olympiad. However, I did not learn the buzzword ‘telescoping’ until last year, since I received my education in China, where we call it ‘the method of differences’.

Anyway, yesterday I was helping Po-Shen Loh proctor the Annual Virginia Regional Mathematics Contest at CMU. Unfortunately, I had not had breakfast; all I could digest were the seven problems. Even worse, I got stuck on the last problem, which is the following.

Find \sum_{n=1}^\infty\frac{n}{(2^n+2^{-n})^2}+\frac{(-1)^nn}{(2^n-2^{-n})^2}.

It turns out that the punch line is, as you might have expected, telescoping. But it is just a bit crazy.

There is Pleasure sure, In being Mad, which none but Madmen know!

John Dryden’s “The Spanish Friar”

First consider the first term, \frac{1}{(2^1+2^{-1})^2}+\frac{-1}{(2^1-2^{-1})^2}=\frac{-2\times 2}{(2^2-2^{-2})^2}. Then take a look at the second term, \frac{2}{(2^2+2^{-2})^2}+\frac{2}{(2^2-2^{-2})^2}. Aha, the sum of the first two terms becomes \frac{2}{(2^2+2^{-2})^2}+\frac{-2}{(2^2-2^{-2})^2}=\frac{-4\times 2}{(2^4-2^{-4})^2}, which again will interact nicely with the 4th term (not the 3rd term). In general, the sum of the 1st, 2nd, 4th, 8th, …, n=(2^m)th terms is \frac{-4n}{(2^{2n}-2^{-2n})^2}. Note that this goes to zero as m grows.

But this only handles the terms indexed by powers of 2. Naturally, the next thing to look at is the sum of the 3rd, 6th, 12th, 24th, …, n=(3\times 2^m)th terms, which amazingly turns out to exhibit the telescoping phenomenon again and is equal to \frac{-4n}{(2^{2n}-2^{-2n})^2}. Again, this goes to zero.

We claim that starting with an odd index, say 2l+1, the partial sum of the (2l+1)th, 2(2l+1)th, \ldots, 2^m(2l+1)th, \ldots terms goes to zero.

For readers who are willing to suffer a bit more for the sake of rigor, the argument for the previous claim can easily be formalized by the method of differences. The key fact we have been exploiting above is the following identity.

\frac{2n}{(2^{2n}+2^{-2n})^2}+\frac{2n}{(2^{2n}-2^{-2n})^2} = \frac{4n}{(2^{2n}-2^{-2n})^2}-\frac{8n}{(2^{4n}-2^{-4n})^2}.

The last ingredient is the absolute summability of the original sequence, which implies that changing the order of summation does not affect the result. Hence the sum in total is 0.
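For the skeptical, the cancellation can be checked numerically. Below is a quick sketch (the helper name is my own) that sums the first few dozen terms of the contest series and also spot-checks the key identity for a few small n.

```python
# Numerically verify that the contest series sums to 0, and spot-check the
# key telescoping identity for small n.
def term(n):
    return n / (2**n + 2**-n)**2 + (-1)**n * n / (2**n - 2**-n)**2

total = sum(term(n) for n in range(1, 60))
print(total)  # ~ 0 up to floating-point error

for n in (1, 2, 3):
    lhs = 2*n / (2**(2*n) + 2**(-2*n))**2 + 2*n / (2**(2*n) - 2**(-2*n))**2
    rhs = 4*n / (2**(2*n) - 2**(-2*n))**2 - 8*n / (2**(4*n) - 2**(-4*n))**2
    assert abs(lhs - rhs) < 1e-12
```

The terms decay like n\cdot 4^{-n}, so 60 terms are far more than enough for machine precision.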










On the Hardy–Littlewood maximal function of a singular measure

In this exposition, we explore the behavior of the Hardy–Littlewood maximal function of measures that are singular with respect to Lebesgue measure.

We are going to prove that for every positive Borel measure s that is singular with respect to Lebesgue measure m,
m(\{x\in\mathbb{T}:M_s(x)>h\})>\frac{1}{2h}s(\mathbb{T}) for all h>s(\mathbb{T}), where \mathbb{T}=\mathbb{R}/\mathbb{Z} is the torus and M_s is the Hardy–Littlewood maximal function of the measure s defined by
M_s(x)=\sup_{x\in B\subset\mathbb{T}}\frac{s(B)}{m(B)} \text{ for all }x\in\mathbb{T}, where the supremum is taken over all balls in the torus containing the point x.
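Before the proof, a quick sanity check of the statement on the simplest singular measure (this example and the code are my own, not part of the problem): take s=\delta_0, a unit Dirac mass at 0, so s(\mathbb{T})=1. The smallest interval containing both x and 0 has length d(x,0), the torus distance, so M_s(x)=1/d(x,0) and the level set \{M_s>h\} has measure 2/h for h>2, comfortably above the claimed lower bound 1/(2h).

```python
# Sanity check with s = delta_0 (unit Dirac mass at 0), so s(T) = 1:
# M_s(x) = 1 / d(x, 0), where d is the distance on the torus R/Z.
def torus_dist(x):
    x %= 1.0
    return min(x, 1.0 - x)

h = 10.0  # any h > s(T) = 1
N = 10**5
grid = [i / N for i in range(N)]
level = sum(1 for x in grid if torus_dist(x) < 1.0 / h) / N  # m({M_s > h})
print(level)  # about 2/h = 0.2, comfortably above 1/(2h) = 0.05
assert level > 1 / (2 * h)
```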

Yue Pu told me this problem a couple of days ago. Though I don’t feel confident solving analysis problems, I wanted to see how much I have learnt from Giovanni Leoni. Finally, after staring at the problem for about 6 hours, Yue and I came up with an answer. The proof is modified from Yue’s write-up.

For the experts among the readers, our first step is to exploit the Calderón–Zygmund covering lemma. We will explain this in detail.

Since the measures m and s are mutually singular, there exist two disjoint sets M and S whose union is \mathbb{T} such that s(M)=m(S)=0. In other words, m lives in M and s lives in S.

Let A be the set of all atoms of s, i.e., A=\{x: s(\{x\})>0\}\subset\mathbb{T}. Define E(x):=\{x+\frac{m}{2^n}: m,n\in\mathbb{N}\}\subset\mathbb{T} for all x\in\mathbb{T}, where the addition is mod 1. Note that A and E(x) are both countable, so there exists e\in\mathbb{T} such that A\cap E(e)=\emptyset. Without loss of generality, we may assume e=0, because we may choose a suitable point on the torus as 0. We abbreviate E(0) by E; it will be the set of endpoints of the intervals in our decomposition.

Fix h>s(\mathbb{T}). Now we are ready to apply the Calderón–Zygmund decomposition according to the height h.

We chop the torus into two disjoint intervals (0, 1/2) and (1/2, 1), both of length 1/2. Notice that at most one of them, denoted I_1 if it exists, has average s(I_1)/m(I_1) greater than or equal to h. Since s(\mathbb{T})<h, we also get an upper bound on the average of I_1: h\leq s(I_1)/m(I_1)<2h.

We keep chopping the intervals whose average is less than h in half and pick out the sub-intervals whose average is at least h. Inductively, we generate a countable sequence of mutually disjoint open intervals \{I_n\}, each satisfying h\leq s(I_n)/m(I_n)<2h (the upper bound holds because the parent of I_n has average less than h and I_n has half its length), which implies s(I_n)<2hm(I_n).
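The chopping process is easy to mechanize. Here is a sketch for a toy measure made of finitely many point masses (the function name and the dict representation are my own choices); assuming s(\mathbb{T})<h, it returns dyadic intervals whose averages all land in [h, 2h).

```python
# Dyadic Calderon-Zygmund selection on [0, 1): halve any interval whose
# average is below h; keep the halves whose average reaches h.
def cz_intervals(masses, h, lo=0.0, hi=1.0):
    s = sum(m for x, m in masses.items() if lo <= x < hi)
    if s == 0:
        return []
    if s / (hi - lo) >= h:
        return [(lo, hi)]  # its parent had average < h, so this average is < 2h
    mid = (lo + hi) / 2
    return cz_intervals(masses, h, lo, mid) + cz_intervals(masses, h, mid, hi)

ivs = cz_intervals({0.3: 1.0, 0.7: 0.5}, h=4.0)  # here s(T) = 1.5 < h
print(ivs)  # [(0.25, 0.5), (0.625, 0.75)]; both averages lie in [4, 8)
```

Note that the recursion terminates for purely atomic toy measures because an interval shrinking onto a point mass eventually has average at least h.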

Let I=\bigcup_n{I_n}. We claim that s(I)=s(\mathbb{T}). If this is the case, then \begin{aligned}m(\{x\in\mathbb{T}:M_s(x)>h\}) & \geq\sum_nm(I_n)>\frac{1}{2h}\sum_ns(I_n) \\ & =\frac{1}{2h}s(I)=\frac{1}{2h}s(\mathbb{T}).\end{aligned}

It remains to prove the claim. To prove s(I)=s(\mathbb{T}), we only need to show s(S-I)=0, as s lives in S. Note that s(E)=0 since A, the set of the atoms of s, is disjoint from E and E is countable. It thus suffices to show s(S-I-E)=0.

Let K be a compact set contained in S-I-E. Since for a positive Borel measure, one can always approximate a Borel set from inside by compact sets, it is enough to show s(K)=0.

For any \epsilon>0, as m(S)=0, we may choose an open set U such that S\subset U and m(U)<\epsilon. We want to show that for every x in K\subset S-I-E, we can find two intervals J_x and J_x^+, both containing x, such that 4m(J_x)=m(J_x^+), s(J_x^+)<hm(J_x^+) and 3J_x\subset J_x^+\subset U, where 3J denotes the open interval that has the same center as J but three times the radius.

According to the decomposition process, we can find an infinite descending sequence of open intervals \{J_n\}_{n=1}^\infty containing x such that J_{n+1} is the left or right half of J_n and s(J_n)<hm(J_n) for all n. This is possible because x\notin I\cup E. Note that x cannot lie in the left half of J_n for all sufficiently large n, and similarly x cannot lie in the right half of J_n for all sufficiently large n, for otherwise x would be a common endpoint of the nested intervals, hence a point of E. In other words, for infinitely many n, J_{n+2} is the right half of J_{n+1}, which is in turn the left half of J_n.

Intervals that choose different halves.

Therefore we can always find a sufficiently large n such that J_n\subset U. Now for every x\in K, there exist J_x, which is the J_{n+2} above, and J_x^+, which is the J_n above, such that x\in J_x, 3J_x\subset J_x^+\subset U, m(J_x^+)=4m(J_x), and s(J_x^+)<hm(J_x^+). Hence \{J_x\}_{x\in K} is an open cover of the compact set K. We can choose a finite sub-cover, pass it to the Vitali covering lemma, and obtain a sub-collection \{J_{x_n}\}_{n=1}^N of disjoint intervals such that K\subset\bigcup_{n=1}^N 3J_{x_n}. Now we have

\begin{aligned}s(K) & \leq s\left(\bigcup_{n=1}^N 3J_{x_n}\right)\leq s\left(\bigcup_{n=1}^N J_{x_n}^+\right)\leq\sum_{n=1}^N s(J_{x_n}^+)<h\sum_{n=1}^N m(J_{x_n}^+) \\ &=4h\sum_{n=1}^N m(J_{x_n})=4hm\left(\bigcup_{n=1}^N J_{x_n}\right)\leq 4hm(U) < 4h\epsilon.\end{aligned}

As \epsilon>0 is chosen arbitrarily, we have s(K)=0, which yields the claim.

Remark. The fact that s(\mathbb{T})<\infty, which is implicitly required in the problem, is used to prove the countability of A, the set of all atoms of s.

Remark. I have no idea whether the result is sharp in terms of the constant multiplier on the right hand side of the inequality.


Alternating Fourier Coefficients

Suppose f\in L^2 is a periodic function from \mathbb{R} to \mathbb{C} with period 1. Let a(n) be its Fourier coefficients, namely a(n)=\int_0^1{f(x)e^{-i2\pi nx}dx} for all n\in\mathbb{Z}. Prove that for all p\geq 2, the function g(\omega)(x)=\sum_{n\in \mathbb{Z}}\omega(n)a(n)e^{i2\pi nx} is almost surely in L^p, where \omega is a sequence of independent and identically distributed random variables indexed by \mathbb{Z}, each taking the value -1 or 1 with probability 1/2.

I heard this problem from Yue Pu who heard it from Gautam Iyer. Roughly speaking, given a periodic L^2 function, by alternating the signs of its Fourier coefficients, one almost surely gets an L^p function for all p\geq 2.

Here is the proof Yue and I worked out this afternoon which I think is pretty neat. However, all convergence arguments are omitted.

We shall prove that for all p\in\mathbb{N}, the expectation of \int_0^1 |g(\omega)(x)|^{2p}dx is less than or equal to C_p\left(\int_0^1|f(x)|^2dx\right)^p, where C_p is a constant that only depends on p.

This is sufficient to prove the original problem. For all p\in\mathbb{N}, let A=\{\omega: g(\omega)\notin L^{2p}\}. If \mathbb{P}(A)>0, then since \int_0^1{|g(\omega)(x)|^{2p}dx}=\infty for all \omega\in A, we would get \mathbb{E}\left[\int_0^1 |g(\omega)(x)|^{2p}dx\right]=\infty, contradicting the bound \mathbb{E}\left[\int_0^1 |g(\omega)(x)|^{2p}dx\right]\leq C_p\left(\int_0^1|f(x)|^2dx\right)^p<\infty. For a general p\geq 2, pick any p'\in\mathbb{N} such that p\leq 2p'. We know g(\omega) is almost surely in L^{2p'}, hence almost surely in L^{p}.

For brevity, we denote e(n)(x)=e^{i2\pi nx}. We will repeatedly use the facts that e(m)(x)e(n)(x)=e(m+n)(x), \overline{e(m)(x)}=e(-m)(x), and \int_0^1e(n)(x)dx=0 for n\neq 0 while it equals 1 for n=0. We will also use Parseval’s identity, which says \int_0^1|f(x)|^2dx=\sum_{n\in\mathbb{Z}}|a(n)|^2.

For all p\in\mathbb{N}, we have \begin{aligned}|g(\omega)(x)|^2&=g(\omega)(x)\overline{g(\omega)(x)}\\&=\sum_{m,n}\omega(m)\omega(n)a(m)\overline{a(n)}e(m)(x)e(-n)(x).\end{aligned} This gives us |g(\omega)(x)|^{2p}=\sum_{m_1,\ldots,m_p,n_1,\ldots,n_p}\prod_{i=1}^p\omega(m_i)a(m_i)e(m_i)(x)\omega(n_i)\overline{a(n_i)}e(-n_i)(x). Integrating the above formula from 0 to 1 kills every summand with m_1+\cdots+m_p\neq n_1+\cdots+n_p, so we have \int_0^1|g(\omega)(x)|^{2p}dx=\sum_{m_1+\cdots+m_p=n_1+\cdots+n_p}\prod_{i=1}^p\omega(m_i)a(m_i)\omega(n_i)\overline{a(n_i)}.

Notice that for fixed m_1, \ldots, m_p, n_1, \ldots, n_p, the expectation of the summand is always 0 except when there is a perfect matching among m_1,\ldots, m_p, n_1, \ldots, n_p. By a perfect matching, we mean that one can pair up the 2p numbers such that the two numbers in each pair are equal (so that the \omega's square away). Hence the expectation of the above is equal to
\sum^*_{m_1,\ldots,m_p,n_1,\ldots,n_p}\prod_{i=1}^p a(m_i)\overline{a(n_i)},
where \sum^* is the summation over every sequence that admits a perfect matching.

The above is less than or equal to
\sum_{m_1,\ldots, m_p, n_1,\ldots, n_p}^*\prod_{i=1}^p|a(m_i)||a(n_i)|.

By over-counting, the expectation is less than or equal to
\frac{(2p)!}{p!2^p}\sum_{l_1, \ldots, l_p}\prod_{i=1}^p|a(l_i)|^2=\frac{(2p)!}{p!2^p}\left(\int_0^1|f(x)|^2dx\right)^p,
where \frac{(2p)!}{p!2^p} is the number of perfect matchings that can possibly be formed among 2p numbers.

Remark. The proof is not constructive. Indeed, it doesn’t provide a way for one to alternate the Fourier coefficients so that the outcome is an L^p function, though it is almost sure that the outcome is in L^p.
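A Monte Carlo sketch of the p=2 case (truncated to finitely many frequencies; the coefficients a(n)=1/(1+|n|) are an arbitrary square-summable choice of mine): the empirical mean of \int_0^1|g|^4dx should stay below C_2\left(\int_0^1|f|^2dx\right)^2 with C_2=4!/(2!\,2^2)=3.

```python
# Random-sign simulation: check E[int |g|^4] <= 3 (int |f|^2)^2 empirically.
import numpy as np

rng = np.random.default_rng(0)
N = 8                                     # truncate to frequencies -N..N
ns = np.arange(-N, N + 1)
a = 1.0 / (1.0 + np.abs(ns))              # some square-summable coefficients
x = np.linspace(0.0, 1.0, 2048, endpoint=False)
E = np.exp(2j * np.pi * np.outer(ns, x))  # e(n)(x) sampled on a grid

norms4 = []
for _ in range(300):
    w = rng.choice([-1.0, 1.0], size=ns.size)   # the random signs omega(n)
    g = (w * a) @ E                             # g(omega)(x) on the grid
    norms4.append(np.mean(np.abs(g) ** 4))      # approximates int_0^1 |g|^4 dx

bound = 3 * np.sum(a ** 2) ** 2
print(np.mean(norms4), bound)  # the empirical mean stays below the bound
```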


Number of non-isomorphic graphs

This expository essay is to test my understanding of the techniques used in More Bricks – More Walls? from Jiří Matoušek’s Thirty-three Miniatures.

We shall prove the sequence g_n(0),\ldots, g_n({n\choose 2}) is unimodal, i.e., it is first nondecreasing and then, from some point on, non-increasing, where g_n(k) is the number of non-isomorphic graphs with n vertices and k edges. In particular, we shall prove \begin{gathered}g_n(0)\leq g_n(1)\leq\ldots\leq g_n(\lfloor m/2\rfloor) \\ =g_n(\lceil m/2\rceil)\geq\ldots\geq g_n(m-1)\geq g_n(m),\end{gathered} where m={n\choose 2}.

Throughout the article, the number of vertices, which is denoted as n, is always fixed. And m always stands for the maximal number of edges between n vertices.

Notice that taking the complement of graphs establishes a bijection between graphs with k edges and graphs with m-k edges. Moreover, two graphs are isomorphic if and only if their complement graphs are isomorphic. Thus we have g_n(k)=g_n(m-k). Hence it is enough to show that g_n(k)\leq g_n(l) for all k\leq l\leq m/2.

Let U, V be the sets of all graphs with k, l edges on the fixed vertex set [n], respectively. Let r, s denote the numbers of non-isomorphic graphs in U, V. By our notation above, r=g_n(k), s=g_n(l). We shall show r\leq s. Let G be the bipartite graph between U and V with u\sim v if and only if u is a subgraph of v.

Let B=(b_{uv})_{u\in U, v\in V} be the bipartite adjacency matrix of G, where b_{uv}=1 if u and v are adjacent in G and b_{uv}=0 otherwise.

We claim the matrix B is of full rank. As it is easy to see that the size of U is no greater than the size of V, we shall prove the equivalent statement that B is of full row rank, that is, the rows of B are linearly independent.

Suppose not. Then there is a non-zero row vector y in \mathbb{R}^{U} such that yB=0. Notice the coordinates of y are indexed by the elements of U. Let K^*\in U be such that y_{K^*}\neq 0.

Now we partition U, V into k+1 classes according to the number of edges the graph shares with K^*: K_i is the set of graphs in U that share i common edges with K^*, and L_j is the set of graphs in V that share j common edges with K^*, for i, j=0,\ldots, k. Remember the number of edges in K^* is k, so K_k contains only one element, K^* itself. Also, all the K_i and L_j defined above are non-empty because of the assumption k<l\leq m/2.

Note that for any i, j\in \{0,\ldots, k\}, all vertices in K_i have the same degree to L_j in the bipartite graph G. This is because the number of ways to extend a graph with k edges, i of which are in common with K^*, to a graph with l edges, j of which are in common with K^*, does not depend on the specific graph we start with. Denote this number by d_{ij} and let D=(d_{ij}) be the corresponding matrix. Apparently, D is an upper triangular matrix with d_{ii}\neq 0, so D is non-singular. On the other hand, as k_i\cdot d_{ij}=\sum_{K\in K_i, L\in L_j}B_{KL}, where k_i is the size of K_i, we have D=\mathrm{diag}(k_0^{-1},\ldots, k_k^{-1})FBG, where F=(f_{i, u})_{i\in\{0,\ldots, k\}, u\in U} and G=(g_{v, j})_{v\in V, j\in\{0,\ldots, k\}} are the 0-1 matrices with f_{i, u}=1 if and only if u\in K_i and g_{v, j}=1 if and only if v\in L_j.

Let x be a (k+1)-dimensional row vector such that x=yE, where E=(e_{ui})_{u\in U, i\in\{0,\ldots, k\}} is the matrix with e_{ui}=1 if u\in K_i and e_{ui}=0 otherwise. In fact, x_i=\sum_{u\in U}y_ue_{ui}=\sum_{K\in K_i}y_K. Hence x_k=y_{K^*}\neq 0, as K_k contains only the element K^*.

Now consider xD=yED. Since every vertex in K_i has the same degree d_{ij} to each L_j, it is easy to check that ED=BG. Together with yB=0, this gives xD=yED=yBG=(yB)G=0, contradicting the non-singularity of D, as x\neq 0.

According to graph isomorphism, the graphs in U, V are classified into equivalence classes, denoted U_1,\ldots, U_r and V_1, \ldots, V_s.

Similarly, we observe that all vertices in V_j have the same degree to U_i, since the number of ways to restrict two isomorphic graphs to a given isomorphism class is the same. Denote this degree by d_{ij} as well, and again consider the matrix D=(d_{ij})_{r\times s} (different from the one defined before). We claim that D is of full row rank. This will finish the proof, since an r\times s matrix of full row rank implies r\leq s.

Suppose not. Then we have a non-zero r-dimensional row vector y such that yD=0. Let x\in\mathbb{R}^U be the row vector indexed by U such that x_u=y_i for all u\in U_i. Then for all v\in V_j we have \begin{aligned}(xB)_v &=\sum_{u\in U}x_uB_{uv}=\sum_{i\in[r]}\sum_{u\in U_i}x_uB_{uv} \\ &=\sum_{i\in[r]}y_i\sum_{u\in U_i}B_{uv}=\sum_{i\in[r]}y_id_{ij}=(yD)_j=0.\end{aligned} Notice x\neq 0 but xB=0, which contradicts the fact that B has full row rank.

Remark. This proof appears to me as an intricate variant of finding an injection. To prove one set is smaller than another, one would naturally hope to find an injection from one set to the other. But this is hard in this problem because the two sets we are considering are sets of equivalence classes. On the other hand, the inclusion mapping works just fine on the level of graphs, though it breaks down on the level of equivalence classes. Thus we need a delicate analysis of the bipartite graph induced by the inclusion mapping, and I think this is the motivation behind the proof.
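To see the unimodality claim in action for a small case, here is a brute-force sketch (my own code, not from the book) that counts non-isomorphic graphs by canonicalizing each edge set over all vertex permutations.

```python
# Brute-force computation of g_n(k) for small n: count edge subsets up to
# isomorphism by taking the lexicographically smallest relabeling as a
# canonical form.
from itertools import combinations, permutations

def g_sequence(n):
    verts = range(n)
    all_edges = list(combinations(verts, 2))
    m = len(all_edges)
    counts = [set() for _ in range(m + 1)]
    for k in range(m + 1):
        for edges in combinations(all_edges, k):
            canon = min(
                tuple(sorted(tuple(sorted((p[u], p[v]))) for u, v in edges))
                for p in permutations(verts)
            )
            counts[k].add(canon)
    return [len(s) for s in counts]

seq = g_sequence(4)
print(seq)  # [1, 1, 2, 3, 2, 1, 1] -- nondecreasing, then nonincreasing
```

The output is symmetric (complementation) and unimodal, exactly as the theorem predicts.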


Fun with Hex

According to folklore,

The Hex game was invented by the Danish mathematician Piet Hein, who introduced it in 1942 at the Niels Bohr Institute. It was independently re-invented in 1947 by the mathematician John Nash at Princeton University. The rules are really simple. Each player has an allocated color, Red and Blue being conventional. Players take turns placing a stone of their color on a single cell within the overall playing board. The goal is to form a connected path of your stones linking the opposing sides of the board marked by your colors, before your opponent connects his or her sides in a similar fashion. The first player to complete his or her connection wins the game. The four corner hexagons each belong to both adjacent sides… The game can never end in a tie, a fact proved by John Nash: the only way to prevent your opponent from forming a connecting path is to form a path yourself. In other words, Hex is a determined game.
– Hex (board game), Wikipedia

For instance, in the following 2 by 2 board, player 1 should intend to build a connected path linking x_1=0, x_1=2, while player 2 should intend to build a connected path linking x_2=0, x_2=2.

A 2 by 2 Hex Game Board with a coordinate attached to each cell.

Now, we would like to play the Hex game in higher dimensions. The first question is: how can we set up the game board in n-dimensional space? The answer all depends on your point of view. If we visualize the game board in 2 dimensions as follows, we can view each cell as a cube in 3-dimensional space.

Hex Board from another perspective.

Here is a rework of the 2-dimensional board which can easily be generalized to higher dimensions. Consider all unit cubes in \mathbb{R}^3 that are translations of the canonical unit cube (whose vertices are (0,0,0), (1,0,0), (0,1,0), (0,0,1), (1,1,0), (1,0,1), (0,1,1), (1,1,1)) along vectors (x,y,z), where (x,y,z)\in[n]\times[n]\times\mathbb{Z} ([n]=\{1,2,\ldots, n\}) and x+y+z=0. Basically, those unit cubes are the ones illustrated above. Now project all those cubes onto the 2-dimensional plane x+y+z=0. This gives us the hexagon board.

For higher dimensions, say dimension d, just project all unit cubes in \mathbb{R}^{d+1} that are translations of the (d+1)-dimensional canonical unit cube along vectors (x_0, \ldots, x_d), where (x_0,\ldots, x_d)\in\mathbb{Z}^{d+1} and x_0+\ldots+x_d=0, onto the d-dimensional hyperplane given by the equation x_0+\ldots+x_d=0.
With the help of Mathematica, each cell of the 3D Hex board looks like the following.

Building Block of 3D Hex

Source code is attached here for those who want to play with it a little bit in Mathematica.

Needs["TetGenLink`"] (* TetGenConvexHull lives in the TetGenLink package *)
v = Orthogonalize[{{1, -1, 0, 0}, {0, 1, -1, 0}, {0, 0, 1, -1}}];
data = Flatten[Table[{{x, y, z, t}.v[[1]], {x, y, z, t}.v[[2]], {x, y, z, t}.v[[3]]}, {x, 0, 1}, {y, 0, 1}, {z, 0, 1}, {t, 0, 1}], 3];
{p, s} = TetGenConvexHull[data];
Graphics3D[GraphicsComplex[p, Polygon[s]]]

I forgot to mention that in the d-dimensional Hex game, we can have up to d players. Again, the game can never end in a tie. For those who are interested beyond this mere fact, please refer to David Gale’s great exposition on this topic (David Gale, The Game of Hex and the Brouwer Fixed-Point Theorem, The American Mathematical Monthly, vol. 86, 1979, pp. 818–827).

However, as a paper-and-pencil game, drawing a real Hex board is time-consuming. Here is another version of Hex with an easy-to-draw game board. The vertices represent the hexagonal regions of the original board and the edges represent the borders between them. Two players take turns coloring the vertices and try to form a monochromatic path from one side to the opposite side.

A 3 by 3 dual game board.

Indeed, mathematically speaking, this is the dual diagram of the original board. Its generalization to higher dimensions is easy to imagine. Define a graph whose vertex set is V=[n]^d. Two distinct vertices u, v\in V are adjacent if u_i-v_i\in\{0,1\} for all i\in[d], or the other way around, v_i-u_i\in\{0,1\} for all i\in[d].
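A small sketch (function and variable names are my own) that builds this dual-board graph and checks it against the 2-dimensional picture, where each interior cell has six neighbors:

```python
# Build the dual Hex board graph on [n]^d: u ~ v when the coordinate-wise
# difference u - v lies in {0, 1}^d (or v - u does), u != v.
from itertools import product

def dual_hex_graph(n, d):
    verts = list(product(range(1, n + 1), repeat=d))
    edges = set()
    for u in verts:
        for v in verts:
            if u != v and all(ui - vi in (0, 1) for ui, vi in zip(u, v)):
                edges.add((min(u, v), max(u, v)))
    return verts, edges

verts, edges = dual_hex_graph(3, 2)
print(len(verts), len(edges))  # 9 16; the center cell (2,2) has 6 neighbors
```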


The ultimate prisoner puzzle

People like prisoner puzzles. Here is the hardest one I have ever seen. I heard it from Prof. Boris Bukh at the Maths Department social hour a couple of weeks ago; he heard it at a maths conference. Since I have recently been quite into cryptography, especially setting up communication protocols, I would like to log the puzzle on my blog.

The storyline is like The Shawshank Redemption. You are locked into one of the cells in the jail. Unfortunately, you do not know how many prisoners there are in the jail. You do not even know whether you are the only one! The evil-hearted yet mathematically inclined warden wants to play a game with you. The rules are as follows.

  1. On the first day, you can write up a communication protocol. The warden will make identical photocopies of your protocol and distribute them among other prisoners.
  2. During the following days, every day, each prisoner will be given one black card and one white card, and they should turn in one of them to the warden according to your protocol. The remaining cards are discarded. The warden will line up all prisoners in a circle in his mind and imagine each of them passing his or her chosen card to the next person clockwise. After going through all this in his mind, he will redistribute the cards to the prisoners accordingly. Notice, the warden can line up the prisoners in any order he likes. Moreover, the order may differ from day to day.
  3. One day, you should call for a stop and immediately state the number of prisoners in this jail. All prisoners, including yourself, will be released if you get the number correct.

Remember, the evil-hearted warden knows every detail of your protocol. So the odds are always against you.
The following is the solution I came up with.

Denote the number of prisoners as p. First of all, we want to draft the protocol so that all prisoners know an upper bound of p, denoted as P. Before we say anything about the protocol, let us take a look at the significance of knowing such an upper bound.

If all prisoners know an upper bound P, then they can broadcast messages. To be precise, suppose someone wants to broadcast a single bit encoded by a color c: he turns in a card of color c, and everyone else keeps turning in white cards until he receives a black card, after which he keeps turning in black cards regardless of what cards he receives. Then after P days, all prisoners have received a black card if and only if c is black. (Think about why this is true.) Now everyone knows the color c, so the prisoners have succeeded in broadcasting a single bit. If the protocol says something about encoding messages by colors, one can broadcast any message to the others. Note that the prisoners might not know who sent the message.
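Here is a toy simulation (my own sketch, not part of the puzzle) of the worst case of this broadcast: the warden lines up all already-informed prisoners consecutively each day, so exactly one new prisoner sees the black card per day, and p days always suffice whenever P \geq p.

```python
# Worst-case spread of a black-card broadcast among p prisoners: the warden
# keeps the informed prisoners in one consecutive block, so each day only
# the prisoner just after the block receives a black card.
def broadcast_days(p):
    informed = 1       # the broadcaster
    days = 0
    while informed < p:
        informed += 1  # the block's clockwise successor is newly informed
        days += 1
    return days

print([broadcast_days(p) for p in range(1, 6)])  # [0, 1, 2, 3, 4]
```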

Though direct messaging is hard to realize in this situation, broadcasting will be good enough for us to figure out p.

Now, let us see how prisoners cooperatively figure out an upper bound P. We call it the first stage.

In the first stage, all prisoners will be labeled either visible or hidden. At the beginning, only I am labeled visible; all the rest are labeled hidden. We break the following days into several phases. At the beginning of the kth phase, all prisoners know an upper bound P_k on the number of currently visible prisoners; for instance, P_1=1. Now, all hidden prisoners broadcast a single bit of message telling the others that there are hidden prisoners. The broadcasting is done as before, i.e., whoever has received a black card once keeps sending black cards. The broadcasting can be done within P_k days because there are at most that many visible prisoners.

There are two possible outcomes. If no visible prisoner receives this message, then all prisoners are visible, and immediately all prisoners know P=P_k. Otherwise, on the next day, all visible prisoners turn in black cards. Whoever receives a black card changes his label to visible if it was hidden. The number of prisoners who change their labels is at most P_k. Thus we can use P_{k+1}=2P_k in the (k+1)th phase, as now there are at most P_k+P_k visible prisoners. Since at least one prisoner changes his label in each phase, eventually all prisoners become visible. Hence this process terminates at some phase and provides us with an upper bound P.
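The doubling bookkeeping of the first stage can be sketched the same way (again my own toy model): even if the warden reveals only one hidden prisoner per phase, the publicly known bound doubles each phase and the stage ends with a valid upper bound P \geq p.

```python
# Worst case for the first stage: each phase converts exactly one hidden
# prisoner to visible, while the publicly known bound doubles.
def first_stage_bound(p):
    visible, P = 1, 1    # phase 1: only I am visible, P_1 = 1
    while visible < p:
        visible += 1     # at least one prisoner becomes visible per phase
        P *= 2           # P_{k+1} = 2 * P_k
    return P

bounds = [first_stage_bound(p) for p in range(1, 7)]
print(bounds)  # [1, 2, 4, 8, 16, 32] -- always an upper bound on p
```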

After the first stage comes the second stage.

In the second stage, we will partition the prisoners into clans and try to refine the clans as much as possible. During the second stage, each prisoner knows the total number of clans, the name of each clan, and which clan he belongs to. At the beginning, I alone constitute the first clan and the rest of the prisoners constitute the second clan. This initial setup is written into the protocol, so everyone knows it.

The following is how I refine the clans. I broadcast a message that asks the members of a certain clan whether anyone turned in a black card on a given day. The prisoners in this clan broadcast a one-bit message to indicate their answer. Then I broadcast a message that asks whether anyone in this clan turned in a white card on that day. If both answers are ‘yes’, then two people in this clan turned in different cards on that given day, and I demand that this clan divide into two smaller clans according to the color of the cards turned in on that day. I can do the same thing with the cards received on that given day as well.

So in the second stage, I constantly refine the clans by asking questions clan by clan and day by day. Since the number of clans cannot exceed P, this refining process stops and the number of clans stabilizes. Moreover, once it stabilizes, the prisoners in each clan turn in the same cards and receive the same cards as well; otherwise, some day I would interrogate the members of this clan and separate them into smaller clans.

Now we may assume the prisoners in each clan always turn in the same cards and receive the same cards. This is amazing!

Suppose c_i is the number of prisoners in the ith clan C_i for i=1,\ldots, n. Assume I am in C_1, so c_1=1. Initially, the values of all the other c_i are unknown. Our goal is to find out each c_i, though it is enough to find \sum_{i=1}^nc_i.

Notation. [n] is the set of all naturals from 1 to n. If I\subset [n], then c_I := \{c_i: i\in I\} and \mathrm{span}(c_I) is the set of all linear combinations of \{c_i: i\in I\}.

For r = 1,\ldots, n-1, we claim that for every r-element subset I\subset [n] and every i\in I, c_i can be written as a linear combination of \{c_j: j\in[n]-I\}; in abbreviation, we shall write c_i\in \mathrm{span}(c_{[n]-I}). Note that when we say ‘some variable can be written as a linear combination of other variables’, we mean that all prisoners know this information. In other words, all prisoners keep track of these linear dependencies along the way. We omit the details of how the prisoners share this information in the following proof and leave them to the readers.

Before we prove the claim, let us take a look at its consequence. When r=n-1, the claim says c_2, \ldots, c_n can be written as linear combinations of c_1=1. Thus we know c_i for all i\in[n], and hence the total number of prisoners.

Now we prove the claim by induction on r. If r=1, we shall prove c_i\in \mathrm{span}(c_{[n]-\{i\}}) for all i\in[n]. This is relatively easy. Let the prisoners in the clan C_i send out black cards. Some of the other clans will receive those black cards (not all the black cards can return to C_i itself, for a proper subset of a circle is never equal to its own clockwise shift, and the count of black cards is conserved). Thus c_i is the sum of the sizes of the clans which receive the black cards.

Suppose the claim is true for some r. We shall prove that for every (r+1)-element subset I\subset [n] and every i\in I, c_i\in \mathrm{span}(c_J), where J=[n]-I.

Fix some i_0\in I. By the induction hypothesis applied to the r-element subset I-\{i_0\}, for all i\in I with i\neq i_0 we have
c_i = \alpha_ic_{i_0}+\beta_i(c_J),
where \beta_i(c_J)\in \mathrm{span}(c_J).
This also holds for i=i_0, as we may set \alpha_{i_0}=1 and \beta_{i_0}=0.

Let the index set I_+ be \{i\in I: \alpha_i > 0\} and I_-=I-I_+. As i_0\in I_+, I_+ is non-empty.

Now we demand that all clans indexed by I_+ send out black cards. Suppose the clans indexed by I_+'\subset I_+ are those that send out black cards but receive white cards, and the clans indexed by I_-'\subset I_- and J'\subset J are those that send out white cards but receive black cards. Counting black cards, we have the following equation.
\sum_{i\in I'_+}c_i = \sum_{i\in I'_-}c_i+\sum_{j\in J'}c_j

\sum_{i\in I'_+}\left(\alpha_ic_{i_0}+\beta_i(c_J)\right)=\sum_{i\in I'_-}\left(\alpha_ic_{i_0}+\beta_i(c_J)\right)+\sum_{j\in J'}c_j

The \alpha-coefficient on the left-hand side is positive, while on the right-hand side it is nonpositive. (Note that I'_+ is non-empty: otherwise every black-sending clan would also receive a black card, forcing the set of black-card holders to be invariant under the clockwise shift, which is impossible since J is non-empty.) Thus from the equation above we can solve for c_{i_0} as a linear combination of c_J, and hence we can write c_i as a linear combination of c_J for all i\in I.



Today, probably, is not my day. I dropped my phone on the ground when I was rushing to my first recitation. The screen was totally shattered. And I had no time to mourn the loss of my first smartphone because I had to arrive in the classroom on time.

Cobweb style for Halloween.

In the afternoon, I gave it a try at the Apple store. Unfortunately, I would have had to pay 200+ dollars for a replacement phone. As I was hesitating over whether to take the offer, they told me that they actually could do nothing with the device because it was purchased outside the country. I left the Apple store with the number for AppleCare.

Then I gave it a second try. The customer service representative, whose name was Jeremy, had me hold the line for about 10 minutes because the case was rare and he was asking for help from a senior advisor. At first, he told me that I would probably need to send my phone back to China, get it fixed, and have it sent back, and he transferred me to the customer service in China. After about 10 minutes of enjoyable pop songs with no one answering the call, Jeremy told me that he had been looking for an alternative solution so that I wouldn’t have to go through the procedure in China.

Finally, he transferred me to the senior advisor, whose name was Diego. He found a new policy for devices purchased abroad which I thought was perfect for me. He charged me 99 dollars to upgrade my phone with AppleCare, which will cover two incidents with a 49-dollar extra charge each time. The only downside of this deal is that what I got is a new iPhone 5 locked to the AT&T network instead of a factory-unlocked one. I think overall it is a good deal because it’s cheaper and gives more protection.

During the evening, I went into the Apple store again and replaced my phone with a new one. I don’t know why it took them about an hour to get it done, but anyway, it’s fixed.

Lesson learned. Buy a case. Just in case.

Movie Reviews


Yesterday I went to the cinema to see Looper. Although it has been out for more than a week, the theater was still packed.





Fudan University Affiliated High School ‘Just for you’ inter-school song dedication list