Distance between a point and a hyperplane not reached

Let’s investigate the following question: “Is the distance between a point and a hyperplane always reached?”

In order to answer the question, we consider a normed vector space \((E, \Vert \cdot \Vert)\) and a hyperplane \(H\) of \(E\). \(H\) is the kernel of a non-zero linear form \(u\), namely \(H=\{x \in E \text{ | } u(x)=0\}\).

The case of finite dimensional vector spaces

When \(E\) is of finite dimension, the distance \(d(a,H)=\inf\{\Vert h-a \Vert \text{ | } h \in H\}\) between any point \(a \in E\) and a hyperplane \(H\) is reached at a point \(b \in H\). The proof is rather simple. Consider a point \(c \in H\). The set \(S = \{h \in H \text{ | } \Vert a- h \Vert \le \Vert a-c \Vert \}\) is non-empty as it contains \(c\), and bounded as for \(h \in S\) we have \(\Vert h \Vert \le \Vert a-h \Vert + \Vert a \Vert \le \Vert a-c \Vert + \Vert a \Vert\). \(S\) is equal to \(D \cap H\) where \(D\) is the inverse image of the closed real segment \([0,\Vert a-c \Vert]\) by the continuous map \(f: x \mapsto \Vert a- x \Vert\). Therefore \(D\) is closed. \(H\) is also closed, as is any linear subspace of a finite dimensional vector space. \(S\), being the intersection of two closed subsets of \(E\), is also closed. Hence \(S\) is closed and bounded in a finite dimensional space, therefore compact, and the restriction of \(f\) to \(S\) reaches its infimum at some point \(b \in S \subset H\). As \(\Vert a-h \Vert > \Vert a-c \Vert \ge \Vert a-b \Vert\) for \(h \in H \setminus S\), we get \(d(a,H)=\Vert a-b \Vert\).
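
As a small worked illustration (the specific space and points are chosen here for the example and are not part of the argument above), take \(E=\mathbb{R}^2\) with the norm \(\Vert (x_1,x_2) \Vert = \max(\vert x_1 \vert, \vert x_2 \vert)\), the hyperplane \(H=\{x \in \mathbb{R}^2 \text{ | } x_2=0\}\) and the point \(a=(0,1)\). For \(h=(t,0) \in H\) we have:
\[\Vert a-h \Vert = \max(\vert t \vert, 1) \ge 1\] with equality exactly when \(\vert t \vert \le 1\). Hence \(d(a,H)=1\) is reached, but at every point \((t,0)\) with \(\vert t \vert \le 1\): in finite dimension a minimizer exists but need not be unique.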

An initial consideration regarding infinite dimensional vector spaces

We now suppose that \(E\) is of infinite dimension. We first notice that \(H\) is not necessarily closed. One can prove that \(H\) is closed if and only if \(u\) is continuous. When \(u\) is discontinuous, \(H\) is dense in \(E\), hence for any \(a \in E\) we have \(d(a,H)=0\).

For example, consider the space \(X\) of real-valued smooth functions on the interval \([0, 1]\) with the uniform norm, that is:
\[\Vert f \Vert = \sup\limits_{x \in [0,1]} \vert f(x) \vert\] The derivative-at-a-point map \(T : X \to \mathbb{R}\) given by \(T(f)=f^\prime(0)\) is linear but not continuous. Indeed, consider the sequence \(\displaystyle f_n(x)=\frac{\sin(n^2x)}{n}\) for \(n \ge 1\). This sequence converges uniformly to the constantly zero function, but \(T(f_n)=n \to + \infty\) as \(n \to + \infty\), whereas \(T(f_n) \to T(0) =0\) would hold for a continuous map.

We now check directly that the hyperplane \(H = \{f \in X \text{ | } T(f)=0\}\) is dense in \(X\). For any \(f \in X\), consider the sequence \(\displaystyle g_n(x)=f(x)-f^\prime(0)\frac{\sin(nx)}{n}\) for \(n \ge 1\). We have \(g_n^\prime(x)=f^\prime(x)-f^\prime(0) \cos(nx)\) and therefore \(g_n^\prime(0)=0\). This means that \(T(g_n)=0\), in other words that \(g_n\) belongs to \(H\). As \(\Vert f - g_n \Vert \le \vert f^\prime(0) \vert / n\), the sequence \((g_n)\) converges uniformly to \(f\) and we conclude that \(d(f,H)=0\).
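
The approximation argument above can be illustrated numerically. The following is a minimal sketch, assuming the sample function \(f(x)=e^x\) (so that \(f^\prime(0)=1\)) and a uniform grid on \([0,1]\) as a stand-in for the supremum; the helper names are hypothetical and not part of the original argument.

```python
import math

def g(f, fprime0, n, x):
    """g_n(x) = f(x) - f'(0) * sin(n x) / n, an element of H = {T = 0}."""
    return f(x) - fprime0 * math.sin(n * x) / n

def grid_sup_distance(f, fprime0, n, samples=10001):
    """Approximate ||f - g_n|| = sup |f(x) - g_n(x)| on a uniform grid of [0, 1].

    A grid maximum only bounds the true supremum from below; here the
    exact value is at most |f'(0)| / n.
    """
    worst = 0.0
    for k in range(samples):
        x = k / (samples - 1)
        worst = max(worst, abs(f(x) - g(f, fprime0, n, x)))
    return worst

def derivative_at_zero(func, step=1e-6):
    """Central finite-difference estimate of func'(0) (an approximation)."""
    return (func(step) - func(-step)) / (2 * step)

# Assumed sample data: f = exp, so f'(0) = 1.
f, fprime0 = math.exp, 1.0
for n in (10, 100, 1000):
    dist = grid_sup_distance(f, fprime0, n)
    slope = derivative_at_zero(lambda x: g(f, fprime0, n, x))
    print(f"n={n:5d}  ||f - g_n|| ~ {dist:.6f}  g_n'(0) ~ {slope:.2e}")
```

The printed distances should shrink roughly like \(1/n\) while the estimated slope at \(0\) stays close to zero, which is the numerical shadow of \(g_n \in H\) and \(g_n \to f\).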

The case of Hilbert spaces

We now suppose that \(E\) is a Hilbert space and \(H\) a closed hyperplane defined by the equation \(u(x)=0\). \(H\) is a non-empty closed convex subset of \(E\). According to the Hilbert projection theorem, for every \(a \in E\) there exists a unique \(b \in H\) for which \(\Vert a-b \Vert\) is minimized over \(H\). The distance \(d(a,H)\) is therefore always reached.
The proof of the Hilbert projection theorem is based on the parallelogram identity.
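
For a concrete instance, here is a minimal sketch in the finite dimensional Hilbert space \(\mathbb{R}^n\) with the Euclidean inner product, assuming the linear form is written \(u(x)=\langle w, x \rangle\) for a non-zero vector \(w\) (the names `w` and `project_onto_hyperplane` are illustrative). In that setting the minimizer is the orthogonal projection \(\displaystyle b = a - \frac{\langle w, a \rangle}{\langle w, w \rangle} w\).

```python
import math

def dot(x, y):
    """Euclidean inner product of two vectors given as lists."""
    return sum(xi * yi for xi, yi in zip(x, y))

def project_onto_hyperplane(a, w):
    """Orthogonal projection of a onto H = {x : <w, x> = 0} in R^n."""
    ww = dot(w, w)
    if ww == 0:
        raise ValueError("w must be non-zero")
    t = dot(w, a) / ww
    return [ai - t * wi for ai, wi in zip(a, w)]

def euclidean_norm(v):
    return math.sqrt(dot(v, v))

# Assumed sample data.
a = [1.0, 2.0, 3.0]
w = [1.0, 1.0, 1.0]
b = project_onto_hyperplane(a, w)
distance = euclidean_norm([ai - bi for ai, bi in zip(a, b)])
print(b, dot(w, b), distance, abs(dot(w, a)) / euclidean_norm(w))
```

With these sample values the projection is \(b=(-1,0,1)\), and the last two printed numbers agree: the distance \(\Vert a-b \Vert\) coincides with \(\vert u(a) \vert / \Vert w \Vert\).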

A counterexample in the Banach space \(c_0\)

Finally we look at a counterexample: a Banach space in which the distance from a point to a closed hyperplane is not reached.
We take for \(E\) the sequence space \(c_0\) of real sequences converging to zero equipped with the supremum norm \(\Vert x \Vert = \sup\limits_n \vert x_n \vert\). \(c_0\) is a Banach space. The linear form:
\[\begin{array}{l|rcl}
u : & c_0 & \longrightarrow & \mathbb{R} \\
& x & \longmapsto & \displaystyle \sum_{n=0}^\infty 2^{-n}x_n
\end{array}\] is well defined and continuous, as for \(x \in c_0\) the inequality \(\displaystyle \vert u(x) \vert \le \left(\sum_{n=0}^\infty 2^{-n}\right) \Vert x \Vert = 2 \Vert x \Vert\) holds. Consequently the hyperplane \(H=u^{-1}(\{0\})\) is closed. If we denote by \(\Vert l \Vert\) the norm \(\Vert l \Vert=\sup\limits_{\Vert x \Vert =1} \vert l(x) \vert\) of any continuous linear form \(l\), we claim that \(\Vert u \Vert=2\). The above inequality proves that \(\Vert u \Vert \le 2\). For \(p \in \mathbb{N}\) we also have \(u(e(p)) = 2(1-2^{-(p+1)})\), where \(e(p)\) is the real sequence (belonging to \(c_0\)) with elements \(e(p)_n\) equal to \(1\) for \(n \le p\) and to \(0\) for \(n > p\). As \(\Vert e(p) \Vert = 1\), we get \(\Vert u \Vert \ge 2(1-2^{-(p+1)})\) for all \(p \in \mathbb{N}\) and therefore:
\[\Vert u \Vert = 2\] as desired. We now use the following general statement (G): for any non-zero continuous linear form \(u\) on a normed vector space \(E\), with kernel \(H\), and any point \(a \in E\), we have \(\displaystyle d(a,H)=\frac{\vert u(a) \vert}{\Vert u \Vert}\).
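
As a quick sanity check of the value \(\Vert u \Vert = 2\) and of what (G) then gives in \(c_0\), here is a minimal sketch using exact rational arithmetic. It assumes sequences are represented by finite Python lists, implicitly padded with zeros (so only finitely supported elements of \(c_0\) are covered); the function names are illustrative.

```python
from fractions import Fraction

def u(x):
    """u(x) = sum_n 2^{-n} x_n for a finitely supported sequence x (a list)."""
    return sum((Fraction(xn) / 2**n for n, xn in enumerate(x)), Fraction(0))

def e(p):
    """e(p): the sequence equal to 1 for n <= p and 0 afterwards (sup norm 1)."""
    return [1] * (p + 1)

def distance_to_H(a):
    """d(a, H) = |u(a)| / ||u|| with ||u|| = 2, as given by statement (G)."""
    return abs(u(a)) / 2

for p in (0, 1, 5, 20):
    # u(e(p)) = 2 (1 - 2^{-(p+1)}) stays strictly below 2 but tends to 2.
    print(p, u(e(p)), float(u(e(p))))

print(distance_to_H([1]))        # a = (1, 0, 0, ...)
print(distance_to_H([1, -2]))    # u(a) = 1 - 1 = 0, so a lies in H
```

The values \(u(e(p))\) never equal \(2\): the supremum defining \(\Vert u \Vert\) is not attained on the unit sphere of \(c_0\), which is the phenomenon exploited in the rest of this section.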

Suppose that for some \(a \in c_0 \setminus H\) the distance \(d(a,H)\) were reached at a point \(b \in H\). Using the statement (G), we would have:
\[\vert u(a) \vert = 2 \Vert a- b \Vert\] according to \(\Vert u \Vert = 2\). As \(\displaystyle u(b) = \sum_{n=0}^\infty 2^{-n}b_n=0\), we can reformulate above equality as:
\[\left \vert \displaystyle \sum_{n=0}^\infty 2^{-n}(a_n-b_n) \right \vert = 2 \Vert a - b \Vert = 2 \sup\limits_n \vert a_n - b_n \vert\] which is possible only if \(\vert a_p - b_p \vert = \Vert a - b \Vert\) for all \(p \in \mathbb{N}\). Indeed, given any \(p \in \mathbb{N}\) we have:
\[\begin{aligned}
\left \vert \displaystyle \sum_{n=0}^\infty 2^{-n}(a_n-b_n) \right \vert &= \left \vert 2^{-p}(a_p-b_p) + \displaystyle \sum_{\substack{n=0 \\ n \neq p}}^\infty 2^{-n}(a_n-b_n) \right \vert\\
&\le 2^{-p}\vert a_p-b_p \vert + \left \vert \displaystyle \sum_{\substack{n=0 \\ n \neq p}}^\infty 2^{-n}(a_n-b_n) \right \vert\\
&\le 2^{-p} \vert a_p-b_p \vert + 2 \Vert a- b \Vert - 2^{-p} \Vert a- b \Vert
\end{aligned}\] and the last right hand term can be equal to \(2 \Vert a - b \Vert\) only if \(\vert a_p - b_p \vert = \Vert a - b \Vert\). This leads to a contradiction: as \(a_p \to 0\) and \(b_p \to 0\) for \(p \to +\infty\), we get \(\Vert a - b \Vert=0\), that is \(a=b \in H\), which is not possible as we supposed \(a \notin H\).
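
The conclusion can be made concrete: the infimum is approached but never attained. The sketch below is an illustration under stated assumptions, not part of the proof. It takes the sample point \(a=(1,0,0,\dots)\), for which (G) gives \(d(a,H)=1/2\), and the hypothetical family \(h^{(p)} = a - t_p\, e(p)\) with \(t_p = u(a)/u(e(p))\), which lies in \(H\) and satisfies \(\Vert a - h^{(p)} \Vert = \vert t_p \vert\). Sequences are represented by finite lists padded with zeros.

```python
from fractions import Fraction

def u(x):
    """u(x) = sum_n 2^{-n} x_n for a finitely supported sequence x (a list)."""
    return sum((Fraction(xn) / 2**n for n, xn in enumerate(x)), Fraction(0))

def sup_norm(x):
    """Supremum norm of a finitely supported sequence."""
    return max((abs(Fraction(xn)) for xn in x), default=Fraction(0))

def near_minimizer(a, p):
    """Return h = a - t e(p), an element of H, where t = u(a) / u(e(p))."""
    length = max(len(a), p + 1)
    padded = [Fraction(v) for v in a] + [Fraction(0)] * (length - len(a))
    e_p = [Fraction(1)] * (p + 1) + [Fraction(0)] * (length - p - 1)
    t = u(padded) / u(e_p)
    return [an - t * en for an, en in zip(padded, e_p)]

a = [1]  # assumed sample point a = (1, 0, 0, ...), with d(a, H) = 1/2
for p in (0, 1, 3, 10, 30):
    h = near_minimizer(a, p)
    # zip truncates the zero-padded copy of a to the length of h
    gap = sup_norm([Fraction(an) - hn for an, hn in zip(a + [0] * len(h), h)])
    print(p, u(h), gap, float(gap))
```

Each \(h^{(p)}\) satisfies \(u(h^{(p)})=0\), and the gaps \(\Vert a - h^{(p)} \Vert\) decrease towards \(1/2\) while staying strictly above it, in line with the contradiction obtained above.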

Proof of the general statement (G)

Take any non-zero continuous linear form \(u\) on a normed vector space \(E\) and \(a \in E\). We can suppose that \(a \notin H\) (where \(H\) is the kernel of \(u\)) as the statement is trivial when \(a \in H\). For any \(h \in H\) the norm of the vector \((a-h)/\Vert a - h \Vert \) is equal to \(1\). Hence by definition of \(\Vert u \Vert\) we have:
\[\left \vert u \left (\frac{a-h}{\Vert a-h\Vert} \right ) \right \vert = \frac{\vert u(a-h) \vert}{\Vert a-h\Vert}=\frac{\vert u(a) \vert}{\Vert a-h\Vert} \le \Vert u \Vert\] Therefore \(\Vert a-h \Vert \ge \vert u(a) \vert / \Vert u \Vert\) for all \(h \in H\) and \(d(a,H) \ge \vert u(a) \vert /\Vert u \Vert\).
On the other hand for all \(\epsilon > 0\) one can find a vector \(x\) whose norm is equal to one with \(\vert u(x) \vert \ge \Vert u \Vert/(1+\epsilon)\). Let \(\lambda = u(x)/u(a)\), which is well defined as \(u(a) \neq 0\) and non-zero as \(u(x) \neq 0\). The vector \(h= x- \lambda a\) belongs to \(H\) as \(u(h)=u(x)-\lambda u(a)=0\), and the equality \(\displaystyle x=\lambda \left(\frac{1}{\lambda}h+a \right)\) holds. As \(\Vert x \Vert = 1\) we get:
\[1=\frac{\vert u(x) \vert}{\vert u(a) \vert} \left \Vert \frac{1}{\lambda}h+a \right \Vert \ge \frac{\Vert u \Vert}{\vert u(a) \vert(1+\epsilon)} \left \Vert \frac{1}{\lambda}h+a \right \Vert\] which implies that \(\displaystyle d(a,H) \le \left \Vert a - \left(-\frac{1}{\lambda}h \right) \right \Vert \le \frac{\vert u(a) \vert}{\Vert u \Vert} (1 + \epsilon)\) as \(\displaystyle -\frac{1}{\lambda}h \in H\).

As the inequality holds for \(\epsilon > 0 \) as small as we want, we get the conclusion \(\displaystyle d(a,H) \le \frac{\vert u(a) \vert}{\Vert u \Vert}\) and the statement (G) is proven.
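
The construction used in the second half of the proof can be traced on a finite dimensional example. The following is a minimal sketch under assumptions that are not in the text above: \(E=\mathbb{R}^n\) with the supremum norm and \(u(x)=\sum_i w_i x_i\), in which case \(\Vert u \Vert = \sum_i \vert w_i \vert\) and the supremum defining \(\Vert u \Vert\) is attained at the sign vector of \(w\), so that no \(\epsilon\) is needed. The names are illustrative.

```python
from fractions import Fraction

def u(w, x):
    """Linear form u(x) = sum_i w_i x_i, computed exactly."""
    return sum(Fraction(wi) * xi for wi, xi in zip(w, x))

def sup_norm(x):
    return max(abs(Fraction(xi)) for xi in x)

def closest_point(w, a):
    """Follow the proof of (G): build b in H with ||a - b|| = |u(a)| / ||u||.

    Assumes w is non-zero and u(a) != 0 (that is, a is not in H).
    """
    x = [1 if wi >= 0 else -1 for wi in w]        # ||x|| = 1 and u(x) = ||u||
    lam = u(w, x) / u(w, a)                        # lambda = u(x) / u(a)
    h = [xi - lam * ai for xi, ai in zip(x, a)]    # h = x - lambda a, in H
    return [-hi / lam for hi in h]                 # b = -h / lambda, in H

# Assumed sample data.
w, a = [1, -2, 3], [1, 1, 1]
b = closest_point(w, a)
norm_u = sum(abs(wi) for wi in w)                  # dual norm of the sup norm
print(b, u(w, b), sup_norm([ai - bi for ai, bi in zip(a, b)]), abs(u(w, a)) / norm_u)
```

Here the distance is reached because the unit sphere of a finite dimensional space is compact; in \(c_0\) the same construction only yields points of \(H\) whose distance to \(a\) exceeds \(\vert u(a) \vert / \Vert u \Vert\) by an arbitrarily small amount.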

Math Counterexamples