Inverse Function Theorem

From elementary calculus we recall that a continuous function on an interval is invertible if and only if it is strictly monotonic, that is, strictly increasing or strictly decreasing, over that interval. We can see how a condition on the derivative arises by looking at the linear approximation to \(f\) in the neighbourhood of some point \(x=a\), \(f(x)\approx f(a)+f'(a)\cdot(x-a)\). Clearly, to be able to invert this approximation and express, at least locally, \(x\) in terms of \(f(x)\) we must have \(f'(a)\neq0\).

As we’ve seen, we can similarly approximate a function \(f:\RR^n\mapto\RR^m\) in the neighbourhood of a point \(a\in\RR^n\) as \(f(x)\approx f(a)+J_f(a)(x-a)\). This suggests that inverting \(f\) near \(a\), at least to first order, requires the Jacobian matrix \(J_f(a)\) to be invertible. In particular we must have \(n=m\), in which case the determinant of this matrix is called the Jacobian determinant of the map \(f\). We now state the important inverse function theorem.

Theorem (Inverse function theorem) Suppose \(f:\RR^n\mapto\RR^n\) is smooth on some open subset of \(\RR^n\). If \(\det\mathbf{J}_f(a)\neq0\) at some point \(a\) in that subset, then there exists an open neighbourhood \(U\) of \(a\) such that \(V=f(U)\) is open and \(f:U\mapto V\) is a diffeomorphism. In this case, if \(x\in U\) and \(y=f(x)\) then \(J_{f^{-1}}(y)=(J_f(x))^{-1}\).

Note that, conversely, if \(f:U\mapto V\) is a diffeomorphism of open sets then we may form the identity function \(f\circ f^{-1}=\id_V\) on \(V\). Clearly, for all \(y\in V\), \(J_{f\circ f^{-1}}(y)=I_n\), the \(n\times n\) identity matrix. By the chain rule we then have \(I_n=J_{f\circ f^{-1}}(y)=J_f(x)J_{f^{-1}}(y)\) for any \(y=f(x)\in V\), and so \(J_f(x)\) is invertible at all points \(x\in U\).
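The relation \(J_f(x)J_{f^{-1}}(y)=I_n\) can be checked numerically. The following is a minimal sketch, assuming the polar coordinates map \((r,\theta)\mapsto(r\cos\theta,r\sin\theta)\) with local inverse \((x,y)\mapsto(\sqrt{x^2+y^2},\operatorname{atan2}(y,x))\). The helper names `jac_polar`, `jac_polar_inverse` and `matmul2` are illustrative only, and the sample point is arbitrary.

```python
import math

def polar(r, theta):
    """The polar coordinates map (r, theta) -> (x, y)."""
    return (r * math.cos(theta), r * math.sin(theta))

def jac_polar(r, theta):
    """Analytic Jacobian matrix of the polar map at (r, theta)."""
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta),  r * math.cos(theta)]]

def jac_polar_inverse(x, y):
    """Analytic Jacobian of (x, y) -> (sqrt(x^2 + y^2), atan2(y, x)), away from the origin."""
    r2 = x * x + y * y
    r = math.sqrt(r2)
    return [[x / r,   y / r],
            [-y / r2, x / r2]]

def matmul2(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Arbitrary sample point with r > 0.
r, theta = 2.0, 0.7
x, y = polar(r, theta)

# By the chain rule this product should be the 2x2 identity matrix,
# up to floating-point rounding.
product = matmul2(jac_polar(r, theta), jac_polar_inverse(x, y))
print(product)
```

Up to rounding error the printed matrix should be the identity, which is the statement \(J_{f^{-1}}(y)=(J_f(x))^{-1}\) at this one point.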

Example In one dimension, the function \(f(x)=x^3\) is invertible with \(f^{-1}(x)=x^{1/3}\). Notice though that \(f'(x)=3x^2\), so that \(f'(0)=0\) and the hypothesis of the inverse function theorem is violated at \(x=0\). The point is that, although \(f^{-1}\) exists and is continuous, it is not differentiable at \(f(0)=0\), so \(f\) is not a diffeomorphism on any neighbourhood of \(0\).
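To see the failure of differentiability concretely, we can look at symmetric difference quotients of the cube root. This is only a numerical sketch. The helpers `cube_root` and `difference_quotient` are hypothetical names introduced for the illustration, and the step sizes are arbitrary.

```python
import math

def cube_root(y):
    """Real cube root, defined for negative arguments as well."""
    return math.copysign(abs(y) ** (1.0 / 3.0), y)

def difference_quotient(g, y0, h):
    """Symmetric difference quotient of g at y0 with step h."""
    return (g(y0 + h) - g(y0 - h)) / (2.0 * h)

# At y = f(0) = 0 the quotient equals h**(-2/3), so it grows without
# bound as h shrinks: the inverse has no finite derivative there.
quotients = [difference_quotient(cube_root, 0.0, 10.0 ** -k) for k in (2, 4, 6, 8)]
print(quotients)

# Away from the critical point the theorem applies: at y = f(2) = 8 the
# derivative of the inverse should be 1 / f'(2) = 1 / 12.
slope_at_8 = difference_quotient(cube_root, 8.0, 1e-6)
print(slope_at_8, 1.0 / 12.0)
```

The quotients at \(0\) increase as the step shrinks, whereas at \(y=8\) the quotient settles near \(1/f'(2)=1/12\), as the theorem predicts.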

A useful consequence of the inverse function theorem is the following. If \(U\subset\RR^n\) is an open subset on which a map \(f:U\mapto\RR^n\) is smooth and \(\det\mathbf{J}_f(x)\neq0\) for all \(x\in U\), then \(f(U)\) is open, and if in addition \(f\) is injective then \(f:U\mapto f(U)\) is a diffeomorphism. To see this, note that since \(\det\mathbf{J}_f(x)\neq0\) at every \(x\in U\), the inverse function theorem gives, for each \(x\), open sets \(U_x\subset U\) and \(V_x=f(U_x)\) with \(x\in U_x\), \(V_x\) open in \(\RR^n\) and \(f:U_x\mapto V_x\) a diffeomorphism. Since \(f(x)\in V_x\subset f(U)\), every point of \(f(U)\) has an open neighbourhood contained in \(f(U)\), so \(f(U)\) is open. If \(f\) is injective then \(f^{-1}:f(U)\mapto U\) exists, and on each \(V_x\) it coincides with the smooth inverse of \(f:U_x\mapto V_x\). Smoothness is a local property and the sets \(V_x\) cover \(f(U)\), so \(f^{-1}\) is smooth on \(f(U)\) and \(f:U\mapto f(U)\) is indeed a diffeomorphism.
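The injectivity hypothesis cannot be dropped. As an illustration that is not taken from the text above, consider the map \((x,y)\mapsto(e^x\cos y,e^x\sin y)\) on \(\RR^2\), whose Jacobian determinant is \(e^{2x}\neq0\) everywhere, yet which sends \((x,y)\) and \((x,y+2\pi)\) to the same point. A short sketch, with the hypothetical names `exp_map` and `exp_map_jacobian_det` and an arbitrary sample point:

```python
import math

def exp_map(x, y):
    """The map (x, y) -> (e^x cos y, e^x sin y)."""
    return (math.exp(x) * math.cos(y), math.exp(x) * math.sin(y))

def exp_map_jacobian_det(x, y):
    """Analytic Jacobian determinant e^(2x), which never vanishes."""
    return math.exp(2.0 * x)

# Two distinct points differing by 2*pi in the second coordinate.
p = (0.3, 1.0)
q = (0.3, 1.0 + 2.0 * math.pi)

# Their images agree up to rounding, so the map is not injective on R^2.
same_image = all(math.isclose(a, b, abs_tol=1e-12)
                 for a, b in zip(exp_map(*p), exp_map(*q)))
print(same_image, exp_map_jacobian_det(*p), exp_map_jacobian_det(*q))
```

The map is therefore a diffeomorphism near each point but not globally. Restricting to a strip such as \(\RR\times(0,2\pi)\), on which it is injective, restores the conclusion.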

A coordinate system, \((y^1,\dots,y^n)\), for some subset \(U\) of points of \(\RR^n\) is simply a map
\begin{equation}
(x^1,\dots,x^n)\mapsto(y^1(x^1,\dots,x^n),\dots,y^n(x^1,\dots,x^n)), \label{map:coordmap}
\end{equation}
allowing us to (re-)coordinatize points \(x=(x^1,\dots,x^n)\in U\). Intuitively, for the \(y^i\) to be good coordinates, the map \eqref{map:coordmap} should be a diffeomorphism — points should be uniquely identified and we should be able to differentiate at will. Using the inverse function theorem we can test this by examining the Jacobian of the transformation.

Example Consider the coordinate transformation maps of the previous section. For polar coordinates in the plane the map \((r,\theta)\mapsto(r\cos\theta,r\sin\theta)\) defined on the open set \((0,\infty)\times\RR\) is smooth with Jacobian determinant \(r\), which is non-zero everywhere on the domain. Thus the inverse function theorem tells us that the restriction of this map to any open subset on which it is injective is a diffeomorphism onto its image. We could restrict, for example, to \((0,\infty)\times(0,2\pi)\) and the polar coordinates map is then a diffeomorphism onto the complement of the non-negative \(x\)-axis. For cylindrical coordinates the Jacobian determinant is \(\rho\). Restricting to \((0,\infty)\times(0,2\pi)\times\RR\) the cylindrical polar coordinates map is a diffeomorphism onto the complement of the half of the \(xz\)-plane corresponding to non-negative \(x\)-values. In the case of spherical polar coordinates the Jacobian determinant is \(r^2\sin\theta\), which is non-zero for \(r>0\) and \(0<\theta<\pi\), so restricting to \((0,\infty)\times(0,\pi)\times(0,2\pi)\) we have a diffeomorphism onto the image.
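The quoted Jacobian determinants can be sanity-checked by finite differences. The sketch below does this for spherical polar coordinates, assuming the convention \((r,\theta,\phi)\mapsto(r\sin\theta\cos\phi,r\sin\theta\sin\phi,r\cos\theta)\), which may differ from the one used in the previous section. The helpers `numerical_jacobian` and `det3` are hypothetical names, and the sample point and step size are arbitrary.

```python
import math

def spherical(r, theta, phi):
    """Spherical polar map, with theta the polar angle (assumed convention)."""
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))

def numerical_jacobian(f, point, h=1e-6):
    """Central-difference approximation to the Jacobian matrix of f at point."""
    n = len(point)
    columns = []
    for j in range(n):
        plus, minus = list(point), list(point)
        plus[j] += h
        minus[j] -= h
        # Column j holds the partial derivatives with respect to coordinate j.
        columns.append([(a - b) / (2.0 * h)
                        for a, b in zip(f(*plus), f(*minus))])
    # Transpose so that entry [i][j] is d f_i / d x_j.
    return [[columns[j][i] for j in range(n)] for i in range(len(columns[0]))]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

# Arbitrary sample point in (0, inf) x (0, pi) x (0, 2*pi).
r, theta, phi = 1.5, 0.9, 2.0
det_numeric = det3(numerical_jacobian(spherical, (r, theta, phi)))
det_exact = r ** 2 * math.sin(theta)
print(det_numeric, det_exact)
```

The two printed values should agree to several decimal places. The same helper can be pointed at the polar or cylindrical maps, with a \(2\times2\) or \(3\times3\) determinant as appropriate.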