In this chapter, we consider the differential calculus of mappings from one Euclidean space to another, that is, mappings $f:\mathbb{R}^n\to\mathbb{R}^m$. In first-year calculus, you considered the case $n=m=1$, or $n=1$ and $m=2$ or $3$. Examples of functions that you might have encountered were single-variable functions, parametrized curves, and perhaps functions of several variables. If now $f:\mathbb{R}^n\to\mathbb{R}^m$ with $m\ge 2$ then $f$ has $m$ component functions since $f(x)\in\mathbb{R}^m$ for each $x$. We can therefore write $f=(f_1,f_2,\dots,f_m)$, and $f_i:\mathbb{R}^n\to\mathbb{R}$ is called the $i$th component of $f$. In this chapter, unless stated otherwise, we equip $\mathbb{R}^n$ with the Euclidean 2-norm
\[ \|x\|_2=\sqrt{x_1^2+x_2^2+\cdots+x_n^2}. \]
For this reason, we will omit the subscript in $\|x\|_2$ and simply write $\|x\|$.

Differentiation

Let $U\subseteq\mathbb{R}^n$ and let $f:U\to\mathbb{R}^m$ be a function. How should we define differentiability of $f$ at some point $x_0\in U$? Recall that for a function $f:I\to\mathbb{R}$, where $I\subseteq\mathbb{R}$ is an interval, we say that $f$ is differentiable at $x_0\in I$ if
\[ \lim_{x\to x_0}\frac{f(x)-f(x_0)}{x-x_0} \]
exists. In this case, we denote the limit by $f'(x_0)$ and we call $f'(x_0)$ the derivative of $f$ at $x_0$. As it is written, the above definition does not make sense for $n\ge 2$ since division of vectors is not well-defined (or at least we have not defined it). An equivalent definition of differentiability of $f$ at $x_0$ is that there exists a number $a$ such that
\[ \lim_{x\to x_0}\frac{f(x)-f(x_0)-a(x-x_0)}{x-x_0}=0, \]
which is equivalent to asking that
\[ \lim_{x\to x_0}\frac{|f(x)-f(x_0)-a(x-x_0)|}{|x-x_0|}=0. \]
The number $a$ is then denoted by $f'(x_0)$ as before. Another way to think about the derivative is that the affine function $x\mapsto f(x_0)+f'(x_0)(x-x_0)$ is a good approximation to $f(x)$ for points $x$ near $x_0$. The linear part of the affine function is $x\mapsto f'(x_0)x$. Thought of in this way, the derivative of $f$ at $x_0$ is a linear function.
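The "good affine approximation" viewpoint can be illustrated numerically. In the sketch below we make the hypothetical choice $f(x)=x^2$ at $x_0=1$, where $f'(x_0)=2$, and observe that the approximation error divided by $|x-x_0|$ tends to zero.

```python
# Numerical check that the affine function x -> f(x0) + f'(x0)*(x - x0) is a
# good approximation to f near x0: the error divided by |x - x0| tends to 0.
# Illustration (our choice): f(x) = x**2 at x0 = 1.0, where f'(x0) = 2.0.

def f(x):
    return x * x

x0, dfx0 = 1.0, 2.0

errors = []
for h in [1e-1, 1e-2, 1e-3, 1e-4]:
    errors.append(abs(f(x0 + h) - f(x0) - dfx0 * h) / abs(h))
print(errors)
```

Here the error $|f(x_0+h)-f(x_0)-f'(x_0)h|$ equals $h^2$ exactly, so the ratios shrink linearly in $h$.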
Let $U$ be a subset of $\mathbb{R}^n$. A mapping $f:U\to\mathbb{R}^m$ is said to be differentiable at $x_0\in U$ if there exists a linear mapping $L:\mathbb{R}^n\to\mathbb{R}^m$ such that
\[ \lim_{x\to x_0}\frac{\|f(x)-f(x_0)-L(x-x_0)\|}{\|x-x_0\|}=0. \]
In the definition of differentiability, the expression $L(x-x_0)$ denotes the linear mapping $L$ applied to the vector $x-x_0$. An equivalent definition of differentiability is that
\[ \lim_{h\to 0}\frac{\|f(x_0+h)-f(x_0)-L(h)\|}{\|h\|}=0, \]
where again $L(h)$ denotes $L$ evaluated at $h$. It is not hard to show that the linear mapping $L$ in the above definition is unique when $U$ is an open set. For this reason, we will deal almost exclusively with the case that $U$ is open without further mention. We therefore call $L$ the derivative of $f$ at $x_0$ and denote it instead by $Df(x_0)$. Hence, by definition, the derivative of $f$ at $x_0$ is the unique linear mapping $Df(x_0):\mathbb{R}^n\to\mathbb{R}^m$ satisfying
\[ \lim_{h\to 0}\frac{\|f(x_0+h)-f(x_0)-Df(x_0)h\|}{\|h\|}=0. \]
Applying the definition of the limit, given arbitrary $\varepsilon>0$ there exists $\delta>0$ such that if $0<\|h\|<\delta$ then
\[ \frac{\|f(x_0+h)-f(x_0)-Df(x_0)h\|}{\|h\|}<\varepsilon, \]
or equivalently
\[ \|f(x_0+h)-f(x_0)-Df(x_0)h\|<\varepsilon\|h\|. \]
If $f$ is differentiable at each $x\in U$ then $Df$ is a mapping from $U$ to the space of linear maps from $\mathbb{R}^n$ to $\mathbb{R}^m$. In other words, if we denote by $\mathcal{L}(\mathbb{R}^n,\mathbb{R}^m)$ the space of linear maps from $\mathbb{R}^n$ to $\mathbb{R}^m$ then we have a well-defined mapping $Df:U\to\mathcal{L}(\mathbb{R}^n,\mathbb{R}^m)$, called the derivative of $f$ on $U$, which assigns the derivative $Df(x)$ of $f$ at each $x\in U$. We now relate the derivative of $f$ with the derivatives of its component functions. To that end, we need to recall some basic facts from linear algebra and the definition of the partial derivative. For the latter, recall that a function $g:U\to\mathbb{R}$, $U\subseteq\mathbb{R}^n$, has partial derivative at $x_0$ with respect to $x_j$ if the following limit exists:
\[ \lim_{t\to 0}\frac{g(x_0+te_j)-g(x_0)}{t}, \]
or equivalently, if there exists a number $a$ such that
\[ \lim_{t\to 0}\frac{|g(x_0+te_j)-g(x_0)-at|}{|t|}=0, \]
where $e_j$ denotes the $j$th standard basis vector in $\mathbb{R}^n$. We then denote $a=\frac{\partial g}{\partial x_j}(x_0)$. Now, given any linear map $L:\mathbb{R}^n\to\mathbb{R}^m$, the action of $L$ on vectors in $\mathbb{R}^n$ can be represented as matrix-vector multiplication once we choose a basis for $\mathbb{R}^n$ and $\mathbb{R}^m$. Specifically, if we choose the most convenient bases in $\mathbb{R}^n$ and $\mathbb{R}^m$, namely the standard bases, then $L(x)=Ax$ where $A\in\mathbb{R}^{m\times n}$ and the $(i,j)$ entry of the matrix $A$ is the $i$th component of the vector $L(e_j)$. We can now prove the following.
Let $U\subseteq\mathbb{R}^n$ be open and suppose that $f:U\to\mathbb{R}^m$ is differentiable at $x_0\in U$, and write $f=(f_1,\dots,f_m)$. Then the partial derivatives $\frac{\partial f_i}{\partial x_j}(x_0)$ exist, and the matrix representation of $Df(x_0)$ in the standard bases in $\mathbb{R}^n$ and $\mathbb{R}^m$ is
\[ Df(x_0)=\begin{bmatrix}\frac{\partial f_1}{\partial x_1}&\cdots&\frac{\partial f_1}{\partial x_n}\\ \vdots&\ddots&\vdots\\ \frac{\partial f_m}{\partial x_1}&\cdots&\frac{\partial f_m}{\partial x_n}\end{bmatrix}, \]
where all partial derivatives are evaluated at $x_0$. The matrix above is called the Jacobian matrix of $f$ at $x_0$.
Let $a_{ij}$ denote the $(i,j)$ entry of the matrix representation of $Df(x_0)$ in the standard bases in $\mathbb{R}^n$ and $\mathbb{R}^m$, that is, $a_{ij}$ is the $i$th component of $Df(x_0)e_j$. By definition of differentiability, it holds that
\[ \lim_{h\to 0}\frac{\|f(x_0+h)-f(x_0)-Df(x_0)h\|}{\|h\|}=0. \]
Let $h=te_j$ where $e_j$ is the $j$th standard basis vector. Since $U$ is open, $x_0+te_j\in U$ provided $|t|$ is sufficiently small. Then since $h\to 0$ iff $t\to 0$ we have
\[ \lim_{t\to 0}\frac{\|f(x_0+te_j)-f(x_0)-t\,Df(x_0)e_j\|}{|t|}=0. \]
It follows that each component of the vector $\frac{1}{t}\bigl(f(x_0+te_j)-f(x_0)-t\,Df(x_0)e_j\bigr)$ tends to $0$ as $t\to 0$. Hence, for each $i\in\{1,\dots,m\}$ we have
\[ \lim_{t\to 0}\frac{|f_i(x_0+te_j)-f_i(x_0)-a_{ij}t|}{|t|}=0. \]
Hence, $\frac{\partial f_i}{\partial x_j}(x_0)$ exists and $\frac{\partial f_i}{\partial x_j}(x_0)=a_{ij}$ as claimed.
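The theorem can be illustrated numerically: a forward-difference quotient along each basis direction $e_j$ recovers the Jacobian matrix column by column. The map $f(x,y)=(xy^2,\ \sin x+y)$ below is our own choice of example; its Jacobian at $(1,2)$ has rows $(y^2,\,2xy)=(4,4)$ and $(\cos x,\,1)$.

```python
import math

# Finite-difference check of the Jacobian matrix for the (hypothetical) map
# f(x, y) = (x*y**2, sin(x) + y), whose Jacobian is
#   [ y**2    2*x*y ]
#   [ cos(x)  1     ]

def f(v):
    x, y = v
    return [x * y**2, math.sin(x) + y]

def jacobian_fd(f, v, h=1e-6):
    """Approximate the Jacobian of f at v by forward differences,
    one column per standard basis direction e_j."""
    fv = f(v)
    m, n = len(fv), len(v)
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        vh = list(v)
        vh[j] += h
        fvh = f(vh)
        for i in range(m):
            J[i][j] = (fvh[i] - fv[i]) / h
    return J

v0 = [1.0, 2.0]
J_exact = [[4.0, 4.0], [math.cos(1.0), 1.0]]
J_approx = jacobian_fd(f, v0)
print(J_approx)
```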
It is customary to write $Df(x_0)h$ since for any $h\in\mathbb{R}^n$ the vector $Df(x_0)h$ is the Jacobian matrix of $f$ at $x_0$ multiplied by $h$ (all partials are evaluated at $x_0$). When not explicitly stated, the matrix representation of $Df(x_0)$ will always mean the Jacobian matrix representation. We now prove that differentiability implies continuity. To that end, we first recall that if $A\in\mathbb{R}^{m\times n}$ and $x\in\mathbb{R}^n$ then there is a constant $M\ge 0$, depending only on $A$, such that
\[ \|Ax\|\le M\|x\|. \]
The proof of this fact is identical to the one in Example 9.4.16. In particular, if $x\to 0$ then $Ax\to 0$.
Let $U\subseteq\mathbb{R}^n$ be an open set. If $f:U\to\mathbb{R}^m$ is differentiable at $x_0\in U$ then $f$ is continuous at $x_0$.
Let $\varepsilon>0$. Since $f$ is differentiable at $x_0$, there exists $\delta_0>0$ such that if $\|x-x_0\|<\delta_0$ then
\[ \|f(x)-f(x_0)-Df(x_0)(x-x_0)\|\le\|x-x_0\|. \]
Let $M\ge 0$ be a constant such that $\|Df(x_0)h\|\le M\|h\|$ for all $h\in\mathbb{R}^n$. Then if $\|x-x_0\|<\delta_0$ then
\[ \|f(x)-f(x_0)\|\le\|f(x)-f(x_0)-Df(x_0)(x-x_0)\|+\|Df(x_0)(x-x_0)\|\le(1+M)\|x-x_0\|, \]
and thus $\|f(x)-f(x_0)\|<\varepsilon$ provided $\|x-x_0\|<\min\{\delta_0,\varepsilon/(1+M)\}$. Hence, $f$ is continuous at $x_0$.
Notice that Theorem 10.1.2 says that if $Df(x_0)$ exists then all the relevant partials exist. However, it does not generally hold that if all the relevant partials exist then $Df(x_0)$ exists. The reason is that partial derivatives are derivatives along the coordinate axes whereas, as seen from the definition, the limit used to define $Df(x_0)$ allows $x$ to approach $x_0$ along arbitrary directions.
Consider the function $f:\mathbb{R}^2\to\mathbb{R}$ defined as
\[ f(x,y)=\begin{cases}\dfrac{xy}{x^2+y^2},&(x,y)\neq(0,0),\\[4pt] 0,&(x,y)=(0,0).\end{cases} \]
We determine whether $\frac{\partial f}{\partial x}(0,0)$ and $\frac{\partial f}{\partial y}(0,0)$ exist. To that end, we compute
\[ \lim_{t\to 0}\frac{f(t,0)-f(0,0)}{t}=\lim_{t\to 0}\frac{0-0}{t}=0,\qquad \lim_{t\to 0}\frac{f(0,t)-f(0,0)}{t}=0. \]
Therefore, $\frac{\partial f}{\partial x}(0,0)$ and $\frac{\partial f}{\partial y}(0,0)$ exist and are both equal to zero. It is straightforward to show that $f$ is not continuous at $(0,0)$ (consider the limit along the line $y=x$) and therefore not differentiable at $(0,0)$.
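This phenomenon is easy to see numerically, assuming the classic example $f(x,y)=xy/(x^2+y^2)$ with $f(0,0)=0$: the difference quotients along the axes vanish, yet along the diagonal $y=x$ the function is identically $1/2$, so it cannot be continuous at the origin.

```python
# Partial derivatives of f(x, y) = x*y/(x**2 + y**2), f(0, 0) = 0 (the classic
# example, assumed here), both exist and equal 0 at the origin, yet f is not
# continuous there: along the line y = x the values stay at 1/2.

def f(x, y):
    if (x, y) == (0.0, 0.0):
        return 0.0
    return x * y / (x**2 + y**2)

# Difference quotients along the axes tend to 0 ...
axis_quotients = [f(t, 0.0) / t for t in [1e-1, 1e-3, 1e-6]]
# ... but along the line y = x the function is identically 1/2.
diagonal_values = [f(t, t) for t in [1e-1, 1e-3, 1e-6]]
print(axis_quotients, diagonal_values)
```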
The previous example shows that existence of partial derivatives is a fairly weak assumption with regard to differentiability; in fact, even with regard to continuity. The following theorem gives a sufficient condition for $Df$ to exist in terms of the partial derivatives.
Let $U\subseteq\mathbb{R}^n$ be an open set and consider $f:U\to\mathbb{R}^m$ with $f=(f_1,\dots,f_m)$. If each partial derivative function $\frac{\partial f_i}{\partial x_j}:U\to\mathbb{R}$ exists and is continuous on $U$ then $f$ is differentiable on $U$.
We will omit the proof of Theorem 10.1.5.
Let be defined by Explain why exists for each and find .
It is clear that the component functions of that are given by , , and have partial derivatives that are continuous on all of . Hence, is differentiable on . Then
Prove that the given function is differentiable on .
We compute and thus . A similar computation shows that . On the other hand, if then To prove that exists for any , it is enough to show that and are continuous on (Theorem 10.1.5). It is clear that and are continuous on the open set and thus exists on . Now consider the continuity of at . Using polar coordinates and , we can write Now if and only if and thus In other words, and thus is continuous at . A similar computation shows that is continuous at . Hence, by Theorem 10.1.5, exists on .
If $f:U\to\mathbb{R}$ is differentiable on $U\subseteq\mathbb{R}^n$ and $m=1$, then $Df(x)$ is called the gradient of $f$ and we write $\nabla f(x)$ instead of $Df(x)$. Hence, in this case,
\[ \nabla f(x)=\left(\frac{\partial f}{\partial x_1}(x),\dots,\frac{\partial f}{\partial x_n}(x)\right). \]
On the other hand, if $n=1$ and $m\ge 2$ then $f:I\to\mathbb{R}^m$ is a curve in $\mathbb{R}^m$. In this case, it is customary to use lower-case letters such as $c$, $\sigma$, or $\gamma$ instead of $f$, and use $t$ for the domain variable instead of $x$. In any case, since $c:I\to\mathbb{R}^m$ is a function of one variable we use the notation $c'(t)$, and the derivative of $c$ is denoted by
\[ c'(t)=(c_1'(t),c_2'(t),\dots,c_m'(t)), \]
where all derivatives are derivatives of single-variable, single-valued functions.

Exercises

Let $f,g:\mathbb{R}^n\to\mathbb{R}^m$ be differentiable functions at $x_0$. Prove by definition that $f+g$ is differentiable at $x_0$ and that $D(f+g)(x_0)=Df(x_0)+Dg(x_0)$.
Recall that a mapping $T:\mathbb{R}^n\to\mathbb{R}^m$ is said to be linear if $T(x+y)=T(x)+T(y)$ and $T(\alpha x)=\alpha T(x)$, for all $x,y\in\mathbb{R}^n$ and $\alpha\in\mathbb{R}$. Prove that if $T$ is linear then $DT(x_0)=T$ for all $x_0\in\mathbb{R}^n$.
Let $f:\mathbb{R}^n\to\mathbb{R}^m$ and suppose that there exists $M>0$ such that $\|f(x)\|\le M\|x\|^2$ for all $x\in\mathbb{R}^n$. Prove that $f$ is differentiable at $0$ and that $Df(0)=0$.
Determine if the given function is differentiable at the indicated point.
Compute if .

Differentiation Rules and the MVT

Let $U\subseteq\mathbb{R}^n$ and $V\subseteq\mathbb{R}^m$ be open sets. Suppose that $f:U\to\mathbb{R}^m$ is differentiable at $x_0\in U$, $f(U)\subseteq V$, and $g:V\to\mathbb{R}^p$ is differentiable at $f(x_0)$. Then $g\circ f$ is differentiable at $x_0$ and
\[ D(g\circ f)(x_0)=Dg(f(x_0))\circ Df(x_0). \]
Verify the chain rule for the composite function where and are
An important special case of the chain rule is the composition of a curve $c:I\to\mathbb{R}^n$ with a function $f:\mathbb{R}^n\to\mathbb{R}$. The composite function $g=f\circ c$ is a single-variable and single-valued function. In this case, if $c'(t)$ is defined for all $t\in I$ and $Df(c(t))$ exists at each $c(t)$ then
\[ g'(t)=\nabla f(c(t))\cdot c'(t). \]
In the case that $c(t)=x_0+tu$ and $u$ is a unit vector, that is, $\|u\|=1$, then
\[ g'(0)=\nabla f(x_0)\cdot u \]
is called the directional derivative of $f$ at $x_0$ in the direction $u$.
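The chain-rule formula for directional derivatives can be checked against a direct difference quotient of $g(t)=f(x_0+tu)$. The function $f(x,y)=x^2+3y$, the point $x_0=(1,2)$, and the direction $u=(1/\sqrt2,1/\sqrt2)$ below are our own choices.

```python
import math

# Directional derivative via the chain-rule formula grad f(x0) . u versus a
# direct difference quotient of g(t) = f(x0 + t*u).  Example (our choice):
# f(x, y) = x**2 + 3*y, so grad f = (2*x, 3).

def f(x, y):
    return x**2 + 3*y

x0 = (1.0, 2.0)
u = (1/math.sqrt(2), 1/math.sqrt(2))   # unit vector
grad = (2*x0[0], 3.0)                  # grad f(x0)

formula = grad[0]*u[0] + grad[1]*u[1]  # grad f(x0) . u

t = 1e-6
quotient = (f(x0[0] + t*u[0], x0[1] + t*u[1]) - f(x0[0], x0[1])) / t
print(formula, quotient)
```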
Let and be differentiable and suppose that . Prove that if then where .
Below is a version of the product rule for multi-variable functions.
Let $U\subseteq\mathbb{R}^n$ be open and suppose that $f:U\to\mathbb{R}$ and $g:U\to\mathbb{R}$ are differentiable at $x_0\in U$. Then the function $fg$ is differentiable at $x_0$ and
\[ D(fg)(x_0)=g(x_0)Df(x_0)+f(x_0)Dg(x_0). \]
Verify the product rule for if and are
Let be differentiable functions. Find an expression of in terms of , and .
Let be a differentiable function. Suppose that is differentiable. Prove that for all if and only if for all .
Recall the mean value theorem (MVT) on $\mathbb{R}$. If $f:[a,b]\to\mathbb{R}$ is continuous on $[a,b]$ and differentiable on $(a,b)$ then there exists $c\in(a,b)$ such that $f(b)-f(a)=f'(c)(b-a)$. The MVT does not generally hold for a function $f:U\subseteq\mathbb{R}^n\to\mathbb{R}^m$ without some restrictions on $U$ and, more importantly, on $m$. For instance, consider $f:\mathbb{R}\to\mathbb{R}^2$ defined by $f(t)=(\cos t,\sin t)$. Then $f(2\pi)-f(0)=(0,0)$ while $f'(t)=(-\sin t,\cos t)$ is never the zero vector, and there is no $c\in(0,2\pi)$ such that $f(2\pi)-f(0)=f'(c)(2\pi-0)$. With regard to the domain $U$, we will be able to generalize the MVT for points $x,y\in U$ provided all points on the line segment joining $x$ and $y$ are contained in $U$. Specifically, the line segment joining $x,y\in\mathbb{R}^n$ is the set of points
\[ \{(1-t)x+ty\;:\;0\le t\le 1\}. \]
Hence, the image of the curve $\sigma:[0,1]\to\mathbb{R}^n$ given by $\sigma(t)=(1-t)x+ty$ is the line segment joining $x$ and $y$. Even if $U$ is open, the line segment joining $x,y\in U$ may not be contained in $U$ (see Figure 10.1).
figures/line-segment.svg
Line segment joining $x$ and $y$ not contained in $U$
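The failure of the MVT for vector-valued functions can be seen numerically with the standard circle curve $f(t)=(\cos t,\sin t)$ (our choice of example): the displacement over $[0,2\pi]$ vanishes while the velocity never does.

```python
import math

# The one-variable MVT fails for curves: for f(t) = (cos t, sin t) we have
# f(2*pi) - f(0) = (0, 0), but f'(t) = (-sin t, cos t) has norm 1 for every t,
# so no c can satisfy f(2*pi) - f(0) = f'(c) * (2*pi - 0).

def f(t):
    return (math.cos(t), math.sin(t))

def fprime(t):
    return (-math.sin(t), math.cos(t))

diff = tuple(a - b for a, b in zip(f(2 * math.pi), f(0.0)))
norms = [math.hypot(*fprime(0.1 * k)) for k in range(63)]  # sample of c values
print(diff, min(norms), max(norms))
```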
Let $U\subseteq\mathbb{R}^n$ be open and assume that $f:U\to\mathbb{R}$ is differentiable on $U$. Let $x,y\in U$ and suppose that the line segment joining $x$ and $y$ is contained entirely in $U$. Then there exists $z$ on the line segment joining $x$ and $y$ such that
\[ f(y)-f(x)=\nabla f(z)\cdot(y-x). \]
Let $\sigma(t)=(1-t)x+ty$ for $t\in[0,1]$. By assumption, $\sigma(t)\in U$ for all $t\in[0,1]$. Consider the function $g=f\circ\sigma$ on $[0,1]$. Then $g$ is continuous on $[0,1]$ and by the chain rule $g$ is differentiable on $(0,1)$. Hence, applying the MVT on $\mathbb{R}$ to $g$, there exists $t_0\in(0,1)$ such that $g(1)-g(0)=g'(t_0)(1-0)=g'(t_0)$. Now $g(1)=f(y)$ and $g(0)=f(x)$, and by the chain rule,
\[ g'(t_0)=\nabla f(\sigma(t_0))\cdot\sigma'(t_0)=\nabla f(\sigma(t_0))\cdot(y-x). \]
Hence, with $z=\sigma(t_0)$,
\[ f(y)-f(x)=\nabla f(z)\cdot(y-x), \]
and the proof is complete.
Let $U\subseteq\mathbb{R}^n$ be open and assume that $f:U\to\mathbb{R}^m$ is differentiable on $U$. Let $x,y\in U$ and suppose that the line segment joining $x$ and $y$ is contained entirely in $U$. Then there exist points $z_1,\dots,z_m$ on the line segment joining $x$ and $y$ such that
\[ f_i(y)-f_i(x)=\nabla f_i(z_i)\cdot(y-x) \]
for $i=1,\dots,m$.
Apply the MVT to each component function $f_i$ of $f$.
A set $C\subseteq\mathbb{R}^n$ is said to be convex if for any $x,y\in C$ the line segment joining $x$ and $y$ is contained in $C$. Let $f:U\to\mathbb{R}$ be differentiable. Prove that if $U$ is an open convex set and $\nabla f(x)=0$ for all $x\in U$ then $f$ is constant on $U$.

Exercises

Let $U\subseteq\mathbb{R}^n$ be an open set satisfying the following property: for any $x,y\in U$ there is a continuous curve $\sigma:[0,1]\to U$ such that $\sigma$ is differentiable on $(0,1)$ and $\sigma(0)=x$ and $\sigma(1)=y$.
  1. Give an example of a non-convex set satisfying the above property.
  2. Prove that if $U$ satisfies the above property and $f:U\to\mathbb{R}$ is differentiable on $U$ with $\nabla f(x)=0$ for all $x\in U$ then $f$ is constant on $U$.

The Space of Linear Maps

Let $U$ be an open subset of $\mathbb{R}^n$. Recall that if $f:U\to\mathbb{R}^m$ is differentiable at each $x\in U$ then $Df:U\to\mathcal{L}(\mathbb{R}^n,\mathbb{R}^m)$ denotes the derivative of $f$ on $U$. The space of linear maps $\mathcal{L}(\mathbb{R}^n,\mathbb{R}^m)$ is a vector space which after

Solutions to Differential Equations

A differential equation on $\mathbb{R}^n$ is an equation of the form $x'=F(x)$ where $F:\mathbb{R}^n\to\mathbb{R}^n$ is a given function and $x$ is the unknown. A solution to $x'=F(x)$ is a curve $x:I\to\mathbb{R}^n$ such that $x'(t)=F(x(t))$ for all $t\in I$, where $I\subseteq\mathbb{R}$ is an interval, possibly infinite. If is defined
Let be an open set and let be a differentiable function with a continuous derivative

High-Order Derivatives

In this section, we consider high-order derivatives of a differentiable mapping $f:\mathbb{R}^n\to\mathbb{R}^m$. To do this, we will need to make an excursion into the world of multilinear algebra. Even though we will discuss high-order derivatives for functions on Euclidean spaces, it will be convenient to first work with general vector spaces.
Let $V$ and $W$ be vector spaces. A mapping $T:V\times\cdots\times V\to W$ is said to be a $k$-multilinear map if $T$ is linear in each variable separately. Specifically, for any $i\in\{1,\dots,k\}$ and any fixed vectors $v_j\in V$ for $j\neq i$, the mapping $V\to W$ defined by
\[ v\mapsto T(v_1,\dots,v_{i-1},v,v_{i+1},\dots,v_k) \]
is a linear mapping.
A $1$-multilinear mapping is just a linear mapping. A $2$-multilinear mapping is called a bilinear mapping. Hence, $T:V\times V\to W$ is bilinear if
\[ T(\alpha u+v,w)=\alpha T(u,w)+T(v,w)\quad\text{and}\quad T(u,\alpha v+w)=\alpha T(u,v)+T(u,w) \]
for all $u,v,w\in V$ and $\alpha\in\mathbb{R}$. Roughly speaking, a multilinear mapping is essentially a special type of polynomial multivariable function. We will make this precise after presenting a few examples.
Consider $T:\mathbb{R}\times\mathbb{R}\to\mathbb{R}$ defined as $T(x,y)=xy$. As can be easily verified, $T$ is bilinear. On the other hand, if $T(x,y)=x+y$ then $T$ is not bilinear since for example $T(x+x',y)=x+x'+y\neq T(x,y)+T(x',y)$ in general, or $T(\alpha x,y)=\alpha x+y\neq\alpha T(x,y)$ in general. What about $T(x,y)=xy+1$?
Let $\{v_1,v_2,\dots,v_n\}$ be a set of vectors in the vector space $V$ and suppose that $u=\sum_{i=1}^n a_iv_i$ and $w=\sum_{j=1}^n b_jv_j$. If $T:V\times V\to\mathbb{R}$ is bilinear then expand $T(u,w)$ so that it depends only on the coefficients $a_i,b_j$ and the values $T(v_i,v_j)$ for $1\le i,j\le n$.
Let $A$ be an $n\times n$ matrix and define $T:\mathbb{R}^n\times\mathbb{R}^n\to\mathbb{R}$ as $T(x,y)=x^TAy$. Show that $T$ is bilinear. For instance, if say $n=2$ then
\[ T(x,y)=a_{11}x_1y_1+a_{12}x_1y_2+a_{21}x_2y_1+a_{22}x_2y_2. \]
Notice that $T(x,y)$ is a polynomial in the components of $x$ and $y$.
The function that returns the determinant of a matrix is multilinear in the columns of the matrix. Specifically, if $A=\begin{bmatrix}a_1&a_2&\cdots&a_n\end{bmatrix}$ has columns $a_1,\dots,a_n\in\mathbb{R}^n$ then
\[ \det\begin{bmatrix}a_1+b_1&a_2&\cdots&a_n\end{bmatrix}=\det\begin{bmatrix}a_1&a_2&\cdots&a_n\end{bmatrix}+\det\begin{bmatrix}b_1&a_2&\cdots&a_n\end{bmatrix}, \]
and if $\alpha\in\mathbb{R}$ then
\[ \det\begin{bmatrix}\alpha a_1&a_2&\cdots&a_n\end{bmatrix}=\alpha\det\begin{bmatrix}a_1&a_2&\cdots&a_n\end{bmatrix}. \]
These facts are proved by expanding the determinant along the first column. The same is true if we perform the same computation with a different column of $A$. In the case of a $2\times 2$ matrix we have
\[ \det\begin{bmatrix}a_{11}&a_{12}\\a_{21}&a_{22}\end{bmatrix}=a_{11}a_{22}-a_{21}a_{12}, \]
and if $A$ is a $3\times 3$ matrix with columns $a_1$, $a_2$, and $a_3$ then expanding $\det\begin{bmatrix}a_1&a_2&a_3\end{bmatrix}$ along the first column expresses it as a sum of $2\times 2$ determinants, each term linear in the entries of $a_1$.
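Multilinearity in the columns is easy to verify numerically. The sketch below checks linearity of the $2\times 2$ determinant in the first column, for columns chosen arbitrarily by us.

```python
# Check that det is linear in the first column for 2x2 matrices:
# det[a1 + s*b1, a2] = det[a1, a2] + s * det[b1, a2].

def det2(c1, c2):
    """Determinant of the 2x2 matrix with columns c1 and c2."""
    return c1[0] * c2[1] - c1[1] * c2[0]

a1, b1, a2 = (1.0, 2.0), (3.0, -1.0), (0.5, 4.0)  # arbitrary columns
s = 2.5

lhs = det2((a1[0] + s * b1[0], a1[1] + s * b1[1]), a2)
rhs = det2(a1, a2) + s * det2(b1, a2)
print(lhs, rhs)
```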
We now make precise the statement that a multilinear mapping is a (special type of) multivariable polynomial function. For simplicity, and since this will be the case when we consider high-order derivatives, we consider $k$-multilinear mappings $T:(\mathbb{R}^n)^k\to\mathbb{R}^m$. For a positive integer $k$ let $(\mathbb{R}^n)^k=\mathbb{R}^n\times\cdots\times\mathbb{R}^n$, where on the right-hand side $\mathbb{R}^n$ appears $k$ times. Let $\mathcal{L}^k(\mathbb{R}^n;\mathbb{R}^m)$ denote the space of $k$-multilinear maps from $(\mathbb{R}^n)^k$ to $\mathbb{R}^m$. It is easy to see that $\mathcal{L}^k(\mathbb{R}^n;\mathbb{R}^m)$ is a vector space under the natural notion of addition and scalar multiplication. In what follows we consider the case $k=3$; the general case is similar but requires more notation. Hence, suppose that $T:(\mathbb{R}^n)^3\to\mathbb{R}^m$ is a multilinear mapping and let $u=\sum_i u_ie_i$, $v=\sum_j v_je_j$, and $w=\sum_\ell w_\ell e_\ell$, where $e_i$ is the $i$th standard basis vector of $\mathbb{R}^n$, and similarly for $e_j$ and $e_\ell$. Therefore, by multilinearity of $T$ we have
\[ T(u,v,w)=\sum_{i=1}^n\sum_{j=1}^n\sum_{\ell=1}^n u_iv_jw_\ell\,T(e_i,e_j,e_\ell). \]
Thus, to compute $T(u,v,w)$ for any $u,v,w\in\mathbb{R}^n$, we need only know the values $T(e_i,e_j,e_\ell)$ for all triples $(i,j,\ell)$ with $1\le i,j,\ell\le n$. If we set $T^{ij\ell}=T(e_i,e_j,e_\ell)\in\mathbb{R}^m$, where the superscripts are not exponents but indices, then from our computation above
\[ T(u,v,w)=\sum_{i,j,\ell}T^{ij\ell}u_iv_jw_\ell. \]
Notice that the component functions of $T$ are multilinear; specifically, the mapping $(u,v,w)\mapsto T_s(u,v,w)$ is multilinear for each $s\in\{1,\dots,m\}$. The vectors $T^{ij\ell}$ for $1\le i,j,\ell\le n$ completely determine the multilinear mapping $T$, and we call these the coefficients of the multilinear mapping in the standard bases.
The general case is just more notation. If $T:(\mathbb{R}^n)^k\to\mathbb{R}^m$ is $k$-multilinear then there exist unique coefficients $T^{i_1i_2\cdots i_k}=T(e_{i_1},e_{i_2},\dots,e_{i_k})$, where $1\le i_1,\dots,i_k\le n$, such that for any vectors $v_1,\dots,v_k\in\mathbb{R}^n$ it holds that
\[ T(v_1,\dots,v_k)=\sum_{i_1,\dots,i_k=1}^n T^{i_1i_2\cdots i_k}(v_1)_{i_1}(v_2)_{i_2}\cdots(v_k)_{i_k}, \]
where $e_1,\dots,e_n$ are the standard basis vectors in $\mathbb{R}^n$.
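The coefficient description can be sketched in code: a $3$-multilinear map on $\mathbb{R}^2$ is built from an arbitrary coefficient array $T^{ij\ell}$ (our choice below), and the resulting map is automatically linear in each slot.

```python
# A 3-multilinear map T on R^2 determined by coefficients coeff[i][j][l]
# = T(e_i, e_j, e_l); then T(u, v, w) = sum_ijl coeff[i][j][l]*u_i*v_j*w_l
# is linear in each slot.  The coefficients below are chosen arbitrarily.

coeff = [[[i + 2*j - l + 1.0 for l in range(2)] for j in range(2)]
         for i in range(2)]

def T(u, v, w):
    return sum(coeff[i][j][l] * u[i] * v[j] * w[l]
               for i in range(2) for j in range(2) for l in range(2))

u, v, w, z = (1.0, 2.0), (0.5, -1.0), (2.0, 3.0), (1.0, 1.0)
a = 1.7

# Linearity in the first slot: T(u + a*z, v, w) = T(u, v, w) + a*T(z, v, w).
lhs = T((u[0] + a*z[0], u[1] + a*z[1]), v, w)
rhs = T(u, v, w) + a * T(z, v, w)
print(lhs, rhs)
```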
A multilinear mapping $T:(\mathbb{R}^n)^k\to\mathbb{R}^m$ is said to be symmetric if the value of $T$ is unchanged after an arbitrary permutation of the inputs to $T$. In other words, $T$ is symmetric if for any $v_1,\dots,v_k$ it holds that
\[ T(v_1,v_2,\dots,v_k)=T(v_{\pi(1)},v_{\pi(2)},\dots,v_{\pi(k)}) \]
for any permutation $\pi$ of $\{1,2,\dots,k\}$. For instance, if $T$ is a symmetric bilinear mapping then for any $u,v$ it holds that $T(u,v)=T(v,u)$.
Consider $T:\mathbb{R}^n\times\mathbb{R}^n\to\mathbb{R}$ defined by $T(x,y)=x^TAy$, where $A$ is a symmetric $n\times n$ matrix. Then
\[ T(x,y)=x^TAy=(x^TAy)^T=y^TA^Tx=y^TAx=T(y,x), \]
and therefore $T$ is symmetric. Notice that $(x^TAy)^T=x^TAy$ since $x^TAy$ is a scalar, and $A^T=A$ since the matrix $A$ is symmetric.
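A quick numerical check of this symmetry, for a symmetric $2\times 2$ matrix of our own choosing:

```python
# A bilinear map T(x, y) = x^T A y is symmetric when A is symmetric.
# We check T(x, y) = T(y, x) for a (hypothetical) symmetric 2x2 matrix A.

A = [[1.0, 2.0],
     [2.0, 5.0]]   # A == A^T

def T(x, y):
    return sum(x[i] * A[i][j] * y[j] for i in range(2) for j in range(2))

x, y = (1.0, -3.0), (2.0, 0.5)
t_xy = T(x, y)
t_yx = T(y, x)
print(t_xy, t_yx)
```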
Having introduced the very basics of multilinear mappings, we can proceed with discussing high-order derivatives of vector-valued multivariable functions. Suppose then that $f:U\to\mathbb{R}^m$ is differentiable on the open set $U\subseteq\mathbb{R}^n$ and as usual let $Df:U\to\mathcal{L}(\mathbb{R}^n,\mathbb{R}^m)$ denote the derivative. Now $\mathcal{L}(\mathbb{R}^n,\mathbb{R}^m)$ is a finite dimensional vector space and can be equipped with a norm (all norms on a given finite dimensional vector space are equivalent). Thus, we can speak of differentiability of $Df$, namely, $Df$ is differentiable at $x_0$ if there exists a linear mapping $L:\mathbb{R}^n\to\mathcal{L}(\mathbb{R}^n,\mathbb{R}^m)$ such that
\[ \lim_{h\to 0}\frac{\|Df(x_0+h)-Df(x_0)-L(h)\|}{\|h\|}=0. \]
If such an $L$ exists then we denote it by $D(Df)(x_0)$. To simplify the notation, we write instead $D^2f(x_0)$. Hence, $Df$ is differentiable at $x_0$ if there exists a linear mapping $D^2f(x_0):\mathbb{R}^n\to\mathcal{L}(\mathbb{R}^n,\mathbb{R}^m)$ such that
\[ \lim_{h\to 0}\frac{\|Df(x_0+h)-Df(x_0)-D^2f(x_0)h\|}{\|h\|}=0. \]
To say that $D^2f(x_0)$ is a linear mapping from $\mathbb{R}^n$ to $\mathcal{L}(\mathbb{R}^n,\mathbb{R}^m)$ is to say that $D^2f(x_0)\in\mathcal{L}(\mathbb{R}^n,\mathcal{L}(\mathbb{R}^n,\mathbb{R}^m))$. Let us focus our attention on the space $\mathcal{L}(\mathbb{R}^n,\mathcal{L}(\mathbb{R}^n,\mathbb{R}^m))$. If $T\in\mathcal{L}(\mathbb{R}^n,\mathcal{L}(\mathbb{R}^n,\mathbb{R}^m))$ then $T(u)\in\mathcal{L}(\mathbb{R}^n,\mathbb{R}^m)$ for each $u\in\mathbb{R}^n$, and moreover the assignment $u\mapsto T(u)$ is linear, i.e., $T(\alpha u+v)=\alpha T(u)+T(v)$. Now, since $T(u)\in\mathcal{L}(\mathbb{R}^n,\mathbb{R}^m)$, we have that
\[ T(u)(\alpha v+w)=\alpha T(u)(v)+T(u)(w). \]
In other words, the mapping $(u,v)\mapsto T(u)(v)$ is bilinear! Hence, $T$ defines (uniquely) a bilinear map $\widetilde T:\mathbb{R}^n\times\mathbb{R}^n\to\mathbb{R}^m$ by $\widetilde T(u,v)=T(u)(v)$, and the assignment $T\mapsto\widetilde T$ is linear. Conversely, to any bilinear map $B:\mathbb{R}^n\times\mathbb{R}^n\to\mathbb{R}^m$ we associate an element $T\in\mathcal{L}(\mathbb{R}^n,\mathcal{L}(\mathbb{R}^n,\mathbb{R}^m))$ defined as $T(u)(v)=B(u,v)$, and the assignment $B\mapsto T$ is linear. We have therefore proved the following.
Let $V$ and $W$ be finite dimensional vector spaces. The vector space $\mathcal{L}(V,\mathcal{L}(V,W))$ is isomorphic to the vector space of bilinear maps from $V\times V$ to $W$.
The punchline is that $D^2f(x_0)$ can be viewed in a natural way as a bilinear mapping and thus from now on we write $D^2f(x_0)(u,v)$ instead of the more cumbersome $D^2f(x_0)(u)(v)$. We now determine a coordinate expression for $D^2f(x_0)$. First of all, if $f=(f_1,\dots,f_m)$ then $f=\sum_{s=1}^m f_s\epsilon_s$, where $\{\epsilon_1,\dots,\epsilon_m\}$ is the standard basis of $\mathbb{R}^m$. By linearity of the derivative and the product rule of differentiation, we have that $Df=\sum_{s=1}^m(Df_s)\epsilon_s$ and also $D^2f=\sum_{s=1}^m(D^2f_s)\epsilon_s$. Therefore,
\[ D^2f(x_0)(u,v)=\sum_{s=1}^m D^2f_s(x_0)(u,v)\,\epsilon_s. \]
This shows that we need only consider $D^2g$ for $\mathbb{R}$-valued functions $g:U\to\mathbb{R}$. Now, $Dg:U\to\mathcal{L}(\mathbb{R}^n,\mathbb{R})$, and thus the Jacobian of $Dg$ is (Theorem 10.1.2) the matrix of second order partial derivatives of $g$. Therefore,
\[ D^2g(x_0)(e_i,e_j)=\frac{\partial^2 g}{\partial x_i\partial x_j}(x_0). \]
Therefore, for any $u=\sum_i u_ie_i$ and $v=\sum_j v_je_j$, by multilinearity we have
\[ D^2g(x_0)(u,v)=\sum_{i=1}^n\sum_{j=1}^n\frac{\partial^2 g}{\partial x_i\partial x_j}(x_0)\,u_iv_j. \]
Now, if all second order partials of $g$ are defined and continuous on $U$ we can say more. Let us first introduce some terminology. We say that $f$ is of class $C^r$ on $U$ if all partial derivatives of $f$ up to and including order $r$ exist and are continuous functions on $U$.
Let $U\subseteq\mathbb{R}^n$ be an open set and suppose that $f:U\to\mathbb{R}$ is of class $C^2$. Then
\[ \frac{\partial^2 f}{\partial x_i\partial x_j}=\frac{\partial^2 f}{\partial x_j\partial x_i} \]
on $U$ for all $1\le i,j\le n$. Consequently, $D^2f(x_0)$ is a symmetric bilinear map for each $x_0\in U$.
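The equality of mixed partials can be observed numerically. Below we take the $C^2$ function $f(x,y)=x^2y+\sin(xy)$ (our own example), approximate the mixed second partial with a central difference, and compare with the exact formula obtained by differentiating in either order.

```python
import math

# Clairaut's theorem numerically: for the C^2 function f(x, y) = x**2 * y
# + sin(x*y) (our example), the mixed second partials f_xy and f_yx agree.
# A symmetric central difference approximates the mixed partial.

def f(x, y):
    return x**2 * y + math.sin(x * y)

def mixed_partial(x, y, h=1e-4):
    """Central-difference approximation of d^2 f / dx dy at (x, y)."""
    return (f(x + h, y + h) - f(x + h, y - h)
            - f(x - h, y + h) + f(x - h, y - h)) / (4 * h * h)

x0, y0 = 0.7, -1.2
approx = mixed_partial(x0, y0)
# Differentiating in either order gives f_xy = 2x + cos(xy) - xy*sin(xy).
exact = 2*x0 + math.cos(x0*y0) - x0*y0*math.sin(x0*y0)
print(approx, exact)
```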
If we now go back to a vector-valued function $f=(f_1,\dots,f_m)$ with components $f_s$, then if $D^2f(x_0)$ exists at $x_0$ then
\[ D^2f(x_0)(u,v)=\sum_{s=1}^m D^2f_s(x_0)(u,v)\,\epsilon_s. \]
Higher-order derivatives of $f$ can be treated similarly. If $D^{k-1}f$ is differentiable at $x_0$ then we denote the derivative at $x_0$ by $D^kf(x_0)$. Then $D^kf(x_0)$ is a linear map, and iterating the identification above, the space in which $D^kf(x_0)$ lives is isomorphic to the space of $k$-multilinear maps $(\mathbb{R}^n)^k\to\mathbb{R}^m$. The value of $D^kf(x_0)$ at $(v_1,\dots,v_k)$ is denoted by $D^kf(x_0)(v_1,\dots,v_k)$. Moreover, $D^kf(x_0)$ is a symmetric $k$-multilinear map at each $x_0\in U$ if $f$ is of class $C^k$. If $g:U\to\mathbb{R}$ is of class $C^k$ then for vectors $v_1,\dots,v_k\in\mathbb{R}^n$ we have
\[ D^kg(x_0)(v_1,\dots,v_k)=\sum\frac{\partial^k g}{\partial x_{i_1}\cdots\partial x_{i_k}}(x_0)\,(v_1)_{i_1}\cdots(v_k)_{i_k}, \]
where the summation is over all $k$-tuples $(i_1,\dots,i_k)$ where $1\le i_j\le n$. Hence, there are $n^k$ terms in the above summation. In the case that $v_1=\cdots=v_k=v$, the above expression takes the form
\[ D^kg(x_0)(v,\dots,v)=\sum\frac{\partial^k g}{\partial x_{i_1}\cdots\partial x_{i_k}}(x_0)\,v_{i_1}\cdots v_{i_k}. \]
Compute if , , and . Also compute .
We compute that and and then and then Then, If then

Taylor's Theorem

Taylor's theorem for a function $f:U\to\mathbb{R}$, $U\subseteq\mathbb{R}^n$, is as follows.
Let $U\subseteq\mathbb{R}^n$ be an open set and suppose that $f:U\to\mathbb{R}$ is of class $C^{p+1}$ on $U$. Let $x_0\in U$ and suppose that the line segment between $x_0$ and $x_0+h$ lies entirely in $U$. Then there exists $z$ on the line segment such that
\[ f(x_0+h)=f(x_0)+\sum_{k=1}^{p}\frac{1}{k!}D^kf(x_0)(h,\dots,h)+R_p(x_0,h), \]
where
\[ R_p(x_0,h)=\frac{1}{(p+1)!}D^{p+1}f(z)(h,\dots,h). \]
Furthermore,
\[ \lim_{h\to 0}\frac{R_p(x_0,h)}{\|h\|^{p}}=0. \]
Writing $h=x-x_0$ in Taylor's theorem, we call
\[ P_p(x)=f(x_0)+\sum_{k=1}^{p}\frac{1}{k!}D^kf(x_0)(x-x_0,\dots,x-x_0) \]
the $p$th order Taylor polynomial of $f$ centered at $x_0$, and $R_p$ the $p$th order remainder term. Hence, Taylor's theorem says that
\[ f(x)=P_p(x)+R_p(x_0,x-x_0). \]
Since $R_p(x_0,h)\to 0$ as $h\to 0$, for $x$ close to $x_0$ we get an approximation
\[ f(x)\approx P_p(x). \]
Moreover, since $D^{p+1}f$ is continuous, there is a constant $M>0$ such that if $x$ is sufficiently close to $x_0$ then the remainder term satisfies the bound
\[ |R_p(x_0,x-x_0)|\le M\|x-x_0\|^{p+1}. \]
From this it follows that
\[ \lim_{x\to x_0}\frac{|f(x)-P_p(x)|}{\|x-x_0\|^{p}}=0.
\]
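The remainder decay can be observed numerically. Below we take $f(x,y)=e^{x+y}$ (our own choice), whose 2nd order Taylor polynomial centered at the origin is $P_2(x,y)=1+(x+y)+\tfrac12(x+y)^2$, and watch $|f-P_2|/\|(x,y)\|^2$ shrink as $(x,y)\to(0,0)$.

```python
import math

# Decay of the Taylor remainder: for f(x, y) = exp(x + y), the 2nd order
# Taylor polynomial centered at (0, 0) is P2 = 1 + (x+y) + (x+y)**2/2, and
# |f - P2| / ||(x, y)||**2 should tend to 0 as (x, y) -> (0, 0).

def f(x, y):
    return math.exp(x + y)

def P2(x, y):
    s = x + y
    return 1.0 + s + s * s / 2.0

ratios = []
for t in [1e-1, 1e-2, 1e-3]:
    x, y = t, t          # approach the origin along the diagonal
    ratios.append(abs(f(x, y) - P2(x, y)) / (x*x + y*y))
print(ratios)
```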
Compute the third-order Taylor polynomial of centered at .
Most of the work has been done in Example 10.5.10. Evaluating all derivatives at we find that Therefore,

Exercises

Find the 2nd order Taylor polynomial of the function centered at .
A function $f:\mathbb{R}^n\to\mathbb{R}$ is called a homogeneous function of degree $k$ if for all $x\in\mathbb{R}^n$ and $t>0$ it holds that $f(tx)=t^kf(x)$. Prove that if $f$ is differentiable then each partial derivative $\frac{\partial f}{\partial x_i}:\mathbb{R}^n\to\mathbb{R}$ is a homogeneous function of degree $k-1$.

The Inverse Function Theorem

A square linear system, or in vector form $Ax=b$ where the unknown is $x\in\mathbb{R}^n$, has a unique solution if and only if $A^{-1}$ exists, if and only if $\det A\neq 0$. In this case, the solution is $x=A^{-1}b$. Another way to say this is that the mapping $f(x)=Ax$ has a global inverse given by $f^{-1}(y)=A^{-1}y$. Hence, invertibility of $A$ completely determines whether $f$ is invertible. Consider now a system of equations $f(x)=y$ where $f:\mathbb{R}^n\to\mathbb{R}^n$ is nonlinear. When is it possible to solve for $x$ in terms of $y$, that is, when does $f^{-1}$ exist? In general, this is a difficult problem and we cannot expect global invertibility even when assuming the most desirable conditions on $f$. Even in the 1D case, we cannot expect global invertibility. For instance, $f(x)=x^2$ is not globally invertible but is so on any interval where $f'(x)\neq 0$. For instance, on the interval $(0,\infty)$, we have that $f'(x)=2x>0$ and $f^{-1}(y)=\sqrt{y}$. In any neighborhood of a point where $f'$ vanishes, for instance, at $x=0$, $f$ is not invertible. However, having a non-zero derivative is not necessary for invertibility. For instance, the function $f(x)=x^3$ has $f'(0)=0$ but has an inverse locally around $x=0$; in fact it has a global inverse $f^{-1}(y)=y^{1/3}$. Let's go back to the 1D case and see if we can say something about the invertibility of $f$ locally about a point $x_0$ such that $f'(x_0)\neq 0$. Assume that $f'$ is continuous on $\mathbb{R}$ (or on an open set containing $x_0$). Then there is an interval $I=(x_0-\delta,x_0+\delta)$ such that $f'(x)\neq 0$ for all $x\in I$. Now if $x_1,x_2\in I$ and $x_1\neq x_2$, then by the Mean Value Theorem, there exists $c$ in between $x_1$ and $x_2$ such that
\[ f(x_1)-f(x_2)=f'(c)(x_1-x_2). \]
Since $c\in I$ and $f'(c)\neq 0$, then $f(x_1)-f(x_2)\neq 0$. Hence, if $x_1\neq x_2$ then $f(x_1)\neq f(x_2)$ and this proves that $f$ is injective on $I$. Therefore, the function $f:I\to J$ has an inverse $f^{-1}:J\to I$ where $J=f(I)$. Hence, if $f'(x_0)\neq 0$, $f$ has a local inverse at $x_0$. In fact, we can say even more, namely, one can show that $f^{-1}$ is also differentiable. Then, since $f^{-1}(f(x))=x$ for $x\in I$, by the chain rule we have
\[ (f^{-1})'(f(x))\,f'(x)=1, \]
and therefore since $f'(x)\neq 0$ for all $x\in I$ we have
\[ (f^{-1})'(f(x))=\frac{1}{f'(x)}. \]
The following theorem is a generalization of this idea.
Let $U\subseteq\mathbb{R}^n$ be an open set and let $f:U\to\mathbb{R}^n$ be of class $C^1$. Suppose that $\det Df(x_0)\neq 0$ for $x_0\in U$. Then there exists an open set $V\subseteq U$ containing $x_0$ such that $f(V)$ is open and $f:V\to f(V)$ is invertible. Moreover, the inverse function $f^{-1}:f(V)\to V$ is also $C^1$ and for $y\in f(V)$ and $x=f^{-1}(y)$ we have
\[ Df^{-1}(y)=\bigl(Df(x)\bigr)^{-1}. \]
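The one-variable formula $(f^{-1})'(f(x))=1/f'(x)$ can be checked numerically. The sketch below uses the strictly increasing function $f(x)=x^3+x$ (our own choice, so a global inverse exists), inverts it by bisection, and compares a difference quotient of $f^{-1}$ with $1/f'(x_0)$ at $x_0=1$, where $f'(1)=4$.

```python
# Check (f^{-1})'(f(x0)) = 1 / f'(x0) numerically for f(x) = x**3 + x
# (strictly increasing, hence invertible), at x0 = 1 where f'(1) = 4.

def f(x):
    return x**3 + x

def fprime(x):
    return 3 * x**2 + 1

def f_inverse(y, lo=-10.0, hi=10.0):
    """Invert f by bisection (valid since f is strictly increasing)."""
    for _ in range(200):
        mid = (lo + hi) / 2
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

x0 = 1.0
y0 = f(x0)                  # y0 = 2.0
h = 1e-6
fd = (f_inverse(y0 + h) - f_inverse(y0 - h)) / (2 * h)  # ~ (f^{-1})'(y0)
print(fd, 1 / fprime(x0))
```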
Prove that is locally invertible at all points .
Clearly, exists for all since all partials of the components of are continuous on . A direct computation gives and thus . Clearly, if and only if . Therefore, by the Inverse Function theorem, for each non-zero there exists an open set containing such that is invertible. In this very special case, we can find the local inverse of about some . Let , that is, If then and therefore and therefore or By the quadratic formula, Since we must take and therefore Hence, provided and then

Exercises

Let $f:\mathbb{R}^2\to\mathbb{R}^2$ be defined by $f(x,y)=(e^x\cos y,\ e^x\sin y)$ for $(x,y)\in\mathbb{R}^2$.
  1. Prove that the range of $f$ is $\mathbb{R}^2\setminus\{(0,0)\}$. Hint: Think polar coordinates.
  2. Prove that $f$ is not injective.
  3. Prove that $f$ is locally invertible at every $(x,y)\in\mathbb{R}^2$.
Can the system of equations be solved for in terms of near ?