Instructor's Lecture Notes for Multivariable Calculus

These are the instructor’s lecture notes for a course in multivariable calculus, covering coordinate systems and vectors in three-dimensional space, the techniques of calculus applied to vector-valued and multivariable functions, integration on two- or three-dimensional domains, and some calculus of vector fields. They contain everything discussed in lecture and some topics and points well beyond the scope of lecture.

Introduction

The purpose of this class is to investigate how the “one-dimensional” calculus from previous classes generalizes to situations with multiple variables.

The object of study in previous calculus classes have been single-variable, scalar-valued functions, denoted \(f\colon \mathbf{R} \to \mathbf{R}.\) These functions are called single-variable due to their inputs \(x\) being a single real number from the one-dimensional space \(\mathbf{R},\) and are called scalar-valued because they return a single real number (a scalar) from in a one-dimensional space \(\mathbf{R}.\) In this class we study vector-valued functions \(\bm{f}\colon \mathbf{R} \to \mathbf{R}^3\) whose output is some vector in the three-dimensional space \(\mathbf{R}^3,\) and we study multi-variable functions \(f\colon \mathbf{R}^2 \to \mathbf{R}\) and \(f\colon \mathbf{R}^3 \to \mathbf{R}\) whose input is some vector in the two- or three-dimensional space, and we study vector fields in three dimensional space, which are functions \(\bm{f}\colon \mathbf{R}^3 \to \mathbf{R}^3.\)

And naturally we study these functions from the perspective of their (analytic) geometry. Just like in previous calculus classes and function \(f\colon \mathbf{R} \to \mathbf{R}\) could be analyzed as a graph, a curve in \(\mathbf{R}^2,\) a vector-valued function \(\bm{f}\colon \mathbf{R} \to \mathbf{R}^3\) can be visualized as a parametrically-defined curve in \(\mathbf{R}^3,\) and a multi-variable function \(f\colon \mathbf{R}^2 \to \mathbf{R}\) can be visualized as its graph, a surface in \(\mathbf{R}^3.\)

Typographic Conventions

Generic variable names will be italicized, whereas names of constants or operators will be roman (upright, normal) type. For example in the equation \(f(x,y) = \sin\bigl(5\ln(x)+3y^2\bigr)\) the symbol \(f\) for the defined function and the symbols \(x\) and \(y\) for its parameters are italicized, whereas the symbols for the specific functions \(\sin\) and \(\ln\) and the constant \(3\) are roman. There is no analogue to this convention for handwritten mathematics except when writing certain ubiquitous sets. For example the real numbers will be denoted in type as \(\mathbf{R}\) in type but is often handwritten in “black-board bold” as \(\mathbb{R}.\)

Scalars will be set with the same weight as the surrounding type whereas vectors will be given a stronger weight. For example \(x\) must be a scalar and \(\bm{v}\) must be a vector. Similarly scalar-valued functions will be set with a standard weight like \(f\) whereas vector-valued functions will be set in a stronger weight like \(\bm{r}.\) It’s not worth bothering trying to draft bold symbols when handwriting mathematics, so instead vectors and vector-valued functions will be indicated with a small over-arrow like \(\vec v\) and \(\vec r.\)

Terminology & Definitions

The word space will be used to refer to our universe, the largest context we are working in, which will usually be three-dimensional real space \(\mathbf{R}^3.\) A point is a single, zero-dimensional element of space. A curve \(C\) is a locally one-dimensional collection of points in space, an simple example of which being a line. A surface \(S\) is a locally two-dimensional collection of points in space, an simple example of which being a plane. A region \(R\) is a general term referring to some connected collection of points — which could be in \(\mathbf{R}^1\) or \(\mathbf{R}^2\) or \(\mathbf{R}^3\) depending on the context — that typically serves as the domain of some function or integral. We sometime refer specifically to a region in \(\mathbf{R}^1\) as an interval \(I\) and a region in \(\mathbf{R}^3\) as an expanse \(E.\) We refer to the difference between the dimension of a collection of points and the dimension of the space it’s in as its codimension: e.g. a one-dimensional curve embedded in three-dimensional space has codimension two. To any of these collections of points, we may add the qualifier “with boundary” to indicate it might have, well, a boundary. The boundary of any collection of points has local dimension one fewer than that collection: e.g. the boundary of a surface-with-boundary is a curve, and the boundary of a curve-with-boundary are two points.

All of these previous terms have geometric connotations. If we wish to strip away that connotation and think of a space or a curve or a surface or a region as a collection of points we’ll refer to it as a set. One such of these sets may contain another — for a example a curve \(C\) may live entirely in a region \(R\) — and in this case we may say that the curve is a subset of the region, which we may denote symbolically \(C \subset R.\) If however a single point \(x\) is inside a set \(S\) we’ll refer to it as an element of \(S\) and denote this symbolically as \(x \in S.\)

Given a subset \(S\) of space we’ll let \(S^c\) denote its complement, the set of all points not in \(S.\) The notation \(\partial S\) will denote the boundary of \(S,\) all points in the space such that every ball centered on the point, no matter how small, intersects both \(S\) and \(S^c.\)

A set \(S\) is closed if \(\partial S \subset S\) and is open if its complement is closed. A set \(S\) is bounded if there exists some upper-bound \(M\) on the set of all distances from a point in \(S\) to the origin.

The set of all designated inputs of a function is its domain and the designated target set for the function’s potential outputs is its codomain; a function’s range is the subset of all outputs in the codomain that actually correspond to an input. Notationally, we’ll denote a function \(f\) having domain \(D\) and codomain \(R\) as \(f \colon D \to R.\) The symbol \(\times\) denotes the product of two spaces. For the function \(f \colon D \to R,\) the graph of \(f\) is the subset of \(D \times R\) consisting of all points \((x,y)\) for which \(x \in D\) and \(y \in R\) and \(f(x) = y.\)

An operator is a function whose inputs and outputs are functions themselves. For example, \(\frac{\mathrm{d}}{\mathrm{d}x}\) is the familiar differential operator; its input is a function \(f\) and its output \(\frac{\mathrm{d}}{\mathrm{d}x}(f)\) is the derivative of the function \(f.\) For a generic operator \(\mathcal{T}\) applied to a function \(f\) we may wish to evaluate the output function \(\mathcal{T}(f)\) at some value \(x.\) Doing so we would have to write \(\mathcal{T}(f)(x),\) or even \(\big(\mathcal{T}(f)\big)(x)\) to be precise. To avoid this cumbersome notation we will employ one of two notational tricks: (1) if there is only a single input function \(f\) in our context we omit the \(f\) and comfortably write \(\mathcal{T}(x)\) instead of \(\mathcal{T}(f)(x),\) trusting a reader perceives no ambiguity, or (2) the output function will be written in a “curried” form, \(\mathcal{T}_{f}(x)\) instead of \(\mathcal{T}(f)(x).\)

Analytic & Vector Geometry in Three-Dimensional Space

How does calculus extend to higher dimensional space? Or, how does calculus handle multiple independent or dependent variables?

In their previous experience studying calculus students learned the facts as they apply to functions \(f\colon \mathbf{R} \to \mathbf{R}\) that have a one-dimensional domain and one-dimensional codomain. These are called single-variable, scalar-valued functions, terms that allude to the one-dimensional-ness of their domain and codomain respectively. And in studying the geometry of these functions, they visualized their graphs in two-dimensional space \(\mathbf{R}^2.\) The goal now is to study how facts of calculus must be generalized in the case of functions whose domain or codomain have dimension greater than one. Functions that, via the graph or via an embedding of the domain into the codomain, can only be usefully visualized in a space of dimension greater than one. So we will start thinking about three-dimensional space \(\mathbf{R}^3.\)

First we’ll try to discuss space, surfaces, curves, etc, first from a purely analytic perspective, just to get used to thinking in higher dimensions. Then we’ll introduce vectors, and use vectors as a convenient means extend what we know about calculus of a single variable to multi-variable equations that correspond to surfaces and curves, and to extend common concepts from physics to three-dimensional space.

Three-Dimensional Space

In two-dimensional space \(\mathbf{R}^2\) we imposed a rectangular (Cartesian) coordinate system with two axes, the \(x\)-axis and \(y\)-axis, which divided the space into four quadrants, and we referred to each point by its coordinates \((x,y).\) In three-dimensional space \(\mathbf{R}^3\) we add a \(z\)-axis. The axes now divide space into eight octants, and we refer to each point by its coordinates \((x,y,z).\)

Sketch the coordinate axes and coordinate planes, both manually and digitally.

Each pair of axes forms a plane in space: the \(xy\)-plane, the \(yz\)-plane, and the \(xz\)-plane. Conventionally when drawing the axes of \(\mathbf{R}^3\) we draw the \(xy\)-plane foreshortened as if lying flat before us with the positive \(x\)-axis pointed slightly to the left of us, the positive \(y\)-axis pointing rightward, and the positive \(z\)-axis point straight up. This convention is often referred to as the “right-hand” rule, since if you point the index finger of your right hand in the positive \(x\) direction then curl your middle finger in the positive \(y\)-direction, your thumb will indicate the positive \(z\)-direction.

Plot the points \((-1,0,0)\) and \((0,3,0)\) and \((0,0,4)\) and \((1,2,3)\) and \((-2,7,-1).\)

Given points \((x_1, y_1, z_1)\) and \((x_2, y_2, z_2)\) in three-dimensional space, the distance between them is calculated by the formula. \[\sqrt{(x_2-x_1)^2+(y_2-y_1)^2+(z_2-z_1)^2}\] This formula results from applying the Pythagorean Theorem twice, and generalizes as expected to higher dimensional space. Fun fact, Pythagoras was mildly a cult leader.

Calculate the distance between \((1,2,3)\) and \((-2,7,-1).\)

Cylindrical and Spherical Coordinates

In two-dimensional space we’d usually refer to a point’s location with its rectangular coordinates, but there was a distinct other coordinate system, polar coordinates, by which we could refer to a point’s location using its distance from the origin and angle of inclination from the positive \(x\)-axis. Given a point with rectangular coordinates \((x,y),\) its polar coordinates would be \(\big(\sqrt{x^2+y^2}, \operatorname{atan2}(y,x)\big).\) Conversely given a point with polar coordinates \((r,\theta),\) its rectangular coordinates would be \((r\cos(\theta), r\sin(\theta)).\)

In three-dimensional space, there are three distinct coordinate systems we may use the describe the location of a point. The one we’ve been using, where a point is referred to by the triple of numbers \((x,y,z)\) that are it’s distance from the origin in the \(x\)-, \(y\)-, and \(z\)-directions respectively by is called rectangular (Cartesian) coordinates. The other two are based on polar coordinates in the \((x,y)\)-plane, but use a different method to describe where a point is above the plane.

In cylindrical coordinates, a point is referred to by a triple \((r, \theta, z)\) where \(r\) and \(\theta\) are the polar coordinates of the point in the \(xy\)-plane and \(z\) is the height of the point above the \(xy\)-plane. Similar to polar coordinates in two-dimensions, there is not a unique triple that describes a point, but conventionally, when given the choice, we use the triple for which \(r \geq 0\) and \(\theta\) is between \(-\pi\) and \(\pi.\) Explicitly, to convert between rectilinear and cylindrical coordinates: \[ \begin{align*} (x,y,z) &\longrightarrow \Big(\sqrt{x^2+y^2}, \operatorname{atan2}(y,x), z\Big) \\ \big(r\cos(\theta), r\sin(\theta),z\big) &\longleftarrow (r,\theta,z) \end{align*} \]

Calculate the cylindrical coordinates of \((1,2,3).\)

In spherical coordinates, a point is referred to by a triple \((\rho, \theta, \phi)\) where \(\rho\) is the point’s distance from the origin, \(\theta,\) the azimuth, is the angle in the \(xy\)-plane above which it lies, and \(\phi,\) the zenith, is the angle at which it’s declined from the positive \(z\)-axis. Again there is not a unique triple that describes a point, but conventionally we use the triple for which \(\rho \geq 0,\) \(\theta\) is between \(-\pi\) and \(\pi,\) and \(\phi\) is between \(0\) and \(\pi.\) Explicitly, to convert between rectilinear and spherical coordinates: \[ (x,y,z) \longrightarrow \Bigg(\sqrt{x^2+y^2+z^2}, \operatorname{atan2}(y,x), \operatorname{arccos}\bigg(\frac{z}{\sqrt{x^2+y^2+z^2}}\bigg)\Bigg) \\ \Big(\rho\sin(\phi)\cos(\theta), \rho\sin(\phi)\sin(\theta), \rho\cos(\phi)\Big) \longleftarrow (\rho,\theta,\phi) \]

Calculate the spherical coordinates of \((1,2,3).\)

A major reason for working in cylinder or spherical coordinates is that certain surfaces or regions are much easier to describe analytically in these coordinate systems. In cylindrical coordinates a cylinder of radius \(5\) is just \(r=5.\) In cylindrical coordinates \(r=z\) corresponds to a cone. In spherical coordinates a sphere of radius \(7\) is just \(\rho=7.\)

Before next class familiarize yourself with Desmos 3D, read a bit about Pythagoras and Pythagoreanism, review how to define a line parametrically, and review the shapes of all these curves in two-dimensional space:

\(x = 4\)

\(x+y = 4\)

\(x^2+y = 4\)

\(x+y^2 = 4\)

\(x^2+y^2 = 4\)

\(x^2-y^2 = 4\)

\(x^2+4y^2 = 4\)

Lines, Planes, and Surfaces

If you have a locally \(N\)-dimensional set sitting inside of \(M\)-dimensional space, we’ll refer to the difference \(M-N\) as the codimension of that set. The point of introducing the word “codimension” is to concisely state this fact: generically, a set with codimension \(n\) requires \(n\) equations to express.

The template equation for a plane in space passing through the point \((x_0, y_0, z_0)\) is \[ A(x - x_0) + B(y - y_0) + C(z - z_0) = 0 \qquad\text{or}\qquad Ax + By + Cz = D\,, \] where the coefficients \(A\) and \(B\) and \(C\) correspond to the “slopes” of the plane in various directions.

Plot the plane \(2(x-1)-3(y+1)+(z-2)=0,\) and determine the equations of the lines along which this plane intersects the coordinate planes,

In three dimensional space, a line is the intersection of two planes, but it’s more convenient to describe parametrically. A line through point \((x_0, y_0, z_0)\) is given by the equations

\( x(t) = x_0 + At \quad y(t) = y_0 + Bt \quad z(t) = z_0 + Ct\)

\( (x,y,z) = \bigl(x_0 + At, y_0 + Bt, z_0 + Ct\bigl)\)

where the coefficients \(A\) and \(B\) and \(C\) describe the relative “direction” of the line; every increase of \(A\) in the \(x\)-direction will correspond to an increase of \(B\) in the \(y\)-direction and \(C\) in the \(z\)-direction. If necessary the line may be re-parameterized, and expressed succinctly with two planar equations.

\(\displaystyle x = x_0 + A\bigg(\frac{t - z_0}{C}\bigg) \)

\(\displaystyle y = y_0 + B\bigg(\frac{t - z_0}{C}\bigg) \)

\(\displaystyle z = t \)

Digitally plot planes \(2(x-1)-3(y-1)+(z-2)=0\) and \(x-y-z=0,\) and give a parametric description of the line at which they intersect.

\( x = 1+4t \)

\( y = -1+3t \)

\( z = 2+t \)

A quadric surface is a surface with corresponding equation containing terms with at most degree two, e.g. \(x^2\) or \(z^2\) or \(yz.\) A cylinder in general is a surface that results from translating a planar curve along a line orthogonal to the curve. The locus of a single point from the curve along the line is referred to as a ruling, and a single “slice” along a ruling will be a cross-section, more generally called a trace, congruent to the curve. A “cylinder” as you knew it previously is a specific example we’ll now refer to as a right circular cylinder.

Plot these examples of cylinders based on their cross-sections.

\(x^2+y^2 = 4\)

Circular Cylinder

\(x^2+z^2 = 4\)

Circular Cylinder

\(x^2+y = 4\)

Parabolic Cylinder

\(x^2+4y^2 = 4\)

Elliptical Cylinder

\(x^2-y^2 = 4\)

Hyperbolic Cylinder

Plot these examples of surfaces based on their cross-sections.

\( x^2+y^2=z^2\)

Cone

\( x^2+y^2+z^2=1\)

Ellipsoid (Spheroid)

\( x^2+y^2-z^2=1\)

“Single-Sheet” Hyperboloid

\( x^2-y^2-z^2=1\)

“Double-Sheet” Hyperboloid

\( x^2+y^2=z\)

Elliptic Paraboloid

\( x^2-y^2=z\)

Hyperbolic Paraboloid

[TK] At what point(s) does the line \(\bigl(x_0 + At, y_0 + Bt, z_0 + Ct\bigl)\) intersect the surface \(x^2+3y^2=z?\)

Before next class, if you’ve seen vectors in a previous class, review them.

Vectors in Three-Dimensional Space

Sketch the vectors \(\bm{u} = \langle -3,1,4 \rangle\) and \(\bm{v} = \langle -2,-5,3 \rangle\) both manually and digitally, and use them as running example of the following definitions.

A vector in three-dimensional space is a triple of real numbers \(\langle x,y,z \rangle.\) In terms of data, a vector \(\langle x,y,z \rangle\) is no different than a point \((x,y,z).\) However, a vector comes with a different connotation. A point is static, just a location in space, whereas a vector implies direction: the vector \(\langle x,y,z \rangle\) denotes movement, usually from the origin towards the point \((x,y,z),\) or in other contexts movement from some initial point in the \(\langle x,y,z \rangle\)-direction. For these reasons a point is illustrated with a dot, whereas a vector is illustrated with an arrow.

As a conceptual example moving forward, it may be helpful to think of vectors as representing force vectors or velocity vectors of ships or aircraft.

These examples have been of vectors in three-dimensional space, but certainly we have have two-dimensional vectors, or vectors of any higher dimension too.

If the initial and terminal point of a vector have been named, say \(A\) and \(B\) respectively, that vector from \(A\) to \(B\) will be denoted \(\overrightarrow{AB}.\) More often though we just assign the vector a variable name: a bold character like \(\bm{v} = \langle x,y,z\rangle\) when typeset, or decorated with an arrow like \(\vec{v} = \langle x,y,z\rangle\) when handwritten. Conventionally, like vector variables themselves, functions that return vectors are also typeset as bold characters

The magnitude of a vector \(\bm{v}\) (sometimes called its modulus) is its length, and is denoted as \(|\bm{v}|\) or sometimes as \(\|\bm{v}\|\). Explicitly for \(\bm{v} = \langle x,y,z \rangle,\) its magnitude is \(|\bm{v}| = \sqrt{x^2+y^2+z^2}\,.\) We’ve previously used those bars \(||\) to denote the absolute value of the number they contain, but this new use is not new, only a generalization, when we remember that a number’s absolute value can be interpreted as its distance from zero. The magnitude of the vector \(\langle x,y,z\rangle\) is the distance of the point \((x,y,z)\) from zero.

Calculate the magnitude of \(\langle -3,1,4 \rangle.\)

Speaking of zero, there is certainly a zero vector denoted \(\bm{0}\) that has a magnitude of zero. Sometimes we wish to scale a vector, multiplying each of its components through by a constant to lengthen the vector, but maintain its direction. This number out front of the vector is called a scalar, the act of distributing it, multiplying each component of the vector by it, is called scalar multiplication. Scaling a vector by a negative number will result in a vector going “backwards” along the same direction.

Scale the vector \(\langle -3,1,4 \rangle\) by three, and by negative 2.

A vector consists of two pieces of information: a direction and magnitude. Sometimes it’s convenient to separate these two pieces of information. A unit vector is any vector of length one. Given a vector \(\bm{v},\) we’ll let \(\bm{\hat{v}}\) denote the unit vector. Writing \(\bm{v} = |\bm{v}|\bm{\hat{v}}\) effectively separates \(\bm{v}\) into its direction \(\bm{\hat{v}}\) and its magnitude \(|\bm{v}|.\)

Write \(\langle -3,1,4 \rangle\) as a unit vector times its magnitude.

Given two vectors \(\bm{u} = \langle u_1, u_2, u_3\rangle\) and \(\bm{v} = \langle v_1, v_2, v_3\rangle\) we can calculate their sum \(\bm{u} + \bm{v}\) as \( {\langle u_1+v_1, u_2+v_2, u_3+v_3\rangle\,.} \) This sum is the vector that would result from moving along vector \(\bm{u}\) and then moving along vector \(\bm{v},\) or vice versa. The difference \(\bm{u} - \bm{v}\) is best thought of as the vector from the head to \(\bm{v}\) to the head of \(\bm{u}.\)

Sketch generic vectors \(\bm{u}\) and \(\bm{v}\) along with their sum and difference, then digitally display the sum and difference of \(\langle -3,1,4 \rangle\) and \(\langle -2,-5,3 \rangle.\)

Sometimes it is convenient to break down a vector into a sum of its components in its coordinate directions. Define the standard basis vectors:

\(\mathbf{i} = \bigl\langle 1,0,0 \bigr\rangle\)

\(\mathbf{j} = \bigl\langle 0,1,0 \bigr\rangle\)

\(\mathbf{k} = \bigl\langle 0,0,1 \bigr\rangle\)

Using these vectors we can write any vector as explicitly as a sum of its components, i.e. \(\langle x, y, z \rangle = x\mathbf{i} + y\mathbf{j} + z\mathbf{k}.\) Note that there are many conventional notations for these vectors; you may sometimes see \(\mathbf{i}\) written as \(\vec{\imath}\) or \(\bm{i}\) or \(\bm{\hat\imath}\) or \(\bm{\vec \imath}\,.\)

For example \(\langle -3,1,4 \rangle = -3\mathbf{i} + \mathbf{j} + 4\mathbf{k}\,.\)

Before next class, review what the cosine of an angle is, and review the law of cosines.

The Dot Product and Projections

Can we “multiply” vectors?

How do we calculate the angle between two vectors.

Their are multiple “products” of vectors; today we talk about one of them. Given two vectors \(\bm{u} = \langle u_1, u_2, u_3 \rangle\) and \(\bm{v} = \langle v_1, v_2, v_3 \rangle,\) their dot product is the scalar value \({\bm{u}\cdot\bm{v} = u_1v_1 + u_2v_2 + u_3v_3\,.} \) Sometimes this is called the scalar product, or the inner product of vectors.

Compute \(\langle -3,1,4 \rangle \cdot \langle -2,-5,3 \rangle.\)

Some arithmetic properties of this operations should be explicitly pointed out:

\(\bm{v}\cdot\bm{0} = 0\)

\(\bm{v}\cdot\bm{v} = |\bm{v}|^2\)

\(\bm{u}\cdot\bm{v} = \bm{v}\cdot\bm{u}\)

\(\bm{u}\cdot(\bm{v}+\bm{w}) = \bm{u}\cdot\bm{v}+\bm{u}\cdot\bm{w}\)

\((c\bm{u})\cdot\bm{v} = c(\bm{u}\cdot\bm{v}) = \bm{u}\cdot(c\bm{v})\)

An alternative characterization of the dot product, sometimes taken to be an alternative definition, involves the angle \(\theta\) between the vectors involved. \[\bm{u}\cdot\bm{v} = |\bm{u}||\bm{v}| \cos(\theta) \]

Prove this by writing the law of cosines \(|\bm{u}-\bm{v}|^2 = |\bm{u}|^2 + |\bm{v}|^2 - 2\bm{u}|||\bm{v}|\cos\(\theta),\) expanding the RHS in terms of its components, and tidying up.

What is the measure of the angle between \(\langle -3,1,4 \rangle\) and \(\langle -2,-5,3 \rangle?\)

One interpretation of this characterization is that the dot product is a length-corrected measure of the angle between two vectors. Some specific facts following from this characterization: (1) If two vectors are parallel, their dot product will simply be the product of their lengths. (2) Two vectors are orthogonal — perpendicular, at a right angle to each other — if and only if their dot product is zero. (3) The angle between two vectors is acute if and only if their dot product is positive, and the angle between two vectors is obtuse if and only if their dot product is negative.

But there is another interpretation of the dot product that is useful to always keep in mind. The projection of \(\bm{u}\) onto \(\bm{v},\) denoted \( \operatorname{proj}_{\bm v}(\bm u),\) is the vector parallel to \(\bm{v}\) that lies “orthogonally underneath” \(\bm{u},\) as if \(\bm{u}\) were casting a shadow onto \(\bm{v}.\) The projection is calculated as the formula \[ \operatorname{proj}_{\bm v}(\bm u) = \bigg(\frac{\bm{u}\cdot\bm{v}}{|\bm{v}|^2}\bigg) \bm{v}\,. \]

Draw pictures of projection with example and explain why that formula works.

For \(\bm{u} = \langle -3,1,4 \rangle\) and \(\bm{v} = \langle -2,-5,3 \rangle\) plot and compute \(\operatorname{proj}_{\bm v}(\bm u)\) and \(\operatorname{proj}_{\bm u}(\bm v).\)

Now taking that last formula and taking the dot product of both sides by \(\bm{v}\) we get that \({\bm{u}\cdot\bm{v} = \operatorname{proj}_{\bm v}(\bm u) \cdot \bm{v}\,.} \) And since \(\operatorname{proj}_{\bm v}(\bm u)\) and \(\bm{v}\) are parallel, their dot product is simply the product of their lengths: \({\bm{u}\cdot\bm{v} = |\operatorname{proj}_{\bm v}(\bm u)||\bm{v}|\,.} \) Similarly, due to symmetry between the vectors \(\bm{u}\) and \(\bm{v},\) we also have \({\bm{u}\cdot\bm{v} = |\operatorname{proj}_{\bm u}(\bm v)||\bm{u}|\,.} \) The interpretation of the dot product we get from this is as a generalization of the usual product of numbers: the dot product of two vectors is the product of their lengths once one is projected onto the same line as the other.

In fact, this can be made a little stronger: given any third vector \(\bm{w},\) such that the angle between \(\bm{u}\) and \(\bm{w}\) is \(\theta\) and the angle between \(\bm{v}\) and \(\bm{w}\) is \(\phi,\) then projecting both \(\bm{u}\) and \(\bm{v}\) onto \(\bm{w}\) we get \[|\bm{u}||\bm{v}|\cos(\theta)\cos(\phi) = |\operatorname{proj}_{\bm w}(\bm u)||\operatorname{proj}_{\bm w}(\bm v)|\,.\] This left-hand-side is nearly the dot product of those vectors, especially in light of the identity \[ \cos(\theta)\cos(\phi) = \frac{1}{2}\Big(\cos(\theta+\phi)+\cos(\theta-\phi)\Big) \] and that right-hand-side is an average of the cosines of the sum and difference which seems to address the whole “sign” issue some folks seems to have, but I’m not sure how to cleanly consolidate this.

Before next class, review the law of sines, and matrix determinants (if you’ve seen them), and know how to prove that \(\frac{1}{2}AB\sin(\theta)\) area formula.

The Cross Product and Areas

For any two vectors in space, there must be a vector that’s orthogonal (perpendicular) to both of them … how can we compute it?

Does the cross-product relate to the law of sines?

What is the cross product in relation to the determinant? I think it’s halfway between a two-dimensional determinant and three-dimensional determinant. Is it just a “partially applied” determinant? The lecture should start here to orient any students already familiar with linear algebra.

Given a brief “elevator pitch” on two- and three-dimensional matrix determinants for the students who have seen linear algebra, and let them know the computation we’re seeing today is a determinant.

Talk about building the orthogonal direction by coming up with a vector \(\bm{n}\) such that \(\bm{u}\cdot\bm{n} = 0\) and \(\bm{v}\cdot\bm{n} = 0,\) but skip to the symbol-soup to avoid wasting class.

Given two vectors \(\bm{u} = \langle u_1, u_2, u_3 \rangle\) and \(\bm{v} = \langle v_1, v_2, v_3 \rangle,\) their cross product is the vector \[\bm{u}\times\bm{v} = \langle u_2v_3 - u_3v_2, -u_1v_3 + u_3v_1, u_1v_2 - u_2v_1\rangle\,. \] But remembering this formula can be a bit tough, so we usually phrase it in terms of a matrix determinant instead, which can be computed via the rule of Sarrus.

\(\displaystyle \bm{u}\times\bm{v} = \det\begin{pmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \end{pmatrix} %&= \det\begin{pmatrix} u_2 & u_3 \\ v_2 & v_3\end{pmatrix}\mathbf{i} %+ \det\begin{pmatrix} u_1 & u_3 \\ v_1 & v_3\end{pmatrix}\mathbf{j} %+ \det\begin{pmatrix} u_1 & u_2 \\ v_1 & v_2\end{pmatrix}\mathbf{k} \)

\(\displaystyle \begin{matrix} & & & \mathbf{i} & \mathbf{j} & \mathbf{k} & & & \\ & & u_3 & u_1 & u_2 & u_3 & u_1 & & \\ & v_2 & v_ 3 & v_1 & v_2 & v_3 & v_1 & v_2 & \end{matrix} \)

Compute the cross product \(\langle 1,2,3 \rangle \times \langle 5,4,1 \rangle.\)

Compute the cross product \(\langle -3,1,4 \rangle \times \langle -2,-5,3 \rangle.\)

Always remember that this cross product returns a vector that, by design, is orthogonal to its factors. You can determine the direction of the cross product vector via the right-hand rule. On a related note, the cross product is not quite a commutative operation, but instead \(\bm{u} \times \bm{v} = -\bm{v} \times \bm{u}.\) Additionally, if two vectors are parallel, there is not a unique direction that is orthogonal to each of them, and so the cross product of parallel vectors will be the zero vector. Here are other properties of the cross product as a vector operation.

\((c\bm{u})\!\times\!\bm{v} = c(\bm{u}\!\times\!\bm{v}) = \bm{u}\!\times\!(c\bm{v})\)

\(\bm{u}\!\times\!(\bm{v}+\bm{w}) = \bm{u}\!\times\!\bm{v}+\bm{u}\!\times\!\bm{w}\)

\((\bm{v}+\bm{w})\!\times\!\bm{u} = \bm{v}\!\times\!\bm{u}+\bm{w}\!\times\!\bm{u}\)

\(\bm{u}\!\cdot\!(\bm{v}\!\times\!\bm{w}) = (\bm{u}\!\times\!\bm{v})\!\cdot\!\bm{w} \)

\(\bm{u}\!\times\!(\bm{v}\!\times\!\bm{w}) = (\bm{u}\!\cdot\!\bm{w})v - (\bm{u}\!\cdot\!\bm{v})w \)

Just like how the dot product of two vectors has an alternative characterization involving the trigonometry of the angle between them, there is a similar alternative characterization of the cross-product \(\bm{u} \times \bm{v}\) as the unique vector orthogonal to \(\bm{u}\) and \(\bm{v}\) with magnitude \(|\bm{u}||\bm{v}| \sin(\theta)\) and direction dictated by the right-hand rule. The relationship between these two characterizations of the cross product is hidden slightly afield within the study of linear algebra. However the familiar right-hand side of the latter equation should give us a hint; it looks like the area formula for a triangle, and the determinant is a measure how much area scales. Without going into more detail, here’s the fact to know: the magnitude of the cross product \(\bm{u} \times \bm{v}\) is the area of the parallelogram determined by \(\bm{u}\) and \(\bm{v}.\)

Compute the area of the triangle framed by \(\langle -3,1,4 \rangle\) and \(\langle -2,-5,3 \rangle.\)

This previous fact generalizes to higher dimensions. The scalar triple product \[ \bm{u}\cdot(\bm{v}\times\bm{w}) = \det\begin{pmatrix} u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{pmatrix} \] is the (signed) volume of the parallelepiped framed by the vectors \(\bm{u}\) and \(\bm{v}\) and \(\bm{w}.\)

More than just framing a parallelogram/triangle it should be noted that the span of two vectors define an entire plane. Previously when we introduced the equation \( A(x - x_0) + B(y - y_0) + C(z - z_0) = 0 \) for a plane, there was a fact we glossed over: those “directions” \(A\) and \(B\) and \(C\) are actually the components of a vector normal to the plane. I.e. \(\bm{u}\times\bm{v}\) will give you those coefficients \(A\) and \(B\) and \(C\) for the plane that \(\bm{u}\) and \(\bm{v}\) span.

What’s an equation for the plane spanned by \(\langle -3,1,4 \rangle\) and \(\langle -2,-5,3 \rangle\) that passes through the origin? What about the plane that passes through \((6,7,-9)?\)

Before next class, review lines and surfaces from last week.

Distances Between Points, Lines, and Planes

Given two geometric objects — points, lines, planes, etc — how do we compute the (shortest) distance between them?

Previously we talked about lines and surfaces in three-dimensions space from a purely analytic perspective. Since introducing vectors, we should talk about how to denote the equations of lines and surfaces using vector-centric notation.

Recall that, generically, the equation for the plane passing through the point \((x_0, y_0, z_0)\) is \[ A(x - x_0) + B(y - y_0) + C(z - z_0) = 0\,, \] where the coefficients \(A\) and \(B\) and \(C\) correspond to the “slopes” in various directions. But these coefficients have another interpretation. A vector is normal to a surface at a point if it is orthogonal to any tangent vector to the surface at that point. I.e. it “sticks straight out” from the surface. The vector \(\langle A,B,C \rangle\) is normal to that plane.

Draw picture and work through example.

Note the consistency with graphs of single-variable lines.

Any two vectors and a point determine a unique plane in three dimensional space. Considering this last fact, we now know how to determine this plane.

Write an equation of the plane passing through the points \((1,2,3)\) and \((-3,1,4)\) and \((-2,-5,3)?\)

Parametrically a line through point \((x_0, y_0, z_0)\) is given by the equations \[ x = x_0 + At \qquad y = y_0 + Bt \qquad z = z_0 + Ct\,, \] where the coefficients \(A\) and \(B\) and \(C\) describe the “direction” in which the line is headed. In the language of vectors, the line is headed in the direction \(\bm{v} = \langle A,B,C \rangle\) after starting at the initial point \(\bm{r}_0 = \langle x_0,y_0,z_0 \rangle.\) which means the line can be expressed in terms of vectors as \(\bm{r} = \bm{r}_0 + \bm{v}t.\)

Parameterize the line segment between the points \((2,4,5)\) and \((-2,3,1)\) for a parameter \(t\) such that \(t=0\) at the first point and \(t=1\) at the second.

TK At what angle do these two planes intersect?

Compute the shortest distance to a plane from a point not on that plane. This idea of projecting ends up being the key. Given a plane in space with normal vector \(\bm{n}\) and containing a point \(Q,\) and given a point \(P\) not on that plane, if \(\overrightarrow{QP} = \bm{v},\) the shortest distance from the point \(P\) to the plane will be \(\bigl|\operatorname{proj}_{\bm{n}}(\bm{v})\bigr|.\)

What’s the shortest distance from the point \(\bigl(3,-3,10\bigr)\) to the plane \({2x-3y+6z=-23?}\) What are the coordinates of the point on that plane closest to that point?

Note \((-1,1,-3)\) lies on the plane. The distance is \(14\) and the closest point is \((-1,3,-2).\)

Compute the shortest distance to a line from a point not on that line. Call the point \(P,\) and let \(R\) be the point on the line closest to \(P.\) Let \(\bm{v}\) be the direction vector of the line and let \(Q\) be the coordinates of any point on the line, both of which can be inferred from an parameterization of the line. Let \(\theta\) be the angle between \(\bm{u}\) and \(\bm{v}.\) Considering the right triangle \(\triangle PQR,\) the distance from \(P\) to the plane — the distance between \(P\) to \(R\) — will be \(|\bm{u}|\sin\bigl(\theta\bigr).\) Then either notice that this distance is equal to \(\frac{|\bm{u}\times\bm{v}|}{|\bm{v}|}\) or compute \(\theta\) explicitly from the dot product.

What’s the shortest distance from the point \(\bigl(1,-1,-1\bigr)\) to the line \(\bigl\langle 3,1,4 \bigr\rangle t + \bigl(6,6,0\bigr)?\) What are the coordinates of the point on that line closest to the point?

The distance is \(7\) and the closest point is \((3,5,-4).\)

Compute the shortest distance between two skew (non-intersecting) lines. Let \(P + \bm{w}t\) and \(Q + \bm{v}t\) be parameterizations of the two lines. If the lines are parallel, i.e. \(\bm{w}\) and \(\bm{v}\) are parallel, then the distance between the lines will be the distance from the point \(P\) to the line \(Q + \bm{v}t.\) Otherwise for each line, there is a unique plane containing that line that doesn’t intersect the other line. I.e. there is a pair of parallel planes containing those lines. Each plane will have normal vector \(\bm{n} = \bm{w}\times\bm{v},\) and the distance between the lines will be the distance from \(Q\) to the plane containing \(P\) (or vice-versa).

What’s the shortest distance between the lines \({\bigl\langle 0,-2,1 \bigr\rangle t + \bigl(3,0,2\bigr)}\) and \({\bigl\langle 4,3,-2 \bigr\rangle t + \bigl(0,3,11\bigr)}?\)

The distance is \(9.\)

For each of these types of problems, let \(R\) denote the point on the line or plane closest to \(P.\) All of these problems could also be solved by finding the coordinates of the \(R\) as the intersection of some line and plane and computing the length of \(\overline{PR}.\)

Before next class, if you’ve seen them in a physics class, review the concepts of force, work, power, torque, etc.

Force, Work, Torque, etc

Time for a pop quiz

Today we consolidate all this talk of vectors with concepts you may have seen in physics.

A force, having a direction and magnitude (measured in Newtons), can be modelled by a vector. You may already be familiar with this if you’ve been drawing free-body diagrams in your physics classes. The resultant (net) force on a mass is the sum of all the individual component forces acting on that mass.

An airplane has velocity vector \(\bigl\langle 120, 442 \bigr\rangle\) mph and is flying at a constant altitude. What’s its speed? Suppose now there is a \(40\) mph wind blowing from the northwest. What is the course of the plane? What is the actual speed of the plane along this course?

Work done by a force acting on an object is the amount of force accumulated as the object is displaced; work is a scalar quantity. The basic formula, \(W = F x,\) only applies if \(F\) and \(x\) are constants and so in a previously class, to account for the possibility \(F\) was a function, we generalized this formula to \(W = \int F \,\mathrm{d}x.\) But now even that formula only applies if the displacement is in the same direction as the force being applied. If the force and displacement are in different directions, we need to generalize this formula further. Let \(\bm{F}\) be the (constant) force vector with magnitude \(|\bm{F}|\) and let \(\bm{x}\) be the displacement vector with magnitude \(|\bm{x}|.\) Calculating work, it’s no longer the magnitude of the force itself that matters, but only the magnitude of the component of the force parallel to the displacement: \(|\bm{F}|\cos(\theta)\) where \(\theta\) is the angle between the vectors. Specifically \[ W = \Big(|\bm{F}|\cos(\theta)\Big)|\bm{x}| = \bm{F}\cdot\bm{x}\,.\] I.e. work is the dot product of the force and displacement vectors.

Suppose a fella is pushing a mine cart with 123 N of force directed at an angle 54° degrees above the horizontal. How much work does he do pushing the cart 89 m?

Remembering that if the force being applied is variable then we need to compute work as an integral. Later we’ll generalize this even further to account for the possibility that the displacement vector \(\bm{x}\) may vary. \[ W = \int\limits_C \bm{F} \cdot \mathrm{d}\bm{x} = \int\limits_{t_1}^{t_2} \bm{F}\cdot\frac{\mathrm{d}\bm{x}}{\mathrm{d}t} \,\mathrm{d}t\,. \]

The integrand in that last integral has a name. Power \(P\) is the derivative of work over time, also a scalar, that measures the rate at which work is being done. It can be computed as the dot product of the force and velocity vectors: \[ P = \dot{W} = \frac{\mathrm{d}W}{\mathrm{d}t} = \bm{F}\cdot\bm{v}\,. \]

The last formula is valid even in the more general situation when the force is nonconstant and applied along a curve \(C\) due to the fundamental theorem of calculus \[ P = \frac{\mathrm{d}W}{\mathrm{d}t} = \frac{\mathrm{d}}{\mathrm{d}t} \int\limits_C \bm{F}\cdot\mathrm{d}\bm{x} = \frac{\mathrm{d}}{\mathrm{d}t} \int\limits_{t_1}^{t_2} \bm{F}\cdot\frac{\mathrm{d}\bm{x}}{\mathrm{d}t} \,\mathrm{d}t = \bm{F}\cdot\bm{v}\,. \]

Torque, a vector, is the rotational analogue of linear force, and measures the tendency of a body to rotate about the origin. Imagine you’re using a wrench to tighten a bolt. The bolt-head is the origin, the direction of the wrench is described by a (radial) position vector \(\bm{r}.\) If we apply a force \(\bm{F}\) to the tip of the wrench handle, the torque \(\bm{\tau}\) is computed as \[\bm{\tau} = \bm{r} \times \bm{F}\,.\] Note the order of the components of this cross product — the direction of the vector \(\bm{\tau}\) will be determined mathematically by the right-hand-rule, but be determined physically the direction of the threads on the bolt: righty-tighty lefty-loosey.

Torque is the first moment of force: \(\bm{\tau} = \bm{r} \times \bm{F}.\) Torque is the rate of the change of angular momentum: \(\bm{\tau} = \frac{\mathrm{d}\bm{L}}{\mathrm{d}t}.\) The derivative of torque over time is called rotatum.

Momentum is a vector quantity, is the product of an object’s mass with its velocity: \(\bm{p} = m\bm{v}.\) Force is the derivative of momentum: \(F = dp/dt,\) and impulse \(J\) is the change in momentum between t_1 and t_2 : \[ \Delta p = J = \int\limits_{t_1}^{t_2} F(t)\,\mathrm{d}t \] Angular momentum is first moment of momentum: \(\bm{R} = \bm{R}\times \bm{p}.\)

The moment of inertia (rotational inertia) is the second moment of mass, the torque needed for a desired angular acceleration. Mass|Force→Acceleration as RotationalInertia|Torque→AngularAcceleration.

Calculus of Vector-Valued Functions: Curves

Now equipped with the language of vectors we can begin studying how calculus generalizes to functions whose domain or codomain are more than one-dimensional. In particular we’ll start with vector-valued functions \(\mathbf{R} \to \mathbf{R}^3.\)

Parametrically-Defined Curves and Surfaces

A vector-valued function \(\bm{r} \colon \mathbf{R} \to \mathbf{R}^3\) can be thought of as a parameterization of a curve \(C\) in three-dimensional space. The domain \(\mathbf{R}\) is literally a line that \(\bm{r}\) curls up and puts in \(\mathbf{R}^3.\)

Example: \(\bm{r}(t) = \bigl\langle \cos\left(3t\right)\left(2\!+\!\cos\left(2t\right)\right), \sin\left(4t\right), \sin\left(3t\right)\left(2\!+\!\cos\left(2t\right)\right) \bigr\rangle.\)

Conventionally the letter \(\bm{r}\) is chosen since its output is the radial vector emanating from the origin to a point on the space curve.

Example: \(\bm{r}(t) = \bigl\langle -4t-1, 3t+1, 2t+1 \bigr\rangle \) is a line.

Example: \(\bm{r}(t) = \bigl\langle t^3-4t-1, 3t+1, 2t+1 \bigr\rangle \) can be visualized via its projections onto the three coordinate planes.

Generically the domain of a parameterization of a curve is \(\mathbf{R},\) but sometimes we only want to refer to a segment of the curve, in which case we restrict the domain, and say “consider \(\bm{r}(t)\) for \(t \in [-2,3]\)” or something, where \(t \in [-2,3]\) means the same thing as \(-2 \leq t \leq 3.\)

Example: \(\bm{r}(t) = \bigl\langle t, t^2, t^3 \bigr\rangle \) is the twisted cubic and serves as a good starting example moving forward.

Example: \(\bm{r}(t) = \bigl\langle \cos(t), \sin(t), t \bigr\rangle \) is the unit helix and serves as a good starting example moving forward.

While the “ends” of many curves shoot away to infinite, some curves will loop back on themselves. A closed curve is a curve with parameterization for which a bounded interval of the domain traces out the entire curve.

\(\bm{r}(t) = \bigl\langle \cos\left(3t\right)\left(2\!+\!\cos\left(2t\right)\right), \sin\left(4t\right), \sin\left(3t\right)\left(2\!+\!\cos\left(2t\right)\right) \bigr\rangle\) parameterizes a closed curve, a figure-eight knot because the entire knot is traced out for \(t \in [0,2\pi].\)

A curve is rectifiable if its arclength can be approximated to arbitrary precision as the sum of the lengths by finitely many segments connecting points on the curve. The graph of \(f(x) = x\sin(1/x)\) is not rectifiable. The boundary of a fractal is not rectifiable. We will only consider rectifiable curves.

A vector-valued function \(\bm{r}\) can be thought of as a parameterization of a curve \(C,\) whereas a curve \(C\) has infinitely many possible parameterizations. Fact: for a parameterization \(\bm{r}\) of a curve \(C\) and a real-valued function \(f,\) the function \(\bm{r}\circ f\) is also a parameterization of \(C\) Thinking of a curve not just as a static thing is space, but as a path along which a particle is travelling, \(f\) changes the “speed” at which the particle travels.

Let \(\bm{r}(t) = \bigl\langle \cos\left(3f(t)\right)\left(2\!+\!\cos\left(2f(t)\right)\right), \sin\left(4f(t)\right), \sin\left(3f(t)\right)\left(2\!+\!\cos\left(2f(t)\right)\right) \bigr\rangle.\) For \(f(x) = ax\) the particle is “sped up” by a factor of \(a.\) For \(f(x) = \frac{\pi}{4}t + \sin(t)\) the particle will double-back occasionally.

Just like a function \(\bm{r}\colon \mathbf{R} \to \mathbf{R}^3\) parameterizes a curve, a function \(\bm{r}\colon \mathbf{R}^2 \to \mathbf{R}^3\) parameterizes a surface, literally taking a plane \(\mathbf{R}^2\) and curling it up in space.

Example: \(\bm{r}(s,t) = \bigl\langle s\cos(t), t\sin(s), st \bigr\rangle.\)

Example: \(\bm{r}(s,t) = \bigl\langle -3-1s-2t, 2+s-2t, 1+s+t \bigr\rangle\) is a plane.

Example: \(\bm{r}(s,t) = \bigl\langle -3-1s-2t, 2+s-2t, 1+s^2+t^2 \bigr\rangle\) is a paraboloid.

Example: \(\bm{r}(s,t) = \bigl\langle 3\cos(s)\sin(t), 3\sin(s)\sin(t), \cos(t) \bigr\rangle\) is a sphere.

Before next class, ponder on what \(\bm{r}'(t)\) and \(\bm{r}''(t)\) represent.

Calculus with Vector-Valued Functions

Today we finally talk about calculus. The limit of a vector-valued function is the limit of its individual component functions. The derivative of a vector-valued function is the derivative of its individual component functions. The integral of a vector-valued function is the integral of its individual component functions. This latter one though, the integral, has no real reasonable interpretation geometrically. The derivative however gives us quite a bit. First some mechanics: for vector-valued functions \(\bm{r}\) and \(\bm{\rho},\) scalar \(c,\) and scalar-valued function \(f,\)

\(\displaystyle \frac{\mathrm{d}}{\mathrm{d}t}\Bigl(\bm{r}(t) + \bm{\rho}(t)\Bigr) = \bm{r}'(t) + \bm{\rho}'(t) \)

\(\displaystyle \frac{\mathrm{d}}{\mathrm{d}t}\Bigl(c\bm{r}(t)\Bigr) = c\bm{r}'(t)\)

\(\displaystyle \frac{\mathrm{d}}{\mathrm{d}t}\Bigl(f(t)\bm{r}(t)\Bigr) = f'(t)\bm{r}(t) + f(t)\bm{r}'(t)\)

\(\displaystyle \frac{\mathrm{d}}{\mathrm{d}t}\Bigl(\bm{r}(t) \cdot \bm{\rho}(t)\Bigr) = \bm{r}'(t)\cdot\bm{\rho}(t) + \bm{r}(t)\cdot\bm{\rho}'(t) \)

\(\displaystyle \frac{\mathrm{d}}{\mathrm{d}t}\Bigl(\bm{r}(t) \times \bm{\rho}(t)\Bigr) = \bm{r}'(t)\times\bm{\rho}(t) + \bm{r}(t)\times\bm{\rho}'(t) \)

\(\displaystyle \frac{\mathrm{d}}{\mathrm{d}t}\Bigl(\bm{r}\bigl(f(t)\bigr)\Bigr) = \bm{r}'\bigl(f(t)\bigr)f'(t) \)

The last one is the chain rule, and the three before all the product rule for various products.

A vector-valued function \(\bm{r}\) is continuous at a point at \(t = c\) if \(\lim_{t \to c} \bm{r}(t) = \bm{r}(c),\) and is continuous on its domain if it’s continuous at every point in its domain. A vector-valued function is smooth if the derivatives of its component functions exist for all orders. A curve itself is smooth if it has a smooth parameterization. A parameterization \(\bm{r}(t)\) of a curve is regular if \(\bm{r}'(t) \neq \bm{0}\) for any \(t\) — the parameterization never stops along the curve. Analogously, a parameterization \(\bm{r}(s,t)\) of a surface is regular if \(\frac{\mathrm{d}\bm{r}}{\mathrm{d}s}\) and \(\frac{\mathrm{d}\bm{r}}{\mathrm{d}t}\) are not parallel (not linearly dependent) at any point.

Pre-composing with \(f(x) = \frac{\pi}{4}t + \sin(t)\) create an irregular parameterization of most curves, the unit helix in particular: \(\bigl\langle \cos(t), \sint(t), t \bigr\rangle.\)

For a curve \(C\) with parameterization \(\bm{r}\) that represents the position of a particle along that curve in space, the derivative \(\bm{r}'(t)\) represents the velocity vector \(\bm{v}(t)\) of the particle, and likewise \(\bm{r}''(t)\) represents its acceleration vector \(\bm{\alpha}(t).\) While \(\bm{r}(t)\) represents where the particle is While \(\bm{r}'(t)\) represents where the particle is going and \(\bm{r}''(t)\) represents where the particle is going to be going, or where it’s being pulled.

Calculate the velocity and acceleration vectors to \(\bm{r}(t) = \bigl\langle 2t, \ln(t), t^2 \bigr\rangle\) at the point where \(t=2.\)

The line with parameterization \(\bm{r}(t_0) + \bm{r}'(t_0)t\) will be tangent to the curve \(C\) at the point \(\bm{r}(t_0).\)

Write down a parameterization the line tangent to \(\bm{r}(t) = \bigl\langle 2t, \ln(t), t^2 \bigr\rangle\) at the point where \(t=2.\)

Recall the formula for the arclength of a parameterized curve in \(\mathbf{R}^2.\) It’s hardly any different in higher dimensions. Given \(\bm{r}(t) = \langle x(t), y(t), z(t)\rangle,\) the arclength of the curve \((x,y,z) = \bm{r}(t)\) between \(t=a\) and \(t=b\) is \[ \int\limits_a^b \sqrt{\bigg(\frac{\mathrm{d}x}{\mathrm{d}t}\bigg)^2 + \bigg(\frac{\mathrm{d}y}{\mathrm{d}t}\bigg)^2 + \bigg(\frac{\mathrm{d}z}{\mathrm{d}t}\bigg)^2} \,\mathrm{d}t \quad = \quad \int\limits_a^b \big|\bm{r}'(t)\big| \,\mathrm{d}t \] We can re-parameterize a curve in terms of it's arclength to get a “canonical” parameterization. For a curve \(\bm{r}\) define an arclength function \(s\) as \[s(t) = \int\limits_0^t \big|\bm{r}'(u)\big| \,\mathrm{d}u\] and notice that \(s\) is a strictly increasing function, and so it must be invertible on its entire domain. The parameterization \(\bm{r}\circ s^{-1}\) is the canonical one we’re talking about: up to a choice of “anchor point” where \(t=0\), it has the property that the point corresponding to a specific \(t = \ell\) is at a distance to \(\ell\) from that anchor point.

Write down the arclength parameterization of the curve with parameterization \(\bm{r}(t) = \bigl\langle 2t, \ln(t), t^2 \bigr\rangle.\)

The Frenet-Serret Frame

If a vector-valued function \(\bm{r}\) has constant magnitude, then it will be orthogonal to its derivative.

The unit tangent vector points in the same direction as \(\bm{r}'(t)\) but normalized, with the “speed”/magnitude removed: \[ \mathbf{T}(t) = \frac{\bm{r}'(t)}{|\bm{r}'(t)|} \]

Since \(\mathbf{T}(t)\) is a unit vector, it will be orthogonal to its derivative. Define the unit normal vector as \[\mathbf{N}(t) = \frac{\mathbf{T}'(t)}{\big|\mathbf{T}'(t)\big|}\] which as a vector indicates the “direction the curve is curving”. It will lie in the same plane as the vectors \(\mathbf{T}\) and \(\bm{r}''(t).\) Then we define the unit binormal vector as \(\mathbf{B}(t) = \mathbf{T}(t)\times\mathbf{N}(t).\)

Altogether these three vectors form the Frenet-Serret frame, also more descriptively called the TNB frame, which provides a local coordinate system that describes the movement of a point along a curve relative to the point’s position — as opposed to an absolute coordinate system that a parameterization \(\bm{r}\) provides in terms of the basis vectors \(\mathbf{i}\) and \(\mathbf{j}\) and \(\mathbf{k}.\)

Some additional terminology: The plane spanned by \(\mathbf{T}\) and \(\mathbf{N}\) is called the osculating plane. Similarly the plane spanned by \(\mathbf{N}\) and \(\mathbf{B}\) is called the normal plane and the tangential direction of the curve is normal to this plane. The plane spanned by \(\mathbf{T}\) and \(\mathbf{B}\) is called the rectifying plane.

The acceleration vector \(\bm{\alpha}\) always lies in the osculating plane, and can be resolved in terms of the unit tangent and normal vectors as \[\bm{\alpha} = |\bm{v}|'\mathbf{T}+|\bm{v}||\mathbf{T}'|\mathbf{N}\,.\]

Calculate formulas for the tangent, normal, and binormal vectors at a general point on the curve with parameterization \(\bm{r}(t) = \bigl\langle 2t, \ln(t), t^2 \bigr\rangle.\)

Curvature & Torsion

The curvature of a smooth curve with parameterization \(\bm{r}(t)\) is a measure of how eccentrically it’s turning at a point — how tightly the curve bends, how much the curve fails to lie on its tangent line line — and is defined as \[ \kappa \!=\! \frac{\mathrm{d}{\mathbf{T}}}{\mathrm{d}s} \cdot \mathbf{N} \;\;\implies\;\; \kappa(t) \!=\! \frac{\bigl|\mathbf{T}'(t)\bigr|}{\bigl|\bm{r}'(t)\bigr|} \!=\! \frac{\bigl|\bm{r}'(t) \times \bm{r}''(t)\bigr|}{\bigl|\bm{r}'(t)\bigr|^3} \] where \(s\) is the arclength function. That last equality deserves a proof (TK theorem 10 page 945).

Orienting fact: the curvature of a circle of radius \(R\) is \(\frac{1}{R}.\) The osculating circle is the circle that lies in the osculating plane that touches (“kisses”) the curve at a point and has the same curvature as the curve at the point (Latin osculum for to kiss).

The torsion at a point along a smooth curve is a measure of how eccentrically it’s twisting in addition to turning — it’s a measure how much the curve fails to lie in a plane, its osculating plane, at that point. Orienting fact: the torsion of a helix of radius \(R\) is \(\frac{1}{2R}.\) \[ \tau \!=\! -\frac{\mathrm{d}\mathbf{B}}{\mathrm{d}s} \cdot \mathbf{N} \;\;\implies\;\; \tau(t) \!=\! - \frac{\mathbf{B}'(t) \cdot \mathbf{N}(t)}{\bigl|\bm{r}'(t)\bigr|} \!=\! \frac{\bigl(\bm{r}'(t) \times \bm{r}''(t)\bigr) \cdot \bm{r}'''(t)}{\bigl|\bm{r}'(t) \times\bm{r}''(t) \bigr|^2} \]

That last equality deserves a proof (TK exercise #72). The negative sign is just a convention, but corresponds to a curve coming up out of the plane as having positive torsion. Curvature is the failure of a curve to be linear

The Frenet-Serret formulas describe the kinematic properties of an object moving along a path/curve in space, irrespective of any absolute coordinate system. I.e. the motion of an object moving smoothly in space is completely determined by its curvature and torsion at a moment in time.

\(\displaystyle \frac{\mathrm{d}\mathbf{T}}{\mathrm{d}s} = \kappa \mathbf{N} \quad \frac{\mathrm{d}\mathbf{N}}{\mathrm{d}s} = -\kappa \mathbf{T} + \tau \mathbf{B} \quad \frac{\mathrm{d}\mathbf{B}}{\mathrm{d}s} = -\tau \mathbf{N} \)

\(\displaystyle \begin{pmatrix} \mathbf{T}' \\ \mathbf{N}' \\ \mathbf{B}' \end{pmatrix} = \begin{pmatrix} 0 & \kappa & 0 \\ -\kappa & 0 & \tau \\ 0 & -\tau & 0 \end{pmatrix} \begin{pmatrix} \mathbf{T} \\ \mathbf{N} \\ \mathbf{B} \end{pmatrix} \)

Calculate formulas for the curvature and torsion at a general point on the curve with parameterization \(\bm{r}(t) = \bigl\langle 2t, \ln(t), t^2 \bigr\rangle.\)

Projectile Motion & Kepler’s Laws

Instead of \(\bm{r}(t)\) tracing out a curve in space, we can think of it as being the location of something — a particle — at a time \(t;\) the curve is the path of the particle. Its velocity is the vector \(\bm{r}'(t);\) its magnitude is the “speed”. The second derivative vector \(\bm{r}'(t)\) to a curve is its acceleration vector. Etc.

For a projectile fired from the origin \((0,0,0)\) within the \(xy\)-plane under in the influence of gravitational acceleration \(g\) in the \(z\) direction, the parametric equations that describe its trajectory are

\(\displaystyle x(t) = \big(v_0 \cos(\theta)\big)t\)

\(\displaystyle y(t) = 0\)

\(\displaystyle z(t) = \big(v_0 \sin(\theta)\big)t - \frac{1}{2}gt^2\)

The acceleration vector \(\bm{\alpha}\) always lies in the osculating plane, and can be written as \( \bm{\alpha} = |\bm{v}|'\mathbf{T}+\kappa|\bm{v}|^2\mathbf{N}. \) As proof \[ \bm{v} = |\bm{v}|\mathbf{T} \\\implies \bm{\alpha} = |\bm{v}|'\mathbf{T} + |\bm{v}|\mathbf{T}' \\\implies \bm{\alpha} = |\bm{v}|'\mathbf{T} + |\bm{v}|\bigl(\mathbf{N}|\mathbf{T}'|\bigr) \\\implies \bm{\alpha} = |\bm{v}|'\mathbf{T} + |\bm{v}|\Bigl(\mathbf{N}\bigl(\kappa |\bm{v}|\bigr)\Bigr) \\\implies \bm{\alpha} = |\bm{v}|'\mathbf{T} + \kappa|\bm{v}|^2\mathbf{N}. \]

Calculus of Multivariable Functions: Surfaces

How does the calculus of single-variable functions extend to multivariable functions?

Now we discuss the “dual” to vector-valued functions of a scalar input, scalar-valued functions of multiple variables, \(f \colon \mathbf{R}^n \to \mathbf{R}.\) We’ll talk about what continuity means in this context, generalize the notion of the first and second derivatives to gradient and Hessian respectively, and discuss the problems of linearization and optimization in this higher-dimensional context.

Multivariable Functions and their Graphs

How do we “visualize” multivariable functions?

Consider functions \(f \colon \mathbf{R}^n \to \mathbf{R},\) where the input is a vector and the output is a scalar. These are called multivariable, scalar-valued functions. In particular we’ll study the case when \(n=2,\) for functions \(f\colon \langle x,y \rangle \mapsto z,\) where we can think about the graph \(z = f(x,y)\) as a surface in three-dimensional space living over the \(xy\)-plane. But we’ll also consider functions for \(n=3.\)

A generic example \(g(x,y) = 2\sin(x)\cos(y).\)

The plane \(4x-3(y-1)+5z=0\) is the graph of the function \(f(x,y) = -\frac{4}{5}x+\frac{3}{5}(y-1).\)

The graph of the function \(f(x,y) = -\frac{2}{5}x^2+\frac{3}{5}(y-1)^3\) has some parabolic and cubic cross sections.

Any specific output value \(z_0\) corresponds to a horizontal cross section of the graph \({z_0 = f(x,y).}\) Sketching these cross sections can help us visualize the graph. In general we call these cross sections level sets, but for a function \(f \colon \mathbf{R}^2 \to \mathbf{R}\) they will be curves, so we call them level curves. Altogether the plot of a healthy collection of a graph’s leve sets is called its contour plot. It’s like the topography map of terrain on the earth, where each level curve corresponds to an elevation.

Google Image search topography terrain contour map.

Digitally plot contour plots of the previous examples.

Manually sketch some level curves of \(f(x,y) = xy\) to get an idea of what the graph \(z = f(x,y)\) looks like.

For a function \(f \colon \mathbf{R}^3 \to \mathbf{R}\) the level sets will be surfaces, so we call them level surfaces. In fact since the graphs of these functions would need four dimensions to visualize, we really only have their contour plots to visualize them.

Digitally plot some level surfaces of \(f(x,y,z) = x^2+y^2-z^2.\)

Before next class, review the notation of limits and the definition of continuity.

Limits & Continuity

Recall the definition of a limit from single-variable calculus: we’ll say \(\lim_{x \to a} f(x) = L\) if for every neighborhood of \(L\) there exists a neighborhood of \(a\) such that for all \(x\) in that neighborhood of \(a\) \(f(x)\) will be in that neighborhood of \(L.\) a function \(f\) is continuous at a point \(a\) \(\lim_{x \to a} f(x) = f(a),\) and is continuous on its domain if its continuous at every point in its domain.

For multivariable functions we only need to update the fact that points in our domain are not numbers \(a\) but are now points \((a,b):\) we’ll say \(\lim_{(x,y) \to (a,b)} f(x,y) = L\) if for every neighborhood of \(L\) there exists a neighborhood of \((a,b)\) such that for all \((x,y))\) in that neighborhood of \((a,b)\) \(f(x,y)\) will be in that neighborhood of \(L.\) Then the definition of continuity in this context is roughly the same: a function \(f\) is continuous at a point \((a,b)\) \(\lim_{(x,y) \to (a,b)} f(x,y) = f(a,b),\) and is continuous on its domain if its continuous at every point in its domain.

For single-variable, when evaluating limits, there are only two directions from which to approach a point, two directions from which to enter the neighborhood of \(a,\) and so only two values that must agree for a limit to exist. For multivariable functions things become more complicated. A limit only exists when the limit along any path approaching the point \((a,b)\) exists and that value is the same for any path. To show that a limit doesn’t exist we must exhibit two paths in the domain towards \((a,b)\) along which the values of the limit differ.

Prove that \(\lim_{(x,y) \to (0,0)} \frac{1}{xy}\) doesn’t exist.

It’s not defined at \((0,0).\) Generally it’s not defined along \((x,0)\) or \((0,y)\) for any \(x\) or \(y.\) Digitally graph it. Sketch the paths. The graph of the function has four connected components. Along the path where \(y=x\) the value of the limit is \(\infty\) but along the path where \(y=-x\) its \(-\infty.\)

Prove that \(\lim_{(x,y) \to (0,0)} \frac{2xy}{x^2+y^2}\) doesn’t exist.

Digitally graph it. Sketch the paths. Along the path where \(y=x\) the value of the limit is \(1\) but along the path where \(y=-x\) its \(-1.\)

Broad advice for evaluating limits: (1) if you know in your heart a function is continuous at the point in question based on its formula — e.g. it’s a polynomial function — then the limit is the value at that point, (2) a visualization, either the graph or the contour plot, can help give you intuition on whether or not the limit exists, and (3) if you suspect a limit doesn’t exist its fairly quick to check the “straight line” paths \(x=0\) and \(y=0\) and \(y=x\) and \(y=-x.\) If the limit agrees along all those paths but you’re still convinced the limit doesn’t exist you’ll need to come up with a path more cleverly based on the formula for the function.

But how do we prove a limit does exist? How do we prove the value of \(\lim_{(x,y) \to (a,b)} f(x,y)\) is consistent along all paths terminating in \((a,b)?\) One clean way to do so is to shift our plane to center on \((a,b),\) convert the domain and function’s formula into polar coordinates, and show that the limit as \(r\) approaches \(0\) doesn’t depend at all on the value of \(\theta.\)

Prove that \(\lim_{(x,y) \to (0,0)} \frac{x^3}{x^2+y^2}\) exists and is zero.

Convert to polar and compute it.

Before next class, review all your rules of single-variable differentiation, and review the geometric interpretations of derivatives in terms of rates, slopes, concavity, etc.

Partial Derivatives

What’s a “derivative” when you have more than one independent variable?

Write the single-variable definition of a derivative on the board twice, to be updated shortly to the multivariable derivative.

The derivative of a single-variable function gives you information about the rate at which the function is increasing or decreasing corresponding to the slope of a line tangent to the graph of the function. For a multivariable function there are infinitely many directions to measure the rate of change; i.e. there is no single tangent line to a point on a surface, but instead a tangent plane. The derivative of a multivariable function cannot simply be another function, but is actually a vector of functions. Today we talk about the components of that vector, the rates-of-change of a function in the \(x\)- and \(y\)-direction. These are called partial derivatives. \[ f_x(x,y) = \lim\limits_{h \to 0} \frac{f(x+h, y) - f(x,y)}{h} \qquad f_y(x,y) = \lim\limits_{h \to 0} \frac{f(x, y+h) - f(x,y)}{h} \]

Desmos demo: show the lines tangent to some graph \(z = f(x,y)\) to indicate that \(f_x(x,y)\) and \(f_y(x,y)\) are the slopes of those lines.

The partial derivative of a function \(f\) with respect to \(x\) may be written any one of these various ways:

\(\displaystyle f_x\)

\(\displaystyle f_1\)

\(\displaystyle \frac{\partial}{\partial x}f\)

\(\displaystyle \frac{\partial f}{\partial x}\)

\(\displaystyle \mathrm{D}_x(f)\)

\(\displaystyle \mathrm{D}_1(f) \)

The “\(\mathrm{D}\)” is a short-hand for the differential operator, and the “\(1\)” is a reference to \(x\) being the first parameter of \(f\) without requiring it to have a name. They’re only “partial” derivatives because the “full” derivative is be a vector pieced together from these partial derivatives. More on that next week.

To compute \(f_x,\) take the derivative with respect to \(x\) and pretend that the other variables are just constants.

For the following functions \(f\) with formulas \(f(x,y)\) determine formulas for \(f_x(x,y)\) and \(f_y(x,y).\)

\(\displaystyle f(x,y) = 3x^2y+2y^3+\frac{y}{4x} \)

\(\displaystyle f(x,y) = 7x^5y^3 \)

\(\displaystyle f(x,y) = x^2\ln(xy) \)

\(\displaystyle f(x,y) = x^y \)

\(\displaystyle f(x,y) = xy\cos(x) \)

\(\displaystyle f(x,y) = \frac{x}{(x+y)^2} \)

\(\displaystyle f(x,y) = \mathrm{e}^x\sin(y) \)

The second-order partial derivatives of \(f\) are the results of taking the derivatives of \(f_x\) and \(f_y.\) Note there are four possibilities, \(f_{xx}\) and \(f_{yy}\) and \(f_{xy}\) and \(f_{yx}\) and various notations for each.

\(\displaystyle (f_x)_y \)

\(\displaystyle f_{xy} \)

\(\displaystyle f_{12} \)

\(\displaystyle \frac{\partial}{\partial y}\frac{\partial}{\partial x}f(x,y) \)

\(\displaystyle \frac{\partial^2 f}{\partial y\,\partial x} \)

\(\displaystyle \mathrm{D}_{xy}(f) \)

\(\displaystyle \mathrm{D}_{12}(f) \)

Be careful with the notation: notice the difference in order between \(\displaystyle f_{xy} \) and \(\displaystyle \frac{\partial^2 f}{\partial y\,\partial x} \) despite them both meaning to take a derivatives with respect to \(x\) and then \(y.\) Now while \(f_{xx}\) and \(f_{yy}\) can still be thought of in terms of concavity, \(f_{xy}\) and \(f_{yx}\) are more nebulous, and are often explained as corresponding to a “twisty-ness” of \(f.\)

Desmos demo: show the standard paraboloid \(f(x,y) = \frac{1}{2}ax^2 + bxy + \frac{1}{2}cy^2\) and show the concavity that \(f_{xx}(x,y) = a\) and \(f_{yy}(x,y) = c\) indicate, and the “twisty-ness” that \(f_{xy}(x,y) = b\) indicates.

Calculate \(f_{xx}\) and \(f_{yy}\) and \(f_{xy}\) and \(f_{yx}\) for some of the earlier examples, and note that is appears \(f_{xy} = f_{yx}.\)

If \(f\) is defined on some disk containing the point \((a,b)\) and \(f_{xy}\) and \(f_{yx}\) are both continuous on that disk, then \(f_{xy}(a,b) = f_{yx}(a,b).\)

Before next class, review how to find the equation of a line tangent to the graph of a function at a point, and refresh yourself on what the Taylor series of a smooth function is. Additionally, practice determining limits and taking partial derivatives; these are two concrete skills to get proficient with ASAP so that they don’t slow you down later.

Tangent Planes & Taylor Approximations

What’s an equation for the plane tangent to a surface at a point?

Start class writing the general form for a Taylor series centered at \(x_0\) on the whiteboard out to at least the second-order terms.

What’s an equation of the line tangent to the graph of \(f(x) = 4-x^2\) at the point where \(x = 1?\) Note this is asking for the first-order Taylor approximation.

As a first-order Taylor approximation \(y = 3 - 2(x-1),\) or \(-2(x-1) -(y-3) = 0\) in “standard form”.

Start with a task alluded to yesterday, but that we never actually did.

What are parameterizations of the lines tangent to the graph \(z = f(x,y)\) of the function \(f(x,y) = 2\sin(x)\cos(y)\) at the point \(\bigl(\frac{2\pi}{3},\frac{-\pi}{4}\bigr)?\)

\[ f(x_0,y_0) + \bigl\langle 1, 0, f_x(x_0,y_0)\bigr\rangle t \qquad f(x_0,y_0) + \bigl\langle 0, 1, f_y(x_0,y_0)\bigr\rangle t \]

What is an equation for the plane tangent to the graph \(z = f(x,y)\) of the function \(f(x,y) = 2\sin(x)\cos(y)\) at the point \(\bigl(\frac{2\pi}{3},\frac{-\pi}{4}\bigr)?\)

Work through this in particular, then notice that the roles of \(f_x\) and \(f_y\) can be easily traced through the calculation to write a general formula at the end.

Recall the template equation for a plane is \( {A(x-x_0) + B(y-y_0) + C(z-z_0) = 0} \) for a point \((x_0, y_0, z_0)\) and normal vector \(\langle A,B,C \rangle.\) The point is simple to get: \((x_0, y_0, z_0) = \bigl(\frac{2\pi}{3}, \frac{-\pi}{4}, f(\frac{2\pi}{3}, \frac{-\pi}{4}) \bigr).\) The normal vector \(\langle A,B,C \rangle\) will be orthogonal to the direction vectors of the tangent line we just computed — those two tangent lines span the tangent plane — so \[{\langle A,B,C \rangle = \bigl\langle 1, 0, f_x(x_0,y_0)\bigr\rangle \times \bigl\langle 0, 1, f_y(x_0,y_0)\bigr\rangle = \langle -f_x(x_0, y_0), -f_y(x_0, y_0), 1\rangle.}\] An equation for the tangent plane is then \[ -f_x(x_0, y_0)\bigl(x-x_0\bigr) - f_y(x_0, y_0)\bigl(y-y_0\bigr) + \bigl(z-f(x_0, y_0)\bigr) = 0\,. \] Rearranging this equation suggestively as \[ z = f(x_0,y_0) + f_x(x_0, y_0)\bigl(x-x_0\bigr) + f_y(x_0, y_0)\bigl(y-y_0\bigr)\] makes it appear very similar to the first-order Taylor approximation … because it is.

In general, be aware that Taylor’s theorem holds for multivariable functions: that a smooth function \(f\) has an expression as a power series on some radius of convergence, and that per some center, some “anchor point”, the initial terms in this series give us a polynomial that approximates the \(f\) around that center: \[ \begin{align*} f(x,y) &= \sum_{n=0}^{\infty}\sum_{k=0}^{n}\frac{f_{x_{k} y_{n-k}}(x_0,y_0)}{k!(n-k)!}\bigl(x-x_0\bigr)^k\bigl(x-y_0\bigr)^{n-k} \\&= f(x_0, y_0) \\&+ f_x(x_0, y_0)\bigl(x \!-\! x_0\bigr) + f_y(x_0, y_0)\bigl(y \!-\! y_0\bigr) \\&+ \frac{f_{xx}(x_0, y_0)}{2}\bigl(x \!-\! x_0\bigr)^2 + f_{xy}(x_0, y_0)\bigl(x \!-\! x_0\bigr)\bigl(y \!-\! y_0\bigr) + \frac{f_{yy}(x_0, y_0)}{2}\bigl(y \!-\! y_0\bigr)^2 \\&+ \dotsb \end{align*} \] In particular, the first-order Taylor approximation is a plane, and the second-order Taylor approximation is a paraboloid.

What’s an equation of the plane tangent to the graph of \({f(x,y) = 3-x^2-xy-y^2}\) at the point \((x_0,y_0) = \bigl(1,2\bigr)?\)

What’s an equation of the plane tangent to the graph of \(f(x,y) = x^y\) at the point \((x_0,y_0) = \bigl(\mathrm{e}^2,3\bigr)?\)

Before next class, briskly review implicit differentiation.

The Chain Rule & Implicit Differentiation

Short lecture, then pop quiz.

Given \(z = f(x,y)\) such that \(x\) and \(y\) are themselves dependent variables, functions of some independent variables \(s\) and \(t,\)

\(\displaystyle \frac{\mathrm{d}z}{\mathrm{d}s} = \frac{\partial z}{\partial x} \frac{\mathrm{d}x}{\mathrm{d}s} + \frac{\partial z}{\partial y} \frac{\mathrm{d}y}{\mathrm{d}s} \)

\(\displaystyle \frac{\mathrm{d}z}{\mathrm{d}t} = \frac{\partial z}{\partial x} \frac{\mathrm{d}x}{\mathrm{d}t} + \frac{\partial z}{\partial y} \frac{\mathrm{d}y}{\mathrm{d}t} \,. \)

Suppose \(z = x^2y-\sin(x)\) where \(x\) and \(y\) depend on the independent variable \(t\) according to the formulas \(x(t) = t^2\) and \(y(t) = t^3-t.\) Determine a formula for \(\frac{\mathrm{d}z}{\mathrm{d}t}\) two different ways.

The implicit function theorem provides a “speed-up” for computing derivatives of implicitly defined functions, but still appears unnecessary.

Suppose that \(z = f(x,y),\) instead of explicitly being defined by a formula in terms of \(x\) and \(y\) only has the implicit relationship defined by the equation \({x^2\cos(z) = 42 +\tan(yz).}\) Determine a formula for \(\frac{\mathrm{d}z}{\mathrm{d}x}.\)

The Gradient Vector & Directional Derivatives

Given a plane tangent to a surface, what’s its steepest slope? What’s is slope in a particular direction?

The gradient vector of a function \(f,\) denoted \(\nabla f\) or sometimes as \(\operatorname{grad}f,\) is the vector of partial derivatives of \(f\). This is “full” derivative of a multivariable function. Explicitly, in two variables, \[ \operatorname{grad}f(x,y) = \nabla f(x,y) = \Big\langle f_x(x,y), f_y(x,y) \Big\rangle = \frac{\partial f}{\partial x}\mathbf{i} + \frac{\partial f}{\partial y}\mathbf{j} \,. \]

Two key facts to thinking about what this vector represents geometrically:

Think of \(\nabla f\) as living in the domain of \(f\) on its contour plot. Here the vector \(\nabla f(a,b)\) will be perpendicular to the level curve of \(f\) containing \((a,b).\)
On the surface \(z = f(x,y)\) at the point \(\bigl(a,b, f(a,b)\bigr),\) the vector \(\nabla f(a,b)\) will point in the direction in which \(f\) is increasing the fastest — the direction in which the tangent plane at the point is steepest — and this maximal rate of increase will be \(\bigl|\nabla f(a,b)\bigr|.\) I.e. it points in the most “uphill” direction.

TK Can I prove/justify those facts? Stewart starts starts from the directional derivative. Keep in mind I must also justify the gradient of a constraint \(g\) being normal to that constraint later with Lagrange multipliers.

Explore the gradient of the function \(f(x,y) = 2\sin(x)\cos(y)\) at the point \((x,y) = \bigl(\frac{2\pi}{3},-\frac{\pi}{4}\bigr)\) and plot stuff digitally.

One thing to see now that will become more important later: there is a gradient vector at every point in the domain of \(f.\) Altogether this collection of vectors at each point is called the gradient vector field of \(f.\)

Digitally plot the gradient vector field of \(f(x,y) = 2\sin(x)\cos(y)\) and explain the graphic: every arrow points “uphill”.

The gradient is the direction of steepest ascent (maximal increase), and the magnitude of the gradient is how steep it is, but how can we calculate the rate of increase in another direction? Just take the dot product of the gradient with that direction. For a differentiable function \(f\) and a unit vector \({\bm{u} = \langle u_1, u_2\rangle,}\) the directional derivative of \(f\) in the direction \(\bm{u},\) denoted \(\operatorname{D}_{\bm{u}} f,\) is the function \(\nabla f \cdot \bm{u}.\) Written out explicitly, \[\operatorname{D}_{\bm{u}} f(x,y) = \nabla f(x,y) \cdot \bm{u} = f_x(x,y) u_1 + f_y(x,y) u_2\,.\] Then \(\operatorname{D}_{\bm{u}} f(a,b)\) will give the rate at which \(f\) is increasing (or decreasing) at the point \((a,b)\) in the direction \(\bm{u}.\) In effect, because \(\bm{u}\) is a unit vector, \(\operatorname{D}_{\bm{u}} f(a,b)\) is just \(\operatorname{proj}_{\bm{u}}\bigl(\nabla f(a,b)\bigr).\) If \(\theta\) is the angle between \(\nabla f(a,b)\) and \(\bm{u}\) then \(\operatorname{D}_{\bm{u}} f(a,b) = \bigl|\nabla f(a,b)\bigr|\cos(\theta).\)

Calculate \(\operatorname{D}_{\bm{u}}\bigl(f(x,y)\bigr)\) for \(f(x,y) = 2\sin(x)\cos(y)\) at the point \(\bigl(\frac{2\pi}{3},-\frac{\pi}{4}\bigr)\) in the direction \(\bm{u} = \langle 1, -1 \rangle.\)

Calculate \(\operatorname{D}_{\bm{u}}\bigl(f(x,y)\bigr)\) for \(f(x,y) = x^y\) at the point \(\bigl(\mathrm{e}^2,3\bigr)\) in the direction \(\bm{u} = \langle 1,2 \rangle.\)

TK this one doesn't have a nice visualization

Calculate \(\operatorname{D}_{\bm{u}}\bigl(f(x,y)\bigr)\) for \(f(x,y) = \arctan\bigl(\frac{y}{x}\bigr)\) at the point \(\bigl(3,1\bigr)\) in the direction \(\bm{u} = \langle -2,1 \rangle.\)

Calculate \(\operatorname{D}_{\bm{u}}\bigl(f(x,y)\bigr)\) for \(f(x,y) = x^2y^2-x-y\) at the point \(\bigl(2,3\bigr)\) in the direction \(\bm{u} = \langle 1,5 \rangle.\)

Before next class, review optimization problems from single-variable calculus, and the “first- and second-derivative tests”.

Local and Global Extrema

What are the minimum and maximum values of \(f(x) = x^3-6x^2+6x+6\) restricted to the interval \([0,5]?\)

Recall the “first-” and “second-derivative tests”. Plot \(f(x) = x^3-6x^2+6x+6\) in Desmos along with \(y = f(a) + f'(a)(x-a)\) and \(y = f(a) + f'(a)(x-a) + f''(a)/2(x-a).\) We define critical points at any point where \(f'(a) = 0\) or on the boundary of the domain of \(f',\) and check if they correspond to a minimum or maximum by evaluating \(f''\) at those critical points, determining the concavity.

For a multivariable function \(f\colon \mathbf{R}^2 \to \mathbf{R}\) we do roughly the same thing. At any point \((a,b)\) where \(f\) has an extreme values, the tangent plane to the graph of \(f\) at \((a,b)\) must be horizontal. At this point we must have \(\nabla f(a,b) = \bm{0}.\) We refer to all such points \((a,b)\) where either \(\nabla f(a,b) = \bm{0}\) or \((a,b)\) is on the boundary of the domain of \(\nabla f\) as a critical point. While \(\nabla f(a,b) = \bm{0}\) is a necessary condition for \((a,b)\) to correspond to a supreme value, its not sufficient; critical points are only suspected to correspond to extreme values. For a multivariable function, \((a,b)\) may correspond to a saddle point.

Plot \(f(x) = \frac{1}{2}ax^2 + bxy + \frac{1}{2}cy^2\) restricted to \(x^2+y^2 \leq 2\) and push around \(a\) and \(b\) and \(c\) to create a saddle.

To detect if a critical point corresponds to an extreme value or a saddle, we investigate its value at the second-derivative of \(f.\) The Hessian matrix consists of the partial derivatives of \(f.\) Explicitly \( \mathbf{H}_f = \begin{pmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{pmatrix} \) The determinant of the Hessian matrix, called the discriminant of \(f\) and denoted \(\operatorname{H}_f,\) can be expressed as \[ \det\mathbf{H}_f = \operatorname{H}_f = f_{xx}f_{yy} - \big(f_{xy}\big)^2\,. \] This is the formula that tells us if a critical point corresponds to an extrema or saddle. If \((a,b)\) is a critical point of \(f\) (so \(\nabla f((a,b) = 0\)) and the second partial derivatives of \(f\) are continues in some neighborhood of \((a,b),\) then

if \(\operatorname{H}_f(a,b) \gt 0,\) then \(f(a,b)\) is a local extrema;
- if \(f_{xx}(a,b) \gt 0,\) then \(f(a,b)\) is a local minimum;
- if \(f_{xx}(a,b) \lt 0,\) then \(f(a,b)\) is a local maximum;

Optimization with Lagrange Multipliers

If \(f\) is continuous on a closed, bounded subset of its domain, then \(f\) attains an absolute minimum and maximum value on that subset.

What are the minimum and maximum values of \(f(x) = x^3-6x^2+6x+6\) restricted to the interval \([0,5]?\)

What are the minimum and maximum values of \(f(x,y) = x^2+y^4 + 2xy\) restricted to the set of pairs \((x,y)\) such that \(x^2+y^2 \leq 3?\)

Plot it, but don’t start with this one as a demonstration of the necessary calculations.

Joseph-Louis Lagrange (1736–1813) — we are nearing modern mathematics.

Suppose you are looking to find the extreme values of \(f\) on some bounded subset of \(\mathbf{R}\). The techniques of last class only work on the interior of the set. On the boundary of the set we must use a different technique, and its not as easy as simply checking the boundary points because there are infinitely many: the boundary is a curve.

If the boundary can be parameterized as a function of some parameter \(t,\) then we can use the techniques of single-variable calculus to find the extreme values of the \(z\)-coordinate of the parameterization. But the boundary curve can’t always be parameterized. There is a technique that works more generally.

To find the extreme values of \(f(x,y)\) subject to \(g(x,y) = k,\) assuming these values exist and \(\nabla g \neq 0\) on the surface \(g(x,y,z) = k,\) the critical points will be any \((a,b)\) such that \[\nabla f(a,b) = \lambda \nabla f(a,b)\] for some real number \(\lambda.\) The number \(\lambda\) is called the Lagrange multiplier for that point.

What are the minimum and maximum values of \(f(x,y) = y-x\) subject to \(x^2+y^2 = 4?\)

Note the difference in the constraint having an “\(=\)” rather than an “\(\leq\)”. Carefully draw out why \(\grad f\) and \(\grad g\) will be parallel at the maximum value.

What are the minimum and maximum values of \(f(x,y) = 3x^2+5y^2\) subject to \(x^2+y^2 = 4?\)

What are the minimum and maximum values of \(f(x,y) = x+y-xy\) on the triangle with vertices at the points \((0,0)\) and \((0,2)\) and \((4,0)?\)

What are the minimum and maximum values of \(f(x,y) = \mathrm{e}^{xy}\) subject to \(x^3+y^3 = 16?\)

Guassian Curvature

pop quiz

Integration in Higher Dimensions

Introduction to Double and Triple Integrals

Reorganize this whole section into three parts: “Integrals over Intervals”, “Double Integrals over Regions”, and “Triple Integrals over Expanses” so they can be easily compared; students can see how they're all doing the same thing.

Revise the founding examples with the parabolas \(y=x^2-5x+3\) and \( y=-x^2+7x-7\) to use functions that fit on Desmos’s default viewport.

Review of one-dimensional integration. For a random sample of \(n\) points \(x_i^\ast\) in the interval \([a,b]\) the Riemann sum \[ \sum_{i=1}^n f\big(x_i^\ast\big) \Delta x \] approximates the area between the graph of \(f\) and the \(x\)-axis. Taking the limit as \(n \to \infty\) computes this area precisely. We define this limit as the definite integral \[ \int\limits_a^b f(x) \,\mathrm{d}x = \lim_{n \to \infty} \sum_{i=1}^n f\big(x_i^\ast\big) \Delta x \] and think of the integral as computing the amount of area accumulate as \(f\) sweeps over the interval \((a,b).\) A quick note regarding that: integration occurs over a domain. Instead of thinking of the operation as “integrating from \(a\) to \(b\)” we may instead think of it as “integrating over the interval \(I = (a,b)\).” Even more, the original integral could be written as \[ \int\limits_{I} f(x) \,\mathrm{d}x; \] we integrate over a region in the domain and for a function \(f\colon \mathbf{R} \to \mathbf{R}\) the regions are (unions of) intervals.

Review numerical integration. It doesn’t matter exactly how we set up this sum. We could be more methodical about it …

Then the fundamental theorem of calculus blows things wide open, informing us this operation of integration and the operation of differentiation are actually “inverse” operations in a sense. That for a continuous function \(f\) defined on the closed domain \([a,b]\) \[ \frac{\mathrm{d}}{\mathrm{d}x} \int\limits_a^x f(t) \,\mathrm{d}t = f(x) \quad\text{and}\quad \int\limits_a^b f(t) \,\mathrm{d}t = F(b)-F(a) \] for any antiderivative \(F\) of \(f.\) The second equation opens up a whole fresh can of worms where the task of computing the precise value of an integral is now reduced to the challenge of determining an antiderivative for the integrand, which is sometimes impossible, but fun when it is possible. There is a common canon of techniques for doing this covered in most calculus classes, but really there are just two, substitution and integration-by-parts, which respectively “undo” the chain rule and product rule of differentiation. The other “techniques of integration” usually covered are just algebraic/trigonometric tricks for manipulating the integrand before it yields to substitution or integration-by-parts.

Over higher-dimensional domains we define integration similarly. For a function \(f\colon \mathbf{R}^2 \to \mathbf{R}\) and a region \(R\) in \(\mathbf{R}^2,\) \[\iint\limits_R f(x,y) \,\mathrm{d}A\] denotes the double integral of \(f\) over \(R.\) Intuitively there are two reasonable ways to think of this integral: (1) as the signed volume between the graph \(z = f(x,y)\) and the region \(R\), or (2) as the signed mass of the region \(R\) where the point-density within \(R\) is given by \(f.\) In this latter case, if \(f(x,y) = 1\) we’re just computing the area. Then this integral, at its heart, is defined as the limit of a Riemann sum \[\lim_{\substack{\\[-3pt] m \to \infty \\[2pt] n \to \infty}} \sum_{j=1}^{n} \sum_{i=1}^{m} f\bigl(x_{ij}^\ast, y_{ij}^\ast\bigr) \,\mathrm{\Delta}A\] where the points \(\bigl(x_{ij}^\ast, y_{ij}^\ast\bigr)\) are randomly sampled over the domain of integration \(R.\)

Then for a function \(f\colon \mathbf{R}^3 \to \mathbf{R}\) and an expanse (region) in \(\mathbf{R}^3\) \[\iiint\limits_E f(x,y,z) \,\mathrm{d}V\] denotes the triple integral of \(f\) over \(E.\) While you can think of this triple integral as the signed hypervolume of the four-dimensional space bound between the graph \(w = f(x,y,z)\) and the expanse \(E\) in \(\mathbf{R}^3,\) its usually more reasonable to think of \(E\) as a solid and the integral as the signed mass of the solid where the point-density within it is given by \(f.\) In this case, if \(f(x,y,z) = 1\) we’re just computing the volume of \(E.\) Then this integral, at its heart, is defined as the limit of a Riemann sum \[\lim_{\substack{\\[-4pt] \ell \to \infty \\[0pt] m \to \infty \\[0pt] n \to \infty }} \sum_{k=1}^{n} \sum_{j=1}^{m} \sum_{i=1}^{\ell} f\bigl(x_{ijk}^\ast, y_{ijk}^\ast, z_{ijk}^\ast\bigr) \,\mathrm{\Delta}V\] where the points \(\bigl(x_{ijk}^\ast, y_{ijk}^\ast, z_{ijk}^\ast\bigr)\) are randomly sampled over the domain of integration \(E.\)

We can always approximate the values of these integrals numerically, but the dream is to somehow use the fundamental theorem of calculus to evaluate them precisely in terms of an antiderivative of the integrand. Our dream is fated to come true, because these double and triple integrals can be evaluated “one dimension at a time”. For example, if \(R\) is a simple rectangular region \(R = [a,b] \times [c,d],\) the double integral \(\iint_R f(x,y) \,\mathrm{d}A\) can be thought of as an “integral of an integral” \[ \iint\limits_R f(x,y) \,\mathrm{d}A \;\;=\;\; \int\limits_a^b\Biggl(\int\limits_c^d f(x,y) \,\mathrm{d}y\Biggr)\,\mathrm{d}x \;\;=\;\; \int\limits_a^b\int\limits_c^d f(x,y) \,\mathrm{d}y\,\mathrm{d}x \] where we first integrate with respect to \(y\) (in the \(y\)-direction) and then integrate with respect to \(x\) (in the \(x\)-direction). This is called an iterated integral, and can be evaluated using the fundamental theorem twice or thrice, just taking it one variable at a time from the inside out.

Compute the values of these iterated integrals.

\(\displaystyle \int\limits_R x+y \,\mathrm{d}A\) for \(R = [1,2]\times[3,4]\)

\(\displaystyle \int\limits_0^1 \int\limits_0^2 2x^2y^3 - y\mathrm{e}^{y^2}\,\mathrm{d}x\,\mathrm{d}y\)

\(\displaystyle \iiint\limits_E xy + yz + xz \,\mathrm{d}V \quad\text{for } E=[1,2]\times[0,3]\times[0,4] \)

\(\displaystyle \iint\limits_R \frac{\ln(y)}{xy} \,\mathrm{d}A \quad\text{for } R=[1,3]\times[1,\mathrm{e}^2]\)

\(\displaystyle \iint\limits_R x\mathrm{e}^{xy} \text{(very hard?)} \,\mathrm{d}A \quad\text{for } R=[1,3]\times[1,\mathrm{e}^2]\)

Before next class, solve this problem:

What is the area of the region bound between the parabolas \(y=x^2-5x+3\) and \( y=-x^2+7x-7?\)

Double Integrals in Rectangular Coordinates

What is the area of the region bound between the parabolas \(y=x^2-5x+3\) and \( y=-x^2+7x-7?\)

First a quick note: in each of the examples from yesterday we integrated first with respect to \(y\) then with respect to \(x.\) This wasn’t necessary. There is an analogue to Clairaut’s Theorem for integration.

If \(f\) is continuous on the rectangle \(R = [a,b] \times [c,d]\) then \[ \int\limits_a^b\int\limits_c^d f(x,y) \,\mathrm{d}y\,\mathrm{d}x = \int\limits_c^d\int\limits_a^b f(x,y) \,\mathrm{d}x\,\mathrm{d}y \]

Now a less quick note: in each of the examples from yesterday we integrated over a rectangular region \(R.\) Can’t we integrate over more wiggly-shaped region? Yes. It’s a little bit trickier though. We must be able to describe the region’s boundary analytically (with equations). Building on the example I asked you to do yesterday, if \(R\) is the region bound between those two parabolas, then \[ \iint\limits_R 1 \,\mathrm{d}A \;\;=\;\; \text{leave space} \int\limits_1^5 \bigl(-x^2+7x-7\bigr)-\bigl(x^2-5x+3\bigr) \,\mathrm{d}x \] But there’s a “missing link” here between these two: \[ \iint\limits_R 1 \,\mathrm{d}A \;\;=\;\; \int\limits_1^5\int\limits_{x^2-5x+3}^{-x^2+7x-7} 1 \,\mathrm{d}y\,\mathrm{d}x \;\;=\;\; \int\limits_1^5 \bigl(-x^2+7x-7\bigr)-\bigl(x^2-5x+3\bigr) \,\mathrm{d}x \] I think the best way to think about it is “from the outside, in:” we’re taking the accumulated sum for each \(x_0\) on the interval \((1,5)\) the values of the integrals \(\int_{x_0^2-5x_0+3}^{-x_0^2+7x_0-7} 1 \,\mathrm{d}y.\) I.e. it’s an integral of single-variable integrals.

Evaluate \(\iint_R x+\frac{1}{2}y \,\mathrm{d}A\) where \(R\) is the region in the \(xy\)-plane bounded by the parabolas \(y=x^2-5x+3\) and \(y=-x^2+7x-7.\)

Illustrate with Desmos.

Note that the essence of Fubini’s theorem still holds, so long the integrand is continuous on the closed region we’re considering. We may also integrate first with respect to \(x\) and then with respect to \(y,\) which is especially helpful if the boundary of the region is more easily described as functions of \(y.\) Sometimes, mechanically speaking, in terms of knowing an antiderivative of the integrand, the order of integration must be swapped.

Evaluate \(\iint_R 6xy \,\mathrm{d}A\) where \(R\) is the triangle in the \(xy\)-plane with vertices located at the coordinates \(\bigl(0,0\bigr)\) and \(\bigl(3,0\bigr)\) and \(\bigl(0,2\bigr).\)

Set it up both ways. It should equal 9.

Write down two integrals (or sum of integrals) to compute the area of the region bound between the curves \(y=-x\) and \(y=x^3\) and \(x=2.\)

Evaluate the integral \[\int\limits_{0}^{4}\int\limits_{0}^{\sqrt{4-y}} \mathrm{e}^{12x-x^3} \,\mathrm{d}x\,\mathrm{d}y\,.\]

The order of integration MUST be swapped.

One last factoid: just like for a function \(f(x,y)\) whose graph \(z = f(x,y)\) is a surface the integral \(\iint\limits_R f(x,y) \,\mathrm{d}A\) calculates the volume under the surface, the integral \[\iint_{R} \sqrt{ \bigl(f_x(x,y)\bigr)^2 \!+\! \bigl(f_y(x,y)\bigr)^2 \!+\! \bigl(1\bigr)^2 } \,\,\mathrm{d}A \] will compute the surface area of the graph above \(R.\) Doesn’t it look just like the arclength formula?

Before next class, review integration in polar coordinates and solve this problem:

What is the area of one “petal” of the polar graph of \(r = \sin\bigl(3\theta\bigr)?\)

Double Integrals in Polar Coordinates

What is the area of one “petal” of the polar graph of \(r = \sin\bigl(3\theta\bigr)?\)

First recall that a point \((x,y)\) in rectangular (Cartesian) coordinates can be expressed in polar coordinates as \(\bigl(\sqrt{x^2+y^2}, \operatorname{atan2}(y,x)\bigr),\) and that given a function \(f\) we can consider it’s graph in polar coordinates \(r = f(\theta).\) The area of the region bound by the polar graph of \(r = f(\theta)\) between \(\theta = \alpha\) and \(\theta = \beta\) is \[ \frac{1}{2}\int\limits_\alpha^\beta r^2 \,\mathrm{d}\theta \;\;=\;\; \frac{1}{2}\int\limits_\alpha^\beta \Bigl(f(\theta)\Bigr)^2 \,\mathrm{d}\theta \,. \]

What is the area of one “petal” of the polar graph of \(r = \sin\bigl(3\theta\bigr)?\)

\[\begin{align*} \frac{1}{2}\int\limits_{0}^{\pi/3} \Bigl(\sin(3\theta)\Bigr)^2 \,\mathrm{d}\theta \;\;&=\;\; \frac{1}{2}\int\limits_{0}^{\pi/3} \frac{1}{2}\bigl(1-\cos(6\theta)\bigr) \,\mathrm{d}\theta \\\;\;&=\;\; \frac{1}{4} \biggl(\theta-\frac{1}{6}\sin(6\theta)\biggr) \bigg\rvert_{0}^{\pi/3} \;\;=\;\; \frac{\pi}{12}-\frac{1}{24}\sin(2\pi) \;\;=\;\; \frac{\pi}{12} \,. \end{align*}\]

But why are that \(\frac{1}{2}\) and the square are necessary? It makes more sense if, like we did yesterday, we “back up” a dimension and consider this as a double integral: \[ \frac{1}{2}\int\limits_\alpha^\beta \Bigl(f(\theta)\Bigr)^2 \,\mathrm{d}\theta \;\;=\;\; \int\limits_\alpha^\beta\int\limits_0^{f(\theta)} r \,\mathrm{d}r\,\mathrm{d}\theta \,. \] We refer to that extra \(r\) that goes along with the differentials as the integrating factor.

Draw a “Cartesian” rectangle first and explain why \(\mathrm{d}A\) is only \(\mathrm{d}x\,\mathrm{d}y\) or \(\mathrm{d}y\,\mathrm{d}x\) in that case. Then draw a polar “rectangle”, label the radial sides \(\Delta r,\) label the arc sides \(r \Delta \theta,\) and show how the area of a small polar rectangle is \(\mathrm{d}A = \bigl(\mathrm{d}r\bigr) \times \bigl(r\,\mathrm{d}\theta\bigr).\)

And we can always shift our integral from rectangular to polar. Suppose we’re integrating under a surface that is the graph \(z = f(x,y)\) of some function \(f.\) For a polar rectangle \(R\) we have \[ \iint\limits_{R} f(x,y) \,\mathrm{d}A = \int\limits_\alpha^\beta \int\limits_a^b f\bigl(r\cos(\theta),r\sin(\theta)\bigr) r \,\mathrm{d}r\,\mathrm{d}\theta \] And if \(R\) is not a polar “rectangle” but is instead has inner and outer radii defined as the graphs of polar functions \(r = h_1(\theta)\) and \(r = h_2(\theta),\) then \[ \iint\limits_{R} f(x,y) \,\mathrm{d}A = \int\limits_\alpha^\beta \int\limits_{h_2(\theta)}^{h_1(\theta)} f\bigl(r\cos(\theta),r\sin(\theta)\bigr) r \,\mathrm{d}r\,\mathrm{d}\theta \]

Set up the integral that expresses the value of \(\iint_R 2x-y \,\mathrm{d}A\) where \(R\) is the region in the first quadrant bound by \(x=0\) and \(x=y\) and \(x+y=2\) and \(x^2+y^2=9.\)

What is the volume of the expanse inside the sphere \(x^2+y^2+z^2 = 25\) but outside the cylinder \(x^2+y^2 = 4?\)

What is the surface area of the sphere \(x^2+y^2+z^2 = 25\) above the plane \(z=4?\)

Before next class, if you’ve ever encountered moments in a physics class, review them.

Moments of Mass

Density mass first- second-moments center of mass inertia

TK do the centroid of the triangle from yesterday

Suppose a lamina occupies the region bound inside the circle \(x^2+y^2=2y\) but outside the circle \(x^2+y^2=1\) such that the density at a point within the lamina is inversely proportional to that point’s distance to the origin. What are the coordinates of the lamina’s center of mass?

Before next week, write down the double integral that computes the volume of the expanse bound by the parabolic cylinders \(y = x^2-5x+3\) and \(y = -x^2+7x-7\) and the plane \(z = 0\) and the surface \(z = 2\sin(x)\cos(y).\)

Triple Integrals in Rectangular Coordinates

Mechanically triple integrals can be evaluated just the same as double integrals: by considering them as three iterated integrals and evaluating them one-at-a-time. It’s easiest to think of them as a calculation of mass given a density function. Fubini’s theorem still holds, in that integration variables whose bounds don’t depend on each other can be freely swapped. The rousing exercise is swapping bounds that do depend on each other but first imagining the expanse under consideration.

Note that given a triple integral, e.g. \[ \int\limits_a^b\int\limits_{h_1(x)}^{h_2(x)}\int\limits_{g_1(x,y)}^{g_2(x,y)} f(x,y,z) \,\mathrm{d}z\,\mathrm{d}y\,\mathrm{d}x\,, \] the innermost bounds \(g_1\) and \(g_2\) can still be thought of as a “floor” and “ceiling” in that direction, and the outermost two bounds, the outer double integral, corresponds to a “largest” base region parallel to the \(xy\)-plane over which we’re taking the integral.

Conversely, the innermost double integral is the double integral we are evaluating at each \(x_0\) between the \(x\)-bounds.

Write down an integral that expresses the volume of the expanse bounded between the paraboloid \(z = x^2+y^2\) and the sphere \(x^2+y^2+z^2 = 2.\)

Sketch the solid whose volume is calculated by the integral \[ \int\limits_{0}^{2} \int\limits_{0}^{4-x^2} \int\limits_{0}^{4-2x} 1 \,\mathrm{d}y \,\mathrm{d}z \,\mathrm{d}x \] and write five other integrals that compute this volume with the order of the differentials permuted.

The order \(\mathrm{d}z \,\mathrm{d}y \,\mathrm{d}x\) is chill, and \(\mathrm{d}z \,\mathrm{d}x \,\mathrm{d}y\) is also fine. It’s the other three orders that are tough since there are two different “ceilings in the \(x\)-direction.”

Before next class, review the geometry of cylindrical and spherical coordinates.

Triple Integrals in Cylindrical Coordinates

Cover this and the next section on one day, and give a pop quiz on the second day.

\[ x = r\cos(\theta) \qquad y = r\sin(\theta) \qquad z = z \]

\[ \iiint\limits_E f \,\mathrm{d}V \;\;=\;\; \int\limits_{\alpha}^{\beta} \int\limits_{h_1(\theta)}^{h_2(\theta)} \int\limits_{g_1(r,\theta)}^{g_2(r,\theta)} f\bigl(r\cos(\theta),r\sin(\theta),z\bigr) \,{\color{maroon} r} \,\mathrm{d}z\,\mathrm{d}r\,\mathrm{d}\theta \] The integrating factor is the same from polar coordinates; cylindrical coordinates is just polar coordinates in one plane extended linearly along the third axis. Note the third dimension is only conventionally \(z;\) you could also set up a cylindrical coordinate system with, polar coordinates in the \(yz\)-plane and an extra linear \(x\) dimension, or polar coordinates in the \(xz\)-plane and an extra linear \(y\) dimension.

Describe the expanse whose volume is given by the iterated integral \( \int\limits_{0}^{\pi/3} \int\limits_{1}^{3} \int\limits_{0}^{3-r} r \,\mathrm{d}z \,\mathrm{d}r \,\mathrm{d}\theta \,.\)

Write down an integral that expresses the volume of the expanse bound between the paraboloid \(z = x^2+y^2\) and the sphere \(x^2+y^2+z^2 = 2.\)

\[ \int\limits_{0}^{2\pi} \int\limits_{0}^{1} \int\limits_{r^2}^{\sqrt{2-r^2}} r \,\mathrm{d}z \,\mathrm{d}r \,\mathrm{d}\theta \]

Triple Integrals in Spherical Coordinates

\[ x = \rho\sin(\varphi)\cos(\theta) \qquad y = \rho\sin(\varphi)\sin(\theta) \qquad z = \rho\cos(\varphi) \]

\[ \iiint\limits_E f \,\mathrm{d}V \;\;=\;\; \int\limits_{\alpha}^{\beta} \int\limits_{h_1(\varphi)}^{h_2(\varphi)} \int\limits_{g_1(\theta, \varphi)}^{g_2(\theta, \varphi)} f\bigl(\rho\sin(\varphi)\cos(\theta), \rho\sin(\varphi)\sin(\theta), \rho\cos(\varphi)\bigr) \,{\color{maroon} \rho^2\sin(\varphi)} \,\mathrm{d}\rho\,\mathrm{d}\theta\,\mathrm{d}\varphi \] The integrating factor comes from the fact that a single “spherical wedge” has infinitesimal “volume element” \(\mathrm{d}\rho \times \rho \sin(\varphi) \,\mathrm{d}\theta \times \rho \,\mathrm{d}\varphi.\)

Calculate volume of a sphere of radius \(R.\)

Describe the expanse whose volume is given by the iterated integral \( \int\limits_{0}^{5\pi/6} \int\limits_{\pi/3}^{2\pi/3} \int\limits_{2}^{7} \rho^2\sin(\varphi) \,\mathrm{d}\rho \,\mathrm{d}\theta \,\mathrm{d}\phi \,.\)

Write down an integral that expresses the volume of the expanse bound between the paraboloid \(z = x^2+y^2\) and the sphere \(x^2+y^2+z^2 = 2.\)

Gotta do it as a sum of two integrals since the outer radius changes: \[ \int\limits_{0}^{\pi/4} \int\limits_{0}^{2\pi} \int\limits_{0}^{\sqrt{2}} \rho^2\sin(\varphi) \,\mathrm{d}\rho \,\mathrm{d}\theta \,\mathrm{d}\phi + \int\limits_{\pi/4}^{\pi/2} \int\limits_{0}^{2\pi} \int\limits_{0}^{\frac{\cos(\phi)}{\sin^2(\phi)}} \rho^2\sin(\varphi) \,\mathrm{d}\rho \,\mathrm{d}\theta \,\mathrm{d}\phi \]

Before next class, recall the integration technique “\(u\)-substitution” and ponder what happens geometrically to space when you perform such a change of variables.

Change of Coordinates and the Jacobian

Remember “\(u\)-substitution”? You probably remember the mechanics of it, but have you ever thought about the geometry behind it?

For the integral \(\int_2^3 x\cos\bigl(x^2\bigr) \,\mathrm{d}x\) …

Let \(u = x^2,\) for which \({\color{maroon}\frac{\mathrm{d}u}{\mathrm{d}x} = 2x},\) and so \[\begin{align*} \int_2^3 x\cos\bigl(x^2\bigr) \,\mathrm{d}x \;\;&=\;\; \int_2^3 \cos\bigl(x^2\bigr) \tfrac{1}{2}({\color{maroon}2x})\,\mathrm{d}x \\\;\;&=\;\; \int_4^9 \cos\bigl(u\bigr) \tfrac{1}{2} \,\biggl({\color{maroon}\frac{\mathrm{d}u}{\mathrm{d}x}}\biggr)\,\mathrm{d}x \;\;=\;\; \tfrac{1}{2} \int_4^9 \cos\bigl(u\bigr) \,\mathrm{d}u \,. \end{align*}\] The substitution demands the sacrifice of \({\color{maroon} 2x}.\) But there is another perspective on this same mechanical procedure. Instead of making the sacrifice that the transformation demands of \(x\) before the substitution, we can instead think of the baggage that \(u\) brings to the substitution. Instead let \(x = \sqrt{u},\) for which we get the baggage \({\color{maroon}\frac{\mathrm{d}x}{\mathrm{d}u} = \frac{1}{2\sqrt{u}}},\) and so \[\begin{align*} \int_2^3 x\cos\bigl(x^2\bigr) \,\mathrm{d}x \;\;&=\;\; \int_2^3 x\cos\bigl(x^2\bigr) \,\biggl({\color{maroon}\frac{\mathrm{d}x}{\mathrm{d}u}}\biggr)\mathrm{d}u \\\;\;&=\;\; \int_4^9 \sqrt{u}\cos\Bigl(\sqrt{u}^2\Bigr) \,\biggl({\color{maroon}\frac{1}{2\sqrt{u}}}\biggr)\,\mathrm{d}u \;\;=\;\; \tfrac{1}{2} \int_4^9 \cos\bigl(u\bigr) \,\mathrm{d}u \,. \end{align*}\]

[Desmos] Geometrically, what’s happening is that we are inventing a whole other space, the \(u\)-axis, and imagining a transformation \(T\) from the \(u\)-axis to the \(x\)-axis under which our integral in question is coming from a “nicer” integral over the \(u\)-axis. \[ \int_{T^{-1}(I)} f(x) \,\mathrm{d}x \;\;\;\; \int_{I} f\bigl(T(u)\bigr) {\color{maroon}T'(u)} \,\mathrm{d}u \] The transformation is of space itself, and the “sacrifice” on the \(x\) side or the “baggage” that the transformation bring from the \(u\) side is the integrating factor that represents how lengths/areas/volumes are changing under \(T.\)

And you can even do this single-variable substitution twice for multivariable functions, although this isn’t so novel.

For the double integral \(\iint_R \frac{x}{y}\cos\bigl(x^2\ln(y)\bigr) \,\mathrm{d}A\) over the rectangular region \(R = [2,3]\times\bigl[\mathrm{e}, \mathrm{e}^2\bigr].\)

Letting \(u = x^2\) and \(v = \ln(y),\) \[\begin{align*} \int_{2}^{3}\int_{\mathrm{e}}^{\mathrm{e}^2} \frac{x}{y}\cos\bigl(x^2\ln(y)\bigr) \,\mathrm{d}y\,\mathrm{d}x &= \int_{2}^{3}\int_{\mathrm{e}}^{\mathrm{e}^2} \cos\bigl(x^2\ln(y)\bigr) \bigl(\tfrac{1}{y}\,\mathrm{d}y\bigr)\bigl(\tfrac{1}{2} 2x\,\mathrm{d}x\bigr) \\&= \int_{4}^{9}\int_{1}^{2} \cos\bigl(uv\bigr) \biggl(\frac{\mathrm{d}v}{\mathrm{d}y}\,\mathrm{d}y\biggr)\tfrac{1}{2} \biggl(\frac{\mathrm{d}u}{\mathrm{d}x}\,\mathrm{d}x\biggr) \\&= \tfrac{1}{2} \int_{4}^{9}\int_{1}^{2} \cos\bigl(uv\bigr) \,\mathrm{d}u\,\mathrm{d}v \\&= \dotsb \end{align*}\]

But this idea generalizes much further. The substitutions of \(u\) for \(x\) and \(v\) for \(y\) don’t need to be independent; they may be intertwined. The transformation \(T(u,v) = (x,y)\) can be any invertible function with continuous first-order derivatives. The only thing that becomes more complicated is the substitution for the differentials \(\mathrm{d}A = \mathrm{d}y\,\mathrm{d}x.\) What do we do with \(\frac{\partial x}{\partial v}\) and \(\frac{\partial y}{\partial u}\)? The answer is that we combine them all in a matrix called the Jacobian, and take the determinant of that matrix to measure the “baggage” of the transformation.

Let \(R\) be the parallelogram with vertices located at \((0,0)\) and \((2,1)\) and \((1,3)\) and \((3,4)\). Evaluate \(\iint_R 2y^2-3x^2 \,\mathrm{d}A.\)

Note that the transformation \((x,y) = T(u,v)\) defined by the formulas \[ x = 2u+v \qquad y = u+3v \] maps the unit square \([0,1]\times[0,1]\) onto our quadrilateral. Since \[ {\color{maroon}\operatorname{J}_T = \operatorname{det} \begin{pmatrix} 2 & 1 \\ 1 & 3 \end{pmatrix} =6-1=5} \] we have \[\begin{align*} \iint_R 2y^2-3x^2 \,\mathrm{d}y\,\mathrm{d}x &= \int_{0}^{1}\int_{0}^{1} \bigl(2(u+3v)^2-3(2u+v)^2\bigr) ({\color{maroon}5}) \,\mathrm{d}u\,\mathrm{d}v \\[1em]&= \int_{0}^{1}\int_{0}^{1} 65v^2-50u^2 \,\mathrm{d}u\,\mathrm{d}v \\[1em]&= \bigg(\frac{65}{3} - \frac{50}{3}\biggr) = \frac{25}{3} \,. \end{align*}\]

In this case \(\operatorname{J}_T\) was a constant because \(T\) was a linear transformation; \(x\) and \(y\) were each linear functions of \(u\) and \(v.\) But this doesn’t have to be the case. If \(T\) is not linear, \(\operatorname{J}_T\) may be a function.

What is the Jacobian of the transformation \((x,y) = T(u,v)\) defined by these formulas? \[ x = u\cos(v) \qquad y = u\sin(v) \]

These are polar coordinates, mildly disguised: \((x,y) = T(r,\theta)\) and \( x = r\cos(\theta)\) and \(y = r\sin(\theta). \) \[ \operatorname{J}_T = \operatorname{det} \begin{pmatrix} \cos(\theta) & -r\sin(\theta) \\ \sin(\theta) & r\cos(\theta) \end{pmatrix} = \bigl(r\cos^2(\theta)\bigr) - \bigl(-r\sin^2(\theta)\bigr) = r \]

That’s where the integrating factor \(r\) for polar coordinates comes from. And this even works in three-dimensional space (Desmos) and the integrating factors for cylindrical and spherical coordinates can be derived the same way.

The real utility of this is to be able to integrate over a larger collection of regions. Given \(\iint_R f \,\mathrm{d}A\) for a strange \(R\) if you can come up with a transformation \(T\) where \(T(S) = R\) for some much nicer region \(S\) then evaluating the equivalent integral over \(S\) is the way to go. However, without lots of practice, coming up with the transformation \(T\) can be tricky. But let’s continue working in two-dimensional space with some pre-fabricated transformations to get the hang of things.

What is the image in the \(xy\)-plane of the unit square \([0,1]\times[0,1]\) in the \(uv\)-plane under the transformation \((x,y) = T(u,v)\) defined by the component formulas \(x = u-v\) and \(y = u+\mathrm{e}^v\,? \)

Do it on the corners, explain why the edges probably aren’t just straight lines, then start parameterizing the edges.

Before next week, study everything! And do a brisk review of anything you know about probability associated with normal distributions.

Moments in General

Display, but Pop Quiz

Probability Density Functions & Expected Value

A probability density function is any function \(f\) on some domain \(D\) for which \(\int_D f(x)\,\mathrm{d}x = 1.\) The go-to example is the normal distribution, based on the curve \(y = \mathrm{e}^{-x^2}.\)

Prove that \(\int_{-\infty}^{\infty} \mathrm{e}^{-x^2} \,\mathrm{d}x = \sqrt{\pi}.\)

Let \(I = \int_{-\infty}^{\infty} \mathrm{e}^{-x^2} \,\mathrm{d}x\) and consider \[\begin{align*} I^2 &= \Biggl(\int_{-\infty}^{\infty} \mathrm{e}^{-x^2} \,\mathrm{d}x\Biggr) \Biggl(\int_{-\infty}^{\infty} \mathrm{e}^{-x^2} \,\mathrm{d}x\Biggr) \\&= \Biggl(\int_{-\infty}^{\infty} \mathrm{e}^{-x^2} \,\mathrm{d}x\Biggr) \Biggl(\int_{-\infty}^{\infty} \mathrm{e}^{-y^2} \,\mathrm{d}y\Biggr) \\&= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \mathrm{e}^{-x^2-y^2} \,\mathrm{d}x\,\mathrm{d}y \\&= \int_{0}^{2\pi}\int_{0}^{\infty} \mathrm{e}^{-r^2} \,r \,\mathrm{d}r\,\mathrm{d}\theta = \int_{0}^{2\pi} \tfrac{1}{2} \,\mathrm{d}\theta = \pi \end{align*}\] so \(I = \sqrt{\pi}.\)

So the normal PDF is given by \(\frac{1}{\sqrt{\pi}}\mathrm{e}^{-x^2}.\) Upon further calculation we see that this distribution has mean \(\mu = 0,\) but variance \(\sigma^2 = \frac{1}{2}.\) We further correct for that variance, settling on the standard normal PDF \(\frac{1}{\sqrt{2\pi}}\mathrm{e}^{-\frac{1}{2}x^2}\) which has variance \(\sigma^2\) and standard deviation \(\sigma\) both equal to one. Then we may transform this PDF linearly to have any mean or standard deviation we want: \[\frac{1}{\sqrt{2\pi\sigma}}\mathrm{e}^{-\frac{1}{2}\Bigl(\frac{x-\mu}{\sigma}\Bigr)^2}\]

Notably the kurtosis of this PDF is 3, so statisticians have taken to defining the excess kurtosis of a PDF as three less than its kurtosis.

The sequence of moments is OEIS: A123023. Can you “correct for” the higher moments like you can for the mean and standard deviation?

Introduction to Vector Calculus

Vector Fields

A vector field is a function that assigns a vector to each point in your space, \(\bm{F}\colon \mathbf{R}^2 \to \mathbf{R}^2\) or \(\bm{F}\colon \mathbf{R}^3 \to \mathbf{R}^3\) Two dimensional vector fields are nicer to visualize, so we’ll mostly be looking at those for examples. Written as \(\bm{F} = \langle L, M, N \rangle\) or as \(\bm{F} = L\mathbf{i} + M\mathbf{j} + N\mathbf{k},\) or some authors use \(P\) and \(Q\) and \(R.\) For a two-dimensional field we just drop the \(N.\)

Show off windy.com and the other one linked in the course outline.

Manually plot \(\bigl\langle xy, x-\tfrac{1}{2}y^2\bigr\rangle\) a bit and then plot it digitally

Digitally plot \(\bigl\langle z^2-3x^2y, -x^3, 2+2xz \bigr\rangle.\)

Define solenoidal (incompressible) vector fields and show them the \(\langle -y, x\rangle\) example.

Define conservative (irrotational) vector fields and show them the \(\langle 2\cos(x)\cos(y), -2\sin(x)\sin(y)\rangle\) example.

Most vector fields are neither (draw a Venn diagram). A vector field that is simultaneously conservative and solenoidal is called a Laplacian vector field, Show them the \(\bigl\langle x^2-y^2, -2xy\bigr\rangle\) example.

Conservative vector fields are related to something we’ve already seen. Recall that for a function \(f,\) the gradient operator \(\nabla\) gives us a vector field \(\nabla f.\) Conservative vector fields are exactly these, the vector fields that arise as the gradient of some function. For a conservative vector field \(\bm{F} = \nabla f\) we call \(f\) a scalar potential function for \(\bm{F}.\)

The conservative field \(\langle 2\cos(x)\cos(y), -2\sin(x)\sin(y)\rangle\) is the gradient of the function \(f(x,y) = 2\sin(x)\cos(y),\) but how could we have figured this out? Show how to check it’s conservative with Clairaut’s theorem, then compute antiderivatives to find the potential.

What’s a scalar potential for the Laplacian vector field \(\bigl\langle x^2-y^2, -2xy\bigr\rangle?\)

Is the vector field \(\bigl\langle y\mathrm{e}^x, xy-\ln(x)\bigr\rangle?\) conservative? If so determine a scalar potential function.

Is the vector field \(\bigl\langle z^2-3x^2y, -x^3, 2+2xz \bigr\rangle.\) conservative? If so determine a scalar potential function.

Before next class, review curves in space \(C\) and their parameterizations \(\bm{r},\) and their arclength parameterization in particular.

Line Integrals in Scalar Fields

A curve’s parameterization determines an orientation, the direction along which the curve is traversed. Given a curve \(C\) with implied orientation, we’ll let \(-C\) denote the same curve traversed in the opposite direction. A curve is simple if it doesn’t intersect itself, and a curve is closed if it forms a loop. The Jordan Curve Theorem states that a simple closed curve in the plane must separate the plane into an “inside” and “outside” region. For such a closed curve we’ll declare the counterclockwise (right-handed) orientation to be “positive”. We’ve talked about smooth curves, but know that a curve is piecewise-smooth if it consists of finitely many smooth pieces.

Suppose we have a curve \(C\) in the plane parameterized as \(\bm{r}(t) = \bigl\langle x(t), y(t) \bigr\rangle\) for \(t\) between \(t = a\) and \(t = b\) and there is also a scalar-valued function (scalar field) defined on the plane. As \(t\) increases from \(a\) to \(b\) consider the value of \(f\) along the curve, \(f\bigl(\bm{r}(t)\bigr).\) Suppose we want to compute the “total amount” of \(f\) along \(C,\) or compute how much of \(f\) is “accumulated” along \(C\) as \(t\) sweeps from \(a\) to \(b.\)

The line integral (or path integral, or contour integral) of \(f\) along \(C\) with respect to arclength is \( \int_C f \,\mathrm{s}\,.\) For a planar curve We could imagine the integral this way: imagine the curve as a three-dimensional curve with height \(f(x,y)\) — the curve is now \(\bigl\langle x(t), y(t), f\bigl(x(t), y(t)\bigr) \bigr\rangle\) — and the value of \( \int_C f \,\mathrm{s}\) is the area of the “curtain” between \(C\) and the \(xy\)-plane.

Actually computing the value of line integrals can be done easily by writing the integral with respect to \(\mathrm{d}t\) instead of the arclength: \[ \int_C f \,\mathrm{d}s = \int_C f \,\bigl({\color{maroon}\tfrac{\mathrm{d}s}{\mathrm{d}t}}\bigr) \, \mathrm{d}t = \int_a^b f\bigl(x(t), y(t)\bigr) {\color{maroon} \sqrt{\bigl(\tfrac{\mathrm{d}x}{\mathrm{d}t}\bigr)^2 \!\!+\! \bigl(\tfrac{\mathrm{d}y}{\mathrm{d}t}\bigr)^2 }} \,\mathrm{d}t = \int_a^b f\bigl(\bm{r}(t)\bigr) {\color{maroon}\bigl|\bm{r}'(t)\bigr|} \,\mathrm{d}t \]

Compute the line integral of \(f(x,y) = xy\) over the curve \(C\) with parameterization \(\bm{r}(t) = \bigl\langle 4t, t^2 \bigr\rangle\) for \(t\) between \(0\) and \(1.\)

\[ \int_C xy \,\mathrm{d}s = \int_0^1 \bigl(4t\bigr)\bigl(t^2\bigr) \sqrt{(4)^2 + (2t)^2} \,\mathrm{d}t = … \]

Sometimes, in the heat of calculation, we need to consider a line integral with respect to \(x\) or \(y\) or \(z\) instead of with respect to arclength (\(\mathrm{d}s\)). \[ \int_C f \,\mathrm{d}x = \int_a^b f\bigl(x(t), y(t)\bigl) \,{\color{maroon}x'(t)} \,\mathrm{d}t \qquad \int_C f \,\mathrm{d}y = \int_a^b f\bigl(x(t), y(t)\bigl) \,{\color{maroon}y'(t)} \,\mathrm{d}t \] These line integrals with respect to \(x\) or \(y\) so frequently occur together (as we’ll see tomorrow) that it’s conventional to write them as a single integral. \[ \int_C L \,\mathrm{d}x + \int_C M \,\mathrm{d}y = \int_C L \,\mathrm{d}x + M \,\mathrm{d}y \] but know that this is mostly just syntactic sugar (abuse of notation).

Evaluate \(\int_C x^3 \,\mathrm{d}x + y \,\mathrm{d}y\) over the curve \(C\) that starts at \((-3,0),\) follows a circular arc clockwise up to \((0,3),\) then follows a straight-line path down to the point \((3,0).\)

Note first that per any parameterization \(\bm{r}(t) = \bigl\langle x(t), y(t) \bigr\rangle\) \[ \int_C x^3 \,\mathrm{d}x + y \,\mathrm{d}y = \int_C \bigl(x(t)\bigr)^3 \frac{\mathrm{d}x}{\mathrm{d}t} \,\mathrm{d}t + \bigl(y(t)\bigr) \frac{\mathrm{d}y}{\mathrm{d}t} \,\mathrm{d}t \,. \] Now \(C\) consists of two pieces: \(\bigl\langle 3\cos(t), 3\sin(t) \bigr\rangle\) for \(\pi \gt t \gt \pi/2\) and \(\bigl\langle t, 3-t \bigr\rangle\) for \(0 \lt t \lt 3.\) So our previous integral becomes \[ = \int_{\pi}^{\pi/2} \bigl(3\cos(t)\bigr)^3 \bigl(-3\sin(t)\bigr) \,\mathrm{d}t + \bigl(3\sin(t)\bigr) \bigl(3\cos(t)\bigr) \,\mathrm{d}t + \int_{0}^{3} \bigl(t\bigr)^3 (1) \,\mathrm{d}t + \bigl(3-t\bigr) (-1) \,\mathrm{d}t = \dotsb \]

Evaluate \(\int_C y-x \,\mathrm{d}x + xy \,\mathrm{d}y\) over the “pacman"-shaped curve \(C\) that starts at the origin, proceeds to \((1,1)\) along a straight line, and proceeds counterclockwise along a circular arc to \((1,-1)\) before heading along a straight line back to the origin.

Line Integrals in Vector Fields

Suppose we have a curve \(C\) in the plane parameterized as \(\bm{r}(t) = \bigl\langle x(t), y(t) \bigr\rangle\) for \(t\) between \(t = a\) and \(t = b\) and there is also a vector field \(\bm{F}\) defined on the plane. Let \(\mathbf{T}\) denote the tangent vector to \(\bm{r}\) at a point, As \(t\) increases from \(a\) to \(b\) consider the value of \(\bm{F}\cdot\mathbf{T}\) along the curve, the work that the vector field is doing along the curve. Suppose we want to compute the “total amount” of work \(\bm{F}\) does along \(C,\) or compute how much energy is “accumulated” along \(C\) as \(t\) sweeps from \(a\) to \(b.\) The line integral (or path integral, or contour integral) of \(\bm{F}\) along \(C\) with respect to arclength is \( \int_C \bm{F}\cdot\mathbf{T} \,\mathrm{s}\,.\)

Actually computing the value of line integrals can be done easily by writing the integral with respect to \(\mathrm{d}t\) instead of the arclength: \[ \int_C \bm{F}\cdot\mathbf{T}\,\mathrm{d}s = \int_C \bm{F}\cdot\mathbf{T} \,\bigl({\color{maroon}\tfrac{\mathrm{d}s}{\mathrm{d}t}}\bigr)\,\mathrm{d}t = \int_a^b \bm{F}\bigl(\bm{r}(t)\bigr)\cdot\Bigl(\tfrac{\bm{r}'(t)}{|\bm{r}'(t)|}\Bigr) {\color{maroon} |\bm{r}'(t)| } \,\mathrm{d}t = \int_a^b \bm{F}\big(\bm{r}(t)\big)\cdot\bm{r}'(t)\,\mathrm{d}t = \int_C \bm{F}\cdot\mathrm{d}\bm{r} = \int_C \bigl(L\mathbf{i}+M\mathbf{j}\bigr)\cdot\bigl(\tfrac{\mathrm{d}x}{\mathrm{d}t}\mathbf{i}+\tfrac{\mathrm{d}y}{\mathrm{d}t}\mathbf{j}\bigr) \,\mathrm{d}t = \int_C L \,\mathrm{d}x + M \,\mathrm{d}y \]

The Fundamental Theorem for Line Integrals, also called the Gradient Theorem — for a conservative vector field \(\bm{F}\) continuous on a smooth curve \(C\) with parameterization \(\bm{r}\) we have \[ \int_C \bm{F} \cdot \mathrm{d}\bm{r} = \int_C \nabla f \cdot \mathrm{d}\bm{r} = f\big(\bm{r}(b)\big) - f\big(\bm{r}(a)\big) \,. \] This is to say that line integrals in conservative vector fields are path independent: for a conservative vector field \(\bm{F}\) and any two curves (paths) \(C_1\) and \(C_2\) with coinciding initial and terminal points, \(\int_{C_1} \bm{F} \cdot \mathrm{d}\bm{r} = \int_{C_2} \bm{F} \cdot \mathrm{d}\bm{r}.\) In particular a line integral of a conservative vector field over a closed curve (loop) equals zero. This provides a separate characterization of conservative vector fields. Say \(R\) is a simply-connected region if it “has no holes”, if every closed curve in \(R\) can be contracted through \(R\) to a point in \(R.\) A vector field is conservative if it is path independent on any open connected region \(R,\) or if it’s mixed partials agree on any open simply-connected region \(R.\)

Surface Integrals in Vector Fields

Examples to end class

Green’s Theorem

This relates the line integral over a boundary of a region with the double integral over the interior of the region.

A closed curve is positively oriented if it is being traversed counter-clockwise.

For a positively oriented, piecewise-smooth, simple closed planar curve \(C\) with interior region \(R\) if \(P\) and \(Q\) have continuous partial derivatives on some open neighborhood containing \(R,\) then \[ \int\limits_C P\,\mathrm{d}x + Q\,\mathrm{d}y = \iint\limits_R \bigg(\frac{\partial Q}{\partial x}-\frac{\partial P}{\partial y}\bigg)\,\mathrm{d}A \]

A short-hand notation for the boundary of a region \(R\) is \(\partial R\) which is a brilliant overload of the partial-differential operator in light of Stokes’ Theorem.

When the orientation of the curve \(C\) needs to be acknowledged we use the notation \[\oint\limits_C P\,\mathrm{d}x + Q\,\mathrm{d}y\] and sometimes even put a little arrow on that circle in \(\oint\) to specify the orientation.

We can extend Green’s theorem to non-simple closed regions.

Bio on George Green.

Examples to end class

Divergence & Curl

For a vector field \(\bm{F} = P\mathbf{i} + Q\mathbf{j} + R\mathbf{k}\) the curl of \(\bm{F},\) denoted \(\operatorname{curl}\bm{F}\) is defined to be \[\begin{align*} \operatorname{curl}\bm{F} = \nabla \times \bm{F} &= \det\begin{pmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ P & Q & R \end{pmatrix} \\&= \biggl(\frac{\partial R}{\partial y}-\frac{\partial Q}{\partial z}\biggr)\mathbf{i} + \biggl(\frac{\partial P}{\partial z}-\frac{\partial R}{\partial x}\biggr)\mathbf{j} + \biggl(\frac{\partial Q}{\partial x}-\frac{\partial P}{\partial y}\biggr)\mathbf{k} \end{align*}\]

If \(f\) has continuous second-order partial derivatives, then \(\operatorname{curl} \nabla f = \mathbf{0}.\)

If \(\bm{F}\) is defined on all of \(\mathbf{R}^3\) and if the components functions of \(\bm{F}\) have continuous second-order partial derivatives, and \(\operatorname{curl} \bm{F} = \mathbf{0},\) then \(\bm{F}\) is a conservative vector field.

For a vector field \(\bm{F} = P\mathbf{i} + Q\mathbf{j} + R\mathbf{k}\) the divergence of \(\bm{F},\) denoted \(\operatorname{div}\bm{F}\) is defined to be \[ \operatorname{div}\bm{F} = \nabla \cdot \bm{F} = \frac{\partial P}{\partial x} + \frac{\partial Q}{\partial y} + \frac{\partial R}{\partial z} \]

If \(\bm{F} = P\mathbf{i} + Q\mathbf{j} + R\mathbf{k}\) is defined on all of \(\mathbf{R}^3\) and if the components functions of \(\bm{F}\) have continuous second-order partial derivatives, then \(\operatorname{div}\operatorname{curl} \bm{F} = 0.\)

Laplace operator \(\nabla^2 = \nabla \cdot \nabla\)

Green’s theorem in vector form \[ \oint\limits_C \bm{F}\cdot\mathrm{d}r = \oint\limits_C \bm{F}\cdot \mathbf{T}_r \,\mathrm{d}\ell = \iint\limits_R \big(\operatorname{curl}\bm{F}\big) \cdot \mathbf{k} \,\mathrm{d}A \] or alternatively \[ = \oint\limits_C \bm{F}\cdot \mathbf{n}_r \,\mathrm{d}\ell = \iint\limits_R \operatorname{div}\bm{F}(x,y)\,\mathrm{d}A \]

Examples to end class

Before next class, TK Question/Review for next class

Stokes’ Theorem

Like Green’s theorem but in a higher dimension. I.e. you can evaluate flux just by looking at the boundary.

Given an oriented piecewise-smooth surface \(S\) that is bounded by a simple, closed, piecewise-smooth, positively oriented boundary curve \(\partial S,\) and a vector field \(\bm{F}\) whose components have continuous partial derivatives on a open region in space containing \(S,\) \[ \int\limits_{\partial S} \bm{F}\cdot\mathrm{d}\bm{r} = \iint\limits_S \operatorname{curl}\bm{F} \cdot \mathrm{d}\bm{S} \,. \]

Examples

Bio on George Stokes.

Examples to end class

Before next class, TK Question/Review for next class

The Divergence Theorem

This is now the same theorem, but for regions whose boundary are surfaces

Given a simple solid region \(E\) where \(\partial E\) denotes the boundary surface of \(E\) taken to have positive (outward) orientation, and given a vector field \(\bm{F}\) whose components have continuous partial derivatives on a open region in space containing \(E,\) \[ \iint\limits_{\partial E} \bm{F}\cdot\mathrm{d}\bm{S} = \iiint\limits_E \operatorname{div}\bm{F} \,\mathrm{d}V \,. \]

Examples

Sometimes called Gauss’s Theorem, or Ostrogradsky’s Theorem

Examples to end class

Before next class, TK Question/Review for next class

Maxwell’s Equations

Gauss’s Law — Electric charge creates an electric field. (Electrons create electric fields)
Gauss’s Law for Magnetism — Total magnetic field through a volume needs to add up to 0. Ie, if you cut a bar magnet in half, you now have two bar magnets with a North and South pole each. (Not one north pole magnet and one south pole magnet)
Faraday’s Law — A changing magnetic field creates voltage. Ie, like those shake flashlights where they have a magnet inside and then the flash light turns on.
Ampère-Maxwell Law — Current creates magnetic field. Ie think of electromagnets like those things that stick to metal in a dump or an MRI machine.

See this

Introduction

Typographic Conventions

Terminology & Definitions

Analytic & Vector Geometry in Three-Dimensional Space

Three-Dimensional Space

Cylindrical and Spherical Coordinates

Lines, Planes, and Surfaces

Vectors in Three-Dimensional Space

The Dot Product and Projections

The Cross Product and Areas

Distances Between Points, Lines, and Planes

Force, Work, Torque, etc

Calculus of Vector-Valued Functions: Curves

Parametrically-Defined Curves and Surfaces

Calculus with Vector-Valued Functions

The Frenet-Serret Frame

Curvature & Torsion

Projectile Motion & Kepler’s Laws

Calculus of Multivariable Functions: Surfaces

Multivariable Functions and their Graphs

Limits & Continuity

Partial Derivatives

Tangent Planes & Taylor Approximations

The Chain Rule & Implicit Differentiation

The Gradient Vector & Directional Derivatives

Local and Global Extrema

Optimization with Lagrange Multipliers

Guassian Curvature

Integration in Higher Dimensions

Introduction to Double and Triple Integrals

Double Integrals in Rectangular Coordinates

Double Integrals in Polar Coordinates

Moments of Mass

Triple Integrals in Rectangular Coordinates

Triple Integrals in Cylindrical Coordinates

Triple Integrals in Spherical Coordinates

Change of Coordinates and the Jacobian

Moments in General

Probability Density Functions & Expected Value

Introduction to Vector Calculus

Vector Fields

Line Integrals in Scalar Fields

Line Integrals in Vector Fields

Surface Integrals in Vector Fields

Green’s Theorem

Divergence & Curl

Stokes’ Theorem

The Divergence Theorem

Maxwell’s Equations

The Helmholtz Decomposition Theorem