Calculus/Print version


Table of Contents

Precalculus

1.1 Algebra

1.2 Functions

1.3 Graphing linear functions

1.4 Exercises

Limits


2.1 An Introduction to Limits

2.2 Finite Limits

2.3 Infinite Limits

2.4 Continuity

2.5 Formal Definition of the Limit

2.6 Proofs of Some Basic Limit Rules

2.7 Exercises

Differentiation

Basics of Differentiation


3.1 Differentiation Defined

3.2 Product and Quotient Rules

3.3 Derivatives of Trigonometric Functions

3.4 Chain Rule

3.5 Higher Order Derivatives - An introduction to second order derivatives

3.6 Implicit Differentiation

3.7 Derivatives of Exponential and Logarithm Functions

3.8 Some Important Theorems

3.9 Exercises

Applications of Derivatives

3.10 L'Hôpital's Rule

3.11 Extrema and Points of Inflection

3.12 Newton's Method

3.13 Related Rates

3.14 Optimization

3.15 Euler's Method

3.16 Exercises


Integration

The definite integral of a function f(x) from x=0 to x=a is equal to the area under the curve from 0 to a.

Basics of Integration

4.1 Definite integral

4.2 Fundamental Theorem of Calculus

4.3 Indefinite integral

4.4 Improper Integrals

Integration Techniques

From bottom to top:
  • an acceleration function a(t);
  • the integral of the acceleration is the velocity function v(t);
  • and the integral of the velocity is the distance function s(t).

4.5 Infinite Sums

4.6 Derivative Rules and the Substitution Rule

4.7 Integration by Parts

4.8 Trigonometric Substitutions

4.9 Trigonometric Integrals

4.10 Rational Functions by Partial Fraction Decomposition

4.11 Tangent Half Angle Substitution

4.12 Reduction Formula

4.13 Irrational Functions

4.14 Numerical Approximations

4.15 Exercises

Applications of Integration

4.16 Area

4.17 Volume

4.18 Volume of solids of revolution

4.19 Arc length

Parametric and Polar Equations


Parametric Equations

Polar Equations

Sequences and Series

Basics

Series and calculus

Multivariable and Differential Calculus


Extensions

Advanced Integration Techniques

Further Analysis

Formal Theory of Calculus

Appendix

  • Choosing delta

Solutions


References

Acknowledgements and Further Reading

Introduction


What is calculus?

Calculus is the broad area of mathematics dealing with such topics as instantaneous rates of change, areas under curves, and sequences and series. Underlying all of these topics is the concept of a limit, which consists of analyzing the behavior of a function at points ever closer to a particular point, but without ever actually reaching that point. As a typical application of the methods of calculus, consider a moving car. It is possible to create a function describing the displacement of the car (where it is located in relation to a reference point) at any point in time as well as a function describing the velocity (speed and direction of movement) of the car at any point in time. If the car were traveling at a constant velocity, then algebra would be sufficient to determine the position of the car at any time; if the velocity is unknown but still constant, the position of the car could be used (along with the time) to find the velocity.

However, the velocity of a car cannot jump from zero to 35 miles per hour at the beginning of a trip, stay constant throughout, and then jump back to zero at the end. As the accelerator is pressed down, the velocity rises gradually, and usually not at a constant rate (i.e., the driver may push on the gas pedal harder at the beginning, in order to speed up). Describing such motion and finding velocities and distances at particular times cannot be done using methods taught in pre-calculus, whereas it is not only possible but straightforward with calculus.

Calculus has two basic branches: differential calculus and integral calculus. The simplest introduction to differential calculus involves an explicit series of numbers. Given the series (42, 43, 3, 18, 34), the differential of this series would be (1, -40, 15, 16). The new series is derived from the differences of successive numbers, which gives rise to the name "differential". Rarely, if ever, are differentials used on an explicit series of numbers as done here. Instead, they are derived from a continuous function in a manner which is described later.

Integral calculus, like differential calculus, can also be introduced via series of numbers. Notice that in the previous example, the original series can almost be derived solely from its differential. Instead of taking the difference, however, integration involves taking the sum. Given the first number of the original series, 42 in this case, the rest of the original series can be derived by adding each successive number in its differential (42+1, 43-40, 3+15, 18+16). Note that knowledge of the first number in the original series is crucial in deriving the integral. As with differentials, integration is performed on continuous functions rather than explicit series of numbers, but the concept is still the same. Integral calculus allows us to calculate the area under a curve of almost any shape; in the car example, this enables you to find the displacement of the car based on the velocity curve. This is because the area under the curve is the total distance moved, as we will soon see. For example, suppose we need to add up a quantity that is sampled continuously, such as the readings 23, 25, 24, 25, 34, 45, 46, 47, and so on; for sums of this kind, integral calculus is far more practical than adding the terms one by one.
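The series manipulations above can be sketched in a few lines of Python (a minimal illustration; the function names are ours, not standard):

```python
def differential(series):
    """Differences of successive terms, e.g. (43-42, 3-43, 18-3, 34-18)."""
    return [b - a for a, b in zip(series, series[1:])]

def integral(first, diffs):
    """Rebuild a series from its first term and its differential,
    by adding each successive difference (a running sum)."""
    result = [first]
    for d in diffs:
        result.append(result[-1] + d)
    return result

original = [42, 43, 3, 18, 34]
diffs = differential(original)   # [1, -40, 15, 16]
rebuilt = integral(42, diffs)    # recovers [42, 43, 3, 18, 34]
```

As the text notes, the first term (42 here) is essential: without it, the differential alone cannot pin down the original series.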

Why learn calculus?

Calculus is essential for many areas of science and engineering. Both make heavy use of mathematical functions to describe and predict physical phenomena that are subject to continual change, and this requires the use of calculus. Take our car example: if you want to design cars, you need to know how to calculate forces, velocities, accelerations, and positions. All require calculus. Calculus is also necessary to study the motion of gases and particles, the interaction of forces, and the transfer of energy. It is also useful in business whenever rates are involved. For example, equations involving interest or supply and demand curves are grounded in the language of calculus.

Calculus also provides important tools in understanding functions and has led to the development of new areas of mathematics including real and complex analysis, topology, and non-euclidean geometry.

Notwithstanding calculus' functional utility (pun intended), many non-scientists and non-engineers have chosen to study calculus just for the challenge of doing so. A smaller number of persons undertake such a challenge and then discover that calculus is beautiful in and of itself.

What is involved in learning calculus?

Learning calculus, like much of mathematics, involves two parts:

  • Understanding the concepts: You must be able to explain what it means when you take a derivative rather than merely apply the formulas for finding a derivative. Otherwise, you will have no idea whether or not your solution is correct. Drawing diagrams, for example, can help clarify abstract concepts.
  • Symbolic manipulation: Like other branches of mathematics, calculus is written in symbols that represent concepts. You will learn what these symbols mean and how to use them. A good working knowledge of trigonometry and algebra is a must, especially in integral calculus. Sometimes you will need to manipulate expressions into a usable form before it is possible to perform operations in calculus.

What you should know before using this text

There are some basic skills that you need before you can use this text. Continuing with our example of a moving car:

  • You will need to describe the motion of the car in symbols. This involves understanding functions.
  • You need to manipulate these functions. This involves algebra.
  • You need to translate symbols into graphs and vice-versa. This involves understanding the graphing of functions.
  • It also helps (although it isn't necessarily essential) if you understand the functions used in trigonometry since these functions appear frequently in science.

Scope

The first four chapters of this textbook cover the topics taught in a typical high school or first year college course. The first chapter, Precalculus, reviews those aspects of functions most essential to the mastery of calculus. The second, Limits, introduces the concept of the limit process. It also discusses some applications of limits and proposes using limits to examine the slope and area of functions. The next two chapters, Differentiation and Integration, apply limits to calculate derivatives and integrals. The Fundamental Theorem of Calculus is used, as are the essential formulae for computing derivatives and integrals without resorting to the limit process. The third and fourth chapters also include articles that apply the previously learned concepts to calculating volumes and other important quantities.

The remainder of the central Calculus chapters cover topics taught in higher-level calculus courses: multivariable calculus, vectors, and series (Taylor, convergent, divergent).

Finally, the other chapters cover the same material, using formal notation. They introduce the material at a much faster pace, and cover many more theorems than the other two sections. They assume knowledge of some set theory and set notation.


Precalculus

1.1 Algebra


This section is intended to review algebraic manipulation. It is important to understand algebra in order to do calculus. If you have a good knowledge of algebra, you should probably just skim this section to be sure you are familiar with the ideas.

Rules of arithmetic and algebra

The following laws are true for all a, b, and c, whether a, b, and c are numbers, variables, functions, or more complex expressions involving numbers, variables, and/or functions.

Addition

  • Commutative Law: a+b=b+a \,.
  • Associative Law: (a+b)+c=a+(b+c)\,.
  • Additive Identity: a+0=a\,.
  • Additive Inverse: a+(-a)=0\,.

Subtraction

  • Definition: a-b = a+(-b)\,.

Multiplication

  • Commutative Law: a\times b=b\times a\,.
  • Associative Law: (a\times b)\times c=a\times (b\times c)\,.
  • Multiplicative Identity: a\times 1=a\,.
  • Multiplicative Inverse: a\times \frac{1}{a}=1, whenever a \neq 0\,.
  • Distributive Law: a\times (b+c)=(a\times b)+(a\times c)\,.

Division

  • Definition: \frac{a}{b}=a\times \frac{1}{b}, whenever b \neq 0\,.

Let's look at an example to see how these rules are used in practice.

\frac{(x+2)(x+3)}{x+3} = \left[(x+2)\times (x+3)\right]\times \left( \frac{1}{x+3}\right) (from the definition of division)
= (x+2)\times \left[(x+3)\times \left(\frac{1}{x+3} \right) \right] (from the associative law of multiplication)
= ((x+2)\times (1)),\qquad x \neq -3 \, (from multiplicative inverse)
= x+2, \qquad x \neq -3. (from multiplicative identity)

Of course, the above is much longer than simply cancelling x+3 out in both the numerator and denominator. But, when you are cancelling, you are really just doing the above steps, so it is important to know what the rules are so as to know when you are allowed to cancel. Occasionally people do the following, for instance, which is incorrect:


\frac{2\times (x + 2)}{2}= \frac{2}{2} \times \frac{x+2}{2}=1 \times \frac{x+2}{2}= \frac{x+2}{2}.


The correct simplification is


\frac{2\times (x + 2)}{2}= \left( 2 \times \frac{1}{2} \right) \times (x+2)=1 \times (x+2)=x+2,


where the number 2 cancels out in both the numerator and the denominator.

Interval notation

There are a few different ways that one can express with symbols a specific interval (all the numbers between two numbers). One way is with inequalities. If we wanted to denote the set of all numbers between, say, 2 and 4, we could write "all x satisfying 2<x<4." This excludes the endpoints 2 and 4 because we use < instead of  \leq . If we wanted to include the endpoints, we would write "all x satisfying 2 \leq x \leq 4 ." This includes the endpoints.

Another way to write these intervals would be with interval notation. If we wished to convey "all x satisfying 2<x<4" we would write (2,4). This does not include the endpoints 2 and 4. If we wanted to include the endpoints we would write [2,4]. If we wanted to include 2 and not 4 we would write [2,4); if we wanted to exclude 2 and include 4, we would write (2,4].

Thus, we have the following table:

Endpoint conditions         Inequality notation                Interval notation
Including both 2 and 4      all x satisfying 2 \leq x \leq 4   [2,4]
Including neither 2 nor 4   all x satisfying 2 < x < 4         (2,4)
Including 2 but not 4       all x satisfying 2 \leq x < 4      [2,4)
Including 4 but not 2       all x satisfying 2 < x \leq 4      (2,4]

In general, we have the following table:

Meaning Interval Notation Set Notation
All values greater than or equal to a and less than or equal to b \left[a,b\right] \left\{x:a\le x\le b\right\}
All values greater than a and less than b \left(a,b\right) \left\{x:a < x < b\right\}
All values greater than or equal to a and less than b \left[a,b\right) \left\{x:a\le x < b\right\}
All values greater than a and less than or equal to b \left(a,b\right] \left\{x:a < x\le b\right\}
All values greater than or equal to a. \left[a,\infty\right) \left\{x:x\ge a\right\}
All values greater than a. \left(a,\infty\right) \left\{x:x > a\right\}
All values less than or equal to a. \left(-\infty,a\right] \left\{x:x\le a\right\}
All values less than a. \left(-\infty,a\right) \left\{x:x < a\right\}
All values. \left(-\infty,\infty\right) \left\{x: x\in\mathbb{R}\right\}

Note that \infty and -\infty must always have an exclusive parenthesis rather than an inclusive bracket. This is because \infty is not a number, and therefore cannot be in our set; it is really just a convenient symbol that makes intervals like those above easier to write.

The interval (a,b) is called an open interval, and the interval [a,b] is called a closed interval.

Intervals are sets and we can use set notation to show relations between values and intervals. If we want to say that a certain value is contained in an interval, we can use the symbol \in to denote this. For example, 2\in[1,3]. Likewise, the symbol \notin denotes that a certain element is not in an interval. For example 0\notin(0,1).
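A small Python helper (our own, hypothetical) can mirror this notation: a flag set to True plays the role of a bracket (endpoint included), False the role of a parenthesis (endpoint excluded).

```python
def in_interval(x, a, b, include_a=True, include_b=True):
    """Test whether x lies between a and b; the include_* flags
    correspond to brackets (True) versus parentheses (False)."""
    left = (x >= a) if include_a else (x > a)
    right = (x <= b) if include_b else (x < b)
    return left and right

in_interval(2, 1, 3)                   # 2 ∈ [1,3], so True
in_interval(0, 0, 1, include_a=False)  # 0 ∉ (0,1), so False
```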

Exponents and radicals

There are a few rules and properties involving exponents and radicals that you'd do well to remember. As a definition we have that if n is a positive integer then  a^n denotes n factors of a. That is,

 a^n = a\cdot a \cdot a \cdots a \qquad (n~ \mbox{times}).

If  a \not= 0 then we say that a^0 =1 \, .

If n is a negative integer then we say that  a^{-n} = \frac{1}{a^n} .

If we have an exponent that is a fraction then we say that  a^{m/n} = \sqrt[n]{a^m} = (\sqrt[n]{a})^m .

In addition to the previous definitions, the following rules apply:

Rule Example
 a^n \cdot a^m = a^{n+m}  3^6 \cdot 3^9 = 3^{15}
 \frac{a^n}{a^m} = a^{n-m}  \frac{x^3}{x^2} = x^{1} = x
 (a^n)^m = a^{n\cdot m}  (x^4)^5 = x^{20} \,\!
 (ab)^n = a^n b^n \,\!  (3x)^5 = 3^5 x^5 \,\!
 \bigg(\frac{a}{b}\bigg)^n = \frac{a^n}{b^n}  \bigg(\frac{7}{3}\bigg)^3 = \frac{7^3}{3^3}.
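These rules can be checked numerically. The sketch below uses Python's exact Fraction type so that even the quotient rule with a negative exponent (3^{-1} = 1/3) is verified without rounding error:

```python
from fractions import Fraction

# Exact rational arithmetic avoids floating-point rounding issues.
a, b = Fraction(3), Fraction(7)
n, m = 4, 5

assert a**n * a**m == a**(n + m)   # a^n · a^m = a^(n+m)
assert a**n / a**m == a**(n - m)   # a^n / a^m = a^(n-m); here 3^(-1) = 1/3
assert (a**n)**m == a**(n * m)     # (a^n)^m = a^(n·m)
assert (a * b)**n == a**n * b**n   # (ab)^n = a^n · b^n
assert (a / b)**n == a**n / b**n   # (a/b)^n = a^n / b^n
```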

Factoring and roots

Given the expression  x^2 + 3x + 2 , one may ask "what are the values of x that make this expression 0?" If we factor we obtain

 x^2 + 3x + 2 = (x + 2)(x + 1). \,\!

If x=-1 or -2, then one of the factors on the right becomes zero. Therefore, the whole must be zero. So, by factoring we have discovered the values of x that render the expression zero. These values are termed "roots." In general, given a quadratic polynomial  px^2 + qx + r that factors as

 px^2 + qx + r = (ax + c)(bx + d) \,\!

then we have that x = -c/a and x = -d/b are roots of the original polynomial.

A special case to be on the look out for is the difference of two squares,  a^2 - b^2. In this case, we are always able to factor as

 a^2 - b^2 = (a+b)(a-b). \,\!

For example, consider  4x^2 - 9 . On initial inspection we would see that both  4x^2 and  9 are squares ((2x)^2 = 4x^2 and  3^2 = 9 ). Applying the previous rule we have

 4x^2 - 9 = (2x+3)(2x-3). \,\!

The following is a general result of great utility.

The quadratic formula
Given any quadratic equation ax^2+bx+c=0, a\neq0, all solutions of the equation are given by the quadratic formula:

x=\frac{-b\pm\sqrt{b^2-4ac}}{2a}.
Example: Find all the roots of 4x^2+7x-2

Finding the roots is equivalent to solving the equation 4x^2+7x-2=0. Applying the quadratic formula with a=4, b=7, c=-2, we have:
x=\frac{-7\pm\sqrt{7^2-4(4)(-2)}}{2(4)}

x=\frac{-7\pm\sqrt{49+32}}{8}

x=\frac{-7\pm\sqrt{81}}{8}

x=\frac{-7\pm9}{8}

x=\frac{2}{8}, x=\frac{-16}{8}

x=\frac{1}{4}, x=-2

The quadratic formula can also help with factoring, as the next example demonstrates.

Example: Factor the polynomial 4x^2+7x-2

We already know from the previous example that the polynomial has roots x=\frac{1}{4} and x=-2. Our factorization will take the form
 C(x+2)(x-\frac{1}{4})
All we have to do is set this expression equal to our polynomial and solve for the unknown constant C:
 C(x+2)(x-\frac{1}{4})=4x^2+7x-2
 C(x^2+(-\frac{1}{4}+2)x-\frac{2}{4})=4x^2+7x-2
 C(x^2+\frac{7}{4}x-\frac{1}{2})=4x^2+7x-2
You can see that C=4 solves the equation. So the factorization is
4x^2+7x-2=4(x+2)(x-\frac{1}{4})=(x+2)(4x-1)

Note that if 4ac>b^2 then the roots will not be real numbers.
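The quadratic formula translates directly into code. The sketch below (the function name is ours) uses cmath.sqrt so that the case 4ac > b^2, where the discriminant is negative, yields the complex roots instead of an error:

```python
import cmath

def quadratic_roots(a, b, c):
    """Both solutions of ax^2 + bx + c = 0 by the quadratic formula."""
    disc = cmath.sqrt(b**2 - 4*a*c)   # complex sqrt handles b^2 - 4ac < 0
    return (-b + disc) / (2*a), (-b - disc) / (2*a)

quadratic_roots(4, 7, -2)   # roots 1/4 and -2, as in the example above
```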

Simplifying rational expressions

Consider the two polynomials

p(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0

and

 q(x) = b_m x^m + b_{m-1}x^{m-1} + \cdots + b_1x + b_0.

When we take the quotient of the two we obtain

\frac{p(x)}{q(x)} = \frac{a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0}{b_m x^m + b_{m-1}x^{m-1} + \cdots + b_1x + b_0}.

The ratio of two polynomials is called a rational expression. Many times we would like to simplify such a beast. For example, say we are given \frac{x^2-1}{x+1}. We may simplify this in the following way:

\frac{x^2-1}{x+1} = \frac{(x+1)(x-1)}{x+1} = x-1, \qquad x \neq -1 \,\!

This is nice because we have obtained something we understand quite well, x-1 , from something we didn't.

Formulas of multiplication of polynomials

Here are some formulas that can be quite useful for solving polynomial problems:

(a+b)^2=a^2+2ab+b^2
(a-b)^2=a^2-2ab+b^2
(a-b)(a+b)=a^2-b^2
(a\pm b)^3=a^3\pm 3a^2b+3ab^2\pm b^3
a^3\pm b^3=(a\pm b)(a^2\mp ab+b^2)

Polynomial Long Division

Suppose we would like to divide one polynomial by another. The procedure is similar to long division of numbers and is illustrated in the following example:

Example

Divide x^2-2x-15 (the dividend or numerator) by x+3 (the divisor or denominator)

Similar to long division of numbers, we set up our problem as follows:

\begin{array}{rl}\\
x+3\!\!\!\!&\big)\!\!\!\begin{array}{lll}
\hline
\,x^2-2x-15
\end{array}\end{array}

First we have to answer the question, how many times does x+3 go into x^2? To find out, divide the leading term of the dividend by the leading term of the divisor: x^2 divided by x is x, so it goes in x times. We record this above the leading term of the dividend:

\begin{array}{rl}&~~\,x\\
x+3\!\!\!\!&\big)\!\!\!\begin{array}{lll}
\hline
\,x^2-2x-15
\end{array}\\
\end{array}

Next, we multiply x+3 by x and write the product below the dividend as follows:

\begin{array}{rl}&~~\,x\\
x+3\!\!\!\!&\big)\!\!\!\begin{array}{lll}
\hline
\,x^2-2x-15
\end{array}\\
&\!\!\!\!-\underline{(x^2+3x)~~~}\\
\end{array}

Now we perform the subtraction, bringing down any terms in the dividend that aren't matched in our subtrahend:

\begin{array}{rl}&~~\,x\\
x+3\!\!\!\!&\big)\!\!\!\begin{array}{lll}
\hline
\,x^2-2x-15
\end{array}\\
&\!\!\!\!-\underline{(x^2+3x)~~~}\\
&\!\!\!\!~~~~~~-5x-15~~~\\
\end{array}

Now we repeat, treating the bottom line as our new dividend:

\begin{array}{rl}&~~\,x-5\\
x+3\!\!\!\!&\big)\!\!\!\begin{array}{lll}
\hline
\,x^2-2x-15
\end{array}\\
&\!\!\!\!-\underline{(x^2+3x)~~~}\\
&\!\!\!\!~~~~~~-5x-15~~~\\
&\!\!\!\!~~~-\underline{(-5x-15)~~~}\\
&\!\!\!\!~~~~~~~~~~~~~~~~~~~0~~~\\
\end{array}

In this case we have no remainder.
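The procedure above can be sketched as a short routine on coefficient lists (highest-degree coefficient first); this is our own illustration, not a standard library function:

```python
def poly_divmod(dividend, divisor):
    """Polynomial long division on coefficient lists, highest degree first.
    Returns (quotient, remainder) as coefficient lists."""
    out = list(dividend)
    d = len(divisor) - 1              # degree of the divisor
    quotient = []
    for i in range(len(dividend) - d):
        coef = out[i] / divisor[0]    # how many times the divisor "goes in"
        quotient.append(coef)
        for j, c in enumerate(divisor):
            out[i + j] -= coef * c    # subtract coef times the divisor
    return quotient, out[len(quotient):]

# x^2 - 2x - 15 divided by x + 3, as in the example:
poly_divmod([1, -2, -15], [1, 3])   # quotient x - 5, with zero remainder
```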

Application: Factoring Polynomials

We can use polynomial long division to factor a polynomial if we know one of the factors in advance. For example, suppose we have a polynomial P(x) and we know that r is a root of P. If we perform polynomial long division using P(x) as the dividend and (x-r) as the divisor, we will obtain a polynomial Q(x) such that P(x)=(x-r)Q(x), where the degree of Q is one less than the degree of P.

Exercise

1. Factor x-1 out of 6x^3-4x^2+3x-5.

Solution

(x-1)(6x^2+2x+5)

Application: Breaking up a rational function

Similar to the way one can convert an improper fraction into an integer plus a proper fraction, one can convert a rational function P(x) whose numerator N(x) has degree n and whose denominator D(x) has degree d with n\geq d into a polynomial plus a rational function whose numerator has degree \nu and denominator has degree \delta with \nu<\delta.

Suppose that N(x) divided by D(x) has quotient Q(x) and remainder R(x). That is

N(x)=D(x)Q(x)+R(x)

Dividing both sides by D(x) gives

\frac{N(x)}{D(x)}=Q(x)+\frac{R(x)}{D(x)}

R(x) will have degree less than D(x).

Example

Write \frac{x-1}{x-3} as a polynomial plus a rational function with numerator having degree less than the denominator.
\begin{array}{rl}&~~\,1\\
x-3\!\!\!\!&\big)\!\!\!\begin{array}{lll}
\hline
\,x-1
\end{array}\\
&\!\!\!\!-\underline{(x-3)~~~}\\
&\!\!\!\!~~~~~~~~~2~~~\\
\end{array}

so

\frac{x-1}{x-3}=1+\frac{2}{x-3}
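As a quick numerical sanity check (at the sample point x = 5, chosen arbitrarily, and assuming x ≠ 3), both sides of this identity agree:

```python
x = 5.0
lhs = (x - 1) / (x - 3)   # the original rational function
rhs = 1 + 2 / (x - 3)     # quotient plus remainder over divisor
assert lhs == rhs == 2.0
```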

1.2 Functions


What functions are and how are they described


Whenever one quantity depends on one or more quantities, we have a function. You can think of a function as a kind of machine. You feed the machine raw materials, and the machine changes the raw materials into a finished product based on a specific set of instructions.

A function in everyday life

Think about dropping a ball from a bridge. At each moment in time, the ball is a height above the ground. The height of the ball is a function of time. It was the job of physicists to come up with a formula for this function. This type of function is called real-valued since the "finished product" is a number (or, more specifically, a real number).

A function in everyday life (Preview of Multivariable Calculus)

Think about a wind storm. At different places, the wind can be blowing in different directions with different intensities. The direction and intensity of the wind can be thought of as a function of position. This is a function of two real variables (a location is described by two values - an x and a y) which results in a vector (which is something that can be used to hold a direction and an intensity). These functions are studied in multivariable calculus (which is usually studied after a one year college level calculus course). This is a vector-valued function of two real variables.

We will be looking at real-valued functions until studying multivariable calculus. Think of a real-valued function as an input-output machine; you give the function an input, and it gives you an output which is a number (more specifically, a real number). For example, the squaring function takes the input 4 and gives the output value 16. The same squaring function takes the input -1 and gives the output value 1.

There are many ways in which people describe functions. In the examples above, a verbal description is given (the height of the ball above the earth as a function of time). Here is a list of the most common ways to describe functions.

  1. A function is given a name (such as f) and a formula for the function is also given. For example, f(x) = 3 x + 2 describes a function. We refer to the input as the argument of the function (or the independent variable), and to the output as the value of the function at the given argument.
  2. A function is described using an equation and two variables. One variable is for the input of the function and one is for the output of the function. The variable for the input is called the independent variable. The variable for the output is called the dependent variable. For example,  y = 3 x + 2 describes a function. The dependent variable appears by itself on the left hand side of equal sign.
  3. A verbal description of the function.

When a function is given a name (as in number 1 above), the name of the function is usually a single letter of the alphabet (such as f or g), though some functions have names consisting of multiple letters (like the sine function y = sin(x)).

Plugging a value into a function

If we write f(x) = 3x+2 \ , then we know that

  • The function f is a function of x.
  • To evaluate the function at a certain number, replace the x with that number.
  • Replacing x with that number in the right side of the function will produce the function's output for that certain input.
  • In English, the definition of f \ is interpreted, "Given a number, f will return two more than the triple of that number."

How would we know the value of the function f at 3? We would have the following three thoughts:

  1. f(3) = 3(3) + 2
  2.  3(3) + 2 = 9 + 2
  3. 9+2=11

and we would write

f(3) = 3(3)+2 = 9+2 = 11.

The value of f \ at 3 is 11.

Note that f(3) \ means the value of the dependent variable when x \ takes on the value of 3. So we see that the number 11 is the output of the function when we give the number 3 as the input. People often summarize the work above by writing "the value of f at three is eleven", or simply "f of three equals eleven".
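The function and its evaluation at 3 look just the same in Python:

```python
def f(x):
    """Given a number, return two more than the triple of that number."""
    return 3 * x + 2

f(3)   # 11
```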

Classical understanding of functions

To provide the classical understanding of functions, think of a function as a kind of machine. You feed the machine raw materials, and the machine changes the raw materials into a finished product based on a specific set of instructions. The kinds of functions we consider here, for the most part, take in a real number, change it in a formulaic way, and give out a real number (possibly the same as the one it took in). Think of this as an input-output machine; you give the function an input, and it gives you an output. For example, the squaring function takes the input 4 and gives the output value 16. The same squaring function takes the input -1 and gives the output value 1.

A function is usually written as f, g, or something similar - although it doesn't have to be. A function is always defined as "of a variable" which tells us what to replace in the formula for the function.

For example, f(x) = 3x+2 \ tells us:

  • The function f is a function of x.
  • To evaluate the function at a certain number, replace the x with that number.
  • Replacing x with that number in the right side of the function will produce the function's output for that certain input.
  • In English, the definition of f \ is interpreted, "Given a number, f will return two more than the triple of that number."

Thus, if we want to know the value (or output) of the function at 3:

f(x) = 3x+2 \
f(3) = 3(3)+2 \ We evaluate the function at x = 3.
f(3) = 9+2 = 11 \ The value of f \ at 3 is 11.

See? It's easy!

Note that f(3) \ means the value of the dependent variable when x \ takes on the value of 3. So we see that the number 11 is the output of the function when we give the number 3 as the input. We refer to the input as the argument of the function (or the independent variable), and to the output as the value of the function at the given argument (or the dependent variable). A good way to think of it is the dependent variable f(x) \ 'depends' on the value of the independent variable x \ . This is read as "the value of f at three is eleven", or simply "f of three equals eleven".

Notation

Functions are used so much that there is a special notation for them. The notation is somewhat ambiguous, so familiarity with it is important in order to understand the intention of an equation or formula.

Though there are no strict rules for naming a function, it is standard practice to use the letters f, g, and h to denote functions, and the variable x to denote an independent variable. y is used for both dependent and independent variables.

When discussing or working with a function f, it's important to know not only the function, but also its independent variable x. Thus, when referring to a function f, you usually do not write f, but instead f(x). The function is now referred to as "f of x". The name of the function is adjacent to the independent variable (in parentheses). This is useful for indicating the value of the function at a particular value of the independent variable. For instance, if

f(x)=7x+1\,,

and if we want to use the value of f for x equal to 2, then we would substitute 2 for x on both sides of the definition above and write

f(2)=7(2)+1=14+1=15\,

This notation is more informative than leaving off the independent variable and writing simply 'f', but can be ambiguous since the parentheses can be misinterpreted as multiplication.

Modern understanding of functions

The formal definition of a function states that a function is actually a rule that associates elements of one set called the domain of the function, with the elements of another set called the range of the function. For each value we select from the domain of the function, there exists exactly one corresponding element in the range of the function. The definition of the function tells us which element in the range corresponds to the element we picked from the domain. Classically, the element picked from the domain is pictured as something that is fed into the function and the corresponding element in the range is pictured as the output. Since we "pick" the element in the domain whose corresponding element in the range we want to find, we have control over what element we pick and hence this element is also known as the "independent variable". The element mapped in the range is beyond our control and is "mapped to" by the function. This element is hence also known as the "dependent variable", for it depends on which independent variable we pick. Since the elementary idea of functions is better understood from the classical viewpoint, we shall use it hereafter. However, it is still important to remember the correct definition of functions at all times.

To make it simple, for the function f(x), all of the possible x values constitute the domain, and all of the values f(x) (y on the x-y plane) constitute the range.

Remarks

The following arise as a direct consequence of the definition of functions:

  1. By definition, for each "input" a function returns only one "output", corresponding to that input. While the same output may correspond to more than one input, one input cannot correspond to more than one output. This is expressed graphically as the vertical line test: a line drawn parallel to the axis of the dependent variable (normally vertical) will intersect the graph of a function only once. However, a line drawn parallel to the axis of the independent variable (normally horizontal) may intersect the graph of a function as many times as it likes. Equivalently, this has an algebraic (or formula-based) interpretation. We can always say if  a = b, then f(a) = f(b), but if we only know that f(a) = f(b) then we can't be sure that a= b.
  2. Each function has a set of values, the function's domain, which it can accept as input. Perhaps this set is all positive real numbers; perhaps it is the set {pork, mutton, beef}. This set must be implicitly/explicitly defined in the definition of the function. You cannot feed the function an element that isn't in the domain, as the function is not defined for that input element.
  3. Each function has a set of values, the function's range, which it can output. This may be the set of real numbers. It may be the set of positive integers or even the set {0,1}. This set, too, must be implicitly/explicitly defined in the definition of the function.
This is an example of an expression which fails the vertical line test.

The vertical line test

The vertical line test, mentioned in the preceding paragraph, is a systematic test to find out if an equation involving x and y can serve as a function (with x the independent variable and y the dependent variable). Simply graph the equation and draw a vertical line through each point of the x-axis. If any vertical line ever touches the graph at more than one point, then the equation is not a function; if the line always touches at most one point of the graph, then the equation is a function.

(There are a lot of useful curves, like circles, that aren't functions (see picture). Some people call these graphs with multiple intercepts, like our circle, "multi-valued functions"; they would refer to our "functions" as "single-valued functions".)

Important functions

Constant function f(x)=c\,

It disregards the input and always outputs the constant c, and is a polynomial of the zeroth degree where f(x) = cx^0 = c(1) = c. Its graph is a horizontal line.

Linear function f(x)=mx+c\,

Takes an input, multiplies by m and adds c. It is a polynomial of the first degree. Its graph is a line (slanted, unless m=0).

Identity function f(x)=x\,

Takes an input and outputs it unchanged. A polynomial of the first degree, f(x) = x^1 = x. Special case of a linear function.

Quadratic function f(x)=ax^2+bx+c \,

A polynomial of the second degree. Its graph is a parabola, unless a=0. (Don't worry if you don't know what this is.)

Polynomial function f(x)=a_n x^n + a_{n-1}x^{n-1} + \cdots + a_2 x^2 + a_1 x + a_0

The number n is called the degree.

Signum function  \operatorname{sgn}(x) = \left\{ \begin{matrix}
-1 & \text{if} &  x < 0 \\
0 & \text{if} &  x = 0 \\
1 & \text{if} &  x > 0. \end{matrix} \right.

Determines the sign of the argument x.

Example functions

Some more simple examples of functions have been listed below.

h(x)=\left\{\begin{matrix}1,&\mbox{if }x>0\\-1,&\mbox{if }x<0\end{matrix}\right.
Gives 1 if input is positive, -1 if input is negative. Note that the function only accepts negative and positive numbers, not 0. Mathematics describes this condition by saying 0 is not in the domain of the function.
g(y)=y^2\,
Takes an input and squares it.

g(z)=z^2\,

Exactly the same function, rewritten with a different independent variable. This is perfectly legal and sometimes done to prevent confusion (e.g. when there are already too many uses of x or y in the same paragraph.)
f(x)=\left\{\begin{matrix}5^{x^2},&\mbox{if }x>0\\0,&\mbox{if }x\le0\end{matrix}\right.
Note that we can define a function by a totally arbitrary rule. Such functions are called piecewise functions.

It is possible to replace the independent variable with any mathematical expression, not just a number. For instance, if the independent variable is itself a function of another variable, then it could be replaced with that function. This is called composition, and is discussed later.
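Function definitions like these translate directly into code. Below is a minimal Python sketch of the piecewise examples above (the names h and f simply mirror the text); note how the domain restriction on h shows up as an error for the excluded input:

```python
def h(x):
    """1 for positive input, -1 for negative input; 0 is not in the domain."""
    if x > 0:
        return 1
    elif x < 0:
        return -1
    raise ValueError("0 is not in the domain of h")

def f(x):
    """Piecewise rule: 5**(x**2) when x > 0, and 0 when x <= 0."""
    return 5 ** (x ** 2) if x > 0 else 0
```

For example, h(3) gives 1, h(-2) gives -1, f(1) gives 5, and f(-4) gives 0, while h(0) raises an error because 0 is outside the domain of h.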

Manipulating functions

Addition, Subtraction, Multiplication and Division of functions

For two real-valued functions, we can add them, subtract them, multiply them, divide them, raise them to a power, and so on.

Example: Adding, subtracting, multiplying and dividing functions which do not have a name

If we add the functions y = 3 x + 2 and y = x^2, we obtain y = x^2 + 3 x + 2.


If we subtract y = 3 x + 2 from y = x^2, we obtain y =x^2 - (3 x + 2). We can also write this as y=x^2-3x-2.


If we multiply the function y = 3 x + 2 and the function y = x^2, we obtain y = (3 x + 2) x^2. We can also write this as y=3x^3 + 2 x^2.


If we divide the function y = 3 x + 2 by the function y = x^2, we obtain y = (3 x + 2)/ x^2.


If a math problem wants you to add two functions f and g, there are two ways that the problem will likely be worded:

  1. If you are told that f(x) = 3 x + 2, that g(x)  = x^2, that h(x) = f(x)+g(x) and asked about h, then you are being asked to add two functions. Your answer would be h(x) = x^2 + 3 x + 2.
  2. If you are told that f(x) = 3 x + 2, that g(x)  = x^2 and you are asked about f+g, then you are being asked to add two functions. The addition of f and g is called f+g. Your answer would be (f+g)(x) = x^2 + 3 x + 2.

Similar statements can be made for subtraction, multiplication and division.

Example: Adding, subtracting, multiplying and dividing functions which do have a name

Let f(x)=3x+2\, and g(x)=x^2\,. Let's add, subtract, multiply and divide.


\begin{align}
(f+g)(x)
    &= f(x)+g(x)\\
    &= (3x+2)+(x^2)\\
    &= x^2+3x+2\,
\end{align},


\begin{align}
(f-g)(x)
    &= f(x)-g(x)\\
    &= (3x+2)-(x^2)\\
    &= -x^2+3x+2\,
\end{align},


\begin{align}
 (f\times g)(x)
          &= f(x)\times g(x)\\
          &= (3x+2)\times(x^2)\\
          &= 3x^3+2x^2\,
\end{align},


\begin{align}
\left(\frac{f}{g}\right)(x)
            &= \frac{f(x)}{g(x)}\\
            &= \frac{3x+2}{x^2}\\
            &= \frac{3}{x}+\frac{2}{x^2}
\end{align}.
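These pointwise operations are easy to check numerically. The following Python sketch builds f+g, f-g, f\times g and f/g from the same f and g used above (the variable names are ours) and each combined function applies f and g to the same input:

```python
def f(x):
    return 3 * x + 2   # f(x) = 3x + 2

def g(x):
    return x ** 2      # g(x) = x^2

# Pointwise combinations: apply f and g to the same input, combine the outputs.
f_plus_g  = lambda x: f(x) + g(x)   # x^2 + 3x + 2
f_minus_g = lambda x: f(x) - g(x)   # -x^2 + 3x + 2
f_times_g = lambda x: f(x) * g(x)   # 3x^3 + 2x^2
f_over_g  = lambda x: f(x) / g(x)   # (3x + 2)/x^2, undefined at x = 0
```

At x = 2, for instance, f_plus_g(2) gives 12, f_minus_g(2) gives 4, f_times_g(2) gives 32 and f_over_g(2) gives 2.0, matching the formulas above.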

Composition of functions

We begin with a fun (and not too complicated) application of composition of functions before we talk about what composition of functions is.

Example: Dropping a ball

If we drop a ball from a bridge which is 20 meters above the ground, then the height of our ball above the earth is a function of time. The physicists tell us that if we measure time in seconds and distance in meters, then the formula for height in terms of time is h = -4.9t^2 + 20. Suppose we are tracking the ball with a camera and always want the ball to be in the center of our picture. Suppose the angle of our camera, \theta, is a function of the height of the ball, \theta=f(h). The angle will depend upon the height of the ball above the ground, and the height above the ground depends upon time. So the angle will depend upon time. This can be written as \theta = f(-4.9t^2 + 20). We replace h with what it is equal to. This is the essence of composition.

Composition of functions is another way to combine functions which is different from addition, subtraction, multiplication or division.


The value of a function f depends upon the value of another variable x; however, that variable could be equal to another function g, so its value depends on the value of a third variable. If this is the case, then the first variable is a function h of the third variable; this function (h) is called the composition of the other two functions (f and g).

Example: Composing two functions

Let f(x)=3x+2\, and g(x)=x^2\,. The composition of f with g is read as either "f composed with g" or "f of g of x."

Let

h(x) = f(g(x))

Then

\begin{align}
h(x) &= f(g(x))\\
     &= f(x^2)\\
     &= 3(x^2)+2\\
     &= 3x^2+2\,
\end{align}.

Sometimes a math problem asks you to compute (f \circ g)(x) when it wants you to compute f(g(x)).

Here, h is the composition of f and g and we write h=f\circ g. Note that composition is not commutative:

f(g(x))=3x^2+2\,, and
\begin{align}
g(f(x)) &= g(3x + 2)\\
        &= (3x + 2)^2\\
        &= 9x^2+12x+4\, .
\end{align}
so f(g(x))\ne g(f(x))\,.
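The non-commutativity is easy to see in code. This Python sketch defines a generic compose helper (a name of our choosing, not a standard library function) and compares f composed with g against g composed with f:

```python
def f(x):
    return 3 * x + 2

def g(x):
    return x ** 2

def compose(outer, inner):
    """Return the composition x -> outer(inner(x))."""
    return lambda x: outer(inner(x))

f_of_g = compose(f, g)  # f(g(x)) = 3x^2 + 2
g_of_f = compose(g, f)  # g(f(x)) = (3x + 2)^2 = 9x^2 + 12x + 4
```

At x = 2, f_of_g(2) gives 14 while g_of_f(2) gives 64, confirming that f(g(x)) and g(f(x)) are different functions in general.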

Composition of functions is very common, mainly because functions themselves are common. For instance, squaring and sine are both functions:


\operatorname{square}(x)=x^2,
\operatorname{sine}(x)=\sin x


Thus, the expression \sin^2x is a composition of functions:

\sin^2x = \operatorname{square}(\sin x)
= \operatorname{square}( \operatorname{sine}(x)).

(Note that this is not the same as \operatorname{sine}(\operatorname{square}(x))=\sin x^2.) Since \sin(\pi/6)=1/2,


\operatorname{square}(\operatorname{sine}(\pi/6))= \operatorname{square}(1/2).


Since \operatorname{square}(1/2)=1/4,

\sin^2 \pi/6=\operatorname{square}(\operatorname{sine}(\pi/6))=\operatorname{square}(1/2)
=1/4.
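A quick numerical check of this composition, using Python's standard math module:

```python
import math

def square(x):
    return x ** 2

# sin^2(pi/6) is square composed with sine...
result = square(math.sin(math.pi / 6))   # square(1/2) = 1/4
# ...which is not the same as sine composed with square:
other = math.sin(square(math.pi / 6))    # sin(pi^2/36), a different number
```

Here result is (up to floating-point rounding) 0.25, while other is a different value, illustrating that the order of composition matters.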

Transformations

Transformations are a type of function manipulation that are very common. They consist of multiplying, dividing, adding or subtracting constants to either the input or the output. Multiplying by a constant is called dilation and adding a constant is called translation. Here are a few examples:

f(2\times x) \, Dilation
f(x+2)\, Translation
2\times f(x) \, Dilation
2+f(x)\, Translation
Examples of horizontal and vertical translations
Examples of horizontal and vertical dilations

Translations and dilations can be either horizontal or vertical. Examples of both vertical and horizontal translations can be seen at right. The red graphs represent functions in their 'original' state, the solid blue graphs have been translated (shifted) horizontally, and the dashed graphs have been translated vertically.

Dilations are demonstrated in a similar fashion. The function

f(2\times x) \,

has had its input doubled. One way to think about this is that now any change in the input will be doubled. If I add one to x, I add two to the input of f, so it will now change twice as quickly. Thus, this is a horizontal dilation by \frac{1}{2} because the distance to the y-axis has been halved. A vertical dilation, such as

2\times f(x) \,

is slightly more straightforward. In this case, you double the output of the function. The output represents the distance from the x-axis, so in effect, you have made the graph of the function 'taller'. Here are a few basic examples where a is any positive constant:

Original graph: f(x)\,
Rotation about origin: -f(-x)\,
Horizontal translation by a units left: f(x+a)\,
Horizontal translation by a units right: f(x-a)\,
Horizontal dilation by a factor of a: f(x\times \frac{1}{a}) \,
Vertical dilation by a factor of a: a\times f(x) \,
Vertical translation by a units down: f(x)-a\,
Vertical translation by a units up: f(x)+a\,
Reflection about x-axis: -f(x)\,
Reflection about y-axis: f(-x)\,

Domain and Range

Domain

The domain of the function is the interval from -1 to 1

The domain of a function is the set of all points over which it is defined. More simply, it represents the set of x-values which the function can accept as input. For instance, if

f(x)=\sqrt{1-x^2}

then f(x) is only defined for values of x between -1 and 1, because the square root function is not defined (in real numbers) for negative values. Thus, the domain, in interval notation, is \left[-1,1\right]. In other words,

f(x) \mbox{ is defined for } x\in [-1,1], \mbox{ or } \{x:-1\le x\le 1\}.


The range of the function is the interval from 0 to 1

Range

The range of a function is the set of all values which it attains (i.e. the y-values). For instance, if:

f(x)=\sqrt{1-x^2},

then f(x) can only equal values in the interval from 0 to 1. Thus, the range of f is \left[0,1\right].
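In code, the domain restriction appears naturally: Python's math.sqrt raises a ValueError for negative arguments, so this sketch of f(x)=\sqrt{1-x^2} is only usable on [-1,1]:

```python
import math

def f(x):
    # The real square root requires 1 - x^2 >= 0, i.e. x in [-1, 1].
    return math.sqrt(1 - x ** 2)
```

For example, f(0) returns 1.0 and f(1) returns 0.0, while f(2) raises a ValueError because 2 lies outside the domain.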

One-to-one Functions

A function f(x) is one-to-one (or less commonly injective) if, for every value of f, there is only one value of x that corresponds to that value of f. For instance, the function f(x)=\sqrt{1-x^2} is not one-to-one, because both x=1 and x=-1 result in f(x)=0. However, the function f(x)=x+2 is one-to-one, because, for every possible value of f(x), there is exactly one corresponding value of x. Other examples of one-to-one functions are f(x)=x^3+ax, where a\in \left[0,\infty\right). Note that if you have a one-to-one function and translate or dilate it, it remains one-to-one. (Of course you can't multiply x or f by a zero factor).

Horizontal Line Test

If you know what the graph of a function looks like, it is easy to determine whether or not the function is one-to-one. If every horizontal line intersects the graph in at most one point, then the function is one-to-one. This is known as the Horizontal Line Test.

Algebraic 1-1 Test

You can also show one-to-oneness algebraically by assuming that two inputs give the same output and then showing that the two inputs must have been equal. For example, is f(x)=\frac{1-2x}{1+x}\, a 1-1 function?

f(a)=f(b)\,

\frac{1-2a}{1+a}=\frac{1-2b}{1+b} \,

(1+b)(1-2a)=(1+a)(1-2b) \,

1-2a+b-2ab=1-2b+a-2ab \,

1-2a+b=1-2b+a \,

1-2a+3b=1+a \,

1+3b=1+3a \,

a=b \,

Therefore by the algebraic 1-1 test, the function f(x)\, is 1-1.

You can show that a function is not one-to-one by finding two distinct inputs that give the same output. For example, f(x)=x^2 is not one-to-one because f(-1)=f(1) but -1\neq1.
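For functions we can only sample, a brute-force version of this idea checks whether any two distinct sample inputs produce the same output. This is only evidence over the sampled points, not a proof, but it is a handy sanity check (the helper name is ours):

```python
def is_one_to_one(f, inputs):
    """Return False if two distinct values in `inputs` share an output of f."""
    seen = {}
    for x in inputs:
        y = f(x)
        if y in seen and seen[y] != x:
            return False  # two different inputs mapped to the same output
        seen[y] = x
    return True
```

Over the integers from -10 to 10, f(x)=x+2 passes the check while f(x)=x^2 fails it (since f(-1)=f(1)).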

Inverse functions

We call g(x) the inverse function of f(x) if, for all x:

g(f(x)) = f(g(x)) = x\ .

A function f(x) has an inverse function if and only if f(x) is one-to-one. For example, the inverse of f(x)=x+2 is g(x)=x-2. The function f(x)=\sqrt{1-x^2} has no inverse.

Notation

The inverse function of f is denoted as f^{-1}(x). Thus, f^{-1}(x) is defined as the function that follows this rule

f(f^{-1}(x))=f^{-1}(f(x)) = x.

To determine f^{-1}(x) when given a function f, substitute f^{-1}(x) for x and substitute x for f(x). Then solve for f^{-1}(x), provided that it is also a function.

Example: Given f(x) = 2x - 7, find f^{-1}(x).

Substitute f^{-1}(x) for x and substitute x for f(x). Then solve for f^{-1}(x):

f(x) = 2x - 7\,
  x  = 2[f^{-1}(x)] - 7\,
x + 7  = 2[f^{-1}(x)]\,
\frac{x + 7}{2} = f^{-1}(x)\,

To check your work, confirm that f^{-1}(f(x)) = x:

f^{-1}(f(x)) =

f^{-1}(2x - 7) = {}

\frac{(2x - 7) + 7}{2} = \frac{2x}{2} = x
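The same check is easy to script. This Python sketch encodes f and the inverse we just derived, so that both round trips return the original input:

```python
def f(x):
    return 2 * x - 7

def f_inv(x):
    # Derived above by solving x = 2*f_inv(x) - 7 for f_inv(x).
    return (x + 7) / 2
```

For any sample value, f_inv(f(x)) and f(f_inv(x)) both give back x, confirming that the two functions are inverses.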

If f isn't one-to-one, then, as we said before, it doesn't have an inverse. Then this method will fail.

Example: Given f(x)=x^2, find f^{-1}(x).

Substitute f^{-1}(x) for x and substitute x for f(x). Then solve for f^{-1}(x):

f(x) = x^2\,
x = (f^{-1}(x))^2\,
f^{-1}(x) = \pm\sqrt{x}\,

Since there are two possibilities for f^{-1}(x), it's not a function. Thus f(x)=x^2 doesn't have an inverse. Of course, we could also have found this out from the graph by applying the Horizontal Line Test. It's useful, though, to have lots of ways to solve a problem, since in a specific case some of them might be very difficult while others might be easy. For example, we might only know an algebraic expression for f(x) but not a graph.


<h1> 1.3 Graphing linear functions</h1>

Graph of y=2x

It is sometimes difficult to understand the behavior of a function given only its definition; a visual representation or graph can be very helpful. A graph is a set of points in the Cartesian plane, where each point (x,y) indicates that f(x)=y. In other words, a graph uses the position of a point in one direction (the vertical or y-axis) to indicate the value of f for a position of the point in the other direction (the horizontal or x-axis).

Functions may be graphed by finding the value of f for various x and plotting the points (x, f(x)) in a Cartesian plane. For the functions that you will deal with, the parts of the function between the points can generally be approximated by drawing a line or curve between the points. Extending the function beyond the set of points is also possible, but becomes increasingly inaccurate.

Example

Plotting points like this is laborious. Fortunately, many functions' graphs fall into general patterns. For a simple case, consider functions of the form

f(x)=3x + 2\,\!

The graph of f is a single line, passing through the point (0,2) with slope 3. Thus, after plotting the point, a straightedge may be used to draw the graph. This type of function is called linear and there are a few different ways to present a function of this type.

Slope-intercept form

When we see a function presented as

 y = mx + b \,\!

we call this presentation the slope-intercept form. This is because, not surprisingly, this way of writing a linear function involves the slope, m, and the y-intercept, b.

Point-slope form

If someone walks up to you and gives you one point and a slope, you can draw one line and only one line that goes through that point and has that slope. Said differently, a point and a slope uniquely determine a line. So, if given a point (x_0,y_0) and a slope m, we present the graph as

 y - y_0 = m(x - x_0). \,\!

We call this presentation the point-slope form. The point-slope and slope-intercept forms are essentially the same. In the point-slope form we can use any point the graph passes through, whereas in the slope-intercept form we use the y-intercept, that is, the point (0,b).

Calculating slope

If given two points,  (x_1,y_1) and  (x_2,y_2) , we may then compute the slope of the line that passes through these two points. Remember, the slope is determined as "rise over run." That is, the slope is the change in y-values divided by the change in x-values. In symbols,

 \mbox{slope}~ = \frac{\mbox{change in}~y}{\mbox{change in}~x} = \frac{\Delta y}{\Delta x}.

So now the question is, "what's \Delta y and \Delta x?" We have that \Delta y = y_2-y_1 and \Delta x = x_2 - x_1. Thus,

 \mbox{slope}~ = \frac{y_2-y_1}{x_2-x_1}.
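The slope formula translates directly into a small Python helper (the function name is ours); it also makes explicit that a vertical line, where \Delta x = 0, has undefined slope:

```python
def slope(p1, p2):
    """Slope of the line through points p1 = (x1, y1) and p2 = (x2, y2)."""
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        raise ValueError("vertical line: slope is undefined")
    return (y2 - y1) / (x2 - x1)
```

For instance, slope((1, -1), (2, 2)) returns 3.0 and slope((0, 0), (2, 3)) returns 1.5.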

Two-point form

Two points also uniquely determine a line. Given points (x_1,y_1) and (x_2,y_2), we have the equation

 y - y_1 = \frac{y_2-y_1}{x_2-x_1}(x-x_1).

This presentation is in the two-point form. It is essentially the same as the point-slope form except we substitute the expression \frac{y_2-y_1}{x_2-x_1} for m.


<h1> 1.4 Precalculus Cumulative Exercises</h1>


Algebra

Convert to interval notation

1.  \{x:-4<x<2\} \,

(-4,2)

2.  \{x:-\frac{7}{3} \leq x \leq -\frac{1}{3}\}

[-\frac{7}{3},-\frac{1}{3}]

3.  \{x:-\pi \leq x < \pi\}

[-\pi,\pi)

4.  \{x:x \leq \frac{17}{9}\}

(-\infty, \frac{17}{9}]

5.  \{x:5 \leq x+1 \leq 6\}

[4, 5]

6.  \{x:x - \frac{1}{4} < 1\} \,

(-\infty, \frac{5}{4})

7.  \{x:3 > 3x\} \,

(-\infty, 1)

8.  \{x:0 \leq 2x+1 < 3\}

[-\frac{1}{2}, 1)

9.  \{x:5<x \mbox{ and } x<6\} \,

(5,6)

10.  \{x:5<x \mbox{ or } x<6\} \,

(-\infty,\infty)

State the following intervals using set notation

11.  [3,4] \,

\{x:3\leq x\leq 4\}

12.  [3,4) \,

\{x:3\leq x<4\}

13.  (3,\infty)

\{x:x>3\}

14.  (-\frac{1}{3}, \frac{1}{3}) \,

\{x:-\frac{1}{3}<x<\frac{1}{3}\}

15.  (-\pi, \frac{15}{16}) \,

\{x:-\pi<x<\frac{15}{16}\}

16.  (-\infty,\infty)

\{x:x\in\Re\}

Which one of the following is a true statement?

Hint: the true statement is often referred to as the triangle inequality. Give examples where the other two are false.

17.  |x+y| = |x| + |y| \,

false

18.  |x+y| \geq |x| + |y|

false

19.  |x+y| \leq |x| + |y|

true

Evaluate the following expressions

20.  8^{1/3} \,

2

21.  (-8)^{1/3} \,

-2

22.  \bigg(\frac{1}{8}\bigg)^{1/3} \,

\frac{1}{2}

23.  (8^{2/3}) (8^{3/2}) (8^0) \,

8^{13/6}

24.  \bigg( \bigg(\frac{1}{8}\bigg)^{1/3} \bigg)^7

\frac{1}{128}

25.  \sqrt[3]{\frac{27}{8}}

\frac{3}{2}

26.  \frac{4^5 \cdot 4^{-2}}{4^3}

1

27.  \bigg(\sqrt{27}\bigg)^{2/3}

3

28.  \frac{\sqrt{27}}{\sqrt[3]{9}}

3^{5/6}

Simplify the following

29.  x^3 + 3x^3 \,

4x^3

30.  \frac{x^3 + 3x^3}{x^2}

4x

31.  (x^3+3x^3)^3 \,

64x^9

32.  \frac{x^{15} + x^3}{x}

x^{14}+x^2

33.  (2x^2)(3x^{-2}) \,

6

34.  \frac{x^2y^{-3}}{x^3y^2}

\frac{1}{xy^5}

35.  \sqrt{x^2y^4}

xy^2

36.  \bigg(\frac{8x^6}{y^4}\bigg)^{1/3}

\frac{2x^2}{y^{4/3}}

Find the roots of the following polynomials

37.  x^2 - 1 \,

x=\pm1

38.  x^2 +2x +1 \,

x=-1

39.  x^2 + 7x + 12 \,

x=-3, x=-4

40.  3x^2 - 5x -2 \,

x=2, x=-\frac{1}{3}

41.  x^2 + 5/6x + 1/6 \,

x=-\frac{1}{3}, x=-\frac{1}{2}

42.  4x^3 + 4x^2 + x \,

x=0,x=-\frac{1}{2}

43.  x^4 - 1 \,

x=\pm i, x=\pm 1

44.  x^3 + 2x^2 - 4x - 8 \,

x=\pm2

Factor the following expressions

45.  4a^2 - ab - 3b^2 \,

(4a+3b)(a-b)

46.  (c+d)^2 - 4 \,

(c+d+2)(c+d-2)

47.  4x^2 - 9y^2 \,

(2x+3y)(2x-3y)

Simplify the following

48.  \frac{x^2 -1}{x+1} \,

x-1, x\neq-1

49.  \frac{3x^2 + 4x + 1}{x+1} \,

3x+1, x\neq-1

50.  \frac{4x^2 - 9}{4x^2 + 12x + 9} \,

\frac{2x-3}{2x+3}

51.  \frac{x^2 + y^2 +2xy}{x(x+y)} \,

\frac{x+y}{x}, x\neq-y

Functions

52. Let f(x)=x^2.

a. Compute f(0) and f(2).

{0,4}

b. What are the domain and range of f?

Domain: (-\infty,\infty); range: [0,\infty)

c. Does f have an inverse? If so, find a formula for it.

No. f is not one-to-one (for example, f(-1)=f(1)=1), so it has no inverse.

53. Let f(x)=x+2, g(x)=1/x.

a. Give formulae for
i. f+g

(f + g)(x) = x + 2 + \frac{1}{x}

ii. f-g

(f - g)(x) = x + 2 - \frac{1}{x}

iii. g-f

(g - f)(x) = \frac{1}{x} - x - 2

iv. f\times g

(f \times g)(x) = 1 + \frac{2}{x}

v. f/g

(f / g)(x) = x^2 + 2x

vi. g/f

(g / f)(x) = \frac{1}{x^2 + 2x}

vii. f\circ g

(f \circ g)(x) = \frac{1}{x} + 2

viii. g\circ f

(g \circ f)(x) = \frac{1}{x + 2}

b. Compute f(g(2)) and g(f(2)).

f(g(2))=5/2, g(f(2))=1/4

c. Do f and g have inverses? If so, find formulae for them.

f^{-1}(x)=x-2, g^{-1}(x)=\frac{1}{x}

54. Does this graph represent a function? Sinx over x.svg

Yes.

55. Consider the following function

f(x) = \begin{cases} -\frac{1}{9} & \mbox{if } x<-1 \\ 2 & \mbox{if } -1\leq x \leq 0 \\ x + 3 & \mbox{if } x>0. \end{cases}
a. What is the domain?
b. What is the range?
c. Where is f continuous?

56. Consider the following function

f(x) = \begin{cases} x^2 & \mbox{if } x>0 \\ -1 & \mbox{if } x\leq 0. \end{cases}
a. What is the domain?
b. What is the range?
c. Where is f continuous?

57. Consider the following function

f(x) = \frac{\sqrt{2x-3}}{x-10}
a. What is the domain?

\left[\frac{3}{2},10\right)\cup\left(10,\infty\right)

b. What is the range?


c. Where is f continuous?

f is continuous on its domain, \left[\frac{3}{2},10\right)\cup\left(10,\infty\right).

58. Consider the following function

f(x) = \frac{x-7}{x^2-49}
a. What is the domain?

\{x:x\neq\pm 7\}, that is, (-\infty,-7)\cup(-7,7)\cup(7,\infty)

b. What is the range?

Since f(x)=\frac{1}{x+7} for x\neq 7, the range is \{y:y\neq 0 \mbox{ and } y\neq\frac{1}{14}\}.

c. Where is f continuous?

f is continuous on its domain, that is, everywhere except x=\pm 7.

Graphing

59. Find the equation of the line that passes through the point (1,-1) and has slope 3.

3x-y=4

60. Find the equation of the line that passes through the origin and the point (2,3).

3x-2y=0


Limits

<h1> 2.1 An Introduction to Limits</h1>


Intuitive Look

A limit looks at what happens to a function when the input approaches a certain value. The general notation for a limit is as follows:

\quad\lim_{x\to a} f(x)

This is read as "The limit of f of x as x approaches a". We'll take up later the question of how we can determine whether a limit exists for f(x) at a and, if so, what it is. For now, we'll look at it from an intuitive standpoint.

Let's say that the function that we're interested in is f(x)=x^2, and that we're interested in its limit as x approaches 2. Using the above notation, we can write the limit that we're interested in as follows:

\quad\lim_{x\to 2} x^2

One way to try to evaluate what this limit is would be to choose values near 2, compute f(x) for each, and see what happens as they get closer to 2. This is implemented as follows:

x 1.7 1.8 1.9 1.95 1.99 1.999
f(x)=x^2 2.89 3.24 3.61 3.8025 3.9601 3.996001

Here we chose numbers smaller than 2, and approached 2 from below. We can also choose numbers larger than 2, and approach 2 from above:

x 2.3 2.2 2.1 2.05 2.01 2.001
f(x)=x^2 5.29 4.84 4.41 4.2025 4.0401 4.004001

We can see from the tables that as x grows closer and closer to 2, f(x) seems to get closer and closer to 4, regardless of whether x approaches 2 from above or from below. For this reason, we feel reasonably confident that the limit of x^2 as x approaches 2 is 4, or, written in limit notation,

\quad\lim_{x\to 2} x^2=4.

We could have also just substituted 2 into x^2 and evaluated: (2)^2=4. However, this will not work with all limits.
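Tables like the ones above can be generated programmatically. The following Python sketch (the helper name approach is ours) samples a function at points closing in on a from below and from above:

```python
def approach(f, a, steps=6):
    """Sample f at points approaching a from below and from above."""
    below = [f(a - 10 ** -k) for k in range(1, steps + 1)]
    above = [f(a + 10 ** -k) for k in range(1, steps + 1)]
    return below, above

# Reproduce the tables for f(x) = x^2 near x = 2:
below, above = approach(lambda x: x ** 2, 2)
# Both sequences get closer and closer to 4.
```

Printing below and above reproduces the pattern in the tables: both sequences of function values settle toward 4.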

Now let's look at another example. Suppose we're interested in the behavior of the function f(x)=\frac{1}{x-2} as x approaches 2. Here's the limit in limit notation:

\quad\lim_{x\to 2} \frac{1}{x-2}

Just as before, we can compute function values as x approaches 2 from below and from above. Here's a table, approaching from below:

x 1.7 1.8 1.9 1.95 1.99 1.999
f(x)=\frac{1}{x-2} -3.333 -5 -10 -20 -100 -1000

And here from above:

x 2.3 2.2 2.1 2.05 2.01 2.001
f(x)=\frac{1}{x-2} 3.333 5 10 20 100 1000

In this case, the function doesn't seem to be approaching a single value as x approaches 2, but instead becomes an extremely large positive or negative number (depending on the direction of approach). This is known as an infinite limit. Note that we cannot just substitute 2 into \frac{1}{x-2} and evaluate as we could with the first example, since we would be dividing by 0.

Both of these examples may seem trivial, but consider the following function:

f(x) = \frac{x^2(x-2)}{x-2}

This function is the same as


f(x) =\left\{\begin{matrix} x^2 & \mbox{if }  x\neq 2 \\
\mbox{undefined} & \mbox{if } x=2\end{matrix}\right.

Note that these functions are really completely identical; not just "almost the same," but actually, in terms of the definition of a function, completely the same; they give exactly the same output for every input.

In algebra, we would simply say that we can cancel the term (x-2), and then we have the function f(x)=x^2. This, however, would be a bit dishonest; the function that we have now is not really the same as the one we started with, because it is defined when x=2, and our original function was specifically not defined when x=2. In algebra we were willing to ignore this difficulty because we had no better way of dealing with this type of function. Now, however, in calculus, we can introduce a better, more correct way of looking at this type of function. What we want is to be able to say that, although the function doesn't exist when x=2, it works almost as though it does. It may not get there, but it gets really, really close. That is, f(1.99999)=3.99996. The only question that we have is: what do we mean by "close"?
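A direct computation shows this behavior. In Python, evaluating the original formula at x=2 fails with a division-by-zero error, while values near 2 are essentially x^2:

```python
def f(x):
    # Identical to x**2 everywhere except x = 2, where the formula is 0/0.
    return x ** 2 * (x - 2) / (x - 2)
```

For example, f(1.99999) gives approximately 3.99996, but f(2) raises a ZeroDivisionError, mirroring the fact that 2 is not in the domain.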

Informal Definition of a Limit

As the precise definition of a limit is a bit technical, it is easier to start with an informal definition; we'll explain the formal definition later.

We suppose that a function f is defined for x near c (but we do not require that it be defined when x=c).

Definition: (Informal definition of a limit)

We call L the limit of f(x) as x approaches c if f(x) becomes close to L when x is close (but not equal) to c, and if there is no other value L' with the same property.

When this holds we write

 \lim_{x \to c} f(x) = L

or

 f(x) \to L \quad \mbox{as} \quad x \to c.

Notice that the definition of a limit is not concerned with the value of f(x) when x=c (which may exist or may not). All we care about are the values of f(x) when x is close to c, on either the left or the right (i.e. less or greater).

Limit Rules

Now that we have defined, informally, what a limit is, we will list some rules that are useful for working with and computing limits. You will be able to prove all these once we formally define the fundamental concept of the limit of a function.

First, the constant rule states that if f(x)=b (that is, f is constant for all x) then the limit as x approaches c must be equal to b. In other words

Constant Rule for Limits

If b and c are constants then  \lim_{x\to c} b = b.
Example: \lim_{x\to 6} 5=5

Second, the identity rule states that if f(x)=x (that is, f just gives back whatever number you put in) then the limit of f as x approaches c is equal to c. That is,

Identity Rule for Limits

If c is a constant then  \lim_{x\to c} x = c.
Example: \lim_{x\to 6} x=6

The next few rules tell us how, given the values of some limits, to compute others.

Operational Identities for Limits
Suppose that \lim_{x\to c} f(x) =L and \lim_{x\to c} g(x) =M and that k is constant. Then

  •  \lim_{x\to c} k f(x) = k \cdot \lim_{x\to c} f(x) =  k L
  •  \lim_{x\to c} [f(x) + g(x)] = \lim_{x\to c} f(x) +  \lim_{x\to c} g(x) =  L + M
  •  \lim_{x\to c} [f(x) - g(x)] = \lim_{x\to c} f(x) -  \lim_{x\to c} g(x) =  L - M
  •  \lim_{x\to c} [f(x) g(x)] = \lim_{x\to c} f(x) \lim_{x\to c} g(x) = L M
  •  \lim_{x\to c} \frac{f(x)}{g(x)} = \frac{\lim_{x\to c} f(x)}{\lim_{x\to c} g(x)} = \frac{L}{M} \,\,\, \mbox{ provided } M\neq 0

Notice that in the last rule we need to require that M is not equal to zero (otherwise we would be dividing by zero which is an undefined operation).

These rules are known as identities; they are the scalar product, sum, difference, product, and quotient rules for limits. (A scalar is a constant, and, when you multiply a function by a constant, we say that you are performing scalar multiplication.)

Using these rules we can deduce another. Namely, using the rule for products many times we get that

 \lim_{x\to c} f(x)^n = \left(\lim_{x\to c} f(x) \right)^n = L^n for a positive integer n.

This is called the power rule.

Examples

Example 1

Find the limit \lim_{x\to 2} {4x^3}.

We need to simplify the problem, since we have no rules about this expression by itself. We know from the identity rule above that \lim_{x\to 2} {x} = 2. By the power rule, \lim_{x\to 2} {x^3} = \left(\lim_{x\to 2} x\right)^3 = 2^3 = 8. Lastly, by the scalar multiplication rule, we get \lim_{x\to 2} {4x^3} = 4\lim_{x\to 2} x^3=4 \cdot 8=32.

Example 2

Find the limit \lim_{x\to 2} [4x^3 + 5x +7].

To do this informally, we split up the expression, once again, into its components. As above,\lim_{x\to 2} 4x^3=32.

Also \lim_{x\to 2} 5x = 5\cdot\lim_{x\to 2} x = 5\cdot2=10 and \lim_{x\to 2} 7 =7. Adding these together gives

\lim_{x\to 2} 4x^3 + 5x +7 = \lim_{x\to 2} 4x^3 + \lim_{x\to 2} 5x + \lim_{x\to 2} 7 = 32 + 10 +7 =49.
Example 3

Find the limit \lim_{x\to 2}\frac{4x^3 + 5x +7}{(x-4)(x+10)}.

From the previous example the limit of the numerator is \lim_{x\to 2} 4x^3 + 5x +7 =49. The limit of the denominator is

\lim_{x\to 2} (x-4)(x+10) = \lim_{x\to 2} (x-4) \cdot \lim_{x\to 2} (x+10) = (2-4)\cdot(2+10)=-24.

As the limit of the denominator is not equal to zero we can divide. This gives

\lim_{x\to 2}\frac{4x^3 + 5x +7}{(x-4)(x+10)} = -\frac{49}{24}.
Example 4

Find the limit \lim_{x\to 4}\frac{x^4 - 16x + 7}{4x-5}.

We apply the same process here as we did in the previous set of examples;

\lim_{x\to 4}\frac{x^4 - 16x + 7}{4x-5} = \frac{\lim_{x\to 4} (x^4 - 16x + 7)} {\lim_{x\to 4} (4x-5)} = \frac{\lim_{x\to 4} (x^4) - \lim_{x\to 4} (16x) + \lim_{x\to 4} (7)} {\lim_{x\to 4} (4x) - \lim_{x\to 4} 5}.

We can evaluate each of these; 
\lim_{x\to 4} (x^4) = 256,

\lim_{x\to 4} (16x) = 64,

\lim_{x\to 4} (7) = 7,

 \lim_{x\to 4} (4x) = 16
and  \lim_{x\to 4} (5) = 5. Thus, the answer is \frac{199}{11}.

Example 5

Find the limit \lim_{x\to 2}\frac{x^2 - 3x + 2}{x-2}.

In this example, evaluating the result directly will result in a division by zero. While you can determine the answer experimentally, a mathematical solution is possible as well.

First, the numerator is a polynomial that may be factored: \lim_{x\to 2}\frac{(x-2)(x-1)}{x-2}

Now, you can divide both the numerator and denominator by (x-2): \lim_{x\to 2} (x-1) = (2-1) = 1


Example 6

Find the limit \lim_{x\to 0}\frac{1-\cos x}{x}.

To evaluate this seemingly complex limit, we will need to recall some sine and cosine identities. We will also have to use two new facts. First, if f(x) is a trigonometric function (that is, one of sine, cosine, tangent, cotangent, secant or cosecant) and is defined at a, then  \lim_{x\to a} f(x) = f(a) .

Second, \lim_{x\to 0}\frac{\sin x}{x} = 1. This may be determined experimentally, or by applying L'Hôpital's rule, described later in the book.

To evaluate the limit, recognize that  1 - \cos x can be multiplied by  1+\cos x to obtain  (1-\cos^2 x) which, by our trig identities, is  \sin^2 x. So, multiply the top and bottom by  1+\cos x. (This is allowed because it is identical to multiplying by one.) This is a standard trick for evaluating limits of fractions; multiply the numerator and the denominator by a carefully chosen expression which will make the expression simplify somehow. In this case, we should end up with:

\begin{align}\lim_{x\to 0} \frac{1-\cos x}{x} &= \lim_{x\to 0} \left(\frac{1-\cos x}{x} \cdot \frac{1}{1}\right) \\
&= \lim_{x\to 0} \left(\frac{1-\cos x}{x} \cdot \frac{1 + \cos x} {1+ \cos x}\right) \\
&= \lim_{x\to 0}\frac{(1 - \cos x) \cdot 1 + (1 - \cos x) \cdot \cos x} {x \cdot (1+ \cos x)} \\
&= \lim_{x\to 0}\frac{1 - \cos x + \cos x - \cos^2 x}{x \cdot (1+ \cos x)} \\
&= \lim_{x\to 0}\frac{1 - \cos^2 x} {x \cdot (1+ \cos x)} \\
&= \lim_{x\to 0}\frac{\sin^2 x} {x \cdot (1+ \cos x)} \\
&= \lim_{x\to 0} \left(\frac{\sin x} {x} \cdot \frac{\sin x} {1+ \cos x}\right)\end{align}.

Our next step should be to break this up into \lim_{x\to 0}\frac{\sin x}{x} \cdot \lim_{x\to 0} \frac{\sin x}{1+\cos x} by the product rule. As mentioned above, \lim_{x\to 0} \frac{\sin x} {x} = 1.

Next,  \lim_{x\to 0} \frac{\sin x} {1+\cos x} = \frac{\lim_{x\to 0}\sin x} {\lim_{x\to 0} (1+\cos x)} = \frac{0} {1 + \cos 0} = 0.

Thus, by multiplying these two results, we obtain 0.
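As a sanity check, we can tabulate (1-\cos x)/x near 0. This is an illustrative Python sketch (the name g is our own choice):

```python
import math

# Tabulating (1 - cos x)/x near 0 (the name g is our own label):
def g(x):
    return (1 - math.cos(x)) / x

for x in [0.1, 0.01, 0.001, -0.001]:
    print(x, g(x))   # values shrink toward 0
```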


We will now present an amazingly useful result, even though we cannot prove it yet. We can find the limit at c of any polynomial or rational function, as long as that rational function is defined at c (so we are not dividing by zero). That is, c must be in the domain of the function.

Limits of Polynomials and Rational functions

If f is a polynomial or rational function that is defined at c then

\lim_{x \rightarrow c} f(x) = f(c)


We already learned this for trigonometric functions, so we see that it is easy to find limits of polynomial, rational or trigonometric functions wherever they are defined. In fact, this is true even for combinations of these functions; thus, for example,  \lim_{x\to 1} (\sin x^2 + 4\cos^3(3x-1)) = \sin 1^2 + 4\cos^3 (3(1)-1) .

The Squeeze Theorem

Graph showing f being squeezed between g and h

The Squeeze Theorem is very important in calculus, where it is typically used to find the limit of a function by comparison with two other functions whose limits are known.

It is called the Squeeze Theorem because it refers to a function f whose values are squeezed between the values of two other functions g and h, both of which have the same limit L. If the value of f is trapped between the values of the two functions g and h, the values of f must also approach L.

Expressed more precisely:

Theorem: (Squeeze Theorem)
Suppose that g(x) \le f(x) \le h(x) holds for all x in some open interval containing c, except possibly at x=c itself. Suppose also that \lim_{x\to c}g(x)=\lim_{x\to c}h(x)=L. Then \lim_{x\to c}f(x)=L also.
Plot of x*sin(1/x) for -0.5 < x < 0.5

Example: Compute \lim_{x\to 0} x\sin(1/x). Note that the sine of any real number is in the interval [-1,1]. That is, -1 \le \sin x \le 1 for all x, and -1 \le \sin(1/x) \le 1 for all x. If x is positive, we can multiply these inequalities by x and get -x \le x\sin(1/x) \le x. If x is negative, we can similarly multiply the inequalities by the positive number -x and get x \le x\sin(1/x) \le -x. Putting these together, we can see that, for all nonzero x, -\left|x\right| \le x\sin(1/x) \le \left|x\right|. But it's easy to see that \lim_{x\to 0} -\left|x\right| = \lim_{x\to 0} \left|x\right| = 0. So, by the Squeeze Theorem, \lim_{x\to 0} x\sin(1/x) = 0.
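A short Python sketch (names are our own) illustrates the squeeze numerically: the computed values of x\sin(1/x) never escape the bound \left|x\right|, and so they shrink toward 0 along with it.

```python
import math

# The squeeze in action: |x*sin(1/x)| <= |x| for every nonzero x,
# so the values are forced toward 0 along with |x| (a numeric sketch).
def f(x):
    return x * math.sin(1 / x)

for x in [0.1, 0.01, 0.001, -0.001]:
    assert abs(f(x)) <= abs(x)   # the bound from the theorem
    print(x, f(x))
```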

Finding Limits

Now, we will discuss how, in practice, to find limits. First, if the function can be built out of rational, trigonometric, logarithmic and exponential functions, then if a number c is in the domain of the function, then the limit at c is simply the value of the function at c.

If c is not in the domain of the function, then in many cases (as with rational functions) the domain of the function includes all the points near c, but not c itself. An example would be if we wanted to find \lim_{x\to 0} \frac{x}{x}, where the domain includes all numbers besides 0.

In that case, in order to find \lim_{x\to c}f(x) we want to find a function g(x) similar to f(x), except with the hole at c filled in. The limits of f and g will be the same, as can be seen from the definition of a limit. By definition, the limit depends on f(x) only at the points where x is close to c but not equal to it, so the limit at c does not depend on the value of the function at c. Therefore, if \lim_{x\to c} g(x)=L, \lim_{x\to c} f(x) = L also. And since the domain of our new function g includes c, we can now (assuming g is still built out of rational, trigonometric, logarithmic and exponential functions) just evaluate it at c as before. Thus we have \lim_{x\to c} f(x) = g(c).

In our example, this is easy; canceling the x's gives g(x)=1, which equals f(x)=x/x at all points except 0. Thus, we have \lim_{x\to 0}\frac{x}{x} = \lim_{x\to 0} 1 = 1. In general, when computing limits of rational functions, it's a good idea to look for common factors in the numerator and denominator.

Lastly, note that the limit might not exist at all. There are a number of ways in which this can occur:


f(x) = \sqrt{x^2 - 16}
"Gap"
There is a gap (not just a single point) where the function is not defined. As an example, in
f(x) = \sqrt{x^2 - 16}
\lim_{x\to c}f(x) does not exist when -4\le c\le4. There is no way to "approach" the middle of the graph. Note that the function also has no limit at the endpoints of the two curves generated (at c=-4 and c=4). For the limit to exist, the point must be approachable from both the left and the right.
Note also that there is no limit at a totally isolated point on a graph.
"Jump"
If the graph suddenly jumps to a different level, there is no limit at the point of the jump. For example, let f(x) be the greatest integer \le x. Then, if c is an integer, when x approaches c from the right f(x)=c, while when x approaches c from the left f(x)=c-1. Thus \lim_{x\to c} f(x) will not exist.
A graph of 1/x^2 on the interval [-2,2].
Vertical asymptote
In
f(x) = {1 \over x^2}
the graph gets arbitrarily high as it approaches 0, so there is no limit. (In this case we sometimes say the limit is infinite; see the next section.)
A graph of sin(1/x) on the interval (0,1/π].
Infinite oscillation
These next two can be tricky to visualize. In this one, we mean that a graph continually rises above and falls below a horizontal line. In fact, it does this infinitely often as you approach a certain x-value. This often means that there is no limit, as the graph never approaches a particular value. However, if the height (and depth) of each oscillation diminishes as the graph approaches the x-value, so that the oscillations get arbitrarily smaller, then there might actually be a limit.
The use of oscillation naturally calls to mind the trigonometric functions. An example of a trigonometric function that does not have a limit as x approaches 0 is
f(x) = \sin {1 \over x}.
As x gets closer to 0 the function keeps oscillating between -1 and 1. In fact, \sin(1/x) oscillates an infinite number of times on the interval between 0 and any positive value of x. The sine function is equal to zero whenever its argument is a multiple of \pi, so \sin(1/x)=0 for every x=1/(k\pi), where k is a positive integer. Between each consecutive pair of these values, 1/(k\pi) and 1/[(k+1)\pi], \sin(1/x) goes from 0, to either -1 or 1, and back to 0. There are infinitely many such pairs, and they all lie between 0 and 1/\pi. Since only finitely many of them lie between any given positive value of x and 1/\pi, there must be infinitely many between that value of x and 0. From this reasoning we may conclude that, as x approaches 0 from the right, \sin(1/x) does not approach any specific value. Thus, \lim_{x\to 0} \sin(1/x) does not exist.
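You can watch this oscillation numerically. In the Python sketch below (our own construction), the sample points x=1/(k\pi) give \sin(1/x)\approx 0, while the points x=2/((4k+1)\pi) give \sin(1/x)\approx 1, no matter how large k is; the function keeps taking both values arbitrarily close to 0.

```python
import math

# Near 0, sin(1/x) keeps hitting both 0 and 1: it is ~0 at x = 1/(k*pi)
# and ~1 at x = 2/((4k+1)*pi), for every positive integer k.
def f(x):
    return math.sin(1 / x)

for k in [10, 100, 1000]:
    zero_pt = 1 / (k * math.pi)            # 1/x = k*pi, so sin is ~0
    peak_pt = 2 / ((4 * k + 1) * math.pi)  # 1/x = 2k*pi + pi/2, so sin is ~1
    print(k, f(zero_pt), f(peak_pt))
```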

Using Limit Notation to Describe Asymptotes

Now consider the function

 g(x) = \frac{1}{x^2}.

What is the limit as x approaches zero? The value of g(0) does not exist; it is not defined.

Notice, also, that we can make g(x) as large as we like, by choosing a small x, as long as x\ne0. For example, to make g(x) equal to 10^{12}, we choose x to be 10^{-6}. Thus, \lim_{x\to 0} \frac{1}{x^2} does not exist.

However, we do know something about what happens to g(x) when x gets close to 0 without reaching it. We want to say we can make g(x) arbitrarily large (as large as we like) by taking x to be sufficiently close to zero, but not equal to zero. We express this symbolically as follows:

\lim_{x\to 0} g(x) = \lim_{x\to 0} \frac{1}{x^2} = \infty

Note that the limit does not exist at 0; for a limit, being \infty is a special kind of not existing. In general, we make the following definition.

Definition: Informal definition of a limit being \pm\infty

We say the limit of f(x) as x approaches c is infinity if f(x) becomes very big (as big as we like) when x is close (but not equal) to c.

In this case we write

\lim_{x\to c} f(x) = \infty

or

f(x)\to\infty\quad\mbox{as}\quad x\to c.

Similarly, we say the limit of f(x) as x approaches c is negative infinity if f(x) becomes very negative when x is close (but not equal) to c.

In this case we write

\lim_{x\to c} f(x) = -\infty

or

f(x)\to-\infty\quad\mbox{as}\quad x\to c.

An example of the second half of the definition would be that \lim_{x\to 0} -\frac{1}{x^2} = -\infty.

Key Application of Limits

To see the power of the concept of the limit, let's consider a moving car. Suppose we have a car whose position is linear with respect to time (that is, a graph plotting the position with respect to time will show a straight line). We want to find the velocity. This is easy to do from algebra; we just take the slope, and that's our velocity.

But unfortunately, things in the real world don't always travel in nice straight lines. Cars speed up, slow down, and generally behave in ways that make it difficult to calculate their velocities.

Now what we really want to do is to find the velocity at a given moment (the instantaneous velocity). The trouble is that in order to find the velocity we need two points, while at any given time, we only have one point. We can, of course, always find the average speed of the car, given two points in time, but we want to find the speed of the car at one precise moment.

This is the basic trick of differential calculus, the first of the two main subjects of this book. We take the average speed at two moments in time, and then make those two moments in time closer and closer together. We then see what the limit of the slope is as these two moments in time are closer and closer, and say that this limit is the slope at a single instant.

We will study this process in much greater depth later in the book. First, however, we will need to study limits more carefully.


<h1> 2.2 Finite Limits</h1>


Informal Finite Limits

Now, we will try to more carefully restate the ideas of the last chapter. We said then that the equation \lim_{x\to 2} f(x) = 4 meant that, when x gets close to 2, f(x) gets close to 4. What exactly does this mean? How close is "close"? The first way we can approach the problem is to say that, at x=1.99, f(x)=3.9601, which is pretty close to 4.

Sometimes, however, the function might do something completely different. For instance, suppose f(x)=x^4-2x^2-3.77. Then f(1.99)=3.99219201, which is quite close to 4, but taking a value even closer to 2 gives f(1.999)\approx 4.20602, which is actually further from 4. The reason is that this function does not approach 4 at all: direct substitution of x=2 gives 4.23, and that is the value the function approaches.

The solution is to find out what happens arbitrarily close to the point. In particular, we want to say that, no matter how close we want the function to get to 4, if we make x close enough to 2 then it will get there. In this case, we will write

\quad\lim_{x\to 2} f(x) = 4

and say "The limit of f(x), as x approaches 2, equals 4" or "As x approaches 2, f(x) approaches 4." In general:

Definition: (New definition of a limit)

We call L the limit of f(x) as x approaches c if f(x) becomes arbitrarily close to L whenever x is sufficiently close (and not equal) to c.

When this holds we write

 \lim_{x \to c} f(x) = L

or

 f(x) \to L \quad \mbox{as} \quad x \to c.

One-Sided Limits

Sometimes, it is necessary to consider what happens when we approach an x value from one particular direction. To account for this, we have one-sided limits. In a left-handed limit, x approaches c from the left-hand side. Likewise, in a right-handed limit, x approaches c from the right-hand side.

For example, if we consider \quad\lim_{x\to 2} \sqrt{x-2}, there is a problem because there is no way for x to approach 2 from the left hand side (the function is undefined here). But, if x approaches 2 only from the right-hand side, we want to say that \sqrt{x-2} approaches 0.

Definition: (Informal definition of a one-sided limit)

We call L the limit of f(x) as x approaches c from the right if f(x) becomes arbitrarily close to L whenever x is sufficiently close to and greater than c.

When this holds we write

 \lim_{x \to c^+} f(x) = L.

Similarly, we call L the limit of f(x) as x approaches c from the left if f(x) becomes arbitrarily close to L whenever x is sufficiently close to and less than c.

When this holds we write

 \lim_{x \to c^-} f(x) = L.

In our example, the left-handed limit \quad\lim_{x\to 2^{-}} \sqrt{x-2} does not exist.

The right-handed limit, however, \quad\lim_{x\to 2^{+}} \sqrt{x-2} = 0.

It is a fact that \lim_{x\to c} f(x) exists if and only if \lim_{x\to c^+} f(x) and \lim_{x\to c^-} f(x) exist and are equal to each other. In this case, \lim_{x\to c} f(x) will be equal to the same number.

In our example, one limit does not even exist. Thus \lim_{x\to 2} \sqrt{x-2} does not exist either.
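A quick Python check (our own sketch) shows the right-hand behavior, and why the left-hand limit fails: to the left of 2 the square root is simply undefined.

```python
import math

# sqrt(x - 2) near 2: defined only for x >= 2, so only the right-hand
# limit exists (a numeric sketch; the name f is ours).
def f(x):
    return math.sqrt(x - 2)

for x in [2.1, 2.01, 2.0001]:
    print(x, f(x))   # values shrink toward 0
# f(1.9) raises ValueError: there is nothing to approach from the left
```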


<h1> 2.3 Infinite Limits</h1>


Informal Infinite Limits

Another kind of limit involves looking at what happens to f(x) as x gets very big. For example, consider the function f(x)=1/x. As x gets very big, 1/x gets very small. In fact, 1/x gets closer and closer to zero the bigger x gets. Without limits it is very difficult to talk about this fact, because x can keep getting bigger and bigger and 1/x never actually gets to zero; but the language of limits exists precisely to let us talk about the behavior of a function as it approaches something - without caring about the fact that it will never get there. In this case, however, we have the same problem as before: how big does x have to be to be sure that f(x) is really going towards 0?

In this case, we want to say that, however close we want f(x) to get to 0, for x big enough f(x) is guaranteed to get that close. So we have yet another definition.

Definition: (Definition of a limit at infinity)

We call L the limit of f(x) as x approaches infinity if f(x) becomes arbitrarily close to L whenever x is sufficiently large.

When this holds we write

 \lim_{x \to \infty} f(x) = L

or

 f(x) \to L \quad \mbox{as} \quad x \to \infty.

Similarly, we call L the limit of f(x) as x approaches negative infinity if f(x) becomes arbitrarily close to L whenever x is sufficiently negative.

When this holds we write

 \lim_{x \to -\infty} f(x) = L

or

 f(x) \to L \quad \mbox{as} \quad x \to -\infty.

So, in this case, we write:

\quad\lim_{x\to \infty} \frac{1}{x} = 0

and say "The limit, as x approaches infinity, equals 0," or "as x approaches infinity, the function approaches 0".

We can also write:

\quad\lim_{x\to -\infty} \frac{1}{x} = 0,

because making x very negative also forces 1/x to be close to 0.

Notice, however, that infinity is not a number; it's just shorthand for saying "no matter how big." Thus, this is not the same as the regular limits we learned about in the last two chapters.

Limits at Infinity of Rational Functions

One special case that comes up frequently is when we want to find the limit at \infty (or -\infty) of a rational function. A rational function is just one made by dividing two polynomials by each other. For example, f(x)=(x^3+x-6)/(x^2-4x+3) is a rational function. Also, any polynomial is a rational function, since 1 is just a (very simple) polynomial, so we can write the function f(x)=x^2-3 as f(x)=(x^2-3)/1, the quotient of two polynomials.

Consider the numerator of a rational function as we allow the variable to grow very large (in either the positive or negative sense). The term with the highest exponent on the variable will dominate the numerator, and the other terms become more and more insignificant compared to the dominating term. The same applies to the denominator. In the limit, the other terms become negligible, and we only need to examine the dominating term in the numerator and denominator.

There is a simple rule for determining the limit of a rational function as the variable approaches infinity: find the term with the highest exponent on the variable in the numerator, and likewise in the denominator. The limit is determined by comparing these two terms.

  • If the exponent of the highest term in the numerator matches the exponent of the highest term in the denominator, the limit (at both \infty and -\infty) is the ratio of the coefficients of the highest terms.
  • If the numerator has the highest term, then the fraction is called "top-heavy". If, when you divide the leading term of the numerator by the leading term of the denominator, the resulting exponent on the variable is even, then the limit (at both \infty and -\infty) is \infty. If it is odd, then the limit at \infty is \infty, and the limit at -\infty is -\infty. (This assumes the ratio of the leading coefficients is positive; if it is negative, replace \infty with -\infty and vice versa in these answers.)
  • If the denominator has the highest term, then the fraction is called "bottom-heavy" and the limit (at both \infty and -\infty) is zero.

Note that, if the numerator or denominator is a constant (including 1, as above), then this is the same as x^0. Also, a straight power of x, like x^3, has coefficient 1, since it is the same as 1x^3.

Examples

Example 1

Find \quad\lim_{x\to \infty} \frac{x-5}{x-3} .

The function f(x)=(x-5)/(x-3) is the quotient of two polynomials, x-5 and x-3. By our rule we look for the term with highest exponent in the numerator; it's x. The term with highest exponent in the denominator is also x. So, the limit is the ratio of their coefficients. Since x=1x, both coefficients are 1, so \lim_{x\to\infty} (x-5)/(x-3) = 1/1 = 1.

Example 2

Find \quad\lim_{x\to \infty} \frac{x^3+x-6}{x^2-4x+3}.

We look at the terms with the highest exponents; for the numerator, it is x^3, while for the denominator it is x^2. Since the exponent on the numerator is higher, we know the limit at \infty will be \infty. So,

\quad\lim_{x\to \infty} \frac{x^3+x-6}{x^2-4x+3}=\infty.
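The leading-term rule is mechanical enough to write down as code. The following Python sketch is our own construction: polynomials are represented as coefficient lists, and for brevity it computes only the limit at +\infty (where, for a top-heavy fraction, the answer is \pm\infty according to the sign of the ratio of the leading coefficients).

```python
import math

# A sketch of the leading-term rule, computing the limit at +infinity only.
# Polynomials are coefficient lists [a0, a1, ..., an] with an != 0.
def limit_at_infinity(num, den):
    dn, dd = len(num) - 1, len(den) - 1    # degrees
    a, b = num[-1], den[-1]                # leading coefficients
    if dn == dd:
        return a / b                       # same degree: ratio of coefficients
    if dn < dd:
        return 0.0                         # "bottom-heavy": limit is zero
    return math.copysign(math.inf, a / b)  # "top-heavy": +/- infinity

# Example 1: (x - 5)/(x - 3) -> 1
print(limit_at_infinity([-5, 1], [-3, 1]))
# Example 2: (x^3 + x - 6)/(x^2 - 4x + 3) -> infinity
print(limit_at_infinity([-6, 1, 0, 1], [3, -4, 1]))
```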


Infinity is not a number

Most people seem to struggle with this fact when first introduced to calculus, and in particular limits.

\lim_{x\to 0^+} \frac{1}{x} = \infty.

But \infty is different. \infty is not a number.

Mathematics is based on formal rules that govern the subject. When a list of formal rules applies to a type of object (e.g., "a number") those rules must always apply — no exceptions!

What makes \infty different is this: "there is no number greater than infinity". You can write down the formula in a lot of different ways, but here's one way: 1 + \infty = \infty. If you add one to infinity, you still have infinity; you don't have a bigger number. If you believe that, then infinity is not a number.

Since \infty does not follow the rules laid down for numbers, it cannot be a number. Every time you use the symbol \infty in a formula where you would normally use a number, you have to interpret the formula differently. Let's look at how \infty does not follow the rules that every actual number does:

Addition Breaks

Every number has a negative, and addition is associative. For \infty we could write -\infty and note that \infty - \infty = 0. This is a good thing, since it means we can prove if you take one away from infinity, you still have infinity: \infty - 1 = (\infty + 1) - 1 = \infty + (1 - 1) = \infty + 0 = \infty. But it also means we can prove 1 = 0, which is not so good.

1 + \infty = \infty
(1 + \infty) - \infty = \infty - \infty
1 + (\infty - \infty) = \infty - \infty
1 = 0

Therefore, \infty - \infty = \mathrm{indeterminate}.

Reinterpret Formulas that Use \infty

We started off with a formula that does "mean" something, even though it used \infty and \infty is not a number.

\lim_{x\to 0^+} \frac{1}{x} = \infty.

What does this mean, compared to what a similar formula means when it has a regular number in place of the infinity symbol?

\lim_{x\to 2} \frac{1}{x} = \frac1{2} .

This formula says that I can make sure the values of  \frac{1}{x} don't differ very much from \frac{1}{2}, so long as I can control how much x varies away from 2. I don't have to make \frac1{x} exactly equal to \frac{1}{2}, but I also can't control x too tightly. I have to give you a range to vary x within. It's just going to be very, very small (probably) if you want to make  \frac{1}{x} very very close to \frac{1}{2}. And by the way, it doesn't matter at all what happens when x=2.

If we use the same paragraph as a template for the original formula, we'll see some problems. Let's substitute 0 for 2, and \infty for \frac{1}{2}.

\lim_{x\to 0^+} \frac{1}{x} = \infty.

This formula says that I can make sure the values of \frac{1}{x} don't differ very much from \infty, so long as I can control how much x varies away from 0. I don't have to make \frac{1}{x} exactly equal to \infty, but I also can't control x too tightly. I have to give you a range to vary x within. It's just going to be very, very small (probably) if you want to see that \frac{1}{x} gets very, very close to \infty. And by the way, it doesn't matter at all what happens when x=0.

It's close to making sense, but it isn't quite there. It doesn't make sense to say that some real number is really "close" to \infty. For example, when x = 0.001 and \frac{1}{x} = 1000, does it really make sense to say 1000 is closer to \infty than 1 is? Solve the following equations for \delta:

1000 + \delta = \infty
\delta = \infty - 1000
\delta = \infty
1 + \delta = \infty
\delta = \infty - 1
\delta = \infty

No real number is very close to \infty; that's what makes \infty so special! So we have to rephrase the paragraph:

\lim_{x\to 0^+} \frac{1}{x} = \infty.

This formula says that I can make sure the values of  \frac{1}{x} get as big as any number you pick, so long as I can control how much x varies away from 0. I don't have to make \frac1{x} bigger than every number, but I also can't control x too tightly. I have to give you a range to vary x within. It's just going to be very, very small (probably) if you want to see that  \frac{1}{x} gets very, very large. And by the way, it doesn't matter at all what happens when x=0.

You can see that the essential nature of the formula hasn't changed, but the exact details require some human interpretation. While rigorous definitions and clear distinctions are essential to the study of mathematics, sometimes a bit of casual rewording is okay. You just have to make sure you understand what a formula really means so you can draw conclusions correctly.

Exercises

Write out an explanatory paragraph for the following limits that include \infty. Remember that you will have to change any comparison of magnitude between a real number and \infty to a different phrase. In the second case, you will have to work out for yourself what the formula means.

1. \lim_{x \to \infty} \frac{1}{x^2} = 0

This formula says that I can make the values of \frac{1}{x^2} as close as I would like to 0, so long as I make x sufficiently large.

2. \sum_{n = 0}^{\infty} 2^{-n} = 1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \cdots = 2

This formula says that you can make the sum \sum_{n=0}^{i} 2^{-n} as close as you would like to 2 by making i sufficiently large.
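You can watch the partial sums approach 2 with a short Python sketch (our own):

```python
# Partial sums of 1 + 1/2 + 1/4 + ...: each one equals 2 - 2^(-i), so they
# creep up toward 2 without ever reaching it.
def partial_sum(i):
    return sum(2.0 ** -n for n in range(i + 1))

for i in [1, 4, 10, 20]:
    print(i, partial_sum(i))
```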

<h1> 2.4 Continuity</h1>


Defining Continuity

We are now ready to define the concept of a function being continuous. The idea is that we want to say that a function is continuous if you can draw its graph without taking your pencil off the page. But sometimes this will be true for some parts of a graph but not for others. Therefore, we want to start by defining what it means for a function to be continuous at one point. The definition is simple, now that we have the concept of limits:

Definition: (continuity at a point)

If f(x) is defined on an open interval containing c, then f(x) is said to be continuous at c if and only if

\lim_{x \rightarrow c} f(x) = f(c).

Note that for f to be continuous at c, the definition in effect requires three conditions:

  1. that f is defined at c, so f(c) exists,
  2. the limit as x approaches c exists, and
  3. the limit and f(c) are equal.

If any of these do not hold then f is not continuous at c.

The idea of the definition is that the point of the graph corresponding to c will be close to the points of the graph corresponding to nearby x-values. Now we can define what it means for a function to be continuous in general, not just at one point.

Definition: (continuity)
A function is said to be continuous on (a, b) if it is continuous at every point of the interval (a, b).

We often use the phrase "the function is continuous" to mean that the function is continuous at every real number. This would be the same as saying the function was continuous on (−∞, ∞), but it is a bit more convenient to simply say "continuous".

Note that, by what we already know, the limit of a rational, exponential, trigonometric or logarithmic function at a point is just its value at that point, so long as it's defined there. So, all such functions are continuous wherever they're defined. (Of course, they can't be continuous where they're not defined!)

Discontinuities

A discontinuity is a point where a function is not continuous. There are lots of possible ways this could happen, of course. Here we'll just discuss two simple ways.

Removable discontinuities

The function f(x) = \frac {x^2-9} {x-3} is not continuous at x = 3. It is discontinuous at that point because the fraction then becomes \frac{0}{0}, which is undefined. Therefore the function fails the first of our three conditions for continuity at the point 3; 3 is just not in its domain.

However, we say that this discontinuity is removable. This is because, if we modify the function at that point, we can eliminate the discontinuity and make the function continuous. To see how to make the function f(x) continuous, we have to simplify f(x), getting f(x) = \frac {x^2-9} {x-3} = \frac {(x+3)(x-3)} {(x-3)} = \frac {x+3} {1} \cdot \frac {x-3} {x-3}. We can define a new function g(x) where g(x) = x + 3. Note that the function g(x) is not the same as the original function f(x), because g(x) is defined at x=3, while f(x) is not. Thus, g(x) is continuous at x=3, since \lim_{x\to 3} (x+3) = 6 = g(3). However, whenever x\ne 3, f(x)=g(x); all we did to f to get g was to make it defined at x=3.

In fact, this kind of simplification is often possible with a discontinuity in a rational function. We can divide the numerator and the denominator by a common factor (in our example x-3) to get a function which is the same except where that common factor was 0 (in our example at x=3). This new function will be identical to the old except for being defined at new points where previously we had division by 0.

However, this is not possible in every case. For example, the function f(x)=\frac{x-3}{x^2-6x+9} has a common factor of x-3 in both the numerator and denominator, but when you simplify you are left with g(x)=\frac{1}{x-3}, which is still not defined at x=3. In this case the domain of f(x) and g(x) are the same, and they are equal everywhere they are defined, so they are in fact the same function. The reason that g(x) differed from f(x) in the first example was because we could take it to have a larger domain and not simply that the formulas defining f(x) and g(x) were different.
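A small Python sketch (the names are our own) makes the distinction concrete for the first example: f is undefined at 3, while g fills in the hole with the limiting value 6.

```python
# f has a removable hole at x = 3; g fills it in (a numeric sketch).
def f(x):
    return (x * x - 9) / (x - 3)

def g(x):
    return x + 3

print(g(3))                  # 6: g is defined at the hole
print(f(2.999), f(3.001))    # both close to 6
# f(3) itself raises ZeroDivisionError
```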

Jump discontinuities

Illustration of a jump discontinuity

Not all discontinuities can be removed from a function. Consider this function:

k(x) = \left\{\begin{matrix} 1, & \mbox{if }x > 0 \\ -1, & \mbox{if }x \le 0 \end{matrix}\right.

Since \lim_{x\to 0} k(x) does not exist, there is no way to redefine k at one point so that it will be continuous at 0. These sorts of discontinuities are called nonremovable discontinuities.

Note, however, that both one-sided limits exist; \lim_{x\to 0^-} k(x) = -1 and \lim_{x\to 0^+} k(x) = 1. The problem is that they are not equal, so the graph "jumps" from one side of 0 to the other. In such a case, we say the function has a jump discontinuity. (Note that a jump discontinuity is a kind of nonremovable discontinuity.)
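In Python (our own sketch), the jump is easy to see: values sampled just left of 0 give -1 and values just right of 0 give 1, so no single redefinition at 0 can make k continuous.

```python
# The jump function k from the text: one-sided limits at 0 disagree.
def k(x):
    return 1 if x > 0 else -1

# Approaching 0 from the left gives -1, from the right gives 1:
print([k(x) for x in [-0.01, -0.001, 0.001, 0.01]])
```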

One-Sided Continuity

Just as a function can have a one-sided limit, a function can be continuous from a particular side. For a function to be continuous at a point from a given side, we need the following three conditions:

  1. the function is defined at the point,
  2. the function has a limit from that side at that point and
  3. the one-sided limit equals the value of the function at the point.

A function will be continuous at a point if and only if it is continuous from both sides at that point. Now we can define what it means for a function to be continuous on a closed interval.

Definition: (continuity on a closed interval)

A function is said to be continuous on [a,b] if and only if

  1. it is continuous on (a,b),
  2. it is continuous from the right at a and
  3. it is continuous from the left at b.

Notice that, if a function is continuous, then it is continuous on every closed interval contained in its domain.

Intermediate Value Theorem

A useful theorem regarding continuous functions is the following:

Intermediate Value Theorem
If a function f is continuous on a closed interval [a,b], then for every value k between f(a) and f(b) there is a value c between a and b such that f(c)=k.

Application: bisection method

A few steps of the bisection method applied over the starting range [a_1, b_1]. The bigger red dot is the root of the function.

The bisection method is the simplest and most reliable algorithm to find zeros of a continuous function.

Suppose we want to solve the equation f(x) = 0. Given two points a and b such that f(a) and f(b) have opposite signs, the intermediate value theorem tells us that f must have at least one root between a and b as long as f is continuous on the interval [a,b]. If we know f is continuous in general (say, because it's made out of rational, trigonometric, exponential and logarithmic functions), then this will work so long as f is defined at all points between a and b. So, let's divide the interval [a,b] in two by computing c = (a+b) / 2. There are now three possibilities:

  1. f(c)=0,
  2. f(a) and f(c) have opposite signs, or
  3. f(c) and f(b) have opposite signs.

In the first case, we're done. In the second and third cases, we can repeat the process on the sub-interval where the sign change occurs. In this way we home in on a small sub-interval containing the zero. The midpoint of that small sub-interval is usually taken as a good approximation to the zero.

Note that, unlike the methods you may have learned in algebra, this works for any continuous function that you (or your calculator) know how to compute.
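The three-case loop described above translates directly into code. This is a minimal Python sketch (the function name, the tolerance, and the error handling are our own choices):

```python
# A minimal bisection sketch; names and the tolerance are our own choices.
def bisect(f, a, b, tol=1e-12):
    """Approximate a zero of a continuous f, given f(a), f(b) of opposite signs."""
    if f(a) * f(b) >= 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    while b - a > tol:
        c = (a + b) / 2
        if f(c) == 0:
            return c              # landed exactly on a root
        if f(a) * f(c) < 0:
            b = c                 # the sign change is in [a, c]
        else:
            a = c                 # the sign change is in [c, b]
    return (a + b) / 2

# x^2 - 2 changes sign on [1, 2], so it has a root there (sqrt(2) = 1.414...)
print(bisect(lambda x: x * x - 2, 1.0, 2.0))
```

Each pass halves the interval, so roughly forty passes shrink an interval of length 1 below the 10^{-12} tolerance.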


<h1> 2.5 Formal Definition of the Limit</h1>



In preliminary calculus, the concept of a limit is probably the most difficult one to grasp (after all, it took mathematicians 150 years to arrive at it); it is also the most important and most useful.

The intuitive definition of a limit is inadequate to prove anything rigorously about it. The problem lies in the vague term "arbitrarily close". We discussed earlier that the meaning of this term is that the closer x gets to the specified value, the closer the function must get to the limit, so that however close we want the function to the limit, we can accomplish this by making x sufficiently close to our value. We can express this requirement technically as follows:

Definition: (Formal definition of a limit)

Let f(x) be a function defined on an open interval D that contains c, except possibly at x=c. Let L be a number. Then we say that

 \lim_{x \to c} f(x) = L

if, for every \varepsilon>0, there exists a \delta>0 such that for all x\in D with

0 < \left| x - c \right| < \delta,

we have

\left| f(x) - L \right| < \varepsilon.

To further explain, earlier we said that "however close we want the function to the limit, we can find a corresponding x close to our value." Using our new notation of epsilon (\varepsilon) and delta (\delta), we mean that if we want to make f(x) within \varepsilon of L, the limit, then we know that making x within \delta of c puts it there.

Again, since this is tricky, let's resume our example from before: f(x)=x^2, at x=2. To start, let's say we want f(x) to be within .01 of the limit. We know by now that the limit should be 4, so we say: for \varepsilon=.01, there is some \delta so that as long as 0 < \left| x - c \right| < \delta, then \left| f(x) - L \right| < \varepsilon.

To show this, we can pick any \delta that is bigger than 0, so long as it works. For example, you might pick .00000000000001, because you are absolutely sure that if x is within .00000000000001 of 2, then f(x) will be within .01 of 4. This \delta works for \varepsilon=.01. But we can't just pick a specific value for \varepsilon, like .01, because we said in our definition "for every \varepsilon>0." This means that we need to be able to show an infinite number of \deltas, one for each \varepsilon. We can't list an infinite number of \deltas!

Of course, we know of a very good way to do this; we simply create a function, so that for every \varepsilon, it can give us a \delta. In this case, one definition of \delta that works is \delta(\varepsilon)=\left\{\begin{matrix}2\sqrt2-2,&\mbox{if }\varepsilon\geq4\\\sqrt{\varepsilon+4}-2,&\mbox{if }\varepsilon<4\end{matrix}\right. (see example 5 in choosing delta for an explanation of how this delta was chosen.)

So, in general, how do you show that f(x) tends to L as x tends to c? Well imagine somebody gave you a small number \varepsilon (e.g., say \varepsilon=0.03). Then you have to find a \delta>0 and show that whenever 0<\left|x-c\right|<\delta we have |f(x)-L|<0.03. Now if that person gave you a smaller \varepsilon (say \varepsilon=0.002) then you would have to find another \delta, but this time with 0.03 replaced by 0.002. If you can do this for any choice of \varepsilon then you have shown that f(x) tends to L as x tends to c. Of course, the way you would do this in general would be to create a function giving you a \delta for every \varepsilon, just as in the example above.
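This ε-δ "game" can be played numerically. The sketch below uses the \delta(\varepsilon)=\sqrt{\varepsilon+4}-2 branch quoted above for f(x)=x^2 at c=2 and checks, by random sampling, that every x within δ of 2 sends f(x) to within ε of 4. It is a sanity check, not a proof.

```python
import random

def delta(eps):
    # The delta quoted above for f(x) = x**2 at c = 2 (valid for eps < 4)
    return (eps + 4) ** 0.5 - 2

f = lambda x: x**2
c, L = 2, 4
for eps in [0.5, 0.01, 1e-6]:
    d = delta(eps)
    for _ in range(1000):
        x = c + random.uniform(-d, d)
        if 0 < abs(x - c) < d:          # the points the definition talks about
            assert abs(f(x) - L) < eps  # ...all land within eps of the limit
```

No finite amount of sampling can replace the proof, but failing this check would immediately expose a wrong choice of δ.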

Formal Definition of the Limit at Infinity

Definition: (Limit of a function at infinity)

We call L the limit of f(x) as x approaches \infty if for every number \varepsilon>0 there exists a \delta such that whenever x>\delta we have

\left| f(x) - L \right| < \varepsilon

When this holds we write

 \lim_{x \to \infty} f(x) = L

or

 f(x) \to L as  x \to \infty.

Similarly, we call L the limit of f(x) as x approaches -\infty if for every number \varepsilon>0, there exists a number \delta such that whenever x<\delta we have

\left| f(x) - L \right| < \varepsilon

When this holds we write

 \lim_{x \to -\infty} f(x) = L

or

 f(x) \to L as  x\to -\infty.

Notice the difference in these two definitions. For the limit of f(x) as x approaches \infty we are interested in those x such that x>\delta. For the limit of f(x) as x approaches -\infty we are interested in those x such that x<\delta.

Examples

Here are some examples of the formal definition.

Example 1

We know from earlier in the chapter that

\lim_{x \to 8} \frac {x} {4}=2 .

What is \delta when \varepsilon=0.01 for this limit?

We start with the desired conclusion and substitute the given values for f(x) and \varepsilon:

\left| \frac {x} {4} - 2 \right| < 0.01.

Then we solve the inequality for x:

7.96<x<8.04

This is the same as saying

-0.04<x-8<0.04.

(We want the expression in the middle of the inequality to be x-8 because that's where we're taking the limit.) We normally choose the smaller of \left|-0.04\right| and 0.04 for \delta; here the two are equal, so \delta=0.04, but any smaller positive number will also work.
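A quick numerical check of this δ (a sketch; we sample a grid of points strictly inside (8-\delta, 8+\delta) on both sides of 8):

```python
delta, eps = 0.04, 0.01
# Sample points on both sides of 8, strictly inside (8 - delta, 8 + delta)
for i in range(1, 1000):
    for x in (8 - delta * i / 1000, 8 + delta * i / 1000):
        assert abs(x / 4 - 2) < eps  # |x/4 - 2| stays below 0.01
```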

Example 2

What is the limit of f(x) = x + 7 as x approaches 4?

There are two steps to answering such a question; first we must determine the answer — this is where intuition and guessing is useful, as well as the informal definition of a limit — and then we must prove that the answer is right.

In this case, 11 is the limit because we know f(x) = x + 7 is a continuous function whose domain is all real numbers. Thus, we can find the limit by just substituting 4 in for x, so the answer is 4+7=11.

We're not done, though, because we never proved any of the limit laws rigorously; we just stated them. In fact, we couldn't have proved them, because we didn't have the formal definition of the limit yet. Therefore, in order to be sure that 11 is the right answer, we need to prove that no matter what value of \varepsilon is given to us, we can find a value of \delta such that

\left| f(x) - 11 \right| < \varepsilon

whenever

\left| x - 4 \right| < \delta.

For this particular problem, letting \delta=\varepsilon works (see choosing delta for help in determining the value of \delta to use in other problems). Now, we have to prove

\left| f(x) - 11 \right| < \varepsilon

given that

\left| x - 4 \right| < \delta = \varepsilon.

Since \left| x - 4 \right| < \varepsilon, we know

\left| f(x) - 11 \right| = \left| x + 7 - 11 \right| = \left| x - 4 \right| < \varepsilon

which is what we wished to prove.

Example 3

What is the limit of f(x) = x^2 as x approaches 4?

As before, we use what we learned earlier in this chapter to guess that the limit is 4^2=16. Also as before, we pull out of thin air that

\delta = \sqrt{\varepsilon+16}-4.

Note that, since \varepsilon is always positive, so is \delta, as required. Now, we have to prove

\left| x^2 - 16 \right| < \varepsilon

given that

\left| x - 4 \right| < \delta = \sqrt{\varepsilon + 16} - 4.

We know that

\left|x + 4\right| = \left|(x - 4) + 8\right| \le \left|x - 4\right| + 8<\delta+8

(because of the triangle inequality), so

\begin{matrix}
\left| x^2 - 16 \right| & = & \left| x - 4 \right| \cdot \left| x + 4 \right| \\  \\
\ & < & \delta \cdot (\delta + 8) \\  \\
\ & = & (\sqrt{16 + \varepsilon} - 4) \cdot (\sqrt{16 + \varepsilon} + 4) \\  \\
\ & = & (\sqrt{16 + \varepsilon})^2 - 4^2 \\  \\
\ & = & \varepsilon+16-16 \\ \\
\ & = & \varepsilon. \end{matrix}
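The derivation hinges on the identity \delta(\delta+8)=\varepsilon for this choice of \delta. A quick numerical sanity check of that identity (a sketch, not part of the proof):

```python
# With delta = sqrt(eps + 16) - 4, the bound delta * (delta + 8) equals eps,
# so |x**2 - 16| < delta * (delta + 8) = eps whenever 0 < |x - 4| < delta.
for eps in [1e-3, 0.5, 10.0]:
    delta = (eps + 16) ** 0.5 - 4
    assert delta > 0                              # delta is positive, as required
    assert abs(delta * (delta + 8) - eps) < 1e-9  # the identity holds
```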

Example 4

Show that the limit of \sin(1/x) as x approaches 0 does not exist.

We will proceed by contradiction. Suppose the limit exists; call it L. For simplicity, we'll assume that L\neq 1; the case for L=1 is similar. Choose \varepsilon = |1-L|. Then if the limit were L there would be some \delta>0 such that \left|\sin(1/x)-L\right|<\varepsilon=|1-L| for every x with 0<\left|x\right|<\delta. But, for every \delta > 0, there exists some (possibly very large) n such that  0 < x_0 = \frac{1}{\pi /2 + 2\pi n} < \delta, but |\sin(1/x_0) - L|=|1-L|, a contradiction.

Example 5

What is the limit of x \sin(1/x) as x approaches 0?

By the Squeeze Theorem, we know the answer should be 0. To prove this, we let \delta = \varepsilon. Then for all x, if 0 < |x| < \delta, then |x \sin(1/x) - 0| \leq | x | < \varepsilon as required.

Example 6

Suppose that \lim_{x\to a}f(x)=L and \lim_{x\to a}g(x)=M. What is \lim_{x\to a}(f(x)+g(x))?

Of course, we know the answer should be L+M, but now we can prove this rigorously. Given some \varepsilon, we know there's a \delta_1 such that, for any x with 0<\left|x-a\right|<\delta_1, \left|f(x)-L\right|<\varepsilon/2 (since the definition of limit says "for any \varepsilon", so it must be true for \varepsilon/2 as well). Similarly, there's a \delta_2 such that, for any x with 0<\left|x-a\right|<\delta_2, \left|g(x)-M\right|<\varepsilon/2. We can set \delta to be the lesser of \delta_1 and \delta_2. Then, for any x with 0<\left|x-a\right|<\delta, \left|(f(x)+g(x))-(L+M)\right|\le\left|f(x)-L\right|+\left|g(x)-M\right|<\varepsilon/2+\varepsilon/2=\varepsilon, as required.

If you like, you can prove the other limit rules too using the new definition. Mathematicians have already done this, which is how we know the rules work. Therefore, when computing a limit from now on, we can go back to just using the rules and still be confident that our limit is correct according to the rigorous definition.

Formal Definition of a Limit Being Infinity

Definition: (Formal definition of a limit being infinity)

Let f(x) be a function defined on an open interval D that contains c, except possibly at x=c. Then we say that

 \lim_{x \to c} f(x) = \infty

if, for every \varepsilon, there exists a \delta>0 such that for all x\in D with

0 < \left| x - c \right| < \delta,

we have

f(x) > \varepsilon.

When this holds we write

\lim_{x\to c}f(x)=\infty

or

f(x)\to\infty as x\to c.

Similarly, we say that

 \lim_{x \to c} f(x) = -\infty

if, for every \varepsilon, there exists a \delta>0 such that for all x\in D with

0 < \left| x - c \right| < \delta,

we have

f(x) < \varepsilon.

When this holds we write

\lim_{x\to c}f(x)=-\infty

or

f(x)\to-\infty as x\to c.

<h1> 2.6 Proofs of Some Basic Limit Rules</h1>


Now that we have the formal definition of a limit, we can set about proving some of the properties we stated earlier in this chapter about limits.

Constant Rule for Limits

If b and c are constants then  \lim_{x\to c} b = b.

Proof of the Constant Rule for Limits:
To prove that  \lim_{x\to c} b = b, we need to show that for every \varepsilon>0 there exists a \delta>0 such that \left|b-b\right|<\varepsilon whenever 0<\left|x-c\right|<\delta. But \left|b-b\right|=0 and \varepsilon>0, so \left|b-b\right|<\varepsilon is satisfied for every x; that is, we can choose any \delta>0 we like and the \varepsilon condition holds.

Identity Rule for Limits

If c is a constant then  \lim_{x\to c} x = c.

Proof of the Identity Rule for Limits:
To prove that  \lim_{x\to c} x = c, we need to show that for every \varepsilon>0 there exists a \delta>0 such that \left|x-c\right|<\varepsilon whenever 0<\left|x-c\right|<\delta. Choosing \delta=\varepsilon satisfies this condition.

Scalar Product Rule for Limits

Suppose that \lim_{x\to c} f(x) =L for finite L and that k is constant. Then  \lim_{x\to c} k f(x) = k \cdot \lim_{x\to c} f(x) =  k L

Proof of the Scalar Product Rule for Limits:
Since we are given that \lim_{x\to c} f(x) =L, there must be some function, call it \delta_{f}(\varepsilon), such that for every \varepsilon>0, \left|f(x)-L\right|<\varepsilon whenever \left|x-c\right|<\delta_{f}(\varepsilon). Now we need to find a \delta_{kf}(\varepsilon) such that for all \varepsilon>0, \left|k f(x)-k L\right|<\varepsilon whenever \left|x-c\right|<\delta_{kf}(\varepsilon).
First let's suppose that k>0. Since \left|k f(x)-k L\right| = k \left|f(x)-L\right|, the condition \left|k f(x)-k L\right|<\varepsilon holds exactly when \left|f(x)-L\right|<\varepsilon/k. In this case, letting \delta_{kf}(\varepsilon)=\delta_{f}(\varepsilon/k) satisfies the limit condition.
Now suppose that k=0. Since f(x) has a limit at x=c, we know from the definition of a limit that f(x) is defined in an open interval D that contains c (except maybe at c itself), and wherever f(x) is defined we have  0\cdot f(x)=0. Since k f(x) is thus the constant function 0 on D (except maybe at c, which doesn't affect the limit), the limit  \lim_{x\to c} k f(x) = 0 = k L by the Constant Rule for Limits.
Finally, suppose that k<0. Since \left|k f(x)-k L\right| = -k \left|f(x)-L\right|, the condition \left|k f(x)-k L\right|<\varepsilon holds exactly when \left|f(x)-L\right|<-\varepsilon/k. In this case, letting \delta_{kf}(\varepsilon)=\delta_{f}(-\varepsilon/k) satisfies the limit condition.

Sum Rule for Limits
Suppose that \lim_{x\to c} f(x) =L and \lim_{x\to c} g(x) =M. Then

 \lim_{x\to c} [f(x) + g(x)] = \lim_{x\to c} f(x) +  \lim_{x\to c} g(x) =  L + M

Proof of the Sum Rule for Limits:
Since we are given that \lim_{x\to c} f(x) =L and \lim_{x\to c} g(x) =M, there must be functions, call them \delta_{f}(\varepsilon) and \delta_{g}(\varepsilon), such that for all \varepsilon>0, \left|f(x)-L\right|<\varepsilon whenever \left|x-c\right|<\delta_{f}(\varepsilon), and \left|g(x)-M\right|<\varepsilon whenever \left|x-c\right|<\delta_{g}(\varepsilon).
By the triangle inequality we have \left|(f(x)+g(x))-(L+M)\right|=\left|(f(x)-L)+(g(x)-M)\right|\leq\left|f(x)-L\right|+\left|g(x)-M\right|. Now let \delta_{fg}(\varepsilon) be the smaller of \delta_{f}(\varepsilon/2) and \delta_{g}(\varepsilon/2). Whenever \left|x-c\right|<\delta_{fg}(\varepsilon), we have \left|f(x)-L\right|<\varepsilon/2 and \left|g(x)-M\right|<\varepsilon/2, so \left|(f(x)+g(x))-(L+M)\right|<\varepsilon/2+\varepsilon/2=\varepsilon. Thus \delta_{fg}(\varepsilon) satisfies the definition of a limit for \lim_{x\to c} [f(x) + g(x)] having limit  L + M .

Difference Rule for Limits
Suppose that \lim_{x\to c} f(x) =L and \lim_{x\to c} g(x) =M. Then

 \lim_{x\to c} [f(x) - g(x)] = \lim_{x\to c} f(x) -  \lim_{x\to c} g(x) =  L - M

Proof of the Difference Rule for Limits: Define h(x)=-g(x). By the Scalar Product Rule for Limits, \lim_{x\to c}h(x)=-M. Then by the Sum Rule for Limits, \lim_{x\to c}(f(x)-g(x))=\lim_{x\to c}(f(x)+h(x))=L-M.

Product Rule for Limits
Suppose that \lim_{x\to c} f(x) =L and \lim_{x\to c} g(x) =M. Then

 \lim_{x\to c} [f(x) g(x)] = \lim_{x\to c} f(x) \lim_{x\to c} g(x) = L M

Proof of the Product Rule for Limits:[1]
Let \varepsilon be any positive number. The assumptions imply the existence of the positive numbers \delta_{1}, \delta_{2}, \delta_{3} such that

(1)\qquad\left|f(x)-L\right|<\frac{\varepsilon}{2(1+\left|M\right|)} when 0<\left|x-c\right|<\delta_{1}
(2)\qquad\left|g(x)-M\right|<\frac{\varepsilon}{2(1+\left|L\right|)} when 0<\left|x-c\right|<\delta_{2}
(3)\qquad\left|g(x)-M\right|<1 when 0<\left|x-c\right|<\delta_{3}

According to the condition (3) we see that

\left|g(x)\right|=\left|g(x)-M+M\right|\leq\left|g(x)-M\right|+\left|M\right|<1+\left|M\right| when 0<\left|x-c\right|<\delta_{3}

Supposing then that 0<\left|x-c\right|<\min\{\delta_{1},\delta_{2},\delta_{3}\} and using (1) and (2) we obtain

\begin{align}\left|f(x)g(x)-LM\right|&=\left|f(x)g(x)-Lg(x)+Lg(x)-LM\right|\\
&\leq\left|f(x)g(x)-Lg(x)\right|+\left|Lg(x)-LM\right|\\
&=\left|g(x)\right|\cdot\left|f(x)-L\right|+\left|L\right|\cdot\left|g(x)-M\right|\\
&<(1+\left|M\right|)\frac{\varepsilon}{2(1+\left|M\right|)}+(1+\left|L\right|)\frac{\varepsilon}{2(1+\left|L\right|)}\\
&=\varepsilon
\end{align}

Quotient Rule for Limits
Suppose that \lim_{x\to c} f(x) =L and \lim_{x\to c} g(x) =M and M \neq 0. Then

 \lim_{x\to c} \frac{f(x)}{g(x)} = \frac{\lim_{x\to c} f(x)}{\lim_{x\to c} g(x)} = \frac{L}{M}

Proof of the Quotient Rule for Limits:
If we can show that \lim_{x\to c}\frac{1}{g(x)}=\frac{1}{M}, then we can define a function, h(x) as h(x)=\frac{1}{g(x)} and appeal to the Product Rule for Limits to prove the theorem. So we just need to prove that \lim_{x\to c}\frac{1}{g(x)}=\frac{1}{M}.
Let \varepsilon be any positive number. The assumptions imply the existence of the positive numbers \delta_{1}, \delta_{2} such that

(1)\qquad\left|g(x)-M\right|<\varepsilon\left|M\right|(1+\left|M\right|) when 0<\left|x-c\right|<\delta_{1}
(2)\qquad\left|g(x)-M\right|<1 when 0<\left|x-c\right|<\delta_{2}

According to the condition (2) we see that

\left|g(x)\right|=\left|g(x)-M+M\right|\leq\left|g(x)-M\right|+\left|M\right|<1+\left|M\right| when 0<\left|x-c\right|<\delta_{2}

which implies that

(3)\qquad\left|\frac{1}{g(x)}\right|>\frac{1}{1+\left|M\right|} when 0<\left|x-c\right|<\delta_{2}

Supposing then that 0<\left|x-c\right|<\min\{\delta_{1},\delta_{2}\} and using (1) and (3) we obtain

\begin{align}\left|\frac{1}{g(x)}-\frac{1}{M}\right|&=\left|\frac{M-g(x)}{Mg(x)}\right|\\
&=\left|\frac{g(x)-M}{Mg(x)}\right|\\
&=\left|\frac{1}{g(x)}\right|\cdot\left|\frac{g(x)-M}{M}\right|\\
&<\frac{1}{1+\left|M\right|}\cdot\left|\frac{g(x)-M}{M}\right|\\
&<\frac{1}{1+\left|M\right|}\cdot\left|\frac{\varepsilon\left|M\right|(1+\left|M\right|)}{M}\right|\\
&=\varepsilon
\end{align}
Theorem: (Squeeze Theorem)
Suppose that g(x) \le f(x) \le h(x) holds for all x in some open interval containing c, except possibly at x=c itself. Suppose also that \lim_{x\to c}g(x)=\lim_{x\to c}h(x)=L. Then \lim_{x\to c}f(x)=L also.

Proof of the Squeeze Theorem:
Let \varepsilon>0. From the assumptions, there exist \delta_1,\delta_2>0 such that \left|g(x)-L\right|<\varepsilon when 0<\left|x-c\right|<\delta_1 and \left|h(x)-L\right|<\varepsilon when 0<\left|x-c\right|<\delta_2. Let \delta be the smaller of \delta_1 and \delta_2, so that both inequalities hold when 0<\left|x-c\right|<\delta.
These inequalities are equivalent to L-\varepsilon<g(x)<L+\varepsilon and L-\varepsilon<h(x)<L+\varepsilon when 0<\left|x-c\right|<\delta.
Using what we know about the relative ordering of f(x), g(x), and h(x), we have
L-\varepsilon<g(x)\le f(x)\le h(x)<L+\varepsilon when 0<\left|x-c\right|<\delta,
or
-\varepsilon<f(x)-L<\varepsilon when 0<\left|x-c\right|<\delta.
So
\left|f(x)-L\right|<\varepsilon when 0<\left|x-c\right|<\delta, as required.

Notes

  1. This proof is adapted from one found at planetmath.org/encyclopedia/ProofOfLimitRuleOfProduct.html due to Planet Math user pahio and made available under the terms of the Creative Commons By/Share-Alike License.

<h1> 2.7 Limits Cumulative Exercises</h1>


Basic Limit Exercises

1. \lim_{x\to 2} (4x^2 - 3x+1)

11

2. \lim_{x\to 5} (x^2)

25

Solutions

One-Sided Limits

Evaluate the following limits or state that the limit does not exist.

3.  \lim_{x\to 0^-} \frac{x^3+x^2}{x^3+2x^2}

\frac{1}{2}

4.  \lim_{x\to 7^-} |x^2+x| -x

49

5.  \lim_{x\to -1^+} \sqrt{1-x^2}

0

6.  \lim_{x\to -1^-} \sqrt{1-x^2}

The limit does not exist

Solutions

Two-Sided Limits

Evaluate the following limits or state that the limit does not exist.

7.  \lim_{x \to -1} \frac{1}{x-1}

-\frac{1}{2}

8.  \lim_{x\to 4}  \frac{1}{x-4}

The limit does not exist.

9.  \lim_{x\to 2}  \frac{1}{x-2}

The limit does not exist.

10.  \lim_{x\to -3}  \frac{x^2 - 9}{x+3}

-6

11.  \lim_{x\to 3} \frac{x^2 - 9}{x-3}

6

12.  \lim_{x\to -1} \frac{x^2+2x+1}{x+1}

0

13.  \lim_{x\to -1} \frac{x^3+1}{x+1}

3

14.  \lim_{x\to 4} \frac{x^2 + 5x-36}{x^2 - 16}

\frac{13}{8}

15.  \lim_{x\to 25} \frac{x-25}{\sqrt{x}-5}

10

16.  \lim_{x\to 0} \frac{\left|x\right|}{x}

The limit does not exist.

17.  \lim_{x\to 2} \frac{1}{(x-2)^2}

\infty

18.  \lim_{x\to 3} \frac{\sqrt{x^2+16}}{x-3}

The limit does not exist.

19.  \lim_{x\to -2} \frac{3x^2-8x -3}{2x^2-18}

-\frac{5}{2}

20.  \lim_{x\to 2} \frac{x^2 + 2x + 1}{x^2-2x+1}

9

21.  \lim_{x\to 3} \frac{x+3}{x^2-9}

The limit does not exist.

22.  \lim_{x\to -1} \frac{x+1}{x^2+x}

-1

23.  \lim_{x\to 1} \frac{1}{x^2+1}

\frac{1}{2}

24.  \lim_{x\to 1} x^2 + 5x - \frac{1}{2-x}

5

25.  \lim_{x\to 1} \frac{x^2-1}{x^2+2x-3}

\frac{1}{2}

26.  \lim_{x\to 1} \frac{5x}{x^2+2x-3}

The limit does not exist.

Solutions

Limits to Infinity

Evaluate the following limits or state that the limit does not exist.

27.  \lim_{x\to \infty} \frac{-x + \pi}{x^2 + 3x + 2}

0

28.  \lim_{x\to -\infty} \frac{x^2+2x+1}{3x^2+1}

\frac{1}{3}

29.  \lim_{x\to -\infty} \frac{3x^2 + x}{2x^2 - 15}

\frac{3}{2}

30.  \lim_{x\to -\infty} 3x^2-2x+1

\infty

31.  \lim_{x\to \infty} \frac{2x^2-32}{x^3-64}

0

32.  \lim_{x\to \infty} 6

6

33.  \lim_{x\to \infty} \frac{3x^2 +4x}{x^4+2}

0

34.  \lim_{x\to -\infty} \frac{2x+3x^2+1}{2x^2+3}

\frac{3}{2}

35.  \lim_{x\to -\infty} \frac{x^3-3x^2+1}{3x^2+x+5}

-\infty

36.  \lim_{x\to \infty} \frac{x^2+2}{x^3-2}

0

Solutions

Limits of Piecewise Functions

Evaluate the following limits or state that the limit does not exist.

37. Consider the function

 f(x) = \begin{cases} (x-2)^2 & \mbox{if }x<2 \\ x-3 & \mbox{if }x\geq 2. \end{cases}
a.  \lim_{x\to 2^-}f(x)

0

b.  \lim_{x\to 2^+}f(x)

-1

c.  \lim_{x\to 2}f(x)

The limit does not exist


38. Consider the function

 g(x) = \begin{cases} -2x+1 & \mbox{if }x\leq 0 \\ x+1 & \mbox{if }0<x<4 \\ x^2 +2 & \mbox{if }x \geq 4. \end{cases}
a.  \lim_{x\to 4^+} g(x)

18

b.  \lim_{x\to 4^-} g(x)

5

c.  \lim_{x\to 0^+} g(x)

1

d.  \lim_{x\to 0^-} g(x)

1

e.  \lim_{x\to 0} g(x)

1

f.  \lim_{x\to 1} g(x)

2


39. Consider the function

 h(x) = \begin{cases} 2x-3 & \mbox{if }x<2 \\ 8 & \mbox{if }x=2 \\ -x+3 & \mbox{if } x>2. \end{cases}
a.  \lim_{x\to 0} h(x)

-3

b.  \lim_{x\to 2^-} h(x)

1

c.  \lim_{x\to 2^+} h(x)

1

d.  \lim_{x\to 2} h(x)

1

Solutions


Differentiation

Basics of Differentiation

<h1> 3.1 Differentiation Defined</h1>


What is Differentiation?

Differentiation is an operation that allows us to find a function that outputs the rate of change of one variable with respect to another variable.

Informally, we may suppose that we're tracking the position of a car on a two-lane road with no passing lanes. Assuming the car never pulls off the road, we can abstractly study the car's position by assigning it a variable, x. Since the car's position changes as the time changes, we say that x is dependent on time, or x = f(t). This tells us where the car is at each specific time. Differentiation gives us a function dx / dt which represents the car's speed, that is, the rate of change of its position with respect to time.

Equivalently, differentiation gives us the slope at any point of the graph of a non-linear function. For a linear function, of the form f(x)=ax+b, a is the slope. For non-linear functions, such as f(x)=3x^2, the slope can depend on x; differentiation gives us a function which represents this slope.

The Definition of Slope

Historically, the primary motivation for the study of differentiation was the tangent line problem: for a given curve, find the slope of the straight line that is tangent to the curve at a given point. The word tangent comes from the Latin word tangens, which means touching. Thus, to solve the tangent line problem, we need to find the slope of a line that is "touching" a given curve at a given point, or, in modern language, that has the same slope as the curve at that point. But what exactly do we mean by "slope" for a curve?

The solution is obvious in some cases: for example, a line y = m x + c is its own tangent; the slope at any point is m. For the parabola y = x^2, the slope at the point (0,0) is 0; the tangent line is horizontal.

But how can you find the slope of, say, y = \sin  x + x^2 at x = 1.5? This is in general a nontrivial question, but first we will deal carefully with the slope of lines.

Of a line

Three lines with different slopes

The slope of a line, also called the gradient of the line, is a measure of its inclination. A line that is horizontal has slope 0, a line from the bottom left to the top right has a positive slope and a line from the top left to the bottom right has a negative slope.

The slope can be defined in two (equivalent) ways. The first way is to express it as how much the line climbs for a given motion horizontally. We denote a change in a quantity using the symbol \Delta (pronounced "delta"). Thus, a change in x is written as \Delta x. We can therefore write this definition of slope as:

\mbox{Slope}=\frac{\Delta y}{\Delta x}

An example may make this definition clearer. If we have two points on a line, P \left(x_1,y_1 \right) and Q \left( x_2,y_2 \right), the change in x from P to Q is given by:

\Delta x = x_2 - x_1\,

Likewise, the change in y from P to Q is given by:

\Delta y = y_2 - y_1\,

This leads to the very important result below.

The slope of the line between the points (x_1, y_1) and (x_2, y_2) is

\frac{\Delta y}{\Delta x} = \frac{y_2 - y_1}{x_2 - x_1}.

Alternatively, we can define slope trigonometrically, using the tangent function:

\mbox{Slope}=\tan\left( \alpha \right),

where \alpha is the angle from the rightward-pointing horizontal to the line, measured counter-clockwise. If you recall that the tangent of an angle is the ratio of the y-coordinate to the x-coordinate on the unit circle, you should be able to spot the equivalence here.

Of a graph of a function

The graphs of most functions we are interested in are not straight lines (although they can be), but rather curves. We cannot define the slope of a curve in the same way as we can for a line. In order for us to understand how to find the slope of a curve at a point, we will first have to cover the idea of tangency. Intuitively, a tangent is a line which just touches a curve at a point, such that the angle between them at that point is zero. Consider the following four curves and lines:

(i) (ii)
Tangency Example 1.svg Tangency Example 2.svg
(iii) (iv)
Tangency Example 3.svg Tangency Example 4.svg
  1. The line L crosses, but is not tangent to C at P.
  2. The line L crosses, and is tangent to C at P.
  3. The line L crosses C at two points, but is tangent to C only at P.
  4. There are many lines that cross C at P, but none are tangent. In fact, this curve has no tangent at P.

A secant is a line drawn through two points on a curve. We can construct a definition of a tangent as the limit of a secant of the curve taken as the separation between the points tends to zero. Consider the diagram below.

Tangent as Secant Limit.svg

As the distance h tends to zero, the secant line becomes the tangent at the point x_0. The two points we draw our line through are:

P \left( x_0, f\left( x_0 \right) \right)

and

Q \left( x_0+h, f\left( x_0+h \right) \right)

As a secant line is simply a line and we know two points on it, we can find its slope, m_h, using the formula from before:

m = \frac{y_2 - y_1}{x_2 - x_1}

(We will refer to the slope as m_h because it may, and generally will, depend on h.) Substituting in the points on the line,

m_h = \frac{f\left( x_0+h \right) - f\left( x_0 \right)}{\left(x_0 + h \right) - x_0}.

This simplifies to

m_h = \frac{f\left( x_0+h \right) - f\left( x_0 \right)}{h}.

This expression is called the difference quotient. Note that h can be positive or negative — it is perfectly valid to take a secant through any two points on the curve — but cannot be 0.
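The difference quotient is easy to experiment with numerically. As a sketch, take f(x)=x^2 at x_0=1 (the same curve as in the exercise below): the secant slope works out to m_h = 2+h exactly, so it approaches 2 as h shrinks, from either side.

```python
def difference_quotient(f, x0, h):
    # Slope m_h of the secant through (x0, f(x0)) and (x0 + h, f(x0 + h))
    return (f(x0 + h) - f(x0)) / h

f = lambda x: x**2
for h in [0.1, 0.01, -0.01, 1e-6, -1e-6]:
    m = difference_quotient(f, 1.0, h)
    # For this f, m_h = ((1+h)**2 - 1)/h = 2 + h, which tends to 2 as h -> 0
    assert abs(m - (2 + h)) < 1e-6
```

Note that h may be positive or negative here, matching the remark above, but never 0.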

The definition of the tangent line we gave was not rigorous, since we've only defined limits of numbers — or, more precisely, of functions that output numbers — not of lines. But we can define the slope of the tangent line at a point rigorously, by taking the limit of the slopes of the secant lines from the last paragraph. Having done so, we can then define the tangent line as well. Note that we cannot simply set h to zero as this would imply division of zero by zero which would yield an undefined result. Instead we must find the limit of the above expression as h tends to zero:

Definition: (Slope of the graph of a function)

The slope of the graph of f(x) at the point (x_0,f(x_0)) is

\lim_{h \to 0}\left[\frac{f\left( x_0+h \right) - f\left( x_0 \right)}{h}\right]

If this limit does not exist, then we say the slope is undefined.

If the slope is defined, say m, then the tangent line to the graph of f(x) at the point (x_0,f(x_0)) is the line with equation

y-f(x_0) = m\cdot(x-x_0)

This last equation is just the point-slope form for the line through (x_0,f(x_0)) with slope m.

Exercises

1. Find the slope of the tangent to the curve y=x^2 at (1,1).

2

Solutions

The Rate of Change of a Function at a Point

Consider the formula for average velocity in the x direction, \frac{\Delta x}{\Delta t}, where \Delta x is the change in x over the time interval \Delta t. This formula gives the average velocity over a period of time, but suppose we want to define the instantaneous velocity. To this end we look at the change in position as the change in time approaches 0. Mathematically this is written as: \lim_{\Delta t \to 0} \frac{\Delta x}{\Delta t}, which we abbreviate by the symbol \frac{dx}{dt}. (The idea of this notation is that the letter d denotes change.) Compare the symbol d with \Delta. The idea is that both indicate a difference between two numbers, but \Delta denotes a finite difference while d denotes an infinitesimal difference. Please note that the symbols dx and dt have no rigorous meaning on their own, since \lim_{\Delta t \to 0} \Delta t=0, and we can't divide by 0.

(Note that the letter s is often used to denote distance, which would yield \frac{ds}{dt}. The letter d is often avoided in denoting distance due to the potential confusion resulting from the expression \frac{dd}{dt}.)

The Definition of the Derivative

You may have noticed that the two operations we've discussed — computing the slope of the tangent to the graph of a function and computing the instantaneous rate of change of the function — involved exactly the same limit. That is, the slope of the tangent to the graph of y=f(x) is \frac{dy}{dx}. Of course, \frac{dy}{dx} can, and generally will, depend on x, so we should really think of it as a function of x. We call this process (of computing \frac{dy}{dx}) differentiation. Differentiation results in another function whose value for any value x is the slope of the original function at x. This function is known as the derivative of the original function.

Since lots of different sorts of people use derivatives, there are lots of different mathematical notations for them. Here are some:

  • f'(x)\ (read "f prime of x") for the derivative of f(x),
  • D_x[f(x)],
  • D f(x),
  • \frac{dy}{dx} for the derivative of y as a function of x or
  • \frac{d}{dx}\left[ y\right], which is more useful in some cases.

Most of the time the brackets are not needed, but are useful for clarity if we are dealing with something like D (fg), where we want to differentiate the product of two functions, f and g.

The first notation has the advantage that it makes clear that the derivative is a function. That is, if we want to talk about the derivative of f(x) at x=2, we can just write f'(2).

In any event, here is the formal definition:

Definition: (derivative)
Let f(x) be a function. Then f'(x) = \lim_{\Delta x \to 0}\frac{f(x+\Delta x)-f(x)}{\Delta x} wherever this limit exists. In this case we say that f is differentiable at x and its derivative at x is f'(x).

Examples

Example 1

The derivative of f(x)=x/2 is

f'(x)=\lim_{\Delta x \to 0}\left(\frac{\frac{x+\Delta x}{2} - \frac{x}{2}}{\Delta x}\right)=\lim_{\Delta x \to 0}\left(\frac{\frac{x}{2}+\frac{\Delta x}{2} - \frac{x}{2}}{\Delta x}\right)=\lim_{\Delta x \to 0}\left(\frac{\frac{\Delta x}{2}}{\Delta x}\right)=\lim_{\Delta x \to 0}\left(\frac{\Delta x}{2 \Delta x}\right)=\lim_{\Delta x \to 0}\left(\frac{1}{2}\right)=\frac{1}{2},

no matter what x is. This is consistent with the definition of the derivative as the slope of a function.

Example 2

What is the slope of the graph of  y=3x^2 at (4,48)? We can do it "the hard (and imprecise) way", without using differentiation, as follows, using a calculator and using small differences below and above the given point:

When x=3.999, y=47.976003.

When x=4.001, y=48.024003.

Then the difference between the two values of x is \Delta x=0.002.

Then the difference between the two values of y is \Delta y=0.048.

Thus, the slope = \frac{\Delta y}{\Delta x} = 24 at the point of the graph at which x=4.

But, to solve the problem precisely, we compute

\lim_{\Delta x\to 0}\frac{3(4+\Delta x)^2-48}{\Delta x}\, = 3\lim_{\Delta x\to 0}\frac{(4+\Delta x)^2-16}{\Delta x}
= 3\lim_{\Delta x\to 0}\frac{16+8\Delta x+(\Delta x)^2-16}{\Delta x}
= 3\lim_{\Delta x\to 0}\frac{8\Delta x+(\Delta x)^2}{\Delta x}
= 3\lim_{\Delta x\to 0}(8+\Delta x)
= 3(8)
= 24.

We were lucky this time: the approximation we got above turned out to be exactly right. But this won't always be so, and in any case the exact computation required no calculator.

In general, the derivative of f(x)=3x^2 is

f'(x)\, = \lim_{\Delta x\to 0}\frac{3(x+\Delta x)^2-3x^2}{\Delta x}
= 3\lim_{\Delta x\to 0}\frac{(x+\Delta x)^2-x^2}{\Delta x}
= 3\lim_{\Delta x\to 0}\frac{x^2+2x\Delta x+(\Delta x)^2-x^2}{\Delta x}
= 3\lim_{\Delta x\to 0}\frac{2x\Delta x+(\Delta x)^2}{\Delta x}
= 3\lim_{\Delta x\to 0}(2x+\Delta x)
= 3(2x)
= 6x.
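As a sanity check, the result f'(x)=6x can be compared against a small numerical difference quotient at a few points; a sketch (the sample points are arbitrary):

```python
def approx_derivative(f, x, dx=1e-6):
    # central difference approximation to f'(x)
    return (f(x + dx) - f(x - dx)) / (2 * dx)

f = lambda x: 3 * x**2
for x in (-2.0, 0.0, 4.0):
    assert abs(approx_derivative(f, x) - 6 * x) < 1e-4  # matches f'(x) = 6x
```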

Example 3

If f(x) = \left|x\right| (the absolute value function) then f'(x) = \frac{x}{\left|x\right|}, which can also be stated as f'(x) = \left\{ \begin{matrix} -1, & x < 0 \\ \operatorname{undefined}, & x = 0 \\ 1, & x > 0 \end{matrix} \right. . Finding this derivative is a bit complicated, so we won't prove it at this point.


Here, f(x) is not smooth (though it is continuous) at x=0 and so the limits \lim_{x \to 0^{+}} f'(x) and \lim_{x \to 0^{-}} f'(x) (the limits as 0 is approached from the right and left respectively) are not equal. From the definition, f'(0)=\lim_{\Delta x\to 0}\frac{\left|\Delta x\right|}{\Delta x}, which does not exist. Thus, f'(0) is undefined, and so f'(x) has a discontinuity at 0. This sort of point of non-differentiability is called a cusp. Functions may also not be differentiable because they go to infinity at a point, or oscillate infinitely frequently.
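The one-sided difference quotients at 0 can be computed directly, showing why the two-sided limit fails; a minimal sketch:

```python
f = abs  # the absolute value function

# One-sided difference quotients at x = 0: they settle on different values,
# so the two-sided limit (the derivative) does not exist there.
for dx in (0.1, 0.001):
    right = (f(0 + dx) - f(0)) / dx    # quotient from the right -> 1
    left = (f(0 - dx) - f(0)) / (-dx)  # quotient from the left -> -1
    print(right, left)
```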

Understanding the derivative notation

The derivative notation is special and unique in mathematics. The most common notation for derivatives you'll run into when first starting out with differentiating is the Leibniz notation, expressed as  \frac{dy}{dx}. You may think of this as "rate of change in y with respect to x". You may also think of it as "infinitesimal value of y divided by infinitesimal value of x". Either way is a good way of thinking, although you should remember that the precise definition is the one we gave above. Often, in an equation, you will see just \frac{d}{dx}, which literally means "derivative with respect to x". This means we should take the derivative of whatever is written to the right; that is, \frac{d}{dx}(x+2) means \frac{dy}{dx} where y=x+2.

As you advance through your studies, you will see that we sometimes pretend that dy and dx are separate entities that can be multiplied and divided, by writing things like dy=x^4\,dx. Eventually you will see derivatives such as \frac {dx} {dy}, which just means that the input variable of our function is called y and our output variable is called x; sometimes, we will write  \frac{d}{dy}, to mean the derivative with respect to y of whatever is written on the right. In general, the variables could be anything, say \frac{d\theta}{dr}.

All of the following are equivalent for expressing the derivative of y = x^{2}

  • \frac{dy}{dx} = 2x
  • \frac{d}{dx} x^{2} = 2x
  • dy = 2x dx \
  • f '(x) = 2x \
  • D(f(x)) = 2x \

Exercises

2. Using the definition of the derivative, find the derivative of the function f(x)=2x+3.

2

3. Using the definition of the derivative, find the derivative of the function f(x)=x^3. Now try f(x)=x^4. Can you see a pattern? In the next section we will find the derivative of f(x)=x^n for all n.

\frac{d x^3}{dx}=3x^2\qquad\frac{d x^4}{dx}=4x^3

4. The text states that the derivative of \left|x\right| is not defined at x = 0. Use the definition of the derivative to show this.

\begin{alignat}{2}\lim_{\Delta x\to 0^-}\frac{\left|0+\Delta x\right|-\left|0\right|}{\Delta x}
&=\lim_{\Delta x\to 0^-}\frac{-\Delta x}{\Delta x}
& \qquad\lim_{\Delta x\to 0^+}\frac{\left|0+\Delta x\right|-\left|0\right|}{\Delta x}
&= \lim_{\Delta x\to 0^+}\frac{\Delta x}{\Delta x}\\
&=\lim_{\Delta x\to 0^-}-1
& &=\lim_{\Delta x\to 0^+}1\\
&=-1
& &=1
\end{alignat}
Since the limits from the left and the right at x=0 are not equal, the limit does not exist, so \left|x\right| is not differentiable at x=0.

5. Graph the derivative of y=4x^2 on a piece of graph paper without solving for  dy/dx . Then, solve for  dy/dx and graph that; compare the two graphs.
6. Use the definition of the derivative to show that the derivative of \sin x is \cos x . Hint: Use a suitable sum to product formula and the fact that \lim_{t \to 0}\frac{\sin(t)}{t}=1 and \lim_{t \to 0}\frac{\cos(t)-1}{t}=0.

\begin{align}\lim_{\Delta x\to 0}\frac{\sin(x+\Delta x)-\sin(x)}{\Delta x}
&=\lim_{\Delta x\to 0}\frac{(\sin(x)\cos(\Delta x)+\cos(x)\sin(\Delta x))-\sin(x)}{\Delta x}\\
&=\lim_{\Delta x\to 0}\frac{\sin(x)(\cos(\Delta x)-1)+\cos(x)\sin(\Delta x)}{\Delta x}\\
&=\sin(x)\cdot\lim_{\Delta x\to 0}\frac{\cos(\Delta x)-1}{\Delta x}+\cos(x)\cdot\lim_{\Delta x\to 0}\frac{\sin(\Delta x)}{\Delta x}\\
&=\sin(x)\cdot 0+\cos(x)\cdot 1\\
&=\cos(x)
\end{align}

Solutions

Differentiation Rules

The process of differentiation is tedious for complicated functions. Therefore, rules for differentiating general functions have been developed, and can be proved with a little effort. Once sufficient rules have been proved, it will be fairly easy to differentiate a wide variety of functions. Some of the simplest rules involve the derivative of linear functions.

Derivative of a constant function

For any fixed real number c,

\frac{d}{dx}\left[c\right]=0.

Intuition

The graph of the function f(x) = c is a horizontal line, which has a constant slope of zero. Therefore, it should be expected that the derivative of this function is zero, regardless of the values of x and c.

Proof

The definition of a derivative is

\lim_{\Delta x \to 0}\frac{f(x+\Delta x)-f(x)}{\Delta x}.

Let  f(x) = c for all x. (That is, f is a constant function.) Then  f(x+\Delta x) = c . Therefore

 \frac{d}{dx}\left[c\right] = \lim_{\Delta x \to 0} \frac{c-c}{\Delta x} = \lim_{\Delta x \to 0} \frac{0}{\Delta x}.

Let g(\Delta x)=\frac{0}{\Delta x}. To prove that \lim_{\Delta x\to 0}g(\Delta x)=0, we must show that, for any given positive \varepsilon, there is a positive \delta such that \left|g(\Delta x)-0\right|<\varepsilon whenever 0<\left|\Delta x-0\right|<\delta. But \left|g(\Delta x)-0\right|=0<\varepsilon always, so any choice of \delta works.

Examples

  1. \frac{d}{dx}\left[3\right]=0
  2. \frac{d}{dx}\left[z\right]=0

Note that, in the second example, z is just a constant.

Derivative of a linear function

For any fixed real numbers m and c,

\frac{d}{dx}\left[mx+c\right]=m

The special case \frac{dx}{dx} = 1 shows the advantage of the \frac{d}{dx} notation—rules are intuitive by basic algebra, though this does not constitute a proof, and it can lead to misconceptions about what dx and dy actually are.

Intuition

The graph of y=mx+c is a line with constant slope m.

Proof

If f(x)=mx+c, then f(x+\Delta x)=m(x+\Delta x)+c. So,

f'(x)\, = \lim_{\Delta x\to 0}\frac{m(x+\Delta x)+c-mx-c}{\Delta x}\,
= \lim_{\Delta x\to 0}\frac{m(x+\Delta x)-mx}{\Delta x}
= \lim_{\Delta x\to 0}\frac{mx+m\Delta x-mx}{\Delta x}\,
= \lim_{\Delta x\to 0}\frac{m\Delta x}{\Delta x}
= m.

Constant multiple and addition rules

Since we already know the rules for some very basic functions, we would like to be able to take the derivative of more complex functions by breaking them up into simpler functions. Two tools that let us do this are the constant multiple rule and the addition rule.

The Constant Rule

For any fixed real number c,

\frac{d}{dx}\left[cf(x)\right] = c \frac{d}{dx}\left[f(x)\right]

The reason, of course, is that one can factor c out of the numerator, and then out of the entire limit, in the definition. The details are left as an exercise.

Example

We already know that

\frac{d}{dx}\left[x^2\right]=2x.

Suppose we want to find the derivative of 3x^2

\frac{d}{dx}\left[3x^2\right] = 3\frac{d}{dx}\left[x^2\right]
= 3\times2x\,
= 6x\,

Another simple rule for breaking up functions is the addition rule.

The Addition and Subtraction Rules

\frac{d}{dx}\left[f(x)\pm g(x)\right]= \frac{d}{dx}\left[f(x)\right]\pm\frac{d}{dx}\left[g(x)\right]

Proof

From the definition:

 \lim_{\Delta x \to 0}\left[\frac{[f(x+\Delta x) \pm g(x + \Delta x)] - [f(x) \pm g(x)]}{\Delta x}\right]

 = \lim_{\Delta x \to 0} 
   \left[\frac{[f(x+\Delta x) - f(x)] \pm [g(x + \Delta x) - g(x)]}{\Delta x}\right]

 = \lim_{\Delta x \to 0} \left[\frac{[f(x+\Delta x) - f(x)]}{\Delta x}\right]
  \pm \lim_{\Delta x \to 0} \left[\frac{[g(x+\Delta x) - g(x)]}{\Delta x}\right]

By definition then, this last term is  \frac{d}{dx} \left[f(x)\right] \pm \frac{d}{dx}\left[g(x)\right]

Example

What is the derivative of 3x^2+5x?

\frac{d}{dx}\left[3x^2+5x\right] = \frac{d}{dx}\left[3x^2\right]+\frac{d}{dx}\left[5x\right]
= 6x+\frac{d}{dx}\left[5x\right]
= 6x+5\,

The fact that both of these rules work is extremely significant mathematically because it means that differentiation is linear. You can take an equation, break it up into terms, figure out the derivative individually and build the answer back up, and nothing odd will happen.

We now need only one more piece of information before we can take the derivatives of any polynomial.

The Power Rule

\frac{d}{dx}\left[x^n\right]=nx^{n-1}

For example, in the case of x^2 the derivative is 2x^1=2x as was established earlier. A special case of this rule is that dx/dx=dx^1/dx=1x^0=1.

Since polynomials are sums of monomials, using this rule and the addition rule lets you differentiate any polynomial. A relatively simple proof for this can be derived from the binomial expansion theorem.

This rule also applies to fractional and negative powers. Therefore

\frac{d}{dx}\left[\sqrt x \right] = \frac{d}{dx}\left[ x^{1/2}\right]
= \frac 1 2 x^{-1/2}
= \frac 1 {2\sqrt x}
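This fractional-power case can also be checked numerically against \frac{1}{2\sqrt x}; a sketch (the sample points are arbitrary, chosen positive so the square root is defined):

```python
import math

def approx_derivative(f, x, dx=1e-6):
    # central difference approximation to f'(x)
    return (f(x + dx) - f(x - dx)) / (2 * dx)

for x in (0.25, 1.0, 9.0):
    assert abs(approx_derivative(math.sqrt, x) - 1 / (2 * math.sqrt(x))) < 1e-6
```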

Derivatives of polynomials

With these rules in hand, you can now find the derivative of any polynomial you come across. Rather than write the general formula, let's go step by step through the process.

\frac{d}{dx}\left[6x^5+3x^2+3x+1\right]

The first thing we can do is to use the addition rule to split the equation up into terms:

\frac{d}{dx}\left[6x^5\right]+\frac{d}{dx}\left[3x^2\right]+\frac{d}{dx}\left[3x\right]+\frac{d}{dx}\left[1\right].

We can immediately use the linear and constant rules to get rid of some terms:

\frac{d}{dx}\left[6x^5\right]+\frac{d}{dx}\left[3x^2\right]+3+0.

Now you may use the constant multiplier rule to move the constants outside the derivatives:

6\frac{d}{dx}\left[x^5\right]+3\frac{d}{dx}\left[x^2\right]+3.

Then use the power rule to work with the individual monomials:

6\left(5x^4\right)+3\left(2x\right)+3.

And then do some algebra to get the final answer:

30x^4+6x+3.\,
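The step-by-step answer can be verified against a difference quotient; a quick sketch (the evaluation points are arbitrary):

```python
def approx_derivative(f, x, dx=1e-6):
    # central difference approximation to f'(x)
    return (f(x + dx) - f(x - dx)) / (2 * dx)

p = lambda x: 6 * x**5 + 3 * x**2 + 3 * x + 1   # original polynomial
dp = lambda x: 30 * x**4 + 6 * x + 3            # derivative found above
for x in (-1.0, 0.0, 2.0):
    assert abs(approx_derivative(p, x) - dp(x)) < 1e-3
```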

These are not the only differentiation rules. There are other, more advanced, differentiation rules, which will be described in a later chapter.

Exercises

  • Find the derivatives of the following equations:
7.  f(x) = 42

f'(x)=0

8.  f(x) = 6x + 10

f'(x)=6

9.  f(x) = 2x^2 + 12x + 3

f'(x)=4x+12

Solutions


<h1> 3.2 Product and Quotient Rules</h1>


Product Rule

When we wish to differentiate a more complicated expression such as

h(x) = (x^2+5x + 7) \cdot (x^3 + 2x - 4),

our only way (up to this point) to differentiate the expression is to expand it into a polynomial and then differentiate that polynomial. This method becomes very complicated and is particularly error-prone when doing calculations by hand. A beginner might guess that the derivative of a product is the product of the derivatives, similar to the sum and difference rules, but this is not true. To take the derivative of a product, we use the product rule.

Derivatives of products (Product Rule)

\frac{d}{dx}\left[ f(x) \cdot g(x) \right] = f(x) \cdot g'(x)+f'(x) \cdot g(x)\,\!

It may also be stated as

(f\cdot g)'=f'\cdot g+f\cdot g' \,\!

or in the Leibniz notation as

\dfrac{d}{dx}(u\cdot v)=u\cdot \dfrac{dv}{dx}+v\cdot \dfrac{du}{dx}.

The derivative of the product of three functions is:

\dfrac{d}{dx}(u\cdot v \cdot w)=\dfrac{du}{dx} \cdot v \cdot w + u \cdot \dfrac{dv}{dx} \cdot w + u\cdot v\cdot \dfrac{dw}{dx}.

Since the product of two or more functions occurs in many mathematical models of physical phenomena, the product rule has broad application in physics, chemistry, and engineering.

Examples

  • Suppose one wants to differentiate f(x) = x^2 \sin(x). By using the product rule, one gets the derivative f'(x) = 2x\sin(x) + x^2\cos(x) (since the derivative of x^2 is 2x and the derivative of \sin(x) is \cos(x)).
  • One special case of the product rule is the constant multiple rule, which states: if c is a real number and f(x) is a differentiable function, then (c\cdot f)(x) is also differentiable, and its derivative is (c\cdot f)'(x) = c\cdot f'(x). This follows from the product rule since the derivative of any constant is zero. This, combined with the sum rule for derivatives, shows that differentiation is linear.

Physics Example I: rocket acceleration

The acceleration of model rockets can be computed with the product rule.

Consider the vertical acceleration of a model rocket relative to its initial position at a fixed point on the ground. Newton's second law says that the force is equal to the time rate change of momentum. If F is the net force (sum of forces), p is the momentum, and t is the time,


\begin{align}
F = \frac{dp}{dt}.
\end{align}

Since the momentum is equal to the product of mass and velocity, this yields


\begin{align}
F = \frac{d}{dt}\left( mv \right),
\end{align}

where m is the mass and v is the velocity. Application of the product rule gives


\begin{align}
F = v\frac{dm}{dt} + m\frac{dv}{dt}.
\end{align}

Since the acceleration, a, is defined as the time rate change of velocity, a = dv/dt,


\begin{align}
F = v\frac{dm}{dt} + ma.
\end{align}

Solving for the acceleration,


\begin{align}
a= \frac{F - v\frac{dm}{dt}}{m}.
\end{align}

Since the rocket is losing mass, dm/dt is negative, and the changing mass term results in increased acceleration.[1][2]

Physics Example II: electromagnetic induction

Faraday's law of electromagnetic induction states that the induced electromotive force is the negative time rate of change of magnetic flux through a conducting loop.

 \mathcal{E} = -{{d\Phi_B} \over dt},

where \mathcal{E} is the electromotive force (emf) in volts and ΦB is the magnetic flux in webers. For a loop of area, A, in a magnetic field, B, the magnetic flux is given by

 \Phi_B = B\cdot A \cdot \cos(\theta),

where θ is the angle between the normal to the current loop and the magnetic field direction.

Taking the negative derivative of the flux with respect to time gives the electromotive force:


\begin{align}
\mathcal{E} &= -\frac{d}{dt} \left( B\cdot A \cdot \cos(\theta) \right) \\
&= -\frac{dB}{dt} \cdot A \cos(\theta) -B \cdot \frac{dA}{dt} \cos(\theta)- B \cdot A \frac{d}{dt}\cos(\theta)\\
\end{align}

In many cases of practical interest only one variable (A, B, or θ) is changing, so two of the three above terms are often zero.

Proof of the Product Rule

Proving this rule is relatively straightforward, first let us state the equation for the derivative:

\frac{d}{dx} \left[ f(x) \cdot g(x) \right] = \lim_{h \to 0} \frac{ f(x+h)\cdot g(x+h) - f(x) \cdot g(x)}{h}

We will then apply one of the oldest tricks in the book—adding a term that cancels itself out to the middle:

\frac{d}{dx} \left[ f(x) \cdot g(x) \right] = \lim_{h \to 0} \frac{ f(x+h)\cdot g(x+h) \mathbf{- f(x+h) \cdot g(x) + f(x+h) \cdot g(x)} - f(x) \cdot g(x)}{h}

Notice that those terms sum to zero, and so all we have done is add 0 to the equation. Now we can split the equation up into forms that we already know how to solve:

\frac{d}{dx} \left[ f(x) \cdot g(x) \right] = \lim_{h \to 0} \left[ \frac{ f(x+h)\cdot g(x+h) - f(x+h) \cdot g(x) }{h} + \frac{f(x+h) \cdot g(x) - f(x) \cdot g(x)}{h} \right]

Looking at this, we see that we can separate the common terms out of the numerators to get:

\frac{d}{dx} \left[ f(x) \cdot g(x) \right] = \lim_{h \to 0} \left[ f(x+h) \frac{ g(x+h) - g(x) }{h} + g(x) \frac{f(x+h) - f(x)}{h} \right]

Which, when we take the limit, becomes:

\frac{d}{dx} \left[ f(x) \cdot g(x) \right] = f(x) \cdot g'(x) + g(x) \cdot f'(x) , or the mnemonic "one D-two plus two D-one"

This can be extended to 3 functions:

\frac{d}{dx}[fgh] = f(x) g(x) h'(x) + f(x) g'(x) h(x) + f'(x) g(x) h(x) \,

For any number of functions, the derivative of their product is the sum, for each function, of its derivative times each other function.

Back to our original example of a product, h(x) = (x^2+5x + 7) \cdot (x^3 + 2x - 4), we find the derivative by the product rule is

 h'(x) = (x^2+5x+7)(3x^2+2) + (2x+5)(x^3+2x-4) = 5x^4+20x^3+27x^2+12x-6\,

Note, its derivative would not be

{\color{red}(2x+5) \cdot (3x^2+2) = 6x^3+15x^2+4x+10}

which is what you would get if you assumed the derivative of a product is the product of the derivatives.
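The correct answer from the product rule, and the failure of the naive guess, can both be checked numerically; a sketch at one arbitrary sample point:

```python
def approx_derivative(F, x, dx=1e-6):
    # central difference approximation to F'(x)
    return (F(x + dx) - F(x - dx)) / (2 * dx)

f = lambda x: x**2 + 5 * x + 7
g = lambda x: x**3 + 2 * x - 4
df = lambda x: 2 * x + 5       # f'(x)
dg = lambda x: 3 * x**2 + 2    # g'(x)

x = 1.5
product_rule = f(x) * dg(x) + df(x) * g(x)
naive_guess = df(x) * dg(x)
numeric = approx_derivative(lambda t: f(t) * g(t), x)
assert abs(numeric - product_rule) < 1e-3   # product rule agrees
assert abs(numeric - naive_guess) > 1.0     # product of derivatives does not
```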

To apply the product rule, we multiply the first function by the derivative of the second and add to that the derivative of the first function multiplied by the second function. Sometimes it helps to memorize the phrase "First times the derivative of the second plus the second times the derivative of the first."

Application: proof of the Power Rule

The product rule can be used to give a proof of the power rule for whole numbers. The proof proceeds by mathematical induction. We begin with the base case n=1. If  f_1(x) = x then from the definition it is easy to see that

 
f_1'(x) = \lim _{ h \rightarrow 0} \frac {x+h - x} h = 1

Next we suppose that, for a fixed value of N, we know that for  f_N(x) = x^N, f_N'(x) = Nx^{N-1}. Consider the derivative of  f_{N+1}(x)= x^{N+1},

 f_{N+1}'(x) = ( x \cdot x^N) ' = (x)' x^N + x\cdot(x^N)'= x^N + x\cdot N\cdot x^{N-1} = (N+1)x^N.

We have shown that the statement f_n'(x)=n\cdot x^{n-1} is true for n=1 and that if this statement holds for  n=N, then it also holds for n=N+1. Thus by the principle of mathematical induction, the statement must hold for n=1, 2, \dots.

Quotient Rule

There is a similar rule for quotients. To prove it, we go to the definition of the derivative:

\begin{align}
\frac{d}{dx} \frac{f(x)}{g(x)} &=  \lim_{h \to 0} \frac{\frac{f(x + h)}{g(x + h)} - \frac{f(x)}{g(x)}}{h} \\
                               &= \lim_{h \to 0} \frac{f(x + h)g(x) - f(x)g(x + h)}{h g(x) g(x + h)} \\
                               &= \lim_{h \to 0} \frac{f(x + h)g(x) - f(x)g(x) + f(x)g(x) - f(x)g(x + h)}{h g(x) g(x + h)} \\
                               &= \lim_{h \to 0} \frac{g(x)\frac{f(x + h) - f(x)}{h} - f(x)\frac{g(x + h) - g(x)}{h}}{g(x) g(x + h)} \\
                               &= \frac{g(x)f'(x) - f(x) g'(x)}{g(x)^2}
\end{align}

This leads us to the so-called "quotient rule":

Derivatives of quotients (Quotient Rule)

 \frac{d}{dx} \left[{f(x)\over g(x)}\right] = \frac{g(x)f'(x) - f(x)g'(x)}{g(x)^2}\,\!

Some people remember this rule with the mnemonic "low D-high minus high D-low, over the square of what's below!"

Examples

The derivative of (4x - 2)/(x^2 + 1) is:

\begin{align}
\frac{d}{dx}\left[\frac{(4x - 2)}{x^2 + 1}\right] &= \frac{(x^2 + 1)(4) - (4x - 2)(2x)}{(x^2 + 1)^2} \\
                                                  &= \frac{(4x^2 + 4) - (8x^2 - 4x)}{(x^2 + 1)^2} \\
                                                  &= \frac{-4x^2 + 4x + 4}{(x^2 + 1)^2}
\end{align}
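This worked example can be confirmed with a difference quotient; a sketch (the sample points are arbitrary):

```python
def approx_derivative(F, x, dx=1e-6):
    # central difference approximation to F'(x)
    return (F(x + dx) - F(x - dx)) / (2 * dx)

h = lambda x: (4 * x - 2) / (x**2 + 1)                  # the quotient
dh = lambda x: (-4 * x**2 + 4 * x + 4) / (x**2 + 1)**2  # derivative found above
for x in (-1.0, 0.0, 2.0):
    assert abs(approx_derivative(h, x) - dh(x)) < 1e-6
```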

Remember: the derivative of a product/quotient is not the product/quotient of the derivatives. (That is, differentiation does not distribute over multiplication or division.) However, one can distribute before taking the derivative. That is, \frac{d}{dx}\left((a+b)\times(c+d)\right) = \frac{d}{dx}\left(ac+ad+bc+bd\right).


References

  1. Chandler, David (October 2000). "Newton's Second Law for Systems with Variable Mass". The Physics Teacher 38 (7): 396. 
  2. Courtney, Michael; Courtney, Amy. "Measuring thrust and predicting trajectory in model rocketry". arΧiv:0903.1555. 

<h1> 3.3 Derivatives of Trigonometric Functions</h1>


Sine, cosine, tangent, cosecant, secant, cotangent. These are functions that crop up continuously in mathematics and engineering and have a lot of practical applications. They also appear in more advanced mathematics, particularly when dealing with things such as line integrals with complex numbers and alternate representations of space like spherical and cylindrical coordinate systems.

We use the definition of the derivative, i.e.,

f'(x) = \lim_{h \to 0}\frac{f(x+h)-f(x)}{h},

to work these first two out.

Let us find the derivative of sin x, using the above definition.

 f(x) = \sin{x} \, \!
 f'(x) = \lim_{h \to 0}{\sin(x+h)-\sin{x} \over h} Definition of derivative
 = \lim_{h \to 0}{\cos(x)\sin(h)+\cos(h)\sin(x) - \sin(x) \over h} trigonometric identity
 = \lim_{h \to 0}{\cos(x)\sin(h)+(\cos(h) - 1)\sin(x) \over h} factoring
 = \lim_{h \to 0}{\cos(x)\sin(h) \over h}  +\lim_{h \to 0}{(\cos(h) - 1)\sin(x) \over h} separation of terms
 = \cos{x} \, \! \times 1 + \sin{x} \, \! \times 0 application of limit
 = \cos{x} \, \! solution

Now for the case of cos x.

 f(x) = \cos{x} \, \!
 f'(x) = \lim_{h \to 0}{\cos(x+h)-\cos{x} \over h} Definition of derivative
 = \lim_{h \to 0}{\cos(x)\cos(h)-\sin(h)\sin(x) - \cos(x) \over h} trigonometric identity
 = \lim_{h \to 0}{\cos(x)(\cos(h) - 1)-\sin(x)\sin(h) \over h} factoring
 =\lim_{h \to 0}{\cos(x)(\cos(h) - 1) \over h}  - \lim_{h \to 0}{\sin(x)\sin(h) \over h} separation of terms
 = \cos{x} \, \! \times 0 - \sin{x} \, \! \times 1 application of limit
 = -\sin{x} \, \! solution

Therefore we have established

Derivative of Sine and Cosine

\frac{d}{dx} \sin(x) = \cos(x)\,\!
\frac{d}{dx} \cos(x) = -\sin(x)\,\!
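Both boxed results can be checked against numerical difference quotients; a minimal sketch (the sample angles are arbitrary):

```python
import math

def approx_derivative(f, x, dx=1e-6):
    # central difference approximation to f'(x)
    return (f(x + dx) - f(x - dx)) / (2 * dx)

for x in (0.0, 1.0, math.pi / 3):
    assert abs(approx_derivative(math.sin, x) - math.cos(x)) < 1e-6
    assert abs(approx_derivative(math.cos, x) + math.sin(x)) < 1e-6
```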


To find the derivative of the tangent, we just remember that:

\tan(x) = \frac{\sin(x)}{\cos(x)}

which is a quotient. Applying the quotient rule, we get:

\frac{d}{dx} \tan(x) = \frac{\cos^2(x) + \sin^2(x)}{\cos^2(x)}

Then, remembering that \cos^2(x) + \sin^2(x) = 1, we simplify:

\frac{\cos^2(x) + \sin^2(x)}{\cos^2(x)} =\frac{1}{\cos^2(x)}
=\sec^2(x)\,


Derivative of the Tangent

\frac{d}{dx}  \tan(x) = \sec^2(x)\,\!

For secants, we again apply the quotient rule.

\sec(x) = \frac{1}{\cos(x)}
\begin{align}\frac{d}{dx} \sec(x)&=\frac{d}{dx}\frac{1}{\cos(x)}\\
&=\frac{\cos(x)\frac{d 1}{dx}-1\frac{d \cos(x)}{dx}}{\cos(x)^2}\\
&=\frac{\cos(x)0-1(-\sin(x))}{\cos(x)^2}
\end{align}

Leaving us with:

\frac{d}{dx} \sec(x) = \frac{\sin(x)}{\cos^2(x)}

Simplifying, we get:


Derivative of the Secant

\frac{d}{dx} \sec(x) = \sec(x) \tan(x)\,\!

Using the same procedure on cosecants:

\csc(x) = \frac{1}{\sin(x)}

We get:


Derivative of the Cosecant

\frac{d}{dx} \csc(x) = -\csc(x) \cot(x)\,\!

Using the same procedure for the cotangent that we used for the tangent, we get:


Derivative of the Cotangent

\frac{d}{dx} \cot(x) = -\csc^2(x) \,\!


<h1> 3.4 Chain Rule</h1>


The chain rule is a method to compute the derivative of the functional composition of two or more functions.

If a function f depends on a variable u, which in turn depends on another variable x, so that f(x) = y(u(x)), then the rate of change of f with respect to x can be computed as the rate of change of y with respect to u multiplied by the rate of change of u with respect to x.

Chain Rule

If a function f is the composition of two differentiable functions y(x) and u(x), so that f(x) = y(u(x)), then f(x) is differentiable and,

\frac {df}{dx} = \frac {dy} { du} \cdot\frac {du}{ dx}\,\!

The method is called the "chain rule" because it can be applied sequentially to as many functions as are nested inside one another.[1] For example, if f is a function of g which is in turn a function of h, which is in turn a function of x, that is

f(g(h(x))),

the derivative of f with respect to x is given by

 \frac{df}{dx} = \frac{df}{dg} \cdot \frac{dg}{dh} \cdot \frac{dh}{dx} and so on.

A useful mnemonic is to think of the differentials as individual entities that you can cancel algebraically, such as

\frac{df}{dx} = \frac{df}{\cancel{dg}} \cdot \frac{\cancel{dg}}{\cancel{dh}} \cdot \frac{\cancel{dh}}{dx}

However, keep in mind that this trick comes about through a clever choice of notation rather than through actual algebraic cancellation.

The chain rule has broad applications in physics, chemistry, and engineering, as well as being used to study related rates in many disciplines. The chain rule can also be generalized to multiple variables in cases where the nested functions depend on more than one variable.

Examples

Example I

Suppose that a mountain climber ascends at a rate of 0.5 kilometer per hour. The temperature is lower at higher elevations; suppose the rate by which it decreases is 6 °C per kilometer. To calculate the decrease in air temperature per hour that the climber experiences, one multiplies 6 °C per kilometer by 0.5 kilometer per hour, to obtain 3 °C per hour. This calculation is a typical chain rule application.

Example II

Consider the function f(x) = (x^2 + 1)^3. It follows from the chain rule that

f(x)  = (x^2+1)^3 Function to differentiate
u(x)  = x^2+1 Define u(x) as inside function
f(x)  = [u(x)]^3 Express f(x) in terms of u(x)
\frac{df}{dx}  = \frac{df}{du} \cdot \frac {du}{dx} Express chain rule applicable here
\frac{df}{dx}  = \frac{d}{du}u^3 \cdot\frac {d}{dx}(x^2+1) Substitute in f(u) and u(x)
\frac{df}{dx} = 3u^2 \cdot 2x Compute derivatives with power rule
\frac{df}{dx} = 3(x^2+1)^2 \cdot 2x Substitute u(x) back in terms of x
 \frac{df}{dx} = 6x(x^2+1)^2 Simplify.
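The result 6x(x^2+1)^2 can be checked numerically; a sketch (the sample points are arbitrary):

```python
def approx_derivative(F, x, dx=1e-6):
    # central difference approximation to F'(x)
    return (F(x + dx) - F(x - dx)) / (2 * dx)

f = lambda x: (x**2 + 1)**3
df = lambda x: 6 * x * (x**2 + 1)**2  # result from the chain rule
for x in (-2.0, 0.5, 3.0):
    assert abs(approx_derivative(f, x) - df(x)) < 1e-2
```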

Example III

In order to differentiate the trigonometric function

f(x) = \sin(x^2),\,

one can write:

f(x) = \sin(x^2) Function to differentiate
u(x)  = x^2 Define u(x) as inside function
f(x) = \sin(u) Express f(x) in terms of u(x)
\frac{df}{dx}  = \frac{df}{du} \cdot \frac {du}{dx} Express chain rule applicable here
\frac{df}{dx}  = \frac{d}{du}\sin(u) \cdot\frac {d}{dx}(x^2) Substitute in f(u) and u(x)
\frac{df}{dx} = \cos(u) \cdot 2x Evaluate derivatives
\frac{df}{dx} = \cos(x^2)(2x) Substitute u in terms of x.

Example IV: absolute value

The chain rule can be used to differentiate \left|x\right|, the absolute value function:

f(x)  = \left|x\right| Function to differentiate
f(x)  = \sqrt{x^2} Equivalent function
u(x)  = x^2 Define u(x) as inside function
f(x)  = [u(x)]^\frac{1}{2} Express f(x) in terms of u(x)
\frac{df}{dx}  = \frac{df}{du} \cdot \frac {du}{dx} Express chain rule applicable here
\frac{df}{dx}  = \frac{d}{du}u^\frac{1}{2} \cdot\frac {d}{dx}(x^2) Substitute in f(u) and u(x)
\frac{df}{dx} = \frac{1}{2}u^{-\frac{1}{2}} \cdot 2x Compute derivatives with power rule
\frac{df}{dx} = \frac{1}{2}\left(x^2\right)^{-\frac{1}{2}} \cdot 2x Substitute u(x) back in terms of x
 \frac{df}{dx} = \frac{x}{\sqrt{x^2}} Simplify
 \frac{df}{dx} = \frac{x}{\left|x\right|} Express \sqrt{x^2} as absolute value.

Example V: three nested functions

The method is called the "chain rule" because it can be applied sequentially to as many functions as are nested inside one another. For example, if f(g(h(x))) = e^{\sin(x^2)}, sequential application of the chain rule yields the derivative as follows (we make use of the fact that \frac{de^x}{dx}=e^x, which will be proved in a later section):

f(x)  = e^{\sin(x^2)} = e^g Original (outermost) function
h(x)  = x^2 Define h(x) as innermost function
g(x)  = \sin(h) = \sin(x^2) g(h) = sin(h) as middle function
\frac{df}{dx}  = \frac{df}{dg} \cdot \frac{dg}{dh} \cdot \frac{dh}{dx} Express chain rule applicable here
\frac{df}{dg} = e^g = e^{\sin(x^2)} Differentiate f(g)[2]
\frac{dg}{dh} = \cos(h) = \cos(x^2) Differentiate g(h)
\frac{dh}{dx} = 2x Differentiate h(x)
 \frac{d}{dx} e^{\sin(x^2)} = e^{\sin(x^2)}\cdot \cos(x^2)\cdot 2x Substitute into chain rule.

Chain Rule in Physics

Because one physical quantity often depends on another, which in turn depends on others, the chain rule has broad applications in physics. This section presents examples of the chain rule in kinematics and simple harmonic motion. The chain rule is also useful in electromagnetic induction.

Physics Example I: relative kinematics of two vehicles

One vehicle is headed north and currently located at (0,3); the other vehicle is headed west and currently located at (4,0). The chain rule can be used to find whether they are getting closer or further apart.

For example, one can consider the kinematics problem where one vehicle is heading west toward an intersection at 80 miles per hour while another is heading north away from the intersection at 60 miles per hour. One can ask whether the vehicles are getting closer or further apart and at what rate at the moment when the northbound vehicle is 3 miles north of the intersection and the westbound vehicle is 4 miles east of the intersection.

Big idea: use chain rule to compute rate of change of distance between two vehicles.

Plan:

  1. Choose coordinate system
  2. Identify variables
  3. Draw picture
  4. Big idea: use chain rule to compute rate of change of distance between two vehicles
  5. Express c in terms of x and y via Pythagorean theorem
  6. Express dc/dt using chain rule in terms of dx/dt and dy/dt
  7. Substitute in x, y, dx/dt, dy/dt
  8. Simplify.

Choose coordinate system: Let the y-axis point north and the x-axis point east.

Identify variables: Define y(t) to be the distance of the vehicle heading north from the origin and x(t) to be the distance of the vehicle heading west from the origin.

Express c in terms of x and y via Pythagorean theorem:


c = (x^2 + y^2)^{1/2}

Express dc/dt using chain rule in terms of dx/dt and dy/dt:

\frac{dc}{dt} = \frac{d}{dt}(x^2 + y^2)^{1/2} Apply derivative operator to entire function
= \frac{1}{2}(x^2 + y^2)^{-1/2}\frac{d}{dt}(x^2 + y^2) Sum of squares is inside function
=\frac{1}{2}(x^2 + y^2)^{-1/2}\left[\frac{d}{dt}(x^2) + \frac{d}{dt}(y^2) \right] Distribute differentiation operator
= \frac{1}{2}(x^2 + y^2)^{-1/2}\left[ 2x\frac{dx}{dt} + 2y\frac{dy}{dt}\right] Apply chain rule to x(t) and y(t)
= \frac{x\frac{dx}{dt} + y\frac{dy}{dt}}{\sqrt{x^2 + y^2}} Simplify.


Substitute in x = 4 mi, y = 3 mi, dx/dt = −80 mi/hr, dy/dt = 60 mi/hr and simplify


\begin{align}
\frac{dc}{dt} & = \frac{4 mi \cdot (-80 mi/hr) + 3 mi \cdot (60 mi/hr)}{\sqrt{(4 mi)^2 + (3 mi)^2}}\\
& = \frac{-320 mi^2/hr + 180 mi^2/hr}{5 mi}\\
&= \frac{-140 mi^2/hr}{5 mi}\\
& = -28 mi/hr\\
\end{align}

Consequently, the two vehicles are getting closer together at a rate of 28 mi/hr.
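The substitution step is plain arithmetic and can be reproduced directly; a sketch of the computation (values in miles and miles per hour, as above):

```python
import math

x, y = 4.0, 3.0           # positions in miles
dxdt, dydt = -80.0, 60.0  # velocities in mi/hr (westbound is negative)

# dc/dt = (x dx/dt + y dy/dt) / sqrt(x^2 + y^2)
dcdt = (x * dxdt + y * dydt) / math.sqrt(x**2 + y**2)
print(dcdt)  # -28.0: the vehicles are closing at 28 mi/hr
```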

Physics Example II: harmonic oscillator

An undamped spring-mass system is a simple harmonic oscillator.

If the displacement of a simple harmonic oscillator from equilibrium is given by x, and it is released from its maximum displacement A at time t = 0, then the position at later times is given by

 x(t) = A \cos(\omega t),

where ω = 2 π/T is the angular frequency and T is the period of oscillation. The velocity, v, being the first time derivative of the position, can be computed with the chain rule:

v(t) = \frac{dx}{dt} Definition of velocity in one dimension
  = \frac{d}{dt} A \cos(\omega t) Substitute x(t)
 = A \frac{d}{dt} \cos(\omega t) Bring constant A outside of derivative
 = A (-\sin(\omega t)) \frac{d}{dt}(\omega t) Differentiate outside function (cosine)
 = -A \sin(\omega t) \frac{d}{dt} (\omega t) Bring negative sign in front
 = -A \sin(\omega t) \omega Evaluate remaining derivative
v(t) = -\omega A \sin(\omega t). Simplify.

The acceleration is then the second time derivative of position, or simply dv/dt.

a(t) = \frac{dv}{dt} Definition of acceleration in one dimension
  = \frac{d}{dt} (-\omega A \sin(\omega t) ) Substitute v(t)
 = -\omega A \frac{d}{dt} \sin(\omega t) Bring constant term outside of derivative
 = -\omega A \cos(\omega t) \frac{d}{dt} (\omega t) Differentiate outside function (sine)
 = -\omega A \cos(\omega t) \omega Evaluate remaining derivative
a(t) = -\omega^2 A \cos(\omega t).  Simplify.

From Newton's second law, F = ma, where F is the net force and m is the object's mass.

F = ma Newton's second law
  = m (-\omega^2 A \cos(\omega t)) Substitute a(t)
 = -m\omega^2  A \cos(\omega t) Simplify
 F = -m\omega^2  x(t). Substitute original x(t).

Thus it can be seen that these results are consistent with the observation that the force on a simple harmonic oscillator is a negative constant times the displacement.
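The oscillator derivatives above can be verified symbolically. A sketch, assuming the sympy library:

```python
import sympy as sp

t, A, omega, m = sp.symbols('t A omega m', positive=True)

x = A * sp.cos(omega * t)   # position of the oscillator
v = sp.diff(x, t)           # velocity = dx/dt
a = sp.diff(v, t)           # acceleration = dv/dt
F = m * a                   # Newton's second law

# F should equal -m omega^2 x: a negative constant times the displacement
residual = sp.simplify(F + m * omega**2 * x)
print(residual)  # 0
```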

Chain Rule in Chemistry

The chain rule has many applications in chemistry because many equations in chemistry describe how one physical quantity depends on another, which in turn depends on another. For example, the ideal gas law describes the relationship between pressure, volume, temperature, and number of moles, all of which can also depend on time.

Chemistry Example I: Ideal Gas Law

Isotherms of an ideal gas. The curved lines represent the relationship between pressure and volume for an ideal gas at different temperatures: lines which are further away from the origin (that is, lines that are nearer to the top right-hand corner of the diagram) represent higher temperatures.

Suppose a sample of n moles of an ideal gas is held in an isothermal (constant temperature, T) chamber with initial volume V0. The ideal gas is compressed by a piston so that its volume changes at a constant rate so that V(t) = V0 - kt, where t is the time. The chain rule can be employed to find the time rate of change of the pressure.[3] The ideal gas law can be solved for the pressure, P to give:

P(t) = \frac{ n R T}{V(t)},

where P(t) and V(t) have been written as explicit functions of time and the other symbols are constant. Differentiating both sides yields

\frac{dP(t)}{dt} = n R T \frac{ d}{dt} \left( \frac{1}{V(t)}\right),

where the constant terms, n, R, and T, have been moved to the left of the derivative operator. Applying the chain rule gives

\frac{dP}{dt} = n R T \frac{ d}{dV} \left( \frac{1}{V(t)}\right)\frac{dV}{dt} = n R T \left(- \frac{1}{V^2}\right) \frac{dV}{dt},

where the power rule has been used to differentiate 1/V. Since V(t) = V0 - kt, dV/dt = -k. Substituting in for V and dV/dt yields dP/dt.

\frac{dP}{dt} = n R T k \left( \frac{1}{(V_0-k t)^2}\right)
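Note that the result is positive: compressing the gas (dV/dt = -k < 0) raises its pressure. Differentiating P(t) = nRT/V(t) directly confirms this; a sketch assuming sympy:

```python
import sympy as sp

t, n, R, T, V0, k = sp.symbols('t n R T V_0 k', positive=True)

V = V0 - k*t        # volume decreasing at the constant rate k
P = n*R*T / V       # ideal gas law solved for pressure

dPdt = sp.diff(P, t)
residual = sp.simplify(dPdt - n*R*T*k/(V0 - k*t)**2)
print(residual)  # 0
```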

Chemistry Example II: Kinetic Theory of Gases

The temperature of an ideal monatomic gas is a measure of the average kinetic energy of its atoms. The size of helium atoms relative to their spacing is shown to scale under 1950 atmospheres of pressure. The atoms have a certain average speed, slowed down here two trillion fold from room temperature.

A second application of the chain rule in chemistry is finding the rate of change of the average molecular speed, v, in an ideal gas as the absolute temperature, T, increases at a constant rate so that T = T0 + at, where T0 is the initial temperature and t is the time.[3] The kinetic theory of gases relates the root mean square of the molecular speed to the temperature, so that if v(t) and T(t) are functions of time,

v(t) = \left( \frac{3 R T(t)}{M} \right) ^{1 \over 2},

where R is the ideal gas constant, and M is the molecular weight.

Differentiating both sides with respect to time yields:

\frac{d}{dt} v(t) = \frac{d}{dt} \left( \frac{3 R T(t)}{M} \right) ^{1 \over 2}.

Using the chain rule to express the right side in terms of derivatives with respect to temperature, T, and time, t, respectively gives

\frac{dv}{dt} = \frac{d}{dT} \left( \frac{3 R T}{M} \right)^{\frac{1}{2}}\frac{dT}{dt}.

Evaluating the derivative with respect to temperature, T, yields

\frac{dv}{dt} = \frac{1}{2} \left( \frac{3 R T}{M} \right)^{-\frac{1}{2}} \frac{d}{dT}\left( \frac{3RT}{M}\right) \frac{dT}{dt}.

Evaluating the remaining derivative with respect to T, taking the reciprocal of the negative power, and substituting T = T0 + at, produces

\frac{dv}{dt} = \frac{1}{2} \left( \frac{M}{ 3 R (T_0 + a t)} \right) ^{\frac{1}{2}}\frac{3R}{M}\frac{d}{dt}\left( T_0 + at \right).

Evaluating the derivative with respect to t yields

\frac{dv}{dt} = \frac{1}{2} \left( \frac{M}{ 3 R (T_0 + a t)} \right) ^{\frac{1}{2}}\frac{3R}{M} a.

which simplifies to

\frac{dv}{dt} = \frac{a}{2} \left( \frac{3R}{M (T_0 + a t)} \right) ^{\frac{1}{2}}.
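This chain of substitutions can be verified in one step by differentiating v(t) directly. A sketch, assuming sympy:

```python
import sympy as sp

t, R, M, T0, a = sp.symbols('t R M T_0 a', positive=True)

T = T0 + a*t                  # temperature rising at the constant rate a
v = sp.sqrt(3*R*T/M)          # rms molecular speed from kinetic theory

dvdt = sp.diff(v, t)
claimed = (a/2) * sp.sqrt(3*R/(M*(T0 + a*t)))   # the simplified result above
residual = sp.simplify(dvdt - claimed)
print(residual)  # 0
```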

Exercises

1. Evaluate f'(x) if f(x)=(x^2+5)^2, first by expanding and differentiating directly, and then by applying the chain rule on f(u(x))=u^2 where u=x^2+5. Compare answers.

4x^3+20x

2. Evaluate the derivative of y=\sqrt{1 + x^2} using the chain rule by letting y=\sqrt{u} and u=1+x^2.

\frac{x}{\sqrt{1 + x^2}}

Solutions

References

  1. Chandler, David (October 2000). "Newton's Second Law for Systems with Variable Mass". The Physics Teacher 38 (7): 396. 
  2. Courtney, Michael; Courtney, Amy. "Measuring thrust and predicting trajectory in model rocketry". arXiv:0903.1555. 



<h1> 3.5 Higher Order Derivatives</h1>


The second derivative, or second order derivative, is the derivative of the derivative of a function. The derivative of the function f(x) may be denoted by f^\prime(x), and its double (or "second") derivative is denoted by f^{\prime\prime}(x). This is read as "f double prime of x," or "The second derivative of f(x)." Because the derivative of function f is defined as a function representing the slope of function f, the double derivative is the function representing the slope of the first derivative function.

Furthermore, the third derivative is the derivative of the derivative of the derivative of a function, which can be represented by f^{\prime\prime\prime}(x). This is read as "f triple prime of x", or "The third derivative of f(x)". This can continue as long as the resulting derivative is itself differentiable, with the fourth derivative, the fifth derivative, and so on. Any derivative beyond the first derivative can be referred to as a higher order derivative.

Notation

Let  f(x) be a function in terms of x. The following are notations for higher order derivatives.

2nd Derivative | 3rd Derivative | 4th Derivative | nth Derivative | Notes
f^{\prime\prime}(x) | f^{\prime\prime\prime}(x) | f^{(4)}(x) | f^{(n)}(x) | Probably the most common notation.
\frac{d^2 f}{dx^2} | \frac{d^3 f}{dx^3} | \frac{d^4 f}{dx^4} | \frac{d^n f}{dx^n} | Leibniz notation.
\frac{d^2}{dx^2} \left[ f(x) \right] | \frac{d^3}{dx^3} \left[ f(x) \right] | \frac{d^4}{dx^4} \left[ f(x) \right] | \frac{d^n}{dx^n} \left[ f(x) \right] | Another form of Leibniz notation.
D^2f | D^3f | D^4f | D^nf | Euler's notation.

Warning: You should not write f^{n} (x) to indicate the nth derivative, as this is easily confused with the quantity f(x) all raised to the nth power.

The Leibniz notation, which is useful because of its precision, follows from

\frac{d}{dx}\left( \frac{df}{dx}\right) = \frac{d^2 f}{dx^2}.

Newton's dot notation extends to the second derivative, \ddot y, but typically no further in the applications where this notation is common.

Examples

Example 1:

Find the third derivative of  f(x) = 4x^5 + 6x^3 + 2x + 1 \  with respect to x.

Repeatedly apply the Power Rule to find the derivatives.

  •  f'(x) = 20x^4 + 18x^2 + 2 \
  •  f''(x) = 80x^3 + 36x \
  •  f'''(x) = 240x^2 + 36 \

Example 2:

Find the third derivative of  f(x) = 12\sin x + \frac{1}{x+2} + 2x \  with respect to x.
  •  f'(x) = 12\cos x - \frac{1}{(x+2)^2} + 2
  •  f''(x) = -12\sin x + \frac{2}{(x+2)^3}
  •  f'''(x) = -12\cos x - \frac{6}{(x+2)^4}

Applications:

For applications of the second derivative in finding a curve's concavity and points of inflection, see "Extrema and Points of Inflection" and "Extreme Value Theorem". For applications of higher order derivatives in physics, see the "Kinematics" section.


<h1> 3.6 Implicit Differentiation</h1>


Generally, you will encounter functions expressed in explicit form, that is, in the form y = f(x). To find the derivative of y with respect to x, you take the derivative with respect to x of both sides of the equation to get

\frac{dy}{dx}=\frac{d}{dx}[f(x)]=f'(x)

But suppose you have a relation of the form f(x,y(x))=g(x,y(x)). In this case, it may be inconvenient or even impossible to solve for y as a function of x. A good example is the relation y^2 + 2yx + 3 = 5x \,. In this case you can utilize implicit differentiation to find the derivative. To do so, one takes the derivative of both sides of the equation with respect to x and solves for y'. That is, form

\frac{d}{dx}[f(x,y(x))]=\frac{d}{dx}[g(x,y(x))]

and solve for dy/dx. You need to employ the chain rule whenever you take the derivative of a variable with respect to a different variable. For example,

 \frac{d}{dx} (y^3) = \frac{d}{dy}[y^3]\cdot\frac{dy}{dx}=3y^2 \cdot y' \

Implicit Differentiation and the Chain Rule

To understand how implicit differentiation works and to use it effectively, it is important to recognize that the key idea is simply the chain rule. First let's recall the chain rule. Suppose we are given two differentiable functions f(x) and g(x) and that we are interested in computing the derivative of the function f(g(x)); then the chain rule states that:

\frac{d}{dx}\Big(f(g(x))\Big) = f'(g(x))\,g'(x)

That is, we take the derivative of f as normal, plug in g, and finally multiply the result by the derivative of g.

Now suppose we want to differentiate a term like y^2 with respect to x, where we are thinking of y as a function of x; for the remainder of this calculation let's write it as y(x) instead of just y. The term y^2 is just the composition of f(x) = x^2 and y(x). That is, f(y(x)) = y^2(x). Recalling that f'(x) = 2x, the chain rule states that:

\frac{d}{dx}\Big(f(y(x))\Big)=f'(y(x))\,y'(x)=2y(x)y'(x)

Of course it is customary to think of y as being a function of x without always writing y(x), so this calculation usually is just written as

\frac{d}{dx}y^2=2yy'.

Don't be confused by the fact that we don't yet know what y′ is; it is some function, and often, if we are differentiating two quantities that are equal, it becomes possible to explicitly solve for y′ (as we will see in the examples below). This makes implicit differentiation a very powerful technique for taking derivatives.

Explicit Differentiation

For example, suppose we are interested in the derivative of y with respect to x where x and y are related by the equation

x^2 + y^2 = 1\,

This equation represents a circle of radius 1 centered on the origin. Note that y is not a function of x since it fails the vertical line test (y=\pm1 when x=0, for example).

To find y', first we can separate variables to get

y^2 = 1 - x^2\,

Taking the square root of both sides we get two separate functions for y:

y = \pm \sqrt{1-x^2}\,

We can rewrite this as a fractional power:

y = \pm (1-x^2)^{\frac{1}{2}}\,

Using the chain rule we get,

y' = \pm\frac{1}{2}(1-x^2)^{-1/2}\cdot(-2x) = \mp\frac{x}{(1-x^2)^{1/2}}

And simplifying by substituting y back into this equation gives

y' = -\frac{x}{y}

Implicit Differentiation

Using the same equation

x^2 + y^2 = 1\,

First, differentiate with respect to x on both sides of the equation:

\frac{d}{dx}[x^2 + y^2] = \frac{d}{dx}[1]
\frac{d}{dx}[x^2]+\frac{d}{dx}[y^2] = 0

To differentiate the second term on the left hand side of the equation (call it f(y(x))=y^2), use the chain rule:

\frac{df}{dx}=\frac{df}{dy}\cdot\frac{dy}{dx}=2y\cdot y'

So the equation becomes

2x+2yy'=0

Separate the variables:

2yy' = -2x\,

Divide both sides by 2y, and simplify to get the same result as above:

y' = -\frac{2x}{2y}
y' = -\frac{x}{y}

Uses

Implicit differentiation is useful when differentiating an equation that cannot be explicitly differentiated because it is impossible to isolate variables.

For example, consider the equation,

x^2 + xy + y^2 = 16\,

Differentiate both sides of the equation (remember to use the product rule on the term xy):

2x + y + xy' + 2yy' = 0\,

Isolate terms with y':

xy' + 2yy' = -2x - y\,

Factor out a y' and divide both sides by the other term:

y' = \frac{-2x-y}{x+2y}
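sympy's idiff helper carries out exactly this procedure, given the relation rewritten as an expression equal to zero. A sketch, assuming the sympy library:

```python
import sympy as sp

x, y = sp.symbols('x y')

# The relation x^2 + xy + y^2 = 16, written as (expression) = 0
eq = x**2 + x*y + y**2 - 16

yprime = sp.idiff(eq, y, x)   # implicit dy/dx, treating y as y(x)
residual = sp.simplify(yprime - (-2*x - y)/(x + 2*y))
print(residual)  # 0
```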

Example

xy \,=1

can be solved as:

y=\frac{1}{x}

then differentiated:

\frac{dy}{dx}=-\frac{1}{x^2}

However, using implicit differentiation it can also be differentiated like this:

\frac{d}{dx}[xy]=\frac{d}{dx}[1]

use the product rule:

x\frac{dy}{dx}+y=0

solve for \frac{dy}{dx}:

\frac{dy}{dx}=-\frac{y}{x}

Note that, if we substitute y=\frac{1}{x} into \frac{dy}{dx}=-\frac{y}{x}, we end up with \frac{dy}{dx}=-\frac{1}{x^2} again.

Application: inverse trigonometric functions

Arcsine, arccosine, arctangent. These are the functions that allow you to determine the angle given the sine, cosine, or tangent of that angle.

First, let us start with the arcsine such that:

y=\arcsin(x)

To find dy/dx we first need to break this down into a form we can work with:

x = \sin(y)

Then we can take the derivative of that:

1 = \cos(y) \cdot \frac{dy}{dx}

...and solve for dy / dx:

\frac{dy}{dx} = \frac{1}{\cos(y)}

y=\arcsin(x) gives us this unit triangle.

At this point we need to go back to the unit triangle. Since y is the angle, the opposite side is \sin(y) (which is equal to x), the adjacent side is \cos(y) (which is equal to the square root of 1 minus x^2, by the Pythagorean theorem), and the hypotenuse is 1. Since we have determined the value of \cos(y) from the unit triangle, we can substitute it back into the above equation and get:


Derivative of the Arcsine

\frac{d}{dx} \arcsin(x) = \frac{1}{\sqrt{1-x^2}}\,\!

We can use an identical procedure for the arccosine and arctangent:


Derivative of the Arccosine

\frac{d}{dx} \arccos(x) = \frac{-1}{\sqrt{1-x^2}}\,\!

Derivative of the Arctangent

\frac{d}{dx} \arctan(x) = \frac{1}{1+x^2}\,\!
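All three inverse-trigonometric derivatives can be confirmed symbolically. A sketch, assuming the sympy library:

```python
import sympy as sp

x = sp.symbols('x')

# sympy knows the inverse trig functions directly
d_asin = sp.diff(sp.asin(x), x)
d_acos = sp.diff(sp.acos(x), x)
d_atan = sp.diff(sp.atan(x), x)

print(d_asin)
print(d_acos)
print(d_atan)
```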




<h1> 3.7 Derivatives of Exponential and Logarithm Functions</h1>


Exponential Function

First, we determine the derivative of e^x using the definition of the derivative:

\frac{d}{dx} e^x = \lim_{h \to 0} \frac{e^{x+h} - e^{x}}{h}

Then we apply some basic algebra with powers (specifically that a^{b+c} = a^b a^c):

\frac{d}{dx} e^x = \lim_{h \to 0} \frac{e^{x} e^{h} - e^{x}}{h}

Since ex does not depend on h, it is constant as h goes to 0. Thus, we can use the limit rules to move it to the outside, leaving us with:

\frac{d}{dx} e^x = e^x \cdot \lim_{h \to 0} \frac{e^{h} - 1}{h}

Now, the limit can be calculated by techniques we will learn later, for example L'Hôpital's rule, and we will see that

\lim_{h \to 0} \frac{e^h - 1}{h} = 1,

so that we have proved the following rule:

Derivative of the exponential function

\frac{d}{dx}e^x = e^x\,\!

Now that we have derived a specific case, let us extend things to the general case. Assuming that a is a positive real constant, we wish to calculate:

\frac{d}{dx}a^x

One of the oldest tricks in mathematics is to break a problem down into a form that we already know we can handle. Since we have already determined the derivative of ex, we will attempt to rewrite ax in that form.

Using that e^{\ln(c)} = c and that \ln(a^b) = b \cdot \ln(a), we find that:

a^x = e^{x \cdot \ln(a)}

Thus, we simply apply the chain rule:

\frac{d}{dx}e^{x \cdot \ln(a)} = \frac{d}{dx} \left[ x\cdot \ln(a) \right] e^{x \cdot \ln(a)} = \ln(a) a^x

Derivative of the exponential function

\frac{d}{dx}a^x = \ln\left(a\right)a^x\,\!
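Both exponential rules are easy to confirm symbolically; a sketch assuming sympy, with a restricted to positive values as in the text:

```python
import sympy as sp

x = sp.symbols('x')
a = sp.symbols('a', positive=True)

d_ex = sp.diff(sp.exp(x), x)   # derivative of e^x
d_ax = sp.diff(a**x, x)        # derivative of a^x

residual = sp.simplify(d_ax - sp.log(a)*a**x)
print(d_ex)      # exp(x)
print(residual)  # 0
```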

Logarithm Function

Closely related to the exponential function is the logarithm. Just as with exponents, we will derive the equation for a specific case first (the natural log, where the base is e), and then work to generalize it for any logarithm.

First let us create a variable y such that:

y = \ln\left(x\right)

It should be noted that what we want to find is the derivative of y or \frac{dy}{dx} .

Next we will exponentiate both sides with base e in an attempt to remove the logarithm from the right hand side:

e^y = x

Now, applying the chain rule and the property of exponents we derived earlier, we take the derivative of both sides:

 \frac{dy}{dx} \cdot e^y = 1

This leaves us with the derivative:

 \frac{dy}{dx}  = \frac{1}{e^y}

Substituting back our original equation of x = e^y, we find that:


Derivative of the Natural Logarithm

\frac{d}{dx}\ln\left(x\right) = \frac{1}{x}\,\!

If we wanted, we could go through that same process again for a generalized base, but it is easier just to use properties of logs and realize that:

\log_b(x) = \frac{\ln(x)}{\ln(b)}

Since 1 / ln(b) is a constant, we can just take it outside of the derivative:

\frac{d}{dx}\log_b(x) = \frac{1}{\ln(b)} \cdot \frac{d}{dx} \ln(x)

Which leaves us with the generalized form of:


Derivative of the Logarithm

\frac{d}{dx}\log_b\left(x\right) = \frac{1}{x\ln\left(b\right)}\,\!
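The natural-log and general-log rules can both be checked symbolically. A sketch, assuming the sympy library (sympy's two-argument log(x, b) is exactly ln(x)/ln(b)):

```python
import sympy as sp

x, b = sp.symbols('x b', positive=True)

d_ln = sp.diff(sp.log(x), x)        # natural logarithm
d_logb = sp.diff(sp.log(x, b), x)   # log base b, i.e. ln(x)/ln(b)

residual = sp.simplify(d_logb - 1/(x*sp.log(b)))
print(d_ln)      # 1/x
print(residual)  # 0
```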

Logarithmic Differentiation

We can use the properties of the logarithm, particularly the natural log, to differentiate more difficult functions, such as products with many terms, quotients of composed functions, or functions with variable or function exponents. We do this by taking the natural logarithm of both sides, rearranging terms using the logarithm laws below, differentiating both sides implicitly, and then multiplying through by y.

\log\left(\frac{a}{b}\right) = \log(a) - \log(b)

\log(a^n) = n\log(a)\,\!

\log(a) + \log(b) = \log(ab)\,\!

See the examples below.

Example 1

Suppose we wished to differentiate

y = \frac{(6x^2+9)^2}{\sqrt{3x^3-2}}

We take the natural logarithm of both sides


\begin{align}
 \ln(y) & = \ln\Bigg(\frac{(6x^2+9)^2}{\sqrt{3x^3-2}}\Bigg) \\
            & = \ln(6x^2+9)^2 - \ln(3x^3-2)^{\frac{1}{2}} \\
            & = 2\ln(6x^2+9) - \frac{1}{2}\ln(3x^3-2) \\
\end{align}

Differentiating implicitly, recalling the chain rule


\begin{align}
 \frac{1}{y} \frac{dy}{dx} & = 2 \times \frac{12x}{6x^2+9} - \frac{1}{2} \times \frac{9x^2}{3x^3-2} \\
                                           & = \frac{24x}{6x^2+9} - \frac{\frac{9}{2}x^2}{3x^3-2} \\
                                           & = \frac{24x(3x^3-2) - \frac{9}{2}x^2(6x^2+9)}{(6x^2+9)(3x^3-2)} \\
\end{align}

Multiplying by y, the original function

\frac{dy}{dx} = \frac{(6x^2+9)^2}{\sqrt{3x^3-2}} \times \frac{24x(3x^3-2) - \frac{9}{2}x^2(6x^2+9)}{(6x^2+9)(3x^3-2)}
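The result of Example 1 can be verified by differentiating the original quotient directly; a sketch assuming sympy, with x taken positive so the square root is real:

```python
import sympy as sp

x = sp.symbols('x', positive=True)

y = (6*x**2 + 9)**2 / sp.sqrt(3*x**3 - 2)
dydx = sp.diff(y, x)

# dy/dx as obtained by logarithmic differentiation: y * d/dx[ln y]
claimed = y * (24*x/(6*x**2 + 9) - sp.Rational(9, 2)*x**2/(3*x**3 - 2))
residual = sp.simplify(dydx - claimed)
print(residual)  # 0
```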
Example 2

Let us differentiate a function

y=x^x\,\!

Taking the natural logarithm of left and right


\begin{align}
 \ln y & = \ln(x^x) \\
          & = x\ln(x) \\
\end{align}

We then differentiate both sides, recalling the product and chain rules


\begin{align}
 \frac{1}{y} \frac{dy}{dx} & = \ln(x) + x\frac{1}{x} \\
                                           & = \ln(x) + 1 \\
\end{align}

Multiplying by the original function y

\frac{dy}{dx} = x^x(\ln(x) + 1)
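sympy differentiates x^x by the same logarithmic device internally, so it makes a good check of Example 2. A sketch, assuming the sympy library:

```python
import sympy as sp

x = sp.symbols('x', positive=True)

y = x**x
dydx = sp.diff(y, x)
residual = sp.simplify(dydx - x**x*(sp.log(x) + 1))
print(residual)  # 0
```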
Example 3

Take a function

y=x^{6\cos(x)}\,\!

Then


\begin{align}
 \ln y & = \ln(x^{6\cos(x)})\,\! \\
          & = 6\cos(x)\ln(x)\,\! \\
\end{align}

We then differentiate

\frac{1}{y} \frac{dy}{dx} = -6\sin(x)\ln(x)+\frac{6\cos(x)}{x}

And finally multiply by y


\begin{align}
 \frac{dy}{dx} & = y\Bigg(-6\sin(x)\ln(x)+\frac{6\cos(x)}{x}\Bigg)\\
               & = x^{6\cos(x)}\Bigg(-6\sin(x)\ln(x)+\frac{6\cos(x)}{x}\Bigg)\\
                
\end{align}

<h1> 3.8 Some Important Theorems</h1>


This section covers three theorems of fundamental importance to the topic of differential calculus: The Extreme Value Theorem, Rolle's Theorem, and the Mean Value Theorem. It also discusses the relationship between differentiability and continuity.

Extreme Value Theorem

Classification of Extrema

We start out with some definitions.

Global Maximum
A global maximum (also called an absolute maximum) of a function f on a closed interval I is a value f(c) such that f(c)\geq f(x) for all x in I.
Global Minimum
A global minimum (also called an absolute minimum) of a function f on a closed interval I is a value f(c) such that f(c)\leq f(x) for all x in I.

Maxima and minima are collectively known as extrema.

The Extreme Value Theorem

Extreme Value Theorem
If f is a function that is continuous on the closed interval [a,b], then f has both a global minimum and a global maximum on [a,b]. It is assumed that a and b are both finite.

The Extreme Value Theorem is a fundamental result of real analysis whose proof is beyond the scope of this text. However, the truth of the theorem allows us to talk about the maxima and minima of continuous functions on closed intervals without concerning ourselves with whether or not they exist. When dealing with functions that do not satisfy the premises of the theorem, we will need to worry about such things. For example, the unbounded function f(x)=x has no extrema whatsoever. If f(x) is restricted to the semi-closed interval I=[0,1), then f has a minimum value of 0 at x=0, but it has no maximum value since, for any given value c in I, one can always find a larger value of f(x) for x in I, for example by forming f(d), where d is the average of c with 1. The function g(x)=\frac{1}{x} has a discontinuity at x=0. g(x) fails to have any extrema in any closed interval around x=0 since the function is unbounded below as one approaches 0 from the left, and it is unbounded above as one approaches 0 from the right. (In fact, the function is undefined for x=0. However, the example is unaffected if g(0) is assigned any arbitrary value.)

The Extreme Value Theorem is an existence theorem. It tells us that global extrema exist if certain conditions are met, but it doesn't tell us how to find them. We will discuss how to determine the extrema of continuous functions in the section titled Extrema and Points of Inflection.

Rolle's Theorem

Rolle's Theorem
If a function,  f(x) \ , is continuous on the closed interval  [a,b] \ , is differentiable on the open interval  (a,b) \ , and  f(a) = f(b) \ , then there exists at least one number c, in the interval  (a,b) \ such that  f'(c) = 0 \ .
Rolle's theorem.svg

Rolle's Theorem is important in proving the Mean Value Theorem. Intuitively it says that if you have a function that is continuous everywhere in an interval bounded by points where the function has the same value, and if the function is differentiable everywhere in the interval (except maybe at the endpoints themselves), then the function must have zero slope in at least one place in the interior of the interval.

Proof of Rolle's Theorem

If f is constant on [a,b], then f'(x)=0 for every x in [a,b], so the theorem is true. So for the remainder of the discussion we assume f is not constant on [a,b].

Since f satisfies the conditions of the Extreme Value Theorem, f must attain its maximum and minimum values on [a,b]. Since f is not constant on [a,b], the endpoints cannot be both maxima and minima. Thus, at least one extremum exists in (a,b). We can suppose without loss of generality that this extremum is a maximum because, if it were a minimum, we could consider the function -f instead. Let f(c) with c in (a,b) be a maximum. It remains to be shown that f'(c)=0.

By the definition of derivative, f'(c)=\lim_{h\to0}\frac{f(c+h)-f(c)}{h}. By substituting h=x-c, this is equivalent to \lim_{x\to c}\frac{f(x)-f(c)}{x-c}. Note that f(x)-f(c)\leq 0 for all x in [a,b] since f(c) is the maximum on [a,b].

\lim_{x\to c^{-}}\frac{f(x)-f(c)}{x-c}\geq0 since it has non-positive numerator and negative denominator.

\lim_{x\to c^{+}}\frac{f(x)-f(c)}{x-c}\leq0 since it has non-positive numerator and positive denominator.

The limits from the left and right must be equal since the function is differentiable at c, so \lim_{x\to c}\frac{f(x)-f(c)}{x-c}=0=f'(c).

Exercise

1. Show that Rolle's Theorem holds true between the x-intercepts of the function f(x)=x^2-3x.

1: The question wishes for us to use the x-intercepts as the endpoints of our interval.

Factor the expression to obtain x(x-3)= 0 . x=0 and x=3 are our two endpoints. We know that f(0) and f(3) are the same, thus that satisfies the first part of Rolle's theorem (f(a)=f(b)).

2: Now by Rolle's Theorem, we know that somewhere between these points, the slope will be zero. Where? Easy: Take the derivative.

\frac{dy}{dx}  = 2x - 3

Thus, at  x = 3/2 , we have a spot with a slope of zero. We know that 3/2 (or 1.5) is between 0 and 3. Thus, Rolle's Theorem is true for this (as it is for all cases).

Mean Value Theorem

Lagrange mean value theorem.svg
Mean Value Theorem

If  f(x) \ is continuous on the closed interval  [a, b] \ and differentiable on the open interval  (a,b) \ , there exists a number,  c \ , in the open interval  (a,b) \ such that

 f'(c) = \frac{f(b) - f(a)}{b - a} .

The Mean Value Theorem is an important theorem of differential calculus. It basically says that for a differentiable function defined on an interval, there is some point on the interval whose instantaneous slope is equal to the average slope of the interval. Note that Rolle's Theorem is the special case of the Mean Value Theorem when f(a)=f(b).

In order to prove the Mean Value Theorem, we will prove a more general statement, of which the Mean Value Theorem is a special case. The statement is Cauchy's Mean Value Theorem, also known as the Extended Mean Value Theorem.

Cauchy's Mean Value Theorem

Cauchy's Mean Value Theorem

If  f(x) \ ,  g(x) \ are continuous on the closed interval  [a, b] \ and differentiable on the open interval  (a,b) \ , then there exists a number,  c \ , in the open interval  (a,b) \ such that

f'(c)(g(b)-g(a))=g'(c)(f(b)-f(a))

If g(b)\ne g(a) and g'(c)\ne 0, then this is equivalent to

 \frac{f'(c)}{g'(c)} = \frac{f(b) - f(a)}{g(b) - g(a)} .

To prove Cauchy's Mean Value Theorem, consider the function h(x)=f(x)(g(b)-g(a))-g(x)(f(b)-f(a))-f(a)g(b)+f(b)g(a). Since both f and g are continuous on [a,b] and differentiable on (a,b), so is h, and h'(x)=f'(x)(g(b)-g(a))-g'(x)(f(b)-f(a)). Since h(a)=h(b) (see the exercises), Rolle's Theorem tells us that there exists some number c in (a,b) such that h'(c)=0. This implies that f'(c)(g(b)-g(a))=g'(c)(f(b)-f(a)), which is what was to be shown.

Exercises

2. Show that h(a)=h(b), where h(x) is the function that was defined in the proof of Cauchy's Mean Value Theorem.

\begin{align}h(a)&=f(a)(g(b)-g(a))-g(a)(f(b)-f(a))-f(a)g(b)+f(b)g(a)\\
&=f(a)g(b)-f(a)g(a)-g(a)f(b)+g(a)f(a)-f(a)g(b)+f(b)g(a)\\
&=0\end{align} \begin{align}h(b)&=f(b)(g(b)-g(a))-g(b)(f(b)-f(a))-f(a)g(b)+f(b)g(a)\\
&=f(b)g(b)-f(b)g(a)-g(b)f(b)+g(b)f(a)-f(a)g(b)+f(b)g(a)\\
&=0\end{align}

3. Show that the Mean Value Theorem follows from Cauchy's Mean Value Theorem.

Let g(x)=x. Then g'(x)=1 and g(b)-g(a)=b-a, which is non-zero if b\ne a. Then
 \frac{f'(c)}{g'(c)} = \frac{f(b) - f(a)}{g(b) - g(a)} simplifies to f'(c) = \frac{f(b) - f(a)}{b-a} , which is the Mean Value Theorem.

4. Find the x=c that satisfies the Mean Value Theorem for the function f(x)=x^3 with endpoints x=0 and x=2.

x=\frac{2\sqrt{3}}{3}
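Exercise 4 can be worked symbolically by solving f'(c) equal to the average slope over the interval. A sketch, assuming the sympy library:

```python
import sympy as sp

x = sp.symbols('x')

f = x**3
a, b = 0, 2

# Average slope of f over [a, b]
avg_slope = (f.subs(x, b) - f.subs(x, a)) / (b - a)

# Solve f'(c) = average slope and keep the root inside (a, b)
candidates = sp.solve(sp.Eq(sp.diff(f, x), avg_slope), x)
c = [s for s in candidates if a < s < b][0]
print(c)  # 2*sqrt(3)/3
```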

5. Find the point that satisfies the Mean Value Theorem for the function f(x) = \sin(x) on the interval [0,\pi].

x=\pi/2

Solutions

Differentiability Implies Continuity

If f'(x_0) exists then f is continuous at x_0. To see this, note that \lim_{x\to x_0}(x-x_0)f'(x_0)=0. But
\begin{align}\lim_{x\to x_0}(x-x_0)f'(x_0)&=\lim_{x\to x_0}(x-x_0)\frac{f(x)-f(x_0)}{x-x_0}\\
&=\lim_{x\to x_0}(f(x)-f(x_0))\\
&=\lim_{x\to x_0}f(x)-f(x_0)\end{align}
This implies that \lim_{x\to x_0}f(x)-f(x_0)=0 or \lim_{x\to x_0}f(x)=f(x_0), which shows that f is continuous at x=x_0.
The converse, however, is not true. Take f(x)=|x|, for example. f is continuous at 0 since \lim_{x\to 0^-}|x|=\lim_{x\to 0^-}-x=0 and \lim_{x\to 0^+}|x|=\lim_{x\to 0^+}x=0 and |0|=0, but it is not differentiable at 0 since \lim_{h\to 0^-}\frac{|0+h|-|0|}{h}=\lim_{h\to 0^-}\frac{-h}{h}=-1 but \lim_{h\to 0^+}\frac{|0+h|-|0|}{h}=\lim_{h\to 0^+}\frac{h}{h}=1.


<h1> 3.9 Basics of Differentiation Cumulative Exercises</h1>


Find the Derivative by Definition

Find the derivative of the following functions using the limit definition of the derivative.

1. f(x) = x^2 \,

2x

2. f(x) = 2x + 2 \,

2

3. f(x) = \frac{1}{2}x^2 \,

x

4. f(x) = 2x^2 + 4x + 4 \,

4x+4

5. f(x) = \sqrt{x+2} \,

\frac{1}{2\sqrt{x+2}}

6. f(x) = \frac{1}{x} \,

-\frac{1}{x^2}

7. f(x) = \frac{3}{x+1} \,

\frac{-3}{(x+1)^2}

8. f(x) = \frac{1}{\sqrt{x+1}} \,

\frac{-1}{2(x+1)^{3/2}}

9. f(x) = \frac{x}{x+2} \,

\frac{2}{(x+2)^2}

Solutions

Prove the Constant Rule

10. Use the definition of the derivative to prove that for any fixed real number c, \frac{d}{dx}\left[cf(x)\right] = c \frac{d}{dx}\left[f(x)\right]

\begin{align}\frac{d}{dx}\left[cf(x)\right]
&=\lim_{\Delta x \to 0}\frac{cf\left(x+\Delta x \right)-cf\left(x\right)}{\Delta x}\\
&=c\lim_{\Delta x \to 0}\frac{f(x+\Delta x)-f(x)}{\Delta x}\\
&=c\frac{d}{dx}\left[f(x)\right]\end{align}

Solutions

Find the Derivative by Rules

Find the derivative of the following functions:

Power Rule

11. f(x) = 2x^2 + 4\,

f'(x)=4x

12. f(x) = 3\sqrt[3]{x}\,

f'(x)=\frac{1}{\sqrt[3]{x^2}}

13. f(x) = 2x^5+8x^2+x-78\,

f'(x)=10x^4+16x+1

14. f(x) = 7x^7+8x^5+x^3+x^2-x\,

f'(x)=49x^6+40x^4+3x^2+2x-1

15. f(x) = \frac{1}{x^2}+3x^\frac{1}{3}\,

f'(x)=\frac{-2}{x^3}+\frac{1}{\sqrt[3]{x^2}}

16. f(x) = 3x^{15} + \frac{1}{17}x^2 +\frac{2}{\sqrt{x}} \,

f'(x)=45x^{14}+\frac{2}{17}x-\frac{1}{x\sqrt{x}}

17. f(x) = \frac{3}{x^4} - \sqrt[4]{x} + x \,

f'(x)=\frac{-12}{x^5}-\frac{1}{4\sqrt[4]{x^3}}+1

18. f(x) = 6x^{1/3}-x^{0.4} +\frac{9}{x^2} \,

f'(x)=\frac{2}{\sqrt[3]{x^2}}-\frac{0.4}{x^{0.6}}-\frac{18}{x^3}

19. f(x) = \frac{1}{\sqrt[3]{x}} + \sqrt{x} \,

f'(x)=\frac{-1}{3x\sqrt[3]{x}}+\frac{1}{2\sqrt{x}}

Solutions

Product Rule

20. f(x) = (x^4+4x+2)(2x+3) \,

10x^4+12x^3+16x+16

21. f(x) = (2x-1)(3x^2+2) \,

18x^2-6x+4

22. f(x) = (x^3-12x)(3x^2+2x) \,

15x^4+8x^3-108x^2-48x

23. f(x) = (2x^5-x)(3x+1) \,

36x^5+10x^4-6x-1

f(x) = (5x^2+3)(2x+7)\,

30x^2+70x+6

f(x) = 3x^2(5x^2+1)^4 \,

6x(25x^2+1)(5x^2+1)^3

f(x) = x^3(2x^2-x+4)^4 \,

x^2(2x^2-x+4)^3(22x^2-7x+12)

f(x) = 5x^2(x^3-x+1)^3 \,

5x(x^3-x+1)^2(11x^3-2x+1)

f(x) = (2-x)^6(5+2x)^4 \,

2(x-2)^5(2x+5)^3(10x+7)

Solutions

Quotient Rule

24. f(x) = \frac{2x+1}{x+5} \,

f'(x)=\frac{9}{(x+5)^2}

25. f(x) = \frac{3x^4+2x +2}{3x^2+1} \,

f'(x)=\frac{18x^5+12x^3-6x^2-12x+2}{(3x^2+1)^2}

26. f(x) = \frac{x^\frac{3}{2}+1}{x+2} \,

f'(x)=\frac{x\sqrt{x}+6\sqrt{x}-2}{2(x+2)^2}

27. d(u) = \frac{u^3+2}{u^3} \,

d'(u)=-\frac{6}{u^4}

28. f(x) = \frac{x^2+x}{2x-1} \,

f'(x)=\frac{2x^2-2x-1}{(2x-1)^2}

29. f(x) = \frac{x+1}{2x^2+2x+3} \,

f'(x)=\frac{-2x^2-4x+1}{(2x^2+2x+3)^2}

30. f(x) = \frac{16x^4+2x^2}{x} \,

f'(x)=48x^2+2

f(x) = \frac{8x^3+2}{5x+5} \,

f'(x)=\frac{2(8x^3+12x^2-1)}{5(x+1)^2}

f(x) = \frac{(3x-2)^2}{x^{1/2}} \,

f'(x)=\frac{(3x-2)(9x+2)}{2x^{3/2}}

f(x) = \frac{ x^{1/2}}{2x-1} \,

f'(x)=\frac{-(2x+1)}{2x^{1/2} (2x-1)^2}

f(x) = \frac{ 4x-3}{x+2} \,

f'(x)=\frac{11}{(x+2)^2}

f(x) = \frac{ 4x+3}{2x-1} \,

f'(x)=\frac{-10}{(2x-1)^2}

f(x) = \frac{ x^2}{x+3} \,

f'(x)=\frac{x(x+6)}{(x+3)^2}

f(x) = \frac{ x^5}{3-x} \,

f'(x)=\frac{x^4(-4x+15)}{(3-x)^2}

Solutions

Chain Rule

31. f(x) = (x+5)^2 \,

f'(x)=2(x+5)

32. g(x) = (x^3 - 2x + 5)^2 \,

g'(x)=2(x^{3}-2x+5)(3x^{2}-2)

33. f(x) = \sqrt{1-x^2} \,

f'(x)=-\frac{x}{\sqrt{1-x^{2}}}

34. f(x) = \frac{(2x+4)^3}{4x^3+1} \,

f'(x)=\frac{6(4x^{3}+1)(2x+4)^{2}-(2x+4)^{3}(12x^{2})}{(4x^{3}+1)^{2}}

35. f(x) = (2x+1)\sqrt{2x+2} \,

f'(x)=2\sqrt{2x+2}+\frac{2x+1}{\sqrt{2x+2}}

36. f(x) = \frac{2x+1}{\sqrt{2x+2}} \,

f'(x)=\frac{2x+3}{(2x+2)^{3/2}}

37. f(x) = \sqrt{2x^2+1}(3x^4+2x)^2 \,

f'(x)=\frac{2x(3x^{4}+2x)^{2}}{\sqrt{2x^{2}+1}}+\sqrt{2x^{2}+1}(2)(3x^{4}+2x)(12x^{3}+2)

38. f(x) = \frac{2x+3}{(x^4+4x+2)^2} \,

f'(x)=\frac{2(x^{4}+4x+2)^{2}-2(2x+3)(x^{4}+4x+2)(4x^{3}+4)}{(x^{4}+4x+2)^{4}}

39. f(x) = \sqrt{x^3+1}(x^2-1) \,

f'(x)=\frac{3x^{2}(x^{2}-1)}{2\sqrt{x^{3}+1}}+2x\sqrt{x^{3}+1}

40. f(x) = ((2x+3)^4 + 4(2x+3) +2)^2 \,

f'(x)=2((2x+3)^{4}+4(2x+3)+2)(8(2x+3)^{3}+8)

41. f(x) = \sqrt{1+x^2} \,

f'(x)=\frac{x}{\sqrt{1+x^{2}}}

Solutions

Exponentials

42. f(x) = (3x^2+e)e^{2x}\,

f'(x)=6xe^{2x}+2e^{2x}(3x^{2}+e)

43. f(x) = e^{2x^2+3x}

f'(x)=(4x+3)e^{2x^{2}+3x}

44. f(x) = e^{e^{2x^2+1}}

f'(x)=4xe^{2x^{2}+1+e^{2x^{2}+1}}

45. f(x) = 4^x\,

f'(x)=\ln(4)4^{x}

Solutions

Logarithms

46. f(x) = 2^{x-3}\cdot3\sqrt{x^3-2}+\ln x\,

f'(x)=3\ln(2)2^{x-3}\sqrt{x^{3}-2}+\frac{9x^{2}2^{x-4}}{\sqrt{x^{3}-2}}+\frac{1}{x}

47. f(x) = \ln x - 2e^x + \sqrt{x}\,

f'(x)=\frac{1}{x}-2e^{x}+\frac{1}{2\sqrt{x}}

48. f(x) = \ln(\ln(x^3(x+1))) \,

f'(x)=\frac{4x^{3}+3x^{2}}{x^{3}(x+1)\ln(x^{3}(x+1))}

49. f(x) = \ln(2x^2+3x)\,

f'(x)=\frac{4x+3}{2x^{2}+3x}

50. f(x) = \log_4 x + 2\ln x\,

f'(x)=\frac{1}{x\ln4}+\frac{2}{x}

Solutions

Trigonometric functions

51. f(x) = 3e^x-4\cos (x) - \frac{1}{4}\ln x\,

f'(x)=3e^{x}+4\sin(x)-\frac{1}{4x}

52. f(x) = \sin(x)+\cos(x)\,

f'(x)=\cos(x)-\sin(x)

Solutions

More Differentiation

53. \frac{d}{dx}[(x^{3}+5)^{10}]

30x^{2}(x^{3}+5)^{9}

54. \frac{d}{dx}[x^{3}+3x]

3x^{2}+3

55. \frac{d}{dx}[(x+4)(x+2)(x-3)]

(x+2)(x-3)+(x+4)(x-3)+(x+4)(x+2)

56. \frac{d}{dx}[\frac{x+1}{3x^{2}}]

-\frac{x+2}{3x^{3}}

57. \frac{d}{dx}[3x^{3}]

9x^{2}

58. \frac{d}{dx}[x^{4}\sin x]

4x^{3}\sin x+x^{4}\cos x

59. \frac{d}{dx}[2^{x}]

\ln(2)2^{x}

60. \frac{d}{dx}[e^{x^{2}}]

2xe^{x^{2}}

61. \frac{d}{dx}[e^{2^{x}}]

\ln(2)2^{x}e^{2^{x}}

Solutions

Implicit Differentiation

Use implicit differentiation to find y'

62.  x^3 + y^3 = xy \,

y'=\frac{y-3x^{2}}{3y^{2}-x}

63.  (2x+y)^4 + 3x^2 +3y^2 = \frac{x}{y} + 1 \,

y'=\frac{y-8y^{2}(2x+y)^{3}-6xy^{2}}{4y^{2}(2x+y)^{3}+6y^{3}+x}

Solutions

Logarithmic Differentiation

Use logarithmic differentiation to find \frac{dy}{dx}:

64. y = x(\sqrt[4]{1-x^3}\,)

y'=\sqrt[4]{1-x^{3}}-\frac{3x^{3}}{4(1-x^{3})^{3/4}}

65. y = \sqrt{x+1 \over 1-x}\,

y'=\frac{1}{2}\sqrt{\frac{x+1}{1-x}}\,(\frac{1}{x+1}+\frac{1}{1-x})

66. y = (2x)^{2x}\,

y'=(2x)^{2x}(2\ln(2x)+2)

67. y = (x^3+4x)^{3x+1}\,

y'=(x^{3}+4x)^{3x+1}(3\ln(x^{3}+4x)+\frac{(3x+1)(3x^{2}+4)}{x^{3}+4x})

68. y = (6x)^{\cos(x) + 1}\,

y'=(6x)^{\cos(x)+1}\left(-\sin(x)\ln(6x)+\frac{\cos(x)+1}{x}\right)

Solutions
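Logarithmic-differentiation answers are easy to get wrong by a factor inside the logarithm, so a numerical cross-check is worthwhile. A Python sketch (an addition for illustration, not part of the exercises; tolerances are arbitrary choices) for exercise 66:

```python
import math

# Check exercise 66: y = (2x)^(2x), stated y' = (2x)^(2x) * (2 ln(2x) + 2),
# against a central difference quotient.

def y(x):
    return (2*x)**(2*x)

def claimed_yprime(x):
    return (2*x)**(2*x) * (2*math.log(2*x) + 2)

h = 1e-6
for x in (0.5, 1.0, 2.0):
    numeric = (y(x + h) - y(x - h)) / (2*h)
    assert abs(numeric - claimed_yprime(x)) / abs(claimed_yprime(x)) < 1e-4
```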

Equation of Tangent Line

For each function, f, (a) determine for what values of x the tangent line to f is horizontal and (b) find an equation of the tangent line to f at the given point.

69.  f(x) = \frac{x^3}{3} + x^2 + 5, \;\;\; (3,23)

a) x=0,-2
b) y=15x-22

70.  f(x) = x^3 - 3x + 1, \;\;\;  (1,-1)

a) x=\pm1
b) y=-1

71.  f(x) = \frac{2}{3} x^3 + x^2 - 12x + 6, \;\;\; (0,6)

a) x=2,-3
b) y=-12x+6

72.  f(x) = 2x + \frac{1}{\sqrt{x}}, \;\;\; (1,3)

a) x=2^{-4/3}
b) y=\frac{3}{2}x+\frac{3}{2}

73.  f(x) = (x^2+1)(2-x), \;\;\; (2,0)

a) x=1,\frac{1}{3}
b) y=-5x+10

74.  f(x) = \frac{2}{3}x^3+\frac{5}{2}x^2 +2x+1, \;\;\; (3,\frac{95}{2})

a) x=-\frac{1}{2},-2
b) y=35x-\frac{115}{2}

75. Find an equation of the tangent line to the graph defined by (x-y-1)^3 = x \, at the point (1,-1).

y=\frac{2}{3}x-\frac{5}{3}

76. Find an equation of the tangent line to the graph defined by  e^{xy} + x^2 = y^2 \, at the point (1,0).

y=-2x+2

Solutions

Higher Order Derivatives

77. What is the second derivative of 3x^4+3x^2+2x?

36x^2+6

78. Use induction to prove that the (n+1)th derivative of an nth-order polynomial is 0.

base case: Consider a zeroth-order polynomial, the constant c. \frac{dc}{dx}=0
induction step: Suppose that the nth derivative of every (n-1)th-order polynomial is 0. Consider an nth-order polynomial, f(x). We can write f(x)=cx^n+P(x) where P(x) is an (n-1)th-order polynomial.
\frac{d^{n+1}}{dx^{n+1}}f(x)=\frac{d^{n+1}}{dx^{n+1}}(cx^n+P(x))=\frac{d^{n+1}}{dx^{n+1}}(cx^n)+\frac{d^{n+1}}{dx^{n+1}}P(x)=\frac{d^{n}}{dx^{n}}(cnx^{n-1})+\frac{d}{dx}\frac{d^{n}}{dx^{n}}P(x)=0+\frac{d}{dx}0=0

Solutions


Applications of Derivatives

<h1> 3.10 L'Hôpital's Rule</h1>



L'Hôpital's Rule

Occasionally, one comes across a limit which results in \frac{0}{0} or \frac{\infty}{\infty}; these are called indeterminate limits. However, it is often still possible to evaluate such limits using L'Hôpital's rule. This rule is also vital in explaining how a number of other limits can be derived.

Definition: Indeterminate Limit
If \lim_{x \to c}f(x)=\lim_{x \to c}g(x)=0 or \lim_{x \to c}f(x)=\lim_{x \to c}g(x)=\pm\infty, the limit \lim_{x \to c}\frac{f(x)}{g(x)} is said to be indeterminate.

All of the following expressions are indeterminate forms.

 \frac{0}{0}, \frac{\pm \infin}{\pm \infin}, \infty - \infty, 0 \cdot \infin, 0^0, \infty^0, 1^\infin \

These expressions are called indeterminate because you cannot determine their exact value in the indeterminate form. Depending on the situation, each indeterminate form could evaluate to a variety of values.

Theorem

If  \lim_{x \to c} \frac{f(x)}{g(x)} \ is indeterminate of type \frac{0}{0} or \frac{\pm \infty}{\pm \infty},

then  \lim_{x \to c} \frac{f(x)}{g(x)} = \lim_{x \to c} \frac{f'(x)}{g'(x)}

In other words, if the limit of the quotient is indeterminate, it equals the limit of the derivative of the numerator divided by the derivative of the denominator (not the derivative of the quotient as a whole). If that new limit is also indeterminate, L'Hôpital's rule can be applied again until the result is no longer \frac{0}{0} or \frac{\infty}{\infty}.

Note:

 x \ can approach a finite value c,  \infin or  -\infin .


Proof of the 0/0 case

Suppose that for real functions f and g , \lim_{x\to c}f(x)=0 and \lim_{x\to c}g(x)=0 and that \lim_{x\to c}\frac{f'(x)}{g'(x)} exists. Thus f'(x) and g'(x) exist in an interval (c-\delta,c+\delta) around c , but maybe not at c itself. This implies that both f and g are differentiable (and thus continuous) everywhere in (c-\delta,c+\delta) except perhaps at c. Thus, for any x in (c-\delta,c+\delta) , in any interval [c,x] or [x,c] , f and g are continuous and differentiable, with the possible exception of c . Define
F(x)=\left\{\begin{matrix}f(x),&\mbox{if }x\ne c\\ \lim_{x\to c}f(x),&\mbox{if }x=c\end{matrix}\right.       and       G(x)=\left\{\begin{matrix}g(x),&\mbox{if }x\ne c\\ \lim_{x\to c}g(x),&\mbox{if }x=c\end{matrix}\right..
Note that \lim_{x\to c}\frac{f(x)}{g(x)}=\lim_{x\to c}\frac{F(x)}{G(x)} , \lim_{x\to c}\frac{f'(x)}{g'(x)}=\lim_{x\to c}\frac{F'(x)}{G'(x)} , and that F(x) and G(x) are continuous in any interval [c,x] or [x,c] and differentiable in any interval (c,x) or (x,c) when x is in (c-\delta,c+\delta).

Cauchy's Mean Value Theorem tells us that \frac{F(x)-F(c)}{G(x)-G(c)}=\frac{F'(\xi)}{G'(\xi)} for some \xi in (c,x) (if x>c ) or (x,c) (if x<c ). Since F(c)=G(c)=0 , we have \frac{F(x)}{G(x)}=\frac{F'(\xi)}{G'(\xi)} for x and \xi in (c-\delta,c+\delta).

Note that \lim_{x\to c}\frac{F'(\xi)}{G'(\xi)} is the same limit as \lim_{x\to c}\frac{F'(x)}{G'(x)} since both x and \xi are being squeezed to c . So taking the limit as x\to c of the last equation gives \lim_{x\to c}\frac{F(x)}{G(x)}=\lim_{x\to c}\frac{F'(x)}{G'(x)} , which is equivalent to \lim_{x\to c}\frac{f(x)}{g(x)}=\lim_{x\to c}\frac{f'(x)}{g'(x)}.

Examples

Example 1

Find \lim_{x \to 0}\frac{\sin x}{x}

Since plugging in 0 for x results in \frac{0}{0}, use L'Hôpital's rule to take the derivative of the top and bottom, giving:

\lim_{x \to 0}\frac{\cos x}{1}

Plugging in 0 for x gives 1 here.

Example 2

Find \lim_{x \to 0}x \cot x

First, you need to rewrite the function into an indeterminate limit fraction:

\lim_{x \to 0}\frac{x}{\tan x}

Now it's indeterminate. Take the derivative of the top and bottom:

\lim_{x \to 0}\frac{1}{\sec^{2} x}

Plugging in 0 for x once again gives one.

Example 3

Find \lim_{x \to \infty}\frac{4x+22}{5x+9}

This time, plugging in \infty for x gives you \frac{\infty}{\infty}. You know the drill:

\lim_{x \to \infty}\frac{4}{5}

This time, though, there is no x term left! \frac{4}{5} is the answer.
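These answers can be sanity-checked numerically by evaluating the original quotients near the limit point. A small Python sketch (an illustration added here, not a proof; the sample values and tolerances are arbitrary choices):

```python
import math

# Example 1: sin(x)/x -> 1 as x -> 0.
small = 1e-8
assert abs(math.sin(small)/small - 1.0) < 1e-6

# Example 3: (4x+22)/(5x+9) -> 4/5 as x -> infinity.
large = 1e10
assert abs((4*large + 22)/(5*large + 9) - 0.8) < 1e-6
```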

Example 4

Sometimes, forms exist where it is not intuitively obvious how to solve them. One might think the value  1^\infin = 1 . However, as was noted in the definition of an indeterminate form, this isn't possible to evaluate using the rules learned before now, and we need to use L'Hôpital's rule to solve.

Find  \lim_{x \to \infin} \left( 1 + \frac{1}{x} \right) ^ x

Plugging the value of x into the limit yields

 \lim_{x \to \infin} \left( 1 + \frac{1}{x} \right) ^ x = 1^\infin (indeterminate form).

Let  k = \lim_{x \to \infin} \left( 1 + \frac{1}{x} \right) ^ x = 1^\infin

 \ln k \, =  \lim_{x \to \infin} \ln \left( 1 + \frac{1}{x} \right) ^ x
=  \lim_{x \to \infin} x \ln \left( 1 + \frac{1}{x} \right)
=  \lim_{x \to \infin} \frac{\ln \left( 1 + \frac{1}{x} \right)}{\frac{1}{x}} = \frac{0}{0} (indeterminate form)

We now apply L'Hôpital's rule by taking the derivative of the top and bottom with respect to x.

 \frac{d}{dx} \left[ \ln \left( 1 + \frac{1}{x} \right) \right] = \frac{x}{x+1} \cdot \frac{-1}{x^2} = \frac{-1}{x(x+1)}
 \frac{d}{dx} \left( \frac{1}{x} \right) = \frac{-1}{x^2}

Returning to the expression above

 \ln k \ =  \lim_{x \to \infin} \frac{-(-x^2)}{x(x+1)}
=  \lim_{x \to \infin} \frac{x}{x+1} = \frac{\infin}{\infin} (indeterminate form)

We apply L'Hôpital's rule once again

 \ln k = \lim_{x \to \infin} \frac{1}{1} = 1

Therefore

 k = e \

And

 \lim_{x \to \infin} \left( 1 + \frac{1}{x} \right) ^x = e \ne 1

Careful: this does not prove that  1^\infin = e \ , because other limits of the same form can take different values; for example

 \lim_{x \to \infin} \left( 1 + \frac{2}{x} \right) ^x = e^2 \ne e

even though it is also of the form  1^\infin .
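Both limits can be checked numerically by plugging in a large value of x. A Python sketch (added as an illustration; the choice x = 10^7 and the tolerance are arbitrary):

```python
import math

# Two limits of the form 1^infinity with different values:
# (1 + 1/x)^x -> e, while (1 + 2/x)^x -> e^2.
x = 1e7
assert abs((1 + 1/x)**x - math.e) < 1e-4
assert abs((1 + 2/x)**x - math.e**2) < 1e-4
```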

Exercises

Evaluate the following limits using L'Hôpital's rule:

1. \lim_{x \to 0}\frac{x+\tan x}{\sin x}

2

2. \lim_{x \to \pi}\frac{x-\pi}{\sin x}

-1

3. \lim_{x \to 0}\frac{\sin 3x}{\sin 4x}

\frac{3}{4}

4. \lim_{x \to \infty}\frac{x^5}{e^{5x}}

0

5. \lim_{x \to 0}\frac{\tan x - x}{\sin x - x}

-2

Solutions


<h1> 3.11 Extrema and Points of Inflection</h1>

The four types of extrema.

Maxima and minima are points where a function reaches a highest or lowest value, respectively. There are two kinds of extrema (a word meaning maximum or minimum): global and local, sometimes referred to as "absolute" and "relative", respectively. A global maximum is a point that takes the largest value on the entire range of the function, while a global minimum is the point that takes the smallest value on the range of the function. On the other hand, local extrema are the largest or smallest values of the function in the immediate vicinity.

All extrema look like the crest of a hill or the bottom of a bowl on a graph of the function. A global extremum is always a local extremum too, because it is the largest or smallest value on the entire range of the function, and therefore also its vicinity. It is also possible to have a function with no extrema, global or local: y=x is a simple example.

At any extremum where the function is differentiable, the slope of the graph is zero, as the graph must stop rising or falling and begin to head in the opposite direction. Because of this, extrema are also commonly called stationary points or turning points. The first derivative of a function is therefore zero at each such extremum, so if the graph has one or more of these stationary points, they may be found by setting the first derivative equal to zero and finding the roots of the resulting equation.

The function f(x)=x^3, which contains a point of inflexion at the point (0,0).

However, a slope of zero does not guarantee a maximum or minimum: there is a third class of stationary point called a point of inflexion (also spelled point of inflection). Consider the function

f \left(x \right) = x^3.

The derivative is

f^\prime \left(x \right) = 3 x^2

The slope at x=0 is 0, which makes it a stationary point, but this doesn't mean that it is a maximum or minimum. Looking at the graph of the function you will see that x=0 is neither; it's just a spot at which the function flattens out. True extrema require a sign change in the first derivative. This makes sense: you have to rise (positive slope) to and fall (negative slope) from a maximum. In between rising and falling, on a smooth curve, there will be a point of zero slope, the maximum. A minimum exhibits similar properties, just in reverse.

Good (B and C, green) and bad (D and E, blue) points to check in order to classify the extremum (A, black). The bad points lead to an incorrect classification of A as a minimum.

This leads to a simple method to classify a stationary point - plug x values slightly left and right into the derivative of the function. If the results have opposite signs then it is a true maximum/minimum. You can also use these slopes to figure out if it is a maximum or a minimum: the left side slope will be positive for a maximum and negative for a minimum. However, you must exercise caution with this method, as, if you pick a point too far from the extremum, you could take it on the far side of another extremum and incorrectly classify the point.

The Extremum Test

A more rigorous method to classify a stationary point is called the extremum test, or 2nd Derivative Test. As we mentioned before, the sign of the first derivative must change for a stationary point to be a true extremum. Now, the second derivative of the function tells us the rate of change of the first derivative. It therefore follows that if the second derivative is positive at the stationary point, then the gradient is increasing. The fact that it is a stationary point in the first place means that this can only be a minimum. Conversely, if the second derivative is negative at that point, then it is a maximum.

Now, if the second derivative is zero, we have a problem. It could be a point of inflexion, or it could still be an extremum. Examples of each of these cases are below - all have a second derivative equal to zero at the stationary point in question:

  • y = x^{3} has a point of inflexion at x=0
  • y = x^{4} has a minimum at x=0
  • y =-x^{4} has a maximum at x=0

However, this is not an insoluble problem. What we must do is continue to differentiate until we get, at the (n+1)th derivative, a non-zero result at the stationary point:

f^{\prime} \left(x \right)=0, \,f^{\prime \prime} \left(x \right)=0,\, \ldots ,f^{\left(n\right)} \left(x \right)=0,\,f^{\left(n+1\right)} \left(x \right)\ne 0

If n is odd, then the stationary point is a true extremum. If the (n+1)th derivative is positive, it is a minimum; if the (n+1)th derivative is negative, it is a maximum. If n is even, then the stationary point is a point of inflexion.

As an example, let us consider the function

f \left( x \right) = -x^4

We now differentiate until we get a non-zero result at the stationary point at x=0 (assume we have already found this point as usual):

f^\prime \left( x \right) = -4x^3
f^{\prime \prime} \left( x \right) = -12x^2
f^{\prime \prime \prime} \left( x \right) = -24x
f^{\left(4\right)} \left( x \right) = -24

Therefore, (n+1) is 4, so n is 3. This is odd, and the fourth derivative is negative, so we have a maximum. Note that none of the methods given can tell you if this is a global extremum or just a local one. To do this, you would have to set the function equal to the height of the extremum and look for other roots.
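The extremum test above can be sketched as a short procedure: scan the successive derivatives at the stationary point until one is non-zero, then classify by the parity of n and the sign of that derivative. A Python sketch (the derivatives of f(x) = -x^4 are hard-coded here for illustration):

```python
# Classify the stationary point of f(x) = -x^4 at x = 0 by scanning its
# successive derivatives until the first non-zero one.

derivatives = [
    lambda x: -4*x**3,   # f'
    lambda x: -12*x**2,  # f''
    lambda x: -24*x,     # f'''
    lambda x: -24.0,     # f''''
]

c = 0.0
# order is (n+1) in the text's notation: the order of the first
# derivative that is non-zero at the stationary point.
order = next(i + 1 for i, d in enumerate(derivatives) if d(c) != 0)
n = order - 1

assert order == 4
# n odd means a true extremum; the (n+1)th derivative is negative,
# so x = 0 is a maximum, matching the text.
assert n % 2 == 1 and derivatives[order - 1](c) < 0
```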

Critical Points

Critical points are the points where a function's derivative is 0 or not defined. Suppose we are interested in finding the maximum or minimum of a function that is continuous on a given closed interval. The extreme values of the function on that interval will occur at one or more of the critical points and/or at one or both of the endpoints. We can prove this by contradiction. Suppose that the function f(x) has a maximum at a point x=c in the interval (a,b) where the derivative of the function is defined and not 0. If the derivative is positive, then x values slightly greater than c make the function larger. Since c is not an endpoint, at least some of these values are in [a,b]. But this contradicts the assumption that f(c) is the maximum of f(x) for x in [a,b]. Similarly, if the derivative is negative, then x values slightly less than c make the function larger, and at least some of these values are in [a,b], again a contradiction. A similar argument works for the minimum.

Example 1

Consider the function f(x)=x on the interval [-1,1]. The unrestricted function f(x)=x has no maximum or minimum. On the interval [-1,1], however, it is obvious that the minimum will be -1, which occurs at x=-1, and the maximum will be 1, which occurs at x=1. Since there are no critical points (f'(x) exists and equals 1 everywhere), the extreme values must occur at the endpoints.

Example 2

Find the maximum and minimum of the function f(x)=x^3-2x^2-5x+6 on the interval [-3,3].

First start by finding the roots of the function derivative:
f'(x)=3x^2-4x-5 = 0
x=\frac{2\pm\sqrt{19}}{3}
Now evaluate the function at all critical points and endpoints to find the extreme values.
f(-3)=-24
f(\frac{2-\sqrt{19}}{3})=\frac{56+38\sqrt{19}}{27}\approx 8.2088
f(\frac{2+\sqrt{19}}{3})=\frac{56-38\sqrt{19}}{27}\approx -4.0607
f(3)=0
From this we can see that the minimum on the interval is -24 when x=-3 and the maximum on the interval is \frac{56+38\sqrt{19}}{27} when x=\frac{2-\sqrt{19}}{3}
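The candidate-evaluation step of Example 2 is mechanical and easy to reproduce. A Python sketch (added as a check on the arithmetic; tolerances are arbitrary choices):

```python
import math

# Example 2: f(x) = x^3 - 2x^2 - 5x + 6 on [-3, 3].
# Evaluate f at the endpoints and at the critical points (2 +/- sqrt(19))/3.

def f(x):
    return x**3 - 2*x**2 - 5*x + 6

crit = [(2 - math.sqrt(19))/3, (2 + math.sqrt(19))/3]
candidates = [-3.0] + crit + [3.0]
values = [f(x) for x in candidates]

assert abs(min(values) - (-24.0)) < 1e-9   # minimum, at x = -3
assert abs(max(values) - 8.2088) < 1e-3    # maximum, at x = (2 - sqrt(19))/3
```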

See "Optimization" for a common application of these principles.


<h1> 3.12 Newton's Method</h1>


Newton's Method (also called the Newton-Raphson method) is a recursive algorithm for approximating the root of a differentiable function. We know simple formulas for finding the roots of linear and quadratic equations, and there are also more complicated formulae for cubic and quartic equations. At one time it was hoped that similar formulas would be found for quintic and higher-degree equations, but Niels Henrik Abel showed that no such general formulas exist. The Newton-Raphson method is a method for approximating the roots of polynomial equations of any order. In fact the method works for any equation, polynomial or not, as long as the function is differentiable in a desired interval.

Newton's Method

Let f\left(x\right) be a differentiable function. Select a point x_0 based on a first approximation to the root, arbitrarily close to the function's root. To approximate the root you then recursively calculate using:

x_{n+1} = x_n - \frac{f\left(x_n\right)}{f'\left(x_n\right)}

As you recursively calculate, the x_{n+1}'s often become increasingly better approximations of the function's root.

In order to explain Newton's method, imagine that x_0 is already very close to a zero of f\left(x\right). We know that if we only look at points very close to x_0 then f\left(x\right) looks like its tangent line. If x_0 was already close to the place where f\left(x\right) was zero, and near x_0 we know that f\left(x\right) looks like its tangent line, then we hope the zero of the tangent line at x_0 is a better approximation than x_0 itself.

The equation for the tangent line to f\left(x\right) at x_0 is given by

y=f'\left(x_0\right)\left(x-x_0\right)+f\left(x_0\right).

Now we set y=0 and solve for x.

0=f'\left(x_0\right)\left(x-x_0\right)+f\left(x_0\right)
-f\left(x_0\right)=f'\left(x_0\right)\left(x-x_0\right)
\frac{-f\left(x_0\right)}{f'\left(x_0\right)}=\left(x-x_0\right)
x=\frac{-f\left(x_0\right)}{f'\left(x_0\right)}+x_0

This value of x should be a better guess for the value of x where f\left(x\right)=0. We call this value x_1, and after a little algebra we have

x_1=x_0-\frac{f\left(x_0\right)}{f'\left(x_0\right)}.

If our intuition was correct and x_1 is in fact a better approximation for the root of f\left(x\right), then our logic should apply equally well at x_1. We could look to the place where the tangent line at x_1 is zero. We call this point x_2; following the algebra above we arrive at the formula

x_2=x_1-\frac{f\left(x_1\right)}{f'\left(x_1\right)}.

And we can continue in this way as long as we wish. At each step, if your current approximation is x_n our new approximation will be x_{n+1}=x_n-\frac{f\left(x_n\right)}{f'\left(x_n\right)}.
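The recurrence above translates directly into code. A minimal Python sketch (the starting point, iteration count, and test function are arbitrary choices, and there is no guard here against f'(x_n) = 0, a failure mode discussed in the Notes below):

```python
# Newton's method: repeatedly apply x_{n+1} = x_n - f(x_n)/f'(x_n).

def newtons_method(f, fprime, x0, iterations=50):
    x = x0
    for _ in range(iterations):
        x = x - f(x) / fprime(x)
    return x

# Approximate the positive root of f(x) = x^2 - 2, i.e. sqrt(2).
root = newtons_method(lambda x: x**2 - 2, lambda x: 2*x, x0=1.0)
assert abs(root**2 - 2) < 1e-12
```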


Examples

Find the root of the function  f\left(x\right) = x^2 \ .
Figure 1: A few iterations of Newton's method applied to y=x^2 starting with x_0=4. The blue curve is f\left(x\right). The other solid lines are the tangents at the various iteration points.

\begin{align}x_{0}&=&4\\
x_{1}&=&x_{0}-\frac{f\left(x_{0}\right)}{f'\left(x_{0}\right)}=4-\frac{16}{8}=2\\
x_{2}&=&x_{1}-\frac{f\left(x_{1}\right)}{f'\left(x_{1}\right)}=2-\frac{4}{4}=1\\
x_{3}&=&x_{2}-\frac{f\left(x_{2}\right)}{f'\left(x_{2}\right)}=1-\frac{1}{2}=\frac{1}{2}\\
x_{4}&=&x_{3}-\frac{f\left(x_{3}\right)}{f'\left(x_{3}\right)}=\frac{1}{2}-\frac{1/4}{1}=\frac{1}{4}\\
x_{5}&=&x_{4}-\frac{f\left(x_{4}\right)}{f'\left(x_{4}\right)}=\frac{1}{4}-\frac{1/16}{1/2}=\frac{1}{8}\\
x_{6}&=&x_{5}-\frac{f\left(x_{5}\right)}{f'\left(x_{5}\right)}=\frac{1}{8}-\frac{1/64}{1/4}=\frac{1}{16}\\
x_{7}&=&x_{6}-\frac{f\left(x_{6}\right)}{f'\left(x_{6}\right)}=\frac{1}{16}-\frac{1/256}{1/8}=\frac{1}{32}\end{align}

As you can see x_n is gradually approaching zero (which we know is the root of f\left(x\right)). One can approach the function's root with arbitrary accuracy.

Answer:  f\left(x\right) = x^2 \  has a root at  x = 0 \ .

Notes

Figure 2: Newton's method applied to the function
f\left(x\right) = \begin{cases} \sqrt{x-4}, & \mbox{for}\,\,x \ge 4 \\ -\sqrt{4-x}, & \mbox{for}\,\,x < 4 \end{cases}
starting with x_0=2.

This method fails when f'\left(x\right) = 0. In that case, one should choose a new starting place. Occasionally it may happen that f\left(x\right) = 0 and f'\left(x\right) = 0 have a common root. To detect whether this is true, we should first find the solutions of f'\left(x\right) = 0, and then check the value of f\left(x\right) at these places.

Newton's method also may not converge for every function, take as an example:

f\left(x\right) = \begin{cases} \sqrt{x-r}, & \mbox{for}\,\,x \ge r \\ -\sqrt{r-x}, & \mbox{for}\,\,x < r \end{cases}

For this function, choosing any x_1 = r - h gives x_2 = r + h, and successive approximations alternate back and forth, so no amount of iteration will get us any closer to the root than our first guess.
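This oscillation can be demonstrated directly. A Python sketch (the values of r and h are arbitrary choices):

```python
import math

# f(x) = sqrt(x - r) for x >= r, -sqrt(r - x) for x < r.
# One Newton step from r - h lands on r + h, and the next returns to r - h.

def f(x, r):
    return math.copysign(math.sqrt(abs(x - r)), x - r)

def fprime(x, r):
    # On both branches the derivative is 1 / (2 sqrt(|x - r|)).
    return 1 / (2 * math.sqrt(abs(x - r)))

r, h = 4.0, 0.25
x1 = r - h
x2 = x1 - f(x1, r) / fprime(x1, r)
x3 = x2 - f(x2, r) / fprime(x2, r)
assert abs(x2 - (r + h)) < 1e-12  # jumped to the other side
assert abs(x3 - (r - h)) < 1e-12  # and back again
```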

Figure 3: Newton's method, when applied to the function f\left(x\right)=x^5-x+1 with initial guess x_0=0, eventually iterates between the three points shown above.

Newton's method may also fail to converge on a root if the function has a local maximum or minimum that does not cross the x-axis. As an example, consider f\left(x\right)=x^5-x+1 with initial guess x_0=0. In this case, Newton's method will be fooled by the function, which dips toward the x-axis but never crosses it in the vicinity of the initial guess.

See also


<h1> 3.13 Related Rates</h1>


Introduction

One useful application of derivatives is the calculation of related rates. In a related-rates problem, two or more quantities are linked by an equation, and we are given the rate at which one of them is changing with respect to time. Differentiating the linking equation then lets us find the rate at which a related quantity is changing.

Process for solving related rates problems:

  • Write out any relevant formulas and information.
  • Take the derivative of the primary equation with respect to time.
  • Solve for the desired variable.
  • Plug-in known information and simplify.

Notation

Newton's dot notation is used to show the derivative of a variable with respect to time. That is, if f is a quantity that depends on time, then \dot f=\frac{df}{dt}, where t represents the time. This notation is a useful abbreviation in situations where time derivatives are often used, as is the case with related rates.

Examples

Example 1:

Filling cone with water.png
A cone with a circular base is being filled with water. Find a formula for the rate at which the volume of water is changing.
  • Write out any relevant formulas or pieces of information.
 V = \frac{1}{3} \pi r^2 h
  • Take the derivative of the equation above with respect to time. Remember to use the Chain Rule and the Product Rule.
 V = \frac{1}{3} \pi r^2h
 \dot V = \frac{\pi}{3} \left( r^2\dot h + 2rh\dot r \right)
Answer:  \dot V = \frac{\pi}{3} \left( r^2\dot h + 2rh\dot r \right)

Example 2:

A spherical hot air balloon is being filled with air. The volume is changing at a rate of 2 cubic feet per minute. How is the radius changing with respect to time when the radius is equal to 2 feet?
  • Write out any relevant formulas and pieces of information.
 V_{sphere} = \frac{4}{3} \pi r^3
 \dot V = 2
 r = 2 \
  • Take the derivative of both sides of the volume equation with respect to time.
 V = \frac{4}{3} \pi r^3
 \dot V =  \frac{4}{3}3\pi r^2\dot r
=  4 \pi r^2\dot r
  • Solve for  \dot r.
 \dot r = \frac{1}{4 \pi r^2}\dot V
  • Plug-in known information.
 \dot r = \frac{1}{16 \pi}2
Answer:  \dot r = \frac{1}{8 \pi} ft/min.

Example 3:

An airplane is attempting to drop a box onto a house. The house is 300 feet away in horizontal distance and 400 feet in vertical distance. The rate of change of the horizontal distance with respect to time is the same as the rate of change of the vertical distance with respect to time. How is the distance between the box and the house changing with respect to time at the moment? The rate of change in the horizontal direction with respect to time is -50 feet per second.

Note: Because the vertical distance is decreasing, the rate of change of y is negative. Similarly, the horizontal distance is decreasing, so its rate of change is negative as well (the box is getting closer to the house).

The easiest way to describe the horizontal and vertical relationships of the plane's motion is the Pythagorean Theorem.

  • Write out any relevant formulas and pieces of information.
 x^2 + y^2 = s^2 \ (where s is the distance between the plane and the house)
 x = 300 \
 y = 400 \
 s = \sqrt{x^2 + y^2} = \sqrt{300^2 + 400^2} = 500 \
 \dot x = \dot y = -50
  • Take the derivative of both sides of the distance formula with respect to time.
 x^2 + y^2 = s^2 \
 2x\dot x + 2y\dot y = 2s\dot s
  • Solve for  \dot s.
 \dot s = \frac{1}{2s} \left( 2x\dot x + 2y\dot y \right) = \frac{x\dot x + y\dot y}{s}

  • Plug-in known information
 \dot s =  \frac{(300)(-50) + (400)(-50)}{(500)}
=  \frac{-35000}{500}
=  -70 \ ft/s
Answer:  \dot s = -70 ft/sec.
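The plug-in step of Example 3 can be reproduced directly. A Python sketch (the numbers come from the problem statement above):

```python
import math

# Example 3: x = 300 ft, y = 400 ft, and both rates are -50 ft/s.
x, y = 300.0, 400.0
s = math.sqrt(x**2 + y**2)
x_dot = y_dot = -50.0

# s_dot = (x*x_dot + y*y_dot) / s, from differentiating x^2 + y^2 = s^2.
s_dot = (x * x_dot + y * y_dot) / s
assert s == 500.0
assert s_dot == -70.0
```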

Example 4:

Sand falls onto a cone shaped pile at a rate of 10 cubic feet per minute. The radius of the pile's base is always 1/2 of its altitude. When the pile is 5 ft deep, how fast is the altitude of the pile increasing?
  • Write down any relevant formulas and information.
 V = \frac{1}{3} \pi r^2 h
 \dot V = 10
 r = \frac{1}{2} h \
 h = 5 \

Substitute  r = \frac{1}{2} h into the volume equation.

 V \ =  \frac{1}{3} \pi r^2 h
=  \frac{1}{3} \pi h( \frac{h^2}{4})
=  \frac{1}{12} \pi h^3
  • Take the derivative of the volume equation with respect to time.
 V = \frac{1}{12} \pi h^3
 \dot V = \frac{1}{4} \pi h^2\dot h
  • Solve for  \dot h.
 \dot h = \frac{4}{\pi h^2}\dot V
  • Plug-in known information and simplify.
 \dot h =  \frac{4}{\pi (5)^2}10
=  \frac{8}{5 \pi} ft/min
Answer:  \dot h = \frac{8}{5 \pi} ft/min.

Example 5:

A 10 ft long ladder is leaning against a vertical wall. The foot of the ladder is being pulled away from the wall at a constant rate of 2 ft/sec. When the ladder is exactly 8 ft from the wall, how fast is the top of the ladder sliding down the wall?
  • Write out any relevant formulas and information.

Use the Pythagorean Theorem to describe the motion of the ladder.

 x^2 + y^2 = l^2 \ (where l is the length of the ladder)
 l = 10 \
 \dot x = 2
 x = 8 \
 y = \sqrt{l^2 - x^2} = \sqrt{100-64} = \sqrt{36} = 6 \
  • Take the derivative of the equation with respect to time.
 2x\dot x + 2y\dot y = 0 ( l is constant so  \frac{dl^2}{dt} = 0 .)
  • Solve for  \dot y .
 2x\dot x + 2y\dot y = 0
 2y\dot y = -2x\dot x
 \dot y = - \frac{x}{y}\dot x
  • Plug-in known information and simplify.
 \dot y =  \left( - \frac{8}{6} \right) (2)
=  - \frac{8}{3} ft/sec
Answer:  \dot y = -\frac{8}{3} ft/sec.

Exercises

1. A spherical balloon is inflated at a rate of 100 ft^3/min. Assuming the rate of inflation remains constant, how fast is the radius of the balloon increasing at the instant the radius is 4 ft?

\frac{25}{16\pi} \frac{ft}{min}

2. Water is pumped from a cone shaped reservoir (the vertex is pointed down) 10 ft in diameter and 10 ft deep at a constant rate of 3 ft^3/min. How fast is the water level falling when the depth of the water is 6 ft?

\frac{1}{3\pi} \frac{ft}{min}

3. A boat is pulled into a dock via a rope with one end attached to the bow of a boat and the other wound around a winch that is 2ft in diameter. If the winch turns at a constant rate of 2rpm, how fast is the boat moving toward the dock?

4\pi\frac{ft}{min}

4. At time t=0 a pump begins filling a cylindrical reservoir with radius 1 meter at a rate of e^{-t} cubic meters per second. At what time is the liquid height increasing at 0.001 meters per second?

t=-\ln(.001\pi)

Solutions


<h1> 3.14 Optimization</h1>


Introduction

Optimization is one of the uses of calculus in the real world. Suppose we run a pizza parlor and wish to maximize profit, or we have a flat piece of cardboard and need to make a box with the greatest possible volume. How does one go about such tasks?

Obviously, this requires the use of maxima and minima. We know that we find maxima and minima via derivatives. Therefore, one can conclude that calculus will be a useful tool for maximizing or minimizing (collectively known as "optimizing") a situation.

Examples

Volume Example

A box manufacturer desires to create a closed box with a surface area of 100 square inches, with a square base, square top, and rectangular sides. What is the maximum volume that can be formed by bending this material into a box?

  • Write out known formulas and information
 A_{base} = x^2 \
 A_{side} = x \cdot h \
 A_{total} = 2x^2 + 4x \cdot h = 100
 V = l \cdot w \cdot h = x^2 \cdot h
  • Eliminate the variable h in the volume equation
 2x^2 + 4xh = 100 \
 x^2 + 2xh = 50 \
 2xh = 50 - x^2 \
 h = \frac{50 - x^2}{2x}
 V \ =  (x^2) \left( \frac{50 - x^2}{2x} \right)
=  \frac{1}{2} (50x - x^3)
  • Find the derivative of the volume equation in order to maximize the volume
 \frac{dV}{dx} = \frac{1}{2} (50-3x^2)
  • Set  \frac{dV}{dx} = 0 and solve for  x \
 \frac{1}{2} (50-3x^2) = 0
 50-3x^2 = 0 \
 3x^2 = 50 \
 x = \pm \sqrt{\frac{50}{3}} (only the positive root is meaningful, since x is a length)
  • Plug the x value into the volume equation and simplify
 V \ =  \frac{1}{2} \left[ 50 \cdot \sqrt{\frac{50}{3}} - \left( \sqrt{\frac{50}{3}} \right) ^3 \right]
= 68.04138174\ldots
Answer:  V_{max} \approx 68.04 cubic inches.
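
As a sanity check on the algebra, the sketch below (an illustrative addition) evaluates V at the critical point x = \sqrt{50/3} and compares it against a brute-force scan of the feasible range:

```python
def volume(x):
    # V(x) = (1/2)(50x - x^3), after eliminating h
    return 0.5 * (50 * x - x**3)

x_opt = (50 / 3) ** 0.5        # critical point from dV/dx = 0
v_opt = volume(x_opt)

# brute-force scan over 0 < x < sqrt(50), where the volume is positive
v_best = max(volume(i / 1000) for i in range(1, 7072))
print(x_opt, v_opt, v_best)
```

No grid point beats the critical point, consistent with x = \sqrt{50/3} being the maximizer.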

Volume Example II


It is desired to make an open-top box of greatest possible volume from a square piece of tin whose side is \alpha, by cutting equal squares out of the corners and then folding up the tin to form the sides. What should be the length of a side of the squares cut out?

If we call the side length of the cut out squares x, then each side of the base of the folded box is \alpha-2x, and the height is x. Therefore, the volume function is V(x)=x(\alpha-2x)^2=x(\alpha^2-4\alpha x+4x^2)=\alpha^2 x - 4\alpha x^2 + 4x^3.

We must optimize the volume by taking the derivative of the volume function and setting it equal to 0. Since it does not change, \alpha is treated as a constant, not a variable.

V(x)=\alpha^2 x - 4\alpha x^2 + 4x^3

V'(x)=\alpha^2 - 8\alpha x + 12x^2

0=12x^2 - 8\alpha x + \alpha^2

We can now use the quadratic formula to solve for x:

x=\frac{-b \pm \sqrt{b^2 - 4ac}}{2a}

x=\frac{-(-8\alpha) \pm \sqrt{(-8\alpha)^2 - 4(12)(\alpha^2)}}{2(12)}

x=\frac{8\alpha \pm \sqrt{64\alpha^2 - 48\alpha^2}}{24}

x=\frac{8\alpha \pm \sqrt{16\alpha^2}}{24}

x=\frac{8\alpha \pm 4\alpha}{24}

x=\frac{\alpha}{6}, \frac{\alpha}{2}

We reject x=\frac{\alpha}{2}, since it is a minimum (it results in the base length \alpha-2x being 0, making the volume 0). Therefore, the answer is x=\frac{\alpha}{6}.
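
The conclusion can be spot-checked for a concrete sheet size; in the sketch below the value \alpha = 6 is an arbitrary choice made only for illustration:

```python
alpha = 6.0                          # side of the square tin sheet (arbitrary choice)

def volume(x):
    # V(x) = x * (alpha - 2x)^2 for cut size x
    return x * (alpha - 2 * x) ** 2

x_opt = alpha / 6                    # claimed optimal cut size

# grid search over the feasible cuts 0 <= x <= alpha/2
step = (alpha / 2) / 10000
x_best = max((i * step for i in range(10001)), key=volume)
print(x_opt, x_best, volume(x_opt))
```

The grid search lands on the same cut size as the calculus argument, x = \alpha/6.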

Sales Example


A small retailer can sell n units of a product for a revenue of r(n)=8.1n and at a cost of c(n)=n^3-7n^2+18n, with all amounts in thousands. How many units should it sell to maximize its profit?

The retailer's profit is defined by the equation p(n)=r(n) - c(n), the revenue generated less the cost. The question asks for the maximum profit, i.e. the maximum of this function. As previously discussed, the maxima and minima of a graph occur where the slope of the graph equals zero, so we find the derivative of p(n). By the subtraction rule, p'(n)=r'(n) - c'(n):

p(n)\, = r(n) - c(n)\,
p'(n)\, = \frac{d}{dn}\left[8.1n\right]-\frac{d}{dn}\left[n^3-7n^2+18n\right]\,
= -3n^2+14n-9.9\,

Therefore, when -3n^2+14n-9.9\,=0 the profit will be maximized or minimized. Use the quadratic formula to find the roots, giving n\approx 0.869 and n\approx 3.798. To determine which of these is the maximum and which the minimum, test the function:

p(0.869) = - 3.97321, p(3.798) = 8.58802

Because we only consider the function for n \ge 0 (i.e., one cannot sell n=-5 units), the only points that can be minima or maxima are the two listed above. To show that n\approx 3.798 is in fact a maximum (and that the function does not remain constant past this point), check whether the sign of p'(n) changes there. It does, and for n greater than 3.798, p'(n) remains negative, so the profit keeps decreasing. Finally, this shows that this retailer maximizes its profit by selling 3,798 units (n=3.798 thousand), returning a profit of $8,588.02.
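
Since the arithmetic here is easy to mis-key, the following sketch (our own addition) recomputes the critical points and the resulting profit numerically:

```python
import math

def profit(n):
    # p(n) = r(n) - c(n) = 8.1n - (n^3 - 7n^2 + 18n), in thousands
    return 8.1 * n - (n**3 - 7 * n**2 + 18 * n)

# critical points: p'(n) = -3n^2 + 14n - 9.9 = 0, via the quadratic formula
a, b, c = -3.0, 14.0, -9.9
disc = math.sqrt(b**2 - 4 * a * c)
roots = sorted([(-b + disc) / (2 * a), (-b - disc) / (2 * a)])
print(roots, profit(roots[1]))   # roughly [0.869, 3.798] and 8.588
```

The larger root gives a profit of about 8.588 thousand dollars, matching the $8,588.02 quoted in the text.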


<h1> 3.15 Euler's Method</h1>


Euler's Method is a method for estimating the value of a function based upon the values of that function's first derivative.

The general algorithm for finding a value of  y(x) \ is:

 y_{n+1} = y_n + \Delta x_{step} \cdot f(x_n,y_n), \

where f is y'(x). In other words, the new value, y_{n+1}, is the sum of the old value y_n and the step size \Delta x_{step} times the change, f(x_n,y_n).

You can think of the algorithm as a person traveling with a map: Now I am standing here and based on these surroundings I go that way 1 km. Then, I check the map again and determine my direction again and go 1 km that way. I repeat this until I have finished my trip.

The Euler method is mostly used to solve differential equations of the form

 
 y' = f(x,y), \quad y(x_0) = y_0.


Examples

A simple example is to solve the equation:


 
 y' = x + y, \quad y(0) = 1.

This yields f = y' = x + y and hence, the updating rule is:

 
 y_{n+1} = y_n + 0.1\,(x_n + y_n).

Step size \Delta x_{step} = 0.1 is used here.

The easiest way to keep track of the successive values generated by the algorithm is to draw a table with columns for  n, x_n, y_n, y_{n+1} \ .

The above equation could, for example, be a population model, where y is the population size and x a disease that is affecting the population.
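
The updating rule is straightforward to implement. The sketch below (an illustrative addition) runs the example with step size 0.1 for ten steps to estimate y(1), and compares the estimate against the exact solution y = 2e^x - x - 1 of this initial value problem, which can be verified by substitution:

```python
import math

def euler(f, x0, y0, step, n):
    # repeatedly apply y_{n+1} = y_n + step * f(x_n, y_n)
    x, y = x0, y0
    for _ in range(n):
        y += step * f(x, y)
        x += step
    return y

approx = euler(lambda x, y: x + y, 0.0, 1.0, 0.1, 10)  # estimate of y(1)
exact = 2 * math.exp(1.0) - 1.0 - 1.0                  # y = 2e^x - x - 1 at x = 1
print(approx, exact)
```

The Euler estimate undershoots the exact value, and shrinking the step size brings the two closer together, as expected for a first-order method.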


<h1> 3.16 Applications of Derivatives Cumulative Exercises</h1>


Relative Extrema

Find the relative maximum(s) and minimum(s), if any, of the following functions.

1.  f(x) = \frac{x}{x+1} \,

none

2.  f(x) = (x-1)^{2/3} \,

Minimum at the point (1,0)

3.  f(x) = x^2 + \frac{2}{x} \,

Relative minimum at x=1

4.  f(s) = \frac{s}{1+s^2} \,

Relative minimum at s=-1
Relative maximum at s=1

5.  f(x) =  x^2 - 4x + 9 \,

Relative minimum at x=2

6.  f(x) = \frac{x^2 + x +1}{x^2 -x +1} \,

Relative minimum at x=-1
Relative maximum at x=1

Solutions

Range of Function

7. Show that the expression x+ 1/x cannot take on any value strictly between -2 and 2.

f(x)=x+\frac{1}{x}
f'(x)=1-\frac{1}{x^{2}}
1-\frac{1}{x^{2}}=0\implies x=\pm1
f^{\prime\prime}(x)=\frac{2}{x^{3}}
f^{\prime\prime}(-1)=-2
Since f^{\prime\prime}(-1) is negative, x=-1 corresponds to a relative maximum.
f(-1)=-2
\lim\limits _{x\to-\infty}f(x)=-\infty
For x<-1, f'(x) is positive, which means that the function is increasing. Coming from very negative x-values, f increases from a very negative value to reach a relative maximum of -2 at x=-1.
For -1<x<1, f'(x) is negative, which means that the function is decreasing.
\lim_{x\to0^{-}}f(x)=-\infty
\lim_{x\to0^{+}}f(x)=+\infty
f^{\prime\prime}(1)=2
Since f^{\prime\prime}(1) is positive, x=1 corresponds to a relative minimum.
f(1)=2
On [-1,0) the function decreases from -2 toward -\infty; then, just to the right of 0, it comes down from +\infty and decreases until it reaches a relative minimum of 2 at x=1.
For x>1, f'(x) is positive, so the function increases from a minimum of 2.
The above analysis shows that there is a gap in the function's range between -2 and 2.

Absolute Extrema

Determine the absolute maximum and minimum of the following functions on the given domain

8.  f(x) = \frac{1}{3}x^3 - \frac{1}{2}x^2 + 1 on [0,3]

Maximum at (3,\frac{11}{2}); minimum at (1,\frac{5}{6})

9.  f(x) = (\frac{4}{3}x^2 -1)x on [-\frac{1}{2},2]

Maximum at (2,\frac{26}{3}); minimum at (\frac{1}{2},-\frac{1}{3})

Solutions

Determine Intervals of Change

Find the intervals where the following functions are increasing or decreasing

10. f(x)=10-6x-2x^2

Increasing on (-\infty,-\frac{3}{2}); decreasing on (-\frac{3}{2},+\infty)

11. f(x)=2x^3-12x^2+18x+15

Decreasing on (1,3); increasing elsewhere

12. f(x)=5+36x+3x^2-2x^3

Increasing on (-2,3); decreasing elsewhere

13. f(x)=8+36x+3x^2-2x^3

Increasing on (-2,3); decreasing elsewhere

14. f(x)=5x^3-15x^2-120x+3

Decreasing on (-2,4); increasing elsewhere

15. f(x)=x^3-6x^2-36x+2

Decreasing on (-2,6); increasing elsewhere

Solutions

Determine Intervals of Concavity

Find the intervals where the following functions are concave up or concave down

16. f(x)=10-6x-2x^2

Concave down everywhere

17. f(x)=2x^3-12x^2+18x+15

Concave down on (-\infty,2); concave up on (2,+\infty)

18. f(x)=5+36x+3x^2-2x^3

Concave up on (-\infty,\frac{1}{2}); concave down on (\frac{1}{2},+\infty)

19. f(x)=8+36x+3x^2-2x^3

Concave up on (-\infty,\frac{1}{2}); concave down on (\frac{1}{2},+\infty)

20. f(x)=5x^3-15x^2-120x+3

Concave down on (-\infty,1); concave up on (1,+\infty)

21. f(x)=x^3-6x^2-36x+2

Concave down on (-\infty,2); concave up on (2,+\infty)

Solutions

Word Problems

22. You peer around a corner. A velociraptor 64 meters away spots you. You run away at a speed of 6 meters per second. The raptor chases, running towards the corner you just left at a speed of 4t meters per second (time t measured in seconds after spotting). After you have run 4 seconds the raptor is 32 meters from the corner. At this time, how fast is death approaching your soon to be mangled flesh? That is, what is the rate of change in the distance between you and the raptor?

10 m/s

23. Two bicycles leave an intersection at the same time. One heads north going 12 mph and the other heads east going 5 mph. How fast are the bikes getting away from each other after one hour?

13 mph

24. You're making a can of volume 200 m^3 with a gold side and silver top/bottom. Say gold costs 10 dollars per m^2 and silver costs 1 dollar per m^2. What's the minimum cost of such a can?

$878.76

Solutions

Graphing Functions

For each of the following, graph a function that abides by the provided characteristics

25. f(1)= f(-2) = 0, \; \lim_{x\to \infty} f(x) = \lim_{x\to -\infty} f(x) = 0, \; \mbox{ vertical asymptote at } x=-3, \; f'(x)>0 \mbox{ on } (0,2),  f'(x)<0 \mbox{ on } (-\infty,-3)\cup(-3,0)\cup(2,\infty),\; f''(x)>0 \mbox{ on } (-3,1)\cup (3,\infty),\; f''(x)<0 \mbox{ on } (-\infty,-3)\cup(1,3).
There are many functions that satisfy all the conditions. Here is one example:
(Example graph: Calculus graphing exercise 1.png)
26. f \mbox{ has domain } [-1,1], \; f(-1) = -1, \; f(-\frac{1}{2}) = -2,\; f'(-\frac{1}{2}) = 0,\; f''(x)>0 \mbox{ on } (-1,1)
There are many functions that satisfy all the conditions. Here is one example:
(Example graph: Calculus graphing exercise 2.png)

Solutions


Integration

Basics of Integration

<h1> 4.1 Definite Integral</h1>


Suppose we are given a function and would like to determine the area underneath its graph over an interval. We could guess, but how could we figure out the exact area? Below, using a few clever ideas, we actually define such an area and show that by using what is called the definite integral we can indeed determine the exact area underneath a curve.

Definition of the Definite Integral

Figure 1: Approximation of the area under the curve f(x) from x=x_0 to x=x_4.
Figure 2: Rectangle approximating the area under the curve from x_2 to x_3 with sample point x_3^*.

The rough idea of defining the area under the graph of f is to approximate this area with a finite number of rectangles. Since we can easily work out the area of the rectangles, we get an estimate of the area under the graph. If we use a larger number of smaller-sized rectangles we expect greater accuracy with respect to the area under the curve and hence a better approximation. Somehow, it seems that we could use our old friend from differentiation, the limit, and "approach" an infinite number of rectangles to get the exact area. Let's look at such an idea more closely.

Suppose we have a function f that is positive on the interval [a,b] and we want to find the area S under f between a and b. Let's pick an integer n and divide the interval into n subintervals of equal width (see Figure 1). As the interval [a,b] has width b-a, each subinterval has width  \Delta x = \frac{b-a}{n}. We denote the endpoints of the subintervals by x_0,x_1,\ldots,x_n. This gives us

 x_i = a + i \Delta x \mbox{ for } i=0,1,\ldots, n.\,
Figure 3: Riemann sums with an increasing number of subdivisions yielding better approximations.

Now for each i=1,\ldots,n pick a sample point x_i^* in the interval [x_{i-1},x_{i}]\! and consider the rectangle of height f(x_i^*) and width \Delta x (see Figure 2). The area of this rectangle is f(x_i^*)\Delta x. By adding up the area of all the rectangles for i=1,\ldots,n we get that the area S is approximated by

 A_n= f(x_1^*) \Delta x + f(x_2^*) \Delta x + \cdots + f(x_n^*) \Delta x.

A more convenient way to write this is with summation notation:

 A_n = \sum_{i=1}^{n} f(x_i^*)\Delta x.

For each number n we get a different approximation. As n gets larger the rectangles get narrower, which yields a better approximation (see Figure 3). Taking the limit of A_n as n tends to infinity gives the area S.

Definition of the Definite Integral
Suppose f is a continuous function on [a,b] and  \Delta x=\frac{b-a}{n}. Then the definite integral of f between a and b is

\int_{a}^{b} f(x)\ dx = \lim_{n \to \infty} A_n= \lim_{n \to \infty} \sum_{i=1}^{n} f(x_i^*) \Delta x,
where x_i^* are any sample points in the interval [x_{i-1},x_{i}] and x_k=a+k\cdot\Delta x for k=0,\dots,n.

It is a fact that if f is continuous on [a,b] then this limit always exists and does not depend on the choice of the points x_i^*\in[x_{i-1},x_{i}]. For instance they may be evenly spaced, or distributed arbitrarily throughout the interval. The proof of this is technical and is beyond the scope of this section.

Notation

When considering the expression \int_{a}^{b} f(x)\ dx (read "the integral from a to b of f(x) dx"), the function f is called the integrand and the interval [a,b] is the interval of integration. Also a is called the lower limit and b the upper limit of integration.
Figure 4: The integral gives the signed area under the graph.

One important feature of this definition is that we also allow functions which take negative values. If f(x)<0 for all x then f(x_i^*)<0 so f(x_i^*)\Delta x<0, and the definite integral of f will be strictly negative. More generally, if f takes on both positive and negative values then \int_a^b f(x)dx will be the area under the positive part of the graph of f minus the area above the negative part of the graph (see Figure 4). For this reason we say that \int_a^b f(x) dx is the signed area under the graph.

Independence of Variable

It is important to notice that the variable x did not play an important role in the definition of the integral. In fact we can replace it with any other letter, so the following are all equal:

 \int_a^b f(x) dx = \int_a^b f(t) dt=\int_a^b f(u) du = \int_a^b f(w) dw.

Each of these is the signed area under the graph of f between a and b. Such a variable is often referred to as a dummy variable or a bound variable.

Left and Right Handed Riemann Sums

Figure 5: Right-handed Riemann sum
Figure 6: Left-handed Riemann sum

The following methods are sometimes referred to as L-RAM and R-RAM, RAM standing for "Rectangular Approximation Method."

We could have decided to choose all our sample points x_i^* to be on the right hand side of the interval [x_{i-1},x_{i}] (see Figure 5). Then x_i^*=x_{i} for all i and the approximation that we called A_n for the area becomes

 A_n = \sum_{i=1}^{n} f(x_{i})\Delta x.

This is called the right-handed Riemann sum, and the integral is the limit

 \int_{a}^{b} f(x)\ dx = \lim_{n \to \infty} A_n= \lim_{n \to \infty} \sum_{i=1}^{n} f(x_i) \Delta x.

Alternatively we could have taken each sample point on the left hand side of the interval. In this case x_i^*=x_{i-1} (see Figure 6) and the approximation becomes

 A_n = \sum_{i=1}^{n} f(x_{i-1})\Delta x.

Then the integral of f is

 \int_{a}^{b} f(x)\ dx = \lim_{n \to \infty} A_n= \lim_{n \to \infty} \sum_{i=1}^{n} f(x_{i-1}) \Delta x.

The key point is that, as long as f is continuous, these two definitions give the same answer for the integral.

Examples

Example 1
In this example we will calculate the area under the curve given by the graph of f(x) = x for x between 0 and 1. First we fix an integer n and divide the interval [0,1] into n subintervals of equal width. So each subinterval has width

\Delta x = \frac{1}{n}.

To calculate the integral we will use the right-handed Riemann sum. (We could have used the left-handed sum instead, and this would give the same answer in the end). For the right-handed sum the sample points are

x_i^* = 0 + i\Delta x = \frac{i}{n} \quad i=1,\ldots,n

Notice that f(x_i^*) = x_i^* = \frac{i}{n}. Putting this into the formula for the approximation,

A_n = \sum_{i=1}^n f(x_{i}^*) \Delta x = \sum_{i=1}^n f\left(\frac{i}{n}\right)\Delta x = \sum_{i=1}^n \frac{i}{n} \cdot \frac{1}{n} = \frac{1}{n^2} \sum_{i=1}^n i.

Now we use the formula

 \sum_{i=1}^n i = \frac{n(n+1)}{2}

to get

 A_n = \frac{1}{n^2} \frac{n(n+1)}{2} = \frac{n+1}{2n}.

To calculate the integral of f between 0 and 1 we take the limit as n tends to infinity,

 \int_0^1 f(x) dx = \lim_{n\to \infty} \frac{n+1}{2n} = \frac{1}{2}.
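
The closed form A_n = (n+1)/(2n) can be checked against the sum computed term by term; a short sketch (our own addition):

```python
def right_sum(n):
    # right-handed Riemann sum for f(x) = x on [0, 1]
    dx = 1.0 / n
    return sum((i * dx) * dx for i in range(1, n + 1))

for n in (10, 100, 1000):
    print(n, right_sum(n), (n + 1) / (2 * n))   # the two columns agree
```

As n grows the values 0.55, 0.505, 0.5005, ... visibly approach the limit 1/2.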

Example 2
Next we show how to find the integral of the function f(x) =x^2 between x=a and x=b. This time the interval [a,b] has width b-a so

\Delta x = \frac{b-a}{n}.

Once again we will use the right-handed Riemann sum. So the sample points we choose are

x_i^* = a + i\Delta x = a + \frac{i(b-a)}{n}.

Thus

A_n\, = \sum_{i=1}^n f(x_{i}^*) \Delta x
=\sum_{i=1}^n f\left(a+\frac{(b-a)i}{n}\right)\Delta x
=\frac{b-a}{n} \sum_{i=1}^n \left(a+\frac{(b-a)i}{n}\right)^2
=\frac{b-a}{n} \sum_{i=1}^n \left( a^2 + \frac{2a(b-a)i}{n} + \frac{(b-a)^2i^2}{n^2} \right)

We have to calculate each piece on the right hand side of this equation. For the first two,

\sum_{i=1}^n a^2 = a^2 \sum_{i=1}^n 1 = na^2
\sum_{i=1}^n \frac{2a(b-a)i}{n} = \frac{2a(b-a)}{n} \sum_{i=1}^n i = \frac{2a(b-a)}{n}\cdot \frac{n(n+1)}{2}.

For the third sum we use the formula

 \sum_{i=1}^n i^2 = \frac{n(n+1)(2n+1)}{6}

to get

 \sum_{i=1}^n \frac{(b-a)^2i^2}{n^2} = \frac{(b-a)^2}{n^2}\frac{n(n+1)(2n+1)}{6}.

Putting this together

 A_n = \frac{b-a}{n} \left(na^2 + \frac{2a(b-a)}{n}\cdot \frac{n(n+1)}{2} + \frac{(b-a)^2}{n^2}\frac{n(n+1)(2n+1)}{6}\right).

Taking the limit as n tends to infinity gives

\int_a^b x^2 dx = (b-a)\left(a^2 + a(b-a) + \frac{1}{3}(b-a)^2\right)
=(b-a)\left( a^2 + ab - a^2 + \frac{1}{3}(b^2 - 2ab + a^2)\right)
=\frac{1}{3}(b-a)(b^2+ab+a^2)
=\frac{1}{3}(b^3-a^3).
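
As a check on the algebra, the formula \int_a^b x^2 dx = \frac{1}{3}(b^3-a^3) can be compared with a right-handed Riemann sum for a large n; in this sketch (our own addition) the choice a = 1, b = 3 is arbitrary:

```python
def right_sum_sq(a, b, n):
    # right-handed Riemann sum for f(x) = x^2 on [a, b]
    dx = (b - a) / n
    return sum((a + i * dx) ** 2 for i in range(1, n + 1)) * dx

a, b = 1.0, 3.0
exact = (b**3 - a**3) / 3          # = 26/3
print(right_sum_sq(a, b, 10000), exact)
```

With 10,000 subintervals the sum agrees with 26/3 to about three decimal places.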

Exercises

1. Use left- and right-handed Riemann sums with 5 subdivisions to get lower and upper bounds on the area under the function f(x)=x^6 from x=0 to x=1.

Lower bound: 0.062592
Upper bound: 0.262592

2. Use left- and right-handed Riemann sums with 5 subdivisions to get lower and upper bounds on the area under the function f(x)=x^6 from x=1 to x=2.

Lower bound: 12.460992
Upper bound: 25.060992

Solutions

Basic Properties of the Integral

From the definition of the integral we can deduce some basic properties. For all the following rules, suppose that f and g are continuous on [a,b].

The Constant Rule

Constant Rule

 \int_a^b c f(x) dx = c \int_a^b f(x) dx.

When f is positive, the height of the function cf at a point x is c times the height of the function f. So the area under cf between a and b is c times the area under f. We can also give a proof using the definition of the integral, using the constant rule for limits,

\int_a^b c \, f(x) dx = \lim_{n\to \infty} \sum_{i=1}^n cf(x_i^*)\Delta x=c\lim_{n\to \infty} \sum_{i=1}^n f(x_i^*)\Delta x=c\int_a^b f(x)dx.

Example

We saw in the previous section that

\int_0^1 x dx = \frac{1}{2}.

Using the constant rule we can use this to calculate that

 \int_0^1 3x dx = 3\int_0^1 x dx = 3\cdot\frac{1}{2} = \frac{3}{2},
 \int_0^1 -7x dx = -7\int_0^1 x dx = (-7)\cdot\frac{1}{2} = -\frac{7}{2}.

Example

We saw in the previous section that

 \int_a^b x^2 dx = \frac{1}{3}(b^3-a^3).

We can use this and the constant rule to calculate that

 \int_1^3 2x^2 dx = 2\int_1^3 x^2 dx = 2\cdot\frac{1}{3}\cdot(3^3-1^3) = \frac{2}{3}(27-1) = \frac{52}{3}.


There is a special case of this rule used for integrating constants:

Integrating Constants

If c is constant then  \int_a^b c \, dx = c \, (b-a).

When c>0 and a<b this integral is the area of a rectangle of height c and width b-a which equals c(b-a).

Example

\int_1^3 9 dx = 9(3-1)=9\cdot 2 = 18.
\int_{-2}^6 11 dx = 11(6-(-2))=11\cdot 8 = 88.
\int_{2}^{17} 0 dx = 0\cdot(17-2) =0.

The addition and subtraction rule

Addition and Subtraction Rules of Integration
 \int_a^b (f(x) + g(x)) dx = \int_a^b f(x) dx + \int_a^b g(x) dx.

 \int_a^b (f(x) - g(x)) dx = \int_a^b f(x) dx - \int_a^b g(x) dx.

As with the constant rule, the addition rule follows from the addition rule for limits:

\int_a^b (f(x)+g(x)) dx = \lim_{n\to \infty} \sum_{i=1}^n \left(f(x_i^*) + g(x_i^*)\right)\Delta x
= \lim_{n\to \infty} \sum_{i=1}^n f(x_i^*)\Delta x+\lim_{n\to \infty} \sum_{i=1}^n g(x_i^*)\Delta x
= \int_a^b f(x)dx+\int_a^b g(x)dx.

The subtraction rule can be proved in a similar way.

Example

From above \int_1^3 9 dx =  18 and \int_1^3 2x^2 dx = \frac{52}{3} so

 \int_1^3 (2x^2 + 9)dx = \int_1^3 2x^2 dx + \int_1^3 9 dx = \frac{52}{3} + 18 = \frac{106}{3},
 \int_1^3 (2x^2 - 9)dx = \int_1^3 2x^2 dx - \int_1^3 9 dx = \frac{52}{3} - 18 = -\frac{2}{3}.

Example

\int_0^2 (4x^2 + 14) dx = 4\int_0^2 x^2 dx + \int_0^2 14 dx = 4 \cdot \frac{1}{3}(2^3-0^3) + 2 \cdot 14 = \frac{32}{3} + 28 = \frac{116}{3}.

Exercise

3. Use the subtraction rule to find the area between the graphs of f(x)=x and g(x)=x^2 between x=0 and x=1.

\frac{1}{6}

Solution

The Comparison Rule

Figure 7: Bounding the area under f(x) on [a,b]

Comparison Rule

  • Suppose f(x)\ge 0 for all x in [a,b]. Then
 \int_a^b f(x) dx \ge 0.
  • Suppose f(x)\ge g(x) for all x in [a,b]. Then
 \int_a^b f(x) dx \ge \int_a^b g(x) dx.
  • Suppose M\ge f(x)\ge m for all x in [a,b]. Then
 M(b-a)\ge \int_a^b f(x) dx \ge m(b-a).

If f(x)\ge 0 then each of the rectangles in the Riemann sum to calculate the integral of f will be above the y axis, so the area will be non-negative. If f(x)\ge g(x) then f(x)-g(x)\ge 0 and by the first property we get the second property. Finally if M\ge f(x)\ge m then the area under the graph of f will be greater than the area of rectangle with height m and less than the area of the rectangle with height M (see Figure 7). So

M(b-a)=\int_a^b M\,dx \ge \int_a^b f(x) dx \ge \int_a^b m\,dx = m(b-a).

Additivity with respect to endpoints

Additivity with respect to endpoints

Suppose a<c<b. Then

 \int_a^b f(x) dx = \int_a^c f(x) dx + \int_c^b f(x) dx.

Again suppose that f is positive. Then this property should be interpreted as saying that the area under the graph of f between a and b is the area between a and c plus the area between c and b (see Figure 8).

Figure 8: Illustration of the property of additivity with respect to endpoints

Extension of Additivity with respect to limits of integration
When a=b we have that \Delta x=\frac{b-a}{n}=0 so

\int_a^a f(x) dx = 0.

Also in defining the integral we assumed that a<b. But the definition makes sense even when b<a, in which case \Delta x = \frac{1}{n}(b-a) has changed sign. This gives

\int_b^a f(x) dx = -\int_a^b f(x) dx.

With these definitions,

 \int_a^b f(x) dx = \int_a^c f(x) dx + \int_c^b f(x) dx
whatever the order of a,b,c.

Exercise

4. Use the results of exercises 1 and 2 and the property of additivity with respect to endpoints to determine upper and lower bounds on \int_0^2 x^6 dx.

Lower bound: 12.523584
Upper bound: 25.323584

Solution

Even and odd functions

Recall that a function f is called odd if it satisfies f(-x) = -f(x) and is called even if f(-x) = f(x).

Suppose f is a continuous odd function then for any a,

\int_{-a}^a f(x) dx =0.

If f is a continuous even function then for any a,

\int_{-a}^a f(x) dx = 2 \int_0^a f(x)dx.

Suppose f is an odd function and consider first just the integral from -a to 0. We make the substitution u=-x so du=-dx. Notice that if x=-a then u=a and if x=0 then u=0. Hence \int_{-a}^0 f(x) dx = - \int_a^0 f(-u) du= \int_0^a f(-u) du. Now as f is odd,  f(-u) = -f(u) so the integral becomes \int_{-a}^0 f(x) dx = - \int_0^a f(u) du. Now we can replace the dummy variable u with any other variable. So we can replace it with the letter x to give \int_{-a}^0 f(x) dx = - \int_0^a f(u) du = - \int_0^a f(x) dx.

Now we split the integral into two pieces

\int_{-a}^a f(x) dx = \int_{-a}^0 f(x) dx+\int_{0}^a f(x) dx = -\int_0^a f(x) dx + \int_0^a f(x) dx =0.

The proof of the formula for even functions is similar.
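
Both symmetry facts are easy to confirm numerically. The sketch below (an illustrative addition) uses a midpoint Riemann sum, whose sample points pair up symmetrically about 0, with x^3 as the odd function and x^2 as the even one:

```python
def midpoint_sum(f, a, b, n):
    # midpoint Riemann sum with n equal subintervals
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

odd = lambda x: x**3       # satisfies f(-x) = -f(x)
even = lambda x: x * x     # satisfies f(-x) = f(x)

a = 2.0
print(midpoint_sum(odd, -a, a, 1000))                                    # ~ 0
print(midpoint_sum(even, -a, a, 1000), 2 * midpoint_sum(even, 0.0, a, 1000))
```

The odd integral cancels to essentially zero, while the even integral over [-a,a] matches twice the integral over [0,a].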

5. Prove that if f is a continuous even function then for any a,
\int_{-a}^a f(x) dx = 2 \int_0^a f(x)dx.

From the property of additivity with respect to endpoints we have

\int_{-a}^a f(x) dx = \int_{-a}^0 f(x) dx +\int_{0}^a f(x) dx

Make the substitution u=-x; du=-dx. u=a when x=-a and u=0 when x=0. Then

\int_{-a}^0 f(x)dx=\int_a^0 f(-u)(-du)=-\int_a^0 f(-u)du=\int_0^a f(-u)du=\int_0^a f(u)du

where the last step has used the evenness of f. Since u is just a dummy variable, we can replace it with x. Then

\int_{-a}^a f(x) dx = \int_0^a f(x)dx + \int_{0}^a f(x) dx = 2\int_{0}^a f(x) dx

<h1> 4.2 Fundamental Theorem of Calculus</h1>


The Fundamental Theorem of Calculus is a critical part of calculus because it links the concept of a derivative to that of an integral. As a result, we can use our knowledge of derivatives to find the area under a curve, which is often quicker and simpler than using the definition of the integral.

Mean Value Theorem for Integration

We will need the following theorem in the discussion of the Fundamental Theorem of Calculus.

Mean Value Theorem for Integration

Suppose f(x) is continuous on [a,b]. Then \frac{\int_{a}^{b}f(x)dx}{b-a}=f(c) for some c in [a,b].

Proof of the Mean Value Theorem for Integration

f(x) satisfies the requirements of the Extreme Value Theorem, so it has a minimum m and a maximum M in [a,b]. Since
\int_{a}^{b}f(x)dx=\lim\limits _{n\to\infty}\sum\limits _{i=1}^{n}f(x_{i}^{*})\frac{b-a}{n}=\lim\limits _{n\to\infty}\frac{b-a}{n}\sum\limits _{i=1}^{n}f(x_{i}^{*})
and since
m\leq f(x_{i}^{*})\leq M for all x_{i}^{*} in [a,b],
we have

\lim\limits _{n\to\infty}\frac{b-a}{n}\sum\limits _{i=1}^{n}m\leq\lim\limits _{n\to\infty}\frac{b-a}{n}\sum\limits _{i=1}^{n}f(x_{i}^{*})\leq\lim\limits _{n\to\infty}\frac{b-a}{n}\sum\limits _{i=1}^{n}M

\lim\limits _{n\to\infty}\frac{b-a}{n}nm\leq\int_{a}^{b}f(x)dx\leq\lim\limits _{n\to\infty}\frac{b-a}{n}nM

\lim\limits _{n\to\infty}(b-a)m\leq\int_{a}^{b}f(x)dx\leq\lim\limits _{n\to\infty}(b-a)M

(b-a)m\leq\int_{a}^{b}f(x)dx\leq(b-a)M

m\leq\frac{\int_{a}^{b}f(x)dx}{(b-a)}\leq M

Since f is continuous, by the Intermediate Value Theorem there is some c in [a,b] such that

\frac{\int_{a}^{b}f(x)dx}{b-a}=f(c)

Fundamental Theorem of Calculus

Statement of the Fundamental Theorem

Suppose that f is continuous on [a,b]. We can define a function F by

F(x)= \int_a^x f(t)\ dt \mbox{ for } x \mbox{ in } [a,b].

Fundamental Theorem of Calculus Part I Suppose f is continuous on [a,b] and F is defined by

F(x)= \int_a^x f(t)\ dt.

Then F is differentiable on (a,b) and for all x\in(a,b),

 F'(x) = f(x).\,

When we have such functions F and f where F^\prime(x)=f(x) for every x in some interval I, we say that F is an antiderivative of f on I.

Fundamental Theorem of Calculus Part II Suppose that f is continuous on [a,b] and that F is any antiderivative of f. Then

\ \int_{a}^{b} f(x)\ dx=F(b)-F(a).
Figure 1

Note: a minority of mathematicians number the two parts in the opposite order; however, what is stated here as Part II is universally known as the Fundamental Theorem of Calculus.

Proofs

Proof of Fundamental Theorem of Calculus Part I

Suppose x is in (a,b). Pick \Delta x so that x+\Delta x is also in (a, b). Then

F(x) = \int_{a}^{x} f(t) dt

and

F(x+\Delta x) = \int_{a}^{x+\Delta x} f(t) dt.

Subtracting the two equations gives

F(x + \Delta x) - F(x) = \int_{a}^{x + \Delta x} f(t) dt - \int_{a}^{x} f(t) dt.

Now


\int_{a}^{x + \Delta x} f(t) dt  = \int_{a}^{x} f(t) dt + \int_{x}^{x + \Delta x} f(t) dt

so rearranging this we have

F(x + \Delta x) - F(x) = \int_{x}^{x + \Delta x} f(t) dt.

According to the Mean Value Theorem for Integration, there exists a c in [x, x + Δx] such that

\int_{x}^{x + \Delta x}f(t) dt = f(c) \Delta x .

Notice that c depends on \Delta x. In any case, what we have shown is that

F(x + \Delta x) - F(x) = f(c) \Delta x \,,

and dividing both sides by Δx gives

\frac{F(x + \Delta x) - F(x)}{\Delta x} = f(c) .

Taking the limit as \Delta x\to 0 on both sides, the left side becomes the definition of the derivative of F at x, so we have

F'(x) = \lim_{\Delta x \to 0} f(c).

To find the other limit, we will use the squeeze theorem. The number c is in the interval [x, x + \Delta x], so x \le c \le x + \Delta x. Also, \lim_{\Delta x \to 0} x = x and \lim_{\Delta x \to 0} (x + \Delta x) = x. Therefore, according to the squeeze theorem,

\lim_{\Delta x \to 0} c = x.

As f is continuous we have

F'(x) = \lim_{\Delta x\to 0} f(c)=  f\left(\lim_{\Delta x \to 0} c\right) = f(x)

which completes the proof.

Proof of Fundamental Theorem of Calculus Part II

Define  P(x) = \int_a^x f(t) dt. Then by the Fundamental Theorem of Calculus part I we know that P is differentiable on (a,b) and for all x\in (a,b)

P'(x) = f(x).\,

So P is an antiderivative of f. Since we were assuming that F was also an antiderivative for all x\in (a,b),

P'(x)=F'(x)
P'(x)-F'(x)=0
(P(x)-F(x))'=0.

Let g(x)=P(x)-F(x). The Mean Value Theorem applied to g(x) on [a,\xi] with a<\xi<b says that

\frac{g(\xi)-g(a)}{\xi-a}=g'(c)

for some c in (a,\xi). But since g'(x)=0 for all x in [a,b], g(\xi) must equal g(a) for all \xi in (a,b), i.e. g(x) is constant on (a,b).
This implies there is a constant C=g(a)=P(a)-F(a)=-F(a) such that for all x\in (a,b),

 P(x) = F(x) + C,

and as g is continuous we see this holds when x=a and x=b as well. And putting x=b gives

 \int_a^b f(t)\, dt = P(b) = F(b) + C = F(b) - F(a).

Notation for Evaluating Definite Integrals

The second part of the Fundamental Theorem of Calculus gives us a way to calculate definite integrals. Just find an antiderivative of the integrand, and subtract the value of the antiderivative at the lower bound from the value of the antiderivative at the upper bound. That is

\int_a^b f(x)dx=F(b)-F(a)

where F'(x)=f(x). As a convenience, we use the notation

F(x)\bigr|_a^b

to represent F(b)-F(a).

Integration of Polynomials

Using the power rule for differentiation we can find a formula for the integral of a power using the Fundamental Theorem of Calculus. Let f(x) =x^n. We want to find an antiderivative for f. Since the differentiation rule for powers lowers the power by 1 we have that

 \frac{d}{dx} x^{n+1} = (n+1)x^n.

As long as n+1\neq 0 we can divide by n+1 to get

 \frac{d}{dx} \left(\frac{x^{n+1}}{n+1}\right)= x^n = f(x).

So the function F(x) = \frac{x^{n+1}}{n+1} is an antiderivative of f. If 0 is not in [a,b] then F is continuous on [a,b] and, by applying the Fundamental Theorem of Calculus, we can calculate the integral of f to get the following rule.

Power Rule of Integration I

 \int_a^b x^n dx = {x^{n+1} \over {n+1}}\biggr|_a^b = \frac{b^{n+1}-a^{n+1}}{n+1}\; as long as n\neq -1 and 0 is not in [a,b].

Notice that we allow all values of n, even negative or fractional. If n>0 then this works even if [a,b] includes 0.

Power Rule of Integration II

 \int_a^b x^n dx = {x^{n+1} \over {n+1}}\biggr|_a^b = \frac{b^{n+1}-a^{n+1}}{n+1} as long as n>0.

Examples

  • To find \int_1^2 x^3 dx we raise the power by 1 and have to divide by 4. So
\int_1^2 x^3 dx = \frac{x^4}{4}\biggr|_1^2 = \frac{2^4}{4} - \frac{1^4}{4} = \frac{15}{4}.
  • The power rule also works for negative powers. For instance
\int_1^3 \frac{1}{x^3} dx = \int_1^3  x^{-3} dx = \frac{x^{-2}}{-2}\biggr|_1^3 = \frac{1}{-2}\left(3^{-2}-1^{-2}\right) = -\frac{1}{2}\left(\frac{1}{3^2} - 1\right) =-\frac{1}{2}\left(\frac{1}{9}-1\right) = \frac{1}{2} \cdot \frac{8}{9} = \frac{4}{9}.
  • We can also use the power rule for fractional powers. For instance
\int_0^5 \sqrt x\, dx = \int_0^5 x^{\frac{1}{2}}\, dx =  \frac{x^{\frac{3}{2}}}{\frac{3}{2}}\Biggr|_0^5 = \frac{2}{3}\left(5^{\frac{3}{2}} - 0^{\frac{3}{2}}\right)=\frac{2}{3}\left(5^{\frac{3}{2}}\right)
  • Using linearity the power rule can also be thought of as applying to constants. For example,
\int_3^{11} 7\,dx=\int_3^{11} 7x^0\,dx=7\int_3^{11} x^0\,dx=7x \bigr|_3^{11}= 7(11 - 3) = 56.
  • Using the linearity rule we can now integrate any polynomial. For example
\int_0^3 (3x^2 + 4x +2) dx = (x^3 + 2x^2 + 2x)\bigr|_0^3 = 3^3 + 2 \cdot 3^2 + 2 \cdot 3-0 = 27+18+6=51.
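The arithmetic in these examples can be double-checked in a couple of lines of Python, evaluating the antiderivatives at the endpoints exactly as the power rule prescribes:

```python
# Antiderivative of 3x^2 + 4x + 2, found term by term with the power rule.
def F(x):
    return x**3 + 2 * x**2 + 2 * x

poly_integral = F(3) - F(0)        # int_0^3 (3x^2 + 4x + 2) dx
cube_integral = (2**4 - 1**4) / 4  # int_1^2 x^3 dx via the power rule

print(poly_integral, cube_integral)
```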

Exercises

1. Evaluate \int_0^1 x^6 dx. Compare your answer to the answer you got for exercise 1 in section 4.1.

\frac{1}{7} = 0.\overline{142857}

2. Evaluate \int_1^2 x^6 dx. Compare your answer to the answer you got for exercise 2 in section 4.1.

\frac{127}{7} = 18.\overline{142857}

3. Evaluate \int_0^2 x^6 dx. Compare your answer to the answer you got for exercise 4 in section 4.1.

\frac{128}{7} = 18.\overline{285714}

Solutions

← Definite integral Calculus Indefinite integral →
Print version

<h1> 4.3 Indefinite Integral</h1>


Definition

Now recall that F is said to be an antiderivative of f if  F'(x) = f(x). However, F is not the only antiderivative: we can add any constant to F without changing the derivative. With this, we define the indefinite integral as follows:

\int f(x) dx = F(x) + C \;\;\; where F satisfies F'(x) = f(x)\; and C is any constant.

The function f(x), the function being integrated, is known as the integrand. Note that the indefinite integral yields a family of functions.

Example

Since the derivative of x^4 is 4x^3, the general antiderivative of 4x^3 is x^4 plus a constant. Thus,

 \int 4x^3 dx = x^4 + C.

Example: Finding antiderivatives

Let's take a look at 6x^2. How would we go about finding the integral of this function? Recall the rule from differentiation that

 \frac{d}{dx} x^n = n x^{n-1}\

In our circumstance, we have:

 \frac{d}{dx} x^3  = 3 x^2\

This is a start! We now know that the function we seek will have a power of 3 in it. How would we get the constant of 6? Well,

 2 \frac{d}{dx} x^3= 2 \times 3 x^2 =6x^2.\

Thus, we say that 2x^3 is an antiderivative of 6x^2.

Exercises

1. Evaluate \int \frac{3x}{2}dx

\frac{3}{4}x^2+C

2. Find the general antiderivative of the function f(x)=2x^4.

\frac{2x^5}{5}+C

Solutions

Indefinite integral identities

Basic Properties of Indefinite Integrals

Constant Rule for indefinite integrals

If c is a constant then  \int c f(x) dx = c \int f(x) dx.

Sum/Difference Rule for indefinite integrals

 \int (f(x) + g(x)) dx = \int f(x) dx + \int g(x) dx.
 \int (f(x) - g(x)) dx = \int f(x) dx - \int g(x) dx.

Indefinite integrals of Polynomials

Say we are given a function of the form,  f(x)= x^n , and would like to determine the antiderivative of f. Considering that

 \frac{d}{dx} \frac{1}{n+1}x^{n+1} = x^{n}

we have the following rule for indefinite integrals:

Power rule for indefinite integrals

 \int x^n {dx}= \frac{1}{n+1} x^{n+1} +C \; for all  n \not= -1.

Integral of the Inverse function

To integrate  f(x)=\frac{1}{x} , we should first remember

 \frac{d}{dx} \ln x = \frac{1}{x}.

Therefore, since \frac{1}{x} is the derivative of \ln(x) we can conclude that

\int \frac{dx}{x} = \ln \left| x \right| +C.

Note that the power rule for integration does not apply when the exponent is -1; this rule must be used instead. Since the argument of the natural logarithm must be positive (on the real line), absolute value signs are placed around the argument so that the formula also covers negative values of x.
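The role of the absolute value can be seen numerically. This Python sketch approximates \int \frac{dx}{x} over an interval of positive x and one of negative x (0 lies in neither) and compares with \ln|x| evaluated at the endpoints:

```python
import math

def midpoint_sum(f, a, b, n=200_000):
    # Midpoint Riemann sum approximation of the integral of f over [a, b].
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# On [1, e]: ln|e| - ln|1| = 1.
pos = midpoint_sum(lambda x: 1 / x, 1.0, math.e)
# On [-e, -1]: ln|-1| - ln|-e| = -1; without the absolute value the
# antiderivative would be undefined for negative x.
neg = midpoint_sum(lambda x: 1 / x, -math.e, -1.0)
print(pos, neg)
```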

Integral of the Exponential function

Since

 \frac{d}{dx} e^x= e^x

we see that e^x is its own antiderivative. This allows us to find the integral of an exponential function:

 \int e^x dx = e^x + C.

Integral of Sine and Cosine

Recall that

\frac{d}{dx}\sin{x}= \cos{x} \,
\frac{d}{dx}\cos{x}= -\sin{x}. \,

So sin x is an antiderivative of cos x and -cos x is an antiderivative of sin x. Hence we get the following rules for integrating sin x and cos x

\int \cos{x}\ dx= \sin{x} +C
\int \sin{x}\ dx= -\cos{x} + C.

We will find how to integrate more complicated trigonometric functions in the chapter on integration techniques.

Example

Suppose we want to integrate the function f(x)=x^4+1+2\sin{x}. An application of the sum rule from above allows us to use the power rule and our rule for integrating \sin{x} as follows,

 \int f(x)\,dx  = \int x^4 + 1 + 2\sin{x}\,dx
 = \int x^4dx + \int 1\,dx + \int 2\sin{x}\,dx
 = \frac{x^5}{5} + x - 2\cos{x} + C.

Exercises

3. Evaluate \int(7x^2+3\cos(x)-e^x)dx

\frac{7}{3}x^3+3\sin(x)-e^x+C

4. Evaluate \int(\frac{2}{5x}+\sin(x))dx

\frac{2}{5}\ln|x|-\cos(x)+C

Solutions

The Substitution Rule

The substitution rule is a valuable asset in the toolbox of any integration greasemonkey. It is essentially the chain rule (a differentiation technique you should be familiar with) in reverse. First, let's take a look at an example:

Preliminary Example

Suppose we want to find \int x \cos(x^2) dx. That is, we want to find a function such that its derivative equals  x \, \cos(x^2). Stated yet another way, we want to find an antiderivative of  f(x)= x \, \cos (x^2). Since \sin(x) differentiates to \cos(x), as a first guess we might try the function \sin (x^2). But by the Chain Rule,

 \frac{d}{dx} \sin (x^2) = \cos(x^2) \cdot \frac{d}{dx} x^2 = \cos(x^2) \cdot 2x = 2x \cos(x^2).

Which is almost what we want apart from the fact that there is an extra factor of 2 in front. But this is easily dealt with because we can divide by a constant (in this case 2). So,

 \frac{d}{dx}\frac{\sin (x^2)}{2} = \frac{1}{2} \cdot \frac{d}{dx} \sin (x^2) = \frac{1}{2} \cdot 2 \cos(x^2)x=x \cos(x^2)=f(x).

Thus, we have discovered a function,  F(x) = \frac{\sin (x^2)}{2} , whose derivative is  x \, \cos (x^2). That is, F is an antiderivative of  f(x)= x \, \cos (x^2). This gives us

 \int x \cos(x^2) dx = \frac{\sin(x^2)}{2} + C.
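Differentiating the answer is always a good way to check an antiderivative. The Python sketch below compares a central-difference derivative of F(x)=\sin(x^2)/2 against the integrand x\cos(x^2) at a few sample points:

```python
import math

def F(x):
    # The candidate antiderivative found above: sin(x^2) / 2.
    return math.sin(x * x) / 2

def f(x):
    # The original integrand x * cos(x^2).
    return x * math.cos(x * x)

# Central-difference approximation of F'(x) should match f(x).
h = 1e-6
max_err = max(abs((F(x + h) - F(x - h)) / (2 * h) - f(x))
              for x in (0.3, 1.0, 2.2))
print(max_err)
```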

Generalization

In fact, this technique will work for more general integrands. Suppose u is a differentiable function. Then to evaluate  \int u'(x) \cos(u(x)) dx we just have to notice that by the Chain Rule

 \frac{d}{dx} \sin(u(x)) = \cos(u(x)) \frac{du}{dx} = u'(x) \cos(u(x)).

As long as u' is continuous we have that

\int \cos(u(x)) u'(x) dx = \sin(u(x)) + C.

Now the right hand side of this equation is just the integral of \cos(u) with respect to u. If we write u instead of u(x) this becomes \int \cos(u(x))\, u'(x)\, dx = \sin(u) + C = \int \cos(u)\, du.

So, for instance, if u(x) = x^3 we have worked out that

\int (\cos(x^3)\cdot 3x^2)dx = \sin(x^3) + C.

General Substitution Rule

Now there was nothing special about using the cosine function in the discussion above, and it could be replaced by any other function. Doing this gives us the substitution rule for indefinite integrals:

Substitution rule for indefinite integrals
Assume u is differentiable with continuous derivative and that f is continuous on the range of u. Then

\int f(u(x)) \frac{du}{dx} dx = \int f(u) du.

Notice that it looks like you can "cancel" in the expression \frac{du}{dx} dx to leave just a du. This does not really make any sense because \frac{du}{dx} is not a fraction. But it's a good way to remember the substitution rule.

Examples

The following example shows how powerful a technique substitution can be. At first glance the following integral seems intractable, but after a little simplification, it's possible to tackle it using substitution.

Example

We will show that

\int\frac{1}{\left(x^2+a^2\right)^{3/2}}\,dx = \frac{x}{a^2 \sqrt{x^2+a^2}} + C

First, we re-write the integral:

\int \frac{1}{\left(x^2+a^2\right)^{3/2}}\,dx = \int \left(x^2 + a^2\right)^{-3/2}\,dx
= \int \left(x^2 \left(1 + \frac{a^2}{x^2}\right) \right)^{-3/2} dx
= \int x^{-3} \left(1+\frac{a^2}{x^2}\right)^{-3/2} dx
= \int \left(1+\frac{a^2}{x^2}\right)^{-3/2} \left(x^{-3} dx\right).

Now we perform the following substitution:

 u = 1 + \frac{a^2}{x^2}
 \frac{du}{dx} = -2a^2x^{-3} \Longrightarrow  x^{-3}dx = -\frac{du}{2a^2}

Which yields:

\int \left(1+\frac{a^2}{x^2}\right)^{-3/2} \left(x^{-3} dx\right)
= \int u^{-3/2} \left(-\frac{du}{2a^2}\right)
= -\frac{1}{2a^2} \int u^{-3/2} du
= -\frac{1}{2a^2} \left(-\frac{2}{\sqrt{u}}\right) + C
= \frac{1}{a^2 \sqrt{1+\frac{a^2}{x^2}}} + C
= \left(\frac{x}{x}\right) \frac{1}{a^2 \sqrt{1+\frac{a^2}{x^2}}} + C
= \frac{x}{a^2 \sqrt{x^2+a^2}} + C.
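A numerical check makes the result more convincing. In the Python sketch below, a = 3 is an arbitrary sample value and the integral is taken over [1, 2] (positive x, matching the division by x in the derivation):

```python
import math

a = 3.0  # arbitrary sample value for the parameter a

def F(x):
    # The antiderivative derived above: x / (a^2 * sqrt(x^2 + a^2)).
    return x / (a * a * math.sqrt(x * x + a * a))

def f(x):
    # The original integrand 1 / (x^2 + a^2)^(3/2).
    return (x * x + a * a) ** -1.5

def midpoint_sum(g, lo, hi, n=100_000):
    # Midpoint Riemann sum approximation of the integral of g over [lo, hi].
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))

approx = midpoint_sum(f, 1.0, 2.0)
exact = F(2.0) - F(1.0)
print(approx, exact)
```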

Exercises

5. Evaluate \int x\sin(2x^2)dx by making the substitution u=2x^2

-\frac{\cos(2x^{2})}{4}+C

6. Evaluate \int-3\cos(x)e^{\sin(x)}dx

-3e^{\sin(x)}+C

Solutions

Integration by Parts

Integration by parts is another powerful tool for integration. It was mentioned above that one could consider integration by substitution as an application of the chain rule in reverse. In a similar manner, one may consider integration by parts as the product rule in reverse.

Preliminary Example

General Integration by Parts

Integration by parts for indefinite integrals
Suppose f and g are differentiable and their derivatives are continuous. Then

\int f(x)g'(x) dx = f(x)g(x) - \int f'(x) g(x) dx.

If we write u=f(x) and v=g(x), then by using the Leibniz notation du=f'(x) dx and dv=g'(x) dx the integration by parts rule becomes

\int u dv = uv - \int v du.

Examples

Example

Find \int x\cos (x) \,dx .

Here we let:

u = x, so that du = dx,
dv = \cos(x)dx , so that v = \sin(x).

Then:

\int x\cos (x) \,dx = \int u \,dv
= uv - \int v \,du
\int x\cos (x) \,dx = x\sin (x) - \int \sin (x) \,dx
\int x\cos (x) \,dx = x\sin (x) + \cos (x) + C.
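The antiderivative just found can be sanity-checked by comparing F(b)-F(a) against a numerical integral, for example over [0, \pi]:

```python
import math

def F(x):
    # x*sin(x) + cos(x), the antiderivative found by parts above.
    return x * math.sin(x) + math.cos(x)

def midpoint_sum(f, a, b, n=100_000):
    # Midpoint Riemann sum approximation of the integral of f over [a, b].
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

approx = midpoint_sum(lambda x: x * math.cos(x), 0.0, math.pi)
exact = F(math.pi) - F(0.0)  # = -2
print(approx, exact)
```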

Example

Find \int x^2 e^x\,dx

In this example we will have to use integration by parts twice.

Here we let

u = x^2, so that du= 2x dx,
dv= e^xdx, so that v =e^x.

Then:

\int x^2e^x \,dx = \int u \,dv
= uv - \int v \,du
\int x^2e^x \,dx = x^2e^x - \int 2xe^x\,dx= x^2 e^x - 2\int xe^x\,dx.

Now to calculate the last integral we use integration by parts again. Let

u = x, so that du = dx,
dv= e^xdx, so that v =e^x

and integrating by parts gives

\int xe^x \,dx = xe^x - \int e^x\,dx= x e^x - e^x.

So, finally we obtain

\int x^2e^x \,dx =  x^2 e^x - 2(x e^x - e^x)+ C = x^2 e^x - 2x e^x + 2e^x + C =e^x(x^2-2x+2) + C.
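Checking the final antiderivative numerically on the sample interval [0, 1], where F(1)-F(0) = e - 2:

```python
import math

def F(x):
    # e^x (x^2 - 2x + 2), found with two rounds of integration by parts.
    return math.exp(x) * (x * x - 2 * x + 2)

def midpoint_sum(f, a, b, n=100_000):
    # Midpoint Riemann sum approximation of the integral of f over [a, b].
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

approx = midpoint_sum(lambda x: x * x * math.exp(x), 0.0, 1.0)
exact = F(1.0) - F(0.0)  # = e - 2
print(approx, exact)
```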

Example

Find \int \ln (x) \,dx.

The trick here is to write this integral as

\int \ln (x) \cdot 1 \,dx.

Now let

u = \ln(x) so du = (1/x) dx,
v = x so dv = 1dx.

Then using integration by parts,

\int \ln (x) \,dx = x \ln (x) - \int \frac{x}{x} \,dx
= x \ln (x) - \int 1 \,dx
\int \ln (x) \,dx = x \ln (x) - {x} + {C}
\int \ln (x) \,dx = x ( \ln (x) - 1 ) + C.

Example

Find \int \arctan(x) dx.

Again the trick here is to write the integrand as \arctan(x) = \arctan(x) \cdot 1 . Then let

u = arctan(x); du = 1/(1+x^2) dx
v = x; dv = 1·dx

so using integration by parts,

\int \arctan (x) \,dx = x \arctan (x) - \int \frac{x}{1 + x^2} \,dx
= x \arctan (x) - {1 \over 2} \ln \left( 1 + x^2 \right) + C.

Example

Find \int e^{x} \cos (x) \,dx.

This example uses integration by parts twice. First let,

u = e^x; thus du = e^x dx
dv = cos(x)dx; thus v = sin(x)

so

\int e^{x} \cos (x) \,dx = e^{x} \sin (x) - \int e^{x} \sin (x) \,dx

Now, to evaluate the remaining integral, we use integration by parts again, with

u = e^x; du = e^x dx
v = -cos(x); dv = sin(x)dx

Then

\int e^{x} \sin (x) \,dx = -e^{x} \cos (x) - \int -e^{x} \cos (x) \,dx
= -e^{x} \cos (x) + \int e^{x} \cos (x) \,dx

Putting these together, we have

\int e^{x} \cos (x) \,dx = e^{x} \sin (x) + e^x \cos (x) - \int e^{x} \cos (x) \,dx

Notice that the same integral shows up on both sides of this equation, but with opposite signs, so it does not cancel. Adding the integral to both sides gives

2\int e^{x} \cos (x) \,dx = e^{x} ( \sin (x) + \cos (x) )
\int e^{x} \cos (x) \,dx = \frac{e^{x} ( \sin (x) + \cos (x) )}{2} + C.
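This "the integral reappears" trick is easy to verify numerically. The Python sketch below compares the antiderivative e^x(\sin x + \cos x)/2, evaluated at the endpoints of [0, \pi/2], against a midpoint Riemann sum of the integrand:

```python
import math

def F(x):
    # e^x (sin x + cos x) / 2, from the double integration by parts above.
    return math.exp(x) * (math.sin(x) + math.cos(x)) / 2

def midpoint_sum(f, a, b, n=100_000):
    # Midpoint Riemann sum approximation of the integral of f over [a, b].
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

approx = midpoint_sum(lambda x: math.exp(x) * math.cos(x), 0.0, math.pi / 2)
exact = F(math.pi / 2) - F(0.0)
print(approx, exact)
```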

Exercises

7. Evaluate \int \frac{2x-5}{x^3}dx using integration by parts with u=2x-5 and dv=\frac{dx}{x^3}

\frac{5-4x}{2x^2} + C

8. Evaluate \int(2x-1)e^{-3x+1}dx

\frac{(1-6x)e^{-3x+1}}{9} + C

Solutions


<h1> 4.4 Improper Integrals</h1>


The definition of a definite integral:

\int_{a}^{b} f(x)\, dx

requires that the interval [a,b] be finite. The Fundamental Theorem of Calculus requires that f be continuous on [a,b]. In this section, you will be studying a method of evaluating integrals that fail these requirements—either because their limits of integration are infinite, or because a finite number of discontinuities exist on the interval [a,b]. Integrals that fail either of these requirements are improper integrals. (If you are not familiar with L'Hôpital's rule, it is a good idea to review it before reading this section.)

Improper Integrals with Infinite Limits of Integration

Consider the integral

\int_{1}^{\infin} \frac{dx}{x^2}

Assigning a finite upper bound b in place of infinity gives

 \lim_{b \to \infin} \int_{1}^{b} \frac{dx}{x^2} = \lim_{b \to \infin}\left(\frac{1}{1} -\frac{1}{b}\right)\  =  \lim_{b \to \infin}\left(1-\frac{1}{b}\right)\ = 1

This improper integral can be interpreted as the area of the unbounded region between f(x)=1/x^2, y=0 (the x-axis), and x=1.

Definition

1. Suppose \int_a^b f(x) dx exists for all b\ge a. Then we define

\int_{a}^{\infin} f(x)\, dx=\lim_{b \to \infin}\int_{a}^{b} f(x)\, dx, as long as this limit exists and is finite.

If the limit exists and is finite we say the integral is convergent; otherwise we say it is divergent.

2. Similarly if \int_a^b f(x) dx exists for all a\le b we define

\int_{-\infin}^{b} f(x)\, dx=\lim_{a \to -\infin}\int_{a}^{b} f(x)\, dx.

3. Finally suppose c is a fixed real number and that \int_{-\infin}^{c} f(x)\, dx and \int_{c}^{\infin} f(x)\, dx are both convergent. Then we define

\int_{-\infin}^{\infin} f(x)\, dx=\int_{-\infin}^{c} f(x)\, dx + \int_{c}^{\infin} f(x)\, dx.
Example: Convergent Improper Integral

We claim that

\int_{0}^{\infin} e^{-x}\, dx=1

To do this we calculate

 \int_{0}^{\infin} e^{-x} dx =  \lim_{b \to \infin} \int_{0}^{b} e^{-x} dx
=  \lim_{b \to \infin} (-e^{-x})\bigr|_0^b
=  \lim_{b \to \infin} \left(-e^{-b}+1 \right)
=  1 \,
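The limit in this computation is easy to watch numerically: the truncated integrals \int_0^b e^{-x}\,dx = 1 - e^{-b} climb toward 1 as b grows. A short Python sketch:

```python
import math

# int_0^b e^{-x} dx = 1 - e^{-b}, which approaches 1 as b -> infinity.
values = [1 - math.exp(-b) for b in (1, 5, 10, 30)]
print(values)
```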
Example: Divergent Improper Integral

We claim that the integral

\int_{1}^{\infin} \frac{dx}{x}\, diverges.

This follows as

\int_{1}^{\infin} \frac{dx}{x}\, = \lim_{b \to \infin}\int_{1}^{b} \frac{dx}{x}\,
= \lim_{b \to \infin}(\ln x)\bigr|_1^b
= \lim_{b \to \infin}\left(\ln b - 0\right)
=  \infin \,

Therefore

 \int_{1}^{\infin} \frac{dx}{x}\, diverges.
Example: Improper Integral

Find \int_0^\infty x^2e^{-x} dx.

To calculate the integral use integration by parts twice to get

\begin{align}
\int_0^b x^2 e^{-x} dx &= (-x^2 e^{-x})\bigr|_0^b + 2 \int_0^b x e^{-x}dx\\
&=-b^2 e^{-b} + 2 ((-x e^{-x})\bigr|_0^b +  \int_0^b e^{-x}dx)\\
&=-b^2 e^{-b} + 2 (-b e^{-b} - (e^{-x})\bigr|_0^b )\\
&=-b^2 e^{-b} + 2 (-b e^{-b} - e^{-b}+1).\\
\end{align}

Now \lim_{b\to \infty} e^{-b} =0 and because exponentials overpower polynomials, we see that \lim_{b\to \infty} b^2 e^{-b} =0 and \lim_{b\to \infty} b e^{-b} =0 as well. Hence,

\int_0^\infty x^2 e^{-x} dx =\lim_{b\to\infty} \int_0^b x^2 e^{-x} dx=0+2(0-0+1)=2.
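The same limit can be watched numerically using the integration-by-parts expression for \int_0^b x^2 e^{-x}\,dx derived above:

```python
import math

def truncated(b):
    # Value of int_0^b x^2 e^{-x} dx from the integration-by-parts
    # computation: -b^2 e^{-b} + 2(-b e^{-b} - e^{-b} + 1).
    return -b * b * math.exp(-b) + 2 * (-b * math.exp(-b) - math.exp(-b) + 1)

vals = [truncated(b) for b in (1, 10, 40)]
print(vals)  # approaches 2
```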
Example: Powers

Show \int_{1}^{\infin} \frac{dx}{x^p}\,=\begin{cases} \frac{1}{p-1}, & \mbox{if }p>1 \\ \mbox{diverges}, & \mbox{if }p\le1 \end{cases}

If p\neq 1 then

\int_{1}^{\infin} \frac{dx}{x^p}\, = \lim_{b \to \infin}\int_{1}^{b} x^{-p} dx\,
= \lim_{b \to \infin} (\frac{1}{-p+1} x^{-p+1})\bigr|_1^b
=  \frac{1}{1-p} \lim_{b \to \infin}  \left(b^{-p+1}-1\right)
= \begin{cases} \frac{1}{p-1}, & \mbox{if }p>1 \\ \mbox{diverges}, & \mbox{if }p<1.\end{cases}

Notice that we had to assume that p\ne 1 to avoid dividing by zero. However the p=1 case was done in a previous example.

Improper Integrals with a Finite Number of Discontinuities

First we give a definition for the integral of functions which have a discontinuity at one point.

Definition of improper integrals with a single discontinuity

If f is continuous on the interval [a,b) and is discontinuous at b, we define

\int_{a}^{b} f(x)\, dx=\lim_{c \to b^-}\int_{a}^{c} f(x)\, dx.

If the limit in question exists we say the integral converges and otherwise we say it diverges.

Similarly if f is continuous on the interval (a,b] and is discontinuous at a, we define

\int_{a}^{b} f(x)\, dx=\lim_{c \to a^+}\int_{c}^{b} f(x)\, dx.

Finally suppose f has a discontinuity at a point c in (a,b) and is continuous at all other points in [a,b]. If \int_{a}^{c} f(x)\, dx and \int_{c}^{b} f(x)\, dx converge we define

\int_{a}^{b} f(x)\, dx=\int_{a}^{c} f(x)\, dx+\int_{c}^{b} f(x)\, dx
Example 1

Show \int_{0}^{1} \frac{dx}{x^p}\,=\begin{cases} \frac{1}{1-p}, & \mbox{if }p<1 \\ \mbox{diverges}, & \mbox{if }p\ge1 \end{cases}

If p\neq 1 then

\int_{0}^{1} \frac{dx}{x^p}\, = \lim_{a \to 0^+}\int_{a}^{1} x^{-p} dx\,
= \lim_{a \to 0^+} (\frac{1}{-p+1} x^{-p+1})\bigr|_a^1
=  \frac{1}{1-p} \lim_{a \to 0^+}  \left(1-a^{-p+1}\right)
= \begin{cases} \frac{1}{1-p}, & \mbox{if }p<1 \\ \mbox{diverges}, & \mbox{if }p>1.\end{cases}

Notice that we had to assume that p\ne 1 to avoid dividing by zero. So instead we do the p=1 case separately,

\int_0^1 \frac{1}{x} dx = \lim_{a\to 0^+} (\ln |x|)\bigr|_a^1= \lim_{a\to 0^+} -\ln a

which diverges.


Example 2

The integral \int_{-1}^3 \frac{1}{x-2}dx is improper because the integrand is not continuous at x=2. However had we not noticed that we might have been tempted to apply the fundamental theorem of calculus and conclude that it equals

 (\ln |x-2|)\bigr|_{-1}^3 = \ln|3-2| - \ln|-1-2| = \ln 1 - \ln 3 = -\ln 3,

which is not correct. In fact the integral diverges since

\begin{align}\int_{-1}^3 \frac{1}{x-2}dx &= \lim_{b\to 2^-}\int_{-1}^b \frac{1}{x-2}dx + \lim_{a\to 2^+}\int_a^3 \frac{1}{x-2}dx\\
&=\lim_{b\to 2^-}(\ln |x-2|)\bigr|_{-1}^{b} +\lim_{a\to 2^+}(\ln |x-2|)\bigr|_{a}^{3}\\
&=\lim_{b\to 2^-}(\ln (2-b)-\ln 3) +\lim_{a\to 2^+}(\ln 1-\ln (a-2))\\
&=\lim_{b\to 2^-}(\ln (2-b))-\ln 3 +\lim_{a\to 2^+}(-\ln (a-2))\\
\end{align}

and \lim_{b\to 2^-}(\ln (2-b)) and \lim_{a\to 2^+}(-\ln (a-2)) both diverge.

We can also give a definition of the integral of a function with a finite number of discontinuities.

Definition: Improper integrals with finite number of discontinuities

Suppose f is continuous on [a,b] except at points c_1<c_2<\ldots<c_n in [a,b]. We define \int_a^b f(x) dx = \int_a^{c_1} f(x) dx + \int_{c_1}^{c_2} f(x) dx +\int_{c_2}^{c_3} f(x) dx + \cdots + \int_{c_{n-1}}^{c_n} f(x) dx + \int_{c_n}^{b} f(x) dx as long as each integral on the right converges.

Notice that by combining this definition with the definition for improper integrals with infinite endpoints, we can define the integral of a function with a finite number of discontinuities with one or more infinite endpoints.

Comparison Test

There are integrals which cannot easily be evaluated. However it may still be possible to show they are convergent by comparing them to an integral we already know converges.

Theorem (Comparison Test) Let f and g be continuous functions defined for all x\ge a.

  1. Suppose g(x)\ge f(x)\ge 0 for all x\ge a. Then if \int_a^\infty g(x) dx converges so does \int_a^\infty f(x) dx.
  2. Suppose f(x)\ge h(x)\ge 0 for all x\ge a. Then if \int_a^\infty h(x) dx diverges so does \int_a^\infty f(x) dx.

A similar theorem holds for improper integrals of the form \int_{-\infty}^b f(x) dx and for improper integrals with discontinuities.

Example: Use of the comparison test to show convergence

Show that \int_1^{\infty} \frac{\sin x+2}{x^2} dx converges.

For all x we know that -1\le \sin x \le 1 so 1\le \sin x+2\le 3. This implies that

 0\le \frac{\sin x+2}{x^2} \le \frac{3}{x^2}.

We have seen that \int_1^{\infty} \frac{3}{x^2} dx = 3\int_1^{\infty} \frac{1}{x^2} dx converges. So putting f(x) = \frac{\sin x+2}{x^2} and g(x) = \frac{3}{x^2} into the comparison test we get that the integral \int_1^{\infty} \frac{\sin x+2}{x^2} dx converges as well.

Example: Use of the comparison test to show divergence

Show that \int_1^{\infty} \frac{\sin x+2}{x} dx diverges.

Just as in the previous example we know that \sin x+2\ge 1 for all x. Thus

 \frac{\sin x+2}{x} \ge \frac{1}{x} \ge 0.

We have seen that \int_1^{\infty} \frac{1}{x} dx diverges. So putting f(x) = \frac{\sin x+2}{x} and h(x) = \frac{1}{x} into the comparison test we get that \int_1^{\infty} \frac{\sin x+2}{x} dx diverges as well.

An extension of the comparison theorem

To apply the comparison theorem you do not really need g(x) \ge f(x) \ge 0 for all x\ge a. What you actually need is that the inequality holds for all sufficiently large x, i.e. that there is a number c such that g(x) \ge f(x) \ge 0 for all x\ge c. For then

 \int_a^\infty f(x) dx = \int_a^c f(x) dx + \int_c^{\infty} f(x) dx,

so the integral on the left converges if and only if the last integral on the right does, and we can apply the comparison theorem to the  \int_c^\infty f(x) dx piece.


Example

Show that \int_1^{\infty} x^{7/2} e^{-\frac{3x}{2}} dx converges.

The reason that this integral converges is that for large x the exponential factor in the integrand is dominant. We could try comparing  x^{7/2} e^{-\frac{3x}{2}} with e^{-\frac{3x}{2}}, but as x\ge 1, the inequality

 x^{7/2} e^{-\frac{3x}{2}} \ge e^{-\frac{3x}{2}}

is the wrong way around to show convergence.

Instead we rewrite the integrand as x^{7/2} e^{-\frac{3x}{2}} = x^{7/2} e^{-\frac{x}{2}} e^{-x}.

Since \lim_{x\to \infty} x^{7/2} e^{-\frac{x}{2}} = 0 we know that for x sufficiently large we have  x^{7/2} e^{-\frac{x}{2}} \le 1. So for large x,

 x^{7/2} e^{-\frac{3x}{2}} = x^{7/2} e^{-\frac{x}{2}} e^{-x} \le e^{-x}.

Since the integral  \int_1^{\infty} e^{-x} dx converges, the extended form of the comparison test tells us that  \int_1^{\infty} x^{7/2} e^{-\frac{3x}{2}} dx converges as well.


Integration Techniques

4.5 Infinite Sums

4.6 Derivative Rules and the Substitution Rule

4.7 Integration by Parts

4.8 Trigonometric Substitutions

4.9 Trigonometric Integrals

4.10 Rational Functions by Partial Fraction Decomposition

4.11 Tangent Half Angle Substitution

4.12 Reduction Formula

4.13 Irrational Functions

4.14 Numerical Approximations

<h1> 4.5 Infinite Sums</h1>


The most basic, and arguably the most difficult, type of evaluation is to use the formal definition of a Riemann integral.

Exact Integrals as Limits of Sums

Using the definition of an integral, we can evaluate the limit as n goes to infinity. This technique requires a fairly high degree of familiarity with summation identities. This technique is often referred to as evaluation "by definition," and can be used to find definite integrals, as long as the integrands are fairly simple. We start with the definition of the integral:

\int_a^b f(x)\ dx =\lim_{n\rightarrow\infty}\frac{b-a}{n}\sum_{i=1}^n f(x_i^*).

Then picking x_i^* to be the right endpoint x_i=a+i\frac{b-a}{n} we get

\int_a^b f(x)\ dx=\lim_{n\rightarrow\infty}\frac{b-a}{n}\sum_{i=1}^n f\left(a+i\frac{b-a}{n}\right).

In some simple cases, this expression can be reduced to a real number, which can be interpreted as the area under the curve if f(x) is positive on [a,b].

Example 1

Find \int_0^2x^2\ dx by writing the integral as a limit of Riemann sums.

\int_0^2x^2\ dx =\lim_{n\rightarrow\infty}\frac{b-a}{n}\sum_{i=1}^nf(x_i^*)
=\lim_{n\rightarrow\infty}\frac{2}{n}\sum_{i=1}^nf(\frac{2i}{n})
=\lim_{n\rightarrow\infty}\frac{2}{n}\sum_{i=1}^n\left(\frac{2i}{n}\right)^2
=\lim_{n\rightarrow\infty}\frac{2}{n}\sum_{i=1}^n\frac{4i^2}{n^2}
=\lim_{n\rightarrow\infty}\frac{8}{n^3}\sum_{i=1}^ni^2
=\lim_{n\rightarrow\infty}\frac{8}{n^3}\frac{n(n+1)(2n+1)}{6}
=\lim_{n\rightarrow\infty}\frac{4}{3}\frac{2n^2+3n+1}{n^2}
=\lim_{n\rightarrow\infty}\frac{8}{3}+\frac{4}{n}+\frac{4}{3n^2}
=\frac{8}{3}
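The convergence of these Riemann sums can be watched directly in code. This Python sketch computes the right-endpoint sum for x^2 on [0, 2] for increasing n; the values approach 8/3, matching the limit above:

```python
def riemann(n):
    # Right-endpoint Riemann sum for f(x) = x^2 on [0, 2] with n subintervals.
    return (2 / n) * sum((2 * i / n) ** 2 for i in range(1, n + 1))

approximations = [riemann(n) for n in (10, 100, 10_000)]
print(approximations)  # approaches 8/3
```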

In other cases, it is even possible to evaluate indefinite integrals using the formal definition. We can define the indefinite integral as follows:

\int f(x)\ dx =\int_0^x f(t)\ dt=\lim_{n\rightarrow\infty}\frac{x-0}{n}\sum_{i=1}^n f(t_i^{*})
=\lim_{n\rightarrow\infty}\frac{x}{n}\sum_{i=1}^n f\left(0+\frac{(x-0)\cdot i}{n}\right)
=\lim_{n\rightarrow\infty}\frac{x}{n}\sum_{i=1}^n f\left(\frac{x\cdot i}{n}\right)

Example 2

Suppose f(x)=x^2, then we can evaluate the indefinite integral as follows.

\int_0^xf(t)\ dt= \lim_{n\rightarrow\infty}\frac{x}{n}\sum_{i=1}^n f\left(\frac{x\cdot i}{n}\right)
=\lim_{n\rightarrow\infty}\frac{x}{n}\sum_{i=1}^n \left(\frac{x\cdot i}{n}\right)^2
=\lim_{n\rightarrow\infty}\frac{x}{n}\sum_{i=1}^n \frac{x^2\cdot i^2}{n^2}
=\lim_{n\rightarrow\infty}\frac{x^3}{n^3}\sum_{i=1}^n i^2
=\lim_{n\rightarrow\infty}\frac{x^3}{n^3}\frac{n(n+1)(2n+1)}{6}
=\lim_{n\rightarrow\infty}\frac{x^3}{n^3}\frac{2n^3+3n^2+n}{6}
=x^3\lim_{n\rightarrow\infty}\left(\frac{2n^3}{6n^3}+\frac{3n^2}{6n^3}+\frac{n}{6n^3}\right)
=x^3\lim_{n\rightarrow\infty}\left(\frac{1}{3}+\frac{1}{2n}+\frac{1}{6n^2}\right)
=x^3\cdot\left(\frac{1}{3}\right)
=\frac{x^3}{3}

<h1> 4.6 Derivative Rules and the Substitution Rule</h1>


After learning a simple list of antiderivatives, it is time to move on to more complex integrands, which are not at first readily integrable. In these first steps, we notice certain special case integrands which can be easily integrated in a few steps.

Recognizing Derivatives and Reversing Derivative Rules

If we recognize a function g(x) as being the derivative of a function f(x), then we can easily express the antiderivative of g(x):

\int g(x)\, dx = f(x) + C.

For example, since

\frac{d}{dx} \sin x = \cos x

we can conclude that

\int \cos x\, dx = \sin x + C.

Similarly, since we know e^x is its own derivative,

\int e^x \, dx = e^x + C.


The power rule for derivatives can be reversed to give us a way to handle integrals of powers of x. Since

\frac{d}{dx} x^n = n x^{n-1},

we can conclude that

\int n x^{n-1} \, dx = x^n + C,

or, a little more usefully,

\int x^n \, dx = \frac{x^{n+1}}{n+1} + C.

Integration by Substitution

For many integrals, a substitution can be used to transform the integrand and make possible the finding of an antiderivative. There are a variety of such substitutions, each depending on the form of the integrand.

The objective of integration by substitution is to rewrite the integrand from an expression in the variable x into an expression in a new variable u, where u \,=\, g(x).

Theory

We want to transform the integral from a function of x into a function of u:

\int_{x=a}^{x=b}f(x)\, dx\,\rightarrow\,\int_{u=c}^{u=d}h(u)\, du

Starting with

u\,=\,g(x)


Steps

\int_{x=a}^{x=b}f(x)\, dx\,  =\int_{x=a}^{x=b}f(x)\,{\operatorname{d}\!u\over\operatorname{d}\!u}\, dx\,      (1) ie   {\operatorname{d}\!u\over\operatorname{d}\!u}\,=\,1
=\int_{x=a}^{x=b}(f(x)\,{\operatorname{d}\!x\over\operatorname{d}\!u})({\operatorname{d}\!u\over\operatorname{d}\!x})\, dx\,      (2) ie   {\operatorname{d}\!x\over\operatorname{d}\!u}{\operatorname{d}\!u\over\operatorname{d}\!x}\,=\,{\operatorname{d}\!u\over\operatorname{d}\!u}\,=\,1
=\int_{x=a}^{x=b}(f(x)\,{\operatorname{d}\!x\over\operatorname{d}\!u})g'(x)\, dx\,      (3) ie   {\operatorname{d}\!u\over\operatorname{d}\!x}\,=\,g'(x)
=\int_{x=a}^{x=b}h(g(x))g'(x)\, dx\,      (4) ie   Now equate (f(x)\,{\operatorname{d}\!x\over\operatorname{d}\!u}) with h(g(x))
=\int_{x=a}^{x=b}h(u)g'(x)\, dx\,      (5) ie   g(x)\,=\,u
=\int_{u=g(a)}^{u=g(b)}h(u)\, du\,      (6) ie    du\,=\,{\operatorname{d}\!u\over\operatorname{d}\!x} dx\,=\,g'(x)\, dx\,
=\int_{u=c}^{u=d}h(u)\, du\,      (7) ie   We have achieved our desired result

Procedure

  • Calculate g'(x)\,=\,{\operatorname{d}\!u\over\operatorname{d}\!x}
  • Calculate h(u) which is f(x)\,{\operatorname{d}\!x\over\operatorname{d}\!u}\,=\,\frac{f(x)}{g'(x)} and make sure you express the result in terms of the variable u
  • Calculate c\,=\,g(a)
  • Calculate d\,=\,g(b)

Integrating with the derivative present

If a component of the integrand can be viewed as the derivative of another component of the integrand, a substitution can be made to simplify the integrand.

For example, in the integral

\int 3x^2 (x^3+1)^5 \, dx

we see that 3x^2 is the derivative of x^3+1. Letting

u=x^3+1

we have

\frac{du}{dx} = 3x^2

or, in order to apply it to the integral,

du = 3x^2 dx.

With this we may write \int 3x^2 (x^3+1)^5 \, dx = \int u^5 \, du = \frac{1}{6} u^6 + C = \frac{1}{6} (x^3+1)^6 + C.
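As with any antiderivative, the answer can be checked by differentiating it. The Python sketch below compares a central-difference derivative of (x^3+1)^6/6 against the original integrand at a few sample points:

```python
def F(x):
    # (x^3 + 1)^6 / 6, the antiderivative obtained by substitution.
    return (x**3 + 1) ** 6 / 6

def f(x):
    # The original integrand 3x^2 (x^3 + 1)^5.
    return 3 * x**2 * (x**3 + 1) ** 5

# A central-difference derivative of F should reproduce f.
h = 1e-6
max_err = max(abs((F(x + h) - F(x - h)) / (2 * h) - f(x))
              for x in (0.2, 0.7, 1.1))
print(max_err)
```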

Note that it was not necessary that we had exactly the derivative of u in our integrand. It would have been sufficient to have any constant multiple of the derivative.

For instance, to treat the integral

\int x^4 \sin(x^5) \, dx

we may let u=x^5. Then

du = 5x^4 \, dx

and so

\frac{1}{5} du = x^4 \, dx

the right-hand side of which is a factor of our integrand. Thus,

\int x^4 \sin(x^5) \, dx = \int \frac{1}{5} \sin u \, du = -\frac{1}{5} \cos u + C  = -\frac{1}{5} \cos x^5 + C.

In general, a power of a function times that function's derivative may be integrated in this way. Since \frac{d[g(x)]}{dx}=g'(x),

we have dx=\frac{d[g(x)]}{g'(x)}.

Therefore, \int g'(x)[g(x)]^n dx\, = \int g'(x)[g(x)]^n \frac{d[g(x)]}{g'(x)}
=\int [g(x)]^n d[g(x)]
=\frac{[g(x)]^{n+1}}{n+1}+C\; as long as n\neq -1.


There is a similar rule for definite integrals, but we have to change the endpoints.

Substitution rule for definite integrals

Assume u is differentiable with continuous derivative and that f is continuous on the range of u. Suppose c=u(a) and d=u(b). Then \int_a^b f(u(x)) \frac{du}{dx} dx = \int_c^d f(u) du.

Examples

Consider the integral


\int_{0}^2 x \cos(x^2+1) \,dx

By using the substitution u = x^2 + 1, we obtain du = 2x\,dx and

\int_{0}^2 x \cos(x^2+1) \,dx = \frac{1}{2} \int_{0}^2 \cos(x^2+1) 2x \,dx
= \frac{1}{2} \int_{1}^{5}\cos(u)\,du
= \frac{1}{2}(\sin(5)-\sin(1)).

Note how the lower limit x = 0 was transformed into u = 0^2 + 1 = 1 and the upper limit x = 2 into u = 2^2 + 1 = 5.
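Both sides of this evaluation can be checked numerically, approximating the original x-integral with a midpoint Riemann sum and comparing against the closed form obtained through the substitution:

```python
import math

def midpoint_sum(f, a, b, n=100_000):
    # Midpoint Riemann sum approximation of the integral of f over [a, b].
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Left side: the original integral in x over [0, 2].
approx = midpoint_sum(lambda x: x * math.cos(x * x + 1), 0.0, 2.0)
# Right side: (1/2)(sin 5 - sin 1), using the transformed limits u = 1, u = 5.
exact = (math.sin(5) - math.sin(1)) / 2
print(approx, exact)
```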

Proof of the substitution rule

We will now prove the substitution rule for definite integrals. Let H be an antiderivative of h, so

 H'(u) = h(u).

Suppose we have a differentiable function, g(x) such that u\,=\,g(x), and numbers c=g(a) and d=g(b) derived from some given numbers, a and b.
By the Fundamental Theorem of Calculus, we have

 \int_c^d h(u) du = H(d) - H(c).

Next we define a function F by the rule

 F(x) = H(g(x)) = H(u)\,.

Naturally

f(x)\,=\,F'(x)\,=\,\frac{dF}{dx}

Then by the Chain rule F is differentiable with derivative

 \frac{dF}{dx} = \frac{dH}{du}\frac{du}{dx}= h(u) \frac{du}{dx} = h(g(x)) \frac{du}{dx}\,.

Integrating both sides with respect to x and using the Fundamental Theorem of Calculus we get

 \int_a^b h(g(x)) \frac{du}{dx}dx =\int_a^b \frac{dF}{dx} dx = F(b) - F(a).

But by the definition of F this equals

 F(b) - F(a) =  H(g(b)) - H(g(a)) = H(d) - H(c) = \int_c^d h(u) du.

Hence

\int_a^b f(x) dx\,=\,\int_a^b h(g(x)) \frac{du}{dx}dx = \int_c^d h(u) du.

which is the substitution rule for definite integrals.

Exercises

Evaluate the following using a suitable substitution.

1. \int\frac{19}{\sqrt{9x-38}}dx

\frac{38\sqrt{9x-38}}{9}+C

2. \int-15\sqrt{9x+43}dx

-\frac{10(9x+43)^{3/2}}{9}+C

3. \int\frac{17\sin(x)}{\cos(x)}dx

-17\ln|\cos(x)|+C

4. \int5\cos(x)\sin(x)dx

\frac{5\sin^{2}(x)}{2}+C

5. \int_{0}^{1}-\frac{10}{(-5x-32)^{4}}dx

-\frac{17885}{2489696256}

6. \int-3e^{3x+12}dx

-e^{3x+12}+C

Solutions


<h1> 4.7 Integration by Parts</h1>


Continuing on the path of reversing derivative rules in order to make them useful for integration, we reverse the product rule.

Integration by parts

If y = uv where u and v are functions of x, then

y'=\left(uv\right)'=u'v+uv'\,

Rearranging,

uv'=\left(uv\right)'-u'v\,

Therefore,

\int uv' dx = \int \left(uv\right)' dx - \int u'v dx

Therefore,

\int uv' dx = uv - \int vu' dx

, or

\int u \, dv = uv - \int v \, du

This is the integration by parts formula. It is very useful in many integrals involving products of functions, as well as others.

For instance, to treat

\int x \sin x \, dx

we choose u=x and dv=\sin x dx. With these choices, we have du=dx and v=-\cos x, and we have

\int x \sin x \, dx = -x \cos x - \int \left(-\cos x\right) \, dx = -x \cos x + \int \cos x \, dx = -x \cos x + \sin x + C
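This antiderivative can be checked numerically; the sketch below (our own helper `simpson`, interval [0, π] chosen for convenience) compares a Simpson's-rule estimate of ∫ x sin x dx against -x cos x + sin x evaluated at the endpoints.

```python
import math

def simpson(f, a, b, n=4000):
    # Composite Simpson's rule; n must be even.
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3

F = lambda x: -x * math.cos(x) + math.sin(x)   # antiderivative from integration by parts
numeric = simpson(lambda x: x * math.sin(x), 0.0, math.pi)
exact = F(math.pi) - F(0.0)
print(numeric, exact)
```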

Note that the choice of u and dv was critical. Had we chosen the reverse, so that u=\sin x and dv=x \, dx, the result would have been

\frac{1}{2} x^2 \sin x - \int \frac{1}{2} x^2 \cos x \, dx

The resulting integral is no easier to work with than the original; we might say that this application of integration by parts took us in the wrong direction.

So the choice is important. One general guideline to help us make that choice is, if possible, to choose u to be the factor of the integrand which becomes simpler when we differentiate it. In the last example, we see that \sin x does not become simpler when we differentiate it: \cos x is no simpler than \sin x.

An important feature of the integration by parts method is that we often need to apply it more than once. For instance, to integrate

\int x^2 e^x \, dx,

we start by choosing u=x^2 and dv = e^x dx to get

\int x^2 e^x \, dx = x^2 e^x - 2 \int x e^x \, dx.

Note that we still have an integral to take care of, and we do this by applying integration by parts again, with u=x and dv=e^x\, dx, which gives us

\int x^2 e^x \, dx = x^2 e^x - 2 \int x e^x \, dx = x^2 e^x - 2\left(x e^x -e^x\right) + C = x^2 e^x - 2x e^x + 2e^x + C.

So, two applications of integration by parts were necessary, owing to the power of x^2 in the integrand.
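The double-parts result can likewise be verified numerically. The sketch below compares a Simpson's-rule estimate of ∫ x²eˣ dx on [0, 1] (an interval of our own choosing) with the antiderivative eˣ(x² - 2x + 2) found above.

```python
import math

def simpson(f, a, b, n=4000):
    # Composite Simpson's rule; n must be even.
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3

F = lambda x: math.exp(x) * (x * x - 2 * x + 2)   # x^2 e^x - 2x e^x + 2 e^x
numeric = simpson(lambda x: x * x * math.exp(x), 0.0, 1.0)
exact = F(1.0) - F(0.0)
print(numeric, exact)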

Note that any power of x does become simpler when we differentiate it, so when we see an integral of the form

\int x^n f\left(x\right) \, dx

one of our first thoughts ought to be to consider using integration by parts with u=x^n. Of course, in order for it to work, we need to be able to write down an antiderivative for dv.

Example

Use integration by parts to evaluate the integral

 \int \sin\left(x\right) e^x\, dx

Solution: If we let  u = \sin\left(x\right) and dv=e^x\,dx, then we have du=\cos\left(x\right)dx and v=e^x. Using our rule for integration by parts gives

\int \sin\left(x\right) e^x\, dx= e^x\sin\left(x\right)- \int \cos\left(x\right)e^x\, dx

We do not seem to have made much progress. But if we integrate by parts again with  u = \cos\left(x\right) and dv=e^x\,dx, and hence du=-\sin\left(x\right)dx and v=e^x, we obtain

\int \sin\left(x\right) e^x\, dx = e^x\sin\left(x\right) - e^x\cos\left(x\right) - \int e^x \sin\left(x\right)\, dx

We may solve this identity to find the anti-derivative of e^x\sin\left(x\right) and obtain

\int \sin\left(x\right) e^x\, dx = \frac 1 2 e^x\left(\sin\left(x\right) - \cos\left(x\right) \right) +C
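Since this "solve for the integral" trick is easy to get a sign wrong in, a numerical check is reassuring. The sketch below (our own helper and interval) compares a Simpson's-rule estimate of ∫ eˣ sin x dx on [0, π] with ½eˣ(sin x - cos x) at the endpoints.

```python
import math

def simpson(f, a, b, n=4000):
    # Composite Simpson's rule; n must be even.
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3

F = lambda x: 0.5 * math.exp(x) * (math.sin(x) - math.cos(x))
numeric = simpson(lambda x: math.exp(x) * math.sin(x), 0.0, math.pi)
exact = F(math.pi) - F(0.0)
print(numeric, exact)
```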

With definite integral

For definite integrals the rule is essentially the same, as long as we keep the endpoints.

Integration by parts for definite integrals Suppose f and g are differentiable and their derivatives are continuous. Then

\int_a^b f\left(x\right)g'\left(x\right) dx = \left.\left(f\left(x\right)g\left(x\right)\right)\right|_a^b - \int_a^b f'\left(x\right) g\left(x\right) dx
=f\left(b\right)g\left(b\right)-f\left(a\right)g\left(a\right) - \int_a^b f'\left(x\right) g\left(x\right) dx.

This can also be expressed in Leibniz notation.

\int_a^b udv = \left.\left(uv\right)\right|_a^b - \int_a^b v du.

More Examples

Examples Set 1: Integration by Parts

Exercises

Evaluate the following using integration by parts.

1. \int -4\ln\left(x\right)dx

-4x\ln\left(x\right)+4x+C

2. \int\left(-7x+38\right)\cos\left(x\right)dx

\left(-7x+38\right)\sin\left(x\right)-7\cos\left(x\right)+C

3. \int_0^\frac{\pi}{2}\left(-6x+45\right)\cos\left(x\right)dx

-3\pi+51

4. \int\left(5x+1\right)\left(x-6\right)^4 dx

\frac{\left(5x+1\right)\left(x-6\right)^5}{5}-\frac{\left(x-6\right)^6}{6}+C

5. \int_{-1}^1 \left(2x+8\right)^3\left(-x+2\right)dx

\frac{9584}{5}

Solutions


<h1> 4.8 Trigonometric Substitutions</h1>


The idea behind the trigonometric substitution is quite simple: to replace expressions involving square roots with expressions that involve standard trigonometric functions, but no square roots. Integrals involving trigonometric functions are often easier to solve than integrals involving square roots.

Let us demonstrate this idea in practice. Consider the expression \sqrt{1-x^2}. Probably the most basic trigonometric identity is \displaystyle\sin^2(\theta)+\cos^2(\theta)=1 for an arbitrary angle \displaystyle\theta. If we replace \displaystyle x in this expression by \displaystyle\sin(\theta), with the help of this trigonometric identity we see

\sqrt{1-x^2}=\sqrt{1-\sin^2(\theta)}=\sqrt{\cos^2(\theta)}=\cos(\theta)

Note that we could write \displaystyle\theta=\arcsin(x), since we replaced \displaystyle x with \displaystyle\sin(\theta).

Technically, one should write the absolute value of \displaystyle\cos(\theta), in other words \displaystyle|\cos(\theta)|, as our final answer, since \displaystyle\sqrt{A^2}=|A| for all possible \displaystyle A. But as long as we are careful about the domain of all possible \displaystyle x and about how \displaystyle\cos(\theta) is used in the final computation, omitting the absolute value signs does not cause a problem. However, we cannot simply interchange the expression \displaystyle\cos(\theta) with \displaystyle\sqrt{1-x^2} wherever it may appear: when integrating by substitution we must also take the differential into account. That is, \displaystyle dx=\cos(\theta)\,d\theta, so to get an integral that involves only \displaystyle\theta we must also replace \displaystyle dx by something in terms of \displaystyle d\theta. Thus, if we see an integral of the form

\int\sqrt{1-x^2}dx

we can rewrite it as

\int\cos(\theta) \cos\theta\,d\theta=\int\cos^2 \theta \,d\theta.

Notice in the expression on the left that the first \displaystyle\cos\theta comes from replacing the \displaystyle\sqrt{1-x^2} and the \displaystyle\cos\theta\,d\theta comes from substituting for the \displaystyle dx.

Since \displaystyle\cos^2(\theta)=\tfrac{1}{2}(1+\cos(2\theta)) our original integral reduces to:

\tfrac{1}{2}\int d\theta +\tfrac{1}{2}\int\cos(2\theta)d\theta.

These last two integrals are easily handled. For the first integral we get

\tfrac{1}{2}\int\,d\theta=\tfrac{1}{2}\theta

For the second integral we do a substitution, namely \displaystyle u=2\theta (and \displaystyle du=2d\theta) to get:

\tfrac{1}{2}\int\cos(2\theta)d\theta=\tfrac{1}{2}\int\cos u\,\tfrac{1}{2}du=\tfrac{1}{4}\sin u=\tfrac{1}{4}\sin(2\theta)

Finally we see that:

\int\cos^2 \theta \,d\theta=\tfrac{1}{2}\theta + \tfrac{1}{4}\sin(2\theta)+C=\tfrac{1}{2}\theta + \tfrac{1}{2}\sin(\theta)\cos(\theta)+C

However, this is in terms of \displaystyle\theta and not in terms of \displaystyle x, so we must substitute back in order to rewrite the answer in terms of \displaystyle x.

That is we worked out that:

\sin(\theta)=x\qquad \cos(\theta)=\sqrt{1-x^2}\qquad \text{and }\theta=\arcsin(x)

So we arrive at our final answer

\int\sqrt{1-x^2}dx=\tfrac{1}{2}\arcsin(x)+\tfrac{1}{2}x\sqrt{1-x^2}+C

As you can see, even for a fairly harmless looking integral this technique can involve quite a lot of calculation. Often it is helpful to see if a simpler method will suffice before turning to trigonometric substitution. On the other hand, frequently in the case of integrands involving square roots, this is the most tractable way to solve the problem. We begin with giving some rules of thumb to help you decide which trigonometric substitutions might be helpful.
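Given how much calculation was involved, it is worth confirming the answer numerically. The sketch below compares a Simpson's-rule estimate of ∫ √(1-x²) dx on [0, 0.9] (we stop short of x = 1 to avoid the square-root's infinite slope there) with the antiderivative ½ arcsin x + ½ x√(1-x²).

```python
import math

def simpson(f, a, b, n=4000):
    # Composite Simpson's rule; n must be even.
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3

F = lambda x: 0.5 * math.asin(x) + 0.5 * x * math.sqrt(1 - x * x)
numeric = simpson(lambda x: math.sqrt(1 - x * x), 0.0, 0.9)
exact = F(0.9) - F(0.0)
print(numeric, exact)
```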

If the integrand contains a single factor of one of the forms \sqrt{a^2-x^2} \mbox{ or } \sqrt{a^2+x^2} \mbox{ or } \sqrt{x^2-a^2} we can try a trigonometric substitution.

  • If the integrand contains \sqrt{a^2-x^2} let x = a \sin \theta and use the identity 1-\sin^2 \theta = \cos^2 \theta.
  • If the integrand contains \sqrt{a^2+x^2} let x = a \tan \theta and use the identity 1+\tan^2 \theta = \sec^2 \theta.
  • If the integrand contains \sqrt{x^2-a^2} let x = a \sec \theta and use the identity \sec^2 \theta-1 = \tan^2 \theta.

Sine substitution

This substitution is easily derived from a triangle, using the Pythagorean Theorem.

If the integrand contains a piece of the form  \sqrt{a^2-x^2} we use the substitution

x=a\sin \theta \quad dx=a \cos \theta d\theta

This will transform the integrand to a trigonometric function. If the new integrand can't be integrated on sight then the tan-half-angle substitution described below will generally transform it into a more tractable algebraic integrand.

E.g., if the integrand is \sqrt{1-x^2},

\begin{matrix}
\int_0^1 \sqrt{1-x^2} dx 
& = & \int_0^{\pi/2} \sqrt{1-\sin^2 \theta} \cos \theta \, d\theta \\
& = & \int_0^{\pi/2}  \cos^2 \theta \, d\theta \\
& = & \frac{1}{2} \int_0^{\pi/2}  \left( 1+ \cos 2\theta \right) d\theta \\
& = & \frac{\pi}{4} 
\end{matrix}

If the integrand is \sqrt{1+x}/\sqrt{1-x}, we can rewrite it as

\sqrt{\frac{1+x}{1-x}} = \sqrt{\frac{(1+x)^2}{(1+x)(1-x)}}
=\frac{1+x}{\sqrt{1-x^2}}

Then we can make the substitution

\begin{matrix}
\int_0^a \frac{1+x}{\sqrt{1-x^2}} dx
& = & \int_0^\alpha \frac{1+\sin \theta}{\cos \theta} \cos \theta \, d\theta
& 0 <a < 1 \\
& = & \int_0^\alpha 1+ \sin \theta \, d\theta & \alpha = \sin^{-1} a \\
& = & \alpha + \left[ - \cos \theta \right]_0^\alpha & \\
& = & \alpha + 1 - \cos \alpha & \\
& = & 1+ \sin^{-1} a - \sqrt{1-a^2} & \\  
\end{matrix}
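The closed form 1 + sin⁻¹a - √(1-a²) can be checked numerically for a sample value of a; below we take a = 0.5 (our own choice) and compare against a Simpson's-rule estimate of the integral.

```python
import math

def simpson(f, a, b, n=4000):
    # Composite Simpson's rule; n must be even.
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3

a = 0.5
numeric = simpson(lambda x: (1 + x) / math.sqrt(1 - x * x), 0.0, a)
exact = 1 + math.asin(a) - math.sqrt(1 - a * a)
print(numeric, exact)
```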

Tangent substitution

This substitution is easily derived from a triangle, using the Pythagorean Theorem.

When the integrand contains a piece of the form \sqrt{a^2+x^2} we use the substitution

 x = a \tan \theta \quad \sqrt{x^2+a^2} = a \sec \theta \quad 
dx = a \sec^2 \theta d\theta

E.g., if the integrand is \left(x^2+a^2\right)^{-3/2}, then on making this substitution we find

\begin{matrix}
\int_0^z \left( x^2+a^2 \right)^{-\frac{3}{2}}dx & = & 
a^{-2} \int_0^\alpha \cos \theta \, d\theta & z>0 \\
& = & a^{-2} \left[ \sin \theta \right]_0^\alpha & \alpha = \tan^{-1} (z/a) \\
& = & a^{-2} \sin \alpha & \\
& = & a^{-2} \frac{z/a}{\sqrt{1+z^2/a^2}} 
& = \frac{1}{a^2} \frac{z}{\sqrt{a^2+z^2}} \\
\end{matrix}

If the integral is

I= \int_0^z \sqrt{x^2+a^2}\,dx \quad z>0

then on making this substitution we find

\begin{matrix}
I & = & a^2 \int_0^\alpha \sec^3 \theta \, d\theta 
& & & \alpha = \tan^{-1} (z/a) \\
& = & a^2 \int_0^\alpha \sec \theta \, d\tan \theta & & & \\
& = & a^2 [ \sec \theta \tan \theta ]_0^\alpha & - & 
a^2 \int_0^\alpha \sec \theta \tan^2 \theta \, d\theta & \\
& = & a^2 \sec \alpha \tan \alpha & - 
& a^2 \int_0^\alpha \sec^3 \theta \, d\theta & +
a^2 \int_0^\alpha \sec \theta \, d\theta \\
& = & a^2 \sec \alpha \tan \alpha & -
& I & + a^2 \int_0^\alpha \sec \theta \, d\theta \\
\end{matrix}

After integrating by parts, and using trigonometric identities, we've ended up with an expression involving the original integral. In cases like this we must now rearrange the equation so that the original integral is on one side only

\begin{matrix}
I & =  & \frac{1}{2}a^2 \sec \alpha \tan \alpha & 
+ & \frac{1}{2}a^2 \int_0^\alpha \sec \theta \, d\theta \\
& = & \frac{1}{2}a^2 \sec \alpha \tan \alpha & 
+ &  \frac{1}{2}a^2 \left[ \ln \left( \sec \theta 
+ \tan \theta \right) \right]_0^\alpha \\
& = & \frac{1}{2}a^2 \sec \alpha \tan \alpha & 
+ &  \frac{1}{2}a^2 \ln \left( \sec \alpha  + \tan \alpha \right) \\
& = & \frac{1}{2}a^2 \left( \sqrt{1+\frac{z^2}{a^2}} \right) \frac{z}{a} & 
+ & \frac{1}{2}a^2 \ln \left( \sqrt{1+\frac{z^2}{a^2}}+\frac{z}{a} \right) \\
& = & \frac{1}{2}z\sqrt{z^2+a^2} & 
+ & \frac{1}{2}a^2 \ln \left(\frac{z}{a} + \sqrt{1+\frac{z^2}{a^2}} \right) \\
\end{matrix}

As we would expect from the integrand, this is approximately z^2/2 for large z.
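After a derivation this long, a numerical check is prudent. The sketch below takes a = 1, z = 2 (our own sample values) and compares a Simpson's-rule estimate of ∫₀^z √(x²+a²) dx with the final closed form.

```python
import math

def simpson(f, a, b, n=4000):
    # Composite Simpson's rule; n must be even.
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3

a, z = 1.0, 2.0
numeric = simpson(lambda x: math.sqrt(x * x + a * a), 0.0, z)
exact = (0.5 * z * math.sqrt(z * z + a * a)
         + 0.5 * a * a * math.log(z / a + math.sqrt(1 + z * z / (a * a))))
print(numeric, exact)
```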

In some cases it is possible to do trigonometric substitution in cases when there is no \sqrt{\ \ \ } appearing in the integral.

Example

\int\frac{1}{x^2+1}\,dx

The denominator of this function is equal to \displaystyle(\sqrt{1+x^2})^2. This suggests that we try to substitute \displaystyle x=\tan u\, and use the identity \displaystyle 1 + \tan^2 u =\sec^2 u\,. With this substitution, we obtain that \displaystyle dx= \sec^2 u\, du and thus

\int\frac{1}{x^2+1}\,dx=\int\frac{1}{\tan^2 u+1} \sec^2 u\,du
=\int\frac{1}{\sec^2 u} \sec^2 u\,du
=\int\,du
=u + c

Using the initial substitution u=\arctan x gives

\int\frac{1}{x^2+1}\,dx = \arctan x + C

Secant substitution

This substitution is easily derived from a triangle, using the Pythagorean Theorem.

If the integrand contains a factor of the form \displaystyle\sqrt{x^2-a^2} we use the substitution

x = a \sec \theta \quad dx = a \sec \theta \tan \theta d\theta \quad
\sqrt{x^2-a^2} = a \tan \theta.

Example 1

Find \displaystyle\int_1^z \frac{\sqrt{x^2-1}}{x}dx.

\begin{matrix}
\int_1^z \frac{\sqrt{x^2-1}}{x}dx & = & 
\int_0^\alpha \frac{\tan \theta }{\sec \theta }\sec \theta \tan \theta \,d\theta & z>1 \\
& =  & \int_0^\alpha \tan^2 \theta \, d\theta & \alpha = \sec^{-1} z \\
& = & \left[ \tan \theta  -\theta \right]_0^\alpha & 
\tan \alpha = \sqrt{\sec^2 \alpha -1} \\
& =& \tan \alpha  -\alpha & \tan \alpha = \sqrt{z^2-1} \\
& =& \sqrt{z^2-1} - \sec^{-1} z & \\
\end{matrix}

Example 2

Find \displaystyle\int_1^z \frac{\sqrt{x^2-1}}{x^2} dx.

\begin{matrix}
\int_1^z \frac{\sqrt{x^2-1}}{x^2}dx & = & 
\int_0^\alpha \frac{\tan \theta}{\sec^2 \theta}\sec \theta \tan \theta \, d\theta &
z>1 \\
& =  & \int_0^\alpha \frac{\sin^2 \theta}{\cos \theta} d\theta & 
\alpha = \sec^{-1} z \\
\end{matrix}

We can now integrate by parts

\begin{matrix}
\int_1^z \frac{\sqrt{x^2-1}}{x^2}dx & = & 
-\left[ \tan \theta \cos \theta \right]_0^\alpha 
+ \int_0^\alpha \sec \theta \, d\theta \\ 
& = & -\sin \alpha  +\left[ \ln (\sec \theta + \tan \theta ) \right]_0^\alpha \\
& = & \ln (\sec \alpha + \tan \alpha ) - \sin \alpha \\
& = & \ln (z+ \sqrt{z^2-1} ) - \frac{\sqrt{z^2-1}}{z}\\
\end{matrix}
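Rather than integrating numerically across the square-root singularity at x = 1, we can verify the answer by differentiating it: the derivative of ln(z + √(z²-1)) - √(z²-1)/z with respect to z should recover the integrand √(z²-1)/z². The sketch below checks this with a central finite difference at the sample point z = 2.

```python
import math

def G(z):
    # Closed form obtained above for the integral from 1 to z.
    return math.log(z + math.sqrt(z * z - 1)) - math.sqrt(z * z - 1) / z

z, h = 2.0, 1e-5
derivative = (G(z + h) - G(z - h)) / (2 * h)   # central difference approximation of G'(z)
integrand = math.sqrt(z * z - 1) / z**2
print(derivative, integrand)
```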

Exercise

Evaluate the following using an appropriate trigonometric substitution.

1. \int\frac{10}{25x^{2}+1}dx

2\arctan(5x)+C

Solution


<h1> 4.9 Trigonometric Integrals</h1>


When the integrand is primarily or exclusively based on trigonometric functions, the following techniques are useful.

Powers of Sine and Cosine

We will give a general method for integrands of the form \cos^m(x)\sin^n(x). First let us work through an example.

\int\cos^3(x)\sin^2(x)\,dx

Notice that the integrand contains an odd power of cos. So rewrite it as

\int\cos^2(x)\sin^2(x)\cos(x)\,dx

We can solve this by making the substitution u=\sin(x) so du=\cos(x)dx. Then we can write the whole integrand in terms of u by using the identity

\cos^2(x)=1-\sin^2(x)=1-u^2.

So

\begin{matrix}
\int\cos^3(x)\sin^2(x)\,dx &=&\int\cos^2(x)\sin^2(x)\cos(x)\,dx\\
&=&\int (1-u^2)u^2\,du\\
&=&\int u^2\,du - \int u^4\,du\\
&=&{1\over 3} u^3-{1\over 5}u^5 + C\\
&=&{1\over 3} \sin^3(x)-{1\over 5}\sin^5(x)+C 
\end{matrix}.

This method works whenever there is an odd power of sine or cosine.

To evaluate \int\cos^m(x)\sin^n(x)\,dx when either m or n is odd.

  • If m is odd substitute u=\sin(x) and use the identity \cos^2(x)=1-\sin^2(x)=1-u^2.
  • If n is odd substitute u=\cos(x) and use the identity \sin^2(x)=1-\cos^2(x)=1-u^2.

Example

Find \int_0^{\pi/2} \cos^{40}(x)\sin^3(x) dx.

As there is an odd power of \sin we let u=\cos(x) so du=-\sin(x)dx. Notice that when x=0 we have u=\cos(0)=1 and when x=\pi/2 we have u=\cos(\pi/2) = 0.

\begin{matrix}
\int_0^{\pi/2} \cos^{40}(x)\sin^3(x) dx &=&  \int_0^{\pi/2} \cos^{40}(x)\sin^2(x) \sin(x) dx \\ 
&=& -\int_{1}^{0} u^{40} (1-u^2) du \\
&=&\int_{0}^{1} u^{40} (1-u^2) du\\
&=& \int_{0}^{1} u^{40} - u^{42} du \\
&=& [\frac{1}{41}u^{41} - \frac{1}{43}u^{43}]_0^1 \\
&=& \frac{1}{41}-\frac{1}{43}.
\end{matrix}
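The value 1/41 - 1/43 can be confirmed numerically. The sketch below (our own Simpson helper) estimates the definite integral directly.

```python
import math

def simpson(f, a, b, n=4000):
    # Composite Simpson's rule; n must be even.
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3

numeric = simpson(lambda x: math.cos(x)**40 * math.sin(x)**3, 0.0, math.pi / 2)
exact = 1 / 41 - 1 / 43
print(numeric, exact)
```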

When both m and n are even things get a little more complicated.

To evaluate \int\cos^m(x)\sin^n(x)\,dx when both m and n are even.


Use the identities \sin^2(x)=\frac{1}{2}(1-\cos(2x)) and \cos^2(x)=\frac{1}{2}(1+\cos(2x)).

Example

Find  \int\sin^2(x)\cos^4(x)\,dx.

As \sin^2(x)=\frac{1}{2}(1-\cos(2x)) and \cos^2(x)=\frac{1}{2}(1+\cos(2x)) we have


\int \sin^2(x)\cos^4(x)\,dx = \int \left( {1 \over 2}(1 - \cos(2x)) \right)
  \left( {1 \over 2}(1 + \cos(2x)) \right)^2 \,dx,

and expanding, the integrand becomes

\frac{1}{8} \int \left( 1 - \cos^2(2x) + \cos(2x)- \cos^3(2x) \right) \,dx.

Using the multiple angle identities

\begin{matrix}
I & = & \frac{1}{8} \left( \int 1 \, dx  - \int \cos^2(2x)\, dx 
 + \int \cos(2x)\,dx  -\int \cos^3(2x)\,dx \right) \\ 
& = & \frac{1}{8}  \left( x  - \frac{1}{2} \int (1 + \cos(4x))\,dx 
 + \frac{1}{2}\sin(2x)  -\int \cos^2(2x) \cos(2x) \,dx\right) \\
& = & \frac{1}{16}  \left( x + \sin(2x) + \int \cos(4x) \,dx   
-2 \int(1-\sin^2(2x))\cos(2x)\,dx\right) \\
\end{matrix}

then we obtain on evaluating

I=\frac{x}{16}-\frac{\sin(4x)}{64} + \frac{\sin^3(2x)}{48}+C
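The multi-step reduction above is easy to slip up in, so a numerical check is worthwhile. The sketch below compares a Simpson's-rule estimate of ∫ sin²x cos⁴x dx on [0, π/2] (our own choice of interval) with the antiderivative just obtained.

```python
import math

def simpson(f, a, b, n=4000):
    # Composite Simpson's rule; n must be even.
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3

F = lambda x: x / 16 - math.sin(4 * x) / 64 + math.sin(2 * x)**3 / 48
numeric = simpson(lambda x: math.sin(x)**2 * math.cos(x)**4, 0.0, math.pi / 2)
exact = F(math.pi / 2) - F(0.0)
print(numeric, exact)
```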

Powers of Tan and Secant

To evaluate \int\tan^m(x)\sec^n(x)\,dx.

  1. If n is even and n\ge 2 then substitute u=\tan(x) and use the identity \sec^2(x)=1+\tan^2(x).
  2. If n and m are both odd then substitute u=\sec(x) and use the identity \tan^2(x)=\sec^2(x)-1.
  3. If n is odd and m is even then use the identity \tan^2(x)=\sec^2(x)-1 and apply a reduction formula to integrate \sec^j(x)dx\,, using the examples below to integrate when j=1,2.

Example 1

Find \int \sec^2(x)dx.

There is an even power of \sec(x). Substituting u=\tan(x) gives du = \sec^2(x)dx so

 \int \sec^2(x)dx = \int du = u+C = \tan(x)+C.


Example 2

Find \int \tan(x)dx.

Let u=\cos(x) so du=-\sin(x)dx. Then

\begin{matrix}
\int \tan(x)dx &=& \int \frac{\sin(x)}{\cos(x)} dx \\
&=& \int \frac{-1}{u} du \\
&=& -\ln |u| + C \\
&=& -\ln |\cos(x) | + C\\
&=& \ln |\sec(x)| +C.
\end{matrix}


Example 3

Find \int \sec(x)dx.

The trick to do this is to multiply and divide by the same thing like this:

\begin{matrix}
\int \sec(x)dx &=& \int \sec(x)\frac{\sec(x) + \tan(x)}{\sec(x) + \tan(x)} dx \\
&=& \int \frac{\sec^2(x) + \sec(x) \tan(x)}{\sec(x)+ \tan(x)} dx
\end{matrix}.

Making the substitution u= \sec(x) + \tan(x) so du = \left(\sec(x)\tan(x) + \sec^2(x)\right)dx,

\begin{matrix}
\int \sec(x) dx &=& \int \frac{1}{u} du\\
&=& \ln |u| + C \\
&=& \ln |\sec(x) + \tan(x)| + C
\end{matrix}.
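The classic result ∫ sec x dx = ln|sec x + tan x| + C can be checked numerically; below we compare a Simpson's-rule estimate on [0, 1] (our own interval) with the antiderivative at the endpoints.

```python
import math

def simpson(f, a, b, n=4000):
    # Composite Simpson's rule; n must be even.
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3

F = lambda x: math.log(abs(1 / math.cos(x) + math.tan(x)))  # ln|sec x + tan x|
numeric = simpson(lambda x: 1 / math.cos(x), 0.0, 1.0)
exact = F(1.0) - F(0.0)
print(numeric, exact)
```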

More trigonometric combinations

For the integrals \int \sin(nx)\cos(mx)\,dx or \int \sin(nx)\sin(mx)\,dx or \int \cos(nx)\cos(mx)\,dx use the identities

  •  \sin(a)\cos(b) = {1\over 2}(\sin{(a+b)}+\sin{(a-b)}) \,
  •  \sin(a)\sin(b) = {1\over 2}(\cos{(a-b)}-\cos{(a+b)}) \,
  •  \cos(a)\cos(b) = {1\over 2}(\cos{(a-b)}+\cos{(a+b)}) \,

Example 1

Find  \int \sin(3x)\cos(5x)\,dx.

We can use the fact that \sin(a)\cos(b)=(1/2)(\sin(a+b)+\sin(a-b)), so

\sin(3x)\cos(5x)=(\sin(8x)+\sin{(-2x)})/2 \,

Now use the oddness property of \sin(x) to simplify

\sin(3x)\cos{5x}=(\sin(8x)-\sin(2x))/2 \,

And now we can integrate

\begin{matrix}
\int \sin(3x)\cos(5x)\,dx & = & \frac{1}{2} \int \sin(8x)-\sin(2x)dx \\
& = &  \frac{1}{2}(-\frac{1}{8}\cos(8x)+\frac{1}{2}\cos(2x)) +C \\
\end{matrix}

Example 2

Find:\int \sin(x)\sin(2x)\,dx.

Using the identities

\sin(x) \sin(2x)= \frac{1}{2} \left( \cos(-x)-\cos(3x) \right)
= \frac{1}{2} (\cos(x) -\cos(3x)).

Then

\begin{matrix} 
\int \sin(x)\sin(2x)\,dx & = & \frac{1}{2} \int (\cos(x)-\cos(3x))\,dx  \\
& = &  \frac{1}{2}(\sin(x)-\frac{1}{3}\sin(3x)) + C
\end{matrix}

<h1> 4.10 Rational Functions by Partial Fraction Decomposition</h1>


Suppose we want to find \int {3x+ 1 \over x^2+x} dx. One way to do this is to simplify the integrand by finding constants A and B so that

{3x+ 1 \over x^2+x}={3x+ 1 \over x(x+1)}= {A \over x}+ {B \over x+1}.

This can be done by writing the right-hand side over a common denominator, which gives

{3x+1\over x(x+1)} = {{A(x+1) + Bx} \over {x(x+1)}}

As both sides have the same denominator we must have

3x+1 = A(x+1)+Bx

This equation must hold for every value of x. Putting in x=0 gives  1 = A, and putting in x=-1 gives -2=-B so B=2. So we see that

 \frac{3x+ 1}{x^2+x} = \frac{1}{x} + \frac{2}{x+1} \

Returning to the original integral

 \int \frac{3x+1}{x^2+x} dx =  \int \frac{dx}{x} + \int \frac{2}{x+1} dx
=  \ln \left| x \right| + 2 \ln \left| x+1 \right| + C

Rewriting the integrand as a sum of simpler fractions has allowed us to reduce the initial integral to a sum of simpler integrals. In fact this method works to integrate any rational function.
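The decomposition and the resulting antiderivative can both be checked numerically. The sketch below compares a Simpson's-rule estimate of ∫ (3x+1)/(x²+x) dx on [1, 2] (an interval away from the poles at 0 and -1, our own choice) with ln|x| + 2 ln|x+1|.

```python
import math

def simpson(f, a, b, n=4000):
    # Composite Simpson's rule; n must be even.
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3

F = lambda x: math.log(abs(x)) + 2 * math.log(abs(x + 1))
numeric = simpson(lambda x: (3 * x + 1) / (x * x + x), 1.0, 2.0)
exact = F(2.0) - F(1.0)
print(numeric, exact)
```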

Method of Partial Fractions

To decompose the rational function \frac{P(x)}{Q(x)}:

  • Step 1 Use polynomial long division (if necessary) to ensure that the degree of P(x) is less than the degree of Q(x).

  • Step 2 Factor Q(x) as far as possible.
  • Step 3 Write down the correct form for the partial fraction decomposition (see below) and solve for the constants.

To factor Q(x) we have to write it as a product of linear factors (of the form ax+b) and irreducible quadratic factors (of the form ax^2+bx+c with b^2-4ac<0).

Some of the factors could be repeated. For instance if Q(x) = x^3-6x^2+9x we factor Q(x) as

 Q(x) = x(x^2-6x+9) = x(x-3)(x-3)=x(x-3)^2.

It is important that in each quadratic factor we have b^2-4ac<0, otherwise it is possible to factor that quadratic piece further. For example if Q(x) = x^3-3x^2+2x then we can write

 Q(x) = x(x^2-3x+2) = x(x-1)(x-2)


We will now show how to write P(x)/Q(x) as a sum of terms of the form

 {A \over (ax+b)^k} and {Ax+B \over (ax^2+bx+c)^k}.

Exactly how to do this depends on the factorization of Q(x) and we now give four cases that can occur.

Q(x) is a product of linear factors with no repeats

This means that Q(x) = (a_1x+b_1)(a_2x+b_2)...(a_nx+b_n) where no factor is repeated and no factor is a multiple of another.

For each linear term we write down something of the form {A \over (ax+b)}, so in total we write

 {P(x) \over Q(x)} = {A_1 \over (a_1x+b_1)} + {A_2 \over (a_2x+b_2)} + \cdots +  {A_n \over (a_nx+b_n)}
Example 1

Find  \int {1+x^2 \over (x+3)(x+5)(x+7)}dx

Here we have P(x)=1+x^2,  Q(x)=(x+3)(x+5)(x+7) and Q(x) is a product of linear factors. So we write

 \frac{1+x^2}{(x+3)(x+5)(x+7)}=\frac{A}{x+3}+\frac{B}{x+5}+\frac{C}{x+7}

Multiply both sides by the denominator

1+x^2=A(x+5)(x+7)+B(x+3)(x+7)+C(x+3)(x+5)

Substitute in three values of x to get three equations for the unknown constants,

\begin{matrix} x=-3 & 1+3^2=2\cdot 4 A \\  x=-5 & 1+5^2=-2\cdot 2 B \\  
x=-7 & 1+7^2=(-4)\cdot (-2) C \end{matrix}

so A=5/4, B=-13/2, C=25/4, and

 \frac{1+x^2}{(x+3)(x+5)(x+7)}=\frac{5}{4x+12} -\frac{13}{2x+10} +\frac{25}{4x+28}

We can now integrate term by term.

\int \frac{1+x^2 \, dx}{(x+3)(x+5)(x+7)}=
\frac{5}{4} \ln|x+3| - \frac{13}{2}\ln|x+5|+ \frac{25}{4}\ln|x+7|+ C

Exercises

Evaluate the following using the method of partial fraction decomposition.

1. \int\frac{2x+11}{(x+6)(x+5)}dx

\ln|x+6|+\ln|x+5|+C

2. \int\frac{7x^{2}-5x+6}{(x-1)(x-3)(x-7)}dx

\frac{2}{3}\ln|x-1|-\frac{27}{4}\ln|x-3|+\frac{157}{12}\ln|x-7|+C

Solutions

Q(x) is a product of linear factors some of which are repeated

If (ax+b) appears in the factorisation of Q(x) k-times then instead of writing the piece  {A \over (ax+b)} we use the more complicated expression

 {A_1\over ax+b} + {A_2\over (ax+b)^2} +  {A_3\over (ax+b)^3} + \cdots + {A_k\over (ax+b)^k}

Example 2

Find  \int {1 \over (x+1)(x+2)^2}dx

Here P(x)=1 and Q(x)=(x+1)(x+2)^2 We write

 \frac{1}{(x+1)(x+2)^2}=\frac{A}{x+1}+\frac{B}{x+2}+\frac{C}{(x+2)^2}

Multiply both sides by the denominator  1= A(x+2)^2+B(x+1)(x+2)+C(x+1)

Substitute in three values of x to get 3 equations for the unknown constants,

\begin{matrix} x=0 & 1= 2^2A +2B+C \\  x=-1 & 1=A \\  
x=-2 & 1= -C \end{matrix}

so A=1, B=-1, C=-1, and

 \frac{1}{(x+1)(x+2)^2}=\frac{1}{x+1}-\frac{1}{x+2}-\frac{1}{(x+2)^2}

We can now integrate term by term.

\int \frac{1}{(x+1)(x+2)^2} dx= \ln|x+1| - \ln|x+2| + \frac{1}{x+2} +C

Using the properties of logarithms, this simplifies to

\ln\left|\frac{x+1}{x+2}\right| + \frac{1}{x+2} +C

Exercise

3. Evaluate \int\frac{x^{2}-x+2}{x(x+2)^{2}}dx using the method of partial fractions.

\frac{1}{2}\ln|x|+\frac{1}{2}\ln|x+2|+\frac{4}{x+2}+C

Solution

Q(x) contains some quadratic pieces which are not repeated

If (ax^2+bx+c) appears we use  {Ax+B \over (ax^2+bx+c)}.

Exercises

Evaluate the following using the method of partial fractions.

4. \int \frac{2}{(x+2)(x^{2}+3)} dx

\frac{2}{7}\ln|x+2|-\frac{1}{7}\ln|x^{2}+3|+\frac{4}{7\sqrt{3}}\arctan(\frac{x}{\sqrt{3}})+C

5. \int\frac{dx}{(x+2)(x^{2}+2)}

\frac{1}{6}\ln|x+2|-\frac{1}{12}\ln|x^{2}+2|+\frac{\sqrt{2}}{6}\arctan(\frac{x}{\sqrt{2}})+C

Solutions

Q(x) contains some repeated quadratic factors

If (ax^2+bx+c) appears k-times then use

 {A_1x+B_1 \over (ax^2+bx+c)} +  {A_2x+B_2 \over (ax^2+bx+c)^2} + {A_3x+B_3 \over (ax^2+bx+c)^3} + \cdots + {A_kx+B_k \over (ax^2+bx+c)^k}

Exercise

Evaluate the following using the method of partial fractions.

6. \int\frac{dx}{(x^{2}+1)^{2}(x-1)}

-\frac{1}{2}\arctan(x)+\frac{1-x}{4(x^{2}+1)}+\frac{1}{8}\ln\left(\frac{(x-1)^{2}}{x^{2}+1}\right)+C

Solution


<h1> 4.11 Tangent Half Angle Substitution</h1>


Another useful change of variables is the Weierstrass substitution, named after Karl Weierstrass:

t=\tan (x/2) \,

With this transformation, using the double-angle trigonometric identities,

\sin x =\frac{2t}{1+t^2} \quad \cos x = \frac{1-t^2}{1+t^2} \quad
\tan x = \frac{2t}{1-t^2} \quad dx=\frac{2 \, dt}{1+t^2}

This transforms a trigonometric integral into an algebraic integral, which may be easier to integrate.

For example, if the integrand is 1/(1 + sin x) then

\begin{align}
& {}\quad \int_0^{\pi/2} \frac{dx}{1 + \sin x} = \int_0^1 \frac{\left(\frac{2\,dt}{1+t^2}\right)}{1 + \left(\frac{2t}{1+t^2}\right)} \\[8pt]
& = \int_0^1 \frac{2\,dt}{(t+1)^2}\\[8pt]
\end{align}
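The equality of the two integrals in the display above can be confirmed numerically; the sketch below estimates both sides with our own Simpson helper.

```python
import math

def simpson(f, a, b, n=4000):
    # Composite Simpson's rule; n must be even.
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3

lhs = simpson(lambda x: 1 / (1 + math.sin(x)), 0.0, math.pi / 2)
rhs = simpson(lambda t: 2 / (t + 1)**2, 0.0, 1.0)
print(lhs, rhs)
```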

This method can be used to further simplify trigonometric integrals produced by the changes of variables described earlier.

For example, if we are considering the integral

I = \int_{-1}^1 \frac{\sqrt{1-x^2}}{1+x^2} \, dx

we can first use the substitution x = sin θ, which gives

I = \int_{-\pi/2}^{\pi/2} \frac{ \cos^2 \theta}{1+ \sin^2 \theta} \, d\theta

then use the tan-half-angle substitution to obtain

I = \int_{-1}^1 \frac{(1-t^2)^2}{1+6t^2+t^4}\frac{2 \, dt}{1+t^2}

In effect, we've removed the square root from the original integrand. We could do this with a single change of variables, but doing it in two steps gives us the opportunity of doing the trigonometric integral another way.

Having done this, we can split the new integrand into partial fractions, and integrate.

\begin{align}
I & = \int_{-1}^1 \frac{2-\sqrt{2}}{t^2+3-\sqrt{8}} \, dt 
+ \int_{-1}^1 \frac{2+\sqrt{2}}{t^2+3+\sqrt{8}} \, dt 
- \int_{-1}^1 \frac{2}{1+t^2} \, dt \\[8pt]
& = \frac{4-\sqrt{8}}{\sqrt{3-\sqrt{8}}} \tan^{-1} \left(\sqrt{3+\sqrt{8}}\,\right)
+ \frac{4+\sqrt{8}}{\sqrt{3+\sqrt{8}}} \tan^{-1}\left(\sqrt{3-\sqrt{8}}\,\right)
- \pi \end{align}

This result can be further simplified by use of the identities

 3 \pm \sqrt{8}= \left( \sqrt{2} \pm 1 \right)^2 \quad 
\tan^{-1}  \left( \sqrt{2} \pm 1 \right) = 
\left( \frac{1}{4} \pm \frac{1}{8} \right)\pi

ultimately leading to

I=(\sqrt{2}-1)\pi \,

In principle, this approach will work with any integrand which is the square root of a quadratic multiplied by the ratio of two polynomials. However, it should not be applied automatically.

E.g., in this last example, once we deduced

I = \int_{-\pi/2}^{\pi/2} \frac{ \cos^2 \theta}{1+ \sin^2 \theta} \, d\theta

we could have used the double angle formula, since this contains only even powers of cos and sin. Doing that gives

I = \int_{-\pi/2}^{\pi/2} \frac{1+\cos 2\theta}{3 - \cos 2\theta} d\theta
= \frac{1}{2} \int_{-\pi}^\pi  \frac{1+\cos \phi}{3 - \cos \phi} \, d\phi

Using tan-half-angle on this new, simpler, integrand gives

\begin{align}
I & = \int_{-\infty}^\infty \frac{1}{1+2t^2} \frac{dt}{1+t^2} \\
&= \int_{-\infty}^\infty \frac{2 \, dt}{1+2t^2} 
- \int_{-\infty}^\infty \frac{dt}{1+t^2} 
\end{align}

This can be integrated on sight to give

I = \frac{4}{\sqrt{2}}\frac{\pi}{2}-2\frac{\pi}{2} = (\sqrt{2}-1) \pi

This is the same result as before, but obtained with less algebra, which shows why it is best to look for the most straightforward methods at every stage.

A more direct way of evaluating the integral I is to substitute t = tan θ right from the start, which will directly bring us to the line

I = \int_{-\infty}^\infty \frac{1}{1+2t^2} \frac{dt}{1+t^2}

above. More generally, the substitution t = tan x gives us

\sin x =\frac{t}{\sqrt{1+t^2}} \quad \cos x = \frac{1}{\sqrt{1+t^2}} \quad
dx=\frac{dt}{1+t^2}

so this substitution is the preferable one to use if the integrand is such that all the square roots would disappear after substitution, as is the case in the above integral.

Example

Using the trigonometric substitution t=a\tan x, then dt=a\sec^2 x\, dx and \sqrt{t^2+a^2} = a\sec x (when -\pi/2 < x < \pi/2). So,


\begin{align}
\int \frac{1}{\left(t^2+a^2\right)^\frac{3}{2}}\,dt & = \int \frac{1}{a^3 \sec^3 x}\, a\sec^2 x\,dx \\[8pt]
& = \frac{1}{a^2} \int \cos x\,dx \\[8pt]
& = \frac{1}{a^2} \sin x + C \\[8pt]
& = \frac{1}{a^2} \frac{a\tan x}{a\sec x} + C \\[8pt]
& = \frac{t}{a^2 \sqrt{t^2+a^2}} + C
\end{align}
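The final formula can be sanity-checked numerically; below we pick a = 1.5 (an arbitrary sample value) and compare a Simpson's-rule estimate of the definite integral on [0, 2] with the antiderivative t/(a²√(t²+a²)) at the endpoints.

```python
import math

def simpson(f, a, b, n=4000):
    # Composite Simpson's rule; n must be even.
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3

a = 1.5
f = lambda t: (t * t + a * a) ** (-1.5)
F = lambda t: t / (a * a * math.sqrt(t * t + a * a))   # antiderivative from the tan substitution

numeric = simpson(f, 0.0, 2.0)
exact = F(2.0) - F(0.0)
print(numeric, exact)
```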

Alternate Method

In general, to evaluate integrals of the form

\int \frac{A + B \cos x + C \sin x}{a + b \cos x + c \sin x} \, dx,

it is extremely tedious to use the aforementioned "tan half angle" substitution directly, as one easily ends up with a rational function with a 4th degree denominator. Instead, we may first write the numerator as

A + B \cos x + C \sin x \equiv p (a + b \cos x + c \sin x) + q \frac{d}{dx} (a + b \cos x + c \sin x) + r.

Then the integral can be written as

\int \left(p+\frac{q \frac{d}{dx} (a + b \cos x + c \sin x)}{a + b \cos x + c \sin x} + \frac{r}{a + b \cos x + c \sin x} \right) \, dx

which can be evaluated much more easily.

Example

Evaluate \int \frac{\cos x + 2}{\cos x + \sin x} \, dx.

Let

\cos x + 2 \equiv p (\cos x + \sin x) + q \frac{d}{dx} (\cos x + \sin x) + r.

Then

\cos x + 2 \equiv p (\cos x + \sin x) + q (-\sin x + \cos x) + r
\cos x + 2 \equiv (p+q) \cos x + (p-q) \sin x + r.

Comparing coefficients of cos x, sin x and the constants on both sides, we obtain

 \begin{cases} p+q & = 1 \\ p-q & = 0 \\ r & = 2 \end{cases}

yielding p = q = 1/2, r = 2. Substituting back into the integrand,

\int \frac{\cos x + 2}{\cos x + \sin x} \, dx = \int \frac{1}{2} \, dx + \frac{1}{2} \int \frac{d(\cos x + \sin x)}{\cos x + \sin x} + \int \frac{2}{\cos x + \sin x} \, dx.

The last integral can now be evaluated using the "tan half angle" substitution described above, and we obtain

\int \frac{2}{\cos x + \sin x} \, dx = \sqrt{2} \ln \left| \frac{\tan \frac{x}{2} - 1 + \sqrt{2}}{\tan \frac{x}{2} - 1 - \sqrt{2}} \right| + C.

The original integral is thus

\int \frac{\cos x + 2}{\cos x + \sin x} \, dx = \frac{x}{2} + \frac{1}{2} \ln |\cos x + \sin x| + \sqrt{2} \ln \left| \frac{\tan \frac{x}{2} - 1 + \sqrt{2}}{\tan \frac{x}{2} - 1 - \sqrt{2}} \right| + C.
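A numerical spot check (the sample point and finite-difference step are our own choices): differentiating the antiderivative just obtained should recover the integrand.

```python
import math

# Differentiate F(x) = x/2 + (1/2)ln|cos x + sin x| + √2 ln|...| numerically
# and compare against (cos x + 2)/(cos x + sin x) at an arbitrary point.
def F(x):
    t = math.tan(x / 2)
    return (x / 2
            + 0.5 * math.log(abs(math.cos(x) + math.sin(x)))
            + math.sqrt(2) * math.log(abs((t - 1 + math.sqrt(2))
                                          / (t - 1 - math.sqrt(2)))))

integrand = lambda x: (math.cos(x) + 2) / (math.cos(x) + math.sin(x))

x, h = 0.3, 1e-6
dF = (F(x + h) - F(x - h)) / (2 * h)  # central difference
print(dF, integrand(x))               # the two values agree
```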

<h1> 4.12 Reduction Formula</h1>


A reduction formula expresses a given integral in terms of a similar but simpler integral of the same form. Applying it repeatedly reduces the original problem, step by step, to an integral we can evaluate directly.

For example, if we let

I_n = \int x^n e^x\,dx

Integration by parts allows us to simplify this to

I_n = x^ne^x - n\int x^{n-1}e^x\,dx = x^ne^x - nI_{n-1}

which is our desired reduction formula. Note that we stop at

I_0 = \int e^x\,dx = e^x + C.
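A short sketch of how such a reduction formula is used in practice (the interval [0,1] and the quadrature comparison are our own choices): over [0,1] the boundary term x^n e^x contributes e-0, so I_n = e - nI_{n-1} with I_0 = e - 1.

```python
import math

# Evaluate I_n = ∫_0^1 x^n e^x dx by the reduction I_n = e - n·I_{n-1},
# starting from I_0 = e - 1, then compare with a direct midpoint quadrature.
def I_reduction(n):
    I = math.e - 1           # I_0 = ∫_0^1 e^x dx
    for k in range(1, n + 1):
        I = math.e - k * I   # boundary term [x^k e^x]_0^1 = e
    return I

def midpoint(f, a, b, m=200000):
    h = (b - a) / m
    return h * sum(f(a + (i + 0.5) * h) for i in range(m))

n = 4
print(I_reduction(n), midpoint(lambda x: x**n * math.exp(x), 0, 1))
```

(Iterating the recursion forward is fine for small n; for large n the subtraction amplifies rounding error, which is a known numerical caveat.)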

Similarly, if we let

I_n = \int \sec^n \theta \, d\theta

then integration by parts lets us simplify this to

I_n = \sec^{n-2}\theta \tan \theta - 
(n-2)\int \sec^{n-2} \theta \tan^2 \theta \, d\theta

Using the trigonometric identity \tan^2\theta=\sec^2\theta-1, we can now write

\begin{matrix}
I_n & = & \sec^{n-2}\theta \tan \theta & 
+ (n-2) \left( \int \sec^{n-2} \theta \, d\theta
- \int \sec^n \theta \, d\theta \right) \\
& = & \sec^{n-2}\theta \tan \theta & + (n-2) \left( I_{n-2}  - I_n \right) \\
\end{matrix}

Rearranging, we get

I_n=\frac{1}{n-1}\sec^{n-2}\theta \tan \theta + \frac{n-2}{n-1} I_{n-2}

Note that we stop at n=1 (if n is odd) or n=2 (if n is even).

As in these two examples, integrating by parts when the integrand contains a power often results in a reduction formula.
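The secant reduction can be sketched in code as well (the interval [0, \pi/4] and the quadrature comparison are our own choices; on this interval the boundary term \sec^{n-2}\theta\tan\theta evaluates to (\sqrt{2})^{n-2}):

```python
import math

# Apply I_n = sec^{n-2}θ tanθ/(n-1) + (n-2)/(n-1)·I_{n-2} on [0, π/4]
# and compare with direct midpoint quadrature of sec^n θ.
def I_sec(n):
    if n == 0:
        return math.pi / 4
    if n == 1:
        return math.log(math.sqrt(2) + 1)   # ∫ secθ dθ = ln|secθ + tanθ|
    boundary = math.sqrt(2) ** (n - 2)      # sec^{n-2}(π/4)·tan(π/4)
    return boundary / (n - 1) + (n - 2) / (n - 1) * I_sec(n - 2)

def midpoint(f, a, b, m=200000):
    h = (b - a) / m
    return h * sum(f(a + (i + 0.5) * h) for i in range(m))

print(I_sec(5), midpoint(lambda t: (1 / math.cos(t))**5, 0, math.pi / 4))
```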


<h1> 4.13 Irrational Functions</h1>


Integration of irrational functions is more difficult than integration of rational functions, and many such integrals cannot be evaluated in elementary terms. However, there are some particular types that can be reduced to rational forms by suitable substitutions.

Type 1

Integrand contains \sqrt[n]{\frac{ax+b}{cx+d}}

Use the substitution u=\sqrt[n]{\frac{ax+b}{cx+d}}.

Example

Find \int \frac{1}{x} \sqrt{\frac{1-x}{x}} \, dx.

A second integral of this type (the special case c=0, d=1) is \int \frac {x}{\sqrt[3]{ax+b}}\,dx, handled by the substitution u=\sqrt[3]{ax+b}.

Type 2

Integral is of the form \int \frac{Px+Q}{\sqrt{ax^2+bx+c}} \, dx

Write Px+Q as Px+Q = p \frac{d}{dx} (ax^2+bx+c) + q.

Example

Find \int \frac{4x-1}{\sqrt{5-4x-x^2}} \, dx.

Type 3

Integrand contains \sqrt{a^2-x^2}, \sqrt{a^2+x^2} or \sqrt{x^2-a^2}

This was discussed in "Trigonometric Substitutions" above. Here is a summary:

  1. For \sqrt{a^2-x^2}, use x = a \sin \theta.
  2. For \sqrt{a^2+x^2}, use x = a \tan \theta.
  3. For \sqrt{x^2-a^2}, use x = a \sec \theta.

Type 4

Integral is of the form \int \frac{1}{(px+q) \sqrt{ax^2+bx+c}} \, dx

Use the substitution u=\frac{1}{px+q}.

Example

Find \int \frac{1}{(1+x)\sqrt{3+6x+x^2}} \, dx.

Type 5

Other rational expressions with the irrational function \sqrt{ax^2+bx+c}

  1. If a>0, we can use u=\sqrt{ax^2+bx+c} \pm \sqrt{a}x.
  2. If c>0, we can use u=\frac{\sqrt{ax^2+bx+c} \pm \sqrt{c}}{x}.
  3. If ax^2+bx+c can be factored as a(x-\alpha)(x-\beta), we can use u=\sqrt{\frac{a(x-\alpha)}{x-\beta}}.
  4. If a<0 and ax^2+bx+c can be factored as -a(\alpha-x)(x-\beta), we can use x=\alpha\cos^2\theta+\beta\sin^2\theta

<h1> 4.14 Numerical Approximations</h1>


It is often the case, when evaluating definite integrals, that an antiderivative for the integrand cannot be found, or is extremely difficult to find. In some instances, a numerical approximation to the value of the definite integral will suffice. The following techniques can be used, and are listed in rough order of ascending complexity.

Riemann Sum

This comes from the definition of an integral. Taking a finite number n of subintervals gives the approximation:

\int_a^b f(x)\ dx\approx \sum_{i=1}^nf(x_i^*)\Delta x

where x_i^* is any point in the i-th sub-interval [x_{i-1},x_i] on [a,b].

Right Rectangle

A special case of the Riemann sum, where we let x_i^*=x_i, i.e. the right endpoint of each sub-interval of [a,b]. Again, for finite n we have:

\int_a^b f(x)\ dx\approx \sum_{i=1}^nf(x_i)\Delta x

Left Rectangle

Another special case of the Riemann sum; this time we let x_i^*=x_{i-1}, the left endpoint of each sub-interval of [a,b]. As always, this is an approximation for finite n. Thus, we have:

\int_a^b f(x)\ dx\approx \sum_{i=1}^nf(x_{i-1})\Delta x

Trapezoidal Rule

\int_a^b f(x)\ dx \approx \frac{b-a}{2n}\left[f(x_0)+2\sum_{i=1}^{n-1}(f(x_i))+f(x_n)\right]=\frac{b-a}{2n}(f(x_0) + 2f(x_1) + 2f(x_2) +\cdots+ 2f(x_{n-1}) + f(x_n))

Simpson's Rule

Each of the n subintervals is sampled at its midpoint as well as its endpoints, so no parity restriction on n is needed in this form (in the common formulation that uses only integer sample points, the number of subintervals must be even):

\int_a^b f(x)\ dx  \approx \frac{b-a}{6n}\left[f(x_0)+\sum_{i=1}^{2n-1}\left((3-(-1)^{i})f(x_{i/2})\right)+f(x_n)\right]
=\frac{b-a}{6n}\left[f(x_0)+4f(x_{1/2})+2f(x_1)+4f(x_{3/2})+\cdots+4f(x_{n-1/2})+f(x_n)\right]
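These rules can be sketched in a few lines of Python. The test integral \int_0^\pi \sin x\,dx = 2 and the choice n=100 are our own; the Simpson implementation below is the standard even-n composite form (equivalent to the midpoint-based form above with the subintervals halved).

```python
import math

# Minimal left/right Riemann, trapezoidal, and Simpson rules,
# tested on ∫_0^π sin x dx = 2.
def riemann(f, a, b, n, left=True):
    h = (b - a) / n
    xs = range(n) if left else range(1, n + 1)
    return h * sum(f(a + i * h) for i in xs)

def trapezoid(f, a, b, n):
    h = (b - a) / n
    return h / 2 * (f(a) + 2 * sum(f(a + i * h) for i in range(1, n)) + f(b))

def simpson(f, a, b, n):          # n must be even in this form
    h = (b - a) / n
    s = f(a) + f(b)
    s += sum((4 if i % 2 else 2) * f(a + i * h) for i in range(1, n))
    return h / 3 * s

f, a, b = math.sin, 0.0, math.pi
print(riemann(f, a, b, 100), trapezoid(f, a, b, 100), simpson(f, a, b, 100))
```

Running this illustrates the "ascending complexity" ordering: for the same n, Simpson's rule is dramatically more accurate than the trapezoidal rule, which in turn beats the plain rectangle sums on most integrands.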


<h1> 4.15 Integration Exercises</h1>


Integration of Polynomials

Evaluate the following:

1. \int (x^2-2)^{2}\, dx

\frac{x^{5}}{5}-\frac{4x^{3}}{3}+4x+C

2. \int 8x^3\, dx

2x^{4}+C

3. \int (4x^2+11x^3)\, dx

\frac{4x^{3}}{3}+\frac{11x^{4}}{4}+C

4. \int (31x^{32}+4x^3-9x^4) \,dx

\frac{31x^{33}}{33}+x^{4}-\frac{9x^{5}}{5}+C

5. \int 5x^{-2}\, dx

-\frac{5}{x}+C

Solutions

Indefinite Integration

Find the general antiderivative of the following:

6. \int (\cos x+\sin x)\, dx

\sin x-\cos x+C

7. \int 3\sin x\, dx

-3\cos(x)+C

8. \int (1+\tan^2 x)\, dx

\tan x+C

9. \int (3x-\sec^2 x)\, dx

\frac{3x^{2}}{2}-\tan x+C

10. \int -e^x\, dx

-e^{x}+C

11. \int 8e^x\, dx

8e^{x}+C

12. \int \frac1{7x}\, dx

\frac{1}{7}\ln|x|+C

13. \int \frac1{x^2+a^2}\, dx

\frac{1}{a}\arctan\frac{x}{a}+C

Solutions

Integration by parts

14. Consider the integral \int \sin(x) \cos(x)\,dx. Find the integral in two different ways. (a) Integrate by parts with u=\sin(x) and  v' =\cos(x). (b) Integrate by parts with u=\cos(x) and  v' =\sin(x). Compare your answers. Are they the same?

a. \frac{\sin^{2}x}{2}+C
b. -\frac{\cos^{2}x}{2}+C

The two answers differ only by the constant \tfrac{1}{2}, since \sin^2 x+\cos^2 x=1, so both are valid antiderivatives.

Solutions


Applications of Integration

Area


Introduction

Finding the area between two curves, usually given by two explicit functions, is often useful in calculus.

In general the rule for finding the area between two curves is

 A = A_{\mathrm{top}} - A_{\mathrm{bottom}}

or, if f(x) is the upper function and g(x) is the lower function,

 A = \int_a^b[f(x)-g(x)]\, dx

This is true whether the functions are in the first quadrant or not.

Area between two curves

Suppose we are given two functions y1=f(x) and y2=g(x) and we want to find the area between them on the interval [a,b]. Also assume that f(x)≥ g(x) for all x on the interval [a,b]. Begin by partitioning the interval [a,b] into n equal subintervals each having a length of Δx=(b-a)/n. Next choose any point in each subinterval, xi*. Now we can 'create' rectangles on each interval. At the point xi*, the height of each rectangle is f(xi*)-g(xi*) and the width is Δx. Thus the area of each rectangle is [f(xi*)-g(xi*)]Δx. An approximation of the area, A, between the two curves is

 A \approx \sum_{i=1}^{n} [f(x_{i}^{*})-g(x_{i}^{*})]\Delta x.

Now we take the limit as n approaches infinity and get

A = \lim_{n \to \infty} \sum_{i=1}^{n} [f(x_{i}^{*})-g(x_{i}^{*})]\Delta x

which gives the exact area. Recalling the definition of the definite integral we notice that

A = \int_a^b[f(x)-g(x)]\,dx.
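The formula is easy to check numerically. Here is a minimal sketch using the midpoint rule with f(x)=x and g(x)=x^2 on [0,1] — our own example, where the exact area is \int_0^1 (x-x^2)\,dx = 1/6.

```python
# Area between two curves, computed as a midpoint-rule Riemann sum of f - g.
def area_between(f, g, a, b, n=100000):
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) - g(a + (i + 0.5) * h)
                   for i in range(n))

A = area_between(lambda x: x, lambda x: x**2, 0.0, 1.0)
print(A)  # ≈ 1/6
```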

This formula of finding the area between two curves is sometimes known as applying integration with respect to the x-axis, since the rectangles used to approximate the area have their bases lying parallel to the x-axis. It will be most useful when the two functions are of the form y1=f(x) and y2=g(x).

Sometimes, however, one may find it simpler to integrate with respect to the y-axis. This occurs when integrating with respect to the x-axis would result in more than one integral to be evaluated. These functions take the form x1=f(y) and x2=g(y) on the interval [c,d]. Note that [c,d] are values of y. The derivation of this case is completely identical.

Similar to before, we will assume that f(y)≥ g(y) for all y on [c,d]. Now, as before, we can divide the interval into n subintervals and create rectangles to approximate the area between f(y) and g(y). It may be useful to picture each rectangle with its 'width', Δy, parallel to the y-axis and its 'height', f(yi*)-g(yi*) at the point yi*, parallel to the x-axis. Following from the work above we may reason that an approximation of the area, A, between the two curves is

 A \approx \sum_{i=1}^{n} [f(y_{i}^{*})-g(y_{i}^{*})]\Delta y.

As before, we take the limit as n approaches infinity to arrive at

A = \lim_{n \to \infty} \sum_{i=1}^{n} [f(y_{i}^{*})-g(y_{i}^{*})]\Delta y,

which is nothing more than a definite integral, so

A = \int_c^d[f(y)-g(y)]\,dy.

Regardless of the form of the functions, we basically use the same formula.


Volume


When we think about volume from an intuitive point of view, we typically think of it as the amount of "space" an item occupies. Unfortunately, assigning a number that measures this amount of space can prove difficult for all but the simplest geometric shapes. Calculus provides a new tool that can greatly extend our ability to calculate volume.

In order to understand the ideas involved it helps to think about the volume of a cylinder. The volume of a cylinder is calculated using the formula V=\pi r^2 h. The base of the cylinder is a circle whose area is given by A=\pi r^2. Notice that the volume of a cylinder is derived by taking the area of its base and multiplying by the height h.

For more complicated shapes, we could think of approximating the volume by taking the area of some cross section at some height x and multiplying by some small change in height \Delta x, then adding up the heights of all of these approximations from the bottom to the top of the object. This would appear to be a Riemann sum. Keeping this in mind, we can develop a more general formula for the volume of solids in \mathbb{R}^3 (3 dimensional space).

Formal Definition

Formally, the ideas above suggest that we can calculate the volume of a solid by integrating the cross-sectional area along some dimension. In the above example of a cylinder, every cross section was the same circle, so the cross-sectional area is a constant function, and the dimension of integration was vertical (although it could have been any one we desired). Generally, if S is a solid that lies in \mathbb{R}^3 between x=a and x=b, let A(x) denote the area of a cross section taken in the plane perpendicular to the x direction, and passing through the point x. If the function A(x) is continuous on [a,b], then the volume V_S of the solid S is given by:

V_S = \int_a^b A(x)\,dx.

Examples

Example 1: A right cylinder

Figure 1

Now we will calculate the volume of a right cylinder using our new ideas about how to calculate volume. Since we already know the formula for the volume of a cylinder this will give us a "sanity check" that our formulas make sense.

First, we choose a dimension along which to integrate. In this case, it will greatly simplify the calculations to integrate along the height of the cylinder, so this is the direction we will choose. Thus we will call the vertical direction x (see Figure 1). Now we find the function, A(x), which will describe the cross-sectional area of our cylinder at a height of x. The cross-sectional area of a cylinder is simply a circle. Now simply recall that the area of a circle is \pi r^2, and so A(x)=\pi r^2.

Before performing the computation, we must choose our bounds of integration. In this case, we simply define x=0 to be the base of the cylinder, and so we will integrate from x=0 to x=h, where h is the height of the cylinder. Finally, we integrate:

V_{\mathrm{cylinder}} = \int_a^b A(x)\,dx
=\int_0^h \pi r^2\,dx
=\pi r^2\int_0^h\,dx
=\left.\pi r^2x\right|_{x=0}^h
=\pi r^2(h-0)
=\pi r^2h.

This is exactly the familiar formula for the volume of a cylinder.

Example 2: A right circular cone

Figure 2: The cross-section of a right circular cone by a plane perpendicular to the axis of the cone is a circle.

For our next example we will look at an example where the cross sectional area is not constant. Consider a right circular cone. Once again the cross sections are simply circles. But now the radius varies from the base of the cone to the tip. Once again we choose x to be the vertical direction, with the base at x=0 and the tip at x=h, and we will let R denote the radius of the base. While we know the cross sections are just circles we cannot calculate the area of the cross sections unless we find some way to determine the radius of the circle at height x.

Figure 3: Cross-section of the right circular cone by a plane perpendicular to the base and passing through the tip.

Luckily in this case it is possible to use some of what we know from geometry. We can imagine cutting the cone perpendicular to the base through some diameter of the circle all the way to the tip of the cone. If we then look at the flat side we just created, we will see simply a triangle, whose geometry we understand well. The right triangle from the tip to the base at height x is similar to the right triangle from the tip to the base at height h. This tells us that \frac{r}{h-x}=\frac{R}{h}. So we see that the radius of the circle at height x is r(x)=\frac{R}{h}(h-x). Now using the familiar formula for the area of a circle we see that A(x)=\pi\frac{R^2}{h^2}(h-x)^2.

Now we are ready to integrate.

V_{\mathrm{cone}} = \int_a^b A(x)\,dx
=\int_0^h \pi \frac{R^2}{h^2}(h-x)^2\,dx
=\pi \frac{R^2}{h^2}\int_0^h(h-x)^2\,dx
By u-substitution we may let u=h-x, then du=-dx and our integral becomes
=\pi \frac{R^2}{h^2}\left(-\int_h^0 u^2\,du\right)
=\pi \frac{R^2}{h^2}\left(\left.-\frac{u^3}{3}\right|_h^0\right)
=\pi \frac{R^2}{h^2}(-0+\frac{h^3}{3})
=\frac{1}{3}\pi R^2h.
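The cone computation can be confirmed numerically by integrating the cross-sectional area A(x) directly; the sample values of R and h and the midpoint rule are our own choices.

```python
import math

# Integrate A(x) = π(R/h)²(h-x)² over [0, h] and compare with πR²h/3.
R, h = 2.0, 5.0
A = lambda x: math.pi * (R / h)**2 * (h - x)**2

n = 100000
dx = h / n
V = dx * sum(A((i + 0.5) * dx) for i in range(n))   # midpoint rule
print(V, math.pi * R**2 * h / 3)                    # the two values agree
```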

Example 3: A sphere

Figure 4: Determining the radius of the cross-section of the sphere at a distance |x| from the sphere's center.

In a similar fashion, we can use our definition to prove the well known formula for the volume of a sphere. First, we must find our cross-sectional area function, A(x). Consider a sphere of radius R which is centered at the origin in \mathbb{R}^3. If we again integrate vertically then x will vary from -R to R.

In order to find the area of a particular cross section it helps to draw a right triangle whose vertices lie at the center of the sphere, the center of the circular cross section, and at a point along the circumference of the cross section. As shown in the diagram the side lengths of this triangle will be R, |x|, and r, where r is the radius of the circular cross section. Then by the Pythagorean theorem r=\sqrt{R^2-|x|^2}, so A(x)=\pi(R^2-|x|^2). It is slightly helpful to notice that |x|^2=x^2, so we do not need to keep the absolute value.

So we have that

V_{\mathrm{sphere}} = \int_a^b A(x)\,dx
=\int_{-R}^R \pi (R^2-x^2)\,dx
=\pi \int_{-R}^R R^2\,dx-\pi\int_{-R}^R x^2\,dx
=\left.\pi R^2x\right|_{-R}^R-\left.\pi \frac{x^3}{3}\right|_{-R}^R
=\pi R^2(R-(-R))-\pi\left(\frac{R^3}{3}-\frac{(-R)^3}{3}\right)
=2\pi R^3-\frac{2}{3}\pi R^3=\frac{4}{3}\pi R^3.

Extension to Non-trivial Solids

Now that we have shown our definition agrees with our prior knowledge, we will see how it can help us extend our horizons to solids whose volumes are not possible to calculate using elementary geometry.


Volume of solids of revolution


In this section we cover solids of revolution and how to calculate their volume. A solid of revolution is a solid formed by revolving a 2-dimensional region around an axis. For example, revolving the semi-circular region bounded by the curve y=\sqrt{1-x^2} and the line y=0 around the x-axis produces a sphere. There are two main methods of calculating the volume of a solid of revolution using calculus: the disk method and the shell method.

Disk Method

Figure 1: A solid of revolution is generated by revolving this region around the x-axis.
Figure 2: Approximation to the generating region in Figure 1.

Consider the solid formed by revolving the region bounded by the curve y=f(x), which is continuous on [a,b], and the lines x=a, x=b and y=0 around the x-axis. We could imagine approximating the volume by approximating f(x) with the stepwise function g(x) shown in figure 2, which uses a right-handed approximation to the function. Now when the region is revolved, the region under each step sweeps out a cylinder, whose volume we know how to calculate, i.e.

V_{cylinder}=\pi r^2 h

where r is the radius of the cylinder and h is the cylinder's height. This process is reminiscent of the Riemann sums we used to calculate areas earlier. Let's try to write the volume as a Riemann sum, and from that obtain the volume as an integral by taking the limit as the subdivisions get infinitely small.

Consider the volume of one of the cylinders in the approximation, say the k-th one from the left. The cylinder's radius is the height of the step function, and the thickness is the length of the subdivision. With n subdivisions and a length of b-a for the total length of the region, each subdivision has width

\Delta x=\frac{b-a}{n}

Since we are using a right-handed approximation, the k-th sample point will be

x_k=a+k\Delta x

So the volume of the k-th cylinder is

V_k=\pi f(x_k)^2\Delta x

Summing all of the cylinders in the region from a to b, we have

V_{approx}=\sum_{k=1}^{n}\pi f(x_k)^2\Delta x

Taking the limit as n approaches infinity gives us the exact volume

V=\lim_{n\to\infty}\sum_{k=1}^{n}\pi f(x_k)^2\Delta x

which is equivalent to the integral

V=\int_a^b \pi f(x)^2 dx
Example: Volume of a Sphere

Let's calculate the volume of a sphere using the disk method. Our generating region will be the region bounded by the curve f(x)=\sqrt{r^2-x^2} and the line y=0. Our limits of integration will be the x-values where the curve intersects the line y=0, namely, x=\pm r. We have

\begin{align}V_{sphere}&=\int_{-r}^r \pi(r^2-x^2)dx\\
&=\pi(\int_{-r}^r r^2 dx-\int_{-r}^r x^2 dx)\\
&=\pi(r^2 x\bigr|_{-r}^r - \frac{x^3}{3}\biggr|_{-r}^r)\\
&=\pi\left(r^2 (r-(-r)) - \left(\frac{r^3}{3}-\frac{(-r)^3}{3}\right)\right)\\
&=\pi(2r^3-\frac{2r^3}{3})\\
&=\pi\frac{6r^3-2r^3}{3}\\
&=\frac{4\pi r^3}{3}
\end{align}

Exercises

1. Calculate the volume of the cone with radius R and height h which is generated by the revolution of the region bounded by y=R-\frac{R}{h}x and the lines y=0 and x=0 around the x-axis.

\frac{\pi R^2 h}{3}

2. Calculate the volume of the solid of revolution generated by revolving the region bounded by the curve y=x^2 and the lines x=1 and y=0 around the x-axis.

\frac{\pi}{5}

Solutions

Washer Method

Figure 3: A solid of revolution containing an irregularly shaped hole through its center is generated by revolving this region around the x-axis.
Figure 4: Approximation to the generating region in Figure 3.

The washer method is an extension of the disk method to solids of revolution formed by revolving an area bounded between two curves around the x-axis. Consider the solid of revolution formed by revolving the region in figure 3 around the x-axis. The curve f(x) is the same as that in figure 1, but now our solid has an irregularly shaped hole through its center whose volume is that of the solid formed by revolving the curve g(x) around the x-axis. Our approximating region has the same upper boundary, f_{step}(x) as in figure 2, but now we extend only down to g_{step}(x) rather than all the way down to the x-axis. Revolving each block around the x-axis forms a washer-shaped solid with outer radius f_{step}(x) and inner radius g_{step}(x). The volume of the k-th hollow cylinder is

\begin{align}V_k&=\pi f(x_k)^2\Delta x-\pi g(x_k)^2\Delta x\\
&=\pi(f(x_k)^2-g(x_k)^2)\Delta x\end{align}

where \Delta x=\frac{b-a}{n} and x_k=a+k\Delta x. The volume of the entire approximating solid is

V_{approx}=\sum_{k=1}^n \pi(f(x_k)^2-g(x_k)^2)\Delta x

Taking the limit as n approaches infinity gives the volume

\begin{align}V&=\lim_{n\to\infty}\sum_{k=1}^n \pi(f(x_k)^2-g(x_k)^2)\Delta x\\
&=\int_a^b \pi(f(x)^2-g(x)^2)dx\end{align}
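A hedged numerical sketch of the washer formula, using our own example region between f(x)=x and g(x)=x^2 on [0,1] revolved about the x-axis, where the exact volume is \pi\int_0^1 (x^2-x^4)\,dx = 2\pi/15:

```python
import math

# Washer method: sum π(f(x)² - g(x)²)Δx over midpoints of [0, 1].
f = lambda x: x
g = lambda x: x**2

n = 100000
dx = 1.0 / n
V = sum(math.pi * (f(x)**2 - g(x)**2) * dx
        for x in ((i + 0.5) * dx for i in range(n)))
print(V, 2 * math.pi / 15)  # the two values agree
```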

Exercises

3. Use the washer method to find the volume of a cone containing a central hole formed by revolving the region bounded by y=R-\frac{R}{h}x and the lines y=r and x=0 around the x-axis.

\pi h\left(\frac{R^2}{3}-r^2\right)

4. Calculate the volume of the solid of revolution generated by revolving the region bounded by the curves y=x^2 and y=x^3 and the lines x=1 and y=0 around the x-axis.

\frac{2\pi}{35}

Solutions

Shell Method

Figure 5: A solid of revolution is generated by revolving this region around the y-axis.
Figure 6: Approximation to the generating region in Figure 5.

The shell method is another technique for finding the volume of a solid of revolution. Using this method sometimes makes it easier to set up and evaluate the integral. Consider the solid of revolution formed by revolving the region in figure 5 around the y-axis. While the generating region is the same as in figure 1, the axis of revolution has changed, making the disk method impractical for this problem. However, dividing the region up as we did previously suggests a similar method of finding the volume, only this time instead of adding up the volume of many approximating disks, we will add up the volume of many cylindrical shells. Consider the solid formed by revolving the region in figure 6 around the y-axis. The k-th rectangle sweeps out a hollow cylinder with height \left|f(x_k)\right| and with inner radius x_k and outer radius x_k+\Delta x, where \Delta x=\frac{b-a}{n} and x_k=a+(k-1)\Delta x, the volume of which is

\begin{align}V_k&=\pi((x_k+\Delta x)^2-x_k^2)|f(x_k)|\\
&=\pi((x_k^2+2x_k\Delta x+\Delta x^2)-x_k^2)|f(x_k)|\\
&=\pi(2x_k\Delta x+\Delta x^2)|f(x_k)|\end{align}

The volume of the entire approximating solid is

V_{approx}=\sum_{k=1}^n \pi(2x_k\Delta x+\Delta x^2)|f(x_k)|

Taking the limit as n approaches infinity gives us the exact volume

\begin{align}V&=\lim_{n\to\infty}\sum_{k=1}^n \pi(2x_k\Delta x+\Delta x^2)|f(x_k)|\\
&=\pi\lim_{n\to\infty}\left(\sum_{k=1}^n 2x_k\Delta x|f(x_k)|+\sum_{k=1}^n \Delta x^2|f(x_k)|\right)\end{align}

Since |f| is continuous on [a,b], the Extreme Value Theorem implies that |f| has some maximum, M, on [a,b]. Using this and the fact that \Delta x^2|f(x_k)|\ge 0, we have

\sum_{k=1}^n 2x_k\Delta x|f(x_k)| \leq \sum_{k=1}^n 2x_k\Delta x|f(x_k)|+\sum_{k=1}^n \Delta x^2|f(x_k)| \leq \sum_{k=1}^n 2x_k\Delta x|f(x_k)|+\sum_{k=1}^n \Delta x^2 M

But

\begin{align}\lim_{n\to\infty}\sum_{k=1}^n \Delta x^2 M &= \lim_{n\to\infty}\sum_{k=1}^n \left(\frac{b-a}{n}\right)^2 M\\
&=\lim_{n\to\infty} \frac{(b-a)^2}{n} M\\
&=0\end{align}

So by the Squeeze Theorem

\pi\lim_{n\to\infty}\left(\sum_{k=1}^n 2x_k\Delta x|f(x_k)|+\sum_{k=1}^n \Delta x^2|f(x_k)|\right) = \pi\lim_{n\to\infty}\sum_{k=1}^n 2x_k\Delta x|f(x_k)|

which is just the integral

\int_a^b 2\pi x|f(x)|dx
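The shell integral is equally easy to check numerically. Here we revolve the region under y=x^2 on [0,1] about the y-axis (the same setup as exercise 6 below, whose exact value is \pi/2):

```python
import math

# Shell method: sum 2πx·|f(x)|·Δx over midpoints of [0, 1].
f = lambda x: x**2

n = 100000
dx = 1.0 / n
V = sum(2 * math.pi * x * abs(f(x)) * dx
        for x in ((i + 0.5) * dx for i in range(n)))
print(V, math.pi / 2)  # the two values agree
```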

Exercises

5. Find the volume of a cone with radius R and height h by using the shell method on the appropriate region which, when rotated around the y-axis, produces a cone with the given characteristics.

\frac{\pi R^2 h}{3}

6. Calculate the volume of the solid of revolution generated by revolving the region bounded by the curve y=x^2 and the lines x=1 and y=0 around the y-axis.

\frac{\pi}{2}

Solutions


Arc length


Suppose that we are given a function f that is continuous on an interval [a,b] and we want to calculate the length of the curve drawn out by the graph of f(x) from x=a to x=b. If the graph were a straight line this would be easy — the formula for the length of the line is given by Pythagoras' theorem. And if the graph were a piecewise linear function we can calculate the length by adding up the length of each piece.

The problem is that most graphs are not linear. Nevertheless we can estimate the length of the curve by approximating it with straight lines. Suppose the curve C is given by the formula y=f(x) for a\le x\le b. We divide the interval [a,b] into n subintervals with equal width \Delta x and endpoints x_0,x_1,\ldots,x_n. Now let y_i=f(x_i) so P_i=(x_i,y_i) is the point on the curve above x_i. The length of the straight line between P_i and P_{i+1} is

 |P_iP_{i+1}| = \sqrt{ (y_{i+1} - y_{i})^2 + (x_{i+1}-{x_i})^2}

So an estimate of the length of the curve C is the sum

 \sum_{i=0}^{n-1} |P_iP_{i+1}|

As we divide the interval [a,b] into more pieces this gives a better estimate for the length of C. In fact we make that a definition.

Length of a Curve

The length of the curve y=f(x) for a\le x\le b is defined to be

L = \lim_{n\to \infty} \sum_{i=0}^{n-1} |P_iP_{i+1}|

The Arclength Formula

Suppose that f' is continuous on [a,b]. Then the length of the curve given by y=f(x) between a and b is given by

 L = \int_a^b \sqrt{ 1+ (f'(x))^2 } dx

And in Leibniz notation

 L = \int_a^b \sqrt{ 1+ \left(\frac{dy}{dx}\right)^2 } dx

Proof: Consider y_{i+1} - y_i= f(x_{i+1}) -f(x_i). By the Mean Value Theorem there is a point  z_i in (x_i,x_{i+1}) such that

 y_{i+1} - y_i= f(x_{i+1}) -f(x_i)= f'(z_i) (x_{i+1}-x_i)\,

So

\begin{align}|P_iP_{i+1}|&=\sqrt{ (y_{i+1} - y_{i})^2 + (x_{i+1}-x_i)^2}\\
&=\sqrt{ (f'(z_i))^2 (x_{i+1}-x_i)^2 + (x_{i+1}-x_i)^2 }\\
&=\sqrt{ (1+(f'(z_i))^2) (x_{i+1}-x_i)^2 }\\
&=\sqrt{ (1+(f'(z_i))^2)} \Delta x\end{align}

Putting this into the definition of the length of C gives

L=\lim_{n\to \infty} \sum_{i=0}^{n-1} \sqrt{ (1+(f'(z_i))^2)} \Delta x

Now this is the definition of the integral of the function g(x) = \sqrt{1+(f'(x))^2} between a and b (notice that g is continuous because we are assuming that f' is continuous). Hence

L=\int_a^b \sqrt{1+(f'(x))^2} dx

as claimed.

Example: Length of the curve y=2x from x=0 to x=1

As a sanity check of our formula, let's calculate the length of the "curve" y=2x from x=0 to x=1. First let's find the answer using the Pythagorean Theorem.

P_0=(0,0)

and

P_1=(1,2)

so the length of the curve, s, is

s=\sqrt{2^2+1^2}=\sqrt{5}

Now let's use the formula

s=\int_0^1 \sqrt{1+\left(\frac{d(2x)}{dx}\right)^2}\,dx=\int_0^1 \sqrt{1+2^2}\,dx=\sqrt{5}x\bigr|_0^1=\sqrt{5}
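The arclength formula can also be evaluated numerically. Here is a sketch for y=x^{3/2} on [0,1] (the curve from exercise 1 below), where \sqrt{1+(f'(x))^2} simplifies to \sqrt{1+\tfrac{9}{4}x} since f'(x)=\tfrac{3}{2}\sqrt{x}:

```python
import math

# Midpoint-rule evaluation of L = ∫_0^1 √(1 + (3/2·√x)²) dx,
# compared with the closed form (13^{3/2} - 8)/27.
g = lambda x: math.sqrt(1 + (1.5 * math.sqrt(x))**2)

n = 100000
dx = 1.0 / n
L = dx * sum(g((i + 0.5) * dx) for i in range(n))
print(L, (13**1.5 - 8) / 27)  # the two values agree
```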


Exercises

1. Find the length of the curve y=x^{3/2} from x=0 to x=1.
\frac{13^{3/2}-8}{27}
2. Find the length of the curve y=\frac{e^x+e^{-x}}{2} from x=0 to x=1.
\frac{e-\frac{1}{e}}{2}

Solutions

Arclength of a parametric curve

For a parametric curve, that is, a curve defined by x=f(t) and y=g(t), the formula is slightly different:

L = \int_a^b \sqrt{ (f'(t))^2 + (g'(t))^2 }\,dt

Proof: The proof is analogous to the previous one: Consider y_{i+1} - y_i= g(t_{i+1}) -g(t_i) and x_{i+1} - x_i= f(t_{i+1}) -f(t_i). By the Mean Value Theorem there are points c_i and d_i in (t_i,t_{i+1}) such that

 y_{i+1} - y_i= g(t_{i+1}) -g(t_i)= g'(c_i) (t_{i+1}-t_i)\,

and

  x_{i+1} - x_i= f(t_{i+1}) -f(t_i)= f'(d_i) (t_{i+1}-t_i)\,

So

\begin{align}|P_iP_{i+1}|&=\sqrt{ (y_{i+1} - y_{i})^2 + (x_{i+1}-x_i)^2}\\
&=\sqrt{ (g'(c_i))^2 (t_{i+1}-t_i)^2 + (f'(d_i))^2 (t_{i+1}-t_i)^2 }\\
&=\sqrt{ \left((f'(d_i))^2+(g'(c_i))^2\right) (t_{i+1}-t_i)^2 }\\
&=\sqrt{ (f'(d_i))^2 + (g'(c_i))^2} \Delta t\end{align}

Putting this into the definition of the length of the curve gives

L=\lim_{n\to \infty} \sum_{i=0}^{n-1} \sqrt{ (f'(d_i))^2 + (g'(c_i))^2 } \Delta t

This is equivalent to:

L=\int_a^b \sqrt{ (f'(t))^2 + (g'(t))^2}\,dt
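A quick check of the parametric formula with a circle (R=3 is an arbitrary sample value): the speed \sqrt{(f'(t))^2+(g'(t))^2} is constantly R, so the integral returns the circumference 2\pi R, matching exercise 3 below.

```python
import math

# Arclength of x = R cos t, y = R sin t for t in [0, 2π] via midpoint rule.
R = 3.0
speed = lambda t: math.hypot(-R * math.sin(t), R * math.cos(t))  # = R

n = 100000
dt = 2 * math.pi / n
L = dt * sum(speed((i + 0.5) * dt) for i in range(n))
print(L, 2 * math.pi * R)  # the two values agree
```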

Exercises

3. Find the circumference of the circle given by the parametric equations x(t)=R\cos(t), y(t)=R\sin(t), with t running from 0 to 2\pi.
2\pi R
4. Find the length of one arch of the cycloid given by the parametric equations x(t)=R(t-\sin(t)), y(t)=R(1-\cos(t)), with t running from 0 to 2\pi.
8R

Solutions


Surface area


Suppose we are given a function f and we want to calculate the area of the surface obtained by rotating the graph of f around a given line. The calculation of the surface area of revolution is closely related to the arc length calculation.

If the function f is a straight line, other methods such as surface area formulas for cylinders and conical frusta can be used. However, if f is not linear, an integration technique must be used.

Recall the formula for the lateral surface area of a conical frustum:

 A = 2 \pi r l \,

where r is the average radius and l is the slant height of the frustum.

For y=f(x) and a\le x\le b, we divide [a,b] into subintervals with equal width Δx and endpoints x_0,x_1,\ldots,x_n. Each subinterval is mapped to a conical frustum of width Δx and lateral surface area A_i \,.

We can estimate the surface area of revolution with the sum

 A \approx \sum_{i=1}^{n} A_i

As we divide [a,b] into smaller and smaller pieces, the estimate gives a better value for the surface area.

Definition (Surface of Revolution)

The surface area of revolution of the curve y=f(x) about a line for a\le x\le b is defined to be

 A = \lim_{n\to \infty} \sum_{i=1}^{n} A_i

The Surface Area Formula

Suppose f is a continuous function on the interval [a,b] and r(x) represents the distance from f(x) to the axis of rotation. Then the lateral surface area of revolution about a line is given by

 A = 2 \pi \int_a^b r(x) \sqrt{ 1+ (f'(x))^2 } dx

And in Leibniz notation

 A = 2 \pi \int_a^b r(x) \sqrt{ 1+ \left(\frac{dy}{dx}\right)^2 } dx

Proof:

 A \ =  \lim_{n \to \infty} \sum_{i=1}^n A_i
=  \lim_{n \to \infty} \sum_{i=1}^n 2 \pi r_i l_i
=  2 \pi \lim_{n \to \infty} \sum_{i=1}^n r_i l_i

As  n \rightarrow \infty and  \Delta x \rightarrow 0 , we know two things:

1. the average radius of each conical frustum  r_i approaches a single value

2. the slant height of each conical frustum  l_i approaches the length of an infinitesimal segment of arc

From the arc length formula discussed in the previous section, we know that

 l_i = \sqrt{ 1+ (f'(x_i))^2 }\, \Delta x

Therefore

 A \ =  2 \pi \lim_{n \to \infty} \sum_{i=1}^n r_i l_i
=  2 \pi \lim_{n \to \infty} \sum_{i=1}^n r_i \sqrt{ 1+ (f'(x_i))^2} \Delta x

Because of the definition of the definite integral  \int_a^b f(x) dx = \lim_{n \to \infty} \sum_{i=1}^n f(c_i) \Delta x_i , we can rewrite the limit of the sum as an integral.

 A = 2 \pi \int_a^b r(x) \sqrt{ 1+ (f'(x))^2 } dx

Or if f is in terms of y on the interval [c,d]

 A = 2 \pi \int_c^d r(y) \sqrt{ 1+ (f'(y))^2} dy
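As a quick numerical check of the surface area formula, the following Python sketch applies the midpoint rule; `surface_area_of_revolution` is our own helper name, and we assume rotation about the x-axis, so that r(x) = f(x).

```python
import math

def surface_area_of_revolution(f, fprime, a, b, n=10_000):
    """Midpoint-rule estimate of A = 2*pi * integral_a^b f(x)*sqrt(1 + f'(x)^2) dx,
    for rotation about the x-axis (so r(x) = f(x))."""
    dx = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * dx          # midpoint of each subinterval
        total += f(x) * math.sqrt(1.0 + fprime(x) ** 2) * dx
    return 2.0 * math.pi * total

# Check against a cone: rotating f(x) = x on [0, 1] about the x-axis gives
# lateral area pi * r * l = pi * 1 * sqrt(2).
approx = surface_area_of_revolution(lambda x: x, lambda x: 1.0, 0.0, 1.0)
```

Because the integrand here is linear, the midpoint rule reproduces the exact value π√2 up to rounding.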

Work


W = \int F\,dr=\int ma\,dr=\int m \frac{dv}{dt}dr=m\int  \frac{dr}{dt}dv=m\int  v dv=\frac{1} {2} mv^2=\Delta E_k
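The chain of equalities above (the work–energy theorem) can be checked numerically. This is a sketch for the simplest case of a constant force; all variable names and the chosen values are ours.

```python
# Push a 2 kg mass from rest with a constant 5 N force for 3 s, accumulating
# W = sum F*dx step by step, then compare with the kinetic energy (1/2) m v^2.
m, F, T, n = 2.0, 5.0, 3.0, 100_000
dt = T / n
v = work = 0.0
for _ in range(n):
    a = F / m
    dx = v * dt + 0.5 * a * dt * dt   # displacement over one step (constant a)
    v += a * dt
    work += F * dx
kinetic = 0.5 * m * v * v             # final speed v = F*T/m = 7.5 m/s
```

The accumulated work and the final kinetic energy agree (both 56.25 J here), as the derivation predicts.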

Centre of mass


\vec{r_G}=\frac{\sum_{i=1}^{n}{\vec{r_i}{m_i}}}{\sum_{i=1}^{n}{m_i}}
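A small sketch of this formula for point masses in the plane; `centre_of_mass` is a hypothetical helper name of our own.

```python
def centre_of_mass(masses, positions):
    """r_G = (sum m_i * r_i) / (sum m_i), computed componentwise on tuples."""
    total = sum(masses)
    dim = len(positions[0])
    return tuple(
        sum(m * p[k] for m, p in zip(masses, positions)) / total for k in range(dim)
    )

# 1 kg at the origin and 3 kg at (4, 0): the centre of mass lies 3 m from
# the origin, i.e. three quarters of the way toward the heavier mass.
g = centre_of_mass([1.0, 3.0], [(0.0, 0.0), (4.0, 0.0)])
```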

Exercises

See the exercises for Integration

Parametric Equations

Introduction


Parametric equations are typically defined by two equations that specify both the x and y coordinates of a graph using a parameter. They are graphed by using the parameter (usually t) to compute both the x and y coordinates.

Example 1:

x = t \

y = t^2 \

Note: This parametric equation is equivalent to the rectangular equation y = x^2 \ .

Example 2:

x = \cos t \

y = \sin t \

Note: This parametric equation is equivalent to the rectangular equation  x^2 + y^2 = 1 \ and the polar equation   
r=1 \ .

Parametric equations can be plotted by using a t-table to show values of x and y for each value of t. They can also be plotted by eliminating the parameter, though this method obscures the parameter's role.
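A t-table for Example 1 can be generated in a few lines; this Python sketch uses our own (t, x, y) row layout.

```python
# t-table for x = t, y = t^2; eliminating the parameter recovers y = x^2.
table = [(t, t, t * t) for t in range(-3, 4)]   # rows are (t, x, y)
for t, x, y in table:
    print(f"t={t:>2}  x={x:>2}  y={y:>2}")
```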

Forms of Parametric Equations

Parametric equations can be described in three ways:

  • Parametric form
  • Vector form
  • An equality

The first two forms are used far more often, as they allow us to find the value of the component at the given value of the parameter. The final form is used less often; it allows us to verify a solution to the equation, or find the parameter (or some constant multiple thereof).

Parametric Form

A parametric equation can be shown in parametric form by describing it with a system of equations. For instance:

x = t \

y = t^2 - 1 \

Vector Form

Vector form can be used to describe a parametric equation in a similar manner to parametric form. In this case, a position vector is given: [x,y] = [t, t^2 - 1] \

Equalities

A parametric equation can also be described with a set of equalities. This is done by solving for the parameter, and equating the components. For example:

x = t \

y = t^2 - 1 \

From here, we can solve for t:

t = x \

t = \pm\sqrt{1+y} \

And hence equate the two right-hand sides:

x = \pm\sqrt{1+y} \

Converting Parametric Equations

There are a few commonplace methods used to change a parametric equation to rectangular form. The first involves solving for t in one of the two equations and then substituting that expression for t into the second equation.

Example 1:

x=t-3 \

y=t^2 \

x=t-3 \ becomes  x+3=t \

 y=(x+3)^2 \

Example 2:

Given

 x=3\cos{\theta} \

 y=4\sin{\theta} \

Isolate the trigonometric functions

 \cos{\theta} = \frac{x}{3}

 \sin{\theta} = \frac{y}{4}

Use the "Beloved Identity" (the Pythagorean identity)

 \cos^2{\theta} + \sin^2{\theta} = 1 \

 \frac{x^2}{9} + \frac{y^2}{16} = 1 \
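As a sanity check, one can sample the parametric curve of Example 2 and confirm that every point satisfies the rectangular equation just derived (a Python sketch; the sampling grid is our choice).

```python
import math

# Sample theta around the full curve and test x^2/9 + y^2/16 = 1 at each point.
ok = all(
    abs((3 * math.cos(th)) ** 2 / 9 + (4 * math.sin(th)) ** 2 / 16 - 1) < 1e-12
    for th in (2 * math.pi * k / 100 for k in range(100))
)
```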

Differentiation

Taking Derivatives of Parametric Systems

Just as we are able to differentiate functions of x, we are able to differentiate x and y, which are functions of t. Consider:

x = \sin t \

y = t \

We would find the derivative of x with respect to t, and the derivative of y with respect to t:

x' = \cos t \

y' = 1 \

In general, we say that if

x = f(t) \ and y = g(t) \ then:

x' = f'(t) \ and y' = g'(t) \

It's that simple.

This process works for any number of variables.

Slope of Parametric Equations

In the above process, x' has told us only the rate at which x is changing, not the rate for y, and vice versa. Neither rate alone is the slope.

In order to find the slope, we need something of the form {dy \over dx}.

We can discover a way to do this by simple algebraic manipulation:

{y' \over x'} = {{dy \over dt} \over {dx \over dt}} = {dy \over dx}

So, for the example in section 1, the slope at any time t:

{1 \over \cos t} = \sec t

In order to find a vertical tangent line, set the horizontal change, or x', equal to 0 and solve.

In order to find a horizontal tangent line, set the vertical change, or y', equal to 0 and solve.

If there is a time when both x' and y' are 0, that point is called a singular point.
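The slope formula can be checked numerically for the example x = sin t, y = t, where dy/dx = sec t (a Python sketch; `slope` is our own finite-difference helper).

```python
import math

# x = sin t, y = t: the slope y'/x' = 1/cos t = sec t. Compare with a symmetric
# difference quotient dy/dx taken directly from nearby parametric points.
def slope(t, h=1e-6):
    dy = (t + h) - (t - h)
    dx = math.sin(t + h) - math.sin(t - h)
    return dy / dx

t0 = 0.7
analytic = 1.0 / math.cos(t0)   # sec(0.7)
```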

Concavity of Parametric Equations

Solving for the second derivative of a parametric equation can be more complex than it may seem at first glance. When you take the derivative of {dy \over dx} with respect to t, you are left with the derivative of the slope with respect to the parameter, not with respect to x:

{d\over dt}\left[{dy \over dx}\right]

By multiplying this expression by {dt \over dx}, we are able to solve for the second derivative of the parametric equation:

{d\over dt}\left[{dy \over dx}\right] \times {dt \over dx} = {d^2y\over dx^2}.

Thus, the concavity of a parametric equation can be described as:

{d\over dt}\left[{dy \over dx}\right] \times {dt \over dx}

So for the example in sections 1 and 2, where {dy \over dx} = \sec t and {dt \over dx} = \sec t, the concavity at any time t:

{d\over dt}[\sec t] \times \sec t = \sec t \tan t \times \sec t = \sec^2 t \tan t
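For the same example x = sin t, y = t, the slope is dy/dx = 1/cos t, so the second derivative works out to sec²t·tan t; a finite-difference Python sketch (helper names are ours) confirms this numerically.

```python
import math

# For x = sin t, y = t:  dy/dx = 1/cos t, and
# d2y/dx2 = d/dt(dy/dx) * dt/dx = (sec t * tan t) * sec t = sec^2(t) * tan(t).
def d2y_dx2(t, h=1e-5):
    dydx = lambda u: 1.0 / math.cos(u)            # dy/dx as a function of t
    ddt = (dydx(t + h) - dydx(t - h)) / (2 * h)   # d/dt of dy/dx (central diff)
    return ddt * (1.0 / math.cos(t))              # multiply by dt/dx = sec t

t0 = 0.5
analytic = math.tan(t0) / math.cos(t0) ** 2       # sec^2(t) * tan(t)
```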

Integration

Introduction

Because most parametric equations are given in explicit form, they can be integrated like many other equations. Integration has a variety of applications with respect to parametric equations, especially in kinematics and vector calculus.

 x = \int{x'(t)}\, dt

 y = \int{y'(t)}\, dt

So, taking a simple example:

 y = \int \cos t\, dt = \sin t + C

Polar Equations

Introduction

A polar grid with several angles labeled in degrees

The polar coordinate system is a two-dimensional coordinate system in which each point on a plane is determined by an angle and a distance. The polar coordinate system is especially useful in situations where the relationship between two points is most easily expressed in terms of angles and distance; in the more familiar Cartesian coordinate system or rectangular coordinate system, such a relationship can only be found through trigonometric formulae.

As the coordinate system is two-dimensional, each point is determined by two polar coordinates: the radial coordinate and the angular coordinate. The radial coordinate (usually denoted as r) denotes the point's distance from a central point known as the pole (equivalent to the origin in the Cartesian system). The angular coordinate (also known as the polar angle or the azimuth angle, and usually denoted by θ or t) denotes the positive or anticlockwise (counterclockwise) angle required to reach the point from the 0° ray or polar axis (which is equivalent to the positive x-axis in the Cartesian coordinate plane).

Plotting points with polar coordinates

The points (3,60°) and (4,210°) on a polar coordinate system

Each point in the polar coordinate system can be described with the two polar coordinates, which are usually called r (the radial coordinate) and θ (the angular coordinate, polar angle, or azimuth angle, sometimes represented as φ or t). The r coordinate represents the radial distance from the pole, and the θ coordinate represents the anticlockwise (counterclockwise) angle from the 0° ray (sometimes called the polar axis), known as the positive x-axis on the Cartesian coordinate plane.

For example, the polar coordinates (3, 60°) would be plotted as a point 3 units from the pole on the 60° ray. The coordinates (−3, 240°) would also be plotted at this point because a negative radial distance is measured as a positive distance on the opposite ray (the ray reflected about the origin, which differs from the original ray by 180°).

One important aspect of the polar coordinate system, not present in the Cartesian coordinate system, is that a single point can be expressed with an infinite number of different coordinates. This is because any number of multiple revolutions can be made around the central pole without affecting the actual location of the point plotted. In general, the point (r, θ) can be represented as (r, θ ± n×360°) or (−r, θ ± (2n + 1)180°), where n is any integer.

The arbitrary coordinates (0, θ) are conventionally used to represent the pole, as regardless of the θ coordinate, a point with radius 0 will always be on the pole. To get a unique representation of a point, it is usual to limit r to non-negative numbers (r ≥ 0) and θ to the interval [0, 360°) or (−180°, 180°] (or, in radian measure, [0, 2π) or (−π, π]).

Angles in polar notation are generally expressed in either degrees or radians, using the conversion 2π rad = 360°. The choice depends largely on the context. Navigation applications use degree measure, while some physics applications (specifically rotational mechanics) and almost all mathematical literature on calculus use radian measure.

Converting between polar and Cartesian coordinates

A diagram illustrating the conversion formulae

The two polar coordinates r and θ can be converted to the Cartesian coordinates x and y by using the trigonometric functions sine and cosine:

x = r \cos \theta \,
y = r \sin \theta, \,

while the two Cartesian coordinates x and y can be converted to polar coordinate r by

r = \sqrt{x^2 + y^2} \, (by a simple application of the Pythagorean theorem).

To determine the angular coordinate θ, the following two ideas must be considered:

  • For r = 0, θ can be set to any real value.
  • For r ≠ 0, to get a unique representation for θ, it must be limited to an interval of size 2π. Conventional choices for such an interval are [0, 2π) and (−π, π].

To obtain θ in the interval [0, 2π), the following may be used (\arctan denotes the inverse of the tangent function):

\theta = 
\begin{cases}
\arctan(\frac{y}{x})        & \mbox{if } x > 0 \mbox{ and } y \ge 0\\
\arctan(\frac{y}{x}) + 2\pi & \mbox{if } x > 0 \mbox{ and } y < 0\\
\arctan(\frac{y}{x}) + \pi  & \mbox{if } x < 0\\
\frac{\pi}{2}               & \mbox{if } x = 0 \mbox{ and } y > 0\\
\frac{3\pi}{2}              & \mbox{if } x = 0 \mbox{ and } y < 0
\end{cases}

To obtain θ in the interval (−π, π], the following may be used:

\theta = 
\begin{cases}
\arctan(\frac{y}{x}) & \mbox{if } x > 0\\
\arctan(\frac{y}{x}) + \pi & \mbox{if } x < 0 \mbox{ and } y \ge 0\\
\arctan(\frac{y}{x}) - \pi & \mbox{if } x < 0 \mbox{ and } y < 0\\
\frac{\pi}{2} & \mbox{if } x = 0 \mbox{ and } y > 0\\
-\frac{\pi}{2} & \mbox{if } x = 0 \mbox{ and } y < 0
\end{cases}

One may avoid having to keep track of the numerator and denominator signs by use of the atan2 function, which has separate arguments for the numerator and the denominator.
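A minimal sketch of the conversion in both directions, using Python's math.atan2 and math.hypot (the function names `to_polar` and `to_cartesian` are ours).

```python
import math

def to_polar(x, y):
    """Cartesian -> polar with theta in (-pi, pi]; atan2 handles every sign case."""
    return math.hypot(x, y), math.atan2(y, x)

def to_cartesian(r, theta):
    return r * math.cos(theta), r * math.sin(theta)

r, theta = to_polar(-1.0, 1.0)                 # second quadrant: theta = 3*pi/4
x2, y2 = to_cartesian(*to_polar(2.5, -1.2))    # should round-trip to (2.5, -1.2)
```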

Polar equations

The equation defining an algebraic curve expressed in polar coordinates is known as a polar equation. In many cases, such an equation can simply be specified by defining r as a function of θ. The resulting curve then consists of points of the form (r(θ), θ) and can be regarded as the graph of the polar function r.

Different forms of symmetry can be deduced from the equation of a polar function r. If r(−θ) = r(θ) the curve will be symmetrical about the horizontal (0°/180°) ray, if r(π−θ) = r(θ) it will be symmetric about the vertical (90°/270°) ray, and if r(θ−α°) = r(θ) it will be rotationally symmetric α° counterclockwise about the pole.

Because of the circular nature of the polar coordinate system, many curves can be described by a rather simple polar equation, whereas their Cartesian form is much more intricate. Among the best known of these curves are the polar rose, Archimedean spiral, lemniscate, limaçon, and cardioid.

For the circle, line, and polar rose below, it is understood that there are no restrictions on the domain and range of the curve.

Circle

A circle with equation r(θ) = 1

The general equation for a circle with a center at (r0, φ) and radius a is

r^2 - 2 r r_0 \cos(\theta - \varphi) + r_0^2 = a^2.\,

This can be simplified in various ways, to conform to more specific cases, such as the equation

r(\theta)=a \,

for a circle with a center at the pole and radius a.

Line

Radial lines (those running through the pole) are represented by the equation

\theta = \varphi \,,

where φ is the angle of elevation of the line; that is, φ = arctan m where m is the slope of the line in the Cartesian coordinate system. The non-radial line that crosses the radial line θ = φ perpendicularly at the point (r0, φ) has the equation

r(\theta) = {r_0}\sec(\theta-\varphi). \,

Polar rose

A polar rose with equation r(θ) = 2 sin 4θ

A polar rose is a famous mathematical curve that looks like a petaled flower, and that can be expressed as a simple polar equation,

r(\theta) = a \cos (k\theta + \phi_0)\,

for any constant \phi_0 (including 0). If k is an integer, these equations will produce a k-petaled rose if k is odd, or a 2k-petaled rose if k is even. If k is rational but not an integer, a rose-like shape may form but with overlapping petals. Note that these equations never define a rose with 2, 6, 10, 14, etc. petals. The variable a represents the length of the petals of the rose.

Archimedean spiral

One arm of an Archimedean spiral with equation r(θ) = θ for 0 < θ < 6π

The Archimedean spiral is a famous spiral that was discovered by Archimedes, which also can be expressed as a simple polar equation. It is represented by the equation

r(\theta) = a+b\theta. \,

Changing the parameter a will turn the spiral, while b controls the distance between the arms, which for a given spiral is always constant. The Archimedean spiral has two arms, one for θ > 0 and one for θ < 0. The two arms are smoothly connected at the pole. Taking the mirror image of one arm across the 90°/270° line will yield the other arm. This curve is notable as one of the first curves, after the Conic Sections, to be described in a mathematical treatise, and as being a prime example of a curve that is best defined by a polar equation.

Conic sections

Ellipse, showing semi-latus rectum

A conic section with one focus on the pole and the other somewhere on the 0° ray (so that the conic's semi-major axis lies along the polar axis) is given by:

r  = { \ell\over {1 + e \cos \theta}}

where e is the eccentricity and \ell is the semi-latus rectum (the perpendicular distance at a focus from the major axis to the curve). If e > 1, this equation defines a hyperbola; if e = 1, it defines a parabola; and if e < 1, it defines an ellipse. The special case e = 0 of the latter results in a circle of radius \ell.

Differentiation

Differential calculus

We have the following formulas:

r \tfrac{\partial}{\partial r}= x \tfrac{\partial}{\partial x} + y \tfrac{\partial}{\partial y} \,
\tfrac{\partial}{\partial \theta} = -y \tfrac{\partial}{\partial x} + x \tfrac{\partial}{\partial y} .

To find the Cartesian slope of the tangent line to a polar curve r(θ) at any given point, the curve is first expressed as a system of parametric equations.

x=r(\theta)\cos\theta \,
y=r(\theta)\sin\theta \,

Differentiating both equations with respect to θ yields

\tfrac{\partial x}{\partial \theta}=r'(\theta)\cos\theta-r(\theta)\sin\theta \,
\tfrac{\partial y}{\partial \theta}=r'(\theta)\sin\theta+r(\theta)\cos\theta \,

Dividing the second equation by the first yields the Cartesian slope of the tangent line to the curve at the point (r(θ), θ):

\frac{dy}{dx}=\frac{r'(\theta)\sin\theta+r(\theta)\cos\theta}{r'(\theta)\cos\theta-r(\theta)\sin\theta}
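A quick check of this slope formula on the unit circle r(θ) = 1, where it should reduce to −cot θ (a Python sketch; `polar_slope` is our own helper).

```python
import math

# dy/dx = (r' sin(t) + r cos(t)) / (r' cos(t) - r sin(t)); on the unit circle
# r(t) = 1 and r'(t) = 0, so this reduces to cos(t)/(-sin(t)) = -cot(t).
def polar_slope(r, rprime, theta):
    num = rprime(theta) * math.sin(theta) + r(theta) * math.cos(theta)
    den = rprime(theta) * math.cos(theta) - r(theta) * math.sin(theta)
    return num / den

s = polar_slope(lambda th: 1.0, lambda th: 0.0, math.pi / 3)  # -cot(pi/3)
```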

Integration

Introduction

Integrating a polar equation requires a different approach than integration under the Cartesian system, hence yielding a different formula, which is not as straightforward as integrating the function f(x).

Proof

In creating the concept of integration, we used Riemann sums of rectangles to approximate the area under the curve. However, with polar graphs, one can use sectors of circles with radius r and angle measure dθ. The area of each sector is then (πr²)(dθ/2π) = ½r² dθ, and the sum of all the infinitesimally small sectors' areas is \frac{1}{2} \int_{a}^{b} r^2\,d\theta. This is the form to use to integrate a polar expression of the form r=f(\theta), where (a, f(a)) and (b, f(b)) are the ends of the curve that you wish to integrate.

Integral calculus

The integration region R is bounded by the curve r = f(\theta) and the rays \theta=a and \theta=b.

Let R denote the region enclosed by a curve r = f(\theta) and the rays \theta=a and \theta=b, where 0<b-a<2\pi. Then, the area of R is

\frac12\int_a^b r^2\,d\theta.
The region R is approximated by n sectors (here, n = 5).

This result can be found as follows. First, the interval [a,b] is divided into n subintervals, where n is an arbitrary positive integer. Thus \delta\theta, the angular width of each subinterval, is equal to b-a (the total angular width of the interval), divided by n, the number of subintervals. For each subinterval i=1,2,\ldots,n, let \theta_i be the midpoint of the subinterval, and construct a circular sector with the center at the origin, radius r_i=f(\theta_i), central angle \delta\theta, and arc length r_i\delta\theta. The area of each constructed sector is therefore equal to \tfrac12 r_i^2\delta\theta. Hence, the total area of all of the sectors is

\sum_{i=1}^n \tfrac12 r_i^2\,\delta\theta.

As the number of subintervals n is increased, the approximation of the area continues to improve. In the limit as n\to\infty, the sum becomes the Riemann integral.
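The sector sum above can be carried out directly (a Python sketch; `polar_area` is a hypothetical helper that uses subinterval midpoints, as in the construction).

```python
import math

def polar_area(r, a, b, n=100_000):
    """Approximate (1/2) * integral_a^b r(theta)^2 dtheta by summing n sectors."""
    dtheta = (b - a) / n
    total = 0.0
    for i in range(n):
        theta = a + (i + 0.5) * dtheta      # midpoint of each subinterval
        total += 0.5 * r(theta) ** 2 * dtheta
    return total

# Unit circle r(theta) = 1 over [0, 2*pi): the area should come out to pi.
area = polar_area(lambda th: 1.0, 0.0, 2 * math.pi)
```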

Generalization

Using Cartesian coordinates, an infinitesimal area element can be calculated as dA = dx\,dy. The substitution rule for multiple integrals states that, when using other coordinates, the Jacobian determinant of the coordinate conversion formula has to be considered:

J = \det\frac{\partial(x,y)}{\partial(r,\theta)}
=\begin{vmatrix}
  \frac{\partial x}{\partial r}  & \frac{\partial x}{\partial \theta} \\
  \frac{\partial y}{\partial r}  & \frac{\partial y}{\partial \theta}
\end{vmatrix}
=\begin{vmatrix}
  \cos\theta & -r\sin\theta \\
  \sin\theta &  r\cos\theta
\end{vmatrix}
=r\cos^2\theta + r\sin^2\theta = r.

Hence, an area element in polar coordinates can be written as

dA = J\,dr\,d\theta = r\,dr\,d\theta.

Now, a function that is given in polar coordinates can be integrated as follows:

\iint_R g(r,\theta) \, dA = \int_a^b \int_0^{r(\theta)}  g(r,\theta)\,r\,dr\,d\theta.

Here, R is the same region as above, namely, the region enclosed by a curve r=f(\theta) and the rays \theta=a and \theta=b.

The formula for the area of R mentioned above is retrieved by taking g identically equal to 1.

Applications

Polar integration is often useful when the corresponding integral is either difficult or impossible to do with the Cartesian coordinates. For example, let's try to find the area of the closed unit circle. That is, the area of the region enclosed by x^2 + y^2 = 1.

In Cartesian

 \int_{-1}^1 \int_{-\sqrt{1-x^2}}^{\sqrt{1-x^2}} \, dy \, dx = 2\int_{-1}^1 \sqrt{1-x^2} \, dx

In order to evaluate this, one usually uses trigonometric substitution. By setting \sin\theta = x, we get both \cos\theta = \sqrt{1-x^2} and \cos\theta\,d\theta = dx.

\begin{align}\int\sqrt{1-x^2}\,dx &= \int \cos^2\theta\,d\theta\\
&= \int \frac{1}{2} + \frac{1}{2} \cos 2\theta\,d\theta\\
&= \frac{\theta}{2} + \frac{1}{4}\sin2\theta+c= \frac{\theta}{2} + \frac{1}{2}\sin\theta\cos\theta+c\\
&= \frac{\arcsin x}{2}+\frac{x\sqrt{1-x^2}}{2}+c\end{align}

Putting this back into the equation, we get

2\int_{-1}^1\sqrt{1-x^2}\,dx = 2\left[\frac{\arcsin x}{2}+\frac{x\sqrt{1-x^2}}{2}\right]_{-1}^1 = \arcsin 1-\arcsin(-1) = \pi

In Polar

To integrate in polar coordinates, we first note that the boundary of the region is r = \sqrt{x^2 + y^2} = 1, and that to include the whole circle we take a=0 and b=2\pi.

\int_{0}^{2\pi} \int_{0}^1 r\,dr\,d\theta = \int_0^{2\pi} \left[\frac{r^2}{2}\right]_0^1\,d\theta = \int_0^{2\pi} \frac{1}{2}\,d\theta = \left[\frac{\theta}{2}\right]_0^{2\pi} = \frac{2\pi}{2} = \pi

An interesting example

A less intuitive application of polar integration yields the Gaussian integral

 \int_{-\infty}^\infty e^{-x^2} \, dx = \sqrt\pi.

Try it! (Hint: multiply  \int_{-\infty}^\infty e^{-x^2} \, dx and  \int_{-\infty}^\infty e^{-y^2} \, dy.)
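A numerical check of the Gaussian integral is easy, since e^{−x²} is negligible outside a modest interval (a Python midpoint-rule sketch; the truncation bounds and step count are our choices).

```python
import math

# Midpoint rule on [-10, 10]; e^(-x^2) < e^(-100) outside this range, so the
# truncation error is negligible next to the quadrature error.
a, b, n = -10.0, 10.0, 200_000
dx = (b - a) / n
total = sum(math.exp(-(a + (i + 0.5) * dx) ** 2) for i in range(n)) * dx
```

The result agrees with √π to many digits, even though the derivation by polar coordinates is needed to see *why*.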

Sequences and Series

Sequences

A sequence is an ordered list of objects (or events). Like a set, it contains members (also called elements or terms), and the number of terms (possibly infinite) is called the length of the sequence. Unlike a set, order matters, and exactly the same elements can appear multiple times at different positions in the sequence.

For example, (C, R, Y) is a sequence of letters that differs from (Y, C, R), as the ordering matters. Sequences can be finite, as in this example, or infinite, such as the sequence of all even positive integers (2, 4, 6,...).

An infinite sequence of real numbers (in blue). This sequence is neither increasing, nor decreasing, nor convergent. It is however bounded.

Examples and notation

There are various and quite different notions of sequences in mathematics, some of which (e.g., exact sequence) are not covered by the notations introduced below.

A sequence may be denoted (a1, a2, ...). For shortness, the notation (an) is also used.

A more formal definition of a finite sequence with terms in a set S is a function from {1, 2, ..., n} to S for some n ≥ 0. An infinite sequence in S is a function from {1, 2, ...} (the set of natural numbers without 0) to S.

Sequences may also start from 0, so the first term in the sequence is then a0.

A finite sequence is also called an n-tuple. Finite sequences include the empty sequence ( ) that has no elements.

A function from all integers into a set is sometimes called a bi-infinite sequence, since it may be thought of as a sequence indexed by negative integers grafted onto a sequence indexed by positive integers.

Types and properties of sequences

A subsequence of a given sequence is a sequence formed from the given sequence by deleting some of the elements (which, as stated in the introduction, can also be called "terms") without disturbing the relative positions of the remaining elements.

If the terms of the sequence are a subset of an ordered set, then a monotonically increasing sequence is one for which each term is greater than or equal to the term before it; if each term is strictly greater than the one preceding it, the sequence is called strictly monotonically increasing. A monotonically decreasing sequence is defined similarly. Any sequence fulfilling the monotonicity property is called monotonic or monotone. This is a special case of the more general notion of a monotonic function. A sequence that both increases and decreases (at different places in the sequence) is said to be non-monotonic or non-monotone.

The terms non-decreasing and non-increasing are often used in order to avoid any possible confusion with strictly increasing and strictly decreasing, respectively. If the terms of a sequence are integers, then the sequence is an integer sequence. If the terms of a sequence are polynomials, then the sequence is a polynomial sequence.

If S is endowed with a topology (as is true of real numbers, for example), then it becomes possible to consider the convergence of an infinite sequence in S. Such considerations involve the concept of the limit of a sequence.

It can be shown that bounded monotonic sequences must converge.

Sequences in analysis

In analysis, when talking about sequences, one will generally consider sequences of the form

(x_1, x_2, x_3, ...)\, or (x_0, x_1, x_2, ...)\,

which is to say, infinite sequences of elements indexed by natural numbers. (It may be convenient to have the sequence start with an index different from 1 or 0. For example, the sequence defined by xn = 1/log(n) would be defined only for n ≥ 2. When talking about such infinite sequences, it is usually sufficient (and does not change much for most considerations) to assume that the members of the sequence are defined at least for all indices large enough, that is, greater than some given N.)

The most elementary type of sequences are numerical ones, that is, sequences of real or complex numbers.

Series

Introduction

A series is the sum of a sequence of terms. An infinite series is the sum of an infinite number of terms (the actual sum of the series need not be infinite, as we will see below).

An arithmetic series is the sum of a sequence of terms with a common difference (the difference between consecutive terms). For example:

1+4+7+10+13+\dots

is an arithmetic series with common difference 3, since a_2-a_1=3, a_3-a_2=3, and so forth.

A geometric series is the sum of terms with a common ratio. For example, an interesting series which appears in many practical problems in science, engineering, and mathematics is the geometric series  r + r^2 + r^3 + r^4 + ... where the ... indicates that the series continues indefinitely. A common way to study a particular series (following Cauchy) is to define a sequence consisting of the sum of the first n terms. For example, to study the geometric series we can consider the sequence which adds together the first n terms:

S_n(r) = \sum_{i=1}^{n} r^i.

Generally by studying the sequence of partial sums we can understand the behavior of the entire infinite series.

Two of the most important questions about a series are:

  • Does it converge?
  • If so, what does it converge to?

For example, it is fairly easy to see that for  r > 1 , the geometric series S_n(r) will not converge to a finite number (i.e., it will diverge to infinity). To see this, note that each time we increase the number of terms in the series by one, S_n(r) increases by r^{n+1}. Since r^{n+1} > 1 for all r > 1, the partial sums grow by more than one at every step, so they increase without bound and the series diverges.

Perhaps a more surprising and interesting fact is that for  |r| < 1 ,  S_n(r) will converge to a finite value. Specifically, it is possible to show that

\lim_{n \to \infty} S_n(r) = \frac{r}{1-r}.

Indeed, consider the quantity

(1-r)S_n(r) = (1-r)\sum_{i=1}^{n} r^i = \sum_{i=1}^{n} r^i - \sum_{i=2}^{n+1} r^i = r - r^{n+1}

Since r^{n+1}\to 0 as n\to \infty for |r|<1, this shows that (1-r)S_n(r)\to r as n\to \infty. The quantity 1-r is non-zero and doesn't depend on n so we can divide by it and arrive at the formula we want.

We'd like to be able to draw similar conclusions about any series.

Unfortunately, there is no simple way to sum a series. The most we will be able to do in most cases is determine whether it converges. The geometric and the telescoping series are the only types of series whose sums we can easily find.
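The convergence of the geometric series for |r| < 1 can be watched directly in the partial sums (a Python sketch; `S` mirrors the S_n(r) defined above).

```python
# Partial sums S_n(r) = r + r^2 + ... + r^n, as defined above; for |r| < 1
# they approach the limit r / (1 - r).
def S(r, n):
    return sum(r ** i for i in range(1, n + 1))

r = 0.5
limit = r / (1 - r)                                  # = 1.0 for r = 0.5
gaps = [abs(S(r, n) - limit) for n in (5, 10, 20, 40)]   # shrinking errors
```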

Convergence

It is obvious that for a series to converge, the a_n must tend to zero (if the terms stayed above some fixed positive number, the sum of infinitely many of them would be infinite), but even if the limit of the sequence is 0, this is not sufficient to say it converges.

Consider the harmonic series, the sum of 1/n, and group terms

\begin{matrix}
\sum_1^{2^m} \frac{1}{n} & = &1+\frac{1}{2} & + &\frac{1}{3}+\frac{1}{4} & + &\frac{1}{5}+\frac{1}{6}+\frac{1}{7}+\frac{1}{8} &+\ldots & + &\sum_{p=1+2^{m-1}}^{2^m} \frac{1}{p} \\ 
 & > &\frac{3}{2} & + &\frac{1}{4}(2) & + &\frac{1}{8}(4) &+\ldots & + & \frac{1}{2^m}2^{m-1} \\
 & = &\frac{3}{2} & + &\frac{1}{2} & + &\frac{1}{2} &+\ldots & + & \frac{1}{2} \quad (m \mbox{ terms})
\end{matrix}

As m tends to infinity, so does this final sum, hence the series diverges.

We can also deduce something about how quickly it diverges. Using the same grouping of terms, we can get upper and lower bounds on the sums of the first so many terms, the partial sums.

1+\frac{m}{2} \le   \sum_1^{2^m} \frac{1}{n} \le 1+m

or

1+\frac{\log_2 m}{2}\le \sum_1^m \frac{1}{n} \le 1+ \log_2 m

and the partial sums increase like log m, very slowly.
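Both bounds can be verified for small m (a Python sketch; `H` is our name for the partial sum of the harmonic series).

```python
# The grouping argument above gives 1 + m/2 <= H(2^m) <= 1 + m, where H(n)
# denotes the n-th partial sum of the harmonic series.
def H(n):
    return sum(1.0 / k for k in range(1, n + 1))

ok = all(1 + m / 2 <= H(2 ** m) <= 1 + m for m in range(1, 15))
```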

Comparison test

The argument above, based on considering upper and lower bounds on terms, can be modified to provide a general-purpose test for convergence and divergence called the comparison test (or direct comparison test). It can be applied to any series with nonnegative terms:

  • If \sum b_n converges and 0 \le a_n \le b_n, then \sum a_n converges.
  • If \sum b_n diverges and 0 \le b_n \le a_n, then \sum a_n diverges.

There are many such tests for convergence and divergence, the most important of which we will describe below.

Absolute convergence

Theorem: If the series of absolute values, \sum_{n=1}^\infty \left| a_n \right|, converges, then so does the series \sum_{n=1}^\infty a_n

We say such a series converges absolutely.

Proof:

Let \epsilon>0

According to the Cauchy criterion for series convergence, there exists N such that for all m,n > N:

\sum_{k=n}^{m}|a_{k}|<\epsilon

We know that:

|\sum_{k=n}^{m}a_{k}|\leq\sum_{k=n}^{m}|a_{k}|

Combining the two inequalities gives:

|\sum_{k=n}^{m}a_{k}|<\epsilon

which is exactly the Cauchy criterion for the convergence of \sum_{n=1}^\infty a_n.

Q.E.D.


The converse does not hold. The series 1-1/2+1/3-1/4 ... converges, even though the series of its absolute values diverges.

A series like this that converges, but not absolutely, is said to converge conditionally.

If a series converges absolutely, we can add terms in any order we like. The limit will still be the same.

If a series converges conditionally, rearranging the terms can change the limit. In fact, we can make the series converge to any limit we like by choosing a suitable rearrangement.

E.g., in the series 1-1/2+1/3-1/4 ..., we can add only positive terms until the partial sum exceeds 100, subtract 1/2, add only positive terms until the partial sum exceeds 100, subtract 1/4, and so on, getting a sequence with the same terms that converges to 100.

This makes absolutely convergent series easier to work with. Thus, all but one of the convergence tests in this chapter will be for series whose terms are all positive; such a series must either converge absolutely or diverge. Other series will be studied by considering the corresponding series of absolute values.

Ratio test

For a series with terms an, if

 \lim_{n \to \infty } \left| \frac{a_{n+1}}{a_n} \right| = r

then

  • the series converges (absolutely) if r<1
  • the series diverges if r>1 (or if r is infinity)
  • the series could do either if r=1, so the test is not conclusive in this case.

E.g., suppose

a_n=\frac{n!n!}{(2n)!}

then

\frac{a_{n+1}}{a_n}=\frac{(n+1)^2}{(2n+1)(2n+2)}=\frac{n+1}{4n+2} \to \frac{1}{4}

so this series converges.
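The computed ratios indeed approach 1/4 (a Python sketch using math.factorial; variable names are ours).

```python
import math

# a_n = n! * n! / (2n)!; the ratio a_{n+1}/a_n should match (n+1)/(4n+2)
# and tend to 1/4, so the ratio test gives (absolute) convergence.
def a(n):
    return math.factorial(n) ** 2 / math.factorial(2 * n)

ratios = [a(n + 1) / a(n) for n in range(1, 50)]
```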

Integral test

If f(x) is a monotonically decreasing, always positive function, then the series

\sum_{n=1}^\infty f(n)

converges if and only if the integral

\int_1^\infty f(x)dx

converges.

E.g., consider f(x)=1/xp, for a fixed p.

  • If p=1 this is the harmonic series, which diverges.
  • If p<1 each term is larger than the corresponding term of the harmonic series, so it diverges.
  • If p>1 then
\begin{matrix}\int_1^\infty x^{-p}dx & = & \lim_{s \to \infty}\int_1^s x^{-p}dx & \\ 
& = & \lim_{s \to \infty } \left. \frac{-1}{(p-1)x^{p-1}} \right|^s_1 &  \\ 
& = & \lim_{s \to \infty } \left( \frac{1}{p-1}-\frac{1}{(p-1)s^{p-1}} \right) 
& =\frac{1}{p-1} \end{matrix}

The integral converges, for p>1, so the series converges.


We can prove this test works by writing the integral as

\int_1^\infty f(x)dx=\sum_{n=1}^\infty \int_n^{n+1} f(x)dx

and comparing each of the integrals with rectangles, giving the inequalities

f(n) \ge \int_n^{n+1} f(x)dx \ge f(n+1)

Applying these to the sum then shows convergence.
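These rectangle inequalities can be spot-checked numerically. The sketch below uses f(x) = 1/x^2, for which the integral over [n, n+1] has the closed form 1/n - 1/(n+1):

```python
# f(x) = 1/x^2 is positive and monotonically decreasing, so each rectangle
# inequality f(n) >= ∫_n^{n+1} f(x) dx >= f(n+1) should hold.
def f(x):
    return 1.0 / x ** 2

def integral(n):
    # ∫_n^{n+1} x^(-2) dx = 1/n - 1/(n+1)
    return 1.0 / n - 1.0 / (n + 1)

for n in range(1, 10):
    print(n, f(n), integral(n), f(n + 1))
```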

Limit comparison test

Given an infinite series \textstyle\sum a_n with positive terms only, if one can find another infinite series \textstyle\sum b_n with positive terms for which

\lim_{n \to \infty} \frac{a_n}{b_n} = L

for a positive and finite L (i.e., the limit exists and is not zero), then the two series either both converge or both diverge. That is,

  • \textstyle\sum a_n converges if \textstyle\sum b_n converges, and
  • \textstyle\sum a_n diverges if \textstyle\sum b_n diverges.

Example:

a_n = n^{-\frac{n+1}{n}}

For large n, the terms of this series are similar to, but smaller than, those of the harmonic series. We apply the limit comparison test with b_n = 1/n:

\lim \frac{a_n}{b_n} = \lim \frac{n^{-\frac{n+1}{n}}}{1/n} = \lim \frac{n}{n^{\frac{n+1}{n}}} 
= \lim \frac {1}{n^{\frac {1}{n}}}=1>0

so this series diverges.
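A numerical sketch of the comparison: the ratio a_n/b_n equals n^{-1/n}, which should approach 1 as n grows.

```python
# Ratio of a_n = n^(-(n+1)/n) to b_n = 1/n; the exact value is n^(-1/n),
# which tends to 1 as n grows.
def ratio(n):
    a_n = n ** (-(n + 1) / n)
    b_n = 1.0 / n
    return a_n / b_n

print([round(ratio(n), 6) for n in (10, 1000, 100000)])
```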

Alternating series

Given an infinite series \sum a_n, if the signs of the an alternate, that is if

a_n=(-1)^n |a_n| \,

for all n or

a_n=(-1)^{n+1} |a_n| \,

for all n, then we call it an alternating series.

The alternating series test states that such a series converges if

\lim_{n \to \infty}a_n=0

and

\ |a_{n+1}| < |a_n|

(that is, the magnitude of the terms is decreasing).

Note that this test cannot lead to the conclusion that the series diverges; if one cannot conclude that the series converges, this test is inconclusive, although other tests may, of course, be used to give a conclusion.

Estimating the sum of an alternating series

The absolute error that results in using a partial sum of an alternating series to estimate the final sum of the infinite series is smaller than the magnitude of the first omitted term.

\left| \sum_{n=1}^\infty a_n - \sum_{n=1}^m a_n \right| < |a_{m+1}|
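For example (a sketch using the alternating harmonic series, whose sum is ln 2), the error bound can be checked directly:

```python
from math import log

# m-th partial sum of 1 - 1/2 + 1/3 - ..., which converges to ln 2
def partial_sum(m):
    return sum((-1) ** (n + 1) / n for n in range(1, m + 1))

m = 50
error = abs(log(2) - partial_sum(m))
bound = 1.0 / (m + 1)  # magnitude of the first omitted term
print(error, bound)
```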

Geometric series

The geometric series can take either of the following forms

\sum_{n=0}^\infty ar^n or \sum_{n=1}^\infty ar^{n-1}

As you have seen at the start, the sum of the geometric series is

s=\lim_{n\to\infty} S_n =\lim_{n\to\infty}\frac{a(1-r^n)}{1-r}=\frac{a}{1-r}\quad\mbox{ for } |r| < 1.
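A quick numerical sketch with a = 3 and r = 1/2, so the limit should be a/(1-r) = 6:

```python
# Partial sums S_n = a(1 - r^n)/(1 - r) of the geometric series sum a r^n
a, r = 3.0, 0.5
for n in (5, 10, 50):
    s_n = a * (1 - r ** n) / (1 - r)
    print(n, s_n)  # approaches a/(1-r) = 6
```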

Telescoping series

\sum_{n=0}^\infty (b_n - b_{n+1})

Expanding (or "telescoping") this type of series is informative. If we expand this series, we get:

\sum_{n=0}^k (b_n - b_{n+1})= (b_0 - b_1) + (b_1 - b_2) + ... + (b_{k-1} - b_k)

Additive cancellation leaves:

\sum_{n=0}^k (b_n - b_{n+1})= b_0 - b_k

Thus,

\sum_{n=0}^\infty (b_n - b_{n+1}) = 
\lim_{k \to \infty} \sum_{n=0}^k (b_n - b_{n+1}) =
\lim_{k \to \infty} (b_0 - b_k) = b_0 - \lim_{k \to \infty} b_k

and all that remains is to evaluate the limit.
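For instance (a sketch with b_n = 1/(n+1), so b_0 = 1 and b_k → 0, giving a sum of 1):

```python
# Partial sums of sum (b_n - b_{n+1}) collapse to b_0 - b_k.
def b(n):
    return 1.0 / (n + 1)

for k in (10, 100, 10000):
    partial = sum(b(n) - b(n + 1) for n in range(k))
    print(k, partial, b(0) - b(k))  # the two values agree
```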

There are other tests that can be used, but these tests are sufficient for all commonly encountered series.

Series and Calculus

Taylor Series

Taylor Series

As the degree of the Taylor polynomial rises, it approaches the correct function. Shown: sin(x) and its Taylor approximations of degree 1, 3, 5, 7, 9, 11 and 13.

The Taylor series of an infinitely often differentiable real (or complex) function f defined on an open interval (a-r, a+r) is the power series
\sum_{n=0}^{\infin} \frac{f^{(n)}(a)}{n!} (x-a)^{n}

Here, n! is the factorial of n and f^{(n)}(a) denotes the nth derivative of f at the point a. If this series converges for every x in the interval (a-r, a+r) and the sum is equal to f(x), then the function f(x) is called analytic. To check whether the series converges towards f(x), one normally uses estimates for the remainder term of Taylor's theorem. A function is analytic if and only if a power series converges to the function; the coefficients in that power series are then necessarily the ones given in the above Taylor series formula.

If a = 0, the series is also called a Maclaurin series.

The importance of such a power series representation is threefold. First, differentiation and integration of power series can be performed term by term and is hence particularly easy. Second, an analytic function can be uniquely extended to a holomorphic function defined on an open disk in the complex plane, which makes the whole machinery of complex analysis available. Third, the (truncated) series can be used to approximate values of the function near the point of expansion.


Around zero, the function looks very flat. The function e^{-1/x²} is not analytic: its Taylor series is 0, although the function is not.

Note that there are examples of infinitely often differentiable functions f(x) whose Taylor series converge, but are not equal to f(x). For instance, for the function defined piecewise by saying that f(x) = exp(−1/x²) if x ≠ 0 and f(0) = 0, all the derivatives are zero at x = 0, so the Taylor series of f(x) is zero, and its radius of convergence is infinite, even though the function most definitely is not zero. This particular pathology does not afflict complex-valued functions of a complex variable. Notice that exp(−1/z²) does not approach 0 as z approaches 0 along the imaginary axis.

Some functions cannot be written as Taylor series because they have a singularity; in these cases, one can often still achieve a series expansion if one allows also negative powers of the variable x; see Laurent series. For example, f(x) = exp(−1/x²) can be written as a Laurent series.

The Parker-Sockacki theorem is a recent advance in finding Taylor series which are solutions to differential equations. This theorem is an expansion on the Picard iteration.

Derivation/why this works

If a function f(x) can be written as an infinite power series centered at a, it will look like this:

f(x)=c_0(x-a)^0+c_1(x-a)^1+c_2(x-a)^2+c_3(x-a)^3+c_4(x-a)^4+c_5(x-a)^5+\cdots

where a is the center of the interval of convergence and c_0,c_1,c_2,c_3,c_4,\ldots are coefficients. If we substitute a for x:

f(a)=c_0

If we differentiate:

f'(x)=1c_1(x-a)^0+2c_2(x-a)^1+3c_3(x-a)^2+4c_4(x-a)^3+5c_5(x-a)^4+\cdots

If we substitute a for x:

f'(a)=1c_1

If we differentiate again:

f''(x)=2c_2+3\cdot2\,c_3(x-a)^1+4\cdot3\,c_4(x-a)^2+5\cdot4\,c_5(x-a)^3+\cdots

If we substitute a for x:

f''(a)=2c_2

Extrapolating:

n!\,c_n=f^{(n)}(a)

where f^{(0)}(x)=f(x) and f^{(1)}(x)=f'(x) and so on. Solving for c_n, the power series representation of f(x) is:

f(x)=\sum_{n=0}^{\infin} \frac{f^{(n)}(a)}{n!}(x-a)^n

List of Taylor series

Several important Taylor series expansions follow. All these expansions are also valid for complex arguments x.

Exponential function and natural logarithm:

e^{x} = \sum^{\infin}_{n=0} \frac{x^n}{n!}\quad\mbox{ for all } x
\ln(1+x) = \sum^{\infin}_{n=1} \frac{(-1)^{n+1}}n x^n\quad\mbox{ for } \left| x \right| < 1

Geometric series:

\frac{1}{1-x} = \sum^{\infin}_{n=0} x^n\quad\mbox{ for } \left| x \right| < 1

Binomial series:

(1+x)^\alpha = \sum^{\infin}_{n=0} C(\alpha,n) x^n\quad\mbox{ for all } \left| x \right| < 1\quad\mbox{ and all complex } \alpha

Trigonometric functions:

\sin x = \sum^{\infin}_{n=0} \frac{(-1)^n}{(2n+1)!} x^{2n+1}\quad\mbox{ for all } x
\cos x = \sum^{\infin}_{n=0} \frac{(-1)^n}{(2n)!} x^{2n}\quad\mbox{ for all } x
\tan x = \sum^{\infin}_{n=1} \frac{B_{2n} (-4)^n (1-4^n)}{(2n)!} x^{2n-1}\quad\mbox{ for } \left| x \right| < \frac{\pi}{2}
\sec x = \sum^{\infin}_{n=0} \frac{(-1)^n E_{2n}}{(2n)!} x^{2n}\quad\mbox{ for } \left| x \right| < \frac{\pi}{2}
\arcsin x = \sum^{\infin}_{n=0} \frac{(2n)!}{4^n (n!)^2 (2n+1)} x^{2n+1}\quad\mbox{ for } \left| x \right| < 1
\arctan x = \sum^{\infin}_{n=0} \frac{(-1)^n}{2n+1} x^{2n+1}\quad\mbox{ for } \left| x \right| < 1

Hyperbolic functions:

\sinh x = \sum^{\infin}_{n=0} \frac{1}{(2n+1)!} x^{2n+1}\quad\mbox{ for all } x
\cosh x = \sum^{\infin}_{n=0} \frac{1}{(2n)!} x^{2n}\quad\mbox{ for all } x
\tanh x = \sum^{\infin}_{n=1} \frac{B_{2n} 4^n (4^n-1)}{(2n)!} x^{2n-1}\quad\mbox{ for } \left| x \right| < \frac{\pi}{2}
\sinh^{-1} x = \sum^{\infin}_{n=0} \frac{(-1)^n (2n)!}{4^n (n!)^2 (2n+1)} x^{2n+1}\quad\mbox{ for } \left| x \right| < 1
\tanh^{-1} x = \sum^{\infin}_{n=0} \frac{1}{2n+1} x^{2n+1}\quad\mbox{ for } \left| x \right| < 1

Lambert's W function:

W_0(x) = \sum^{\infin}_{n=1} \frac{(-n)^{n-1}}{n!} x^n\quad\mbox{ for } \left| x \right| < \frac{1}{e}

The numbers Bk appearing in the expansions of tan(x) and tanh(x) are the Bernoulli numbers. The C(α,n) in the binomial expansion are the binomial coefficients. The Ek in the expansion of sec(x) are Euler numbers.
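The listed expansions are easy to spot-check numerically. This sketch truncates the sin and exp series and compares them with the standard library functions:

```python
from math import sin, exp, factorial

def sin_series(x, terms):
    # sum_{n=0}^{terms-1} (-1)^n x^(2n+1) / (2n+1)!
    return sum((-1) ** n * x ** (2 * n + 1) / factorial(2 * n + 1)
               for n in range(terms))

def exp_series(x, terms):
    # sum_{n=0}^{terms-1} x^n / n!
    return sum(x ** n / factorial(n) for n in range(terms))

x = 1.0
print(sin_series(x, 8), sin(x))
print(exp_series(x, 15), exp(x))
```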

Multiple dimensions

The Taylor series may be generalized to functions of more than one variable with


\sum_{n_1=0}^{\infin} \cdots \sum_{n_d=0}^{\infin}
\frac{\partial^{n_1+\cdots+n_d}f}{\partial x_{1}^{n_1}\cdots \partial x_{d}^{n_d}}(a_1,\ldots,a_d)\,
\frac{(x_1-a_1)^{n_1}\cdots (x_d-a_d)^{n_d}}{n_1!\cdots n_d!}

History

The Taylor series is named for mathematician Brook Taylor, who first published the power series formula in 1715.

Constructing a Taylor Series

Several methods exist for the calculation of Taylor series of a large number of functions. One can attempt to use the Taylor series as-is and generalize the form of the coefficients, or one can use manipulations such as substitution, multiplication or division, addition or subtraction of standard Taylor series (such as those above) to construct the Taylor series of a function, by virtue of Taylor series being power series. In some cases, one can also derive the Taylor series by repeatedly applying integration by parts. The use of computer algebra systems to calculate Taylor series is common, since it eliminates tedious substitution and manipulation.

Example 1

Consider the function

f(x)=\ln{(1+\cos{x})} \,,

for which we want a Taylor series at 0.

We have for the natural logarithm

\ln(1+x) = \sum^{\infin}_{n=1} \frac{(-1)^{n+1}}{n} x^n = x - {x^2\over 2}+{x^3 \over 3} - {x^4 \over 4} + \cdots \quad\mbox{ for } \left| x \right| < 1

and for the cosine function

\cos x = \sum^{\infin}_{n=0} \frac{(-1)^n}{(2n)!} x^{2n} = 1 -{x^2\over 2!}+{x^4\over 4!}- \cdots \quad\mbox{ for all } x\in\mathbb{C}.

We can simply substitute the second series into the first. Doing so gives

\left(1 -{x^2\over 2!}+{x^4\over 4!}-\cdots\right)-{1\over 2}\left(1 -{x^2\over 2!}+{x^4\over 4!}-\cdots\right)^2 +{1\over 3}\left(1 -{x^2\over 2!}+{x^4\over 4!}-\cdots\right)^3-\cdots

Expanding by using multinomial coefficients gives the required Taylor series. Note that cosine and therefore f are even functions, meaning that f(x)=f(-x), hence the coefficients of the odd powers x, x^3, x^5, x^7 and so on have to be zero and don't need to be calculated. The first few terms of the series are

\ln(1+\cos x)=\ln 2-{x^2\over 4}-{x^4\over 96}-{x^6\over 1440} -{17x^8\over 322560}-{31x^{10}\over 7257600}-\cdots

The general coefficient can be represented using Faà di Bruno's formula. However, this representation does not seem to be particularly illuminating and is therefore omitted here.
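As a numerical sanity check on this expansion (a sketch; the test points are arbitrary), the truncated series should agree closely with ln(1 + cos x) for small x:

```python
from math import log, cos

def series(x):
    # first terms of the expansion of ln(1 + cos x) about 0
    return (log(2) - x ** 2 / 4 - x ** 4 / 96 - x ** 6 / 1440
            - 17 * x ** 8 / 322560)

for x in (0.5, 0.1):
    print(x, log(1 + cos(x)), series(x))
```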

Example 2

Suppose we want the Taylor series at 0 of the function

g(x)=\frac{e^x}{\cos x}\,.

We have for the exponential function

e^x = \sum^\infty_{n=0} {x^n\over n!} =1 + x + {x^2 \over 2!} + {x^3 \over 3!} + {x^4 \over 4!} +\cdots

and, as in the first example,

\cos x = 1 - {x^2 \over 2!} + {x^4 \over 4!} - \cdots

Assume the power series is

{e^x \over \cos x} = c_0 + c_1 x + c_2 x^2 + c_3 x^3 + \cdots

Then multiplication with the denominator and substitution of the series of the cosine yields

\begin{align} e^x &= (c_0 + c_1 x + c_2 x^2 + c_3 x^3 + \cdots)\cos x\\
&=\left(c_0 + c_1 x + c_2 x^2 + c_3 x^3 + c_4x^4 + \cdots\right)\left(1 - {x^2 \over 2!} + {x^4 \over 4!} - \cdots\right)\\
&=c_0 - {c_0 \over 2}x^2 + {c_0 \over 4!}x^4 + c_1x - {c_1 \over 2}x^3 + {c_1 \over 4!}x^5 + c_2x^2 - {c_2 \over 2}x^4 + {c_2 \over 4!}x^6 + c_3x^3 - {c_3 \over 2}x^5 + {c_3 \over 4!}x^7 +\cdots \end{align}

Collecting the terms up to fourth order yields

=c_0 + c_1x + \left(c_2 - {c_0 \over 2}\right)x^2 + \left(c_3 - {c_1 \over 2}\right)x^3+\left(c_4+{c_0 \over 4!}-{c_2\over 2}\right)x^4 + \cdots

Comparing coefficients with the above series of the exponential function yields the desired Taylor series

\frac{e^x}{\cos x}=1 + x + x^2 + {2x^3 \over 3} + {x^4 \over 2} + \cdots
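Again, the result can be spot-checked numerically; near 0 the truncated series should track e^x / cos x (a sketch, with a loose tolerance since the series is cut off at the x⁴ term):

```python
from math import exp, cos

def series(x):
    # e^x / cos x ≈ 1 + x + x^2 + 2x^3/3 + x^4/2 near 0
    return 1 + x + x ** 2 + 2 * x ** 3 / 3 + x ** 4 / 2

for x in (0.2, 0.05):
    print(x, exp(x) / cos(x), series(x))
```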

Convergence

Generalized Mean Value Theorem

Power Series

The study of power series is aimed at investigating series which can approximate some function over a certain interval.

Motivations

Elementary calculus (differentiation) is used to obtain information on a line which touches a curve at one point (i.e. a tangent). This is done by calculating the gradient, or slope of the curve, at a single point. However, this does not provide us with reliable information on the curve's actual value at given points in a wider interval. This is where the concept of power series becomes useful.

An example

Consider the curve of y = cos(x), about the point x = 0. A naïve approximation would be the line y = 1. However, for a more accurate approximation, observe that cos(x) looks like an inverted parabola around x = 0 - therefore, we might think about which parabola could approximate the shape of cos(x) near this point. This curve might well come to mind:

y = {1 - { x^2 \over 2}}

In fact, this is the best estimate for cos(x) which uses polynomials of degree 2 (i.e. a highest term of x2) - but how do we know this is true? This is the study of power series: finding optimal approximations to functions using polynomials.

Definition

A power series is a series of the form

a_0x^0 + a_1x^1 + \cdots + a_nx^n

or, equivalently,

\sum_{j=0}^n a_jx^j

Radius of convergence

When using a power series as an alternative method of calculating a function's value, the equation

f(x)=\sum_{j=0}^n a_jx^j

can only be used to study f(x) where the power series converges - this may happen for a finite range, or for all real numbers.

The size of the interval (around its center) in which the power series converges to the function is known as the radius of convergence.

An example

\frac{1}{1-x}=\sum_{n=0}^\infty x^n (a geometric series)

this converges when | x | < 1, the range -1 < x < +1, so the radius of convergence - centered at 0 - is 1. It should also be observed that at the extremities of the radius, that is where x = 1 and x = -1, the power series does not converge.

Another example

e^x=\sum_{n=0}^\infty \frac{x^n}{n!}

Using the ratio test, this series converges when the ratio of successive terms is less than one:

\lim_{n \to \infty} \left|\frac{x^{n+1}}{\left(n+1\right)!}\cdot\frac{n!}{x^n}\right|<1
\lim_{n \to \infty} \left|\frac{x^n\, x}{n!\left(n+1\right)}\cdot\frac{n!}{x^n}\right|<1
or \lim_{n \to \infty}\left|\frac{x}{n+1}\right|<1

which is always true - therefore, this power series has an infinite radius of convergence. In effect, this means that the power series can always be used as a valid alternative to the original function, ex.

Abstraction

If we use the ratio test on an arbitrary power series, we find it converges when

\lim \frac{|a_{n+1}x|}{|a_n|} <1

and diverges when

\lim \frac{|a_{n+1}x|}{|a_n|} >1

The radius of convergence is therefore

r=\lim \frac{|a_n|}{|a_{n+1}|}

If this limit diverges to infinity, the series has an infinite radius of convergence.
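For example (a sketch with a_n = n/3^n, where the ratio |a_n|/|a_{n+1}| equals 3n/(n+1) and so tends to the radius 3):

```python
# Radius of convergence of sum (n / 3^n) x^n via r = lim |a_n| / |a_{n+1}|
def a(n):
    return n / 3.0 ** n

print([a(n) / a(n + 1) for n in (5, 20, 100)])  # tends to 3
```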

Differentiation and Integration

Within its radius of convergence, a power series can be differentiated and integrated term by term.

\frac{d}{dx} \sum_{j=0}^\infty a_jx^j = 
\sum_{j=0}^\infty (j+1)a_{j+1}x^j
\int_0^x \sum_{j=0}^\infty a_jz^j\,dz = \sum_{j=1}^\infty \frac{a_{j-1}}{j}x^j

Both the differential and the integral have the same radius of convergence as the original series.

This allows us to sum exactly suitable power series. For example,

\frac{1}{1+x}=1-x+x^2-x^3+ \ldots

This is a geometric series, which converges for | x | < 1. Integrating both sides, we get

\ln (1+x) = x-\frac{x^2}{2}+\frac{x^3}{3}-\cdots

which will also converge for | x | < 1. When x = -1 this is the negative of the harmonic series, which diverges; when x = 1 this is an alternating series with diminishing terms, which converges to ln 2 - this is testing the extremities.

It also lets us write power series for integrals we cannot do exactly such as the error function:

e^{-x^2}=\sum (-1)^n \frac{x^{2n}}{n!}

The left hand side can not be integrated exactly, but the right hand side can be.

\int_0^z e^{-x^2}dx=\sum  \frac{(-1)^n z^{2n+1}}{(2n+1)n!}

This gives us a power series for the sum, which has an infinite radius of convergence, letting us approximate the integral as closely as we like.
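This term-by-term integral is easy to evaluate numerically. The sketch below sums the series at z = 1 and compares it with a crude midpoint Riemann-sum estimate of the integral (both values are approximations):

```python
from math import factorial, exp

def integral_series(z, terms):
    # sum (-1)^n z^(2n+1) / ((2n+1) n!)
    return sum((-1) ** n * z ** (2 * n + 1) / ((2 * n + 1) * factorial(n))
               for n in range(terms))

def riemann(z, steps):
    # midpoint Riemann sum for ∫_0^z e^(-x^2) dx
    h = z / steps
    return sum(exp(-((i + 0.5) * h) ** 2) * h for i in range(steps))

print(integral_series(1.0, 20), riemann(1.0, 10000))
```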

Further reading

Exercises

The following exercises test your understanding of infinite sequences and series. You may want to review that material before trying these problems.

Each question is followed by a "Hint" (usually a quick indication of the most efficient way to work the problem), the "Answer only" (what it sounds like), and finally a "Full solution" (showing all the steps required to get to the right answer). These should show up as "collapsed" or "hidden" sections (click on the title to display the contents), but some older web browsers might not be able to display them correctly (i.e., showing the content when it should be hidden). If this is true for your browser (or if you're looking at a printed version), you should take care not to "see too much" before you start thinking of how to work each problem.

Sequences

Consider the infinite sequence

a_n=n/2^n,\ n=1,\ldots,\infty.
  • Is the sequence monotonically increasing or decreasing?
Hint

compare adjacent terms algebraically or take a derivative

Answer only

monotonically decreasing (strictly decreasing starting with the second term)

Full solution

One may either consider how a_{n+1} compares to a_n algebraically and try to show that one is greater than the other, or take the derivative of a_n with respect to n and check where it is positive or negative.

Algebraically, since

a_{n+1}-a_n=\frac{n+1}{2^{n+1}}-\frac{n}{2^n}=\frac{n+1}{2^{n+1}}-\frac{2n}{2^{n+1}}=\frac{1-n}{2^{n+1}},

we see that a_{n+1}<a_n for n>1. That is, starting from the second term, the sequence is strictly decreasing. It is easy to check how the first two terms compare by just plugging in n=1 and n=2:

a_1=1/2
a_2=2/4=1/2

The first two terms are equal and thereafter the terms are strictly decreasing. Therefore, the sequence is monotonically decreasing.

Using calculus,

\frac{d}{dn}\,\frac{n}{2^n}=\frac{2^n\cdot1-n\cdot2^n\ln2}{(2^n)^2}=\frac{2^n(1-n\ln2)}{2^{2n}}=\frac{1-n\ln2}{2^n},

which is negative for n>1/\ln2\approx1.44>1. The rest of the argument is the same as before.

  • Is the sequence bounded from below, from above, both, or neither?
Hint

consider what kind of values are taken on by the numerator and denominator, and use the previous answer

Answer only

bounded from below and from above (both)

Full solution

The sequence is bounded from below because the terms are clearly positive (greater than 0) for all values of n. Also, since the sequence is decreasing (see the previous problem), the maximum value of the sequence must be the value of the first term. So the sequence is bounded from above (by the value 1/2), as well.

  • Does the sequence converge or diverge?
Hint

use the previous two answers to make a conclusion, or take a limit

Answer only

converges

Full solution

By the previous two answers, the sequence is bounded from below and monotonically decreasing, thus by a theorem it must converge.

To show this directly, consider the limit

\lim_{n\to\infty} \frac{n}{2^n}=\lim_{n\to\infty} \frac{1}{2^n\ln2}=0.

The two limits are equal by L'Hôpital's Rule, since the numerator and denominator of the expression in the first limit both grow to infinity.

Since the limit exists, it is the number to which the sequence converges.

Partial sums

Assume that the nth partial sum of a series is given by s_n=2-\frac{1}{3^n}.

  • Does the series converge? If so, to what value?
Hint

take a limit

Answer only

converges to 2

Full solution

The series converges to 2 since

s=\lim_{n\to\infty} s_n=\lim_{n\to\infty} \left(2-\frac{1}{3^n}\right)=2.
  • What is the formula for the nth term of the series?
Show hint

s_n=a_1+\cdots+a_{n-1}+a_n=s_{n-1}+a_n

Show answer only

a_n=\frac{2}{3^n}

Show full solution

a_n=s_n-s_{n-1}=\left(2-\frac{1}{3^n}\right)-\left(2-\frac{1}{3^{n-1}}\right)=\frac{1}{3^{n-1}}-\frac{1}{3^n}=\frac{3}{3^n}-\frac{1}{3^n}=\frac{2}{3^n}

Note that the series turns out to be geometric, since

a_n=2\,(1/3)^n=a\,r^n.

Sums of infinite series

Find the value to which each of the following series converges.

  • \sum_{n=0}^{\infty} \frac{3}{4^n}
Show hint

sum of an infinite geometric series

Show answer only

4

Show full solution

The series is

\sum_{n=0}^{\infty} 3\left(\frac{1}{4}\right)^n

and so is geometric with first term a=3 and common ratio r=1/4. So

s=\frac{a}{1-r}=\frac{3}{1-1/4}=4.
  • \sum_{n=1}^{\infty} \left(\frac{2}{e}\right)^n
Show hint

sum of an infinite geometric series

Show answer only

\frac{2}{e-2}

Show full solution

s=\frac{2/e}{1-2/e}=\frac{2}{e-2}

  • \sum_{n=2}^{\infty} \frac{1}{n^2-n}
Show hint

telescoping series

Show answer only

1

Show full solution

Note that

\sum_{n=2}^{\infty} \frac{1}{n^2-n} = \sum_{n=2}^{\infty} \frac{1}{n(n-1)} = \sum_{n=2}^{\infty} \left(\frac{1}{n-1}-\frac{1}{n}\right)

by partial fractions. So

s = \lim_{N\to\infty} s_N = \lim_{N\to\infty} \left(1-\frac{1}{2}\right) + \left(\frac{1}{2}-\frac{1}{3}\right) + \left(\frac{1}{3}-\frac{1}{4}\right) + \ldots + \left(\frac{1}{N-1}-\frac{1}{N}\right).

All but the first and last terms cancel out, so

s = \lim_{N\to\infty} \left(1-\frac{1}{N}\right) = 1.
  • \sum_{n=1}^{\infty} \frac{(-1)^n 2^{n-1}}{3^n}
Show hint

rewrite so that all exponents are n

Show answer only

−1/5

Show full solution

The series simplifies to

\sum_{n=1}^{\infty} \frac{(-1)^n 2^n}{3^n \cdot 2} = \sum_{n=1}^{\infty} \frac{1}{2} \left(\frac{-2}{3}\right)^n

and so is geometric with common ratio r=-2/3 and first term -1/3. Thus

s=\frac{-1/3}{1-(-2/3)}=-1/5.

Convergence and divergence of infinite series

Determine whether each the following series converges or diverges. (Note: Each "Hint" gives the convergence/divergence test required to draw a conclusion.)

  • \sum_{n=1}^{\infty} \frac{1}{n^2}
Show hint

p-series

Show answer only

converges

Show full solution

This is a p-series with p=2. Since p>1, the series converges.

  • \sum_{n=0}^{\infty} \frac{1}{2^n}
Show hint

geometric series

Show answer only

converges

Show full solution

This is a geometric series with common ratio r=1/2, and so converges since |r|<1.

  • \sum_{n=1}^{\infty} \frac{n}{n^2+1}
Show hint

limit comparison test

Show answer only

diverges

Show full solution

This series can be compared to a p-series:

\sum_{n=1}^{\infty} \frac{n}{n^2+1} \sim \sum_{n=1}^{\infty} \frac{n}{n^2} = \sum_{n=1}^{\infty} \frac{1}{n}

The \sim symbol means the two series are "asymptotically equivalent"—that is, they either both converge or both diverge because their terms behave so similarly when summed as n gets very large. This can be shown by the limit comparison test:

\lim_{n\to\infty} \left( \frac{n}{n^2+1} \div \frac{1}{n} \right) = \lim_{n\to\infty} \left( \frac{n}{n^2+1} \cdot \frac{n}{1} \right) = \lim_{n\to\infty} \frac{n^2}{n^2+1} = 1

Since the limit is positive and finite, the two series either both converge or both diverge. The simpler series diverges because it is a p-series with p=1 (harmonic series), and so the original series diverges by the limit comparison test.

  • \sum_{n=2}^{\infty} \frac{1}{\ln n}
Show hint

direct comparison test

Show answer only

diverges

Show full solution

This series can be compared to a smaller p-series:

\sum_{n=2}^{\infty} \frac{1}{\ln n} \ge \sum_{n=2}^{\infty} \frac{1}{n}

The p-series diverges since p=1 (harmonic series), so the larger series diverges by the appropriate direct comparison test.

  • \sum_{n=0}^{\infty} \frac{n!}{2^n}
Show hint

divergence test

Show answer only

diverges

Show full solution

The terms of this series do not have a limit of zero. Note that when n>1,

\frac{n!}{2^n} = \frac{n}{2}\cdot\left[\frac{n-1}{2}\cdot\frac{n-2}{2}\dots\frac{2}{2}\right]\cdot\frac{1}{2} \ge \frac{n}{2}\cdot(1)\cdot\frac{1}{2} = \frac{n}{4}

To see why the inequality holds, consider that when n=2 none of the fractions in the square brackets above are actually there; when n=3 only 2/2 (which is the same as [n-1]/2) is in the brackets; when n=4 only 3/2 (equal to [n-1]/2) and 2/2 (equal to [n-2]/2) are there; when n=5, only 4/2, 3/2, and 2/2 are there; and so forth. Clearly none of these fractions are less than 1 and they never will be, no matter what n>1 is used.

The fact that

\lim_{n\to\infty} \frac{n}{4} = \infty

then implies that

\lim_{n\to\infty} \frac{n!}{2^n} = \infty

Therefore the series diverges by the divergence test.

  • \sum_{n=1}^{\infty} \frac{\cos\pi n}{n}
Show hint

alternating series test

Show answer only

converges

Show full solution

This is an alternating series:

\sum_{n=1}^{\infty} \frac{\cos\pi n}{n} = \sum_{n=1}^{\infty} \frac{(-1)^n}{n}

Since the sequence

|a_n|=\frac{1}{n}

decreases to 0, the series converges by the alternating series test.

  • \sum_{n=2}^{\infty} \frac{(-1)^n}{n\ln n-1}
Show hint

alternating series test

Show answer only

converges

Show full solution

Since the terms alternate, consider the sequence

|a_n|=\frac{1}{n\ln n-1}

This sequence is clearly decreasing (since both n and \ln n are increasing — one may also show that the derivative (with respect to n) of the expression is negative for n\ge2) and has limit zero (the denominator goes to infinity), so the series converges by the alternating series test.

Absolute and conditional convergence

Determine whether each the following series converges conditionally, converges absolutely, or diverges. (Note: Each "Hint" gives the test or tests that most easily lead to the final conclusion.)

  • \sum_{n=1}^{\infty} \frac{(-1)^n}{\sqrt{n}}
Show hint

alternating series test and either direct comparison test or integral test

Show answer only

converges conditionally

Show full solution

This series alternates, so consider the sequence

|a_n|=\frac{1}{\sqrt{n}}

Since this sequence is clearly decreasing to zero, the original series is convergent by the alternating series test. Now, consider the series formed by taking the absolute value of the terms of the original series:

\sum |a_n|=\sum_{n=1}^{\infty} \frac{1}{\sqrt{n}}

This new series can be compared to a p-series:

\sum_{n=1}^{\infty} \frac{1}{\sqrt{n}} \ge \sum_{n=1}^{\infty} \frac{1}{n}

Since the smaller series diverges, the larger one diverges. But this means the original (alternating) series was not absolutely convergent. (This last fact can also be shown using an integral test.) Therefore, the original series is only conditionally convergent.

  • \sum_{n=2}^{\infty} \frac{(-1)^n \ln n}{n}
Show hint

alternating series test and either integral test or direct comparison test

Show answer only

converges conditionally

Show full solution

This series alternates, so consider the sequence

|a_n|=\frac{\ln n}{n}

This sequence has a limit of zero by, for example, L'Hospital's Rule.

\lim_{n\to\infty}\frac{\ln n}{n}=\lim_{n\to\infty}\frac{1/n}{1}=0

That the sequence is decreasing can be verified by, for example, showing that as a continuous function of x, its derivative is negative.

\frac{d}{dx}\,\frac{\ln x}{x} = \frac{1 - \ln x}{x^2} < 0 \mbox{ if } x > e

This means that the terms definitely decrease starting with the second term (n=3). Thus, the series starting at n=3 is convergent by the alternating series test; clearly, then, the series starting at n=2 also converges (since the two series only differ by one term). Now, consider the series formed by taking the absolute value of the terms of the original series:

\sum |a_n|=\sum_{n=2}^{\infty} \frac{\ln n}{n}

This new series of positive terms only can be compared to a p-series:

\sum_{n=2}^{\infty} \frac{\ln n}{n} \ge \sum_{n=2}^{\infty} \frac{1}{n}

Since the smaller series diverges, the larger one diverges. Alternatively, the integral test can be used to test the convergence of the series of positive terms, since f(x) = \frac{\ln x}{x} is clearly a continuous, positive function on [2,\infty) and, as we have just verified, is also decreasing:

\int_{2}^{\infty} \frac{\ln x}{x}\,dx = \lim_{t\to\infty} \int_{2}^{t} \frac{\ln x}{x}\,dx = \lim_{t\to\infty} \int_{\ln 2}^{\ln t} u\,du

by the substitution u=\ln x; and this last expression becomes

\lim_{t\to\infty} \left(\frac{1}{2} u^2\Big|_{\ln 2}^{\ln t}\right) = \lim_{t\to\infty} \left(\frac{1}{2}(\ln t)^2 - \frac{1}{2} (\ln 2)^2\right) = \infty

Since the improper integral diverges, the series of positive terms diverges.

Either way you test it, the series with all positive terms diverges, and this means the original (alternating) series was not absolutely convergent. Thus, the original series is only conditionally convergent.

  • \sum_{n=2}^{\infty} \frac{(-1)^n n}{(\ln n)^2}
Show hint

divergence test

Show answer only

diverges

Show full solution

This series is alternating, but note that by L'Hospital's Rule

\lim_{n\to\infty}\frac{n}{(\ln n)^2} = \lim_{n\to\infty}\frac{1}{2(\ln n)(1/n)} = \lim_{n\to\infty}\frac{n}{2\ln n}
\mbox{ } = \lim_{n\to\infty}\frac{1}{2/n} = \lim_{n\to\infty}\frac{n}{2} = \infty

Which implies that

\lim_{n\to\infty}\frac{(-1)^n n}{(\ln n)^2}

does not exist, and hence by the divergence test, the series diverges.

  • \sum_{n=1}^{\infty} \frac{(-1)^n 2^n}{e^n-1}
Show hint

limit comparison test with geometric series

Show answer only

converges absolutely

Show full solution

While this alternating series can be shown to converge by the alternating series test, it can also be shown that the absolute value of the terms form a convergent series, and this is sufficient to conclude absolute convergence of the original series. Thus we will skip the former test and show only the latter.

\sum_{n=1}^{\infty} |a_n| = \sum_{n=1}^{\infty} \frac{2^n}{e^n-1}

This series of positive terms is asymptotically geometric with r=2/e:

\sum_{n=1}^{\infty} \frac{2^n}{e^n-1} \sim \sum_{n=1}^{\infty} (2/e)^n

The equivalence of these series is shown by using a limit comparison test:

\lim_{n\to\infty} \left[\frac{2^n}{e^n-1} \div (2/e)^n\right] = \lim_{n\to\infty} \left(\frac{2^n}{2^n}\cdot\frac{e^n}{e^n-1}\right) = \lim_{n\to\infty} \frac{1}{1-1/e^n} = 1

Since the limit is positive and finite, and since the simpler series converges because it geometric with r=2/e (the absolute value of which is less than 1), then the series of positive terms converges by the limit comparison test. Thus the original alternating series is absolutely convergent.

Note, by the way, that a direct comparison test in this case is more difficult (although still possible to do), since

\sum_{n=1}^{\infty} \frac{2^n}{e^n-1} \ge \sum_{n=1}^{\infty} (2/e)^n

and we need the inequality to go the other way to get a conclusion, since the geometric series converges.

This can be fixed by choosing the new series more carefully:

\sum_{n=1}^{\infty} \frac{2^n}{e^n-1} \le \sum_{n=1}^{\infty} \frac{2^n}{e^n-.5e^n} = \sum_{n=1}^{\infty} \frac{2^n}{.5e^n} = \sum_{n=1}^{\infty} 2(2/e)^n

Comparing the original series to the new (convergent geometric) series gives the desired result.

  • \sum_{n=1}^{\infty} \frac{(-1)^n}{\sin^2 n}
Show hint

divergence test

Show answer only

diverges

Show full solution

Since this is an alternating series, we can try the alternating series test. Consider the absolute value of the terms:

|a_n|=\frac{1}{\sin^2 n}

Because

\lim_{n\to\infty} \sin^2 n

does not exist (since it continually oscillates within the interval [0, 1] as n gets larger)

\lim_{n\to\infty} \frac{1}{\sin^2 n}

doesn't exist either. Thus the alternating series test fails (it is inconclusive).

However, in such a situation we can use the divergence test instead. Since

\lim_{n\to\infty} \frac{(-1)^n}{\sin^2 n}

also does not exist (and thus the terms of the series do not converge to 0), the original series diverges by the divergence test.

  • \sum_{n=1}^{\infty} \frac{(-1)^n n!}{(2n)!}
Hint: ratio test

Answer: converges absolutely

Solution:

Because of the factorials in this series, we try the ratio test:

\lim_{n\to\infty}\left|\frac{(-1)^{n+1}(n+1)!}{[2(n+1)]!}\div\frac{(-1)^n n!}{(2n)!}\right| = \lim_{n\to\infty}\frac{(n+1)!}{n!}\cdot\frac{(2n)!}{(2n+2)!} = \lim_{n\to\infty}\frac{(n+1)}{(2n+2)(2n+1)} = \lim_{n\to\infty}\frac{1}{2(2n+1)} = 0

Since the limit is less than 1, the series converges absolutely by the ratio test.
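As a numerical illustration of this ratio test (a sketch, not a proof), the following Python snippet computes the ratios |a_{n+1}/a_n| for a_n = n!/(2n)! and checks them against the simplified closed form 1/(2(2n+1)), which tends to 0:

```python
# Numerical illustration of the ratio test (a sketch, not a proof): for
# a_n = n!/(2n)!, the ratio a_{n+1}/a_n simplifies to 1/(2(2n+1)) -> 0.
import math

def a(n):
    return math.factorial(n) / math.factorial(2 * n)

ratios = [a(n + 1) / a(n) for n in range(1, 20)]

# Compare against the simplified closed form 1/(2(2n+1)).
closed_form = [1 / (2 * (2 * n + 1)) for n in range(1, 20)]
```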

  • \sum_{n=1}^{\infty} \frac{(-1)^n e^{1/n}}{\arctan n}
Hint: divergence test

Answer: diverges

Solution:

Although this is an alternating series, neither the numerator nor the denominator has an infinite limit, so the terms do not shrink to 0 and a divergence test is likely to work.

Note that

\lim_{n\to\infty} \frac{e^{1/n}}{\arctan n} = \frac{e^0}{\pi/2} = \frac{2}{\pi} \ne 0

Thus

\lim_{n\to\infty} \frac{(-1)^n e^{1/n}}{\arctan n} \ne 0

In fact, the latter limit does not exist. So, by the divergence test, the series diverges.

Vector Calculations

Vectors


Two-Dimensional Vectors

Introduction

In most mathematics courses up until this point, we deal with scalars. These are quantities that can be expressed with a single number. For instance, the amount of gasoline used to drive to the grocery store is a scalar quantity because it needs only one number: 2 gallons.

In this unit, we deal with vectors. A vector is a directed line segment -- that is, a line segment that points one direction or the other. As such, it has an initial point and a terminal point. The vector starts at the initial point and ends at the terminal point, and the vector points towards the terminal point. A vector is drawn as a line segment with an arrow at the terminal point:

A single vector without coordinate axes.


The same vector can be placed anywhere on the coordinate plane and still be the same vector -- the only two bits of information a vector represents are the magnitude and the direction. The magnitude is simply the length of the vector, and the direction is the angle at which it points. Since neither of these specify a starting or ending location, the same vector can be placed anywhere. To illustrate, all of the line segments below can be defined as the vector with magnitude 4\sqrt{2} and angle 45 degrees:

Multiple locations for the same vector.

It is customary, however, to place the vector with the initial point at the origin as indicated by the black vector. This is called the standard position.

Component Form

In standard practice, we don't express vectors by listing the length and the direction. We instead use component form, which lists the height (rise) and width (run) of the vectors. It is written as follows:

\begin{pmatrix} \mathrm{run} \\ \mathrm{rise}\end{pmatrix}

Vector with rise and run measurements.

Other ways of denoting a vector in component form include:

\mathbf{(u_x,u_y)} and \mathbf{\left\langle u_x,u_y\right\rangle}

From the diagram we can now see the benefits of the standard position: the two numbers for the terminal point's coordinates are the same numbers for the vector's rise and run. Note that we named this vector u. Just as you can assign numbers to variables in algebra (usually x, y, and z), you can assign vectors to variables in calculus. The letters u, v, and w are usually used, and either boldface or an arrow over the letter is used to identify it as a vector.

When expressing a vector in component form, it is no longer obvious what the magnitude and direction are. Therefore, we have to perform some calculations to find the magnitude and direction.

Magnitude

|\mathbf{u}| = \sqrt{u_x^2+u_y^2}

where u_x is the width, or run, of the vector; u_y is the height, or rise, of the vector. You should recognize this formula as the Pythagorean theorem. It is -- the magnitude is the distance between the initial point and the terminal point.

The magnitude of a vector can also be called the norm.

Direction

\tan \theta = \frac{u_y}{u_x}

Vector triangle with ux and uy labeled.


where \theta is the direction of the vector. This formula is simply the tangent formula for right triangles.
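These two formulas can be checked numerically. The sketch below (an illustration only) recovers the magnitude 4\sqrt{2} and 45-degree direction of the example vector from earlier; math.atan2 is used instead of a bare arctangent so the angle lands in the correct quadrant:

```python
# Magnitude and direction of the example vector (4, 4).
# math.atan2 handles quadrants correctly, unlike a bare arctan(u_y/u_x).
import math

u_x, u_y = 4, 4

magnitude = math.sqrt(u_x**2 + u_y**2)     # Pythagorean theorem
direction = math.atan2(u_y, u_x)           # angle in radians

direction_degrees = math.degrees(direction)
```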

Vector Operations

For these definitions, assume:


\mathbf{u} = \begin{pmatrix} u_x \\ u_y \end{pmatrix}

\mathbf{v} = \begin{pmatrix} v_x \\ v_y \end{pmatrix}


Vector Addition

Vector addition is often called tip-to-tail addition, because this makes it easier to remember: to add two vectors, place the tail (initial point) of the second vector at the tip (the arrowhead, at the terminal point) of the first.

The sum of the vectors you are adding is called the resultant vector, and is the vector drawn from the tail (initial point) of the first vector to the tip (terminal point) of the second vector. (Imagine you were walking the direction each vector points: you would start at the tail of the first vector, walk to its tip, then continue along the second vector from its tail to its tip.)

It looks like this:

Tip-to-tail addition

(Notice, the black lined vector is the sum of the two dotted line vectors!)


Numerically:


\begin{pmatrix} 4 \\ 6 \end{pmatrix} + \begin{pmatrix} 1 \\ -3 \end{pmatrix} = \begin{pmatrix} 5 \\ 3 \end{pmatrix}

Or more generally:


\mathbf{u} + \mathbf{v} = \begin{pmatrix} u_x + v_x \\ u_y + v_y \end{pmatrix}

Scalar Multiplication

Graphically, multiplying a vector by a scalar changes only the magnitude of the vector by that same scalar. That is, multiplying a vector by 2 will "stretch" the vector to twice its original magnitude, keeping the direction the same.

Multiplication of a vector with a scalar 2 \cdot \begin{pmatrix} 3 \\ 3 \end{pmatrix} = \begin{pmatrix} 6 \\ 6 \end{pmatrix}

Numerically, you calculate the resultant vector with this formula:

c \mathbf{u} = \begin{pmatrix} c u_x \\ c u_y \end{pmatrix}, where c is a constant scalar.

As previously stated, the magnitude is scaled by the absolute value of the constant:

|c \mathbf{u}| = |c|\,|\mathbf{u}|

Since multiplying a vector by a constant results in a vector in the same direction, we can reason that two vectors are parallel if one is a constant multiple of the other -- that is, that \mathbf{u} || \mathbf{v} if \mathbf{u} = c \mathbf{v} for some constant c.

We can also divide by a non-zero scalar by instead multiplying by the reciprocal, as with dividing regular numbers:

\frac{\mathbf{u}}{c} = \frac{1}{c} \mathbf{u}, {c}\ne 0
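The componentwise rules above translate directly into code. This minimal sketch uses plain tuples for 2-D vectors and reproduces the worked example from the text:

```python
# Componentwise vector addition and scalar multiplication (a minimal sketch
# using plain tuples for 2-D vectors).

def add(u, v):
    return (u[0] + v[0], u[1] + v[1])

def scale(c, u):
    return (c * u[0], c * u[1])

resultant = add((4, 6), (1, -3))      # the addition example from the text
stretched = scale(2, (3, 3))          # doubling the magnitude
halved = scale(1 / 2, (6, 6))         # dividing by 2 = multiplying by 1/2
```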


Dot Product

The dot product is a way of multiplying two vectors to produce a scalar value. Because it combines the components of two vectors to form a scalar, it is sometimes called the scalar product. To take the dot product of two rectangular (Cartesian) vectors, you do the following:

\mathbf{u} \cdot \mathbf{v} = u_x v_x + u_y v_y


It is very important to note that the dot product of two vectors does not result in another vector, it gives you a scalar, just a numerical value.

Another common pitfall may arise if your vectors are not in rectangular (Cartesian) format. Sometimes, vectors are instead expressed in polar coordinates, where the first component is the vector's magnitude (length) and the second is the angle from the x-axis at which the vector is oriented. Dot products cannot be performed using the conventional method on these sorts of vectors; vectors in polar format must be converted to their equivalent rectangular form before you can work with them using the formula given above. A common way to convert to rectangular coordinates is to imagine that the vector is projected horizontally and vertically to form a right triangle. You can then use the sine and cosine to find the lengths of the two legs of the right triangle. The horizontal length is the x-component of the rectangular expression of the vector, and the vertical length is the y-component. Remember that if the vector points down or to the left, the corresponding components must be negative to indicate that.

With some rearrangement and trigonometric manipulation, we can see that the number that results from the dot product of two vectors is a surprising and useful identity:

\mathbf{u} \cdot \mathbf{v} = |\mathbf{u}| |\mathbf{v}| \cos \theta

where \theta is the angle between the two vectors. This provides a convenient way of finding the angle between two vectors:

 \cos \theta = \frac{\mathbf{u} \cdot \mathbf{v}}{
|\mathbf{u}| |\mathbf{v}|}

Notice that the dot product is 'commutative', that is:

\mathbf{u} \cdot \mathbf{v} = \mathbf{v} \cdot \mathbf{u}

Also, the dot product of a vector with itself is the length of the vector squared:

\mathbf{u} \cdot \mathbf{u} = u_x u_x + u_y u_y = (u_x)^2+(u_y)^2

and by the Pythagorean theorem,

(u_x)^2+(u_y)^2=|\mathbf{u}|^2

The dot product can be visualized as the length of a projection of one vector on to the other. In other words, the dot product asks 'how much magnitude of this vector is going in the direction of that vector?'
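The dot product identities above can be checked numerically. This sketch (an illustration only) computes the dot product of two vectors, recovers the 45-degree angle between them, and verifies commutativity and the |u|^2 property:

```python
# Dot product of two 2-D vectors and the angle between them (a sketch).
import math

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def magnitude(u):
    return math.hypot(u[0], u[1])

u, v = (1, 0), (1, 1)

# cos(theta) = (u . v) / (|u| |v|)
cos_theta = dot(u, v) / (magnitude(u) * magnitude(v))
theta_degrees = math.degrees(math.acos(cos_theta))

# v . v equals |v|^2, and the dot product is commutative.
norm_squared = dot(v, v)
```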

Applications of Scalar Multiplication and Dot Product

Unit Vectors

A unit vector is a vector with a magnitude of 1. The unit vector of u is a vector in the same direction as \mathbf{u}, but with a magnitude of 1:

Unit vector


The process of finding the unit vector of \mathbf{u} is called normalization. As mentioned in scalar multiplication, multiplying a vector by a constant multiplies its magnitude by that constant; equivalently, dividing a vector by a constant divides its magnitude by that constant. Therefore, dividing \mathbf{u} by its own magnitude results in a unit vector in the same direction as \mathbf{u}:

\mathbf{w}=\frac{\mathbf{u}}{|\mathbf{u}|}, where \mathbf{w} is the unit vector of \mathbf{u}
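Normalization is easy to check in code. The sketch below normalizes the vector (3, 4), whose magnitude is 5, and confirms the result has magnitude 1:

```python
# Normalizing a vector: dividing by its magnitude yields a unit vector in the
# same direction (a minimal sketch).
import math

def normalize(u):
    m = math.hypot(u[0], u[1])
    return (u[0] / m, u[1] / m)

w = normalize((3, 4))          # magnitude of (3, 4) is 5
w_magnitude = math.hypot(w[0], w[1])
```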

Standard Unit Vectors

A special case of Unit Vectors are the Standard Unit Vectors i and j: i points one unit directly right in the x direction, and j points one unit directly up in the y direction:

\mathbf{i} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}

\mathbf{j} = \begin{pmatrix} 0 \\ 1 \end{pmatrix}

Using the scalar multiplication and vector addition rules, we can then express vectors in a different way:

\begin{pmatrix} x \\ y \end{pmatrix} = x \mathbf{i} + y \mathbf{j}

If we work that equation out, it makes sense. Multiplying x by i will result in the vector \begin{pmatrix} x \\ 0 \end{pmatrix}. Multiplying y by j will result in the vector \begin{pmatrix} 0 \\ y \end{pmatrix}. Adding these two together will give us our original vector, \begin{pmatrix} x \\ y \end{pmatrix}. Expressing vectors using i and j is called standard form.

Projection and Decomposition of Vectors

Sometimes it is necessary to decompose a vector  \mathbf{u} into two components: one component parallel to a vector  \mathbf{v}
, which we will call  \mathbf{u}_ \parallel ; and one component perpendicular to it,  \mathbf{u}_ \bot .

Projection of a vector

Since the length of \mathbf{u}_\parallel is |\mathbf{u}| \cos \theta, it is straightforward to write down the formulas for \mathbf{u}_\parallel and \mathbf{u}_\bot:

\mathbf{u}_\parallel = |\mathbf{u}| \cos\theta \, \frac{\mathbf{v}}{|\mathbf{v}|} = |\mathbf{u}| \, \frac{\mathbf{u} \cdot \mathbf{v}}{|\mathbf{u}|\,|\mathbf{v}|} \, \frac{\mathbf{v}}{|\mathbf{v}|} = \frac{\mathbf{u} \cdot \mathbf{v}}{|\mathbf{v}|^2} \, \mathbf{v}

and

 \mathbf{u}_ \bot = \mathbf{u} - \mathbf{u}_ \parallel
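These two formulas can be verified numerically. The sketch below decomposes a vector and checks that the perpendicular component really is perpendicular to \mathbf{v} (dot product zero):

```python
# Decomposing u into components parallel and perpendicular to v (a sketch).

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def project(u, v):
    # u_parallel = ((u . v) / |v|^2) v
    c = dot(u, v) / dot(v, v)
    return (c * v[0], c * v[1])

u, v = (3, 4), (1, 0)
u_par = project(u, v)
u_perp = (u[0] - u_par[0], u[1] - u_par[1])

# u_perp should be perpendicular to v (dot product zero).
check = dot(u_perp, v)
```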

Length of a vector

The length of a vector can be recovered from the dot product of the vector with itself, for which \theta = 0:

\mathbf{u} \cdot \mathbf{u} = |\mathbf{u}||\mathbf{u}| \cos \theta = |\mathbf{u}|^2

Perpendicular vectors

If the angle \theta between two vectors is 90 degrees ({\pi \over 2} radians), that is, the vectors are perpendicular (orthogonal), then the dot product is 0. This provides us with an easy way to find a perpendicular vector: given a vector \mathbf{u} = \begin{pmatrix} u_x \\ u_y \end{pmatrix}, a perpendicular vector is

 \mathbf{v} = \begin{pmatrix} -u_y \\ u_x \end{pmatrix} or its negative, \begin{pmatrix} u_y \\ -u_x \end{pmatrix}


Polar coordinates

Polar coordinates are an alternative two-dimensional coordinate system, which is often useful when rotations are important. Instead of specifying the position along the x and y axes, we specify the distance from the origin, r, and the direction, an angle θ.

Polar coordinates

Looking at this diagram, we can see that the values of x and y are related to those of r and θ by the equations

\begin{matrix} 
x=r \cos \theta & r = \sqrt{x^2+y^2} \\
y=r \sin \theta & \tan \theta = \frac{y}{x} \end{matrix}

Because \tan^{-1} is multivalued, care must be taken to select the right value.
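One standard way to handle the multivalued arctangent in code is the two-argument atan2 function, which picks the correct branch from the signs of x and y. The sketch below (an illustration only) converts a second-quadrant point to polar coordinates and back:

```python
# Converting between Cartesian and polar coordinates (a sketch). math.atan2
# picks the correct branch of the multivalued arctangent automatically.
import math

def to_polar(x, y):
    return math.hypot(x, y), math.atan2(y, x)

def to_cartesian(r, theta):
    return r * math.cos(theta), r * math.sin(theta)

# A point in the second quadrant, where a naive arctan(y/x) gives the wrong angle.
r, theta = to_polar(-1, 1)
x, y = to_cartesian(r, theta)
```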

Just as for Cartesian coordinates the unit vectors that point in the x and y directions are special, so in polar coordinates the unit vectors that point in the r and θ directions are also special.

We will call these vectors \hat{\mathbf{r}} and \hat{\boldsymbol{\theta}}, pronounced r-hat and theta-hat. Putting a circumflex over a vector this way is often used to mean the unit vector in that direction.

Again, on looking at the diagram we see,
\begin{matrix} 
\mathbf{i}= \hat{\mathbf{r}} \cos \theta  
-\hat{\boldsymbol{\theta}} \sin \theta  &
\hat{\mathbf{r}} = \frac{x}{r}\mathbf{i}+ \frac{y}{r}\mathbf{j} \\
\mathbf{j}= \hat{\mathbf{r}} \sin \theta  
+\hat{\boldsymbol{\theta}} \cos \theta & 
\hat{\boldsymbol{\theta}} =  -\frac{y}{r}\mathbf{i} + \frac{x}{r}\mathbf{j}
\end{matrix}

Three-Dimensional Coordinates and Vectors

Basic definition

Two-dimensional Cartesian coordinates as we've discussed so far can be easily extended to three-dimensions by adding one more value: 'z'. If the standard (x,y) coordinate axes are drawn on a sheet of paper, the 'z' axis would extend upwards off of the paper.

3D coordinate axes.

Similar to the two coordinate axes in two-dimensional coordinates, there are three coordinate planes in space. These are the xy-plane, the yz-plane, and the xz-plane. Each plane is the "sheet of paper" that contains both axes the name mentions. For instance, the yz-plane contains both the y and z axes and is perpendicular to the x axis.

center|Coordinate planes in space.

Therefore, vectors can be extended to three dimensions by simply adding the 'z' value.

\mathbf{u} = \begin{pmatrix} x \\ y \\ z \end{pmatrix}

To facilitate standard form notation, we add another standard unit vector:

\mathbf{k} = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}

Again, both forms (component and standard) are equivalent.

\begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} = 1 \mathbf{i} + 2 \mathbf{j} + 3 \mathbf{k}

Magnitude: Magnitude in three dimensions is the same as in two dimensions, with the addition of a 'z' term in the radicand.

| \mathbf{u} | = \sqrt{u_x^2 + u_y^2 + u_z^2}

Three dimensions

The polar coordinate system is extended into three dimensions with two different coordinate systems, the cylindrical and spherical coordinate systems, both of which include two-dimensional or planar polar coordinates as a subset. In essence, the cylindrical coordinate system extends polar coordinates by adding an additional distance coordinate, while the spherical system instead adds an additional angular coordinate.

Cylindrical coordinates

a point plotted with cylindrical coordinates

The cylindrical coordinate system is a coordinate system that essentially extends the two-dimensional polar coordinate system by adding a third coordinate measuring the height of a point above the plane, similar to the way in which the Cartesian coordinate system is extended into three dimensions. The third coordinate is usually denoted h, making the three cylindrical coordinates (r, θ, h).

The three cylindrical coordinates can be converted to Cartesian coordinates by

 \begin{align}
x &= r \, \cos\theta \\
y &= r \, \sin\theta \\
z &= h.
\end{align}

Spherical coordinates

A point plotted using spherical coordinates

Polar coordinates can also be extended into three dimensions using the coordinates (ρ, φ, θ), where ρ is the distance from the origin, φ is the angle from the z-axis (called the colatitude or zenith and measured from 0 to 180°) and θ is the angle from the x-axis (as in the polar coordinates). This coordinate system, called the spherical coordinate system, is similar to the latitude and longitude system used for Earth, with the origin in the centre of Earth, the latitude δ being the complement of φ, determined by δ = 90° − φ, and the longitude l being measured by l = θ − 180°.

The three spherical coordinates are converted to Cartesian coordinates by

 \begin{align}
x &= \rho \, \sin\phi \, \cos\theta \\
y &= \rho \, \sin\phi \, \sin\theta \\
z &= \rho \, \cos\phi.
\end{align}


Conversely, Cartesian coordinates are converted to spherical coordinates by

\rho = \sqrt{x^2 + y^2 + z^2},
\theta = \arctan{\frac{y}{x}},
\phi = \arccos{\frac{z}{\rho}}
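A round trip through these conversion formulas is a good consistency check. This sketch (an illustration only, using atan2 for the multivalued arctangent) converts spherical coordinates to Cartesian and back:

```python
# Round trip between spherical and Cartesian coordinates (a sketch).
import math

def spherical_to_cartesian(rho, phi, theta):
    # phi measured from the z-axis, theta from the x-axis, as in the text
    return (rho * math.sin(phi) * math.cos(theta),
            rho * math.sin(phi) * math.sin(theta),
            rho * math.cos(phi))

def cartesian_to_spherical(x, y, z):
    rho = math.sqrt(x**2 + y**2 + z**2)
    return rho, math.acos(z / rho), math.atan2(y, x)

x, y, z = spherical_to_cartesian(2.0, math.pi / 3, math.pi / 4)
rho, phi, theta = cartesian_to_spherical(x, y, z)
```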

Cross Product

The cross product of two vectors is a determinant:

 \mathbf{u} \times \mathbf{v} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ u_x & u_y & u_z \\ v_x & v_y & v_z \end{vmatrix}

and is also a pseudovector.

The cross product of two vectors is orthogonal to both vectors. The magnitude of the cross product is the product of the magnitudes of the vectors and the sine of the angle between them.

 |\mathbf{u} \times \mathbf{v}|=|\mathbf{u}||\mathbf{v}| \sin \theta

This magnitude is the area of the parallelogram defined by the two vectors.

The cross product is linear and anticommutative. For any numbers a and b,

\mathbf{u} \times \left( a\mathbf{v} +b\mathbf{w} \right) =
a \mathbf{u} \times  \mathbf{v} +b\mathbf{u} \times \mathbf{w} 
\quad \mathbf{u} \times  \mathbf{v} = - \mathbf{v} \times  \mathbf{u}

If the two vectors are parallel (pointing in the same or opposite directions), their cross product is the zero vector.
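Expanding the determinant gives an explicit componentwise formula, sketched below with checks of orthogonality, anticommutativity, and the parallel-vectors case:

```python
# Cross product of 3-D vectors via the determinant formula (a sketch).

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

u, v = (1, 2, 3), (4, 5, 6)
w = cross(u, v)

# w is orthogonal to both u and v, and u x v = -(v x u).
anticommuted = cross(v, u)

# Parallel vectors give the zero vector.
parallel = cross(u, (2, 4, 6))
```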

Triple Products

If we have three vectors we can combine them in two ways, a triple scalar product,

 \mathbf{u} \cdot (\mathbf{v} \times \mathbf{w})

and a triple vector product

 \mathbf{u} \times (\mathbf{v} \times \mathbf{w})

The triple scalar product is a determinant

 \mathbf{u} \cdot(\mathbf{v} \times \mathbf{w}) = \begin{vmatrix} 
u_x & u_y & u_z \\ v_x & v_y & v_z \\ w_x & w_y & w_z \end{vmatrix}

If the three vectors are listed clockwise, looking from the origin, the sign of this product is positive. If they are listed anticlockwise the sign is negative.

The order of the cross and dot products doesn't matter.

 
\mathbf{u} \cdot (\mathbf{v} \times \mathbf{w}) = 
(\mathbf{u} \times \mathbf{v}) \cdot \mathbf{w}

Either way, the absolute value of this product is the volume of the parallelepiped defined by the three vectors, u, v, and w.

The triple vector product can be simplified

 \mathbf{u} \times (\mathbf{v} \times \mathbf{w})
=(\mathbf{u} \cdot \mathbf{w}) \mathbf{v} 
-(\mathbf{u} \cdot \mathbf{v}) \mathbf{w}

This form is easier to do calculations with.

The triple vector product is not associative.

 \mathbf{u} \times (\mathbf{v} \times \mathbf{w})
\ne (\mathbf{u} \times \mathbf{v}) \times \mathbf{w}

There are special cases where the two sides are equal, but in general the brackets matter. They must not be omitted.
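The two triple-product identities above are easy to spot-check numerically. This sketch verifies, for one concrete choice of vectors, that the dot and cross may be interchanged in the scalar triple product, and that the "BAC minus CAB" expansion of the vector triple product holds:

```python
# Triple products (a sketch): interchanging dot and cross in the scalar
# triple product, and the (u.w)v - (u.v)w expansion of the vector triple.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

u, v, w = (1, 2, 3), (0, 1, 4), (2, 0, 1)

# u . (v x w) = (u x v) . w
scalar_triple = dot(u, cross(v, w))
swapped = dot(cross(u, v), w)

# u x (v x w) = (u . w) v - (u . v) w
lhs = cross(u, cross(v, w))
rhs = tuple(dot(u, w) * vi - dot(u, v) * wi for vi, wi in zip(v, w))
```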

Three-Dimensional Lines and Planes

We will use r to denote the position of a point.

The multiples of a vector \mathbf{a} all lie on a line through the origin. Adding a constant vector \mathbf{b} will shift the line, but leave it straight, so the equation of a line is,

\mathbf{r}=\mathbf{a}s+\mathbf{b}

This is a parametric equation. The position is specified in terms of the parameter s.

Any linear combination of two vectors \mathbf{a} and \mathbf{b} lies on a single plane through the origin, provided the two vectors are not collinear. We can shift this plane by a constant vector again and write

\mathbf{r}=\mathbf{a}s+\mathbf{b}t+\mathbf{c}

If we choose a and b to be orthonormal vectors in the plane (i.e. unit vectors at right angles) then s and t are Cartesian coordinates for points in the plane.

These parametric equations can be extended to higher dimensions.

Instead of giving parametric equations for the line and plane, we could use constraints; e.g., every point in the xy-plane satisfies z=0.

For a plane through the origin, the single vector normal to the plane, n, is at right angles to every vector in the plane, by definition, so

\mathbf{r} \cdot \mathbf{n}=0

is a plane through the origin, normal to n.

For planes not through the origin we get

(\mathbf{r}- \mathbf{a}) \cdot \mathbf{n}=0
\quad \mathbf{r} \cdot \mathbf{n}=\mathbf{a} \cdot \mathbf{n}

A line lies on the intersection of two planes, so it must obey the constraint for both planes, i.e.

\mathbf{r} \cdot \mathbf{n}=a \quad \mathbf{r} \cdot \mathbf{m}=b

These constraint equations can also be extended to higher dimensions.

Vector-Valued Functions

Vector-Valued Functions are functions that instead of giving a resultant scalar value, give a resultant vector value. These aid in the creation of direction and vector fields, and are therefore used in physics to aid with visualizations of electric, magnetic, and many other types of fields. They are of the following form:

\mathbf{F}(t) = \begin{pmatrix} a_1(t) \\ a_2(t) \\ \vdots \\ a_n(t) \end{pmatrix}

Introduction

Limits, Derivatives, and Integrals

Put simply, the limit of a vector-valued function is the limit of its parts.

Proof:

Suppose \lim_{t \rightarrow c} \mathbf{F}(t) = \mathbf{L} = \begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{pmatrix}

Therefore for any \epsilon > 0 there is a \phi > 0 such that

0 <  |t-c| <  \phi \implies |\mathbf{F}(t) - \mathbf{L}| <  \epsilon

But by the triangle inequality, |a_1| \le |\mathbf{F}| \le |a_1| + |a_2| + \cdots + |a_n|, so in particular

|a_1(t) - a_1| \le |\mathbf{F}(t) - \mathbf{L}|

So

0 < |t-c| < \phi  \implies  |a_1(t) - a_1| < \epsilon

Therefore \lim_{t \rightarrow c} a_1(t) = a_1. A similar argument works for each of the remaining components a_2(t), \ldots, a_n(t).

Conversely, suppose each component satisfies \lim_{t \rightarrow c} a_k(t) = a_k. Then for any \epsilon > 0 there is a corresponding \phi > 0 such that 0 < |t-c| < \phi implies

|a_k(t) - a_k| < \frac{\epsilon}{n} \quad \text{for each } k = 1, \ldots, n

Then, by the triangle inequality,

0 < |t-c| < \phi \implies |\mathbf{F}(t) - \mathbf{L}| \le \frac{\epsilon}{n} + \cdots + \frac{\epsilon}{n} = \epsilon

Therefore:

 \lim_{t \rightarrow c} \mathbf{F}(t) = \mathbf{L} = \begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{pmatrix} = \begin{pmatrix} \lim_{t \rightarrow c} a_1(t) \\ \lim_{t \rightarrow c} a_2(t) \\ \vdots \\ \lim_{t \rightarrow c} a_n(t) \end{pmatrix}

From this we can then create an accurate definition of a derivative of a vector-valued function:


\mathbf{F}'(t) = \lim_{h \rightarrow 0} \frac{\mathbf{F}(t+h) - \mathbf{F}(t)}{h}
= \lim_{h \rightarrow 0} \frac{1}{h}\left[\begin{pmatrix} a_1(t+h) \\ a_2(t+h) \\ \vdots \\ a_n(t+h) \end{pmatrix} - \begin{pmatrix} a_1(t) \\ a_2(t) \\ \vdots \\ a_n(t) \end{pmatrix}\right]
= \begin{pmatrix} \lim_{h \rightarrow 0} \frac{a_1(t+h) - a_1(t)}{h} \\ \vdots \\ \lim_{h \rightarrow 0} \frac{a_n(t+h) - a_n(t)}{h} \end{pmatrix}
= \begin{pmatrix} a_1'(t) \\ a_2'(t) \\ \vdots \\ a_n'(t) \end{pmatrix}

The final step was accomplished by taking what we just did with limits.

By the Fundamental Theorem of Calculus integrals can be applied to the vector's components.

In other words: the limit of a vector function is the limit of its parts, the derivative of a vector function is the derivative of its parts, and the integral of a vector function is the integral of its parts.

Velocity, Acceleration, Curvature, and a brief mention of the Binormal

Assume we have a vector-valued function which starts at the origin; as its independent variable changes, the points at which the vectors terminate trace out a path.

We will call this vector \mathbf{r}(t), which is commonly known as the position vector.

If \mathbf{r} represents a position and t represents time, then, modeling with physics, we know the following:

  • \mathbf{r}(t+h) - \mathbf{r}(t) is displacement.
  • \mathbf{r}'(t) = \mathbf{v}(t), where \mathbf{v}(t) is the velocity vector.
  • |\mathbf{v}(t)| is the speed.
  • \mathbf{r}''(t) = \mathbf{v}'(t) = \mathbf{a}(t), where \mathbf{a}(t) is the acceleration vector.

The only other vector that comes in use at times is known as the curvature vector.

The vector \mathbf{T}(t) used to find it is known as the unit tangent vector, which is defined as \frac{\mathbf{v}(t)}{|\mathbf{v}(t)|} or shorthand \mathbf{\hat v}.

The vector normal \mathbf{N} to this then is \frac{\mathbf{T}'(t)}{|\mathbf{v}(t)|}.

We can verify this by taking the dot product

\mathbf{T} \cdot \mathbf{N} = 0

Also note that |\mathbf{v}(t)| = {ds \over dt}

and

\mathbf{T}(t) = \frac{v}{|v|} = \frac{\frac{dr}{dt}}{\frac{ds}{dt}} = {dr \over ds}

and

\mathbf{N} = \frac{\mathbf{T}'(t)}{|\mathbf{v}(t)|} = \frac{\frac{dT}{dt}}{\frac{ds}{dt}} = {dT \over ds}

Then we can actually verify:

{d \over ds} ( \mathbf{T} \cdot \mathbf{T} )= {d \over ds} (1)

 {dT \over ds} \cdot \mathbf{T} + \mathbf{T} \cdot {dT \over ds} = 0

2 * \mathbf{T} \cdot {dT \over ds} = 0

 \mathbf{T} \cdot {dT \over ds} = 0

 \mathbf{T} \cdot \mathbf{N} = 0

Therefore \mathbf{N} is perpendicular to \mathbf{T}

What this gives rise to is the Unit Normal Vector \frac{\frac{dT}{ds}}{|\frac{dT}{ds}|}, whose numerator is the Normal vector; the magnitude |\frac{dT}{ds}| is known as the curvature, and its reciprocal is the radius of curvature. Since the Normal vector points toward the inside of a curve, the sharper a turn, the larger the magnitude of \frac{dT}{ds}, so the larger the curvature and the smaller the radius of curvature, which is used as an index in civil engineering to reflect the sharpness of a curve (clover-leaf highways, for instance).

The only other thing not mentioned is the Binormal that occurs in 3-d curves \mathbf{T} \times \mathbf{N} = \mathbf{B}, which is useful in creating planes parallel to the curve.

Lines and Planes in Space


Introduction

For many practical applications, for example for describing forces in physics and mechanics, you have to work with the mathematical descriptions of lines and planes in 3-dimensional space.

Parametric Equations

Line in Space

A line in space is defined by two points in space, which I will call P_1 and P_2. Let \mathbf{x}_1 be the vector from the origin to P_1, and \mathbf{x}_2 the vector from the origin to P_2. Given these two points, every other point P on the line can be reached by

 \mathbf{x} = \mathbf{x}_1 + \lambda \mathbf{a}

where  \mathbf{a} is the vector from P_1 to P_2:

 \mathbf{a} = \mathbf{x}_2 - \mathbf{x}_1

Line in 3D Space.

Plane in Space

The same idea can be used to describe a plane in 3-dimensional space, which is uniquely defined by three points (which do not lie on a line) in space (P_1, P_2, P_3). Let \mathbf{x}_i be the vectors from the origin to P_i. Then


\mathbf{x} = \mathbf{x}_1 + \lambda \mathbf{a} + \mu \mathbf{b}

with:


\mathbf{a} = \mathbf{x}_2 - \mathbf{x}_1 \,\, \text{and} \,\, \mathbf{b} = \mathbf{x}_3 - \mathbf{x}_1

Note that the starting point does not have to be  \mathbf{x}_1 , but can be any point in the plane. Similarly, the only requirement on the vectors  \mathbf{a} and  \mathbf{b} is that they have to be two non-collinear vectors in our plane.

Vector Equation (of a Plane in Space, or of a Line in a Plane)

An alternative representation of a Plane in Space is obtained by observing that a plane is defined by a point P_1 in that plane and a direction perpendicular to the plane, which we denote with the vector \mathbf{n}. As above, let \mathbf{x}_1 describe the vector from the origin to P_1, and \mathbf{x} the vector from the origin to another point P in the plane. Since any vector that lies in the plane is perpendicular to \mathbf{n}, the vector equation of the plane is given by


\mathbf{n} \cdot (\mathbf{x} - \mathbf{x}_1) = 0

In 2 dimensions, the same equation uniquely describes a Line.

Scalar Equation (of a Plane in Space, or of a Line in a Plane)

If we express \mathbf{n} and \mathbf{x} through their components


\mathbf{n} = \left( {\begin{array}{*{20}c}
   a  \\
   b  \\
   c  \\
\end{array}} \right),\,\,\text{and}\,\,
\mathbf{x} = \left( {\begin{array}{*{20}c}
   x  \\
   y  \\
   z  \\
\end{array}} \right),

writing out the scalar product for 
\mathbf{n} \cdot (\mathbf{x} - \mathbf{x}_1) = 0
provides us with the scalar equation for a plane in space:


ax+by+cz=d

where  d = \mathbf{n} \cdot \mathbf{x}_1 .

In 2d space, the equivalent steps lead to the scalar equation for a line in a plane:


ax+by=c
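Putting the pieces of this chapter together, the scalar equation of a plane can be built from three points: two in-plane vectors, their cross product as the normal \mathbf{n}, then d = \mathbf{n} \cdot \mathbf{x}_1. The sketch below (an illustration with arbitrarily chosen points) checks that all three defining points satisfy the resulting equation ax+by+cz=d:

```python
# Building the scalar equation of a plane from three points (a sketch):
# two in-plane vectors, their cross product as the normal n, then d = n . x1.

def sub(p, q):
    return tuple(pi - qi for pi, qi in zip(p, q))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

p1, p2, p3 = (1, 0, 0), (0, 1, 0), (0, 0, 1)

n = cross(sub(p2, p1), sub(p3, p1))   # normal to the plane
d = dot(n, p1)                        # so the plane is ax + by + cz = d

# Every one of the defining points satisfies the equation.
residuals = [dot(n, p) - d for p in (p1, p2, p3)]
```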

Multivariable & Differential Calculus


In our previous study of calculus, we have looked at functions and their behavior. Most of the functions we have examined have been of the form

f : R → R,

with only occasional examination of functions of two variables. However, the study of functions of several variables is quite rich in itself, and has applications in several fields.

We write functions of vectors - many variables - as follows:

f : Rm → Rn

and f(x) for the function that maps a vector in Rm to a vector in Rn.

Before we can do calculus in Rn, we must familiarize ourselves with the structure of Rn. We need to know which properties of R can be extended to Rn.

Topology in Rn

We are already familiar with the nature of the regular real number line, which is the set R, and the two-dimensional plane, R2. This examination of topology in Rn attempts to look at a generalization of the nature of n-dimensional spaces: R, or R2, or R3, up to Rn.

Lengths and distances

If we have a vector in R2, we can calculate its length using the Pythagorean theorem. For instance, the length of the vector (2, 3) is

\sqrt{2^2+3^2}=\sqrt{13}

We can generalize this to Rn. We define a vector's length, written |x|, as the square root of the sum of the squares of each of its components. That is, if we have a vector x=(x1,...,xn),

|\mathbf{x}| = \sqrt{x_1^2+x_2^2+\cdots+x_n^2}

Now that we have established some concept of length, we can establish the distance between two vectors. We define this distance to be the length of the two vectors' difference. We write this distance d(x, y), and it is

d(\mathbf{x}, \mathbf{y}) = |\mathbf{x} - \mathbf{y}| = \sqrt{\sum{(x_i-y_i)^2}}

This distance function is sometimes referred to as a metric. Other metrics arise in different circumstances. The metric we have just defined is known as the Euclidean metric.
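The definitions above carry over directly to code for any dimension. This sketch computes the Euclidean length and the distance between two vectors as the length of their difference:

```python
# The Euclidean metric in R^n (a sketch): the distance between two vectors is
# the length of their difference.
import math

def length(x):
    return math.sqrt(sum(xi**2 for xi in x))

def distance(x, y):
    return length(tuple(xi - yi for xi, yi in zip(x, y)))

d = distance((2, 3), (0, 0))          # length of (2, 3) is sqrt(13)
d3 = distance((1, 2, 2), (0, 0, 0))   # works in any dimension: sqrt(9) = 3
```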

Open and closed balls

In R, we have the concept of an interval, in that we choose a certain number of other points about some central point. For example, the interval [-1, 1] is centered about the point 0, and includes points to the left and right of zero.

In R2 and up, the idea is a little more difficult to carry on. For R2, we need to consider points to the left, right, above, and below a certain point. This may be fine, but for R3 we need to include points in more directions.

We generalize the idea of the interval by considering all the points that are a given, fixed distance from a certain point - now we know how to calculate distances in Rn, we can make our generalization as follows, by introducing the concept of an open ball and a closed ball respectively, which are analogous to the open and closed interval respectively.

an open ball
B(\mathbf{a}, r)
is a set in the form { x ∈ Rn | d(x, a) < r }
a closed ball
\overline{B}(\mathbf{a}, r)
is a set of the form {x ∈ Rn | d(x, a) ≤ r}

In R, we have seen that the open ball is simply an open interval centered about the point x=a. In R2 this is a circle with no boundary, and in R3 it is a sphere with no outer surface. (What would the closed ball be?)


Boundary points

If we have some area, say a field, then the common sense notion of the boundary is the points 'next to' both the inside and outside of the field. For a set, S, we can define this rigorously by saying the boundary of the set contains all those points such that we can find points both inside and outside the set arbitrarily close by. We call the set of such points ∂S.

Typically, when it exists the dimension of ∂S is one lower than the dimension of S. e.g. the boundary of a volume is a surface and the boundary of a surface is a curve.

This isn't always true; but it is true of all the sets we will be using.


A set S is bounded if there is some positive number such that we can encompass this set by a closed ball about 0. That is, every point in S is within a finite distance of the origin: there exists some r>0 such that x in S implies |x|<r.

Curves and parameterizations

If we have a function f : R → Rn, we say that f's image (the set {f(t) | t ∈ R}, or the image of some subset of R) is a curve in Rn and f is its parametrization.

Parameterizations are not necessarily unique - for example, f(t) = (cos t, sin t) with t ∈ [0, 2π) is one parametrization of the unit circle, and g(t) = (cos at, sin at) with t ∈ [0, 2π/a) gives a whole family of parameterizations of that circle, one for each a > 0.

Collision and intersection points

Say we have two different curves. It may be important to consider

  • points the two curves share - where they intersect
  • intersections which occur for the same value of t - where they collide.

Intersection points

Firstly, suppose we have two parameterizations f(t) and g(s), and we want to find out where the curves intersect. This means we want to know where the function values of the two parameterizations are the same, so we need to solve

f(t) = g(s)

because we're seeking the function values independent of the times they intersect.

For example, if we have f(t) = (t, 3t) and g(t) = (t, t2), and we want to find intersection points:

f(t) = g(s)
(t, 3t) = (s, s2)
t = s and 3t = s2

with solutions (t, s) = (0, 0) and (3, 3)

So the two curves intersect at the points f(0) = g(0) = (0, 0) and f(3) = g(3) = (3, 9).
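The substitution above can be checked with a few lines of code (using the f and g from the example). Since the first components force t = s, it is enough to test the candidate parameter values:

```python
def f(t):
    # f(t) = (t, 3t)
    return (t, 3 * t)

def g(s):
    # g(s) = (s, s^2)
    return (s, s * s)

# The first components force t = s, so 3s = s^2, i.e. s = 0 or s = 3.
candidates = (0.0, 3.0)
points = [f(s) for s in candidates if f(s) == g(s)]
print(points)  # [(0.0, 0.0), (3.0, 9.0)]
```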

Collision points

However, if we want to know when the points "collide", with f(t) and g(t), we need to know when both the function values and the times are the same, so we need to solve instead

f(t) = g(t)

For example, using the same functions as before, f(t) = (t, 3t) and g(t) = (t, t2), and we want to find collision points:

f(t) = g(t)
(t, 3t) = (t, t2)
t = t and 3t = t2

which gives solutions t = 0 and t = 3, so the collision points are f(0) = (0, 0) and f(3) = (3, 9).

We may want to do this to actually model physical problems, such as in ballistics.

Continuity and differentiability

If we have a parametrization f : RRn, which is built up out of component functions in the form f(t) = (f1(t),...,fn(t)), f is continuous if and only if each component function is also.

In this case the derivative of f(t) is

f′(t) = (f1′(t),...,fn′(t)). This is actually a specific consequence of a more general fact we will see later.


Tangent vectors

Recall in single-variable calculus that on a curve, at a certain point, we can draw a line that is tangent to that curve exactly at that point. This line is called a tangent. In the several variable case, we can do something similar.

We can expect the tangent vector to depend on f′(t) and we know that a line is its own tangent, so looking at a parametrised line will show us precisely how to define the tangent vector for a curve.

An arbitrary line is f(t)=at+b, with fi(t)=ait+bi, so

fi′(t)=ai and
f′(t)=a, which is the direction of the line, its tangent vector.

Similarly, for any curve, the tangent vector is f′(t).



Angle between curves

Inner-product-angle.png

We can then formulate the concept of the angle between two curves by considering the angle between the two tangent vectors. If two curves, parametrized by f1 and f2, intersect at some point, which means that

f1(s)=f2(t)=c,

the angle between these two curves at c is the angle between the tangent vectors f1′(s) and f2′(t), which is given by

\arccos{\mathbf{f}_1^\prime(s)\cdot\mathbf{f}_2^\prime(t) \over |\mathbf{f}_1^\prime(s)||\mathbf{f}_2^\prime(t)|}
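The arccos formula is straightforward to evaluate numerically. The sketch below is our own illustration (the helper name angle is not from the text):

```python
import math

def angle(u, v):
    # arccos( (u . v) / (|u| |v|) )
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return math.acos(dot / (norm_u * norm_v))

# Tangent vectors (1, 0) and (1, 1) meet at 45 degrees (pi/4).
print(math.degrees(angle((1, 0), (1, 1))))
```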

Tangent lines

With the tangent vector playing a role analogous to that of the gradient of a line in the one variable case, we can form the idea of the tangent line. Recall that we need a point on the line and its direction.

If we want to form the tangent line to a point on the curve, say p, we have the direction of the line f′(p), so we can form the tangent line

x(t)=p+t f′(p)


Different parameterizations

A parametrization of a curve is not necessarily unique; curves can have several different parametrizations. For example, we already saw that the unit circle can be parametrized by g(t) = (cos at, sin at) with t ∈ [0, 2π/a).

Generally, if f is one parametrization of a curve, and g is another, with

f(t0) = g(s0)

there is a function u(t) such that u(t0)=s0, and g(u(t)) = f(t) near t0.

This means, in a sense, the function u(t) "speeds up" the curve, but keeps the curve's shape.

Surfaces

A surface in space can be described by the image of a function f : R2 → Rn. f is said to be the parametrization of that surface.

For example, consider the function

f(α, β) = α(2,1,3)+β(-1,2,0)

This describes an infinite plane in R3. If we restrict α and β to some domain, we get a parallelogram-shaped surface in R3.

Surfaces can also be described explicitly, as the graph of a function z = f(x, y), which has the standard parametrization f(x,y)=(x, y, f(x,y)), or implicitly, in the form f(x, y, z)=c.

Level sets

The concept of the level set (or contour) is an important one. If you have a function f(x, y, z), a level set in R3 is a set of the form {(x,y,z)|f(x,y,z)=c}. Each of these level sets is a surface.

Level sets can be similarly defined in any Rn.

Level sets in two dimensions may be familiar from maps, or weather charts. Each line represents a level set. For example, on a map, each contour represents all the points where the height is the same. On a weather chart, the contours represent all the points where the air pressure is the same.


Limits and continuity

Before we can look at derivatives of multivariate functions, we need to look at how limits work with functions of several variables first, just like in the single variable case.

If we have a function f : Rm → Rn, we say that f(x) approaches b (in Rn) as x approaches a (in Rm) if, for all positive ε, there is a corresponding positive number δ such that |f(x)-b| < ε whenever |x-a| < δ, with x ≠ a.

This means that by making the difference between x and a smaller, we can make the difference between f(x) and b as small as we want.

If the above is true, we say

  • f(x) has limit b at a
  • \lim_{\mathbf{x}\rightarrow\mathbf{a}} \mathbf{f}(\mathbf{x}) = \mathbf{b}
  • f(x) approaches b as x approaches a
  • f(x) → b as x → a

These four statements are all equivalent.

Rules

Since this is an almost identical formulation of limits in the single variable case, many of the limit rules in the one variable case are the same as in the multivariate case.

For f and g, mapping Rm to Rn, and h(x) a scalar function mapping Rm to R, with

  • f(x) → b as x → a
  • g(x) → c as x → a
  • h(x) → H as x → a

then:

  • \lim_{\mathbf{x}\rightarrow\mathbf{a}} (\mathbf{f} + \mathbf{g}) = \mathbf{b} + \mathbf{c}
  • \lim_{\mathbf{x}\rightarrow\mathbf{a}} (h\mathbf{f}) = H\mathbf{b}

and consequently

  • \lim_{\mathbf{x}\rightarrow\mathbf{a}} (\mathbf{f}\cdot\mathbf{g}) = \mathbf{b}\cdot\mathbf{c}
  • \lim_{\mathbf{x}\rightarrow\mathbf{a}} (\mathbf{f}\times\mathbf{g}) = \mathbf{b}\times\mathbf{c}

when H≠0

  • \lim_{\mathbf{x}\rightarrow\mathbf{a}} ({\mathbf{f}\over h}) = {\mathbf{b}\over H}

Continuity

Again, we can use a similar definition to the one variable case to formulate a definition of continuity for multiple variables.

If f : RmRn, f is continuous at a point a in Rm if f(a) is defined and

\lim_{\mathbf{x}\rightarrow\mathbf{a}} \mathbf{f}(\mathbf{x}) = \mathbf{f}(\mathbf{a})

Just as for functions of one variable, if f, g are both continuous at p, then f+g, λf (for a scalar λ), f·g, and f×g are continuous also. If φ : Rm → R is continuous at p, then φf and f/φ are too, the latter provided φ is never zero.

From these facts we also have that if A is some n×m matrix and x is in Rm, then the function f(x)=Ax is continuous, since it can be expanded in the form x1a1+...+xmam, where the ai are the columns of A; continuity then follows easily from the points above.

A function f : Rm → Rn of the form f(x) = (f1(x),...,fn(x)) is continuous if and only if each of its component functions is continuous. In particular, component functions built from polynomials or rational functions are continuous wherever they are defined.

Finally, if f is continuous at p and g is continuous at f(p), then g(f(x)) is continuous at p.

Special note about limits

It is important to note that we can approach a point in more than one direction, and thus, the direction that we approach that point counts in our evaluation of the limit. It may be the case that a limit may exist moving in one direction, but not in another.

Differentiable functions

We will start from the one-variable definition of the derivative at a point p, namely

\lim_{x\rightarrow p} {f(x)-f(p) \over x-p} = f'(p)

Let's change the above to the equivalent form

\lim_{x\rightarrow p} {f(x)-f(p)-f'(p)(x-p) \over x-p} = 0

which is obtained by pulling f'(p) inside the limit and putting everything over a common denominator.

We can't divide by vectors, so this definition can't be immediately extended to the multiple variable case. Nonetheless, we don't have to: the quantity we really took interest in was the quotient of two small distances (magnitudes), not their other properties (like sign). The property of a vector that gets neglected this way is its direction. Since we can divide by the absolute value of a vector, let's rewrite this definition in terms of absolute values

\lim_{x\rightarrow p} \frac{\left|f(x)-f(p)-f'(p)(x-p)\right|}{\left|  x-p\right|} = 0

Another form of the formula above is obtained by letting h=x-p. Then x=p+h, and as x → p we have h = x - p → 0, so

\lim_{h\rightarrow 0} \frac{\left|f(p+h)-f(p)-f'(p)h\right|}{\left| h\right|} = 0,

where h can be thought of as a 'small change'.

So, how can we use this for the several-variable case?

If we switch all the variables over to vectors and replace the constant (which performs a linear map in one dimension) with a matrix (which also denotes a linear map), we have

\lim_{\mathbf{x}\rightarrow\mathbf{p}} {|\mathbf{f}(\mathbf{x})-\mathbf{f}(\mathbf{p})-\mathbf{A}(\mathbf{x}-\mathbf{p})| \over |\mathbf{x}-\mathbf{p}|} = 0

or

\lim_{\mathbf{h}\rightarrow\mathbf{0}} {|\mathbf{f}(\mathbf{p}+\mathbf{h})-\mathbf{f}(\mathbf{p})-\mathbf{A}\mathbf{h}| \over |\mathbf{h}|} = 0

If this limit exists for some f : Rm → Rn and some linear map A : Rm → Rn (denoted by a matrix A which is n×m), we refer to this map as being the derivative and we write it as Dp f.

A point on terminology - in referring to the action of taking the derivative (giving the linear map A), we write Dp f, but in referring to the matrix A itself, it is known as the Jacobian matrix and is also written Jp f. More on the Jacobian later.
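Although the Jacobian is defined through the limit above, it can be approximated numerically one column (one partial derivative) at a time. This is an illustrative finite-difference sketch, not the book's method; the helper name jacobian is our own:

```python
def jacobian(f, p, h=1e-6):
    # Approximate the n-by-m Jacobian of f : R^m -> R^n at p,
    # one column at a time, using central differences.
    m, n = len(p), len(f(p))
    J = [[0.0] * m for _ in range(n)]
    for j in range(m):
        fwd = list(p); fwd[j] += h
        bck = list(p); bck[j] -= h
        f_fwd, f_bck = f(fwd), f(bck)
        for i in range(n):
            J[i][j] = (f_fwd[i] - f_bck[i]) / (2 * h)
    return J

# f(x, y) = (xy, x + y) has Jacobian [[y, x], [1, 1]];
# at (2, 3) that is [[3, 2], [1, 1]].
J = jacobian(lambda v: (v[0] * v[1], v[0] + v[1]), [2.0, 3.0])
print(J)
```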

Properties

There are a number of important properties of this formulation of the derivative.

Affine approximations

If f is differentiable at p for x close to p, |f(x)-(f(p)+A(x-p))| is small compared to |x-p|, which means that f(x) is approximately equal to f(p)+A(x-p).

We call an expression of the form g(x)+c affine, when g(x) is linear and c is a constant. f(p)+A(x-p) is an affine approximation to f(x).

Jacobian matrix and partial derivatives

The Jacobian matrix of a function is in the form

\left(J_\mathbf{p} \mathbf{f}\right)_{ij} = \left.{\partial f_i \over \partial x_j}\right|_\mathbf{p}

for f : Rm → Rn, Jp f is an n×m matrix.

The consequence of this is that if f is differentiable at p, all the partial derivatives of f exist at p.

However, it is possible for all the partial derivatives of a function to exist at some point and yet for the function not to be differentiable there, so it is very important not to confuse the derivative (a linear map) with the Jacobian (a matrix), especially in situations like the one just cited.

Continuity and differentiability

Furthermore, if all the partial derivatives exist, and are continuous in some neighbourhood of a point p, then f is differentiable at p. This has the consequence that a function f whose component functions are built from functions with continuous partial derivatives (such as polynomials and rational functions) is differentiable everywhere it is defined.

We use the terminology continuously differentiable for a function all of whose partial derivatives exist and are continuous in some neighbourhood of p; such a function is differentiable at p.

Rules of taking Jacobians

If f, g : Rm → Rn and h : Rm → R are differentiable at p:

  • J_\mathbf{p} (\mathbf{f}+\mathbf{g}) = J_\mathbf{p} \mathbf{f} + J_\mathbf{p} \mathbf{g}
  • J_\mathbf{p} (h\mathbf{f}) = hJ_\mathbf{p} \mathbf{f} + \mathbf{f}(\mathbf{p}) J_\mathbf{p} h
  • J_\mathbf{p} (\mathbf{f}\cdot \mathbf{g}) = \mathbf{g}^T J_\mathbf{p} \mathbf{f} + \mathbf{f}^T J_\mathbf{p}\mathbf{g}

Important: make sure the order is right - matrix multiplication is not commutative!

Chain rule

The chain rule for functions of several variables is as follows. For f : Rm → Rn and g : Rn → Rp, with f differentiable at p and g differentiable at f(p), the composition g ∘ f is differentiable at p and its Jacobian is

J_\mathbf{p}(\mathbf{g} \circ \mathbf{f}) = \left( J_{\mathbf{f}(\mathbf{p})} \mathbf{g}\right) \left( J_\mathbf{p} \mathbf{f}\right)

Again, we have matrix multiplication, so one must preserve this exact order. Compositions in one order may be defined, but not necessarily in the other way.
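The chain rule can be verified for a concrete pair of maps; f(x, y) = (xy, x + y) and g(u, v) = uv are our own illustrative choices, with all Jacobians written out by hand:

```python
p = (2.0, 3.0)
fp = (p[0] * p[1], p[0] + p[1])      # f(p) = (6, 5)

# J_p f = [[y, x], [1, 1]];  J_{f(p)} g = [v, u] evaluated at f(p).
Jf = [[p[1], p[0]], [1.0, 1.0]]
Jg = [fp[1], fp[0]]

# Matrix product (J_{f(p)} g)(J_p f): a 1x2 row times a 2x2 matrix.
chain = [Jg[0] * Jf[0][j] + Jg[1] * Jf[1][j] for j in range(2)]

# Direct computation: g(f(x, y)) = x^2 y + x y^2,
# whose gradient is (2xy + y^2, x^2 + 2xy).
direct = [2 * p[0] * p[1] + p[1] ** 2, p[0] ** 2 + 2 * p[0] * p[1]]
print(chain, direct)  # both [21.0, 16.0]
```

Note that the 1×2 row is multiplied on the left of the 2×2 matrix, matching the order in the formula above.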


Alternate notations

For simplicity, we will often use various standard abbreviations, so we can write most of the formulae on one line. This can make it easier to see the important details.

We can abbreviate partial differentials with a subscript, e.g.,

\partial_x h(x,y)= \frac{\partial h}{\partial x} \quad \partial_x \partial_y h= \partial_y \partial_x h

When we are using a subscript this way we will generally use the Heaviside D rather than ∂,

D_x h(x,y)= \frac{\partial h}{\partial x} 
\quad D_x D_y h= D_y D_x h

Mostly, to make the formulae even more compact, we will put the subscript on the function itself.

D_x h= h_x \quad h_{xy}=h_{yx}

If we are using subscripts to label the axes, x1, x2 …, then, rather than having two layers of subscripts, we will use the number as the subscript.

h_1 = D_1 h = \partial_1 h = \partial_{x_1}h = \frac{\partial h}{\partial x_1}

We can also use subscripts for the components of a vector function, u=(ux, uy, uz) or u=(u1, u2, … un)

If we are using subscripts for both the components of a vector and for partial derivatives we will separate them with a comma.

u_{x,y}=\frac{\partial u_x}{\partial y}

The most widely used notation is hx. Both h1 and ∂1h are also quite widely used whenever the axes are numbered. The notation ∂xh is used least frequently.

We will use whichever notation best suits the equation we are working with.

Directional derivatives

Normally, a partial derivative of a function with respect to one of its variables, say xj, takes the derivative of the "slice" of that function parallel to the xj'th axis.

More precisely, we can think of cutting the graph of the function f(x1,...,xn) in space along the xj'th axis, keeping everything but the xj variable constant.

From the definition, we have the partial derivative at a point p of the function along this slice as

{\partial \mathbf{f} \over \partial x_j} = \lim_{t\rightarrow 0} {\mathbf{f}(\mathbf{p}+t\mathbf{e}_j) - \mathbf{f}(\mathbf{p}) \over t}

provided this limit exists.

Instead of the basis vector, which corresponds to taking the derivative along that axis, we can pick a vector in any direction (which we usually take as being a unit vector), and we take the directional derivative of a function as

{\partial \mathbf{f} \over \partial \mathbf{d}} = \lim_{t\rightarrow 0} {\mathbf{f}(\mathbf{p}+t\mathbf{d}) - \mathbf{f}(\mathbf{p}) \over t}

where d is the direction vector.

If we want to calculate directional derivatives, calculating them from the limit definition is rather painful, but we have the following: if f : Rn → R is differentiable at a point p and |d|=1, then

{\partial \mathbf{f} \over \partial \mathbf{d}} = D_\mathbf{p} \mathbf{f}(\mathbf{d})

There is a closely related formulation which we'll look at in the next section.

Gradient vectors

The partial derivatives of a scalar tell us how much it changes if we move along one of the axes. What if we move in a different direction?

We'll call the scalar f, and consider what happens if we move by an infinitesimal displacement dr=(dx,dy,dz), using the chain rule.

df=dx\frac{\partial f}{\partial x} + 
dy\frac{\partial f}{\partial y}+dz\frac{\partial f}{\partial z}

This is the dot product of dr with a vector whose components are the partial derivatives of f, called the gradient of f

\operatorname{grad} \mathbf{f} = \nabla \mathbf{f} = 
\left(\frac{\partial \mathbf{f}(\mathbf{p})}{\partial x_1},\cdots,
\frac{\partial \mathbf{f}(\mathbf{p})}{\partial x_n}\right)

We can form directional derivatives at a point p, in the direction d then by taking the dot product of the gradient with d

{\partial \mathbf{f}(\mathbf{p}) \over \partial \mathbf{d}} =\mathbf{d} \cdot \nabla \mathbf{f}(\mathbf{p}).
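Both the gradient and the directional derivative formula above can be approximated numerically; grad and directional below are our own finite-difference helpers, not part of the text:

```python
def grad(f, p, h=1e-6):
    # Central-difference approximation to (df/dx_1, ..., df/dx_n) at p.
    g = []
    for j in range(len(p)):
        fwd = list(p); fwd[j] += h
        bck = list(p); bck[j] -= h
        g.append((f(fwd) - f(bck)) / (2 * h))
    return g

def directional(f, p, d):
    # df/dd = d . grad f(p), for a unit vector d.
    return sum(di * gi for di, gi in zip(d, grad(f, p)))

h2 = lambda v: v[0] ** 2 + v[1] ** 2        # h(x, y) = x^2 + y^2, grad = 2r
print(grad(h2, [1.0, 2.0]))                 # approximately [2, 4]
print(directional(h2, [1.0, 2.0], [1.0, 0.0]))  # approximately 2
```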

Notice that grad f looks like a vector multiplied by a scalar. This particular combination of partial derivatives is commonplace, so we abbreviate it to

\nabla = \left( \frac{\partial }{\partial x}, 
\frac{\partial }{\partial y}, \frac{\partial }{\partial z}\right)

We can write the action of taking the gradient vector by writing this as an operator. Recall that in the one-variable case we can write d/dx for the action of taking the derivative with respect to x. This case is similar, but ∇ acts like a vector.

We can also write the action of taking the gradient vector as:

\nabla  = \left( \frac{\partial }{\partial x_1},
\frac{\partial }{\partial x_2}, \cdots \frac{\partial }{\partial x_n}\right)

Properties of the gradient vector

Geometry
  • Grad f(p) is a vector pointing in the direction of steepest slope of f. |grad f(p)| is the rate of change of that slope at that point.

For example, consider h(x, y)=x2+y2. The level sets of h are concentric circles, centred on the origin, and

 \nabla h = (h_x,h_y) = 2(x,y)= 2 \mathbf{r}

grad h points directly away from the origin, at right angles to the contours.

  • Along a level set, (∇f)(p) is perpendicular to the level set {x|f(x)=f(p)} at x=p.

If dr points along the contours of f, where the function is constant, then df will be zero. Since df is a dot product, that means that the two vectors, dr and grad f, must be at right angles, i.e. the gradient is at right angles to the contours.

Algebraic properties

Like d/dx, ∇ is linear. For any pair of constants, a and b, and any pair of scalar functions, f and g

\frac{d}{dx} (af+bg)= a\frac{d}{dx}f + b\frac{d}{dx}g 
\quad \nabla (af+bg) = a \nabla f + b \nabla g

Since it's a vector, we can try taking its dot and cross product with other vectors, and with itself.

Divergence

If the vector function u maps Rn to itself, then we can take the dot product of u and ∇. This dot product is called the divergence.

\operatorname{div}\, \mathbf{u} = \nabla \cdot \mathbf{u} = 
\frac{\partial u_1}{\partial x_1} + \frac{\partial u_2}{\partial x_2} + \cdots \frac{\partial u_n}{\partial x_n}

If we look at a vector function like v=(1+x2,xy) we can see that to the left of the origin all the v vectors are converging towards the origin, but on the right they are diverging away from it.

Div u tells us how much u is converging or diverging. It is positive when the vector is diverging from some point, and negative when the vector is converging on that point.

Example:
For v=(1+x2, xy), div v=3x, which is positive to the right of the origin, where v is diverging, and negative to the left of the origin, where v is converging.
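The example can be checked numerically; divergence below is our own central-difference sketch:

```python
def divergence(u, p, h=1e-6):
    # div u = sum over i of du_i/dx_i, each approximated
    # by a central difference.
    total = 0.0
    for i in range(len(p)):
        fwd = list(p); fwd[i] += h
        bck = list(p); bck[i] -= h
        total += (u(fwd)[i] - u(bck)[i]) / (2 * h)
    return total

v = lambda p: (1 + p[0] ** 2, p[0] * p[1])   # v = (1 + x^2, xy), div v = 3x
print(divergence(v, [2.0, 5.0]))    # approximately  6 (diverging)
print(divergence(v, [-2.0, 5.0]))   # approximately -6 (converging)
```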

Like grad, div is linear.

\nabla \cdot \left( a\mathbf{u}+b\mathbf{v} \right) = 
a\nabla \cdot \mathbf{u}  + b\nabla \cdot \mathbf{v}

Later in this chapter we will see how the divergence of a vector function can be integrated to tell us more about the behaviour of that function.

To find the divergence we took the dot product of ∇ and a vector, with ∇ on the left. If we reverse the order we get

\mathbf{u} \cdot \nabla = u_x D_x + u_y D_y + u_z D_z

To see what this means, consider i·∇. This is Dx, the partial differential in the i direction. Similarly, u·∇ is the partial differential in the u direction, multiplied by |u|.

Curl

If u is a three-dimensional vector function on R3 then we can take its cross product with ∇. This cross product is called the curl.

\operatorname{curl}\,\mathbf{u} = \nabla \times \mathbf{u} 
=\begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 
D_x & D_y & D_z \\ u_x & u_y & u_z \end{vmatrix}

Curl u tells us if the vector u is rotating round a point. The direction of curl u is the axis of rotation.

We can treat vectors in two dimensions as a special case of three dimensions, with uz=0 and Dzu=0. We can then extend the definition of curl u to two-dimensional vectors

\operatorname{curl}\,\mathbf{u} = D_x u_y-D_y u_x

This two dimensional curl is a scalar. In four, or more, dimensions there is no vector equivalent to the curl.

Example:
Consider u=(-y, x). These vectors are tangent to circles centred on the origin, so appear to be rotating around it anticlockwise.

\operatorname{curl}\,\mathbf{u} = D_x x-D_y(-y)= 2

Example
Consider u=(-y, x-z, y), which is similar to the previous example.

\operatorname{curl}\,\mathbf{u} = 
\begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 
D_x & D_y & D_z \\ -y & x-z & y \end{vmatrix}
=2\mathbf{i} + 2\mathbf{k}

This u is rotating round the axis i+k
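The determinant above can also be checked numerically; curl below is our own finite-difference sketch of the component formula:

```python
def curl(u, p, h=1e-6):
    # curl u = (D_y u_z - D_z u_y,  D_z u_x - D_x u_z,  D_x u_y - D_y u_x)
    def D(i, j):
        # du_i/dx_j by a central difference
        fwd = list(p); fwd[j] += h
        bck = list(p); bck[j] -= h
        return (u(fwd)[i] - u(bck)[i]) / (2 * h)
    return (D(2, 1) - D(1, 2), D(0, 2) - D(2, 0), D(1, 0) - D(0, 1))

u = lambda p: (-p[1], p[0] - p[2], p[1])   # u = (-y, x - z, y)
print(curl(u, [1.0, 2.0, 3.0]))            # approximately (2, 0, 2), i.e. 2i + 2k
```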

Later in this chapter we will see how the curl of a vector function can be integrated to tell us more about the behaviour of that function.

Product and chain rules

Just as with ordinary differentiation, there are product rules for grad, div and curl.

  • If g is a scalar and v is a vector, then
the divergence of gv is
\nabla \cdot (g\mathbf{v})=g \nabla \cdot \mathbf{v} + (\mathbf{v} \cdot \nabla) g
the curl of gv is
\nabla \times (g\mathbf{v}) = g(\nabla \times \mathbf{v})+(\nabla g) \times \mathbf{v}
  • If u and v are both vectors then
the gradient of their dot product is
\nabla ( \mathbf{u} \cdot \mathbf{v} ) = 
\mathbf{u} \times (\nabla \times \mathbf{v} ) + 
\mathbf{v} \times (\nabla \times \mathbf{u} ) + 
(\mathbf{u} \cdot \nabla) \mathbf{v} + (\mathbf{v} \cdot \nabla) \mathbf{u}
the divergence of their cross product is
\nabla \cdot ( \mathbf{u} \times \mathbf{v} ) =
\mathbf{v}\cdot ( \nabla \times \mathbf{u} ) -
\mathbf{u}\cdot ( \nabla \times \mathbf{v} )
the curl of their cross product is
\nabla \times ( \mathbf{u} \times \mathbf{v} ) =
(\mathbf{v} \cdot \nabla ) \mathbf{u} - (\mathbf{u} \cdot \nabla) \mathbf{v}
+ \mathbf{u}(\nabla \cdot \mathbf{v}) - \mathbf{v}(\nabla \cdot \mathbf{u})


We can also write chain rules. In the general case, when both functions are vectors and the composition is defined, we can use the Jacobian defined earlier.

 
\left. \nabla \mathbf{u}(\mathbf{v}) \right|_{\mathbf{r}}=
\mathbf{J}_{\mathbf{v}} \left. \nabla \mathbf{v} \right|_{\mathbf{r}}

where J_v is the Jacobian of u at the point v.

Normally J is a matrix but if either the range or the domain of u is R1 then it becomes a vector. In these special cases we can compactly write the chain rule using only vector notation.

  • If g is a scalar function of a vector and h is a scalar function of g then
\nabla h(g) = \frac{dh}{dg} \nabla g
  • If g is a scalar function of a vector then
\nabla = (\nabla g) \frac{d}{dg}

This substitution can be made in any of the equations containing ∇.

Second order differentials

We can also consider dot and cross products of ∇ with itself, whenever they can be defined. Once we know how to simplify products of two ∇'s, we'll know how to simplify products with three or more.

The divergence of the gradient of a scalar f is

 \nabla^2 f(x_1,x_2,\ldots x_n) = \frac{\partial^2 f}{\partial x_1^2}+
\frac{\partial^2 f}{\partial x_2^2}+\ldots+\frac{\partial^2 f}{\partial x_n^2}

This combination of derivatives is the Laplacian of f. It is commonplace in physics and multidimensional calculus because of its simplicity and symmetry.

We can also take the Laplacian of a vector,

 \nabla^2 \mathbf{u}(x_1,x_2,\ldots x_n) = 
\frac{\partial^2 \mathbf{u}}{\partial x_1^2}+ 
\frac{\partial^2 \mathbf{u}}{\partial x_2^2}+ \ldots +
\frac{\partial^2 \mathbf{u}}{\partial x_n^2}

The Laplacian of a vector is not the same as the gradient of its divergence; the difference between the two is the curl of the curl

\nabla (\nabla \cdot \mathbf{u})- \nabla^2 \mathbf{u}=
\nabla \times (\nabla \times \mathbf{u})

Both the curl of the gradient and the divergence of the curl are always zero.

\nabla \times \nabla f=0 \quad \nabla \cdot (\nabla \times \mathbf{u})=0

This pair of rules will prove useful.

Integration

We have already considered differentiation of functions of more than one variable, which leads us to consider how we can meaningfully look at integration.

In the single variable case, we interpret the definite integral of a function to mean the area under the function. There is a similar interpretation in the multiple variable case: for example, if we have a paraboloid in R3, we may want to look at the integral of that paraboloid over some region of the xy plane, which will be the volume under that curve and inside that region.

Riemann sums

When looking at these forms of integrals, we look at the Riemann sum. Recall that in the one-variable case we divide the interval we are integrating over into rectangles, summing the areas of these rectangles as their widths get smaller and smaller. For the multiple-variable case, we need to do something similar, but the problem arises of how to split up R2, or R3, for instance.

To do this, we extend the concept of the interval and consider what we call an n-interval. An n-interval is a set of points in some rectangular region with sides of some fixed width in each dimension, that is, a set of the form {x ∈ Rn | ai ≤ xi ≤ bi for i = 1,...,n}, and its area/size/volume (which we simply call its measure to avoid confusion) is the product of the lengths of all its sides.

So, an n-interval in R2 could be some rectangular partition of the plane, such as {(x,y) | x ∈ [0,1] and y ∈ [0, 2]}. Its measure is 2.

If we are to consider the Riemann sum now in terms of sub-n-intervals of a region Ω, it is

\sum_{i; S_i \subset \Omega} f(x^*_i)m(S_i)

where Ω has been divided into k sub-n-intervals Si, m(Si) is the measure of Si, and x*i is a point in Si. The index is important: we only perform the sum where Si falls completely within Ω; any Si that is not completely contained in Ω we ignore.

If, as we take the limit as k goes to infinity (that is, as we divide up Ω into finer and finer sub-n-intervals), this sum tends to the same value no matter how we divide up Ω, we call that value the integral of f over Ω, which we write

\int_\Omega f

For two dimensions, we may write

\int\int_\Omega f

and likewise for n dimensions.
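The Riemann sum definition can be turned directly into a (slow but straightforward) numerical approximation. The sketch below, our own illustration using the midpoint of each sub-interval, estimates a double integral over a 2-interval:

```python
def riemann_2d(f, a, b, c, d, k=200):
    # Divide [a,b] x [c,d] into k*k sub-2-intervals and sum
    # f(midpoint) * measure over all of them.
    dx, dy = (b - a) / k, (d - c) / k
    total = 0.0
    for i in range(k):
        for j in range(k):
            x = a + (i + 0.5) * dx
            y = c + (j + 0.5) * dy
            total += f(x, y) * dx * dy
    return total

# Volume under z = xy over [0,1] x [0,2]; the exact value is 1.
print(riemann_2d(lambda x, y: x * y, 0.0, 1.0, 0.0, 2.0))
```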

Iterated integrals

Thankfully, we need not always work with Riemann sums every time we want to calculate an integral in more than one variable. There are some results that make life a bit easier for us.

For R2, if we have some region bounded between the graphs of two functions of one variable (curves of the form y = f(x) and y = g(x), or x = f(y) and x = g(y)) and between constant boundaries (x = a and x = b, or y = a and y = b), we have

\int_a^b\,\int_{f(x)}^{g(x)} h(x,y)\,dy dx

An important theorem (called Fubini's theorem) assures us that this iterated integral is the same as

\int\int_\Omega h,

if h is continuous on the domain of integration.
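An iterated integral with variable inner limits can be approximated the same way; iterated below is our own midpoint-rule sketch, not part of the text:

```python
def iterated(h, a, b, lo, hi, k=400):
    # Approximate  int_a^b  int_{lo(x)}^{hi(x)} h(x, y) dy dx
    # with the midpoint rule in each variable.
    dx = (b - a) / k
    total = 0.0
    for i in range(k):
        x = a + (i + 0.5) * dx
        dy = (hi(x) - lo(x)) / k
        inner = sum(h(x, lo(x) + (j + 0.5) * dy) for j in range(k)) * dy
        total += inner * dx
    return total

# Area between y = x^2 and y = x for x in [0, 1]; the exact value is 1/6.
print(iterated(lambda x, y: 1.0, 0.0, 1.0, lambda x: x ** 2, lambda x: x))
```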

Order of integration

In some cases the inner integral of an iterated integral is difficult or impossible to evaluate, so it can be to our advantage to change the order of integration.

\int_a^b\,\int_{f(x)}^{g(x)} h(x,y)\,dy\,dx

\int_c^d\,\int_{e(y)}^{f(y)} h(x,y)\,dx\,dy

There is no general method for changing an order of integration from dx dy to dy dx. When the limits of integration are all constants, the order can be changed simply by switching the iterated integrals; otherwise, the best method is to reconstruct the limits of integration from a graph of the region of integration.

For higher-order integrations that can't be graphed, the process can be very tedious. For example, to change dx dy dz into dz dy dx, one must first switch dx dy dz to dy dx dz, then to dy dz dx, and then to dz dy dx (as three-dimensional regions can be graphed, this roundabout method is rarely needed there).

Parametric integrals

If we have a vector function, u, of a scalar parameter, s, we can integrate with respect to s simply by integrating each component of u separately.

\mathbf{v}(s)=\int \mathbf{u}(s)\,ds \Rightarrow v_i(s)=\int u_i(s)\,ds

Similarly, if u is a function of a vector of parameters, s, lying in Rn, integration with respect to the parameters reduces to a multiple integral of each component.

Line integrals

Line-Integral.gif

In one dimension, saying we are integrating from a to b uniquely specifies the integral.

In higher dimensions, saying we are integrating from a to b is not sufficient. In general, we must also specify the path taken between a and b.

We can then write the integrand as a function of the arclength along the curve, and integrate by components.

E.g., given a scalar function h(r) we write

\int_C h(\mathbf{r}) \, d\mathbf{r} = 
\int_C h(\mathbf{r}) \frac{d\mathbf{r}}{ds}\,ds =
\int_C h(\mathbf{r}(s)) \mathbf{t}(s) \, ds

where C is the curve being integrated along, and t is the unit vector tangent to the curve.

There are some particularly natural ways to integrate a vector function, u, along a curve,

\int_C \mathbf{u}\, ds \quad 
\int_C \mathbf{u} \cdot d\mathbf{r} \quad 
\int_C \mathbf{u} \times d\mathbf{r} \quad 
\int_C \mathbf{u} \cdot \,\mathbf{n} ds

where the third possibility only applies in 3 dimensions.

Again, these integrals can all be written as integrals with respect to the arclength, s.


\int_C  \mathbf{u} \cdot d\mathbf{r} = \int_C  \mathbf{u} \cdot \mathbf{t} \, ds
\quad 
\int_C \mathbf{u} \times d\mathbf{r} = \int_C \mathbf{u} \times \mathbf{t}\, ds

If the curve is planar and u a vector lying in the same plane, the second integral can be usefully rewritten. Say,

\mathbf{u}=u_t \mathbf{t}+u_n \mathbf{n} + u_b \mathbf{b}

where t, n, and b are the tangent, normal, and binormal vectors uniquely defined by the curve.

Then

\mathbf{u} \times \mathbf{t}= -\mathbf{b} u_n + \mathbf{n} u_b

For the 2-d curves specified, b is the constant unit vector normal to their plane, and ub is always zero.

Therefore, for such curves,

\int_C \mathbf{u} \times d\mathbf{r} =
-\mathbf{b}\int_C \mathbf{u} \cdot \,\mathbf{n}\, ds

Green's Theorem

Green's-theorem-simple-region.svg

Let C be a piecewise smooth, simple closed curve that bounds a region S on the Cartesian plane. If two functions M(x,y) and N(x,y) are continuous, with continuous partial derivatives, then

\int \int_S  (\frac{\partial N}{\partial x} - \frac{\partial M}{\partial y})  dA = \oint_C M dx + N dy = \oint_C \mathbf{F} \cdot d\mathbf{r}

In order for Green's theorem to work there must be no singularities in the vector field within the boundaries of the curve.

Green's theorem works by summing the circulation in each infinitesimal segment of area enclosed within the curve.
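A concrete check (our own example, with F = (M, N) = (-y, x) on the unit disc): here N_x - M_y = 2, so the double integral is twice the disc's area, 2π, and a discretised line integral around the unit circle should agree:

```python
import math

# F = (M, N) = (-y, x) on the unit circle, traversed anticlockwise.
# N_x - M_y = 2, so Green's theorem predicts the line integral
# equals 2 * (area of the unit disc) = 2*pi.
k = 20000
line = 0.0
for i in range(k):
    t0 = 2 * math.pi * i / k
    t1 = 2 * math.pi * (i + 1) / k
    x0, y0 = math.cos(t0), math.sin(t0)
    x1, y1 = math.cos(t1), math.sin(t1)
    xm, ym = (x0 + x1) / 2, (y0 + y1) / 2      # chord midpoint
    line += -ym * (x1 - x0) + xm * (y1 - y0)   # M dx + N dy
print(line, 2 * math.pi)
```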

Inverting differentials

We can use line integrals to calculate functions with specified divergence, gradient, or curl.

  • If grad V = u
V(\mathbf{p})=\int_{\mathbf{p}_0}^{\mathbf{p}} \mathbf{u} \cdot d\mathbf{r} + h(\mathbf{p})
where h is any function with zero gradient; for such a V to exist, curl u must be zero.
  • If div u = V
\mathbf{u}(\mathbf{p}) =\int_{\mathbf{p}_0}^{\mathbf{p}} V \, d\mathbf{r} + \mathbf{w}(\mathbf{p})
where w is any function of zero divergence.
  • If curl u = v
\mathbf{u}(\mathbf{p}) =
\frac{1}{2}\int_{\mathbf{p}_0}^{\mathbf{p}} \mathbf{v} \times d\mathbf{r} + \mathbf{w}(\mathbf{p})
where w is any function of zero curl.

For example, if V=r2 then

\operatorname{grad} V = 2(x, y, z) = 2 \mathbf{r}

and

\begin{matrix}
\int_{\mathbf{0}}^{\mathbf{r}} 2\mathbf{u} \cdot d\mathbf{u} & = & \int_{\mathbf{0}}^{\mathbf{r}} 2 \left( udu + vdv+ wdw \right) \\
& = & \left[ u^2 \right]_{\mathbf{0}}^{\mathbf{r}} + 
\left[ v^2 \right]_{\mathbf{0}}^{\mathbf{r}} +
\left[ w^2 \right]_{\mathbf{0}}^{\mathbf{r}} \\
& = & x^2 + y^2 + z^2 = r^2\\
\end{matrix}

so this line integral of the gradient gives the original function.
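This gradient-inversion formula is easy to check numerically. The Python sketch below (the point p, the straight-line path, and the step count are all illustrative choices) integrates grad V = 2r from the origin to p with the midpoint rule and compares the result with r² at the endpoint.

```python
# Integrate grad V = 2r along the straight line r(s) = s*p, s in [0, 1],
# and compare with V(p) = |p|^2.  Path and resolution are arbitrary choices.
p = (1.0, 2.0, 3.0)
n = 10000
total = 0.0
for i in range(n):
    s = (i + 0.5) / n                   # midpoint-rule parameter
    r = tuple(s * c for c in p)         # point on the line
    dr = tuple(c / n for c in p)        # step dr = p ds
    grad = tuple(2 * c for c in r)      # grad V = 2r
    total += sum(g * d for g, d in zip(grad, dr))
exact = sum(c * c for c in p)           # r^2 at the endpoint
assert abs(total - exact) < 1e-6
```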

Similarly, if v=k then

\mathbf{u}(\mathbf{p}) =
\int_{\mathbf{p}_0}^{\mathbf{p}}  \mathbf{k} \times d\mathbf{r}

Consider any curve from 0 to p=(x, y, z), given by r=r(s) with r(0)=0 and r(S)=p for some S, and do the above integral along that curve.

\begin{matrix}
\mathbf{u}(\mathbf{p}) & = & 
\int_0^S  \mathbf{k} \times \frac{d\mathbf{r}}{ds} \, ds \\ & = & 
\int_0^S \left( \frac{dr_x}{ds} \mathbf{j}-\frac{dr_y}{ds} \mathbf{i} \right) \, ds \\ & = & 
\mathbf{j} \int_0^S \frac{dr_x}{ds} \, ds - 
\mathbf{i} \int_0^S \frac{dr_y}{ds} \, ds \\ 
& = & \mathbf{j} [r_x(s)]^S_0 - \mathbf{i} [r_y(s)]^S_0 \\ 
& = & p_x\mathbf{j} - p_y \mathbf{i} =  x\mathbf{j} - y \mathbf{i}
\end{matrix}

and curl u is


\frac{1}{2}\begin{vmatrix}
\mathbf{i} & \mathbf{j} & \mathbf{k} \\
D_x & D_y & D_z \\
-y & x & 0
\end{vmatrix} = \mathbf{k} = \mathbf{v}

as expected.

We will soon see that these three integrals do not depend on the path, apart from a constant.

Surface and Volume Integrals

Just as with curves, it is possible to parameterise surfaces and then integrate over those parameters without regard to the geometry of the surface.

That is, to integrate a scalar function V over a surface A parameterised by r and s we calculate

\int_A V(x,y,z) \, dS = \int \int_A V(r,s) \det J \, dr ds

where J is the Jacobian of the transformation to the parameters.

To integrate a vector this way, we integrate each component separately.

However, in three dimensions, every surface has an associated normal vector n, which can be used in integration. We write dS=ndS.

For a scalar function, V, and a vector function, v, this gives us the integrals

\int_A V \, \mathbf{dS} \quad \int_A  \mathbf{v} \cdot \mathbf{dS}
\quad \int_A  \mathbf{v} \times \mathbf{dS}

These integrals can be reduced to parametric integrals but, written this way, it is clear that they reflect more of the geometry of the surface.

When working in three dimensions, dV is a scalar, so there is only one option for integrals over volumes.

Gauss's divergence theorem

Divergence theorem.svg

We know that, in one dimension,

\int_a^b Df dx = f|_a^b

Integration is the inverse of differentiation, so integrating the differential of a function returns the original function.

This can be extended to two or more dimensions in a natural way, drawing on the analogies between single variable and multivariable calculus.

The analog of D is ∇, so we should consider cases where the integrand is a divergence.

Instead of integrating over a one-dimensional interval, we need to integrate over an n-dimensional volume.

In one dimension, the integral depends on the values at the edges of the interval, so we expect the result to be connected with values on the boundary.

This suggests a theorem of the form,

\int_V \nabla \cdot  \mathbf{u} \, dV = 
\int_{\partial V} \mathbf{n} \cdot  \mathbf{u} dS

This is indeed true, for vector fields in any number of dimensions.

This is called Gauss's theorem.

There are two other, closely related, theorems for grad and curl:

  • \int_V \nabla u \, dV = 
\int_{\partial V} u \mathbf{n} dS ,
  • \int_V \nabla \times  \mathbf{u} \, dV = 
\int_{\partial V} \mathbf{n} \times  \mathbf{u} dS ,

with the last theorem only being valid where curl is defined.
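As a numerical illustration of Gauss's theorem (the field and the grid size here are arbitrary choices for the sketch), the volume integral of the divergence over the unit cube can be compared with the outward flux through its faces:

```python
# Numerical check of the divergence theorem on the unit cube with the
# illustrative field u = (x^2, y^2, z^2), so div u = 2x + 2y + 2z.
n = 60
h = 1.0 / n
mid = [(i + 0.5) * h for i in range(n)]

# Volume integral of div u, by the midpoint rule (the exact value is 3).
vol = sum(2 * (x + y + z) * h**3 for x in mid for y in mid for z in mid)

# Outward flux of u: the faces at x=0, y=0, z=0 carry no flux, and each of
# the three faces at 1 contributes the integral of 1 over a unit square.
flux = 3 * sum(h * h for a in mid for b in mid)

assert abs(vol - flux) < 1e-6
```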

Stokes' curl theorem

Stokes' Theorem.svg

These theorems also hold in two dimensions, where they relate surface and line integrals. Gauss's divergence theorem becomes

\int_S \nabla \cdot  \mathbf{u} \, dS = 
\oint_{\partial S} \mathbf{n} \cdot  \mathbf{u}\, ds

where s is arclength along the boundary curve, and the vector n is the unit normal to the curve that lies in the surface S, i.e. in the tangent plane of the surface at its boundary. This is not necessarily the same as the unit normal associated with the boundary curve itself.

Similarly, we get

\int_S \nabla \times  \mathbf{u} \, dS = 
\int_C \mathbf{n} \times  \mathbf{u} ds \quad (1),

where C is the boundary of S.

In this case the integral does not depend on the surface S.

To see this, suppose we have two different surfaces, S1 and S2, spanning the same curve C. By switching the direction of the normal on one of the surfaces we can write

 \int_{S_1+S_2} \nabla \times  \mathbf{u} \, dS =
\int_{S_1} \nabla \times  \mathbf{u} \, dS - \int_{S_2} \nabla \times  \mathbf{u} \, dS 
\quad (2)

The left hand side is an integral over a closed surface bounding some volume V so we can use Gauss's divergence theorem.

 \int_{S_1+S_2} \nabla \times  \mathbf{u} \, dS =
\int_V \nabla \cdot \nabla \times  \mathbf{u} \, dV

but we know this integrand is always zero so the right hand side of (2) must always be zero, i.e. the integral is independent of the surface.

This means we can choose the surface so that the normal to the curve lying in the surface is the same as the curve's intrinsic normal.

Then, if u itself lies in the surface, we can write

\mathbf{u}=\left( \mathbf{u} \cdot \mathbf{n} \right) \mathbf{n} + 
\left( \mathbf{u} \cdot \mathbf{t} \right) \mathbf{t}

just as we did for line integrals in the plane earlier, and substitute this into (1) to get

\int_S \nabla \times  \mathbf{u} \, dS = 
\int_C \mathbf{u} \cdot d\mathbf{r}

This is Stokes' curl theorem.
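Stokes' theorem can also be illustrated numerically. The sketch below uses the arbitrary field u = (-y, x, 0) over the unit disk in the plane z = 0, for which curl u = (0, 0, 2), so the surface integral is 2 times the disk's area:

```python
import math

# Numerical check of Stokes' curl theorem: compare the circulation of
# u = (-y, x, 0) around the unit circle with 2 * (disk area) = 2*pi.
n = 10000
dt = 2 * math.pi / n
line = 0.0
for i in range(n):
    t = (i + 0.5) * dt
    x, y = math.cos(t), math.sin(t)                   # point on the circle
    dx, dy = -math.sin(t) * dt, math.cos(t) * dt      # tangent step dr
    line += -y * dx + x * dy                          # u . dr
surface = 2 * math.pi                                 # surface integral of curl u
assert abs(line - surface) < 1e-6
```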

Ordinary Differential Equations


Ordinary differential equations involve equations containing:

  • variables
  • functions
  • their derivatives

and their solutions.

In studying integration, you already have considered solutions to very simple differential equations. For example, when you look to solving

\int f(x) \,dx=g(x)

for g(x), you are really solving the differential equation

 g'(x) = f(x) \,

Notations and terminology

The notation we use will be crucial to how easily these equations can be solved.

This document will be using three notations primarily:

  • f' to denote the derivative of f
  • D f to denote the derivative of f
  • {df \over dx} to denote the derivative of f (for separable equations).

Terminology

Consider the differential equation

3 f^{\prime \prime}(x)+5xf(x)=11

Since the highest derivative that appears in the equation is the second derivative, we say that the differential equation is of order 2.

Some simple differential equations

A key idea in solving differential equations will be that of integration.

Let us consider the second order differential equation

f''(x)=2 \,

How would we go about solving this? It tells us that on differentiating twice, we obtain the constant 2 so, if we integrate twice, we should obtain our result.

Integrating once first of all:

\int f''(x) \,dx = \int 2 \,dx
f'(x)=2x+C_1 \,

We have transformed the apparently difficult second order differential equation into a rather simpler one, viz.

f'(x)=2x+C_1 \,

This equation tells us that if we differentiate a function once, we get 2x+C_1. If we integrate once more, we should find the solution.

\int f'(x) \,dx = \int 2x+C_1 \,dx
f(x)=x^2+C_1x+C_2 \,

This is the solution to the differential equation. We will get f''=2 \, for all values of C_1 and C_2.

The values C_1 and C_2 are related to quantities known as initial conditions.

Why are initial conditions useful? ODEs (ordinary differential equations) are useful in modeling physical conditions. We may wish to model a certain physical system which is initially at rest (so one initial condition may be zero), or wound up to some point (so an initial condition may be nonzero, say 5 for instance) and we may wish to see how the system reacts under such an initial condition.

When we solve a system with given initial conditions, we substitute them after our process of integration.

Example

When we solved f''(x)=2 \, say we had the initial conditions f'(0)=3 \, and f(0)=2 \,. (Note that initial conditions need not be given at x=0.)

After we integrate we make substitutions:

f'(0)=2(0)+C_1 \,
3=C_1 \,
\int f'(x) \,dx = \int 2x+3 \,dx
f(x)=x^2+3x+C_2 \,
f(0)=0^2+3(0)+C_2 \,
2=C_2 \,
f(x)=x^2+3x+2 \,

Without initial conditions, the answer we obtain is known as the general solution or the solution to the family of equations. With them, our solution is known as a specific solution.
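The specific solution above can be checked directly. This small Python sketch (the evaluation points and step size are arbitrary choices) approximates the derivatives of f(x) = x² + 3x + 2 by central differences and confirms the equation and both initial conditions:

```python
# Check that f(x) = x^2 + 3x + 2 satisfies f'' = 2 with f(0) = 2 and
# f'(0) = 3, approximating the derivatives by central differences.
def f(x):
    return x**2 + 3*x + 2

h = 1e-4
fpp = (f(1.0 + h) - 2*f(1.0) + f(1.0 - h)) / h**2   # f'' at x = 1
fp0 = (f(h) - f(-h)) / (2*h)                         # f' at x = 0
assert abs(fpp - 2) < 1e-4
assert abs(fp0 - 3) < 1e-6
assert f(0) == 2
```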

Basic first order DEs

In this section we will consider four main types of differential equations:

  • separable
  • homogeneous
  • linear
  • exact

There are many other forms of differential equation, however, and these will be dealt with in the next section.

Separable equations

A separable equation is in the form (using dy/dx notation which will serve us greatly here)

{dy \over dx} = f(x)/g(y)

Previously we have only dealt with simple differential equations with g(y)=1. How do we solve such a separable equation as above?

We group x and dx terms together, and y and dy terms together as well.

g(y)\ dy = f(x)\ dx

Integrating both sides with respect to y on the left hand side and x on the right hand side:

\int g(y)\,dy=\int f(x)\,dx+C

we will obtain the solution.

Worked example

Here is a worked example illustrating the process.

We are asked to solve

{dy \over dx} = 3x^2y

Separating

{dy \over y} = (3x^2)\,dx

Integrating

\int {dy \over y} = \int 3x^2\,dx
\ln{y}=x^3+C \,\!
y=e^{x^3+C}

Letting k = e^C where k is a constant we obtain

y=ke^{x^3}

which is the general solution.

Verification

This step does not need to be part of your work, but if you want to check your solution, you can verify your answer by differentiation.

We obtained

y=ke^{x^3}

as the solution to

{dy \over dx} = 3x^2y

Differentiating our solution with respect to x,

{dy \over dx} = 3kx^2e^{x^3}

And since y=ke^{x^3}, we can write

{dy \over dx} = 3x^2y

We see that we obtain our original differential equation, thus our work must be correct.
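An independent way to gain confidence in such a solution is to integrate the equation numerically and compare. The sketch below (taking k = 1, i.e. y(0) = 1, and an arbitrary step count) uses a classical fourth order Runge-Kutta scheme:

```python
import math

# Integrate dy/dx = 3x^2 y numerically from y(0) = 1 (i.e. k = 1) and
# compare against the general solution y = k e^(x^3) at x = 1.
def rk4(f, x0, y0, x1, n):
    """Integrate y' = f(x, y) from x0 to x1 in n steps; return y(x1)."""
    h = (x1 - x0) / n
    x, y = x0, y0
    for _ in range(n):
        k1 = f(x, y)
        k2 = f(x + h/2, y + h*k1/2)
        k3 = f(x + h/2, y + h*k2/2)
        k4 = f(x + h, y + h*k3)
        y += h * (k1 + 2*k2 + 2*k3 + k4) / 6
        x += h
    return y

y_num = rk4(lambda x, y: 3 * x**2 * y, 0.0, 1.0, 1.0, 1000)
assert abs(y_num - math.e) < 1e-6   # e^(1^3) = e
```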

Homogeneous equations

A homogeneous equation is in the form

{dy \over dx} = f(y/x)

This looks difficult as it stands, however we can utilize the substitution

v = {y \over x}

so that we are now dealing with f(v) rather than f(y/x).

Now we can express y in terms of v, as y=xv.

Differentiating y=xv using the product rule, the equation above becomes

{dy \over dx} = v+x{dv \over dx}

Then

v+x{dv \over dx} = f(v)
x{dv \over dx} = f(v)-v
{dv \over dx} = {f(v)-v \over x}

which is a separable equation and can be solved as above.

However let's look at a worked equation to see how homogeneous equations are solved.

Worked example

We have the equation

 {dy \over dx} = {y^2 + x^2 \over yx}

This does not appear to be immediately separable, but let us expand to get

 {dy \over dx} = {y^2 \over yx} + {x^2 \over yx}
 {dy \over dx} = {x \over y} + {y \over x}

Substituting y=xv which is the same as substituting v=y/x:

 {dy \over dx} = 1/v + v

Now

 v+x{dv \over dx} = 1/v + v

Canceling v from both sides

 x{dv \over dx} = 1/v

Separating

 v\, dv = dx/x

Integrating both sides

 {1 \over 2}v^2+C= \ln(x) \,
 {1 \over 2}\left({y \over x}\right)^2= \ln(x)-C
 y^2 = 2x^2 \ln(x) - 2Cx^2 \,
 y = x\sqrt{2 \ln(x) - 2C}

which is our desired solution.
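As a spot check of this solution (taking C = 0 and the arbitrary point x = 2 for the sketch), we can compare a numerical derivative of y = x√(2 ln x - 2C) with the right hand side of the original equation, dy/dx = x/y + y/x:

```python
import math

# Spot check of y = x sqrt(2 ln x - 2C), with C = 0, against
# dy/dx = x/y + y/x, via a central difference at x = 2.
def y(x):
    return x * math.sqrt(2 * math.log(x))

h = 1e-6
x0 = 2.0
lhs = (y(x0 + h) - y(x0 - h)) / (2 * h)   # numerical dy/dx
rhs = x0 / y(x0) + y(x0) / x0
assert abs(lhs - rhs) < 1e-6
```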

Linear equations

A linear first order differential equation is a differential equation in the form

 a(x){dy \over dx} + b(x)y=c(x)

Multiplying or dividing this equation by any non-zero function of x makes no difference to its solutions, so we could always divide by a(x) to make the coefficient of the derivative equal to 1, but writing the equation in this more general form may offer insights.

At first glance, it is not possible to integrate the left hand side, but there is one special case. If b happens to be the differential of a then we can write

a(x){dy \over dx} + b(x)y = a(x){dy \over dx} + y{da \over dx}
= {d \over dx}a(x)y

and integration is now straightforward.

Since we can freely multiply by any function, let's see if we can use this freedom to write the left hand side in this special form.

We multiply the entire equation by an arbitrary function, I(x), getting

 aI{dy \over dx} + bIy=cI

then impose the condition

 \frac{d}{dx}aI = bI

If this is satisfied the new left hand side will have the special form. Note that multiplying I by any constant will leave this condition still satisfied.

Rearranging this condition gives

 \frac{1}{I}\frac{dI}{dx} = \frac{b-\frac{da}{dx}}{a}

We can integrate this to get

 \ln I(x) = \int \frac{b(z)}{a(z)}dz - \ln a(x) + c \quad
I(x)=\frac{k}{a(x)}e^{\int \frac{b(z)}{a(z)}dz}

We can set the constant k to be 1, since this makes no difference.

Next we use I on the original differential equation, getting

 e^{\int \frac{b(z)}{a(z)}dz}{dy \over dx} + 
e^{\int \frac{b(z)}{a(z)}dz} \frac{b(x)}{a(x)}y
=e^{\int \frac{b(z)}{a(z)}dz}\frac{c(x)}{a(x)}

Because we've chosen I to put the left hand side in the special form we can rewrite this as

 {d \over dx}(ye^{\int \frac{b(z)}{a(z)}dz}) = 
e^{\int \frac{b(z)}{a(z)}dz}\frac{c(x)}{a(x)}

Integrating both sides and dividing by I we obtain the final result

 y = e^{-\int \frac{b(z)}{a(z)}dz}
\left(\int e^{\int \frac{b(z)}{a(z)}dz}\frac{c(x)}{a(x)}dx + C\right)

We call I an integrating factor. Similar techniques can be used on some other calculus problems.

Example

Consider

\frac{dy}{dx} + y \tan x = 1 \quad y(0)=0

First we calculate the integrating factor.

I=e^{\int \tan x dx} = e^ {\ln \sec x} = \sec x

Multiplying the equation by this gives


\sec x \frac{dy}{dx} + y \sec x \tan x = \sec x

or

 \frac{d}{dx} y\sec x = \sec x

We can now integrate

 y = \cos x \int_0^x \sec z \, dz = \cos x \ln (\sec x + \tan x)
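We can spot check this solution numerically (the evaluation point x = 0.5 is an arbitrary choice), by confirming that y = cos x · ln(sec x + tan x) satisfies y' + y tan x = 1 and y(0) = 0:

```python
import math

# Spot check of y = cos x * ln(sec x + tan x) against y' + y tan x = 1,
# with a central difference at the arbitrary point x = 0.5.
def y(x):
    return math.cos(x) * math.log(1 / math.cos(x) + math.tan(x))

h = 1e-6
x0 = 0.5
dy = (y(x0 + h) - y(x0 - h)) / (2 * h)
residual = dy + y(x0) * math.tan(x0) - 1
assert abs(residual) < 1e-6
assert y(0) == 0   # the initial condition y(0) = 0
```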

Exact equations

An exact equation is in the form

f(x, y) dx + g(x, y) dy = 0

and has the property that

Dy f = Dx g

(If the differential equation does not have this property then we can't proceed any further).

As a result of this, if we have an exact equation then there exists a function h(x, y) such that

Dx h = f and Dy h = g

So then the solutions are in the form

h(x, y) = c

by the properties of the total differential. We can then find h(x, y) by integration.


Basic second and higher order ODE's

The generic solution of an nth order ODE will contain n constants of integration. To calculate them we need n more equations. Most often, we have either

boundary conditions, the values that y and its derivatives take at two different values of x,

or

initial conditions, the values that y and its first n-1 derivatives take at one particular value of x.

Reducible ODE's

1. If the independent variable, x, does not occur in the differential equation then its order can be lowered by one. This will reduce a second order ODE to first order.

Consider the equation:

F\left(y,\frac{dy}{dx},\frac{d^2y}{dx^2}\right)=0

Define

u=\frac{dy}{dx}

Then

\frac{d^2y}{dx^2}=\frac{du}{dx}=\frac{du}{dy}\cdot\frac{dy}{dx}=\frac{du}{dy}\cdot u

Substitute these two expression into the equation and we get

F\left(y,u,\frac{du}{dy}\cdot u\right)=0

which is a first order ODE

Example

Solve

1+2y^2\operatorname{D}^2y=0

if at x=0,  y=Dy=1

First, we make the substitution, getting

1+2y^2 u \frac{du}{dy}=0

This is a first order ODE. By rearranging terms we can separate the variables

udu=-\frac{dy}{2y^2}

Integrating this gives

\frac{u^2}{2}=c+\frac{1}{2y}

We know the values of y and u when x=0 so we can find c

c=u^2/2-1/2y=1^2/2-1/(2\cdot 1)=1/2-1/2=0

Next, we reverse the substitution

\left(\frac{dy}{dx}\right)^2=u^2=\frac{1}{y}

and take the square root

\frac{dy}{dx}=\pm \frac{1}{\sqrt{y}}

To find out which sign of the square root to keep, we use the initial condition, Dy=1 at x=0, again, and rule out the negative square root. We now have another separable first order ODE,

\frac{dy}{dx}=\frac{1}{\sqrt{y}}

Its solution is

\frac{2}{3}y^\frac{3}{2}= x+d

Since y=1 when x=0, d=2/3, and

y=\left(1 + \frac{3x}{2} \right)^\frac{2}{3}
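This solution can be checked against the original equation. The sketch below (the evaluation point x = 1 and step size are arbitrary choices) approximates y'' by a central second difference and confirms that 1 + 2y²y'' vanishes, along with the initial condition y(0) = 1:

```python
# Spot check of y = (1 + 3x/2)^(2/3) against 1 + 2 y^2 y'' = 0, with y''
# approximated by a central second difference at x = 1.
def y(x):
    return (1 + 1.5 * x) ** (2 / 3)

h = 1e-4
x0 = 1.0
ypp = (y(x0 + h) - 2 * y(x0) + y(x0 - h)) / h**2
residual = 1 + 2 * y(x0)**2 * ypp
assert abs(residual) < 1e-4
assert y(0) == 1   # the initial condition y(0) = 1
```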

2. If the dependent variable, y, does not occur in the differential equation then it may also be reduced to a first order equation.

Consider the equation:

F\left(x,\frac{dy}{dx},\frac{d^2y}{dx^2}\right)=0

Define

u=\frac{dy}{dx}

Then

\frac{d^2y}{dx^2}=\frac{du}{dx}

Substitute these two expressions into the first equation and we get

F\left(x,u,\frac{du}{dx}\right)=0

which is a first order ODE

Linear ODEs

An ODE of the form

\frac{d^ny}{dx^n}+a_1(x)\frac{d^{n-1}y}{dx^{n-1}}+ ... +a_n y=F(x)

is called linear. Such equations are much simpler to solve than typical non-linear ODEs. Though only a few special cases can be solved exactly in terms of elementary functions, there is much that can be said about the solution of a generic linear ODE. A full account would be beyond the scope of this book.


If F(x)=0 for all x the ODE is called homogeneous

Two useful properties of generic linear equations are

  1. Any linear combination of solutions of a homogeneous linear equation is also a solution.
  2. If we have a solution of a nonhomogeneous linear equation and we add any solution of the corresponding homogeneous linear equation we get another solution of the nonhomogeneous linear equation.

Variation of constants

Suppose we have a linear ODE,

\frac{d^ny}{dx^n}+a_1(x)\frac{d^{n-1}y}{dx^{n-1}}+ ... +a_n y=0

and we know one solution, y=w(x)

The other solutions can always be written as y=wz. This substitution in the ODE will give us terms involving every derivative of z up to the nth, no higher, so we'll end up with an nth order linear ODE for z.

We know that z = constant is one solution, so the ODE for z must not contain a z term, which means it will effectively be an (n-1)th order linear ODE. We will have reduced the order by one.

Let's see how this works in practice.

Example

Consider

\frac{d^2y}{dx^2}+\frac{2}{x}\frac{dy}{dx}-\frac{6}{x^2}y=0

One solution of this is y=x2, so substitute y=zx2 into this equation.

\left( x^2\frac{d^2z}{dx^2}+4x\frac{dz}{dx}+2z\right) 
+\frac{2}{x} \left( x^2\frac{dz}{dx}+2xz \right) -\frac{6}{x^2}x^2 z=0

Rearrange and simplify.

x^2 D^2 z + 6xD z=0

This is first order for Dz. We can solve it to get

z=A x^{-5} \quad y=A x^{-3}

Since the equation is linear we can add this to any multiple of the other solution to get the general solution,

y=A x^{-3} + B x^2
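The general solution can be spot checked numerically. In the sketch below the constants A = 2, B = 5 and the evaluation point x = 1.3 are arbitrary choices; the derivatives are approximated by central differences:

```python
# Spot check of y = A x^(-3) + B x^2, with arbitrary A = 2 and B = 5,
# against y'' + (2/x) y' - (6/x^2) y = 0 at the arbitrary point x = 1.3.
A, B = 2.0, 5.0

def y(x):
    return A * x**-3 + B * x**2

h = 1e-4
x0 = 1.3
yp = (y(x0 + h) - y(x0 - h)) / (2 * h)
ypp = (y(x0 + h) - 2 * y(x0) + y(x0 - h)) / h**2
residual = ypp + (2 / x0) * yp - (6 / x0**2) * y(x0)
assert abs(residual) < 1e-4
```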

Linear homogeneous ODE's with constant coefficients

Suppose we have a ODE

 (D^n+a_1 D^{n-1}+ ... + a_{n-1}D+a_0)y=0

we can take an inspired guess at a solution: since every derivative of an exponential is a multiple of itself, we try

y=e^{px}

For this function Dny=pny so the ODE becomes

 (p^n+a_1 p^{n-1}+ ... + a_{n-1}p+a_0)y=0

y=0 is a trivial solution of the ODE so we can discard it. We are then left with the equation

 p^n+a_1 p^{n-1}+ ... + a_{n-1}p+a_0=0

This is called the characteristic equation of the ODE.

It can have up to n roots, p1, p2 … pn, each root giving us a different solution of the ODE.

Because the ODE is linear, we can add all those solution together in any linear combination to get a general solution

y=A_1 e^{p_1 x} +A_2 e^{p_2 x} + ... + A_n e^{p_n x}

To see how this works in practice we will look at the second order case. Solving equations like this of higher order uses exactly the same principles; only the algebra is more complex.

Second order

If the ODE is second order,

D^2 y + bDy+cy=0

then the characteristic equation is a quadratic,

p^2+bp+c=0

with roots

p_{\pm}=\frac{-b \pm \sqrt{b^2-4c}}{2}

What these roots are like depends on the sign of b2-4c, so we have three cases to consider.

1) b2 > 4c

In this case we have two different real roots, so we can write down the solution straight away.

 y=A_{+}e^{p_{+}x}+A_{-}e^{p_{-}x}


2) b2 < 4c

In this case, both roots are complex. We could just put them directly in the formula, but if we are interested in real solutions it is more useful to write them another way.

Defining k by 4k2=4c-b2, so that the roots are p± = -b/2 ± ik, the solution is

y=A_{+}e^{ikx-\frac{bx}{2}}+A_{-}e^{-ikx-\frac{bx}{2}}

For this to be real, the A's must be complex conjugates

A_{\pm}=A e^{\pm ia}

Make this substitution and we can write,

y=A e^{-bx/2}\cos (kx+a)

If b is positive, this is a damped oscillation.


3) b2 = 4c

In this case the characteristic equation only gives us one root, p=-b/2. We must use another method to find the other solution.

We'll use the method of variation of constants. The ODE we need to solve is,

D^2 y -2pDy+p^2y=0

rewriting b and c in terms of the root. From the characteristic equation we know one solution is y=e^{px} so we make the substitution y=ze^{px}, giving

 (e^{px}D^2z+2pe^{px}Dz+p^2e^{px}z)-2p(e^{px}Dz+pe^{px}z)+p^2e^{px}z=0

This simplifies to D2z=0, which is easily solved. We get

z=Ax+B \quad y=(Ax+B)e^{px}

so the second solution is the first multiplied by x.

Higher order linear constant coefficient ODE's behave similarly: an exponential for every real root of the characteristic equation, and an exponential multiplied by a trigonometric factor for every complex conjugate pair, both being multiplied by a polynomial if the root is repeated.

E.g., if the characteristic equation factors to

(p-1)^4(p-3)(p^2+1)^2=0

the general solution of the ODE will be

y=(A+Bx+Cx^2+Dx^3)e^x + Ee^{3x}+ F \cos (x+a) +Gx \cos(x+b)

The most difficult part is finding the roots of the characteristic equation.

Linear nonhomogeneous ODEs with constant coefficients

First, let's consider the ODE

Dy-y=x

a nonhomogeneous first order ODE which we know how to solve.

Using the integrating factor e^{-x} we find

y=c e^{x} -1 -x

This is the sum of a solution of the corresponding homogeneous equation, and a polynomial.

Nonhomogeneous ODE's of higher order behave similarly.

If we have a single solution, yp of the nonhomogeneous ODE, called a particular solution,

 (D^n+a_1 D^{n-1} + \cdots + a_n)y=F(x)

then the general solution is y=yp+yh, where yh is the general solution of the homogeneous ODE.

Finding yp for an arbitrary F(x) requires methods beyond the scope of this chapter, but there are some special cases where finding yp is straightforward.

Remember that in the first order problem yp for a polynomial F(x) was itself a polynomial of the same order. We can extend this to higher orders.

Example:

D^2y+y=x^3-x+1

Consider a particular solution

y_p=b_0+b_1 x+b_2 x^2 + x^3

Substitute for y and collect coefficients

x^3 + b_2 x^2 +(6+b_1)x +(2b_2+b_0)=x^3-x+1

So b2=0, b1=-7, b0=1, and the general solution is

y=a \sin x + b \cos x + 1 -7x + x^3

This works because all the derivatives of a polynomial are themselves polynomials.
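Because everything here is polynomial, the particular solution can be verified exactly. The sketch below checks y_p = x³ - 7x + 1 against D²y + y = x³ - x + 1 at a few arbitrary sample points, using the closed-form second derivative 6x:

```python
# Exact check that the particular solution y_p = x^3 - 7x + 1 satisfies
# y'' + y = x^3 - x + 1; for a polynomial the second derivative, 6x, is
# available in closed form.
def yp(x):
    return x**3 - 7 * x + 1

def ypp(x):
    return 6 * x   # second derivative of y_p

for x in (-2.0, 0.0, 1.5, 3.0):
    assert ypp(x) + yp(x) == x**3 - x + 1
```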

Two other special cases are

F(x)=P_n e^{kx} \quad y_p(x)=Q_n e^{kx}
F(x)=A_n \sin kx +B_n \cos kx \quad 
y_p(x)=P_n \sin kx +Q_n \cos kx

where Pn,Qn,An, and Bn are all polynomials of degree n.

Making these substitutions will give a set of simultaneous linear equations for the coefficients of the polynomials.

Partial Differential Equations


Introduction

First order

Any partial differential equation of the form

 h_1  \frac {\partial u}{\partial x_1}
+ h_2 \frac {\partial u}{\partial x_2} + \cdots +
h_n \frac {\partial u}{\partial x_n} = 
b

where h1, h2, …, hn, and b are all functions of both u and the coordinates x1, …, xn, can be reduced to a set of ordinary differential equations.

To see how to do this, we will first consider some simpler problems.

Special cases

We will start with the simple PDE

u_z(x,y,z)=u(x,y,z) \quad (1)

Because u is only differentiated with respect to z, for any fixed x and y we can treat this like the ODE, du/dz=u. The solution of that ODE is cez, where c is the value of u when z=0, for the fixed x and y.

Therefore, the solution of the PDE is

u(x, y, z)=u(x,y,0) e^z

Instead of just having a constant of integration, we have an arbitrary function. This will be true for any PDE.

Notice the shape of the solution: an arbitrary function of points in the xy plane, which is normal to the z axis, combined with the solution of an ODE in the z direction.

Now consider the slightly more complex PDE

a_x u_x +a_y u_y + a_z u_z=h(u) \quad (2)

where h can be any function, and each a is a real constant.

We recognize the left hand side as being a·∇u, so this equation says that the derivative of u in the a direction is h(u). Comparing this with the first equation suggests that the solution can be written as an arbitrary function on the plane normal to a combined with the solution of an ODE.

Remembering from Calculus/Vectors that any vector r can be split up into components parallel and perpendicular to a,

\mathbf{r}= \mathbf{r}_\perp + \mathbf{r}_\| =  \left( \mathbf{r}-\frac{(\mathbf{r} \cdot \mathbf{a})\mathbf{a}}{|\mathbf{a}|^2} \right) + \frac{(\mathbf{r} \cdot \mathbf{a})\mathbf{a}}{|\mathbf{a}|^2}

we will use this to split the components of r in a way suggested by the analogy with (1).

Let's write

\mathbf{r} = (x,y,z) = \mathbf{r}_\perp + s\mathbf{a} \quad 
s =  \frac{\mathbf{r} \cdot \mathbf{a}}{\mathbf{a} \cdot \mathbf{a}}

and substitute this into (2), using the chain rule. Because we are only differentiating in the a direction, adding any function of the perpendicular vector to s will make no difference.

First we calculate grad s, for use in the chain rule,

\nabla s = \frac{\mathbf{a}}{a^2}

On making the substitution into (2), we get,

h(u)= \mathbf{a} \cdot \nabla s \frac{d}{ds} u(s)=
\frac{\mathbf{a} \cdot \mathbf{a}}{\mathbf{a} \cdot \mathbf{a}} \frac{d}{ds} u(s)= \frac{du}{ds}

which is an ordinary differential equation with the solution

s=c(\mathbf{r}_\perp) + \int^u \frac{dt}{h(t)}

The constant c can depend on the perpendicular components, but not upon the parallel coordinate. Replacing s with a monotonic scalar function of s multiplies the ODE by a function of s, which doesn't affect the solution.

Example:

u_x(x,t)=u_t(x,t)

For this equation, a is (1, -1), s=x-t, and the perpendicular vector is (x+t)(1, 1). The reduced ODE is du/ds=0 so the solution is

u=f(x+t)

To find f we need initial conditions on u. Are there any constraints on what initial conditions are suitable?

Consider, if we are given

  • u(x,0), this is exactly f(x),
  • u(3t,t), this is f(4t) and f(t) follows immediately
  • u(t3+2t,t), this is f(t3+3t) and f(t) follows, on solving the cubic.
  • u(-t,t), then this is f(0), so if the given function isn't constant we have an inconsistency, and if it is constant the solution isn't specified off the initial line.

Similarly, if we are given u on any curve which the lines x+t=c intersect only once, and to which they are not tangent, we can deduce f.
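The form of the solution u = f(x+t) is easy to confirm numerically. The sketch below takes f = sin as an illustrative choice and checks u_x = u_t by central differences at an arbitrary point:

```python
import math

# Check that u(x, t) = f(x + t), with the illustrative choice f = sin,
# satisfies u_x = u_t, via central differences at an arbitrary point.
def u(x, t):
    return math.sin(x + t)

h = 1e-6
x0, t0 = 0.3, 1.1
ux = (u(x0 + h, t0) - u(x0 - h, t0)) / (2 * h)
ut = (u(x0, t0 + h) - u(x0, t0 - h)) / (2 * h)
assert abs(ux - ut) < 1e-8
```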

For any first order PDE with constant coefficients, the same will be true. We will have a set of lines, parallel to r=at, along which the solution is gained by integrating an ODE with initial conditions specified on some surface to which the lines aren't tangent.

If we look at how this works, we'll see we haven't actually used the constancy of a, so let's drop that assumption and look for a similar solution.

The important point was that the solution was of the form u=f(x(s),y(s)), where (x(s),y(s)) is the curve we integrated along -- a straight line in the previous case. We can add constant functions of integration to s without changing this form.

Consider a PDE,

a(x,y)u_x+b(x,y)u_y=c(x,y,u)

For the suggested solution, u=f(x(s),y(s)), the chain rule gives

  \frac{du}{ds} = \frac{dx}{ds}u_x+ \frac{dy}{ds}u_y

Comparing coefficients then gives

 \frac{dx}{ds}=a(x,y) \quad \frac{dy}{ds}=b(x,y) \quad \frac{du}{ds}=c(x,y,u)

so we've reduced our original PDE to a set of simultaneous ODE's. This procedure can be reversed.

The curves (x(s),y(s)) are called characteristics of the equation.

Example: Solve yu_x=xu_y given u=f(x) for x≥0. The ODE's are

 \frac{dx}{ds}=y \quad \frac{dy}{ds}=-x \quad \frac{du}{ds}=0

subject to the initial conditions at s=0,

x(0)=r \quad y(0)=0 \quad u(0)=f(r) \quad r \ge 0

This ODE is easily solved, giving

x(s)=r \cos s \quad y(s)=-r \sin s \quad u(s)=f(r)

so the characteristics are concentric circles round the origin, and in polar coordinates u(r,θ)=f(r)
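We can confirm that any function constant on these circles solves the PDE. The sketch below takes f = exp as an illustrative choice and checks y u_x = x u_y by central differences at an arbitrary point:

```python
import math

# Check that u(x, y) = f(sqrt(x^2 + y^2)) satisfies y u_x = x u_y, with
# the illustrative choice f = exp, via central differences.
def u(x, y):
    return math.exp(math.hypot(x, y))

h = 1e-6
x0, y0 = 0.8, 0.6
ux = (u(x0 + h, y0) - u(x0 - h, y0)) / (2 * h)
uy = (u(x0, y0 + h) - u(x0, y0 - h)) / (2 * h)
residual = y0 * ux - x0 * uy
assert abs(residual) < 1e-6
```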

Considering the logic of this method, we see that the independence of a and b from u has not been used either, so that assumption too can be dropped, giving the general method for equations of this quasilinear form.

Quasilinear

Summarising the conclusions of the last section, to solve a PDE

 a_1(u,\mathbf{x})  \frac {\partial u}{\partial x_1}
+ a_2(u,\mathbf{x}) \frac {\partial u}{\partial x_2} \cdots +
a_n(u,\mathbf{x}) \frac {\partial u}{\partial x_n} = 
b(u,\mathbf{x})

subject to the initial condition that on the surface (x1(r1,…,rn-1), …, xn(r1,…,rn-1)), u=f(r1,…,rn-1) --this being an arbitrary parametrisation of the initial surface--

  • we transform the equation to the equivalent set of ODEs,
\frac{dx_1}{ds}=a_1 \ \ldots \ \frac{dx_n}{ds}=a_n \quad \frac{du}{ds}=b
subject to the initial conditions
x_i(0)=x_i(r_1, \ldots ,r_{n-1}) \quad u(0)=f(r_1,r_2, \ldots r_{n-1})
  • Solve the ODE's, giving xi as a function of s and the ri.
  • Invert this to get s and the ri as functions of the xi.
  • Substitute these inverse functions into the expression for u as a function of s and the ri obtained in the second step.

Both the second and third steps may be troublesome.

The set of ODEs is generally non-linear and without analytical solution. It may even be easier to work with the PDE than with the ODEs.

In the third step, the ri together with s form a coordinate system adapted for the PDE. We can only make the inversion at all if the Jacobian of the transformation to Cartesian coordinates is not zero,

\begin{vmatrix}
\frac{\partial x_1}{\partial r_1}&\cdots &\frac{\partial x_1}{\partial r_{n-1}} & a_1 \\
\vdots & \ddots & & \vdots \\
\frac{\partial x_n}{\partial r_1}&\cdots &\frac{\partial x_n}{\partial r_{n-1}} & a_n \end{vmatrix} \ne 0

This is equivalent to saying that the vector (a1, …, an) is never in the tangent plane to a surface of constant s.

Even if this condition holds when s=0, it may fail as the equations are integrated. We will soon consider ways of dealing with the problems this can cause.

Even when it is technically possible to invert the algebraic equations it is obviously inconvenient to do so.

Example

To see how this works in practice, we will
a/ consider the PDE,

uu_x+u_y+u_t=0

with generic initial condition,

u=f(x,y) \mbox{ on } t=0

Naming variables for future convenience, the corresponding ODE's are

\frac{dx}{d\tau}=u \quad \frac{dy}{d\tau}=1 \quad \frac{dt}{d\tau}=1 \quad \frac{du}{d\tau}=0 \quad

subject to the initial conditions at τ=0

x=r \quad y=s \quad t=0 \quad u=f(r,s)

These ODE's are easily solved to give

x=r+f(r,s)\tau \quad y=s+\tau \quad t=\tau \quad u=f(r,s)

These are the parametric equations of a set of straight lines, the characteristics.

The determinant of the Jacobian of this coordinate transformation is

\begin{vmatrix} 
1+\tau\frac{\partial f}{\partial r} & \tau\frac{\partial f}{\partial s} & f \\
0 & 1 & 1 \\ 0 & 0 & 1 \end{vmatrix}=1+\tau\frac{\partial f}{\partial r}

This determinant is 1 when t=0, but if fr is anywhere negative this determinant will eventually be zero, and this solution fails.

In this case, the failure is because the surface τf_r=-1 is an envelope of the characteristics.

For arbitrary f we can invert the transformation and obtain an implicit expression for u

u=f(x-tu,y-x)

If f is given this can be solved for u.

1/ f(x,y)=ax, The implicit solution is

u=a(x-tu) \Rightarrow u=\frac{ax}{1+at}

This is a line in the u-x plane, rotating clockwise as t increases. If a is negative, this line eventually becomes vertical. If a is positive, this line tends towards u=0, and the solution is valid for all t.
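As a numerical sanity check (a sketch in Python; the sample point and the value of a are arbitrary assumptions), we can verify by central finite differences that this explicit solution satisfies the PDE.

```python
# Finite-difference check that u = a x/(1 + a t), the solution for f(x,y) = a x,
# satisfies u u_x + u_y + u_t = 0.
a = 0.5
def u(x, y, t):
    return a * x / (1 + a * t)

h = 1e-6
x, y, t = 1.2, 0.4, 3.0
ux = (u(x + h, y, t) - u(x - h, y, t)) / (2 * h)
uy = (u(x, y + h, t) - u(x, y - h, t)) / (2 * h)
ut = (u(x, y, t + h) - u(x, y, t - h)) / (2 * h)
residual = u(x, y, t) * ux + uy + ut
print(abs(residual) < 1e-6)
```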

2/ f(x,y)=x2, The implicit solution is

u=(x-tu)^2  \Rightarrow u=\frac{1+2tx-\sqrt{1+4tx}}{2t^2}


This solution clearly fails when 1+4tx<0, which first happens where \tau f_r=-1. For any t>0 this happens somewhere. As t increases this point of failure moves toward the origin.

Notice that the point where u=0 stays fixed. This is true for any solution of this equation, whatever f is.
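We can check the explicit formula for this case numerically (a Python sketch; the sample point, chosen with 1+4tx>0, is an assumption for illustration): the formula should satisfy the implicit equation exactly.

```python
import math

# Check that u = (1 + 2tx - sqrt(1 + 4tx)) / (2 t^2) solves the implicit
# equation u = (x - t u)^2 at a point where 1 + 4tx > 0.
def u(x, t):
    return (1 + 2 * t * x - math.sqrt(1 + 4 * t * x)) / (2 * t * t)

x, t = 0.8, 1.5
lhs = u(x, t)
rhs = (x - t * lhs) ** 2
print(abs(lhs - rhs) < 1e-12)
```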

We will see later that we can find a solution after this time, if we consider discontinuous solutions. We can think of this as a shockwave.

3/ f(x,y)=\sin (xy)
The implicit solution is

u(x,y,t)=\sin \left( (x-tu)(y-x) \right)

and we cannot solve this explicitly for u. The best we can manage is a numerical solution of this equation.
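One simple numerical approach is fixed-point iteration (a sketch in Python; the sample point is an assumption, chosen so that |t(y-x)| < 1, which makes the iteration a contraction).

```python
import math

# Fixed-point iteration u <- sin((x - t u)(y - x)).  The derivative of the
# right hand side with respect to u is -t (y - x) cos(...), so for
# |t (y - x)| < 1 the map contracts and the iteration converges.
def solve_u(x, y, t, tol=1e-12, max_iter=200):
    u = 0.0
    for _ in range(max_iter):
        u_new = math.sin((x - t * u) * (y - x))
        if abs(u_new - u) < tol:
            return u_new
        u = u_new
    raise RuntimeError("fixed-point iteration did not converge")

x, y, t = 1.0, 2.0, 0.2
u = solve_u(x, y, t)
print(abs(u - math.sin((x - t * u) * (y - x))) < 1e-10)
```

For larger t the iteration may fail to contract, and a bracketing method such as bisection would be a more robust choice.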

b/We can also consider the closely related PDE

uu_x+u_y+u_t=y

The corresponding ODE's are

\frac{dx}{d\tau}=u \quad \frac{dy}{d\tau}=1 \quad \frac{dt}{d\tau}=1 \quad \frac{du}{d\tau}=y \quad

subject to the initial conditions at τ=0

x=r \quad y=s \quad t=0 \quad u=f(r,s)

These ODE's are easily solved to give

x=r+\tau f+\frac{1}{2}s\tau^2+\frac{1}{6}\tau^3 \quad 
y=s+\tau \quad t=\tau \quad u=f+s\tau+\frac{1}{2}\tau^2

Writing f in terms of u, s, and τ, then substituting into the equation for x gives an implicit solution

u(x,y,t)=f(x-ut+\frac{1}{2}yt^2-\frac{1}{6}t^3,y-t)+yt-\frac{1}{2}t^2

It is possible to solve this for u in some special cases, but in general we can only solve this equation numerically. However, we can learn much about the global properties of the solution from further analysis.

Characteristic initial value problems

What if initial conditions are given on a characteristic, on an envelope of characteristics, or on a surface with characteristic tangents at isolated points?

Discontinuous solutions

So far, we've only considered smooth solutions of the PDE, but this is too restrictive. We may encounter initial conditions which aren't smooth, e.g.

u_t = c u_x \quad u(x,0)= 
\left\{ \begin{matrix} 1, & x \ge 0 \\ 0, & x < 0 \end{matrix} \right.

If we were to simply use the general solution of this equation for smooth initial conditions,

 u(x,t) = u(x + ct, 0) \,

we would get

 u(x,t) = 
\left\{ \begin{matrix} 1, & x+ct \ge 0  \\ 0, & x+ct < 0 \end{matrix} \right.

which appears to be a solution to the original equation. However, the partial derivatives are undefined on the characteristic x+ct=0, so it is unclear what it means to say that the equation holds at that point.
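Away from that characteristic the candidate solution does satisfy the equation, which we can confirm numerically (a Python sketch; the sample points, chosen off the characteristic, are assumptions for illustration).

```python
# Away from the characteristic x + c t = 0 the advected step u(x,t) = u(x+ct, 0)
# is locally constant, so u_t and c u_x both vanish and u_t = c u_x holds
# trivially there.
c = 1.0

def u0(x):
    return 1.0 if x >= 0 else 0.0

def u(x, t):
    return u0(x + c * t)

h = 1e-6
ok = True
for (x, t) in [(0.5, 0.2), (-0.9, 0.3)]:   # points off the characteristic
    ut = (u(x, t + h) - u(x, t - h)) / (2 * h)
    ux = (u(x + h, t) - u(x - h, t)) / (2 * h)
    ok = ok and abs(ut - c * ux) < 1e-9
print(ok)
```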

We need to investigate further, starting by considering the possible types of discontinuities.

If we look at the derivations above, we see that we never used any second or higher order derivatives, so it doesn't matter if they aren't continuous; the results above will still apply.

The next simplest case is when the function is continuous, but the first derivative is not, e.g. |x|. We'll initially restrict ourselves to the two-dimensional case, u(x, t) for the generic equation.

a(x,t)u_x + b(x,t)u_t = c(u,x,t) \quad (1)

Typically, the discontinuity is not confined to a single point, but is shared by all points on some curve, (x0(s), t0(s) )

Then we have

\begin{matrix}
x>x_0 & \lim_{x \to x_0} u = u_+ \\
x<x_0 & \lim_{x \to x_0} u = u_-
\end{matrix}

We can then compare u and its derivatives on both sides of this curve.

It will prove useful to name the jumps across the discontinuity. We say

 [u]=u_+-u_- \quad [u_x]=u_{x+}-u_{x-} \quad  [u_t]=u_{t+}-u_{t-}

Now, since the equation (1) is true on both sides of the discontinuity, we can see that both u+ and u-, being the limits of solutions, must themselves satisfy the equation. That is,

\begin{matrix}
a(x,t)u_{+x} + b(x,t)u_{+t} & = & c(u_+,x,t) \\
a(x,t)u_{-x} + b(x,t)u_{-t} & = & c(u_-,x,t)
\end{matrix} \mbox{  where  } 
\begin{matrix}x=x_0(s) \\ t=t_0(s)\end{matrix}

Subtracting then gives us an equation for the jumps in the differentials

a(x,t)[u_x] + b(x,t)[u_t] =  0 \,

We are considering the case where u itself is continuous so we know that [u]=0. Differentiating this with respect to s will give us a second equation in the differential jumps.

 \frac{dx_0}{ds}[u_x] + \frac{dt_0}{ds}[u_t] =0

The last two equations can both be true only if one is a multiple of the other. Multiplying s by a constant multiplies the second equation by that same constant while leaving the curve of discontinuity unchanged, so without loss of generality we can define s to be such that

 \frac{dx_0}{ds}=a \quad  \frac{dt_0}{ds}=b

But these are the equations for a characteristic, i.e. discontinuities propagate along characteristics. We could use this property as an alternative definition of characteristics.

We can deal similarly with discontinuous functions by first writing the equation in conservation form, so called because conservation laws can always be written this way.

 (au)_x+(bu)_t=a_x u + b_t u +c \quad (2)

Notice that the left hand side can be regarded as the divergence of (au, bu). Writing the equation this way allows us to use the theorems of vector calculus.

Consider a narrow strip R with sides parallel to the discontinuity and width h.

We can integrate both sides of (2) over R, giving

\int_R (au)_x+(bu)_t \, dx dt = \int_R (a_x + b_t) u +c \, dx dt

Next we use Green's theorem to convert the left hand side into a line integral.

\oint_{\partial R} au dt - bu dx = \int_R (a_x  + b_t) u +c \, dx dt

Now we let the width of the strip fall to zero. The right hand side also tends to zero but the left hand side reduces to the difference between two integrals along the part of the boundary of R parallel to the curve.

 \int au_+dt-bu_+dx - \int  au_-dt-bu_-dx =  0

The integrals along the opposite sides of R have different signs because they are in opposite directions.

For the last equation to always be true, the integrand must always be zero, i.e.

 \left( a \frac{dt_0}{ds} - b \frac{dx_0}{ds} \right) [u] = 0

Since, by assumption [u] isn't zero, the other factor must be, which immediately implies the curve of discontinuity is a characteristic.

Once again, discontinuities propagate along characteristics.

Above, we only considered functions of two variables, but it is straightforward to extend this to functions of n variables.

The initial condition is given on an n-1 dimensional surface, which evolves along the characteristics. Typical discontinuities in the initial condition will lie on a n-2 dimensional surface embedded within the initial surface. This surface of discontinuity will propagate along the characteristics that pass through the initial discontinuity.

The jumps themselves obey ordinary differential equations, much as u itself does on a characteristic. In the two dimensional case, for u continuous but not smooth, a little algebra shows that

\frac{d[u_x]}{ds}=[u_x]\left( \frac{\partial c}{\partial u} + a\frac{b_x}{b}-a_x \right)

while u obeys the same equation as before,

\frac{du}{ds}=c

We can integrate these equations to see how the discontinuity evolves as we move along the characteristic.

We may find that, for some future s, [ux] passes through zero. At such points, the discontinuity has vanished, and we can treat the function as smooth at that characteristic from then on.

Conversely, we can expect that smooth functions may, under the right circumstances, become discontinuous.

To see how all this works in practice we'll consider the solutions of the equation

u_t+uu_x=0 \quad u(x,0)=f(x)

for three different initial conditions.

The general solution, using the techniques outlined earlier, is

 u=f(x-tu) \,\!

u is constant on the characteristics, which are straight lines with slope dependent on u.

First consider f such that

 f(x)= \left\{ \begin{matrix} 
1 & x>a \\ \frac{x}{a} & a \ge x > 0 \\ 0 & x \le 0 \end{matrix} \right.
\quad a>0

While u is continuous its derivative is discontinuous at x=0, where u=0, and at x=a, where u=1. The characteristics through these points divide the solution into three regions.

All the characteristics to the right of the characteristic through x=a, t=0 intersect the x-axis to the right of x=a, where u=1, so u is 1 on all those characteristics, i.e. whenever x-t>a.

Similarly the characteristic through the origin is the line x=0, to the left of which u remains zero.

We could find the value of u at a point in between those two characteristics either by finding which intermediate characteristic it lies on and tracing it back to the initial line, or via the general solution.

Either way, we get

 u(x,t)= \left\{ \begin{matrix} 
1 & x-t>a \\ \frac{x}{a+t} & a+t \ge x > 0 \\ 0 & x \le 0 \end{matrix} \right.

At larger t the solution u is more spread out than at t=0 but still the same shape.

We can also consider what happens when a tends to 0, so that u itself is discontinuous at x=0.

If we write the PDE in conservation form then use Green's theorem, as we did above for the linear case, we get

[u]\frac{dx_0}{ds}=\frac{1}{2}[u^2]\frac{dt_0}{ds}

[u²] is the difference of two squares, so if we take s=t we get

\frac{dx_0}{dt}=\frac{1}{2}(u_-+u_+)

In this case the discontinuity behaves as if the value of u on it were the average of the limiting values on either side.

However, there is a caveat.

Since the limiting value to the left is u- the discontinuity must lie on that characteristic, and similarly for u+; i.e the jump discontinuity must be on an intersection of characteristics, at a point where u would otherwise be multivalued.

For this PDE the characteristic can only intersect on the discontinuity if

u_- > u_+ \,

If this is not true the discontinuity can not propagate. Something else must happen.

The limit a=0 is an example of a jump discontinuity for which this condition is false, so we can see what happens in such cases by studying it.

Taking the limit of the solution derived above gives

 u(x,t)= \left\{ \begin{matrix} 
1 & x>t \\ \frac{x}{t} & t \ge x > 0 \\ 0 & x \le 0 \end{matrix} \right.

If we had taken the limit of any other sequence of initial conditions tending to the same limit we would have obtained a trivially equivalent result.

Looking at the characteristics of this solution, we see that at the jump discontinuity, characteristics on which u takes every value between 0 and 1 all intersect.

At later times, there are two slope discontinuities, at x=0 and x=t, but no jump discontinuity.

This behaviour is typical in such cases. The jump discontinuity becomes a pair of slope discontinuities between which the solution takes all appropriate values.

Now, let's consider the same equation with the initial condition

 f(x)= \left\{ \begin{matrix} 
1 & x \le 0 \\ 1-\frac{x}{a} & a \ge x > 0 \\ 0 & x > a \end{matrix} \right.
\quad a>0

This has slope discontinuities at x=0 and x=a, dividing the solution into three regions.

The boundaries between these regions are given by the characteristics through these initial points, namely the two lines

 x=t \quad x=a

These characteristics intersect at t=a, so the nature of the solution must change then.

In between these two discontinuities, the characteristic through x=b at t=0 is clearly

 x=\left( 1- \frac{b}{a} \right) t + b \quad 0 \le b \le a

All these characteristics intersect at the same point, (x,t)=(a,a).

We can use these characteristics, or the general solution, to write u for t<a

 u(x,t)= \left\{ \begin{matrix} 
1 & x \le t \\ \frac{a-x}{a-t} & a \ge x > t \\ 0 & x > a \end{matrix} \right.
\quad a >t \ge 0

As t tends to a, this becomes a step function. Since u is greater to the left of the discontinuity than to the right, it meets the condition for propagation deduced above, so for t>a, u is a step function moving at the average speed of the two sides.

 u(x,t)= \left\{ \begin{matrix}
1 & x \le \frac{a+t}{2} \\ 0 & x > \frac{a+t}{2} \end{matrix} \right.
\quad t \ge a \ge 0

This is the reverse of what we saw for the initial condition previously considered, two slope discontinuities merging into a step discontinuity rather than vice versa. Which actually happens depends entirely on the initial conditions. Indeed, examples could be given for which both processes happen.
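The average-speed rule can be checked numerically by evolving the step data with a conservative scheme (a Python sketch; the grid, the upwind flux for non-negative u, and the final time are all choices made for illustration).

```python
# Conservative upwind (Godunov for u >= 0) scheme for u_t + (u^2/2)_x = 0
# with step data u = 1 for x <= 0, u = 0 for x > 0.  The shock should travel
# at the average speed (u_- + u_+)/2 = 1/2, so after t = 1 it sits near x = 0.5.
N = 400
xmin, xmax = -1.0, 2.0
dx = (xmax - xmin) / N
centres = [xmin + (i + 0.5) * dx for i in range(N)]
u = [1.0 if xc <= 0.0 else 0.0 for xc in centres]

t, t_end = 0.0, 1.0
dt = 0.5 * dx                               # CFL condition: max |u| = 1
while t < t_end - 1e-12:
    step = min(dt, t_end - t)
    flux = [ui * ui / 2.0 for ui in u]      # upwind flux, valid since u >= 0
    new = u[:]                              # left boundary cell held at u = 1
    for i in range(1, N):
        new[i] = u[i] - (step / dx) * (flux[i] - flux[i - 1])
    u = new
    t += step

# Since u = 1 left of the shock and 0 right of it, the conserved mass locates it.
shock = xmin + sum(u) * dx
print(abs(shock - 0.5) < 0.05)
```

Because the scheme is conservative, the mass-based estimate of the shock position is unaffected by the numerical smearing of the jump.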

In the two examples above, we started with a discontinuity and investigated how it evolved. It is also possible for solutions which are initially smooth to become discontinuous.

For example, we saw earlier for this particular PDE that the solution with the initial condition u=x² breaks down when 4xt+1=0. At these points the solution becomes discontinuous.


Typically, discontinuities in the solution of any partial differential equation, not merely ones of first order, arise when solutions break down in this way and propagate similarly, merging and splitting in the same fashion.

Fully non-linear PDEs

It is possible to extend the approach of the previous sections to reduce any equation of the form

F(x_1,x_2, \ldots ,x_n,u,u_{x_1},u_{x_2}, \ldots , u_{x_n})=0

to a set of ODE's, for any function F.

We will not prove this here, but the corresponding ODE's are

\frac{dx_i}{d\tau}=\frac{\partial F}{\partial u_i} \quad \frac{du_i}{d\tau} = -\left( \frac{\partial F}{\partial x_i} + 
u_i \frac{\partial F}{\partial u} \right) \quad
\frac{du}{d\tau}=\sum_{i=1}^n u_i \frac{\partial F}{\partial u_i}

If u is given on a surface parameterized by r1, …, rn-1 then we have, as before, n initial conditions on the n xi,

\tau=0 \quad x_i=f_i(r_1,r_2, \ldots ,r_{n-1})

given by the parameterization and one initial condition on u itself,

\tau=0 \quad u=f(r_1,r_2, \ldots ,r_{n-1})

but, because we have an extra n ODEs for the ui's, we need an extra n initial conditions.

These are, n-1 consistency conditions,

\tau=0 \quad \frac{\partial f}{\partial r_j}=
\sum_{i=1}^{n} u_i \frac{\partial f_i}{\partial r_j}

which state that the ui's are the partial derivatives of u on the initial surface, and one initial condition

\tau=0 \quad F(x_1,x_2, \ldots ,x_n,u,u_1,u_2, \ldots , u_n)=0

stating that the PDE itself holds on the initial surface.

These n initial conditions for the ui will be a set of algebraic equations, which may have multiple solutions. Each solution will give a different solution of the PDE.

Example

Consider

u_t=u_x^2+u_y^2, \quad u(x,y,0)=x^2+y^2

The initial conditions at τ=0 are

\begin{matrix} x=r & y=s & t=0 & u=r^2+s^2 \\
u_x=2r & u_y=2s & u_t=4(r^2+s^2) & \end{matrix}

and the ODE's are

\begin{matrix} 
\frac{dx}{d\tau}=-2u_x & \frac{dy}{d\tau}=-2u_y & \frac{dt}{d\tau}=1 
& \frac{du}{d\tau}=u_t-2(u_x^2+u_y^2)\\
\frac{du_x}{d\tau}=0 & \frac{du_y}{d\tau}=0 & \frac{du_t}{d\tau}=0 &  \end{matrix}

Note that the partial derivatives are constant on the characteristics. This always happens when the PDE contains only the partial derivatives, without x or u appearing explicitly, and it simplifies the procedure.

These equations are readily solved to give

x=r(1-4\tau ) \quad y=s(1-4\tau ) \quad t=\tau \quad u=(r^2+s^2)(1-4\tau )

On eliminating the parameters we get the solution,

u=\frac{x^2+y^2}{1-4t}

which can easily be checked.
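For instance, a finite-difference check in Python (the sample point, away from t=1/4, is an assumption for illustration):

```python
# Finite-difference check that u = (x^2 + y^2)/(1 - 4t) satisfies
# u_t = u_x^2 + u_y^2.
def u(x, y, t):
    return (x * x + y * y) / (1 - 4 * t)

h = 1e-5
x, y, t = 0.6, -0.3, 0.1
ut = (u(x, y, t + h) - u(x, y, t - h)) / (2 * h)
ux = (u(x + h, y, t) - u(x - h, y, t)) / (2 * h)
uy = (u(x, y + h, t) - u(x, y - h, t)) / (2 * h)
print(abs(ut - (ux * ux + uy * uy)) < 1e-5)
```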

Second order

Suppose we are given a second order linear PDE to solve

a(x,y)u_{xx}+b(x,y)u_{xy}+c(x,y)u_{yy} =  d(x,y)u_x+e(x,y)u_y +p(x,y)u +q(x,y) \quad (1)

The natural approach, after our experience with ordinary differential equations and with simple algebraic equations, is to attempt a factorisation. Let's see how far this takes us.

We would expect factoring the left hand side of (1) to give us an equivalent equation of the form

 a(x,y)(D_x+\alpha_+(x,y)D_y)(D_x+\alpha_-(x,y)D_y)u \,

and we can immediately divide through by a. This suggests that those particular combinations of first order derivatives will play a special role.

Now, when studying first order PDE's we saw that such combinations were equivalent to the derivatives along characteristic curves. Effectively, we changed to a coordinate system defined by the characteristic curve and the initial curve.

Here, we have two combinations of first order derivatives each of which may define a different characteristic curve. If so, the two sets of characteristics will define a natural coordinate system for the problem, much as in the first order case.

In the new coordinates we will have

 D_x+\alpha_+(x,y)D_y=D_r \quad D_x+\alpha_-(x,y)D_y =D_s

with each of the factors having become a differentiation along its respective characteristic curve, and the left hand side will become simply urs giving us an equation of the form

u_{rs}=A(r,s)u_r+B(r,s)u_s+C(r,s)u+D(r,s) \,

If A, B, and C all happen to be zero, the solution is obvious. If not, we can hope that the simpler form of the left hand side will enable us to make progress.

However, before we can do all this, we must see if (1) can actually be factored.

Multiplying out the factors gives

u_{xx}+\frac{b(x,y)}{a(x,y)}u_{xy}+\frac{c(x,y)}{a(x,y)}u_{yy}= u_{xx}+(\alpha_++\alpha_-)u_{xy}+\alpha_+\alpha_-u_{yy}

On comparing coefficients, and solving for the α's we see that they are the roots of

a(x,y)\alpha^2+b(x,y)\alpha+c(x,y)=0 \,

Since we are discussing real functions, we are only interested in real roots, so the existence of the desired factorization will depend on the discriminant of this quadratic equation.

  • \mbox{If }  b(x,y)^2 > 4 a(x,y)c(x,y)
then we have two factors, and can follow the procedure outlined above. Equations like this are called hyperbolic.
  • \mbox{If }  b(x,y)^2 = 4 a(x,y)c(x,y)
then we have only one factor, giving us a single characteristic curve. It will be natural to use distance along these curves as one coordinate, but the second must be determined by other considerations.
The same line of argument as before shows that using the characteristic curves this way gives a second order term of the form urr, where we've only taken the second derivative with respect to one of the two coordinates. Equations like this are called parabolic.
  • \mbox{If }  b(x,y)^2 < 4 a(x,y)c(x,y)
then we have no real factors. In this case the best we can do is reduce the second order terms to the simplest possible form satisfying this inequality, i.e. urr+uss.
It can be shown that this reduction is always possible. Equations like this are called elliptic.
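The classification above amounts to checking the sign of the discriminant at a point, which can be expressed as a small helper (a Python sketch; the function name and the three test equations are illustrative):

```python
def classify(a, b, c):
    """Classify a u_xx + b u_xy + c u_yy + (lower order terms) = 0 at a point
    by the sign of the discriminant b^2 - 4ac."""
    disc = b * b - 4 * a * c
    if disc > 0:
        return "hyperbolic"
    if disc == 0:
        return "parabolic"
    return "elliptic"

print(classify(1, 0, 1))    # u_xx + u_yy = 0  (Laplace's equation)
print(classify(1, 0, -1))   # u_xx - u_yy = 0  (wave equation)
print(classify(1, 0, 0))    # u_xx - u_y  = 0  (diffusion equation)
```

When a, b, and c are functions of x and y the classification can vary from point to point, which is exactly the mixed-type situation mentioned below.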

It can be shown that, just as for first order PDEs, discontinuities propagate along characteristics. Since elliptic equations have no real characteristics, this implies that any discontinuities they may have will be restricted to isolated points; i.e., that the solution is almost everywhere smooth.

This is not true for hyperbolic equations. Their behavior is largely controlled by the shape of their characteristic curves.

These differences mean different methods are required to study the three types of second order equation. Fortunately, changing variables as indicated by the factorisation above lets us reduce any second order PDE to one in which the coefficients of the second order terms are constant, which means it is sufficient to consider only three standard equations.

u_{xx}+u_{yy}=0 \quad u_{xx}-u_{yy}=0 \quad u_{xx}-u_y=0

We could also consider the cases where the right hand side of these equations is a given function, or proportional to u or to one of its first order derivatives, but all the essential properties of hyperbolic, parabolic, and elliptic equations are demonstrated by these three standard forms.

While we've only demonstrated the reduction in two dimensions, a similar reduction applies in higher dimensions, leading to a similar classification. We get, as the reduced form of the second order terms,

a_1\frac{\partial^2 u}{\partial x_1^2}+ 
a_2\frac{\partial^2 u}{\partial x_2^2} + \cdots + 
a_n\frac{\partial^2 u}{\partial x_n^2}

where each of the ai's is equal to either 0, +1, or -1.

If all the ai's are non-zero and have the same sign the equation is elliptic.

If any of the ai's are zero the equation is parabolic.

If exactly one of the ai's has the opposite sign to the rest the equation is hyperbolic.

In 2 or 3 dimensions these are the only possibilities, but in 4 or more dimensions there is a fourth possibility: at least two of the ais are positive, and at least two of the ais are negative.

Such equations are called ultrahyperbolic. They are less commonly encountered than the other three types, so will not be studied here.

When the coefficients are not constant, an equation can be hyperbolic in some regions of the xy plane, and elliptic in others. If so, different methods must be used for the solutions in the two regions.


Elliptic

Standard form, Laplace's equation:

\nabla^2 h=0

Quote the equation in spherical and cylindrical coordinates, and give the full solution for Cartesian and cylindrical coordinates. Note the averaging property. Comment on the physical significance and the rotation invariance of the Laplacian.

Hyperbolic

Standard form, wave equation:

\nabla^2 h=\frac{1}{c^2} h_{tt}

Solution, any sum of functions of the form

h=f(\mathbf{k}\cdot \mathbf{x}-\omega t) \quad \omega=|\mathbf{k}|c

These are waves. Compare with solution from separating variables. Domain of dependence, etc.

Parabolic

The canonical parabolic equation is the diffusion equation:

\nabla^2 h=h_t

Here, we will consider some simple solutions of the one-dimensional case.

The properties of this equation are in many respects intermediate between those of hyperbolic and elliptic equations.

As with hyperbolic equations but not elliptic, the solution is well behaved if the value is given on the initial surface t=0.

However, the characteristic surfaces of this equation are the surfaces of constant t, thus there is no way for discontinuities to propagate to positive t.

Therefore, as with elliptic equations but not hyperbolic, the solutions are typically smooth, even when the initial conditions aren't.

Furthermore, at a local maximum of h, its Laplacian is negative, so h is decreasing with t, while at local minima, where the Laplacian will be positive, h will increase with t. Thus, initial variations in h will be smoothed out as t increases.

In one dimension, we can learn more by integrating both sides,

\begin{matrix}
\int_{-a}^b h_t \,dx & = & \int_{-a}^b h_{xx} \,dx \\
\frac{d}{dt} \int_{-a}^b h \,dx & = & \left[ h_x \right]_{-a}^b
\end{matrix}

Provided that hx tends to zero for large x, we can take the limit as a and b tend to infinity, deducing

\frac{d}{dt} \int_{-\infty}^\infty h \,dx = 0

so the integral of h over all space is constant.

This means this PDE can be thought of as describing some conserved quantity, initially concentrated but spreading out, or diffusing, over time.

This last result can be extended to two or more dimensions, using the theorems of vector calculus.

We can also differentiate any solution with respect to any coordinate to obtain another solution. E.g. if h is a solution then

\nabla^2 h_x = \partial_x \nabla^2 h = \partial_x \partial_t h 
=\partial_t h_x

so hx also satisfies the diffusion equation.

Similarity solution

Looking at this equation, we might notice that if we make the change of variables

 r = \alpha x \quad \tau =\alpha^2 t

then the equation retains the same form. This suggests that the combination of variables x²/t, which is unaffected by this variable change, may be significant.

We therefore assume this equation to have a solution of the special form

h(x,t)=f(\eta) \mbox{ where } \eta = \frac{x}{t^{1 \over 2}}

then

h_x = \eta_x f_{\eta} = t^{-{1\over 2}}f_{\eta} \quad
h_t = \eta_t f_{\eta} = -\frac{\eta}{2t}f_{\eta}

and substituting into the diffusion equation eventually gives

 f_{\eta \eta} + \frac{\eta}{2} f_{\eta}=0

which is an ordinary differential equation.

Integrating once gives

f_{\eta} = A e^{-\frac{\eta^2}{4}}

Reverting to h, we find

\begin{matrix}
h_x & = & \frac{A}{\sqrt{t}} e^{-\frac{\eta^2}{4}} \\
h & = & \frac{A}{\sqrt{t}} \int_{-\infty }^x e^{-s^2/4t} ds +B \\ 
& = & 2A \int_{-\infty }^{x/2\sqrt{t}} e^{-z^2} dz +B
\end{matrix}

This last integral cannot be written in terms of elementary functions, but its values are well known.

In particular the limiting values of h at infinity are

h(-\infty ,t)=B \quad h(\infty ,t)= B + 2A \sqrt{\pi } ,

taking the limit as t tends to zero gives

h= \left\{ \begin{matrix}
B & x<0 \\ B + 2A \sqrt{\pi } & x>0 \end{matrix} \right.


We see that the initial discontinuity is immediately smoothed out. The solution at later times retains the same shape, but is more stretched out.

The derivative of this solution with respect to x

h_x  =  \frac{A}{\sqrt{t}} e^{-x^2/4t}

is itself a solution, with h spreading out from its initial peak, and plays a significant role in the further analysis of this equation.
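We can verify this numerically (a Python sketch; the constant A and the sample point are arbitrary assumptions) by checking that this spreading profile satisfies the one-dimensional diffusion equation.

```python
import math

# Finite-difference check that g(x,t) = (A/sqrt(t)) exp(-x^2/4t), the x-derivative
# of the similarity solution, satisfies g_t = g_xx.
A = 1.0

def g(x, t):
    return (A / math.sqrt(t)) * math.exp(-x * x / (4 * t))

h = 1e-4
x, t = 0.7, 1.3
gt = (g(x, t + h) - g(x, t - h)) / (2 * h)
gxx = (g(x + h, t) - 2 * g(x, t) + g(x - h, t)) / (h * h)
print(abs(gt - gxx) < 1e-5)
```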

The same similarity method can also be applied to some non-linear equations.

Separating variables

We can also obtain some solutions of this equation by separating variables.

h(x,t)=X(x)T(t) \Rightarrow X^{\prime \prime}T=X\dot{T}

giving us the two ordinary differential equations

 \frac{d^2X}{dx^2}+k^2 X=0 \quad \frac{dT}{dt} = -k^2 T

and solutions of the general form

 h(x,t)= A e^{-k^2 t} \sin (kx+\alpha) \,
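A quick finite-difference check (a Python sketch; the constants A, k, and α are chosen arbitrarily) confirms that a separated solution with decay rate k² satisfies the diffusion equation h_t = h_xx:

```python
import math

# Check that A e^{-k^2 t} sin(kx + alpha) satisfies h_t = h_xx.
A, k, alpha = 2.0, 3.0, 0.4

def h_sol(x, t):
    return A * math.exp(-k * k * t) * math.sin(k * x + alpha)

d = 1e-5
x, t = 0.5, 0.2
ht = (h_sol(x, t + d) - h_sol(x, t - d)) / (2 * d)
hxx = (h_sol(x + d, t) - 2 * h_sol(x, t) + h_sol(x - d, t)) / (d * d)
print(abs(ht - hxx) < 1e-4)
```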


Exercises

Calculus/Multivariable and differential calculus: Solutions

Extensions

Systems of Ordinary Differential Equations

We have already examined cases where we have a single differential equation and found several methods to aid us in finding solutions to these equations. But what happens if we have two or more differential equations that depend on each other? For example, consider the case where

 D_t x(t) = 3y(t)^2 + x(t)t

and

 D_t y(t) = x(t) + y(t)

Such a set of differential equations is said to be coupled. Systems of ordinary differential equations such as these are what we will look into in this section.
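To get a feel for coupled systems, we can integrate the example above numerically (a Python sketch using forward Euler; the initial values x(0)=1, y(0)=0 are an assumption, since the text gives no initial data).

```python
# Forward-Euler integration of the coupled pair
#   D_t x = 3 y^2 + x t,   D_t y = x + y,
# from sample initial values x(0) = 1, y(0) = 0.
def step(x, y, t, dt):
    return x + dt * (3 * y * y + x * t), y + dt * (x + y)

x, y = 1.0, 0.0
t, dt, t_end = 0.0, 1e-3, 1.0
while t < t_end - 1e-12:
    x, y = step(x, y, t, dt)
    t += dt

print(x > 1.0 and y > 0.0)   # both components grow from this starting point
```

Note that neither equation can be integrated on its own: each step updates x and y together, which is exactly what "coupled" means in practice.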

First order systems

A general system of differential equations can be written in the form

 D_t \mathbf{x} = \mathbf{F}(\mathbf{x}, t)

Instead of writing the set of equations in a vector, we can write out each equation explicitly, in the form:

 D_t x_1 = F_1(x_1, \ldots, x_n, t)
 \vdots\!\;
 D_t x_i = F_i(x_1, \ldots, x_n, t)

For the system given at the beginning of this section, we can write it as:

 D_t \mathbf{x} = \mathbf{G}(\mathbf{x}, t)

where

 \mathbf{x} = (x(t), y(t)) = (x,y)

and

 \mathbf{G}(\mathbf{x}, t) = (3y^2+xt,x+y)

or write each equation out as shown above.

Why are these forms important? Often, a single higher order differential equation is rewritten as a system of simpler first order equations, and conversely. For example, using the same system as before,

 D_t x(t) = 3y(t)^2 + x(t)t
 D_t y(t) = x(t) + y(t)

we can write this as a higher order differential equation by simple substitution.

 D_t y(t) - y(t) = x(t)

then

 D_t x(t) = 3y(t)^2 + (D_t y(t) - y(t))t
 D_t x(t) = 3y(t)^2 + t D_t y(t) - t y(t)

Notice now that the vector form of the system is dependent on t since

\mathbf{G}(\mathbf{x}, t) = (3y^2+xt,x+y)

the first component is dependent on t. However, if instead we had

\mathbf{H}(\mathbf{x}) = (3y^2+x,x+y)

notice the vector field is no longer dependent on t. We call such systems autonomous. They appear in the form

 D_t \mathbf{x} = \mathbf{H}(\mathbf{x})

We can convert between an autonomous system and a non-autonomous one by simply making a substitution that involves t, such as \mathbf{y}=(\mathbf{x}, t), to get a system:

 D_t \mathbf{y} = (\mathbf{F}(\mathbf{y}), 1) = (\mathbf{F}(\mathbf{x}, t), 1)

In vector form, we may be able to separate F in a linear fashion to get something that looks like:

 \mathbf{F}(\mathbf{x}, t) = A(t)\mathbf{x} + \mathbf{b}(t)

where A(t) is a matrix and b(t) is a vector. Their entries may be constants or functions of t, depending on whether or not the system depends on t.

Real numbers

Fields

You are probably already familiar with many different sets of numbers from your past experience. Some of the commonly used sets of numbers are

  • Natural numbers, usually denoted with an N, are the numbers 0,1,2,3,...
  • Integers, usually denoted with a Z, are the positive and negative natural numbers: ...-3,-2,-1,0,1,2,3...
  • Rational numbers, denoted with a Q, are fractions of integers (excluding division by zero): -1/3, 5/1, 0, 2/7, etc.
  • Real numbers, denoted with a R, are constructed and discussed below.

Note that different sets of numbers have different properties. In the set of integers, for example, any number always has an additive inverse: for any integer x, there is another integer t such that x+t=0. This should not be terribly surprising: from basic arithmetic we know that t=-x. Try to prove to yourself that not all natural numbers have an additive inverse.

In mathematics, it is useful to note the important properties of each of these sets of numbers. The rational numbers, which will be of primary concern in constructing the real numbers, have the following properties:

There exists a number 0 such that for any other number a, 0+a=a+0=a
For any two numbers a and b, a+b is another number
For any three numbers a,b, and c, a+(b+c)=(a+b)+c
For any number a there is another number -a such that a+(-a)=0
For any two numbers a and b, a+b=b+a
For any two numbers a and b, a*b is another number
There is a number 1 such that for any number a, a*1=1*a=a
For any two numbers a and b, a*b=b*a
For any three numbers a,b and c, a(bc)=(ab)c
For any three numbers a,b and c, a(b+c)=ab+ac
For every number a other than 0 there is another number a^-1 such that a*a^-1=1

As presented above, these may seem quite intimidating. However, these properties are nothing more than basic facts from arithmetic. Any collection of numbers (and operations + and * on those numbers) which satisfies the above properties is called a field. The properties above are usually called field axioms. As an exercise, determine if the integers form a field, and if not, which field axiom(s) they violate.

Even though the list of field axioms is quite extensive, it does not fully explore the properties of rational numbers. Rational numbers also have an ordering. A total ordering must satisfy several properties: for any numbers a, b, and c

if a ≤ b and b ≤ a then a = b (antisymmetry)
if a ≤ b and b ≤ c then a ≤ c (transitivity)
a ≤ b or b ≤ a (totality)

To familiarize yourself with these properties, try to show that (a) natural numbers, integers and rational numbers are all totally ordered and more generally (b) convince yourself that any collection of rational numbers are totally ordered (note that the integers and natural numbers are both collections of rational numbers).

Finally, it is useful to recognize one more characterization of the rational numbers: every rational number has a decimal expansion which is either repeating or terminating. The proof of this fact is omitted; however, it follows from the definition of each rational number as a fraction. When performing long division, the remainder at any stage can only take on non-negative integer values smaller than the denominator, of which there are finitely many.
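The long-division argument can be made concrete (a Python sketch; the function name and the parenthesized notation for the repeating block are illustrative conventions, not standard ones).

```python
# Long division on num/den: each remainder is one of finitely many values
# (0 .. den-1), so the fractional digits must terminate (remainder 0) or
# repeat (a remainder recurs).  Repeating blocks are shown in parentheses.
def decimal_expansion(num, den):
    digits, seen = [], {}
    r = num % den                 # only the fractional part is expanded
    while r and r not in seen:
        seen[r] = len(digits)     # remember where this remainder appeared
        r *= 10
        digits.append(str(r // den))
        r %= den
    if r:   # a remainder recurred: the digits repeat from position seen[r]
        i = seen[r]
        return "".join(digits[:i]) + "(" + "".join(digits[i:]) + ")"
    return "".join(digits)

print(decimal_expansion(1, 7))   # repeating expansion
print(decimal_expansion(3, 8))   # terminating expansion
```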

Constructing the Real Numbers

There are two additional tools which are needed for the construction of the real numbers: the upper bound and the least upper bound. Definition A collection of numbers E is bounded above if there exists a number m such that for all x in E, x≤m. Any number m which satisfies this condition is called an upper bound of the set E.

Definition If a collection of numbers E is bounded above with m as an upper bound of E, and all other upper bounds of E are bigger than m, we call m the least upper bound or supremum of E, denoted by sup E.

Many collections of rational numbers do not have a least upper bound which is also rational, although some do. Suppose the numbers 5 and 10/3 are, together, taken to be E. The number 5 is not only an upper bound of E, it is a least upper bound. In general, there are many upper bounds (12, for instance, is an upper bound of the collection above), but there can be at most one least upper bound.

Consider the collection of numbers \{3, 3.1, 3.14, 3.141, 3.1415, \dots\}. You may recognize these decimals as the first few digits of π. Since each decimal terminates, each number in this collection is a rational number. This collection has infinitely many upper bounds. The number 4, for instance, is an upper bound. There is no least upper bound, at least not in the rational numbers. Try to convince yourself of this fact by attempting to construct such a least upper bound: (a) why does π not work as a least upper bound (hint: π does not have a repeating or terminating decimal expansion)? (b) what happens if the proposed supremum is equal to π up to some decimal place, with zeros afterward? (c) if the proposed supremum is bigger than π, can you find a smaller upper bound which will work?

In fact, there are infinitely many collections of rational numbers which do not have a rational least upper bound. We define a real number to be any number that is the least upper bound of some collection of rational numbers.

Properties of Real Numbers

The reals are totally ordered.

For all reals a, b, c:
Either b>a, b=a, or b<a.
If a<b and b<c then a<c.

Also

b>a implies b+c>a+c
b>a and c>0 implies bc>ac
b>a implies -a>-b

Upper bound axiom

Every non-empty set of real numbers which is bounded above has a supremum.

The upper bound axiom is necessary for calculus. It is not true for rational numbers.

We can also define lower bounds in the same way.

Definition A set E is bounded below if there exists a real M such that x ≥ M for all x∈E. Any M which satisfies this condition is called a lower bound of the set E.

Definition If a set E is bounded below, M is a lower bound of E, and all other lower bounds of E are less than M, we call M the greatest lower bound or infimum of E, denoted by inf E.

The supremum and infimum of finite sets are the same as their maximum and minimum.

Theorem

Every non-empty set of real numbers which is bounded below has an infimum.

Proof:

Let E be a non-empty set of real numbers, bounded below
Let L be the set of all lower bounds of E
L is not empty, by definition of bounded below
Every element of E is an upper bound to the set L, by definition
Therefore, L is a non empty set which is bounded above
L has a supremum, by the upper bound axiom
1/ Every lower bound of E is ≤sup L, by definition of supremum
Suppose there were an e∈E such that e<sup L
Every element of L is ≤e, by definition
Therefore e is an upper bound of L and e<sup L
This contradicts the definition of supremum, so there can be no such e.
If e∈E then e≥sup L, proved by contradiction
2/ Therefore, sup L is a lower bound of E
inf E exists, and is equal to sup L, on comparing the definition of infimum to lines 1 & 2

Bounds and inequalities, theorems:

A \subseteq B \Rightarrow \sup A \le \sup B
A \subseteq B \Rightarrow \inf A \ge \inf B
\sup (A \cup B) = \max (\sup A, \sup B)
\inf (A \cup B) = \min (\inf A, \inf B)


Theorem: (The triangle inequality)

\forall a,b,c \in \R \quad |a-b| \le |a-c|+|c-b|

Proof by considering cases

If a≤b≤c then |a-c|+|c-b| = (c-a)+(c-b) = 2(c-b)+(b-a) ≥ b-a = |a-b|

Exercise: Prove the other five cases.

This theorem is a special case of the triangle inequality theorem from geometry: The sum of two sides of a triangle is greater than or equal to the third side. It is useful whenever we need to manipulate inequalities and absolute values.

Complex Numbers

In mathematics, a complex number is a number of the form

 a + bi \,

where a and b are real numbers, and i is the imaginary unit, with the property i² = −1. The real number a is called the real part of the complex number, and the real number b is the imaginary part. Real numbers may be considered to be complex numbers with an imaginary part of zero; that is, the real number a is equivalent to the complex number a+0i.

For example, 3 + 2i is a complex number, with real part 3 and imaginary part 2. If z = a + bi, the real part (a) is denoted Re(z), or ℜ(z), and the imaginary part (b) is denoted Im(z), or ℑ(z).

Complex numbers can be added, subtracted, multiplied, and divided like real numbers and have other elegant properties. For example, real numbers alone do not provide a solution for every polynomial algebraic equation with real coefficients, while complex numbers do (this is the fundamental theorem of algebra).

Equality

Two complex numbers are equal if and only if their real parts are equal and their imaginary parts are equal. That is, a + bi = c + di if and only if a = c and b = d.

Notation and operations

The set of all complex numbers is usually denoted by C, or in blackboard bold by \mathbb{C} (Unicode ℂ). The real numbers, R, may be regarded as "lying in" C by considering every real number as a complex number: a = a + 0i.

Complex numbers are added, subtracted, and multiplied by formally applying the associative, commutative and distributive laws of algebra, together with the equation i² = −1:

\,(a + bi) + (c + di) = (a + c) + (b + d)i
\,(a + bi) - (c + di) = (a - c) + (b - d)i
\,(a + bi)(c + di) = ac + bci + adi + bd i^2 = (ac - bd) + (bc + ad)i
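These rules can be checked against Python's built-in complex type; a quick sketch, with arbitrary sample values:

```python
# Check the addition, subtraction, and multiplication rules against
# Python's built-in complex type, with arbitrary sample values.
a, b, c, d = 3, 2, 1, -4

assert complex(a, b) + complex(c, d) == complex(a + c, b + d)
assert complex(a, b) - complex(c, d) == complex(a - c, b - d)
# (a + bi)(c + di) = (ac - bd) + (bc + ad)i
assert complex(a, b) * complex(c, d) == complex(a*c - b*d, b*c + a*d)
```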

Division of complex numbers can also be defined (see below). Thus, the set of complex numbers forms a field which, in contrast to the real numbers, is algebraically closed.

In mathematics, the adjective "complex" means that the field of complex numbers is the underlying number field considered, for example complex analysis, complex matrix, complex polynomial and complex Lie algebra.

The field of complex numbers

Formally, the complex numbers can be defined as ordered pairs of real numbers (a, b) together with the operations:

(a,b) + (c,d) = (a + c,b + d) \,
(a,b) \cdot (c,d) = (ac - bd,bc + ad). \,

So defined, the complex numbers form a field, the complex number field, denoted by C (a field is an algebraic structure in which addition, subtraction, multiplication, and division are defined and satisfy certain algebraic laws. For example, the real numbers form a field).

The real number a is identified with the complex number (a, 0), and in this way the field of real numbers R becomes a subfield of C. The imaginary unit i can then be defined as the complex number (0, 1), which verifies

(a, b) = a \cdot (1, 0) + b \cdot (0, 1) = a + bi \quad \text{and} \quad i^2 = (0, 1) \cdot (0, 1) = (-1, 0) = -1.

In C, we have:

  • additive identity ("zero"): (0, 0)
  • multiplicative identity ("one"): (1, 0)
  • additive inverse of (a,b): (−a, −b)
  • multiplicative inverse (reciprocal) of non-zero (a, b): \left({a\over a^2+b^2},{-b\over a^2+b^2}\right).
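The ordered-pair construction can be sketched directly in Python, using exact rational arithmetic; the function names here are ours, not standard:

```python
from fractions import Fraction as F

# The field operations on ordered pairs (a, b), as defined above.
def add(z, w):
    (a, b), (c, d) = z, w
    return (a + c, b + d)

def mul(z, w):
    (a, b), (c, d) = z, w
    return (a*c - b*d, b*c + a*d)

def reciprocal(z):
    # multiplicative inverse of a non-zero (a, b)
    a, b = z
    n = a*a + b*b
    return (a/n, -b/n)

i = (F(0), F(1))
assert mul(i, i) == (F(-1), F(0))              # i^2 = -1
z = (F(3), F(4))
assert mul(z, reciprocal(z)) == (F(1), F(0))   # z * z^{-1} = 1
```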

Since a complex number a + bi is uniquely specified by an ordered pair (a, b) of real numbers, the complex numbers are in one-to-one correspondence with points on a plane, called the complex plane.


The complex plane

A complex number z can be viewed as a point or a position vector in a two-dimensional Cartesian coordinate system called the complex plane or Argand diagram. The point and hence the complex number z can be specified by Cartesian (rectangular) coordinates. The Cartesian coordinates of the complex number are the real part x = Re(z) and the imaginary part y = Im(z). The representation of a complex number by its Cartesian coordinates is called the Cartesian form or rectangular form or algebraic form of that complex number.

Polar form

Alternatively, the complex number z can be specified by polar coordinates. The polar coordinates are r =  |z| ≥ 0, called the absolute value or modulus, and φ = arg(z), called the argument of z. For r = 0 any value of φ describes the same number. To get a unique representation, a conventional choice is to set arg(0) = 0. For r > 0 the argument φ is unique modulo 2π; that is, if any two values of the complex argument differ by an exact integer multiple of 2π, they are considered equivalent. To get a unique representation, a conventional choice is to limit φ to the interval (-π,π], i.e. −π < φ ≤ π. The representation of a complex number by its polar coordinates is called the polar form of the complex number.

Conversion from the polar form to the Cartesian form

x = r \cos \varphi
y = r \sin \varphi

Conversion from the Cartesian form to the polar form

r = \sqrt{x^2+y^2}
\varphi = 
\begin{cases}
\arctan(\frac{y}{x}) & \mbox{if } x > 0\\
\arctan(\frac{y}{x}) + \pi & \mbox{if } x < 0 \mbox{ and } y \ge 0\\
\arctan(\frac{y}{x}) - \pi & \mbox{if } x < 0 \mbox{ and } y < 0\\
+\frac{\pi}{2} & \mbox{if } x = 0 \mbox{ and } y > 0\\
-\frac{\pi}{2} & \mbox{if } x = 0 \mbox{ and } y < 0\\
\mathrm{undefined} & \mbox{if } x = 0 \mbox{ and } y = 0.
\end{cases}
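The case analysis above can be checked numerically; this Python sketch (the name arg_piecewise is ours) compares it against the library function math.atan2, which packages exactly these cases:

```python
import math

def arg_piecewise(x, y):
    """The case-by-case arctangent formula above; undefined at the origin."""
    if x > 0:
        return math.atan(y / x)
    if x < 0 and y >= 0:
        return math.atan(y / x) + math.pi
    if x < 0 and y < 0:
        return math.atan(y / x) - math.pi
    if x == 0 and y > 0:
        return math.pi / 2
    if x == 0 and y < 0:
        return -math.pi / 2
    raise ValueError("argument undefined at the origin")

# Sample points in each quadrant and on the axes:
for x, y in [(1, 1), (-1, 1), (-1, -1), (0, 2), (0, -2), (3, -4)]:
    assert math.isclose(arg_piecewise(x, y), math.atan2(y, x))
```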

The previous formula requires rather laborious case distinctions. However, many programming languages provide a two-argument variant of the arctangent function (commonly called atan2) that handles these cases automatically. A formula that uses the arccos function requires fewer case distinctions:

\varphi = 
\begin{cases}
+\arccos\frac{x}{r} & \mbox{if } y \geq 0 \mbox{ and } r \ne 0\\
-\arccos\frac{x}{r} & \mbox{if } y < 0\\
\mathrm{undefined} & \mbox{if } r = 0.
\end{cases}

Notation of the polar form

The notation of the polar form as

 z = r\,(\cos \varphi + i\sin \varphi )\,

is called trigonometric form. The notation cis φ is sometimes used as an abbreviation for cos φ + i sin φ. Using Euler's formula it can also be written as

 z = r\,\mathrm{e}^{i \varphi}\,,

which is called exponential form.

Multiplication, division, exponentiation, and root extraction in the polar form

Multiplication, division, exponentiation, and root extraction are much easier in the polar form than in the Cartesian form.

Using the sum and difference identities, it's possible to obtain that

r_1\,e^{i\varphi_1} \cdot r_2\,e^{i\varphi_2} 
= r_1\,r_2\,e^{i(\varphi_1 + \varphi_2)} \,

and that

\frac{r_1\,e^{i\varphi_1}}{r_2\,e^{i\varphi_2}}
 = \frac{r_1}{r_2}\,e^{i (\varphi_1 - \varphi_2)}. \,

Exponentiation with integer exponents; according to de Moivre's formula,

\big(r\,e^{i\varphi}\big)^n = r^n\,e^{in\varphi}. \,

Exponentiation with arbitrary complex exponents is discussed in the article on exponentiation.

The addition of two complex numbers is just the addition of two vectors, and multiplication by a fixed complex number can be seen as a simultaneous rotation and stretching.

Multiplication by i corresponds to a counter-clockwise rotation by 90° (π/2 radians). The geometric content of the equation i² = −1 is that a sequence of two 90 degree rotations results in a 180 degree (π radians) rotation. Even the fact (−1) · (−1) = +1 from arithmetic can be understood geometrically as the combination of two 180 degree turns.

All the roots of any number, real or complex, may be found with a simple algorithm. The nth roots are given by

  \sqrt[n]{r e^{i\varphi}}=\sqrt[n]{r}\ e^{i\left(\frac{\varphi+2k\pi}{n}\right)}

for k = 0, 1, 2, …, n − 1, where \sqrt[n]{r} represents the principal nth root of r.
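A sketch of this root-extraction algorithm using Python's cmath module; the function name nth_roots is ours:

```python
import cmath
import math

def nth_roots(z, n):
    """All n complex nth roots of z, from the polar formula above."""
    r, phi = abs(z), cmath.phase(z)
    root_r = r ** (1 / n)   # principal nth root of the modulus
    return [cmath.rect(root_r, (phi + 2 * math.pi * k) / n) for k in range(n)]

# The three cube roots of -8; each should cube back to -8.
for w in nth_roots(-8, 3):
    assert cmath.isclose(w**3, -8)
```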

Absolute value, conjugation and distance

The absolute value (or modulus or magnitude) of a complex number z = r eiφ is defined as |z| = r. Algebraically, if z = a + bi, then  | z | = \sqrt{a^2+b^2}.

One can check readily that the absolute value has three important properties:

 | z | = 0 \, if and only if  z = 0 \,
 | z + w | \leq | z | + | w | \, (triangle inequality)
 | z \cdot w | = | z | \cdot | w | \,

for all complex numbers z and w. It then follows, for example, that  | 1 | = 1 and |z/w|=|z|/|w|. By defining the distance function d(z, w) = |z − w| we turn the set of complex numbers into a metric space and we can therefore talk about limits and continuity.

The complex conjugate of the complex number z = a + bi is defined to be abi, written as \bar{z} or z^*\,. As seen in the figure, \bar{z} is the "reflection" of z about the real axis. The following can be checked:

\overline{z+w} = \bar{z} + \bar{w}
\overline{z\cdot w} = \bar{z}\cdot\bar{w}
\overline{(z/w)} = \bar{z}/\bar{w}
\bar{\bar{z}}=z
\bar{z}=z   if and only if z is real
|z|=|\bar{z}|
|z|^2 = z\cdot\bar{z}
z^{-1} = \bar{z}\cdot|z|^{-2}   if z is non-zero.

The latter formula is the method of choice to compute the inverse of a complex number if it is given in rectangular coordinates.

That conjugation commutes with all the algebraic operations (and many functions; e.g. \sin\bar z=\overline{\sin z}) is rooted in the ambiguity in choice of i (−1 has two square roots). It is important to note, however, that the function f(z) = \bar{z} is not complex-differentiable.

Complex fractions

We can divide a complex number (a + bi) by another complex number (c + di) ≠ 0 in two ways. The first way has already been implied: to convert both complex numbers into exponential form, from which their quotient is easily derived. The second way is to express the division as a fraction, then to multiply both numerator and denominator by the complex conjugate of the denominator. The new denominator is a real number.


\begin{align}
{a + bi \over c + di}& = {(a + bi) (c - di) \over (c + di) (c - di)} = {(ac + bd) + (bc - ad) i \over c^2 + d^2}\\ & = \left({ac + bd \over c^2 + d^2}\right) + i\left( {bc - ad \over c^2 + d^2} \right).\,
\end{align}
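The conjugate method translates directly into code. This sketch (with arbitrary sample values) compares it against Python's own complex division:

```python
def divide(a, b, c, d):
    """(a + bi) / (c + di) via the conjugate of the denominator."""
    n = c*c + d*d                      # (c + di)(c - di) = c^2 + d^2, a real number
    return ((a*c + b*d) / n, (b*c - a*d) / n)

re, im = divide(3, 2, 1, -4)
assert abs(complex(re, im) - complex(3, 2) / complex(1, -4)) < 1e-12
```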

Matrix representation of complex numbers

While usually not useful, alternative representations of the complex field can give some insight into its nature. One particularly elegant representation interprets each complex number as a 2×2 matrix with real entries which stretches and rotates the points of the plane. Every such matrix has the form


\begin{bmatrix}
  a &   -b  \\
  b & \;\; a  
\end{bmatrix}

where a and b are real numbers. The sum and product of two such matrices is again of this form. Every non-zero matrix of this form is invertible, and its inverse is again of this form. Therefore, the matrices of this form are a field. In fact, this is exactly the field of complex numbers. Every such matrix can be written as


\begin{bmatrix}
  a &     -b  \\
  b & \;\; a  
\end{bmatrix}
=
a \begin{bmatrix}
  1 & \;\; 0  \\
  0 & \;\; 1 
\end{bmatrix}
+
b \begin{bmatrix}
  0 &     -1  \\
  1 & \;\; 0 
\end{bmatrix}

which suggests that we should identify the real number 1 with the identity matrix


\begin{bmatrix}
  1 & \;\; 0  \\
  0 & \;\; 1 
\end{bmatrix},

and the imaginary unit i with


\begin{bmatrix}
  0 &     -1  \\
  1 & \;\; 0  
\end{bmatrix},

a counter-clockwise rotation by 90 degrees. Note that the square of this latter matrix is indeed equal to the 2×2 matrix that represents −1.

The square of the absolute value of a complex number expressed as a matrix is equal to the determinant of that matrix.

 |z|^2 =
\begin{vmatrix}
  a & -b  \\
  b &  a  
\end{vmatrix}
= (a^2) - ((-b)(b)) = a^2 + b^2

If the matrix is viewed as a transformation of the plane, then the transformation rotates points through an angle equal to the argument of the complex number and scales by a factor equal to the complex number's absolute value. The conjugate of the complex number z corresponds to the transformation which rotates through the same angle as z but in the opposite direction, and scales in the same manner as z; this can be represented by the transpose of the matrix corresponding to z.

If the matrix elements are themselves complex numbers, the resulting algebra is that of the quaternions. In other words, this matrix representation is one way of expressing the Cayley-Dickson construction of algebras.
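The correspondence between complex arithmetic and these 2×2 matrices can be checked directly; a Python sketch, where to_matrix and matmul are our names:

```python
def to_matrix(z):
    """Represent a + bi as the matrix [[a, -b], [b, a]]."""
    return [[z.real, -z.imag], [z.imag, z.real]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

z, w = complex(3, 2), complex(1, -4)
# Matrix multiplication reproduces complex multiplication:
assert matmul(to_matrix(z), to_matrix(w)) == to_matrix(z * w)
# The square of the matrix for i represents -1:
assert matmul(to_matrix(1j), to_matrix(1j)) == to_matrix(complex(-1, 0))
# The determinant a^2 + b^2 equals |z|^2:
M = to_matrix(z)
assert abs(M[0][0] * M[1][1] - M[0][1] * M[1][0] - abs(z)**2) < 1e-9
```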

Advanced Integration Techniques

Integration by Complexifying


This technique requires an understanding and recognition of complex numbers. Specifically Euler's formula:

\cos \theta + i\cdot \sin \theta = e^{i \cdot \theta}

Recognize, for example, that the real part of e^{iθ} is cos θ:

\mathrm{Re}\{ e^{i \cdot \theta} \} = \cos \theta

Given an integral of the general form:

\int e^{x} \cos {2x} \; dx

We can complexify it:

\int \mathrm{Re}\{ e^{x} (\cos {2x} + i\cdot \sin {2x}) \} \; dx

\int \mathrm{Re}\{ e^{x} (e^{i 2x}) \} \; dx

With basic rules of exponents:

\int \mathrm{Re}\{ e^{x + i2x} \} \; dx

Since taking the real part is a linear operation, the "real part" operator can be moved outside the integral:

\mathrm{Re}\{ \int e^{x(1 + 2i)} \; dx \}

The integral easily evaluates:

\mathrm{Re}\{ \frac{e^{x(1 + 2i)}}{1 + 2i} \}

Multiplying and dividing by (1-2i):

\mathrm{Re} \{ \frac{1 - 2i}{5} e^{x(1 + 2i)} \}

Which can be rewritten as:

\mathrm{Re} \{ \frac{1 - 2i}{5} e^{x} e^{i2x} \}

Applying Euler's formula:

\mathrm{Re} \{ \frac{1 - 2i}{5} e^{x} (\cos 2x + i\cdot \sin 2x) \}

Expanding:

\mathrm{Re} \{ \frac{e^{x}}{5} (\cos 2x +2 \sin 2x) + i\cdot \frac{e^{x}}{5} (\sin 2x -2 \cos 2x) \}

Taking the Real part of this expression:

 \frac{e^{x}}{5} (\cos 2x +2 \sin 2x)

So:

\int e^{x} \cos {2x} \; dx = \frac{e^{x}}{5} (\cos 2x +2 \sin 2x)+C
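The result can be sanity-checked numerically: the derivative of the antiderivative should give back the original integrand. A Python sketch using a central difference:

```python
import math

def F(x):
    """The antiderivative found above (constant C omitted)."""
    return math.exp(x) * (math.cos(2*x) + 2 * math.sin(2*x)) / 5

def integrand(x):
    return math.exp(x) * math.cos(2*x)

h = 1e-6
for x in [-1.0, 0.0, 0.5, 2.0]:
    central_diff = (F(x + h) - F(x - h)) / (2 * h)
    assert math.isclose(central_diff, integrand(x), rel_tol=1e-6, abs_tol=1e-6)
```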

Appendix

Calculus/Choosing delta

This page is an addendum to Calculus/Formal Definition of the Limit.

Recall the definition of a limit:

A number L is the limit of a function f(x) as x approaches c if and only if for all numbers ε > 0 there exists a number δ > 0 such that

\left| f(x) - L \right| < \epsilon
whenever
0 < \left| x - c \right| < \delta.

In other words, given a number ε we must construct a number δ such that assuming
0 < \left| x - c \right| < \delta
we can prove
\left| f(x) - L \right| < \epsilon;
moreover, this proof must work for all values of ε > 0.

Note: this definition is not constructive -- it does not tell you how to find the limit L, only how to check whether a particular value is indeed the limit. We use the informal definition of the limit, experience with similar problems, or theorems (L'Hopital's rule, for example), to determine the value, and then can prove the correctness of this value using the formal definition.

Example 1: Suppose we want to find the limit of f(x) = x + 5 as x approaches c = 9. We know that the limit L is 9+5=14, and desire to prove this.

We choose δ = ε (this will be explained later). Then, since we assume
\left| x - 9 \right| < \delta
we can show
\begin{matrix}
\left| (x + 5) - 14 \right| & = & \left| x - 9 \right| \\
\ & < & \delta \\
\ & = & \epsilon \end{matrix},
which is what we wanted to prove.

We chose δ by working backwards from the formula we are trying to prove:
\left| f(x) - L \right| < \epsilon.
In this case, we desire to prove
\left| x - 9 \right| < \epsilon,
given
\left| x - 9 \right| < \delta,
so the easiest way to prove it is by choosing δ = ε. This example, however, is too easy to adequately explain how to choose δ in general. Let's try something harder:

Example 2: Prove that the limit of f(x) = x² - 9 as x approaches 2 is L = -5.

We want to prove that
\left| f(x) - L \right| = \left| x^2 - 4 \right| < \epsilon
given
\left| x - 2 \right| < \delta.

We choose δ by working backwards. First, we need to rewrite the equation we want to prove using δ instead of x:

\begin{matrix}
\left| x^2 - 4 \right| & < & \epsilon \\
\left| x - 2 \right| \cdot \left| x + 2 \right| & < & \epsilon \\
(\delta) \cdot (\delta + 4) & = & \epsilon \end{matrix}

Note: we used the fact that |x + 2| < δ + 4, which can be proven with the triangle inequality.

Word of caution: the above series of equations is not a logical series of steps, and is not part of any proof, but is an informal technique used to help write the proof. We will select a value of δ so that the last equation is true, and then use the last equation to prove the equations above it in turn (which is what was meant earlier by working backwards).

Note: in the equations above, when δ was substituted for x, the sign < was replaced with =. This can be done (but is not necessary) because we are not told that |x-2| = δ, but rather |x-2| < δ. The justification for this becomes clear when the above equations are used in backwards order in the proof.

We can solve this last equation for δ using the quadratic formula:

\delta = \frac{-4 + \sqrt{16 - 4 \cdot 1 \cdot (- \epsilon) }}{2 \cdot 1} = -2 + \sqrt{4 + \epsilon}

Note: δ is always in terms of ε. A constant value of δ (e.g., δ = 0.5) will never work.

Now, we have a value of δ, and we can do our proof:

given
\left| x - 2 \right| < \delta,
\begin{matrix}
\left| f(x) - L \right| & = & \left| x^2 - 4 \right| \\
\ & = & \left| x - 2 \right| \cdot \left| x + 2 \right| \\
\ & < & (\delta) \cdot (\delta + 4) \\
\ & < & (\sqrt{4 + \epsilon} - 2) \cdot (\sqrt{4 + \epsilon} + 2) \\
\ & < & (\sqrt{4 + \epsilon})^2 - (2)^2 \\
\ & < & \epsilon \end{matrix}.
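A numerical spot check of this choice of δ (a Python sketch; the sample ε values are arbitrary): every x with |x − 2| < δ should indeed satisfy |x² − 4| < ε.

```python
import math

for eps in [1.0, 0.1, 0.001]:
    delta = math.sqrt(4 + eps) - 2
    # Sample points strictly inside (2 - delta, 2 + delta):
    for k in range(1, 1000):
        for x in (2 + delta * k / 1000, 2 - delta * k / 1000):
            assert abs(x**2 - 4) < eps   # |x - 2| < delta forces this
```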

Here are a few more examples of choosing δ; try to figure them out before reading the explanation.

Example 3: Prove that the limit of f(x) = sin(x)/x as x approaches 0 is L = 1.

Explanation:


Example 4: Prove that f(x) = 1/x has no limit as x approaches 0.


Example 5: Prove that  \lim_{x \to 2} x^2 = 4

Solution: To do it, we'll look at two cases: \epsilon\geq4 and \epsilon<4. The \epsilon\geq4 case is easy. First let's let \epsilon=4. That means we want the values chosen in the domain to map to (0,8) in the range. We want a delta such that (2+\delta)^2=8 so let's choose \delta=2\sqrt2-2. The chosen \delta defines the interval (4-2\sqrt2,2\sqrt2)\approxeq(1.1716,2.8284) in our domain. This gets mapped to (24-16\sqrt2,8)\approxeq(1.3726,8) in our range, which is contained in (0,8). Notice that \delta doesn't depend on \epsilon. So for \epsilon>4, we widen the interval in the range that we are allowed to map onto, but our interval in the domain stays fixed and always maps to the same sub-interval in the range. So \delta=2\sqrt2-2 works for any \epsilon\geq4.


Now suppose 0<\epsilon<4. We want a \delta such that 0<|x^2-4|<\epsilon whenever 0<|x-2|<\delta. So let's assume 0<|x^2-4|<\epsilon and work backwards to find a suitable \delta:

0<|x^2-4|<\epsilon
-\epsilon<x^2-4<\epsilon
4-\epsilon<x^2<\epsilon+4

Since 0<\epsilon<4, we have 4-\epsilon>0. Since both numbers above are positive, we can take the (positive) square root of both extremes of the inequality:

\sqrt{4-\epsilon}<x<\sqrt{\epsilon+4}
\sqrt{4-\epsilon}-2<x-2<\sqrt{\epsilon+4}-2

The above equation represents the distance, either negative or positive, that x can vary from 2 and still be within \epsilon of 4. We want to choose the smaller of the two extremes to construct our interval. It turns out that |\sqrt{\epsilon+4}-2|\leq|\sqrt{4-\epsilon}-2| for 0<\epsilon<4, so choose \delta=\sqrt{\epsilon+4}-2. As a sanity check, let's try with \epsilon=0.002.

\delta=\sqrt{\epsilon+4}-2
\delta=\sqrt{0.002+4}-2

which is approximately

\delta=0.0004999375

At the extreme right of the domain, this gives

x=2.0004999375

and

x^2=2.0004999375^2=4.00199999993750390625

which is within 0.002 of 4.

Exercise Solutions

Algebra Solutions

1. Factor x-1 out of 6x^3-4x^2+3x-5.
\begin{array}{rl}&~~\,6x^2+2x+5\\
x-1\!\!\!\!&\big)\!\!\!\begin{array}{lll}
\hline
\,6x^3-4x^2+3x-5
\end{array}\\
&\!\!\!\!-\underline{(6x^3-6x^2)~~~}\\
&\!\!\!\!~~~~~~~~~~~~2x^2+3x-5~~~\\
&\!\!\!\!~~~~~~~~-\underline{(2x^2-2x)~~~}\\
&\!\!\!\!~~~~~~~~~~~~~~~~~~~~~5x-5~~~\\
&\!\!\!\!~~~~~~~~~~~~~~~~-\underline{(5x-5)~~~}\\
&\!\!\!\!~~~~~~~~~~~~~~~~~~~~~~~~~~~~0~~~\\
\end{array}

\mathbf{6x^3-4x^2+3x-5=(x-1)(6x^2+2x+5)}
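The factorization can be confirmed by multiplying it back out; a small Python sketch representing polynomials as coefficient lists (lowest degree first):

```python
def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

# (x - 1)(6x^2 + 2x + 5) = 6x^3 - 4x^2 + 3x - 5
assert poly_mul([-1, 1], [5, 2, 6]) == [-5, 3, -4, 6]
```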

Precalculus Cumulative Exercise Set Solutions

Convert to interval notation

1.  \{x:-4<x<2\} \,

\mathbf{(-4,2)}

2.  \{x:-\frac{7}{3} \leq x \leq -\frac{1}{3}\}

\mathbf{[-\frac{7}{3},-\frac{1}{3}]}

3.  \{x:-\pi \leq x < \pi\}

\mathbf{[-\pi,\pi)}

4.  \{x:x \leq 17/9\}

\mathbf{(-\infty, \frac{17}{9}]}

5.  \{x:5 \leq x+1 \leq 6\}

4\leq x\leq5
\mathbf{[4, 5]}

6.  \{x:x - 1/4 < 1\} \,

x<1 \frac{1}{4}=\frac{5}{4}
\mathbf{(-\infty, \frac{5}{4})}

7.  \{x:3 > 3x\} \,

1>x
x<1
\mathbf{(-\infty, 1)}

8.  \{x:0 \leq 2x+1 < 3\}

-1\leq 2x\leq 2
-\frac{1}{2}\leq x<1
\mathbf{[-\frac{1}{2}, 1)}

9.  \{x:5<x \mbox{ and } x<6\} \,

This is equivalent to 5<x<6
\mathbf{(5,6)}

10.  \{x:5<x \mbox{ or } x<6\} \,

It helps to draw a picture to determine the set of numbers described:

(figure: a number line with 5 < x drawn in red and x < 6 drawn in blue)

A number in the set can be on either the red or blue line, so the entire number line is included.
\mathbf{(-\infty,\infty)}

State the following intervals using set notation

11.  [3,4] \,

\mathbf{\{x:3\leq x\leq 4\}}

12.  [3,4) \,

\mathbf{\{x:3\leq x<4\}}

13.  (3,\infty)

\mathbf{\{x:x>3\}}

14.  (-\frac{1}{3}, \frac{1}{3}) \,

\mathbf{\{x:-\frac{1}{3}<x<\frac{1}{3}\}}

15.  (-\pi, \frac{15}{16}) \,

\mathbf{\{x:-\pi<x<\frac{15}{16}\}}

16.  (-\infty,\infty)

\mathbf{\{x:x\in\Re\}}

Which one of the following is a true statement?

17.  |x+y| = |x| + |y| \,

Let x=-5, y=5. Then
|x+y|=|-5+5|=|0|=0, and
|x|+|y|=|-5|+|5|=5+5=10
Thus, |x+y| \neq |x| + |y|
false

18.  |x+y| \geq |x| + |y|

Using the same example as above, we have |x+y|\ngeq |x| + |y|.
false

19.  |x+y| \leq |x| + |y|

true

Evaluate the following expressions

20.  8^{1/3} \,

(2^3)^{1/3}=2^1=\mathbf{2}

21.  (-8)^{1/3} \,

(-2^3)^{1/3}=-2^1=\mathbf{-2}

22.  \bigg(\frac{1}{8}\bigg)^{1/3} \,

(\frac{1}{2^3})^{1/3}=(2^{-3})^{1/3}=2^{-1}=\mathbf{\frac{1}{2}}

23.  (8^{2/3}) (8^{3/2}) (8^0) \,

8^{\frac{2}{3}+\frac{3}{2}+0}=8^{\frac{4}{6}+\frac{9}{6}}=8^{\frac{13}{6}}=(2^3)^{\frac{13}{6}}=\mathbf{2^{13/2}}

24.  \bigg( \bigg(\frac{1}{8}\bigg)^{1/3} \bigg)^7

((\frac{1}{2^3})^{1/3})^7=((2^{-3})^{1/3})^7=(2^{-1})^7=2^{-7}=\frac{1}{2^7}=\mathbf{\frac{1}{128}}

25.  \sqrt[3]{\frac{27}{8}}

(\frac{27}{8})^{1/3}=(\frac{3^3}{2^3})^{1/3}=\frac{3^1}{2^1}=\mathbf{\frac{3}{2}}

26.  \frac{4^5 \cdot 4^{-2}}{4^3}

4^{5-2-3}=4^0=\mathbf{1}

27.  \bigg(\sqrt{27}\bigg)^{2/3}

((3^3)^{1/2})^{2/3}=(3^\frac{3}{2})^\frac{2}{3}=3^1=\mathbf{3}

28.  \frac{\sqrt{27}}{\sqrt[3]{9}}

\frac{(3^3)^{1/2}}{(3^2)^{1/3}}=\frac{3^\frac{3}{2}}{3^\frac{2}{3}}=3^{\frac{3}{2}-\frac{2}{3}}=3^{\frac{9}{6}-\frac{4}{6}}=\mathbf{3^{5/6}}

Simplify the following

29.  x^3 + 3x^3 \,

\mathbf{ 4x^3 }

30.  \frac{x^3 + 3x^3}{x^2}

\mathbf{ 4x }

31.  (x^3+3x^3)^3 \,

\mathbf{ 64x^9 }

32.  \frac{x^{15} + x^3}{x}

\mathbf{ x^{14} + x^2 }

33.  (2x^2)(3x^{-2}) \,

\mathbf{ 6 }

34.  \frac{x^2y^{-3}}{x^3y^2}

\mathbf{ \frac{1}{xy^5} }

35.  \sqrt{x^2y^4}

\mathbf{ xy^2 }

36.  \bigg(\frac{8x^6}{y^4}\bigg)^{1/3}

\mathbf{ \frac{2x^2}{y^{\frac{4}{3}}}}

Functions

52. Let f(x)=x^2.

a. Compute f(0) and f(2).

f(0)=0, f(2)=4

b. What are the domain and range of f?

The domain is (-\infty,\infty); the range is [0,\infty).

c. Does f have an inverse? If so, find a formula for it.

No, since f isn't one-to-one; for example, f(-1)=f(1)=1.

53. Let f(x)=x+2, g(x)=1/x.

a. Give formulae for
i. f+g

(f+g)(x)=x+2+1/x=(x^2+2x+1)/x.

ii. f-g

(f-g)(x)=x+2-1/x=(x^2+2x-1)/x.

iii. g-f

(g-f)(x)=1/x-x-2=(1-x^2-2x)/x.

iv. f\times g

(f\times g)(x)=(x+2)/x.

v. f/g

(f/g)(x)=x(x+2) provided x\ne0. Note that 0 is not in the domain of f/g, since it's not in the domain of g, and you can't divide by something that doesn't exist!

vi. g/f

(g/f)(x)=1/[x(x+2)]. Although 0 is still not in the domain, we don't need to state it now, since 0 isn't in the domain of the expression 1/[x(x+2)] either.

vii. f\circ g

(f\circ g)(x)=1/x+2=(2x+1)/x.

viii. g\circ f

(g\circ f)(x)=1/(x+2).

b. Compute f(g(2)) and g(f(2)).

f(g(2))=5/2; g(f(2))=1/4.

c. Do f and g have inverses? If so, find formulae for them.

Yes; f^{-1}(x)=x-2 and g^{-1}(x)=1/x. Note that g and its inverse are the same.

54. Does this graph represent a function? (figure: the graph of sin(x)/x)

As pictured, by the Vertical Line test, this graph represents a function.

55. Consider the following function

f(x) = \begin{cases} -\frac{1}{9} & \mbox{if } x<-1 \\ 2 & \mbox{if } -1\leq x \leq 0 \\ x + 3 & \mbox{if } x>0. \end{cases}
a. What is the domain?
b. What is the range?
c. Where is f continuous?

56. Consider the following function

f(x) = \begin{cases} x^2 & \mbox{if } x>0 \\ -1 & \mbox{if } x\leq 0. \end{cases}
a. What is the domain?
b. What is the range?
c. Where is f continuous?

57. Consider the following function

f(x) = \frac{\sqrt{2x-3}}{x-10}
a. What is the domain?
b. What is the range?
c. Where is f continuous?

58. Consider the following function

f(x) = \frac{x-7}{x^2-49}
a. What is the domain?
b. What is the range?
c. Where is f continuous?


Limit Solutions

Infinite Limits/Infinity is not a number Solutions

Write out an explanatory paragraph for the following limits that include \infty. Remember that you will have to change any comparison of magnitude between a real number and \infty to a different phrase. In the second case, you will have to work out for yourself what the formula means.

1. \lim_{x \to \infty} \frac{1}{x^2} = 0

This formula says that I can make the values of \frac{1}{x^2} as close as I would like to 0, so long as I make x sufficiently large.

2. \sum_{n = 0}^{\infty} 2^{-n} = 1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \cdots  = 2

This formula says that you can make the sum \sum_{n=0}^{i} 2^{-n} as close as you would like to 2 by making i sufficiently large.

Limits Cumulative Exercise Set Solutions

Basic Limit Exercises

1. \lim_{x\to 2} (4x^2 - 3x+1)

Since this is a polynomial, 2 can simply be plugged in. This results in 4(2^2)-3(2)+1=16-6+1=\mathbf{11}

2. \lim_{x\to 5} (x^2)

5^2=\mathbf{25}

One-Sided Limits

Evaluate the following limits or state that the limit does not exist.

3.  \lim_{x\to 0^-} \frac{x^3+x^2}{x^3+2x^2}

Factor as \frac{x^2}{x^2}\frac{x+1}{x+2}. In this form we can see that there is a removable discontinuity at x=0 and that the limit is \mathbf{\frac{1}{2}}

4.  \lim_{x\to 7^-} |x^2+x| -x

|7^2+7|-7 = \mathbf{49}

5.  \lim_{x\to -1^-} \sqrt{1-x^2}

Approaching -1 from the left means x<-1, so x^2>1 and \sqrt{1-x^2} is not defined; the limit does not exist.

6.  \lim_{x\to -1^+} \sqrt{1-x^2}

Approaching -1 from the right means x is slightly greater than -1, so x^2<1 and \sqrt{1-x^2} is defined; the limit is \sqrt{1-(-1)^2}=\mathbf{0}

Two-Sided Limits

Evaluate the following limits or state that the limit does not exist.

7.  \lim_{x \to -1} \frac{1}{x-1}

\mathbf{-\frac{1}{2}}

8.  \lim_{x\to 4}  \frac{1}{x-4}

 \lim_{x\to 4^-}  \frac{1}{x-4}=-\infty
 \lim_{x\to 4^+}  \frac{1}{x-4}=+\infty
The limit does not exist.

9.  \lim_{x\to 2}  \frac{1}{x-2}

 \lim_{x\to 2^-}  \frac{1}{x-2}=-\infty
 \lim_{x\to 2^+}  \frac{1}{x-2}=+\infty
The limit does not exist.

10.  \lim_{x\to -3}  \frac{x^2 - 9}{x+3}

 \lim_{x\to -3}  \frac{(x+3)(x-3)}{x+3} = \lim_{x\to -3}  x-3 = -3-3=\mathbf{-6}

11.  \lim_{x\to 3} \frac{x^2 - 9}{x-3}

 \lim_{x\to 3} \frac{(x-3)(x+3)}{x-3} = \lim_{x\to 3} x+3 = 3+3 = \mathbf{6}

12.  \lim_{x\to -1} \frac{x^2+2x+1}{x+1}

 \lim_{x\to -1} \frac{(x+1)(x+1)}{x+1} = \lim_{x\to -1} x+1 = -1+1 = \mathbf{0}

13.  \lim_{x\to -1} \frac{x^3+1}{x+1}

 \lim_{x\to -1} \frac{(x^2-x+1)(x+1)}{x+1} = \lim_{x\to -1} x^2-x+1 = (-1)^2-(-1)+1 = 1+1+1 = \mathbf{3}

14.  \lim_{x\to 4} \frac{x^2 + 5x-36}{x^2 - 16}

 \lim_{x\to 4} \frac{(x-4)(x+9)}{(x-4)(x+4)} = \lim_{x\to 4} \frac{x+9}{x+4} = \frac{4+9}{4+4} = \mathbf{\frac{13}{8}}

15.  \lim_{x\to 25} \frac{x-25}{\sqrt{x}-5}

 \lim_{x\to 25} \frac{(\sqrt{x}-5)(\sqrt{x}+5)}{\sqrt{x}-5} = \lim_{x\to 25} (\sqrt{x}+5) = \sqrt{25}+5 = 5+5 = \mathbf{10}

16.  \lim_{x\to 0} \frac{\left|x\right|}{x}

\lim_{x\to 0^-} \frac{\left|x\right|}{x} = \lim_{x\to 0^-} \frac{-x}{x} = \lim_{x\to 0^-} -1 = -1
\lim_{x\to 0^+} \frac{\left|x\right|}{x} = \lim_{x\to 0^+} \frac{x}{x} = \lim_{x\to 0^+} 1 = 1
The limit does not exist.

17.  \lim_{x\to 2} \frac{1}{(x-2)^2}

As x approaches 2, the denominator will be a very small positive number, so the whole fraction will be a very large positive number. Thus, the limit is \mathbf{\infty}.

18.  \lim_{x\to 3} \frac{\sqrt{x^2+16}}{x-3}

As x approaches 3, the numerator goes to 5 and the denominator goes to 0. Depending on whether you approach 3 from the left or the right, the denominator will be either a very small negative number, or a very small positive number. So the limit from the left is -\infty and the limit from the right is +\infty. Thus, the limit does not exist.

19.  \lim_{x\to -2} \frac{3x^2-8x -3}{2x^2-18}

 \frac{3(-2)^2-8(-2) -3}{2(-2)^2-18} = \frac{3(4)+16-3}{2(4)-18} = \frac{12+16-3}{8-18} = \frac{25}{-10} = \mathbf{-\frac{5}{2}}

20.  \lim_{x\to 2} \frac{x^2 + 2x + 1}{x^2-2x+1}

\frac{2^2 + 2(2) + 1}{2^2-2(2)+1} = \frac{4 + 4 + 1}{4-4+1} = \frac{9}{1} = \mathbf{9}

21.  \lim_{x\to 3} \frac{x+3}{x^2-9}

\lim_{x\to 3} \frac{x+3}{(x+3)(x-3)} = \lim_{x\to 3} \frac{1}{x-3}
\lim_{x\to 3^{-}} \frac{1}{x-3} = -\infty
\lim_{x\to 3^{+}} \frac{1}{x-3} = +\infty
The limit does not exist.

22.  \lim_{x\to -1} \frac{x+1}{x^2+x}

\lim_{x\to -1} \frac{x+1}{x(x+1)} = \lim_{x\to -1} \frac{1}{x} = \frac{1}{-1} = \mathbf{-1}

23.  \lim_{x\to 1} \frac{1}{x^2+1}

\frac{1}{1^2+1} = \frac{1}{1+1} = \mathbf{\frac{1}{2}}

24.  \lim_{x\to 1} x^3 + 5x - \frac{1}{2-x}

1^3 + 5(1) - \frac{1}{2-1} = 1 + 5 - \frac{1}{1} = 6 - 1 = \mathbf{5}

25.  \lim_{x\to 1} \frac{x^2-1}{x^2+2x-3}

\lim_{x\to 1} \frac{(x-1)(x+1)}{(x-1)(x+3)} = \lim_{x\to 1} \frac{x+1}{x+3} = \frac{1+1}{1+3} = \frac{2}{4} = \mathbf{\frac{1}{2}}

26.  \lim_{x\to 1} \frac{5x}{x^2+2x-3}

Notice that as x approaches 1, the numerator approaches 5 while the denominator approaches 0. However, if you approach from below, the denominator is negative, and if you approach from above, the denominator is positive. So the limits from the left and right will be -\infty and +\infty respectively. Thus, the limit does not exist.
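The algebra in problem 25 can be sanity-checked by evaluating the function near x = 1 (a numerical sketch; x = 1 itself is excluded, since the function is undefined there):

```python
# Problem 25: (x^2 - 1)/(x^2 + 2x - 3) should approach 1/2 as x -> 1.
def g(x):
    return (x**2 - 1) / (x**2 + 2*x - 3)

approx = g(1 + 1e-6)   # evaluate near, but not at, x = 1
```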

Limits to Infinity

Evaluate the following limits or state that the limit does not exist.

27.  \lim_{x\to \infty} \frac{-x + \pi}{x^2 + 3x + 2}

This rational function is bottom-heavy, so the limit is \mathbf{0}.

28.  \lim_{x\to -\infty} \frac{x^2+2x+1}{3x^2+1}

This rational function has evenly matched powers of x in the numerator and denominator, so the limit will be the ratio of the coefficients, i.e. \mathbf{\frac{1}{3}}.

29.  \lim_{x\to -\infty} \frac{3x^2 + x}{2x^2 - 15}

Balanced powers in the numerator and denominator, so the limit is the ratio of the coefficients, i.e. \mathbf{\frac{3}{2}}.

30.  \lim_{x\to -\infty} 3x^2-2x+1

This polynomial can be viewed as a top-heavy rational function (with denominator 1), where the exponent of the ratio of the leading terms is 2. Since that exponent is even and the leading coefficient 3 is positive, the limit is \mathbf{\infty}.

31.  \lim_{x\to \infty} \frac{2x^2-32}{x^3-64}

Bottom-heavy rational function, so the limit is \mathbf{0}.

32.  \lim_{x\to \infty} 6

This is a rational function, as can be seen by writing it in the form \frac{6x^0}{1x^0}. Since the powers of x in the numerator and denominator are evenly matched, the limit will be the ratio of the coefficients, i.e. \mathbf{6}.

33.  \lim_{x\to \infty} \frac{3x^2 +4x}{x^4+2}

Bottom-heavy, so the limit is \mathbf{0}.

34.  \lim_{x\to -\infty} \frac{2x+3x^2+1}{2x^2+3}

Evenly matched highest powers of x in the numerator and denominator, so the limit will be the ratio of the corresponding coefficients, i.e. \mathbf{\frac{3}{2}}.

35.  \lim_{x\to -\infty} \frac{x^3-3x^2+1}{3x^2+x+5}

Top-heavy rational function, where the exponent of the ratio of the leading terms is 1, so the limit is \mathbf{-\infty}.

36.  \lim_{x\to \infty} \frac{x^2+2}{x^3-2}

Bottom-heavy, so the limit is \mathbf{0}.
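The top-heavy/bottom-heavy/evenly-matched heuristic used throughout this section can be spot-checked by plugging in a large |x| (a numerical sketch, not a proof):

```python
# Problem 28: (x^2 + 2x + 1)/(3x^2 + 1) for large negative x.
def r(x):
    return (x**2 + 2*x + 1) / (3*x**2 + 1)

val = r(-1e8)   # close to the ratio of leading coefficients, 1/3
```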

Limits of Piecewise Functions

Evaluate the following limits or state that the limit does not exist.

37. Consider the function

 f(x) = \begin{cases} (x-2)^2 & \mbox{if }x<2 \\ x-3 & \mbox{if }x\geq 2. \end{cases}
a.  \lim_{x\to 2^-}f(x)

(2-2)^2 = \mathbf{0}

b.  \lim_{x\to 2^+}f(x)

2-3 = \mathbf{-1}

c.  \lim_{x\to 2}f(x)

Since the limits from the left and right don't match, the limit does not exist.


38. Consider the function

 g(x) = \begin{cases} -2x+1 & \mbox{if }x\leq 0 \\ x+1 & \mbox{if }0<x<4 \\ x^2 +2 & \mbox{if }x \geq 4. \end{cases}
a.  \lim_{x\to 4^+} g(x)

4^2+2 = 16+2 = \mathbf{18}

b.  \lim_{x\to 4^-} g(x)

4+1 = \mathbf{5}

c.  \lim_{x\to 0^+} g(x)

0+1 = \mathbf{1}

d.  \lim_{x\to 0^-} g(x)

-2(0)+1 = \mathbf{1}

e.  \lim_{x\to 0} g(x)

Since the left and right limits match, the overall limit is also \mathbf{1}.

f.  \lim_{x\to 1} g(x)

1+1 = \mathbf{2}


39. Consider the function

 h(x) = \begin{cases} 2x-3 & \mbox{if }x<2 \\ 8 & \mbox{if }x=2 \\ -x+3 & \mbox{if } x>2. \end{cases}
a.  \lim_{x\to 0} h(x)

2(0)-3 = \mathbf{-3}

b.  \lim_{x\to 2^-} h(x)

2(2)-3 = 4-3 = \mathbf{1}

c.  \lim_{x\to 2^+} h(x)

-(2)+3 = \mathbf{1}

d.  \lim_{x\to 2} h(x)

Since the limits from the right and left match, the overall limit is \mathbf{1}. Note that in this case, the limit at 2 does not match the function value at 2, so the function is discontinuous at this point.
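The function from problem 39 can be mirrored directly in code; note that the value h(2) = 8 plays no role in the one-sided limits (an illustrative sketch):

```python
# The piecewise function h from problem 39.
def h(x):
    if x < 2:
        return 2*x - 3
    elif x == 2:
        return 8
    else:
        return -x + 3

left = h(2 - 1e-9)    # approaches 2(2) - 3 = 1
right = h(2 + 1e-9)   # approaches -(2) + 3 = 1
```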

Differentiation Solutions

Differentiation Defined Solutions

1. Find the slope of the tangent to the curve y=x^2 at (1,1).

The definition of the slope of f at x_0 is \lim_{h \to 0}\left[\frac{f\left( x_0+h \right) - f\left( x_0 \right)}{h}\right]
Substituting in f(x)=x^2 and x_0=1 gives:
\begin{align}
\lim_{h \to 0}\left[\frac{(1+h)^2-1}{h}\right] &= \lim_{h \to 0}\left[\frac{h^2+2h}{h}\right]\\
&=\lim_{h \to 0}\left[\frac{h(h+2)}{h}\right]\\
&=\lim_{h \to 0}  h+2\\
&=\mathbf{2}
\end{align}
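The limit above can also be approximated numerically: for f(x) = x^2 at x_0 = 1, the difference quotient approaches the slope 2 as h shrinks (a numerical sketch):

```python
# Difference quotient (f(x0 + h) - f(x0))/h for f(x) = x^2 at x0 = 1.
def slope(h, x0=1.0):
    return ((x0 + h)**2 - x0**2) / h

approx = slope(1e-6)   # close to the exact slope, 2
```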

2. Using the definition of the derivative find the derivative of the function f(x)=2x+3.

\begin{align}f'(x)
&=\lim_{\Delta x\to 0}\frac{(2(x+\Delta x)+3)-(2x+3)}{\Delta x}\\
&=\lim_{\Delta x\to 0}\frac{2x+2\Delta x+3-2x-3}{\Delta x}\\
&=\lim_{\Delta x\to 0}\frac{2\Delta x}{\Delta x}\\
&=\lim_{\Delta x\to 0}2\\
&=\mathbf{2}
\end{align}

3. Using the definition of the derivative find the derivative of the function f(x)=x^3. Now try f(x)=x^4. Can you see a pattern? In the next section we will find the derivative of f(x)=x^n for all n.

\begin{alignat}{2}\frac{d x^3}{dx}
&=\lim_{\Delta x\to 0}\frac{(x+\Delta x)^3-x^3}{\Delta x}
& \qquad\frac{d x^4}{dx}&=\lim_{\Delta x\to 0}\frac{(x+\Delta x)^4-x^4}{\Delta x}\\
&=\lim_{\Delta x\to 0}\frac{x^3+3x^2\Delta x+3x\Delta x^2+\Delta x^3-x^3}{\Delta x}
& &=\lim_{\Delta x\to 0}\frac{x^4+4x^3\Delta x+6x^2\Delta x^2+4x\Delta x^3+\Delta x^4-x^4}{\Delta x}\\
&=\lim_{\Delta x\to 0}\frac{3x^2\Delta x+3x\Delta x^2+\Delta x^3}{\Delta x}
& &=\lim_{\Delta x\to 0}\frac{4x^3\Delta x+6x^2\Delta x^2+4x\Delta x^3+\Delta x^4}{\Delta x}\\
&=\lim_{\Delta x\to 0}3x^2+3x\Delta x+\Delta x^2
& &=\lim_{\Delta x\to 0}4x^3+6x^2\Delta x+4x\Delta x^2+\Delta x^3\\
&=\mathbf{3x^2}
& &=\mathbf{4x^3}
\end{alignat}

4. The text states that the derivative of \left|x\right| is not defined at x = 0. Use the definition of the derivative to show this.

\begin{alignat}{2}\lim_{\Delta x\to 0^-}\frac{\left|0+\Delta x\right|-\left|0\right|}{\Delta x}
&=\lim_{\Delta x\to 0^-}\frac{-\Delta x}{\Delta x}
& \qquad\lim_{\Delta x\to 0^+}\frac{\left|0+\Delta x\right|-\left|0\right|}{\Delta x}
&= \lim_{\Delta x\to 0^+}\frac{\Delta x}{\Delta x}\\
&=\lim_{\Delta x\to 0^-}-1
& &=\lim_{\Delta x\to 0^+}1\\
&=-1
& &=1
\end{alignat}
Since the limits from the left and the right at x=0 are not equal, the limit does not exist, so \left|x\right| is not differentiable at x=0.

6. Use the definition of the derivative to show that the derivative of \sin x is \cos x . Hint: Use a suitable sum to product formula and the fact that \lim_{t \to 0}\frac{\sin(t)}{t}=1 and \lim_{t \to 0}\frac{\cos(t)-1}{t}=0.

\begin{align}\lim_{\Delta x\to 0}\frac{\sin(x+\Delta x)-\sin(x)}{\Delta x}
&=\lim_{\Delta x\to 0}\frac{(\sin(x)\cos(\Delta x)+\cos(x)\sin(\Delta x))-\sin(x)}{\Delta x}\\
&=\lim_{\Delta x\to 0}\frac{\sin(x)(\cos(\Delta x)-1)+\cos(x)\sin(\Delta x)}{\Delta x}\\
&=\sin(x)\cdot\lim_{\Delta x\to 0}\frac{\cos(\Delta x)-1}{\Delta x}+\cos(x)\cdot\lim_{\Delta x\to 0}\frac{\sin(\Delta x)}{\Delta x}\\
&=\sin(x)\cdot 0+\cos(x)\cdot 1\\
&=\cos(x)
\end{align}
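The result can be spot-checked numerically at an arbitrary point (an illustrative sketch; x = 0.7 is chosen arbitrarily):

```python
import math

# Spot-check d/dx sin(x) = cos(x) at x = 0.7 with a small forward difference.
x, h = 0.7, 1e-6
approx = (math.sin(x + h) - math.sin(x)) / h
exact = math.cos(x)
```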

  • Find the derivatives of the following equations:
7.  f(x) = 42

\mathbf{f'(x)=0}

8.  f(x) = 6x + 10

\mathbf{f'(x)=6}

9.  f(x) = 2x^2 + 12x + 3

\mathbf{f'(x)=4x+12}

Chain Rule Solutions

1. Evaluate f'(x) if f(x)=(x^2+5)^2, first by expanding and differentiating directly, and then by applying the chain rule on f(u(x))=u^2 where u=x^2+5. Compare answers.

First method:

f(x)=x^4 + 10x^2 + 25
\mathbf{f'(x) = 4x^3 + 20x}

Second method:

f'(u(x))=\frac{df}{du}\cdot\frac{du}{dx}=2u\cdot2x=2(x^2+5)\cdot2x=\mathbf{4x^3+20x}

The two methods give the same answer.

2. Evaluate the derivative of y=\sqrt{1 + x^2} using the chain rule by letting y=\sqrt{u} and u=1+x^2.
\frac{dy}{du} = \frac{1}{2 \sqrt{u}};\quad\frac{du}{dx} = 2x
\frac{dy}{dx} = \frac{dy}{du}\cdot\frac{du}{dx} = \frac {1} {2 \sqrt {1 + x^2}}\cdot 2x = \mathbf{\frac {x} \sqrt {1 + x^2}}
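The chain-rule result for problem 2 can be checked against a central difference (a numerical sketch; x = 3 is an arbitrary sample point):

```python
import math

# Problem 2: d/dx sqrt(1 + x^2) should equal x/sqrt(1 + x^2); check at x = 3.
x, h = 3.0, 1e-6
numeric = (math.sqrt(1 + (x + h)**2) - math.sqrt(1 + (x - h)**2)) / (2*h)
exact = x / math.sqrt(1 + x**2)
```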

Some Important Theorems Solutions

Rolle's Theorem

1. Show that Rolle's Theorem holds true between the x-intercepts of the function f(x)=x^2-3x.

1: The question wishes for us to use the x-intercepts as the endpoints of our interval.

Factor the expression to obtain x(x-3)= 0 . x=0 and x=3 are our two endpoints. We know that f(0) and f(3) are the same, thus that satisfies the first part of Rolle's theorem (f(a)=f(b)).

2: Now by Rolle's Theorem, we know that somewhere between these points, the slope will be zero. Where? Easy: Take the derivative.

\frac{dy}{dx}  = 2x - 3

Thus, at  x = 3/2 , we have a spot with a slope of zero. We know that 3/2 (or 1.5) is between 0 and 3. Thus, Rolle's Theorem is true for this (as it is for all cases).
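The two facts used here, equal endpoint values and a zero of the derivative strictly between them, can be confirmed directly (a minimal sketch):

```python
# Rolle's Theorem data for f(x) = x^2 - 3x on [0, 3].
f = lambda x: x**2 - 3*x
f_prime = lambda x: 2*x - 3   # the derivative computed above

endpoints_equal = (f(0) == f(3) == 0)
critical_point = 1.5          # solution of 2x - 3 = 0, inside (0, 3)
```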

Mean Value Theorem

2. Show that h(a)=h(b), where h(x) is the function that was defined in the proof of Cauchy's Mean Value Theorem.

\begin{align}h(a)&=f(a)(g(b)-g(a))-g(a)(f(b)-f(a))-f(a)g(b)+f(b)g(a)\\
&=f(a)g(b)-f(a)g(a)-g(a)f(b)+g(a)f(a)-f(a)g(b)+f(b)g(a)\\
&=0\end{align} \begin{align}h(b)&=f(b)(g(b)-g(a))-g(b)(f(b)-f(a))-f(a)g(b)+f(b)g(a)\\
&=f(b)g(b)-f(b)g(a)-g(b)f(b)+g(b)f(a)-f(a)g(b)+f(b)g(a)\\
&=0\end{align}

3. Show that the Mean Value Theorem follows from Cauchy's Mean Value Theorem.

Let g(x)=x. Then g'(x)=1 and g(b)-g(a)=b-a, which is non-zero if b\ne a. Then
 \frac{f'(c)}{g'(c)} = \frac{f(b) - f(a)}{g(b) - g(a)} simplifies to f'(c) = \frac{f(b) - f(a)}{b-a} , which is the Mean Value Theorem.

4. Find the x=c that satisfies the Mean Value Theorem for the function f(x)=x^3 with endpoints x=0 and x=2.

1: Using the expression from the mean value theorem

\frac{f(b)-f(a)}{b-a}

insert values. Our chosen interval is [0,2]. So, we have

\frac{f(2)-f(0)}{2-0} = \frac{8}{2} = 4


2: By the Mean Value Theorem, we know that somewhere in the interval exists a point that has the same slope as that point. Thus, let us take the derivative to find this point x = c.

\frac{dy}{dx} = 3x^2

Now, we know that the slope of the point is 4. So, the derivative at this point c is 4. Thus, 4 = 3x^2. So x=\sqrt{4/3}=\mathbf{\frac{2\sqrt{3}}{3}}

5. Find the point that satisfies the Mean Value Theorem on the function f(x) = \sin(x) and the interval [0,\pi].

1: We start with the expression:

\frac{f(b)-f(a)}{b-a}

so,

\frac{\sin(\pi) - \sin(0)}{\pi - 0} = 0

(Remember, sin(π) and sin(0) are both 0.)

2: Now that we have the slope of the line, we must find the point x = c that has the same slope. We must now get the derivative!

\frac{d\sin(x)}{dx} = \cos(x) = 0

The cosine function is 0 at \pi /2 + \pi n (where n is an integer). Remember, we are bound by the interval [0,\pi], so \mathbf{\pi/2} is the point c that satisfies the Mean Value Theorem.
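Both parts of this solution can be verified numerically: the secant slope over [0, π] is 0, and the derivative cos(x) is also 0 at the point c = π/2 (an illustrative sketch):

```python
import math

# MVT for sin on [0, pi]: the secant slope is 0, and cos(pi/2) matches it.
secant = (math.sin(math.pi) - math.sin(0)) / (math.pi - 0)
c = math.pi / 2
derivative_at_c = math.cos(c)
```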

Basics of Differentiation Cumulative Exercise Set Solutions

Find the Derivative by Definition

1. f(x) = x^2 \,

\begin{align}f'(x)
&=\lim_{\Delta x \to 0}\frac{(x+\Delta x)^2-x^2}{\Delta x}\\
&=\lim_{\Delta x \to 0}\frac{x^2+2x\Delta x+\Delta x^2-x^2}{\Delta x}\\
&=\lim_{\Delta x \to 0}\frac{2x\Delta x+\Delta x^2}{\Delta x}\\
&=\lim_{\Delta x \to 0}2x+\Delta x\\
&=\mathbf{2x}\end{align}

2. f(x) = 2x + 2 \,

\begin{align}f'(x)
&=\lim_{\Delta x \to 0}\frac{[2(x+\Delta x) + 2] - (2x + 2)}{\Delta x}\\
&=\lim_{\Delta x \to 0}\frac{2x+2\Delta x + 2 - 2x - 2}{\Delta x}\\
&=\lim_{\Delta x \to 0}\frac{2\Delta x}{\Delta x}\\
&=\mathbf{2}\end{align}

3. f(x) = \frac{1}{2}x^2 \,

\begin{align}f'(x)
&=\lim_{\Delta x \to 0}\frac{\frac{1}{2}(x+\Delta x)^2-\frac{1}{2}x^2}{\Delta x}\\
&=\lim_{\Delta x \to 0}\frac{\frac{1}{2}(x^2+2x\Delta x+\Delta x^2)-\frac{1}{2}x^2}{\Delta x}\\
&=\lim_{\Delta x \to 0}\frac{\frac{x^2}{2}+\frac{2x \Delta x}{2}+\frac{\Delta x^2}{2}-\frac{x^2}{2}}{\Delta x}\\
&=\lim_{\Delta x \to 0}\frac{2x\Delta x + \Delta x^2}{2\Delta x}\\
&=\lim_{\Delta x \to 0}x+\frac{\Delta x}{2}\\
&=\mathbf{x}\end{align}

4. f(x) = 2x^2 + 4x + 4 \,

\begin{align}f'(x)
&=\lim_{\Delta x \to 0}\frac{[2(x+\Delta x)^2 + 4(x+\Delta x)+4] - (2x^2+4x+4)}{\Delta x}\\
&=\lim_{\Delta x \to 0}\frac{4x\Delta x + 2\Delta x^2 + 4\Delta x}{\Delta x}\\
&=\lim_{\Delta x \to 0}4x+2\Delta x + 4\\
&=\mathbf{4x + 4}\end{align}

5. f(x) = \sqrt{x+2} \,

\begin{align}f'(x)
&=\lim_{\Delta x \to 0}\frac{\sqrt{x+\Delta x+2}-\sqrt{x+2}}{\Delta x}\\
&=\lim_{\Delta x \to 0}(\frac{\sqrt{x+\Delta x+2}-\sqrt{x+2}}{\Delta x})(\frac{\sqrt{x+\Delta x+2}+\sqrt{x+2}}{\sqrt{x+\Delta x+2}+\sqrt{x+2}})\\
&=\lim_{\Delta x \to 0}\frac{x+\Delta x+2-x-2}{\Delta x(\sqrt{x+\Delta x+2}+\sqrt{x+2})}\\
&=\lim_{\Delta x \to 0}\frac{\Delta x}{\Delta x(\sqrt{x+\Delta x+2}+\sqrt{x+2})}\\
&=\lim_{\Delta x \to 0}\frac{1}{\sqrt{x+\Delta x+2}+\sqrt{x+2}}\\
&=\mathbf{\frac{1}{2\sqrt{x+2}}}\end{align}

6. f(x) = \frac{1}{x} \,

\begin{align}f'(x)
&=\lim_{\Delta x \to 0}\frac{\frac{1}{x+\Delta x}-\frac{1}{x}}{\Delta x}\\
&=\lim_{\Delta x \to 0}\frac{\frac{x-x-\Delta x}{x(x+\Delta x)}}{\Delta x}\\
&=\lim_{\Delta x \to 0}\frac{-\Delta x}{x\Delta x(x+\Delta x)}\\
&=\lim_{\Delta x \to 0}\frac{-1}{x(x+\Delta x)}\\
&=\mathbf{-\frac{1}{x^2}}\end{align}

7. f(x) = \frac{3}{x+1} \,

\begin{align}f'(x)
&=\lim_{\Delta x \to 0}\frac{\frac{3}{x+\Delta x+1}-\frac{3}{x+1}}{\Delta x}\\
&=\lim_{\Delta x \to 0}\frac{\frac{3x+3-(3x+3\Delta x+3)}{(x+1)(x+\Delta x+1)}}{\Delta x}\\
&=\lim_{\Delta x \to 0}\frac{-3\Delta x}{\Delta x(x+1)(x+\Delta x+1)}\\
&=\lim_{\Delta x \to 0}\frac{-3}{(x+1)(x+\Delta x+1)}\\
&=\mathbf{\frac{-3}{(x+1)^2}}\end{align}

8. f(x) = \frac{1}{\sqrt{x+1}} \,

\begin{align}f'(x)
&=\lim_{\Delta x \to 0}\frac{\frac{1}{\sqrt{x+\Delta x+1}}-\frac{1}{\sqrt{x+1}}}{\Delta x}\\
&=\lim_{\Delta x \to 0}\frac{\frac{\sqrt{x+1}-\sqrt{x+\Delta x+1}}{\sqrt{x+\Delta x+1}\sqrt{x+1}}}{\Delta x}\\
&=\lim_{\Delta x \to 0}(\frac{\sqrt{x+1}-\sqrt{x+\Delta x+1}}{\Delta x\sqrt{x+\Delta x+1}\sqrt{x+1}})(\frac{\sqrt{x+1}+\sqrt{x+\Delta x+1}}{\sqrt{x+1}+\sqrt{x+\Delta x+1}})\\
&=\lim_{\Delta x \to 0}\frac{x+1-(x+\Delta x+1)}{\Delta x\sqrt{x+\Delta x+1}\sqrt{x+1}(\sqrt{x+\Delta x+1}+\sqrt{x+1})}\\
&=\lim_{\Delta x \to 0}\frac{-1}{\sqrt{x+\Delta x+1}\sqrt{x+1}(\sqrt{x+\Delta x+1}+\sqrt{x+1})}\\
&=\frac{-1}{(x+1)(2\sqrt{x+1})}\\
&=\mathbf{\frac{-1}{2(x+1)^{3/2}}}\end{align}

9. f(x) = \frac{x}{x+2} \,

\begin{align}f'(x)
&=\lim_{\Delta x \to 0}\frac{\frac{x+\Delta x}{x+\Delta x+2}-\frac{x}{x+2}}{\Delta x}\\
&=\lim_{\Delta x \to 0}\frac{(x+\Delta x)(x+2)-x(x+\Delta x+2)}{\Delta x(x+\Delta x+2)(x+2)}\\
&=\lim_{\Delta x \to 0}\frac{x^2+2x+x\Delta x+2\Delta x-x^2-x\Delta x-2x}{\Delta x(x+\Delta x+2)(x+2)}\\
&=\lim_{\Delta x \to 0}\frac{2\Delta x}{\Delta x(x+\Delta x+2)(x+2)}\\
&=\lim_{\Delta x \to 0}\frac{2}{(x+\Delta x+2)(x+2)}\\
&=\mathbf{\frac{2}{(x+2)^2}}\end{align}
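A central difference gives a quick numerical check of the closed-form answers above; here two of them are spot-checked at arbitrary sample points (an illustrative sketch):

```python
# Central difference approximation of f'(x); check two of the answers above.
def d(f, x, h=1e-6):
    return (f(x + h) - f(x - h)) / (2*h)

check9 = d(lambda x: x / (x + 2), 1.0)   # problem 9: 2/(x+2)^2 = 2/9 at x = 1
check6 = d(lambda x: 1 / x, 2.0)         # problem 6: -1/x^2 = -1/4 at x = 2
```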

Prove the Constant Rule

10. Use the definition of the derivative to prove that for any fixed real number c, \frac{d}{dx}\left[cf(x)\right] = c \frac{d}{dx}\left[f(x)\right]

\begin{align}\frac{d}{dx}\left[cf(x)\right]
&=\lim_{\Delta x \to 0}\frac{cf\left(x+\Delta x \right)-cf\left(x\right)}{\Delta x}\\
&=c\lim_{\Delta x \to 0}\frac{f(x+\Delta x)-f(x)}{\Delta x}\\
&=c\frac{d}{dx}\left[f(x)\right]\end{align}

Find the Derivative by Rules

Power Rule

11. f(x) = 2x^2 + 4\,

f'(x)=\mathbf{4x}

12. f(x) = 3\sqrt[3]{x}\,

f'(x)=3(\frac{1}{3})x^{-2/3}=\mathbf{\frac{1}{\sqrt[3]{x^2}}}

13. f(x) = 2x^5+8x^2+x-78\,

f'(x)=\mathbf{10x^4+16x+1}

14. f(x) = 7x^7+8x^5+x^3+x^2-x\,

f'(x)=\mathbf{49x^6+40x^4+3x^2+2x-1}

15. f(x) = \frac{1}{x^2}+3x^\frac{1}{3}\,

f'(x)=\frac{-2}{x^{3}}+x^{-2/3}=\mathbf{\frac{-2}{x^3}+\frac{1}{\sqrt[3]{x^2}}}

16. f(x) = 3x^{15} + \frac{1}{17}x^2 +\frac{2}{\sqrt{x}} \,

f'(x)=45x^{14}+\frac{2}{17}x-\frac{1}{\sqrt{x^{3}}}=\mathbf{45x^{14}+\frac{2}{17}x-\frac{1}{x\sqrt{x}}}

17. f(x) = \frac{3}{x^4} - \sqrt[4]{x} + x \,

f'(x)=\frac{-12}{x^{5}}-\frac{1}{4}x^{-3/4}+1=\mathbf{\frac{-12}{x^5}-\frac{1}{4\sqrt[4]{x^3}}+1}

18. f(x) = 6x^{1/3}-x^{0.4} +\frac{9}{x^2} \,

f'(x)=2x^{-2/3}-0.4x^{-0.6}-\frac{18}{x^{3}}=\mathbf{\frac{2}{\sqrt[3]{x^2}}-\frac{0.4}{x^{0.6}}-\frac{18}{x^3}}

19. f(x) = \frac{1}{\sqrt[3]{x}} + \sqrt{x} \,

f'(x)=-\frac{1}{3x^{4/3}}+\frac{1}{2\sqrt{x}}=\mathbf{\frac{-1}{3x\sqrt[3]{x}}+\frac{1}{2\sqrt{x}}}

Product Rule

20. f(x) = (x^4+4x+2)(2x+3) \,

f'(x)=(4x^{3}+4)(2x+3)+(x^{4}+4x+2)(2)=\mathbf{10x^4+12x^3+16x+16}

21. f(x) = (2x-1)(3x^2+2) \,

f'(x)=(2)(3x^{2}+2)+(2x-1)(6x)=\mathbf{18x^2-6x+4}

22. f(x) = (x^3-12x)(3x^2+2x) \,

f'(x)=(3x^{2}-12)(3x^{2}+2x)+(x^{3}-12x)(6x+2)=\mathbf{15x^4+8x^3-108x^2-48x}

23. f(x) = (2x^5-x)(3x+1) \,

f'(x)=(10x^{4}-1)(3x+1)+(2x^{5}-x)(3)=\mathbf{36x^5+10x^4-6x-1}

Quotient Rule

24. f(x) = \frac{2x+1}{x+5} \,

f'(x)=\frac{(x+5)(2)-(2x+1)}{(x+5)^{2}}=\mathbf{\frac{9}{(x+5)^2}}

25. f(x) = \frac{3x^4+2x +2}{3x^2+1} \,

f'(x)=\frac{(3x^{2}+1)(12x^{3}+2)-(3x^{4}+2x+2)(6x)}{(3x^{2}+1)^{2}}=\mathbf{\frac{18x^5+12x^3-6x^2-12x+2}{(3x^2+1)^2}}

26. f(x) = \frac{x^\frac{3}{2}+1}{x+2} \,

f'(x)=\frac{(x+2)(\frac{3}{2}\sqrt{x})-(x^{\frac{3}{2}}+1)}{(x+2)^2}=\mathbf{\frac{x\sqrt{x}+6\sqrt{x}-2}{2(x+2)^2}}

27. d(u) = \frac{u^3+2}{u^3} \,

d'(u)=\frac{u^{3}(3u^{2})-(u^{3}+2)(3u^{2})}{u^{6}}=\mathbf{-\frac{6}{u^4}}

28. f(x) = \frac{x^2+x}{2x-1} \,

f'(x)=\frac{(2x-1)(2x+1)-(x^{2}+x)(2)}{(2x-1)^{2}}=\mathbf{\frac{2x^2-2x-1}{(2x-1)^2}}

29. f(x) = \frac{x+1}{2x^2+2x+3} \,

f'(x)=\frac{(2x^{2}+2x+3)-(x+1)(4x+2)}{(2x^{2}+2x+3)^2}=\mathbf{\frac{-2x^2-4x+1}{(2x^2+2x+3)^2}}

30. f(x) = \frac{16x^4+2x^2}{x} \,

f'(x)=\frac{x(64x^{3}+4x)-(16x^{4}+2x^{2})}{x^{2}}=\mathbf{48x^2+2}

Chain Rule

31. f(x) = (x+5)^2 \,

Let f(x)=g(h(x));\quad g(x)=x^{2};\quad h(x)=x+5. Then
f'(x)=\frac{dg}{dh}\frac{dh}{dx}=2(x+5)(1)=\mathbf{2(x+5)}

32. g(x) = (x^3 - 2x + 5)^2 \,

Let g(x)=r(s(x));\quad r(x)=x^{2};\quad s(x)=x^{3}-2x+5. Then
g'(x)=\frac{dr}{ds}\frac{ds}{dx}=\mathbf{2(x^{3}-2x+5)(3x^{2}-2)}

33. f(x) = \sqrt{1-x^2} \,

Let f(x)=g(h(x));\quad g(x)=\sqrt{x};\quad h(x)=1-x^{2}. Then
f'(x)=\frac{dg}{dh}\frac{dh}{dx}=\frac{1}{2\sqrt{1-x^{2}}}(-2x)=\mathbf{-\frac{x}{\sqrt{1-x^{2}}}}

34. f(x) = \frac{(2x+4)^3}{4x^3+1} \,

Let f(x)=\frac{N(x)}{D(x)};\quad N(x)=g(h(x));\quad g(x)=x^{3};\quad h(x)=2x+4;\quad D(x)=4x^{3}+1. Then
f'(x)=\frac{D(x)N'(x)-N(x)D'(x)}{D^{2}(x)}
N'(x)=\frac{dg}{dh}\frac{dh}{dx}=3(2x+4)^{2}(2)=6(2x+4)^{2}
D'(x)=12x^{2}
f'(x)=\frac{(4x^{3}+1)6(2x+4)^{2}-(2x+4)^{3}12x^{2}}{(4x^{3}+1)^{2}}=\mathbf{\frac{6(4x^{3}+1)(2x+4)^{2}-12x^2(2x+4)^{3}}{(4x^{3}+1)^{2}}}

35. f(x) = (2x+1)\sqrt{2x+2} \,

Let f(x)=A(x)B(x);\quad A(x)=2x+1;\quad B(x)=g(h(x));\quad g(x)=\sqrt{x};\quad h(x)=2x+2. Then
f'(x)=A'(x)B(x)+A(x)B'(x)
A'(x)=2
B'(x)=\frac{dg}{dh}\frac{dh}{dx}=\frac{1}{2\sqrt{2x+2}}(2)=\frac{1}{\sqrt{2x+2}}
f'(x)=\mathbf{2\sqrt{2x+2}+\frac{2x+1}{\sqrt{2x+2}}}

36. f(x) = \frac{2x+1}{\sqrt{2x+2}} \,

Let f(x)=\frac{N(x)}{D(x)};\quad N(x)=2x+1;\quad D(x)=g(h(x));\quad g(x)=\sqrt{x};\quad h(x)=2x+2. Then
f'(x)=\frac{D(x)N'(x)-N(x)D'(x)}{D^{2}(x)}
N'(x)=2
D'(x)=\frac{dg}{dh}\frac{dh}{dx}=\frac{1}{2\sqrt{2x+2}}(2)=\frac{1}{\sqrt{2x+2}}
f'(x)=\frac{\sqrt{2x+2}(2)-\frac{(2x+1)}{\sqrt{2x+2}}}{2x+2}=\mathbf{\frac{2x+3}{(2x+2)^{3/2}}}

37. f(x) = \sqrt{2x^2+1}(3x^4+2x)^2 \,

Let f(x)=A(x)B(x);\quad A(x)=g(h(x));\quad g(x)=\sqrt{x};\quad h(x)=2x^{2}+1;\quad B(x)=r(s(x));\quad r(x)=x^{2};\quad s(x)=3x^{4}+2x. Then
f'(x)=A'(x)B(x)+A(x)B'(x)
A'(x)=\frac{dg}{dh}\frac{dh}{dx}=\frac{1}{2\sqrt{2x^{2}+1}}(4x)=\frac{2x}{\sqrt{2x^{2}+1}}
B'(x)=\frac{dr}{ds}\frac{ds}{dx}=2(3x^{4}+2x)(12x^{3}+2)
f'(x)=\frac{2x}{\sqrt{2x^{2}+1}}(3x^{4}+2x)^{2}+\sqrt{2x^{2}+1}(2)(3x^{4}+2x)(12x^{3}+2)=\mathbf{\frac{2x(3x^{4}+2x)^{2}}{\sqrt{2x^{2}+1}}+\sqrt{2x^{2}+1}(2)(3x^{4}+2x)(12x^{3}+2)}

38. f(x) = \frac{2x+3}{(x^4+4x+2)^2} \,

Let f(x)=\frac{N(x)}{D(x)};\quad N(x)=2x+3;\quad D(x)=g(h(x));\quad g(x)=x^{2};\quad h(x)=x^{4}+4x+2. Then
f'(x)=\frac{D(x)N'(x)-N(x)D'(x)}{D^{2}(x)}
N'(x)=2
D'(x)=\frac{dg}{dh}\frac{dh}{dx}=2(x^{4}+4x+2)(4x^{3}+4)
f'(x)=\frac{(x^{4}+4x+2)^{2}(2)-(2x+3)(2)(x^{4}+4x+2)(4x^{3}+4)}{(x^{4}+4x+2)^{4}}=\mathbf{\frac{2(x^{4}+4x+2)^{2}-2(2x+3)(x^{4}+4x+2)(4x^{3}+4)}{(x^{4}+4x+2)^{4}}}

39. f(x) = \sqrt{x^3+1}(x^2-1) \,

Let f(x)=A(x)B(x);\quad A(x)=g(h(x));\quad g(x)=\sqrt{x};\quad h(x)=x^{3}+1;\quad B(x)=x^{2}-1. Then
f'(x)=A'(x)B(x)+A(x)B'(x)
A'(x)=\frac{dg}{dh}\frac{dh}{dx}=\frac{1}{2\sqrt{x^{3}+1}}(3x^{2})=\frac{3x^{2}}{2\sqrt{x^{3}+1}}
B'(x)=2x
f'(x)=\frac{3x^{2}}{2\sqrt{x^{3}+1}}(x^{2}-1)+\sqrt{x^{3}+1}(2x)=\mathbf{\frac{3x^{2}(x^{2}-1)}{2\sqrt{x^{3}+1}}+2x\sqrt{x^{3}+1}}

40. f(x) = ((2x+3)^4 + 4(2x+3) +2)^2 \,

Let f(x)=g(h(x));\quad g(x)=x^{2};\quad h(x)=A(x)+B(x)+2;\quad A(x)=r(s(x));\quad r(x)=x^{4};\quad s(x)=2x+3;\quad B(x)=4(2x+3). Then
f'(x)=\frac{dg}{dh}\frac{dh}{dx}=2(A(x)+B(x)+2)(A'(x)+B'(x))
A'(x)=\frac{dr}{ds}\frac{ds}{dx}=4(2x+3)^{3}(2)=8(2x+3)^{3}
B'(x)=8
f'(x)=\mathbf{2((2x+3)^{4}+4(2x+3)+2)(8(2x+3)^{3}+8)}

41. f(x) = \sqrt{1+x^2} \,

Let f(x)=g(h(x));\quad g(x)=\sqrt{x};\quad h(x)=1+x^{2}. Then
f'(x)=\frac{dg}{dh}\frac{dh}{dx}=\frac{1}{2\sqrt{1+x^{2}}}(2x)=\mathbf{\frac{x}{\sqrt{1+x^{2}}}}

Exponentials

42. f(x) = (3x^2+e)e^{2x}\,

f'(x)=(6x)e^{2x}+(3x^{2}+e)2e^{2x}=\mathbf{6xe^{2x}+2e^{2x}(3x^{2}+e)}

43. f(x) = e^{2x^2+3x}

Let f(x)=g(h(x));\quad g(x)=e^{x};\quad h(x)=2x^{2}+3x. Then
f'(x)=\frac{dg}{dh}\frac{dh}{dx}=e^{2x^{2}+3x}(4x+3)=\mathbf{(4x+3)e^{2x^{2}+3x}}

44. f(x) = e^{e^{2x^2+1}}

Let

u(x)=e^{x}
v(x)=e^{x}
w(x)=2x^{2}+1

Then

f(x)=u(v(w(x)))

Using the chain rule, we have

\frac{df}{dx}=\frac{du}{dv}\frac{dv}{dw}\frac{dw}{dx}

The individual factors are

\frac{du}{dv}=\frac{d(e^{v})}{dv}=e^{v}=e^{e^{w}}=e^{e^{2x^{2}+1}}
\frac{dv}{dw}=\frac{d(e^{w})}{dw}=e^{w}=e^{2x^{2}+1}
\frac{dw}{dx}=\frac{d(2x^{2}+1)}{dx}=4x

So

\frac{df}{dx}=(e^{e^{2x^{2}+1}})(e^{2x^{2}+1})(4x)=\mathbf{4xe^{2x^{2}+1+e^{2x^{2}+1}}}

45. f(x) = 4^x\,

f'(x)=\mathbf{\ln(4)4^{x}}
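The exponential rule used in problem 45 can be checked against a central difference at an arbitrary point (a numerical sketch):

```python
import math

# Problem 45: d/dx 4^x = ln(4) * 4^x; compare numeric and exact at x = 1.5.
h = 1e-6
numeric = (4**(1.5 + h) - 4**(1.5 - h)) / (2*h)
exact = math.log(4) * 4**1.5
```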

Logarithms

46. f(x) = 2^{x-3}\cdot3\sqrt{x^3-2}+\ln x\,

f'(x)=\ln(2)2^{x-3}\cdot3\sqrt{x^{3}-2}+2^{x-3}\frac{3}{2\sqrt{x^{3}-2}}3x^{2}+\frac{1}{x}=\mathbf{3\ln(2)2^{x-3}\sqrt{x^{3}-2}+\frac{9x^{2}2^{x-4}}{\sqrt{x^{3}-2}}+\frac{1}{x}}

47. f(x) = \ln x - 2e^x + \sqrt{x}\,

f'(x)=\mathbf{\frac{1}{x}-2e^{x}+\frac{1}{2\sqrt{x}}}

48. f(x) = \ln(\ln(x^3(x+1))) \,

Let f(x)=g(g(h(x)));\quad g(x)=\ln(x);\quad h(x)=x^{3}(x+1). Then
f'(x)=\frac{dg(g(h(x)))}{dg(h(x))}\frac{dg(h(x))}{dh(x)}\frac{dh(x)}{dx}=\frac{1}{\ln(x^{3}(x+1))}\frac{1}{x^{3}(x+1)}(4x^{3}+3x^{2})=\mathbf{\frac{4x^{3}+3x^{2}}{x^{3}(x+1)\ln(x^{3}(x+1))}}

49. f(x) = \ln(2x^2+3x)\,

f'(x)=\frac{1}{2x^{2}+3x}(4x+3)=\mathbf{\frac{4x+3}{2x^{2}+3x}}

50. f(x) = \log_4 x + 2\ln x\,

f'(x)=\mathbf{\frac{1}{x\ln4}+\frac{2}{x}}

Trigonometric functions

51. f(x) = 3e^x-4\cos (x) - \frac{1}{4}\ln x\,

f'(x)=\mathbf{3e^{x}+4\sin(x)-\frac{1}{4x}}

52. f(x) = \sin(x)+\cos(x)\,

f'(x)=\mathbf{\cos(x)-\sin(x)}

More Differentiation

53. \frac{d}{dx}[(x^{3}+5)^{10}]

10(x^{3}+5)^{9}(3x^{2})=\mathbf{30x^{2}(x^{3}+5)^{9}}

54. \frac{d}{dx}[x^{3}+3x]

\mathbf{3x^{2}+3}

55. \frac{d}{dx}[(x+4)(x+2)(x-3)]

Let f(x)=A(x)B(x)C(x);\quad A(x)=x+4;\quad B(x)=x+2;\quad C(x)=x-3. Then
f'(x)=A'(x)B(x)C(x)+A(x)B'(x)C(x)+A(x)B(x)C'(x)
A'(x)=B'(x)=C'(x)=1
f'(x)=\mathbf{(x+2)(x-3)+(x+4)(x-3)+(x+4)(x+2)}

56. \frac{d}{dx}[\frac{x+1}{3x^{2}}]

\frac{3x^{2}-(x+1)(6x)}{(3x^{2})^{2}}=\frac{3x^{2}-6x^{2}-6x}{9x^{4}}=\frac{-3x^{2}-6x}{9x^{4}}=\mathbf{-\frac{x+2}{3x^{3}}}

57. \frac{d}{dx}[3x^{3}]

\mathbf{9x^{2}}

58. \frac{d}{dx}[x^{4}\sin x]

\mathbf{4x^{3}\sin x+x^{4}\cos x}

59. 2^{x}

\mathbf{\ln(2)2^{x}}

60. \frac{d}{dx}[e^{x^{2}}]

\mathbf{2xe^{x^{2}}}

61. \frac{d}{dx}[e^{2^{x}}]

\mathbf{\ln(2)2^{x}e^{2^{x}}}

Implicit Differentiation

Use implicit differentiation to find y'

62.  x^3 + y^3 = xy \,

3x^{2}+3y^{2}y'=y+xy'
3y^{2}y'-xy'=y-3x^{2}
\mathbf{y'=\frac{y-3x^{2}}{3y^{2}-x}}

63.  (2x+y)^4 + 3x^2 +3y^2 = \frac{x}{y} + 1 \,

4(2x+y)^{3}(2+y')+6x+6yy'=\frac{y-xy'}{y^{2}}
8(2x+y)^{3}+4y'(2x+y)^{3}+6x+6yy'=\frac{1}{y}-\frac{xy'}{y^{2}}
y'(4(2x+y)^{3}+6y+\frac{x}{y^{2}})=\frac{1}{y}-8(2x+y)^{3}-6x
y'(\frac{4y^{2}(2x+y)^{3}+6y^{3}+x}{y^{2}})=\frac{y-8y^{2}(2x+y)^{3}-6xy^{2}}{y^{2}}
\mathbf{y'=\frac{y-8y^{2}(2x+y)^{3}-6xy^{2}}{4y^{2}(2x+y)^{3}+6y^{3}+x}}

Logarithmic Differentiation

Use logarithmic differentiation to find \frac{dy}{dx}:

64. y = x(\sqrt[4]{1-x^3}\,)

\ln y=\ln(x)+\ln(\sqrt[4]{1-x^{3}})=\ln(x)+\frac{\ln(1-x^{3})}{4}
\frac{y'}{y}=\frac{1}{x}-\frac{3x^{2}}{4(1-x^{3})}
y'=x(\sqrt[4]{1-x^{3}}\,)(\frac{1}{x}-\frac{3x^{2}}{4(1-x^{3})})=\mathbf{\sqrt[4]{1-x^{3}}-\frac{3x^{3}}{4(1-x^{3})^{3/4}}}

65. y = \sqrt{x+1 \over 1-x}\,

\ln y=\frac{1}{2}(\ln(x+1)-\ln(1-x))
\frac{y'}{y}=\frac{1}{2}(\frac{1}{x+1}+\frac{1}{1-x})
\mathbf{y'=\frac{1}{2}\sqrt{\frac{x+1}{1-x}}(\frac{1}{x+1}+\frac{1}{1-x})}

66. y = (2x)^{2x}\,

\ln y=2x\ln(2x)
\frac{y'}{y}=2\ln(2x)+2x\frac{2}{2x}=2\ln(2x)+2
\mathbf{y'=(2x)^{2x}(2\ln(2x)+2)}

67. y = (x^3+4x)^{3x+1}\,

\ln y=(3x+1)\ln(x^{3}+4x)
\frac{y'}{y}=3\ln(x^{3}+4x)+(3x+1)\frac{3x^{2}+4}{x^{3}+4x}
\mathbf{y'=(x^{3}+4x)^{3x+1}(3\ln(x^{3}+4x)+\frac{(3x+1)(3x^{2}+4)}{x^{3}+4x})}

68. y = (6x)^{\cos(x) + 1}\,

\ln y=(\cos(x)+1)\ln(6x)
\frac{y'}{y}=-\sin(x)\ln(6x)+\frac{\cos(x)+1}{x}
\mathbf{y'=(6x)^{\cos(x)+1}(-\sin(x)\ln(6x)+\frac{\cos(x)+1}{x})}
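Logarithmic-differentiation results are easy to get wrong, so a numerical spot check is worthwhile; here problem 66 is verified at x = 1 (an illustrative sketch):

```python
import math

# Problem 66: d/dx (2x)^(2x) = (2x)^(2x) * (2 ln(2x) + 2); compare at x = 1.
f = lambda x: (2*x)**(2*x)
h = 1e-6
numeric = (f(1 + h) - f(1 - h)) / (2*h)
exact = f(1) * (2*math.log(2) + 2)
```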

Equation of Tangent Line

For each function, f, (a) determine for what values of x the tangent line to f is horizontal and (b) find an equation of the tangent line to f at the given point.

69.  f(x) = \frac{x^3}{3} + x^2 + 5, \;\;\; (3,23)
f'(x)=x^{2}+2x

a) x^{2}+2x=0\implies \mathbf{x=0,-2}
b) m=3^{2}+2(3)=9+6=15

23=15(3)+b\implies b=-22
\mathbf{y=15x-22}
70.  f(x) = x^3 - 3x + 1, \;\;\;  (1,-1)
f'(x)=3x^{2}-3

a) 3x^{2}-3=0\implies \mathbf{x=\pm1}
b) m=3(1)^{2}-3=0

-1=b
\mathbf{y=-1}
71.  f(x) = \frac{2}{3} x^3 + x^2 - 12x + 6, \;\;\; (0,6)
f'(x)=2x^{2}+2x-12

a) 2x^{2}+2x-12=0\implies x^{2}+x-6=(x+3)(x-2)=0\implies \mathbf{x=2,-3}
b) m=-12

6=b
\mathbf{y=-12x+6}
72.  f(x) = 2x + \frac{1}{\sqrt{x}}, \;\;\; (1,3)
f'(x)=2-\frac{1}{2x^{3/2}}

a) 2-\frac{1}{2x^{3/2}}=0\implies2x^{3/2}=\frac{1}{2}\implies x^{3/2}=\frac{1}{4}=2^{-2}\implies \mathbf{x=2^{-4/3}}
b) m=2-\frac{1}{2(1)^{3/2}}=2-\frac{1}{2}=\frac{3}{2}

3=\frac{3}{2}(1)+b\implies b=\frac{3}{2}
\mathbf{y=\frac{3}{2}x+\frac{3}{2}}
73.  f(x) = (x^2+1)(2-x), \;\;\; (2,0)
f'(x)=(2x)(2-x)-(x^{2}+1)=4x-2x^{2}-x^{2}-1=-3x^{2}+4x-1

a) -3x^{2}+4x-1=(3x-1)(-x+1)=0\implies \mathbf{x=1,\frac{1}{3}}
b) m=-3(2)^{2}+4(2)-1=-3(4)+8-1=-12+7=-5

0=-5(2)+b\implies b=10
\mathbf{y=-5x+10}
74.  f(x) = \frac{2}{3}x^3+\frac{5}{2}x^2 +2x+1, \;\;\; (3,\frac{95}{2})
f'(x)=2x^{2}+5x+2

a) 2x^{2}+5x+2=(2x+1)(x+2)=0\implies \mathbf{x=-\frac{1}{2},-2}
b) m=2(3)^{2}+5(3)+2=18+15+2=35

\frac{95}{2}=35(3)+b\implies b=-\frac{115}{2}
\mathbf{y=35x-\frac{115}{2}}
75. Find an equation of the tangent line to the graph defined by (x-y-1)^3 = x \, at the point (1,-1).

3(x-y-1)^{2}(1-y')=1
1-y'=\frac{1}{3(x-y-1)^{2}}
y'=1-\frac{1}{3(x-y-1)^{2}}
m=1-\frac{1}{3(1-(-1)-1)^{2}}=1-\frac{1}{3(1)^{2}}=\frac{2}{3}
-1=\frac{2}{3}(1)+b\implies b=-\frac{5}{3}
\mathbf{y=\frac{2}{3}x-\frac{5}{3}}

76. Find an equation of the tangent line to the graph defined by  e^{xy} + x^2 = y^2 \, at the point (1,0).

e^{xy}(y+xy')+2x=2yy'
y'(xe^{xy}-2y)=-2x-ye^{xy}
y'=\frac{2x+ye^{xy}}{2y-xe^{xy}}
m=\frac{2(1)+0\cdot e^{0}}{2(0)-1\cdot e^{0}}=\frac{2}{-1}=-2
0=-2(1)+b\implies b=2
\mathbf{y=-2x+2}
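For problem 75 one can at least verify that the given point lies both on the curve and on the tangent line found above (a minimal sketch):

```python
# Problem 75: (1, -1) should satisfy (x - y - 1)^3 = x and y = (2/3)x - 5/3.
x0, y0 = 1.0, -1.0
on_curve = ((x0 - y0 - 1)**3 == x0)
line_residual = y0 - ((2/3)*x0 - 5/3)   # should be (numerically) zero
```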

Higher Order Derivatives

77. What is the second derivative of 3x^4+3x^2+2x?

\frac{d}{dx}(3x^4+3x^2+2x)=12x^3+6x+2
\frac{d^2}{dx^2}(3x^4+3x^2+2x)=\frac{d}{dx}(12x^3+6x+2)=\mathbf{36x^2+6}
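A central second difference gives a quick numerical check of this second derivative at a sample point (an illustrative sketch; x = 2 is arbitrary):

```python
# Problem 77: second difference of g(x) = 3x^4 + 3x^2 + 2x vs. 36x^2 + 6 at x = 2.
def g(x):
    return 3*x**4 + 3*x**2 + 2*x

h = 1e-3
numeric = (g(2 + h) - 2*g(2) + g(2 - h)) / h**2
exact = 36 * 2**2 + 6   # = 150
```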

78. Use induction to prove that the (n+1)th derivative of a n-th order polynomial is 0.

base case: Consider the zeroth-order polynomial, c. \frac{dc}{dx}=0
induction step: Suppose that the n-th derivative of an (n-1)th order polynomial is 0. Consider the n-th order polynomial, f(x). We can write f(x)=cx^n+P(x), where P(x) is an (n-1)th order polynomial.
\frac{d^{n+1}}{dx^{n+1}}f(x)=\frac{d^{n+1}}{dx^{n+1}}(cx^n+P(x))=\frac{d^{n+1}}{dx^{n+1}}(cx^n)+\frac{d^{n+1}}{dx^{n+1}}P(x)=\frac{d^{n}}{dx^{n}}(cnx^{n-1})+\frac{d}{dx}\frac{d^{n}}{dx^{n}}P(x)=0+\frac{d}{dx}0=0

L'Hôpital's Rule Solutions

L'Hôpital's rule Solutions

1. \lim_{x \to 0}\frac{x+\tan x}{\sin x}

\lim_{x\to 0}\frac{1+\sec^2 x}{\cos x} = \mathbf{2}

2. \lim_{x \to \pi}\frac{x-\pi}{\sin x}

\lim_{x\to \pi}\frac{1}{\cos x} = \mathbf{-1}

3. \lim_{x \to 0}\frac{\sin 3x}{\sin 4x}

\lim_{x\to 0}\frac{3\cos(3x)}{4\cos(4x)} = \mathbf{\frac{3}{4}}

4. \lim_{x \to \infty}\frac{x^5}{e^{5x}}

\begin{align}\lim_{x\to \infty}\frac{5x^4}{5e^{5x}}
&=\lim_{x\to \infty}\frac{5\cdot 4x^3}{5^2e^{5x}}\\
&=\lim_{x\to \infty}\frac{5\cdot 4\cdot 3x^2}{5^3e^{5x}}\\
&=\lim_{x\to \infty}\frac{5\cdot 4\cdot 3\cdot 2x}{5^4e^{5x}}\\
&=\lim_{x\to \infty}\frac{5\cdot 4\cdot 3\cdot 2\cdot 1}{5^5e^{5x}}\\
&=\mathbf{0}\end{align}

5. \lim_{x \to 0}\frac{\tan x - x}{\sin x - x}

\lim_{x\to 0}\frac{\sec^2 x-1}{\cos x-1}=\lim_{x\to 0}\frac{2\sec^2 x\tan x}{-\sin x}=\lim_{x\to 0}\frac{-2}{\cos^3 x}=\mathbf{-2}
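A quick numerical evaluation near 0 supports the answer to problem 5 (an illustrative sketch, not a proof):

```python
import math

# Problem 5: (tan x - x)/(sin x - x) should approach -2 as x -> 0.
x = 1e-3
approx = (math.tan(x) - x) / (math.sin(x) - x)
```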

Related Rates Solutions

1. A spherical balloon is inflated at a rate of 100 ft^3/min. Assuming the rate of inflation remains constant, how fast is the radius of the balloon increasing at the instant the radius is 4 ft?

Known:
V=\frac{4}{3}\pi r^{3}
\dot{V}=100
r=4
Take the time derivative:
\dot{V}=4\pi r^{2}\dot{r}
Solve for \dot{r}:
\dot{r}=\frac{\dot{V}}{4\pi r^{2}}
Plug in known values:
\dot{r}=\frac{100}{4\pi4^{2}}=\mathbf{\frac{25}{16\pi} \frac{ft}{min}}

2. Water is pumped from a cone shaped reservoir (the vertex is pointed down) 10 ft in diameter and 10 ft deep at a constant rate of 3 ft^3/min. How fast is the water level falling when the depth of the water is 6 ft?

Known:
h=2r
V=\frac{1}{3}\pi r^{2}h=\frac{1}{3}\pi(\frac{h}{2})^{2}h=\frac{1}{12}\pi h^{3}
\dot{V}=3
h=6
Take the time derivative:
\dot{V}=\frac{1}{4}\pi h^{2}\dot{h}
Solve for \dot{h}:
\dot{h}=\frac{4\dot{V}}{\pi h^{2}}
Plug in known values:
\dot{h}=\frac{(4)(3)}{\pi6^{2}}=\mathbf{\frac{1}{3\pi} \frac{ft}{min}}

3. A boat is pulled into a dock via a rope with one end attached to the bow of a boat and the other wound around a winch that is 2ft in diameter. If the winch turns at a constant rate of 2rpm, how fast is the boat moving toward the dock?

Let R be the number of revolutions made and s be the distance the boat has moved toward the dock.
Known:
\frac{R}{s}=\frac{1}{2\pi r} (each revolution adds one circumference of distance to s)
\dot{R}=2
r=1
Solve for s:
s=2\pi rR
Take the time derivative:
\dot{s}=2\pi r\dot{R}
Plug in known values:
\dot{s}=2\pi(1)(2)=\mathbf{4\pi\frac{ft}{min}}

4. At time t=0 a pump begins filling a cylindrical reservoir with radius 1 meter at a rate of e^{-t} cubic meters per second. At what time is the liquid height increasing at 0.001 meters per second?

Known:
V=\pi r^{2}h
\dot{V}=e^{-t}
r=1
\dot{h}=0.001
Take the time derivative:
\dot{V}=\pi r^{2}\dot{h}
Plug in the known values:
e^{-t}=0.001\pi
Solve for t:
\mathbf{t=-\ln(.001\pi)}
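The answer to problem 4 can be verified by substituting t back into the rate equation (a minimal sketch):

```python
import math

# Problem 4: at t = -ln(0.001*pi), the inflow e^(-t) divided by the
# reservoir cross-section pi*r^2 (with r = 1) gives the height rate 0.001.
t = -math.log(0.001 * math.pi)
h_dot = math.exp(-t) / (math.pi * 1**2)
```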

Applications of Derivatives Cumulative Exercise Set Solutions

Relative Extrema

Find the relative maximum(s) and minimum(s), if any, of the following functions.

1.  f(x) = \frac{x}{x+1} \,

f'(x)=\frac{1}{(x+1)^{2}}
There are no roots of the derivative. The derivative fails to exist when x=-1 , but the function also fails to exist at that point, so it is not an extremum. Thus, the function has no relative extrema.

2.  f(x) = (x-1)^{2/3} \,

f'(x)=\frac{2}{3}(x-1)^{-1/3}=\frac{2}{3\sqrt[3]{x-1}}
There are no roots of the derivative. The derivative fails to exist at x=1 . f(1)=0. The point (1,0) is a minimum since f(x) is nonnegative because of the even numerator in the exponent. The function has no relative maximum.

3.  f(x) = x^2 + \frac{2}{x} \,

f'(x)=2x-\frac{2}{x^{2}}
2x-\frac{2}{x^{2}}=0\implies 2x^{3}-2=0\implies x=1
f^{\prime\prime}(x)=2+\frac{4}{x^{3}}
f^{\prime\prime}(1)=6
Since the second derivative is positive, x=1 corresponds to a relative minimum.
The derivative fails to exist when x=0 , but so does the function. There is no relative maximum.

4.  f(s) = \frac{s}{1+s^2} \,

f'(s)=\frac{(1+s^{2})-2s^{2}}{(1+s^{2})^{2}}=\frac{1-s^{2}}{(1+s^{2})^{2}}
\frac{1-s^{2}}{(1+s^{2})^{2}}=0\implies s=\pm1
f^{\prime\prime}(s)=\frac{-2s(1+s^{2})^{2}-2(1+s^{2})(2s)(1-s^{2})}{(1+s^{2})^{4}}
f^{\prime\prime}(-1)=\frac{-2(-1)(1+(-1)^{2})^{2}-2(1+(-1)^{2})(2(-1))(1-(-1)^{2})}{(1+(-1)^{2})^{4}}=\frac{8}{16}=\frac{1}{2}
Since the second derivative of f(s) at s=-1 is positive, s=-1 corresponds to a relative minimum.
f^{\prime\prime}(1)=\frac{-2(1+1^{2})^{2}-2(1+1^{2})(2)(1-1^{2})}{(1+1^{2})^{4}}=\frac{-8}{16}=-\frac{1}{2}
Since the second derivative of f(s) at s=1 is negative, s=1 corresponds to a relative maximum.
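A numerical spot check supports this (a sketch; the grid and step size are arbitrary). Sampling f on a dense grid, the largest and smallest values found are f(1)=\frac{1}{2} and f(-1)=-\frac{1}{2}, consistent with the two relative extrema (which are also global here, since f(s)\to0 as s\to\pm\infty):

```python
def f(s):
    return s / (1 + s**2)

grid = [i / 1000 for i in range(-10000, 10001)]  # s from -10 to 10
values = [f(s) for s in grid]
print(max(values), min(values))  # 0.5 at s = 1, -0.5 at s = -1
```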

5.  f(x) =  x^2 - 4x + 9 \,

f'(x)=2x-4
2x-4=0\implies x=2
f^{\prime\prime}(x)=2
Since the second derivative is positive, x=2 corresponds to a relative minimum. There is no relative maximum.

6.  f(x) = \frac{x^2 + x +1}{x^2 -x +1} \,

\begin{align}f'(x)&=\frac{(x^{2}-x+1)(2x+1)-(2x-1)(x^{2}+x+1)}{(x^{2}-x+1)^{2}}\\
&=\frac{2x^{3}-2x^{2}+2x+x^{2}-x+1-2x^{3}-2x^{2}-2x+x^{2}+x+1}{(x^{2}-x+1)^{2}}\\
&=\frac{-2x^{2}+2}{(x^{2}-x+1)^{2}}\end{align}
\frac{-2x^{2}+2}{(x^{2}-x+1)^{2}}=0\implies x=\pm1
f^{\prime\prime}(x)=\frac{(-4x)(x^{2}-x+1)^{2}-(-2x^{2}+2)\cdot2(x^{2}-x+1)(2x-1)}{(x^{2}-x+1)^{4}}
Since -2x^{2}+2=0 when x=\pm1 , the second term of the numerator vanishes at both critical points.
f^{\prime\prime}(-1)=\frac{(-4(-1))((-1)^{2}-(-1)+1)^{2}}{((-1)^{2}-(-1)+1)^{4}}=\frac{36}{81}=\frac{4}{9}
Since f^{\prime\prime}(-1) is positive, x=-1 corresponds to a relative minimum.
f^{\prime\prime}(1)=\frac{-4}{1}=-4
Since f^{\prime\prime}(1) is negative, x=1 corresponds to a relative maximum.

Range of Function

7. Show that the expression x+ 1/x cannot take on any value strictly between -2 and 2.

f(x)=x+\frac{1}{x}
f'(x)=1-\frac{1}{x^{2}}
1-\frac{1}{x^{2}}=0\implies x=\pm1
f^{\prime\prime}(x)=\frac{2}{x^{3}}
f^{\prime\prime}(-1)=-2
Since f^{\prime\prime}(-1) is negative, x=-1 corresponds to a relative maximum.
f(-1)=-2
\lim\limits _{x\to-\infty}f(x)=-\infty
For x<-1, f'(x) is positive, which means that the function is increasing. Coming from very negative x-values, f increases from a very negative value to reach a relative maximum of -2 at x=-1.
For -1<x<1, f'(x) is negative, which means that the function is decreasing.
\lim_{x\to0^{-}}f(x)=-\infty
\lim_{x\to0^{+}}f(x)=+\infty
f^{\prime\prime}(1)=2
Since f^{\prime\prime}(1) is positive, x=1 corresponds to a relative minimum.
f(1)=2
On [-1,0) the function decreases from -2 toward -\infty. On (0,1] the function decreases from +\infty to a relative minimum of 2 at x=1.
For x>1, f'(x) is positive, so the function increases from a minimum of 2.
The above analysis shows that there is a gap in the function's range between -2 and 2.
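The claimed gap can also be probed numerically (an illustrative sketch): sample x+\frac{1}{x} densely on both sides of zero and confirm that no sampled value falls strictly between -2 and 2:

```python
def f(x):
    return x + 1 / x

xs = [i / 1000 for i in range(-10000, 10001) if i != 0]
assert all(not (-2 < f(x) < 2) for x in xs)
print("no sampled value lies strictly between -2 and 2")
```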

Absolute Extrema

Determine the absolute maximum and minimum of the following functions on the given domain

8.  f(x) = \frac{1}{3}x^3 - \frac{1}{2}x^2 + 1 on [0,3]

f is differentiable on [0,3], so the extreme value theorem guarantees the existence of an absolute maximum and minimum on [0,3] . Find and check the critical points:
f'(x)=x^{2}-x
x^{2}-x=0\implies x=0,1
f(0)=1
f(1)=\frac{5}{6}
Check the endpoint:
f(3)=\frac{11}{2}
Maximum at (3,\frac{11}{2}); minimum at (1,\frac{5}{6})

9.  f(x) = (\frac{4}{3}x^2 -1)x on [-\frac{1}{2},2]

f'(x)=(\frac{4}{3}x^{2}-1)+x(\frac{8}{3}x)=4x^{2}-1
4x^{2}-1=0\implies x=\pm\frac{1}{2}
f(-\frac{1}{2})=(\frac{1}{3}-1)(-\frac{1}{2})=\frac{1}{3}
f(\frac{1}{2})=(\frac{1}{3}-1)(\frac{1}{2})=-\frac{1}{3}
f(2)=(\frac{16}{3}-1)(2)=\frac{26}{3}
Maximum at (2,\frac{26}{3}); minimum at (\frac{1}{2},-\frac{1}{3})

Determine Intervals of Change

Find the intervals where the following functions are increasing or decreasing

10. f(x)=10-6x-2x^2

f'(x)=-6-4x
-6-4x=0\implies x=-\frac{3}{2}
f'(x) is the equation of a line with negative slope, so f'(x) is positive for x<-\frac{3}{2} and negative for x>-\frac{3}{2}.
This means that the function is increasing on (-\infty,-\frac{3}{2}) and decreasing on (-\frac{3}{2},+\infty).

11. f(x)=2x^3-12x^2+18x+15

f'(x)=6x^{2}-24x+18
\begin{align}6x^{2}-24x+18=0&\implies x^{2}-4x+3=0\\
&\implies(x-1)(x-3)=0\\
&\implies x=1,3\end{align}
f'(x) is the equation of a bowl-shaped parabola that crosses the x-axis at 1 and 3, so f'(x) is negative for 1<x<3 and positive elsewhere.
This means that the function is decreasing on (1,3) and increasing elsewhere.

12. f(x)=5+36x+3x^2-2x^3

f'(x)=36+6x-6x^{2}
\begin{align}36+6x-6x^{2}=0&\implies6+x-x^{2}=0\\
&\implies(x+2)(-x+3)=0\\
&\implies x=-2,3\end{align}
f'(x) is the equation of a hill-shaped parabola that crosses the x-axis at -2 and 3, so f'(x) is positive for -2<x<3 and negative elsewhere.
This means that the function is increasing on (-2,3) and decreasing elsewhere.

13. f(x)=8+36x+3x^2-2x^3

If you did the previous exercise then no calculation is required since this function has the same derivative as that function and thus is increasing and decreasing on the same intervals; i.e., the function is increasing on (-2,3) and decreasing elsewhere.

14. f(x)=5x^3-15x^2-120x+3

f'(x)=15x^{2}-30x-120
\begin{align}15x^{2}-30x-120=0&\implies x^{2}-2x-8=0\\
&\implies(x+2)(x-4)=0\\
&\implies x=-2,4\end{align}
f' is negative on (-2,4) and positive elsewhere.
So f is decreasing on (-2,4) and increasing elsewhere.

15. f(x)=x^3-6x^2-36x+2

f'(x)=3x^{2}-12x-36
\begin{align}3x^{2}-12x-36=0&\implies x^{2}-4x-12=0\\
&\implies(x+2)(x-6)=0\\
&\implies x=-2,6\end{align}
f is decreasing on (-2,6) and increasing elsewhere.
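The sign analysis used in problems 10–15 can be spot-checked by evaluating the derivative at test points (a minimal sketch for problem 15):

```python
def f_prime(x):
    return 3 * x**2 - 12 * x - 36

# One test point inside (-2, 6) and one on each side of it
assert f_prime(0) < 0                      # decreasing inside (-2, 6)
assert f_prime(-3) > 0 and f_prime(7) > 0  # increasing outside
print("sign pattern of f' confirmed")
```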

Determine Intervals of Concavity

Find the intervals where the following functions are concave up or concave down

16. f(x)=10-6x-2x^2

f'(x)=-6-4x
f''(x)=-4
The function is concave down everywhere.

17. f(x)=2x^3-12x^2+18x+15

f'(x)=6x^{2}-24x+18
f^{\prime\prime}(x)=12x-24
12x-24=0\implies x=2
When x<2, f^{\prime\prime}(x) is negative, and when x>2, f^{\prime\prime}(x) is positive.
This means that the function is concave down on (-\infty,2) and concave up on (2,+\infty).

18. f(x)=5+36x+3x^2-2x^3

f'(x)=36+6x-6x^{2}
f^{\prime\prime}(x)=6-12x
6-12x=0\implies x=\frac{1}{2}
f^{\prime\prime}(x) is positive when x<\frac{1}{2} and negative when x>\frac{1}{2}.
This means that the function is concave up on (-\infty,\frac{1}{2}) and concave down on (\frac{1}{2},+\infty).

19. f(x)=8+36x+3x^2-2x^3

If you did the previous exercise then no calculation is required since this function has the same second derivative as that function and thus is concave up and concave down on the same intervals; i.e., the function is concave up on (-\infty,\frac{1}{2}) and concave down on (\frac{1}{2},+\infty).

20. f(x)=5x^3-15x^2-120x+3

f'(x)=15x^{2}-30x-120
f^{\prime\prime}(x)=30x-30
30x-30=0\implies x=1
f^{\prime\prime}(x) is positive when x>1 and negative when x<1.
This means that the function is concave down on (-\infty,1) and concave up on (1,+\infty).

21. f(x)=x^3-6x^2-36x+2

f'(x)=3x^{2}-12x-36
f^{\prime\prime}(x)=6x-12
6x-12=0\implies x=2
f^{\prime\prime}(x) is positive when x>2 and negative when x<2.
This means that the function is concave down on (-\infty,2) and concave up on (2,+\infty).

Word Problems

22. You peer around a corner. A velociraptor 64 meters away spots you. You run away at a speed of 6 meters per second. The raptor chases, running towards the corner you just left at a speed of 4t meters per second (time t measured in seconds after spotting). After you have run 4 seconds the raptor is 32 meters from the corner. At this time, how fast is death approaching your soon to be mangled flesh? That is, what is the rate of change in the distance between you and the raptor?

Velocity is the rate of change of position with respect to time. The rate at which the raptor closes the distance between you is its speed minus yours:
\frac{dx}{dt}=v(t)=4t-6
After 4 seconds, the rate at which the distance between you is shrinking is
v(4)=16-6=\mathbf{10\frac{m}{s}}

23. Two bicycles leave an intersection at the same time. One heads north going 12 mph and the other heads east going 5 mph. How fast are the bikes getting away from each other after one hour?

Set up a coordinate system with the origin at the intersection and the +y-axis pointing north. We assume that the position of the bike heading north is a function of the position of the bike heading east.
y=\frac{12}{5}x
The distance between the bikes is given by
s=\sqrt{x^{2}+y^{2}}=\sqrt{x^{2}+(\frac{12}{5}x)^{2}}=\sqrt{\frac{144+25}{25}x^{2}}=\sqrt{\frac{169}{25}x^{2}}=\frac{13}{5}x
Let t represent the elapsed time in hours. We want \frac{ds}{dt} when t=1. Apply the chain rule to s:
\frac{ds}{dt}=\frac{ds}{dx}\frac{dx}{dt}=\frac{13}{5}\cdot5=13
Thus, the bikes are moving away from one another at 13 mph.
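The same answer falls out of a direct numerical estimate (a sketch; positions are parametrized by time in hours):

```python
import math

def dist(t):
    # East bike at (5t, 0), north bike at (0, 12t); separation by Pythagoras
    return math.hypot(5 * t, 12 * t)

h = 1e-6
rate = (dist(1 + h) - dist(1 - h)) / (2 * h)
print(rate)  # ~ 13 mph
```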

24. You're making a can of volume 200 m^3 with a gold side and silver top/bottom. Say gold costs 10 dollars per m^2 and silver costs 1 dollar per m^2. What's the minimum cost of such a can?

The volume of the can as a function of the radius, r, and the height, h, is
V=\pi r^{2}h
We are constrained to have a can with a volume of 200m^{3}, so we use this fact to relate the radius and the height:
\pi r^{2}h=200\implies h=\frac{200}{\pi r^{2}}
The surface area of the side is
A_{side}=2\pi rh=2\pi r\frac{200}{\pi r^{2}}=\frac{400}{r}
and the cost of the side is
C_{side}=\frac{4000}{r}
The surface area of the top and bottom (which is also the cost) is
A_{tb}=C_{tb}=2\pi r^{2}
The total cost is given by
C=\frac{4000}{r}+2\pi r^{2}
We want to minimize C, so take the derivative:
C'=-\frac{4000}{r^{2}}+4\pi r
Find the critical points:
-\frac{4000}{r^{2}}+4\pi r=0\implies4\pi r=\frac{4000}{r^{2}}\implies4\pi r^{3}=4000\implies r=\sqrt[3]{\frac{1000}{\pi}}=\frac{10}{\sqrt[3]{\pi}}
Check the second derivative to see if this point corresponds to a maximum or minimum:
C^{\prime\prime}=\frac{8000}{r^{3}}+4\pi
C^{\prime\prime}(\frac{10}{\sqrt[3]{\pi}})=\frac{8000}{1000/\pi}+4\pi=8\pi+4\pi=12\pi
Since the second derivative is positive, the critical point corresponds to a minimum. Thus, the minimum cost is
C(\frac{10}{\sqrt[3]{\pi}})=\frac{4000}{10/\sqrt[3]{\pi}}+2\pi(\frac{10}{\sqrt[3]{\pi}})^{2}=400\sqrt[3]{\pi}+200\sqrt[3]{\pi}=600\sqrt[3]{\pi}\approx\mathbf{\$878.76}
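A coarse grid search over the cost function (an illustrative sketch, stdlib only) lands on the same radius and minimum cost:

```python
import math

def cost(r):
    # $10/m^2 gold side plus $1/m^2 silver top and bottom
    return 4000 / r + 2 * math.pi * r**2

best_r = min((i / 1000 for i in range(1, 20001)), key=cost)  # scan r in (0, 20]
r_star = 10 / math.pi ** (1 / 3)
print(best_r, cost(r_star))  # ~ 5.636, ~ 878.76
```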

Integration Solutions

Indefinite Integral Solutions

1. Evaluate \int \frac{3x}{2}dx

We need to find a function, F , such that
F'(x)=\frac{3x}{2}
We know that
\frac{d}{dx}x^{2}=2x
So we need to find a constant, a , such that
\frac{d}{dx}ax^{2}=2ax=\frac{3x}{2}
Solving for a, we get
2ax=\frac{3x}{2}\implies a=\frac{3}{4}
So
\int\frac{3x}{2}dx=\mathbf{\frac{3}{4}x^{2}+C}
Check your answer by taking the derivative of the function you've found and checking that it matches the integrand:
\frac{d}{dx}(\frac{3}{4}x^{2}+C)=\frac{3}{4}(2x)=\frac{3x}{2}
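The same check can be automated with a central difference (a minimal sketch):

```python
def F(x):
    return 0.75 * x**2   # the antiderivative found above, with C = 0

h = 1e-6
x = 2.0
approx_deriv = (F(x + h) - F(x - h)) / (2 * h)
print(approx_deriv, 3 * x / 2)  # both ~ 3.0
```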

2. Find the general antiderivative of the function f(x)=2x^4.

We know that
\frac{d}{dx}x^{5}=5x^{4}
We need to find a constant, a, such that
\frac{d}{dx}ax^{5}=5ax^{4}=2x^{4}
Solving for a, we get
5ax^{4}=2x^{4}\implies a=\frac{2}{5}
So the general antiderivative will be
\mathbf{\frac{2}{5}x^{5}+C}
Check your answer by taking the derivative of the antiderivative you've found and checking that you get back the function you started with:
\frac{d}{dx}\int 2x^4 dx=\frac{d}{dx}(\frac{2}{5}x^{5}+C)=\frac{2}{5}(5x^{4})=2x^{4}

3. Evaluate \int(7x^2+3\cos(x)-e^x)dx

\begin{align}\int(7x^{2}+3\cos(x)-e^{x})dx&=7\int x^{2}dx+3\int\cos(x)dx-\int e^{x}dx\\
&=7(\frac{x^{3}}{3})+3\sin(x)-e^{x}+C\\
&=\mathbf{\frac{7}{3}x^3+3\sin(x)-e^x+C}\end{align}

4. Evaluate \int(\frac{2}{5x}+\sin(x))dx

\begin{align}\int(\frac{2}{5x}+\sin(x))dx&=\frac{2}{5}\int\frac{dx}{x}+\int\sin(x)dx\\
&=\mathbf{\frac{2}{5}\ln|x|-\cos(x)+C}\end{align}

5. Evaluate \int x\sin(2x^2)dx by making the substitution u=2x^2

Since u=2x^{2}, du=4xdx and dx=\frac{du}{4x}
\begin{align}\int x\sin(2x^{2})dx&=\int x\sin(u)\frac{du}{4x}\\
&=\frac{1}{4}\int\sin(u)du\\
&=-\frac{\cos(u)}{4}+C\\
&=-\mathbf{\frac{\cos(2x^{2})}{4}+C}\end{align}

6. Evaluate \int-3\cos(x)e^{\sin(x)}dx

Let u=\sin(x), du=\cos(x)dx so that dx=\frac{du}{\cos(x)}
\begin{align}\int-3\cos(x)e^{\sin(x)}dx&=-3\int\cos(x)e^{u}\frac{du}{\cos(x)}\\
&=-3\int e^{u}du\\
&=-3e^{u}+C\\
&=\mathbf{-3e^{\sin(x)}+C}\end{align}

7. Evaluate \int \frac{2x-5}{x^3}dx using integration by parts with u=2x-5 and dv=\frac{dx}{x^3}

du=2dx; v=\int\frac{dx}{x^{3}}=-\frac{1}{2x^{2}}
\begin{align}\int\frac{2x-5}{x^{3}}dx&=\int udv\\
&=uv-\int vdu\\
&=(2x-5)(-\frac{1}{2x^{2}})-\int(-\frac{1}{2x^{2}})2dx\\
&=\frac{5-2x}{2x^{2}}+\int\frac{dx}{x^{2}}\\
&=\frac{5-2x}{2x^{2}}-\frac{1}{x}+C\\
&=\frac{5-2x}{2x^{2}}-\frac{2x}{2x^{2}}+C\\
&=\mathbf{\frac{5-4x}{2x^{2}}+C}\end{align}

8. Evaluate \int(2x-1)e^{-3x+1}dx

Let u=2x-1; dv=e^{-3x+1}dx
Then du=2dx and v=\int e^{-3x+1}dx
To evaluate v, make the substitution w=-3x+1; dw=-3dx; dx=\frac{-dw}{3}. Then
v=\int e^{-3x+1}dx=\int e^{w}(\frac{-1}{3})dw=\frac{-e^{w}}{3}=\frac{-e^{-3x+1}}{3}. So
\begin{align}\int(2x-1)e^{-3x+1}dx&=\int udv\\
&=uv-\int vdu\\
&=(2x-1)\frac{-e^{-3x+1}}{3}-\int\frac{-e^{-3x+1}}{3}(2)dx\\
&=\frac{(1-2x)e^{-3x+1}}{3}+\frac{2}{3}\int e^{-3x+1}dx\\
&=\frac{(1-2x)e^{-3x+1}}{3}+\frac{2}{3}\int\frac{-e^{w}}{3}dw\\
&=\frac{3(1-2x)e^{-3x+1}}{9}-\frac{2}{9}e^{w}+C\\
&=\frac{(3-6x)e^{-3x+1}}{9}-\frac{2}{9}e^{-3x+1}+C\\
&=\mathbf{\frac{(1-6x)e^{-3x+1}}{9}+C}\end{align}
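Differentiating the result numerically and comparing against the integrand confirms the antiderivative (a sketch; the test point is arbitrary):

```python
import math

def F(x):
    return (1 - 6 * x) * math.exp(-3 * x + 1) / 9   # antiderivative found above

def integrand(x):
    return (2 * x - 1) * math.exp(-3 * x + 1)

x, h = 0.4, 1e-6
approx = (F(x + h) - F(x - h)) / (2 * h)
print(approx, integrand(x))  # should agree closely
```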

Definite Integral Solutions

1. Use left- and right-handed Riemann sums with 5 subdivisions to get lower and upper bounds on the area under the function f(x)=x^6 from x=0 to x=1.

\left|\begin{array}{ccccccc}
i & x_{i} & x_{i}^{6} & 0.2\times x_{i-1}^{6} & \sum\limits _{k=1}^{i}0.2\times x_{k-1}^{6} & 0.2\times x_{i}^{6} & \sum\limits _{k=1}^{i}0.2\times x_{k}^{6}\\
0 & 0.0 & 0 & \, & \, & 0 & \,\\
1 & 0.2 & 0.000064 & 0 & 0 & 0.0000128 & 0.0000128\\
2 & 0.4 & 0.004096 & 0.0000128 & 0.0000128 & 0.0008192 & 0.000832\\
3 & 0.6 & 0.046656 & 0.0008192 & 0.000832 & 0.0093312 & 0.0101632\\
4 & 0.8 & 0.262144 & 0.0093312 & 0.0101632 & 0.0524288 & 0.062592\\
5 & 1.0 & 1 & 0.0524288 & 0.062592 & .2 & 0.262592
\end{array}\right|
Lower bound: 0.062592
Upper bound: 0.262592
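The table can be reproduced programmatically (a sketch; function names are illustrative):

```python
def left_right_sums(f, a, b, n):
    # Left- and right-handed Riemann sums with n equal subdivisions
    dx = (b - a) / n
    xs = [a + i * dx for i in range(n + 1)]
    left = sum(f(x) * dx for x in xs[:-1])
    right = sum(f(x) * dx for x in xs[1:])
    return left, right

lo, hi = left_right_sums(lambda x: x**6, 0, 1, 5)
print(lo, hi)  # ~ 0.062592, ~ 0.262592
```

The same helper with a=1, b=2 reproduces the table in exercise 2.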

2. Use left- and right-handed Riemann sums with 5 subdivisions to get lower and upper bounds on the area under the function f(x)=x^6 from x=1 to x=2.

\left|\begin{array}{ccccccc}
i & x_{i} & x_{i}^{6} & 0.2\times x_{i-1}^{6} & \sum\limits _{k=1}^{i}0.2\times x_{k-1}^{6} & 0.2\times x_{i}^{6} & \sum\limits _{k=1}^{i}0.2\times x_{k}^{6}\\
0 & 1.0 & 1 & \, & \, & .2 & \,\\
1 & 1.2 & 2.985984 & .2 & .2 & 0.5971968 & 0.5971968\\
2 & 1.4 & 7.529536 & 0.5971968 & 0.7971968 & 1.5059072 & 2.103104\\
3 & 1.6 & 16.777216 & 1.5059072 & 2.303104 & 3.3554432 & 5.4585472\\
4 & 1.8 & 34.012224 & 3.3554432 & 5.6585472 & 6.8024448 & 12.260992\\
5 & 2.0 & 64 & 6.8024448 & 12.460992 & 12.8 & 25.060992
\end{array}\right|
Lower bound: 12.460992
Upper bound: 25.060992

3. Use the subtraction rule to find the area between the graphs of f(x)=x and g(x)=x^2 between x=0 and x=1

From the earlier examples we know that \int_0^1 x dx=\frac{1}{2} and that \int_a^b x^2 dx=\frac{b^3}{3}-\frac{a^3}{3}. From this we can deduce
\int_0^1(x-x^2)dx=\frac{1}{2}-(\frac{1^3}{3}-\frac{0^3}{3})=\frac{1}{2}-\frac{1}{3}=\mathbf{\frac{1}{6}}
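A midpoint-rule estimate agrees with the exact value \frac{1}{6} (an illustrative numerical check):

```python
n = 100_000
dx = 1 / n
# Midpoint rule for the integral of (x - x^2) over [0, 1]
area = sum(((i + 0.5) * dx - ((i + 0.5) * dx) ** 2) * dx for i in range(n))
print(area)  # ~ 0.166667
```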

4. Use the results of exercises 1 and 2 and the property of linearity with respect to endpoints to determine upper and lower bounds on \int_0^2 x^6 dx.

In exercise 1 we found that

0.062592<\int_0^1 x^6 dx<0.262592

and in exercise 2 we found that

12.460992<\int_1^2 x^6 dx<25.060992

From this we can deduce that

0.062592+12.460992<\int_0^1 x^6 dx+\int_1^2 x^6 dx<0.262592+25.060992
\mathbf{12.523584<\int_0^2 x^6 dx<25.323584}

5. Prove that if f is a continuous even function then for any a,
\int_{-a}^a f(x) dx = 2 \int_0^a f(x)dx.

From the property of linearity of the endpoints we have

\int_{-a}^a f(x) dx = \int_{-a}^0 f(x) dx +\int_{0}^a f(x) dx

Make the substitution u=-x; du=-dx. u=a when x=-a and u=0 when x=0. Then

\int_{-a}^0 f(x)dx=\int_a^0 f(-u)(-du)=-\int_a^0 f(-u)du=\int_0^a f(-u)du=\int_0^a f(u)du

where the last step has used the evenness of f. Since u is just a dummy variable, we can replace it with x. Then

\int_{-a}^a f(x) dx = \int_0^a f(x)dx + \int_{0}^a f(x) dx = 2\int_{0}^a f(x) dx
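The identity can be checked numerically for a particular even function (a sketch using a simple midpoint rule; the choices f=\cos and a=1.5 are arbitrary):

```python
import math

def integrate(f, a, b, n=100_000):
    # Midpoint-rule approximation of the definite integral
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

a = 1.5
full = integrate(math.cos, -a, a)
half = integrate(math.cos, 0, a)
print(full, 2 * half)  # both ~ 2*sin(1.5)
```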

Integration Cumulative Exercise Set Solutions

Integration of Polynomials

Evaluate the following:

1. \int (x^2-2)^{2}\, dx

\begin{align}\int(x^{2}-2)^{2}dx&=\int(x^{4}-4x^{2}+4)dx\\
&=\mathbf{\frac{x^{5}}{5}-\frac{4x^{3}}{3}+4x+C}\end{align}

2. \int 8x^3\, dx

\begin{align}\int8x^{3}dx&=\frac{8x^{4}}{4}+C\\
&=\mathbf{2x^{4}+C}\end{align}

3. \int (4x^2+11x^3)\, dx

\int(4x^{2}+11x^{3})dx=\mathbf{\frac{4x^{3}}{3}+\frac{11x^{4}}{4}+C}

4. \int (31x^{32}+4x^3-9x^4) \,dx

\begin{align}\int(31x^{32}+4x^{3}-9x^{4})dx&=\frac{31x^{33}}{33}+\frac{4x^{4}}{4}-\frac{9x^{5}}{5}+C\\
&=\mathbf{\frac{31x^{33}}{33}+x^{4}-\frac{9x^{5}}{5}+C}\end{align}

5. \int 5x^{-2}\, dx

\begin{align}\int5x^{-2}dx&=\frac{5x^{-1}}{-1}+C\\
&=\mathbf{-\frac{5}{x}+C}\end{align}

Indefinite Integration

Find the general antiderivative of the following:

6. \int (\cos x+\sin x)\, dx

\int (\cos x+\sin x)\, dx=\mathbf{\sin x-\cos x+C}

7. \int 3\sin x\, dx

\int 3\sin x\, dx=\mathbf{-3\cos(x)+C}

8. \int (1+\tan^2 x)\, dx

\begin{align}\int(1+\tan^{2}x)dx&=\int\sec^{2}x dx\\
&=\mathbf{\tan x+C}\end{align}

9. \int (3x-\sec^2 x)\, dx

\int (3x-\sec^2 x)\, dx=\mathbf{\frac{3x^{2}}{2}-\tan x+C}

10. \int -e^x\, dx

\int -e^x\, dx=\mathbf{-e^{x}+C}

11. \int 8e^x\, dx

\int 8e^x\, dx=\mathbf{8e^{x}+C}

12. \int \frac1{7x}\, dx

\int \frac1{7x}\, dx=\mathbf{\frac{1}{7}\ln|x|+C}

13. \int \frac1{x^2+a^2}\, dx

Let

x=a\tan\theta;\qquad dx=a\sec^{2}\theta d\theta

Then

\begin{align}\int\frac{1}{x^{2}+a^{2}}dx&=\int\frac{a\sec^{2}\theta d\theta}{a^{2}(\tan^{2}\theta+1)}\\
&=\int\frac{\sec^{2}\theta d\theta}{a\sec^{2}\theta}\\
&=\frac{1}{a}\int d\theta\\
&=\frac{\theta}{a}+C\\
&=\mathbf{\frac{1}{a}\arctan\frac{x}{a}+C}\end{align}

Integration by parts

14. Consider the integral \int \sin(x) \cos(x)\,dx. Find the integral in two different ways. (a) Integrate by parts with u=\sin(x) and  v' =\cos(x). (b) Integrate by parts with u=\cos(x) and  v' =\sin(x). Compare your answers. Are they the same?

(a)

u=\sin x;\qquad du=\cos x dx
v=\sin x;\qquad dv=\cos x dx
\begin{array}{ccc}
\int\sin x\cos x dx=\sin^{2}x-\int\sin x\cos x dx & \implies & 2\int\sin x\cos x dx=\sin^{2}x\\
 & \implies & \int\sin x\cos x dx=\mathbf{\frac{\sin^{2}x}{2}}
\end{array}

(b)

u=\cos x;\qquad du=-\sin x dx
v=-\cos x;\qquad dv=\sin x dx
\begin{array}{ccc}
\int\sin x\cos x dx=-\cos^{2}x-\int\sin x\cos x dx & \implies & 2\int\sin x\cos x dx=-\cos^{2}x\\
 & \implies & \int\sin x\cos x dx=\mathbf{-\frac{\cos^{2}x}{2}}
\end{array}

Notice that the answers in parts (a) and (b) are not equal. However, since indefinite integrals include a constant term, we expect that the answers we found will differ by a constant. Indeed

\frac{\sin^{2}x}{2}-(-\frac{\cos^{2}x}{2})=\frac{\sin^{2}x+\cos^{2}x}{2}=\frac{1}{2}
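Evaluating both antiderivatives at a few points makes the constant offset concrete (a minimal sketch):

```python
import math

for x in (0.0, 0.7, 2.0):
    a = math.sin(x) ** 2 / 2    # answer from part (a)
    b = -math.cos(x) ** 2 / 2   # answer from part (b)
    assert abs((a - b) - 0.5) < 1e-12
print("the two antiderivatives differ by the constant 1/2")
```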

Sequences and Series Solutions

Calculus/Sequences and Series/Solutions

Multivariable and Differential Calculus Solutions

Calculus/Multivariable and differential calculus/Solutions


References

Table of Trigonometry

Definitions

  • \tan(x)=  \frac{\sin x}{\cos x}
  • \sec(x)=  \frac{1}{\cos x}
  • \cot(x)=  \frac{\cos x}{\sin x}= \frac{1}{\tan x}
  • \csc(x)=  \frac{1}{\sin x}

Pythagorean Identities

  • \sin^2 x + \cos^2 x =1 \
  • 1+\tan^2(x)=  \sec^2 x \
  • 1+\cot^2(x)=  \csc^2 x \

Double Angle Identities

  • \sin(2 x)=  2\sin x \cos x \
  • \cos(2 x)=  \cos^2 x - \sin^2 x \
  • \tan(2x) = \frac{2 \tan (x)} {1 - \tan^2(x)}
  • \cos^2(x) = {1 + \cos(2x) \over 2}
  • \sin^2(x) = {1 - \cos(2x) \over 2}

Angle Sum Identities

\sin \left(x+y\right)=\sin x \cos y + \cos x \sin y
\sin \left(x-y\right)=\sin x \cos y - \cos x \sin y
\cos \left(x+y\right)=\cos x \cos y - \sin x \sin y
\cos \left(x-y\right)=\cos x \cos y + \sin x \sin y
\sin x+\sin y=2\sin \left( \frac{x+y}{2} \right) \cos \left( \frac{x-y}{2} \right)
\sin x-\sin y=2\cos \left( \frac{x+y}{2} \right) \sin \left( \frac{x-y}{2} \right)
\cos x+\cos y=2\cos \left( \frac{x+y}{2} \right) \cos \left( \frac{x-y}{2} \right)
\cos x-\cos y=-2\sin \left( \frac{x+y}{2} \right)\sin \left( \frac{x-y}{2} \right)
\tan x+\tan y=\frac{\sin \left( x+y\right) }{\cos x\cos y}
\tan x-\tan y=\frac{\sin \left( x-y\right) }{\cos x\cos y}
\cot x+\cot y=\frac{\sin \left( x+y\right) }{\sin x\sin y}
\cot x-\cot y=\frac{-\sin \left( x-y\right) }{\sin x\sin y}

Product-to-sum identities

\cos\left (x\right ) \cos\left (y\right ) = {\cos\left (x + y\right ) + \cos\left (x - y\right ) \over 2} \;
\sin\left (x\right ) \sin\left (y\right ) = {\cos\left (x - y\right ) - \cos\left (x + y\right ) \over 2} \;
\sin\left (x\right ) \cos\left (y\right ) = {\sin\left (x + y\right ) + \sin\left (x - y\right ) \over 2} \;
\cos\left (x\right ) \sin\left (y\right ) = {\sin\left (x + y\right ) - \sin\left (x - y\right ) \over 2} \;

See also


Summation notation

Summation notation allows an expression that contains a sum to be expressed in a simple, compact manner. The uppercase Greek letter sigma, Σ, is used to denote the sum of a set of numbers.

Example
\sum_{i=3}^7 i^2 = 3^2 + 4^2 + 5^2 + 6^2+7^2

Let f be a function and N,M are integers with N<M. Then

\sum_{i=N}^M f(i) = f(N) + f(N+1) + f(N+2) + \cdots + f(M).

We say N is the lower limit and M is the upper limit of the sum.

We can replace the letter i with any other variable. For this reason i is referred to as a dummy variable. So


\sum_{i=1}^4 i = \sum_{j=1}^4 j = \sum_{\alpha=1}^4 \alpha = 1 + 2 + 3 + 4

Conventionally we use the letters i, j, k, m for dummy variables.

Example
\sum_{i=1}^5 i = 1 + 2 + 3 + 4 + 5

Here, the dummy variable is i, the lower limit of summation is 1, and the upper limit is 5.

Example

Sometimes, you will see summation signs with no dummy variable specified, e.g.,

\sum_1^4 i^3 =100

In such cases the correct dummy variable should be clear from the context.

You may also see cases where the limits are unspecified. Here too, they must be deduced from the context.

Common summations

\sum_{i=1}^n c = c + c + ... + c = nc , c\in\mathbb{R}

\sum_{i=1}^n i = 1 + 2 + 3 + ... + n = {n(n+1)\over 2}

\sum_{i=1}^n i^2 = 1^2 + 2^2 + 3^2 + ... + n^2 = {n(n+1)(2n+1)\over 6}

\sum_{i=1}^n i^3 = 1^3 + 2^3 + 3^3 + ... + n^3 = {n^2(n+1)^2 \over 4}
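These closed forms are easy to verify for small n (an illustrative check):

```python
n = 50
assert sum(range(1, n + 1)) == n * (n + 1) // 2
assert sum(i**2 for i in range(1, n + 1)) == n * (n + 1) * (2 * n + 1) // 6
assert sum(i**3 for i in range(1, n + 1)) == n**2 * (n + 1)**2 // 4
print("summation formulas verified for n =", n)
```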

Tables of Integrals

Rules

  • \int cf(x)\,dx = c\int f(x)\,dx
  • \int f(x)+g(x)\,dx = \int f(x)\,dx+ \int g(x)\,dx
  • \int f(x)-g(x)\,dx = \int f(x)\,dx- \int g(x)\,dx
  • \int u\,dv\ = uv - \int v\,du

Powers

  • \int dx = x+C
  • \int a\,dx = ax+C
  • \int x^n\,dx = \frac{1}{n+1}x^{n+1}+C\qquad\mbox{ if }n\ne-1
  • \int {1\over x}\,dx = \ln|x|+C
  • \int \frac{1}{ax+b}\,dx = {1 \over a}\ln|ax+b|+C\qquad\mbox{ if }a\ne 0

Trigonometric Functions

Basic Trigonometric Functions

  • \int \sin{x}\,dx = -\cos{x} + C
  • \int \cos{x}\,dx = \sin{x} + C
  • \int \tan{x}\,dx = \ln \left |{\sec{x}} \right | + C
  • \int \sin^2{x}\,dx = \tfrac{1}{2} x - \tfrac{1}{4} \sin{2x} + C
  • \int \cos^2{x}\,dx = \tfrac{1}{2} x + \tfrac{1}{4} \sin{2x} + C
  • \int \tan^2{x}\,dx = \tan(x) - x + C
  • \int\sin^n x\,dx = -\frac{\sin^{n-1} x\cos x}{n} + \frac{n-1}{n}\int\sin^{n-2} x\,dx+C \qquad\mbox{(for }n>0\mbox{)}
  • \int\cos^n x\,dx = -\frac{\cos^{n-1} x\sin x}{n} + \frac{n-1}{n}\int\cos^{n-2} x\,dx+C \qquad\mbox{(for }n>0\mbox{)}\,\!
  • \int\tan^n x\,dx = \frac{1}{(n-1)}\tan^{n-1} x-\int\tan^{n-2} x\,dx+C \qquad\mbox{(for }n\neq 1\mbox{)}

Reciprocal Trigonometric Functions

  • \int \sec{x}\,dx = \ln \left |{\sec{x}}+\tan x \right | + C = \ln \left | \tan{\left( \frac1 2 x +\frac1 4 \pi \right) }\right |+C
  • \int \csc{x}\,dx = - \ln \left |{\csc x + \cot x} \right | + C=\ln \left | \tan \left(\frac1 2 x \right) \right |+C
  • \int \cot{x}\,dx = \ln \left |{\sin{x}} \right | + C


  • \int \sec^2 kx\,dx = \frac1 k \tan{kx} + C
  • \int \csc^2 kx\,dx = -\frac1 k \cot kx + C
  • \int \cot^2 kx\,dx = -x-\frac1 k \cot kx + C
  • \int \sec{x}\tan{x}\,dx = \sec x + C
  • \int \sec x \csc x\,dx =\ln \left | \tan x \right | + C


  • \int \sec^n{x}\,dx = \frac{\sec^{n-1}{x} \sin {x}}{n-1} + \frac{n-2}{n-1}\int \sec^{n-2}{x}\,dx+C \qquad \mbox{ (for }n \ne 1\mbox{)}
  • \int \csc^n{x}\,dx = -\frac{\csc^{n-1}{x} \cos{x}}{n-1} + \frac{n-2}{n-1}\int \csc^{n-2}{x}\,dx+C \qquad \mbox{ (for }n \ne 1\mbox{)}
  • \int\cot^n x\,dx = -\frac{1}{n-1}\cot^{n-1} x - \int\cot^{n-2} x\,dx+C \qquad\mbox{(for }n\neq 1\mbox{)}

Inverse Trigonometric Functions

  • \int {1\over \sqrt{1-x^2}}\,dx = \mbox{arcsin}(x) + C
  • \int {1\over \sqrt{a^2-x^2}}\,dx = \mbox{arcsin}(x/a) + C \qquad\mbox{ if }a> 0
  • \int {1\over 1+x^2}\,dx = \mbox{arctan}(x) + C
  • \int {1\over a^2+x^2}\,dx = {1\over a}\mbox{arctan}(x/a) + C \qquad\mbox{ if }a\ne 0

Exponential and Logarithmic Functions

  • \int e^x \,dx = e^x + C
  • \int e^{ax} \,dx = {1\over a}e^{ax} + C \qquad\mbox{ if }a\neq 0
  • \int a^x \,dx = {1\over \ln a}a^x + C \qquad\mbox{ if }a>0, a\neq 1
  • \int \ln x \,dx = x\ln x-x + C

Inverse Trigonometric Functions

  • \int \mbox{arcsin}(x) \,dx = x\,\mbox{arcsin}(x) + \sqrt{1-x^2} + C
  • \int \mbox{arccos}(x) \,dx = x\,\mbox{arccos}(x) - \sqrt{1-x^2} + C
  • \int \mbox{arctan}(x) \,dx = x\,\mbox{arctan}(x) - {1\over 2}\ln(1+x^2) + C

Further Resources

Tables of Derivatives

General Rules

\frac{d}{dx}(f + g)= \frac{df}{dx} + \frac{dg}{dx}

\frac{d}{dx}(cf)= c\frac{df}{dx}

\frac{d}{dx}(fg)= f\frac{dg}{dx} + g\frac{df}{dx}

\frac{d}{dx}\left(\frac{f}{g}\right) = \frac{g\frac{df}{dx} - f\frac{dg}{dx}}{g^2}

 [f(g(x))]' = f'(g(x)) g'(x)

Powers and Polynomials

  • \frac{d}{dx} (c) = 0
  • \frac{d}{dx}x=1
  • \frac{d}{dx}x^n=nx^{n-1}
  • \frac{d}{dx}\sqrt{x}=\frac{1}{2\sqrt x}
  • \frac{d}{dx}\frac{1}{x}=-\frac{1}{x^2}
  • \frac{d}{dx}(c_n x^n + c_{n-1} x^{n-1} + c_{n-2}x^{n-2} + \cdots +c_2x^2 +  c_1 x + c_0) = n c_n x^{n-1} + (n-1) c_{n-1} x^{n-2} + (n-2) c_{n-2}x^{n-3} + \cdots + 2c_2x+ c_1

Trigonometric Functions

\frac{d}{dx} \sin (x)= \cos (x)

\frac{d}{dx} \cos (x)= -\sin (x)

\frac{d}{dx} \tan (x)= \sec^2 (x)

\frac{d}{dx} \cot (x)= -\csc^2 (x)

\frac{d}{dx} \sec (x)= \sec (x) \tan (x)

\frac{d}{dx} \csc (x) = -\csc (x) \cot (x)

Exponential and Logarithmic Functions

  • \frac{d}{dx} e^x =e^x
  • \frac{d}{dx} a^x =a^x \ln (a)\qquad\mbox{if }a>0
  • \frac{d}{dx} \ln (x)= \frac{1}{x}
  • \frac{d}{dx} \log_a (x)= \frac{1}{x\ln (a)}\qquad\mbox{if }a>0, a\neq 1
  •     (f^g)' = \left(e^{g\ln f}\right)' = f^g\left(f'{g \over f} + g'\ln f\right),\qquad f > 0
  •     (c^f)' = \left(e^{f\ln c}\right)' = f' c^f \ln c

Inverse Trigonometric Functions

  • \frac{d}{dx} \mbox{arcsin}(x) = \frac{1}{\sqrt{1-x^2}}
  • \frac{d}{dx} \mbox{arccos}(x) = -\frac{1}{\sqrt{1-x^2}}
  • \frac{d}{dx} \mbox{arctan}(x) = \frac{1}{1+x^2}
  •     {d \over dx} \arcsec x = { 1 \over |x|\sqrt{x^2 - 1}}
  •     {d \over dx} \arccot x = {-1 \over 1 + x^2}
  •     {d \over dx} \arccsc x = {-1 \over |x|\sqrt{x^2 - 1}}

Hyperbolic and Inverse Hyperbolic Functions

{d \over dx} \sinh x = \cosh x
{d \over dx} \cosh x = \sinh x
{d \over dx} \tanh x = \mbox{sech}^2\,x
{d \over dx} \,\mbox{sech}\,x = -\tanh x\,\mbox{sech}\,x
{d \over dx} \,\mbox{coth}\,x = -\,\mbox{csch}^2\,x
{d \over dx} \,\mbox{csch}\,x = -\,\mbox{coth}\,x\,\mbox{csch}\,x
{d \over dx} \sinh^{-1} x = { 1 \over \sqrt{x^2 + 1}}
{d \over dx} \cosh^{-1} x = { 1 \over \sqrt{x^2 - 1}}
{d \over dx} \tanh^{-1} x = { 1 \over 1 - x^2}
{d \over dx} \mbox{sech}^{-1}\,x = {-1 \over x\sqrt{1 - x^2}}
{d \over dx} \mbox{coth}^{-1}\,x = { 1 \over 1 - x^2}
{d \over dx} \mbox{csch}^{-1}\,x = {-1 \over |x|\sqrt{1 + x^2}}

Acknowledgements and Further Reading

Acknowledgements

Portions of this book have been copied from relevant Wikipedia articles.

Contributors

In alphabetical order (by surname or display name):

Further Reading


The following books list Calculus as a prerequisite:


Other Calculus Textbooks

Other calculus textbooks available online:

Other printed calculus textbooks:

  • Apostol, Tom M. Calculus. ; This two-volume set provides a rigorous introduction to calculus.
Using infinitesimals

Interactive Websites

Other Resources

References

<references />