1 Introduction to Matrices and Linear Systems
1.1 Linear Geometry: Vectors in Space and Angle & Length
Vectors in Space
Definition: A vector is a magnitude with a direction.
\[\vec{v}=\text{head}-\text{tail}= \begin{bmatrix} b_1-a_1 \\ b_2-a_2 \end{bmatrix}\]
Vectors or Scalars?
- 60 mph (scalar)
- 60 mph due East (vector)
- 100 lbs (vector)
- 5 kg (scalar)
NOTE: Vectors can exist independently of a coordinate system
Connection to Free Variables
Example: \[ S=\{\begin{bmatrix} 1\\2 \end{bmatrix}t|t \in\mathbb{R}\}\]
\[\begin{bmatrix} 1\\2 \end{bmatrix}\] can be scaled by any length. We say S generates or spans a line in \(\mathbb{R}^2.\)
Example: \[S=\{\begin{bmatrix} 1\\2\\1 \end{bmatrix} + \begin{bmatrix} -1\\1\\1 \end{bmatrix}t |t \in\mathbb{R}\}\]
Precalculus/Calculus Way: \(2x+y+z=4 \rightarrow{x=2-\frac{1}{2}y-\frac{1}{2}z}\) \(\Rightarrow (0,0,4), (0,4,0), (2,0,0)\)
Linear Algebra Way: \[S=\{\begin{bmatrix} 2\\0\\0 \end{bmatrix} + \begin{bmatrix} -1/2\\1\\0 \end{bmatrix}s + \begin{bmatrix} -1/2\\0\\1 \end{bmatrix}t \;|\; s, t \in\mathbb{R}\}\] where \(s\) and \(t\) play the role of the free variables \(y\) and \(z\).
For comparison, \(\begin{bmatrix} 1\\0\\0 \end{bmatrix}\) and \(\begin{bmatrix} 0\\1\\0 \end{bmatrix}\) span the xy-plane.
Vector Length
Definition: The length (magnitude) of a vector \(\vec{v}\in\mathbb{R}^n\) is the square root of the sum of the squares of components. Let \(\vec{v}=\langle v_1,..., v_n\rangle\), and its magnitude will be \(\lVert\vec{v}\rVert=\sqrt{v_1^2+...+v_n^2}\).
Angle between Vectors
Definition: The angle formed by two vectors is defined as the angle formed when they are joined tail-to-tail (subtended), and is denoted as \(\theta=\cos^{-1}(\frac{\vec{u}\cdot\vec{v}}{\lVert\vec{u}\rVert\lVert\vec{v}\rVert})\).
Proof (sketch): Apply the Law of Cosines to the triangle whose sides are \(\vec{u}\), \(\vec{v}\), and \(\vec{u}-\vec{v}\): \(\lVert\vec{u}-\vec{v}\rVert^2=\lVert\vec{u}\rVert^2+\lVert\vec{v}\rVert^2-2\lVert\vec{u}\rVert\lVert\vec{v}\rVert\cos\theta\). Expanding the left side as \((\vec{u}-\vec{v})\cdot(\vec{u}-\vec{v})=\lVert\vec{u}\rVert^2-2\,\vec{u}\cdot\vec{v}+\lVert\vec{v}\rVert^2\) and cancelling gives \(\vec{u}\cdot\vec{v}=\lVert\vec{u}\rVert\lVert\vec{v}\rVert\cos\theta\).
1.2 Dot Product
Definition: The dot product of \(\vec{u},\vec{v}\in\mathbb{R}^n\) is \(\vec{u}\cdot\vec{v}=u_1v_1+u_2v_2+\dots+u_nv_n\)
Note: if \(\vec{u}\cdot\vec{v}=0\) (and \(\vec{u},\vec{v}\neq\vec{0}\)), then \(\cos(\theta)=0\), which implies \(\theta=\frac{\pi}{2}\) (such vectors are orthogonal).
Properties of Dot Product
\(\vec{u}\cdot\vec{v}=\vec{v}\cdot\vec{u}\;\;\;\;\) (dot products are commutative)
\(c(\vec{u}\cdot\vec{v})=(c\vec{u})\cdot\vec{v}=\vec{u}\cdot(c\vec{v})\;\;\;\;\) (scalars pull out of dot products)
\(\vec{u}\cdot(\vec{v}+\vec{w})=\vec{u}\cdot\vec{v}+\vec{u}\cdot\vec{w}\;\;\;\;\) (dot products distribute over addition)
if \(\vec{u},\vec{v}\neq\vec{0}\) and \(\vec{u}\cdot\vec{v}=0\) then \(\vec{u}\) is orthogonal to \(\vec{v}\)
\(\vec{u}\cdot\vec{u}=\lVert\vec{u}\rVert^2\;\;\;\;\)
Examples: Let \[\vec{u}=\begin{bmatrix} 3\\1 \end{bmatrix}\], \[\vec{v}=\begin{bmatrix} -1\\3 \end{bmatrix}\], \[\vec{w}=\begin{bmatrix} 2\\5 \end{bmatrix}\]
\(\vec{u}\cdot\vec{w}=3(2)+1(5)=11=2(3)+5(1)=\vec{w}\cdot\vec{u}\)
\(2(\vec{u}\cdot\vec{w})=2(11)=22=6(2)+2(5)=(2\vec{u})\cdot\vec{w}=3(4)+1(10)=\vec{u}\cdot(2\vec{w})\)
\(\vec{u}\cdot\vec{v}=3(-1)+1(3)=0\;\;\;\;\therefore\theta=\frac{\pi}{2}\)
\(\vec{u}\cdot\vec{u}=3(3)+1(1)=3^2+1^2=\lVert\vec{u}\rVert^2\)
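A quick numerical check of these example computations (a minimal numpy sketch using the same \(\vec{u}\), \(\vec{v}\), \(\vec{w}\)):

```python
import numpy as np

u = np.array([3.0, 1.0])
v = np.array([-1.0, 3.0])
w = np.array([2.0, 5.0])

print(np.dot(u, w), np.dot(w, u))                            # commutative: 11.0 11.0
print(2 * np.dot(u, w), np.dot(2 * u, w), np.dot(u, 2 * w))  # scalars pull out: 22.0 each
cos_theta = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
print(np.arccos(cos_theta), np.pi / 2)                       # u.v = 0, so the angle is pi/2
print(np.dot(u, u), np.linalg.norm(u) ** 2)                  # both equal ||u||^2 = 10
```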
Parallel Vectors: \(\vec{u}\parallel\vec{v}\) if their directions are either same or opposite. In other words, \(\vec{u}\parallel\vec{v}\Leftrightarrow c\vec{u}=\vec{v}\) for some \(c\in\mathbb{R}\)
Example: Let \(\vec{u}=\langle2,2,2\rangle\) and \(\vec{v}=\langle-1,-1,-1\rangle\); then \(\vec{u}\parallel\vec{v}\) because \(\vec{u}=-2\vec{v}\).
Theorem [Cauchy-Schwarz Inequality]: For any \(\vec{u},\vec{v}\in\mathbb{R}^n\), \(\lvert\vec{u}\cdot\vec{v}\rvert\leq\lVert\vec{u}\rVert\lVert\vec{v}\rVert\).
Proof (sketch): For every \(t\in\mathbb{R}\), \(0\leq\lVert\vec{u}+t\vec{v}\rVert^2=\lVert\vec{v}\rVert^2t^2+2(\vec{u}\cdot\vec{v})t+\lVert\vec{u}\rVert^2\). A quadratic in \(t\) that is never negative has nonpositive discriminant, so \(4(\vec{u}\cdot\vec{v})^2-4\lVert\vec{u}\rVert^2\lVert\vec{v}\rVert^2\leq 0\), which is the inequality.
Theorem [Triangle Inequality]: For any \(\vec{u},\vec{v}\in\mathbb{R}^n\), \(\lVert\vec{u}+\vec{v}\rVert\leq\lVert\vec{u}\rVert+\lVert\vec{v}\rVert\).
Proof (sketch): \(\lVert\vec{u}+\vec{v}\rVert^2=\lVert\vec{u}\rVert^2+2(\vec{u}\cdot\vec{v})+\lVert\vec{v}\rVert^2\leq\lVert\vec{u}\rVert^2+2\lVert\vec{u}\rVert\lVert\vec{v}\rVert+\lVert\vec{v}\rVert^2=(\lVert\vec{u}\rVert+\lVert\vec{v}\rVert)^2\) by Cauchy-Schwarz; taking square roots finishes the proof.
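Both inequalities are also easy to test numerically; a minimal numpy sketch that checks them on random vectors in \(\mathbb{R}^5\):

```python
import numpy as np

rng = np.random.default_rng(0)

for _ in range(1000):
    u = rng.normal(size=5)
    v = rng.normal(size=5)
    # Cauchy-Schwarz: |u.v| <= ||u|| ||v||   (small tolerance for rounding)
    assert abs(u @ v) <= np.linalg.norm(u) * np.linalg.norm(v) + 1e-12
    # Triangle inequality: ||u + v|| <= ||u|| + ||v||
    assert np.linalg.norm(u + v) <= np.linalg.norm(u) + np.linalg.norm(v) + 1e-12

print("both inequalities held on all 1000 samples")
```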
1.3 Solving Linear Equations
Gauss’s Method
Definition: A linear combination of \(x_1, ... x_n\) is of the form \(a_1{x_1} + ... + {a_n{x_n}}\) where \(a_i\) are coefficients.
An n-tuple \((s_1, \dots , s_n)\) is a solution of, or satisfies, a system of equations:
\[\begin{matrix} a_{1,1}x_1 + & \dots & + a_{1, n}x_n & = & d_1 \\ \vdots & & \vdots & & \vdots \\ a_{m,1}x_1 + & \dots & + a_{m, n}x_n & = & d_m \end{matrix}\]
Example: \(S = \{\sin x, \cos x\}\)
\(4\sin x - 3\cos x\) is a linear combination of \(\sin x, \cos x\)
\(3(\sin x)^2 - 3\cos\sqrt{x}\) is NOT a linear combination of \(\sin x, \cos x\)
Gauss’ Method
Elementary row operations are:
Swapping Equations
Multiplication by nonzero constant \(\leftarrow\) (Re-scaling)
Replace an equation with the sum of itself and a multiple of another equation \(\leftarrow\) (Row Combo)
Theorem: Performing elementary row operations on a system of equations does not change the solution set.
General = Particular + Homogeneous
Theorem: Any linear system has general solution of the form:
\(\{\vec{p} + c_1\vec{\beta_1}+\dots+c_k\vec{\beta_k} \mid c_1,\dots,c_k \in\mathbb{R}\}\)
where \(k\) is the number of free variables, \(\vec{\beta_1}, \dots , \vec{\beta_k}\) are the vectors associated with the free variables, and \(\vec{p}\) is a particular solution (vector of constants).
Definition: A linear system is homogeneous if every equation has constant term 0 (each equation is set equal to 0).
A homogeneous system is always guaranteed to have at least one solution, the trivial solution \((x_1 = x_2 = \dots = x_n = 0)\).
For homogeneous systems \(\vec{p}=\vec{0}\)
Theorem: Any linear system has general solution of the form: \(\begin{matrix} \{\vec{p} + \vec{h}\} \end{matrix}\)
Where \(\vec{p}\) is a particular solution and \(\vec{h}\) satisfies the associated homogeneous system. This theorem says that \(\vec{h} = c_1\vec{\beta_1} + \dots + c_k\vec{\beta_k}\); in other words, the free-variable terms satisfy the associated homogeneous system.
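As a concrete check of this structure, here is a minimal numpy sketch with a made-up system (the matrix and vectors below are illustrative, not taken from the notes): if \(A\vec{p}=\vec{d}\) and \(A\vec{h}=\vec{0}\), then every \(\vec{p}+c\vec{h}\) also solves \(A\vec{x}=\vec{d}\).

```python
import numpy as np

# Hypothetical 2x3 system A x = d (one free variable, so infinitely many solutions).
A = np.array([[1.0, 2.0, 1.0],
              [0.0, 1.0, 1.0]])
d = np.array([4.0, 1.0])

p = np.array([2.0, 1.0, 0.0])    # a particular solution: A @ p == d
h = np.array([1.0, -1.0, 1.0])   # solves the homogeneous system: A @ h == 0

for c in (-2.0, 0.0, 3.5):       # every p + c*h is again a solution
    print(A @ (p + c * h))       # prints [4. 1.] each time
```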
Theorem: Any system of linear equations must have either
Exactly 1 solution
No solutions
Infinitely many solutions
Definitions:
A square matrix is a matrix with the same number of rows and columns (n = m).
A square matrix is non-singular if it is the matrix of coefficients of a homogeneous system with a unique solution.
A square matrix is singular if it is not non-singular.
Example:
A = \(\begin{bmatrix} 2 & 1\\ -1 & 3\\ \end{bmatrix} \Rightarrow \begin{bmatrix} 2 & 1\\ 0 & 7\\ \end{bmatrix}\) I + 2(II), replace II
Associated System: \(2x+y=0\)
\(-x+3y=0\)
\(\therefore y=x=0\)
\(\therefore\) non singular
Example:
B = \(\begin{bmatrix} 2 & 2\\ 1 & 1 \\ \end{bmatrix}\) \(\Rightarrow\) \(\begin{bmatrix} 2 & 2\\ 0 & 0\\ \end{bmatrix}\) \(-\frac{1}{2}\)I + II, replace II. The row of zeros means the associated homogeneous system has a free variable, hence infinitely many solutions, so B is singular.
1.4 Row Echelon Form & Reduced Row Echelon Form
Definition: A system (or its matrix) is in row echelon form (REF) when the leading variable of each equation is to the right of the leading variable of the row above it AND all rows of zeros are at the bottom.
Example:
\[x-y=0\]
\[2x-2y+z+2w = 4\]
\[y+w = 0\] \[2z+w = 5\]
-2I + II, replace II
II swap III
\[x-y =0\] \[y+w = 0\] \[z+2w = 4\] \[2z+w = 5\]
-2III + IV, replace IV
\[x-y =0\] \[y+w = 0\] \[z+2w = 4\] \[-3w = -3\]
w=1
z=2
y=-1
x=-1
Solution: (-1, -1, 2, 1)
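To double-check the answer, here is a minimal numpy sketch that solves the same system directly:

```python
import numpy as np

# Coefficient matrix and right-hand side of the system above.
A = np.array([[1.0, -1.0, 0.0, 0.0],
              [2.0, -2.0, 1.0, 2.0],
              [0.0,  1.0, 0.0, 1.0],
              [0.0,  0.0, 2.0, 1.0]])
d = np.array([0.0, 4.0, 0.0, 5.0])

print(np.linalg.solve(A, d))   # [-1. -1.  2.  1.]
```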
1.4.1 Reduced Row Echelon Form
Gauss-Jordan reduction is an extension of Gauss's method.
Definition: A matrix is in reduced row echelon form (RREF) if the matrix is in REF and:
All nonzero leading entries are 1.
Any rows containing all 0’s are below all nonzero rows.
Each leading 1 is the only nonzero entry in its column (every entry above and below a leading 1 is 0).
Reminders
Definition: (i) Elementary row operations are: 1. Swapping, 2. Row Scaling, 3. Row Replacement (linear combination of rows)
Theorem: (ii) Elementary row operations are reversible
Definition: (iii) Two matrices are row equivalent if one can be reduced (changed to) the other by a sequence of elementary row operations.
Pivot Positions: location of a leading entry in RREF matrix.
Example: Compute RREF(A) if A = \(\begin{bmatrix} 2 & 6 & 1 & 2 & | & 5 \\ 0 & 3 & 1 & 4 & | & 1 \\ 0 & 3 & 1 & 2 & | & 5 \end{bmatrix}\)
-II + III \(\Rightarrow\) \(\begin{bmatrix} 2 & 6 & 1 & 2 & | & 5 \\ 0 & 3 & 1 & 4 & | & 1 \\ 0 & 0 & 0 & -2 & | & 4 \end{bmatrix}\)
(1/2)I, (1/3)II, (-1/2)III \(\Rightarrow\) \(\begin{bmatrix} 1 & 3 & 1/2 & 1& | & 5/2 \\ 0 & 1 & 1/3 & 4/3 & | & 1/3 \\ 0 & 0 & 0 & 1 & | & -2 \end{bmatrix}\)
II - (4/3)III, I - III \(\Rightarrow\) \(\begin{bmatrix} 1 & 3 & 1/2 & 0& | & 9/2 \\ 0 & 1 & 1/3 & 0 & | & 3 \\ 0 & 0 & 0 & 1 & | & -2 \end{bmatrix}\)
(-3)II + I \(\Rightarrow\) \(\begin{bmatrix} 1 & 0 & -1/2 & 0& | & -9/2 \\ 0 & 1 & 1/3 & 0 & | & 3 \\ 0 & 0 & 0 & 1 & | & -2 \end{bmatrix}\) = RREF(A)
Columns 1, 2 & 4 are pivot columns and \(a_{11}\), \(a_{22}\), \(a_{34}\) are the pivot positions.
\(\begin{bmatrix} 2 & 6 & 1 & 2 & | & 5 \\ 0 & 3 & 1 & 4 & | & 1 \\ 0 & 0 & 0 & -2 & | & 4 \end{bmatrix}\) is row equivalent to \(\begin{bmatrix} 1 & 0 & -1/2 & 0& | & -9/2 \\ 0 & 1 & 1/3 & 0 & | & 3 \\ 0 & 0 & 0 & 1 & | & -2 \end{bmatrix}\)
Corresponding system of equations:
\(2x+6y+z+2w = 5 \Rightarrow x-(1/2)z = -9/2\)
\(3y+z+4w = 1 \Rightarrow y+(1/3)z = 3\)
\(3y+z+2w = 5 \Rightarrow w = -2\)
z is a free variable.
S = \(\{\begin{bmatrix} -9/2\\ 3\\ 0\\ -2 \end{bmatrix} + \begin{bmatrix} 1/2\\ -1/3\\ 1\\ 0 \end{bmatrix}z \mid z \in\mathbb{R}\}\)
Important Note: The RREF of a matrix is unique. There’s only 1 RREF of a matrix.
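This example can be confirmed with sympy's Matrix.rref, which also reports the pivot columns (a minimal sketch; sympy works in exact rational arithmetic):

```python
from sympy import Matrix

# Augmented matrix from the example above.
A = Matrix([[2, 6, 1, 2, 5],
            [0, 3, 1, 4, 1],
            [0, 3, 1, 2, 5]])

R, pivots = A.rref()
print(R)        # Matrix([[1, 0, -1/2, 0, -9/2], [0, 1, 1/3, 0, 3], [0, 0, 0, 1, -2]])
print(pivots)   # (0, 1, 3) -- i.e. columns 1, 2 & 4 in the 1-based counting used above
```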
Equivalence Relations
Definition: A relation ~ between a set of elements is an equivalence relation if the relation is reflexive, symmetric, and transitive.
Example 1: Equality:
A = A \(\Rightarrow\) Reflexive
if A = B then B = A \(\Rightarrow\) Symmetric
if A = B and B = C then A = C \(\Rightarrow\) Transitive
Example 2: Is "has the same birthday as" an equivalence relation?
Reflexive, Symmetric, and Transitive work.
Example 3: Is "is friends with" an equivalence relation?
No, any of the conditions could fail.
Example 4: S = \(\mathbb{Z}\), relation: R = { x ~ y | x and y have the same parity}
Reflexive: if x is odd then it is odd (same for even) \(\therefore\) x ~ x
Symmetric: if x ~ y then y ~ x
Transitive: if x ~ y and y ~ z then x ~ z (all are odd)
Example 5: Let S be the set of \(n\times m\) matrices
R = {A ~ B | A can be reduced to B by elementary row operations}. This relation (row equivalence) is an equivalence relation: it is reflexive (apply no operations), symmetric (elementary row operations are reversible), and transitive (perform one sequence of operations, then the other).
Supporting Theorems
Linear Combination Lemma:
A linear combination of linear combinations is a linear combination.
Proof: Let \(S = \{x_1,...,x_n\}\). Consider \(m\) linear combinations of \(S\):
\(c_{11}x_{1} + ...+c_{1n}x_{n},\;\;\dots,\;\;c_{m1}x_{1} + ...+c_{mn}x_{n}\)
By definition, a linear combination of these combinations is:
\(d_{1}(c_{11}x_1+...+c_{1n}x_n) +...+d_m(c_{m1}x_1+...+c_{mn}x_n)\) \(= d_1c_{11}x_1+...+d_1c_{1n}x_n +...+d_mc_{m1}x_1+...+d_mc_{mn}x_n\) \(= (d_1c_{11}+...+d_mc_{m1})x_1+...+(d_1c_{1n}+...+d_mc_{mn})x_n\)
which is a linear combination of \(x_1,...,x_n\)
In a REF matrix, no nonzero row is a linear combination of the other rows:
Example: \(B = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 3 & 2 \\ 0 & 0 & 5 \end{bmatrix}\) is a REF matrix.
\(c_1(II) + c_2(III) \neq (I)\) since \(b_{21} = 0 = b_{31}\), but \(b_{11} = 1 \neq 0\)
\(c_1(I) + c_2(II) \neq (III)\) since matching the first entry forces \(c_1 = 0\) and matching the second entry then forces \(c_2 = 0\), but \(b_{33} = 5 \neq 0\)
\(c_1(I) + c_2(III) \neq (II)\) since matching the first entry forces \(c_1 = 0\), and then the second entry of \(c_2(III)\) is \(0\) while \(b_{22} = 3 \neq 0\)
1.5 Determinants and Inverse Matrices
1.5.1 Determinants
The Determinant as a Function
The determinant is a function on square matrices, of the form \(f: \mathbb{R}^{n \times n} \rightarrow \mathbb{R}\)
Example: \(\mathrm{det}\left(\begin{bmatrix} 4 & 2 \\ 2 & 1 \end{bmatrix}\right) = 0\)
Background for understanding the function \(\mathrm{det}()\)
Permutation: Given \(\{1, 2, ..., n\}\), a permutation is an arrangement of all the integers without omissions or repetitions.
Example: \(\{1, 2, 3\} = S\). The permutations of \(S\) are \((1,2,3), (1,3,2), (2,1,3), (2,3,1), (3,1,2), (3,2,1)\).
Inversion: Any time a larger integer precedes a smaller integer in a permutation.
Example: The permutation \((6, 1, 3, 4, 5, 2)\) has \(5+0+1+1+1=8\) inversions.
Odd or Even Permutation: Determine whether the permutation is odd or even based on the number of inversions.
Example: the permutation \((6, 1, 3, 4, 5, 2)\) is even because there are 8 inversions.
Elementary Products: An elementary product from an \(n\times n\) matrix is a product of \(n\) entries, no two of which come from the same row or the same column.
Let \(A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}\). Its elementary products have the form \(a_{1j_1}a_{2j_2}a_{3j_3}\), where \((j_1, j_2, j_3)\) is a permutation of \((1, 2, 3)\).
Definition of Determinant Function
Let \(A\) be a square matrix. The determinant of \(A\) denoted \(\mathrm{det}(A)\), is the sum of all signed elementary products.
Example: \(\mathrm{det}(\begin{bmatrix} 4 & -1 \\ 3 & 2 \end{bmatrix}) = 8 - (-3) = 11 = a_{11}a_{22} - a_{12}a_{21}\)
Use the left expressions to prove the formula on the right
Example: \(\mathrm{det}(\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}) = a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{11}a_{23}a_{32} - a_{12}a_{21}a_{33} - a_{13}a_{22}a_{31}\)
Trick: The \(3\times3\) determinant can be remembered with the diagonal (Sarrus) rule, which reproduces the sum of signed elementary products. (NOTE: this diagonal trick is only valid for \(3\times3\) matrices.)
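The permutation definition translates directly into a (slow but instructive) program; a minimal Python sketch, checked against numpy on the \(2\times2\) example above and an arbitrary \(3\times3\) matrix:

```python
import numpy as np
from itertools import permutations

def sign(perm):
    """+1 for an even number of inversions, -1 for an odd number."""
    inversions = sum(1 for i in range(len(perm)) for j in range(i + 1, len(perm))
                     if perm[i] > perm[j])
    return 1 if inversions % 2 == 0 else -1

def det_by_permutations(A):
    """Sum of all signed elementary products (the definition of the determinant)."""
    n = len(A)
    return sum(sign(p) * np.prod([A[i][p[i]] for i in range(n)])
               for p in permutations(range(n)))

print(det_by_permutations([[4, -1], [3, 2]]))              # 11
B = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]                     # arbitrary example
print(det_by_permutations(B), round(np.linalg.det(np.array(B, float))))  # -3 -3
```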
Evaluating Determinants by Row Reduction
Theorem 1: If \(A\) is a square matrix containing a row of zeros then \(\mathrm{det}(A) = 0\)
This is so because each signed elementary product contains one factor from each row; the factor taken from the zero row is 0, so every product is zero.
Theorem 2: If \(A\) is an \(n \times n\) triangular matrix (upper, lower, or diagonal) then \(\mathrm{det}(A)= a_{11} \cdot a_{22} \cdot...\cdot a_{nn}\), the product of the diagonal entries.
This is so because every signed elementary product is zero except the one formed from the diagonal entries.
Theorem 3: Let \(A\) be an \(n\times n\) matrix
- Row scaling scales \(\mathrm{det}(A)\): If \(A'\) is constructed by multiplying one row of \(A\) by \(c \in \mathbb{R}\) (a connection to row operations) then \(\mathrm{det}(A') = c\,\mathrm{det}(A)\)
- Row swap flips the sign of \(\mathrm{det}(A)\): If \(A'\) is constructed by interchanging 2 rows of \(A\) then \(\mathrm{det}(A') = -\mathrm{det}(A)\)
- Row replacement doesn't change \(\mathrm{det}(A)\): If \(A'\) is constructed by replacing a row of \(A\) with the sum of itself and a multiple of another row then \(\mathrm{det}(A') = \mathrm{det}(A)\)
Properties of the Determinant Function
Theorem 1: If \(A\) is a square matrix then \(\mathrm{det}(A) = \mathrm{det}(A^T)\)
Theorem 2: \(\mathrm{det}(kA) = k^n\mathrm{det}(A)\) where \(A\) is \(n\times n\) and \(k \in \mathbb{R}\)
Theorem 3: \(A\) is invertible if and only if \(\mathrm{det}(A) \neq 0\)
NOTE: in general, \(\mathrm{det}(A + B) \neq \mathrm{det}(A) + \mathrm{det}(B)\)
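A quick numerical sanity check of these properties (a minimal numpy sketch; the \(2\times2\) matrices are arbitrary examples chosen for illustration):

```python
import numpy as np

A = np.array([[2.0, 1.0], [5.0, 3.0]])   # det(A) = 1
B = np.array([[1.0, 4.0], [2.0, 9.0]])   # det(B) = 1

print(np.linalg.det(A), np.linalg.det(A.T))            # det(A) == det(A^T)
print(np.linalg.det(3 * A), 3**2 * np.linalg.det(A))   # det(kA) == k^n det(A), here n = 2
print(np.linalg.det(A + B), np.linalg.det(A) + np.linalg.det(B))  # about 1.0 vs 2.0 -- not equal
```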
1.5.2 Cofactor Expansion
Definition: Cofactor Expansion is a computational method to evaluate determinants.
Let \(A\) be a square matrix. The minor of \(a_{ij}\), denoted by \(M_{ij}\), is the determinant of the submatrix that remains after deleting the \(i\)th row and \(j\)th column from \(A\). The number \(C_{ij} = (-1)^{i+j}M_{ij}\) is called the cofactor of \(a_{ij}\)
Cofactor Expansion Method
Recall: \(\mathrm{det}(A) = \mathrm{det}(\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}) = a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{11}a_{23}a_{32} - a_{12}a_{21}a_{33} - a_{13}a_{22}a_{31}\)
Which we can manipulate to get the following:
\(= a_{11}(a_{22}a_{33} - a_{23}a_{32}) + a_{21}(a_{13}a_{32} - a_{12}a_{33}) + a_{31}(a_{12}a_{23} - a_{13}a_{22})\)
\(= a_{11}C_{11} + a_{21}C_{21} + a_{31}C_{31}\) - this is cofactor expansion along the 1st column
NOTE: we can expand along any column OR row
Example: \(A = \begin{bmatrix} 3 & 1 & 0 \\ -2 & -4 & 3 \\ 5 & 4 & -2 \end{bmatrix}\)
\(\therefore \mathrm{det}(A) = 3\begin{vmatrix}-4 & 3 \\ 4 & -2 \end{vmatrix} - (-2)\begin{vmatrix} 1 & 0 \\ 4 & -2 \end{vmatrix} + 5\begin{vmatrix} 1 & 0 \\ -4 & 3 \end{vmatrix}\)
\(= 3(8 - 12) + 2(-2 - 0) + 5(3-0)\)
\(=3(-4) - 4 + 15\)
\(=-12 -4 + 15 = -1\)
Try expanding along a different row or column.
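Cofactor expansion along the first column also translates directly into a short recursive function; a minimal Python sketch (not efficient, but it mirrors the hand computation above):

```python
def det_cofactor(A):
    """Determinant by cofactor expansion along the first column."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for i in range(n):
        minor = [row[1:] for k, row in enumerate(A) if k != i]  # delete row i and column 1
        total += (-1) ** i * A[i][0] * det_cofactor(minor)      # (-1)**i is the cofactor sign (-1)^(i+j), j = 1
    return total

A = [[3, 1, 0], [-2, -4, 3], [5, 4, -2]]
print(det_cofactor(A))   # -1, matching the expansion above
```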
1.5.3 Inverses of Matrices
Definition:
- An identity matrix, denoted \(I_n\), is an \(n\times n\) matrix with the value \(1\) in every main diagonal entry and zeros everywhere else.
Examples:
\(I_2 = \begin{bmatrix}1 & 0 \\ 0 & 1 \end{bmatrix}\)
\(I_3 = \begin{bmatrix}1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1\end{bmatrix}\)
Unique Property - identity matrix acts like the number one: \(AI = IA = A\)
The inverse of an \(n \times n\) matrix \(A\), denoted \(A^{-1}\), is a matrix such that \(AA^{-1} = A^{-1}A = I_n\)
A matrix with an inverse is called invertible
Example:
\(A = \begin{bmatrix}1 & 2 \\ 1 & 3 \end{bmatrix}, M = \begin{bmatrix}3 & -2 \\ -1 & 1 \end{bmatrix}\)
\(AM = \begin{bmatrix}1 & 2 \\ 1 & 3 \end{bmatrix}\begin{bmatrix}3 & -2 \\ -1 & 1 \end{bmatrix} = \begin{bmatrix}1 & 0 \\ 0 & 1 \end{bmatrix}\)
\(\therefore M = A^{-1}\)
Theorem: If \(B\) and \(C\) are inverses of \(A\) then \(B=C\).
Proof: \(BA = I\)
\((BA)C = IC\)
\((BA)C = C\)
\(B(AC) = C\)
\(BI = C\)
\(B = C\)
Theorem: If \(A\) and \(B\) are invertible \(n\times n\) matrices then
\(AB\) is invertible
\((AB)^{-1} = B^{-1}A^{-1}\)
Proof:
\((AB)(B^{-1}A^{-1})\)
\(=A(BB^{-1})A^{-1}\)
\(=AIA^{-1}\)
\(=AA^{-1}\)
\(=I \therefore (AB)^{-1} = B^{-1}A^{-1}\)
Theorem: If \(A\) is invertible then
\((A^{-1})^{-1} = A\)
\((A^n)^{-1} = (A^{-1})^n\) for \(n \in \mathbb{N}\)
\((kA)^{-1} = \frac{1}{k}A^{-1}\) for \(k \neq 0\)
Proof:
\((kA)(\frac{1}{k} A^{-1}) = \frac{1}{k} (kA) A^{-1}\)
\(= \frac{k}{k}A A^{-1}\)
\(= 1I\)
\(= I\)
\(\therefore (kA)^{-1} = \frac{1}{k}A^{-1}\)
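These inverse properties are easy to spot-check numerically (a minimal numpy sketch; the matrices are arbitrary invertible examples):

```python
import numpy as np

A = np.array([[1.0, 2.0], [1.0, 3.0]])
B = np.array([[2.0, 1.0], [1.0, 1.0]])
inv = np.linalg.inv

print(np.allclose(inv(A @ B), inv(B) @ inv(A)))    # (AB)^-1 == B^-1 A^-1  -> True
print(np.allclose(inv(inv(A)), A))                 # (A^-1)^-1 == A        -> True
print(np.allclose(inv(5 * A), (1 / 5) * inv(A)))   # (kA)^-1 == (1/k)A^-1  -> True
```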
How to Compute the Inverse of A Matrix
Recall: Elementary row operations are reversible.
Procedure for Computing the Inverse
Adjoin \(A\) and \(I\)
Row reduce \(A\) to RREF, applying the same operations to \(I\)
If the left-hand block reduces to \(I\), the resulting matrix on the right-hand side is \(A^{-1}\)
Example: Compute the inverse of \(A = \begin{bmatrix}1 & 2 & 3 \\ 2 & 5 & 3 \\ 1 & 0 & 8\end{bmatrix}\)
\([A | I]\)
\(= \begin{bmatrix}1 & 2 & 3 & | & 1 & 0 & 0 \\ 2 & 5 & 3 & | & 0 & 1 & 0\\ 1 & 0 & 8 & | & 0 & 0 & 1\end{bmatrix}\)
\(-2I + II\), replace II and \(-I + III\), replace III \(\rightarrow\)
\(\begin{bmatrix}1 & 2 & 3 & | & 1 & 0 & 0 \\ 0 & 1 & -3 & | & -2 & 1 & 0\\ 0 & -2 & 5 & | & -1 & 0 & 1\end{bmatrix}\)
\(2II + III \rightarrow\)
\(...A^{-1} = \begin{bmatrix}-40 & 16 & 9 \\ 13 & -5 & -3 \\ 5 & -2 & -1\end{bmatrix}\)
Comment: If the row operations produce a row of zeros in the left-hand block, then \(A\) is noninvertible.
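The same procedure can be carried out exactly with sympy (a short sketch: adjoin the identity, row reduce, and read off the right-hand block):

```python
from sympy import Matrix, eye

A = Matrix([[1, 2, 3],
            [2, 5, 3],
            [1, 0, 8]])

augmented = A.row_join(eye(3))   # [A | I]
R, _ = augmented.rref()          # row reduce to RREF
A_inv = R[:, 3:]                 # if the left block is I, the right block is A^{-1}
print(A_inv)                     # Matrix([[-40, 16, 9], [13, -5, -3], [5, -2, -1]])
print(A * A_inv == eye(3))       # True
```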