The provided text serves as a foundational resource for linear algebra, starting with basic concepts such as vectors, vector spaces, real numbers, and their geometric interpretations. It progresses to vector operations (addition, subtraction, scalar multiplication) and the crucial concepts of linear combinations and spans. The material then introduces special vectors such as zero and unit vectors, along with the notion of linear independence and dependence. Further topics include the dot product and its relation to vector length, followed by an introduction to matrices, their operations (addition, subtraction, scalar multiplication, and matrix multiplication), and special types of matrices. The text also explores determinants and their geometric significance before moving into more advanced topics: bases of vector spaces, projections, orthonormal bases obtained through the Gram-Schmidt process, and finally matrix decomposition techniques.
Vector Norms, Distances, Vector Operations, and Matrix Fundamentals Study Guide
I. Key Concepts and Definitions:
A. Vectors: * Definition: A quantity having magnitude and direction, often represented as an ordered list of numbers (components or entries). * Representation: Column vectors are commonly used, enclosed in square brackets (e.g., [x, y]). * Dimensions: An $n$-dimensional vector has $n$ components and belongs to the vector space $\mathbb{R}^n$. * Equality: Two vectors are equal if and only if all their corresponding components are equal.
B. Scalars: * Definition: A quantity with only magnitude (a single number).
C. Vector Norms: * Definition: A function that assigns a non-negative length or size to each vector in a vector space. * Norm Two (Euclidean Norm or $L^2$ Norm): For a vector $v = [v_1, v_2, …, v_n]^T$, the norm two is $||v||_2 = \sqrt{v_1^2 + v_2^2 + … + v_n^2}$.
D. Euclidean Distance: * Definition: The straight-line distance between two points in Euclidean space. * Formula: The Euclidean distance between two vectors $a = [a_1, a_2, …, a_n]^T$ and $b = [b_1, b_2, …, b_n]^T$ is $d(a, b) = ||a - b||_2 = \sqrt{(a_1 - b_1)^2 + (a_2 - b_2)^2 + … + (a_n - b_n)^2}$. * Relationship to Norm: The Euclidean distance between two points is the norm of the vector connecting them.
E. Special Vectors: * Zero Vector: A vector where all components are zero, denoted by $\vec{0}$. * Unit Vector: A vector with a magnitude (norm) of one. * Standard Unit Vectors ($e_i$): Vectors with a single component equal to one and all other components equal to zero (e.g., in $\mathbb{R}^3$, $e_1 = [1, 0, 0]^T$, $e_2 = [0, 1, 0]^T$, $e_3 = [0, 0, 1]^T$). * Sparse Vector: A vector with many of its entries as zero.
F. Vector Operations: * Addition: Adding two vectors of the same dimension is done component-wise: $(a + b)_i = a_i + b_i$. * Subtraction: Subtracting two vectors of the same dimension is done component-wise: $(a – b)_i = a_i – b_i$. * Scalar Multiplication: Multiplying a vector by a scalar multiplies each component of the vector by that scalar: $(ca)_i = c \cdot a_i$.
G. Properties of Vector Addition: * Commutativity: $a + b = b + a$ * Associativity: $(a + b) + c = a + (b + c)$ * Identity Element: $a + \vec{0} = \vec{0} + a = a$ * Inverse Element: $a + (-a) = (-a) + a = \vec{0}$, where $-a$ is the vector with components $-a_i$.
H. Linear Combination: * Definition: A vector formed by multiplying a set of vectors by scalars and adding the results. For vectors $v_1, v_2, …, v_k$ and scalars $c_1, c_2, …, c_k$, the linear combination is $c_1v_1 + c_2v_2 + … + c_kv_k$. * Coefficients: The scalars ($c_i$) in a linear combination.
I. Span of Vectors: * Definition: The set of all possible linear combinations of a set of vectors. * Example: The span of two non-collinear vectors in $\mathbb{R}^2$ is the entire $\mathbb{R}^2$. The span of two collinear vectors in $\mathbb{R}^2$ is a line through the origin. The span of a single non-zero vector in $\mathbb{R}^2$ is a line through the origin. The span of the zero vector is just the zero vector.
J. Dot Product (Scalar Product): * Definition: An operation that takes two equal-length sequences of numbers (usually coordinate vectors) and returns a single number. * Formula: For two vectors $a = [a_1, a_2, …, a_n]^T$ and $b = [b_1, b_2, …, b_n]^T$, their dot product is $a \cdot b = a_1b_1 + a_2b_2 + … + a_nb_n$. * Relationship to Norm: $v \cdot v = ||v||_2^2$.
K. Matrices: * Definition: A rectangular array of numbers arranged in rows and columns. * Dimensions: An $m \times n$ matrix has $m$ rows and $n$ columns. * Indexing: The element in the $i$-th row and $j$-th column is denoted as $A_{ij}$ or $a_{ij}$. * Square Matrix: A matrix with an equal number of rows and columns ($m = n$). * Identity Matrix ($I_n$): An $n \times n$ square matrix with ones on the main diagonal (from the upper-left to the lower-right) and zeros everywhere else. * Matrix Equality: Two matrices are equal if and only if they have the same dimensions and all their corresponding entries are equal.
L. Matrix Operations: * Scalar Multiplication: Multiplying a matrix by a scalar multiplies each entry of the matrix by that scalar. * Matrix Addition and Subtraction: Performed element-wise on matrices of the same dimensions. * Matrix Multiplication: The product of an $m \times k$ matrix $A$ and a $k \times n$ matrix $B$ is an $m \times n$ matrix $C$, where $C_{ij} = \sum_{l=1}^{k} A_{il}B_{lj}$ (the dot product of the $i$-th row of $A$ and the $j$-th column of $B$).
M. Properties of Matrix Operations: * Associativity of Matrix Multiplication: $(AB)C = A(BC)$ * Distributivity: $A(B + C) = AB + AC$ and $(A + B)C = AC + BC$ * Scalar Multiplication with Matrix Multiplication: $r(AB) = (rA)B = A(rB)$ * Identity Matrix Property: $AI = IA = A$, where $I$ is the identity matrix of appropriate dimension.
N. Determinants: * Definition (for a $2 \times 2$ matrix): For a matrix $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$, the determinant is $\det(A) = ad - bc$. * Calculation (by cofactor expansion): The determinant of a matrix can be calculated by expanding along any row or column using cofactors. * Properties: * $\det(I) = 1$ * Swapping two rows (or columns) changes the sign of the determinant. * If a matrix has a row (or column) of zeros, its determinant is zero. * If a matrix has two identical rows (or columns), its determinant is zero. * Multiplying a row (or column) by a scalar $c$ multiplies the determinant by $c$. * $\det(A^T) = \det(A)$ * $\det(AB) = \det(A)\det(B)$
O. Basis of a Vector Space: * Definition: A set of linearly independent vectors that span the entire vector space. * Properties: Every vector in the space can be expressed as a unique linear combination of the basis vectors.
P. Orthonormal Basis: * Definition: A basis where all vectors in the set are orthogonal to each other and each vector has a norm (length) of one (they are unit vectors). * Orthogonal Vectors: Two vectors $u$ and $v$ are orthogonal if their dot product is zero ($u \cdot v = 0$). * Gram-Schmidt Process: An algorithm for orthonormalizing a set of linearly independent vectors in an inner product space.
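The definitions in Section I translate almost one-for-one into code. Below is a minimal Python sketch (the function names `norm2`, `distance`, and `dot` are our own, not from any library) illustrating the norm, distance, and dot-product formulas above:

```python
import math

def norm2(v):
    # Euclidean (L2) norm: square root of the sum of squared components.
    return math.sqrt(sum(x * x for x in v))

def distance(a, b):
    # Euclidean distance: the norm of the vector connecting a and b.
    return norm2([ai - bi for ai, bi in zip(a, b)])

def dot(a, b):
    # Dot product: sum of products of corresponding components.
    return sum(ai * bi for ai, bi in zip(a, b))

v = [3, 4]
print(norm2(v))                   # 5.0
print(distance([1, 2], [4, 6]))   # 5.0
print(dot(v, v), norm2(v) ** 2)   # 25 25.0 -- v . v = ||v||_2^2 (item J)
```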
II. Quiz:
- Explain the relationship between the Euclidean norm of a vector and the Euclidean distance between two points.
- What are the key differences between a zero vector and a unit vector? Provide an example of each in $\mathbb{R}^3$.
- Describe the process of adding two vectors and multiplying a vector by a scalar. How do the dimensions of the vectors play a role?
- State the commutative and associative properties of vector addition. Provide a brief example to illustrate one of these properties.
- Define a linear combination of vectors. What role do the coefficients in a linear combination play?
- Explain the concept of the span of a set of vectors. What does it mean if the span of a set of vectors in $\mathbb{R}^n$ is equal to $\mathbb{R}^n$?
- What is the formula for the dot product of two $n$-dimensional vectors? How is the dot product related to the norm of a vector?
- Describe the structure of an $n \times n$ identity matrix. How does it behave when multiplied by another matrix?
- For a $2 \times 2$ matrix $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$, what is its determinant? How does swapping the rows of $A$ affect its determinant?
- Define a basis of a vector space. What two key properties must a set of vectors satisfy to form a basis?
III. Quiz Answer Key:
- The Euclidean distance between two points (represented by vectors $a$ and $b$) is defined as the Euclidean norm of the vector that connects these two points, which is the vector $a - b$ (or $b - a$). Thus, $d(a, b) = ||a - b||_2$.
- A zero vector is a vector where all its components are zero (e.g., $[0, 0, 0]^T$ in $\mathbb{R}^3$), and its magnitude is zero. A unit vector is a vector whose Euclidean norm (length) is one (e.g., $[1, 0, 0]^T$ in $\mathbb{R}^3$).
- To add two vectors, their corresponding components are added together. To multiply a vector by a scalar, each component of the vector is multiplied by that scalar. Vector addition is only defined for vectors of the same dimension, and the resulting vector has the same dimension.
- The commutative property states that the order of addition does not matter: $a + b = b + a$. For example, if $a = [1, 2]^T$ and $b = [3, 4]^T$, then $a + b = [4, 6]^T$ and $b + a = [4, 6]^T$.
- A linear combination of vectors $v_1, v_2, …, v_k$ is a sum of scalar multiples of these vectors: $c_1v_1 + c_2v_2 + … + c_kv_k$. The coefficients ($c_i$) determine how each vector is scaled and contribute to the resulting vector of the combination.
- The span of a set of vectors is the set of all vectors that can be formed by taking linear combinations of the original vectors. If the span of a set of vectors in $\mathbb{R}^n$ is equal to $\mathbb{R}^n$, it means that any vector in the $n$-dimensional space can be expressed as a linear combination of the vectors in that set.
- The dot product of two $n$-dimensional vectors $a = [a_1, …, a_n]^T$ and $b = [b_1, …, b_n]^T$ is $a \cdot b = a_1b_1 + … + a_nb_n$. The dot product of a vector with itself is equal to the square of its Euclidean norm: $v \cdot v = ||v||_2^2$.
- An $n \times n$ identity matrix ($I_n$) is a square matrix with ones on its main diagonal and zeros elsewhere. When an $m \times n$ matrix $A$ is multiplied by an $n \times n$ identity matrix $I_n$, the result is $A$. Similarly, when an $n \times n$ identity matrix $I_n$ is multiplied by an $n \times k$ matrix $B$, the result is $B$.
- For $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$, $\det(A) = ad - bc$. Swapping the rows of $A$ results in the matrix $\begin{pmatrix} c & d \\ a & b \end{pmatrix}$, whose determinant is $cb - da = -(ad - bc) = -\det(A)$. Thus, swapping rows changes the sign of the determinant.
- A basis of a vector space is a set of vectors that satisfies two key properties: (1) the vectors in the set must be linearly independent (no vector in the set can be expressed as a linear combination of the others), and (2) the vectors in the set must span the entire vector space (every vector in the space can be expressed as a linear combination of the basis vectors).
IV. Essay Format Questions:
- Discuss the significance of vector norms and Euclidean distance in various fields such as machine learning and data analysis. Provide specific examples of how these concepts are applied.
- Explain the concept of linear dependence and linear independence of vectors. How is this concept related to the idea of a basis for a vector space and the uniqueness of linear combinations?
- Describe the process of matrix multiplication and its properties. Compare and contrast matrix multiplication with scalar multiplication and discuss the implications of the non-commutative nature of matrix multiplication.
- Explain the concept of the determinant of a square matrix and its key properties. Discuss the geometric interpretation of the determinant in two and three dimensions.
- Describe the Gram-Schmidt process for orthonormalizing a set of linearly independent vectors. Explain the importance of orthonormal bases in linear algebra and its applications.
V. Glossary of Key Terms:
- Vector: A quantity with both magnitude and direction, represented as an ordered list of numbers.
- Scalar: A quantity with only magnitude (a single number).
- Norm: A function that assigns a non-negative length or size to a vector.
- Euclidean Norm ($L^2$ Norm): The square root of the sum of the squares of the vector’s components, representing its length.
- Euclidean Distance: The straight-line distance between two points in Euclidean space.
- Zero Vector: A vector with all components equal to zero.
- Unit Vector: A vector with a magnitude (norm) of one.
- Standard Unit Vectors: Vectors with one component equal to one and all others zero.
- Sparse Vector: A vector with many zero entries.
- Linear Combination: A vector formed by a sum of scalar multiples of other vectors.
- Span: The set of all possible linear combinations of a set of vectors.
- Dot Product: An operation on two vectors that returns a scalar, calculated as the sum of the products of their corresponding components.
- Matrix: A rectangular array of numbers arranged in rows and columns.
- Square Matrix: A matrix with an equal number of rows and columns.
- Identity Matrix: A square matrix with ones on the main diagonal and zeros elsewhere.
- Determinant: A scalar value that can be computed from the elements of a square matrix, providing information about the matrix’s properties.
- Basis: A set of linearly independent vectors that span the entire vector space.
- Orthonormal Basis: A basis where all vectors are orthogonal (dot product is zero) and have a norm of one.
- Orthogonal Vectors: Two vectors whose dot product is zero.
- Gram-Schmidt Process: An algorithm for orthonormalizing a set of linearly independent vectors.
Briefing Document: Review of Linear Algebra Concepts
This briefing document summarizes the main themes and important ideas from the provided source material, focusing on foundational concepts in linear algebra, including vectors, norms, distance, matrices, determinants, linear transformations, and the Gram-Schmidt process.
I. Vectors: Fundamental Building Blocks
- Definition and Representation: Vectors are quantities possessing both magnitude (length) and direction, distinguishing them from scalars, which only have magnitude. They can be represented geometrically as arrows in space or algebraically as ordered lists of numbers (components or entries).
- In a two-dimensional space (R²), a vector can be represented as [x, y] or using parentheses, e.g., (4, 0) or (3, 4). The components indicate the movement along the horizontal (x) and vertical (y) axes.
- In an n-dimensional space (Rⁿ), a vector a is represented as a column of n elements: a = [A₁, A₂, …, Aₙ]ᵀ.
- The index of a vector’s elements runs from 1 to n. Notation like Aᵢ refers to the i-th element of vector A.
- Vectors can also contain other vectors as entries (nested vectors), which can be represented using capital letters (e.g., A) with arrow notation for the vector entries (e.g., A₁). In such cases, double indexing (e.g., a₁₁) is used to denote the element within a specific vector entry.
- Magnitude (Norm) and Distance:
- The norm of a vector measures its size or length. For a vector v = [v₁, v₂, …, vₙ], the norm (specifically the L₂ norm or Euclidean norm) is calculated as:
- ||v|| = √(v₁² + v₂² + … + vₙ²)
- “Take all the units that form this vector, square them, then add them, and then take the square root of that; that’s the distance, or I have to say, the norm of this vector.” For example, the norm of vector v = [3, 4] is √(3² + 4²) = √25 = 5.
- The Euclidean distance (also called the L₂ distance) between two points (vectors) A and B in n-dimensional space is the norm of the vector connecting A to B (B – A or A – B).
- d(A, B) = ||B – A|| = √((B₁ – A₁)² + (B₂ – A₂)² + … + (Bₙ – Aₙ)²)
- “The Euclidean distance between two points, say A and B, in n-dimensional space is the norm of the vector connecting A to B.” The order of subtraction inside the square doesn’t matter due to the squaring operation.
- Special Vectors:
- Zero Vector: A vector where all its components are zero. Denoted as 0 with an arrow on top (e.g., 0 in R³ is [0, 0, 0]). “The zero with an arrow on top refers to the vector like we saw before, only with the difference that all its numbers are zero.”
- Unit Vectors: Vectors with one element equal to one and all others zero. The i-th unit vector in n dimensions, eᵢ, has a 1 in the i-th position and 0 elsewhere (e.g., in R³, e₁ = [1, 0, 0], e₂ = [0, 1, 0], e₃ = [0, 0, 1]). “Vectors with a single element equal to one and all the others zero, denoted eᵢ for the i-th unit vector in n dimensions, are referred to as unit vectors.”
- Sparse Vectors: Vectors with many of their entries as zero.
- Vector Operations:
- Addition: Two vectors of the same dimension are added element-wise. “when it comes to Vector addition we just add the corresponding elements”.
- Subtraction: Similar to addition, subtraction is performed element-wise.
- Scalar Multiplication: Multiplying a vector by a scalar involves multiplying each component of the vector by that scalar, effectively scaling the vector’s magnitude. “Formally, scalar multiplication involves multiplying each component of a vector by a scalar value, effectively scaling the vector’s magnitude.”
- Properties of Vector Addition:
- Commutative: a + b = b + a. “When adding two vectors, the direction or the order is not important.”
- Associative: (a + b) + c = a + (b + c).
- Identity Element: There exists a zero vector 0 such that a + 0 = a. “Adding a zero vector to the original vector has no effect.”
- Inverse Element: For every vector a, there exists a vector -a such that a + (-a) = 0. Subtracting a vector from itself results in the zero vector. “If we take this a and subtract it from itself, so a minus a, then what we will get is the zero vector.”
- Applications of Vectors:
- Word Counting: Vectors can represent the frequency of words in a document. Each element of the vector corresponds to a word in a dictionary, and the value represents the number of times that word appears. This is fundamental in natural language processing for tasks like topic modeling and sentiment analysis; a code sketch follows this list. “A vector of length n, for instance, can represent the number of times each of the words in a dictionary of n words appears in a document.”
- Customer Purchases: Vectors can record customer purchases over time, with each element representing the quantity or dollar value of a specific item purchased.
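As a concrete illustration of the word-counting application mentioned above, here is a minimal sketch; the dictionary and document are invented for the example:

```python
# A hypothetical word-count vector: each position corresponds to one
# dictionary word, and the value is that word's frequency in the document.
dictionary = ["data", "vector", "matrix", "horse"]
document = "vector addition turns a vector and a vector into a vector".split()

counts = [document.count(word) for word in dictionary]
print(counts)  # [0, 4, 0, 0] -- a sparse vector: most entries are zero
```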
II. Span and Linear Combinations
- Linear Combination: A linear combination of vectors A₁, A₂, …, Aₙ is an expression of the form β₁A₁ + β₂A₂ + … + βₙAₙ, where β₁, β₂, …, βₙ are scalar coefficients. “A linear combination simply involves taking several vectors, and to get a linear combination we need to scale each of those vectors, which means that we need to have these different scalars.” The scalars are the “coefficients of the linear combination”.
- Span: The span of a set of vectors is the set of all possible vectors that can be formed by taking all possible linear combinations of those vectors. “The span of a set of vectors is the set of all possible linear combinations of these vectors.”
- Relationship between Span and Linear Combination: If a vector b can be expressed as a linear combination of vectors in a set V, then b lies within the span of V.
- Span Examples: The span of the zero vector is just the zero vector itself.
- The span of a single non-zero vector in R² or R³ is the line passing through the origin and the vector.
- The span of two non-parallel vectors in R² is the entire R² plane. If they are parallel, their span is a line.
- Any vector in Rⁿ can be expressed as a linear combination of the standard unit vectors e₁, e₂, …, eₙ. The coefficients of this linear combination are simply the components of the vector itself. “Any vector B in n dimensions can be expressed as a linear combination of the standard unit vectors e₁ up to eₙ; those coefficients are not just randomly picked coefficients, those are the entries of this vector B.”
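This last point, that the coefficients are exactly the entries of the vector, is easy to verify in code. A minimal Python sketch (the `standard_unit_vector` helper is our own):

```python
def standard_unit_vector(i, n):
    # e_i in R^n: 1 at position i (1-based), 0 elsewhere.
    return [1 if j == i - 1 else 0 for j in range(n)]

b = [-1, 3, 5]
n = len(b)

# Sum b_i * e_i component-wise; the coefficients are b's own entries.
recombined = [sum(b[i] * standard_unit_vector(i + 1, n)[j] for i in range(n))
              for j in range(n)]
print(recombined == b)  # True
```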
III. Matrices: Arrays of Numbers
- Definition and Representation: A matrix is a rectangular array of numbers arranged in rows and columns. An m x n matrix has m rows and n columns. Elements of a matrix A are denoted by aᵢⱼ, where i is the row index and j is the column index (starting from 1). “a matrix is nothing else than a rectangular array of numbers organized in rows and columns”.
- Special Types of Matrices:
- Identity Matrix (Iₙ): A square (n x n) matrix with ones on the main diagonal (from top-left to bottom-right) and zeros elsewhere. It is formed by combining the unit vectors as its columns (or rows). “The identity matrix Iₙ, which is a square matrix with ones on the diagonal and zeros elsewhere, is basically a matrix that is built using those unit vectors.” The determinant of an identity matrix is always 1.
- Zero Matrix: A matrix where all its elements are zero.
- Square Matrix: A matrix with an equal number of rows and columns (n x n).
- Diagonal Matrix: A square matrix where all the off-diagonal elements are zero. The identity matrix is a special case of a diagonal matrix.
- Matrix Operations:
- Scalar Multiplication: Multiplying a matrix by a scalar involves multiplying every element of the matrix by that scalar.
- Matrix Addition and Subtraction: Performed element-wise between matrices of the same dimensions.
- Matrix Multiplication: The product of an m x p matrix A and a p x n matrix B is an m x n matrix C, where each element cᵢⱼ is the dot product of the i-th row of A and the j-th column of B. Matrix multiplication is generally not commutative (AB ≠ BA).
- Properties of Matrix Operations: Scalar multiplication distributes over matrix addition, and matrix multiplication is associative and distributes over matrix addition. r(AB) = (rA)B = A(rB).
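A from-scratch implementation makes the row-by-column rule concrete. Below is a minimal Python sketch (the `matmul` helper is our own, written for clarity rather than speed, and the example matrices are invented):

```python
def matmul(A, B):
    # C[i][j] is the dot product of row i of A and column j of B.
    m, k, n = len(A), len(B), len(B[0])
    assert all(len(row) == k for row in A), "columns of A must equal rows of B"
    return [[sum(A[i][l] * B[l][j] for l in range(k)) for j in range(n)]
            for i in range(m)]

A = [[1, 0, 2],
     [0, 1, 3]]            # 2 x 3
B = [[1, 2],
     [0, 1],
     [4, 0]]               # 3 x 2
print(matmul(A, B))        # [[9, 2], [12, 1]] -- a 2 x 2 result
```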
IV. Determinants: A Scalar Value of a Square Matrix
- Definition and Importance: The determinant is a scalar value that can be computed from the elements of a square matrix. It provides important information about the matrix, such as whether the matrix is invertible and the volume scaling factor of the linear transformation represented by the matrix. “the determinant is a scalar value that can be computed from the elements of a square Matrix”.
- Calculation: 2×2 Matrix: For a matrix [[a, b], [c, d]], the determinant is ad - bc.
- 3×3 Matrix: Can be calculated using the rule of Sarrus or by cofactor expansion along any row or column. The formula involves sums and differences of products of three elements, with signs determined by the position of the elements.
- General (n x n) Matrix: Calculated using cofactor expansion recursively. The determinant is the sum (or difference) of the products of each element in a row (or column) with its corresponding cofactor, which is (-1)^(i+j) times the determinant of the submatrix obtained by removing the i-th row and j-th column.
- Properties of Determinants: The determinant of the identity matrix is 1.
- Swapping two rows or columns of a matrix changes the sign of its determinant.
- If a matrix has a row or a column of zeros, its determinant is zero.
- If a matrix has two identical rows or columns, its determinant is zero.
- Multiplying a row or column of a matrix by a scalar k multiplies the determinant by k.
- The determinant of a triangular matrix (upper or lower) is the product of its diagonal elements.
- The determinant of the transpose of a matrix is equal to the determinant of the original matrix: det(Aᵀ) = det(A).
- The determinant of a product of matrices is the product of their determinants: det(AB) = det(A) det(B).
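The cofactor-expansion definition lends itself to a short recursive implementation, which can also be used to spot-check the listed properties. A minimal sketch (the `det` helper is our own):

```python
def det(A):
    # Determinant by cofactor expansion along the first row.
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):
        # Minor: remove row 0 and column j.
        minor = [row[:j] + row[j + 1:] for row in A[1:]]
        total += (-1) ** j * A[0][j] * det(minor)
    return total

I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print(det(I3))                # 1  (det(I) = 1)
print(det([[1, 2], [3, 4]]))  # -2 (ad - bc = 4 - 6)
print(det([[3, 4], [1, 2]]))  # 2  (swapping rows flips the sign)
```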
V. Basis of a Vector Space
- Definition: A basis of a vector space is a set of linearly independent vectors that span the entire vector space. Every vector in the space can be expressed as a unique linear combination of the basis vectors. “A basis of a vector space is a set of linearly independent vectors that span the entire vector space.”
- Linearly Independent Vectors: A set of vectors is linearly independent if no vector in the set can be written as a linear combination of the others.
- Spanning the Entire Vector Space: A set of vectors spans a vector space if every vector in the space can be expressed as a linear combination of the vectors in the set.
VI. Gram-Schmidt Process: Orthonormalizing a Set of Vectors
- Purpose: The Gram-Schmidt process is an algorithm for orthonormalizing a set of linearly independent vectors in an inner product space (like Euclidean space). The output is a set of orthonormal vectors that span the same subspace as the original set.
- Orthonormal Vectors: A set of vectors is orthonormal if all vectors in the set are orthogonal (their dot product is zero) and each vector has a norm (length) of 1.
- Steps: Start with a set of linearly independent vectors {a₁, a₂, …, aₙ}.
- The first orthogonal vector v₁ is the same as the first original vector: v₁ = a₁. The first orthonormal vector e₁ is obtained by normalizing v₁: e₁ = v₁ / ||v₁||.
- For each subsequent vector aₖ (where k > 1), the orthogonal vector vₖ is obtained by subtracting its projections onto the previously computed orthonormal vectors:
- vₖ = aₖ – Σᵢ₌₁ᵏ⁻¹ projₑᵢ(aₖ) = aₖ – Σᵢ₌₁ᵏ⁻¹ ((aₖ ⋅ eᵢ) / (eᵢ ⋅ eᵢ)) eᵢ
- Since the eᵢ are orthonormal, eᵢ ⋅ eᵢ = ||eᵢ||² = 1, so the formula simplifies to:
- vₖ = aₖ – Σᵢ₌₁ᵏ⁻¹ (aₖ ⋅ eᵢ) eᵢ
- “We need to subtract its projection onto all previously computed orthogonal vectors by using this formula.”
- The orthonormal vector eₖ is then obtained by normalizing vₖ: eₖ = vₖ / ||vₖ||.
- Repeat steps 3 and 4 for all vectors in the original set.
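Putting the steps above together, here is a minimal Python sketch of the classical Gram-Schmidt process (the `gram_schmidt` helper and the input vectors are our own choices; the input is assumed to be linearly independent):

```python
import math

def gram_schmidt(vectors):
    # Orthonormalize a list of linearly independent vectors (classical form).
    basis = []
    for a in vectors:
        # Subtract the projection of a onto each already-computed e_i;
        # since the e_i are orthonormal, proj_{e_i}(a) = (a . e_i) e_i.
        v = list(a)
        for e in basis:
            coeff = sum(ai * ei for ai, ei in zip(a, e))
            v = [vi - coeff * ei for vi, ei in zip(v, e)]
        norm = math.sqrt(sum(vi * vi for vi in v))  # nonzero if input is independent
        basis.append([vi / norm for vi in v])
    return basis

e1, e2 = gram_schmidt([[3, 1], [2, 2]])
print(e1, e2)
print(sum(x * y for x, y in zip(e1, e2)))  # ~0: the outputs are orthogonal
```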
This briefing document provides a foundational overview of key linear algebra concepts discussed in the provided source material. Understanding these concepts is crucial for various applications in machine learning, data science, and other quantitative fields.
Frequently Asked Questions on Linear Algebra Concepts
1. What is the Euclidean distance and how is it related to the norm of a vector? The Euclidean distance between two points, say A and B, in an N-dimensional space is a measure of the straight-line distance between them. Mathematically, if A has coordinates (A1, A2, …, An) and B has coordinates (B1, B2, …, Bn), the Euclidean distance d(A, B) is calculated as the square root of the sum of the squares of the differences between their corresponding coordinates: √((A1 – B1)² + (A2 – B2)² + … + (An – Bn)²).
The norm of a vector, on the other hand, measures the length or magnitude of a single vector. For a vector V with components (V1, V2, …, Vn), the L2 norm (also known as the Euclidean norm or just the norm) is defined as ||V||₂ = √(V1² + V2² + … + Vn²).
The Euclidean distance between two points A and B can be seen as the norm of the vector that connects A to B. If we form a vector by subtracting the coordinates of A from B (or vice versa), the norm of this resulting vector is equal to the Euclidean distance between A and B. For example, if the vector connecting A to B is V = B – A = (B1 – A1, B2 – A2, …, Bn – An), then the norm of V, ||V||₂, is exactly the Euclidean distance d(A, B). Thus, the Euclidean distance is a specific application of the norm concept, applied to the vector representing the displacement between two points.
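A quick numerical check of this relationship, as a minimal numpy sketch (the example points are invented):

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 6.0, 3.0])

# The distance between the points equals the norm of the connecting vector.
print(np.linalg.norm(b - a))   # 5.0
print(np.linalg.norm(a - b))   # 5.0 -- the order of subtraction does not matter
```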
2. How are vectors represented, and what distinguishes them from scalars? Vectors are quantities that possess both magnitude (length) and direction. They are typically represented in a coordinate system. In a two-dimensional space (R²), a vector can be represented by its x and y components, either in parentheses (e.g., (4, 0)) or in square brackets (e.g., [3, 4]). In a three-dimensional space (R³), a vector is represented by its x, y, and z components (e.g., (a, b, c) or [a, b, c]). Generally, in an N-dimensional space (Rⁿ), a vector ‘a’ can be represented as a column vector with n entries: [A1, A2, …, An]ᵀ.
Scalars, in contrast, are quantities that have only magnitude and no direction. They are simply numbers (real numbers in the context of these sources). Examples of scalars include temperature, speed (as opposed to velocity), and mass.
The key distinction is the presence of direction. Vectors describe a movement or displacement from one point to another in space, characterized by how far to move in each dimension, while scalars just indicate an amount or quantity.
3. What are zero vectors and unit vectors, and why are they important? A zero vector is a vector in which all of its components are zero (e.g., [0, 0, 0] in R³). It has zero magnitude and no specific direction. Zero vectors are important because they act as the additive identity in vector addition (adding a zero vector to any vector does not change the vector). They are also useful in various linear algebra operations and algorithms, such as representing a null displacement or an initial state.
Unit vectors are vectors with a magnitude (length) of one. In N dimensions, the standard unit vectors, often denoted as eᵢ, have a single component equal to one at the i-th position and all other components equal to zero (e.g., e₁ = [1, 0, 0]ᵀ, e₂ = [0, 1, 0]ᵀ, e₃ = [0, 0, 1]ᵀ in R³). Unit vectors are crucial because they provide a standard basis for representing any vector in a vector space as a linear combination of these unit vectors. They are also used to indicate direction without magnitude and are essential in concepts like vector normalization and coordinate systems.
4. What does the sparsity of a vector refer to, and why is it significant? The sparsity of a vector refers to the characteristic of having many of its entries as zero. A sparse vector is one where a significant proportion of its components are zero. The sparsity pattern indicates the positions of the non-zero entries.
Sparsity is significant for several reasons, especially in high-dimensional data and computations.
- Computational Efficiency: Operations involving sparse vectors (like dot products or matrix-vector multiplications) can be performed more efficiently as we only need to process the non-zero elements. This saves time and memory.
- Data Representation: In many real-world applications, data can be naturally sparse. For example, in text analysis, a document can be represented as a vector where each component corresponds to a word in a vocabulary, and most components will be zero for words that do not appear in that document.
- Feature Selection: Sparsity is often a desirable property in machine learning models, as it can lead to feature selection (where irrelevant features have zero weights), making the models simpler, more interpretable, and less prone to overfitting.
5. How are vectors used to represent word counts in a document? Vectors can be used to represent the frequency of words in a document by creating a vector where each dimension corresponds to a unique word in a predefined vocabulary (or dictionary). The value in each dimension of the vector represents the number of times that specific word appears in the document.
For example, if our dictionary consists of the words {“word”, “row”, “number”, “horse”, “eel”, “document”}, a document could be represented by the vector [3, 2, 1, 0, 4, 2]. This vector indicates that the word “word” appears 3 times, “row” appears 2 times, “number” appears 1 time, “horse” appears 0 times, “eel” appears 4 times, and “document” appears 2 times in the document.
This representation is fundamental in natural language processing for tasks like text analysis, information retrieval, and building language models. By quantifying word occurrences, we can perform mathematical operations on text data, such as comparing documents based on their word content or identifying the topic of a document. The concept of stop words (common words like “the”, “is”, “a”) which often have high frequencies but low informational value, is also relevant in this context.
6. Explain the operations of vector addition and subtraction. Vector addition is performed element-wise. If we have two vectors, say A = [A₁, A₂, …, Aₙ]ᵀ and B = [B₁, B₂, …, Bₙ]ᵀ, both in Rⁿ, their sum C = A + B is a new vector C = [A₁ + B₁, A₂ + B₂, …, Aₙ + Bₙ]ᵀ. This means that the i-th component of the resulting vector is the sum of the i-th components of the two original vectors. Vector addition requires that the vectors have the same dimensions.

Similarly, vector subtraction is also performed element-wise. If we have two vectors A = [A₁, A₂, …, Aₙ]ᵀ and B = [B₁, B₂, …, Bₙ]ᵀ, their difference D = A – B is a vector D = [A₁ – B₁, A₂ – B₂, …, Aₙ – Bₙ]ᵀ. The i-th component of the resulting vector is the difference between the i-th components of the two original vectors. Like addition, vector subtraction also requires the vectors to have the same dimensions.
Geometrically, vector addition can be visualized using the parallelogram law or the head-to-tail method, where the resultant vector represents the diagonal of the parallelogram formed by the two vectors or the vector from the tail of the first to the head of the second. Vector subtraction A – B can be thought of as adding A to the vector -B, where -B has the same magnitude as B but points in the opposite direction.
7. What are the key properties of vector addition? Vector addition satisfies several important properties:
- Commutativity: The order of addition does not matter. For any two vectors A and B, A + B = B + A. This is because scalar addition of their corresponding components is commutative.
- Associativity: When adding three or more vectors, the way they are grouped does not affect the result. For any three vectors A, B, and C, (A + B) + C = A + (B + C). This follows from the associativity of scalar addition.
- Identity Element: There exists a zero vector (0) such that for any vector A, A + 0 = 0 + A = A. The zero vector acts as the additive identity.
- Inverse Element: For every vector A, there exists an additive inverse, denoted as -A, such that A + (-A) = (-A) + A = 0 (the zero vector). The inverse -A is obtained by negating each component of A.
These properties make the set of all vectors in a vector space under the operation of vector addition an abelian group, which is a fundamental concept in linear algebra.
8. Explain scalar multiplication of a vector and the concept of a linear combination of vectors. Scalar multiplication involves multiplying a vector by a scalar (a number). If we have a vector A = [A₁, A₂, …, Aₙ]ᵀ and a scalar c, the scalar product cA is a new vector where each component of A is multiplied by c: cA = [cA₁, cA₂, …, cAₙ]ᵀ. Scalar multiplication effectively scales the magnitude (length) of the vector by |c|. If c is negative, it also reverses the direction of the vector.

A linear combination of a set of vectors {A₁, A₂, …, Aₙ} is a vector that is obtained by multiplying each vector in the set by a scalar and then adding the results together. If c₁, c₂, …, cₙ are scalars, then a linear combination of these vectors is given by: L = c₁A₁ + c₂A₂ + … + cₙAₙ. The scalars c₁, c₂, …, cₙ are called the coefficients of the linear combination. Linear combinations are fundamental to understanding concepts like span, basis, and linear independence in vector spaces. Any vector in a vector space can often be expressed as a linear combination of a set of basis vectors for that space.
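A linear combination is a one-liner to compute. Below is a minimal sketch (the `linear_combination` helper and the example vectors are our own):

```python
def linear_combination(coefficients, vectors):
    # c1*v1 + c2*v2 + ... + ck*vk, computed component-wise.
    n = len(vectors[0])
    return [sum(c * v[j] for c, v in zip(coefficients, vectors))
            for j in range(n)]

v1, v2 = [1, 0, 2], [0, 1, 3]
print(linear_combination([3, -2], [v1, v2]))  # [3, -2, 0]
```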
Understanding Linear Combinations of Vectors
A linear combination of vectors A₁ up to Aₘ using scalars β₁ up to βₘ (also referred to as coefficients) is a vector formed by multiplying each vector by its corresponding scalar and then adding the results. Formally, it is expressed as β₁ * A₁ + β₂ * A₂ + … + βₘ * Aₘ. The scalars β₁, β₂, …, βₘ are real numbers.
Here’s a breakdown of the key concepts:
- Vectors: A linear combination involves a set of vectors, for example, A₁, A₂, up to Aₘ.
- Scalars (Coefficients): Each vector in the set is multiplied by a scalar, such as β₁, β₂, up to βₘ. These scalars are real numbers. The choice of these coefficients determines the resulting linear combination.
- Scalar Multiplication: Multiplying a vector by a scalar scales the vector’s magnitude. If the scalar is negative, it also reverses the vector’s direction.
- Vector Addition: The scaled vectors are then added together element-wise to produce the linear combination. The resulting vector has the same size as the original vectors being combined.
Examples of Linear Combinations:
Let’s consider two vectors A and B in R². Some linear combinations of A and B are:

- β₁ = 0, β₂ = 0: 0 * A + 0 * B = 0, the zero vector.
- β₁ = 1, β₂ = 1: 1 * A + 1 * B = A + B, the element-wise sum of the two vectors.
- β₁ = 3, β₂ = -2: 3 * A + (-2) * B = 3A - 2B, a scaled difference of the two vectors.
Importance of Coefficients:
The coefficients (β₁, β₂, …, βₘ) are crucial because they define how the vectors are combined. By choosing different sets of coefficients, we can obtain different linear combinations of the same set of vectors. Finding these coefficients is often the goal in many applications of linear algebra.
Linear Combination of Unit Vectors:
Any vector B in an N-dimensional space (Rⁿ) can be expressed as a linear combination of the standard unit vectors E₁ up to Eₙ. The coefficients in this linear combination are the entries of the vector B itself.
For example, in three-dimensional space (R³), a vector B = [-1, 3, 5] can be written as a linear combination of the unit vectors E₁ = [1, 0, 0], E₂ = [0, 1, 0], and E₃ = [0, 0, 1] as follows:

-1 * E₁ + 3 * E₂ + 5 * E₃ = -1 * [1, 0, 0] + 3 * [0, 1, 0] + 5 * [0, 0, 1] = [-1, 0, 0] + [0, 3, 0] + [0, 0, 5] = [-1, 3, 5] = B.
In general, for a vector B = [B₁, B₂, …, Bₙ] in Rⁿ, the linear combination is:

B₁ * E₁ + B₂ * E₂ + … + Bₙ * Eₙ = B.
Span and Linear Combinations:
The span of a set of vectors is defined as the set of all possible linear combinations of those vectors. For instance, if the span of vectors A and B is R², it means that any vector in the two-dimensional plane can be expressed as a linear combination of A and B.
In the example above, it was shown that any vector X = [X₁, X₂] in R² could be represented as a linear combination of A and B by finding appropriate coefficients β₁ and β₂. This demonstrates that the span of A and B is R².
However, if two vectors are linearly dependent (one can be written as a scalar multiple of the other, say B = 3A), their span will be a line rather than the entire R²: all their linear combinations lie on that same line.
Linear combinations are a fundamental concept in linear algebra, providing a way to understand how vectors can be combined to generate other vectors and define vector spaces and their properties like span and linear independence.
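Finding the coefficients of a linear combination amounts to solving a small linear system. A minimal numpy sketch (the vectors `v1`, `v2`, and the target `x` are invented for illustration):

```python
import numpy as np

# Two non-parallel vectors; the columns of M are the vectors.
v1, v2 = np.array([1.0, 1.0]), np.array([1.0, -1.0])
M = np.column_stack([v1, v2])

x = np.array([4.0, 2.0])      # target vector
beta = np.linalg.solve(M, x)  # coefficients with beta1*v1 + beta2*v2 = x
print(beta)                   # [3. 1.]
print(M @ beta)               # [4. 2.] -- x lies in span{v1, v2}
```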
Understanding the Span of Vector Sets
The span of a set of vectors is the set of all possible linear combinations of those vectors. If you have a set of vectors, say V = {V₁, V₂, …, Vₘ}, then the span of V, often written as span(V), encompasses every vector that can be created by taking a linear combination of V₁, V₂, …, Vₘ. This means any vector in span(V) can be expressed in the form c₁V₁ + c₂V₂ + … + cₘVₘ, where c₁, c₂, …, cₘ are any real numbers (scalars).
We can explore the span of vectors through several cases, as discussed in the sources:
- Span of the Zero Vector: If you have a set containing only the zero vector, then any linear combination of it will always result in the zero vector (c * 0 = 0). Therefore, the span of the zero vector is just the zero vector itself.
- Span of a Single Non-Zero Vector: Consider a single non-zero vector, for example, a = [1, 2]. The span of this vector consists of all its scalar multiples (c * [1, 2] = [c, 2c]). When visualized in R², these scalar multiples all lie on the same line passing through the origin in the direction of the vector a. Thus, the span of a single non-zero vector in R² is a line passing through the origin. You can move along this line by choosing different scalar values for ‘c’, but you cannot reach any point off this line.
- Span of Two Perpendicular Vectors: Let’s take two perpendicular vectors in R², such as a = [1, 0] and b = [0, 1]. The span of these two vectors is the set of all linear combinations of the form c₁a + c₂b = c₁[1, 0] + c₂[0, 1] = [c₁, c₂]. Since c₁ and c₂ can be any real numbers, this linear combination can produce any vector in R². Therefore, the span of two perpendicular vectors like [1, 0] and [0, 1] (which are the standard unit vectors E₁ and E₂) is the entire R². Any point in the 2D plane can be reached by some combination of these two vectors.
- Span of Two Vectors Where the Span is R²: Consider two non-parallel vectors V₁ and V₂. It can be shown that the span of these two vectors is also R². This means that any vector X = [X₁, X₂] in R² can be expressed as a linear combination of V₁ and V₂ (C₁V₁ + C₂V₂ = X) by finding appropriate coefficients C₁ and C₂. This was demonstrated in the source by solving for C₁ and C₂ in terms of X₁ and X₂.
- Span of Linearly Dependent Vectors: As mentioned in our previous conversation and implicitly in the source, if two vectors are linearly dependent (one is a scalar multiple of the other, say b = 3a), their span will be a line through the origin. All linear combinations of these vectors lie on the same line because you are only varying the magnitude (and possibly direction) along a single direction.
In summary, the span of a set of vectors defines the subspace that can be reached by taking all possible linear combinations of those vectors. The number of vectors and their linear independence play a crucial role in determining the nature of the span (e.g., a point, a line, a plane, or the entire vector space). If the span of a set of vectors is the entire vector space, then that set of vectors is said to span the vector space. This concept is closely linked to linear independence and the basis of a vector space. A basis is a set of linearly independent vectors that span the entire vector space.
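The dimension of the span can be computed as the rank of the matrix whose columns are the vectors. A minimal numpy sketch (the example vectors are our own, chosen to mirror the independent and dependent cases above):

```python
import numpy as np

independent = np.array([[6, 0], [0, 7]], dtype=float)  # columns span R^2
dependent   = np.array([[1, 3], [2, 6]], dtype=float)  # second column = 3x first

print(np.linalg.matrix_rank(independent))  # 2: span is all of R^2
print(np.linalg.matrix_rank(dependent))    # 1: span is a line through the origin
```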
Linear Independence in Vector Spaces
Linear independence of a set of vectors is a fundamental concept in linear algebra that describes whether any vector in the set can be expressed as a linear combination of the others. Conversely, if at least one vector in the set can be written as a linear combination of the remaining vectors, then the set is said to be linearly dependent.
Formally, a set of vectors {V₁, V₂, …, Vₘ} is linearly independent if and only if the only solution to the vector equation:

c₁V₁ + c₂V₂ + … + cₘVₘ = 0

is the trivial solution, where all the scalars (coefficients) c₁, c₂, …, cₘ are equal to zero. In other words, the only way to get a linear combination of these vectors to equal the zero vector is by setting all the coefficients to zero.

If there exists a set of scalars c₁, c₂, …, cₘ, where at least one of them is non-zero, such that the linear combination equals the zero vector, then the vectors {V₁, V₂, …, Vₘ} are linearly dependent. In this case, the non-zero coefficients indicate that at least one vector in the set can be expressed as a linear combination of the others.
Examples of Linear Dependence:
Consider two vectors a and b, where b is a scalar multiple of a (b = 3a). Then we can write the linear combination:

3a – b = 3a – 3a = 0
Here, we have a non-trivial solution (coefficients 3 and -1 are not both zero) that results in the zero vector. Thus, vectors a and b are linearly dependent. Geometrically, linearly dependent vectors in R² lie on the same line passing through the origin (they are collinear). The span of these vectors is a line, not the entire R².
Examples of Linear Independence:
- Consider two vectors a = [6, 0] and b = [0, 7]. To check for linear independence, we set up the equation:
- c₁[6, 0] + c₂[0, 7] = [6c₁, 0] + [0, 7c₂] = [6c₁, 7c₂] = [0, 0]
- This gives us two equations: 6c₁ = 0 and 7c₂ = 0. The only solution to this system is c₁ = 0 and c₂ = 0. Since the only solution is the trivial solution, vectors a and b are linearly independent. Geometrically, these perpendicular vectors span the entire R².
- Consider the standard unit vectors in R³: E₁ = [1, 0, 0], E₂ = [0, 1, 0], and E₃ = [0, 0, 1]. The equation for linear independence is:
- c₁[1, 0, 0] + c₂[0, 1, 0] + c₃[0, 0, 1] = [c₁, 0, 0] + [0, c₂, 0] + [0, 0, c₃] = [c₁, c₂, c₃] = [0, 0, 0]
- This directly implies that c₁ = 0, c₂ = 0, and c₃ = 0. Therefore, the standard unit vectors in R³ are linearly independent. Furthermore, their span is the entire R³.
Linear Independence, Span, and Basis:
As discussed in our previous conversation, the concept of linear independence is crucial for understanding the span of a set of vectors and the basis of a vector space.
- A basis of a vector space is a set of vectors that satisfies two conditions:
- The vectors must be linearly independent.
- The vectors must span the entire vector space.
For example, the standard unit vectors E₁ and E₂ form a basis for R² because they are linearly independent and their span is R². Similarly, E₁, E₂, and E₃ form a basis for R³.
If a set of vectors spans a vector space but is linearly dependent, it means that there is some redundancy in the set (one or more vectors can be removed without reducing the span) and it is not a basis. To form a basis, you need a minimal set of linearly independent vectors that can generate all other vectors in the space through linear combinations.
In the example of the linearly dependent vectors a and b (with b = 3a), they span a line in R², but they do not form a basis for R² because they are not linearly independent. A basis for the line they span could be just the vector a (or b), as either is a linearly independent set that spans that specific line.
Understanding linear independence is essential for many concepts in linear algebra, including finding the dimension of a vector space, performing matrix operations, and solving systems of linear equations.
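For n vectors in Rⁿ, independence can be tested with a determinant, tying this section to the earlier one on determinants. A minimal sketch (the `independent` helper is our own; the test vectors echo the examples above):

```python
import numpy as np

def independent(vectors):
    # For n vectors in R^n: independent iff the matrix with these vectors
    # as columns has nonzero determinant.
    M = np.column_stack(vectors)
    return not np.isclose(np.linalg.det(M), 0.0)

print(independent([np.array([6.0, 0.0]), np.array([0.0, 7.0])]))  # True
print(independent([np.array([1.0, 2.0]), np.array([3.0, 6.0])]))  # False: b = 3a
```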
Fundamental Matrix Operations
The sources discuss several fundamental matrix operations, which are essential for working with matrices in linear algebra. These operations include matrix addition, matrix subtraction, scalar multiplication, and matrix multiplication.
Matrix Addition
The sum of two matrices A and B of the same dimensions is obtained by adding their corresponding elements. If A and B are both M by N matrices, their sum, denoted as A + B, will also be an M by N matrix where each element (i, j) is the sum of the elements (i, j) of A and B.
For example, if matrix A is $\begin{pmatrix} 1 & 0 & 2 \\ 0 & 1 & 3 \\ 1 & 0 & 1 \end{pmatrix}$ and matrix B is $\begin{pmatrix} 1 & 2 & 3 \\ 0 & 0 & 1 \\ 2 & 1 & 3 \end{pmatrix}$, then their sum A + B is calculated element-wise:

$\begin{pmatrix} 1+1 & 0+2 & 2+3 \\ 0+0 & 1+0 & 3+1 \\ 1+2 & 0+1 & 1+3 \end{pmatrix} = \begin{pmatrix} 2 & 2 & 5 \\ 0 & 1 & 4 \\ 3 & 1 & 4 \end{pmatrix}$.
The order of addition does not matter (A + B = B + A), which is known as the commutative law for matrix addition. Also, matrix addition is associative, meaning (A + B) + C = A + (B + C). Adding a zero vector to a vector has no effect, and similarly, adding a zero matrix (a matrix where all elements are zero) to another matrix A results in A.
Matrix Subtraction
The difference of two matrices A and B of the same dimensions is obtained by subtracting their corresponding elements. If A and B are both M by N matrices, their difference, denoted as A – B, will also be an M by N matrix where each element (i, j) is the difference between the elements (i, j) of A and B. Subtracting a vector from itself results in the zero vector, and similarly, subtracting a matrix from itself results in the zero matrix.
Scalar Multiplication
Scalar multiplication of a matrix A by a scalar (a real number) $\alpha$ results in a new matrix where each entry of A is multiplied by $\alpha$, effectively scaling every entry of the matrix. If A is an M by N matrix, then $\alpha$A is also an M by N matrix where each element (i, j) is $\alpha$ times the element (i, j) of A.
For example, if matrix A is $\begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix}$ and the scalar is 3, then 3A is:

$3 \times \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix} = \begin{pmatrix} 3 \times 1 & 3 \times 2 & 3 \times 3 \\ 3 \times 4 & 3 \times 5 & 3 \times 6 \end{pmatrix} = \begin{pmatrix} 3 & 6 & 9 \\ 12 & 15 & 18 \end{pmatrix}$.
The scalar multiplication law for matrices states that for a scalar R and matrices A and B, R(AB) = (RA)B = A(RB).
Matrix Multiplication
The product of an M by N matrix A and an N by P matrix B results in an M by P matrix C, where each entry $c_{ij}$ of C is computed as the dot product of the i-th row of A and the j-th column of B. For matrix multiplication to be defined, the number of columns in the first matrix (N) must be equal to the number of rows in the second matrix (N). The resulting matrix C will have the same number of rows as A (M) and the same number of columns as B (P).
To calculate the element $c_{ij}$, we take the i-th row of A, which consists of elements $a_{i1}, a_{i2}, …, a_{in}$, and the j-th column of B, which consists of elements $b_{1j}, b_{2j}, …, b_{nj}$. The dot product is then calculated as:
$c_{ij} = a_{i1}b_{1j} + a_{i2}b_{2j} + … + a_{in}b_{nj} = \sum_{k=1}^{n} a_{ik}b_{kj}$.
For example, if $A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$ and $B = \begin{pmatrix} 2 & 0 \\ 1 & 2 \end{pmatrix}$, their product C = AB is a 2×2 matrix:
$c_{11} = (1 \times 2) + (2 \times 1) = 4$ $c_{12} = (1 \times 0) + (2 \times 2) = 4$ $c_{21} = (3 \times 2) + (4 \times 1) = 10$ $c_{22} = (3 \times 0) + (4 \times 2) = 8$
So, $AB = \begin{pmatrix} 4 & 4 \\ 10 & 8 \end{pmatrix}$.
Unlike scalar multiplication and matrix addition, matrix multiplication is generally not commutative (AB $\neq$ BA). However, it is associative: (AB)C = A(BC). Matrix multiplication also follows the distributive law with respect to matrix addition: A(B + C) = AB + AC (left distribution) and (A + B)C = AC + BC (right distribution).
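The worked example and the non-commutativity claim are easy to verify with numpy:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[2, 0], [1, 2]])

print(A @ B)   # [[ 4  4] [10  8]] -- matches the worked example
print(B @ A)   # [[ 2  4] [ 7 10]] -- different result: AB != BA
```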
Understanding these basic matrix operations is crucial for more advanced topics in linear algebra, such as solving systems of linear equations, finding determinants, and performing matrix factorization.
Properties of Determinants
Based on the sources, here are the key properties of determinants:
- The determinant of an identity matrix is one. The source explicitly states that if you have an identity matrix, denoted as $I_n$ for an n x n identity matrix, then its determinant, det($I_n$), is equal to 1. A specific example of a 2×2 identity matrix ($I_2 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$) is given where its determinant is calculated as $(1 \times 1) – (0 \times 0) = 1$.
- Swapping two rows or columns of a matrix changes the sign of its determinant. If you take a matrix A and create a new matrix $A’$ by interchanging two of its rows or two of its columns, then the determinant of $A’$ is the negative of the determinant of A, i.e., det($A’$) = -det(A). The source refers to this row- or column-swapped version of A simply as $A’$.
- If a matrix has a row or a column of zeros, its determinant is zero. The source explains that if a matrix contains a row or a column where all the elements are zero, then the determinant of that matrix is zero. An example is given with a 2×2 matrix A = $\begin{pmatrix} 0 & B \\ 0 & D \end{pmatrix}$ where its determinant is $(0 \times D) – (0 \times B) = 0$. Similarly, a matrix B with a column of zeros, like $\begin{pmatrix} 1 & 0 & 3 \\ 1 & 0 & 4 \\ 1 & 0 & 5 \end{pmatrix}$, will have a determinant of zero. This can be understood through the calculation method where if you expand along a row or column of zeros, each term in the determinant calculation will have a factor of zero, resulting in a total determinant of zero.
- The determinant of a product of matrices equals the product of their determinants. For two square matrices A and B of the same size, the determinant of their product (AB) is equal to the product of their individual determinants: det(AB) = det(A) $\times$ det(B). The source provides a detailed example with two 2×2 matrices A = $\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$ and B = $\begin{pmatrix} 5 & 6 \\ 7 & 8 \end{pmatrix}$. First, the product AB is calculated as $\begin{pmatrix} 19 & 22 \\ 43 & 50 \end{pmatrix}$, and its determinant is found to be $(19 \times 50) – (43 \times 22) = 4$. Then, the determinant of A is calculated as $(1 \times 4) – (2 \times 3) = -2$, and the determinant of B is $(5 \times 8) – (7 \times 6) = -2$. Finally, it is shown that det(A) $\times$ det(B) = $(-2) \times (-2) = 4$, thus verifying the property for this example.
These properties are fundamental for understanding and working with determinants in linear algebra. The determinant provides crucial information about a matrix, such as whether it is invertible.
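The product property can be verified numerically for the matrices used above:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]], dtype=float)
B = np.array([[5, 6], [7, 8]], dtype=float)

print(np.linalg.det(A @ B))                 # ~4.0
print(np.linalg.det(A) * np.linalg.det(B))  # (-2) * (-2) = 4.0
```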
The Original Text
machine learning is at the Forefront of the Innovation powering the most advanced and transformative systems for the companies like apple Tesla Netflix Amazon open Ai and many others it enables the creation of the intelligent systems that can predict Trends personalized user experience and automate complex tasks to develop this practical applications a deep understanding of the underlying mechanics is important this require is a solid grasp of mathematics behind the machine learning so all these technical details with a particular focus on linear algebra this all-encompassing course explores the linear algebra in an interactive and machine learning Focus manner welcome to the linear algebra for machine learning course you will acquire the critical principles needed to build optimize and analyze sophisticated machine learning models from designing algorithms to enhancing curent Technologies this course provides the mathematical foundations with vital interest for those for pioneering advancements in machine learning for those dedicated to mastering the mathematical aspect and the technical details behind machine learning our extensive 26 plus hour course of fundamentals of machine learning within the mathematics boot camp as well as a separate course offers an in-depth exploration this extensive program includes certification and is tailored for individuals that are serious about advancing their career in the field of machine learning nii engineering this crash course in mathematics will serve you as a great starting point by establishing a robust foundation in linear algebra you will be well prepared to excel as machine learning practitioner equipped with a mathematical knowledge that drives the innovation and efficiency in this field so if you’re ready I’m really excited and without further Ado let’s get started welcome to the course on the fundamentals of linear algebra presented by lunch Academy my name is D Vasan and today we are going to start with some basic concepts that are important for understanding linear algebra linear algebra is one of the most applicable areas of mathematics it is used by pure mathematicians that you will see in universities doing research publishing research papers but also by the mathematically trained scientists of all disciplines this is really one of those areas in mathematics that you will see time and time again appearing in your professional life if you want to become a job ready data scientist or you want to do some handson machine learning deep learning nii stuff but also linear algebra is used in cryptology it is used in cyber cber security and in many other areas of computer science and artificial intelligence so if you want to become this well-rounded professional you want to go beyond using libraries and you want to truly understand the uh mathematics and the technical side of these different machine learning algorithms from very basic ones like linear regression to most complex ones coming from Deep learning like architectures in neural network how the optimization algorithms work how the grad descent Works in all these other different methods and models then you are in the right place because you must know linear algebra such that you will understand these different concepts from very basic ones to most advanced ones in data science machine learning deep learning artificial intelligence data analytics but also in many other applied science disciplines so before starting this comprehensive course that will give you everything that you need to 
Before starting this comprehensive course, which will give you everything you need to know about linear algebra, first I'm going to tell you what we assume you already know. Linear algebra typically appears around the third year of a bachelor's degree in highly technical fields, and here we assume you already know certain concepts. To ensure that this course stays on the topic of linear algebra and that you understand the material really well, let's familiarize ourselves with the basic prerequisites and notation used throughout the course. You will really need these in order to follow along, so that instead of memorizing, you only need to hear something once or maybe twice, and then every time you see it later, in papers or in algorithms, you will recognize it as something we already learned.

Here is an overview of the key prerequisites. First of all, to fully grasp the upcoming material, you should be familiar with basic concepts like real numbers and vector spaces. You don't need to know vectors formally yet, though you are most likely already familiar with the idea, given that you know how to plot lines and graphs with x's and y's; every time we come close to these concepts I will refresh your memory. You should also know the ideas of norms and distance measures, because when it comes to vectors, magnitudes, and the other topics we will discuss in linear algebra, knowing what a norm is and how distance is defined, the length between two points plotted in two- or three-dimensional space, is essential. These are very basic concepts usually covered in pre-algebra or common algebra courses.

To truly understand what linear algebra is about, to understand the direction of vectors, angles, and later dimensionality reduction, and to see how linear algebra is applied in different algorithms in machine learning, deep learning, data science, and statistics, you really need to understand the Cartesian coordinate system. This is not only important for linear algebra, and I assume you already know it, given that you have passed courses like calculus; it is usually covered in pre-algebra or algebra. By the Cartesian coordinate system I mean its common description: x on the horizontal axis, y on the vertical axis, zero at the origin, and the ability to plot graphs. We have a clear understanding of what the y = x line is, and we understand how, knowing certain points, we can draw it: on the y = x line, if x is one then y is one, and if x is two then y is two. So given the equation of a line and the value of one coordinate, whether y or x, the corresponding coordinate can be found. You also need some basics I haven't explicitly mentioned: for instance, that the numbers along an axis run 1, 2, 3, up to infinity, so you understand the concept of infinity, and the same story in the negative direction with minus one, minus two, and so on.
We will come back to this when we describe vectors and visualize them in two-dimensional space, with x and y, and of course we can also visualize them in three dimensions and so on. So this idea of a basic coordinate system is really important, and it is usually covered in algebra, if not pre-algebra.

Then we have basic trigonometry, which means you need a clear understanding of what sine, cosine, and tangent are, and their reciprocals. By this I mean that you know, for instance, what the cosine function and the sine function look like, you can tell whether a given curve is a sine curve or a cosine curve, and you understand what π is. One thing that runs through all of these topics: you should understand what x and y are, why we use them, and the idea of variables. You also need to understand the idea of a square angle, a 90° angle, and the Pythagorean theorem: the relationship between the sides of that very particular triangle which has one 90° angle, how the sides relate to sine, cosine, tangent, and cotangent, how the Pythagorean theorem generalizes to a triangle that no longer has a 90° angle, and what the sum of the angles of a triangle is. These basics are commonly covered in trigonometry lessons or in general geometry. Another prerequisite is an understanding of trigonometric identities and equations, part of which I have already touched on; this comes down to a basic grounding in algebra and geometry, which is super important for understanding the more advanced techniques of linear algebra.

Finally, we have the idea of orthogonality, or perpendicularity, of vectors, which also comes from geometry and trigonometry. If two lines intersect at a right angle, we are talking about orthogonal, or perpendicular, lines. By contrast, two parallel lines never intersect: you won't find any point that is common to the two.

Now, about $\mathbb{R}$: in the context of real numbers and vector spaces, $\mathbb{R}$ represents the set of all real numbers. It covers the integers like 1, 2, 3, all the negative numbers like -1, -2, -3, and also the floating-point numbers like 1.
223, and every other number you can think of; together these form the set of all real numbers. This is a one-dimensional space: notice that each element is just a single number.

Then we have the idea of $\mathbb{R}^2$, $\mathbb{R}^3$, up to $\mathbb{R}^n$, where $n$ denotes the $n$-dimensional Euclidean space. By $\mathbb{R}^2$ we just mean the 2D plane. I'm pretty sure you are familiar with the idea of an x-axis and a y-axis; in that two-dimensional plane, every point can be described by assigning it an x coordinate and a y coordinate. That is exactly what we mean when we say a point can be represented in a 2D plane: this is our two-dimensional Euclidean space, and every point belonging to $\mathbb{R}^2$ can be pictured in this visualization. For instance, take a point whose value on the x-axis is two and whose corresponding y is zero. I can describe this point, which I will call $a$, by writing down first the x coordinate, which is 2, and then the y coordinate, which is 0. I am then saying that $a$, the point with x coordinate 2 and y coordinate 0, is part of $\mathbb{R}^2$, part of the two-dimensional Euclidean space.

With $\mathbb{R}^3$ we can do a similar thing, only in that case we need not just an x-axis and a y-axis but a third dimension, a z-axis, so that every point in the space can be described by x, y, and z coordinates. If we write it as a vector, something we will see very soon in the first unit of this course, we represent every point in this three-dimensional Euclidean space by writing down first the x coordinate, say 1, then the y coordinate, say another 1, and then the z coordinate, say 1; or, even easier, take (0, 0, 0), which is the origin, the center of the three-dimensional Euclidean space. Higher-dimensional, $n$-dimensional spaces are much harder to visualize, so we usually only draw one-, two-, and three-dimensional spaces; above that it no longer makes sense to visualize them, but we definitely work with them, and they are part of applied linear algebra. Understanding these spaces is very important for analyzing vectors and their interactions, and this holds not just in two and three dimensions but in multi-dimensional spaces generally.
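To make this concrete, here is a minimal NumPy sketch (NumPy itself is my choice of tool here, not something the lecture introduces) representing points in $\mathbb{R}^2$, $\mathbb{R}^3$, and $\mathbb{R}^6$ as arrays whose length is the dimension of the space:

```python
import numpy as np

# A point in R^2: an ordered pair (x, y). This is the point a = (2, 0) from the text.
a = np.array([2.0, 0.0])

# A point in R^3 needs a third, z, coordinate; the origin is (0, 0, 0).
origin = np.array([0.0, 0.0, 0.0])

# R^n for n = 6: impossible to draw, but just as easy to compute with.
v = np.array([1.0, -2.0, 0.5, 3.0, 0.0, 4.0])

print(a.shape, origin.shape, v.shape)  # (2,) (3,) (6,)
```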
Let's now quickly define the idea of a norm. The norm of a vector $\vec{v}$ is denoted $||\vec{v}||$, which looks similar to the absolute value from pre-algebra: we have double straight bars, like absolute value bars, and between them the name of the vector, the variable we assign to it. You might also notice the arrow on top: it says that we are dealing not with a plain variable but with a vector. This is really important, because it makes a big difference whether we write $v$, $v_1$, or $\vec{v}$; keeping vectors apart from scalars and points is something you need to keep in mind throughout linear algebra. The norm can be written with single bars or, as is common in machine learning and data science, with double bars, and when we use that notation we usually mean the Euclidean length, also called the L2 norm. This is very common: the L2 norm appears in ridge regression, which is an application of linear algebra, and in regularization, where we regularize our machine learning algorithms. When you get into machine learning you will see this notation time and time again, so next time you see it you will automatically know you are dealing with the L2 norm. Ridge regression, or L2 regularization, is a very popular regularization technique in machine learning, so already here you can see the intersection of linear algebra, through the idea of norms, with machine learning.

Now let's see why we actually call it the L2 norm, often referred to as the Euclidean length. Assuming the vector comes from an $n$-dimensional space, $\vec{v} \in \mathbb{R}^n$, the norm of $\vec{v}$ is $||\vec{v}||_2 = \sqrt{v_1^2 + v_2^2 + \dots + v_n^2}$. In words: take all the components that form the vector, square them, add them up, and take the square root of the sum; that is the norm of the vector.

Why is this idea of norms and Euclidean distance important, beyond its use in machine learning, and why is it the usual choice? Norms provide a way to measure the size or length of a vector in a vector space, which means that when we want to measure a distance, a similarity, a relationship, for instance between vectors, this idea makes it much easier. Euclidean distance is not only used in regularization techniques like L2 regularization and ridge regression; it is also used in other machine learning and deep learning algorithms as a way to measure the distance, relationship, or similarity between two entities, whether those are variables, two people we want to compare inside an algorithm, or any other pair of entities. For instance, Euclidean distance is also used in the k-means algorithm, something you might have heard of; if you later follow the clustering section of a machine learning course, you will see that Euclidean distance is used in k-means, which aims to cluster observations into groups. So this is yet another highly applicable topic that you must know in order to understand both linear algebra topics and machine learning topics.
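As a quick illustration of the formula, here is a minimal sketch computing the L2 norm of a vector both from the definition and with NumPy's built-in norm (the example vector is mine, not from the lecture):

```python
import numpy as np

v = np.array([3.0, 4.0, 12.0])

# L2 norm by the definition: sqrt(v1^2 + v2^2 + ... + vn^2)
norm_manual = np.sqrt(np.sum(v ** 2))

# The same quantity via NumPy's built-in (the Euclidean norm is the default)
norm_builtin = np.linalg.norm(v)

print(norm_manual, norm_builtin)  # 13.0 13.0
```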
Let's now quickly refresh our memory on one more simple topic before moving to the next prerequisite. The Cartesian coordinate system is just a fancy name for the familiar idea of x and y (or x, y, z) that we use when we want to visualize numbers in relation to space. We just touched on this idea of x and y and how to visualize things in a plane. The Cartesian coordinate system is a framework for specifying points in a plane or in space using ordered lists of numbers. When we plot, we put x on the horizontal axis and y on the vertical axis of our two-dimensional space, with zero in the middle and then 1, 2, 3, 4 along each axis; everyone in the industry, whether in mathematics, physics, data science, ML, or AI, universally agrees on this system. Because a point is an ordered list of numbers, if it lies in the upper-right region we know its x and y coordinates are definitely positive, even if we don't know the exact values; and once we have specific gridlines we can even read off the values, knowing, say, that the x coordinate must lie between two and three, since first comes two and then three and not the other way around. This ordered nature helps us organize all these numbers in two-dimensional space. Likewise for the corresponding y: we know that y is not minus three when the point lies in the upper part of the coordinate system rather than where the y values are negative. And why do we know all of this? Because it is an ordered list of numbers that we can visualize in the 2D plane. You should also remind yourself of the four parts, the four quadrants, of the coordinate system: the first, the second, the third, and the fourth. Here we are dealing with a two-dimensional plane, but in three dimensions we no longer have just the x-axis on the horizontal and the y-axis on the vertical; we add a third axis, z, giving us three dimensions, x, y, and z, and extending the two-dimensional plane to three dimensions.

This system is fundamental for visualizing and working with vectors geometrically. We can use the two-dimensional plane to visualize a vector: to see the points that lie along it, its direction, where it is headed, where it begins. We can also read off the relationships between vectors; for instance, from their coordinates and the information about them we can tell that two vectors are parallel, with no intersection points and nothing in common, or that two vectors are perpendicular, that is, orthogonal. That is why the Cartesian coordinate system is important, and not just for linear algebra but for mathematics, data science, and AI in general. You will see this coordinate system time and time again in different visualizations: when you visualize the mean of your data, the probability distribution function describing your population in statistics or data science, how your optimization is progressing, or how your model is performing in terms of its evaluation metrics. For all these cases, and for any visualization, the Cartesian coordinate system will come in very handy.
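If you want to reproduce this kind of picture yourself, here is a small matplotlib sketch (matplotlib is my choice here, not the course's) drawing the y = x line, one sample point, and the two axes that separate the four quadrants:

```python
import numpy as np
import matplotlib.pyplot as plt

# The y = x line and a sample point, drawn in the Cartesian plane
x = np.linspace(-3, 3, 100)
plt.plot(x, x, label="y = x")
plt.scatter([2], [0], color="red", label="point (2, 0)")

# Axes through the origin make the four quadrants visible
plt.axhline(0, color="gray", linewidth=0.8)
plt.axvline(0, color="gray", linewidth=0.8)

plt.gca().set_aspect("equal")
plt.legend()
plt.show()
```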
Let's now talk about angles, along with the ideas of circles, radians, π, and the degree sign. These usually come from geometry or trigonometry, and they are very important for vectors, because when we have two different vectors we want to understand their relationship. Do they form an angle of less than 90°, a sharp, acute angle? Are we dealing with a 90° angle? Or is the angle between them 180°, which, by the way, is what we refer to as π? One important point: it's not just π, it's π radians, because in mathematics we also have the number π, which is approximately 3.14, and we should not confuse the number π with π radians. The relationship between the two is something we saw in pre-algebra and algebra courses; if you want to refresh your memory, it would be very helpful to check out an introductory course on these basics. The number π itself comes up in pre-algebra, while the idea of π radians, and in general what 180° and 360° mean, comes from trigonometry and geometry and can be found in our corresponding course.

The next topic is the unit circle. The unit circle is highly related to radians, degrees, cosine, and sine, and understanding the Cartesian coordinate system will also help you understand it. It too comes from trigonometry and geometry, and it is basically a fancy way of saying: take our usual Cartesian coordinate system, with an x-axis, a y-axis, and zero at the origin, and focus on the part of the plane between -1 and 1 on the x-axis and between -1 and 1 on the y-axis; then draw the circle of radius one centered at the origin. This circle helps us understand the concepts of sine and cosine. Theta, $\theta$, is just the variable we use to describe the angle: we might be dealing with 45°, with a 90° angle, with the entire circle of 360°, or with half of it, 180°. You might have already guessed that the name "unit circle" refers to the fact that the radius is one unit in every direction, forming the whole circle of radius equal to one. This is all easy material that comes from geometry and trigonometry.

You also need to understand sine and cosine and how they relate to this picture: what do we mean by sine and cosine, and what are these marked points? Take the point (1, 0), for instance: there, x equals 1 and y equals 0, so that point is simply (1, 0). And then we have 2π radians; so what is this idea of π?
π radians is simply 180°, which means you should also understand π/2 radians, which is simply 90°. One thing I forgot to mention: you need to understand the relationship between π radians and the unit circle. On the unit circle, π/2 is the quarter-turn angle, π is the half-turn, and the entire angle of 360°, all the way around, is 2π; so 2π radians is the full circle. These are very easy concepts from geometry and trigonometry, and if you want to refresh them, head to those courses, because they will help you build up these ideas from scratch.

Let's now continue our refresher with trigonometric identities. We just spoke about the unit circle and about sine and cosine; it's important to relate this to slightly more advanced topics from the same domain, the same area of mathematics. A few things that would be really great to know, though not strictly a must for this course: the Pythagorean identity (don't confuse this with the Pythagorean theorem), which says that the square of the sine of an angle plus the square of the cosine equals one, $\sin^2\theta + \cos^2\theta = 1$; and the various rules around sine and cosine, such as the double-angle formula $\sin 2\theta = 2\sin\theta\cos\theta$. These are handy rules, and if you have come this far I assume you also know geometry and the fundamentals of trigonometry, which means you know these rules too; still, this might be a great time to go and quickly refresh your memory, because they can come in handy on your applied linear algebra and applied mathematics journey. For now, I would say this is not among the most important requirements for getting through this course, just something to keep in mind.

Trigonometric equations can also become very handy later on, when we want to prove something in linear algebra. To follow along, it is a good idea to know how to solve such equations, which refers back to the unit circle we just saw. For instance, if $\sin\theta = 1/2$, you need to quickly recall the angle whose sine equals 1/2; you then realize it is $\pi/6$, and remembering that π radians equal 180°, $\pi/6$ is simply 180° divided by 6, which is 30°. These are the kinds of things you can do once you have memorized the sine and cosine values for the common angles: what the sine and cosine are at 30°, at 60°, and so on. This type of problem becomes very easy to solve when we keep those values in mind.
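Here is a minimal numerical sketch, assuming NumPy, of the three facts just discussed: the degree/radian conversion, the solution of sin θ = 1/2, and the Pythagorean identity:

```python
import numpy as np

# pi radians = 180 degrees, so converting is a single ratio
print(np.degrees(np.pi))   # 180.0
print(np.radians(30.0))    # 0.5235... which is pi/6

# Solving sin(theta) = 1/2: arcsin returns the angle in radians
theta = np.arcsin(0.5)
print(theta, np.degrees(theta))  # 0.5235... and 30.0

# The Pythagorean identity sin^2 + cos^2 = 1 holds for any angle
t = 1.234
print(np.sin(t) ** 2 + np.cos(t) ** 2)  # 1.0
```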
Let's go through those values. Remember that on the unit circle, cosine and sine are the x and y coordinates of the point on the circle. Take the angle equal to 0°: if $\cos\theta = 1$ and $\sin\theta$, the y value, equals 0, we are dealing with the point (1, 0), and you can see that the angle there is zero. So the coordinates (1, 0) give us the cosine of the zero angle and the sine of the zero angle: even just from the picture we can see very easily that $\sin 0° = 0$ and $\cos 0° = 1$.

Let's quickly refresh our memory on a few other angles as well. For 30°, which is simply $\pi/6$, the sine, the y value, is $1/2$, and the cosine, the x value, is $\sqrt{3}/2$; looking at that corner of the unit circle, the coordinates make sense. Then we have another famous value, corresponding to 45°, which is $\pi/4$: for this angle the cosine, the x value, equals $1/\sqrt{2}$, that is, $\sqrt{2}/2$, and the sine, the y coordinate, also equals $1/\sqrt{2}$. As you might have guessed, for this point the x and y values are the same, because the 45° line splits the right angle evenly, so the two legs of the triangle are equal, which is something you can confirm using the Pythagorean theorem. You can then refresh your memory for 60°, by which I mean $\pi/3$ (sine $\sqrt{3}/2$, cosine $1/2$), and for 90°, which is the very easy case: the x value, the cosine, equals 0, and the y value, the sine, equals 1, and so on.

We went into quite some detail here, but I think this is a very important topic: knowing trigonometric equations and identities and the unit circle is super important, because they are highly applicable across artificial intelligence, data science, and machine learning, and they will definitely set you apart.
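To double-check the memorized values above numerically, here is a short sketch, again assuming NumPy:

```python
import numpy as np

# (degrees, sin, cos) for the standard unit-circle angles discussed above
standard = [
    (0,  0.0,            1.0),
    (30, 0.5,            np.sqrt(3) / 2),
    (45, np.sqrt(2) / 2, np.sqrt(2) / 2),
    (60, np.sqrt(3) / 2, 0.5),
    (90, 1.0,            0.0),
]

for deg, s, c in standard:
    t = np.radians(deg)
    assert np.isclose(np.sin(t), s) and np.isclose(np.cos(t), c)

print("all standard sine/cosine values check out")
```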
All right, let's now talk about the law of sines and the law of cosines. I won't go into too much detail here; I just want to show them to you quickly, and if you want the proofs, definitely check out our corresponding courses. The law of sines says that if you have a triangle with angle A opposite side a, angle B opposite side b, and angle C opposite side c, then $a/\sin A = b/\sin B = c/\sin C$: take each side and divide it by the sine of the angle directly opposite it, and the three ratios are equal. The proof of this law is outside the scope of this course, but knowing it will help you understand other concepts. The law of cosines says: pick a target angle, say angle C, just randomly picking one of the three, and take the side directly opposite it, c, along with the two sides a and b that form that angle. Then $c^2 = a^2 + b^2 - 2ab\cos C$. That is what we call the law of cosines, quite easy; again we are not going to prove it, and if you want the proofs, make sure to check our other courses on geometry and trigonometry.

We are almost done with the prerequisites; just a quick refresher. We already saw the norm; here is a worked example of what the norm is for a specific two-dimensional vector. Suppose a vector equals (3, 4), meaning that in the first dimension, say along the x-axis, we have 3, and along the y-axis we have 4. Then the norm, the Euclidean length, is computed as follows: this is the case where $n = 2$, so the formula is simply $\sqrt{v_1^2 + v_2^2}$. With $v_1 = 3$ and $v_2 = 4$, the norm of this vector is $\sqrt{3^2 + 4^2} = \sqrt{25} = 5$.

Let's now see the difference between Euclidean distance and the norm. With the norm we have just one vector, like the one above with its two component values in two-dimensional space, just the 3 and the 4, that is, $v_1$ and $v_2$. The Euclidean distance is a kind of generalization of this idea: the Euclidean distance between two points a and b in $\mathbb{R}^n$, the n-dimensional space, is the norm of the vector connecting a to b. So the norm and the Euclidean distance are highly related; we speak of the norm for a single vector, but when we have a vector a and a vector b, the distance between them is the Euclidean distance. The notation looks very similar to what we saw for the norm. Euclidean distance tells you the distance between two points in n-dimensional space: if we have a point a and a point b, and we connect them, forming the vector from a to b, then the Euclidean distance is simply the norm of that connecting vector. So in the Euclidean distance we are using the idea of a norm, specifically the norm two, as I mentioned before. The definition: the distance between the two points a and b equals $\sqrt{(a_1 - b_1)^2 + (a_2 - b_2)^2 + (a_3 - b_3)^2 + \dots + (a_n - b_n)^2}$, where the dots cover all the components in between. What this means is that if we have two points a and b and we know all their coordinates, $a_1, b_1, a_2, b_2, a_3, b_3$, up to $a_n, b_n$, we take them and use them to calculate the Euclidean distance.
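Here is a quick numerical sketch of two things from above: the 3-4-5 norm computation, and the law of cosines (the right-angle check at the end is my own illustration, not from the lecture):

```python
import numpy as np

# The 3-4-5 example: the norm of (3, 4) by the formula and by the built-in
v = np.array([3.0, 4.0])
print(np.sqrt(v[0] ** 2 + v[1] ** 2))  # 5.0
print(np.linalg.norm(v))               # 5.0

# Law of cosines sanity check: with C = 90 degrees, cos C = 0 and the
# formula collapses to the Pythagorean theorem, giving c = 5 for a = 3, b = 4
a, b, C = 3.0, 4.0, np.pi / 2
c = np.sqrt(a ** 2 + b ** 2 - 2 * a * b * np.cos(C))
print(c)  # 5.0, up to floating-point rounding
```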
For instance, let's do one quick specific example with points a and b. Suppose point a has coordinates (1, 2), so these are $a_1$ and $a_2$, and point b likewise has coordinates $b_1$ and $b_2$, here (4, 6). Then $d(a, b)$, the distance, the Euclidean distance between these two points, which equals the norm of the vector connecting a and b, is $\sqrt{(4 - 1)^2 + (6 - 2)^2}$: take $b_1$ minus $a_1$ and square it, add $(b_2 - a_2)^2$, and take the square root of the sum, which equals 5. Now you might be wondering: hey, why do we compute $(b_1 - a_1)^2$ here instead of $(a_1 - b_1)^2$? The answer lies in the properties we learn in pre-algebra: it doesn't matter whether we take $(a_1 - b_1)^2$ or $(b_1 - a_1)^2$, because the squaring ensures the order of subtraction is irrelevant. The proof of that is outside the scope of this course, it is part of pre-algebra, but I wanted to point it out so that the notation is consistent with what you see here: the definition says $a_1 - b_1$, while in this example we take $b_1$ and subtract $a_1$. This is a common thing in pre-algebra and in distance calculations generally, and I flag it so that later on it is clear at first sight.
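A minimal sketch of this worked example, assuming NumPy; note that b = (4, 6) is inferred from the computation $\sqrt{(4-1)^2 + \dots} = 5$, since the lecture does not state b's second coordinate explicitly:

```python
import numpy as np

a = np.array([1.0, 2.0])
b = np.array([4.0, 6.0])  # inferred: sqrt((4-1)^2 + (6-2)^2) = 5

# Euclidean distance as the norm of the connecting vector
print(np.linalg.norm(b - a))  # 5.0

# The symmetry point: squaring removes the sign, so the order does not matter
print(np.linalg.norm(a - b))  # 5.0 as well
```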
And here let's quickly refresh our memory on the Pythagorean theorem, which says that in a right triangle, a triangle with one 90° angle, the square of the length of the side opposite the right angle, the side we commonly call c (the other two being a and b), equals the sum of the squares of the other two sides: $c^2 = a^2 + b^2$. This is a super important theorem and a fundamental principle for defining norms and distances in Euclidean spaces and in many other applications.

Angles play a crucial role in understanding the direction of vectors, and they can be measured in degrees or in radians; we saw the idea of π radians, and that π radians equal 180°. All of this matters for linear algebra and, in general, for applications of mathematics in machine learning, AI, and beyond. The relationship between these angle measures and the trigonometric functions is foundational for solving problems about vectors and their orientations. Sine, cosine, and tangent are all important. Just to give you an idea: the tangent family shows up specifically in activation functions, in what we call the tanh activation function, and knowing it will help you understand the activation functions used in deep learning, the more advanced class of machine learning models fundamental to new and cutting-edge techniques such as large language models, Transformers, and encoder- and decoder-based algorithms. These functions are also important for computing dot products, a must-know in linear algebra.

It's also important to understand the idea of orthogonality. Two vectors, say a and b, are orthogonal to each other if their dot product is zero. Later, when we cover the dot product in the vectors module, we will see exactly what it means for the dot product to equal zero; for now, if we multiply vectors a and b together and their dot product is zero, they are orthogonal, meaning the angle they form is 90°. Orthogonality implies that the vectors form a right angle with each other, whether in $\mathbb{R}^2$ or in $\mathbb{R}^3$, and this matters for visualizing them correctly: the concept is represented visually by a vector a and a vector b that are perpendicular in the 2D coordinate system.

When it comes to applications, orthogonality plays a crucial role in various aspects of linear algebra. It is fundamental for defining vector spaces and subspaces and for solving systems of linear equations: later, once we move past vectors and on to matrices and solving linear systems, equations with many unknowns, using reduction, Gaussian elimination, we will see how orthogonality matters and how it relates back to the norms of vectors. Orthogonal vectors are also used for finding the shortest distance from a point to a plane, something that is important in optimization. Here is an example: take the vector a = (2, 3) and the vector b = (-3, 2). To obtain their dot product (something we will cover properly later in the course), multiply component-wise and add: 2 times -3 equals -6, and 3 times 2 equals 6, and -6 + 6 = 0. So the dot product of these two vectors is simply zero, and this is what we call orthogonality: the two vectors form a right angle, an angle of 90°.
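The orthogonality check from this example takes one line of NumPy:

```python
import numpy as np

a = np.array([2.0, 3.0])
b = np.array([-3.0, 2.0])

# Dot product: 2*(-3) + 3*2 = -6 + 6 = 0, so the vectors are orthogonal
print(np.dot(a, b))  # 0.0
```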
Why do these prerequisites matter, and why did I mention them? Understanding these concepts is crucial: they underpin the geometric interpretation of linear algebra, and they will help you truly understand the material rather than just memorize it. Later, in your machine learning, AI, and data science journey, recognizing these concepts will help you better understand the different algorithms and optimization techniques, what we mean when we say we want an optimization algorithm to move toward a local or global minimum, this idea of movement, this idea of vectors. You will also come to understand concepts in deep learning, how these models work, how neural networks work. These are essential concepts for solving systems of linear equations, a core part of this course, and they help you visualize vector spaces, which is critical for understanding linear algebra and its real-world applications. You can definitely master these topics by following some of our other courses, but for this course I assume you are already familiar with them.

So now we are ready to actually begin. With these prerequisites in mind, you are prepared to start your linear algebra journey. We are going to learn everything in the most efficient way: you will learn the theory, you will see many examples, and we will cover everything in detail, but at the same time you will focus on the must-know concepts. I am not going to overwhelm you with the most difficult material that you will never see in your career; I am going to give you the bare minimum of what you must know in linear algebra, so that you are ready to apply it in your professional journey, whether you want to get into machine learning, deep learning, artificial intelligence, or data science. Knowing these concepts of linear algebra, you will be a pro in your field. I will give you everything you need, the theory, the examples, the implementations, all in detail, but in the most efficient and time-saving way. So without further ado, let's get started.
Hi there! Let's get started with our first module, Foundations of Vectors. In this module we are going to talk about the fundamentals of linear algebra vectors: we will differentiate between scalars and vectors and define them both, first learning the theory and then putting it into practice by plotting them and looking at different examples. We will then look at the representation of vectors, their magnitude and direction, and how to represent them in general by plotting them in our coordinate system.
After that, we will see the common notation for vectors and how they are indexed. Vectors are super important in linear algebra and its applications, and they matter not only in mathematics but beyond: they help us in many ways, from figuring out how objects move to solving mathematical problems in science and in technology generally, including data science, machine learning, and artificial intelligence. They are a super useful tool.

Let's start our journey by looking at scalars. Scalars are simply numbers: by definition, a scalar is a single numeric value, often representing a magnitude or quantity. For example, a scalar can describe the temperature outside: a temperature of 22° can be represented by a scalar, and so can the height of a person. Say we have a scalar that we denote by the letter s, just a variable, and this scalar equals 22, measured in degrees. If s measures the room temperature, then the scalar s = 22° represents the room temperature; it could instead be 18°, or 9° if it's very cold. It measures a single value, it represents just a single number, and it could equally well be 17, 100, or 2.22. All of these are scalars: each represents a single numeric value, often a magnitude or quantity, and very soon we will see that a scalar is the kind of value that represents the magnitude of a vector.

Now that we are clear on the very basic concept of scalars, let's move on to the idea of vectors. By definition, a vector is an ordered array of numbers which can represent both magnitude and direction in space. Vectors carry a bit more than scalars: they are numbers that also encode a direction, like a car speeding down the highway or a ball being thrown. In our previous example we used the room temperature, a single value of 22°, as a way to think about a scalar. A vector is different. Consider this example: a bird flies south at 10 kilometers per hour. Notice what I am doing here: I am not just giving the scalar, the magnitude (we will see the formal definition very soon); I am writing down the speed, but I am also adding the piece of information that makes this a vector, the direction. I know the bird is flying south, that is the direction, and I know its speed, the magnitude, 10 km/h. So the vector carries much more information than the scalar: with the scalar I had a single value, the room temperature, but with the vector I have not only a magnitude, the speed of 10 km/h, but also the extra information of direction, flying south.
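A minimal sketch of this contrast, assuming NumPy; encoding "south" as the negative y direction is my assumption, since the lecture states the direction only in words:

```python
import numpy as np

# A scalar: a single number, magnitude only
room_temperature = 22.0  # degrees

# A vector: magnitude plus direction. Taking north as +y, a bird
# flying south at 10 km/h could be encoded as:
bird_velocity = np.array([0.0, -10.0])  # km/h

# The speed (10 km/h) is the magnitude of the velocity vector
print(np.linalg.norm(bird_velocity))  # 10.0
```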
So we have an x-axis and a y-axis, as usual, with the origin at (0, 0), and we want to plot a simple vector. The way we usually write a vector down in tutorials is by its name, which can be any letter, say v, with an arrow added on top: $\vec{v}$. The arrow tells the person reading that we are dealing with a vector; the arrow on top is that reference. Let's assume this vector $\vec{v}$ starts from the center of our coordinate system and goes to the point (4, 0): the x coordinate is 4 and the y coordinate is 0, so the vector runs straight from the origin to that point. So what does this vector tell us? First, we have a value that describes the length of the vector: it goes from 0 to 4, so the length is 4 units. We were just talking about the fact that the magnitude is the length in this case, so the length describes the magnitude, which means the magnitude of this vector is equal to 4. What else can we see here? The direction of the vector: it goes straight from one point to the other in a horizontal way. And independent of whether I plot this vector from 0 to 4 here, or over here, or over there, in all cases, as long as the length is the same, I am dealing with the same vector; in this entire $\mathbb{R}^2$ space I have exactly the same vector. Where the vector starts and where it ends is not what I am interested in; all I care about is the magnitude, in this case the length, and the direction of the vector.

Let's now look into another example, where we go a bit more difficult on our coordinates and our vector. We already saw the vector $\vec{v}$, which goes from (0, 0) to (4, 0): the x coordinate was 4 and the y coordinate was 0. Now let's plot another one, where the direction is no longer horizontal. Let's call it vector $\vec{w}$, and for this vector we will again start at (0, 0), but this time we go up at an angle, all the way to the point (3, 4): it has the value 3 on the x-axis and the value 4 on the y-axis, so $\vec{w}$ goes from (0, 0) to (3, 4). You can now see that the direction of this vector is slanted, while the direction of the vector $\vec{v}$ was horizontal. And as in the case of $\vec{v}$, I again no longer care where exactly my vector $\vec{w}$ starts and ends; all I care about is its magnitude, the length, and its direction. For instance, I can have the same vector here or over there: as long as the length (the magnitude) and the direction are the same, I am dealing with the same vector. The magnitude and the direction are all that
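If you want to reproduce these plots yourself, here is a minimal sketch (my addition) that draws $\vec{v}$ and $\vec{w}$ from the origin using Matplotlib's quiver; the colors and axis limits are arbitrary choices:

```python
import matplotlib.pyplot as plt

# The two example vectors, both drawn from the origin (0, 0)
v = (4, 0)
w = (3, 4)

plt.quiver([0, 0], [0, 0], [v[0], w[0]], [v[1], w[1]],
           angles='xy', scale_units='xy', scale=1,
           color=['tab:blue', 'tab:red'])
plt.xlim(-1, 5)
plt.ylim(-1, 5)
plt.grid(True)
plt.gca().set_aspect('equal')
plt.show()
```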
you care about. All right, so now about the length. That's something you can see very easily from this specific example, because by using the Pythagorean theorem we can find it very quickly. Looking at the right triangle formed here, one side is 3 and the other side is 4, which means the third side is 5: we compute $3^2 + 4^2 = 25$, take the square root, and $\sqrt{25} = 5$. So the length, or the magnitude, of the vector $\vec{w}$ is simply equal to 5.

So much for these specific vectors; let's now look into the common representations of vectors. We always use the magnitude as well as the direction to represent vectors, and they are commonly written in two different ways. Let's look at the first way, and then move on to the next one. The vector $\vec{v}$ moved from (0, 0) to the point (4, 0), so we can represent it with parentheses as the coordinate pair (4, 0); likewise, the vector $\vec{w}$ can be written as (3, 4). Using the coordinates from the coordinate system, we can represent our vectors. Another way of representing these vectors is by using square brackets: given that we are in a two-dimensional space, we first write the 4 and then, below it, the 0, so $\vec{v} = [4, 0]^T$; in the same way, $\vec{w} = [3, 4]^T$. This is yet another way of representing vectors in a two-dimensional space.

If we were to have vectors in three-dimensional space, so we are dealing with $\mathbb{R}^3$, we have points that can be described by x, y, and z. Say we have a vector to the point with values a, b, and c; then this vector $\vec{v}$ can be represented as the column $[a, b, c]^T$. One thing you can notice is that, unlike in $\mathbb{R}^2$, I now have three different entries, what we also refer to as rows, and still just one column. This is usually the common way of representing vectors: columns help us represent them. You can see it very clearly: in two-dimensional space, $\mathbb{R}^2$, our vectors have just two rows, like [3, 4] or [4, 0] here; in three-dimensional space we have three entries; and the same of course holds, for instance, for $\mathbb{R}^5$. For $\mathbb{R}^5$ the coordinate axes could be named, say, x, y, z, γ, and δ, and a vector in that space could be $\vec{v} = [a, b, c, d, e]^T$. You get the idea: depending on the coordinate space and the dimension of that space, the corresponding vectors are represented accordingly.

So vectors are quantities that have both magnitude and direction, as we just saw, distinguishing them from scalars, which only have magnitude. The scalars had only a magnitude, while for both the vector $\vec{v}$ and the vector $\vec{w}$ we saw not only the magnitude, the length, but also the corresponding direction.
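To check this length computation numerically, here is a minimal sketch (my addition); `np.linalg.norm` computes exactly the Euclidean length from the Pythagorean formula:

```python
import numpy as np

w = np.array([3.0, 4.0])

# Pythagorean theorem by hand: sqrt(3^2 + 4^2) = sqrt(25) = 5
length_manual = np.sqrt(w[0]**2 + w[1]**2)

# The same thing via the Euclidean (L2) norm
length_norm = np.linalg.norm(w)

print(length_manual, length_norm)  # 5.0 5.0
```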
This is exactly what we saw in our example: a vector in two-dimensional space, in $\mathbb{R}^2$, can be represented using square brackets and the corresponding entries for x and y, where x is the x coordinate in our coordinate system. Whenever you have an x and a y coordinate, the x coordinate gives the first entry you write down and the y coordinate gives the second entry. The x and y indicate the movement in the horizontal and vertical dimensions, respectively: for the first entry, always take the x coordinate, how far you move in the horizontal direction, wherever the vector sits, and then put the y coordinate below it.

Now, indexing in vectors. In standard mathematical notation, the indices of an n-vector run from $i = 1$ to $i = n$. The notation here can be a bit ambiguous: $a_i$ could mean the i-th element of the vector $\vec{a}$, or the i-th vector in a collection of vectors. Let's start with the simple case and then move on to the second one. Usually, when we have an n-dimensional space, we have a hard time visualizing it; therefore we use two-dimensional or at most three-dimensional space in order to get an understanding of what these vectors are. We just saw examples of this when creating our vectors $\vec{v}$ and $\vec{w}$ in $\mathbb{R}^2$ and also in $\mathbb{R}^3$, but we can have similar vectors in $\mathbb{R}^4$, in $\mathbb{R}^5$, all the way up to $\mathbb{R}^n$, where n can be 100, 200, 500, any number as large as you want. The thing is that visualizing $\mathbb{R}^4$, $\mathbb{R}^5$, or $\mathbb{R}^n$ is very hard, but we can still benefit from the great properties of vectors, matrices, and linear algebra in general to describe things that have more than three dimensions. Therefore we have this more abstract notation $\mathbb{R}^n$, where n can be any positive integer, arbitrarily large. A vector in this $\mathbb{R}^n$ is usually described using similar square brackets as before, only with more entries: instead of just two or three entries, as in two-dimensional or three-dimensional space, we now have $\vec{a} = [a_1, a_2, a_3, \ldots, a_{n-1}, a_n]^T$. We have n elements in total in our column, and this describes our single vector in an n-dimensional space, which we call $\vec{a}$. One thing the definition and notation also said is that we might be dealing with the i-th vector in a collection, which means that sometimes you will see that second reading; we will look at it next.
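One practical note worth a code sketch (my addition): the math notation indexes from 1 to n, while Python indexes from 0 to n − 1, so the mathematical $a_i$ corresponds to `a[i - 1]` in code:

```python
import numpy as np

a = np.array([10, 20, 30, 40, 50])  # a vector in R^5

n = len(a)       # n = 5
a_1 = a[0]       # math a_1 -> Python index 0
a_n = a[n - 1]   # math a_n -> Python index n - 1

print(a_1, a_n)  # 10 50
```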
In the simple case we just saw, the entries $a_1$, $a_2$, $a_3$, up to $a_n$ are just numbers. But it is also possible, in a more difficult and complicated case, to have a collection written with a capital letter A, $A = [\vec{a}_1, \vec{a}_2, \vec{a}_3, \ldots, \vec{a}_{n-1}, \vec{a}_n]$, where you can already see what is going on: instead of having just numbers as entries, we have vectors here. Our first element is actually a vector, our second element is actually a vector, $\vec{a}_1$, $\vec{a}_2$, $\vec{a}_3$ with their arrows, all the way down to $\vec{a}_n$. While in the plain case the entries could be some numbers, say 1, 1, 1, …, 1, here we have a vector, another vector, and another vector all the way down, where, for instance, $\vec{a}_1 = [a_{11}, a_{12}, a_{13}, \ldots, a_{1m}]^T$. One thing you will notice here is that, unlike before, I now have double indices: $a_{11}$, then $a_{12}$, then $a_{13}$, all the way to $a_{1m}$. The first index does not change, it stays 1, because I am writing down the index corresponding to this particular vector; the second index changes per entry, indicating which element in the vector I am specifically talking about. So from the first index you can identify the vector I am referring to, which is $\vec{a}_1$, and from the second index you can see the position that element occupies in the vector: this value is in the first position of $\vec{a}_1$, this one in the second position, the third position, all the way down to the m-th position. This is really important to understand well, because this notation is going to appear time and time again across various applications of matrices and vectors; therefore I want to go through it one more time to make sure we are clear on what these indices represent.

Whenever we have a vector that is just a vector, meaning it is not nested (not a vector inside a vector), we can define it as $\vec{a} = [a_1, a_2, \ldots, a_n]^T$. What we also refer to as the dimension of this vector is n × 1: I have n entries and just one column. This already gives me the indication that most likely $a_1$ is a number, $a_2$ is a number, $a_n$ is a number; say they equal 1, 2, 3, and so on, with 100 at the end. But if I am dealing with a nested vector, which, as we will see later, can be represented by a matrix, then I can define it with a capital letter A, which is a common way to refer to either matrices or nested vectors: $A = [\vec{a}_1, \vec{a}_2, \vec{a}_3, \ldots]$. This already sends a message to the reader that we are no longer dealing with constants within a vector but rather with vectors within a vector. What can we see here? The number of rows, the number of entries, of this nested vector (which we can also refer to as a matrix) is n; but this time the number of values that form each of these entries is no longer one, because each entry is not just a constant but yet another vector. Let's assume each of these vectors has a length of m; then the dimension of this matrix A is n × m, something we will see again when talking about matrices. Let me clarify this a bit more for better understanding by looking at one example entry.
Let's look into one specific vector, the third vector within this nested vector capital A, that is, $\vec{a}_3$. One thing to note here is that I assumed these vectors all have m elements, and keep in mind that all these vectors should be of the same size; so I already know that this specific vector $\vec{a}_3$ has m elements. I take it out of the entire nested vector A and represent it on its own, and now, unlike the entries that carry an arrow on top, this time I have constants forming the $\vec{a}_3$ vector: I no longer have vectors inside, but plain elements. To make sure I recognize that I am dealing with the third vector, the first index everywhere is 3: $\vec{a}_3 = [a_{31}, a_{32}, \ldots, a_{3m}]^T$. They all come from the same third vector $\vec{a}_3$, but their positions differ: position 1, position 2, all the way down to position m. These indices help us keep track of the positions that these values occupy in the vector $\vec{a}_3$. This might seem a bit complicated at the moment, but once we move on to more complex material like matrices, it will make much more sense; it is a bit of extra material that I just wanted to showcase.

Here is what is at the core and what you need to understand at the moment to grasp this concept of vectors. A vector can be written with an arrow on top: say vector $\vec{a}$ has n elements; then, inside square brackets, you write $a_1, a_2, \ldots, a_n$, which means you have n different entries describing your vector: $a_1$ is the first element of your vector, $a_2$ the second, all the way to $a_n$, the n-th element. Concretely, we might have $a_1 = 1$, $a_2 = 2$, $a_3 = 3$, and so on up to $a_n = 100$: I take these numbers and put them inside the square brackets to get a representation of my vector, which has all these different entries, starting with 1 and ending with 100. That is a vector. When it comes to vectors within vectors, we need to be a bit more careful, because the entries forming the vector are not constant values but vectors themselves. Our nested vector A, which we will later refer to as the matrix A, has entries $\vec{a}_1, \vec{a}_2, \vec{a}_3, \ldots$ that are not constants but vectors in their own right. In our example we looked into the third specific vector that is part of A, namely $\vec{a}_3$, which has m different elements. The first index records which vector of the nested vector A it is (the third one, because we took it from the third entry), and, since the vector has its own distinct members, the second index keeps track of the position of each value, from 1 up to m. The elements themselves can be, for instance, 0, 1, 2, all the way to, say, 500; they don't need to be ordered or follow a specific pattern, they can be just random numbers describing this vector $\vec{a}_3$.
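In code, this double-index picture corresponds naturally to a 2-D array; here is a minimal sketch (my addition, with made-up numbers) showing how the first index selects a vector and the second selects an element within it:

```python
import numpy as np

# A "nested vector": n = 4 row vectors, each with m = 3 elements
A = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
    [10, 11, 12],
])

a_3 = A[2]      # the third vector (math index 3 -> Python index 2)
a_32 = A[2, 1]  # element a_{32}: third vector, second position

print(a_3)   # [7 8 9]
print(a_32)  # 8
```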
Hopefully this makes sense; if it doesn't, don't worry, because we are going to see this time and time again. I just wanted to give you a brief intro so that you can remember it when we come back to more complex topics like indexing in matrices.

So now let's talk about special vectors and operations. Here we are going to talk about zero vectors, unit vectors, the concept of sparsity in vectors, and vectors in higher dimensions, like the n-dimensional space we just saw. We will also talk about different operations we can apply to vectors, like addition and subtraction; later on, in the next module, we will also talk about multiplication. We will also look into the properties of vector addition after we have gone through some detailed examples of operations on vectors.

All right, let's start with zero vectors and unit vectors. For zero vectors, the symbol $\vec{0}$, a zero with an arrow on top, refers to a vector just like before, only with the difference that all its entries are zero. For example, $\vec{0}_3 = [0, 0, 0]^T$: three entries in the usual square-bracket representation, all zero, and it says this vector is in $\mathbb{R}^3$. Okay, so why are we doing this? In different linear operations we sometimes just need to add zero vectors, or we want to create them because they are easy to work with: we want to create an "empty" vector whose length we know, so that later on we can fill something in on top of it; or, knowing that adding zero to a number leaves the number unchanged, we can make use of this property for different tricks when programming in Python, C++, or any other language. So this idea of zero vectors can become very handy. One thing you need to notice here is that we don't just write the 0 with the arrow on top to emphasize that we are dealing with a vector; we also add the dimension of this vector as a subscript: in what dimension, in what space is this zero vector located? Is it in $\mathbb{R}^2$, in $\mathbb{R}^3$, in $\mathbb{R}^n$? In this specific example the subscript is 3, which indicates we are dealing with a zero vector in three-dimensional space, $\mathbb{R}^3$. In general we denote it by $\vec{0}_n$, keeping the notation general: $[0, 0, \ldots, 0]^T$, with dimension n × 1, in $\mathbb{R}^n$. So much for zero vectors: they are a way to make our programming life easier and are used in different algorithms in more advanced linear algebra.
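In NumPy this is a one-liner; a minimal sketch (my addition) that also checks the "adding zero changes nothing" property:

```python
import numpy as np

# Zero vector in R^3
zero3 = np.zeros(3)
print(zero3)  # [0. 0. 0.]

# Adding the zero vector leaves any vector unchanged
a = np.array([2.0, -1.0, 5.0])
print(a + zero3)  # [ 2. -1.  5.]
```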
The next type of special vector we will look into is unit vectors: vectors with a single element equal to one and all the others zero, denoted $e_i$ for the i-th unit vector in n dimensions. What do we mean here? If we have, for instance, $e_1$, it means we have a vector whose i-th element, in this case the first element, is equal to one: $e_1 = [1, 0, 0]^T$, a one in the first element and zeros in the rest. It is really important that we are dealing with vectors containing only zeros and ones, and the only element equal to one is the i-th element of the entire vector; all the remaining ones are zero. You can also see that the subscript no longer specifies the dimension, but rather the index of the entry where the one is located. Let's look at another example: yet another unit vector is $e_2$, which means that in the second element, in the second place, the vector contains a one, and all the other elements are zero: $e_2 = [0, 1, 0]^T$, a zero here, a zero there, a one only in the second element. And in $e_3$, the third element is one and all the others are zero: $e_3 = [0, 0, 1]^T$. Let's also look into one bigger vector, in higher dimension, to make even more sense of it. First, assume we are dealing with a vector in $\mathbb{R}^n$, an n-dimensional space; this tells me we are not dealing with a nested vector but with a simple n-dimensional vector, with n rows and one column. Using square brackets, I represent my vector with all its entries, and let's say it is $e_5$. What does this mean? It means $i = 5$: the i-th element, here the fifth element, is equal to 1, and all the other entries of the vector are zeros. Walking down the vector: 0, 0, 0, 0, then, approaching the fifth element, a 1, and all the remaining entries are zeros. This is a unit vector in an n-dimensional space, and I denote it $e_5$ because its fifth element is equal to one. These vectors are very handy for other techniques in linear algebra: think of techniques like row echelon form and solving linear equations, something we will see as part of the next unit. There are many things we can do using unit vectors; they are super important, so you need to understand this concept very well in order to understand more advanced concepts in linear algebra later on.
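A minimal helper for building these (my addition; it follows the lecture's 1-based index convention, converting to Python's 0-based indexing internally):

```python
import numpy as np

def unit_vector(i, n):
    """Standard unit vector e_i in R^n (i is 1-based, as in the math notation)."""
    e = np.zeros(n)
    e[i - 1] = 1.0
    return e

print(unit_vector(5, 8))  # [0. 0. 0. 0. 1. 0. 0. 0.]
```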
Now let's look into the topic of sparsity in vectors. By definition, a sparse vector is characterized by having many of its entries equal to zero, and its sparsity pattern indicates the positions of the nonzero entries. What we are basically saying is that if we are dealing with a vector that contains many zeros, we are dealing with a sparse vector, and the sparsity pattern also tells us where the nonzero elements sit. If we have a unit vector, for instance, it means we are already dealing with a sparse vector. This concept is super important in linear algebra, but also in data science, machine learning, and AI in general, because sparsity in your vector means you don't have much information: a value of zero usually means you don't know much about that specific quantity, and if you have too many zeros and too few numbers that do provide information, you are dealing with a vector that doesn't give you much information, which is always a problem in data science, machine learning, and AI. So sparsity is something you need to be aware of: you need to know how to recognize it, and you also need to know whether it is a problem in your specific case or not.

Let's look into an example. Say we are dealing with a vector $\vec{x}$ that has five different elements, so $\vec{x}$ is a vector coming from five-dimensional space: $\vec{x} = [3, 0, 0, 4, 0]^T$. We have the element 3 in the first entry, zeros in the second and third entries, then the fourth entry, which happens to contain the value 4, and the last element of our five-dimensional vector $\vec{x}$ is equal to zero. Now what do we see here? The majority of the elements of the vector $\vec{x}$ are equal to zero: we have five elements in total, three of them are equal to zero, and only two of them carry information, the values 3 and 4, the only nonzero elements. That means 3 out of 5, which is 3/5 = 60%, of all the entries in the vector $\vec{x}$ are equal to zero. Since 60% is above half, above 50%, the majority of the entries in this vector are simply zero. This type of vector is what we call a sparse vector, and sparsity is a really important concept to keep in mind later on.
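Here is a minimal sketch (my addition) of measuring that sparsity fraction in code:

```python
import numpy as np

x = np.array([3, 0, 0, 4, 0])

n_zero = np.count_nonzero(x == 0)  # how many entries are zero
sparsity = n_zero / x.size

print(n_zero, sparsity)  # 3 0.6 -> 60% of the entries are zero
```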
While we can visualize vectors in two and three dimensions, visualizing higher-dimensional vectors in linear algebra, like the n-dimensional vectors we just saw, becomes very difficult. But this mathematical flexibility, being able to represent information with many entries by vectors that we cannot actually visualize, becomes very handy for complex data structures, for different simulations in physics, and much more. We just saw in a couple of examples how we can represent vectors in a high-dimensional space using square brackets and the common vector notation. For instance, if we have a vector with n different entries, where n is, say, 1000, we can represent that information in the usual vector notation as $[a_1, a_2, \ldots, a_{1000}]^T$. Of course we cannot visualize this; it just doesn't make sense visually. We can visualize two-dimensional vectors and three-dimensional vectors, but not thousand-dimensional vectors, a vector from $\mathbb{R}^{1000}$. What we can do is still make use of this very useful representation to perform different operations, and later on we will see that this property, and specifically this part of linear algebra, helps us work with vectors in any number of dimensions, whether thousands, millions, or billions. This mathematical flexibility is super important for more complex data structures and for matrix multiplication in different algorithms, including representing very large matrices and very large feature spaces, all just by making use of vectors.

Let's now finish off this module by looking into some applications of vectors. One common application is counting words: we take a document, count the words in it, and we can even plot a histogram of how often each word appears. A vector of length n can represent the number of times each word of a dictionary of n words appears in a document. Just for the sake of simplicity, let's assume our dictionary contains only three words; of course, in reality the dictionary, which we also often refer to as a corpus, contains many more words, but for simplicity we will assume we have just three different words in our dictionary in total. Now assume we have a document with various words, and we want to count how many times each of the words in our dictionary actually appears in the document. If our document is described by the vector $[25, 2, 0]^T$, it means our document contains word one of our dictionary 25 times (that's position one), word two 2 times, and word three 0 times. We basically have a predetermined set of words in our dictionary, in this case word one, word two, and word three, each with a specific index, a specific position in our vector, and when we put the counts in, the computer program will understand that a 25 in the first position means the first dictionary word appears 25 times in the document, whereas the second word appeared only two times, and the last word, word three, did not appear at all, zero times in the entire document.

Let's look into a practical example to make even more sense of this. Counting words and their variations is, by the way, a common practice: it is a common ingredient in n-grams, large language models, and Transformers; it is a cornerstone of many language models. Counting how often a word appears in a document gives us an idea of what the document is about: knowing how many times the same word appears gives us an indication of the topic of the document. We can also use it for sentiment analysis, to understand not only the topic but also whether the document is positive, neutral, or negative, so to say. For instance, suppose our dictionary has just six different words: word, row, number, horse, eel, and document, and the count vector is $[3, 2, 1, 0, 4, 2]^T$. This means we have a text, what we refer to as a document, that contains the word "word" three times, the word "row" two times, the word "number" one time, the word "horse" zero times, the word "eel" four times, and the word "document" two times. This is a common way of representing the frequency of the words in a document. Let me give you another example, in which I want to emphasize one more thing: the concept of stop words. Say the dictionary is I, reading, library, book, shower, uh, and the counts are $[3, 2, 2, 4, 0, 10]^T$: three times the word "I", two times the word "reading", two times the word "library", four times the word "book", zero times the word "shower", and ten times the word "uh". You can see that we are dealing with a document that contains the word "uh" ten times, which is what we refer to as a stop word: words like this don't actually give us much information about what the document is about, because they are used everywhere, yet they appear very often; here it is the most frequently appearing word, with ten occurrences. That is what we refer to as a stop word.
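A minimal bag-of-words-style counting sketch (my addition; the dictionary and document are made up for illustration and do not reproduce the lecture's exact counts):

```python
from collections import Counter

# A tiny, made-up dictionary and document (illustrative only)
dictionary = ["i", "reading", "library", "book", "shower", "uh"]
document = "uh i uh was uh reading uh my uh book in the library uh".split()

counts = Counter(document)
count_vector = [counts[w] for w in dictionary]
print(count_vector)  # [1, 1, 1, 1, 0, 6]
```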
The second thing we can observe in that last example is that we are most likely dealing with a document about reading in a library, because you see words like "reading", "book", and "library", while the word "shower", which is totally unrelated to reading, books, or libraries, appears zero times. So even by looking at these counts, we can already get an idea of what the topic of the document is. You can see, even from this very basic example where I made strong simplifying assumptions about how small the dictionary is, how we can use these counts over our dictionary to get an idea of the topic of a document or a conversation: it can be the topic of tweets if you have tweet data, the topic of a book if you have many book texts, or the topic of a review if you have product reviews from Amazon, for instance; using these counts can help you extract a topic from that text. You can also use the counts to remove stop words, because the stop words are usually the most frequent words, and they can also give you an idea about the sentiment: here we are dealing with neutral sentiment; it's not positive, it's not negative, it's just reading a book in a library, that kind of topic. All of this can be super helpful in natural language processing, the field where text processing and text cleaning, and then using the results for modeling purposes, play a central role. It also plays a super important role in large language models and Transformer models, and simpler methods like bag-of-words or TF-IDF are all based on this idea of counting words. You can see how vectors come into play in these different applications of linear algebra in data science, natural language processing, artificial intelligence, and machine learning; they are super important.

Another application of vectors is representing customer purchases. For example, an n-vector $\vec{p}$ can record a customer's purchases over time, with $p_i$ being the quantity or dollar value of item i. Now what does this mean? Say we have a vector $\vec{p}$ that represents a single customer's purchases, and over time we save the information about the purchases the customer has made, where the quantity is in dollars: we are basically keeping track of the dollar value of each item i that the customer purchased. The statement already assumes an n-vector, which means the number of items the customer purchases is n: we have n entries in total, $p_1, p_2, \ldots, p_n$, and somewhere in the middle sits $p_i$, in the i-th position, where $p_i$ represents the value of item i (i is, by the way, just a way to refer to the i-th purchase). For example, suppose I am dealing with a customer that buys courses. The first item the customer buys is a mathematics course; somewhere in the middle, as the i-th purchase, the customer decides to buy a deep learning course; the customer continues buying courses, and the last course they buy is, say, a career coaching course. Now say the mathematics course costs around $1,000, the deep learning course costs $3,000, and the career coaching service, usually one of the most hands-on and personalized ones, can cost up to $5,000. Then in the i-th position of the vector we have 3,000, so $p_i$ = $3,000: the i-th purchase was the deep learning course, and the value of that item was $3,000.
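A minimal sketch of such a purchase-history vector in code (my addition; the course names and prices follow the lecture's toy example, the three-entry layout is an assumption for illustration):

```python
import numpy as np

# Hypothetical purchase-history vector: p[i] holds the dollar value of a purchase
p = np.array([1000.0,   # mathematics course (first purchase)
              3000.0,   # deep learning course (the i-th purchase)
              5000.0])  # career coaching (last purchase)

print(p.sum())  # 9000.0 -- total spent by this customer
```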
All right, so now we are ready to go on to the next major topic, which is vector addition and subtraction: we are going to perform some operations and apply them to vectors. Let's first formulate and define this idea of vector addition: two vectors of the same size are added by adding their corresponding elements, and the result is a vector of the same size. Let's unpack this. It says two vectors of the same size are added by adding their corresponding elements. This refers to two different vectors, say vector $\vec{v}$ and vector $\vec{w}$, and it says: let's add them, which is what we refer to as vector addition. For that, we take all the elements of $\vec{v}$ and all the elements of $\vec{w}$, and we use their corresponding elements, the indices that help us understand where those elements are located, to add each element in the vector $\vec{v}$ to the element of the vector $\vec{w}$ in the same position. And do note the second part: the result is a vector of the same size. Because we are adding two vectors that themselves have the same size, we end up with a vector of that same size. Once we go into the examples, this will make much more sense.

Let's also quickly look into the concept of subtraction. On its own, subtraction is very similar to this idea of addition: if we have a subtraction, say vector $\vec{v}$ minus vector $\vec{w}$, we are doing basically what we just did for addition, only subtracting instead of adding. Again, we are just subtracting $\vec{w}$ from $\vec{v}$ element-wise; they have the same size, so we end up with a result which is a vector of the same size. One thing you can note is that this can also be rewritten as $\vec{v} + (-\vec{w})$: we can basically represent subtraction as a way of adding, only we take the negative, the oppositely directed vector. This, too, will make even more sense once we go on to the examples.

So let's look into our first operation example, where we add two different vectors. This is a basic example: we have just two two-dimensional vectors, vector $\vec{a}$ with entries $[2, 3]^T$ and vector $\vec{b}$ with entries $[1, 4]^T$, and we are adding vector $\vec{a}$ to vector $\vec{b}$. We just learned that we need vectors of the same size: you can see that $\vec{a}$ has dimension 2 × 1 and $\vec{b}$ has dimension 2 × 1, so their sizes are the same, both with only two entries, two elements. At the same time, we just learned that what we need to do is take their corresponding elements and add them to each other.
Now what does this mean? It means we take the first element of $\vec{a}$, which is 2, and then the first element of the second vector $\vec{b}$, which is 1, and we add them to each other: 2 + 1 = 3. The same holds for the second entry: 3, which is the second element of vector $\vec{a}$, plus 4, which is the second element of vector $\vec{b}$: 3 + 4 = 7. Let me write it down in an even simpler manner so that it makes much more sense: vector $\vec{a}$ has elements $[2, 3]^T$ (2 in the first element, 3 in the second), and we want to add $\vec{b}$, which has 1 as its first element and 4 as its second. Adding these vectors, $[2, 3]^T + [1, 4]^T$, means we take 2 and add 1 (this element and this element), and we take 3 and add 4, which equals 2 + 1 = 3 and 3 + 4 = 7, so we get the vector $[3, 7]^T$. Do note that this result vector again contains two elements and just one column, so 2 × 1: the size of the result vector is the same.

Now let's generalize this concept before moving on to the next example. Say we have vector $\vec{a}$ with n elements, $[a_1, a_2, \ldots, a_n]^T$, from n-dimensional space, and vector $\vec{b}$, which, remember, must have the same size, with n elements $[b_1, b_2, \ldots, b_n]^T$, also from n-dimensional space. Then
$\vec{a} + \vec{b} = [a_1 + b_1, a_2 + b_2, \ldots, a_n + b_n]^T$.
You can now see in general terms what we are doing here: we take $a_1$, coming from vector $\vec{a}$, and add to it the value in the same position coming from vector $\vec{b}$, which is $b_1$; $a_1 + b_1$ is the first element, so the position stays the same in the result vector. We take all the corresponding values that share the same position, first from vector $\vec{a}$ and then from vector $\vec{b}$, we add them, and this forms our new vector, which will again have size n × 1. The sizes of the two vectors are the same, both have n elements; we use their corresponding elements to add them element-wise, and the result has the same size, n × 1. This is a more general description of how you can add two vectors.

Let's now look into one more specific example: the vector $[0, 7, 3]^T$, which, as you can see, comes from $\mathbb{R}^3$, so a three-dimensional vector; the second vector is $[1, 2, 0]^T$, and the final result is $[1, 9, 3]^T$. How did we get this? We took 0 and added 1, took 7 and added 2, and took 3 and added 0, all element-wise: 0 + 1 = 1, 7 + 2 = 9, and 3 + 0 = 3, exactly what we have here. Again the sizes are the same, and the result has the same size; quite straightforward.
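In NumPy, this element-wise rule is exactly what `+` does on arrays; a minimal sketch (my addition):

```python
import numpy as np

a = np.array([2, 3])
b = np.array([1, 4])
print(a + b)  # [3 7]

# The same element-wise rule in any dimension:
u = np.array([0, 7, 3])
v = np.array([1, 2, 0])
print(u + v)  # [1 9 3]
```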
Now, what about vector subtraction? What are we doing there? We are doing a very similar thing: we take the first element of the first vector, 1, and subtract the first element of the second vector; then we take the 9 in the second position and subtract the element in the second position of the second vector. With the second vector being $[1, 1]^T$, we get 1 − 1 = 0 and 9 − 1 = 8, so the result vector is $[0, 8]^T$, and you can see that the sizes stay the same.

Also in this case, let's write the idea of subtraction in more general form. If we have a vector $\vec{a}$ from $\mathbb{R}^n$, n-dimensional space, represented by $[a_1, a_2, \ldots, a_n]^T$ with n elements (n × 1), and a vector $\vec{b}$, also from $\mathbb{R}^n$, with n elements $[b_1, b_2, \ldots, b_n]^T$ and again of size n × 1, then
$\vec{a} - \vec{b} = [a_1 - b_1, a_2 - b_2, a_3 - b_3, \ldots, a_n - b_n]^T$.
Given that the result vector must have the same size, I already expect n elements in it. I take the first element coming from vector $\vec{a}$ and subtract from it the first element coming from vector $\vec{b}$, element-wise, and I already get the result for the first element of my result vector, $a_1 - b_1$. The same holds for all the other positions: from each element of $\vec{a}$ we subtract the corresponding element of $\vec{b}$, so in the result we get $a_1 - b_1$ in the first element, $a_2 - b_2$ in the second, $a_3 - b_3$ in the third, all the way down to the n-th element, $a_n - b_n$. This should now make much more sense: every time, we take the elements in the same position from one vector and the other, and subtract them from each other to get the corresponding element of the final vector.

All right, before moving on to the properties, I wanted to show you what this means in terms of visualization in a coordinate space. Say we have a coordinate space with an x-axis and a y-axis and the origin at (0, 0). I have a simple vector $\vec{a}$ with coordinates (4, −2), and a vector $\vec{b}$ with coordinates (−4, 4). Let's visualize them, starting with $\vec{a}$: it has an x value of 4 (counting 1, 2, 3, 4) and a y value of −2; that is my vector $\vec{a}$. Now let's visualize $\vec{b}$: the x coordinate is −4, so it sits over on the left, and the y coordinate is 4 (counting 1, 2, 3, 4), so $\vec{b}$ points up and to the left. So you can see that $\vec{a}$ is over here and $\vec{b}$ is over there. Now what I want to do is add these two vectors to each other: $\vec{a} + \vec{b}$, which equals 4 + (−4) = 0 and −2 + 4 = 2, so the result vector is (0, 2).
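A minimal numeric check of both operations from this passage (my addition):

```python
import numpy as np

# The subtraction example from above
print(np.array([1, 9]) - np.array([1, 1]))  # [0 8]

# The coordinate-space addition example
a = np.array([4, -2])
b = np.array([-4, 4])
print(a + b)  # [0 2]
```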
So now that we are clear on how we can add vectors, how we can perform these different operations, and what it means in practice when we look at vectors in a coordinate space and add or subtract them, we are ready to look into the properties of vector addition. This is something that will definitely seem familiar to you from pre-algebra: the properties we already know hold for numeric values, for scalars, transfer to this vector setting. We are going to talk about four different properties that vectors have. The first one is the commutative property, which says that if we add vector $\vec{a}$ to vector $\vec{b}$, this is the same as adding vector $\vec{b}$ to vector $\vec{a}$: basically, the order of the vectors doesn't really matter when it comes to adding them. Formally, $\vec{a} + \vec{b} = \vec{b} + \vec{a}$ for any vectors $\vec{a}$ and $\vec{b}$ of the same size. Then we have the associative property, which says $(\vec{a} + \vec{b}) + \vec{c} = \vec{a} + (\vec{b} + \vec{c})$, and we can write both as $\vec{a} + \vec{b} + \vec{c}$. Now what does this mean? We know from pre-algebra that parentheses mean "first do this addition, then do the rest of the operations". Here it basically says: if you add $\vec{a}$ to $\vec{b}$ first and then add $\vec{c}$, it is the same as first adding $\vec{b}$ to $\vec{c}$ and then adding the vector $\vec{a}$ on top. The third property is the addition of zero vectors, which says that adding a zero vector to vector $\vec{a}$ is the same as adding $\vec{a}$ to the zero vector, and both equal $\vec{a}$: $\vec{a} + \vec{0} = \vec{0} + \vec{a} = \vec{a}$; adding the zero vector has basically no impact on the vector whatsoever. The final property is subtracting a vector from itself: if we take a vector and subtract the same vector from itself, $\vec{a} - \vec{a}$, we get a zero vector; $\vec{a} - \vec{a} = \vec{0}$, and this yields the zero vector.

Now let's look into each of these properties one by one, with specific examples; in some cases we will verify the property on the example at hand to make these concepts much clearer. Let's start with the commutative property of vector addition. We want to see whether $\vec{a} + \vec{b} = \vec{b} + \vec{a}$. Say we have a vector $\vec{a}$ equal to $[1, 2]^T$ and a vector $\vec{b}$ equal to $[-2, 3]^T$. The first thing we want to check is indeed whether $\vec{a} + \vec{b}$ equals $\vec{b} + \vec{a}$; therefore let's first calculate the one quantity and then the other, and see whether we indeed get the same vector or not. So, quantity one: $\vec{a} + \vec{b} = [1, 2]^T + [-2, 3]^T$, and we learned before that this is simply element-wise, $1 + (-2)$ and $2 + 3$, which gives $1 - 2 = -1$ and $2 + 3 = 5$, so $\vec{a} + \vec{b} = [-1, 5]^T$. Now the second quantity: $\vec{b} + \vec{a} = [-2, 3]^T + [1, 2]^T = [-2 + 1, 3 + 2]^T = [-1, 5]^T$. We can already see that quantity one is indeed equal to quantity two, which confirms that $\vec{a} + \vec{b} = \vec{b} + \vec{a}$. What this basically means is that when adding two different vectors, the order is not important: whether you add $\vec{a}$ on top of $\vec{b}$ or $\vec{b}$ on top of $\vec{a}$ doesn't matter; at the end, the result is the same.
Actually, you can also see this in more general terms. Say we have a vector $\vec{a} = [a_1, a_2, \ldots, a_n]^T$ in n-dimensional space, with dimension n × 1, and a vector $\vec{b}$ of the same size from the same $\mathbb{R}^n$, with elements $[b_1, b_2, \ldots, b_n]^T$ and dimension n × 1. If we calculate the first quantity, $\vec{a} + \vec{b}$, it is simply $[a_1 + b_1, a_2 + b_2, \ldots, a_n + b_n]^T$; if we calculate the second quantity, $\vec{b} + \vec{a}$, it is $[b_1 + a_1, b_2 + a_2, \ldots, b_n + a_n]^T$. You can see that $a_1 + b_1 = b_1 + a_1$ simply from pre-algebra: for constants, $2 + 3 = 3 + 2$; in the same way $a_2 + b_2 = b_2 + a_2$, and so on up to $a_n + b_n = b_n + a_n$. All these elements are the same, which means we already have a proof: independent of what vector $\vec{a}$ and vector $\vec{b}$ are, $\vec{a} + \vec{b} = \vec{b} + \vec{a}$. This is exactly what we saw before in the first property, which is called the commutative property of vectors.

Now let's move on to the other property, the associative property of vectors. What this property says is that $(\vec{a} + \vec{b}) + \vec{c} = \vec{a} + (\vec{b} + \vec{c})$, and this is then equal to $\vec{a} + \vec{b} + \vec{c}$. Let's see this property on an actual example. We keep $\vec{a} = [1, 2]^T$ and $\vec{b} = [-2, 3]^T$ from before, and add a third vector, $\vec{c}$, with the representation $[4, 5]^T$. The idea behind this property is that we need to check that $(\vec{a} + \vec{b}) + \vec{c}$ equals $\vec{a} + (\vec{b} + \vec{c})$, and that this equals $\vec{a} + \vec{b} + \vec{c}$. This should come very intuitively, so let's go through it quickly. First quantity: $(\vec{a} + \vec{b}) + \vec{c} = ([1, 2]^T + [-2, 3]^T) + [4, 5]^T$. We saw before, when doing this, that $[1, 2]^T + [-2, 3]^T = [1 - 2, 2 + 3]^T = [-1, 5]^T$, and then adding $[4, 5]^T$ gives $[-1 + 4, 5 + 5]^T = [3, 10]^T$. Now let's quickly do the second quantity, which says: first add the vector $\vec{b}$ to the vector $\vec{c}$, and only then add the vector $\vec{a}$ on top. This means we take $[1, 2]^T$, which is vector $\vec{a}$, and leave it for now; we only add it once we have added $[-2, 3]^T$, the vector $\vec{b}$, to the vector $[4, 5]^T$. First adding those two: $[-2 + 4, 3 + 5]^T = [2, 8]^T$; then $[1, 2]^T + [2, 8]^T = [1 + 2, 2 + 8]^T = [3, 10]^T$. Okay, great: quantity one equals quantity two. Let's also check whether this all equals the third form, without parentheses. It should already be apparent, given that we know from mathematics that parentheses don't really matter for scalars, and vector addition is
built element-wise on exactly that additive property of scalars; but let's quickly verify to be 100% sure. When we take vector $\vec{a}$ plus $\vec{b}$ plus $\vec{c}$ all at once, we have $[1, 2]^T + [-2, 3]^T + [4, 5]^T$, which element-wise is $[1 - 2 + 4,\; 2 + 3 + 5]^T$. And what are these numbers? For $1 - 2 + 4$: first $1 - 2 = -1$, then $-1 + 4 = 3$, so the first element is 3; and $2 + 3 + 5 = 5 + 5 = 10$. Perfect: now we get the confirmation that indeed $(\vec{a} + \vec{b}) + \vec{c} = \vec{a} + (\vec{b} + \vec{c}) = \vec{a} + \vec{b} + \vec{c} = [3, 10]^T$.

Let's also quickly look into the addition-of-zero-vector and subtracting-a-vector-from-itself properties; the detailed worked examples of these I will leave to you. For the property $\vec{a} + \vec{0} = \vec{0} + \vec{a} = \vec{a}$: say $\vec{a} = [2, 3]^T$, and we add to it a zero vector of the same size, $[0, 0]^T$. You can see this is the same as adding zeros onto those values: we get $[2 + 0, 3 + 0]^T = [2, 3]^T$. There we go: we see very quickly that it doesn't really matter whether we add a zero vector to the original vector $\vec{a}$ or not; in all cases, adding a zero vector has no effect. And, knowing from the commutative property that $\vec{a} + \vec{b} = \vec{b} + \vec{a}$, if $\vec{a} + \vec{0} = \vec{a}$, then $\vec{0} + \vec{a}$ will be the same; so we have basically quickly proven all of this. Now, as for subtracting a vector from itself, I think this is a very nice one, just to see how we take the same vector, subtract it from itself, and get zero; this is very similar to working with plain real numbers. In the same way that $3 - 3 = 0$, if we have a vector consisting of scalars, like $\vec{a} = [2, 3]^T$, and we take this $\vec{a}$ and subtract it from itself, $\vec{a} - \vec{a}$, then we get $[2 - 2, 3 - 3]^T = [0, 0]^T$: the zero vector.
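To double-check all four properties numerically, a minimal NumPy sketch (my addition), using the same example vectors:

```python
import numpy as np

a = np.array([1, 2])
b = np.array([-2, 3])
c = np.array([4, 5])
zero = np.zeros_like(a)

print(np.array_equal(a + b, b + a))              # commutativity: True
print(np.array_equal((a + b) + c, a + (b + c)))  # associativity: True
print(np.array_equal(a + zero, a))               # adding the zero vector: True
print(np.array_equal(a - a, zero))               # subtracting a vector from itself: True
```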
So now that we are clear on how we can perform different operations on our vectors, and we also know the properties of adding and subtracting vectors, we are ready to move on to a bit more advanced topics. In this module we are going to discuss the idea of scalar multiplication: we will look into examples of what happens and how we can do vector multiplication with a scalar. Then we are going to look into the span of vectors, what it means to have a span of vectors, the idea of a linear combination, and the relationship between the span, linear combinations, and the unit vectors. We are then going to look into an application of scalar-vector multiplication in an audio-scaling example, and finally we are going to finish off this module by looking into the length of a vector and the dot product, going back to this idea of distance and understanding vector magnitude and vector length. Before we look into span and linear combinations, I quickly want to cover the idea of scalar multiplication and its specific definition.

Formally, scalar multiplication involves multiplying each component of a vector by a scalar value, effectively scaling the vector's magnitude. What do I mean here? Let's write it in general terms to keep everything general: say we have a vector $\vec{a}$ from n-dimensional space, $\vec{a} \in \mathbb{R}^n$, which can be represented by $[a_1, a_2, \ldots, a_n]^T$, with its magnitude and direction, and I want to scale this vector, for which I know the magnitude and direction, by a scalar. We learned before that a scalar is just a number; I will refer to it by c, and $c \in \mathbb{R}$, which means it is a real number. So what do I mean by scalar multiplication? I mean that I want to find $c \cdot \vec{a}$; this is what we mean by a scalar multiplying a vector. What does the definition say? When we multiply a scalar with a vector, it involves multiplying each component of the vector by the scalar value. Translated to this specific example, this quantity equals taking c and multiplying it with each element of the vector, and the components of my vector are $a_1, a_2$, up to $a_n$. So the result of the scalar multiplication is
$c\vec{a} = [c \cdot a_1,\; c \cdot a_2,\; \ldots,\; c \cdot a_n]^T$,
and of course the number of elements does not change: the number of rows of my vector stays n, and the number of columns is the same, still a column vector with one column. What we see here is that we go from $a_1$ to $c \cdot a_1$, from $a_2$ to $c \cdot a_2$, up to $a_n$, which transforms into $c \cdot a_n$: I keep all the elements from the vector, and what I do is multiply every element by the scalar c. This is exactly what the definition says.

Let's go ahead and do a hands-on example with some real numbers, to have this definition very clear in our minds, because we are going to make use of this fundamental operation, scalar multiplication, on and on in the upcoming lectures and, in general, in your journey through any applied science. In this example of scalar multiplication, we want to multiply a vector $\vec{c}$ by a scalar: the vector is defined by the letter c, with the arrow on top indicating that it is a vector, and the scalar is referred to by the letter k. We want to perform scalar multiplication, which means computing $k \cdot \vec{c}$ with $k = -2$ and $\vec{c} = [4, -3]^T$. How can we do that? We just learned what we need to do: take the scalar and multiply it with each element of $\vec{c}$, so $-2 \cdot 4$ and then $-2 \cdot (-3)$.
so– it goes away it becomes a plus and the 2 * 3 is 6 so my end result the K * C is equal to – 8 6 this is my final result so let’s quickly also do yet another example and this one is a unique one because it’s relating to this idea of U multiplying something with a zero uh which is something that we also uh know from our high school that when we multiply number let’s say seven by zero we’re getting zero and here in this example exle the uh problem is describe the effect of a scalar multiplication by zero on any Vector which means what we are doing is that in this example is we want to know what is this result of any Vector let’s say Vector uh C so we will use the same example C only this time instead of multiplying it with the scalar K = to minus 2 our scalar will be zero which means that c is equal to 4 – 3 and then K is now equal to zero and we want to find out what is this K * C let me actually write down the K with a different color k is equal to zero so what we want to find out is K times and then C and this is that equal to zero so I’m taking the K 0 times then I’m taking each of the elements of C which is four and then minus 3 and I know that when multiplying the number with a zero it gives me zero which means that I end up with 0 here 0 * 4 is 0 0 * – 3 is also zero so I end up with a zero Vector now this gives me an idea already that I can make a general conclusion that independent of the type of vector that I have independent what are these values in my C uh if I have any Vector C and I’m multiplying it with zero then this will always give me a vector of zero because all the members of this final Vector will be just zeros so if for instance the C comes from uh let’s say r n so it has n different elements it comes from uh n dimensional space then my final result of 0 time C so this zero Vector this one so zero that this one will come also from our end so you will be having a vector so zero time c will then be equal to 0 0 blah blah blah blah zero so n time zeros so this is then the idea of multiplying so scaling a vector with zero and this is our example two all right so let’s now move on on to our application of scalar Vector multiplication and then after this we will go back to this idea of linear combination and dispense so in this specific application we have a scalar Vector multiplication and we are looking into application of audio scaling so the scalar Vector multiplication audio processing uh this can change the volume for instance of an audio signal without altering its content so um you might have noticed that um when uh when you are listening to video you can simply increase the volume of that video or decrease it but you will notice that the content doesn’t change you are just increasing the volume or decreasing it even on the TV when you are watching a show you are increasing The Voice or decreasing now what you’re basically doing behind and this is super interesting is that behind the scenes what is happening is that there is simply um audio that um contains that show and audio of that show is being multiplied with a scaler and that scale is simply the volume scale if you scale it in such way that you want to decrease the volume so the audio will then have a lower volume then you are simply multiplying it uh your vector containing the audio information in such way that those newer volume indications they will be they will be containing lower numbers hope this makes sense let’s look into the example this make uh this will definitely clear this out so um let’s let 
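If you want to check both examples with a quick computation, a minimal NumPy sketch might look like this (the names `c` and `k` simply mirror the letters above; the lecture itself contains no code):

```python
import numpy as np

c = np.array([4, -3])   # the vector c from the example
k = -2                  # the scalar k

print(k * c)    # [-8  6]: every component is multiplied by k
print(0 * c)    # [0 0]: scaling any vector by zero gives the zero vector
```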
Let's now move on to an application of scalar-vector multiplication, and after this we will come back to the idea of linear combinations and span. The application is audio scaling: in audio processing, scalar-vector multiplication can change the volume of an audio signal without altering its content. You may have noticed that when you increase or decrease the volume of a video, or of a show on TV, the content doesn't change; only the loudness does. What happens behind the scenes, and this is super interesting, is simply that the audio of that show is multiplied by a scalar, the volume scale. If you scale it so as to decrease the volume, the vector containing the audio information ends up holding smaller numbers, and the audio plays at a lower volume. Let's look at the example; that will definitely clear this up.

Assume we have a vector $a$ representing the audio signal, and we want to multiply $a$ by a scalar $b$ to adjust the volume. So $b \in \mathbb{R}$ is a real number, while $a$ is a vector; since the example doesn't specify, I'll assume $a \in \mathbb{R}^n$, so think of $a = [a_1, a_2, \ldots, a_n]^T$, where each value represents an amount of the audio signal of your video or show. Suppose, as in the example, $b = 1/2$, which is effectively saying $b = 0.5$, or $b = -1/2$, which is $-0.5$. Then $b \cdot a$, multiplying our scalar $b$ by the vector containing the audio signal, is perceived as the same audio signal but at a lower volume. Why lower? Because when you take all the elements of $a$ and multiply them by a number whose magnitude is smaller than one, in this case $0.5$, all those numbers shrink, which means your audio volume decreases as well.

Let me show a concrete example. Say our talk show is very short and the audio variation is low, so the vector is quite small: $a = [3, 6, 5]^T \in \mathbb{R}^3$, a $3 \times 1$ vector, and our volume-adjustment scalar is $b = 0.5$. Multiplying $b$ by the audio signal means multiplying every element of $a$ by $0.5$ (or, if you like, by $1/2$): $3 \cdot 0.5 = 1.5$, $6 \cdot 0.5 = 3$, and $5 \cdot 0.5 = 2.5$, so $b \cdot a = [1.5, 3, 2.5]^T$. Every entry of the scaled audio, $1.5$, $3$, and $2.5$, is exactly two times smaller than the corresponding value in the original audio $[3, 6, 5]^T$: $3 > 1.5$, $6 > 3$, and $5 > 2.5$. So the original audio plays at twice the volume of the scaled one. That is the idea of applying scalar multiplication to audio processing. I'll leave the other example to you: with $b = -0.5$ the entries are again half as large in magnitude, only with flipped signs, so you again end up with a lower volume than the original.

Now that we know how to perform scalar multiplication in theory, have worked through it with actual numbers, and have seen it applied in practice in this audio-processing setting, we are ready to look at its visualization. This will help us get a better understanding of what exactly happens when we scale different vectors.
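Here is the same toy computation in NumPy; a real audio signal would of course be a much longer array of samples, so this three-entry vector is just a stand-in matching the example:

```python
import numpy as np

a = np.array([3.0, 6.0, 5.0])   # toy stand-in for an audio signal vector
b = 0.5                         # volume scalar

print(b * a)      # [1.5 3.  2.5]: same "signal", half the amplitude
print(-0.5 * a)   # [-1.5 -3.  -2.5]: same magnitudes, signs flipped
```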
Assume we have a vector $a = [1, 2]^T$. Where does this vector lie? In our coordinate system, with the usual $x$-axis and $y$-axis, $a$ is the arrow from the origin to the point $(1, 2)$. Now let's say I want to scale $a$ by the scalar three, so my scalar is $k = 3$, and I want to perform the scalar multiplication $k \cdot a$. We learned that this is simply $3 \cdot [1, 2]^T = [3 \cdot 1, \; 3 \cdot 2]^T = [3, 6]^T$. Let's visualize the scaled vector too: it runs from the origin to the point $(3, 6)$. You should already see what is going on here: the short arrow is our vector $a$, the longer one is $3a$, and even visually the longer vector is simply three times the shorter one, pointing in the same direction. If you stack $a$ on top of itself three times, $a$, then $a$ again, then $a$ again, you land exactly on $3a$; scaling by three triples the length while keeping the direction. If I scaled by two instead, the arrow would reach only two-thirds as far as $3a$; scaling by one half would give just half of $a$.

Let's do yet another example to make sure we are clear on these visualizations, because we will make use of them when we look into the idea of linear combinations and span. Say we have a vector $b = [0, 3]^T$. Let's plot it: the $x$-component is $0$ and the $y$-component is $3$, so $b$ points straight up along the $y$-axis. Now let's scale our vector $b$ by the scalar $2$: $2b = [2 \cdot 0, \; 2 \cdot 3]^T = [0, 6]^T$. This is our new scaled vector: the $x$-value is still zero, so we stay on the $y$-axis, and the $y$-value is now six, so the arrow is stretched to twice the length. This is very similar to what we saw before, and also quite similar to the high-school idea of scaling a plot, like going from the line $y = x$ to the line $y = 2x$, except that here we know exactly where the vector starts and ends, a much more specific object than the infinitely many points on a line; the idea, though, stays the same.
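To reproduce roughly this picture in code, one possible matplotlib sketch is below (assuming NumPy and matplotlib are available; the styling choices are mine, not from the lecture):

```python
import numpy as np
import matplotlib.pyplot as plt

a = np.array([1, 2])
b = np.array([0, 3])

# draw a, 3a, and 2b as arrows starting at the origin
for vec, name in [(3 * a, "3a"), (a, "a"), (2 * b, "2b")]:
    plt.quiver(0, 0, vec[0], vec[1], angles="xy", scale_units="xy",
               scale=1, label=f"{name} = {vec.tolist()}")

plt.xlim(-1, 7)
plt.ylim(-1, 7)
plt.gca().set_aspect("equal")
plt.legend()
plt.show()
```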
I could do the same with other scalars: taking $0.5 \cdot b = \frac{1}{2} b$ gives half of the vector, and multiplying by minus one, $-1 \cdot b$, gives the negative version of the original vector, the same arrow pointing in the opposite direction; that vector is $-b$. This is the idea of scalar multiplication visualized in the Cartesian coordinate system, and now that we know all this, we are ready to move on to the idea of a linear combination.

Let's formally define it. A linear combination of vectors $a_1, \ldots, a_m$ using scalars $\beta_1, \ldots, \beta_m$ (which we also call beta one through beta $m$) is the vector $\beta_1 a_1 + \cdots + \beta_m a_m$, and the scalars are called the coefficients of the linear combination. Furthermore, any vector $b$ in $n$ dimensions can be expressed as a linear combination of the standard unit vectors $e_1, \ldots, e_n$, and the coefficients in that combination are the entries of $b$ itself. That is a whole bunch of information, so let's unpack it one piece at a time.

First, a word about the index $m$. So far we have been using $n$, and I deliberately switched letters to make clear that you can use any identifier. Until now $n$ described the size of a single vector; here we are no longer talking about the size of a vector but about the number of vectors, which is why I specifically didn't use the letter $n$: here $m$ is simply the number of vectors. Don't confuse this with the earlier picture where one vector $a$ had $n$ elements $a_1, a_2, \ldots, a_n$; we are now moving from one vector to multiple vectors, $m$ of them, each of which looks like that earlier vector, just with slightly more complex indexing that we have also seen before. I just wanted to mention this so we are on the same page.

Next, the scalars $\beta_1$ through $\beta_m$. It is common practice in linear algebra, and in mathematics generally, but also very much in data science, statistics, and artificial intelligence, to use $\beta$ to denote a coefficient. What do we mean by coefficient? It is just a scalar, a constant number: $\beta_1$ could be $0.5$, it could be $2$, it could be $100$; it simply describes by how much we are scaling the vector $a_1$. We have done a lot of scalar multiplication already and seen many different scalars; zero works, and so does any other number, as long as it is real, so $\beta_1$ must be a real number.
Of course, the same holds for all the other betas: we have $m$ different vectors, which means we need $m$ different scalars, because each of those vectors gets multiplied by its own corresponding coefficient; $\beta_1$ is the scalar we use to scale $a_1$, and so on. Let me write this down on a fresh page so you can keep it as a slide. A linear combination, moving from the formal definition to more practical terms, simply involves taking several vectors $a_1, a_2, \ldots, a_m$ (so $m$ is the number of vectors), scaling each of them, and adding the results:

$\beta_1 a_1 + \beta_2 a_2 + \beta_3 a_3 + \cdots + \beta_m a_m$,

where $\beta_1$ is the scalar, or coefficient, of the vector $a_1$: we perform the scalar multiplication $\beta_1 a_1$, add to it $\beta_2 a_2$ with $\beta_2$ the coefficient corresponding to $a_2$, then add $\beta_3 a_3$, and so on up to $\beta_m a_m$. Here $a_1, a_2, \ldots, a_m$ are all vectors of the same dimension, say all from $\mathbb{R}^n$, and $\beta_1, \beta_2, \ldots, \beta_m$ are all constants, scalars, real numbers belonging to $\mathbb{R}$. This sum is the linear combination of our $m$ different vectors.

Now that we are clear on that, let's also unpack the idea of coefficients. The scalars are called the coefficients of the linear combination: all the members $\beta_1, \beta_2, \ldots, \beta_m \in \mathbb{R}$ are what we refer to as coefficients, and this name is super important, because you will see it time and time again in very basic machine learning models and in other applications of linear algebra, where the end goal is very often to find these coefficients. They are the numbers that define exactly how we are combining, how we are mixing, all these different vectors: $\beta_1, \beta_2, \beta_3, \ldots, \beta_m$ can be any real numbers, and every time we choose different coefficients we end up with a different combination of the vectors. The coefficients define the end result of our linear combination.
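Since we will lean on this operation constantly, here is a small generic sketch in NumPy; the helper name `linear_combination` and the example vectors are my own illustration, not something defined in the lecture:

```python
import numpy as np

def linear_combination(betas, vectors):
    """Return beta_1*a_1 + beta_2*a_2 + ... + beta_m*a_m."""
    return sum(beta * a for beta, a in zip(betas, vectors))

a1 = np.array([1, 0])
a2 = np.array([0, 1])
a3 = np.array([2, 2])

# coefficients 0.5, 2, 100, echoing the sample values mentioned above
print(linear_combination([0.5, 2, 100], [a1, a2, a3]))   # [200.5 202. ]
```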
The definition also said that any vector $b$ in $n$ dimensions can be expressed as a linear combination of the standard unit vectors $e_1$ up to $e_n$. We already know these unit vectors: in $n$-dimensional space, $e_1 = [1, 0, 0, \ldots, 0]^T$, $e_2 = [0, 1, 0, \ldots, 0]^T$, and so on up to $e_n = [0, 0, \ldots, 0, 1]^T$, so this is nothing new. What this part of the definition says is that any vector $b \in \mathbb{R}^n$ can be represented by linearly combining these unit vectors; and the final part of the definition says that the coefficients in this combination are the entries of $b$ itself. We are going to go through each part of this definition one by one, with examples and visualizations; for now I just wanted to unpack all the pieces before moving on to step-by-step examples and explanation. So first let's get a good understanding of what a linear combination is, then we will also formally define the idea of span, and after that we will move on to representing any vector $b$ in $n$-dimensional space using the standard unit vectors $e_1$ up to $e_n$, with the entries of $b$ as coefficients.

Let's start with the first one. Assume we have two different vectors: a vector $a = [1, 2]^T$, which we can plot in our coordinate space, and a vector $b = [0, 3]^T$, which points straight up. Now I want to create a linear combination of $a$ and $b$. We just learned from the formal definition that to do so I need a coefficient $\beta_1$ to multiply the vector $a$ and a coefficient $\beta_2$ corresponding to my second vector $b$, where $\beta_1, \beta_2 \in \mathbb{R}$ are real numbers, and the linear combination of the two is $\beta_1 a + \beta_2 b$. Let's look into a few examples of linear combinations of $a$ and $b$ depending on the choice of the coefficients.

Example one: $\beta_1 = 0$ and $\beta_2 = 0$. What is the linear combination of $a$ and $b$ when both coefficients are zero? It is $0 \cdot a + 0 \cdot b = [0 \cdot 1, \; 0 \cdot 2]^T + [0 \cdot 0, \; 0 \cdot 3]^T = [0, 0]^T + [0, 0]^T = [0, 0]^T$, so we get the zero vector; this point is a linear combination of the two vectors, albeit a very basic one. Let's look at a second example where the coefficients are nonzero real numbers.
In this example I will take $\beta_1 = 3$ and, deliberately, $\beta_2 = -2$, because that choice will let me cancel one of the components and get a zero in the result, as I will show in a moment. The linear combination of $a$ and $b$ using these coefficients is

$3 \cdot [1, 2]^T + (-2) \cdot [0, 3]^T = [3, 6]^T + [0, -6]^T$,

since $3 \cdot 1 = 3$, $3 \cdot 2 = 6$, $-2 \cdot 0 = 0$, and $-2 \cdot 3 = -6$. You may have already noticed why I picked $\beta_2 = -2$: I wanted the $6$ and the $-6$ to cancel each other. Adding component-wise, $[3 + 0, \; 6 + (-6)]^T = [3, 0]^T$. So the linear combination of $a$ and $b$ with coefficients $3$ and $-2$ respectively is $[3, 0]^T$.

Let me clean this up, because I also want to visualize this idea, and summarize what we have before moving on to the plotting. If we simply take $a$ and add $b$, that too, as you might have guessed, is a linear combination: it is $1 \cdot a + 1 \cdot b$ with $\beta_1 = 1$ and $\beta_2 = 1$, and it gives $[1 + 0, \; 2 + 3]^T = [1, 5]^T$; this is our first linear combination, a basic case that doesn't need much explanation. The second was the one with $\beta_1 = \beta_2 = 0$, which gave $[0, 0]^T$. And the third was $3a + (-2)b$, which gave $[3, 0]^T$, with $\beta_1 = 3$ and $\beta_2 = -2$. So the linear combinations of these two vectors are all the possible vectors I can get by scaling the two vectors with different sorts of coefficients $\beta_1$ and $\beta_2$ and adding the results. Plotting them: with coefficients $1$ and $1$ we get the vector to the point $(1, 5)$; with both coefficients zero we get the origin; and with $3$ and $-2$ we get the vector to the point $(3, 0)$ on the $x$-axis.
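Here is a tiny NumPy check of the three combinations we just computed (an illustration of mine, not lecture code):

```python
import numpy as np

a = np.array([1, 2])
b = np.array([0, 3])

for b1, b2 in [(1, 1), (0, 0), (3, -2)]:
    print(f"{b1}*a + ({b2})*b =", b1 * a + b2 * b)

# 1*a + (1)*b  = [1 5]
# 0*a + (0)*b  = [0 0]
# 3*a + (-2)*b = [3 0]
```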
I can also take other scaled versions of $a$ and $b$ and get entirely different vectors; let me show a couple more examples. Say I keep $a$ as it is, taking $\beta_1 = 1$, but I scale my vector $b$ two times: adding $2b$ to $a$ gives me another linear combination of these two vectors. You might also recall that we said the starting point of an arrow doesn't really matter to us; what matters is that it has the same magnitude and the same direction, so when drawing these sums I am free to slide a scaled copy of $b$ around the plane. For instance, scaling $b$ with three, so $\beta_2 = 3$, and combining $3b$ with my $a$ (so $\beta_1 = 1$, $\beta_2 = 3$) gives yet another linear combination: $[1, 2]^T + [3 \cdot 0, \; 3 \cdot 3]^T = [1, 2]^T + [0, 9]^T = [1, 11]^T$. And you can go on and on: you can scale $b$ with minus one to get $-b$ and combine that, and the same holds for $a$, which you can scale further out or to the negative side.

This already gives us the idea that leads to the next point, the span. With this specific pair of vectors, we can combine and mix them using all different sorts of coefficients $\beta_1$ and $\beta_2$, and we can get any vector in $\mathbb{R}^2$. This means any vector in $\mathbb{R}^2$ can be represented using only these two vectors; note that this is not always the case, but for this specific pair it is. What I mean is: no matter what vector you give me in $\mathbb{R}^2$, so a $2 \times 1$ vector with two elements, I can use a linear combination of $a$ and $b$, that is $\beta_1 a + \beta_2 b$, to represent it. Therefore we say, and we will come back to this on the next slide too, that the span of the vectors $a$ and $b$, which is the set of all possible linear combinations of these two vectors, is equal to $\mathbb{R}^2$. Concretely: take any vector $x = [x_1, x_2]^T$, where $x_1$ and $x_2$ are any real numbers, $0$, $1$, $2$, $-100$, anything; whichever vector you give me in this two-dimensional space, I claim I can find a linear combination of $a$ and $b$ that is equal to it.
Let's actually prove that: I am going to represent an arbitrary $x = [x_1, x_2]^T$ as a linear combination of $a$ and $b$. We want $\beta_1 a + \beta_2 b = [x_1, x_2]^T$, where $\beta_1$ and $\beta_2$ are constants, real numbers. Unpacking this, $\beta_1 [1, 2]^T + \beta_2 [0, 3]^T = [x_1, x_2]^T$, which is $[\beta_1 \cdot 1, \; \beta_1 \cdot 2]^T + [\beta_2 \cdot 0, \; \beta_2 \cdot 3]^T = [\beta_1, \; 2\beta_1]^T + [0, \; 3\beta_2]^T$. We learned from our vector operations that adding two vectors means adding their corresponding elements, so this equals $[\beta_1 + 0, \; 2\beta_1 + 3\beta_2]^T$, and this should be equal to $[x_1, x_2]^T$, at least that is what I am claiming. The zero doesn't matter, so reading off the two components gives us two equations:

$\beta_1 = x_1$ and $2\beta_1 + 3\beta_2 = x_2$.

Remember that $\beta_1$ and $\beta_2$ are the two unknowns, while $x_1$ and $x_2$ are just the numbers we will get once we know the vector we want to represent. Since we already have $\beta_1 = x_1$, and the second equation has two unknowns, I will fill in the value of $\beta_1$ there: $2 x_1 + 3\beta_2 = x_2$. Now I solve this equation for the remaining unknown $\beta_2$: I keep $3\beta_2$ on the left-hand side and take everything else over to the right, so $3\beta_2 = x_2 - 2x_1$, which gives

$\beta_1 = x_1$ and $\beta_2 = \dfrac{x_2 - 2x_1}{3}$.

Perfect. So what do we get here, what is our end result, and why is it significant? Based on all this, without knowing $\beta_1$ and $\beta_2$ in advance, we found that $\beta_1$ should equal $x_1$ and $\beta_2$ should equal $(x_2 - 2x_1)/3$. This means that no matter which $x$'s you give me, so whatever vector we have in $\mathbb{R}^2$, with $x_1$ and $x_2$ just real numbers, we can always find $\beta_1$ and $\beta_2$ that represent our vector $x$ as a linear combination of these two vectors. Let me actually give you an example.
Assume we have the vector $x = [4, 3]^T$. If the claim holds, I can find real numbers $\beta_1$ and $\beta_2$ to multiply the vectors $a$ and $b$ respectively, so that their linear combination equals this $x$. Here $x_1 = 4$ and $x_2 = 3$, so based on the formulas we derived, $\beta_1 = x_1 = 4$, and $\beta_2 = (x_2 - 2x_1)/3 = (3 - 2 \cdot 4)/3 = (3 - 8)/3 = -5/3$. This means I should be able to use the coefficients $\beta_1 = 4$ and $\beta_2 = -5/3$ to represent my vector $x$ as a linear combination of $a$ and $b$. Let's verify that too, as a final step: is $4 \cdot [1, 2]^T + (-5/3) \cdot [0, 3]^T$ indeed equal to $x$? We get $[4 \cdot 1, \; 4 \cdot 2]^T + [(-5/3) \cdot 0, \; (-5/3) \cdot 3]^T = [4, 8]^T + [0, -5]^T$, since the threes cancel in the last entry, and adding gives $[4 + 0, \; 8 + (-5)]^T = [4, 3]^T$. Just to make sure we got everything right: this is exactly our $x$.

This helps us verify, and know for sure, that given any vector in two-dimensional space, whatever its $x_1$ and $x_2$ are, we can always find a pair $\beta_1, \beta_2$ ensuring that $\beta_1 a + \beta_2 b = x$, where $x, a, b \in \mathbb{R}^2$, $a = [1, 2]^T$, and $b = [0, 3]^T$. So we can represent any vector in our two-dimensional space, any point in this plane, as a linear combination of this vector $a$ and vector $b$, and that is exactly what we saw with each example above. In this specific case, with this $a$ and this $b$, we are seeing that $a$ and $b$ span $\mathbb{R}^2$. We will come to the formal definition of span, and what it means for different sorts of vectors, in a moment; but given that we just proved any vector in $\mathbb{R}^2$ is a linear combination of these two vectors, we can say, as one usually does in linear algebra, that the vectors $a$ and $b$ span $\mathbb{R}^2$.
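As a sketch of how one might verify this numerically, assuming NumPy (the formulas in the comments are the ones we just derived):

```python
import numpy as np

a = np.array([1, 2])
b = np.array([0, 3])
x = np.array([4, 3])

beta1 = x[0]                    # beta_1 = x_1
beta2 = (x[1] - 2 * x[0]) / 3   # beta_2 = (x_2 - 2*x_1) / 3

print(beta1, beta2)             # 4 -1.666...  (that is, -5/3)
print(beta1 * a + beta2 * b)    # [4. 3.]: we recover x exactly
```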
Before moving on to the concept of span that we just touched upon, I want to quickly go back to the example I promised to discuss, the one from the definition of linear combinations and unit vectors. The definition had two highlights: any vector $b$ in $n$ dimensions can be expressed as a linear combination of the standard unit vectors $e_1$ up to $e_n$, and the coefficients in this combination are the entries of $b$ itself. Let's look at an example and see what we mean by that. Take the vector $b = [-1, 3, 5]^T$; it has three entries, so it is $3 \times 1$, which means $b \in \mathbb{R}^3$. We can write $b$ as a linear combination of three vectors:

$b = -1 \cdot [1, 0, 0]^T + 3 \cdot [0, 1, 0]^T + 5 \cdot [0, 0, 1]^T$.

We already know from our work on unit vectors that, in three-dimensional space, $e_1 = [1, 0, 0]^T$, $e_2 = [0, 1, 0]^T$, and $e_3 = [0, 0, 1]^T$, and you can notice that is exactly what we have here: the first vector is $e_1$, the second is $e_2$, and the third is $e_3$, with $e_1, e_2, e_3 \in \mathbb{R}^3$. Another thing we can see is the coefficients: $-1$, $3$, and $5$; these play the role of $\beta_1$, $\beta_2$, $\beta_3$ in the convention we used when describing linear combinations. Let's check the computation, and then comment on these values: $-1 \cdot e_1 + 3 \cdot e_2 + 5 \cdot e_3 = [-1, 0, 0]^T + [0, 3, 0]^T + [0, 0, 5]^T = [-1 + 0 + 0, \; 0 + 3 + 0, \; 0 + 0 + 5]^T = [-1, 3, 5]^T$, which is indeed our vector $b$.

So we have checked that $b$ can be represented as a linear combination of the unit vectors $e_1, e_2, e_3$; now notice the other thing, which I am sure you already did: those coefficients are not randomly picked numbers; they are precisely the entries of the vector $b$ itself. That is exactly what the definition was about: any vector $b$, in this case in three-dimensional space, can be expressed as a linear combination of the standard unit vectors $e_1, e_2, e_3$, and so on, and the coefficients $\beta_1, \beta_2, \beta_3$ of that combination are the entries of $b$. The same holds in the four-dimensional case, the five-dimensional case, the $n$-dimensional case.
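A one-line check of this decomposition in NumPy might look as follows, using the rows of the identity matrix as the standard unit vectors (an illustration of mine):

```python
import numpy as np

b = np.array([-1, 3, 5])
e1, e2, e3 = np.eye(3)   # rows of the 3x3 identity: e1, e2, e3

# the entries of b are exactly the coefficients of the combination
print(b[0] * e1 + b[1] * e2 + b[2] * e3)   # [-1.  3.  5.] == b
```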
Let's write the general case down, just to make sure we are clear on this part of the definition. If $b = [b_1, b_2, \ldots, b_n]^T \in \mathbb{R}^n$, and $e_1, e_2, \ldots, e_n$ are the standard unit vectors of that $n$-dimensional space, then we can represent $b$ as the linear combination $\beta_1 e_1 + \beta_2 e_2 + \cdots + \beta_n e_n$, and what is important here is that $\beta_1, \beta_2, \ldots, \beta_n$ are not just some unknown coefficients; we already know them, because they are the entries, the elements, of the vector $b$ itself:

$b = b_1 e_1 + b_2 e_2 + \cdots + b_n e_n$, where $b_1, b_2, \ldots, b_n \in \mathbb{R}$.

So, knowing the elements of a vector, we can always describe and express it as a linear combination of the standard unit vectors. If you are wondering why this is important: in some cases, when performing different operations or working on different algorithms, it simply becomes handy to represent your vector as a linear combination of multiple vectors, and in exactly those cases you can use this property to express your $n$-dimensional vector $b$ via the standard unit vectors. Everything is then known to you: from the vector $b$ you already know the entries to use as coefficients, and you know exactly what the unit vectors are, $e_1 = [1, 0, 0, \ldots, 0]^T$, $e_2 = [0, 1, 0, \ldots, 0]^T$, up to $e_n = [0, 0, \ldots, 0, 1]^T$, each of them an $n \times 1$ vector. That is the idea behind the second part of the definition, which says that any vector $b$ in $n$-dimensional space can be expressed as a linear combination of the standard unit vectors $e_1$ up to $e_n$.

Let's now talk about another super important concept: the span of vectors. By definition, the span of a set of vectors is the set of all possible linear combinations of those vectors. If $V = \{v_1, v_2, \ldots, v_k\}$ is a set of vectors, then the span of $V$, written $\mathrm{span}(V)$ or $\mathrm{span}(v_1, v_2, \ldots, v_k)$, includes every vector that can be expressed as $c_1 v_1 + c_2 v_2 + \cdots + c_k v_k$. We briefly spoke about this concept when looking at our earlier example: you might recall the vectors $a = [1, 2]^T$ and $b = [0, 3]^T$, where we said that $\mathrm{span}(a, b) = \mathbb{R}^2$, the entire two-dimensional real space. And how did we know that? Because we proved that any vector in $\mathbb{R}^2$ could be represented as a linear combination of those two vectors: we solved the equations and saw that no matter which $x_1$ and $x_2$ we are given, specific values of $\beta_1$ and $\beta_2$ always make $\beta_1 a + \beta_2 b$ equal to $[x_1, x_2]^T$. Of course, this doesn't hold for all vectors: not for every pair of two-dimensional vectors $a$ and $b$ can we say that their span is the entire $\mathbb{R}^2$.
Therefore, to better understand this concept of span, I want to distinguish several different cases, one of which we already spoke about: the case of the vectors $a$ and $b$ above, whose span is the entire $\mathbb{R}^2$. We will also look into the span of the zero vector, the span of a single vector, and the span of perpendicular vectors; if there is time left, we will also look into the span of parallel vectors. Let's go through these cases one by one, starting with the simplest and moving to more advanced ones.

Suppose we have the zero vector, $a = [0, 0]^T$. Then no matter what scalar we use to scale it, say $c$, the product $c \cdot [0, 0]^T$ always ends up equal to $[0, 0]^T$: if $c = 0$, then $c \cdot a = [0, 0]^T$; if $c = 1$, still $[0, 0]^T$; if $c = 100$, still $[0, 0]^T$. So independent of which scalar we use, and of what linear combination we create from this vector, the result always stays at the same point of our two-dimensional space. This is completely different from what we saw before, when we could take any vector in $\mathbb{R}^2$ and represent it as a linear combination of two vectors. Scaling the zero vector changes neither the magnitude nor the direction, so the span of the zero vector is just the zero vector itself: $\mathrm{span}(\vec{0}) = \{\vec{0}\}$. By definition, the span of a set of vectors is the collection of all vectors we can reach by linear combinations, and here we always reach $[0, 0]^T$, a single vector, the same as the input.

That was the basic case; now let's move on to a slightly more advanced case, a bit more complicated than this one but itself also very easy: a single nonzero vector, say $a = [1, 2]^T$. I want to know the span of $a$, so I need to understand the collection of all possible vectors I can get by combining $a$ with different coefficients. Since I have just the single vector $a$, those are all the scalar multiples $c \cdot a$. Plotting $a$ means one step along $x$ and two along $y$. Let's tabulate a few multiples: for $c = 1$ we get $[1, 2]^T$, the vector $a$ itself; for $c = 2$, $[2, 4]^T$; for $c = 3$, $[3, 6]^T$; for $c = -1$, $[-1, -2]^T$; and for $c = -3$, $[-3, -6]^T$.
Let's plot each of those. The $c = 1$ case is the vector $a$ we already have. For $c = 2$ we reach the point $(2, 4)$; for $c = 3$, the point $(3, 6)$; for $c = -1$, the point $(-1, -2)$; and finally, for $c = -3$, the point $(-3, -6)$. You should already see what is going on here. When we have just a single vector, and that vector is not the zero vector (it has nonzero elements), then every linear combination of it is simply a scalar multiple of it, and all of those multiples lie on the same line through the origin. Essentially, you can move along the line defined by the vector $a$, but you cannot leave it: there is no scalar that produces a vector off that line, over here or over there. So this line is essentially the span of $a$: for a single nonzero vector we can write

$\mathrm{span}(a) = \{c \cdot a : c \in \mathbb{R}\}$.

We saw that whatever scalar we take, any linear combination of $a$ alone ends up being some $c \cdot a$; generalizing, the set of all possible linear combinations of $a$ is simply all real scalings of $a$, a line in the two-dimensional space. That is the span of a single nonzero vector.
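To see the collinearity numerically, here is a small sketch (my own illustration): every multiple $[x, y]^T$ of $a = [1, 2]^T$ satisfies $y = 2x$, the equation of the line they all share:

```python
import numpy as np

a = np.array([1, 2])

multiples = [c * a for c in (1, 2, 3, -1, -3)]
print(multiples)   # [1 2], [2 4], [3 6], [-1 -2], [-3 -6]

# all multiples lie on the same line through the origin: y = 2x
print(all(y == 2 * x for x, y in multiples))   # True
```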
Let's now look into the next case: finding the span of perpendicular vectors. Imagine we have two vectors $a = [1, 0]^T$ and $b = [0, 1]^T$, still in our lovely two-dimensional space. Visualizing them is quite basic: $a$ points along the $x$-axis and $b$ along the $y$-axis, and we can already see why they are perpendicular; they form a $90°$ angle, a right angle. What we want to find is $\mathrm{span}(a, b)$, and we know that the span of two vectors is the set of all possible linear combinations $c_1 a + c_2 b$ of these vectors.

One thing we can see quickly: the first term, $c_1 a$, ranges over all the scaled versions of $a$, and as we have seen time and time again, those all lie on one line. If $c_1 = 2$ I am at $(2, 0)$; if $c_1 = 3$, at $(3, 0)$; $c_1 = 4$ or $c_1 = 10$ takes me further out; and the opposite holds as well: $c_1 = -2$ puts me at $(-2, 0)$, and $c_1 = -5$ at $(-5, 0)$. So all scalar multiples of $a$ lie on the $x$-axis, and informally the span of $a$ alone is $\{c_1 a\}$, that horizontal line, whatever $c_1$ I take, $1$, $2$, $3$, $0$, $-5$, $-100$. Likewise for the second term: the scalar multiples $c_2 b$ all lie on the $y$-axis; $c_2 = 0$ is the origin, $c_2 = 2$ is $(0, 2)$, $c_2 = 5$ is $(0, 5)$, $c_2 = -5$ is $(0, -5)$, so informally the span of $b$ alone is $\{c_2 b\}$, the vertical line. I am not using formal notation here; I am just sketching the span of $a$ and the span of $b$ separately, because we are not done yet: to find the span of the perpendicular pair $a$ and $b$ we still need to combine the two.

Adding these to our plot, either term alone keeps us on one of the two axes. But when we add a multiple of $a$ to a multiple of $b$, we step off the axes: $c_1 a + c_2 b = [c_1, c_2]^T$ lands exactly at the point $(c_1, c_2)$, so by choosing the coefficients we can reach any point of the plane, here, here, or here. So what is the span of these two perpendicular vectors?
You might have already guessed:

$\mathrm{span}(a, b) = \{c_1 a + c_2 b : c_1, c_2 \in \mathbb{R}\}$,

where $c_1$ and $c_2$ are, as expected, just scalars, real numbers from $\mathbb{R}$, and $a$ and $b$ are the vectors being spanned, in this specific case $a = [1, 0]^T$ and $b = [0, 1]^T$. This expression describes the set of all possible vectors that can be formed by adding scaled versions of $a$ and $b$, and it effectively covers the entire plane, illustrating that any point in 2D space can be reached by some combination of $a$ and $b$.

Let's now move toward our final example, which we also saw as part of our definition for the span of vectors, in order to learn how one usually checks whether two vectors really span the entire space. We have two vectors, $v_1 = [1, 2]^T$ and $v_2 = [3, 4]^T$, and the example states that the span of $v_1$ and $v_2$ is all of $\mathbb{R}^2$, because any vector in $\mathbb{R}^2$ can be expressed as a linear combination of $v_1$ and $v_2$. In other words: if we can express any vector in $\mathbb{R}^2$ as a linear combination of $v_1$ and $v_2$, then we say that the span of $v_1$ and $v_2$ is the entire $\mathbb{R}^2$. So let's go ahead and prove that.

Take $x = [x_1, x_2]^T$, where $x_1$ and $x_2$ are just real numbers. We need to show that we can express coefficients $c_1$ and $c_2$ in terms of $x_1$ and $x_2$ such that, no matter which vector $x$ we are handed, whether it is $[1, 2]^T$ or $[0, 4]^T$ or $[1000, 5000]^T$, as long as its entries are real numbers we can always find a pair $c_1, c_2$ to use as coefficients in a linear combination of $v_1$ and $v_2$ equal to $x$; in that case the span of $v_1$ and $v_2$ is the entire $\mathbb{R}^2$. Keep in mind that $c_1$ and $c_2$ are the unknowns we are chasing, whereas $x_1$ and $x_2$ just describe the elements of whatever vector $x$ will be provided to us, so they will be known. The first step is to turn the linear combination $c_1 v_1 + c_2 v_2 = x$ into actual equations, by filling in the values of $v_1$ and $v_2$: $c_1 [1, 2]^T + c_2 [3, 4]^T = [x_1, x_2]^T$.
Working this out component-wise, $[c_1 \cdot 1, \; c_1 \cdot 2]^T + [c_2 \cdot 3, \; c_2 \cdot 4]^T = [c_1 + 3c_2, \; 2c_1 + 4c_2]^T$, and we are claiming this vector equals $[x_1, x_2]^T$. Moving from vectors to equations: the first element on the left must equal the first element on the right, and likewise for the second, which gives us two equations with two unknowns:

$c_1 + 3c_2 = x_1$ and $2c_1 + 4c_2 = x_2$,

where $c_1, c_2$ are the unknowns and $x_1, x_2$ are the numbers that will be provided to us as part of our vector. Let's clear the workspace and keep just these. What we want is to express $c_1$ and $c_2$, our unknowns, using $x_1$ and $x_2$, and very soon we will see why. In the first equation $c_1$ stands alone, with no scalar in front, so I will make use of that opportunity: keep $c_1$ on the left-hand side and take the rest over to the right, giving $c_1 = x_1 - 3c_2$. Slightly better: I have $c_1$ on the left and $x_1$ on the right, but there is still a $3c_2$ in there. In the second equation I want an expression containing $c_2$ only, because then I will have expressed $c_2$ purely through $x_1$ and $x_2$, the numbers that will be provided to me. So I do exactly the usual process for solving two equations with two unknowns: I take the $c_1$ from the first equation and fill it into the second, $2(x_1 - 3c_2) + 4c_2 = x_2$. Opening the parentheses, $2x_1 - 6c_2 + 4c_2 = x_2$; combining the two $c_2$ terms, $-6c_2 + 4c_2 = -2c_2$, we get $2x_1 - 2c_2 = x_2$. One step closer: this second equation no longer contains $c_1$, only $c_2$, which is great, because it indicates I can tie the unknown $c_2$ to the knowns $x_1$ and $x_2$. Let's make use of that. I want only $c_2$ on the left-hand side, so I keep $-2c_2$ there and bring $2x_1$ over to the right, $-2c_2 = x_2 - 2x_1$, being very careful with the signs here, because a mistake would mess up the entire calculation. Finally, to get rid of the $-2$, I divide both sides by $-2$.
To get rid of the $-2$ in front of $c_2$, I divide both sides by $-2$, which leaves $c_2$ entirely alone on the left without any scalar: $c_2 = \frac{x_2 - 2x_1}{-2}$. We are very close now — stay with me. This expression no longer contains any other $c$'s, which is great; and remember, $x_1$ and $x_2$ will be numbers provided to us — I just want to keep everything general for now.

One more thing to fix: the expression for $c_1$ still contains $c_2$, which is an unknown. I want to substitute this value of $c_2$ into $c_1 = x_1 - 3c_2$, so that $c_1$ gets a similar picture: $c_1$ on the left, and on the right only the knowns $x_1$ and $x_2$ — no $c_2$ or anything else.

First, let me write $c_2$ in a simpler way by pulling the minus out front: $c_2 = -\frac{x_2 - 2x_1}{2}$ — be super careful with this minus, which is why I keep the parentheses. Substituting this into the first equation, the two minus signs cancel, turning the subtraction into an addition, and we end up with:

$c_1 = x_1 + \frac{3}{2}(x_2 - 2x_1)$
$c_2 = -\frac{x_2 - 2x_1}{2}$

Awesome — we have now expressed the unknowns $c_1$ and $c_2$ using only the known numbers $x_1$ and $x_2$.
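If you want to double-check the algebra, here is a small symbolic sketch — assuming SymPy is available; this is just a verification aid on top of the lecture's derivation, not part of it:

```python
import sympy as sp

c1, c2, x1, x2 = sp.symbols("c1 c2 x1 x2")
solution = sp.solve(
    [sp.Eq(c1 + 3 * c2, x1),       # first equation:  c1 + 3*c2 = x1
     sp.Eq(2 * c1 + 4 * c2, x2)],  # second equation: 2*c1 + 4*c2 = x2
    [c1, c2],
)
print(solution[c1])  # -2*x1 + 3*x2/2, i.e. x1 + (3/2)*(x2 - 2*x1)
print(solution[c2])  # x1 - x2/2,      i.e. -(x2 - 2*x1)/2
```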
Now I am going to show that no matter which $x$ we pick, these formulas give coefficients whose linear combination of the two vectors equals that $x$. To prove that the span of $v_1$ and $v_2$ is the entire $\mathbb{R}^2$, I need to show that for any $x = [x_1, x_2]^T$ I can find the $c_1$ and $c_2$ we just calculated, such that the linear combination of $v_1$ and $v_2$ with those coefficients equals $x$.

Take a random $x$, say $x = [0, 4]^T$, so $x_1 = 0$ and $x_2 = 4$. Filling these values into our formulas: $c_1 = 0 + \frac{3}{2}(4 - 2 \cdot 0) = \frac{3}{2} \cdot 4 = 6$, and $c_2 = -\frac{4 - 2 \cdot 0}{2} = -\frac{4}{2} = -2$. So just by knowing the provided vector — its components $x_1$ and $x_2$ — I have calculated the coefficients $c_1 = 6$ and $c_2 = -2$ using our derivation.

The final step is to compute the linear combination of $v_1$ and $v_2$ with these specific coefficients: $6 \cdot [1, 2]^T + (-2) \cdot [3, 4]^T = [6, 12]^T + [-6, -8]^T = [6 - 6,\; 12 - 8]^T = [0, 4]^T$. Nice — this confirms we did everything correctly: the linear combination $6v_1 - 2v_2$ is exactly equal to our $x$.

In this way we have shown that independent of which vector $x$ we pick in $\mathbb{R}^2$, we can always calculate the corresponding coefficients $c_1$ and $c_2$ in the same way, and the linear combination of $v_1$ and $v_2$ with those coefficients will be exactly $x$. Every vector in $\mathbb{R}^2$ can therefore be expressed as a linear combination of $v_1$ and $v_2$, and this concludes our proof that the span of $v_1$ and $v_2$ is the entire $\mathbb{R}^2$.
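Here is the whole argument as a small numeric sketch, assuming NumPy is available; packaging the check this way is my own illustration of the lecture's steps. Place $v_1$ and $v_2$ as the columns of a matrix, solve for the coefficients, and verify the combination reproduces $x$:

```python
import numpy as np

v1 = np.array([1.0, 2.0])
v2 = np.array([3.0, 4.0])
x = np.array([0.0, 4.0])          # the example target vector

A = np.column_stack([v1, v2])     # columns are v1, v2, so A @ [c1, c2] = c1*v1 + c2*v2
c = np.linalg.solve(A, x)         # solvable because v1, v2 span R^2
print(c)                          # [ 6. -2.] -> c1 = 6, c2 = -2
print(np.allclose(c[0] * v1 + c[1] * v2, x))  # True: the combination equals x
```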
All right, we are very close to finishing up this unit. The next topic is linear independence, and all the important ideas we learned in the previous modules are going to become super handy for this concept. We just spoke about the idea of span: we plotted many vectors, looked at their linear combinations, and saw how to find out whether the span of multiple vectors is the entire space (for instance $\mathbb{R}^2$), just a line, or maybe only the zero vector. We have also seen the idea of unit vectors. Now we are finally ready for the very important concept of linear independence.

By definition: a set of vectors is linearly independent if no vector in the set can be written as a linear combination of the others; otherwise they are linearly dependent. Equivalently, vectors $v_1, v_2, …, v_n$ are linearly independent if and only if the only solution to the equation $c_1v_1 + c_2v_2 + … + c_nv_n = \vec{0}$ is $c_1 = c_2 = … = c_n = 0$. In other words, for a linearly independent set the equation $c_1v_1 + c_2v_2 + … + c_nv_n = \vec{0}$ has only the trivial solution, where all the $c_i$ are zero.

There is a ton of information packed into this definition, so let's unpack it. First, it is important to keep this idea of independence and dependence in mind — these are terms we use constantly in data science, artificial intelligence, and statistics, so they really matter. We have a certain condition that the vectors in our set must satisfy to be called linearly independent; otherwise we call them linearly dependent.

The first part of the definition talks about being unable to recreate any vector in the set using the remaining vectors. If we can take the remaining vectors in our set and linearly combine them — and we have already seen the definition of a linear combination — to obtain our target vector, the set is dependent. For the whole set to be linearly independent, it must hold for each vector that we cannot find coefficients building it out of the remaining vectors. What exactly I mean by "target vector" and by this linear combination will become clear in a moment — we will go through examples step by step, in detail, so that the idea of linear independence and dependence is super clear.

The second part says the vectors $v_1, v_2, …, v_n$ are linearly independent if and only if the only way to make the linear combination $c_1v_1 + … + c_nv_n$ (which you will recognize on the left-hand side) equal to the zero vector is with all coefficients equal to zero. There is no other choice of coefficients, not all zero, that produces the zero vector. We will come back to this in the next modules, where it will become even clearer, but for now keep in mind: the linear combination of linearly independent vectors can only be zero when all of the coefficients $c_1, c_2, …, c_n$ are zero.

The third part is a restatement: in a linearly independent set, the equation $c_1v_1 + c_2v_2 + … + c_nv_n = \vec{0}$ has only the trivial solution, where all the $c_i$ are zero. Why do we care about linear combinations equal to zero? Because it is a common way to analyze solutions of linear systems — something we will see in the next module, when we discuss solving linear systems and go into more advanced topics.

Finally, note the "if and only if": the condition holds from both sides. If the vectors are linearly independent, then the only combination giving zero is the trivial one; and conversely, if the only combination
that equals zero is the trivial one, then the vectors are linearly independent. If one side holds, the other holds as well.

All right, let's now look at specific examples that will make our journey in understanding linear dependence much more convenient. Say we have our coordinate system and two different vectors: $a = [2, 3]^T$ and $b = [6, 9]^T$. What we want to understand is whether these two vectors are linearly independent or linearly dependent.

One thing you can quickly notice is that $b$ looks quite similar to $a$ in terms of its scale — there is a way to recreate vector $b$ using vector $a$. If I take $a = [2, 3]^T$ and multiply it by three (a scalar multiplication), I get $3a = [3 \cdot 2,\; 3 \cdot 3]^T = [6, 9]^T$ — and that is exactly our vector $b$. So $3a = b$, which means I can recreate vector $b$ using vector $a$.

In our definition we saw that a set of vectors is linearly independent if no vector in the set can be written as a linear combination of the others. Here I can write $b$ as a linear combination of the remaining vectors: $3a + 0 \cdot b = b$, which is basically saying $3a = b$ — with just two vectors in the set, the linear combination is simply a scalar multiple. My target vector is $b$, and I can rewrite it as a linear combination of the remaining vector $a$; this means that vectors $a$ and $b$ are linearly dependent.

And of course it also works the other way around: I can take $b$ and multiply it by $\frac{1}{3}$ — a real number, so this is again just a linear combination using $b$ — giving $\frac{1}{3}b = [6/3,\; 9/3]^T = [2, 3]^T$, which is exactly what we have under $a$. So I can also rewrite vector $a$ using vector $b$. That is exactly the opposite of what the definition requires — we should not be able to write these vectors using the other ones in the set. Therefore $a$ and $b$ are not a linearly independent set; they are linearly dependent.
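Here is the same conclusion checked numerically — a minimal sketch assuming NumPy is available; the rank test it uses is a standard shortcut rather than the lecture's method. Stacking the vectors as columns, a rank below 2 means the two vectors are linearly dependent:

```python
import numpy as np

a = np.array([2.0, 3.0])
b = np.array([6.0, 9.0])

M = np.column_stack([a, b])       # the vectors as columns of a matrix
print(np.linalg.matrix_rank(M))   # 1 < 2 -> linearly dependent
print(np.allclose(3 * a, b))      # True: b really is 3 times a
```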
Before moving on to another example, I also want to visualize these vectors, to see what is going on with the span and what two linearly dependent vectors look like in $\mathbb{R}^2$. Vector $a$ goes two units along one axis and three along the other, so we know its magnitude and direction; vector $b = [6, 9]^T$ points in exactly the same direction and is simply three times as long. Both lie on one line through the origin.

And you can see that no matter how I combine these two vectors, every linear combination of them stays on that line. If I want to reach a vector that lies off the line, I can never find scalars $c_1$ and $c_2$ such that $c_1a + c_2b$ gives me that vector — there is no way to do that. That is why we say that the span of $a$ and $b$, with this particular $a$ and $b$, is just this line: the only linear combinations we can create from these vectors lie on it.

So even though I have two different vectors, effectively I only have a single direction: each vector is a scalar multiple of the other, $b = 3a$ and $a = \frac{1}{3}b$, and both are built on the same vector $[2, 3]^T$. They span a single line; we also call such vectors collinear, and they are linearly dependent.

Let's now move on to the next example — a somewhat more interesting case, where we will see linear independence. The first vector is $a = [6, 0]^T$ and the second is $b = [0, 7]^T$, and we want to see whether these two are linearly independent or not. From our definition, two vectors can only be linearly independent if we cannot rewrite one of them using the other: we cannot rewrite $a$ in terms of $b$, and we cannot rewrite $b$ in terms of $a$ — there is no scalar that turns one vector into the other, and hence no linear combination of one that yields the other.

Let's check this by trial and error. To go from $a = [6, 0]^T$ to $b = [0, 7]^T$, a single scalar would have to take 6 to 0 and, at the same time, take 0 to 7. Looking at the second component, we can see immediately that this is impossible: there is no scalar $c$ such that $c \cdot 0 = 7$, because any real number multiplied by zero stays zero. And the same problem appears just as quickly the other way around:
to scale $b = [0, 7]^T$ into $a$, we would need to go from 0 to 6 in the first component, and there is no scalar $c$ with $c \cdot 0 = 6$ either — it is simply not possible.

So there is no way to scale these vectors into each other: every scalar multiple of $b$ (including negative multiples like $-b$) stays on the vertical line, and likewise every scalar multiple of $a$ stays on the horizontal line. One more thing we can quickly see: since these two vectors are perpendicular, the span of $a$ and $b$ is the entire $\mathbb{R}^2$ — using those two directions we can recreate any other vector in the plane. And this is highly related to the idea of linear independence: since we cannot recreate $a$ using $b$, and we cannot recreate $b$ using $a$ — no linear combination exists that builds one from the other — we say that vectors $a$ and $b$ are linearly independent.

Let's now look at another example that will further clarify the concept. We have three different vectors, and you can notice we are now in $\mathbb{R}^3$: $a = [1, 0, 0]^T$, $b = [0, 1, 0]^T$, and $c = [0, 0, 1]^T$. The claim is that these three vectors are linearly independent, with the explanation that there is no way to add these vectors together with any scalar multiples to equal the zero vector unless all scalars are zero.

Before even going further, it is very quick to prove that these three vectors are linearly independent and that no one of them can be written as a linear combination of the other two. Notice that these are exactly our standard unit vectors $e_1$, $e_2$, $e_3$: each has a one in one position and zeros everywhere else. This is very similar to the previous example, because the positions of the zeros already show that we cannot recreate one vector using the others. For us to recreate $a = [1, 0, 0]^T$, we would need a linear combination $c_1 b + c_2 c = c_1[0, 1, 0]^T + c_2[0, 0, 1]^T$ equal to $a$. Looking at the first component, we would need $1 = c_1 \cdot 0 + c_2 \cdot 0$. But $c_1$ and $c_2$ must be real numbers, and there are no real numbers we can multiply with zero to end up with one — the right-hand side is always zero, so we would get $1 = 0$, which is not true. Of course, the same holds the other way around: $b$ can never be recreated from a linear combination of $a$ and $c$, and $c$ can never be recreated from a linear combination of $a$ and $b$. Given that $a$ cannot be written as a linear combination of $b$ and $c$,
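The same rank test from the earlier sketch, again assuming NumPy is available, confirms this: with the three vectors as columns, the rank equals the number of vectors, which is exactly the linear-independence condition.

```python
import numpy as np

E = np.column_stack([[1, 0, 0],   # e1
                     [0, 1, 0],   # e2
                     [0, 0, 1]])  # e3
print(np.linalg.matrix_rank(E))   # 3 == number of vectors -> linearly independent
```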
that $b$ cannot be written as a linear combination of $a$ and $c$, and that $c$ cannot be written as a linear combination of $a$ and $b$, the vectors $a$, $b$, and $c$ are linearly independent. Even stronger, you can actually go ahead and prove that the span of these three vectors is the entire $\mathbb{R}^3$ — but that is outside the scope of this example, so I will leave that to you to prove.

All right, now that we are done with that, let's move on to the last module, which is the dot product and its applications. The length of a vector is a concept we are familiar with from high school, and it is deeply related to the dot product: the dot product of a vector $v$ with itself gives the square of the length of $v$. In other words, $v \cdot v = ||v||^2$ — taking the vector $v$ and multiplying it with the same vector $v$ is simply the square of the length of $v$. So the norm expresses the length of the vector, and once we square it, that is the dot product. We are going to see this idea of the dot product a lot, especially when it comes to matrix and vector multiplication, and it will keep coming back in many applications of linear algebra — so this is a concept we really need to understand.

In two-dimensional space, say we have a vector $v$ consisting of the two elements $x$ and $y$. Then the dot product and the length are related by $v \cdot v = x^2 + y^2 = ||v||^2$. We denote the dot product simply with a dot, and the name makes sense: we are using the dot to form the product of this vector with itself. The double straight lines $||\cdot||$ are the norm notation we also spoke about before. You might recall from high school this idea of distance: with an $x$-axis and a $y$-axis, the expression $x^2 + y^2$ is the one we know from the formula of a circle, and the distance is $\sqrt{x^2 + y^2}$; once we square that distance, the square root and the square cancel out, leaving exactly $x^2 + y^2$ — the dot product.

This is highly related to the Pythagorean theorem and how we compute distances. Take a right triangle, with the right angle of $90°$ between the two legs $a$ and $b$, and $c$ the side opposite the right angle; then $c^2 = a^2 + b^2$. If I write the legs as $x$ and $y$ and the hypotenuse as $z$, then $z^2 = x^2 + y^2$ — and this is exactly what we see with the dot product too. The two are highly related: $z^2 = x^2 + y^2$, and that square is simply $z \cdot z$ — something we know from high school.
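As a quick numeric sanity check of $v \cdot v = ||v||^2$ — a minimal sketch assuming NumPy is available; the vector $[3, 4]^T$ is my own example, chosen because its length is the familiar 5:

```python
import numpy as np

v = np.array([3.0, 4.0])
print(np.dot(v, v))           # 25.0 -> x^2 + y^2
print(np.linalg.norm(v))      # 5.0  -> sqrt(x^2 + y^2), the length of v
print(np.isclose(np.dot(v, v), np.linalg.norm(v) ** 2))  # True
```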
Welcome to module one of this new unit, where we are going to talk about matrices as linear systems. These are all fundamental concepts that you will see time and time again when applying linear algebra — not only in mathematics but also in applied sciences like data science and artificial intelligence, when training machine learning models and looking at the mathematics behind them, and in different optimization techniques when solving problems using linear algebra. In this first module, as part of the foundations of linear systems and matrices, we will introduce the concept of linear systems, talk about general linear systems, see the common labeling of coefficients and the idea of indices that refer to the rows and the columns, and see the differentiation between homogeneous and non-homogeneous systems. So, without further ado, let's get started.

Linear systems form the bedrock of linear algebra modeling: thanks to advancements in linear systems and in computing, we can now solve a large number of problems in a very efficient and fast way. A general linear system can be represented as a set of $m$ equations with $n$ unknowns. In the previous unit, when we were looking into linear combinations of vectors, we saw the notation $\beta_1 a_1 + \beta_2 a_2 + \beta_3 a_3 + … + \beta_m a_m$, where the $a_i$ are vectors. We wanted to come up with linear combinations of those different vectors, we used that to get a sense of whether we were dealing with linearly independent or linearly dependent vectors, and we also commented on the span these vectors take.

Now we can represent what we had before as a bigger system — in terms of $m$ equations with $n$ unknowns:

$a_{11}x_1 + a_{12}x_2 + … + a_{1n}x_n = b_1$
$a_{21}x_1 + a_{22}x_2 + … + a_{2n}x_n = b_2$
$\vdots$
$a_{m1}x_1 + a_{m2}x_2 + … + a_{mn}x_n = b_m$

So we have $m$ different equations — you can see $b_1, b_2, …, b_m$ on the right-hand sides — and each equation contains the same $n$ unknowns, $x_1, x_2, …, x_n$. One thing that is really important to keep in mind here is the indexing: we need to focus on what $a_{ij}$ and $x_j$ mean. This is something we also slightly touched upon when discussing linear combinations of vectors, so let's now dive into this indexing: how do we index $a_{ij}$, and what are these $i$'s and $j$'s?
The $a$'s you see here are just real numbers — $a_{11}$ can be 1, $a_{12}$ can be 3, $a_{1n}$ can be 100 — and the same also holds for $b_1$, $b_2$, and $b_m$. All these values, the $a$'s and the $b$'s, are known real numbers; the only unknowns we have are $x_1, x_2, …, x_n$.

Now, about the indexing. In the first equation, the first index stays the same everywhere — it is 1 in $a_{11}, a_{12}, …, a_{1n}$ — whereas the second index does change: it grows by one, going from 1 to 2 and up to $n$. The first index is $i$, and within this equation it does not change: $i = 1$ in all cases. The second index is referred to as $j$ — this is the general way of defining the indices — and here $j = 1, 2, …, n$. So basically, within the same row the $i$ does not change, but the $j$ does. For the second equation the picture is the same except for $i$: there $i = 2$, and again $j = 1, 2, …, n$; and for the last equation $i = m$, with $j$ once more running $1, 2, …, n$.

You might notice I am looking at this from the row perspective: per equation, per row, the $i$ does not change, while the $j$ runs through the whole set $1, 2, …, n$ — because within each equation we are summing all the terms $a_{i1}x_1 + a_{i2}x_2 + … + a_{in}x_n$. Another thing you can notice is that the second index of each coefficient always matches the index of the unknown it multiplies: when $j = 1$ the coefficient multiplies $x_1$, when $j = 2$ it multiplies $x_2$, and so on. And while each coefficient carries two indices — like $a_{11}$, $a_{12}$, or $a_{1n}$ — each unknown carries just a single index, which goes from 1 to $n$.
So, for the coefficients $a_{ij}$: the index $i$ can be $1, 2, …, m$, whereas $j$ can be $1, 2, …, n$. The indices are basically used to help us keep track of which row we are in and which variable the coefficient belongs to. Knowing the second index tells us, for instance, that we are dealing with a coefficient corresponding to the first unknown $x_1$ — and that is true in the first equation and in the second one alike; in both cases the index $j$ is equal to one.

Okay, now that we are clear on that, let's also understand the high-level picture, because you will see this system of $m$ equations and $n$ unknowns appearing a lot — not only when calculating and finding the solution to a linear system, but also in very common applications such as running regression, linear regression specifically. Notice the right-hand side values $b_1, b_2, …, b_m$: their index also goes from 1, but this time up to $m$. When it comes to the rows we have $m$ rows, i.e. $m$ equations, so counting from the top we expect to see an $m$ at the bottom; whereas if we count across — imagine it like a column — the index goes from 1 to $n$. These correspond to the number of observations and the number of features that you will see in your data when doing data analysis or modeling. So keep this abbreviation in mind — $m$ equations, $n$ unknowns — because it will become very handy; and the same holds for the indexing: remember how the first index $i$ changes when we go from top to bottom, and how the second index $j$ changes when we go from left to right, through the columns. We will also see this in the upcoming slides, so we will have time to practice it.

This is what we call coefficient labeling: the coefficients $a_{ij}$ in a linear system are labeled so that the first index represents the row and the second index denotes the column. When we see $a_{ij}$, we know that $i$ refers to the row and $j$ refers to the column. This is what we use to understand where exactly in our matrix — something we will see very soon — a member is located: in which row and in which column. This systematic labeling is super important because it preserves the structure and helps us understand what each coefficient represents: which equation it belongs to and which unknown it multiplies.
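To make the labeling concrete, here is a minimal sketch of how such a system is typically stored in code — assuming NumPy is available; the example system reuses the coefficients from the span proof earlier, and the names `A` and `b` are my own choice, not from the lecture. Note that NumPy indexing is 0-based, so the entry $a_{ij}$ lives at `A[i-1, j-1]`:

```python
import numpy as np

# 2 equations (m = 2), 2 unknowns (n = 2):
#   1*x1 + 3*x2 = 0
#   2*x1 + 4*x2 = 4
A = np.array([[1.0, 3.0],
              [2.0, 4.0]])   # A[i-1, j-1] holds the coefficient a_ij
b = np.array([0.0, 4.0])     # b[i-1] holds the constant b_i

print(A.shape)                # (2, 2) -> m rows, n columns
print(A[0, 1])                # 3.0 -> a_12: first equation, second unknown
print(np.linalg.solve(A, b))  # [ 6. -2.] -> the unknowns x1, x2 (square case)
```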
Before moving on to the actual linear systems and the definition of matrices, let's quickly understand the distinction between homogeneous and non-homogeneous systems, because this will also help us understand how to solve a linear system. A system is homogeneous if all the constant terms $b_i$ are zero; otherwise it is non-homogeneous. Identifying this helps us understand the nature of the solution set and what kind of strategy we need to use to solve the problem.

Now, what do I mean by $b_i$? We just saw our system of $m$ equations with $n$ unknowns, with $b_1, b_2, …, b_m$ on the right-hand side. Finding a solution to the system means finding the set of values $x_1, x_2, …, x_n$ that solves the problem, and to know how to solve it we need to know whether $b_1$ is zero or not, whether $b_2$ is zero or not, and so on up to $b_m$. This is very similar to solving any sort of problem that contains unknowns: solving $3x = 5$ is entirely different from solving when we know that $3x = 0$. That is a super-simplified version, of course, but the idea is the same — knowing whether $b_1, b_2, …, b_m$ are all zero tells us how to approach the problem. We will see this distinction again later: whenever all of the $b$'s are zero, we say the system is homogeneous, and we need to solve a homogeneous system; otherwise — when the $b_i$ are not all zero — we are dealing with a non-homogeneous system.
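As a tiny illustration of this check in code — a sketch assuming NumPy is available, reusing the same right-hand side as above:

```python
import numpy as np

b = np.array([0.0, 4.0])                    # right-hand side b_1, ..., b_m
is_homogeneous = bool(np.allclose(b, 0.0))  # homogeneous iff every b_i is zero
print(is_homogeneous)                       # False -> a non-homogeneous system
```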
Let's now move on to the second module, which is about matrices. We are going to define the matrix and see the notation for its rows, columns, and dimensions — some of which we have already touched upon, but now we will go into depth, learn it properly, and see many examples. Then we will talk about matrix types: the identity matrix, diagonal matrices, and special types of matrices like matrices containing only zeros or only ones.

By definition, a matrix is a rectangular array of real numbers, arranged in rows and in columns. For example, an $m \times n$ matrix $A$ can be represented as follows. Every matrix is described by its rows and columns — it should always be defined by the number of rows and the number of columns, and this is super important. All the values inside are members of the matrix; they form the matrix, and we already saw the labeling $a_{ij}$, where $i$ refers to the row — those were the horizontal lines, with $i = 1, 2, 3, …, m$ — and $j$ refers to the column, with $j = 1, 2, 3, …, n$. Every entry $a_{11}, a_{12}, …, a_{1n}$ and so on is a real number. Let me write this down at a bigger scale so we can keep track of everything:

$A = \begin{bmatrix} a_{11} & a_{12} & a_{13} & \cdots & a_{1n} \\ a_{21} & a_{22} & a_{23} & \cdots & a_{2n} \\ a_{31} & a_{32} & a_{33} & \cdots & a_{3n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & a_{m3} & \cdots & a_{mn} \end{bmatrix}$

In total there are $m$ rows and $n$ columns. It is a common convention to write the whole matrix with the capital letter $A$, while its members are written with the lowercase $a$, indexed as $a_{ij}$: $i$ the row, $j$ the column. So $a_{11}$ sits in the first row and first column. Next to it is $a_{12}$ — still the first row, but the second column. Then $a_{13}$, and so on; the last element of the first row is $a_{1n}$, since we are still in the first row but in the last of the $n$ columns. In the second row, the first element is $a_{21}$ — second row, first column, so $i = 2$ and $j = 1$. Then comes $a_{22}$, then $a_{23}$, and, as we are in the second row and the last column, the row ends with $a_{2n}$.

You might have already guessed the pattern while I was writing this down: whenever you move along the elements of a single row, the row index $i$ stays the same — which logically makes sense, since we are in the same row — and only the column index $j$ is updated: column 1, column 2, column 3, all the way to column $n$. Keeping this picture of rows and columns in mind helps us understand why we write all these indices; once you practice it, it becomes natural really quickly. So the third row reads $a_{31}, a_{32}, a_{33}, …, a_{3n}$ — the row index stays the same while the column index grows gradually — and you can write the remaining rows yourself as practice. In the last row, the row index is $m$ everywhere and the column index runs $1, 2, 3, …, n$, giving $a_{m1}, a_{m2}, a_{m3}, …, a_{mn}$. The last column is very interesting too: it is the opposite of the last row, because there the column index is the same everywhere — it is $n$ — and only the row index changes, going $1, 2, 3, …, m$; which is of course logical, because within the last column,
we stay in the same column while the row changes — row 1, row 2, row 3, down to row $m$ — and that is why the bottom-right entry ends up being $a_{mn}$.

Now let's talk about this idea of $m \times n$. We said that our matrix $A$ has $m$ as its number of rows and $n$ as its number of columns. We always refer to the dimension of a matrix by these two numbers: first we write the number of rows, in this case $m$, then, as the second element, the number of columns, in this case $n$, always putting the small $\times$ in between to emphasize "$m$ by $n$". So we say the dimension of matrix $A$ is $m \times n$ — we are dealing with an $m \times n$ matrix. This is a common convention used in linear algebra and mathematics in general, but also in data science, machine learning, and artificial intelligence. The idea of dimensions is super important when it comes to multiplication — multiplying a vector by a matrix, or a matrix by a matrix — where dimensions play a central role. Keep this in mind; once we get to products it will become very handy.

Let's now look at a specific example with a simple matrix $A$ of dimension $2 \times 3$. As we just learned, $2 \times 3$ means two rows and three columns — something you can also verify very quickly here, since on a small matrix it is really easy to count: we have row 1 and row 2, and columns 1, 2, and 3. This confirms the dimensions, so we say we have a $2 \times 3$ matrix, and as usual we first write down the number of rows and then the number of columns. The elements are $1, 2, 3$ in the first row and $4, 5, 6$ in the second row:

$A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}$

This is a good exercise to verify our understanding of indices. We can write down all the elements: $a_{11} = 1$; $a_{12} = 2$, meaning first row, second column; $a_{13} = 3$, since we are in the third column; and then $a_{21} = 4$, $a_{22} = 5$, and $a_{23} = 6$. A good way to practice our understanding of indices, of the matrix structure, and of the dimension of the matrix, which in this case is $2 \times 3$.
Let's also quickly look at the formal definition of this structure, which matches exactly what we just did in our example: the rows of a matrix are the horizontal lines of entries, while the columns are the vertical lines. The vertical lines form the columns — column 1, column 2, column 3 — whereas the horizontal lines form the rows — row 1 and row 2. And the dimensions of a matrix are given by the number of rows and columns it has: an $m \times n$ matrix has $m$ rows and $n$ columns, which is something we already saw.

Let's now look at some special types of matrices. One matrix type is the identity matrix. Before, we had the unit vectors; now we have the identity matrix, and the two are quite similar. Recall our unit vectors in three dimensions: $e_1 = [1, 0, 0]^T$, $e_2 = [0, 1, 0]^T$, and $e_3 = [0, 0, 1]^T$. The identity matrix $I_n$ — a square matrix with ones on the diagonal and zeros elsewhere — is basically a matrix built using those unit vectors: here $e_1$, $e_2$, and $e_3$ give the $3 \times 3$ identity

$I_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$

You can see this is a $3 \times 3$ matrix, with three rows and three columns; on the diagonal we have all ones, and outside the diagonal everything is zero. That is the definition of the identity matrix $I_n$, where $n$ is the dimension: given that it is square, its dimension is $n \times n$, so the number of rows equals the number of columns. Do note that we form this identity matrix simply by combining the different unit vectors, like $e_1$, $e_2$, and $e_3$ here.

Let me give you another example, of much higher dimension. Say we have $I_n$ for a large $n$: a large square matrix, $n \times n$, with $n$ rows and $n$ columns. On the diagonal we have ones — from the top-left entry in the first row and first column all the way down to the last — and everything else is simply zero. In our common notation: $a_{11} = a_{22} = a_{33} = … = a_{nn} = 1$, and for all the other entries — $a_{21}$, $a_{31}$, $a_{41}$, anything that is not on the diagonal — the value is simply zero. We can also say: $a_{ij} = 1$ if $i = j$, because then the row index equals the column index and we are on the diagonal; otherwise $a_{ij} = 0$ if $i \neq j$. This is, in a nutshell, how a large identity matrix can be defined in general.
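In code the identity matrix has a standard constructor — a minimal sketch assuming NumPy is available:

```python
import numpy as np

I3 = np.eye(3)   # 3x3: ones on the diagonal, zeros everywhere else
print(I3)
print(np.array_equal(I3[:, 0], [1, 0, 0]))  # True: the first column is the unit vector e1
```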
Let's now move on to another type of matrix: the diagonal matrix. By definition, a diagonal matrix is a matrix whose off-diagonal elements are all zero. What does this mean? We just saw an example of a diagonal matrix — our identity matrix, since the identity matrix is a special case of a diagonal matrix: only on the diagonal did it have nonzero elements, while all the off-diagonal elements were zeros. Exactly the same holds for diagonal matrices in general, except that, unlike in the identity matrix, the diagonal elements no longer need to equal one — they can be any other numbers. As long as the diagonal holds elements $d_1, d_2, d_3$ (any real numbers) and all the off-diagonal elements are zero, we are dealing with a diagonal matrix. In this case we have a $3 \times 3$ diagonal matrix — three rows and three columns — where $a_{11} = d_1$, $a_{22} = d_2$, and $a_{33} = d_3$, with $d_1$, $d_2$, $d_3$ all real numbers.

For example, $D$ could have $2, 5, 6$ on the diagonal with zeros elsewhere — that is a diagonal matrix. It could also have $-3, 5, 8$ on the diagonal and zeros elsewhere: again, all the off-diagonal elements are zero, so it is diagonal. And if you are wondering what happens when we have a zero on the diagonal itself — do we still have a diagonal matrix? — that is actually a great question, and the answer is yes: the definition only requires all the off-diagonal elements to be zero. For instance, a matrix $D$ with $0, 7, 8$ on the diagonal and zeros everywhere else still satisfies the definition, so we can say that this $D$ is indeed a diagonal matrix.
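A diagonal matrix can likewise be built directly from its diagonal entries — a short sketch assuming NumPy is available, reusing the example entries above:

```python
import numpy as np

print(np.diag([2, 5, 6]))   # diagonal 2, 5, 6; all off-diagonal entries are zero
print(np.diag([0, 7, 8]))   # a zero *on* the diagonal is still a diagonal matrix
```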
Let's now look at yet another special type of matrix, called the ones matrix. By definition, a ones matrix, denoted $1_{m \times n}$ — the subscript mentions its dimension, the number of rows and the number of columns — is a matrix in which all the elements are one. This is a very useful matrix: we often use it in programming, in data science and data analytics, but also when creating data structures and designing algorithms. The idea behind the ones matrix is that, since every element is one, it can act as a placeholder: multiplying any number by one gives back that number — $a \cdot 1 = a$, $x \cdot 1 = x$ — and exactly this property is what motivates us to create and use this type of matrix. As before, we define the matrix by its dimension $m \times n$; in the example shown, $m = 2$ and $n = 3$, because we have two rows and three columns, and all the elements are equal: $a_{11} = a_{12} = a_{13} = a_{21} = a_{22} = a_{23} = 1$. That is the definition of the ones matrix. You can have a ones matrix of size $4 \times 10$, or of size $10{,}000 \times 100$, and so on — any positive whole numbers $m$ and $n$ can be used to create a large $m \times n$ ones matrix.

Let's now look at our final special type of matrix before moving on to the next module: zero matrices. Similar to the ones matrix, a zero matrix, denoted $0_{m \times n}$, is a matrix in which all the elements are the same — with the one difference that this time all the elements are equal to zero. This type of matrix also becomes very handy in programming, when creating algorithms and writing code, but usually for a slightly different purpose: we create zero matrices as placeholders, so that we can start with an empty array and gradually add values to it — for instance, inside a loop. Since zero plus a number is always that number ($0 + a = a$), once we have updated information we can add it onto the zeros and the new information lands in our structure. The zero matrix is therefore often used as a placeholder of a given dimension to which new information can always be added or updated. In this specific case we have a zero matrix with two rows and three columns, so $m = 2$ and $n = 3$.
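Both placeholders have direct constructors in code — a minimal sketch assuming NumPy is available; the `data` array is an invented example to illustrate the accumulator idea:

```python
import numpy as np

ones = np.ones((2, 3))     # the 2x3 ones matrix
zeros = np.zeros((2, 3))   # the 2x3 zero matrix
print(ones)
print(zeros)

# The accumulator idea: adding data onto a zero placeholder keeps the data,
# because 0 + a = a.
data = np.array([[1.0, 2.0, 3.0],
                 [4.0, 5.0, 6.0]])
print(np.array_equal(zeros + data, data))  # True
```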
Perfect — we are done with module two, and we are now ready to go on to the next module, which is about the core matrix operations. When it comes to matrices, we often perform matrix addition and matrix subtraction, but also scalar multiplication of a matrix (multiplying a matrix by a scalar) and matrix multiplication in general (taking two matrices and multiplying them). We are going to look at these concepts in detail and, as before, see many examples — laying the ground for the next module, which is about solving a general linear system of $m$ equations with $n$ unknowns.

To begin, let's look at the operations where we add or subtract matrices. By definition, the sum of two matrices $A$ and $B$ of the same dimensions is obtained by adding their corresponding elements — taking the element in row $i$ and column $j$ from both matrices and adding them: $(A + B)_{ij} = a_{ij} + b_{ij}$. When we add two matrices of the same size, the result is yet another matrix of that size. The same holds for the difference: by definition, the difference of two matrices $A$ and $B$ of the same dimensions is obtained by subtracting their corresponding elements, $(A - B)_{ij} = a_{ij} - b_{ij}$ — to obtain the new matrix $A - B$, we perform pairwise, element-wise subtractions, taking the element in row $i$ and column $j$ of $A$ and subtracting from it the element in row $i$ and column $j$ of $B$.

Let's now look at an example where two $3 \times 3$ matrices $A$ and $B$ are used. To obtain $A + B$ we perform element-wise additions; let's verify this one entry at a time, using the definition $(A + B)_{ij} = a_{ij} + b_{ij}$, which is just a mathematical way of saying: for each element, go to row $i$ and column $j$ of matrix $A$ and of matrix $B$, take those two elements, and add them. The result $A + B$ will have the same number of rows and the same number of columns as the two inputs — both $A$ and $B$ are $3 \times 3$, so their sum is also $3 \times 3$. Going position by position: first row — $1 + 1 = 2$, $0 + 2 = 2$, $2 + 3 = 5$; second row — $0 + 0 = 0$, $1 + 0 = 1$, $3 + 1 = 4$; third row — $0 + 1 = 1$, $0 + 2 = 2$, $1 + 3 = 4$. So

$A + B = \begin{bmatrix} 2 & 2 & 5 \\ 0 & 1 & 4 \\ 1 & 2 & 4 \end{bmatrix}$

which is exactly the result on the slide, only computed manually, one element at a time. Exactly the same idea holds for $A - B$: instead of adding, you subtract everywhere — $1 - 1$, $0 - 2$, $2 - 3$, and so on.
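Here is a quick NumPy check of the worked example — a sketch assuming NumPy is available; the entries of `A` and `B` are read off from the element-wise sums above (a reconstruction, not copied from the slide):

```python
import numpy as np

A = np.array([[1, 0, 2],
              [0, 1, 3],
              [0, 0, 1]])
B = np.array([[1, 2, 3],
              [0, 0, 1],
              [1, 2, 3]])

print(A + B)   # [[2 2 5] [0 1 4] [1 2 4]] -- matches the manual computation
print(A - B)   # subtraction works the same way, element by element
```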
Now we move to the next topic, which is scalar multiplication of a matrix. By definition, scalar multiplication of a matrix $A$ by a scalar $\alpha$ results in a new matrix in which each entry of $A$ is multiplied by $\alpha$. The idea is quite similar to scalar multiplication of vectors: we have already seen, in the lecture on vector operations, that multiplying a real number $c$ by a vector $a$ means taking all the components $a_1, a_2, \ldots, a_n$ and multiplying each of them by that same scalar $c$. That is exactly the idea behind scalar multiplication of matrices, only instead of scaling just one vector we apply this to all the rows and all the columns: a matrix is simply a collection of column vectors, so we multiply every element of every column by the scalar. Let's look at a specific example. Here we have a matrix $A$ and a scalar equal to three, so $\alpha = 3$ (you could call it $c$ or anything else). Scaling this matrix means taking each element and multiplying it by the scalar: $1 \cdot 3 = 3$, $2 \cdot 3 = 6$, $3 \cdot 3 = 9$, and $4 \cdot 3 = 12$. That is the whole idea behind scalar multiplication of a matrix. In more general terms, if we have an $m \times n$ matrix $A$, so $m$ rows and $n$ columns, and a scalar $\alpha \in \mathbb{R}$ (a real number; the letter is arbitrary), then $\alpha A$ is simply the new matrix in which every element, from $a_{11}$ across to $a_{1n}$ and down to $a_{mn}$, is replaced by $\alpha \, a_{ij}$. It is as simple as that: when you do scalar multiplication of a matrix, you take all the values, per row and per column, and multiply each of them, without exception, by that single scalar $\alpha$.
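Again as a sketch, assuming NumPy: scalar multiplication is written exactly the way it is spoken, a number times an array.

```python
import numpy as np

# Scalar multiplication from the example: every entry of A is multiplied by alpha.
alpha = 3
A = np.array([[1, 2],
              [3, 4]])
print(alpha * A)   # [[ 3  6], [ 9 12]]
```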
Now let's look at the definition of matrix multiplication. Here we are no longer multiplying a matrix by a scalar but multiplying a matrix by a matrix. The product of an $m \times n$ matrix $A$ and an $n \times p$ matrix $B$ results in an $m \times p$ matrix $C$ in which each entry $c_{ij}$ is computed as the dot product of the $i$-th row of $A$ and the $j$-th column of $B$. What does this mean? First let's unpack the dimensions: $A$ is $m \times n$ and $B$ is $n \times p$, so $A$ has $m$ rows and $n$ columns and $B$ has $n$ rows and $p$ columns. The definition then says that the product $AB$ is a matrix $C$ whose entry $c_{ij}$ is the dot product of row $i$ of $A$ with column $j$ of $B$. This part might seem a bit difficult, but once we work through an actual example and illustrate it on the general expressions for $A$, $B$, and their product, it will make much more sense. Before that, I want to refresh our memory on one thing I mentioned when discussing properties of vectors: whenever we multiply a vector by a matrix, a matrix by a matrix, or a vector by a vector, the number of columns of the first factor must equal the number of rows of the second. This is essential for matrix multiplication, and the order matters: if you have a matrix $A$ and you want to multiply it by a matrix $B$, then the number of columns of $A$ must equal the number of rows of $B$; otherwise you cannot multiply the two matrices. If your matrices do not satisfy this condition, there are some alternatives, including the idea of the transpose that we used when computing the dot product of two vectors. In programming, when we want to relate two matrices whose dimensions do not line up, we sometimes manipulate them: removing some data if that does not hurt the problem, flipping (transposing) a matrix, or applying some other operation, to ensure that the first factor's number of columns equals the second factor's number of rows. That is simply the rule you must follow if you want to multiply two matrices. All right, now let's make this concrete with the general picture. Suppose $A$ is an $m \times n$ matrix, with entries $a_{11}, a_{12}, \ldots, a_{1n}$ in the first row down to $a_{m1}, a_{m2}, \ldots, a_{mn}$ in the last row, and $B$ is an $n \times p$ matrix, so it has $n$ rows and $p$ columns and we are fine in terms of dimensions, with entries $b_{11}, b_{12}, \ldots, b_{1p}$ in the first row down to $b_{n1}, b_{n2}, \ldots, b_{np}$ in the last row.
Note that $n$ is now the number of rows of $B$, unlike for $A$, so let's not confuse the letters. To perform the multiplication and obtain the matrix $C = AB$, we proceed pairwise, row by column: for the first entry we take the first row of $A$ and compute its dot product with the first column of $B$; for the next entry we take the same row and the next column; and so on. You might already have guessed what the dimension of $C$ will be: with $A$ of size $m \times n$ and $B$ of size $n \times p$, the middle $n$ effectively disappears; the number of rows of the first matrix becomes the number of rows of the result, and the number of columns of the second matrix becomes the number of columns of the result, so $C$ has dimension $m \times p$. How do we compute it? For $c_{ij}$, meaning row $i$ and column $j$, the definition says that $c_{ij}$ is the dot product of the $i$-th row of $A$, which is $[a_{i1}, a_{i2}, a_{i3}, \ldots, a_{in}]$ (and, as always when computing a dot product, we can think of taking the transpose of this row), with the $j$-th column of $B$, which is $[b_{1j}, b_{2j}, b_{3j}, \ldots, b_{nj}]^T$; note that $B$ has $n$ rows in total, and the column index $j$ stays the same all the way down. So we always take a row from the first matrix and the corresponding column from the second matrix, in that specific order, and compute their dot product to get that specific value. Writing this amount out gives

$$c_{ij} = a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{in}b_{nj},$$

and the new matrix $C$ consists of all these entries: $c_{11}, c_{21}, \ldots, c_{m1}$ down the first column (since $C$ has $m$ rows), $c_{12}, c_{22}, \ldots, c_{m2}$ down the second, and so on up to $c_{1p}, c_{2p}, \ldots, c_{mp}$ in the last column (since $C$ has $p$ columns). This is our final matrix $C$ when multiplying matrix $A$ and matrix $B$.
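Before moving to the worked examples, here is a minimal Python sketch of this definition, assuming NumPy arrays; the function name matmul is just illustrative, and the three nested loops mirror the formula for $c_{ij}$ directly.

```python
import numpy as np

# c_ij is the dot product of row i of A with column j of B,
# so three nested loops reproduce the definition directly.
def matmul(A, B):
    m, n = A.shape
    n2, p = B.shape
    assert n == n2, "columns of A must equal rows of B"
    C = np.zeros((m, p))                       # the result is m x p
    for i in range(m):
        for j in range(p):
            for k in range(n):
                C[i, j] += A[i, k] * B[k, j]   # c_ij = sum_k a_ik * b_kj
    return C
```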
If you want to find $c_{11}$, you can simply fill in the general formula we just derived with $i = 1$ and $j = 1$; if you want $c_{mp}$, fill in $i = m$ and $j = p$. You can already see the amount of calculation needed to get all the entries of large matrices $A$ and $B$. Let's look at a simple example to clarify this. We have a matrix $A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$, with dimensions 2 by 2, and a matrix $B = \begin{bmatrix} 2 & 0 \\ 1 & 2 \end{bmatrix}$, also 2 by 2, and we want to find $C = AB$. Looking at the dimensions, I already know that $C$ is going to be 2 by 2: as I said, the number of rows of the result comes from the first matrix $A$, and the number of columns comes from the second matrix $B$. So even before doing any calculations I know the product matrix $C = AB$ will have dimension 2 by 2. Let's do the calculation to check. I expect four elements. To obtain $c_{11}$ I look at the first row of $A$ and the first column of $B$ and take the dot product: $1 \cdot 2 + 2 \cdot 1 = 2 + 2 = 4$. For $c_{12}$, the first row and this time the second column of $B$: $1 \cdot 0 + 2 \cdot 2 = 4$. I do the same for the second row, picking that row and the first column: $c_{21} = 3 \cdot 2 + 4 \cdot 1 = 6 + 4 = 10$. And for the final element $c_{22}$ I take the second row and the second column: $3 \cdot 0 + 4 \cdot 2 = 8$. This gives the 2 by 2 matrix

$$C = \begin{bmatrix} 4 & 4 \\ 10 & 8 \end{bmatrix}.$$

Let's check: 4, 4, 10, 8, exactly what we have on the slide. As you can see, the idea is to keep track, every time, of which element $c_{ij}$ I am looking for, go to the $i$-th row of the first matrix and the $j$-th column of the second matrix, and compute their dot product: the entire $i$-th row of $A$ (I don't even need to mention the column index here; it just means the whole row coming from matrix $A$) dotted with the $j$-th column coming from matrix $B$.
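A quick check of this worked example, assuming NumPy; the @ operator is NumPy's built-in matrix product.

```python
import numpy as np

# Checking the worked example: C = A @ B should be [[4, 4], [10, 8]].
A = np.array([[1, 2],
              [3, 4]])
B = np.array([[2, 0],
              [1, 2]])
print(A @ B)            # [[ 4  4], [10  8]]
print(A[0] @ B[:, 0])   # c_11 = 1*2 + 2*1 = 4: row 1 of A dotted with column 1 of B
```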
That dot product gives me $c_{ij}$: the first row with the first column gives the first element, the first row with the second column gives the second element of the first row of my matrix, and so on. I hope this makes sense; if it doesn't, make sure to reach out, because it is a very important concept. Let's also look at another example to make sure we got this right. Here we have another pair of matrices $A$ and $B$, again a simple 2 by 2 case, and we want the product, which we call $C$; we already know $C$ should be 2 by 2. For $c_{11}$ we take the dot product of the first row of $A$ with the first column of $B$: $2 \cdot 1 + 4 \cdot 5$, which is $2 + 20 = 22$. When we want $c_{12}$, so $i = 1$ and $j = 2$, the second element in the first row of our final matrix, we look at the first row of $A$ but this time the second column of $B$: $2 \cdot 3 + 4 \cdot 7$, which gives the number 34. You can do the rest of the calculations yourself; it is good practice for basic matrix multiplication. The idea is actually quite straightforward; with much bigger matrices it simply comes down to practice, so I will leave this example for you to complete. Just keep in mind: always look at the dimensions first (here 2 by 2 times 2 by 2, which already tells me the result will be 2 by 2), and for each entry make sure to look at the $i$-th row, which comes from matrix $A$, and the $j$-th column, which comes from matrix $B$; take them, compute the dot product, and you will find your final result (we might call it $K$ here, since in this example a matrix $C$ already exists). Welcome to module 4 of this course, where we talk about matrices and linear systems. In this module we are going to dive deeper into the idea of linear systems with matrices and solving linear systems using different techniques; specifically, we are going to learn the concepts behind solving linear systems with matrices, namely Gaussian elimination and Gaussian reduction. Welcome to module one of this unit. Here we are going to talk about algebraic laws for matrices. We will discuss four different properties: the commutative law for matrix addition, the associative law for matrices, the distributive law for matrices (both the left and the right version), and finally the scalar multiplication law for matrices. The algebraic laws for matrices, as in the case of real numbers and of vectors, help us perform different operations on these entities. They are very similar to the real-number and vector cases, where for instance $a + b = b + a$, or $a(b + c) = ab + ac$: these are the sorts of laws we learn in high school pre-algebra and apply to real numbers, and we know how helpful they can be. Similar laws hold for matrices.
In this case there are four laws we will discuss: the commutative law for addition, the associative law, the distributive law, and the scalar multiplication law. These laws help us carry out matrix operations and manipulate matrices algebraically, which in turn helps us solve different sorts of problems, including solving a system of linear equations. Let's start with the commutative law for matrix addition. Matrix addition is commutative, which means that $A + B = B + A$. Unlike matrix multiplication, which we saw in the previous lessons, where the order did matter and we had to ensure that the number of columns of the first matrix equals the number of rows of the second, in the case of addition the order does not matter. The one thing we do need to keep in mind is that the two matrices must have the same size: both $A$ and $B$ must have the same dimension, say $m \times n$. Beyond that, we really do not need to care whether we first take $A$ and then add $B$, or do it the other way around, first taking $B$ and then adding $A$. That is the idea behind the commutative law for matrix addition. First let's go through the laws, and then we will look at the corresponding examples; for this specific case it is also helpful to write down the general formula that makes sense of the law. Say we have an $m \times n$ matrix $A$, which we can represent in the common notation we have seen time and again, with entries running from $a_{11}$ down to $a_{m1}$ in the first column across to $a_{1n}$ down to $a_{mn}$ in the last, and an $m \times n$ matrix $B$ with entries from $b_{11}$ down to $b_{m1}$ across to $b_{1n}$ down to $b_{mn}$. The commutative law says that $A + B$ should equal $B + A$, so let's check whether that is the case, computing first one side and then the other. For the first one, remembering how we add matrices, we pick the corresponding elements and add them:

$$A + B = \begin{bmatrix} a_{11} + b_{11} & \cdots & a_{1n} + b_{1n} \\ \vdots & & \vdots \\ a_{m1} + b_{m1} & \cdots & a_{mn} + b_{mn} \end{bmatrix}.$$

This is also why it is so important that the two matrices have the same dimension: they must have exactly the same number of rows and the same number of columns, and hence the same number of elements. Now let's look at the matrix $B + A$:

$$B + A = \begin{bmatrix} b_{11} + a_{11} & \cdots & b_{1n} + a_{1n} \\ \vdots & & \vdots \\ b_{m1} + a_{m1} & \cdots & b_{mn} + a_{mn} \end{bmatrix}.$$
Here we can recall from the real numbers that $a + b = b + a$: for instance, if $a = 2$ and $b = 1$, then $a + b = 2 + 1 = 3$ and $b + a = 1 + 2 = 3$. So indeed, for real numbers, $a + b = b + a$, and making use of that property we can already state that $b_{11} + a_{11} = a_{11} + b_{11}$, and in general that $a_{ij} + b_{ij} = b_{ij} + a_{ij}$, where $i$ indexes the rows and $j$ indexes the columns in our coefficient labeling. Using this property of the real numbers, and given that all the values in the matrices are real numbers, we can quickly see that the matrix $B + A$ we just wrote down equals the matrix $A + B$, which proves that $A + B = B + A$. This is the commutative property of matrix addition.
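A numerical sanity check of this property, assuming NumPy; the matrices here are random, since the law holds for any pair of the same size.

```python
import numpy as np

# Commutativity of matrix addition, checked on two random same-size matrices.
rng = np.random.default_rng(0)
A = rng.integers(-5, 5, size=(3, 3))
B = rng.integers(-5, 5, size=(3, 3))
print(np.array_equal(A + B, B + A))   # True: A + B == B + A, entry by entry
```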
The next law is the associative law for matrices, which says that for matrix addition $(A + B) + C = A + (B + C)$. This time we go from two matrices to adding one more, a third matrix $C$, and the law says it does not matter whether we first take $A$ and $B$, add them up, and then add $C$, or first take $B$ and $C$, add them up, and then add $A$ to this sum. We will see an example of this in a bit. The second part of the associative law concerns matrix multiplication: $(AB)C = A(BC)$. Again, in terms of grouping, it does not matter whether we first multiply $A$ by $B$ and then by $C$, or first multiply $B$ by $C$ and then bring in $A$ on the left; we end up with the same result. These properties let us add or multiply matrices without worrying about how the terms are grouped. So let's take the matrices $A$, $B$, and $C$ from the slide and verify that the grouping indeed does not matter and the associative property holds. The first thing to do is compute $A + B$, and then add $C$:

$$A + B = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} + \begin{bmatrix} 5 & 6 \\ 7 & 8 \end{bmatrix} = \begin{bmatrix} 6 & 8 \\ 10 & 12 \end{bmatrix},$$

since $1 + 5 = 6$, $2 + 6 = 8$, $3 + 7 = 10$, and $4 + 8 = 12$. That is the first part; we can also call this matrix $D$, so the second part is $D + C$. Adding $C = \begin{bmatrix} 9 & 10 \\ 11 & 12 \end{bmatrix}$ gives $6 + 9 = 15$, $8 + 10 = 18$, $10 + 11 = 21$, and $12 + 12 = 24$, so we have checked that

$$(A + B) + C = \begin{bmatrix} 15 & 18 \\ 21 & 24 \end{bmatrix}.$$

Now let's go ahead and check whether this equals the second grouping, $A + (B + C)$. As before, we work in order: first this inner part, then the entire thing.

$$B + C = \begin{bmatrix} 5 & 6 \\ 7 & 8 \end{bmatrix} + \begin{bmatrix} 9 & 10 \\ 11 & 12 \end{bmatrix} = \begin{bmatrix} 14 & 16 \\ 18 & 20 \end{bmatrix},$$

since $5 + 9 = 14$, $6 + 10 = 16$, $7 + 11 = 18$, and $8 + 12 = 20$. Let's refer to this one by some other letter, say $K$, so $B + C = K$. The second step is to add $A$ to $K$, and specifically, to stay with the same order, I add $A$ from the left:

$$A + (B + C) = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} + \begin{bmatrix} 14 & 16 \\ 18 & 20 \end{bmatrix} = \begin{bmatrix} 15 & 18 \\ 21 & 24 \end{bmatrix},$$

since $1 + 14 = 15$, $2 + 16 = 18$, $3 + 18 = 21$, and $4 + 20 = 24$. Comparing the two results, 15, 18, 21, 24 in both cases, we have just proved that the first grouping equals the second, which means the order of grouping indeed does not matter and $(A + B) + C = A + (B + C)$. This calculation confirms that both sides of the equation are equal, which confirms the associative law for matrix addition; a numerical check is sketched below.
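Here is the promised check, assuming NumPy, using the same example matrices; it also covers the multiplicative version of the law, which is left as an exercise in the text.

```python
import numpy as np

# Associativity on the example matrices: grouping does not change the result.
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
C = np.array([[9, 10], [11, 12]])

print((A + B) + C)                               # [[15 18], [21 24]]
print(np.array_equal((A + B) + C, A + (B + C)))  # True, for addition
print(np.array_equal((A @ B) @ C, A @ (B @ C)))  # True, for multiplication
```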
Let's now look at the distributive law for matrices, which says that matrix addition and multiplication satisfy the distributive property: in the left distribution, $A(B + C) = AB + AC$, and in the right distribution, where the matrix multiplies from the right (hence the name), $(A + B)C = AC + BC$. You might very quickly recognize that we have exactly the same rule for real numbers: $a(b + c) = ab + ac$, obtained by opening the parentheses and multiplying through, and similarly $(a + b)c = ac + bc$; here it is the same, only in capital letters, and the right version just has the multiplying factor on the other side. Before proving this, a note: I have deliberately skipped a worked example for the associative law of matrix multiplication, $(AB)C = A(BC)$, because it is something you can verify for yourself using the same $A$, $B$, and $C$ matrices; it involves exactly the kind of matrix multiplication we are about to do here. The example that follows includes the product $AB$ anyway, so by working through it I am also covering what is needed for that exercise; it is a great way to practice the material while we focus on the slightly more complex problem. So let's now move on to proving the distributive law for matrices. We have the matrices $A$, $B$, and $C$ from the slide, and we need to show that $A(B + C) = AB + AC$; that is the first part on one side and the second part on the other, so let's calculate them separately. For the first one we compute $A(B + C)$ by first doing the addition $B + C$ and then multiplying by $A$. First,

$$B + C = \begin{bmatrix} 5 & 6 \\ 7 & 8 \end{bmatrix} + \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix} = \begin{bmatrix} 4 & 6 \\ 7 & 7 \end{bmatrix},$$

since on the diagonal of $C$ we have $-1$ and $-1$ and off the diagonal, in the lower and upper parts, we have zeros: $5 + (-1) = 4$, $6 + 0 = 6$, $7 + 0 = 7$, and $8 + (-1) = 7$. We can refer to this as matrix $D$, so now we are interested in $AD$:

$$AD = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \begin{bmatrix} 4 & 6 \\ 7 & 7 \end{bmatrix}.$$

This is a 2 by 2 times a 2 by 2, so we know the result is also 2 by 2: we pick the number of rows of the first matrix and the number of columns of the second. For the entry in the first row and first column we take the first row of $A$ and the first column of $D$ and form the dot product, $1 \cdot 4 + 2 \cdot 7$; for the second row and first column, the second row of $A$ with the first column of $D$, $3 \cdot 4 + 4 \cdot 7$; for the first row and second column, $1 \cdot 6 + 2 \cdot 7$; and for the second row and second column, the dot product of the second row of $A$ and the second column of $D$, $3 \cdot 6 + 4 \cdot 7$. Quickly calculating: $4 + 14 = 18$, $6 + 14 = 20$, $12 + 28 = 40$, and $18 + 28 = 46$, so

$$A(B + C) = \begin{bmatrix} 18 & 20 \\ 40 & 46 \end{bmatrix}.$$

Let's now go ahead and calculate the second part, $AB + AC$, which means first computing $AB$, then $AC$, and then adding them to each other.
Let me clean up some space; we keep the first result, $\begin{bmatrix} 18 & 20 \\ 40 & 46 \end{bmatrix}$, noted in a smaller format, and calculate

$$AB = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \begin{bmatrix} 5 & 6 \\ 7 & 8 \end{bmatrix}.$$

Following the same approach as in the previous calculation: for the first row and first column, the dot product $1 \cdot 5 + 2 \cdot 7$; for the second row and first column, $3 \cdot 5 + 4 \cdot 7$; for the first row and second column, $1 \cdot 6 + 2 \cdot 8$; and for the second row and second column, $3 \cdot 6 + 4 \cdot 8$. That gives $5 + 14 = 19$, $6 + 16 = 22$, $15 + 28 = 43$, and $18 + 32 = 50$ (I hope I haven't made any mistakes in the calculations), so

$$AB = \begin{bmatrix} 19 & 22 \\ 43 & 50 \end{bmatrix}.$$

Let's move ahead to the second piece of the calculation, $AC$:

$$AC = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix} = \begin{bmatrix} -1 & -2 \\ -3 & -4 \end{bmatrix},$$

since the entries are $-1 + 0$, $0 - 2$, $-3 + 0$, and $0 - 4$. As the final step,

$$AB + AC = \begin{bmatrix} 19 & 22 \\ 43 & 50 \end{bmatrix} + \begin{bmatrix} -1 & -2 \\ -3 & -4 \end{bmatrix} = \begin{bmatrix} 18 & 20 \\ 40 & 46 \end{bmatrix},$$

since $19 - 1 = 18$, $22 - 2 = 20$, $43 - 3 = 40$, and $50 - 4 = 46$. As you can see, this matrix from the second calculation is equal to the matrix we got in the first calculation, which means we have proved that for this specific example indeed $A(B + C) = AB + AC$. There we go; a quick numerical check of both distributions is sketched below.
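The promised sketch, assuming NumPy, covering both the left and the right distribution on the example matrices.

```python
import numpy as np

# The distributive law on the matrices from the example: A(B + C) == AB + AC.
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
C = np.array([[-1, 0], [0, -1]])

print(A @ (B + C))                                 # [[18 20], [40 46]]
print(np.array_equal(A @ (B + C), A @ B + A @ C))  # True: left distribution
print(np.array_equal((A + B) @ C, A @ C + B @ C))  # True: right distribution
```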
Let's now look at another law, the scalar multiplication law for matrices. It says that if we have a scalar $r$ and matrices $A$ and $B$, then $r(AB) = (rA)B = A(rB)$. Here $r$ is just a scalar, a number, while $A$ and $B$ are matrices, and what this law says is that it does not matter at which stage you bring the scalar into the matrix multiplication: you can first multiply the two matrices with each other, $A$ and then $B$, and then multiply by $r$; or take the scalar $r$, multiply it into the first matrix, and then multiply by $B$; or take the second matrix, multiply it by the scalar, and then multiply by $A$. They all result in the same matrix. Let us actually prove this, making use of our skills in matrix multiplication and scalar multiplication. I have picked a slightly more advanced example, where $A$ is 2 by 3 and $B$ is 3 by 4; in this way we train our multiplication skills for matrices and at the same time prove that the scalar multiplication law holds. So, we have

$$A = \begin{bmatrix} 1 & -1 & 2 \\ 0 & 2 & 1 \end{bmatrix}, \qquad B = \begin{bmatrix} 1 & 0 & 1 & 1 \\ 2 & 0 & 1 & 1 \\ 3 & 1 & 0 & 2 \end{bmatrix},$$

with $A$ of size 2 by 3 and $B$ of size 3 by 4 (3 by 4, not 3 by 3), and the final piece is the scalar value, $r = 2$. The first thing I am going to do is calculate the amount $r(AB)$, and for that I first need $AB$. Given that $A$ has dimension 2 by 3 and $B$ has dimension 3 by 4, I can quickly see that my dimension criterion is satisfied, the number of columns of $A$ equals the number of rows of $B$, and I also know that $AB$ is going to be a 2 by 4 matrix: from all the problems we have solved, we know to pick the number of rows of the first matrix and the number of columns of the second to get the final dimension. So let's go ahead and do the calculation, column by column. First row, first column: $1 \cdot 1 + (-1) \cdot 2 + 2 \cdot 3 = 1 - 2 + 6 = 5$. Second row, first column: $0 \cdot 1 + 2 \cdot 2 + 1 \cdot 3 = 0 + 4 + 3 = 7$. On to column number two: for the first row, $1 \cdot 0 + (-1) \cdot 0 + 2 \cdot 1 = 2$, where $2 \cdot 1$ is the only non-zero term; for the second row, $0 \cdot 0 + 2 \cdot 0 + 1 \cdot 1 = 1$. Now the third column: first row, $1 \cdot 1 + (-1) \cdot 1 + 2 \cdot 0 = 1 - 1 + 0 = 0$, because the first two terms cancel out; second row, $0 \cdot 1 + 2 \cdot 1 + 1 \cdot 0 = 2$. And for the fourth column, first row: $1 \cdot 1 + (-1) \cdot 1 + 2 \cdot 2 = 1 - 1 + 4 = 4$, where again the first two cancel out, leaving 4.
The final element is the second row with the fourth column: $0 \cdot 1 + 2 \cdot 1 + 1 \cdot 2 = 0 + 2 + 2 = 4$. So we have obtained

$$AB = \begin{bmatrix} 5 & 2 & 0 & 4 \\ 7 & 1 & 2 & 4 \end{bmatrix},$$

the 2 by 4 matrix we expected. The next step is to take the scalar $r$ and multiply it into $AB$ (keeping the colors consistent on the slide): the only thing to do is multiply each of those elements by two, ending up with a matrix of the same size, 2 by 4, where $5 \cdot 2 = 10$, $2 \cdot 2 = 4$, $0 \cdot 2 = 0$, $4 \cdot 2 = 8$, $7 \cdot 2 = 14$, $1 \cdot 2 = 2$, $2 \cdot 2 = 4$, and $4 \cdot 2 = 8$:

$$r(AB) = \begin{bmatrix} 10 & 4 & 0 & 8 \\ 14 & 2 & 4 & 8 \end{bmatrix}.$$

This is the result of the first part; we have now seen what $r(AB)$ is, so I won't write it again. Let's move on and calculate the second case, $(rA)B$: first we compute $rA$, and then we multiply the entire thing by $B$. Taking $A$ and multiplying all its elements by the scalar two gives $1 \cdot 2 = 2$, $-1 \cdot 2 = -2$, $2 \cdot 2 = 4$, $0 \cdot 2 = 0$, $2 \cdot 2 = 4$, and $1 \cdot 2 = 2$, so

$$rA = \begin{bmatrix} 2 & -2 & 4 \\ 0 & 4 & 2 \end{bmatrix}.$$

In the next step we take this matrix and multiply it by $B$. Let me keep the workspace a bit cleaner: $rA$ is 2 by 3 and $B$ is 3 by 4, which means the result will be 2 by 4, so let's go ahead and calculate it, column by column as previously. Row one, column one: $2 \cdot 1 + (-2) \cdot 2 + 4 \cdot 3 = 2 - 4 + 12 = 10$. Second row, first column: $0 \cdot 1 + 4 \cdot 2 + 2 \cdot 3 = 8 + 6 = 14$. First row, second column: $2 \cdot 0 + (-2) \cdot 0 + 4 \cdot 1 = 4$, the only term we care about. Second row, second column: $0 \cdot 0 + 4 \cdot 0 + 2 \cdot 1 = 2$. Now the third column, first row: $2 \cdot 1 + (-2) \cdot 1 + 4 \cdot 0 = 2 - 2 + 0 = 0$.
Then the second row with the third column: $0 \cdot 1 + 4 \cdot 1 + 2 \cdot 0 = 0 + 4 + 0 = 4$. For the first row and the fourth column: $2 \cdot 1 + (-2) \cdot 1 + 4 \cdot 2 = 2 - 2 + 8 = 8$. And finally, for the second row in the fourth column: $0 \cdot 1 + 4 \cdot 1 + 2 \cdot 2 = 4 + 4 = 8$. So this entire amount that we just calculated step by step is

$$(rA)B = \begin{bmatrix} 10 & 4 & 0 & 8 \\ 14 & 2 & 4 & 8 \end{bmatrix}.$$

This is the second part; let's check whether it equals the first one. We see 10, 4, 0, 8 and 14, 2, 4, 8: exactly the same matrix, which proves that indeed $r(AB) = (rA)B$. This part we have now proven, since part one equals part two; perfect. The only thing remaining is to calculate the third part and see whether it equals these matrices too. The third part says: first calculate $rB$, then multiply by $A$, so we need $A(rB)$, which means first computing the inner product with the scalar and then the entire thing. Multiplying each element of $B$ by two gives

$$rB = \begin{bmatrix} 2 & 0 & 2 & 2 \\ 4 & 0 & 2 & 2 \\ 6 & 2 & 0 & 4 \end{bmatrix},$$

and now we compute

$$A(rB) = \begin{bmatrix} 1 & -1 & 2 \\ 0 & 2 & 1 \end{bmatrix} \begin{bmatrix} 2 & 0 & 2 & 2 \\ 4 & 0 & 2 & 2 \\ 6 & 2 & 0 & 4 \end{bmatrix}.$$

This is 2 by 3 times 3 by 4, so the result should again be 2 by 4. The first amount is the dot product of the first row and the first column: $1 \cdot 2 + (-1) \cdot 4 + 2 \cdot 6 = 2 - 4 + 12 = 10$. For the second row, $0 \cdot 2 + 2 \cdot 4 + 1 \cdot 6 = 8 + 6 = 14$. First row with the second column: $1 \cdot 0 + (-1) \cdot 0 + 2 \cdot 2 = 4$. Second row, second column: $0 \cdot 0 + 2 \cdot 0 + 1 \cdot 2 = 2$. First row, third column: $1 \cdot 2 + (-1) \cdot 2 + 2 \cdot 0 = 0$. Second row, third column: $0 \cdot 2 + 2 \cdot 2 + 1 \cdot 0 = 4$. First row, fourth column: $1 \cdot 2 + (-1) \cdot 2 + 2 \cdot 4 = 8$, where the first two terms cancel out. And second row, fourth column: $0 \cdot 2 + 2 \cdot 2 + 1 \cdot 4 = 4 + 4 = 8$. Looking at this third amount, we quickly see that again we have the same matrix, with exactly the same elements. So we have also proved the third part, and we have seen that in all cases $r(AB) = (rA)B = A(rB)$.
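The same verification fits in one short sketch, assuming NumPy, using the example's $A$, $B$, and $r = 2$.

```python
import numpy as np

# The scalar multiplication law on the example: r(AB) == (rA)B == A(rB).
r = 2
A = np.array([[1, -1, 2],
              [0,  2, 1]])                        # 2 x 3
B = np.array([[1, 0, 1, 1],
              [2, 0, 1, 1],
              [3, 1, 0, 2]])                      # 3 x 4

print(r * (A @ B))                                # [[10  4  0  8], [14  2  4  8]]
print(np.array_equal(r * (A @ B), (r * A) @ B))   # True
print(np.array_equal(r * (A @ B), A @ (r * B)))   # True
```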
Now we are ready to move on to the second module in this unit, which is about determinants and their properties. We are going to look at determinants at a high level, define them, and understand why they matter and why they are important; then we will see how to calculate them, first for a 2 by 2 matrix, then for a 3 by 3 matrix, and then in general; then we will go through the properties of determinants one by one; and finally we will look at the determinant's interpretation from the geometric perspective, visualized using Python. By definition, the determinant is a scalar value that can be computed from the elements of a square matrix (the word "square" is important here) and encodes certain properties of that matrix. The determinant provides critical information about the matrix, such as whether it is invertible, and the volume scaling factor of the linear transformation it represents. So the concept of the determinant is highly related to many other concepts we have seen before. First, it is defined for a square matrix. Second, it encodes certain properties: having the determinant means having information related to the properties of the system that the matrix represents. Third, because the determinant is calculated from a matrix (we say "the determinant of a matrix"), it contains critical information about that matrix, such as whether it is invertible or not. This goes back to the concept of the inverse; we will see, once we have learned determinants, that the inverse calculation depends on the determinant, so keep in mind that the determinant tells us whether we can obtain an inverse of a matrix or not. We will look at this concept in detail in the next section. The determinant also carries information about the volume scaling factor of the linear transformation the matrix represents, which connects back to the picture of $Ax = b$: knowing the determinant, we can comment on how that linear transformation scales volumes. Let's go to the next slide to find out a bit more about determinants, and specifically how to calculate the determinant of a 2 by 2 matrix, because that case is quite straightforward. For a 2 by 2 matrix

$$A = \begin{bmatrix} a & b \\ c & d \end{bmatrix},$$

where $a$, $b$, $c$, and $d$ are all real numbers, the determinant, which we denote $\det A$ (that is the short way of writing "determinant", and inside we always write the matrix whose determinant we are computing), is equal to

$$\det A = ad - bc:$$

we take the product of the diagonal elements, $a \cdot d$, and subtract from it the product of the two remaining off-diagonal elements, $b \cdot c$. This is simply a formula you need to remember whenever you want to calculate the determinant of a 2 by 2 matrix by hand.
The calculation for larger matrices involves a bit more work. We will see shortly that the determinant of a 3 by 3 matrix relies on the determinants of 2 by 2 matrices, and the idea is that every time we increase the dimension of the problem, we fall back on the lower-dimensional case: in $\mathbb{R}^4$ we go back to $\mathbb{R}^3$, and since the 3 by 3 case relies on the underlying 2 by 2 determinants, any increase in dimension eventually comes back to this idea of using the 2 by 2 submatrices that make up the whole matrix. For $\mathbb{R}^4$, $\mathbb{R}^5$, and so on this becomes much harder to write out and do manually, which is why there are other approaches, such as the decomposition and factorization algorithms we will see at the end of this course, that can be used to calculate the determinant of a matrix of dimension higher than three. In this unit we will cover both the 2 by 2 and the 3 by 3 cases, with detailed examples of each. So without further ado, let's calculate the determinant of this specific 2 by 2 matrix. Keeping in mind the labeling $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$, for the matrix with first row $1, 2$ and second row $3, 4$ the determinant is the product of the diagonal elements, $1 \cdot 4$, minus the product of the off-diagonal elements, $2 \cdot 3$, because the definition says $\det A = ad - bc$. Calculating, $1 \cdot 4 = 4$ and $2 \cdot 3 = 6$, and $4 - 6 = -2$, so the determinant of this matrix $A$ is $-2$. Let's now practice calculating determinants on two other matrices, still in the two-dimensional setting. First, a matrix $A$ with entries $5, 6$ in the first row and $0, 1$ in the second: here $a = 5$, $b = 6$, $c = 0$, and $d = 1$, where $a$ and $d$ are the diagonal elements and $b$ and $c$ the off-diagonal ones, so $\det A = 5 \cdot 1 - 6 \cdot 0 = 5 - 0 = 5$. Next, a matrix $B$ that has ones on the diagonal and zeros off the diagonal. This matrix is sometimes referred to as $I_2$, the identity matrix, because its columns are $e_1$ and $e_2$ in two-dimensional space, and its determinant is $\det B = 1 \cdot 1 - 0 \cdot 0 = 1$. This is actually a special case, and later on we will see why it is so important that the identity matrix has a determinant, and that it is equal to one; this relationship between the determinant and the identity matrix is something we will meet again in an upcoming lesson, so keep it in mind.
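These 2 by 2 calculations are easy to mirror in code; a sketch assuming NumPy, with det2 as an illustrative helper name, compared against NumPy's general routine np.linalg.det.

```python
import numpy as np

# Determinant of a 2x2 matrix: ad - bc.
def det2(M):
    return M[0, 0] * M[1, 1] - M[0, 1] * M[1, 0]

A = np.array([[1, 2], [3, 4]])
print(det2(A))               # -2
print(np.linalg.det(A))      # -2.0 (up to floating-point rounding)
print(det2(np.eye(2)))       # 1.0: the identity matrix has determinant one
```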
Now that we are clear on how to calculate the determinant of a 2 by 2 matrix, which is a simple, straightforward calculation (take the product of the diagonal elements $a$ and $d$ and subtract from it the product of the off-diagonal elements $b$ and $c$), we are ready for a slightly more advanced calculation: the determinant, this time, of a 3 by 3 matrix. Now we increase the dimension and go from $\mathbb{R}^2$ to $\mathbb{R}^3$. By definition, given a 3 by 3 matrix $A$ with entries $a_{11}, a_{12}, a_{13}$, $a_{21}, a_{22}, a_{23}$, and $a_{31}, a_{32}, a_{33}$ (we have already seen this coefficient labeling, so it should look very familiar), the determinant of $A$, denoted $\det A$, is calculated using a formula that relies on the 2 by 2 submatrices that form $A$:

$$\det A = a_{11}\det\begin{bmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{bmatrix} - a_{12}\det\begin{bmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{bmatrix} + a_{13}\det\begin{bmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix}.$$

How does this work? We use the elements $a_{11}$, $a_{12}$, and $a_{13}$ in turn, and every time we hide part of the matrix. For the first term we take $a_{11}$ and hide the row and the column it sits in, the first row and the first column; what is left is a 2 by 2 matrix. We calculate the determinant of that 2 by 2 matrix and multiply it by the element we used to choose the hidden row and column. That forms the first term: the entry value $a_{11}$, sitting in the first row and first column, multiplied by the determinant of the remaining submatrix, and we already know how to calculate a 2 by 2 determinant (diagonal product minus off-diagonal product). Then comes the next part of the calculation, and this time we attach a minus sign: here it is plus, and here it is minus. The second step is similar, only now the element that tells us which row and column to remove is $a_{12}$, so we remove the first row and the second column, and the remaining 2 by 2 matrix, with entries $a_{21}, a_{23}$ and $a_{31}, a_{33}$, is the one whose determinant we compute and multiply by $a_{12}$. Now we also have the second term in our calculation, and we go on to the next step, with a plus sign again, using our final, third element to decide which row and which column to cross out.
For the third term that element is $a_{13}$, so we remove the first row and the last column, and the 2 by 2 matrix that remains is the one we use: its determinant multiplied by $a_{13}$. We could do the same in exactly the same manner with the second or the third row; it really depends on the values. The tip, or trick, I will give you is to always look for the zeros: wherever I see zeros (or ones), I think, "those are the values that give me the most straightforward and easy calculations", because if I have a zero as my entry, then zero times any determinant is zero, and I don't even need to calculate that determinant. So if a row contains mostly zeros, say $0, 0, 1$, it is of course a great row to pick for the target elements. If instead the first row has no zeros but the second row is, say, $0, 1, 0$, then the easiest thing is to use the second row rather than the first: the zeros become my target values, which means I only need to calculate the determinant of a single 2 by 2 submatrix, since for the other two terms I know the target elements are zero and there is nothing to compute. Let me show you what I mean by expanding along the second row instead of the first. Taking $a_{21}$ as the target, I remove its row and column and form the determinant of $\begin{bmatrix} a_{12} & a_{13} \\ a_{32} & a_{33} \end{bmatrix}$; and whereas before we had a plus here, now we need a minus, because we always interchange the signs. Then for the next element in my row, $a_{22}$, with a plus sign, I remove its row and column and multiply $a_{22}$ by the determinant of $\begin{bmatrix} a_{11} & a_{13} \\ a_{31} & a_{33} \end{bmatrix}$. And the final part, as you might have already guessed, uses $a_{23}$ with a minus sign, multiplied by the determinant of $\begin{bmatrix} a_{11} & a_{12} \\ a_{31} & a_{32} \end{bmatrix}$:

$$\det A = -a_{21}\det\begin{bmatrix} a_{12} & a_{13} \\ a_{32} & a_{33} \end{bmatrix} + a_{22}\det\begin{bmatrix} a_{11} & a_{13} \\ a_{31} & a_{33} \end{bmatrix} - a_{23}\det\begin{bmatrix} a_{11} & a_{12} \\ a_{31} & a_{32} \end{bmatrix}.$$

In this way, independent of which row I take as my leading row for the calculation (I just need to pick one row), I always get the same value for $\det A$. But choosing intelligently which row to pick will save you a lot of time and headache in the calculations, because if you are dealing with a row that contains many zeros, say $0, 0, 1$ or $1, 0, 0$, or even better $0, 0, 0$, you automatically know you only need to calculate one of the 2 by 2 determinants, or, in the all-zero case, none at all: zero times each determinant makes the whole contribution automatically zero. I hope this makes sense; it is a trick you will not often come across, but it helps you save a lot of time when calculating determinants in a 3 by 3 setting.
So we now have the definition and we know the tricks we can use, but I think it is really helpful to go ahead and solve a problem. This is the higher-level summary of the steps we just discussed: the determinant of a $3 \times 3$ matrix simply involves multiplying $a_{11}$ by the determinant of the $2 \times 2$ matrix that remains after excluding the row and column of $a_{11}$ (this part), subtracting the product of $a_{12}$ and the determinant of its respective $2 \times 2$ matrix (this part), and adding the product of $a_{13}$ and the determinant of its respective $2 \times 2$ matrix (this part). The signs alternate: you always start with plus, then minus, then plus.

So let's calculate the determinant of this matrix; before even looking at the answer, let's do it on paper. We have

$$A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix},$$

basically the numbers from 1 to 9, and we need its determinant. The first thing I see is that there are no rows or columns with zeros, which means I cannot use my trick; I will just go with the first row, which is also convenient given that its values are relatively small compared to the others. First things first, let's write down the formula:

$$\det(A) = 1 \cdot \det\begin{pmatrix} 5 & 6 \\ 8 & 9 \end{pmatrix} - 2 \cdot \det\begin{pmatrix} 4 & 6 \\ 7 & 9 \end{pmatrix} + 3 \cdot \det\begin{pmatrix} 4 & 5 \\ 7 & 8 \end{pmatrix}.$$

Let's do those calculations quickly, diagonal elements first: $\det(A) = 1(5 \cdot 9 - 8 \cdot 6) - 2(4 \cdot 9 - 7 \cdot 6) + 3(4 \cdot 8 - 7 \cdot 5) = 1(45 - 48) - 2(36 - 42) + 3(32 - 35) = -3 + 12 - 9 = 0$. Let's check: indeed we got the right answer. Perfect.
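To make this concrete, here is a minimal NumPy sketch (my own illustration, not part of the course material) that redoes this first-row expansion for the same matrix and compares the result with `np.linalg.det`:

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

def det2(m):
    # determinant of a 2x2 matrix: diagonal product minus off-diagonal product
    return m[0, 0] * m[1, 1] - m[0, 1] * m[1, 0]

# Cofactor expansion along the first row: +a11*M11 - a12*M12 + a13*M13,
# where each Mij is the 2x2 matrix left after deleting row i and column j.
d = (  A[0, 0] * det2(A[np.ix_([1, 2], [1, 2])])
     - A[0, 1] * det2(A[np.ix_([1, 2], [0, 2])])
     + A[0, 2] * det2(A[np.ix_([1, 2], [0, 1])]))

print(d)                  # 0, matching the hand calculation
print(np.linalg.det(A))   # ~0.0 up to floating-point error
```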
Now that we are clear on how to do this calculation, let's calculate yet another determinant of a $3 \times 3$ matrix, and this time I want to show you the simplified version that makes use of the trick I specified: instead of using the first row, I will use the second row. One thing to keep in mind when using this trick is that when you start from an even row (the second, fourth, or sixth), you need to flip the signs: while the first row uses $+, -, +$, the second row uses $-, +, -$. Knowing how to start, you already know how to go on: odd rows go $+, -, +, -, \dots$ and even rows go $-, +, -, +, \dots$ So this trick lets you intelligently reduce the time you spend on calculating the determinant, but it also means you need to be careful about which signs to use.

All right, let's use that technique here. I see that the first row of this matrix doesn't contain zeros, but the second row does, which tells me I can skip at least one $2 \times 2$ determinant. With

$$B = \begin{pmatrix} 1 & 2 & 3 \\ 0 & 4 & 5 \\ 1 & 0 & 6 \end{pmatrix},$$

expanding along the second row, and therefore starting with a minus:

$$\det(B) = -0 \cdot \det\begin{pmatrix} 2 & 3 \\ 0 & 6 \end{pmatrix} + 4 \cdot \det\begin{pmatrix} 1 & 3 \\ 1 & 6 \end{pmatrix} - 5 \cdot \det\begin{pmatrix} 1 & 2 \\ 1 & 0 \end{pmatrix}.$$

The first term I don't need to calculate at all, because it is multiplied by zero; this trick is all about not calculating determinants more often than necessary. So $\det(B) = 4(1 \cdot 6 - 1 \cdot 3) - 5(1 \cdot 0 - 1 \cdot 2) = 4 \cdot 3 - 5 \cdot (-2) = 12 + 10 = 22$. Let's check against the more detailed, formal derivation on the slide. One interesting thing: I calculated with my second row, and on the slide you can see the calculation with the first row, where all three $2 \times 2$ determinants are computed manually, yet we end up with the same determinant. Independent of which row you use, you will always end up with the same value, unless you have made a mistake in your calculations. You just need to keep track of the rows that contain many zeros and be careful with the starting sign: if you start with the first row, start with plus; if you start with the second row, it is minus, then plus, and so on. In my case I used the second row, therefore I started with a minus.
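The row-picking trick is easy to mechanize. Below is a small sketch (my own, not from the course) of a cofactor expansion along an arbitrary row of a $3 \times 3$ matrix, with the alternating sign written as $(-1)^{i+j}$ and the zero-skipping shortcut built in; `B` is the matrix reconstructed from the minors above:

```python
import numpy as np

def det3_along_row(M, i):
    """Determinant of a 3x3 matrix by cofactor expansion along row i
    (0-indexed), skipping zero entries, which is the time-saving trick."""
    total = 0.0
    rows = [r for r in range(3) if r != i]
    for j in range(3):
        if M[i, j] == 0:
            continue                      # zero entry: skip its minor entirely
        cols = [c for c in range(3) if c != j]
        m = M[np.ix_(rows, cols)]         # the 2x2 minor
        sign = (-1) ** (i + j)            # +,-,+ on row 0; -,+,- on row 1
        total += sign * M[i, j] * (m[0, 0] * m[1, 1] - m[0, 1] * m[1, 0])
    return total

B = np.array([[1, 2, 3],
              [0, 4, 5],
              [1, 0, 6]])

print(det3_along_row(B, 1))   # 22.0, the expansion along the second row
print(np.linalg.det(B))       # ~22.0
```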
All right, let's now move on to the properties of determinants. The first property: the determinant of an identity matrix is one. That is something we have also seen in our calculations, in the specific example where our matrix was the identity in two-dimensional space: we calculated its determinant and saw it equals one, and that was not a coincidence, because the determinant of an identity matrix is always equal to one.

The second property: swapping two rows or two columns of a matrix changes the sign of its determinant. If matrices $A$ and $B$ are exactly the same except that two rows or two columns are swapped, then you are changing the sign of the determinant, but not its absolute value. That is, if $A$ has columns $a_1, a_2, a_3$ and $B$ has columns $a_2, a_1, a_3$, then $\det(A) = -\det(B)$, or equivalently $\det(B) = -\det(A)$. That is basically the idea of this property.

The third property: if a matrix has a row or a column of zeros, its determinant is zero, even if all the other entries are nonzero. For a specific example, take a matrix whose middle column is all zeros, say $\begin{pmatrix} 1 & 0 & 2 \\ 0 & 0 & 3 \\ 1 & 0 & 3 \end{pmatrix}$: its determinant is zero. Likewise, if a matrix $B$ has an entire column of zeros, say columns $[1, 1, 1]^T$, $[0, 0, 0]^T$, $[3, 4, 5]^T$, then because of that zero vector, $\det(B) = 0$. This is actually straightforward to see from the calculations we learned: if you pick the zero column (or row) as your target, every term of the expansion is $0$ times a $2 \times 2$ determinant, with alternating plus and minus signs, and $0$ times a determinant is $0$. You are adding and subtracting a whole bunch of zeros, so whenever you have a row or a column of zeros, you already know the determinant is zero; you don't even need to do the calculations.

The final property of determinants: the determinant of a product of matrices equals the product of their determinants, $\det(AB) = \det(A)\det(B)$.

Let's quickly go through examples to ensure we are on the same page with all these properties and can prove them. Say we have an $n \times n$ identity matrix $I_n$; according to the first property, $\det(I_n) = 1$. Let's look at a specific example, the identity matrix in two-dimensional space, $I_2$ in $\mathbb{R}^{2 \times 2}$: its determinant is the product of the diagonal elements minus the product of the off-diagonal elements, $1 \cdot 1 - 0 \cdot 0 = 1 - 0 = 1$, which is actually something we did as part of my previous examples.

One thing I want to show you before moving on to the row-swapping example: when we swap two rows or two columns of a matrix $A$, we denote the resulting matrix by $A'$ and call it the manipulated version of $A$. For instance, if $A$ has columns $a, b, c$ (these are vectors) and we swap the first two, we get the matrix with columns $b, a, c$, which we refer to as $A'$. This is just a matter of notation, and we just learned as part of the properties that $\det(A') = -\det(A)$.
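A quick numeric check of the three properties just stated (identity, row swap, column of zeros), a sketch using the examples from above:

```python
import numpy as np

# Property 1: the determinant of an identity matrix is 1
print(np.linalg.det(np.eye(3)))                     # 1.0

# Property 2: swapping two rows flips the sign of the determinant
A = np.array([[1., 2.], [3., 4.]])
A_swapped = A[[1, 0], :]                            # rows interchanged
print(np.linalg.det(A), np.linalg.det(A_swapped))   # -2.0 and 2.0

# Property 3: a column of zeros forces the determinant to be 0
Z = np.array([[1., 0., 3.],
              [1., 0., 4.],
              [1., 0., 5.]])
print(np.linalg.det(Z))                             # 0.0
```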
So, if a matrix $A$ has a row or column of zeros, its determinant is zero; let's quickly verify this on a specific example. Take $A = \begin{pmatrix} 0 & b \\ 0 & d \end{pmatrix}$, where $b$ and $d$ are real numbers, and let's prove that its determinant is indeed zero. The determinant of a $2 \times 2$ matrix, as we have already seen, is the product of the diagonal elements minus the product of the off-diagonal elements: $0 \cdot d - 0 \cdot b$. A number times $0$ is $0$, and $0 - 0$ is $0$, therefore $\det(A) = 0$.

Now, for the determinant of a product of matrices, let's prove that $\det(AB) = \det(A)\det(B)$ on an example. The first thing to do is calculate $AB$; let me take a blank page. With $A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$ and $B = \begin{pmatrix} 5 & 6 \\ 7 & 8 \end{pmatrix}$, I want to show that $\det(AB) = \det(A)\det(B)$, so first I calculate the left-hand side and then the right-hand side. The product must be $2 \times 2$. Entry by entry: $1 \cdot 5 + 2 \cdot 7 = 5 + 14 = 19$; $3 \cdot 5 + 4 \cdot 7 = 15 + 28 = 43$; $1 \cdot 6 + 2 \cdot 8 = 6 + 16 = 22$; $3 \cdot 6 + 4 \cdot 8 = 18 + 32 = 50$. So

$$AB = \begin{pmatrix} 19 & 22 \\ 43 & 50 \end{pmatrix}.$$

The next step is to calculate the determinant of this product: $\det(AB) = 19 \cdot 50 - 43 \cdot 22 = 950 - 946 = 4$. Now let's quickly check the parts of the second amount: $\det(A) = 1 \cdot 4 - 2 \cdot 3 = 4 - 6 = -2$, and $\det(B) = 5 \cdot 8 - 7 \cdot 6 = 40 - 42 = -2$, so $\det(A)\det(B) = (-2)(-2) = 4$. We have just proven that $\det(AB) = 4$ is exactly the same as $\det(A) \cdot \det(B) = 4$, so the equation indeed holds.
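Here is the same $\det(AB) = \det(A)\det(B)$ check done numerically, a minimal sketch with the exact matrices used above:

```python
import numpy as np

A = np.array([[1., 2.], [3., 4.]])
B = np.array([[5., 6.], [7., 8.]])

AB = A @ B
print(AB)                                    # [[19. 22.] [43. 50.]]
print(np.linalg.det(AB))                     # ~4.0
print(np.linalg.det(A) * np.linalg.det(B))   # (-2) * (-2) = ~4.0
```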
So determinants are not just some calculations or some numbers: they are an important concept, and their interpretation is highly relevant from a geometric perspective. The determinant of a $2 \times 2$ matrix represents the area of the parallelogram formed by its column vectors, and the determinant of a $3 \times 3$ matrix represents the volume of the parallelepiped (I hope I am pronouncing that correctly) formed by its column vectors. If we have the matrix $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$, then $[a, c]^T$ is the first vector and $[b, d]^T$ is the second, and in two-dimensional space these two vectors form a parallelogram whose area equals (the absolute value of) $\det(A)$.

The determinant is the scalar value that summarizes the linear transformation we describe by this matrix: recall the linear system $Ax = b$ that we described using a coefficient matrix $A$, the unknowns $x$, and the right-hand side $b$ (if $b = 0$ we were solving the homogeneous system, otherwise the non-homogeneous one). In geometric terms, in $\mathbb{R}^2$, where the matrix $A$ basically holds two vectors, $\det(A)$ equals the area spanned by those vectors in the two-dimensional space; in a bit I will show a specific example so that we are on the same page about the parallelogram, the determinant, and the vectors that form the column space of $A$. In three-dimensional space, $\mathbb{R}^3$, with a $3 \times 3$ matrix $A$, the determinant is the volume formed by the three column vectors: unlike $\mathbb{R}^2$, we now have three vectors forming $A$, and together they enclose a region whose volume equals $\det(A)$. In 3D this is a bit harder to visualize, but the two-dimensional case will help improve our understanding of determinants and their interpretation from the geometric perspective.

Given two vectors $a$ and $b$ in two-dimensional space, the determinant of the matrix they form is, as we already know, the product of the diagonal elements minus the product of the off-diagonal elements. You will also very often see this written with absolute-value bars, which we already know from high school: the determinant can be a negative number (we have seen values like $-2$), and an area cannot be negative, therefore we add the absolute value. Knowing, for example, that the matrix $A$ has columns $[3, 2]^T$ and $[1, 4]^T$, we get $\det(A) = 3 \cdot 4 - 1 \cdot 2 = 12 - 2 = 10$, and its absolute value is $10$, given that it is positive. This is exactly what we have here, and it is referred to as the area of the parallelogram that the two vectors form.

And how does that look in the coordinate plane? This is the parallelogram we were referring to, and the area it encloses equals $|\det(A)|$. One thing we need to keep in mind is the definition of a parallelogram: opposite sides are parallel and of the same length, and this holds for both pairs of sides in the figure. That figure is what we are referring to as a parallelogram, and the two vectors you see in it are the ones forming it.
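As a numeric sanity check of the area interpretation, a short sketch with the column vectors $[3, 2]^T$ and $[1, 4]^T$ from this example:

```python
import numpy as np

# The columns [3, 2] and [1, 4] are the two vectors spanning the parallelogram.
A = np.array([[3., 1.],
              [2., 4.]])

area = abs(np.linalg.det(A))   # |3*4 - 1*2| = |10|
print(area)                    # 10.0
```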
Those two vectors are exactly the column vectors of the matrix $A$. Hence, if we have a $2 \times 2$ matrix $A$ holding two vectors, the determinant of that matrix describes the area these two vectors span when creating the parallelogram. Determinants play an important role in understanding the geometric properties of the spaces spanned by these vectors: they provide valuable insights into the scaling effect of a linear transformation, its orientation, and the locations of vectors in the coordinate system, as well as practical applications in calculating areas and volumes.

Welcome to another unit in our fundamentals of linear algebra course, where we are going to talk about advanced linear algebra concepts. In the first module we are going to talk about vector spaces and projections. We are going to define bases and go through a couple of examples; we have already touched upon this concept briefly when calculating the basis of a null space and the basis of a column space, and we are going to do a similar example here. Then we will look into the concept of standard bases for different spaces, including $\mathbb{R}^n$. We are going to introduce the concept of projections: the definition, the formula, how we can calculate it, and detailed examples. Then we are going to talk about the concept of an orthonormal basis, and understand orthogonality and normalization. We will then discuss a very important topic in linear algebra, the Gram-Schmidt process: we are going to define it, see the overview and the step-by-step process of applying the Gram-Schmidt algorithm, and then work through an example with step-by-step calculations. Finally we are going to talk about the applications of orthonormal bases, the applications of the Gram-Schmidt process, and the importance of orthonormal bases. This is module one of this part.

So let's first define the basis. A basis of a vector space is a set of linearly independent vectors that span the entire vector space; every vector in the space can be expressed as a unique linear combination of the basis vectors. There are a couple of parts in this definition that are really important. The first is that it is a set of linearly independent vectors that span the entire vector space: linear independence means that none of these vectors can be rewritten as a linear combination of the others. So for instance, if we are in $\mathbb{R}^2$, a basis of the vector space is a set of linearly independent vectors that span the entire $\mathbb{R}^2$. For us to say that a given set is the basis of a vector space, say $\mathbb{R}^2$, we first need to prove that these vectors are linearly independent, and second that they span the entire $\mathbb{R}^2$, which means that the span of these vectors equals $\mathbb{R}^2$.
We can actually be even more specific with an example of two vectors $a$ and $b$: the set $\{a, b\}$ in $\mathbb{R}^2$ forms a basis of the vector space if, as the first criterion, $a$ and $b$ are linearly independent, and as the second criterion, the two vectors together span the entire vector space, $\mathrm{span}(a, b) = \mathbb{R}^2$. Then the second part of the definition says that every vector in the space can be expressed as a unique linear combination of the basis vectors. In our specific example, once we prove that $\{a, b\}$ is indeed a basis, every vector, say a random vector $c = [c_1, c_2]^T$ from $\mathbb{R}^2$, can be represented as a linear combination of $a$ and $b$: with coefficients $k_1$ and $k_2$, we write $c = k_1 a + k_2 b$.

We have previously spoken about the null space and the column space, so let's now do one more example where we calculate the null space and the column space, then the basis of the null space and the basis of the column space, and then we will find a basis of a vector space in an $\mathbb{R}^2$ example. Given that we have already looked into the concepts of the basis of a column space and the basis of a null space, I will try to go through this example a bit more quickly to save time for more complex concepts.

Say we have the $2 \times 2$ matrix $A = \begin{pmatrix} 1 & 2 \\ 3 & 6 \end{pmatrix}$. The first thing I want to do is look into my matrix and understand whether I am dealing with column vectors that are linearly dependent or linearly independent; this kind of inspection always helps us save time when doing our calculations for the null space, the column space, and their bases. Here $a_1 = [1, 3]^T$ is the first column vector forming the matrix $A$, and one thing we can notice is that we can easily take $a_1$, multiply it by two, and get $a_2$: $1 \cdot 2 = 2$ and $3 \cdot 2 = 6$, so $2a_1 = a_2$, which means $a_1$ and $a_2$ are linearly dependent. Seeing and knowing this will help us go quickly through our calculations of the basis of the column space and the basis of the null space.

So let's first calculate the basis of the null space of $A$. We have learned that to get the null space $N(A)$ we need to solve the problem $Ax = 0$; the solutions $x$ give us the null space of $A$. We have also learned that $N(A) = N(\mathrm{RREF}(A))$, which means that using Gaussian reduction or Gaussian elimination we can quickly find the solution of $Ax = 0$. This is simply solving the familiar problem $Ax = b$, only in this case $b = 0$, because we are dealing with the homogeneous case.
I won't repeat the full calculation here: we have done a ton of examples with this step-by-step process of forming the augmented matrix of $A$ and applying the different row operations, normalizations, and eliminations, in order to bring the matrix to a basic representation from which we can either read off the solution directly or describe it as a linear combination of vectors. If you go ahead and solve this problem, you will find that the solutions of $Ax = 0$ form a single direction; normalized to unit length, it is the vector $x = [-0.894, 0.447]^T$ (that is, $[-2, 1]^T / \sqrt{5}$). This can also be good practice to refresh your memory of Gaussian elimination and reduction; the example itself is quite simple, $A$ being just a $2 \times 2$ matrix, and by performing a couple of normalization and elimination operations you can find this $x$ for $Ax = 0$.

Given that we now know the solution of $Ax = 0$, we know the null space. Let me write it down as part 1.1: the null space of $A$ is the set spanned by this single vector, $N(A) = \mathrm{span}\{[-0.894, 0.447]^T\}$. Part 1.2 is then to get the basis of this null space, and we have just seen the definition: the basis of a vector space is a set of linearly independent vectors that span the space. Given that we got this single vector as the solution direction of our homogeneous problem, any vector in $N(A)$ can be represented as a scalar multiple, a linear combination, of this vector $x$. Therefore the basis of the null space of $A$ is simply the set $\{[-0.894, 0.447]^T\}$.
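For reference, SciPy can produce the same unit-length null-space basis directly; a minimal sketch (note the sign of the returned vector may be flipped, since any scalar multiple spans the same line):

```python
import numpy as np
from scipy.linalg import null_space

A = np.array([[1., 2.],
              [3., 6.]])

N = null_space(A)   # columns form an orthonormal basis of N(A)
print(N)            # ~[[-0.894], [0.447]] (up to sign), i.e. [-2, 1]/sqrt(5)
print(A @ N)        # ~[[0.], [0.]], so it really solves Ax = 0
```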
So much for the basis of the null space; let's now quickly look into the concept of the basis of a column space. The first thing we need is the column space $C(A)$ itself, and what is it? It is the span of the column vectors of $A$, which is quite straightforward: the columns $a_1 = [1, 3]^T$ and $a_2 = [2, 6]^T$ form the column space of this matrix $A$. Now, we saw at the very beginning, before even starting our calculations, that $a_1$ and $a_2$ are linearly dependent, because $a_2$ can be written as $2a_1$: one of these vectors can be written as a linear combination of the other. This means we have just a single linearly independent vector, and this is important because we have seen in the definition that a basis requires linearly independent vectors: the basis of a vector space, in this case the column space, is a set of linearly independent vectors that span that space. So we look into the two vectors of $C(A)$ and select one that is linearly independent; say we pick $[1, 3]^T$. We then know that any vector in $C(A)$ can be written as a linear combination of $[1, 3]^T$: scaling this vector produces every vector of the column space. Therefore the basis of the column space of $A$ is the set $\{[1, 3]^T\}$: $a_1$ is linearly independent, and the span of $a_1$ is exactly $C(A)$.

Now, when it comes to a basis of the entire $\mathbb{R}^2$, one thing we can notice is that this $a_1 = [1, 3]^T$ is not spanning the entire $\mathbb{R}^2$, because we cannot write an arbitrary vector of $\mathbb{R}^2$ as a linear combination of it. Therefore we say this is the basis of the column space, but we do not say it is a basis of $\mathbb{R}^2$. And the final element in the definition that I want you to focus on is that every vector in the space must be expressible as a unique linear combination of the basis vectors. We have looked into the idea of the basis of a null space and the basis of a column space, but when it comes to the entire space, for instance a basis for $\mathbb{R}^2$, the basis of the column space no longer helps us: it consists of the single vector $[1, 3]^T$, and that alone does not satisfy the second criterion, which says the set must span the entire vector space. Given that $[1, 3]^T$ is not spanning the entire $\mathbb{R}^2$, we know that the set $\{[1, 3]^T\}$ is not a basis of $\mathbb{R}^2$. This distinction between the basis of $\mathbb{R}^2$, the basis of the column space, and the basis of the null space is really important: a basis for $\mathbb{R}^2$ means we need a set of linearly independent vectors that together span $\mathbb{R}^2$, so that any random vector in $\mathbb{R}^2$ can be represented as a linear combination of the vectors in this set.
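SymPy can confirm both observations in a couple of lines, a sketch: `columnspace()` returns the pivot columns as a basis, and the rank shows that only one column is independent:

```python
import sympy as sp

A = sp.Matrix([[1, 2],
               [3, 6]])

print(A.columnspace())   # [Matrix([[1], [3]])], a single basis vector
print(A.rank())          # 1: the two columns are linearly dependent
```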
Let me also prove that $[1, 3]^T$ alone is actually not spanning $\mathbb{R}^2$, which then concludes that it is not a basis of $\mathbb{R}^2$; after that I will provide you an example of a set of vectors that does span $\mathbb{R}^2$ and is linearly independent, which means it is a basis of the entire $\mathbb{R}^2$. Let me clear up some space here. The first criterion, call it 1.1, says that the vectors in the set must be linearly independent. That criterion is valid here, given that $[1, 3]^T$ on its own is linearly independent: whenever you have just one nonzero vector, this criterion is automatically satisfied. Then you have 1.2, which says we need the span of these vectors to equal $\mathbb{R}^2$. So is $\mathrm{span}([1, 3]^T) = \mathbb{R}^2$? Well, no, and here is how we can prove it. The idea is that any vector, take for instance $[4, 5]^T$, would have to be reachable: I need to be able to find a scalar $c$ such that my random vector $[4, 5]^T$ equals the linear combination $c \cdot [1, 3]^T$ of the vector forming my set. Let's see whether that is even possible. Multiplying out the right-hand side gives $[c, 3c]^T$, so I have the equations $4 = c$ and $5 = 3c$, from which $c = 4$ and $c = 5/3$. But that is impossible, because $4 \neq 5/3$. So I have proven that the randomly chosen vector $[4, 5]^T$ cannot be written as a linear combination of $[1, 3]^T$. Therefore a random vector from $\mathbb{R}^2$ can't be written as a linear combination of $[1, 3]^T$, and criterion two is not satisfied: for that we would need $\mathrm{span}([1, 3]^T) = \mathbb{R}^2$, in which case we would have been able to represent $[4, 5]^T$ as a linear combination of $[1, 3]^T$.

Okay, so now we have proven that $[1, 3]^T$ is not forming a basis of $\mathbb{R}^2$; let's look into what then does form a basis of $\mathbb{R}^2$, with an example. We are familiar with the unit vectors $e_1$ and $e_2$ in $\mathbb{R}^2$, which form the identity matrix: $e_1 = [1, 0]^T$ and $e_2 = [0, 1]^T$, or $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ in the form of a matrix. In this example we have the set $\{e_1, e_2\} = \{[1, 0]^T, [0, 1]^T\}$, and I will be proving that this set is indeed a basis of $\mathbb{R}^2$. The first criterion of a basis is that these two vectors are linearly independent. We can quickly remember from our previous theory that the two unit vectors $[1, 0]^T$ and $[0, 1]^T$ are linearly independent, which is something we have proven, and you can easily see it from here too: there is no scalar $c$ you can multiply $e_1$ with to get $e_2$, because the second entry of $c \cdot e_1$ is $c \cdot 0 = 0$, never $1$. Let me write this down: $e_1$ and $e_2$ are linearly independent because there is no scalar $c \in \mathbb{R}$ such that $c \cdot e_1 = e_2$; you can't write $e_2$ as a linear combination of $e_1$, or vice versa. This satisfies our first criterion: criterion one is satisfied.
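Both halves of this argument can be checked numerically; a small sketch: a nonzero least-squares residual shows $[4, 5]^T$ is not a multiple of $[1, 3]^T$, and the rank comparison shows why a single vector can never span $\mathbb{R}^2$:

```python
import numpy as np

# Is [4, 5] a multiple of [1, 3]?  Solve c * [1, 3] ~= [4, 5] by least squares.
b = np.array([[1.], [3.]])
target = np.array([4., 5.])
c, residual, rank, _ = np.linalg.lstsq(b, target, rcond=None)
print(c, residual)   # best c ~1.9 with residual ~4.9 > 0: no exact solution

# Spanning R^2 needs rank 2; a single vector gives rank 1 at most.
print(np.linalg.matrix_rank(np.array([[1., 3.]]).T))   # 1
print(np.linalg.matrix_rank(np.eye(2)))                # 2: {e1, e2} spans R^2
```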
What we have also learned is that any vector in $\mathbb{R}^2$ can actually be written as a linear combination of the unit vectors that form it, in this case $[1, 0]^T$ and $[0, 1]^T$. So let's assume a random vector $c = [c_1, c_2]^T$; what we want to prove is that we can always write this $c$ as a linear combination of these two vectors. And how can we do that? Well, let's say we have $k_1 \in \mathbb{R}$ multiplied by $[1, 0]^T$, plus $k_2 \in \mathbb{R}$ multiplied by $[0, 1]^T$, i.e. $k_1 e_1 + k_2 e_2$. What is this equal to? Let me write it out: $k_1 \cdot [1, 0]^T + k_2 \cdot [0, 1]^T = [k_1, 0]^T + [0, k_2]^T = [k_1, k_2]^T$. On the one hand I have this vector $[c_1, c_2]^T$ that I want to write as $k_1 e_1 + k_2 e_2$; if I take $k_1 = c_1$ and $k_2 = c_2$, then the combination is exactly $[c_1, c_2]^T$. So with $k_1$ and $k_2$ equal to the given numbers $c_1$ and $c_2$ respectively, I can represent the vector $c$ as a linear combination of $e_1$ and $e_2$, which is what I had to prove in order to say that $\mathrm{span}(e_1, e_2) = \mathbb{R}^2$: any random vector provided to me with entries $c_1$ and $c_2$, which are just real numbers, can be written as a linear combination of these two vectors. This is basically the second criterion, so criterion two is satisfied, and if criteria one and two are both satisfied, it means the set $\{[1, 0]^T, [0, 1]^T\}$ is a basis of the entire $\mathbb{R}^2$.

So let's now talk about the concept of projections. By definition, the projection of a vector $a$ onto a vector $b$ is the orthogonal projection of $a$ along $b$; it is denoted $\mathrm{proj}_b(a)$ ("proj", with the index $b$ underneath, and then $a$), and it represents the component of $a$ in the direction of $b$. All right, in order to properly understand this concept and the intuition behind it, let's make use of the $\mathbb{R}^2$ space. Let's first start by picturing, in our flat world, the $\mathbb{R}^2$ coordinates, the Cartesian coordinate system: here we have our y-axis, here our x-axis. And of course, keep in mind this is just an example: when it comes to projections we can always go beyond $\mathbb{R}^2$, but to keep it simple and truly understand the concept and the intuition behind the projection, I want to do the example in $\mathbb{R}^2$. So imagine we have a line that goes through the origin; let's call this line $B$. Now imagine we have a vector that is part of this line: this is the vector $b$, lying on the line $B$, as you can see. From the concepts of lines spanning parts of $\mathbb{R}^2$ and of vectors, we know that in this case, independent of the magnitude and direction of this vector, we can represent the line $B$ as linear combinations of it: the set $\{c \cdot b : c \in \mathbb{R}\}$, where $c$ is a real number (let me actually make it green). So we can basically say that the entire line $B$ can be represented as the set of all scalar multiples of the vector $b$.
For instance, with $c = 2$ we get one vector along the line, with $c = 3$ another, with $c = 4$ yet another, and so on, which means we can always come up with a linear combination forming a part of this line. Therefore we are seeing that this line can be represented as all the linear combinations $c \cdot b$ of this vector $b$ which is part of it, where $c$ is just a scalar, a real number. We will need this in a bit, but for now imagine this line, and the vector $b$ that is part of it.

Imagine then that we have yet another vector, again going from the origin, but this time in a different direction: this is the vector $a$. You can see that the vector $a$ is actually much longer than $b$, and that $a$ is not lying on the same line as $b$: $b$ is lying on the line $B$, and $a$ is not. Now let's say we want to project this vector $a$ onto this vector $b$, which means we want to project $a$ in that direction, to bring the vector $a$ onto this line (let me actually use a different color). The word "projection" actually does make sense here, as you might notice, because we are trying to cast the shadow of $a$ onto the line of $b$. And how can we do that? We can only do that if we connect the vector $a$ to the line with an orthogonal, perpendicular segment (say, with a different color): this perpendicular line goes from the vector $a$ to the line $B$, on which we have the vector $b$. The projection of $a$ onto line $B$ is then this shadow vector that you see on the line. The name "projection" really does make sense, because we are projecting this vector $a$ onto this line, casting its shadow onto it, and that shadow vector is what we are referring to as the projection of $a$ onto line $B$. Notice that we say the projection of $a$ onto the line $B$, not the projection of $b$ onto the vector $b$.

Then another thing we can notice is how this perpendicular is expressed: we get it by taking the vector $a$ and subtracting from it the projection of $a$ onto $B$. So when drawing this perpendicular line from $a$ to line $B$, we are referring to it as $a - \mathrm{proj}_B(a)$, because you can see that this vector is simply the one vector minus the other. That is the mathematical expression for this perpendicular line.

So how can we then find out what this $c$ is? To get the exact formula for the projection of $a$ onto the line $B$, we need to understand which scalar, specifically which value, we are using to multiply the vector $b$ by to get to this point: what is the $c$ such that $c \cdot b = \mathrm{proj}_B(a)$? We can have all sorts of linear combinations of the vector $b$ on this line, and in fact the line $B$ is the set of all linear combinations of the vector $b$; but I want to know specifically which one of them is the shadow vector.
That shadow vector is the projection of $a$ onto the line $B$ that we see here; now, how can we find it? Well, let's first formally define, for this specific case, what the projection of $a$ onto this line $B$ is: it is the vector on line $B$ such that $a - \mathrm{proj}_B(a)$ is perpendicular, or orthogonal, to the line. This is basically the definition of the projection of $a$ onto line $B$ under this specific example. In this case, the way we can find this projection is by looking into this $c$: what we are interested in is the specific $c$ times the vector $b$, and knowing $c$ (we already know what $b$ is), we can then describe this specific projection.

One thing we do know is the condition under which we say two vectors are orthogonal; that is something we have already learned as part of the previous lessons. So let's go ahead and find that amount. What we need to do now is calculate this value of $c$, because the calculation of $c$ will lead us to the exact vector we are interested in, which is this projection: our end goal is to find out what $\mathrm{proj}_B(a)$ is, and for that we need to calculate $c$, because we already know the vector $b$. Let me quickly remove this part, because here we will do our calculation. One thing we need to make use of is the word "orthogonal": we know that if two vectors are orthogonal, then their dot product is equal to zero. We know that the vector $a - \mathrm{proj}_B(a)$ is orthogonal to the vector $b$, which means we can say

$$(a - \mathrm{proj}_B(a)) \cdot b = 0.$$

This is something we know by the definition of orthogonality: two vectors are orthogonal means their dot product is equal to zero. Now let's make use of the other part: we need to describe this $\mathrm{proj}_B(a)$, and we make use of the fact that we know the projection of $a$ onto $B$ is actually some linear combination of the vector $b$. Let me go ahead and remove this part, since we already know the definition, and let us go calculate the $c$ that we need in order to find out what this entire projection is.

A few things we need to clear out first, because we can then make use of them to find $c$: we know that by definition the projection of $a$ onto the line $B$ is the vector we get when we draw the perpendicular line from the vector $a$ onto the line $B$, and we said that this perpendicular equals $a$ minus the shadow vector, $a - \mathrm{proj}_B(a)$. And given that the vector $a - \mathrm{proj}_B(a)$ is orthogonal to the line $B$, it is also orthogonal to this specific vector which is the projection itself, and to $b$. At the same time, the projection is simply some linear combination of this vector $b$, because any vector on line $B$, for instance a longer version of $b$ on the same line, is some scalar multiple $c \cdot b$ of the original vector.
This is also exactly what we said before: any vector on line $B$ can be represented as a linear combination of the vector $b$, and the projection is exactly such a vector, $\mathrm{proj}_B(a) = c \cdot b$; this is something we have already said, and we are just making use of it to fill in that value. This then results in the formula

$$(a - c \cdot b) \cdot b = 0.$$

Here we are simply making use of the fact that the projection of $a$ onto $B$ is the shadow vector, which is equal to some linear combination of the original vector $b$ on this line $B$. Then I can easily find the scalar $c$ from here, because we know how to calculate this dot product. So let's go ahead and do that: opening the parentheses, $a \cdot b - c\,(b \cdot b) = 0$, so $c\,(b \cdot b) = a \cdot b$, which means

$$c = \frac{a \cdot b}{b \cdot b}.$$

Now that we have $c$, we can easily derive the formula for the projection of $a$ onto the line $B$. This was the first part; here is the second:

$$\mathrm{proj}_B(a) = c \cdot b = \frac{a \cdot b}{b \cdot b}\, b.$$

You will notice this is the same as what we just got: whether you compute the projection of $a$ onto the entire line $B$ or onto the specific vector $b$ (we are using the vector $b$ as the source for drawing our line), it is the same as the projection of the vector $a$ onto the vector $b$, and it is the same formula as we see here. So the projection of $a$ onto $b$ is given by the dot product of the vectors $a$ and $b$, divided by the dot product of $b$ with itself, and multiplied by the vector $b$; and these dot products are something we have calculated time and time again in our examples. If we go back to our picture: this is our vector $b$, this is our vector $a$, and if we take the vector $a$ and project it onto this vector $b$, then the formula gives us this entire projection vector, which is of course itself a vector, equal to the shadow vector in the figure.

I know this might look a bit messy, because it contains many moving parts, but I wanted to provide this detailed explanation and step-by-step process, even if it is a bit confusing in the beginning, because it helps us understand what this formula is about and the intuition behind it. What we are doing is making use of the fact that the line can be represented as the linear combinations of the vector $b$ that lies on it, and from that we find the scalar that creates the single linear combination which ends up giving us the shadow vector, the projection $\mathrm{proj}_b(a)$ we are interested in. We get it by making use of the fact that the perpendicular line we create, which is simply the vector $a$ minus the projection vector, is perpendicular to the line $B$; and since the vector $b$ is part of the line $B$, the vector $a - \mathrm{proj}_b(a)$ is also perpendicular to the vector $b$ itself. Making use of the fact that the dot product of two perpendicular vectors is zero, we obtain the specific scalar $c$ we need, and knowing $c$ we multiply it with the vector $b$ to get our final projection: the dot product of $a$ and $b$, divided by the dot product of $b$ with $b$, multiplied by the vector $b$, which is again a vector.
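A minimal sketch of the formula we just derived; the two vectors are arbitrary ones of my choosing, and the last line verifies the defining property, that the residual is orthogonal to $b$:

```python
import numpy as np

def project(a, b):
    """Orthogonal projection of a onto b: proj_b(a) = (a.b)/(b.b) * b."""
    c = np.dot(a, b) / np.dot(b, b)   # the scalar c we solved for
    return c * b

a = np.array([2., 5.])   # arbitrary example vectors, not from the lecture
b = np.array([3., 1.])
p = project(a, b)

print(p)                  # the shadow vector on the line through b
print(np.dot(a - p, b))   # ~0.0: the perpendicular is orthogonal to b
```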
Now let's look into a couple of numeric examples to clarify this topic and practice with it. Given vectors $a$ and $b$, find the projection of $a$ onto $b$. Without looking into the answer, I will quickly go through the example. The vector $a = [3, 4]^T$ (we can also represent this in our more common column notation), and the vector $b = [1, 0]^T$. So let's quickly draw our coordinate system: this is our x-axis, this is our y-axis. Where is $a$? At three and four: this is our $a$. And what is $b$? It is one and zero, which means that our line $B = \{c \cdot b : c \in \mathbb{R}\}$ is actually our x-axis: this line is our line $B$. The projection is then found on this line: by drawing a perpendicular line here from $a$ to the line $B$, we get the connection between our vector $a$ and vector $b$ and create our projection. This perpendicular is $a - \mathrm{proj}_B(a)$, and this part along the axis is then $\mathrm{proj}_b(a)$.

And how can we get this projection? Well, we just learned that $\mathrm{proj}_b(a) = \dfrac{a \cdot b}{b \cdot b}\, b$; this is the formula we can use. And even if you don't remember the formula by heart, you can make use of this visualization to figure out what it is, because we know that if this perpendicular line is orthogonal to the line $B$, then $(a - \mathrm{proj}_b(a)) \cdot b = 0$, and the projection of $a$ onto $b$ is equal to some scalar $c$ multiplied by the vector $b$, which is what we see here. The first thing we need to do is compute the dot product of $a$ and $b$: $a \cdot b = [3, 4]^T \cdot [1, 0]^T = 3 \cdot 1 + 4 \cdot 0 = 3$. The next thing we need to do is compute the dot product of $b$ with itself: $b \cdot b = [1, 0]^T \cdot [1, 0]^T = 1 \cdot 1 + 0 \cdot 0 = 1$. The third thing is then to obtain the final value, the projection of $a$ onto $b$.
That is $\mathrm{proj}_b(a) = \dfrac{3}{1} \cdot [1, 0]^T = [3, 0]^T$. And this actually makes sense visually too: as you can see, this is the three on the x-axis, and here we have the zero for the y-coordinate, so the projection is the vector $[3, 0]^T$. Even without calculation, just from plotting the vectors $a$ and $b$ on the coordinate system, we could see that the projection of $a$ onto $b$ would be this vector $[3, 0]^T$; but we have followed the formula in order to do the calculation step by step, which is something you can see in this answer too. The projection of this vector $a$ onto $b$ is the vector of length three in the direction of $b$, so on the line $B$.

Let's now move ahead and look into a different example, but this time we will do the calculation in a quicker way. We have two vectors, $a = [4, 3]^T$ and $b = [2, 0]^T$, and we need to find the projection of $a$ onto $b$. The first thing we need to do is calculate $a \cdot b = [4, 3]^T \cdot [2, 0]^T = 4 \cdot 2 + 3 \cdot 0 = 8$. The second thing is $b \cdot b = [2, 0]^T \cdot [2, 0]^T = 4$. And the final part is to take these two values and bring them together:

$$\mathrm{proj}_b(a) = \frac{8}{4} \cdot [2, 0]^T = 2 \cdot [2, 0]^T = [4, 0]^T.$$

So the projection of $a$ onto $b$ is this vector $[4, 0]^T$, which is again on the x-axis, similar to what we had before, but now with a length of four in the direction of $b$. This is the step-by-step process I just followed, if you want to go through it a bit more slowly, and this is the final result. The interpretation of this projection is that $a$'s component in the direction of $b$ spans four units along the x-axis: this projection shows us that $a$'s influence in the direction of $b$ is completely horizontal, with a magnitude of four, because we again end up with the projection on the x-axis. If you plot the vectors $a$ and $b$ on the x- and y-axes, you can clearly see that the horizontal vector we end up with as the projection is very similar to what we had before.
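Both worked examples in a few lines of NumPy, a sketch reusing the same formula (the helper name `project` is my own):

```python
import numpy as np

def project(a, b):
    return (np.dot(a, b) / np.dot(b, b)) * b

# Example 1: a = [3, 4] onto b = [1, 0]
print(project(np.array([3., 4.]), np.array([1., 0.])))   # [3. 0.]

# Example 2: a = [4, 3] onto b = [2, 0]
print(project(np.array([4., 3.]), np.array([2., 0.])))   # [4. 0.]
```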
Let's now talk about the concept of orthonormal bases, and first define what they are. By definition, an orthonormal basis for a vector space is a basis where all vectors are orthogonal, or perpendicular, to each other, and each vector is of unit length. As you can notice, we have here a special type of basis, called an orthonormal basis: in the beginning of this module we formally defined the concept of a basis, we talked about the column space and the basis of a column space, the null space and the basis of a null space, and then the basis of an entire space such as $\mathbb{R}^2$; now we are defining a special type of basis, and as you can see from the definition, it contains two criteria. An orthonormal basis for a vector space is a basis where (a) all vectors are orthogonal, or perpendicular, to each other, and (b) each vector is of unit length. We have already learned that two vectors $a$ and $b$ being perpendicular means that their dot product is zero; that is the first criterion we need for calling our basis an orthonormal basis. Then the second criterion is that each of these vectors needs to have a length of one. If both conditions are satisfied, we are saying that our vectors form an orthonormal basis. If we have three vectors forming the vector space, the first criterion means we need $a \cdot b = 0$, $a \cdot c = 0$, and $b \cdot c = 0$; and (let me make this part smaller and put the length of $b$ in here) the second criterion becomes $\|a\| = \|b\| = \|c\| = 1$. So depending on the number of vectors you use to form your vector space, the proof that you are dealing with an orthonormal basis will look different (here we have just two vectors, there three), but in both cases you first prove that the vectors are pairwise orthogonal, perpendicular, and then, with the second criterion, that they all have unit length.

We need orthonormal bases in order to simplify our calculations, including the calculations of projections and transformations that we just saw, when we were discussing projecting a vector onto a line or onto another vector. There we were in the basic case of just two vectors in $\mathbb{R}^2$, and calculating a projection in $\mathbb{R}^2$ is very easy using the formula $\mathrm{proj}_b(a) = \dfrac{a \cdot b}{b \cdot b}\, b$, which was quite straightforward, right? But when it comes to projections in higher-dimensional spaces, say $\mathbb{R}^5$, $\mathbb{R}^{100}$, or $\mathbb{R}^{1000}$, it becomes much more difficult to do those projections and calculate them, and for those cases we can make use of this concept of an orthonormal basis to simplify our calculations; we will see that in a bit.

So let's first understand the orthogonality and the normalization parts. Orthogonality refers to the requirement that the vectors be orthogonal to each other, and normalization refers to the fact that the lengths should be one; this is basically the set of two criteria I just discussed. This summary slide gives you an indication of what is meant by that: if we have two vectors $v$ and $w$, the first criterion is that the two vectors are orthogonal, meaning their dot product is zero, $v \cdot w = 0$; and the second is that the length equals one, $\|v\| = 1$, in which case we call the vector $v$ normalized. If both of these criteria, normalization and orthogonality, are satisfied, we are saying that we are dealing with an orthonormal basis.
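Here is how the two criteria translate into code, a small sketch with a hypothetical helper `is_orthonormal` that checks pairwise dot products and unit norms:

```python
import numpy as np

def is_orthonormal(vectors, tol=1e-10):
    """Criterion (a): pairwise dot products ~ 0; criterion (b): norms ~ 1."""
    for i, v in enumerate(vectors):
        if abs(np.linalg.norm(v) - 1.0) > tol:
            return False                     # not normalized
        for w in vectors[i + 1:]:
            if abs(np.dot(v, w)) > tol:
                return False                 # not orthogonal
    return True

e1, e2 = np.array([1., 0.]), np.array([0., 1.])
print(is_orthonormal([e1, e2]))                   # True
print(is_orthonormal([np.array([1., 3.]), e2]))   # False: length is not 1
```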
Now that we have learned about projections, orthogonalization, and the concept of orthonormal bases, we are ready to discuss the Gram-Schmidt process. The Gram-Schmidt process is a method for orthogonalizing a set of vectors in an inner product space and turning them into an orthonormal set. Say we have a set of vectors: we want to transform them into an orthonormal set, which means we want them orthogonalized (perpendicular to each other) and normalized (each of length one), because those were the two criteria we just specified. This process of turning a set of vectors into an orthonormal set by orthogonalization is what we refer to as the Gram-Schmidt process, and it is something we can use later to simplify more advanced transformations, such as matrix factorization and various decomposition techniques. Given a set of linearly independent vectors, the Gram-Schmidt process produces an orthonormal set that spans the same subspace: the subspace stays the same; we are simply turning the spanning set into an orthonormal one.

Step by step, the process looks like this. Given vectors $a_1, a_2, \dots, a_n$ in $\mathbb{R}^n$, we start with $v_1 = a_1$, and the first thing we need to do is normalize it. How do we normalize a vector? We take the vector and divide it by its length: when we take $v_1$ and divide it by $||v_1||$, we ensure that the resulting vector has length one. (This is easy to prove, and I won't do it here; feel free to work through it yourself.) We call the normalized result $e_1 = v_1 / ||v_1||$, so we go from $v_1$ to $e_1$. Then, for each subsequent vector $a_k$ (meaning $a_2, a_3, \dots, a_n$), we need to subtract its projections on all previously computed orthonormal vectors; this second step ensures that every pair of resulting vectors is orthogonal to each other.
Recall that projection is the concept we developed using perpendicular lines: to project one vector onto another, we drop a perpendicular. We now use that same property to make sure each subsequent vector is perpendicular to everything we have already produced. Written out, step two says

$$v_k = a_k - \sum_{i=1}^{k-1} \text{proj}_{e_i}(a_k),$$

after which we normalize $v_k$ to get $e_k$, and we repeat steps two and three for the remaining vectors. So first we set $v_1 = a_1$ and normalize it to get $e_1$; then a slightly different recipe applies to $v_2, v_3, \dots, v_n$. Let me write the general case down step by step (and let me make sure I'm not making a typo). For vectors $a_1, a_2, a_3, \dots, a_n$ in $\mathbb{R}^n$:

Step 1: set $v_1 = a_1$, then normalize it to get $e_1 = v_1 / ||v_1||$, which in this specific case is simply $a_1 / ||a_1||$. That is all step one entails.

Step 2: for each subsequent $a_k$, where $k$ is just an index telling us which vector we are dealing with ($k = 2$ means $a_2$, then $a_3$, and so on up to $a_n$), subtract its projections on all previously computed orthonormal vectors using the formula above. Each $e_i$ in the sum has already been computed and normalized, so each time we find the projection of $a_k$ onto $e_i$ and subtract it from $a_k$.

Let's trace a couple of cases to see what is going on. For $k = 2$, the sum runs from $i = 1$ to $k - 1 = 1$, so there is just one term:

$$v_2 = a_2 - \text{proj}_{e_1}(a_2).$$

Step 3: normalize the $v_k$ we have just computed, because we remember that the second criterion after orthogonality is normalization: the length of each vector should equal one. So step two guarantees the orthogonality condition, and step three guarantees the normalization condition. For $k = 2$ this means going from $v_2$ to $e_2$.
We do that by taking $v_2$ and dividing it by its length: $e_2 = v_2 / ||v_2||$. That is the normalization part. Step four then says: repeat steps two and three for all remaining vectors. So, having obtained $v_2$ and its normalized version $e_2$, we come back and do the same for $k = 3$. In step two we get

$$v_3 = a_3 - \sum_{i=1}^{2} \text{proj}_{e_i}(a_3) = a_3 - \text{proj}_{e_1}(a_3) - \text{proj}_{e_2}(a_3).$$

Note that the index inside the projection is $i$, not a fixed vector; $i$ changes from term to term, which is the entire idea, since we need to subtract all the projections. Because the upper limit is now two instead of one, there is one extra step: the sum of the projections of $a_3$ on $e_i$ for $i$ from one to two, that is, the projection of $a_3$ on $e_1$ (the $i = 1$ case) plus the projection of $a_3$ on $e_2$ (the $i = 2$ case). That is exactly what the summation says, and we have seen this kind of summation in the high-school and pre-algebra material.

Now that we are clear on how to compute $v_3$ in step two for $k = 3$, we move to step three, which for $k = 3$ says: take $v_3$ and normalize it, going from $v_3$ to $e_3$ by dividing $v_3$ by its length. This cycle goes on and on until we cover all the vectors. The idea is: initially we set $v_1 = a_1$ and normalize it; then, starting from $k = 2$, we orthogonalize each $a_k$ against everything computed so far using the formula above, which ensures that each new vector is orthogonal to all the previous ones (for $k = 2$ the second vector is orthogonal to the first; for $k = 3$ the third is orthogonal to the other two; and so on), and we also normalize each vector to satisfy the second criterion, because we had these two criteria for creating an orthonormal set. We do this sequentially: for $k = 1$ (that is, $a_1$) we obtain $v_1$ and go from $v_1$ to $e_1$; for $k = 2$ we obtain $v_2$ and go to $e_2$; and so on up to $v_n$, which we normalize to get $e_n$. That is what "repeat steps two and three for all vectors" means: every time you increase $k$ and move to the next vector, you first compute $v_k$, then normalize it to get $e_k$, and then go back to step two for the next $k$. You will see this pattern a lot when writing code for your algorithms, because in many cases you need exactly this repetition of steps: you process one vector (or one iteration), then go back and do the next one, and the next. A minimal implementation of the whole procedure is sketched below.
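Here is a minimal sketch of those steps in Python, assuming the input vectors are linearly independent (so no $v_k$ collapses to the zero vector); the function name `gram_schmidt` is my own:

```python
import numpy as np

def gram_schmidt(vectors):
    """Turn linearly independent vectors a_1..a_n into an orthonormal set e_1..e_n."""
    basis = []
    for a in vectors:
        v = a.astype(float)
        # Step 2: subtract the projection of a_k onto every e_i computed so far.
        for e in basis:
            v = v - np.dot(a, e) * e
        # Step 3: normalize v_k to get e_k.
        basis.append(v / np.linalg.norm(v))
    return basis
```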
Let's now look at an example. We will apply the Gram-Schmidt process to two vectors, $a_1 = (1, 1, 0)$ and $a_2 = (1, 0, 1)$, to create an orthonormal basis for the subspace spanned by $a_1$ and $a_2$. With only two vectors this is obviously a very simplified version of the process, but it shows all the steps.

What was the first step in our algorithm? Step one: set $v_1 = a_1 = (1, 1, 0)$ and normalize it to get $e_1$. We know $e_1 = v_1 / ||v_1||$, and, as we learned at the very beginning of our fundamentals of linear algebra course, $||v_1||^2 = v_1 \cdot v_1 = 1 \cdot 1 + 1 \cdot 1 + 0 \cdot 0 = 2$, so $||v_1|| = \sqrt{2}$ and

$$e_1 = \frac{1}{\sqrt{2}}(1, 1, 0) = \left(\tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}}, 0\right).$$

We are done with step one, because we now have $v_1$ and $e_1$. What was step two? In step two we set $k = 2$: for $a_2$ we want to compute $v_2$ and then normalize it to get $e_2$. Using the formula we saw before,

$$v_2 = a_2 - \sum_{i=1}^{k-1} \text{proj}_{e_i}(a_2) = a_2 - \text{proj}_{e_1}(a_2),$$

because $k - 1 = 1$ means the limit of our summation is one, so, like before, the sum has just a single term: the projection of $a_2$ onto $e_1$.
So the $v_2$ formula involves just $e_1$. We know $a_2 = (1, 0, 1)$, but we have a problem: we don't yet know the projection term, so let's quickly calculate it. Because $e_1$ is already a unit vector, the projection formula simplifies to $\text{proj}_{e_1}(a_2) = (a_2 \cdot e_1)\, e_1$ (the denominator $e_1 \cdot e_1$ equals one and cancels out). The dot product is

$$a_2 \cdot e_1 = 1 \cdot \tfrac{1}{\sqrt{2}} + 0 \cdot \tfrac{1}{\sqrt{2}} + 1 \cdot 0 = \tfrac{1}{\sqrt{2}},$$

so

$$\text{proj}_{e_1}(a_2) = \tfrac{1}{\sqrt{2}} \left(\tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}}, 0\right) = \left(\tfrac{1}{2}, \tfrac{1}{2}, 0\right).$$

Taking this projection value over, we get our vector $v_2$:

$$v_2 = a_2 - \left(\tfrac{1}{2}, \tfrac{1}{2}, 0\right) = \left(1 - \tfrac{1}{2},\; 0 - \tfrac{1}{2},\; 1 - 0\right) = \left(\tfrac{1}{2}, -\tfrac{1}{2}, 1\right).$$

Then we need to normalize this vector to get $e_2 = v_2 / ||v_2||$. The squared length is

$$||v_2||^2 = \left(\tfrac{1}{2}\right)^2 + \left(-\tfrac{1}{2}\right)^2 + 1^2 = \tfrac{1}{4} + \tfrac{1}{4} + 1 = \tfrac{3}{2},$$

so $||v_2|| = \sqrt{3/2} = \sqrt{6}/2$. Dividing each component by this length (and remembering from pre-algebra that dividing by a fraction means multiplying by its reciprocal, $\frac{a}{b} \div \frac{c}{d} = \frac{a}{b} \cdot \frac{d}{c}$, so we are basically flipping the second fraction) gives

$$e_2 = \tfrac{2}{\sqrt{6}}\left(\tfrac{1}{2}, -\tfrac{1}{2}, 1\right) = \left(\tfrac{1}{\sqrt{6}}, -\tfrac{1}{\sqrt{6}}, \tfrac{2}{\sqrt{6}}\right).$$

Given that we have just two vectors, we have already reached the end of our solution: with $v_1, v_2$ and $e_1, e_2$ in hand, we have completed the Gram-Schmidt procedure.
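As a sanity check (again my own sketch, reusing the hypothetical `gram_schmidt` helper defined earlier), we can reproduce these numbers numerically:

```python
import numpy as np

a1 = np.array([1.0, 1.0, 0.0])
a2 = np.array([1.0, 0.0, 1.0])
e1, e2 = gram_schmidt([a1, a2])
print(e1)              # [0.7071  0.7071  0.    ]  =  (1/sqrt(2), 1/sqrt(2), 0)
print(e2)              # [0.4082 -0.4082  0.8165]  =  (1/sqrt(6), -1/sqrt(6), 2/sqrt(6))
print(np.dot(e1, e2))  # 0.0 up to rounding: the pair is orthonormal
```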
With only two vectors, $v_1, v_2$ and $e_1, e_2$ are all you need. If you have three vectors, the process includes steps two and three twice, for $k = 2$ and $k = 3$, and with more vectors you simply repeat the steps more times. At the end, what we want is a set of vectors that are orthogonal and at the same time normalized; in that case we say the vectors form an orthonormal basis.

Now, why is this important? What are the applications of orthonormal bases? Firstly, they simplify complex vector operations, and they are the basis of many more advanced mathematical concepts, such as Fourier series and quantum mechanics. Orthonormal bases are also used in signal processing, and they are a critical ingredient in numerical methods, especially in machine learning algorithms and in data compression. We will see this process used as part of decomposition techniques, which are really important for many algorithms, whether optimization algorithms or algorithms used in recommender systems, for example; all these concepts come together later when we discuss decompositions and matrix factorization. The orthonormal basis and the Gram-Schmidt process are truly foundational in linear algebra: they provide tools for simplifying and solving high-dimensional problems efficiently, and their applications across different fields of science and engineering demonstrate their versatility and utility.

Let's now talk about special matrices and their properties. We are going to cover symmetric matrices, diagonal matrices, and orthogonal matrices, each with a corresponding example. Special matrices have unique properties, such as being symmetric, having nonzero elements only on the diagonal (diagonal matrices), or having orthonormal rows and columns (orthogonal matrices).

A matrix $A$ is symmetric when it is equal to its transpose: $A = A^T$. Take the example

$$A = \begin{pmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{pmatrix}.$$

To transpose this matrix we take its rows and make them the columns of the transpose: the first row $(2, -1, 0)$ becomes the first column, the second row $(-1, 2, -1)$ becomes the second column, and the third row $(0, -1, 2)$ becomes the third column. Using the definition of the transpose, we end up with two matrices that are exactly the same: column by column, both $A$ and $A^T$ read $(2, -1, 0)$, $(-1, 2, -1)$, and $(0, -1, 2)$. So whenever we want to check whether a matrix is symmetric, we just take its transpose and see whether the matrix equals it. Do also note that for a matrix to be symmetric it needs
to be a square matrix: $2 \times 2$, $3 \times 3$, or in general $n \times n$, with the number of rows equal to the number of columns. Otherwise, if $m \neq n$ and $A$ has dimension $m \times n$, then $A^T$ has dimension $n \times m$, so there is no way $A$ can equal $A^T$; a square matrix is therefore a prerequisite for symmetry.

Let's now talk about the diagonal matrix. A diagonal matrix has nonzero elements only on its diagonal. For example, with diagonal entries $d_{11} = 3$, $d_{22} = 5$, $d_{33} = 7$, and zeros everywhere else:

$$D = \begin{pmatrix} 3 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 7 \end{pmatrix}.$$

The concept of diagonal matrices is very simple, and both of these matrix types can be checked with a couple of lines of code, as sketched below.
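A couple of lines of NumPy (my own sketch) are enough to check both definitions on the examples above:

```python
import numpy as np

A = np.array([[ 2, -1,  0],
              [-1,  2, -1],
              [ 0, -1,  2]])
print(np.array_equal(A, A.T))  # True: A equals its transpose, so A is symmetric

D = np.diag([3, 5, 7])         # the diagonal matrix from the example
print(D)                       # nonzero entries only on the diagonal
```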
We move on to the orthogonal matrix, a concept we have not yet seen, so let's read through it a bit slowly. An orthogonal matrix is a square matrix whose columns and rows are orthogonal unit vectors (orthonormal vectors, in other words), and whose transpose equals its inverse. There are several parts to this. Firstly, the matrix must be square. Secondly, its columns and rows must be orthogonal unit vectors, meaning they need to be normalized; these are exactly the two conditions we saw when forming orthonormal bases, hence the name orthonormal vectors: they are orthogonal and they are unit vectors. The final statement is not so much a condition as a property we can prove: if $Q$ is orthogonal, then $Q^T = Q^{-1}$, and since we learned that $Q^{-1} Q = Q Q^{-1} = I$, it follows that

$$Q^T Q = Q Q^T = I.$$

So instead of the inverse we are used to, here the transpose plays that role. Consider the example

$$Q = \begin{pmatrix} \tfrac{1}{\sqrt{2}} & \tfrac{1}{\sqrt{2}} \\[2pt] \tfrac{1}{\sqrt{2}} & -\tfrac{1}{\sqrt{2}} \end{pmatrix},$$

and call its rows $r_1 = \left(\tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}}\right)$ and $r_2 = \left(\tfrac{1}{\sqrt{2}}, -\tfrac{1}{\sqrt{2}}\right)$. I will leave the column version to you (proving the columns are perpendicular goes the same way) and work with the rows. To prove $r_1$ and $r_2$ are orthogonal, I need their dot product to equal zero. Can we show that? Let's try:

$$r_1 \cdot r_2 = \tfrac{1}{\sqrt{2}} \cdot \tfrac{1}{\sqrt{2}} + \tfrac{1}{\sqrt{2}} \cdot \left(-\tfrac{1}{\sqrt{2}}\right) = \tfrac{1}{2} - \tfrac{1}{2} = 0.$$

We know that for two vectors to be orthogonal their dot product must be zero, so this proves the two rows of the matrix are orthogonal. We can also prove that the second criterion of orthonormal vectors is satisfied, namely that each row has unit length. Let's do it for the first row:

$$||r_1||^2 = \left(\tfrac{1}{\sqrt{2}}\right)^2 + \left(\tfrac{1}{\sqrt{2}}\right)^2 = \tfrac{1}{2} + \tfrac{1}{2} = 1,$$

so $||r_1|| = 1$. You can quickly and easily compute the same for the second row and show that $||r_2|| = 1$ as well, which was the second criterion: for vectors to be orthonormal they must also be unit vectors. With both criteria satisfied we can use the property $Q^T = Q^{-1}$, which gives $Q^T Q = Q Q^T = I_2$, the identity matrix, since we are in $\mathbb{R}^2$. You can even go ahead and practice the material from previous units by calculating the inverse of this matrix directly with the $2 \times 2$ formula we learned,

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix}^{-1} = \frac{1}{ad - bc} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix},$$

and checking that the inverse of this matrix is indeed equal to its transpose, $Q^{-1} = Q^T$. A quick numerical check is sketched below.
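Here is a hedged numerical version of the same checks (my own sketch), using the example matrix reconstructed above:

```python
import numpy as np

Q = np.array([[1/np.sqrt(2),  1/np.sqrt(2)],
              [1/np.sqrt(2), -1/np.sqrt(2)]])
print(np.allclose(Q.T @ Q, np.eye(2)))     # True: Q^T Q = I, so Q is orthogonal
print(np.allclose(np.linalg.inv(Q), Q.T))  # True: the inverse equals the transpose
```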
In this module we are going to talk about matrix factorization: its significance, its definition, its common applications across different fields, and then detailed examples.

Let's start with why matrix factorization matters. Matrix factorization techniques are essential for several reasons. They are used to simplify matrix operations, like solving linear systems: when we have many matrices and want to simplify the operations we apply to them, factorization makes those complex matrix operations and calculations much more manageable. We can also use matrix factorization directly to solve systems of linear equations efficiently, and to perform eigenvalue decomposition, singular value decomposition (called SVD), and other operations that are crucial in machine learning and data analysis. You may already have heard of eigenvalues and eigenvectors as part of PCA, principal component analysis, which comes from the fundamentals of statistics: PCA is a dimensionality reduction technique, in fact one of the most popular ones you will find in industry, used in data science, data analytics, machine learning, and even deep learning.

Matrix factorization can also reduce computational complexity: by factorizing we simplify the process and make it more efficient to compute, which matters especially when we are dealing with high-dimensional data, many features, or a very large and complex model; there, a factorization technique can make a huge difference in the data processing pipeline. These techniques underpin many algorithms in numerical analysis, optimization, and beyond, so whenever it comes to machine learning, data science, or many other fields, you will see the term matrix factorization appear a lot. Even the streaming company Netflix, which I am sure you are aware of, uses matrix factorization to build its recommender system; using matrix factorization to build recommender algorithms for personalized recommendations is actually one of its most popular applications, which is why I wanted to discuss it specifically as part of our advanced linear algebra course. Some of the concepts might seem a bit more complex than the ones we have discussed in previous units, but once we go through them step by step, with all the details and examples, the entire process behind these different matrix factorization techniques should become much clearer and more straightforward. We will be discussing not just one but multiple fundamental matrix factorization techniques, besides covering at a high level where they are used and how to choose one for a given type of application.
We are going to demystify the entire concept of matrix factorization, starting at a high level and then going into the deepest details. Let's now formally define it: matrix factorization refers to decomposing a matrix into a product of two or more matrices, revealing its structure and simplifying further analysis.

What is the idea behind this? Suppose we have a matrix $A$ and we want to simplify any calculation or multiplication related to it, but $A$ itself contains awkward numbers or is just too complex: it holds a ton of different entries, you cannot recognize whether the columns are linearly independent, and it is not very readable at first view. To make your life easier when performing calculations, you can use matrix factorization to write $A$ in terms of other matrices, say (and I am choosing the names arbitrarily) $Q$ and $T$, so that $A = QT$, where $Q$ and $T$ are much simpler: one may be a diagonal matrix, for instance, or a matrix with other specific properties. With such factors you will feel much more comfortable, whether multiplying with other matrices or solving a problem. Of course, if you are in a low-dimensional space, say $\mathbb{R}^2$ or $\mathbb{R}^3$, it will most likely be quite straightforward to use $A$ itself; but in $\mathbb{R}^{100}$ or $\mathbb{R}^{1000}$ the computations become super complex: it is difficult to understand and compute linear combinations, to find out whether you are dealing with linearly independent columns, or to find the null space, the column space, and their bases in high dimensions. For those cases we can use matrix factorization to make the entire calculation process much simpler and more efficient.

Common types of matrix factorization include lower-upper matrix factorization, LU for short; QR factorization, a famous type also called orthogonal-triangular factorization; SVD, singular value decomposition, yet another famous matrix factorization; and finally the eigendecomposition, another highly popular technique. QR, SVD, and eigendecomposition are in fact so widespread that you will see them appearing in 90% of all statistics-related and machine-learning-related books, which just goes to prove how important these concepts are for properly learning and mastering the more applied, science-related fields like machine learning. If you want to go beyond the level of merely knowing algorithms, to be able to edit, tweak, and adjust them, to understand machine learning algorithms, deep learning algorithms, and data science at their core, and to become a well-rounded professional, then eigendecomposition, singular value decomposition, and QR factorization are techniques you want to know and understand at least at a high level, so that you can more easily grasp the more
advanced concepts that come from the applied sciences like machine learning and AI.

Let's first discuss, at a high level, what QR decomposition is. QR decomposition factors a matrix into an orthogonal matrix, which we refer to as $Q$, and an upper triangular matrix $R$, so we are saying $A = QR$, the product of these two matrices. The first part being orthogonal is really important: we learned the definition of an orthogonal matrix at the end of the previous module, where the rows and columns had to be orthogonal to each other and normalized (each of length one), which means the transpose of such a matrix equals its inverse. That is exactly what this matrix $Q$ must satisfy: the dot products of its rows (and columns) with each other are zero, and each has length one. The second part of the decomposition is the matrix $R$, which must be upper triangular. What does upper triangular mean? Below the diagonal the matrix has all zeros, while on the diagonal and above it has (generally nonzero) numbers, for example

$$R = \begin{pmatrix} 1 & 7 & 10 \\ 0 & 2 & 8 \\ 0 & 0 & 3 \end{pmatrix}$$

(I am just writing these numbers at random to show the shape; of course, in a real case, when we have a matrix $A$ and go through the QR decomposition, we get an appropriate $Q$ and an appropriate $R$ whose entries are specific values that we will be calculating). We will use QR decomposition, for instance, for solving linear least squares problems, which is part of linear regression too: linear regression from machine learning and statistics is based on the least squares technique, and the estimation method used to solve the linear regression problem is called ordinary least squares, which tries to minimize the squared residuals of the model; that can be done using QR decomposition. It helps provide numerically stable solutions for this type of problem, and QR decomposition is also used extensively in signal processing and statistical analysis. A short sketch using a built-in routine follows.
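As a sketch, NumPy's built-in `np.linalg.qr` performs this factorization. I reuse the Gram-Schmidt example vectors as the columns of $A$, so, up to sign, the columns of $Q$ match the $e_1, e_2$ we computed by hand:

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])        # columns are a1 = (1,1,0) and a2 = (1,0,1)
Q, R = np.linalg.qr(A)
print(np.allclose(Q.T @ Q, np.eye(2)))  # True: columns of Q are orthonormal
print(np.allclose(np.triu(R), R))       # True: R is upper triangular
print(np.allclose(Q @ R, A))            # True: the product recovers A
```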
Let's now briefly talk about LU decomposition. LU decomposition factors a matrix into a lower triangular matrix $L$ and an upper triangular matrix $U$, so $A = LU$. You might have already guessed how a lower triangular matrix looks: it is the mirror image of the upper triangular matrix we had before, with nonzero entries on and below the diagonal and zeros above. So the difference between QR and LU decomposition is that in QR we decompose a matrix into an orthogonal matrix and an upper triangular matrix, while in LU we decompose it into a lower triangular matrix and an upper triangular matrix; the orthogonality idea no longer appears, and in that respect LU differs from QR. What LU decomposition does is facilitate the solving of linear equations and matrix inversions. It is common in engineering and the physical sciences for solving systems of linear equations, and if you are learning quantum mechanics, LU decomposition can definitely help you better understand many concepts. If your target fields are machine learning, deep learning, or artificial intelligence, however, you will use QR decomposition much more often than LU.

Let's now talk about the singular value decomposition. What SVD does is decompose a matrix into three matrices: the first is an orthogonal matrix $U$, the second is a diagonal matrix, and the third is $V^*$, the conjugate transpose of an orthogonal matrix. For now this might seem a bit complex, and you can see that unlike QR or LU decomposition, where we got just two matrices as a result, in the case of SVD we get three parts. We will go through the process step by step with a detailed example so that it all makes sense, but for now let's focus on the high-level usage of SVD. Singular value decomposition is one of the most popular decomposition techniques, and it is used directly inside machine learning algorithms. It is also used in data compression and in noise reduction: when we are trying to clean our data and remove noise, SVD can help us identify outliers and remove them from the data. And it is used in principal component analysis, PCA, the same dimensionality reduction technique I just mentioned in connection with eigendecomposition, because SVD and eigendecomposition are highly related to each other. PCA is the most popular dimensionality reduction technique, used in advanced statistical studies and statistics in general, in finance, and in many machine learning and deep learning applications, so knowing PCA is a must if you want to get into data analytics, data science, machine learning, or AI, and it will also help you understand many other concepts in these fields. SVD additionally provides insight into the structure and the rank of a matrix, which we will see as part of our example too. A brief sketch of both factorizations in code follows.
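Here is a brief sketch of both, using SciPy's `scipy.linalg.lu` (which also returns a permutation matrix $P$ for row pivoting, so the identity is $A = PLU$) and NumPy's `np.linalg.svd`; the example matrix is my own arbitrary choice:

```python
import numpy as np
from scipy.linalg import lu

A = np.array([[4.0, 3.0],
              [6.0, 3.0]])

P, L, U = lu(A)                      # L lower-triangular, U upper-triangular
print(np.allclose(P @ L @ U, A))     # True: A = P L U

U_svd, s, Vt = np.linalg.svd(A)      # A = U_svd @ diag(s) @ Vt
print(np.allclose(U_svd @ np.diag(s) @ Vt, A))  # True
print(s)                             # singular values, sorted in decreasing order
```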
Let's also briefly talk about eigendecomposition. Eigendecomposition, which is intimately tied to the concepts of eigenvalues and eigenvectors, decomposes a matrix into its eigenvalues and eigenvectors, which reveal the matrix's fundamental properties: what kind of information the matrix contains, how its directions relate (the idea of correlation), how much variation there is, and in which direction that variation is largest. Eigendecomposition, which is in turn related to SVD and in general to these ideas of dimensions and correlations, is critical for understanding linear transformations, stability analysis, and systems of differential equations. Besides this mathematical side, eigendecomposition is also the basis for many algorithms in numerical linear algebra and in applied linear algebra topics like data science and machine learning, and it is used heavily in artificial intelligence for feature extraction and dimensionality reduction, again related to the concept of PCA, because PCA is based entirely on eigendecomposition: PCA is the direct result of computing eigenvalues and eigenvectors. Without knowing what eigenvalues and eigenvectors are, you cannot perform PCA, because the first step of PCA is exactly their computation. Then, using rules such as the elbow rule or the Kaiser rule, we use these eigenvalues and eigenvectors to understand which linear combinations of our features contain the most variation, and hence the most information; we select the combinations that keep the largest amount of information in the data, skip and drop the uninformative ones, and thereby reduce the dimension of our model while still keeping most of the information, where what counts as "most" is decided by those rules.

This is just high-level background on what to expect when you apply this highly technical linear algebra concept of eigendecomposition in applied science fields like data science, machine learning, or AI, and we will come back to it with comments later. PCA itself won't be discussed as part of this course, since here we are talking about linear algebra; PCA belongs to the fundamental statistics course, and there we no longer rederive all the details of how the eigendecomposition is performed. Knowing how to perform an eigendecomposition will therefore set you up for success in actually understanding the mathematics behind statistical concepts like PCA, and later in seeing how PCA is used in machine learning and AI concepts like autoencoders: how autoencoders relate to PCA, what their commonalities are, and what their differences are. A minimal sketch of eigendecomposition in code follows.
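Here is a minimal sketch with `np.linalg.eig`, on an arbitrary symmetric example of my own choosing, verifying the reconstruction $A = V\,\mathrm{diag}(w)\,V^{-1}$:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
w, V = np.linalg.eig(A)   # eigenvalues w, eigenvectors as the columns of V
print(w)                  # 3 and 1 (order not guaranteed)
print(np.allclose(V @ np.diag(w) @ np.linalg.inv(V), A))  # True: A is reconstructed
```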
Everything comes down to choice: choosing the right tool for your problem. Among the decomposition and matrix factorization tools we have seen, there are many options, and the question is which one to pick in which cases. Choosing the right tool really matters, because each technique suits different sorts of problems; to understand which one to pick for a given kind of case, requirement, and problem, you will need the knowledge you are learning in this course. The choice among QR decomposition, LU decomposition, SVD, and eigendecomposition depends on your specific problem's requirements and the characteristics of your data. Are you dealing with complex data, or simple data with low dimensions? What goal do you want to achieve, and what problem are you trying to solve: reducing the dimension of your feature space, solving a system of linear equations, solving a quantum mechanics problem, or incorporating the factorization into your machine learning algorithm? QR and LU decompositions are usually preferred for solving linear systems, while SVD and eigendecomposition give us deeper insights into the data: what information it contains, how we can reduce its dimension, or how we can use it inside a machine learning algorithm for noise reduction, identifying outliers, and so on.

Techniques like SVD and eigendecomposition also help us use geometric intuition to visualize data. For instance, PCA helps us visualize high-dimensional data using just a couple of principal components: say we have 10 features in our model, so our dimension is 10 and we are in $\mathbb{R}^{10}$, but we want to visualize the data. With PCA we can reduce the dimension and come up with, say, three principal components, which are linear combinations of our original 10 feature vectors, and then use those three components to visualize the data in 3D. This lets us geometrically visualize our data, make presentations that make much more sense, and do better storytelling with our data, and much more. These tools are invaluable when it comes to applications in machine learning, deep learning, data science, and artificial intelligence.

So matrix factorization techniques are super important in computational mathematics, and they directly affect data science, AI, and many other algorithms. They matter not only for the problems they solve but also for making the computation process efficient, whether you are coding in Python or in other programming languages, and they provide insights into the different properties of our data. As part of this course we are going to discuss not just one but three of these four decomposition and matrix factorization techniques in detail. We will cover QR decomposition, not just discussing it but learning it step by step, with a detailed example and all the steps involved, so that you will feel confident doing a QR decomposition all by yourself. Then we will also do an SVD decomposition as well as an eigendecomposition, and we
are again going to discuss them in terms of their mathematical formulation and definition, but also their applications, the step-by-step process, and a detailed example, such that you can conduct each of these matrix factorization and decomposition techniques by yourself, doing all the calculations manually. This understanding, these examples, and these concepts will help you not just formulate what the techniques are about but really and truly understand them, and then use them later on, whether in your own research, when writing scientific papers, when tweaking an algorithm all by yourself, or when inventing new algorithms. I won't be discussing the LU decomposition technique in detail, because we already know that QR and LU are used for similar types of problems; to save us time, I have carefully selected the most important decomposition and matrix factorization techniques, the ones you will most likely be dealing with in your future career in the applied sciences.

By Amjad Izhar
Contact: amjad.izhar@gmail.com
https://amjadizhar.blog
Affiliate Disclosure: This blog may contain affiliate links, which means I may earn a small commission if you click on the link and make a purchase. This comes at no additional cost to you. I only recommend products or services that I believe will add value to my readers. Your support helps keep this blog running and allows me to continue providing you with quality content. Thank you for your support!
