In forecasting, matrices are powerful tools used for organizing and analyzing data. They allow the representation of multiple relationships and variables compactly, making it easier to perform computations and apply statistical or machine learning techniques. Here’s an overview of how matrices are commonly applied in forecasting:
11.1 Linear Regression
Linear regression aims to model the relationship between input features and a target variable. In this explanation, we will explore how to express and solve linear regression problems using matrices.
11.1.1 General Form
In linear regression, the relationship between the input features \(X\) and the target variable \(y\) is assumed to be linear. The linear regression equation is:
\[
y = X\beta + \epsilon
\]
Where:
\(y\) is an \(n \times 1\) vector of observed target values (response variable).
\(X\) is an \(n \times p\) design matrix (features matrix), where each row represents an observation and each column represents a feature.
\(\beta\) is a \(p \times 1\) vector of coefficients (parameters).
\(\epsilon\) is an \(n \times 1\) vector of errors (residuals).
11.1.2 Matrix Representation
The matrix \(X\) contains the input features. The first column of \(X\) is filled with 1’s to represent the intercept \(\beta_0\). For example, for a dataset with three data points and two features:
\[
X = \begin{bmatrix} 1 & x_{11} & x_{12} \\ 1 & x_{21} & x_{22} \\ 1 & x_{31} & x_{32} \end{bmatrix}
\]
where \(x_{ij}\) is the value of feature \(j\) for observation \(i\).
To find the optimal coefficients \(\beta\), we minimize the error between the predicted values \(\hat{y}\) and the actual values \(y\). The error is measured using the sum of squared residuals (errors), called the cost function \(J(\beta)\):
\[
J(\beta) = (y - X\beta)^T (y - X\beta)
\]
Setting the gradient of \(J(\beta)\) with respect to \(\beta\) to zero gives the normal equations:
\[
X^T X \beta = X^T y
\]
Now, solve for \(\beta\) by multiplying both sides by \((X^T X)^{-1}\) (assuming \(X^T X\) is invertible):
\[
\beta = (X^T X)^{-1} X^T y
\]
This is the closed-form solution for linear regression, also known as the normal equation.
11.1.7 Making Predictions
Once \(\beta\) is computed, predictions can be made for the target variable \(y\) using:
\[
\hat{y} = X\beta
\]
11.1.8 Assumptions of Linear Regression
For the linear regression model to be meaningful, certain assumptions are typically made:
Linearity: The relationship between the input features and the target variable is linear.
Independence: The residuals (errors) are independent.
Homoscedasticity: The variance of residuals is constant across all observations.
Normality of Errors: The residuals follow a normal distribution (important for hypothesis testing and confidence intervals).
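As a rough illustration, some of these assumptions can be checked numerically by inspecting the residuals. The sketch below uses hypothetical data and the normal-equation estimate to compute the residuals and report their mean and spread; formal diagnostic tests are beyond the scope of this overview.

```python
import numpy as np

# Hypothetical data (for illustration only)
X = np.array([[1, 1, 4],
              [1, 2, 5],
              [1, 3, 6]], dtype=float)
y = np.array([5.1, 6.8, 9.2])

# Normal-equation estimate (pseudo-inverse, in case X^T X is singular)
beta = np.linalg.pinv(X.T @ X) @ (X.T @ y)

# Residuals e = y - X beta
residuals = y - X @ beta

# Rough checks: the residual mean should be close to 0, and the spread
# should not vary systematically across observations (homoscedasticity)
print("Residuals:", residuals)
print("Mean of residuals:", residuals.mean())
print("Std. dev. of residuals:", residuals.std(ddof=1))
```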
11.2 Example in Python
Here’s an example in Python of computing \(\beta\) using the closed-form solution (normal equation) and making predictions:
```python
import numpy as np

# Sample Data (X and y)
X = np.array([[1, 1, 4],   # Design matrix (including intercept column of 1's)
              [1, 2, 5],
              [1, 3, 6]])
y = np.array([5, 7, 9])    # Actual target values

# Compute the coefficients using the Normal Equation with pseudo-inverse
X_transpose = X.T                        # Transpose of X
X_transpose_X = X_transpose.dot(X)       # X^T X
X_transpose_y = X_transpose.dot(y)       # X^T y

# Use the pseudo-inverse in case X^T X is singular
beta = np.linalg.pinv(X_transpose_X).dot(X_transpose_y)

# Display the coefficients (beta values)
print("Coefficients (beta):", beta)

# Make predictions
y_hat = X.dot(beta)   # Predicted target values
print("Predicted values (y_hat):", y_hat)
```
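As a sanity check, the same coefficients can be obtained with NumPy’s built-in least-squares solver, which avoids forming \(X^T X\) explicitly and is numerically more stable. A minimal sketch, assuming the same `X` and `y` as above:

```python
import numpy as np

X = np.array([[1, 1, 4],
              [1, 2, 5],
              [1, 3, 6]], dtype=float)
y = np.array([5.0, 7.0, 9.0])

# np.linalg.lstsq solves min ||X beta - y||^2 directly (via SVD)
beta_lstsq, residuals, rank, singular_values = np.linalg.lstsq(X, y, rcond=None)

print("Coefficients via lstsq:", beta_lstsq)
print("Predicted values:", X @ beta_lstsq)
```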
11.3 Markov Chains
A Markov Chain is a mathematical model that describes a system undergoing transitions from one state to another, where the probability of moving to the next state depends only on the current state (not past states). This property is called the Markov property.
In the context of Linear Algebra, Markov Chains can be analyzed using matrices, particularly the transition matrix, to understand how the system evolves over time.
11.3.1 State Vectors
In a Markov Chain, the system’s state at any given time is represented by a state vector. This vector consists of probabilities of being in each possible state.
For example, if a system has two states, Rain (H) and Clear (C), the state vector \(\mathbf{x}\) could be:
\[
\mathbf{x} = \begin{bmatrix} p(H) \\ p(C) \end{bmatrix}
\]
Where \(p(H)\) is the probability of the system being in state H, and \(p(C)\) is the probability of the system being in state C.
11.3.2 Transition Matrix
The transition matrix \(P\) describes the probabilities of transitioning between states in the system. It is a square matrix; here we use the column convention, in which the element \(P_{ij}\) is the probability of transitioning from state \(j\) to state \(i\). With this convention each column of \(P\) sums to 1, and the state vector updates as \(\mathbf{x}_{t+1} = P\,\mathbf{x}_t\).
For a two-state system with Rain and Clear, the transition matrix might look like (hypothetical probabilities, for illustration):
\[
P = \begin{bmatrix} 0.7 & 0.4 \\ 0.3 & 0.6 \end{bmatrix}
\]
Here the first column says that if the current state is Rain, the next state is Rain with probability 0.7 and Clear with probability 0.3; the second column gives the corresponding probabilities when the current state is Clear.
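As a small sketch (using the hypothetical probabilities above), the evolution of the state vector can be simulated by repeatedly multiplying it by the transition matrix:

```python
import numpy as np

# Hypothetical column-stochastic transition matrix: columns = current state (Rain, Clear)
P = np.array([[0.7, 0.4],   # probability that the next state is Rain
              [0.3, 0.6]])  # probability that the next state is Clear

# Initial state vector: say it is certainly raining today
x = np.array([1.0, 0.0])

# Update the state vector for a few steps: x_{t+1} = P x_t
for step in range(1, 6):
    x = P @ x
    print(f"Step {step}: P(Rain) = {x[0]:.4f}, P(Clear) = {x[1]:.4f}")
```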
A crucial concept in Markov Chains is the steady state, or stationary distribution: the point at which the state probabilities no longer change over time.
Mathematically, the steady state vector \(\mathbf{x}\) satisfies the equation:
\[
\mathbf{x} = P \cdot \mathbf{x}
\]
To find the steady state, you need to solve for the eigenvector corresponding to eigenvalue \(\lambda = 1\) of the transition matrix. The steady-state vector is the distribution where the system remains unchanged after one application of the transition matrix.
11.3.5 Eigenvectors and Eigenvalues
The steady state of a Markov Chain can be determined by finding the eigenvector corresponding to the eigenvalue 1 of the transition matrix \(P\), since at steady state the state vector doesn’t change when multiplied by the transition matrix.
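A minimal sketch of this computation in NumPy, assuming the hypothetical transition matrix from the earlier example: `np.linalg.eig` returns the eigenvalues and eigenvectors of \(P\), and the eigenvector whose eigenvalue is 1 is rescaled so its entries sum to 1.

```python
import numpy as np

# Hypothetical column-stochastic transition matrix (columns sum to 1)
P = np.array([[0.7, 0.4],
              [0.3, 0.6]])

# Eigen-decomposition: the columns of `vectors` are the eigenvectors of P
values, vectors = np.linalg.eig(P)

# Pick the eigenvector whose eigenvalue is (numerically) closest to 1
idx = np.argmin(np.abs(values - 1.0))
steady_state = np.real(vectors[:, idx])

# Rescale so the probabilities sum to 1
steady_state = steady_state / steady_state.sum()

print("Eigenvalues:", values)
print("Steady-state vector:", steady_state)   # approx [0.5714, 0.4286]
```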
To summarize, in a Markov Chain:
The transition matrix \(P\) describes the system’s transition probabilities.
The state vector \(\mathbf{x}\) updates over time by multiplying it by the transition matrix.
The steady state vector \(\mathbf{x}\) is the eigenvector associated with eigenvalue 1, representing the system’s long-term probabilities of being in each state.
Finding the steady state therefore translates to solving the system of equations \((P - I)\mathbf{x} = \mathbf{0}\), together with the normalization \(\sum_i x_i = 1\), to find the vector \(\mathbf{x}\) that does not change after multiplication with the matrix \(P\).
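A small sketch of this alternative route, solving \((P - I)\mathbf{x} = \mathbf{0}\) stacked with the sum-to-one constraint in the least-squares sense (same hypothetical \(P\) as before):

```python
import numpy as np

P = np.array([[0.7, 0.4],
              [0.3, 0.6]])
n = P.shape[0]

# Stack (P - I) x = 0 with the constraint sum(x) = 1
A = np.vstack([P - np.eye(n), np.ones((1, n))])
b = np.concatenate([np.zeros(n), [1.0]])

# Solve the overdetermined system in the least-squares sense
x, *_ = np.linalg.lstsq(A, b, rcond=None)
print("Steady-state vector:", x)   # approx [0.5714, 0.4286]
```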
Markov Chains in Linear Algebra make use of key concepts such as matrices, vectors, and eigenvalues to model systems that evolve probabilistically. By applying matrix operations and finding eigenvectors corresponding to eigenvalue 1, we can describe long-term behavior and steady states in such systems.