**Linear Regression with Multiple Variables**

**Multiple Features**

**Linear regression with multiple variables is also known as "multivariable linear regression." We now introduce notation for equations where we can have any number of input variables.**

$$ \begin{align*} x_j^{(i)} &= \text{value of feature } j \text{ in the }i^{th}\text{ training example} \newline x^{(i)}& = \text{the column vector of all the feature inputs of the }i^{th}\text{ training example} \newline m &= \text{the number of training examples} \newline n &= \left| x^{(i)} \right| \; \text{(the number of features)} \end{align*} $$

Now define the multivariable form of the hypothesis function as follows, accomodating these multiple features:

$$ h_\theta (x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \theta_3 x_3 + \cdots + \theta_n x_n $$

Using the definition of matrix multiplication, our multivariable hypothesis function can be concisely represented as:

$$ \begin{align*} h_\theta(x) = \begin{bmatrix} \theta_0 \hspace{2em} \theta_1 \hspace{2em} ... \hspace{2em} \theta_n \end{bmatrix} \begin{bmatrix} x_0 \newline x_1 \newline \vdots \newline x_n \end{bmatrix} = \theta^T x \end{align*} $$

This is a vectorization of our hypothesis function for one training example; see the lessons on vectorization to learn more. [

**Note**: So that we can do matrix operations with theta and x, we will set $x^{(i)}_0 = 1$, for all values of $i$. This makes the two vectors 'theta' and $x^{(i)}$ match each other element-wise (that is, have the same number of elements: $n + 1$).]

Now we can collect all $m$ training examples each with $n$ features and record them in an $n+1$ by $m$ matrix. In this matrix we let the value of the subscript (feature) also represent the row number (except the initial row is the "zeroth" row), and the value of the superscript (the training example) also represent the column number, as shown here: