Wednesday, April 29, 2015

Cost Function : Multivariate Linear Regression

The concepts for calculating the cost function $J$ and estimating the parameters $\theta_0$ and $\theta_1$ extend naturally to the case where the dataset has multiple features, i.e. instead of a single variable $x$ we have multiple variables $x_1, x_2, x_3, \ldots$ and so on.

Notations:

Consider a training dataset with 5 variables $x_1, x_2, x_3, x_4$ and $x_5$. The outcome variable is still $y$. There are $m$ examples in the dataset (number of rows).

$n$ = number of features
$x^{(i)}$ = input (features) of the $i^{th}$ training example
$x^{(i)}_j$ = value of feature $j$ in $i^{th}$ training example
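To make the notation concrete, here is a minimal sketch using NumPy with made-up numbers (the variable names and values are illustrative, not from the original post):

```python
import numpy as np

# Hypothetical training set: m = 4 examples, n = 2 features (made-up numbers).
X = np.array([[2104.0, 3.0],
              [1600.0, 3.0],
              [2400.0, 2.0],
              [1416.0, 4.0]])
y = np.array([400.0, 330.0, 369.0, 232.0])

m, n = X.shape        # m = number of training examples, n = number of features
x_2 = X[1]            # x^(2): all features of the 2nd training example
x_2_1 = X[1, 0]       # x_1^(2): value of feature 1 in the 2nd training example
```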
Linear Regression : Multivariate [Multiple features]

Hypothesis:
Univariate: $h_{\theta}(x) = \theta_0 +\theta_1x$
Multivariate: $h_{\theta}(x) = \theta_0 +\theta_1x_1 + \theta_2x_2 +\theta_3x_3 +\theta_4x_4 ..... +\theta_nx_n  $

For convenience of notation, define $x_0 = 1$.

 $h_{\theta}(x) = \theta_0x_0 +\theta_1x_1 + \theta_2x_2 +\theta_3x_3 +\theta_4x_4 ..... +\theta_nx_n  $

The vector $x$ contains $[x_0, x_1, x_2, \ldots, x_n]$ and has dimension $(n+1)$.
The vector $\theta$ contains $[\theta_0, \theta_1, \theta_2, \ldots, \theta_n]$ and also has dimension $(n+1)$.

The hypothesis can be written as

$h_\theta(x) = \theta^Tx$
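As a minimal sketch of this vectorized hypothesis (using NumPy; the function name, $\theta$ values, and feature values below are illustrative assumptions, with $x_0 = 1$ prepended to the raw features):

```python
import numpy as np

def hypothesis(theta, x):
    """h_theta(x) = theta^T x, where x already includes x_0 = 1."""
    return theta @ x

# Example: prepend x_0 = 1 to a raw feature vector before evaluating h.
theta = np.array([1.0, 0.5, -2.0])   # theta_0, theta_1, theta_2 (made-up values)
x_raw = np.array([3.0, 4.0])         # x_1, x_2
x = np.insert(x_raw, 0, 1.0)         # [x_0, x_1, x_2] with x_0 = 1
print(hypothesis(theta, x))          # 1.0 + 0.5*3.0 - 2.0*4.0 = -5.5
```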

Cost Function & Gradient Descent Algorithm for a Multivariate Linear Regression

Cost Function
$J(\theta_0,\theta_1,\theta_2.....\theta_n) = {\frac 1 {2m}}{\sum_{i=1}^m}{(h_\theta(x^{(i)})-y^{(i)})}^2$
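A sketch of computing $J$ in one vectorized pass (NumPy; `X` is assumed to already include the $x_0 = 1$ column, and the function name is illustrative):

```python
import numpy as np

def cost(theta, X, y):
    """J(theta) = (1/2m) * sum over i of (h_theta(x^(i)) - y^(i))^2."""
    m = len(y)
    errors = X @ theta - y          # h_theta(x^(i)) - y^(i) for all i at once
    return (errors @ errors) / (2 * m)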

Gradient Descent for $\theta_0,\theta_1,\theta_2....\theta_n$
repeat until convergence $\{$

$\theta_j := \theta_j-\alpha {\frac{1}{m}}\sum_{i=1}^m{({h_\theta}(x^{(i)})-y^{(i)})}\cdot{x^{(i)}_j}$

$\}$ (simultaneously update $\theta_j$ for $j = 0, 1, \ldots, n$)


Since $x_0$ is always equal to 1, this single update rule covers gradient descent for both univariate and multivariate linear regression.
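A minimal sketch of the simultaneous update (NumPy; the learning rate `alpha`, iteration count, and function name are illustrative assumptions, and `X` is assumed to include the $x_0 = 1$ column):

```python
import numpy as np

def gradient_descent(X, y, alpha=0.01, num_iters=1000):
    """Repeat the simultaneous update of all theta_j for a fixed number of iterations."""
    m, n_plus_1 = X.shape
    theta = np.zeros(n_plus_1)
    for _ in range(num_iters):
        errors = X @ theta - y            # h_theta(x^(i)) - y^(i), shape (m,)
        gradient = (X.T @ errors) / m     # (1/m) * sum_i (error_i * x_j^(i)) for each j
        theta = theta - alpha * gradient  # update theta_0 .. theta_n simultaneously
    return theta
```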
