Multivariate Linear Regression:-
When more than one feature is given in linear regression, the problem is called multivariate linear regression. All the techniques are the same as in univariate linear regression, except that a vectorized form is used for better optimization.
Dataset:
To find the best-fit equation of the line for a linearly correlated dataset, we first assume the hypothesis
$ H \left( x \right) = \theta^{T} x $
where
$ x= \left[\begin{matrix} x_{0} \\ x_{1} \\ \vdots \\ x_{n} \end{matrix}\right] , \theta = \left[\begin{matrix} \theta_{0} \\ \theta_{1} \\ \vdots \\ \theta_{n} \end{matrix}\right] $
Here $ n $ is the number of features and $ x_{0}=1 $, so both vectors have $ n+1 $ entries.
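As a quick illustration, the hypothesis for a single example can be evaluated in Octave like this (the numbers are made up for illustration):

```matlab
% hypothetical example with n = 2 features, plus the bias entry x_0 = 1
theta = [1; 0.5; 2];    % [theta_0; theta_1; theta_2]
x     = [1; 3; 4];      % [x_0; x_1; x_2], with x_0 = 1
H     = theta' * x;     % H(x) = theta^T x = 1 + 0.5*3 + 2*4 = 10.5
```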
Next we define the cost function for multivariate linear regression, which is the same as for univariate linear regression:
$ J \left( \theta \right) = \frac{1}{ 2m}\sum_{i=1}^{m} \left( \theta^{T} x^{(i)}-y^{(i)} \right)^2 $
where $ m $ is the number of training examples.
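As a minimal sketch, this cost can be computed in vectorized Octave form; here `compute_cost` is a hypothetical helper name, and `X` is assumed to stack the training examples row-wise with a leading column of ones (as set up in the Matlab / Octave section below):

```matlab
% J(theta) = 1/(2m) * sum_i (theta' * x^(i) - y^(i))^2, vectorized
function J = compute_cost(X, y, theta)
  m = length(y);                            % number of training examples
  J = sum((X * theta - y) .^ 2) / (2 * m);  % squared errors, summed and scaled
end
```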
The values of $ \theta $ that minimize $ J $ give the equation that best fits the given dataset. We can find these optimal values of $ \theta $ using the gradient descent algorithm:
Repeat until convergence
{
$ \theta_{j} := \theta_{j} - \alpha \frac{\partial J }{\partial\theta_{j}} $
}
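For this cost function, the partial derivative has a closed form, so each step of the loop updates every parameter $ \theta_{j} $ simultaneously as:

$ \theta_{j} := \theta_{j} - \frac{\alpha}{m} \sum_{i=1}^{m} \left( \theta^{T} x^{(i)} - y^{(i)} \right) x_{j}^{(i)} $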
Matlab / Octave approach to Linear Regression:-
The dataset format is given as:
$ X= \left[\begin{matrix} 1 & a_{1,2} & \cdots & a_{1,n+1} \\ 1 & a_{2,2} & \cdots & a_{2,n+1} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & a_{m,2} & \cdots & a_{m,n+1} \end{matrix}\right] , \theta = \left[\begin{matrix} \theta_{0} \\ \theta_{1} \\ \vdots \\ \theta_{n} \end{matrix}\right] , y= \left[\begin{matrix} y_{1} \\ y_{2} \\ \vdots \\ y_{m} \end{matrix}\right] $
where $ a_{i,j} $ is the value of a feature for the $ i $-th training example, and $ m $ is the number of training examples.
```matlab
% X, y, and theta follow the mathematical format given above;
% x (m x n feature matrix) and y (m x 1 targets) are assumed to be loaded
m = length(y);
X = [ones(m, 1), x];            % prepend the column of ones (x_0 = 1)
theta = zeros(size(X, 2), 1);   % one parameter per column of X (n+1 total)

% plotting the values of x and y (for a single feature)
plot(x, y);

% gradient descent function
function theta = grad_descent(X, y, theta, alpha, iterations)
  m = length(y);
  for iter = 1:iterations
    % vectorized implementation: S is a 1 x (n+1) row vector,
    % S(j) = sum_i (theta' * x^(i) - y^(i)) * x_j^(i)
    S = sum((X * theta - y) .* X);
    theta = theta - (alpha / m) * S';
  end
end

theta = grad_descent(X, y, theta, 0.04, 2500);

% plotting the hypothesis
hold on;                        % keep previous plot visible
plot(X(:, 2), X * theta, '-');
hold off;

% Now we can predict the value of y for any new input x_val,
% where x_val must include the leading 1 for the intercept term
predict_y = theta' * x_val;
```
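For example, a prediction for a new input with a single feature might look like this (the feature value 6.2 is made up for illustration):

```matlab
x_val = [1; 6.2];             % prepend 1 to match the format of a row of X
predict_y = theta' * x_val;   % predicted y for the new input
```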