Wednesday, 13 July 2016

practical linear regression


Intro to Linear Regression:-

Linear regression is used where the variables or features in the dataset are linearly correlated. We find a hypothesis [the best-fit equation, i.e. the equation of a line for linear regression] that describes this relationship.

Example of an LR dataset:-

E.g. 1: Suppose we are given the everyday net data usage of a user from the 1st day to the Nth day, and we now want to estimate that user's data usage for the (N+1)th day in advance.
For that, we first preprocess the given dataset to make it linearly correlated, by setting the Nth day's value to the sum of the data usage from the 1st day to the (N-1)th day. Then we simply apply the linear regression technique, provided there is no discontinuity in the data usage vs. the number of days.

So the dataset is the number of days vs. the net data usage.
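Below is a minimal Octave sketch of this preprocessing idea, reading it as a cumulative sum of the daily values; the variable names and usage numbers here are made up purely for illustration:

daily_usage = [120; 95; 140; 110; 130];   % hypothetical per-day usage values (e.g. in MB)
n = length(daily_usage);
days = (1:n)';                            % day index 1..N
cumulative_usage = cumsum(daily_usage);   % total usage accumulated up to each day
% `days` vs `cumulative_usage` is the (approximately) linearly correlated
% dataset that linear regression is then fitted on.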

So, to find the best-fit equation of the line for a linearly correlated dataset, we first assume the hypothesis
$$ H \left(  x^{i}  \right) =   \theta _{0}+\theta _{1} \times x^{i}   $$
This is the equation of a line. In this equation, $ \theta_{0} $ is the intercept on the y axis and $ \theta_{1} $ is the gradient (slope) of the line, and $ x^{i} $ denotes the $ i^{th} $ data example in the dataset.
Here $ \theta_{0} $ and $ \theta_{1} $ are the variables (parameters) to be determined.
To find the values of $ \theta_{0} $ and $ \theta_{1} $, we can use the least squares method.
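As a side note, for this single-feature case the least squares values of $ \theta_{0} $ and $ \theta_{1} $ can also be obtained directly in Octave/MATLAB, once a design matrix X with a leading column of ones has been built (as is done in the code later in this post); the backslash operator returns the least squares solution of an overdetermined system:

theta_ls = X \ y;   % theta_ls(1) is the intercept, theta_ls(2) is the slope

The rest of this post follows the iterative gradient descent route instead.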
First, we define the cost function J:

$$ J \left(  \theta_{0} ,  \theta_{1}   \right) =  \frac{1}{2 n}\sum_{i=1}^{n} \left( H  \left( x^i \right)-y^i  \right)^2    $$

where n is the number of data examples in the dataset and $ y^i $ is the y value of the $ i^{th} $ data example. To find the values of $ \theta_{0} $ and $ \theta_{1} $ that give the best-fit line, we need to minimize this cost function.
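For intuition, here is a tiny worked example with made-up points: take the three data points (1, 1), (2, 2), (3, 2) and the candidate parameters $ \theta_{0} = 0 $, $ \theta_{1} = 1 $, so that $ H(x^{i}) = x^{i} $. Then

$$ J(0, 1) = \frac{1}{2 \times 3}\left[ (1-1)^2 + (2-2)^2 + (3-2)^2 \right] = \frac{1}{6} \approx 0.17 $$

A better-fitting line would give a smaller value of J.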

Objective

Minimize $ J \left(  \theta_{0} ,  \theta_{1}   \right) $ with respect to $ \theta_{0} $ and $ \theta_{1} $.
To minimize the cost function, we use the gradient descent algorithm.

Gradient descent algorithm:

Repeat until convergence {

$$  \theta_{0}=\theta_{0}- \alpha \times  \frac{\partial  J\left(  \theta_{0} ,  \theta_{1}   \right) }{\partial\theta_{0}}  $$
$$  \theta_{1}=\theta_{1}- \alpha \times  \frac{\partial  J\left(  \theta_{0} ,  \theta_{1}   \right) }{\partial\theta_{1}}  $$

}

Here $ \alpha $ is the learning rate, and $ \theta_{0} $ and $ \theta_{1} $ are updated simultaneously in each iteration.
The partial derivatives of the cost function J are given by:

$$ \frac{\partial J\left( \theta_{0}, \theta_{1} \right)}{\partial \theta_{0}} = \frac{1}{n}\sum_{i=1}^{n} \left( H\left( x^i \right) - y^i \right) $$
$$ \frac{\partial J\left( \theta_{0}, \theta_{1} \right)}{\partial \theta_{1}} = \frac{1}{n}\sum_{i=1}^{n} \left( H\left( x^i \right) - y^i \right) \times x^i $$

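Substituting these partial derivatives back into the update rules gives the explicit form that the MATLAB / Octave code below implements:

$$ \theta_{0} = \theta_{0} - \frac{\alpha}{n} \sum_{i=1}^{n} \left( H\left( x^i \right) - y^i \right) $$
$$ \theta_{1} = \theta_{1} - \frac{\alpha}{n} \sum_{i=1}^{n} \left( H\left( x^i \right) - y^i \right) \times x^i $$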
After convergence, we put the values of $ \theta_{0} $ and $ \theta_{1} $ back into the hypothesis to obtain the best-fit equation of the line.

MATLAB / Octave approach to Linear Regression:-

Dataset: ex1data1.txt, a two-column CSV file with the x values in the first column and the y values in the second.

% reading the csv dataset file
data = csvread('ex1data1.txt');

% extracting the value of x and y
x = data(:, 1);
y = data(:, 2);
m = length(y);               % number of training examples
X = [ones(m, 1), x];         % add a column of ones for the intercept term
theta = zeros(2, 1);         % initialize both parameters to zero

% plotting the value of x and y
plot(x, y, 'rx');            % plot the raw data points as a scatter

% cost function: J = 1/(2m) * sum of squared errors
function J = cost_J(X, y, theta)

  m = length(y);
  S = 0;
  for i = 1:m,
    % squared error of the hypothesis on the i-th example
    S = S + (theta(1) + theta(2)*X(i, 2) - y(i))^2;
  end;

  J = 1/(2*m) * S;

end

% gradient descent (loop-based implementation)
function theta = grad_descent(X, y, theta, alpha, iterations)

  m = length(y);
  for iter = 1:iterations
    S1 = 0;
    S2 = 0;
    for i = 1:m,
      % accumulate the gradient sums for the intercept and slope terms
      S1 = S1 + (theta(1) + theta(2)*X(i, 2) - y(i)) * X(i, 1);
      S2 = S2 + (theta(1) + theta(2)*X(i, 2) - y(i)) * X(i, 2);
    end;
    S = [S1; S2];
    % simultaneous update of both parameters
    theta = theta - alpha/m * S;
  end

end
theta = grad_descent(X,y,theta,0.04,2500);
% plotting the hypothesis 
hold on; % keep previous plot visible
plot(X(:,2), X*theta, '-');
hold off;
% Now we can predict the value of y for any value of x by
predict_y = theta(1)+theta(2)*x_val;
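To sanity-check the fit, the cost function defined above can be evaluated before and after running gradient descent; the value should drop substantially. This is just an optional check using the functions already defined:

% optional: compare the cost before and after training
initial_cost = cost_J(X, y, zeros(2,1));
final_cost   = cost_J(X, y, theta);
fprintf('cost before: %f, cost after: %f\n', initial_cost, final_cost);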

Vectorized implementation of linear regression:- 



% reading the csv dataset file
data = csvread('ex1data1.txt');

% extracting the value of x and y
x = data(:, 1);
y = data(:, 2);
m = length(y);               % number of training examples
X = [ones(m, 1), x];         % add a column of ones for the intercept term
theta = zeros(2, 1);         % initialize both parameters to zero

% plotting the value of x and y
plot(x, y, 'rx');            % plot the raw data points as a scatter

% cost function: J = 1/(2m) * sum of squared errors
function J = cost_J(X, y, theta)

  m = length(y);
  S = 0;
  for i = 1:m,
    % squared error of the hypothesis on the i-th example
    S = S + (theta(1) + theta(2)*X(i, 2) - y(i))^2;
  end;

  J = 1/(2*m) * S;

end

% gradient descent (vectorized implementation)
function theta = grad_descent(X, y, theta, alpha, iterations)

  m = length(y);
  for iter = 1:iterations
    % vectorized gradient: S is the 2x1 vector of gradient sums
    S = X' * (X*theta - y);
    % simultaneous update of both parameters
    theta = theta - alpha/m * S;
  end

end
theta = grad_descent(X,y,theta,0.04,2500);
% plotting the hypothesis 
hold on; % keep previous plot visible
plot(X(:,2), X*theta, '-');
hold off;
% Now we can predict the value of y for any value of x by
predict_y = theta(1)+theta(2)*x_val;
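For completeness, the cost function can be vectorized in the same spirit. This is a small sketch consistent with the cost definition above, computing all the residuals in one matrix operation:

% vectorized cost function: J = 1/(2m) * sum of squared residuals
function J = cost_J_vec(X, y, theta)
  m = length(y);
  errors = X*theta - y;            % column vector of residuals H(x) - y
  J = (errors' * errors) / (2*m);
end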

Useful dataset links for implementing linear regression models:-

For this purpose, the UCI Machine Learning Repository is the best resource I have found so far on the internet.
