{"id":11152,"date":"2020-12-31T07:53:00","date_gmt":"2020-12-31T07:53:00","guid":{"rendered":"https:\/\/www.askpython.com\/?p=11152"},"modified":"2021-01-06T08:57:54","modified_gmt":"2021-01-06T08:57:54","slug":"linear-regression-from-scratch","status":"publish","type":"post","link":"https:\/\/www.askpython.com\/python\/examples\/linear-regression-from-scratch","title":{"rendered":"Linear Regression from Scratch in Python"},"content":{"rendered":"\n<p>In this article, we&#8217;ll learn to implement linear regression from scratch using Python. Linear regression is one of the most basic and commonly used types of predictive analysis. <\/p>\n\n\n\n<p>It is used to predict the value of a variable based on the value of another variable. The variable we want to predict is called the dependent variable. <\/p>\n\n\n\n<p>The variable we use to predict the dependent variable&#8217;s value is called the independent variable.<\/p>\n\n\n\n<p>The simplest form of the regression equation, with one dependent and one independent variable, is:<\/p>\n\n\n\n<p> y = m * x + b<\/p>\n\n\n\n<p> where:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>y = estimated value of the dependent variable.<\/li><li>b = constant or bias.<\/li><li>m = regression coefficient or slope.<\/li><li>x = value of the independent variable.<\/li><\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">Linear Regression from Scratch<\/h2>\n\n\n\n<p>In this article, we will implement linear regression from scratch using only <a href=\"https:\/\/www.askpython.com\/python-modules\/numpy\/python-numpy-module\" class=\"rank-math-link\">NumPy<\/a>. <\/p>\n\n\n\n<h3 class=\"wp-block-heading\">1. Understanding the Loss Function<\/h3>\n\n\n\n<p>While there are many loss functions to choose from, we will use the <a href=\"https:\/\/www.askpython.com\/python\/examples\/rmse-root-mean-square-error\" class=\"rank-math-link\">Mean Squared Error<\/a> (MSE) as our loss function. 
<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-medium is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.askpython.com\/wp-content\/uploads\/2020\/12\/MSE-300x83.gif\" alt=\"MSE\" class=\"wp-image-11565\" width=\"400\" height=\"100\"\/><figcaption>MSE<\/figcaption><\/figure><\/div>\n\n\n\n<p>The mean squared error, as the name suggests, is the mean of the squared differences between the true and predicted values. <\/p>\n\n\n\n<p>Since the predicted value of y depends on the slope and the constant, our goal is to find the values of the slope and constant that minimize the loss function, or in other words, minimize the difference between the predicted and true values of y. <\/p>\n\n\n\n<h3 class=\"wp-block-heading\">2. Optimization Algorithm<\/h3>\n\n\n\n<p>Optimization algorithms are used to find the set of parameters that minimizes the loss function on a given training dataset. In our case, we need to find the optimal values of the slope (m) and the constant (b). <\/p>\n\n\n\n<p><strong>One such algorithm is gradient descent. <\/strong><\/p>\n\n\n\n<p>Gradient descent is one of the most widely used optimization algorithms in machine learning.<\/p>\n\n\n\n<p>Using gradient descent, we iteratively calculate the gradients of the loss function with respect to the parameters and keep updating the parameters until we reach a minimum. For the mean squared error of a straight-line fit, the loss is convex, so this minimum is also the global one. <\/p>\n\n\n\n<h3 class=\"wp-block-heading\">3. Steps to Implement Gradient Descent<\/h3>\n\n\n\n<p>Let&#8217;s understand how the gradient descent algorithm works behind the scenes. <\/p>\n\n\n\n<p><strong>Step 1: Initializing the parameters<\/strong><\/p>\n\n\n\n<p>Here, we need to initialize the values of our parameters. Let&#8217;s keep <code>slope = 0<\/code> and <code>constant = 0<\/code>. 
<\/p>\n\n\n\n<p>We will also need a learning rate, which determines the step size at each iteration as we move toward a minimum of our loss function.<\/p>\n\n\n\n<p><strong>Step 2: Calculating the partial derivatives with respect to the parameters<\/strong><\/p>\n\n\n\n<p>Here, we take the partial derivative of our loss function with respect to each parameter. <\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"596\" height=\"316\" src=\"https:\/\/www.askpython.com\/wp-content\/uploads\/2020\/12\/Gradient-of-slope-and-bias.png\" alt=\"Gradient Of Slope And Bias\" class=\"wp-image-11568\" srcset=\"https:\/\/www.askpython.com\/wp-content\/uploads\/2020\/12\/Gradient-of-slope-and-bias.png 596w, https:\/\/www.askpython.com\/wp-content\/uploads\/2020\/12\/Gradient-of-slope-and-bias-300x159.png 300w\" sizes=\"auto, (max-width: 596px) 100vw, 596px\" \/><figcaption>Gradient Of Slope And Bias<\/figcaption><\/figure><\/div>\n\n\n\n<p><strong>Step 3: Updating the parameters<\/strong><\/p>\n\n\n\n<p>Now, we update the values of our parameters using the equations given below:<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"457\" height=\"294\" src=\"https:\/\/www.askpython.com\/wp-content\/uploads\/2020\/12\/Updating-Parameters.jpg\" alt=\"Updating Parameters\" class=\"wp-image-11576\" srcset=\"https:\/\/www.askpython.com\/wp-content\/uploads\/2020\/12\/Updating-Parameters.jpg 457w, https:\/\/www.askpython.com\/wp-content\/uploads\/2020\/12\/Updating-Parameters-300x193.jpg 300w\" sizes=\"auto, (max-width: 457px) 100vw, 457px\" \/><figcaption>Updating Parameters<\/figcaption><\/figure><\/div>\n\n\n\n<p>Each update moves the parameters a small step in the direction that reduces the loss, and with it the difference between the true and predicted values.<\/p>\n\n\n\n<p>Repeat the process to reach 
a minimum of the loss function. <\/p>\n\n\n\n<h3 class=\"wp-block-heading\">4. Implementing Linear Regression from Scratch in Python<\/h3>\n\n\n\n<p>Now that we have an idea of how linear regression can be implemented using gradient descent, let&#8217;s code it in Python.<\/p>\n\n\n\n<p>We will define a <code>LinearRegression<\/code> class with two methods, <code>.fit( )<\/code> and <code>.predict( )<\/code>.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\n#Import required modules\nimport numpy as np\n\n#Defining the class\nclass LinearRegression:\n    def __init__(self, x , y):\n        self.data = x       #independent variable (NumPy array)\n        self.label = y      #dependent variable (NumPy array)\n        self.m = 0          #slope\n        self.b = 0          #constant\n        self.n = len(x)     #number of samples\n        \n    def fit(self , epochs , lr):\n        \n        #Implementing Gradient Descent\n        for i in range(epochs):\n            y_pred = self.m * self.data + self.b\n            \n            #Calculating derivatives of the MSE w.r.t. the parameters\n            D_m = (-2\/self.n)*np.sum(self.data * (self.label - y_pred))\n            D_b = (-2\/self.n)*np.sum(self.label - y_pred)\n            \n            #Updating Parameters\n            self.m = self.m - lr * D_m\n            self.b = self.b - lr * D_b\n            \n    def predict(self , inp):\n        y_pred = self.m * inp + self.b \n        return y_pred\n<\/pre><\/div>\n\n\n<p>We create an instance of our <code>LinearRegression<\/code> class with the training data as input, and initialize the slope and constant values to 0.<\/p>\n\n\n\n<p>The <code>.fit( )<\/code> method in our class implements gradient descent: on each iteration, we calculate the partial derivatives of the loss function with respect to the parameters, and then update the parameters using the learning rate and the gradient values. 
<\/p>\n\n\n\n<p>With the <code>.predict( )<\/code> method, we simply evaluate the function <code>y = m * x + b<\/code> using the learned values of our parameters. In other words, this method evaluates the fitted line at the given inputs.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">5. Testing the Linear Regression Model<\/h3>\n\n\n\n<p>Now that we have created our class, let&#8217;s test it on the data. Learn more about <a href=\"https:\/\/www.askpython.com\/python\/examples\/split-data-training-and-testing-set\" class=\"rank-math-link\">how to split training and testing data sets<\/a>. You can find the datasets and other resources used within this tutorial <a aria-label=\" (opens in a new tab)\" href=\"https:\/\/github.com\/Ash007-kali\/Article-Datasets\/tree\/main\/Linear%20Regression\" target=\"_blank\" rel=\"noreferrer noopener nofollow\" class=\"rank-math-link\">here<\/a>.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\n#Importing pandas for loading the data and Matplotlib for plotting\n#(NumPy and the LinearRegression class come from the previous block)\nimport pandas as pd\nimport matplotlib.pyplot as plt\n\n#Loading the data\ndf = pd.read_csv(&#039;data_LinearRegression.csv&#039;)\n\n#Preparing the data\nx = np.array(df.iloc&#x5B;:,0])\ny = np.array(df.iloc&#x5B;:,1])\n\n#Creating the class object\nregressor = LinearRegression(x,y)\n\n#Training the model with .fit method\nregressor.fit(1000 , 0.0001) # epochs-1000 , learning_rate - 0.0001\n\n#Predicting the values\ny_pred = regressor.predict(x)\n\n#Plotting the results\nplt.figure(figsize = (10,6))\nplt.scatter(x,y , color = &#039;green&#039;)\nplt.plot(x , y_pred , color = &#039;k&#039; , lw = 3)\nplt.xlabel(&#039;x&#039; , size = 20)\nplt.ylabel(&#039;y&#039;, size = 20)\nplt.show()\n<\/pre><\/div>\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"2560\" height=\"1536\" src=\"https:\/\/www.askpython.com\/wp-content\/uploads\/2020\/12\/Prediction-Linear-Regression-scaled.jpeg\" alt=\"Prediction 
Linear Regression\" class=\"wp-image-11574\" srcset=\"https:\/\/www.askpython.com\/wp-content\/uploads\/2020\/12\/Prediction-Linear-Regression-scaled.jpeg 2560w, https:\/\/www.askpython.com\/wp-content\/uploads\/2020\/12\/Prediction-Linear-Regression-300x180.jpeg 300w, https:\/\/www.askpython.com\/wp-content\/uploads\/2020\/12\/Prediction-Linear-Regression-1024x614.jpeg 1024w, https:\/\/www.askpython.com\/wp-content\/uploads\/2020\/12\/Prediction-Linear-Regression-768x461.jpeg 768w, https:\/\/www.askpython.com\/wp-content\/uploads\/2020\/12\/Prediction-Linear-Regression-1536x922.jpeg 1536w, https:\/\/www.askpython.com\/wp-content\/uploads\/2020\/12\/Prediction-Linear-Regression-2048x1229.jpeg 2048w\" sizes=\"auto, (max-width: 2560px) 100vw, 2560px\" \/><figcaption>Prediction Linear Regression<\/figcaption><\/figure><\/div>\n\n\n\n<p>Works fine!<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Conclusion<\/h2>\n\n\n\n<p>In this article, we built a linear regression model from scratch using only NumPy. The goal of this tutorial was to give you a deeper sense of what linear regression actually is and how it works.<\/p>\n\n\n\n<p>Till we meet next time.<\/p>\n\n\n\n<p>Happy Learning!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In this article, we&#8217;ll learn to implement linear regression from scratch using Python. Linear regression is one of the most basic and commonly used types of predictive analysis. It is used to predict the value of a variable based on the value of another variable. The variable we want to predict is called the dependent variable. 
The [&hellip;]<\/p>\n","protected":false},"author":16,"featured_media":11845,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[9],"tags":[],"class_list":["post-11152","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-examples"],"blocksy_meta":[],"_links":{"self":[{"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/posts\/11152","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/users\/16"}],"replies":[{"embeddable":true,"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/comments?post=11152"}],"version-history":[{"count":0,"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/posts\/11152\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/media\/11845"}],"wp:attachment":[{"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/media?parent=11152"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/categories?post=11152"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/tags?post=11152"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}