Linear Regression and Gradient Descent
Linear regression
The model function for linear regression, which is a function that maps from $x$ to $y$, is represented as $f_{w,b}(x) = wx + b$. To train a linear regression model, you want to find the best parameters $(w,b)$ that fit your dataset.

- To compare how one choice of $(w,b)$ is better or worse than another choice, you can evaluate it with a cost function $J(w,b)$. $J$ is a function of $(w,b)$. That is, the value of the cost $J(w,b)$ depends on the value of $(w,b)$.
- The choice of $(w,b)$ that fits your data the best is the one that has the smallest cost $J(w,b)$.

To find the values $(w,b)$ that get the smallest possible cost $J(w,b)$, you can use a method called gradient descent.

- With each step of gradient descent, your parameters $(w,b)$ come closer to the optimal values that will achieve the lowest cost $J(w,b)$.

The trained linear regression model can then take the input feature $x$ and output a prediction $\hat{y}$.
Gradient Descent
Gradient descent involves repeated steps to adjust the value of your parameters $(w,b)$.

- At each step of gradient descent, it will be helpful for you to monitor your progress by computing the cost $J(w,b)$ as $(w,b)$ gets updated.
- In this section, you will implement a function to calculate $J(w,b)$ so that you can check the progress of your gradient descent implementation.
Cost function
As you may recall from the lecture, for one variable, the cost function for linear regression is:

$$J(w,b) = \frac{1}{2m} \sum_{i=0}^{m-1} \left(f_{w,b}(x^{(i)}) - y^{(i)}\right)^2$$

where

- $f_{w,b}(x^{(i)})$ is the model's prediction for example $i$ through the mapping $f_{w,b}$, as opposed to $y^{(i)}$, which is the actual value from the dataset.
- $m$ is the number of training examples in the dataset.
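This cost can be computed in a few lines of NumPy; the following is a minimal sketch, where the function name `compute_cost` and the toy dataset are our own choices for illustration:

```python
import numpy as np

def compute_cost(x, y, w, b):
    """Compute the linear regression cost J(w, b) over m training examples.

    x: (m,) array of input features
    y: (m,) array of target values
    w, b: scalar model parameters
    """
    m = x.shape[0]
    predictions = w * x + b                      # f_{w,b}(x^(i)) for every example
    cost = np.sum((predictions - y) ** 2) / (2 * m)
    return cost

# Tiny illustrative dataset: two examples that lie exactly on y = 200x + 100
x_train = np.array([1.0, 2.0])
y_train = np.array([300.0, 500.0])
print(compute_cost(x_train, y_train, 200.0, 100.0))  # perfect fit -> 0.0
```

Because the predictions match the targets exactly for $w=200$, $b=100$, the cost is zero; any other parameter choice yields a strictly larger cost.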
Model prediction
- For linear regression with one variable, the prediction of the model $f_{w,b}$ for an example $x^{(i)}$ is represented as:

$$f_{w,b}(x^{(i)}) = wx^{(i)} + b$$

This is the equation for a line, with an intercept $b$ and a slope $w$.
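The line equation above can be sketched as a one-line helper; the name `predict` is ours, and passing a NumPy array makes the prediction apply elementwise to every example:

```python
import numpy as np

def predict(x, w, b):
    """Linear model prediction f_{w,b}(x) = w*x + b (elementwise for arrays)."""
    return w * x + b

# A single example and a batch of examples, with illustrative parameters
print(predict(1.2, 200.0, 100.0))                     # -> 340.0
print(predict(np.array([1.0, 1.2]), 200.0, 100.0))    # -> [300. 340.]
```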
Algorithm
The gradient descent algorithm is:

$$\begin{align*}
\text{repeat until convergence: } \lbrace & \\
w &:= w - \alpha \frac{\partial J(w,b)}{\partial w} \\
b &:= b - \alpha \frac{\partial J(w,b)}{\partial b} \\
\rbrace &
\end{align*}$$

where the parameters $w$, $b$ are updated simultaneously, with gradients

$$\frac{\partial J(w,b)}{\partial w} = \frac{1}{m} \sum_{i=0}^{m-1} \left(f_{w,b}(x^{(i)}) - y^{(i)}\right) x^{(i)}$$

$$\frac{\partial J(w,b)}{\partial b} = \frac{1}{m} \sum_{i=0}^{m-1} \left(f_{w,b}(x^{(i)}) - y^{(i)}\right)$$

Here $f_{w,b}(x^{(i)})$ is the model's prediction, while $y^{(i)}$ is the target value.
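The update rule above can be sketched in NumPy as follows. This is a minimal illustration, not the lab's reference implementation: the function names, the fixed iteration count in place of a convergence test, and the synthetic dataset are all our own choices.

```python
import numpy as np

def compute_gradient(x, y, w, b):
    """Partial derivatives of J(w,b) with respect to w and b."""
    m = x.shape[0]
    err = (w * x + b) - y            # f_{w,b}(x^(i)) - y^(i) for every example
    dj_dw = np.sum(err * x) / m
    dj_db = np.sum(err) / m
    return dj_dw, dj_db

def gradient_descent(x, y, w, b, alpha, num_iters):
    """Run num_iters steps, updating w and b simultaneously."""
    for _ in range(num_iters):
        dj_dw, dj_db = compute_gradient(x, y, w, b)
        w = w - alpha * dj_dw        # both updates use gradients from the
        b = b - alpha * dj_db        # same (w, b), i.e. a simultaneous update
    return w, b

# Synthetic data generated from the line y = 2x + 1
x_train = np.array([1.0, 2.0, 3.0, 4.0])
y_train = 2.0 * x_train + 1.0
w, b = gradient_descent(x_train, y_train, 0.0, 0.0, alpha=0.05, num_iters=5000)
print(w, b)  # converges close to the true parameters w=2, b=1
```

Computing `dj_dw` and `dj_db` before either parameter is modified is what makes the update simultaneous; updating `w` first and then reusing it inside `dj_db` would implement a subtly different algorithm.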