Equations

Let’s take the first partial derivative as an example.

Using the chain rule:

The constant factors cancel out.

Differentiating $u = (y_{actual} - (w_0x_0 + w_1x_1 + \dots + w_{m-1}x_{m-1}))$ with respect to $w_0$ leaves only $-x_0$: every term other than $w_0x_0$ is treated as a constant, and the minus sign comes from the subtraction.
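
Spelled out term by term, the inner derivative can be written as follows. Note that the subtraction in $u$ contributes a minus sign:

$$
\frac{\partial u}{\partial w_0}
= \frac{\partial y_{actual}}{\partial w_0}
- \frac{\partial (w_0x_0)}{\partial w_0}
- \frac{\partial (w_1x_1)}{\partial w_0}
- \dots
- \frac{\partial (w_{m-1}x_{m-1})}{\partial w_0}
= 0 - x_0 - 0 - \dots - 0
= -x_0
$$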

Returning to the chain-rule expression:

Substitute $u$.

Substitute the $y_{prediction}$ function.
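
As a sketch of how these substitutions might chain together, assume a single-sample squared-error cost $J = \frac{1}{2}u^2$. The symbol $J$ and the $\frac{1}{2}$ factor are assumptions for illustration, since the cost function is not shown in this section; a different scaling or a sum over samples would change only the constant factor:

$$
\frac{\partial J}{\partial w_0}
= \frac{\partial J}{\partial u} \cdot \frac{\partial u}{\partial w_0}
= u \cdot (-x_0)
= -(y_{actual} - y_{prediction})\,x_0
= -\bigl(y_{actual} - (w_0x_0 + w_1x_1 + \dots + w_{m-1}x_{m-1})\bigr)\,x_0
$$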

Then do this for all the weights:
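
The same pattern for every weight can be checked numerically. The following is a minimal sketch, not a definitive implementation: it assumes a single-sample cost of $\frac{1}{2}(y_{actual} - y_{prediction})^2$, and the function names (`predict`, `cost`, `gradient`, `numerical_gradient`) and sample values are hypothetical. It compares the analytic partial derivatives with a central finite-difference estimate.

```python
def predict(weights, x):
    """Linear prediction: w_0*x_0 + w_1*x_1 + ... + w_{m-1}*x_{m-1}."""
    return sum(w * xi for w, xi in zip(weights, x))


def cost(weights, x, y_actual):
    """Assumed single-sample cost: 0.5 * (y_actual - y_prediction)^2."""
    u = y_actual - predict(weights, x)
    return 0.5 * u * u


def gradient(weights, x, y_actual):
    """Analytic partials: dJ/dw_j = -(y_actual - y_prediction) * x_j."""
    u = y_actual - predict(weights, x)
    return [-u * xi for xi in x]


def numerical_gradient(weights, x, y_actual, eps=1e-6):
    """Central finite differences, used only to sanity-check the analytic result."""
    grads = []
    for j in range(len(weights)):
        up = list(weights)
        down = list(weights)
        up[j] += eps
        down[j] -= eps
        grads.append((cost(up, x, y_actual) - cost(down, x, y_actual)) / (2 * eps))
    return grads


# Hypothetical sample values, chosen only for illustration.
weights = [0.5, -1.0, 2.0]
x = [1.0, 3.0, -2.0]
y_actual = 4.0

analytic = gradient(weights, x, y_actual)
numeric = numerical_gradient(weights, x, y_actual)

for j, (a, n) in enumerate(zip(analytic, numeric)):
    print(f"dJ/dw_{j}: analytic={a:.4f}, numeric={n:.4f}")
```

If the two columns agree to several decimal places, the derivation is consistent for each weight under the assumed cost.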