Zulip Chat Archive

Stream: maths

Topic: Proof of Linear Regression Minimizing the Minimum Squar...


Colin Jones ⚛️ (Jan 22 2025 at 16:27):

Hi

I wrote this file awhile ago, and it's not fully fleshed out. This file contains the proof that linear regression minimizes the SSE or sum of square errors. I would like people to look at it and give some thoughts if willing. This is rudimentary work, just a conceptual backbone; it may inspire or upset some.

First I define the statistical functions of mean (taking the average), SSE (or sum of squared errors), and MSE (minimum squared error or a modified variation of SSE). I kept the definitions as rational numbers to make them computable, but I've done the proof for real numbers as well somewhere. SSE and MSE take two functions like x or x^2 and computes the difference between the two and squares them (SSE) or squares them and divides by a number that is chosen by the user (this number also determines how many elements you retrieve from the specified function). Examples are in the document to how this works.

Taking the second function as a predictor value (Yhat), we can minimize SSE in two cases.

  1. When Yhat is required to be constant, I proved that Yhat = the mean of the first function for SSE to be minimized.
  2. When Yhat equals a linear function such as Yhat = px + q, then the 'partial' derivatives of SSE with respect to p and q are minimized when p and q equal values determined by linear regression.

Notes on the second proof: I put partial in quotation marks because I tricked lean by taking a total derivative based on separate variables, not technically a partial, and I was unable to translate the proof I wrote into a full proof that SSE is minimized, just that the derivatives were zero when p and q equaled a certain value.

Here's the file. Please comment and look through.
LinReg.lean

Aaron Liu (Jan 22 2025 at 19:15):

I was able to have a look at your code. I hope I wasn't too harsh!
LinRegReviewed.lean

Colin Jones ⚛️ (Feb 03 2025 at 04:33):

Thank you for taking a look at this! I've been updating the file to match your suggestions and will continue to improve it.


Last updated: May 02 2025 at 03:31 UTC