This looks interesting. Does anyone know how it compares to other learning theory books, like Foundations of Machine Learning [1], in terms of depth and approachability?
I honestly don't understand why people write these books anymore. Let me explain: there used to be a lot of these kinds of survey books that start with linear regression and end at... something classical. I can rattle off a lot of titles (Pattern Recognition and Machine Learning, Elements of Statistical Learning, Intro to Statistical Learning, blah blah blah). They all covered the same material at various levels of sophistication (some of them covered meta-theory like PAC learning or shattering dimension or empirical risk minimization or whatever). Some of them took the statistical approach and some of them took the optimization approach. Again: blah blah blah. The synthesis/summary is/was that there is no grand unified theory of machine learning, and everyone could see that.
And then "deep learning" arrived and it became even more obvious that the only thing that matters is data and time spent crunching numbers (more of both and you get better results no matter the model).
Again, I just want to be crystal clear, because I'm sure someone will pop in and claim "oh I still use SVM to pick my family's shopping list": no professional ML engineer/team/org today that ships an ML product "at scale" gives a fuck about SVMs or graphical models or bayes nets or kernel methods. No one. So who cares about all this sophistry? What value is there in learning concentration inequalities? Training goes brrr no matter what if you have enough data. And if you don't, if you're really building a model to predict your family's shopping list, I encourage you to reflect on whether it would be simpler to just ask your family what they want for dinner instead.
My 2 cents: teach people/students useful things instead of this stuff. They'll be happier and you'll feel more fulfilled (even though you didn't get to flex your big math brain).
Maybe the book is just not for you. That doesn't mean it's not for anyone.
I understand that deep learning is all in vogue now. But when I was in graduate school, a professor asked me why I was using neural nets in a project, since they were not as good as SVMs. We used to study Vapnik and VC dimensions, SVMs, etc., and neural nets were totally out of fashion.
Imagine what would have happened if everybody had used and researched only the methods that worked best at the time. And deep learning could benefit from a theory that explains why, when, and how it works so well. Maybe someone working on this material could extend it in that direction.
Also, I don't think you're right to assume that all models out there are deep learning models. Yes, they are very good for many cases (especially those with less structured data, like images or NLP). But in some cases gradient boosting or even GLMs are better suited for the task (because of the structure and size of the data, or because of computing restrictions).
And in the end, people can just want to learn it because they find it interesting.
It’s a bit sad to do only things that are “useful”. That’s my 2 cents.
Some of these "ML" methods have applications outside of what you'd think of as ML. My background is in control theory, which relies on guarantees you just can't get from neural nets. Skimming through the outline, there are tons of methods here which are used in controls and estimation — certainly they're still useful.
> no professional ML engineer/team/org today that ships an ML product "at scale" gives a fuck about SVMs or graphical models or bayes nets or kernel methods.
There's a reason why "AI is just statistics" became a meme: a lot of places do use textbook machine learning techniques and dress it up as AI. Yes, deep learning will win with enough data, but few companies have that luxury.
> This book is not for you... you might want to look out for one in the "for dummies" series.
My guy, I learned this material from a healthy mix of ESL, Casella & Berger, and Billingsley. I could still, to this day, probably do every proof in this book without reviewing the material. And yet, despite all that training, I still argue this book is not useful for absolutely anything except assigning homework problems and setting exams.
"Why yet another book on learning theory? ...the main reason is that I felt that the current trend in the mathematical analysis of machine learning was leading to overly complicated arguments and results that are often not relevant to practitioners. Therefore, my aim was to propose the simplest formulations that can be derived from first principles, trying to remain rigorous without overwhelming readers with more powerful results that require too much mathematical sophistication."
From my own reading and experience with the mathematical analysis of this "training goes brrr" regime, I thought the material in Chapter 12, Overparameterized Models, was interesting and coherent, with 12.2.4, Linear Regression with Gaussian Projections, being an especially elegant explanation. It would be interesting to hear whether you have read/skimmed/perused this section and found it wanting, etc.
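For anyone curious what that section is getting at, here is a minimal numerical sketch of the phenomenon (my own toy construction with made-up parameters, not the book's derivation): fit minimum-norm least squares on k Gaussian random projections of the inputs and sweep k past the interpolation threshold k = n. The test error typically falls, spikes near k = n, and then falls again as k grows toward d, which is the double-descent shape that chapter analyzes.

    # Minimal sketch (toy setup, not the book's 12.2.4 derivation): min-norm least
    # squares on k Gaussian random projections of the inputs, swept past the
    # interpolation threshold k = n to expose the double-descent curve.
    import numpy as np

    rng = np.random.default_rng(0)
    d, n_train, n_test, noise = 400, 100, 2000, 0.5

    w_star = rng.normal(size=d) / np.sqrt(d)          # well-spread linear target
    X_tr = rng.normal(size=(n_train, d))
    X_te = rng.normal(size=(n_test, d))
    y_tr = X_tr @ w_star + noise * rng.normal(size=n_train)
    y_te = X_te @ w_star + noise * rng.normal(size=n_test)

    for k in [10, 50, 90, 100, 110, 150, 250, 400]:
        S = rng.normal(size=(d, k)) / np.sqrt(d)      # Gaussian projection to k features
        # lstsq returns the minimum-norm solution once k exceeds n_train,
        # i.e. the interpolating estimator studied in the overparameterized regime
        theta, *_ = np.linalg.lstsq(X_tr @ S, y_tr, rcond=None)
        mse = np.mean((X_te @ S @ theta - y_te) ** 2)
        print(f"k = {k:4d}   test MSE = {mse:7.3f}")

The reason for using lstsq rather than an explicit inverse is that it gives the minimum-norm interpolant when k > n_train, which is exactly the kind of estimator whose risk comes back down on the overparameterized side.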
The pursuit of knowledge is not a linear path. The reason you benefit from deep learning now is because a few people in the past believed neural networks had a future despite not working as well as other techniques such as SVMs.
Discovering knowledge and using the knowledge that works best are very different.
Your argument reminds me of this lecture from Feynman. Quoting him: "...and every theoretical physicist that is any good knows 6 or 7 different theoretical representations for exactly the same physics and knows that they are equivalent... but he keeps them in his head hoping that they'll give him different ideas for guessing."
This book doesn't seem massively different from several other existing textbooks. There are also several good textbooks on deep learning specifically (I'd recommend the new Bishop).
This textbook is hardly irrelevant for people who only care about deep learning though. It covers regularisation, optimisation, overparameterised models, double descent and err, neural networks. Sounds pretty relevant to me?
If you think the rest of the book is irrelevant then skip it.
You sound a bit nutty when you confidently state nobody uses any of the other methods in this book. How could you possibly know that?
You seem to know a lot about this area. I do not, but I've heard that deep learning models are a black box that's hard to explain? If you work in a "mission critical" field you'd have to explain all the math behind the model. Let's say in healthcare, finance, aviation, etc.
Also, the "big math brain"'s you're talking about probably read all the books your shutting down. I'd say their big math brains are the reason we have LLMs today.
I vouch for this approach ... not just in ML/DL but as a general way of learning things in life.
Resources like this are useful in an academic setting. Let's not forget that we forget 50% or more of what we learn within about 20 minutes unless we consistently remind ourselves of it.
Unless others prove me wrong using personal anecdotes.
This feels boring because students can't connect it to the bigger picture. We still need to learn the fundamentals, but they must be connected to an actual product, or else people will forget it after leaving university.
I see a big danger in that line of thinking. It was (in part) what relegated NNs, which you rate so highly, to the basement for a long time, as many people thought "let's learn expert systems, that is what ships, nobody cares about NNs".
I would like to learn it all, especially the things that led to the current state of the art.
[1] https://cs.nyu.edu/~mohri/mlbook/