tl;dr: paper proposes a principle called 'prospective configuration' to explain how the brain does credit assignment and learns, as opposed to backprop. Backprop can lead to 'catastrophic interference', where learning new things ablates old associations, which doesn't match observed biological processes. From what I can tell, prosp. config learns by first solving for what the activations should have been to explain the error, and then updating the weights accordingly, which apparently avoids ablating old associations. They then show how prosp. config explains observed biological processes. Cool stuff, wish I could find the code. There are some supplemental notes:
https://static-content.springer.com/esm/art%3A10.1038%2Fs415...
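For anyone who wants something concrete to poke at, here is a minimal sketch of the "settle the activations first, then update the weights" idea from the tl;dr above. It is not the authors' code. It assumes a tiny predictive-coding-style energy network with one hidden layer, and the function names, layer sizes, learning rates and starting weights are all made up for illustration.

```python
import numpy as np

def feedforward(W1, W2, x0):
    """Ordinary forward pass: the prediction before any relaxation."""
    return W2 @ np.tanh(W1 @ x0)

def relax_then_update(W1, W2, x0, target, n_relax=50, act_lr=0.1, w_lr=0.05):
    """One learning step: settle hidden activity first, then change weights.

    Assumed energy: E = 0.5*|x1 - tanh(W1 x0)|^2 + 0.5*|target - W2 x1|^2
    """
    h = np.tanh(W1 @ x0)   # feedforward guess for the hidden layer
    x1 = h.copy()          # hidden activity that is allowed to move
    for _ in range(n_relax):
        e1 = x1 - h                 # hidden-layer prediction error
        e2 = target - W2 @ x1       # output error, output clamped to target
        # Gradient descent on E with respect to the hidden activity x1.
        x1 = x1 - act_lr * (e1 - W2.T @ e2)
    # Errors at the settled activity drive purely local weight changes.
    e1 = x1 - h
    e2 = target - W2 @ x1
    W2_new = W2 + w_lr * np.outer(e2, x1)
    W1_new = W1 + w_lr * np.outer(e1 * (1.0 - h**2), x0)
    return W1_new, W2_new

# Hand-picked toy values (hypothetical, chosen only for reproducibility).
W1 = np.array([[0.5, -0.3], [0.2, 0.4], [-0.1, 0.6]])
W2 = np.array([[0.3, 0.5, 0.4]])
x0 = np.array([1.0, 0.5])
target = np.array([1.0])

error_before = abs(float(feedforward(W1, W2, x0)[0] - target[0]))
for _ in range(100):
    W1, W2 = relax_then_update(W1, W2, x0, target)
error_after = abs(float(feedforward(W1, W2, x0)[0] - target[0]))
print(error_before, error_after)
```

This only shows the two-phase mechanics on a single input/target pair. It says nothing about catastrophic interference, which would need at least two competing associations to test.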
A simulation of a thing is not the thing itself, but it is illuminating.
> pile of linear algebra
The entirety of physics is -- as you say -- a 'pile of linear algebra' and 'backprop' (differential linear algebra...)
Most people find that if you move away from a topic and into a new one your knowledge of it starts to decay over time. 20+ years ago I had a job as a Perl and VB6 developer, I think most of my knowledge of those languages has been evacuated to make way for all the other technologies I've learned since (and 20 years of life experiences). Isn't that an example of "learning new things ablates old associations"?
Stuff like childhood memories seems very deeply ingrained even if rarely or never reinforced. I can still remember the phone number of our house we moved out of in 1991, when I was 8 or 9. If I’m still alive in 30/40/50 years time, I expect I’ll still remember it then.
It is alien to us, that doesn't mean it is harmless.
It might be a response to the many, many claims in articles that neural networks work like the brain. Even using terms like neurons and synapses. With those claims getting widespread, people also start building theories on top of them that make AI’s more like humans. Then, we won’t need humans or they’ll be extinct or something.
Many of us who are tired of that are both countering it and just using different terms for each where possible. So, I’m calling the AI’s models, saying model training instead of learning, and finding and acting on patterns in data. Even laypeople seem to understand these terms with less confusion about them being just like brains.
Artificial neural networks originated as simplified models of how the brain actually works. So they really do "work like the brain" in the sense of taking inspiration from certain rudiments of its workings. The problem is "like" can mean anything from "almost the same as" to "in a vaguely resembling or reminiscent way". The claim that artificial neural networks "work like the brain" is false under the first reading of "like" but true under the second.
One of my favorite features is how they use local, likely Hebbian, learning instead of global learning with backpropagation. (I won’t rule out some global mechanism, though.) The local learning makes their training much more efficient. Even if a global mechanism exists (e.g. during sleep?), brain architectures could run through more training data faster and cheaper. The expensive step would just tidy things up in shorter periods of time.
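To make "local" concrete, here is a small sketch of a Hebbian-style update. It uses Oja's rule, a normalized Hebbian variant that I picked for illustration (nobody is claiming the brain uses exactly this). The point is that each weight change depends only on the input x and the unit's own output y, with no error signal sent back from elsewhere in the network. The data and learning rate are made up.

```python
import numpy as np

def oja_update(w, x, lr=0.005):
    """Oja's rule: Hebbian term y*x plus a decay term that keeps |w| near 1."""
    y = w @ x                        # the unit's own output
    return w + lr * y * (x - y * w)  # uses only locally available x, y, w

rng = np.random.default_rng(0)
direction = np.array([1.0, 1.0]) / np.sqrt(2.0)  # dominant direction in the toy data
w = np.array([1.0, 0.0])                         # arbitrary starting weights

for _ in range(5000):
    # Toy input: strong variation along `direction`, weak isotropic noise.
    x = 3.0 * rng.standard_normal() * direction + 0.3 * rng.standard_normal(2)
    w = oja_update(w, x)

# Cosine between the learned weights and the dominant direction.
alignment = abs(float(w @ direction)) / float(np.linalg.norm(w))
print(alignment)
```

With no teacher and no backward pass, the unit ends up tuned to the direction of largest variance in its input, which is roughly what "finding patterns locally" means here.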
They are also more analog, parallel, sparse, and flexible. They have feedback loops (IIRC). Multiple tiers of memory integrated with their internal representation with hallucination mitigation. They also have many specialized components that automatically coordinate to do the work without being externally trained to. All in around 100 watts.
Brains are both different from and vastly superior to ANN’s. Similarities do exist, though. They both have cells, connections, and change connections based on incoming data. Quite abstract. Past that, I’m not sure what other similarities they have. Some non-brain-inspired ANN’s have memory in some form but I don’t know if it’s as effective and integrated as the brain’s yet.
Funny enough, I actually worked with Rafal Bogacz, the last-named author of the paper we’re discussing, during his Basal Ganglia (BG) phase. He’s an incredibly sharp guy and made a pretty compelling argument that the BG implement the multihypothesis sequential probability ratio test (MSPRT) to decide between competing action plans in an optimal way.
Back then, there was another popular theory that the BG used an actor-critic learning model—also quite convincing.
But here’s the rub: in CN, the trend is to take algorithms from computer science and statistics and map them onto biology. What’s far rarer is extracting new ML algorithms from the biology itself.
I got into CN because I thought the only way we’d ever crack AGI was by unlocking the secrets of the best example we’ve got—the mammalian brain. Unfortunately, I ended up frustrated with the biology-led approach. In ten years in the field, I didn’t see anything that really felt like progress toward AGI. CN just moves so much slower than mainstream ML!
Still, I hope Rafal’s onto something with this latest idea. Fingers crossed it gives ML researchers a shiny new algorithm to play with.
Except the networks studied here for prospective configuration are ... neural networks. No changes to the architecture have been proposed, only a new learning algorithm.
If anything, this article lends credence to the idea that ANNs do -- at some level -- simulate the same kind of thing that goes on in the brain. That is to say that the article posits that some set of weights would replicate the brain pretty closely. The issue is how to find those weights. Backprop is one of many known -- and used -- algorithms. It is liked because the mechanism is well understood (function minimization using calculus). There have been many other ways suggested to train ANNs (genetic algorithms, annealing, etc). This one suggests an energy-based approach, which is also not novel.
In scientific investigations, it's best to look at one component, or feature, at a time. It's also common to put the feature in an existing architecture to assess the difference that feature makes in isolation. Many papers trying to imitate brain architecture only use one feature in the study. I've seen them try stateful neurons, spiking, sparsity, Hebbian learning, hippocampus-like memory, etc. Others will study combinations of such things.
So, the field looks at brain-inspired changes to common ML, specific components that closely follow brain design (software or hardware), and whole architectures imitating brain principles with artificial deviations. And everything in between. :)
This paper is an incremental step along that path but commenters here are acting as if it's a polemic against neural nets.
Do "AI" fanbois really think LLMs work like a biological brain?
This only reinforces the old maxim: Artificial intelligence will never be a match for natural stupidity
If you read the article you'd know two things: (1) the article explicitly calls out Hopfield networks as being more bio-similar (Hopfield networks are intricately connected to attention layers) and (2) the overall architecture (the inference pass) of the networks studied here remains unmodified. Only the training mechanism changes.
As for a direct addressing of the claim... if the article is on point, then 'learning' has a much more encompassing physical manifestation than was previously thought. Really any system that self optimizes would be seen as bio-similar. In both mechanisms, there's a process to drive the system to 'convergence'. The issue is how fast that convergence is, not the end result.
The current HN title ("Brain learning differs fundamentally from artificial intelligence systems") seems very heavily editorialized.
Making the 'fundamental difference' the focus seems like laying the foundation for a claim that AI lacks some ability because of the difference. The difference does mean you cannot infer abilities present in one by detecting them in the other. This is similar to, and about as profound as, saying that you cannot say that rocks can move fast because of their lack of legs. Which is true, but says nothing about the ability of rocks to move fast by other means.
The contribution of the paper, and its actual title, is about the proposed mechanism.
All the comments amounting to ‘no shit, sherlock’, are about the mangled headline, not the paper.
1. both ANNs and the brain need to solve the credit assignment problem
2. backprop works well for ANNs but probably isn't how the problem is solved in the brain
This paper is really interesting, but is more a novel theory about how the brain solves the credit assignment problem. The HN title makes it sound like differences between the brain and ANNs were previously unknown and is misleading IMO.
Agreed on both counts. There's nothing surprising in "there are differences between the brain and ANN's."
But there might be something useful in the "novel theory about how the brain solves the credit assignment problem" presented in the paper. At least for me, it caught my attention enough to justify giving it a full reading sometime soon.
Dang it, how did I miss that. Uugh. :-(
For example, let's say instead of gradient descent you want to do a Newton descent. Then maybe there's a better way to compute the needed weight updates besides backprop?
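A toy illustration of the difference, for what it's worth. Backprop hands you the gradient. A Newton step also needs curvature (the Hessian), which is the part you would have to compute some other way. The quadratic below is made up, and its Hessian is just the matrix A, so this is a sketch of the two update rules rather than of any real network.

```python
import numpy as np

# Toy loss: f(w) = 0.5 * w^T A w - b^T w, badly conditioned on purpose.
A = np.array([[10.0, 0.0], [0.0, 1.0]])   # Hessian of the toy loss
b = np.array([1.0, 1.0])

def grad(w):
    """Gradient of the toy loss (the quantity backprop would give you)."""
    return A @ w - b

w0 = np.zeros(2)

# Gradient descent: one step, same step size in every direction.
w_gd = w0 - 0.1 * grad(w0)

# Newton: rescale the gradient by the inverse Hessian (via a linear solve).
w_newton = w0 - np.linalg.solve(A, grad(w0))

w_star = np.linalg.solve(A, b)            # exact minimizer, for comparison
print(w_gd, w_newton, w_star)
```

On a quadratic, Newton lands on the minimizer in one step, while the gradient step is still far off along the flat direction. For a real network the Hessian is far too large to form, which is why approximations to it are the interesting part.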
The important thing is backprop does work and so we're just scaling it up to absurd levels to get good results. There is going to be a big step change found sooner or later where training gets a lot better. Maybe there is some sort of threshold we're looking for where a trick only works for models with lots of parameters or something before we stumble on it, but if evolution can do it so will researchers.
IIRC, feedback alignment [1] approximates Gauss-Newton minimization. So there is an easier way, that is potentially biologically more plausible, though not necessarily a better way.
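For readers who haven't seen it, a rough sketch of what feedback alignment changes. The hidden layer receives the output error through a fixed feedback matrix B instead of the transpose of the forward weights that backprop would use. This is a toy two-layer linear network with hand-picked numbers (B would normally be random). All names and values are assumptions for illustration, and the example does not test the Gauss-Newton claim.

```python
import numpy as np

def feedback_alignment_step(W1, W2, B, x, target, lr=0.05):
    """One update of a two-layer linear net using a fixed feedback matrix B."""
    h = W1 @ x                      # hidden activity
    delta = target - W2 @ h         # output error
    hidden_signal = B @ delta       # backprop would use W2.T @ delta here
    W2_new = W2 + lr * np.outer(delta, h)
    W1_new = W1 + lr * np.outer(hidden_signal, x)
    return W1_new, W2_new

# Hand-picked toy values (hypothetical, fixed for reproducibility).
W1 = np.array([[0.2, 0.1], [0.1, 0.3], [0.3, -0.1]])
W2 = np.array([[0.4, -0.1, 0.2]])
B = np.array([[0.5], [0.4], [0.6]])   # fixed feedback weights, never trained
x = np.array([1.0, 0.5])
target = np.array([1.0])

error_before = abs(float((W2 @ (W1 @ x))[0] - target[0]))
for _ in range(200):
    W1, W2 = feedback_alignment_step(W1, W2, B, x, target)
error_after = abs(float((W2 @ (W1 @ x))[0] - target[0]))
print(error_before, error_after)
```

The backward path never needs to know the forward weights, which is the biological-plausibility argument. Whether that makes it a better optimizer is a separate question.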
There are no words in the title which express this. Your own brain is "making it sound" like that. Misleading, yes, but attribute it correctly.
The angle I actually see in it though is the typical pitiful appeal to the idea that the brain is this incredible thing we should never hope to unravel, that AI bad, and that everyone working on AI is an idiot as per the link (and then the link painting a leaps and bounds more nuanced picture).
Prospective Configuration is an actual algorithm that, to my understanding, attempts to reproduce input patterns but can also engage in supervised learning.
I'm less clear on Prospective Configuration than the other two, which I've worked with directly.
What would neural activity changes look like in an ML model?
(I like enactive models of perception such as those advocated by Alva Noe, Humberto Maturana, Francisco Varela, and others. They get us well beyond the straitjacket of Cartesian dualism.)
Rather than have error signals tweak synaptic weights after a behavior, a cognitive system generates a set of actions it predicts will accommodate needs. This can apparently be accomplished without requiring short-term synaptic plasticity. Then if all is good, weights are modified in a secondary phase that is more about asserting utility of the “test” response. More selection than descent. The emphasis is more on feedforward modulation and selection. Clearly there must be error signal feedback, so some of you may argue that the distinction will be blurry at some levels. Agreed.
Look forward to reading more carefully to see how far off-base I am.
Obviously. So can the scraping grifters who claim that AI 'learns just like a human' please shut up and never inflict their odious presence on the rest of humanity again? And also pay 10X damages for ruining the Internet.