Writing an LLM from scratch, part 8 – trainable self-attention
Published on: 2025-07-03 18:41:14
Writing an LLM from scratch, part 8 -- trainable self-attention
This is the eighth post in my trek through Sebastian Raschka's book "Build a Large Language Model (from Scratch)". I'm blogging about bits that grab my interest, and things I had to rack my brains over, as a way to get things straight in my own head -- and perhaps to help anyone else that is working through it too. It's been almost a month since my last update -- and if you were suspecting that I was blogging about blogging and spending time getting LaTeX working on this site as procrastination because this next section was always going to be a hard one, then you were 100% right! The good news is that -- as so often happens with these things -- it turned out to not be all that tough when I really got down to it. Momentum regained.
If you found this blog through the blogging-about-blogging, welcome! Those posts were not all that typical, though, and I hope you'll enjoy this return to my normal form.
This time I'm coverin
... Read full article.