Machine Learning

A community for posting things related to machine learning

The Random Transformer | Understand how transformers work by demystifying all the math behind them (osanseviero.github.io)

submitted 8 months ago by ericjmorey@programming.dev to c/machine_learning@programming.dev

January 1, 2024 - Omar Sanseviero writes:

In this blog post, we’ll do an end-to-end example of the math within a transformer model. The goal is to get a good understanding of how the model works. To make this manageable, we’ll do lots of simplification. As we’ll be doing quite a bit of the math by hand, we’ll reduce the dimensions of the model. For example, rather than using embeddings of 512 values, we’ll use embeddings of 4 values. This will make the math easier to follow! We’ll use random vectors and matrices, but you can use your own values if you want to follow along.

Read The Random Transformer | Understand how transformers work by demystifying all the math behind them
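
For anyone who wants to follow along in code rather than by hand, here is a rough sketch of the idea in the excerpt: random 4-value embeddings pushed through a single self-attention step. This is not taken from the blog post. The token names, the seed, the helper functions, and the choice of a single attention head with 4x4 projection matrices are all assumptions made for illustration.

```python
import math
import random

D_MODEL = 4  # embedding size from the excerpt (4 values instead of 512)


def random_matrix(rows, cols, rng):
    """Build a rows x cols matrix of random values in [-1, 1]."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def matmul(a, b):
    """Plain matrix product of two nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def softmax(row):
    """Numerically stable softmax over one row."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention for a single head."""
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    scale = math.sqrt(len(k[0]))  # square root of the key dimension
    scores = [[s / scale for s in row] for row in matmul(q, transpose(k))]
    weights = [softmax(row) for row in scores]
    return weights, matmul(weights, v)


rng = random.Random(0)       # fixed seed (assumption) so runs are repeatable
tokens = ["hello", "world"]  # hypothetical two-token input

# One random 4-value embedding per token, plus random projection matrices.
embeddings = random_matrix(len(tokens), D_MODEL, rng)
w_q = random_matrix(D_MODEL, D_MODEL, rng)
w_k = random_matrix(D_MODEL, D_MODEL, rng)
w_v = random_matrix(D_MODEL, D_MODEL, rng)

weights, output = self_attention(embeddings, w_q, w_k, w_v)

for token, row in zip(tokens, weights):
    print(token, [round(w, 3) for w in row])
```

Each printed row shows how much one token attends to every token in the input, and each row sums to 1. With matrices this small, every intermediate value can be checked by hand, which is the point of shrinking the dimensions.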
