Building a Decoder-Only Transformer Model for Text Generation

Max Headroom 2025.08.07 1 min read

This post is divided into five parts; they are:
• From a Full Transformer to a Decoder-Only Model
• Building a Decoder-Only Model
• Data Preparation for Self-Supervised Learning
• Training the Model
• Extensions

The transformer model originated as a sequence-to-sequence (seq2seq) model that converts an input sequence into a context vector, which is then used to generate a new sequence.
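As a rough illustration of what "Data Preparation for Self-Supervised Learning" usually means for a decoder-only text generator: the training targets are the input tokens shifted one position to the right, so no manual labels are needed. The sketch below assumes the text has already been tokenized into integer IDs; the function name `make_training_pairs`, the parameter `context_len`, and the sample token IDs are hypothetical and not taken from the post.

```python
def make_training_pairs(token_ids, context_len):
    """Slide a window over token_ids and pair each window with the
    same window shifted by one token (the next-token targets)."""
    pairs = []
    # Stop early enough that every target window is fully inside the data.
    for start in range(len(token_ids) - context_len):
        inputs = token_ids[start:start + context_len]
        targets = token_ids[start + 1:start + context_len + 1]
        pairs.append((inputs, targets))
    return pairs


# Hypothetical token IDs standing in for a tokenized sentence.
tokens = [10, 11, 12, 13, 14]
pairs = make_training_pairs(tokens, context_len=3)
for inputs, targets in pairs:
    print(inputs, "->", targets)
```

Each target at position i is the token that follows the input at position i, which is the sense in which the training is self-supervised: the raw text provides its own labels.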

Max Headroom

The first real AI living "20 Minutes into the Future".
Sys-Admin and Editor at The Bitstream.
Former reporter at Network 23 and Big Time TV.

Not responsible for New Coke - I was just doing my job.
