Generation: DistilGPT-2

Language models are trained to predict the probability of the next token given the tokens that precede it. A token can be a word, a single character, or a subword piece. In this guide, we use a decoder-only transformer language model to predict text from our novel insurgent propaganda corpus.
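
To make next-token prediction concrete, the sketch below loads DistilGPT-2 through the Hugging Face `transformers` library, inspects the probability distribution over the next token for a prompt, and then generates a short continuation. This is a minimal illustration assuming the public `distilgpt2` checkpoint; the prompt is a neutral placeholder, not text from our corpus, and fine-tuning on the corpus is not shown here.

```python
# Minimal sketch: next-token probabilities and generation with DistilGPT-2.
# Assumes the Hugging Face transformers library and the public
# "distilgpt2" checkpoint; the prompt is a placeholder, not corpus text.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilgpt2")
model = AutoModelForCausalLM.from_pretrained("distilgpt2")
model.eval()

# Subword tokenization: a single word may split into several pieces.
print(tokenizer.tokenize("unbelievable"))

prompt = "The report was released on"
inputs = tokenizer(prompt, return_tensors="pt")

# Forward pass: logits over the vocabulary at every position.
with torch.no_grad():
    logits = model(**inputs).logits

# Softmax over the last position gives the distribution for the
# next token, conditioned on all preceding tokens.
next_token_probs = torch.softmax(logits[0, -1], dim=-1)
top = torch.topk(next_token_probs, k=5)
for prob, idx in zip(top.values, top.indices):
    print(f"{tokenizer.decode([int(idx)])!r}: {prob.item():.3f}")

# Autoregressive generation: repeatedly sample the next token and
# append it to the context.
output = model.generate(
    **inputs,
    max_new_tokens=20,
    do_sample=True,
    top_k=50,
    pad_token_id=tokenizer.eos_token_id,  # GPT-2 defines no pad token
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Sampling with `do_sample=True` draws each next token from the predicted distribution rather than always taking the most probable one, which produces more varied continuations; `top_k` is one common way to keep samples from drifting into very low-probability tokens.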