Comparing GPT-3 and RNNs as probabilistic generative models
An investigation of how well transformers and RNNs model long-term dependencies, viewed through the lens of probabilistic modelling. You can find the write-up here, and the code here.
This post is licensed under CC BY 4.0 by the author.