Fork me on GitHub

Trending arXiv

Note: this version is tailored to @Smerity - though you can run your own! Trending arXiv may eventually be extended to multiple users ...

Effective Estimation of Deep Generative Language Models

Tom Pelsmaeker, Wilker Aziz

Advances in variational inference enable parameterisation of probabilistic models by deep neural networks. This combines the statistical transparency of the probabilistic modelling framework with the representational power of deep learning. Yet, it seems difficult to effectively estimate such models in the context of language modelling. Even models based on rather simple generative stories struggle to make use of additional structure due to a problem known as posterior collapse. We concentrate on one such model, namely, a variational auto-encoder, which we argue is an important building block in hierarchical probabilistic models of language. This paper contributes a sober view of the problem, a survey of techniques to address it, novel techniques, and extensions to the model. Our experiments on modelling written English text support a number of recommendations that should help researchers interested in this exciting field.

Captured tweets and retweets: 2