A Distributional Approach To Controlled Text Generation

Muhammad Khalifa* Hady Elsahar* Marc Dymetman*
ICLR 2021 (Oral presentation, top 2.1%)
* equal contribution
[Paper] [Code] [Blogpost] [Twitter Thread]

We propose a Distributional Approach to address Controlled Text Generation from pretrained Language Models (LMs). This view allows us to define, in a single formal framework, both "pointwise" and "distributional" constraints over the target LM (to our knowledge, the first approach with such generality) while minimizing KL divergence from the initial LM distribution. The optimal target distribution is then uniquely determined as an explicit EBM (Energy-Based Model) representation. From that optimal representation we train the target controlled autoregressive LM through an adaptive distributional variant of Policy Gradient. We conduct a first set of experiments over pointwise constraints, showing the advantages of our approach over a set of baselines in obtaining a controlled LM that balances constraint satisfaction with divergence from the initial LM (GPT-2). We then perform experiments over distributional constraints, a unique feature of our approach, demonstrating its potential as a remedy to the problem of bias in language models. Finally, an ablation study shows the effectiveness of our adaptive technique for obtaining faster convergence.
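As a minimal sketch of the construction (the notation here is our shorthand, not fixed by this page): writing a(x) for the initial LM and φ_i(x) for the constraint features with target moments μ̄_i, the distribution p that satisfies E_{x∼p}[φ_i(x)] = μ̄_i while minimizing KL(p ‖ a) has the explicit EBM form

P(x) = a(x) · exp(∑_i λ_i φ_i(x)),    p(x) = P(x) / Z,

where the coefficients λ_i are fitted so that the moment constraints hold and Z is the normalizing constant. Pointwise constraints are the special case of binary features with μ̄_i = 1; the adaptive Policy Gradient step then trains an autoregressive LM to approximate p.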
