Description
Neural sequence models have been applied with great success to a variety of tasks in natural language processing in recent years. However, because they generate their outputs one token at a time from left to right, they are not immediately suitable for domains where non-trivial output constraints must be satisfied for well-formedness, or where parallel decoding may be desirable for higher throughput. In this dissertation, we explore how we can overcome these limitations through the use of more highly structured models and more flexibly structured decoding algorithms.
On the modeling side, we first introduce a span-based neural model for constituency parsing that permits efficient, globally optimal decoding over the space of parse trees using a chart-based dynamic program. We then present the Abstract Syntax Network, a tree-structured neural model for code generation whose scoring modules are composed together in a way that mirrors the syntactic structure of the program being produced. Next, turning to more flexible decoding algorithms for sequences, we demonstrate how the Transformer sequence model can be extended to accommodate blockwise parallel decoding for significant improvements in decoding speed without compromising accuracy. Finally, we present the Insertion Transformer, an insertion-based sequence model that enables out-of-order generation and logarithmic-time parallel decoding.
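To make the first contribution concrete, the following is a minimal sketch of the kind of chart-based dynamic program referred to above, under the assumption that a tree's score decomposes into a sum of independent scores for its spans. The `span_scores` input is a hypothetical placeholder for the output of a neural span-scoring component, and labels are omitted for brevity; this is an illustration of the general CKY-style recurrence, not the dissertation's exact implementation.

```python
def cky_decode(span_scores):
    """Recover the highest-scoring unlabeled bracketing of a sentence.

    span_scores: dict mapping (i, j) with 0 <= i < j <= n to a float score
                 (hypothetical stand-in for a neural span-scoring model).
    Returns the best total score and the set of spans in the best tree.
    """
    n = max(j for _, j in span_scores)
    best = {}   # (i, j) -> best score of any subtree covering words [i, j)
    split = {}  # (i, j) -> split point achieving that best score

    # Fill the chart bottom-up over increasing span lengths.
    for length in range(1, n + 1):
        for i in range(0, n - length + 1):
            j = i + length
            score = span_scores.get((i, j), 0.0)
            if length == 1:
                best[(i, j)] = score
            else:
                k, sub = max(
                    ((k, best[(i, k)] + best[(k, j)]) for k in range(i + 1, j)),
                    key=lambda t: t[1],
                )
                best[(i, j)] = score + sub
                split[(i, j)] = k

    # Recover the spans of the optimal tree from the recorded split points.
    def backtrack(i, j):
        spans = {(i, j)}
        if j - i > 1:
            k = split[(i, j)]
            spans |= backtrack(i, k) | backtrack(k, j)
        return spans

    return best[(0, n)], backtrack(0, n)
```

Because each span is scored independently, the chart can be filled in O(n^3) time for a sentence of length n, and the returned tree is globally optimal with respect to the span-factored score.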
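The blockwise parallel decoding scheme can be sketched in a similarly schematic way. In the sketch below, `propose_block` and `verify` are hypothetical stand-ins for model calls: the former proposes k future tokens at once (e.g., from auxiliary prediction heads), and the latter returns the base model's greedy choice at each of the k + 1 candidate positions, which a Transformer can compute in a single parallel pass. This is a sketch of the general predict-verify-accept loop under those assumptions, not a definitive implementation.

```python
def blockwise_parallel_decode(propose_block, verify, prefix, k, max_len, eos):
    """Greedy blockwise parallel decoding with block size k.

    propose_block(out)    -> list of k proposed next tokens (hypothetical).
    verify(out, proposed) -> list of k + 1 tokens, where entry j is the base
                             model's greedy choice after out + proposed[:j]
                             (hypothetical; one parallel pass in practice).
    """
    out = list(prefix)
    while len(out) < max_len and (not out or out[-1] != eos):
        proposed = propose_block(out)      # guess k tokens in one step
        greedy = verify(out, proposed)     # check them in one model call

        # Accept the longest prefix of the proposal the base model agrees with.
        accept = 0
        while accept < k and proposed[accept] == greedy[accept]:
            accept += 1

        # Keep the verified tokens plus the base model's own next token, so the
        # output grows by at least one position per iteration.
        out.extend(proposed[:accept] + [greedy[accept]])
    return out[:max_len]
```

Since every accepted token matches the base model's greedy choice at its position, the result is identical to standard greedy decoding, but the output can advance by up to k + 1 positions per iteration of the loop, which is the source of the decoding speedup.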