Tags
239 tags across the wiki
Pages tagged sequence-to-sequence
📄 **[Read on arXiv](https://arxiv.org/abs/1706.03762)** Attention Is All You Need. Vaswani, Shazeer, Parmar, Uszkoreit, Jones, Gomez, Kaiser, Polosukhin, NeurIPS, 2017. - [The Annotated Transformer](htt…
📄 **[Read on arXiv](https://arxiv.org/abs/1409.0473)** Neural Machine Translation by Jointly Learning to Align and Translate (Bahdanau, Cho, Bengio). This paper introduced the attention mechanism to deep learning, arguably the single most influential architectural innovation leading to modern transformers and LLM…
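The mechanism this entry refers to can be sketched in a few lines: each encoder state is scored against the current decoder state, the scores are softmaxed into weights, and the weights form a context vector. A minimal plain-Python sketch of the additive (Bahdanau-style) scoring rule — all weight values and dimensions below are illustrative toy choices, not taken from the paper:

```python
import math

def additive_attention(decoder_state, encoder_states, W1, W2, v):
    """Score each encoder state h_j against decoder state s via
    score_j = v . tanh(W1 @ h_j + W2 @ s), softmax the scores into
    weights, and return (weights, context vector)."""
    def matvec(M, x):
        return [sum(m * xi for m, xi in zip(row, x)) for row in M]

    Ws = matvec(W2, decoder_state)
    scores = []
    for h in encoder_states:
        Wh = matvec(W1, h)
        t = [math.tanh(a + b) for a, b in zip(Wh, Ws)]
        scores.append(sum(vi * ti for vi, ti in zip(v, t)))

    # Numerically stable softmax over input positions.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]

    # Context vector: attention-weighted average of the encoder states.
    dim = len(encoder_states[0])
    context = [sum(w * h[d] for w, h in zip(weights, encoder_states))
               for d in range(dim)]
    return weights, context

# Toy 2-dimensional example with identity weight matrices.
W1 = [[1.0, 0.0], [0.0, 1.0]]
W2 = [[1.0, 0.0], [0.0, 1.0]]
v = [1.0, 1.0]
encoder_states = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
weights, context = additive_attention([1.0, 0.0], encoder_states, W1, W2, v)
```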
📄 **[Read on arXiv](https://arxiv.org/abs/1511.06391)** This paper by Samy Bengio, Oriol Vinyals, and Manjunath Kudlur challenges a core assumption in sequence modeling: that the order of input and output data is merely…
📄 **[Read on arXiv](https://arxiv.org/abs/1506.03134)** Pointer Networks repurpose the attention mechanism as an output distribution, replacing the fixed output vocabulary of sequence-to-sequence models with attention w…
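The repurposing this entry describes can be sketched directly: compute the same additive attention scores, but return the softmax over input positions as the output distribution itself, so the decoder "points" at an input element instead of emitting a token from a fixed vocabulary. A minimal plain-Python sketch — the function name and toy weights are illustrative, not the paper's:

```python
import math

def pointer_distribution(decoder_state, encoder_states, W1, W2, v):
    """Pointer-network output step: additive attention scores over the
    encoder states, softmaxed into a probability per INPUT position.
    There is no separate output vocabulary."""
    def matvec(M, x):
        return [sum(m * xi for m, xi in zip(row, x)) for row in M]

    Wd = matvec(W2, decoder_state)
    scores = []
    for e in encoder_states:
        We = matvec(W1, e)
        t = [math.tanh(a + b) for a, b in zip(We, Wd)]
        scores.append(sum(vi * ti for vi, ti in zip(v, t)))

    # The softmax over positions IS the output distribution.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

# Toy example: three input elements, 2-d states, identity toy weights.
W1 = [[1.0, 0.0], [0.0, 1.0]]
W2 = [[1.0, 0.0], [0.0, 1.0]]
v = [1.0, 1.0]
inputs = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
probs = pointer_distribution([0.0, 1.0], inputs, W1, W2, v)
```

Because the distribution's support is the input sequence itself, the output length and "vocabulary" grow with the input — which is what lets pointer networks handle problems like sorting or convex hulls where outputs are indices into the input.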