Showing 1-3 of 1234 results
- Attention Is All You Need
  The dominant sequence transduction models are based on complex...
  Submitted 12 June, 2017; originally announced June 2017.
- Attention Is Not All You Need: Pure Attention Loses Rank
  Attention-based architectures have become ubiquitous...
  Submitted 23 June, 2021.
- Efficient Transformers: A Survey
  Transformer model architectures have garnered immense interest...
  Submitted 14 September, 2020.