Scaled Dot-Product Attention
We compute the attention function on a set of queries simultaneously.
The two most commonly used attention functions are additive attention and dot-product attention.
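As a minimal sketch of the dot-product variant applied to several queries at once, the following assumes the standard formulation softmax(QK^T / sqrt(d_k))V. The function names, the list-of-lists representation, and the toy inputs are illustrative assumptions, not taken from the text.

```python
import math

def softmax(row):
    # Subtract the row maximum before exponentiating for numerical stability.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def scaled_dot_product_attention(Q, K, V):
    """Q is n_q x d_k, K is n_k x d_k, V is n_k x d_v (lists of lists).

    Hypothetical helper for illustration; returns an n_q x d_v result.
    """
    d_k = len(K[0])
    scale = 1.0 / math.sqrt(d_k)
    d_v = len(V[0])
    output = []
    for q in Q:  # all queries are processed in the same call
        # Scaled dot product of this query with every key.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in K]
        weights = softmax(scores)
        # Weighted sum of the value rows.
        output.append([sum(w * v[j] for w, v in zip(weights, V)) for j in range(d_v)])
    return output

# Toy inputs (assumed values): three queries, two key/value pairs.
Q = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0]]
result = scaled_dot_product_attention(Q, K, V)
```

The third query scores both keys equally, so its output is the plain average of the two value rows. An array library would replace the explicit loop over queries with a single matrix product, which is the practical payoff of handling the queries together.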