1. Choromanska, A., et al. (2020). "On the Importance of Initialization and Momentum in Deep Learning." *Proceedings of the 37th International Conference on Machine Learning*.
2. Dai, Z., et al. (2020). "Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention." *arXiv preprint arXiv:2006.16236*.
3. [Attention Mechanisms in Neural Networks](https://en.wikipedia.org/wiki/Attention_(machine_learning))