Commit 4f573e0

[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
1 parent 3b88484 commit 4f573e0

1 file changed: +9, -9 lines


neural_network/sliding_window_attention.py

@@ -1,7 +1,7 @@
 """
 - - - - - -- - - - - - - - - - - - - - - - - - - - - - -
 Name - - sliding_window_attention.py
-Goal - - Implement a neural network architecture using sliding 
+Goal - - Implement a neural network architecture using sliding
 window attention for sequence modeling tasks.
 Detail: Total 5 layers neural network
 * Input layer
@@ -12,11 +12,11 @@
 
 Date: 2024.10.20
 References:
-1. Choromanska, A., et al. (2020). "On the Importance of 
-Initialization and Momentum in Deep Learning." *Proceedings 
+1. Choromanska, A., et al. (2020). "On the Importance of
+Initialization and Momentum in Deep Learning." *Proceedings
 of the 37th International Conference on Machine Learning*.
-2. Dai, Z., et al. (2020). "Transformers are RNNs: Fast 
-Autoregressive Transformers with Linear Attention." 
+2. Dai, Z., et al. (2020). "Transformers are RNNs: Fast
+Autoregressive Transformers with Linear Attention."
 *arXiv preprint arXiv:2006.16236*.
 3. [Attention Mechanisms in Neural Networks](https://en.wikipedia.org/wiki/Attention_(machine_learning))
 - - - - - -- - - - - - - - - - - - - - - - - - - - - - -
@@ -28,7 +28,7 @@
 class SlidingWindowAttention:
     """Sliding Window Attention Module.
 
-    This class implements a sliding window attention mechanism where 
+    This class implements a sliding window attention mechanism where
     the model attends to a fixed-size window of context around each token.
 
     Attributes:
@@ -54,13 +54,13 @@ def forward(self, input_tensor: np.ndarray) -> np.ndarray:
         Forward pass for the sliding window attention.
 
         Args:
-            input_tensor (np.ndarray): Input tensor of shape (batch_size, 
+            input_tensor (np.ndarray): Input tensor of shape (batch_size,
                 seq_length, embed_dim).
 
         Returns:
             np.ndarray: Output tensor of shape (batch_size, seq_length, embed_dim).
 
-        >>> x = np.random.randn(2, 10, 4)  # Batch size 2, sequence 
+        >>> x = np.random.randn(2, 10, 4)  # Batch size 2, sequence
         >>> attention = SlidingWindowAttention(embed_dim=4, window_size=3)
         >>> output = attention.forward(x)
         >>> output.shape
@@ -95,7 +95,7 @@ def forward(self, input_tensor: np.ndarray) -> np.ndarray:
 
 # usage
 rng = np.random.default_rng()
-x = rng.standard_normal((2, 10, 4))  # Batch size 2, 
+x = rng.standard_normal((2, 10, 4))  # Batch size 2,
 attention = SlidingWindowAttention(embed_dim=4, window_size=3)
 output = attention.forward(x)
 print(output)
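For context, here is a minimal NumPy sketch of the sliding-window attention idea this file implements: each position attends only to a fixed window of neighboring tokens. The class name, the embed_dim and window_size parameters, and the forward signature are taken from the diff above; the dot-product scoring and softmax body below are illustrative assumptions, not the repository's actual implementation.

import numpy as np


class SlidingWindowAttention:
    """Attend to a fixed-size window of neighboring tokens around each position."""

    def __init__(self, embed_dim: int, window_size: int) -> None:
        self.embed_dim = embed_dim
        self.window_size = window_size

    def forward(self, input_tensor: np.ndarray) -> np.ndarray:
        # Assumed scheme: each output position is a softmax-weighted average
        # of the vectors inside its local window.
        batch_size, seq_length, embed_dim = input_tensor.shape
        output = np.zeros_like(input_tensor)
        for b in range(batch_size):
            for i in range(seq_length):
                # Attend only to positions within +/- window_size of token i.
                start = max(0, i - self.window_size)
                end = min(seq_length, i + self.window_size + 1)
                window = input_tensor[b, start:end, :]     # (w, d)
                scores = window @ input_tensor[b, i, :]    # (w,) dot-product scores
                scores = scores / np.sqrt(embed_dim)       # scale by sqrt(d)
                weights = np.exp(scores - scores.max())
                weights /= weights.sum()                   # softmax over the window
                output[b, i, :] = weights @ window         # weighted sum of window vectors
        return output


# Mirrors the usage block shown in the diff.
x = np.random.default_rng(0).standard_normal((2, 10, 4))  # batch 2, length 10, dim 4
attention = SlidingWindowAttention(embed_dim=4, window_size=3)
print(attention.forward(x).shape)  # (2, 10, 4)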
