Commit 0c0e6b5

Improved Documentation
1 parent 9e19e0d commit 0c0e6b5

3 files changed

+94
-73
lines changed

benchmarks/python/compiled_methods.py

Lines changed: 3 additions & 1 deletion
@@ -25,7 +25,9 @@ def compute_accelerations(accelerations, masses, positions):
2525
for index_p0 in range(nb_particles - 1):
2626
position0 = positions[index_p0]
2727
mass0 = masses[index_p0]
28-
28+
29+
# TODO: Use compiled methods like Numba & Pythran in vectorized approach.
30+
# Issue: https://github.com/khushi-411/numpy-benchmarks/issues/4
2931
for index_p1 in range(index_p0 + 1, nb_particles):
3032
mass1 = masses[index_p1]
3133

benchmarks/python/plot.py

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -24,7 +24,7 @@ def plot(x, labels, list_df, names):
2424
plt.yticks(fontsize=300)
2525
plt.legend(fontsize=250)
2626

27-
plt.ylabel(r'$\frac{Time}{nParticles^2}$', fontsize=350)
27+
plt.ylabel(r'$\frac{Time}{nParticles}{(sec)}$', fontsize=350)
2828
plt.xlabel(r"$Number\ of\ Particles\ Simulated(nParticles)$", fontsize=270)
2929
plt.title(r"$Library\ based\ Implementation$", fontsize=300)
3030

@@ -47,7 +47,7 @@ def plot(x, labels, list_df, names):
4747
plt.yticks(fontsize=300)
4848
plt.legend(fontsize=250)
4949

50-
plt.ylabel(r'$\frac{Time}{nParticles^2}$', fontsize=350)
50+
plt.ylabel(r'$\frac{Time}{nParticles}{(sec)}$', fontsize=350)
5151
plt.tick_params(axis='x', which='major', pad=50)
5252
plt.xlabel(r"$Number\ of\ Particles\ Simulated(nParticles)$", fontsize=270)
5353
plt.title(r"$Compiler\ based\ Implementation$", fontsize=300)
@@ -57,7 +57,7 @@ def plot(x, labels, list_df, names):
5757

5858
if __name__ == "__main__":
5959

60-
data_path = "data/table.csv"
60+
data_path = "benchmarks/data/table.csv"
6161
df = pd.read_csv(data_path)
6262
df = df.drop(['Unnamed: 0'], axis=1)
6363
df = df.T

content/en/benchmark.md

Lines changed: 88 additions & 69 deletions
Original file line numberDiff line numberDiff line change
@@ -3,12 +3,11 @@ title: NumPy Benchmarks
33
sidebar: false
44
---
55

6-
<img src = "/images/content_images/performance_benchmarking.png" alt = "Visualization" title = "Performance Benchmark; Number of Iterations: 5">
7-
6+
<img src = "/images/content_images/performance_benchmarking.png" alt = "Visualization" title = "Performance Benchmark; Number of Iterations: 50">
87

98
## Overview
109

11-
This blog post aims to benchmark NumPy's performance on the widely accepted N-body problem<a href="#nbody">[2]</a>. This work also compares NumPy with other popular libraries like pure Python and C++ and compilers like Numba and Pythran.
10+
This web page aims to benchmark NumPy's performance on the widely accepted N-body problem<a href="#nbody">[2]</a>. This work also compares NumPy with Python & C++ and with compilers like Numba and Pythran.
1211

1312
The objective of benchmarking NumPy revolves around the efficiency of the library in quasi real-life situations, and the N-body problem suits the purpose well. Benchmarking is performed over several iterations for different datasets to ensure the accuracy of the results.
1413

@@ -21,10 +20,10 @@ The objective of benchmarking NumPy revolves around the efficiency of the librar
2120
<!-- 2. About N-body Problem: Brief description on N-body problem and why it was chosen. -->
2221
<!-- 3. Dataset Description -->
2322
<!-- 4. Implemented Accelerators -->
24-
<!-- 5. Results -->
25-
<!-- 6. Source Code -->
26-
<!-- 7. References -->
27-
23+
<!-- 5. Source Code -->
24+
<!-- 6. Results -->
25+
<!-- 7. Conclusion -->
26+
<!-- 8. References -->
2827

2928
## About N-Body Problem
3029

@@ -46,23 +45,27 @@ From the definition above, the N-body problem includes the kinematics between th
4645

4746
A brief description of computations involved in solving the N-body problem is given below, along with the pseudo-code in the next section:
4847

49-
Consider $n$ bodies of masses $m_1, m_2, m_3, ... , m_n$, moving under the mutual [gravitational force](https://en.wikipedia.org/wiki/Gravity) of attraction between them in an [inertial frame of reference](https://en.wikipedia.org/wiki/Inertial_frame_of_reference) of three dimensions, such that consecutive positions and velocities of an ${ith}$ body are denoted by ($s_{i-1}$, $s_i$) and ($v_{i-1}$, $v_i$) respectively. The gravitational force felt on the $ith$ body of mass $m_i$ by a single body of mass $m_j$ is denoted as $F_{gravitational}$ and the acceleration of the $ith$ body is represented as $a_i$. Consider the position vectors of these two bodies as $r_i$ and $r_j$.
48+
Consider $n$ bodies of masses $m_1, m_2, m_3, ... , m_n$, moving under their mutual [gravitational force](https://en.wikipedia.org/wiki/Gravity) of attraction in a three-dimensional [inertial frame of reference](https://en.wikipedia.org/wiki/Inertial_frame_of_reference), such that the consecutive positions and velocities of the $i$th body are denoted by ($s_{k-1}$, $s_k$) and ($v_{k-1}$, $v_k$) respectively. According to [Newton's law of gravity](https://en.wikipedia.org/wiki/Newton%27s_law_of_universal_gravitation), the gravitational force exerted on the $i$th body of mass $m_i$ by a single body of mass $m_j$ is denoted as $F_{ij}$, and the acceleration of the $i$th body is represented as $a_i$. Let $r_i$ and $r_j$ be the position vectors of the two bodies, such that:
49+
50+
\begin{equation} {r_i} = {s_{k+1}} - {s_{k}} \tag{I} \end{equation}
51+
52+
\begin{equation} {r_j} = {r_{k}} - {r_{k+1}} \tag{II} \end{equation}
5053

5154
The final aim is to find time taken to evaluate the total energy of each particle in the celestial space at a given time step. The equations involved in solving the problem are listed below:
5255

53-
\begin{equation} {s_i} = {s_{i-1}} + {u\times t} + \frac{a\times t^2}{2} \tag{i} \end{equation}
56+
\begin{equation} {s_k} = {s_{k-1}} + {u\times t} + \frac{a\times t^2}{2} \tag{III} \end{equation}
5457

55-
\begin{equation}{v_i} = {v_{i-1}} + {a\times t} \tag{ii} \end{equation}
58+
\begin{equation}{v_k} = {v_{k-1}} + {a\times t} \tag{IV} \end{equation}
5659

57-
\begin{equation} {F_{ij}} = \frac{{G\times {m_i}\times {m_j}}\times \mid {r_j}-{r_i} \mid}{{\mid {r_j}-{r_i} \mid}^3} \tag{iii} \end{equation}
60+
\begin{equation} {F_{ij}} = \frac{{G\times {m_i}\times {m_j}}\times \mid {r_j}-{r_i} \mid}{{\mid {r_j}-{r_i} \mid}^3} \tag{V} \end{equation}
5861

59-
\begin{equation} {a_{i}} = \frac{F_{ij}}{m_{j}} \tag{iv} \end{equation}
62+
\begin{equation} {a_{i}} = \frac{F_{ij}}{m_{i}} \tag{VI} \end{equation}
6063

61-
\begin{equation} \textrm{Self Potential Energy} = \textrm{U} = -\frac{{m_i}\times {m_j}}{r^2} \tag{v} \end{equation}
64+
\begin{equation} \textrm{Self Potential Energy} = \textrm{U} = -\frac{{m_i}\times {m_j}}{r^2} \tag{VII} \end{equation}
6265

63-
\begin{equation} \textrm{Kinetic Energy} = \textrm{K.E} = \frac{\sum m\times v^2}{2} \tag{vi} \end{equation}
66+
\begin{equation} \textrm{Kinetic Energy} = \textrm{K.E} = \frac{\sum m\times v^2}{2} \tag{VIII} \end{equation}
6467

65-
\begin{equation} \textrm{Total Energy} = \textrm{Kinetic Energy} + \textrm{Self Potential Energy} \tag{vii} \end{equation}
68+
\begin{equation} \textrm{Total Energy} = \textrm{Kinetic Energy} + \textrm{Self Potential Energy} \tag{IX} \end{equation}
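
The force, acceleration, and kinematic equations above can be illustrated with a small pure-Python sketch. It is not the benchmarked implementation: it assumes units in which $G = 1$, reads the numerator of the force equation as the displacement vector $r_j - r_i$, and takes the acceleration of a body as the force on it divided by its own mass. The function names `compute_accelerations` and `advance` and the `G` parameter are chosen here for illustration only.

```python
import math


def compute_accelerations(masses, positions, G=1.0):
    """Return one acceleration 3-vector per body from pairwise Newtonian gravity.

    G defaults to 1.0, which is an assumption of this sketch.
    """
    n = len(masses)
    accelerations = [[0.0, 0.0, 0.0] for _ in range(n)]
    # Visit every unordered pair (i, j) once and apply equal and opposite forces.
    for i in range(n - 1):
        for j in range(i + 1, n):
            delta = [positions[j][k] - positions[i][k] for k in range(3)]
            distance = math.sqrt(sum(c * c for c in delta))
            scale = G / distance ** 3
            for k in range(3):
                # Dividing the force on a body by its own mass leaves the other mass.
                accelerations[i][k] += masses[j] * delta[k] * scale
                accelerations[j][k] -= masses[i] * delta[k] * scale
    return accelerations


def advance(positions, velocities, accelerations, dt):
    """Update positions and velocities in place over one time step dt."""
    for pos, vel, acc in zip(positions, velocities, accelerations):
        for k in range(3):
            pos[k] += vel[k] * dt + 0.5 * acc[k] * dt * dt
            vel[k] += acc[k] * dt
```

Visiting each pair once mirrors the nested `index_p0` / `index_p1` loops in `compiled_methods.py`. The update in `advance` follows the position and velocity equations literally, using a single acceleration per step, so it is simpler than the loop in the pseudo-code, which recomputes accelerations between the position and velocity updates.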
6669

6770
### Pseudo Code of Solving N-body Problem
6871

@@ -74,11 +77,11 @@ FOR time is less than or equal to time_end
7477
Calculate total initial energies:
7578
Calculate kinetic energy
7679
Calculate potential energy
77-
FOR i less than number_of_step
78-
Calculate positions (r[i+1])
80+
FOR k less than number_of_step
81+
Calculate positions (r[k+1])
7982
Swap accelerations
8083
Calculate accelerations
81-
Calculate velocities (v[i+1])
84+
Calculate velocities (v[k+1])
8285
Increment time
8386
IF number_of_step % 100 is not 0 THEN
8487
Calculate total energy
@@ -110,17 +113,10 @@ We considered accelerators like [Numba](http://numba.pydata.org/), [Pythran](htt
110113
111114
<div style="text-align: right">Source: <a href="http://numba.pydata.org/">Numba's Website</a></div>
112115

113-
Since Numba is a compiler focused on accelerating Python and NumPy codes, the user API of the library supports various decorators. The supported decorators are `@jit, @vectorize, @guvectorize, @stencil, @jitclass, @cfunc, @overload`. It also supports `nopython` mode to generate fully compiled results without the need for intermediate Python interpreter calls. Numba's assistance to NumPy arrays and functions also makes it a good candidate for comparison.
116+
Numba is a compiler focused on accelerating Python and NumPy code. It uses the industry-standard LLVM compiler library to translate Python functions into optimized machine code at runtime. Its user API offers a variety of decorators, such as `@jit, @vectorize, @guvectorize, @stencil, @jitclass, @cfunc, @overload`. We use just-in-time (JIT) compilation in this work. Numba also supports a `nopython` mode that generates fully compiled results without intermediate calls to the Python interpreter. Its support for NumPy arrays and functions also makes it a good candidate for comparison.
114117

115118
<!-- NumPy and Numba both use a similar type of compilation for ufuncs in manual looping resulting in the same speed. Another thing that Numba lacks behind is that it does not support all functions of NumPy. There are functions in NumPy which does not hold up some of the optional arguments in nopython mode. It can implement linear algebra calls in the compiled functions but does not return any faster implementation. -->
116119

117-
Implementation details for benchmarking:
118-
119-
* `jit` decorator from Numba was used to compile the Python functions just-in-time.
120-
* `cache = True`: To avoid repetitive compile time.
121-
* Uses NumPy arrays and loops.
122-
* Implemented `jit` decorated functions to call another `jit` decorated functions to increase the performance of our model.
123-
124120
### Pythran
125121

126122
> Pythran is an ahead of time compiler for a subset of the Python language, with a focus on scientific computing.
@@ -131,9 +127,51 @@ Since the focus of Pythran was on accelerating Python and NumPy codes, its C++ A
131127

132128
<!-- NumPy arrays in Cython should be stored in contiguous memory like C-style or Fortran to use Pythran in the backend. Here, the Pythran lacks behind. Another limitation is that the sequence of bytes of words must be the same as the targeted architecture to make Pythran work.-->
133129

130+
## Source Code
131+
132+
* The code is inspired by <a href = "https://github.com/paugier/nbabel">Pierre Augier's work on N-Body Problem</a>.
133+
* Visualization Code: <a href = "/benchmarks/python/plot.py">here</a>.
134+
135+
<html>
136+
<head>
137+
<style>
138+
table, th, td {
139+
border: 1px solid black;
140+
border-collapse: collapse;
141+
}
142+
</style>
143+
</head>
144+
<table>
145+
<tr>
146+
<td><b>Algorithm & Source Code</b></td>
147+
<td><b>Implementation Details</b></td>
148+
</tr>
149+
<tr>
150+
<td><a href = "/benchmarks/python/optimized_numpy.py">NumPy</a></td>
151+
<td>Vectorized Approach, Broadcasting Method, NumPy Arrays</td>
152+
</tr>
153+
<tr>
154+
<td><a href = "/benchmarks/python/pure_python.py">Python</a></td>
155+
<td>Standard Python Approach, Using List</td>
156+
</tr>
157+
<tr>
158+
<td><a href = "/benchmarks/cpp/main.cpp">C++</a></td>
159+
<td>C++ Implementation, GNU C++ Compiler</td>
160+
</tr>
161+
<tr>
162+
<td><a href = "/benchmarks/python/compiled_methods.py">Numba</a></td>
163+
<td>Just-In-time Compilation, Non-Vectorized Approach, Using Numba at the Backend via Transonic, NumPy Arrays</td>
164+
</tr>
165+
<tr>
166+
<td><a href = "/benchmarks/python/compiled_methods.py">Pythran</a></td>
167+
<td>Just-In-Time Compilation, Non-Vectorized Approach, Pythran at the Backend via Transonic, NumPy Arrays</td>
168+
</tr>
169+
</table>
170+
</html>
171+
134172
## Results
135173

136-
Table values represent the normalized time taken in seconds by each algorithm to run on the given datasets for $5$ number of iterations.
174+
Table values represent the normalized time taken in seconds by each algorithm to run on the given datasets over $50$ iterations. The raw timing data can be downloaded from <a href = "benchmarks/data/table.csv">here</a>.
137175

138176
<html>
139177
<head>
@@ -152,99 +190,80 @@ table, th, td {
152190
<td><b>32</b></td>
153191
<td><b>64</b></td>
154192
<td><b>128</b></td>
193+
<td><b>256</b></td>
155194
</tr>
156195
<tr>
157196
<tr>
158197
<td><b>NumPy</b></td>
159198
<td>12.61</td>
160199
<td>13.88</td>
161200
<td>15.59</td>
162-
<td>17.9</td>
201+
<td>17.90</td>
202+
<td>18.27</td>
163203
</tr>
164204
<tr>
165205
<td><b>Python</b></td>
166206
<td>12.85</td>
167207
<td>26.82</td>
168208
<td>50.13</td>
169209
<td>105.01</td>
210+
<td></td>
170211
</tr>
171212
<tr>
172213
<td><b>C++</b></td>
173214
<td>1.646</td>
174215
<td>3.206</td>
175216
<td>5.725</td>
176217
<td>11.44</td>
218+
<td>19.43</td>
177219
<tr>
178220
<td><b>Numba</b></td>
179221
<td>1.567</td>
180222
<td>3.223</td>
181223
<td>6.521</td>
182224
<td>13.64</td>
225+
<td>26.64</td>
183226
</tr>
184227
<tr>
185228
<td><b>Pythran</b></td>
186229
<td>0.3177</td>
187230
<td>0.6591</td>
188231
<td>1.2811</td>
189232
<td>2.5082</td>
233+
<td>5.2042</td>
190234
</tr>
191235
</table>
192236
</body>
193237
</html>
194238

195-
**Note** on machine configuration used for benchmarking:
239+
## Environment configuration
196240

197-
* **Machine:** Intel(R) Core(TM) i7-10870H CPU @ 2.20GHz, 16GB RAM
241+
* **CPU Model:** Intel(R) Core(TM) i7-10870H CPU @ 2.20GHz
242+
* **RAM GB:** 16
243+
* **RAM Model:** DDR4
244+
* **Speed:** 3200 MT/s
198245
* **Operating System:** Manjaro Linux 21.1.1, Pahvo
199246
* **Library Versions:**
200247
* Python: 3.9.6
201248
* NumPy: 1.20.3
202249
* Numba: 0.54.0
203250
* Pythran: 0.9.12.post1
204251
* Transonic: 0.4.10
252+
* GCC: 11.1.0
205253

206-
## Source Code
254+
## Conclusion
207255

208-
* The code is highly inspired by <a href = "https://github.com/paugier/nbabel">Pierre Augier's work on N-Body Problem</a>.
209-
* Visualization Code: <a href = "/benchmarks/python/plot.py">here</a>.
256+
* NumPy is very efficient, especially for larger datasets. It is $3.2$ times faster than Python for input size $64$, $5.8$ times faster for input size $128$, and $$ times faster for input size $256$. NumPy's advantage over Python grows considerably as the number of particles increases, thanks to its vectorized approach. Vectorization keeps the code clean and concise, and it delivers better performance without explicit looping or indexing. The concept is easy for beginners to learn, and having fewer lines of code also helps experienced developers debug errors.
257+
* NumPy relies on pre-compiled C code, which contributes to its performance. The table shows NumPy approaching the speed of C++ as the input grows. For a dataset of size $64$, NumPy is $2.72$ times slower than C++. For size $128$, the gap narrows to $1.56$ times the running time of C++. For input size $256$, NumPy outperforms C++ by a factor of $1.06$.
210258

211-
<html>
212-
<head>
213-
<style>
214-
table, th, td {
215-
border: 1px solid black;
216-
border-collapse: collapse;
217-
}
218-
</style>
219-
</head>
220-
<table>
221-
<tr>
222-
<td><b>Algorithm & Source Code</b></td>
223-
<td><b>Implementation Details</b></td>
224-
</tr>
225-
<tr>
226-
<td><a href = "/benchmarks/python/optimized_numpy.py">NumPy</a></td>
227-
<td>Vectorized Approach</td>
228-
</tr>
229-
<tr>
230-
<td><a href = "/benchmarks/python/pure_python.py">Python</a></td>
231-
<td>Standard Python Approach</td>
232-
</tr>
233-
<tr>
234-
<td><a href = "/benchmarks/cpp/main.cpp">C++</a></td>
235-
<td>C++ Implementation, GNU C++ Compiler</td>
236-
</tr>
237-
<tr>
238-
<td><a href = "/benchmarks/python/compiled_methods.py">Numba</a></td>
239-
<td>Just-In-time Compilation, Non-Vectorized Approach, Numba via Transonic Compiler</td>
240-
</tr>
241-
<tr>
242-
<td><a href = "/benchmarks/python/compiled_methods.py">Pythran</a></td>
243-
<td>Just-In-Time Compilation, Non-Vectorized Approach, Pythran via Transonic Compiler</td>
244-
</tr>
245-
</table>
246-
</html>
259+
**How can we accelerate NumPy?**
260+
261+
NumPy aims to keep improving and to give better performance to end users, and it performs well in most cases. To fill the gaps where NumPy falls short, compiled methods such as Numba and Pythran are widely used. In this implementation, we used Transonic's JIT compilation as the backend for NumPy arrays to run Numba and Pythran. Specifically, we compare NumPy's vectorized approach with the JIT-compiled non-vectorized approach.
262+
263+
* We observed that Numba is $2.39$ times faster than NumPy for input size $64$ and $1.31$ times faster for input size $128$. For input size $256$, however, NumPy is $1.45$ times faster than Numba.
264+
* Pythran performs $12.17$ times faster for input size $64$, $7.13$ times better for input size $128$, and $3.51$ times faster than NumPy for input size $256$.
247265

266+
We have compared the performance of NumPy with two of the most popular languages, Python and C++, and with popular compiled methods like Numba and Pythran. NumPy achieves good performance for scientific computations as well as for quasi real-life problems, and it holds up well across the input sizes tested.
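
The ratios quoted in this section can be recomputed from the results table with a few lines of Python. This is a minimal sketch: the `times` dictionary and the helper `ratio_to_numpy` are illustrative names, and it assumes that the last three values in each table row belong to the $64$, $128$, and $256$ columns. Python is left out because its $256$ entry is empty in the table.

```python
# Normalized times copied from the results table (assumed columns: 64, 128, 256).
times = {
    "NumPy": {64: 15.59, 128: 17.90, 256: 18.27},
    "C++": {64: 5.725, 128: 11.44, 256: 19.43},
    "Numba": {64: 6.521, 128: 13.64, 256: 26.64},
    "Pythran": {64: 1.2811, 128: 2.5082, 256: 5.2042},
}


def ratio_to_numpy(name, size):
    """Return NumPy's time divided by `name`'s time for the given input size.

    A value above 1 means `name` is faster than NumPy; below 1 means NumPy is faster.
    """
    return times["NumPy"][size] / times[name][size]


for name in ("C++", "Numba", "Pythran"):
    row = ", ".join(
        f"{size}: {ratio_to_numpy(name, size):.2f}" for size in (64, 128, 256)
    )
    print(f"NumPy time / {name} time -> {row}")
```

Expressing every comparison as one ratio against NumPy keeps the direction of each claim unambiguous: the value crosses below 1 exactly where NumPy overtakes the other implementation.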
248267

249268
## References
250269
