diff --git a/lectures/_static/lecture_specific/need_for_speed/matlab.png b/lectures/_static/lecture_specific/need_for_speed/matlab.png
new file mode 100644
index 00000000..0461ebd5
Binary files /dev/null and b/lectures/_static/lecture_specific/need_for_speed/matlab.png differ
diff --git a/lectures/_static/lecture_specific/need_for_speed/numpy.pdf b/lectures/_static/lecture_specific/need_for_speed/numpy.pdf
new file mode 100644
index 00000000..4d53cae2
Binary files /dev/null and b/lectures/_static/lecture_specific/need_for_speed/numpy.pdf differ
diff --git a/lectures/about_py.md b/lectures/about_py.md
index 8f5a4307..a4ae3370 100644
--- a/lectures/about_py.md
+++ b/lectures/about_py.md
@@ -133,6 +133,7 @@ The following chart, produced using Stack Overflow Trends, provides some evidenc
 
 It shows the popularity of a Python AI library called [PyTorch](https://pytorch.org/) relative to MATLAB.
 
+
 ```{figure} /_static/lecture_specific/about_py/pytorch_vs_matlab.png
 ```
 
diff --git a/lectures/need_for_speed.md b/lectures/need_for_speed.md
index 7ff7bed0..b61a56cd 100644
--- a/lectures/need_for_speed.md
+++ b/lectures/need_for_speed.md
@@ -29,18 +29,21 @@ premature optimization is the root of all evil." -- Donald Knuth
 
 Python is extremely popular for scientific computing, due to such factors as
 
-* the accessible and flexible nature of the language itself,
-* the huge range of high quality scientific libraries now available,
+* the accessible and expressive nature of the language itself,
+* its vast range of high quality scientific libraries,
 * the fact that the language and libraries are open source,
-* the popular Anaconda Python distribution, which simplifies installation and
-  management of those libraries, and
-* the recent surge of interest in using Python for machine learning and
-  artificial intelligence.
+* the popular [Anaconda Python distribution](https://www.anaconda.com/download), which simplifies installation and management of scientific libraries, and
+* the key role that Python plays in data science, machine learning and artificial intelligence.
 
-In this lecture we give a short overview of scientific computing in Python,
-addressing the following questions:
+In previous lectures, we looked at some scientific Python libraries such as NumPy and Matplotlib.
 
-* What are the relative strengths and weaknesses of Python for these tasks?
+However, our main focus was the core Python language, rather than the libraries.
+
+Now we turn to the scientific libraries and give them our full attention.
+
+We'll also discuss the following topics:
+
+* What are the relative strengths and weaknesses of Python for scientific work?
 * What are the main elements of the scientific Python ecosystem?
 * How is the situation changing over time?
 
@@ -53,21 +56,21 @@ tags: [hide-output]
 !pip install quantecon
 ```
 
+
+
 ## Scientific Libraries
 
-Let's briefly review Python's scientific libraries, starting with why we need
-them.
+Let's briefly review Python's scientific libraries, starting with why we need them.
 
 ### The Role of Scientific Libraries
 
-One obvious reason we use scientific libraries is because they implement
-routines we want to use.
+One reason we use scientific libraries is that they implement routines we want to use:
+
+* numerical integration, interpolation, linear algebra, root finding, etc.
 
-For example, it's almost always better to use an existing routine for root
-finding than to write a new one from scratch.
+For example, it's almost always better to use an existing routine for root finding than to write a new one from scratch.
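+
+As a quick illustration, here's a minimal sketch that calls SciPy's `brentq` root-finder (the function and bracketing interval are chosen purely for illustration):
+
+```{code-cell} python3
+from scipy.optimize import brentq
+
+# Find the root of f(x) = x**3 - 2x - 5 on [2, 3], where f changes sign
+f = lambda x: x**3 - 2*x - 5
+brentq(f, 2, 3)    # Brent's method: a robust, expert-written implementation
+```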
 
-(For standard algorithms, efficiency is maximized if the community can coordinate on a
-common set of implementations, written by experts and tuned by users to be as fast and robust as possible.)
+(For standard algorithms, efficiency is maximized if the community can coordinate on a common set of implementations, written by experts and tuned by users to be as fast and robust as possible.)
 
 But this is not the only reason that we use Python's scientific libraries.
 
@@ -75,40 +78,45 @@ Another is that pure Python, while flexible and elegant, is not fast.
 
 So we need libraries that are designed to accelerate execution of Python code.
 
-As we'll see below, there are now Python libraries that can do this extremely well.
+They do this using two strategies:
 
-### Python's Scientific Ecosystem
+1. using compilers that convert Python-like statements into fast machine code for individual threads of logic and
+2. parallelizing tasks across multiple "workers" (e.g., CPUs, individual threads inside GPUs).
 
-In terms of popularity, the big four in the world of scientific Python
-libraries are
+There are several Python libraries that can do this extremely well.
 
-* NumPy
-* SciPy
-* Matplotlib
-* Pandas
 
-For us, there's another (relatively new) library that will also be essential for
-numerical computing:
+### Python's Scientific Ecosystem
 
-* Numba
+At QuantEcon, the scientific libraries we use most often are
 
-Over the next few lectures we'll see how to use these libraries.
+* [NumPy](https://numpy.org/)
+* [SciPy](https://scipy.org/)
+* [Matplotlib](https://matplotlib.org/)
+* [Pandas](https://pandas.pydata.org/)
+* [Numba](https://numba.pydata.org/) and
+* [JAX](https://github.com/jax-ml/jax)
 
-But first, let's quickly review how they fit together.
+Here's how they fit together:
 
 * NumPy forms the foundations by providing a basic array data type (think of
   vectors and matrices) and functions for acting on these arrays (e.g., matrix
   multiplication).
-* SciPy builds on NumPy by adding the kinds of numerical methods that are
-  routinely used in science (interpolation, optimization, root finding, etc.).
+* SciPy builds on NumPy by adding numerical methods routinely used in science (interpolation, optimization, root finding, etc.).
 * Matplotlib is used to generate figures, with a focus on plotting data stored in NumPy arrays.
-* Pandas provides types and functions for empirical work (e.g., manipulating data).
-* Numba accelerates execution via JIT compilation --- we'll learn about this
-  soon.
+* Pandas provides types and functions for manipulating data.
+* Numba provides a just-in-time compiler that integrates well with NumPy and
+  helps accelerate Python code.
+* JAX includes array processing operations similar to NumPy, automatic
+  differentiation, a parallelization-centric just-in-time compiler, and
+  automated integration with hardware accelerators such as GPUs.
+
+
+
 ## The Need for Speed
 
-Now let's discuss execution speed.
+Let's discuss execution speed and how scientific libraries can help us accelerate code.
 
 Higher-level languages like Python are optimized for humans.
 
@@ -117,35 +125,38 @@ This means that the programmer can leave many details to the runtime environment
 
 * specifying variable types
 * memory allocation/deallocation, etc.
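+
+For example, the same name can be rebound to objects of different types, with memory managed automatically behind the scenes --- a minimal illustration:
+
+```{code-cell} python3
+x = 42          # x is bound to an integer --- no type declaration needed
+x = "foo"       # now x is bound to a string; the old object is reclaimed
+x = [1, 2.0]    # containers can hold mixed types
+type(x)
+```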
-The upside is that, compared to low-level languages, Python is typically faster to write, less error-prone and easier to debug.
+On one hand, compared to low-level languages, high-level languages are typically faster to write, less error-prone and easier to debug.
 
-The downside is that Python is harder to optimize --- that is, turn into fast machine code --- than languages like C or Fortran.
+On the other hand, high-level languages are harder to optimize --- that is, to turn into fast machine code --- than languages like C or Fortran.
 
 Indeed, the standard implementation of Python (called CPython) cannot match the speed of compiled languages such as C or Fortran.
 
 Does that mean that we should just switch to C or Fortran for everything?
 
-The answer is: No, no and one hundred times no!
+The answer is: No, no, and one hundred times no!
 
-(This is what you should say to the senior professor insisting that the model
-needs to be rewritten in Fortran or C++.)
+(This is what you should say to your professor when they insist that your model needs to be rewritten in Fortran or C++.)
 
 There are two reasons why:
 
-First, for any given program, relatively few lines are ever going to
-be time-critical.
+First, for any given program, relatively few lines are ever going to be time-critical.
 
 Hence it is far more efficient to write most of our code in a high productivity language like Python.
 
 Second, even for those lines of code that *are* time-critical, we can now achieve the same speed as C or Fortran using Python's scientific libraries.
 
+In fact, we can often do better, because some scientific libraries are so
+effective at accelerating and parallelizing our code.
+
+
 ### Where are the Bottlenecks?
 
-Before we learn how to do this, let's try to understand why plain vanilla
-Python is slower than C or Fortran.
+Before we learn how to do this, let's try to understand why plain vanilla Python is slower than C or Fortran.
 
 This will, in turn, help us figure out how to speed things up.
 
+In reading the following, remember that the Python interpreter executes code line by line.
+
 #### Dynamic Typing
 
 ```{index} single: Dynamic Typing
 ```
@@ -180,10 +191,11 @@ a + b
 
 (We say that the operator `+` is *overloaded* --- its action depends on the type of the objects on which it acts)
 
-As a result, Python must check the type of the objects and then call the correct operation.
+As a result, when executing `a + b`, Python must first check the type of the objects and then call the correct operation.
 
 This involves substantial overheads.
 
+
 #### Static Types
 
 ```{index} single: Static Types
 ```
@@ -255,6 +267,9 @@ In fact, it's generally true that memory traffic is a major culprit when it come
 
 Let's look at some ways around these problems.
 
+
+
+
 ## {index}`Vectorization `
 
 ```{index} single: Python; Vectorization
 ```
@@ -272,173 +287,12 @@ For example, when working in a high level language, the operation of inverting a
 
 This clever idea dates back to MATLAB, which uses vectorization extensively.
 
-Vectorization can greatly accelerate many numerical computations (but not all,
-as we shall see).
-
-Let's see how vectorization works in Python, using NumPy.
-
-### Operations on Arrays
-
-```{index} single: Vectorization; Operations on Arrays
-```
-
-First, let's run some imports
-
-```{code-cell} python3
-import random
-import numpy as np
-import quantecon as qe
-```
-
-Next let's try some non-vectorized code, which uses a native Python loop to generate,
-square and then sum a large number of random variables:
-
-```{code-cell} python3
-n = 1_000_000
-```
-
-```{code-cell} python3
-%%time
-
-y = 0      # Will accumulate and store sum
-for i in range(n):
-    x = random.uniform(0, 1)
-    y += x**2
-```
-
-The following vectorized code achieves the same thing.
-
-```{code-cell} ipython
-%%time
-
-x = np.random.uniform(0, 1, n)
-y = np.sum(x**2)
-```
-
-As you can see, the second code block runs much faster. Why?
-
-The second code block breaks the loop down into three basic operations
-
-1. draw `n` uniforms
-1. square them
-1. sum them
-
-These are sent as batch operators to optimized machine code.
-
-Apart from minor overheads associated with sending data back and forth, the result is C or Fortran-like speed.
-
-When we run batch operations on arrays like this, we say that the code is *vectorized*.
-
-Vectorized code is typically fast and efficient.
-
-It is also surprisingly flexible, in the sense that many operations can be vectorized.
-
-The next section illustrates this point.
-
-(ufuncs)=
-### Universal Functions
-
-```{index} single: NumPy; Universal Functions
-```
-
-Many functions provided by NumPy are so-called *universal functions* --- also called [ufuncs](https://docs.scipy.org/doc/numpy/reference/ufuncs.html).
-
-This means that they
-
-* map scalars into scalars, as expected
-* map arrays into arrays, acting element-wise
-
-For example, `np.cos` is a ufunc:
-
-```{code-cell} python3
-np.cos(1.0)
-```
-
-```{code-cell} python3
-np.cos(np.linspace(0, 1, 3))
+```{figure} /_static/lecture_specific/need_for_speed/matlab.png
 ```
 
-By exploiting ufuncs, many operations can be vectorized.
-
-For example, consider the problem of maximizing a function $f$ of two
-variables $(x,y)$ over the square $[-a, a] \times [-a, a]$.
-
-For $f$ and $a$ let's choose
-
-$$
-f(x,y) = \frac{\cos(x^2 + y^2)}{1 + x^2 + y^2}
-\quad \text{and} \quad
-a = 3
-$$
-
-Here's a plot of $f$
-
-```{code-cell} ipython
-import matplotlib.pyplot as plt
-from mpl_toolkits.mplot3d.axes3d import Axes3D
-from matplotlib import cm
-
-def f(x, y):
-    return np.cos(x**2 + y**2) / (1 + x**2 + y**2)
-
-xgrid = np.linspace(-3, 3, 50)
-ygrid = xgrid
-x, y = np.meshgrid(xgrid, ygrid)
-
-fig = plt.figure(figsize=(10, 8))
-ax = fig.add_subplot(111, projection='3d')
-ax.plot_surface(x,
-                y,
-                f(x, y),
-                rstride=2, cstride=2,
-                cmap=cm.jet,
-                alpha=0.7,
-                linewidth=0.25)
-ax.set_zlim(-0.5, 1.0)
-ax.set_xlabel('$x$', fontsize=14)
-ax.set_ylabel('$y$', fontsize=14)
-plt.show()
-```
-
-To maximize it, we're going to use a naive grid search:
-
-1. Evaluate $f$ for all $(x,y)$ in a grid on the square.
-1. Return the maximum of observed values.
-
-The grid will be
-
-```{code-cell} python3
-grid = np.linspace(-3, 3, 1000)
-```
-
-Here's a non-vectorized version that uses Python loops.
-
-```{code-cell} python3
-%%time
-
-m = -np.inf
-
-for x in grid:
-    for y in grid:
-        z = f(x, y)
-        if z > m:
-            m = z
-```
-
-And here's a vectorized version
-
-```{code-cell} python3
-%%time
-
-x, y = np.meshgrid(grid, grid)
-np.max(f(x, y))
-```
-
-In the vectorized version, all the looping takes place in compiled code.
-
-As you can see, the second version is **much** faster.
-
-(We'll make it even faster again later on, using more scientific programming tricks.)
+Vectorization can greatly accelerate many numerical computations, as we will see
+in later lectures.
 
 (numba-p_c_vectorization)=
 ## Beyond Vectorization
 
@@ -462,11 +316,11 @@ In these kinds of settings, we need to go back to loops.
 
 Fortunately, there are alternative ways to speed up Python loops that work in almost any setting.
 
-For example, in the last few years, a new Python library called [Numba](http://numba.pydata.org/) has appeared that solves the main problems
-with vectorization listed above.
+For example, [Numba](http://numba.pydata.org/) solves the main problems with
+vectorization listed above.
 
 It does so through something called **just in time (JIT) compilation**, which can generate extremely fast and efficient code.
 
-We'll learn how to use Numba {doc}`soon `.
+{doc}`Later ` we'll learn how to use Numba to accelerate Python code.
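+
+As a preview, here's a minimal sketch of the Numba workflow (the function is chosen purely for illustration, and the `numba` package must be installed):
+
+```{code-cell} python3
+from numba import njit
+
+@njit                       # ask Numba to JIT-compile this function
+def sum_of_squares(n):
+    total = 0.0
+    for i in range(n):      # explicit loops compile down to fast machine code
+        total += i**2
+    return total
+
+sum_of_squares(1_000_000)   # first call compiles; subsequent calls run at C-like speed
+```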
diff --git a/lectures/numpy.md b/lectures/numpy.md
index 05dea077..173433cb 100644
--- a/lectures/numpy.md
+++ b/lectures/numpy.md
@@ -36,14 +36,24 @@ kernelspec:
 
 We have already seen some code involving NumPy in the preceding lectures.
 
-In this lecture, we will start a more systematic discussion of both
+In this lecture, we will start a more systematic discussion of
 
-* NumPy arrays and
-* the fundamental array processing operations provided by NumPy.
+1. NumPy arrays and
+1. the fundamental array processing operations provided by NumPy.
 
-### References
 
-* [The official NumPy documentation](http://docs.scipy.org/doc/numpy/reference/).
+(For an alternative reference, see [the official NumPy documentation](http://docs.scipy.org/doc/numpy/reference/).)
+
+We will use the following imports.
+
+```{code-cell} python3
+import numpy as np
+import random
+import quantecon as qe
+import matplotlib.pyplot as plt
+from mpl_toolkits.mplot3d.axes3d import Axes3D
+from matplotlib import cm
+```
 
 (numpy_array)=
 ## NumPy Arrays
 
@@ -53,15 +63,10 @@
 
 The essential problem that NumPy solves is fast array processing.
 
-The most important structure that NumPy defines is an array data type formally called a [numpy.ndarray](http://docs.scipy.org/doc/numpy/reference/arrays.ndarray.html).
-
-NumPy arrays power a large proportion of the scientific Python ecosystem.
+The most important structure that NumPy defines is an array data type, formally
+called a [numpy.ndarray](http://docs.scipy.org/doc/numpy/reference/arrays.ndarray.html).
 
-Let's first import the library.
-
-```{code-cell} python3
-import numpy as np
-```
+NumPy arrays power a very large proportion of the scientific Python ecosystem.
 
 To create a NumPy array containing only zeros we use [np.zeros](http://docs.scipy.org/doc/numpy/reference/generated/numpy.zeros.html#numpy.zeros)
 
@@ -523,9 +528,6 @@ tags: [hide-input]
 # Adapted and modified based on the code in the book written by Jake VanderPlas (see https://jakevdp.github.io/PythonDataScienceHandbook/06.00-figure-code.html#Broadcasting)
 # Originally from astroML: see http://www.astroml.org/book_figures/appendix/fig_broadcast_visual.html
 
-import numpy as np
-from matplotlib import pyplot as plt
-
 def draw_cube(ax, xy, size, depth=0.4,
               edges=None, label=None, label_kwargs=None, **kwargs):
 
@@ -1159,14 +1161,150 @@ We'll cover the SciPy versions in more detail {doc}`soon `.
 
 For a comprehensive list of what's available in NumPy see [this documentation](https://docs.scipy.org/doc/numpy/reference/routines.html).
 
-## Exercises
+
+## Speed Comparisons
+
+```{index} single: Vectorization; Operations on Arrays
+```
+
+We mentioned in a {doc}`previous lecture ` that NumPy-based vectorization can
+accelerate scientific applications.
+
+In this section we try some speed comparisons to illustrate this fact.
+
+### Vectorization vs Loops
+
+Let's begin with some non-vectorized code, which uses a native Python loop to generate,
+square and then sum a large number of random variables:
+
+```{code-cell} python3
+n = 1_000_000
+```
+
+```{code-cell} python3
+%%time
+
+y = 0      # Will accumulate and store sum
+for i in range(n):
+    x = random.uniform(0, 1)
+    y += x**2
+```
+
+The following vectorized code achieves the same thing.
 
 ```{code-cell} ipython
-%matplotlib inline
-import matplotlib.pyplot as plt
-plt.rcParams['figure.figsize'] = (10,6)
+%%time
+
+x = np.random.uniform(0, 1, n)
+y = np.sum(x**2)
+```
+
+As you can see, the second code block runs much faster. Why?
+
+The second code block breaks the loop down into three basic operations
+
+1. draw `n` uniforms
+1. square them
+1. sum them
+
+These are sent as batch operations to optimized machine code.
+
+Apart from minor overheads associated with sending data back and forth, the result is C or Fortran-like speed.
+
+When we run batch operations on arrays like this, we say that the code is *vectorized*.
+
+Vectorized code is typically fast and also surprisingly flexible, in the sense that many operations can be vectorized.
+
+The next section illustrates this point.
+
+(ufuncs)=
+### Universal Functions
+
+```{index} single: NumPy; Universal Functions
+```
+
+As discussed above, many functions provided by NumPy are universal functions (ufuncs).
+
+By exploiting ufuncs, many operations can be vectorized, leading to faster
+execution.
+
+For example, consider the problem of maximizing a function $f$ of two
+variables $(x,y)$ over the square $[-a, a] \times [-a, a]$.
+
+For $f$ and $a$ let's choose
+
+$$
+f(x,y) = \frac{\cos(x^2 + y^2)}{1 + x^2 + y^2}
+\quad \text{and} \quad
+a = 3
+$$
+
+Here's a plot of $f$
+
+```{code-cell} ipython
+
+def f(x, y):
+    return np.cos(x**2 + y**2) / (1 + x**2 + y**2)
+
+xgrid = np.linspace(-3, 3, 50)
+ygrid = xgrid
+x, y = np.meshgrid(xgrid, ygrid)
+
+fig = plt.figure(figsize=(10, 8))
+ax = fig.add_subplot(111, projection='3d')
+ax.plot_surface(x,
+                y,
+                f(x, y),
+                rstride=2, cstride=2,
+                cmap=cm.jet,
+                alpha=0.7,
+                linewidth=0.25)
+ax.set_zlim(-0.5, 1.0)
+ax.set_xlabel('$x$', fontsize=14)
+ax.set_ylabel('$y$', fontsize=14)
+plt.show()
+```
+
+To maximize it, we're going to use a naive grid search:
+
+1. Evaluate $f$ for all $(x,y)$ in a grid on the square.
+1. Return the maximum of observed values.
+
+The grid will be
+
+```{code-cell} python3
+grid = np.linspace(-3, 3, 1000)
+```
+
+Here's a non-vectorized version that uses Python loops.
+
+```{code-cell} python3
+%%time
+
+m = -np.inf
+
+for x in grid:
+    for y in grid:
+        z = f(x, y)
+        if z > m:
+            m = z
+```
+
+And here's a vectorized version
+
+```{code-cell} python3
+%%time
+
+x, y = np.meshgrid(grid, grid)
+np.max(f(x, y))
+```
+
+In the vectorized version, all the looping takes place in compiled code.
+
+As you can see, the second version is *much* faster.
+
+
+## Exercises
+
+
 ```{exercise-start}
 :label: np_ex1
 ```
@@ -1359,8 +1497,7 @@ Your task is to
 
 An example solution is given below.
-In essence, we've just taken [this
-code](https://github.com/QuantEcon/QuantEcon.py/blob/master/quantecon/ecdf.py)
+In essence, we've just taken [this code](https://github.com/QuantEcon/QuantEcon.py/blob/master/quantecon/ecdf.py)
 from QuantEcon and added in a plot method
 
 ```{code-cell} python3
@@ -1486,7 +1623,6 @@ Let's make sure this library is installed.
 
 Now we can import the quantecon package.
 
 ```{code-cell} python3
-import quantecon as qe
 np.random.seed(123)
 
 x = np.random.randn(1000, 100, 100)
@@ -1578,4 +1714,4 @@ print(np.array_equal(B, D))
 ```
 
 ```{solution-end}
-```
\ No newline at end of file
+```