NPBench - A Benchmarking Suite for High-Performance NumPy

spcl, updated 2022-11-14 21:43:38




To install NPBench, simply execute:

    python -m pip install -r requirements.txt
    python -m pip install .

You can then run a subset of the benchmarks with NumPy, Numba, and DaCe and plot the speedup of DaCe and Numba against NumPy:

    python -m pip install numba
    python -m pip install dace
    python
    python

Supported Frameworks

Currently, the following frameworks are supported (in alphabetical order):

- CuPy
- DaCe
- Numba
- NumPy
- Pythran

Support will also be added shortly for:

- Legate

Please note that the NPBench setup only installs NumPy. To run benchmarks with other frameworks, you have to install them separately. Below, we provide some tips about installing each of the above frameworks:


If you already have CUDA installed, then you can install CuPy with pip:

    python -m pip install cupy-cuda<version>

For example, if you have CUDA 11.1, then you should install CuPy with:

    python -m pip install cupy-cuda111

For more installation options, consult the CuPy installation guide.
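The version-to-package mapping described above can be sketched in Python. This is a minimal illustration that assumes the `cupy-cuda<version>` naming shown in the text (CUDA 11.1 gives `cupy-cuda111`). The helper name `cupy_package_for` is hypothetical, and other CuPy releases may name their packages differently, so check the CuPy installation guide before relying on the result.

```python
def cupy_package_for(cuda_version: str) -> str:
    """Return the pip package name for a CUDA version such as "11.1".

    Hypothetical helper: it only mirrors the cupy-cuda<version> pattern
    from the text and does not check that the package exists on PyPI.
    """
    parts = cuda_version.strip().split(".")
    if len(parts) < 2 or not all(p.isdigit() for p in parts[:2]):
        raise ValueError(f"expected a version like '11.1', got {cuda_version!r}")
    major, minor = parts[0], parts[1]
    return f"cupy-cuda{major}{minor}"


if __name__ == "__main__":
    # Prints the install command for the CUDA 11.1 example from the text.
    print("python -m pip install", cupy_package_for("11.1"))
```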


DaCe can be installed with pip:

    python -m pip install dace

However, you may want to install the latest version from the GitHub repository. To run NPBench with DaCe, you have to select either dace_cpu or dace_gpu as the framework (see details below).


Numba can be installed with pip:

    python -m pip install numba

If you use Anaconda on an Intel-based machine, then you can install an optimized version of Numba that uses Intel SVML:

    conda install -c numba icc_rt

For more installation options, please consult the Numba installation guide.


Pythran can be installed with pip and Anaconda. For detailed installation options, please consult the Pythran installation guide.

Running benchmarks

To run individual benchmarks, you can use the run_benchmark script:

    python -b <benchmark> -f <framework>

The available benchmarks are listed in the bench_info folder. The supported frameworks are listed in the framework_info folder. Please use the corresponding JSON filenames. For example, to run adi with NumPy, execute the following:

    python -b adi -f numpy

You can run all the available benchmarks with a specific framework using the run_framework script:

    python -f <framework>
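To see which names can be passed to -b and -f, one option is to list the JSON files in the two folders mentioned above. The following is a minimal sketch, not part of NPBench: the helper `list_json_names` is hypothetical, and it assumes that the folders sit in the current directory and that names are passed without the .json extension, as in the adi example.

```python
from pathlib import Path


def list_json_names(folder):
    """Return the sorted JSON file names in `folder`, without the .json suffix."""
    path = Path(folder)
    if not path.is_dir():
        return []  # missing folder: nothing to list
    return sorted(p.stem for p in path.glob("*.json"))


if __name__ == "__main__":
    # Assumption: the current directory is the root of an NPBench checkout.
    for folder in ("bench_info", "framework_info"):
        print(folder, list_json_names(folder))
```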


Each benchmark has four different presets: S, M, L, and paper. The S, M, and L presets have been selected so that NumPy finishes execution in about 10, 100, and 1000ms respectively on a machine with two 16-core Intel Xeon Gold 6130 processors. The exceptions are atax, bicg, mlp, mvt, and trisolv, which have been tuned for approximately 5, 20, and 100ms due to very high memory requirements. The paper preset uses the problem sizes from the NPBench paper. By default, the provided Python scripts execute the benchmarks using the S preset. You can select a different preset with the optional -p flag:

    python -b gemm -f numpy -p L
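The runtime targets above can be summarized as a small lookup table. This is an illustrative sketch only: the names `DEFAULT_TARGETS_MS`, `REDUCED_TARGETS_MS`, `REDUCED_BENCHMARKS`, and `target_runtime_ms` are hypothetical and do not come from NPBench, and the values are the approximate NumPy runtimes quoted in the text for the reference machine.

```python
# Approximate NumPy runtime targets in milliseconds, as stated in the text.
DEFAULT_TARGETS_MS = {"S": 10, "M": 100, "L": 1000}
# Benchmarks tuned for shorter runtimes due to very high memory requirements.
REDUCED_TARGETS_MS = {"S": 5, "M": 20, "L": 100}
REDUCED_BENCHMARKS = {"atax", "bicg", "mlp", "mvt", "trisolv"}


def target_runtime_ms(benchmark, preset="S"):
    """Return the approximate NumPy runtime target, or None for "paper"."""
    if preset == "paper":
        return None  # the text gives no runtime target for the paper preset
    if benchmark in REDUCED_BENCHMARKS:
        targets = REDUCED_TARGETS_MS
    else:
        targets = DEFAULT_TARGETS_MS
    if preset not in targets:
        raise ValueError(f"unknown preset: {preset!r}")
    return targets[preset]
```

For example, `target_runtime_ms("gemm", "L")` looks up the default table, while `target_runtime_ms("atax", "M")` uses the reduced one.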


After running some benchmarks with different frameworks, you can generate plots of the speedups and line-count differences (experimental) against NumPy:

    python
    python
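A speedup against NumPy is the NumPy runtime divided by the runtime of the other framework, so values above 1 mean the framework is faster. The sketch below shows that arithmetic only. It is not the plotting script's implementation: the function name `speedup_over_numpy` and the use of the median over repeated runs are assumptions, and the timings are made up for illustration.

```python
from statistics import median


def speedup_over_numpy(numpy_times, framework_times):
    """Return median NumPy time divided by median framework time."""
    if not numpy_times or not framework_times:
        raise ValueError("both timing lists must be non-empty")
    return median(numpy_times) / median(framework_times)


if __name__ == "__main__":
    # Made-up timings in seconds, for illustration only.
    numpy_runs = [0.10, 0.12, 0.11]
    numba_runs = [0.05, 0.06, 0.055]
    print(f"speedup: {speedup_over_numpy(numpy_runs, numba_runs):.2f}x")
```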


It is possible to use the NPBench infrastructure with your own benchmarks and frameworks. For more information on this functionality please read the documentation for benchmarks and frameworks.


Please cite NPBench as follows:

@inproceedings{npbench,
    author = {Ziogas, Alexandros Nikolaos and Ben-Nun, Tal and Schneider, Timo and Hoefler, Torsten},
    title = {NPBench: A Benchmarking Suite for High-Performance NumPy},
    year = {2021},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {},
    doi = {10.1145/3447818.3460360},
    booktitle = {Proceedings of the ACM International Conference on Supercomputing},
    series = {ICS '21}
}


NPBench is a collection of scientific Python/NumPy codes from various domains that we adapted from the following sources:

- Azimuthal Integration from pyFAI
- Navier-Stokes from CFD Python
- Cython tutorial for NumPy users
- Quantum Transport simulation from OMEN
- CRC-16-CCITT algorithm from oysstu
- Numba tutorial
- Mandelbrot codes From Python to Numpy
- N-Body simulation from nbody-python
- PolyBench/C
- Pythran benchmarks
- Stockham-FFT
- Weather stencils from gt4py


fpga benchmark support

opened on 2022-10-20 01:22:58 by jzhoulon

Currently, NPBench seems to support only CPU and GPU. Is there any support for FPGA? Thanks.

Add "--expected-fail" option to

opened on 2022-08-20 16:18:53 by Hardcode84

This option allows specifying a list of benchmarks that are known to fail, so that the entire run is still considered successful. It can also detect when previously failing tests unexpectedly pass.
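The behavior described here can be sketched as a small classification step. This is not the code from the proposal: the function `check_run` and its inputs are hypothetical, and it assumes that unexpected passes are reported without failing the run, which the description does not specify.

```python
def check_run(results, expected_fail):
    """Classify benchmark outcomes against a list of known failures.

    results: dict mapping benchmark name -> True (passed) or False (failed).
    expected_fail: iterable of benchmark names that are known to fail.
    Returns (ok, unexpected_failures, unexpected_passes).
    """
    expected = set(expected_fail)
    unexpected_failures = sorted(
        name for name, passed in results.items()
        if not passed and name not in expected
    )
    unexpected_passes = sorted(
        name for name, passed in results.items()
        if passed and name in expected
    )
    # Assumption: only failures outside the expected list fail the run;
    # unexpected passes are returned so the caller can report them.
    ok = not unexpected_failures
    return ok, unexpected_failures, unexpected_passes
```

With `results = {"adi": True, "spmv": False}` and `expected_fail = ["spmv"]`, the run counts as successful because the only failure was expected.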

Halide's python bindings as new framework

opened on 2022-06-24 13:10:48 by lukastruemper

Halide pip package is in progress:

AttributeError: 'numpy.random._generator.Generator' object has no attribute 'rand'

opened on 2022-05-29 21:29:40 by SuhasSrinivasan

This issue occurred when running python -f numpy -p L

OS: Ubuntu 20.04.4 LTS
Python: 3.7.13
numpy: 1.19.5

***** Testing NumPy with spmv on the L dataset *****
Process Process-46:
Traceback (most recent call last):
  File "/opt/miniconda3/envs/chrombpnet/lib/python3.7/multiprocessing/", line 297, in _bootstrap
  File "/opt/miniconda3/envs/chrombpnet/lib/python3.7/multiprocessing/", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "", line 18, in run_benchmark, validate, repeat, timeout)
  File "/root/npbench/npbench/infrastructure/", line 66, in run
    bdata = self.bench.get_data(preset)
  File "/root/npbench/npbench/infrastructure/", line 68, in get_data
    exec(init_str, data)
  File "<string>", line 1, in <module>
  File "/root/npbench/npbench/benchmarks/spmv/", line 19, in initialize
    random_state=rng)
  File "/opt/miniconda3/envs/chrombpnet/lib/python3.7/site-packages/scipy/sparse/", line 786, in random
    data_rvs = random_state.rand
AttributeError: 'numpy.random._generator.Generator' object has no attribute 'rand'
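The last frame of the traceback reads `random_state.rand`, an attribute that exists on NumPy's legacy `RandomState` but not on the newer `Generator`. The short check below, which assumes only that NumPy is installed, illustrates the difference. It is not a fix for NPBench or SciPy, and the variable names `rng` and `legacy` are chosen here for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)     # numpy.random.Generator
legacy = np.random.RandomState(42)  # legacy random interface

# Generator does not define rand; RandomState does.
print("Generator has rand:", hasattr(rng, "rand"))
print("RandomState has rand:", hasattr(legacy, "rand"))

# Both calls draw uniform samples in [0, 1); the two streams differ.
a = rng.random(3)
b = legacy.rand(3)
print(a.shape, b.shape)
```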

Adding blur, hist, iir_blur benchmarks for numpy & dace

opened on 2022-05-03 10:38:21 by lukastruemper

It would be fine to integrate with

opened on 2022-01-17 22:41:46 by ibobak

It would be good if you could integrate with or any other framework so that people would be able to upload the results to some public server and compare how different hardware runs the same benchmark.

python numpy benchmarking-suite benchmarking-framework