WarpBench
Parallel Linpack Benchmark

 

1. Introduction

WarpBench is a benchmark for evaluating floating-point unit (FPU) performance, based on the original Linpack test included in Netlib. It is available for Windows and Linux and detects the number of installed CPUs so that the parallel code can measure the total computational power. If CPU power management is enabled (e.g. AMD's Cool'n'Quiet), the software disables CPU throttling to avoid incorrect results caused by the variable clock.

 

1.1 About the Linpack benchmark

The Linpack benchmark is a measure of a computer’s floating-point rate of execution. It is determined by running a computer program that solves a dense system of linear equations. Over the years the characteristics of the benchmark have changed a bit. In fact, there are three benchmarks included in the Linpack benchmark report. The computational power is expressed in Mflop/s, a rate of execution in millions of floating-point operations per second. Whenever this term is used it refers to 64-bit floating-point operations, and the operations are either additions or multiplications. Gflop/s refers to billions of floating-point operations per second and Tflop/s refers to trillions of floating-point operations per second.
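As a rough illustration of how such a rate is obtained, the classic Linpack convention credits the solution of a dense system of order n with 2n³/3 + 2n² floating-point operations and divides that count by the measured time. The sketch below applies this convention; the function names are hypothetical, and it is an assumption that WarpBench uses exactly the same operation count.

```python
def linpack_flops(n):
    """Nominal operation count for solving a dense n x n linear system
    (classic Linpack convention: 2n^3/3 + 2n^2)."""
    return 2.0 * n ** 3 / 3.0 + 2.0 * n ** 2


def mflops(n, seconds):
    """Rate of execution in millions of floating-point operations per second."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return linpack_flops(n) / seconds / 1.0e6


if __name__ == "__main__":
    # Order 100 solved in 0.00071 s (time as printed, i.e. already rounded).
    print(f"{mflops(100, 0.00071):.1f} Mflop/s")
```

For a matrix of order 100 solved in 0.00071 seconds this gives about 967 Mflop/s. A report that prints the time rounded to five decimals can therefore show a slightly different rate, because the rate is computed from the unrounded time.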

 

2. Installation & usage

No installation is required: just run the warpbench command in the command shell. WarpBench can also be installed with VEGA ZZ as an option, in which case it can be started by selecting VEGA ZZ -> WarpProject -> WarpBench.

Warning:
If you are running the Linux version, the file permissions may not be set correctly. To change them, type at the command prompt:

        chmod 755 warpbench

 

2.1 Command line options

WarpBench can also be run from the command shell. Here is the help shown when you invoke the command with the -? option:

WarpBench 1.1.0 - Parallel Linpack Benchmark
Copyright 2006-2023, Alessandro Pedretti
Unrolled single precision Win32 version

Usage: WarpBench -c CPU_NUM -q

  c -> Number of threads (default all).
  q -> Quiet mode (show only the global performances).

You can manually choose the number of CPUs and enable quiet mode.
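When the benchmark is driven from a script, the two options can be assembled programmatically. The following is a minimal sketch, assuming the executable is reachable as warpbench on the PATH; the helper names build_command and run_warpbench are hypothetical, and only the -c and -q options listed in the help text are used.

```python
import subprocess


def build_command(threads=None, quiet=False, exe="warpbench"):
    """Build the argument list for a WarpBench run.

    threads=None omits -c, so the program falls back to its default
    (all detected CPUs, according to the help text).
    """
    cmd = [exe]
    if threads is not None:
        if threads < 1:
            raise ValueError("threads must be >= 1")
        cmd += ["-c", str(threads)]
    if quiet:
        cmd.append("-q")
    return cmd


def run_warpbench(threads=None, quiet=False, exe="warpbench"):
    """Run the benchmark and return whatever it prints to standard output."""
    result = subprocess.run(
        build_command(threads, quiet, exe),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

For example, build_command(2, quiet=True) yields the argument list for "warpbench -c 2 -q". The exact layout of the quiet-mode output is not documented here, so run_warpbench returns the raw text without interpreting it.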

 

3. Benchmark results

This is a report generated by the WarpBench Linpack benchmark:

WarpBench 1.1.0 - Parallel Linpack Benchmark
Copyright 2006-2023, Alessandro Pedretti
Unrolled single precision Win32 version
Performing the benchmark for 2 CPU(s) ...
Average values for one CPU:
norm resid      resid           machep         x[0]-1          x[n-1]-1
   1.9   4.52336171e-005  1.19209290e-007 -1.31130219e-005 -1.30534172e-00
Times are reported for matrices of order          100
1 pass times for array with leading dimension of  201
      dgefa      dgesl      total     Mflops       unit      ratio
    0.00068    0.00003    0.00071     972.52     0.0021     0.0126
Overhead for 1 matgen      0.00014 seconds
Matgen/dgefa passes used 1239 for 1 seconds
Times for array with leading dimension of 201
      dgefa      dgesl      total     Mflops       unit      ratio
    0.00067    0.00011    0.00077     888.80     0.0023     0.0138
    0.00067    0.00011    0.00078     882.40     0.0023     0.0139
    0.00067    0.00011    0.00078     880.13     0.0023     0.0139
    0.00068    0.00011    0.00078     876.85     0.0023     0.0140
    0.00072    0.00011    0.00082     834.61     0.0024     0.0147
Average                               872.56
Calculating matgen2 overhead
Overhead for 1 matgen      0.00014 seconds
Times for array with leading dimension of 200
      dgefa      dgesl      total     Mflops       unit      ratio
    0.00071    0.00011    0.00082     839.49     0.0024     0.0146
    0.00071    0.00011    0.00082     841.87     0.0024     0.0146
    0.00072    0.00011    0.00082     834.09     0.0024     0.0147
    0.00071    0.00011    0.00082     841.08     0.0024     0.0146
    0.00071    0.00011    0.00082     838.77     0.0024     0.0146
Average                               839.06
Total computational power 1678.12 Mflops
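The summary figures in such a report can be extracted with a few regular expressions. The sketch below is an illustration only: the function name parse_report is hypothetical, and the patterns are derived from the sample report above rather than from a documented output format. In this sample, the total (1678.12) equals the last per-CPU average (839.06) multiplied by the number of CPUs (2); whether WarpBench always aggregates the result this way is an assumption.

```python
import re

# Patterns inferred from the sample report; not a documented format.
CPUS_RE = re.compile(r"^Performing the benchmark for (\d+) CPU\(s\)", re.MULTILINE)
AVERAGE_RE = re.compile(r"^Average[ \t]+([0-9.]+)[ \t]*$", re.MULTILINE)
TOTAL_RE = re.compile(r"^Total computational power[ \t]+([0-9.]+)[ \t]+Mflops", re.MULTILINE)


def parse_report(text):
    """Return the CPU count, the per-CPU averages and the total from a report."""
    cpus = CPUS_RE.search(text)
    total = TOTAL_RE.search(text)
    return {
        "cpus": int(cpus.group(1)) if cpus else None,
        "averages": [float(v) for v in AVERAGE_RE.findall(text)],
        "total": float(total.group(1)) if total else None,
    }


# Summary lines copied from the sample report above.
SAMPLE = """\
Performing the benchmark for 2 CPU(s) ...
Average values for one CPU:
Average                               872.56
Average                               839.06
Total computational power 1678.12 Mflops
"""

report = parse_report(SAMPLE)
# Observation on this sample: total == cpus * last average.
consistent = abs(report["total"] - report["cpus"] * report["averages"][-1]) < 0.01
```

The line "Average values for one CPU:" is deliberately not matched, because the pattern requires a number directly after the word "Average".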

The following table reports some benchmark results:

CPU type | CPUs | Cores x CPU | Threads | OS | Mflop/s
Intel Xeon Gold 6238R | 2 | 56 | 112 | Windows 10 x64 Professional | 110946
AMD Ryzen 9 3900XT | 1 | 12 | 24 | Windows 10 x64 Professional | 78507
AMD Ryzen 9 3900X | 1 | 12 | 24 | Windows 10 x64 Professional | 78328
Intel Xeon Gold 5120 | 2 | 14 | 56 | Windows 10 x64 Enterprise | 55397
AMD EPYC 7281 | 1 | 16 | 32 | Windows Server 2019 Standard | 49686
Intel Xeon E5-2650 v3 | 2 | 10 | 40 | Windows 10 x64 Professional | 39484
Intel Xeon E5-2630 v3 | 2 | 8 | 32 | Windows 10 x64 Professional | 31249
Intel Xeon E5-2640 v2 | 2 | 8 | 32 | Windows 7 x64 | 27211
AMD Opteron 875 | 8 | 2 | 16 | CentOS 4.3 64 bit | 19254
AMD Ryzen 2400G | 1 | 4 | 8 | Windows 10 x64 Professional | 16298
AMD Opteron 875 | 8 | 2 | 16 | Windows Server 2008 R2 | 13632
Intel Xeon E5-2620 v2 | 1 | 6 | 12 | CentOS 6.4 64 bit | 10586
Intel Xeon E5-1620 v3 | 1 | 4 | 8 | Windows 10 x64 Professional | 9990
Intel i5-8250U | 1 | 4 | 8 | Windows 10 x64 Professional | 9013
AMD Phenom II X6 1090T | 1 | 6 | 6 | Windows 7 x64 | 7052
Intel i5-6400 | 1 | 4 | 4 | Windows 10 x64 | 5227
AMD Phenom II X4 955 | 1 | 4 | 4 | Windows 7 x64 | 4711
AMD Athlon II X4 640 | 1 | 4 | 4 | Windows Server 2003 Enterprise | 4606
AMD A8 3870K | 1 | 4 | 4 | Windows 7 x64 | 4528
AMD Athlon MP 2200+ | 2 | 1 | 2 | Windows 2000 | 1678
Intel Core 2 Duo T5600 | 1 | 2 | 2 | Windows 10 x64 | 1141
ARM Cortex A7 1.2 GHz (AllWinner H3, Orange Pi One) | 1 | 4 | 4 | ARMBIAN 3.4.113 RetroOrangePi | 874
AMD Athlon 64 3200+ Venice | 1 | 1 | 1 | Windows XP | 861
AMD Athlon 64 3200+ | 1 | 1 | 1 | Windows XP | 849
AMD Sempron 64 2800+ | 1 | 1 | 1 | Windows XP | 840
AMD Athlon 64 3000+ | 1 | 1 | 1 | Windows XP | 821
ARM Cortex A9 r3p0 1.6 GHz (RockChip RK3188) | 1 | 4 | 4 | Ubuntu 12.04.5 LTS | 693
Intel Pentium III 550 MHz | 1 | 1 | 1 | Windows 2000 | 133
ARM Cortex A9 r3p0 1.6 GHz | 1 | 4 | 4 | Android 4.1.1 Jelly Bean | 97
ARM Cortex A9 r2p10 1.0 GHz | 1 | 1 | 1 | Android 4.1.1 Jelly Bean | 23

The compiler can change the results significantly. Using a dual Athlon MP test PC running Windows 2000 and several C compilers, these results were obtained:

Compiler | Company / Author | Free | Version | Build options | Mflop/s
gcc cygwin | RedHat | Y | 3.3.3 | -O3 -march=pentium -malign-double -fomit-frame-pointer -ffast-math -funroll-loops -D WIN32 | 1678
gcc mingw32 | GNU | Y | 3.2.3 | -O3 -march=pentium -malign-double -fomit-frame-pointer -ffast-math -funroll-loops | 1678
gcc mingw32 | GNU | Y | 3.4.5 | -O3 -march=pentium -malign-double -fomit-frame-pointer -ffast-math -funroll-loops | 1643
pgcc | Portland Group | N | 6.0 | -O3 -tp p5 -Munroll=c:5 -Mnoframe -Mlre -Mnozerotrip -D __TINYC__ | 1632
bcc32 | Borland | N | 5.6.4 | -O2 -Hc -Vx -Ve -ff -X- -a8 -5 -b- -k- -vi -tWC -tWM | 1464
lcc-win32 | Jacob Navia | Y | 3.3 | -O -D__TINYC__ | 1024
tcc | Fabrice Bellard | Y | 0.9.23 | -lkernel32 | 455

 

4. History

  

5. Copyright and disclaimers

All trademarks and software directly or indirectly referred to in this document are copyrighted by their legal owners. WarpBench is a freeware program and can be distributed through the Internet, BBS, CD-ROM and other electronic formats. The Authors of this program accept no responsibility for hardware/software damages resulting from the use of this package. No warranty is made about the software or its performance.

Use and copying of this software and the preparation of derivative works based on this software are permitted, so long as the following conditions are met:

   

WarpBench
is an enhanced version of the original Linpack benchmark
Copyright 2006-2023, Alessandro Pedretti & Giulio Vistoli
All rights reserved.

Alessandro Pedretti
Dipartimento di Scienze Farmaceutiche
Università degli Studi di Milano
Via Mangiagalli, 25
I-20133 Milano - Italy
Tel. +39 02 503 19332
Fax. +39 02 503 19359
E-Mail: info@vegazz.net
WWW: http://www.vegazz.net