OpenBLAS version 0.3.0 (2018-05-23)

common:

* fixed some more thread race and locking bugs
* added preliminary support for calling an OpenMP build of the library from multiple threads
* removed performance impact of thread locks added in 0.2.20 on OpenMP code
* general code cleanup 
* optimized DSDOT implementation
* improved thread distribution for GEMM
* corrected IMATCOPY/OMATCOPY implementation
* fixed out-of-bounds accesses in the multithreaded xBMV/xPMV and SYMV implementations
* cmake build improvements
* pkgconfig file now contains build options
* openblas_get_config() now reports USE_OPENMP and NUM_THREADS settings used for the build
* corrections and improvements for systems with more than 64 CPUs
* LAPACK code updated to 3.8.0 including later fixes
* added ReLAPACK, a recursive implementation of several LAPACK functions
* rewrote ROTMG to handle cases that the netlib code failed to address
* disabled (broken) multithreading code for xTRMV
* corrected prototypes of complex CBLAS functions to make our cblas.h match the generally accepted standard
* shared memory access failures on startup are now handled more gracefully
* restored utests from earlier releases (and made them pass on all affected systems)
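
For readers unfamiliar with DSDOT from the list above: it takes two single-precision vectors and returns their dot product accumulated in double precision. The following is a minimal reference sketch of that semantics only, not the optimized OpenBLAS implementation. The name `ref_dsdot` is hypothetical; the parameter order follows the usual BLAS convention (length, vector, stride, vector, stride).

```c
#include <stddef.h>

/* Hypothetical reference routine, not OpenBLAS code: dot product of two
   float vectors, accumulated in double precision as DSDOT specifies. */
double ref_dsdot(int n, const float *x, int incx, const float *y, int incy)
{
    double acc = 0.0;

    if (n <= 0)
        return acc;

    /* Reference BLAS convention: with a negative stride the vector is
       walked backwards, starting from its last logical element. */
    long ix = incx < 0 ? (long)(1 - n) * incx : 0;
    long iy = incy < 0 ? (long)(1 - n) * incy : 0;

    for (int i = 0; i < n; i++) {
        /* Widen each operand before multiplying so that neither the
           product nor the running sum is rounded to float. */
        acc += (double)x[ix] * (double)y[iy];
        ix += incx;
        iy += incy;
    }
    return acc;
}
```

With x = (1, 2, 3) and y = (4, 5, 6) and unit strides, the function returns 32.0. A plain float dot product can lose low-order bits on long vectors, which is the case the double accumulator is meant to help with.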

SPARC:

* several fixes for CPU autodetection

POWER:

* corrected vector register overwriting in several Power8 kernels
* optimized additional BLAS functions

ARM:

* added support for Cortex-A53 and Cortex-A72
* added autodetection for ThunderX2T99
* made most optimized kernels the default for generic ARMv8 targets

x86_64:

* parallelized DDOT kernel for Haswell
* changed alignment directives in assembly kernels to boost performance on OSX
* fixed register handling in the GEMV microkernels (bug exposed by gcc7)
* added support for building on OpenBSD and DragonFly BSD
* updated compiler options to work with the 2018 release of the Intel compilers
* added support for fully optimized builds with clang/flang on Microsoft Windows
* fixed building on AIX
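
The parallelized DDOT item above relies on the fact that a dot product can be split into independent partial sums. The sketch below illustrates only that decomposition, assuming an even split into contiguous chunks. It runs sequentially, the function names are hypothetical, and it does not reproduce the Haswell kernel or the OpenBLAS threading code.

```c
#include <stddef.h>

/* Dot product of one contiguous chunk. In a threaded build, each worker
   would compute this for its own chunk. */
static double chunk_ddot(const double *x, const double *y, size_t len)
{
    double acc = 0.0;
    for (size_t i = 0; i < len; i++)
        acc += x[i] * y[i];
    return acc;
}

/* Hypothetical helper: split n elements into nchunks nearly equal
   contiguous ranges, compute a partial sum for each, then add them up. */
double chunked_ddot(const double *x, const double *y, size_t n, size_t nchunks)
{
    if (nchunks == 0)
        nchunks = 1;

    double total = 0.0;
    size_t base = n / nchunks;   /* minimum chunk length */
    size_t extra = n % nchunks;  /* the first 'extra' chunks get one more */
    size_t start = 0;

    for (size_t c = 0; c < nchunks; c++) {
        size_t len = base + (c < extra ? 1 : 0);
        total += chunk_ddot(x + start, y + start, len);
        start += len;
    }
    return total;
}
```

Because floating-point addition is not associative, a chunked sum can differ from a single sequential sum in the last bits, so results may vary slightly with the number of chunks.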

IBM Z:

* added optimized BLAS 1/2 functions

MIPS:

* fixed CPU autodetection helper code
* added support for the mips32 1004K CPU (Mediatek MT7621 and similar SoCs)
* added support for the mips64 I6500 CPU
Source: README.md, updated 2018-05-23