Saturday, October 29, 2011

Error in Compiling GotoBLAS2 in Westmere Chipsets

GotoBLAS2 uses new algorithms and memory techniques for optimal performance of the BLAS routines.

When I tried compiling GotoBLAS2 on my Westmere chipsets by following "02QuickInstall.txt", I got this error:

../kernel/x86_64/gemm_ncopy_4.S: Assembler messages:
../kernel/x86_64/gemm_ncopy_4.S:192: Error: undefined symbol `RPREFETCHSIZE' in operation
...........
...........
...........

gcc -O2 -Wall -m64 -DF_INTERFACE_INTEL -fPIC  -DSMP_SERVER -DMAX_CPU_NUMBER=8 -DASMNAME=strmm_kernel_RN -DASMFNAME=strmm_kernel_RN_ -DNAME=strmm_kernel_RN_ -DCNAME=strmm_kernel_RN -DCHAR_NAME=\"strmm_kernel_RN_\" -DCHAR_CNAME=\"strmm_kernel_RN\" -I.. -UDOUBLE  -UCOMPLEX -c -DTRMMKERNEL -UDOUBLE -UCOMPLEX -ULEFT -UTRANSA ../kernel/x86_64/gemm_kernel_8x4_sse3.S -o strmm_kernel_RN.o
make[1]: *** [sgemm_oncopy.o] Error 1
make[1]: *** Waiting for unfinished jobs....
make[1]: Leaving directory `/root/GotoBLAS2/kernel'


I was quite puzzled as to why the compilation did not work. I googled and found a wonderful answer in "Trouble compiling GotoBLAS2 on newer CPU". Basically, you will need to run:

gmake clean
gmake TARGET=NEHALEM
Eventually you will get something like

 GotoBLAS build complete.

  OS               ... Linux
  Architecture     ... x86_64
  BINARY           ... 64bit
  C compiler       ... GCC  (command line : gcc)
  Fortran compiler ... INTEL  (command line : ifort)
  Library Name     ... libgoto2_nehalemp-r1.13.a (Multi threaded; Max num-threads is 8)



According to "Trouble compiling GotoBLAS2 on newer CPU", the problem appears to be that newer CPUs (Intel X5650 in my case) are not detected properly by the CPU ID routine in GotoBLAS2.
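
To confirm which CPU the build machine actually has, something like the sketch below can help. It assumes a Linux system that exposes /proc/cpuinfo; cpu_model is a hypothetical helper name used only for illustration and is not part of GotoBLAS2.

```shell
#!/bin/sh
# Hypothetical helper: print the first "model name" value found in
# cpuinfo-formatted text read from standard input.
cpu_model() {
    awk -F': ' '/^model name/ { print $2; exit }'
}

# Usage on a Linux box (assumes /proc/cpuinfo exists):
#   cpu_model < /proc/cpuinfo
```

If the reported model is one that the CPU ID routine does not recognize, such as the X5650 here, passing TARGET explicitly as described above sidesteps the detection.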

The problem with gemm_ncopy_4.S arises because it defines RPREFETCHSIZE and WPREFETCHSIZE using #ifdef statements depending on CPU type. There is an entry for #ifdef GENERIC, but that was not set for me in config.h, so none of the branches matched and RPREFETCHSIZE was left undefined.
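
One way to see which CPU macros ended up in config.h is to search it for the relevant #define lines. This is a minimal sketch, assuming config.h is in the current directory and uses plain "#define NAME" lines; has_define is a hypothetical helper, not something GotoBLAS2 ships.

```shell
#!/bin/sh
# Hypothetical helper: succeed if header file $1 contains "#define $2".
# $2 is expected to be a plain macro name such as GENERIC or NEHALEM.
has_define() {
    grep -Eq "^[[:space:]]*#[[:space:]]*define[[:space:]]+$2([[:space:]]|\$)" "$1"
}

# Usage (run from the directory that holds config.h):
#   if has_define config.h GENERIC; then echo "GENERIC is set"; else echo "GENERIC is not set"; fi
```

If neither GENERIC nor a recognized CPU macro shows up, that matches the situation described above, where the #ifdef blocks in gemm_ncopy_4.S never define RPREFETCHSIZE.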
