Compiling OpenCV 2.4.x with CUDA 9

Currently, both OpenCV 2 and OpenCV 3 seem to have some minor issues with CUDA 9. However, CUDA 9 is required for the latest generation of NVIDIA graphics cards (Volta). In this article, based on this StackOverflow question, I want to discuss a very simple patch to get OpenCV 2 running with CUDA 9.

When trying to compile OpenCV 2, for example OpenCV 2.4.13.6, with CUDA 9, there are two main issues:

  • The nppi library was split up into a series of libraries in CUDA 9, preventing the shipped FindCUDA.cmake script from finding it;
  • and OpenCV's CMake scripts do not handle the latest GPU architectures correctly.

The first problem can be fixed by following this StackOverflow question. Specifically, adapt FindCUDA.cmake as follows: replace

find_cuda_helper_libs(nppi)

with

find_cuda_helper_libs(nppial)
find_cuda_helper_libs(nppicc)
find_cuda_helper_libs(nppicom)
find_cuda_helper_libs(nppidei)
find_cuda_helper_libs(nppif)
find_cuda_helper_libs(nppig)
find_cuda_helper_libs(nppim)
find_cuda_helper_libs(nppist)
find_cuda_helper_libs(nppisu)
find_cuda_helper_libs(nppitc)

A few lines below, the set statement for CUDA_npp_LIBRARY needs to reflect these changes:

set(CUDA_npp_LIBRARY "${CUDA_nppc_LIBRARY};${CUDA_nppial_LIBRARY};${CUDA_nppicc_LIBRARY};${CUDA_nppicom_LIBRARY};${CUDA_nppidei_LIBRARY};${CUDA_nppif_LIBRARY};${CUDA_nppig_LIBRARY};${CUDA_nppim_LIBRARY};${CUDA_nppist_LIBRARY};${CUDA_nppisu_LIBRARY};${CUDA_nppitc_LIBRARY};${CUDA_npps_LIBRARY}")

Similarly, replace

unset(CUDA_nppi_LIBRARY CACHE)

with

unset(CUDA_nppial_LIBRARY CACHE)
unset(CUDA_nppicc_LIBRARY CACHE)
unset(CUDA_nppicom_LIBRARY CACHE)
unset(CUDA_nppidei_LIBRARY CACHE)
unset(CUDA_nppif_LIBRARY CACHE)
unset(CUDA_nppig_LIBRARY CACHE)
unset(CUDA_nppim_LIBRARY CACHE)
unset(CUDA_nppist_LIBRARY CACHE)
unset(CUDA_nppisu_LIBRARY CACHE)
unset(CUDA_nppitc_LIBRARY CACHE)
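
If the patched FindCUDA.cmake should still work with older toolkits, the three changes above could be written more compactly and guarded by a version check. The following is only a sketch under assumptions: it relies on find_cuda_helper_libs and CUDA_VERSION as provided inside FindCUDA.cmake, it assumes that toolkits before CUDA 9 still ship the single nppi library, and _npp_image_libs and _npp_found are hypothetical helper variables that do not exist in the original script.

```cmake
# Sketch only: fragments for OpenCV's bundled FindCUDA.cmake, which already
# defines find_cuda_helper_libs() and CUDA_VERSION. Not a standalone script.
# _npp_image_libs is a hypothetical helper; define it before its first use.
set(_npp_image_libs
    nppial nppicc nppicom nppidei nppif nppig nppim nppist nppisu nppitc)

# Where the cached library paths are reset:
unset(CUDA_nppi_LIBRARY CACHE)            # still relevant before CUDA 9
foreach(_lib IN LISTS _npp_image_libs)
  unset(CUDA_${_lib}_LIBRARY CACHE)
endforeach()

# Where the helper libraries are located:
if(CUDA_VERSION VERSION_LESS "9.0")
  find_cuda_helper_libs(nppi)             # single library (assumed pre-9)
  set(_npp_found "${CUDA_nppi_LIBRARY}")
else()
  set(_npp_found "")
  foreach(_lib IN LISTS _npp_image_libs)  # split libraries in CUDA 9
    find_cuda_helper_libs(${_lib})
    list(APPEND _npp_found "${CUDA_${_lib}_LIBRARY}")
  endforeach()
endif()

# Same order as in the patched set() statement: nppc, image libraries, npps.
set(CUDA_npp_LIBRARY "${CUDA_nppc_LIBRARY};${_npp_found};${CUDA_npps_LIBRARY}")
```

The loop keeps the ten library names in one place, so a future change to the list only has to be made once.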

In OpenCVDetectCUDA.cmake, two more adjustments are necessary to tackle the second problem: the _generations variable needs to list the latest GPU generations, and each generation needs to be mapped to its compute capabilities. For the first adjustment,

set(_generations "Fermi" "Kepler" "Maxwell" "Pascal" "Volta")

can be used. Then, a few lines below, the case distinction needs to include these generations:

set(__cuda_arch_ptx "")
if(CUDA_GENERATION STREQUAL "Fermi")
  # Note: CUDA 9 itself no longer supports compute capability 2.x.
  set(__cuda_arch_bin "2.0")
elseif(CUDA_GENERATION STREQUAL "Kepler")
  set(__cuda_arch_bin "3.0 3.5 3.7")
elseif(CUDA_GENERATION STREQUAL "Maxwell")
  set(__cuda_arch_bin "5.0 5.2")
elseif(CUDA_GENERATION STREQUAL "Pascal")
  set(__cuda_arch_bin "6.0 6.1")
elseif(CUDA_GENERATION STREQUAL "Volta")
  set(__cuda_arch_bin "7.0")
elseif(CUDA_GENERATION STREQUAL "Auto")
  # ...
endif()
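
To see why the mapping matters, it helps to look at how such a list of compute capabilities typically ends up as NVCC options. The following standalone script is a simplified illustration, not a copy of OpenCV's code: the variable names _arch_no_points, _arch_list and _gencode_flags are made up for this example, and the generation is hard-coded to Pascal.

```cmake
# Standalone sketch; run with: cmake -P gencode_sketch.cmake
# (the file name is arbitrary).
set(CUDA_GENERATION "Pascal")      # assumed input for this example
set(__cuda_arch_bin "6.0 6.1")     # value set by the Pascal branch above

# Strip the dots and split the string into single architectures,
# e.g. "6.0 6.1" becomes the list 60;61.
string(REGEX REPLACE "\\." "" _arch_no_points "${__cuda_arch_bin}")
string(REGEX MATCHALL "[0-9]+" _arch_list "${_arch_no_points}")

# Build one -gencode option pair per architecture.
set(_gencode_flags "")
foreach(_arch IN LISTS _arch_list)
  list(APPEND _gencode_flags -gencode "arch=compute_${_arch},code=sm_${_arch}")
endforeach()

# Print the resulting CMake list (elements are separated by semicolons).
message(STATUS "${CUDA_GENERATION}: ${_gencode_flags}")
```

A generation that is missing from the case distinction leaves the list empty, and a compute capability that the installed toolkit does not know makes NVCC reject the corresponding option.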

Finally, to avoid compilation errors, the NVCC flag --expt-relaxed-constexpr needs to be set. To this end, FindCUDA.cmake needs to be adapted:

set(nvcc_flags "--expt-relaxed-constexpr")
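
As a possible alternative to editing FindCUDA.cmake for this flag, FindCUDA also reads additional NVCC options from the CUDA_NVCC_FLAGS variable. The fragment below is a sketch that I have not verified against the OpenCV 2.4 build; it assumes it is evaluated after the CUDA toolkit has been found (so that CUDA_VERSION is set) and before the CUDA sources are compiled.

```cmake
# Sketch only: append the flag instead of hard-coding it in FindCUDA.cmake.
# Assumes CUDA_VERSION has already been set by find_package(CUDA).
if(NOT CUDA_VERSION VERSION_LESS "9.0")
  # CUDA_NVCC_FLAGS is the list of extra options FindCUDA passes to nvcc.
  list(APPEND CUDA_NVCC_FLAGS "--expt-relaxed-constexpr")
endif()
```

Appending keeps any flags that were already set, and the version check leaves builds with older toolkits unchanged.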

OpenCV 2 should now be ready to be compiled with CUDA 9. As the correct GPU generation might not be selected automatically, make sure to pass -DCUDA_GENERATION (for example, -DCUDA_GENERATION=Pascal) when running CMake to set the correct generation.
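
Instead of repeating the option on every CMake invocation, the settings can also be collected in an initial-cache script. This is a minimal sketch: the file name cuda9-cache.cmake is hypothetical, and Pascal is only a placeholder for the generation of the installed card.

```cmake
# cuda9-cache.cmake (hypothetical file name): pre-loads cache entries.
# Use it with: cmake -C cuda9-cache.cmake <path-to-opencv-source>
set(WITH_CUDA ON CACHE BOOL "Build OpenCV with CUDA support")
# One of the names from the patched _generations list; adjust to the card.
set(CUDA_GENERATION "Pascal" CACHE STRING "Target GPU generation")
```

Because the entries are already in the cache when OpenCV's scripts run, the value chosen here is used instead of the default.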

All these fixes can also be found in the following GitHub repository:

OpenCV 2 CUDA 9 Patch on GitHub

What is your opinion on this article? Did you find it interesting or useful? Let me know your thoughts in the comments below or get in touch with me:

@david_stutz