Compiling OpenCV 2.4.x with CUDA 9

Currently, both OpenCV 2 and OpenCV 3 seem to have some minor issues with CUDA 9, even though CUDA 9 is required for the latest generation of NVIDIA graphics cards. In this article, based on this StackOverflow question, I discuss a very simple patch to get OpenCV 2 running with CUDA 9.

When trying to compile OpenCV 2, for example OpenCV 2.4.13.6, with CUDA 9, there are two main issues:

  • The nppi library was split into a series of smaller libraries in CUDA 9, preventing the shipped FindCUDA.cmake script from finding it;
  • and OpenCV's CMake scripts do not handle the latest GPU architectures correctly.

The first problem can be fixed by following this StackOverflow question. Specifically, adapt FindCUDA.cmake as follows: replace

find_cuda_helper_libs(nppi)

with

find_cuda_helper_libs(nppial)
find_cuda_helper_libs(nppicc)
find_cuda_helper_libs(nppicom)
find_cuda_helper_libs(nppidei)
find_cuda_helper_libs(nppif)
find_cuda_helper_libs(nppig)
find_cuda_helper_libs(nppim)
find_cuda_helper_libs(nppist)
find_cuda_helper_libs(nppisu)
find_cuda_helper_libs(nppitc)

A few lines below, the set statement for CUDA_npp_LIBRARY needs to reflect these changes:

set(CUDA_npp_LIBRARY "${CUDA_nppc_LIBRARY};${CUDA_nppial_LIBRARY};${CUDA_nppicc_LIBRARY};${CUDA_nppicom_LIBRARY};${CUDA_nppidei_LIBRARY};${CUDA_nppif_LIBRARY};${CUDA_nppig_LIBRARY};${CUDA_nppim_LIBRARY};${CUDA_nppist_LIBRARY};${CUDA_nppisu_LIBRARY};${CUDA_nppitc_LIBRARY};${CUDA_npps_LIBRARY}")
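If the same FindCUDA.cmake should keep working with older toolkits, the replacement could be guarded by the toolkit version. The following is only a sketch under a few assumptions: it replaces the original find_cuda_helper_libs(nppi) call, CUDA_VERSION is already set at that point, and nppc and npps are located by the existing calls before the final set statement runs. The names _nppi_part and _nppi_libraries are hypothetical helpers and do not appear in the original script.

```cmake
if(NOT CUDA_VERSION VERSION_LESS "9.0")
  # CUDA 9 and later: nppi is split into ten separate libraries.
  set(_nppi_libraries "")
  foreach(_nppi_part al cc com dei f g m st su tc)
    find_cuda_helper_libs(nppi${_nppi_part})
    list(APPEND _nppi_libraries "${CUDA_nppi${_nppi_part}_LIBRARY}")
  endforeach()
else()
  # Older toolkits: a single nppi library.
  find_cuda_helper_libs(nppi)
  set(_nppi_libraries "${CUDA_nppi_LIBRARY}")
endif()

# Must run after nppc and npps have been located by the existing calls.
set(CUDA_npp_LIBRARY "${CUDA_nppc_LIBRARY};${_nppi_libraries};${CUDA_npps_LIBRARY}")
```

The loop produces the same ten library variables as the explicit calls above; the only difference is that a CUDA 8 build would still find the single nppi library.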

Similarly, replace

unset(CUDA_nppi_LIBRARY CACHE)

with

unset(CUDA_nppial_LIBRARY CACHE)
unset(CUDA_nppicc_LIBRARY CACHE)
unset(CUDA_nppicom_LIBRARY CACHE)
unset(CUDA_nppidei_LIBRARY CACHE)
unset(CUDA_nppif_LIBRARY CACHE)
unset(CUDA_nppig_LIBRARY CACHE)
unset(CUDA_nppim_LIBRARY CACHE)
unset(CUDA_nppist_LIBRARY CACHE)
unset(CUDA_nppisu_LIBRARY CACHE)
unset(CUDA_nppitc_LIBRARY CACHE)
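The ten unset calls can also be written as a loop. This is a sketch of an equivalent form, not part of the original patch; _nppi_part is a hypothetical loop variable. Keeping the old unset for CUDA_nppi_LIBRARY is an optional extra, on the assumption that a build directory previously configured with CUDA 8 may still hold that cache entry.

```cmake
# Clear the cache entries of the split nppi libraries.
foreach(_nppi_part al cc com dei f g m st su tc)
  unset(CUDA_nppi${_nppi_part}_LIBRARY CACHE)
endforeach()

# Optional: also clear a stale entry left over from a pre-CUDA 9 configuration.
# Unsetting a cache variable that does not exist is harmless.
unset(CUDA_nppi_LIBRARY CACHE)
```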

In OpenCVDetectCUDA.cmake, two more adjustments are necessary to tackle the second problem. In particular, the _generations variable needs to list the latest GPU generations, and each generation needs to be mapped to the corresponding compute capabilities. To this end,

set(_generations "Fermi" "Kepler" "Maxwell" "Pascal" "Volta")

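can be used. A few lines below, the case distinction needs to include these generations (CUDA 9 itself no longer supports Fermi, i.e., compute capability 2.x, so that branch is only useful with older toolkits):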

set(__cuda_arch_ptx "")
if(CUDA_GENERATION STREQUAL "Fermi")
  set(__cuda_arch_bin "2.0")
elseif(CUDA_GENERATION STREQUAL "Kepler")
  set(__cuda_arch_bin "3.0 3.5 3.7")
elseif(CUDA_GENERATION STREQUAL "Maxwell")
  set(__cuda_arch_bin "5.0 5.2")
elseif(CUDA_GENERATION STREQUAL "Pascal")
  set(__cuda_arch_bin "6.0 6.1")
elseif(CUDA_GENERATION STREQUAL "Volta")
  set(__cuda_arch_bin "7.0")
elseif(CUDA_GENERATION STREQUAL "Auto")
  # ...
endif()
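To see what such a mapping amounts to on the compiler side, the following standalone script turns a generation name into nvcc -gencode flags. It is only an illustration and not OpenCV's actual code: the file name gencode_sketch.cmake and the variables _arch_bin, _arch_list, _arch_no_point and _nvcc_gencode are hypothetical, and the assumption is that each binary architecture ends up as one arch=compute_XX,code=sm_XX pair. Fermi is left out because CUDA 9 dropped it.

```cmake
# Run with: cmake -DCUDA_GENERATION=Pascal -P gencode_sketch.cmake
if(NOT DEFINED CUDA_GENERATION)
  set(CUDA_GENERATION "Volta")  # default used by this sketch only
endif()

# Same generation-to-capability mapping as in the case distinction above.
if(CUDA_GENERATION STREQUAL "Kepler")
  set(_arch_bin "3.0 3.5 3.7")
elseif(CUDA_GENERATION STREQUAL "Maxwell")
  set(_arch_bin "5.0 5.2")
elseif(CUDA_GENERATION STREQUAL "Pascal")
  set(_arch_bin "6.0 6.1")
elseif(CUDA_GENERATION STREQUAL "Volta")
  set(_arch_bin "7.0")
else()
  message(FATAL_ERROR "Unknown CUDA_GENERATION: ${CUDA_GENERATION}")
endif()

# "6.0 6.1" -> CMake list "6.0;6.1"
string(REPLACE " " ";" _arch_list "${_arch_bin}")

set(_nvcc_gencode "")
foreach(_arch ${_arch_list})
  # "6.1" -> "61"
  string(REPLACE "." "" _arch_no_point "${_arch}")
  list(APPEND _nvcc_gencode
       "-gencode" "arch=compute_${_arch_no_point},code=sm_${_arch_no_point}")
endforeach()

# Join the list with spaces for display.
string(REPLACE ";" " " _nvcc_gencode "${_nvcc_gencode}")
message(STATUS "${CUDA_GENERATION}: ${_nvcc_gencode}")
```

For Pascal, the script reports one -gencode pair for compute_60/sm_60 and one for compute_61/sm_61, which makes it easy to check that a newly added generation maps to the intended architectures.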

Finally, to avoid compilation errors, the NVCC flag --expt-relaxed-constexpr needs to be set. To this end, FindCUDA.cmake needs to be adapted:

set(nvcc_flags "--expt-relaxed-constexpr")
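As a possible alternative to editing FindCUDA.cmake for this flag, FindCUDA also reads additional compiler options from the CUDA_NVCC_FLAGS variable. The following is a minimal sketch, assuming it is placed in OpenCV's CMake code after the CUDA toolkit has been found, so that CUDA_FOUND and CUDA_VERSION are set. That it resolves the same compilation errors as the patch above in every configuration is an assumption, not something taken from the original patch.

```cmake
# Only add the flag for toolkits that need it.
if(CUDA_FOUND AND NOT CUDA_VERSION VERSION_LESS "9.0")
  list(APPEND CUDA_NVCC_FLAGS "--expt-relaxed-constexpr")
endif()
```

The version guard keeps builds against older toolkits unchanged, whereas the unconditional set statement above applies the flag to every toolkit version.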

OpenCV 2 should now be ready to be compiled with CUDA 9. As the correct GPU generation might not be selected automatically, make sure to pass -DCUDA_GENERATION (for example, -DCUDA_GENERATION=Pascal) when running CMake to set the correct generation.
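Instead of passing the options on the command line each time, they could be collected in an initial-cache script that CMake loads with its -C option. This is a sketch only: the file name cuda9-preload.cmake is hypothetical, and "Pascal" stands in for whichever generation matches the installed card.

```cmake
# cuda9-preload.cmake (hypothetical file name)
# Used as: cmake -C cuda9-preload.cmake <path-to-opencv-source>

# Enable the CUDA modules of OpenCV.
set(WITH_CUDA ON CACHE BOOL "Build OpenCV with CUDA support")

# Build device code only for the chosen generation; must be one of the
# names listed in _generations.
set(CUDA_GENERATION "Pascal" CACHE STRING "GPU generation to build device code for")
```

Because the entries are written to the cache before OpenCV's own scripts run, the generation is fixed up front rather than left to automatic selection.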

All these fixes can also be found in the following GitHub repository:

OpenCV 2 CUDA 9 Patch on GitHub
What is your opinion on this article? Let me know your thoughts on Twitter @davidstutz92 or LinkedIn in/davidstutz92.