[FFmpeg-trac] #5822(avfilter:new): filter "scale_npp" fails to select correct algorithm (Nvidia CUDA/NPP scaler)

Mon Sep 5 14:35:07 EEST 2016

#5822: filter "scale_npp" fails to select correct algorithm (Nvidia CUDA/NPP
scaler)
-------------------------------------+-------------------------------------
             Reporter:  sdack        |                     Type:  defect
               Status:  new          |                 Priority:  normal
            Component:  avfilter     |                  Version:
             Keywords:  scale_npp,   |  unspecified
  interp_algo, Nvidia, CUDA, NPP     |               Blocked By:
             Blocking:               |  Reproduced by developer:  0
Analyzed by developer:  0            |
-------------------------------------+-------------------------------------
 Summary of the bug:

 The Nvidia CUDA/NPP filter for scaling images on the GPU fails to select
 the requested algorithm in some cases.

 While scaling an UltraHD video from 3840x2160p down to 1280x720p (to 33% of
 the original size, with no change in aspect ratio), several interpolation
 algorithms produce exactly the same result.

 This behaviour shows most severely when scaling to 33% of the original
 size, but can also be observed at 50% and 25%. At other scales, such as 67%
 and 40%, it appears to work as expected.

 When the aspect ratio changes along with the size, it behaves as expected
 again.
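
 To make the percentages above concrete, here is a small sketch that prints
 the horizontal ratio between the 3840 pixel source width and each target
 width used in the tests. The helper name scale_ratio is made up for this
 illustration. For the sizes tested here, the misbehaving targets (1920,
 1280, 960) are the ones with a whole-number ratio; that is only an
 observation from these few sizes, not a confirmed cause.

```shell
#!/bin/bash
# Print source width / target width and flag whether the ratio is a
# whole number. Hypothetical helper, not part of FFmpeg or test.sh.
scale_ratio() {
    local src="$1" dst="$2"
    awk -v s="$src" -v d="$dst" 'BEGIN {
        r = s / d
        printf "%d -> %d  ratio %.3f  %s\n", s, d, r,
               (r == int(r)) ? "integer" : "fractional"
    }'
}

# Target widths taken from the test cases in this report.
for w in 2560 1920 1536 1280 960 1281; do
    scale_ratio 3840 "$w"
done
```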

 The effect can be seen in this montage of various combinations of
 hardware and software scalers and encoders. Each picture shows the name of
 the algorithm, an encoder setting and the resulting file size of the
 video. In the cases where scale_npp fails to select the algorithm, the
 images look identical and the resulting file sizes are identical, too:

 http://i.imgur.com/5qfPCSU.png

 If I had to guess I'd say there is an optimization going wrong or the
 scaler could be running into a hardware limitation.

 The issue can be observed with CUDA 7.5 and 8.0rc1.

 How to reproduce:

 The following script can be used to detect the issue. It produces a 1
 second uhd2160 video with 50 frames, scales it down with scale_npp and
 runs a checksum (md5sum) on the raw video. Where two algorithms produce
 identical output for all 50 frames, they show identical checksums.

 --- test.sh ---
 #!/bin/bash
 function runtests() {
     w="$1" ; h="$2" ; fmt="nv12"
     for alg in nn linear cubic cubic2p_bspline \
                cubic2p_catmullrom cubic2p_b05c03 super lanczos; do
         ffmpeg -v error -f lavfi \
                -i testsrc2=duration=1:size=uhd2160:rate=50 \
                -pix_fmt $fmt \
                -filter:v "hwupload_cuda,scale_npp=w=$w:h=$h:format=$fmt:interp_algo=$alg,hwdownload" \
                -f rawvideo -y - | md5sum -b | sed "s/-/$alg/"
     done
 }

 echo "3840x2160 -> 2560x1440" ; runtests 2560 1440 | sort # 67% - ok
 echo "3840x2160 -> 1920x1080" ; runtests 1920 1080 | sort # 50% - linear=super
 echo "3840x2160 -> 1536x864"  ; runtests 1536 864  | sort # 40% - ok
 echo "3840x2160 -> 1280x720"  ; runtests 1280 720  | sort # 33% - nn=linear=cubic=catmullrom=lanczos
 echo "3840x2160 -> 960x540"   ; runtests 960  540  | sort # 25% - catmullrom=cubic

 echo "3840x2160 -> 2561x1441" ; runtests 2561 1441 | sort # 67% - ok
 echo "3840x2160 -> 1921x1081" ; runtests 1921 1081 | sort # 50% - ok
 echo "3840x2160 -> 1537x865"  ; runtests 1537 865  | sort # 40% - ok
 echo "3840x2160 -> 1281x721"  ; runtests 1281 721  | sort # 33% - ok
 echo "3840x2160 -> 961x541"   ; runtests 961  541  | sort # 25% - ok
 --- EOF ---
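
 Spotting identical checksums by eye in the sorted output is easy to get
 wrong, so here is a minimal sketch of a filter that does the comparison.
 It assumes the input lines have the form produced by the script above,
 that is "<md5> *<algorithm>", and the function name find_collisions is
 made up for this example.

```shell
#!/bin/bash
# Read "<md5> *<algorithm>" lines on stdin and print every group of
# algorithms that share a checksum, one group per line, joined by "=".
# Hypothetical helper, not part of FFmpeg or test.sh.
find_collisions() {
    awk '{
        name = $2
        sub(/^\*/, "", name)      # md5sum -b prefixes the name with "*"
        if (count[$1]++)
            group[$1] = group[$1] "=" name
        else
            group[$1] = name
    }
    END {
        # Only checksums seen more than once indicate identical output.
        for (sum in count)
            if (count[sum] > 1) print group[sum]
    }' | sort
}
```

 With runtests from test.sh defined, it could be used as
 "runtests 1280 720 | find_collisions". No output means that every
 algorithm produced a distinct result.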

 A workaround is to avoid the scale_npp filter and fall back to the
 software scaler.
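
 As a rough sketch of that fallback, the helper below builds a filter
 string for the software "scale" filter. The mapping from scale_npp
 algorithm names to swscale flags is my assumption of the closest
 equivalents, and the results are not guaranteed to match the NPP kernels
 numerically. The function name and the file names in the usage comment
 are placeholders.

```shell
#!/bin/bash
# Build a software "scale" filter string for a given scale_npp algorithm
# name. The name mapping is an assumption made for illustration.
sw_scale_filter() {
    local w="$1" h="$2" alg="$3" flag
    case "$alg" in
        nn)      flag="neighbor" ;;
        linear)  flag="bilinear" ;;
        cubic)   flag="bicubic"  ;;
        lanczos) flag="lanczos"  ;;
        *) echo "no software equivalent assumed for: $alg" >&2 ; return 1 ;;
    esac
    echo "scale=w=$w:h=$h:flags=$flag"
}

# Example (input.mkv and output.mkv are placeholder file names):
#   ffmpeg -i input.mkv -filter:v "$(sw_scale_filter 1280 720 lanczos)" output.mkv
```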

--
Ticket URL: <https://trac.ffmpeg.org/ticket/5822>
FFmpeg <https://ffmpeg.org>
FFmpeg issue tracker