[FFmpeg-trac] #7028(avfilter:new): Improper rounding of output sample rate when using libopus
FFmpeg
trac at avcodec.org
Sun Feb 18 05:37:04 EET 2018
#7028: Improper rounding of output sample rate when using libopus
----------------------------------+--------------------------------------
Reporter: heicrd | Type: defect
Status: new | Priority: normal
Component: avfilter | Version: git-master
Keywords: | Blocked By:
Blocking: | Reproduced by developer: 0
Analyzed by developer: 0 |
----------------------------------+--------------------------------------
Summary of the bug:
When the encoder is `libopus`, which supports only a limited set of sample
rates, the automatically inserted `aresample` filter rounds the sample rate
of the audio to the nearest supported rate. This can round down, causing a
significant and unexpected loss of fidelity. For example, a 32 kHz input is
downsampled to 24 kHz instead of being upsampled to 48 kHz, as `opusenc`
seems to do.
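To make the difference concrete, here is a minimal sketch of the two selection policies. It is not the libavfilter code, only a model that is consistent with the behaviour described in this report. The rate list is the one the encoder advertises in the log of this report; the function names `pick_nearest` and `pick_no_loss` are hypothetical.

```python
# Sample rates offered to the filter graph for libopus, as listed in the
# 'sample_rates' log line of this report.
SUPPORTED_RATES = [48000, 24000, 16000, 12000, 8000]


def pick_nearest(rate, supported=SUPPORTED_RATES):
    """Model of the reported behaviour: closest rate by absolute difference.

    Ties are broken in favour of the higher rate (an assumption of this sketch).
    """
    return min(supported, key=lambda r: (abs(r - rate), -r))


def pick_no_loss(rate, supported=SUPPORTED_RATES):
    """Model of the expected behaviour: smallest supported rate that is not
    below the input rate, falling back to the highest supported rate."""
    candidates = [r for r in supported if r >= rate]
    return min(candidates) if candidates else max(supported)


if __name__ == "__main__":
    for rate in (32000, 44100, 22050):
        print(rate, "nearest:", pick_nearest(rate), "no-loss:", pick_no_loss(rate))
```

For a 32000 Hz input the first policy picks 24000 Hz (8000 Hz away, versus 16000 Hz away for 48000 Hz), while the second picks 48000 Hz. The two policies agree whenever the nearest supported rate happens to be at or above the input rate.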
How to reproduce:
Use ffmpeg to encode a 32 kHz file with libopus:
{{{
% ./ffmpeg -f lavfi -i "sine=frequency=1000:duration=5" -codec:a pcm_s16le \
    -af aresample=32000 -f wav - | ./ffmpeg -v 9 -loglevel 99 -f wav -i - \
    -codec:a libopus -f ogg -y /dev/null
ffmpeg version N-90069-gdd8351b118ffmpeg version N-90069-gdd8351b118
Copyright (c) 2000-2018 the FFmpeg developers
built with gcc 7 (Debian 7.3.0-3)
Copyright (c) 2000-2018 the FFmpeg developers configuration: --enable-
libopus
built with gcc 7 (Debian 7.3.0-3)
configuration: --enable-libopus
libavutil 56. 7.101 / 56. 7.101
libavcodec 58. 11.101 / 58. 11.101
libavformat 58. 9.100 / 58. 9.100
libavdevice 58. 1.100 / 58. 1.100
libavfilter 7. 12.100 / 7. 12.100
libswscale 5. 0.101 / 5. 0.101
libswresample 3. 0.101 / 3. 0.101
libavutil 56. 7.101 / 56. 7.101
libavcodec 58. 11.101 / 58. 11.101
libavformat 58. 9.100 / 58. 9.100
libavdevice 58. 1.100 / 58. 1.100
libavfilter 7. 12.100 / 7. 12.100
libswscale 5. 0.101 / 5. 0.101
Splitting the commandline.
libswresample 3. 0.101 / 3. 0.101
Reading option '-v' ... matched as option 'v' (set logging level) with
argument '9'.
Reading option '-loglevel' ... matched as option 'loglevel' (set logging
level) with argument '99'.
Reading option '-f' ... matched as option 'f' (force format) with argument
'wav'.
Reading option '-i' ... matched as input url with argument '-'.
Reading option '-codec:a' ... matched as option 'codec' (codec name) with
argument 'libopus'.
Reading option '-f' ... matched as option 'f' (force format) with argument
'ogg'.
Reading option '-y' ... matched as option 'y' (overwrite output files)
with argument '1'.
Reading option '/dev/null' ... matched as output url.
Finished splitting the commandline.
Parsing a group of options: global .
Applying option v (set logging level) with argument 9.
Applying option y (overwrite output files) with argument 1.
Successfully parsed a group of options.
Parsing a group of options: input url -.
Applying option f (force format) with argument wav.
Successfully parsed a group of options.
Opening an input file: -.
[wav @ 0x5647ff2d2340] Opening 'pipe:' for reading
[pipe @ 0x5647ff2d2ec0] Setting default whitelist 'crypto'
Input #0, lavfi, from 'sine=frequency=1000:duration=5':
Duration: N/A, start: 0.000000, bitrate: 705 kb/s
Stream #0:0: Audio: pcm_s16le, 44100 Hz, mono, s16, 705 kb/s
Stream mapping:
Stream #0:0 -> #0:0 (pcm_s16le (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, wav, to 'pipe:':
Metadata:
ISFT : Lavf58.9.100
Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 32000 Hz, mono,
s16, 512 kb/s
Metadata:
encoder : Lavc58.11.101 pcm_s16le
[wav @ 0x5647ff2d2340] Ignoring maximum wav data size, file may be invalid
[wav @ 0x5647ff2d2340] Before avformat_find_stream_info() pos: 78 bytes
read:66920 seeks:0 nb_streams:1
[wav @ 0x5647ff2d2340] probing stream 0 pp:32
[wav @ 0x5647ff2d2340] probing stream 0 pp:31
[wav @ 0x5647ff2d2340] probing stream 0 pp:30
[wav @ 0x5647ff2d2340] probing stream 0 pp:29
[wav @ 0x5647ff2d2340] probing stream 0 pp:28
[wav @ 0x5647ff2d2340] probing stream 0 pp:27
[wav @ 0x5647ff2d2340] probing stream 0 pp:26
[wav @ 0x5647ff2d2340] probing stream 0 pp:25
[wav @ 0x5647ff2d2340] probing stream 0 pp:24
[wav @ 0x5647ff2d2340] probing stream 0 pp:23
[wav @ 0x5647ff2d2340] probing stream 0 pp:22
[wav @ 0x5647ff2d2340] probing stream 0 pp:21
[wav @ 0x5647ff2d2340] probing stream 0 pp:20
[wav @ 0x5647ff2d2340] probing stream 0 pp:19
[wav @ 0x5647ff2d2340] probing stream 0 pp:18
[wav @ 0x5647ff2d2340] probing stream 0 pp:17
[wav @ 0x5647ff2d2340] probing stream 0 pp:16
[wav @ 0x5647ff2d2340] probing stream 0 pp:15
[wav @ 0x5647ff2d2340] probing stream 0 pp:14
[wav @ 0x5647ff2d2340] probing stream 0 pp:13
[wav @ 0x5647ff2d2340] probing stream 0 pp:12
[wav @ 0x5647ff2d2340] probing stream 0 pp:11
[wav @ 0x5647ff2d2340] probing stream 0 pp:10
[wav @ 0x5647ff2d2340] probing stream 0 pp:9
[wav @ 0x5647ff2d2340] probing stream 0 pp:8
[wav @ 0x5647ff2d2340] probing stream 0 pp:7
[wav @ 0x5647ff2d2340] probing stream 0 pp:6
[wav @ 0x5647ff2d2340] probing stream 0 pp:5
[wav @ 0x5647ff2d2340] probing stream 0 pp:4
[wav @ 0x5647ff2d2340] probing stream 0 pp:3
[wav @ 0x5647ff2d2340] probing stream 0 pp:2
[wav @ 0x5647ff2d2340] probing stream 0 pp:1
[wav @ 0x5647ff2d2340] probed stream 0
[wav @ 0x5647ff2d2340] parser not found for codec pcm_s16le, packets or
times may be invalid.
[wav @ 0x5647ff2d2340] All info found
[wav @ 0x5647ff2d2340] stream 0: start_time: -288230376151711.750
duration: -288230376151711.750
[wav @ 0x5647ff2d2340] format: start_time: -9223372036854.775 duration:
-9223372036854.775 bitrate=512 kb/s
[wav @ 0x5647ff2d2340] After avformat_find_stream_info() pos: 204878 bytes
read:205124 seeks:0 frames:50
Guessed Channel Layout for Input Stream #0.0 : mono
Input #0, wav, from 'pipe:':
Metadata:
encoder : Lavf58.9.100
Duration: N/A, bitrate: 512 kb/s
Stream #0:0, 50, 1/32000: Audio: pcm_s16le ([1][0][0][0] / 0x0001),
32000 Hz, mono, s16, 512 kb/s
Successfully opened the file.
Parsing a group of options: output url /dev/null.
Applying option codec:a (codec name) with argument libopus.
Applying option f (force format) with argument ogg.
Successfully parsed a group of options.
Opening an output file: /dev/null.
[file @ 0x5647ff2f5e40] Setting default whitelist 'file,crypto'
Successfully opened the file.
Stream mapping:
Stream #0:0 -> #0:0 (pcm_s16le (native) -> opus (libopus))
cur_dts is invalid (this is harmless if it occurs once at the start per
stream)
detected 8 logical cores
[graph_0_in_0_0 @ 0x5647ff320c80] Setting 'time_base' to value '1/32000'
[graph_0_in_0_0 @ 0x5647ff320c80] Setting 'sample_rate' to value '32000'
[graph_0_in_0_0 @ 0x5647ff320c80] Setting 'sample_fmt' to value 's16'
[graph_0_in_0_0 @ 0x5647ff320c80] Setting 'channel_layout' to value '0x4'
[graph_0_in_0_0 @ 0x5647ff320c80] tb:1/32000 samplefmt:s16
samplerate:32000 chlayout:0x4
[format_out_0_0 @ 0x5647ff320f40] Setting 'sample_fmts' to value 's16|flt'
[format_out_0_0 @ 0x5647ff320f40] Setting 'sample_rates' to value
'48000|24000|16000|12000|8000'
[format_out_0_0 @ 0x5647ff320f40] auto-inserting filter 'auto_resampler_0'
between the filter 'Parsed_anull_0' and the filter 'format_out_0_0'
[AVFilterGraph @ 0x5647ff2f7400] query_formats: 4 queried, 6 merged, 3
already done, 0 delayed
[auto_resampler_0 @ 0x5647ff324580] [SWR @ 0x5647ff324a80] Using s16p
internally between filters
[auto_resampler_0 @ 0x5647ff324580] ch:1 chl:mono fmt:s16 r:32000Hz ->
ch:1 chl:mono fmt:s16 r:24000Hz
[libopus @ 0x5647ff2f5700] No bit rate set. Defaulting to 64000 bps.
Output #0, ogg, to '/dev/null':
Metadata:
encoder : Lavf58.9.100
Stream #0:0, 0, 1/48000: Audio: opus (libopus), 24000 Hz, mono, s16,
delay 156, 64 kb/s
Metadata:
encoder : Lavc58.11.101 libopus
[Parsed_sine_0 @ 0x55a947218500] EOF timestamp not reliable
size= 313kB time=00:00:05.00 bitrate= 512.1kbits/s speed= 107x
video:0kB audio:312kB subtitle:0kB other streams:0kB global headers:0kB
muxing overhead: 0.024375%
[out_0_0 @ 0x5647ff321cc0] EOF on sink link out_0_0:default.
No more output streams to write to, finishing.
[libopus @ 0x5647ff2f5700] Trying to remove 324 more samples than there
are in the queue
size= 60kB time=00:00:05.01 bitrate= 98.0kbits/s speed= 143x
video:0kB audio:59kB subtitle:0kB other streams:0kB global headers:0kB
muxing overhead: 1.050050%
Input file #0 (pipe:):
Input stream #0:0 (audio): 79 packets read (320000 bytes); 79 frames
decoded (160000 samples);
Total: 79 packets (320000 bytes) demuxed
Output file #0 (/dev/null):
Output stream #0:0 (audio): 250 frames encoded (120000 samples); 251
packets muxed (60759 bytes);
Total: 251 packets (60759 bytes) muxed
79 frames successfully decoded, 0 decoding errors
[AVIOContext @ 0x5647ff2f60c0] Statistics: 0 seeks, 8 writeouts
[AVIOContext @ 0x5647ff2db340] Statistics: 320078 bytes read, 0 seeks
}}}
Among the output is
{{{
[auto_resampler_0 @ 0x55e656ac6a80] ch:1 chl:mono fmt:s16 r:32000Hz ->
ch:1 chl:mono fmt:s16 r:24000Hz
}}}
This indicates that the 32 kHz file was downsampled to 24 kHz.
This can be worked around by manually specifying `-af aresample=48000`;
however, ffmpeg gives no indication that any downsampling was performed
unless verbose logging is enabled.
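Until ffmpeg reports this itself, one way to notice the problem is to scan a captured verbose log for the `auto_resampler` line quoted above. The following is a rough sketch under that assumption: the regular expression is derived only from the single log line in this report, and the helper name `find_downsampling` is hypothetical.

```python
import re

# Matches the input and output rates of an auto-inserted resampler line, e.g.
#   [auto_resampler_0 @ 0x...] ch:1 chl:mono fmt:s16 r:32000Hz ->
#   ch:1 chl:mono fmt:s16 r:24000Hz
# The line may be wrapped after "->", so whitespace there includes newlines.
RATE_PAIR = re.compile(
    r"auto_resampler_\d+[^\n]*?r:(\d+)Hz\s*->\s*[^\n]*?r:(\d+)Hz"
)


def find_downsampling(log_text):
    """Return (input_rate, output_rate) pairs where the output rate is lower."""
    pairs = [(int(a), int(b)) for a, b in RATE_PAIR.findall(log_text)]
    return [(a, b) for a, b in pairs if b < a]


if __name__ == "__main__":
    sample = (
        "[auto_resampler_0 @ 0x55e656ac6a80] ch:1 chl:mono fmt:s16 r:32000Hz ->\n"
        "ch:1 chl:mono fmt:s16 r:24000Hz\n"
    )
    for src, dst in find_downsampling(sample):
        print(f"warning: audio downsampled from {src} Hz to {dst} Hz")
```

The log has to be captured with verbose logging enabled, as in the reproduction command, since the line does not appear at the default log level.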
--
Ticket URL: <https://trac.ffmpeg.org/ticket/7028>
FFmpeg <https://ffmpeg.org>
FFmpeg issue tracker