[FFmpeg-trac] #4915(avcodec:new): WebVTT decoder doesn't handle html escapes
FFmpeg
trac at avcodec.org
Thu Oct 8 00:50:21 CEST 2015
#4915: WebVTT decoder doesn't handle html escapes
---------------------------------+---------------------------------------
Reporter: RiCON | Type: enhancement
Status: new | Priority: minor
Component: avcodec | Version: git-master
Keywords: webvtt | Blocked By:
Blocking: | Reproduced by developer: 0
Analyzed by developer: 0 |
---------------------------------+---------------------------------------
The WebVTT spec specifies a dozen HTML escapes that should be handled,
including '&gt;', '&lt;' and '&amp;'. The decoder doesn't convert these
back to the proper characters ('>', '<' and '&').
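To illustrate the conversion being requested, here is a minimal Python
sketch, not FFmpeg code. The function name unescape_cue_text and the table
WEBVTT_ESCAPES are hypothetical, and the set of named escapes in the table
is an assumption; the exact list the decoder should support has to be taken
from the WebVTT spec.

```python
import re

# Named escapes assumed for this sketch; the real set should be taken
# from the WebVTT spec.
WEBVTT_ESCAPES = {
    "amp": "&",
    "lt": "<",
    "gt": ">",
    "nbsp": "\u00a0",
    "lrm": "\u200e",
    "rlm": "\u200f",
}

# Matches "&name;", "&#123;" and "&#x1F;" style references.
_ESCAPE_RE = re.compile(r"&(#[0-9]+|#[xX][0-9a-fA-F]+|[A-Za-z]+);")


def unescape_cue_text(text):
    """Replace known escapes in cue text; leave unknown ones untouched."""

    def replace(match):
        name = match.group(1)
        if name.startswith("#"):
            digits = name[1:]
            try:
                if digits[0] in "xX":
                    return chr(int(digits[1:], 16))
                return chr(int(digits))
            except (ValueError, OverflowError):
                # Code point out of range: keep the original text.
                return match.group(0)
        return WEBVTT_ESCAPES.get(name, match.group(0))

    # A single pass, so "&amp;lt;" becomes "&lt;" and is not decoded twice.
    return _ESCAPE_RE.sub(replace, text)


print(unescape_cue_text("a &lt; b &amp;&amp; c &gt; d"))
```

Python's standard html.unescape covers the full HTML entity table and could
stand in for this function if the wider set is acceptable.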
FFmpeg version:
{{{
% ffmpeg -i htmlescapes.vtt out.srt
ffmpeg version N-75818-g8135b1e Copyright (c) 2000-2015 the FFmpeg developers
built with gcc 5.2.0 (Rev4, Built by MSYS2 project)
}}}
Attached are an example vtt file, the result produced by this build, and the
expected result.
Examples of where these html escapes are used can be found by getting the
subtitles from any video in Comedy Central's site using something like
youtube-dl. Example:
{{{
% youtube-dl --all-subs "http://www.cc.com/video-clips/52dpzm/the-daily-show-with-trevor-noah-terrible--unending-national-tragedies"
}}}
--
Ticket URL: <https://trac.ffmpeg.org/ticket/4915>
FFmpeg <https://ffmpeg.org>
FFmpeg issue tracker