[FFmpeg-trac] #4852(avformat:new): Metadata stream of MPEG-2 file not being written out correctly.

Tue Sep 15 11:01:23 CEST 2015

#4852: Metadata stream of MPEG-2 file not being written out correctly.
--------------------------------------------------------------------------
                Reporter:  anthonyvenables
                    Type:  defect
                Priority:  normal
                  Status:  new
                 Version:  unspecified
               Component:  avformat
                Keywords:  MPEG2 Metadata
              Blocked By:
 Reproduced by developer:  0
                Blocking:
   Analyzed by developer:  0
--------------------------------------------------------------------------
 I am using the FFmpeg libraries in a C++ application I am writing. The
 application should open an MPEG-2 file for display and save a tagged
 section to a new output file.

 The input MPEG-2 file consists of the following three streams:

 a) a synchronous metadata stream of KLV data.
 b) a video stream.
 c) an audio stream.

 The file is opened and the packets are read one at a time and decoded for
 display to the user. I need to be able to copy some (or all) of the input
 file to an output file. I have written a function that performs an
 av_seek_frame() to get to the first frame to output, then creates an
 output file. The input file (which is already open) is read one packet at
 a time and written to the output file.

 Here is the code segment that opens the file for reading. The code is run
 during construction of the class that opens and reads the MPEG2 file.

    {
        pFormatCtx = NULL;
        pCodecCtx = NULL;
        pCodec = NULL;
        pFrame = NULL;
        packet = NULL;
        buffer = NULL;
        img_convert_ctx = NULL;
        memset(&next_time, 0, sizeof(next_time));
        current_index = 0;
        for (unsigned int i = 0; i < MAX_INPUT_BUFFERS; ++i)
            pFrameYUYV[i] = NULL;
        //setFormat(Frame::YUYV);
        initExternalBuffer = false;

        if (!initialized)
        {
            return;
        }

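        if ( pFormatCtx != NULL )
        {
            // the context is owned by libavformat, so release it with
            // avformat_close_input() (which also resets the pointer)
            // rather than with delete
            avformat_close_input(&pFormatCtx);
            av_free(packet);
            packet = NULL;
        }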
        if (filename_ == NULL)
        {
            VIDEOLOG_WARNING("Invalid NULL file name.");
            initialized = false;
            return;
        }

        packet =(AVPacket *)av_malloc(sizeof(AVPacket));
        if (packet == NULL)
        {
            VIDEOLOG_WARNING("Could not allocate packet variable.");
            initialized = false;
            return;
        }

        // Open video file
        if (avformat_open_input(&pFormatCtx,
                const_cast<const char *>(filename_), NULL, NULL) != 0)
        {
            VIDEOLOG_WARNING("Could not open file.");
            initialized = false;
            return;
        }

        // Retrieve stream information
        if(avformat_find_stream_info(pFormatCtx, NULL) < 0)
        {
            VIDEOLOG_WARNING("Could not find stream information.");
            initialized = false;
            return;
        }

        // Dump information about file onto standard error
        av_dump_format(pFormatCtx, 0,
                const_cast<const char *>(filename_), false);

    }

 Here is the function I use to output to a file.

    void FFMpegInput::save_clip_to_file(
            const std::string& filename )
    {
        // Get start and end time values. These are relative positions
        // within the clip in microseconds.
        // A value of 0 is start of clip.
        uint64_t start_marker_position =
                EventListContainer::get_instance().get_start_marker_position();
        uint64_t end_marker_position =
                EventListContainer::get_instance().get_end_marker_position();

        //convert start and end targets to fractional seconds
        uint64_t start_target =
                convert_microseconds_to_fractional_seconds(
                        start_marker_position,
                        m_current_video_stream );
        uint64_t end_target =
                convert_microseconds_to_fractional_seconds(
                        end_marker_position,
                        m_current_video_stream );

        // add video start pts to get actual target values.
        start_target += m_video_start_pts;
        end_target += m_video_start_pts;

        // Seek to start target position.
        unsigned int seek_stream_id = m_current_video_stream;

        if ( m_current_uas_data_stream >= 0 )
        {
            // seek on data stream if possible because data
            // stream leads video stream by a small amount.
            seek_stream_id = m_current_uas_data_stream;
            // cout << "seeking on data_stream" << endl;
        }

        av_seek_frame(
                pFormatCtx,
                seek_stream_id,
                start_target,
                AVSEEK_FLAG_ANY );

        avcodec_flush_buffers(pCodecCtx);

        // open file
        AVOutputFormat *fmt;
        AVFormatContext *oc;

        fmt = av_guess_format(NULL, filename.c_str(), NULL);

        /* allocate the output media context */
        oc = avformat_alloc_context();
        if (!oc)
        {
            fprintf(stderr, "Memory error\n");
            exit(1);
        }

        oc->oformat = fmt;

        /* open the output file, if needed */
        if (!(fmt->flags & AVFMT_NOFILE))
        {
            if ( avio_open(&oc->pb, filename.c_str(), AVIO_FLAG_WRITE) < 0 )
            {
                fprintf(stderr, "Could not open '%s'\n", filename.c_str());
                exit(1);
            }
        }

        // add streams to output file
        AVStream *st;
        AVStream *data_st;
        bool event_stream_present = false;

        for ( unsigned int i = 0;
              i < pFormatCtx->nb_streams;
              ++i                         )
        {
            // create new stream and copy details from input file streams
            st = avformat_new_stream(oc, NULL );
            *st = *pFormatCtx->streams[i];
        }

        // dump format of output context to screen
        av_dump_format(oc, 0, filename.c_str(), true);

        // write the stream header, if any
        cout << "write the stream header ..." << endl;
        /*
         * @param s Media file handle, must be allocated with
         *          avformat_alloc_context().
         *          Its oformat field must be set to the desired output
         *          format;
         *          Its pb field must be set to an already opened
         *          AVIOContext.
         */
        avformat_write_header( oc, NULL );
        cout << "write the stream header - complete ..." << endl;

        uint64_t counter = start_target;

        cout << "start target = " << start_target
             << "   end target = " << end_target << endl;

        // loop from start marker to end marker
        while ( counter < end_target )
        {
            // get next frame
            if (av_read_frame(pFormatCtx, packet) >= 0 )
            {
                if ( packet->stream_index==m_current_video_stream )
                {
                    m_video_current_pts = packet->pts;

                    if ( m_video_start_pts == 0 )
                    {
                        m_video_start_pts = m_video_current_pts;
                    }

                    counter = m_video_current_pts;
                }

                //write frame
                int errnum = av_write_frame(oc, packet);
                if ( errnum != 0 )
                {
                    // something went wrong - report it.
                    cout << "failed packet write, errnum = " << errnum
                         << endl;
                }

                av_free_packet( packet );

            }
            else
            {
                cout << "failed to read frame."
                        << " counter = " << counter
                        << " end_target = " << end_target << endl;

                counter = end_target;
            }
        }

        // close file
        // write the trailer, if any. the trailer must be written
        // before you close the CodecContexts that were open when you
        // wrote the header; otherwise write_trailer may try to use
        // memory that was freed on avcodec_close()

        cout << "write the stream trailer ..." << endl;
        av_write_trailer(oc);
        cout << "write the stream trailer - complete ..." << endl;

        if (!(fmt->flags & AVFMT_NOFILE))
        {
            // close the output file
            avio_close( oc->pb );
        }
    }

 At run time the following output was generated by the 'av_dump_format'
 function and by the cout calls.

 Input #0, mpegts, from '/mercury/VIDEO_FILES/AAA_TEST_FILE_INPUT.ts':
   Duration: 00:00:32.84, start: 34250.781244, bitrate: 15471 kb/s
   Program 1
     Stream #0:0[0xfc]: Data: klv (KLVA / 0x41564C4B)
     Stream #0:1[0xe0]: Video: h264 (Main) ([27][0][0][0] / 0x001B),
 yuv420p, 1280x720, 50 fps, 50 tbr, 90k tbn, 100 tbc
     Stream #0:2[0xc0]: Audio: mp2 ([3][0][0][0] / 0x0003), 44100 Hz,
 stereo, s16p, 128 kb/s

 Output #0, mpegts, to
 '/mercury/VIDEO_FILES/AAA_TEST_FILE_OUTPUT_ffmpeg_2.4.1.ts':
     Stream #0:0: Data: klv (KLVA / 0x41564C4B)
     Stream #0:1: Video: h264 (Main) ([27][0][0][0] / 0x001B), yuv420p,
 1280x720, q=2-31, 50 fps, 50 tbr, 90k tbn, 100 tbc
     Stream #0:2: Audio: mp2 ([3][0][0][0] / 0x0003), 44100 Hz, stereo,
 s16p, 128 kb/s

 write the stream header ...
 write the stream header - complete ...
 start target = 3082570312   end target = 3085525923
 write the stream trailer ...
 write the stream trailer - complete ...
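
 The printed start target is consistent with the "start: 34250.781244" and
 "90k tbn" values in the input dump. The sketch below shows that
 relationship. It is not the convert_microseconds_to_fractional_seconds()
 helper used above (whose source is not shown); microseconds_to_ticks() and
 ticks_to_seconds() are hypothetical names, and the 90000 tick rate is taken
 from the dump. libavutil's av_rescale_q() performs this kind of rescaling
 against a stream time base.

```cpp
#include <cstdint>

// Hypothetical helper: rescale a microsecond value to ticks of a
// 1/tick_rate time base, rounding to the nearest tick.
// Assumes us * tick_rate fits in 64 bits.
std::uint64_t microseconds_to_ticks(std::uint64_t us, std::uint64_t tick_rate)
{
    const std::uint64_t us_per_second = 1000000;
    return (us * tick_rate + us_per_second / 2) / us_per_second;
}

// Hypothetical helper: convert ticks back to seconds for printing.
double ticks_to_seconds(std::uint64_t ticks, std::uint64_t tick_rate)
{
    return static_cast<double>(ticks) / static_cast<double>(tick_rate);
}
```

 With these helpers, microseconds_to_ticks(34250781244, 90000) gives
 3082570312, the printed start target, and the distance between the two
 printed targets (2955611 ticks) is about 32.84 seconds, which agrees with
 the reported duration.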

 When the saved file is then loaded, the following output is generated by
 the 'av_dump_format' function call directly after opening the file.

 Input #0, mpegts, from
 '/mercury/VIDEO_FILES/AAA_TEST_FILE_OUTPUT_ffmpeg_2.4.1.ts':
   Duration: 00:00:32.82, start: 34250.801244, bitrate: 16035 kb/s
   Program 1
     Metadata:
       service_name    : Service01
       service_provider: FFmpeg
     Stream #0:0[0xfc]: Data: klv (KLVA / 0x41564C4B)
     Stream #0:1[0xe0]: Video: h264 (Main) ([27][0][0][0] / 0x001B),
 yuv420p, 1280x720, 50 fps, 50 tbr, 90k tbn, 100 tbc
     Stream #0:2[0xc0]: Audio: mp3 ([3][0][0][0] / 0x0003), 0 channels,
 s16p

 Playback of this file within my application seemed to function correctly,
 but when I investigated the output file in a hex editor I discovered the
 PMT table entries and the PES packets associated with the data stream no
 longer match the input file and are also no longer consistent with "MISB
 ST 1402 (27 February 2014)".

 The code is being built against V2.4.10, but has been rebuilt using V2.7.1
 and 2.8 and the issues are still present in the later versions.

 I have observed the following specific differences.

 1) The PMT information is being changed for every PMT entry in the file.

 The PMT entries in the input file are the following hex bytes

      0x 15 E0 FC F0 16 26 09 01 00 FF 4B 4C 56 41 00 0F 27 09 C0 75 30 C0
 00 14 C0 00 00 1B E0 E0 F0 00 03 E0 C0 F0 00

 The PMT entry bytes can be split into three sections
  a)  data = 0x 15 E0 FC F0 16 26 09 01 00 FF 4B 4C 56 41 00 0F 27 09 C0 75
 30 C0 00 14 C0 00 00
  b) video = 0x 1B E0 E0 F0 00
  c) audio = 0x 03 E0 C0 F0 00

 In the output file, the PMT entries have been changed to

      0x 06 E0 FC F0 06 05 04 4B 4C 56 41 1B E0 E0 F0 00 03 E0 C0 F0 00

  a)  data = 0x 06 E0 FC F0 06 05 04 4B 4C 56 41
  b) video = 0x 1B E0 E0 F0 00
  c) audio = 0x 03 E0 C0 F0 00

 The first issue is that the 'stream type' byte has changed from 0x15 to
 0x06, which indicates that the data is now marked as an asynchronous
 stream instead of a synchronous one. There is also an issue with the
 'stream ID' byte (3rd byte = "0xFC"): MISB ST 1402 states that for
 synchronous metadata the stream type shall be 0x15 and the stream ID
 shall be 0xFC, while for asynchronous metadata the stream type shall be
 0x06 and the stream ID shall be 0xBD.
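
 To make the comparison above repeatable, here is a minimal sketch that
 splits the quoted bytes into elementary-stream entries. It assumes the
 generic ISO/IEC 13818-1 layout of the PMT stream loop (stream_type, 13-bit
 elementary PID, 12-bit ES_info_length, then descriptors); in that layout
 bytes 2 and 3 of each entry carry the PID. PmtEsEntry and
 parse_pmt_es_loop() are hypothetical names, and the output bytes are taken
 from the a)/b)/c) breakdown.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// One entry of the PMT elementary-stream loop (assumed generic layout).
struct PmtEsEntry {
    std::uint8_t  stream_type;              // e.g. 0x15, 0x06, 0x1B, 0x03
    std::uint16_t elementary_pid;           // 13-bit PID
    std::vector<std::uint8_t> descriptors;  // raw ES_info bytes
};

// Parse consecutive entries; stops at the first truncated entry.
std::vector<PmtEsEntry> parse_pmt_es_loop(const std::vector<std::uint8_t>& b)
{
    std::vector<PmtEsEntry> entries;
    std::size_t pos = 0;
    while (pos + 5 <= b.size()) {
        PmtEsEntry e;
        e.stream_type = b[pos];
        e.elementary_pid = static_cast<std::uint16_t>(
                ((b[pos + 1] & 0x1F) << 8) | b[pos + 2]);
        const std::size_t info_len = static_cast<std::size_t>(
                ((b[pos + 3] & 0x0F) << 8) | b[pos + 4]);
        pos += 5;
        if (pos + info_len > b.size())
            break;  // descriptor loop runs past the end of the buffer
        e.descriptors.assign(
                b.begin() + static_cast<std::ptrdiff_t>(pos),
                b.begin() + static_cast<std::ptrdiff_t>(pos + info_len));
        pos += info_len;
        entries.push_back(e);
    }
    return entries;
}

// Bytes quoted in this ticket: input file first, then output file.
const std::vector<std::uint8_t> kInputEsLoop = {
    0x15, 0xE0, 0xFC, 0xF0, 0x16, 0x26, 0x09, 0x01, 0x00, 0xFF, 0x4B,
    0x4C, 0x56, 0x41, 0x00, 0x0F, 0x27, 0x09, 0xC0, 0x75, 0x30, 0xC0,
    0x00, 0x14, 0xC0, 0x00, 0x00,
    0x1B, 0xE0, 0xE0, 0xF0, 0x00,
    0x03, 0xE0, 0xC0, 0xF0, 0x00 };
const std::vector<std::uint8_t> kOutputEsLoop = {
    0x06, 0xE0, 0xFC, 0xF0, 0x06, 0x05, 0x04, 0x4B, 0x4C, 0x56, 0x41,
    0x1B, 0xE0, 0xE0, 0xF0, 0x00,
    0x03, 0xE0, 0xC0, 0xF0, 0x00 };
```

 Parsing kInputEsLoop yields three entries, the first with stream type
 0x15, PID 0x00FC and 22 descriptor bytes. Parsing kOutputEsLoop yields
 stream type 0x06 on the same PID with only 6 descriptor bytes.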

 2) The PES packet information is being changed for all of the data
 packets.

 In the source file, the first PES packet of the data stream consists of
 the following hex bytes.

     0x 47 40 FC 13 00 00 01 FC 00 E4 81 80 05 25 DE F1 B0 B1 00 3F DF 00
 D7 06 0E 2B 34

 The first four bytes are the packet header, and the last four bytes shown
 (0x 06 0E 2B 34) are the start of the klv data block. The remaining bytes
 are the PES header and, immediately before the klv data, the 5-byte
 Metadata AU Header.

 In the output file, the first PES packet of the data stream consists of
 the following hex bytes.

     0x 47 40 FC 30 01 40 00 00 01 BD 00 E4 81 80 05 25 DE F1 B0 B1 06 0E
 2B 34

 The first four bytes (0x 47 40 FC 30) are the packet header, the last four
 bytes shown (0x 06 0E 2B 34) are the start of the klv data block. The
 remaining bytes contain the PES Header.

 It can clearly be seen that in the output file there are additional bytes
 directly after the packet header (0x 01 40). I am not certain what these
 represent. In the PES header (0x 00 00 01 FC ... in the input file), byte
 4 has changed from 0xFC (synchronous metadata) to 0xBD (asynchronous
 metadata).
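
 One way to look into the two extra bytes is to decode the 4-byte packet
 header itself. The sketch below assumes the transport packet header layout
 of ISO/IEC 13818-1; TsHeader and parse_ts_header() are hypothetical names.

```cpp
#include <cstdint>

// Fields of the 4-byte transport packet header (assumed generic layout).
struct TsHeader {
    bool          sync_ok;                   // first byte is 0x47
    bool          payload_unit_start;        // payload_unit_start_indicator
    std::uint16_t pid;                       // 13-bit PID
    std::uint8_t  adaptation_field_control;  // 1 = payload only,
                                             // 2 = adaptation field only,
                                             // 3 = adaptation field + payload
    std::uint8_t  continuity_counter;        // 4-bit counter
};

TsHeader parse_ts_header(const std::uint8_t h[4])
{
    TsHeader t;
    t.sync_ok = (h[0] == 0x47);
    t.payload_unit_start = (h[1] & 0x40) != 0;
    t.pid = static_cast<std::uint16_t>(((h[1] & 0x1F) << 8) | h[2]);
    t.adaptation_field_control =
            static_cast<std::uint8_t>((h[3] >> 4) & 0x03);
    t.continuity_counter = static_cast<std::uint8_t>(h[3] & 0x0F);
    return t;
}
```

 For the input header (0x 47 40 FC 13) this gives adaptation_field_control
 1, and for the output header (0x 47 40 FC 30) it gives 3. Under that
 reading, the bytes 0x 01 40 would be a one-byte adaptation field: 0x01 as
 its length and 0x40 as a flags byte with the random access indicator set.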

 Also, the 5 bytes of the Metadata Access Unit Header ( 0x 00 3F DF 00 D7 )
 that immediately precede the klv data do not exist in the output file.
 This is to be expected, as the output file has marked the packet as
 asynchronous.

 In both input and output files, there is a 5 byte PTS value ( 0x 25 DE F1
 B0 B1 ). There should not be any PTS bytes for asynchronous data.
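
 The PES header bytes quoted above can be decoded with a short sketch like
 the following. It assumes the ISO/IEC 13818-1 PES header layout for stream
 IDs that carry the optional header (which includes 0xFC and 0xBD);
 PesHeader and parse_pes_header() are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>

struct PesHeader {
    bool          valid;          // start code found and header complete
    std::uint8_t  stream_id;      // 0xFC or 0xBD in this ticket
    std::uint16_t packet_length;  // PES_packet_length field
    bool          has_pts;        // PTS bit of PTS_DTS_flags is set
    std::uint64_t pts;            // 33-bit PTS in 90 kHz ticks, 0 if absent
    std::size_t   header_size;    // bytes from start code to end of header
};

// p points at the 00 00 01 start code prefix; n is the number of bytes.
PesHeader parse_pes_header(const std::uint8_t* p, std::size_t n)
{
    PesHeader h = {false, 0, 0, false, 0, 0};
    if (n < 9 || p[0] != 0x00 || p[1] != 0x00 || p[2] != 0x01)
        return h;
    h.stream_id = p[3];
    h.packet_length = static_cast<std::uint16_t>((p[4] << 8) | p[5]);
    h.has_pts = (p[7] & 0x80) != 0;
    h.header_size = 9u + p[8];  // p[8] is PES_header_data_length
    if (n < h.header_size)
        return h;
    if (h.has_pts) {
        if (p[8] < 5)
            return h;  // not enough header data for a PTS
        const std::uint8_t* t = p + 9;
        // 3 + 15 + 15 PTS bits, each group followed by a marker bit
        h.pts = (static_cast<std::uint64_t>((t[0] >> 1) & 0x07) << 30)
              | (static_cast<std::uint64_t>(t[1]) << 22)
              | (static_cast<std::uint64_t>(t[2] >> 1) << 15)
              | (static_cast<std::uint64_t>(t[3]) << 7)
              |  static_cast<std::uint64_t>(t[4] >> 1);
    }
    h.valid = true;
    return h;
}
```

 Applied to the 14 bytes 0x 00 00 01 FC 00 E4 81 80 05 25 DE F1 B0 B1, this
 reports stream ID 0xFC, a PES_packet_length of 228 and a PTS of
 3082573912, which is 3600 ticks (0.04 seconds at 90 kHz) after the printed
 start target. The output packet decodes to the same values except for the
 stream ID of 0xBD.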

 I have tried to examine the packet data in a debugger and have noticed
 that AVPacket->data provides a pointer to a memory location that starts
 with 0x 06 0E 2B 34. This indicates to me that the library is reading the
 packet, stripping the PES header and Metadata AU Header, and placing the
 remaining bytes into the 'data' member. I have examined the AVPacket
 structure and cannot see where the stripped Metadata AU header is stored.
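
 For reference, the five stripped bytes can be decoded with the sketch
 below. It assumes the Metadata_AU_cell header layout of ISO/IEC 13818-1
 (service ID, sequence number, a flags byte, 16-bit cell data length);
 MetadataAuHeader and parse_metadata_au_header() are hypothetical names.

```cpp
#include <cstdint>

// Fields of the 5-byte Metadata AU cell header (assumed layout).
struct MetadataAuHeader {
    std::uint8_t  service_id;                // metadata_service_id
    std::uint8_t  sequence_number;           // increments per AU cell
    std::uint8_t  cell_fragment_indication;  // 2 bits, 3 = complete cell
    bool          decoder_config_flag;
    bool          random_access_indicator;
    std::uint16_t cell_data_length;          // payload bytes that follow
};

MetadataAuHeader parse_metadata_au_header(const std::uint8_t b[5])
{
    MetadataAuHeader m;
    m.service_id = b[0];
    m.sequence_number = b[1];
    m.cell_fragment_indication = static_cast<std::uint8_t>((b[2] >> 6) & 0x03);
    m.decoder_config_flag = (b[2] & 0x20) != 0;
    m.random_access_indicator = (b[2] & 0x10) != 0;
    m.cell_data_length = static_cast<std::uint16_t>((b[3] << 8) | b[4]);
    return m;
}
```

 For 0x 00 3F DF 00 D7 this gives service ID 0, sequence number 0x3F and a
 cell data length of 215 bytes. That is consistent with the input PES
 packet length of 0x00E4 (228): 3 flag and length bytes, 5 PTS bytes, 5
 Metadata AU header bytes and 215 bytes of klv data.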

 The questions I have are as follows.

 A) Is my methodology correct? Is this method of opening the file, reading
 all packets one at a time and writing to the output file an acceptable way
 to use the FFmpeg libraries from C++? Is my code correct, or are some
 function calls incorrect, with alternative calls that should be used in
 their place?

 B) Given that I am reading in each packet in turn and writing it to the
 output file, why is the data stream 'synchronous' type being changed to
 'asynchronous'? Have I done something wrong when creating the streams for
 the output AVFormatContext?

 C) If the stream type in the output file is being output as asynchronous
 (due to values being written into the PMT), why is a PTS being output in
 the PES header of the data packets?

 D) Should the Metadata AU header information be stripped from the PES data
 packet? The standard indicates that multiple Metadata AU cells can exist
 in the PES body; each Metadata AU cell consists of a Metadata AU Header
 and a Metadata AU cell payload. Maintaining the Metadata AU header in the
 packet->data field would ensure that it is written out when the packet is
 written to file.

--
Ticket URL: <https://trac.ffmpeg.org/ticket/4852>
FFmpeg <https://ffmpeg.org>
FFmpeg issue tracker