by Adobe

Created 27 June 2011

The following frequently asked questions and answers are provided to help you package encoded media and develop applications using the HTTP Dynamic Streaming SDK (HDK).

Questions regarding source

1.1. How do I use RefCountedMemoryArena? Do I need IFileSource to use the RefCountedMemoryArena?
1.2. What is the structure of the MediaMessage header field? Is it always 11 bytes? What do the different bytes in the header signify?
1.3. What is the BackTag in a MediaMessage? Why is it required? What is the size of the BackTag in the MediaMessage?
1.4. Is the Audio Data Transport Stream (ADTS) AAC audio payload format supported?
1.5. What is a codec header? Does the size of the codec header vary from codec to codec? How can I create codec headers for the HDK-supported codecs?
1.6. What should be the media handler name for video, audio, and data tracks for an F4V file to work with the HDK?
1.7. When should the AVC and ASC messages be ingested into the HDK? Can they be ingested after sending corresponding video and audio messages? Is it possible to package H.264 and AAC data without first ingesting the corresponding AVC and ASC messages to the HDK?
1.8. I have an H.264 stream in which a single coded picture spans more than one NALU. Should I create one MediaMessage for each NALU with the same timestamp, or stitch all the NALUs into a single MediaMessage?
1.9. How do I create a MediaMessage from raw payload data?

Troubleshooting

2.1. How do I debug the HDK? Is there any verbose mode of the HDK so I can get more details?
2.2. I get an error code from the HDK API. Can I get more information other than the message returned by getErrorString()?
2.3. Is there a sample logger? How can I use it in my application?
2.4. When I push my first data message after the first fragment is created, I get an error from the fragment: "Track not found." What causes this error? How can this be prevented?
2.5. Why do I get an unsupported codec error (offline Packager::fragment failed! error) when the same file runs fine using the f4fpackager shipped with Flash Media Server?

Content protection

3.1. What should be the format of the license server certificate and transport certificate used for encryption?
3.2. Do I need to host encrypted F4F assets somewhat differently than the unencrypted F4F assets?
3.3. How do I enable the fragmenter's encryption mode and specify encryption configuration for encrypting the content using the HDK?

Miscellaneous

4.1. What should be the ideal fragment duration?
4.2. Is it possible for the HDK to do sliding window DVR/VOD mode?
4.3. Are Apache Web Server and the HTTP origin module enough for supporting HTTP live streaming, or do I need Flash Media Server also?
4.4. Is the HDK thread-safe? Can it be used in a multi-threaded program?
4.5. Can the offline packager in the sample app be used to create multi-bitrate (MBR) assets (with a single manifest file for multiple streams)?
4.6. Is the format of the F4F file fixed? Can I store my fragments in a different format? Does the HDK support that?
4.7. What is the purpose of MoovBuffer? What happens if the setMoovBuffer() function is not called and the fragmentation process is initiated?

Below are the answers to the frequently asked questions about packaging encoded media and developing applications using the HTTP Dynamic Streaming SDK.

Questions regarding source

This section covers questions related to the input to the fragmentation process, such as construction of MediaMessage objects.

1.1. How do I use RefCountedMemoryArena? Do I need IFileSource to use the RefCountedMemoryArena?

You can just create an instance of RefCountedMemoryArena and use it. For example:

RefCountedMemoryArena m_arena;

You can think of RefCountedMemoryArena as a memory allocator. It does not depend on the creation of IFileSource for it to be used properly. For example, to allocate a memory block whose size is given by msgLen:

IMemoryArenaAllocPtr arenaAlloc = m_arena.allocate(msgLen);
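The returned allocation exposes its underlying buffer through data(), which you then fill with the message bytes. A minimal sketch, mirroring the usage in question 1.9 below:

uint8_t* buffer = static_cast<uint8_t*>(arenaAlloc->data()); // raw memory backing the allocation
memset(buffer, 0, msgLen);                                   // ready to be filled with message bytes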

1.2. What is the structure of the MediaMessage header field? Is it always 11 bytes? What do the different bytes in the header signify?

Every MediaMessage has a header, which is always 11 bytes. The bytes convey essential information about the message:

Byte 0: Contains the message ID, which tells whether the message is audio, video, or data. For video, the message ID is 0x9. For audio, it is 0x8. For an AMF0 data message, it is 0x12. For an AMF3 data message, it is 0xF. If the payload is encrypted, the sixth bit (counting the least significant bit as the first) of the byte should be set to 1. For example, encrypted video has a message ID of 0x29.

Bytes 1–3: Represent the size of the payload in bytes. The three length bytes are stored in big-endian format. For example, a payload size of 258 is represented as header[1] = 0x0, header[2] = 0x1, and header[3] = 0x2.

Bytes 4–7: Represent the timestamp of the message. For historical reasons, the timestamp bytes are out of order: header byte 7 was originally part of the stream ID, but only three bytes of stream ID and four bytes of timestamp are needed, so the following order is used:

  • header[7] = timestamp[3]
  • header[4] = timestamp[2]
  • header[5] = timestamp[1]
  • header[6] = timestamp[0]

Bytes 8–10: Represent the stream ID of the message. A stream is a logical channel of communication. MediaMessages used with the HDK can have these bytes set to 0.
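For illustration, here is a minimal sketch of filling the 11-byte header for an unencrypted video message. The buffer name header and the variables payloadSize and ts (timestamp in milliseconds) are hypothetical:

uint8_t header[11] = {0};
header[0] = 0x9;                          // message ID: 0x9 = video (0x29 if the payload is encrypted)
header[1] = (payloadSize >> 16) & 0xFF;   // payload size, 3 bytes, big-endian
header[2] = (payloadSize >> 8) & 0xFF;
header[3] = payloadSize & 0xFF;
header[7] = (ts >> 24) & 0xFF;            // timestamp[3]
header[4] = (ts >> 16) & 0xFF;            // timestamp[2]
header[5] = (ts >> 8) & 0xFF;             // timestamp[1]
header[6] = ts & 0xFF;                    // timestamp[0]
// header[8] through header[10] (stream ID) stay 0 when used with the HDK.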

1.3. What is the BackTag in a MediaMessage? Why is it required? What is the size of the BackTag in the MediaMessage?

BackTag refers to the four bytes appended to the end of the payload of the MediaMessage. It enables reverse seeking through MediaMessages: using this field, you can jump directly from the end of a MediaMessage to its beginning. The value stored in the BackTag is the combined size of the MediaMessage header (11 bytes), the codec header, and the payload, written in big-endian format.
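For example, a minimal sketch of computing and appending the BackTag, assuming msg already holds the 11-byte header, the codec header, and the payload (variable names are illustrative):

uint32_t backTag = 11 + codecHeaderLen + payloadSize;      // header + codec header + payload sizes
uint8_t* tail = msg + 11 + codecHeaderLen + payloadSize;   // position immediately after the payload
tail[0] = (backTag >> 24) & 0xFF;                          // written in big-endian format
tail[1] = (backTag >> 16) & 0xFF;
tail[2] = (backTag >> 8) & 0xFF;
tail[3] = backTag & 0xFF;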

1.4. Is the Audio Data Transport Stream (ADTS) AAC audio payload format supported?

No. The ADTS AAC payload format is not supported in HDS. HDS expects a raw AAC bitstream preceded by an AAC sequence header (ASC) packet, which should be the first audio packet pushed to the fragmenter.

1.5. What is a codec header? Does the size of the codec header vary from codec to codec? How can I create codec headers for the HDK-supported codecs?

The codec header consists of bytes prefixed to the payload of each MediaMessage; it signifies the codec used to encode the payload, along with supplementary information such as audio channels, sample rate, audio sample size, video keyframe or interframe, and so on. The size of the codec header varies from codec to codec. For example, AAC has a two-byte codec header, while MP3 and Nellymoser each have a one-byte codec header. Among video codecs, H.264 (AVC) has a five-byte codec header, while VP6 has a two-byte codec header (see the notes after Table 2). Table 1 explains the process of creating the audio codec header for various audio formats.

Table 1. Audio codec header structure

SoundFormat (UB[4]): Format of SoundData. The following values are defined:

  • 0 = Linear PCM, platform-endian
  • 1 = ADPCM
  • 2 = MP3
  • 3 = Linear PCM, little-endian
  • 4 = Nellymoser 16-kHz mono
  • 5 = Nellymoser 8-kHz mono
  • 6 = Nellymoser
  • 7 = G.711 A-law logarithmic PCM
  • 8 = G.711 mu-law logarithmic PCM
  • 9 = reserved
  • 10 = AAC
  • 11 = Speex
  • 14 = MP3 8-kHz
  • 15 = Device-specific sound

Formats 7, 8, 14, and 15 are reserved for internal use. AAC is supported in Flash Player 9,0,115,0 and later. Speex is supported in Flash Player 10 and later.

SoundRate (UB[2]): Sampling rate. The following values are defined:

  • 0 = 5.5-kHz
  • 1 = 11-kHz
  • 2 = 22-kHz
  • 3 = 44-kHz

For AAC: always 3.

SoundSize (UB[1]): Size of each sample. This parameter only pertains to uncompressed formats; compressed formats always decode to 16 bits internally.

  • 0 = snd8Bit
  • 1 = snd16Bit

SoundType (UB[1]): Mono or stereo sound.

  • 0 = sndMono
  • 1 = sndStereo

For Nellymoser: always 0. For AAC: always 1.

For the AAC audio format, the codec header includes a second byte, which signifies whether the packet carries AAC sequence header (ASC) data or raw AAC bitstream payload data. For an ASC packet, this byte is set to 0; for an AAC payload, it is set to 1.

For example, this is what the codec header looks like for an AAC-encoded data stream that is 22.05 kHz, stereo, with 16-bit sample depth (see the sketch after this list):

  • ASC packet codec header: af 00
  • AAC raw bitstream payload: af 01
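As a minimal sketch (variable names are illustrative), the bytes above can be assembled directly from the Table 1 fields:

uint8_t soundFormat = 10;   // AAC
uint8_t soundRate   = 3;    // always 3 for AAC
uint8_t soundSize   = 1;    // snd16Bit
uint8_t soundType   = 1;    // always 1 (stereo) for AAC
uint8_t codecByte   = (soundFormat << 4) | (soundRate << 2) | (soundSize << 1) | soundType;  // 0xAF
uint8_t aacPacketType = 1;  // second byte: 0 = ASC packet, 1 = raw AAC bitstream payload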

Table 2 explains the process of creating the video codec header for various video formats.

Table 2. Video codec header structure

FrameType (UB[4]): Type of frame: keyframe, interframe, or other. The following values are defined:

  • 1 = keyframe (for Advanced Video Coding [AVC], a seekable frame)
  • 2 = interframe (for AVC, a nonseekable frame)
  • 3 = disposable interframe (H.263 only)
  • 4 = generated keyframe (reserved for server use only)
  • 5 = video info/command frame

CodecID (UB[4]): Codec used to encode the payload. The following values are defined:

  • 1 = JPEG (currently unused)
  • 2 = Sorenson H.263
  • 3 = Screen video
  • 4 = On2 VP6
  • 5 = On2 VP6 with alpha channel
  • 6 = Screen video version 2
  • 7 = AVC

For VP6, the video codec header is two bytes: the first is created based on Table 2, and the second is computed with the following formula:

val = ((((width & 0xF) << 4) | (height & 0xF)) ^ 0xFF);

For Advanced Video Coding (AVC), the total codec header is five bytes: the first is created based on Table 2, and the remaining four are created as follows (see the sketch after this list):

  • Byte 2: 0 if the payload is an AVC sequence header, 1 if the payload is an AVC Network Abstraction Layer unit (NALU), and 2 if the packet is an AVC end-of-sequence packet.
  • Bytes 3–5: If Byte 2 is 1, these bytes represent the composition time offset; otherwise, they are set to 0.
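For example, a minimal sketch of assembling the five-byte AVC codec header for a keyframe NALU; compositionTimeOffset is a hypothetical variable holding the composition time offset:

uint8_t avcHeader[5];
avcHeader[0] = (1 << 4) | 7;                         // FrameType = 1 (keyframe), CodecID = 7 (AVC): 0x17
avcHeader[1] = 1;                                    // 0 = AVC sequence header, 1 = NALU, 2 = end of sequence
avcHeader[2] = (compositionTimeOffset >> 16) & 0xFF; // bytes 3-5: composition time offset, big-endian
avcHeader[3] = (compositionTimeOffset >> 8) & 0xFF;  // (set these to 0 when byte 2 is not 1)
avcHeader[4] = compositionTimeOffset & 0xFF;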

1.6. What should be the media handler name for video, audio, and data tracks for an F4V file to work with the HDK?

The F4V reader, shipped as part of HDS, expects the video track's media handler name to be "vide" and the audio track's media handler name to be "soun." For data tracks, the track should either have a media handler name of "meta" or have a codec type of amf0 or amf3 in the stsd box.

1.7. When should the AVC and ASC messages be ingested into the HDK? Can they be ingested after sending corresponding video and audio messages? Is it possible to package H.264 and AAC data without first ingesting the corresponding AVC and ASC messages to the HDK?

AVC and ASC messages are an absolute must for packaging H.264 and AAC data, respectively. They should be the first messages pushed to the fragmenter, before any messages carrying the corresponding codec's encoded payload. This is required because the fragmenter uses these messages to create the corresponding video (H.264) and audio (AAC) tracks.

1.8. I have an H.264 stream in which a single coded picture spans more than one NALU. Should I create one MediaMessage for each NALU with the same timestamp, or stitch all the NALUs into a single MediaMessage?

Flash Player, and hence OSMF, expects each MediaMessage carrying H.264-encoded content to contain one fully coded picture. If the input stream has a coded picture split across multiple NALUs, you need to stitch all of these NALUs together into a single MediaMessage for the player to render the data properly on the screen.

1.9. How do I create a MediaMessage from raw payload data?

A very basic example of creating a MediaMessage from the payload data:

  • bufPayload: actual payload
  • payloadSize: size of this payload
  • codecHeaderBuf: buffer containing the codec header
  • codecHeaderLen: codec header length
  • kHeaderSize: size of the standard header (11 bytes)
  • kBackTagSize: size of the BackTag (four bytes)
  • ts: timestamp of the message in milliseconds
  • vidCodec: video codec
  • msgType: type of the message (8 = audio, 9 = video, 18 = AMF0 data, 15 = AMF3 data)
  • naluStartCodeSize: size of the NALU start code
uint32_t szPayload = payloadSize + codecHeaderLen + kBackTagSize;

// If the video codec is H.264, remove the NALU start code and adjust szPayload.
// Four bytes containing the size of the payload are added instead.
if (vidCodec == kCodecH264)  // kCodecH264 is a placeholder for whatever value identifies H.264 in your code
{
    szPayload -= naluStartCodeSize;
    payloadSize -= naluStartCodeSize;
    szPayload += 4;  // add 4 bytes for the payload size field
}

size_t msgLen = szPayload + kHeaderSize;
IMemoryArenaAllocPtr arenaAlloc = m_memArena.allocate(msgLen);
uint8_t* newmsg = static_cast<uint8_t*>(arenaAlloc->data());
memset(newmsg, 0, msgLen);

uint32_t offset = kHeaderSize;
memcpy(newmsg + offset, codecHeaderBuf, codecHeaderLen);
offset += codecHeaderLen;

// If H.264, write the payload size into a buffer and copy it into newmsg,
// then move the payload pointer past the NALU start code.
if (vidCodec == kCodecH264)
{
    uint8_t sizeBuf[4];
    sizeBuf[0] = (payloadSize >> 24) & 0xFF;
    sizeBuf[1] = (payloadSize >> 16) & 0xFF;
    sizeBuf[2] = (payloadSize >> 8) & 0xFF;
    sizeBuf[3] = payloadSize & 0xFF;
    memcpy(newmsg + offset, sizeBuf, 4);
    offset += 4;

    bufPayload += naluStartCodeSize;  // ignore the NALU start code
}

memcpy(newmsg + offset, bufPayload, payloadSize);

MediaMessage::setMsgLen(newmsg, szPayload);
MediaMessage::setStreamID(newmsg, 0);
MediaMessage::setTimeStamp(newmsg, ts);
MediaMessage::setBackTag(newmsg, szPayload - kBackTagSize);
newmsg[0] = msgType;  // type of message: 8 = audio, 9 = video, 18 = AMF0 data, 15 = AMF3 data

// Create the MediaMessage.
MediaMsg msg = new MediaMessage(ts, msgLen, arenaAlloc, newmsg + kHeaderSize, szPayload);

Troubleshooting

2.1. How do I debug the HDK? Is there any verbose mode of the HDK so I can get more details?

The HDK has logging capabilities. To enable logging, call the setLogger API, passing it an implementation of the IMultiLogger or ILogger interface, and call LoggerCfg::setLogLevel to set the verbosity.

2.2. I get an error code from the HDK API. Can I get more information other than the message returned by getErrorString()?

As explained in 2.1, the SDK has logging capabilities, and it logs as much information as it can by invoking the logMessage() API of the ILogger object passed in setLogger().

2.3. Is there a sample logger? How can I use it in my application?

To set the logger, call the setLogger API of the HDK and pass an implementation of the IMultiLogger or ILogger interface. An example implementation is provided in the samples: refer to the HSLogger class in hslogger.hpp under STREAMING SDK ROOT/samples/common/include. For sample usage of the same API, see the Packager class constructor in packager.cpp (in STREAMING SDK ROOT/samples/Packager/src).

2.4. When I push my first data message after the first fragment is created, I get an error from the fragment: "Track not found." What causes this error? How can this be prevented?

The HDK expects that all tracks will be created before the first fragment is created. This is required because the size of the MOOV box is finalized before the first fragment is written, and the MOOV box's size will not be expanded once finalized. (This is to enable writing of F4F files with the MOOV box at the beginning and the fragments following it.) If no data message is pushed before the creation of the first fragment, no data track will be created; hence the "Track not found" error occurs when the first data message is pushed after the creation of the first fragment. This problem can be prevented for now by pushing in at least one AMF0 or AMF3 message (whichever type of data message you expect to push later on) before the first fragment is created.

2.5. Why do I get an unsupported codec error (offline Packager::fragment failed! error) when the same file runs fine using the f4fpackager shipped with Flash Media Server?

Neither the Flash Media Server implementation of f4fpackager nor the HDK supports certain codecs; however, the two implementations handle this scenario differently. While f4fpackager simply ignores the track containing the unsupported codec and moves on to fragment the data available in the supported tracks, the HDK treats this as an exceptional condition, throws an exception, and aborts the creation of the media source. Hence, you are not able to fragment the same file with the offline packager implemented using the HDK. This can be avoided by setting the ignoreUnsupportedCodecs parameter in the call to createF4VFileSource to true. The HDK will then just log an error message but continue fragmenting the data available in the supported tracks.
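As a rough illustration only: the full createF4VFileSource parameter list is not reproduced here and may differ between HDK versions, and the IMediaSourcePtr return type is an assumed smart-pointer typedef. The point is simply that the flag is passed as true:

// Skip tracks with unsupported codecs instead of aborting; consult the HDK headers
// for the complete createF4VFileSource signature (other arguments omitted here).
bool ignoreUnsupportedCodecs = true;
IMediaSourcePtr mediaSource = createF4VFileSource(/* ..., */ ignoreUnsupportedCodecs);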

Content protection

This section contains questions related to protected content and encryption.

3.1. What should be the format of the license server certificate and transport certificate used for encryption?

The license server certificate and transport certificate should be in DER format. Any other format, such as PFX, can result in assertions being hit in the HDS module.

3.2. Do I need to host encrypted F4F assets somewhat differently than the unencrypted F4F assets?

No, you do not need to host encrypted and unencrypted F4F assets differently. As long as the HTTP module is able to find them, and the manifest file is created properly with the DRM additional header information, the player will be able to play them.

3.3. How do I enable the fragmenter's encryption mode and specify encryption configuration for encrypting the content using the HDK?

To generate encrypted fragments, set the m_EncryptionRequired member of the FragmenterConfig class to true and initialize its m_Encryptionconfig member, which is an EncryptionConfig object. To initialize this object, you need to pass the following parameters to its constructor:

fms::String m_contentId;
adbe::hds::ByteBuffer m_commonKey;
adbe::hds::ByteBuffer m_encryptionKey;
fms::String m_licenseServerUrl;
adbe::hds::ByteBuffer m_transportCert;
adbe::hds::ByteBuffer m_licenseServerCert;
adbe::hds::ByteBuffer m_packagerCredential;
fms::String m_credentialPwd;
adbe::hds::ByteBuffer m_policy;
adbe::hds::EncryptionSelection m_encSelection;
bool m_usingCommonKey;

The EncryptionConfig class constructor syntax:

EncryptionConfig(key, licenseServerUrl, transportCert, licenseServerCert, packagerCredential, credentialPwd, policy, encSel, isCommonKey);
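Putting this together, a minimal sketch of enabling encryption before fragmentation; the member names follow the description above, and the constructor arguments are assumed to have been populated already:

FragmenterConfig fragConfig;
fragConfig.m_EncryptionRequired = true;   // enable the fragmenter's encryption mode
// Initialize the encryption configuration using the constructor shown above.
fragConfig.m_Encryptionconfig = EncryptionConfig(key, licenseServerUrl, transportCert,
                                                 licenseServerCert, packagerCredential,
                                                 credentialPwd, policy, encSel, isCommonKey);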

Miscellaneous

4.1. What should be the ideal fragment duration?

The ideal fragment duration is a multiple of the keyframe interval of the stream being pushed into the fragmenter. This helps create evenly sized fragments (as fragmentation is done on keyframe arrival), and hence smaller fragment run tables, which results in smaller bootstrap information. Smaller bootstrap information means less data to supply to the player, and hence reduced bandwidth usage.

Another thing to consider is that the fragment duration shouldn't be too small. If the load time of a fragment at the player (which involves the player making a GET request to the HTTP server for the fragment and then downloading the content) is longer than the fragment duration, subscribers might encounter video stuttering at fragment boundaries. Seeking within the content might also not give a smooth experience when fragments are very short. Generally, a four-second fragment duration is considered ideal, provided it is a multiple of the incoming stream's keyframe interval. On the other hand, too large a fragment duration results in latency.

The OfflinePackager sample application uses a four-second fragment duration by default.

4.2. Is it possible for the HDK to do sliding window DVR/VOD mode?

At present, sliding window DVR/VOD mode is not directly supported by the HDK; it is being looked at and should be available in future releases. However, you can achieve the effect of a DVR sliding window in a limited way, either by writing application code that removes older fragments from the F4F file, or by modifying the HTTP module and the F4X file so that fragments past a set time are not returned. This is more a "hack" than a complete solution: it will not prevent users from trying to seek (since the player is aware of the complete bootstrap, it may allow the user to seek), but an error will be shown if the viewer seeks beyond the set duration.

4.3. Are Apache Web Server and the HTTP origin module enough for supporting HTTP live streaming, or do I need Flash Media Server also?

Apache Web Server and the HTTP origin module are sufficient for supporting the HTTP live streaming use case. Flash Media Server is not necessary as long as there is another way to create the fragments, as well as the bootstrap, to serve client requests.

4.4. Is the HDK thread-safe? Can it be used in a multi-threaded program?

Individual instances of HDK objects (IFragmenter, IMediaSource, and others) are not thread-safe; the functions are not re-entrant. However, the HDK can be used in a multi-threaded program to fragment multiple streams in parallel, with each object being used by a single thread. There is no interdependence between multiple instances of HDK objects.

4.5. Can the offline packager in the sample app be used to create multi-bitrate (MBR) assets (with a single manifest file for multiple streams)?

The offline packager example, as given, can only process one stream and create one manifest file at a time. To create an MBR asset with a single manifest file, you will have to modify the manifest-generation part of the code. The rest of the pieces, such as IMediaSource, IFragmenter, F4FWriter, and F4XWriter, can be used as is.

4.6. Is the format of the F4F file fixed? Can I store my fragments in a different format? Does the HDK support that?

The F4F file format is an MPEG-4-compatible file format. However, you can choose to store the fragments differently: as long as you are willing to write your own HTTP module to serve fragment requests, there is no restriction on how a fragment is stored, whether as part of an F4F file or in some other way. The core HDK APIs have nothing to do with file-writing operations; it is up to the application code to decide how fragments are stored.

4.7. What is the purpose of MoovBuffer? What happens if the setMoovBuffer() function is not called and the fragmentation process is initiated?

The MOOV box is part of the MPEG-4 format and holds global metadata about the tracks and codecs used; it is what makes the F4F file a valid MPEG-4 file. However, it is not used during HTTP streaming playback, so the HDK allows you to skip setting a MOOV buffer and directly create and write fragments.