| rfc9628v2.txt | rfc9628.txt | |||
|---|---|---|---|---|
| Internet Engineering Task Force (IETF) J. Uberti | Internet Engineering Task Force (IETF) J. Uberti | |||
| Request for Comments: 9628 S. Holmer | Request for Comments: 9628 S. Holmer | |||
| Category: Standards Track M. Flodman | Category: Standards Track M. Flodman | |||
| ISSN: 2070-1721 D. Hong | ISSN: 2070-1721 D. Hong | |||
| J. Lennox | J. Lennox | |||
| 8x8 / Jitsi | 8x8 / Jitsi | |||
| August 2024 | October 2024 | |||
| RTP Payload Format for VP9 Video | RTP Payload Format for VP9 Video | |||
| Abstract | Abstract | |||
| This specification describes an RTP payload format for the VP9 video | This specification describes an RTP payload format for the VP9 video | |||
| codec. The payload format has wide applicability as it supports | codec. The payload format has wide applicability as it supports | |||
| applications from low bitrate peer-to-peer usage to high bitrate | applications from low bitrate peer-to-peer usage to high bitrate | |||
| video conferences. It includes provisions for temporal and spatial | video conferences. It includes provisions for temporal and spatial | |||
| scalability. | scalability. | |||
| skipping to change at line 240 ¶ | skipping to change at line 240 ¶ | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| Figure 1: General RTP Payload Format for VP9 | Figure 1: General RTP Payload Format for VP9 | |||
| See Section 4.2 for more information on the VP9 payload descriptor; | See Section 4.2 for more information on the VP9 payload descriptor; | |||
| the VP9 payload is described in [VP9-BITSTREAM]. OPTIONAL RTP | the VP9 payload is described in [VP9-BITSTREAM]. OPTIONAL RTP | |||
| padding MUST NOT be included unless the P bit is set. | padding MUST NOT be included unless the P bit is set. | |||
| Marker bit (M): This bit MUST be set to 1 for the final packet of | Marker bit (M): This bit MUST be set to 1 for the final packet of | |||
| the highest spatial-layer frame (the final packet of the picture); | the highest spatial-layer frame (the final packet of the picture); | |||
| otherwise, it MUST be set to 0. Unless spatial scalability is in | otherwise, it is 0. Unless spatial scalability is in use for this | |||
| use for this picture, this bit will have the same value as the E | picture, this bit will have the same value as the E bit described | |||
| bit described in Section 4.2. Note this bit MUST be set to 1 for | in Section 4.2. Note this bit MUST be set to 1 for the target | |||
| the target spatial-layer frame if a stream is being rewritten to | spatial-layer frame if a stream is being rewritten to remove | |||
| remove higher spatial layers. | higher spatial layers. | |||
| Payload Type (PT): In line with the policy in Section 3 of | Payload Type (PT): In line with the policy in Section 3 of | |||
| [RFC3551], applications using the VP9 RTP payload profile MUST | [RFC3551], applications using the VP9 RTP payload profile MUST | |||
| assign a dynamic payload type number to be used in each RTP | assign a dynamic payload type number to be used in each RTP | |||
| session and provide a mechanism to indicate the mapping. See | session and provide a mechanism to indicate the mapping. See | |||
| Section 6.1 for the mechanism to be used with the Session | Section 6.1 for the mechanism to be used with the Session | |||
| Description Protocol (SDP) [RFC8866]. | Description Protocol (SDP) [RFC8866]. | |||
| Timestamp: The RTP timestamp [RFC3550] indicates the time when the | Timestamp: The RTP timestamp [RFC3550] indicates the time when the | |||
| input frame was sampled, at a clock rate of 90 kHz. If the input | input frame was sampled, at a clock rate of 90 kHz. If the input | |||
| skipping to change at line 365 ¶ | skipping to change at line 365 ¶ | |||
| resets the encoder state. This packet will have its P bit equal | resets the encoder state. This packet will have its P bit equal | |||
| to 0, SID or L bit (described below) equal to 0, and B bit | to 0, SID or L bit (described below) equal to 0, and B bit | |||
| (described below) equal to 1. | (described below) equal to 1. | |||
| B: Start of a frame. This bit MUST be set to 1 if the first payload | B: Start of a frame. This bit MUST be set to 1 if the first payload | |||
| octet of the RTP packet is the beginning of a new VP9 frame; | octet of the RTP packet is the beginning of a new VP9 frame; | |||
| otherwise, it MUST NOT be 1. Note that this frame might not be | otherwise, it MUST NOT be 1. Note that this frame might not be | |||
| the first frame of a picture. | the first frame of a picture. | |||
| E: End of a frame. This bit MUST be set to 1 for the final RTP | E: End of a frame. This bit MUST be set to 1 for the final RTP | |||
| packet of a VP9 frame; otherwise, it MUST be 0. This enables a | packet of a VP9 frame; otherwise, it is 0. This enables a decoder | |||
| decoder to finish decoding the frame, where it otherwise may need | to finish decoding the frame, where it otherwise may need to wait | |||
| to wait for the next packet to explicitly know that the frame is | for the next packet to explicitly know that the frame is complete. | |||
| complete. Note that, if spatial scalability is in use, more | Note that, if spatial scalability is in use, more frames from the | |||
| frames from the same picture may follow; see the description of | same picture may follow; see the description of the B bit above. | |||
| the B bit above. | ||||
| V: Scalability Structure (SS) data present. When set to 1, the | V: Scalability Structure (SS) data present. When set to 1, the | |||
| OPTIONAL SS data MUST be present in the payload descriptor. | OPTIONAL SS data MUST be present in the payload descriptor. | |||
| Otherwise, the SS data MUST NOT be present. | Otherwise, the SS data MUST NOT be present. | |||
| Z: Not a reference frame for upper spatial layers. If set to 1, | Z: Not a reference frame for upper spatial layers. If set to 1, | |||
| indicates that frames with higher spatial layers SID+1 and greater | indicates that frames with higher spatial layers SID+1 and greater | |||
| of the current and following pictures do not depend on the current | of the current and following pictures do not depend on the current | |||
| spatial-layer SID frame. This enables a decoder that is targeting | spatial-layer SID frame. This enables a decoder that is targeting | |||
| a higher spatial layer to know that it can safely discard this | a higher spatial layer to know that it can safely discard this | |||
| End of changes. 3 change blocks. | ||||
| 12 lines changed or deleted | 11 lines changed or added | |||
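
The B, E, V, and Z definitions and the marker-bit rule compared above can be sketched in code. The following is a minimal illustration, not part of either RFC text. It assumes the first-octet layout `I|P|L|F|B|E|V|Z` (most significant bit first) from Section 4.2 of RFC 9628, which is not reproduced in this excerpt. The names `Vp9DescriptorFlags`, `parse_first_octet`, and `expected_marker` are hypothetical, and the spatial-layer IDs are passed in by the caller rather than parsed from the packet.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vp9DescriptorFlags:
    """Flags in the first octet of the VP9 payload descriptor (hypothetical type)."""
    i: bool  # picture ID present
    p: bool  # inter-picture predicted frame
    l: bool  # layer indices present
    f: bool  # flexible mode
    b: bool  # start of a frame
    e: bool  # end of a frame
    v: bool  # scalability structure (SS) data present
    z: bool  # not a reference frame for upper spatial layers


def parse_first_octet(octet: int) -> Vp9DescriptorFlags:
    """Split the first descriptor octet into flags, assuming I|P|L|F|B|E|V|Z order."""
    if not 0 <= octet <= 0xFF:
        raise ValueError("octet must be in range 0..255")
    # Bit 0 in the RFC figure is the most significant bit of the octet.
    bits = [bool(octet & (0x80 >> n)) for n in range(8)]
    return Vp9DescriptorFlags(*bits)


def expected_marker(flags: Vp9DescriptorFlags, spatial_id: int, top_spatial_id: int) -> bool:
    """Marker bit rule: 1 only on the final packet of the highest spatial-layer frame.

    top_spatial_id is the highest spatial layer actually sent. When a stream is
    rewritten to remove higher layers, pass the target layer here instead.
    """
    return flags.e and spatial_id == top_spatial_id


if __name__ == "__main__":
    flags = parse_first_octet(0x0C)  # B=1 and E=1: a frame carried in a single packet
    print(flags)
    print(expected_marker(flags, spatial_id=0, top_spatial_id=0))
```

Without spatial scalability, `spatial_id` and `top_spatial_id` are both 0, so `expected_marker` reduces to the E bit, matching the statement that the marker bit then has the same value as the E bit.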