| rfc9639.original | rfc9639.txt | |||
|---|---|---|---|---|
| cellar M.Q.C. van Beurden | Internet Engineering Task Force (IETF) M.Q.C. van Beurden | |||
| Internet-Draft | Request for Comments: 9639 | |||
| Intended status: Standards Track A. Weaver | Category: Standards Track A. Weaver | |||
| Expires: 17 July 2024 14 January 2024 | ISSN: 2070-1721 December 2024 | |||
| Free Lossless Audio Codec | Free Lossless Audio Codec (FLAC) | |||
| draft-ietf-cellar-flac-14 | ||||
| Abstract | Abstract | |||
| This document defines the Free Lossless Audio Codec (FLAC) format and | This document defines the Free Lossless Audio Codec (FLAC) format and | |||
| its streamable subset. FLAC is designed to reduce the amount of | its streamable subset. FLAC is designed to reduce the amount of | |||
| computer storage space needed to store digital audio signals without | computer storage space needed to store digital audio signals. It | |||
| losing information in doing so (i.e., lossless). FLAC is free in the | does this losslessly, i.e., it does so without losing information. | |||
| sense that its specification is open and its reference implementation | FLAC is free in the sense that its specification is open and its | |||
| is open-source. Compared to other lossless (audio) coding formats, | reference implementation is open source. Compared to other lossless | |||
| FLAC is a format with low complexity and can be coded to and from | audio coding formats, FLAC is a format with low complexity and can be | |||
| with little computing resources. Decoding of FLAC has seen many | encoded and decoded with little computing resources. Decoding of | |||
| independent implementations on many different platforms, and both | FLAC has been implemented independently for many different platforms, | |||
| encoding and decoding can be implemented without needing floating- | and both encoding and decoding can be implemented without needing | |||
| point arithmetic. | floating-point arithmetic. | |||
| Status of This Memo | Status of This Memo | |||
| This Internet-Draft is submitted in full conformance with the | This is an Internet Standards Track document. | |||
| provisions of BCP 78 and BCP 79. | ||||
| Internet-Drafts are working documents of the Internet Engineering | ||||
| Task Force (IETF). Note that other groups may also distribute | ||||
| working documents as Internet-Drafts. The list of current Internet- | ||||
| Drafts is at https://datatracker.ietf.org/drafts/current/. | ||||
| Internet-Drafts are draft documents valid for a maximum of six months | This document is a product of the Internet Engineering Task Force | |||
| and may be updated, replaced, or obsoleted by other documents at any | (IETF). It represents the consensus of the IETF community. It has | |||
| time. It is inappropriate to use Internet-Drafts as reference | received public review and has been approved for publication by the | |||
| material or to cite them other than as "work in progress." | Internet Engineering Steering Group (IESG). Further information on | |||
| Internet Standards is available in Section 2 of RFC 7841. | ||||
| This Internet-Draft will expire on 17 July 2024. | Information about the current status of this document, any errata, | |||
| and how to provide feedback on it may be obtained at | ||||
| https://www.rfc-editor.org/info/rfc9639. | ||||
| Copyright Notice | Copyright Notice | |||
| Copyright (c) 2024 IETF Trust and the persons identified as the | Copyright (c) 2024 IETF Trust and the persons identified as the | |||
| document authors. All rights reserved. | document authors. All rights reserved. | |||
| This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
| Provisions Relating to IETF Documents (https://trustee.ietf.org/ | Provisions Relating to IETF Documents | |||
| license-info) in effect on the date of publication of this document. | (https://trustee.ietf.org/license-info) in effect on the date of | |||
| publication of this document. Please review these documents | ||||
| Please review these documents carefully, as they describe your rights | carefully, as they describe your rights and restrictions with respect | |||
| and restrictions with respect to this document. Code Components | to this document. Code Components extracted from this document must | |||
| extracted from this document must include Revised BSD License text as | include Revised BSD License text as described in Section 4.e of the | |||
| described in Section 4.e of the Trust Legal Provisions and are | Trust Legal Provisions and are provided without warranty as described | |||
| provided without warranty as described in the Revised BSD License. | in the Revised BSD License. | |||
| Table of Contents | Table of Contents | |||
| 1. Introduction | 1. Introduction | |||
| 2. Notation and Conventions | 2. Notation and Conventions | |||
| 3. Definitions | 3. Definitions | |||
| 4. Conceptual overview | 4. Conceptual Overview | |||
| 4.1. Blocking | 4.1. Blocking | |||
| 4.2. Interchannel Decorrelation | 4.2. Interchannel Decorrelation | |||
| 4.3. Prediction | 4.3. Prediction | |||
| 4.4. Residual Coding | 4.4. Residual Coding | |||
| 5. Format principles | 5. Format Principles | |||
| 6. Format layout overview | 6. Format Layout Overview | |||
| 7. Streamable subset | 7. Streamable Subset | |||
| 8. File-level metadata | 8. File-Level Metadata | |||
| 8.1. Metadata block header | 8.1. Metadata Block Header | |||
| 8.2. Streaminfo | 8.2. Streaminfo | |||
| 8.3. Padding | 8.3. Padding | |||
| 8.4. Application | 8.4. Application | |||
| 8.5. Seektable | 8.5. Seek Table | |||
| 8.5.1. Seekpoint | 8.5.1. Seek Point | |||
| 8.6. Vorbis comment | 8.6. Vorbis Comment | |||
| 8.6.1. Standard field names | 8.6.1. Standard Field Names | |||
| 8.6.2. Channel mask | 8.6.2. Channel Mask | |||
| 8.7. Cuesheet | 8.7. Cuesheet | |||
| 8.7.1. Cuesheet track | 8.7.1. Cuesheet Track | |||
| 8.8. Picture | 8.8. Picture | |||
| 9. Frame structure | 9. Frame Structure | |||
| 9.1. Frame header | 9.1. Frame Header | |||
| 9.1.1. Block size bits | 9.1.1. Block Size Bits | |||
| 9.1.2. Sample rate bits | 9.1.2. Sample Rate Bits | |||
| 9.1.3. Channels bits | 9.1.3. Channels Bits | |||
| 9.1.4. Bit depth bits | 9.1.4. Bit Depth Bits | |||
| 9.1.5. Coded number | 9.1.5. Coded Number | |||
| 9.1.6. Uncommon block size | 9.1.6. Uncommon Block Size | |||
| 9.1.7. Uncommon sample rate | 9.1.7. Uncommon Sample Rate | |||
| 9.1.8. Frame header CRC | 9.1.8. Frame Header CRC | |||
| 9.2. Subframes | 9.2. Subframes | |||
| 9.2.1. Subframe header | 9.2.1. Subframe Header | |||
| 9.2.2. Wasted bits per sample | 9.2.2. Wasted Bits per Sample | |||
| 9.2.3. Constant subframe | 9.2.3. Constant Subframe | |||
| 9.2.4. Verbatim subframe | 9.2.4. Verbatim Subframe | |||
| 9.2.5. Fixed predictor subframe | 9.2.5. Fixed Predictor Subframe | |||
| 9.2.6. Linear predictor subframe | 9.2.6. Linear Predictor Subframe | |||
| 9.2.7. Coded residual | 9.2.7. Coded Residual | |||
| 9.3. Frame footer | 9.3. Frame Footer | |||
| 10. Container mappings | 10. Container Mappings | |||
| 10.1. Ogg mapping | 10.1. Ogg Mapping | |||
| 10.2. Matroska mapping | 10.2. Matroska Mapping | |||
| 10.3. ISO Base Media File Format (MP4) mapping | 10.3. ISO Base Media File Format (MP4) Mapping | |||
| 11. Implementation status | 11. Security Considerations | |||
| 12. Security Considerations | 12. IANA Considerations | |||
| 13. IANA Considerations | 12.1. Media Type Registration | |||
| 13.1. Media type registration | 12.2. FLAC Application Metadata Block IDs Registry | |||
| 13.2. Application ID Registry | 13. References | |||
| 14. Acknowledgments | 13.1. Normative References | |||
| 15. References | 13.2. Informative References | |||
| 15.1. Normative References | Appendix A. Numerical Considerations | |||
| 15.2. Informative References | A.1. Determining the Necessary Data Type Size | |||
| Appendix A. Numerical considerations | A.2. Stereo Decorrelation | |||
| A.1. Determining the necessary data type size | A.3. Prediction | |||
| A.2. Stereo decorrelation | A.4. Residual | |||
| A.3. Prediction | A.5. Rice Coding | |||
| A.4. Residual | Appendix B. Past Format Changes | |||
| A.5. Rice coding | B.1. Addition of Blocking Strategy Bit | |||
| Appendix B. Past format changes | B.2. Restriction of Encoded Residual Samples | |||
| B.1. Addition of blocking strategy bit | B.3. Addition of 5-Bit Rice Parameters | |||
| B.2. Restriction of encoded residual samples | B.4. Restriction of LPC Shift to Non-negative Values | |||
| B.3. Addition of 5-bit Rice parameters | Appendix C. Interoperability Considerations | |||
| B.4. Restriction of LPC shift to non-negative values | C.1. Features outside of the Streamable Subset | |||
| Appendix C. Interoperability considerations | C.2. Variable Block Size | |||
| C.1. Features outside of the streamable subset | C.3. 5-Bit Rice Parameters | |||
| C.2. Variable block size | C.4. Rice Escape Code | |||
| C.3. 5-bit Rice parameter | C.5. Uncommon Block Size | |||
| C.4. Rice escape code | C.6. Uncommon Bit Depth | |||
| C.5. Uncommon block size | C.7. Multi-Channel Audio and Uncommon Sample Rates | |||
| C.6. Uncommon bit depth | C.8. Changing Audio Properties Mid-Stream | |||
| C.7. Multi-channel audio and uncommon sample rates | Appendix D. Examples | |||
| C.8. Changing audio properties mid-stream | D.1. Decoding Example 1 | |||
| Appendix D. Examples | D.1.1. Example File 1 in Hexadecimal Representation | |||
| D.1. Decoding example 1 | D.1.2. Example File 1 in Binary Representation | |||
| D.1.1. Example file 1 in hexadecimal representation | D.1.3. Signature and Streaminfo | |||
| D.1.2. Example file 1 in binary representation | D.1.4. Audio Frames | |||
| D.1.3. Signature and streaminfo | D.2. Decoding Example 2 | |||
| D.1.4. Audio frames | D.2.1. Example File 2 in Hexadecimal Representation | |||
| D.2. Decoding example 2 | D.2.2. Example File 2 in Binary Representation (Only Audio | |||
| D.2.1. Example file 2 in hexadecimal representation | Frames) | |||
| D.2.2. Example file 2 in binary representation (only audio | D.2.3. Streaminfo Metadata Block | |||
| frames) | D.2.4. Seek Table | |||
| D.2.3. Streaminfo metadata block | D.2.5. Vorbis Comment | |||
| D.2.4. Seektable | D.2.6. Padding | |||
| D.2.5. Vorbis comment | D.2.7. First Audio Frame | |||
| D.2.6. Padding | D.2.8. Second Audio Frame | |||
| D.2.7. First audio frame | D.2.9. MD5 Checksum Verification | |||
| D.2.8. Second audio frame | D.3. Decoding Example 3 | |||
| D.2.9. MD5 checksum verification | D.3.1. Example File 3 in Hexadecimal Representation | |||
| D.3. Decoding example 3 | D.3.2. Example File 3 in Binary Representation (Only Audio | |||
| D.3.1. Example file 3 in hexadecimal representation | Frame) | |||
| D.3.2. Example file 3 in binary representation (only audio | D.3.3. Streaminfo Metadata Block | |||
| frame) | D.3.4. Audio Frame | |||
| D.3.3. Streaminfo metadata block | Acknowledgments | |||
| D.3.4. Audio frame | Authors' Addresses | |||
| Authors' Addresses | ||||
| 1. Introduction | 1. Introduction | |||
| This document defines the FLAC format and its streamable subset. | This document defines the Free Lossless Audio Codec (FLAC) format and | |||
| FLAC files and streams can code for pulse-code modulated (PCM) audio | its streamable subset. FLAC files and streams can code for pulse- | |||
| with 1 to 8 channels, sample rates from 1 up to 1048575 hertz and bit | code modulated (PCM) audio with 1 to 8 channels, sample rates from 1 | |||
| depths from 4 up to 32 bits. Most tools for coding to and decoding | to 1048575 hertz, and bit depths from 4 to 32 bits. Most tools for | |||
| from the FLAC format have been optimized for CD-audio, which is PCM | coding to and decoding from the FLAC format have been optimized for | |||
| audio with 2 channels, a sample rate of 44.1 kHz, and a bit depth of | CD-audio, which is PCM audio with 2 channels, a sample rate of 44.1 | |||
| 16 bits. | kHz, and a bit depth of 16 bits. | |||
| FLAC is able to achieve lossless compression because samples in audio | FLAC is able to achieve lossless compression because samples in audio | |||
| signals tend to be highly correlated with their close neighbors. In | signals tend to be highly correlated with their close neighbors. In | |||
| contrast with general-purpose compressors, which often use | contrast with general-purpose compressors, which often use | |||
| dictionaries, do run-length coding, or exploit long-term repetition, | dictionaries, do run-length coding, or exploit long-term repetition, | |||
| FLAC removes redundancy solely in the very short term, looking back | FLAC removes redundancy solely in the very short term, looking back | |||
| at at most 32 samples. | at 32 samples at most. | |||
| The coding methods provided by the FLAC format work best on PCM audio | The coding methods provided by the FLAC format work best on PCM audio | |||
| signals, of which the samples have a signed representation and are | signals with samples that have a signed representation and are | |||
| centered around zero. Audio signals in which samples have an | centered around zero. Audio signals in which samples have an | |||
| unsigned representation must be transformed to a signed | unsigned representation must be transformed to a signed | |||
| representation as described in this document in order to achieve | representation as described in this document in order to achieve | |||
| reasonable compression. The FLAC format is not suited for | reasonable compression. The FLAC format is not suited for | |||
| compressing audio that is not PCM. | compressing audio that is not PCM. | |||
| 2. Notation and Conventions | 2. Notation and Conventions | |||
| The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | |||
| "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and | "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and | |||
| "OPTIONAL" in this document are to be interpreted as described in BCP | "OPTIONAL" in this document are to be interpreted as described in | |||
| 14 [RFC2119] [RFC8174] when, and only when, they appear in all | BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all | |||
| capitals, as shown here. | capitals, as shown here. | |||
| Values expressed as u(n) represent unsigned big-endian integer using | Values expressed as u(n) represent an unsigned big-endian integer | |||
| n bits. Values expressed as s(n) represent signed big-endian integer | using n bits. Values expressed as s(n) represent a signed big-endian | |||
| using n bits, signed two's complement. Where necessary n is | integer using n bits, signed two's complement. Where necessary, n is | |||
| expressed as an equation using * (multiplication), / (division), + | expressed as an equation using * (multiplication), / (division), + | |||
| (addition), or - (subtraction). An inclusive range of the number of | (addition), or - (subtraction). An inclusive range of the number of | |||
| bits expressed is represented with an ellipsis, such as u(m...n). | bits expressed is represented with an ellipsis, such as u(m...n). | |||
| All shifts mentioned in this document are arithmetic shifts. | ||||
| While the FLAC format can store digital audio as well as other | While the FLAC format can store digital audio as well as other | |||
| digital signals, this document uses terminology specific to digital | digital signals, this document uses terminology specific to digital | |||
| audio. The use of more generic terminology was deemed less clear, so | audio. The use of more generic terminology was deemed less clear, so | |||
| a reader interested in non-audio use of the FLAC format is expected | a reader interested in non-audio use of the FLAC format is expected | |||
| to make the translation from audio-specific terms to more generic | to make the translation from audio-specific terms to more generic | |||
| terminology. | terminology. | |||
| 3. Definitions | 3. Definitions | |||
| * *Lossless compression*: reducing the amount of computer storage | *Lossless compression*: Reducing the amount of computer storage | |||
| space needed to store data without needing to remove or | space needed to store data without needing to remove or | |||
| irreversibly alter any of this data in doing so. In other words, | irreversibly alter any of this data in doing so. In other words, | |||
| decompressing losslessly compressed information returns exactly | decompressing losslessly compressed information returns exactly | |||
| the original data. | the original data. | |||
| * *Lossy compression*: like lossless compression, but instead | *Lossy compression*: Like lossless compression, but instead | |||
| removing, irreversibly altering, or only approximating information | removing, irreversibly altering, or only approximating information | |||
| for the purpose of further reducing the amount of computer storage | for the purpose of further reducing the amount of computer storage | |||
| space needed. In other words, decompressing lossy compressed | space needed. In other words, decompressing lossy compressed | |||
| information returns an approximation of the original data. | information returns an approximation of the original data. | |||
| * *Block*: A (short) section of linear pulse-code modulated audio | *Block*: A (short) section of linear PCM audio with one or more | |||
| with one or more channels. | channels. | |||
| * *Subblock*: All samples within a corresponding block for one | *Subblock*: All samples within a corresponding block for one | |||
| channel. One or more subblocks form a block, and all subblocks in | channel. One or more subblocks form a block, and all subblocks in | |||
| a certain block contain the same number of samples. | a certain block contain the same number of samples. | |||
| * *Frame*: A frame header, one or more subframes, and a frame | *Frame*: A frame header, one or more subframes, and a frame footer. | |||
| footer. It encodes the contents of a corresponding block. | It encodes the contents of a corresponding block. | |||
| * *Subframe*: An encoded subblock. All subframes within a frame | *Subframe*: An encoded subblock. All subframes within a frame code | |||
| code for the same number of samples. When interchannel | for the same number of samples. When interchannel decorrelation | |||
| decorrelation is used, a subframe can correspond to either the | is used, a subframe can correspond to either the (per-sample) | |||
| (per-sample) average of two subblocks or the (per-sample) | average of two subblocks or the (per-sample) difference between | |||
| difference between two subblocks, instead of to a subblock | two subblocks, instead of to a subblock directly; see Section 4.2. | |||
| directly, see Section 4.2. | ||||
| * *Interchannel samples*: A sample count that applies to all | *Interchannel samples*: A sample count that applies to all channels. | |||
| channels. For example, one second of 44.1 kHz audio has 44100 | For example, one second of 44.1 kHz audio has 44100 interchannel | |||
| interchannel samples, meaning each channel has that number of | samples, meaning each channel has that number of samples. | |||
| samples. | ||||
| * *Block size*: The number of interchannel samples contained in a | *Block size*: The number of interchannel samples contained in a | |||
| block or coded in a frame. | block or coded in a frame. | |||
| * *Bit depth* or *bits per sample*: the number of bits used to | *Bit depth* or *bits per sample*: The number of bits used to contain | |||
| contain each sample. This MUST be the same for all subblocks in a | each sample. This MUST be the same for all subblocks in a block | |||
| block but MAY be different for different subframes in a frame | but MAY be different for different subframes in a frame because of | |||
| because of interchannel decorrelation. (See Section 4.2 for | interchannel decorrelation. (See Section 4.2 for details on | |||
| details on interchannel decorrelation) | interchannel decorrelation.) | |||
| * *Predictor*: a model used to predict samples in an audio signal | *Predictor*: A model used to predict samples in an audio signal | |||
| based on past samples. FLAC uses such predictors to remove | based on past samples. FLAC uses such predictors to remove | |||
| redundancy in a signal in order to be able to compress it. | redundancy in a signal in order to be able to compress it. | |||
| * *Linear predictor*: a predictor using linear prediction (see | *Linear predictor*: A predictor using linear prediction (see | |||
| [LinearPrediction]). This is also called *linear predictive | [LinearPrediction]). This is also called *linear predictive | |||
| coding (LPC)*. With a linear predictor, each prediction is a | coding (LPC)*. With a linear predictor, each prediction is a | |||
| linear combination of past samples, hence the name. A linear | linear combination of past samples (hence the name). A linear | |||
| predictor has a causal discrete-time finite impulse response (see | predictor has a causal discrete-time finite impulse response (see | |||
| [FIR]). | [FIR]). | |||
| * *Muxing*: short for multiplexing, combining several streams or | *Fixed predictor*: A linear predictor in which the model parameters | |||
| files into a single stream or file. In the context of this | are the same across all FLAC files and thus do not need to be | |||
| document, muxing more specifically refers to embedding a FLAC | stored. | |||
| stream in a container as described in Section 10. | ||||
| * *Fixed predictor*: a linear predictor in which the model | *Predictor order*: The number of past samples that a predictor uses. | |||
| parameters are the same across all FLAC files, and thus do not | For example, a 4th order predictor uses the 4 samples directly | |||
| need to be stored. | preceding a certain sample to predict it. In FLAC, samples used | |||
| in a predictor are always consecutive and are always the samples | ||||
| directly before the sample that is being predicted. | ||||
| * *Predictor order*: the number of past samples that a predictor | *Residual*: The audio signal that remains after a predictor has been | |||
| uses. For example, a 4th order predictor uses the 4 samples | subtracted from a subblock. If the predictor has been able to | |||
| directly preceding a certain sample to predict it. In FLAC, | remove redundancy from the signal, the samples of the remaining | |||
| samples used in a predictor are always consecutive, and are always | signal (the *residual samples*) will have, on average, a numerical | |||
| the samples directly before the sample that is being predicted. | value closer to zero than the original signal. | |||
| * *Residual*: The audio signal that remains after a predictor has | *Rice code*: A variable-length code (see [VarLengthCode]). It uses | |||
| been subtracted from a subblock. If the predictor has been able | a short code for samples close to zero and a progressively longer | |||
| to remove redundancy from the signal, the samples of the remaining | code for samples further away from zero. This makes use of the | |||
| signal (the *residual samples*) will have, on average, a smaller | observation that residual samples are often close to zero. | |||
| numerical value than the original signal. | ||||
| * *Rice code*: A variable-length code (see [VarLengthCode]) that | *Muxing*: Short for multiplexing. Combining several streams or | |||
| compresses data by making use of the observation that, after using | files into a single stream or file. In the context of this | |||
| an effective predictor, most residual samples are closer to zero | document, muxing specifically refers to embedding a FLAC stream in | |||
| than the original samples, while still allowing for a small part | a container as described in Section 10. | |||
| of the samples to be much larger. | ||||
| 4. Conceptual overview | 4. Conceptual Overview | |||
| Similar to many other audio coders, a FLAC file is encoded following | Similar to many other audio coders, a FLAC file is encoded following | |||
| the steps below. On decoding a FLAC file, these steps are undone in | the steps below. To decode a FLAC file, these steps are performed in | |||
| reverse order, i.e., from bottom to top. | reverse order, i.e., from bottom to top. | |||
| * *Blocking* (see Section 4.1). The input is split up into many | 1. *Blocking* (see Section 4.1). The input is split up into many | |||
| contiguous blocks. | contiguous blocks. | |||
| * *Interchannel Decorrelation* (see Section 4.2). In the case of | 2. *Interchannel Decorrelation* (see Section 4.2). In the case of | |||
| stereo streams, the FLAC format allows for transforming the left- | stereo streams, the FLAC format allows for transforming the left- | |||
| right signal into a mid-side signal, a left-side signal or a side- | right signal into a mid-side signal, a left-side signal, or a | |||
| right signal to remove redundancy between channels. Choosing | side-right signal to remove redundancy between channels. | |||
| between any of these transformations is done independently for | Choosing between any of these transformations is done | |||
| each block. | independently for each block. | |||
| * *Prediction* (see Section 4.3). To remove redundancy in a signal, | 3. *Prediction* (see Section 4.3). To remove redundancy in a | |||
| a predictor is stored for each subblock or its transformation as | signal, a predictor is stored for each subblock or its | |||
| formed in the previous step. A predictor consists of a simple | transformation as formed in the previous step. A predictor | |||
| mathematical description that can be used, as the name implies, to | consists of a simple mathematical description that can be used, | |||
| predict a certain sample from the samples that preceded it. As | as the name implies, to predict a certain sample from the samples | |||
| this prediction is rarely exact, the error of this prediction is | that preceded it. As this prediction is rarely exact, the error | |||
| passed on to the next stage. The predictor of each subblock is | of this prediction is passed on to the next stage. The predictor | |||
| completely independent from other subblocks. Since the methods of | of each subblock is completely independent from other subblocks. | |||
| prediction are known to both the encoder and decoder, only the | Since the methods of prediction are known to both the encoder and | |||
| parameters of the predictor need to be included in the compressed | decoder, only the parameters of the predictor need to be included | |||
| stream. If no usable predictor can be found for a certain | in the compressed stream. If no usable predictor can be found | |||
| subblock, the signal is stored uncompressed and the next stage is | for a certain subblock, the signal is stored uncompressed, and | |||
| skipped. | the next stage is skipped. | |||
| * *Residual Coding* (see Section 4.4). As the predictor does not | 4. *Residual Coding* (see Section 4.4). As the predictor does not | |||
| describe the signal exactly, the difference between the original | describe the signal exactly, the difference between the original | |||
| signal and the predicted signal (called the error or residual | signal and the predicted signal (called the error or residual | |||
| signal) is coded losslessly. If the predictor is effective, the | signal) is coded losslessly. If the predictor is effective, the | |||
| residual signal will require fewer bits per sample than the | residual signal will require fewer bits per sample than the | |||
| original signal. FLAC uses Rice coding, a subset of Golomb | original signal. FLAC uses Rice coding, a subset of Golomb | |||
| coding, with either 4-bit or 5-bit parameters to code the residual | coding, with either 4-bit or 5-bit parameters to code the | |||
| signal. | residual signal. | |||
| In addition, FLAC specifies a metadata system (see Section 8), which | In addition, FLAC specifies a metadata system (see Section 8) that | |||
| allows arbitrary information about the stream to be included at the | allows arbitrary information about the stream to be included at the | |||
| beginning of the stream. | beginning of the stream. | |||
| 4.1. Blocking | 4.1. Blocking | |||
| The block size used for audio data has a direct effect on the | The block size used for audio data has a direct effect on the | |||
| compression ratio. If the block size is too small, the resulting | compression ratio. If the block size is too small, the resulting | |||
| large number of frames means that a disproportionate amount of bytes | large number of frames means that a disproportionate number of bytes | |||
| will be spent on frame headers. If the block size is too large, the | will be spent on frame headers. If the block size is too large, the | |||
| characteristics of the signal may vary so much that the encoder will | characteristics of the signal may vary so much that the encoder will | |||
| be unable to find a good predictor. In order to simplify encoder/ | be unable to find a good predictor. In order to simplify encoder/ | |||
| decoder design, FLAC imposes a minimum block size of 16 samples, | decoder design, FLAC imposes a minimum block size of 16 samples, | |||
| except for the last block, and a maximum block size of 65535 samples. | except for the last block, and a maximum block size of 65535 samples. | |||
| The last block is allowed to be smaller than 16 samples to be able to | The last block is allowed to be smaller than 16 samples to be able to | |||
| match the length of the encoded audio without using padding. | match the length of the encoded audio without using padding. | |||
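As an illustration of the block size limits described above, the following minimal sketch splits a stream of a given length into blocks of one fixed size. The function name `split_into_blocks` is hypothetical, and a real encoder is free to choose block sizes differently; only the limits of 16 and 65535 samples and the shorter last block come from the text.

```python
MIN_BLOCK_SIZE = 16      # minimum stated in the text (last block excepted)
MAX_BLOCK_SIZE = 65535   # maximum stated in the text


def split_into_blocks(total_samples, block_size):
    """Return the list of block lengths for a constant block size stream."""
    if not MIN_BLOCK_SIZE <= block_size <= MAX_BLOCK_SIZE:
        raise ValueError("block size outside the range allowed by the format")
    full_blocks, remainder = divmod(total_samples, block_size)
    blocks = [block_size] * full_blocks
    if remainder:
        # Only the last block may be shorter, so no padding is needed.
        blocks.append(remainder)
    return blocks
```

For example, 10000 samples with a block size of 4096 give two full blocks and a final block of 1808 samples.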
| While the block size does not have to be constant in a FLAC file, it | While the block size does not have to be constant in a FLAC file, it | |||
| is often difficult to find the optimal arrangement of block sizes for | is often difficult to find the optimal arrangement of block sizes for | |||
| maximum compression. Because of this, the FLAC format explicitly | maximum compression. Because of this, a FLAC stream has explicitly | |||
| stores whether a file has a constant or a variable block size | either a constant or variable block size throughout and stores a | |||
| throughout the stream, and stores a block number instead of a sample | block number instead of a sample number to slightly improve | |||
| number to slightly improve compression if a stream has a constant | compression if a stream has a constant block size. | |||
| block size. | ||||
| 4.2. Interchannel Decorrelation | 4.2. Interchannel Decorrelation | |||
| In many audio files, channels are correlated. The FLAC format can | Channels are correlated in many audio files. The FLAC format can | |||
| exploit this correlation in stereo files by not directly coding | exploit this correlation in stereo files by coding an average of all | |||
| subblocks into subframes, but instead coding an average of all | ||||
| samples in both subblocks (a mid channel) or the difference between | samples in both subblocks (a mid channel) or the difference between | |||
| all samples in both subblocks (a side channel). The following | all samples in both subblocks (a side channel) instead of directly | |||
| combinations are possible: | coding subblocks into subframes. The following combinations are | |||
| possible: | ||||
| * *Independent*. All channels are coded independently. All non- | * *Independent*. All channels are coded independently. All non- | |||
| stereo files MUST be encoded this way. | stereo files MUST be encoded this way. | |||
| * *Mid-side*. A left and right subblock are converted to mid and | * *Mid-side*. A left and right subblock are converted to mid and | |||
| side subframes. To calculate a sample for a mid subframe, the | side subframes. To calculate a sample for a mid subframe, the | |||
| corresponding left and right samples are summed and the result is | corresponding left and right samples are summed, and the result is | |||
| shifted right by 1 bit. To calculate a sample for a side | shifted right by 1 bit. To calculate a sample for a side | |||
| subframe, the corresponding right sample is subtracted from the | subframe, the corresponding right sample is subtracted from the | |||
| corresponding left sample. On decoding, all mid channel samples | corresponding left sample. On decoding, all mid channel samples | |||
| have to be shifted left by 1 bit. Also, if a side channel sample | have to be shifted left by 1 bit. Also, if a side channel sample | |||
| is odd, 1 has to be added to the corresponding mid channel sample | is odd, 1 has to be added to the corresponding mid channel sample | |||
| after it has been shifted left by one bit. To reconstruct the | after it has been shifted left by 1 bit. To reconstruct the left | |||
| left channel, the corresponding samples in the mid and side | channel, the corresponding samples in the mid and side subframes | |||
| subframes are added and the result shifted right by 1 bit, while | are added and the result shifted right by 1 bit. For the right | |||
| for the right channel the side channel has to be subtracted from | channel, the side channel has to be subtracted from the mid | |||
| the mid channel and the result shifted right by 1 bit. | channel and the result shifted right by 1 bit. | |||
| * *Left-side*. The left subblock is coded and the left and right | * *Left-side*. The left subblock is coded, and the left and right | |||
| subblocks are used to code a side subframe. The side subframe is | subblocks are used to code a side subframe. The side subframe is | |||
| constructed in the same way as for mid-side. To decode, the right | constructed in the same way as for mid-side. To decode, the right | |||
| subblock is restored by subtracting the samples in the side | subblock is restored by subtracting the samples in the side | |||
| subframe from the corresponding samples in the the left subframe. | subframe from the corresponding samples in the left subframe. | |||
| * *Side-right*. The left and right subblocks are used to code a side | * *Side-right*. The left and right subblocks are used to code a side | |||
| subframe and the right subblock is coded. The side subframe is | subframe, and the right subblock is coded. The side subframe is | |||
| constructed in the same way as for mid-side. To decode, the left | constructed in the same way as for mid-side. To decode, the left | |||
| subblock is restored by adding the samples in the side subframe to | subblock is restored by adding the samples in the side subframe to | |||
| the corresponding samples in the right subframe. | the corresponding samples in the right subframe. | |||
| The side channel needs one extra bit of bit depth as the subtraction | The side channel needs one extra bit of bit depth, as the subtraction | |||
| can produce sample values twice as large as the maximum possible in | can produce sample values twice as large as the maximum possible in | |||
| any given bit depth. The mid channel in mid-side stereo does not | any given bit depth. The mid channel in mid-side stereo does not | |||
| need one extra bit, as it is shifted right one bit. The right shift | need one extra bit, as it is shifted right 1 bit. The right shift of | |||
| of the mid channel does not lead to lossy behavior, because an odd | the mid channel does not lead to lossy behavior because an odd sample | |||
| sample in the mid subframe must always be accompanied by a | in the mid subframe must always be accompanied by a corresponding odd | |||
| corresponding odd sample in the side subframe, which means the lost | sample in the side subframe, which means the lost least-significant | |||
| least-significant bit can be restored by taking it from the sample in | bit can be restored by taking it from the sample in the side | |||
| the side subframe. | subframe. | |||
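The mid-side arithmetic described above can be sketched as follows. This is an illustration only, not the reference implementation; the function names are hypothetical, and samples are assumed to be plain Python integers, for which `>>` is an arithmetic (sign-preserving) shift.

```python
def encode_mid_side(left, right):
    """Convert left/right subblocks to mid and side sample lists."""
    mid = [(l + r) >> 1 for l, r in zip(left, right)]   # sum, shifted right by 1 bit
    side = [l - r for l, r in zip(left, right)]         # left minus right
    return mid, side


def decode_mid_side(mid, side):
    """Restore left/right samples from mid and side sample lists."""
    left, right = [], []
    for m, s in zip(mid, side):
        # Shift mid left by 1 bit, then add 1 if the side sample is odd.
        m = (m << 1) + (s & 1)
        left.append((m + s) >> 1)
        right.append((m - s) >> 1)
    return left, right
```

The lost least-significant bit of the sum is recovered from the side sample because the sum and the difference of two integers always have the same parity.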
| 4.3. Prediction | 4.3. Prediction | |||
| The FLAC format has four methods for modeling the input signal: | The FLAC format has four methods for modeling the input signal: | |||
| 1. *Verbatim*. Samples are stored directly, without any modeling. | 1. *Verbatim*. Samples are stored directly, without any modeling. | |||
| This method is used for inputs with little correlation, like | This method is used for inputs with little correlation. Since | |||
| white noise. Since the raw signal is not actually passed through | the raw signal is not actually passed through the residual coding | |||
| the residual coding stage (it is added to the stream 'verbatim'), | stage (it is added to the stream "verbatim"), this method is | |||
| this method is different from using a zero-order fixed predictor. | different from using a zero-order fixed predictor. | |||
| 2. *Constant*. A single sample value is stored. This method is used | 2. *Constant*. A single sample value is stored. This method is used | |||
| whenever a signal is pure DC ("digital silence"), i.e., a | whenever a signal is pure DC ("digital silence"), i.e., a | |||
| constant value throughout. | constant value throughout. | |||
| 3. *Fixed predictor*. Samples are predicted with one of five fixed | 3. *Fixed predictor*. Samples are predicted with one of five fixed | |||
| (i.e., predefined) predictors, and the error of this prediction | (i.e., predefined) predictors, and the error of this prediction | |||
| is processed by the residual coder. These fixed predictors are | is processed by the residual coder. These fixed predictors are | |||
| well suited for predicting simple waveforms. Since the | well suited for predicting simple waveforms. Since the | |||
| predictors are fixed, no predictor coefficients are stored. From | predictors are fixed, no predictor coefficients are stored. From | |||
| skipping to change at page 10, line 18 ¶ | skipping to change at line 431 ¶ | |||
| predictor, using a generic linear predictor adds overhead as | predictor, using a generic linear predictor adds overhead as | |||
| predictor coefficients need to be stored. Therefore, this method | predictor coefficients need to be stored. Therefore, this method | |||
| of prediction is best suited for predicting more complex | of prediction is best suited for predicting more complex | |||
| waveforms, where the added overhead is offset by space savings in | waveforms, where the added overhead is offset by space savings in | |||
| the residual coding stage resulting from more accurate | the residual coding stage resulting from more accurate | |||
| prediction. A linear predictor in FLAC has two parameters | prediction. A linear predictor in FLAC has two parameters | |||
| besides the predictor coefficients and the predictor order: the | besides the predictor coefficients and the predictor order: the | |||
| number of bits with which each coefficient is stored (the | number of bits with which each coefficient is stored (the | |||
| coefficient precision) and a prediction right shift. A | coefficient precision) and a prediction right shift. A | |||
| prediction is formed by taking the sum of multiplying each | prediction is formed by taking the sum of multiplying each | |||
| predictor coefficient with the corresponding past sample, and | predictor coefficient with the corresponding past sample and | |||
| dividing that sum by applying the specified right shift. For | dividing that sum by applying the specified right shift. For | |||
| more information, see Section 9.2.6. | more information, see Section 9.2.6. | |||
| A FLAC encoder is free to select any of the above methods to model | A FLAC encoder is free to select any of the above methods to model | |||
| the input. However, to ensure lossless coding, the following | the input. However, to ensure lossless coding, the following | |||
| exceptions apply: | exceptions apply: | |||
| * When the samples that need to be stored do not all have the same | * When the samples that need to be stored do not all have the same | |||
| value (i.e., the signal is not constant), a constant subframe | value (i.e., the signal is not constant), a constant subframe | |||
| cannot be used. | cannot be used. | |||
| * When an encoder is unable to find a fixed or linear predictor for | * When an encoder is unable to find a fixed or linear predictor for | |||
| which all residual samples are representable in 32-bit signed | which all residual samples are representable in 32-bit signed | |||
| integers as stated in Section 9.2.7, a verbatim subframe is used. | integers as stated in Section 9.2.7, a verbatim subframe is used. | |||
| For more information on fixed and linear predictors, see | For more information on fixed and linear predictors, see | |||
| [HPL-1999-144] and [robinson-tr156]. | [Lossless-Compression] and [Robinson-TR156]. | |||
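The linear prediction step described above (sum of coefficient times past sample, divided by applying a right shift) can be sketched as below. The function names are hypothetical, and the coefficient ordering (first coefficient applied to the most recent sample) is an assumption of this sketch; Section 9.2.6 is the authoritative description.

```python
def lpc_predict(history, coefficients, shift):
    """Predict the next sample from past samples (most recent sample last)."""
    # Assumption: coefficients[0] multiplies the most recent past sample.
    total = sum(c * history[-1 - i] for i, c in enumerate(coefficients))
    return total >> shift   # dividing by applying the prediction right shift


def lpc_residual(samples, coefficients, shift):
    """Return the prediction error for every sample after the warm-up samples."""
    order = len(coefficients)
    return [samples[n] - lpc_predict(samples[:n], coefficients, shift)
            for n in range(order, len(samples))]


def lpc_restore(warmup, residual, coefficients, shift):
    """Rebuild the samples from the warm-up samples and the residual."""
    samples = list(warmup)
    for r in residual:
        samples.append(lpc_predict(samples, coefficients, shift) + r)
    return samples
```

With coefficients `[2, -1]` and a shift of 0, a linearly rising signal such as `[1, 2, 3, 4, 5]` is predicted exactly, so the residual is all zeros.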
| 4.4. Residual Coding | 4.4. Residual Coding | |||
| If a subframe uses a predictor to approximate the audio signal, a | If a subframe uses a predictor to approximate the audio signal, a | |||
| residual is stored to 'correct' the approximation to the exact value. | residual is stored to "correct" the approximation to the exact value. | |||
| When an effective predictor is used, the average numerical value of | When an effective predictor is used, the average numerical value of | |||
| the residual samples is smaller than that of the samples before | the residual samples is smaller than that of the samples before | |||
| prediction. While having smaller values on average, it is possible | prediction. While having smaller values on average, it is possible | |||
| that a few 'outlier' residual samples are much larger than any of the | that a few "outlier" residual samples are much larger than any of the | |||
| original samples. Sometimes these outliers even exceed the range the | original samples. Sometimes these outliers even exceed the range | |||
| bit depth of the original audio offers. | that the bit depth of the original audio offers. | |||
| To be able to efficiently code such a stream of relatively small | To efficiently code such a stream of relatively small numbers with an | |||
| numbers with an occasional outlier, Rice coding (a subset of Golomb | occasional outlier, Rice coding (a subset of Golomb coding) is used. | |||
| coding) is used. Depending on how small the numbers are that have to | Depending on how small the numbers are that have to be coded, a Rice | |||
| be coded, a Rice parameter is chosen. The numerical value of each | parameter is chosen. The numerical value of each residual sample is | |||
| residual sample is split into two parts by dividing it by 2^(Rice | split into two parts by dividing it by 2^(Rice parameter), creating a | |||
| parameter), creating a quotient and a remainder. The quotient is | quotient and a remainder. The quotient is stored in unary form and | |||
| stored in unary form, the remainder in binary form. If indeed most | the remainder in binary form. If indeed most residual samples are | |||
| residual samples are close to zero and a suitable Rice parameter is | close to zero and a suitable Rice parameter is chosen, this form of | |||
| chosen, this form of coding, with a so-called variable-length code, | coding, with a so-called variable-length code, uses fewer bits than | |||
| uses fewer bits than the residual in unencoded form. | the residual in unencoded form. | |||
| As Rice codes can only handle unsigned numbers, signed numbers are | As Rice codes can only handle unsigned numbers, signed numbers are | |||
| zigzag encoded to a so-called folded residual. See Section 9.2.7 for | zigzag encoded to a so-called folded residual. See Section 9.2.7 for | |||
| a more thorough explanation. | a more thorough explanation. | |||
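A minimal sketch of coding a single residual sample follows. It is illustrative only: bits are represented as a text string of "0" and "1" characters, the function names are hypothetical, and the zigzag mapping (0, -1, 1, -2, ... mapped to 0, 1, 2, 3, ...) and the order "unary quotient first, then binary remainder" are assumptions of this sketch; Section 9.2.7 is the authoritative description.

```python
def fold(value):
    """Zigzag-fold a signed residual into an unsigned number."""
    return 2 * value if value >= 0 else -2 * value - 1


def unfold(folded):
    """Invert fold()."""
    return folded >> 1 if folded % 2 == 0 else -((folded + 1) >> 1)


def rice_encode(value, rice_parameter):
    """Return the Rice code of one residual sample as a string of bits."""
    folded = fold(value)
    quotient = folded >> rice_parameter                 # folded / 2^(Rice parameter)
    remainder = folded & ((1 << rice_parameter) - 1)
    unary = "0" * quotient + "1"                        # zero bits terminated by a one bit
    binary = format(remainder, "b").zfill(rice_parameter) if rice_parameter else ""
    return unary + binary


def rice_decode(bits, rice_parameter):
    """Decode one residual sample from the start of a bit string."""
    quotient = bits.index("1")                          # count the leading zero bits
    start = quotient + 1
    field = bits[start:start + rice_parameter]
    remainder = int(field, 2) if rice_parameter else 0
    return unfold((quotient << rice_parameter) | remainder)
```

With a Rice parameter of 2, the residual -3 folds to 5 and is coded in four bits as `0101`. A large outlier only lengthens the unary part, so small values stay cheap.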
| Quite often, the optimal Rice parameter varies over the course of a | Quite often, the optimal Rice parameter varies over the course of a | |||
| subframe. To accommodate this, the residual can be split up into | subframe. To accommodate this, the residual can be split up into | |||
| partitions, where each partition has its own Rice parameter. To keep | partitions, where each partition has its own Rice parameter. To keep | |||
| overhead and complexity low, the number of partitions used in a | overhead and complexity low, the number of partitions used in a | |||
| subframe is limited to powers of two. | subframe is limited to powers of two. | |||
| The FLAC format uses two forms of Rice coding, which only differ in | The FLAC format uses two forms of Rice coding, which only differ in | |||
| the number of bits used for encoding the Rice parameter, either 4 or | the number of bits used for encoding the Rice parameter, either 4 or | |||
| 5 bits. | 5 bits. | |||
| 5. Format principles | 5. Format Principles | |||
| FLAC has no format version information, but it does contain reserved | FLAC has no format version information, but it does contain reserved | |||
| space in several places. Future versions of the format MAY use this | space in several places. Future versions of the format MAY use this | |||
| reserved space safely without breaking the format of older streams. | reserved space safely without breaking the format of older streams. | |||
| Older decoders MAY choose to abort decoding when encountering data | Older decoders MAY choose to abort decoding when encountering data | |||
| encoded using methods they do not recognize. Apart from reserved | that is encoded using methods they do not recognize. Apart from | |||
| patterns, the format specifies forbidden patterns in certain places, | reserved patterns, the format specifies forbidden patterns in certain | |||
| meaning that the patterns MUST NOT appear in any bitstream. They are | places, meaning that the patterns MUST NOT appear in any bitstream. | |||
| listed in the following table. | They are listed in the following table. | |||
| +=========================================+=============+ | +=========================================+=============+ | |||
| | Description | Reference | | | Description | Reference | | |||
| +=========================================+=============+ | +=========================================+=============+ | |||
| | Metadata block type 127 | Section 8.1 | | | Metadata block type 127 | Section 8.1 | | |||
| +-----------------------------------------+-------------+ | +-----------------------------------------+-------------+ | |||
| | Minimum and maximum block sizes smaller | Section 8.2 | | | Minimum and maximum block sizes smaller | Section 8.2 | | |||
| | than 16 in streaminfo metadata block | | | | than 16 in streaminfo metadata block | | | |||
| +-----------------------------------------+-------------+ | +-----------------------------------------+-------------+ | |||
| | Sample rate bits 0b1111 | Section | | | Sample rate bits 0b1111 | Section | | |||
| | | 9.1.2 | | | | 9.1.2 | | |||
| +-----------------------------------------+-------------+ | +-----------------------------------------+-------------+ | |||
| | Uncommon blocksize 65536 | Section | | | Uncommon block size 65536 | Section | | |||
| | | 9.1.6 | | | | 9.1.6 | | |||
| +-----------------------------------------+-------------+ | +-----------------------------------------+-------------+ | |||
| | Predictor coefficient precision bits | Section | | | Predictor coefficient precision bits | Section | | |||
| | 0b1111 | 9.2.6 | | | 0b1111 | 9.2.6 | | |||
| +-----------------------------------------+-------------+ | +-----------------------------------------+-------------+ | |||
| | Negative predictor right shift | Section | | | Negative predictor right shift | Section | | |||
| | | 9.2.6 | | | | 9.2.6 | | |||
| +-----------------------------------------+-------------+ | +-----------------------------------------+-------------+ | |||
| Table 1 | Table 1 | |||
| All numbers used in a FLAC bitstream are integers, there are no | All numbers used in a FLAC bitstream are integers; there are no | |||
| floating-point representations. All numbers are big-endian coded, | floating-point representations. All numbers are big-endian coded, | |||
| except the field lengths used in Vorbis comments (see Section 8.6), | except the field lengths used in Vorbis comments (see Section 8.6), | |||
| which are little-endian coded. This exception for Vorbis comments is | which are little-endian coded. This exception for Vorbis comments is | |||
| to keep as much commonality as possible with Vorbis comments as used | to keep as much commonality as possible with Vorbis comments as used | |||
| by the Vorbis codec (see [Vorbis]). All numbers are unsigned except | by the Vorbis codec (see [Vorbis]). All numbers are unsigned except | |||
| linear predictor coefficients, the linear prediction shift (see | linear predictor coefficients, the linear prediction shift (see | |||
| Section 9.2.6), and numbers that directly represent samples, which | Section 9.2.6), and numbers that directly represent samples, which | |||
| are signed. None of these restrictions apply to application metadata | are signed. None of these restrictions apply to application metadata | |||
| blocks or to Vorbis comment field contents. | blocks or to Vorbis comment field contents. | |||
| All samples encoded to and decoded from the FLAC format MUST be in a | All samples encoded to and decoded from the FLAC format MUST be in a | |||
| signed representation. | signed representation. | |||
| There are several ways to convert unsigned sample representations to | There are several ways to convert unsigned sample representations to | |||
| signed sample representations, but the coding methods provided by the | signed sample representations, but the coding methods provided by the | |||
| FLAC format work best on audio signals of which the numerical values | FLAC format work best on samples that have numerical values that are | |||
| of the samples are centered around zero, i.e., have no DC offset. In | centered around zero, i.e., have no DC offset. In most unsigned | |||
| most unsigned audio formats, signals are centered around halfway the | audio formats, signals are centered around halfway within the range | |||
| range of the unsigned integer type used. If that is the case, | of the unsigned integer type used. If that is the case, converting | |||
| converting sample representations by first copying the number to a | sample representations by first copying the number to a signed | |||
| signed integer with sufficient range and then subtracting half of the | integer with a sufficient range and then subtracting half of the | |||
| range of the unsigned integer type, results in a signal with samples | range of the unsigned integer type results in a signal with samples | |||
| centered around 0. | centered around 0. | |||
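The conversion described above can be sketched as follows, assuming the unsigned signal is centered halfway within the range of its integer type. The function name `unsigned_to_signed` is hypothetical.

```python
def unsigned_to_signed(samples, bit_depth):
    """Convert unsigned samples to signed samples centered around 0."""
    half_range = 1 << (bit_depth - 1)   # half of the range of the unsigned type
    # Python integers have unlimited range, so no explicit widening is needed here.
    return [s - half_range for s in samples]
```

For 8-bit input, the unsigned values 0, 128, and 255 become -128, 0, and 127.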
| Unary coding in a FLAC bitstream is done with zero bits terminated | Unary coding in a FLAC bitstream is done with zero bits terminated | |||
| with a one bit, e.g., the number 5 is coded unary as 0b000001. This | with a one bit, e.g., the number 5 is coded unary as 0b000001. This | |||
| prevents the frame sync code from appearing in unary coded numbers. | prevents the frame sync code from appearing in unary-coded numbers. | |||
| When a FLAC file contains data that is forbidden or otherwise not | When a FLAC file contains data that is forbidden or otherwise not | |||
| valid, decoder behavior is left unspecified. A decoder MAY choose to | valid, decoder behavior is left unspecified. A decoder MAY choose to | |||
| stop decoding upon encountering such data. Examples of such data are | stop decoding upon encountering such data. Examples of such data | |||
| include the following: | ||||
| * One or more decoded sample values exceed the range offered by the | * One or more decoded sample values exceed the range offered by the | |||
| bit depth as coded for that frame. E.g., in a frame with a bit | bit depth as coded for that frame. For example, in a frame with a | |||
| depth of 8 bits, any samples not in the inclusive range from -128 | bit depth of 8 bits, any samples not in the inclusive range from | |||
| to 127 are not valid. | -128 to 127 are not valid. | |||
| * The number of wasted bits (see Section 9.2.2) used by a subframe | * The number of wasted bits (see Section 9.2.2) used by a subframe | |||
| is such that the bit depth of that subframe (see Section 9.2.3 for | is such that the bit depth of that subframe (see Section 9.2.3 for | |||
| a description of subframe bit depth) equals zero or is negative. | a description of subframe bit depth) equals zero or is negative. | |||
| * A frame header CRC (see Section 9.1.8) or frame footer CRC (see | ||||
| Section 9.3) does not validate. | ||||
| * One of the forbidden bit patterns described in Table 1 above is | ||||
| used. | ||||
| 6. Format layout overview | * A frame header Cyclic Redundancy Check (CRC) (see Section 9.1.8) | |||
| or frame footer CRC (see Section 9.3) does not validate. | ||||
| * One of the forbidden bit patterns described in Table 1 is used. | ||||
| 6. Format Layout Overview | ||||
| A FLAC bitstream consists of the fLaC (i.e., 0x664C6143) marker at | A FLAC bitstream consists of the fLaC (i.e., 0x664C6143) marker at | |||
| the beginning of the stream, followed by a mandatory metadata block | the beginning of the stream, followed by a mandatory metadata block | |||
| (called the STREAMINFO block), any number of other metadata blocks, | (called the streaminfo metadata block), any number of other metadata | |||
| and then the audio frames. | blocks, and then the audio frames. | |||
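As a small illustration of the layout above, a reader can check for the stream marker before attempting to parse metadata blocks. The function name is hypothetical, and the sketch checks only the first four bytes; it does not validate anything that follows.

```python
FLAC_MARKER = b"fLaC"   # the four bytes 0x66 0x4C 0x61 0x43


def starts_with_flac_marker(data):
    """Return True if the byte string begins with the fLaC marker."""
    return data[:4] == FLAC_MARKER
```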
| FLAC supports 127 kinds of metadata blocks; currently, 7 kinds are | FLAC supports 127 kinds of metadata blocks; currently, 7 kinds are | |||
| defined in Section 8. | defined in Section 8. | |||
| The audio data is composed of one or more audio frames. Each frame | The audio data is composed of one or more audio frames. Each frame | |||
| consists of a frame header, which contains a sync code, information | consists of a frame header that contains a sync code, information | |||
| about the frame (like the block size, sample rate and number of | about the frame (like the block size, sample rate, and number of | |||
| channels), and an 8-bit CRC. The frame header also contains either | channels), and an 8-bit CRC. The frame header also contains either | |||
| the sample number of the first sample in the frame (for variable | the sample number of the first sample in the frame (for variable | |||
| block size streams), or the frame number (for fixed block size | block size streams) or the frame number (for fixed block size | |||
| streams). This allows for fast, sample-accurate seeking to be | streams). This allows for fast, sample-accurate seeking to be | |||
| performed. Following the frame header are encoded subframes, one for | performed. Following the frame header are encoded subframes, one for | |||
| each channel. The frame is then zero-padded to a byte boundary and | each channel. The frame is then zero-padded to a byte boundary and | |||
| finished with a frame footer containing a checksum for the frame. | finished with a frame footer containing a checksum for the frame. | |||
| Each subframe has its own header that specifies how the subframe is | Each subframe has its own header that specifies how the subframe is | |||
| encoded. | encoded. | |||
| In order to allow a decoder to start decoding at any place in the | In order to allow a decoder to start decoding at any place in the | |||
| stream, each frame starts with a byte-aligned 15-bit sync code. | stream, each frame starts with a byte-aligned 15-bit sync code. | |||
| However, since it is not guaranteed that the sync code does not | However, since it is not guaranteed that the sync code does not | |||
| skipping to change at page 14, line 24 ¶ | skipping to change at line 610 ¶ | |||
| frame header contains some basic information about the stream. This | frame header contains some basic information about the stream. This | |||
| information includes sample rate, bits per sample, number of | information includes sample rate, bits per sample, number of | |||
| channels, etc. Since the frame header is overhead, it has a direct | channels, etc. Since the frame header is overhead, it has a direct | |||
| effect on the compression ratio. To keep the frame header as small | effect on the compression ratio. To keep the frame header as small | |||
| as possible, FLAC uses lookup tables for the most commonly used | as possible, FLAC uses lookup tables for the most commonly used | |||
| values for frame properties. When a certain property has a value | values for frame properties. When a certain property has a value | |||
| that is not covered by the lookup table, the decoder is directed to | that is not covered by the lookup table, the decoder is directed to | |||
| find the value of that property (for example, the sample rate) at the | find the value of that property (for example, the sample rate) at the | |||
| end of the frame header or in the streaminfo metadata block. If a | end of the frame header or in the streaminfo metadata block. If a | |||
| frame header refers to the streaminfo metadata block, the file is not | frame header refers to the streaminfo metadata block, the file is not | |||
| 'streamable', see Section 7 for details. By using lookup tables, the | "streamable"; see Section 7 for details. By using lookup tables, the | |||
| file is streamable and the frame header size small for the most | file is streamable and the frame header size is small for the most | |||
| common forms of audio data. | common forms of audio data. | |||
| Individual subframes (one for each channel) are coded separately | Individual subframes (one for each channel) are coded separately | |||
| within a frame, and appear serially in the stream. In other words, | within a frame and appear serially in the stream. In other words, | |||
| the encoded audio data is NOT channel-interleaved. This reduces | the encoded audio data is NOT channel-interleaved. This reduces | |||
| decoder complexity at the cost of requiring larger decode buffers. | decoder complexity at the cost of requiring larger decode buffers. | |||
| Each subframe has its own header specifying the attributes of the | Each subframe has its own header specifying the attributes of the | |||
| subframe, like prediction method and order, residual coding | subframe, like prediction method and order, residual coding | |||
| parameters, etc. Each subframe header is followed by the encoded | parameters, etc. Each subframe header is followed by the encoded | |||
| audio data for that channel. | audio data for that channel. | |||
| 7. Streamable subset | 7. Streamable Subset | |||
| The FLAC format specifies a subset of itself as the FLAC streamable | The FLAC format specifies a subset of itself as the FLAC streamable | |||
| subset. The purpose of this is to ensure that any streams encoded | subset. The purpose of this is to ensure that any streams encoded | |||
| according to this subset are truly "streamable", meaning that a | according to this subset are truly "streamable", meaning that a | |||
| decoder that cannot seek within the stream can still pick up in the | decoder that cannot seek within the stream can still pick up in the | |||
| middle of the stream and start decoding. It also makes hardware | middle of the stream and start decoding. It also makes hardware | |||
| decoder implementations more practical by limiting the encoding | decoder implementations more practical by limiting the encoding | |||
| parameters in such a way that decoder buffer sizes and other resource | parameters in such a way that decoder buffer sizes and other resource | |||
| requirements can be easily determined. The streamable subset makes | requirements can be easily determined. The streamable subset makes | |||
| the following limitations on what MAY be used in the stream: | the following limitations on what MAY be used in the stream: | |||
| * The sample rate bits (see Section 9.1.2) in the frame header MUST | * The sample rate bits (see Section 9.1.2) in the frame header MUST | |||
| be 0b0001-0b1110, i.e., the frame header MUST NOT refer to the | be 0b0001-0b1110, i.e., the frame header MUST NOT refer to the | |||
| streaminfo metadata block to describe the sample rate. | streaminfo metadata block to describe the sample rate. | |||
| * The bit depth bits (see Section 9.1.4) in the frame header MUST be | * The bit depth bits (see Section 9.1.4) in the frame header MUST be | |||
| 0b001-0b111, i.e., the frame header MUST NOT refer to the | 0b001-0b111, i.e., the frame header MUST NOT refer to the | |||
| streaminfo metadata block to describe the bit depth. | streaminfo metadata block to describe the bit depth. | |||
| * The stream MUST NOT contain blocks with more than 16384 | * The stream MUST NOT contain blocks with more than 16384 | |||
| interchannel samples, i.e., the maximum block size must not be | interchannel samples, i.e., the maximum block size must not be | |||
| larger than 16384. | larger than 16384. | |||
| * Audio with a sample rate less than or equal to 48000 Hz MUST NOT | * Audio with a sample rate less than or equal to 48000 Hz MUST NOT | |||
| be contained in blocks with more than 4608 interchannel samples, | be contained in blocks with more than 4608 interchannel samples, | |||
| i.e., the maximum block size used for this audio must not be | i.e., the maximum block size used for this audio must not be | |||
| larger than 4608. | larger than 4608. | |||
| * Linear prediction subframes (see Section 9.2.6) containing audio | * Linear prediction subframes (see Section 9.2.6) containing audio | |||
| with a sample rate less than or equal to 48000 Hz MUST have a | with a sample rate less than or equal to 48000 Hz MUST have a | |||
| predictor order less than or equal to 12, i.e., the subframe type | predictor order less than or equal to 12, i.e., the subframe type | |||
| bits in the subframe header (see Section 9.2.1) MUST NOT be | bits in the subframe header (see Section 9.2.1) MUST NOT be | |||
| 0b101100-0b111111. | 0b101100-0b111111. | |||
| * The Rice partition order (see Section 9.2.7) MUST be less than or | * The Rice partition order (see Section 9.2.7) MUST be less than or | |||
| equal to 8. | equal to 8. | |||
| * The channel ordering MUST be equal to one defined in | * The channel ordering MUST be equal to one defined in | |||
| Section 9.1.3, i.e., the FLAC file MUST NOT need a | Section 9.1.3, i.e., the FLAC file MUST NOT need a | |||
| WAVEFORMATEXTENSIBLE_CHANNEL_MASK tag to describe the channel | WAVEFORMATEXTENSIBLE_CHANNEL_MASK tag to describe the channel | |||
| ordering. See Section 8.6.2 for details. | ordering. See Section 8.6.2 for details. | |||
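The numeric limits listed above can be checked mechanically. The following is a minimal sketch in Python, not part of either document version; the function name and its parameters are hypothetical, and it assumes the caller has already extracted the relevant values from the frame and subframe headers. The channel-ordering rule is not covered because it depends on metadata outside the frame.

```python
def streamable_subset_violations(sample_rate_bits, bit_depth_bits, block_size,
                                 sample_rate_hz, lpc_orders=(),
                                 rice_partition_orders=()):
    """Return a list of streamable-subset rules that one frame violates.

    All parameter names are illustrative; values are assumed to have been
    read from the frame header and subframe headers by the caller.
    """
    problems = []
    # Sample rate bits must be 0b0001-0b1110 (no reference to streaminfo).
    if not 0b0001 <= sample_rate_bits <= 0b1110:
        problems.append("sample rate bits outside 0b0001-0b1110")
    # Bit depth bits must be 0b001-0b111 (no reference to streaminfo).
    if not 0b001 <= bit_depth_bits <= 0b111:
        problems.append("bit depth bits outside 0b001-0b111")
    # Block size limit that applies to every stream.
    if block_size > 16384:
        problems.append("block size larger than 16384")
    # Tighter limits for sample rates up to and including 48000 Hz.
    if sample_rate_hz <= 48000:
        if block_size > 4608:
            problems.append("block size larger than 4608 at <= 48000 Hz")
        if any(order > 12 for order in lpc_orders):
            problems.append("linear predictor order above 12 at <= 48000 Hz")
    # Rice partition order limit.
    if any(order > 8 for order in rice_partition_orders):
        problems.append("Rice partition order above 8")
    return problems
```

An empty list means none of the checked limits is exceeded; it does not by itself prove that the stream is in the streamable subset.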
| 8. File-level metadata | 8. File-Level Metadata | |||
| At the start of a FLAC file or stream, following the fLaC ASCII file | At the start of a FLAC file or stream, following the fLaC ASCII file | |||
| signature, one or more metadata blocks MUST be present before any | signature, one or more metadata blocks MUST be present before any | |||
| audio frames appear. The first metadata block MUST be a streaminfo | audio frames appear. The first metadata block MUST be a streaminfo | |||
| block. | metadata block. | |||
| 8.1. Metadata block header | 8.1. Metadata Block Header | |||
| Each metadata block starts with a 4 byte header. The first bit in | Each metadata block starts with a 4-byte header. The first bit in | |||
| this header flags whether a metadata block is the last one: it is a 0 | this header flags whether a metadata block is the last one. It is 0 | |||
| when other metadata blocks follow, otherwise it is a 1. The 7 | when other metadata blocks follow; otherwise, it is 1. The 7 | |||
| remaining bits of the first header byte contain the type of the | remaining bits of the first header byte contain the type of the | |||
| metadata block as an unsigned number between 0 and 126 according to | metadata block as an unsigned number between 0 and 126, according to | |||
| the following table. A value of 127 (i.e., 0b1111111) is forbidden. | the following table. A value of 127 (i.e., 0b1111111) is forbidden. | |||
| The three bytes that follow code for the size of the metadata block | The three bytes that follow code for the size of the metadata block | |||
| in bytes, excluding the 4 header bytes, as an unsigned number coded | in bytes, excluding the 4 header bytes, as an unsigned number coded | |||
| big-endian. | big-endian. | |||
| +=========+======================================================+ | +=========+=======================================================+ | |||
| | Value | Metadata block type | | | Value | Metadata Block Type | | |||
| +=========+======================================================+ | +=========+=======================================================+ | |||
| | 0 | Streaminfo | | | 0 | Streaminfo | | |||
| +---------+------------------------------------------------------+ | +---------+-------------------------------------------------------+ | |||
| | 1 | Padding | | | 1 | Padding | | |||
| +---------+------------------------------------------------------+ | +---------+-------------------------------------------------------+ | |||
| | 2 | Application | | | 2 | Application | | |||
| +---------+------------------------------------------------------+ | +---------+-------------------------------------------------------+ | |||
| | 3 | Seektable | | | 3 | Seek table | | |||
| +---------+------------------------------------------------------+ | +---------+-------------------------------------------------------+ | |||
| | 4 | Vorbis comment | | | 4 | Vorbis comment | | |||
| +---------+------------------------------------------------------+ | +---------+-------------------------------------------------------+ | |||
| | 5 | Cuesheet | | | 5 | Cuesheet | | |||
| +---------+------------------------------------------------------+ | +---------+-------------------------------------------------------+ | |||
| | 6 | Picture | | | 6 | Picture | | |||
| +---------+------------------------------------------------------+ | +---------+-------------------------------------------------------+ | |||
| | 7 - 126 | reserved | | | 7 - 126 | Reserved | | |||
| +---------+------------------------------------------------------+ | +---------+-------------------------------------------------------+ | |||
| | 127 | forbidden, to avoid confusion with a frame sync code | | | 127 | Forbidden (to avoid confusion with a frame sync code) | | |||
| +---------+------------------------------------------------------+ | +---------+-------------------------------------------------------+ | |||
| Table 2 | Table 2 | |||
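As an illustration of the header layout described in Section 8.1, the sketch below decodes the 4 header bytes in Python. The function name is hypothetical, and raising an exception for type 127 is one possible decoder policy rather than something the text mandates.

```python
def parse_metadata_block_header(header: bytes):
    """Decode a 4-byte metadata block header.

    Returns (is_last, block_type, size), where size is the length of the
    block in bytes excluding the 4 header bytes.
    """
    if len(header) != 4:
        raise ValueError("a metadata block header is exactly 4 bytes")
    # First bit: 1 if this is the last metadata block, 0 otherwise.
    is_last = bool(header[0] & 0x80)
    # Remaining 7 bits of the first byte: the metadata block type.
    block_type = header[0] & 0x7F
    if block_type == 127:
        raise ValueError("metadata block type 127 is forbidden")
    # Next three bytes: block size as an unsigned big-endian number.
    size = int.from_bytes(header[1:4], "big")
    return is_last, block_type, size
```

For example, the bytes 0x00 0x00 0x00 0x22 describe a streaminfo block (type 0) of 34 bytes that is followed by further metadata blocks.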
| 8.2. Streaminfo | 8.2. Streaminfo | |||
| The streaminfo metadata block has information about the whole stream, | The streaminfo metadata block has information about the whole stream, | |||
| like sample rate, number of channels, total number of samples, etc. | such as sample rate, number of channels, total number of samples, | |||
| It MUST be present as the first metadata block in the stream. Other | etc. It MUST be present as the first metadata block in the stream. | |||
| metadata blocks MAY follow. There MUST be no more than one | Other metadata blocks MAY follow. There MUST be no more than one | |||
| streaminfo metadata block per FLAC stream. | streaminfo metadata block per FLAC stream. | |||
| If the streaminfo metadata block contains incorrect or incomplete | If the streaminfo metadata block contains incorrect or incomplete | |||
| information, decoder behavior is left unspecified (i.e., up to the | information, decoder behavior is left unspecified (i.e., it is up to | |||
| decoder implementation). A decoder MAY choose to stop further | the decoder implementation). A decoder MAY choose to stop further | |||
| decoding when the information supplied by the streaminfo metadata | decoding when the information supplied by the streaminfo metadata | |||
| block turns out to be incorrect or contains forbidden values. A | block turns out to be incorrect or contains forbidden values. A | |||
| decoder accepting information from the streaminfo block (most- | decoder accepting information from the streaminfo metadata block | |||
| significantly the maximum frame size, maximum block size, number of | (most significantly, the maximum frame size, maximum block size, | |||
| audio channels, number of bits per sample, and total number of | number of audio channels, number of bits per sample, and total number | |||
| samples) without doing further checks during decoding of audio frames | of samples) without doing further checks during decoding of audio | |||
| could be vulnerable to buffer overflows. See also Section 12. | frames could be vulnerable to buffer overflows. See also Section 11. | |||
| The following table describes the streaminfo metadata block, | The following table describes the streaminfo metadata block in order, | |||
| excluding the metadata block header. | excluding the metadata block header. | |||
| +========+=================================================+ | +========+=================================================+ | |||
| | Data | Description | | | Data | Description | | |||
| +========+=================================================+ | +========+=================================================+ | |||
| | u(16) | The minimum block size (in samples) used in the | | | u(16) | The minimum block size (in samples) used in the | | |||
| | | stream, excluding the last block. | | | | stream, excluding the last block. | | |||
| +--------+-------------------------------------------------+ | +--------+-------------------------------------------------+ | |||
| | u(16) | The maximum block size (in samples) used in the | | | u(16) | The maximum block size (in samples) used in the | | |||
| | | stream. | | | | stream. | | |||
| skipping to change at page 17, line 31 ¶ | skipping to change at line 757 ¶ | |||
| +--------+-------------------------------------------------+ | +--------+-------------------------------------------------+ | |||
| | u(20) | Sample rate in Hz. | | | u(20) | Sample rate in Hz. | | |||
| +--------+-------------------------------------------------+ | +--------+-------------------------------------------------+ | |||
| | u(3) | (number of channels)-1. FLAC supports from 1 | | | u(3) | (number of channels)-1. FLAC supports from 1 | | |||
| | | to 8 channels. | | | | to 8 channels. | | |||
| +--------+-------------------------------------------------+ | +--------+-------------------------------------------------+ | |||
| | u(5) | (bits per sample)-1. FLAC supports from 4 to | | | u(5) | (bits per sample)-1. FLAC supports from 4 to | | |||
| | | 32 bits per sample. | | | | 32 bits per sample. | | |||
| +--------+-------------------------------------------------+ | +--------+-------------------------------------------------+ | |||
| | u(36) | Total number of interchannel samples in the | | | u(36) | Total number of interchannel samples in the | | |||
| | | stream. A value of zero here means the number | | | | stream. A value of 0 here means the number of | | |||
| | | of total samples is unknown. | | | | total samples is unknown. | | |||
| +--------+-------------------------------------------------+ | +--------+-------------------------------------------------+ | |||
| | u(128) | MD5 checksum of the unencoded audio data. This | | | u(128) | MD5 checksum of the unencoded audio data. This | | |||
| | | allows the decoder to determine if an error | | | | allows the decoder to determine if an error | | |||
| | | exists in the audio data even when, despite the | | | | exists in the audio data even when, despite the | | |||
| | | error, the bitstream itself is valid. A value | | | | error, the bitstream itself is valid. A value | | |||
| | | of 0 signifies that the value is not known. | | | | of 0 signifies that the value is not known. | | |||
| +--------+-------------------------------------------------+ | +--------+-------------------------------------------------+ | |||
| Table 3 | Table 3 | |||
| The minimum block size and the maximum block size MUST be in the | The minimum block size and the maximum block size MUST be in the | |||
| 16-65535 range. The minimum block size MUST be equal to or less than | 16-65535 range. The minimum block size MUST be equal to or less than | |||
| the maximum block size. | the maximum block size. | |||
| Any frame but the last one MUST have a block size equal to or greater | Any frame but the last one MUST have a block size equal to or greater | |||
| than the minimum block size and MUST have a block size equal to or | than the minimum block size and MUST have a block size equal to or | |||
| lesser than the maximum block size. The last frame MUST have a block | less than the maximum block size. The last frame MUST have a block | |||
| size equal to or lesser than the maximum block size, it does not have | size equal to or less than the maximum block size; it does not have | |||
| to comply to the minimum block size because the block size of that | to comply to the minimum block size because the block size of that | |||
| frame must be able to accommodate the length of the audio data the | frame must be able to accommodate the length of the audio data the | |||
| stream contains. | stream contains. | |||
| If the minimum block size is equal to the maximum block size, the | If the minimum block size is equal to the maximum block size, the | |||
| file contains a fixed block size stream, as the minimum block size | file contains a fixed block size stream, as the minimum block size | |||
| excludes the last block. Note that in the case of a stream with a | excludes the last block. Note that in the case of a stream with a | |||
| variable block size, the actual maximum block size MAY be smaller | variable block size, the actual maximum block size MAY be smaller | |||
| than the maximum block size listed in the streaminfo block, and the | than the maximum block size listed in the streaminfo metadata block, | |||
| actual smallest block size excluding the last block MAY be larger | and the actual smallest block size excluding the last block MAY be | |||
| than the minimum block size listed in the streaminfo block. This is | larger than the minimum block size listed in the streaminfo metadata | |||
| because the encoder has to write these fields before receiving any | block. This is because the encoder has to write these fields before | |||
| input audio data, and cannot know beforehand what block sizes it will | receiving any input audio data and cannot know beforehand what block | |||
| use, only between what bounds these will be chosen. | sizes it will use, only between what bounds the block sizes will be | |||
| chosen. | ||||
| The sample rate MUST NOT be 0 when the FLAC file contains audio. A | The sample rate MUST NOT be 0 when the FLAC file contains audio. A | |||
| sample rate of 0 MAY be used when non-audio is represented. This is | sample rate of 0 MAY be used when non-audio is represented. This is | |||
| useful if data is encoded that is not along a time axis, or when the | useful if data is encoded that is not along a time axis or when the | |||
| sample rate of the data lies outside the range that FLAC can | sample rate of the data lies outside the range that FLAC can | |||
| represent in the streaminfo metadata block. If a sample rate of 0 is | represent in the streaminfo metadata block. If a sample rate of 0 is | |||
| used it is recommended to store the meaning of the encoded content in | used, it is recommended to store the meaning of the encoded content | |||
| a Vorbis comment field (see Section 8.6) or an application metadata | in a Vorbis comment field (see Section 8.6) or an application | |||
| block (see Section 8.4). This document does not define such | metadata block (see Section 8.4). This document does not define such | |||
| metadata. | metadata. | |||
| The MD5 checksum is computed by applying the MD5 message-digest | The MD5 checksum is computed by applying the MD5 message-digest | |||
| algorithm in [RFC1321]. The message to this algorithm consists of | algorithm in [RFC1321]. The message to this algorithm consists of | |||
| all the samples of all channels interleaved, represented in signed, | all the samples of all channels interleaved, represented in signed, | |||
| little-endian form. This interleaving is on a per-sample basis, so | little-endian form. This interleaving is on a per-sample basis, so | |||
| for a stereo file this means first the first sample of the first | for a stereo file, this means the first sample of the first channel, | |||
| channel, then the first sample of the second channel, then the second | then the first sample of the second channel, then the second sample | |||
| sample of the first channel etc. Before computing the checksum, all | of the first channel, etc. Before computing the checksum, all | |||
| samples must be byte-aligned. If the bit depth is not a whole number | samples must be byte-aligned. If the bit depth is not a whole number | |||
| of bytes, the value of each sample is sign extended to the next whole | of bytes, the value of each sample is sign-extended to the next whole | |||
| number of bytes. | number of bytes. | |||
| So, in the case of a 2-channel stream with 6-bit samples, bits will | In the case of a 2-channel stream with 6-bit samples, bits will be | |||
| be lined up as follows. | lined up as follows: | |||
| SSAAAAAASSBBBBBBSSCCCCCC | SSAAAAAASSBBBBBBSSCCCCCC | |||
| ^ ^ ^ ^ ^ ^ | ^ ^ ^ ^ ^ ^ | |||
| | | | | | Bits of 2nd sample of 1st channel | | | | | | Bits of 2nd sample of 1st channel | |||
| | | | | Sign extension bits of 2nd sample of 2nd channel | | | | | Sign extension bits of 2nd sample of 2nd channel | |||
| | | | Bits of 1st sample of 2nd channel | | | | Bits of 1st sample of 2nd channel | |||
| | | Sign extension bits of 1st sample of 2nd channel | | | Sign extension bits of 1st sample of 2nd channel | |||
| | Bits of 1st sample of 1st channel | | Bits of 1st sample of 1st channel | |||
| Sign extention bits of 1st sample of 1st channel | Sign extension bits of 1st sample of 1st channel | |||
| As another example, in the case of a 1-channel with 12-bit samples, | In the case of a 1-channel stream with 12-bit samples, bits are lined | |||
| bits are lined up as follows, showing the little-endian byte order | up in little-endian byte order as follows: | |||
| AAAAAAAASSSSAAAABBBBBBBBSSSSBBBB | AAAAAAAASSSSAAAABBBBBBBBSSSSBBBB | |||
| ^ ^ ^ ^ ^ ^ | ^ ^ ^ ^ ^ ^ | |||
| | | | | | Most-significant 4 bits of 2nd sample | | | | | | Most-significant 4 bits of 2nd sample | |||
| | | | | Sign extension bits of 2nd sample | | | | | Sign extension bits of 2nd sample | |||
| | | | Least-significant 8 bits of 2nd sample | | | | Least-significant 8 bits of 2nd sample | |||
| | | Most-significant 4 bits of 1st sample | | | Most-significant 4 bits of 1st sample | |||
| | Sign extension bits of 1st sample | | Sign extension bits of 1st sample | |||
| Least-significant 8 bits of 1st sample | Least-significant 8 bits of 1st sample | |||
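The byte layout shown in the two diagrams above can be reproduced with a short Python sketch. The function names are hypothetical, and the input is assumed to be one list of signed integer samples per channel, all of equal length.

```python
import hashlib


def pack_samples_for_md5(channels, bits_per_sample):
    """Build the MD5 input message from decoded samples.

    channels: one sequence of signed integers per channel (assumed equal
    length). Samples are interleaved per sample, sign-extended to a whole
    number of bytes, and written in little-endian byte order.
    """
    bytes_per_sample = (bits_per_sample + 7) // 8  # round up to whole bytes
    message = bytearray()
    for samples_at_same_time in zip(*channels):
        for sample in samples_at_same_time:
            # signed=True yields the two's complement sign extension.
            message += sample.to_bytes(bytes_per_sample, "little", signed=True)
    return bytes(message)


def audio_md5(channels, bits_per_sample):
    """Return the 16-byte MD5 digest of the unencoded audio data."""
    return hashlib.md5(pack_samples_for_md5(channels, bits_per_sample)).digest()
```

With 6-bit stereo samples, each sample occupies one byte; with 12-bit mono samples, each sample occupies two bytes with the least-significant byte first, matching the diagrams.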
| skipping to change at page 19, line 40 ¶ | skipping to change at line 852 ¶ | |||
| after encoding; the user can instruct the encoder to reserve a | after encoding; the user can instruct the encoder to reserve a | |||
| padding block of sufficient size so that when metadata is added, it | padding block of sufficient size so that when metadata is added, it | |||
| will simply overwrite the padding (which is relatively quick) instead | will simply overwrite the padding (which is relatively quick) instead | |||
| of having to insert it into the existing file (which would normally | of having to insert it into the existing file (which would normally | |||
| require rewriting the entire file). There MAY be one or more padding | require rewriting the entire file). There MAY be one or more padding | |||
| metadata blocks per FLAC stream. | metadata blocks per FLAC stream. | |||
| +======+======================================================+ | +======+======================================================+ | |||
| | Data | Description | | | Data | Description | | |||
| +======+======================================================+ | +======+======================================================+ | |||
| | u(n) | n '0' bits (n MUST be a multiple of 8, i.e., a whole | | | u(n) | n "0" bits (n MUST be a multiple of 8, i.e., a whole | | |||
| | | number of bytes, and MAY be zero). n is 8 times the | | | | number of bytes, and MAY be zero). n is 8 times the | | |||
| | | size described in the metadata block header. | | | | size described in the metadata block header. | | |||
| +------+------------------------------------------------------+ | +------+------------------------------------------------------+ | |||
| Table 4 | Table 4 | |||
| 8.4. Application | 8.4. Application | |||
| The application metadata block is for use by third-party | The application metadata block is for use by third-party | |||
| applications. The only mandatory field is a 32-bit identifier. An | applications. The only mandatory field is a 32-bit application | |||
| ID registry is being maintained at https://xiph.org/flac/id.html | identifier (application ID). Application IDs are registered in the | |||
| (https://xiph.org/flac/id.html). | IANA "FLAC Application Metadata Block IDs" registry (see | |||
| Section 12.2). | ||||
| +=======+====================================================+ | ||||
| | Data | Description | | ||||
| +=======+====================================================+ | ||||
| | u(32) | Registered application ID. | | ||||
| +-------+----------------------------------------------------+ | ||||
| | u(n) | Application data (n MUST be a multiple of 8, i.e., | | ||||
| | | a whole number of bytes) n is 8 times the size | | ||||
| | | described in the metadata block header, minus the | | ||||
| | | 32 bits already used for the application ID. | | ||||
| +-------+----------------------------------------------------+ | ||||
| Table 5 | +=======+===================================================+ | |||
| | Data | Description | | ||||
| +=======+===================================================+ | ||||
| | u(32) | Registered application ID. | | ||||
| +-------+---------------------------------------------------+ | ||||
| | u(n) | Application data (n MUST be a multiple of 8, | | ||||
| | | i.e., a whole number of bytes). n is 8 times the | | ||||
| | | size described in the metadata block header minus | | ||||
| | | the 32 bits already used for the application ID. | | ||||
| +-------+---------------------------------------------------+ | ||||
| Application IDs are registered with the IANA, see Section 13.2. | Table 5 | |||
| 8.5. Seektable | 8.5. Seek Table | |||
| The seektable metadata block can be used to store seek points. It is | The seek table metadata block can be used to store seek points. It | |||
| possible to seek to any given sample in a FLAC stream without a seek | is possible to seek to any given sample in a FLAC stream without a | |||
| table, but the delay can be unpredictable since the bitrate may vary | seek table, but the delay can be unpredictable since the bitrate may | |||
| widely within a stream. By adding seek points to a stream, this | vary widely within a stream. By adding seek points to a stream, this | |||
| delay can be significantly reduced. There MUST NOT be more than one | delay can be significantly reduced. There MUST NOT be more than one | |||
| seektable metadata block in a stream, but the table can have any | seek table metadata block in a stream, but the table can have any | |||
| number of seek points. | number of seek points. | |||
| Each seek point takes 18 bytes, so a seek table with 1% resolution | Each seek point takes 18 bytes, so a seek table with 1% resolution | |||
| within a stream adds less than 2 kilobyte of data. The number of | within a stream adds less than 2 kilobytes of data. The number of | |||
| seek points is implied by the size described in the metadata block | seek points is implied by the size described in the metadata block | |||
| header, i.e., equal to size / 18. There is also a special | header, i.e., equal to size / 18. There is also a special | |||
| 'placeholder' seekpoint that will be ignored by decoders but can be | "placeholder" seek point that will be ignored by decoders but can be | |||
| used to reserve space for future seek point insertion. | used to reserve space for future seek point insertion. | |||
| +============+=============================+ | +=============+=============================+ | |||
| | Data | Description | | | Data | Description | | |||
| +============+=============================+ | +=============+=============================+ | |||
| | Seekpoints | Zero or more seek points as | | | Seek points | Zero or more seek points as | | |||
| | | defined in Section 8.5.1. | | | | defined in Section 8.5.1. | | |||
| +------------+-----------------------------+ | +-------------+-----------------------------+ | |||
| Table 6 | Table 6 | |||
| A seektable is generally not usable for seeking in a FLAC file | A seek table is generally not usable for seeking in a FLAC file | |||
| embedded in a container (see Section 10), as such containers usually | embedded in a container (see Section 10), as such containers usually | |||
| interleave FLAC data with other data and the offsets used in | interleave FLAC data with other data and the offsets used in seek | |||
| seekpoints are those of an unmuxed FLAC stream. Also, containers | points are those of an unmuxed FLAC stream. Also, containers often | |||
| often provide their own seeking methods. It is, however, possible to | provide their own seeking methods. However, it is possible to store | |||
| store the seektable in the container along with other metadata when | the seek table in the container along with other metadata when muxing | |||
| muxing a FLAC file, so this stored seektable can be restored when | a FLAC file, so this stored seek table can be restored when demuxing | |||
| demuxing the FLAC stream into a standalone FLAC file. | the FLAC stream into a standalone FLAC file. | |||
| 8.5.1. Seekpoint | 8.5.1. Seek Point | |||
| +=======+==========================================================+ | +=======+==========================================================+ | |||
| | Data | Description | | | Data | Description | | |||
| +=======+==========================================================+ | +=======+==========================================================+ | |||
| | u(64) | Sample number of the first sample in the target frame, | | | u(64) | Sample number of the first sample in the target frame or | | |||
| | | or 0xFFFFFFFFFFFFFFFF for a placeholder point. | | | | 0xFFFFFFFFFFFFFFFF for a placeholder point. | | |||
| +-------+----------------------------------------------------------+ | +-------+----------------------------------------------------------+ | |||
| | u(64) | Offset (in bytes) from the first byte of the first frame | | | u(64) | Offset (in bytes) from the first byte of the first frame | | |||
| | | header to the first byte of the target frame's header. | | | | header to the first byte of the target frame's header. | | |||
| +-------+----------------------------------------------------------+ | +-------+----------------------------------------------------------+ | |||
| | u(16) | Number of samples in the target frame. | | | u(16) | Number of samples in the target frame. | | |||
| +-------+----------------------------------------------------------+ | +-------+----------------------------------------------------------+ | |||
| Table 7 | Table 7 | |||
| NOTES | Notes: | |||
| * For placeholder points, the second and third field values are | * For placeholder points, the second and third field values are | |||
| undefined. | undefined. | |||
| * Seek points within a table MUST be sorted in ascending order by | * Seek points within a table MUST be sorted in ascending order by | |||
| sample number. | sample number. | |||
| * Seek points within a table MUST be unique by sample number, with | * Seek points within a table MUST be unique by sample number, with | |||
| the exception of placeholder points. | the exception of placeholder points. | |||
| * The previous two notes imply that there MAY be any number of | * The previous two notes imply that there MAY be any number of | |||
| placeholder points, but they MUST all occur at the end of the | placeholder points, but they MUST all occur at the end of the | |||
| table. | table. | |||
| * The sample offsets are those of an unmuxed FLAC stream. The | * The sample offsets are those of an unmuxed FLAC stream. The | |||
| offsets MUST NOT be updated on muxing to reflect the new offsets | offsets MUST NOT be updated on muxing to reflect the new offsets | |||
| of FLAC frames in a container. | of FLAC frames in a container. | |||
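To illustrate the seek point layout and the ordering rules in the notes above, here is a minimal Python sketch. The function name is hypothetical. It assumes the three fields are coded big-endian, following the general convention of the format (this section itself does not restate the byte order), and it chooses to reject tables that break the ordering rules, which is only one possible decoder policy.

```python
import struct

PLACEHOLDER = 0xFFFFFFFFFFFFFFFF  # sample number marking a placeholder point


def parse_seek_table(body: bytes):
    """Split a seek table block body into seek points.

    Returns (points, placeholder_count), where each point is a tuple
    (sample_number, byte_offset, samples_in_frame).
    """
    if len(body) % 18:
        raise ValueError("seek table size is not a multiple of 18 bytes")
    # u(64) sample number, u(64) offset, u(16) frame length: 18 bytes each.
    points = [struct.unpack_from(">QQH", body, offset)
              for offset in range(0, len(body), 18)]
    real = [p for p in points if p[0] != PLACEHOLDER]
    # Placeholders may only occur at the end of the table.
    if points[:len(real)] != real:
        raise ValueError("placeholder point found before a real seek point")
    # Real points must be strictly ascending, hence unique, by sample number.
    if any(a[0] >= b[0] for a, b in zip(real, real[1:])):
        raise ValueError("seek points are not sorted and unique")
    return real, len(points) - len(real)
```

The second and third fields of placeholder points are undefined, so the sketch counts placeholders instead of returning their contents.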
| 8.6. Vorbis comment | 8.6. Vorbis Comment | |||
| A Vorbis comment metadata block contains human-readable information | A Vorbis comment metadata block contains human-readable information | |||
| coded in UTF-8. The name Vorbis comment points to the fact that the | coded in UTF-8. The name "Vorbis comment" points to the fact that | |||
| Vorbis codec stores such metadata in almost the same way, see | the Vorbis codec stores such metadata in almost the same way (see | |||
| [Vorbis]. A Vorbis comment metadata block consists of a vendor | [Vorbis]). A Vorbis comment metadata block consists of a vendor | |||
| string optionally followed by a number of fields, which are pairs of | string optionally followed by a number of fields, which are pairs of | |||
| field names and field contents. Many users refer to these fields as | field names and field contents. The vendor string contains the name | |||
| FLAC tags or simply as tags. A FLAC file MUST NOT contain more than | of the program that generated the file or stream. The fields contain | |||
| one Vorbis comment metadata block. | metadata describing various aspects of the contained audio. Many | |||
| users refer to these fields as "FLAC tags" or simply as "tags". A | ||||
| FLAC file MUST NOT contain more than one Vorbis comment metadata | ||||
| block. | ||||
| In a Vorbis comment metadata block, the metadata block header is | In a Vorbis comment metadata block, the metadata block header is | |||
| directly followed by 4 bytes containing the length in bytes of the | directly followed by 4 bytes containing the length in bytes of the | |||
| vendor string as an unsigned number coded little-endian. The vendor | vendor string as an unsigned number coded little-endian. The vendor | |||
| string follows UTF-8 coded, and is not terminated in any way. | string follows, is UTF-8 coded and is not terminated in any way. | |||
| Following the vendor string are 4 bytes containing the number of | Following the vendor string are 4 bytes containing the number of | |||
| fields that are in the Vorbis comment block, stored as an unsigned | fields that are in the Vorbis comment block, stored as an unsigned | |||
| number, coded little-endian. If this number is non-zero, it is | number coded little-endian. If this number is non-zero, it is | |||
| followed by the fields themselves, each of which is stored with a 4 | followed by the fields themselves, each of which is stored with a | |||
| byte length. First, the 4 byte field length in bytes is stored as an | 4-byte length. For each field, the field length in bytes is stored | |||
| unsigned number, coded little-endian. The field itself is, like the | as a 4-byte unsigned number coded little-endian. The field itself | |||
| vendor string, UTF-8 coded, not terminated in any way. | follows it. Like the vendor string, the field is UTF-8 coded and not | |||
| terminated in any way. | ||||
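The layout described above can be sketched as a small parser. This is a minimal illustration, not part of the specification: the function name `parse_vorbis_comment` is hypothetical, and the input is assumed to be the block body with the metadata block header already removed.

```python
import struct


def parse_vorbis_comment(body: bytes):
    """Parse a Vorbis comment block body into (vendor, list of raw fields).

    Assumption: `body` starts right after the metadata block header.
    """
    pos = 0

    def read_u32_le() -> int:
        # Lengths and counts here are 4-byte unsigned little-endian numbers.
        nonlocal pos
        if pos + 4 > len(body):
            raise ValueError("truncated length field")
        (value,) = struct.unpack_from("<I", body, pos)
        pos += 4
        return value

    def read_string() -> str:
        # Strings are length-prefixed, UTF-8 coded, and not terminated.
        nonlocal pos
        length = read_u32_le()
        if pos + length > len(body):
            raise ValueError("string runs past the end of the block")
        text = body[pos:pos + length].decode("utf-8")
        pos += length
        return text

    vendor = read_string()
    field_count = read_u32_le()
    fields = [read_string() for _ in range(field_count)]
    return vendor, fields
```

Bounds are checked before every read because the length fields come from untrusted input and could otherwise point past the end of the block.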
| Each field consists of a field name and a field content, separated by | Each field consists of a field name and field contents, separated by | |||
| an = character. The field name MUST only consist of UTF-8 code | an = character. The field name MUST only consist of UTF-8 code | |||
| points U+0020 through U+007E, excluding U+003D, which is the = | points U+0020 through U+007E, excluding U+003D, which is the = | |||
| character. In other words, the field name can contain all printable | character. In other words, the field name can contain all printable | |||
| ASCII characters except the equals sign. The evaluation of the field | ASCII characters except the equals sign. The evaluation of the field | |||
| names MUST be case insensitive, so U+0041 through U+005A (A-Z) MUST | names MUST be case insensitive, so U+0041 through U+005A (A-Z) MUST | |||
| be considered equivalent to U+0061 through U+007A (a-z) respectively. | be considered equivalent to U+0061 through U+007A (a-z). The field | |||
| The field contents can contain any UTF-8 character. | contents can contain any UTF-8 character. | |||
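As an illustration of the field rules above, the following sketch splits one decoded field at its first = character, checks the name against the allowed code point range, and normalizes the name for case-insensitive comparison. The function name `split_field` and the choice to normalize to uppercase are assumptions made for this example only.

```python
def split_field(field: str):
    """Split a decoded field into (normalized name, contents)."""
    # The name cannot contain '=', so the first '=' is the separator;
    # any later '=' characters belong to the contents.
    name, separator, contents = field.partition("=")
    if not separator:
        raise ValueError("field has no '=' separator")
    if not all(0x20 <= ord(ch) <= 0x7E for ch in name):
        raise ValueError("field name has a code point outside U+0020..U+007E")
    # The name is known to be ASCII here, so upper() only folds a-z to A-Z.
    return name.upper(), contents
```

With this normalization, `Title=...`, `TITLE=...`, and `title=...` all yield the same name.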
| Note that the Vorbis comment as used in Vorbis allows for on the | Note that the Vorbis comment as used in Vorbis allows for 2^64 bytes | |||
| order of 2^64 bytes of data whereas the FLAC metadata block is | of data whereas the FLAC metadata block is limited to 2^24 bytes. | |||
| limited to 2^24 bytes. Given the stated purpose of Vorbis comments, | Given the stated purpose of Vorbis comments, i.e., human-readable | |||
| i.e., human-readable textual information, the FLAC metadata block | textual information, the FLAC metadata block limit is unlikely to be | |||
| limit is unlikely to be restrictive. Also note that the 32-bit field | restrictive. Also, note that the 32-bit field lengths are coded | |||
| lengths are coded little-endian, as opposed to the usual big-endian | little-endian as opposed to the usual big-endian coding of fixed- | |||
| coding of fixed-length integers in the rest of the FLAC format. | length integers in the rest of the FLAC format. | |||
| 8.6.1. Standard field names | 8.6.1. Standard Field Names | |||
| Only one standard field name is defined: the channel mask field, in | Only one standard field name is defined: the channel mask field (see | |||
| Section 8.6.2. No other field names are defined because the | Section 8.6.2). No other field names are defined because the | |||
| applicability of any field name is strongly tied to the content it is | applicability of any field name is strongly tied to the content it is | |||
| associated with. For example, field names useful for describing | associated with. For example, field names that are useful for | |||
| files that contain a single work of music would be unusable when | describing files that contain a single work of music would be | |||
| labeling archived broadcasts, recordings of any kind, or a collection | unusable when labeling archived broadcasts, recordings of any kind, | |||
| of music works. Even when describing a single work of music, | or a collection of music works. Even when describing a single work | |||
| different conventions exist depending on the kind of music: | of music, different conventions exist depending on the kind of music: | |||
| orchestral music differs from music by solo artists or bands. | orchestral music differs from music by solo artists or bands. | |||
| Despite the fact that no field names are formally defined, there is a | Despite the fact that no field names are formally defined, there is a | |||
| general trend among devices and software capable of FLAC playback | general trend among devices and software capable of FLAC playback | |||
| that are meant to play music. Most of those recognize at least the | that are meant to play music. Most of those recognize at least the | |||
| following field names: | following field names: | |||
| * Title: name of the current work. | Title: Name of the current work. | |||
| * Artist: name of the artist generally responsible for the current | ||||
| Artist: Name of the artist generally responsible for the current | ||||
| work. For orchestral works, this is usually the composer; | work. For orchestral works, this is usually the composer; | |||
| otherwise, it is often the performer. | otherwise, it is often the performer. | |||
| * Album: name of the collection the current work belongs to. | ||||
| Album: Name of the collection the current work belongs to. | ||||
| For a more comprehensive list of possible field names suited for | For a more comprehensive list of possible field names suited for | |||
| describing a single work of music in various genres, the list of tags | describing a single work of music in various genres, the list of tags | |||
| used in the MusicBrainz project, see [MusicBrainz], is suggested. | used in the MusicBrainz project is suggested; see [MusicBrainz]. | |||
| 8.6.2. Channel mask | 8.6.2. Channel Mask | |||
| Besides fields containing information about the work itself, one | Besides fields containing information about the work itself, one | |||
| field is defined for technical reasons, of which the field name is | field is defined for technical reasons: | |||
| WAVEFORMATEXTENSIBLE_CHANNEL_MASK. This field is used to communicate | WAVEFORMATEXTENSIBLE_CHANNEL_MASK. This field is used to communicate | |||
| that the channels in a file differ from the default channels defined | that the channels in a file differ from the default channels defined | |||
| in Section 9.1.3. For example, by default, a FLAC file containing | in Section 9.1.3. For example, by default, a FLAC file containing | |||
| two channels is interpreted to contain a left and right channel, but | two channels is interpreted to contain a left and right channel, but | |||
| with this field, it is possible to describe different channel | with this field, it is possible to describe different channel | |||
| contents. | contents. | |||
| The channel mask consists of flag bits indicating which channels are | The channel mask consists of flag bits indicating which channels are | |||
| present. The flags only signal which channels are present, not in | present. The flags only signal which channels are present, not in | |||
| which order, so if a file has to be encoded in which channels are | which order, so if a file to be encoded has channels that are ordered | |||
| ordered differently, they have to be reordered. This mask is stored | differently, they have to be reordered. This mask is stored with a | |||
| with a hexadecimal representation, preceded by 0x, see the examples | hexadecimal representation preceded by 0x; see the examples below. | |||
| below. Please note that a file in which the channel order is defined | Please note that a file in which the channel order is defined through | |||
| through the WAVEFORMATEXTENSIBLE_CHANNEL_MASK is not streamable (see | the WAVEFORMATEXTENSIBLE_CHANNEL_MASK is not streamable (see | |||
| Section 7), as the field is not found in each frame header. The mask | Section 7), as the field is not found in each frame header. The mask | |||
| bits can be found in the following table. | bits can be found in the following table. | |||
| +============+=============================+ | +============+=============================+ | |||
| | Bit number | Channel description | | | Bit Number | Channel Description | | |||
| +============+=============================+ | +============+=============================+ | |||
| | 0 | Front left | | | 0 | Front left | | |||
| +------------+-----------------------------+ | +------------+-----------------------------+ | |||
| | 1 | Front right | | | 1 | Front right | | |||
| +------------+-----------------------------+ | +------------+-----------------------------+ | |||
| | 2 | Front center | | | 2 | Front center | | |||
| +------------+-----------------------------+ | +------------+-----------------------------+ | |||
| | 3 | Low-frequency effects (LFE) | | | 3 | Low-frequency effects (LFE) | | |||
| +------------+-----------------------------+ | +------------+-----------------------------+ | |||
| | 4 | Back left | | | 4 | Back left | | |||
| skipping to change at page 24, line 49 ¶ | skipping to change at line 1089 ¶ | |||
| +------------+-----------------------------+ | +------------+-----------------------------+ | |||
| | 16 | Top rear center | | | 16 | Top rear center | | |||
| +------------+-----------------------------+ | +------------+-----------------------------+ | |||
| | 17 | Top rear right | | | 17 | Top rear right | | |||
| +------------+-----------------------------+ | +------------+-----------------------------+ | |||
| Table 8 | Table 8 | |||
| Following are three examples: | Following are three examples: | |||
| * If a file has a single channel, being a LFE channel, the Vorbis | * A file has a single channel -- an LFE channel. The Vorbis comment | |||
| comment field is WAVEFORMATEXTENSIBLE_CHANNEL_MASK=0x8. | field is WAVEFORMATEXTENSIBLE_CHANNEL_MASK=0x8. | |||
| * If a file has four channels, being front left, front right, top | * A file has four channels -- front left, front right, top front | |||
| front left, and top front right, the Vorbis comment field is | left, and top front right. The Vorbis comment field is | |||
| WAVEFORMATEXTENSIBLE_CHANNEL_MASK=0x5003. | WAVEFORMATEXTENSIBLE_CHANNEL_MASK=0x5003. | |||
| * If an input has four channels, being back center, top front | ||||
| center, front center, and top rear center in that order, they have | * An input has four channels -- back center, top front center, front | |||
| to be reordered to front center, back center, top front center and | center, and top rear center in that order. These have to be | |||
| top rear center. The Vorbis comment field added is | reordered to front center, back center, top front center, and top | |||
| rear center. The Vorbis comment field added is | ||||
| WAVEFORMATEXTENSIBLE_CHANNEL_MASK=0x12104. | WAVEFORMATEXTENSIBLE_CHANNEL_MASK=0x12104. | |||
| WAVEFORMATEXTENSIBLE_CHANNEL_MASK fields MAY be padded with zeros, | WAVEFORMATEXTENSIBLE_CHANNEL_MASK fields MAY be padded with zeros, | |||
| for example, 0x0008 for a single LFE channel. Parsing of | for example, 0x0008 for a single LFE channel. Parsing of | |||
| WAVEFORMATEXTENSIBLE_CHANNEL_MASK fields MUST be case-insensitive for | WAVEFORMATEXTENSIBLE_CHANNEL_MASK fields MUST be case-insensitive for | |||
| both the field name and the field contents. | both the field name and the field contents. | |||
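The three examples can be reproduced with a short sketch. The names `CHANNEL_BITS`, `channel_mask_field`, and `parse_channel_mask` are hypothetical. The bit numbers for channels whose table rows are not shown in this excerpt (back center and the top front channels) are assumed to follow the WAVEFORMATEXTENSIBLE convention, which is consistent with the masks given in the examples.

```python
# Bit numbers per channel; only the channels used in the examples are listed.
CHANNEL_BITS = {
    "front left": 0,
    "front right": 1,
    "front center": 2,
    "lfe": 3,
    "back center": 8,        # assumed, row not shown in this excerpt
    "top front left": 12,    # assumed, row not shown in this excerpt
    "top front center": 13,  # assumed, row not shown in this excerpt
    "top front right": 14,   # assumed, row not shown in this excerpt
    "top rear center": 16,
}


def channel_mask_field(channels) -> str:
    """Build the Vorbis comment field for the given set of channels."""
    mask = 0
    for name in channels:
        mask |= 1 << CHANNEL_BITS[name]
    return f"WAVEFORMATEXTENSIBLE_CHANNEL_MASK=0x{mask:x}"


def parse_channel_mask(contents: str) -> int:
    """Parse field contents such as '0x8', '0X0008', or '0x5003'."""
    # int() with base 16 accepts the 0x prefix in either case and
    # ignores zero padding.
    return int(contents, 16)
```

Because the mask only records which channels are present, the order of the names passed to `channel_mask_field` does not affect the result; reordering the audio channels themselves remains the encoder's job.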
| A WAVEFORMATEXTENSIBLE_CHANNEL_MASK field of 0x0 can be used to | A WAVEFORMATEXTENSIBLE_CHANNEL_MASK field of 0x0 can be used to | |||
| indicate that none of the audio channels of a file correlate with | indicate that none of the audio channels of a file correlate with | |||
| speaker positions. This is the case when audio needs to be decoded | speaker positions. This is the case when audio needs to be decoded | |||
| skipping to change at page 25, line 33 ¶ | skipping to change at line 1121 ¶ | |||
| multitrack recording is contained. | multitrack recording is contained. | |||
| It is possible for a WAVEFORMATEXTENSIBLE_CHANNEL_MASK field to code | It is possible for a WAVEFORMATEXTENSIBLE_CHANNEL_MASK field to code | |||
| for fewer channels than are present in the audio. If that is the | for fewer channels than are present in the audio. If that is the | |||
| case, the remaining channels SHOULD NOT be rendered by a playback | case, the remaining channels SHOULD NOT be rendered by a playback | |||
| application unfamiliar with their purpose. For example, the | application unfamiliar with their purpose. For example, the | |||
| Ambisonics UHJ format is compatible with stereo playback: its first | Ambisonics UHJ format is compatible with stereo playback: its first | |||
| two channels can be played back on stereo equipment, but all four | two channels can be played back on stereo equipment, but all four | |||
| channels together can be decoded into surround sound. For that | channels together can be decoded into surround sound. For that | |||
| example, the Vorbis comment field | example, the Vorbis comment field | |||
| WAVEFORMATEXTENSIBLE_CHANNEL_MASK=0x3 would be set, indicating the | WAVEFORMATEXTENSIBLE_CHANNEL_MASK=0x3 would be set, indicating that | |||
| first two channels are front left and front right, and other channels | the first two channels are front left and front right and other | |||
| do not correlate with speaker positions directly. | channels do not correlate with speaker positions directly. | |||
| If audio channels not assigned to any speaker are contained and | If audio channels not assigned to any speaker are contained and | |||
| decoding to speaker positions is possible, it is recommended to | decoding to speaker positions is possible, it is recommended to | |||
| provide metadata on how this decoding should take place in another | provide metadata on how this decoding should take place in another | |||
| Vorbis comment field or an application metadata block. This document | Vorbis comment field or an application metadata block. This document | |||
| does not define such metadata. | does not define such metadata. | |||
| 8.7. Cuesheet | 8.7. Cuesheet | |||
| To either store the track and index point structure of a Compact Disc | A cuesheet metadata block can be used either to store the track and | |||
| Digital Audio (CD-DA) along with its audio or to provide a mechanism | index point structure of a Compact Disc Digital Audio (CD-DA) along | |||
| to store locations of interest within a FLAC file, a cuesheet | with its audio or to provide a mechanism to store locations of | |||
| metadata block can be used. Certain aspects of this metadata block | interest within a FLAC file. Certain aspects of this metadata block | |||
| follow directly from the CD-DA specification, called Red Book, which | come directly from the CD-DA specification (called Red Book), which | |||
| is standardized as [IEC.60908.1999]. The description below is | is standardized as [IEC.60908.1999]. The description below is | |||
| complete and further reference to [IEC.60908.1999] is not needed to | complete, and further reference to [IEC.60908.1999] is not needed to | |||
| implement this metadata block. | implement this metadata block. | |||
| The structure of a cuesheet metadata block is enumerated in the | The structure of a cuesheet metadata block is enumerated in the | |||
| following table. | following table. | |||
| +============+======================================================+ | +============+======================================================+ | |||
| | Data | Description | | | Data | Description | | |||
| +============+======================================================+ | +============+======================================================+ | |||
| | u(128*8) | Media catalog number, in ASCII | | | u(128*8) | Media catalog number in ASCII | | |||
| | | printable characters 0x20-0x7E. | | | | printable characters 0x20-0x7E. | | |||
| +------------+------------------------------------------------------+ | +------------+------------------------------------------------------+ | |||
| | u(64) | Number of lead-in samples. | | | u(64) | Number of lead-in samples. | | |||
| +------------+------------------------------------------------------+ | +------------+------------------------------------------------------+ | |||
| | u(1) | 1 if the cuesheet corresponds to a | | | u(1) | 1 if the cuesheet corresponds to a | | |||
| | | CD-DA, else 0. | | | | CD-DA; else 0. | | |||
| +------------+------------------------------------------------------+ | +------------+------------------------------------------------------+ | |||
| | u(7+258*8) | Reserved. All bits MUST be set to | | | u(7+258*8) | Reserved. All bits MUST be set to | | |||
| | | zero. | | | | zero. | | |||
| +------------+------------------------------------------------------+ | +------------+------------------------------------------------------+ | |||
| | u(8) | Number of tracks in this cuesheet. | | | u(8) | Number of tracks in this cuesheet. | | |||
| +------------+------------------------------------------------------+ | +------------+------------------------------------------------------+ | |||
| | Cuesheet | A number of structures as specified | | | Cuesheet | A number of structures as specified | | |||
| | tracks | in Section 8.7.1 equal to the number | | | tracks | in Section 8.7.1 equal to the number | | |||
| | | of tracks specified previously. | | | | of tracks specified previously. | | |||
| +------------+------------------------------------------------------+ | +------------+------------------------------------------------------+ | |||
| Table 9 | Table 9 | |||
| If the media catalog number is less than 128 bytes long, it is right- | If the media catalog number is less than 128 bytes long, it is right- | |||
| padded with 0x00 bytes. For CD-DA, this is a thirteen digit number, | padded with 0x00 bytes. For CD-DA, this is a 13-digit number | |||
| followed by 115 0x00 bytes. | followed by 115 0x00 bytes. | |||
| The number of lead-in samples has meaning only for CD-DA cuesheets; | The number of lead-in samples has meaning only for CD-DA cuesheets; | |||
| for other uses, it should be 0. For CD-DA, the lead-in is the TRACK | for other uses, it should be 0. For CD-DA, the lead-in is the TRACK | |||
| 00 area where the table of contents is stored; more precisely, it is | 00 area where the table of contents is stored; more precisely, it is | |||
| the number of samples from the first sample of the media to the first | the number of samples from the first sample of the media to the first | |||
| sample of the first index point of the first track. According to | sample of the first index point of the first track. According to | |||
| [IEC.60908.1999], the lead-in MUST be silence and CD grabbing | [IEC.60908.1999], the lead-in MUST be silent, and CD grabbing | |||
| software does not usually store it; additionally, the lead-in MUST be | software does not usually store it; additionally, the lead-in MUST be | |||
| at least two seconds but MAY be longer. For these reasons, the lead- | at least two seconds but MAY be longer. For these reasons, the lead- | |||
| in length is stored here so that the absolute position of the first | in length is stored here so that the absolute position of the first | |||
| track can be computed. Note that the lead-in stored here is the | track can be computed. Note that the lead-in stored here is the | |||
| number of samples up to the first index point of the first track, not | number of samples up to the first index point of the first track, not | |||
| necessarily to INDEX 01 of the first track; even the first track MAY | necessarily to INDEX 01 of the first track; even the first track MAY | |||
| have INDEX 00 data. | have INDEX 00 data. | |||
| The number of tracks MUST be at least 1, as a cuesheet block MUST | The number of tracks MUST be at least 1, as a cuesheet block MUST | |||
| have a lead-out track. For CD-DA, this number MUST be no more than | have a lead-out track. For CD-DA, this number MUST be no more than | |||
| 100 (99 regular tracks and one lead-out track). The lead-out track | 100 (99 regular tracks and one lead-out track). The lead-out track | |||
| is always the last track in the cuesheet. For CD-DA, the lead-out | is always the last track in the cuesheet. For CD-DA, the lead-out | |||
| track number MUST be 170 as specified by [IEC.60908.1999], otherwise | track number MUST be 170 as specified by [IEC.60908.1999]; otherwise, | |||
| it MUST be 255. | it MUST be 255. | |||
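A minimal sketch of reading the fixed-size part of Table 9 follows. The function name `parse_cuesheet_header` and the returned dictionary keys are hypothetical, and the input is assumed to be the block body without the metadata block header. The fixed part is 128 + 8 + 259 + 1 = 396 bytes, where the 259 bytes hold the CD-DA flag bit followed by the 7+258*8 reserved bits.

```python
import struct

CUESHEET_HEADER_SIZE = 128 + 8 + 259 + 1  # bytes before the first track


def parse_cuesheet_header(body: bytes) -> dict:
    """Parse the fixed-size fields that precede the cuesheet tracks."""
    if len(body) < CUESHEET_HEADER_SIZE:
        raise ValueError("cuesheet block is too short")
    # The catalog number is right-padded with 0x00 bytes up to 128 bytes.
    catalog = body[:128].rstrip(b"\x00").decode("ascii")
    # Unlike Vorbis comment lengths, this integer is big-endian.
    (lead_in_samples,) = struct.unpack_from(">Q", body, 128)
    # The CD-DA flag is the most significant bit of the byte after lead-in.
    is_cdda = bool(body[136] & 0x80)
    track_count = body[395]
    if track_count < 1:
        raise ValueError("a cuesheet needs at least the lead-out track")
    if is_cdda and track_count > 100:
        raise ValueError("CD-DA allows at most 99 tracks plus the lead-out")
    return {
        "catalog": catalog,
        "lead_in_samples": lead_in_samples,
        "is_cdda": is_cdda,
        "track_count": track_count,
    }
```

This sketch does not verify that the reserved bits are zero; a stricter reader could add that check.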
| 8.7.1. Cuesheet track | 8.7.1. Cuesheet Track | |||
| +=============+=====================================================+ | +=============+=====================================================+ | |||
| | Data | Description | | | Data | Description | | |||
| +=============+=====================================================+ | +=============+=====================================================+ | |||
| | u(64) | Track offset of the first index point in | | | u(64) | Track offset of the first index point in | | |||
| | | samples, relative to the beginning of the | | | | samples, relative to the beginning of the | | |||
| | | FLAC audio stream. | | | | FLAC audio stream. | | |||
| +-------------+-----------------------------------------------------+ | +-------------+-----------------------------------------------------+ | |||
| | u(8) | Track number. | | | u(8) | Track number. | | |||
| +-------------+-----------------------------------------------------+ | +-------------+-----------------------------------------------------+ | |||
| skipping to change at page 27, line 46 ¶ | skipping to change at line 1227 ¶ | |||
| +-------------+-----------------------------------------------------+ | +-------------+-----------------------------------------------------+ | |||
| | Cuesheet | For all tracks except the lead-out track, a | | | Cuesheet | For all tracks except the lead-out track, a | | |||
| | track index | number of structures as specified in | | | track index | number of structures as specified in | | |||
| | points | Section 8.7.1.1 equal to the number of index | | | points | Section 8.7.1.1 equal to the number of index | | |||
| | | points specified previously. | | | | points specified previously. | | |||
| +-------------+-----------------------------------------------------+ | +-------------+-----------------------------------------------------+ | |||
| Table 10 | Table 10 | |||
| Note that the track offset differs from the one in CD-DA, where the | Note that the track offset differs from the one in CD-DA, where the | |||
| track's offset in the TOC is that of the track's INDEX 01 even if | track's offset in the table of contents (TOC) is that of the track's | |||
| there is an INDEX 00. For CD-DA, the track offset MUST be evenly | INDEX 01 even if there is an INDEX 00. For CD-DA, the track offset | |||
| divisible by 588 samples (588 samples = 44100 samples/s * 1/75 s). | MUST be evenly divisible by 588 samples (588 samples = 44100 samples/ | |||
| s * 1/75 s). | ||||
| A track number of 0 is not allowed, because the CD-DA specification | A track number of 0 is not allowed because the CD-DA specification | |||
| reserves this for the lead-in. For CD-DA the number MUST be 1-99, or | reserves this for the lead-in. For CD-DA, the number MUST be 1-99 or | |||
| 170 for the lead-out; for non-CD-DA, the track number MUST be 255 for | 170 for the lead-out; for non-CD-DA, the track number MUST be 255 for | |||
| the lead-out. It is recommended to start with track 1 and increase | the lead-out. It is recommended to start with track 1 and increase | |||
| sequentially. Track numbers MUST be unique within a cuesheet. | sequentially. Track numbers MUST be unique within a cuesheet. | |||
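The track offset and track number rules above can be collected into one check. This is an illustrative sketch only: the function name `check_tracks` and the representation of tracks as (offset, number) pairs with the lead-out track last are assumptions made for the example.

```python
SAMPLES_PER_CDDA_FRAME = 44100 // 75  # 588 samples


def check_tracks(tracks, is_cdda: bool) -> list:
    """Return a list of rule violations for (offset, number) pairs.

    Assumption: `tracks` is in cuesheet order, so the lead-out is last.
    """
    if not tracks:
        return ["cuesheet has no lead-out track"]
    problems = []
    numbers = [number for _, number in tracks]
    if len(set(numbers)) != len(numbers):
        problems.append("track numbers are not unique")
    *regular, lead_out = tracks
    if lead_out[1] != (170 if is_cdda else 255):
        problems.append("lead-out track has the wrong track number")
    for _, number in regular:
        if number == 0:
            problems.append("track number 0 is reserved for the lead-in")
        elif is_cdda and not 1 <= number <= 99:
            problems.append(f"CD-DA track number {number} is outside 1-99")
    if is_cdda:
        for offset, number in tracks:
            if offset % SAMPLES_PER_CDDA_FRAME != 0:
                problems.append(f"offset of track {number} is not a multiple of 588")
    return problems
```

The recommendation to start at track 1 and increase sequentially is not enforced here, since it is not a requirement.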
| The track ISRC (International Standard Recording Code) is a 12-digit | The track ISRC (International Standard Recording Code) is a 12-digit | |||
| alphanumeric code; see [ISRC-handbook]. A value of 12 ASCII 0x00 | alphanumeric code; see [ISRC-handbook]. A value of 12 ASCII 0x00 | |||
| characters MAY be used to denote the absence of an ISRC. | characters MAY be used to denote the absence of an ISRC. | |||
| There MUST be at least one index point in every track in a cuesheet | There MUST be at least one index point in every track in a cuesheet | |||
| except for the lead-out track, which MUST have zero. For CD-DA, the | except for the lead-out track, which MUST have zero. For CD-DA, the | |||
| number of index points MUST NOT be more than 100. | number of index points MUST NOT be more than 100. | |||
| 8.7.1.1. Cuesheet track index point | 8.7.1.1. Cuesheet Track Index Point | |||
| +========+====================================+ | +========+====================================+ | |||
| | Data | Description | | | Data | Description | | |||
| +========+====================================+ | +========+====================================+ | |||
| | u(64) | Offset in samples, relative to the | | | u(64) | Offset in samples, relative to the | | |||
| | | track offset, of the index point. | | | | track offset, of the index point. | | |||
| +--------+------------------------------------+ | +--------+------------------------------------+ | |||
| | u(8) | The track index point number. | | | u(8) | The track index point number. | | |||
| +--------+------------------------------------+ | +--------+------------------------------------+ | |||
| | u(3*8) | Reserved. All bits MUST be set to | | | u(3*8) | Reserved. All bits MUST be set to | | |||
| skipping to change at page 28, line 49 ¶ | skipping to change at line 1276 ¶ | |||
| For CD-DA, a track index point number of 0 corresponds to the track | For CD-DA, a track index point number of 0 corresponds to the track | |||
| pre-gap. The first index point in a track MUST have a number of 0 or | pre-gap. The first index point in a track MUST have a number of 0 or | |||
| 1, and subsequently, index point numbers MUST increase by 1. Index | 1, and subsequently, index point numbers MUST increase by 1. Index | |||
| point numbers MUST be unique within a track. | point numbers MUST be unique within a track. | |||
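The numbering rule for index points is small enough to state as a one-function sketch. The name `index_numbers_valid` is hypothetical, and the function is meant for regular tracks, which have at least one index point; the lead-out track has none.

```python
def index_numbers_valid(numbers) -> bool:
    """Check the index point numbers of one regular (non-lead-out) track."""
    # The first index point is numbered 0 (pre-gap) or 1.
    if not numbers or numbers[0] not in (0, 1):
        return False
    # Each following number is exactly one more than its predecessor,
    # which also guarantees uniqueness within the track.
    return all(b == a + 1 for a, b in zip(numbers, numbers[1:]))
```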
| 8.8. Picture | 8.8. Picture | |||
| The picture metadata block contains image data of a picture in some | The picture metadata block contains image data of a picture in some | |||
| way belonging to the audio contained in the FLAC file. Its format is | way belonging to the audio contained in the FLAC file. Its format is | |||
| derived from the APIC frame in the ID3v2 specification, see [ID3v2]. | derived from the Attached Picture (APIC) frame in the ID3v2 | |||
| However, contrary to the APIC frame in ID3v2, the media type and | specification; see [ID3v2]. However, contrary to the APIC frame in | |||
| description are prepended with a 4-byte length field instead of being | ID3v2, the media type and description are prepended with a 4-byte | |||
| 0x00 delimited strings. A FLAC file MAY contain one or more picture | length field instead of being 0x00 delimited strings. A FLAC file | |||
| metadata blocks. | MAY contain one or more picture metadata blocks. | |||
| Note that while the length fields for media type, description, and | Note that while the length fields for media type, description, and | |||
| picture data are 4 bytes in length and could in theory code for a | picture data are 4 bytes in length and could code for a size up to 4 | |||
| size up to 4 GiB, the total metadata block size cannot exceed what | GiB in theory, the total metadata block size cannot exceed what can | |||
| can be described by the metadata block header, i.e., 16 MiB. | be described by the metadata block header, i.e., 16 MiB. | |||
| Instead of picture data, the picture metadata block can also contain | Instead of picture data, the picture metadata block can also contain | |||
| an URI as described in [RFC3986]. | a URI as described in [RFC3986]. | |||
| The structure of a picture metadata block is enumerated in the | The structure of a picture metadata block is enumerated in the | |||
| following table. | following table. | |||
| +========+==========================================================+ | +========+==========================================================+ | |||
| | Data | Description | | | Data | Description | | |||
| +========+==========================================================+ | +========+==========================================================+ | |||
| | u(32) | The picture type according to next table | | | u(32) | The picture type according to Table 13. | | |||
| +--------+----------------------------------------------------------+ | +--------+----------------------------------------------------------+ | |||
| | u(32) | The length of the media type string in bytes. | | | u(32) | The length of the media type string in bytes. | | |||
| +--------+----------------------------------------------------------+ | +--------+----------------------------------------------------------+ | |||
| | u(n*8) | The media type string as specified by [RFC2046], | | | u(n*8) | The media type string as specified by [RFC2046], | | |||
| | | or the text string --> to signify that the data | | | | or the text string --> to signify that the data | | |||
| | | part is a URI of the picture instead of the | | | | part is a URI of the picture instead of the | | |||
| | | picture data itself. This field must be in | | | | picture data itself. This field must be in | | |||
| | | printable ASCII characters 0x20-0x7E. | | | | printable ASCII characters 0x20-0x7E. | | |||
| +--------+----------------------------------------------------------+ | +--------+----------------------------------------------------------+ | |||
| | u(32) | The length of the description string in bytes. | | | u(32) | The length of the description string in bytes. | | |||
| +--------+----------------------------------------------------------+ | +--------+----------------------------------------------------------+ | |||
| | u(n*8) | The description of the picture, in UTF-8. | | | u(n*8) | The description of the picture in UTF-8. | | |||
| +--------+----------------------------------------------------------+ | +--------+----------------------------------------------------------+ | |||
| | u(32) | The width of the picture in pixels. | | | u(32) | The width of the picture in pixels. | | |||
| +--------+----------------------------------------------------------+ | +--------+----------------------------------------------------------+ | |||
| | u(32) | The height of the picture in pixels. | | | u(32) | The height of the picture in pixels. | | |||
| +--------+----------------------------------------------------------+ | +--------+----------------------------------------------------------+ | |||
| | u(32) | The color depth of the picture in bits per | | | u(32) | The color depth of the picture in bits per | | |||
| | | pixel. | | | | pixel. | | |||
| +--------+----------------------------------------------------------+ | +--------+----------------------------------------------------------+ | |||
| | u(32) | For indexed-color pictures (e.g., GIF), the | | | u(32) | For indexed-color pictures (e.g., GIF), the | | |||
| | | number of colors used, or 0 for non-indexed | | | | number of colors used; 0 for non-indexed | | |||
| | | pictures. | | | | pictures. | | |||
| +--------+----------------------------------------------------------+ | +--------+----------------------------------------------------------+ | |||
| | u(32) | The length of the picture data in bytes. | | | u(32) | The length of the picture data in bytes. | | |||
| +--------+----------------------------------------------------------+ | +--------+----------------------------------------------------------+ | |||
| | u(n*8) | The binary picture data. | | | u(n*8) | The binary picture data. | | |||
| +--------+----------------------------------------------------------+ | +--------+----------------------------------------------------------+ | |||
| Table 12 | Table 12 | |||
| The height, width, color depth, and 'number of colors' fields are for | The height, width, color depth, and "number of colors" fields are for | |||
| informational purposes only. Applications MUST NOT use them in | informational purposes only. Applications MUST NOT use them in | |||
| decoding the picture or deciding how to display it, but MAY use them | decoding the picture or deciding how to display it, but applications | |||
| to decide whether to process a block or not (e.g., when selecting | MAY use them to decide whether or not to process a block (e.g., when | |||
| between different picture blocks) and MAY show them to the user. If | selecting between different picture blocks) and MAY show them to the | |||
| a picture has no concept for any of these fields (e.g., vector images | user. If a picture has no concept for any of these fields (e.g., | |||
| may not have a height or width in pixels) or the content of any field | vector images may not have a height or width in pixels) or the | |||
| is unknown, the affected fields MUST be set to zero. | content of any field is unknown, the affected fields MUST be set to | |||
| zero. | ||||
| The following table contains all the defined picture types. Values | The following table contains all the defined picture types. Values | |||
| other than those listed in the table are reserved. There MAY only be | other than those listed in the table are reserved. There MAY only be | |||
| one each of picture types 1 and 2 in a file. In general practice, | one each of picture types 1 and 2 in a file. In general practice, | |||
| many FLAC playback devices and software display the contents of a | many FLAC playback devices and software display the contents of a | |||
| picture metadata block with picture type 3 (front cover) during | picture metadata block, if present, with picture type 3 (front cover) | |||
| playback, if present. | during playback. | |||
| +=======+=================================================+ | +=======+=================================================+ | |||
| | Value | Picture type | | | Value | Picture Type | | |||
| +=======+=================================================+ | +=======+=================================================+ | |||
| | 0 | Other | | | 0 | Other | | |||
| +-------+-------------------------------------------------+ | +-------+-------------------------------------------------+ | |||
| | 1 | PNG file icon of 32x32 pixels, see [RFC2083] | | | 1 | PNG file icon of 32x32 pixels (see [RFC2083]) | | |||
| +-------+-------------------------------------------------+ | +-------+-------------------------------------------------+ | |||
| | 2 | General file icon | | | 2 | General file icon | | |||
| +-------+-------------------------------------------------+ | +-------+-------------------------------------------------+ | |||
| | 3 | Front cover | | | 3 | Front cover | | |||
| +-------+-------------------------------------------------+ | +-------+-------------------------------------------------+ | |||
| | 4 | Back cover | | | 4 | Back cover | | |||
| +-------+-------------------------------------------------+ | +-------+-------------------------------------------------+ | |||
| | 5 | Liner notes page | | | 5 | Liner notes page | | |||
| +-------+-------------------------------------------------+ | +-------+-------------------------------------------------+ | |||
| | 6 | Media label (e.g., CD, Vinyl or Cassette label) | | | 6 | Media label (e.g., CD, Vinyl or Cassette label) | | |||
| skipping to change at page 32, line 5 ¶ | skipping to change at line 1393 ¶ | |||
| +-------+-------------------------------------------------+ | +-------+-------------------------------------------------+ | |||
| | 18 | Illustration | | | 18 | Illustration | | |||
| +-------+-------------------------------------------------+ | +-------+-------------------------------------------------+ | |||
| | 19 | Band or artist logotype | | | 19 | Band or artist logotype | | |||
| +-------+-------------------------------------------------+ | +-------+-------------------------------------------------+ | |||
| | 20 | Publisher or studio logotype | | | 20 | Publisher or studio logotype | | |||
| +-------+-------------------------------------------------+ | +-------+-------------------------------------------------+ | |||
| Table 13 | Table 13 | |||
| The origin and use of value 17, "A bright colored fish", is unclear. | The origin and use of value 17 ("A bright colored fish") is unclear. | |||
| This was copied to maintain compatibility with ID3v2. Applications | This was copied to maintain compatibility with ID3v2. Applications | |||
| are discouraged from offering this value to users when embedding a | are discouraged from offering this value to users when embedding a | |||
| picture. | picture. | |||
| If not a picture but a URI is contained in this block, the following | If a URI (not a picture) is contained in this block, the following | |||
| points apply: | points apply: | |||
| * The URI can be either in absolute or relative form. If an URI is | * The URI can be in either absolute or relative form. If a URI is | |||
| in relative form, it is related to the URI of the FLAC content | in relative form, it is related to the URI of the FLAC content | |||
| processed. | processed. | |||
| * Applications MUST obtain explicit user approval to retrieve images | * Applications MUST obtain explicit user approval to retrieve images | |||
| via remote protocols and to retrieve local images not located in | via remote protocols and to retrieve local images that are not | |||
| the same directory as the FLAC file being processed. | located in the same directory as the FLAC file being processed. | |||
| * Applications supporting linked images MUST handle unavailability | * Applications supporting linked images MUST handle unavailability | |||
| of URIs gracefully. They MAY report unavailability to the user. | of URIs gracefully. They MAY report unavailability to the user. | |||
| * Applications MAY reject processing URIs for any reason, in | ||||
| particular for security or privacy reasons. | ||||
| 9. Frame structure | * Applications MAY reject processing URIs for any reason, | |||
| particularly for security or privacy reasons. | ||||
| Directly after the last metadata block, one or more frames follow. | 9. Frame Structure | |||
| One or more frames follow directly after the last metadata block. | ||||
| Each frame consists of a frame header, one or more subframes, padding | Each frame consists of a frame header, one or more subframes, padding | |||
| zero bits to achieve byte-alignment, and a frame footer. The number | zero bits to achieve byte alignment, and a frame footer. The number | |||
| of subframes in each frame is equal to the number of audio channels. | of subframes in each frame is equal to the number of audio channels. | |||
| Each frame header stores the audio sample rate, number of bits per | Each frame header stores the audio sample rate, number of bits per | |||
| sample, and number of channels independently of the streaminfo | sample, and number of channels independently of the streaminfo | |||
| metadata block and other frame headers. This was done to permit | metadata block and other frame headers. This was done to permit | |||
| multicasting of FLAC files, but it also allows these properties to | multicasting of FLAC files, but it also allows these properties to | |||
| change mid-stream. Because not all environments in which FLAC | change mid-stream. Because not all environments in which FLAC | |||
| decoders are used are able to cope with changes to these properties | decoders are used are able to cope with changes to these properties | |||
| during playback, a decoder MAY choose to stop decoding on such a | during playback, a decoder MAY choose to stop decoding on such a | |||
| change. A decoder that does not check for such a change could be | change. A decoder that does not check for such a change could be | |||
| vulnerable to buffer overflows. See also Section 12. | vulnerable to buffer overflows. See also Section 11. | |||
| Note that storing audio with changing audio properties in FLAC | Note that storing audio with changing audio properties in FLAC | |||
| results in various practical problems. For example, these changes of | results in various practical problems. For example, these changes of | |||
| audio properties must happen on a frame boundary, or the process will | audio properties must happen on a frame boundary or the process will | |||
| not be lossless. When a variable block size is chosen to accommodate | not be lossless. When a variable block size is chosen to accommodate | |||
| this, note that blocks smaller than 16 samples are not allowed and it | this, note that blocks smaller than 16 samples are not allowed; | |||
| is therefore not possible to store an audio stream in which these | therefore, it is not possible to store an audio stream in which these | |||
| properties change within 16 samples of the last change or the start | properties change within 16 samples of the last change or the start | |||
| of the file. Also, since the streaminfo metadata block can only | of the file. Also, since the streaminfo metadata block can only | |||
| accommodate a single set of properties, it is only valid for part of | accommodate a single set of properties, it is only valid for part of | |||
| such an audio stream. Instead, it is RECOMMENDED to store an audio | such an audio stream. Instead, it is RECOMMENDED to store an audio | |||
| stream with changing properties in FLAC encapsulated in a container | stream with changing properties in FLAC encapsulated in a container | |||
| capable of handling such changes, as these do not suffer from the | capable of handling such changes, as these do not suffer from the | |||
| mentioned limitations. See Section 10 for details. | mentioned limitations. See Section 10 for details. | |||
| 9.1. Frame header | 9.1. Frame Header | |||
| Each frame MUST start on a byte boundary and starts with the 15-bit | Each frame MUST start on a byte boundary and start with the 15-bit | |||
| frame sync code 0b111111111111100. Following the sync code is the | frame sync code 0b111111111111100. Following the sync code is the | |||
| blocking strategy bit, which MUST NOT change during the audio stream. | blocking strategy bit, which MUST NOT change during the audio stream. | |||
| The blocking strategy bit is 0 for a fixed block size stream or 1 for | The blocking strategy bit is 0 for a fixed block size stream or 1 for | |||
| a variable block size stream. If the blocking strategy is known, a | a variable block size stream. If the blocking strategy is known, a | |||
| decoder can include this bit when searching for the start of a frame | decoder can include this bit when searching for the start of a frame | |||
| to reduce the possibility of encountering a false positive, as the | to reduce the possibility of encountering a false positive, as the | |||
| first two bytes of a frame are either 0xFFF8 for a fixed block size | first two bytes of a frame are either 0xFFF8 for a fixed block size | |||
| stream or 0xFFF9 for a variable block size stream. | stream or 0xFFF9 for a variable block size stream. | |||
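The sync search described above can be sketched as follows. This is a minimal illustration, not part of the specification: the function name `find_sync_candidates` and its parameter are hypothetical, and every offset it returns is only a candidate that a real decoder would still have to validate, since the same two bytes can occur inside audio data.

```python
from typing import List, Optional


def find_sync_candidates(data: bytes,
                         variable_block_size: Optional[bool] = None) -> List[int]:
    """Return byte offsets where a frame header could start.

    A frame starts with the 15-bit sync code 0b111111111111100 followed by
    the blocking strategy bit, so the first two bytes are 0xFFF8 (fixed
    block size) or 0xFFF9 (variable block size).
    """
    if variable_block_size is None:
        # Blocking strategy unknown: accept either value of the bit.
        wanted = (0xF8, 0xF9)
    elif variable_block_size:
        wanted = (0xF9,)
    else:
        wanted = (0xF8,)
    return [i for i in range(len(data) - 1)
            if data[i] == 0xFF and data[i + 1] in wanted]
```

Passing the known blocking strategy narrows the match to a full 16-bit pattern, which is the false-positive reduction the text refers to.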
| 9.1.1. Block size bits | 9.1.1. Block Size Bits | |||
| Following the frame sync code and blocking strategy bit are 4 bits | Following the frame sync code and blocking strategy bit are 4 bits | |||
| (the first 4 bits of the third byte of each frame) referred to as the | (the first 4 bits of the third byte of each frame) referred to as the | |||
| block size bits. Their value relates to the block size according to | block size bits. Their value relates to the block size according to | |||
| the following table, where v is the value of the 4 bits as an | the following table, where v is the value of the 4 bits as an | |||
| unsigned number. If the block size bits code for an uncommon block | unsigned number. If the block size bits code for an uncommon block | |||
| size, this is stored after the coded number, see Section 9.1.6. | size, this is stored after the coded number; see Section 9.1.6. | |||
| +=================+=============================================+ | +=================+=============================================+ | |||
| | Value | Block size | | | Value | Block Size | | |||
| +=================+=============================================+ | +=================+=============================================+ | |||
| | 0b0000 | reserved | | | 0b0000 | Reserved | | |||
| +-----------------+---------------------------------------------+ | +-----------------+---------------------------------------------+ | |||
| | 0b0001 | 192 | | | 0b0001 | 192 | | |||
| +-----------------+---------------------------------------------+ | +-----------------+---------------------------------------------+ | |||
| | 0b0010 - 0b0101 | 144 * (2^v), i.e., 576, 1152, 2304, or 4608 | | | 0b0010 - 0b0101 | 144 * (2^v), i.e., 576, 1152, 2304, or 4608 | | |||
| +-----------------+---------------------------------------------+ | +-----------------+---------------------------------------------+ | |||
| | 0b0110 | uncommon block size minus 1 stored as an | | | 0b0110 | Uncommon block size minus 1, stored as an | | |||
| | | 8-bit number | | | | 8-bit number | | |||
| +-----------------+---------------------------------------------+ | +-----------------+---------------------------------------------+ | |||
| | 0b0111 | uncommon block size minus 1 stored as a | | | 0b0111 | Uncommon block size minus 1, stored as a | | |||
| | | 16-bit number | | | | 16-bit number | | |||
| +-----------------+---------------------------------------------+ | +-----------------+---------------------------------------------+ | |||
| | 0b1000 - 0b1111 | 2^v, i.e., 256, 512, 1024, 2048, 4096, | | | 0b1000 - 0b1111 | 2^v, i.e., 256, 512, 1024, 2048, 4096, | | |||
| | | 8192, 16384, or 32768 | | | | 8192, 16384, or 32768 | | |||
| +-----------------+---------------------------------------------+ | +-----------------+---------------------------------------------+ | |||
| Table 14 | Table 14 | |||
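As an illustration of Table 14, the mapping from the block size bits to a block size could be written as below. The function name and the `(block_size, extra_bits)` return convention are assumptions made for this sketch; a block size of `None` means the actual value is stored after the coded number in the given number of bits.

```python
from typing import Optional, Tuple


def decode_block_size_bits(v: int) -> Tuple[Optional[int], int]:
    """Map the 4 block size bits to (block_size, extra_bits) per Table 14."""
    if not 0 <= v <= 0b1111:
        raise ValueError("block size bits must be a 4-bit value")
    if v == 0b0000:
        raise ValueError("block size bits 0b0000 are reserved")
    if v == 0b0001:
        return 192, 0
    if 0b0010 <= v <= 0b0101:
        return 144 * (1 << v), 0      # 576, 1152, 2304, or 4608
    if v == 0b0110:
        return None, 8                # block size minus 1, 8-bit number
    if v == 0b0111:
        return None, 16               # block size minus 1, 16-bit number
    return 1 << v, 0                  # 256 up to 32768
```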
| 9.1.2. Sample rate bits | 9.1.2. Sample Rate Bits | |||
| The next 4 bits (the last 4 bits of the third byte of each frame), | The next 4 bits (the last 4 bits of the third byte of each frame), | |||
| referred to as the sample rate bits, contain the sample rate of the | referred to as the sample rate bits, contain the sample rate of the | |||
| audio according to the following table. If the sample rate bits code | audio according to the following table. If the sample rate bits code | |||
| for an uncommon sample rate, this is stored after the uncommon block | for an uncommon sample rate, this is stored after the uncommon block | |||
| size or after the coded number if no uncommon block size was used. | size; if no uncommon block size was used, this is stored after the | |||
| See Section 9.1.7. | coded number. See Section 9.1.7. | |||
| +========+==========================================================+ | +========+==========================================================+ | |||
| | Value | Sample rate | | | Value | Sample Rate | | |||
| +========+==========================================================+ | +========+==========================================================+ | |||
| | 0b0000 | sample rate only stored in the | | | 0b0000 | Sample rate only stored in the | | |||
| | | streaminfo metadata block | | | | streaminfo metadata block | | |||
| +--------+----------------------------------------------------------+ | +--------+----------------------------------------------------------+ | |||
| | 0b0001 | 88.2 kHz | | | 0b0001 | 88.2 kHz | | |||
| +--------+----------------------------------------------------------+ | +--------+----------------------------------------------------------+ | |||
| | 0b0010 | 176.4 kHz | | | 0b0010 | 176.4 kHz | | |||
| +--------+----------------------------------------------------------+ | +--------+----------------------------------------------------------+ | |||
| | 0b0011 | 192 kHz | | | 0b0011 | 192 kHz | | |||
| +--------+----------------------------------------------------------+ | +--------+----------------------------------------------------------+ | |||
| | 0b0100 | 8 kHz | | | 0b0100 | 8 kHz | | |||
| +--------+----------------------------------------------------------+ | +--------+----------------------------------------------------------+ | |||
| skipping to change at page 35, line 33 ¶ | skipping to change at line 1525 ¶ | |||
| | 0b0111 | 24 kHz | | | 0b0111 | 24 kHz | | |||
| +--------+----------------------------------------------------------+ | +--------+----------------------------------------------------------+ | |||
| | 0b1000 | 32 kHz | | | 0b1000 | 32 kHz | | |||
| +--------+----------------------------------------------------------+ | +--------+----------------------------------------------------------+ | |||
| | 0b1001 | 44.1 kHz | | | 0b1001 | 44.1 kHz | | |||
| +--------+----------------------------------------------------------+ | +--------+----------------------------------------------------------+ | |||
| | 0b1010 | 48 kHz | | | 0b1010 | 48 kHz | | |||
| +--------+----------------------------------------------------------+ | +--------+----------------------------------------------------------+ | |||
| | 0b1011 | 96 kHz | | | 0b1011 | 96 kHz | | |||
| +--------+----------------------------------------------------------+ | +--------+----------------------------------------------------------+ | |||
| | 0b1100 | uncommon sample rate in kHz stored | | | 0b1100 | Uncommon sample rate in kHz, | | |||
| | | as an 8-bit number | | | | stored as an 8-bit number | | |||
| +--------+----------------------------------------------------------+ | +--------+----------------------------------------------------------+ | |||
| | 0b1101 | uncommon sample rate in Hz stored | | | 0b1101 | Uncommon sample rate in Hz, stored | | |||
| | | as a 16-bit number | | | | as a 16-bit number | | |||
| +--------+----------------------------------------------------------+ | +--------+----------------------------------------------------------+ | |||
| | 0b1110 | uncommon sample rate in Hz divided | | | 0b1110 | Uncommon sample rate in Hz divided | | |||
| | | by 10, stored as a 16-bit number | | | | by 10, stored as a 16-bit number | | |||
| +--------+----------------------------------------------------------+ | +--------+----------------------------------------------------------+ | |||
| | 0b1111 | forbidden | | | 0b1111 | Forbidden | | |||
| +--------+----------------------------------------------------------+ | +--------+----------------------------------------------------------+ | |||
| Table 15 | Table 15 | |||
| 9.1.3. Channels bits | 9.1.3. Channels Bits | |||
| The next 4 bits (the first 4 bits of the fourth byte of each frame), | The next 4 bits (the first 4 bits of the fourth byte of each frame), | |||
| referred to as the channels bits, contain both the number of channels | referred to as the channels bits, contain both the number of channels | |||
| of the audio as well as any stereo decorrelation used according to | of the audio as well as any stereo decorrelation used according to | |||
| the following table. | the following table. | |||
| If a channel layout different than the ones listed in the following | If a channel layout different than the ones listed in the following | |||
| table is used, this can be signaled with a | table is used, this can be signaled with a | |||
| WAVEFORMATEXTENSIBLE_CHANNEL_MASK tag in a Vorbis comment metadata | WAVEFORMATEXTENSIBLE_CHANNEL_MASK tag in a Vorbis comment metadata | |||
| block, see Section 8.6.2 for details. Note that even when such a | block; see Section 8.6.2 for details. Note that even when such a | |||
| different channel layout is specified with a | different channel layout is specified with a | |||
| WAVEFORMATEXTENSIBLE_CHANNEL_MASK and the channel ordering in the | WAVEFORMATEXTENSIBLE_CHANNEL_MASK and the channel ordering in the | |||
| following table is overridden, the channels bits still contain the | following table is overridden, the channels bits still contain the | |||
| actual number of channels coded in the frame. For details on the way | actual number of channels coded in the frame. For details on the way | |||
| left/side, right/side, and mid/side stereo are coded, see | left-side, side-right, and mid-side stereo are coded, see | |||
| Section 4.2. | Section 4.2. | |||
| +==========+====================================================+ | +==========+====================================================+ | |||
| | Value | Channels | | | Value | Channels | | |||
| +==========+====================================================+ | +==========+====================================================+ | |||
| | 0b0000 | 1 channel: mono | | | 0b0000 | 1 channel: mono | | |||
| +----------+----------------------------------------------------+ | +----------+----------------------------------------------------+ | |||
| | 0b0001 | 2 channels: left, right | | | 0b0001 | 2 channels: left, right | | |||
| +----------+----------------------------------------------------+ | +----------+----------------------------------------------------+ | |||
| | 0b0010 | 3 channels: left, right, center | | | 0b0010 | 3 channels: left, right, center | | |||
| skipping to change at page 36, line 40 ¶ | skipping to change at line 1581 ¶ | |||
| +----------+----------------------------------------------------+ | +----------+----------------------------------------------------+ | |||
| | 0b0101 | 6 channels: front left, front right, front center, | | | 0b0101 | 6 channels: front left, front right, front center, | | |||
| | | LFE, back/surround left, back/surround right | | | | LFE, back/surround left, back/surround right | | |||
| +----------+----------------------------------------------------+ | +----------+----------------------------------------------------+ | |||
| | 0b0110 | 7 channels: front left, front right, front center, | | | 0b0110 | 7 channels: front left, front right, front center, | | |||
| | | LFE, back center, side left, side right | | | | LFE, back center, side left, side right | | |||
| +----------+----------------------------------------------------+ | +----------+----------------------------------------------------+ | |||
| | 0b0111 | 8 channels: front left, front right, front center, | | | 0b0111 | 8 channels: front left, front right, front center, | | |||
| | | LFE, back left, back right, side left, side right | | | | LFE, back left, back right, side left, side right | | |||
| +----------+----------------------------------------------------+ | +----------+----------------------------------------------------+ | |||
| | 0b1000 | 2 channels, left, right, stored as left/side | | | 0b1000 | 2 channels: left, right; stored as left-side | | |||
| | | stereo | | | | stereo | | |||
| +----------+----------------------------------------------------+ | +----------+----------------------------------------------------+ | |||
| | 0b1001 | 2 channels, left, right, stored as right/side | | | 0b1001 | 2 channels: left, right; stored as side-right | | |||
| | | stereo | | | | stereo | | |||
| +----------+----------------------------------------------------+ | +----------+----------------------------------------------------+ | |||
| | 0b1010 | 2 channels, left, right, stored as mid/side stereo | | | 0b1010 | 2 channels: left, right; stored as mid-side stereo | | |||
| +----------+----------------------------------------------------+ | +----------+----------------------------------------------------+ | |||
| | 0b1011 - | reserved | | | 0b1011 - | Reserved | | |||
| | 0b1111 | | | | 0b1111 | | | |||
| +----------+----------------------------------------------------+ | +----------+----------------------------------------------------+ | |||
| Table 16 | Table 16 | |||
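The channels bits can be interpreted with a small helper such as the one below. This is a sketch only: the function name and the mode strings are hypothetical, and the rule "number of channels equals v + 1 for values 0b0000 to 0b0111" is inferred from the rows shown in the table and assumed to hold for the rows not shown here.

```python
from typing import Tuple


def decode_channels_bits(v: int) -> Tuple[int, str]:
    """Map the 4 channels bits to (channel_count, stereo_mode) per Table 16."""
    if not 0 <= v <= 0b1111:
        raise ValueError("channels bits must be a 4-bit value")
    if v <= 0b0111:
        # Independently coded channels; assumed to be v + 1 channels.
        return v + 1, "independent"
    if v == 0b1000:
        return 2, "left-side"
    if v == 0b1001:
        return 2, "side-right"
    if v == 0b1010:
        return 2, "mid-side"
    raise ValueError("channels bits 0b1011 - 0b1111 are reserved")
```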
| 9.1.4. Bit depth bits | 9.1.4. Bit Depth Bits | |||
| The next 3 bits (bits 5, 6 and 7 of each fourth byte of each frame) | The next 3 bits (bits 5, 6, and 7 of each fourth byte of each frame) | |||
| contain the bit depth of the audio according to the following table. | contain the bit depth of the audio according to the following table. | |||
| The next bit is reserved and MUST be zero. | ||||
| +=======+========================================================+ | +=======+========================================================+ | |||
| | Value | Bit depth | | | Value | Bit Depth | | |||
| +=======+========================================================+ | +=======+========================================================+ | |||
| | 0b000 | bit depth only stored in the streaminfo metadata block | | | 0b000 | Bit depth only stored in the streaminfo metadata block | | |||
| +-------+--------------------------------------------------------+ | +-------+--------------------------------------------------------+ | |||
| | 0b001 | 8 bits per sample | | | 0b001 | 8 bits per sample | | |||
| +-------+--------------------------------------------------------+ | +-------+--------------------------------------------------------+ | |||
| | 0b010 | 12 bits per sample | | | 0b010 | 12 bits per sample | | |||
| +-------+--------------------------------------------------------+ | +-------+--------------------------------------------------------+ | |||
| | 0b011 | reserved | | | 0b011 | Reserved | | |||
| +-------+--------------------------------------------------------+ | +-------+--------------------------------------------------------+ | |||
| | 0b100 | 16 bits per sample | | | 0b100 | 16 bits per sample | | |||
| +-------+--------------------------------------------------------+ | +-------+--------------------------------------------------------+ | |||
| | 0b101 | 20 bits per sample | | | 0b101 | 20 bits per sample | | |||
| +-------+--------------------------------------------------------+ | +-------+--------------------------------------------------------+ | |||
| | 0b110 | 24 bits per sample | | | 0b110 | 24 bits per sample | | |||
| +-------+--------------------------------------------------------+ | +-------+--------------------------------------------------------+ | |||
| | 0b111 | 32 bits per sample | | | 0b111 | 32 bits per sample | | |||
| +-------+--------------------------------------------------------+ | +-------+--------------------------------------------------------+ | |||
| Table 17 | Table 17 | |||
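A possible way to read the bit depth from the fourth byte of a frame is sketched below. The helper name is hypothetical. It assumes the layout described in the text: the first 4 bits of the byte are the channels bits, the next 3 bits are the bit depth bits, and the final bit is the reserved bit that must be zero. A return value of `None` stands for "bit depth only stored in the streaminfo metadata block".

```python
from typing import Optional

# Table 17, excluding 0b000 (see streaminfo) and 0b011 (reserved).
BIT_DEPTHS = {0b001: 8, 0b010: 12, 0b100: 16, 0b101: 20, 0b110: 24, 0b111: 32}


def decode_bit_depth(fourth_byte: int) -> Optional[int]:
    """Return the bit depth coded in the fourth byte of a frame header."""
    if not 0 <= fourth_byte <= 0xFF:
        raise ValueError("expected a single byte value")
    if fourth_byte & 0x01:
        raise ValueError("reserved bit is set")
    bits = (fourth_byte >> 1) & 0b111
    if bits == 0b000:
        return None                   # take the value from streaminfo
    if bits == 0b011:
        raise ValueError("bit depth bits 0b011 are reserved")
    return BIT_DEPTHS[bits]
```

For instance, the byte 0x18 (0b00011000) carries channels bits 0b0001 and bit depth bits 0b100, i.e., 16 bits per sample.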
| The next bit is reserved and MUST be zero. | 9.1.5. Coded Number | |||
| 9.1.5. Coded number | ||||
| Following the reserved bit (starting at the fifth byte of the frame) | Following the reserved bit (starting at the fifth byte of the frame) | |||
| is either a sample or a frame number, which will be referred to as | is either a sample or a frame number, which will be referred to as | |||
| the coded number. When dealing with variable block size streams, the | the coded number. When dealing with variable block size streams, the | |||
| sample number of the first sample in the frame is encoded. When the | sample number of the first sample in the frame is encoded. When the | |||
| file contains a fixed block size stream, the frame number is encoded. | file contains a fixed block size stream, the frame number is encoded. | |||
| See Section 9.1 on the blocking strategy bit which signals whether a | See Section 9.1 on the blocking strategy bit, which signals whether a | |||
| stream is a fixed block size stream or a variable block size stream. | stream is a fixed block size stream or a variable block size stream. | |||
| Also see Appendix B.1. | See also Appendix B.1. | |||
| The coded number is stored in a variable length code like UTF-8 as | The coded number is stored in a variable-length code like UTF-8 as | |||
| defined in [RFC3629], but extended to a maximum of 36 bits unencoded, | defined in [RFC3629] but extended to a maximum of 36 bits unencoded | |||
| 7 bytes encoded. | or 7 bytes encoded. | |||
| When a frame number is encoded, the value MUST NOT be larger than | When a frame number is encoded, the value MUST NOT be larger than | |||
| what fits a value of 31 bits unencoded or 6 bytes encoded. Please | what fits a value of 31 bits unencoded or 6 bytes encoded. Please | |||
| note that as most general purpose UTF-8 encoders and decoders follow | note that as most general purpose UTF-8 encoders and decoders follow | |||
| [RFC3629], they will not be able to handle these extended codes. | [RFC3629], they will not be able to handle these extended codes. | |||
| Furthermore, while UTF-8 is specifically used to encode characters, | Furthermore, while UTF-8 is specifically used to encode characters, | |||
| FLAC uses it to encode numbers instead. To encode or decode a coded | FLAC uses it to encode numbers instead. To encode or decode a coded | |||
| number, follow the procedures of Section 3 of [RFC3629], but instead | number, follow the procedures in Section 3 of [RFC3629], but instead | |||
| of using a character number, use a frame or sample number, and | of using a character number, use a frame or sample number. In | |||
| instead of the table in Section 3 of [RFC3629], use the extended | addition, use the extended table below instead of the table in | |||
| table below. | Section 3 of [RFC3629]. | |||
| +============================+=====================================+ | +============================+=====================================+ | |||
| | Number range (hexadecimal) | Octet sequence (binary) | | | Number Range (Hexadecimal) | Octet Sequence (Binary) | | |||
| +============================+=====================================+ | +============================+=====================================+ | |||
| | 0000 0000 0000 - | 0xxxxxxx | | | 0000 0000 0000 - | 0xxxxxxx | | |||
| | 0000 0000 007F | | | | 0000 0000 007F | | | |||
| +----------------------------+-------------------------------------+ | +----------------------------+-------------------------------------+ | |||
| | 0000 0000 0080 - | 110xxxxx 10xxxxxx | | | 0000 0000 0080 - | 110xxxxx 10xxxxxx | | |||
| | 0000 0000 07FF | | | | 0000 0000 07FF | | | |||
| +----------------------------+-------------------------------------+ | +----------------------------+-------------------------------------+ | |||
| | 0000 0000 0800 - | 1110xxxx 10xxxxxx 10xxxxxx | | | 0000 0000 0800 - | 1110xxxx 10xxxxxx 10xxxxxx | | |||
| | 0000 0000 FFFF | | | | 0000 0000 FFFF | | | |||
| +----------------------------+-------------------------------------+ | +----------------------------+-------------------------------------+ | |||
| skipping to change at page 38, line 45 ¶ | skipping to change at line 1682 ¶ | |||
| +----------------------------+-------------------------------------+ | +----------------------------+-------------------------------------+ | |||
| Table 18 | Table 18 | |||
| If the coded number is a frame number, it MUST be equal to the number | If the coded number is a frame number, it MUST be equal to the number | |||
| of frames preceding the current frame. If the coded number is a | of frames preceding the current frame. If the coded number is a | |||
| sample number, it MUST be equal to the number of samples preceding | sample number, it MUST be equal to the number of samples preceding | |||
| the current frame. In a stream where these requirements are not met, | the current frame. In a stream where these requirements are not met, | |||
| seeking is not (reliably) possible. | seeking is not (reliably) possible. | |||
| For example, a frame that belongs to a variable block size stream and | For example, for a frame that belongs to a variable block size stream | |||
| has exactly 51 billion samples preceding it, has its coded number | and has exactly 51 billion samples preceding it, the coded number is | |||
| constructed as follows. | constructed as follows: | |||
| Octets 1-5 | Octets 1-5 | |||
| 0b11111110 0b10101111 0b10011111 0b10110101 0b10100011 | 0b11111110 0b10101111 0b10011111 0b10110101 0b10100011 | |||
| ^^^^^^ ^^^^^^ ^^^^^^ ^^^^^^ | ^^^^^^ ^^^^^^ ^^^^^^ ^^^^^^ | |||
| | | | Bits 18-13 | | | | Bits 18-13 | |||
| | | Bits 24-19 | | | Bits 24-19 | |||
| | Bits 30-25 | | Bits 30-25 | |||
| Bits 36-31 | Bits 36-31 | |||
| Octets 6-7 | Octets 6-7 | |||
| 0b10111000 0b10000000 | 0b10111000 0b10000000 | |||
| ^^^^^^ ^^^^^^ | ^^^^^^ ^^^^^^ | |||
| | Bits 6-1 | | Bits 6-1 | |||
| Bits 12-7 | Bits 12-7 | |||
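The construction in the example above can be sketched in Python. This is an illustrative sketch, not part of either document version: the function name `encode_coded_number` is hypothetical, and the octet patterns for 4 to 7 octets are assumed to continue the UTF-8-like pattern visible in the first three table rows and in the 7-octet example.

```python
def encode_coded_number(value: int) -> bytes:
    """Encode a frame/sample number with the extended UTF-8-like scheme.

    Assumption: lead octets for 4..7-octet forms continue the pattern
    110xxxxx, 1110xxxx, ... up to 11111110 (36 payload bits in total).
    """
    if value < 0 or value > 0xFFFFFFFFF:
        raise ValueError("coded number must fit in 36 bits")
    if value < 0x80:
        # Single octet: 0xxxxxxx
        return bytes([value])
    # n continuation octets carry 6 bits each; the lead octet carries 6 - n.
    n = 1
    while value >= (1 << (5 * n + 6)):
        n += 1
    # Lead octet: n + 1 one-bits, a zero bit, then the top payload bits.
    prefix = (0xFF << (7 - n)) & 0xFF
    out = [prefix | (value >> (6 * n))]
    # Continuation octets: 10xxxxxx, most significant group first.
    for i in range(n - 1, -1, -1):
        out.append(0x80 | ((value >> (6 * i)) & 0x3F))
    return bytes(out)


# The worked example: 51 billion samples precede the frame.
encoded = encode_coded_number(51_000_000_000)
print(" ".join(f"0b{octet:08b}" for octet in encoded))
```

For 51 billion, the seven octets produced match the bit patterns shown in the example.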
| A decoder that relies on the coded number during seeking could be | A decoder that relies on the coded number during seeking could be | |||
| vulnerable to buffer overflows or getting stuck in an infinite loop | vulnerable to buffer overflows or getting stuck in an infinite loop | |||
| if it seeks in a stream where the coded numbers are not strictly | if it seeks in a stream where the coded numbers are not strictly | |||
| increasing or otherwise not valid. See also Section 12. | increasing or are otherwise not valid. See also Section 11. | |||
| 9.1.6. Uncommon block size | 9.1.6. Uncommon Block Size | |||
| If the block size bits defined earlier in this section were 0b0110 or | If the block size bits defined earlier in this section are 0b0110 or | |||
| 0b0111 (uncommon block size minus 1 stored), this follows the coded | 0b0111 (uncommon block size minus 1 stored), the block size minus 1 | |||
| number as either an 8-bit or a 16-bit unsigned number coded big- | follows the coded number as either an 8-bit or 16-bit unsigned number | |||
| endian. A value of 65535 (corresponding to a block size of 65536) is | coded big-endian. A value of 65535 (corresponding to a block size of | |||
| forbidden and MUST NOT be used, because such a block size cannot be | 65536) is forbidden and MUST NOT be used, because such a block size | |||
| represented in the streaminfo metadata block. A value from 0 up to | cannot be represented in the streaminfo metadata block. A value from | |||
| (and including) 14, which corresponds to a block size from 1 to 15, | 0 up to (and including) 14, which corresponds to a block size from 1 | |||
| is only valid for the last frame in a stream and MUST NOT be used for | to 15, is only valid for the last frame in a stream and MUST NOT be | |||
| any other frame. See also Section 8.2. | used for any other frame. See also Section 8.2. | |||
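As a rough illustration of the rule above, the following Python sketch reads the stored value and converts it to a block size. The function name and the `payload` argument (the octets that directly follow the coded number) are hypothetical. The mapping of 0b0110 to an 8-bit field and 0b0111 to a 16-bit field follows the block size bits table earlier in the section. The check that small block sizes only occur in the last frame is left out, because it needs stream-level context.

```python
def read_uncommon_block_size(block_size_bits: int, payload: bytes) -> int:
    """Return the block size for an 'uncommon block size' frame header."""
    if block_size_bits == 0b0110:
        width = 1  # 8-bit field
    elif block_size_bits == 0b0111:
        width = 2  # 16-bit field
    else:
        raise ValueError("block size bits do not signal an uncommon block size")
    if len(payload) < width:
        raise ValueError("truncated frame header")
    stored = int.from_bytes(payload[:width], "big")
    if stored == 65535:
        # Would mean a block size of 65536, which streaminfo cannot represent.
        raise ValueError("stored value 65535 is forbidden")
    # The field stores the block size minus 1.
    return stored + 1
```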
| 9.1.7. Uncommon sample rate | 9.1.7. Uncommon Sample Rate | |||
| Following the uncommon block size (or the coded number if no uncommon | If the sample rate bits are 0b1100, 0b1101, or 0b1110 (uncommon | |||
| block size is stored) is the sample rate, if the sample rate bits | sample rate stored), the sample rate follows the uncommon block size | |||
| were 0b1100, 0b1101, or 0b1110 (uncommon sample rate stored), as | (or the coded number if no uncommon block size is stored) as either | |||
| either an 8-bit or a 16-bit unsigned number coded big-endian. | an 8-bit or a 16-bit unsigned number coded big-endian. | |||
| The sample rate MUST NOT be 0 when the subframe contains audio. A | The sample rate MUST NOT be 0 when the subframe contains audio. A | |||
| sample rate of 0 MAY be used when non-audio is represented. See | sample rate of 0 MAY be used when non-audio is represented. See | |||
| Section 8.2 for details. | Section 8.2 for details. | |||
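A hedged Python sketch of reading this field follows. The function name is hypothetical, and the units per bit pattern (kHz for 0b1100, Hz for 0b1101, tens of Hz for 0b1110) are taken from the sample rate bits table earlier in the section, which is not repeated here.

```python
def read_uncommon_sample_rate(sample_rate_bits: int, payload: bytes) -> int:
    """Return the sample rate in Hz for an 'uncommon sample rate' header."""
    if sample_rate_bits == 0b1100:
        width, scale = 1, 1000  # 8-bit value in kHz
    elif sample_rate_bits == 0b1101:
        width, scale = 2, 1     # 16-bit value in Hz
    elif sample_rate_bits == 0b1110:
        width, scale = 2, 10    # 16-bit value in Hz divided by 10
    else:
        raise ValueError("sample rate bits do not signal an uncommon rate")
    if len(payload) < width:
        raise ValueError("truncated frame header")
    # A result of 0 is only acceptable for non-audio data; the caller decides.
    return int.from_bytes(payload[:width], "big") * scale
```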
| 9.1.8. Frame header CRC | 9.1.8. Frame Header CRC | |||
| Finally, after either the frame/sample number, an uncommon block | Finally, an 8-bit CRC follows the frame/sample number, an uncommon | |||
| size, or an uncommon sample rate, depending on whether the latter two | block size, or an uncommon sample rate (depending on whether the | |||
| are stored, is an 8-bit CRC. This CRC is initialized with 0 and has | latter two are stored). This CRC is initialized with 0 and has the | |||
| the polynomial x^8 + x^2 + x^1 + x^0. This CRC covers the whole | polynomial x^8 + x^2 + x^1 + x^0. This CRC covers the whole frame | |||
| frame header before the CRC, including the sync code. | header before the CRC, including the sync code. | |||
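A bitwise Python sketch of such a CRC is shown below. The polynomial x^8 + x^2 + x^1 + x^0 corresponds to the constant 0x07 (the x^8 term is implicit) and the register starts at 0, as stated above. Processing bits most-significant first with no final XOR is an assumption of this sketch, since the paragraph does not spell it out, and the name `crc8` is illustrative.

```python
def crc8(data: bytes) -> int:
    """CRC-8 with polynomial 0x07, initial value 0, MSB first, no final XOR."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                # Top bit set: shift out and subtract (XOR) the polynomial.
                crc = ((crc << 1) ^ 0x07) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc
```

With these parameters, running the CRC over the header octets plus the stored CRC octet yields 0, which gives a decoder a compact validity check.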
| 9.2. Subframes | 9.2. Subframes | |||
| Following the frame header are a number of subframes equal to the | Following the frame header are a number of subframes equal to the | |||
| number of audio channels. Note that as subframes contain a bitstream | number of audio channels. Note that subframes contain a bitstream | |||
| that does not necessarily has to be a whole number of bytes, only the | that does not necessarily have to be a whole number of bytes, so only | |||
| first subframe always starts at a byte boundary. | the first subframe starts at a byte boundary. | |||
| 9.2.1. Subframe header | 9.2.1. Subframe Header | |||
| Each subframe starts with a header. The first bit of the header MUST | Each subframe starts with a header. The first bit of the header MUST | |||
| be 0, followed by 6 bits describing which subframe type is used | be 0, followed by 6 bits that describe which subframe type is used | |||
| according to the following table, where v is the value of the 6 bits | according to the following table, where v is the value of the 6 bits | |||
| as an unsigned number. | as an unsigned number. | |||
| +=====================+===========================================+ | +=====================+===========================================+ | |||
| | Value | Subframe type | | | Value | Subframe Type | | |||
| +=====================+===========================================+ | +=====================+===========================================+ | |||
| | 0b000000 | Constant subframe | | | 0b000000 | Constant subframe | | |||
| +---------------------+-------------------------------------------+ | +---------------------+-------------------------------------------+ | |||
| | 0b000001 | Verbatim subframe | | | 0b000001 | Verbatim subframe | | |||
| +---------------------+-------------------------------------------+ | +---------------------+-------------------------------------------+ | |||
| | 0b000010 - 0b000111 | reserved | | | 0b000010 - 0b000111 | Reserved | | |||
| +---------------------+-------------------------------------------+ | +---------------------+-------------------------------------------+ | |||
| | 0b001000 - 0b001100 | Subframe with a fixed predictor of order | | | 0b001000 - 0b001100 | Subframe with a fixed predictor of order | | |||
| | | v-8, i.e., 0, 1, 2, 3 or 4 | | | | v-8; i.e., 0, 1, 2, 3 or 4 | | |||
| +---------------------+-------------------------------------------+ | +---------------------+-------------------------------------------+ | |||
| | 0b001101 - 0b011111 | reserved | | | 0b001101 - 0b011111 | Reserved | | |||
| +---------------------+-------------------------------------------+ | +---------------------+-------------------------------------------+ | |||
| | 0b100000 - 0b111111 | Subframe with a linear predictor of order | | | 0b100000 - 0b111111 | Subframe with a linear predictor of order | | |||
| | | v-31, i.e., 1 through 32 (inclusive) | | | | v-31; i.e., 1 through 32 (inclusive) | | |||
| +---------------------+-------------------------------------------+ | +---------------------+-------------------------------------------+ | |||
| Table 19 | Table 19 | |||
| Following the subframe type bits is a bit that flags whether the | Following the subframe type bits is a bit that flags whether the | |||
| subframe uses any wasted bits (see Section 9.2.2). If it is 0, the | subframe uses any wasted bits (see Section 9.2.2). If the flag bit | |||
| subframe doesn't use any wasted bits and the subframe header is | is 0, the subframe doesn't use any wasted bits and the subframe | |||
| complete. If it is 1, the subframe does use wasted bits and the | header is complete. If the flag bit is 1, the subframe uses wasted | |||
| number of used wasted bits follows unary coded. | bits and the number of used wasted bits minus 1 appears in unary | |||
| form, directly following the flag bit. | ||||
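The header layout described in this section can be sketched as a small Python parser. It is illustrative only: the function name, the tuple it returns, and the choice to feed it an iterable of individual bits (0 or 1) are assumptions made to keep the example self-contained.

```python
def parse_subframe_header(bits):
    """Parse a subframe header; return (kind, predictor_order, wasted_bits)."""
    it = iter(bits)
    if next(it) != 0:
        raise ValueError("first bit of a subframe header must be 0")
    # Next 6 bits: subframe type, read as an unsigned number v.
    v = 0
    for _ in range(6):
        v = (v << 1) | next(it)
    if v == 0b000000:
        kind, order = "constant", 0
    elif v == 0b000001:
        kind, order = "verbatim", 0
    elif 0b001000 <= v <= 0b001100:
        kind, order = "fixed", v - 8
    elif v >= 0b100000:
        kind, order = "lpc", v - 31
    else:
        raise ValueError("reserved subframe type")
    # Wasted bits flag, followed by (k - 1) in unary: zero bits ended by a one.
    wasted = 0
    if next(it) == 1:
        wasted = 1
        while next(it) == 0:
            wasted += 1
    return kind, order, wasted


# 0 | 001010 | 1 | 001 : fixed predictor of order 2 with 3 wasted bits
print(parse_subframe_header([0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1]))
```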
| 9.2.2. Wasted bits per sample | 9.2.2. Wasted Bits per Sample | |||
| Most uncompressed audio file formats can only store audio samples | Most uncompressed audio file formats can only store audio samples | |||
| with a bit depth that is an integer number of bytes. Samples of | with a bit depth that is an integer number of bytes. Samples in | |||
| which the bit depth is not an integer number of bytes are usually | which the bit depth is not an integer number of bytes are usually | |||
| stored in such formats by padding them with least-significant zero | stored in such formats by padding them with least-significant zero | |||
| bits to a bit depth that is an integer number of bytes. For example, | bits to a bit depth that is an integer number of bytes. For example, | |||
| shifting a 14-bit sample left by 2 pads it to a 16-bit sample, which | shifting a 14-bit sample left by 2 pads it to a 16-bit sample, which | |||
| then has two zero least-significant bits. In this specification, | then has two zero least-significant bits. In this specification, | |||
| these least-significant zero bits are referred to as wasted bits per | these least-significant zero bits are referred to as wasted bits per | |||
| sample or simply wasted bits. They are wasted in the sense that they | sample or simply wasted bits. They are wasted in the sense that they | |||
| contain no information, but are stored anyway. | contain no information but are stored anyway. | |||
| The FLAC format can optionally take advantage of these wasted bits by | The FLAC format can optionally take advantage of these wasted bits by | |||
| signaling their presence and coding the subframe without them. To do | signaling their presence and coding the subframe without them. To do | |||
| this, the wasted bits per sample flag in a subframe header is set to | this, the wasted bits per sample flag in a subframe header is set to | |||
| 0 and the number of wasted bits per sample (k) minus 1 follows the | 1 and the number of wasted bits per sample (k) minus 1 follows the | |||
| flag in a unary encoding. For example, if k is 3, 0b001 follows. | flag in a unary encoding. For example, if k is 3, 0b001 follows. | |||
| If k = 0, the wasted bits per sample flag is 0 and no unary coded k | If k = 0, the wasted bits per sample flag is 0 and no unary-coded k | |||
| follows. In this document, if a subframe header signals a certain | follows. In this document, if a subframe header signals a certain | |||
| number of wasted bits, it is said it 'uses' these wasted bits. | number of wasted bits, it is said it "uses" these wasted bits. | |||
| If a subframe uses wasted bits (i.e., k is not equal to 0), samples | If a subframe uses wasted bits (i.e., k is not equal to 0), samples | |||
| are coded ignoring k least-significant bits. For example, if a frame | are coded ignoring k least-significant bits. For example, if a frame | |||
| not employing stereo decorrelation specifies a sample size of 16 bits | not employing stereo decorrelation specifies a sample size of 16 bits | |||
| per sample in the frame header and k of a subframe is 3, samples in | per sample in the frame header and k of a subframe is 3, samples in | |||
| the subframe are coded as 13 bits per sample. For more details, see | the subframe are coded as 13 bits per sample. For more details, see | |||
| Section 9.2.3 on how the bit depth of a subframe is calculated. A | Section 9.2.3 on how the bit depth of a subframe is calculated. A | |||
| decoder MUST add k least-significant zero bits by shifting left | decoder MUST add k least-significant zero bits by shifting left | |||
| (padding) after decoding a subframe sample. If the frame has left/ | (padding) after decoding a subframe sample. If the frame has left- | |||
| side, right/side, or mid/side stereo, a decoder MUST perform padding | side, side-right, or mid-side stereo, a decoder MUST perform padding | |||
| on the subframes before restoring the channels to left and right. | on the subframes before restoring the channels to left and right. | |||
| The number of wasted bits per sample MUST be such that the resulting | The number of wasted bits per sample MUST be such that the resulting | |||
| number of bits per sample (of which the calculation is explained in | number of bits per sample (of which the calculation is explained in | |||
| Section 9.2.3) is larger than zero. | Section 9.2.3) is larger than zero. | |||
| Besides audio files that have a certain number of wasted bits for the | Besides audio files that have a certain number of wasted bits for the | |||
| whole file, there exist audio files in which the number of wasted | whole file, audio files exist in which the number of wasted bits | |||
| bits varies. There are DVD-Audio discs in which blocks of samples | varies. There are DVD-Audio discs in which blocks of samples have | |||
| have had their least-significant bits selectively zeroed to slightly | had their least-significant bits selectively zeroed to slightly | |||
| improve the compression of their otherwise lossless Meridian Lossless | improve the compression of their otherwise lossless Meridian Lossless | |||
| Packing codec, see [MLP]. There are also audio processors like | Packing codec; see [MLP]. There are also audio processors like | |||
| lossyWAV, see [lossyWAV], which zero a number of least-sigificant | lossyWAV (see [lossyWAV]) that zero a number of least-significant | |||
| bits for a block of samples, increasing the compression in a non- | bits for a block of samples, increasing the compression in a non- | |||
| lossless way. Because of this, the number of wasted bits k MAY | lossless way. Because of this, the number of wasted bits k MAY | |||
| change between frames and MAY differ between subframes. If the | change between frames and MAY differ between subframes. If the | |||
| number of wasted bits changes halfway through a subframe (e.g., the | number of wasted bits changes halfway through a subframe (e.g., the | |||
| first part has 2 wasted bits and the second part has 4 wasted bits) | first part has 2 wasted bits and the second part has 4 wasted bits), | |||
| the subframe uses the lowest number of wasted bits, as otherwise non- | the subframe uses the lowest number of wasted bits; otherwise, non- | |||
| zero bits would be discarded and the process would not be lossless. | zero bits would be discarded, and the process would not be lossless. | |||
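One way an encoder might determine k for a block, and a decoder might undo it, is sketched below in Python. The function names are hypothetical. Taking the bitwise OR of all samples and counting its trailing zero bits yields the lowest number of wasted bits shared by every sample, which matches the "lowest number" rule above. Returning 0 for an all-zero block is a choice made for this sketch.

```python
def count_wasted_bits(samples):
    """Number of least-significant zero bits common to all samples."""
    combined = 0
    for s in samples:
        combined |= s
    if combined == 0:
        # All samples are zero; this sketch signals no wasted bits.
        return 0
    # Isolate the lowest set bit and take its position.
    return (combined & -combined).bit_length() - 1


def strip_wasted_bits(samples, k):
    """Encoder side: drop the k zero bits before coding the subframe."""
    return [s >> k for s in samples]


def restore_wasted_bits(samples, k):
    """Decoder side: pad by shifting left after decoding a subframe sample."""
    return [s << k for s in samples]


block = [4, 12, 16, 48]  # first part has 2 wasted bits, second part has 4
k = count_wasted_bits(block)
print(k, strip_wasted_bits(block, k))
```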
| 9.2.3. Constant subframe | 9.2.3. Constant Subframe | |||
| In a constant subframe, only a single sample is stored. This sample | In a constant subframe, only a single sample is stored. This sample | |||
| is stored as an integer number coded big-endian, signed two's | is stored as an integer number coded big-endian, signed two's | |||
| complement. The number of bits used to store this sample depends on | complement. The number of bits used to store this sample depends on | |||
| the bit depth of the current subframe. The bit depth of a subframe | the bit depth of the current subframe. The bit depth of a subframe | |||
| is equal to the bit depth as coded in the frame header (see | is equal to the bit depth as coded in the frame header (see | |||
| Section 9.1.4), minus the number of used wasted bits coded in the | Section 9.1.4) minus the number of used wasted bits coded in the | |||
| subframe header (see Section 9.2.2). If a subframe is a side | subframe header (see Section 9.2.2). If a subframe is a side | |||
| subframe (see Section 4.2), the bit depth of that subframe is | subframe (see Section 4.2), the bit depth of that subframe is | |||
| increased by 1 bit. | increased by 1 bit. | |||
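The bit depth calculation and the two's complement interpretation can be sketched as follows. Both function names are hypothetical, and `raw` stands for the unsigned value obtained by reading the given number of bits big-endian from the bitstream.

```python
def subframe_bit_depth(frame_bit_depth, wasted_bits, is_side_subframe=False):
    """Bit depth of a subframe: frame depth minus wasted bits, plus 1 for side."""
    depth = frame_bit_depth - wasted_bits + (1 if is_side_subframe else 0)
    if depth <= 0:
        raise ValueError("resulting bits per sample must be larger than zero")
    return depth


def to_signed(raw, depth):
    """Interpret an unsigned depth-bit value as signed two's complement."""
    if raw & (1 << (depth - 1)):
        return raw - (1 << depth)
    return raw


# 16-bit frame, 3 wasted bits, no stereo decorrelation: 13 bits per sample
print(subframe_bit_depth(16, 3))
```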
| 9.2.4. Verbatim subframe | 9.2.4. Verbatim Subframe | |||
| A verbatim subframe stores all samples unencoded in sequential order. | A verbatim subframe stores all samples unencoded in sequential order. | |||
| See Section 9.2.3 on how a sample is stored unencoded. The number of | See Section 9.2.3 on how a sample is stored unencoded. The number of | |||
| samples that need to be stored in a subframe is given by the block | samples that need to be stored in a subframe is provided by the block | |||
| size in the frame header. | size in the frame header. | |||
| 9.2.5. Fixed predictor subframe | 9.2.5. Fixed Predictor Subframe | |||
| Five different fixed predictors are defined in the following table, | Five different fixed predictors are defined in the following table, | |||
| one for each prediction order 0 through 4. In the table is also a | one for each prediction order 0 through 4. The table also contains a | |||
| derivation, which explains the rationale for choosing these fixed | derivation that explains the rationale for choosing these fixed | |||
| predictors. | predictors. | |||
| +=======+==================================+======================+ | +=======+==================================+======================+ | |||
| | Order | Prediction | Derivation | | | Order | Prediction | Derivation | | |||
| +=======+==================================+======================+ | +=======+==================================+======================+ | |||
| | 0 | 0 | N/A | | | 0 | 0 | N/A | | |||
| +-------+----------------------------------+----------------------+ | +-------+----------------------------------+----------------------+ | |||
| | 1 | a(n-1) | N/A | | | 1 | a(n-1) | N/A | | |||
| +-------+----------------------------------+----------------------+ | +-------+----------------------------------+----------------------+ | |||
| | 2 | 2 * a(n-1) - a(n-2) | a(n-1) + a'(n-1) | | | 2 | 2 * a(n-1) - a(n-2) | a(n-1) + a'(n-1) | | |||
| +-------+----------------------------------+----------------------+ | +-------+----------------------------------+----------------------+ | |||
| | 3 | 3 * a(n-1) - 3 * a(n-2) + a(n-3) | a(n-1) + a'(n-1) + | | | 3 | 3 * a(n-1) - 3 * a(n-2) + a(n-3) | a(n-1) + a'(n-1) + | | |||
| | | | a''(n-1) | | | | | a''(n-1) | | |||
| +-------+----------------------------------+----------------------+ | +-------+----------------------------------+----------------------+ | |||
| | 4 | 4 * a(n-1) - 6 * a(n-2) + 4 * | a(n-1) + a'(n-1) + | | | 4 | 4 * a(n-1) - 6 * a(n-2) + 4 * | a(n-1) + a'(n-1) + | | |||
| | | a(n-3) - a(n-4) | a''(n-1) + a'''(n-1) | | | | a(n-3) - a(n-4) | a''(n-1) + a'''(n-1) | | |||
| +-------+----------------------------------+----------------------+ | +-------+----------------------------------+----------------------+ | |||
| Table 20 | Table 20 | |||
| Where | Where: | |||
| * n is the number of the sample being predicted. | * n is the number of the sample being predicted. | |||
| * a(n) is the sample being predicted. | * a(n) is the sample being predicted. | |||
| * a(n-1) is the sample before the one being predicted. | * a(n-1) is the sample before the one being predicted. | |||
| * a'(n-1) is the difference between the previous sample and the | * a'(n-1) is the difference between the previous sample and the | |||
| sample before that, i.e., a(n-1) - a(n-2). This is the closest | sample before that, i.e., a(n-1) - a(n-2). This is the closest | |||
| available first-order discrete derivative. | available first-order discrete derivative. | |||
| * a''(n-1) is a'(n-1) - a'(n-2) or the closest available second- | * a''(n-1) is a'(n-1) - a'(n-2) or the closest available second- | |||
| order discrete derivative. | order discrete derivative. | |||
| * a'''(n-1) is a''(n-1) - a''(n-2) or the closest available third- | * a'''(n-1) is a''(n-1) - a''(n-2) or the closest available third- | |||
| order discrete derivative. | order discrete derivative. | |||
| As a predictor makes use of samples preceding the sample that is | As a predictor makes use of samples preceding the sample that is | |||
| predicted, it can only be used when enough samples are known. As | predicted, it can only be used when enough samples are known. As | |||
| each subframe in FLAC is coded completely independently, the first | each subframe in FLAC is coded completely independently, the first | |||
| few samples in each subframe cannot be predicted. Therefore, a | few samples in each subframe cannot be predicted. Therefore, a | |||
| number of so-called warm-up samples equal to the predictor order is | number of so-called warm-up samples equal to the predictor order is | |||
| stored. These are stored unencoded, bypassing the predictor and | stored. These are stored unencoded, bypassing the predictor and | |||
| residual coding stages. See Section 9.2.3 on how samples are stored | residual coding stages. See Section 9.2.3 on how samples are stored | |||
| skipping to change at page 44, line 17 ¶ | skipping to change at line 1912 ¶ | |||
| +==========+===========================================+ | +==========+===========================================+ | |||
| | s(n) | Unencoded warm-up samples (n = subframe's | | | s(n) | Unencoded warm-up samples (n = subframe's | | |||
| | | bits per sample * predictor order). | | | | bits per sample * predictor order). | | |||
| +----------+-------------------------------------------+ | +----------+-------------------------------------------+ | |||
| | Coded | Coded residual as defined in | | | Coded | Coded residual as defined in | | |||
| | residual | Section 9.2.7 | | | residual | Section 9.2.7 | | |||
| +----------+-------------------------------------------+ | +----------+-------------------------------------------+ | |||
| Table 21 | Table 21 | |||
| As the fixed predictors are specified, they do not have to be stored. | Because fixed predictors are specified, they do not have to be | |||
| The fixed predictor order, which is stored in the subframe header, | stored. The fixed predictor order, which is stored in the subframe | |||
| specifies which predictor is used. | header, specifies which predictor is used. | |||
| To encode a signal with a fixed predictor, each sample has the | To encode a signal with a fixed predictor, each sample has the | |||
| corresponding prediction subtracted and sent to the residual coder. | corresponding prediction subtracted and sent to the residual coder. | |||
| To decode a signal with a fixed predictor, the residual is decoded, | To decode a signal with a fixed predictor, the residual is decoded, | |||
| and then the prediction can be added for each sample. This means | and then the prediction can be added for each sample. This means | |||
| that decoding is necessarily a sequential process within a subframe, | that decoding is necessarily a sequential process within a subframe, | |||
| as for each sample, enough fully decoded previous samples are needed | as for each sample, enough fully decoded previous samples are needed | |||
| to calculate the prediction. | to calculate the prediction. | |||
| For fixed predictor order 0, the prediction is always 0, thus each | For fixed predictor order 0, the prediction is always 0; thus, each | |||
| residual sample is equal to its corresponding input or decoded | residual sample is equal to its corresponding input or decoded | |||
| sample. The difference between a fixed predictor with order 0 and a | sample. The difference between a fixed predictor with order 0 and a | |||
| verbatim subframe, is that a verbatim subframe stores all samples | verbatim subframe is that a verbatim subframe stores all samples | |||
| unencoded, while a fixed predictor with order 0 has all its samples | unencoded while a fixed predictor with order 0 has all its samples | |||
| processed by the residual coder. | processed by the residual coder. | |||
| The first order fixed predictor is comparable to how DPCM encoding | The first-order fixed predictor is comparable to how differential | |||
| works, as the resulting residual sample is the difference between the | pulse-code modulation (DPCM) encoding works, as the resulting | |||
| corresponding sample and the sample before it. The higher order | residual sample is the difference between the corresponding sample | |||
| fixed predictors can be understood as polynomials fitted to the | and the sample before it. The higher-order fixed predictors can be | |||
| previous samples. | understood as polynomials fitted to the previous samples. | |||
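The encode and decode steps for fixed predictors can be sketched in Python as below. The coefficient lists restate the Prediction column of Table 20. The function names and the plain-list representation of samples are illustrative assumptions, and the residual coder itself is left out.

```python
# Coefficients for a(n-1), a(n-2), ... per fixed predictor order (Table 20).
FIXED_COEFFICIENTS = {
    0: [],
    1: [1],
    2: [2, -1],
    3: [3, -3, 1],
    4: [4, -6, 4, -1],
}


def fixed_prediction(order, previous):
    """Predict the next sample from the samples that precede it."""
    coeffs = FIXED_COEFFICIENTS[order]
    return sum(c * previous[-1 - i] for i, c in enumerate(coeffs))


def fixed_encode(order, samples):
    """Split a subframe into warm-up samples and residual samples."""
    warmup = list(samples[:order])
    residual = [
        samples[n] - fixed_prediction(order, samples[:n])
        for n in range(order, len(samples))
    ]
    return warmup, residual


def fixed_decode(order, warmup, residual):
    """Rebuild samples sequentially; each needs the ones decoded before it."""
    out = list(warmup)
    for r in residual:
        out.append(fixed_prediction(order, out) + r)
    return out


samples = [3, 5, 8, 12, 17, 23]
print(fixed_encode(2, samples))
```

For this input the second-order predictor leaves a residual of all ones, which is far cheaper to code than the samples themselves.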
| 9.2.6. Linear predictor subframe | 9.2.6. Linear Predictor Subframe | |||
| Whereas fixed predictors are well suited for simple signals, using a | Whereas fixed predictors are well suited for simple signals, using a | |||
| (non-fixed) linear predictor on more complex signals can improve | (non-fixed) linear predictor on more complex signals can improve | |||
| compression by making the residual samples even smaller. There is a | compression by making the residual samples even smaller. There is a | |||
| certain trade-off however, as storing the predictor coefficients | certain trade-off, however, as storing the predictor coefficients | |||
| takes up space as well. | takes up space as well. | |||
| In the FLAC format, a predictor is defined by up to 32 predictor | In the FLAC format, a predictor is defined by up to 32 predictor | |||
| coefficients and a shift. To form a prediction, each coefficient is | coefficients and a shift. To form a prediction, each coefficient is | |||
| multiplied by its corresponding past sample, the results are summed, | multiplied by its corresponding past sample, the results are summed, | |||
| and this sum is then shifted. To encode a signal with a linear | and this sum is then shifted. To encode a signal with a linear | |||
| predictor, each sample has the corresponding prediction subtracted | predictor, each sample has the corresponding prediction subtracted | |||
| and sent to the residual coder. To decode a signal with a linear | and sent to the residual coder. To decode a signal with a linear | |||
| predictor, the residual is decoded, and then the prediction can be | predictor, the residual is decoded, and then the prediction can be | |||
| added for each sample. This means that decoding MUST be a sequential | added for each sample. This means that decoding MUST be a sequential | |||
| process within a subframe, as for each sample, enough decoded samples | process within a subframe, as enough decoded samples are needed to | |||
| are needed to calculate the prediction. | calculate the prediction for each sample. | |||
| The table below defines how a linear predictor subframe appears in | The table below defines how a linear predictor subframe appears in | |||
| the bitstream. | the bitstream. | |||
| +==========+==========================================+ | +==========+==========================================+ | |||
| | Data | Description | | | Data | Description | | |||
| +==========+==========================================+ | +==========+==========================================+ | |||
| | s(n) | Unencoded warm-up samples (n = | | | s(n) | Unencoded warm-up samples (n = | | |||
| | | subframe's bits per sample * lpc order). | | | | subframe's bits per sample * LPC order). | | |||
| +----------+------------------------------------------+ | +----------+------------------------------------------+ | |||
| | u(4) | (Predictor coefficient precision in | | | u(4) | (Predictor coefficient precision in | | |||
| | | bits)-1 (NOTE: 0b1111 is forbidden). | | | | bits)-1 (Note: 0b1111 is forbidden). | | |||
| +----------+------------------------------------------+ | +----------+------------------------------------------+ | |||
| | s(5) | Prediction right shift needed in bits. | | | s(5) | Prediction right shift needed in bits. | | |||
| +----------+------------------------------------------+ | +----------+------------------------------------------+ | |||
| | s(n) | Predictor coefficients (n = predictor | | | s(n) | Predictor coefficients (n = predictor | | |||
| | | coefficient precision * lpc order). | | | | coefficient precision * LPC order). | | |||
| +----------+------------------------------------------+ | +----------+------------------------------------------+ | |||
| | Coded | Coded residual as defined in | | | Coded | Coded residual as defined in | | |||
| | residual | Section 9.2.7 | | | residual | Section 9.2.7. | | |||
| +----------+------------------------------------------+ | +----------+------------------------------------------+ | |||
| Table 22 | Table 22 | |||
| See Section 9.2.3 on how the warm-up samples are stored unencoded. | See Section 9.2.3 on how the warm-up samples are stored unencoded. | |||
| The predictor coefficients are stored as an integer number coded big- | The predictor coefficients are stored as an integer number coded big- | |||
| endian, signed two's complement, where the number of bits needed for | endian, signed two's complement, where the number of bits needed for | |||
| each coefficient is defined by the predictor coefficient precision. | each coefficient is defined by the predictor coefficient precision. | |||
| While the prediction right shift is signed two's complement, this | While the prediction right shift is signed two's complement, this | |||
| number MUST NOT be negative, see Appendix B.4 for an explanation why | number MUST NOT be negative; see Appendix B.4 for an explanation why | |||
| this is. | this is. | |||
| Please note that the order in which the predictor coefficients appear | Please note that the order in which the predictor coefficients appear | |||
| in the bitstream corresponds to which *past* sample they belong to. | in the bitstream corresponds to which *past* sample they belong to. | |||
| In other words, the order of the predictor coefficients is opposite | In other words, the order of the predictor coefficients is opposite | |||
| to the chronological order of the samples. So, the first predictor | to the chronological order of the samples. So, the first predictor | |||
| coefficient has to be multiplied with the sample directly before the | coefficient has to be multiplied with the sample directly before the | |||
| sample that is being predicted, the second predictor coefficient has | sample that is being predicted, the second predictor coefficient has | |||
| to be multiplied with the sample before that, etc. | to be multiplied with the sample before that, etc. | |||
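A minimal Python sketch of linear predictor decoding follows, assuming the warm-up samples, coefficients, shift, and residual have already been read from the bitstream. The function name is hypothetical. Python's `>>` on negative integers is an arithmetic shift, which is what this sketch relies on.

```python
def lpc_decode(warmup, coefficients, shift, residual):
    """Reconstruct samples from warm-up samples, coefficients, and residual."""
    if len(warmup) != len(coefficients):
        raise ValueError("need one warm-up sample per predictor coefficient")
    if shift < 0:
        raise ValueError("prediction right shift must not be negative")
    out = list(warmup)
    for r in residual:
        # coefficients[0] pairs with the most recent sample, coefficients[1]
        # with the one before that, and so on (reverse chronological order).
        acc = sum(c * out[-1 - i] for i, c in enumerate(coefficients))
        out.append((acc >> shift) + r)
    return out


print(lpc_decode([10, 12], [4, -2], 1, [1, -2]))
```

With coefficients [2, -1] and a shift of 0, this reduces to the second-order fixed predictor.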
| 9.2.7. Coded residual | 9.2.7. Coded Residual | |||
| The first two bits in a coded residual indicate which coding method | The first two bits in a coded residual indicate which coding method | |||
| is used. See the table below. | is used. See the table below. | |||
| +=============+=============================================+ | +=============+=============================================+ | |||
| | Value | Description | | | Value | Description | | |||
| +=============+=============================================+ | +=============+=============================================+ | |||
| | 0b00 | partitioned Rice code with 4-bit parameters | | | 0b00 | Partitioned Rice code with 4-bit parameters | | |||
| +-------------+---------------------------------------------+ | +-------------+---------------------------------------------+ | |||
| | 0b01 | partitioned Rice code with 5-bit parameters | | | 0b01 | Partitioned Rice code with 5-bit parameters | | |||
| +-------------+---------------------------------------------+ | +-------------+---------------------------------------------+ | |||
| | 0b10 - 0b11 | reserved | | | 0b10 - 0b11 | Reserved | | |||
| +-------------+---------------------------------------------+ | +-------------+---------------------------------------------+ | |||
| Table 23 | Table 23 | |||
| Both defined coding methods work the same way, but differ in the | Both defined coding methods work the same way but differ in the | |||
| number of bits used for Rice parameters. The 4 bits that directly | number of bits used for Rice parameters. The 4 bits that directly | |||
| follow the coding method bits form the partition order, which is an | follow the coding method bits form the partition order, which is an | |||
| unsigned number. The rest of the coded residual consists of | unsigned number. The rest of the coded residual consists of | |||
| 2^(partition order) partitions. For example, if the 4 bits are | 2^(partition order) partitions. For example, if the 4 bits are | |||
| 0b1000, the partition order is 8 and the residual is split up into | 0b1000, the partition order is 8, and the residual is split up into | |||
| 2^8 = 256 partitions. | 2^8 = 256 partitions. | |||
| Each partition contains a certain number of residual samples. The | Each partition contains a certain number of residual samples. The | |||
| number of residual samples in the first partition is equal to (block | number of residual samples in the first partition is equal to (block | |||
| size >> partition order) - predictor order, i.e., the block size | size >> partition order) - predictor order, i.e., the block size | |||
| divided by the number of partitions minus the predictor order. In | divided by the number of partitions minus the predictor order. In | |||
| all other partitions, the number of residual samples is equal to | all other partitions, the number of residual samples is equal to | |||
| (block size >> partition order). | (block size >> partition order). | |||
| The partition order MUST be such that the block size is evenly | The partition order MUST be such that the block size is evenly | |||
| divisible by the number of partitions. This means, for example, that | divisible by the number of partitions. This means, for example, that | |||
| for all odd block sizes, only partition order 0 is allowed. The | only partition order 0 is allowed for all odd block sizes. The | |||
| partition order also MUST be such that the (block size >> partition | partition order also MUST be such that the (block size >> partition | |||
| order) is larger than the predictor order. This means, for example, | order) is larger than the predictor order. This means, for example, | |||
| that with a block size of 4096 and a predictor order of 4, the | that with a block size of 4096 and a predictor order of 4, the | |||
| partition order cannot be larger than 9. | partition order cannot be larger than 9. | |||
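The partitioning rules above can be sketched as follows. This is a minimal illustration and not part of the specification; the function name and its error messages are assumptions made for this example.

```python
def partition_sample_counts(block_size, predictor_order, partition_order):
    """Return the number of residual samples in each partition.

    Hypothetical helper: it enforces the two constraints stated in the
    text and raises ValueError when either one is violated.
    """
    num_partitions = 1 << partition_order
    # The block size must be evenly divisible by the number of partitions.
    if block_size % num_partitions != 0:
        raise ValueError("block size not divisible by number of partitions")
    per_partition = block_size >> partition_order
    # (block size >> partition order) must be larger than the predictor order.
    if per_partition <= predictor_order:
        raise ValueError("partition order too large for predictor order")
    # The first partition is shortened by the predictor order (warm-up samples).
    first = per_partition - predictor_order
    return [first] + [per_partition] * (num_partitions - 1)
```

With a block size of 4096 and a predictor order of 4, partition order 9 yields one partition of 4 samples followed by 511 partitions of 8 samples, while partition order 10 is rejected.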
| Each partition starts with a parameter. If the coded residual of a | Each partition starts with a parameter. If the coded residual of a | |||
| subframe is one with 4-bit Rice parameters (see the table at the | subframe is one with 4-bit Rice parameters (see Table 23), the first | |||
| start of this section), the first 4 bits of each partition are either | 4 bits of each partition are either a Rice parameter or an escape | |||
| a Rice parameter or an escape code. These 4 bits indicate an escape | code. These 4 bits indicate an escape code if they are 0b1111; | |||
| code if they are 0b1111, otherwise they contain the Rice parameter as | otherwise, they contain the Rice parameter as an unsigned number. If | |||
| an unsigned number. If the coded residual of the current subframe is | the coded residual of the current subframe is one with 5-bit Rice | |||
| one with 5-bit Rice parameters, the first 5 bits of each partition | parameters, the first 5 bits of each partition indicate an escape | |||
| indicate an escape code if they are 0b11111, otherwise, they contain | code if they are 0b11111; otherwise, they contain the Rice parameter | |||
| the Rice parameter as an unsigned number as well. | as an unsigned number as well. | |||
| 9.2.7.1. Escaped partition | 9.2.7.1. Escaped Partition | |||
| If an escape code was used, the partition does not contain a | If an escape code was used, the partition does not contain a | |||
| variable-length Rice coded residual, but a fixed-length unencoded | variable-length Rice-coded residual; rather, it contains a fixed- | |||
| residual. Directly following the escape code are 5 bits containing | length unencoded residual. Directly following the escape code are 5 | |||
| the number of bits with which each residual sample is stored, as an | bits containing the number of bits with which each residual sample is | |||
| unsigned number. The residual samples themselves are stored signed | stored, as an unsigned number. The residual samples themselves are | |||
| two's complement. For example, when a partition is escaped and each | stored signed two's complement. For example, when a partition is | |||
| residual sample is stored with 3 bits, the number -1 is represented | escaped and each residual sample is stored with 3 bits, the number -1 | |||
| as 0b111. | is represented as 0b111. | |||
| Note that it is possible that the number of bits with which each | Note that it is possible that the number of bits with which each | |||
| sample is stored is 0, which means all residual samples in that | sample is stored is 0, which means that all residual samples in that | |||
| partition have a value of 0 and that no bits are used to store the | partition have a value of 0 and that no bits are used to store the | |||
| samples. In that case, the partition contains nothing except the | samples. In that case, the partition contains nothing except the | |||
| escape code and 0b00000. | escape code and 0b00000. | |||
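Reading one sample from an escaped partition could look like the sketch below. The function name is hypothetical, and the bit source is assumed to be a Python iterator that yields the bits as integers 0 or 1, most-significant bit first.

```python
def read_escaped_sample(bits, bits_per_sample):
    """Read one fixed-length signed two's complement residual sample.

    `bits` is assumed to be an iterator of 0/1 integers (an assumption of
    this sketch, not something the format prescribes).
    """
    if bits_per_sample == 0:
        # No bits are stored; every sample in the partition is 0.
        return 0
    value = 0
    for _ in range(bits_per_sample):
        value = (value << 1) | next(bits)
    # If the sign bit is set, subtract 2^n to interpret as two's complement.
    if value >= 1 << (bits_per_sample - 1):
        value -= 1 << bits_per_sample
    return value
```

For the example in the text, the three bits 0b111 decode to -1.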
| 9.2.7.2. Rice code | 9.2.7.2. Rice Code | |||
| If a Rice parameter was provided for a certain partition, that | If a Rice parameter was provided for a certain partition, that | |||
| partition contains a Rice coded residual. The residual samples, | partition contains a Rice-coded residual. The residual samples, | |||
| which are signed numbers, are represented by unsigned numbers in the | which are signed numbers, are represented by unsigned numbers in the | |||
| Rice code. For positive numbers, the representation is the number | Rice code. For positive numbers, the representation is the number | |||
| doubled, for negative numbers, the representation is the number | doubled. For negative numbers, the representation is the number | |||
| multiplied by -2 and has 1 subtracted. This representation of signed | multiplied by -2 and with 1 subtracted. This representation of | |||
| numbers is also known as zigzag encoding. The zigzag encoded | signed numbers is also known as zigzag encoding. The zigzag-encoded | |||
| residual is called the folded residual. | residual is called the folded residual. | |||
| Each folded residual sample is then split into two parts, a most- | Each folded residual sample is then split into two parts, a most- | |||
| significant part and a least-significant part. The Rice parameter at | significant part and a least-significant part. The Rice parameter at | |||
| the start of each partition determines where that split lies: it is | the start of each partition determines where that split lies: it is | |||
| the number of bits in the least-significant part. Each residual | the number of bits in the least-significant part. Each residual | |||
| sample is then stored by coding the most-significant part as unary, | sample is then stored by coding the most-significant part as unary, | |||
| followed by the least-significant part as binary. | followed by the least-significant part as binary. | |||
| For example, take a partition with Rice parameter 3 containing a | For example, take a partition with Rice parameter 3 containing a | |||
| folded residual sample with 38 as its value, which is 0b100110 in | folded residual sample with 38 as its value, which is 0b100110 in | |||
| binary. The most-significant part is 0b100 (4) and is stored unary | binary. The most-significant part is 0b100 (4) and is stored in | |||
| as 0b00001. The least-significant part is 0b110 (6) and is stored as | unary form as 0b00001. The least-significant part is 0b110 (6) and | |||
| is. The Rice code word is thus 0b00001110. The Rice code words for | is stored as is. The Rice code word is thus 0b00001110. The Rice | |||
| all residual samples in a partition are stored consecutively. | code words for all residual samples in a partition are stored | |||
| | consecutively. | |||
| To decode a Rice code word, zero bits must be counted until | To decode a Rice code word, zero bits must be counted until | |||
| encountering a one bit, after which a number of bits given by the | encountering a one bit, after which a number of bits given by the | |||
| Rice parameter must be read. The count of zero bits is shifted left | Rice parameter must be read. The count of zero bits is shifted left | |||
| by the Rice parameter (i.e., multiplied by 2 raised to the power Rice | by the Rice parameter (i.e., multiplied by 2 raised to the power Rice | |||
| parameter) and bitwise ORed with (i.e., added to) the read value. | parameter) and bitwise ORed with (i.e., added to) the read value. | |||
| This is the folded residual value. An even folded residual value is | This is the folded residual value. An even folded residual value is | |||
| shifted right 1 bit (i.e., divided by two) to get the (unfolded) | shifted right 1 bit (i.e., divided by 2) to get the (unfolded) | |||
| residual value. An odd folded residual value is shifted right 1 bit | residual value. An odd folded residual value is shifted right 1 bit | |||
| and then has all bits flipped (1 added to and divided by -2) to get | and then has all bits flipped (1 added to and divided by -2) to get | |||
| the (unfolded) residual value, subject to negative numbers being | the (unfolded) residual value, subject to negative numbers being | |||
| signed two's complement on the decoding machine. | signed two's complement on the decoding machine. | |||
| Appendix D shows decoding of a complete coded residual. | Appendix D shows decoding of a complete coded residual. | |||
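The folding, encoding, and decoding steps described in this section can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, code words are represented as strings or iterators of bits for readability, and Python's unbounded integers stand in for the two's complement machine integers mentioned in the text.

```python
def fold(residual):
    """Zigzag-encode a signed residual into an unsigned folded residual."""
    return residual * 2 if residual >= 0 else residual * -2 - 1

def unfold(folded):
    """Invert fold(): even values halve, odd values halve and flip all bits."""
    return folded >> 1 if folded % 2 == 0 else ~(folded >> 1)

def rice_encode(residual, rice_parameter):
    """Return the Rice code word for one residual as a string of '0'/'1'."""
    folded = fold(residual)
    # Most-significant part in unary: that many zero bits, then a one bit.
    bits = "0" * (folded >> rice_parameter) + "1"
    if rice_parameter > 0:
        # Least-significant part stored as is, in rice_parameter bits.
        lsb = folded & ((1 << rice_parameter) - 1)
        bits += format(lsb, f"0{rice_parameter}b")
    return bits

def rice_decode(bits, rice_parameter):
    """Decode one residual from an iterator of 0/1 integers."""
    zeros = 0
    while next(bits) == 0:
        zeros += 1
    lsb = 0
    for _ in range(rice_parameter):
        lsb = (lsb << 1) | next(bits)
    return unfold((zeros << rice_parameter) | lsb)
```

The folded value 38 from the example corresponds to a residual of 19, so `rice_encode(19, 3)` produces the code word 00001110, and decoding those bits returns 19.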
| 9.2.7.3. Residual sample value limit | 9.2.7.3. Residual Sample Value Limit | |||
| All residual sample values MUST be representable in the range offered | All residual sample values MUST be representable in the range offered | |||
| by a 32-bit integer, signed one's complement. Equivalently, all | by a 32-bit integer, signed one's complement. Equivalently, all | |||
| residual sample values MUST fall in the range offered by a 32-bit | residual sample values MUST fall in the range offered by a 32-bit | |||
| integer signed two's complement excluding the most negative possible | integer signed two's complement, excluding the most negative possible | |||
| value of that range. This means residual sample values MUST NOT have | value of that range. This means residual sample values MUST NOT have | |||
| an absolute value equal to, or larger than, 2 to the power 31. A | an absolute value equal to, or larger than, 2 to the power 31. A | |||
| FLAC encoder MUST make sure of this. If a FLAC encoder is, for a | FLAC encoder MUST make sure of this. If a FLAC encoder is, for a | |||
| certain subframe, unable to find a suitable predictor for which all | certain subframe, unable to find a suitable predictor for which all | |||
| residual samples fall within said range, it MUST default to writing a | residual samples fall within said range, it MUST default to writing a | |||
| verbatim subframe. Appendix A explains in which circumstances | verbatim subframe. Appendix A explains in which circumstances | |||
| residual samples are already implicitly representable in said range | residual samples are already implicitly representable in said range; | |||
| and thus an additional check is not needed. | thus, an additional check is not needed. | |||
| The reason for this limit is to ensure that decoders can use 32-bit | The reason for this limit is to ensure that decoders can use 32-bit | |||
| integers when processing residuals, simplifying decoding. The reason | integers when processing residuals, simplifying decoding. The reason | |||
| the most negative value of a 32-bit int signed two's complement is | the most negative value of a 32-bit integer signed two's complement | |||
| specifically excluded is to prevent decoders from having to implement | is specifically excluded is to prevent decoders from having to | |||
| specific handling of that value, as it cannot be negated within a | implement specific handling of that value, as it cannot be negated | |||
| 32-bit signed int, and most library routines calculating an absolute | within a 32-bit signed integer, and most library routines calculating | |||
| value have undefined behavior on processing that value. | an absolute value have undefined behavior for processing that value. | |||
| 9.3. Frame footer | 9.3. Frame Footer | |||
| Following the last subframe is the frame footer. If the last | Following the last subframe is the frame footer. If the last | |||
| subframe is not byte aligned (i.e., the number of bits required to | subframe is not byte aligned (i.e., the number of bits required to | |||
| store all subframes put together is not divisible by 8), zero bits | store all subframes put together is not divisible by 8), zero bits | |||
| are added until byte alignment is reached. Following this is a | are added until byte alignment is reached. Following this is a | |||
| 16-bit CRC, initialized with 0, with the polynomial x^16 + x^15 + x^2 | 16-bit CRC, initialized with 0, with the polynomial x^16 + x^15 + x^2 | |||
| + x^0. This CRC covers the whole frame excluding the 16-bit CRC, | + x^0. This CRC covers the whole frame, excluding the 16-bit CRC but | |||
| including the sync code. | including the sync code. | |||
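A bit-by-bit computation of this CRC might look like the sketch below. The polynomial and the initial value of 0 come from the text above; processing each byte most-significant bit first, without reflection or a final XOR, is not stated in this paragraph and is an assumption of this sketch. The function name is hypothetical.

```python
def flac_crc16(data):
    """CRC-16 with polynomial x^16 + x^15 + x^2 + x^0, initialized with 0.

    Assumption of this sketch: bits are processed MSB first, with no
    reflection and no final XOR.
    """
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                # 0x8005 encodes x^15 + x^2 + x^0; the x^16 term is implicit.
                crc = ((crc << 1) ^ 0x8005) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc
```

The input would be every byte of the frame from the sync code up to, but not including, the CRC itself. Production decoders typically use a table-driven variant for speed; the loop above favors readability.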
| 10. Container mappings | 10. Container Mappings | |||
| The FLAC format can be used without any container, as it already | The FLAC format can be used without any container, as it already | |||
| provides for the most basic features normally associated with a | provides for the most basic features normally associated with a | |||
| container. However, the functionality this basic container provides | container. However, the functionality this basic container provides | |||
| is rather limited, and for more advanced features, like combining | is rather limited, and for more advanced features (such as combining | |||
| FLAC audio with video, it needs to be encapsulated by a more capable | FLAC audio with video), it needs to be encapsulated by a more capable | |||
| container. This presents a problem: because of these container | container. This presents a problem: because of these container | |||
| features, the FLAC format mixes data that belongs to the encoded data | features, the FLAC format mixes data that belongs to the encoded data | |||
| (like block size and sample rate) with data that belongs to the | (like block size and sample rate) with data that belongs to the | |||
| container (like checksum and timecode). The choice was made to | container (like checksum and timecode). The choice was made to | |||
| encapsulate FLAC frames as they are, which means some data will be | encapsulate FLAC frames as they are, which means some data will be | |||
| duplicated and potentially deviating between the FLAC frames and the | duplicated and potentially deviating between the FLAC frames and the | |||
| encapsulating container. | encapsulating container. | |||
| As FLAC frames are completely independent of each other, container | As FLAC frames are completely independent of each other, container | |||
| format features handling dependencies do not need to be used. For | format features handling dependencies do not need to be used. For | |||
| example, all FLAC frames embedded in Matroska are marked as keyframes | example, all FLAC frames embedded in Matroska are marked as keyframes | |||
| when they are stored in a SimpleBlock, and tracks in an MP4 file | when they are stored in a SimpleBlock, and tracks in an MP4 file | |||
| containing only FLAC frames do not need a sync sample box. | containing only FLAC frames do not need a sync sample box. | |||
| 10.1. Ogg mapping | 10.1. Ogg Mapping | |||
| The Ogg container format is defined in [RFC3533]. The first packet | The Ogg container format is defined in [RFC3533]. The first packet | |||
| of a logical bitstream carrying FLAC data is structured according to | of a logical bitstream carrying FLAC data is structured according to | |||
| the following table. | the following table. | |||
| +=========+=========================================================+ | +=========+=========================================================+ | |||
| | Data | Description | | | Data | Description | | |||
| +=========+=========================================================+ | +=========+=========================================================+ | |||
| | 5 | Bytes 0x7F 0x46 0x4C 0x41 0x43 (as also defined by | | | 5 | Bytes 0x7F 0x46 0x4C 0x41 0x43 (as also defined by | | |||
| | bytes | [RFC5334]) | | | bytes | [RFC5334]). | | |||
| +---------+---------------------------------------------------------+ | +---------+---------------------------------------------------------+ | |||
| | 2 | Version number of the FLAC-in-Ogg mapping. These bytes | | | 2 | Version number of the FLAC-in-Ogg mapping. These bytes | | |||
| | bytes | are 0x01 0x00, meaning version 1.0 of the mapping. | | | bytes | are 0x01 0x00, meaning version 1.0 of the mapping. | | |||
| +---------+---------------------------------------------------------+ | +---------+---------------------------------------------------------+ | |||
| | 2 | Number of header packets (excluding the first header | | | 2 | Number of header packets (excluding the first header | | |||
| | bytes | packet) as an unsigned number coded big-endian. | | | bytes | packet) as an unsigned number coded big-endian. | | |||
| +---------+---------------------------------------------------------+ | +---------+---------------------------------------------------------+ | |||
| | 4 | The fLaC signature | | | 4 | The fLaC signature. | | |||
| | bytes | | | | bytes | | | |||
| +---------+---------------------------------------------------------+ | +---------+---------------------------------------------------------+ | |||
| | 4 | A metadata block header for the streaminfo block | | | 4 | A metadata block header for the streaminfo metadata | | |||
| | bytes | | | | bytes | block. | | |||
| +---------+---------------------------------------------------------+ | +---------+---------------------------------------------------------+ | |||
| | 34 | A streaminfo metadata block | | | 34 | A streaminfo metadata block. | | |||
| | bytes | | | | bytes | | | |||
| +---------+---------------------------------------------------------+ | +---------+---------------------------------------------------------+ | |||
| Table 24 | Table 24 | |||
| The number of header packets MAY be 0, which means the number of | The number of header packets MAY be 0, which means the number of | |||
| packets that follow is unknown. This first packet MUST NOT share a | packets that follow is unknown. This first packet MUST NOT share a | |||
| Ogg page with any other packets. This means the first page of a | Ogg page with any other packets. This means the first page of a | |||
| logical stream of FLAC-in-Ogg is always 79 bytes. | logical stream of FLAC-in-Ogg is always 79 bytes. | |||
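The layout in Table 24 can be assembled as in the sketch below. The function name and parameters are hypothetical. The metadata block header layout used here (a last-block flag bit, a 7-bit block type that is 0 for streaminfo, and a 24-bit big-endian length) follows the metadata block header definition elsewhere in this document, and the streaminfo contents are taken as an opaque 34-byte input.

```python
import struct

def flac_ogg_first_packet(streaminfo, num_header_packets=0, is_last=False):
    """Build the first packet of a FLAC-in-Ogg logical bitstream.

    Hypothetical helper for illustration; `streaminfo` must be the 34-byte
    body of a streaminfo metadata block.
    """
    if len(streaminfo) != 34:
        raise ValueError("streaminfo metadata block must be 34 bytes")
    if not 0 <= num_header_packets <= 0xFFFF:
        raise ValueError("number of header packets must fit in 2 bytes")
    # Last-block flag in the top bit, block type 0 (streaminfo) in the
    # lower 7 bits, followed by the 24-bit length of the block body.
    block_header = bytes([0x80 if is_last else 0x00]) + (34).to_bytes(3, "big")
    return (
        b"\x7fFLAC"                               # 0x7F 0x46 0x4C 0x41 0x43
        + b"\x01\x00"                             # mapping version 1.0
        + struct.pack(">H", num_header_packets)   # big-endian count
        + b"fLaC"                                 # the fLaC signature
        + block_header
        + streaminfo
    )
```

The resulting packet is 51 bytes long. An Ogg page carrying only this packet has a 27-byte page header and a 1-byte segment table, which accounts for the 79-byte first page.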
| Following the first packet are one or more header packets, each of | Following the first packet are one or more header packets, each of | |||
| which contains a single metadata block. The first of these packets | which contains a single metadata block. The first of these packets | |||
| SHOULD be a Vorbis comment metadata block, for historic reasons. | SHOULD be a Vorbis comment metadata block for historic reasons. This | |||
| This is contrary to unencapsulated FLAC streams, where the order of | is contrary to unencapsulated FLAC streams, where the order of | |||
| metadata blocks is not important except for the streaminfo block and | metadata blocks is not important except for the streaminfo metadata | |||
| where a Vorbis comment metadata block is optional. | block and where a Vorbis comment metadata block is optional. | |||
| Following the header packets are audio packets. Each audio packet | Following the header packets are audio packets. Each audio packet | |||
| contains a single FLAC frame. The first audio packet MUST start on a | contains a single FLAC frame. The first audio packet MUST start on a | |||
| new Ogg page, i.e., the last metadata block MUST finish its page | new Ogg page, i.e., the last metadata block MUST finish its page | |||
| before any audio packets are encapsulated. | before any audio packets are encapsulated. | |||
| The granule position of all pages containing header packets MUST be | The granule position of all pages containing header packets MUST be | |||
| 0. For pages containing audio packets, the granule position is the | 0. For pages containing audio packets, the granule position is the | |||
| number of the last sample contained in the last completed packet in | number of the last sample contained in the last completed packet in | |||
| the frame. The sample numbering considers interchannel samples. If | the frame. The sample numbering considers interchannel samples. If | |||
| a page contains no packet end (e.g., when it only contains the start | a page contains no packet end (e.g., when it only contains the start | |||
| of a large packet, which continues on the next page), then the | of a large packet that continues on the next page), then the granule | |||
| granule position is set to the maximum value possible, i.e., 0xFF | position is set to the maximum value possible, i.e., 0xFF 0xFF 0xFF | |||
| 0xFF 0xFF 0xFF 0xFF 0xFF 0xFF 0xFF. | 0xFF 0xFF 0xFF 0xFF 0xFF. | |||
| The granule position of the first audio data page with a completed | The granule position of the first audio data page with a completed | |||
| packet MAY be larger than the number of samples contained in packets | packet MAY be larger than the number of samples contained in packets | |||
| that complete on that page. In other words, the apparent sample | that complete on that page. In other words, the apparent sample | |||
| number of the first sample in the stream following from the granule | number of the first sample in the stream following from the granule | |||
| position and the audio data MAY be larger than 0. This allows, for | position and the audio data MAY be larger than 0. This allows, for | |||
| example, a server to cast a live stream to several clients that | example, a server to cast a live stream to several clients that | |||
| joined at different moments, without rewriting the granule position | joined at different moments without rewriting the granule position | |||
| for each client. | for each client. | |||
| If an audio stream is encoded where audio properties (sample rate, | If an audio stream is encoded where audio properties (sample rate, | |||
| number of channels, or bit depth) change at some point in the stream, | number of channels, or bit depth) change at some point in the stream, | |||
| this should be dealt with by finishing encoding of the current Ogg | this should be dealt with by finishing encoding of the current Ogg | |||
| stream and starting a new Ogg stream, concatenated to the previous | stream and starting a new Ogg stream, concatenated to the previous | |||
| one. This is called chaining in Ogg. See the Ogg specification | one. This is called chaining in Ogg. See the Ogg specification | |||
| [RFC3533] for details. | [RFC3533] for details. | |||
| 10.2. Matroska mapping | 10.2. Matroska Mapping | |||
| The Matroska container format is defined in | The Matroska container format is defined in [RFC9559]. The codec ID | |||
| [I-D.ietf-cellar-matroska]. The codec ID (EBML path | (EBML path \Segment\Tracks\TrackEntry\CodecID) assigned to signal | |||
| \Segment\Tracks\TrackEntry\CodecID) assigned to signal tracks | tracks carrying FLAC data is A_FLAC in ASCII. All FLAC data before | |||
| carrying FLAC data is A_FLAC in ASCII. All FLAC data before the | the first audio frame (i.e., the fLaC ASCII signature and all | |||
| first audio frame (i.e., the fLaC ASCII signature and all metadata | metadata blocks) is stored as CodecPrivate data (EBML path | |||
| blocks) is stored as CodecPrivate data (EBML path | ||||
| \Segment\Tracks\TrackEntry\CodecPrivate). | \Segment\Tracks\TrackEntry\CodecPrivate). | |||
| Each FLAC frame (including all of its subframes) is treated as a | Each FLAC frame (including all of its subframes) is treated as a | |||
| single frame in the Matroska context. | single frame in the context of Matroska. | |||
| If an audio stream is encoded where audio properties (sample rate, | If an audio stream is encoded where audio properties (sample rate, | |||
| number of channels, or bit depth) change at some point in the stream, | number of channels, or bit depth) change at some point in the stream, | |||
| this should be dealt with by finishing the current Matroska segment | this should be dealt with by finishing the current Matroska segment | |||
| and starting a new one with the new properties. | and starting a new one with the new properties. | |||
| 10.3. ISO Base Media File Format (MP4) mapping | 10.3. ISO Base Media File Format (MP4) Mapping | |||
| The full encapsulation definition of FLAC audio in MP4 files was | The full encapsulation definition of FLAC audio in MP4 files was | |||
| deemed too extensive to include in this document. A definition | deemed too extensive to include in this document. A definition | |||
| document can be found at [FLAC-in-MP4-specification]. | document can be found at [FLAC-in-MP4-specification]. | |||
| 11. Implementation status | 11. Security Considerations | |||
| Note to RFC Editor - please remove this entire section before | ||||
| publication, as well as the reference to RFC 7942. | ||||
| This section records the status of known implementations of the FLAC | ||||
| format, and is based on a proposal described in [RFC7942]. Please | ||||
| note that the listing of any individual implementation here does not | ||||
| imply endorsement by the IETF. Furthermore, no effort has been spent | ||||
| to verify the information presented here that was supplied by IETF | ||||
| contributors. This is not intended as, and must not be construed to | ||||
| be, a catalog of available implementations or their features. | ||||
| Readers are advised to note that other implementations may exist. | ||||
| A reference encoder and decoder implementation of the FLAC format | ||||
| exists, known as libFLAC, maintained by Xiph.Org. It can be found at | ||||
| https://xiph.org/flac/. Note that while all | ||||
| libFLAC components are licensed under 3-clause BSD, the flac and | ||||
| metaflac command line tools often supplied together with libFLAC are | ||||
| licensed under GPL. | ||||
| Another completely independent implementation of both encoder and | ||||
| decoder of the FLAC format is available in libavcodec, maintained by | ||||
| FFmpeg, licensed under LGPL 2.1 or later. It can be found at | ||||
| https://ffmpeg.org/. | ||||
| A list of other implementations and an overview of which parts of the | ||||
| format they implement can be found at [FLAC-wiki-implementations]. | ||||
| 12. Security Considerations | ||||
| Like any other codec (such as [RFC6716]), FLAC should not be used | Like any other codec (such as [RFC6716]), FLAC should not be used | |||
| with insecure ciphers or cipher modes that are vulnerable to known | with insecure ciphers or cipher modes that are vulnerable to known | |||
| plaintext attacks. Some of the header bits as well as the padding | plaintext attacks. Some of the header bits, as well as the padding, | |||
| are easily predictable. | are easily predictable. | |||
| Implementations of the FLAC codec need to take appropriate security | Implementations of the FLAC codec need to take appropriate security | |||
| considerations into account. Section 2.1 of [RFC4732] provides | considerations into account. Section 2.1 of [RFC4732] provides | |||
| general information on DoS attacks on end-systems and describes some | general information on DoS attacks on end systems and describes some | |||
| mitigation strategies. Areas of concern specific to FLAC follow. | mitigation strategies. Areas of concern specific to FLAC follow. | |||
| It is extremely important for the decoder to be robust against | It is extremely important for the decoder to be robust against | |||
| malformed payloads. Payloads that do not conform to this | malformed payloads. Payloads that do not conform to this | |||
| specification MUST NOT cause the decoder to overrun its allocated | specification MUST NOT cause the decoder to overrun its allocated | |||
| memory or take an excessive amount of resources to decode. An | memory or take an excessive amount of resources to decode. An | |||
| overrun in allocated memory could lead to arbitrary code execution by | overrun in allocated memory could lead to arbitrary code execution by | |||
| an attacker. The same applies to the encoder, even though problems | an attacker. The same applies to the encoder, even though problems | |||
| with encoders are typically rarer. Malformed audio streams MUST NOT | with encoders are typically rarer. Malformed audio streams MUST NOT | |||
| cause the encoder to misbehave because this would allow an attacker | cause the encoder to misbehave because this would allow an attacker | |||
| to attack transcoding gateways. | to attack transcoding gateways. | |||
| As with all compression algorithms, both encoding and decoding can | As with all compression algorithms, both encoding and decoding can | |||
| produce an output much larger than the input. For decoding, the most | produce an output much larger than the input. For decoding, the most | |||
| extreme possible case of this is a frame with eight constant | extreme possible case of this is a frame with eight constant | |||
| subframes of block size 65535 and coding for 32-bit PCM. This frame | subframes of block size 65535 and coding for 32-bit PCM. This frame | |||
| is only 49 bytes in size, but codes for more than 2 megabytes of | is only 49 bytes in size but codes for more than 2 megabytes of | |||
| uncompressed PCM data. For encoding, it is possible to have an even | uncompressed PCM data. For encoding, it is possible to have an even | |||
| larger size increase, although such behavior is generally considered | larger size increase, although such behavior is generally considered | |||
| faulty. This happens if the encoder chooses a rice parameter that | faulty. This happens if the encoder chooses a Rice parameter that | |||
| does not fit with the residual that has to be encoded. In such a | does not fit with the residual that has to be encoded. In such a | |||
| case, very long unary coded symbols can appear, in the most extreme | case, very long unary-coded symbols can appear (in the most extreme | |||
| case, more than 4 gigabytes per sample. Decoder and encoder | case, more than 4 gigabytes per sample). Decoder and encoder | |||
| implementors are advised to take precautions to prevent excessive | implementors are advised to take precautions to prevent excessive | |||
| resource utilization in such cases. | resource utilization in such cases. | |||
| Where metadata is handled, implementors are advised to either | Where metadata is handled, implementors are advised to either | |||
| thoroughly test the handling of extreme cases or impose reasonable | thoroughly test the handling of extreme cases or impose reasonable | |||
| limits beyond the limits of this specification document. For | limits beyond the limits of this specification. For example, a | |||
| example, a single Vorbis comment metadata block can contain millions | single Vorbis comment metadata block can contain millions of valid | |||
| of valid fields. It is unlikely such a limit is ever reached except | fields. It is unlikely such a limit is ever reached except in a | |||
| in a potentially malicious file. Likewise, the media type and | potentially malicious file. Likewise, the media type and description | |||
| description of a picture metadata block can be millions of characters | of a picture metadata block can be millions of characters long, | |||
| long, despite there being no reasonable use of such contents. One | despite there being no reasonable use of such contents. One possible | |||
| possible use case for very long character strings is in lyrics, which | use case for very long character strings is in lyrics, which can be | |||
| can be stored in Vorbis comment metadata block fields. | stored in Vorbis comment metadata block fields. | |||
| Various kinds of metadata blocks contain length fields or field | Various kinds of metadata blocks contain length fields or field | |||
| counts. While reading a block following these lengths or counts, a | counts. While reading a block following these lengths or counts, a | |||
| decoder MUST make sure higher-level lengths or counts (most | decoder MUST make sure higher-level lengths or counts (most | |||
| importantly, the length field of the metadata block itself) are not | importantly, the length field of the metadata block itself) are not | |||
| exceeded. As some of these length fields code string lengths, memory | exceeded. As some of these length fields code string lengths and | |||
| for which must be allocated, parsers MUST first verify that a block | memory must be allocated for that, parsers MUST first verify that a | |||
| is valid before allocating memory based on its contents, except when | block is valid before allocating memory based on its contents, except | |||
| explicitly instructed to salvage data from a malformed file. | when explicitly instructed to salvage data from a malformed file. | |||
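
A minimal sketch of the "verify before allocating" rule, using a Vorbis comment block as the example. It assumes the block body has already been bounded by the metadata block's own length field and that the nested lengths and the field count are 32-bit little-endian values. The function name, the exception type, and the max_fields limit are hypothetical choices made for illustration.

```python
import struct


class MalformedBlock(ValueError):
    """Raised when a nested length or count would exceed the block."""


def count_vorbis_fields(body: bytes, max_fields: int = 10_000) -> int:
    """Walk all length fields of a Vorbis comment body without copying strings.

    Returns the number of fields if every nested length fits inside the
    block; raises MalformedBlock otherwise. Nothing is allocated per field.
    """
    pos = 0
    end = len(body)  # outer bound: the length of the metadata block itself

    def read_u32le() -> int:
        nonlocal pos
        if end - pos < 4:
            raise MalformedBlock("length field runs past the end of the block")
        (value,) = struct.unpack_from("<I", body, pos)
        pos += 4
        return value

    def skip(length: int) -> None:
        nonlocal pos
        if length > end - pos:
            raise MalformedBlock("string length exceeds the block length")
        pos += length

    skip(read_u32le())      # vendor string
    count = read_u32le()    # number of fields
    # Every field needs at least a 4-byte length, which bounds the count.
    if count > max_fields or count > (end - pos) // 4:
        raise MalformedBlock("implausible field count")
    for _ in range(count):
        skip(read_u32le())  # one field: length, then contents
    return count
```

Only after this pass succeeds would a parser allocate storage for the individual strings. The max_fields limit is an example of a "reasonable limit beyond the limits of this specification" and would need tuning for applications that store long lyrics or many tags.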
| Metadata blocks can also contain references, e.g., the picture | Metadata blocks can also contain references, e.g., the picture | |||
| metadata block can contain a URI. When following an URI, the | metadata block can contain a URI. When following a URI, the security | |||
| security considerations of [RFC3986] apply. Applications MUST obtain | considerations of [RFC3986] apply. Applications MUST obtain explicit | |||
| explicit user approval to retrieve resources via remote protocols. | user approval to retrieve resources via remote protocols. Following | |||
| external URIs introduces a tracking risk from on-path observers and | ||||
| Following external URIs introduces a tracking risk from on-path | the operator of the service hosting the URI. Likewise, the choice of | |||
| observers and the operator of the service hosting the URI. Likewise, | scheme, if it isn't protected like https, could also introduce | |||
| the choice of scheme, if it isn’t protected like https, could also | integrity attacks by an on-path observer. A malicious operator of | |||
| introduce integrity attacks by an on-path observer. A malicious | the service hosting the URI can return arbitrary content that the | |||
| operator of the service hosting the URI can return arbitrary content | parser will read. Also, such retrievals can be used in a DDoS attack | |||
| that the parser will read. Also, such retrievals can be used in a | when the URI points to a potential victim. Therefore, applications | |||
| DDoS attack when the URI points to a potential victim. Therefore, | need to ask user approval for each retrieval individually, take extra | |||
| applications need to ask user approval for each retrieval | precautions when parsing retrieved data, and cache retrieved | |||
| individually, take extra precautions when parsing retrieved data, and | resources. Applications MUST obtain explicit user approval to | |||
| cache retrieved resources. Applications MUST obtain explicit user | retrieve local resources not located in the same directory as the | |||
| approval to retrieve local resources not located in the same | FLAC file being processed. Since relative URIs are permitted, | |||
| directory as the FLAC file being processed. Since relative URIs are | applications MUST guard against directory traversal attacks and guard | |||
| permitted, applications MUST guard against directory traversal | against a violation of a same-origin policy if such a policy is being | |||
| attacks and guard against a violation of a same-origin policy if such | enforced. | |||
| a policy is being enforced. | ||||
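
The directory traversal guard described above could be sketched as follows. The helper name is hypothetical, and the policy shown (only relative references that resolve to a file directly inside the FLAC file's directory are followed without asking) is one possible reading of the requirement; anything else returns None so that the application can ask for user approval instead.

```python
from pathlib import Path
from typing import Optional
from urllib.parse import unquote, urlsplit


def resolve_same_directory(flac_path: str, uri: str) -> Optional[Path]:
    """Resolve a relative URI against the FLAC file's directory.

    Returns the resolved path only if it lies directly in that directory.
    Returns None for remote or absolute URIs and for any reference that
    escapes the directory, including percent-encoded "../" sequences.
    """
    parts = urlsplit(uri)
    if parts.scheme or parts.netloc:
        return None  # remote or absolute reference: needs explicit approval

    base = Path(flac_path).resolve().parent
    # Decode percent-escapes first so "%2e%2e/" is treated like "../".
    target = (base / unquote(parts.path)).resolve()
    if target.parent != base:
        return None  # traversal, subdirectory, absolute path, or symlink out
    return target
```

Because the path is resolved before the comparison, a symbolic link in the same directory that points elsewhere is also rejected.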
| Seeking in a FLAC stream that is not in a container relies on the | Seeking in a FLAC stream that is not in a container relies on the | |||
| coded number in frame headers and optionally a seektable metadata | coded number in frame headers and optionally a seek table metadata | |||
| block. Parsers MUST employ thorough checks on whether a found coded | block. Parsers MUST employ thorough checks on whether a found coded | |||
| number or seekpoint is at all possible, e.g., whether it is within | number or seek point is at all possible, e.g., whether it is within | |||
| bounds and not directly contradicting any other coded number or | bounds and not directly contradicting any other coded number or seek | |||
| seekpoint that the seeking process relies on. Without these checks, | point that the seeking process relies on. Without these checks, | |||
| seeking might get stuck in an infinite loop when numbers in frames | seeking might get stuck in an infinite loop when numbers in frames | |||
| are non-consecutive or otherwise not valid, which could be used in | are non-consecutive or otherwise not valid, which could be used in | |||
| denial of service attacks. | DoS attacks. | |||
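
One way to apply such checks to a seek table is to filter it once, before seeking, so that the seeking loop only ever sees points that are in bounds and mutually consistent. The sketch below assumes each seek point has been parsed into a (sample number, byte offset, frame samples) tuple, that the all-ones sample number marks a placeholder point, and that a total sample count of 0 means "unknown". The function name and the exact rejection policy are hypothetical.

```python
PLACEHOLDER = 0xFFFFFFFFFFFFFFFF  # sample number assumed to mark a placeholder


def usable_seek_points(points, total_samples, audio_bytes):
    """Return only the seek points that are possible and non-contradictory.

    points        -- iterable of (sample_number, byte_offset, frame_samples)
    total_samples -- total samples in the stream, or 0 if unknown
    audio_bytes   -- size in bytes of the audio frame data
    """
    usable = []
    last_sample = -1
    last_offset = -1
    for sample, offset, frame_samples in points:
        if sample == PLACEHOLDER:
            continue  # placeholders carry no position
        if total_samples and sample >= total_samples:
            continue  # points past the end of the stream
        if offset >= audio_bytes:
            continue  # points past the end of the data
        if sample <= last_sample or offset <= last_offset:
            continue  # contradicts a point that was already accepted
        usable.append((sample, offset, frame_samples))
        last_sample, last_offset = sample, offset
    return usable
```

A seeking routine would additionally need an upper bound on the number of iterations it performs, since coded numbers in frame headers can be inconsistent even when the seek table is not.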
| Implementors are advised to employ fuzz testing combined with | Implementors are advised to employ fuzz testing combined with | |||
| different sanitizers on FLAC decoders to find security problems. | different sanitizers on FLAC decoders to find security problems. | |||
| Ignoring the results of CRC checks improves the efficiency of decoder | Ignoring the results of CRC checks improves the efficiency of decoder | |||
| fuzz testing. | fuzz testing. | |||
| See [FLAC-decoder-testbench] for a non-exhaustive list of FLAC files | See [FLAC-decoder-testbench] for a non-exhaustive list of FLAC files | |||
| with extreme configurations that lead to crashes or reboots on some | with extreme configurations that lead to crashes or reboots on some | |||
| known implementations. Besides providing a starting point for | known implementations. Besides providing a starting point for | |||
| security testing, this set of files can also be used to test | security testing, this set of files can also be used to test | |||
| conformance with this specification. | conformance with this specification. | |||
| FLAC files may contain executable code, although the FLAC format is | FLAC files may contain executable code, although the FLAC format is | |||
| not designed for it and it is uncommon. One use case where FLAC is | not designed for it and it is uncommon. One use case where FLAC is | |||
| occasionally used to store executable code is when compressing images | occasionally used to store executable code is when compressing images | |||
| of mixed mode CDs, which contain both audio and non-audio data, of | of mixed-mode CDs, which contain both audio and non-audio data, the | |||
| which the non-audio portion can contain executable code. In that | non-audio portion of which can contain executable code. In that | |||
| case, the executable code is stored as if it were audio and is | case, the executable code is stored as if it were audio and is | |||
| potentially obscured. Of course, it is also possible to store | potentially obscured. Of course, it is also possible to store | |||
| executable code as metadata, for example as a vorbis comment with | executable code as metadata, for example, as a Vorbis comment with | |||
| help of a binary-to-text encoding or directly in an application | help of a binary-to-text encoding or directly in an application | |||
| metadata block. Applications MUST NOT execute code contained in FLAC | metadata block. Applications MUST NOT execute code contained in FLAC | |||
| files or present parts of FLAC files as executable code to the user, | files or present parts of FLAC files as executable code to the user, | |||
| except when an application has that explicit purpose, e.g., | except when an application has that explicit purpose, e.g., | |||
| applications reading FLAC files as disc images and presenting it as | applications reading FLAC files as disc images and presenting it as a | |||
| virtual disc drive. | virtual disc drive. | |||
| 13. IANA Considerations | 12. IANA Considerations | |||
| This document registers one new media type, "audio/flac", as defined | Per this document, IANA has registered one new media type ("audio/ | |||
| in the following section, and creates a new IANA registry. | flac") and created a new IANA registry, as described in the | |||
| subsections below. | ||||
| 13.1. Media type registration | 12.1. Media Type Registration | |||
| The following information serves as the registration form for the | IANA has registered the "audio/flac" media type as follows. This | |||
| "audio/flac" media type. This media type is applicable for FLAC | media type is applicable for FLAC audio that is not packaged in a | |||
| audio that is not packaged in a container as described in Section 10. | container as described in Section 10. FLAC audio packaged in such a | |||
| FLAC audio packaged in such a container will take on the media type | container will take on the media type of that container, for example, | |||
| of that container, for example, audio/ogg when packaged in an Ogg | "audio/ogg" when packaged in an Ogg container or "video/mp4" when | |||
| container, or video/mp4 when packaged in an MP4 container alongside a | packaged in an MP4 container alongside a video track. | |||
| video track. | ||||
| Type name: audio | Type name: audio | |||
| Subtype name: flac | Subtype name: flac | |||
| Required parameters: N/A | Required parameters: N/A | |||
| Optional parameters: N/A | Optional parameters: N/A | |||
| Encoding considerations: as per THISRFC | Encoding considerations: as per RFC 9639 | |||
| Security considerations: see the security considerations in Section | Security considerations: See the security considerations in | |||
| 12 of THISRFC | Section 11 of RFC 9639. | |||
| Interoperability considerations: see the descriptions of past format | Interoperability considerations: See the descriptions of past format | |||
| changes in Appendix B of THISRFC | changes in Appendix B of RFC 9639. | |||
| Published specification: THISRFC | Published specification: RFC 9639 | |||
| Applications that use this media type: ffmpeg, apache, firefox | Applications that use this media type: FFmpeg, Apache, Firefox | |||
| Fragment identifier considerations: none | Fragment identifier considerations: N/A | |||
| Additional information: | Additional information: | |||
| Deprecated alias names for this type: audio/x-flac | Deprecated alias names for this type: audio/x-flac | |||
| Magic number(s): fLaC | ||||
| Magic number(s): fLaC | File extension(s): flac | |||
| Macintosh file type code(s): N/A | ||||
| File extension(s): flac | Uniform Type Identifier: org.xiph.flac conforms to public.audio | |||
| Macintosh file type code(s): none | Windows Clipboard Format Name: audio/flac | |||
| Uniform Type Identifier: org.xiph.flac conforms to public.audio | ||||
| Windows Clipboard Format Name: audio/flac | ||||
| Person & email address to contact for further information: | ||||
| IETF CELLAR WG cellar@ietf.org | ||||
| Intended usage: COMMON | Person & email address to contact for further information: IETF | |||
| CELLAR Working Group (cellar@ietf.org) | ||||
| Restrictions on usage: N/A | Intended usage: COMMON | |||
| Author: IETF CELLAR WG | Restrictions on usage: N/A | |||
| Change controller: Internet Engineering Task Force | Author: IETF CELLAR Working Group | |||
| (mailto:iesg@ietf.org) | ||||
| Provisional registration? (standards tree only): NO | Change controller: Internet Engineering Task Force (iesg@ietf.org) | |||
| 13.2. Application ID Registry | 12.2. FLAC Application Metadata Block IDs Registry | |||
| This document creates a new IANA registry called the "FLAC | IANA has created a new registry called the "FLAC Application Metadata | |||
| Application Metadata Block ID" registry. The values correspond to | Block IDs" registry. The values correspond to the 32-bit identifier | |||
| the 32-bit identifier described in Section 8.4. | described in Section 8.4. | |||
| To register a new Application ID in this registry, one needs an | To register a new application ID in this registry, one needs an | |||
| Application ID, a description, optionally a reference to a document | application ID, a description, an optional reference to a document | |||
| describing the Application ID and a Change Controller (IETF or email | describing the application ID, and a Change Controller (IETF or email | |||
| of registrant). The Application IDs are to be allocated according to | of registrant). The application IDs are allocated according to the | |||
| the "First Come First Served" policy [RFC8126], so that there is no | "First Come First Served" policy [RFC8126] so that there is no | |||
| impediment to registering any Application IDs the FLAC community | impediment to registering any application IDs the FLAC community | |||
| encounters, especially if they were used in audio files but were not | encounters, especially if they were used in audio files but were not | |||
| registered when the audio files were encoded. An Application ID can | registered when the audio files were encoded. An application ID can | |||
| be any 32-bit value, but is often composed of 4 ASCII characters, to | be any 32-bit value but is often composed of 4 ASCII characters that | |||
| be human-readable. | are human-readable. | |||
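
The relationship between a 32-bit application ID and its ASCII rendition can be illustrated with a short sketch. The helper names are hypothetical; the byte order follows the registry entries, where for example 0x41544348 corresponds to "ATCH".

```python
from typing import Optional


def application_id_from_ascii(tag: str) -> int:
    """Pack a 4-character ASCII tag into a 32-bit application ID."""
    raw = tag.encode("ascii")
    if len(raw) != 4:
        raise ValueError("an application ID is exactly 4 bytes")
    return int.from_bytes(raw, "big")


def ascii_rendition(app_id: int) -> Optional[str]:
    """Return the ASCII rendition of an ID, or None if a byte is not printable."""
    raw = app_id.to_bytes(4, "big")
    if all(0x20 <= byte <= 0x7E for byte in raw):
        return raw.decode("ascii")
    return None


print(hex(application_id_from_ascii("riff")))  # 0x72696666
print(ascii_rendition(0x41544348))             # ATCH
```

Because any 32-bit value is a valid application ID, ascii_rendition returns None for IDs that contain a byte outside the printable ASCII range rather than raising an error.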
| The FLAC Application Metadata Block ID registry is assigned the | ||||
| following initial values, taken from the registration page at | ||||
| xiph.org (see [ID-registration-page]), which is no longer being | ||||
| maintained as it is replaced by this registry. | ||||
| +===========+==========+===========+====================+==========+ | ||||
| |Application|ASCII |Description| Specification |Change | | ||||
| |ID |rendition | | |controller| | ||||
| | |(if | | | | | ||||
| | |available)| | | | | ||||
| +===========+==========+===========+====================+==========+ | ||||
| |0x41544348 |ATCH |FlacFile | [FlacFile] |IETF | | ||||
| +-----------+----------+-----------+--------------------+----------+ | ||||
| |0x42534F4C |BSOL |beSolo | |IETF | | ||||
| +-----------+----------+-----------+--------------------+----------+ | ||||
| |0x42554753 |BUGS |Bugs Player| |IETF | | ||||
| +-----------+----------+-----------+--------------------+----------+ | ||||
| |0x43756573 |Cues |GoldWave | |IETF | | ||||
| | | |cue points | | | | ||||
| +-----------+----------+-----------+--------------------+----------+ | ||||
| |0x46696361 |Fica |CUE | |IETF | | ||||
| | | |Splitter | | | | ||||
| +-----------+----------+-----------+--------------------+----------+ | ||||
| |0x46746F6C |Ftol |flac-tools | |IETF | | ||||
| +-----------+----------+-----------+--------------------+----------+ | ||||
| |0x4D4F5442 |MOTB |MOTB | |IETF | | ||||
| | | |MetaCzar | | | | ||||
| +-----------+----------+-----------+--------------------+----------+ | ||||
| |0x4D505345 |MPSE |MP3 Stream | |IETF | | ||||
| | | |Editor | | | | ||||
| +-----------+----------+-----------+--------------------+----------+ | ||||
| |0x4D754D4C |MuML |MusicML: | |IETF | | ||||
| | | |Music | | | | ||||
| | | |Metadata | | | | ||||
| | | |Language | | | | ||||
| +-----------+----------+-----------+--------------------+----------+ | ||||
| |0x52494646 |RIFF |Sound | |IETF | | ||||
| | | |Devices | | | | ||||
| | | |RIFF chunk | | | | ||||
| | | |storage | | | | ||||
| +-----------+----------+-----------+--------------------+----------+ | ||||
| |0x5346464C |SFFL |Sound Font | |IETF | | ||||
| | | |FLAC | | | | ||||
| +-----------+----------+-----------+--------------------+----------+ | ||||
| |0x534F4E59 |SONY |Sony | |IETF | | ||||
| | | |Creative | | | | ||||
| | | |Software | | | | ||||
| +-----------+----------+-----------+--------------------+----------+ | ||||
| |0x5351455A |SQEZ |flacsqueeze| |IETF | | ||||
| +-----------+----------+-----------+--------------------+----------+ | ||||
| |0x54745776 |TtWv |TwistedWave| |IETF | | ||||
| +-----------+----------+-----------+--------------------+----------+ | ||||
| |0x55495453 |UITS |UITS | |IETF | | ||||
| | | |Embedding | | | | ||||
| | | |tools | | | | ||||
| +-----------+----------+-----------+--------------------+----------+ | ||||
| |0x61696666 |aiff |FLAC AIFF | [Foreign-metadata] |IETF | | ||||
| | | |chunk | | | | ||||
| | | |storage | | | | ||||
| +-----------+----------+-----------+--------------------+----------+ | ||||
| |0x696D6167 |imag |flac-image | |IETF | | ||||
| +-----------+----------+-----------+--------------------+----------+ | ||||
| |0x7065656D |peem |Parseable | |IETF | | ||||
| | | |Embedded | | | | ||||
| | | |Extensible | | | | ||||
| | | |Metadata | | | | ||||
| +-----------+----------+-----------+--------------------+----------+ | ||||
| |0x71667374 |qfst |QFLAC | |IETF | | ||||
| | | |Studio | | | | ||||
| +-----------+----------+-----------+--------------------+----------+ | ||||
| |0x72696666 |riff |FLAC RIFF | [Foreign-metadata] |IETF | | ||||
| | | |chunk | | | | ||||
| | | |storage | | | | ||||
| +-----------+----------+-----------+--------------------+----------+ | ||||
| |0x74756E65 |tune |TagTuner | |IETF | | ||||
| +-----------+----------+-----------+--------------------+----------+ | ||||
| |0x773634C0 |w64 |FLAC Wave64| [Foreign-metadata] |IETF | | ||||
| | | |chunk | | | | ||||
| | | |storage | | | | ||||
| +-----------+----------+-----------+--------------------+----------+ | ||||
| |0x78626174 |xbat |XBAT | |IETF | | ||||
| +-----------+----------+-----------+--------------------+----------+ | ||||
| |0x786D6364 |xmcd |xmcd | |IETF | | ||||
| +-----------+----------+-----------+--------------------+----------+ | ||||
| Table 25 | ||||
| 14. Acknowledgments | ||||
| FLAC owes much to the many people who have advanced the audio | ||||
| compression field so freely. For instance: | ||||
| * A. J. Robinson for his work on Shorten; his paper (see | The initial contents of "FLAC Application Metadata Block IDs" | |||
| [robinson-tr156]) is a good starting point on some of the basic | registry are shown in the table below. These initial values were | |||
| methods used by FLAC. FLAC trivially extends and improves the | taken from the registration page at xiph.org (see | |||
| fixed predictors, LPC coefficient quantization, and Rice coding | [ID-registration-page]), which is no longer being maintained as it | |||
| used in Shorten. | has been replaced by this registry. | |||
| * S. W. Golomb and Robert F. Rice; their universal codes are used | ||||
| by FLAC's entropy coder, see [Rice]. | ||||
| * N. Levinson and J. Durbin; the FLAC reference encoder (see | ||||
| Section 11) uses an algorithm developed and refined by them for | ||||
| determining the LPC coefficients from the autocorrelation | ||||
| coefficients, see [Durbin]. | ||||
| * And of course, Claude Shannon, see [Shannon]. | ||||
| The FLAC format, the FLAC reference implementation, and this document | +===========+==========+===========+===================+==========+ | |||
| were originally developed by Josh Coalson. While many others have | |Application|ASCII |Description|Reference |Change | | |||
| contributed since, this original effort is deeply appreciated. | |ID |Rendition | | |Controller| | |||
| | |(If | | | | | ||||
| | |Available)| | | | | ||||
| +===========+==========+===========+===================+==========+ | ||||
| |0x41544348 |ATCH |FlacFile |[FlacFile], RFC |IETF | | ||||
| | | | |9639 | | | ||||
| +-----------+----------+-----------+-------------------+----------+ | ||||
| |0x42534F4C |BSOL |beSolo |RFC 9639 |IETF | | ||||
| +-----------+----------+-----------+-------------------+----------+ | ||||
| |0x42554753 |BUGS |Bugs Player|RFC 9639 |IETF | | ||||
| +-----------+----------+-----------+-------------------+----------+ | ||||
| |0x43756573 |Cues |GoldWave |RFC 9639 |IETF | | ||||
| | | |cue points | | | | ||||
| +-----------+----------+-----------+-------------------+----------+ | ||||
| |0x46696361 |Fica |CUE |RFC 9639 |IETF | | ||||
| | | |Splitter | | | | ||||
| +-----------+----------+-----------+-------------------+----------+ | ||||
| |0x46746F6C |Ftol |flac-tools |RFC 9639 |IETF | | ||||
| +-----------+----------+-----------+-------------------+----------+ | ||||
| |0x4D4F5442 |MOTB |MOTB |RFC 9639 |IETF | | ||||
| | | |MetaCzar | | | | ||||
| +-----------+----------+-----------+-------------------+----------+ | ||||
| |0x4D505345 |MPSE |MP3 Stream |RFC 9639 |IETF | | ||||
| | | |Editor | | | | ||||
| +-----------+----------+-----------+-------------------+----------+ | ||||
| |0x4D754D4C |MuML |MusicML: |RFC 9639 |IETF | | ||||
| | | |Music | | | | ||||
| | | |Metadata | | | | ||||
| | | |Language | | | | ||||
| +-----------+----------+-----------+-------------------+----------+ | ||||
| |0x52494646 |RIFF |Sound |RFC 9639 |IETF | | ||||
| | | |Devices | | | | ||||
| | | |RIFF chunk | | | | ||||
| | | |storage | | | | ||||
| +-----------+----------+-----------+-------------------+----------+ | ||||
| |0x5346464C |SFFL |Sound Font |RFC 9639 |IETF | | ||||
| | | |FLAC | | | | ||||
| +-----------+----------+-----------+-------------------+----------+ | ||||
| |0x534F4E59 |SONY |Sony |RFC 9639 |IETF | | ||||
| | | |Creative | | | | ||||
| | | |Software | | | | ||||
| +-----------+----------+-----------+-------------------+----------+ | ||||
| |0x5351455A |SQEZ |flacsqueeze|RFC 9639 |IETF | | ||||
| +-----------+----------+-----------+-------------------+----------+ | ||||
| |0x54745776 |TtWv |TwistedWave|RFC 9639 |IETF | | ||||
| +-----------+----------+-----------+-------------------+----------+ | ||||
| |0x55495453 |UITS |UITS |RFC 9639 |IETF | | ||||
| | | |Embedding | | | | ||||
| | | |tools | | | | ||||
| +-----------+----------+-----------+-------------------+----------+ | ||||
| |0x61696666 |aiff |FLAC AIFF |[Foreign-metadata],|IETF | | ||||
| | | |chunk |RFC 9639 | | | ||||
| | | |storage | | | | ||||
| +-----------+----------+-----------+-------------------+----------+ | ||||
| |0x696D6167 |imag |flac-image |RFC 9639 |IETF | | ||||
| +-----------+----------+-----------+-------------------+----------+ | ||||
| |0x7065656D |peem |Parseable |RFC 9639 |IETF | | ||||
| | | |Embedded | | | | ||||
| | | |Extensible | | | | ||||
| | | |Metadata | | | | ||||
| +-----------+----------+-----------+-------------------+----------+ | ||||
| |0x71667374 |qfst |QFLAC |RFC 9639 |IETF | | ||||
| | | |Studio | | | | ||||
| +-----------+----------+-----------+-------------------+----------+ | ||||
| |0x72696666 |riff |FLAC RIFF |[Foreign-metadata],|IETF | | ||||
| | | |chunk |RFC 9639 | | | ||||
| | | |storage | | | | ||||
| +-----------+----------+-----------+-------------------+----------+ | ||||
| |0x74756E65 |tune |TagTuner |RFC 9639 |IETF | | ||||
| +-----------+----------+-----------+-------------------+----------+ | ||||
| |0x77363420 |w64 |FLAC Wave64|[Foreign-metadata],|IETF | | ||||
| | | |chunk |RFC 9639 | | | ||||
| | | |storage | | | | ||||
| +-----------+----------+-----------+-------------------+----------+ | ||||
| |0x78626174 |xbat |XBAT |RFC 9639 |IETF | | ||||
| +-----------+----------+-----------+-------------------+----------+ | ||||
| |0x786D6364 |xmcd |xmcd |RFC 9639 |IETF | | ||||
| +-----------+----------+-----------+-------------------+----------+ | ||||
| 15. References | Table 25 | |||
| 15.1. Normative References | 13. References | |||
| [I-D.ietf-cellar-matroska] | 13.1. Normative References | |||
| Lhomme, S., Bunkus, M., and D. Rice, "Matroska Media | ||||
| Container Format Specifications", Work in Progress, | ||||
| Internet-Draft, draft-ietf-cellar-matroska-21, 22 October | ||||
| 2023, <https://datatracker.ietf.org/doc/html/draft-ietf- | ||||
| cellar-matroska-21>. | ||||
| [ISRC-handbook] | [ISRC-handbook] | |||
| International ISRC Registration Authority, "International | International ISRC Registration Authority, "International | |||
| Standard Recording Code (ISRC) Handbook, 4th edition", | Standard Recording Code (ISRC) Handbook", 4th edition, | |||
| 2021, <https://www.ifpi.org/isrc_handbook/>. | 2021, <https://www.ifpi.org/isrc_handbook/>. | |||
| [RFC1321] Rivest, R., "The MD5 Message-Digest Algorithm", RFC 1321, | [RFC1321] Rivest, R., "The MD5 Message-Digest Algorithm", RFC 1321, | |||
| DOI 10.17487/RFC1321, April 1992, | DOI 10.17487/RFC1321, April 1992, | |||
| <https://www.rfc-editor.org/info/rfc1321>. | <https://www.rfc-editor.org/info/rfc1321>. | |||
| [RFC2046] Freed, N. and N. Borenstein, "Multipurpose Internet Mail | [RFC2046] Freed, N. and N. Borenstein, "Multipurpose Internet Mail | |||
| Extensions (MIME) Part Two: Media Types", RFC 2046, | Extensions (MIME) Part Two: Media Types", RFC 2046, | |||
| DOI 10.17487/RFC2046, November 1996, | DOI 10.17487/RFC2046, November 1996, | |||
| <https://www.rfc-editor.org/info/rfc2046>. | <https://www.rfc-editor.org/info/rfc2046>. | |||
| skipping to change at page 60, line 14 | skipping to change at line 2571 | |||
| [RFC3986] Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform | [RFC3986] Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform | |||
| Resource Identifier (URI): Generic Syntax", STD 66, | Resource Identifier (URI): Generic Syntax", STD 66, | |||
| RFC 3986, DOI 10.17487/RFC3986, January 2005, | RFC 3986, DOI 10.17487/RFC3986, January 2005, | |||
| <https://www.rfc-editor.org/info/rfc3986>. | <https://www.rfc-editor.org/info/rfc3986>. | |||
| [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC | [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC | |||
| 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, | 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, | |||
| May 2017, <https://www.rfc-editor.org/info/rfc8174>. | May 2017, <https://www.rfc-editor.org/info/rfc8174>. | |||
| 15.2. Informative References | [RFC9559] Lhomme, S., Bunkus, M., and D. Rice, "Matroska Media | |||
| Container Format Specification", RFC 9559, | ||||
| DOI 10.17487/RFC9559, October 2024, | ||||
| <https://www.rfc-editor.org/info/rfc9559>. | ||||
| [Durbin] Durbin, J., "The Fitting of Time-Series Models", | 13.2. Informative References | |||
| DOI 10.2307/1401322, December 1959, | ||||
| [Durbin] Durbin, J., "The Fitting of Time-Series Models", Revue de | ||||
| l'Institut International de Statistique / Review of the | ||||
| International Statistical Institute, vol. 28, no. 3, pp. | ||||
| 233–44, DOI 10.2307/1401322, 1960, | ||||
| <https://www.jstor.org/stable/1401322>. | <https://www.jstor.org/stable/1401322>. | |||
| [FIR] "Finite impulse response - Wikipedia", | [FIR] Wikipedia, "Finite impulse response", August 2024, | |||
| <https://en.wikipedia.org/wiki/Finite_impulse_response>. | <https://en.wikipedia.org/w/ | |||
| index.php?title=Finite_impulse_response&oldid=1240945295>. | ||||
| [FLAC-decoder-testbench] | [FLAC-decoder-testbench] | |||
| "FLAC decoder testbench", commit aa7b0c6, August 2023, | "The Free Lossless Audio Codec (FLAC) test files", commit | |||
| aa7b0c6, August 2023, | ||||
| <https://github.com/ietf-wg-cellar/flac-test-files>. | <https://github.com/ietf-wg-cellar/flac-test-files>. | |||
| [FLAC-implementation] | ||||
| "FLAC", <https://xiph.org/flac/>. | ||||
| [FLAC-in-MP4-specification] | [FLAC-in-MP4-specification] | |||
| Montgomery, C., "Encapsulation of FLAC in ISO Base Media | "Encapsulation of FLAC in ISO Base Media File Format", | |||
| File Format", commit 78d85dd, July 2022, | commit 78d85dd, July 2022, | |||
| <https://github.com/xiph/flac/blob/master/doc/ | <https://github.com/xiph/flac/blob/master/doc/ | |||
| isoflac.txt>. | isoflac.txt>. | |||
| [FLAC-specification-github] | [FLAC-specification-github] | |||
| "FLAC specification github repository", | "The Free Lossless Audio Codec (FLAC) Specification", | |||
| <https://github.com/ietf-wg-cellar/flac-specification>. | <https://github.com/ietf-wg-cellar/flac-specification>. | |||
| [FLAC-wiki-implementations] | ||||
| "FLAC specification wiki: Implementations", | ||||
| <https://github.com/ietf-wg-cellar/flac- | ||||
| specification/wiki/Implementations>. | ||||
| [FLAC-wiki-interoperability] | [FLAC-wiki-interoperability] | |||
| "FLAC specification wiki: Interoperability | "Interoperability considerations", commit 58a06d6, | |||
| considerations", <https://github.com/ietf-wg-cellar/flac- | <https://github.com/ietf-wg-cellar/flac- | |||
| specification/wiki/Interoperability-considerations>. | specification/wiki/Interoperability-considerations>. | |||
| [FlacFile] "FlacFile", October 2007, | [FlacFile] "FlacFile", Wayback Machine archive, October 2007, | |||
| <https://web.archive.org/web/20071023070305/ | <https://web.archive.org/web/20071023070305/ | |||
| http://firestuff.org:80/flacfile/>. | http://firestuff.org:80/flacfile/>. | |||
| [Foreign-metadata] | [Foreign-metadata] | |||
| "Specification of foreign metadata storage in FLAC", | "Specification of foreign metadata storage in FLAC", | |||
| November 2023, | commit 72787c3, November 2023, | |||
| <https://github.com/xiph/flac/blob/master/doc/ | <https://github.com/xiph/flac/blob/master/doc/ | |||
| foreign_metadata_storage.md>. | foreign_metadata_storage.md>. | |||
| [HPL-1999-144] | ||||
| Hans, M. and RW. Schafer, "Lossless Compression of Digital | ||||
| Audio", DOI 10.1109/79.939834, November 1999, | ||||
| <https://www.hpl.hp.com/techreports/1999/HPL- | ||||
| 1999-144.pdf>. | ||||
| [ID-registration-page] | [ID-registration-page] | |||
| "FLAC - ID Registry", <https://xiph.org/flac/id.html>. | Xiph.Org, "ID registry", <https://xiph.org/flac/id.html>. | |||
| [ID3v2] Nilsson, M., "id3v2.4.0-frames.txt", November 2000, | [ID3v2] Nilsson, M., "ID3 tag version 2.4.0 - Native Frames", | |||
| Wayback Machine archive, November 2000, | ||||
| <https://web.archive.org/web/20220903174949/ | <https://web.archive.org/web/20220903174949/ | |||
| https://id3.org/id3v2.4.0-frames>. | https://id3.org/id3v2.4.0-frames>. | |||
| [IEC.60908.1999] | [IEC.60908.1999] | |||
| International Electrotechnical Commission, "Audio | International Electrotechnical Commission, "Audio | |||
| recording - Compact disc digital audio system", | recording - Compact disc digital audio system", | |||
| IEC International standard 60908 second edition, 1999. | IEC 60908:1999-02, 1999, | |||
| <https://webstore.iec.ch/publication/3885>. | ||||
| [LinearPrediction] | [LinearPrediction] | |||
| "Linear prediction - Wikipedia", | Wikipedia, "Linear prediction", August 2023, | |||
| <https://en.wikipedia.org/wiki/Linear_prediction>. | <https://en.wikipedia.org/w/ | |||
| index.php?title=Linear_prediction&oldid=1169015573>. | ||||
| [MLP] Gerzon, MA., Craven, PG., Stuart, JR., Law, MJ., and RJ. | [Lossless-Compression] | |||
| Wilson, "The MLP Lossless Compression System", September | Hans, M. and R. W. Schafer, "Lossless compression of | |||
| 1999, | digital audio", IEEE Signal Processing Magazine, vol. 18, | |||
| no. 4, pp. 21-32, DOI 10.1109/79.939834, July 2001, | ||||
| <https://ieeexplore.ieee.org/document/939834>. | ||||
| [lossyWAV] Hydrogenaudio Knowledgebase, "lossyWAV", July 2021, | ||||
| <https://wiki.hydrogenaud.io/ | ||||
| index.php?title=LossyWAV&oldid=32877>. | ||||
| [MLP] Gerzon, M. A., Craven, P. G., Stuart, J. R., Law, M. J., | ||||
| and R. J. Wilson, "The MLP Lossless Compression System", | ||||
| Audio Engineering Society Conference: 17th International | ||||
| Conference: High-Quality Audio Coding, September 1999, | ||||
| <https://www.aes.org/e-lib/online/browse.cfm?elib=8082>. | <https://www.aes.org/e-lib/online/browse.cfm?elib=8082>. | |||
| [MusicBrainz] | [MusicBrainz] | |||
| MusicBrainz, "Tags & Variables - MusicBrainz Picard v2.10 | MusicBrainz, "Tags & Variables", MusicBrainz Picard v2.10 | |||
| documentation", <https://picard- | documentation, <https://picard- | |||
| docs.musicbrainz.org/en/variables/variables.html>. | docs.musicbrainz.org/en/variables/variables.html>. | |||
| [RFC4732] Handley, M., Ed., Rescorla, E., Ed., and IAB, "Internet | [RFC4732] Handley, M., Ed., Rescorla, E., Ed., and IAB, "Internet | |||
| Denial-of-Service Considerations", RFC 4732, | Denial-of-Service Considerations", RFC 4732, | |||
| DOI 10.17487/RFC4732, December 2006, | DOI 10.17487/RFC4732, December 2006, | |||
| <https://www.rfc-editor.org/info/rfc4732>. | <https://www.rfc-editor.org/info/rfc4732>. | |||
| [RFC5334] Goncalves, I., Pfeiffer, S., and C. Montgomery, "Ogg Media | [RFC5334] Goncalves, I., Pfeiffer, S., and C. Montgomery, "Ogg Media | |||
| Types", RFC 5334, DOI 10.17487/RFC5334, September 2008, | Types", RFC 5334, DOI 10.17487/RFC5334, September 2008, | |||
| <https://www.rfc-editor.org/info/rfc5334>. | <https://www.rfc-editor.org/info/rfc5334>. | |||
| [RFC6716] Valin, JM., Vos, K., and T. Terriberry, "Definition of the | [RFC6716] Valin, JM., Vos, K., and T. Terriberry, "Definition of the | |||
| Opus Audio Codec", RFC 6716, DOI 10.17487/RFC6716, | Opus Audio Codec", RFC 6716, DOI 10.17487/RFC6716, | |||
| September 2012, <https://www.rfc-editor.org/info/rfc6716>. | September 2012, <https://www.rfc-editor.org/info/rfc6716>. | |||
| [RFC7942] Sheffer, Y. and A. Farrel, "Improving Awareness of Running | [RFC8126] Cotton, M., Leiba, B., and T. Narten, "Guidelines for | |||
| Code: The Implementation Status Section", BCP 205, | Writing an IANA Considerations Section in RFCs", BCP 26, | |||
| RFC 7942, DOI 10.17487/RFC7942, July 2016, | RFC 8126, DOI 10.17487/RFC8126, June 2017, | |||
| <https://www.rfc-editor.org/info/rfc7942>. | <https://www.rfc-editor.org/info/rfc8126>. | |||
| [Rice] Rice, RF. and JR. Plaunt, "Adaptive Variable-Length Coding | [Rice] Rice, R. F. and J. R. Plaunt, "Adaptive Variable-Length | |||
| for Efficient Compression of Spacecraft Television Data", | Coding for Efficient Compression of Spacecraft Television | |||
| DOI 10.1109/TCOM.1971.1090789, December 1971, | Data", IEEE Transactions on Communication Technology, vol. | |||
| 19, no. 6, pp. 889-897, DOI 10.1109/TCOM.1971.1090789, | ||||
| December 1971, | ||||
| <https://ieeexplore.ieee.org/document/1090789>. | <https://ieeexplore.ieee.org/document/1090789>. | |||
| [Shannon] Shannon, CE., "Communication in the Presence of Noise", | [Robinson-TR156] | |||
| Robinson, T., "SHORTEN: Simple lossless and near-lossless | ||||
| waveform compression", Cambridge University Engineering | ||||
| Department Technical Report CUED/F-INFENG/TR.156, December | ||||
| 1994, <https://mi.eng.cam.ac.uk/reports/svr-ftp/auto-pdf/ | ||||
| robinson_tr156.pdf>. | ||||
| [Shannon] Shannon, C. E., "Communication in the Presence of Noise", | ||||
| Proceedings of the IRE, vol. 37, no. 1, pp. 10-21, | ||||
| DOI 10.1109/JRPROC.1949.232969, January 1949, | DOI 10.1109/JRPROC.1949.232969, January 1949, | |||
| <https://ieeexplore.ieee.org/document/1697831>. | <https://ieeexplore.ieee.org/document/1697831>. | |||
| [VarLengthCode] | [VarLengthCode] | |||
| "Variable-length code - Wikipedia", | Wikipedia, "Variable-length code", April 2024, | |||
| <https://en.wikipedia.org/wiki/Variable-length_code>. | <https://en.wikipedia.org/w/index.php?title=Variable- | |||
| length_code&oldid=1220260423>. | ||||
| [Vorbis] Xiph.Org, "Ogg Vorbis I format specification: comment | [Vorbis] Xiph.Org, "Ogg Vorbis I format specification: comment | |||
| field and header specification", | field and header specification", | |||
| <https://xiph.org/vorbis/doc/v-comment.html>. | <https://xiph.org/vorbis/doc/v-comment.html>. | |||
| [lossyWAV] "lossyWAV - Hydrogenaudio Knowledgebase", | Appendix A. Numerical Considerations | |||
| <https://wiki.hydrogenaud.io/index.php?title=LossyWAV>. | ||||
| [robinson-tr156] | ||||
| Robinson, T., "SHORTEN: Simple lossless and near-lossless | ||||
| waveform compression", December 1994, | ||||
| <https://mi.eng.cam.ac.uk/reports/abstracts/ | ||||
| robinson_tr156.html>. | ||||
| Appendix A. Numerical considerations | ||||
| In order to maintain lossless behavior, all arithmetic used in | In order to maintain lossless behavior, all arithmetic used in | |||
| encoding and decoding sample values must be done with integer data | encoding and decoding sample values must be done with integer data | |||
| types to eliminate the possibility of introducing rounding errors | types to eliminate the possibility of introducing rounding errors | |||
| associated with floating-point arithmetic. Use of floating-point | associated with floating-point arithmetic. Use of floating-point | |||
| representations in analysis (e.g., finding a good predictor or Rice | representations in analysis (e.g., finding a good predictor or Rice | |||
| parameter) is not a concern, as long as the process of using the | parameter) is not a concern as long as the process of using the found | |||
| found predictor and Rice parameter to encode audio samples is | predictor and Rice parameter to encode audio samples is implemented | |||
| implemented with only integer math. | with only integer math. | |||
| Furthermore, the possibility of integer overflow can be eliminated by | Furthermore, the possibility of integer overflow can be eliminated by | |||
| using large enough data types. Choosing a 64-bit signed data type | using data types that are large enough. Choosing a 64-bit signed | |||
| for all arithmetic involving sample values would make sure the | data type for all arithmetic involving sample values would make sure | |||
| possibility for overflow is eliminated, but usually smaller data | the possibility for overflow is eliminated, but usually, smaller data | |||
| types are chosen for increased performance, especially in embedded | types are chosen for increased performance, especially in embedded | |||
| devices. This appendix provides guidelines for choosing the | devices. This appendix provides guidelines for choosing the | |||
| appropriate data type for each step of encoding and decoding FLAC | appropriate data type for each step of encoding and decoding FLAC | |||
| files. | files. | |||
| In this appendix, signed data types are signed two's complement. | In this appendix, signed data types are signed two's complement. | |||
| A.1. Determining the necessary data type size | A.1. Determining the Necessary Data Type Size | |||
| To find the smallest data type size that is guaranteed not to | To find the smallest data type size that is guaranteed not to | |||
| overflow for a certain sequence of arithmetic operations, the | overflow for a certain sequence of arithmetic operations, the | |||
| combination of values producing the largest possible result should be | combination of values producing the largest possible result should be | |||
| considered. | considered. | |||
| If, for example, two 16-bit signed integers are added, the largest | For example, if two 16-bit signed integers are added, the largest | |||
| possible result forms if both values are the largest number that can | possible result forms if both values are the largest number that can | |||
| be represented with a 16-bit signed integer. To store the result, a | be represented with a 16-bit signed integer. To store the result, a | |||
| signed integer data type with at least 17 bits is needed. Similarly, | signed integer data type with at least 17 bits is needed. Similarly, | |||
| when adding 4 of these values, 18 bits are needed; when adding 8, 19 | when adding 4 of these values, 18 bits are needed; when adding 8, 19 | |||
| bits are needed, etc. In general, the number of bits necessary when | bits are needed, etc. In general, the number of bits necessary when | |||
| adding numbers together is increased by the log base 2 of the number | adding numbers together is increased by the log base 2 of the number | |||
| of values rounded up to the nearest integer. So, when adding 18 | of values rounded up to the nearest integer. So, when adding 18 | |||
| unknown values stored in 8 bit signed integers, we need a signed | unknown values stored in 8-bit signed integers, we need a signed | |||
| integer data type of at least 13 bits to store the result, as the log | integer data type of at least 13 bits to store the result, as the log | |||
| base 2 of 18 rounded up is 5. | base 2 of 18 rounded up is 5. | |||
| When multiplying two numbers, the number of bits needed for the | When multiplying two numbers, the number of bits needed for the | |||
| result is the size of the first number plus the size of the second | result is the size of the first number plus the size of the second | |||
| number. If, for example, a 16-bit signed integer is multiplied by | number. For example, if a 16-bit signed integer is multiplied by | |||
| another 16-bit signed integer, the result needs at least 32 bits to | another 16-bit signed integer, the result needs at least 32 bits to | |||
| be stored without overflowing. To show this in practice, the 4-bit | be stored without overflowing. To show this in practice, the 4-bit | |||
| signed value with the largest magnitude is -8. (-8)*(-8) is 64, | signed value with the largest magnitude is -8. (-8)*(-8) is 64, | |||
| which needs at least 8 bits (signed) to store. | which needs at least 8 bits (signed) to store. | |||
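The rules above can be checked mechanically. The following is a minimal sketch in Python, whose integers do not overflow, so the worst cases can be evaluated exactly. The helper names bits_for_sum, bits_for_product, and fits_signed are hypothetical and are not part of the FLAC specification.

```python
def bits_for_sum(bit_width: int, count: int) -> int:
    """Signed bits needed to add `count` values of `bit_width` signed bits each."""
    # (count - 1).bit_length() equals ceil(log2(count)) for count >= 1,
    # computed with integer math only.
    return bit_width + (count - 1).bit_length()


def bits_for_product(width_a: int, width_b: int) -> int:
    """Signed bits needed to multiply two signed values of the given widths."""
    return width_a + width_b


def fits_signed(value: int, bits: int) -> bool:
    """True if `value` is representable in a two's complement type of `bits` bits."""
    return -(1 << (bits - 1)) <= value <= (1 << (bits - 1)) - 1


# Worst case for 18 values stored in 8-bit signed integers: all are -128.
worst_sum = 18 * -128
# Worst case for two 4-bit signed integers: both are -8.
worst_product = (-8) * (-8)
```

Here worst_sum fits in 13 bits but not in 12, and worst_product fits in 8 bits but not in 7, in line with the examples in this section.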
| A.2. Stereo decorrelation | A.2. Stereo Decorrelation | |||
| When stereo decorrelation is used, the side channel will have one | When stereo decorrelation is used, the side channel will have one | |||
| extra bit of bit depth, see Section 4.2. | extra bit of bit depth; see Section 4.2. | |||
| This means that while 16-bit signed integers have sufficient range to | This means that while 16-bit signed integers have sufficient range to | |||
| store samples from a fully decoded FLAC frame with a bit depth of 16 | store samples from a fully decoded FLAC frame with a bit depth of 16 | |||
| bits, the decoding of a side subframe in such a file will need a data | bits, the decoding of a side subframe in such a file will need a data | |||
| type with at least 17 bits to store decoded subframe samples before | type with at least 17 bits to store decoded subframe samples before | |||
| undoing stereo decorrelation. | undoing stereo decorrelation. | |||
| Most FLAC decoders store decoded (subframe) samples as 32-bit values, | Most FLAC decoders store decoded (subframe) samples as 32-bit values, | |||
| which is sufficient for files with bit depths up to (and including) | which is sufficient for files with bit depths up to (and including) | |||
| 31 bits. | 31 bits. | |||
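As an illustration of the extra bit, the sketch below evaluates the extreme side samples for 16-bit audio. It assumes the side channel is formed as left minus right; the function names are hypothetical.

```python
def side_sample(left: int, right: int) -> int:
    """Side sample, assuming side = left - right."""
    return left - right


def signed_bits_needed(value: int) -> int:
    """Smallest two's complement width (in bits) that can hold `value`."""
    magnitude = value if value >= 0 else -value - 1
    return magnitude.bit_length() + 1


# Extreme 16-bit inputs produce side samples that need 17 bits.
widest_positive = side_sample(32767, -32768)   # 65535
widest_negative = side_sample(-32768, 32767)   # -65535
```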
| skipping to change at page 64, line 20 ¶ | skipping to change at line 2782 ¶ | |||
| A prediction (which is used to calculate the residual on encoding or | A prediction (which is used to calculate the residual on encoding or | |||
| added to the residual to calculate the sample value on decoding) is | added to the residual to calculate the sample value on decoding) is | |||
| formed by multiplying and summing preceding sample values. In order | formed by multiplying and summing preceding sample values. In order | |||
| to eliminate the possibility of integer overflow, the combination of | to eliminate the possibility of integer overflow, the combination of | |||
| preceding sample values and predictor coefficients producing the | preceding sample values and predictor coefficients producing the | |||
| largest possible value should be considered. | largest possible value should be considered. | |||
| To determine the size of the data type needed to calculate either a | To determine the size of the data type needed to calculate either a | |||
| residual sample (on encoding) or an audio sample value (on decoding) | residual sample (on encoding) or an audio sample value (on decoding) | |||
| in a fixed predictor subframe, the maximal possible value for these | in a fixed predictor subframe, the maximum possible value for these | |||
| is calculated as described in Appendix A.1 in the following table. | is calculated as described in Appendix A.1 and in the following | |||
| For example: if a frame codes for 16-bit audio and has some form of | table. For example, if a frame codes for 16-bit audio and has some | |||
| stereo decorrelation, the subframe coding for the side channel would | form of stereo decorrelation, the subframe coding for the side | |||
| need 16+1+3 bits if a third order fixed predictor is used. | channel would need 16+1+3 bits if a third-order fixed predictor is | |||
| used. | ||||
| +=======+==============================+===============+=======+ | +=======+==============================+===============+=======+ | |||
| | Order | Calculation of residual | Sample values | Extra | | | Order | Calculation of Residual | Sample Values | Extra | | |||
| | | | summed | bits | | | | | Summed | Bits | | |||
| +=======+==============================+===============+=======+ | +=======+==============================+===============+=======+ | |||
| | 0 | a(n) | 1 | 0 | | | 0 | a(n) | 1 | 0 | | |||
| +-------+------------------------------+---------------+-------+ | +-------+------------------------------+---------------+-------+ | |||
| | 1 | a(n) - a(n-1) | 2 | 1 | | | 1 | a(n) - a(n-1) | 2 | 1 | | |||
| +-------+------------------------------+---------------+-------+ | +-------+------------------------------+---------------+-------+ | |||
| | 2 | a(n) - 2 * a(n-1) + a(n-2) | 4 | 2 | | | 2 | a(n) - 2 * a(n-1) + a(n-2) | 4 | 2 | | |||
| +-------+------------------------------+---------------+-------+ | +-------+------------------------------+---------------+-------+ | |||
| | 3 | a(n) - 3 * a(n-1) + 3 * | 8 | 3 | | | 3 | a(n) - 3 * a(n-1) + 3 * | 8 | 3 | | |||
| | | a(n-2) - a(n-3) | | | | | | a(n-2) - a(n-3) | | | | |||
| +-------+------------------------------+---------------+-------+ | +-------+------------------------------+---------------+-------+ | |||
| | 4 | a(n) - 4 * a(n-1) + 6 * | 16 | 4 | | | 4 | a(n) - 4 * a(n-1) + 6 * | 16 | 4 | | |||
| | | a(n-2) - 4 * a(n-3) + a(n-4) | | | | | | a(n-2) - 4 * a(n-3) + a(n-4) | | | | |||
| +-------+------------------------------+---------------+-------+ | +-------+------------------------------+---------------+-------+ | |||
| Table 26 | Table 26 | |||
| Where | Where: | |||
| * n is the number of the sample being predicted. | * n is the number of the sample being predicted. | |||
| * a(n) is the sample being predicted. | * a(n) is the sample being predicted. | |||
| * a(n-1) is the sample before the one being predicted, a(n-2) is the | * a(n-1) is the sample before the one being predicted, a(n-2) is the | |||
| sample before that, etc. | sample before that, etc. | |||
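The "Extra Bits" column of the table can be reproduced from the coefficients of each residual calculation. The sketch below is illustrative only: it generates the coefficients as signed binomial coefficients, which matches the five rows of the table, and the function names are hypothetical.

```python
from math import comb


def fixed_residual_coefficients(order: int) -> list[int]:
    """Coefficients applied to a(n), a(n-1), ..., a(n-order) in the table."""
    return [(-1) ** k * comb(order, k) for k in range(order + 1)]


def fixed_extra_bits(order: int) -> int:
    """Extra bits needed on top of the subframe bit depth."""
    # Summing the absolute coefficients gives the "Sample Values Summed" column.
    summed = sum(abs(c) for c in fixed_residual_coefficients(order))
    return (summed - 1).bit_length()  # ceil(log2(summed)), integer only


# Side-channel subframe of 16-bit audio with a third-order fixed predictor.
required_bits = 16 + 1 + fixed_extra_bits(3)
```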
| For subframes with a linear predictor, the calculation is a little | For subframes with a linear predictor, the calculation is a little | |||
| more complicated. Each prediction is the sum of several | more complicated. Each prediction is the sum of several | |||
| multiplications. Each of these multiplies a sample value by a | multiplications. Each of these multiplies a sample value by a | |||
| predictor coefficient. The extra bits needed can be calculated by | predictor coefficient. The extra bits needed can be calculated by | |||
| adding the predictor coefficient precision (in bits) to the bit depth | adding the predictor coefficient precision (in bits) to the bit depth | |||
| of the audio samples. To account for the summing of these | of the audio samples. To account for the summing of these | |||
| multiplications, the log base 2 of the predictor order rounded up is | multiplications, the log base 2 of the predictor order rounded up is | |||
| skipping to change at page 65, line 28 ¶ | skipping to change at line 2840 ¶ | |||
| least (24 + 1) + 15 + ceil(log2(12)) = 44 bits. As another example, | least (24 + 1) + 15 + ceil(log2(12)) = 44 bits. As another example, | |||
| with a side-channel subframe bit depth of 16, a predictor order of 8, | with a side-channel subframe bit depth of 16, a predictor order of 8, | |||
| and a predictor coefficient precision of 12 bits, the minimum | and a predictor coefficient precision of 12 bits, the minimum | |||
| required size of the used signed integer data type is (16 + 1) + 12 + | required size of the used signed integer data type is (16 + 1) + 12 + | |||
| ceil(log2(8)) = 32 bits. | ceil(log2(8)) = 32 bits. | |||
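The two examples above follow one rule: subframe bit depth, plus predictor coefficient precision, plus the log base 2 of the predictor order rounded up. A minimal sketch, with a hypothetical function name:

```python
def lpc_prediction_bits(subframe_bit_depth: int,
                        coefficient_precision: int,
                        predictor_order: int) -> int:
    """Minimum signed integer width for computing a linear prediction."""
    # (order - 1).bit_length() equals ceil(log2(order)) for order >= 1.
    summing_bits = (predictor_order - 1).bit_length()
    return subframe_bit_depth + coefficient_precision + summing_bits


# 24-bit side channel, 15-bit coefficients, order 12.
first_example = lpc_prediction_bits(24 + 1, 15, 12)
# 16-bit side channel, 12-bit coefficients, order 8.
second_example = lpc_prediction_bits(16 + 1, 12, 8)
```

An implementation could use such a calculation to choose between a 32-bit and a 64-bit accumulator for each subframe.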
| A.4. Residual | A.4. Residual | |||
| As stated in Section 9.2.7, an encoder must make sure residual | As stated in Section 9.2.7, an encoder must make sure residual | |||
| samples are representable by a 32-bit integer, signed two's | samples are representable by a 32-bit integer, signed two's | |||
| complement, excluding the most negative value. Continuing as in the | complement, excluding the most negative value. As in the previous | |||
| previous section, it is possible to calculate when residual samples | section, it is possible to calculate when residual samples already | |||
| already implicitly fit and when an additional check is needed. This | implicitly fit and when an additional check is needed. This implicit | |||
| implicit fit is achieved when residuals would fit a theoretical | fit is achieved when residuals would fit a theoretical 31-bit signed | |||
| 31-bit signed int, as that satisfies both of the mentioned criteria. | integer, as that satisfies both of the mentioned criteria. When this | |||
| When this implicit fit is not achieved, all residual values must be | implicit fit is not achieved, all residual values must be calculated | |||
| calculated and checked individually. | and checked individually. | |||
| For the residual of a fixed predictor, the maximum residual sample | For the residual of a fixed predictor, the maximum residual sample | |||
| size was already calculated in the previous section. However, for a | size was already calculated in the previous section. However, for a | |||
| linear predictor, the prediction is shifted right by a certain | linear predictor, the prediction is shifted right by a certain | |||
| amount. The number of bits needed for the residual is the number of | amount. The number of bits needed for the residual is the number of | |||
| bits calculated in the previous section, reduced by the prediction | bits calculated in the previous section, reduced by the prediction | |||
| right shift, and increased by one bit to account for the subtraction | right shift, and increased by one bit to account for the subtraction | |||
| of the prediction from the current sample on encoding. | of the prediction from the current sample on encoding. | |||
| Taking the last example of the previous section, where 32 bits were | Taking the last example of the previous section, where 32 bits were | |||
| needed for the prediction, the required data type size for the | needed for the prediction, the required data type size for the | |||
| residual samples in case of a right shift of 10 bits would be 32 - 10 | residual samples in case of a right shift of 10 bits would be 32 - 10 | |||
| + 1 = 23 bits, which means it is not necessary to perform the | + 1 = 23 bits, which means it is not necessary to perform the | |||
| aforementioned check. | aforementioned check. | |||
| As another example, when encoding 32-bit PCM with fixed predictors, | As another example, when encoding 32-bit PCM with fixed predictors, | |||
| all predictor orders must be checked. While the 0-order fixed | all predictor orders must be checked. While the zero-order fixed | |||
| predictor is guaranteed to have residual samples that fit a 32-bit | predictor is guaranteed to have residual samples that fit a 32-bit | |||
| signed int, it might produce a residual sample value that is the most | signed integer, it might produce a residual sample value that is the | |||
| negative representable value of that 32-bit signed int. | most negative representable value of that 32-bit signed integer. | |||
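The decision described in this section can be sketched as follows. The function and constant names are hypothetical; the range check encodes the limit stated above, i.e., a 32-bit signed two's complement integer excluding the most negative value.

```python
INT32_MIN = -(1 << 31)
INT32_MAX = (1 << 31) - 1


def lpc_residual_bits(prediction_bits: int, right_shift: int) -> int:
    """Bits needed for a linear-predictor residual sample."""
    return prediction_bits - right_shift + 1


def needs_per_sample_check(residual_bits: int) -> bool:
    """True unless residuals implicitly fit a theoretical 31-bit signed integer."""
    return residual_bits > 31


def residual_is_valid(residual: int) -> bool:
    """True if the residual is within the allowed range."""
    return INT32_MIN < residual <= INT32_MAX


# Last example of the previous section with a right shift of 10 bits.
example_bits = lpc_residual_bits(32, 10)
```

For 32-bit PCM with a zero-order fixed predictor, the residual needs 32 bits, so needs_per_sample_check(32) is true and each value has to be tested with residual_is_valid.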
| Note that on decoding, while the residual sample values are limited | Note that on decoding, while the residual sample values are limited | |||
| to the aforementioned range, the predictions are not. This means | to the aforementioned range, the predictions are not. This means | |||
| that while the decoding of the residual samples can happen fully in | that while the decoding of the residual samples can happen fully in | |||
| 32-bit signed integers, decoders must be sure to execute the addition | 32-bit signed integers, decoders must be sure to execute the addition | |||
| of each residual sample to its accompanying prediction with a wide | of each residual sample to its accompanying prediction with a signed | |||
| enough signed integer data type like on encoding. | integer data type that is wide enough, as with encoding. | |||
| A.5. Rice coding | A.5. Rice Coding | |||
| When folding (i.e., zig-zag encoding) the residual sample values, no | When folding (i.e., zigzag encoding) the residual sample values, no | |||
| extra bits are needed when the absolute value of each residual sample | extra bits are needed when the absolute value of each residual sample | |||
| is first stored in an unsigned data type of the size of the last | is first stored in an unsigned data type of the size of the last | |||
| step, then doubled, and then has one subtracted depending on whether | step, then doubled, and then has one subtracted depending on whether | |||
| the residual sample was positive or negative. Many implementations, | the residual sample was positive or negative. However, many | |||
| however, choose to require one extra bit of data type size so zig-zag | implementations choose to require one extra bit of data type size so | |||
| encoding can happen in one step and without a cast instead of the | zigzag encoding can happen in one step without a cast instead of the | |||
| procedure described in the previous sentence. | procedure described in the previous sentence. | |||
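Both approaches can be compared in a short sketch. It assumes the folding maps non-negative residuals to even numbers and negative residuals to odd numbers (0, -1, 1, -2 become 0, 1, 2, 3). Python integers are unbounded, so the unsigned type of the two-step variant is simulated with a mask; the function names are hypothetical.

```python
def fold_two_step(residual: int, bits: int) -> int:
    """Fold using only an unsigned type of `bits` bits (simulated by masking)."""
    mask = (1 << bits) - 1
    folded = (abs(residual) * 2) & mask   # absolute value, doubled
    if residual < 0:
        folded = (folded - 1) & mask      # one subtracted for negative input
    return folded


def fold_one_step(residual: int, bits: int) -> int:
    """Fold in one expression; a real implementation needs bits + 1 signed bits."""
    # residual >> bits is 0 for non-negative and -1 for negative input.
    return (residual << 1) ^ (residual >> bits)


# All 4-bit signed residuals fold to the 16 distinct 4-bit unsigned values.
folded_values = sorted(fold_two_step(r, 4) for r in range(-8, 8))
```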
| Appendix B. Past format changes | Appendix B. Past Format Changes | |||
| This informational appendix documents the changes made to the FLAC | This informational appendix documents the changes made to the FLAC | |||
| format over the years. This information might be of use when | format over the years. This information might be of use when | |||
| encountering FLAC files that were made with software following the | encountering FLAC files that were made with software following the | |||
| format as it was before the changes documented in this appendix. | format as it was before the changes documented in this appendix. | |||
| The FLAC format was first specified in December 2000 and the | The FLAC format was first specified in December 2000, and the | |||
| bitstream format was considered frozen with the release of FLAC (the | bitstream format was considered frozen with the release of FLAC 1.0 | |||
| reference encoder/decoder) 1.0 in July 2001. Only changes made since | (the reference encoder/decoder) in July 2001. Only changes made | |||
| this first stable release are considered in this appendix. Changes | since this first stable release are considered in this appendix. | |||
| made to the FLAC streamable subset definition (see Section 7) are not | Changes made to the FLAC streamable subset definition (see Section 7) | |||
| considered. | are not considered. | |||
| B.1. Addition of blocking strategy bit | B.1. Addition of Blocking Strategy Bit | |||
| Perhaps the largest backwards incompatible change to the | Perhaps the largest backwards-incompatible change to the | |||
| specification was published in July 2007. Before this change, | specification was published in July 2007. Before this change, | |||
| variable block size streams were not explicitly marked as such by a | variable block size streams were not explicitly marked as such by a | |||
| flag bit in the frame header. A decoder had two ways to detect a | flag bit in the frame header. A decoder had two ways to detect a | |||
| variable block size stream, either by comparing the minimum and | variable block size stream: by comparing the minimum and maximum | |||
| maximum block size in the STREAMINFO metadata block (which are equal | block sizes in the streaminfo metadata block (which are equal for a | |||
| for a fixed block size stream), or, if a decoder did not receive a | fixed block size stream) or by detecting a change of block size | |||
| STREAMINFO metadata block, by detecting a change of block size during | during a stream if a decoder did not receive a streaminfo metadata | |||
| a stream, which in theory might never happen. As the meaning of | block, which in theory might never happen. As the meaning of | |||
| the coded number in the frame header depends on whether or not a | the coded number in the frame header depends on whether or not a | |||
| stream is variable block size, this presented a problem: the meaning | stream has a variable block size, this presented a problem: the | |||
| of the coded number could not be reliably determined. To fix this | meaning of the coded number could not be reliably determined. To fix | |||
| problem, one of the reserved bits was changed to be used as a | this problem, one of the reserved bits was changed to be used as a | |||
| blocking strategy bit. See also Section 9.1. | blocking strategy bit. See also Section 9.1. | |||
| Along with the addition of a new flag, the meaning of the block size | Along with the addition of a new flag, the meaning of the block size | |||
| bits (see Section 9.1.1) was subtly changed. Initially, block size | bits (see Section 9.1.1) was subtly changed. Initially, block size | |||
| bit patterns 0b0001-0b0101 and 0b1000-0b1111 could only be used for | bit patterns 0b0001-0b0101 and 0b1000-0b1111 could only be used for | |||
| fixed block size streams, while 0b0110 and 0b0111 could be used for | fixed block size streams, while 0b0110 and 0b0111 could be used for | |||
| both fixed block size and variable block size streams. With the | both fixed block size and variable block size streams. With this | |||
| change, these restrictions were lifted, and patterns 0b0001-0b1111 | change, these restrictions were lifted, and patterns 0b0001-0b1111 | |||
| are now used for both variable block size and fixed block size | are now used for both variable block size and fixed block size | |||
| streams. | streams. | |||
| B.2. Restriction of encoded residual samples | B.2. Restriction of Encoded Residual Samples | |||
| Another change to the specification was deemed necessary during | Another change to the specification was deemed necessary during | |||
| standardization by the CELLAR working group of the IETF. As | standardization by the CELLAR Working Group of the IETF. As | |||
| specified in Section 9.2.7 a limit is imposed on residual samples. | specified in Section 9.2.7, a limit is imposed on residual samples. | |||
| This limit was not specified prior to the IETF standardization | This limit was not specified prior to the IETF standardization | |||
| effort. However, as far as was known to the working group, no FLAC | effort. However, as far as was known to the working group, no FLAC | |||
| encoder at that time produced FLAC files containing residual samples | encoder at that time produced FLAC files containing residual samples | |||
| exceeding this limit. This is mostly because it is very unlikely to | exceeding this limit. This is mostly because it is very unlikely to | |||
| encounter residual samples exceeding this limit when encoding 24-bit | encounter residual samples exceeding this limit when encoding 24-bit | |||
| PCM, and encoding of PCM with higher bit depths was not yet | PCM, and encoding of PCM with higher bit depths was not yet | |||
| implemented in any known encoder. In fact, these FLAC encoders would | implemented in any known encoder. In fact, these FLAC encoders would | |||
| produce corrupt files upon being triggered to produce such residual | produce corrupt files upon being triggered to produce such residual | |||
| samples and it is unlikely any non-experimental encoder would ever do | samples, and it is unlikely any non-experimental encoder would ever | |||
| so, even when presented with crafted material. Therefore, it was not | do so, even when presented with crafted material. Therefore, it was | |||
| expected that existing implementations would be rendered non- | not expected that existing implementations would be rendered non- | |||
| compliant by this change. | compliant by this change. | |||
| B.3. Addition of 5-bit Rice parameters | B.3. Addition of 5-Bit Rice Parameters | |||
| One significant addition to the format was the residual coding method | One significant addition to the format was the residual coding method | |||
| using 5-bit Rice parameters. Prior to publication of this addition | using 5-bit Rice parameters. Prior to publication of this addition | |||
| in July 2007, there was only one residual coding method specified, a | in July 2007, a partitioned Rice code with 4-bit Rice parameters was | |||
| partitioned Rice code with 4-bit Rice parameters. The range offered | the only residual coding method specified. The range offered by this | |||
| by this coding method proved too small when encoding 24-bit PCM, | coding method proved too small when encoding 24-bit PCM; therefore, a | |||
| therefore, a second residual coding method was specified, identical | second residual coding method was specified that was identical to the | |||
| to the first but with 5-bit Rice parameters. | first, but with 5-bit Rice parameters. | |||
| B.4. Restriction of LPC shift to non-negative values | B.4. Restriction of LPC Shift to Non-negative Values | |||
| As stated in Section 9.2.6, the predictor right shift is a number | As stated in Section 9.2.6, the predictor right shift is a number | |||
| signed two's complement, which MUST NOT be negative. This is because | signed two's complement, which MUST NOT be negative. This is because | |||
| right shifting a number by a negative amount is undefined behavior in | shifting a number to the right by a negative amount is undefined | |||
| the C programming language standard. The intended behavior was that | behavior in the C programming language standard. The intended | |||
| a positive number would be a right shift and a negative number would | behavior was that a positive number would be a right shift and a | |||
| be a left shift. The FLAC reference encoder was changed in 2007 to | negative number would be a left shift. The FLAC reference encoder | |||
| not generate LPC subframes with a negative predictor right shift, as | was changed in 2007 to not generate LPC subframes with a negative | |||
| it turned out that the use of such subframes would only very rarely | predictor right shift, as it turned out that the use of such | |||
| provide any benefit, and the decoders that were already widely in use | subframes would only very rarely provide any benefit and the decoders | |||
| at that point were not able to handle such subframes. | that were already widely in use at that point were not able to handle | |||
| such subframes. | ||||
| Appendix C. Interoperability considerations | Appendix C. Interoperability Considerations | |||
| As documented in Appendix B, there have been some changes and | As documented in Appendix B, there have been some changes and | |||
| additions to the FLAC format. Additionally, implementation of | additions to the FLAC format. Additionally, implementation of | |||
| certain features of the FLAC format took many years, meaning early | certain features of the FLAC format took many years, meaning early | |||
| decoder implementations could not be tested against files with these | decoder implementations could not be tested against files with these | |||
| features. Finally, many lower-quality FLAC decoders only implement | features. Finally, many lower-quality FLAC decoders only implement | |||
| just enough features required for playback of the most common FLAC | just enough features required for playback of the most common FLAC | |||
| files. | files. | |||
| This appendix provides some considerations for encoder | This appendix provides some considerations for encoder | |||
| implementations aiming to create highly compatible files. As this | implementations aiming to create highly compatible files. As this | |||
| topic is one that might change after this document is finished, | topic is one that might change after this document is published, | |||
| consult [FLAC-wiki-interoperability] for more up-to-date information. | consult [FLAC-wiki-interoperability] for more up-to-date information. | |||
| C.1. Features outside of the streamable subset | C.1. Features outside of the Streamable Subset | |||
| As described in Section 7, FLAC specifies a subset of its | As described in Section 7, FLAC specifies a subset of its | |||
| capabilities as the FLAC streamable subset. Certain decoders may | capabilities as the FLAC streamable subset. Certain decoders may | |||
| choose to only decode FLAC files conforming to the limitations | choose to only decode FLAC files conforming to the limitations | |||
| imposed by the streamable subset. Therefore, maximum compatibility | imposed by the streamable subset. Therefore, maximum compatibility | |||
| with decoders is achieved when the limitations of the FLAC streamable | with decoders is achieved when the limitations of the FLAC streamable | |||
| subset are followed when creating FLAC files. | subset are followed when creating FLAC files. | |||
| C.2. Variable block size | C.2. Variable Block Size | |||
| Because it is often difficult to find the optimal arrangement of | Because it is often difficult to find the optimal arrangement of | |||
| block sizes for maximum compression, most encoders choose to create | block sizes for maximum compression, most encoders choose to create | |||
| files with a fixed block size. Because of this, many decoder | files with a fixed block size. Because of this, many decoder | |||
| implementations receive minimal use when handling variable block size | implementations receive minimal use when handling variable block size | |||
| streams, and this can reveal bugs or reveal that implementations do | streams, and this can reveal bugs or reveal that implementations do | |||
| not decode them at all. Furthermore, as explained in Appendix B.1, | not decode them at all. Furthermore, as explained in Appendix B.1, | |||
| there have been some changes to the way variable block size streams | there have been some changes to the way variable block size streams | |||
| were encoded. Because of this, maximum compatibility with decoders | are encoded. Because of this, maximum compatibility with decoders is | |||
| is achieved when FLAC files are created using fixed block size | achieved when FLAC files are created using fixed block size streams. | |||
| streams. | ||||
| C.3. 5-bit Rice parameter | C.3. 5-Bit Rice Parameters | |||
| As the addition of the 5-bit Rice parameter, as described in | As the addition of the coding method using 5-bit Rice parameters, as | |||
| Appendix B.3, occurred quite a few years after the FLAC format was | described in Appendix B.3, occurred quite a few years after the FLAC | |||
| first introduced, some early decoders might not be able to decode | format was first introduced, some early decoders might not be able to | |||
| files containing such Rice parameters. The introduction of this was | decode files containing such Rice parameters. The introduction of | |||
| specifically aimed at improving compression of 24-bit PCM audio, and | this was specifically aimed at improving compression of 24-bit PCM | |||
| compression of 16-bit PCM audio only rarely benefits from using 5-bit | audio, and compression of 16-bit PCM audio only rarely benefits from | |||
| Rice parameters. Therefore, maximum compatibility with decoders is | using 5-bit Rice parameters. Therefore, maximum compatibility with | |||
| achieved when FLAC files containing audio with a bit depth of 16 bits | decoders is achieved when FLAC files containing audio with a bit | |||
| or lower are created without any use of 5-bit Rice parameters. | depth of 16 bits or less are created without any use of 5-bit Rice | |||
| parameters. | ||||
| C.4. Rice escape code | C.4. Rice Escape Code | |||
| Escaped Rice partitions are seldom used, as it turned out their use | Escaped Rice partitions are seldom used, as it turned out their use | |||
| provides only a very small compression improvement. As many encoders | provides only a very small compression improvement. As many encoders | |||
| therefore do not use these by default or are not capable of producing | do not use these by default or are not capable of producing them at | |||
| them at all, it is likely that many decoder implementations are not | all, it is likely that many decoder implementations are not able to | |||
| able to decode them correctly. Therefore, maximum compatibility with | decode them correctly. Therefore, maximum compatibility with | |||
| decoders is achieved when FLAC files are created without any use of | decoders is achieved when FLAC files are created without any use of | |||
| escaped Rice partitions. | escaped Rice partitions. | |||
| C.5. Uncommon block size | C.5. Uncommon Block Size | |||
| For unknown reasons, some decoders have chosen to support only common | For unknown reasons, some decoders have chosen to support only common | |||
| block sizes for all but the last block of a stream. Therefore, | block sizes for all but the last block of a stream. Therefore, | |||
| maximum compatibility with decoders is achieved when creating FLAC | maximum compatibility with decoders is achieved when creating FLAC | |||
| files using common block sizes, as listed in Section 9.1.1, for all | files using common block sizes, as listed in Section 9.1.1, for all | |||
| but the last block of a stream. | but the last block of a stream. | |||
| C.6. Uncommon bit depth | C.6. Uncommon Bit Depth | |||
| Most audio is stored in bit depths that are a whole number of bytes, | Most audio is stored in bit depths that are a whole number of bytes, | |||
| e.g., 8, 16 or 24 bit. There is however audio with different bit | e.g., 8, 16, or 24 bits. However, there is audio with different bit | |||
| depths. A few examples: | depths. A few examples: | |||
| * DVD-Audio has the possibility to store 20 bit PCM audio. | * DVD-Audio has the possibility to store 20-bit PCM audio. | |||
| * DAT and DV can store 12 bit PCM audio. | ||||
| * NICAM-728 samples at 14 bit, which is companded to 10 bit. | * DAT and DV can store 12-bit PCM audio. | |||
| * 8-bit µ-law can be losslessly converted to 14 bit (Linear) PCM. | ||||
| * 8-bit A-law can be losslessly converted to 13 bit (Linear) PCM. | * NICAM-728 samples at 14 bits, which is companded to 10 bits. | |||
| * 8-bit µ-law can be losslessly converted to 14-bit (Linear) PCM. | ||||
| * 8-bit A-law can be losslessly converted to 13-bit (Linear) PCM. | ||||
| The FLAC format can contain these bit depths directly, but because | The FLAC format can contain these bit depths directly, but because | |||
| they are uncommon, some decoders are not able to process the | they are uncommon, some decoders are not able to process the | |||
| resulting files correctly. It is possible to store these formats in | resulting files correctly. It is possible to store these formats in | |||
| a FLAC file with a more common bit depth without sacrificing | a FLAC file with a more common bit depth without sacrificing | |||
| compression by padding each sample with zero bits to a bit depth that | compression by padding each sample with zero bits to a bit depth that | |||
| is a whole byte. The FLAC format can efficiently compress these | is a whole byte. The FLAC format can efficiently compress these | |||
| wasted bits. See Section 9.2.2 for details. | wasted bits. See Section 9.2.2 for details. | |||
| Therefore, maximum compatibility with decoders is achieved when FLAC | Therefore, maximum compatibility with decoders is achieved when FLAC | |||
| files are created by padding samples of such audio with zero bits to | files are created by padding samples of such audio with zero bits to | |||
| the bit depth that is the next whole number of bytes. | the bit depth that is the next whole number of bytes. | |||
| In cases where the original signal is already padded, this operation | In cases where the original signal is already padded, this operation | |||
| cannot be reversed losslessly without knowing the original bit depth. | cannot be reversed losslessly without knowing the original bit depth. | |||
| To leave no ambiguity, the original bit depth needs to be stored, for | To leave no ambiguity, the original bit depth needs to be stored, for | |||
| example, in a vorbis comment field, by storing the header of the | example, in a Vorbis comment field or by storing the header of the | |||
| original file, or in a description of the file. The choice of a | original file. The choice of a suitable method is left to the | |||
| suitable method is left to the implementer. | implementor. | |||
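The padding described above can be sketched as follows. This is an illustration only: the helper names are hypothetical, and it assumes the zero bits are appended at the least-significant end of each sample, which is where the wasted-bits mechanism of Section 9.2.2 looks for them.

```python
def padded_bit_depth(bit_depth):
    """Round a bit depth up to the next whole number of bytes."""
    return (bit_depth + 7) // 8 * 8


def pad_sample(sample, bit_depth):
    """Append zero bits below the sample so it fills whole bytes."""
    return sample << (padded_bit_depth(bit_depth) - bit_depth)


def unpad_sample(sample, original_bit_depth):
    """Undo pad_sample; requires the original bit depth to be known."""
    shift = padded_bit_depth(original_bit_depth) - original_bit_depth
    return sample >> shift


print(padded_bit_depth(20))         # 24
print(pad_sample(-1, 20))           # -16
print(unpad_sample(-16, 20))        # -1
```

Note that `unpad_sample` takes the original bit depth as an argument, which is why that value has to be stored somewhere alongside the audio.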
| Besides audio with a 'non-whole byte' bit depth, some decoder | Besides audio with a "non-whole byte" bit depth, some decoder | |||
| implementations have chosen to only accept FLAC files coding for PCM | implementations have chosen to only accept FLAC files coding for PCM | |||
| audio with a bit depth of 16 bit. Many implementations support bit | audio with a bit depth of 16 bits. Many implementations support bit | |||
| depths up to 24 bit but no higher. Consult | depths up to 24 bits, but no higher. Consult | |||
| [FLAC-wiki-interoperability] for more up-to-date information. | [FLAC-wiki-interoperability] for more up-to-date information. | |||
| C.7. Multi-channel audio and uncommon sample rates | C.7. Multi-Channel Audio and Uncommon Sample Rates | |||
| Many FLAC audio players are unable to render multi-channel audio or | Many FLAC audio players are unable to render multi-channel audio or | |||
| audio with an uncommon sample rate. While this is not a concern | audio with an uncommon sample rate. While this is not a concern | |||
| specific to the FLAC format, it is of note when requiring maximum | specific to the FLAC format, it is of note when requiring maximum | |||
| compatibility with decoders. Unlike the previously mentioned | compatibility with decoders. Unlike the previously mentioned | |||
| interoperability considerations, this is one where compatibility | interoperability considerations, this is one where compatibility | |||
| cannot be improved without sacrificing the lossless nature of the | cannot be improved without sacrificing the lossless nature of the | |||
| FLAC format. | FLAC format. | |||
| From a non-exhaustive inquiry, it seems that a non-negligible amount | From a non-exhaustive inquiry, it seems that a non-negligible number | |||
| of players, especially hardware players, do not support audio with 3 | of players, especially hardware players, do not support audio with 3 | |||
| or more channels or sample rates other than those considered common, | or more channels or sample rates other than those considered common; | |||
| see Section 9.1.2. | see Section 9.1.2. | |||
| For those players that do support and are able to render multi- | For those players that do support and are able to render multi- | |||
| channel audio, many do not parse and use the | channel audio, many do not parse and use the | |||
| WAVEFORMATEXTENSIBLE_CHANNEL_MASK tag (see Section 8.6.2). This too | WAVEFORMATEXTENSIBLE_CHANNEL_MASK tag (see Section 8.6.2). This is | |||
| is an interoperability consideration where compatibility cannot be | also an interoperability consideration because compatibility cannot | |||
| improved without sacrificing the lossless nature of the FLAC format. | be improved without sacrificing the lossless nature of the FLAC | |||
| format. | ||||
| C.8. Changing audio properties mid-stream | C.8. Changing Audio Properties Mid-Stream | |||
| Each FLAC frame header stores the audio sample rate, number of bits | Each FLAC frame header stores the audio sample rate, number of bits | |||
| per sample, and number of channels independently of the streaminfo | per sample, and number of channels independently of the streaminfo | |||
| metadata block and other frame headers. This was done to permit | metadata block and other frame headers. This was done to permit | |||
| multicasting of FLAC files, but it also allows these properties to | multicasting of FLAC files, but it also allows these properties to | |||
| change mid-stream. However, many FLAC decoders do not handle such | change mid-stream. However, many FLAC decoders do not handle such | |||
| changes, as few other formats are capable of holding such streams and | changes, as few other formats are capable of holding such streams and | |||
| changing playback properties during playback is often not possible | changing playback properties during playback is often not possible | |||
| without interrupting playback. Also, as explained in Section 9, | without interrupting playback. Also, as explained in Section 9, | |||
| using this feature of FLAC results in various practical problems. | using this feature of FLAC results in various practical problems. | |||
| skipping to change at page 71, line 30 ¶ | skipping to change at line 3123 ¶ | |||

| such a stream correctly. Therefore, maximum compatibility with | such a stream correctly. Therefore, maximum compatibility with | |||
| decoders is achieved when FLAC files are created with a single set of | decoders is achieved when FLAC files are created with a single set of | |||
| audio properties, in which the properties coded in the streaminfo | audio properties, in which the properties coded in the streaminfo | |||
| metadata block (see Section 8.2) and the properties coded in all | metadata block (see Section 8.2) and the properties coded in all | |||
| frame headers (see Section 9.1) are the same. This can be achieved | frame headers (see Section 9.1) are the same. This can be achieved | |||
| by splitting up an input stream with changing audio properties at the | by splitting up an input stream with changing audio properties at the | |||
| points where these properties change into separate streams or files. | points where these properties change into separate streams or files. | |||
| Appendix D. Examples | Appendix D. Examples | |||
| This informational appendix contains short example FLAC files that | This informational appendix contains short examples of FLAC files | |||
| are decoded step by step. These examples provide a more engaging way | that are decoded step by step. These examples provide a more | |||
| to understand the FLAC format than the formal specification. The | engaging way to understand the FLAC format than the formal | |||
| text explaining these examples assumes the reader has at least | specification. The text explaining these examples assumes the reader | |||
| cursorily read the specification and that the reader refers to the | has at least cursorily read the specification and that the reader | |||
| specification for explanation of the terminology used. These | refers to the specification for explanation of the terminology used. | |||
| examples mostly focus on the layout of several metadata blocks and | These examples mostly focus on the layout of several metadata blocks, | |||
| subframe types and the implications of certain aspects (for example, | subframe types, and the implications of certain aspects (e.g., wasted | |||
| wasted bits and stereo decorrelation) on this layout. | bits and stereo decorrelation) on this layout. | |||
| The examples feature files generated by various FLAC encoders. These | The examples feature files generated by various FLAC encoders. These | |||
| are presented in hexadecimal or binary format, followed by tables and | are presented in hexadecimal or binary format, followed by tables and | |||
| text referring to various features by their starting bit positions in | text referring to various features by their starting bit positions in | |||
| these representations. Each starting position (shortened to 'start' | these representations. Each starting position (shortened to "start" | |||
| in the tables) is a hexadecimal byte position and a start bit within | in the tables) is a hexadecimal byte position and a start bit within | |||
| that byte, separated by a plus sign. Counts for these start at zero. | that byte, separated by a plus sign. Counts for these start at zero. | |||
| For example, a feature starting at the 3rd bit of the 17th byte is | For example, a feature starting at the 3rd bit of the 17th byte is | |||
| referred to as starting at 0x10+2. The files that are explored in | referred to as starting at 0x10+2. The files that are explored in | |||
| these examples can be found at [FLAC-specification-github]. | these examples can be found at [FLAC-specification-github]. | |||
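The start-position notation can be converted to and from an absolute bit offset with a few lines of code. The helper names below are hypothetical and only illustrate the convention described above (zero-based hexadecimal byte position, a plus sign, and a zero-based bit position within that byte).

```python
def format_start(bit_offset):
    """Format an absolute, zero-based bit offset as 'byte+bit'."""
    byte_index, bit_index = divmod(bit_offset, 8)
    return f"0x{byte_index:02x}+{bit_index}"


def parse_start(text):
    """Convert 'byte+bit' notation back to an absolute bit offset."""
    byte_text, bit_text = text.split("+")
    return int(byte_text, 16) * 8 + int(bit_text)


# The 3rd bit of the 17th byte: byte index 16, bit index 2.
print(format_start(16 * 8 + 2))  # 0x10+2
print(parse_start("0x10+2"))     # 130
```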
| All data in this appendix has been thoroughly verified. However, as | All data in this appendix has been thoroughly verified. However, as | |||
| this appendix is informational, if any information here conflicts | this appendix is informational, if any information here conflicts | |||
| with statements in the formal specification, the latter takes | with statements in the formal specification, the latter takes | |||
| precedence. | precedence. | |||
| D.1. Decoding example 1 | D.1. Decoding Example 1 | |||
| This very short example FLAC file codes for PCM audio that has two | This very short example FLAC file codes for PCM audio that has two | |||
| channels, each containing one sample. The focus of this example is | channels, each containing one sample. The focus of this example is | |||
| on the essential parts of a FLAC file. | on the essential parts of a FLAC file. | |||
| D.1.1. Example file 1 in hexadecimal representation | D.1.1. Example File 1 in Hexadecimal Representation | |||
| 00000000: 664c 6143 8000 0022 1000 1000 fLaC...".... | 00000000: 664c 6143 8000 0022 1000 1000 fLaC...".... | |||
| 0000000c: 0000 0f00 000f 0ac4 42f0 0000 ........B... | 0000000c: 0000 0f00 000f 0ac4 42f0 0000 ........B... | |||
| 00000018: 0001 3e84 b418 07dc 6903 0758 ..>.....i..X | 00000018: 0001 3e84 b418 07dc 6903 0758 ..>.....i..X | |||
| 00000024: 6a3d ad1a 2e0f fff8 6918 0000 j=......i... | 00000024: 6a3d ad1a 2e0f fff8 6918 0000 j=......i... | |||
| 00000030: bf03 58fd 0312 8baa 9a ..X...... | 00000030: bf03 58fd 0312 8baa 9a ..X...... | |||
| D.1.2. Example file 1 in binary representation | D.1.2. Example File 1 in Binary Representation | |||
| 00000000: 01100110 01001100 01100001 01000011 fLaC | 00000000: 01100110 01001100 01100001 01000011 fLaC | |||
| 00000004: 10000000 00000000 00000000 00100010 ..." | 00000004: 10000000 00000000 00000000 00100010 ..." | |||
| 00000008: 00010000 00000000 00010000 00000000 .... | 00000008: 00010000 00000000 00010000 00000000 .... | |||
| 0000000c: 00000000 00000000 00001111 00000000 .... | 0000000c: 00000000 00000000 00001111 00000000 .... | |||
| 00000010: 00000000 00001111 00001010 11000100 .... | 00000010: 00000000 00001111 00001010 11000100 .... | |||
| 00000014: 01000010 11110000 00000000 00000000 B... | 00000014: 01000010 11110000 00000000 00000000 B... | |||
| 00000018: 00000000 00000001 00111110 10000100 ..>. | 00000018: 00000000 00000001 00111110 10000100 ..>. | |||
| 0000001c: 10110100 00011000 00000111 11011100 .... | 0000001c: 10110100 00011000 00000111 11011100 .... | |||
| 00000020: 01101001 00000011 00000111 01011000 i..X | 00000020: 01101001 00000011 00000111 01011000 i..X | |||
| 00000024: 01101010 00111101 10101101 00011010 j=.. | 00000024: 01101010 00111101 10101101 00011010 j=.. | |||
| 00000028: 00101110 00001111 11111111 11111000 .... | 00000028: 00101110 00001111 11111111 11111000 .... | |||
| 0000002c: 01101001 00011000 00000000 00000000 i... | 0000002c: 01101001 00011000 00000000 00000000 i... | |||
| 00000030: 10111111 00000011 01011000 11111101 ..X. | 00000030: 10111111 00000011 01011000 11111101 ..X. | |||
| 00000034: 00000011 00010010 10001011 10101010 .... | 00000034: 00000011 00010010 10001011 10101010 .... | |||
| 00000038: 10011010 | 00000038: 10011010 | |||
| D.1.3. Signature and streaminfo | D.1.3. Signature and Streaminfo | |||
| The first 4 bytes of the file contain the fLaC file signature. | The first 4 bytes of the file contain the fLaC file signature. | |||
| Directly following it is a metadata block. The signature and the | Directly following it is a metadata block. The signature and the | |||
| first metadata block header are broken down in the following table. | first metadata block header are broken down in the following table. | |||
| +========+=========+============+===========================+ | +========+=========+============+===========================+ | |||
| | Start | Length | Contents | Description | | | Start | Length | Contents | Description | | |||
| +========+=========+============+===========================+ | +========+=========+============+===========================+ | |||
| | 0x00+0 | 4 bytes | 0x664C6143 | fLaC | | | 0x00+0 | 4 bytes | 0x664C6143 | fLaC | | |||
| +--------+---------+------------+---------------------------+ | +--------+---------+------------+---------------------------+ | |||
| | 0x04+0 | 1 bit | 0b1 | Last metadata block | | | 0x04+0 | 1 bit | 0b1 | Last metadata block | | |||
| +--------+---------+------------+---------------------------+ | +--------+---------+------------+---------------------------+ | |||
| | 0x04+1 | 7 bits | 0b0000000 | Streaminfo metadata block | | | 0x04+1 | 7 bits | 0b0000000 | Streaminfo metadata block | | |||
| +--------+---------+------------+---------------------------+ | +--------+---------+------------+---------------------------+ | |||
| | 0x05+0 | 3 bytes | 0x000022 | Length 34 byte | | | 0x05+0 | 3 bytes | 0x000022 | Length of 34 bytes | | |||
| +--------+---------+------------+---------------------------+ | +--------+---------+------------+---------------------------+ | |||
| Table 27 | Table 27 | |||
| As the header indicates that this is the last metadata block, the | As the header indicates that this is the last metadata block, the | |||
| position of the first audio frame can now be calculated as the | position of the first audio frame can now be calculated as the | |||
| position of the first byte after the metadata block header + the | position of the first byte after the metadata block header + the | |||
| length of the block, i.e., 8+34 = 42 or 0x2a. As can be seen, 0x2a | length of the block, i.e., 8+34 = 42 or 0x2a. Thus, 0x2a indeed | |||
| indeed contains the frame sync code for fixed block size streams, | contains the frame sync code for fixed block size streams -- 0xfff8. | |||
| 0xfff8. | ||||
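The calculation above can be reproduced on the bytes of example file 1. The sketch below is illustrative; `parse_metadata_block_header` is a hypothetical helper name, and the byte string is copied from the hexadecimal dump in D.1.1.

```python
EXAMPLE_FILE_1 = bytes.fromhex(
    "664c 6143 8000 0022 1000 1000"
    "0000 0f00 000f 0ac4 42f0 0000"
    "0001 3e84 b418 07dc 6903 0758"
    "6a3d ad1a 2e0f fff8 6918 0000"
    "bf03 58fd 0312 8baa 9a"
)


def parse_metadata_block_header(data, offset):
    """Split a 4-byte metadata block header into its three fields."""
    first = data[offset]
    is_last = bool(first >> 7)      # 1 bit: last-metadata-block flag
    block_type = first & 0x7F       # 7 bits: block type (0 = streaminfo)
    length = int.from_bytes(data[offset + 1:offset + 4], "big")
    return is_last, block_type, length


assert EXAMPLE_FILE_1[:4] == b"fLaC"
is_last, block_type, length = parse_metadata_block_header(EXAMPLE_FILE_1, 4)
# 4 signature bytes + 4 header bytes + block length
first_frame = 4 + 4 + length
print(is_last, block_type, length)   # True 0 34
print(hex(first_frame))              # 0x2a
print(EXAMPLE_FILE_1[first_frame:first_frame + 2].hex())  # fff8
```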
| The streaminfo metadata block contents are broken down in the | The streaminfo metadata block contents are broken down in the | |||
| following table. | following table. | |||
| +========+==========+====================+=========================+ | +========+==========+====================+==========================+ | |||
| | Start | Length | Contents | Description | | | Start | Length | Contents | Description | | |||
| +========+==========+====================+=========================+ | +========+==========+====================+==========================+ | |||
| | 0x08+0 | 2 bytes | 0x1000 | Min. block size 4096 | | | 0x08+0 | 2 bytes | 0x1000 | Min. block size 4096 | | |||
| +--------+----------+--------------------+-------------------------+ | +--------+----------+--------------------+--------------------------+ | |||
| | 0x0a+0 | 2 bytes | 0x1000 | Max. block size 4096 | | | 0x0a+0 | 2 bytes | 0x1000 | Max. block size 4096 | | |||
| +--------+----------+--------------------+-------------------------+ | +--------+----------+--------------------+--------------------------+ | |||
| | 0x0c+0 | 3 bytes | 0x00000f | Min. frame size 15 byte | | | 0x0c+0 | 3 bytes | 0x00000f | Min. frame size 15 bytes | | |||
| +--------+----------+--------------------+-------------------------+ | +--------+----------+--------------------+--------------------------+ | |||
| | 0x0f+0 | 3 bytes | 0x00000f | Max. frame size 15 byte | | | 0x0f+0 | 3 bytes | 0x00000f | Max. frame size 15 bytes | | |||
| +--------+----------+--------------------+-------------------------+ | +--------+----------+--------------------+--------------------------+ | |||
| | 0x12+0 | 20 bits | 0x0ac4, 0b0100 | Sample rate 44100 hertz | | | 0x12+0 | 20 bits | 0x0ac4, 0b0100 | Sample rate 44100 hertz | | |||
| +--------+----------+--------------------+-------------------------+ | +--------+----------+--------------------+--------------------------+ | |||
| | 0x14+4 | 3 bits | 0b001 | 2 channels | | | 0x14+4 | 3 bits | 0b001 | 2 channels | | |||
| +--------+----------+--------------------+-------------------------+ | +--------+----------+--------------------+--------------------------+ | |||
| | 0x14+7 | 5 bits | 0b01111 | Sample bit depth 16 | | | 0x14+7 | 5 bits | 0b01111 | Sample bit depth 16 | | |||
| +--------+----------+--------------------+-------------------------+ | +--------+----------+--------------------+--------------------------+ | |||
| | 0x15+4 | 36 bits | 0b0000, 0x00000001 | Total no. of samples 1 | | | 0x15+4 | 36 bits | 0b0000, 0x00000001 | Total no. of samples 1 | | |||
| +--------+----------+--------------------+-------------------------+ | +--------+----------+--------------------+--------------------------+ | |||
| | 0x1a | 16 bytes | (...) | MD5 checksum | | | 0x1a | 16 | (...) | MD5 checksum | | |||
| +--------+----------+--------------------+-------------------------+ | | | bytes | | | | |||
| +--------+----------+--------------------+--------------------------+ | ||||
| Table 28 | Table 28 | |||
| The minimum and maximum block size are both 4096. This was | The minimum and maximum block sizes are both 4096. This was | |||
| apparently the block size the encoder planned to use, but as only 1 | apparently the block size the encoder planned to use, but as only 1 | |||
| interchannel sample was provided, no frames with 4096 samples are | interchannel sample was provided, no frames with 4096 samples are | |||
| actually present in this file. | actually present in this file. | |||
| Note that anywhere a number of samples is mentioned (block size, | Note that anywhere a number of samples is mentioned (block size, | |||
| total number of samples, sample rate), interchannel samples are | total number of samples, sample rate), interchannel samples are | |||
| meant. | meant. | |||
| The MD5 checksum (starting at 0x1a) is 0x3e84 b418 07dc 6903 0758 | The MD5 checksum (starting at 0x1a) is 0x3e84 b418 07dc 6903 0758 | |||
| 6a3d ad1a 2e0f. This will be validated after decoding the samples. | 6a3d ad1a 2e0f. This will be validated after decoding the samples. | |||
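The field layout in Table 28 can be checked by unpacking the 34 bytes of the streaminfo block. This is a minimal sketch with a hypothetical function name; the bytes are those at positions 0x08 through 0x29 of example file 1.

```python
STREAMINFO = bytes.fromhex(
    "1000 1000 00000f 00000f"
    "0ac442f000000001"
    "3e84b41807dc6903 07586a3dad1a2e0f"
)


def parse_streaminfo(body):
    """Unpack a 34-byte streaminfo metadata block into a dict."""
    if len(body) != 34:
        raise ValueError("streaminfo block must be 34 bytes")
    # Sample rate, channels, bit depth, and sample count share 64 bits.
    packed = int.from_bytes(body[10:18], "big")
    return {
        "min_block_size": int.from_bytes(body[0:2], "big"),
        "max_block_size": int.from_bytes(body[2:4], "big"),
        "min_frame_size": int.from_bytes(body[4:7], "big"),
        "max_frame_size": int.from_bytes(body[7:10], "big"),
        "sample_rate": packed >> 44,                    # 20 bits
        "channels": ((packed >> 41) & 0x7) + 1,         # 3 bits, stored minus 1
        "bit_depth": ((packed >> 36) & 0x1F) + 1,       # 5 bits, stored minus 1
        "total_samples": packed & ((1 << 36) - 1),      # 36 bits
        "md5": body[18:34].hex(),
    }


info = parse_streaminfo(STREAMINFO)
print(info["sample_rate"], info["channels"], info["bit_depth"])  # 44100 2 16
```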
| D.1.4. Audio frames | D.1.4. Audio Frames | |||
| The frame header starts at position 0x2a and is broken down in the | The frame header starts at position 0x2a and is broken down in the | |||
| following table. | following table. | |||
| +========+=========+=================+===================+ | +========+=========+=================+===================+ | |||
| | Start | Length | Contents | Description | | | Start | Length | Contents | Description | | |||
| +========+=========+=================+===================+ | +========+=========+=================+===================+ | |||
| | 0x2a+0 | 15 bits | 0xff, 0b1111100 | frame sync | | | 0x2a+0 | 15 bits | 0xff, 0b1111100 | Frame sync | | |||
| +--------+---------+-----------------+-------------------+ | +--------+---------+-----------------+-------------------+ | |||
| | 0x2b+7 | 1 bit | 0b0 | blocking strategy | | | 0x2b+7 | 1 bit | 0b0 | Blocking strategy | | |||
| +--------+---------+-----------------+-------------------+ | +--------+---------+-----------------+-------------------+ | |||
| | 0x2c+0 | 4 bits | 0b0110 | 8-bit block size | | | 0x2c+0 | 4 bits | 0b0110 | 8-bit block size | | |||
| | | | | further down | | | | | | further down | | |||
| +--------+---------+-----------------+-------------------+ | +--------+---------+-----------------+-------------------+ | |||
| | 0x2c+4 | 4 bits | 0b1001 | sample rate 44.1 | | | 0x2c+4 | 4 bits | 0b1001 | Sample rate 44.1 | | |||
| | | | | kHz | | | | | | kHz | | |||
| +--------+---------+-----------------+-------------------+ | +--------+---------+-----------------+-------------------+ | |||
| | 0x2d+0 | 4 bits | 0b0001 | stereo, no | | | 0x2d+0 | 4 bits | 0b0001 | Stereo, no | | |||
| | | | | decorrelation | | | | | | decorrelation | | |||
| +--------+---------+-----------------+-------------------+ | +--------+---------+-----------------+-------------------+ | |||
| | 0x2d+4 | 3 bits | 0b100 | bit depth 16 bit | | | 0x2d+4 | 3 bits | 0b100 | Bit depth 16 bits | | |||
| +--------+---------+-----------------+-------------------+ | +--------+---------+-----------------+-------------------+ | |||
| | 0x2d+7 | 1 bit | 0b0 | mandatory 0 bit | | | 0x2d+7 | 1 bit | 0b0 | Mandatory 0 bit | | |||
| +--------+---------+-----------------+-------------------+ | +--------+---------+-----------------+-------------------+ | |||
| | 0x2e+0 | 1 byte | 0x00 | frame number 0 | | | 0x2e+0 | 1 byte | 0x00 | Frame number 0 | | |||
| +--------+---------+-----------------+-------------------+ | +--------+---------+-----------------+-------------------+ | |||
| | 0x2f+0 | 1 byte | 0x00 | block size 1 | | | 0x2f+0 | 1 byte | 0x00 | Block size 1 | | |||
| +--------+---------+-----------------+-------------------+ | +--------+---------+-----------------+-------------------+ | |||
| | 0x30+0 | 1 byte | 0xbf | frame header CRC | | | 0x30+0 | 1 byte | 0xbf | Frame header CRC | | |||
| +--------+---------+-----------------+-------------------+ | +--------+---------+-----------------+-------------------+ | |||
| Table 29 | Table 29 | |||
| As the stream is a fixed block size stream, the number at 0x2e | As the stream is a fixed block size stream, the number at 0x2e | |||
| contains a frame number. As the value is smaller than 128, only 1 | contains a frame number. Because the value is smaller than 128, only | |||
| byte is used for the encoding. | 1 byte is used for the encoding. | |||
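The breakdown in Table 29 can be reproduced in code. The sketch below is not a general frame header parser: it assumes a one-byte coded frame number (value below 128) and the block size code 0b0110, as in this example, and the function names are hypothetical. The CRC uses the polynomial x^8 + x^2 + x + 1 with an initial value of 0.

```python
FRAME_HEADER = bytes.fromhex("fff8 6918 0000 bf")


def crc8(data):
    """CRC-8, polynomial 0x07, initial value 0."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ 0x07) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


def parse_frame_header(h):
    """Split the 7-byte frame header of example file 1 into fields."""
    fields = {
        "sync": ((h[0] << 8) | h[1]) >> 1,      # 15 bits
        "blocking_strategy": h[1] & 1,          # 0 = fixed block size
        "block_size_code": h[2] >> 4,
        "sample_rate_code": h[2] & 0x0F,
        "channel_code": h[3] >> 4,
        "bit_depth_code": (h[3] >> 1) & 0x7,
        "reserved": h[3] & 1,
        "frame_number": h[4],                   # assumes value < 128
    }
    if fields["block_size_code"] == 0b0110:
        # An 8-bit value holding (block size - 1) follows the frame number.
        fields["block_size"] = h[5] + 1
    fields["crc"] = h[6]
    return fields


header = parse_frame_header(FRAME_HEADER)
print(header["block_size"])                      # 1
print(hex(crc8(FRAME_HEADER[:6])))               # 0xbf
```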
| At byte 0x31, the first subframe starts, which is broken down in the | At byte 0x31, the first subframe starts, which is broken down in the | |||
| following table. | following table. | |||
| +========+=========+================+=========================+ | +========+=========+================+=========================+ | |||
| | Start | Length | Contents | Description | | | Start | Length | Contents | Description | | |||
| +========+=========+================+=========================+ | +========+=========+================+=========================+ | |||
| | 0x31+0 | 1 bit | 0b0 | mandatory 0 bit | | | 0x31+0 | 1 bit | 0b0 | Mandatory 0 bit | | |||
| +--------+---------+----------------+-------------------------+ | +--------+---------+----------------+-------------------------+ | |||
| | 0x31+1 | 6 bits | 0b000001 | verbatim subframe | | | 0x31+1 | 6 bits | 0b000001 | Verbatim subframe | | |||
| +--------+---------+----------------+-------------------------+ | +--------+---------+----------------+-------------------------+ | |||
| | 0x31+7 | 1 bit | 0b1 | wasted bits used | | | 0x31+7 | 1 bit | 0b1 | Wasted bits used | | |||
| +--------+---------+----------------+-------------------------+ | +--------+---------+----------------+-------------------------+ | |||
| | 0x32+0 | 2 bits | 0b01 | 2 wasted bits used | | | 0x32+0 | 2 bits | 0b01 | 2 wasted bits used | | |||
| +--------+---------+----------------+-------------------------+ | +--------+---------+----------------+-------------------------+ | |||
| | 0x32+2 | 14 bits | 0b011000, 0xfd | 14-bit unencoded sample | | | 0x32+2 | 14 bits | 0b011000, 0xfd | 14-bit unencoded sample | | |||
| +--------+---------+----------------+-------------------------+ | +--------+---------+----------------+-------------------------+ | |||
| Table 30 | Table 30 | |||
| As the wasted bits flag is 1 in this subframe, an unary coded number | As the wasted bits flag is 1 in this subframe, a unary-coded number | |||
| follows. Starting at 0x32, we see 0b01, which unary codes for 1, | follows. Starting at 0x32, we see 0b01, which unary codes for 1, | |||
| meaning this subframe uses 2 wasted bits. | meaning that this subframe uses 2 wasted bits. | |||
| As this is a verbatim subframe, the subframe only contains unencoded | As this is a verbatim subframe, the subframe only contains unencoded | |||
| sample values. With a block size of 1, it contains only a single | sample values. With a block size of 1, it contains only a single | |||
| sample. The bit depth of the audio is 16 bits, but as the subframe | sample. The bit depth of the audio is 16 bits, but as the subframe | |||
| header signals the use of 2 wasted bits, only 14 bits are stored. As | header signals the use of 2 wasted bits, only 14 bits are stored. As | |||
| no stereo decorrelation is used, a bit depth increase for the side | no stereo decorrelation is used, a bit depth increase for the side | |||
| channel is not applicable. So, the next 14 bits (starting at | channel is not applicable. So, the next 14 bits (starting at | |||
| position 0x32+2) contain the unencoded sample coded big-endian, | position 0x32+2) contain the unencoded sample coded big-endian, | |||
| signed two's complement. The value reads 0b011000 11111101, or 6397. | signed two's complement. The value reads 0b011000 11111101, or 6397. | |||
| This value needs to be shifted left by 2 bits, to account for the | This value needs to be shifted left by 2 bits to account for the | |||
| wasted bits. The value is then 0b011000 11111101 00, or 25588. | wasted bits. The value is then 0b011000 11111101 00, or 25588. | |||
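
The steps above (read the unary-coded wasted bits count, read the shortened sample, shift it back) can be sketched in Python as follows. The helper names `read_bits`, `to_signed`, and `decode_verbatim_sample` are hypothetical, and the input is limited to the two bytes at 0x32 and 0x33 as listed in Table 30.

```python
def read_bits(data, bit_offset, count):
    """Read `count` bits, most significant bit first, as an unsigned int."""
    value = 0
    for i in range(bit_offset, bit_offset + count):
        bit = (data[i // 8] >> (7 - i % 8)) & 1
        value = (value << 1) | bit
    return value


def to_signed(value, bits):
    """Interpret an unsigned `bits`-wide value as two's complement."""
    if value & (1 << (bits - 1)):
        value -= 1 << bits
    return value


def decode_verbatim_sample(data, bit_depth):
    """Return (wasted_bits, stored_value, restored_value) for one sample."""
    pos = 0
    zeros = 0
    while read_bits(data, pos, 1) == 0:  # unary code: zeros, then a 1 bit
        zeros += 1
        pos += 1
    pos += 1                             # skip the terminating 1 bit
    wasted = zeros + 1                   # the unary value is wasted bits - 1
    stored_bits = bit_depth - wasted
    stored = to_signed(read_bits(data, pos, stored_bits), stored_bits)
    return wasted, stored, stored << wasted


# Bytes 0x32..0x33 of example 1: 0b01, 0b011000, 0xfd
print(decode_verbatim_sample(bytes([0b01011000, 0xFD]), 16))
# (2, 6397, 25588)
```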
| The second subframe starts at 0x34, and is broken down in the | The second subframe starts at 0x34 and is broken down in the | |||
| following table. | following table. | |||
| +========+=========+==============+=========================+ | +========+=========+==============+=========================+ | |||
| | Start | Length | Contents | Description | | | Start | Length | Contents | Description | | |||
| +========+=========+==============+=========================+ | +========+=========+==============+=========================+ | |||
| | 0x34+0 | 1 bit | 0b0 | mandatory 0 bit | | | 0x34+0 | 1 bit | 0b0 | Mandatory 0 bit | | |||
| +--------+---------+--------------+-------------------------+ | +--------+---------+--------------+-------------------------+ | |||
| | 0x34+1 | 6 bits | 0b000001 | verbatim subframe | | | 0x34+1 | 6 bits | 0b000001 | Verbatim subframe | | |||
| +--------+---------+--------------+-------------------------+ | +--------+---------+--------------+-------------------------+ | |||
| | 0x34+7 | 1 bit | 0b1 | wasted bits used | | | 0x34+7 | 1 bit | 0b1 | Wasted bits used | | |||
| +--------+---------+--------------+-------------------------+ | +--------+---------+--------------+-------------------------+ | |||
| | 0x35+0 | 4 bits | 0b0001 | 4 wasted bits used | | | 0x35+0 | 4 bits | 0b0001 | 4 wasted bits used | | |||
| +--------+---------+--------------+-------------------------+ | +--------+---------+--------------+-------------------------+ | |||
| | 0x35+4 | 12 bits | 0b0010, 0x8b | 12-bit unencoded sample | | | 0x35+4 | 12 bits | 0b0010, 0x8b | 12-bit unencoded sample | | |||
| +--------+---------+--------------+-------------------------+ | +--------+---------+--------------+-------------------------+ | |||
| Table 31 | Table 31 | |||
| Here the wasted bits flag is also one, but the unary coded number | The wasted bits flag is also one, but the unary-coded number that | |||
| that follows it is 4 bit long, indicating the use of 4 wasted bits. | follows it is 4 bits long, indicating the use of 4 wasted bits. This | |||
| This means the sample is stored in 12 bits. The sample value is | means the sample is stored in 12 bits. The sample value is 0b0010 | |||
| 0b0010 10001011, or 651. This value now has to be shifted left by 4 | 10001011, or 651. This value now has to be shifted left by 4 bits, | |||
| bits, i.e., 0b0010 10001011 0000 or 10416. | i.e., 0b0010 10001011 0000, or 10416. | |||
| At this point, we would undo stereo decorrelation if that was | At this point, we would undo stereo decorrelation if that was | |||
| applicable. | applicable. | |||
| As the last subframe ends byte-aligned, no padding bits follow it. | As the last subframe ends byte-aligned, no padding bits follow it. | |||
| The next 2 bytes, starting at 0x38, contain the frame CRC. As this | The next 2 bytes, starting at 0x38, contain the frame CRC. As this | |||
| is the only frame in the file, the file ends with the CRC. | is the only frame in the file, the file ends with the CRC. | |||
| To validate the MD5 checksum, we line up the samples interleaved, | To validate the MD5 checksum, we line up the samples interleaved, | |||
| byte-aligned, little endian, signed two's complement. The first | byte-aligned, little-endian, signed two's complement. The first | |||
| sample, with value 25588, translates to 0xf463, the second sample, | sample, with value 25588, translates to 0xf463, and the second | |||
| with value 10416, translates to 0xb028. When computing the MD5 | sample, with value 10416, translates to 0xb028. When computing the | |||
| checksum with 0xf463b028 as input, we get the MD5 checksum found in | MD5 checksum with 0xf463b028 as input, we get the MD5 checksum found | |||
| the header, so decoding was lossless. | in the header, so decoding was lossless. | |||
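
As an illustration of how the MD5 input is assembled, the following Python sketch packs the two decoded samples as 16-bit little-endian signed integers and hashes them. The function name `md5_input` is an assumption made for this sketch; the resulting digest is meant to be compared with the 16-byte MD5 field of the streaminfo metadata block.

```python
import hashlib


def md5_input(samples, bytes_per_sample=2):
    """Serialize interleaved samples as little-endian two's complement."""
    return b"".join(
        s.to_bytes(bytes_per_sample, "little", signed=True) for s in samples
    )


pcm = md5_input([25588, 10416])  # first channel, then second channel
print(pcm.hex())                 # f463b028
digest = hashlib.md5(pcm).hexdigest()
print(digest)                    # compare with the MD5 in the streaminfo block
```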
| D.2. Decoding example 2 | D.2. Decoding Example 2 | |||
| This FLAC file is larger than the first example, but still contains | This FLAC file is larger than the first example, but still contains | |||
| very little audio. The focus of this example is on decoding a | very little audio. The focus of this example is on decoding a | |||
| subframe with a fixed predictor and a coded residual, but it also | subframe with a fixed predictor and a coded residual, but it also | |||
| contains a very short seektable, a Vorbis comment metadata block, and | contains a very short seek table, a Vorbis comment metadata block, | |||
| a padding metadata block. | and a padding metadata block. | |||
| D.2.1. Example file 2 in hexadecimal representation | D.2.1. Example File 2 in Hexadecimal Representation | |||
| 00000000: 664c 6143 0000 0022 0010 0010 fLaC...".... | 00000000: 664c 6143 0000 0022 0010 0010 fLaC...".... | |||
| 0000000c: 0000 1700 0044 0ac4 42f0 0000 .....D..B... | 0000000c: 0000 1700 0044 0ac4 42f0 0000 .....D..B... | |||
| 00000018: 0013 d5b0 5649 75e9 8b8d 8b93 ....VIu..... | 00000018: 0013 d5b0 5649 75e9 8b8d 8b93 ....VIu..... | |||
| 00000024: 0422 757b 8103 0300 0012 0000 ."u{........ | 00000024: 0422 757b 8103 0300 0012 0000 ."u{........ | |||
| 00000030: 0000 0000 0000 0000 0000 0000 ............ | 00000030: 0000 0000 0000 0000 0000 0000 ............ | |||
| 0000003c: 0000 0010 0400 003a 2000 0000 .......: ... | 0000003c: 0000 0010 0400 003a 2000 0000 .......: ... | |||
| 00000048: 7265 6665 7265 6e63 6520 6c69 reference li | 00000048: 7265 6665 7265 6e63 6520 6c69 reference li | |||
| 00000054: 6246 4c41 4320 312e 332e 3320 bFLAC 1.3.3 | 00000054: 6246 4c41 4320 312e 332e 3320 bFLAC 1.3.3 | |||
| 00000060: 3230 3139 3038 3034 0100 0000 20190804.... | 00000060: 3230 3139 3038 3034 0100 0000 20190804.... | |||
| 0000006c: 0e00 0000 5449 544c 453d d7a9 ....TITLE=.. | 0000006c: 0e00 0000 5449 544c 453d d7a9 ....TITLE=.. | |||
| 00000078: d79c d795 d79d 8100 0006 0000 ............ | 00000078: d79c d795 d79d 8100 0006 0000 ............ | |||
| 00000084: 0000 0000 fff8 6998 000f 9912 ......i..... | 00000084: 0000 0000 fff8 6998 000f 9912 ......i..... | |||
| 00000090: 0867 0162 3d14 4299 8f5d f70d .g.b=.B..].. | 00000090: 0867 0162 3d14 4299 8f5d f70d .g.b=.B..].. | |||
| 0000009c: 6fe0 0c17 caeb 2100 0ee7 a77a o.....!....z | 0000009c: 6fe0 0c17 caeb 2100 0ee7 a77a o.....!....z | |||
| 000000a8: 24a1 590c 1217 b603 097b 784f $.Y......{xO | 000000a8: 24a1 590c 1217 b603 097b 784f $.Y......{xO | |||
| 000000b4: aa9a 33d2 85e0 70ad 5b1b 4851 ..3...p.[.HQ | 000000b4: aa9a 33d2 85e0 70ad 5b1b 4851 ..3...p.[.HQ | |||
| 000000c0: b401 0d99 d2cd 1a68 f1e6 b810 .......h.... | 000000c0: b401 0d99 d2cd 1a68 f1e6 b810 .......h.... | |||
| 000000cc: fff8 6918 0102 a402 c382 c40b ..i......... | 000000cc: fff8 6918 0102 a402 c382 c40b ..i......... | |||
| 000000d8: c14a 03ee 48dd 03b6 7c13 30 .J..H...|.0 | 000000d8: c14a 03ee 48dd 03b6 7c13 30 .J..H...|.0 | |||
| D.2.2. Example file 2 in binary representation (only audio frames) | D.2.2. Example File 2 in Binary Representation (Only Audio Frames) | |||
| 00000088: 11111111 11111000 01101001 10011000 ..i. | 00000088: 11111111 11111000 01101001 10011000 ..i. | |||
| 0000008c: 00000000 00001111 10011001 00010010 .... | 0000008c: 00000000 00001111 10011001 00010010 .... | |||
| 00000090: 00001000 01100111 00000001 01100010 .g.b | 00000090: 00001000 01100111 00000001 01100010 .g.b | |||
| 00000094: 00111101 00010100 01000010 10011001 =.B. | 00000094: 00111101 00010100 01000010 10011001 =.B. | |||
| 00000098: 10001111 01011101 11110111 00001101 .].. | 00000098: 10001111 01011101 11110111 00001101 .].. | |||
| 0000009c: 01101111 11100000 00001100 00010111 o... | 0000009c: 01101111 11100000 00001100 00010111 o... | |||
| 000000a0: 11001010 11101011 00100001 00000000 ..!. | 000000a0: 11001010 11101011 00100001 00000000 ..!. | |||
| 000000a4: 00001110 11100111 10100111 01111010 ...z | 000000a4: 00001110 11100111 10100111 01111010 ...z | |||
| 000000a8: 00100100 10100001 01011001 00001100 $.Y. | 000000a8: 00100100 10100001 01011001 00001100 $.Y. | |||
| skipping to change at page 78, line 5 ¶ | skipping to change at line 3413 ¶ | |||
| 000000c0: 10110100 00000001 00001101 10011001 .... | 000000c0: 10110100 00000001 00001101 10011001 .... | |||
| 000000c4: 11010010 11001101 00011010 01101000 ...h | 000000c4: 11010010 11001101 00011010 01101000 ...h | |||
| 000000c8: 11110001 11100110 10111000 00010000 .... | 000000c8: 11110001 11100110 10111000 00010000 .... | |||
| 000000cc: 11111111 11111000 01101001 00011000 ..i. | 000000cc: 11111111 11111000 01101001 00011000 ..i. | |||
| 000000d0: 00000001 00000010 10100100 00000010 .... | 000000d0: 00000001 00000010 10100100 00000010 .... | |||
| 000000d4: 11000011 10000010 11000100 00001011 .... | 000000d4: 11000011 10000010 11000100 00001011 .... | |||
| 000000d8: 11000001 01001010 00000011 11101110 .J.. | 000000d8: 11000001 01001010 00000011 11101110 .J.. | |||
| 000000dc: 01001000 11011101 00000011 10110110 H... | 000000dc: 01001000 11011101 00000011 10110110 H... | |||
| 000000e0: 01111100 00010011 00110000 |.0 | 000000e0: 01111100 00010011 00110000 |.0 | |||
| D.2.3. Streaminfo metadata block | D.2.3. Streaminfo Metadata Block | |||
| Most of the streaminfo block, including its header, is the same as in | Most of the streaminfo metadata block, including its header, is the | |||
| example 1, so only parts that are different are listed in the | same as in example 1, so only parts that are different are listed in | |||
| following table. | the following table. | |||
| +========+=========+============+=============================+ | +========+=========+============+=============================+ | |||
| | Start | Length | Contents | Description | | | Start | Length | Contents | Description | | |||
| +========+=========+============+=============================+ | +========+=========+============+=============================+ | |||
| | 0x04+0 | 1 bit | 0b0 | Not the last metadata block | | | 0x04+0 | 1 bit | 0b0 | Not the last metadata block | | |||
| +--------+---------+------------+-----------------------------+ | +--------+---------+------------+-----------------------------+ | |||
| | 0x08+0 | 2 bytes | 0x0010 | Min. block size 16 | | | 0x08+0 | 2 bytes | 0x0010 | Min. block size 16 | | |||
| +--------+---------+------------+-----------------------------+ | +--------+---------+------------+-----------------------------+ | |||
| | 0x0a+0 | 2 bytes | 0x0010 | Max. block size 16 | | | 0x0a+0 | 2 bytes | 0x0010 | Max. block size 16 | | |||
| +--------+---------+------------+-----------------------------+ | +--------+---------+------------+-----------------------------+ | |||
| | 0x0c+0 | 3 bytes | 0x000017 | Min. frame size 23 byte | | | 0x0c+0 | 3 bytes | 0x000017 | Min. frame size 23 bytes | | |||
| +--------+---------+------------+-----------------------------+ | +--------+---------+------------+-----------------------------+ | |||
| | 0x0f+0 | 3 bytes | 0x000044 | Max. frame size 68 byte | | | 0x0f+0 | 3 bytes | 0x000044 | Max. frame size 68 bytes | | |||
| +--------+---------+------------+-----------------------------+ | +--------+---------+------------+-----------------------------+ | |||
| | 0x15+4 | 36 bits | 0b0000, | Total no. of samples 19 | | | 0x15+4 | 36 bits | 0b0000, | Total no. of samples 19 | | |||
| | | | 0x00000013 | | | | | | 0x00000013 | | | |||
| +--------+---------+------------+-----------------------------+ | +--------+---------+------------+-----------------------------+ | |||
| | 0x1a | 16 | (...) | MD5 checksum | | | 0x1a | 16 | (...) | MD5 checksum | | |||
| | | bytes | | | | | | bytes | | | | |||
| +--------+---------+------------+-----------------------------+ | +--------+---------+------------+-----------------------------+ | |||
| Table 32 | Table 32 | |||
| This time, the minimum and maximum block sizes are reflected in the | This time, the minimum and maximum block sizes are reflected in the | |||
| file: there is one block of 16 samples, the last block (which has 3 | file: there is one block of 16 samples, and the last block (which has | |||
| samples) is not considered for the minimum block size. The MD5 | 3 samples) is not considered for the minimum block size. The MD5 | |||
| checksum is 0xd5b0 5649 75e9 8b8d 8b93 0422 757b 8103, this will be | checksum is 0xd5b0 5649 75e9 8b8d 8b93 0422 757b 8103. This will be | |||
| verified at the end of this example. | verified at the end of this example. | |||
| D.2.4. Seektable | D.2.4. Seek Table | |||
| The seektable metadata block only holds one entry. It is not really | The seek table metadata block only holds one entry. It is not really | |||
| useful here, as it points to the first frame, but it is enough for | useful here, as it points to the first frame, but it is enough for | |||
| this example. The seektable metadata block is broken down in the | this example. The seek table metadata block is broken down in the | |||
| following table. | following table. | |||
| +========+========+====================+================+ | +========+========+====================+================+ | |||
| | Start | Length | Contents | Description | | | Start | Length | Contents | Description | | |||
| +========+========+====================+================+ | +========+========+====================+================+ | |||
| | 0x2a+0 | 1 bit | 0b0 | Not the last | | | 0x2a+0 | 1 bit | 0b0 | Not the last | | |||
| | | | | metadata block | | | | | | metadata block | | |||
| +--------+--------+--------------------+----------------+ | +--------+--------+--------------------+----------------+ | |||
| | 0x2a+1 | 7 bits | 0b0000011 | Seektable | | | 0x2a+1 | 7 bits | 0b0000011 | Seek table | | |||
| | | | | metadata block | | | | | | metadata block | | |||
| +--------+--------+--------------------+----------------+ | +--------+--------+--------------------+----------------+ | |||
| | 0x2b+0 | 3 | 0x000012 | Length 18 byte | | | 0x2b+0 | 3 | 0x000012 | Length 18 | | |||
| | | bytes | | | | | | bytes | | bytes | | |||
| +--------+--------+--------------------+----------------+ | +--------+--------+--------------------+----------------+ | |||
| | 0x2e+0 | 8 | 0x0000000000000000 | Seekpoint to | | | 0x2e+0 | 8 | 0x0000000000000000 | Seek point to | | |||
| | | bytes | | sample 0 | | | | bytes | | sample 0 | | |||
| +--------+--------+--------------------+----------------+ | +--------+--------+--------------------+----------------+ | |||
| | 0x36+0 | 8 | 0x0000000000000000 | Seekpoint to | | | 0x36+0 | 8 | 0x0000000000000000 | Seek point to | | |||
| | | bytes | | offset 0 | | | | bytes | | offset 0 | | |||
| +--------+--------+--------------------+----------------+ | +--------+--------+--------------------+----------------+ | |||
| | 0x3e+0 | 2 | 0x0010 | Seekpoint to | | | 0x3e+0 | 2 | 0x0010 | Seek point to | | |||
| | | bytes | | block size 16 | | | | bytes | | block size 16 | | |||
| +--------+--------+--------------------+----------------+ | +--------+--------+--------------------+----------------+ | |||
| Table 33 | Table 33 | |||
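
The 18-byte seek point listed in Table 33 consists of three big-endian unsigned fields. A minimal Python sketch of unpacking it follows; `parse_seek_point` is a hypothetical helper name.

```python
import struct


def parse_seek_point(data):
    """Unpack one seek point: sample number, byte offset, block size."""
    # 8-byte sample number, 8-byte offset, 2-byte block size, all big-endian
    sample_number, offset, block_size = struct.unpack(">QQH", data)
    return sample_number, offset, block_size


# Bytes 0x2e..0x3f of example file 2
seek_point = bytes(8) + bytes(8) + bytes([0x00, 0x10])
print(parse_seek_point(seek_point))  # (0, 0, 16)
```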
| D.2.5. Vorbis comment | D.2.5. Vorbis Comment | |||
| The Vorbis comment metadata block contains the vendor string and a | The Vorbis comment metadata block contains the vendor string and a | |||
| single comment. It is broken down in the following table. | single comment. It is broken down in the following table. | |||
| +========+==========+============+===============================+ | +========+==========+============+===============================+ | |||
| | Start | Length | Contents | Description | | | Start | Length | Contents | Description | | |||
| +========+==========+============+===============================+ | +========+==========+============+===============================+ | |||
| | 0x40+0 | 1 bit | 0b0 | Not the last metadata block | | | 0x40+0 | 1 bit | 0b0 | Not the last metadata block | | |||
| +--------+----------+------------+-------------------------------+ | +--------+----------+------------+-------------------------------+ | |||
| | 0x40+1 | 7 bits | 0b0000100 | Vorbis comment metadata block | | | 0x40+1 | 7 bits | 0b0000100 | Vorbis comment metadata block | | |||
| +--------+----------+------------+-------------------------------+ | +--------+----------+------------+-------------------------------+ | |||
| | 0x41+0 | 3 bytes | 0x00003a | Length 58 byte | | | 0x41+0 | 3 bytes | 0x00003a | Length 58 bytes | | |||
| +--------+----------+------------+-------------------------------+ | +--------+----------+------------+-------------------------------+ | |||
| | 0x44+0 | 4 bytes | 0x20000000 | Vendor string length 32 byte | | | 0x44+0 | 4 bytes | 0x20000000 | Vendor string length 32 bytes | | |||
| +--------+----------+------------+-------------------------------+ | +--------+----------+------------+-------------------------------+ | |||
| | 0x48+0 | 32 bytes | (...) | Vendor string | | | 0x48+0 | 32 bytes | (...) | Vendor string | | |||
| +--------+----------+------------+-------------------------------+ | +--------+----------+------------+-------------------------------+ | |||
| | 0x68+0 | 4 bytes | 0x01000000 | Number of fields 1 | | | 0x68+0 | 4 bytes | 0x01000000 | Number of fields 1 | | |||
| +--------+----------+------------+-------------------------------+ | +--------+----------+------------+-------------------------------+ | |||
| | 0x6c+0 | 4 bytes | 0x0e000000 | Field length 14 byte | | | 0x6c+0 | 4 bytes | 0x0e000000 | Field length 14 bytes | | |||
| +--------+----------+------------+-------------------------------+ | +--------+----------+------------+-------------------------------+ | |||
| | 0x70+0 | 14 bytes | (...) | Field contents | | | 0x70+0 | 14 bytes | (...) | Field contents | | |||
| +--------+----------+------------+-------------------------------+ | +--------+----------+------------+-------------------------------+ | |||
| Table 34 | Table 34 | |||
| The vendor string is reference libFLAC 1.3.3 20190804, and the field | The vendor string is reference libFLAC 1.3.3 20190804, and the field | |||
| contents of the only field is TITLE=שלום. The Vorbis comment field is | contents of the only field is | |||
| 14 bytes but only 10 characters in size, because it contains four | ||||
| 2-byte characters. | TITLE=שלום | |||
| where in direction of reading, the sequence of characters forming the | ||||
| field content is as follows: "ש" (HEBREW LETTER SHIN, U+05E9), "ל" | ||||
| (HEBREW LETTER LAMED, U+05DC), "ו" (HEBREW LETTER VAV, U+05D5), "ם" | ||||
| (HEBREW LETTER FINAL MEM, U+05DD). | ||||
| The Vorbis comment field is 14 bytes but only 10 characters in size, | ||||
| because it contains four 2-byte characters. | ||||
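
The difference between byte length and character count can be checked with a short Python sketch. It reads the little-endian field length at 0x6c and decodes the field contents as UTF-8; the variable names are illustrative only.

```python
import struct

# Bytes 0x6c..0x7d of example file 2: field length, then field contents
raw = bytes.fromhex("0e000000" "5449544c453d" "d7a9d79cd795d79d")

# Lengths inside a Vorbis comment block are 32-bit little-endian.
(field_length,) = struct.unpack_from("<I", raw, 0)
field = raw[4:4 + field_length].decode("utf-8")

print(field_length)  # 14 (bytes)
print(len(field))    # 10 (characters)
# Code points of the four characters after "TITLE="
print([f"U+{ord(c):04X}" for c in field[6:]])
# ['U+05E9', 'U+05DC', 'U+05D5', 'U+05DD']
```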
| D.2.6. Padding | D.2.6. Padding | |||
| The last metadata block is a (very short) padding block. | The last metadata block is a (very short) padding block. | |||
| +========+=========+================+========================+ | +========+=========+================+========================+ | |||
| | Start | Length | Contents | Description | | | Start | Length | Contents | Description | | |||
| +========+=========+================+========================+ | +========+=========+================+========================+ | |||
| | 0x7e+0 | 1 bit | 0b1 | Last metadata block | | | 0x7e+0 | 1 bit | 0b1 | Last metadata block | | |||
| +--------+---------+----------------+------------------------+ | +--------+---------+----------------+------------------------+ | |||
| | 0x7e+1 | 7 bits | 0b0000001 | Padding metadata block | | | 0x7e+1 | 7 bits | 0b0000001 | Padding metadata block | | |||
| +--------+---------+----------------+------------------------+ | +--------+---------+----------------+------------------------+ | |||
| | 0x7f+0 | 3 bytes | 0x000006 | Length 6 byte | | | 0x7f+0 | 3 bytes | 0x000006 | Length 6 byte | | |||
| +--------+---------+----------------+------------------------+ | +--------+---------+----------------+------------------------+ | |||
| | 0x82+0 | 6 bytes | 0x000000000000 | Padding bytes | | | 0x82+0 | 6 bytes | 0x000000000000 | Padding bytes | | |||
| +--------+---------+----------------+------------------------+ | +--------+---------+----------------+------------------------+ | |||
| Table 35 | Table 35 | |||
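
Each metadata block in this example starts with the same 4-byte header: a last-block flag, a 7-bit block type, and a 24-bit big-endian length. The following Python sketch decodes the padding block header at 0x7e; the function name `parse_metadata_block_header` is an assumption for illustration.

```python
def parse_metadata_block_header(header):
    """Return (is_last, block_type, length) from a 4-byte block header."""
    is_last = bool(header[0] & 0x80)   # top bit: last metadata block flag
    block_type = header[0] & 0x7F      # remaining 7 bits: block type
    length = int.from_bytes(header[1:4], "big")
    return is_last, block_type, length


# Bytes 0x7e..0x81 of example file 2
print(parse_metadata_block_header(bytes.fromhex("81000006")))
# (True, 1, 6)
```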
| D.2.7. First audio frame | D.2.7. First Audio Frame | |||
| The frame header starts at position 0x88 and is broken down in the | The frame header starts at position 0x88 and is broken down in the | |||
| following table. | following table. | |||
| +========+=========+=================+===================+ | +========+=========+=================+===================+ | |||
| | Start | Length | Contents | Description | | | Start | Length | Contents | Description | | |||
| +========+=========+=================+===================+ | +========+=========+=================+===================+ | |||
| | 0x88+0 | 15 bits | 0xff, 0b1111100 | frame sync | | | 0x88+0 | 15 bits | 0xff, 0b1111100 | Frame sync | | |||
| +--------+---------+-----------------+-------------------+ | +--------+---------+-----------------+-------------------+ | |||
| | 0x89+7 | 1 bit | 0b0 | blocking strategy | | | 0x89+7 | 1 bit | 0b0 | Blocking strategy | | |||
| +--------+---------+-----------------+-------------------+ | +--------+---------+-----------------+-------------------+ | |||
| | 0x8a+0 | 4 bits | 0b0110 | 8-bit block size | | | 0x8a+0 | 4 bits | 0b0110 | 8-bit block size | | |||
| | | | | further down | | | | | | further down | | |||
| +--------+---------+-----------------+-------------------+ | +--------+---------+-----------------+-------------------+ | |||
| | 0x8a+4 | 4 bits | 0b1001 | sample rate 44.1 | | | 0x8a+4 | 4 bits | 0b1001 | Sample rate 44.1 | | |||
| | | | | kHz | | | | | | kHz | | |||
| +--------+---------+-----------------+-------------------+ | +--------+---------+-----------------+-------------------+ | |||
| | 0x8b+0 | 4 bits | 0b1001 | side-right stereo | | | 0x8b+0 | 4 bits | 0b1001 | Side-right stereo | | |||
| +--------+---------+-----------------+-------------------+ | +--------+---------+-----------------+-------------------+ | |||
| | 0x8b+4 | 3 bits | 0b100 | bit depth 16 bit | | | 0x8b+4 | 3 bits | 0b100 | Bit depth 16 bit | | |||
| +--------+---------+-----------------+-------------------+ | +--------+---------+-----------------+-------------------+ | |||
| | 0x8b+7 | 1 bit | 0b0 | mandatory 0 bit | | | 0x8b+7 | 1 bit | 0b0 | Mandatory 0 bit | | |||
| +--------+---------+-----------------+-------------------+ | +--------+---------+-----------------+-------------------+ | |||
| | 0x8c+0 | 1 byte | 0x00 | frame number 0 | | | 0x8c+0 | 1 byte | 0x00 | Frame number 0 | | |||
| +--------+---------+-----------------+-------------------+ | +--------+---------+-----------------+-------------------+ | |||
| | 0x8d+0 | 1 byte | 0x0f | block size 16 | | | 0x8d+0 | 1 byte | 0x0f | Block size 16 | | |||
| +--------+---------+-----------------+-------------------+ | +--------+---------+-----------------+-------------------+ | |||
| | 0x8e+0 | 1 byte | 0x99 | frame header CRC | | | 0x8e+0 | 1 byte | 0x99 | Frame header CRC | | |||
| +--------+---------+-----------------+-------------------+ | +--------+---------+-----------------+-------------------+ | |||
| Table 36 | Table 36 | |||
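
The frame header CRC in Table 36 can be reproduced with a CRC-8 that uses the polynomial x^8 + x^2 + x + 1 (0x07), is initialized with 0, and covers the header bytes that precede the CRC byte. The following bitwise Python sketch is written for clarity rather than speed, and the function name `crc8` is illustrative.

```python
def crc8(data):
    """CRC-8, polynomial 0x07, initial value 0, most significant bit first."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ 0x07) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


# Bytes 0x88..0x8d of example file 2 (frame header without its CRC byte)
header = bytes.fromhex("fff86998000f")
print(hex(crc8(header)))  # 0x99, matching the byte at 0x8e
```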
| The first subframe starts at byte 0x8f, it is broken down in the | The first subframe starts at byte 0x8f, and it is broken down in the | |||
| following table excluding the coded residual. As this subframe codes | following table, excluding the coded residual. As this subframe | |||
| for a side channel, the bit depth is increased by 1 bit from 16 bit | codes for a side channel, the bit depth is increased by 1 bit from 16 | |||
| to 17 bit. This is most clearly present in the unencoded warm-up | bits to 17 bits. This is most clearly present in the unencoded warm- | |||
| sample. | up sample. | |||
| +========+=========+=============+===========================+ | +========+=========+=============+===========================+ | |||
| | Start | Length | Contents | Description | | | Start | Length | Contents | Description | | |||
| +========+=========+=============+===========================+ | +========+=========+=============+===========================+ | |||
| | 0x8f+0 | 1 bit | 0b0 | mandatory 0 bit | | | 0x8f+0 | 1 bit | 0b0 | Mandatory 0 bit | | |||
| +--------+---------+-------------+---------------------------+ | +--------+---------+-------------+---------------------------+ | |||
| | 0x8f+1 | 6 bits | 0b001001 | fixed subframe, 1st order | | | 0x8f+1 | 6 bits | 0b001001 | Fixed subframe, 1st order | | |||
| +--------+---------+-------------+---------------------------+ | +--------+---------+-------------+---------------------------+ | |||
| | 0x8f+7 | 1 bit | 0b0 | no wasted bits used | | | 0x8f+7 | 1 bit | 0b0 | No wasted bits used | | |||
| +--------+---------+-------------+---------------------------+ | +--------+---------+-------------+---------------------------+ | |||
| | 0x90+0 | 17 bits | 0x0867, 0b0 | unencoded warm-up sample | | | 0x90+0 | 17 bits | 0x0867, 0b0 | Unencoded warm-up sample | | |||
| +--------+---------+-------------+---------------------------+ | +--------+---------+-------------+---------------------------+ | |||
| Table 37 | Table 37 | |||
| The coded residual is broken down in the following table. All | The coded residual is broken down in the following table. All | |||
| quotients are unary coded, all remainders are stored unencoded with a | quotients are unary coded, and all remainders are stored unencoded | |||
| number of bits specified by the Rice parameter. | with a number of bits specified by the Rice parameter. | |||
| +========+========+=================+=================+ | +========+========+=================+=================+ | |||
| | Start | Length | Contents | Description | | | Start | Length | Contents | Description | | |||
| +========+========+=================+=================+ | +========+========+=================+=================+ | |||
| | 0x92+1 | 2 bits | 0b00 | Rice code with | | | 0x92+1 | 2 bits | 0b00 | Rice code with | | |||
| | | | | 4-bit parameter | | | | | | 4-bit parameter | | |||
| +--------+--------+-----------------+-----------------+ | +--------+--------+-----------------+-----------------+ | |||
| | 0x92+3 | 4 bits | 0b0000 | Partition order | | | 0x92+3 | 4 bits | 0b0000 | Partition order | | |||
| | | | | 0 | | | | | | 0 | | |||
| +--------+--------+-----------------+-----------------+ | +--------+--------+-----------------+-----------------+ | |||
| skipping to change at page 84, line 20 ¶ | skipping to change at line 3687 ¶ | |||
| +--------+--------+-----------------+-----------------+ | +--------+--------+-----------------+-----------------+ | |||
| | 0xaa+5 | 11 | 0b00100001100 | Remainder 268 | | | 0xaa+5 | 11 | 0b00100001100 | Remainder 268 | | |||
| | | bits | | | | | | bits | | | | |||
| +--------+--------+-----------------+-----------------+ | +--------+--------+-----------------+-----------------+ | |||
| Table 38 | Table 38 | |||
| At this point, the decoder should know it is done decoding the coded | At this point, the decoder should know it is done decoding the coded | |||
| residual, as it received 16 samples: 1 warm-up sample and 15 residual | residual, as it received 16 samples: 1 warm-up sample and 15 residual | |||
| samples. Each residual sample can be calculated from the quotient | samples. Each residual sample can be calculated from the quotient | |||
| and remainder, and undoing the zig-zag encoding. For example, the | and remainder and from undoing the zigzag encoding. For example, the | |||
| value of the first zig-zag encoded residual sample is 3 * 2^11 + 244 | value of the first zigzag-encoded residual sample is 3 * 2^11 + 244 = | |||
| = 6388. As this is an even number, the zig-zag encoding is undone by | 6388. As this is an even number, the zigzag encoding is undone by | |||
| dividing by 2, the residual sample value is 3194. This is done for | dividing by 2; the residual sample value is 3194. This is done for | |||
| all residual samples in the next table. | all residual samples in the next table. | |||
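
The calculation described above can be sketched in Python as follows, assuming the Rice parameter of 11 used in this partition. The function name `residual_from_rice` is hypothetical.

```python
def residual_from_rice(quotient, remainder, rice_parameter):
    """Combine quotient and remainder, then undo the zigzag encoding."""
    folded = (quotient << rice_parameter) | remainder
    if folded & 1:
        # Odd values map to negative residuals: 1 -> -1, 3 -> -2, ...
        return -((folded + 1) >> 1)
    # Even values map to non-negative residuals: 0 -> 0, 2 -> 1, ...
    return folded >> 1


print(residual_from_rice(3, 244, 11))  # 3194
print(residual_from_rice(1, 545, 11))  # -1297
```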
| +==========+===========+=================+=======================+ | +==========+===========+================+=======================+ | |||
| | Quotient | Remainder | Zig-zag encoded | Residual sample value | | | Quotient | Remainder | Zigzag Encoded | Residual Sample Value | | |||
| +==========+===========+=================+=======================+ | +==========+===========+================+=======================+ | |||
| | 3 | 244 | 6388 | 3194 | | | 3 | 244 | 6388 | 3194 | | |||
| +----------+-----------+-----------------+-----------------------+ | +----------+-----------+----------------+-----------------------+ | |||
| | 1 | 545 | 2593 | -1297 | | | 1 | 545 | 2593 | -1297 | | |||
| +----------+-----------+-----------------+-----------------------+ | +----------+-----------+----------------+-----------------------+ | |||
| | 1 | 408 | 2456 | 1228 | | | 1 | 408 | 2456 | 1228 | | |||
| +----------+-----------+-----------------+-----------------------+ | +----------+-----------+----------------+-----------------------+ | |||
| | 0 | 1885 | 1885 | -943 | | | 0 | 1885 | 1885 | -943 | | |||
| +----------+-----------+-----------------+-----------------------+ | +----------+-----------+----------------+-----------------------+ | |||
| | 0 | 1904 | 1904 | 952 | | | 0 | 1904 | 1904 | 952 | | |||
| +----------+-----------+-----------------+-----------------------+ | +----------+-----------+----------------+-----------------------+ | |||
| | 0 | 1391 | 1391 | -696 | | | 0 | 1391 | 1391 | -696 | | |||
| +----------+-----------+-----------------+-----------------------+ | +----------+-----------+----------------+-----------------------+ | |||
| | 0 | 1536 | 1536 | 768 | | | 0 | 1536 | 1536 | 768 | | |||
| +----------+-----------+-----------------+-----------------------+ | +----------+-----------+----------------+-----------------------+ | |||
| | 0 | 1047 | 1047 | -524 | | | 0 | 1047 | 1047 | -524 | | |||
| +----------+-----------+-----------------+-----------------------+ | +----------+-----------+----------------+-----------------------+ | |||
| | 0 | 1198 | 1198 | 599 | | | 0 | 1198 | 1198 | 599 | | |||
| +----------+-----------+-----------------+-----------------------+ | +----------+-----------+----------------+-----------------------+ | |||
| | 0 | 801 | 801 | -401 | | | 0 | 801 | 801 | -401 | | |||
| +----------+-----------+-----------------+-----------------------+ | +----------+-----------+----------------+-----------------------+ | |||
| | 12 | 1767 | 26343 | -13172 | | | 12 | 1767 | 26343 | -13172 | | |||
| +----------+-----------+-----------------+-----------------------+ | +----------+-----------+----------------+-----------------------+ | |||
| | 0 | 631 | 631 | -316 | | | 0 | 631 | 631 | -316 | | |||
| +----------+-----------+-----------------+-----------------------+ | +----------+-----------+----------------+-----------------------+ | |||
| | 0 | 548 | 548 | 274 | | | 0 | 548 | 548 | 274 | | |||
| +----------+-----------+-----------------+-----------------------+ | +----------+-----------+----------------+-----------------------+ | |||
| | 0 | 533 | 533 | -267 | | | 0 | 533 | 533 | -267 | | |||
| +----------+-----------+-----------------+-----------------------+ | +----------+-----------+----------------+-----------------------+ | |||
| | 0 | 268 | 268 | 134 | | | 0 | 268 | 268 | 134 | | |||
| +----------+-----------+-----------------+-----------------------+ | +----------+-----------+----------------+-----------------------+ | |||
| Table 39 | Table 39 | |||
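The reconstruction used for Table 39 can be sketched as follows. This is a minimal illustration rather than the reference decoder: the function name `rice_to_residual` is hypothetical, and the Rice parameter of 11 is taken from the `3 * 2^11 + 244` calculation in the text above.

```python
def rice_to_residual(quotient: int, remainder: int, rice_parameter: int) -> int:
    """Rebuild one residual from its Rice-coded quotient and remainder."""
    # Recombine the unary-coded quotient and the binary remainder.
    folded = (quotient << rice_parameter) + remainder
    # Undo the zigzag encoding: even values map to non-negative residuals,
    # odd values map to negative residuals.
    if folded % 2 == 0:
        return folded // 2
    return -((folded + 1) // 2)


# First, second, and eleventh rows of Table 39 (Rice parameter 11).
print(rice_to_residual(3, 244, 11))
print(rice_to_residual(1, 545, 11))
print(rice_to_residual(12, 1767, 11))
```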
| It can be calculated that using a Rice code is, in this case, more | In this case, using a Rice code is more efficient than storing values | |||
| efficient than storing values unencoded. The Rice code (excluding | unencoded. The Rice code (excluding the partition order and | |||
| the partition order and parameter) is 199 bits in length. The | parameter) is 199 bits in length. The largest residual value | |||
| largest residual value (-13172) would need 15 bits to be stored | (-13172) would need 15 bits to be stored unencoded, so storing all 15 | |||
| unencoded, so storing all 15 samples with 15 bits results in a | samples with 15 bits results in a sequence with a length of 225 bits. | |||
| sequence with a length of 225 bits. | ||||
| The next step is using the predictor and the residuals to restore the | The next step is using the predictor and the residuals to restore the | |||
| sample values. As this subframe uses a fixed predictor with order 1, | sample values. As this subframe uses a fixed predictor with order 1, | |||
| this means adding the residual value to the value of the previous | the residual value is added to the value of the previous sample. | |||
| sample. | ||||
| +===========+==============+ | +===========+==============+ | |||
| | Residual | Sample value | | | Residual | Sample Value | | |||
| +===========+==============+ | +===========+==============+ | |||
| | (warm-up) | 4302 | | | (warm-up) | 4302 | | |||
| +-----------+--------------+ | +-----------+--------------+ | |||
| | 3194 | 7496 | | | 3194 | 7496 | | |||
| +-----------+--------------+ | +-----------+--------------+ | |||
| | -1297 | 6199 | | | -1297 | 6199 | | |||
| +-----------+--------------+ | +-----------+--------------+ | |||
| | 1228 | 7427 | | | 1228 | 7427 | | |||
| +-----------+--------------+ | +-----------+--------------+ | |||
| | -943 | 6484 | | | -943 | 6484 | | |||
| skipping to change at page 86, line 45 ¶ | skipping to change at line 3779 ¶ | |||
| +-----------+--------------+ | +-----------+--------------+ | |||
| | -267 | -6299 | | | -267 | -6299 | | |||
| +-----------+--------------+ | +-----------+--------------+ | |||
| | 134 | -6165 | | | 134 | -6165 | | |||
| +-----------+--------------+ | +-----------+--------------+ | |||
| Table 40 | Table 40 | |||
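For a fixed predictor of order 1, the restoration in Table 40 amounts to a running sum that starts at the warm-up sample. A minimal sketch, assuming the residuals have already been decoded (the function name `restore_fixed_order1` is hypothetical):

```python
def restore_fixed_order1(warm_up: int, residuals: list[int]) -> list[int]:
    """Restore samples for a fixed predictor of order 1."""
    samples = [warm_up]
    for residual in residuals:
        # The prediction is simply the previous sample.
        samples.append(samples[-1] + residual)
    return samples


# Warm-up sample and the first four residuals from Table 40.
print(restore_fixed_order1(4302, [3194, -1297, 1228, -943]))
```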
| With this, the decoding of the first subframe is complete. The | With this, the decoding of the first subframe is complete. The | |||
| decoding of the second subframe is very similar, as it also uses a | decoding of the second subframe is very similar, as it also uses a | |||
| fixed predictor of order 1, so this is left as an exercise for the | fixed predictor of order 1. This is left as an exercise for the | |||
| reader, the results are in the next table. The next step is undoing | reader; the results are in the next table. The next step is undoing | |||
| stereo decorrelation, which is done in the following table. As the | stereo decorrelation, which is done in the following table. As the | |||
| stereo decorrelation is side-right, the samples in the right channel | stereo decorrelation is side-right, the samples in the right channel | |||
| come directly from the second subframe, while the samples in the left | come directly from the second subframe, while the samples in the left | |||
| channel are found by adding the values of both subframes for each | channel are found by adding the values of both subframes for each | |||
| sample. | sample. | |||
| +============+============+========+=======+ | +============+============+========+=======+ | |||
| | Subframe 1 | Subframe 2 | Left | Right | | | Subframe 1 | Subframe 2 | Left | Right | | |||
| +============+============+========+=======+ | +============+============+========+=======+ | |||
| | 4302 | 6070 | 10372 | 6070 | | | 4302 | 6070 | 10372 | 6070 | | |||
| skipping to change at page 87, line 46 ¶ | skipping to change at line 3828 ¶ | |||
| | -6299 | -8896 | -15195 | -8896 | | | -6299 | -8896 | -15195 | -8896 | | |||
| +------------+------------+--------+-------+ | +------------+------------+--------+-------+ | |||
| | -6165 | -8653 | -14818 | -8653 | | | -6165 | -8653 | -14818 | -8653 | | |||
| +------------+------------+--------+-------+ | +------------+------------+--------+-------+ | |||
| Table 41 | Table 41 | |||
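The side-right reconstruction shown in Table 41 can be sketched as below. The function name `undo_side_right` is hypothetical; the first argument is assumed to hold the samples of subframe 1 (side) and the second those of subframe 2 (right).

```python
def undo_side_right(side: list[int], right: list[int]) -> tuple[list[int], list[int]]:
    """Undo side-right stereo decorrelation."""
    # The right channel is stored as is; left = side + right.
    left = [s + r for s, r in zip(side, right)]
    return left, list(right)


# First and last two rows of Table 41.
left, right = undo_side_right([4302, -6299, -6165], [6070, -8896, -8653])
print(left, right)
```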
| As the second subframe ends byte-aligned, no padding bits follow it. | As the second subframe ends byte-aligned, no padding bits follow it. | |||
| Finally, the last 2 bytes of the frame contain the frame CRC. | Finally, the last 2 bytes of the frame contain the frame CRC. | |||
| D.2.8. Second audio frame | D.2.8. Second Audio Frame | |||
| The second audio frame is very similar to the frame decoded in the | The second audio frame is very similar to the frame decoded in the | |||
| first example, but this time not 1 but 3 samples are present. | first example, but this time, 3 samples (not 1) are present. | |||
| The frame header starts at position 0xcc and is broken down in the | The frame header starts at position 0xcc and is broken down in the | |||
| following table. | following table. | |||
| +========+=========+=================+===================+ | +========+=========+=================+===================+ | |||
| | Start | Length | Contents | Description | | | Start | Length | Contents | Description | | |||
| +========+=========+=================+===================+ | +========+=========+=================+===================+ | |||
| | 0xcc+0 | 15 bits | 0xff, 0b1111100 | frame sync | | | 0xcc+0 | 15 bits | 0xff, 0b1111100 | Frame sync | | |||
| +--------+---------+-----------------+-------------------+ | +--------+---------+-----------------+-------------------+ | |||
| | 0xcd+7 | 1 bit | 0b0 | blocking strategy | | | 0xcd+7 | 1 bit | 0b0 | Blocking strategy | | |||
| +--------+---------+-----------------+-------------------+ | +--------+---------+-----------------+-------------------+ | |||
| | 0xce+0 | 4 bits | 0b0110 | 8-bit block size | | | 0xce+0 | 4 bits | 0b0110 | 8-bit block size | | |||
| | | | | further down | | | | | | further down | | |||
| +--------+---------+-----------------+-------------------+ | +--------+---------+-----------------+-------------------+ | |||
| | 0xce+4 | 4 bits | 0b1001 | sample rate 44.1 | | | 0xce+4 | 4 bits | 0b1001 | Sample rate 44.1 | | |||
| | | | | kHz | | | | | | kHz | | |||
| +--------+---------+-----------------+-------------------+ | +--------+---------+-----------------+-------------------+ | |||
| | 0xcf+0 | 4 bits | 0b0001 | stereo, no | | | 0xcf+0 | 4 bits | 0b0001 | Stereo, no | | |||
| | | | | decorrelation | | | | | | decorrelation | | |||
| +--------+---------+-----------------+-------------------+ | +--------+---------+-----------------+-------------------+ | |||
| | 0xcf+4 | 3 bits | 0b100 | bit depth 16 bit | | | 0xcf+4 | 3 bits | 0b100 | Bit depth 16 bits | | |||
| +--------+---------+-----------------+-------------------+ | +--------+---------+-----------------+-------------------+ | |||
| | 0xcf+7 | 1 bit | 0b0 | mandatory 0 bit | | | 0xcf+7 | 1 bit | 0b0 | Mandatory 0 bit | | |||
| +--------+---------+-----------------+-------------------+ | +--------+---------+-----------------+-------------------+ | |||
| | 0xd0+0 | 1 byte | 0x01 | frame number 1 | | | 0xd0+0 | 1 byte | 0x01 | Frame number 1 | | |||
| +--------+---------+-----------------+-------------------+ | +--------+---------+-----------------+-------------------+ | |||
| | 0xd1+0 | 1 byte | 0x02 | block size 3 | | | 0xd1+0 | 1 byte | 0x02 | Block size 3 | | |||
| +--------+---------+-----------------+-------------------+ | +--------+---------+-----------------+-------------------+ | |||
| | 0xd2+0 | 1 byte | 0xa4 | frame header CRC | | | 0xd2+0 | 1 byte | 0xa4 | Frame header CRC | | |||
| +--------+---------+-----------------+-------------------+ | +--------+---------+-----------------+-------------------+ | |||
| Table 42 | Table 42 | |||
| The first subframe starts at 0xd3+0 and is broken down in the | The first subframe starts at 0xd3+0 and is broken down in the | |||
| following table. | following table. | |||
| +========+=========+==========+=========================+ | +========+=========+==========+=========================+ | |||
| | Start | Length | Contents | Description | | | Start | Length | Contents | Description | | |||
| +========+=========+==========+=========================+ | +========+=========+==========+=========================+ | |||
| | 0xd3+0 | 1 bit | 0b0 | mandatory 0 bit | | | 0xd3+0 | 1 bit | 0b0 | Mandatory 0 bit | | |||
| +--------+---------+----------+-------------------------+ | +--------+---------+----------+-------------------------+ | |||
| | 0xd3+1 | 6 bits | 0b000001 | verbatim subframe | | | 0xd3+1 | 6 bits | 0b000001 | Verbatim subframe | | |||
| +--------+---------+----------+-------------------------+ | +--------+---------+----------+-------------------------+ | |||
| | 0xd3+7 | 1 bit | 0b0 | no wasted bits used | | | 0xd3+7 | 1 bit | 0b0 | No wasted bits used | | |||
| +--------+---------+----------+-------------------------+ | +--------+---------+----------+-------------------------+ | |||
| | 0xd4+0 | 16 bits | 0xc382 | 16-bit unencoded sample | | | 0xd4+0 | 16 bits | 0xc382 | 16-bit unencoded sample | | |||
| +--------+---------+----------+-------------------------+ | +--------+---------+----------+-------------------------+ | |||
| | 0xd6+0 | 16 bits | 0xc40b | 16-bit unencoded sample | | | 0xd6+0 | 16 bits | 0xc40b | 16-bit unencoded sample | | |||
| +--------+---------+----------+-------------------------+ | +--------+---------+----------+-------------------------+ | |||
| | 0xd8+0 | 16 bits | 0xc14a | 16-bit unencoded sample | | | 0xd8+0 | 16 bits | 0xc14a | 16-bit unencoded sample | | |||
| +--------+---------+----------+-------------------------+ | +--------+---------+----------+-------------------------+ | |||
| Table 43 | Table 43 | |||
| The second subframe starts at 0xda+0 and is broken down in the | The second subframe starts at 0xda+0 and is broken down in the | |||
| following table. | following table. | |||
| +========+=========+===================+=========================+ | +========+=========+===================+=========================+ | |||
| | Start | Length | Contents | Description | | | Start | Length | Contents | Description | | |||
| +========+=========+===================+=========================+ | +========+=========+===================+=========================+ | |||
| | 0xda+0 | 1 bit | 0b0 | mandatory 0 bit | | | 0xda+0 | 1 bit | 0b0 | Mandatory 0 bit | | |||
| +--------+---------+-------------------+-------------------------+ | +--------+---------+-------------------+-------------------------+ | |||
| | 0xda+1 | 6 bits | 0b000001 | verbatim subframe | | | 0xda+1 | 6 bits | 0b000001 | Verbatim subframe | | |||
| +--------+---------+-------------------+-------------------------+ | +--------+---------+-------------------+-------------------------+ | |||
| | 0xda+7 | 1 bit | 0b1 | wasted bits used | | | 0xda+7 | 1 bit | 0b1 | Wasted bits used | | |||
| +--------+---------+-------------------+-------------------------+ | +--------+---------+-------------------+-------------------------+ | |||
| | 0xdb+0 | 1 bit | 0b1 | 1 wasted bit used | | | 0xdb+0 | 1 bit | 0b1 | 1 wasted bit used | | |||
| +--------+---------+-------------------+-------------------------+ | +--------+---------+-------------------+-------------------------+ | |||
| | 0xdb+1 | 15 bits | 0b110111001001000 | 15-bit unencoded sample | | | 0xdb+1 | 15 bits | 0b110111001001000 | 15-bit unencoded sample | | |||
| +--------+---------+-------------------+-------------------------+ | +--------+---------+-------------------+-------------------------+ | |||
| | 0xdd+0 | 15 bits | 0b110111010000001 | 15-bit unencoded sample | | | 0xdd+0 | 15 bits | 0b110111010000001 | 15-bit unencoded sample | | |||
| +--------+---------+-------------------+-------------------------+ | +--------+---------+-------------------+-------------------------+ | |||
| | 0xde+7 | 15 bits | 0b110110110011111 | 15-bit unencoded sample | | | 0xde+7 | 15 bits | 0b110110110011111 | 15-bit unencoded sample | | |||
| +--------+---------+-------------------+-------------------------+ | +--------+---------+-------------------+-------------------------+ | |||
| Table 44 | Table 44 | |||
| As this subframe uses wasted bits, the 15-bit unencoded samples need | As this subframe uses wasted bits, the 15-bit unencoded samples need | |||
| to be shifted left by 1 bit. For example, sample 1 is stored as | to be shifted left by 1 bit. For example, sample 1 is stored as | |||
| -4536 and becomes -9072 after shifting left 1 bit. | -4536 and becomes -9072 after shifting left 1 bit. | |||
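The wasted-bits handling described above can be sketched as follows. The helper names are hypothetical; the sketch assumes each sample is first read as an unsigned bit field and then interpreted as two's complement.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret an unsigned bit field of the given width as two's complement."""
    sign_bit = 1 << (bits - 1)
    return (value ^ sign_bit) - sign_bit


def restore_wasted_bits(raw: int, stored_bits: int, wasted_bits: int) -> int:
    """Sign-extend a stored sample and shift it back to full bit depth."""
    return sign_extend(raw, stored_bits) << wasted_bits


# The three 15-bit samples of Table 44, with 1 wasted bit.
for raw in (0b110111001001000, 0b110111010000001, 0b110110110011111):
    print(sign_extend(raw, 15), restore_wasted_bits(raw, 15, 1))
```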
| As the last subframe does not end on byte alignment, 2 padding bits | As the last subframe does not end on byte alignment, 2 padding bits | |||
| are added before the 2 byte frame CRC follows at 0xe1+0. | are added before the 2-byte frame CRC, which follows at 0xe1+0. | |||
| D.2.9. MD5 checksum verification | D.2.9. MD5 Checksum Verification | |||
| All samples in the file have been decoded, we can now verify the MD5 | All samples in the file have been decoded, and we can now verify the | |||
| checksum. All sample values must be interleaved and stored signed, | MD5 checksum. All sample values must be interleaved and stored | |||
| coded little-endian. The result of this follows in groups of 12 | signed coded little-endian. The result of this follows in groups of | |||
| samples (i.e., 6 interchannel samples) per line. | 12 samples (i.e., 6 interchannel samples) per line. | |||
| 0x8428 B617 7946 3129 5E3A 2722 D445 D128 0B3D B723 EB45 DF28 | 0x8428 B617 7946 3129 5E3A 2722 D445 D128 0B3D B723 EB45 DF28 | |||
| 0x723f 1E25 9D46 4929 B841 7026 5747 B829 8F43 8127 AEC7 14DF | 0x723f 1E25 9D46 4929 B841 7026 5747 B829 8F43 8127 AEC7 14DF | |||
| 0x9FC4 41DD 54C7 E4DE A5C4 40DD 1EC6 33DE 82C3 90DC 0BC4 02DD | 0x9FC4 41DD 54C7 E4DE A5C4 40DD 1EC6 33DE 82C3 90DC 0BC4 02DD | |||
| 0x4AC1 3EDB | 0x4AC1 3EDB | |||
| The MD5 checksum of this is indeed the same as the one found in the | The MD5 checksum of this is indeed the same as the one found in the | |||
| streaminfo metadata block. | streaminfo metadata block. | |||
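The construction of the MD5 input can be sketched with Python's standard library. The function names are hypothetical, and the sketch assumes every channel holds the same number of decoded samples and that the bit depth is a whole number of bytes.

```python
import hashlib


def md5_input(channels: list[list[int]], bytes_per_sample: int = 2) -> bytes:
    """Interleave the channels and store each sample signed, little-endian."""
    out = bytearray()
    for interchannel_sample in zip(*channels):
        for sample in interchannel_sample:
            out += sample.to_bytes(bytes_per_sample, "little", signed=True)
    return bytes(out)


def md5_of_samples(channels: list[list[int]], bytes_per_sample: int = 2) -> str:
    """Return the hexadecimal MD5 digest of the interleaved samples."""
    return hashlib.md5(md5_input(channels, bytes_per_sample)).hexdigest()


# First interchannel sample of the first frame (left 10372, right 6070).
print(md5_input([[10372], [6070]]).hex())
```

Feeding all decoded samples of both frames through `md5_of_samples` and comparing the digest with the value stored in the streaminfo metadata block would complete the verification.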
| D.3. Decoding example 3 | D.3. Decoding Example 3 | |||
| This example is once again a very short FLAC file. The focus of this | This example is once again a very short FLAC file. The focus of this | |||
| example is on decoding a subframe with a linear predictor and a coded | example is on decoding a subframe with a linear predictor and a coded | |||
| residual with more than one partition. | residual with more than one partition. | |||
| D.3.1. Example file 3 in hexadecimal representation | D.3.1. Example File 3 in Hexadecimal Representation | |||
| 00000000: 664c 6143 8000 0022 1000 1000 fLaC...".... | 00000000: 664c 6143 8000 0022 1000 1000 fLaC...".... | |||
| 0000000c: 0000 1f00 001f 07d0 0070 0000 .........p.. | 0000000c: 0000 1f00 001f 07d0 0070 0000 .........p.. | |||
| 00000018: 0018 f8f9 e396 f5cb cfc6 dc80 ............ | 00000018: 0018 f8f9 e396 f5cb cfc6 dc80 ............ | |||
| 00000024: 7f99 7790 6b32 fff8 6802 0017 ..w.k2..h... | 00000024: 7f99 7790 6b32 fff8 6802 0017 ..w.k2..h... | |||
| 00000030: e944 004f 6f31 3d10 47d2 27cb .D.Oo1=.G.'. | 00000030: e944 004f 6f31 3d10 47d2 27cb .D.Oo1=.G.'. | |||
| 0000003c: 6d09 0831 452b dc28 2222 8057 m..1E+.("".W | 0000003c: 6d09 0831 452b dc28 2222 8057 m..1E+.("".W | |||
| 00000048: a3 . | 00000048: a3 . | |||
| D.3.2. Example file 3 in binary representation (only audio frame) | D.3.2. Example File 3 in Binary Representation (Only Audio Frame) | |||
| 0000002a: 11111111 11111000 01101000 00000010 ..h. | 0000002a: 11111111 11111000 01101000 00000010 ..h. | |||
| 0000002e: 00000000 00010111 11101001 01000100 ...D | 0000002e: 00000000 00010111 11101001 01000100 ...D | |||
| 00000032: 00000000 01001111 01101111 00110001 .Oo1 | 00000032: 00000000 01001111 01101111 00110001 .Oo1 | |||
| 00000036: 00111101 00010000 01000111 11010010 =.G. | 00000036: 00111101 00010000 01000111 11010010 =.G. | |||
| 0000003a: 00100111 11001011 01101101 00001001 '.m. | 0000003a: 00100111 11001011 01101101 00001001 '.m. | |||
| 0000003e: 00001000 00110001 01000101 00101011 .1E+ | 0000003e: 00001000 00110001 01000101 00101011 .1E+ | |||
| 00000042: 11011100 00101000 00100010 00100010 .("" | 00000042: 11011100 00101000 00100010 00100010 .("" | |||
| 00000046: 10000000 01010111 10100011 .W. | 00000046: 10000000 01010111 10100011 .W. | |||
| D.3.3. Streaminfo metadata block | D.3.3. Streaminfo Metadata Block | |||
| Most of the streaminfo metadata block, including its header, is the | Most of the streaminfo metadata block, including its header, is the | |||
| same as in example 1, so only parts that are different are listed in | same as in example 1, so only parts that are different are listed in | |||
| the following table. | the following table. | |||
| +========+==========+====================+=========================+ | +========+==========+====================+==========================+ | |||
| | Start | Length | Contents | Description | | | Start | Length | Contents | Description | | |||
| +========+==========+====================+=========================+ | +========+==========+====================+==========================+ | |||
| | 0x0c+0 | 3 bytes | 0x00001f | Min. frame size 31 byte | | | 0x0c+0 | 3 bytes | 0x00001f | Min. frame size 31 bytes | | |||
| +--------+----------+--------------------+-------------------------+ | +--------+----------+--------------------+--------------------------+ | |||
| | 0x0f+0 | 3 bytes | 0x00001f | Max. frame size 31 byte | | | 0x0f+0 | 3 bytes | 0x00001f | Max. frame size 31 bytes | | |||
| +--------+----------+--------------------+-------------------------+ | +--------+----------+--------------------+--------------------------+ | |||
| | 0x12+0 | 20 bits | 0x07d0, 0x0000 | Sample rate 32000 hertz | | | 0x12+0 | 20 bits | 0x07d0, 0x0000 | Sample rate 32000 hertz | | |||
| +--------+----------+--------------------+-------------------------+ | +--------+----------+--------------------+--------------------------+ | |||
| | 0x14+4 | 3 bits | 0b000 | 1 channel | | | 0x14+4 | 3 bits | 0b000 | 1 channel | | |||
| +--------+----------+--------------------+-------------------------+ | +--------+----------+--------------------+--------------------------+ | |||
| | 0x14+7 | 5 bits | 0b00111 | Sample bit depth 8 bit | | | 0x14+7 | 5 bits | 0b00111 | Sample bit depth 8 bits | | |||
| +--------+----------+--------------------+-------------------------+ | +--------+----------+--------------------+--------------------------+ | |||
| | 0x15+4 | 36 bits | 0b0000, 0x00000018 | Total no. of samples 24 | | | 0x15+4 | 36 bits | 0b0000, 0x00000018 | Total no. of samples 24 | | |||
| +--------+----------+--------------------+-------------------------+ | +--------+----------+--------------------+--------------------------+ | |||
| | 0x1a | 16 bytes | (...) | MD5 checksum | | | 0x1a | 16 | (...) | MD5 checksum | | |||
| +--------+----------+--------------------+-------------------------+ | | | bytes | | | | |||
| +--------+----------+--------------------+--------------------------+ | ||||
| Table 45 | Table 45 | |||
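The bit fields of Table 45 that are not byte-aligned can be unpacked as sketched below. The function name `parse_streaminfo_tail` is hypothetical; the input is assumed to be the 8 bytes starting at 0x12, which hold the sample rate, channel count, bit depth, and total number of samples.

```python
def parse_streaminfo_tail(data: bytes) -> dict:
    """Unpack the 64 bits that follow the frame size fields in streaminfo."""
    if len(data) != 8:
        raise ValueError("expected exactly 8 bytes")
    bits = int.from_bytes(data, "big")
    return {
        "sample_rate": bits >> 44,                       # 20 bits
        "channels": ((bits >> 41) & 0x07) + 1,           # 3 bits, stored minus 1
        "bits_per_sample": ((bits >> 36) & 0x1F) + 1,    # 5 bits, stored minus 1
        "total_samples": bits & ((1 << 36) - 1),         # 36 bits
    }


# Bytes 0x12 through 0x19 of example file 3.
print(parse_streaminfo_tail(bytes.fromhex("07d0007000000018")))
```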
| D.3.4. Audio frame | D.3.4. Audio Frame | |||
| The frame header starts at position 0x2a and is broken down in the | The frame header starts at position 0x2a and is broken down in the | |||
| following table. | following table. | |||
| +========+=========+=================+===================+ | +========+=========+=================+===================+ | |||
| | Start | Length | Contents | Description | | | Start | Length | Contents | Description | | |||
| +========+=========+=================+===================+ | +========+=========+=================+===================+ | |||
| | 0x2a+0 | 15 bits | 0xff, 0b1111100 | Frame sync | | | 0x2a+0 | 15 bits | 0xff, 0b1111100 | Frame sync | | |||
| +--------+---------+-----------------+-------------------+ | +--------+---------+-----------------+-------------------+ | |||
| | 0x2b+7 | 1 bit | 0b0 | blocking strategy | | | 0x2b+7 | 1 bit | 0b0 | blocking strategy | | |||
| +--------+---------+-----------------+-------------------+ | +--------+---------+-----------------+-------------------+ | |||
| | 0x2c+0 | 4 bits | 0b0110 | 8-bit block size | | | 0x2c+0 | 4 bits | 0b0110 | 8-bit block size | | |||
| | | | | further down | | | | | | further down | | |||
| +--------+---------+-----------------+-------------------+ | +--------+---------+-----------------+-------------------+ | |||
| | 0x2c+4 | 4 bits | 0b1000 | Sample rate 32 | | | 0x2c+4 | 4 bits | 0b1000 | Sample rate 32 | | |||
| | | | | kHz | | | | | | kHz | | |||
| +--------+---------+-----------------+-------------------+ | +--------+---------+-----------------+-------------------+ | |||
| | 0x2d+0 | 4 bits | 0b0000 | Mono audio (1 | | | 0x2d+0 | 4 bits | 0b0000 | Mono audio (1 | | |||
| | | | | channel) | | | | | | channel) | | |||
| +--------+---------+-----------------+-------------------+ | +--------+---------+-----------------+-------------------+ | |||
| | 0x2d+4 | 3 bits | 0b001 | Bit depth 8 bit | | | 0x2d+4 | 3 bits | 0b001 | Bit depth 8 bits | | |||
| +--------+---------+-----------------+-------------------+ | +--------+---------+-----------------+-------------------+ | |||
| | 0x2d+7 | 1 bit | 0b0 | Mandatory 0 bit | | | 0x2d+7 | 1 bit | 0b0 | Mandatory 0 bit | | |||
| +--------+---------+-----------------+-------------------+ | +--------+---------+-----------------+-------------------+ | |||
| | 0x2e+0 | 1 byte | 0x00 | Frame number 0 | | | 0x2e+0 | 1 byte | 0x00 | Frame number 0 | | |||
| +--------+---------+-----------------+-------------------+ | +--------+---------+-----------------+-------------------+ | |||
| | 0x2f+0 | 1 byte | 0x17 | Block size 24 | | | 0x2f+0 | 1 byte | 0x17 | Block size 24 | | |||
| +--------+---------+-----------------+-------------------+ | +--------+---------+-----------------+-------------------+ | |||
| | 0x30+0 | 1 byte | 0xe9 | Frame header CRC | | | 0x30+0 | 1 byte | 0xe9 | Frame header CRC | | |||
| +--------+---------+-----------------+-------------------+ | +--------+---------+-----------------+-------------------+ | |||
| Table 46 | Table 46 | |||
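The field layout of Tables 42 and 46 can be sketched as a small parser. This is a simplified illustration under stated assumptions, not a general frame header parser: the function name is hypothetical, it only accepts headers whose block size code is 0b0110 (8-bit block size minus 1 stored further down) and whose coded frame number fits in a single byte, and it does not verify the CRC.

```python
def parse_simple_frame_header(data: bytes) -> dict:
    """Split a 7-byte frame header of the shape used in examples 2 and 3."""
    if len(data) < 7:
        raise ValueError("expected at least 7 bytes")
    sync = (data[0] << 7) | (data[1] >> 1)               # 15-bit sync code
    if sync != 0b111111111111100:
        raise ValueError("frame sync code not found")
    if (data[2] >> 4) != 0b0110 or data[4] & 0x80:
        raise ValueError("header shape not covered by this sketch")
    return {
        "blocking_strategy": data[1] & 0x01,
        "sample_rate_code": data[2] & 0x0F,
        "channel_code": data[3] >> 4,
        "bit_depth_code": (data[3] >> 1) & 0x07,
        "reserved_bit": data[3] & 0x01,
        "frame_number": data[4],
        "block_size": data[5] + 1,                       # stored minus 1
        "header_crc": data[6],
    }


# Frame header of example 3, starting at 0x2a.
print(parse_simple_frame_header(bytes.fromhex("fff868020017e9")))
```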
| The first and only subframe starts at byte 0x31, it is broken down in | The first and only subframe starts at byte 0x31. It is broken down | |||
| the following table, without the coded residual. | in the following table, without the coded residual. | |||
| +========+========+==========+=====================+ | +========+========+==========+=====================+ | |||
| | Start | Length | Contents | Description | | | Start | Length | Contents | Description | | |||
| +========+========+==========+=====================+ | +========+========+==========+=====================+ | |||
| | 0x31+0 | 1 bit | 0b0 | Mandatory 0 bit | | | 0x31+0 | 1 bit | 0b0 | Mandatory 0 bit | | |||
| +--------+--------+----------+---------------------+ | +--------+--------+----------+---------------------+ | |||
| | 0x31+1 | 6 bits | 0b100010 | Linear prediction | | | 0x31+1 | 6 bits | 0b100010 | Linear prediction | | |||
| | | | | subframe, 3rd order | | | | | | subframe, 3rd order | | |||
| +--------+--------+----------+---------------------+ | +--------+--------+----------+---------------------+ | |||
| | 0x31+7 | 1 bit | 0b0 | No wasted bits used | | | 0x31+7 | 1 bit | 0b0 | No wasted bits used | | |||
| skipping to change at page 94, line 50 ¶ | skipping to change at line 4116 ¶ | |||
| | | bits | | | | | | bits | | | | |||
| +--------+--------+----------+--------------------------------------+ | +--------+--------+----------+--------------------------------------+ | |||
| | 0x42+7 | 4 bits | 0b0001 | Rice parameter 1 | | | 0x42+7 | 4 bits | 0b0001 | Rice parameter 1 | | |||
| +--------+--------+----------+--------------------------------------+ | +--------+--------+----------+--------------------------------------+ | |||
| | 0x43+3 | 23 | (...) | Residual partition 4 | | | 0x43+3 | 23 | (...) | Residual partition 4 | | |||
| | | bits | | | | | | bits | | | | |||
| +--------+--------+----------+--------------------------------------+ | +--------+--------+----------+--------------------------------------+ | |||
| Table 48 | Table 48 | |||
| The frame ends with 6 padding bits and a 2 byte frame CRC | The frame ends with 6 padding bits and a 2-byte frame CRC. | |||
| To decode this subframe, 21 predictions have to be calculated and | To decode this subframe, 21 predictions have to be calculated and | |||
| added to their corresponding residuals. This is a sequential | added to their corresponding residuals. This is a sequential | |||
| process: as each prediction uses previous samples, it is not possible | process: as each prediction uses previous samples, it is not possible | |||
| to start this decoding halfway a subframe or decode a subframe with | to start this decoding halfway through a subframe or decode a | |||
| parallel threads. | subframe with parallel threads. | |||
| The following table breaks down the calculation for each sample. For | The following table breaks down the calculation for each sample. For | |||
| example, the predictor without shift value of row 4 is found by | example, the predictor without shift value of row 4 is found by | |||
| applying the predictor with the three warm-up samples: 7*111 - 6*79 + | applying the predictor with the three warm-up samples: 7*111 - 6*79 + | |||
| 2*0 = 303. This value is then shifted right by 2 bits: 303 >> 2 = | 2*0 = 303. This value is then shifted right by 2 bits: 303 >> 2 = | |||
| 75. Then, the decoded residual sample is added: 75 + 3 = 78. | 75. Then, the decoded residual sample is added: 75 + 3 = 78. | |||
| +===========+=====================+===========+==============+ | +===========+=====================+===========+==============+ | |||
| | Residual | Predictor w/o shift | Predictor | Sample value | | | Residual | Predictor w/o Shift | Predictor | Sample Value | | |||
| +===========+=====================+===========+==============+ | +===========+=====================+===========+==============+ | |||
| | (warm-up) | N/A | N/A | 0 | | | (warm-up) | N/A | N/A | 0 | | |||
| +-----------+---------------------+-----------+--------------+ | +-----------+---------------------+-----------+--------------+ | |||
| | (warm-up) | N/A | N/A | 79 | | | (warm-up) | N/A | N/A | 79 | | |||
| +-----------+---------------------+-----------+--------------+ | +-----------+---------------------+-----------+--------------+ | |||
| | (warm-up) | N/A | N/A | 111 | | | (warm-up) | N/A | N/A | 111 | | |||
| +-----------+---------------------+-----------+--------------+ | +-----------+---------------------+-----------+--------------+ | |||
| | 3 | 303 | 75 | 78 | | | 3 | 303 | 75 | 78 | | |||
| +-----------+---------------------+-----------+--------------+ | +-----------+---------------------+-----------+--------------+ | |||
| | -1 | 38 | 9 | 8 | | | -1 | 38 | 9 | 8 | | |||
| skipping to change at page 96, line 22 ¶ | skipping to change at line 4184 ¶ | |||
| +-----------+---------------------+-----------+--------------+ | +-----------+---------------------+-----------+--------------+ | |||
| | 2 | -24 | -6 | -4 | | | 2 | -24 | -6 | -4 | | |||
| +-----------+---------------------+-----------+--------------+ | +-----------+---------------------+-----------+--------------+ | |||
| | 2 | -26 | -7 | -5 | | | 2 | -26 | -7 | -5 | | |||
| +-----------+---------------------+-----------+--------------+ | +-----------+---------------------+-----------+--------------+ | |||
| | 0 | 1 | 0 | 0 | | | 0 | 1 | 0 | 0 | | |||
| +-----------+---------------------+-----------+--------------+ | +-----------+---------------------+-----------+--------------+ | |||
| Table 49 | Table 49 | |||
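The per-sample calculation behind Table 49 can be sketched as follows. The function name `decode_lpc` is hypothetical; the coefficients 7, -6, and 2 and the shift of 2 are taken from the worked calculation above, with the first coefficient applied to the most recent sample. Python's `>>` on negative integers rounds toward negative infinity, which matches the table (for example, -26 >> 2 = -7).

```python
def decode_lpc(warm_up: list[int], coefficients: list[int],
               shift: int, residuals: list[int]) -> list[int]:
    """Restore samples of a linear prediction subframe."""
    samples = list(warm_up)
    for residual in residuals:
        # coefficients[0] applies to the most recent sample,
        # coefficients[1] to the one before it, and so on.
        unshifted = sum(c * samples[-1 - i] for i, c in enumerate(coefficients))
        samples.append((unshifted >> shift) + residual)
    return samples


# Warm-up samples and the first two residuals from Table 49.
print(decode_lpc([0, 79, 111], [7, -6, 2], 2, [3, -1]))
```

Because each prediction depends on the samples restored before it, the loop is inherently sequential, as noted above.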
| By lining all these samples up, we get the following input for the | By lining up all these samples, we get the following input for the | |||
| MD5 checksum calculation process. | MD5 checksum calculation process: | |||
| 0x004F 6F4E 08C3 A6BC F32A 4335 0DE5 D2DA F40E 1813 06FC FB00 | 0x004F 6F4E 08C3 A6BC F32A 4335 0DE5 D2DA F40E 1813 06FC FB00 | |||
| Which indeed results in the MD5 checksum found in the streaminfo | This indeed results in the MD5 checksum found in the streaminfo | |||
| metadata block. | metadata block. | |||
| Acknowledgments | ||||
| FLAC owes much to the many people who have advanced the audio | ||||
| compression field so freely. For instance: | ||||
| * Tony Robinson: He worked on Shorten, and his paper (see | ||||
| [Robinson-TR156]) is a good starting point on some of the basic | ||||
| methods used by FLAC. FLAC trivially extends and improves the | ||||
| fixed predictors, LPC coefficient quantization, and Rice coding | ||||
| used in Shorten. | ||||
| * Solomon W. Golomb and Robert F. Rice: Their universal codes are | ||||
| used by FLAC's entropy coder. See [Rice]. | ||||
| * Norman Levinson and James Durbin: The FLAC reference encoder uses | ||||
| an algorithm developed and refined by them for determining the LPC | ||||
| coefficients from the autocorrelation coefficients. See | ||||
| [Durbin]. | ||||
| * Claude Shannon: See [Shannon]. | ||||
| The FLAC format, the FLAC reference implementation | ||||
| [FLAC-implementation], and the initial draft version of this document | ||||
| were originally developed by Josh Coalson. While many others have | ||||
| contributed since, this original effort is deeply appreciated. | ||||
| Authors' Addresses | Authors' Addresses | |||
| Martijn van Beurden | Martijn van Beurden | |||
| Netherlands | Netherlands | |||
| Email: mvanb1@gmail.com | Email: mvanb1@gmail.com | |||
| Andrew Weaver | Andrew Weaver | |||
| Email: theandrewjw@gmail.com | Email: theandrewjw@gmail.com | |||