1. 09 12月, 2020 29 次提交
    • A
      avcodec/rv10: Reduce number of exceptions when reading VLC value · df3269f5
      Andreas Rheinhardt 提交于
      RealVideo 1.0 uses an insane way to encode DC coefficients: There are
      several symbols that (for no good reason whatsoever) have multiple
      encodings, leading to longer codes than necessary.
      
      More specifically, the tree for the 256 luma symbols contains 255 codes
      belonging to 255 different symbols on the left; going further right,
      the tree consists of two blocks of 128 codes each of length 14 encoding
      consecutive numbers (including two encodings for the symbol missing among
      the 255 codes on the left); this is followed by two blocks of codes of
      length 16 each containing 256 elements with consecutive symbols (i.e.
      each of the blocks allows to encode all symbols). The rest of the tree
      consists of 2^11 codes that all encode the same symbol.
      
      The tree for the 256 chroma symbols is similar, but is missing the
      blocks of length 256 and there are only 2^9 consecutive codes that
      encode the same symbol; furthermore, the chroma tree is incomplete:
      The right-most node has no right child.
      
      All of this caused problems when parsing these codes; the reason is that
      the code for this predates commit b613bacc
      which added support for explicit symbol tables and thereby removed the
      requirement that different codes have different symbols. In order to
      address this, the trees used for parsing were incomplete: They contained
      the 255 codes on the left and one code for the remaining symbol. Whenever
      a code not in these trees was encountered, it was dealt with in
      special cases (one for each of the blocks mentioned above).
      
      This commit reduces the number of special cases: Using a symbols table
      allows to treat the blocks of consecutive symbols like ordinary codes;
      only the blocks encoding a single symbol are still treated specially
      (in order not to waste memory on tables for them).
      
      In order to not increment the size of the tables used to initialize the
      VLCs both the symbols as well as the lengths are now run-length encoded.
      Signed-off-by: NAndreas Rheinhardt <andreas.rheinhardt@gmail.com>
      df3269f5
    • A
      avcodec/rv10: Reduce the size of the tables used to initialize VLCs · e730b155
      Andreas Rheinhardt 提交于
      This can be achieved by switching to ff_init_vlc_from_lengths() which
      allows to replace two uint16_t tables for codes with uint8_t tables for
      the symbols by permuting the tables so that the codes are ordered from
      left to right in the tree in which case they can be easily computed from
      the lengths at runtime.
      
      And after doing so, it became apparent that the tables for the symbols
      are actually the same for luma and chroma, so that one can even omit one
      of them.
      Signed-off-by: NAndreas Rheinhardt <andreas.rheinhardt@gmail.com>
      e730b155
    • A
      7eef70a0
    • A
      avcodec/cook: Avoid big length tables for VLC initialization · 3ff0179b
      Andreas Rheinhardt 提交于
      Permuting the tables used to initialize the Cook VLCs so that the code
      tables are ordered from left to right in the tree revealed that the
      length of the codes are ascending from left to right. Therefore one can
      run-length encode them to avoid the big length tables; this saves a bit
      more than 1KB.
      Signed-off-by: NAndreas Rheinhardt <andreas.rheinhardt@gmail.com>
      3ff0179b
    • A
    • A
      avcodec/cook: Make tables to initialize VLCs smaller · 530d86b9
      Andreas Rheinhardt 提交于
      Up until now, the Cook decoder used tables for the lengths of codes and
      tables of the codes itself to initialize VLCs; the tables for the codes
      were of type uint16_t because the codes were so long. It did not use
      explicit symbol tables. This commit instead reorders the tables so that
      the code tables are sorted from left to right in the tree. Then the
      codes can be easily derived from the lengths and therefore be omitted.
      This comes at the price of explicitly coding the symbols, but this is
      nevertheless a net win because most of the symbols tables can be coded
      on one byte. Furthermore, Cook actually does not use a contiguous range
      of symbols for its main VLC tables and the old code compensated for that
      by adding holes (codes of length zero) to the tables (that are skipped by
      ff_init_vlc_sparse()). This is no longer necessary with the new
      approach. All in all, this saves about 1.7KB.
      Signed-off-by: NAndreas Rheinhardt <andreas.rheinhardt@gmail.com>
      530d86b9
    • A
      8887232c
    • A
    • A
      avcodec/wnv1: Make array for initializing VLC smaller · a8646a7b
      Andreas Rheinhardt 提交于
      This is possible by switching to ff_init_vlc_from_lengths() which allows
      to replace the table for the codes (which need an uint16_t) by a table
      of symbols which fit into an uint8_t. Also switch to an ordinary
      INIT_VLC macro while just at it.
      Signed-off-by: NAndreas Rheinhardt <andreas.rheinhardt@gmail.com>
      a8646a7b
    • A
      f86dcd00
    • A
    • A
    • A
      c880889c
    • A
      avcodec/clearvideo: Improve handling of VLC escape values · ef9652ba
      Andreas Rheinhardt 提交于
      Both the motion vector as well as the bias VLCs have an escape code;
      for the motion vectors, this value depended on the specific VLC table,
      whereas all the bias VLCs used the same value; the escape value has not
      been inlined in the latter case.
      
      But for both kinds of VLCs there are lots of values that are unused for
      all the VLCs of each kind and each of these can be used as common escape
      value, thus allowing to inline the escape value. This commit implements
      this.
      Signed-off-by: NAndreas Rheinhardt <andreas.rheinhardt@gmail.com>
      ef9652ba
    • A
      avcodec/clearvideo: Avoid huge VLC length tables · 1d3ec27b
      Andreas Rheinhardt 提交于
      After the motion vector and bias values tables have been reordered so
      that the codes are ordered from left to right, it emerged that the
      length of these entries are actually ascending for every table.
      Therefore it is possible to encode them in a run-length style and create
      the actual length tables during runtime. This commit implements this.
      Signed-off-by: NAndreas Rheinhardt <andreas.rheinhardt@gmail.com>
      1d3ec27b
    • A
    • A
      avcodec/clearvideo: Avoid code tables for initializing VLCs · c82a559e
      Andreas Rheinhardt 提交于
      The ClearVideo decoder uses VLC tables that are initialized at runtime
      from static length, symbol and codes tables. Yet the code tables can be
      omitted by subjecting all of these tables to the permutation that orders
      the codes from left to right in the tree. After this is done, the codes
      can be easily computed at runtime from the lengths and therefore
      omitted. This saves about 10KB.
      
      Only one minor complication is encountered when doing so: The tree
      corresponding to the AC VLC codes is incomplete; but this can be
      handled by adding an entry with negative length.
      
      Furthermore, there are also VLCs that are only initialized with lengths
      and codes tables with codes of type uint16_t. These have also been
      switched to ff_init_vlc_from_lengths() as this means that one can
      replace the uint16_t codes tables with uint8_t symbols tables.
      Signed-off-by: NAndreas Rheinhardt <andreas.rheinhardt@gmail.com>
      c82a559e
    • A
    • A
      avcodec/imc: Make Huffman tables smaller · 200e8e80
      Andreas Rheinhardt 提交于
      The IMC decoder uses Huffman tables which are created at runtime from
      length tables of type uint8_t and code tables of type uint16_t together
      with an implicit symbols table (i.e. symbol[i] == i). This commit
      changes this: All three tables are subjected to the same permutation to
      order the codes from left to right in the tree; afterwards the codes can
      be omitted because they are easily computable at runtime from the
      lengths, whereas the symbols need to be explicitly coded now. But said
      symbols fit into an uint8_t, so one saves space.
      Signed-off-by: NAndreas Rheinhardt <andreas.rheinhardt@gmail.com>
      200e8e80
    • A
    • A
      avcodec/on2avcdata: Combine tables for codebooks · 2935d29e
      Andreas Rheinhardt 提交于
      Using one big table for the codebook symbols and lengths makes it
      possible to remove the pointers to the individual tables.
      Signed-off-by: NAndreas Rheinhardt <andreas.rheinhardt@gmail.com>
      2935d29e
    • A
      avcodec/on2avc: Use smaller tables for VLCs · 71753c52
      Andreas Rheinhardt 提交于
      The On2 audio decoder uses huge tables to initialize VLC tables. These
      tables (mostly) use symbols tables in addition to the codes tables and
      the lengths tables. This commit makes the codes tables redundant and
      removes them: If all tables are permuted so that the codes are ordered
      from left to right in the Huffman tree, the codes become redundant and
      can be easily calculated at runtime from the lengths
      (via ff_init_vlc_from_lengths()); this also avoids sorting the codes in
      ff_init_vlc_sparse()*.
      
      The symbols tables are always 16bit, the codes tables are 32bit, 16bit
      or (rarely) 8bit, the lengths tables are always 8bit. Even though some
      symbols tables have been used twice (which is no longer possible now
      because different permutations need to be performed on the code tables
      sharing the same symbol table in order to order them from left to right),
      this nevertheless saves about 28KB.
      
      *: If the initializations of the VLCs are repeated 2048 times
      (interleaved with calls to free the VLCs which have not been timed), the
      number of decicycles spent on each round of initializations improves
      from 27669656 to 7356159.
      Signed-off-by: NAndreas Rheinhardt <andreas.rheinhardt@gmail.com>
      71753c52
    • A
      avcodec/smacker: Improve creating Huffman VLC tables · fefe2cbb
      Andreas Rheinhardt 提交于
      The Smacker Huffman tables are already stored in a tree-like structure;
      in particular, they are naturally ordered from left to right in the
      tree and are therefore suitable to be initialized by
      ff_init_vlc_from_lengths() which avoids traversing the data twice in
      order to sort only the codes that are so long that they need into a
      subtable.
      
      This improves performance (and reduces codesize): For the sample from
      ticket #2425 the number of decicycles for parsing and creating the VLCs
      in smka_decode_frame() decreased from 412322 to 359152 (tested with
      10 runs each looping 20 times over the file).
      Signed-off-by: NAndreas Rheinhardt <andreas.rheinhardt@gmail.com>
      fefe2cbb
    • A
      avcodec/cllc: Improve creating VLCs · 09062eec
      Andreas Rheinhardt 提交于
      One can offload the computation of the codes to
      ff_init_vlc_from_lengths(); this also improves performance: The number
      of decicycles for one call to read_code_table() decreased from 198343
      to 148338 with the sample sample-cllc-rgb.avi from the FATE suite; it
      has been looped 100 times and the test repeated ten times to test it
      sufficiently often.
      Signed-off-by: NAndreas Rheinhardt <andreas.rheinhardt@gmail.com>
      09062eec
    • A
      avcodec/tscc2: Make VLC tables static · 4df51441
      Andreas Rheinhardt 提交于
      Also use a named constant for the number of bits of the VLC tables.
      Signed-off-by: NAndreas Rheinhardt <andreas.rheinhardt@gmail.com>
      4df51441
    • A
      avcodec/bitstream: Allow static VLC tables to be bigger than needed · b629deab
      Andreas Rheinhardt 提交于
      Right now the allocated size of the VLC table of a static VLC has to
      exactly match the size actually used for the VLC: If it is not enough,
      abort is called; if it is more than enough, an error message is
      emitted. This is no problem when one wants to initialize an individual
      VLC via one of the INIT_VLC macros as one just hardcodes the needed
      size. Yet it is an obstacle when one wants to initialize several VLCs
      in a loop as one then needs to add an array for the sizes/offsets of
      the VLC tables (unless max_depth of all arrays is one in which case
      the sizes are derivable from the number of bits used).
      
      Yet said size array is not necessary if one disables the warning for too
      big buffers. The reason is that the amount of entries needed for the
      table is of course generated as a byproduct of initializing the VLC.
      To this end a flag that disables the warning has been added.
      So one can proceed as follows:
      
      static VLC vlcs[NUM];
      static VLC_TYPE vlc_table[BUF_SIZE][2];
      
      for (int i = 0, offset = 0; i < NUM; i++) {
          vlcs[i].table           = &vlc_table[offset];
          vlcs[i].table_allocated = BUF_SIZE - offset;
          init_vlc(); /* With INIT_VLC_STATIC_OVERLONG flag */
          offset += vlcs[i].table_size;
      }
      
      Of course, BUF_SIZE should be equal to the number of entries actually
      needed.
      Signed-off-by: NAndreas Rheinhardt <andreas.rheinhardt@gmail.com>
      b629deab
    • A
      avcodec/tscc2: Combine tables for initializing VLCs · e127af3c
      Andreas Rheinhardt 提交于
      Using one big table for the symbols and lengths makes it
      possible to remove the pointers to the individual tables.
      Signed-off-by: NAndreas Rheinhardt <andreas.rheinhardt@gmail.com>
      e127af3c
    • A
      avcodec/tscc2: Reduce the size of the tables used to initialize VLCs · d2e55e3a
      Andreas Rheinhardt 提交于
      After permuting both the codes, lengths and symbols tables so that
      the codes tables are ordered from left to right in the tree, the codes
      tables can be easily computed from the lengths tables at runtime and
      therefore omitted. This saves about 2KB from the binary.
      Signed-off-by: NAndreas Rheinhardt <andreas.rheinhardt@gmail.com>
      d2e55e3a
    • A
      avcodec/bitstream: Add second function to create VLCs · f55da066
      Andreas Rheinhardt 提交于
      When using ff_init_vlc_sparse() to create a VLC, three input tables are
      used: A table for lengths, one for codes and one for symbols; the latter
      one can be omitted, then a default one will be used. These input tables
      will be traversed twice, once to get the long codes (which will be put
      into subtables) and once for the small codes. The long codes are then
      sorted so that entries that should be in the same subtable are
      contiguous.
      
      This commit adds an alternative to ff_init_vlc_sparse():
      ff_init_vlc_from_lengths(). It is based upon the observation that if
      lengths, codes and symbols tables are permuted (in the same way) so that
      the codes are ordered from left to right in the corresponding tree and
      if said tree is complete (i.e. every non-leaf node has two children),
      the codes can be easily computed from the lengths and are therefore
      redundant. This means that if one initializes such a VLC with explicitly
      coded lengths, codes and symbols, the codes can be avoided; and even if
      one has no explicitly coded symbols, it might still be beneficial to
      remove the codes even when one has to add a new symbol table, because
      codes are typically longer than symbols so that the latter often fit
      into a smaller type, saving space.
      
      Furthermore, given that the codes here are by definition ordered from
      left to right, it is unnecessary to sort them again; for the same
      reason, one does not have to traverse the input twice. This function
      proved to be faster than ff_init_vlc_sparse() whenever it has been
      benchmarked.
      
      This function is usable for static tables (they can simply be permuted
      once) as well as in scenarios where the tables are naturally ordered
      from left to right in the tree; the latter e.g. happens with Smacker,
      Theora and several other formats.
      
      In order to make it also usable for (static) tables with incomplete trees,
      negative lengths are used to indicate that there is an open end of a
      certain length.
      
      Finally, ff_init_vlc_from_lengths() has one downside compared to
      ff_init_vlc_sparse(): The latter uses tables that can be reused by
      encoders. Of course, one could calculate the needed table at runtime
      if one so wishes, but it is nevertheless an obstacle.
      Signed-off-by: NAndreas Rheinhardt <andreas.rheinhardt@gmail.com>
      f55da066
  2. 08 12月, 2020 9 次提交
  3. 07 12月, 2020 2 次提交