1. 16 5月, 2022 1 次提交
  2. 14 3月, 2022 3 次提交
  3. 27 10月, 2021 2 次提交
    • Q
      btrfs: determine stripe boundary at bio allocation time in btrfs_submit_compressed_write · 91507240
      Qu Wenruo 提交于
      Currently btrfs_submit_compressed_write() will check
      btrfs_bio_fits_in_stripe() each time a new page is going to be added.
      Even if compressed extent is small, we don't really need to do that for
      every page.
      
      Align the behavior to extent_io.c, by determining the stripe boundary
      when allocating a bio.
      
      Unlike extent_io.c, in compressed.c we don't need to bother things like
      different bio flags, thus no need to re-use bio_ctrl.
      
      Here we just manually introduce new local variable, next_stripe_start,
      and use that value returned from alloc_compressed_bio() to calculate
      the stripe boundary.
      
      Then each time we add some page range into the bio, we check if we
      reached the boundary.  And if reached, submit it.
      
      Also, since we have @cur_disk_bytenr to determine whether we're the last
      bio, we don't need a explicit last_bio: tag for error handling any more.
      
      And since we use @cur_disk_bytenr to wait, there is no need for
      pending_bios, also remove it to save some memory of compressed_bio.
      Signed-off-by: NQu Wenruo <wqu@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      91507240
    • Q
      btrfs: introduce compressed_bio::pending_sectors to trace compressed bio · 6ec9765d
      Qu Wenruo 提交于
      For btrfs_submit_compressed_read() and btrfs_submit_compressed_write(),
      we have a pretty weird dance around compressed_bio::pending_bios:
      
        btrfs_submit_compressed_read/write()
        {
      	cb = kmalloc()
      	refcount_set(&cb->pending_bios, 0);
      	bio = btrfs_alloc_bio();
      
      	/* NOTE here, we haven't yet submitted any bio */
      	refcount_set(&cb->pending_bios, 1);
      
      	for (pg_index = 0; pg_index < cb->nr_pages; pg_index++) {
      		if (submit) {
      			/* Here we submit bio, but we always have one
      			 * extra pending_bios */
      			refcount_inc(&cb->pending_bios);
      			ret = btrfs_map_bio();
      		}
      	}
      
      	/* Submit the last bio */
      	ret = btrfs_map_bio();
        }
      
      There are two reasons why we do this:
      
      - compressed_bio::pending_bios is a refcount
        Thus if it's reduced to 0, it can not be increased again.
      
      - To ensure the compressed_bio is not freed by some submitted bios
        If the submitted bio is finished before the next bio submitted,
        we can free the compressed_bio completely.
      
      But the above code is sometimes confusing, and we can do it better by
      introducing a new member, compressed_bio::pending_sectors.
      
      Now we use compressed_bio::pending_sectors to indicate whether we have
      any pending sectors under IO or not yet submitted.
      
      If pending_sectors == 0, we're definitely the last bio of compressed_bio,
      and is OK to release the compressed bio.
      
      Now the workflow looks like this:
      
        btrfs_submit_compressed_read/write()
        {
      	cb = kmalloc()
      	atomic_set(&cb->pending_bios, 0);
      	refcount_set(&cb->pending_sectors,
      		     compressed_len >> sectorsize_bits);
      	bio = btrfs_alloc_bio();
      
      	for (pg_index = 0; pg_index < cb->nr_pages; pg_index++) {
      		if (submit) {
      			refcount_inc(&cb->pending_bios);
      			ret = btrfs_map_bio();
      		}
      	}
      
      	/* Submit the last bio */
      	refcount_inc(&cb->pending_bios);
      	ret = btrfs_map_bio();
        }
      
      For now we still need pending_bios for later error handling, but will
      remove pending_bios eventually after properly handling the errors.
      Signed-off-by: NQu Wenruo <wqu@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      6ec9765d
  4. 23 8月, 2021 1 次提交
    • Q
      btrfs: rework btrfs_decompress_buf2page() · 1c3dc173
      Qu Wenruo 提交于
      There are several bugs inside the function btrfs_decompress_buf2page()
      
      - @start_byte doesn't take bvec.bv_offset into consideration
        Thus it can't handle case where the target range is not page aligned.
      
      - Too many helper variables
        There are tons of helper variables, @buf_offset, @current_buf_start,
        @start_byte, @prev_start_byte, @working_bytes, @bytes.
        This hurts anyone who wants to read the function.
      
      - No obvious main cursor for the iteartion
        A new problem caused by previous problem.
      
      - Comments for parameter list makes no sense
        Like @buf_start is the offset to @buf, or offset inside the full
        decompressed extent? (Spoiler alert, the later case)
        And @total_out acts more like @buf_start + @size_of_buf.
      
        The worst is @disk_start.
        The real meaning of it is the file offset of the full decompressed
        extent.
      
      This patch will rework the whole function by:
      
      - Add a proper comment with ASCII art to explain the parameter list
      
      - Rework parameter list
        The old @buf_start is renamed to @decompressed, to show how many bytes
        are already decompressed inside the full decompressed extent.
        The old @total_out is replaced by @buf_len, which is the decompressed
        data size.
        For old @disk_start and @bio, just pass @compressed_bio in.
      
      - Use single main cursor
        The main cursor will be @cur_file_offset, to show what's the current
        file offset.
        Other helper variables will be declared inside the main loop, and only
        minimal amount of helper variables:
        * offset_inside_decompressed_buf:	The only real helper
        * copy_start_file_offset:		File offset we start memcpy
        * bvec_file_offset:			File offset of current bvec
      
      Even with all these extensive comments, the final function is still
      smaller than the original function, which is definitely a win.
      Signed-off-by: NQu Wenruo <wqu@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      1c3dc173
  5. 21 6月, 2021 2 次提交
    • A
      btrfs: optimize variables size in btrfs_submit_compressed_write · 65b5355f
      Anand Jain 提交于
      Patch "btrfs: reduce compressed_bio member's types" reduced some
      member's size. Function arguments @len, @compressed_len and @nr_pages
      can be declared as unsigned int.
      Signed-off-by: NAnand Jain <anand.jain@oracle.com>
      Reviewed-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      65b5355f
    • D
      btrfs: reduce compressed_bio members' types · 282ab3ff
      David Sterba 提交于
      Several members of compressed_bio are of type that's unnecessarily big
      for the values that they'd hold:
      
      - the size of the uncompressed and compressed data is 128K now, we can
        keep is as int
      - same for number of pages
      - the compress type fits to a byte
      - the errors is 0/1
      
      The size of the unpatched structure is 80 bytes with several holes.
      Reordering nr_pages next to the pages the hole after pending_bios is
      filled and the resulting size is 56 bytes. This keeps the csums array
      aligned to 8 bytes, which is nice. Further size optimizations may be
      possible but right now it looks good to me:
      
      struct compressed_bio {
              refcount_t                 pending_bios;         /*     0     4 */
              unsigned int               nr_pages;             /*     4     4 */
              struct page * *            compressed_pages;     /*     8     8 */
              struct inode *             inode;                /*    16     8 */
              u64                        start;                /*    24     8 */
              unsigned int               len;                  /*    32     4 */
              unsigned int               compressed_len;       /*    36     4 */
              u8                         compress_type;        /*    40     1 */
              u8                         errors;               /*    41     1 */
      
              /* XXX 2 bytes hole, try to pack */
      
              int                        mirror_num;           /*    44     4 */
              struct bio *               orig_bio;             /*    48     8 */
              u8                         sums[];               /*    56     0 */
      
              /* size: 56, cachelines: 1, members: 12 */
              /* sum members: 54, holes: 1, sum holes: 2 */
              /* last cacheline: 56 bytes */
      };
      Reviewed-by: NAnand Jain <anand.jain@oracle.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      282ab3ff
  6. 07 10月, 2020 1 次提交
  7. 27 7月, 2020 1 次提交
  8. 25 5月, 2020 1 次提交
  9. 18 11月, 2019 15 次提交
  10. 09 9月, 2019 2 次提交
  11. 02 7月, 2019 1 次提交
    • J
      btrfs: correctly validate compression type · aa53e3bf
      Johannes Thumshirn 提交于
      Nikolay reported the following KASAN splat when running btrfs/048:
      
      [ 1843.470920] ==================================================================
      [ 1843.471971] BUG: KASAN: slab-out-of-bounds in strncmp+0x66/0xb0
      [ 1843.472775] Read of size 1 at addr ffff888111e369e2 by task btrfs/3979
      
      [ 1843.473904] CPU: 3 PID: 3979 Comm: btrfs Not tainted 5.2.0-rc3-default #536
      [ 1843.475009] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
      [ 1843.476322] Call Trace:
      [ 1843.476674]  dump_stack+0x7c/0xbb
      [ 1843.477132]  ? strncmp+0x66/0xb0
      [ 1843.477587]  print_address_description+0x114/0x320
      [ 1843.478256]  ? strncmp+0x66/0xb0
      [ 1843.478740]  ? strncmp+0x66/0xb0
      [ 1843.479185]  __kasan_report+0x14e/0x192
      [ 1843.479759]  ? strncmp+0x66/0xb0
      [ 1843.480209]  kasan_report+0xe/0x20
      [ 1843.480679]  strncmp+0x66/0xb0
      [ 1843.481105]  prop_compression_validate+0x24/0x70
      [ 1843.481798]  btrfs_xattr_handler_set_prop+0x65/0x160
      [ 1843.482509]  __vfs_setxattr+0x71/0x90
      [ 1843.483012]  __vfs_setxattr_noperm+0x84/0x130
      [ 1843.483606]  vfs_setxattr+0xac/0xb0
      [ 1843.484085]  setxattr+0x18c/0x230
      [ 1843.484546]  ? vfs_setxattr+0xb0/0xb0
      [ 1843.485048]  ? __mod_node_page_state+0x1f/0xa0
      [ 1843.485672]  ? _raw_spin_unlock+0x24/0x40
      [ 1843.486233]  ? __handle_mm_fault+0x988/0x1290
      [ 1843.486823]  ? lock_acquire+0xb4/0x1e0
      [ 1843.487330]  ? lock_acquire+0xb4/0x1e0
      [ 1843.487842]  ? mnt_want_write_file+0x3c/0x80
      [ 1843.488442]  ? debug_lockdep_rcu_enabled+0x22/0x40
      [ 1843.489089]  ? rcu_sync_lockdep_assert+0xe/0x70
      [ 1843.489707]  ? __sb_start_write+0x158/0x200
      [ 1843.490278]  ? mnt_want_write_file+0x3c/0x80
      [ 1843.490855]  ? __mnt_want_write+0x98/0xe0
      [ 1843.491397]  __x64_sys_fsetxattr+0xba/0xe0
      [ 1843.492201]  ? trace_hardirqs_off_thunk+0x1a/0x1c
      [ 1843.493201]  do_syscall_64+0x6c/0x230
      [ 1843.493988]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      [ 1843.495041] RIP: 0033:0x7fa7a8a7707a
      [ 1843.495819] Code: 48 8b 0d 21 de 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 be 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ee dd 2b 00 f7 d8 64 89 01 48
      [ 1843.499203] RSP: 002b:00007ffcb73bca38 EFLAGS: 00000202 ORIG_RAX: 00000000000000be
      [ 1843.500210] RAX: ffffffffffffffda RBX: 00007ffcb73bda9d RCX: 00007fa7a8a7707a
      [ 1843.501170] RDX: 00007ffcb73bda9d RSI: 00000000006dc050 RDI: 0000000000000003
      [ 1843.502152] RBP: 00000000006dc050 R08: 0000000000000000 R09: 0000000000000000
      [ 1843.503109] R10: 0000000000000002 R11: 0000000000000202 R12: 00007ffcb73bda91
      [ 1843.504055] R13: 0000000000000003 R14: 00007ffcb73bda82 R15: ffffffffffffffff
      
      [ 1843.505268] Allocated by task 3979:
      [ 1843.505771]  save_stack+0x19/0x80
      [ 1843.506211]  __kasan_kmalloc.constprop.5+0xa0/0xd0
      [ 1843.506836]  setxattr+0xeb/0x230
      [ 1843.507264]  __x64_sys_fsetxattr+0xba/0xe0
      [ 1843.507886]  do_syscall_64+0x6c/0x230
      [ 1843.508429]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      [ 1843.509558] Freed by task 0:
      [ 1843.510188] (stack is not available)
      
      [ 1843.511309] The buggy address belongs to the object at ffff888111e369e0
                      which belongs to the cache kmalloc-8 of size 8
      [ 1843.514095] The buggy address is located 2 bytes inside of
                      8-byte region [ffff888111e369e0, ffff888111e369e8)
      [ 1843.516524] The buggy address belongs to the page:
      [ 1843.517561] page:ffff88813f478d80 refcount:1 mapcount:0 mapping:ffff88811940c300 index:0xffff888111e373b8 compound_mapcount: 0
      [ 1843.519993] flags: 0x4404000010200(slab|head)
      [ 1843.520951] raw: 0004404000010200 ffff88813f48b008 ffff888119403d50 ffff88811940c300
      [ 1843.522616] raw: ffff888111e373b8 000000000016000f 00000001ffffffff 0000000000000000
      [ 1843.524281] page dumped because: kasan: bad access detected
      
      [ 1843.525936] Memory state around the buggy address:
      [ 1843.526975]  ffff888111e36880: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [ 1843.528479]  ffff888111e36900: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [ 1843.530138] >ffff888111e36980: fc fc fc fc fc fc fc fc fc fc fc fc 02 fc fc fc
      [ 1843.531877]                                                        ^
      [ 1843.533287]  ffff888111e36a00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [ 1843.534874]  ffff888111e36a80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [ 1843.536468] ==================================================================
      
      This is caused by supplying a too short compression value ('lz') in the
      test-case and comparing it to 'lzo' with strncmp() and a length of 3.
      strncmp() read past the 'lz' when looking for the 'o' and thus caused an
      out-of-bounds read.
      
      Introduce a new check 'btrfs_compress_is_valid_type()' which not only
      checks the user-supplied value against known compression types, but also
      employs checks for too short values.
      Reported-by: NNikolay Borisov <nborisov@suse.com>
      Fixes: 272e5326 ("btrfs: prop: fix vanished compression property after failed set")
      CC: stable@vger.kernel.org # 5.1+
      Reviewed-by: NNikolay Borisov <nborisov@suse.com>
      Signed-off-by: NJohannes Thumshirn <jthumshirn@suse.de>
      Reviewed-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      aa53e3bf
  12. 01 7月, 2019 1 次提交
  13. 25 2月, 2019 5 次提交
  14. 30 5月, 2018 1 次提交
  15. 12 4月, 2018 1 次提交
  16. 26 3月, 2018 1 次提交
    • D
      btrfs: add more __cold annotations · e67c718b
      David Sterba 提交于
      The __cold functions are placed to a special section, as they're
      expected to be called rarely. This could help i-cache prefetches or help
      compiler to decide which branches are more/less likely to be taken
      without any other annotations needed.
      
      Though we can't add more __exit annotations, it's still possible to add
      __cold (that's also added with __exit). That way the following function
      categories are tagged:
      
      - printf wrappers, error messages
      - exit helpers
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      e67c718b
  17. 22 1月, 2018 1 次提交