1. 10 Oct, 2015 14 commits
    • C
      f2fs: do in batches truncation in truncate_hole · ea58711e
      Chao Yu committed
      truncate_data_blocks_range can truncate in batches, which updates the
      dnode page content, dnode page status, extent cache, and block count
      together.
      
      But previously, truncate_hole() always truncated one block in the dnode
      page at a time by invoking truncate_data_blocks_range(,1), which made
      things slow.
      
      This patch changes truncate_hole() to truncate in batches, handling all
      target blocks of one direct node inside truncate_data_blocks_range, which
      makes our punch hole operation in ->fallocate more efficient.
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      ea58711e
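The batching idea in ea58711e can be sketched as follows. This is a simplified illustration, not the kernel code: `count_batched_calls` and its `per_node` parameter are hypothetical names, and the sketch assumes every direct node holds the same number of block addresses, ignoring the addresses stored in the inode itself. It splits a block range at direct node boundaries and counts how many range calls would be issued, instead of one call per block.

```c
/*
 * Hypothetical sketch: count how many batched truncate calls are needed
 * for the block range [start, end), assuming each direct node covers
 * exactly per_node consecutive block offsets.
 */
static unsigned long count_batched_calls(unsigned long start,
					 unsigned long end,
					 unsigned long per_node)
{
	unsigned long calls = 0;

	if (per_node == 0)
		return 0;	/* nothing sensible to do without a node size */

	while (start < end) {
		/* blocks left in the direct node that contains 'start' */
		unsigned long count = per_node - (start % per_node);

		/* do not run past the end of the requested range */
		if (count > end - start)
			count = end - start;

		/* a single range call would truncate 'count' blocks here */
		start += count;
		calls++;
	}
	return calls;
}
```

With a node size of 1018 addresses (an illustrative value), a 2048-block range needs 3 calls in this model rather than 2048 single-block calls.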
    • F
      f2fs: optimize code of f2fs_update_extent_tree_range · 4d1fa815
      Fan Li committed
      Fix 2 potential problems:
      1. When the largest extent needs to be invalidated, it is reset in
         __drop_largest_extent, which makes the subsequent __is_extent_same
         always return false and leaves the largest extent unchanged. Now we
         update it properly.
      
      2. When an extent is split and the latter part remains in the tree,
         next_en should be the latter part instead of the next extent of the
         original extent. This would cause a merge failure if there were an
         in-place update; although there is none, I think this fix still makes
         the code less ambiguous.
      
      This patch also simplifies the code that invalidates extents, and
      optimizes the procedure that splits an extent into two.
      There are a few modifications since the last patch:
      1. prev_en is now updated properly.
      2. More code and branches are simplified.
      Signed-off-by: Fan li <fanofcode.li@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      4d1fa815
    • F
      f2fs: drop largest extent by range · 41a099de
      Fan Li committed
      Now that we update extents by range, fofs may not fall within the largest
      extent even though the new extent overlaps with it. So add a new function
      to drop the largest extent properly.
      Signed-off-by: Fan li <fanofcode.li@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      41a099de
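A range-based drop as described in 41a099de can be sketched like this. The struct and function names (`extent_sketch`, `drop_largest_by_range`) are hypothetical stand-ins rather than the f2fs definitions; the point is only that the check must test whether the updated range overlaps the largest extent, not whether a single offset lies inside it.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the cached "largest extent" of an inode. */
struct extent_sketch {
	unsigned int fofs;	/* first file offset covered by the extent */
	unsigned int len;	/* number of blocks; 0 means no extent cached */
};

/*
 * Invalidate the largest extent if the half-open range [fofs, fofs + len)
 * overlaps it. Returns true when the extent was dropped.
 */
static bool drop_largest_by_range(struct extent_sketch *largest,
				  unsigned int fofs, unsigned int len)
{
	if (largest->len == 0)
		return false;	/* nothing cached, nothing to drop */

	if (fofs < largest->fofs + largest->len &&
	    fofs + len > largest->fofs) {
		largest->len = 0;
		return true;
	}
	return false;
}
```

A range that merely touches the extent's boundary (for example, one ending exactly where the extent starts) does not overlap it and leaves it intact.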
    • J
      f2fs: check end_io for metapages before making next checkpoint blocks · a7230d16
      Jaegeuk Kim committed
      This patch avoids producing new checkpoint blocks before the previous meta
      pages have been written completely.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      a7230d16
    • J
      f2fs crypto: allocate buffer for decrypting filename · 569cf187
      Jaegeuk Kim committed
      We got dentry pages from high_mem, and their addresses go directly into the
      decryption path via f2fs_fname_disk_to_usr.
      But sg_init_one assumes the address is not from high_mem, so we can get this
      panic, since it doesn't call kmap_high but kunmap_high is triggered at the end.
      
      kernel BUG at ../../../../../../kernel/mm/highmem.c:290!
      Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
      ...
       (kunmap_high+0xb0/0xb8) from [<c0114534>] (__kunmap_atomic+0xa0/0xa4)
       (__kunmap_atomic+0xa0/0xa4) from [<c035f028>] (blkcipher_walk_done+0x128/0x1ec)
       (blkcipher_walk_done+0x128/0x1ec) from [<c0366c24>] (crypto_cbc_decrypt+0xc0/0x170)
       (crypto_cbc_decrypt+0xc0/0x170) from [<c0367148>] (crypto_cts_decrypt+0xc0/0x114)
       (crypto_cts_decrypt+0xc0/0x114) from [<c035ea98>] (async_decrypt+0x40/0x48)
       (async_decrypt+0x40/0x48) from [<c032ca34>] (f2fs_fname_disk_to_usr+0x124/0x304)
       (f2fs_fname_disk_to_usr+0x124/0x304) from [<c03056fc>] (f2fs_fill_dentries+0xac/0x188)
       (f2fs_fill_dentries+0xac/0x188) from [<c03059c8>] (f2fs_readdir+0x1f0/0x300)
       (f2fs_readdir+0x1f0/0x300) from [<c0218054>] (vfs_readdir+0x90/0xb4)
       (vfs_readdir+0x90/0xb4) from [<c0218418>] (SyS_getdents64+0x64/0xcc)
       (SyS_getdents64+0x64/0xcc) from [<c0105ba0>] (ret_fast_syscall+0x0/0x30)
      
      Cc: <stable@vger.kernel.org>
      Reviewed-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      569cf187
    • C
      f2fs: reorganize f2fs_map_blocks · 973163fc
      Chao Yu committed
      In this patch, we try to reorganize f2fs_map_blocks to make the block mapping
      flow clearer by using the following structure:
      
      /* check status of mapping */
      
      if (unmapped) {
      	/* blkaddr == NULL_ADDR || blkaddr == NEW_ADDR */
      
      	if (create) {
      		/* write path, handle dio write case here */
      		alloc_and_map;
      	} else {
      		/*
      		 * handle read cases from all call paths:
      		 *     1. generic read;
      		 *     2. dio read;
      		 *     3. fiemap;
      		 *     4. bmap
      		 */
      	}
      }
      
      /* map buffer_header */
      
      Besides, this patch correctly handles a missing case for dio write:
      when we fail in __allocate_data_blocks, f2fs_map_blocks previously did
      not allocate blocks for the preallocated blocks and instead returned with
      an unmapped buffer head, which resulted in failure of the dio write.
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      973163fc
    • J
      f2fs: declare f2fs_update_extent_tree_range as static · 514053e4
      Jaegeuk Kim committed
      This function should be static.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      514053e4
    • C
      f2fs: fix overflow of size calculation · 9edcdabf
      Chao Yu committed
      We have a potential overflow issue when calculating the size of an object:
      when we left shift index by PAGE_CACHE_SHIFT bits and the type of index is
      only 32 bits wide on a 32-bit architecture, the left shift will overflow,
      e.g.:
      
      pgoff_t index =  0xFFFFFFFF;
      loff_t size = index << PAGE_CACHE_SHIFT;
      size: 0xFFFFF000
      
      So we should cast index to a 64-bit type to avoid this issue.
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      9edcdabf
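The overflow described in 9edcdabf can be reproduced in user space with fixed-width types. This is a sketch under stated assumptions: `uint32_t` stands in for a 32-bit `pgoff_t`, `PAGE_SHIFT_SKETCH` is a hypothetical constant assuming 4 KiB pages, and the helper names are illustrative only.

```c
#include <stdint.h>

/* Assumption: 4 KiB pages, so the page shift is 12. */
#define PAGE_SHIFT_SKETCH 12

/*
 * Buggy form: the shift is carried out in 32 bits, so the upper bits
 * of the result are lost before it is widened to 64 bits.
 */
static uint64_t size_truncated(uint32_t index)
{
	return (uint32_t)(index << PAGE_SHIFT_SKETCH);
}

/* Fixed form: widen the index to 64 bits first, then shift. */
static uint64_t size_widened(uint32_t index)
{
	return (uint64_t)index << PAGE_SHIFT_SKETCH;
}
```

For an index of 0xFFFFFFFF, the truncated form yields 0xFFFFF000 as in the commit message, while the widened form yields 0xFFFFFFFF000.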
    • C
      f2fs: fix incorrect searching position when shrinking extent cache · 100136ac
      Chao Yu committed
      When shrinking the extent cache, we have two steps in the flow:
      1) shrink objects which are unreferenced by inodes;
      2) shrink objects from the LRU list of the extent cache.
      
      In step 1, if we haven't shrunk enough objects, we will try
      step 2, but before that we didn't update the searching position, which
      may point to the last inode index in the global extent tree, resulting in
      failure to shrink objects when traversing all inodes' extent trees.
      
      In this patch, we fix this by resetting the searching position to the
      beginning of the global extent tree.
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      100136ac
    • C
      f2fs: verify file type early in f2fs_fallocate · c998012b
      Chao Yu committed
      This patch verifies the file type early in f2fs_fallocate for cleanup;
      meanwhile, it also adds the missing verification for
      expand_inode_data.
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      c998012b
    • J
      f2fs: no need to lock for update_inode_page all the time · c5cd29d2
      Jaegeuk Kim committed
      As the comment says, we don't need to call f2fs_lock_op in write_inode all
      the time to prevent producing dirty node pages.
      That happens only when there are not enough free sections, and we can avoid
      that by calling balance_fs prior to that.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      c5cd29d2
    • J
      f2fs: cover number of dirty node pages under node_write lock · 25b93346
      Jaegeuk Kim committed
      This number is referenced by checkpoint under node_write lock.
      Reviewed-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      25b93346
    • N
      f2fs: fix incorrect return statement in the function f2fs_ioc_release_volatile_write · 538e17e7
      Nicholas Krause committed
      This fixes the incorrect return statement at the end of the body of
      f2fs_ioc_release_volatile_write, which always returned zero. That is
      incorrect because the call to punch_hole just before the return
      statement can fail, so we should return punch_hole's return value
      directly in order to signal the failure to callers of
      f2fs_ioc_release_volatile_write.
      Signed-off-by: Nicholas Krause <xerofoify@gmail.com>
      Reviewed-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      538e17e7
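The pattern fixed in 538e17e7 can be illustrated with a small stand-alone sketch. All names here are hypothetical: `punch_hole_stub` merely simulates a helper that may fail, and the two wrappers contrast discarding the error with propagating it; none of this is the f2fs code itself.

```c
#include <errno.h>

/* Hypothetical stand-in for a helper that can fail with -EIO. */
static int punch_hole_stub(int should_fail)
{
	return should_fail ? -EIO : 0;
}

/* Old pattern: the helper's result is discarded and 0 is always returned. */
static int release_volatile_old(int should_fail)
{
	punch_hole_stub(should_fail);
	return 0;
}

/* Fixed pattern: the helper's result is passed straight to the caller. */
static int release_volatile_fixed(int should_fail)
{
	return punch_hole_stub(should_fail);
}
```

With the old pattern a caller sees success even when the helper failed; with the fixed pattern the caller receives the negative errno value.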
    • C
      f2fs: trace in batches extent info update · 744288c7
      Chao Yu committed
      Rename trace_f2fs_update_extent_tree to trace_f2fs_update_extent_tree_range,
      then expand and enable it to trace batched extent info updates.
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      744288c7
  2. 02 Sep, 2015 1 commit
  3. 29 Aug, 2015 1 commit
    • C
      f2fs: avoid accessing NULL pointer in f2fs_drop_largest_extent · 54d71856
      Chao Yu committed
      If the extent cache is disabled, we will encounter an oops when triggering
      direct IO as below:
      
      BUG: unable to handle kernel NULL pointer dereference at 0000000c
      IP: [<f0b9c61e>] f2fs_drop_largest_extent+0xe/0x30 [f2fs]
      *pdpt = 000000002bb9a001 *pde = 0000000000000000
      Oops: 0000 [#1] SMP
      Modules linked in: f2fs(O) fuse bnep rfcomm bluetooth nfsd dm_crypt nfs_acl auth_rpcgss oid_registry nfs binfmt_misc fscache lockd
      sunrpc grace snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq snd_timer
      snd_seq_device snd soundcore joydev psmouse hid_generic i2c_piix4 serio_raw ppdev mac_hid parport_pc lp parport ext4 jbd2 mbcache
      usbhid hid e1000
      CPU: 3 PID: 3608 Comm: dd Tainted: G           O    4.2.0-rc4 #12
      Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
      task: ef161600 ti: ebd5e000 task.ti: ebd5e000
      EIP: 0060:[<f0b9c61e>] EFLAGS: 00010202 CPU: 3
      EIP is at f2fs_drop_largest_extent+0xe/0x30 [f2fs]
      EAX: 00000000 EBX: ddebc000 ECX: 00000000 EDX: 00000000
      ESI: ebd5fdf8 EDI: 00000000 EBP: ebd5fd58 ESP: ebd5fd58
       DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
      CR0: 80050033 CR2: 0000000c CR3: 2c24ee40 CR4: 000006f0
      Stack:
       ebd5fda4 f0b8c005 00000000 00000001 00000000 f0b8c430 c816cd68 ddebc000
       ddebc088 00001000 00000555 00000555 ffffffff c160bb00 00055501 00000000
       00000000 00000100 00000000 ebd5fe20 f0b8c430 00000046 ef161600 00001000
      Call Trace:
       [<f0b8c005>] __allocate_data_block+0x1a5/0x260 [f2fs]
       [<f0b8c430>] ? f2fs_direct_IO+0x370/0x440 [f2fs]
       [<c160bb00>] ? down_read+0x30/0x50
       [<f0b8c430>] f2fs_direct_IO+0x370/0x440 [f2fs]
       [<c113e115>] generic_file_direct_write+0xa5/0x260
       [<c10b53f8>] ? current_fs_time+0x18/0x50
       [<c113e38b>] __generic_file_write_iter+0xbb/0x210
       [<c113e50f>] ? generic_file_write_iter+0x2f/0x320
       [<c113e63c>] generic_file_write_iter+0x15c/0x320
       [<f0b77f29>] f2fs_file_write_iter+0x39/0x80 [f2fs]
       [<c11984d9>] __vfs_write+0xa9/0xe0
       [<c1199227>] vfs_write+0x97/0x180
       [<c119955b>] SyS_write+0x5b/0xd0
       [<c160dcd0>] sysenter_do_call+0x12/0x12
      Code: 10 8b 50 1c 89 53 14 eb ca 8d 74 26 00 85 f6 74 86 eb a6 0f 0b 90 8d b4 26 00 00 00 00 55 89 e5 3e 8d 74 26 00 8b 80 d4 02 00
      00 <8b> 48 0c 39 d1 77 0e 03 48 14 39 ca 73 07 c7 40 14 00 00 00 00
      EIP: [<f0b9c61e>] f2fs_drop_largest_extent+0xe/0x30 [f2fs] SS:ESP 0068:ebd5fd58
      CR2: 000000000000000c
      ---[ end trace a38c07026a1afffd ]---
      
      This is because when the extent cache is disabled, the extent_tree pointer in
      struct f2fs_inode_info should be NULL, but in f2fs_drop_largest_extent we
      access this NULL pointer directly without checking the state of the extent
      cache, and then the oops occurs. Let's fix it by checking the state of the
      extent cache before accessing the pointer.
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      54d71856
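The guard described in 54d71856 might look roughly like the sketch below. The types and the `cache_enabled` flag are hypothetical simplifications, not the f2fs structures; the sketch only shows checking the cache state before dereferencing a tree pointer that is NULL when the cache is off.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the per-inode extent state. */
struct tree_sketch {
	unsigned int largest_fofs;	/* start offset of the largest extent */
	unsigned int largest_len;	/* its length in blocks, 0 if none */
};

struct inode_sketch {
	bool cache_enabled;		/* assumed flag: extent cache in use */
	struct tree_sketch *tree;	/* NULL when the cache is disabled */
};

/* Drop the largest extent if it covers fofs; safe when the cache is off. */
static bool drop_largest_checked(struct inode_sketch *inode, unsigned int fofs)
{
	struct tree_sketch *et;

	/* Check the cache state first, so a NULL tree is never dereferenced. */
	if (!inode->cache_enabled || inode->tree == NULL)
		return false;

	et = inode->tree;
	if (fofs >= et->largest_fofs &&
	    fofs < et->largest_fofs + et->largest_len) {
		et->largest_len = 0;
		return true;
	}
	return false;
}
```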
  4. 27 Aug, 2015 1 commit
    • C
      f2fs: update extent tree in batches · 19b2c30d
      Chao Yu committed
      This patch introduces a new helper, f2fs_update_extent_tree_range, which can
      update extent mappings over a specified range.
      
      The main idea is:
      1) punch all mapping info in extent node(s) which are in the specified range;
      2) try to merge the new extent mapping with an adjacent node, or failing that,
         insert the mapping into the extent tree as a new node.
      
      In order to see the benefit, I added a function for reading the time stamp
      counter to collect statistics, as below:
      
      uint64_t rdtsc(void)
      {
      	uint32_t lo, hi;
      	__asm__ __volatile__ ("rdtsc" : "=a" (lo), "=d" (hi));
      	return (uint64_t)hi << 32 | lo;
      }
      
      My test environment is: Ubuntu, Intel i7-3770, 16G memory, 256G Micron SSD.
      
      truncation path:	update extent cache from truncate_data_blocks_range
      non-truncation path:	update extent cache from other paths
      total:			all update paths
      
      a) Removing a 128MB file which has one extent node mapping the whole range
      of the file:
      1. dd if=/dev/zero of=/mnt/f2fs/128M bs=1M count=128
      2. sync
      3. rm /mnt/f2fs/128M
      
      Before:
      		total		count		average
      truncation:	7651022		32768		233.49
      
      Patched:
      		total		count		average
      truncation:	3321		33		100.64
      
      b) fsstress:
      fsstress -d /mnt/f2fs -l 5 -n 100 -p 20
      Test times:		5 times.
      
      Before:
      		total		count		average
      truncation:	5812480.6	20911.6		277.95
      non-truncation:	7783845.6	13440.8		579.12
      total:		13596326.2	34352.4		395.79
      
      Patched:
      		total		count		average
      truncation:	1281283.0	3041.6		421.25
      non-truncation:	7355844.4	13662.8		538.38
      total:		8637127.4	16704.4		517.06
      
      1) For the updates in the truncation path:
       - we can see that updating in batches clearly reduces the total tsc and
         the update count;
       - besides, a single batched update punches multiple extent nodes in a
         loop, which executes more operations, so the average tsc per update
         increases noticeably in the fsstress case.
      2) For the updates in the non-truncation path:
       - there is a small improvement, because when we just need to update the
         head or tail of an extent node, the new interface updates the info in
         the extent node directly, rather than removing the original extent node
         and then inserting the updated one into the cache as a new node.
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      19b2c30d
  5. 25 Aug, 2015 6 commits
  6. 22 Aug, 2015 8 commits
    • C
      f2fs: lookup neighbor extent nodes for merging later · dac2ddef
      Chao Yu committed
      In __lookup_extent_tree_ret we will not try to find neighbor nodes if
      we find the target node; in this condition, we will lose the chance to
      merge the new mapping with an existing extent node later.
      
      So the extent cache of an inode will be fragmented after overwriting an
      existing file; we can see the number of extent nodes increase sharply in
      the following test case:
      
      dd if=/dev/zero of=/mnt/f2fs/4m bs=4K count=1024
      
      Extent Cache:
        - Hit Count: L1-1:0 L1-2:0 L2:0
        - Hit Ratio: 0% (0 / 3072)
        - Inner Struct Count: tree: 1, node: 1
      
      dd if=/dev/zero of=/mnt/f2fs/4m bs=4K count=1024 conv=notrunc
      
      Extent Cache:
        - Hit Count: L1-1:2048 L1-2:0 L2:0
        - Hit Ratio: 33% (2048 / 6144)
        - Inner Struct Count: tree: 1, node: 961
      
      This patch fixes this by looking up the neighbors of the target node for
      later merging.
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      dac2ddef
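Why neighbor lookup matters for dac2ddef can be seen from a minimal merge check. The names below (`extent_map_sketch`, `is_back_mergeable`, `merge_into_prev`) are hypothetical; the sketch assumes two mappings can be merged only when they are contiguous both in file offset and in block address, so a new mapping can only be merged if its neighbor is known.

```c
#include <stdbool.h>

/* Hypothetical extent mapping: file offset -> block address, with length. */
struct extent_map_sketch {
	unsigned int fofs;	/* first file offset */
	unsigned int blk;	/* first block address */
	unsigned int len;	/* number of contiguous blocks */
};

/* True when 'front' directly continues 'back' in both offset and address. */
static bool is_back_mergeable(const struct extent_map_sketch *back,
			      const struct extent_map_sketch *front)
{
	return back->fofs + back->len == front->fofs &&
	       back->blk + back->len == front->blk;
}

/*
 * Try to extend the previous neighbor with the new mapping. Without a
 * known neighbor (prev is a null pointer) no merge is possible and the
 * caller would have to insert a separate node.
 */
static bool merge_into_prev(struct extent_map_sketch *prev,
			    const struct extent_map_sketch *cur)
{
	if (!prev || !is_back_mergeable(prev, cur))
		return false;

	prev->len += cur->len;
	return true;
}
```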
    • C
      f2fs: split __insert_extent_tree_ret for readability · ef05e221
      Chao Yu committed
      This patch splits __insert_extent_tree_ret into __try_merge_extent_node &
      __insert_extent_tree for code readability.
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      ef05e221
    • C
      f2fs: kill dead code in __insert_extent_tree · a6f78345
      Chao Yu committed
      After commit 0f825ee6 ("f2fs: add new interfaces for extent tree"),
      f2fs_init_extent_tree becomes the only caller of __insert_extent_tree, and
      in f2fs_init_extent_tree, we will only insert extent node in an empty tree,
      so __try_{back,front}_merge in __insert_extent_tree will never be called.
      
      This patch removes this dead code and also renames __insert_extent_tree
      to __init_extent_tree for readability.
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      a6f78345
    • C
      f2fs: adjust showing of extent cache stat · 029e13cc
      Chao Yu committed
      This patch replaces the total hit stat with the rbtree hit stat,
      and then adjusts the display of the extent cache stat:
      
      Hit Count:
      L1-1: for largest node hit count;
      L1-2: for last cached node hit count;
      L2: for extent node hit count after looking up in the rbtree.
      
      Hit Ratio:
      ratio (hit count / total lookup count)
      
      Inner Struct Count:
      tree count, node count.
      
      Before:
      Extent Hit Ratio: 0 / 2
      
      Extent Tree Count: 3
      
      Extent Node Count: 2
      
      Patched:
      Extent Cache:
        - Hit Count: L1-1:4871 L1-2:2074 L2:208
        - Hit Ratio: 1% (7153 / 550751)
        - Inner Struct Count: tree: 26560, node: 11824
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      029e13cc
    • C
      f2fs: add largest/cached stat in extent cache · 91c481ff
      Chao Yu committed
      This patch adds stats for the hit count of the largest/cached node, to be
      shown in debugfs.
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      91c481ff
    • C
      f2fs: fix incorrect mapping for bmap · e2b4e2bc
      Chao Yu committed
      The test steps are as below:
      1. touch file
      2. truncate -s $((1024*1024)) file
      3. fallocate -o 0 -l $((1024*1024)) file
      4. fibmap.f2fs file
      
      The result of fibmap.f2fs shown below is not correct:
      
      file_pos   start_blk     end_blk        blks
             0    -937166132    -937166132           1
          4096    -937166132    -937166132           1
          8192    -937166132    -937166132           1
         12288    -937166132    -937166132           1
         16384    -937166132    -937166132           1
         20480    -937166132    -937166132           1
      ...
       1040384    -937166132    -937166132           1
       1044480    -937166132    -937166132           1
      
      This is because f2fs_map_blocks returns with no error when it meets
      a hole or preallocated block, and the caller __get_data_block then maps
      the uninitialized variable value to bh->b_blocknr.
      
      Unfortunately generic_block_bmap checks neither the return value of
      get_data() nor the mapping info of the buffer_head, resulting in
      a random block address being returned.
      
      After fixing the issue, our result is shown correctly:
      
      file_pos   start_blk     end_blk        blks
             0           0           0         256
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      e2b4e2bc
    • F
      f2fs: fix to update cached_en of extent tree properly · f8b703da
      Fan Li committed
      In f2fs_lookup_extent_tree, et->cached_en was read and updated with only the
      read lock held. This could cause __lookup_extent_tree to return an entirely
      wrong extent_node if another thread updated et->cached_en just before
      __lookup_extent_tree returned.
      
      However, there are two things about this patch that need to be noted:
      1. It does nothing to order concurrent reads and writes; the result would
         still be random in such a case.
      2. It is built on this assumption: mixing reads and writes on a single
         pointer would not make the pointer partially wrong at any time. Please
         let me know if I'm wrong, thx.
      Signed-off-by: Fan li <fanofcode.li@samsung.com>
      Reviewed-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      f8b703da
    • J
      f2fs: fix typo · 217940d4
      Junesung Lee committed
      Fix typo.
      Signed-off-by: Junesung Lee <junesoung412@gmail.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      217940d4
  7. 21 Aug, 2015 9 commits