1. 11 Sep, 2014 1 commit
  2. 25 Jun, 2014 1 commit
  3. 29 May, 2014 3 commits
  4. 12 Apr, 2013 1 commit
  5. 18 Feb, 2013 1 commit
    • umount oops when remove blocklayoutdriver first · 5a12cca6
      Committed by fanchaoting
      When the pNFS client is using the block layout, the blocklayoutdriver
      module can still be removed first. A later umount then causes an oops
      in unset_pnfs_layoutdriver, because
      nfss->pnfs_curr_ld->clear_layoutdriver is no longer valid.
      
      To reproduce:
       modprobe  blocklayoutdriver
       mount -t nfs4 -o minorversion=1 pnfsip:/ /mnt/
       rmmod blocklayoutdriver
       umount /mnt
      
      The following oops is then reported:
      
      CPU 0
      Pid: 17023, comm: umount.nfs4 Tainted: GF          O 3.7.0-rc6-pnfs #1 VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform
      RIP: 0010:[<ffffffffa04cfe6d>]  [<ffffffffa04cfe6d>] unset_pnfs_layoutdriver+0x1d/0x70 [nfsv4]
      RSP: 0018:ffff8800022d9e48  EFLAGS: 00010286
      RAX: ffffffffa04a1b00 RBX: ffff88000b013800 RCX: 0000000000000001
      RDX: ffffffff81ae8ee0 RSI: ffff880001ee94b8 RDI: ffff88000b013800
      RBP: ffff8800022d9e58 R08: 0000000000000001 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000000 R12: ffff880001ee9400
      R13: ffff8800105978c0 R14: 00007fff25846c08 R15: 0000000001bba550
      FS:  00007f45ae7f0700(0000) GS:ffff880012c00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      CR2: ffffffffa04a1b38 CR3: 0000000002c0c000 CR4: 00000000000006f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process umount.nfs4 (pid: 17023, threadinfo ffff8800022d8000, task ffff880006e48aa0)
      Stack:
      ffff8800105978c0 ffff88000b013800 ffff8800022d9e78 ffffffffa04cd0ce
      ffff8800022d9e78 ffff88000b013800 ffff8800022d9ea8 ffffffffa04755a7
      ffff8800022d9ea8 ffff880002f96400 ffff88000b013800 ffff880002f96400
      Call Trace:
      [<ffffffffa04cd0ce>] nfs4_destroy_server+0x1e/0x30 [nfsv4]
      [<ffffffffa04755a7>] nfs_free_server+0xb7/0x150 [nfs]
      [<ffffffffa047d4d5>] nfs_kill_super+0x35/0x40 [nfs]
      [<ffffffff81178d35>] deactivate_locked_super+0x45/0x70
      [<ffffffff8117986a>] deactivate_super+0x4a/0x70
      [<ffffffff81193ee2>] mntput_no_expire+0xd2/0x130
      [<ffffffff81194d62>] sys_umount+0x72/0xe0
      [<ffffffff8154af59>] system_call_fastpath+0x16/0x1b
      Code: 06 e1 b8 ea ff ff ff eb 9e 0f 1f 44 00 00 55 48 89 e5 53 48 83 ec 08 66 66 66 66 90 48 8b 87 80 03 00 00 48 89 fb 48 85 c0 74 29 <48> 8b 40 38 48 85 c0 74 02 ff d0 48 8b 03 3e ff 48 04 0f 94 c2
      RIP  [<ffffffffa04cfe6d>] unset_pnfs_layoutdriver+0x1d/0x70 [nfsv4]
      RSP <ffff8800022d9e48>
      CR2: ffffffffa04a1b38
      ---[ end trace 29f75aaedda058bf ]---
      
      Signed-off-by: fanchaoting <fanchaoting@cn.fujitsu.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      Cc: stable@vger.kernel.org
  6. 17 Oct, 2012 1 commit
  7. 09 Oct, 2012 1 commit
  8. 03 Aug, 2012 1 commit
    • pnfs-obj: Better IO pattern in case of unaligned offset · 7de6e284
      Committed by Boaz Harrosh
      Depending on layout and ARCH, ORE has some limits on max IO sizes,
      which are communicated in (what else) ore_layout->max_io_length,
      which is always stripe aligned.
      This was used as the pg_test boundary for splitting and starting
      a new IO.
      
      But in the case of a long IO where the start offset is not aligned,
      both the end of IO[N] and the start of IO[N+1] would be unaligned,
      causing the parity unit at each IO boundary to be calculated and
      written twice.
      
      So what we do in this patch is split off the very start of an
      unaligned IO, up to a stripe boundary, so that the following IOs can
      continue fully aligned until the end.
      
      We might be sacrificing the case where the full unaligned IO would
      fit within a single max_io_length, but the sacrifice is well worth
      the elimination of the double calculation and IO of parity units.
      In practice the cost is marginal and almost unmeasurable.
      
      TODO:
      	If we know the total expected linear segment that will
      	be received, at pg_init, we could use that information
      	in many places:
      	1. blocks-layout get_layout write segment size
      	2. Better mds-threshold
      	3. In above situation for a better clean split
      
      	I will do this in a future submission.
      Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
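The splitting rule described in this commit can be sketched in user-space C. This is a minimal illustration and not the kernel's pg_test code: `next_io_len` and its parameters are hypothetical names, and the sketch assumes that max_io_length is a whole multiple of the stripe size, as the message states.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): return the length of the
 * next IO to issue for a request that starts at `offset` and still has
 * `remaining` bytes to transfer.
 */
static uint64_t next_io_len(uint64_t offset, uint64_t remaining,
                            uint64_t stripe_size, uint64_t max_io_length)
{
    uint64_t misalign, limit;

    /* Assumption taken from the commit text: max_io_length is stripe aligned. */
    assert(stripe_size > 0);
    assert(max_io_length >= stripe_size);
    assert(max_io_length % stripe_size == 0);

    misalign = offset % stripe_size;

    /*
     * Unaligned start: issue a short IO that ends on the next stripe
     * boundary. Aligned start: use the full max_io_length.
     */
    limit = misalign ? stripe_size - misalign : max_io_length;

    return remaining < limit ? remaining : limit;
}
```

With a 4096-byte stripe and a 16384-byte max_io_length, a request at offset 1000 is first cut to 3096 bytes, after which every following IO starts on a stripe boundary.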
  9. 20 Jul, 2012 2 commits
    • pnfs-obj: Fix __r4w_get_page when offset is beyond i_size · c999ff68
      Committed by Boaz Harrosh
      It is very common for the end of the file to be unaligned on
      stripe size. But since we know it's beyond the file's end, the
      XOR should be performed with all zeros.
      
      Old code used to just read zeros out of the OSD devices, which is a great
      waste. But what scares me more about this situation is that we now have
      pages attached to the file's mapping that are beyond i_size. I don't
      like the kind of bugs this calls for.
      
      Fix both problems by returning a global zero_page if the offset is
      beyond i_size.
      
      TODO:
      	Change the API to ->__r4w_get_page() so a NULL can be
      	returned without being considered as error, since XOR API
      	treats NULL entries as zero_pages.
      
      [Bug since 3.2. Should apply the same way to all Kernels since]
      CC: Stable Tree <stable@kernel.org>
      Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
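The idea of handing back a shared zero page for offsets past the end of the file can be sketched in user-space C. This is only an illustration under stated assumptions: `toy_file`, `toy_zero_page`, `toy_r4w_get_page` and `toy_xor_page` are hypothetical names, and the in-memory buffer stands in for the page cache and the OSD devices.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096u

/* One shared, read-only page of zeros (stand-in for a global zero_page). */
static const unsigned char toy_zero_page[TOY_PAGE_SIZE] = {0};

/* Hypothetical in-memory "file": data holds whole pages, i_size is the EOF. */
struct toy_file {
    const unsigned char *data;
    uint64_t i_size;
};

/*
 * Return the page that backs `offset`. Past i_size nothing is read and
 * nothing is attached to the file: the shared zero page is returned.
 */
static const unsigned char *toy_r4w_get_page(const struct toy_file *f,
                                             uint64_t offset)
{
    assert(f != NULL);
    assert(offset % TOY_PAGE_SIZE == 0);

    if (offset >= f->i_size)
        return toy_zero_page;

    return f->data + offset;
}

/* XOR one source page into the parity page. */
static void toy_xor_page(unsigned char *parity, const unsigned char *src)
{
    size_t i;

    for (i = 0; i < TOY_PAGE_SIZE; i++)
        parity[i] ^= src[i];
}
```

XOR-ing the zero page into a parity page leaves the parity unchanged, which is why no device read is needed for the part of the stripe beyond the end of the file.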
    • pnfs-obj: don't leak objio_state if ore_write/read fails · 9909d45a
      Committed by Boaz Harrosh
      [Bug since 3.2 Kernel]
      CC: Stable Tree <stable@kernel.org>
      Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
  10. 05 May, 2012 1 commit
    • NFS: Fix sparse warnings · 1385b811
      Committed by Trond Myklebust
      Fix the following sparse warnings:
      
      fs/nfs/direct.c:221:6: warning: symbol 'nfs_direct_readpage_release' was
      not declared. Should it be static?
      fs/nfs/read.c:38:43: warning: non-ANSI function declaration of function
      'nfs_readhdr_alloc'
      fs/nfs/objlayout/objio_osd.c:214:5: warning: symbol '__alloc_objio_seg'
      was not declared. Should it be static?
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      Cc: Fred Isaman <iisaman@netapp.com>
      Cc: Boaz Harrosh <bharrosh@panasas.com>
  11. 28 Apr, 2012 1 commit
    • NFS: create common nfs_pgio_header for both read and write · cd841605
      Committed by Fred Isaman
      In order to avoid duplicating all the data in nfs_read_data whenever we
      split it up into multiple RPC calls (either due to a short read result
      or due to rsize < PAGE_SIZE), we split out the bits that are the same
      per RPC call into a separate "header" structure.
      
      The goal this patch moves towards is to have a single header
      refcounted by several rpc_data structures.  Thus, we want to always refer
      from rpc_data to the header, and not the other way.  This patch comes
      close to that ideal, but the directio code currently needs some
      special casing, isolated in the nfs_direct_[read_write]hdr_release()
      functions.  This will be dealt with in a future patch.
      Signed-off-by: Fred Isaman <iisaman@netapp.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
  12. 21 Mar, 2012 1 commit
    • pnfs-obj: autologin: Add support for protocol autologin · 18d98f6c
      Committed by Sachin Bhamare
      The pnfs-objects protocol mandates that we autologin into devices not
      present in the system, according to information specified in the
      get_device_info returned from the server.
      
      The Protocol specifies two login hints.
      1. An IP address:port combination
      2. A string URI which is constructed as a URL with a protocol prefix
         followed by :// and a string as address. For each  protocol prefix
         the string-address format might be different.
      
      We only support the second option. The first option is just redundant
      to the second one.
      NOTE: The Kernel part of autologin does not parse the URI string. It
      just channels it to a user-mode script. So any new login protocols should
      only update the user-mode script which is a part of the nfs-utils package,
      but the Kernel need not change.
      
      We implement the autologin by using the call_usermodehelper() API.
      (Thanks to Steve Dickson <steved@redhat.com> for pointing it out)
      So there is no running daemon needed, and/or special setup.
      
      We add the osd_login_prog kernel module parameter, which defaults to:
      	/sbin/osd_login
      
      The kernel tries to upcall the program specified in osd_login_prog. If the
      file is not found or the execution fails, the kernel disables any further
      upcalls by zeroing out osd_login_prog, until the admin re-enables it by
      setting the osd_login_prog parameter to a proper program.
      
      Also add text about the osd_login program command line API to:
      	Documentation/filesystems/nfs/pnfs.txt
      and documentation of the new  osd_login_prog  module parameter to:
      	Documentation/kernel-parameters.txt
      
      TODO: Add a timeout option in case the osd_login program gets stuck
      Signed-off-by: Sachin Bhamare <sbhamare@panasas.com>
      Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
  13. 14 Mar, 2012 1 commit
    • pnfs-obj: Uglify objio_segment allocation for the sake of the principle :-( · 5318a29c
      Committed by Boaz Harrosh
      At some point in the past Linus Torvalds wrote:
      > From: Linus Torvalds <torvalds@linux-foundation.org>
      > commit a84a79e4 upstream.
      >
      > The size is always valid, but variable-length arrays generate worse code
      > for no good reason (unless the function happens to be inlined and the
      > compiler sees the length for the simple constant it is).
      >
      > Also, there seems to be some code generation problem on POWER, where
      > Henrik Bakken reports that register r28 can get corrupted under some
      > subtle circumstances (interrupt happening at the wrong time?).  That all
      > indicates some seriously broken compiler issues, but since variable
      > length arrays are bad regardless, there's little point in trying to
      > chase it down.
      >
      > "Just don't do that, then".
      
      Since then any use of "variable length arrays" has become blasphemous.
      Even in perfectly good, beautiful, perfectly safe code like the one
      below where the variable length arrays are only used as a sizeof()
      parameter, for type-safe dynamic structure allocations. GCC is not
      executing any stack allocation code.
      
      I have produced a small file which defines two functions main1(unsigned numdevs)
      and main2(unsigned numdevs). main1 uses code as before with call to malloc
      and main2 uses code as of after this patch. I compiled it as:
      	gcc -O2 -S see_asm.c
      and here is what I get:
      
      <see_asm.s>
      main1:
      .LFB7:
      	.cfi_startproc
      	mov	%edi, %edi
      	leaq	4(%rdi,%rdi), %rdi
      	salq	$3, %rdi
      	jmp	malloc
      	.cfi_endproc
      .LFE7:
      	.size	main1, .-main1
      	.p2align 4,,15
      	.globl	main2
      	.type	main2, @function
      main2:
      .LFB8:
      	.cfi_startproc
      	mov	%edi, %edi
      	addq	$2, %rdi
      	salq	$4, %rdi
      	jmp	malloc
      	.cfi_endproc
      .LFE8:
      	.size	main2, .-main2
      	.section	.text.startup,"ax",@progbits
      	.p2align 4,,15
      </see_asm.s>
      
      *Exact* same code !!!
      
      So please seriously consider not accepting this patch and leave the
      perfectly good code intact.
      
      CC: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
  14. 07 Feb, 2012 1 commit
  15. 06 Jan, 2012 1 commit
    • pnfs-obj: Must return layout on IO error · fe0fe835
      Committed by Boaz Harrosh
      As mandated by the standard, in case of an IO error a pNFS
      objects layout driver must return its layout. This is because
      all device errors are reported to the server as part of the
      layout return buffer.
      
      This is implemented the same way PNFS_LAYOUTRET_ON_SETATTR
      is done, through a bit flag on the pnfs_layoutdriver_type->flags
      member. The flag is set by the layout driver that wants a
      layout_return performed at pnfs_ld_{write,read}_done in case
      of an error.
      (Though I have not defined a wrapper like pnfs_ld_layoutret_on_setattr
       because this code is never called outside of pnfs.c and pnfs IO
       paths)
      
      Without this patch 3.[0-2] Kernels leak memory and have an annoying
      WARN_ON after every IO error utilizing the pnfs-obj driver.
      
      [This patch is for 3.2 Kernel. 3.1/0 Kernels need a different patch]
      CC: Stable Tree <stable@kernel.org>
      Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
  16. 03 Nov, 2011 7 commits
  17. 04 Aug, 2011 2 commits
    • pnfs-obj: Fix the comp_index != 0 case · 9af7db32
      Committed by Boaz Harrosh
      There were bugs in the case of partial layout where olo_comp_index
      is not zero. This used to work and was tested but one of the later
      cleanup SQUASHMEs broke it and was not tested since.
      
      Also add a dprint that specifies those received layout parameters.
      Everything else was already printed.
      
      [Needed in v3.0]
      CC: Stable Tree <stable@kernel.org>
      Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
    • pnfs-obj: Bug when we are running out of bio · 20618b21
      Committed by Boaz Harrosh
      When the number of pages we want to encode is bigger than the
      size of the bio (which can currently happen only when all IO is
      going to a single device, e.g. group_width==1), the IO is
      submitted short and we report back only the number of bytes we
      actually wrote/read, and all is fine. BUT ...
      
      There was a bug where the current length counter was advanced
      before the failed attempt to add the extra page, so the CDB length
      ended up one page longer than the actual bio size, which is of
      course rejected by the osd-target.
      
      While here, also fix the bio size calculation in the case
      that we received more than one group of devices.
      
      CC: Stable Tree <stable@kernel.org>
      Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
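The ordering problem described in this commit can be sketched in user-space C. This is a toy model, not the driver code: `toy_bio`, `toy_bio_add_page` and `toy_add_pages` are hypothetical names, and the bio is reduced to a page counter with a fixed capacity. The point shown is that the length counter is advanced only after a page has actually been added.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096u

/* Toy stand-in for a bio: a fixed capacity and a count of pages added. */
struct toy_bio {
    unsigned int max_pages;
    unsigned int nr_pages;
};

/* Return 1 if the page was added, 0 if the bio is already full. */
static int toy_bio_add_page(struct toy_bio *bio)
{
    assert(bio != NULL);

    if (bio->nr_pages >= bio->max_pages)
        return 0;

    bio->nr_pages++;
    return 1;
}

/*
 * Try to add want_pages pages and return the byte length that really
 * went into the bio. The counter is bumped only after a successful add,
 * so a short submission never reports more bytes than the bio holds.
 */
static uint64_t toy_add_pages(struct toy_bio *bio, unsigned int want_pages)
{
    uint64_t length = 0;
    unsigned int i;

    for (i = 0; i < want_pages; i++) {
        if (!toy_bio_add_page(bio))
            break;              /* bio is full: submit short */
        length += TOY_PAGE_SIZE;
    }

    return length;
}
```

Advancing `length` before the add attempt would reproduce the reported symptom in this model: the returned length would be one page larger than what the bio contains.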
  18. 16 Jul, 2011 1 commit
  19. 15 Jul, 2011 2 commits
  20. 13 Jul, 2011 3 commits
  21. 21 Jun, 2011 1 commit
    • NFSv4.1: Fix some issues with pnfs_generic_pg_test · 8f7d5efb
      Committed by Trond Myklebust
      1. If the intention is to coalesce requests 'prev' and 'req' then we
         have to ensure at least that we have a layout starting at
         req_offset(prev).
      
      2. If we're only requesting a minimal layout of length desc->pg_count,
         we need to test the length actually returned by the server before
         we allow the coalescing to occur.
      
      3. We need to deal correctly with (pgio->lseg == NULL)
      
      4. Fixup the test guarding the pnfs_update_layout.
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
  22. 20 Jun, 2011 1 commit
  23. 30 May, 2011 5 commits