1. 29 3月, 2014 1 次提交
  2. 28 3月, 2014 9 次提交
    • J
      lockd: ensure we tear down any live sockets when socket creation fails during lockd_up · 679b033d
      Jeff Layton 提交于
      We had a Fedora ABRT report with a stack trace like this:
      
      kernel BUG at net/sunrpc/svc.c:550!
      invalid opcode: 0000 [#1] SMP
      [...]
      CPU: 2 PID: 913 Comm: rpc.nfsd Not tainted 3.13.6-200.fc20.x86_64 #1
      Hardware name: Hewlett-Packard HP ProBook 4740s/1846, BIOS 68IRR Ver. F.40 01/29/2013
      task: ffff880146b00000 ti: ffff88003f9b8000 task.ti: ffff88003f9b8000
      RIP: 0010:[<ffffffffa0305fa8>]  [<ffffffffa0305fa8>] svc_destroy+0x128/0x130 [sunrpc]
      RSP: 0018:ffff88003f9b9de0  EFLAGS: 00010206
      RAX: ffff88003f829628 RBX: ffff88003f829600 RCX: 00000000000041ee
      RDX: 0000000000000000 RSI: 0000000000000286 RDI: 0000000000000286
      RBP: ffff88003f9b9de8 R08: 0000000000017360 R09: ffff88014fa97360
      R10: ffffffff8114ce57 R11: ffffea00051c9c00 R12: ffff88003f829600
      R13: 00000000ffffff9e R14: ffffffff81cc7cc0 R15: 0000000000000000
      FS:  00007f4fde284840(0000) GS:ffff88014fa80000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007f4fdf5192f8 CR3: 00000000a569a000 CR4: 00000000001407e0
      Stack:
       ffff88003f792300 ffff88003f9b9e18 ffffffffa02de02a 0000000000000000
       ffffffff81cc7cc0 ffff88003f9cb000 0000000000000008 ffff88003f9b9e60
       ffffffffa033bb35 ffffffff8131c86c ffff88003f9cb000 ffff8800a5715008
      Call Trace:
       [<ffffffffa02de02a>] lockd_up+0xaa/0x330 [lockd]
       [<ffffffffa033bb35>] nfsd_svc+0x1b5/0x2f0 [nfsd]
       [<ffffffff8131c86c>] ? simple_strtoull+0x2c/0x50
       [<ffffffffa033c630>] ? write_pool_threads+0x280/0x280 [nfsd]
       [<ffffffffa033c6bb>] write_threads+0x8b/0xf0 [nfsd]
       [<ffffffff8114efa4>] ? __get_free_pages+0x14/0x50
       [<ffffffff8114eff6>] ? get_zeroed_page+0x16/0x20
       [<ffffffff811dec51>] ? simple_transaction_get+0xb1/0xd0
       [<ffffffffa033c098>] nfsctl_transaction_write+0x48/0x80 [nfsd]
       [<ffffffff811b8b34>] vfs_write+0xb4/0x1f0
       [<ffffffff811c3f99>] ? putname+0x29/0x40
       [<ffffffff811b9569>] SyS_write+0x49/0xa0
       [<ffffffff810fc2a6>] ? __audit_syscall_exit+0x1f6/0x2a0
       [<ffffffff816962e9>] system_call_fastpath+0x16/0x1b
      Code: 31 c0 e8 82 db 37 e1 e9 2a ff ff ff 48 8b 07 8b 57 14 48 c7 c7 d5 c6 31 a0 48 8b 70 20 31 c0 e8 65 db 37 e1 e9 f4 fe ff ff 0f 0b <0f> 0b 66 0f 1f 44 00 00 0f 1f 44 00 00 55 48 89 e5 41 56 41 55
      RIP  [<ffffffffa0305fa8>] svc_destroy+0x128/0x130 [sunrpc]
       RSP <ffff88003f9b9de0>
      
      Evidently, we created some lockd sockets and then failed to create
      others. make_socks then returned an error and we tried to tear down the
      svc, but svc->sv_permsocks was not empty so we ended up tripping over
      the BUG() in svc_destroy().
      
      Fix this by ensuring that we tear down any live sockets we created when
      socket creation is going to return an error.
      
      Fixes: 786185b5 (SUNRPC: move per-net operations from...)
      Reported-by: NRaphos <raphoszap@laposte.net>
      Signed-off-by: NJeff Layton <jlayton@redhat.com>
      Reviewed-by: NStanislav Kinsbursky <skinsbursky@parallels.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      679b033d
    • K
      NFSD: Traverse unconfirmed client through hash-table · 2b905635
      Kinglong Mee 提交于
      When stopping nfsd, I got BUG messages, and soft lockup messages,
      The problem is cuased by double rb_erase() in nfs4_state_destroy_net()
      and destroy_client().
      
      This patch just let nfsd traversing unconfirmed client through
      hash-table instead of rbtree.
      
      [ 2325.021995] BUG: unable to handle kernel NULL pointer dereference at
                (null)
      [ 2325.022809] IP: [<ffffffff8133c18c>] rb_erase+0x14c/0x390
      [ 2325.022982] PGD 7a91b067 PUD 7a33d067 PMD 0
      [ 2325.022982] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
      [ 2325.022982] Modules linked in: nfsd(OF) cfg80211 rfkill bridge stp
      llc snd_intel8x0 snd_ac97_codec ac97_bus auth_rpcgss nfs_acl serio_raw
      e1000 i2c_piix4 ppdev snd_pcm snd_timer lockd pcspkr joydev parport_pc
      snd parport i2c_core soundcore microcode sunrpc ata_generic pata_acpi
      [last unloaded: nfsd]
      [ 2325.022982] CPU: 1 PID: 2123 Comm: nfsd Tainted: GF          O
      3.14.0-rc8+ #2
      [ 2325.022982] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS
      VirtualBox 12/01/2006
      [ 2325.022982] task: ffff88007b384800 ti: ffff8800797f6000 task.ti:
      ffff8800797f6000
      [ 2325.022982] RIP: 0010:[<ffffffff8133c18c>]  [<ffffffff8133c18c>]
      rb_erase+0x14c/0x390
      [ 2325.022982] RSP: 0018:ffff8800797f7d98  EFLAGS: 00010246
      [ 2325.022982] RAX: ffff880079c1f010 RBX: ffff880079f4c828 RCX:
      0000000000000000
      [ 2325.022982] RDX: 0000000000000000 RSI: ffff880079bcb070 RDI:
      ffff880079f4c810
      [ 2325.022982] RBP: ffff8800797f7d98 R08: 0000000000000000 R09:
      ffff88007964fc70
      [ 2325.022982] R10: 0000000000000000 R11: 0000000000000400 R12:
      ffff880079f4c800
      [ 2325.022982] R13: ffff880079bcb000 R14: ffff8800797f7da8 R15:
      ffff880079f4c860
      [ 2325.022982] FS:  0000000000000000(0000) GS:ffff88007f900000(0000)
      knlGS:0000000000000000
      [ 2325.022982] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [ 2325.022982] CR2: 0000000000000000 CR3: 000000007a3ef000 CR4:
      00000000000006e0
      [ 2325.022982] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
      0000000000000000
      [ 2325.022982] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
      0000000000000400
      [ 2325.022982] Stack:
      [ 2325.022982]  ffff8800797f7de0 ffffffffa0191c6e ffff8800797f7da8
      ffff8800797f7da8
      [ 2325.022982]  ffff880079f4c810 ffff880079bcb000 ffffffff81cc26c0
      ffff880079c1f010
      [ 2325.022982]  ffff880079bcb070 ffff8800797f7e28 ffffffffa01977f2
      ffff8800797f7df0
      [ 2325.022982] Call Trace:
      [ 2325.022982]  [<ffffffffa0191c6e>] destroy_client+0x32e/0x3b0 [nfsd]
      [ 2325.022982]  [<ffffffffa01977f2>] nfs4_state_shutdown_net+0x1a2/0x220
      [nfsd]
      [ 2325.022982]  [<ffffffffa01700b8>] nfsd_shutdown_net+0x38/0x70 [nfsd]
      [ 2325.022982]  [<ffffffffa017013e>] nfsd_last_thread+0x4e/0x80 [nfsd]
      [ 2325.022982]  [<ffffffffa001f1eb>] svc_shutdown_net+0x2b/0x30 [sunrpc]
      [ 2325.022982]  [<ffffffffa017064b>] nfsd_destroy+0x5b/0x80 [nfsd]
      [ 2325.022982]  [<ffffffffa0170773>] nfsd+0x103/0x130 [nfsd]
      [ 2325.022982]  [<ffffffffa0170670>] ? nfsd_destroy+0x80/0x80 [nfsd]
      [ 2325.022982]  [<ffffffff810a8232>] kthread+0xd2/0xf0
      [ 2325.022982]  [<ffffffff810a8160>] ? insert_kthread_work+0x40/0x40
      [ 2325.022982]  [<ffffffff816c493c>] ret_from_fork+0x7c/0xb0
      [ 2325.022982]  [<ffffffff810a8160>] ? insert_kthread_work+0x40/0x40
      [ 2325.022982] Code: 48 83 e1 fc 48 89 10 0f 84 02 01 00 00 48 3b 41 10
      0f 84 08 01 00 00 48 89 51 08 48 89 fa e9 74 ff ff ff 0f 1f 40 00 48 8b
      50 10 <f6> 02 01 0f 84 93 00 00 00 48 8b 7a 10 48 85 ff 74 05 f6 07 01
      [ 2325.022982] RIP  [<ffffffff8133c18c>] rb_erase+0x14c/0x390
      [ 2325.022982]  RSP <ffff8800797f7d98>
      [ 2325.022982] CR2: 0000000000000000
      [ 2325.022982] ---[ end trace 28c27ed011655e57 ]---
      
      [  228.064071] BUG: soft lockup - CPU#0 stuck for 22s! [nfsd:558]
      [  228.064428] Modules linked in: ip6t_rpfilter ip6t_REJECT cfg80211
      xt_conntrack rfkill ebtable_nat ebtable_broute bridge stp llc
      ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6
      nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw
      ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4
      nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security
      iptable_raw nfsd(OF) auth_rpcgss nfs_acl lockd snd_intel8x0
      snd_ac97_codec ac97_bus joydev snd_pcm snd_timer e1000 sunrpc snd ppdev
      parport_pc serio_raw pcspkr i2c_piix4 microcode parport soundcore
      i2c_core ata_generic pata_acpi
      [  228.064539] CPU: 0 PID: 558 Comm: nfsd Tainted: GF          O
      3.14.0-rc8+ #2
      [  228.064539] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS
      VirtualBox 12/01/2006
      [  228.064539] task: ffff880076adec00 ti: ffff880074616000 task.ti:
      ffff880074616000
      [  228.064539] RIP: 0010:[<ffffffff8133ba17>]  [<ffffffff8133ba17>]
      rb_next+0x27/0x50
      [  228.064539] RSP: 0018:ffff880074617de0  EFLAGS: 00000282
      [  228.064539] RAX: ffff880074478010 RBX: ffff88007446f860 RCX:
      0000000000000014
      [  228.064539] RDX: ffff880074478010 RSI: 0000000000000000 RDI:
      ffff880074478010
      [  228.064539] RBP: ffff880074617de0 R08: 0000000000000000 R09:
      0000000000000012
      [  228.064539] R10: 0000000000000001 R11: ffffffffffffffec R12:
      ffffea0001d11a00
      [  228.064539] R13: ffff88007f401400 R14: ffff88007446f800 R15:
      ffff880074617d50
      [  228.064539] FS:  0000000000000000(0000) GS:ffff88007f800000(0000)
      knlGS:0000000000000000
      [  228.064539] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [  228.064539] CR2: 00007fe9ac6ec000 CR3: 000000007a5d6000 CR4:
      00000000000006f0
      [  228.064539] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
      0000000000000000
      [  228.064539] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
      0000000000000400
      [  228.064539] Stack:
      [  228.064539]  ffff880074617e28 ffffffffa01ab7db ffff880074617df0
      ffff880074617df0
      [  228.064539]  ffff880079273000 ffffffff81cc26c0 ffffffff81cc26c0
      0000000000000000
      [  228.064539]  0000000000000000 ffff880074617e48 ffffffffa01840b8
      ffffffff81cc26c0
      [  228.064539] Call Trace:
      [  228.064539]  [<ffffffffa01ab7db>] nfs4_state_shutdown_net+0x18b/0x220
      [nfsd]
      [  228.064539]  [<ffffffffa01840b8>] nfsd_shutdown_net+0x38/0x70 [nfsd]
      [  228.064539]  [<ffffffffa018413e>] nfsd_last_thread+0x4e/0x80 [nfsd]
      [  228.064539]  [<ffffffffa00aa1eb>] svc_shutdown_net+0x2b/0x30 [sunrpc]
      [  228.064539]  [<ffffffffa018464b>] nfsd_destroy+0x5b/0x80 [nfsd]
      [  228.064539]  [<ffffffffa0184773>] nfsd+0x103/0x130 [nfsd]
      [  228.064539]  [<ffffffffa0184670>] ? nfsd_destroy+0x80/0x80 [nfsd]
      [  228.064539]  [<ffffffff810a8232>] kthread+0xd2/0xf0
      [  228.064539]  [<ffffffff810a8160>] ? insert_kthread_work+0x40/0x40
      [  228.064539]  [<ffffffff816c493c>] ret_from_fork+0x7c/0xb0
      [  228.064539]  [<ffffffff810a8160>] ? insert_kthread_work+0x40/0x40
      [  228.064539] Code: 1f 44 00 00 55 48 8b 17 48 89 e5 48 39 d7 74 3b 48
      8b 47 08 48 85 c0 75 0e eb 25 66 0f 1f 84 00 00 00 00 00 48 89 d0 48 8b
      50 10 <48> 85 d2 75 f4 5d c3 66 90 48 3b 78 08 75 f6 48 8b 10 48 89 c7
      
      Fixes: ac55fdc4 (nfsd: move the confirmed and unconfirmed hlists...)
      Signed-off-by: NKinglong Mee <kinglongmee@gmail.com>
      Cc: stable@vger.kernel.org
      Reviewed-by: NJeff Layton <jlayton@redhat.com>
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      2b905635
    • J
      svcrpc: explicitly reject compounds that are not padded out to 4-byte multiple · e874f9f8
      Jeff Layton 提交于
      We have a WARN_ON in the nfsd4_decode_write() that tells us when the
      client has sent a request that is not padded out properly according to
      RFC4506. A WARN_ON really isn't appropriate in this case though since
      this indicates a client bug, not a server one.
      
      Move this check out to the top-level compound decoder and have it just
      explicitly return an error. Also add a dprintk() that shows the client
      address and xid to help track down clients and frames that trigger it.
      Signed-off-by: NJeff Layton <jlayton@redhat.com>
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      e874f9f8
    • J
      nfsd: notify_change needs elevated write count · 9f67f189
      J. Bruce Fields 提交于
      Looks like this bug has been here since these write counts were
      introduced, not sure why it was just noticed now.
      
      Thanks also to Jan Kara for pointing out the problem.
      
      Cc: stable@vger.kernel.org
      Reported-by: NMatthew Rahtz <mrahtz@rapitasystems.com>
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      9f67f189
    • J
      nfsd4: fix test_stateid error reply encoding · a11fcce1
      J. Bruce Fields 提交于
      If the entire operation fails then there's nothing to encode.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      a11fcce1
    • J
      nfsd4: leave reply buffer space for failed setattr · 04819bf6
      J. Bruce Fields 提交于
      This fixes an ommission from 18032ca0
      "NFSD: Server implementation of MAC Labeling", which increased the size
      of the setattr error reply without increasing COMPOUND_ERR_SLACK_SPACE.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      04819bf6
    • J
      nfsd4: make set of large acl return efbig, not resource · 798df338
      J. Bruce Fields 提交于
      If a client attempts to set an excessively large ACL, return
      NFS4ERR_FBIG instead of NFS4ERR_RESOURCE.  I'm not sure FBIG is correct,
      but I'm positive RESOURCE is wrong (it isn't even a well-defined error
      any more for NFS versions since 4.1).
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      798df338
    • J
      nfsd4: session needs room for following op to error out · 4c69d585
      J. Bruce Fields 提交于
      Cc: stable@vger.kernel.org
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      4c69d585
    • J
      nfsd4: buffer-length check for SUPPATTR_EXCLCREAT · de3997a7
      J. Bruce Fields 提交于
      This was an omission from 8c18f205
      "nfsd41: SUPPATTR_EXCLCREAT attribute".
      
      Cc: Benny Halevy <bhalevy@primarydata.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      de3997a7
  3. 19 2月, 2014 1 次提交
  4. 14 2月, 2014 1 次提交
    • N
      lockd: send correct lock when granting a delayed lock. · 2ec197db
      NeilBrown 提交于
      If an NFS client attempts to get a lock (using NLM) and the lock is
      not available, the server will remember the request and when the lock
      becomes available it will send a GRANT request to the client to
      provide the lock.
      
      If the client already held an adjacent lock, the GRANT callback will
      report the union of the existing and new locks, which can confuse the
      client.
      
      This happens because __posix_lock_file (called by vfs_lock_file)
      updates the passed-in file_lock structure when adjacent or
      over-lapping locks are found.
      
      To avoid this problem we take a copy of the two fields that can
      be changed (fl_start and fl_end) before the call and restore them
      afterwards.
      An alternate would be to allocate a 'struct file_lock', initialise it,
      use locks_copy_lock() to take a copy, then locks_release_private()
      after the vfs_lock_file() call.  But that is a lot more work.
      Reported-by: NOlaf Kirch <okir@suse.com>
      Signed-off-by: NNeilBrown <neilb@suse.de>
      Cc: stable@vger.kernel.org
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      
      --
      v1 had a couple of issues (large on-stack struct and didn't really work properly).
      This version is much better tested.
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      2ec197db
  5. 12 2月, 2014 1 次提交
    • J
      nfsd4: fix acl buffer overrun · 09bdc2d7
      J. Bruce Fields 提交于
      4ac7249e "nfsd: use get_acl and
      ->set_acl" forgets to set the size in the case get_acl() succeeds, so
      _posix_to_nfsv4_one() can then write past the end of its allocation.
      Symptoms were slab corruption warnings.
      
      Also, some minor cleanup while we're here.  (Among other things, note
      that the first few lines guarantee that pacl is non-NULL.)
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      09bdc2d7
  6. 10 2月, 2014 1 次提交
    • A
      fix O_SYNC|O_APPEND syncing the wrong range on write() · d311d79d
      Al Viro 提交于
      It actually goes back to 2004 ([PATCH] Concurrent O_SYNC write support)
      when sync_page_range() had been introduced; generic_file_write{,v}() correctly
      synced
      	pos_after_write - written .. pos_after_write - 1
      but generic_file_aio_write() synced
      	pos_before_write .. pos_before_write + written - 1
      instead.  Which is not the same thing with O_APPEND, obviously.
      A couple of years later correct variant had been killed off when
      everything switched to use of generic_file_aio_write().
      
      All users of generic_file_aio_write() are affected, and the same bug
      has been copied into other instances of ->aio_write().
      
      The fix is trivial; the only subtle point is that generic_write_sync()
      ought to be inlined to avoid calculations useless for the majority of
      calls.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      d311d79d
  7. 09 2月, 2014 6 次提交
    • F
      Btrfs: fix data corruption when reading/updating compressed extents · a2aa75e1
      Filipe David Borba Manana 提交于
      When using a mix of compressed file extents and prealloc extents, it
      is possible to fill a page of a file with random, garbage data from
      some unrelated previous use of the page, instead of a sequence of zeroes.
      
      A simple sequence of steps to get into such case, taken from the test
      case I made for xfstests, is:
      
         _scratch_mkfs
         _scratch_mount "-o compress-force=lzo"
         $XFS_IO_PROG -f -c "pwrite -S 0x06 -b 18670 266978 18670" $SCRATCH_MNT/foobar
         $XFS_IO_PROG -c "falloc 26450 665194" $SCRATCH_MNT/foobar
         $XFS_IO_PROG -c "truncate 542872" $SCRATCH_MNT/foobar
         $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/foobar
      
      This results in the following file items in the fs tree:
      
         item 4 key (257 INODE_ITEM 0) itemoff 15879 itemsize 160
             inode generation 6 transid 6 size 542872 block group 0 mode 100600
         item 5 key (257 INODE_REF 256) itemoff 15863 itemsize 16
             inode ref index 2 namelen 6 name: foobar
         item 6 key (257 EXTENT_DATA 0) itemoff 15810 itemsize 53
             extent data disk byte 0 nr 0 gen 6
             extent data offset 0 nr 24576 ram 266240
             extent compression 0
         item 7 key (257 EXTENT_DATA 24576) itemoff 15757 itemsize 53
             prealloc data disk byte 12849152 nr 241664 gen 6
             prealloc data offset 0 nr 241664
         item 8 key (257 EXTENT_DATA 266240) itemoff 15704 itemsize 53
             extent data disk byte 12845056 nr 4096 gen 6
             extent data offset 0 nr 20480 ram 20480
             extent compression 2
         item 9 key (257 EXTENT_DATA 286720) itemoff 15651 itemsize 53
             prealloc data disk byte 13090816 nr 405504 gen 6
             prealloc data offset 0 nr 258048
      
      The on disk extent at offset 266240 (which corresponds to 1 single disk block),
      contains 5 compressed chunks of file data. Each of the first 4 compress 4096
      bytes of file data, while the last one only compresses 3024 bytes of file data.
      Therefore a read into the file region [285648 ; 286720[ (length = 4096 - 3024 =
      1072 bytes) should always return zeroes (our next extent is a prealloc one).
      
      The solution here is the compression code path to zero the remaining (untouched)
      bytes of the last page it uncompressed data into, as the information about how
      much space the file data consumes in the last page is not known in the upper layer
      fs/btrfs/extent_io.c:__do_readpage(). In __do_readpage we were correctly zeroing
      the remainder of the page but only if it corresponds to the last page of the inode
      and if the inode's size is not a multiple of the page size.
      
      This would cause not only returning random data on reads, but also permanently
      storing random data when updating parts of the region that should be zeroed.
      For the example above, it means updating a single byte in the region [285648 ; 286720[
      would store that byte correctly but also store random data on disk.
      
      A test case for xfstests follows soon.
      Signed-off-by: NFilipe David Borba Manana <fdmanana@gmail.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      a2aa75e1
    • J
      Btrfs: don't loop forever if we can't run because of the tree mod log · 27a377db
      Josef Bacik 提交于
      A user reported a 100% cpu hang with my new delayed ref code.  Turns out I
      forgot to increase the count check when we can't run a delayed ref because of
      the tree mod log.  If we can't run any delayed refs during this there is no
      point in continuing to look, and we need to break out.  Thanks,
      Signed-off-by: NJosef Bacik <jbacik@fb.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      27a377db
    • D
      btrfs: reserve no transaction units in btrfs_ioctl_set_features · 8051aa1a
      David Sterba 提交于
      Added in patch "btrfs: add ioctls to query/change feature bits online"
      modifications to superblock don't need to reserve metadata blocks when
      starting a transaction.
      Signed-off-by: NDavid Sterba <dsterba@suse.cz>
      Signed-off-by: NChris Mason <clm@fb.com>
      8051aa1a
    • J
      btrfs: commit transaction after setting label and features · d0270aca
      Jeff Mahoney 提交于
      The set_fslabel ioctl uses btrfs_end_transaction, which means it's
      possible that the change will be lost if the system crashes, same for
      the newly set features. Let's use btrfs_commit_transaction instead.
      Signed-off-by: NJeff Mahoney <jeffm@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.cz>
      Signed-off-by: NChris Mason <clm@fb.com>
      d0270aca
    • J
      Btrfs: fix assert screwup for the pending move stuff · 6cc98d90
      Josef Bacik 提交于
      Wang noticed that he was failing btrfs/030 even though me and Filipe couldn't
      reproduce.  Turns out this is because Wang didn't have CONFIG_BTRFS_ASSERT set,
      which meant that a key part of Filipe's original patch was not being built in.
      This appears to be a mess up with merging Filipe's patch as it does not exist in
      his original patch.  Fix this by changing how we make sure del_waiting_dir_move
      asserts that it did not error and take the function out of the ifdef check.
      This makes btrfs/030 pass with the assert on or off.  Thanks,
      Signed-off-by: NJosef Bacik <jbacik@fb.com>
      Reviewed-by: NFilipe Manana <fdmanana@gmail.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      6cc98d90
    • D
      jfs: fix generic posix ACL regression · c18f7b51
      Dave Kleikamp 提交于
      I missed a couple errors in reviewing the patches converting jfs
      to use the generic posix ACL function. Setting ACL's currently
      fails with -EOPNOTSUPP.
      Signed-off-by: NDave Kleikamp <dave.kleikamp@oracle.com>
      Reported-by: NMichael L. Semon <mlsemon35@gmail.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      c18f7b51
  8. 07 2月, 2014 2 次提交
  9. 06 2月, 2014 2 次提交
    • L
      execve: use 'struct filename *' for executable name passing · c4ad8f98
      Linus Torvalds 提交于
      This changes 'do_execve()' to get the executable name as a 'struct
      filename', and to free it when it is done.  This is what the normal
      users want, and it simplifies and streamlines their error handling.
      
      The controlled lifetime of the executable name also fixes a
      use-after-free problem with the trace_sched_process_exec tracepoint: the
      lifetime of the passed-in string for kernel users was not at all
      obvious, and the user-mode helper code used UMH_WAIT_EXEC to serialize
      the pathname allocation lifetime with the execve() having finished,
      which in turn meant that the trace point that happened after
      mm_release() of the old process VM ended up using already free'd memory.
      
      To solve the kernel string lifetime issue, this simply introduces
      "getname_kernel()" that works like the normal user-space getname()
      function, except with the source coming from kernel memory.
      
      As Oleg points out, this also means that we could drop the tcomm[] array
      from 'struct linux_binprm', since the pathname lifetime now covers
      setup_new_exec().  That would be a separate cleanup.
      Reported-by: NIgor Zhbanov <i.zhbanov@samsung.com>
      Tested-by: NSteven Rostedt <rostedt@goodmis.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      c4ad8f98
    • T
      kernfs: make kernfs_deactivate() honor KERNFS_LOCKDEP flag · da9846ae
      Tejun Heo 提交于
      kernfs_deactivate() forgot to check whether KERNFS_LOCKDEP is set
      before performing lockdep annotations and ends up feeding
      uninitialized lockdep_map to lockdep triggering warning like the
      following on USB stick hotunplug.
      
       usb 1-2: USB disconnect, device number 2
       INFO: trying to register non-static key.
       the code is fine but needs lockdep annotation.
       turning off the locking correctness validator.
       CPU: 1 PID: 62 Comm: khubd Not tainted 3.13.0-work+ #82
       Hardware name: empty empty/S3992, BIOS 080011  10/26/2007
        ffff880065ca7f60 ffff88013a4ffa08 ffffffff81cfb6bd 0000000000000002
        ffff88013a4ffac8 ffffffff810f8530 ffff88013a4fc710 0000000000000002
        ffff880100000000 ffffffff82a3db50 0000000000000001 ffff88013a4fc710
       Call Trace:
        [<ffffffff81cfb6bd>] dump_stack+0x4e/0x7a
        [<ffffffff810f8530>] __lock_acquire+0x1910/0x1e70
        [<ffffffff810f931a>] lock_acquire+0x9a/0x1d0
        [<ffffffff8127c75e>] kernfs_deactivate+0xee/0x130
        [<ffffffff8127d4c8>] kernfs_addrm_finish+0x38/0x60
        [<ffffffff8127d701>] kernfs_remove_by_name_ns+0x51/0xa0
        [<ffffffff8127b4f1>] remove_files.isra.1+0x41/0x80
        [<ffffffff8127b7e7>] sysfs_remove_group+0x47/0xa0
        [<ffffffff8127b873>] sysfs_remove_groups+0x33/0x50
        [<ffffffff8177d66d>] device_remove_attrs+0x4d/0x80
        [<ffffffff8177e25e>] device_del+0x12e/0x1d0
        [<ffffffff819722c2>] usb_disconnect+0x122/0x1a0
        [<ffffffff819749b5>] hub_thread+0x3c5/0x1290
        [<ffffffff810c6a6d>] kthread+0xed/0x110
        [<ffffffff81d0a56c>] ret_from_fork+0x7c/0xb0
      
      Fix it by making kernfs_deactivate() perform lockdep annotations only
      if KERNFS_LOCKDEP is set.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Reported-by: NFabio Estevam <festevam@gmail.com>
      Reported-by: NAlan Stern <stern@rowland.harvard.edu>
      Reported-by: NJiri Kosina <jkosina@suse.cz>
      Reported-by: NDave Jones <davej@redhat.com>
      Tested-by: NFabio Estevam <fabio.estevam@freescale.com>
      Tested-by: NJiri Kosina <jkosina@suse.cz>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      da9846ae
  10. 04 2月, 2014 6 次提交
  11. 03 2月, 2014 2 次提交
    • M
      hpfs: optimize quad buffer loading · 1c0b8a7a
      Mikulas Patocka 提交于
      HPFS needs to load 4 consecutive 512-byte sectors when accessing the
      directory nodes or bitmaps.  We can't switch to 2048-byte block size
      because files are allocated in the units of 512-byte sectors.
      
      Previously, the driver would allocate a 2048-byte area using kmalloc,
      copy the data from four buffers to this area and eventually copy them
      back if they were modified.
      
      In the current implementation of the buffer cache, buffers are allocated
      in the pagecache.  That means that 4 consecutive 512-byte buffers are
      stored in consecutive areas in the kernel address space.  So, we don't
      need to allocate extra memory and copy the content of the buffers there.
      
      This patch optimizes the code to avoid copying the buffers.  It checks
      if the four buffers are stored in contiguous memory - if they are not,
      it falls back to allocating a 2048-byte area and copying data there.
      Signed-off-by: NMikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      1c0b8a7a
    • M
      hpfs: remember free space · 2cbe5c76
      Mikulas Patocka 提交于
      Previously, hpfs scanned all bitmaps each time the user asked for free
      space using statfs.  This patch changes it so that hpfs scans the
      bitmaps only once, remembes the free space and on next invocation of
      statfs it returns the value instantly.
      
      New versions of wine are hammering on the statfs syscall very heavily,
      making some games unplayable when they're stored on hpfs, with load
      times in minutes.
      
      This should be backported to the stable kernels because it fixes
      user-visible problem (excessive level load times in wine).
      Signed-off-by: NMikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
      Cc: stable@vger.kernel.org
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2cbe5c76
  12. 02 2月, 2014 3 次提交
  13. 01 2月, 2014 5 次提交