1. 22 1月, 2018 14 次提交
  2. 20 1月, 2018 1 次提交
    • A
      proc: fix coredump vs read /proc/*/stat race · 8bb2ee19
      Alexey Dobriyan 提交于
      do_task_stat() accesses IP and SP of a task without bumping reference
      count of a stack (which became an entity with independent lifetime at
      some point).
      
      Steps to reproduce:
      
          #include <stdio.h>
          #include <sys/types.h>
          #include <sys/stat.h>
          #include <fcntl.h>
          #include <sys/time.h>
          #include <sys/resource.h>
          #include <unistd.h>
          #include <sys/wait.h>
      
          int main(void)
          {
          	setrlimit(RLIMIT_CORE, &(struct rlimit){});
      
          	while (1) {
          		char buf[64];
          		char buf2[4096];
          		pid_t pid;
          		int fd;
      
          		pid = fork();
          		if (pid == 0) {
          			*(volatile int *)0 = 0;
          		}
      
          		snprintf(buf, sizeof(buf), "/proc/%u/stat", pid);
          		fd = open(buf, O_RDONLY);
          		read(fd, buf2, sizeof(buf2));
          		close(fd);
      
          		waitpid(pid, NULL, 0);
          	}
          	return 0;
          }
      
          BUG: unable to handle kernel paging request at 0000000000003fd8
          IP: do_task_stat+0x8b4/0xaf0
          PGD 800000003d73e067 P4D 800000003d73e067 PUD 3d558067 PMD 0
          Oops: 0000 [#1] PREEMPT SMP PTI
          CPU: 0 PID: 1417 Comm: a.out Not tainted 4.15.0-rc8-dirty #2
          Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1.fc27 04/01/2014
          RIP: 0010:do_task_stat+0x8b4/0xaf0
          Call Trace:
           proc_single_show+0x43/0x70
           seq_read+0xe6/0x3b0
           __vfs_read+0x1e/0x120
           vfs_read+0x84/0x110
           SyS_read+0x3d/0xa0
           entry_SYSCALL_64_fastpath+0x13/0x6c
          RIP: 0033:0x7f4d7928cba0
          RSP: 002b:00007ffddb245158 EFLAGS: 00000246
          Code: 03 b7 a0 01 00 00 4c 8b 4c 24 70 4c 8b 44 24 78 4c 89 74 24 18 e9 91 f9 ff ff f6 45 4d 02 0f 84 fd f7 ff ff 48 8b 45 40 48 89 ef <48> 8b 80 d8 3f 00 00 48 89 44 24 20 e8 9b 97 eb ff 48 89 44 24
          RIP: do_task_stat+0x8b4/0xaf0 RSP: ffffc90000607cc8
          CR2: 0000000000003fd8
      
      John Ogness said: for my tests I added an else case to verify that the
      race is hit and correctly mitigated.
      
      Link: http://lkml.kernel.org/r/20180116175054.GA11513@avx2Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
      Reported-by: N"Kohli, Gaurav" <gkohli@codeaurora.org>
      Tested-by: NJohn Ogness <john.ogness@linutronix.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      8bb2ee19
  3. 05 1月, 2018 1 次提交
  4. 04 1月, 2018 1 次提交
  5. 03 1月, 2018 5 次提交
  6. 02 1月, 2018 3 次提交
  7. 22 12月, 2017 6 次提交
  8. 21 12月, 2017 1 次提交
  9. 19 12月, 2017 1 次提交
  10. 18 12月, 2017 2 次提交
  11. 17 12月, 2017 1 次提交
  12. 16 12月, 2017 3 次提交
    • L
      Revert "mm: replace p??_write with pte_access_permitted in fault + gup paths" · f6f37321
      Linus Torvalds 提交于
      This reverts commits 5c9d2d5c, c7da82b8, and e7fe7b5c.
      
      We'll probably need to revisit this, but basically we should not
      complicate the get_user_pages_fast() case, and checking the actual page
      table protection key bits will require more care anyway, since the
      protection keys depend on the exact state of the VM in question.
      
      Particularly when doing a "remote" page lookup (ie in somebody elses VM,
      not your own), you need to be much more careful than this was.  Dave
      Hansen says:
      
       "So, the underlying bug here is that we now a get_user_pages_remote()
        and then go ahead and do the p*_access_permitted() checks against the
        current PKRU. This was introduced recently with the addition of the
        new p??_access_permitted() calls.
      
        We have checks in the VMA path for the "remote" gups and we avoid
        consulting PKRU for them. This got missed in the pkeys selftests
        because I did a ptrace read, but not a *write*. I also didn't
        explicitly test it against something where a COW needed to be done"
      
      It's also not entirely clear that it makes sense to check the protection
      key bits at this level at all.  But one possible eventual solution is to
      make the get_user_pages_fast() case just abort if it sees protection key
      bits set, which makes us fall back to the regular get_user_pages() case,
      which then has a vma and can do the check there if we want to.
      
      We'll see.
      
      Somewhat related to this all: what we _do_ want to do some day is to
      check the PAGE_USER bit - it should obviously always be set for user
      pages, but it would be a good check to have back.  Because we have no
      generic way to test for it, we lost it as part of moving over from the
      architecture-specific x86 GUP implementation to the generic one in
      commit e585513b ("x86/mm/gup: Switch GUP to the generic
      get_user_page_fast() implementation").
      
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: "Jérôme Glisse" <jglisse@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f6f37321
    • S
      nfs: don't wait on commit in nfs_commit_inode() if there were no commit requests · dc4fd9ab
      Scott Mayhew 提交于
      If there were no commit requests, then nfs_commit_inode() should not
      wait on the commit or mark the inode dirty, otherwise the following
      BUG_ON can be triggered:
      
      [ 1917.130762] kernel BUG at fs/inode.c:578!
      [ 1917.130766] Oops: Exception in kernel mode, sig: 5 [#1]
      [ 1917.130768] SMP NR_CPUS=2048 NUMA pSeries
      [ 1917.130772] Modules linked in: iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi blocklayoutdriver rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache sunrpc sg nx_crypto pseries_rng ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic crct10dif_common ibmvscsi scsi_transport_srp ibmveth scsi_tgt dm_mirror dm_region_hash dm_log dm_mod
      [ 1917.130805] CPU: 2 PID: 14923 Comm: umount.nfs4 Tainted: G               ------------ T 3.10.0-768.el7.ppc64 #1
      [ 1917.130810] task: c0000005ecd88040 ti: c00000004cea0000 task.ti: c00000004cea0000
      [ 1917.130813] NIP: c000000000354178 LR: c000000000354160 CTR: c00000000012db80
      [ 1917.130816] REGS: c00000004cea3720 TRAP: 0700   Tainted: G               ------------ T  (3.10.0-768.el7.ppc64)
      [ 1917.130820] MSR: 8000000100029032 <SF,EE,ME,IR,DR,RI>  CR: 22002822  XER: 20000000
      [ 1917.130828] CFAR: c00000000011f594 SOFTE: 1
      GPR00: c000000000354160 c00000004cea39a0 c0000000014c4700 c0000000018cc750
      GPR04: 000000000000c750 80c0000000000000 0600000000000000 04eeb76bea749a03
      GPR08: 0000000000000034 c0000000018cc758 0000000000000001 d000000005e619e8
      GPR12: c00000000012db80 c000000007b31200 0000000000000000 0000000000000000
      GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
      GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
      GPR24: 0000000000000000 c000000000dfc3ec 0000000000000000 c0000005eefc02c0
      GPR28: d0000000079dbd50 c0000005b94a02c0 c0000005b94a0250 c0000005b94a01c8
      [ 1917.130867] NIP [c000000000354178] .evict+0x1c8/0x350
      [ 1917.130871] LR [c000000000354160] .evict+0x1b0/0x350
      [ 1917.130873] Call Trace:
      [ 1917.130876] [c00000004cea39a0] [c000000000354160] .evict+0x1b0/0x350 (unreliable)
      [ 1917.130880] [c00000004cea3a30] [c0000000003558cc] .evict_inodes+0x13c/0x270
      [ 1917.130884] [c00000004cea3af0] [c000000000327d20] .kill_anon_super+0x70/0x1e0
      [ 1917.130896] [c00000004cea3b80] [d000000005e43e30] .nfs_kill_super+0x20/0x60 [nfs]
      [ 1917.130900] [c00000004cea3c00] [c000000000328a20] .deactivate_locked_super+0xa0/0x1b0
      [ 1917.130903] [c00000004cea3c80] [c00000000035ba54] .cleanup_mnt+0xd4/0x180
      [ 1917.130907] [c00000004cea3d10] [c000000000119034] .task_work_run+0x114/0x150
      [ 1917.130912] [c00000004cea3db0] [c00000000001ba6c] .do_notify_resume+0xcc/0x100
      [ 1917.130916] [c00000004cea3e30] [c00000000000a7b0] .ret_from_except_lite+0x5c/0x60
      [ 1917.130919] Instruction dump:
      [ 1917.130921] 7fc3f378 486734b5 60000000 387f00a0 38800003 4bdcb365 60000000 e95f00a0
      [ 1917.130927] 694a0060 7d4a0074 794ad182 694a0001 <0b0a0000> 892d02a4 2f890000 40de0134
      Signed-off-by: NScott Mayhew <smayhew@redhat.com>
      Cc: stable@vger.kernel.org # 4.5+
      Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
      dc4fd9ab
    • S
      nfs: fix a deadlock in nfs client initialization · c156618e
      Scott Mayhew 提交于
      The following deadlock can occur between a process waiting for a client
      to initialize in while walking the client list during nfsv4 server trunking
      detection and another process waiting for the nfs_clid_init_mutex so it
      can initialize that client:
      
      Process 1                               Process 2
      ---------                               ---------
      spin_lock(&nn->nfs_client_lock);
      list_add_tail(&CLIENTA->cl_share_link,
              &nn->nfs_client_list);
      spin_unlock(&nn->nfs_client_lock);
                                              spin_lock(&nn->nfs_client_lock);
                                              list_add_tail(&CLIENTB->cl_share_link,
                                                      &nn->nfs_client_list);
                                              spin_unlock(&nn->nfs_client_lock);
                                              mutex_lock(&nfs_clid_init_mutex);
                                              nfs41_walk_client_list(clp, result, cred);
                                              nfs_wait_client_init_complete(CLIENTA);
      (waiting for nfs_clid_init_mutex)
      
      Make sure nfs_match_client() only evaluates clients that have completed
      initialization in order to prevent that deadlock.
      
      This patch also fixes v4.0 trunking behavior by not marking the client
      NFS_CS_READY until the clientid has been confirmed.
      Signed-off-by: NScott Mayhew <smayhew@redhat.com>
      Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
      c156618e
  13. 15 12月, 2017 1 次提交