1. 15 8月, 2015 4 次提交
    • H
      ipc,sem: remove uneeded sem_undo_list lock usage in exit_sem() · a9795584
      Herton R. Krzesinski 提交于
      After we acquire the sma->sem_perm lock in exit_sem(), we are protected
      against a racing IPC_RMID operation.  Also at that point, we are the last
      user of sem_undo_list.  Therefore it isn't required that we acquire or use
      ulp->lock.
      Signed-off-by: NHerton R. Krzesinski <herton@redhat.com>
      Acked-by: NManfred Spraul <manfred@colorfullife.com>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Rafael Aquini <aquini@redhat.com>
      CC: Aristeu Rozanski <aris@redhat.com>
      Cc: David Jeffery <djeffery@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a9795584
    • H
      ipc,sem: fix use after free on IPC_RMID after a task using same semaphore set exits · 602b8593
      Herton R. Krzesinski 提交于
      The current semaphore code allows a potential use after free: in
      exit_sem we may free the task's sem_undo_list while there is still
      another task looping through the same semaphore set and cleaning the
      sem_undo list at freeary function (the task called IPC_RMID for the same
      semaphore set).
      
      For example, with a test program [1] running which keeps forking a lot
      of processes (which then do a semop call with SEM_UNDO flag), and with
      the parent right after removing the semaphore set with IPC_RMID, and a
      kernel built with CONFIG_SLAB, CONFIG_SLAB_DEBUG and
      CONFIG_DEBUG_SPINLOCK, you can easily see something like the following
      in the kernel log:
      
         Slab corruption (Not tainted): kmalloc-64 start=ffff88003b45c1c0, len=64
         000: 6b 6b 6b 6b 6b 6b 6b 6b 00 6b 6b 6b 6b 6b 6b 6b  kkkkkkkk.kkkkkkk
         010: ff ff ff ff 6b 6b 6b 6b ff ff ff ff ff ff ff ff  ....kkkk........
         Prev obj: start=ffff88003b45c180, len=64
         000: 00 00 00 00 ad 4e ad de ff ff ff ff 5a 5a 5a 5a  .....N......ZZZZ
         010: ff ff ff ff ff ff ff ff c0 fb 01 37 00 88 ff ff  ...........7....
         Next obj: start=ffff88003b45c200, len=64
         000: 00 00 00 00 ad 4e ad de ff ff ff ff 5a 5a 5a 5a  .....N......ZZZZ
         010: ff ff ff ff ff ff ff ff 68 29 a7 3c 00 88 ff ff  ........h).<....
         BUG: spinlock wrong CPU on CPU#2, test/18028
         general protection fault: 0000 [#1] SMP
         Modules linked in: 8021q mrp garp stp llc nf_conntrack_ipv4 nf_defrag_ipv4 ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables binfmt_misc ppdev input_leds joydev parport_pc parport floppy serio_raw virtio_balloon virtio_rng virtio_console virtio_net iosf_mbi crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcspkr qxl ttm drm_kms_helper drm snd_hda_codec_generic i2c_piix4 snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore crc32c_intel virtio_pci virtio_ring virtio pata_acpi ata_generic [last unloaded: speedstep_lib]
         CPU: 2 PID: 18028 Comm: test Not tainted 4.2.0-rc5+ #1
         Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.1-20150318_183358- 04/01/2014
         RIP: spin_dump+0x53/0xc0
         Call Trace:
           spin_bug+0x30/0x40
           do_raw_spin_unlock+0x71/0xa0
           _raw_spin_unlock+0xe/0x10
           freeary+0x82/0x2a0
           ? _raw_spin_lock+0xe/0x10
           semctl_down.clone.0+0xce/0x160
           ? __do_page_fault+0x19a/0x430
           ? __audit_syscall_entry+0xa8/0x100
           SyS_semctl+0x236/0x2c0
           ? syscall_trace_leave+0xde/0x130
           entry_SYSCALL_64_fastpath+0x12/0x71
         Code: 8b 80 88 03 00 00 48 8d 88 60 05 00 00 48 c7 c7 a0 2c a4 81 31 c0 65 8b 15 eb 40 f3 7e e8 08 31 68 00 4d 85 e4 44 8b 4b 08 74 5e <45> 8b 84 24 88 03 00 00 49 8d 8c 24 60 05 00 00 8b 53 04 48 89
         RIP  [<ffffffff810d6053>] spin_dump+0x53/0xc0
          RSP <ffff88003750fd68>
         ---[ end trace 783ebb76612867a0 ]---
         NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [test:18053]
         Modules linked in: 8021q mrp garp stp llc nf_conntrack_ipv4 nf_defrag_ipv4 ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables binfmt_misc ppdev input_leds joydev parport_pc parport floppy serio_raw virtio_balloon virtio_rng virtio_console virtio_net iosf_mbi crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcspkr qxl ttm drm_kms_helper drm snd_hda_codec_generic i2c_piix4 snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore crc32c_intel virtio_pci virtio_ring virtio pata_acpi ata_generic [last unloaded: speedstep_lib]
         CPU: 3 PID: 18053 Comm: test Tainted: G      D         4.2.0-rc5+ #1
         Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.1-20150318_183358- 04/01/2014
         RIP: native_read_tsc+0x0/0x20
         Call Trace:
           ? delay_tsc+0x40/0x70
           __delay+0xf/0x20
           do_raw_spin_lock+0x96/0x140
           _raw_spin_lock+0xe/0x10
           sem_lock_and_putref+0x11/0x70
           SYSC_semtimedop+0x7bf/0x960
           ? handle_mm_fault+0xbf6/0x1880
           ? dequeue_task_fair+0x79/0x4a0
           ? __do_page_fault+0x19a/0x430
           ? kfree_debugcheck+0x16/0x40
           ? __do_page_fault+0x19a/0x430
           ? __audit_syscall_entry+0xa8/0x100
           ? do_audit_syscall_entry+0x66/0x70
           ? syscall_trace_enter_phase1+0x139/0x160
           SyS_semtimedop+0xe/0x10
           SyS_semop+0x10/0x20
           entry_SYSCALL_64_fastpath+0x12/0x71
         Code: 47 10 83 e8 01 85 c0 89 47 10 75 08 65 48 89 3d 1f 74 ff 7e c9 c3 0f 1f 44 00 00 55 48 89 e5 e8 87 17 04 00 66 90 c9 c3 0f 1f 00 <55> 48 89 e5 0f 31 89 c1 48 89 d0 48 c1 e0 20 89 c9 48 09 c8 c9
         Kernel panic - not syncing: softlockup: hung tasks
      
      I wasn't able to trigger any badness on a recent kernel without the
      proper config debugs enabled, however I have softlockup reports on some
      kernel versions, in the semaphore code, which are similar as above (the
      scenario is seen on some servers running IBM DB2 which uses semaphore
      syscalls).
      
      The patch here fixes the race against freeary, by acquiring or waiting
      on the sem_undo_list lock as necessary (exit_sem can race with freeary,
      while freeary sets un->semid to -1 and removes the same sem_undo from
      list_proc or when it removes the last sem_undo).
      
      After the patch I'm unable to reproduce the problem using the test case
      [1].
      
      [1] Test case used below:
      
          #include <stdio.h>
          #include <sys/types.h>
          #include <sys/ipc.h>
          #include <sys/sem.h>
          #include <sys/wait.h>
          #include <stdlib.h>
          #include <time.h>
          #include <unistd.h>
          #include <errno.h>
      
          #define NSEM 1
          #define NSET 5
      
          int sid[NSET];
      
          void thread()
          {
                  struct sembuf op;
                  int s;
                  uid_t pid = getuid();
      
                  s = rand() % NSET;
                  op.sem_num = pid % NSEM;
                  op.sem_op = 1;
                  op.sem_flg = SEM_UNDO;
      
                  semop(sid[s], &op, 1);
                  exit(EXIT_SUCCESS);
          }
      
          void create_set()
          {
                  int i, j;
                  pid_t p;
                  union {
                          int val;
                          struct semid_ds *buf;
                          unsigned short int *array;
                          struct seminfo *__buf;
                  } un;
      
                  /* Create and initialize semaphore set */
                  for (i = 0; i < NSET; i++) {
                          sid[i] = semget(IPC_PRIVATE , NSEM, 0644 | IPC_CREAT);
                          if (sid[i] < 0) {
                                  perror("semget");
                                  exit(EXIT_FAILURE);
                          }
                  }
                  un.val = 0;
                  for (i = 0; i < NSET; i++) {
                          for (j = 0; j < NSEM; j++) {
                                  if (semctl(sid[i], j, SETVAL, un) < 0)
                                          perror("semctl");
                          }
                  }
      
                  /* Launch threads that operate on semaphore set */
                  for (i = 0; i < NSEM * NSET * NSET; i++) {
                          p = fork();
                          if (p < 0)
                                  perror("fork");
                          if (p == 0)
                                  thread();
                  }
      
                  /* Free semaphore set */
                  for (i = 0; i < NSET; i++) {
                          if (semctl(sid[i], NSEM, IPC_RMID))
                                  perror("IPC_RMID");
                  }
      
                  /* Wait for forked processes to exit */
                  while (wait(NULL)) {
                          if (errno == ECHILD)
                                  break;
                  };
          }
      
          int main(int argc, char **argv)
          {
                  pid_t p;
      
                  srand(time(NULL));
      
                  while (1) {
                          p = fork();
                          if (p < 0) {
                                  perror("fork");
                                  exit(EXIT_FAILURE);
                          }
                          if (p == 0) {
                                  create_set();
                                  goto end;
                          }
      
                          /* Wait for forked processes to exit */
                          while (wait(NULL)) {
                                  if (errno == ECHILD)
                                          break;
                          };
                  }
          end:
                  return 0;
          }
      
      [akpm@linux-foundation.org: use normal comment layout]
      Signed-off-by: NHerton R. Krzesinski <herton@redhat.com>
      Acked-by: NManfred Spraul <manfred@colorfullife.com>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Rafael Aquini <aquini@redhat.com>
      CC: Aristeu Rozanski <aris@redhat.com>
      Cc: David Jeffery <djeffery@redhat.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      602b8593
    • W
      mm/hwpoison: fix fail isolate hugetlbfs page w/ refcount held · 03613808
      Wanpeng Li 提交于
      Hugetlbfs pages will get a refcount in get_any_page() or
      madvise_hwpoison() if soft offlining through madvise.  The refcount which
      is held by the soft offline path should be released if we fail to isolate
      hugetlbfs pages.
      
      Fix it by reducing the refcount for both isolation success and failure.
      Signed-off-by: NWanpeng Li <wanpeng.li@hotmail.com>
      Acked-by: NNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: <stable@vger.kernel.org>	[3.9+]
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      03613808
    • W
      mm/hwpoison: fix page refcount of unknown non LRU page · 4f32be67
      Wanpeng Li 提交于
      After trying to drain pages from pagevec/pageset, we try to get reference
      count of the page again, however, the reference count of the page is not
      reduced if the page is still not on LRU list.
      
      Fix it by adding the put_page() to drop the page reference which is from
      __get_any_page().
      Signed-off-by: NWanpeng Li <wanpeng.li@hotmail.com>
      Acked-by: NNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: <stable@vger.kernel.org>	[3.9+]
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      4f32be67
  2. 14 8月, 2015 8 次提交
    • L
      Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm · 7ddab733
      Linus Torvalds 提交于
      Pull ARM fixes from Russell King:
       "Another few small ARM fixes, mostly addressing some VDSO issues"
      
      * 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
        ARM: 8410/1: VDSO: fix coarse clock monotonicity regression
        ARM: 8409/1: Mark ret_fast_syscall as a function
        ARM: 8408/1: Fix the secondary_startup function in Big Endian case
        ARM: 8405/1: VDSO: fix regression with toolchains lacking ld.bfd executable
      7ddab733
    • L
      x86: fix error handling for 32-bit compat out-of-range system call numbers · cd88ec23
      Linus Torvalds 提交于
      Commit 3f5159a9 ("x86/asm/entry/32: Update -ENOSYS handling to match
      the 64-bit logic") broke the ENOSYS handling for the 32-bit compat case.
      The proper error return value was never loaded into %rax, except if
      things just happened to go through the audit paths, which ended up
      reloading the return value.
      
      This moves the loading or %rax into the normal system call path, just to
      make sure the error case triggers it.  It's kind of sad, since it adds a
      useless instruction to reload the register to the fast path, but it's
      not like that single load from the stack is going to be noticeable.
      Reported-by: NDavid Drysdale <drysdale@google.com>
      Tested-by: NKees Cook <keescook@chromium.org>
      Acked-by: NAndy Lutomirski <luto@amacapital.net>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      cd88ec23
    • L
      Merge tag 'dm-4.2-fixes-5' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm · 5b3e2e14
      Linus Torvalds 提交于
      Pull device mapper fixes from Mike Snitzer:
      
       - two stable fixes for corruption seen in a snapshot of thinp metadata;
         metadata snapshots aren't widely used but help provide a consistent
         view of the metadata associated with an active thin-pool.
      
       - a dm-cache fix for the 4.2 "default" policy switch from "mq" to "smq"
      
      * tag 'dm-4.2-fixes-5' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
        dm cache policy smq: move 'dm-cache-default' module alias to SMQ
        dm btree: add ref counting ops for the leaves of top level btrees
        dm thin metadata: delete btrees when releasing metadata snapshot
      5b3e2e14
    • L
      Merge branch 'for-linus' of git://git.kernel.dk/linux-block · ebcbf166
      Linus Torvalds 提交于
      Pull xen block driver fixes from Jens Axboe:
       "A few small bug fixes for xen-blk{front,back} that have been sitting
        over my vacation"
      
      * 'for-linus' of git://git.kernel.dk/linux-block:
        xen-blkback: replace work_pending with work_busy in purge_persistent_gnt()
        xen-blkfront: don't add indirect pages to list when !feature_persistent
        xen-blkfront: introduce blkfront_gather_backend_features()
      ebcbf166
    • L
      Merge tag 'for-linus-4.2-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip · 6b476e11
      Linus Torvalds 提交于
      Pull xen bug fixes from David Vrabel:
      
       - revert a fix from 4.2-rc5 that was causing lots of WARNING spam.
      
       - fix a memory leak affecting backends in HVM guests.
      
       - fix PV domU hang with certain configurations.
      
      * tag 'for-linus-4.2-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
        xen/xenbus: Don't leak memory when unmapping the ring on HVM backend
        Revert "xen/events/fifo: Handle linked events when closing a port"
        x86/xen: build "Xen PV" APIC driver for domU as well
      6b476e11
    • L
      Revert x86 sigcontext cleanups · ed596cde
      Linus Torvalds 提交于
      This reverts commits 9a036b93 ("x86/signal/64: Remove 'fs' and 'gs'
      from sigcontext") and c6f20629 ("x86/signal/64: Fix SS handling for
      signals delivered to 64-bit programs").
      
      They were cleanups, but they break dosemu by changing the signal return
      behavior (and removing 'fs' and 'gs' from the sigcontext struct - while
      not actually changing any behavior - causes build problems).
      Reported-and-tested-by: NStas Sergeev <stsp@list.ru>
      Acked-by: NAndy Lutomirski <luto@amacapital.net>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ed596cde
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 26b552e0
      Linus Torvalds 提交于
      Pull networking fixes from David Miller:
      
       1) Workaround hw bug when acquiring PCI bos ownership of iwlwifi
          devices, from Emmanuel Grumbach.
      
       2) Falling back to vmalloc in conntrack should not emit a warning, from
          Pablo Neira Ayuso.
      
       3) Fix NULL deref when rtlwifi driver is used as an AP, from Luis
          Felipe Dominguez Vega.
      
       4) Rocker doesn't free netdev on device removal, from Ido Schimmel.
      
       5) UDP multicast early sock demux has route handling races, from Eric
          Dumazet.
      
       6) Fix L4 checksum handling in openvswitch, from Glenn Griffin.
      
       7) Fix use-after-free in skb_set_peeked, from Herbert Xu.
      
       8) Don't advertize NETIF_F_FRAGLIST in virtio_net driver, this can lead
          to fraglists longer than the driver can support.  From Jason Wang.
      
       9) Fix mlx5 on non-4k-pagesize systems, from Carol L Soto.
      
      10) Fix interrupt storm in bna driver, from Ivan Vecera.
      
      11) Don't propagate -EBUSY from netlink_insert(), from Daniel Borkmann.
      
      12) Fix inet request sock leak, from Eric Dumazet.
      
      13) Fix TX interrupt masking and marking in TX descriptors of fs_enet
          driver, from LEROY Christophe.
      
      14) Get rid of rule optimizer in gianfar driver, it's buggy and unlikely
          to get fixed any time soon.  From Jakub Kicinski
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (61 commits)
        cosa: missing error code on failure in probe()
        gianfar: remove faulty filer optimizer
        gianfar: correct list membership accounting
        gianfar: correct filer table writing
        bonding: Gratuitous ARP gets dropped when first slave added
        net: dsa: Do not override PHY interface if already configured
        net: fs_enet: mask interrupts for TX partial frames.
        net: fs_enet: explicitly remove I flag on TX partial frames
        inet: fix possible request socket leak
        inet: fix races with reqsk timers
        mkiss: Fix error handling in mkiss_open()
        bnx2x: Free NVRAM lock at end of each page
        bnx2x: Prevent null pointer dereference on SKB release
        cxgb4: missing curly braces in t4_setup_debugfs()
        net-timestamp: Update skb_complete_tx_timestamp comment
        ipv6: don't reject link-local nexthop on other interface
        netlink: make sure -EBUSY won't escape from netlink_insert
        bna: fix interrupts storm caused by erroneous packets
        net: mvpp2: replace TX coalescing interrupts with hrtimer
        net: mvpp2: enable proper per-CPU TX buffers unmapping
        ...
      26b552e0
    • L
      Merge tag 'edac_fix_for_4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp · 2331d30d
      Linus Torvalds 提交于
      Pull EDAC fix from Borislav Petkov:
       "A ppc4xx_edac fix for accessing ->csrows properly.  This driver was
        missed during the conversion a couple of years ago"
      
      * tag 'edac_fix_for_4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
        EDAC, ppc4xx: Access mci->csrows array elements properly
      2331d30d
  3. 13 8月, 2015 14 次提交
  4. 12 8月, 2015 9 次提交
  5. 11 8月, 2015 5 次提交