1. 20 3月, 2015 3 次提交
    • P
      KVM: PPC: Book3S HV: Fix instruction emulation · 2bf27601
      Paul Mackerras 提交于
      Commit 4a157d61 ("KVM: PPC: Book3S HV: Fix endianness of
      instruction obtained from HEIR register") had the side effect that
      we no longer reset vcpu->arch.last_inst to -1 on guest exit in
      the cases where the instruction is not fetched from the guest.
      This means that if instruction emulation turns out to be required
      in those cases, the host will emulate the wrong instruction, since
      vcpu->arch.last_inst will contain the last instruction that was
      emulated.
      
      This fixes it by making sure that vcpu->arch.last_inst is reset
      to -1 in those cases.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      2bf27601
    • P
      KVM: PPC: Book3S HV: Endian fix for accessing VPA yield count · ecb6d618
      Paul Mackerras 提交于
      The VPA (virtual processor area) is defined by PAPR and is therefore
      big-endian, so we need a be32_to_cpu when reading it in
      kvmppc_get_yield_count().  Without this, H_CONFER always fails on a
      little-endian host, causing SMP guests to waste time spinning on
      spinlocks.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      ecb6d618
    • P
      KVM: PPC: Book3S HV: Fix spinlock/mutex ordering issue in kvmppc_set_lpcr() · 8f902b00
      Paul Mackerras 提交于
      Currently, kvmppc_set_lpcr() has a spinlock around the whole function,
      and inside that does mutex_lock(&kvm->lock).  It is not permitted to
      take a mutex while holding a spinlock, because the mutex_lock might
      call schedule().  In addition, this causes lockdep to warn about a
      lock ordering issue:
      
      ======================================================
      [ INFO: possible circular locking dependency detected ]
      3.18.0-kvm-04645-gdfea862-dirty #131 Not tainted
      -------------------------------------------------------
      qemu-system-ppc/8179 is trying to acquire lock:
       (&kvm->lock){+.+.+.}, at: [<d00000000ecc1f54>] .kvmppc_set_lpcr+0xf4/0x1c0 [kvm_hv]
      
      but task is already holding lock:
       (&(&vcore->lock)->rlock){+.+...}, at: [<d00000000ecc1ea0>] .kvmppc_set_lpcr+0x40/0x1c0 [kvm_hv]
      
      which lock already depends on the new lock.
      
      the existing dependency chain (in reverse order) is:
      
      -> #1 (&(&vcore->lock)->rlock){+.+...}:
             [<c000000000b3c120>] .mutex_lock_nested+0x80/0x570
             [<d00000000ecc7a14>] .kvmppc_vcpu_run_hv+0xc4/0xe40 [kvm_hv]
             [<d00000000eb9f5cc>] .kvmppc_vcpu_run+0x2c/0x40 [kvm]
             [<d00000000eb9cb24>] .kvm_arch_vcpu_ioctl_run+0x54/0x160 [kvm]
             [<d00000000eb94478>] .kvm_vcpu_ioctl+0x4a8/0x7b0 [kvm]
             [<c00000000026cbb4>] .do_vfs_ioctl+0x444/0x770
             [<c00000000026cfa4>] .SyS_ioctl+0xc4/0xe0
             [<c000000000009264>] syscall_exit+0x0/0x98
      
      -> #0 (&kvm->lock){+.+.+.}:
             [<c0000000000ff28c>] .lock_acquire+0xcc/0x1a0
             [<c000000000b3c120>] .mutex_lock_nested+0x80/0x570
             [<d00000000ecc1f54>] .kvmppc_set_lpcr+0xf4/0x1c0 [kvm_hv]
             [<d00000000ecc510c>] .kvmppc_set_one_reg_hv+0x4dc/0x990 [kvm_hv]
             [<d00000000eb9f234>] .kvmppc_set_one_reg+0x44/0x330 [kvm]
             [<d00000000eb9c9dc>] .kvm_vcpu_ioctl_set_one_reg+0x5c/0x150 [kvm]
             [<d00000000eb9ced4>] .kvm_arch_vcpu_ioctl+0x214/0x2c0 [kvm]
             [<d00000000eb940b0>] .kvm_vcpu_ioctl+0xe0/0x7b0 [kvm]
             [<c00000000026cbb4>] .do_vfs_ioctl+0x444/0x770
             [<c00000000026cfa4>] .SyS_ioctl+0xc4/0xe0
             [<c000000000009264>] syscall_exit+0x0/0x98
      
      other info that might help us debug this:
      
       Possible unsafe locking scenario:
      
             CPU0                    CPU1
             ----                    ----
        lock(&(&vcore->lock)->rlock);
                                     lock(&kvm->lock);
                                     lock(&(&vcore->lock)->rlock);
        lock(&kvm->lock);
      
       *** DEADLOCK ***
      
      2 locks held by qemu-system-ppc/8179:
       #0:  (&vcpu->mutex){+.+.+.}, at: [<d00000000eb93f18>] .vcpu_load+0x28/0x90 [kvm]
       #1:  (&(&vcore->lock)->rlock){+.+...}, at: [<d00000000ecc1ea0>] .kvmppc_set_lpcr+0x40/0x1c0 [kvm_hv]
      
      stack backtrace:
      CPU: 4 PID: 8179 Comm: qemu-system-ppc Not tainted 3.18.0-kvm-04645-gdfea862-dirty #131
      Call Trace:
      [c000001a66c0f310] [c000000000b486ac] .dump_stack+0x88/0xb4 (unreliable)
      [c000001a66c0f390] [c0000000000f8bec] .print_circular_bug+0x27c/0x3d0
      [c000001a66c0f440] [c0000000000fe9e8] .__lock_acquire+0x2028/0x2190
      [c000001a66c0f5d0] [c0000000000ff28c] .lock_acquire+0xcc/0x1a0
      [c000001a66c0f6a0] [c000000000b3c120] .mutex_lock_nested+0x80/0x570
      [c000001a66c0f7c0] [d00000000ecc1f54] .kvmppc_set_lpcr+0xf4/0x1c0 [kvm_hv]
      [c000001a66c0f860] [d00000000ecc510c] .kvmppc_set_one_reg_hv+0x4dc/0x990 [kvm_hv]
      [c000001a66c0f8d0] [d00000000eb9f234] .kvmppc_set_one_reg+0x44/0x330 [kvm]
      [c000001a66c0f960] [d00000000eb9c9dc] .kvm_vcpu_ioctl_set_one_reg+0x5c/0x150 [kvm]
      [c000001a66c0f9f0] [d00000000eb9ced4] .kvm_arch_vcpu_ioctl+0x214/0x2c0 [kvm]
      [c000001a66c0faf0] [d00000000eb940b0] .kvm_vcpu_ioctl+0xe0/0x7b0 [kvm]
      [c000001a66c0fcb0] [c00000000026cbb4] .do_vfs_ioctl+0x444/0x770
      [c000001a66c0fd90] [c00000000026cfa4] .SyS_ioctl+0xc4/0xe0
      [c000001a66c0fe30] [c000000000009264] syscall_exit+0x0/0x98
      
      This fixes it by moving the mutex_lock()/mutex_unlock() pair outside
      the spin-locked region.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      8f902b00
  2. 17 3月, 2015 1 次提交
  3. 14 3月, 2015 1 次提交
  4. 13 3月, 2015 3 次提交
  5. 11 3月, 2015 4 次提交
    • M
      arm64: KVM: Fix outdated comment about VTCR_EL2.PS · 84ed7412
      Marc Zyngier 提交于
      Commit 87366d8c ("arm64: Add boot time configuration of
      Intermediate Physical Address size") removed the hardcoded setting
      of VTCR_EL2.PS to use ID_AA64MMFR0_EL1.PARange instead, but didn't
      remove the (now rather misleading) comment.
      
      Fix the comments to match reality (at least for the next few minutes).
      Acked-by: NChristoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>
      84ed7412
    • M
      arm64: KVM: Do not use pgd_index to index stage-2 pgd · 04b8dc85
      Marc Zyngier 提交于
      The kernel's pgd_index macro is designed to index a normal, page
      sized array. KVM is a bit diffferent, as we can use concatenated
      pages to have a bigger address space (for example 40bit IPA with
      4kB pages gives us an 8kB PGD.
      
      In the above case, the use of pgd_index will always return an index
      inside the first 4kB, which makes a guest that has memory above
      0x8000000000 rather unhappy, as it spins forever in a page fault,
      whist the host happilly corrupts the lower pgd.
      
      The obvious fix is to get our own kvm_pgd_index that does the right
      thing(tm).
      
      Tested on X-Gene with a hacked kvmtool that put memory at a stupidly
      high address.
      Reviewed-by: NChristoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>
      04b8dc85
    • M
      arm64: KVM: Fix stage-2 PGD allocation to have per-page refcounting · a987370f
      Marc Zyngier 提交于
      We're using __get_free_pages with to allocate the guest's stage-2
      PGD. The standard behaviour of this function is to return a set of
      pages where only the head page has a valid refcount.
      
      This behaviour gets us into trouble when we're trying to increment
      the refcount on a non-head page:
      
      page:ffff7c00cfb693c0 count:0 mapcount:0 mapping:          (null) index:0x0
      flags: 0x4000000000000000()
      page dumped because: VM_BUG_ON_PAGE((*({ __attribute__((unused)) typeof((&page->_count)->counter) __var = ( typeof((&page->_count)->counter)) 0; (volatile typeof((&page->_count)->counter) *)&((&page->_count)->counter); })) <= 0)
      BUG: failure at include/linux/mm.h:548/get_page()!
      Kernel panic - not syncing: BUG!
      CPU: 1 PID: 1695 Comm: kvm-vcpu-0 Not tainted 4.0.0-rc1+ #3825
      Hardware name: APM X-Gene Mustang board (DT)
      Call trace:
      [<ffff80000008a09c>] dump_backtrace+0x0/0x13c
      [<ffff80000008a1e8>] show_stack+0x10/0x1c
      [<ffff800000691da8>] dump_stack+0x74/0x94
      [<ffff800000690d78>] panic+0x100/0x240
      [<ffff8000000a0bc4>] stage2_get_pmd+0x17c/0x2bc
      [<ffff8000000a1dc4>] kvm_handle_guest_abort+0x4b4/0x6b0
      [<ffff8000000a420c>] handle_exit+0x58/0x180
      [<ffff80000009e7a4>] kvm_arch_vcpu_ioctl_run+0x114/0x45c
      [<ffff800000099df4>] kvm_vcpu_ioctl+0x2e0/0x754
      [<ffff8000001c0a18>] do_vfs_ioctl+0x424/0x5c8
      [<ffff8000001c0bfc>] SyS_ioctl+0x40/0x78
      CPU0: stopping
      
      A possible approach for this is to split the compound page using
      split_page() at allocation time, and change the teardown path to
      free one page at a time.  It turns out that alloc_pages_exact() and
      free_pages_exact() does exactly that.
      
      While we're at it, the PGD allocation code is reworked to reduce
      duplication.
      
      This has been tested on an X-Gene platform with a 4kB/48bit-VA host
      kernel, and kvmtool hacked to place memory in the second page of
      the hardware PGD (PUD for the host kernel). Also regression-tested
      on a Cubietruck (Cortex-A7).
      
       [ Reworked to use alloc_pages_exact() and free_pages_exact() and to
         return pointers directly instead of by reference as arguments
          - Christoffer ]
      Reported-by: NMark Rutland <mark.rutland@arm.com>
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>
      a987370f
    • P
      kvm: move advertising of KVM_CAP_IRQFD to common code · dc9be0fa
      Paolo Bonzini 提交于
      POWER supports irqfds but forgot to advertise them.  Some userspace does
      not check for the capability, but others check it---thus they work on
      x86 and s390 but not POWER.
      
      To avoid that other architectures in the future make the same mistake, let
      common code handle KVM_CAP_IRQFD the same way as KVM_CAP_IRQFD_RESAMPLE.
      Reported-and-tested-by: NGreg Kurz <gkurz@linux.vnet.ibm.com>
      Cc: stable@vger.kernel.org
      Fixes: 297e2105Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
      dc9be0fa
  6. 10 3月, 2015 16 次提交
    • L
      Merge git://git.kernel.org/pub/scm/virt/kvm/kvm · affb8172
      Linus Torvalds 提交于
      Pull kvm/s390 bugfixes from Marcelo Tosatti.
      
      * git://git.kernel.org/pub/scm/virt/kvm/kvm:
        KVM: s390: non-LPAR case obsolete during facilities mask init
        KVM: s390: include guest facilities in kvm facility test
        KVM: s390: fix in memory copy of facility lists
        KVM: s390/cpacf: Fix kernel bug under z/VM
        KVM: s390/cpacf: Enable key wrapping by default
      affb8172
    • L
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · ec0e6bd3
      Linus Torvalds 提交于
      Pull s390 fixes from Martin Schwidefsky:
       "One performance optimization for page_clear and a couple of bug fixes"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
        s390/mm: fix incorrect ASCE after crst_table_downgrade
        s390/ftrace: fix crashes when switching tracers / add notrace to cpu_relax()
        s390/pci: unify pci_iomap symbol exports
        s390/pci: fix [un]map_resources sequence
        s390: let the compiler do page clearing
        s390/pci: fix possible information leak in mmio syscall
        s390/dcss: array index 'i' is used before limits check.
        s390/scm_block: fix off by one during cluster reservation
        s390/jump label: improve and fix sanity check
        s390/jump label: add missing jump_label_apply_nops() call
      ec0e6bd3
    • L
      Merge tag 'trace-fixes-v4.0-rc2-2' of... · e7901af1
      Linus Torvalds 提交于
      Merge tag 'trace-fixes-v4.0-rc2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
      
      Pull seq-buf/ftrace fixes from Steven Rostedt:
       "This includes fixes for seq_buf_bprintf() truncation issue.  It also
        contains fixes to ftrace when /proc/sys/kernel/ftrace_enabled and
        function tracing are started.  Doing the following causes some issues:
      
          # echo 0 > /proc/sys/kernel/ftrace_enabled
          # echo function_graph > /sys/kernel/debug/tracing/current_tracer
          # echo 1 > /proc/sys/kernel/ftrace_enabled
          # echo nop > /sys/kernel/debug/tracing/current_tracer
          # echo function_graph > /sys/kernel/debug/tracing/current_tracer
      
        As well as with function tracing too.  Pratyush Anand first reported
        this issue to me and supplied a patch.  When I tested this on my x86
        test box, it caused thousands of backtraces and warnings to appear in
        dmesg, which also caused a denial of service (a warning for every
        function that was listed).  I applied Pratyush's patch but it did not
        fix the issue for me.  I looked into it and found a slight problem
        with trampoline accounting.  I fixed it and sent Pratyush a patch, but
        he said that it did not fix the issue for him.
      
        I later learned tha Pratyush was using an ARM64 server, and when I
        tested on my ARM board, I was able to reproduce the same issue as
        Pratyush.  After applying his patch, it fixed the problem.  The above
        test uncovered two different bugs, one in x86 and one in ARM and
        ARM64.  As this looked like it would affect PowerPC, I tested it on my
        PPC64 box.  It too broke, but neither the patch that fixed ARM or x86
        fixed this box (the changes were all in generic code!).  The above
        test, uncovered two more bugs that affected PowerPC.  Again, the
        changes were only done to generic code.  It's the way the arch code
        expected things to be done that was different between the archs.  Some
        where more sensitive than others.
      
        The rest of this series fixes the PPC bugs as well"
      
      * tag 'trace-fixes-v4.0-rc2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
        ftrace: Fix ftrace enable ordering of sysctl ftrace_enabled
        ftrace: Fix en(dis)able graph caller when en(dis)abling record via sysctl
        ftrace: Clear REGS_EN and TRAMP_EN flags on disabling record via sysctl
        seq_buf: Fix seq_buf_bprintf() truncation
        seq_buf: Fix seq_buf_vprintf() truncation
      e7901af1
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 36bef883
      Linus Torvalds 提交于
      Pull networking fixes from David Miller:
      
       1) nft_compat accidently truncates ethernet protocol to 8-bits, from
          Arturo Borrero.
      
       2) Memory leak in ip_vs_proc_conn(), from Julian Anastasov.
      
       3) Don't allow the space required for nftables rules to exceed the
          maximum value representable in the dlen field.  From Patrick
          McHardy.
      
       4) bcm63xx_enet can accidently leave interrupts permanently disabled
          due to errors in the NAPI polling exit logic.  Fix from Nicolas
          Schichan.
      
       5) Fix OOPSes triggerable by the ping protocol module, due to missing
          address family validations etc.  From Lorenzo Colitti.
      
       6) Don't use RCU locking in sleepable context in team driver, from Jiri
          Pirko.
      
       7) xen-netback miscalculates statistic offset pointers when reporting
          the stats to userspace.  From David Vrabel.
      
       8) Fix a leak of up to 256 pages per VIF destroy in xen-netaback, also
          from David Vrabel.
      
       9) ip_check_defrag() cannot assume that skb_network_offset(),
          particularly when it is used by the AF_PACKET fanout defrag code.
          From Alexander Drozdov.
      
      10) gianfar driver doesn't query OF node names properly when trying to
          determine the number of hw queues available.  Fix it to explicitly
          check for OF nodes named queue-group.  From Tobias Waldekranz.
      
      11) MID field in macb driver should be 12 bits, not 16.  From Punnaiah
          Choudary Kalluri.
      
      12) Fix unintentional regression in traceroute due to timestamp socket
          option changes.  Empty ICMP payloads should be allowed in
          non-timestamp cases.  From Willem de Bruijn.
      
      13) When devices are unregistered, we have to get rid of AF_PACKET
          multicast list entries that point to it via ifindex.  Fix from
          Francesco Ruggeri.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (38 commits)
        tipc: fix bug in link failover handling
        net: delete stale packet_mclist entries
        net: macb: constify macb configuration data
        MAINTAINERS: add Marc Kleine-Budde as co maintainer for CAN networking layer
        MAINTAINERS: linux-can moved to github
        can: kvaser_usb: Read all messages in a bulk-in URB buffer
        can: kvaser_usb: Avoid double free on URB submission failures
        can: peak_usb: fix missing ctrlmode_ init for every dev
        can: add missing initialisations in CAN related skbuffs
        ip: fix error queue empty skb handling
        bgmac: Clean warning messages
        tcp: align tcp_xmit_size_goal() on tcp_tso_autosize()
        net: fec: fix unbalanced clk disable on driver unbind
        net: macb: Correct the MID field length value
        net: gianfar: correctly determine the number of queue groups
        ipv4: ip_check_defrag should not assume that skb_network_offset is zero
        net: bcmgenet: properly disable password matching
        net: eth: xgene: fix booting with devicetree
        bnx2x: Force fundamental reset for EEH recovery
        xen-netback: refactor xenvif_handle_frag_list()
        ...
      36bef883
    • L
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input · e93df634
      Linus Torvalds 提交于
      Pull input subsystem fixes from Dmitry Torokhov:
       "Miscellaneous driver fixes"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
        Input: psmouse - disable "palm detection" in the focaltech driver
        Input: psmouse - disable changing resolution/rate/scale for FocalTech
        Input: psmouse - ensure that focaltech reports consistent coordinates
        Input: psmouse - remove hardcoded touchpad size from the focaltech driver
        Input: tc3589x-keypad - set IRQF_ONESHOT flag to ensure IRQ request
        Input: ALPS - fix memory leak when detection fails
        Input: sun4i-ts - add thermal driver dependency
        Input: cyapa - remove superfluous type check in cyapa_gen5_read_idac_data()
        Input: cyapa - fix unaligned functions redefinition error
        Input: mma8450 - add parent device
      e93df634
    • L
      Merge tag 'regulator-v4.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator · 068c65c5
      Linus Torvalds 提交于
      Pull regulator fixes from Mark Brown:
       "A couple of driver specific fixes plus a fix for a regression in the
        core where the updates to use sysfs group registration were overly
        enthusiastic in eliding properties and removed some that had been
        previously present"
      
      * tag 'regulator-v4.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
        regulator: Fix regression due to NULL constraints check
        regulator: rk808: Set the enable time for LDOs
        regulator: da9210: Mask all interrupt sources to deassert interrupt line
      068c65c5
    • L
      Merge tag 'spi-v4.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi · d08edd8f
      Linus Torvalds 提交于
      Pull spi fixes from Mark Brown:
       "A collection of driver specific fixes to which the usual comments
        about them being important if you see them mostly apply (except for
        the comment fix).  The pl022 one is particularly nasty for anyone
        affected by it"
      
      * tag 'spi-v4.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
        spi: pl022: Fix race in giveback() leading to driver lock-up
        spi: dw-mid: avoid potential NULL dereference
        spi: img-spfi: Verify max spfi transfer length
        spi: fix a typo in comment.
        spi: atmel: Fix interrupt setup for PDC transfers
        spi: dw: revisit FIFO size detection again
        spi: dw-pci: correct number of chip selects
        drivers: spi: ti-qspi: wait for busy bit clear before data write/read
      d08edd8f
    • L
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security · eca8dac4
      Linus Torvalds 提交于
      Pull tpm fixes from James Morris:
       "fixes for the TPM driver"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
        tpm: fix call order in tpm-chip.c
        tpm/ibmvtpm: Additional LE support for tpm_ibmvtpm_send
      eca8dac4
    • L
      Merge tag 'fbdev-fixes-4.0' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux · ecddad64
      Linus Torvalds 提交于
      Pull fbdev fixes from Tomi Valkeinen:
       - Fix regression in with omapdss when using i2c displays
       - Fix possible null deref in fbmon
       - Check kalloc return value in AMBA CLCD
      
      * tag 'fbdev-fixes-4.0' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux:
        OMAPDSS: fix regression with display sysfs files
        video: fbdev: fix possible null dereference
        video: ARM CLCD: Add missing error check for devm_kzalloc
      ecddad64
    • L
      Merge branch 'for-4.0-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup · c0e99a71
      Linus Torvalds 提交于
      Pull cgroup fixes from Tejun Heo:
       "The cgroup iteration update two years ago and the recent cpuset
        restructuring introduced regressions in subset of cpuset
        configurations.  Three patches to fix them.
      
        All are marked for -stable"
      
      * 'for-4.0-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
        cpuset: Fix cpuset sched_relax_domain_level
        cpuset: fix a warning when clearing configured masks in old hierarchy
        cpuset: initialize effective masks when clone_children is enabled
      c0e99a71
    • L
      Merge branch 'for-4.0-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata · f930713b
      Linus Torvalds 提交于
      Pull libata fixlet from Tejun Heo:
       "Speed limiting fix for sata_fsl"
      
      * 'for-4.0-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
        sata-fsl: Apply link speed limits
      f930713b
    • L
      Merge branch 'for-4.0-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq · b695f31f
      Linus Torvalds 提交于
      Pull workqueue fix from Tejun Heo:
       "One fix patch for a subtle livelock condition which can happen on
        PREEMPT_NONE kernels involving two racing cancel_work calls.  Whoever
        comes in the second has to wait for the previous one to finish.  This
        was implemented by making the later one block for the same condition
        that the former would be (work item completion) and then loop and
        retest; unfortunately, depending on the wake up order, the later one
        could lock out the former one to finish by busy looping on the cpu.
      
        This is fixed by implementing explicit wait mechanism.  Work item
        might not belong anywhere at this point and there's remote possibility
        of thundering herd problem.  I originally tried to use bit_waitqueue
        but it didn't work for static work items on modules.  It's currently
        using single wait queue with filtering wake up function and exclusive
        wakeup.  If this ever becomes a problem, which is not very likely, we
        can try to figure out a way to piggy back on bit_waitqueue"
      
      * 'for-4.0-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
        workqueue: fix hang involving racing cancel[_delayed]_work_sync()'s for PREEMPT_NONE
      b695f31f
    • J
      tipc: fix bug in link failover handling · e6441bae
      Jon Paul Maloy 提交于
      In commit c637c103
      ("tipc: resolve race problem at unicast message reception") we
      introduced a new mechanism for delivering buffers upwards from link
      to socket layer.
      
      That code contains a bug in how we handle the new link input queue
      during failover. When a link is reset, some of its users may be blocked
      because of congestion, and in order to resolve this, we add any pending
      wakeup pseudo messages to the link's input queue, and deliver them to
      the socket. This misses the case where the other, remaining link also
      may have congested users. Currently, the owner node's reference to the
      remaining link's input queue is unconditionally overwritten by the
      reset link's input queue. This has the effect that wakeup events from
      the remaining link may be unduely delayed (but not lost) for a
      potentially long period.
      
      We fix this by adding the pending events from the reset link to the
      input queue that is currently referenced by the node, whichever one
      it is.
      
      This commit should be applied to both net and net-next.
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e6441bae
    • F
      net: delete stale packet_mclist entries · 82f17091
      Francesco Ruggeri 提交于
      When an interface is deleted from a net namespace the ifindex in the
      corresponding entries in PF_PACKET sockets' mclists becomes stale.
      This can create inconsistencies if later an interface with the same ifindex
      is moved from a different namespace (not that unlikely since ifindexes are
      per-namespace).
      In particular we saw problems with dev->promiscuity, resulting
      in "promiscuity touches roof, set promiscuity failed. promiscuity
      feature of device might be broken" warnings and EOVERFLOW failures of
      setsockopt(PACKET_ADD_MEMBERSHIP).
      This patch deletes the mclist entries for interfaces that are deleted.
      Since this now causes setsockopt(PACKET_DROP_MEMBERSHIP) to fail with
      EADDRNOTAVAIL if called after the interface is deleted, also make
      packet_mc_drop not fail.
      Signed-off-by: NFrancesco Ruggeri <fruggeri@arista.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      82f17091
    • J
      net: macb: constify macb configuration data · 0b2eb3e9
      Josh Cartwright 提交于
      The configurations are not modified by the driver.  Make them 'const' so
      that they may be placed in a read-only section.
      Signed-off-by: NJosh Cartwright <joshc@ni.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0b2eb3e9
    • D
      Merge tag 'linux-can-fixes-for-4.0-20150309' of... · d0372504
      David S. Miller 提交于
      Merge tag 'linux-can-fixes-for-4.0-20150309' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
      
      Marc Kleine-Budde says:
      
      ====================
      pull-request: can 2015-03-09
      
      this is a pull request for net/master for the 4.0 release cycle, it consists of
      6 patches:
      
      A patch by Oliver Hartkopp fixes a long outstanding bug in the infrastructure,
      which leads to skb_under_panics when CAN interfaces are used by AF_PACKET
      sockets e.g. by dhclient. Stephane Grosjean contributes a patch for the
      peak_usb driver which adds a missing initialization. Two patches by Ahmed S.
      Darwish fix problems in the kvaser_usb driver. Followed by two patches by
      myself, updating the MAINTAINERS file
      ====================
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d0372504
  7. 09 3月, 2015 12 次提交
    • S
      ftrace: Fix ftrace enable ordering of sysctl ftrace_enabled · 524a3868
      Steven Rostedt (Red Hat) 提交于
      Some archs (specifically PowerPC), are sensitive with the ordering of
      the enabling of the calls to function tracing and setting of the
      function to use to be traced.
      
      That is, update_ftrace_function() sets what function the ftrace_caller
      trampoline should call. Some archs require this to be set before
      calling ftrace_run_update_code().
      
      Another bug was discovered, that ftrace_startup_sysctl() called
      ftrace_run_update_code() directly. If the function the ftrace_caller
      trampoline changes, then it will not be updated. Instead a call
      to ftrace_startup_enable() should be called because it tests to see
      if the callback changed since the code was disabled, and will
      tell the arch to update appropriately. Most archs do not need this
      notification, but PowerPC does.
      
      The problem could be seen by the following commands:
      
       # echo 0 > /proc/sys/kernel/ftrace_enabled
       # echo function > /sys/kernel/debug/tracing/current_tracer
       # echo 1 > /proc/sys/kernel/ftrace_enabled
       # cat /sys/kernel/debug/tracing/trace
      
      The trace will show that function tracing was not active.
      
      Cc: stable@vger.kernel.org # 2.6.27+
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      524a3868
    • P
      ftrace: Fix en(dis)able graph caller when en(dis)abling record via sysctl · 1619dc3f
      Pratyush Anand 提交于
      When ftrace is enabled globally through the proc interface, we must check if
      ftrace_graph_active is set. If it is set, then we should also pass the
      FTRACE_START_FUNC_RET command to ftrace_run_update_code(). Similarly, when
      ftrace is disabled globally through the proc interface, we must check if
      ftrace_graph_active is set. If it is set, then we should also pass the
      FTRACE_STOP_FUNC_RET command to ftrace_run_update_code().
      
      Consider the following situation.
      
       # echo 0 > /proc/sys/kernel/ftrace_enabled
      
      After this ftrace_enabled = 0.
      
       # echo function_graph > /sys/kernel/debug/tracing/current_tracer
      
      Since ftrace_enabled = 0, ftrace_enable_ftrace_graph_caller() is never
      called.
      
       # echo 1 > /proc/sys/kernel/ftrace_enabled
      
      Now ftrace_enabled will be set to true, but still
      ftrace_enable_ftrace_graph_caller() will not be called, which is not
      desired.
      
      Further if we execute the following after this:
        # echo nop > /sys/kernel/debug/tracing/current_tracer
      
      Now since ftrace_enabled is set it will call
      ftrace_disable_ftrace_graph_caller(), which causes a kernel warning on
      the ARM platform.
      
      On the ARM platform, when ftrace_enable_ftrace_graph_caller() is called,
      it checks whether the old instruction is a nop or not. If it's not a nop,
      then it returns an error. If it is a nop then it replaces instruction at
      that address with a branch to ftrace_graph_caller.
      ftrace_disable_ftrace_graph_caller() behaves just the opposite. Therefore,
      if generic ftrace code ever calls either ftrace_enable_ftrace_graph_caller()
      or ftrace_disable_ftrace_graph_caller() consecutively two times in a row,
      then it will return an error, which will cause the generic ftrace code to
      raise a warning.
      
      Note, x86 does not have an issue with this because the architecture
      specific code for ftrace_enable_ftrace_graph_caller() and
      ftrace_disable_ftrace_graph_caller() does not check the previous state,
      and calling either of these functions twice in a row has no ill effect.
      
      Link: http://lkml.kernel.org/r/e4fbe64cdac0dd0e86a3bf914b0f83c0b419f146.1425666454.git.panand@redhat.com
      
      Cc: stable@vger.kernel.org # 2.6.31+
      Signed-off-by: NPratyush Anand <panand@redhat.com>
      [
        removed extra if (ftrace_start_up) and defined ftrace_graph_active as 0
        if CONFIG_FUNCTION_GRAPH_TRACER is not set.
      ]
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      1619dc3f
    • S
      ftrace: Clear REGS_EN and TRAMP_EN flags on disabling record via sysctl · b24d443b
      Steven Rostedt (Red Hat) 提交于
      When /proc/sys/kernel/ftrace_enabled is set to zero, all function
      tracing is disabled. But the records that represent the functions
      still hold information about the ftrace_ops that are hooked to them.
      
      ftrace_ops may request "REGS" (have a full set of pt_regs passed to
      the callback), or "TRAMP" (the ops has its own trampoline to use).
      When the record is updated to represent the state of the ops hooked
      to it, it sets "REGS_EN" and/or "TRAMP_EN" to state that the callback
      points to the correct trampoline (REGS has its own trampoline).
      
      When ftrace_enabled is set to zero, all ftrace locations are a nop,
      so they do not point to any trampoline. But the _EN flags are still
      set. This can cause the accounting to go wrong when ftrace_enabled
      is cleared and an ops that has a trampoline is registered or unregistered.
      
      For example, the following will cause ftrace to crash:
      
       # echo function_graph > /sys/kernel/debug/tracing/current_tracer
       # echo 0 > /proc/sys/kernel/ftrace_enabled
       # echo nop > /sys/kernel/debug/tracing/current_tracer
       # echo 1 > /proc/sys/kernel/ftrace_enabled
       # echo function_graph > /sys/kernel/debug/tracing/current_tracer
      
      As function_graph uses a trampoline, when ftrace_enabled is set to zero
      the updates to the record are not done. When enabling function_graph
      again, the record will still have the TRAMP_EN flag set, and it will
      look for an op that has a trampoline other than the function_graph
      ops, and fail to find one.
      
      Cc: stable@vger.kernel.org # 3.17+
      Reported-by: NPratyush Anand <panand@redhat.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      b24d443b
    • J
    • M
      MAINTAINERS: add Marc Kleine-Budde as co maintainer for CAN networking layer · f7214cf2
      Marc Kleine-Budde 提交于
      This patch adds Marc Kleine-Budde as a co maintainer for the CAN networking
      layer.
      Acked-by: NOliver Hartkopp <socketcan@hartkopp.net>
      Signed-off-by: NMarc Kleine-Budde <mkl@pengutronix.de>
      f7214cf2
    • M
      MAINTAINERS: linux-can moved to github · 84b0d715
      Marc Kleine-Budde 提交于
      As gitorious will shut down at the end of May 2015, the linux-can website moved
      to github. This patch reflects this change.
      Signed-off-by: NMarc Kleine-Budde <mkl@pengutronix.de>
      84b0d715
    • A
      can: kvaser_usb: Read all messages in a bulk-in URB buffer · 2fec5104
      Ahmed S. Darwish 提交于
      The Kvaser firmware can only read and write messages that are
      not crossing the USB endpoint's wMaxPacketSize boundary. While
      receiving commands from the CAN device, if the next command in
      the same URB buffer crossed that max packet size boundary, the
      firmware puts a zero-length placeholder command in its place
      then moves the real command to the next boundary mark.
      
      The driver did not recognize such behavior, leading to missing
      a good number of rx events during a heavy rx load session.
      
      Moreover, a tx URB context only gets freed upon receiving its
      respective tx ACK event. Over time, the free tx URB contexts
      pool gets depleted due to the missing ACK events. Consequently,
      the netif transmission queue gets __permanently__ stopped; no
      frames could be sent again except after restarting the CAN
      newtwork interface.
      Signed-off-by: NAhmed S. Darwish <ahmed.darwish@valeo.com>
      Cc: linux-stable <stable@vger.kernel.org>
      Signed-off-by: NMarc Kleine-Budde <mkl@pengutronix.de>
      2fec5104
    • A
      can: kvaser_usb: Avoid double free on URB submission failures · deb2701c
      Ahmed S. Darwish 提交于
      Upon a URB submission failure, the driver calls usb_free_urb()
      but then manually frees the URB buffer by itself.  Meanwhile
      usb_free_urb() has alredy freed out that transfer buffer since
      we're the only code path holding a reference to this URB.
      
      Remove two of such invalid manual free().
      Signed-off-by: NAhmed S. Darwish <ahmed.darwish@valeo.com>
      Cc: linux-stable <stable@vger.kernel.org>
      Signed-off-by: NMarc Kleine-Budde <mkl@pengutronix.de>
      deb2701c
    • S
      can: peak_usb: fix missing ctrlmode_ init for every dev · b0d4724b
      Stephane Grosjean 提交于
      Fixes a missing initialization of ctrlmode and ctrlmode_supported fields,
      for all other CAN devices than the first one. This fix only concerns
      the PCAN-USB Pro FD dual-channels CAN-FD device made by PEAK-System.
      Signed-off-by: NStephane Grosjean <s.grosjean@peak-system.com>
      Signed-off-by: NMarc Kleine-Budde <mkl@pengutronix.de>
      b0d4724b
    • O
      can: add missing initialisations in CAN related skbuffs · 96943901
      Oliver Hartkopp 提交于
      When accessing CAN network interfaces with AF_PACKET sockets e.g. by dhclient
      this can lead to a skb_under_panic due to missing skb initialisations.
      
      Add the missing initialisations at the CAN skbuff creation times on driver
      level (rx path) and in the network layer (tx path).
      Reported-by: NAustin Schuh <austin@peloton-tech.com>
      Reported-by: NDaniel Steer <daniel.steer@mclaren.com>
      Signed-off-by: NOliver Hartkopp <socketcan@hartkopp.net>
      Cc: linux-stable <stable@vger.kernel.org>
      Signed-off-by: NMarc Kleine-Budde <mkl@pengutronix.de>
      96943901
    • W
      ip: fix error queue empty skb handling · c247f053
      Willem de Bruijn 提交于
      When reading from the error queue, msg_name and msg_control are only
      populated for some errors. A new exception for empty timestamp skbs
      added a false positive on icmp errors without payload.
      
      `traceroute -M udpconn` only displayed gateways that return payload
      with the icmp error: the embedded network headers are pulled before
      sock_queue_err_skb, leaving an skb with skb->len == 0 otherwise.
      
      Fix this regression by refining when msg_name and msg_control
      branches are taken. The solutions for the two fields are independent.
      
      msg_name only makes sense for errors that configure serr->port and
      serr->addr_offset. Test the first instead of skb->len. This also fixes
      another issue. saddr could hold the wrong data, as serr->addr_offset
      is not initialized  in some code paths, pointing to the start of the
      network header. It is only valid when serr->port is set (non-zero).
      
      msg_control support differs between IPv4 and IPv6. IPv4 only honors
      requests for ICMP and timestamps with SOF_TIMESTAMPING_OPT_CMSG. The
      skb->len test can simply be removed, because skb->dev is also tested
      and never true for empty skbs. IPv6 honors requests for all errors
      aside from local errors and timestamps on empty skbs.
      
      In both cases, make the policy more explicit by moving this logic to
      a new function that decides whether to process msg_control and that
      optionally prepares the necessary fields in skb->cb[]. After this
      change, the IPv4 and IPv6 paths are more similar.
      
      The last case is rxrpc. Here, simply refine to only match timestamps.
      
      Fixes: 49ca0d8b ("net-timestamp: no-payload option")
      Reported-by: NJan Niehusmann <jan@gondor.com>
      Signed-off-by: NWillem de Bruijn <willemb@google.com>
      
      ----
      
      Changes
        v1->v2
        - fix local origin test inversion in ip6_datagram_support_cmsg
        - make v4 and v6 code paths more similar by introducing analogous
          ipv4_datagram_support_cmsg
        - fix compile bug in rxrpc
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c247f053
    • P
      bgmac: Clean warning messages · 8edfe3b6
      Peter Senna Tschudin 提交于
      On my test environment the throughput of a file transfer drops
      from 4.4MBps to 116KBps due the number of repeated warning
      messages. This patch removes the warning messages as DMA works
      correctly with addresses using 0xC0000000 bits.
      Signed-off-by: NPeter Senna Tschudin <peter.senna@gmail.com>
      Acked-by: NRafał Miłecki <zajec5@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8edfe3b6