1. 01 4月, 2014 10 次提交
    • T
      can: c_can: Fix the lost message handling · 07c7b6f6
      Thomas Gleixner 提交于
      The lost message handling is broken in several ways.
      
      1) Clearing the message lost flag is done by writing 0 to the
         message control register of the object.
      
         #define IF_MCONT_CLR_MSGLST    (0 << 14)
      
         That clears the object buffer configuration in the worst case,
         which results in a loss of the EOB flag. That leaves the FIFO chain
         without a limit and causes a complete lockup of the HW
      
      2) In case that the error skb allocation fails, the code happily
         claims that it handed down a packet. Just an accounting bug, but ....
      
      3) The code adds a lot of pointless overhead to that error case, where
         we need to get stuff done as fast as possible to avoid more packet
         loss.
      
         - printk an annoying error message
         - reread the object buffer for nothing
      
      Fix is simple again:
      
        - Use the already known MSGCTRL content and only clear the MSGLST bit
        - Fix the buffer accounting by adding a proper return code
        - Remove the pointless operations
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NMarc Kleine-Budde <mkl@pengutronix.de>
      07c7b6f6
    • T
      can: c_can: Fix buffer ordering · 64f08f2f
      Thomas Gleixner 提交于
      The buffer handling of c_can has been broken forever. That leads to
      message reordering:
      
      ksoftirqd/0-3     [000] ..s.    79.123776: c_can_poll: rx_poll: val: 00007fff
      ksoftirqd/0-3     [000] ..s.    79.124101: c_can_poll: rx_poll: val: 00008001
      
      What happens is:
      
      CPU				HW
      				queue new packet into obj 16 (0-15 are busy)
      read obj 1-15
      return because pending is 0
      				set pending obj 16 -> pending reg 8000
      				queue new packet into obj 1
      				set pending obj 1 -> pending reg 8001
      
      So the current algorithmus reads the newest message first, which
      violates the ordering rules of CAN.
      
      Add proper handling of that situation by analyzing the contents of the
      pending register for gaps.
      
      This does NOT fix the message object corruption which can lead to
      interrupt storms. Thats addressed in the next patches.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      [mkl: adjusted subject]
      Signed-off-by: NMarc Kleine-Budde <mkl@pengutronix.de>
      64f08f2f
    • T
      can: c_can: Make it SMP safe · 640916db
      Thomas Gleixner 提交于
      The hardware has two message control interfaces, but the code only uses the
      first one. So on SMP the following can be observed:
      
      CPU0 	       	CPU1
      rx_poll()
        write IF1	xmit()
      		write IF1
        write IF1
      
      That results in corrupted message object configurations. The TX/RX is not
      globally serialized it's only serialized on a core.
      
      Simple solution: Let RX use IF1 and TX use IF2 and all is good.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NMarc Kleine-Budde <mkl@pengutronix.de>
      640916db
    • T
      can: c_can: Fix hardware raminit function · 5bb9cbaa
      Thomas Gleixner 提交于
      The function is broken in several ways:
      
          - The function does not wait for the init to complete.
            That can take quite some microseconds.
      
          - No protection against being called for two chips at the same
            time. SMP is such a new thing, right?
      
      Clear the start and the init done bit unconditionally and wait for both bits to
      be clear.
      
      In the enable path set the init bit and wait for the init done bit.
      
      Add proper locking.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NMarc Kleine-Budde <mkl@pengutronix.de>
      5bb9cbaa
    • T
      can: c_can: Wait for CONTROL_INIT to be cleared · 9fac1d1a
      Thomas Gleixner 提交于
      According to the documentation the CPU must wait for CONTROL_INIT to
      be cleared before writing to the baudrate registers.
      Signed-off-by: NBenedikt Spranger <b.spranger@linutronix.de>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NMarc Kleine-Budde <mkl@pengutronix.de>
      9fac1d1a
    • M
      can: c_can: check return value to users of c_can_set_bittiming() · 130a5171
      Marc Kleine-Budde 提交于
      This patch adds return value checking to all direct and indirect users of
      c_can_set_bittiming().
      Signed-off-by: NMarc Kleine-Budde <mkl@pengutronix.de>
      130a5171
    • M
      can: c_can: free_c_can_dev(): add missing netif_napi_del() · f29b4238
      Marc Kleine-Budde 提交于
      This patch adds the missing netif_napi_del() to the free_c_can_dev() function.
      Signed-off-by: NMarc Kleine-Budde <mkl@pengutronix.de>
      f29b4238
    • R
      can: Documentation: fix parameter name "sample-point" · 5fb7639d
      Robert Schwebel 提交于
      This patch fixes the name of the parameter to configure the sample point used
      in iproute2's ip command. The correct writing is "sample-point" not
      "sample_point".
      Signed-off-by: NRobert Schwebel <r.schwebel@pengutronix.de>
      Signed-off-by: NMarc Kleine-Budde <mkl@pengutronix.de>
      5fb7639d
    • B
      can: usb_8dev: Fix memory leak in usb_8dev_start_xmit · 636d0375
      Bjorn Van Tilt 提交于
      Fixed a memory leak when an error occurred in the transmit function. In the
      error handling the urb wasn't freed before returning. There was also a call to
      the usb_unanchor_urb() function but the urb wasn't anchored.
      Signed-off-by: NBjorn Van Tilt <bjorn.vantilt@gmail.com>
      Signed-off-by: NMarc Kleine-Budde <mkl@pengutronix.de>
      636d0375
    • A
      at86rf230: mask irq's before deregister device · 17e84a92
      Alexander Aring 提交于
      While transmit over a at86rf231 device and unloading the module I got:
      
      [   29.643073] WARNING: CPU: 0 PID: 3 at kernel/workqueue.c:1335 __queue_work+0xb4/0x224()
      [   29.651457] Modules linked in: at86rf230(-) autofs4
      [   29.656612] CPU: 0 PID: 3 Comm: ksoftirqd/0 Tainted: G        W    3.14.0-rc6-01602-g902659e-dirty #294
      [   29.666490] [<c00124f0>] (unwind_backtrace) from [<c0010ad0>] (show_stack+0x10/0x14)
      [   29.674628] [<c0010ad0>] (show_stack) from [<c0032c80>] (warn_slowpath_common+0x60/0x80)
      [   29.683116] [<c0032c80>] (warn_slowpath_common) from [<c0032d30>] (warn_slowpath_null+0x18/0x20)
      [   29.692329] [<c0032d30>] (warn_slowpath_null) from [<c0045b08>] (__queue_work+0xb4/0x224)
      [   29.700906] [<c0045b08>] (__queue_work) from [<c0045cc8>] (queue_work_on+0x50/0x78)
      [   29.708944] [<c0045cc8>] (queue_work_on) from [<c05669cc>] (mac802154_tx+0x1e4/0x240)
      [   29.717164] [<c05669cc>] (mac802154_tx) from [<c0471814>] (dev_hard_start_xmit+0x2f0/0x43c)
      [   29.725926] [<c0471814>] (dev_hard_start_xmit) from [<c04878d0>] (sch_direct_xmit+0x64/0x2a0)
      [   29.734867] [<c04878d0>] (sch_direct_xmit) from [<c0487c38>] (__qdisc_run+0x12c/0x18c)
      [   29.743169] [<c0487c38>] (__qdisc_run) from [<c046e1b0>] (net_tx_action+0xe0/0x178)
      [   29.751205] [<c046e1b0>] (net_tx_action) from [<c0036690>] (__do_softirq+0x100/0x264)
      [   29.759420] [<c0036690>] (__do_softirq) from [<c0036818>] (run_ksoftirqd+0x24/0x4c)
      [   29.767453] [<c0036818>] (run_ksoftirqd) from [<c005232c>] (smpboot_thread_fn+0x128/0x13c)
      [   29.776121] [<c005232c>] (smpboot_thread_fn) from [<c004c3fc>] (kthread+0xd0/0xe4)
      [   29.784061] [<c004c3fc>] (kthread) from [<c000da88>] (ret_from_fork+0x14/0x2c)
      [   29.791628] ---[ end trace 3406ff24bd973834 ]---
      
      The problem is there are still interrupts after deregister ieee802154
      device. This patch mask all interrupts in the at86rf2xx chips before
      deregister the device.
      Signed-off-by: NAlexander Aring <alex.aring@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      17e84a92
  2. 30 3月, 2014 4 次提交
  3. 29 3月, 2014 26 次提交
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 49d8137a
      Linus Torvalds 提交于
      Pull networking fixes from David Miller:
      
       1) We've discovered a common error in several networking drivers, they
          put VLAN offload features into ->vlan_features, which would suggest
          that they support offloading 2 or more levels of VLAN encapsulation.
          Not only do these devices not do that, but we don't have the
          infrastructure yet to handle that at all.
      
          Fixes from Vlad Yasevich.
      
       2) Fix tcpdump crash with bridging and vlans, also from Vlad.
      
       3) Some MAINTAINERS updates for random32 and bonding.
      
       4) Fix late reseeds of prandom generator, from Sasha Levin.
      
       5) Bridge doesn't handle stacked vlans properly, fix from Toshiaki
          Makita.
      
       6) Fix deadlock in openvswitch, from Flavio Leitner.
      
       7) get_timewait4_sock() doesn't report delay times correctly, fix from
          Eric Dumazet.
      
       8) Duplicate address detection and addrconf verification need to run in
          contexts where RTNL can be obtained.  Move them to run from a
          workqueue.  From Hannes Frederic Sowa.
      
       9) Fix route refcount leaking in ip tunnels, from Pravin B Shelar.
      
      10) Don't return -EINTR from non-blocking recvmsg() on AF_UNIX sockets,
          from Eric Dumazet.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (28 commits)
        vlan: Warn the user if lowerdev has bad vlan features.
        veth: Turn off vlan rx acceleration in vlan_features
        ifb: Remove vlan acceleration from vlan_features
        qlge: Do not propaged vlan tag offloads to vlans
        bridge: Fix crash with vlan filtering and tcpdump
        net: Account for all vlan headers in skb_mac_gso_segment
        MAINTAINERS: bonding: change email address
        MAINTAINERS: bonding: change email address
        ipv6: move DAD and addrconf_verify processing to workqueue
        tcp: fix get_timewait4_sock() delay computation on 64bit
        openvswitch: fix a possible deadlock and lockdep warning
        bridge: Fix handling stacked vlan tags
        bridge: Fix inabillity to retrieve vlan tags when tx offload is disabled
        vhost: validate vhost_get_vq_desc return value
        vhost: fix total length when packets are too short
        random32: avoid attempt to late reseed if in the middle of seeding
        random32: assign to network folks in MAINTAINERS
        net/mlx4_core: pass pci_device_id.driver_data to __mlx4_init_one during reset
        core, nfqueue, openvswitch: Orphan frags in skb_zerocopy and handle errors
        vlan: Set hard_header_len according to available acceleration
        ...
      49d8137a
    • D
      Merge branch 'vlan_offloads' · 5f2feca2
      David S. Miller 提交于
      Vlad Yasevich says:
      
      ====================
      Audit all drivers for correct vlan_features.
      
      Some drivers set vlan acceleration features in vlan_features.  This causes
      issues with Q-in-Q/802.1ad configurations.
      
      Audit all the drivers for correct vlan_features.  Fix broken ones.
      Add a warning to vlan code to help catch future offenders.
      ====================
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5f2feca2
    • V
      vlan: Warn the user if lowerdev has bad vlan features. · 2adb956b
      Vlad Yasevich 提交于
      Some drivers incorrectly assign vlan acceleration features to
      vlan_features thus causing issues for Q-in-Q vlan configurations.
      Warn the user of such cases.
      Signed-off-by: NVlad Yasevich <vyasevic@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2adb956b
    • V
      veth: Turn off vlan rx acceleration in vlan_features · 3f8c707b
      Vlad Yasevich 提交于
      For completeness, turn off vlan rx acceleration in vlan_features so
      that it doesn't show up on q-in-q setups.
      Signed-off-by: NVlad Yasevich <vyasevic@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3f8c707b
    • V
      ifb: Remove vlan acceleration from vlan_features · 8dd6e147
      Vlad Yasevich 提交于
      Do not include vlan acceleration features in vlan_features as that
      precludes correct Q-in-Q operation.
      Signed-off-by: NVlad Yasevich <vyasevic@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8dd6e147
    • V
      qlge: Do not propaged vlan tag offloads to vlans · f6d1ac4b
      Vlad Yasevich 提交于
      qlge driver turns off NETIF_F_HW_CTAG_FILTER, but forgets to
      turn off HW_CTAG_TX and HW_CTAG_RX on vlan devices.  With the
      current settings, q-in-q will only generate a single vlan header.
      Remember to mask off CTAG_TX and CTAG_RX features in vlan_features.
      
      CC: Shahed Shaikh <shahed.shaikh@qlogic.com>
      CC: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
      CC: Ron Mercer <ron.mercer@qlogic.com>
      Signed-off-by: NVlad Yasevich <vyasevic@redhat.com>
      Acked-by: NJitendra Kalsaria <jitendra.kalsaria@qlogic.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f6d1ac4b
    • V
      bridge: Fix crash with vlan filtering and tcpdump · fc92f745
      Vlad Yasevich 提交于
      When the vlan filtering is enabled on the bridge, but
      the filter is not configured on the bridge device itself,
      running tcpdump on the bridge device will result in a
      an Oops with NULL pointer dereference.  The reason
      is that br_pass_frame_up() will bypass the vlan
      check because promisc flag is set.  It will then try
      to get the table pointer and process the packet based
      on the table.  Since the table pointer is NULL, we oops.
      Catch this special condition in br_handle_vlan().
      Reported-by: NToshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
      CC: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
      Signed-off-by: NVlad Yasevich <vyasevic@redhat.com>
      Acked-by: NToshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      fc92f745
    • V
      net: Account for all vlan headers in skb_mac_gso_segment · 53d6471c
      Vlad Yasevich 提交于
      skb_network_protocol() already accounts for multiple vlan
      headers that may be present in the skb.  However, skb_mac_gso_segment()
      doesn't know anything about it and assumes that skb->mac_len
      is set correctly to skip all mac headers.  That may not
      always be the case.  If we are simply forwarding the packet (via
      bridge or macvtap), all vlan headers may not be accounted for.
      
      A simple solution is to allow skb_network_protocol to return
      the vlan depth it has calculated.  This way skb_mac_gso_segment
      will correctly skip all mac headers.
      Signed-off-by: NVlad Yasevich <vyasevic@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      53d6471c
    • V
      898602a0
    • J
      MAINTAINERS: bonding: change email address · 79b30750
      Jay Vosburgh 提交于
      Update my email address.
      Signed-off-by: NJay Vosburgh <fubar@us.ibm.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      79b30750
    • L
      Merge branch 'akpm' (patches from Andrew Morton) · bc53267e
      Linus Torvalds 提交于
      Merge two fixes from Andrew Morton:
       "The x86 fix should come from x86 guys but they appear to be
        conferencing or otherwise distracted.
      
        The ocfs2 fix is a bit of a mess - the code runs into an immediate
        NULL deref and we're trying to work out how this got through test and
        review, but we haven't heard from Goldwyn in the past few days.
        Sasha's patch fixes the oops, but the feature as a whole is probably
        broken.  So this is a stopgap for 3.14 - I'll aim to get the real
        fixes into 3.14.x"
      
      * emailed patches from Andrew Morton akpm@linux-foundation.org>:
        x86: fix boot on uniprocessor systems
        ocfs2: check if cluster name exists before deref
      bc53267e
    • A
      x86: fix boot on uniprocessor systems · 825600c0
      Artem Fetishev 提交于
      On x86 uniprocessor systems topology_physical_package_id() returns -1
      which causes rapl_cpu_prepare() to leave rapl_pmu variable uninitialized
      which leads to GPF in rapl_pmu_init().
      
      See arch/x86/kernel/cpu/perf_event_intel_rapl.c.
      
      It turns out that physical_package_id and core_id can actually be
      retreived for uniprocessor systems too.  Enabling them also fixes
      rapl_pmu code.
      Signed-off-by: NArtem Fetishev <artem_fetishev@epam.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      825600c0
    • S
      ocfs2: check if cluster name exists before deref · d9060742
      Sasha Levin 提交于
      Commit c74a3bdd ("ocfs2: add clustername to cluster connection") is
      trying to strlcpy a string which was explicitly passed as NULL in the
      very same patch, triggering a NULL ptr deref.
      
        BUG: unable to handle kernel NULL pointer dereference at           (null)
        IP: strlcpy (lib/string.c:388 lib/string.c:151)
        CPU: 19 PID: 19426 Comm: trinity-c19 Tainted: G        W     3.14.0-rc7-next-20140325-sasha-00014-g9476368-dirty #274
        RIP:  strlcpy (lib/string.c:388 lib/string.c:151)
        Call Trace:
         ocfs2_cluster_connect (fs/ocfs2/stackglue.c:350)
         ocfs2_cluster_connect_agnostic (fs/ocfs2/stackglue.c:396)
         user_dlm_register (fs/ocfs2/dlmfs/userdlm.c:679)
         dlmfs_mkdir (fs/ocfs2/dlmfs/dlmfs.c:503)
         vfs_mkdir (fs/namei.c:3467)
         SyS_mkdirat (fs/namei.c:3488 fs/namei.c:3472)
         tracesys (arch/x86/kernel/entry_64.S:749)
      
      akpm: this patch probably disables the feature.  A temporary thing to
      avoid triviel oopses.
      Signed-off-by: NSasha Levin <sasha.levin@oracle.com>
      Cc: Goldwyn Rodrigues <rgoldwyn@suse.com>
      Cc: Mark Fasheh <mfasheh@suse.de>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      d9060742
    • H
      ipv6: move DAD and addrconf_verify processing to workqueue · c15b1cca
      Hannes Frederic Sowa 提交于
      addrconf_join_solict and addrconf_join_anycast may cause actions which
      need rtnl locked, especially on first address creation.
      
      A new DAD state is introduced which defers processing of the initial
      DAD processing into a workqueue.
      
      To get rtnl lock we need to push the code paths which depend on those
      calls up to workqueues, specifically addrconf_verify and the DAD
      processing.
      
      (v2)
      addrconf_dad_failure needs to be queued up to the workqueue, too. This
      patch introduces a new DAD state and stop the DAD processing in the
      workqueue (this is because of the possible ipv6_del_addr processing
      which removes the solicited multicast address from the device).
      
      addrconf_verify_lock is removed, too. After the transition it is not
      needed any more.
      
      As we are not processing in bottom half anymore we need to be a bit more
      careful about disabling bottom half out when we lock spin_locks which are also
      used in bh.
      
      Relevant backtrace:
      [  541.030090] RTNL: assertion failed at net/core/dev.c (4496)
      [  541.031143] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G           O 3.10.33-1-amd64-vyatta #1
      [  541.031145] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
      [  541.031146]  ffffffff8148a9f0 000000000000002f ffffffff813c98c1 ffff88007c4451f8
      [  541.031148]  0000000000000000 0000000000000000 ffffffff813d3540 ffff88007fc03d18
      [  541.031150]  0000880000000006 ffff88007c445000 ffffffffa0194160 0000000000000000
      [  541.031152] Call Trace:
      [  541.031153]  <IRQ>  [<ffffffff8148a9f0>] ? dump_stack+0xd/0x17
      [  541.031180]  [<ffffffff813c98c1>] ? __dev_set_promiscuity+0x101/0x180
      [  541.031183]  [<ffffffff813d3540>] ? __hw_addr_create_ex+0x60/0xc0
      [  541.031185]  [<ffffffff813cfe1a>] ? __dev_set_rx_mode+0xaa/0xc0
      [  541.031189]  [<ffffffff813d3a81>] ? __dev_mc_add+0x61/0x90
      [  541.031198]  [<ffffffffa01dcf9c>] ? igmp6_group_added+0xfc/0x1a0 [ipv6]
      [  541.031208]  [<ffffffff8111237b>] ? kmem_cache_alloc+0xcb/0xd0
      [  541.031212]  [<ffffffffa01ddcd7>] ? ipv6_dev_mc_inc+0x267/0x300 [ipv6]
      [  541.031216]  [<ffffffffa01c2fae>] ? addrconf_join_solict+0x2e/0x40 [ipv6]
      [  541.031219]  [<ffffffffa01ba2e9>] ? ipv6_dev_ac_inc+0x159/0x1f0 [ipv6]
      [  541.031223]  [<ffffffffa01c0772>] ? addrconf_join_anycast+0x92/0xa0 [ipv6]
      [  541.031226]  [<ffffffffa01c311e>] ? __ipv6_ifa_notify+0x11e/0x1e0 [ipv6]
      [  541.031229]  [<ffffffffa01c3213>] ? ipv6_ifa_notify+0x33/0x50 [ipv6]
      [  541.031233]  [<ffffffffa01c36c8>] ? addrconf_dad_completed+0x28/0x100 [ipv6]
      [  541.031241]  [<ffffffff81075c1d>] ? task_cputime+0x2d/0x50
      [  541.031244]  [<ffffffffa01c38d6>] ? addrconf_dad_timer+0x136/0x150 [ipv6]
      [  541.031247]  [<ffffffffa01c37a0>] ? addrconf_dad_completed+0x100/0x100 [ipv6]
      [  541.031255]  [<ffffffff8105313a>] ? call_timer_fn.isra.22+0x2a/0x90
      [  541.031258]  [<ffffffffa01c37a0>] ? addrconf_dad_completed+0x100/0x100 [ipv6]
      
      Hunks and backtrace stolen from a patch by Stephen Hemminger.
      Reported-by: NStephen Hemminger <stephen@networkplumber.org>
      Signed-off-by: NStephen Hemminger <stephen@networkplumber.org>
      Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c15b1cca
    • E
      tcp: fix get_timewait4_sock() delay computation on 64bit · e2a1d3e4
      Eric Dumazet 提交于
      It seems I missed one change in get_timewait4_sock() to compute
      the remaining time before deletion of IPV4 timewait socket.
      
      This could result in wrong output in /proc/net/tcp for tm->when field.
      
      Fixes: 96f817fe ("tcp: shrink tcp6_timewait_sock by one cache line")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e2a1d3e4
    • F
      openvswitch: fix a possible deadlock and lockdep warning · 4f647e0a
      Flavio Leitner 提交于
      There are two problematic situations.
      
      A deadlock can happen when is_percpu is false because it can get
      interrupted while holding the spinlock. Then it executes
      ovs_flow_stats_update() in softirq context which tries to get
      the same lock.
      
      The second sitation is that when is_percpu is true, the code
      correctly disables BH but only for the local CPU, so the
      following can happen when locking the remote CPU without
      disabling BH:
      
             CPU#0                            CPU#1
        ovs_flow_stats_get()
         stats_read()
       +->spin_lock remote CPU#1        ovs_flow_stats_get()
       |  <interrupted>                  stats_read()
       |  ...                       +-->  spin_lock remote CPU#0
       |                            |     <interrupted>
       |  ovs_flow_stats_update()   |     ...
       |   spin_lock local CPU#0 <--+     ovs_flow_stats_update()
       +---------------------------------- spin_lock local CPU#1
      
      This patch disables BH for both cases fixing the deadlocks.
      Acked-by: NJesse Gross <jesse@nicira.com>
      
      =================================
      [ INFO: inconsistent lock state ]
      3.14.0-rc8-00007-g632b06aa #1 Tainted: G          I
      ---------------------------------
      inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
      swapper/0/0 [HC0[0]:SC1[5]:HE1:SE0] takes:
      (&(&cpu_stats->lock)->rlock){+.?...}, at: [<ffffffffa05dd8a1>] ovs_flow_stats_update+0x51/0xd0 [openvswitch]
      {SOFTIRQ-ON-W} state was registered at:
      [<ffffffff810f973f>] __lock_acquire+0x68f/0x1c40
      [<ffffffff810fb4e2>] lock_acquire+0xa2/0x1d0
      [<ffffffff817d8d9e>] _raw_spin_lock+0x3e/0x80
      [<ffffffffa05dd9e4>] ovs_flow_stats_get+0xc4/0x1e0 [openvswitch]
      [<ffffffffa05da855>] ovs_flow_cmd_fill_info+0x185/0x360 [openvswitch]
      [<ffffffffa05daf05>] ovs_flow_cmd_build_info.constprop.27+0x55/0x90 [openvswitch]
      [<ffffffffa05db41d>] ovs_flow_cmd_new_or_set+0x4dd/0x570 [openvswitch]
      [<ffffffff816c245d>] genl_family_rcv_msg+0x1cd/0x3f0
      [<ffffffff816c270e>] genl_rcv_msg+0x8e/0xd0
      [<ffffffff816c0239>] netlink_rcv_skb+0xa9/0xc0
      [<ffffffff816c0798>] genl_rcv+0x28/0x40
      [<ffffffff816bf830>] netlink_unicast+0x100/0x1e0
      [<ffffffff816bfc57>] netlink_sendmsg+0x347/0x770
      [<ffffffff81668e9c>] sock_sendmsg+0x9c/0xe0
      [<ffffffff816692d9>] ___sys_sendmsg+0x3a9/0x3c0
      [<ffffffff8166a911>] __sys_sendmsg+0x51/0x90
      [<ffffffff8166a962>] SyS_sendmsg+0x12/0x20
      [<ffffffff817e3ce9>] system_call_fastpath+0x16/0x1b
      irq event stamp: 1740726
      hardirqs last  enabled at (1740726): [<ffffffff8175d5e0>] ip6_finish_output2+0x4f0/0x840
      hardirqs last disabled at (1740725): [<ffffffff8175d59b>] ip6_finish_output2+0x4ab/0x840
      softirqs last  enabled at (1740674): [<ffffffff8109be12>] _local_bh_enable+0x22/0x50
      softirqs last disabled at (1740675): [<ffffffff8109db05>] irq_exit+0xc5/0xd0
      
      other info that might help us debug this:
       Possible unsafe locking scenario:
      
             CPU0
             ----
        lock(&(&cpu_stats->lock)->rlock);
        <Interrupt>
          lock(&(&cpu_stats->lock)->rlock);
      
       *** DEADLOCK ***
      
      5 locks held by swapper/0/0:
       #0:  (((&ifa->dad_timer))){+.-...}, at: [<ffffffff810a7155>] call_timer_fn+0x5/0x320
       #1:  (rcu_read_lock){.+.+..}, at: [<ffffffff81788a55>] mld_sendpack+0x5/0x4a0
       #2:  (rcu_read_lock_bh){.+....}, at: [<ffffffff8175d149>] ip6_finish_output2+0x59/0x840
       #3:  (rcu_read_lock_bh){.+....}, at: [<ffffffff8168ba75>] __dev_queue_xmit+0x5/0x9b0
       #4:  (rcu_read_lock){.+.+..}, at: [<ffffffffa05e41b5>] internal_dev_xmit+0x5/0x110 [openvswitch]
      
      stack backtrace:
      CPU: 0 PID: 0 Comm: swapper/0 Tainted: G          I  3.14.0-rc8-00007-g632b06aa #1
      Hardware name:                  /DX58SO, BIOS SOX5810J.86A.5599.2012.0529.2218 05/29/2012
       0000000000000000 0fcf20709903df0c ffff88042d603808 ffffffff817cfe3c
       ffffffff81c134c0 ffff88042d603858 ffffffff817cb6da 0000000000000005
       ffffffff00000001 ffff880400000000 0000000000000006 ffffffff81c134c0
      Call Trace:
       <IRQ>  [<ffffffff817cfe3c>] dump_stack+0x4d/0x66
       [<ffffffff817cb6da>] print_usage_bug+0x1f4/0x205
       [<ffffffff810f7f10>] ? check_usage_backwards+0x180/0x180
       [<ffffffff810f8963>] mark_lock+0x223/0x2b0
       [<ffffffff810f96d3>] __lock_acquire+0x623/0x1c40
       [<ffffffff810f5707>] ? __lock_is_held+0x57/0x80
       [<ffffffffa05e26c6>] ? masked_flow_lookup+0x236/0x250 [openvswitch]
       [<ffffffff810fb4e2>] lock_acquire+0xa2/0x1d0
       [<ffffffffa05dd8a1>] ? ovs_flow_stats_update+0x51/0xd0 [openvswitch]
       [<ffffffff817d8d9e>] _raw_spin_lock+0x3e/0x80
       [<ffffffffa05dd8a1>] ? ovs_flow_stats_update+0x51/0xd0 [openvswitch]
       [<ffffffffa05dd8a1>] ovs_flow_stats_update+0x51/0xd0 [openvswitch]
       [<ffffffffa05dcc64>] ovs_dp_process_received_packet+0x84/0x120 [openvswitch]
       [<ffffffff810f93f7>] ? __lock_acquire+0x347/0x1c40
       [<ffffffffa05e3bea>] ovs_vport_receive+0x2a/0x30 [openvswitch]
       [<ffffffffa05e4218>] internal_dev_xmit+0x68/0x110 [openvswitch]
       [<ffffffffa05e41b5>] ? internal_dev_xmit+0x5/0x110 [openvswitch]
       [<ffffffff8168b4a6>] dev_hard_start_xmit+0x2e6/0x8b0
       [<ffffffff8168be87>] __dev_queue_xmit+0x417/0x9b0
       [<ffffffff8168ba75>] ? __dev_queue_xmit+0x5/0x9b0
       [<ffffffff8175d5e0>] ? ip6_finish_output2+0x4f0/0x840
       [<ffffffff8168c430>] dev_queue_xmit+0x10/0x20
       [<ffffffff8175d641>] ip6_finish_output2+0x551/0x840
       [<ffffffff8176128a>] ? ip6_finish_output+0x9a/0x220
       [<ffffffff8176128a>] ip6_finish_output+0x9a/0x220
       [<ffffffff8176145f>] ip6_output+0x4f/0x1f0
       [<ffffffff81788c29>] mld_sendpack+0x1d9/0x4a0
       [<ffffffff817895b8>] mld_send_initial_cr.part.32+0x88/0xa0
       [<ffffffff817691b0>] ? addrconf_dad_completed+0x220/0x220
       [<ffffffff8178e301>] ipv6_mc_dad_complete+0x31/0x50
       [<ffffffff817690d7>] addrconf_dad_completed+0x147/0x220
       [<ffffffff817691b0>] ? addrconf_dad_completed+0x220/0x220
       [<ffffffff8176934f>] addrconf_dad_timer+0x19f/0x1c0
       [<ffffffff810a71e9>] call_timer_fn+0x99/0x320
       [<ffffffff810a7155>] ? call_timer_fn+0x5/0x320
       [<ffffffff817691b0>] ? addrconf_dad_completed+0x220/0x220
       [<ffffffff810a76c4>] run_timer_softirq+0x254/0x3b0
       [<ffffffff8109d47d>] __do_softirq+0x12d/0x480
      Signed-off-by: NFlavio Leitner <fbl@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4f647e0a
    • T
      bridge: Fix handling stacked vlan tags · 99b192da
      Toshiaki Makita 提交于
      If a bridge with vlan_filtering enabled receives frames with stacked
      vlan tags, i.e., they have two vlan tags, br_vlan_untag() strips not
      only the outer tag but also the inner tag.
      
      br_vlan_untag() is called only from br_handle_vlan(), and in this case,
      it is enough to set skb->vlan_tci to 0 here, because vlan_tci has already
      been set before calling br_handle_vlan().
      Signed-off-by: NToshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
      Acked-by: NVlad Yasevich <vyasevic@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      99b192da
    • T
      bridge: Fix inabillity to retrieve vlan tags when tx offload is disabled · 12464bb8
      Toshiaki Makita 提交于
      Bridge vlan code (br_vlan_get_tag()) assumes that all frames have vlan_tci
      if they are tagged, but if vlan tx offload is manually disabled on bridge
      device and frames are sent from vlan device on the bridge device, the tags
      are embedded in skb->data and they break this assumption.
      Extract embedded vlan tags and move them to vlan_tci at ingress.
      Signed-off-by: NToshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
      Acked-by: NVlad Yasevich <vyasevic@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      12464bb8
    • M
      vhost: validate vhost_get_vq_desc return value · a39ee449
      Michael S. Tsirkin 提交于
      vhost fails to validate negative error code
      from vhost_get_vq_desc causing
      a crash: we are using -EFAULT which is 0xfffffff2
      as vector size, which exceeds the allocated size.
      
      The code in question was introduced in commit
      8dd014ad
          vhost-net: mergeable buffers support
      
      CVE-2014-0055
      Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a39ee449
    • M
      vhost: fix total length when packets are too short · d8316f39
      Michael S. Tsirkin 提交于
      When mergeable buffers are disabled, and the
      incoming packet is too large for the rx buffer,
      get_rx_bufs returns success.
      
      This was intentional in order for make recvmsg
      truncate the packet and then handle_rx would
      detect err != sock_len and drop it.
      
      Unfortunately we pass the original sock_len to
      recvmsg - which means we use parts of iov not fully
      validated.
      
      Fix this up by detecting this overrun and doing packet drop
      immediately.
      
      CVE-2014-0077
      Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d8316f39
    • S
      random32: avoid attempt to late reseed if in the middle of seeding · 05efa8c9
      Sasha Levin 提交于
      Commit 4af712e8 ("random32: add prandom_reseed_late() and call when
      nonblocking pool becomes initialized") has added a late reseed stage
      that happens as soon as the nonblocking pool is marked as initialized.
      
      This fails in the case that the nonblocking pool gets initialized
      during __prandom_reseed()'s call to get_random_bytes(). In that case
      we'd double back into __prandom_reseed() in an attempt to do a late
      reseed - deadlocking on 'lock' early on in the boot process.
      
      Instead, just avoid even waiting to do a reseed if a reseed is already
      occuring.
      
      Fixes: 4af712e8 ("random32: add prandom_reseed_late() and call when nonblocking pool becomes initialized")
      Signed-off-by: NSasha Levin <sasha.levin@oracle.com>
      Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      05efa8c9
    • L
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input · 2946369e
      Linus Torvalds 提交于
      Pull input subsystem fixes from Dmitry Torokhov:
       "Updates to Synaptics touchpad to better cope with devices in Lenovo
        laptops, and a couple more fixes"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
        Input: synaptics - add manual min/max quirk for ThinkPad X240
        Input: synaptics - add manual min/max quirk
        Input: cypress_ps2 - don't report as a button pads
        Input: da9052_onkey - use correct register bit for key status
        Input: adp5588-keys - get value from data out when dir is out
      2946369e
    • S
      random32: assign to network folks in MAINTAINERS · 335a67d2
      Sasha Levin 提交于
      lib/random32.c was split out of the network code and is de-facto
      still maintained by the almighty net/ gods.
      
      Make it a bit more official so that people who aren't aware of
      that know where to send their patches.
      Signed-off-by: NSasha Levin <sasha.levin@oracle.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      335a67d2
    • L
      Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux · 1fac1fa9
      Linus Torvalds 提交于
      Pull drm fixes from Dave Airlie:
       "I didn't want these to wait for stable cycle.
      
        The nouveau and radeon ones are the same problem, where the runtime pm
        stuff broke non-runtime pm managed secondary GPUs.
      
        The udl fix is for an oops on unplug, and the i915 fix is for a
        regression on Sandybridge even though it may break haswell (regression
        wins)"
      
      Daniel Vetter comments:
       "My apologies for the i915 regression fumble, that thing somehow fell
        through the cracks here for almost half a year :( Imo that's more than
        enough flailing to just go ahead with the revert, and the re-broken
        hsw should get peoples attention ..."
      
      * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
        drm/i915: Undo gtt scratch pte unmapping again
        drm/radeon: fix runtime suspend breaking secondary GPUs
        drm/nouveau: fail runtime pm properly.
        drm/udl: take reference to device struct for dma-bufs
      1fac1fa9
    • L
      Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · 350bb4be
      Linus Torvalds 提交于
      Pull i2c build fix from Wolfram Sang:
       "The build fix from my last request unveiled another build problem
        which is fixed with this patch"
      
      * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
        i2c: cpm: Fix build by adding of_address.h and of_irq.h
      350bb4be
    • L
      Merge tag 'stable/for-linus-3.14-rc8-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip · 81250437
      Linus Torvalds 提交于
      Pull Xen bugfixes from David Vrabel:
       "Fix two bugs that cause x86 PV guest crashes.
      
        1. Ballooning a 32-bit guest would eventually crash it.
      
        2. Revert a broken fix for a regression with NUMA_BALACING.  The bad
           fix caused PV guests to crash after migration.  This is not ideal
           but unpicking the madness that is _PAGE_NUMA == _PAGE_PROTNONE will
           take a while longer"
      
      * tag 'stable/for-linus-3.14-rc8-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
        Revert "xen: properly account for _PAGE_NUMA during xen pte translations"
        xen/balloon: flush persistent kmaps in correct position
      81250437