1. 01 3月, 2015 7 次提交
    • E
      macvtap: make sure neighbour code can push ethernet header · 2f1d8b9e
      Eric Dumazet 提交于
      Brian reported crashes using IPv6 traffic with macvtap/veth combo.
      
      I tracked the crashes in neigh_hh_output()
      
      -> memcpy(skb->data - HH_DATA_MOD, hh->hh_data, HH_DATA_MOD);
      
      Neighbour code assumes headroom to push Ethernet header is
      at least 16 bytes.
      
      It appears macvtap has only 14 bytes available on arches
      where NET_IP_ALIGN is 0 (like x86)
      
      Effect is a corruption of 2 bytes right before skb->head,
      and possible crashes if accessing non existing memory.
      
      This fix should also increase IPv4 performance, as paranoid code
      in ip_finish_output2() wont have to call skb_realloc_headroom()
      Reported-by: NBrian Rak <brak@vultr.com>
      Tested-by: NBrian Rak <brak@vultr.com>
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2f1d8b9e
    • D
      Merge tag 'mac80211-for-davem-2015-02-27' of... · 32034e05
      David S. Miller 提交于
      Merge tag 'mac80211-for-davem-2015-02-27' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
      
      Johannes Berg says:
      
      ====================
      A few patches have accumulated, among them the fix for Linus's
      four-way-handshake problem. The others are various small fixes
      for problems all over, nothing really stands out.
      ====================
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      32034e05
    • E
      net: Verify permission to link_net in newlink · 06615bed
      Eric W. Biederman 提交于
      When applicable verify that the caller has permisson to the underlying
      network namespace for a newly created network device.
      
      Similary checks exist for the network namespace a network device will
      be created in.
      
      Fixes: 317f4810 ("rtnl: allow to create device with IFLA_LINK_NETNSID set")
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      Acked-by: NNicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      06615bed
    • E
      net: Verify permission to dest_net in newlink · 505ce415
      Eric W. Biederman 提交于
      When applicable verify that the caller has permision to create a
      network device in another network namespace.  This check is already
      present when moving a network device between network namespaces in
      setlink so all that is needed is to duplicate that check in newlink.
      
      This change almost backports cleanly, but there are context conflicts
      as the code that follows was added in v4.0-rc1
      
      Fixes: b51642f6 net: Enable a userns root rtnl calls that are safe for unprivilged users
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      Acked-by: NNicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      505ce415
    • G
      drivers: net: cpsw: Set SECURE for dual_emac ucast · 56887149
      George McCollister 提交于
      Prior to this patch, sending a packet with the source MAC address of one
      of the CPSW interfaces to one of the CPSW slave ports while it's configured in
      dual_emac mode would update the port_num field of the VLAN/Unicast Address
      Table Entry. This would cause it to discard all incoming traffic addressed to
      that MAC address, essentially rendering the port useless until the ALE table is
      cleared (by starting and stopping the interface or rebooting.)
      
      For example, if eth0 has a MAC address of 90:59:af:8f:43:e9 it will have
      an ALE table entry:
      
      00 00 00 00 59 90 02 30 e9 43 8f af
      (VLAN Addr vlan_id=2 unicast type=0 port_num=0 addr=90:59:af:8f:43:e9)
      
      If you configure another device with the same MAC address and connect it
      to the first CPSW slave port and send some traffic the ALE table entry
      becomes:
      
      04 00 00 00 59 90 02 30 e9 43 8f af
      (VLAN Addr vlan_id=2 unicast type=0 port_num=1 addr=90:59:af:8f:43:e9)
      
      >From this point forward all incoming traffic addressed to
      90:59:af:8f:43:e9 will be dropped.
      
      Setting the SECURE bit for the VLAN/Unicast address table entry for each
      interface's MAC address corrects the problem.
      Signed-off-by: NGeorge McCollister <george.mccollister@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      56887149
    • D
      niu: fix error handling in niu_class_to_ethflow() · f55ea3d9
      Dan Carpenter 提交于
      There is a discrepancy here because the niu_class_to_ethflow() returns
      zero on failure and one on success but the caller expected zero on
      success and negative on failure.
      
      The problem means that we allow the user to pass classes and flow_types
      which we don't want.  I've looked at it a bit and I don't see it as a
      very serious bug.
      Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f55ea3d9
    • A
      net: smc91x: use run-time configuration on all ARM machines · b70661c7
      Arnd Bergmann 提交于
      The smc91x driver traditionally gets configured at compile-time
      for whichever hardware it runs on. This no longer works on
      ARM as we continue to move to building all-in-one kernels.
      
      Most ARM configurations with this driver already use run-time
      configuration through DT or through platform_data, but a
      few have not been converted yet.
      
      I've checked all ARM boards that use this driver in their
      legacy board files, and converted the ones that were using
      compile-time configuration in smc91x.h to behave like the
      other ones and provide the interrupt polarity along with
      the MMIO configuration (width, stride) at platform device
      creation time.
      
      In particular, these combinations were previously selectable
      in Kconfig but in fact broken:
      
      - sa1100 assabet plus pleb
      - msm combined with any other armv6/v7 platform
      - pxa-idp combined with any non-DMA pxa variant
      - LogicPD PXA270 combined with any other pxa
      - nomadik combined with any other armv4/v5 platform,
        e.g. versatile.
      
      None of these seem critical enough to warrant a backport
      to stable, but it would be nice to clean this up for good.
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      ----
      I would like the patch to get merged through netdev, after
      Robert and/or Linus have verified it on at least some hardware.
      
      There are a few other non-ARM platforms using this driver,
      I could do the same patch for those if we want to take
      it further.
      
       arch/arm/mach-msm/board-halibut.c    |   8 ++++-
       arch/arm/mach-msm/board-qsd8x50.c    |   8 ++++-
       arch/arm/mach-pxa/idp.c              |   5 +++
       arch/arm/mach-pxa/lpd270.c           |   8 ++++-
       arch/arm/mach-realview/core.c        |   7 ++++
       arch/arm/mach-realview/realview_eb.c |   2 +-
       arch/arm/mach-sa1100/neponset.c      |   6 ++++
       arch/arm/mach-sa1100/pleb.c          |   7 ++++
       drivers/net/ethernet/smsc/smc91x.c   |   9 +++--
       drivers/net/ethernet/smsc/smc91x.h   | 114 ++----------------------------------------------------------
       10 files changed, 57 insertions(+), 117 deletions(-)
      Tested-by: NRobert Jarzmik <robert.jarzmik@free.fr>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b70661c7
  2. 28 2月, 2015 13 次提交
    • E
      rhashtable: use cond_resched() · 5beb5c90
      Eric Dumazet 提交于
      If a hash table has 128 slots and 16384 elems, expand to 256 slots
      takes more than one second. For larger sets, a soft lockup is detected.
      
      Holding cpu for that long, even in a work queue is a show stopper
      for non preemptable kernels.
      
      cond_resched() at strategic points to allow process scheduler
      to reschedule us.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Acked-by: NDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5beb5c90
    • D
      Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net · 061c1a6e
      David S. Miller 提交于
      Jeff Kirsher says:
      
      ====================
      Intel Wired LAN Driver Updates 2015-02-26
      
      This series contains fixes for i40e and i40evf only.
      
      Alexey Khoroshilov found a possible leak of 'cmd_buf' when copy_from_user()
      failed in i40e_dbg_command_write(), so resolved by calling kfree().
      
      Shannon provides a fix to ensure the shift and bitwise precedences do not
      work backwards for us by adding parans.  Fixed the driver by preventing
      the driver from allowing stray interrupts or causing system logs from
      un-handled interrupts by combining the ICR0 shutdown with the standard
      interrupt shutdown and add the interrupt clearing to the PCI shutdown
      path.  Fixed an issue where a NVM write times out before a transaction
      can complete, so Shannon added logic to make another attempt by
      reacquiring the semaphore, then retry the write, if the one retry fails,
      we will then give up.  Adds checks to pointers before their use to ensure
      we do not try to dereference NULL pointers when returning values from the
      AdminQ calls.
      
      Akeem adds a check to bail out if the device is already down when checking
      for Tx hang subtask.
      
      Anjali fixes TSO with more than 8 frags per segment issue.  The hardware
      has some limitations which the driver needs to adhere to:
        1) no more than 8 descriptors per packet on the wire
        2) no header can span more than 3 descriptors
      If one of these events happens, the hardware will generate an internal
      error and freeze the Tx queue, so Anjali fixes this by linearizes the skb
      to avoid these situations.  Fixed an issue where the per Traffic Class
      queue count was higher than queues enabled, which will fix a warning
      with multiple function mode where systems regularly have more cores than
      vectors.  Fixed TCP/IPv6 over VXLAN Tx checksum offload, where we were
      checking the outer protocol flags and deciding the flow for the inner
      header.
      
      Jesse fixes a race condition in the transmit hang detection.  Before we
      were having issues of false Tx hang detection, no the driver makes more
      direct with the checks for progress forward by directly checking the head
      write back address and tail register when determining progress.  This
      avoids Tx hangs where the software gets behind, because we are directly
      checking hardware state when determining a hang state.
      
      Neerav fixes the transmit ring Qset handle when DCB reconfigures. The issue
      was when DCB is reconfigured to a single traffic class (TC) and the driver
      did not reset the Tx ring Qset handle to correct the mapping, which caused
      the Tx queue to disable timeouts.  Also as part of DCB reconfiguration flow
      if the Tx queue disable times out, then issue a PF reset to do some level
      of recovery.
      
      Mitch stops flow director on shutdown because, in some cases, the hardware
      would continue to try to access the FDIR ring after entering D3Hot state,
      which would cause either PCIe errors or NMIs, depending upon the system
      configuration.
      
      * NOTE * I have verified that this series of patches for net will not cause
      any merge issues when you sync up your net tree with your net-next tree.
      ====================
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      061c1a6e
    • L
      amd-xgbe: Request IRQs only after driver is fully setup · c30e76a7
      Lendacky, Thomas 提交于
      It is possible that the hardware may not have been properly shutdown
      before this driver gets control, through use by firmware, for example.
      Until the driver is loaded, interrupts associated with the hardware
      could go pending. When the IRQs are requested napi support has not
      been initialized yet, but the ISR will get control and schedule napi
      processing resulting in a kernel panic because the poll routine has not
      been set.
      
      Adjust the code so that the driver is fully ready to handle and process
      interrupts as soon as the IRQs are requested. This involves requesting
      and freeing IRQs during start and stop processing and ordering the napi
      add and delete calls appropriately.
      
      Also adjust the powerup and powerdown routines to match the start and
      stop routines in regards to the ordering of tasks, including napi
      related calls.
      Signed-off-by: NTom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c30e76a7
    • L
      net: asix: add support for the Sitecom LN-028 USB adapter · 7488c3e3
      Luca Ceresoli 提交于
      Just another AX88178-based 10/100/1000 USB-to-Ethernet dongle. This one
      shows up in lsusb as: "Sitecom Europe B.V. LN-028 Network USB 2.0 Adapter".
      Signed-off-by: NLuca Ceresoli <luca@lucaceresoli.net>
      Cc: Francois Romieu <romieu@fr.zoreil.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: linux-usb@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7488c3e3
    • D
      Merge branch 'rhashtable' · c0eebfa3
      David S. Miller 提交于
      Daniel Borkmann says:
      
      ====================
      rhashtable updates
      
      As discussed, I'm sending out rhashtable fixups for -net.
      
      I have a couple of more patches I was working on last week pending,
      i.e. to get rid of ht->nelems and ht->shift atomic operations which
      speed-up pure insertions/deletions, e.g. on my laptop I have 2 threads,
      inserting 7M entries each, that will reduce insertion time from ~1,450 ms
      to 865 ms (performance should even be better after removing the
      grow/shrink indirections). I guess that however is rather something
      for net-next.
      ====================
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c0eebfa3
    • D
      rhashtable: remove indirection for grow/shrink decision functions · 4c4b52d9
      Daniel Borkmann 提交于
      Currently, all real users of rhashtable default their grow and shrink
      decision functions to rht_grow_above_75() and rht_shrink_below_30(),
      so that there's currently no need to have this explicitly selectable.
      
      It can/should be generic and private inside rhashtable until a real
      use case pops up. Since we can make this private, we'll save us this
      additional indirection layer and can improve insertion/deletion time
      as well.
      
      Reference: http://patchwork.ozlabs.org/patch/443040/Suggested-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: NThomas Graf <tgraf@suug.ch>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4c4b52d9
    • D
      rhashtable: unconditionally grow when max_shift is not specified · 8331de75
      Daniel Borkmann 提交于
      While commit c0c09bfd ("rhashtable: avoid unnecessary wakeup for
      worker queue") rightfully moved part of the decision making of
      whether we should expand or shrink from the expand/shrink functions
      themselves into insert/delete functions in order to avoid unnecessary
      worker wake-ups, it however introduced a regression by doing so.
      
      Before that change, if no max_shift was specified (= 0) on rhashtable
      initialization, rhashtable_expand() would just grow unconditionally
      and lets the available memory be the limiting factor. After that
      change, if no max_shift was specified, there would be _no_ expansion
      step at all.
      
      Given that netlink and tipc have a max_shift specified, it was not
      visible there, but Josh Hunt reported that if nft that starts out
      with a default element hint of 3 if not otherwise provided, would
      slow i.e. inserts down trememdously as it cannot grow larger to
      relax table occupancy.
      
      Given that the test case verifies shrinks/expands manually, we also
      must remove pointer to the helper functions to explicitly avoid
      parallel resizing on insertions/deletions. test_bucket_stats() and
      test_rht_lookup() could also be wrapped around rhashtable mutex to
      explicitly synchronize a walk from resizing, but I think that defeats
      the actual test case which intended to have explicit test steps,
      i.e. 1) inserts, 2) expands, 3) shrinks, 4) deletions, with object
      verification after each stage.
      Reported-by: NJosh Hunt <johunt@akamai.com>
      Fixes: c0c09bfd ("rhashtable: avoid unnecessary wakeup for worker queue")
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Cc: Ying Xue <ying.xue@windriver.com>
      Cc: Josh Hunt <johunt@akamai.com>
      Acked-by: NThomas Graf <tgraf@suug.ch>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8331de75
    • M
      vhost: drop hard-coded num_buffers size · 0d79a493
      Michael S. Tsirkin 提交于
      The 2 that we use for copy_to_iter comes from sizeof(u16),
      it used to be that way before the iov iter update.
      Fix it up, making it obvious the size of stack access
      is right.
      Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0d79a493
    • M
      vhost: cleanup iterator update logic · 4c5a8442
      Michael S. Tsirkin 提交于
      Recent iterator-related changes in vhost made it
      harder to follow the logic fixing up the header.
      In fact, the fixup always happens at the same
      offset: sizeof(virtio_net_hdr): sometimes the
      fixup iterator is updated by copy_to_iter,
      sometimes-by iov_iter_advance.
      
      Rearrange code to make this obvious.
      Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4c5a8442
    • D
      rocker: silence shift wrapping warning · 5f2ebfbe
      Dan Carpenter 提交于
      "val" is declared as a u64 so static checkers complain that this shift
      can wrap.  I don't have the hardware but probably it's doesn't have over
      31 ports.  Still we may as well silence the warning even if it's not a
      real bug.
      Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com>
      Acked-by: NJiri Pirko <jiri@resnulli.us>
      Acked-by: NScott Feldman <sfeldma@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5f2ebfbe
    • D
      rocker: add a check for NULL in rocker_probe_ports() · e65ad3be
      Dan Carpenter 提交于
      Make sure kmalloc() succeeds.
      Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com>
      Acked-by: NScott Feldman <sfeldma@gmail.com>
      Acked-by: NJiri Pirko <jiri@resnulli.us>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e65ad3be
    • H
      cxgb4: Fix PCI-E Memory window interface for big-endian systems · f01aa633
      Hariprasad Shenai 提交于
      When doing reads and writes to adapter memory via the PCI-E Memory Window
      interface, data gets swizzled on 4-byte boundaries on Big-Endian systems
      because we need to account for the register read/write interface which
      incorporates a swizzle onto the Little-Endian PCI-E Bus.
      
      Based on original work by Casey Leedom <leedom@chelsio.com>
      Signed-off-by: NHariprasad Shenai <hariprasad@chelsio.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f01aa633
    • S
      enic: do notify_check before returning credits · 2b0c2e2d
      Sujith Sankar 提交于
      We should complete notify_check before returning the credits. Once we return the
      credits, adaptor may access the notify data.
      Signed-off-by: NSujith Sankar <ssujith@cisco.com>
      Signed-off-by: NGovindarajulu Varadarajan <_govind@gmx.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2b0c2e2d
  3. 27 2月, 2015 1 次提交
    • J
      mac80211: Send EAPOL frames at lowest rate · 9c1c98a3
      Jouni Malinen 提交于
      The current minstrel_ht rate control behavior is somewhat optimistic in
      trying to find optimum TX rate. While this is usually fine for normal
      Data frames, there are cases where a more conservative set of retry
      parameters would be beneficial to make the connection more robust.
      
      EAPOL frames are critical to the authentication and especially the
      EAPOL-Key message 4/4 (the last message in the 4-way handshake) is
      important to get through to the AP. If that message is lost, the only
      recovery mechanism in many cases is to reassociate with the AP and start
      from scratch. This can often be avoided by trying to send the frame with
      more conservative rate and/or with more link layer retries.
      
      In most cases, minstrel_ht is currently using the initial EAPOL-Key
      frames for probing higher rates and this results in only five link layer
      transmission attempts (one at high(ish) MCS and four at MCS0). While
      this works with most APs, it looks like there are some deployed APs that
      may have issues with the EAPOL frames using HT MCS immediately after
      association. Similarly, there may be issues in cases where the signal
      strength or radio environment is not good enough to be able to get
      frames through even at couple of MCS 0 tries.
      
      The best approach for this would likely to be to reduce the TX rate for
      the last rate (3rd rate parameter in the set) to a low basic rate (say,
      6 Mbps on 5 GHz and 2 or 5.5 Mbps on 2.4 GHz), but doing that cleanly
      requires some more effort. For now, we can start with a simple one-liner
      that forces the minimum rate to be used for EAPOL frames similarly how
      the TX rate is selected for the IEEE 802.11 Management frames. This does
      result in a small extra latency added to the cases where the AP would be
      able to receive the higher rate, but taken into account how small number
      of EAPOL frames are used, this is likely to be insignificant. A future
      optimization in the minstrel_ht design can also allow this patch to be
      reverted to get back to the more optimized initial TX rate.
      
      It should also be noted that many drivers that do not use minstrel as
      the rate control algorithm are already doing similar workarounds by
      forcing the lowest TX rate to be used for EAPOL frames.
      
      Cc: stable@vger.kernel.org
      Reported-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Tested-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NJouni Malinen <jouni@qca.qualcomm.com>
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      9c1c98a3
  4. 26 2月, 2015 15 次提交
  5. 25 2月, 2015 4 次提交