1. 05 2月, 2015 3 次提交
    • M
      net/core: Add event for a change in slave state · 61bd3857
      Moni Shoua 提交于
      Add event which provides an indication on a change in the state
      of a bonding slave. The event handler should cast the pointer to the
      appropriate type (struct netdev_bonding_info) in order to get the
      full info about the slave.
      Signed-off-by: NMoni Shoua <monis@mellanox.com>
      Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      61bd3857
    • T
      net: add skb functions to process remote checksum offload · dcdc8994
      Tom Herbert 提交于
      This patch adds skb_remcsum_process and skb_gro_remcsum_process to
      perform the appropriate adjustments to the skb when receiving
      remote checksum offload.
      
      Updated vxlan and gue to use these functions.
      
      Tested: Ran TCP_RR and TCP_STREAM netperf for VXLAN and GUE, did
      not see any change in performance.
      Signed-off-by: NTom Herbert <therbert@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      dcdc8994
    • E
      xps: fix xps for stacked devices · 2bd82484
      Eric Dumazet 提交于
      A typical qdisc setup is the following :
      
      bond0 : bonding device, using HTB hierarchy
      eth1/eth2 : slaves, multiqueue NIC, using MQ + FQ qdisc
      
      XPS allows to spread packets on specific tx queues, based on the cpu
      doing the send.
      
      Problem is that dequeues from bond0 qdisc can happen on random cpus,
      due to the fact that qdisc_run() can dequeue a batch of packets.
      
      CPUA -> queue packet P1 on bond0 qdisc, P1->ooo_okay=1
      CPUA -> queue packet P2 on bond0 qdisc, P2->ooo_okay=0
      
      CPUB -> dequeue packet P1 from bond0
              enqueue packet on eth1/eth2
      CPUC -> dequeue packet P2 from bond0
              enqueue packet on eth1/eth2 using sk cache (ooo_okay is 0)
      
      get_xps_queue() then might select wrong queue for P1, since current cpu
      might be different than CPUA.
      
      P2 might be sent on the old queue (stored in sk->sk_tx_queue_mapping),
      if CPUC runs a bit faster (or CPUB spins a bit on qdisc lock)
      
      Effect of this bug is TCP reorders, and more generally not optimal
      TX queue placement. (A victim bulk flow can be migrated to the wrong TX
      queue for a while)
      
      To fix this, we have to record sender cpu number the first time
      dev_queue_xmit() is called for one tx skb.
      
      We can union napi_id (used on receive path) and sender_cpu,
      granted we clear sender_cpu in skb_scrub_packet() (credit to Willem for
      this union idea)
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Willem de Bruijn <willemb@google.com>
      Cc: Nandita Dukkipati <nanditad@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2bd82484
  2. 03 2月, 2015 1 次提交
  3. 02 2月, 2015 2 次提交
  4. 30 1月, 2015 1 次提交
  5. 28 1月, 2015 3 次提交
    • J
      net/mlx4_core: Adjust command timeouts to conform to the firmware spec · 5a031086
      Jack Morgenstein 提交于
      The firmware spec states that the timeout for all commands should be 60 seconds.
      
      In the past, the spec indicated that there were several classes of timeout
      (short, medium, and long).  The driver has these different timeout classes.
      We leave the class differentiation in the driver as-is (to protect against any
      future spec changes), but set the timeout for all classes to be 60 seconds.
      
      In addition, we fix a few commands which had hard-coded numeric timeouts specified.
      Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: NAmir Vadai <amirv@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5a031086
    • J
      net/mlx4_core: Add bad-cable event support · be6a6b43
      Jack Morgenstein 提交于
      If the firmware can detect a bad cable, allow it to generate an
      event, and print the problem in the log.
      Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: NAmir Vadai <amirv@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      be6a6b43
    • C
      NFC: st21nfca: Adding support for secure element · 2130fb97
      Christophe Ricard 提交于
      st21nfca has 1 physical SWP line and can support up to 2 secure elements
      (UICC & eSE) thanks to an external switch managed with a gpio.
      
      The platform integrator needs to specify thanks to 2 initialization
      properties, uicc-present and ese-present, if it is suppose to have uicc
      and/or ese. Of course if the platform does not have an external switch,
      only one kind of secure element can be supported. Those parameters are
      under platform integrator responsibilities.
      
      During initialization, the white_list will be set according to those
      parameters.
      
      The discovery_se function will assume a secure element is physically
      present according to uicc-present and ese-present values and will add it
      to the secure element list. On ese activation, the atr is retrieved to
      calculate a command exchange timeout based on the first atr(TB) value.
      
      The se_io will allow to transfer data over SWP. 2 kind of events may appear
      after a data is sent over:
      - ST21NFCA_EVT_TRANSMIT_DATA when receiving an apdu answer
      - ST21NFCA_EVT_WTX_REQUEST when the secure element needs more time than
      expected to compute a command. If this timeout expired, a first recovery
      tentative consist to send a simple software reset proprietary command.
      If this tentative still fail, a second recovery tentative consist to send
      a hardware reset proprietary command.
      This function is only relevant for eSE like secure element.
      
      This patch also change the way a pipe is referenced. There can be
      different pipe connected to the same gate with different host destination
      (ex: CONNECTIVITY). In order to keep host information every pipe are
      reference with a tuple (gate, host). In order to reduce changes, we are
      keeping unchanged the way a gate is addressed on the Terminal Host.
      However, this is working because we consider the apdu reader gate is only
      present on the eSE slot also the connectivity gate cannot give a reliable
      value; it will give the latest stored pipe value.
      Signed-off-by: NChristophe Ricard <christophe-h.ricard@st.com>
      Signed-off-by: NSamuel Ortiz <sameo@linux.intel.com>
      2130fb97
  6. 27 1月, 2015 8 次提交
  7. 26 1月, 2015 9 次提交
  8. 24 1月, 2015 1 次提交
  9. 22 1月, 2015 1 次提交
  10. 20 1月, 2015 2 次提交
    • R
      module: remove mod arg from module_free, rename module_memfree(). · be1f221c
      Rusty Russell 提交于
      Nothing needs the module pointer any more, and the next patch will
      call it from RCU, where the module itself might no longer exist.
      Removing the arg is the safest approach.
      
      This just codifies the use of the module_alloc/module_free pattern
      which ftrace and bpf use.
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      Acked-by: NAlexei Starovoitov <ast@kernel.org>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Ley Foon Tan <lftan@altera.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: x86@kernel.org
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: linux-cris-kernel@axis.com
      Cc: linux-kernel@vger.kernel.org
      Cc: linux-mips@linux-mips.org
      Cc: nios2-dev@lists.rocketboards.org
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: sparclinux@vger.kernel.org
      Cc: netdev@vger.kernel.org
      be1f221c
    • R
      module_arch_freeing_init(): new hook for archs before module->module_init freed. · d453cded
      Rusty Russell 提交于
      Archs have been abusing module_free() to clean up their arch-specific
      allocations.  Since module_free() is also (ab)used by BPF and trace code,
      let's keep it to simple allocations, and provide a hook called before
      that.
      
      This means that avr32, ia64, parisc and s390 no longer need to implement
      their own module_free() at all.  avr32 doesn't need module_finalize()
      either.
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
      Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: linux-kernel@vger.kernel.org
      Cc: linux-ia64@vger.kernel.org
      Cc: linux-parisc@vger.kernel.org
      Cc: linux-s390@vger.kernel.org
      d453cded
  11. 19 1月, 2015 1 次提交
  12. 18 1月, 2015 1 次提交
  13. 17 1月, 2015 3 次提交
    • J
      genetlink: synchronize socket closing and family removal · ee1c2442
      Johannes Berg 提交于
      In addition to the problem Jeff Layton reported, I looked at the code
      and reproduced the same warning by subscribing and removing the genl
      family with a socket still open. This is a fairly tricky race which
      originates in the fact that generic netlink allows the family to go
      away while sockets are still open - unlike regular netlink which has
      a module refcount for every open socket so in general this cannot be
      triggered.
      
      Trying to resolve this issue by the obvious locking isn't possible as
      it will result in deadlocks between unregistration and group unbind
      notification (which incidentally lockdep doesn't find due to the home
      grown locking in the netlink table.)
      
      To really resolve this, introduce a "closing socket" reference counter
      (for generic netlink only, as it's the only affected family) in the
      core netlink code and use that in generic netlink to wait for all the
      sockets that are being closed at the same time as a generic netlink
      family is removed.
      
      This fixes the race that when a socket is closed, it will should call
      the unbind, but if the family is removed at the same time the unbind
      will not find it, leading to the warning. The real problem though is
      that in this case the unbind could actually find a new family that is
      registered to have a multicast group with the same ID, and call its
      mcast_unbind() leading to confusing.
      
      Also remove the warning since it would still trigger, but is now no
      longer a problem.
      
      This also moves the code in af_netlink.c to before unreferencing the
      module to avoid having the same problem in the normal non-genl case.
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ee1c2442
    • Y
      PCI: Add pci_claim_bridge_resource() to clip window if necessary · 8505e729
      Yinghai Lu 提交于
      Add pci_claim_bridge_resource() to claim a PCI-PCI bridge window.  This is
      like regular pci_claim_resource(), except that if we fail to claim the
      window, we check to see if we can reduce the size of the window and try
      again.
      
      This is for scenarios like this:
      
        pci_bus 0000:00: root bus resource [mem 0xc0000000-0xffffffff]
        pci 0000:00:01.0:   bridge window [mem 0xbdf00000-0xddefffff 64bit pref]
        pci 0000:01:00.0: reg 0x10: [mem 0xc0000000-0xcfffffff pref]
      
      The 00:01.0 window is illegal: it starts before the host bridge window, so
      we have to assume the [0xbdf00000-0xbfffffff] region is inaccessible.  We
      can make it legal by clipping it to [mem 0xc0000000-0xddefffff 64bit pref].
      
      Previously we discarded the 00:01.0 window and tried to reassign that part
      of the hierarchy from scratch.  That is a problem because Linux doesn't
      always assign things optimally.  For example, in this case, BIOS put the
      01:00.0 device in a prefetchable window below 4GB, but after 5b285415,
      Linux puts the prefetchable window above 4GB where the 32-bit 01:00.0
      device can't use it.
      
      Clipping the 00:01.0 window is less intrusive than completely reassigning
      things and is sufficient to let us use most of the BIOS configuration.  Of
      course, it's possible that devices below 00:01.0 will no longer fit.  If
      that's the case, we'll have to reassign things.  But that's a separate
      problem.
      
      [bhelgaas: changelog, split into separate patch]
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=85491Reported-by: NMarek Kordik <kordikmarek@gmail.com>
      Fixes: 5b285415 ("PCI: Restrict 64-bit prefetchable bridge windows to 64-bit resources")
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
      CC: stable@vger.kernel.org	# v3.16+
      8505e729
    • A
      PCI: Add flag for devices where we can't use bus reset · f331a859
      Alex Williamson 提交于
      Enable a mechanism for devices to quirk that they do not behave when
      doing a PCI bus reset.  We require a modest level of spec compliant
      behavior in order to do a reset, for instance the device should come
      out of reset without throwing errors and PCI config space should be
      accessible after reset.  This is too much to ask for some devices.
      
      Link: http://lkml.kernel.org/r/20140923210318.498dacbd@dualc.maya.orgSigned-off-by: NAlex Williamson <alex.williamson@redhat.com>
      Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
      CC: stable@vger.kernel.org	# v3.14+
      f331a859
  14. 16 1月, 2015 1 次提交
    • Y
      rhashtable: Fix race in rhashtable_destroy() and use regular work_struct · 57699a40
      Ying Xue 提交于
      When we put our declared work task in the global workqueue with
      schedule_delayed_work(), its delay parameter is always zero.
      Therefore, we should define a regular work in rhashtable structure
      instead of a delayed work.
      
      By the way, we add a condition to check whether resizing functions
      are NULL before cancelling the work, avoiding to cancel an
      uninitialized work.
      
      Lastly, while we wait for all work items we submitted before to run
      to completion with cancel_delayed_work(), ht->mutex has been taken in
      rhashtable_destroy(). Moreover, cancel_delayed_work() doesn't return
      until all work items are accomplished, and when work items are
      scheduled, the work's function - rht_deferred_worker() will be called.
      However, as rht_deferred_worker() also needs to acquire the lock,
      deadlock might happen at the moment as the lock is already held before.
      So if the cancel work function is moved out of the lock covered scope,
      this will avoid the deadlock.
      
      Fixes: 97defe1e ("rhashtable: Per bucket locks & deferred expansion/shrinking")
      Signed-off-by: NYing Xue <ying.xue@windriver.com>
      Cc: Thomas Graf <tgraf@suug.ch>
      Acked-by: NThomas Graf <tgraf@suug.ch>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      57699a40
  15. 15 1月, 2015 2 次提交
  16. 14 1月, 2015 1 次提交