1. 12 Nov, 2014 23 commits
    • Merge branch 'skb_alloc_pages' · ee47ad42
      David S. Miller committed
      Alexander Duyck says:
      
      ====================
      Replace __skb_alloc_pages with simpler function
      
      This patch series replaces __skb_alloc_pages with a much simpler function,
      __dev_alloc_pages.  The main difference between the two is that
      __skb_alloc_pages took an sk_buff pointer that was passed as NULL at
      every call site.  In a couple of cases the NULL was passed by variable,
      which led to unnecessary code being run.
      
      As such in order to simplify things the __dev_alloc_pages call only takes a
      mask and the page order being requested.  In addition it takes advantage of
      several behaviors already built into the page allocator so that it can just
      set GFP flags unconditionally.
      
      v2: Renamed functions to dev_alloc_page(s) instead of netdev_alloc_page(s)
          Removed __GFP_COLD flag from usb code as it was redundant
      v3: Update patch descriptions and subjects to match changes in v2
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: Remove __skb_alloc_page and __skb_alloc_pages · 160d2aba
      Alexander Duyck committed
      Remove the two functions which are now dead code.
      Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • fm10k/igb/ixgbe: Replace __skb_alloc_page with dev_alloc_page · 42b17f09
      Alexander Duyck committed
      The Intel drivers were pretty much just using the plain vanilla GFP flags
      in their calls to __skb_alloc_page so this change makes it so that they use
      dev_alloc_page which just uses GFP_ATOMIC for the gfp_flags value.
      
      Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      Cc: Matthew Vick <matthew.vick@intel.com>
      Cc: Don Skidmore <donald.c.skidmore@intel.com>
      Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
      Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • phonet: Replace calls to __skb_alloc_page with __dev_alloc_page · 5693d284
      Alexander Duyck committed
      Replace the calls to __skb_alloc_page that are passed NULL with calls to
      __dev_alloc_page.
      
      In addition remove __GFP_COLD flag from allocations as we only want it for
      the Rx buffer which is taken care of by __dev_alloc_skb, not for any
      secondary allocations such as the queue element transmit descriptors.
      
      Cc: Oliver Neukum <oliver@neukum.org>
      Cc: Felipe Balbi <balbi@ti.com>
      Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • cxgb4/cxgb4vf: Replace __skb_alloc_page with __dev_alloc_page · aa9cd31c
      Alexander Duyck committed
      Drop the bloated use of __skb_alloc_page and replace it with
      __dev_alloc_page.  In addition update the one other spot that is
      allocating a page so that it allocates with the correct flags.
      
      Cc: Hariprasad S <hariprasad@chelsio.com>
      Cc: Casey Leedom <leedom@chelsio.com>
      Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: Add device Rx page allocation function · 71dfda58
      Alexander Duyck committed
      This patch implements __dev_alloc_pages and __dev_alloc_page.  These are
      meant to replace the __skb_alloc_pages and __skb_alloc_page functions.  The
      reason for doing this is that it occurred to me that __skb_alloc_page is
      supposed to be passed an sk_buff pointer, but it is NULL in all cases where
      it is used.  Worse is that in the case of ixgbe it is passed NULL via the
      sk_buff pointer in the rx_buffer info structure which means the compiler is
      not correctly stripping it out.
      
      The naming for these functions is based on dev_alloc_skb and __dev_alloc_skb.
      There was originally a netdev_alloc_page, however that was passed a
      net_device pointer and this function is not so I thought it best to follow
      that naming scheme since that is the same difference between dev_alloc_skb
      and netdev_alloc_skb.
      
      In the case of anything greater than order 0 it is assumed that we want a
      compound page so __GFP_COMP is set for all allocations as we expect a
      compound page when assigning a page frag.
      
      The other change in this patch is to exploit the behaviors of the page
      allocator in how it handles flags.  So for example we can always set
      __GFP_COMP and __GFP_MEMALLOC since they are ignored if they are not
      applicable or are overridden by another flag.
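
      The flag handling described above can be sketched as a small user-space
      model. This is not the kernel source: the MODEL_* bit values, struct
      model_request and the model_* function names are hypothetical stand-ins,
      and the real helpers pass the mask to the page allocator instead of
      returning it.

      ```c
      #include <stdio.h>

      typedef unsigned int gfp_t;

      /* Hypothetical bit values, for illustration only. */
      #define MODEL_GFP_ATOMIC    0x01u
      #define MODEL_GFP_COMP      0x02u
      #define MODEL_GFP_MEMALLOC  0x04u

      /* Records what would be handed to the page allocator. */
      struct model_request {
          gfp_t gfp_mask;
          unsigned int order;
      };

      /* Takes only a mask and an order; no sk_buff pointer is involved. */
      static struct model_request model_dev_alloc_pages(gfp_t gfp_mask,
                                                        unsigned int order)
      {
          /* Set unconditionally, relying on the allocator to ignore flags
           * that do not apply to a given request. */
          gfp_mask |= MODEL_GFP_COMP | MODEL_GFP_MEMALLOC;
          return (struct model_request){ gfp_mask, order };
      }

      /* Single-page convenience wrapper using the atomic mask. */
      static struct model_request model_dev_alloc_page(void)
      {
          return model_dev_alloc_pages(MODEL_GFP_ATOMIC, 0);
      }
      ```

      The point of the design is that callers never branch on the order or on
      the socket state before choosing flags; the extra bits are always set.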
      Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • irda: Remove IRDA_<TYPE> logging macros · 6c91023d
      Joe Perches committed
      And use the more common mechanisms directly.
      
      Other miscellanea:
      
      o Coalesce formats
      o Add missing newlines
      o Realign arguments
      o Remove unnecessary OOM message logging as
        there's a generic stack dump already on OOM.
      Signed-off-by: Joe Perches <joe@perches.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: kill netif_copy_real_num_queues() · 09626e9d
      WANG Cong committed
      vlan was the only user of netif_copy_real_num_queues(),
      but it no longer calls it after
      commit 4af429d2 ("vlan: lockless transmit path").
      So we can just remove it.
      
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: David S. Miller <davem@davemloft.net>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next · 2387e3b5
      David S. Miller committed
      Jeff Kirsher says:
      
      ====================
      Intel Wired LAN Driver Updates 2014-11-11
      
      This series contains updates to i40e, i40evf and ixgbe.
      
      Kamil updated the i40e and i40evf driver to poll the firmware slower
      since we were polling faster than the firmware could respond.
      
      Shannon updates i40e to add a check to keep the service_task from
      running the periodic tasks more than once per second, while still
      allowing quick action to service the events.
      
      Jesse cleans up the throttle rate code by fixing the minimum interrupt
      throttle rate and removing some unused defines.
      
      Mitch makes the early init admin queue message receive code more robust
      by handling messages in a loop and ignoring those that we are not
      interested in.  This also gets rid of some scary log messages that
      really do not indicate a problem.
      
      Don provides several ixgbe patches.  The first fixes an x540 completion
      timeout issue, where topologies with a few levels of PCIe switching can
      run into an unexpected completion error.  He also cleans up the
      functionality in ixgbe_ndo_set_vf_vlan() in preparation for future
      work, and adds support for x550 MACs to the driver.
      
      v2:
       - Remove code comment in patch 01 of the series, based on feedback from
         David Laight
       - Updated the "goto" to "break" statements in patch 06 of the series,
         based on feedback from Sergei Shtylyov
       - Initialized the variable err due to the possibility of use before
         being assigned a value in patch 07 of the series
       - Added patch "ixgbe: add helper function for setting RSS key in
         preparation of X550" since it is needed for the addition of X550 MAC
         support
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • usbnet: smsc95xx: dereferencing NULL pointer · 8bca81d9
      Sudip Mukherjee committed
      We were dereferencing dev to initialize pdata, but the BUG_ON(!dev)
      check came only after that.  So we were dereferencing the pointer
      first and then testing it for NULL.
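
      A minimal user-space sketch of the corrected ordering follows. The
      struct and function names (model_dev, model_priv, model_get_priv) are
      hypothetical, and returning NULL stands in for the driver's BUG_ON();
      the only point is that the NULL test runs before the first dereference.

      ```c
      #include <stddef.h>

      /* Hypothetical stand-ins for the device and its private data. */
      struct model_priv {
          int mac_cr;
      };

      struct model_dev {
          struct model_priv *data;
      };

      /*
       * Problematic pattern described in the commit message:
       *
       *     struct model_priv *pdata = dev->data;   (dereference)
       *     if (!dev) ...                           (test, too late)
       *
       * Corrected pattern: test first, then dereference.
       */
      static struct model_priv *model_get_priv(struct model_dev *dev)
      {
          struct model_priv *pdata;

          if (dev == NULL)
              return NULL;

          pdata = dev->data;
          return pdata;
      }
      ```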
      Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • irda: Simplify IRDA logging macros · d65c4e4e
      Joe Perches committed
      These are the same as net_<level>_ratelimited, so
      use the more common style in the macro definition.
      Signed-off-by: Joe Perches <joe@perches.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • neigh: remove dynamic neigh table registration support · d7480fd3
      WANG Cong committed
      Currently there are only three neigh tables in the whole kernel:
      arp table, ndisc table and decnet neigh table. What's more,
      we don't support registering multiple tables per family.
      Therefore we can just make these tables statically built-in.
      
      Cc: David S. Miller <davem@davemloft.net>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • stmmac: split to core library and probe drivers · b2e2f0c7
      Andy Shevchenko committed
      Instead of registering the platform and PCI drivers in one module, let's move
      the necessary bits to where they belong. During this procedure we convert the
      module registration part to use module_*_driver() macros, which makes the code
      simpler.

      From now on the driver consists of three parts: core library, PCI, and platform
      drivers.
      Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: Convert LIMIT_NETDEBUG to net_dbg_ratelimited · ba7a46f1
      Joe Perches committed
      Use the more common dynamic_debug capable net_dbg_ratelimited
      and remove the LIMIT_NETDEBUG macro.
      
      All messages are still ratelimited.
      
      Some KERN_<LEVEL> uses are changed to KERN_DEBUG.
      
      This may have some negative impact on messages that were
      emitted at KERN_INFO that are not enabled at all unless
      DEBUG is defined or dynamic_debug is enabled.  Even so,
      these messages are now _not_ emitted by default.
      
      This also eliminates the use of the net_msg_warn sysctl
      "/proc/sys/net/core/warnings".  For backward compatibility,
      the sysctl is not removed, but it has no function.  The extern
      declaration of net_msg_warn is removed from sock.h and made
      static in net/core/sysctl_net_core.c
      
      Miscellanea:
      
      o Update the sysctl documentation
      o Remove the embedded uses of pr_fmt
      o Coalesce format fragments
      o Realign arguments
      Signed-off-by: Joe Perches <joe@perches.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • PPC: bpf_jit_comp: add SKF_AD_HATYPE instruction · 5b61c4db
      Denis Kirjanov committed
      Add BPF extension SKF_AD_HATYPE to ppc JIT to check
      the hw type of the interface
      
      Before:
      [   57.723666] test_bpf: #20 LD_HATYPE
      [   57.723675] BPF filter opcode 0020 (@0) unsupported
      [   57.724168] 48 48 PASS
      
      After:
      [  103.053184] test_bpf: #20 LD_HATYPE 7 6 PASS
      
      CC: Alexei Starovoitov <alexei.starovoitov@gmail.com>
      CC: Daniel Borkmann <dborkman@redhat.com>
      CC: Philippe Bergheaud <felix@linux.vnet.ibm.com>
      Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org>

      v2: address Alexei's comments
      Acked-by: Alexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'net_next_ovs' of git://git.kernel.org/pub/scm/linux/kernel/git/pshelar/openvswitch · 4083c805
      David S. Miller committed
      Pravin B Shelar says:
      
      ====================
      Open vSwitch
      
      The following batch of patches brings feature parity between the
      upstream ovs and the out-of-tree ovs module.

      Two features are added. The first adds support for exporting egress
      tunnel information for a packet, which is used to improve
      visibility into network traffic. The second allows the userspace
      vswitchd process to probe ovs module features. The other patches
      are optimizations and code cleanup.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dsa: Use netdev_<level> instead of printk · a2ae6007
      Joe Perches committed
      Neaten and standardize the logging output.
      
      Other miscellanea:
      
      o Use pr_notice_once instead of a guard flag.
      o Convert existing pr_<level> uses too.
      Signed-off-by: Joe Perches <joe@perches.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'mlx4-next' · 008e8165
      David S. Miller committed
      Or Gerlitz says:
      
      ====================
      mlx4: Add CHECKSUM_COMPLETE support
      
      These patches from Shani, Matan and myself add support for
      CHECKSUM_COMPLETE reporting on non TCP/UDP packets such as
      GRE and ICMP. I'd like to deeply thank Jerry Chu for his
      innovation and support in that effort.
      
      Based on the feedback from Eric and Ido Shamay, in V2 we dropped
      the patch which removed the calls to napi_gro_frags() and added
      a patch which makes the RX code go through that path
      regardless of the checksum status.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx4_en: Extend checksum offloading by CHECKSUM COMPLETE · f8c6455b
      Shani Michaeli committed
      When processing received traffic, pass CHECKSUM_COMPLETE status to the
      stack, with calculated checksum for non TCP/UDP packets (such
      as GRE or ICMP).
      
      Although the stack expects a checksum which doesn't include the pseudo
      header, the HW adds it. To address that, we are subtracting the pseudo
      header checksum from the checksum value provided by the HW.
      
      In the IPv6 case, we also compute/add the IP header checksum which
      is not added by the HW for such packets.
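
      The pseudo header subtraction relies on one's-complement arithmetic,
      which can be illustrated with a 16-bit user-space model. This is a
      sketch, not the driver code: the csum16_* helper names are hypothetical,
      and the kernel works on its own checksum types and helpers.

      ```c
      #include <stddef.h>
      #include <stdint.h>

      /* One's-complement addition: fold the carry back into the low 16 bits. */
      static uint16_t csum16_add(uint16_t a, uint16_t b)
      {
          uint32_t s = (uint32_t)a + b;

          return (uint16_t)((s & 0xffffu) + (s >> 16));
      }

      /* One's-complement subtraction: add the bitwise complement of b. */
      static uint16_t csum16_sub(uint16_t a, uint16_t b)
      {
          return csum16_add(a, (uint16_t)~b);
      }

      /* Sum an array of 16-bit words with end-around carry. */
      static uint16_t csum16_words(const uint16_t *words, size_t n)
      {
          uint16_t sum = 0;

          for (size_t i = 0; i < n; i++)
              sum = csum16_add(sum, words[i]);
          return sum;
      }
      ```

      If hw_sum covers both the pseudo header and the packet, then
      csum16_sub(hw_sum, pseudo_sum) recovers the sum over the packet alone,
      which is the value the stack expects in this model.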
      
      Cc: Jerry Chu <hkchu@google.com>
      Signed-off-by: Shani Michaeli <shanim@mellanox.com>
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx4_en: Extend usage of napi_gro_frags · dd65beac
      Shani Michaeli committed
      We can call napi_gro_frags for all the received traffic regardless
      of the checksum status. Specifically, received packets whose status
      is CHECKSUM_NONE (and soon to be added CHECKSUM_COMPLETE)
      are eligible for napi_gro_frags as well.
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: Shani Michaeli <shanim@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'so_incoming_cpu' · b00394c0
      David S. Miller committed
      Eric Dumazet says:
      
      ====================
      net: SO_INCOMING_CPU support
      
      The SO_INCOMING_CPU socket option (read by getsockopt()) provides
      an alternative to RPS/RFS for high performance servers using
      multiqueue NICs.
      
      TCP should use sk_mark_napi_id() for established sockets only.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: introduce SO_INCOMING_CPU · 2c8c56e1
      Eric Dumazet committed
      An alternative to RPS/RFS is to use hardware support for multiple
      queues.

      Then split a set of millions of sockets among worker threads, each
      one using epoll() to manage events on its own socket pool.

      Ideally, we want one thread per RX/TX queue/cpu, but we have no way to
      know after accept() or connect() on which queue/cpu a socket is managed.

      We normally use one cpu per RX queue (IRQ smp_affinity being properly
      set), so remembering in the socket structure which cpu delivered the
      last packet is enough to solve the problem.

      After accept(), connect(), or even file descriptor passing between
      processes, applications can use:
      
       int cpu;
       socklen_t len = sizeof(cpu);
      
       getsockopt(fd, SOL_SOCKET, SO_INCOMING_CPU, &cpu, &len);
      
      And use this information to put the socket into the right silo
      for optimal performance, as the whole networking stack should then run
      on the appropriate cpu, without the need to send IPIs (RPS/RFS).
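
      One way an application might wrap this is sketched below. The helper
      names incoming_cpu() and pick_worker() and the modulo silo policy are
      assumptions for illustration, not part of the patch; the fallback value
      49 for SO_INCOMING_CPU is the asm-generic definition and is only an
      assumption for systems with older headers (some architectures differ).

      ```c
      #include <sys/socket.h>

      #ifndef SO_INCOMING_CPU
      #define SO_INCOMING_CPU 49  /* assumed asm-generic value for old headers */
      #endif

      /* Return the cpu reported for the socket, or -1 if it is unavailable
       * (bad descriptor, or a kernel without the option). */
      static int incoming_cpu(int fd)
      {
          int cpu = -1;
          socklen_t len = sizeof(cpu);

          if (getsockopt(fd, SOL_SOCKET, SO_INCOMING_CPU, &cpu, &len) < 0)
              return -1;
          return cpu;
      }

      /* Hypothetical silo policy: map the cpu onto one of nworkers worker
       * threads, falling back to worker 0 when the cpu is unknown. */
      static int pick_worker(int cpu, int nworkers)
      {
          if (cpu < 0 || nworkers <= 0)
              return 0;
          return cpu % nworkers;
      }
      ```

      A server could call pick_worker(incoming_cpu(fd), nworkers) right after
      accept() and hand the descriptor to that worker's epoll set.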
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: move sk_mark_napi_id() at the right place · 3d97379a
      Eric Dumazet committed
      sk_mark_napi_id() is used to record, for a flow, the napi id of incoming
      packets for the sake of busy polling.
      We should do this only on established flows, not on listeners.

      This was 'working' by virtue of the socket cloning, but doing
      this on SYN packets is unnecessary cache line dirtying.

      Even if we move sk_napi_id into the same cache line as sk_lock,
      we are working to make SYN processing lockless, so it is desirable
      to set sk_napi_id only for established flows.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 11 Nov, 2014 17 commits