1. 10 10月, 2014 8 次提交
    • L
      lib/genalloc.c: add genpool range check function · 9efb3a42
      Laura Abbott 提交于
      After allocating an address from a particular genpool, there is no good
      way to verify if that address actually belongs to a genpool.  Introduce
      addr_in_gen_pool which will return if an address plus size falls
      completely within the genpool range.
      Signed-off-by: NLaura Abbott <lauraa@codeaurora.org>
      Acked-by: NWill Deacon <will.deacon@arm.com>
      Reviewed-by: NOlof Johansson <olof@lixom.net>
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: David Riley <davidriley@chromium.org>
      Cc: Ritesh Harjain <ritesh.harjani@gmail.com>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Thierry Reding <thierry.reding@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      9efb3a42
    • L
      lib/genalloc.c: add power aligned algorithm · 505e3be6
      Laura Abbott 提交于
      One of the more common algorithms used for allocation is to align the
      start address of the allocation to the order of size requested.  Add this
      as an algorithm option for genalloc.
      Signed-off-by: NLaura Abbott <lauraa@codeaurora.org>
      Acked-by: NWill Deacon <will.deacon@arm.com>
      Acked-by: NOlof Johansson <olof@lixom.net>
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: David Riley <davidriley@chromium.org>
      Cc: Ritesh Harjain <ritesh.harjani@gmail.com>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Thierry Reding <thierry.reding@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      505e3be6
    • Z
      memory-hotplug: add sysfs valid_zones attribute · ed2f2400
      Zhang Zhen 提交于
      Currently memory-hotplug has two limits:
      
      1. If the memory block is in ZONE_NORMAL, you can change it to
         ZONE_MOVABLE, but this memory block must be adjacent to ZONE_MOVABLE.
      
      2. If the memory block is in ZONE_MOVABLE, you can change it to
         ZONE_NORMAL, but this memory block must be adjacent to ZONE_NORMAL.
      
      With this patch, we can easy to know a memory block can be onlined to
      which zone, and don't need to know the above two limits.
      
      Updated the related Documentation.
      
      [akpm@linux-foundation.org: use conventional comment layout]
      [akpm@linux-foundation.org: fix build with CONFIG_MEMORY_HOTREMOVE=n]
      [akpm@linux-foundation.org: remove unused local zone_prev]
      Signed-off-by: NZhang Zhen <zhenzhang.zhang@huawei.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ed2f2400
    • J
      mm/slab: use percpu allocator for cpu cache · bf0dea23
      Joonsoo Kim 提交于
      Because of chicken and egg problem, initialization of SLAB is really
      complicated.  We need to allocate cpu cache through SLAB to make the
      kmem_cache work, but before initialization of kmem_cache, allocation
      through SLAB is impossible.
      
      On the other hand, SLUB does initialization in a more simple way.  It uses
      percpu allocator to allocate cpu cache so there is no chicken and egg
      problem.
      
      So, this patch try to use percpu allocator in SLAB.  This simplifies the
      initialization step in SLAB so that we could maintain SLAB code more
      easily.
      
      In my testing there is no performance difference.
      
      This implementation relies on percpu allocator.  Because percpu allocator
      uses vmalloc address space, vmalloc address space could be exhausted by
      this change on many cpu system with *32 bit* kernel.  This implementation
      can cover 1024 cpus in worst case by following calculation.
      
      Worst: 1024 cpus * 4 bytes for pointer * 300 kmem_caches *
      	120 objects per cpu_cache = 140 MB
      Normal: 1024 cpus * 4 bytes for pointer * 150 kmem_caches(slab merge) *
      	80 objects per cpu_cache = 46 MB
      Signed-off-by: NJoonsoo Kim <iamjoonsoo.kim@lge.com>
      Acked-by: NChristoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Jeremiah Mahler <jmmahler@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      bf0dea23
    • J
      topology: add support for node_to_mem_node() to determine the fallback node · ad2c8144
      Joonsoo Kim 提交于
      Anton noticed (http://www.spinics.net/lists/linux-mm/msg67489.html) that
      on ppc LPARs with memoryless nodes, a large amount of memory was consumed
      by slabs and was marked unreclaimable.  He tracked it down to slab
      deactivations in the SLUB core when we allocate remotely, leading to poor
      efficiency always when memoryless nodes are present.
      
      After much discussion, Joonsoo provided a few patches that help
      significantly.  They don't resolve the problem altogether:
      
       - memory hotplug still needs testing, that is when a memoryless node
         becomes memory-ful, we want to dtrt
       - there are other reasons for going off-node than memoryless nodes,
         e.g., fully exhausted local nodes
      
      Neither case is resolved with this series, but I don't think that should
      block their acceptance, as they can be explored/resolved with follow-on
      patches.
      
      The series consists of:
      
      [1/3] topology: add support for node_to_mem_node() to determine the
            fallback node
      
      [2/3] slub: fallback to node_to_mem_node() node if allocating on
            memoryless node
      
            - Joonsoo's patches to cache the nearest node with memory for each
              NUMA node
      
      [3/3] Partial revert of 81c98869 (""kthread: ensure locality of
            task_struct allocations")
      
       - At Tejun's request, keep the knowledge of memoryless node fallback
         to the allocator core.
      
      This patch (of 3):
      
      We need to determine the fallback node in slub allocator if the allocation
      target node is memoryless node.  Without it, the SLUB wrongly select the
      node which has no memory and can't use a partial slab, because of node
      mismatch.  Introduced function, node_to_mem_node(X), will return a node Y
      with memory that has the nearest distance.  If X is memoryless node, it
      will return nearest distance node, but, if X is normal node, it will
      return itself.
      
      We will use this function in following patch to determine the fallback
      node.
      Signed-off-by: NJoonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: NNishanth Aravamudan <nacc@linux.vnet.ibm.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Han Pingtian <hanpt@linux.vnet.ibm.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ad2c8144
    • J
      mm/sl[ao]b: always track caller in kmalloc_(node_)track_caller() · 61f47105
      Joonsoo Kim 提交于
      Now, we track caller if tracing or slab debugging is enabled.  If they are
      disabled, we could save one argument passing overhead by calling
      __kmalloc(_node)().  But, I think that it would be marginal.  Furthermore,
      default slab allocator, SLUB, doesn't use this technique so I think that
      it's okay to change this situation.
      
      After this change, we can turn on/off CONFIG_DEBUG_SLAB without full
      kernel build and remove some complicated '#if' defintion.  It looks more
      benefitial to me.
      Signed-off-by: NJoonsoo Kim <iamjoonsoo.kim@lge.com>
      Acked-by: NChristoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      61f47105
    • J
      mm/slab_common: move kmem_cache definition to internal header · 07f361b2
      Joonsoo Kim 提交于
      We don't need to keep kmem_cache definition in include/linux/slab.h if we
      don't need to inline kmem_cache_size().  According to my code inspection,
      this function is only called at lc_create() in lib/lru_cache.c which may
      be called at initialization phase of something, so we don't need to inline
      it.  Therfore, move it to slab_common.c and move kmem_cache definition to
      internal header.
      
      After this change, we can change kmem_cache definition easily without full
      kernel build.  For instance, we can turn on/off CONFIG_SLUB_STATS without
      full kernel build.
      
      [akpm@linux-foundation.org: export kmem_cache_size() to modules]
      [rdunlap@infradead.org: add header files to fix kmemcheck.c build errors]
      Signed-off-by: NJoonsoo Kim <iamjoonsoo.kim@lge.com>
      Acked-by: NChristoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
      Signed-off-by: NRandy Dunlap <rdunlap@infradead.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      07f361b2
    • O
      proc/maps: make vm_is_stack() logic namespace-friendly · 58cb6548
      Oleg Nesterov 提交于
      - Rename vm_is_stack() to task_of_stack() and change it to return
        "struct task_struct *" rather than the global (and thus wrong in
        general) pid_t.
      
      - Add the new pid_of_stack() helper which calls task_of_stack() and
        uses the right namespace to report the correct pid_t.
      
        Unfortunately we need to define this helper twice, in task_mmu.c
        and in task_nommu.c. perhaps it makes sense to add fs/proc/util.c
        and move at least pid_of_stack/task_of_stack there to avoid the
        code duplication.
      
      - Change show_map_vma() and show_numa_map() to use the new helper.
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Greg Ungerer <gerg@uclinux.org>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      58cb6548
  2. 09 10月, 2014 3 次提交
  3. 08 10月, 2014 2 次提交
    • E
      net: better IFF_XMIT_DST_RELEASE support · 02875878
      Eric Dumazet 提交于
      Testing xmit_more support with netperf and connected UDP sockets,
      I found strange dst refcount false sharing.
      
      Current handling of IFF_XMIT_DST_RELEASE is not optimal.
      
      Dropping dst in validate_xmit_skb() is certainly too late in case
      packet was queued by cpu X but dequeued by cpu Y
      
      The logical point to take care of drop/force is in __dev_queue_xmit()
      before even taking qdisc lock.
      
      As Julian Anastasov pointed out, need for skb_dst() might come from some
      packet schedulers or classifiers.
      
      This patch adds new helper to cleanly express needs of various drivers
      or qdiscs/classifiers.
      
      Drivers that need skb_dst() in their ndo_start_xmit() should call
      following helper in their setup instead of the prior :
      
      	dev->priv_flags &= ~IFF_XMIT_DST_RELEASE;
      ->
      	netif_keep_dst(dev);
      
      Instead of using a single bit, we use two bits, one being
      eventually rebuilt in bonding/team drivers.
      
      The other one, is permanent and blocks IFF_XMIT_DST_RELEASE being
      rebuilt in bonding/team. Eventually, we could add something
      smarter later.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      02875878
    • P
      net: phy: adjust fixed_phy_register() return value · fd2ef0ba
      Petri Gynther 提交于
      Adjust fixed_phy_register() to return struct phy_device *, so that
      it becomes easy to use fixed PHYs without device tree support:
      
        phydev = fixed_phy_register(PHY_POLL, &fixed_phy_status, NULL);
        fixed_phy_set_link_update(phydev, fixed_phy_link_update);
        phy_connect_direct(netdev, phydev, handler_fn, phy_interface);
      
      This change is a prerequisite for modifying bcmgenet driver to work
      without a device tree on Broadcom's MIPS-based 7xxx platforms.
      Signed-off-by: NPetri Gynther <pgynther@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      fd2ef0ba
  4. 07 10月, 2014 1 次提交
  5. 06 10月, 2014 1 次提交
  6. 05 10月, 2014 1 次提交
    • V
      net: Cleanup skb cloning by adding SKB_FCLONE_FREE · c8753d55
      Vijay Subramanian 提交于
      SKB_FCLONE_UNAVAILABLE has overloaded meaning depending on type of skb.
      1: If skb is allocated from head_cache, it indicates fclone is not available.
      2: If skb is a companion fclone skb (allocated from fclone_cache), it indicates
      it is available to be used.
      
      To avoid confusion for case 2 above, this patch  replaces
      SKB_FCLONE_UNAVAILABLE with SKB_FCLONE_FREE where appropriate. For fclone
      companion skbs, this indicates it is free for use.
      
      SKB_FCLONE_UNAVAILABLE will now simply indicate skb is from head_cache and
      cannot / will not have a companion fclone.
      Signed-off-by: NVijay Subramanian <subramanian.vijay@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c8753d55
  7. 04 10月, 2014 7 次提交
  8. 02 10月, 2014 3 次提交
    • P
      d068b02c
    • T
      udp: Generalize skb_udp_segment · 8bce6d7d
      Tom Herbert 提交于
      skb_udp_segment is the function called from udp4_ufo_fragment to
      segment a UDP tunnel packet. This function currently assumes
      segmentation is transparent Ethernet bridging (i.e. VXLAN
      encapsulation). This patch generalizes the function to
      operate on either Ethertype or IP protocol.
      
      The inner_protocol field must be set to the protocol of the inner
      header. This can now be either an Ethertype or an IP protocol
      (in a union). A new flag in the skbuff indicates which type is
      effective. skb_set_inner_protocol and skb_set_inner_ipproto
      helper functions were added to set the inner_protocol. These
      functions are called from the point where the tunnel encapsulation
      is occuring.
      
      When skb_udp_tunnel_segment is called, the function to segment the
      inner packet is selected based on the inner IP or Ethertype. In the
      case of an IP protocol encapsulation, the function is derived from
      inet[6]_offloads. In the case of Ethertype, skb->protocol is
      set to the inner_protocol and skb_mac_gso_segment is called. (GRE
      currently does this, but it might be possible to lookup the protocol
      in offload_base and call the appropriate segmenation function
      directly).
      Signed-off-by: NTom Herbert <therbert@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8bce6d7d
    • E
      net: cleanup and document skb fclone layout · d0bf4a9e
      Eric Dumazet 提交于
      Lets use a proper structure to clearly document and implement
      skb fast clones.
      
      Then, we might experiment more easily alternative layouts.
      
      This patch adds a new skb_fclone_busy() helper, used by tcp and xfrm,
      to stop leaking of implementation details.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d0bf4a9e
  9. 01 10月, 2014 3 次提交
    • B
      HID: wacom: implement generic HID handling for pen generic devices · 7704ac93
      Benjamin Tissoires 提交于
      ISDv4 and v5 are plain HID devices. We can directly implement a generic
      HID parsing/handling and remove the need to manually add those PID in
      the list of supported devices.
      
      This patch implements the pen support only. The finger part will come in
      a later patch.
      
      To be properly notified of an .event() and a .report(), we need to force
      hid-core to go through the HID parsing. By default, wacom.ko binds only
      hidraw, so the hid parsing is not done by hid-core. When a true HID device
      is there, we add the flag HID_CLAIMED_DRIVER to hid->claimed which will
      force hid-core to parse the incoming reports.
      (Note that this can be easily backported by directly setting the .claimed
      flag to HID_CLAIMED_DRIVER even if hid-core does not support
      HID_CONNECT_DRIVER)
      Signed-off-by: NBenjamin Tissoires <benjamin.tissoires@redhat.com>
      Acked-by: NJason Gerecke <killertofu@gmail.com>
      Signed-off-by: NJiri Kosina <jkosina@suse.cz>
      7704ac93
    • M
      net/mlx4_core: New init and exit flow for mlx4_core · e1c00e10
      Majd Dibbiny 提交于
      In the new flow, we separate the pci initialization and teardown
      from the initialization and teardown of the other resources.
      
      __mlx4_init_one handles the pci resources initialization. It then
      calls mlx4_load_one to initialize the remainder of the resources.
      
      When removing a device, mlx4_remove_one is invoked. However, now
      mlx4_remove_one calls mlx4_unload_one to free all the resources except the pci
      resources. When mlx4_unload_one returns, mlx4_remove_one then frees the
      pci resources.
      
      The above separation will allow us to implement 'reset flow' in the future.
      It will also enable more EQs for VFs and is a pre-step to the modern API to
      enable/disable SRIOV.
      
      Also added nvfs; an integer array of size MLX4_MAX_PORTS + 1; to the mlx4_dev
      struct. This new field is used to avoid parsing the num_vfs module parameter
      each time the mlx4_restart_one is called.
      Signed-off-by: NMajd Dibbiny <majd@mellanox.com>
      Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e1c00e10
    • H
      bcma: register bcma as device tree driver · 2101e533
      Hauke Mehrtens 提交于
      This driver is used by the bcm53xx ARM SoC code. Now it is possible to
      give the address of the chipcommon core in device tree and bcma will
      search for all the other cores.
      Signed-off-by: NHauke Mehrtens <hauke@hauke-m.de>
      Acked-by: NArnd Bergmann <arnd@arndb.de>
      Signed-off-by: NJohn W. Linville <linville@tuxdriver.com>
      2101e533
  10. 30 9月, 2014 7 次提交
    • S
      tty: serial: 8250: use 32bit variable for rpm_tx_active · baeb7ef3
      Sebastian Andrzej Siewior 提交于
      The kbuild test robot wrote me:
      |  make.cross ARCH=powerpc
      |>> ERROR: ".__xchg_called_with_bad_pointer" [drivers/tty/serial/8250/8250.ko] undefined!
      
      The generic implementation of xchg() on arm and x86 works for variables of
      size one bye (char). According to the report powerpc does not support
      xchg() for one byte sized variables and looking further it seems also to
      be the same case for sparc and tile (or for 10 out of 26 architectures
      which provide a custom implementation).
      For that reason I increase the size of the variable from one to four
      bytes to get it work on powerpc (and the others).
      Reported-By: Nkbuild test robot <fengguang.wu@intel.com>
      Signed-off-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      baeb7ef3
    • M
      macvlan: add source mode · 79cf79ab
      Michael Braun 提交于
      This patch adds a new mode of operation to macvlan, called "source".
      It allows one to set a list of allowed mac address, which is used
      to match against source mac address from received frames on underlying
      interface.
      This enables creating mac based VLAN associations, instead of standard
      port or tag based. The feature is useful to deploy 802.1x mac based
      behavior, where drivers of underlying interfaces doesn't allows that.
      
      Configuration is done through the netlink interface using e.g.:
       ip link add link eth0 name macvlan0 type macvlan mode source
       ip link add link eth0 name macvlan1 type macvlan mode source
       ip link set link dev macvlan0 type macvlan macaddr add 00:11:11:11:11:11
       ip link set link dev macvlan0 type macvlan macaddr add 00:22:22:22:22:22
       ip link set link dev macvlan0 type macvlan macaddr add 00:33:33:33:33:33
       ip link set link dev macvlan1 type macvlan macaddr add 00:33:33:33:33:33
       ip link set link dev macvlan1 type macvlan macaddr add 00:44:44:44:44:44
      
      This allows clients with MAC addresses 00:11:11:11:11:11,
      00:22:22:22:22:22 to be part of only VLAN associated with macvlan0
      interface. Clients with MAC addresses 00:44:44:44:44:44 with only VLAN
      associated with macvlan1 interface. And client with MAC address
      00:33:33:33:33:33 to be associated with both VLANs.
      
      Based on work of Stefan Gula <steweg@gmail.com>
      
      v8: last version of Stefan Gula for Kernel 3.2.1
      v9: rework onto linux-next 2014-03-12 by Michael Braun
          add MACADDR_SET command, enable to configure mac for source mode
          while creating interface
      v10:
        - reduce indention level
        - rename source_list to source_entry
        - use aligned 64bit ether address
        - use hash_64 instead of addr[5]
      v11:
        - rebase for 3.14 / linux-next 20.04.2014
      v12
        - rebase for linux-next 2014-09-25
      Signed-off-by: NMichael Braun <michael-dev@fami-braun.de>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      79cf79ab
    • M
      ARCNET: add support for multi interfaces on com20020 · c51da42a
      Michael Grzeschik 提交于
      The com20020-pci driver is currently designed to instance
      one netdev with one pci device. This patch adds support to
      instance many cards with one pci device, depending on the device
      data in the private data.
      Signed-off-by: NMichael Grzeschik <m.grzeschik@pengutronix.de>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c51da42a
    • M
      ARCNET: add com20020 PCI IDs with metadata · 8c14f9c7
      Michael Grzeschik 提交于
      This patch adds metadata for the com20020 to prepare for devices with
      multiple io address areas with multi card interfaces.
      Signed-off-by: NMichael Grzeschik <m.grzeschik@pengutronix.de>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8c14f9c7
    • A
      NFSD: Implement SEEK · 24bab491
      Anna Schumaker 提交于
      This patch adds server support for the NFS v4.2 operation SEEK, which
      returns the position of the next hole or data segment in a file.
      Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      24bab491
    • A
      NFSD: Add generic v4.2 infrastructure · 87a15a80
      Anna Schumaker 提交于
      It's cleaner to introduce everything at once and have the server reply
      with "not supported" than it would be to introduce extra operations when
      implementing a specific one in the middle of the list.
      Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      87a15a80
    • E
      net: reorganize sk_buff for faster __copy_skb_header() · b1937227
      Eric Dumazet 提交于
      With proliferation of bit fields in sk_buff, __copy_skb_header() became
      quite expensive, showing as the most expensive function in a GSO
      workload.
      
      __copy_skb_header() performance is also critical for non GSO TCP
      operations, as it is used from skb_clone()
      
      This patch carefully moves all the fields that were not copied in a
      separate zone : cloned, nohdr, fclone, peeked, head_frag, xmit_more
      
      Then I moved all other fields and all other copied fields in a section
      delimited by headers_start[0]/headers_end[0] section so that we
      can use a single memcpy() call, inlined by compiler using long
      word load/stores.
      
      I also tried to make all copies in the natural orders of sk_buff,
      to help hardware prefetching.
      
      I made sure sk_buff size did not change.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b1937227
  11. 29 9月, 2014 4 次提交