1. 09 Oct, 2015 22 commits
  2. 08 Oct, 2015 18 commits
    • D
      Merge branch 'bpf_random32' · df718423
      David S. Miller committed
      Daniel Borkmann says:
      
      ====================
      BPF/random32 updates
      
      BPF update to split the prandom state apart, and to move the
      *once helpers to the core. For details, please see individual
      patches. Given the changes and since it's in the tree for
      quite some time, net-next is a better choice in our opinion.
      
      v1 -> v2:
       - Make DO_ONCE() type-safe, remove the kvec helper. Credits
         go to Alexei Starovoitov for the __VA_ARGS__ hint, thanks!
       - Add a comment to the DO_ONCE() helper as suggested by Alexei.
       - Rework prandom_init_once() helper to the new API.
       - Keep Alexei's Acked-by on the last patch.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      df718423
    • D
      bpf: split state from prandom_u32() and consolidate {c, e}BPF prngs · 3ad00405
      Daniel Borkmann committed
      While recently arguing on a seccomp discussion that raw prandom_u32()
      access shouldn't be exposed to unprivileged user space, I forgot the
      fact that SKF_AD_RANDOM extension actually already does it for some time
      in cBPF via commit 4cd3675e ("filter: added BPF random opcode").
      
      Since prandom_u32() is being used in a lot of critical networking code,
      let's be more conservative and split their states. Furthermore, consolidate
      eBPF and cBPF prandom handlers to use the new internal PRNG. For eBPF,
      bpf_get_prandom_u32() was only accessible for privileged users, but
      should that change one day, we also don't want to leak raw sequences
      through things like eBPF maps.
      
      One thought was also to have own per bpf_prog states, but due to ABI
      reasons this is not easily possible, i.e. the program code currently
      cannot access bpf_prog itself, and copying the rnd_state to/from the
      stack scratch space whenever a program uses the prng seems not really
      worth the trouble and seems too hacky. If needed, taus113 could in such
      cases be implemented within eBPF using a map entry to keep the state
      space, or get_random_bytes() could become a second helper in cases where
      performance would not be critical.
      
      Both sides can trigger a one-time late init via prandom_init_once() on
      the shared state. Performance-wise, there should even be a tiny gain
      as bpf_user_rnd_u32() saves one function call. The PRNG needs to live
      inside the BPF core since kernels could have a NET-less config as well.
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Acked-by: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Chema Gonzalez <chema@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3ad00405
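The state split described in this commit can be illustrated with a small userspace sketch. It is not the kernel code: the generator is a plain xorshift32 stand-in rather than taus113, the seeds are fixed placeholders rather than entropy from get_random_bytes(), and all names ending in `_sketch`, as well as `step()` and `core_rnd_u32()`, are hypothetical. The point shown is only that BPF programs draw from their own late-initialized state, so their output neither advances nor reveals the sequence used by core code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct rnd_state (the kernel uses taus113). */
struct rnd_state_sketch {
	uint32_t s;
};

/* One xorshift32 step; a placeholder generator, not the kernel's PRNG. */
static uint32_t step(struct rnd_state_sketch *st)
{
	uint32_t x = st->s;

	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	st->s = x;
	return x;
}

/* State used by "core" callers, analogous to prandom_u32(). */
static struct rnd_state_sketch core_state = { 0x12345678u };

/* Separate state reserved for BPF programs, seeded on first use. */
static struct rnd_state_sketch bpf_state;
static bool bpf_seeded;

static uint32_t core_rnd_u32(void)
{
	return step(&core_state);
}

static uint32_t bpf_user_rnd_u32_sketch(void)
{
	if (!bpf_seeded) {
		/* Assumption: fixed seed; the real code seeds from entropy. */
		bpf_state.s = 0x9e3779b9u;
		bpf_seeded = true;
	}
	return step(&bpf_state);
}
```

Because the two states are independent, any number of calls to `bpf_user_rnd_u32_sketch()` leaves the next value of `core_rnd_u32()` unchanged.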
    • D
      random32: add prandom_init_once helper for own rngs · 897ece56
      Daniel Borkmann committed
      Add a prandom_init_once() facility that works on the rnd_state, so that
      users that are keeping their own state independent from prandom_u32() can
      initialize their taus113 per cpu states.
      
      The motivation here is similar to net_get_random_once(): initialize the
      state as late as possible in the hope that enough entropy has been
      collected for the seeding. prandom_init_once() makes use of the recently
      introduced prandom_seed_full_state() helper and is generic enough so that
      it could also be used on fast-paths due to the DO_ONCE().
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      897ece56
    • D
      random32: add prandom_seed_full_state helper · 0dd50d1b
      Daniel Borkmann committed
      Factor out the full reseed handling code that populates the state
      through get_random_bytes() and runs prandom_warmup(). The resulting
      prandom_seed_full_state() will be used later on in more than the
      current __prandom_reseed() user. Fix also two minor whitespace
      issues along the way.
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0dd50d1b
    • H
      once: make helper generic for calling functions once · c90aeb94
      Hannes Frederic Sowa committed
      Make the get_random_once() helper generic enough, so that functions
      in general would only be called once, where one user of this is then
      net_get_random_once().
      
      The only implementation specific call is to get_random_bytes(), all
      the rest of this *_once() facility would be duplicated among different
      subsystems otherwise. The new DO_ONCE() helper will be used by prandom()
      later on, but might also be useful for other scenarios/subsystems as
      well where a one-time initialization in often-called, possibly fast
      path code could occur.
      Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c90aeb94
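The idea behind a generic call-once helper can be sketched in userspace C as follows. This is only an analogue under stated assumptions, not the kernel's DO_ONCE(): the macro name `DO_ONCE_SKETCH`, the helper `fill_bytes()` (a stand-in for get_random_bytes()), and `hot_path()` are hypothetical, and the sketch is deliberately not thread-safe, whereas a real implementation has to handle concurrent callers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical userspace analogue of a call-once helper.
 * Each expansion owns a static flag, so every call site runs
 * func(args...) at most once. Not thread-safe.
 */
#define DO_ONCE_SKETCH(func, ...)		\
	do {					\
		static bool done_;		\
		if (!done_) {			\
			func(__VA_ARGS__);	\
			done_ = true;		\
		}				\
	} while (0)

/* Counts how often the initializer actually ran. */
static int fill_calls;

/* Stand-in for get_random_bytes(): fills buf with a fixed byte. */
static void fill_bytes(void *buf, size_t len)
{
	assert(buf != NULL);
	memset(buf, 0xA5, len);
	fill_calls++;
}

/* Often-called function that initializes its secret lazily, once. */
static unsigned char hot_path(void)
{
	static unsigned char secret[4];

	DO_ONCE_SKETCH(fill_bytes, secret, sizeof(secret));
	return secret[0];
}
```

Passing the arguments through `__VA_ARGS__` lets the compiler check them against the prototype of the called function, which is the type-safety property the v2 notes of the series mention.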
    • H
      net: move net_get_random_once to lib · 46234253
      Hannes Frederic Sowa committed
      There's no good reason why users outside of networking should not
      be using this facility, f.e. for initializing their seeds.
      
      Therefore, make it accessible from there as get_random_once().
      Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      46234253
    • D
      net: Do not drop to make_route if oif is l3mdev · 28335a74
      David Ahern committed
      Commit deaa0a6a ("net: Lookup actual route when oif is VRF device")
      exposed a bug in __ip_route_output_key_hash for VRF devices: on FIB lookup
      failure if the oif is specified the current logic drops to make_route on
      the assumption that the route tables are wrong. For VRF/L3 master devices
      this leads to wrong dst entries and route lookups. For example:
          $ ip route ls table vrf-red
          unreachable default
          broadcast 10.2.1.0 dev eth1  proto kernel  scope link  src 10.2.1.2
          10.2.1.0/24 dev eth1  proto kernel  scope link  src 10.2.1.2
          local 10.2.1.2 dev eth1  proto kernel  scope host  src 10.2.1.2
          broadcast 10.2.1.255 dev eth1  proto kernel  scope link  src 10.2.1.2
      
          $ ip route get oif vrf-red 1.1.1.1
          1.1.1.1 dev vrf-red  src 10.0.0.2
              cache
      
      With this patch:
          $  ip route get oif vrf-red 1.1.1.1
          RTNETLINK answers: No route to host
      
      which is the correct response based on the default route.
      Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      28335a74
    • D
      bpf, skb_do_redirect: clear sender_cpu before xmit · cfc81b50
      Daniel Borkmann committed
      Similar to commit c29390c6 ("xps: must clear sender_cpu before
      forwarding"), we also need to clear the skb->sender_cpu when moving
      from RX to TX via skb_do_redirect() due to the shared location of
      napi_id (used on RX) and sender_cpu (used on TX).
      
      Fixes: 27b29f63 ("bpf: add bpf_redirect() helper")
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      cfc81b50
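Why a stale RX value matters on the TX side can be shown with a toy model of the shared storage. The struct and function names below are hypothetical and greatly simplified compared to struct sk_buff; the only assumption taken from the commit message is that napi_id and sender_cpu occupy the same location.

```c
#include <assert.h>
#include <stddef.h>

/* Toy packet descriptor; only models the shared napi_id/sender_cpu slot. */
struct pkt_sketch {
	union {
		unsigned int napi_id;    /* meaningful on the RX path */
		unsigned int sender_cpu; /* meaningful on the TX path */
	};
	int on_tx_path;
};

/* Reset the TX-side view of the shared slot. */
static void sender_cpu_clear_sketch(struct pkt_sketch *pkt)
{
	pkt->sender_cpu = 0;
}

/*
 * Hypothetical redirect step: before the packet is handed from RX to TX,
 * the leftover napi_id must not be misread as a sender CPU.
 */
static void redirect_to_tx_sketch(struct pkt_sketch *pkt)
{
	assert(pkt != NULL);
	sender_cpu_clear_sketch(pkt);
	pkt->on_tx_path = 1;
}
```

Without the clearing step, the TX side would read whatever napi_id the RX path stored and treat it as a CPU number.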
    • A
      net: hns: fix 32-bit build warning · dfdd7230
      Arnd Bergmann committed
      The recently added hns driver causes a build warning in ARM
      allmodconfig builds:
      
      drivers/net/ethernet/hisilicon/hns/hnae.c: In function 'handles_show':
      drivers/net/ethernet/hisilicon/hns/hnae.c:452:13: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
                j, (u64)h->qs[i]->io_base);
                   ^
      
      This removes the pointless cast and prints the pointer address using
      the "%p" format string in all three locations.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      dfdd7230
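The class of warning fixed here can be reproduced outside the kernel. The following userspace sketch uses hypothetical helper names and standard C library calls rather than the driver's code. It shows the two usual ways to avoid casting a pointer directly to a 64-bit integer on a 32-bit target: print it with "%p", or, if an integer is really needed, convert through uintptr_t first.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format a pointer with "%p" instead of casting it to an integer type.
 * Returns the snprintf() result (number of characters that would be written).
 */
static int format_ptr(char *buf, size_t len, const void *ptr)
{
	assert(buf != NULL);
	return snprintf(buf, len, "%p", ptr);
}

/*
 * If an integer value is required, go through uintptr_t, which always
 * matches the pointer width, and only then widen to 64 bits.
 */
static uint64_t ptr_to_u64(const void *ptr)
{
	return (uint64_t)(uintptr_t)ptr;
}
```

The textual form produced by "%p" is implementation-defined in userspace, so the sketch makes no claim about the exact output string.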
    • J
      net: Microchip encx24j600 driver · d70e5326
      Jon Ringle committed
      This ethernet driver supports the Microchip enc424j600/624j600 Ethernet
      controller over a SPI bus interface. This driver makes use of the regmap API to
      optimize access to registers by caching registers where possible.
      
      Datasheet:
      http://ww1.microchip.com/downloads/en/DeviceDoc/39935b.pdf
      Signed-off-by: Jon Ringle <jringle@gridpoint.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d70e5326
    • D
      Merge branch 'broadcom-iproc' · 494f8eb9
      David S. Miller committed
      Arun Parameswaran says:
      
      ====================
      Add support for Broadcom's iProc MDIO and Cygnus Ethernet PHY
      
      This patchset adds support for the iProc MDIO interface and the
      Broadcom Cygnus SoC's internal Ethernet PHY.
      
      The internal Ethernet PHY(s) in the Cygnus SoC's are accessed
      via the MDIO interface found in most of the iProc based chips.
      
      The patch also consolidates the common API's used by the
      Broadcom phys to a common library. Existing Broadcom phy
      drivers have been modified to use the common library API's.
      
      This patch series is based on Linux v4.3-rc1 and is available in:
      https://github.com/Broadcom/cygnus-linux/tree/cygnus-net-phy-mdio-v3
      
      The Ethernet driver for the iProc family will be submitted soon,
      as will the device tree configurations for the different iProc
      family SoCs.
      
      Changes from v2:
      - Modified drivers/net/phy/Kconfig to modify the BCM_CYGNUS_PHY
        driver to 'depends on MDIO_BCM_IPROC' instead of 'select'.
      - Added github branch to the cover letter
      
      Changes from v1:
      - Updated device tree documentation for the iProc MDIO driver
        based on Florian's feedback.
      - Moved the core register defines from the Cygnus PHY driver to
        'include/linux/brcmphy.h' based on Florian's feedback.
      - Created a new patch/commit to modify the bcm7xxx phy driver
        to use the new core register defines.
      - Modified the Kconfig entry for the Broadcom PHY library to
        'tristate' instead of 'bool'
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      494f8eb9
    • A
      net: phy: bcm7xxx: Modified to use global core register defines · 9200c27a
      Arun Parameswaran committed
      Modified the bcm7xxx phy driver to remove local core register
      defines and use the common ones from "include/linux/brcmphy.h"
      Signed-off-by: Arun Parameswaran <arunp@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9200c27a
    • A
      net: phy: Broadcom Cygnus internal Etherent PHY driver · 8e185d69
      Arun Parameswaran committed
      Add support for the Broadcom Cygnus SoCs internal PHY's.
      The PHYs are 1000M/100M/10M capable with support for 'EEE'
      and 'APD' (Auto Power Down).
      
      This driver supports the following Broadcom Cygnus SoCs:
       - BCM583XX (BCM58300, BCM58302, BCM58303, BCM58305)
       - BCM113XX (BCM11300, BCM11320, BCM11350, BCM11360)
      
      The PHY's on these SoC's require some workarounds for
      stable operation, both during configuration time and
      during suspend/resume. This driver handles the
      application of the workarounds.
      Signed-off-by: Arun Parameswaran <arunp@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8e185d69
    • A
      net: phy: Add Broadcom phy library for common interfaces · a1cba561
      Arun Parameswaran committed
      This patch adds the Broadcom phy library to consolidate common
      interfaces shared by Broadcom phy's.
      
      Moved the common interfaces to the 'bcm-phy-lib.c' and updated
      the Broadcom PHY drivers to use the new APIs.
      Signed-off-by: Arun Parameswaran <arunp@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a1cba561
    • A
      net: phy: Broadcom iProc MDIO bus driver · ddc24ae1
      Arun Parameswaran committed
      This patch adds support for the Broadcom iProc MDIO bus interface.
      The MDIO interface can be found in the Broadcom iProc family Soc's.
      
      The MDIO bus is accessed using a combination of command and data
      registers. This MDIO driver provides access to the Ethernet GPHY's
      connected to the MDIO bus.
      Signed-off-by: Arun Parameswaran <arunp@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ddc24ae1
    • A
      dt-bindings: net: Broadcom iProc MDIO bus driver device tree binding · bb257c38
      Arun Parameswaran committed
      Add device tree binding documentation for the Broadcom iProc MDIO
      bus driver.
      Signed-off-by: Arun Parameswaran <arunp@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      bb257c38
    • D
      Merge branch 'net/rds/4.3-v3' of git://git.kernel.org/pub/scm/linux/kernel/git/ssantosh/linux · 91d2f14b
      David S. Miller committed
      Santosh Shilimkar says:
      
      ====================
      RDS: connection scalability and performance improvements
      
      [v4]
      Re-sending the same patches from v3 again since my repost of
      patch 05/14 from v3 was whitespace damaged.
      
      [v3]
      Updated patch "[PATCH v2 05/14] RDS: defer the over_batch work to
      send worker" as per David Miller's comment [4] to avoid the magic
      value usage. Patch now makes use of already available but unused
      send_batch_count module parameter. Rest of the patches are same as
      earlier version v2 [3]
      
      [v2]:
      Dropped "[PATCH 05/15] RDS: increase size of hash-table to 8K" from
      earlier version [1]. I plan to address the hash table scalability using
      re-sizable hash tables as suggested by David Laight and David Miller [2]
      
      This series addresses RDS connection bottlenecks on massive workloads and
      improves the RDMA performance almost by 3X. RDS TCP also gets a small gain
      of about 12%.
      
      RDS is being used in massive systems with high scalability where several
      hundred thousand end points and tens of thousands of local processes
      are operating in tens of thousand sockets. Being RC(reliable connection),
      socket bind and release happens very often and any inefficiencies in
      bind hash look ups hurt the overall system performance. RDS bind hash-table
      uses global spin-lock which is the biggest bottleneck. To make matters worse,
      it uses rcu inside global lock for hash buckets.
      This is being addressed by simply using per bucket rw lock which makes the
      locking simple and very efficient. The hash table size is still an issue and
      I plan to address it by using re-sizable hash tables as suggested on the list.
      
      For RDS RDMA improvement, the completion handling is revamped so that we
      can do batch completions. Both send and receive completion handlers are
      split logically to achieve the same. RDS 8K messages being one of the
      key usecase, mr pool is adapted to have the 8K mrs along with default 1M
      mrs. And while doing this, few fixes and couple of bottlenecks seen with
      rds_sendmsg() are addressed.
      
      Series applies against 4.3-rc1 as well as net-next. It's tested on Oracle
      hardware with IB fabric for both bcopy as well as RDMA mode. RDS TCP is
      tested with iXGB NIC. Like last time, iWARP transport is untested with
      these changes. The patchset is also available at below git repo:
      
      git://git.kernel.org/pub/scm/linux/kernel/git/ssantosh/linux.git net/rds/4.3-v3
      
      As a side note, the IB HCA driver I used for testing misses at least 3
      important patches in upstream to see the full blown IB performance and
      am hoping to get that in mainline with help of them.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      91d2f14b
    • D
      Merge branch 'pass_net_through_output_path' · 6b92d0c4
      David S. Miller committed
      Eric W. Biederman says:
      
      ====================
      net: Pass net through the output path v2
      
      This is the next installment of my work to pass struct net through the
      output path so the code does not need to guess how to figure out which
      network namespace it is in, and ultimately routes can have output
      devices in another network namespace.
      
      The first patch in this series is a fix for a bug that came in when sk
      was passed through the functions in the output path, and as such is
      probably a candidate for net.  At the same time my later patches depend
      on it so sending the fix separately would be confusing.
      
      The second patch in this series is another fix for an issue that
      came in when sk was passed through the output path.  I don't think it
      needs a backport as I don't think anyone uses the path where the code
      was incorrect.
      
      The rest of the patchset focuses on the path from xxx_local_out to
      dst_output and in the end succeeds in passing sock_net(sk) from the
      socket a packet locally originates on to the dst->output function.
      
      Given the size reduction in the code I think this counts as a cleanup as
      much as feature work.
      
      There remain a number of helper functions (like ip option processing) to
      take care of before the network stack can support destination devices in
      other network namespaces but with this set of changes the backbone of
      the work is done.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6b92d0c4