1. 30 Nov, 2017 32 commits
    • L
      Documentation: net: dsa: Cut set_addr() documentation · 0fc66ddf
      Linus Walleij committed
      This is not supported anymore; devices needing a MAC address
      just assign one at random. It's just a driver peculiarity.
      Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0fc66ddf
    • D
      Merge branch 'net-dst_entry-shrink' · 3d8068c5
      David S. Miller committed
      David Miller says:
      
      ====================
      net: Significantly shrink the size of routes.
      
      Through a combination of several things, our route structures are
      larger than they need to be.
      
      Mostly this stems from having members in dst_entry which are only used
      by one class of routes.  So the majority of the work in this series is
      about "un-commoning" these members and pushing them into the type
      specific structures.
      
      Unfortunately, IPSEC needed the most surgery.  The majority of the
      changes here had to do with bundle creation and management.
      
      The other issue is the refcount alignment in dst_entry.  Once we get
      rid of the not-so-common members, it really opens the door to removing
      that alignment entirely.
      
      I think the new layout looks really nice, so I'll reproduce it here:
      
      	struct net_device       *dev;
      	struct  dst_ops	        *ops;
      	unsigned long		_metrics;
      	unsigned long           expires;
      	struct xfrm_state	*xfrm;
      	int			(*input)(struct sk_buff *);
      	int			(*output)(struct net *net, struct sock *sk, struct sk_buff *skb);
      	unsigned short		flags;
      	short			obsolete;
      	unsigned short		header_len;
      	unsigned short		trailer_len;
      	atomic_t		__refcnt;
      	int			__use;
      	unsigned long		lastuse;
      	struct lwtunnel_state   *lwtstate;
      	struct rcu_head		rcu_head;
      	short			error;
      	short			__pad;
      	__u32			tclassid;
      
      (This is for 64-bit, on 32-bit the __refcnt comes at the very end)
      
      So, the good news:
      
      1) struct dst_entry shrinks from 160 to 112 bytes.
      
      2) struct rtable shrinks from 216 to 168 bytes.
      
      3) struct rt6_info shrinks from 384 to 320 bytes.
      
      Enjoy.
      
      v2:
      	Collapse some patches logically based upon feedback.
      	Fix the strange patch #7.
      
      v3:	xfrm_dst_path() needs inline keyword
      	Properly align __refcnt on 32-bit.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3d8068c5
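The "un-commoning" described in the cover letter above can be illustrated with a minimal userspace sketch. The struct and member names here (`toy_dst_before`, `toy_dst_after`, `toy_rt6`) are hypothetical and far simpler than the kernel's; the only point is that a member used by a single route class costs space in every route until it moves into the type-specific structure.

```c
/* Before: the common part carries a member only one route class uses. */
struct toy_dst_before {
	void *dev;
	void *ops;
	void *from;	/* hypothetical: meaningful for one route class only */
};

/* After: the common part keeps only what every route class needs. */
struct toy_dst_after {
	void *dev;
	void *ops;
};

/* The type-specific route embeds the common part and owns 'from' itself. */
struct toy_rt6 {
	struct toy_dst_after dst;
	void *from;
};
```

Routes that never use `from` now pay only for `struct toy_dst_after`, while the one class that needs it is no larger than before.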
    • D
      net: Remove dst->next · 7149f813
      David Miller committed
      There are no more users.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      7149f813
    • D
      xfrm: Stop using dst->next in bundle construction. · 5492093d
      David Miller committed
      While building ipsec bundles, blocks of xfrm dsts are linked together
      using dst->next from bottom to the top.
      
      The only thing this is used for is initializing the pmtu values of the
      xfrm stack, and for updating the mtu values at xfrm_bundle_ok() time.
      
      The bundle pmtu entries must be processed in this order so that pmtu
      values lower in the stack of routes can propagate up to the higher
      ones.
      
      Avoid using dst->next by simply maintaining an array of dst pointers
      as we already do for the xfrm_state objects when building the bundle.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      5492093d
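The bottom-to-top pmtu propagation described in the commit above can be sketched with a plain array of pointers instead of a linked list. This is a toy model, not the kernel code: `struct toy_dst`, its `overhead` field, and `toy_init_pmtu()` are assumptions made for illustration only.

```c
#include <stddef.h>

/* Toy stand-in for one xfrm dst in a bundle (all fields hypothetical). */
struct toy_dst {
	unsigned int child_mtu;	/* MTU seen from the entry below */
	unsigned int pmtu;	/* MTU this entry offers to the one above */
	unsigned int overhead;	/* bytes this transform adds per packet */
};

/*
 * bundle[0] is the top of the stack and bundle[n - 1] the bottom.
 * Walk from the bottom up so that lower values propagate upward.
 */
void toy_init_pmtu(struct toy_dst **bundle, size_t n, unsigned int path_mtu)
{
	unsigned int below = path_mtu;

	for (size_t i = n; i-- > 0; ) {
		struct toy_dst *d = bundle[i];

		d->child_mtu = below;
		d->pmtu = below > d->overhead ? below - d->overhead : 0;
		below = d->pmtu;
	}
}
```

Because the array already records the order in which the entries were built, no per-entry `next` pointer is needed just to revisit them.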
    • D
      net: Rearrange dst_entry layout to avoid useless padding. · 8b207e73
      David Miller committed
      We have padding to try and align the refcount on a separate cache
      line.  But after several simplifications the padding has increased
      substantially.
      
      So now it's easy to change the layout to get rid of the padding
      entirely.
      
      We group the write-heavy __refcnt and __use with less often used
      items such as the rcu_head and the error code.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      8b207e73
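The cost of aligning a counter to its own cache line can be seen in a small C11 sketch. The 64-byte line size and both struct layouts are assumptions for illustration; they are not the real `dst_entry` definitions.

```c
#include <stddef.h>

#define TOY_CACHE_LINE 64	/* assumed cache-line size */

/* Forcing the refcount onto its own cache line inserts padding before it. */
struct toy_padded {
	void *dev;
	void *ops;
	_Alignas(TOY_CACHE_LINE) int refcnt;
};

/* Grouping the write-heavy counters with rarely used members avoids it. */
struct toy_compact {
	void *dev;
	void *ops;
	int refcnt;
	int use;
	short error;
};
```

With the alignment requirement the whole structure must be a multiple of the line size, so even this tiny example grows well past the compact layout.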
    • D
      xfrm: Move dst->path into struct xfrm_dst · 0f6c480f
      David Miller committed
      The first member of an IPSEC route bundle chain sets its dst->path to
      the underlying ipv4/ipv6 route that carries the bundle.
      
      Stated another way, if one were to follow the xfrm_dst->child chain of
      the bundle, the final non-NULL pointer would be the path and point to
      either an ipv4 or an ipv6 route.
      
      This is largely used to make sure that PMTU events propagate down to
      the correct ipv4 or ipv6 route.
      
      When we don't have the top of an IPSEC bundle 'dst->path == dst'.
      
      Move it down into xfrm_dst and key off of dst->xfrm.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      0f6c480f
    • D
      ipv6: Move dst->from into struct rt6_info. · 3a2232e9
      David Miller committed
      The dst->from value is only used by ipv6 routes to track where
      a route "came from".
      
      Any time we clone or copy a core ipv6 route in the ipv6 routing
      tables, we have the copy/clone's ->from point to the base route.
      
      This is used to handle route expiration properly.
      
      Only ipv6 uses this mechanism, and only ipv6 code references
      it.  So it is safe to move it into rt6_info.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      3a2232e9
    • D
      xfrm: Move child route linkage into xfrm_dst. · b6ca8bd5
      David Miller committed
      XFRM bundle child chains look like this:
      
      	xdst1 --> xdst2 --> xdst3 --> path_dst
      
      All of xdstN are xfrm_dst objects and xdst->u.dst.xfrm is non-NULL.
      The final child pointer in the chain, here called 'path_dst', is some
      other kind of route such as an ipv4 or ipv6 one.
      
      The xfrm output path pops routes, one at a time, via the child
      pointer, until we hit one which has a dst->xfrm pointer which
      is NULL.
      
      We can easily preserve the above mechanisms with child sitting
      only in the xfrm_dst structure.  All children in the chain
      before we break out of the xfrm_output() loop have dst->xfrm
      non-NULL and are therefore xfrm_dst objects.
      
      Since we break out of the loop when we find dst->xfrm NULL, we
      will not try to dereference 'dst' as if it were an xfrm_dst.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b6ca8bd5
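The argument in the commit above (only entries with a non-NULL xfrm pointer are ever treated as xfrm_dst objects) can be sketched in userspace C. The types `toy_dst` and `toy_xfrm_dst` and the function `toy_walk_to_path()` are hypothetical simplifications, not the kernel's definitions.

```c
#include <stddef.h>

/* Common part of every route; 'xfrm' is non-NULL only for IPsec routes. */
struct toy_dst {
	void *xfrm;
};

/* IPsec-specific route: common part first, child pointer lives here only. */
struct toy_xfrm_dst {
	struct toy_dst dst;	/* must stay the first member for the cast below */
	struct toy_dst *child;
};

/* Follow the child chain until an entry without an xfrm pointer is found. */
struct toy_dst *toy_walk_to_path(struct toy_dst *dst)
{
	while (dst && dst->xfrm) {
		/* Safe: a non-NULL xfrm means this really is a toy_xfrm_dst. */
		struct toy_xfrm_dst *xdst = (struct toy_xfrm_dst *)dst;

		dst = xdst->child;
	}
	return dst;
}
```

The loop never casts the final path entry, so a plain route is never read as if it had a `child` member.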
    • D
      ipsec: Create and use new helpers for dst child access. · 45b018be
      David Miller committed
      This will make a future change moving the dst->child pointer less
      invasive.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      45b018be
    • D
      net: Create and use new helper xfrm_dst_child(). · b92cf4aa
      David Miller committed
      Only IPSEC routes have a non-NULL dst->child pointer.  And IPSEC
      routes are identified by a non-NULL dst->xfrm pointer.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b92cf4aa
    • D
    • D
      fe736e77
    • D
      net: dst->rt_next is unused. · ca2c374a
      David Miller committed
      Delete it.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      ca2c374a
    • Z
      forcedeth: optimize the xmit with unlikely · b78a6aa3
      Zhu Yanjun committed
      In xmit, it is very unlikely that TX_ERROR occurs, so marking that
      branch with unlikely() optimizes the xmit path.
      
      CC: Srinivas Eeda <srinivas.eeda@oracle.com>
      CC: Joe Jin <joe.jin@oracle.com>
      CC: Junxiao Bi <junxiao.bi@oracle.com>
      Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b78a6aa3
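A branch hint of this kind can be sketched as follows for GCC or Clang. The macro `toy_unlikely()` mirrors the usual `__builtin_expect` idiom; the function `toy_xmit_status()` and its flag arguments are hypothetical and do not come from the forcedeth driver.

```c
/* Tell the compiler the condition is expected to be false. */
#define toy_unlikely(x) __builtin_expect(!!(x), 0)

/* Return -1 on the rare error path, 0 on the common fast path. */
int toy_xmit_status(unsigned int flags, unsigned int error_mask)
{
	if (toy_unlikely(flags & error_mask))
		return -1;	/* cold path: error handling */

	return 0;		/* hot path: laid out as the fall-through */
}
```

The hint changes only code layout and prediction, never the result, so it is safe even when the guess turns out wrong.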
    • T
      atm: mpoa: remove 32-bit timekeeping · d750dbdc
      Tina Ruchandani committed
      net/atm/mpoa_* files use 'struct timeval' to store event
      timestamps. struct timeval uses a 32-bit seconds field which will
      overflow in the year 2038 and beyond. Moreover, the timestamps are being
      compared only to get seconds elapsed, so struct timeval, which stores
      both a seconds and a microseconds field, is overkill. This patch replaces
      the use of struct timeval with time64_t to store a 64-bit seconds field.
      Signed-off-by: Tina Ruchandani <ruchandani.tina@gmail.com>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d750dbdc
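Why a 64-bit seconds value is enough here can be shown with a minimal sketch. `toy_time64_t` and `toy_elapsed_secs()` are illustrative names, standing in for the kernel's `time64_t` and the driver's comparison code.

```c
#include <stdint.h>

/* Userspace stand-in for a 64-bit seconds counter. */
typedef int64_t toy_time64_t;

/* Seconds elapsed between two timestamps; no microseconds field needed. */
toy_time64_t toy_elapsed_secs(toy_time64_t then, toy_time64_t now)
{
	return now - then;
}
```

A timestamp taken just before the signed 32-bit limit and one taken just after it still subtract correctly, which a 32-bit seconds field could not represent.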
    • C
      atm: eni: fix several indentation issues · 59c03699
      Colin Ian King committed
      There are several statements that have incorrect indentation. Fix
      these.
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      59c03699
    • A
      openvswitch: use ktime_get_ts64() instead of ktime_get_ts() · 311af51d
      Arnd Bergmann committed
      timespec is deprecated because of the y2038 overflow, so let's convert
      this one to ktime_get_ts64(). The code is already safe even on 32-bit
      architectures, since it uses monotonic times. On 64-bit architectures,
      nothing changes, while on 32-bit architectures this avoids one
      type conversion.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      311af51d
    • A
      netxen: remove timespec usage · b2dfcb3f
      Arnd Bergmann committed
      netxen_collect_minidump() evidently just wants to get a monotonic
      timestamp. Using jiffies_to_timespec(jiffies, &ts) is not
      appropriate here, since it will overflow after 2^32 jiffies,
      which may be as short as 49 days of uptime.
      
      ktime_get_seconds() is the correct interface here.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b2dfcb3f
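The 49-day figure in the commit above follows from a 32-bit tick counter at an assumed tick rate of HZ=1000. The helper `toy_wrap_days()` below is a hypothetical illustration of that arithmetic, not kernel code.

```c
#include <stdint.h>

/* Whole days until a 32-bit tick counter wraps at 'hz' ticks per second. */
uint64_t toy_wrap_days(unsigned int hz)
{
	const uint64_t ticks = (uint64_t)1 << 32;	/* 2^32 ticks */
	const uint64_t secs_per_day = 86400;

	return ticks / hz / secs_per_day;
}
```

At 1000 ticks per second the counter wraps after 49 whole days; at a slower assumed rate of 100 ticks per second it lasts 497 days.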
    • R
      net: phy: harmonize phy_id{,_mask} data type · 511e3036
      Richard Leitner committed
      Previously phy_id was u32 and phy_id_mask was unsigned int. As the
      phy_id_mask defines the important bits of the phy_id (and is therefore
      the same size) these two variables should be the same data type.
      Signed-off-by: Richard Leitner <richard.leitner@skidata.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      511e3036
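Since the mask selects the significant bits of the ID, giving both values the same fixed-width type keeps the comparison straightforward. The function `toy_phy_id_matches()` and the example ID values used with it are assumptions for illustration only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Compare only the bits of the IDs that the mask marks as significant. */
bool toy_phy_id_matches(uint32_t dev_id, uint32_t drv_id, uint32_t mask)
{
	return (dev_id & mask) == (drv_id & mask);
}
```

With a mask of `0xfffffff0`, for example, two IDs that differ only in the low four bits (often a revision field) are treated as the same device.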
    • L
      net: ethernet: davinci_emac: Deduplicate bus_find_device() by name matching · 3243ff2a
      Lukas Wunner committed
      No need to reinvent the wheel, we have bus_find_device_by_name().
      
      Cc: Grygorii Strashko <grygorii.strashko@ti.com>
      Signed-off-by: Lukas Wunner <lukas@wunner.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3243ff2a
    • S
      net: thunderx: Set max queue count taking XDP_TX into account · 87de0838
      Sunil Goutham committed
      On T81 there are only 4 cores, hence setting max queue count to 4
      would leave nothing for XDP_TX. This patch fixes this by doubling
      max queue count in such scenarios.
      Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
      Signed-off-by: cjacob <cjacob@caviumnetworks.com>
      Signed-off-by: Aleksey Makarov <aleksey.makarov@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      87de0838
    • S
      net: thunderx: Add support for xdp redirect · aa136d0c
      Sunil Goutham committed
      This patch adds support for XDP_REDIRECT. Flush is not
      yet supported.
      Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
      Signed-off-by: cjacob <cjacob@caviumnetworks.com>
      Signed-off-by: Aleksey Makarov <aleksey.makarov@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      aa136d0c
    • L
      Merge tag 'nfsd-4.15-1' of git://linux-nfs.org/~bfields/linux · b9151761
      Linus Torvalds committed
      Pull nfsd fixes from Bruce Fields:
       "I screwed up my merge window pull request; I only sent half of what I
        meant to.
      
        There were no new features, just bugfixes of various importance and
        some very minor cleanup, so I think it's all still appropriate for
        -rc2.
      
        Highlights:
      
         - Fixes from Trond for some races in the NFSv4 state code.
      
         - Fix from Naofumi Honda for a typo in the blocked lock notification
           code
      
         - Fixes from Vasily Averin for some problems starting and stopping
           lockd especially in network namespaces"
      
      * tag 'nfsd-4.15-1' of git://linux-nfs.org/~bfields/linux: (23 commits)
        lockd: fix "list_add double add" caused by legacy signal interface
        nlm_shutdown_hosts_net() cleanup
        race of nfsd inetaddr notifiers vs nn->nfsd_serv change
        race of lockd inetaddr notifiers vs nlmsvc_rqst change
        SUNRPC: make cache_detail structures const
        NFSD: make cache_detail structures const
        sunrpc: make the function arg as const
        nfsd: check for use of the closed special stateid
        nfsd: fix panic in posix_unblock_lock called from nfs4_laundromat
        lockd: lost rollback of set_grace_period() in lockd_down_net()
        lockd: added cleanup checks in exit_net hook
        grace: replace BUG_ON by WARN_ONCE in exit_net hook
        nfsd: fix locking validator warning on nfs4_ol_stateid->st_mutex class
        lockd: remove net pointer from messages
        nfsd: remove net pointer from debug messages
        nfsd: Fix races with check_stateid_generation()
        nfsd: Ensure we check stateid validity in the seqid operation checks
        nfsd: Fix race in lock stateid creation
        nfsd4: move find_lock_stateid
        nfsd: Ensure we don't recognise lock stateids after freeing them
        ...
      b9151761
    • L
      Merge tag 'for-4.15-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · 26cd9474
      Linus Torvalds committed
      Pull btrfs fixes from David Sterba:
       "We've collected some fixes in since the pre-merge window freeze.
      
        There's technically only one regression fix for 4.15, but the rest
        seems important and candidates for stable.
      
         - fix missing flush bio puts in error cases (is serious, but rarely
           happens)
      
         - fix reporting stat::st_blocks for buffered append writes
      
         - fix space cache invalidation
      
         - fix out of bound memory access when setting zlib level
      
         - fix potential memory corruption when fsync fails in the middle
      
         - fix crash in integrity checker
      
         - incremental send fix, path mixup for certain unlink/rename
           combination
      
         - pass flags to writeback so compressed writes can be throttled
           properly
      
         - error handling fixes"
      
      * tag 'for-4.15-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
        Btrfs: incremental send, fix wrong unlink path after renaming file
        btrfs: tree-checker: Fix false panic for sanity test
        Btrfs: fix list_add corruption and soft lockups in fsync
        btrfs: Fix wild memory access in compression level parser
        btrfs: fix deadlock when writing out space cache
        btrfs: clear space cache inode generation always
        Btrfs: fix reported number of inode blocks after buffered append writes
        Btrfs: move definition of the function btrfs_find_new_delalloc_bytes
        Btrfs: bail out gracefully rather than BUG_ON
        btrfs: dev_alloc_list is not protected by RCU, use normal list_del
        btrfs: add missing device::flush_bio puts
        btrfs: Fix transaction abort during failure in btrfs_rm_dev_item
        Btrfs: add write_flags for compression bio
      26cd9474
    • L
      Merge tag 'microblaze-4.15-rc2' of git://git.monstr.eu/linux-2.6-microblaze · 198e0c0c
      Linus Torvalds committed
      Pull Microblaze fix from Michal Simek:
       "Add missing header to mmu_context_mm.h"
      
      * tag 'microblaze-4.15-rc2' of git://git.monstr.eu/linux-2.6-microblaze:
        microblaze: add missing include to mmu_context_mm.h
      198e0c0c
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc · fccfde44
      Linus Torvalds committed
      Pull sparc fix from David Miller:
       "Sparc T4 and later cpu bootup regression fix"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
        sparc64: Fix boot on T4 and later.
      fccfde44
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 96c22a49
      Linus Torvalds committed
      Pull networking fixes from David Miller:
      
       1) The forcedeth conversion from pci_*() DMA interfaces to dma_*() ones
          missed one spot. From Zhu Yanjun.
      
       2) Missing CRYPTO_SHA256 Kconfig dep in cfg80211, from Johannes Berg.
      
       3) Fix checksum offloading in thunderx driver, from Sunil Goutham.
      
       4) Add SPDX to vm_sockets_diag.h, from Stephen Hemminger.
      
       5) Fix use after free of packet headers in TIPC, from Jon Maloy.
      
       6) "sizeof(ptr)" vs "sizeof(*ptr)" bug in i40e, from Gustavo A R Silva.
      
       7) Tunneling fixes in mlxsw driver, from Petr Machata.
      
       8) Fix crash in fanout_demux_rollover() of AF_PACKET, from Mike
          Maloney.
      
       9) Fix race in AF_PACKET bind() vs. NETDEV_UP notifier, from Eric
          Dumazet.
      
      10) Fix regression in sch_sfq.c due to one of the timer_setup()
          conversions. From Paolo Abeni.
      
      11) SCTP does list_for_each_entry() using wrong struct member, fix from
          Xin Long.
      
      12) Don't use big endian netlink attribute read for
          IFLA_BOND_AD_ACTOR_SYSTEM, it is in cpu endianness. Also from Xin
          Long.
      
      13) Fix mis-initialization of q->link.clock in CBQ scheduler, preventing
          adding filters there. From Jiri Pirko.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (67 commits)
        ethernet: dwmac-stm32: Fix copyright
        net: via: via-rhine: use %p to format void * address instead of %x
        net: ethernet: xilinx: Mark XILINX_LL_TEMAC broken on 64-bit
        myri10ge: Update MAINTAINERS
        net: sched: cbq: create block for q->link.block
        atm: suni: remove extraneous space to fix indentation
        atm: lanai: use %p to format kernel addresses instead of %x
        VSOCK: Don't set sk_state to TCP_CLOSE before testing it
        atm: fore200e: use %pK to format kernel addresses instead of %x
        ambassador: fix incorrect indentation of assignment statement
        vxlan: use __be32 type for the param vni in __vxlan_fdb_delete
        bonding: use nla_get_u64 to extract the value for IFLA_BOND_AD_ACTOR_SYSTEM
        sctp: use right member as the param of list_for_each_entry
        sch_sfq: fix null pointer dereference at timer expiration
        cls_bpf: don't decrement net's refcount when offload fails
        net/packet: fix a race in packet_bind() and packet_notifier()
        packet: fix crash in fanout_demux_rollover()
        sctp: remove extern from stream sched
        sctp: force the params with right types for sctp csum apis
        sctp: force SCTP_ERROR_INV_STRM with __u32 when calling sctp_chunk_fail
        ...
      96c22a49
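Item 6 in the pull request above names a classic C mistake: taking the size of a pointer where the size of the pointed-to object was meant. The sketch below uses a hypothetical `struct toy_cfg`; it is not the i40e code.

```c
#include <stddef.h>

/* Hypothetical structure, clearly larger than a pointer. */
struct toy_cfg {
	char name[32];
	int id;
};

/* Bug pattern: yields the size of the pointer itself. */
size_t toy_wrong_size(const struct toy_cfg *p)
{
	return sizeof(p);
}

/* Intended: yields the size of the object the pointer refers to. */
size_t toy_right_size(const struct toy_cfg *p)
{
	return sizeof(*p);
}
```

Using the wrong form in an allocation or a copy silently handles only a pointer's worth of bytes, which is why such bugs often survive until a larger structure exposes them.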
    • D
      sparc64: Fix boot on T4 and later. · e5372cd5
      David S. Miller committed
      If we don't put the NG4fls.o object into the same part of
      the link as the generic sparc64 objects for fls() and __fls()
      then the relocation in the branch we use for patching will
      not fit.
      
      Move NG4fls.o into lib-y to fix this problem.
      
      Fixes: 46ad8d2d ("sparc64: Use sparc optimized fls and __fls for T4 and above")
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Reported-by: Anatoly Pugachev <matorola@gmail.com>
      Tested-by: Anatoly Pugachev <matorola@gmail.com>
      e5372cd5
    • L
      vsprintf: don't use 'restricted_pointer()' when not restricting · ef0010a3
      Linus Torvalds committed
      Instead, just fall back on the new '%p' behavior which hashes the
      pointer.
      
      Otherwise, '%pK' - that was intended to mark a pointer as restricted -
      just ends up leaking pointers that a normal '%p' wouldn't leak.  Which
      just makes the whole thing pointless.
      
      I suspect we should actually get rid of '%pK' entirely, and make it just
      work as '%p' regardless, but this is the minimal obvious fix.  People
      who actually use 'kptr_restrict' should weigh in on which behavior they
      want.
      
      Cc: Tobin Harding <me@tobin.cc>
      Cc: Kees Cook <keescook@chromium.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      ef0010a3
    • L
      kallsyms: take advantage of the new '%px' format · 668533dc
      Linus Torvalds committed
      The conditional kallsym hex printing used a special fixed-width '%lx'
      output (KALLSYM_FMT) in preparation for the hashing of %p, but that
      series ended up adding a %px specifier to help with the conversions.
      
      Use it, and avoid the "print pointer as an unsigned long" code.
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      668533dc
    • L
      Merge tag 'printk-hash-pointer-4.15-rc2' of git://github.com/tcharding/linux · da6af54d
      Linus Torvalds committed
      Pull printk pointer hashing update from Tobin Harding:
       "Here is the patch set that implements hashing of printk specifier %p.
      
        First we have two clean up patches then we do the hashing. Hashing is
        done via the SipHash algorithm. The next patch adds printk specifier
        %px for printing pointers when we _really_ want to see the address i.e
        %px is functionally equivalent to %lx. Final patch in the set fixes
        KASAN since we break it by hashing %p.
      
        For the record here is the justification for the series:
      
          Currently there exist approximately 14 000 places in the Kernel
          where addresses are being printed using an unadorned %p. This
          potentially leaks sensitive information about the Kernel layout in
          memory. Many of these calls are stale, instead of fixing every call
          we hash the address by default before printing. We then add %px to
          provide a way to print the actual address. Although this is
          achievable using %lx, using %px will assist us if we ever want to
          change pointer printing behaviour. %px is more uniquely grep'able
          (there are already >50 000 uses of %lx).
      
          The added advantage of hashing %p is that security is now opt-out,
          if you _really_ want the address you have to work a little harder
          and use %px.
      
        This will of course break some users, forcing code printing needed
        addresses to be updated"
      
      [ I do expect this to be an annoyance, and a number of %px users to be
        added for debuggability. But nobody is willing to audit existing %p
        users for information leaks, and a number of places really only use
        the pointer as an object identifier rather than really 'I need the
        address'.
      
        IOW - sorry for the inconvenience, but it's the least inconvenient of
        the options.    - Linus ]
      
      * tag 'printk-hash-pointer-4.15-rc2' of git://github.com/tcharding/linux:
        kasan: use %px to print addresses instead of %p
        vsprintf: add printk specifier %px
        printk: hash addresses printed with %p
        vsprintf: refactor %pK code out of pointer()
        docs: correct documentation for %pK
      da6af54d
    • L
      Revert "mm, thp: Do not make pmd/pud dirty without a reason" · f55e1014
      Linus Torvalds committed
      This reverts commit 152e93af.
      
      It was a nice cleanup in theory, but as Nicolai Stange points out, we do
      need to make the page dirty for the copy-on-write case even when we
      didn't end up making it writable, since the dirty bit is what we use to
      check that we've gone through a COW cycle.
      Reported-by: Michal Hocko <mhocko@kernel.org>
      Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      f55e1014
  2. 29 Nov, 2017 8 commits