1. 08 Apr, 2019 5 commits
    • xfrm: remove gso_segment indirection from xfrm_mode · 7613b92b
      Florian Westphal committed
      These functions are small and we only have versions for tunnel
      and transport mode for ipv4 and ipv6 respectively.
      
      Just place the 'transport or tunnel' conditional in the protocol
      specific function instead of using an indirection.
      
      Before:
          3226       12       0     3238   net/ipv4/esp4_offload.o
          7004      492       0     7496   net/ipv4/ip_vti.o
          3339       12       0     3351   net/ipv6/esp6_offload.o
         11294      460       0    11754   net/ipv6/ip6_vti.o
          1180       72       0     1252   net/ipv4/xfrm4_mode_beet.o
           428       48       0      476   net/ipv4/xfrm4_mode_transport.o
          1271       48       0     1319   net/ipv4/xfrm4_mode_tunnel.o
          1083       60       0     1143   net/ipv6/xfrm6_mode_beet.o
           172       48       0      220   net/ipv6/xfrm6_mode_ro.o
           429       48       0      477   net/ipv6/xfrm6_mode_transport.o
          1164       48       0     1212   net/ipv6/xfrm6_mode_tunnel.o
      15730428  6937008 4046908 26714344   vmlinux
      
      After:
          3461       12       0     3473   net/ipv4/esp4_offload.o
          7000      492       0     7492   net/ipv4/ip_vti.o
          3574       12       0     3586   net/ipv6/esp6_offload.o
         11295      460       0    11755   net/ipv6/ip6_vti.o
          1180       64       0     1244   net/ipv4/xfrm4_mode_beet.o
           171       40       0      211   net/ipv4/xfrm4_mode_transport.o
          1163       40       0     1203   net/ipv4/xfrm4_mode_tunnel.o
          1083       52       0     1135   net/ipv6/xfrm6_mode_beet.o
           172       40       0      212   net/ipv6/xfrm6_mode_ro.o
           172       40       0      212   net/ipv6/xfrm6_mode_transport.o
          1056       40       0     1096   net/ipv6/xfrm6_mode_tunnel.o
      15730424  6937008 4046908 26714340   vmlinux
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
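The idea in the commit above, replacing a per-mode function pointer with a 'transport or tunnel' conditional inside the protocol-specific function, can be sketched in plain C. This is only an illustration: the enum, struct and function names are hypothetical stand-ins, not the kernel's xfrm definitions, and the 20-byte header is an arbitrary placeholder.

```c
/* Hypothetical stand-ins for the two encapsulation modes. */
enum demo_mode { DEMO_MODE_TRANSPORT, DEMO_MODE_TUNNEL };

/* Placeholder per-mode helpers: return the resulting length. */
static int demo_transport_segment(int len) { return len; }
static int demo_tunnel_segment(int len)    { return len + 20; }

/* Before: every mode object carries a function pointer. */
struct demo_mode_ops {
	enum demo_mode encap;
	int (*gso_segment)(int len);
};

static int segment_indirect(const struct demo_mode_ops *ops, int len)
{
	return ops->gso_segment(len);	/* indirect call */
}

/* After: the caller switches on the mode and calls the helper directly. */
static int segment_direct(enum demo_mode encap, int len)
{
	switch (encap) {
	case DEMO_MODE_TRANSPORT:
		return demo_transport_segment(len);
	case DEMO_MODE_TUNNEL:
		return demo_tunnel_segment(len);
	}
	return -1;	/* unknown mode */
}
```

Both variants return the same result for a given mode; the direct form avoids the pointer in each mode object and lets the compiler see the callee.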
    • xfrm: remove xmit indirection from xfrm_mode · 303c5fab
      Florian Westphal committed
      There are only two versions (tunnel and transport). The ip/ipv6 versions
      differ only in sizeof(iphdr) vs ipv6hdr.
      
      Place this in the core and use x->outer_mode->encap type to call the
      correct adjustment helper.
      
      Before:
         text   data    bss     dec      filename
      15730311  6937008 4046908 26714227 vmlinux
      
      After:
      15730428  6937008 4046908 26714344 vmlinux
      
      (an increase of about 117 bytes)
      
      v2: use family from x->outer_mode, not inner
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
    • xfrm: remove output indirection from xfrm_mode · 0c620e97
      Florian Westphal committed
      Same as the input indirection.  Only exception: we need to export
      xfrm_outer_mode_output for pktgen.
      
      Increases size of vmlinux by about 163 bytes:
      Before:
         text    data     bss     dec      filename
      15730208  6936948 4046908 26714064   vmlinux
      
      After:
      15730311  6937008 4046908 26714227   vmlinux
      
      xfrm_inner_extract_output has no more external callers, make it static.
      
      v2: add IS_ENABLED(IPV6) guard in xfrm6_prepare_output
          add two missing breaks in xfrm_outer_mode_output (Sabrina Dubroca)
          add WARN_ON_ONCE for 'call AF_INET6 related output function, but
          CONFIG_IPV6=n' case.
          make xfrm_inner_extract_output static
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
    • xfrm: remove input indirection from xfrm_mode · c2d305e5
      Florian Westphal committed
      No need for any indirection or abstraction here, both functions
      are pretty much the same and quite small, they also have no external
      dependencies.
      
      xfrm_prepare_input can then be made static.
      
      With an allmodconfig build, the size increase of vmlinux is 25 bytes:
      
      Before:
         text   data     bss     dec      filename
      15730207  6936924 4046908 26714039  vmlinux
      
      After:
      15730208  6936948 4046908 26714064 vmlinux
      
      v2: Fix INET_XFRM_MODE_TRANSPORT name in is-enabled test (Sabrina Dubroca)
          change copied comment to refer to transport and network header,
          not skb->{h,nh}, which don't exist anymore. (Sabrina)
          make xfrm_prepare_input static (Eyal Birger)
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
    • xfrm: place af number into xfrm_mode struct · b262a695
      Florian Westphal committed
      This will be useful to know if we're supposed to decode ipv4 or ipv6.
      
      While at it, make the unregister function return void, all module_exit
      functions did just BUG(); there is never a point in doing error checks
      if there is no way to handle such error.
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
  2. 22 Mar, 2019 7 commits
    • genetlink: make policy common to family · 3b0f31f2
      Johannes Berg committed
      Since maxattr is common, the policy can't really differ sanely,
      so make it common as well.
      
      The only user that did in fact manage to make a non-common policy
      is taskstats, which has to be really careful about it (since it's
      still using a common maxattr!). This is no longer supported, but
      we can fake it using pre_doit.
      
      This reduces the size of e.g. nl80211.o (which has lots of commands):
      
         text	   data	    bss	    dec	    hex	filename
       398745	  14323	   2240	 415308	  6564c	net/wireless/nl80211.o (before)
       397913	  14331	   2240	 414484	  65314	net/wireless/nl80211.o (after)
      --------------------------------
         -832      +8       0    -824
      
      Which is obviously just 8 bytes for each command, and an added 8
      bytes for the new policy pointer. I'm not sure why the ops list is
      counted as .text though.
      
      Most of the code transformations were done using the following spatch:
          @ops@
          identifier OPS;
          expression POLICY;
          @@
          struct genl_ops OPS[] = {
          ...,
           {
          -	.policy = POLICY,
           },
          ...
          };
      
          @@
          identifier ops.OPS;
          expression ops.POLICY;
          identifier fam;
          expression M;
          @@
          struct genl_family fam = {
                  .ops = OPS,
                  .maxattr = M,
          +       .policy = POLICY,
                  ...
          };
      
      This also gets rid of devlink_nl_cmd_region_read_dumpit() accessing
      the cb->data as ops, which we want to change in a later genl patch.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rhashtable: rename rht_for_each*continue as *from. · f7ad68bf
      NeilBrown committed
      The pattern set by list.h is that for_each..continue()
      iterators start at the next entry after the given one,
      while for_each..from() iterators start at the given
      entry.
      
      The rht_for_each*continue() iterators are documented as though they
      start at the 'next' entry, but actually start at the given entry,
      and they are used expecting that behaviour.
      So fix the documentation and change the names to *from for consistency
      with list.h.
      Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
      Acked-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
      Signed-off-by: NeilBrown <neilb@suse.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
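The naming convention the commit above refers to can be shown with a small stand-alone sketch. The macros and the node type here are hypothetical simplifications written for this example; they are not the list.h or rhashtable macros, only an illustration of where each style of iterator starts.

```c
#include <stddef.h>

struct demo_node {
	int val;
	struct demo_node *next;
};

/* "from" style: the first entry visited is the given one. */
#define demo_for_each_from(pos, start) \
	for ((pos) = (start); (pos) != NULL; (pos) = (pos)->next)

/* "continue" style: the first entry visited is the one after the given one. */
#define demo_for_each_continue(pos, start) \
	for ((pos) = (start) ? (start)->next : NULL; (pos) != NULL; \
	     (pos) = (pos)->next)

/* Sum of values starting at 'start' itself. */
static int sum_from(struct demo_node *start)
{
	struct demo_node *pos;
	int sum = 0;

	demo_for_each_from(pos, start)
		sum += pos->val;
	return sum;
}

/* Sum of values starting after 'start'. */
static int sum_continue(struct demo_node *start)
{
	struct demo_node *pos;
	int sum = 0;

	demo_for_each_continue(pos, start)
		sum += pos->val;
	return sum;
}
```

For a chain 1 -> 2 -> 3 and a start at the node holding 2, the "from" loop visits 2 and 3 while the "continue" loop visits only 3.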
    • rhashtable: don't hold lock on first table throughout insertion. · 4feb7c7a
      NeilBrown committed
      rhashtable_try_insert() currently holds a lock on the bucket in
      the first table, while also locking buckets in subsequent tables.
      This is unnecessary and looks like a hold-over from some earlier
      version of the implementation.
      
      As insert and remove always lock a bucket in each table in turn, and
      as insert only inserts in the final table, there cannot be any races
      that are not covered by simply locking a bucket in each table in turn.
      
      When an insert call reaches that last table it can be sure that there
      is no matching entry in any other table as it has searched them all, and
      insertion never happens anywhere but in the last table.  The fact that
      code tests for the existence of future_tbl while holding a lock on
      the relevant bucket ensures that two threads inserting the same key
      will make compatible decisions about which is the "last" table.
      
      This simplifies the code and allows the ->rehash field to be
      discarded.
      
      We still need a way to ensure that a dead bucket_table is never
      re-linked by rhashtable_walk_stop().  This can be achieved by calling
      call_rcu() inside the locked region, and checking with
      rcu_head_after_call_rcu() in rhashtable_walk_stop() to see if the
      bucket table is empty and dead.
      Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
      Reviewed-by: Paul E. McKenney <paulmck@linux.ibm.com>
      Signed-off-by: NeilBrown <neilb@suse.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dst: remove gc leftovers · 02afc7ad
      Julian Wiedmann committed
      Get rid of some obsolete gc-related documentation and macros that were
      missed in commit 5b7c9a8f ("net: remove dst gc related code").
      
      CC: Wei Wang <weiwan@google.com>
      Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
      Acked-by: Wei Wang <weiwan@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: Allow amount of dirty memory from fib resizing to be controllable · 9ab948a9
      David Ahern committed
      fib_trie implementation calls synchronize_rcu when a certain amount of
      pages are dirty from freed entries. The number of pages was determined
      experimentally in 2009 (commit c3059477).
      
      At the current setting, synchronize_rcu is called often -- 51 times in a
      second in one test with an average of an 8 msec delay adding a fib entry.
      The total impact is a lot of slow down modifying the fib. This is seen
      in the output of 'time' - the difference between real time and sys+user.
      For example, using 720,022 single path routes and 'ip -batch'[1]:
      
          $ time ./ip -batch ipv4/routes-1-hops
          real    0m14.214s
          user    0m2.513s
          sys     0m6.783s
      
      So roughly 35% of the actual time to install the routes is from the ip
      command getting scheduled out, most notably due to synchronize_rcu (this
      is observed using 'perf sched timehist').
      
      This patch makes the amount of dirty memory configurable between 64k where
      the synchronize_rcu is called often (small, low end systems that are memory
      sensitive) to 64M where synchronize_rcu is called rarely during a large
      FIB change (for high end systems with lots of memory). The default is 512kB
      which corresponds to the current setting of 128 pages with a 4kB page size.
      
      As an example, at 16MB the worst interval shows 4 calls to synchronize_rcu
      in a second blocking for up to 30 msec in a single instance, and a total
      of almost 100 msec across the 4 calls in the second. The trade off is
      allowing FIB entries to consume more memory in a given time window
      but with much better fib insertion rates (~30% increase in prefixes/sec).
      With this patch and net.ipv4.fib_sync_mem set to 16MB, the same batch
      file runs in:
      
          $ time ./ip -batch ipv4/routes-1-hops
          real    0m9.692s
          user    0m2.491s
          sys     0m6.769s
      
      So the dead time is reduced to about 1/2 second or <5% of the real time.
      
      [1] 'ip' modified to not request ACK messages which improves route
          insertion times by about 20%
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
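The percentages quoted in the commit above follow from the 'time' output: the "dead" time is real time minus user plus sys time. A small sketch of that arithmetic, together with the pages-to-bytes conversion behind the 512kB default, is given here. The function names are made up for this example.

```c
/* Time the process spent neither in user nor in kernel mode (seconds). */
static double dead_time(double real, double user, double sys)
{
	return real - (user + sys);
}

/* Dead time as a fraction of wall-clock time. */
static double dead_fraction(double real, double user, double sys)
{
	return dead_time(real, user, sys) / real;
}

/* Dirty-memory threshold in bytes for a given page count and page size. */
static unsigned long dirty_bytes(unsigned long pages, unsigned long page_size)
{
	return pages * page_size;
}
```

With the first run (14.214s real, 2.513s user, 6.783s sys) the dead time is 4.918s, roughly 35% of real time. With the second run (9.692s, 2.491s, 6.769s) it is 0.432s, under 5%. 128 pages of 4kB give 524288 bytes, i.e. 512kB.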
    • tun: Add ioctl() TUNGETDEVNETNS cmd to allow obtaining real net ns of tun device · 0c3e0e3b
      Kirill Tkhai committed
      In commit f2780d6d "tun: Add ioctl() SIOCGSKNS cmd to allow
      obtaining net ns of tun device" it was missed that tun may change
      its net ns, while net ns of socket remains the same as it was
      created initially. SIOCGSKNS returns net ns of socket, so it is
      not suitable for obtaining net ns of device.
      
      We may have two tun devices with the same name in two net ns,
      and in this case it's not possible to determine which of them
      an fd refers to (TUNGETIFF will return the same name).
      
      This patch adds new ioctl() cmd for obtaining net ns of a device.
      Reported-by: Harald Albrecht <harald.albrecht@gmx.net>
      Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: Change addrconf_f6i_alloc to use ip6_route_info_create · c7a1ce39
      David Ahern committed
      Change addrconf_f6i_alloc to generate a fib6_config and call
      ip6_route_info_create. addrconf_f6i_alloc is the last caller to
      fib6_info_alloc besides ip6_route_info_create, and there is no
      reason for it to do its own initialization on a fib6_info.
      
      Host routes need to be created even if the device is down, so add a
      new flag, fc_ignore_dev_down, to fib6_config and update fib6_nh_init
      to not error out if device is not up.
      
      Notes on the conversion:
      - ip_fib_metrics_init is the same as fib6_config has fc_mx set to NULL
        and fc_mx_len set to 0
      - dst_nocount is handled by the RTF_ADDRCONF flag
      - dst_host is handled by fc_dst_len = 128
      
      nh_gw does not get set after the conversion to ip6_route_info_create
      but it should not be set in addrconf_f6i_alloc since this is a host
      route not a gateway route.
      
      Everything else is a straightforward map between fib6_info and
      fib6_config.
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 21 Mar, 2019 5 commits
    • ipv6: Add icmp_echo_ignore_anycast for ICMPv6 · 0b03a5ca
      Stephen Suryaputra committed
      In addition to icmp_echo_ignore_multicast, there is a need to also
      prevent responding to pings to anycast addresses for security.
      Signed-off-by: Stephen Suryaputra <ssuryaextr@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: remove 'fallback' argument from dev->ndo_select_queue() · a350ecce
      Paolo Abeni committed
      After the previous patch, all the callers of ndo_select_queue()
      provide as a 'fallback' argument netdev_pick_tx.
      The only exceptions are nested calls to ndo_select_queue(),
      which pass down the 'fallback' available in the current scope
      - still netdev_pick_tx.
      
      We can drop such argument and replace fallback() invocation with
      netdev_pick_tx(). This avoids an indirect call per xmit packet
      in some scenarios (TCP syn, UDP unconnected, XDP generic, pktgen)
      with device drivers implementing such ndo. It also cleans up the code
      a bit.
      
      Tested with ixgbe and CONFIG_FCOE=m
      
      With pktgen using queue xmit:
      threads		vanilla 	patched
      		(kpps)		(kpps)
      1		2334		2428
      2		4166		4278
      4		7895		8100
      
       v1 -> v2:
       - rebased after helper's name change
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • packet: rework packet_pick_tx_queue() to use common code selection · b71b5837
      Paolo Abeni committed
      Currently packet_pick_tx_queue() is the only caller of
      ndo_select_queue() using a fallback argument other than
      netdev_pick_tx.
      
      Leveraging rx queue, we can obtain a similar queue selection
      behavior using core helpers. After this change, ndo_select_queue()
      is always invoked with netdev_pick_tx() as fallback.
      We can change ndo_select_queue() signature in a followup patch,
      dropping an indirect call per transmitted packet in some scenarios
      (e.g. TCP syn and XDP generic xmit)
      
      This slightly changes how AF_PACKET queue selection happens when
      PACKET_QDISC_BYPASS is set. It is now closer to plain dev_queue_xmit(),
      taking into account both XPS and TC mapping.
      
       v1  -> v2:
        - rebased after helper name change
       RFC -> v1:
        - initialize sender_cpu to the expected value
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dev: rename queue selection helpers. · 4bd97d51
      Paolo Abeni committed
      With the following patches, we are going to use __netdev_pick_tx() in
      many modules. Rename it to netdev_pick_tx(), to make it clear that it is
      a public API.
      
      Also rename the existing netdev_pick_tx() to netdev_core_pick_tx(),
      to avoid name clashes.
      Suggested-by: Eric Dumazet <edumazet@google.com>
      Suggested-by: David Miller <davem@davemloft.net>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/tls: Add support of AES128-CCM based ciphers · f295b3ae
      Vakul Garg committed
      Added support for AES128-CCM based record encryption. AES128-CCM is
      similar to AES128-GCM. Both of them have the same salt/iv/mac size. The
      notable difference between the two is that while invoking AES128-CCM
      operation, the salt||nonce (which is passed as IV) has to be prefixed
      with a hardcoded value '2'. Further, CCM implementation in kernel
      requires IV passed in crypto_aead_request() to be full '16' bytes.
      Therefore, the record structure 'struct tls_rec' has been modified to
      reserve '16' bytes for IV. This works for both GCM and CCM based ciphers.
      Signed-off-by: Vakul Garg <vakul.garg@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
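The IV layout described in the commit above (a hardcoded '2' prefix, then salt||nonce, in a buffer of 16 bytes) can be sketched as follows. This is an illustration under stated assumptions, not the kernel's TLS code: the 4-byte salt and 8-byte nonce sizes are assumed here to match the usual TLS 1.2 AES-GCM layout, the macro and function names are invented for the example, and the trailing bytes are simply zeroed.

```c
#include <stdint.h>
#include <string.h>

#define DEMO_SALT_SIZE   4	/* assumption: same salt size as AES128-GCM */
#define DEMO_NONCE_SIZE  8	/* assumption: per-record nonce size */
#define DEMO_IV_BUF_SIZE 16	/* full buffer size required for the CCM IV */
#define DEMO_CCM_PREFIX  2	/* the hardcoded prefix value from the text */

/*
 * Fill a 16-byte IV buffer as: prefix byte | salt | nonce | zero padding.
 * The caller supplies DEMO_SALT_SIZE bytes of salt and DEMO_NONCE_SIZE
 * bytes of nonce.
 */
static void demo_build_ccm_iv(uint8_t iv[DEMO_IV_BUF_SIZE],
			      const uint8_t *salt, const uint8_t *nonce)
{
	memset(iv, 0, DEMO_IV_BUF_SIZE);
	iv[0] = DEMO_CCM_PREFIX;
	memcpy(iv + 1, salt, DEMO_SALT_SIZE);
	memcpy(iv + 1 + DEMO_SALT_SIZE, nonce, DEMO_NONCE_SIZE);
}
```

With these sizes the prefix, salt and nonce occupy 13 bytes, which is why a 12-byte IV field is not enough and the record structure reserves the full 16 bytes.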
  4. 20 Mar, 2019 3 commits
    • ipv6: Add icmp_echo_ignore_multicast support for ICMPv6 · 03f1eccc
      Stephen Suryaputra committed
      IPv4 has icmp_echo_ignore_broadcast to prevent responding to broadcast pings.
      IPv6 needs a similar mechanism.
      
      v1->v2:
      - Remove NET_IPV6_ICMP_ECHO_IGNORE_MULTICAST.
      Signed-off-by: Stephen Suryaputra <ssuryaextr@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: free request sock directly upon TFO or syncookies error · 9403cf23
      Guillaume Nault committed
      Since the request socket is created locally, it'd make more sense to
      use reqsk_free() instead of reqsk_put() in TFO and syncookies' error
      path.
      
      However, tcp_get_cookie_sock() may set ->rsk_refcnt before freeing the
      socket; tcp_conn_request() may also have non-null ->rsk_refcnt because
      of tcp_try_fastopen(). In both cases 'req' hasn't been exposed
      to the outside world and is safe to free immediately, but that'd
      trigger the WARN_ON_ONCE in reqsk_free().
      
      Define __reqsk_free() for these situations where we know nobody's
      referencing the socket, even though ->rsk_refcnt might be non-null.
      Now we can consolidate the error path of tcp_get_cookie_sock() and
      tcp_conn_request().
      Signed-off-by: Guillaume Nault <gnault@redhat.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: support broadcast/replicast configurable for bc-link · 02ec6caf
      Hoang Le committed
      Currently, a multicast stream uses either broadcast or replicast as
      transmission method, based on the ratio between the number of actual
      destination nodes and the cluster size.
      
      However, when an L2 interface (e.g., VXLAN) provides pseudo
      broadcast support, this becomes very inefficient, as it blindly
      replicates multicast packets to all cluster/subnet nodes,
      irrespective of whether they host actual target sockets or not.
      
      The TIPC multicast algorithm is able to distinguish real destination
      nodes from other nodes, and hence provides a smarter and more
      efficient method for transferring multicast messages than
      pseudo broadcast can do.
      
      Because of this, we now make it possible for users to force
      the broadcast link to permanently switch to using replicast,
      irrespective of which capabilities the bearer provides,
      or pretend to provide.
      Conversely, we also make it possible to force the broadcast link
      to always use true broadcast. While maybe less useful in
      deployed systems, this may at least be useful for testing the
      broadcast algorithm in small clusters.
      
      We retain the current AUTOSELECT ability, i.e., to let the broadcast link
      automatically select which algorithm to use, and to switch back and forth
      between broadcast and replicast as the ratio between destination
      node number and cluster size changes. This remains the default method.
      
      Furthermore, we make it possible to configure the threshold ratio for
      such switches. The default ratio is now set to 10%, down from 25% in the
      earlier implementation.
      Acked-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  5. 19 Mar, 2019 2 commits
    • sctp: get sctphdr by offset in sctp_compute_cksum · 273160ff
      Xin Long committed
      sctp_hdr(skb) only works when skb->transport_header is set properly.
      
      But in Netfilter, skb->transport_header for ipv6 is not guaranteed
      to hold the right value for sctphdr. This can cause the checksum
      check to fail for sctp packets.
      
      So fix it by using the offset, which is always correct in all places.
      
      v1->v2:
        - Fix the changelog.
      
      Fixes: e6d8b64b ("net: sctp: fix and consolidate SCTP checksumming code")
      Reported-by: Li Shuang <shuali@redhat.com>
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
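The difference between locating a header through a cached field and through an offset passed by the caller can be sketched outside the kernel. The packet struct and helper names here are hypothetical and much simpler than an skb; the only protocol fact used is that the SCTP common header is 12 bytes long.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_SCTP_HDR_LEN 12	/* ports (2+2), verification tag (4), checksum (4) */

/* Hypothetical, simplified packet: a buffer plus a cached header offset. */
struct demo_pkt {
	const uint8_t *data;
	size_t len;
	size_t transport_off;	/* cached offset; may be stale or unset */
};

/* Trusts the cached offset, which is only right if someone set it. */
static const uint8_t *hdr_from_cached(const struct demo_pkt *p)
{
	return p->data + p->transport_off;
}

/* Uses the offset the caller already computed; NULL if it does not fit. */
static const uint8_t *hdr_from_offset(const struct demo_pkt *p, size_t off)
{
	if (off > p->len || p->len - off < DEMO_SCTP_HDR_LEN)
		return NULL;
	return p->data + off;
}
```

A caller that has just parsed the preceding headers knows the correct offset, so passing it explicitly does not depend on whether an earlier layer updated the cached field.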
    • packets: Always register packet sk in the same order · a4dc6a49
      Maxime Chevallier committed
      When using fanouts with AF_PACKET, the demux functions such as
      fanout_demux_cpu will return an index in the fanout socket array, which
      corresponds to the selected socket.
      
      The ordering of this array depends on the order the sockets were added
      to a given fanout group, so for FANOUT_CPU this means sockets are bound
      to cpus in the order they are configured, which is OK.
      
      However, when stopping then restarting the interface these sockets are
      bound to, the sockets are reassigned to the fanout group in the reverse
      order, due to the fact that they were inserted at the head of the
      interface's AF_PACKET socket list.
      
      This means that traffic that was directed to the first socket in the
      fanout group is now directed to the last one after an interface restart.
      
      In the case of FANOUT_CPU, traffic from CPU0 will be directed to the
      socket that used to receive traffic from the last CPU after an interface
      restart.
      
      This commit introduces a helper to add a socket at the tail of a list,
      then uses it to register AF_PACKET sockets.
      
      Note that this changes the order in which sockets are listed in /proc and
      with sock_diag.
      
      Fixes: dc99f600 ("packet: Add fanout support")
      Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
      Acked-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
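The ordering effect described in the commit above comes down to head insertion versus tail insertion. The following sketch uses a hypothetical singly linked list, not the kernel's socket lists, to show that adding at the head yields the reverse of registration order while adding at the tail preserves it.

```c
#include <stddef.h>

/* Hypothetical stand-ins for a socket and a per-interface socket list. */
struct demo_sock {
	int id;
	struct demo_sock *next;
};

struct demo_list {
	struct demo_sock *head;
	struct demo_sock *tail;
};

/* Insert at the head: the newest entry becomes the first one. */
static void add_head(struct demo_list *l, struct demo_sock *s)
{
	s->next = l->head;
	l->head = s;
	if (!l->tail)
		l->tail = s;
}

/* Insert at the tail: entries stay in registration order. */
static void add_tail(struct demo_list *l, struct demo_sock *s)
{
	s->next = NULL;
	if (l->tail)
		l->tail->next = s;
	else
		l->head = s;
	l->tail = s;
}

/* Return the id of the n-th entry (0-based), or -1 if there is none. */
static int nth_id(const struct demo_list *l, int n)
{
	const struct demo_sock *s = l->head;

	while (s && n-- > 0)
		s = s->next;
	return s ? s->id : -1;
}
```

If sockets 0, 1 and 2 are registered in that order, walking a head-inserted list visits 2, 1, 0, so re-adding them to a fanout group in walk order flips the index-to-socket mapping. A tail-inserted list is walked as 0, 1, 2.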
  6. 16 Mar, 2019 3 commits
  7. 15 Mar, 2019 3 commits
  8. 14 Mar, 2019 2 commits
    • bpf: Add bpf_get_listener_sock(struct bpf_sock *sk) helper · dbafd7dd
      Martin KaFai Lau committed
      Add a new helper "struct bpf_sock *bpf_get_listener_sock(struct bpf_sock *sk)"
      which returns a bpf_sock in TCP_LISTEN state.  It will trace back to
      the listener sk from a request_sock if possible.  It returns NULL
      for all other cases.
      
      No reference is taken because the helper ensures the sk is
      in SOCK_RCU_FREE (where the TCP_LISTEN sock should be in).
      Hence, bpf_sk_release() is unnecessary and the verifier does not
      allow bpf_sk_release(listen_sk) to be called either.
      
      The following is also allowed because the bpf_prog is run under
      rcu_read_lock():
      
      	sk = bpf_sk_lookup_tcp();
      	/* if (!sk) { ... } */
      	listen_sk = bpf_get_listener_sock(sk);
      	/* if (!listen_sk) { ... } */
      	bpf_sk_release(sk);
      	src_port = listen_sk->src_port; /* Allowed */
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • bpf: Fix bpf_tcp_sock and bpf_sk_fullsock issue related to bpf_sk_release · 1b986589
      Martin KaFai Lau committed
      Lorenz Bauer [thanks!] reported that a ptr returned by bpf_tcp_sock(sk)
      can still be accessed after bpf_sk_release(sk).
      Both bpf_tcp_sock() and bpf_sk_fullsock() have the same issue.
      This patch addresses them together.
      
      A simple reproducer looks like this:
      
      	sk = bpf_sk_lookup_tcp();
      	/* if (!sk) ... */
      	tp = bpf_tcp_sock(sk);
      	/* if (!tp) ... */
      	bpf_sk_release(sk);
      	snd_cwnd = tp->snd_cwnd; /* oops! The verifier does not complain. */
      
      The problem is the verifier did not scrub the register's states of
      the tcp_sock ptr (tp) after bpf_sk_release(sk).
      
      [ Note that when calling bpf_tcp_sock(sk), the sk is not always
        refcount-acquired. e.g. bpf_tcp_sock(skb->sk). The verifier works
        fine for this case. ]
      
      Currently, the verifier does not track if a helper's return ptr (in REG_0)
      is "carry"-ing one of its argument's refcount status. To carry this info,
      the reg1->id needs to be stored in reg0.
      
      One approach was tried, like "reg0->id = reg1->id", when calling
      "bpf_tcp_sock()".  The main idea was to avoid adding another "ref_obj_id"
      for the same reg.  However, overlapping the NULL marking and ref
      tracking purpose in one "id" does not work well:
      
      	ref_sk = bpf_sk_lookup_tcp();
      	fullsock = bpf_sk_fullsock(ref_sk);
      	tp = bpf_tcp_sock(ref_sk);
      	if (!fullsock) {
      	     bpf_sk_release(ref_sk);
      	     return 0;
      	}
      	/* fullsock_reg->id is marked for NOT-NULL.
      	 * Same for tp_reg->id because they have the same id.
      	 */
      
      	/* oops. verifier did not complain about the missing !tp check */
      	snd_cwnd = tp->snd_cwnd;
      
      Hence, a new "ref_obj_id" is needed in "struct bpf_reg_state".
      With a new ref_obj_id, when bpf_sk_release(sk) is called, the verifier can
      scrub all reg states which have a ref_obj_id match.  It is done with the
      changes in release_reg_references() in this patch.
      
      While fixing it, sk_to_full_sk() is removed from bpf_tcp_sock() and
      bpf_sk_fullsock() to prevent these helpers from returning
      another ptr. It will make bpf_sk_release(tp) possible:
      
      	sk = bpf_sk_lookup_tcp();
      	/* if (!sk) ... */
      	tp = bpf_tcp_sock(sk);
      	/* if (!tp) ... */
      	bpf_sk_release(tp);
      
      A separate helper "bpf_get_listener_sock()" will be added in a later
      patch to do sk_to_full_sk().
      
      Misc change notes:
      - To allow bpf_sk_release(tp), the arg of bpf_sk_release() is changed
        from ARG_PTR_TO_SOCKET to ARG_PTR_TO_SOCK_COMMON.  ARG_PTR_TO_SOCKET
        is removed from bpf.h since no helper is using it.
      
      - arg_type_is_refcounted() is renamed to arg_type_may_be_refcounted()
        because ARG_PTR_TO_SOCK_COMMON is the only one and skb->sk is not
        refcounted.  All bpf_sk_release(), bpf_sk_fullsock() and bpf_tcp_sock()
        take ARG_PTR_TO_SOCK_COMMON.
      
      - check_refcount_ok() ensures is_acquire_function() cannot take
        arg_type_may_be_refcounted() as its argument.
      
      - The check_func_arg() can only allow one refcount-ed arg.  It is
        guaranteed by check_refcount_ok() which ensures at most one arg can be
        refcounted.  Hence, it is a verifier internal error if >1 refcount arg
        found in check_func_arg().
      
      - In release_reference(), release_reference_state() is called
        first to ensure a match on "reg->ref_obj_id" can be found before
        scrubbing the reg states with release_reg_references().
      
      - reg_is_refcounted() is no longer needed.
        1. In mark_ptr_or_null_regs(), its usage is replaced by
           "ref_obj_id && ref_obj_id == id" because,
           when is_null == true, release_reference_state() should only be
           called on the ref_obj_id obtained by an acquire helper (i.e.
           is_acquire_function() == true).  Otherwise, the following
           would happen:
      
      	sk = bpf_sk_lookup_tcp();
      	/* if (!sk) { ... } */
      	fullsock = bpf_sk_fullsock(sk);
      	if (!fullsock) {
      		/*
      		 * release_reference_state(fullsock_reg->ref_obj_id)
      		 * where fullsock_reg->ref_obj_id == sk_reg->ref_obj_id.
      		 *
      		 * Hence, the following bpf_sk_release(sk) will fail
      		 * because the ref state has already been released in the
      		 * earlier release_reference_state(fullsock_reg->ref_obj_id).
      		 */
      		bpf_sk_release(sk);
      	}
      
        2. In release_reg_references(), the current reg_is_refcounted() call
           is unnecessary because the id check is enough.
      
      - The type_is_refcounted() and type_is_refcounted_or_null()
        are no longer needed also because reg_is_refcounted() is removed.
      
      Fixes: 655a51e5 ("bpf: Add struct bpf_tcp_sock and BPF_FUNC_tcp_sock")
      Reported-by: Lorenz Bauer <lmb@cloudflare.com>
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
  9. 13 Mar, 2019 10 commits
    • dt-bindings: clock: imx8mq: Fix numbering overlaps and gaps · 010d5166
      Abel Vesa committed
      IMX8MQ_CLK_USB_PHY_REF changes from 163 to 153, this way removing the gap.
      All the following clock ids are now decreased by 10 to keep the numbering
      right. Doing this, the IMX8MQ_CLK_CSI2_CORE is not overlapped with
      IMX8MQ_CLK_GPT1 anymore. IMX8MQ_CLK_GPT1_ROOT changes from 193 to 183 and
      all the following ids are updated accordingly.
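A check for this kind of numbering problem might look like the sketch below. It assumes the ids are listed in ascending order and are meant to be dense; `id_report` and `check_dense_ids()` are hypothetical helpers, not part of the binding header.

```c
#include <stddef.h>

/* Problems found in an ascending id table that is expected to be dense. */
struct id_report {
	int overlaps; /* adjacent entries sharing the same id */
	int gaps;     /* adjacent entries that skip one or more ids */
};

/* Walk neighbouring pairs and count duplicates and skipped values. */
static struct id_report check_dense_ids(const int *ids, size_t n)
{
	struct id_report r = { 0, 0 };

	for (size_t i = 1; i < n; i++) {
		if (ids[i] == ids[i - 1])
			r.overlaps++;
		else if (ids[i] > ids[i - 1] + 1)
			r.gaps++;
	}
	return r;
}
```

For example, a table ending in 152 followed by 163 reports one gap, while the same table with 153 in place of 163 reports none. The 152 neighbour is made up for illustration; only the 163 to 153 change comes from the commit message.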
      Reported-by: Patrick Wildt <patrick@blueri.se>
      Fixes: 1cf3817b ("dt-bindings: Add binding for i.MX8MQ CCM")
      Signed-off-by: Abel Vesa <abel.vesa@nxp.com>
      Signed-off-by: Stephen Boyd <sboyd@kernel.org>
      010d5166
    • K
      Drop flex_arrays · 586187d7
      Kent Overstreet committed
      All existing users have been converted to generic radix trees
      
      Link: http://lkml.kernel.org/r/20181217131929.11727-8-kent.overstreet@gmail.com
      Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
      Acked-by: Dave Hansen <dave.hansen@intel.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Eric Paris <eparis@parisplace.org>
      Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Cc: Paul Moore <paul@paul-moore.com>
      Cc: Pravin B Shelar <pshelar@ovn.org>
      Cc: Shaohua Li <shli@kernel.org>
      Cc: Stephen Smalley <sds@tycho.nsa.gov>
      Cc: Vlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      586187d7
    • K
      sctp: convert to genradix · 2075e50c
      Kent Overstreet committed
      This also makes sctp_stream_alloc_(out|in) saner, in that they no longer
      allocate new flex_arrays/genradixes, they just preallocate more
      elements.
      
      This code does however have a suspicious lack of locking.
      
      Link: http://lkml.kernel.org/r/20181217131929.11727-7-kent.overstreet@gmail.com
      Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
      Cc: Vlad Yasevich <vyasevich@gmail.com>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Eric Paris <eparis@parisplace.org>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Paul Moore <paul@paul-moore.com>
      Cc: Pravin B Shelar <pshelar@ovn.org>
      Cc: Shaohua Li <shli@kernel.org>
      Cc: Stephen Smalley <sds@tycho.nsa.gov>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      2075e50c
    • K
      generic radix trees · ba20ba2e
      Kent Overstreet committed
      Very simple radix tree implementation that supports storing arbitrary
      size entries, up to PAGE_SIZE - upcoming patches will convert existing
      flex_array users to genradixes.  The new genradix code has a much
      simpler API and implementation, and doesn't have a hard limit on the
      number of elements like flex_array does.
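The idea of index-addressed, lazily allocated storage for fixed-size entries no larger than a page can be sketched in user space as follows. This is a deliberately simplified model and not the kernel implementation: it uses a flat, growable directory of pages rather than a multi-level tree, and every name (`toy_genradix`, `toy_ptr()`, `toy_ptr_alloc()`, `TOY_PAGE_SIZE`) is hypothetical.

```c
#include <stdlib.h>

#define TOY_PAGE_SIZE 4096u

struct toy_genradix {
	void **pages;    /* directory of lazily allocated pages */
	size_t nr_pages; /* number of directory slots */
	size_t obj_size; /* entry size, must be 1..TOY_PAGE_SIZE */
};

static void toy_init(struct toy_genradix *r, size_t obj_size)
{
	r->pages = NULL;
	r->nr_pages = 0;
	r->obj_size = obj_size;
}

/* Pure lookup: NULL if the page backing this index was never allocated. */
static void *toy_ptr(struct toy_genradix *r, size_t idx)
{
	size_t per_page = TOY_PAGE_SIZE / r->obj_size;
	size_t page = idx / per_page;

	if (page >= r->nr_pages || !r->pages[page])
		return NULL;
	return (char *)r->pages[page] + (idx % per_page) * r->obj_size;
}

/* Lookup that allocates a zeroed page on demand; NULL on allocation failure. */
static void *toy_ptr_alloc(struct toy_genradix *r, size_t idx)
{
	size_t per_page = TOY_PAGE_SIZE / r->obj_size;
	size_t page = idx / per_page;

	if (page >= r->nr_pages) {
		void **dir = realloc(r->pages, (page + 1) * sizeof(*dir));

		if (!dir)
			return NULL;
		for (size_t i = r->nr_pages; i <= page; i++)
			dir[i] = NULL;
		r->pages = dir;
		r->nr_pages = page + 1;
	}
	if (!r->pages[page]) {
		r->pages[page] = calloc(1, TOY_PAGE_SIZE);
		if (!r->pages[page])
			return NULL;
	}
	return (char *)r->pages[page] + (idx % per_page) * r->obj_size;
}

/* Release every page and the directory, leaving an empty structure. */
static void toy_free(struct toy_genradix *r)
{
	for (size_t i = 0; i < r->nr_pages; i++)
		free(r->pages[i]);
	free(r->pages);
	toy_init(r, r->obj_size);
}
```

Because entries never straddle a page and pages are allocated only when first touched, a sparse index space costs memory only for the pages actually used, and the number of elements is bounded only by available memory.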
      
      Link: http://lkml.kernel.org/r/20181217131929.11727-5-kent.overstreet@gmail.com
      Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Eric Paris <eparis@parisplace.org>
      Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Cc: Paul Moore <paul@paul-moore.com>
      Cc: Pravin B Shelar <pshelar@ovn.org>
      Cc: Shaohua Li <shli@kernel.org>
      Cc: Stephen Smalley <sds@tycho.nsa.gov>
      Cc: Vlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      ba20ba2e
    • M
      memblock: remove memblock_{set,clear}_region_flags · fe145124
      Mike Rapoport committed
      The memblock API provides dedicated helpers to set or clear a flag on a
      memory region, e.g.  memblock_{mark,clear}_hotplug().
      
      The memblock_{set,clear}_region_flags() functions are used only by the
      memblock internal function that adjusts the region flags.  Drop these
      functions and use an open-coded implementation instead.
      
      Link: http://lkml.kernel.org/r/1549455025-17706-2-git-send-email-rppt@linux.ibm.com
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      fe145124
    • M
      memblock: drop memblock_alloc_*_nopanic() variants · 26fb3dae
      Mike Rapoport committed
      As all the memblock allocation functions return NULL in case of error
      rather than panic(), the duplicates with _nopanic suffix can be removed.
      
      Link: http://lkml.kernel.org/r/1548057848-15136-22-git-send-email-rppt@linux.ibm.com
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Reviewed-by: Petr Mladek <pmladek@suse.com>		[printk]
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Dennis Zhou <dennis@kernel.org>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Guo Ren <guoren@kernel.org>
      Cc: Guo Ren <ren_guo@c-sky.com>				[c-sky]
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Juergen Gross <jgross@suse.com>			[Xen]
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Paul Burton <paul.burton@mips.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Rob Herring <robh+dt@kernel.org>
      Cc: Rob Herring <robh@kernel.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      26fb3dae
    • M
      memblock: make memblock_find_in_range_node() and choose_memblock_flags() static · c366ea89
      Mike Rapoport committed
      These functions are not used outside memblock.  Make them static.
      
      Link: http://lkml.kernel.org/r/1548057848-15136-12-git-send-email-rppt@linux.ibm.com
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Dennis Zhou <dennis@kernel.org>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Guo Ren <guoren@kernel.org>
      Cc: Guo Ren <ren_guo@c-sky.com>				[c-sky]
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Juergen Gross <jgross@suse.com>			[Xen]
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Paul Burton <paul.burton@mips.com>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Rob Herring <robh+dt@kernel.org>
      Cc: Rob Herring <robh@kernel.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      c366ea89
    • M
      memblock: refactor internal allocation functions · 92d12f95
      Mike Rapoport committed
      Currently, memblock has several internal functions with overlapping
      functionality.  They all call memblock_find_in_range_node() to find free
      memory and then reserve the allocated range and mark it with kmemleak.
      However, they differ in their allocation constraints and fallback
      strategies.
      
      The allocations returning a physical address first attempt to find free
      memory on the specified node within mirrored memory regions, then retry
      on the same node without the requirement for memory mirroring and
      finally fall back to all available memory.
      
      The allocations returning a virtual address start by clamping the
      allowed range to memblock.current_limit, then attempt to allocate from
      the specified node, from mirrored regions, and above the user-defined
      minimal address.  If that allocation fails, the next attempt is made
      with the node restriction lifted.  Next, the allocation is retried with
      the minimal address reset to zero and, finally, without the requirement
      for mirrored regions.
      
      Let's consolidate various fallbacks handling and make them more
      consistent for physical and virtual variants.  Most of the fallback
      handling is moved to memblock_alloc_range_nid() and it now handles node
      and mirror fallbacks.
      
      The memblock_alloc_internal() uses memblock_alloc_range_nid() to get a
      physical address of the allocated range and converts it to virtual
      address.
      
      The fallback for allocation below the specified minimal address remains
      in memblock_alloc_internal() because memblock_alloc_range_nid() is used
      by CMA with exact requirement for lower bounds.
      
      The memblock_phys_alloc_nid() function is completely dropped as it is not
      used anywhere outside memblock and its only usage can be replaced by a
      call to memblock_alloc_range_nid().
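The split of responsibilities described above (node and mirror fallbacks in one function, the minimal-address fallback in its wrapper) can be modelled in user space as below. This is a sketch under stated assumptions, not the memblock code: the region table, the `toy_*` names and the exact order in which constraints are relaxed are all hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

#define ANY_NODE (-1)

struct toy_region {
	int node;
	bool mirrored;
	unsigned long start;
	bool free;
};

/* One lookup attempt under fixed constraints; returns an index or -1. */
static int toy_find(const struct toy_region *rg, size_t n, int node,
		    bool need_mirror, unsigned long min_addr)
{
	for (size_t i = 0; i < n; i++) {
		if (!rg[i].free || rg[i].start < min_addr)
			continue;
		if (node != ANY_NODE && rg[i].node != node)
			continue;
		if (need_mirror && !rg[i].mirrored)
			continue;
		return (int)i;
	}
	return -1;
}

/*
 * Node and mirror fallbacks live here. The lower bound is never relaxed,
 * so a caller with an exact requirement on min_addr can rely on it.
 */
static int toy_alloc_range_nid(struct toy_region *rg, size_t n, int node,
			       bool want_mirror, unsigned long min_addr)
{
	for (;;) {
		int i = toy_find(rg, n, node, want_mirror, min_addr);

		if (i < 0 && node != ANY_NODE)
			i = toy_find(rg, n, ANY_NODE, want_mirror, min_addr);
		if (i >= 0) {
			rg[i].free = false; /* reserve the range */
			return i;
		}
		if (!want_mirror)
			return -1;
		want_mirror = false; /* retry without the mirroring requirement */
	}
}

/* Wrapper that additionally retries with the lower bound reset to zero. */
static int toy_alloc_internal(struct toy_region *rg, size_t n, int node,
			      bool want_mirror, unsigned long min_addr)
{
	int i = toy_alloc_range_nid(rg, n, node, want_mirror, min_addr);

	if (i < 0 && min_addr)
		i = toy_alloc_range_nid(rg, n, node, want_mirror, 0);
	return i;
}
```

Keeping the lower-bound retry out of the shared function is the design point: callers that need a strict lower bound call the inner function directly, while general-purpose callers go through the wrapper.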
      
      [rppt@linux.ibm.com: fix parameter order in memblock_phys_alloc_try_nid()]
        Link: http://lkml.kernel.org/r/20190203113915.GC8620@rapoport-lnx
      Link: http://lkml.kernel.org/r/1548057848-15136-11-git-send-email-rppt@linux.ibm.com
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Tested-by: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Dennis Zhou <dennis@kernel.org>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Guo Ren <guoren@kernel.org>
      Cc: Guo Ren <ren_guo@c-sky.com>				[c-sky]
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Juergen Gross <jgross@suse.com>			[Xen]
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Paul Burton <paul.burton@mips.com>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Rob Herring <robh+dt@kernel.org>
      Cc: Rob Herring <robh@kernel.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      92d12f95
    • M
      memblock: drop memblock_alloc_base() · 0ba9e6ed
      Mike Rapoport committed
      The memblock_alloc_base() function tries to allocate memory up to the
      limit specified by its max_addr parameter and panics if the allocation
      fails.  Replace its usage with memblock_phys_alloc_range() and make the
      callers check the return value and panic in case of error.
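The caller-side pattern, where the allocator reports failure by returning 0 and the caller decides whether that is fatal, can be sketched in user space as follows. The bump allocator `toy_phys_alloc_range()` and the wrapper `alloc_or_die()` are hypothetical stand-ins; their signatures are not those of the memblock API, and `abort()` stands in for a kernel panic.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

typedef uint64_t toy_phys_addr_t;

/*
 * Hypothetical bounded allocator: hands out [base, base + size) from a
 * bump pointer and returns 0 if the range would exceed max_addr.
 */
static toy_phys_addr_t toy_phys_alloc_range(toy_phys_addr_t *next_free,
					    toy_phys_addr_t size,
					    toy_phys_addr_t max_addr)
{
	toy_phys_addr_t base = *next_free;

	if (size == 0 || base > max_addr || size > max_addr - base)
		return 0;
	*next_free = base + size;
	return base;
}

/* The caller, not the allocator, checks the result and treats 0 as fatal. */
static toy_phys_addr_t alloc_or_die(toy_phys_addr_t *next_free,
				    toy_phys_addr_t size,
				    toy_phys_addr_t max_addr)
{
	toy_phys_addr_t pa = toy_phys_alloc_range(next_free, size, max_addr);

	if (!pa) {
		fprintf(stderr, "failed to allocate %llu bytes below %#llx\n",
			(unsigned long long)size, (unsigned long long)max_addr);
		abort();
	}
	return pa;
}
```

Moving the failure handling to the call site lets each caller print a message specific to what it was allocating, and lets non-critical callers recover instead of halting.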
      
      Link: http://lkml.kernel.org/r/1548057848-15136-10-git-send-email-rppt@linux.ibm.com
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au>		[powerpc]
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Dennis Zhou <dennis@kernel.org>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Guo Ren <guoren@kernel.org>
      Cc: Guo Ren <ren_guo@c-sky.com>				[c-sky]
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Juergen Gross <jgross@suse.com>			[Xen]
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Paul Burton <paul.burton@mips.com>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Rob Herring <robh+dt@kernel.org>
      Cc: Rob Herring <robh@kernel.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      0ba9e6ed
    • M
      memblock: drop __memblock_alloc_base() · 42b46aef
      Mike Rapoport committed
      The __memblock_alloc_base() function tries to allocate memory up to
      the limit specified by its max_addr parameter.  Depending on the value
      of this parameter, __memblock_alloc_base() can be replaced with the
      appropriate memblock_phys_alloc*() variant.
      
      Link: http://lkml.kernel.org/r/1548057848-15136-9-git-send-email-rppt@linux.ibm.com
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Acked-by: Rob Herring <robh@kernel.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Dennis Zhou <dennis@kernel.org>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Guo Ren <guoren@kernel.org>
      Cc: Guo Ren <ren_guo@c-sky.com>				[c-sky]
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Juergen Gross <jgross@suse.com>			[Xen]
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Paul Burton <paul.burton@mips.com>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Rob Herring <robh+dt@kernel.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      42b46aef