1. 16 Jun, 2015 1 commit
    • sock_diag: implement a get_info handler for inet · 35ac838a
      Craig Gallek committed
      This get_info handler will simply dispatch to the appropriate
      existing inet protocol handler.
      
      This patch also includes a new netlink attribute
      (INET_DIAG_PROTOCOL).  This attribute is currently only used
      for multicast messages.  Without this attribute, there is no
      way of knowing the IP protocol used by the socket information
      being broadcast.  This attribute is not necessary in the 'dump'
      variant of this protocol (though it could easily be added)
      because dump requests are issued for specific family/protocol
      pairs.
      
      Tested: ss -E (note, the -E option has not yet been merged into
      the upstream version of ss).
      Signed-off-by: Craig Gallek <kraig@google.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 25 May, 2015 1 commit
  3. 23 May, 2015 1 commit
    • tcp: fix a potential deadlock in tcp_get_info() · d654976c
      Eric Dumazet committed
      Taking socket spinlock in tcp_get_info() can deadlock, as
      inet_diag_dump_icsk() holds the &hashinfo->ehash_locks[i],
      while packet processing can use the reverse locking order.
      
      We could avoid this locking for TCP_LISTEN states, but lockdep would
      certainly get confused, as all TCP sockets share the same lockdep classes.
      
      [  523.722504] ======================================================
      [  523.728706] [ INFO: possible circular locking dependency detected ]
      [  523.734990] 4.1.0-dbg-DEV #1676 Not tainted
      [  523.739202] -------------------------------------------------------
      [  523.745474] ss/18032 is trying to acquire lock:
      [  523.750002]  (slock-AF_INET){+.-...}, at: [<ffffffff81669d44>] tcp_get_info+0x2c4/0x360
      [  523.758129]
      [  523.758129] but task is already holding lock:
      [  523.763968]  (&(&hashinfo->ehash_locks[i])->rlock){+.-...}, at: [<ffffffff816bcb75>] inet_diag_dump_icsk+0x1d5/0x6c0
      [  523.774661]
      [  523.774661] which lock already depends on the new lock.
      [  523.774661]
      [  523.782850]
      [  523.782850] the existing dependency chain (in reverse order) is:
      [  523.790326]
      -> #1 (&(&hashinfo->ehash_locks[i])->rlock){+.-...}:
      [  523.796599]        [<ffffffff811126bb>] lock_acquire+0xbb/0x270
      [  523.802565]        [<ffffffff816f5868>] _raw_spin_lock+0x38/0x50
      [  523.808628]        [<ffffffff81665af8>] __inet_hash_nolisten+0x78/0x110
      [  523.815273]        [<ffffffff816819db>] tcp_v4_syn_recv_sock+0x24b/0x350
      [  523.822067]        [<ffffffff81684d41>] tcp_check_req+0x3c1/0x500
      [  523.828199]        [<ffffffff81682d09>] tcp_v4_do_rcv+0x239/0x3d0
      [  523.834331]        [<ffffffff816842fe>] tcp_v4_rcv+0xa8e/0xc10
      [  523.840202]        [<ffffffff81658fa3>] ip_local_deliver_finish+0x133/0x3e0
      [  523.847214]        [<ffffffff81659a9a>] ip_local_deliver+0xaa/0xc0
      [  523.853440]        [<ffffffff816593b8>] ip_rcv_finish+0x168/0x5c0
      [  523.859624]        [<ffffffff81659db7>] ip_rcv+0x307/0x420
      
      Let's use the u64_stats_sync infrastructure instead. As a bonus, 64-bit
      arches get optimized, as these are no-ops for them.
      
      Fixes: 0df48c26 ("tcp: add tcpi_bytes_acked to tcp_info")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 22 May, 2015 3 commits
    • tcp: add tcpi_segs_in and tcpi_segs_out to tcp_info · 2efd055c
      Marcelo Ricardo Leitner committed
      This patch tracks the total number of inbound and outbound segments on a
      TCP socket. Comparing these totals against the retransmission count gives
      a rough idea of connection quality.

      RFC 4898 names these tcpEStatsPerfSegsIn and tcpEStatsPerfSegsOut.

      Each is a 32-bit field and can be fetched either with the TCP_INFO
      getsockopt(), if one has a handle on a TCP socket, or through the
      inet_diag netlink facility (an iproute2/ss patch will follow).
      
      Note that tp->segs_out was placed near tp->snd_nxt for good data
      locality and minimal performance impact, while tp->segs_in was placed
      near tp->bytes_received for the same reason.
      
      Joint work with Eric Dumazet.
      
      Note that received SYN are accounted on the listener, but sent SYNACK
      are not accounted.
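The comparison against retransmissions described in this commit message could be sketched from userspace roughly as follows. This is a minimal sketch, not code from the patch: the helper names `read_segment_counters()` and `retrans_fraction()` are hypothetical, and it assumes kernel headers recent enough for `struct tcp_info` in `<linux/tcp.h>` to carry `tcpi_segs_in` and `tcpi_segs_out`.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <netinet/in.h>   /* IPPROTO_TCP */
#include <sys/socket.h>   /* getsockopt() */
#include <linux/tcp.h>    /* struct tcp_info, TCP_INFO */

/* Hypothetical helper: share of outbound segments that were retransmissions.
 * Returns 0.0 when no segment has been sent yet. */
static double retrans_fraction(uint32_t total_retrans, uint32_t segs_out)
{
    if (segs_out == 0)
        return 0.0;
    return (double)total_retrans / (double)segs_out;
}

/* Hypothetical helper: read the segment counters of a TCP socket.
 * Returns 0 on success, -1 on failure with errno set. */
static int read_segment_counters(int fd, uint32_t *segs_in,
                                 uint32_t *segs_out, uint32_t *total_retrans)
{
    struct tcp_info info;
    socklen_t len = sizeof(info);

    if (getsockopt(fd, IPPROTO_TCP, TCP_INFO, &info, &len) == -1)
        return -1;

    /* An older kernel may return a shorter structure that stops before
     * the new counters; treat that as "not supported". */
    if (len < offsetof(struct tcp_info, tcpi_segs_in) + sizeof(info.tcpi_segs_in)) {
        errno = ENOTSUP;
        return -1;
    }

    *segs_in = info.tcpi_segs_in;
    *segs_out = info.tcpi_segs_out;
    *total_retrans = info.tcpi_total_retrans;
    return 0;
}
```

A caller would pass a connected TCP socket to `read_segment_counters()` and feed `total_retrans` and `segs_out` into `retrans_fraction()`. The result is only a rough indicator, since the outbound counter is not limited to data segments.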
      Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: ensure epoll edge trigger wakeup when write queue is empty · ce5ec440
      Jason Baron committed
      We currently rely on the setting of SOCK_NOSPACE in the write()
      path to ensure that we wake up any epoll edge trigger waiters when
      acks return to free space in the write queue. However, if we fail
      to allocate even a single skb in the write queue, we could end up
      waiting indefinitely.
      
      Fix this by explicitly issuing a wakeup when we detect the condition
      of an empty write queue and a return value of -EAGAIN. This allows
      userspace to re-try as we expect this to be a temporary failure.
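The userspace retry that this fix relies on might look like the following. It is a minimal sketch under stated assumptions: `send_all_edge_triggered()` is a hypothetical helper, `fd` is assumed to be non-blocking and already registered on `epfd` with `EPOLLOUT | EPOLLET`, and `epfd` is assumed to watch only that descriptor.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write the whole buffer to a non-blocking socket that
 * is registered edge-triggered for EPOLLOUT on epfd.
 * Returns 0 once everything has been written, -1 on a hard error. */
static int send_all_edge_triggered(int epfd, int fd, const char *buf, size_t len)
{
    size_t off = 0;

    while (off < len) {
        ssize_t n = write(fd, buf + off, len - off);

        if (n > 0) {
            off += (size_t)n;
            continue;
        }
        if (n < 0 && errno == EINTR)
            continue;
        if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
            /* No space right now: sleep until the kernel reports a new
             * EPOLLOUT edge, then retry the write. */
            struct epoll_event ev;

            if (epoll_wait(epfd, &ev, 1, -1) < 0 && errno != EINTR)
                return -1;
            continue;
        }
        return -1; /* hard error, or an unexpected zero-length write */
    }
    return 0;
}
```

The loop sleeps in `epoll_wait()` after every `EAGAIN`, so it depends on a wakeup always following that error. That is the guarantee the commit message describes for the empty-write-queue case.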
      
      I've tested this approach by artificially making
      sk_stream_alloc_skb() return NULL periodically. In that case,
      epoll edge trigger waiters will hang indefinitely in epoll_wait()
      without this patch.
      Signed-off-by: Jason Baron <jbaron@akamai.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: add a force_schedule argument to sk_stream_alloc_skb() · eb934478
      Eric Dumazet committed
      In commit 8e4d980a ("tcp: fix behavior for epoll edge trigger")
      we fixed a possible hang of TCP sockets under memory pressure,
      by allowing sk_stream_alloc_skb() to use sk_forced_mem_schedule()
      if no packet is in socket write queue.
      
      It turns out there are other cases where we want to force memory
      schedule :
      
      tcp_fragment() & tso_fragment() need to split a big TSO packet into
      two smaller ones. If we block here because of TCP memory pressure,
      we can effectively block TCP socket from sending new data.
      If no further ACK is coming, this hang would be permanent, and the socket
      has no chance to effectively reduce its memory usage.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  5. 20 May, 2015 1 commit
    • tcp: Return error instead of partial read for saved syn headers · aea0929e
      Eric B Munson committed
      Currently the getsockopt() requesting the cached contents of the syn
      packet headers will fail silently if the caller uses a buffer that is
      too small to contain the requested data.  Rather than fail silently and
      discard the headers, getsockopt() should return an error and report the
      required size to hold the data.
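A caller could use the reported size to retry with a larger buffer, roughly as sketched below. This is an illustration only: `fetch_saved_syn()` is a hypothetical helper, and the sketch assumes that the required size comes back through the `optlen` argument when the call fails, which the commit message implies but does not spell out. It also assumes `TCP_SAVED_SYN` is available from `<linux/tcp.h>`.

```c
#include <stdlib.h>
#include <string.h>
#include <netinet/in.h>   /* IPPROTO_TCP */
#include <sys/socket.h>   /* getsockopt() */
#include <linux/tcp.h>    /* TCP_SAVED_SYN */

/* Hypothetical helper: fetch the saved SYN headers of a passive connection.
 * Tries a small stack buffer first; if the call fails and reports a larger
 * required size (assumed to arrive via the length argument), retries with a
 * heap buffer of that size. Returns a malloc()ed buffer that the caller must
 * free, or NULL on failure. */
static unsigned char *fetch_saved_syn(int fd, socklen_t *out_len)
{
    unsigned char small[64];
    socklen_t len = sizeof(small);
    unsigned char *buf;

    if (getsockopt(fd, IPPROTO_TCP, TCP_SAVED_SYN, small, &len) == 0) {
        buf = malloc(len ? len : 1);
        if (buf == NULL)
            return NULL;
        memcpy(buf, small, len);
        *out_len = len;
        return buf;
    }

    /* The call failed. Retry only if a larger size was reported. */
    if (len <= sizeof(small))
        return NULL;

    buf = malloc(len);
    if (buf == NULL)
        return NULL;
    if (getsockopt(fd, IPPROTO_TCP, TCP_SAVED_SYN, buf, &len) == -1) {
        free(buf);
        return NULL;
    }
    *out_len = len;
    return buf;
}
```

The retry works only because the failed call no longer discards the headers, which is the behaviour change this commit describes.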
      Signed-off-by: Eric B Munson <emunson@akamai.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
      Cc: James Morris <jmorris@namei.org>
      Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
      Cc: Patrick McHardy <kaber@trash.net>
      Cc: netdev@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
  6. 18 May, 2015 2 commits
  7. 06 May, 2015 1 commit
    • tcp: provide SYN headers for passive connections · cd8ae852
      Eric Dumazet committed
      This patch allows a server application to get the TCP SYN headers for
      its passive connections.  This is useful if the server is doing
      fingerprinting of clients based on SYN packet contents.
      
      Two socket options are added: TCP_SAVE_SYN and TCP_SAVED_SYN.
      
      The first is used on a socket to enable saving the SYN headers
      for child connections. This can be set before or after the listen()
      call.
      
      The latter is used to retrieve the SYN headers for passive connections,
      if the parent listener has enabled TCP_SAVE_SYN.
      
      TCP_SAVED_SYN can be read only once: reading it frees the saved SYN headers.
      
      The data returned in TCP_SAVED_SYN are network (IPv4/IPv6) and TCP
      headers.
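A server might use the two options along the following lines. This is a hedged sketch rather than code from the patch: the three helper names are hypothetical, and it assumes `TCP_SAVE_SYN` and `TCP_SAVED_SYN` are available from `<linux/tcp.h>`. Because the returned data starts with the network header, the first nibble tells IPv4 from IPv6.

```c
#include <stddef.h>
#include <netinet/in.h>   /* IPPROTO_TCP */
#include <sys/socket.h>   /* setsockopt(), getsockopt() */
#include <sys/types.h>
#include <linux/tcp.h>    /* TCP_SAVE_SYN, TCP_SAVED_SYN */

/* Hypothetical helper: ask the listener to keep SYN headers for its children. */
static int enable_save_syn(int listen_fd)
{
    int one = 1;

    return setsockopt(listen_fd, IPPROTO_TCP, TCP_SAVE_SYN, &one, sizeof(one));
}

/* Hypothetical helper: read the saved headers of an accepted connection.
 * Returns the number of bytes stored in buf, or -1 on error. */
static ssize_t read_saved_syn(int conn_fd, void *buf, size_t cap)
{
    socklen_t len = (socklen_t)cap;

    if (getsockopt(conn_fd, IPPROTO_TCP, TCP_SAVED_SYN, buf, &len) == -1)
        return -1;
    return (ssize_t)len;
}

/* Hypothetical helper: IP version of the saved network header (4 or 6),
 * or -1 if the buffer is empty or the version nibble is unexpected. */
static int saved_syn_ip_version(const unsigned char *hdr, size_t len)
{
    int version;

    if (len == 0)
        return -1;
    version = hdr[0] >> 4;
    return (version == 4 || version == 6) ? version : -1;
}
```

In use, `enable_save_syn()` runs once on the listening socket, and `read_saved_syn()` runs once per accepted connection, since the read frees the saved headers.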
      
      The original patch was written by Tom Herbert; I changed it to not hold
      a full skb (and associated dst and conntracking reference).

      We have used this patch for about 3 years at Google.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Tested-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  8. 30 Apr, 2015 3 commits
    • tcp: add TCP_CC_INFO socket option · 6e9250f5
      Eric Dumazet committed
      Some congestion control modules can provide per-flow information,
      but the current way to get this information is to use netlink.
      
      Like TCP_INFO, let's add TCP_CC_INFO so that applications can
      issue a getsockopt() if they have a socket file descriptor,
      instead of playing complex netlink games.
      
      Sample usage would be :
      
        union tcp_cc_info info;
        socklen_t len = sizeof(info);
      
        if (getsockopt(fd, SOL_TCP, TCP_CC_INFO, &info, &len) == -1)
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Acked-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: add tcpi_bytes_received to tcp_info · bdd1f9ed
      Eric Dumazet committed
      This patch tracks total number of payload bytes received on a TCP socket.
      This is the sum of all changes done to tp->rcv_nxt
      
      RFC4898 named this : tcpEStatsAppHCThruOctetsReceived
      
      This is a 64bit field, and can be fetched both from TCP_INFO
      getsockopt() if one has a handle on a TCP socket, or from inet_diag
      netlink facility (iproute2/ss patch will follow)
      
      Note that tp->bytes_received was placed near tp->rcv_nxt for
      best data locality and minimal performance impact.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Cc: Matt Mathis <mattmathis@google.com>
      Cc: Eric Salo <salo@google.com>
      Cc: Martin Lau <kafai@fb.com>
      Cc: Chris Rapier <rapier@psc.edu>
      Acked-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: add tcpi_bytes_acked to tcp_info · 0df48c26
      Eric Dumazet committed
      This patch tracks total number of bytes acked for a TCP socket.
      This is the sum of all changes done to tp->snd_una, and allows
      for precise tracking of delivered data.
      
      RFC4898 named this : tcpEStatsAppHCThruOctetsAcked
      
      This is a 64bit field, and can be fetched both from TCP_INFO
      getsockopt() if one has a handle on a TCP socket, or from inet_diag
      netlink facility (iproute2/ss patch will follow)
      
      Note that tp->bytes_acked was placed near tp->snd_una for
      best data locality and minimal performance impact.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Yuchung Cheng <ycheng@google.com>
      Cc: Matt Mathis <mattmathis@google.com>
      Cc: Eric Salo <salo@google.com>
      Cc: Martin Lau <kafai@fb.com>
      Cc: Chris Rapier <rapier@psc.edu>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  9. 22 Apr, 2015 1 commit
  10. 18 Apr, 2015 1 commit
  11. 12 Apr, 2015 1 commit
  12. 04 Apr, 2015 2 commits
  13. 25 Mar, 2015 1 commit
  14. 06 Mar, 2015 1 commit
  15. 03 Mar, 2015 1 commit
  16. 02 Mar, 2015 1 commit
  17. 04 Feb, 2015 1 commit
    • ip: convert tcp_sendmsg() to iov_iter primitives · 57be5bda
      Al Viro committed
      patch is actually smaller than it seems to be - most of it is unindenting
      the inner loop body in tcp_sendmsg() itself...
      
      the bit in tcp_input.c is going to get reverted very soon - that's what
      memcpy_from_msg() will become, but not in this commit; let's keep it
      reasonably contained...
      
      There's one potentially subtle change here: in case of short copy from
      userland, mainline tcp_send_syn_data() discards the skb it has allocated
      and falls back to normal path, where we'll send as much as possible after
      rereading the same data again.  This patch trims SYN+data skb instead -
      that way we don't need to copy from the same place twice.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  18. 10 Dec, 2014 3 commits
    • tcp: refine TSO autosizing · 605ad7f1
      Eric Dumazet committed
      Commit 95bd09eb ("tcp: TSO packets automatic sizing") tried to
      control TSO size, but did this at the wrong place (sendmsg() time)
      
      At sendmsg() time, we might have a pessimistic view of flow rate,
      and we end up building very small skbs (with 2 MSS per skb).
      
      This is bad because :
      
       - It sends small TSO packets even in Slow Start where rate quickly
         increases.
       - It tends to make socket write queue very big, increasing tcp_ack()
         processing time, but also increasing memory needs, not necessarily
         accounted for, as fast clones overhead is currently ignored.
       - Lower GRO efficiency and more ACK packets.
      
      Servers with a lot of short-lived connections suffer from this.

      Let's instead fill skbs as much as possible (64KB of payload), but split
      them at xmit time, when we have a precise idea of the flow rate.
      skb split is actually quite efficient.
      
      Patch looks bigger than necessary, because TCP Small Queue decision now
      has to take place after the eventual split.
      
      As Neal suggested, introduce a new tcp_tso_autosize() helper, so that
      tcp_tso_should_defer() can be synchronized on same goal.
      
      Rename tp->xmit_size_goal_segs to tp->gso_segs, as this variable
      contains number of mss that we can put in GSO packet, and is not
      related to the autosizing goal anymore.
      
      Tested:
      
      40 ms rtt link
      
      nstat >/dev/null
      netperf -H remote -l -2000000 -- -s 1000000
      nstat | egrep "IpInReceives|IpOutRequests|TcpOutSegs|IpExtOutOctets"
      
      Before patch :
      
      Recv   Send    Send
      Socket Socket  Message  Elapsed
      Size   Size    Size     Time     Throughput
      bytes  bytes   bytes    secs.    10^6bits/s
      
       87380 2000000 2000000    0.36         44.22
      IpInReceives                    600                0.0
      IpOutRequests                   599                0.0
      TcpOutSegs                      1397               0.0
      IpExtOutOctets                  2033249            0.0
      
      After patch :
      
      Recv   Send    Send
      Socket Socket  Message  Elapsed
      Size   Size    Size     Time     Throughput
      bytes  bytes   bytes    secs.    10^6bits/sec
      
       87380 2000000 2000000    0.36       44.27
      IpInReceives                    221                0.0
      IpOutRequests                   232                0.0
      TcpOutSegs                      1397               0.0
      IpExtOutOctets                  2013953            0.0
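One way to read the two nstat samples is to divide TcpOutSegs by IpOutRequests, which gives a rough figure for segments per packet handed to the IP layer. The snippet below is a back-of-the-envelope sketch: `segs_per_ip_packet()` is a hypothetical helper, and treating IpOutRequests as a proxy for the number of TSO packets is an assumption, since that counter is not limited to data packets.

```c
/* Hypothetical helper: average number of TCP segments carried per packet
 * handed to the IP layer. Returns 0.0 when no packet was sent. */
static double segs_per_ip_packet(unsigned long tcp_out_segs,
                                 unsigned long ip_out_requests)
{
    if (ip_out_requests == 0)
        return 0.0;
    return (double)tcp_out_segs / (double)ip_out_requests;
}
```

With the figures above, `segs_per_ip_packet(1397, 599)` is a little over 2 before the patch, in line with the "2 MSS per skb" remark, and `segs_per_ip_packet(1397, 232)` is about 6 after it.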
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Acked-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • put iov_iter into msghdr · c0371da6
      Al Viro committed
      Note that the code _using_ ->msg_iter at that point will be very
      unhappy with anything other than unshifted iovec-backed iov_iter.
      We still need to convert users to proper primitives.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • f4362a2c
  19. 27 Nov, 2014 1 commit
  20. 24 Nov, 2014 1 commit
  21. 06 Nov, 2014 1 commit
    • net: Add and use skb_copy_datagram_msg() helper. · 51f3d02b
      David S. Miller committed
      This encapsulates all of the skb_copy_datagram_iovec() callers
      with call argument signature "skb, offset, msghdr->msg_iov, length".
      
      When we move to iov_iters in the networking, the iov_iter object will
      sit in the msghdr.
      
      Having a helper like this means there will be fewer places to touch
      during that transformation.
      
      Based upon descriptions and patch from Al Viro.
      Signed-off-by: David S. Miller <davem@davemloft.net>
  22. 26 Oct, 2014 1 commit
    • tcp: md5: do not use alloc_percpu() · 349ce993
      Eric Dumazet committed
      percpu tcp_md5sig_pool contains memory blobs that ultimately
      go through sg_set_buf().
      
      -> sg_set_page(sg, virt_to_page(buf), buflen, offset_in_page(buf));
      
      This requires that whole area is in a physically contiguous portion
      of memory. And that @buf is not backed by vmalloc().
      
      Given that alloc_percpu() can use vmalloc() areas, this does not
      fit the requirements.
      
      Replace alloc_percpu() by a static DEFINE_PER_CPU() as tcp_md5sig_pool
      is small anyway, there is no gain to dynamically allocate it.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Fixes: 765cf997 ("tcp: md5: remove one indirection level in tcp_md5sig_pool")
      Reported-by: Crestez Dan Leonard <cdleonard@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  23. 02 Oct, 2014 2 commits
    • tcp: abort orphan sockets stalling on zero window probes · b248230c
      Yuchung Cheng committed
      Currently we have two different policies for orphan sockets
      that repeatedly stall on zero window ACKs. If a socket gets
      a zero window ACK when it is transmitting data, the RTO is
      used to probe the window. The socket is aborted after roughly
      tcp_orphan_retries() retries (as in tcp_write_timeout()).
      
      But if the socket was idle when it received the zero window ACK,
      and later wants to send more data, we use the probe timer to
      probe the window. If the receiver always returns zero window ACKs,
      icsk_probes keeps getting reset in tcp_ack() and the orphan socket
      can stall forever until the system reaches the orphan limit (as
      commented in tcp_probe_timer()). This opens up a simple attack
      to create lots of hanging orphan sockets to burn the memory
      and the CPU, as demonstrated in the recent netdev post "TCP
      connection will hang in FIN_WAIT1 after closing if zero window is
      advertised." http://www.spinics.net/lists/netdev/msg296539.html
      
      This patch follows the design in RTO-based probe: we abort an orphan
      socket stalling on zero window when the probe timer reaches both
      the maximum backoff and the maximum RTO. For example, a 100ms RTT
      connection will time out after roughly 153 seconds (0.3 + 0.6 +
      .... + 76.8) if the receiver keeps the window shut. If the orphan
      socket passes this check, but the system already has too many orphans
      (as in tcp_out_of_resources()), we still abort it but we'll also
      send an RST packet as the connection may still be active.
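The "roughly 153 seconds" figure is the sum of the doubling series quoted in the example. The small sketch below reproduces that arithmetic in milliseconds; `backoff_total_ms()` is a hypothetical helper, and the 300 ms first interval and 76800 ms last interval are taken from the example itself, not from kernel constants.

```c
/* Hypothetical helper: total time covered by probe intervals that start at
 * first_ms and double each round, up to and including last_ms. */
static unsigned long backoff_total_ms(unsigned long first_ms,
                                      unsigned long last_ms)
{
    unsigned long total = 0;
    unsigned long interval = first_ms;

    if (first_ms == 0)
        return 0;

    while (interval <= last_ms) {
        total += interval;
        interval *= 2;
    }
    return total;
}
```

`backoff_total_ms(300, 76800)` adds nine intervals (300, 600, ..., 76800) and returns 153300, that is 153.3 seconds.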
      
      In addition, we change TCP_USER_TIMEOUT to cover (live or dead)
      sockets stalled on zero-window probes. This changes the semantics
      of TCP_USER_TIMEOUT slightly because it previously applied only
      when the socket had pending transmissions.
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Reported-by: Andrey Dmitrov <andrey.dmitrov@oktetlabs.ru>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: add __init to tcp_init_mem · 47d7a88c
      Fabian Frederick committed
      tcp_init_mem is only called by __init tcp_init.
      Signed-off-by: Fabian Frederick <fabf@skynet.be>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  24. 29 Sep, 2014 2 commits
    • net: tcp: assign tcp cong_ops when tcp sk is created · 55d8694f
      Florian Westphal committed
      Split assignment and initialization from one into two functions.
      
      This is required by followup patches that add Datacenter TCP
      (DCTCP) congestion control algorithm - we need to be able to
      determine if the connection is moderated by DCTCP before the
      3WHS has finished.
      
      As we walk the available congestion control list during the
      assignment, we are always guaranteed to have Reno present as
      it's fixed compiled-in. Therefore, since we're doing the
      early assignment, we don't have a real use for the Reno alias
      tcp_init_congestion_ops anymore and can thus remove it.
      
      Actual use of the congestion control operations happens only
      after the 3WHS has finished. In some cases, however, get_info()
      (if implemented) can be reached via diag before that, so we
      need to zero out the private area for those modules.
      
      Joint work with Daniel Borkmann and Glenn Judd.
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Signed-off-by: Glenn Judd <glenn.judd@morganstanley.com>
      Acked-by: Stephen Hemminger <stephen@networkplumber.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: change tcp_skb_pcount() location · cd7d8498
      Eric Dumazet committed
      Our goal is to touch no more than one cache line per skb in
      a write or receive queue when doing the various walks.
      
      After recent TCP_SKB_CB() reorganizations, it is almost done.
      
      Last part is tcp_skb_pcount() which currently uses
      skb_shinfo(skb)->gso_segs, which is a terrible choice, because it needs
      3 cache lines in current kernel (skb->head, skb->end, and
      shinfo->gso_segs are all in 3 different cache lines, far from skb->cb)
      
      This very simple patch reuses space currently taken by tcp_tw_isn
      only in input path, as tcp_skb_pcount is only needed for skb stored in
      write queue.
      
      This considerably speeds up tcp_ack(), granted we avoid shinfo->tx_flags
      to get SKBTX_ACK_TSTAMP, which seems possible.
      
      This also speeds up all sack processing in general.
      
      This speeds up tcp_sendmsg() because it no longer has to access/dirty
      shinfo.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  25. 28 Sep, 2014 2 commits
  26. 27 Sep, 2014 1 commit
  27. 16 Sep, 2014 1 commit
    • tcp: use TCP_SKB_CB(skb)->tcp_flags in input path · e11ecddf
      Eric Dumazet committed
      The TCP input path does not currently use TCP_SKB_CB(skb)->tcp_flags,
      which is only used in the output path.

      tcp_recvmsg() looks at tcp_hdr(skb)->syn for every skb found in the receive
      queue, which is unfortunate because this bit is located in a cache line right
      before the payload.
      
      We can simplify TCP by copying tcp flags into TCP_SKB_CB(skb)->tcp_flags.
      
      This patch does so, and avoids the cache line miss in tcp_recvmsg()
      
      Following patches will
      - allow a segment with FIN being coalesced in tcp_try_coalesce()
      - simplify tcp_collapse() by not copying the headers.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  28. 08 Sep, 2014 1 commit
    • percpu_counter: add @gfp to percpu_counter_init() · 908c7f19
      Tejun Heo committed
      Percpu allocator now supports allocation mask.  Add @gfp to
      percpu_counter_init() so that !GFP_KERNEL allocation masks can be used
      with percpu_counters too.
      
      We could have left percpu_counter_init() alone and added
      percpu_counter_init_gfp(); however, the number of users isn't that
      high and introducing _gfp variants to all percpu data structures would
      be quite ugly, so let's just do the conversion.  This is the one with
      the most users.  Other percpu data structures are a lot easier to
      convert.
      
      This patch doesn't make any functional difference.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Jan Kara <jack@suse.cz>
      Acked-by: "David S. Miller" <davem@davemloft.net>
      Cc: x86@kernel.org
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Andrew Morton <akpm@linux-foundation.org>
  29. 27 Aug, 2014 1 commit