1. 04 7月, 2018 2 次提交
    • X
      sctp: add support for setting flowlabel when adding a transport · 4be4139f
      Xin Long 提交于
      Struct sockaddr_in6 has the member sin6_flowinfo that includes the
      ipv6 flowlabel, it should also support for setting flowlabel when
      adding a transport whose ipaddr is from userspace.
      
      Note that addrinfo in sctp_sendmsg is using struct in6_addr for
      the secondary addrs, which doesn't contain sin6_flowinfo, and
      it needs to copy sin6_flowinfo from the primary addr.
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4be4139f
    • X
      sctp: add support for dscp and flowlabel per transport · 8a9c58d2
      Xin Long 提交于
      Like some other per transport params, flowlabel and dscp are added
      in transport, asoc and sctp_sock. By default, transport sets its
      value from asoc's, and asoc does it from sctp_sock. flowlabel
      only works for ipv6 transport.
      
      Other than that they need to be passed down in sctp_xmit, flow4/6
      also needs to set them before looking up route in get_dst.
      
      Note that it uses '& 0x100000' to check if flowlabel is set and
      '& 0x1' (tos 1st bit is unused) to check if dscp is set by users,
      so that they could be set to 0 by sockopt in next patch.
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8a9c58d2
  2. 08 5月, 2018 1 次提交
    • X
      sctp: delay the authentication for the duplicated cookie-echo chunk · 59d8d443
      Xin Long 提交于
      Now sctp only delays the authentication for the normal cookie-echo
      chunk by setting chunk->auth_chunk in sctp_endpoint_bh_rcv(). But
      for the duplicated one with auth, in sctp_assoc_bh_rcv(), it does
      authentication first based on the old asoc, which will definitely
      fail due to the different auth info in the old asoc.
      
      The duplicated cookie-echo chunk will create a new asoc with the
      auth info from this chunk, and the authentication should also be
      done with the new asoc's auth info for all of the collision 'A',
      'B' and 'D'. Otherwise, the duplicated cookie-echo chunk with auth
      will never pass the authentication and create the new connection.
      
      This issue exists since very beginning, and this fix is to make
      sctp_assoc_bh_rcv() follow the way sctp_endpoint_bh_rcv() does
      for the normal cookie-echo chunk to delay the authentication.
      
      While at it, remove the unused params from sctp_sf_authenticate()
      and define sctp_auth_chunk_verify() used for all the places that
      do the delayed authentication.
      
      v1->v2:
        fix the typo in changelog as Marcelo noticed.
      Acked-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Acked-by: NNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      59d8d443
  3. 28 4月, 2018 5 次提交
  4. 26 4月, 2018 1 次提交
  5. 12 12月, 2017 1 次提交
  6. 25 10月, 2017 1 次提交
  7. 07 8月, 2017 3 次提交
  8. 05 7月, 2017 1 次提交
  9. 02 7月, 2017 1 次提交
  10. 21 6月, 2017 1 次提交
  11. 11 6月, 2017 1 次提交
  12. 03 6月, 2017 2 次提交
  13. 25 5月, 2017 1 次提交
  14. 05 4月, 2017 1 次提交
  15. 31 3月, 2017 1 次提交
    • X
      sctp: alloc stream info when initializing asoc · 3dbcc105
      Xin Long 提交于
      When sending a msg without asoc established, sctp will send INIT packet
      first and then enqueue chunks.
      
      Before receiving INIT_ACK, stream info is not yet alloced. But enqueuing
      chunks needs to access stream info, like out stream state and out stream
      cnt.
      
      This patch is to fix it by allocing out stream info when initializing an
      asoc, allocing in stream and re-allocing out stream when processing init.
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3dbcc105
  16. 23 3月, 2017 1 次提交
  17. 08 2月, 2017 1 次提交
  18. 19 1月, 2017 4 次提交
    • X
      sctp: add sockopt SCTP_ENABLE_STREAM_RESET · 9fb657ae
      Xin Long 提交于
      This patch is to add sockopt SCTP_ENABLE_STREAM_RESET to get/set
      strreset_enable to indicate which reconf request type it supports,
      which is described in rfc6525 section 6.3.1.
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9fb657ae
    • X
      sctp: add reconf_enable in asoc ep and netns · c28445c3
      Xin Long 提交于
      This patch is to add reconf_enable field in all of asoc ep and netns
      to indicate if they support stream reset.
      
      When initializing, asoc reconf_enable get the default value from ep
      reconf_enable which is from netns netns reconf_enable by default.
      
      It is also to add reconf_capable in asoc peer part to know if peer
      supports reconf_enable, the value is set if ext params have reconf
      chunk support when processing init chunk, just as rfc6525 section
      5.1.1 demands.
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c28445c3
    • X
      sctp: add stream reconf timer · 7b9438de
      Xin Long 提交于
      This patch is to add a per transport timer based on sctp timer frame
      for stream reconf chunk retransmission. It would start after sending
      a reconf request chunk, and stop after receiving the response chunk.
      
      If the timer expires, besides retransmitting the reconf request chunk,
      it would also do the same thing with data RTO timer. like to increase
      the appropriate error counts, and perform threshold management, possibly
      destroying the asoc if sctp retransmission thresholds are exceeded, just
      as section 5.1.1 describes.
      
      This patch is also to add asoc strreset_chunk, it is used to save the
      reconf request chunk, so that it can be retransmitted, and to check if
      the response is really for this request by comparing the information
      inside with the response chunk as well.
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7b9438de
    • X
      sctp: add support for generating stream reconf ssn reset request chunk · cc16f00f
      Xin Long 提交于
      This patch is to add asoc strreset_outseq and strreset_inseq for
      saving the reconf request sequence, initialize them when create
      assoc and process init, and also to define Incoming and Outgoing
      SSN Reset Request Parameter described in rfc6525 section 4.1 and
      4.2, As they can be in one same chunk as section rfc6525 3.1-3
      describes, it makes them in one function.
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      cc16f00f
  19. 07 1月, 2017 1 次提交
    • X
      sctp: prepare asoc stream for stream reconf · a8386317
      Xin Long 提交于
      sctp stream reconf, described in RFC 6525, needs a structure to
      save per stream information in assoc, like stream state.
      
      In the future, sctp stream scheduler also needs it to save some
      stream scheduler params and queues.
      
      This patchset is to prepare the stream array in assoc for stream
      reconf. It defines sctp_stream that includes stream arrays inside
      to replace ssnmap.
      
      Note that we use different structures for IN and OUT streams, as
      the members in per OUT stream will get more and more different
      from per IN stream.
      
      v1->v2:
        - put these patches into a smaller group.
      v2->v3:
        - define sctp_stream to contain stream arrays, and create stream.c
          to put stream-related functions.
        - merge 3 patches into 1, as new sctp_stream has the same name
          with before.
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Reviewed-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a8386317
  20. 24 12月, 2016 2 次提交
    • M
      sctp: fix recovering from 0 win with small data chunks · 1636098c
      Marcelo Ricardo Leitner 提交于
      Currently if SCTP closes the receive window with window pressure, mostly
      caused by excessive skb overhead on payload/overheads ratio, SCTP will
      close the window abruptly while saving the delta on rwnd_press. It will
      start recovering rwnd as the chunks are consumed by the application and
      the rwnd_press will be only recovered after rwnd reach the same value as
      of rwnd_press, mostly to prevent silly window syndrome.
      
      Thing is, this is very inefficient with small data chunks, as with those
      it will never reach back that value, and thus it will never recover from
      such pressure. This means that we will not issue window updates when
      recovering from 0 window and will rely on a sender retransmit to notice
      it.
      
      The fix here is to remove such threshold, as no value is good enough: it
      depends on the (avg) chunk sizes being used.
      
      Test with netperf -t SCTP_STREAM -- -m 1, and trigger 0 window by
      sending SIGSTOP to netserver, sleep 1.2, and SIGCONT.
      Rate limited to 845kbps, for visibility. Capture done at netserver side.
      
      Previously:
      01.500751 IP B.48277 > A.36925: sctp (1) [SACK] [cum ack 632372996] [a_rwnd 99153] [
      01.500752 IP A.36925 > B.48277: sctp (1) [DATA] (B)(E) [TSN: 632372997] [SID: 0] [SS
      01.517471 IP A.36925 > B.48277: sctp (1) [DATA] (B)(E) [TSN: 632373010] [SID: 0] [SS
      01.517483 IP B.48277 > A.36925: sctp (1) [SACK] [cum ack 632373009] [a_rwnd 0] [#gap
      01.517485 IP A.36925 > B.48277: sctp (1) [DATA] (B)(E) [TSN: 632373083] [SID: 0] [SS
      01.517488 IP B.48277 > A.36925: sctp (1) [SACK] [cum ack 632373009] [a_rwnd 0] [#gap
      01.534168 IP A.36925 > B.48277: sctp (1) [DATA] (B)(E) [TSN: 632373096] [SID: 0] [SS
      01.534180 IP B.48277 > A.36925: sctp (1) [SACK] [cum ack 632373009] [a_rwnd 0] [#gap
      01.534181 IP A.36925 > B.48277: sctp (1) [DATA] (B)(E) [TSN: 632373169] [SID: 0] [SS
      01.534185 IP B.48277 > A.36925: sctp (1) [SACK] [cum ack 632373009] [a_rwnd 0] [#gap
      02.525978 IP A.36925 > B.48277: sctp (1) [DATA] (B)(E) [TSN: 632373010] [SID: 0] [SS
      02.526021 IP B.48277 > A.36925: sctp (1) [SACK] [cum ack 632373009] [a_rwnd 0] [#gap
        (window update missed)
      04.573807 IP A.36925 > B.48277: sctp (1) [DATA] (B)(E) [TSN: 632373010] [SID: 0] [SS
      04.779370 IP B.48277 > A.36925: sctp (1) [SACK] [cum ack 632373082] [a_rwnd 859] [#g
      04.789162 IP A.36925 > B.48277: sctp (1) [DATA] (B)(E) [TSN: 632373083] [SID: 0] [SS
      04.789323 IP A.36925 > B.48277: sctp (1) [DATA] (B)(E) [TSN: 632373156] [SID: 0] [SS
      04.789372 IP B.48277 > A.36925: sctp (1) [SACK] [cum ack 632373228] [a_rwnd 786] [#g
      
      After:
      02.568957 IP B.50536 > A.55173: sctp (1) [SACK] [cum ack 2490098728] [a_rwnd 99153]
      02.568961 IP A.55173 > B.50536: sctp (1) [DATA] (B)(E) [TSN: 2490098729] [SID: 0] [S
      02.585631 IP A.55173 > B.50536: sctp (1) [DATA] (B)(E) [TSN: 2490098742] [SID: 0] [S
      02.585666 IP B.50536 > A.55173: sctp (1) [SACK] [cum ack 2490098741] [a_rwnd 0] [#ga
      02.585671 IP A.55173 > B.50536: sctp (1) [DATA] (B)(E) [TSN: 2490098815] [SID: 0] [S
      02.585683 IP B.50536 > A.55173: sctp (1) [SACK] [cum ack 2490098741] [a_rwnd 0] [#ga
      02.602330 IP A.55173 > B.50536: sctp (1) [DATA] (B)(E) [TSN: 2490098828] [SID: 0] [S
      02.602359 IP B.50536 > A.55173: sctp (1) [SACK] [cum ack 2490098741] [a_rwnd 0] [#ga
      02.602363 IP A.55173 > B.50536: sctp (1) [DATA] (B)(E) [TSN: 2490098901] [SID: 0] [S
      02.602372 IP B.50536 > A.55173: sctp (1) [SACK] [cum ack 2490098741] [a_rwnd 0] [#ga
      03.600788 IP A.55173 > B.50536: sctp (1) [DATA] (B)(E) [TSN: 2490098742] [SID: 0] [S
      03.600830 IP B.50536 > A.55173: sctp (1) [SACK] [cum ack 2490098741] [a_rwnd 0] [#ga
      03.619455 IP B.50536 > A.55173: sctp (1) [SACK] [cum ack 2490098741] [a_rwnd 13508]
      03.619479 IP B.50536 > A.55173: sctp (1) [SACK] [cum ack 2490098741] [a_rwnd 27017]
      03.619497 IP B.50536 > A.55173: sctp (1) [SACK] [cum ack 2490098741] [a_rwnd 40526]
      03.619516 IP B.50536 > A.55173: sctp (1) [SACK] [cum ack 2490098741] [a_rwnd 54035]
      03.619533 IP B.50536 > A.55173: sctp (1) [SACK] [cum ack 2490098741] [a_rwnd 67544]
      03.619552 IP B.50536 > A.55173: sctp (1) [SACK] [cum ack 2490098741] [a_rwnd 81053]
      03.619570 IP B.50536 > A.55173: sctp (1) [SACK] [cum ack 2490098741] [a_rwnd 94562]
        (following data transmission triggered by window updates above)
      03.633504 IP A.55173 > B.50536: sctp (1) [DATA] (B)(E) [TSN: 2490098742] [SID: 0] [S
      03.836445 IP B.50536 > A.55173: sctp (1) [SACK] [cum ack 2490098814] [a_rwnd 100000]
      03.843125 IP A.55173 > B.50536: sctp (1) [DATA] (B)(E) [TSN: 2490098815] [SID: 0] [S
      03.843285 IP A.55173 > B.50536: sctp (1) [DATA] (B)(E) [TSN: 2490098888] [SID: 0] [S
      03.843345 IP B.50536 > A.55173: sctp (1) [SACK] [cum ack 2490098960] [a_rwnd 99894]
      03.856546 IP A.55173 > B.50536: sctp (1) [DATA] (B)(E) [TSN: 2490098961] [SID: 0] [S
      03.866450 IP A.55173 > B.50536: sctp (1) [DATA] (B)(E) [TSN: 2490099011] [SID: 0] [S
      Signed-off-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1636098c
    • M
      sctp: do not loose window information if in rwnd_over · 58b94d88
      Marcelo Ricardo Leitner 提交于
      It's possible that we receive a packet that is larger than current
      window. If it's the first packet in this way, it will cause it to
      increase rwnd_over. Then, if we receive another data chunk (specially as
      SCTP allows you to have one data chunk in flight even during 0 window),
      rwnd_over will be overwritten instead of added to.
      
      In the long run, this could cause the window to grow bigger than its
      initial size, as rwnd_over would be charged only for the last received
      data chunk while the code will try open the window for all packets that
      were received and had its value in rwnd_over overwritten. This, then,
      can lead to the worsening of payload/buffer ratio and cause rwnd_press
      to kick in more often.
      
      The fix is to sum it too, same as is done for rwnd_press, so that if we
      receive 3 chunks after closing the window, we still have to release that
      same amount before re-opening it.
      
      Log snippet from sctp_test exhibiting the issue:
      [  146.209232] sctp: sctp_assoc_rwnd_decrease: asoc:ffff88013928e000
      rwnd decreased by 1 to (0, 1, 114221)
      [  146.209232] sctp: sctp_assoc_rwnd_decrease:
      association:ffff88013928e000 has asoc->rwnd:0, asoc->rwnd_over:1!
      [  146.209232] sctp: sctp_assoc_rwnd_decrease: asoc:ffff88013928e000
      rwnd decreased by 1 to (0, 1, 114221)
      [  146.209232] sctp: sctp_assoc_rwnd_decrease:
      association:ffff88013928e000 has asoc->rwnd:0, asoc->rwnd_over:1!
      [  146.209232] sctp: sctp_assoc_rwnd_decrease: asoc:ffff88013928e000
      rwnd decreased by 1 to (0, 1, 114221)
      [  146.209232] sctp: sctp_assoc_rwnd_decrease:
      association:ffff88013928e000 has asoc->rwnd:0, asoc->rwnd_over:1!
      [  146.209232] sctp: sctp_assoc_rwnd_decrease: asoc:ffff88013928e000
      rwnd decreased by 1 to (0, 1, 114221)
      Signed-off-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      58b94d88
  21. 17 11月, 2016 1 次提交
    • X
      sctp: use new rhlist interface on sctp transport rhashtable · 7fda702f
      Xin Long 提交于
      Now sctp transport rhashtable uses hash(lport, dport, daddr) as the key
      to hash a node to one chain. If in one host thousands of assocs connect
      to one server with the same lport and different laddrs (although it's
      not a normal case), all the transports would be hashed into the same
      chain.
      
      It may cause to keep returning -EBUSY when inserting a new node, as the
      chain is too long and sctp inserts a transport node in a loop, which
      could even lead to system hangs there.
      
      The new rhlist interface works for this case that there are many nodes
      with the same key in one chain. It puts them into a list then makes this
      list be as a node of the chain.
      
      This patch is to replace rhashtable_ interface with rhltable_ interface.
      Since a chain would not be too long and it would not return -EBUSY with
      this fix when inserting a node, the reinsert loop is also removed here.
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Acked-by: NNeil Horman <nhorman@tuxdriver.com>
      Acked-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7fda702f
  22. 22 9月, 2016 1 次提交
  23. 12 7月, 2016 1 次提交
    • X
      sctp: add SCTP_PR_SUPPORTED on sctp sockopt · 28aa4c26
      Xin Long 提交于
      According to section 4.5 of rfc7496, prsctp_enable should be per asoc.
      We will add prsctp_enable to both asoc and ep, and replace the places
      where it used net.sctp->prsctp_enable with asoc->prsctp_enable.
      
      ep->prsctp_enable will be initialized with net.sctp->prsctp_enable, and
      asoc->prsctp_enable will be initialized with ep->prsctp_enable. We can
      also modify it's value through sockopt SCTP_PR_SUPPORTED.
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      28aa4c26
  24. 21 3月, 2016 1 次提交
    • M
      sctp: align MTU to a word · 3822a5ff
      Marcelo Ricardo Leitner 提交于
      SCTP is a protocol that is aligned to a word (4 bytes). Thus using bare
      MTU can sometimes return values that are not aligned, like for loopback,
      which is 65536 but ipv4_mtu() limits that to 65535. This mis-alignment
      will cause the last non-aligned bytes to never be used and can cause
      issues with congestion control.
      
      So it's better to just consider a lower MTU and keep congestion control
      calcs saner as they are based on PMTU.
      
      Same applies to icmp frag needed messages, which is also fixed by this
      patch.
      
      One other effect of this is the inability to send MTU-sized packet
      without queueing or fragmentation and without hitting Nagle. As the
      check performed at sctp_packet_can_append_data():
      
      if (chunk->skb->len + q->out_qlen >= transport->pathmtu - packet->overhead)
      	/* Enough data queued to fill a packet */
      	return SCTP_XMIT_OK;
      
      with the above example of MTU, if there are no other messages queued,
      one cannot send a packet that just fits one packet (65532 bytes) and
      without causing DATA chunk fragmentation or a delay.
      
      v2:
       - Added WORD_TRUNC macro
      Signed-off-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3822a5ff
  25. 14 3月, 2016 2 次提交
    • M
      sctp: allow sctp_transmit_packet and others to use gfp · cea8768f
      Marcelo Ricardo Leitner 提交于
      Currently sctp_sendmsg() triggers some calls that will allocate memory
      with GFP_ATOMIC even when not necessary. In the case of
      sctp_packet_transmit it will allocate a linear skb that will be used to
      construct the packet and this may cause sends to fail due to ENOMEM more
      often than anticipated specially with big MTUs.
      
      This patch thus allows it to inherit gfp flags from upper calls so that
      it can use GFP_KERNEL if it was triggered by a sctp_sendmsg call or
      similar. All others, like retransmits or flushes started from BH, are
      still allocated using GFP_ATOMIC.
      
      In netperf tests this didn't result in any performance drawbacks when
      memory is not too fragmented and made it trigger ENOMEM way less often.
      Signed-off-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      cea8768f
    • X
      sctp: fix the transports round robin issue when init is retransmitted · 39d2adeb
      Xin Long 提交于
      prior to this patch, at the beginning if we have two paths in one assoc,
      they may have the same params other than the last_time_heard, it will try
      the paths like this:
      
      1st cycle
        try trans1 fail.
        then trans2 is selected.(cause it's last_time_heard is after trans1).
      
      2nd cycle:
        try  trans2 fail
        then trans2 is selected.(cause it's last_time_heard is after trans1).
      
      3rd cycle:
        try  trans2 fail
        then trans2 is selected.(cause it's last_time_heard is after trans1).
      
      ....
      
      trans1 will never have change to be selected, which is not what we expect.
      we should keeping round robin all the paths if they are just added at the
      beginning.
      
      So at first every tranport's last_time_heard should be initialized 0, so
      that we ensure they have the same value at the beginning, only by this,
      all the transports could get equal chance to be selected.
      
      Then for sctp_trans_elect_best, it should return the trans_next one when
      *trans == *trans_next, so that we can try next if it fails,  but now it
      always return trans. so we can fix it by exchanging these two params when
      we calls sctp_trans_elect_tie().
      
      Fixes: 4c47af4d ('net: sctp: rework multihoming retransmission path selection to rfc4960')
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Acked-by: NDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      39d2adeb
  26. 06 1月, 2016 1 次提交
  27. 07 11月, 2015 1 次提交
    • M
      mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep... · d0164adc
      Mel Gorman 提交于
      mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep and avoiding waking kswapd
      
      __GFP_WAIT has been used to identify atomic context in callers that hold
      spinlocks or are in interrupts.  They are expected to be high priority and
      have access one of two watermarks lower than "min" which can be referred
      to as the "atomic reserve".  __GFP_HIGH users get access to the first
      lower watermark and can be called the "high priority reserve".
      
      Over time, callers had a requirement to not block when fallback options
      were available.  Some have abused __GFP_WAIT leading to a situation where
      an optimisitic allocation with a fallback option can access atomic
      reserves.
      
      This patch uses __GFP_ATOMIC to identify callers that are truely atomic,
      cannot sleep and have no alternative.  High priority users continue to use
      __GFP_HIGH.  __GFP_DIRECT_RECLAIM identifies callers that can sleep and
      are willing to enter direct reclaim.  __GFP_KSWAPD_RECLAIM to identify
      callers that want to wake kswapd for background reclaim.  __GFP_WAIT is
      redefined as a caller that is willing to enter direct reclaim and wake
      kswapd for background reclaim.
      
      This patch then converts a number of sites
      
      o __GFP_ATOMIC is used by callers that are high priority and have memory
        pools for those requests. GFP_ATOMIC uses this flag.
      
      o Callers that have a limited mempool to guarantee forward progress clear
        __GFP_DIRECT_RECLAIM but keep __GFP_KSWAPD_RECLAIM. bio allocations fall
        into this category where kswapd will still be woken but atomic reserves
        are not used as there is a one-entry mempool to guarantee progress.
      
      o Callers that are checking if they are non-blocking should use the
        helper gfpflags_allow_blocking() where possible. This is because
        checking for __GFP_WAIT as was done historically now can trigger false
        positives. Some exceptions like dm-crypt.c exist where the code intent
        is clearer if __GFP_DIRECT_RECLAIM is used instead of the helper due to
        flag manipulations.
      
      o Callers that built their own GFP flags instead of starting with GFP_KERNEL
        and friends now also need to specify __GFP_KSWAPD_RECLAIM.
      
      The first key hazard to watch out for is callers that removed __GFP_WAIT
      and was depending on access to atomic reserves for inconspicuous reasons.
      In some cases it may be appropriate for them to use __GFP_HIGH.
      
      The second key hazard is callers that assembled their own combination of
      GFP flags instead of starting with something like GFP_KERNEL.  They may
      now wish to specify __GFP_KSWAPD_RECLAIM.  It's almost certainly harmless
      if it's missed in most cases as other activity will wake kswapd.
      Signed-off-by: NMel Gorman <mgorman@techsingularity.net>
      Acked-by: NVlastimil Babka <vbabka@suse.cz>
      Acked-by: NMichal Hocko <mhocko@suse.com>
      Acked-by: NJohannes Weiner <hannes@cmpxchg.org>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Vitaly Wool <vitalywool@gmail.com>
      Cc: Rik van Riel <riel@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      d0164adc