1. 31 1月, 2017 5 次提交
  2. 30 1月, 2017 6 次提交
  3. 28 1月, 2017 5 次提交
  4. 27 1月, 2017 4 次提交
  5. 26 1月, 2017 15 次提交
    • S
      ac79cbb9
    • S
      uapi: install batman_adv.h header · e60bf3ea
      Sven Eckelmann 提交于
      09748a22 ("batman-adv: add generic netlink family for batman-adv")
      introduced the new batman_adv.h which describes the netlink attributes and
      commands of batman-adv. But the Kbuild entry to install the header was not
      added.
      
      All currently known tools ship their own copy of batman_adv.h but it should
      be installed anyway to later be able to migrate to the system batman_adv.h.
      Signed-off-by: NSven Eckelmann <sven@narfation.org>
      Signed-off-by: NSimon Wunderlich <sw@simonwunderlich.de>
      e60bf3ea
    • R
      net: phy: bcm-phy-lib: clean up remaining AUXCTL register defines · 5e7bfa6c
      Rafał Miłecki 提交于
      1) Use 0x%02x format for register number. This follows some other
         defines and makes it easier to distinct register from values.
      2) Put register define above values and sort the values. It makes
         reading header code easier.
      3) Use 0x%04x format for all values. It's about consistency with other
         values (and most of the header) not a personal preference.
      4) Separate define for reading shift value with an extre empty line.
         It's user for all AUXCTL registers in a bcm54xx_auxctl_read.
      Signed-off-by: NRafał Miłecki <rafal@milecki.pl>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5e7bfa6c
    • R
      net: phy: broadcom: drop duplicated define for RGMII SKEW delay · 8293c7bc
      Rafał Miłecki 提交于
      We had two defines for the same bit (both were used with the
      MII_BCM54XX_AUXCTL_SHDWSEL_MISC register).
      Signed-off-by: NRafał Miłecki <rafal@milecki.pl>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8293c7bc
    • R
      net: phy: broadcom: use auxctl reading helper in BCM54612E code · 85b4685d
      Rafał Miłecki 提交于
      Starting with commit 5b4e2900 ("net: phy: broadcom: add
      bcm54xx_auxctl_read") we have a reading helper so use it and avoid code
      duplication.
      It also means we don't need MII_BCM54XX_AUXCTL_SHDWSEL_MISC define as
      it's the same as MII_BCM54XX_AUXCTL_SHDWSEL_MISC just for reading needs
      (same value shifted by 12 bits).
      Signed-off-by: NRafał Miłecki <rafal@milecki.pl>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      85b4685d
    • D
      Revert "drm/probe-helpers: Drop locking from poll_enable" · 54a07c7b
      Dave Airlie 提交于
      This reverts commit 3846fd9b.
      
      There were some precursor commits missing for this around connector
      locking, we should probably merge Lyude's nouveau avoid the problem patch.
      54a07c7b
    • A
      net: dsa: Mop up remaining NET_DSA_HWMON references · 43450293
      Andrew Lunn 提交于
      Previous patches have moved the temperature sensor code into the
      Marvell PHYs. A few now dead references to NET_DSA_HWMON were left
      behind. Go reap them.
      Reported-by: NValentin Rothberg <valentinrothberg@gmail.com>
      Signed-off-by: NAndrew Lunn <andrew@lunn.ch>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      43450293
    • G
      net: phy: leds: Fix truncated LED trigger names · 3c880eb0
      Geert Uytterhoeven 提交于
      Commit 4567d686 ("phy: increase size of MII_BUS_ID_SIZE and
      bus_id") increased the size of MII bus IDs, but forgot to update the
      private definition in <linux/phy_led_triggers.h>.
      This may cause:
        1. Truncation of LED trigger names,
        2. Duplicate LED trigger names,
        3. Failures registering LED triggers,
        4. Crashes due to bad error handling in the LED trigger failure path.
      
      To fix this, and prevent the definitions going out of sync again in the
      future, let the PHY LED trigger code use the existing MII_BUS_ID_SIZE
      definition.
      
      Example:
        - Before I had triggers "ee700000.etherne:01:100Mbps" and
          "ee700000.etherne:01:10Mbps",
        - After the increase of MII_BUS_ID_SIZE, both became
          "ee700000.ethernet-ffffffff:01:" => FAIL,
        - Now, the triggers are "ee700000.ethernet-ffffffff:01:100Mbps" and
          "ee700000.ethernet-ffffffff:01:10Mbps", which are unique again.
      
      Fixes: 4567d686 ("phy: increase size of MII_BUS_ID_SIZE and bus_id")
      Fixes: 2e0bc452 ("net: phy: leds: add support for led triggers on phy link state change")
      Signed-off-by: NGeert Uytterhoeven <geert+renesas@glider.be>
      Reviewed-by: NAndrew Lunn <andrew@lunn.ch>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3c880eb0
    • G
      net: phy: leds: Break dependency of phy.h on phy_led_triggers.h · d6f8cfa3
      Geert Uytterhoeven 提交于
      <linux/phy.h> includes <linux/phy_led_triggers.h>, which is not really
      needed.  Drop the include from <linux/phy.h>, and add it to all users
      that didn't include it explicitly.
      Suggested-by: NAndrew Lunn <andrew@lunn.ch>
      Signed-off-by: NGeert Uytterhoeven <geert+renesas@glider.be>
      Reviewed-by: NAndrew Lunn <andrew@lunn.ch>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d6f8cfa3
    • W
      net/tcp-fastopen: make connect()'s return case more consistent with non-TFO · 3979ad7e
      Willy Tarreau 提交于
      Without TFO, any subsequent connect() call after a successful one returns
      -1 EISCONN. The last API update ensured that __inet_stream_connect() can
      return -1 EINPROGRESS in response to sendmsg() when TFO is in use to
      indicate that the connection is now in progress. Unfortunately since this
      function is used both for connect() and sendmsg(), it has the undesired
      side effect of making connect() now return -1 EINPROGRESS as well after
      a successful call, while at the same time poll() returns POLLOUT. This
      can confuse some applications which happen to call connect() and to
      check for -1 EISCONN to ensure the connection is usable, and for which
      EINPROGRESS indicates a need to poll, causing a loop.
      
      This problem was encountered in haproxy where a call to connect() is
      precisely used in certain cases to confirm a connection's readiness.
      While arguably haproxy's behaviour should be improved here, it seems
      important to aim at a more robust behaviour when the goal of the new
      API is to make it easier to implement TFO in existing applications.
      
      This patch simply ensures that we preserve the same semantics as in
      the non-TFO case on the connect() syscall when using TFO, while still
      returning -1 EINPROGRESS on sendmsg(). For this we simply tell
      __inet_stream_connect() whether we're doing a regular connect() or in
      fact connecting for a sendmsg() call.
      
      Cc: Wei Wang <weiwan@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Signed-off-by: NWilly Tarreau <w@1wt.eu>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3979ad7e
    • W
      net/tcp-fastopen: Add new API support · 19f6d3f3
      Wei Wang 提交于
      This patch adds a new socket option, TCP_FASTOPEN_CONNECT, as an
      alternative way to perform Fast Open on the active side (client). Prior
      to this patch, a client needs to replace the connect() call with
      sendto(MSG_FASTOPEN). This can be cumbersome for applications who want
      to use Fast Open: these socket operations are often done in lower layer
      libraries used by many other applications. Changing these libraries
      and/or the socket call sequences are not trivial. A more convenient
      approach is to perform Fast Open by simply enabling a socket option when
      the socket is created w/o changing other socket calls sequence:
        s = socket()
          create a new socket
        setsockopt(s, IPPROTO_TCP, TCP_FASTOPEN_CONNECT …);
          newly introduced sockopt
          If set, new functionality described below will be used.
          Return ENOTSUPP if TFO is not supported or not enabled in the
          kernel.
      
        connect()
          With cookie present, return 0 immediately.
          With no cookie, initiate 3WHS with TFO cookie-request option and
          return -1 with errno = EINPROGRESS.
      
        write()/sendmsg()
          With cookie present, send out SYN with data and return the number of
          bytes buffered.
          With no cookie, and 3WHS not yet completed, return -1 with errno =
          EINPROGRESS.
          No MSG_FASTOPEN flag is needed.
      
        read()
          Return -1 with errno = EWOULDBLOCK/EAGAIN if connect() is called but
          write() is not called yet.
          Return -1 with errno = EWOULDBLOCK/EAGAIN if connection is
          established but no msg is received yet.
          Return number of bytes read if socket is established and there is
          msg received.
      
      The new API simplifies life for applications that always perform a write()
      immediately after a successful connect(). Such applications can now take
      advantage of Fast Open by merely making one new setsockopt() call at the time
      of creating the socket. Nothing else about the application's socket call
      sequence needs to change.
      Signed-off-by: NWei Wang <weiwan@google.com>
      Acked-by: NEric Dumazet <edumazet@google.com>
      Acked-by: NYuchung Cheng <ycheng@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      19f6d3f3
    • W
      net/tcp-fastopen: refactor cookie check logic · 065263f4
      Wei Wang 提交于
      Refactor the cookie check logic in tcp_send_syn_data() into a function.
      This function will be called else where in later changes.
      Signed-off-by: NWei Wang <weiwan@google.com>
      Acked-by: NEric Dumazet <edumazet@google.com>
      Acked-by: NYuchung Cheng <ycheng@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      065263f4
    • D
      bpf: add initial bpf tracepoints · a67edbf4
      Daniel Borkmann 提交于
      This work adds a number of tracepoints to paths that are either
      considered slow-path or exception-like states, where monitoring or
      inspecting them would be desirable.
      
      For bpf(2) syscall, tracepoints have been placed for main commands
      when they succeed. In XDP case, tracepoint is for exceptions, that
      is, f.e. on abnormal BPF program exit such as unknown or XDP_ABORTED
      return code, or when error occurs during XDP_TX action and the packet
      could not be forwarded.
      
      Both have been split into separate event headers, and can be further
      extended. Worst case, if they unexpectedly should get into our way in
      future, they can also removed [1]. Of course, these tracepoints (like
      any other) can be analyzed by eBPF itself, etc. Example output:
      
        # ./perf record -a -e bpf:* sleep 10
        # ./perf script
        sock_example  6197 [005]   283.980322:      bpf:bpf_map_create: map type=ARRAY ufd=4 key=4 val=8 max=256 flags=0
        sock_example  6197 [005]   283.980721:       bpf:bpf_prog_load: prog=a5ea8fa30ea6849c type=SOCKET_FILTER ufd=5
        sock_example  6197 [005]   283.988423:   bpf:bpf_prog_get_type: prog=a5ea8fa30ea6849c type=SOCKET_FILTER
        sock_example  6197 [005]   283.988443: bpf:bpf_map_lookup_elem: map type=ARRAY ufd=4 key=[06 00 00 00] val=[00 00 00 00 00 00 00 00]
        [...]
        sock_example  6197 [005]   288.990868: bpf:bpf_map_lookup_elem: map type=ARRAY ufd=4 key=[01 00 00 00] val=[14 00 00 00 00 00 00 00]
             swapper     0 [005]   289.338243:    bpf:bpf_prog_put_rcu: prog=a5ea8fa30ea6849c type=SOCKET_FILTER
      
        [1] https://lwn.net/Articles/705270/Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: NAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a67edbf4
    • D
      trace: add variant without spacing in trace_print_hex_seq · 2acae0d5
      Daniel Borkmann 提交于
      For upcoming tracepoint support for BPF, we want to dump the program's
      tag. Format should be similar to __print_hex(), but without spacing.
      Add a __print_hex_str() variant for exactly that purpose that reuses
      trace_print_hex_seq().
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2acae0d5
    • J
      net sched actions: Add support for user cookies · 1045ba77
      Jamal Hadi Salim 提交于
      Introduce optional 128-bit action cookie.
      Like all other cookie schemes in the networking world (eg in protocols
      like http or existing kernel fib protocol field, etc) the idea is to save
      user state that when retrieved serves as a correlator. The kernel
      _should not_ intepret it.  The user can store whatever they wish in the
      128 bits.
      
      Sample exercise(showing variable length use of cookie)
      
      .. create an accept action with cookie a1b2c3d4
      sudo $TC actions add action ok index 1 cookie a1b2c3d4
      
      .. dump all gact actions..
      sudo $TC -s actions ls action gact
      
          action order 0: gact action pass
           random type none pass val 0
           index 1 ref 1 bind 0 installed 5 sec used 5 sec
          Action statistics:
          Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
          backlog 0b 0p requeues 0
          cookie a1b2c3d4
      
      .. bind the accept action to a filter..
      sudo $TC filter add dev lo parent ffff: protocol ip prio 1 \
      u32 match ip dst 127.0.0.1/32 flowid 1:1 action gact index 1
      
      ... send some traffic..
      $ ping 127.0.0.1 -c 3
      PING 127.0.0.1 (127.0.0.1) 56(84) bytes of data.
      64 bytes from 127.0.0.1: icmp_seq=1 ttl=64 time=0.020 ms
      64 bytes from 127.0.0.1: icmp_seq=2 ttl=64 time=0.027 ms
      64 bytes from 127.0.0.1: icmp_seq=3 ttl=64 time=0.038 ms
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1045ba77
  6. 25 1月, 2017 5 次提交
    • V
      mm, page_alloc: fix check for NULL preferred_zone · ea57485a
      Vlastimil Babka 提交于
      Patch series "fix premature OOM regression in 4.7+ due to cpuset races".
      
      This is v2 of my attempt to fix the recent report based on LTP cpuset
      stress test [1].  The intention is to go to stable 4.9 LTSS with this,
      as triggering repeated OOMs is not nice.  That's why the patches try to
      be not too intrusive.
      
      Unfortunately why investigating I found that modifying the testcase to
      use per-VMA policies instead of per-task policies will bring the OOM's
      back, but that seems to be much older and harder to fix problem.  I have
      posted a RFC [2] but I believe that fixing the recent regressions has a
      higher priority.
      
      Longer-term we might try to think how to fix the cpuset mess in a better
      and less error prone way.  I was for example very surprised to learn,
      that cpuset updates change not only task->mems_allowed, but also
      nodemask of mempolicies.  Until now I expected the parameter to
      alloc_pages_nodemask() to be stable.  I wonder why do we then treat
      cpusets specially in get_page_from_freelist() and distinguish HARDWALL
      etc, when there's unconditional intersection between mempolicy and
      cpuset.  I would expect the nodemask adjustment for saving overhead in
      g_p_f(), but that clearly doesn't happen in the current form.  So we
      have both crazy complexity and overhead, AFAICS.
      
      [1] https://lkml.kernel.org/r/CAFpQJXUq-JuEP=QPidy4p_=FN0rkH5Z-kfB4qBvsf6jMS87Edg@mail.gmail.com
      [2] https://lkml.kernel.org/r/7c459f26-13a6-a817-e508-b65b903a8378@suse.cz
      
      This patch (of 4):
      
      Since commit c33d6c06 ("mm, page_alloc: avoid looking up the first
      zone in a zonelist twice") we have a wrong check for NULL preferred_zone,
      which can theoretically happen due to concurrent cpuset modification.  We
      check the zoneref pointer which is never NULL and we should check the zone
      pointer.  Also document this in first_zones_zonelist() comment per Michal
      Hocko.
      
      Fixes: c33d6c06 ("mm, page_alloc: avoid looking up the first zone in a zonelist twice")
      Link: http://lkml.kernel.org/r/20170120103843.24587-2-vbabka@suse.czSigned-off-by: NVlastimil Babka <vbabka@suse.cz>
      Acked-by: NMel Gorman <mgorman@techsingularity.net>
      Acked-by: NHillf Danton <hillf.zj@alibaba-inc.com>
      Cc: Ganapatrao Kulkarni <gpkulkarni@gmail.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ea57485a
    • D
      kernel/watchdog: prevent false hardlockup on overloaded system · b94f5118
      Don Zickus 提交于
      On an overloaded system, it is possible that a change in the watchdog
      threshold can be delayed long enough to trigger a false positive.
      
      This can easily be achieved by having a cpu spinning indefinitely on a
      task, while another cpu updates watchdog threshold.
      
      What happens is while trying to park the watchdog threads, the hrtimers
      on the other cpus trigger and reprogram themselves with the new slower
      watchdog threshold.  Meanwhile, the nmi watchdog is still programmed
      with the old faster threshold.
      
      Because the one cpu is blocked, it prevents the thread parking on the
      other cpus from completing, which is needed to shutdown the nmi watchdog
      and reprogram it correctly.  As a result, a false positive from the nmi
      watchdog is reported.
      
      Fix this by setting a park_in_progress flag to block all lockups until
      the parking is complete.
      
      Fix provided by Ulrich Obergfell.
      
      [akpm@linux-foundation.org: s/park_in_progress/watchdog_park_in_progress/]
      Link: http://lkml.kernel.org/r/1481041033-192236-1-git-send-email-dzickus@redhat.comSigned-off-by: NDon Zickus <dzickus@redhat.com>
      Reviewed-by: NAaron Tomlin <atomlin@redhat.com>
      Cc: Ulrich Obergfell <uobergfe@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b94f5118
    • Y
      memory_hotplug: make zone_can_shift() return a boolean value · 8a1f780e
      Yasuaki Ishimatsu 提交于
      online_{kernel|movable} is used to change the memory zone to
      ZONE_{NORMAL|MOVABLE} and online the memory.
      
      To check that memory zone can be changed, zone_can_shift() is used.
      Currently the function returns minus integer value, plus integer
      value and 0. When the function returns minus or plus integer value,
      it means that the memory zone can be changed to ZONE_{NORNAL|MOVABLE}.
      
      But when the function returns 0, there are two meanings.
      
      One of the meanings is that the memory zone does not need to be changed.
      For example, when memory is in ZONE_NORMAL and onlined by online_kernel
      the memory zone does not need to be changed.
      
      Another meaning is that the memory zone cannot be changed. When memory
      is in ZONE_NORMAL and onlined by online_movable, the memory zone may
      not be changed to ZONE_MOVALBE due to memory online limitation(see
      Documentation/memory-hotplug.txt). In this case, memory must not be
      onlined.
      
      The patch changes the return type of zone_can_shift() so that memory
      online operation fails when memory zone cannot be changed as follows:
      
      Before applying patch:
         # grep -A 35 "Node 2" /proc/zoneinfo
         Node 2, zone   Normal
         <snip>
            node_scanned  0
                 spanned  8388608
                 present  7864320
                 managed  7864320
         # echo online_movable > memory4097/state
         # grep -A 35 "Node 2" /proc/zoneinfo
         Node 2, zone   Normal
         <snip>
            node_scanned  0
                 spanned  8388608
                 present  8388608
                 managed  8388608
      
         online_movable operation succeeded. But memory is onlined as
         ZONE_NORMAL, not ZONE_MOVABLE.
      
      After applying patch:
         # grep -A 35 "Node 2" /proc/zoneinfo
         Node 2, zone   Normal
         <snip>
            node_scanned  0
                 spanned  8388608
                 present  7864320
                 managed  7864320
         # echo online_movable > memory4097/state
         bash: echo: write error: Invalid argument
         # grep -A 35 "Node 2" /proc/zoneinfo
         Node 2, zone   Normal
         <snip>
            node_scanned  0
                 spanned  8388608
                 present  7864320
                 managed  7864320
      
         online_movable operation failed because of failure of changing
         the memory zone from ZONE_NORMAL to ZONE_MOVABLE
      
      Fixes: df429ac0 ("memory-hotplug: more general validation of zone during online")
      Link: http://lkml.kernel.org/r/2f9c3837-33d7-b6e5-59c0-6ca4372b2d84@gmail.comSigned-off-by: NYasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Reviewed-by: NReza Arbab <arbab@linux.vnet.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      8a1f780e
    • R
      net: Specify the owning module for lwtunnel ops · 88ff7334
      Robert Shearman 提交于
      Modules implementing lwtunnel ops should not be allowed to unload
      while there is state alive using those ops, so specify the owning
      module for all lwtunnel ops.
      Signed-off-by: NRobert Shearman <rshearma@brocade.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      88ff7334
    • P
      netfilter: nf_tables: deconstify walk callback function · de70185d
      Pablo Neira Ayuso 提交于
      The flush operation needs to modify set and element objects, so let's
      deconstify this.
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      de70185d