1. 07 Feb 2017 (3 commits)
  2. 06 Feb 2017 (3 commits)
  3. 04 Feb 2017 (6 commits)
  4. 03 Feb 2017 (2 commits)
  5. 02 Feb 2017 (3 commits)
  6. 01 Feb 2017 (1 commit)
    • fscache: Fix dead object requeue · e26bfebd
      Committed by David Howells
      Under some circumstances, an fscache object can become queued such that
      fscache_object_work_func() can be called once the object is in the
      OBJECT_DEAD state.  This results in the kernel oopsing when it tries to
      invoke the handler for the state (which is hard coded to 0x2).
      
      The way this comes about is something like the following:
      
       (1) The object dispatcher is processing a work state for an object.  This
           is done in workqueue context.
      
       (2) An out-of-band event comes in that isn't masked, causing the object to
           be queued, say EV_KILL.
      
       (3) The object dispatcher finishes processing the current work state on
           that object and then sees there's another event to process, so,
           without returning to the workqueue core, it processes that event too.
           It then follows the chain of events that this initiates until we reach
           OBJECT_DEAD without going through a wait state (such as
           WAIT_FOR_CLEARANCE).
      
           At this point, object->events may be 0, object->event_mask will be 0
           and oob_event_mask will be 0.
      
       (4) The object dispatcher returns to the workqueue processor, and in due
           course, this sees that the object's work item is still queued and
           invokes it again.
      
       (5) The current state is a work state (OBJECT_DEAD), so the dispatcher
           jumps to it - resulting in an OOPS.
      
      When I'm seeing this, the work state in (1) appears to have been either
      LOOK_UP_OBJECT or CREATE_OBJECT (object->oob_table is
      fscache_osm_lookup_oob).
      
      The window for (2) is very small:
      
       (A) object->event_mask is cleared whilst the event dispatch process is
           underway - though there's no memory barrier to force this to the top
           of the function.
      
           The window, therefore is from the time the object was selected by the
           workqueue processor and made requeueable to the time the mask was
           cleared.
      
       (B) fscache_raise_event() will only queue the object if it manages to set
           the event bit and the corresponding event_mask bit was set.
      
           The enqueuement is then deferred slightly whilst we get a ref on the
           object and get the per-CPU variable for workqueue congestion.  This
           slight deferral slightly increases the probability by allowing extra
           time for the workqueue to make the item requeueable.
      
      Handle this by giving the dead state a processor function and checking
      for the dead state address rather than seeing if the processor function is
      address 0x2.  The dead state processor function can then set a flag to
      indicate that it's occurred and give a warning if it occurs more than once
      per object.
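
      As a rough sketch of that approach (illustrative of the description
      above rather than the exact upstream diff; the flag name is the one
      the fix appears to introduce):

        static const struct fscache_state *fscache_object_dead(
                struct fscache_object *object, int event)
        {
                /* Note the first post-death dispatch quietly. */
                if (!test_and_set_bit(FSCACHE_OBJECT_RUN_AFTER_DEAD,
                                      &object->flags))
                        return NO_TRANSIT;

                /* Warn if the object is run more than once after death. */
                WARN(true, "FS-Cache object redispatched after death\n");
                return NO_TRANSIT;
        }

      The dispatcher can then compare the current state against this
      function's address instead of against the 0x2 sentinel.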
      
      If this race occurs, an oops similar to the following is seen (note the RIP
      value):
      
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000002
      IP: [<0000000000000002>] 0x1
      PGD 0
      Oops: 0010 [#1] SMP
      Modules linked in: ...
      CPU: 17 PID: 16077 Comm: kworker/u48:9 Not tainted 3.10.0-327.18.2.el7.x86_64 #1
      Hardware name: HP ProLiant DL380 Gen9/ProLiant DL380 Gen9, BIOS P89 12/27/2015
      Workqueue: fscache_object fscache_object_work_func [fscache]
      task: ffff880302b63980 ti: ffff880717544000 task.ti: ffff880717544000
      RIP: 0010:[<0000000000000002>]  [<0000000000000002>] 0x1
      RSP: 0018:ffff880717547df8  EFLAGS: 00010202
      RAX: ffffffffa0368640 RBX: ffff880edf7a4480 RCX: dead000000200200
      RDX: 0000000000000002 RSI: 00000000ffffffff RDI: ffff880edf7a4480
      RBP: ffff880717547e18 R08: 0000000000000000 R09: dfc40a25cb3a4510
      R10: dfc40a25cb3a4510 R11: 0000000000000400 R12: 0000000000000000
      R13: ffff880edf7a4510 R14: ffff8817f6153400 R15: 0000000000000600
      FS:  0000000000000000(0000) GS:ffff88181f420000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000000000002 CR3: 000000000194a000 CR4: 00000000001407e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Stack:
       ffffffffa0363695 ffff880edf7a4510 ffff88093f16f900 ffff8817faa4ec00
       ffff880717547e60 ffffffff8109d5db 00000000faa4ec18 0000000000000000
       ffff8817faa4ec18 ffff88093f16f930 ffff880302b63980 ffff88093f16f900
      Call Trace:
       [<ffffffffa0363695>] ? fscache_object_work_func+0xa5/0x200 [fscache]
       [<ffffffff8109d5db>] process_one_work+0x17b/0x470
       [<ffffffff8109e4ac>] worker_thread+0x21c/0x400
       [<ffffffff8109e290>] ? rescuer_thread+0x400/0x400
       [<ffffffff810a5acf>] kthread+0xcf/0xe0
       [<ffffffff810a5a00>] ? kthread_create_on_node+0x140/0x140
       [<ffffffff816460d8>] ret_from_fork+0x58/0x90
       [<ffffffff810a5a00>] ? kthread_create_on_node+0x140/0x140
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Jeremy McNicoll <jeremymc@redhat.com>
      Tested-by: Frank Sorenson <sorenson@redhat.com>
      Tested-by: Benjamin Coddington <bcodding@redhat.com>
      Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  7. 31 Jan 2017 (2 commits)
  8. 30 Jan 2017 (2 commits)
    • net: add devm version of alloc_etherdev_mqs function · 40be0dda
      Committed by Rafał Miłecki
      This patch adds a devm_alloc_etherdev_mqs() function and a
      devm_alloc_etherdev() macro. These can be used for simpler netdev
      allocation without having to care about calling free_netdev().
      
      Thanks to this change, drivers, their error paths, and their removal
      paths may get a bit simpler.
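
      A hedged usage sketch (the driver name and private struct are
      hypothetical; devm_alloc_etherdev() takes the owning struct device
      and the private data size):

        static int foo_probe(struct platform_device *pdev)
        {
                struct net_device *ndev;

                /* Released automatically when &pdev->dev is unbound. */
                ndev = devm_alloc_etherdev(&pdev->dev,
                                           sizeof(struct foo_priv));
                if (!ndev)
                        return -ENOMEM;

                /* ... set up and register_netdev(ndev); no free_netdev()
                 * needed in the error or remove paths ... */
                return 0;
        }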
      Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • can: Fix kernel panic at security_sock_rcv_skb · f1712c73
      Committed by Eric Dumazet
      Zhang Yanmin reported crashes [1] and provided a patch adding a
      synchronize_rcu() call in can_rx_unregister()
      
      The main problem seems that the sockets themselves are not RCU
      protected.
      
      If CAN uses RCU for delivery, then sockets should be freed only after
      one RCU grace period.
      
      Recent kernels could use sock_set_flag(sk, SOCK_RCU_FREE), but let's
      ease stable backports with the following fix instead.
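
      A conceptual sketch of that fix in can_rx_unregister() (surrounding
      code elided; treat the exact shape as an assumption based on the
      description):

        /* Unlink the receiver from the RCU-protected delivery list
         * (done under the receiver-list lock). */
        hlist_del_rcu(&rcv->list);
        /* ... drop the lock, adjust counters ... */

        /* Wait one grace period so any in-flight can_receive() is done
         * with the receiver, and with the socket it points to, before
         * the caller can free them. */
        synchronize_rcu();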
      
      [1]
      BUG: unable to handle kernel NULL pointer dereference at (null)
      IP: [<ffffffff81495e25>] selinux_socket_sock_rcv_skb+0x65/0x2a0
      
      Call Trace:
       <IRQ>
       [<ffffffff81485d8c>] security_sock_rcv_skb+0x4c/0x60
       [<ffffffff81d55771>] sk_filter+0x41/0x210
       [<ffffffff81d12913>] sock_queue_rcv_skb+0x53/0x3a0
       [<ffffffff81f0a2b3>] raw_rcv+0x2a3/0x3c0
       [<ffffffff81f06eab>] can_rcv_filter+0x12b/0x370
       [<ffffffff81f07af9>] can_receive+0xd9/0x120
       [<ffffffff81f07beb>] can_rcv+0xab/0x100
       [<ffffffff81d362ac>] __netif_receive_skb_core+0xd8c/0x11f0
       [<ffffffff81d36734>] __netif_receive_skb+0x24/0xb0
       [<ffffffff81d37f67>] process_backlog+0x127/0x280
       [<ffffffff81d36f7b>] net_rx_action+0x33b/0x4f0
       [<ffffffff810c88d4>] __do_softirq+0x184/0x440
       [<ffffffff81f9e86c>] do_softirq_own_stack+0x1c/0x30
       <EOI>
       [<ffffffff810c76fb>] do_softirq.part.18+0x3b/0x40
       [<ffffffff810c8bed>] do_softirq+0x1d/0x20
       [<ffffffff81d30085>] netif_rx_ni+0xe5/0x110
       [<ffffffff8199cc87>] slcan_receive_buf+0x507/0x520
       [<ffffffff8167ef7c>] flush_to_ldisc+0x21c/0x230
       [<ffffffff810e3baf>] process_one_work+0x24f/0x670
       [<ffffffff810e44ed>] worker_thread+0x9d/0x6f0
       [<ffffffff810e4450>] ? rescuer_thread+0x480/0x480
       [<ffffffff810ebafc>] kthread+0x12c/0x150
       [<ffffffff81f9ccef>] ret_from_fork+0x3f/0x70
      Reported-by: Zhang Yanmin <yanmin.zhang@intel.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  9. 28 Jan 2017 (5 commits)
  10. 27 Jan 2017 (2 commits)
  11. 26 Jan 2017 (8 commits)
    • net: phy: bcm-phy-lib: clean up remaining AUXCTL register defines · 5e7bfa6c
      Committed by Rafał Miłecki
      1) Use the 0x%02x format for register numbers. This follows some
         other defines and makes it easier to distinguish registers from
         values.
      2) Put each register define above its values and sort the values.
         It makes reading the header code easier.
      3) Use the 0x%04x format for all values. It's about consistency
         with other values (and most of the header), not a personal
         preference.
      4) Separate the define for the reading shift value with an extra
         empty line. It's used for all AUXCTL registers in
         bcm54xx_auxctl_read.
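
      As an illustration, the header style after the cleanup might look
      like this (a trimmed, assumed excerpt, not the full header):

        #define MII_BCM54XX_AUXCTL_SHDWSEL_MISC                 0x07
        #define MII_BCM54XX_AUXCTL_SHDWSEL_MISC_WIRESPEED_EN    0x0010
        #define MII_BCM54XX_AUXCTL_SHDWSEL_MISC_RGMII_SKEW_EN   0x0100
        #define MII_BCM54XX_AUXCTL_MISC_WREN                    0x8000

        #define MII_BCM54XX_AUXCTL_SHDWSEL_READ_SHIFT           12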
      Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: broadcom: drop duplicated define for RGMII SKEW delay · 8293c7bc
      Committed by Rafał Miłecki
      We had two defines for the same bit (both were used with the
      MII_BCM54XX_AUXCTL_SHDWSEL_MISC register).
      Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: broadcom: use auxctl reading helper in BCM54612E code · 85b4685d
      Committed by Rafał Miłecki
      Starting with commit 5b4e2900 ("net: phy: broadcom: add
      bcm54xx_auxctl_read") we have a reading helper so use it and avoid code
      duplication.
      It also means we don't need a separate define for the read selector,
      as it's the same as MII_BCM54XX_AUXCTL_SHDWSEL_MISC, just shifted by
      12 bits for reading needs.
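
      Illustrative use of the helper (error handling elided):

        /* One call replaces the open-coded select-then-read sequence. */
        int val = bcm54xx_auxctl_read(phydev,
                                      MII_BCM54XX_AUXCTL_SHDWSEL_MISC);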
      Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: leds: Fix truncated LED trigger names · 3c880eb0
      Committed by Geert Uytterhoeven
      Commit 4567d686 ("phy: increase size of MII_BUS_ID_SIZE and
      bus_id") increased the size of MII bus IDs, but forgot to update the
      private definition in <linux/phy_led_triggers.h>.
      This may cause:
        1. Truncation of LED trigger names,
        2. Duplicate LED trigger names,
        3. Failures registering LED triggers,
        4. Crashes due to bad error handling in the LED trigger failure path.
      
      To fix this, and prevent the definitions going out of sync again in the
      future, let the PHY LED trigger code use the existing MII_BUS_ID_SIZE
      definition.
      
      Example:
        - Before I had triggers "ee700000.etherne:01:100Mbps" and
          "ee700000.etherne:01:10Mbps",
        - After the increase of MII_BUS_ID_SIZE, both became
          "ee700000.ethernet-ffffffff:01:" => FAIL,
        - Now, the triggers are "ee700000.ethernet-ffffffff:01:100Mbps" and
          "ee700000.ethernet-ffffffff:01:10Mbps", which are unique again.
      
      Fixes: 4567d686 ("phy: increase size of MII_BUS_ID_SIZE and bus_id")
      Fixes: 2e0bc452 ("net: phy: leds: add support for led triggers on phy link state change")
      Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: leds: Break dependency of phy.h on phy_led_triggers.h · d6f8cfa3
      Committed by Geert Uytterhoeven
      <linux/phy.h> includes <linux/phy_led_triggers.h>, which is not really
      needed.  Drop the include from <linux/phy.h>, and add it to all users
      that didn't include it explicitly.
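
      The resulting pattern in the affected users is simply (illustrative):

        /* driver .c file that uses the PHY LED trigger API */
        #include <linux/phy.h>
        #include <linux/phy_led_triggers.h> /* no longer via phy.h */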
      Suggested-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/tcp-fastopen: Add new API support · 19f6d3f3
      Committed by Wei Wang
      This patch adds a new socket option, TCP_FASTOPEN_CONNECT, as an
      alternative way to perform Fast Open on the active side (client). Prior
      to this patch, a client needs to replace the connect() call with
      sendto(MSG_FASTOPEN). This can be cumbersome for applications that
      want to use Fast Open: these socket operations are often done in
      lower-layer libraries used by many other applications. Changing these
      libraries and/or the socket call sequences is not trivial. A more
      convenient approach is to perform Fast Open by simply enabling a
      socket option when the socket is created, without changing the rest
      of the socket call sequence:
        s = socket()
          create a new socket
        setsockopt(s, IPPROTO_TCP, TCP_FASTOPEN_CONNECT …);
          newly introduced sockopt
          If set, new functionality described below will be used.
          Return ENOTSUPP if TFO is not supported or not enabled in the
          kernel.
      
        connect()
          With cookie present, return 0 immediately.
          With no cookie, initiate 3WHS with TFO cookie-request option and
          return -1 with errno = EINPROGRESS.
      
        write()/sendmsg()
          With cookie present, send out SYN with data and return the number of
          bytes buffered.
          With no cookie, and 3WHS not yet completed, return -1 with errno =
          EINPROGRESS.
          No MSG_FASTOPEN flag is needed.
      
        read()
          Return -1 with errno = EWOULDBLOCK/EAGAIN if connect() is called but
          write() is not called yet.
          Return -1 with errno = EWOULDBLOCK/EAGAIN if connection is
          established but no msg is received yet.
          Return number of bytes read if socket is established and there is
          msg received.
      
      The new API simplifies life for applications that always perform a write()
      immediately after a successful connect(). Such applications can now take
      advantage of Fast Open by merely making one new setsockopt() call at the time
      of creating the socket. Nothing else about the application's socket call
      sequence needs to change.
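
      A hedged client sketch using the new option (requires kernel and
      libc headers that define TCP_FASTOPEN_CONNECT; tfo_send() is a
      hypothetical helper):

        #include <errno.h>
        #include <netinet/in.h>
        #include <netinet/tcp.h>
        #include <sys/socket.h>
        #include <unistd.h>

        static int tfo_send(const struct sockaddr *addr, socklen_t alen,
                            const void *req, size_t len)
        {
                int one = 1;
                int s = socket(AF_INET, SOCK_STREAM, 0);

                if (s < 0)
                        return -1;
                setsockopt(s, IPPROTO_TCP, TCP_FASTOPEN_CONNECT,
                           &one, sizeof(one));
                if (connect(s, addr, alen) < 0 && errno != EINPROGRESS) {
                        close(s);
                        return -1;
                }
                /* With a cached cookie the data rides in the SYN; without
                 * one this write may fail with EINPROGRESS until the
                 * handshake completes. */
                write(s, req, len);
                return s;
        }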
      Signed-off-by: Wei Wang <weiwan@google.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bpf: add initial bpf tracepoints · a67edbf4
      Committed by Daniel Borkmann
      This work adds a number of tracepoints to paths that are either
      considered slow-path or exception-like states, where monitoring or
      inspecting them would be desirable.
      
      For the bpf(2) syscall, tracepoints have been placed for the main
      commands when they succeed. In the XDP case, the tracepoint is for
      exceptions, e.g. on an abnormal BPF program exit such as an unknown
      or XDP_ABORTED return code, or when an error occurs during the
      XDP_TX action and the packet could not be forwarded.
      
      Both have been split into separate event headers, and can be further
      extended. Worst case, if they should unexpectedly get in the way in
      the future, they can also be removed [1]. Of course, these
      tracepoints (like any other) can be analyzed by eBPF itself, etc.
      Example output:
      
        # ./perf record -a -e bpf:* sleep 10
        # ./perf script
        sock_example  6197 [005]   283.980322:      bpf:bpf_map_create: map type=ARRAY ufd=4 key=4 val=8 max=256 flags=0
        sock_example  6197 [005]   283.980721:       bpf:bpf_prog_load: prog=a5ea8fa30ea6849c type=SOCKET_FILTER ufd=5
        sock_example  6197 [005]   283.988423:   bpf:bpf_prog_get_type: prog=a5ea8fa30ea6849c type=SOCKET_FILTER
        sock_example  6197 [005]   283.988443: bpf:bpf_map_lookup_elem: map type=ARRAY ufd=4 key=[06 00 00 00] val=[00 00 00 00 00 00 00 00]
        [...]
        sock_example  6197 [005]   288.990868: bpf:bpf_map_lookup_elem: map type=ARRAY ufd=4 key=[01 00 00 00] val=[14 00 00 00 00 00 00 00]
             swapper     0 [005]   289.338243:    bpf:bpf_prog_put_rcu: prog=a5ea8fa30ea6849c type=SOCKET_FILTER
      
        [1] https://lwn.net/Articles/705270/

      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • trace: add variant without spacing in trace_print_hex_seq · 2acae0d5
      Committed by Daniel Borkmann
      For upcoming tracepoint support for BPF, we want to dump the program's
      tag. Format should be similar to __print_hex(), but without spacing.
      Add a __print_hex_str() variant for exactly that purpose that reuses
      trace_print_hex_seq().
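
      Illustrative use in a tracepoint's TP_printk() (the field name is an
      assumption; the second argument is the buffer length in bytes):

        TP_printk("prog=%s", __print_hex_str(__entry->prog_tag, 8))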
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  12. 25 Jan 2017 (3 commits)
    • mm, page_alloc: fix check for NULL preferred_zone · ea57485a
      Committed by Vlastimil Babka
      Patch series "fix premature OOM regression in 4.7+ due to cpuset races".
      
      This is v2 of my attempt to fix the recent report based on LTP cpuset
      stress test [1].  The intention is to go to stable 4.9 LTSS with this,
      as triggering repeated OOMs is not nice.  That's why the patches try
      not to be too intrusive.
      
      Unfortunately, while investigating, I found that modifying the
      testcase to use per-VMA policies instead of per-task policies brings
      the OOMs back, but that seems to be a much older and harder-to-fix
      problem.  I have
      posted a RFC [2] but I believe that fixing the recent regressions has a
      higher priority.
      
      Longer-term we might try to think how to fix the cpuset mess in a better
      and less error-prone way.  I was, for example, very surprised to
      learn that cpuset updates change not only task->mems_allowed, but
      also the
      nodemask of mempolicies.  Until now I expected the parameter to
      alloc_pages_nodemask() to be stable.  I wonder why we then treat
      cpusets specially in get_page_from_freelist() and distinguish HARDWALL
      etc, when there's unconditional intersection between mempolicy and
      cpuset.  I would expect the nodemask adjustment for saving overhead in
      g_p_f(), but that clearly doesn't happen in the current form.  So we
      have both crazy complexity and overhead, AFAICS.
      
      [1] https://lkml.kernel.org/r/CAFpQJXUq-JuEP=QPidy4p_=FN0rkH5Z-kfB4qBvsf6jMS87Edg@mail.gmail.com
      [2] https://lkml.kernel.org/r/7c459f26-13a6-a817-e508-b65b903a8378@suse.cz
      
      This patch (of 4):
      
      Since commit c33d6c06 ("mm, page_alloc: avoid looking up the first
      zone in a zonelist twice") we have a wrong check for NULL preferred_zone,
      which can theoretically happen due to concurrent cpuset modification.  We
      check the zoneref pointer, which is never NULL, when we should check
      the zone pointer.  Also document this in the first_zones_zonelist()
      comment, per Michal Hocko.
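
      A sketch of the corrected check in the allocator slow path (context
      trimmed; the surrounding code is assumed from the description):

        ac.preferred_zoneref = first_zones_zonelist(ac.zonelist,
                                        ac.high_zoneidx, ac.nodemask);
        /* The zoneref itself is never NULL; its ->zone member is what
         * can be NULL under a concurrent cpuset update. */
        if (!ac.preferred_zoneref->zone) {
                page = NULL;
                goto no_zone;
        }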
      
      Fixes: c33d6c06 ("mm, page_alloc: avoid looking up the first zone in a zonelist twice")
      Link: http://lkml.kernel.org/r/20170120103843.24587-2-vbabka@suse.cz
      Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
      Acked-by: Mel Gorman <mgorman@techsingularity.net>
      Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
      Cc: Ganapatrao Kulkarni <gpkulkarni@gmail.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • kernel/watchdog: prevent false hardlockup on overloaded system · b94f5118
      Committed by Don Zickus
      On an overloaded system, it is possible that a change in the watchdog
      threshold can be delayed long enough to trigger a false positive.
      
      This can easily be achieved by having a cpu spinning indefinitely on a
      task, while another cpu updates watchdog threshold.
      
      What happens is while trying to park the watchdog threads, the hrtimers
      on the other cpus trigger and reprogram themselves with the new slower
      watchdog threshold.  Meanwhile, the nmi watchdog is still programmed
      with the old faster threshold.
      
      Because the one cpu is blocked, it prevents the thread parking on the
      other cpus from completing, which is needed to shutdown the nmi watchdog
      and reprogram it correctly.  As a result, a false positive from the nmi
      watchdog is reported.
      
      Fix this by setting a park_in_progress flag to block all lockups until
      the parking is complete.
      
      Fix provided by Ulrich Obergfell.
      
      [akpm@linux-foundation.org: s/park_in_progress/watchdog_park_in_progress/]
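
      Conceptually, the fix looks like this (per the rename noted above;
      the exact placement in kernel/watchdog.c is assumed):

        atomic_t watchdog_park_in_progress = ATOMIC_INIT(0);

        /* set to 1 around the thread-parking sequence, cleared when
         * parking completes and the NMI watchdog is reprogrammed */

        /* in the NMI watchdog callback: ignore samples while parking */
        if (atomic_read(&watchdog_park_in_progress) != 0)
                return;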
      Link: http://lkml.kernel.org/r/1481041033-192236-1-git-send-email-dzickus@redhat.com
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Reviewed-by: Aaron Tomlin <atomlin@redhat.com>
      Cc: Ulrich Obergfell <uobergfe@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • memory_hotplug: make zone_can_shift() return a boolean value · 8a1f780e
      Committed by Yasuaki Ishimatsu
      online_{kernel|movable} is used to change the memory zone to
      ZONE_{NORMAL|MOVABLE} and online the memory.
      
      To check whether the memory zone can be changed, zone_can_shift() is
      used.  Currently the function returns a negative integer, a positive
      integer, or 0.  A negative or positive return value means that the
      memory zone can be changed to ZONE_{NORMAL|MOVABLE}.
      
      But when the function returns 0, there are two meanings.
      
      One of the meanings is that the memory zone does not need to be changed.
      For example, when memory is in ZONE_NORMAL and onlined by
      online_kernel, the memory zone does not need to be changed.
      
      Another meaning is that the memory zone cannot be changed.  When
      memory is in ZONE_NORMAL and onlined by online_movable, the memory
      zone may not be changed to ZONE_MOVABLE due to memory online
      limitations (see Documentation/memory-hotplug.txt).  In this case,
      memory must not be onlined.
      
      The patch changes the return type of zone_can_shift() so that the
      memory online operation fails when the memory zone cannot be
      changed, as follows:
      
      Before applying patch:
         # grep -A 35 "Node 2" /proc/zoneinfo
         Node 2, zone   Normal
         <snip>
            node_scanned  0
                 spanned  8388608
                 present  7864320
                 managed  7864320
         # echo online_movable > memory4097/state
         # grep -A 35 "Node 2" /proc/zoneinfo
         Node 2, zone   Normal
         <snip>
            node_scanned  0
                 spanned  8388608
                 present  8388608
                 managed  8388608
      
         The online_movable operation succeeded, but memory was onlined as
         ZONE_NORMAL, not ZONE_MOVABLE.
      
      After applying patch:
         # grep -A 35 "Node 2" /proc/zoneinfo
         Node 2, zone   Normal
         <snip>
            node_scanned  0
                 spanned  8388608
                 present  7864320
                 managed  7864320
         # echo online_movable > memory4097/state
         bash: echo: write error: Invalid argument
         # grep -A 35 "Node 2" /proc/zoneinfo
         Node 2, zone   Normal
         <snip>
            node_scanned  0
                 spanned  8388608
                 present  7864320
                 managed  7864320
      
         The online_movable operation failed because the memory zone could
         not be changed from ZONE_NORMAL to ZONE_MOVABLE.
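
      For reference, a sketch of the changed interface (parameter names
      assumed; the shift amount moves to an out-parameter while the return
      value now reports only validity):

        extern bool zone_can_shift(unsigned long pfn, unsigned long nr_pages,
                                   enum zone_type target, int *zone_shift);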
      
      Fixes: df429ac0 ("memory-hotplug: more general validation of zone during online")
      Link: http://lkml.kernel.org/r/2f9c3837-33d7-b6e5-59c0-6ca4372b2d84@gmail.com
      Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Reviewed-by: Reza Arbab <arbab@linux.vnet.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>