1. 07 12月, 2013 7 次提交
    • J
      ipv6 addrconf: extend ifa_flags to u32 · 479840ff
      Jiri Pirko 提交于
      There is no more space in u8 ifa_flags. So do what davem suffested and
      add another netlink attr called IFA_FLAGS for carry more flags.
      Signed-off-by: NJiri Pirko <jiri@resnulli.us>
      Signed-off-by: NThomas Haller <thaller@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      479840ff
    • E
      net: introduce dev_consume_skb_any() · e6247027
      Eric Dumazet 提交于
      Some network drivers use dev_kfree_skb_any() and dev_kfree_skb_irq()
      helpers to free skbs, both for dropped packets and TX completed ones.
      
      We need to separate the two causes to get better diagnostics
      given by dropwatch or "perf record -e skb:kfree_skb"
      
      This patch provides two new helpers, dev_consume_skb_any() and
      dev_consume_skb_irq() to be used for consumed skbs.
      
      __dev_kfree_skb_irq() is slightly optimized to remove one
      atomic_dec_and_test() in fast path, and use this_cpu_{r|w} accessors.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e6247027
    • F
      net: phy: breakdown PHY_*_FEATURES defines · e9fbdf17
      Florian Fainelli 提交于
      Breakdown the PHY_*_FEATURES into per speed defines such that we can
      easily re-use them individually.
      Signed-off-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e9fbdf17
    • E
      tcp: auto corking · f54b3111
      Eric Dumazet 提交于
      With the introduction of TCP Small Queues, TSO auto sizing, and TCP
      pacing, we can implement Automatic Corking in the kernel, to help
      applications doing small write()/sendmsg() to TCP sockets.
      
      Idea is to change tcp_push() to check if the current skb payload is
      under skb optimal size (a multiple of MSS bytes)
      
      If under 'size_goal', and at least one packet is still in Qdisc or
      NIC TX queues, set the TCP Small Queue Throttled bit, so that the push
      will be delayed up to TX completion time.
      
      This delay might allow the application to coalesce more bytes
      in the skb in following write()/sendmsg()/sendfile() system calls.
      
      The exact duration of the delay is depending on the dynamics
      of the system, and might be zero if no packet for this flow
      is actually held in Qdisc or NIC TX ring.
      
      Using FQ/pacing is a way to increase the probability of
      autocorking being triggered.
      
      Add a new sysctl (/proc/sys/net/ipv4/tcp_autocorking) to control
      this feature and default it to 1 (enabled)
      
      Add a new SNMP counter : nstat -a | grep TcpExtTCPAutoCorking
      This counter is incremented every time we detected skb was under used
      and its flush was deferred.
      
      Tested:
      
      Interesting effects when using line buffered commands under ssh.
      
      Excellent performance results in term of cpu usage and total throughput.
      
      lpq83:~# echo 1 >/proc/sys/net/ipv4/tcp_autocorking
      lpq83:~# perf stat ./super_netperf 4 -t TCP_STREAM -H lpq84 -- -m 128
      9410.39
      
       Performance counter stats for './super_netperf 4 -t TCP_STREAM -H lpq84 -- -m 128':
      
            35209.439626 task-clock                #    2.901 CPUs utilized
                   2,294 context-switches          #    0.065 K/sec
                     101 CPU-migrations            #    0.003 K/sec
                   4,079 page-faults               #    0.116 K/sec
          97,923,241,298 cycles                    #    2.781 GHz                     [83.31%]
          51,832,908,236 stalled-cycles-frontend   #   52.93% frontend cycles idle    [83.30%]
          25,697,986,603 stalled-cycles-backend    #   26.24% backend  cycles idle    [66.70%]
         102,225,978,536 instructions              #    1.04  insns per cycle
                                                   #    0.51  stalled cycles per insn [83.38%]
          18,657,696,819 branches                  #  529.906 M/sec                   [83.29%]
              91,679,646 branch-misses             #    0.49% of all branches         [83.40%]
      
            12.136204899 seconds time elapsed
      
      lpq83:~# echo 0 >/proc/sys/net/ipv4/tcp_autocorking
      lpq83:~# perf stat ./super_netperf 4 -t TCP_STREAM -H lpq84 -- -m 128
      6624.89
      
       Performance counter stats for './super_netperf 4 -t TCP_STREAM -H lpq84 -- -m 128':
            40045.864494 task-clock                #    3.301 CPUs utilized
                     171 context-switches          #    0.004 K/sec
                      53 CPU-migrations            #    0.001 K/sec
                   4,080 page-faults               #    0.102 K/sec
         111,340,458,645 cycles                    #    2.780 GHz                     [83.34%]
          61,778,039,277 stalled-cycles-frontend   #   55.49% frontend cycles idle    [83.31%]
          29,295,522,759 stalled-cycles-backend    #   26.31% backend  cycles idle    [66.67%]
         108,654,349,355 instructions              #    0.98  insns per cycle
                                                   #    0.57  stalled cycles per insn [83.34%]
          19,552,170,748 branches                  #  488.244 M/sec                   [83.34%]
             157,875,417 branch-misses             #    0.81% of all branches         [83.34%]
      
            12.130267788 seconds time elapsed
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f54b3111
    • J
      netfilter: Fix FSF address in file headers · e664eabd
      Jeff Kirsher 提交于
      Several files refer to an old address for the Free Software Foundation
      in the file header comment.  Resolve by replacing the address with
      the URL <http://www.gnu.org/licenses/> so that we do not have to keep
      updating the header comments anytime the address changes.
      
      CC: netfilter@vger.kernel.org
      CC: Pablo Neira Ayuso <pablo@netfilter.org>
      CC: Patrick McHardy <kaber@trash.net>
      CC: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
      Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e664eabd
    • J
      include/net/: Fix FSF address in file headers · a6227e26
      Jeff Kirsher 提交于
      Several files refer to an old address for the Free Software Foundation
      in the file header comment.  Resolve by replacing the address with
      the URL <http://www.gnu.org/licenses/> so that we do not have to keep
      updating the header comments anytime the address changes.
      Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a6227e26
    • J
      sctp: Fix FSF address in file headers · 4b2f13a2
      Jeff Kirsher 提交于
      Several files refer to an old address for the Free Software Foundation
      in the file header comment.  Resolve by replacing the address with
      the URL <http://www.gnu.org/licenses/> so that we do not have to keep
      updating the header comments anytime the address changes.
      
      CC: Vlad Yasevich <vyasevich@gmail.com>
      CC: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4b2f13a2
  2. 03 12月, 2013 2 次提交
  3. 02 12月, 2013 3 次提交
  4. 01 12月, 2013 1 次提交
  5. 29 11月, 2013 3 次提交
  6. 26 11月, 2013 13 次提交
    • S
      tracing: Allow events to have NULL strings · 4e58e547
      Steven Rostedt (Red Hat) 提交于
      If an TRACE_EVENT() uses __assign_str() or __get_str on a NULL pointer
      then the following oops will happen:
      
      BUG: unable to handle kernel NULL pointer dereference at   (null)
      IP: [<c127a17b>] strlen+0x10/0x1a
      *pde = 00000000 ^M
      Oops: 0000 [#1] PREEMPT SMP
      Modules linked in:
      CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.13.0-rc1-test+ #2
      Hardware name:                  /DG965MQ, BIOS MQ96510J.86A.0372.2006.0605.1717 06/05/2006^M
      task: f5cde9f0 ti: f5e5e000 task.ti: f5e5e000
      EIP: 0060:[<c127a17b>] EFLAGS: 00210046 CPU: 1
      EIP is at strlen+0x10/0x1a
      EAX: 00000000 EBX: c2472da8 ECX: ffffffff EDX: c2472da8
      ESI: c1c5e5fc EDI: 00000000 EBP: f5e5fe84 ESP: f5e5fe80
       DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
      CR0: 8005003b CR2: 00000000 CR3: 01f32000 CR4: 000007d0
      Stack:
       f5f18b90 f5e5feb8 c10687a8 0759004f 00000005 00000005 00000005 00200046
       00000002 00000000 c1082a93 f56c7e28 c2472da8 c1082a93 f5e5fee4 c106bc61^M
       00000000 c1082a93 00000000 00000000 00000001 00200046 00200082 00000000
      Call Trace:
       [<c10687a8>] ftrace_raw_event_lock+0x39/0xc0
       [<c1082a93>] ? ktime_get+0x29/0x69
       [<c1082a93>] ? ktime_get+0x29/0x69
       [<c106bc61>] lock_release+0x57/0x1a5
       [<c1082a93>] ? ktime_get+0x29/0x69
       [<c10824dd>] read_seqcount_begin.constprop.7+0x4d/0x75
       [<c1082a93>] ? ktime_get+0x29/0x69^M
       [<c1082a93>] ktime_get+0x29/0x69
       [<c108a46a>] __tick_nohz_idle_enter+0x1e/0x426
       [<c10690e8>] ? lock_release_holdtime.part.19+0x48/0x4d
       [<c10bc184>] ? time_hardirqs_off+0xe/0x28
       [<c1068c82>] ? trace_hardirqs_off_caller+0x3f/0xaf
       [<c108a8cb>] tick_nohz_idle_enter+0x59/0x62
       [<c1079242>] cpu_startup_entry+0x64/0x192
       [<c102299c>] start_secondary+0x277/0x27c
      Code: 90 89 c6 89 d0 88 c4 ac 38 e0 74 09 84 c0 75 f7 be 01 00 00 00 89 f0 48 5e 5d c3 55 89 e5 57 66 66 66 66 90 83 c9 ff 89 c7 31 c0 <f2> ae f7 d1 8d 41 ff 5f 5d c3 55 89 e5 57 66 66 66 66 90 31 ff
      EIP: [<c127a17b>] strlen+0x10/0x1a SS:ESP 0068:f5e5fe80
      CR2: 0000000000000000
      ---[ end trace 01bc47bf519ec1b2 ]---
      
      New tracepoints have been added that have allowed for NULL pointers
      being assigned to strings. To fix this, change the TRACE_EVENT() code
      to check for NULL and if it is, it will assign "(null)" to it instead
      (similar to what glibc printf does).
      Reported-by: NShuah Khan <shuah.kh@samsung.com>
      Reported-by: NJovi Zhangwei <jovi.zhangwei@gmail.com>
      Link: http://lkml.kernel.org/r/CAGdX0WFeEuy+DtpsJzyzn0343qEEjLX97+o1VREFkUEhndC+5Q@mail.gmail.com
      Link: http://lkml.kernel.org/r/528D6972.9010702@samsung.com
      Fixes: 9cbf1176 ("tracing/events: provide string with undefined size support")
      Cc: stable@vger.kernel.org # 2.6.31+
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      4e58e547
    • T
      ARM: tegra: Provide dummy powergate implementation · 9886e1fd
      Thierry Reding 提交于
      In order to support increased build test coverage for drivers, implement
      dummies for the powergate implementation. This will allow the drivers to
      be built without requiring support for Tegra to be selected.
      
      This patch solves the following build errors, which can be triggered in
      v3.13-rc1 by selecting DRM_TEGRA without ARCH_TEGRA:
      
      drivers/built-in.o: In function `gr3d_remove':
      drivers/gpu/drm/tegra/gr3d.c:321: undefined reference to `tegra_powergate_power_off'
      drivers/gpu/drm/tegra/gr3d.c:325: undefined reference to `tegra_powergate_power_off'
      drivers/built-in.o: In function `gr3d_probe':
      drivers/gpu/drm/tegra/gr3d.c:266: undefined reference to `tegra_powergate_sequence_power_up'
      drivers/gpu/drm/tegra/gr3d.c:273: undefined reference to `tegra_powergate_sequence_power_up'
      Signed-off-by: NThierry Reding <treding@nvidia.com>
      [swarren, updated commit description]
      Signed-off-by: NStephen Warren <swarren@nvidia.com>
      Signed-off-by: NOlof Johansson <olof@lixom.net>
      9886e1fd
    • L
      cfg80211: use enum nl80211_dfs_regions for dfs_region everywhere · 4c7d3982
      Luis R. Rodriguez 提交于
      u8 was used in some other places, just stick to the enum,
      this forces us to express the values that are expected.
      Signed-off-by: NLuis R. Rodriguez <mcgrof@do-not-panic.com>
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      4c7d3982
    • E
      mac80211: add min required channel definition field · 21f659bf
      Eliad Peller 提交于
      Add a new field to ieee80211_chanctx_conf to indicate
      the min required channel configuration.
      
      Tuning to a narrower channel might help reducing
      the noise level and saving some power.
      
      The min required channel definition is the max of
      all min required channel definitions of the interfaces
      bound to this channel context.
      
      In AP mode, use 20MHz when there are no connected station.
      When a new station is added/removed, calculate the new max
      bandwidth supported by any of the stations (e.g. 80MHz when
      80MHz and 40MHz stations are connected).
      
      In other cases, simply use bss_conf.chandef as the
      min required chandef.
      
      Notify drivers about changes to this field by calling
      drv_change_chanctx with a new CHANGE_MIN_WIDTH notification.
      Signed-off-by: NEliad Peller <eliad@wizery.com>
      Reviewed-by: NJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: NEmmanuel Grumbach <emmanuel.grumbach@intel.com>
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      21f659bf
    • E
      mac80211: enable easier manipulation of VHT beamforming caps · fbdd90ea
      Eyal Shapira 提交于
      Introduce shift and mask defines for beamformee STS cap and number
      of sounding dimensions cap as these can take any 3 bit value.
      While at it also cleanup an unrequired parenthesis.
      Signed-off-by: NEyal Shapira <eyal@wizery.com>
      Reviewed-by: NJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: NEmmanuel Grumbach <emmanuel.grumbach@intel.com>
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      fbdd90ea
    • L
      cfg80211: add an option to disable processing country IEs · 2a901468
      Luis R. Rodriguez 提交于
      Certain vendors may want to disable the processing of
      country IEs so that they can continue using the regulatory
      domain the driver or user has set.  Currently there is no
      way to stop the core from processing country IEs, so add
      support to the core to ignore country IE hints.
      
      Cc: Mihir Shete <smihir@qti.qualcomm.com>
      Cc: Henri Bahini <hbahini@qca.qualcomm.com>
      Cc: Tushnim Bhattacharyya <tushnimb@qca.qualcomm.com>
      Signed-off-by: NLuis R. Rodriguez <mcgrof@do-not-panic.com>
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      2a901468
    • L
      cfg80211: add flags to define country IE processing rules · a09a85a0
      Luis R. Rodriguez 提交于
      802.11 cards may have different country IE parsing behavioural
      preferences and vendors may want to support these. These preferences
      were managed by the REGULATORY_CUSTOM_REG and the REGULATORY_STRICT_REG
      flags and their combination. Instead of using this existing notation,
      split out the country IE behavioural preferences as a new flag. This
      will allow us to add more customizations easily and make the code more
      maintainable.
      
      Cc: Mihir Shete <smihir@qti.qualcomm.com>
      Cc: Henri Bahini <hbahini@qca.qualcomm.com>
      Cc: Tushnim Bhattacharyya <tushnimb@qca.qualcomm.com>
      Signed-off-by: NLuis R. Rodriguez <mcgrof@do-not-panic.com>
      [fix up conflicts]
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      a09a85a0
    • L
      cfg80211: move regulatory flags to their own variable · a2f73b6c
      Luis R. Rodriguez 提交于
      We'll expand this later, this will make it easier to
      classify and review what things are related to regulatory
      or not.
      
      Coccinelle only missed 4 hits, which I had to do manually,
      supplying the SmPL in case of merge conflicts.
      
      @@
      struct wiphy *wiphy;
      @@
      -wiphy->flags |= WIPHY_FLAG_CUSTOM_REGULATORY
      +wiphy->regulatory_flags |= REGULATORY_CUSTOM_REG
      @@
      expression e;
      @@
      -e->flags |= WIPHY_FLAG_CUSTOM_REGULATORY
      +e->regulatory_flags |= REGULATORY_CUSTOM_REG
      @@
      struct wiphy *wiphy;
      @@
      -wiphy->flags &= ~WIPHY_FLAG_CUSTOM_REGULATORY
      +wiphy->regulatory_flags &= ~REGULATORY_CUSTOM_REG
      @@
      struct wiphy *wiphy;
      @@
      -wiphy->flags & WIPHY_FLAG_CUSTOM_REGULATORY
      +wiphy->regulatory_flags & REGULATORY_CUSTOM_REG
      
      @@
      struct wiphy *wiphy;
      @@
      -wiphy->flags |= WIPHY_FLAG_STRICT_REGULATORY
      +wiphy->regulatory_flags |= REGULATORY_STRICT_REG
      @@
      expression e;
      @@
      -e->flags |= WIPHY_FLAG_STRICT_REGULATORY
      +e->regulatory_flags |= REGULATORY_STRICT_REG
      @@
      struct wiphy *wiphy;
      @@
      -wiphy->flags &= ~WIPHY_FLAG_STRICT_REGULATORY
      +wiphy->regulatory_flags &= ~REGULATORY_STRICT_REG
      @@
      struct wiphy *wiphy;
      @@
      -wiphy->flags & WIPHY_FLAG_STRICT_REGULATORY
      +wiphy->regulatory_flags & REGULATORY_STRICT_REG
      
      @@
      struct wiphy *wiphy;
      @@
      -wiphy->flags |= WIPHY_FLAG_DISABLE_BEACON_HINTS
      +wiphy->regulatory_flags |= REGULATORY_DISABLE_BEACON_HINTS
      @@
      expression e;
      @@
      -e->flags |= WIPHY_FLAG_DISABLE_BEACON_HINTS
      +e->regulatory_flags |= REGULATORY_DISABLE_BEACON_HINTS
      @@
      struct wiphy *wiphy;
      @@
      -wiphy->flags &= ~WIPHY_FLAG_DISABLE_BEACON_HINTS
      +wiphy->regulatory_flags &= ~REGULATORY_DISABLE_BEACON_HINTS
      @@
      struct wiphy *wiphy;
      @@
      -wiphy->flags & WIPHY_FLAG_DISABLE_BEACON_HINTS
      +wiphy->regulatory_flags & REGULATORY_DISABLE_BEACON_HINTS
      
      Generated-by: Coccinelle SmPL
      Cc: Julia Lawall <julia.lawall@lip6.fr>
      Cc: Peter Senna Tschudin <peter.senna@gmail.com>
      Cc: Mihir Shete <smihir@qti.qualcomm.com>
      Cc: Henri Bahini <hbahini@qca.qualcomm.com>
      Cc: Tushnim Bhattacharyya <tushnimb@qca.qualcomm.com>
      Signed-off-by: NLuis R. Rodriguez <mcgrof@do-not-panic.com>
      [fix up whitespace damage, overly long lines]
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      a2f73b6c
    • M
      mac80211: add generic cipher scheme support · 2475b1cc
      Max Stepanov 提交于
      This adds generic cipher scheme support to mac80211, such schemes
      are fully under control by the driver. On hw registration drivers
      may specify additional HW ciphers with a scheme how these ciphers
      have to be handled by mac80211 TX/RR. A cipher scheme specifies a
      cipher suite value, a size of the security header to be added to
      or stripped from frames and how the PN is to be verified on RX.
      Signed-off-by: NMax Stepanov <Max.Stepanov@intel.com>
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      2475b1cc
    • J
      cfg80211/mac80211: DFS setup chandef for cac event · d2859df5
      Janusz Dziedzic 提交于
      To report channel width correctly we have
      to send correct channel parameters from
      mac80211 when calling cfg80211_cac_event().
      
      This is required in case of using channel width
      higher than 20MHz and we have to set correct
      dfs channel state after CAC (NL80211_DFS_AVAILABLE).
      Signed-off-by: NJanusz Dziedzic <janusz.dziedzic@tieto.com>
      Reviewed-by: NLuis R. Rodriguez <mcgrof@do-not-panic.com>
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      d2859df5
    • L
      cfg80211: force WIPHY_FLAG_CUSTOM_REGULATORY on wiphy_apply_custom_regulatory() · 222ea581
      Luis R. Rodriguez 提交于
      wiphy_apply_custom_regulatory() implies WIPHY_FLAG_CUSTOM_REGULATORY
      but we never enforced it, do that now and warn if the driver
      didn't set it. All drivers should be following this today already.
      
      Having WIPHY_FLAG_CUSTOM_REGULATORY does not however mean you will
      use wiphy_apply_custom_regulatory() though, you may have your own
      _orig value set up tools / helpers. The intel drivers are examples
      of this type of driver.
      Signed-off-by: NLuis R. Rodriguez <mcgrof@do-not-panic.com>
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      222ea581
    • L
      cfg80211: consolidate passive-scan and no-ibss flags · 8fe02e16
      Luis R. Rodriguez 提交于
      These two flags are used for the same purpose, just
      combine them into a no-ir flag to annotate no initiating
      radiation is allowed.
      
      Old userspace sending either flag will have it treated as
      the no-ir flag. To be considerate to older userspace we
      also send both the no-ir flag and the old no-ibss flags.
      Newer userspace will have to be aware of older kernels.
      
      Update all places in the tree using these flags with the
      following semantic patch:
      
      @@
      @@
      -NL80211_RRF_PASSIVE_SCAN
      +NL80211_RRF_NO_IR
      @@
      @@
      -NL80211_RRF_NO_IBSS
      +NL80211_RRF_NO_IR
      @@
      @@
      -IEEE80211_CHAN_PASSIVE_SCAN
      +IEEE80211_CHAN_NO_IR
      @@
      @@
      -IEEE80211_CHAN_NO_IBSS
      +IEEE80211_CHAN_NO_IR
      @@
      @@
      -NL80211_RRF_NO_IR | NL80211_RRF_NO_IR
      +NL80211_RRF_NO_IR
      @@
      @@
      -IEEE80211_CHAN_NO_IR | IEEE80211_CHAN_NO_IR
      +IEEE80211_CHAN_NO_IR
      @@
      @@
      -(NL80211_RRF_NO_IR)
      +NL80211_RRF_NO_IR
      @@
      @@
      -(IEEE80211_CHAN_NO_IR)
      +IEEE80211_CHAN_NO_IR
      
      Along with some hand-optimisations in documentation, to
      remove duplicates and to fix some indentation.
      Signed-off-by: NLuis R. Rodriguez <mcgrof@do-not-panic.com>
      [do all the driver updates in one go]
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      8fe02e16
    • A
      nl80211: better document NL80211_CMD_TDLS_MGMT · c17bff87
      Arik Nemtsov 提交于
      This command has different semantics depending on the action code sent.
      Document this fact and detail the supported action codes.
      Signed-off-by: NArik Nemtsov <arik@wizery.com>
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      c17bff87
  7. 25 11月, 2013 2 次提交
  8. 24 11月, 2013 1 次提交
  9. 22 11月, 2013 5 次提交
    • K
      mm: place page->pmd_huge_pte to right union · 7aa555bf
      Kirill A. Shutemov 提交于
      I don't know what went wrong, mis-merge or something, but ->pmd_huge_pte
      placed in wrong union within struct page.
      
      In original patch[1] it's placed to union with ->lru and ->slab, but in
      commit e009bb30 ("mm: implement split page table lock for PMD
      level") it's in union with ->index and ->freelist.
      
      That union seems also unused for pages with table tables and safe to
      re-use, but it's not what I've tested.
      
      Let's move it to original place.  It fixes indentation at least.  :)
      
      [1] https://lkml.org/lkml/2013/10/7/288Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Reviewed-by: NNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      7aa555bf
    • A
      mm: hugetlbfs: fix hugetlbfs optimization · 27c73ae7
      Andrea Arcangeli 提交于
      Commit 7cb2ef56 ("mm: fix aio performance regression for database
      caused by THP") can cause dereference of a dangling pointer if
      split_huge_page runs during PageHuge() if there are updates to the
      tail_page->private field.
      
      Also it is repeating compound_head twice for hugetlbfs and it is running
      compound_head+compound_trans_head for THP when a single one is needed in
      both cases.
      
      The new code within the PageSlab() check doesn't need to verify that the
      THP page size is never bigger than the smallest hugetlbfs page size, to
      avoid memory corruption.
      
      A longstanding theoretical race condition was found while fixing the
      above (see the change right after the skip_unlock label, that is
      relevant for the compound_lock path too).
      
      By re-establishing the _mapcount tail refcounting for all compound
      pages, this also fixes the below problem:
      
        echo 0 >/sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
      
        BUG: Bad page state in process bash  pfn:59a01
        page:ffffea000139b038 count:0 mapcount:10 mapping:          (null) index:0x0
        page flags: 0x1c00000000008000(tail)
        Modules linked in:
        CPU: 6 PID: 2018 Comm: bash Not tainted 3.12.0+ #25
        Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
        Call Trace:
          dump_stack+0x55/0x76
          bad_page+0xd5/0x130
          free_pages_prepare+0x213/0x280
          __free_pages+0x36/0x80
          update_and_free_page+0xc1/0xd0
          free_pool_huge_page+0xc2/0xe0
          set_max_huge_pages.part.58+0x14c/0x220
          nr_hugepages_store_common.isra.60+0xd0/0xf0
          nr_hugepages_store+0x13/0x20
          kobj_attr_store+0xf/0x20
          sysfs_write_file+0x189/0x1e0
          vfs_write+0xc5/0x1f0
          SyS_write+0x55/0xb0
          system_call_fastpath+0x16/0x1b
      Signed-off-by: NKhalid Aziz <khalid.aziz@oracle.com>
      Signed-off-by: NAndrea Arcangeli <aarcange@redhat.com>
      Tested-by: NKhalid Aziz <khalid.aziz@oracle.com>
      Cc: Pravin Shelar <pshelar@nicira.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Ben Hutchings <bhutchings@solarflare.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Johannes Weiner <jweiner@redhat.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Minchan Kim <minchan@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      27c73ae7
    • D
      mm: thp: give transparent hugepage code a separate copy_page · 30b0a105
      Dave Hansen 提交于
      Right now, the migration code in migrate_page_copy() uses copy_huge_page()
      for hugetlbfs and thp pages:
      
             if (PageHuge(page) || PageTransHuge(page))
                      copy_huge_page(newpage, page);
      
      So, yay for code reuse.  But:
      
        void copy_huge_page(struct page *dst, struct page *src)
        {
              struct hstate *h = page_hstate(src);
      
      and a non-hugetlbfs page has no page_hstate().  This works 99% of the
      time because page_hstate() determines the hstate from the page order
      alone.  Since the page order of a THP page matches the default hugetlbfs
      page order, it works.
      
      But, if you change the default huge page size on the boot command-line
      (say default_hugepagesz=1G), then we might not even *have* a 2MB hstate
      so page_hstate() returns null and copy_huge_page() oopses pretty fast
      since copy_huge_page() dereferences the hstate:
      
        void copy_huge_page(struct page *dst, struct page *src)
        {
              struct hstate *h = page_hstate(src);
              if (unlikely(pages_per_huge_page(h) > MAX_ORDER_NR_PAGES)) {
        ...
      
      Mel noticed that the migration code is really the only user of these
      functions.  This moves all the copy code over to migrate.c and makes
      copy_huge_page() work for THP by checking for it explicitly.
      
      I believe the bug was introduced in commit b32967ff ("mm: numa: Add
      THP migration for the NUMA working set scanning fault case")
      
      [akpm@linux-foundation.org: fix coding-style and comment text, per Naoya Horiguchi]
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Acked-by: NMel Gorman <mgorman@suse.de>
      Reviewed-by: NNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Hillf Danton <dhillf@gmail.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Tested-by: NDave Jiang <dave.jiang@intel.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      30b0a105
    • J
      genetlink: fix genl_set_err() group ID · 91398a09
      Johannes Berg 提交于
      Fix another really stupid bug - I introduced genl_set_err()
      precisely to be able to adjust the group and reject invalid
      ones, but then forgot to do so.
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      91398a09
    • J
      genetlink: fix genlmsg_multicast() bug · 220815a9
      Johannes Berg 提交于
      Unfortunately, I introduced a tremendously stupid bug into
      genlmsg_multicast() when doing all those multicast group
      changes: it adjusts the group number, but then passes it
      to genlmsg_multicast_netns() which does that again.
      
      Somehow, my tests failed to catch this, so add a warning
      into genlmsg_multicast_netns() and remove the offending
      group ID adjustment.
      
      Also add a warning to the similar code in other functions
      so people who misuse them are more loudly warned.
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      220815a9
  10. 21 11月, 2013 3 次提交