1. 17 2月, 2016 1 次提交
  2. 15 2月, 2016 2 次提交
  3. 15 1月, 2016 1 次提交
    • V
      kmemcg: account certain kmem allocations to memcg · 5d097056
      Vladimir Davydov 提交于
      Mark those kmem allocations that are known to be easily triggered from
      userspace as __GFP_ACCOUNT/SLAB_ACCOUNT, which makes them accounted to
      memcg.  For the list, see below:
      
       - threadinfo
       - task_struct
       - task_delay_info
       - pid
       - cred
       - mm_struct
       - vm_area_struct and vm_region (nommu)
       - anon_vma and anon_vma_chain
       - signal_struct
       - sighand_struct
       - fs_struct
       - files_struct
       - fdtable and fdtable->full_fds_bits
       - dentry and external_name
       - inode for all filesystems. This is the most tedious part, because
         most filesystems overwrite the alloc_inode method.
      
      The list is far from complete, so feel free to add more objects.
      Nevertheless, it should be close to "account everything" approach and
      keep most workloads within bounds.  Malevolent users will be able to
      breach the limit, but this was possible even with the former "account
      everything" approach (simply because it did not account everything in
      fact).
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: NVladimir Davydov <vdavydov@virtuozzo.com>
      Acked-by: NJohannes Weiner <hannes@cmpxchg.org>
      Acked-by: NMichal Hocko <mhocko@suse.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      5d097056
  4. 13 1月, 2016 1 次提交
  5. 11 1月, 2016 4 次提交
  6. 08 1月, 2016 1 次提交
  7. 27 12月, 2015 3 次提交
    • A
      powerpc/powernv: Fix minor off-by-one error in opal_mce_check_early_recovery() · dc3799bb
      Andrew Donnellan 提交于
      Fix off-by-one error in opal_mce_check_early_recovery() when checking
      whether the NIP falls within OPAL space.
      Signed-off-by: NAndrew Donnellan <andrew.donnellan@au1.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      dc3799bb
    • M
      powerpc/powernv: Only delay opal_rtc_read() retry when necessary · 57a90390
      Michael Neuling 提交于
      Only delay opal_rtc_read() when busy and are going to retry.
      
      This has the advantage of possibly saving a massive 10ms off booting!
      
      Kudos to Stewart for noticing.
      Signed-off-by: NMichael Neuling <mikey@neuling.org>
      Reviewed-by: NStewart Smith <stewart@linux.vnet.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      57a90390
    • R
      powerpc/powernv: Add a kmsg_dumper that flushes console output on panic · affddff6
      Russell Currey 提交于
      On BMC machines, console output is controlled by the OPAL firmware and is
      only flushed when its pollers are called.  When the kernel is in a panic
      state, it no longer calls these pollers and thus console output does not
      completely flush, causing some output from the panic to be lost.
      
      Output is only actually lost when the kernel is configured to not power off
      or reboot after panic (i.e. CONFIG_PANIC_TIMEOUT is set to 0) since OPAL
      flushes the console buffer as part of its power down routines.  Before this
      patch, however, only partial output would be printed during the timeout wait.
      
      This patch adds a new kmsg_dumper which gets called at panic time to ensure
      panic output is not lost.  It accomplishes this by calling OPAL_CONSOLE_FLUSH
      in the OPAL API, and if that is not available, the pollers are called enough
      times to (hopefully) completely flush the buffer.
      
      The flushing mechanism will only affect output printed at and before the
      kmsg_dump call in kernel/panic.c:panic().  As such, the "end Kernel panic"
      message may still be truncated as follows:
      
      >Call Trace:
      >[c000000f1f603b00] [c0000000008e9458] dump_stack+0x90/0xbc (unreliable)
      >[c000000f1f603b30] [c0000000008e7e78] panic+0xf8/0x2c4
      >[c000000f1f603bc0] [c000000000be4860] mount_block_root+0x288/0x33c
      >[c000000f1f603c80] [c000000000be4d14] prepare_namespace+0x1f4/0x254
      >[c000000f1f603d00] [c000000000be43e8] kernel_init_freeable+0x318/0x350
      >[c000000f1f603dc0] [c00000000000bd74] kernel_init+0x24/0x130
      >[c000000f1f603e30] [c0000000000095b0] ret_from_kernel_thread+0x5c/0xac
      >---[ end Kernel panic - not
      
      This functionality is implemented as a kmsg_dumper as it seems to be the
      most sensible way to introduce platform-specific functionality to the
      panic function.
      Signed-off-by: NRussell Currey <ruscur@russell.cc>
      Reviewed-by: NAndrew Donnellan <andrew.donnellan@au1.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      affddff6
  8. 23 12月, 2015 5 次提交
  9. 18 12月, 2015 1 次提交
    • A
      powerpc/opal-irqchip: Fix deadlock introduced by "Fix double endian conversion" · 036592fb
      Alistair Popple 提交于
      Commit 25642e14 ("powerpc/opal-irqchip: Fix double endian
      conversion") fixed an endian bug by calling opal_handle_events() in
      opal_event_unmask().
      
      However this introduced a deadlock if we find an event is active
      during unmasking and call opal_handle_events() again. The bad call
      sequence is:
      
        opal_interrupt()
        -> opal_handle_events()
           -> generic_handle_irq()
              -> handle_level_irq()
                 -> raw_spin_lock(&desc->lock)
                    handle_irq_event(desc)
                    unmask_irq(desc)
                    -> opal_event_unmask()
                       -> opal_handle_events()
                          -> generic_handle_irq()
                             -> handle_level_irq()
                                -> raw_spin_lock(&desc->lock)	(BOOM)
      
      When generating multiple opal events in quick succession this would lead
      to the following stall warnings:
      
      EEH: Fenced PHB#0 detected, location: U78C9.001.WZS09XA-P1-C32
      INFO: rcu_sched detected stalls on CPUs/tasks:
      
               12-...: (1 GPs behind) idle=68f/140000000000001/0 softirq=860/861 fqs=2065
               15-...: (1 GPs behind) idle=be5/140000000000001/0 softirq=1142/1143 fqs=2065
               (detected by 13, t=2102 jiffies, g=1325, c=1324, q=602)
      NMI watchdog: BUG: soft lockup - CPU#18 stuck for 22s! [irqbalance:2696]
      INFO: rcu_sched detected stalls on CPUs/tasks:
               12-...: (1 GPs behind) idle=68f/140000000000001/0 softirq=860/861 fqs=8371
               15-...: (1 GPs behind) idle=be5/140000000000001/0 softirq=1142/1143 fqs=8371
               (detected by 20, t=8407 jiffies, g=1325, c=1324, q=1290)
      
      This patch corrects the problem by queuing the work if an event is
      active during unmasking, which is similar to the pre-endian fix
      behaviour.
      
      Fixes: 25642e14 ("powerpc/opal-irqchip: Fix double endian conversion")
      Signed-off-by: NAlistair Popple <alistair@popple.id.au>
      Reported-by: NAndrew Donnellan <andrew.donnellan@au1.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      036592fb
  10. 17 12月, 2015 17 次提交
  11. 16 12月, 2015 1 次提交
    • D
      powerpc: Remove broken GregorianDay() · 00b912b0
      Daniel Axtens 提交于
      GregorianDay() is supposed to calculate the day of the week
      (tm->tm_wday) for a given day/month/year. In that calcuation it
      indexed into an array called MonthOffset using tm->tm_mon-1. However
      tm_mon is zero-based, not one-based, so this is off-by-one. It also
      means that every January, GregoiranDay() will access element -1 of
      the MonthOffset array.
      
      It also doesn't appear to be a correct algorithm either: see in
      contrast kernel/time/timeconv.c's time_to_tm function.
      
      It's been broken forever, which suggests no-one in userland uses
      this. It looks like no-one in the kernel uses tm->tm_wday either
      (see e.g. drivers/rtc/rtc-ds1305.c:319).
      
      tm->tm_wday is conventionally set to -1 when not available in
      hardware so we can simply set it to -1 and drop the function.
      (There are over a dozen other drivers in drivers/rtc that do
      this.)
      
      Found using UBSAN.
      
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Andrew Morton <akpm@linux-foundation.org> # as an example of what UBSan finds.
      Cc: Alessandro Zummo <a.zummo@towertech.it>
      Cc: Alexandre Belloni <alexandre.belloni@free-electrons.com>
      Cc: rtc-linux@googlegroups.com
      Signed-off-by: NDaniel Axtens <dja@axtens.net>
      Acked-by: NAlexandre Belloni <alexandre.belloni@free-electrons.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      00b912b0
  12. 14 12月, 2015 3 次提交