1. 22 9月, 2012 5 次提交
    • L
      Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc · 63365013
      Linus Torvalds 提交于
      Pull arm-soc bug fixes from Olof Johansson:
       "A couple of samsung clock locking fixes, at91 device tree gpio
        configuration fix and a couple more for shmobile and i.MX.
      
        All small targeted fixes."
      
      * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
        ARM i.MX25: Make timer irq work again
        ARM: imx: armadillo5x0: Fix illegal register access
        ARM: shmobile: kzm9g: bugfix: correct mmcif interrupt settings
        ARM: SAMSUNG: Use spin_lock_{irqsave,irqrestore} in clk_set_rate
        ARM: at91: fix missing #interrupt-cells on gpio-controller
        ARM: SAMSUNG: use spin_lock_irqsave() in clk_set_parent
      63365013
    • L
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · 267b50fe
      Linus Torvalds 提交于
      Pull s390 fixes from Martin Schwidefsky:
       "Bug fixes for 3.6-rc7, including some important patches for large page
        related memory management issues."
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
        s390/dasd: fix read unit address configuration loop
        s390/dasd: fix pathgroup race
        s390/mm: fix user access page-table walk code
        s390/hwcaps: do not report high gprs for 31 bit kernel
        s390/cio: invalidate cdev pointer before deregistration
        s390/cio: fix IO subchannel event race
        s390/dasd: move wake_up call
        s390/hugetlb: use direct TLB flushing for hugetlbfs pages
        s390/mm: fix deadlock in unmap_hugepage_range()
      267b50fe
    • L
      Merge tag 'stable/for-linus-3.6-rc6-tag' of... · 8ca7de91
      Linus Torvalds 提交于
      Merge tag 'stable/for-linus-3.6-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
      
      Pull Xen bug-fixes from Konrad Rzeszutek Wilk:
       - Fix M2P batching re-using the incorrect structure field.
      
         In v3.5 we added batching for M2P override (Machine Frame Number ->
         Physical Frame Number), but the original MFN was saved in an
         incorrect structure - and we would oops/restore when restoring with
         the old MFN.
      
       - Disable BIOS SMP MP table search.
      
         A bootup issue that we had ignored until we found that on DL380 G6 it
         was needed.
      
      * tag 'stable/for-linus-3.6-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
        xen/boot: Disable BIOS SMP MP table search.
        xen/m2p: do not reuse kmap_op->dev_bus_addr
      8ca7de91
    • L
      debugfs: fix u32_array race in format_array_alloc · e05e279e
      Linus Torvalds 提交于
      The format_array_alloc() function is fundamentally racy, in that it
      prints the array twice: once to figure out how much space to allocate
      for the buffer, and the second time to actually print out the data.
      
      If any of the array contents changes in between, the allocation size may
      be wrong, and the end result may be truncated in odd ways.
      
      Just don't do it.  Allocate a maximum-sized array up-front, and just
      format the array contents once.  The only user of the u32_array
      interfaces is the Xen spinlock statistics code, and it has 31 entries in
      the arrays, so the maximum size really isn't that big, and the end
      result is much simpler code without the bug.
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      e05e279e
    • D
      debugfs: fix race in u32_array_read and allocate array at open · 36048853
      David Rientjes 提交于
      u32_array_open() is racy when multiple threads read from a file with a
      seek position of zero, i.e. when two or more simultaneous reads are
      occurring after the non-seekable files are created.  It is possible that
      file->private_data is double-freed because the threads races between
      
      	kfree(file->private-data);
      
      and
      
      	file->private_data = NULL;
      
      The fix is to only do format_array_alloc() when the file is opened and
      free it when it is closed.
      
      Note that because the file has always been non-seekable, you can't open
      it and read it multiple times anyway, so the data has always been
      generated just once.  The difference is that now it is generated at open
      time rather than at the time of the first read, and that avoids the
      race.
      Reported-by: NDave Jones <davej@redhat.com>
      Acked-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Tested-by: NRaghavendra <raghavendra.kt@linux.vnet.ibm.com>
      Signed-off-by: NDavid Rientjes <rientjes@google.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      36048853
  2. 20 9月, 2012 8 次提交
    • K
      xen/boot: Disable BIOS SMP MP table search. · bd49940a
      Konrad Rzeszutek Wilk 提交于
      As the initial domain we are able to search/map certain regions
      of memory to harvest configuration data. For all low-level we
      use ACPI tables - for interrupts we use exclusively ACPI _PRT
      (so DSDT) and MADT for INT_SRC_OVR.
      
      The SMP MP table is not used at all. As a matter of fact we do
      not even support machines that only have SMP MP but no ACPI tables.
      
      Lets follow how Moorestown does it and just disable searching
      for BIOS SMP tables.
      
      This also fixes an issue on HP Proliant BL680c G5 and DL380 G6:
      
      9f->100 for 1:1 PTE
      Freeing 9f-100 pfn range: 97 pages freed
      1-1 mapping on 9f->100
      .. snip..
      e820: BIOS-provided physical RAM map:
      Xen: [mem 0x0000000000000000-0x000000000009efff] usable
      Xen: [mem 0x000000000009f400-0x00000000000fffff] reserved
      Xen: [mem 0x0000000000100000-0x00000000cfd1dfff] usable
      .. snip..
      Scan for SMP in [mem 0x00000000-0x000003ff]
      Scan for SMP in [mem 0x0009fc00-0x0009ffff]
      Scan for SMP in [mem 0x000f0000-0x000fffff]
      found SMP MP-table at [mem 0x000f4fa0-0x000f4faf] mapped at [ffff8800000f4fa0]
      (XEN) mm.c:908:d0 Error getting mfn 100 (pfn 5555555555555555) from L1 entry 0000000000100461 for l1e_owner=0, pg_owner=0
      (XEN) mm.c:4995:d0 ptwr_emulate: could not get_page_from_l1e()
      BUG: unable to handle kernel NULL pointer dereference at           (null)
      IP: [<ffffffff81ac07e2>] xen_set_pte_init+0x66/0x71
      . snip..
      Pid: 0, comm: swapper Not tainted 3.6.0-rc6upstream-00188-gb6fb969-dirty #2 HP ProLiant BL680c G5
      .. snip..
      Call Trace:
       [<ffffffff81ad31c6>] __early_ioremap+0x18a/0x248
       [<ffffffff81624731>] ? printk+0x48/0x4a
       [<ffffffff81ad32ac>] early_ioremap+0x13/0x15
       [<ffffffff81acc140>] get_mpc_size+0x2f/0x67
       [<ffffffff81acc284>] smp_scan_config+0x10c/0x136
       [<ffffffff81acc2e4>] default_find_smp_config+0x36/0x5a
       [<ffffffff81ac3085>] setup_arch+0x5b3/0xb5b
       [<ffffffff81624731>] ? printk+0x48/0x4a
       [<ffffffff81abca7f>] start_kernel+0x90/0x390
       [<ffffffff81abc356>] x86_64_start_reservations+0x131/0x136
       [<ffffffff81abfa83>] xen_start_kernel+0x65f/0x661
      (XEN) Domain 0 crashed: 'noreboot' set - not rebooting.
      
      which is that ioremap would end up mapping 0xff using _PAGE_IOMAP
      (which is what early_ioremap sticks as a flag) - which meant
      we would get MFN 0xFF (pte ff461, which is OK), and then it would
      also map 0x100 (b/c ioremap tries to get page aligned request, and
      it was trying to map 0xf4fa0 + PAGE_SIZE - so it mapped the next page)
      as _PAGE_IOMAP. Since 0x100 is actually a RAM page, and the _PAGE_IOMAP
      bypasses the P2M lookup we would happily set the PTE to 1000461.
      Xen would deny the request since we do not have access to the
      Machine Frame Number (MFN) of 0x100. The P2M[0x100] is for example
      0x80140.
      
      CC: stable@vger.kernel.org
      Fixes-Oracle-Bugzilla: https://bugzilla.oracle.com/bugzilla/show_bug.cgi?id=13665Acked-by: NJan Beulich <jbeulich@suse.com>
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      bd49940a
    • L
      Merge branch 'for-linus' of git://git.kernel.dk/linux-block · c46de226
      Linus Torvalds 提交于
      Pull block fixes from Jens Axboe:
       "A small collection of driver fixes/updates and a core fix for 3.6.  It
        contains:
      
         - Bug fixes for mtip32xx, and support for new hardware (just addition
           of IDs).  They have been queued up for 3.7 for a few weeks as well.
      
         - rate-limit a failing command error message in block core.
      
         - A fix for an old cciss bug from Stephen.
      
         - Prevent overflow of partition count from Alan."
      
      * 'for-linus' of git://git.kernel.dk/linux-block:
        cciss: fix handling of protocol error
        blk: add an upper sanity check on partition adding
        mtip32xx: fix user_buffer check in exec_drive_command
        mtip32xx: Remove dead code
        mtip32xx: Change printk to pr_xxxx
        mtip32xx: Proper reporting of write protect status on big-endian
        mtip32xx: Increase timeout for standby command
        mtip32xx: Handle NCQ commands during the security locked state
        mtip32xx: Add support for new devices
        block: rate-limit the error message from failing commands
      c46de226
    • L
      Merge tag 'sh-for-linus' of git://github.com/pmundt/linux-sh · 077fee00
      Linus Torvalds 提交于
      Pull SuperH fixes from Paul Mundt.
      
      * tag 'sh-for-linus' of git://github.com/pmundt/linux-sh:
        sh: Fix up TIF_NOTIFY_RESUME sans TIF_SIGPENDING handling.
        sh: pfc: Release spinlock in sh_pfc_gpio_request_enable() error path
        sh: intc: Fix up multi-evt irq association.
      077fee00
    • L
      Merge tag 'rpmsg-3.6-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/ohad/rpmsg · cf42d543
      Linus Torvalds 提交于
      Pull rpmsg fix from Ohad Ben-Cohen:
       "A quick rpmsg fix from Fernando, fixing two buggy invocations of
        dma_free_coherent"
      
      * tag 'rpmsg-3.6-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/ohad/rpmsg:
        rpmsg: fix dma_free_coherent dev parameter
      cf42d543
    • L
      Merge tag 'md-3.6-fixes' of git://neil.brown.name/md · 4b92c17e
      Linus Torvalds 提交于
      Pull md fixes from NeilBrown:
       "3 fixes for md in 3.6.
      
        One reverts a recent patch which turns out to not be such a good idea.
      
        Other two fix minor bugs with the new (since 3.3) 'replacement' code
        and have been tagged for -stable."
      
      * tag 'md-3.6-fixes' of git://neil.brown.name/md:
        md: make sure metadata is updated when spares are activated or removed.
        md/raid5: fix calculate of 'degraded' when a replacement becomes active.
        Revert "md/raid5: For odirect-write performance, do not set STRIPE_PREREAD_ACTIVE."
      4b92c17e
    • L
      Merge branch 'for-3.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq · c5c473e2
      Linus Torvalds 提交于
      Pull workqueue / powernow-k8 fix from Tejun Heo:
       "This is the fix for the bug where cpufreq/powernow-k8 was tripping
        BUG_ON() in try_to_wake_up_local() by migrating workqueue worker to a
        different CPU.
      
          https://bugzilla.kernel.org/show_bug.cgi?id=47301
      
        As discussed, the fix is now two parts - one to reimplement
        work_on_cpu() so that it doesn't create a new kthread each time and
        the actual fix which makes powernow-k8 use work_on_cpu() instead of
        performing manual migration.
      
        While pretty late in the merge cycle, both changes are on the safer
        side.  Jiri and I verified two existing users of work_on_cpu() and
        Duncan confirmed that the powernow-k8 fix survived about 18 hours of
        testing."
      
      * 'for-3.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
        cpufreq/powernow-k8: workqueue user shouldn't migrate the kworker to another CPU
        workqueue: reimplement work_on_cpu() using system_wq
      c5c473e2
    • T
      cpufreq/powernow-k8: workqueue user shouldn't migrate the kworker to another CPU · 6889125b
      Tejun Heo 提交于
      powernowk8_target() runs off a per-cpu work item and if the
      cpufreq_policy->cpu is different from the current one, it migrates the
      kworker to the target CPU by manipulating current->cpus_allowed.  The
      function migrates the kworker back to the original CPU but this is
      still broken.  Workqueue concurrency management requires the kworkers
      to stay on the same CPU and powernowk8_target() ends up triggerring
      BUG_ON(rq != this_rq()) in try_to_wake_up_local() if it contends on
      fidvid_mutex and sleeps.
      
      It is unclear why this bug is being reported now.  Duncan says it
      appeared to be a regression of 3.6-rc1 and couldn't reproduce it on
      3.5.  Bisection seemed to point to 63d95a91 "workqueue: use @pool
      instead of @gcwq or @cpu where applicable" which is an non-functional
      change.  Given that the reproduce case sometimes took upto days to
      trigger, it's easy to be misled while bisecting.  Maybe something made
      contention on fidvid_mutex more likely?  I don't know.
      
      This patch fixes the bug by using work_on_cpu() instead if @pol->cpu
      isn't the same as the current one.  The code assumes that
      cpufreq_policy->cpu is kept online by the caller, which Rafael tells
      me is the case.
      
      stable: ed48ece2 ("workqueue: reimplement work_on_cpu() using
              system_wq") should be applied before this; otherwise, the
              behavior could be horrible.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Reported-by: NDuncan <1i5t5.duncan@cox.net>
      Tested-by: NDuncan <1i5t5.duncan@cox.net>
      Cc: Rafael J. Wysocki <rjw@sisk.pl>
      Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
      Cc: stable@vger.kernel.org
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=47301
      6889125b
    • T
      workqueue: reimplement work_on_cpu() using system_wq · ed48ece2
      Tejun Heo 提交于
      The existing work_on_cpu() implementation is hugely inefficient.  It
      creates a new kthread, execute that single function and then let the
      kthread die on each invocation.
      
      Now that system_wq can handle concurrent executions, there's no
      advantage of doing this.  Reimplement work_on_cpu() using system_wq
      which makes it simpler and way more efficient.
      
      stable: While this isn't a fix in itself, it's needed to fix a
              workqueue related bug in cpufreq/powernow-k8.  AFAICS, this
              shouldn't break other existing users.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NJiri Kosina <jkosina@suse.cz>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Rafael J. Wysocki <rjw@sisk.pl>
      Cc: stable@vger.kernel.org
      ed48ece2
  3. 19 9月, 2012 6 次提交
  4. 18 9月, 2012 21 次提交
    • S
      ARM i.MX25: Make timer irq work again · 42a3f891
      Sascha Hauer 提交于
      Since i.MX has SPARSE_IRQ enabled the i.MX25 timer is broken. This
      is because the internal irqs now start at an offset of NR_IRQS_LEGACY.
      The patch fixed this up, but missed the i.MX25 timer which used a
      hardcoded value instead of a define. This patch introduces a define
      for the timer irq and uses it.
      
      This is broken since introduced with 3.6-rc1:
      
      | commit 8842a9e2
      | Author: Shawn Guo <shawn.guo@linaro.org>
      | Date:   Thu Jun 14 11:16:14 2012 +0800
      |
      |    ARM: imx: enable SPARSE_IRQ for imx platform
      Signed-off-by: NSascha Hauer <s.hauer@pengutronix.de>
      Acked-by: NShawn Guo <shawn.guo@linaro.org>
      42a3f891
    • F
      ARM: imx: armadillo5x0: Fix illegal register access · 35495173
      Fabio Estevam 提交于
      Since commit eb92044e (ARM i.MX3: Make ccm base address a variable )
      it is necessary to pass the CCM register base as a variable.
      
      Fix the CCM register access in mach-armadillo5x0 by passing mx3_ccm_base and
      avoid illegal accesses.
      
      Also applies to v3.5
      Reported-by: NArnd Bergmann <arnd@arndb.de>
      Signed-off-by: NFabio Estevam <fabio.estevam@freescale.com>
      Acked-by: NArnd Bergmann <arnd@arndb.de>
      Signed-off-by: NSascha Hauer <s.hauer@pengutronix.de>
      Cc: stable@vger.kernel.org
      35495173
    • O
      Merge tag 'at91-fixes' of git://github.com/at91linux/linux-at91 into fixes · 652f5b5e
      Olof Johansson 提交于
      From Nicolas Ferre:
      
      Modify AT91 device tree files for making the GPIO interrupts work.
      
      * tag 'at91-fixes' of git://github.com/at91linux/linux-at91:
        ARM: at91: fix missing #interrupt-cells on gpio-controller
      652f5b5e
    • O
      Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas into fixes · 0921dcea
      Olof Johansson 提交于
      * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
        ARM: shmobile: kzm9g: bugfix: correct mmcif interrupt settings
      0921dcea
    • O
      Merge branch 'v3.6-samsung-fixes-3' of... · d7235b8b
      Olof Johansson 提交于
      Merge branch 'v3.6-samsung-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung into fixes
      
      * 'v3.6-samsung-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung:
        ARM: SAMSUNG: Use spin_lock_{irqsave,irqrestore} in clk_set_rate
        ARM: SAMSUNG: use spin_lock_irqsave() in clk_set_parent
      d7235b8b
    • S
      cciss: fix handling of protocol error · 2453f5f9
      Stephen M. Cameron 提交于
      If a command completes with a status of CMD_PROTOCOL_ERR, this
      information should be conveyed to the SCSI mid layer, not dropped
      on the floor.  Unlike a similar bug in the hpsa driver, this bug
      only affects tape drives and CD and DVD ROM drives in the cciss
      driver, and to induce it, you have to disconnect (or damage) a
      cable, so it is not a very likely scenario (which would explain
      why the bug has gone undetected for the last 10 years.)
      Signed-off-by: NStephen M. Cameron <scameron@beardog.cce.hp.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      2453f5f9
    • A
      blk: add an upper sanity check on partition adding · 2bd6efad
      Alan Cox 提交于
      65536 should be ludicrous anyway but without it we overflow the
      memory computation doing the allocation and badness occurs.
      Signed-off-by: NAlan Cox <alan@linux.intel.com>
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      2bd6efad
    • A
      sh: Fix up TIF_NOTIFY_RESUME sans TIF_SIGPENDING handling. · 5e071e2b
      Al Viro 提交于
      As Al notes, we missed a TIF_NOTIFY_RESUME check which caused any
      handlers without TIF_SIGPENDING also set to skip the notification:
      
      	Looks like while it is in the relevant masks *and* checked in
      	do_notify_resume() both on 32bit and 64bit variants since commit
      	ab99c733 ("sh: Make syscall tracer
      	use tracehook notifiers, add TIF_NOTIFY_RESUME.") they are
      	actually *not* reached without simulataneous SIGPENDING, since
      	the actual glue in the callers had not been updated back then and
      	still checks for _TIF_SIGPENDING alone when deciding whether to
      	hit do_notify_resume() or not.
      Reported-by: NNobuhiro Iwamatsu <nobuhiro.iwamatsu.yj@renesas.com>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      Tested-by: NNobuhiro Iwamatsu <nobuhiro.iwamatsu.yj@renesas.com>
      Signed-off-by: NPaul Mundt <lethal@linux-sh.org>
      5e071e2b
    • L
      sh: pfc: Release spinlock in sh_pfc_gpio_request_enable() error path · 077664a2
      Laurent Pinchart 提交于
      The sh_pfc_gpio_request_enable() function acquires a spinlock but fails
      to release it before returning if the requested mux type is not
      supported. Fix this.
      Signed-off-by: NLaurent Pinchart <laurent.pinchart@ideasonboard.com>
      Signed-off-by: NPaul Mundt <lethal@linux-sh.org>
      077664a2
    • T
      ARM: shmobile: kzm9g: bugfix: correct mmcif interrupt settings · a704835d
      Tetsuyuki Kobayashi 提交于
      Correct interrupt settings of sh_mmc:int and sh_mmc:error in board-kzm9g.c.
      Signed-off-by: NTetsuyuki Kobayashi <koba@kmckk.co.jp>
      Acked-by: NKuninori Morimoto <kuninori.morimoto.gx@renesas.com>
      Signed-off-by: NSimon Horman <horms@verge.net.au>
      a704835d
    • T
      ARM: SAMSUNG: Use spin_lock_{irqsave,irqrestore} in clk_set_rate · d6838a62
      Tushar Behera 提交于
      The spinlock clocks_lock can be held during ISR, hence it is not safe to
      hold that lock with disabling interrupts.
      
      It fixes following potential deadlock.
      
      =========================================================
      [ INFO: possible irq lock inversion dependency detected ]
      3.6.0-rc4+ #2 Not tainted
      ---------------------------------------------------------
      swapper/0/1 just changed the state of lock:
       (&(&host->lock)->rlock){-.....}, at: [<c027fb0d>] sdhci_irq+0x15/0x564
      but this lock took another, HARDIRQ-unsafe lock in the past:
       (clocks_lock){+.+...}
      
      and interrupts could create inverse lock ordering between them.
      
      other info that might help us debug this:
       Possible interrupt unsafe locking scenario:
      
             CPU0                    CPU1
             ----                    ----
        lock(clocks_lock);
                                     local_irq_disable();
                                     lock(&(&host->lock)->rlock);
                                     lock(clocks_lock);
        <Interrupt>
          lock(&(&host->lock)->rlock);
      
       *** DEADLOCK ***
      Signed-off-by: NTushar Behera <tushar.behera@linaro.org>
      Signed-off-by: NKukjin Kim <kgene.kim@samsung.com>
      d6838a62
    • L
      Merge branch 'for-3.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq · 4651afbb
      Linus Torvalds 提交于
      Pull another workqueue fix from Tejun Heo:
       "Unfortunately, yet another late fix.  This too is discovered and fixed
        by Lai.  This bug was introduced during this merge window by commit
        25511a47 ("workqueue: reimplement CPU online rebinding to handle
        idle workers") which started using WORKER_REBIND flag for idle rebind
        too.
      
        The bug is relatively easy to trigger if the CPU rapidly goes through
        off, on and then off (and stay off).  The fix is on the safer side.
        This hasn't been on linux-next yet but I'm pushing early so that it
        can get more exposure before v3.6 release."
      
      * 'for-3.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
        workqueue: always clear WORKER_REBIND in busy_worker_rebind_fn()
      4651afbb
    • L
      workqueue: always clear WORKER_REBIND in busy_worker_rebind_fn() · 960bd11b
      Lai Jiangshan 提交于
      busy_worker_rebind_fn() didn't clear WORKER_REBIND if rebinding failed
      (CPU is down again).  This used to be okay because the flag wasn't
      used for anything else.
      
      However, after 25511a47 "workqueue: reimplement CPU online rebinding
      to handle idle workers", WORKER_REBIND is also used to command idle
      workers to rebind.  If not cleared, the worker may confuse the next
      CPU_UP cycle by having REBIND spuriously set or oops / get stuck by
      prematurely calling idle_worker_rebind().
      
        WARNING: at /work/os/wq/kernel/workqueue.c:1323 worker_thread+0x4cd/0x5
       00()
        Hardware name: Bochs
        Modules linked in: test_wq(O-)
        Pid: 33, comm: kworker/1:1 Tainted: G           O 3.6.0-rc1-work+ #3
        Call Trace:
         [<ffffffff8109039f>] warn_slowpath_common+0x7f/0xc0
         [<ffffffff810903fa>] warn_slowpath_null+0x1a/0x20
         [<ffffffff810b3f1d>] worker_thread+0x4cd/0x500
         [<ffffffff810bc16e>] kthread+0xbe/0xd0
         [<ffffffff81bd2664>] kernel_thread_helper+0x4/0x10
        ---[ end trace e977cf20f4661968 ]---
        BUG: unable to handle kernel NULL pointer dereference at           (null)
        IP: [<ffffffff810b3db0>] worker_thread+0x360/0x500
        PGD 0
        Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
        Modules linked in: test_wq(O-)
        CPU 0
        Pid: 33, comm: kworker/1:1 Tainted: G        W  O 3.6.0-rc1-work+ #3 Bochs Bochs
        RIP: 0010:[<ffffffff810b3db0>]  [<ffffffff810b3db0>] worker_thread+0x360/0x500
        RSP: 0018:ffff88001e1c9de0  EFLAGS: 00010086
        RAX: 0000000000000000 RBX: ffff88001e633e00 RCX: 0000000000004140
        RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000009
        RBP: ffff88001e1c9ea0 R08: 0000000000000000 R09: 0000000000000001
        R10: 0000000000000002 R11: 0000000000000000 R12: ffff88001fc8d580
        R13: ffff88001fc8d590 R14: ffff88001e633e20 R15: ffff88001e1c6900
        FS:  0000000000000000(0000) GS:ffff88001fc00000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
        CR2: 0000000000000000 CR3: 00000000130e8000 CR4: 00000000000006f0
        DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
        Process kworker/1:1 (pid: 33, threadinfo ffff88001e1c8000, task ffff88001e1c6900)
        Stack:
         ffff880000000000 ffff88001e1c9e40 0000000000000001 ffff88001e1c8010
         ffff88001e519c78 ffff88001e1c9e58 ffff88001e1c6900 ffff88001e1c6900
         ffff88001e1c6900 ffff88001e1c6900 ffff88001fc8d340 ffff88001fc8d340
        Call Trace:
         [<ffffffff810bc16e>] kthread+0xbe/0xd0
         [<ffffffff81bd2664>] kernel_thread_helper+0x4/0x10
        Code: b1 00 f6 43 48 02 0f 85 91 01 00 00 48 8b 43 38 48 89 df 48 8b 00 48 89 45 90 e8 ac f0 ff ff 3c 01 0f 85 60 01 00 00 48 8b 53 50 <8b> 02 83 e8 01 85 c0 89 02 0f 84 3b 01 00 00 48 8b 43 38 48 8b
        RIP  [<ffffffff810b3db0>] worker_thread+0x360/0x500
         RSP <ffff88001e1c9de0>
        CR2: 0000000000000000
      
      There was no reason to keep WORKER_REBIND on failure in the first
      place - WORKER_UNBOUND is guaranteed to be set in such cases
      preventing incorrectly activating concurrency management.  Always
      clear WORKER_REBIND.
      
      tj: Updated comment and description.
      Signed-off-by: NLai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: NTejun Heo <tj@kernel.org>
      960bd11b
    • L
      Merge branch 'akpm' (Andrew's patch-bomb) · 08077ca8
      Linus Torvalds 提交于
      Merge fixes from Andrew Morton:
       "13 patches.  12 are fixes and one is a little preparatory thing for
        Andi."
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (13 commits)
        memory hotplug: fix section info double registration bug
        mm/page_alloc: fix the page address of higher page's buddy calculation
        drivers/rtc/rtc-twl.c: ensure all interrupts are disabled during probe
        compiler.h: add __visible
        pid-namespace: limit value of ns_last_pid to (0, max_pid)
        include/net/sock.h: squelch compiler warning in sk_rmem_schedule()
        slub: consider pfmemalloc_match() in get_partial_node()
        slab: fix starting index for finding another object
        slab: do ClearSlabPfmemalloc() for all pages of slab
        nbd: clear waiting_queue on shutdown
        MAINTAINERS: fix TXT maintainer list and source repo path
        mm/ia64: fix a memory block size bug
        memory hotplug: reset pgdat->kswapd to NULL if creating kernel thread fails
      08077ca8
    • Q
      memory hotplug: fix section info double registration bug · f14851af
      qiuxishi 提交于
      There may be a bug when registering section info.  For example, on my
      Itanium platform, the pfn range of node0 includes the other nodes, so
      other nodes' section info will be double registered, and memmap's page
      count will equal to 3.
      
        node0: start_pfn=0x100,    spanned_pfn=0x20fb00, present_pfn=0x7f8a3, => 0x000100-0x20fc00
        node1: start_pfn=0x80000,  spanned_pfn=0x80000,  present_pfn=0x80000, => 0x080000-0x100000
        node2: start_pfn=0x100000, spanned_pfn=0x80000,  present_pfn=0x80000, => 0x100000-0x180000
        node3: start_pfn=0x180000, spanned_pfn=0x80000,  present_pfn=0x80000, => 0x180000-0x200000
      
        free_all_bootmem_node()
      	register_page_bootmem_info_node()
      		register_page_bootmem_info_section()
      
      When hot remove memory, we can't free the memmap's page because
      page_count() is 2 after put_page_bootmem().
      
        sparse_remove_one_section()
      	free_section_usemap()
      		free_map_bootmem()
      			put_page_bootmem()
      
      [akpm@linux-foundation.org: add code comment]
      Signed-off-by: NXishi Qiu <qiuxishi@huawei.com>
      Signed-off-by: NJiang Liu <jiang.liu@huawei.com>
      Acked-by: NMel Gorman <mgorman@suse.de>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f14851af
    • L
      mm/page_alloc: fix the page address of higher page's buddy calculation · 0ba8f2d5
      Li Haifeng 提交于
      The heuristic method for buddy has been introduced since commit
      43506fad ("mm/page_alloc.c: simplify calculation of combined index
      of adjacent buddy lists").  But the page address of higher page's buddy
      was wrongly calculated, which will lead page_is_buddy to fail for ever.
      IOW, the heuristic method would be disabled with the wrong page address
      of higher page's buddy.
      
      Calculating the page address of higher page's buddy should be based
      higher_page with the offset between index of higher page and index of
      higher page's buddy.
      Signed-off-by: NHaifeng Li <omycle@gmail.com>
      Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
      Reviewed-by: NMichal Hocko <mhocko@suse.cz>
      Cc: KyongHo Cho <pullip.cho@samsung.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Minchan Kim <minchan.kim@gmail.com>
      Cc: Johannes Weiner <jweiner@redhat.com>
      Cc: <stable@vger.kernel.org>	[2.6.38+]
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      0ba8f2d5
    • K
      drivers/rtc/rtc-twl.c: ensure all interrupts are disabled during probe · 8dcebaa9
      Kevin Hilman 提交于
      On some platforms, bootloaders are known to do some interesting RTC
      programming.  Without going into the obscurities as to why this may be
      the case, suffice it to say the the driver should not make any
      assumptions about the state of the RTC when the driver loads.  In
      particular, the driver probe should be sure that all interrupts are
      disabled until otherwise programmed.
      
      This was discovered when finding bursty I2C traffic every second on
      Overo platforms.  This I2C overhead was keeping the SoC from hitting
      deep power states.  The cause was found to be the RTC firing every
      second on the I2C-connected TWL PMIC.
      
      Special thanks to Felipe Balbi for suggesting to look for a rogue driver
      as the source of the I2C traffic rather than the I2C driver itself.
      
      Special thanks to Steve Sakoman for helping track down the source of the
      continuous RTC interrups on the Overo boards.
      Signed-off-by: NKevin Hilman <khilman@ti.com>
      Cc: Felipe Balbi <balbi@ti.com>
      Tested-by: NSteve Sakoman <steve@sakoman.com>
      Cc: Alessandro Zummo <a.zummo@towertech.it>
      Tested-by: NShubhrajyoti Datta <omaplinuxkernel@gmail.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      8dcebaa9
    • A
      compiler.h: add __visible · 9a858dc7
      Andi Kleen 提交于
      gcc 4.6+ has support for a externally_visible attribute that prevents the
      optimizer from optimizing unused symbols away.  Add a __visible macro to
      use it with that compiler version or later.
      
      This is used (at least) by the "Link Time Optimization" patchset.
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      9a858dc7
    • A
      pid-namespace: limit value of ns_last_pid to (0, max_pid) · 579035dc
      Andrew Vagin 提交于
      The kernel doesn't check the pid for negative values, so if you try to
      write -2 to /proc/sys/kernel/ns_last_pid, you will get a kernel panic.
      
      The crash happens because the next pid is -1, and alloc_pidmap() will
      try to access to a nonexistent pidmap.
      
        map = &pid_ns->pidmap[pid/BITS_PER_PAGE];
      Signed-off-by: NAndrew Vagin <avagin@openvz.org>
      Acked-by: NCyrill Gorcunov <gorcunov@openvz.org>
      Acked-by: NOleg Nesterov <oleg@redhat.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      579035dc
    • C
      include/net/sock.h: squelch compiler warning in sk_rmem_schedule() · 35c448a8
      Chuck Lever 提交于
      This warning:
      
        In file included from linux/include/linux/tcp.h:227:0,
                         from linux/include/linux/ipv6.h:221,
                         from linux/include/net/ipv6.h:16,
                         from linux/include/linux/sunrpc/clnt.h:26,
                         from linux/net/sunrpc/stats.c:22:
        linux/include/net/sock.h: In function `sk_rmem_schedule':
        linux/nfs-2.6/include/net/sock.h:1339:13: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
      
      is seen with gcc (GCC) 4.6.3 20120306 (Red Hat 4.6.3-2) using the
      -Wextra option.
      
      Commit c76562b6 ("netvm: prevent a stream-specific deadlock")
      accidentally replaced the "size" parameter of sk_rmem_schedule() with an
      unsigned int.  This changes the semantics of the comparison in the
      return statement.
      
      In sk_wmem_schedule we have syntactically the same comparison, but
      "size" is a signed integer.  In addition, __sk_mem_schedule() takes a
      signed integer for its "size" parameter, so there is an implicit type
      conversion in sk_rmem_schedule() anyway.
      
      Revert the "size" parameter back to a signed integer so that the
      semantics of the expressions in both sk_[rw]mem_schedule() are exactly
      the same.
      Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
      Signed-off-by: NMel Gorman <mgorman@suse.de>
      Cc: David Miller <davem@davemloft.net>
      Cc: Joonsoo Kim <js1304@gmail.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      35c448a8
    • J
      slub: consider pfmemalloc_match() in get_partial_node() · 8ba00bb6
      Joonsoo Kim 提交于
      get_partial() is currently not checking pfmemalloc_match() meaning that
      it is possible for pfmemalloc pages to leak to non-pfmemalloc users.
      This is a problem in the following situation.  Assume that there is a
      request from normal allocation and there are no objects in the per-cpu
      cache and no node-partial slab.
      
      In this case, slab_alloc enters the slow path and new_slab_objects() is
      called which may return a PFMEMALLOC page.  As the current user is not
      allowed to access PFMEMALLOC page, deactivate_slab() is called
      ([5091b74a: mm: slub: optimise the SLUB fast path to avoid pfmemalloc
      checks]) and returns an object from PFMEMALLOC page.
      
      Next time, when we get another request from normal allocation,
      slab_alloc() enters the slow-path and calls new_slab_objects().  In
      new_slab_objects(), we call get_partial() and get a partial slab which
      was just deactivated but is a pfmemalloc page.  We extract one object
      from it and re-deactivate.
      
        "deactivate -> re-get in get_partial -> re-deactivate" occures repeatedly.
      
      As a result, access to PFMEMALLOC page is not properly restricted and it
      can cause a performance degradation due to frequent deactivation.
      deactivation frequently.
      
      This patch changes get_partial_node() to take pfmemalloc_match() into
      account and prevents the "deactivate -> re-get in get_partial()
      scenario.  Instead, new_slab() is called.
      Signed-off-by: NJoonsoo Kim <js1304@gmail.com>
      Acked-by: NDavid Rientjes <rientjes@google.com>
      Signed-off-by: NMel Gorman <mgorman@suse.de>
      Cc: David Miller <davem@davemloft.net>
      Cc: Chuck Lever <chuck.lever@oracle.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      8ba00bb6