1. 18 Jan, 2012 6 commits
    • E
      Audit: push audit success and retcode into arch ptrace.h · d7e7528b
      Eric Paris committed
      The audit system previously expected arches calling audit_syscall_exit to
      supply as arguments whether the syscall was a success and what the return code was.
      Audit also provides a helper AUDITSC_RESULT which was supposed to simplify things
      by converting from negative retcodes to an audit internal magic value stating
      success or failure.  This helper was wrong and could indicate that a valid
      pointer returned to userspace was a failed syscall.  The fix is to fix the
      layering foolishness.  We now pass audit_syscall_exit a struct pt_regs and it
      in turn calls back into arch code to collect the return value and to
      determine if the syscall was a success or failure.  We also define a generic
      is_syscall_success() macro which determines success/failure based on whether the
      value, compared as an unsigned long, is < -MAX_ERRNO (that is, outside the
      error range -MAX_ERRNO..-1).  This works for arches like x86 which do not use a
      separate mechanism to indicate syscall failure.
      
      We make both the is_syscall_success() and regs_return_value() static inlines
      instead of macros.  The reason is because the audit function must take a void*
      for the regs.  (uml calls theirs struct uml_pt_regs instead of just struct
      pt_regs so audit_syscall_exit can't take a struct pt_regs).  Since the audit
      function takes a void* we need to use static inlines to cast it back to the
      arch correct structure to dereference it.
      
      The other major change is that on some arches, like ia64, MIPS and ppc, we
      change regs_return_value() to give us the negative value on syscall failure.
      The only other user of this macro, kretprobe_example.c, won't notice and it
      makes the value signed consistently for the audit functions across all archs.
      
      In arch/sh/kernel/ptrace_64.c I see that we were using regs[9] in the old
      audit code as the return value.  But the ptrace_64.h code defined the macro
      regs_return_value() as regs[3].  I have no idea which one is correct, but this
      patch now uses the regs_return_value() function, so it now uses regs[3].
      
      For powerpc we previously used regs->result but now use the
      regs_return_value() function which uses regs->gprs[3].  regs->gprs[3] is
      always positive so regs_return_value(), much like on ia64, makes it negative
      before calling the audit code when appropriate.
      Signed-off-by: Eric Paris <eparis@redhat.com>
      Acked-by: H. Peter Anvin <hpa@zytor.com> [for x86 portion]
      Acked-by: Tony Luck <tony.luck@intel.com> [for ia64]
      Acked-by: Richard Weinberger <richard@nod.at> [for uml]
      Acked-by: David S. Miller <davem@davemloft.net> [for sparc]
      Acked-by: Ralf Baechle <ralf@linux-mips.org> [for mips]
      Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [for ppc]
      d7e7528b
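The success test described in this commit can be illustrated with a small userspace sketch. It is not the kernel code: `struct demo_regs`, `demo_regs_return_value()` and `demo_is_syscall_success()` are hypothetical names, and `MAX_ERRNO` is redefined locally on the assumption that it matches the kernel's value of 4095. The sketch shows the `void *` cast-back and the unsigned comparison that keeps a valid pointer-like return value from being mistaken for an error.

```c
#include <stdbool.h>

/* Assumption: same value as the kernel's MAX_ERRNO, redefined for userspace. */
#define MAX_ERRNO 4095

/* Hypothetical stand-in for an arch's register struct: only the return slot. */
struct demo_regs {
    long ret;
};

/* Takes void * (as the audit entry point does) and casts back to the real type. */
static inline long demo_regs_return_value(const void *regs)
{
    return ((const struct demo_regs *)regs)->ret;
}

/* A syscall failed only if the value, viewed as unsigned, lies in [-MAX_ERRNO, -1]. */
static inline bool demo_is_syscall_success(const void *regs)
{
    unsigned long v = (unsigned long)demo_regs_return_value(regs);

    return v < (unsigned long)-MAX_ERRNO;
}
```

Under these assumptions a return value of -2 counts as a failure, while 0 or -4096 (just outside the error range) counts as success.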
    • E
      seccomp: audit abnormal end to a process due to seccomp · 85e7bac3
      Eric Paris committed
      The audit system likes to collect information about processes that end
      abnormally (SIGSEGV) as this may be useful intrusion detection information.
      This patch adds audit support to collect information when seccomp forces a
      task to exit because of misbehavior in a similar way.
      Signed-off-by: Eric Paris <eparis@redhat.com>
      85e7bac3
    • E
      audit: check current inode and containing object when filtering on major and minor · 16c174bd
      Eric Paris committed
      The audit system has the ability to filter on the major and minor number of
      the device containing the inode being operated upon.  Let's say that
      /dev/sda1 has major,minor 8,1 and that we mount /dev/sda1 on /boot.  Now let's
      say we add a watch with a filter on 8,1.  If we proceed to open an inode
      inside /boot, such as /boot/vmlinuz, we will match the major,minor filter.
      
      Let's instead assume that one were to use a tool like debugfs and were to
      open /dev/sda1 directly and to modify its contents.  We might hope that
      this would also be logged, but it isn't.  The rules will check the
      major,minor of the device containing /dev/sda1.  In other words the rule
      would match on the major/minor of the tmpfs mounted at /dev.
      
      I believe these rules should trigger on either device.  The man page is
      devoid of useful information about the intended semantics.  It only seems
      logical that if you want to know everything that happened on a major,minor
      that would include things that happened to the device itself...
      Signed-off-by: Eric Paris <eparis@redhat.com>
      16c174bd
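The "either device" semantics argued for in this commit can be sketched as follows. This is an illustration only: `struct demo_audit_name` and the two match helpers are hypothetical and much simpler than the kernel's audit structures. The assumption is that each collected name records both the device that contains the inode and, for device nodes, the device the inode itself represents.

```c
#include <stdbool.h>

/* Hypothetical, simplified view of one collected name (not struct audit_names). */
struct demo_audit_name {
    unsigned int dev_major, dev_minor;   /* device containing the inode */
    unsigned int rdev_major, rdev_minor; /* device the inode itself represents, if any */
};

/* Match if either the containing device or the inode's own device has this major. */
static bool demo_match_devmajor(const struct demo_audit_name *n, unsigned int major)
{
    return n->dev_major == major || n->rdev_major == major;
}

/* Same rule for the minor number. */
static bool demo_match_devminor(const struct demo_audit_name *n, unsigned int minor)
{
    return n->dev_minor == minor || n->rdev_minor == minor;
}
```

With this rule, a filter on 8,1 matches both a file that lives on the filesystem mounted from /dev/sda1 and the /dev/sda1 node itself, even though the node is stored on a different filesystem.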
    • E
      audit: drop the meaningless and format breaking word 'user' · 3035c51e
      Eric Paris committed
      userspace audit messages look like so:
      
      type=USER msg=audit(1271170549.415:24710): user pid=14722 uid=0 auid=500 ses=1 subj=unconfined_u:unconfined_r:auditctl_t:s0-s0:c0.c1023 msg=''
      
      That third field just says 'user'.  That's useless and doesn't follow the
      key=value pair we are trying to enforce.  We already know it came from the
      user based on the record type.  Kill that word.  Die.
      Signed-off-by: Eric Paris <eparis@redhat.com>
      3035c51e
    • E
      audit: dynamically allocate audit_names when not enough space is in the names array · 5195d8e2
      Eric Paris committed
      This patch does 2 things.  First it reduces the number of audit_names
      allocated in every audit context from 20 to 5.  5 should be enough for all
      'normal' syscalls (rename being the worst).  Some syscalls can still touch
      more than 5 inodes, such as mount.  When an rpc filesystem is mounted it will
      create inodes and those can exceed 5.  To handle that problem this patch will
      dynamically allocate audit_names if it needs more than 5.  This should
      decrease the typical memory usage while still supporting all the possible
      kernel operations.
      Signed-off-by: Eric Paris <eparis@redhat.com>
      5195d8e2
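The allocation strategy described in this commit (a small preallocated array with a heap fallback) can be sketched in userspace C. All names here (`demo_ctx`, `demo_name`, `demo_alloc_name`, `demo_free_names`, `DEMO_PREALLOC`) are hypothetical, and `malloc()` stands in for the kernel allocator. It is a minimal sketch of the idea, not the patch itself.

```c
#include <stdlib.h>

#define DEMO_PREALLOC 5 /* assumption: mirrors the "5" mentioned in the commit */

struct demo_name {
    int ino;
    int from_heap;          /* 1 if this entry must be freed individually */
    struct demo_name *next; /* all entries are chained in collection order */
};

struct demo_ctx {
    struct demo_name prealloc[DEMO_PREALLOC];
    struct demo_name *head, *tail;
    int count;
};

/* Hand out a preallocated slot while any remain, otherwise fall back to the heap. */
static struct demo_name *demo_alloc_name(struct demo_ctx *ctx)
{
    struct demo_name *n;

    if (ctx->count < DEMO_PREALLOC) {
        n = &ctx->prealloc[ctx->count];
        n->from_heap = 0;
    } else {
        n = malloc(sizeof(*n));
        if (!n)
            return NULL;
        n->from_heap = 1;
    }
    n->ino = 0;
    n->next = NULL;
    if (ctx->tail)
        ctx->tail->next = n;
    else
        ctx->head = n;
    ctx->tail = n;
    ctx->count++;
    return n;
}

/* Release only the heap-backed entries and reset the context for reuse. */
static void demo_free_names(struct demo_ctx *ctx)
{
    struct demo_name *n = ctx->head;

    while (n) {
        struct demo_name *next = n->next;

        if (n->from_heap)
            free(n);
        n = next;
    }
    ctx->head = ctx->tail = NULL;
    ctx->count = 0;
}
```

The common case never touches the allocator, while an unusual syscall that collects more than five names only pays for the extra entries it actually needs.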
    • E
      audit: make filetype matching consistent with other filters · 5ef30ee5
      Eric Paris committed
      Every other filter that matches part of the inodes list collected by audit
      will match against any of the inodes on that list.  The filetype matching
      however had a strange way of doing things.  It allowed userspace to
      indicate whether it should match on the first or the second name collected by
      the kernel.  Name collection ordering seems like a kernel internal and
      making userspace rules get that right just seems like a bad idea.  As it
      turns out the userspace audit writers had no idea it was doing this and
      thus never overloaded the value field.  The kernel always checked the first
      name collected which for the tested rules was always correct.
      
      This patch just makes the filetype matching like the major, minor, inode,
      and LSM rules in that it will match against any of the names collected.  It
      also changes the rule validation to reject the old unused rule types.
      
      No one knew it was there.  No one used it.  Why keep around the extra code?
      Signed-off-by: Eric Paris <eparis@redhat.com>
      5ef30ee5
  2. 12 Jan, 2012 12 commits
    • L
      Merge branch 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 9fc5c3e3
      Linus Torvalds committed
      * 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/intel config: Fix the APB_TIMER selection
        x86/mrst: Add additional debug prints for pb_keys
        x86/intel config: Revamp configuration to allow for Moorestown and Medfield
        x86/intel/scu/ipc: Match the changes in the x86 configuration
        x86/apb: Fix configuration constraints
        x86: Fix INTEL_MID silly
        x86/Kconfig: Cyclone-timer depends on x86-summit
        x86: Reduce clock calibration time during slave cpu startup
        x86/config: Revamp configuration for MID devices
        x86/sfi: Kill the IRQ as id hack
      9fc5c3e3
    • L
      Merge branch 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 541048a1
      Linus Torvalds committed
      * 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86, reboot: Fix typo in nmi reboot path
        x86, NMI: Add to_cpumask() to silence compile warning
        x86, NMI: NMI selftest depends on the local apic
        x86: Add stack top margin for stack overflow checking
        x86, NMI: NMI-selftest should handle the UP case properly
        x86: Fix the 32-bit stackoverflow-debug build
        x86, NMI: Add knob to disable using NMI IPIs to stop cpus
        x86, NMI: Add NMI IPI selftest
        x86, reboot: Use NMI instead of REBOOT_VECTOR to stop cpus
        x86: Clean up the range of stack overflow checking
        x86: Panic on detection of stack overflow
        x86: Check stack overflow in detail
      541048a1
    • L
      Merge branch 'x86-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · bcede2f6
      Linus Torvalds committed
      * 'x86-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86, efi: Break up large initrd reads
        x86, efi: EFI boot stub support
        efi: Add EFI file I/O data types
        efi.h: Add boottime->locate_handle search types
        efi.h: Add graphics protocol guids
        efi.h: Add allocation types for boottime->allocate_pages()
        efi.h: Add efi_image_loaded_t
        efi.h: Add struct definition for boot time services
        x86: Don't use magic strings for EFI loader signature
        x86: Add missing bzImage fields to struct setup_header
      bcede2f6
    • L
      Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · d0b9706c
      Linus Torvalds committed
      * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/numa: Add constraints check for nid parameters
        mm, x86: Remove debug_pagealloc_enabled
        x86/mm: Initialize high mem before free_all_bootmem()
        arch/x86/kernel/e820.c: quiet sparse noise about plain integer as NULL pointer
        arch/x86/kernel/e820.c: Eliminate bubble sort from sanitize_e820_map()
        x86: Fix mmap random address range
        x86, mm: Unify zone_sizes_init()
        x86, mm: Prepare zone_sizes_init() for unification
        x86, mm: Use max_low_pfn for ZONE_NORMAL on 64-bit
        x86, mm: Wrap ZONE_DMA32 with CONFIG_ZONE_DMA32
        x86, mm: Use max_pfn instead of highend_pfn
        x86, mm: Move zone init from paging_init() on 64-bit
        x86, mm: Use MAX_DMA_PFN for ZONE_DMA on 32-bit
      d0b9706c
    • L
      Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq · 02d92950
      Linus Torvalds committed
      * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq: (23 commits)
        [CPUFREQ] EXYNOS: Removed useless headers and codes
        [CPUFREQ] EXYNOS: Make EXYNOS common cpufreq driver
        [CPUFREQ] powernow-k8: Update copyright, maintainer and documentation information
        [CPUFREQ] powernow-k8: Fix indexing issue
        [CPUFREQ] powernow-k8: Avoid Pstate MSR accesses on systems supporting CPB
        [CPUFREQ] update lpj only if frequency has changed
        [CPUFREQ] cpufreq:userspace: fix cpu_cur_freq updation
        [CPUFREQ] Remove wall variable from cpufreq_gov_dbs_init()
        [CPUFREQ] EXYNOS4210: cpufreq code is changed for stable working
        [CPUFREQ] EXYNOS4210: Update frequency table for cpu divider
        [CPUFREQ] EXYNOS4210: Remove code about bus on cpufreq
        [CPUFREQ] s3c64xx: Use pr_fmt() for consistent log messages
        cpufreq: OMAP: fixup for omap_device changes, include <linux/module.h>
        cpufreq: OMAP: fix freq_table leak
        cpufreq: OMAP: put clk if cpu_init failed
        cpufreq: OMAP: only supports OPP library
        cpufreq: OMAP: dont support !freq_table
        cpufreq: OMAP: deny initialization if no mpudev
        cpufreq: OMAP: move clk name decision to init
        cpufreq: OMAP: notify even with bad boot frequency
        ...
      02d92950
    • L
      Merge git://git.infradead.org/battery-2.6 · b24ca57e
      Linus Torvalds committed
      * git://git.infradead.org/battery-2.6: (68 commits)
        power_supply: Mark da9052 driver as broken
        power_supply: Drop usage of nowarn variant of sysfs_create_link()
        s3c_adc_battery: Average over more than one adc sample
        power_supply: Add DA9052 battery driver
        isp1704_charger: Fix missing check
        jz4740-battery: Fix signedness bug
        power_supply: Assume mains power by default
        sbs-battery: Fix devicetree match table
        ARM: rx51: Add bq27200 i2c board info
        sbs-battery: Change power supply name
        devicetree-bindings: Propagate bq20z75->sbs rename to dt bindings
        devicetree-bindings: Add vendor entry for Smart Battery Systems
        sbs-battery: Rename internals to new name
        bq20z75: Rename to sbs-battery
        wm97xx_battery: Use DEFINE_MUTEX() for work_lock
        max8997_charger: Remove duplicate module.h
        lp8727_charger: Some minor fixes for the header
        lp8727_charger: Add header file
        power_supply: Convert drivers/power/* to use module_platform_driver()
        power_supply: Add "unknown" in power supply type
        ...
      b24ca57e
    • L
      Merge branch 'slab/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux · 6296e5d3
      Linus Torvalds committed
      * 'slab/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux:
        slub: disallow changing cpu_partial from userspace for debug caches
        slub: add missed accounting
        slub: Extract get_freelist from __slab_alloc
        slub: Switch per cpu partial page support off for debugging
        slub: fix a possible memleak in __slab_alloc()
        slub: fix slub_max_order Documentation
        slub: add missed accounting
        slab: add taint flag outputting to debug paths.
        slub: add taint flag outputting to debug paths
        slab: introduce slab_max_order kernel parameter
        slab: rename slab_break_gfp_order to slab_max_order
      6296e5d3
    • L
      Merge tag 'md-3.3-fixes' of git://neil.brown.name/md · c086ae4e
      Linus Torvalds committed
      Two bugfixes for md.
      
      One is a recently introduced regression that affects an unusual
      configuration with a guaranteed BUG_ON.  Has been tagged for -stable.
      The other is minor missing functionality.
      
      * tag 'md-3.3-fixes' of git://neil.brown.name/md:
        md/raid1: perform bad-block tests for WriteMostly devices too.
        md: notify the 'degraded' sysfs attribute on failure.
      c086ae4e
    • L
      Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci · 7b67e751
      Linus Torvalds committed
      * 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci: (80 commits)
        x86/PCI: Expand the x86_msi_ops to have a restore MSIs.
        PCI: Increase resource array mask bit size in pcim_iomap_regions()
        PCI: DEVICE_COUNT_RESOURCE should be equal to PCI_NUM_RESOURCES
        PCI: pci_ids: add device ids for STA2X11 device (aka ConneXT)
        PNP: work around Dell 1536/1546 BIOS MMCONFIG bug that breaks USB
        x86/PCI: amd: factor out MMCONFIG discovery
        PCI: Enable ATS at the device state restore
        PCI: msi: fix imbalanced refcount of msi irq sysfs objects
        PCI: kconfig: English typo in pci/pcie/Kconfig
        PCI/PM/Runtime: make PCI traces quieter
        PCI: remove pci_create_bus()
        xtensa/PCI: convert to pci_scan_root_bus() for correct root bus resources
        x86/PCI: convert to pci_create_root_bus() and pci_scan_root_bus()
        x86/PCI: use pci_scan_bus() instead of pci_scan_bus_parented()
        x86/PCI: read Broadcom CNB20LE host bridge info before PCI scan
        sparc32, leon/PCI: convert to pci_scan_root_bus() for correct root bus resources
        sparc/PCI: convert to pci_create_root_bus()
        sh/PCI: convert to pci_scan_root_bus() for correct root bus resources
        powerpc/PCI: convert to pci_create_root_bus()
        powerpc/PCI: split PHB part out of pcibios_map_io_space()
        ...
      
      Fix up conflicts in drivers/pci/msi.c and include/linux/pci_regs.h due
      to the same patches being applied in other branches.
      7b67e751
    • B
      cpu: Register a generic CPU device on architectures that currently do not · 9f13a1fd
      Ben Hutchings committed
      frv, h8300, m68k, microblaze, openrisc, score, um and xtensa currently
      do not register a CPU device.  Add the config option GENERIC_CPU_DEVICES
      which causes a generic CPU device to be registered for each present CPU,
      and make all these architectures select it.
      
      Richard Weinberger <richard@nod.at> covered UML and suggested using
      per_cpu.
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      9f13a1fd
    • B
      cpu: Do not return errors from cpu_dev_init() which will be ignored · 024f7846
      Ben Hutchings committed
      cpu_dev_init() is only called from driver_init(), which does not check
      its return value.  Therefore make cpu_dev_init() return void.
      
      We must register the CPU subsystem, so panic if this fails.
      
      If sched_create_sysfs_power_savings_entries() fails, the damage is
      contained, so ignore this (as before).
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      024f7846
    • P
      Merge branch 'slab/urgent' into slab/for-linus · 5878cf43
      Pekka Enberg committed
      5878cf43
  3. 11 Jan, 2012 22 commits
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · 4f58cb90
      Linus Torvalds committed
      * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (54 commits)
        crypto: gf128mul - remove leftover "(EXPERIMENTAL)" in Kconfig
        crypto: serpent-sse2 - remove unneeded LRW/XTS #ifdefs
        crypto: serpent-sse2 - select LRW and XTS
        crypto: twofish-x86_64-3way - remove unneeded LRW/XTS #ifdefs
        crypto: twofish-x86_64-3way - select LRW and XTS
        crypto: xts - remove dependency on EXPERIMENTAL
        crypto: lrw - remove dependency on EXPERIMENTAL
        crypto: picoxcell - fix boolean and / or confusion
        crypto: caam - remove DECO access initialization code
        crypto: caam - fix polarity of "propagate error" logic
        crypto: caam - more desc.h cleanups
        crypto: caam - desc.h - convert spaces to tabs
        crypto: talitos - convert talitos_error to struct device
        crypto: talitos - remove NO_IRQ references
        crypto: talitos - fix bad kfree
        crypto: convert drivers/crypto/* to use module_platform_driver()
        char: hw_random: convert drivers/char/hw_random/* to use module_platform_driver()
        crypto: serpent-sse2 - should select CRYPTO_CRYPTD
        crypto: serpent - rename serpent.c to serpent_generic.c
        crypto: serpent - cleanup checkpatch errors and warnings
        ...
      4f58cb90
    • L
      Merge branch 'for-linus' of git://selinuxproject.org/~jmorris/linux-security · e7691a1c
      Linus Torvalds committed
      * 'for-linus' of git://selinuxproject.org/~jmorris/linux-security: (32 commits)
        ima: fix invalid memory reference
        ima: free duplicate measurement memory
        security: update security_file_mmap() docs
        selinux: Casting (void *) value returned by kmalloc is useless
        apparmor: fix module parameter handling
        Security: tomoyo: add .gitignore file
        tomoyo: add missing rcu_dereference()
        apparmor: add missing rcu_dereference()
        evm: prevent racing during tfm allocation
        evm: key must be set once during initialization
        mpi/mpi-mpow: NULL dereference on allocation failure
        digsig: build dependency fix
        KEYS: Give key types their own lockdep class for key->sem
        TPM: fix transmit_cmd error logic
        TPM: NSC and TIS drivers X86 dependency fix
        TPM: Export wait_for_stat for other vendor specific drivers
        TPM: Use vendor specific function for status probe
        tpm_tis: add delay after aborting command
        tpm_tis: Check return code from getting timeouts/durations
        tpm: Introduce function to poll for result of self test
        ...
      
      Fix up trivial conflict in lib/Makefile due to addition of CONFIG_MPI
      and DIGSIG next to CONFIG_DQL addition.
      e7691a1c
    • L
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 5cd9599b
      Linus Torvalds committed
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        autofs4: deal with autofs4_write/autofs4_write races
        autofs4: catatonic_mode vs. notify_daemon race
        autofs4: autofs4_wait() vs. autofs4_catatonic_mode() race
        hfsplus: creation of hidden dir on mount can fail
        block_dev: Suppress bdev_cache_init() kmemleak warninig
        fix shrink_dcache_parent() livelock
        coda: switch coda_cnode_make() to sane API as well, clean coda_lookup()
        coda: deal correctly with allocation failure from coda_cnode_makectl()
        securityfs: fix object creation races
      5cd9599b
    • A
      autofs4: deal with autofs4_write/autofs4_write races · d668dc56
      Al Viro committed
      Just serialize the actual writing of packets into pipe on
      a new mutex, independent from everything else in the locking
      hierarchy.  As soon as something has started feeding a piece
      of packet into the pipe to daemon, we *want* everything else
      about to try the same to wait until we are done.
      Acked-by: Ian Kent <raven@themaw.net>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      d668dc56
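The serialization idea in this commit (one dedicated mutex held for the whole duration of a packet write) can be sketched in userspace with POSIX threads. This is an analogy under stated assumptions, not the autofs4 code: `demo_pipe_mutex` and `demo_write_packet()` are hypothetical names, and `write()` on a file descriptor stands in for the kernel-side pipe write.

```c
#include <errno.h>
#include <pthread.h>
#include <unistd.h>

/* Dedicated lock used for nothing except packet writes, so it nests under no other lock. */
static pthread_mutex_t demo_pipe_mutex = PTHREAD_MUTEX_INITIALIZER;

/*
 * Write the whole packet while holding the mutex, so that pieces of two
 * packets from concurrent writers can never interleave in the pipe.
 * Returns 0 on success or a negative errno value on failure.
 */
static int demo_write_packet(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    int ret = 0;

    pthread_mutex_lock(&demo_pipe_mutex);
    while (len > 0) {
        ssize_t n = write(fd, p, len);

        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted before writing anything: retry */
            ret = -errno;
            break;
        }
        if (n == 0) {
            ret = -EIO;     /* no progress: give up rather than spin */
            break;
        }
        p += n;
        len -= (size_t)n;
    }
    pthread_mutex_unlock(&demo_pipe_mutex);
    return ret;
}
```

Holding the lock across the retry loop is the point of the design: a short write leaves a partial packet in the pipe, and any other writer has to wait until the remainder has been fed in.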
    • A
      autofs4: catatonic_mode vs. notify_daemon race · 87533332
      Al Viro committed
      we need to hold ->wq_mutex while we are forming the packet to send,
      lest we have autofs4_catatonic_mode() setting wq->name.name to NULL
      just as autofs4_notify_daemon() decides to memcpy() from it...
      
      We do have check for catatonic mode immediately after that (under
      ->wq_mutex, as it ought to be) and packet won't be actually sent,
      but it'll be too late for us if we oops on that memcpy() from NULL...
      
      Fix is obvious - just extend the area covered by ->wq_mutex over
      that switch and check whether it's catatonic *before* doing anything
      else.
      Acked-by: Ian Kent <raven@themaw.net>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      87533332
    • A
      autofs4: autofs4_wait() vs. autofs4_catatonic_mode() race · 4041bcdc
      Al Viro committed
      We need to recheck ->catatonic after autofs4_wait() got ->wq_mutex
      for good, or we might end up with wq inserted into queue after
      autofs4_catatonic_mode() had done its thing.  It will stick there
      forever, since there won't be anything to clear its ->name.name.
      
      A bit of a complication: validate_request() drops and regains ->wq_mutex.
      It actually ends up the most convenient place to stick the check into...
      Acked-by: Ian Kent <raven@themaw.net>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      4041bcdc
    • L
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost · e343a895
      Linus Torvalds committed
      lib: use generic pci_iomap on all architectures
      
      Many architectures don't want to pull in iomap.c,
      so they ended up duplicating pci_iomap from that file.
      That function isn't trivial, and we are going to modify it
      https://lkml.org/lkml/2011/11/14/183
      so the duplication hurts.
      
      This reduces the scope of the problem significantly,
      by moving pci_iomap to a separate file and
      referencing that from all architectures.
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
        alpha: drop pci_iomap/pci_iounmap from pci-noop.c
        mn10300: switch to GENERIC_PCI_IOMAP
        mn10300: add missing __iomap markers
        frv: switch to GENERIC_PCI_IOMAP
        tile: switch to GENERIC_PCI_IOMAP
        tile: don't panic on iomap
        sparc: switch to GENERIC_PCI_IOMAP
        sh: switch to GENERIC_PCI_IOMAP
        powerpc: switch to GENERIC_PCI_IOMAP
        parisc: switch to GENERIC_PCI_IOMAP
        mips: switch to GENERIC_PCI_IOMAP
        microblaze: switch to GENERIC_PCI_IOMAP
        arm: switch to GENERIC_PCI_IOMAP
        alpha: switch to GENERIC_PCI_IOMAP
        lib: add GENERIC_PCI_IOMAP
        lib: move GENERIC_IOMAP to lib/Kconfig
      
      Fix up trivial conflicts due to changes nearby in arch/{m68k,score}/Kconfig
      e343a895
    • L
      Merge tag 'for-linux-3.3-merge-window' of git://linux-c6x.org/git/projects/linux-c6x-upstreaming · 06792c4d
      Linus Torvalds committed
      * tag 'for-linux-3.3-merge-window' of git://linux-c6x.org/git/projects/linux-c6x-upstreaming: (29 commits)
        C6X: replace tick_nohz_stop/restart_sched_tick calls
        C6X: add register_cpu call
        C6X: deal with memblock API changes
        C6X: fix timer64 initialization
        C6X: fix layout of EMIFA registers
        C6X: MAINTAINERS
        C6X: DSCR - Device State Configuration Registers
        C6X: EMIF - External Memory Interface
        C6X: general SoC support
        C6X: library code
        C6X: headers
        C6X: ptrace support
        C6X: loadable module support
        C6X: cache control
        C6X: clocks
        C6X: build infrastructure
        C6X: syscalls
        C6X: interrupt handling
        C6X: time management
        C6X: signal management
        ...
      06792c4d
    • L
      Merge branch 'next' of git://git.monstr.eu/linux-2.6-microblaze · 4690dfa8
      Linus Torvalds committed
      * 'next' of git://git.monstr.eu/linux-2.6-microblaze:
        microblaze: Wire-up new system calls
        microblaze: Remove NO_IRQ from architecture
        input: xilinx_ps2: Don't use NO_IRQ
        block: xsysace: Don't use NO_IRQ
        microblaze: Trivial asm fix
        microblaze: Fix debug message in module
        microblaze: Remove eprintk macro
        microblaze: Send CR before LF for early console
        microblaze: Change NO_IRQ to 0
        microblaze: Use irq_of_parse_and_map for timer
        microblaze: intc: Change variable name
        microblaze: Use of_find_compatible_node for timer and intc
        microblaze: Add __cmpdi2
        microblaze: Synchronize __pa __va macros
      4690dfa8
    • L
      Merge branch 'unicore32' of git://github.com/gxt/linux · c2e08e7c
      Linus Torvalds committed
      * 'unicore32' of git://github.com/gxt/linux:
        rtc-puv3: solve section mismatch in rtc-puv3.c
        rtc-puv3: using module_platform_driver()
        i2c-puv3: using module_platform_driver()
        rtc-puv3: irq: remove IRQF_DISABLED
        unicore32: Remove IRQF_DISABLED
        unicore32: Use set_current_blocked()
        unicore32: add ioremap_nocache definition
        unicore32: delete specified xlate_dev_mem_ptr
        of: add include asm/setup.h in drivers/of/fdt.c
        unicore32: standardize /proc/iomem "Kernel code" name
      c2e08e7c
    • L
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lliubbo/blackfin · 28190145
      Linus Torvalds committed
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lliubbo/blackfin:
        blackfin: bf561: add adv7183 capture support
        blackfin: bf537: add capture support
        blackfin: bf548: add capture support
        blackfin: time-ts: rm unused func broadcast_timer_setup()
        blackfin: i2c-lcd: change default clock rate
        blackfin: mac: dsa: add vlan mask in board file
        blackfin: bf537: change num_chipselect for spi-sport
        blackfin: serial: bfin-uart: remove unused field
        bf54x: get mem size: missing break in switch
        blackfin: smp: fix msg queue overflow issue
        blackfin: config: update macro SPI_BFIN in board file
        blackfin: config: update def config for all boards
        blackfin: smp: cleanup smp code
        blackfin: smp: add suspend and wakeup irq flags
        blackfin: bf533-stamp: add missed patches for new asoc driver
        blackfin: bf533-stamp: fix ad1836 name
      28190145
    • L
      Merge branch 'writeback-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux · 001a541e
      Linus Torvalds committed
      * 'writeback-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux:
        writeback: move MIN_WRITEBACK_PAGES to fs-writeback.c
        writeback: balanced_rate cannot exceed write bandwidth
        writeback: do strict bdi dirty_exceeded
        writeback: avoid tiny dirty poll intervals
        writeback: max, min and target dirty pause time
        writeback: dirty ratelimit - think time compensation
        btrfs: fix dirtied pages accounting on sub-page writes
        writeback: fix dirtied pages accounting on redirty
        writeback: fix dirtied pages accounting on sub-page writes
        writeback: charge leaked page dirties to active tasks
        writeback: Include all dirty inodes in background writeback
      001a541e
    • L
      Merge branch 'akpm' (aka "Andrew's patch-bomb") · 40ba5879
      Linus Torvalds committed
      Andrew elucidates:
       - First installment of MM.  We have a HUGE number of MM patches this
         time.  It's crazy.
       - MAINTAINERS updates
       - backlight updates
       - leds
       - checkpatch updates
       - misc ELF stuff
       - rtc updates
       - reiserfs
       - procfs
       - some misc other bits
      
      * akpm: (124 commits)
        user namespace: make signal.c respect user namespaces
        workqueue: make alloc_workqueue() take printf fmt and args for name
        procfs: add hidepid= and gid= mount options
        procfs: parse mount options
        procfs: introduce the /proc/<pid>/map_files/ directory
        procfs: make proc_get_link to use dentry instead of inode
        signal: add block_sigmask() for adding sigmask to current->blocked
        sparc: make SA_NOMASK a synonym of SA_NODEFER
        reiserfs: don't lock root inode searching
        reiserfs: don't lock journal_init()
        reiserfs: delay reiserfs lock until journal initialization
        reiserfs: delete comments referring to the BKL
        drivers/rtc/interface.c: fix alarm rollover when day or month is out-of-range
        drivers/rtc/rtc-twl.c: add DT support for RTC inside twl4030/twl6030
        drivers/rtc/: remove redundant spi driver bus initialization
        drivers/rtc/rtc-jz4740.c: make jz4740_rtc_driver static
        drivers/rtc/rtc-mc13xxx.c: make mc13xxx_rtc_idtable static
        rtc: convert drivers/rtc/* to use module_platform_driver()
        drivers/rtc/rtc-wm831x.c: convert to devm_kzalloc()
        drivers/rtc/rtc-wm831x.c: remove unused period IRQ handler
        ...
      40ba5879
    • S
      user namespace: make signal.c respect user namespaces · 6b550f94
      Serge E. Hallyn committed
      ipc/mqueue.c: for __SI_MESGQ, convert the uid being sent to recipient's
      user namespace. (new, thanks Oleg)
      
      __send_signal: convert current's uid to the recipient's user namespace
      for any siginfo which is not SI_FROMKERNEL (patch from Oleg, thanks
      again :)
      
      do_notify_parent and do_notify_parent_cldstop: map task's uid to parent's
      user namespace
      
      ptrace_signal maps parent's uid into current's user namespace before
      including in signal to current.  IIUC Oleg has argued that this shouldn't
      matter as the debugger will play with it, but it seems like not converting
      the value currently being set is misleading.
      
      Changelog:
      Sep 20: Inspired by Oleg's suggestion, define map_cred_ns() helper to
      	simplify callers and help make clear what we are translating
      	(which uid into which namespace).  Passing the target task would
      	make callers even easier to read, but we pass in user_ns because
      	current_user_ns() != task_cred_xxx(current, user_ns).
      Sep 20: As recommended by Oleg, also put task_pid_vnr() under rcu_read_lock
      	in ptrace_signal().
      Sep 23: In send_signal(), detect when (user) signal is coming from an
      	ancestor or unrelated user namespace.  Pass that on to __send_signal,
      	which sets si_uid to 0 or overflowuid if needed.
      Oct 12: Base on Oleg's fixup_uid() patch.  On top of that, handle all
      	SI_FROMKERNEL cases at callers, because we can't assume sender is
      	current in those cases.
      Nov 10: (mhelsley) rename fixup_uid to more meaningful userns_fixup_signal_uid
      Nov 10: (akpm) make the !CONFIG_USER_NS case clearer
      Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Matt Helsley <matthltc@us.ibm.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      From: Serge Hallyn <serge.hallyn@canonical.com>
      Subject: __send_signal: pass q->info, not info, to userns_fixup_signal_uid (v2)
      
      Eric Biederman pointed out that passing info is a bug and could lead to a
      NULL pointer deref to boot.
      
      A collection of signal, securebits, filecaps, cap_bounds, and a few other
      ltp tests passed with this kernel.
      
      Changelog:
          Nov 18: previous patch missed a leading '&'
      Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      From: Dan Carpenter <dan.carpenter@oracle.com>
      Subject: ipc/mqueue: lock() => unlock() typo
      
      There was a double lock typo introduced in b085f4bd6b21 "user namespace:
      make signal.c respect user namespaces"
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Matt Helsley <matthltc@us.ibm.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Acked-by: Serge Hallyn <serge@hallyn.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • T
      workqueue: make alloc_workqueue() take printf fmt and args for name · b196be89
      Tejun Heo committed
      alloc_workqueue() currently expects the passed in @name pointer to remain
      accessible.  This is inconvenient and a bit silly given that the whole wq
      is being dynamically allocated.  This patch updates alloc_workqueue() and
      friends to take a printf format string and matching varargs at the end
      instead of an opaque string.  The name is formatted and allocated together
      with the wq.
      
      alloc_ordered_workqueue() is converted to a macro to unify varargs
      handling with alloc_workqueue(), and, while at it, a comment is added to
      alloc_workqueue().
      
      No current in-kernel user passes a constant name containing '%', so this
      change shouldn't cause any problems.
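
The idea of formatting the name into the same allocation as the object can be illustrated in plain userspace C. This is only a sketch under stated assumptions: `struct wq_model` and `alloc_wq_model()` are hypothetical names invented for illustration, not the kernel's workqueue code, and `__attribute__((format))` is a GCC/Clang extension standing in for the kernel's `__printf` annotation.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a workqueue: the name is stored inline. */
struct wq_model {
	int max_active;
	char name[];	/* flexible array member, same allocation as the struct */
};

/* Let the compiler check the format string against the varargs. */
__attribute__((format(printf, 1, 3)))
static struct wq_model *alloc_wq_model(const char *fmt, int max_active, ...)
{
	va_list args, args_copy;
	struct wq_model *wq;
	int len;

	assert(fmt != NULL);

	va_start(args, max_active);

	/* First pass: measure the formatted name without writing it. */
	va_copy(args_copy, args);
	len = vsnprintf(NULL, 0, fmt, args_copy);
	va_end(args_copy);
	if (len < 0) {
		va_end(args);
		return NULL;
	}

	/* One allocation holds both the object and its formatted name. */
	wq = malloc(sizeof(*wq) + (size_t)len + 1);
	if (wq) {
		wq->max_active = max_active;
		vsnprintf(wq->name, (size_t)len + 1, fmt, args);
	}
	va_end(args);
	return wq;
}
```

Because the name is copied into the object, a caller could build it from a temporary buffer or from numeric arguments, for example `alloc_wq_model("disk-%d", 4, 7)`, and release the result with a single `free()`.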
      
      [akpm@linux-foundation.org: use __printf]
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Suggested-by: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • V
      procfs: add hidepid= and gid= mount options · 0499680a
      Vasiliy Kulikov committed
      Add support for mount options to restrict access to /proc/PID/
      directories.  The default backward-compatible "relaxed" behaviour is left
      untouched.
      
      The first mount option is called "hidepid" and its value defines how much
      info about processes we want to be available for non-owners:
      
      hidepid=0 (default) means the old behavior - anybody may read all
      world-readable /proc/PID/* files.
      
      hidepid=1 means users may not access any /proc/<pid>/ directories except
      their own.  Sensitive files like cmdline, sched*, status are now protected
      against other users.  Because permission checking is done in
      proc_pid_permission() and files' permissions are left untouched, programs
      expecting specific files' modes are not confused.
      
      hidepid=2 means hidepid=1 plus all /proc/PID/ will be invisible to other
      users.  It doesn't mean that it hides whether a process exists (it can be
      learned by other means, e.g. by kill -0 $PID), but it hides the process' euid
      and egid.  It complicates an intruder's task of gathering info about running
      processes: whether some daemon runs with elevated privileges, whether
      another user runs some sensitive program, whether other users run any
      program at all, etc.
      
      gid=XXX defines a group that will be able to gather all processes' info
      (as in hidepid=0 mode).  This group should be used instead of putting
      nonroot user in sudoers file or something.  However, untrusted users (like
      daemons, etc.) which are not supposed to monitor the tasks in the whole
      system should not be added to the group.
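
The rules above can be summarised as a small decision function. The following is a userspace model only, not the kernel implementation: the function names, the boolean `is_owner` simplification of the ownership check, and the mapping of hidepid=1 to EPERM and hidepid=2 to ENOENT are all assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Model of opening /proc/PID/ for some other process.
 * Returns 0 when access is allowed, or a negative errno value.
 * Assumption: hidepid=1 reports -EPERM, hidepid=2 reports -ENOENT.
 */
static int proc_pid_dir_access(int hidepid, bool is_owner, bool in_gid_group)
{
	assert(hidepid >= 0 && hidepid <= 2);

	/* hidepid=0 keeps the old relaxed behaviour for everybody. */
	if (hidepid == 0)
		return 0;
	/* Owners and members of the gid= group are never restricted. */
	if (is_owner || in_gid_group)
		return 0;
	/* hidepid=2 pretends the directory does not exist at all. */
	return hidepid == 2 ? -ENOENT : -EPERM;
}

/* Model of whether /proc/PID/ shows up when listing /proc. */
static bool proc_pid_dir_listed(int hidepid, bool is_owner, bool in_gid_group)
{
	return hidepid < 2 || is_owner || in_gid_group;
}
```

In this model a non-owner outside the gid= group still sees the directory entry under hidepid=1 but cannot enter it, while under hidepid=2 the entry is neither listed nor reachable.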
      
      hidepid=1 or higher is designed to restrict access to procfs files, which
      might reveal some sensitive private information like precise keystroke
      timings:
      
      http://www.openwall.com/lists/oss-security/2011/11/05/3
      
      hidepid=1/2 doesn't break userspace monitoring tools.  ps, top, pgrep, and
      conky gracefully handle EPERM/ENOENT and behave as if the current user is
      the only user running processes.  pstree shows the process subtree which
      contains the "pstree" process.
      
      Note: the patch doesn't deal with setuid/setgid issues of keeping
      preopened descriptors of procfs files (like
      https://lkml.org/lkml/2011/2/7/368).  We rely on the fact that the leaked
      information, like the scheduling counters of setuid apps, doesn't threaten
      anybody's privacy - only the user who started the setuid program may read
      the counters.
      Signed-off-by: Vasiliy Kulikov <segoon@openwall.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Randy Dunlap <rdunlap@xenotime.net>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Greg KH <greg@kroah.com>
      Cc: Theodore Tso <tytso@MIT.EDU>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Cc: James Morris <jmorris@namei.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • V
      procfs: parse mount options · 97412950
      Vasiliy Kulikov committed
      Add support for procfs mount options.  Actual mount options are coming in
      the next patches.
      Signed-off-by: Vasiliy Kulikov <segoon@openwall.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Randy Dunlap <rdunlap@xenotime.net>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Greg KH <greg@kroah.com>
      Cc: Theodore Tso <tytso@MIT.EDU>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Cc: James Morris <jmorris@namei.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • P
      procfs: introduce the /proc/<pid>/map_files/ directory · 640708a2
      Pavel Emelyanov committed
      This one behaves similarly to the /proc/<pid>/fd/ one - it contains one
      symlink for each file-backed mapping; the name of a symlink is
      "vma->vm_start-vma->vm_end" and the target is the file.  Opening a symlink
      results in a file that points to exactly the same inode as the vma's one.
      
      For example, the ls -l output of some arbitrary /proc/<pid>/map_files/:
      
       | lr-x------ 1 root root 64 Aug 26 06:40 7f8f80403000-7f8f80404000 -> /lib64/libc-2.5.so
       | lr-x------ 1 root root 64 Aug 26 06:40 7f8f8061e000-7f8f80620000 -> /lib64/libselinux.so.1
       | lr-x------ 1 root root 64 Aug 26 06:40 7f8f80826000-7f8f80827000 -> /lib64/libacl.so.1.1.0
       | lr-x------ 1 root root 64 Aug 26 06:40 7f8f80a2f000-7f8f80a30000 -> /lib64/librt-2.5.so
       | lr-x------ 1 root root 64 Aug 26 06:40 7f8f80a30000-7f8f80a4c000 -> /lib64/ld-2.5.so
      
      This *helps* the checkpointing process in three ways:
      
      1. When dumping a task's mappings we know the exact file that is mapped
         by a particular region.  We do this by opening the
         /proc/$pid/map_files/$address symlink the way we do with file
         descriptors.
      
      2. This also helps in determining which anonymous shared mappings are
         shared with each other by comparing their inodes.
      
      3. When restoring a set of processes, in case two of them have a shared
         mapping, we map the memory by the 1st one and then open its
         /proc/$pid/map_files/$address file and map it by the 2nd task.
      
      Using /proc/$pid/maps for this is quite inconvenient since it requires
      repeatedly re-reading and reparsing this text file, which slows down the
      restore procedure significantly.  Also, as pointed out in (3), it is much
      easier to use a top level shared mapping in children as
      /proc/$pid/map_files/$address when needed.
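
Since each symlink name encodes the mapping range as two hexadecimal addresses, a tool walking the directory only needs to split the entry name. A minimal sketch, assuming the "start-end" naming shown in the listing above; `parse_map_files_name()` is a hypothetical helper, not part of the patch.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Parse a map_files entry name such as "7f8f80403000-7f8f80404000".
 * Returns 0 and fills *start and *end on success, -1 on malformed input.
 */
static int parse_map_files_name(const char *name,
				unsigned long long *start,
				unsigned long long *end)
{
	int consumed = 0;

	assert(name != NULL && start != NULL && end != NULL);

	/* %n records how many characters were consumed by the two fields. */
	if (sscanf(name, "%llx-%llx%n", start, end, &consumed) != 2)
		return -1;
	/* Reject trailing characters and empty or inverted ranges. */
	if (name[consumed] != '\0' || *start >= *end)
		return -1;
	return 0;
}
```

A caller would typically feed it the `d_name` of each entry returned by `readdir()` on the map_files directory and use the resulting range to match the link against a known mapping.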
      
      [akpm@linux-foundation.org: coding-style fixes]
      [gorcunov@openvz.org: make map_files depend on CHECKPOINT_RESTORE]
      Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
      Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
      Reviewed-by: Vasiliy Kulikov <segoon@openwall.com>
      Reviewed-by: "Kirill A. Shutemov" <kirill@shutemov.name>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Al Viro <viro@ZenIV.linux.org.uk>
      Cc: Pavel Machek <pavel@ucw.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • C
      procfs: make proc_get_link to use dentry instead of inode · 7773fbc5
      Cyrill Gorcunov committed
      Prepare the ground for the next "map_files" patch, which needs the name of
      a link file to analyse.
      Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Vasiliy Kulikov <segoon@openwall.com>
      Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Al Viro <viro@ZenIV.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • M
      signal: add block_sigmask() for adding sigmask to current->blocked · 5e6292c0
      Matt Fleming committed
      Abstract the code sequence for adding a signal handler's sa_mask to
      current->blocked because the sequence is identical for all architectures.
      Furthermore, in the past some architectures actually got this code wrong,
      so introduce a wrapper that all architectures can use.
      Signed-off-by: Matt Fleming <matt.fleming@intel.com>
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • M
      sparc: make SA_NOMASK a synonym of SA_NODEFER · f350b177
      Matt Fleming committed
      Unlike other architectures, sparc currently has no SA_NODEFER definition
      but only the older SA_NOMASK.  Since SA_NOMASK is the historical name for
      SA_NODEFER, add SA_NODEFER and copy what other architectures do by making
      SA_NOMASK a synonym for SA_NODEFER.
      Signed-off-by: Matt Fleming <matt.fleming@intel.com>
      Acked-by: Oleg Nesterov <oleg@redhat.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • F
      reiserfs: don't lock root inode searching · 9b467e6e
      Frederic Weisbecker committed
      Nothing requires that we lock the filesystem until the root inode is
      provided.
      
      Also iget5_locked() triggers a warning because we are holding the
      filesystem lock while allocating the inode, which results in a lockdep
      suspicion that we have a lock inversion against the reclaim path:
      
      [ 1986.896979] =================================
      [ 1986.896990] [ INFO: inconsistent lock state ]
      [ 1986.896997] 3.1.1-main #8
      [ 1986.897001] ---------------------------------
      [ 1986.897007] inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.
      [ 1986.897016] kswapd0/16 [HC0[0]:SC0[0]:HE1:SE1] takes:
      [ 1986.897023]  (&REISERFS_SB(s)->lock){+.+.?.}, at: [<c01f8bd4>] reiserfs_write_lock+0x20/0x2a
      [ 1986.897044] {RECLAIM_FS-ON-W} state was registered at:
      [ 1986.897050]   [<c014a5b9>] mark_held_locks+0xae/0xd0
      [ 1986.897060]   [<c014aab3>] lockdep_trace_alloc+0x7d/0x91
      [ 1986.897068]   [<c0190ee0>] kmem_cache_alloc+0x1a/0x93
      [ 1986.897078]   [<c01e7728>] reiserfs_alloc_inode+0x13/0x3d
      [ 1986.897088]   [<c01a5b06>] alloc_inode+0x14/0x5f
      [ 1986.897097]   [<c01a5cb9>] iget5_locked+0x62/0x13a
      [ 1986.897106]   [<c01e99e0>] reiserfs_fill_super+0x410/0x8b9
      [ 1986.897114]   [<c01953da>] mount_bdev+0x10b/0x159
      [ 1986.897123]   [<c01e764d>] get_super_block+0x10/0x12
      [ 1986.897131]   [<c0195b38>] mount_fs+0x59/0x12d
      [ 1986.897138]   [<c01a80d1>] vfs_kern_mount+0x45/0x7a
      [ 1986.897147]   [<c01a83e3>] do_kern_mount+0x2f/0xb0
      [ 1986.897155]   [<c01a987a>] do_mount+0x5c2/0x612
      [ 1986.897163]   [<c01a9a72>] sys_mount+0x61/0x8f
      [ 1986.897170]   [<c044060c>] sysenter_do_call+0x12/0x32
      [ 1986.897181] irq event stamp: 7509691
      [ 1986.897186] hardirqs last  enabled at (7509691): [<c0190f34>] kmem_cache_alloc+0x6e/0x93
      [ 1986.897197] hardirqs last disabled at (7509690): [<c0190eea>] kmem_cache_alloc+0x24/0x93
      [ 1986.897209] softirqs last  enabled at (7508896): [<c01294bd>] __do_softirq+0xee/0xfd
      [ 1986.897222] softirqs last disabled at (7508859): [<c01030ed>] do_softirq+0x50/0x9d
      [ 1986.897234]
      [ 1986.897235] other info that might help us debug this:
      [ 1986.897242]  Possible unsafe locking scenario:
      [ 1986.897244]
      [ 1986.897250]        CPU0
      [ 1986.897254]        ----
      [ 1986.897257]   lock(&REISERFS_SB(s)->lock);
      [ 1986.897265] <Interrupt>
      [ 1986.897269]     lock(&REISERFS_SB(s)->lock);
      [ 1986.897276]
      [ 1986.897277]  *** DEADLOCK ***
      [ 1986.897278]
      [ 1986.897286] no locks held by kswapd0/16.
      [ 1986.897291]
      [ 1986.897292] stack backtrace:
      [ 1986.897299] Pid: 16, comm: kswapd0 Not tainted 3.1.1-main #8
      [ 1986.897306] Call Trace:
      [ 1986.897314]  [<c0439e76>] ? printk+0xf/0x11
      [ 1986.897324]  [<c01482d1>] print_usage_bug+0x20e/0x21a
      [ 1986.897332]  [<c01479b8>] ? print_irq_inversion_bug+0x172/0x172
      [ 1986.897341]  [<c014855c>] mark_lock+0x27f/0x483
      [ 1986.897349]  [<c0148d88>] __lock_acquire+0x628/0x1472
      [ 1986.897358]  [<c0149fae>] lock_acquire+0x47/0x5e
      [ 1986.897366]  [<c01f8bd4>] ? reiserfs_write_lock+0x20/0x2a
      [ 1986.897384]  [<c01f8bd4>] ? reiserfs_write_lock+0x20/0x2a
      [ 1986.897397]  [<c043b5ef>] mutex_lock_nested+0x35/0x26f
      [ 1986.897409]  [<c01f8bd4>] ? reiserfs_write_lock+0x20/0x2a
      [ 1986.897421]  [<c01f8bd4>] reiserfs_write_lock+0x20/0x2a
      [ 1986.897433]  [<c01e2edd>] map_block_for_writepage+0xc9/0x590
      [ 1986.897448]  [<c01b1706>] ? create_empty_buffers+0x33/0x8f
      [ 1986.897461]  [<c0121124>] ? get_parent_ip+0xb/0x31
      [ 1986.897472]  [<c043ef7f>] ? sub_preempt_count+0x81/0x8e
      [ 1986.897485]  [<c043cae0>] ? _raw_spin_unlock+0x27/0x3d
      [ 1986.897496]  [<c0121124>] ? get_parent_ip+0xb/0x31
      [ 1986.897508]  [<c01e355d>] reiserfs_writepage+0x1b9/0x3e7
      [ 1986.897521]  [<c0173b40>] ? clear_page_dirty_for_io+0xcb/0xde
      [ 1986.897533]  [<c014a6e3>] ? trace_hardirqs_on_caller+0x108/0x138
      [ 1986.897546]  [<c014a71e>] ? trace_hardirqs_on+0xb/0xd
      [ 1986.897559]  [<c0177b38>] shrink_page_list+0x34f/0x5e2
      [ 1986.897572]  [<c01780a7>] shrink_inactive_list+0x172/0x22c
      [ 1986.897585]  [<c0178464>] shrink_zone+0x303/0x3b1
      [ 1986.897597]  [<c043cae0>] ? _raw_spin_unlock+0x27/0x3d
      [ 1986.897611]  [<c01788c9>] kswapd+0x3b7/0x5f2
      
      The deadlock shouldn't happen: since we are doing that allocation in the
      mount path, the filesystem is not yet available for any reclaim.  Still, the
      warning is annoying.
      
      To solve this, acquire the lock later, only where we need it: right before
      calling reiserfs_read_locked_inode(), which wants the lock to walk the tree.
      Reported-by: Knut Petersen <Knut_Petersen@t-online.de>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Jeff Mahoney <jeffm@suse.com>
      Cc: Jan Kara <jack@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>