1. 31 Jul, 2012 40 commits
    • A
      vfio: Add documentation · 4a5b2a20
      Alex Williamson committed
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
      4a5b2a20
    • A
      vfio: VFIO core · cba3345c
      Alex Williamson committed
      VFIO is a secure user level driver for use with both virtual machines
      and user level drivers.  VFIO makes use of IOMMU groups to ensure the
      isolation of devices in use, allowing unprivileged user access.  It's
      intended that VFIO will replace KVM device assignment and UIO drivers
      (in cases where the target platform includes a sufficiently capable
      IOMMU).
      
      New in this version of VFIO is support for IOMMU groups managed
      through the IOMMU core as well as a rework of the API, removing the
      group merge interface.  We now go back to a model more similar to
      original VFIO with UIOMMU support where the file descriptor obtained
      from /dev/vfio/vfio allows access to the IOMMU, but only after a
      group is added, avoiding the previous privilege issues with this type
      of model.  IOMMU support is also now fully modular as IOMMUs have
      vastly different interface requirements on different platforms.  VFIO
      users are able to query and initialize the IOMMU model of their
      choice.
      
      Please see the follow-on Documentation commit for further description
      and usage example.
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
      cba3345c
    • L
      Merge tag 'writeback-proportions' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux · 2e3ee613
      Linus Torvalds committed
      Pull writeback updates from Wu Fengguang:
       "Use time based periods to age the writeback proportions, which can
        adapt equally well to fast/slow devices."
      
      Fix up trivial conflict in comment in fs/sync.c
      
      * tag 'writeback-proportions' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux:
        writeback: Fix some comment errors
        block: Convert BDI proportion calculations to flexible proportions
        lib: Fix possible deadlock in flexible proportion code
        lib: Proportions with flexible period
      2e3ee613
    • L
      Merge tag 'nfs-for-3.6-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs · 1fad1e9a
      Linus Torvalds committed
      Pull NFS client updates from Trond Myklebust:
       "Features include:
         - More preparatory patches for modularising NFSv2/v3/v4.  Split out
           the various NFSv2/v3/v4-specific code into separate files
         - More preparation for the NFSv4 migration code
         - Ensure that OPEN(O_CREATE) observes the pNFS mds threshold
           parameters
         - pNFS fast failover when the data servers are down
         - Various cleanups and debugging patches"
      
      * tag 'nfs-for-3.6-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (67 commits)
        nfs: fix fl_type tests in NFSv4 code
        NFS: fix pnfs regression with directio writes
        NFS: fix pnfs regression with directio reads
        sunrpc: clnt: Add missing braces
        nfs: fix stub return type warnings
        NFS: exit_nfs_v4() shouldn't be an __exit function
        SUNRPC: Add a missing spin_unlock to gss_mech_list_pseudoflavors
        NFS: Split out NFS v4 client functions
        NFS: Split out the NFS v4 filesystem types
        NFS: Create a single nfs_clone_super() function
        NFS: Split out NFS v4 server creating code
        NFS: Initialize the NFS v4 client from init_nfs_v4()
        NFS: Move the v4 getroot code to nfs4getroot.c
        NFS: Split out NFS v4 file operations
        NFS: Initialize v4 sysctls from nfs_init_v4()
        NFS: Create an init_nfs_v4() function
        NFS: Split out NFS v4 inode operations
        NFS: Split out NFS v3 inode operations
        NFS: Split out NFS v2 inode operations
        NFS: Clean up nfs4_proc_setclientid() and friends
        ...
      1fad1e9a
    • L
      Merge tag 'mfd-for-linus-3.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6 · bbeb0af2
      Linus Torvalds committed
      Pull MFD fix from Samuel Ortiz:
       "This one fixes an s5m8767 regulator build breakage due to a merge
        conflict caused by the MFD s5m API changes."
      
      * tag 'mfd-for-linus-3.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6:
        regulator: Fix an s5m8767 build failure
      bbeb0af2
    • L
      Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media · 6df419e4
      Linus Torvalds committed
      Pull media updates from Mauro Carvalho Chehab:
       "This is the first part of the media patches for v3.6.
      
        This patch series contain:
         - new DVB frontend: rtl2832
         - new video drivers: adv7393
         - some unused files got removed
         - a selection API cleanup between V4L2 and V4L2 subdev API's
         - a major redesign at v4l-ioctl2, in order to clean it up
         - several driver fixes and improvements."
      
      * 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (174 commits)
        v4l: Export v4l2-common.h in include/linux/Kbuild
        media: Revert "[media] Terratec Cinergy S2 USB HD Rev.2"
        [media] media: Use pr_info not homegrown pr_reg macro
        [media] Terratec Cinergy S2 USB HD Rev.2
        [media] v4l: Correct conflicting V4L2 subdev selection API documentation
        [media] Feature removal: V4L2 selections API target and flag definitions
        [media] v4l: Unify selection flags documentation
        [media] v4l: Unify selection flags
        [media] v4l: Common documentation for selection targets
        [media] v4l: Unify selection targets across V4L2 and V4L2 subdev interfaces
        [media] v4l: Remove "_ACTUAL" from subdev selection API target definition names
        [media] V4L: Remove "_ACTIVE" from the selection target name definitions
        [media] media: dvb-usb: print mac address via native %pM
        [media] s5p-tv: Use module_i2c_driver in sii9234_drv.c file
        [media] media: gpio-ir-recv: add allowed_protos for platform data
        [media] s5p-jpeg: Use module_platform_driver in jpeg-core.c file
        [media] saa7134: fix spelling of detach in label
        [media] cx88-blackbird: replace ioctl by unlocked_ioctl
        [media] cx88: don't use current_norm
        [media] cx88: fix a number of v4l2-compliance violations
        ...
      6df419e4
    • L
      Merge branch 'akpm' (Andrew's patch-bomb) · 27c1ee3f
      Linus Torvalds committed
      Merge Andrew's first set of patches:
       "Non-MM patches:
      
         - lots of misc bits
      
         - tree-wide have_clk() cleanups
      
         - quite a lot of printk tweaks.  I draw your attention to "printk:
           convert the format for KERN_<LEVEL> to a 2 byte pattern" which
           looks a bit scary.  But afaict it's solid.
      
         - backlight updates
      
         - lib/ feature work (notably the addition and use of memweight())
      
         - checkpatch updates
      
         - rtc updates
      
         - nilfs updates
      
         - fatfs updates (partial, still waiting for acks)
      
         - kdump, proc, fork, IPC, sysctl, taskstats, pps, etc
      
         - new fault-injection feature work"
      
      * Merge emailed patches from Andrew Morton <akpm@linux-foundation.org>: (128 commits)
        drivers/misc/lkdtm.c: fix missing allocation failure check
        lib/scatterlist: do not re-write gfp_flags in __sg_alloc_table()
        fault-injection: add tool to run command with failslab or fail_page_alloc
        fault-injection: add selftests for cpu and memory hotplug
        powerpc: pSeries reconfig notifier error injection module
        memory: memory notifier error injection module
        PM: PM notifier error injection module
        cpu: rewrite cpu-notifier-error-inject module
        fault-injection: notifier error injection
        c/r: fcntl: add F_GETOWNER_UIDS option
        resource: make sure requested range is included in the root range
        include/linux/aio.h: cpp->C conversions
        fs: cachefiles: add support for large files in filesystem caching
        pps: return PTR_ERR on error in device_create
        taskstats: check nla_reserve() return
        sysctl: suppress kmemleak messages
        ipc: use Kconfig options for __ARCH_WANT_[COMPAT_]IPC_PARSE_VERSION
        ipc: compat: use signed size_t types for msgsnd and msgrcv
        ipc: allow compat IPC version field parsing if !ARCH_WANT_OLD_COMPAT_IPC
        ipc: add COMPAT_SHMLBA support
        ...
      27c1ee3f
    • A
    • M
      lib/scatterlist: do not re-write gfp_flags in __sg_alloc_table() · e04f2283
      Mandeep Singh Baines committed
      We are seeing a lot of sg_alloc_table allocation failures using the new
      drm prime infrastructure.  We isolated the cause to code in
      __sg_alloc_table that was re-writing the gfp_flags.
      
      There is a comment in the code that suggests that there is an assumption
      about the allocation coming from a memory pool.  This was likely true
      when sg lists were primarily used for disk I/O.
      Signed-off-by: Mandeep Singh Baines <msb@chromium.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Cong Wang <amwang@redhat.com>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Rob Clark <rob.clark@linaro.org>
      Cc: Sumit Semwal <sumit.semwal@linaro.org>
      Cc: Inki Dae <inki.dae@samsung.com>
      Cc: Dave Airlie <airlied@redhat.com>
      Cc: Sonny Rao <sonnyrao@chromium.org>
      Cc: Olof Johansson <olofj@chromium.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      e04f2283
    • A
      fault-injection: add tool to run command with failslab or fail_page_alloc · c24aa64d
      Akinobu Mita committed
      This adds tools/testing/fault-injection/failcmd.sh to run a command while
      injecting slab/page allocation failures via fault injection.
      
      Example:
      
      Run the command "make -C tools/testing/selftests/ run_tests" while
      injecting slab allocation failures.
      
      	# ./tools/testing/fault-injection/failcmd.sh \
      		-- make -C tools/testing/selftests/ run_tests
      
      Same as above, but allow at most 100 failures instead of the default of
      at most one.
      
      	# ./tools/testing/fault-injection/failcmd.sh --times=100 \
      		-- make -C tools/testing/selftests/ run_tests
      
      Same as above, but inject page allocation failures instead of slab
      allocation failures.
      
      	# env FAILCMD_TYPE=fail_page_alloc \
      		./tools/testing/fault-injection/failcmd.sh --times=100 \
      		-- make -C tools/testing/selftests/ run_tests
      Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      c24aa64d
    • A
      fault-injection: add selftests for cpu and memory hotplug · d89dffa9
      Akinobu Mita committed
      This adds two selftests.
      
      * tools/testing/selftests/cpu-hotplug/on-off-test.sh is a test script
      for CPU hotplug:
      
      1. Online all hot-pluggable CPUs
      2. Offline all hot-pluggable CPUs
      3. Online all hot-pluggable CPUs again
      4. Exit if cpu-notifier-error-inject.ko is not available
      5. Offline all hot-pluggable CPUs in preparation for testing
      6. Test CPU hot-add error handling by injecting notifier errors
      7. Online all hot-pluggable CPUs in preparation for testing
      8. Test CPU hot-remove error handling by injecting notifier errors
      
      * tools/testing/selftests/memory-hotplug/on-off-test.sh does the
      same for memory hotplug:
      
      1. Online all hot-pluggable memory
      2. Offline 10% of hot-pluggable memory
      3. Online all hot-pluggable memory again
      4. Exit if memory-notifier-error-inject.ko is not available
      5. Offline 10% of hot-pluggable memory in preparation for testing
      6. Test memory hot-add error handling by injecting notifier errors
      7. Online all hot-pluggable memory in preparation for testing
      8. Test memory hot-remove error handling by injecting notifier errors
      Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
      Suggested-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
      Cc: Greg KH <greg@kroah.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <michael@ellerman.id.au>
      Cc: Dave Jones <davej@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      d89dffa9
    • A
      powerpc: pSeries reconfig notifier error injection module · 08dfb4dd
      Akinobu Mita committed
      This provides the ability to inject artificial errors into pSeries reconfig
      notifier chain callbacks.  It is controlled through the debugfs interface
      under /sys/kernel/debug/notifier-error-inject/pSeries-reconfig
      
      To make the notifier call chain fail for a given event, write the error
      code to "actions/<notifier event>/error".
      Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
      Cc: Greg KH <greg@kroah.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <michael@ellerman.id.au>
      Cc: Dave Jones <davej@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      08dfb4dd
    • A
      memory: memory notifier error injection module · 9579f5bd
      Akinobu Mita committed
      This provides the ability to inject artificial errors into memory hotplug
      notifier chain callbacks.  It is controlled through the debugfs interface
      under /sys/kernel/debug/notifier-error-inject/memory
      
      To make the notifier call chain fail for a given event, write the error
      code to "actions/<notifier event>/error".
      
      Example: Inject memory hotplug offline error (-12 == -ENOMEM)
      
      	# cd /sys/kernel/debug/notifier-error-inject/memory
      	# echo -12 > actions/MEM_GOING_OFFLINE/error
      	# echo offline > /sys/devices/system/memory/memoryXXX/state
      	bash: echo: write error: Cannot allocate memory
      Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
      Cc: Greg KH <greg@kroah.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <michael@ellerman.id.au>
      Cc: Dave Jones <davej@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      9579f5bd
    • A
      PM: PM notifier error injection module · 048b9c35
      Akinobu Mita committed
      This provides the ability to inject artificial errors into PM notifier chain
      callbacks.  It is controlled through the debugfs interface under
      /sys/kernel/debug/notifier-error-inject/pm
      
      Each of the files in the "error" directory represents an event that can
      be made to fail and contains the error code.  To make the notifier call
      chain fail for a given event, write the error code to the matching file.
      
      To make the notifier call chain fail for a given event, write the error
      code to "actions/<notifier event>/error".
      
      Example: Inject PM suspend error (-12 = -ENOMEM)
      
      	# cd /sys/kernel/debug/notifier-error-inject/pm
      	# echo -12 > actions/PM_SUSPEND_PREPARE/error
      	# echo mem > /sys/power/state
      	bash: echo: write error: Cannot allocate memory
      Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
      Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Greg KH <greg@kroah.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <michael@ellerman.id.au>
      Cc: Dave Jones <davej@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      048b9c35
    • A
      cpu: rewrite cpu-notifier-error-inject module · f5a9f52e
      Akinobu Mita committed
      Rewrite the existing cpu-notifier-error-inject module to use the new
      debugfs-based framework.
      
      This change removes cpu_up_prepare_error and cpu_down_prepare_error module
      parameters which were used to specify error code to be injected.  We could
      keep these module parameters for backward compatibility by module_param_cb
      but it seems overkill for this module.
      
      This provides the ability to inject artificial errors into CPU notifier chain
      callbacks.  It is controlled through the debugfs interface under
      /sys/kernel/debug/notifier-error-inject/cpu
      
      To make the notifier call chain fail for a given event, write the error
      code to "actions/<notifier event>/error".
      
      Example1: inject CPU offline error (-1 == -EPERM)
      
      	# cd /sys/kernel/debug/notifier-error-inject/cpu
      	# echo -1 > actions/CPU_DOWN_PREPARE/error
      	# echo 0 > /sys/devices/system/cpu/cpu1/online
      	bash: echo: write error: Operation not permitted
      
      Example2: inject CPU online error (-2 == -ENOENT)
      
      	# cd /sys/kernel/debug/notifier-error-inject/cpu
      	# echo -2 > actions/CPU_UP_PREPARE/error
      	# echo 1 > /sys/devices/system/cpu/cpu1/online
      	bash: echo: write error: No such file or directory
      Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
      Cc: Greg KH <greg@kroah.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <michael@ellerman.id.au>
      Cc: Dave Jones <davej@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      f5a9f52e
    • A
      fault-injection: notifier error injection · 8d438288
      Akinobu Mita committed
      This patchset provides kernel modules that can be used to test the error
      handling of notifier call chain failures by injecting artificial errors into
      the following notifier chain callbacks.
      
       * CPU notifier
       * PM notifier
       * memory hotplug notifier
       * powerpc pSeries reconfig notifier
      
      Example: Inject CPU offline error (-1 == -EPERM)
      
        # cd /sys/kernel/debug/notifier-error-inject/cpu
        # echo -1 > actions/CPU_DOWN_PREPARE/error
        # echo 0 > /sys/devices/system/cpu/cpu1/online
        bash: echo: write error: Operation not permitted
      
      The patchset also adds cpu and memory hotplug tests to
      tools/testing/selftests.  These tests first run simple online and
      offline tests and then run fault injection tests if the notifier error
      injection module is available.
      
      This patch:
      
      The notifier error injection provides the ability to inject artificial
      errors into specified notifier chain callbacks.  It is useful for testing
      the error handling of notifier call chain failures.
      
      This adds common basic functions to define which types of events can
      fail and to initialize the debugfs interface that controls which error
      code is returned and which event fails.
      Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
      Cc: Greg KH <greg@kroah.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <michael@ellerman.id.au>
      Cc: Dave Jones <davej@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      8d438288
    • C
      c/r: fcntl: add F_GETOWNER_UIDS option · 1d151c33
      Cyrill Gorcunov committed
      When we restore file descriptors we would like them to look exactly as
      they were at dumping time.
      
      With the help of fcntl this is almost possible; the missing piece is the
      file owner's UIDs.
      
      To make their values readable, F_GETOWNER_UIDS is introduced.
      
      This option is valid only if CONFIG_CHECKPOINT_RESTORE is turned on;
      otherwise it returns -EINVAL.
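A minimal userspace sketch of querying the new option, assuming a Linux system. The constant value 17 is taken from the generic fcntl header and is hard-coded because Python's `fcntl` module does not export it. The helper name `get_owner_uids` is hypothetical, and the two-`uid_t` result layout (uid, then euid) is an assumption about the kernel interface.

```python
import errno
import fcntl
import os
import struct

# Assumption: value from include/uapi/asm-generic/fcntl.h (generic ABI).
F_GETOWNER_UIDS = 17


def get_owner_uids(fd):
    """Return (uid, euid) of the fd's owner, or None if the kernel
    rejects the command with EINVAL (e.g. CONFIG_CHECKPOINT_RESTORE off)."""
    buf = bytes(struct.calcsize("II"))  # zeroed room for two uid_t values
    try:
        out = fcntl.fcntl(fd, F_GETOWNER_UIDS, buf)
    except OSError as exc:
        if exc.errno == errno.EINVAL:
            return None
        raise
    return struct.unpack("II", out)


if __name__ == "__main__":
    r, w = os.pipe()
    print(get_owner_uids(r))
    os.close(r)
    os.close(w)
```

A `None` result corresponds to the -EINVAL case described above.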
      Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
      Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: "Serge E. Hallyn" <serge@hallyn.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      1d151c33
    • O
      resource: make sure requested range is included in the root range · 65fed8f6
      Octavian Purdila committed
      When the requested range is outside of the root range the logic in
      __reserve_region_with_split will cause an infinite recursion which will
      overflow the stack as seen in the warning below.
      
      This particular stack overflow was caused by requesting the
      (100000000-107ffffff) range while the root range was (0-ffffffff).  In
      this case __request_resource would return the whole root range as
      conflict range (i.e.  0-ffffffff).  Then, the logic in
      __reserve_region_with_split would continue the recursion requesting the
      new range as (conflict->end+1, end) which incidentally in this case
      equals the originally requested range.
      
      This patch aborts looking for a usable range when the request does not
      intersect with the root range.  When the request partially overlaps with
      the root range, it adjusts the request to fall within the root range and
      then continues with the new request.
      
      When the request is modified or aborted, errors and a stack trace are
      logged to allow catching the errors in the upper layers.
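The abort-or-adjust rule described above can be modelled in a few lines. This is an illustrative sketch only, not the kernel code: the function name `clamp_to_root` is hypothetical and ranges are treated as inclusive integer pairs.

```python
def clamp_to_root(start, end, root_start, root_end):
    """Return the part of [start, end] that lies inside the root range,
    or None when the two ranges do not intersect at all."""
    if end < root_start or start > root_end:
        return None  # no intersection: give up instead of recursing
    # Partial overlap: trim the request so it falls within the root range.
    return max(start, root_start), min(end, root_end)


if __name__ == "__main__":
    # The failing case from the report: request entirely above the root.
    print(clamp_to_root(0x100000000, 0x107FFFFFF, 0x0, 0xFFFFFFFF))
    # A request that straddles the end of the root range gets trimmed.
    print(clamp_to_root(0xF0000000, 0x107FFFFFF, 0x0, 0xFFFFFFFF))
```

With this check in front, the case from the report (100000000-107ffffff against a 0-ffffffff root) is rejected up front rather than being retried with the same range.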
      
      [    5.968374] WARNING: at kernel/sched.c:4129 sub_preempt_count+0x63/0x89()
      [    5.975150] Modules linked in:
      [    5.978184] Pid: 1, comm: swapper Not tainted 3.0.22-mid27-00004-gb72c817 #46
      [    5.985324] Call Trace:
      [    5.987759]  [<c1039dfc>] ? console_unlock+0x17b/0x18d
      [    5.992891]  [<c1039620>] warn_slowpath_common+0x48/0x5d
      [    5.998194]  [<c1031758>] ? sub_preempt_count+0x63/0x89
      [    6.003412]  [<c1039644>] warn_slowpath_null+0xf/0x13
      [    6.008453]  [<c1031758>] sub_preempt_count+0x63/0x89
      [    6.013499]  [<c14d60c4>] _raw_spin_unlock+0x27/0x3f
      [    6.018453]  [<c10c6349>] add_partial+0x36/0x3b
      [    6.022973]  [<c10c7c0a>] deactivate_slab+0x96/0xb4
      [    6.027842]  [<c14cf9d9>] __slab_alloc.isra.54.constprop.63+0x204/0x241
      [    6.034456]  [<c103f78f>] ? kzalloc.constprop.5+0x29/0x38
      [    6.039842]  [<c103f78f>] ? kzalloc.constprop.5+0x29/0x38
      [    6.045232]  [<c10c7dc9>] kmem_cache_alloc_trace+0x51/0xb0
      [    6.050710]  [<c103f78f>] ? kzalloc.constprop.5+0x29/0x38
      [    6.056100]  [<c103f78f>] kzalloc.constprop.5+0x29/0x38
      [    6.061320]  [<c17b45e9>] __reserve_region_with_split+0x1c/0xd1
      [    6.067230]  [<c17b4693>] __reserve_region_with_split+0xc6/0xd1
      ...
      [    7.179057]  [<c17b4693>] __reserve_region_with_split+0xc6/0xd1
      [    7.184970]  [<c17b4779>] reserve_region_with_split+0x30/0x42
      [    7.190709]  [<c17a8ebf>] e820_reserve_resources_late+0xd1/0xe9
      [    7.196623]  [<c17c9526>] pcibios_resource_survey+0x23/0x2a
      [    7.202184]  [<c17cad8a>] pcibios_init+0x23/0x35
      [    7.206789]  [<c17ca574>] pci_subsys_init+0x3f/0x44
      [    7.211659]  [<c1002088>] do_one_initcall+0x72/0x122
      [    7.216615]  [<c17ca535>] ? pci_legacy_init+0x3d/0x3d
      [    7.221659]  [<c17a27ff>] kernel_init+0xa6/0x118
      [    7.226265]  [<c17a2759>] ? start_kernel+0x334/0x334
      [    7.231223]  [<c14d7482>] kernel_thread_helper+0x6/0x10
      Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
      Signed-off-by: Ram Pai <linuxram@us.ibm.com>
      Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      65fed8f6
    • A
      include/linux/aio.h: cpp->C conversions · f7e1becb
      Andrew Morton committed
      Convert init_sync_kiocb() from a nasty macro into a nice C function.  The
      struct assignment trick takes care of zeroing all unmentioned fields.
      Shrinks fs/read_write.o's .text from 9857 bytes to 9714.
      
      Also demacroize is_sync_kiocb() and aio_ring_avail().  The latter fixes an
      arg-referenced-multiple-times hand grenade.
      
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Acked-by: Jeff Moyer <jmoyer@redhat.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      f7e1becb
    • J
    • E
      pps: return PTR_ERR on error in device_create · 668f06b9
      Emil Goode committed
      We should return PTR_ERR if the call to the device_create function fails.
      Without this patch we instead return the value from a successful call to
      cdev_add if the call to device_create fails.
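The bug pattern can be illustrated with a small model of error-encoded pointers. This is a sketch under stated assumptions: pointers are modelled as 64-bit unsigned integers, and `register_buggy` and `register_fixed` are hypothetical stand-ins for the registration path, not the driver's actual functions.

```python
MAX_ERRNO = 4095
WORD = 1 << 64  # assumption: 64-bit pointers
ENOMEM = 12


def err_ptr(err):
    """Encode a negative errno as a pointer-sized value."""
    return err % WORD


def is_err(ptr):
    """True when the value lies in the top MAX_ERRNO addresses."""
    return ptr >= WORD - MAX_ERRNO


def ptr_err(ptr):
    """Decode an error pointer back into a negative errno."""
    return ptr - WORD


def register_buggy(device_create):
    err = 0  # result of the earlier, successful step
    dev = device_create()
    if is_err(dev):
        return err  # bug: the stale success value is reported
    return 0


def register_fixed(device_create):
    dev = device_create()
    if is_err(dev):
        return ptr_err(dev)  # report the real failure
    return 0


if __name__ == "__main__":
    failing = lambda: err_ptr(-ENOMEM)
    print(register_buggy(failing), register_fixed(failing))
```

In the buggy variant the caller sees success even though device creation failed; the fixed variant propagates the decoded error code.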
      Signed-off-by: Emil Goode <emilgoode@gmail.com>
      Acked-by: Devendra Naga <devendra.aaru@gmail.com>
      Cc: Alexander Gordeev <lasaine@lvk.cs.msu.su>
      Cc: Rodolfo Giometti <giometti@enneenne.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      668f06b9
    • A
      taskstats: check nla_reserve() return · 25353b33
      Alan Cox committed
      Addresses https://bugzilla.kernel.org/show_bug.cgi?id=44621
      
      Reported-by: <rucsoftsec@gmail.com>
      Signed-off-by: Alan Cox <alan@linux.intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      25353b33
    • S
      sysctl: suppress kmemleak messages · fd4b616b
      Steven Rostedt committed
      register_sysctl_table() is a strange function, as it makes internal
      allocations (a header) to register a sysctl_table.  This header is a
      handle to the table that is created, and can be used to unregister the
      table.  But if the table is permanent and never unregistered, the header
      acts the same as a static variable.
      
      Unfortunately, this allocation of memory that is never expected to be
      freed fools kmemleak into thinking that we have leaked memory.  For those
      sysctl tables that are never unregistered, and have no pointer referencing
      them, kmemleak will think that these are memory leaks:
      
      unreferenced object 0xffff880079fb9d40 (size 192):
        comm "swapper/0", pid 0, jiffies 4294667316 (age 12614.152s)
        hex dump (first 32 bytes):
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        backtrace:
          [<ffffffff8146b590>] kmemleak_alloc+0x73/0x98
          [<ffffffff8110a935>] kmemleak_alloc_recursive.constprop.42+0x16/0x18
          [<ffffffff8110b852>] __kmalloc+0x107/0x153
          [<ffffffff8116fa72>] kzalloc.constprop.8+0xe/0x10
          [<ffffffff811703c9>] __register_sysctl_paths+0xe1/0x160
          [<ffffffff81170463>] register_sysctl_paths+0x1b/0x1d
          [<ffffffff8117047d>] register_sysctl_table+0x18/0x1a
          [<ffffffff81afb0a1>] sysctl_init+0x10/0x14
          [<ffffffff81b05a6f>] proc_sys_init+0x2f/0x31
          [<ffffffff81b0584c>] proc_root_init+0xa5/0xa7
          [<ffffffff81ae5b7e>] start_kernel+0x3d0/0x40a
          [<ffffffff81ae52a7>] x86_64_start_reservations+0xae/0xb2
          [<ffffffff81ae53ad>] x86_64_start_kernel+0x102/0x111
          [<ffffffffffffffff>] 0xffffffffffffffff
      
      The sysctl_base_table used by sysctl itself is one such instance: it
      registers a table that is never unregistered.
      
      Use kmemleak_not_leak() to suppress the kmemleak false positive.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      fd4b616b
    • W
      ipc: use Kconfig options for __ARCH_WANT_[COMPAT_]IPC_PARSE_VERSION · c1d7e01d
      Will Deacon committed
      Rather than #define the options manually in the architecture code, add
      Kconfig options for them and select them there instead.  This also allows
      us to select the compat IPC version parsing automatically for platforms
      using the old compat IPC interface.
      Reported-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      c1d7e01d
    • W
      ipc: compat: use signed size_t types for msgsnd and msgrcv · 05ba3f1a
      Will Deacon committed
      The msgsnd and msgrcv system calls use size_t to represent the size of the
      message being transferred.  POSIX states that values of msgsz greater than
      SSIZE_MAX cause the result to be implementation-defined.  On Linux, this
      equates to returning -EINVAL if (long) msgsz < 0.
      
      For compat tasks where !CONFIG_ARCH_WANT_OLD_COMPAT_IPC and compat_size_t
      is smaller than size_t, negative size values passed from userspace will be
      interpreted as positive values by do_msg{rcv,snd} and will fail to exit
      early with -EINVAL.
      
      This patch changes the compat prototypes for msg{rcv,snd} so that the
      message size is represented as a compat_ssize_t, which we cast to the
      native ssize_t type for the core IPC code.
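The difference between the two compat types can be shown with plain integer arithmetic. This is a sketch assuming a 32-bit compat task on a 64-bit kernel; the helper names are hypothetical and `check_msgsz` models only the early `(long) msgsz < 0` test mentioned above.

```python
EINVAL = 22


def zero_extend_32(value):
    """Widen a 32-bit value as an unsigned type (compat_size_t)."""
    return value & 0xFFFFFFFF


def sign_extend_32(value):
    """Widen a 32-bit value as a signed type (compat_ssize_t)."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def check_msgsz(msgsz_as_long):
    """Model of the early size check: negative sizes are rejected."""
    return -EINVAL if msgsz_as_long < 0 else 0


if __name__ == "__main__":
    raw = 0xFFFFFFFF  # userspace passed -1 in a 32-bit register
    print(check_msgsz(zero_extend_32(raw)))  # check passes: looks positive
    print(check_msgsz(sign_extend_32(raw)))  # check rejects the size
```

Zero extension turns the 32-bit pattern for -1 into 4294967295, which slips past the check; sign extension preserves the negative value so the call can exit early.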
      
      Cc: Arnd Bergmann <arnd@arndb.de>
      Acked-by: Chris Metcalf <cmetcalf@tilera.com>
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      05ba3f1a
    • W
      ipc: allow compat IPC version field parsing if !ARCH_WANT_OLD_COMPAT_IPC · b610c04c
      Will Deacon committed
      Commit 48b25c43 ("ipc: provide generic compat versions of IPC
      syscalls") added a new ARCH_WANT_OLD_COMPAT_IPC config option for
      architectures to select if their compat target requires the old IPC
      syscall interface.
      
      For architectures (such as AArch64) that do not require the internal
      calling conventions provided by this option, but have a compat target
      where the C library passes the IPC_64 flag explicitly,
      compat_ipc_parse_version no longer strips out the flag before calling
      the native system call implementation, resulting in unknown SHM/IPC
      commands and -EINVAL being returned to userspace.
      
      This patch separates the selection of the internal calling conventions
      for the IPC syscalls from the version parsing, allowing architectures to
      select __ARCH_WANT_COMPAT_IPC_PARSE_VERSION if they want to use version
      parsing whilst retaining the newer syscall calling conventions.
      Acked-by: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      b610c04c
    • W
      ipc: add COMPAT_SHMLBA support · 079a96ae
      Will Deacon committed
      If the SHMLBA definition for a native task differs from the definition for
      a compat task, the do_shmat() function would need to handle both.
      
      This patch introduces COMPAT_SHMLBA, which is used by the compat shmat
      syscall when calling the ipc code and allows architectures such as AArch64
      (where the native SHMLBA is 64k but the compat (AArch32) definition is
      16k) to provide the correct semantics for compat IPC system calls.
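
      Why the alignment value matters can be sketched as follows.  This is a
      simplified model of an attach-address alignment check, not the actual
      do_shmat() code; the demo_* names and the helper itself are
      hypothetical, and only the 64k and 16k values come from the text above.

```c
#include <errno.h>

#define DEMO_SHMLBA        (64UL * 1024)  /* native value named above */
#define DEMO_COMPAT_SHMLBA (16UL * 1024)  /* compat (AArch32) value named above */

/*
 * Checks an attach address against the given alignment.  A misaligned
 * address is rounded down when rounding is requested and rejected otherwise.
 */
static int demo_shmat_align(unsigned long *addr, int round_down, unsigned long shmlba)
{
	if (*addr & (shmlba - 1)) {
		if (!round_down)
			return -EINVAL;
		*addr &= ~(shmlba - 1);
	}
	return 0;
}
```

      An address such as 0x14000 is 16k-aligned but not 64k-aligned, so a
      compat task would see it rejected or moved if the native alignment were
      applied to it.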
      
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      079a96ae
    • V
      kdump: append newline to the last line of vmcoreinfo note · 63dca8d5
      Vivek Goyal committed
      The last line of vmcoreinfo note does not end with \n.  Parsing all the
      lines in note becomes easier if all lines end with \n instead of trying to
      special case the last line.
      
      I know of at least one tool, vmcore-dmesg in the kexec-tools tree, that
      assumes all lines end with \n.  I think it is a good idea to fix this
      in the kernel.
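
      A minimal sketch of the kind of parser that benefits from this.  The
      function name is hypothetical and this is not the vmcore-dmesg code; it
      simply shows that a loop which looks for '\n' misses an unterminated
      final line unless it adds a special case.

```c
#include <stddef.h>
#include <string.h>

/* Counts lines terminated by '\n'; an unterminated trailing line is not counted. */
static size_t demo_count_terminated_lines(const char *buf, size_t len)
{
	size_t count = 0;
	const char *p = buf;
	const char *end = buf + len;

	while (p < end) {
		const char *nl = memchr(p, '\n', (size_t)(end - p));

		if (!nl)
			break;  /* trailing data without a newline is skipped */
		count++;
		p = nl + 1;
	}
	return count;
}
```

      Given a two-line note whose last line lacks the newline, this loop sees
      only one line; once the final newline is present it sees both.
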
      Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Atsushi Kumagai <kumagai-atsushi@mxc.nes.nec.co.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      63dca8d5
    • A
      fork: fix error handling in dup_task() · f19b9f74
      Akinobu Mita committed
      The function dup_task() may fail at the following function calls in the
      following order.
      
      0) alloc_task_struct_node()
      1) alloc_thread_info_node()
      2) arch_dup_task_struct()
      
      A failure at 0) needs no cleanup; the function can just return.  But a
      failure at 1) requires releasing the task_struct allocated by 0) before
      returning.  Likewise, a failure at 2) requires releasing the task_struct
      and thread_info allocated by 0) and 1).
      
      The existing error handling calls free_task_struct() and
      free_thread_info() which do not only release task_struct and thread_info,
      but also call architecture specific arch_release_task_struct() and
      arch_release_thread_info().
      
      The problem is that task_struct and thread_info are not fully initialized
      yet at this point, but arch_release_task_struct() and
      arch_release_thread_info() are called with them.
      
      For example, x86 defines its own arch_release_task_struct() that releases
      a task_xstate.  If alloc_thread_info_node() fails in dup_task(),
      arch_release_task_struct() is called with task_struct which is just
      allocated and filled with garbage in this error handling.
      
      This actually happened with tools/testing/fault-injection/failcmd.sh:
      
      	# env FAILCMD_TYPE=fail_page_alloc \
      		./tools/testing/fault-injection/failcmd.sh --times=100 \
      		--min-order=0 --ignore-gfp-wait=0 \
      		-- make -C tools/testing/selftests/ run_tests
      
      In order to fix this issue, make free_{task_struct,thread_info}() not
      call arch_release_{task_struct,thread_info}(), and call
      arch_release_{task_struct,thread_info}() explicitly where needed.
      
      The default arch_release_task_struct() and arch_release_thread_info()
      are defined as empty.  So this change only affects the architectures
      which implement their own arch_release_task_struct() or
      arch_release_thread_info() as listed below.
      
      arch_release_task_struct(): x86, sh
      arch_release_thread_info(): mn10300, tile
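
      The resulting shape of the error handling can be sketched in userspace
      C.  This is an illustration only: the demo_* names, the failure
      injection variable, and the struct layout are hypothetical, and the
      real dup_task() differs in detail.

```c
#include <stdlib.h>

struct demo_task {
	void *thread_info;
	int arch_state_ready;
};

static int demo_arch_release_calls;  /* counts arch hook invocations */
static int demo_fail_step = -1;      /* step to fail: 0, 1, 2, or -1 for none */

/* Stand-in for an arch hook that must only see fully initialized tasks. */
static void demo_arch_release_task(struct demo_task *tsk)
{
	(void)tsk;
	demo_arch_release_calls++;
}

static void *demo_alloc(int step, size_t size)
{
	return step == demo_fail_step ? NULL : calloc(1, size);
}

static struct demo_task *demo_dup_task(void)
{
	struct demo_task *tsk = demo_alloc(0, sizeof(*tsk));   /* step 0 */

	if (!tsk)
		return NULL;
	tsk->thread_info = demo_alloc(1, 64);                  /* step 1 */
	if (!tsk->thread_info)
		goto free_tsk;
	if (demo_fail_step == 2)                               /* step 2 */
		goto free_ti;
	tsk->arch_state_ready = 1;
	return tsk;

free_ti:
	free(tsk->thread_info);   /* plain free, no arch hook */
free_tsk:
	free(tsk);                /* plain free, no arch hook */
	return NULL;
}

/* Normal teardown of a fully set-up task calls the arch hook explicitly. */
static void demo_free_task(struct demo_task *tsk)
{
	demo_arch_release_task(tsk);
	free(tsk->thread_info);
	free(tsk);
}
```

      The error paths only undo the allocations that succeeded, so the arch
      hook never runs on a half-initialized structure.
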
      Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Salman Qazi <sqazi@google.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      f19b9f74
    • A
      revert "sched: Fix fork() error path to not crash" · 87bec58a
      Andrew Morton committed
      To make way for "fork: fix error handling in dup_task()", which fixes the
      errors more completely.
      
      Cc: Salman Qazi <sqazi@google.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Akinobu Mita <akinobu.mita@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      87bec58a
    • H
      fork: use vma_pages() to simplify the code · b2412b7f
      Huang Shijie committed
      The current code can be replaced by vma_pages().  So use it to simplify
      the code.
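
      For reference, a userspace sketch of what such a helper computes.  The
      demo_* names and the 4k page size are assumptions for illustration; the
      kernel helper operates on struct vm_area_struct and the architecture's
      PAGE_SHIFT.

```c
#define DEMO_PAGE_SHIFT 12  /* assumes 4k pages */

struct demo_vma {
	unsigned long vm_start;  /* first address in the mapping */
	unsigned long vm_end;    /* first address past the mapping */
};

/* Number of pages spanned by the mapping. */
static unsigned long demo_vma_pages(const struct demo_vma *vma)
{
	return (vma->vm_end - vma->vm_start) >> DEMO_PAGE_SHIFT;
}
```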
      
      [akpm@linux-foundation.org: initialise `len' at its definition site]
      Signed-off-by: Huang Shijie <shijie8@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      b2412b7f
    • D
      proc: do not allow negative offsets on /proc/<pid>/environ · bc452b4b
      Djalal Harouni committed
      __mem_open(), which is called by both the /proc/<pid>/environ and
      /proc/<pid>/mem ->open() handlers, allows the use of negative offsets.
      /proc/<pid>/mem needs negative offsets, but /proc/<pid>/environ does not.
      
      Clean this up by moving the 'force FMODE_UNSIGNED_OFFSET flag' to
      mem_open() to allow negative offsets only on /proc/<pid>/mem.
      Signed-off-by: Djalal Harouni <tixxdz@opendz.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Brad Spengler <spender@grsecurity.net>
      Acked-by: Kees Cook <keescook@chromium.org>
      Cc: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      bc452b4b
    • D
      proc: environ_read() make sure offset points to environment address range · e8905ec2
      Djalal Harouni committed
      Currently the following offset and environment address range check in
      environ_read() of /proc/<pid>/environ is buggy:
      
        int this_len = mm->env_end - (mm->env_start + src);
        if (this_len <= 0)
          break;
      
      Large or negative offsets on /proc/<pid>/environ converted to 'unsigned
      long' may pass this check since '(mm->env_start + src)' can overflow and
      'this_len' will be positive.
      
      This can make /proc/<pid>/environ act like /proc/<pid>/mem, since
      (mm->env_start + src) will point to and read from another VMA.
      
      There are two fixes here plus some code cleaning:
      
      1) Fix the overflow by checking if the offset that was converted to
         unsigned long will always point to the [mm->env_start, mm->env_end]
         address range.
      
      2) Remove the truncation that was applied to the result of the check.
         Storing the result in 'int this_len' alters its value, so we cannot
         depend on it.
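
      The two problems can be reproduced in a small userspace sketch.  It
      models 'unsigned long' as a 64-bit integer and uses hypothetical demo_*
      names; the fixed check shown here is one way to express "the offset
      lies inside the environment range", not necessarily the exact code in
      the patch.

```c
#include <stdint.h>

/*
 * Buggy form: the sum may wrap, and the 64-bit difference is truncated to
 * a 32-bit int.  Returns 1 if the read would proceed.  The narrowing is
 * written out as casts; mainstream compilers wrap here.
 */
static int demo_buggy_check(uint64_t env_start, uint64_t env_end, uint64_t src)
{
	int32_t this_len = (int32_t)(uint32_t)(env_end - (env_start + src));

	return this_len > 0;
}

/* Fixed form: compare the offset against the size of the range first. */
static int demo_fixed_check(uint64_t env_start, uint64_t env_end, uint64_t src)
{
	return src < env_end - env_start;
}
```

      With env_start 0x7ffd0000, env_end 0x7ffd1000 and an offset of
      0xFFFFFFFF00000FF0, the sum wraps and the truncated difference is 16,
      so the buggy check passes while the fixed check rejects the offset.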
      
      For kernels that have commit b409e578 ("proc: clean up
      /proc/<pid>/environ handling") which adds the appropriate ptrace check and
      saves the 'mm' at ->open() time, this is not a security issue.
      
      This patch is taken from the grsecurity patch since it was just made
      available.
      Signed-off-by: Djalal Harouni <tixxdz@opendz.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Brad Spengler <spender@grsecurity.net>
      Acked-by: Kees Cook <keescook@chromium.org>
      Cc: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      e8905ec2
    • J
      coredump: fix wrong comments on core limits of pipe coredump case · 108ceeb0
      Jovi Zhang committed
      In commit 898b374a ("exec: replace call_usermodehelper_pipe with use
      of umh init function and resolve limit"), the core limits recursive
      check value was changed from 0 to 1, but the corresponding comments were
      not updated.
      Signed-off-by: Jovi Zhang <bookjovi@gmail.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      108ceeb0
    • T
      kmod: avoid deadlock from recursive kmod call · 0f20784d
      Tetsuo Handa committed
      The system deadlocks (at least since 2.6.10) when a
      call_usermodehelper(UMH_WAIT_EXEC) request triggers a
      call_usermodehelper(UMH_WAIT_PROC) request.
      
      This is because "khelper thread is waiting for the worker thread at
      wait_for_completion() in do_fork() since the worker thread was created
      with CLONE_VFORK flag" and "the worker thread cannot call complete()
      because do_execve() is blocked at UMH_WAIT_PROC request" and "the khelper
      thread cannot start processing UMH_WAIT_PROC request because the khelper
      thread is waiting for the worker thread at wait_for_completion() in
      do_fork()".
      
      The easiest example to observe this deadlock is to use a corrupted
      /sbin/hotplug binary (like shown below).
      
        # : > /tmp/dummy
        # chmod 755 /tmp/dummy
        # echo /tmp/dummy > /proc/sys/kernel/hotplug
        # modprobe whatever
      
      call_usermodehelper("/tmp/dummy", UMH_WAIT_EXEC) is called from
      kobject_uevent_env() in lib/kobject_uevent.c upon loading/unloading a
      module.  do_execve("/tmp/dummy") triggers a call to
      request_module("binfmt-0000") from search_binary_handler() which in turn
      calls call_usermodehelper(UMH_WAIT_PROC).
      
      In order to avoid the deadlock, as a for-now and easy-to-backport
      solution, do not call wait_for_completion() in
      call_usermodehelper_exec() if the worker thread was created by the
      khelper thread with the CLONE_VFORK flag.  A more fundamental future
      solution might be to replace the singleton khelper thread with a
      workqueue, so that recursive calls up to a max_active dependency loop
      can be handled without deadlock.
      
      [akpm@linux-foundation.org: add comment to kmod_thread_locker]
      Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Acked-by: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      0f20784d
    • A
      kernel/kmod.c: document call_usermodehelper_fns() a bit · 79c743dd
      Andrew Morton committed
      This function's interface is, uh, subtle.  Attempt to apologise for it.
      
      Cc: WANG Cong <xiyou.wangcong@gmail.com>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Serge Hallyn <serge.hallyn@canonical.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      79c743dd
    • S
      fat: refactor shortname parsing · deb8274a
      Steven J. Magnani committed
      Nearly identical shortname parsing is performed in fat_search_long() and
      __fat_readdir().  Extract this code into a function that may be called by
      both.
      Signed-off-by: Steven J. Magnani <steve@digidescorp.com>
      Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      deb8274a
    • S
      fat: accessors for msdos_dir_entry 'start' fields · a943ed71
      Steven J. Magnani committed
      Simplify code by providing accessor functions for the directory entry
      start cluster fields.
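
      A sketch of what such accessors might look like, assuming the start
      cluster is split into a low 16-bit field and a high 16-bit field that
      is only meaningful on FAT32.  The demo_* names and the struct are
      hypothetical, and on-disk little-endian conversion is omitted for
      brevity.

```c
#include <stdint.h>

/* Hypothetical, simplified directory entry holding only the start fields. */
struct demo_dir_entry {
	uint16_t starthi;  /* high 16 bits of the start cluster (FAT32 only) */
	uint16_t start;    /* low 16 bits of the start cluster */
};

/* Reads the start cluster; the high half is used only when fat_bits is 32. */
static uint32_t demo_get_start(const struct demo_dir_entry *de, int fat_bits)
{
	uint32_t cluster = de->start;

	if (fat_bits == 32)
		cluster |= (uint32_t)de->starthi << 16;
	return cluster;
}

/* Writes both halves of the start cluster. */
static void demo_set_start(struct demo_dir_entry *de, uint32_t cluster)
{
	de->start = (uint16_t)(cluster & 0xffff);
	de->starthi = (uint16_t)(cluster >> 16);
}
```

      Callers then no longer need to repeat the shift-and-mask logic at each
      site that reads or writes the start cluster.
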
      Signed-off-by: Steven J. Magnani <steve@digidescorp.com>
      Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      a943ed71
    • N
      hfsplus: use -ENOMEM when kzalloc() fails · 497d48bd
      Namjae Jeon committed
      Return -ENOMEM instead of -EINVAL when kzalloc() fails.
      Signed-off-by: Namjae Jeon <linkinjeon@gmail.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      497d48bd
    • V
      nilfs2: add omitted comments for different structures in driver implementation · f5974c8f
      Vyacheslav Dubeyko committed
      Add the missing comments for several structures in the driver
      implementation.
      Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com>
      Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      f5974c8f