1. 16 Oct, 2014 1 commit
  2. 15 Oct, 2014 39 commits
    • L
      Merge branch 'for-3.18-consistent-ops' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu · 0429fbc0
      Linus Torvalds committed
      Pull percpu consistent-ops changes from Tejun Heo:
       "Way back, before the current percpu allocator was implemented, static
        and dynamic percpu memory areas were allocated and handled separately
        and had their own accessors.  The distinction has been gone for many
        years now; however, the now duplicate two sets of accessors remained
        with the pointer based ones - this_cpu_*() - evolving various other
        operations over time.  During the process, we also accumulated other
        inconsistent operations.
      
        This pull request contains Christoph's patches to clean up the
        duplicate accessor situation.  __get_cpu_var() uses are replaced with
        this_cpu_ptr() and __this_cpu_ptr() with raw_cpu_ptr().
      
        Unfortunately, the former sometimes is tricky thanks to C being a bit
        messy with the distinction between lvalues and pointers, which led to
        a rather ugly solution for cpumask_var_t involving the introduction of
        this_cpu_cpumask_var_ptr().
      
        This converts most of the uses but not all.  Christoph will follow up
        with the remaining conversions in this merge window and hopefully
        remove the obsolete accessors"
      
      * 'for-3.18-consistent-ops' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (38 commits)
        irqchip: Properly fetch the per cpu offset
        percpu: Resolve ambiguities in __get_cpu_var/cpumask_var_t -fix
        ia64: sn_nodepda cannot be assigned to after this_cpu conversion. Use __this_cpu_write.
        percpu: Resolve ambiguities in __get_cpu_var/cpumask_var_t
        Revert "powerpc: Replace __get_cpu_var uses"
        percpu: Remove __this_cpu_ptr
        clocksource: Replace __this_cpu_ptr with raw_cpu_ptr
        sparc: Replace __get_cpu_var uses
        avr32: Replace __get_cpu_var with __this_cpu_write
        blackfin: Replace __get_cpu_var uses
        tile: Use this_cpu_ptr() for hardware counters
        tile: Replace __get_cpu_var uses
        powerpc: Replace __get_cpu_var uses
        alpha: Replace __get_cpu_var
        ia64: Replace __get_cpu_var uses
        s390: cio driver &__get_cpu_var replacements
        s390: Replace __get_cpu_var uses
        mips: Replace __get_cpu_var uses
        MIPS: Replace __get_cpu_var uses in FPU emulator.
        arm: Replace __this_cpu_ptr with raw_cpu_ptr
        ...
      0429fbc0
    • L
      Merge tag 'llvmlinux-for-v3.18' of git://git.linuxfoundation.org/llvmlinux/kernel · 6929c358
      Linus Torvalds committed
      Pull LLVM updates from Behan Webster:
       "These patches remove the use of VLAIS using a new SHASH_DESC_ON_STACK
        macro.
      
        Some of the previously accepted VLAIS removal patches haven't used
        this macro.  I will push new patches to consistently use this macro in
        all those older cases for 3.19"
      
      [ More LLVM patches coming in through subsystem trees, and LLVM itself
        needs some fixes that are already in many distributions but not in
        released versions of LLVM.  Some day this will all "just work"  - Linus ]
      
      * tag 'llvmlinux-for-v3.18' of git://git.linuxfoundation.org/llvmlinux/kernel:
        crypto: LLVMLinux: Remove VLAIS usage from crypto/testmgr.c
        security, crypto: LLVMLinux: Remove VLAIS from ima_crypto.c
        crypto: LLVMLinux: Remove VLAIS usage from libcrc32c.c
        crypto: LLVMLinux: Remove VLAIS usage from crypto/hmac.c
        crypto, dm: LLVMLinux: Remove VLAIS usage from dm-crypt
        crypto: LLVMLinux: Remove VLAIS from crypto/.../qat_algs.c
        crypto: LLVMLinux: Remove VLAIS from crypto/omap_sham.c
        crypto: LLVMLinux: Remove VLAIS from crypto/n2_core.c
        crypto: LLVMLinux: Remove VLAIS from crypto/mv_cesa.c
        crypto: LLVMLinux: Remove VLAIS from crypto/ccp/ccp-crypto-sha.c
        btrfs: LLVMLinux: Remove VLAIS
        crypto: LLVMLinux: Add macro to remove use of VLAIS in crypto code
      6929c358
    • L
      Merge tag 'iommu-updates-v3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu · 23971bdf
      Linus Torvalds committed
      Pull IOMMU updates from Joerg Roedel:
       "This pull-request includes:
      
         - change in the IOMMU-API to convert the former iommu_domain_capable
           function to just iommu_capable
      
         - various fixes in handling RMRR ranges for the VT-d driver (one fix
           requires a device driver core change which was acked by Greg KH)
      
         - the AMD IOMMU driver now assigns and deassigns complete alias
           groups to fix issues with devices using the wrong PCI request-id
      
         - MMU-401 support for the ARM SMMU driver
      
         - multi-master IOMMU group support for the ARM SMMU driver
      
         - various other small fixes all over the place"
      
      * tag 'iommu-updates-v3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (41 commits)
        iommu/vt-d: Work around broken RMRR firmware entries
        iommu/vt-d: Store bus information in RMRR PCI device path
        iommu/vt-d: Only remove domain when device is removed
        driver core: Add BUS_NOTIFY_REMOVED_DEVICE event
        iommu/amd: Fix devid mapping for ivrs_ioapic override
        iommu/irq_remapping: Fix the regression of hpet irq remapping
        iommu: Fix bus notifier breakage
        iommu/amd: Split init_iommu_group() from iommu_init_device()
        iommu: Rework iommu_group_get_for_pci_dev()
        iommu: Make of_device_id array const
        amd_iommu: do not dereference a NULL pointer address.
        iommu/omap: Remove omap_iommu unused owner field
        iommu: Remove iommu_domain_has_cap() API function
        IB/usnic: Convert to use new iommu_capable() API function
        vfio: Convert to use new iommu_capable() API function
        kvm: iommu: Convert to use new iommu_capable() API function
        iommu/tegra: Convert to iommu_capable() API function
        iommu/msm: Convert to iommu_capable() API function
        iommu/vt-d: Convert to iommu_capable() API function
        iommu/fsl: Convert to iommu_capable() API function
        ...
      23971bdf
    • L
      Merge tag 'clk-for-linus-3.18' of git://git.linaro.org/people/mike.turquette/linux · c0fa2373
      Linus Torvalds committed
      Pull clock tree updates from Mike Turquette:
       "The clk tree changes for 3.18 are dominated by clock drivers.  Mostly
        fixes and enhancements to existing drivers as well as new drivers.
        This tag contains a bit more arch code than I usually take due to some
        OMAP2+ changes.  Additionally it contains the restart notifier
        handlers which are merged as a dependency into several trees.
      
        The PXA changes are the only messy part.  Due to having a stable tree
        I had to revert one patch and follow up with one more fix near the tip
        of this tag.  Some dead code is introduced but it will soon become
        live code after 3.18-rc1 is released as the rest of the PXA family is
        converted over to the common clock framework.
      
        Another trend in this tag is that multiple vendors have started to
        push the complexity of changing their CPU frequency into the clock
        driver, whereas this used to be done in CPUfreq drivers.
      
        Changes to the clk core include a generic gpio-clock type and a
        clk_set_phase() function added to the top-level clk.h api.  Due to
        some confusion on the fbdev mailing list the kernel boot parameters
        documentation was updated to further explain the clk_ignore_unused
        parameter, which is often required by users of the simplefb driver.
      
        Finally, some fixes to the locking around the clock debugfs stuff were
        made to prevent deadlocks when interacting with other subsystems."
      
      * tag 'clk-for-linus-3.18' of git://git.linaro.org/people/mike.turquette/linux: (99 commits)
        clk: pxa clocks build system fix
        Revert "arm: pxa: Transition pxa27x to clk framework"
        clk: samsung: register restart handlers for s3c2412 and s3c2443
        clk: rockchip: add restart handler
        clk: rockchip: rk3288: i2s_frac adds flag to set parent's rate
        doc/kernel-parameters.txt: clarify clk_ignore_unused
        arm: pxa: Transition pxa27x to clk framework
        dts: add devicetree bindings for pxa27x clocks
        clk: add pxa27x clock drivers
        arm: pxa: add clock pll selection bits
        clk: dts: document pxa clock binding
        clk: add pxa clocks infrastructure
        clk: gpio-gate: Ensure gpiod_ APIs are prototyped
        clk: ti: dra7-atl-clock: Mark the device as pm_runtime_irq_safe
        clk: ti: LLVMLinux: Move __init outside of type definition
        clk: ti: consider the fact that of_clk_get() might return an error
        clk: ti: dra7-atl-clock: fix a memory leak
        clk: ti: change clock init to use generic of_clk_init
        clk: hix5hd2: add I2C clocks
        clk: hix5hd2: add watchdog0 clocks
        ...
      c0fa2373
    • L
      Merge tag 'mfd-for-linus-3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd · fcc3a5d2
      Linus Torvalds committed
      Pull MFD updates from Lee Jones:
       "Changes to existing drivers:
        - DT clean-ups in da9055-core, max14577, rn5t618, arizona, hi6421, stmpe, twl4030
        - Export symbols for use in modules in max14577
        - Plenty of static code analysis/Coccinelle fixes throughout the SS
        - Regmap clean-ups in arizona, wm5102, wm5110, da9052, tps65217, rk808
        - Remove unused/duplicate code in da9052, 88pm860x, ti_ssp, lpc_sch, arizona
        - Bug fixes in ti_am335x_tscadc, da9052, ti_am335x_tscadc, rtsx_pcr
        - IRQ fixups in arizona, stmpe, max14577
        - Regulator related changes in axp20x
        - Pass DMA coherency information from parent => child in MFD core
        - Rename DT document files for consistency
        - Add ACPI support to the MFD core
        - Add Andreas Werner to MAINTAINERS for MEN F21BMC
      
       New drivers/supported devices:
        - New driver for MEN 14F021P00 Board Management Controller
        - New driver for Ricoh RN5T618 PMIC
        - New driver for Rockchip RK808
        - New driver for HiSilicon Hi6421 PMIC
        - New driver for Qualcomm SPMI PMICs
        - Add support for Intel Braswell in lpc_ich
        - Add support for Intel 9 Series PCH in lpc_ich
        - Add support for Intel Quark ILB in lpc_sch"
      
      [ Delayed to after the power/reset pull due to Kconfig problems with
        recursive Kconfig select/depends-on chains.   - Linus ]
      
      * tag 'mfd-for-linus-3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: (79 commits)
        mfd: cros_ec: wait for completion of commands that return IN_PROGRESS
        i2c: i2c-cros-ec-tunnel: Set retries to 3
        mfd: cros_ec: move locking into cros_ec_cmd_xfer
        mfd: cros_ec: stop calling ->cmd_xfer() directly
        mfd: cros_ec: Delay for 50ms when we see EC_CMD_REBOOT_EC
        MAINTAINERS: Adds Andreas Werner to maintainers list for MEN F21BMC
        mfd: arizona: Correct mask to allow setting micbias external cap
        mfd: Add ACPI support
        Revert "mfd: wm5102: Manually apply register patch"
        mfd: ti_am335x_tscadc: Update logic in CTRL register for 5-wire TS
        mfd: dt-bindings: atmel-gpbr: Rename doc file to conform to naming convention
        mfd: dt-bindings: qcom-pm8xxx: Rename doc file to conform to naming convention
        mfd: Inherit coherent_dma_mask from parent device
        mfd: Document DT bindings for Qualcomm SPMI PMICs
        mfd: Add support for Qualcomm SPMI PMICs
        mfd: dt-bindings: pm8xxx: Add new compatible string
        mfd: axp209x: Drop the parent supplies field
        mfd: twl4030-power: Use 'ti,system-power-controller' as alternative way to support system power off
        mfd: dt-bindings: twl4030-power: Use the standard property to mark power control
        mfd: syscon: Add Atmel GPBR DT bindings documention
        ...
      fcc3a5d2
    • L
      Merge tag 'for-v3.18' of git://git.infradead.org/battery-2.6 · 50fa8617
      Linus Torvalds committed
      Pull power supply and reset updates from Sebastian Reichel:
       - Initial support for the following chips
         * max77836 (charger)
         * max14577 (charger)
         * bq27742 (battery gauge)
         * ltc2952 (poweroff)
         * stih416 (restart)
         * syscon-reboot (restart)
         * gpio-restart (restart)
       - cleanup of power supply core
       - misc fixes in power supply and reset drivers
      
      * tag 'for-v3.18' of git://git.infradead.org/battery-2.6: (48 commits)
        power: ab8500_fg: Fix build warning
        Documentation: charger: max14577: Update the date of introducing ABI
        power: reset: corrections for simple syscon reboot driver
        Documentation: power: reset: Add documentation for generic SYSCON reboot driver
        power: reset: Add generic SYSCON register mapped reset
        bq27x00_battery: Fix flag reading for bq27742
        power: reset: use restart_notifier mechanism for msm-poweroff
        power: Add simple gpio-restart driver
        power: reset: st: Provide DT bindings for ST's Power Reset driver
        power: reset: Add restart functionality for STiH41x platforms
        power: charger-manager: Fix NULL pointer exception with missing cm-fuel-gauge
        power: max14577: Fix circular config SYSFS dependency
        power: gpio-charger: do not use gpio value directly
        power: max8925: Use of_get_child_by_name
        power: max8925: Fix NULL ptr dereference on memory allocation failure
        bq27x00_battery: Add support to bq27742
        Documentation: charger: max14577: Document exported sysfs entry
        devicetree: mfd: max14577: Add device tree bindings document
        power: max17040: Add ID for MAX77836 Fuel Gauge block
        charger: max14577: Configure battery-dependent settings from DTS and sysfs
        ...
      
      Conflicts:
      	drivers/power/reset/Kconfig
      	drivers/power/reset/Makefile
      50fa8617
    • L
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client · 6b049081
      Linus Torvalds committed
      Pull Ceph updates from Sage Weil:
       "There is the long-awaited discard support for RBD (Guangliang Zhao,
        Josh Durgin), a pile of RBD bug fixes that didn't belong in late -rc's
        (Ilya Dryomov, Li RongQing), a pile of fs/ceph bug fixes and
        performance and debugging improvements (Yan, Zheng, John Spray), and a
        smattering of cleanups (Chao Yu, Fabian Frederick, Joe Perches)"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (40 commits)
        ceph: fix divide-by-zero in __validate_layout()
        rbd: rbd workqueues need a resque worker
        libceph: ceph-msgr workqueue needs a resque worker
        ceph: fix bool assignments
        libceph: separate multiple ops with commas in debugfs output
        libceph: sync osd op definitions in rados.h
        libceph: remove redundant declaration
        ceph: additional debugfs output
        ceph: export ceph_session_state_name function
        ceph: include the initial ACL in create/mkdir/mknod MDS requests
        ceph: use pagelist to present MDS request data
        libceph: reference counting pagelist
        ceph: fix llistxattr on symlink
        ceph: send client metadata to MDS
        ceph: remove redundant code for max file size verification
        ceph: remove redundant io_iter_advance()
        ceph: move ceph_find_inode() outside the s_mutex
        ceph: request xattrs if xattr_version is zero
        rbd: set the remaining discard properties to enable support
        rbd: use helpers to handle discard for layered images correctly
        ...
      6b049081
    • L
      Merge branch 'CVE-2014-7970' of git://git.kernel.org/pub/scm/linux/kernel/git/luto/linux · ce9d7f7b
      Linus Torvalds committed
      Pull pivot_root() fix from Andy Lutomirski.
      
      Prevent a leak of unreachable mounts.
      
      * 'CVE-2014-7970' of git://git.kernel.org/pub/scm/linux/kernel/git/luto/linux:
        mnt: Prevent pivot_root from creating a loop in the mount tree
      ce9d7f7b
    • E
      mnt: Prevent pivot_root from creating a loop in the mount tree · 0d082601
      Eric W. Biederman committed
      Andy Lutomirski recently demonstrated that when chroot is used to set
      the root path below the path for the new ``root'' passed to pivot_root
      the pivot_root system call succeeds and leaks mounts.
      
      In examining the code I see that starting with a new root that is
      below the current root in the mount tree will result in a loop in the
      mount tree after the mounts are detached and then reattached to one
      another.  This results in all kinds of ugliness, including a leak of
      the mounts involved in the mount loop.
      
      Prevent this problem by ensuring that the new mount is reachable from
      the current root of the mount tree.
      
      [Added stable cc.  Fixes CVE-2014-7970.  --Andy]
      
      Cc: stable@vger.kernel.org
      Reported-by: Andy Lutomirski <luto@amacapital.net>
      Reviewed-by: Andy Lutomirski <luto@amacapital.net>
      Link: http://lkml.kernel.org/r/87bnpmihks.fsf@x220.int.ebiederm.org
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: Andy Lutomirski <luto@amacapital.net>
      0d082601
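The reachability condition described in this commit can be illustrated with a toy mount tree. This is only a sketch: the `Mount` class and `is_reachable()` helper are hypothetical names invented for illustration and do not mirror the kernel's data structures or the actual patch.

```python
class Mount:
    """Toy mount-tree node (hypothetical; not the kernel's struct mount)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent


def is_reachable(mount, root):
    """Return True if `mount` is `root` itself or lies below it in the tree."""
    while mount is not None:
        if mount is root:
            return True
        mount = mount.parent
    return False


# Tree: rootfs -> a -> b, with the current root moved down to `a` by chroot.
rootfs = Mount("rootfs")
a = Mount("a", rootfs)
b = Mount("b", a)

# A new root below the current root is acceptable.
print(is_reachable(b, a))
# A new root above the current root is not reachable and would be refused.
print(is_reachable(rootfs, a))
```

Under this model, a pivot_root-like operation would be refused whenever the proposed new root is not reachable from the current root, which is the situation the commit message describes.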
    • Y
      ceph: fix divide-by-zero in __validate_layout() · 0bc62284
      Yan, Zheng committed
      The 'stripe_unit' field is 64 bits; casting it to 32 bits can result in zero.
      Signed-off-by: Yan, Zheng <zyan@redhat.com>
      0bc62284
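The truncation hazard can be reproduced outside the kernel. The sketch below emulates a C cast from 64 to 32 bits with a bit mask; the function names and the alignment check are assumptions made for illustration and are not the code in __validate_layout().

```python
U32_MASK = 0xFFFFFFFF


def truncate_to_u32(value):
    """Emulate a C cast from a 64-bit unsigned value to a 32-bit one."""
    return value & U32_MASK


def is_aligned_unsafe(object_size, stripe_unit):
    # Truncating first can turn a non-zero 64-bit divisor into zero,
    # so this raises ZeroDivisionError for stripe_unit == 1 << 32.
    return object_size % truncate_to_u32(stripe_unit) == 0


def is_aligned_safe(object_size, stripe_unit):
    # Reject a zero divisor and divide by the full-width value instead.
    if stripe_unit == 0:
        return False
    return object_size % stripe_unit == 0
```

Any stripe unit that is a multiple of 2^32 truncates to zero, so the unsafe variant fails exactly where the commit message says it can.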
    • I
      rbd: rbd workqueues need a resque worker · 792c3a91
      Ilya Dryomov committed
      Need to use WQ_MEM_RECLAIM for our workqueues to prevent I/O lockups
      under memory pressure - we sit on the memory reclaim path.
      
      Cc: stable@vger.kernel.org # 3.17, needs backporting for 3.16
      Signed-off-by: Ilya Dryomov <idryomov@redhat.com>
      Tested-by: Micha Krause <micha@krausam.de>
      Reviewed-by: Sage Weil <sage@redhat.com>
      792c3a91
    • I
      libceph: ceph-msgr workqueue needs a resque worker · f9865f06
      Ilya Dryomov committed
      Commit f363e45f ("net/ceph: make ceph_msgr_wq non-reentrant")
      effectively removed the WQ_MEM_RECLAIM flag from ceph_msgr_wq.  This is
      wrong - libceph is very much a memory reclaim path, so restore it.
      
      Cc: stable@vger.kernel.org # needs backporting for < 3.12
      Signed-off-by: Ilya Dryomov <idryomov@redhat.com>
      Tested-by: Micha Krause <micha@krausam.de>
      Reviewed-by: Sage Weil <sage@redhat.com>
      f9865f06
    • F
      ceph: fix bool assignments · ab6c2c3e
      Fabian Frederick committed
      Fix some coccinelle warnings:
      fs/ceph/caps.c:2400:6-10: WARNING: Assignment of bool to 0/1
      fs/ceph/caps.c:2401:6-15: WARNING: Assignment of bool to 0/1
      fs/ceph/caps.c:2402:6-17: WARNING: Assignment of bool to 0/1
      fs/ceph/caps.c:2403:6-22: WARNING: Assignment of bool to 0/1
      fs/ceph/caps.c:2404:6-22: WARNING: Assignment of bool to 0/1
      fs/ceph/caps.c:2405:6-19: WARNING: Assignment of bool to 0/1
      fs/ceph/caps.c:2440:4-20: WARNING: Assignment of bool to 0/1
      fs/ceph/caps.c:2469:3-16: WARNING: Assignment of bool to 0/1
      fs/ceph/caps.c:2490:2-18: WARNING: Assignment of bool to 0/1
      fs/ceph/caps.c:2519:3-7: WARNING: Assignment of bool to 0/1
      fs/ceph/caps.c:2549:3-12: WARNING: Assignment of bool to 0/1
      fs/ceph/caps.c:2575:2-6: WARNING: Assignment of bool to 0/1
      fs/ceph/caps.c:2589:3-7: WARNING: Assignment of bool to 0/1
      Signed-off-by: Fabian Frederick <fabf@skynet.be>
      Signed-off-by: Ilya Dryomov <idryomov@redhat.com>
      ab6c2c3e
    • I
      libceph: separate multiple ops with commas in debugfs output · 25f89777
      Ilya Dryomov committed
      For requests with multiple ops, separate ops with commas instead of \t,
      which is a field separator here.
      Signed-off-by: Ilya Dryomov <idryomov@redhat.com>
      Reviewed-by: Sage Weil <sage@redhat.com>
      25f89777
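Why the separator matters can be seen with a small formatting sketch. The field layout here (a transaction id, an OSD name, then the op list) is an assumption chosen for illustration, not the real debugfs format.

```python
def format_request_line(tid, osd, ops):
    """Build one tab-separated line; ops are joined with commas so that
    a tab always marks a field boundary (field layout is hypothetical)."""
    fields = [str(tid), "osd%d" % osd, ",".join(ops)]
    return "\t".join(fields)


line = format_request_line(42, 3, ["write", "setallochint"])
# Splitting on tabs yields exactly three fields, however many ops there are.
print(line.split("\t"))
```

If the ops were joined with tabs instead, a consumer splitting on tabs could not tell where the op list ends.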
    • I
      libceph: sync osd op definitions in rados.h · 70b5bfa3
      Ilya Dryomov committed
      Bring in missing osd ops and strings, use macros to eliminate multiple
      points of maintenance.
      Signed-off-by: Ilya Dryomov <idryomov@redhat.com>
      Reviewed-by: Sage Weil <sage@redhat.com>
      70b5bfa3
    • F
      libceph: remove redundant declaration · eb179d39
      Fabian Frederick committed
      ceph_release_page_vector was declared twice in libceph.h
      Signed-off-by: Fabian Frederick <fabf@skynet.be>
      Signed-off-by: Ilya Dryomov <idryomov@redhat.com>
      eb179d39
    • J
      ceph: additional debugfs output · 14ed9703
      John Spray committed
      MDS session state and client global ID are
      useful instrumentation when testing.
      Signed-off-by: John Spray <john.spray@redhat.com>
      14ed9703
    • J
      ceph: export ceph_session_state_name function · a687ecaf
      John Spray committed
      ...so that it can be used from the ceph debugfs
      code when dumping session info.
      Signed-off-by: John Spray <john.spray@redhat.com>
      a687ecaf
    • Y
      ceph: include the initial ACL in create/mkdir/mknod MDS requests · b1ee94aa
      Yan, Zheng committed
      Current code sets a new file/directory's initial ACL in a non-atomic
      manner.
      The client first sends a request to the MDS to create the new
      file/directory, then sets the initial ACL after the new file/directory
      has been successfully created.
      
      The fix is to include the initial ACL in create/mkdir/mknod MDS requests,
      so the MDS can handle creating the file/directory and setting the initial
      ACL in one request.
      Signed-off-by: Yan, Zheng <zyan@redhat.com>
      Reviewed-by: Sage Weil <sage@redhat.com>
      b1ee94aa
    • Y
      ceph: use pagelist to present MDS request data · 25e6bae3
      Yan, Zheng committed
      Current code uses a page array to present MDS request data. Pages in the
      array are allocated/freed by the caller of ceph_mdsc_do_request(). If the
      request is interrupted, the pages can be freed while they are still being
      used by the request message.
      
      The fix is to use a pagelist to present MDS request data. A pagelist is
      reference counted.
      Signed-off-by: Yan, Zheng <zyan@redhat.com>
      Reviewed-by: Sage Weil <sage@redhat.com>
      25e6bae3
    • Y
      libceph: reference counting pagelist · e4339d28
      Yan, Zheng committed
      This allows a pagelist to present data that may be sent multiple times.
      Signed-off-by: Yan, Zheng <zyan@redhat.com>
      Reviewed-by: Sage Weil <sage@redhat.com>
      e4339d28
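The lifetime rule that reference counting provides can be sketched as follows. `RefCountedPagelist` and its methods are hypothetical stand-ins invented for this illustration; they are not the libceph API, and the sketch ignores the atomicity a real implementation would need.

```python
class RefCountedPagelist:
    """Toy reference-counted buffer list (hypothetical; not libceph's API)."""

    def __init__(self):
        self.pages = []
        self.refcount = 1  # the creator holds the first reference
        self.released = False

    def append(self, data):
        self.pages.append(data)

    def get(self):
        """Take an additional reference."""
        if self.released:
            raise RuntimeError("pagelist already released")
        self.refcount += 1
        return self

    def put(self):
        """Drop one reference; free the pages when the last one goes away."""
        if self.released:
            raise RuntimeError("pagelist already released")
        self.refcount -= 1
        if self.refcount == 0:
            self.pages.clear()
            self.released = True


pagelist = RefCountedPagelist()
pagelist.append(b"request data")
message_ref = pagelist.get()  # the in-flight message keeps its own reference
pagelist.put()                # the caller is interrupted and drops its reference
print(pagelist.released)      # the data is still alive for the message
message_ref.put()             # the message finishes and drops the last reference
print(pagelist.released)
```

The point of the design is that the interrupted caller can no longer free data the request message is still using; the pages go away only when the last holder drops its reference.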
    • Y
      ceph: fix llistxattr on symlink · 0abb43dc
      Yan, Zheng committed
      Only regular files and directories have vxattrs.
      Signed-off-by: Yan, Zheng <zyan@redhat.com>
      0abb43dc
    • J
      ceph: send client metadata to MDS · dbd0c8bf
      John Spray committed
      Implement version 2 of CEPH_MSG_CLIENT_SESSION syntax,
      which includes additional client metadata to allow
      the MDS to report on clients by user-sensible names
      like hostname.
      Signed-off-by: John Spray <john.spray@redhat.com>
      Reviewed-by: Yan, Zheng <zyan@redhat.com>
      dbd0c8bf
    • C
      ceph: remove redundant code for max file size verification · a4483e8a
      Chao Yu committed
      Both ceph_update_writeable_page and ceph_setattr verify the file size
      against the maximum size ceph supports.
      There are two callers of ceph_update_writeable_page: ceph_write_begin and
      ceph_page_mkwrite. For ceph_write_begin, we have already verified the size in
      generic_write_checks of ceph_write_iter; for ceph_page_mkwrite, the file size
      cannot change through mmap. Likewise, we have already verified the size
      in inode_change_ok when we call ceph_setattr.
      So let's remove the redundant code for max file size verification.
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Reviewed-by: Yan, Zheng <zyan@redhat.com>
      a4483e8a
    • Y
      ceph: remove redundant io_iter_advance() · 3b70b388
      Yan, Zheng committed
      ceph_sync_read and generic_file_read_iter() have already advanced the
      IO iterator.
      Signed-off-by: Yan, Zheng <zyan@redhat.com>
      3b70b388
    • Y
      ceph: move ceph_find_inode() outside the s_mutex · 6cd3bcad
      Yan, Zheng committed
      ceph_find_inode() may wait on an inode that is being freed, so using it
      inside the s_mutex may cause a deadlock (the inode being freed is waiting
      for an OSD read reply, but the dispatch thread is blocked by the s_mutex).
      Signed-off-by: Yan, Zheng <zyan@redhat.com>
      Reviewed-by: Sage Weil <sage@redhat.com>
      6cd3bcad
    • Y
      ceph: request xattrs if xattr_version is zero · 508b32d8
      Yan, Zheng committed
      The following sequence of events can happen.
        - Client releases an inode, queues cap release message.
        - A 'lookup' reply brings the same inode back, but the reply
          doesn't contain xattrs because the MDS didn't receive the cap release
          message and thought the client already had up-to-date xattrs.
      
      The fix is to force sending a getattr request to the MDS if xattr_version
      is 0. The getattr mask is set to CEPH_STAT_CAP_XATTR, so the MDS knows the
      client does not have xattrs.
      Signed-off-by: Yan, Zheng <zyan@redhat.com>
      508b32d8
    • J
      rbd: set the remaining discard properties to enable support · b76f8239
      Josh Durgin committed
      max_discard_sectors must be set for the queue to support discard.
      Operations implementing discard for rbd zero data, so report that.
      Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
      b76f8239
    • J
      rbd: use helpers to handle discard for layered images correctly · d3246fb0
      Josh Durgin committed
      Only allocate two osd ops for discard requests, since the
      preallocation hint is only added for regular writes.  Use
      rbd_img_obj_request_fill() to recreate the original write or discard
      osd operations, isolating that logic to one place, and change the
      assert in rbd_osd_req_create_copyup() to accept discard requests as
      well.
      Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
      d3246fb0
    • J
      rbd: extract a method for adding object operations · 3b434a2a
      Josh Durgin committed
      rbd_img_request_fill() creates a ceph_osd_request and has logic for
      adding the appropriate osd ops to it based on the request type and
      image properties.
      
      For layered images, the original rbd_obj_request is resent with a
      copyup operation in front, using a new ceph_osd_request. The logic for
      adding the original operations should be the same as when first
      sending them, so move it to a helper function.
      
      op_type only needs to be checked once, so create a helper for that as
      well and call it outside the loop in rbd_img_request_fill().
      Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
      3b434a2a
    • J
      rbd: make discard trigger copy-on-write · 1c220881
      Josh Durgin committed
      Discard requests are a form of write, so they should go through the
      same process as plain write requests and trigger copy-on-write for
      layered images.
      Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
      1c220881
    • J
      rbd: tolerate -ENOENT for discard operations · d0265de7
      Josh Durgin committed
      Discard may try to delete an object that does not exist from a non-layered
      image. If this occurs, the image already has no data in that range, so
      change the result to success.
      Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
      d0265de7
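The result handling described here amounts to mapping one error code to success. A minimal sketch, assuming a hypothetical `finish_discard()` completion helper that receives a negative errno on failure; the real driver's completion path is structured differently.

```python
import errno


def finish_discard(result):
    """Map "object not found" to success for a discard (hypothetical helper).

    A missing object already holds no data in the discarded range,
    so there is nothing left to do.
    """
    if result == -errno.ENOENT:
        return 0
    return result
```

Other errors, such as -EIO, pass through unchanged, so only the "nothing to discard" case is treated as success.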
    • J
      rbd: fix snapshot context reference count for discards · bef95455
      Josh Durgin committed
      Discards take a reference to the snapshot context of an image when
      they are created.  This reference needs to be cleaned up when the
      request is done just as it is for regular writes.
      Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
      bef95455
    • J
      rbd: read image size for discard check safely · 3c5df893
      Josh Durgin committed
      In rbd_img_request_fill() the image size is only checked to determine
      whether we can truncate an object instead of zeroing it for discard
      requests. Take rbd_dev->header_rwsem while reading the image size, and
      move this read into the discard check, so that non-discard ops don't
      need to take the semaphore in this function.
      Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
      3c5df893
    • G
      rbd: initial discard bits from Guangliang Zhao · 90e98c52
      Guangliang Zhao committed
      This patch adds discard support to the rbd driver.
      
      The driver handles a discard in one of three ways:
      1. Objects are removed if they are completely contained
         within the discard range.
      2. Objects are truncated if they are partly contained within
         the discard range and aligned with their boundary.
      3. All other objects are zeroed.
      
      A discard request from blkdev_issue_discard() is defined as having
      both REQ_WRITE and REQ_DISCARD set and carrying no data, so we must
      check REQ_DISCARD first when determining the request type.
      
      This resolves:
      	http://tracker.ceph.com/issues/190
      
      [ Ilya Dryomov: This is incomplete and somewhat buggy, see follow up
        commits by Josh Durgin for refinements and fixes which weren't
        folded in to preserve authorship. ]
      Signed-off-by: Guangliang Zhao <lucienchao@gmail.com>
      Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
      Reviewed-by: Alex Elder <elder@linaro.org>
      90e98c52
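The three cases listed in this commit can be sketched as a per-object classifier. This is an illustration only: `discard_action()` is a hypothetical helper, and reading "aligned with their boundary" as "the discarded range runs to the end of the object" is an assumption about the commit message, not something taken from the driver source.

```python
def discard_action(offset, length, object_size):
    """Classify how one object is handled for a discard.

    `offset` and `length` describe the part of the discard range that
    falls inside this object (hypothetical helper for illustration).
    """
    if offset < 0 or length <= 0 or offset + length > object_size:
        raise ValueError("range must lie within the object")
    if offset == 0 and length == object_size:
        return "remove"    # object completely inside the discard range
    if offset + length == object_size:
        return "truncate"  # assumed meaning: range runs to the object's end
    return "zero"          # any other partial overlap


object_size = 4096
print(discard_action(0, 4096, object_size))
print(discard_action(1024, 3072, object_size))
print(discard_action(1024, 1024, object_size))
```

A discard that spans several objects would be split per object first, so the first and last objects typically land in the truncate or zero cases while the objects in between are removed.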
    • G
      rbd: extend the operation type · 6d2940c8
      Guangliang Zhao committed
      The driver can currently handle only read and write operations;
      extend the operation type for the coming discard support.
      Signed-off-by: Guangliang Zhao <lucienchao@gmail.com>
      Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
      Reviewed-by: Alex Elder <elder@linaro.org>
      6d2940c8
    • G
      rbd: skip the copyup when an entire object writing · c622d226
      Guangliang Zhao committed
      A layered write needs to copy up the parent's content, but a write
      that covers an entire object would overwrite it, so skip the copyup.
      Signed-off-by: Guangliang Zhao <lucienchao@gmail.com>
      Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
      Reviewed-by: Alex Elder <elder@linaro.org>
      c622d226
    • I
      rbd: add img_obj_request_simple() helper · 70d045f6
      Ilya Dryomov committed
      To clarify the conditions and make it easier to add new ones.
      Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
      70d045f6
    • J
      rbd: access snapshot context and mapping size safely · 4e752f0a
      Josh Durgin committed
      These fields may both change while the image is mapped if a snapshot
      is created or deleted or the image is resized.  They are guarded by
      rbd_dev->header_rwsem, so hold that while reading them, and store a
      local copy to refer to outside of the critical section. The local copy
      will stay consistent since the snapshot context is reference counted,
      and the mapping size is just a u64. This prevents torn loads from
      giving us inconsistent values.
      
      Move reading header.snapc into the caller of rbd_img_request_create()
      so that we only need to take the semaphore once. The read-only caller,
      rbd_parent_request_create() can just pass NULL for snapc, since the
      snapshot context is only relevant for writes.
      Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
      4e752f0a