1. 31 Jul, 2016 21 commits
    • tcp: consider recv buf for the initial window scale · f626300a
      Soheil Hassas Yeganeh committed
      tcp_select_initial_window() intends to advertise a window
      scaling for the maximum possible window size. To do so,
      it considers the maximum of net.ipv4.tcp_rmem[2] and
      net.core.rmem_max as the only possible upper-bounds.
      However, users with CAP_NET_ADMIN can use SO_RCVBUFFORCE
      to set the socket's receive buffer size to values
      larger than net.ipv4.tcp_rmem[2] and net.core.rmem_max.
      Thus, SO_RCVBUFFORCE is effectively ignored by
      tcp_select_initial_window().
      
      To fix this, consider the maximum of net.ipv4.tcp_rmem[2],
      net.core.rmem_max and socket's initial buffer space.
      
      Fixes: b0573dea ("[NET]: Introduce SO_{SND,RCV}BUFFORCE socket options")
      Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
      Suggested-by: Neal Cardwell <ncardwell@google.com>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
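The window-scale choice described in the tcp commit above can be sketched in userspace C. This is a minimal illustration, not the kernel code: `pick_rcv_wscale` and its parameter names are hypothetical, the three inputs stand in for the socket's initial buffer space, net.ipv4.tcp_rmem[2] and net.core.rmem_max, and the window clamp that the kernel also applies is left out.

```c
#include <stdint.h>

/* Largest window that fits in the 16-bit TCP window field. */
#define MAX_UNSCALED_WINDOW 65535u
/* The window scale option allows a shift of at most 14. */
#define MAX_WSCALE 14

static uint32_t max_u32(uint32_t a, uint32_t b)
{
    return a > b ? a : b;
}

/*
 * Hypothetical helper: choose the receive window scale from the largest
 * of the three candidate upper bounds, so that a buffer forced above the
 * sysctl limits is still taken into account.
 */
static int pick_rcv_wscale(uint32_t sock_space, uint32_t tcp_rmem_max,
                           uint32_t core_rmem_max)
{
    uint32_t space = max_u32(sock_space, max_u32(tcp_rmem_max, core_rmem_max));
    int wscale = 0;

    /* Halve until the value fits in 16 bits or the shift limit is hit. */
    while (space > MAX_UNSCALED_WINDOW && wscale < MAX_WSCALE) {
        space >>= 1;
        wscale++;
    }
    return wscale;
}
```

With a 64 MiB buffer passed as `sock_space`, the sketch picks a shift of 11, whereas looking only at a 6 MiB tcp_rmem[2] would stop at 7.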
    • Merge branch 'macsec-fixes' · c27bdce2
      David S. Miller committed
      Sabrina Dubroca says:
      
      ====================
      macsec: reference counting fixes
      
      Patch 1 adds explicit reference counting on RXSCs, instead of the
      current implicit reference counting using the RXSA's refcount.
      
      Patch 2 fixes possible kernel panics during module unload caused by an
      RCU callback that schedules another RCU callback, which the
      rcu_barrier() added in b196c22a ("macsec: add rcu_barrier() on
      module exit") didn't protect against.
      
      Patch 3 fixes a refcounting issue with the underlying device for a
      macsec device when link creation fails.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • macsec: fix negative refcnt on parent link · 0759e552
      Sabrina Dubroca committed
      When creation of a macsec device fails because an identical device
      already exists on this link, the current code decrements the refcnt on
      the parent link (in ->destructor for the macsec device), but it had not
      been incremented yet.
      
      Move the dev_hold(parent_link) call earlier during macsec device
      creation.
      
      Fixes: c09440f7 ("macsec: introduce IEEE 802.1AE driver")
      Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
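The ordering problem in the macsec parent-link commit above can be shown with a toy counter. This is a sketch under stated assumptions, not the driver code: `struct link`, `create_child` and the `hold_early` switch are hypothetical, and the error path simply models a destructor that always drops one parent reference.

```c
struct link {
    int refcnt;
};

static void link_hold(struct link *l) { l->refcnt++; }
static void link_put(struct link *l)  { l->refcnt--; }

/*
 * Hypothetical device creation. hold_early selects the ordering in which
 * the parent reference is taken before any step that can fail.
 */
static int create_child(struct link *parent, int duplicate_exists,
                        int hold_early)
{
    if (hold_early)
        link_hold(parent);

    if (duplicate_exists) {
        /* Failure runs the destructor, which always puts the parent. */
        link_put(parent);
        return -1;
    }

    if (!hold_early)
        link_hold(parent);
    return 0;
}
```

With the late hold, a failed creation leaves the parent one reference short; with the early hold, the count is unchanged after the failure.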
    • macsec: RXSAs don't need to hold a reference on RXSCs · 36b232c8
      Sabrina Dubroca committed
      Following the previous patch, RXSCs are held and properly refcounted in
      the RX path (instead of being implicitly held by their SA), so the SA
      doesn't need to hold a reference on its parent RXSC.
      
      This also avoids panics on module unload caused by the double layer of
      RCU callbacks (call_rcu frees the RXSA, which puts the final reference
      on the RXSC and allows it to be freed in its own call_rcu) that commit
      b196c22a ("macsec: add rcu_barrier() on module exit") didn't
      protect against.
      There were also some refcounting bugs in macsec_add_rxsa where I didn't
      put the reference on the RXSC on the error paths, which would lead to
      memory leaks.
      
      Fixes: c09440f7 ("macsec: introduce IEEE 802.1AE driver")
      Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • macsec: fix reference counting on RXSC in macsec_handle_frame · c78ebe1d
      Sabrina Dubroca committed
      Currently, we look up the RXSC without taking a reference on it.  The
      RXSA holds a reference on the RXSC, but the SA and SC could still both
      disappear before we take a reference on the SA.
      
      Take a reference on the RXSC in macsec_handle_frame.
      
      Fixes: c09440f7 ("macsec: introduce IEEE 802.1AE driver")
      Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
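Taking a reference on an object found by lookup, as the macsec_handle_frame commit above describes, is commonly done with an "increment unless zero" operation so that an object already on its way to being freed is never revived. The following is a userspace sketch using C11 atomics; `struct obj`, `get_unless_zero` and `put_ref` are hypothetical names, and whether the driver uses exactly this primitive is an assumption.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct obj {
    atomic_int refcnt;
};

/* Take a reference only while the object is still live (count > 0). */
static bool get_unless_zero(struct obj *o)
{
    int old = atomic_load(&o->refcnt);

    while (old != 0) {
        /* On failure, old is reloaded with the current value. */
        if (atomic_compare_exchange_weak(&o->refcnt, &old, old + 1))
            return true;
    }
    return false;
}

/* Drop a reference; returns true when this was the last one. */
static bool put_ref(struct obj *o)
{
    return atomic_fetch_sub(&o->refcnt, 1) == 1;
}
```

A lookup path would call `get_unless_zero()` right after finding the object and skip the object if it returns false.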
    • Merge branch 'cpsw-fixes' · 122e9b71
      David S. Miller committed
      Grygorii Strashko says:
      
      ====================
      drivers: net: cpsw: fix driver loading/unloading
      
      This series fixes a set of issues observed when the CPSW driver module is unloaded/loaded:
      1) rmmod: deadlock in cpdma_ctlr_destroy
      2) rmmod: L3 back-trace and crash if all net interfaces are down, because CPSW
      can be powered down by PM runtime in this case.
      3) insmod: mdio device is not recreated on next insmod
       - need to use of_platform_depopulate() in cpsw_remove().
      4) rmmod: system crash on omap_device removal
      
      Tested on: am437x-idk, am57xx-beagle-x15
      
      Changes in v2:
      - build warning fixed
      - added fix for correct omap_device removal
      
      Link to v1:
       https://lkml.org/lkml/2016/7/22/240
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ARM: OMAP2+: omap_device: fix crash on omap_device removal · 213fa10d
      Grygorii Strashko committed
      The call chain below causes a system crash when an OMAP device is
      removed by calling of_platform_depopulate()/device_del():
      
      device_del()
      - blocking_notifier_call_chain(&dev->bus->p->bus_notifier,
                                     BUS_NOTIFY_DEL_DEVICE, dev);
        - _omap_device_notifier_call()
          - omap_device_delete()
            - od->pdev->archdata.od = NULL;
              kfree(od->hwmods);
              kfree(od);
        - bus_remove_device()
          - device_release_driver()
            - __device_release_driver()
              - pm_runtime_get_sync()
                - _od_runtime_resume()
                  - omap_hwmod_enable() <- OOPS: od is already deleted
      
      Backtrace:
      Unable to handle kernel NULL pointer dereference at virtual address 0000000d
      pgd = eb100000
      [0000000d] *pgd=ad6e1831, *pte=00000000, *ppte=00000000
      Internal error: Oops: 17 [#1] PREEMPT SMP ARM
      CPU: 1 PID: 1273 Comm: modprobe Not tainted 4.4.15-rt19-00115-ge4d3cd3-dirty #68
      Hardware name: Generic DRA74X (Flattened Device Tree)
      task: eb1ee800 ti: ec962000 task.ti: ec962000
      PC is at omap_device_enable+0x10/0x90
      LR is at _od_runtime_resume+0x10/0x24
      [...]
      [<c00299dc>] (omap_device_enable) from [<c0029a6c>] (_od_runtime_resume+0x10/0x24)
      [<c0029a6c>] (_od_runtime_resume) from [<c04ad404>] (__rpm_callback+0x20/0x34)
      [<c04ad404>] (__rpm_callback) from [<c04ad438>] (rpm_callback+0x20/0x80)
      [<c04ad438>] (rpm_callback) from [<c04aee28>] (rpm_resume+0x48c/0x964)
      [<c04aee28>] (rpm_resume) from [<c04af360>] (__pm_runtime_resume+0x60/0x88)
      [<c04af360>] (__pm_runtime_resume) from [<c04a4974>] (__device_release_driver+0x30/0x100)
      [<c04a4974>] (__device_release_driver) from [<c04a4a60>] (device_release_driver+0x1c/0x28)
      [<c04a4a60>] (device_release_driver) from [<c04a38c0>] (bus_remove_device+0xec/0x144)
      [<c04a38c0>] (bus_remove_device) from [<c04a0764>] (device_del+0x10c/0x210)
      [<c04a0764>] (device_del) from [<c04a67b0>] (platform_device_del+0x18/0x84)
      [<c04a67b0>] (platform_device_del) from [<c04a6828>] (platform_device_unregister+0xc/0x20)
      [<c04a6828>] (platform_device_unregister) from [<c05adcfc>] (of_platform_device_destroy+0x8c/0x90)
      [<c05adcfc>] (of_platform_device_destroy) from [<c04a02f0>] (device_for_each_child+0x4c/0x78)
      [<c04a02f0>] (device_for_each_child) from [<c05adc5c>] (of_platform_depopulate+0x30/0x44)
      [<c05adc5c>] (of_platform_depopulate) from [<bf123920>] (cpsw_remove+0x68/0xf4 [ti_cpsw])
      [<bf123920>] (cpsw_remove [ti_cpsw]) from [<c04a68d8>] (platform_drv_remove+0x24/0x3c)
      [<c04a68d8>] (platform_drv_remove) from [<c04a49c8>] (__device_release_driver+0x84/0x100)
      [<c04a49c8>] (__device_release_driver) from [<c04a4b20>] (driver_detach+0xac/0xb0)
      [<c04a4b20>] (driver_detach) from [<c04a3be8>] (bus_remove_driver+0x60/0xd4)
      [<c04a3be8>] (bus_remove_driver) from [<c00d9870>] (SyS_delete_module+0x184/0x20c)
      [<c00d9870>] (SyS_delete_module) from [<c0010540>] (ret_fast_syscall+0x0/0x1c)
      Code: e3500000 e92d4070 1590630c 01a06000 (e5d6300d)
      
      Hence, fix it by using the BUS_NOTIFY_REMOVED_DEVICE event for OMAP device
      deletion, which is sent once DD has finished processing the device
      deletion.
      
      Cc: Tony Lindgren <tony@atomide.com>
      Cc: Tero Kristo <t-kristo@ti.com>
      Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • drivers: net: cpsw: use of_platform_depopulate() · 3bf2cb3a
      Grygorii Strashko committed
      Use of_platform_depopulate() in cpsw_remove() instead of
      of_device_unregister(), because CPSW child devices will not be
      recreated otherwise on the next insmod. of_platform_depopulate() is
      the correct way now, as it ensures that all steps done in
      of_platform_populate() are reverted, including clearing the
      OF_POPULATED flag.
      
      Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
      Reviewed-by: Mugunthan V N <mugunthanvnm@ti.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • drivers: net: cpsw: fix wrong regs access in cpsw_remove · 8a0b6dc9
      Grygorii Strashko committed
      An L3 error will be generated and the system will crash during unloading
      of the CPSW driver if CPSW is used as a module and the ethX devices are down.
      This happens because CPSW can now be powered off by PM runtime when the ethX
      devices are down.
      
      Hence, ensure that CPSW is powered up by PM runtime before performing any
      deinitialization actions which require CPSW register access. In case
      of a PM runtime error, just leave cpsw_remove(), as we can't do anything
      more.
      
      Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
      Reviewed-by: Mugunthan V N <mugunthanvnm@ti.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethernet: ti: cpdma: fix lockup in cpdma_ctlr_destroy() · fccd5bad
      Grygorii Strashko committed
      Fix deadlock in cpdma_ctlr_destroy() which is triggered now on
      cpsw module removal:
       cpsw_remove()
       - cpdma_ctlr_destroy()
         - spin_lock_irqsave(&ctlr->lock, flags)
         - cpdma_ctlr_stop()
           - spin_lock_irqsave(&ctlr->lock, flags);
         - cpdma_chan_destroy()
           - spin_lock_irqsave(&ctlr->lock, flags);
      
      The issue has not been observed before because CPDMA channels have
      been destroyed manually by CPSW until commit d941ebe8 ("net:
      ethernet: ti: cpsw: use destroy ctlr to destroy channels") was merged.
      
      Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
      Reviewed-by: Mugunthan V N <mugunthanvnm@ti.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
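The nested acquisition shown in the cpdma call chain above can be modelled without real spinlocks. The sketch below is an illustration only and not the actual patch: all names are hypothetical, the lock is a plain flag, and a failed `ctlr_lock()` stands in for the point where a real non-recursive spinlock would spin forever. It shows one common remedy, splitting the work into lock-free workers that are called while the lock is already held.

```c
#include <stdbool.h>

struct ctlr {
    bool locked;   /* models a non-recursive spinlock */
    bool running;
    int chans;
};

/* Returns false where a real spinlock would spin forever. */
static bool ctlr_lock(struct ctlr *c)
{
    if (c->locked)
        return false;
    c->locked = true;
    return true;
}

static void ctlr_unlock(struct ctlr *c) { c->locked = false; }

/* Lock-free worker: the caller must already hold the lock. */
static void ctlr_stop_locked(struct ctlr *c) { c->running = false; }

/* Public entry point that takes the lock itself. */
static bool ctlr_stop(struct ctlr *c)
{
    if (!ctlr_lock(c))
        return false;
    ctlr_stop_locked(c);
    ctlr_unlock(c);
    return true;
}

/* Problem shape: holds the lock, then calls a function that takes it again. */
static bool ctlr_destroy_nested(struct ctlr *c)
{
    if (!ctlr_lock(c))
        return false;
    bool ok = ctlr_stop(c); /* second acquisition fails: self-deadlock */
    ctlr_unlock(c);
    return ok;
}

/* One possible remedy: call only lock-free workers under the lock. */
static bool ctlr_destroy_flat(struct ctlr *c)
{
    if (!ctlr_lock(c))
        return false;
    ctlr_stop_locked(c);
    c->chans = 0;
    ctlr_unlock(c);
    return true;
}
```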
    • net: ipv6: use list_move instead of list_del/list_add · c882219a
      Wei Yongjun committed
      Use list_move() instead of list_del() + list_add().
      
      Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
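For readers unfamiliar with the list helpers named in the commit above, here is a small userspace re-implementation of a circular doubly linked list. It mirrors the kernel names but is a simplified sketch, not the kernel's include/linux/list.h; `list_init` is a hypothetical stand-in for the kernel's initializer.

```c
struct list_head {
    struct list_head *next, *prev;
};

/* An empty list is a head that points at itself. */
static void list_init(struct list_head *h)
{
    h->next = h;
    h->prev = h;
}

/* Insert entry n right after head. */
static void list_add(struct list_head *n, struct list_head *head)
{
    n->next = head->next;
    n->prev = head;
    head->next->prev = n;
    head->next = n;
}

/* Unlink entry e from whatever list it is on. */
static void list_del(struct list_head *e)
{
    e->prev->next = e->next;
    e->next->prev = e->prev;
}

/* Move e from its current list to just after head, in one call. */
static void list_move(struct list_head *e, struct list_head *head)
{
    list_del(e);
    list_add(e, head);
}
```

The single call expresses the intent directly and leaves no window in the source where the entry is on neither list.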
    • cxgb4/cxgb4vf: Fixes regression in perf when tx vlan offload is disabled · 8d09e6b8
      Hariprasad Shenai committed
      The commit 637d3e99 ("cxgb4: Discard the packet if the length is
      greater than mtu") introduced a regression in the VLAN interface
      performance when Tx VLAN offload is disabled.
      
      Check if skb is tagged, regardless of whether it is hardware accelerated
      or not. Previously we checked only for the hardware accelerated case,
      which caused performance to drop to ~0.17Mbps on a 10GbE adapter for
      VLAN interface, when tx vlan offload is turned off using ethtool.
      The ethernet header length calculation was going wrong in this case, and
      the driver ended up dropping packets.
      
      Fixes: 637d3e99 ("cxgb4: Discard the packet if the length is greater than mtu")
      Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
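The length check discussed in the cxgb4 commit above can be sketched as follows. This is an illustration under assumptions, not the driver code: `struct pkt` and both functions are hypothetical, only the 802.1Q ethertype is considered, and `hw_tag` models a tag carried out of band for hardware insertion.

```c
#include <stdbool.h>
#include <stdint.h>

#define ETH_HDR_LEN     14u
#define VLAN_TAG_LEN    4u
#define ETHERTYPE_8021Q 0x8100u

struct pkt {
    uint16_t ethertype; /* outer ethertype found in the frame */
    bool hw_tag;        /* tag kept out of band for hardware insertion */
    uint32_t len;       /* frame length in bytes */
};

/* Tagged either out of band or in the frame itself. */
static bool pkt_vlan_tagged(const struct pkt *p)
{
    return p->hw_tag || p->ethertype == ETHERTYPE_8021Q;
}

/* Check that only allows for the tag in the hardware accelerated case. */
static bool too_long_hw_only(const struct pkt *p, uint32_t mtu)
{
    uint32_t max_len = ETH_HDR_LEN + mtu;

    if (p->hw_tag)
        max_len += VLAN_TAG_LEN;
    return p->len > max_len;
}

/* Check that allows for the tag whenever the packet is tagged. */
static bool too_long_any_tag(const struct pkt *p, uint32_t mtu)
{
    uint32_t max_len = ETH_HDR_LEN + mtu;

    if (pkt_vlan_tagged(p))
        max_len += VLAN_TAG_LEN;
    return p->len > max_len;
}
```

A full-sized frame with an in-band tag (1518 bytes at MTU 1500) is rejected by the first check and accepted by the second.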
    • drivers: net: phy: xgene: Remove redundant dev_err call in xgene_mdio_probe() · b2df430b
      Wei Yongjun committed
      There is already an error message within devm_ioremap_resource,
      so remove the dev_err call to avoid a redundant
      error message.
      
      Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com>
      Acked-by: Iyappan Subramanian <isubramanian@apm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: fix imbalance read_unlock_bh in __tipc_nl_add_monitor() · 6b65bc29
      Wei Yongjun committed
      In the error handling path taken when nla_nest_start() fails,
      read_unlock_bh() is called to unlock a lock that has not been taken yet.
      sparse warns about the context imbalance as follows:
      
      net/tipc/monitor.c:799:23: warning:
       context imbalance in '__tipc_nl_add_monitor' - different lock contexts for basic block
      
      Fixes: cf6f7e1d ('tipc: dump monitor attributes')
      Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com>
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'qed-fixes' · 9d594b39
      David S. Miller committed
      Yuval Mintz says:
      
      ====================
      qed*: Small fixes series
      
      This contains several small [and straight-forward] fixes to qed*
      drivers.
      
      Please consider applying this to `net'.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • qed: Prevent over-usage of vlan credits by PF · 25eb8d46
      Yuval Mintz committed
      Each PF/VF has a limited number of vlan filters for
      configuration purposes. This information is passed to qede
      and is used to prevent over-usage: once a vlan is to be
      configured and no filter credit is available, the driver
      switches to working in vlan-promisc mode.
      
      The problem is that the credit pool is shared by both PFs and VFs,
      and currently PFs aren't deducting the filters that are
      reserved for their VFs from their quota, which may lead
      to some vlan filters failing unknowingly due to lack of credit.
      
      Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • qed: Correct min bandwidth for 100g · d572c430
      Yuval Mintz committed
      The driver uses reversed logic when checking whether the minimum
      bandwidth configuration is applied, causing it to
      configure the guarantee only on the first hw-function.
      
      Fixes: a0d26d5a ("qed*: Don't reset statistics on inner reload")
      Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • qede: Reset statistics on explicit down · 7f7a144f
      Yuval Mintz committed
      Adding the necessary logic to prevent statistics reset
      on inner-reload introduced a bug, and now statistics
      are reset only when re-probing the driver.
      
      Fixes: a0d26d5a ("qed*: Don't reset statistics on inner reload")
      Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • qed: Don't over-do producer cleanup for Rx · b21290b7
      Yuval Mintz committed
      Before requesting the firmware to start Rx queues,
      driver goes and sets the queue producer in the device to 0.
      But while the producer is 32-bit, the driver currently clears 64 bits,
      effectively zeroing an additional CID's producer as well.
      
      Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
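The width mismatch described in the Rx producer commit above is easy to reproduce in plain C. The sketch is illustrative only: `clear_producer` is a hypothetical helper, and the two-element array stands in for two adjacent 32-bit producers belonging to neighbouring CIDs.

```c
#include <stdint.h>
#include <string.h>

/*
 * Zero `width` bytes starting at slot idx of an array of 32-bit producers.
 * Passing a width larger than one slot spills into the next producer.
 */
static void clear_producer(uint32_t *prods, size_t idx, size_t width)
{
    memset(&prods[idx], 0, width);
}
```

Clearing with `sizeof(uint64_t)` wipes the neighbouring producer as well, while `sizeof(uint32_t)` touches only the intended slot.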
    • qed: Fix removal of spoof checking for VFs · cb1fa088
      Yuval Mintz committed
      The driver has reversed logic for checking the result of the
      spoof-checking configuration. As a result, it logs that
      the configuration failed [even though it succeeded], and
      no longer does anything when requested to remove the configuration,
      as its accounting of the feature is incorrect.
      
      Fixes: 6ddc7608 ("qed*: IOV support spoof-checking")
      Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • qede: Don't try removing unconfigured vlans · c524e2f5
      Yuval Mintz committed
      As part of ndo_vlan_rx_kill_vid() implementation,
      qede is requesting firmware to remove the vlan filter.
      This currently happens even if the vlan wasn't previously
      added [In case device ran out of vlan credits].
      
      For PFs this doesn't cause any issues as the firmware
      would simply ignore the removal request. But for VFs their
      parent PF is holding an accounting of the configured vlans,
      and such a request would cause the PF to fail the VF's
      removal request.
      
      Simply fix this for both PFs & VFs and don't remove filters
      that were not previously added.
      
      Fixes: 7c1bfcad ("qede: Add vlan filtering offload support")
      Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
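The rule from the qede commit above, only ask for removal of a filter that was actually added, can be sketched with a per-vlan flag. The names here are hypothetical and the boolean return value stands in for "a removal request would be sent to the firmware"; the real driver's bookkeeping is not reproduced.

```c
#include <stdbool.h>

struct vlan_entry {
    unsigned int vid;
    bool configured; /* false if the filter was never added, e.g. no credit */
};

/*
 * Handle removal of a vlan id. Returns true only when a removal request
 * would be issued, which happens only for filters that were added.
 */
static bool vlan_kill(struct vlan_entry *v)
{
    if (!v->configured)
        return false;
    v->configured = false;
    return true;
}
```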
  2. 30 Jul, 2016 19 commits
    • Merge branch 'stable-4.8' of git://git.infradead.org/users/pcmoore/audit · 797cee98
      Linus Torvalds committed
      Pull audit updates from Paul Moore:
       "Six audit patches for 4.8.
      
        There are a couple of style and minor whitespace tweaks for the logs,
        as well as a minor fixup to catch errors on user filter rules, however
        the major improvements are a fix to the s390 syscall argument masking
        code (reviewed by the nice s390 folks), some consolidation around the
        exclude filtering (less code, always a win), and a double-fetch fix
        for recording the execve arguments"
      
      * 'stable-4.8' of git://git.infradead.org/users/pcmoore/audit:
        audit: fix a double fetch in audit_log_single_execve_arg()
        audit: fix whitespace in CWD record
        audit: add fields to exclude filter by reusing user filter
        s390: ensure that syscall arguments are properly masked on s390
        audit: fix some horrible switch statement style crimes
        audit: fixup: log on errors from filter user rules
    • Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security · 7a1e8b80
      Linus Torvalds committed
      Pull security subsystem updates from James Morris:
       "Highlights:
      
         - TPM core and driver updates/fixes
         - IPv6 security labeling (CALIPSO)
         - Lots of Apparmor fixes
         - Seccomp: remove 2-phase API, close hole where ptrace can change
           syscall #"
      
      * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (156 commits)
        apparmor: fix SECURITY_APPARMOR_HASH_DEFAULT parameter handling
        tpm: Add TPM 2.0 support to the Nuvoton i2c driver (NPCT6xx family)
        tpm: Factor out common startup code
        tpm: use devm_add_action_or_reset
        tpm2_i2c_nuvoton: add irq validity check
        tpm: read burstcount from TPM_STS in one 32-bit transaction
        tpm: fix byte-order for the value read by tpm2_get_tpm_pt
        tpm_tis_core: convert max timeouts from msec to jiffies
        apparmor: fix arg_size computation for when setprocattr is null terminated
        apparmor: fix oops, validate buffer size in apparmor_setprocattr()
        apparmor: do not expose kernel stack
        apparmor: fix module parameters can be changed after policy is locked
        apparmor: fix oops in profile_unpack() when policy_db is not present
        apparmor: don't check for vmalloc_addr if kvzalloc() failed
        apparmor: add missing id bounds check on dfa verification
        apparmor: allow SYS_CAP_RESOURCE to be sufficient to prlimit another task
        apparmor: use list_next_entry instead of list_entry_next
        apparmor: fix refcount race when finding a child profile
        apparmor: fix ref count leak when profile sha1 hash is read
        apparmor: check that xindex is in trans_table bounds
        ...
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace · a867d734
      Linus Torvalds committed
      Pull userns vfs updates from Eric Biederman:
       "This tree contains some very long awaited work on generalizing the
        user namespace support for mounting filesystems to include filesystems
        with a backing store.  The real world target is fuse but the goal is
        to update the vfs to allow any filesystem to be supported.  This
        patchset is based on a lot of code review and testing to approach that
        goal.
      
        While looking at what is needed to support the fuse filesystem it
        became clear that there were things like xattrs for security modules
        that needed special treatment.  That the resolution of those concerns
        would not be fuse specific.  That sorting out these general issues
        made most sense at the generic level, where the right people could be
        drawn into the conversation, and the issues could be solved for
        everyone.
      
        At a high level this patchset does a couple of simple things:
      
         - Add a user namespace owner (s_user_ns) to struct super_block.
      
         - Teach the vfs to handle filesystem uids and gids not mapping into
           kuids and kgids and being reported as INVALID_UID and
           INVALID_GID in vfs data structures.
      
        By assigning a user namespace owner, filesystems that are mounted with
        only user namespace privilege can be detected.  This allows security
        modules and the like to know which mounts may not be trusted.  This
        also allows the set of uids and gids that are communicated to the
        filesystem to be capped at the set of kuids and kgids that are in the
        owning user namespace of the filesystem.
      
        One of the crazier corner cases this handles is the case of inodes
        whose i_uid or i_gid are not mapped into the vfs.  Most of the code
        simply doesn't care but it is easy to confuse the inode writeback path
        so no operation that could cause an inode write-back is permitted for
        such inodes (aka only reads are allowed).
      
        This set of changes starts out by cleaning up the code paths involved
        in user namespace permitted mounts.  Then when things are clean enough
        adds code that cleanly sets s_user_ns.  Then additional restrictions
        are added that are possible now that the filesystem superblock
        contains owner information.
      
        These changes should not affect anyone in practice, but there are some
        parts of these restrictions that are changes in behavior.
      
         - Andy's restriction on suid executables that does not honor the
           suid bit when the path is from another mount namespace (think
           /proc/[pid]/fd/) or when the filesystem was mounted by a less
           privileged user.
      
         - The replacement of the user namespace implicit setting of MNT_NODEV
           with implicitly setting SB_I_NODEV on the filesystem superblock
           instead.
      
           Using SB_I_NODEV is a stronger form that happens to make this state
           user invisible.  The user visibility can be managed but it caused
           problems when it was introduced from applications reasonably
           expecting mount flags to be what they were set to.
      
        There is a little bit of work remaining before it is safe to support
        mounting filesystems with backing store in user namespaces, beyond
        what is in this set of changes.
      
         - Verifying the mounter has permission to read/write the block device
           during mount.
      
         - Teaching the integrity modules IMA and EVM to handle filesystems
           mounted with only user namespace root and to reduce trust in their
           security xattrs accordingly.
      
         - Capturing the mounters credentials and using that for permission
           checks in d_automount and the like.  (Given that overlayfs already
           does this, and we need the work in d_automount it makes sense to
           generalize this case).
      
        Furthermore there are a few changes that are on the wishlist:
      
         - Get all filesystems supporting posix acls using the generic posix
           acls so that posix_acl_fix_xattr_from_user and
           posix_acl_fix_xattr_to_user may be removed.  [Maintainability]
      
         - Reducing the permission checks in places such as remount to allow
           the superblock owner to perform them.
      
         - Allowing the superblock owner to chown files with unmapped uids and
           gids to something that is mapped so the files may be treated
           normally.
      
        I am not considering even obvious relaxations of permission checks
        until it is clear there are no more corner cases that need to be
        locked down and handled generically.
      
        Many thanks to Seth Forshee, who kept this code alive and put up
        with me rewriting substantial portions of what he did to handle more
        corner cases, and for his diligent testing and reviewing of my
        changes"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (30 commits)
        fs: Call d_automount with the filesystems creds
        fs: Update i_[ug]id_(read|write) to translate relative to s_user_ns
        evm: Translate user/group ids relative to s_user_ns when computing HMAC
        dquot: For now explicitly don't support filesystems outside of init_user_ns
        quota: Handle quota data stored in s_user_ns in quota_setxquota
        quota: Ensure qids map to the filesystem
        vfs: Don't create inodes with a uid or gid unknown to the vfs
        vfs: Don't modify inodes with a uid or gid unknown to the vfs
        cred: Reject inodes with invalid ids in set_create_file_as()
        fs: Check for invalid i_uid in may_follow_link()
        vfs: Verify acls are valid within superblock's s_user_ns.
        userns: Handle -1 in k[ug]id_has_mapping when !CONFIG_USER_NS
        fs: Refuse uid/gid changes which don't map into s_user_ns
        selinux: Add support for unprivileged mounts from user namespaces
        Smack: Handle labels consistently in untrusted mounts
        Smack: Add support for unprivileged mounts from user namespaces
        fs: Treat foreign mounts as nosuid
        fs: Limit file caps to the user namespace of the super block
        userns: Remove the now unnecessary FS_USERNS_DEV_MOUNT flag
        userns: Remove implicit MNT_NODEV fragility.
        ...
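The handling of unmapped ids described in the userns pull message above can be sketched as a lookup over mapping extents. This is a userspace illustration under assumptions: the names are modelled loosely on kernel terminology but are hypothetical here, the extent layout is simplified, and `may_write_inode` only captures the "reads only for unmapped owners" rule stated in the message.

```c
#include <stddef.h>
#include <stdint.h>

#define INVALID_ID ((uint32_t)-1)

struct id_extent {
    uint32_t first;       /* first id as seen inside the namespace */
    uint32_t lower_first; /* corresponding first kernel-side id */
    uint32_t count;       /* number of consecutive ids covered */
};

/* Translate a namespace id to a kernel id, or INVALID_ID if unmapped. */
static uint32_t map_id_down(const struct id_extent *map, size_t n,
                            uint32_t id)
{
    for (size_t i = 0; i < n; i++) {
        if (id >= map[i].first && id - map[i].first < map[i].count)
            return map[i].lower_first + (id - map[i].first);
    }
    return INVALID_ID;
}

/* Inodes whose owner does not map are treated as read-only. */
static int may_write_inode(const struct id_extent *map, size_t n,
                           uint32_t disk_uid)
{
    return map_id_down(map, n, disk_uid) != INVALID_ID;
}
```

With a single extent mapping ids 0..65535 to 100000..165535, an on-disk uid of 70000 has no mapping, so the sketch reports it as invalid and refuses writes.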
    • Merge tag 'pm-urgent-4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 601f887d
      Linus Torvalds committed
      Pull power management fix from Rafael Wysocki:
       "Fix a nasty (and really hard to debug) memory corruption during resume
        from hibernation on x86-64 (that leads to a kernel panic most of the
        time) due to the use of a stale stack pointer value in FRAME_BEGIN
        (Josh Poimboeuf)"
      
      * tag 'pm-urgent-4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        x86/power/64: Fix hibernation return address corruption
    • Merge branch 'for-4.7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup · 574c7e23
      Linus Torvalds committed
      Pull more cgroup updates from Tejun Heo:
       "I forgot to include the patches which got applied to for-4.7-fixes
        late during last cycle.
      
        Eric's three patches fix bugs introduced with the namespace support"
      
      * 'for-4.7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
        cgroupns: Only allow creation of hierarchies in the initial cgroup namespace
        cgroupns: Close race between cgroup_post_fork and copy_cgroup_ns
        cgroupns: Fix the locking in copy_cgroup_ns
    • Merge branch 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · a6408f6c
      Linus Torvalds committed
      Pull smp hotplug updates from Thomas Gleixner:
       "This is the next part of the hotplug rework.
      
         - Convert all notifiers with a priority assigned
      
         - Convert all CPU_STARTING/DYING notifiers
      
           The final removal of the STARTING/DYING infrastructure will happen
           when the merge window closes.
      
        Another 700 hundred lines of impenetrable maze gone :)"
      
      * 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (70 commits)
        timers/core: Correct callback order during CPU hot plug
        leds/trigger/cpu: Move from CPU_STARTING to ONLINE level
        powerpc/numa: Convert to hotplug state machine
        arm/perf: Fix hotplug state machine conversion
        irqchip/armada: Avoid unused function warnings
        ARC/time: Convert to hotplug state machine
        clocksource/atlas7: Convert to hotplug state machine
        clocksource/armada-370-xp: Convert to hotplug state machine
        clocksource/exynos_mct: Convert to hotplug state machine
        clocksource/arm_global_timer: Convert to hotplug state machine
        rcu: Convert rcutree to hotplug state machine
        KVM/arm/arm64/vgic-new: Convert to hotplug state machine
        smp/cfd: Convert core to hotplug state machine
        x86/x2apic: Convert to CPU hotplug state machine
        profile: Convert to hotplug state machine
        timers/core: Convert to hotplug state machine
        hrtimer: Convert to hotplug state machine
        x86/tboot: Convert to hotplug state machine
        arm64/armv8 deprecated: Convert to hotplug state machine
        hwtracing/coresight-etm4x: Convert to hotplug state machine
        ...
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide · 1a81a8f2
      Linus Torvalds committed
      Pull IDE updates from David Miller:
       "Just a couple small bug fixes, nothing overly exciting in here"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide:
        ide: missing break statement in set_timings_mdma()
        ide: hpt366: fix incorrect mask when checking at cmd_high_time
        ide-tape: fix misprint in failure handling in idetape_init()
        cmd640: add __init attribute
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc · 86505fc0
      Linus Torvalds committed
      Pull sparc updates from David Miller:
      
       1) Double spin lock bug in sunhv serial driver, from Dan Carpenter.
      
       2) Use correct RSS estimate when determining whether to grow the huge
          TSB or not, from Mike Kravetz.
      
       3) Don't use full three level page tables for hugepages, PMD level is
          sufficient.  From Nitin Gupta.
      
       4) Mask out extraneous bits from TSB_TAG_ACCESS register, we only want
          the address bits.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
        sparc64: Trim page tables for 8M hugepages
        sparc64 mm: Fix base TSB sizing when hugetlb pages are used
        sparc: serial: sunhv: fix a double lock bug
        sparc32: off by ones in BUG_ON()
        sparc: Don't leak context bits into thread->fault_address
      86505fc0
    • L
      Merge tag 'arc-4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc · 9d3bc3d4
      Linus Torvalds committed
      Pull ARC updates from Vineet Gupta:
       "Things have been calm here - nothing much except for a few fixes"
      
      * tag 'arc-4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
        ARC: mm: don't loose PTE_SPECIAL in pte_modify()
        ARC: dma: fix address translation in arc_dma_free
        ARC: typo fix in mm/ioremap.c
        ARC: fix linux-next build breakage
      9d3bc3d4
    • R
      Merge branch 'pm-sleep' · e148d0f8
      Rafael J. Wysocki committed
      * pm-sleep:
        x86/power/64: Fix hibernation return address corruption
      e148d0f8
    • L
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/egtvedt/linux-avr32 · befff3bf
      Linus Torvalds committed
      Pull AVR32 updates from Hans-Christian Noren Egtvedt.
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/egtvedt/linux-avr32:
        avr32: off by one in at32_init_pio()
        avr32: fixup code style in unistd.h and syscall_table.S
        avr32: wire up preadv2 and pwritev2 syscalls
      befff3bf
    • L
      Merge branch 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm · b5f00d18
      Linus Torvalds committed
      Pull ARM updates from Russell King:
       "Included in this update are:
      
         - Patches from Gregory Clement to fix the coherent DMA cases in our
           dma-mapping code.
      
         - A number of CPU errata updates and fixes.
      
         - ARM cpuidle improvements from Jisheng Zhang.
      
         - Fix from Kees for the location of _etext.
      
         - Cleanups from Masahiro Yamada to avoid duplicated messages during
           the kernel build, and remove CONFIG_ARCH_HAS_BARRIERS.
      
         - Remove a udelay loop limitation, allowing for faster CPUs to
           calibrate the delay correctly.
      
         - Cleanup some left-overs from the SW PAN implementation.
      
         - Ensure that a modified address limit is not visible to exception
           handlers"
      
      * 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm: (21 commits)
        ARM: 8586/1: cpuidle: make arm_cpuidle_suspend() a bit more efficient
        ARM: 8585/1: cpuidle: fix !cpuidle_ops[cpu].init case during init
        ARM: 8561/4: dma-mapping: Fix the coherent case when iommu is used
        ARM: 8561/3: dma-mapping: Don't use outer_flush_range when the L2C is coherent
        ARM: 8560/1: errata: Workaround errata A12 825619 / A17 852421
        ARM: 8559/1: errata: Workaround erratum A12 821420
        ARM: 8558/1: errata: Workaround errata A12 818325/852422 A17 852423
        ARM: save and reset the address limit when entering an exception
        ARM: 8577/1: Fix Cortex-A15 798181 errata initialization
        ARM: 8584/1: floppy: avoid gcc-6 warning
        ARM: 8583/1: mm: fix location of _etext
        ARM: 8582/1: remove unused CONFIG_ARCH_HAS_BARRIERS
        ARM: 8306/1: loop_udelay: remove bogomips value limitation
        ARM: 8581/1: add missing <asm/prom.h> to arch/arm/kernel/devtree.c
        ARM: 8576/1: avoid duplicating "Kernel: arch/arm/boot/*Image is ready"
        ARM: 8556/1: on a generic DT system: do not touch l2x0
        ARM: uaccess: remove put_user() code duplication
        ARM: 8580/1: Remove orphaned __addr_ok() definition
        ARM: get rid of horrible *(unsigned int *)(regs + 1)
        ARM: introduce svc_pt_regs structure
        ...
      b5f00d18
    • L
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse · 27ae0c41
      Linus Torvalds committed
      Pull fuse updates from Miklos Szeredi:
       "This fixes error propagation from writeback to fsync/close for
        writeback cache mode as well as adding a missing capability flag to
        the INIT message.  The rest are cleanups.
      
        (The commits are recent but all the code actually sat in -next for a
        while now.  The recommits are due to conflict avoidance and the
        addition of Cc: stable@...)"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
        fuse: use filemap_check_errors()
        mm: export filemap_check_errors() to modules
        fuse: fix wrong assignment of ->flags in fuse_send_init()
        fuse: fuse_flush must check mapping->flags for errors
        fuse: fsync() did not return IO errors
        fuse: don't mess with blocking signals
        new helper: wait_event_killable_exclusive()
        fuse: improve aio directIO write performance for size extending writes
      27ae0c41
    • L
      Revert "vfs: add lookup_hash() helper" · 20d00ee8
      Linus Torvalds committed
      This reverts commit 3c9fe8cd.
      
      As Miklos points out in commit c1b2cc1a, the "lookup_hash()" helper
      is now unused, and in fact, with the hash salting changes, since the
      hash of a dentry name now depends on the directory dentry it is in, the
      helper function isn't even really likely to be useful.
      
      So rather than keep it around in case somebody else might end up finding
      a use for it, let's just remove the helper and not trick people into
      thinking it might be a useful thing.
      
      For example, I had obviously completely missed how the helper didn't
      follow the normal dentry hashing patterns, and how the hash salting
      patch broke overlayfs.  Things would quietly build and look sane, but
      not work.
      Suggested-by: Miklos Szeredi <mszeredi@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      20d00ee8
    • L
      Merge branch 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs · e7b4f2d8
      Linus Torvalds committed
      Pull overlayfs update from Miklos Szeredi:
       "First of all, this fixes a regression in overlayfs introduced by the
        dentry hash salting.  I've moved the patch fixing this to the front of
        the queue, so if (god forbid) something needs to be bisected in
        overlayfs this regression won't interfere with that.
      
        The biggest part is preparation for selinux support, done by Vivek
        Goyal.  Essentially this makes all operations on underlying
        filesystems be done with credentials of mounter.  This makes
        everything nicely consistent.
      
        There are also fixes for a number of known and recently discovered
        non-standard behavior (thanks to Eryu Guan for testing and improving
        the test suites)"
      
      * 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs: (23 commits)
        ovl: simplify empty checking
        qstr: constify instances in overlayfs
        ovl: clear nlink on rmdir
        ovl: disallow overlayfs as upperdir
        ovl: fix warning
        ovl: remove duplicated include from super.c
        ovl: append MAY_READ when diluting write checks
        ovl: dilute permission checks on lower only if not special file
        ovl: fix POSIX ACL setting
        ovl: share inode for hard link
        ovl: store real inode pointer in ->i_private
        ovl: permission: return ECHILD instead of ENOENT
        ovl: update atime on upper
        ovl: fix sgid on directory
        ovl: simplify permission checking
        ovl: do not require mounter to have MAY_WRITE on lower
        ovl: do operations on underlying file system in mounter's context
        ovl: modify ovl_permission() to do checks on two inodes
        ovl: define ->get_acl() for overlay inodes
        ovl: move some common code in a function
        ...
      e7b4f2d8
    • L
      Merge tag 'freevxfs-for-4.8' of git://git.infradead.org/users/hch/freevxfs · 0a7736d0
      Linus Torvalds committed
      Pull freevxfs updates from Christoph Hellwig:
       "Support for foreign endianness and HP-UX superblocks from
        Krzysztof Błaszkowski"
      
      * tag 'freevxfs-for-4.8' of git://git.infradead.org/users/hch/freevxfs:
        freevxfs: update Kconfig information
        freevxfs: refactor readdir and lookup code
        freevxfs: fix lack of inode initialization
        freevxfs: fix memory leak in vxfs_read_fshead()
        freevxfs: update documentation and cresdits for HP-UX support
        freevxfs: implement ->alloc_inode and ->destroy_inode
        freevxfs: avoid the need for forward declaring the super operations
        freevxfs: move VFS inode allocation into vxfs_blkiget and vxfs_stiget
        freevxfs: remove vxfs_put_fake_inode
        freevxfs: handle big endian HP-UX file systems
      0a7736d0
    • L
      Merge tag 'configfs-for-4.8' of git://git.infradead.org/users/hch/configfs · a54809f1
      Linus Torvalds committed
      Pull configfs update from Christoph Hellwig:
       "A simple error handling fix from Tal Shorer"
      
      * tag 'configfs-for-4.8' of git://git.infradead.org/users/hch/configfs:
        configfs: don't set buffer_needs_fill to zero if show() returns error
      a54809f1
    • L
      Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6 · b0c4e2ac
      Linus Torvalds committed
      Pull CIFS/SMB3 fixes from Steve French:
       "Various CIFS/SMB3 fixes, most for stable"
      
      * 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
        CIFS: Fix a possible invalid memory access in smb2_query_symlink()
        fs/cifs: make share unaccessible at root level mountable
        cifs: fix crash due to race in hmac(md5) handling
        cifs: unbreak TCP session reuse
        cifs: Check for existing directory when opening file with O_CREAT
        Add MF-Symlinks support for SMB 2.0
      b0c4e2ac
    • N
      sparc64: Trim page tables for 8M hugepages · 7bc3777c
      Nitin Gupta committed
      For PMD aligned (8M) hugepages, we currently allocate
      all four page table levels which is wasteful. We now
      allocate till PMD level only which saves memory usage
      from page tables.
      
      Also, when freeing page table for 8M hugepage backed region,
      make sure we don't try to access non-existent PTE level.
      
      Orabug: 22630259
      Signed-off-by: Nitin Gupta <nitin.m.gupta@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7bc3777c