1. 01 Nov, 2018 3 commits
    • A
      netfilter: ipset: fix ip_set_list allocation failure · ed956f39
      Andrey Ryabinin committed
      ip_set_create() and ip_set_net_init() attempt to allocate physically
      contiguous memory for ip_set_list. If memory is fragmented, the
      allocations could easily fail:
      
              vzctl: page allocation failure: order:7, mode:0xc0d0
      
              Call Trace:
               dump_stack+0x19/0x1b
               warn_alloc_failed+0x110/0x180
               __alloc_pages_nodemask+0x7bf/0xc60
               alloc_pages_current+0x98/0x110
               kmalloc_order+0x18/0x40
               kmalloc_order_trace+0x26/0xa0
               __kmalloc+0x279/0x290
               ip_set_net_init+0x4b/0x90 [ip_set]
               ops_init+0x3b/0xb0
               setup_net+0xbb/0x170
               copy_net_ns+0xf1/0x1c0
               create_new_namespaces+0xf9/0x180
               copy_namespaces+0x8e/0xd0
               copy_process+0xb61/0x1a00
               do_fork+0x91/0x320
      
      Use kvcalloc() to fall back to 0-order allocations if a high-order
      page isn't available.
      Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      ed956f39
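The "order:7" in the failure message above means the allocator was asked for 2^7 physically contiguous pages. The arithmetic can be sketched in user space as follows. This is only an illustration, not kernel code: the 4 KiB page size and the helper name `alloc_order` are assumptions made for the example.

```c
#include <stddef.h>

/* Assumed page size for illustration; real systems may differ. */
#define SKETCH_PAGE_SIZE 4096UL

/*
 * Hypothetical helper: smallest order such that
 * (SKETCH_PAGE_SIZE << order) >= size.
 */
static unsigned int alloc_order(size_t size)
{
    unsigned int order = 0;
    size_t span = SKETCH_PAGE_SIZE;

    while (span < size) {
        span <<= 1;
        order++;
    }
    return order;
}
```

Under these assumptions a 512 KiB request maps to order 7, which is why a fragmented system may fail it even when plenty of single (order-0) pages are free.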
    • E
      netfilter: ipset: actually allow allowable CIDR 0 in hash:net,port,net · 886503f3
      Eric Westbrook committed
      Allow /0 as advertised for hash:net,port,net sets.
      
      For "hash:net,port,net", ipset(8) says that "either subnet
      is permitted to be a /0 should you wish to match port
      between all destinations."
      
      Make that statement true.
      
      Before:
      
          # ipset create cidrzero hash:net,port,net
          # ipset add cidrzero 0.0.0.0/0,12345,0.0.0.0/0
          ipset v6.34: The value of the CIDR parameter of the IP address is invalid
      
          # ipset create cidrzero6 hash:net,port,net family inet6
          # ipset add cidrzero6 ::/0,12345,::/0
          ipset v6.34: The value of the CIDR parameter of the IP address is invalid
      
      After:
      
          # ipset create cidrzero hash:net,port,net
          # ipset add cidrzero 0.0.0.0/0,12345,0.0.0.0/0
          # ipset test cidrzero 192.168.205.129,12345,172.16.205.129
          192.168.205.129,tcp:12345,172.16.205.129 is in set cidrzero.
      
          # ipset create cidrzero6 hash:net,port,net family inet6
          # ipset add cidrzero6 ::/0,12345,::/0
          # ipset test cidrzero6 fe80::1,12345,ff00::1
          fe80::1,tcp:12345,ff00::1 is in set cidrzero6.
      
      See also:
      
        https://bugzilla.kernel.org/show_bug.cgi?id=200897
        https://github.com/ewestbrook/linux/commit/df7ff6efb0934ab6acc11f003ff1a7580d6c1d9c
      Signed-off-by: Eric Westbrook <linux@westbrook.io>
      Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      886503f3
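A /0 prefix is a corner case for mask arithmetic, because a naive shift by the full word width is undefined in C. The following is a minimal sketch of how a /0 subnet can match every IPv4 address. The helpers `prefix_mask` and `in_subnet` are hypothetical names for this example and are not taken from the ipset sources.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: IPv4 netmask in host byte order for a prefix of 0..32. */
static uint32_t prefix_mask(unsigned int cidr)
{
    /* Shifting a 32-bit value by 32 is undefined, so /0 is handled explicitly. */
    if (cidr == 0)
        return 0;
    return ~(uint32_t)0 << (32 - cidr);
}

/* Hypothetical helper: true if addr falls inside net/cidr. */
static bool in_subnet(uint32_t addr, uint32_t net, unsigned int cidr)
{
    uint32_t mask = prefix_mask(cidr);

    return (addr & mask) == (net & mask);
}
```

With an all-zero mask both sides of the comparison are 0, so any address is "in" 0.0.0.0/0, which matches the behaviour shown in the "After" transcript.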
    • S
      netfilter: ipset: list:set: Decrease refcount synchronously on deletion and replace · 439cd39e
      Stefano Brivio committed
      Commit 45040978 ("netfilter: ipset: Fix set:list type crash
      when flush/dump set in parallel") postponed decreasing set
      reference counters to the RCU callback.
      
      An 'ipset del' command can terminate before the RCU grace period
      is elapsed, and if sets are listed before then, the reference
      counter shown in userspace will be wrong:
      
       # ipset create h hash:ip; ipset create l list:set; ipset add l h
       # ipset del l h; ipset list h
       Name: h
       Type: hash:ip
       Revision: 4
       Header: family inet hashsize 1024 maxelem 65536
       Size in memory: 88
       References: 1
       Number of entries: 0
       Members:
       # sleep 1; ipset list h
       Name: h
       Type: hash:ip
       Revision: 4
       Header: family inet hashsize 1024 maxelem 65536
       Size in memory: 88
       References: 0
       Number of entries: 0
       Members:
      
      Fix this by making the reference count update synchronous again.
      
      As a result, when sets are listed, ip_set_name_byindex() might
      now fetch a set whose reference count is already zero. Instead
      of relying on the reference count to protect against concurrent
      set renaming, grab ip_set_ref_lock as reader and copy the name,
      while holding the same lock in ip_set_rename() as writer
      instead.
      Reported-by: Li Shuang <shuali@redhat.com>
      Fixes: 45040978 ("netfilter: ipset: Fix set:list type crash when flush/dump set in parallel")
      Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
      Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      439cd39e
  2. 29 Oct, 2018 1 commit
  3. 25 Oct, 2018 30 commits
    • J
      netfilter: bridge: define INT_MIN & INT_MAX in userspace · 5a8de47b
      Jiri Slaby committed
      With 4.19, programs like ebtables fail to build when they include
      "linux/netfilter_bridge.h". It is caused by commit 94276fa8 which
      added a use of INT_MIN and INT_MAX to the header:
      : In file included from /usr/include/linux/netfilter_bridge/ebtables.h:18,
      :                  from include/ebtables_u.h:28,
      :                  from communication.c:23:
      : /usr/include/linux/netfilter_bridge.h:30:20: error: 'INT_MIN' undeclared here (not in a function)
      :   NF_BR_PRI_FIRST = INT_MIN,
      :                     ^~~~~~~
      
      Define these constants by including "limits.h" when !__KERNEL__ (the
      same way as for other netfilter_* headers).
      
      Fixes: 94276fa8 ("netfilter: bridge: Expose nf_tables bridge hook priorities through uapi")
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      Acked-by: Máté Eckl <ecklm94@gmail.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      5a8de47b
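The fix described above follows a common pattern for headers shared between kernel and user space. A minimal sketch of that pattern is shown here; the enum `sketch_hook_priorities` and its members are hypothetical stand-ins that only mirror the shape of the line in the compiler error.

```c
/* User-space builds do not get INT_MIN/INT_MAX implicitly, so pull them in. */
#ifndef __KERNEL__
#include <limits.h>
#endif

/* Hypothetical enum mirroring the shape of the failing declaration. */
enum sketch_hook_priorities {
    SKETCH_PRI_FIRST = INT_MIN,
    SKETCH_PRI_LAST  = INT_MAX,
};
```

Without the guarded include, a user-space translation unit that includes only this header would fail with the same "undeclared" error quoted in the commit message.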
    • P
      netfilter: nft_osf: check if attribute is present · 5e91c9d9
      Pablo Neira Ayuso committed
      If the attribute is not sent, e.g. by an old libnftnl binary, then
      tb[NFTA_OSF_TTL] is NULL and the kernel crashes in the _init path.
      
      Fixes: a218dc82 ("netfilter: nft_osf: Add ttl option support")
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      5e91c9d9
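The general shape of such a fix is to treat the attribute as optional and dereference it only when it is present. The sketch below assumes a simplified attribute table; `sketch_attr`, `sketch_init`, the attribute indices, and the default value of 0 are all hypothetical and do not reproduce the nft_osf code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical attribute indices and attribute type, for illustration only. */
enum { SKETCH_ATTR_DREG, SKETCH_ATTR_TTL, SKETCH_ATTR_MAX };

struct sketch_attr {
    uint8_t value;
};

/*
 * Hypothetical init path: the TTL attribute is optional, so it is only
 * dereferenced when the sender actually supplied it.
 */
static int sketch_init(const struct sketch_attr *tb[SKETCH_ATTR_MAX],
                       uint8_t *ttl_out)
{
    *ttl_out = 0; /* assumed default when the attribute is absent */

    if (tb[SKETCH_ATTR_TTL])
        *ttl_out = tb[SKETCH_ATTR_TTL]->value;

    return 0;
}
```

An older sender that omits the attribute then gets the default instead of triggering a NULL dereference.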
    • F
      netfilter: ipv6: fix oops when defragmenting locally generated fragments · 61792b67
      Florian Westphal committed
      Unlike ipv4 and normal ipv6 defrag, netfilter ipv6 defragmentation did
      not save/restore skb->dst.
      
      This causes oops when handling locally generated ipv6 fragments, as
      output path needs a valid dst.
      Reported-by: Maciej Żenczykowski <zenczykowski@gmail.com>
      Fixes: 84379c9a ("netfilter: ipv6: nf_defrag: drop skb dst before queueing")
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      61792b67
    • D
      Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue · 4f3ebb04
      David S. Miller committed
      Jeff Kirsher says:
      
      ====================
      Intel Wired LAN Driver Fixes 2018-10-24
      
      This series contains fixes for the ice driver.
      
      Anirudh fixes a namespace issue which was introduced with a previous
      patch to remove ice_netpoll.  Fixed up the device ID define names to
      align with the branding string names.  Use the capability count returned
      by the firmware, instead of calculating the count.  Introduced driver
      workarounds due to current firmware limitations.  Fixed the queue
      mapping for a VF, which needs to be set in the config and scatter queue
      modes.  Fixed the driver which is setup to handle link status events
      (LSE), even though the firmware does not have this feature yet, so add
      the ability to poll for link status changes while we wait for updated
      firmware.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4f3ebb04
    • D
      net/ipv6: Allow onlink routes to have a device mismatch if it is the default route · 4ed591c8
      David Ahern committed
      The intent of ip6_route_check_nh_onlink is to make sure the gateway
      given for an onlink route is not actually on a connected route for
      a different interface (e.g., 2001:db8:1::/64 is on dev eth1 and then
      an onlink route has a via 2001:db8:1::1 dev eth2). If the gateway
      lookup hits the default route then it most likely will be a different
      interface than the onlink route which is ok.
      
      Update ip6_route_check_nh_onlink to disregard the device mismatch
      if the gateway lookup hits the default route. Turns out the existing
      onlink tests are passing because there is no default route or it is
      an unreachable default, so update the onlink tests to have a default
      route other than unreachable.
      
      Fixes: fc1e64e1 ("net/ipv6: Add support for onlink flag")
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4ed591c8
    • D
      net: sched: Remove TCA_OPTIONS from policy · e72bde6b
      David Ahern committed
      Marco reported an error with hfsc:
      root@Calimero:~# tc qdisc add dev eth0 root handle 1:0 hfsc default 1
      Error: Attribute failed policy validation.
      
      Apparently a few implementations pass TCA_OPTIONS as a binary instead
      of nested attribute, so drop TCA_OPTIONS from the policy.
      
      Fixes: 8b4c3cdd ("net: sched: Add policy validation for tc attributes")
      Reported-by: Marco Berizzi <pupilla@libero.it>
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e72bde6b
    • A
      ice: Poll for link status change · 4f4be03b
      Anirudh Venkataramanan committed
      When the physical link goes up or down, the driver is supposed to
      receive a link status event (LSE). The driver currently has the code
      to handle LSEs but there is no firmware support for this feature yet.
      So this patch adds the ability for the driver to poll for link status
      changes. The polling itself is done in ice_watchdog_subtask.
      
      For namespace cleanliness, this patch also removes code that handles
      LSE. This code will be reintroduced once the feature is officially
      supported.
      Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      4f4be03b
    • A
      ice: Allocate VF interrupts and set queue map · 982b1219
      Anirudh Venkataramanan committed
      Allocate VF interrupts using VPINT_ALLOC_PCI. Multiple interrupts are
      specified as a range from "first" to "last".
      
      Also, according to the spec, the queue mapping for a VF needs to be set
      in both contig and scatter queue modes. So make this change as well.
      Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      982b1219
    • A
      ice: Introduce ice_dev_onetime_setup · f203dca3
      Anirudh Venkataramanan committed
      ice_dev_onetime_setup contains a couple of driver workarounds for current
      firmware limitations. These workarounds are expected to go away once
      these limitations are fixed in the firmware.
      
      On a firmware release that has these issues addressed, these workarounds
      (while unnecessary) will not break anything.
      Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      f203dca3
    • S
      net: hns3: Fix for warning uninitialized symbol hw_err_lst3 · ac0e5496
      Shiju Jose committed
      This patch fixes the smatch warning,
      
      drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.c:700
      hclge_log_and_clear_ppp_error() error: uninitialized symbol
      'hw_err_lst3'
      
      Link: https://lkml.org/lkml/2018/10/23/430
      
      Fixes: da2d072a ("net: hns3: Add enable and process hw errors from PPP")
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
      Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ac0e5496
    • D
      octeontx2-af: Copy the right amount of memory · cdaa18f9
      Dan Carpenter committed
      This is a copy and paste bug where we copied the sizeof() from the chunk
      before.  We're copying more data than intended but the destination is a
      union so it doesn't cause memory corruption.
      
      Fixes: ffb0abd7 ("octeontx2-af: NIX AQ instruction enqueue support")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      cdaa18f9
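The class of bug described above, copying with the sizeof() of the wrong member, can be illustrated with a small sketch. The struct and union names below are hypothetical and the sizes are arbitrary; they are not the octeontx2-af definitions.

```c
#include <string.h>

/* Hypothetical context layouts of different sizes. */
struct sketch_rq_ctx { unsigned char bytes[16]; };
struct sketch_sq_ctx { unsigned char bytes[32]; };

/* The destination is a union, so it is as large as its largest member. */
union sketch_ctx {
    struct sketch_rq_ctx rq;
    struct sketch_sq_ctx sq;
};

/* Copy exactly one rq context: size the copy by the member being written. */
static void copy_rq(union sketch_ctx *dst, const struct sketch_rq_ctx *src)
{
    memcpy(&dst->rq, src, sizeof(dst->rq));
}
```

Using `sizeof(dst->rq)` rather than a size pasted from a neighbouring case keeps the length tied to the member actually being written. An oversized copy would still fit in the union, but it would read past the end of the smaller source object.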
    • S
      net: udp: fix handling of CHECKSUM_COMPLETE packets · db4f1be3
      Sean Tranchetti committed
      Current handling of CHECKSUM_COMPLETE packets by the UDP stack is
      incorrect for any packet that has an incorrect checksum value.
      
      udp4/6_csum_init() will both make a call to
      __skb_checksum_validate_complete() to initialize/validate the csum
      field when receiving a CHECKSUM_COMPLETE packet. When this packet
      fails validation, skb->csum will be overwritten with the pseudoheader
      checksum so the packet can be fully validated by software, but the
      skb->ip_summed value will be left as CHECKSUM_COMPLETE so that way
      the stack can later warn the user about their hardware spewing bad
      checksums. Unfortunately, leaving the SKB in this state can cause
      problems later on in the checksum calculation.
      
      Since the packet is still marked as CHECKSUM_COMPLETE,
      udp_csum_pull_header() will SUBTRACT the checksum of the UDP header
      from skb->csum instead of adding it, leaving us with a garbage value
      in that field. Once we try to copy the packet to userspace in the
      udp4/6_recvmsg(), we'll make a call to skb_copy_and_csum_datagram_msg()
      to checksum the packet data and add it in the garbage skb->csum value
      to perform our final validation check.
      
      Since the value we're validating is not the proper checksum, it's possible
      that the folded value could come out to 0, causing us not to drop the
      packet. Instead, we believe that the packet was checksummed incorrectly
      by hardware since skb->ip_summed is still CHECKSUM_COMPLETE, and we attempt
      to warn the user with netdev_rx_csum_fault(skb->dev);
      
      Unfortunately, since this is the UDP path, skb->dev has been overwritten
      by skb->dev_scratch and is no longer a valid pointer, so we end up
      reading invalid memory.
      
      This patch addresses this problem in two ways:
      	1) Do not use the dev pointer when calling netdev_rx_csum_fault()
      	   from skb_copy_and_csum_datagram_msg(). Since this gets called
      	   from the UDP path where skb->dev has been overwritten, we have
      	   no way of knowing if the pointer is still valid. Also for the
      	   sake of consistency with the other uses of
      	   netdev_rx_csum_fault(), don't attempt to call it if the
      	   packet was checksummed by software.
      
      	2) Add better CHECKSUM_COMPLETE handling to udp4/6_csum_init().
      	   If we receive a packet that's CHECKSUM_COMPLETE that fails
      	   verification (i.e. skb->csum_valid == 0), check who performed
      	   the calculation. It's possible that the checksum was done in
      	   software by the network stack earlier (such as Netfilter's
      	   CONNTRACK module), and if that says the checksum is bad,
      	   we can drop the packet immediately instead of waiting until
      	   we try and copy it to userspace. Otherwise, we need to
      	   mark the SKB as CHECKSUM_NONE, since the skb->csum field
      	   no longer contains the full packet checksum after the
      	   call to __skb_checksum_validate_complete().
      
      Fixes: e6afc8ac ("udp: remove headers from UDP packets before queueing")
      Fixes: c84d9490 ("udp: copy skb->truesize in the first cache line")
      Cc: Sam Kumar <samanthakumar@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Sean Tranchetti <stranche@codeaurora.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      db4f1be3
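The "folded value could come out to 0" check mentioned above relies on a property of the Internet checksum: summing a buffer that already contains a correct checksum and complementing the folded result yields 0. The sketch below is a generic user-space version of that arithmetic. It is not the kernel's skb checksum code, and the function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* One's-complement sum of 16-bit big-endian words, folded to 16 bits. */
static uint16_t csum_fold_sum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];
    if (len & 1)
        sum += (uint32_t)data[len - 1] << 8; /* pad odd trailing byte with zero */

    /* Fold carries back into the low 16 bits. */
    while (sum >> 16)
        sum = (sum & 0xFFFF) + (sum >> 16);
    return (uint16_t)sum;
}

/* Value to store in the checksum field (field assumed zero while computing). */
static uint16_t csum_compute(const uint8_t *data, size_t len)
{
    return (uint16_t)~csum_fold_sum(data, len);
}

/* A buffer that includes its checksum verifies when the complemented sum is 0. */
static int csum_ok(const uint8_t *data, size_t len)
{
    return (uint16_t)~csum_fold_sum(data, len) == 0;
}
```

Because validity is decided only by whether this final value is 0, a running checksum that has been corrupted along the way (for example by subtracting where it should have added) can still land on 0 by chance, which is the failure mode the commit message describes.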
    • D
      Merge branch 'route-dump-filter-fixes' · 559bf69e
      David S. Miller committed
      David Ahern says:
      
      ====================
      net: Fixups for recent dump filtering changes
      
      Li RongQing noted that tgt_net is leaked in ipv4 due to the recent change
      to handle address dumps for a specific device. The report also applies to
      ipv6 and other error paths. Patches 1 and 2 fix those leaks.
      
      Patch 3 stops route dumps from erroring out when dumping across address
      families and a table id is given. This is needed in preparation for
      patch 4.
      
      Patch 4 updates the rtnl_dump_all to handle a failure in one of the dumpit
      functions. At the moment, if an address dump returns an error the dump all
      loop breaks but the error is dropped. The result can be no data is returned
      and no error either leaving the user wondering about the addresses.
      
      Patches were tested with a modified iproute2 to add invalid data to the
      dump request causing each specific failure path to be hit in addition
      to positive testing that it works as it should when given valid data.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      559bf69e
    • D
      net: rtnl_dump_all needs to propagate error from dumpit function · c63586dc
      David Ahern committed
      If an address, route or netconf dump request is sent for AF_UNSPEC, then
      rtnl_dump_all is used to do the dump across all address families. If one
      of the dumpit functions fails (e.g., invalid attributes in the dump
      request) then rtnl_dump_all needs to propagate that error so the user
      gets an appropriate response instead of just getting no data.
      
      Fixes: effe6792 ("net: Enable kernel side filtering of route dumps")
      Fixes: 5fcd266a ("net/ipv4: Add support for dumping addresses for a specific device")
      Fixes: 6371a71f ("net/ipv6: Add support for dumping addresses for a specific device")
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c63586dc
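The idea of propagating a callback's error out of a "dump all" loop can be sketched as follows. This is a simplified user-space model: the callback type, the call counter, and all `sketch_` names are hypothetical and do not reflect the real rtnl_dump_all signature.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical per-family dump callback: negative return means error. */
typedef int (*sketch_dumpit_fn)(int *calls);

/*
 * Hypothetical "dump all" loop: stop at the first failing callback and
 * return its error instead of silently breaking out of the loop.
 */
static int sketch_dump_all(const sketch_dumpit_fn *fns, size_t n, int *calls)
{
    size_t i;

    for (i = 0; i < n; i++) {
        int ret;

        if (!fns[i])
            continue; /* family without a dump handler */
        ret = fns[i](calls);
        if (ret < 0)
            return ret; /* propagate the error to the caller */
    }
    return 0;
}

/* Example callbacks: one succeeds, one rejects the request. */
static int sketch_dump_ok(int *calls)  { (*calls)++; return 0; }
static int sketch_dump_bad(int *calls) { (*calls)++; return -EINVAL; }
```

With the error returned, the requester sees a failure code rather than an empty result that is indistinguishable from "no data".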
    • D
      net: Don't return invalid table id error when dumping all families · ae677bbb
      David Ahern committed
      When doing a route dump across all address families, do not error out
      if the table does not exist. This allows a route dump for AF_UNSPEC
      with a table id that may only exist for some of the families.
      
      Do return the "table does not exist" error when dumping routes for a
      specific family and the table does not exist.
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ae677bbb
    • D
      net/ipv6: Put target net when address dump fails due to bad attributes · 242afaa6
      David Ahern committed
      If tgt_net is set based on IFA_TARGET_NETNSID attribute in the dump
      request, make sure all error paths call put_net.
      
      Fixes: 6371a71f ("net/ipv6: Add support for dumping addresses for a specific device")
      Fixes: ed6eff11 ("net/ipv6: Update inet6_dump_addr for strict data checking")
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      242afaa6
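Releasing a reference on every exit path is commonly done with a single cleanup label. The sketch below models that pattern in user space with a plain integer refcount. `sketch_net`, `sketch_get_net`, `sketch_put_net`, and `sketch_dump_addr` are hypothetical stand-ins, not the kernel's get_net()/put_net() or the actual dump handlers.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical reference-counted namespace object. */
struct sketch_net {
    int refcount;
};

static struct sketch_net *sketch_get_net(struct sketch_net *net)
{
    net->refcount++;
    return net;
}

static void sketch_put_net(struct sketch_net *net)
{
    net->refcount--;
}

/*
 * Hypothetical dump handler: once a target namespace reference is taken,
 * every exit path, including validation failures, must release it.
 */
static int sketch_dump_addr(struct sketch_net *target, int attrs_valid)
{
    struct sketch_net *tgt_net = NULL;
    int err = 0;

    if (target)
        tgt_net = sketch_get_net(target);

    if (!attrs_valid) {
        err = -EINVAL;
        goto put_tgt_net; /* do not return directly: the reference would leak */
    }

    /* ... the actual dump would happen here ... */

put_tgt_net:
    if (tgt_net)
        sketch_put_net(tgt_net);
    return err;
}
```

Routing both the success and the error case through the same label makes it harder to add a new early return that forgets the release.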
    • D
      net/ipv4: Put target net when address dump fails due to bad attributes · d7e38611
      David Ahern committed
      If tgt_net is set based on IFA_TARGET_NETNSID attribute in the dump
      request, make sure all error paths call put_net.
      
      Fixes: 5fcd266a ("net/ipv4: Add support for dumping addresses for a specific device")
      Fixes: c33078e3 ("net/ipv4: Update inet_dump_ifaddr for strict data checking")
      Reported-by: Li RongQing <lirongqing@baidu.com>
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d7e38611
    • A
      ice: Use capability count returned by the firmware · 99189e8b
      Anirudh Venkataramanan committed
      The firmware now returns the capability count in the command buffer.
      Use it.
      Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      99189e8b
    • A
      ice: Update expected FW version · ac5a8aef
      Anirudh Venkataramanan committed
      Update to the current firmware major and minor version which are
      1 and 3 respectively.
      
      Also remove an empty comment line.
      Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      ac5a8aef
    • A
      ice: Change device ID define names to align with branding string · 633d7449
      Anirudh Venkataramanan committed
      Basically remove references to C810 and use E810C (from the branding
      string) instead.
      Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      633d7449
    • A
      ice: Make ice_msix_clean_rings static · f3aaaaaa
      Anirudh Venkataramanan committed
      commit 158a08a6 ("ice: remove ndo_poll_controller") removed
      ice_netpoll and introduced a namespace warning for ice_msix_clean_rings.
      Fix the namespace warning by making ice_msix_clean_rings static.
      Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      f3aaaaaa
    • L
      Merge tag 'docs-4.20' of git://git.lwn.net/linux · 01aa9d51
      Linus Torvalds committed
      Pull documentation updates from Jonathan Corbet:
       "This is a fairly typical cycle for documentation. There's some welcome
        readability improvements for the formatted output, some LICENSES
        updates including the addition of the ISC license, the removal of the
        unloved and unmaintained 00-INDEX files, the deprecated APIs document
        from Kees, more MM docs from Mike Rapoport, and the usual pile of typo
        fixes and corrections"
      
      * tag 'docs-4.20' of git://git.lwn.net/linux: (41 commits)
        docs: Fix typos in histogram.rst
        docs: Introduce deprecated APIs list
        kernel-doc: fix declaration type determination
        doc: fix a typo in adding-syscalls.rst
        docs/admin-guide: memory-hotplug: remove table of contents
        doc: printk-formats: Remove bogus kobject references for device nodes
        Documentation: preempt-locking: Use better example
        dm flakey: Document "error_writes" feature
        docs/completion.txt: Fix a couple of punctuation nits
        LICENSES: Add ISC license text
        LICENSES: Add note to CDDL-1.0 license that it should not be used
        docs/core-api: memory-hotplug: add some details about locking internals
        docs/core-api: rename memory-hotplug-notifier to memory-hotplug
        docs: improve readability for people with poorer eyesight
        yama: clarify ptrace_scope=2 in Yama documentation
        docs/vm: split memory hotplug notifier description to Documentation/core-api
        docs: move memory hotplug description into admin-guide/mm
        doc: Fix acronym "FEKEK" in ecryptfs
        docs: fix some broken documentation references
        iommu: Fix passthrough option documentation
        ...
      01aa9d51
    • L
      Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 · 5993692f
      Linus Torvalds committed
      Pull ext4 updates from Ted Ts'o:
      
       - further restructure ext4 documentation
      
       - fix up ext4's delayed allocation for bigalloc file systems
      
       - fix up some syzbot-detected races in EXT4_IOC_MOVE_EXT,
         EXT4_IOC_SWAP_BOOT, and ext4_remount
      
       - ... and a few other miscellaneous bugs and optimizations.
      
      * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (21 commits)
        ext4: fix use-after-free race in ext4_remount()'s error path
        ext4: cache NULL when both default_acl and acl are NULL
        docs: promote the ext4 data structures book to top level
        docs: move ext4 administrative docs to admin-guide/
        jbd2: fix use after free in jbd2_log_do_checkpoint()
        ext4: propagate error from dquot_initialize() in EXT4_IOC_FSSETXATTR
        ext4: fix setattr project check in fssetxattr ioctl
        docs: make ext4 readme tables readable
        docs: fix ext4 documentation table formatting problems
        docs: generate a separate ext4 pdf file from the documentation
        ext4: convert fault handler to use vm_fault_t type
        ext4: initialize retries variable in ext4_da_write_inline_data_begin()
        ext4: fix EXT4_IOC_SWAP_BOOT
        ext4: fix build error when DX_DEBUG is defined
        ext4: fix argument checking in EXT4_IOC_MOVE_EXT
        ext4: fix reserved cluster accounting at page invalidation time
        ext4: adjust reserved cluster count when removing extents
        ext4: reduce reserved cluster count by number of allocated clusters
        ext4: fix reserved cluster accounting at delayed write time
        ext4: add new pending reservation mechanism
        ...
      5993692f
    • L
      Merge tag 'f2fs-for-4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs · d6edff78
      Linus Torvalds committed
      Pull f2fs updates from Jaegeuk Kim:
       "In this round, we've added 1) superblock checksum feature, 2)
        implemented new mount option which we can disable/enable checkpoint to
        provide atomic updates of entire filesystem, 3) refactored quota
        operations to enhance its consistency along with checkpoint, 4) fixed
        subtle IO hang conditions and roll-forward recovery flow to resurrect
        any fsync'ed inode metadata.
      
        Enhancements:
         - add checksum to keep superblock contents more safe
         - add checkpoint=disable/enable to support A/B update of entire filesystem
         - use plug for readahead IO in readdir
         - add more IO counts to avoid block layer hacks
      
        Bug fixes:
         - prevent data corruption issue for hardware encryption
         - fix IO hang issues when GC is heavily triggered
         - add missing up_read in __write_node_page
         - recover inode metadata during roll-forward recovery flow
         - fix null pointer dereference issue in wrongly configured discard map
      
        There are some more sanity checks and minor bug fixes as well"
      
      * tag 'f2fs-for-4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (62 commits)
        f2fs: fix to keep project quota consistent
        f2fs: guarantee journalled quota data by checkpoint
        f2fs: cleanup dirty pages if recover failed
        f2fs: fix data corruption issue with hardware encryption
        f2fs: fix to recover inode->i_flags of inode block during POR
        f2fs: spread f2fs_set_inode_flags()
        f2fs: fix to spread clear_cold_data()
        Revert "f2fs: fix to clear PG_checked flag in set_page_dirty()"
        f2fs: account read IOs and use IO counts for is_idle
        f2fs: fix to account IO correctly for cgroup writeback
        f2fs: fix to account IO correctly
        f2fs: remove request_list check in is_idle()
        f2fs: allow to mount, if quota is failed
        f2fs: update REQ_TIME in f2fs_cross_rename()
        f2fs: do not update REQ_TIME in case of error conditions
        f2fs: remove unneeded disable_nat_bits()
        f2fs: remove unused sbi->trigger_ssr_threshold
        f2fs: shrink sbi->sb_lock coverage in set_file_temperature()
        f2fs: use rb_*_cached friends
        f2fs: fix to recover cold bit of inode block during POR
        ...
      d6edff78
    • L
      Merge tag 'xfs-4.20-merge-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux · fe0142df
      Linus Torvalds committed
      Pull xfs updates from Dave Chinner:
       "There's not a huge amount of change in this cycle - Darrick has been
        out of action for a couple of months (hence me sending the last few
        pull requests), so we decided a quiet cycle mainly focussed on bug
        fixes was a good idea. Darrick will take the helm again at the end of
        this merge window.
      
        FYI, I may be sending another update later in the cycle - there's a
        pending rework of the clone/dedupe_file_range code that fixes numerous
        bugs that is spread amongst the VFS, XFS and ocfs2 code. It has been
        reviewed and tested, Al and I just need to work out the details of the
        merge, so it may come from him rather than me.
      
        Summary:
      
         - only support filesystems with unwritten extents
      
         - add definition for statfs XFS magic number
      
         - remove unused parameters around reflink code
      
         - more debug for dangling delalloc extents
      
         - cancel COW extents on extent swap targets
      
         - fix quota stats output and clean up the code
      
         - refactor some of the attribute code in preparation for parent
           pointers
      
         - fix several buffer handling bugs"
      
      * tag 'xfs-4.20-merge-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (21 commits)
        xfs: cancel COW blocks before swapext
        xfs: clear ail delwri queued bufs on unmount of shutdown fs
        xfs: use offsetof() in place of offset macros for __xfsstats
        xfs: Fix xqmstats offsets in /proc/fs/xfs/xqmstat
        xfs: fix use-after-free race in xfs_buf_rele
        xfs: Add attibute remove and helper functions
        xfs: Add attibute set and helper functions
        xfs: Add helper function xfs_attr_try_sf_addname
        xfs: Move fs/xfs/xfs_attr.h to fs/xfs/libxfs/xfs_attr.h
        xfs: issue log message on user force shutdown
        xfs: fix buffer state management in xrep_findroot_block
        xfs: always assign buffer verifiers when one is provided
        xfs: xrep_findroot_block should reject root blocks with siblings
        xfs: add a define for statfs magic to uapi
        xfs: print dangling delalloc extents
        xfs: fix fork selection in xfs_find_trim_cow_extent
        xfs: remove the unused trimmed argument from xfs_reflink_trim_around_shared
        xfs: remove the unused shared argument to xfs_reflink_reserve_cow
        xfs: handle zeroing in xfs_file_iomap_begin_delay
        xfs: remove suport for filesystems without unwritten extent flag
        ...
      fe0142df
    • L
      Merge tag 'gfs2-4.20.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 · bfd93a87
      Linus Torvalds committed
      Pull gfs2 updates from Bob Peterson:
       "We've got 18 patches for this merge window, none of which are very
        major:
      
         - clean up the gfs2 block allocator to prepare for future performance
           enhancements (Andreas Gruenbacher)
      
         - fix a use-after-free problem (Andy Price)
      
         - patches that fix gfs2's broken rgrplvb mount option (me)
      
         - cleanup patches and error message improvements (me)
      
         - enable getlabel support (Steve Whitehouse and Abhi Das)
      
         - flush the glock delete workqueue at exit (Tim Smith)"
      
      * tag 'gfs2-4.20.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
        gfs2: Fix minor typo: couln't versus couldn't.
        gfs2: write revokes should traverse sd_ail1_list in reverse
        gfs2: Pass resource group to rgblk_free
        gfs2: Remove unnecessary gfs2_rlist_alloc parameter
        gfs2: Fix marking bitmaps non-full
        gfs2: Fix some minor typos
        gfs2: Rename bitmap.bi_{len => bytes}
        gfs2: Remove unused RGRP_RSRV_MINBYTES definition
        gfs2: Move rs_{sizehint, rgd_gh} fields into the inode
        gfs2: Clean up out-of-bounds check in gfs2_rbm_from_block
        gfs2: Always check the result of gfs2_rbm_from_block
        gfs2: getlabel support
        GFS2: Flush the GFS2 delete workqueue before stopping the kernel threads
        gfs2: Don't leave s_fs_info pointing to freed memory in init_sbd
        gfs2: Use fs_* functions instead of pr_* function where we can
        gfs2: slow the deluge of io error messages
        gfs2: Don't set GFS2_RDF_UPTODATE when the lvb is updated
        gfs2: improve debug information when lvb mismatches are found
      bfd93a87
    • L
      Merge tag 'for-linus-4.20-ofs1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux · e1cbbf40
      Linus Torvalds committed
      Pull orangefs updates from Mike Marshall:
       "Fixes and a cleanup.
      
        Fixes:
         - fix superfluous service_operation return code check in
           orangefs_lookup
         - fix some error code paths that missed kmem_cache_free
         - don't let orangefs_iget return NULL
         - don't let orangefs_new_inode return NULL
         - cache NULL when both default_acl and acl are NULL
      
       Cleanup:
         - rate limit the client not running info message"
      
      * tag 'for-linus-4.20-ofs1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux:
        orangefs: no need to check for service_operation returns > 0
        orangefs: some error code paths missed kmem_cache_free
        orangefs: don't let orangefs_iget return NULL.
        orangefs: don't let orangefs_new_inode return NULL
        orangefs: rate limit the client not running info message
        orangefs: cache NULL when both default_acl and acl are NULL
      e1cbbf40
    • L
      Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 6b609e3b
      Linus Torvalds committed
      Pull vfs fixes from Al Viro.
      
      * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        gfs2_meta: ->mount() can get NULL dev_name
        ecryptfs_rename(): verify that lower dentries are still OK after lock_rename()
        cachefiles: fix the race between cachefiles_bury_object() and rmdir(2)
      6b609e3b
    • L
      Merge tag 'jfs-for-4.20' of git://github.com/kleikamp/linux-shaggy · deba28b1
      Linus Torvalds committed
      Pull jfs updates from David Kleikamp:
       "Just a few small fixes"
      
      * tag 'jfs-for-4.20' of git://github.com/kleikamp/linux-shaggy:
        jfs: remove redundant dquot_initialize() in jfs_evict_inode()
        jfs: remove quota option from ignore list
        jfs: cache NULL when both default_acl and acl are NULL
      deba28b1
    • L
      Merge tag 'for-4.20-part1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · 318b067a
      Linus Torvalds committed
      Pull btrfs updates from David Sterba:
       "This is the first batch with fixes and some nice performance
        improvements.
      
        Preliminary results show eg. more files/sec in fsmark, better perf on
        multi-threaded workloads (filebench, dbench), fewer context switches
        and overall better memory allocation characteristics (multiple
        benchmarks).
      
        Apart from general performance, there's an improvement for qgroups +
        balance workload that's been troubling our users.
      
        Note for stable: there are 20+ patches tagged for stable, out of 90.
        Not all of them apply cleanly on all stable versions but the conflicts
        are mostly due to simple cleanups and resolving should be obvious. The
        fixes are otherwise independent.
      
        Performance improvements:
      
         - transition between blocking and spinning modes of path is gone,
           which originally resulted in more unnecessary wakeups and updates
           to the path locks, the effects are measurable and improve latency
           and scalability
      
         - qgroups: first batch of changes that should speed up balancing with
           qgroups on, skip quota accounting on unchanged subtrees, overall
           gain is about 30+% in runtime
      
         - use rb-tree with cached first node for several structures, small
           improvement to avoid pointer chasing
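
The "cached first node" idea in the last item can be sketched as follows. This is a minimal user-space illustration, not the btrfs or kernel rbtree code: it uses a plain unbalanced binary search tree, and the names `struct node`, `struct cached_root`, `cached_insert`, `cached_first` and `walk_first` are hypothetical. The point it shows is that a new node becomes the leftmost one only if the insertion descent never turned right, so the cache can be maintained for free during insert.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node and root types for illustration; not the kernel's rb_node. */
struct node {
    int key;
    struct node *left, *right;
};

struct cached_root {
    struct node *root;
    struct node *leftmost;   /* cached smallest-key node, NULL when empty */
};

/* Insert n into a plain (unbalanced) BST while keeping the cache current. */
static void cached_insert(struct cached_root *t, struct node *n)
{
    struct node **link = &t->root;
    int went_left_only = 1;

    assert(t != NULL && n != NULL);
    n->left = n->right = NULL;
    while (*link) {
        if (n->key < (*link)->key) {
            link = &(*link)->left;
        } else {
            link = &(*link)->right;
            went_left_only = 0;   /* some existing node is not larger than n */
        }
    }
    *link = n;
    if (went_left_only)
        t->leftmost = n;          /* also covers the empty-tree case */
}

/* O(1) lookup of the first node: no walk down the left spine. */
static struct node *cached_first(const struct cached_root *t)
{
    return t->leftmost;
}

/* Uncached equivalent for comparison: follows left pointers to the bottom. */
static struct node *walk_first(const struct cached_root *t)
{
    struct node *n = t->root;

    while (n && n->left)
        n = n->left;
    return n;
}
```

Removal would also have to update the cache when the leftmost node is deleted; that part is left out of this sketch.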
      
        Fixes:
      
         - trim
            - fix: some blockgroups could have been missed if their logical
              address was past the total filesystem size (ie. after a lot of
              balancing)
            - better error reporting, after processing blockgroups and whole
              device
            - fix: continue trimming block groups after an error is
              encountered
            - check for trim support of the device earlier and avoid some
              unnecessary work
            - less interaction with transaction commit that improves latency
              on slower storage (eg. image files over NFS)
      
         - fsync
            - fix warning when replaying log after fsync of a O_TMPFILE
            - fix wrong dentries after fsync of file that got its parent
              replaced
      
         - qgroups: fix rescan that might miss some dirty groups
      
         - don't clean dirty pages during buffered writes, this could lead to
           lost updates in some corner cases
      
         - some block groups could have been delayed in creation, if the
           allocation triggered another one
      
         - error handling improvements
      
        Cleanups:
      
         - removed unused struct members and variables
      
         - function return type cleanups
      
         - delayed refs code refactoring
      
         - protect against deadlock that could be caused by crafted image that
           tries to allocate from a tree that's locked already"
      
      * tag 'for-4.20-part1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (93 commits)
        btrfs: switch return_bigger to bool in find_ref_head
        btrfs: remove fs_info from btrfs_should_throttle_delayed_refs
        btrfs: remove fs_info from btrfs_check_space_for_delayed_refs
        btrfs: delayed-ref: pass delayed_refs directly to btrfs_delayed_ref_lock
        btrfs: delayed-ref: pass delayed_refs directly to btrfs_select_ref_head
        btrfs: qgroup: move the qgroup->members check out from (!qgroup)'s else branch
        btrfs: relocation: Remove redundant tree level check
        btrfs: relocation: Cleanup while loop using rbtree_postorder_for_each_entry_safe
        btrfs: qgroup: Avoid calling qgroup functions if qgroup is not enabled
        Btrfs: fix wrong dentries after fsync of file that got its parent replaced
        Btrfs: fix warning when replaying log after fsync of a tmpfile
        btrfs: drop min_size from evict_refill_and_join
        btrfs: assert on non-empty delayed iputs
        btrfs: make sure we create all new block groups
        btrfs: reset max_extent_size on clear in a bitmap
        btrfs: protect space cache inode alloc with GFP_NOFS
        btrfs: release metadata before running delayed refs
        Btrfs: kill btrfs_clear_path_blocking
        btrfs: dev-replace: remove pointless assert in write unlock
        btrfs: dev-replace: move replace members out of fs_info
        ...
      318b067a
  4. 24 Oct, 2018 6 commits
    • L
      Merge branch 'work.tty-ioctl' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 44adbac8
      Linus Torvalds committed
      Pull tty ioctl updates from Al Viro:
       "This is the compat_ioctl work related to tty ioctls.
      
        Quite a bit of dead code taken out, all tty-related stuff gone from
        fs/compat_ioctl.c. A bunch of compat bugs fixed - some still remain,
        but all more or less generic tty-related ioctls should be covered
        (remaining issues are in things like driver-private ioctls in a pcmcia
        serial card driver not getting properly handled in 32bit processes on
        64bit host, etc)"
      
      * 'work.tty-ioctl' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (53 commits)
        kill TIOCSERGSTRUCT
        change semantics of ldisc ->compat_ioctl()
        kill TIOCSER[SG]WILD
        synclink_gt(): fix compat_ioctl()
        pty: fix compat ioctls
        compat_ioctl - kill keyboard ioctl handling
        gigaset: add ->compat_ioctl()
        vt_compat_ioctl(): clean up, use compat_ptr() properly
        gigaset: don't try to printk userland buffer contents
        dgnc: don't bother with (empty) stub for TCXONC
        dgnc: leave TIOC[GS]SOFTCAR to ldisc
        remove fallback to drivers for TIOCGICOUNT
        dgnc: break-related ioctls won't reach ->ioctl()
        kill the rest of tty COMPAT_IOCTL() entries
        dgnc: TIOCM... won't reach ->ioctl()
        isdn_tty: TCSBRK{,P} won't reach ->ioctl()
        kill capinc_tty_ioctl()
        take compat TIOC[SG]SERIAL treatment into tty_compat_ioctl()
        synclink: reduce pointless checks in ->ioctl()
        complete ->[sg]et_serial() switchover
        ...
      44adbac8
    • L
      Merge tag 'pstore-v4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux · 08ffb584
      Linus Torvalds committed
      Pull pstore updates from Kees Cook:
       "pstore improvements:
      
         - refactor init to happen as early as possible again (Joel Fernandes)
      
         - improve resource reservation names"
      
      * tag 'pstore-v4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
        pstore/ram: Clarify resource reservation labels
        pstore: Refactor compression initialization
        pstore: Allocate compression during late_initcall()
        pstore: Centralize init/exit routines
      08ffb584
    • L
      Merge branch 'next-general' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security · 638820d8
      Linus Torvalds committed
      Pull security subsystem updates from James Morris:
       "In this patchset, there are a couple of minor updates, as well as some
        reworking of the LSM initialization code from Kees Cook (these prepare
        the way for ordered stackable LSMs, but are a valuable cleanup on
        their own)"
      
      * 'next-general' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
        LSM: Don't ignore initialization failures
        LSM: Provide init debugging infrastructure
        LSM: Record LSM name in struct lsm_info
        LSM: Convert security_initcall() into DEFINE_LSM()
        vmlinux.lds.h: Move LSM_TABLE into INIT_DATA
        LSM: Convert from initcall to struct lsm_info
        LSM: Remove initcall tracing
        LSM: Rename .security_initcall section to .lsm_info
        vmlinux.lds.h: Avoid copy/paste of security_init section
        LSM: Correctly announce start of LSM initialization
        security: fix LSM description location
        keys: Fix the use of the C++ keyword "private" in uapi/linux/keyctl.h
        seccomp: remove unnecessary unlikely()
        security: tomoyo: Fix obsolete function
        security/capabilities: remove check for -EINVAL
      638820d8
    • L
      Merge tag 'selinux-pr-20181022' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux · d5e4d81d
      Linus Torvalds committed
      Pull SELinux updates from Paul Moore:
       "Three SELinux patches for v4.20, all fall under the bug-fix or
        behave-better category, which is good. All three have pretty good
        descriptions too, which is even better"
      
      * tag 'selinux-pr-20181022' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
        selinux: Add __GFP_NOWARN to allocation at str_read()
        selinux: refactor mls_context_to_sid() and make it stricter
        selinux: fix mounting of cgroup2 under older policies
      d5e4d81d
    • L
      Merge branch 'siginfo-linus' of... · ba9f6f89
      Linus Torvalds committed
      Merge branch 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
      
      Pull siginfo updates from Eric Biederman:
       "I have been slowly sorting out siginfo and this is the culmination of
        that work.
      
        The primary result is that in several ways the signal infrastructure
        has been made less error prone. The code has been updated so that manually
        specifying SEND_SIG_FORCED is never necessary. The conversion to the
        new siginfo sending functions is now complete, which makes it
        difficult to send a signal without filling in the proper siginfo
        fields.
      
        At the tail end of the patchset comes the optimization of decreasing
        the size of struct siginfo in the kernel from 128 bytes to about 48
        bytes on 64bit. The fundamental observation that enables this is that,
        by definition, none of the known ways to use struct siginfo uses the extra
        bytes.
      
        This comes at the cost of a small user space observable difference.
        For the rare case of siginfo being injected into the kernel only what
        can be copied into kernel_siginfo is delivered to the destination, the
        rest of the bytes are set to 0. For cases where the signal and the
        si_code are known this is safe, because we know those bytes are not
        used. For cases where the signal and si_code combination is unknown
        the bits that won't fit into struct kernel_siginfo are tested to
        verify they are zero, and the send fails if they are not.
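
The rule in the paragraph above can be modelled in user space roughly as follows. This is a sketch under assumptions, not the kernel's copy routine: `copy_small`, `tail_is_zero` and the `layout_known` flag are hypothetical names, and 48 is taken from the message's "about 48 bytes", so the exact size is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum {
    USER_SIGINFO_SIZE = 128,   /* size quoted in the message */
    KERNEL_SIGINFO_SIZE = 48   /* "about 48 bytes on 64bit"; exact value assumed */
};

/* Return 1 if every byte in buf[from..to) is zero, 0 otherwise. */
static int tail_is_zero(const unsigned char *buf, size_t from, size_t to)
{
    for (size_t i = from; i < to; i++)
        if (buf[i] != 0)
            return 0;
    return 1;
}

/*
 * Hypothetical model: copy a user-sized record into the smaller buffer.
 * layout_known stands in for "the signal and si_code combination is known".
 * Returns 0 on success, -1 if an unknown layout carries data in the tail
 * that would otherwise be silently dropped.
 */
static int copy_small(unsigned char *dst, const unsigned char *src,
                      int layout_known)
{
    assert(dst != NULL && src != NULL);
    if (!layout_known &&
        !tail_is_zero(src, KERNEL_SIGINFO_SIZE, USER_SIGINFO_SIZE))
        return -1;
    memcpy(dst, src, KERNEL_SIGINFO_SIZE);
    return 0;
}
```

For a known layout the tail is simply not copied; for an unknown one the copy is refused rather than truncated, which matches the "send fails if they are not" behaviour described above.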
      
        I made an extensive search through userspace code and I could not find
        anything that would break because of the above change. If it turns out
        I did break something it will take just the revert of a single change
        to restore kernel_siginfo to the same size as userspace siginfo.
      
        Testing did reveal dependencies on preferring the signo passed to
        sigqueueinfo over si->signo, so I bit the bullet and added the
        complexity necessary to handle that case.
      
        Testing also revealed bad things can happen if a negative signal
        number is passed into the system calls. Something no sane application
        will do but something a malicious program or a fuzzer might do. So I
        have fixed the code that performs the bounds checks to ensure negative
        signal numbers are handled"
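
The class of bug described in the last paragraph can be illustrated with a small sketch. It is not the code from the patches: `MAX_SIG`, `sig_ok_upper_only`, `sig_ok`, `sig_ok_unsigned` and `table_index` are hypothetical, and treating 1..MAX_SIG as the valid range is an assumption made for the example. It shows why a check that tests only the upper bound of a signed value lets negative numbers through, and two ways to close the gap.

```c
#include <assert.h>

#define MAX_SIG 64   /* illustrative upper bound, not the kernel's constant */

/* Flawed check: only the upper bound is tested, so a value like -5 passes. */
static int sig_ok_upper_only(int sig)
{
    return sig <= MAX_SIG;
}

/* Both bounds tested explicitly; this sketch treats 1..MAX_SIG as valid. */
static int sig_ok(int sig)
{
    return sig >= 1 && sig <= MAX_SIG;
}

/*
 * Equivalent single comparison: after conversion to unsigned, zero and
 * negative inputs wrap to very large values and fail the upper bound.
 */
static int sig_ok_unsigned(int sig)
{
    return (unsigned int)sig - 1u < (unsigned int)MAX_SIG;
}

/* Index into a hypothetical per-signal table; only safe after validation. */
static int table_index(int sig)
{
    assert(sig_ok(sig));
    return sig - 1;
}
```

With the upper-bound-only check, a negative number would reach `table_index` and produce a negative array index, which is the kind of input a fuzzer finds quickly.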
      
      * 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (80 commits)
        signal: Guard against negative signal numbers in copy_siginfo_from_user32
        signal: Guard against negative signal numbers in copy_siginfo_from_user
        signal: In sigqueueinfo prefer sig not si_signo
        signal: Use a smaller struct siginfo in the kernel
        signal: Distinguish between kernel_siginfo and siginfo
        signal: Introduce copy_siginfo_from_user and use it's return value
        signal: Remove the need for __ARCH_SI_PREABLE_SIZE and SI_PAD_SIZE
        signal: Fail sigqueueinfo if si_signo != sig
        signal/sparc: Move EMT_TAGOVF into the generic siginfo.h
        signal/unicore32: Use force_sig_fault where appropriate
        signal/unicore32: Generate siginfo in ucs32_notify_die
        signal/unicore32: Use send_sig_fault where appropriate
        signal/arc: Use force_sig_fault where appropriate
        signal/arc: Push siginfo generation into unhandled_exception
        signal/ia64: Use force_sig_fault where appropriate
        signal/ia64: Use the force_sig(SIGSEGV,...) in ia64_rt_sigreturn
        signal/ia64: Use the generic force_sigsegv in setup_frame
        signal/arm/kvm: Use send_sig_mceerr
        signal/arm: Use send_sig_fault where appropriate
        signal/arm: Use force_sig_fault where appropriate
        ...
      ba9f6f89
    • L
      net/kconfig: Make QCOM_QMI_HELPERS available when COMPILE_TEST · a978a5b8
      Linus Torvalds committed
      The networking merge brought in the experimental support for the
      Qualcomm ath10k system NOC, which selects QCOM_QMI_HELPERS as a
      dependency.
      
      But the ATH10K_SNOC option (which selects QCOM_QMI_HELPERS) depends on
      ARCH_QCOM || COMPILE_TEST in order to get wider build testing than just
      the unusual QCOM architecture build, while the QCOM_QMI_HELPERS option
      doesn't have that COMPILE_TEST option and is limited to only ARCH_QCOM.
      
      As a result, a "make allmodconfig" complains
      
        WARNING: unmet direct dependencies detected for QCOM_QMI_HELPERS
          Depends on [n]: ARCH_QCOM && NET [=y]
          Selected by [m]:
          - ATH10K_SNOC [=m] && NETDEVICES [=y] && WLAN [=y] && WLAN_VENDOR_ATH [=y] && ATH10K [=m] && (ARCH_QCOM || COMPILE_TEST [=y])
      
      Fix the config-time warning by making QCOM_QMI_HELPERS available when
      COMPILE_TEST, since the result seems to build fine.
      
      Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
      Cc: Govind Singh <govinds@codeaurora.org>
      Cc: Kalle Valo <kvalo@codeaurora.org>
      Cc: David Miller <davem@davemloft.net>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      a978a5b8