1. 16 1月, 2020 1 次提交
  2. 13 1月, 2020 1 次提交
  3. 10 1月, 2020 1 次提交
  4. 09 1月, 2020 1 次提交
    • E
      macvlan: do not assume mac_header is set in macvlan_broadcast() · 96cc4b69
      Eric Dumazet 提交于
      Use of eth_hdr() in tx path is error prone.
      
      Many drivers call skb_reset_mac_header() before using it,
      but others do not.
      
      Commit 6d1ccff6 ("net: reset mac header in dev_start_xmit()")
      attempted to fix this generically, but commit d346a3fa
      ("packet: introduce PACKET_QDISC_BYPASS socket option") brought
      back the macvlan bug.
      
      Lets add a new helper, so that tx paths no longer have
      to call skb_reset_mac_header() only to get a pointer
      to skb->data.
      
      Hopefully we will be able to revert 6d1ccff6
      ("net: reset mac header in dev_start_xmit()") and save few cycles
      in transmit fast path.
      
      BUG: KASAN: use-after-free in __get_unaligned_cpu32 include/linux/unaligned/packed_struct.h:19 [inline]
      BUG: KASAN: use-after-free in mc_hash drivers/net/macvlan.c:251 [inline]
      BUG: KASAN: use-after-free in macvlan_broadcast+0x547/0x620 drivers/net/macvlan.c:277
      Read of size 4 at addr ffff8880a4932401 by task syz-executor947/9579
      
      CPU: 0 PID: 9579 Comm: syz-executor947 Not tainted 5.5.0-rc4-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x197/0x210 lib/dump_stack.c:118
       print_address_description.constprop.0.cold+0xd4/0x30b mm/kasan/report.c:374
       __kasan_report.cold+0x1b/0x41 mm/kasan/report.c:506
       kasan_report+0x12/0x20 mm/kasan/common.c:639
       __asan_report_load_n_noabort+0xf/0x20 mm/kasan/generic_report.c:145
       __get_unaligned_cpu32 include/linux/unaligned/packed_struct.h:19 [inline]
       mc_hash drivers/net/macvlan.c:251 [inline]
       macvlan_broadcast+0x547/0x620 drivers/net/macvlan.c:277
       macvlan_queue_xmit drivers/net/macvlan.c:520 [inline]
       macvlan_start_xmit+0x402/0x77f drivers/net/macvlan.c:559
       __netdev_start_xmit include/linux/netdevice.h:4447 [inline]
       netdev_start_xmit include/linux/netdevice.h:4461 [inline]
       dev_direct_xmit+0x419/0x630 net/core/dev.c:4079
       packet_direct_xmit+0x1a9/0x250 net/packet/af_packet.c:240
       packet_snd net/packet/af_packet.c:2966 [inline]
       packet_sendmsg+0x260d/0x6220 net/packet/af_packet.c:2991
       sock_sendmsg_nosec net/socket.c:639 [inline]
       sock_sendmsg+0xd7/0x130 net/socket.c:659
       __sys_sendto+0x262/0x380 net/socket.c:1985
       __do_sys_sendto net/socket.c:1997 [inline]
       __se_sys_sendto net/socket.c:1993 [inline]
       __x64_sys_sendto+0xe1/0x1a0 net/socket.c:1993
       do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x442639
      Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 5b 10 fc ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007ffc13549e08 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
      RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000442639
      RDX: 000000000000000e RSI: 0000000020000080 RDI: 0000000000000003
      RBP: 0000000000000004 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
      R13: 0000000000403bb0 R14: 0000000000000000 R15: 0000000000000000
      
      Allocated by task 9389:
       save_stack+0x23/0x90 mm/kasan/common.c:72
       set_track mm/kasan/common.c:80 [inline]
       __kasan_kmalloc mm/kasan/common.c:513 [inline]
       __kasan_kmalloc.constprop.0+0xcf/0xe0 mm/kasan/common.c:486
       kasan_kmalloc+0x9/0x10 mm/kasan/common.c:527
       __do_kmalloc mm/slab.c:3656 [inline]
       __kmalloc+0x163/0x770 mm/slab.c:3665
       kmalloc include/linux/slab.h:561 [inline]
       tomoyo_realpath_from_path+0xc5/0x660 security/tomoyo/realpath.c:252
       tomoyo_get_realpath security/tomoyo/file.c:151 [inline]
       tomoyo_path_perm+0x230/0x430 security/tomoyo/file.c:822
       tomoyo_inode_getattr+0x1d/0x30 security/tomoyo/tomoyo.c:129
       security_inode_getattr+0xf2/0x150 security/security.c:1222
       vfs_getattr+0x25/0x70 fs/stat.c:115
       vfs_statx_fd+0x71/0xc0 fs/stat.c:145
       vfs_fstat include/linux/fs.h:3265 [inline]
       __do_sys_newfstat+0x9b/0x120 fs/stat.c:378
       __se_sys_newfstat fs/stat.c:375 [inline]
       __x64_sys_newfstat+0x54/0x80 fs/stat.c:375
       do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Freed by task 9389:
       save_stack+0x23/0x90 mm/kasan/common.c:72
       set_track mm/kasan/common.c:80 [inline]
       kasan_set_free_info mm/kasan/common.c:335 [inline]
       __kasan_slab_free+0x102/0x150 mm/kasan/common.c:474
       kasan_slab_free+0xe/0x10 mm/kasan/common.c:483
       __cache_free mm/slab.c:3426 [inline]
       kfree+0x10a/0x2c0 mm/slab.c:3757
       tomoyo_realpath_from_path+0x1a7/0x660 security/tomoyo/realpath.c:289
       tomoyo_get_realpath security/tomoyo/file.c:151 [inline]
       tomoyo_path_perm+0x230/0x430 security/tomoyo/file.c:822
       tomoyo_inode_getattr+0x1d/0x30 security/tomoyo/tomoyo.c:129
       security_inode_getattr+0xf2/0x150 security/security.c:1222
       vfs_getattr+0x25/0x70 fs/stat.c:115
       vfs_statx_fd+0x71/0xc0 fs/stat.c:145
       vfs_fstat include/linux/fs.h:3265 [inline]
       __do_sys_newfstat+0x9b/0x120 fs/stat.c:378
       __se_sys_newfstat fs/stat.c:375 [inline]
       __x64_sys_newfstat+0x54/0x80 fs/stat.c:375
       do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      The buggy address belongs to the object at ffff8880a4932000
       which belongs to the cache kmalloc-4k of size 4096
      The buggy address is located 1025 bytes inside of
       4096-byte region [ffff8880a4932000, ffff8880a4933000)
      The buggy address belongs to the page:
      page:ffffea0002924c80 refcount:1 mapcount:0 mapping:ffff8880aa402000 index:0x0 compound_mapcount: 0
      raw: 00fffe0000010200 ffffea0002846208 ffffea00028f3888 ffff8880aa402000
      raw: 0000000000000000 ffff8880a4932000 0000000100000001 0000000000000000
      page dumped because: kasan: bad access detected
      
      Memory state around the buggy address:
       ffff8880a4932300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
       ffff8880a4932380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      >ffff8880a4932400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                         ^
       ffff8880a4932480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
       ffff8880a4932500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      
      Fixes: b863ceb7 ("[NET]: Add macvlan driver")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: Nsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      96cc4b69
  5. 07 1月, 2020 1 次提交
  6. 06 1月, 2020 1 次提交
  7. 05 1月, 2020 3 次提交
    • J
      block: remove unused mp_bvec_last_segment · 57415790
      Jens Axboe 提交于
      After commit 85a8ce62 ("block: add bio_truncate to fix guard_bio_eod")
      this function is unused, remove it.
      Reviewed-by: NMing Lei <ming.lei@redhat.com>
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      57415790
    • A
      kcov: fix struct layout for kcov_remote_arg · a69b83e1
      Andrey Konovalov 提交于
      Make the layout of kcov_remote_arg the same for 32-bit and 64-bit code.
      This makes it more convenient to write userspace apps that can be
      compiled into 32-bit or 64-bit binaries and still work with the same
      64-bit kernel.
      
      Also use proper __u32 types in uapi headers instead of unsigned ints.
      
      Link: http://lkml.kernel.org/r/9e91020876029cfefc9211ff747685eba9536426.1575638983.git.andreyknvl@google.com
      Fixes: eec028c9 ("kcov: remote coverage support")
      Signed-off-by: NAndrey Konovalov <andreyknvl@google.com>
      Acked-by: NMarco Elver <elver@google.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: Felipe Balbi <balbi@kernel.org>
      Cc: Chunfeng Yun <chunfeng.yun@mediatek.com>
      Cc: "Jacky . Cao @ sony . com" <Jacky.Cao@sony.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Marco Elver <elver@google.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a69b83e1
    • D
      mm/memory_hotplug: shrink zones when offlining memory · feee6b29
      David Hildenbrand 提交于
      We currently try to shrink a single zone when removing memory.  We use
      the zone of the first page of the memory we are removing.  If that
      memmap was never initialized (e.g., memory was never onlined), we will
      read garbage and can trigger kernel BUGs (due to a stale pointer):
      
          BUG: unable to handle page fault for address: 000000000000353d
          #PF: supervisor write access in kernel mode
          #PF: error_code(0x0002) - not-present page
          PGD 0 P4D 0
          Oops: 0002 [#1] SMP PTI
          CPU: 1 PID: 7 Comm: kworker/u8:0 Not tainted 5.3.0-rc5-next-20190820+ #317
          Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.4
          Workqueue: kacpi_hotplug acpi_hotplug_work_fn
          RIP: 0010:clear_zone_contiguous+0x5/0x10
          Code: 48 89 c6 48 89 c3 e8 2a fe ff ff 48 85 c0 75 cf 5b 5d c3 c6 85 fd 05 00 00 01 5b 5d c3 0f 1f 840
          RSP: 0018:ffffad2400043c98 EFLAGS: 00010246
          RAX: 0000000000000000 RBX: 0000000200000000 RCX: 0000000000000000
          RDX: 0000000000200000 RSI: 0000000000140000 RDI: 0000000000002f40
          RBP: 0000000140000000 R08: 0000000000000000 R09: 0000000000000001
          R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000140000
          R13: 0000000000140000 R14: 0000000000002f40 R15: ffff9e3e7aff3680
          FS:  0000000000000000(0000) GS:ffff9e3e7bb00000(0000) knlGS:0000000000000000
          CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
          CR2: 000000000000353d CR3: 0000000058610000 CR4: 00000000000006e0
          DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
          DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
          Call Trace:
           __remove_pages+0x4b/0x640
           arch_remove_memory+0x63/0x8d
           try_remove_memory+0xdb/0x130
           __remove_memory+0xa/0x11
           acpi_memory_device_remove+0x70/0x100
           acpi_bus_trim+0x55/0x90
           acpi_device_hotplug+0x227/0x3a0
           acpi_hotplug_work_fn+0x1a/0x30
           process_one_work+0x221/0x550
           worker_thread+0x50/0x3b0
           kthread+0x105/0x140
           ret_from_fork+0x3a/0x50
          Modules linked in:
          CR2: 000000000000353d
      
      Instead, shrink the zones when offlining memory or when onlining failed.
      Introduce and use remove_pfn_range_from_zone(() for that.  We now
      properly shrink the zones, even if we have DIMMs whereby
      
       - Some memory blocks fall into no zone (never onlined)
      
       - Some memory blocks fall into multiple zones (offlined+re-onlined)
      
       - Multiple memory blocks that fall into different zones
      
      Drop the zone parameter (with a potential dubious value) from
      __remove_pages() and __remove_section().
      
      Link: http://lkml.kernel.org/r/20191006085646.5768-6-david@redhat.com
      Fixes: f1dd2cd1 ("mm, memory_hotplug: do not associate hotadded memory to zones until online")	[visible after d0dc12e8]
      Signed-off-by: NDavid Hildenbrand <david@redhat.com>
      Reviewed-by: NOscar Salvador <osalvador@suse.de>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
      Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
      Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Logan Gunthorpe <logang@deltatee.com>
      Cc: <stable@vger.kernel.org>	[5.0+]
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      feee6b29
  8. 04 1月, 2020 1 次提交
  9. 03 1月, 2020 1 次提交
    • D
      Revert "fs: remove ksys_dup()" · 74f1a299
      Dominik Brodowski 提交于
      This reverts commit 8243186f ("fs: remove ksys_dup()") and the
      subsequent fix for it in commit 2d3145f8 ("early init: fix error
      handling when opening /dev/console").
      
      Trying to use filp_open() and f_dupfd() instead of pseudo-syscalls
      caused more trouble than what is worth it: it requires accessing vfs
      internals and it turns out there were other bugs in it too.
      
      In particular, the file reference counting was wrong - because unlike
      the original "open+2*dup" sequence it used "filp_open+3*f_dupfd" and
      thus had an extra leaked file reference.
      
      That in turn then caused odd problems with Androidx86 long after boot
      becaue of how the extra reference to the console kept the session active
      even after all file descriptors had been closed.
      Reported-by: Nyouling 257 <youling257@gmail.com>
      Cc: Arvind Sankar <nivedita@alum.mit.edu>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: NDominik Brodowski <linux@dominikbrodowski.net>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      74f1a299
  10. 02 1月, 2020 1 次提交
  11. 31 12月, 2019 3 次提交
    • D
      net/sched: add delete_empty() to filters and use it in cls_flower · a5b72a08
      Davide Caratti 提交于
      Revert "net/sched: cls_u32: fix refcount leak in the error path of
      u32_change()", and fix the u32 refcount leak in a more generic way that
      preserves the semantic of rule dumping.
      On tc filters that don't support lockless insertion/removal, there is no
      need to guard against concurrent insertion when a removal is in progress.
      Therefore, for most of them we can avoid a full walk() when deleting, and
      just decrease the refcount, like it was done on older Linux kernels.
      This fixes situations where walk() was wrongly detecting a non-empty
      filter, like it happened with cls_u32 in the error path of change(), thus
      leading to failures in the following tdc selftests:
      
       6aa7: (filter, u32) Add/Replace u32 with source match and invalid indev
       6658: (filter, u32) Add/Replace u32 with custom hash table and invalid handle
       74c2: (filter, u32) Add/Replace u32 filter with invalid hash table id
      
      On cls_flower, and on (future) lockless filters, this check is necessary:
      move all the check_empty() logic in a callback so that each filter
      can have its own implementation. For cls_flower, it's sufficient to check
      if no IDRs have been allocated.
      
      This reverts commit 275c44aa.
      
      Changes since v1:
       - document the need for delete_empty() when TCF_PROTO_OPS_DOIT_UNLOCKED
         is used, thanks to Vlad Buslov
       - implement delete_empty() without doing fl_walk(), thanks to Vlad Buslov
       - squash revert and new fix in a single patch, to be nice with bisect
         tests that run tdc on u32 filter, thanks to Dave Miller
      
      Fixes: 275c44aa ("net/sched: cls_u32: fix refcount leak in the error path of u32_change()")
      Fixes: 6676d5e4 ("net: sched: set dedicated tcf_walker flag when tp is empty")
      Suggested-by: NJamal Hadi Salim <jhs@mojatatu.com>
      Suggested-by: NVlad Buslov <vladbu@mellanox.com>
      Signed-off-by: NDavide Caratti <dcaratti@redhat.com>
      Reviewed-by: NVlad Buslov <vladbu@mellanox.com>
      Tested-by: NJamal Hadi Salim <jhs@mojatatu.com>
      Acked-by: NJamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a5b72a08
    • V
      ptp: fix the race between the release of ptp_clock and cdev · a33121e5
      Vladis Dronov 提交于
      In a case when a ptp chardev (like /dev/ptp0) is open but an underlying
      device is removed, closing this file leads to a race. This reproduces
      easily in a kvm virtual machine:
      
      ts# cat openptp0.c
      int main() { ... fp = fopen("/dev/ptp0", "r"); ... sleep(10); }
      ts# uname -r
      5.5.0-rc3-46cf053e
      ts# cat /proc/cmdline
      ... slub_debug=FZP
      ts# modprobe ptp_kvm
      ts# ./openptp0 &
      [1] 670
      opened /dev/ptp0, sleeping 10s...
      ts# rmmod ptp_kvm
      ts# ls /dev/ptp*
      ls: cannot access '/dev/ptp*': No such file or directory
      ts# ...woken up
      [   48.010809] general protection fault: 0000 [#1] SMP
      [   48.012502] CPU: 6 PID: 658 Comm: openptp0 Not tainted 5.5.0-rc3-46cf053e #25
      [   48.014624] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), ...
      [   48.016270] RIP: 0010:module_put.part.0+0x7/0x80
      [   48.017939] RSP: 0018:ffffb3850073be00 EFLAGS: 00010202
      [   48.018339] RAX: 000000006b6b6b6b RBX: 6b6b6b6b6b6b6b6b RCX: ffff89a476c00ad0
      [   48.018936] RDX: fffff65a08d3ea08 RSI: 0000000000000247 RDI: 6b6b6b6b6b6b6b6b
      [   48.019470] ...                                              ^^^ a slub poison
      [   48.023854] Call Trace:
      [   48.024050]  __fput+0x21f/0x240
      [   48.024288]  task_work_run+0x79/0x90
      [   48.024555]  do_exit+0x2af/0xab0
      [   48.024799]  ? vfs_write+0x16a/0x190
      [   48.025082]  do_group_exit+0x35/0x90
      [   48.025387]  __x64_sys_exit_group+0xf/0x10
      [   48.025737]  do_syscall_64+0x3d/0x130
      [   48.026056]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      [   48.026479] RIP: 0033:0x7f53b12082f6
      [   48.026792] ...
      [   48.030945] Modules linked in: ptp i6300esb watchdog [last unloaded: ptp_kvm]
      [   48.045001] Fixing recursive fault but reboot is needed!
      
      This happens in:
      
      static void __fput(struct file *file)
      {   ...
          if (file->f_op->release)
              file->f_op->release(inode, file); <<< cdev is kfree'd here
          if (unlikely(S_ISCHR(inode->i_mode) && inode->i_cdev != NULL &&
                   !(mode & FMODE_PATH))) {
              cdev_put(inode->i_cdev); <<< cdev fields are accessed here
      
      Namely:
      
      __fput()
        posix_clock_release()
          kref_put(&clk->kref, delete_clock) <<< the last reference
            delete_clock()
              delete_ptp_clock()
                kfree(ptp) <<< cdev is embedded in ptp
        cdev_put
          module_put(p->owner) <<< *p is kfree'd, bang!
      
      Here cdev is embedded in posix_clock which is embedded in ptp_clock.
      The race happens because ptp_clock's lifetime is controlled by two
      refcounts: kref and cdev.kobj in posix_clock. This is wrong.
      
      Make ptp_clock's sysfs device a parent of cdev with cdev_device_add()
      created especially for such cases. This way the parent device with its
      ptp_clock is not released until all references to the cdev are released.
      This adds a requirement that an initialized but not exposed struct
      device should be provided to posix_clock_register() by a caller instead
      of a simple dev_t.
      
      This approach was adopted from the commit 72139dfa ("watchdog: Fix
      the race between the release of watchdog_core_data and cdev"). See
      details of the implementation in the commit 233ed09d ("chardev: add
      helper function to register char devs with a struct device").
      
      Link: https://lore.kernel.org/linux-fsdevel/20191125125342.6189-1-vdronov@redhat.com/T/#uAnalyzed-by: NStephen Johnston <sjohnsto@redhat.com>
      Analyzed-by: NVern Lovejoy <vlovejoy@redhat.com>
      Signed-off-by: NVladis Dronov <vdronov@redhat.com>
      Acked-by: NRichard Cochran <richardcochran@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a33121e5
    • K
      kernel.h: Remove unused FIELD_SIZEOF() · 1f07dcc4
      Kees Cook 提交于
      Now that all callers of FIELD_SIZEOF() have been converted to
      sizeof_field(), remove the unused prior macro.
      Signed-off-by: NKees Cook <keescook@chromium.org>
      1f07dcc4
  12. 29 12月, 2019 1 次提交
    • M
      block: add bio_truncate to fix guard_bio_eod · 85a8ce62
      Ming Lei 提交于
      Some filesystem, such as vfat, may send bio which crosses device boundary,
      and the worse thing is that the IO request starting within device boundaries
      can contain more than one segment past EOD.
      
      Commit dce30ca9 ("fs: fix guard_bio_eod to check for real EOD errors")
      tries to fix this issue by returning -EIO for this situation. However,
      this way lets fs user code lose chance to handle -EIO, then sync_inodes_sb()
      may hang for ever.
      
      Also the current truncating on last segment is dangerous by updating the
      last bvec, given bvec table becomes not immutable any more, and fs bio
      users may not retrieve the truncated pages via bio_for_each_segment_all() in
      its .end_io callback.
      
      Fixes this issue by supporting multi-segment truncating. And the
      approach is simpler:
      
      - just update bio size since block layer can make correct bvec with
      the updated bio size. Then bvec table becomes really immutable.
      
      - zero all truncated segments for read bio
      
      Cc: Carlos Maiolino <cmaiolino@redhat.com>
      Cc: linux-fsdevel@vger.kernel.org
      Fixed-by: dce30ca9 ("fs: fix guard_bio_eod to check for real EOD errors")
      Reported-by: syzbot+2b9e54155c8c25d8d165@syzkaller.appspotmail.com
      Signed-off-by: NMing Lei <ming.lei@redhat.com>
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      85a8ce62
  13. 28 12月, 2019 1 次提交
  14. 27 12月, 2019 1 次提交
  15. 26 12月, 2019 2 次提交
    • F
      ata: libahci_platform: Export again ahci_platform_<en/dis>able_phys() · 84b032db
      Florian Fainelli 提交于
      This reverts commit 6bb86fef
      ("libahci_platform: Staticize ahci_platform_<en/dis>able_phys()") we are
      going to need ahci_platform_{enable,disable}_phys() in a subsequent
      commit for ahci_brcm.c in order to properly control the PHY
      initialization order.
      
      Also make sure the function prototypes are declared in
      include/linux/ahci_platform.h as a result.
      
      Cc: stable@vger.kernel.org
      Reviewed-by: NHans de Goede <hdegoede@redhat.com>
      Signed-off-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      84b032db
    • S
      libata: Fix retrieving of active qcs · 8385d756
      Sascha Hauer 提交于
      ata_qc_complete_multiple() is called with a mask of the still active
      tags.
      
      mv_sata doesn't have this information directly and instead calculates
      the still active tags from the started tags (ap->qc_active) and the
      finished tags as (ap->qc_active ^ done_mask)
      
      Since 28361c40 the hw_tag and tag are no longer the same and the
      equation is no longer valid. In ata_exec_internal_sg() ap->qc_active is
      initialized as 1ULL << ATA_TAG_INTERNAL, but in hardware tag 0 is
      started and this will be in done_mask on completion. ap->qc_active ^
      done_mask becomes 0x100000000 ^ 0x1 = 0x100000001 and thus tag 0 used as
      the internal tag will never be reported as completed.
      
      This is fixed by introducing ata_qc_get_active() which returns the
      active hardware tags and calling it where appropriate.
      
      This is tested on mv_sata, but sata_fsl and sata_nv suffer from the same
      problem. There is another case in sata_nv that most likely needs fixing
      as well, but this looks a little different, so I wasn't confident enough
      to change that.
      
      Fixes: 28361c40 ("libata: add extra internal command")
      Cc: stable@vger.kernel.org
      Tested-by: NPali Rohár <pali.rohar@gmail.com>
      Signed-off-by: NSascha Hauer <s.hauer@pengutronix.de>
      
      Add missing export of ata_qc_get_active(), as per Pali.
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      8385d756
  16. 25 12月, 2019 3 次提交
    • H
      net/dst: do not confirm neighbor for vxlan and geneve pmtu update · f081042d
      Hangbin Liu 提交于
      When do IPv6 tunnel PMTU update and calls __ip6_rt_update_pmtu() in the end,
      we should not call dst_confirm_neigh() as there is no two-way communication.
      
      So disable the neigh confirm for vxlan and geneve pmtu update.
      
      v5: No change.
      v4: No change.
      v3: Do not remove dst_confirm_neigh, but add a new bool parameter in
          dst_ops.update_pmtu to control whether we should do neighbor confirm.
          Also split the big patch to small ones for each area.
      v2: Remove dst_confirm_neigh in __ip6_rt_update_pmtu.
      
      Fixes: a93bf0ff ("vxlan: update skb dst pmtu on tx path")
      Fixes: 52a589d5 ("geneve: update skb dst pmtu on tx path")
      Reviewed-by: NGuillaume Nault <gnault@redhat.com>
      Tested-by: NGuillaume Nault <gnault@redhat.com>
      Acked-by: NDavid Ahern <dsahern@gmail.com>
      Signed-off-by: NHangbin Liu <liuhangbin@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f081042d
    • H
      net/dst: add new function skb_dst_update_pmtu_no_confirm · 07dc35c6
      Hangbin Liu 提交于
      Add a new function skb_dst_update_pmtu_no_confirm() for callers who need
      update pmtu but should not do neighbor confirm.
      
      v5: No change.
      v4: No change.
      v3: Do not remove dst_confirm_neigh, but add a new bool parameter in
          dst_ops.update_pmtu to control whether we should do neighbor confirm.
          Also split the big patch to small ones for each area.
      v2: Remove dst_confirm_neigh in __ip6_rt_update_pmtu.
      Reviewed-by: NGuillaume Nault <gnault@redhat.com>
      Acked-by: NDavid Ahern <dsahern@gmail.com>
      Signed-off-by: NHangbin Liu <liuhangbin@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      07dc35c6
    • H
      net: add bool confirm_neigh parameter for dst_ops.update_pmtu · bd085ef6
      Hangbin Liu 提交于
      The MTU update code is supposed to be invoked in response to real
      networking events that update the PMTU. In IPv6 PMTU update function
      __ip6_rt_update_pmtu() we called dst_confirm_neigh() to update neighbor
      confirmed time.
      
      But for tunnel code, it will call pmtu before xmit, like:
        - tnl_update_pmtu()
          - skb_dst_update_pmtu()
            - ip6_rt_update_pmtu()
              - __ip6_rt_update_pmtu()
                - dst_confirm_neigh()
      
      If the tunnel remote dst mac address changed and we still do the neigh
      confirm, we will not be able to update neigh cache and ping6 remote
      will failed.
      
      So for this ip_tunnel_xmit() case, _EVEN_ if the MTU is changed, we
      should not be invoking dst_confirm_neigh() as we have no evidence
      of successful two-way communication at this point.
      
      On the other hand it is also important to keep the neigh reachability fresh
      for TCP flows, so we cannot remove this dst_confirm_neigh() call.
      
      To fix the issue, we have to add a new bool parameter for dst_ops.update_pmtu
      to choose whether we should do neigh update or not. I will add the parameter
      in this patch and set all the callers to true to comply with the previous
      way, and fix the tunnel code one by one on later patches.
      
      v5: No change.
      v4: No change.
      v3: Do not remove dst_confirm_neigh, but add a new bool parameter in
          dst_ops.update_pmtu to control whether we should do neighbor confirm.
          Also split the big patch to small ones for each area.
      v2: Remove dst_confirm_neigh in __ip6_rt_update_pmtu.
      Suggested-by: NDavid Miller <davem@davemloft.net>
      Reviewed-by: NGuillaume Nault <gnault@redhat.com>
      Acked-by: NDavid Ahern <dsahern@gmail.com>
      Signed-off-by: NHangbin Liu <liuhangbin@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      bd085ef6
  17. 21 12月, 2019 3 次提交
    • G
      net: dst: Force 4-byte alignment of dst_metrics · 258a980d
      Geert Uytterhoeven 提交于
      When storing a pointer to a dst_metrics structure in dst_entry._metrics,
      two flags are added in the least significant bits of the pointer value.
      Hence this assumes all pointers to dst_metrics structures have at least
      4-byte alignment.
      
      However, on m68k, the minimum alignment of 32-bit values is 2 bytes, not
      4 bytes.  Hence in some kernel builds, dst_default_metrics may be only
      2-byte aligned, leading to obscure boot warnings like:
      
          WARNING: CPU: 0 PID: 7 at lib/refcount.c:28 refcount_warn_saturate+0x44/0x9a
          refcount_t: underflow; use-after-free.
          Modules linked in:
          CPU: 0 PID: 7 Comm: ksoftirqd/0 Tainted: G        W         5.5.0-rc2-atari-01448-g114a1a1038af891d-dirty #261
          Stack from 10835e6c:
      	    10835e6c 0038134f 00023fa6 00394b0f 0000001c 00000009 00321560 00023fea
      	    00394b0f 0000001c 001a70f8 00000009 00000000 10835eb4 00000001 00000000
      	    04208040 0000000a 00394b4a 10835ed4 00043aa8 001a70f8 00394b0f 0000001c
      	    00000009 00394b4a 0026aba8 003215a4 00000003 00000000 0026d5a8 00000001
      	    003215a4 003a4361 003238d6 000001f0 00000000 003215a4 10aa3b00 00025e84
      	    003ddb00 10834000 002416a8 10aa3b00 00000000 00000080 000aa038 0004854a
          Call Trace: [<00023fa6>] __warn+0xb2/0xb4
           [<00023fea>] warn_slowpath_fmt+0x42/0x64
           [<001a70f8>] refcount_warn_saturate+0x44/0x9a
           [<00043aa8>] printk+0x0/0x18
           [<001a70f8>] refcount_warn_saturate+0x44/0x9a
           [<0026aba8>] refcount_sub_and_test.constprop.73+0x38/0x3e
           [<0026d5a8>] ipv4_dst_destroy+0x5e/0x7e
           [<00025e84>] __local_bh_enable_ip+0x0/0x8e
           [<002416a8>] dst_destroy+0x40/0xae
      
      Fix this by forcing 4-byte alignment of all dst_metrics structures.
      
      Fixes: e5fd387a ("ipv6: do not overwrite inetpeer metrics prematurely")
      Signed-off-by: NGeert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      258a980d
    • R
      net: phy: ensure that phy IDs are correctly typed · 7d49a32a
      Russell King 提交于
      PHY IDs are 32-bit unsigned quantities. Ensure that they are always
      treated as such, and not passed around as "int"s.
      
      Fixes: 13d0ab67 ("net: phy: check return code when requesting PHY driver module")
      Signed-off-by: NRussell King <rmk+kernel@armlinux.org.uk>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Reviewed-by: NAndrew Lunn <andrew@lunn.ch>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7d49a32a
    • R
      mod_devicetable: fix PHY module format · d2ed49cf
      Russell King 提交于
      When a PHY is probed, if the top bit is set, we end up requesting a
      module with the string "mdio:-10101110000000100101000101010001" -
      the top bit is printed to a signed -1 value. This leads to the module
      not being loaded.
      
      Fix the module format string and the macro generating the values for
      it to ensure that we only print unsigned types and the top bit is
      always 0/1. We correctly end up with
      "mdio:10101110000000100101000101010001".
      
      Fixes: 8626d3b4 ("phylib: Support phy module autoloading")
      Reviewed-by: NAndrew Lunn <andrew@lunn.ch>
      Signed-off-by: NRussell King <rmk+kernel@armlinux.org.uk>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d2ed49cf
  18. 20 12月, 2019 3 次提交
    • P
      xen/interface: re-define FRONT/BACK_RING_ATTACH() · 1ee54195
      Paul Durrant 提交于
      Currently these macros are defined to re-initialize a front/back ring
      (respectively) to values read from the shared ring in such a way that any
      requests/responses that are added to the shared ring whilst the front/back
      is detached will be skipped over. This, in general, is not a desirable
      semantic since most frontend implementations will eventually block waiting
      for a response which would either never appear or never be processed.
      
      Since the macros are currently unused, take this opportunity to re-define
      them to re-initialize a front/back ring using specified values. This also
      allows FRONT/BACK_RING_INIT() to be re-defined in terms of
      FRONT/BACK_RING_ATTACH() using a specified value of 0.
      
      NOTE: BACK_RING_ATTACH() will be used directly in a subsequent patch.
      Signed-off-by: NPaul Durrant <pdurrant@amazon.com>
      Reviewed-by: NJuergen Gross <jgross@suse.com>
      Signed-off-by: NJuergen Gross <jgross@suse.com>
      1ee54195
    • P
      xenbus: limit when state is forced to closed · 672b7763
      Paul Durrant 提交于
      If a driver probe() fails then leave the xenstore state alone. There is no
      reason to modify it as the failure may be due to transient resource
      allocation issues and hence a subsequent probe() may succeed.
      
      If the driver supports re-binding then only force state to closed during
      remove() only in the case when the toolstack may need to clean up. This can
      be detected by checking whether the state in xenstore has been set to
      closing prior to device removal.
      
      NOTE: Re-bind support is indicated by new boolean in struct xenbus_driver,
            which defaults to false. Subsequent patches will add support to
            some backend drivers.
      Signed-off-by: NPaul Durrant <pdurrant@amazon.com>
      Reviewed-by: NJuergen Gross <jgross@suse.com>
      Signed-off-by: NJuergen Gross <jgross@suse.com>
      672b7763
    • A
      of: mdio: export of_mdiobus_child_is_phy · 0aa4d016
      Antoine Tenart 提交于
      This patch exports of_mdiobus_child_is_phy, allowing to check if a child
      node is a network PHY.
      Signed-off-by: NAntoine Tenart <antoine.tenart@bootlin.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0aa4d016
  19. 18 12月, 2019 5 次提交
  20. 17 12月, 2019 2 次提交
  21. 16 12月, 2019 1 次提交
  22. 14 12月, 2019 3 次提交