1. 05 Jan, 2020 3 commits
    • mm/memory_hotplug: shrink zones when offlining memory · feee6b29
      David Hildenbrand committed
      We currently try to shrink a single zone when removing memory.  We use
      the zone of the first page of the memory we are removing.  If that
      memmap was never initialized (e.g., memory was never onlined), we will
      read garbage and can trigger kernel BUGs (due to a stale pointer):
      
          BUG: unable to handle page fault for address: 000000000000353d
          #PF: supervisor write access in kernel mode
          #PF: error_code(0x0002) - not-present page
          PGD 0 P4D 0
          Oops: 0002 [#1] SMP PTI
          CPU: 1 PID: 7 Comm: kworker/u8:0 Not tainted 5.3.0-rc5-next-20190820+ #317
          Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.4
          Workqueue: kacpi_hotplug acpi_hotplug_work_fn
          RIP: 0010:clear_zone_contiguous+0x5/0x10
          Code: 48 89 c6 48 89 c3 e8 2a fe ff ff 48 85 c0 75 cf 5b 5d c3 c6 85 fd 05 00 00 01 5b 5d c3 0f 1f 840
          RSP: 0018:ffffad2400043c98 EFLAGS: 00010246
          RAX: 0000000000000000 RBX: 0000000200000000 RCX: 0000000000000000
          RDX: 0000000000200000 RSI: 0000000000140000 RDI: 0000000000002f40
          RBP: 0000000140000000 R08: 0000000000000000 R09: 0000000000000001
          R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000140000
          R13: 0000000000140000 R14: 0000000000002f40 R15: ffff9e3e7aff3680
          FS:  0000000000000000(0000) GS:ffff9e3e7bb00000(0000) knlGS:0000000000000000
          CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
          CR2: 000000000000353d CR3: 0000000058610000 CR4: 00000000000006e0
          DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
          DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
          Call Trace:
           __remove_pages+0x4b/0x640
           arch_remove_memory+0x63/0x8d
           try_remove_memory+0xdb/0x130
           __remove_memory+0xa/0x11
           acpi_memory_device_remove+0x70/0x100
           acpi_bus_trim+0x55/0x90
           acpi_device_hotplug+0x227/0x3a0
           acpi_hotplug_work_fn+0x1a/0x30
           process_one_work+0x221/0x550
           worker_thread+0x50/0x3b0
           kthread+0x105/0x140
           ret_from_fork+0x3a/0x50
          Modules linked in:
          CR2: 000000000000353d
      
      Instead, shrink the zones when offlining memory or when onlining failed.
      Introduce and use remove_pfn_range_from_zone() for that.  We now
      properly shrink the zones, even if we have DIMMs whereby
      
       - Some memory blocks fall into no zone (never onlined)
      
       - Some memory blocks fall into multiple zones (offlined+re-onlined)
      
       - Multiple memory blocks that fall into different zones
      
      Drop the zone parameter (with a potentially dubious value) from
      __remove_pages() and __remove_section().
      
      Link: http://lkml.kernel.org/r/20191006085646.5768-6-david@redhat.com
      Fixes: f1dd2cd1 ("mm, memory_hotplug: do not associate hotadded memory to zones until online")	[visible after d0dc12e8]
      Signed-off-by: David Hildenbrand <david@redhat.com>
      Reviewed-by: Oscar Salvador <osalvador@suse.de>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
      Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
      Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Logan Gunthorpe <logang@deltatee.com>
      Cc: <stable@vger.kernel.org>	[5.0+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge tag 'dmaengine-fix-5.5-rc5' of git://git.infradead.org/users/vkoul/slave-dma · 5613970a
      Linus Torvalds committed
      Pull dmaengine fixes from Vinod Koul:
       "A bunch of fixes for:
      
         - uninitialized dma_slave_caps access
      
         - virt-dma use after free in vchan_complete()
      
         - driver fixes for ioat, k3dma and jz4780"
      
      * tag 'dmaengine-fix-5.5-rc5' of git://git.infradead.org/users/vkoul/slave-dma:
        ioat: ioat_alloc_ring() failure handling.
        dmaengine: virt-dma: Fix access after free in vchan_complete()
        dmaengine: k3dma: Avoid null pointer traversal
        dmaengine: dma-jz4780: Also break descriptor chains on JZ4725B
        dmaengine: Fix access to uninitialized dma_slave_caps
    • Merge tag 'media/v5.5-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media · 50978df3
      Linus Torvalds committed
      Pull media fixes from Mauro Carvalho Chehab:
      
       - some fixes at CEC core to comply with HDMI 2.0 specs and fix some
         border cases
      
       - a fix at the transmission logic of the pulse8-cec driver
      
       - one alignment fix on a data struct at ipu3 when built with 32 bits
      
      * tag 'media/v5.5-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
        media: intel-ipu3: Align struct ipu3_uapi_awb_fr_config_s to 32 bytes
        media: pulse8-cec: fix lost cec_transmit_attempt_done() call
        media: cec: check 'transmit_in_progress', not 'transmitting'
        media: cec: avoid decrementing transmit_queue_sz if it is 0
        media: cec: CEC 2.0-only bcast messages were ignored
  2. 04 Jan, 2020 8 commits
    • Merge tag 'for-5.5-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · 3a562aee
      Linus Torvalds committed
      Pull btrfs fixes from David Sterba:
       "A few fixes for btrfs:
      
         - blkcg accounting problem with compression that could stall writes
      
         - setting up blkcg bio for compression crashes due to NULL bdev
           pointer
      
         - fix possible infinite loop in writeback for nocow files (here
           possible means almost impossible, 13 things that need to happen to
           trigger it)"
      
      * tag 'for-5.5-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
        Btrfs: fix infinite loop during nocow writeback due to race
        btrfs: fix compressed write bio blkcg attribution
        btrfs: punt all bios created in btrfs_submit_compressed_write()
    • Merge tag 'block-5.5-20200103' of git://git.kernel.dk/linux-block · b6b4aafc
      Linus Torvalds committed
      Pull block fixes from Jens Axboe:
       "Three fixes in here:
      
         - Fix for a missing split on default memory boundary mask (4G) (Ming)
      
         - Fix for multi-page read bio truncate (Ming)
      
         - Fix for null_blk zone close request handling (Damien)"
      
      * tag 'block-5.5-20200103' of git://git.kernel.dk/linux-block:
        null_blk: Fix REQ_OP_ZONE_CLOSE handling
        block: fix splitting segments on boundary masks
        block: add bio_truncate to fix guard_bio_eod
    • Merge tag 'kbuild-fixes-v5.5-2' of... · bed72351
      Linus Torvalds committed
      Merge tag 'kbuild-fixes-v5.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
      
      Pull Kbuild fixes from Masahiro Yamada:
      
       - fix build error in usr/gen_initramfs_list.sh
      
       - fix libelf-dev dependency in deb-pkg build
      
      * tag 'kbuild-fixes-v5.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
        kbuild/deb-pkg: annotate libelf-dev dependency as :native
        gen_initramfs_list.sh: fix 'bad variable name' error
    • Merge tag 'for-linus-2020-01-03' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux · d9c82fd8
      Linus Torvalds committed
      Pull thread fixes from Christian Brauner:
       "Here are two fixes:
      
         - Panic earlier when global init exits to generate useable coredumps.
      
           Currently, when global init and all threads in its thread-group
           have exited we panic via:
      
             do_exit()
             -> exit_notify()
                -> forget_original_parent()
                   -> find_child_reaper()
      
           This makes it hard to extract a useable coredump for global init
           from a kernel crashdump because by the time we panic exit_mm() will
           have already released global init's mm. We now panic slightly
           earlier. This has been a problem in certain environments such as
           Android.
      
         - Fix a race in assigning and reading taskstats for thread-groups
           with more than one thread.
      
           This patch has been waiting for quite a while since people
           disagreed on what the correct fix was at first"
      
      * tag 'for-linus-2020-01-03' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
        exit: panic before exit_mm() on global init exit
        taskstats: fix data-race
    • Merge tag 'powerpc-5.5-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · 6f2e9c3d
      Linus Torvalds committed
      Pull powerpc fixes from Michael Ellerman:
       "Two more powerpc fixes for 5.5:
      
         - One commit to fix a build error when CONFIG_JUMP_LABEL=n,
           introduced by our recent fix to is_shared_processor().
      
         - A commit marking some SLB related functions as notrace, as tracing
           them triggers warnings.
      
        Thanks to Jason A Donenfeld"
      
      * tag 'powerpc-5.5-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc/spinlocks: Include correct header for static key
        powerpc/mm: Mark get_slice_psize() & slice_addr_is_low() as notrace
    • Merge tag 'sound-5.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · e35d0165
      Linus Torvalds committed
      Pull sound fixes from Takashi Iwai:
       "Nothing to worry about at this stage, but all nice small changes:
      
         - A regression fix for AMD GPU detection in HD-audio
      
         - A long-standing sleep-in-atomic fix for an ice1724 device
      
         - Usual suspects, the device-specific quirks for HD- and USB-audio"
      
      * tag 'sound-5.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
        ALSA: hda/realtek - Enable the bass speaker of ASUS UX431FLC
        ALSA: ice1724: Fix sleep-in-atomic in Infrasonic Quartet support code
        ALSA: hda/realtek - Add Bass Speaker and fixed dac for bass speaker
        ALSA: hda - Apply sync-write workaround to old Intel platforms, too
        ALSA: hda/hdmi - fix atpx_present when CLASS is not VGA
        ALSA: usb-audio: fix set_format altsetting sanity check
        ALSA: hda/realtek - Add headset Mic no shutup for ALC283
        ALSA: usb-audio: set the interface format after resume on Dell WD19
    • Merge tag 'drm-fixes-2020-01-03' of git://anongit.freedesktop.org/drm/drm · ca78fdeb
      Linus Torvalds committed
      Pull drm fixes from Dave Airlie:
       "New Years fixes! Mostly amdgpu with a light smattering of arm
        graphics, and two AGP warning fixes.
      
        Quiet as expected, hopefully we don't get a post holiday rush.
      
        agp:
         - two unused variables removed
      
        amdgpu:
         - ATPX regression fix
         - SMU metrics table locking fixes
         - gfxoff fix for raven
         - RLC firmware loading stability fix
      
        mediatek:
         - external display fix
         - dsi timing fix
      
        sun4i:
         - Fix double-free in connector/encoder cleanup (Stefan)
      
        malidp:
         - Make vtable static (Ben)"
      
      * tag 'drm-fixes-2020-01-03' of git://anongit.freedesktop.org/drm/drm:
        agp: remove unused variable arqsz in agp_3_5_enable()
        agp: remove unused variable mcapndx
        drm/amdgpu: correct RLC firmwares loading sequence
        drm/amdgpu: enable gfxoff for raven1 refresh
        drm/amdgpu/smu: add metrics table lock for vega20 (v2)
        drm/amdgpu/smu: add metrics table lock for navi (v2)
        drm/amdgpu/smu: add metrics table lock for arcturus (v2)
        drm/amdgpu/smu: add metrics table lock
        Revert "drm/amdgpu: simplify ATPX detection"
        drm/arm/mali: make malidp_mw_connector_helper_funcs static
        drm/sun4i: hdmi: Remove duplicate cleanup calls
        drm/mediatek: reduce the hbp and hfp for phy timing
        drm/mediatek: Fix can't get component for external display plane.
        drm/mediatek: Check return value of mtk_drm_ddp_comp_for_plane.
    • mm/hugetlbfs: fix for_each_hstate() loop in init_hugetlbfs_fs() · 15f0ec94
      Jan Stancek committed
      LTP memfd_create04 started failing for some huge page sizes
      after v5.4-10135-gc3bfc5dd.
      
      The problem is the check introduced to the for_each_hstate() loop that
      should skip default_hstate_idx.  Since it doesn't update the 'i' counter,
      all subsequent huge page sizes are skipped as well.
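
The failure mode described above can be reproduced in miniature. The following is a userspace sketch, not the kernel code: `mount_all_buggy()`, `mount_all_fixed()` and their parameters are hypothetical stand-ins, with `n` playing the role of the number of hstates and `default_idx` the role of default_hstate_idx.

```c
#include <stddef.h>

/*
 * Hypothetical model of a loop that walks n entries while keeping a
 * separate index counter 'i', and wants to skip only entry default_idx.
 * Returns how many entries were actually processed ("mounted").
 */
size_t mount_all_buggy(size_t n, size_t default_idx)
{
	size_t i = 0, mounted = 0;

	for (size_t k = 0; k < n; k++) {
		if (i == default_idx)
			continue;	/* bug: 'i' stays stuck at default_idx */
		mounted++;
		i++;
	}
	return mounted;
}

/* Same loop, but the counter is advanced before skipping the entry. */
size_t mount_all_fixed(size_t n, size_t default_idx)
{
	size_t i = 0, mounted = 0;

	for (size_t k = 0; k < n; k++) {
		if (i == default_idx) {
			i++;
			continue;
		}
		mounted++;
		i++;
	}
	return mounted;
}
```

With four entries and the default at index 1, the buggy variant processes only the first entry, while the fixed variant processes the three non-default ones.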
      
      Fixes: 8fc312b3 ("mm/hugetlbfs: fix error handling when setting up mounts")
      Signed-off-by: Jan Stancek <jstancek@redhat.com>
      Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  3. 03 Jan, 2020 17 commits
  4. 02 Jan, 2020 3 commits
  5. 01 Jan, 2020 3 commits
    • drm/amdgpu: correct RLC firmwares loading sequence · 969e1152
      Evan Quan committed
      Per confirmation with the RLC firmware team, the RLC should
      be unhalted only after all RLC-related firmware has been uploaded.
      In practice, however, the RLC is unhalted immediately after
      the RLCG firmware is uploaded, which may cause an unexpected
      PSP hang when loading the subsequent RLC save/restore
      list firmware.
      So, correct the firmware loading sequence to load the
      RLC save/restore list firmware before the RLCG
      ucode. That works around this issue.
      Signed-off-by: Evan Quan <evan.quan@amd.com>
      Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Cc: stable@vger.kernel.org
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 738d2902
      Linus Torvalds committed
      Pull networking fixes from David Miller:
      
       1) Fix big endian overflow in nf_flow_table, from Arnd Bergmann.
      
       2) Fix port selection on big endian in nft_tproxy, from Phil Sutter.
      
       3) Fix precision tracking for unbound scalars in bpf verifier, from
          Daniel Borkmann.
      
       4) Fix integer overflow in socket rcvbuf check in UDP, from Antonio
          Messina.
      
       5) Do not perform a neigh confirmation during a pmtu update over a
          tunnel, from Hangbin Liu.
      
       6) Fix DMA mapping leak in dpaa_eth driver, from Madalin Bucur.
      
       7) Various PTP fixes for sja1105 dsa driver, from Vladimir Oltean.
      
       8) Add missing inline to dummy definition of of_mdiobus_child_is_phy(), from
          Geert Uytterhoeven.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (54 commits)
        hsr: fix slab-out-of-bounds Read in hsr_debugfs_rename()
        net/sched: add delete_empty() to filters and use it in cls_flower
        tcp: Fix highest_sack and highest_sack_seq
        ptp: fix the race between the release of ptp_clock and cdev
        net: dsa: sja1105: Reconcile the meaning of TPID and TPID2 for E/T and P/Q/R/S
        Documentation: net: dsa: sja1105: Remove text about taprio base-time limitation
        net: dsa: sja1105: Remove restriction of zero base-time for taprio offload
        net: dsa: sja1105: Really make the PTP command read-write
        net: dsa: sja1105: Take PTP egress timestamp by port, not mgmt slot
        cxgb4/cxgb4vf: fix flow control display for auto negotiation
        mlxsw: spectrum: Use dedicated policer for VRRP packets
        mlxsw: spectrum_router: Skip loopback RIFs during MAC validation
        net: stmmac: dwmac-meson8b: Fix the RGMII TX delay on Meson8b/8m2 SoCs
        net/sched: act_mirred: Pull mac prior redir to non mac_header_xmit device
        net_sched: sch_fq: properly set sk->sk_pacing_status
        bnx2x: Fix accounting of vlan resources among the PFs
        bnx2x: Use appropriate define for vlan credit
        of: mdio: Add missing inline to of_mdiobus_child_is_phy() dummy
        net: phy: aquantia: add suspend / resume ops for AQR105
        dpaa_eth: fix DMA mapping leak
        ...
    • Merge tag 'tomoyo-fixes-for-5.5' of git://git.osdn.net/gitroot/tomoyo/tomoyo-test1 · c5c928c6
      Linus Torvalds committed
      Pull tomoyo fixes from Tetsuo Handa:
       "Two bug fixes:
      
         - Suppress RCU warning at list_for_each_entry_rcu()
      
         - Don't use fancy names on sockets"
      
      * tag 'tomoyo-fixes-for-5.5' of git://git.osdn.net/gitroot/tomoyo/tomoyo-test1:
        tomoyo: Suppress RCU warning at list_for_each_entry_rcu().
        tomoyo: Don't use nifty names on sockets.
  6. 31 Dec, 2019 6 commits
    • hsr: fix slab-out-of-bounds Read in hsr_debugfs_rename() · 04b69426
      Taehee Yoo committed
      hsr slave interfaces don't have a debugfs directory.
      So, hsr_debugfs_rename() shouldn't be called when an hsr slave interface's
      name is changed.
      
      Test commands:
          ip link add dummy0 type dummy
          ip link add dummy1 type dummy
          ip link add hsr0 type hsr slave1 dummy0 slave2 dummy1
          ip link set dummy0 name ap
      
      Splat looks like:
      [21071.899367][T22666] ap: renamed from dummy0
      [21071.914005][T22666] ==================================================================
      [21071.919008][T22666] BUG: KASAN: slab-out-of-bounds in hsr_debugfs_rename+0xaa/0xb0 [hsr]
      [21071.923640][T22666] Read of size 8 at addr ffff88805febcd98 by task ip/22666
      [21071.926941][T22666]
      [21071.927750][T22666] CPU: 0 PID: 22666 Comm: ip Not tainted 5.5.0-rc2+ #240
      [21071.929919][T22666] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
      [21071.935094][T22666] Call Trace:
      [21071.935867][T22666]  dump_stack+0x96/0xdb
      [21071.936687][T22666]  ? hsr_debugfs_rename+0xaa/0xb0 [hsr]
      [21071.937774][T22666]  print_address_description.constprop.5+0x1be/0x360
      [21071.939019][T22666]  ? hsr_debugfs_rename+0xaa/0xb0 [hsr]
      [21071.940081][T22666]  ? hsr_debugfs_rename+0xaa/0xb0 [hsr]
      [21071.940949][T22666]  __kasan_report+0x12a/0x16f
      [21071.941758][T22666]  ? hsr_debugfs_rename+0xaa/0xb0 [hsr]
      [21071.942674][T22666]  kasan_report+0xe/0x20
      [21071.943325][T22666]  hsr_debugfs_rename+0xaa/0xb0 [hsr]
      [21071.944187][T22666]  hsr_netdev_notify+0x1fe/0x9b0 [hsr]
      [21071.945052][T22666]  ? __module_text_address+0x13/0x140
      [21071.945897][T22666]  notifier_call_chain+0x90/0x160
      [21071.946743][T22666]  dev_change_name+0x419/0x840
      [21071.947496][T22666]  ? __read_once_size_nocheck.constprop.6+0x10/0x10
      [21071.948600][T22666]  ? netdev_adjacent_rename_links+0x280/0x280
      [21071.949577][T22666]  ? __read_once_size_nocheck.constprop.6+0x10/0x10
      [21071.950672][T22666]  ? lock_downgrade+0x6e0/0x6e0
      [21071.951345][T22666]  ? do_setlink+0x811/0x2ef0
      [21071.951991][T22666]  do_setlink+0x811/0x2ef0
      [21071.952613][T22666]  ? is_bpf_text_address+0x81/0xe0
      [ ... ]
      
      Reported-by: syzbot+9328206518f08318a5fd@syzkaller.appspotmail.com
      Fixes: 4c2d5e33 ("hsr: rename debugfs file when interface name is changed")
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/sched: add delete_empty() to filters and use it in cls_flower · a5b72a08
      Davide Caratti committed
      Revert "net/sched: cls_u32: fix refcount leak in the error path of
      u32_change()", and fix the u32 refcount leak in a more generic way that
      preserves the semantics of rule dumping.
      On tc filters that don't support lockless insertion/removal, there is no
      need to guard against concurrent insertion when a removal is in progress.
      Therefore, for most of them we can avoid a full walk() when deleting, and
      just decrease the refcount, like it was done on older Linux kernels.
      This fixes situations where walk() was wrongly detecting a non-empty
      filter, like it happened with cls_u32 in the error path of change(), thus
      leading to failures in the following tdc selftests:
      
       6aa7: (filter, u32) Add/Replace u32 with source match and invalid indev
       6658: (filter, u32) Add/Replace u32 with custom hash table and invalid handle
       74c2: (filter, u32) Add/Replace u32 filter with invalid hash table id
      
      On cls_flower, and on (future) lockless filters, this check is necessary:
      move all the check_empty() logic into a callback so that each filter
      can have its own implementation. For cls_flower, it's sufficient to check
      if no IDRs have been allocated.
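
The optional-callback arrangement described above can be sketched in userspace C. This is a simplified model under stated assumptions, not the kernel implementation: the `sketch_` types and functions, the `nr_handles` counter (standing in for "IDRs allocated") and the `deleting` flag are all hypothetical names chosen for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

struct sketch_filter;

/* Hypothetical ops table: delete_empty is optional and may be NULL. */
struct sketch_filter_ops {
	bool (*delete_empty)(struct sketch_filter *f);
};

struct sketch_filter {
	const struct sketch_filter_ops *ops;
	size_t nr_handles;	/* stand-in for allocated handles */
	bool deleting;
};

/* A lockless-style filter must verify that it really is empty. */
static bool sketch_flower_delete_empty(struct sketch_filter *f)
{
	f->deleting = (f->nr_handles == 0);
	return f->deleting;
}

/* Filters without the callback: no concurrent insertion to guard against. */
const struct sketch_filter_ops sketch_locked_ops = { .delete_empty = NULL };
const struct sketch_filter_ops sketch_flower_ops = {
	.delete_empty = sketch_flower_delete_empty,
};

/* Generic path: defer to the filter's callback when it provides one. */
bool sketch_check_delete(struct sketch_filter *f)
{
	if (f->ops->delete_empty)
		return f->ops->delete_empty(f);

	f->deleting = true;
	return true;
}
```

In this model, a filter that leaves the callback unset is always treated as deletable, whereas the flower-like filter refuses deletion until its handle count reaches zero.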
      
      This reverts commit 275c44aa.
      
      Changes since v1:
       - document the need for delete_empty() when TCF_PROTO_OPS_DOIT_UNLOCKED
         is used, thanks to Vlad Buslov
       - implement delete_empty() without doing fl_walk(), thanks to Vlad Buslov
       - squash revert and new fix in a single patch, to be nice with bisect
         tests that run tdc on u32 filter, thanks to Dave Miller
      
      Fixes: 275c44aa ("net/sched: cls_u32: fix refcount leak in the error path of u32_change()")
      Fixes: 6676d5e4 ("net: sched: set dedicated tcf_walker flag when tp is empty")
      Suggested-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Suggested-by: Vlad Buslov <vladbu@mellanox.com>
      Signed-off-by: Davide Caratti <dcaratti@redhat.com>
      Reviewed-by: Vlad Buslov <vladbu@mellanox.com>
      Tested-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: Fix highest_sack and highest_sack_seq · 85369750
      Cambda Zhu committed
      Since commit 50895b9d ("tcp: highest_sack fix"), the logic that
      set tp->highest_sack to the head of the send queue has been removed.
      Of course that logic is error prone, but it is logical. Before we
      remove the pointer to the highest sack skb and use the seq instead,
      we need to set tp->highest_sack to NULL when there is no skb after
      the last sack, and then replace NULL with the real skb when a new skb
      is inserted into the rtx queue, because NULL means the highest sack
      seq is tp->snd_nxt. If tp->highest_sack is NULL and new data is sent,
      the next ACK with a sack option will increase tp->reordering unexpectedly.
      
      This patch sets tp->highest_sack to the tail of the rtx queue if
      it's NULL and new data is sent. The patch keeps the rule that
      highest_sack can only be maintained by sack processing, except for
      this one case.
      
      Fixes: 50895b9d ("tcp: highest_sack fix")
      Signed-off-by: Cambda Zhu <cambda@linux.alibaba.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ptp: fix the race between the release of ptp_clock and cdev · a33121e5
      Vladis Dronov committed
      In a case when a ptp chardev (like /dev/ptp0) is open but an underlying
      device is removed, closing this file leads to a race. This reproduces
      easily in a kvm virtual machine:
      
      ts# cat openptp0.c
      int main() { ... fp = fopen("/dev/ptp0", "r"); ... sleep(10); }
      ts# uname -r
      5.5.0-rc3-46cf053e
      ts# cat /proc/cmdline
      ... slub_debug=FZP
      ts# modprobe ptp_kvm
      ts# ./openptp0 &
      [1] 670
      opened /dev/ptp0, sleeping 10s...
      ts# rmmod ptp_kvm
      ts# ls /dev/ptp*
      ls: cannot access '/dev/ptp*': No such file or directory
      ts# ...woken up
      [   48.010809] general protection fault: 0000 [#1] SMP
      [   48.012502] CPU: 6 PID: 658 Comm: openptp0 Not tainted 5.5.0-rc3-46cf053e #25
      [   48.014624] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), ...
      [   48.016270] RIP: 0010:module_put.part.0+0x7/0x80
      [   48.017939] RSP: 0018:ffffb3850073be00 EFLAGS: 00010202
      [   48.018339] RAX: 000000006b6b6b6b RBX: 6b6b6b6b6b6b6b6b RCX: ffff89a476c00ad0
      [   48.018936] RDX: fffff65a08d3ea08 RSI: 0000000000000247 RDI: 6b6b6b6b6b6b6b6b
      [   48.019470] ...                                              ^^^ a slub poison
      [   48.023854] Call Trace:
      [   48.024050]  __fput+0x21f/0x240
      [   48.024288]  task_work_run+0x79/0x90
      [   48.024555]  do_exit+0x2af/0xab0
      [   48.024799]  ? vfs_write+0x16a/0x190
      [   48.025082]  do_group_exit+0x35/0x90
      [   48.025387]  __x64_sys_exit_group+0xf/0x10
      [   48.025737]  do_syscall_64+0x3d/0x130
      [   48.026056]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      [   48.026479] RIP: 0033:0x7f53b12082f6
      [   48.026792] ...
      [   48.030945] Modules linked in: ptp i6300esb watchdog [last unloaded: ptp_kvm]
      [   48.045001] Fixing recursive fault but reboot is needed!
      
      This happens in:
      
      static void __fput(struct file *file)
      {   ...
          if (file->f_op->release)
              file->f_op->release(inode, file); <<< cdev is kfree'd here
          if (unlikely(S_ISCHR(inode->i_mode) && inode->i_cdev != NULL &&
                   !(mode & FMODE_PATH))) {
              cdev_put(inode->i_cdev); <<< cdev fields are accessed here
      
      Namely:
      
      __fput()
        posix_clock_release()
          kref_put(&clk->kref, delete_clock) <<< the last reference
            delete_clock()
              delete_ptp_clock()
                kfree(ptp) <<< cdev is embedded in ptp
        cdev_put
          module_put(p->owner) <<< *p is kfree'd, bang!
      
      Here cdev is embedded in posix_clock which is embedded in ptp_clock.
      The race happens because ptp_clock's lifetime is controlled by two
      refcounts: kref and cdev.kobj in posix_clock. This is wrong.
      
      Make ptp_clock's sysfs device a parent of cdev with cdev_device_add()
      created especially for such cases. This way the parent device with its
      ptp_clock is not released until all references to the cdev are released.
      This adds a requirement that an initialized but not exposed struct
      device should be provided to posix_clock_register() by a caller instead
      of a simple dev_t.
      
      This approach was adopted from the commit 72139dfa ("watchdog: Fix
      the race between the release of watchdog_core_data and cdev"). See
      details of the implementation in the commit 233ed09d ("chardev: add
      helper function to register char devs with a struct device").
      
      Link: https://lore.kernel.org/linux-fsdevel/20191125125342.6189-1-vdronov@redhat.com/T/#u
      Analyzed-by: Stephen Johnston <sjohnsto@redhat.com>
      Analyzed-by: Vern Lovejoy <vlovejoy@redhat.com>
      Signed-off-by: Vladis Dronov <vdronov@redhat.com>
      Acked-by: Richard Cochran <richardcochran@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: sja1105: Reconcile the meaning of TPID and TPID2 for E/T and P/Q/R/S · 54fa49ee
      Vladimir Oltean committed
      For first-generation switches (SJA1105E and SJA1105T):
      - TPID means C-Tag (typically 0x8100)
      - TPID2 means S-Tag (typically 0x88A8)
      
      While for the second generation switches (SJA1105P, SJA1105Q, SJA1105R,
      SJA1105S) it is the other way around:
      - TPID means S-Tag (typically 0x88A8)
      - TPID2 means C-Tag (typically 0x8100)
      
      In other words, E/T tags untagged traffic with TPID, and P/Q/R/S with
      TPID2.
      
      So the patch mentioned below fixed VLAN filtering for P/Q/R/S, but broke
      it for E/T.
      
      We strive for a common code path for all switches in the family, so for
      P/Q/R/S the static config packing functions simply pretend that TPID and
      TPID2 sit at swapped bit offsets relative to where they actually are. This
      makes both generations understand TPID to be ETH_P_8021Q and TPID2 to be
      ETH_P_8021AD. The meaning from the original E/T was chosen over P/Q/R/S
      because E/T is actually the one with public documentation available
      (UM10944.pdf).
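
The swap at packing time can be pictured with a small userspace sketch. It is only an illustration of the idea: the `sketch_` names and the two-field `sketch_hw_layout` struct are hypothetical and do not reflect the driver's real bit offsets or data structures. The two constants carry the standard EtherType values for 802.1Q and 802.1ad.

```c
#include <stdint.h>

#define SKETCH_ETH_P_8021Q	0x8100	/* C-Tag EtherType */
#define SKETCH_ETH_P_8021AD	0x88A8	/* S-Tag EtherType */

/* Driver-side view: the same meaning for every switch generation. */
struct sketch_general_params {
	uint16_t tpid;	/* always the C-Tag value */
	uint16_t tpid2;	/* always the S-Tag value */
};

/* Hypothetical hardware view: fields as named in each chip's manual. */
struct sketch_hw_layout {
	uint16_t hw_tpid;
	uint16_t hw_tpid2;
};

/* E/T: the hardware TPID field means C-Tag, so pack straight through. */
void sketch_pack_et(const struct sketch_general_params *p,
		    struct sketch_hw_layout *hw)
{
	hw->hw_tpid = p->tpid;
	hw->hw_tpid2 = p->tpid2;
}

/* P/Q/R/S: the hardware meanings are reversed, so swap while packing. */
void sketch_pack_pqrs(const struct sketch_general_params *p,
		      struct sketch_hw_layout *hw)
{
	hw->hw_tpid = p->tpid2;
	hw->hw_tpid2 = p->tpid;
}
```

Callers fill in `tpid` and `tpid2` once with the E/T meaning, and only the generation-specific packing function knows about the reversal.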
      
      Fixes: f9a1a764 ("net: dsa: sja1105: Reverse TPID and TPID2")
      Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Documentation: net: dsa: sja1105: Remove text about taprio base-time limitation · 3a323ed7
      Vladimir Oltean committed
      Since commit 86db36a3 ("net: dsa: sja1105: Implement state machine
      for TAS with PTP clock source"), this paragraph is no longer true. So
      remove it.
      Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>