1. 22 3月, 2019 14 次提交
  2. 19 3月, 2019 26 次提交
    • G
      Linux 4.19.30 · 7794d352
      Greg Kroah-Hartman 提交于
      7794d352
    • Z
      vhost/vsock: fix vhost vsock cid hashing inconsistent · 842bdbe8
      Zha Bin 提交于
      commit 7fbe078c37aba3088359c9256c1a1d0c3e39ee81 upstream.
      
      The vsock core only supports 32bit CID, but the Virtio-vsock spec define
      CID (dst_cid and src_cid) as u64 and the upper 32bits is reserved as
      zero. This inconsistency causes one bug in vhost vsock driver. The
      scenarios is:
      
        0. A hash table (vhost_vsock_hash) is used to map an CID to a vsock
        object. And hash_min() is used to compute the hash key. hash_min() is
        defined as:
        (sizeof(val) <= 4 ? hash_32(val, bits) : hash_long(val, bits)).
        That means the hash algorithm has dependency on the size of macro
        argument 'val'.
        0. In function vhost_vsock_set_cid(), a 64bit CID is passed to
        hash_min() to compute the hash key when inserting a vsock object into
        the hash table.
        0. In function vhost_vsock_get(), a 32bit CID is passed to hash_min()
        to compute the hash key when looking up a vsock for an CID.
      
      Because the different size of the CID, hash_min() returns different hash
      key, thus fails to look up the vsock object for an CID.
      
      To fix this bug, we keep CID as u64 in the IOCTLs and virtio message
      headers, but explicitly convert u64 to u32 when deal with the hash table
      and vsock core.
      
      Fixes: 834e772c8db0 ("vhost/vsock: fix use-after-free in network stack callers")
      Link: https://github.com/stefanha/virtio/blob/vsock/trunk/content.texSigned-off-by: NZha Bin <zhabin@linux.alibaba.com>
      Reviewed-by: NLiu Jiang <gerry@linux.alibaba.com>
      Reviewed-by: NStefan Hajnoczi <stefanha@redhat.com>
      Acked-by: NJason Wang <jasowang@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NShengjing Zhu <i@zhsj.me>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      
      842bdbe8
    • B
      i40e: report correct statistics when XDP is enabled · 090ce34b
      Björn Töpel 提交于
      commit cdec2141c24ef177d929765c5a6f95549c266fb3 upstream.
      
      When XDP is enabled, the driver will report incorrect
      statistics. Received frames will reported as transmitted frames.
      
      This commits fixes the i40e implementation of ndo_get_stats64 (struct
      net_device_ops), so that iproute2 will report correct statistics
      (e.g. when running "ip -stats link show dev eth0") even when XDP is
      enabled.
      Reported-by: NJesper Dangaard Brouer <brouer@redhat.com>
      Fixes: 74608d17 ("i40e: add support for XDP_TX action")
      Signed-off-by: NBjörn Töpel <bjorn.topel@intel.com>
      Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
      Cc: Emeric Brun <ebrun@haproxy.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      090ce34b
    • G
      staging: erofs: fix race when the managed cache is enabled · eab8018f
      Gao Xiang 提交于
      commit 51232df5e4b268936beccde5248f312a316800be upstream.
      
      When the managed cache is enabled, the last reference count
      of a workgroup must be used for its workstation.
      
      Otherwise, it could lead to incorrect (un)freezes in
      the reclaim path, and it would be harmful.
      
      A typical race as follows:
      
      Thread 1 (In the reclaim path)  Thread 2
      workgroup_freeze(grp, 1)                                refcnt = 1
      ...
      workgroup_unfreeze(grp, 1)                              refcnt = 1
                                      workgroup_get(grp)      refcnt = 2 (x)
      workgroup_put(grp)                                      refcnt = 1 (x)
                                      ...unexpected behaviors
      
      * grp is detached but still used, which violates cache-managed
        freeze constraint.
      Reviewed-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NGao Xiang <gaoxiang25@huawei.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      eab8018f
    • N
      drm: Block fb changes for async plane updates · 96ce54b2
      Nicholas Kazlauskas 提交于
      commit 25dc194b34dd5919dd07b8873ee338182e15df9d upstream.
      
      The prepare_fb call always happens on new_plane_state.
      
      The drm_atomic_helper_cleanup_planes checks to see if
      plane state pointer has changed when deciding to call cleanup_fb on
      either the new_plane_state or the old_plane_state.
      
      For a non-async atomic commit the state pointer is swapped, so this
      helper calls prepare_fb on the new_plane_state and cleanup_fb on the
      old_plane_state. This makes sense, since we want to prepare the
      framebuffer we are going to use and cleanup the the framebuffer we are
      no longer using.
      
      For the async atomic update helpers this differs. The async atomic
      update helpers perform in-place updates on the existing state. They call
      drm_atomic_helper_cleanup_planes but the state pointer is not swapped.
      This means that prepare_fb is called on the new_plane_state and
      cleanup_fb is called on the new_plane_state (not the old).
      
      In the case where old_plane_state->fb == new_plane_state->fb then
      there should be no behavioral difference between an async update
      and a non-async commit. But there are issues that arise when
      old_plane_state->fb != new_plane_state->fb.
      
      The first is that the new_plane_state->fb is immediately cleaned up
      after it has been prepared, so we're using a fb that we shouldn't
      be.
      
      The second occurs during a sequence of async atomic updates and
      non-async regular atomic commits. Suppose there are two framebuffers
      being interleaved in a double-buffering scenario, fb1 and fb2:
      
      - Async update, oldfb = NULL, newfb = fb1, prepare fb1, cleanup fb1
      - Async update, oldfb = fb1, newfb = fb2, prepare fb2, cleanup fb2
      - Non-async commit, oldfb = fb2, newfb = fb1, prepare fb1, cleanup fb2
      
      We call cleanup_fb on fb2 twice in this example scenario, and any
      further use will result in use-after-free.
      
      The simple fix to this problem is to block framebuffer changes
      in the drm_atomic_helper_async_check function for now.
      
      v2: Move check by itself, add a FIXME (Daniel)
      
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Harry Wentland <harry.wentland@amd.com>
      Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
      Cc: <stable@vger.kernel.org> # v4.14+
      Fixes: fef9df8b ("drm/atomic: initial support for asynchronous plane update")
      Signed-off-by: NNicholas Kazlauskas <nicholas.kazlauskas@amd.com>
      Acked-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com>
      Acked-by: NHarry Wentland <harry.wentland@amd.com>
      Reviewed-by: NDaniel Vetter <daniel@ffwll.ch>
      Signed-off-by: NHarry Wentland <harry.wentland@amd.com>
      Link: https://patchwork.freedesktop.org/patch/275364/Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      96ce54b2
    • X
      It's wrong to add len to sector_nr in raid10 reshape twice · 27143c71
      Xiao Ni 提交于
      commit b761dcf1217760a42f7897c31dcb649f59b2333e upstream.
      
      In reshape_request it already adds len to sector_nr already. It's wrong to add len to
      sector_nr again after adding pages to bio. If there is bad block it can't copy one chunk
      at a time, it needs to goto read_more. Now the sector_nr is wrong. It can cause data
      corruption.
      
      Cc: stable@vger.kernel.org # v3.16+
      Signed-off-by: NXiao Ni <xni@redhat.com>
      Signed-off-by: NSong Liu <songliubraving@fb.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      27143c71
    • K
      perf/x86/intel: Make dev_attr_allow_tsx_force_abort static · d6b577c6
      kbuild test robot 提交于
      commit c634dc6bdedeb0b2c750fc611612618a85639ab2 upstream.
      
      Fixes: 400816f60c54 ("perf/x86/intel: Implement support for TSX Force Abort")
      Signed-off-by: Nkbuild test robot <lkp@intel.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
      Cc: kbuild-all@01.org
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20190313184243.GA10820@lkp-sb-ep06Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d6b577c6
    • P
      perf/x86/intel: Fix memory corruption · 92c9a389
      Peter Zijlstra 提交于
      commit ede271b059463731cbd6dffe55ffd70d7dbe8392 upstream.
      
      Through:
      
        validate_event()
          x86_pmu.get_event_constraints(.idx=-1)
            tfa_get_event_constraints()
              dyn_constraint()
      
      cpuc->constraint_list[-1] is used, which is an obvious out-of-bound access.
      
      In this case, simply skip the TFA constraint code, there is no event
      constraint with just PMC3, therefore the code will never result in the
      empty set.
      
      Fixes: 400816f60c54 ("perf/x86/intel: Implement support for TSX Force Abort")
      Reported-by: NTony Jones <tonyj@suse.com>
      Reported-by: N"DSouza, Nelson" <nelson.dsouza@intel.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Tested-by: NTony Jones <tonyj@suse.com>
      Tested-by: N"DSouza, Nelson" <nelson.dsouza@intel.com>
      Cc: eranian@google.com
      Cc: jolsa@redhat.com
      Cc: stable@kernel.org
      Link: https://lkml.kernel.org/r/20190314130705.441549378@infradead.orgSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      92c9a389
    • J
      ALSA: hda/realtek: Enable headset MIC of Acer TravelMate X514-51T with ALC255 · 835bc1e2
      Jian-Hong Pan 提交于
      commit cbc05fd6708c1744ee6a61cb4c461ff956d30524 upstream.
      
      The Acer TravelMate X514-51T with ALC255 cannot detect the headset MIC
      until ALC255_FIXUP_ACER_HEADSET_MIC quirk applied.  Although, the
      internal DMIC uses another module - snd_soc_skl as the driver.  We still
      need the NID 0x1a in the quirk to enable the headset MIC.
      Signed-off-by: NJian-Hong Pan <jian-hong@endlessm.com>
      Signed-off-by: NKailang Yang <kailang@realtek.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NTakashi Iwai <tiwai@suse.de>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      835bc1e2
    • T
      ALSA: hda/realtek - Reduce click noise on Dell Precision 5820 headphone · be888d9a
      Takashi Iwai 提交于
      commit c0ca5eced22215c1e03e3ad479f8fab0bbb30772 upstream.
      
      Dell Precision 5820 with ALC3234 codec (which is equivalent with
      ALC255) shows click noises at (runtime) PM resume on the headphone.
      The biggest source of the noise comes from the cleared headphone pin
      control at resume, which is done via the standard shutup procedure.
      
      Although we have an override of the standard shutup callback to
      replace with NOP, this would skip other needed stuff (e.g. the pull
      down of headset power).  So, instead, this "fixes" the behavior of
      alc_fixup_no_shutup() by introducing spec->no_shutup_pins flag.
      When this flag is set, Realtek codec won't call the standard
      snd_hda_shutup_pins() & co.  Now alc_fixup_no_shutup() just sets this
      flag instead of overriding spec->shutup callback itself.  This allows
      us to apply the similar fix for other entries easily if needed in
      future.
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NTakashi Iwai <tiwai@suse.de>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      be888d9a
    • J
      ALSA: hda/realtek: Enable audio jacks of ASUS UX362FA with ALC294 · 8f6cf57e
      Jian-Hong Pan 提交于
      commit 8bb37a2a4d7c02affef554f5dc05f6d2e39c31f9 upstream.
      
      The ASUS UX362FA with ALC294 cannot detect the headset MIC and outputs
      through the internal speaker and the headphone.  This issue can be fixed
      by the quirk in the commit 4e0511067 ALSA: hda/realtek: Enable audio
      jacks of ASUS UX533FD with ALC294.
      
      Besides, ASUS UX362FA and UX533FD have the same audio initial pin config
      values.  So, this patch replaces SND_PCI_QUIRK of UX533FD with a new
      SND_HDA_PIN_QUIRK which benefits both UX362FA and UX533FD.
      
      Fixes: 4e051106730d ("ALSA: hda/realtek: Enable audio jacks of ASUS UX533FD with ALC294")
      Signed-off-by: NJian-Hong Pan <jian-hong@endlessm.com>
      Signed-off-by: NMing Shuo Chiu <chiu@endlessm.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NTakashi Iwai <tiwai@suse.de>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8f6cf57e
    • J
      ALSA: hda - add more quirks for HP Z2 G4 and HP Z240 · 5da055b1
      Jaroslav Kysela 提交于
      commit 167897f4b32c2bc18b3b6183029a33fb420a114e upstream.
      
      Apply the HP_MIC_NO_PRESENCE fixups for the more HP Z2 G4 and
      HP Z240 models.
      Reported-by: NJeff Burrell <jeff.burrell@hp.com>
      Signed-off-by: NJaroslav Kysela <perex@perex.cz>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NTakashi Iwai <tiwai@suse.de>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5da055b1
    • T
      ALSA: hda: Extend i915 component bind timeout · 2191cd58
      Takashi Iwai 提交于
      commit cfc35f9c128cea8fce6a5513b1de50d36f3b209f upstream.
      
      I set 10 seconds for the timeout of the i915 audio component binding
      with a hope that recent machines are fast enough to handle all probe
      tasks in that period, but I was too optimistic.  The binding may take
      longer than that, and this caused a problem on the machine with both
      audio and graphics driver modules loaded in parallel, as Paul Menzel
      experienced.  This problem haven't hit so often just because the KMS
      driver is loaded in initrd on most machines.
      
      As a simple workaround, extend the timeout to 60 seconds.
      
      Fixes: f9b54e19 ("ALSA: hda/i915: Allow delayed i915 audio component binding")
      Reported-by: NPaul Menzel <pmenzel+alsa-devel@molgen.mpg.de>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NTakashi Iwai <tiwai@suse.de>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2191cd58
    • T
      ALSA: firewire-motu: fix construction of PCM frame for capture direction · 8b2d6639
      Takashi Sakamoto 提交于
      commit f97a0944a72b26a2bece72516294e112a890f98a upstream.
      
      In data blocks of common isochronous packet for MOTU devices, PCM
      frames are multiplexed in a shape of '24 bit * 4 Audio Pack', described
      in IEC 61883-6. The frames are not aligned to quadlet.
      
      For capture PCM substream, ALSA firewire-motu driver constructs PCM
      frames by reading data blocks byte-by-byte. However this operation
      includes bug for lower byte of the PCM sample. This brings invalid
      content of the PCM samples.
      
      This commit fixes the bug.
      Reported-by: NPeter Sjöberg <autopeter@gmail.com>
      Cc: <stable@vger.kernel.org> # v4.12+
      Fixes: 4641c939 ("ALSA: firewire-motu: add MOTU specific protocol layer")
      Signed-off-by: NTakashi Sakamoto <o-takashi@sakamocchi.jp>
      Signed-off-by: NTakashi Iwai <tiwai@suse.de>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8b2d6639
    • T
      ALSA: bebob: use more identical mod_alias for Saffire Pro 10 I/O against Liquid Saffire 56 · bb2dde7c
      Takashi Sakamoto 提交于
      commit 7dc661bd8d3261053b69e4e2d0050cd1ee540fc1 upstream.
      
      ALSA bebob driver has an entry for Focusrite Saffire Pro 10 I/O. The
      entry matches vendor_id in root directory and model_id in unit
      directory of configuration ROM for IEEE 1394 bus.
      
      On the other hand, configuration ROM of Focusrite Liquid Saffire 56
      has the same vendor_id and model_id. This device is an application of
      TCAT Dice (TCD2220 a.k.a Dice Jr.) however ALSA bebob driver can be
      bound to it randomly instead of ALSA dice driver. At present, drivers
      in ALSA firewire stack can not handle this situation appropriately.
      
      This commit uses more identical mod_alias for Focusrite Saffire Pro 10
      I/O in ALSA bebob driver.
      
      $ python2 crpp < /sys/bus/firewire/devices/fw1/config_rom
                     ROM header and bus information block
                     -----------------------------------------------------------------
      400  042a829d  bus_info_length 4, crc_length 42, crc 33437
      404  31333934  bus_name "1394"
      408  f0649222  irmc 1, cmc 1, isc 1, bmc 1, pmc 0, cyc_clk_acc 100,
                     max_rec 9 (1024), max_rom 2, gen 2, spd 2 (S400)
      40c  00130e01  company_id 00130e     |
      410  000606e0  device_id 01000606e0  | EUI-64 00130e01000606e0
      
                     root directory
                     -----------------------------------------------------------------
      414  0009d31c  directory_length 9, crc 54044
      418  04000014  hardware version
      41c  0c0083c0  node capabilities per IEEE 1394
      420  0300130e  vendor
      424  81000012  --> descriptor leaf at 46c
      428  17000006  model
      42c  81000016  --> descriptor leaf at 484
      430  130120c2  version
      434  d1000002  --> unit directory at 43c
      438  d4000006  --> dependent info directory at 450
      
                     unit directory at 43c
                     -----------------------------------------------------------------
      43c  0004707c  directory_length 4, crc 28796
      440  1200a02d  specifier id: 1394 TA
      444  13010001  version: AV/C
      448  17000006  model
      44c  81000013  --> descriptor leaf at 498
      
                     dependent info directory at 450
                     -----------------------------------------------------------------
      450  000637c7  directory_length 6, crc 14279
      454  120007f5  specifier id
      458  13000001  version
      45c  3affffc7  (immediate value)
      460  3b100000  (immediate value)
      464  3cffffc7  (immediate value)
      468  3d600000  (immediate value)
      
                     descriptor leaf at 46c
                     -----------------------------------------------------------------
      46c  00056f3b  leaf_length 5, crc 28475
      470  00000000  textual descriptor
      474  00000000  minimal ASCII
      478  466f6375  "Focu"
      47c  73726974  "srit"
      480  65000000  "e"
      
                     descriptor leaf at 484
                     -----------------------------------------------------------------
      484  0004a165  leaf_length 4, crc 41317
      488  00000000  textual descriptor
      48c  00000000  minimal ASCII
      490  50726f31  "Pro1"
      494  30494f00  "0IO"
      
                     descriptor leaf at 498
                     -----------------------------------------------------------------
      498  0004a165  leaf_length 4, crc 41317
      49c  00000000  textual descriptor
      4a0  00000000  minimal ASCII
      4a4  50726f31  "Pro1"
      4a8  30494f00  "0IO"
      
      $ python2 crpp < /sys/bus/firewire/devices/fw1/config_rom
                     ROM header and bus information block
                     -----------------------------------------------------------------
      400  040442e4  bus_info_length 4, crc_length 4, crc 17124
      404  31333934  bus_name "1394"
      408  e0ff8112  irmc 1, cmc 1, isc 1, bmc 0, pmc 0, cyc_clk_acc 255,
                     max_rec 8 (512), max_rom 1, gen 1, spd 2 (S400)
      40c  00130e04  company_id 00130e     |
      410  018001e9  device_id 04018001e9  | EUI-64 00130e04018001e9
      
                     root directory
                     -----------------------------------------------------------------
      414  00065612  directory_length 6, crc 22034
      418  0300130e  vendor
      41c  8100000a  --> descriptor leaf at 444
      420  17000006  model
      424  8100000e  --> descriptor leaf at 45c
      428  0c0087c0  node capabilities per IEEE 1394
      42c  d1000001  --> unit directory at 430
      
                     unit directory at 430
                     -----------------------------------------------------------------
      430  000418a0  directory_length 4, crc 6304
      434  1200130e  specifier id
      438  13000001  version
      43c  17000006  model
      440  8100000f  --> descriptor leaf at 47c
      
                     descriptor leaf at 444
                     -----------------------------------------------------------------
      444  00056f3b  leaf_length 5, crc 28475
      448  00000000  textual descriptor
      44c  00000000  minimal ASCII
      450  466f6375  "Focu"
      454  73726974  "srit"
      458  65000000  "e"
      
                     descriptor leaf at 45c
                     -----------------------------------------------------------------
      45c  000762c6  leaf_length 7, crc 25286
      460  00000000  textual descriptor
      464  00000000  minimal ASCII
      468  4c495155  "LIQU"
      46c  49445f53  "ID_S"
      470  41464649  "AFFI"
      474  52455f35  "RE_5"
      478  36000000  "6"
      
                     descriptor leaf at 47c
                     -----------------------------------------------------------------
      47c  000762c6  leaf_length 7, crc 25286
      480  00000000  textual descriptor
      484  00000000  minimal ASCII
      488  4c495155  "LIQU"
      48c  49445f53  "ID_S"
      490  41464649  "AFFI"
      494  52455f35  "RE_5"
      498  36000000  "6"
      
      Cc: <stable@vger.kernel.org> # v3.16+
      Fixes: 25784ec2 ("ALSA: bebob: Add support for Focusrite Saffire/SaffirePro series")
      Signed-off-by: NTakashi Sakamoto <o-takashi@sakamocchi.jp>
      Signed-off-by: NTakashi Iwai <tiwai@suse.de>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      bb2dde7c
    • P
      perf/x86: Fixup typo in stub functions · a8eae05f
      Peter Zijlstra 提交于
      commit f764c58b7faa26f5714e6907f892abc2bc0de4f8 upstream.
      
      Guenter reported a build warning for CONFIG_CPU_SUP_INTEL=n:
      
        > With allmodconfig-CONFIG_CPU_SUP_INTEL, this patch results in:
        >
        > In file included from arch/x86/events/amd/core.c:8:0:
        > arch/x86/events/amd/../perf_event.h:1036:45: warning: ‘struct cpu_hw_event’ declared inside parameter list will not be visible outside of this definition or declaration
        >  static inline int intel_cpuc_prepare(struct cpu_hw_event *cpuc, int cpu)
      
      While harmless (an unsed pointer is an unused pointer, no matter the type)
      it needs fixing.
      Reported-by: NGuenter Roeck <linux@roeck-us.net>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      Fixes: d01b1f96a82e ("perf/x86/intel: Make cpuc allocations consistent")
      Link: http://lkml.kernel.org/r/20190315081410.GR5996@hirez.programming.kicks-ass.netSigned-off-by: NIngo Molnar <mingo@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a8eae05f
    • J
      f2fs: wait on atomic writes to count F2FS_CP_WB_DATA · 2835c059
      Jaegeuk Kim 提交于
      commit 31867b23d7d1ee3535136c6a410a6cf56f666bfc upstream.
      
      Otherwise, we can get wrong counts incurring checkpoint hang.
      
      IO_W (CP:  -24, Data:   24, Flush: (   0    0    1), Discard: (   0    0))
      
      Thread A                        Thread B
      - f2fs_write_data_pages
       -  __write_data_page
        - f2fs_submit_page_write
         - inc_page_count(F2FS_WB_DATA)
           type is F2FS_WB_DATA due to file is non-atomic one
      - f2fs_ioc_start_atomic_write
       - set_inode_flag(FI_ATOMIC_FILE)
                                      - f2fs_write_end_io
                                       - dec_page_count(F2FS_WB_CP_DATA)
                                         type is F2FS_WB_DATA due to file becomes
                                         atomic one
      
      Cc: <stable@vger.kernel.org>
      Reviewed-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2835c059
    • V
      net: sched: flower: insert new filter to idr after setting its mask · 275a2c08
      Vlad Buslov 提交于
      [ Upstream commit ecb3dea400d3beaf611ce76ac7a51d4230492cf2 ]
      
      When adding new filter to flower classifier, fl_change() inserts it to
      handle_idr before initializing filter extensions and assigning it a mask.
      Normally this ordering doesn't matter because all flower classifier ops
      callbacks assume rtnl lock protection. However, when filter has an action
      that doesn't have its kernel module loaded, rtnl lock is released before
      call to request_module(). During this time the filter can be accessed bu
      concurrent task before its initialization is completed, which can lead to a
      crash.
      
      Example case of NULL pointer dereference in concurrent dump:
      
      Task 1                           Task 2
      
      tc_new_tfilter()
       fl_change()
        idr_alloc_u32(fnew)
        fl_set_parms()
         tcf_exts_validate()
          tcf_action_init()
           tcf_action_init_1()
            rtnl_unlock()
            request_module()
            ...                        rtnl_lock()
            				 tc_dump_tfilter()
            				  tcf_chain_dump()
      				   fl_walk()
      				    idr_get_next_ul()
      				    tcf_node_dump()
      				     tcf_fill_node()
      				      fl_dump()
      				       mask = &f->mask->key; <- NULL ptr
            rtnl_lock()
      
      Extension initialization and mask assignment don't depend on fnew->handle
      that is allocated by idr_alloc_u32(). Move idr allocation code after action
      creation and mask assignment in fl_change() to prevent concurrent access
      to not fully initialized filter when rtnl lock is released to load action
      module.
      
      Fixes: 01683a14 ("net: sched: refactor flower walk to iterate over idr")
      Signed-off-by: NVlad Buslov <vladbu@mellanox.com>
      Reviewed-by: NRoi Dayan <roid@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      275a2c08
    • A
      missing barriers in some of unix_sock ->addr and ->path accesses · 345af5ab
      Al Viro 提交于
      [ Upstream commit ae3b564179bfd06f32d051b9e5d72ce4b2a07c37 ]
      
      Several u->addr and u->path users are not holding any locks in
      common with unix_bind().  unix_state_lock() is useless for those
      purposes.
      
      u->addr is assign-once and *(u->addr) is fully set up by the time
      we set u->addr (all under unix_table_lock).  u->path is also
      set in the same critical area, also before setting u->addr, and
      any unix_sock with ->path filled will have non-NULL ->addr.
      
      So setting ->addr with smp_store_release() is all we need for those
      "lockless" users - just have them fetch ->addr with smp_load_acquire()
      and don't even bother looking at ->path if they see NULL ->addr.
      
      Users of ->addr and ->path fall into several classes now:
          1) ones that do smp_load_acquire(u->addr) and access *(u->addr)
      and u->path only if smp_load_acquire() has returned non-NULL.
          2) places holding unix_table_lock.  These are guaranteed that
      *(u->addr) is seen fully initialized.  If unix_sock is in one of the
      "bound" chains, so's ->path.
          3) unix_sock_destructor() using ->addr is safe.  All places
      that set u->addr are guaranteed to have seen all stores *(u->addr)
      while holding a reference to u and unix_sock_destructor() is called
      when (atomic) refcount hits zero.
          4) unix_release_sock() using ->path is safe.  unix_bind()
      is serialized wrt unix_release() (normally - by struct file
      refcount), and for the instances that had ->path set by unix_bind()
      unix_release_sock() comes from unix_release(), so they are fine.
      Instances that had it set in unix_stream_connect() either end up
      attached to a socket (in unix_accept()), in which case the call
      chain to unix_release_sock() and serialization are the same as in
      the previous case, or they never get accept'ed and unix_release_sock()
      is called when the listener is shut down and its queue gets purged.
      In that case the listener's queue lock provides the barriers needed -
      unix_stream_connect() shoves our unix_sock into listener's queue
      under that lock right after having set ->path and eventual
      unix_release_sock() caller picks them from that queue under the
      same lock right before calling unix_release_sock().
          5) unix_find_other() use of ->path is pointless, but safe -
      it happens with successful lookup by (abstract) name, so ->path.dentry
      is guaranteed to be NULL there.
      earlier-variant-reviewed-by: N"Paul E. McKenney" <paulmck@linux.ibm.com>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      345af5ab
    • U
      net/smc: fix smc_poll in SMC_INIT state · f56b3c29
      Ursula Braun 提交于
      [ Upstream commit d7cf4a3bf3a83c977a29055e1c4ffada7697b31f ]
      
      smc_poll() returns with mask bit EPOLLPRI if the connection urg_state
      is SMC_URG_VALID. Since SMC_URG_VALID is zero, smc_poll signals
      EPOLLPRI errorneously if called in state SMC_INIT before the connection
      is created, for instance in a non-blocking connect scenario.
      
      This patch switches to non-zero values for the urg states.
      Reviewed-by: NKarsten Graul <kgraul@linux.ibm.com>
      Fixes: de8474eb ("net/smc: urgent data support")
      Signed-off-by: NUrsula Braun <ubraun@linux.ibm.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f56b3c29
    • M
      bonding: fix PACKET_ORIGDEV regression · 795cb33c
      Michal Soltys 提交于
      [ Upstream commit 3c963a3306eada999be5ebf4f293dfa3d3945487 ]
      
      This patch fixes a subtle PACKET_ORIGDEV regression which was a side
      effect of fixes introduced by:
      
      6a9e461f bonding: pass link-local packets to bonding master also.
      
      ... to:
      
      b89f04c6 bonding: deliver link-local packets with skb->dev set to link that packets arrived on
      
      While 6a9e461f restored pre-b89f04c6 presence of link-local
      packets on bonding masters (which is required e.g. by linux bridges
      participating in spanning tree or needed for lab-like setups created
      with group_fwd_mask) it also caused the originating device
      information to be lost due to cloning.
      
      Maciej Żenczykowski proposed another solution that doesn't require
      packet cloning and retains original device information - instead of
      returning RX_HANDLER_PASS for all link-local packets it's now limited
      only to packets from inactive slaves.
      
      At the same time, packets passed to bonding masters retain correct
      information about the originating device and PACKET_ORIGDEV can be used
      to determine it.
      
      This elegantly solves all issues so far:
      
      - link-local packets that were removed from bonding masters
      - LLDP daemons being forced to explicitly bind to slave interfaces
      - PACKET_ORIGDEV having no effect on bond interfaces
      
      Fixes: 6a9e461f (bonding: pass link-local packets to bonding master also.)
      Reported-by: NVincent Bernat <vincent@bernat.ch>
      Signed-off-by: NMichal Soltys <soltys@ziu.info>
      Signed-off-by: NMaciej Żenczykowski <maze@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      795cb33c
    • P
      ipv6: route: enforce RCU protection in ip6_route_check_nh_onlink() · 2e4b2aeb
      Paolo Abeni 提交于
      [ Upstream commit bf1dc8bad1d42287164d216d8efb51c5cd381b18 ]
      
      We need a RCU critical section around rt6_info->from deference, and
      proper annotation.
      
      Fixes: 4ed591c8ab44 ("net/ipv6: Allow onlink routes to have a device mismatch if it is the default route")
      Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
      Reviewed-by: NDavid Ahern <dsahern@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2e4b2aeb
    • P
      ipv6: route: enforce RCU protection in rt6_update_exception_stamp_rt() · 96dd4ef3
      Paolo Abeni 提交于
      [ Upstream commit 193f3685d0546b0cea20c99894aadb70098e47bf ]
      
      We must access rt6_info->from under RCU read lock: move the
      dereference under such lock, with proper annotation.
      
      v1 -> v2:
       - avoid using multiple, racy, fetch operations for rt->from
      
      Fixes: a68886a6 ("net/ipv6: Make from in rt6_info rcu protected")
      Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
      Reviewed-by: NDavid Ahern <dsahern@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      96dd4ef3
    • D
      ipvlan: disallow userns cap_net_admin to change global mode/flags · 1856bbbe
      Daniel Borkmann 提交于
      [ Upstream commit 7cc9f7003a969d359f608ebb701d42cafe75b84a ]
      
      When running Docker with userns isolation e.g. --userns-remap="default"
      and spawning up some containers with CAP_NET_ADMIN under this realm, I
      noticed that link changes on ipvlan slave device inside that container
      can affect all devices from this ipvlan group which are in other net
      namespaces where the container should have no permission to make changes
      to, such as the init netns, for example.
      
      This effectively allows to undo ipvlan private mode and switch globally to
      bridge mode where slaves can communicate directly without going through
      hostns, or it allows to switch between global operation mode (l2/l3/l3s)
      for everyone bound to the given ipvlan master device. libnetwork plugin
      here is creating an ipvlan master and ipvlan slave in hostns and a slave
      each that is moved into the container's netns upon creation event.
      
      * In hostns:
      
        # ip -d a
        [...]
        8: cilium_host@bond0: <BROADCAST,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
           link/ether 0c:c4:7a:e1:3d:cc brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535
           ipvlan  mode l3 bridge numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
           inet 10.41.0.1/32 scope link cilium_host
             valid_lft forever preferred_lft forever
        [...]
      
      * Spawn container & change ipvlan mode setting inside of it:
      
        # docker run -dt --cap-add=NET_ADMIN --network cilium-net --name client -l app=test cilium/netperf
        9fff485d69dcb5ce37c9e33ca20a11ccafc236d690105aadbfb77e4f4170879c
      
        # docker exec -ti client ip -d a
        [...]
        10: cilium0@if4: <BROADCAST,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
            link/ether 0c:c4:7a:e1:3d:cc brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535
            ipvlan  mode l3 bridge numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
            inet 10.41.197.43/32 brd 10.41.197.43 scope global cilium0
               valid_lft forever preferred_lft forever
      
        # docker exec -ti client ip link change link cilium0 name cilium0 type ipvlan mode l2
      
        # docker exec -ti client ip -d a
        [...]
        10: cilium0@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
            link/ether 0c:c4:7a:e1:3d:cc brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535
            ipvlan  mode l2 bridge numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
            inet 10.41.197.43/32 brd 10.41.197.43 scope global cilium0
               valid_lft forever preferred_lft forever
      
      * In hostns (mode switched to l2):
      
        # ip -d a
        [...]
        8: cilium_host@bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
            link/ether 0c:c4:7a:e1:3d:cc brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535
            ipvlan  mode l2 bridge numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
            inet 10.41.0.1/32 scope link cilium_host
               valid_lft forever preferred_lft forever
        [...]
      
      Same l3 -> l2 switch would also happen by creating another slave inside
      the container's network namespace when specifying the existing cilium0
      link to derive the actual (bond0) master:
      
        # docker exec -ti client ip link add link cilium0 name cilium1 type ipvlan mode l2
      
        # docker exec -ti client ip -d a
        [...]
        2: cilium1@if4: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
            link/ether 0c:c4:7a:e1:3d:cc brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535
            ipvlan  mode l2 bridge numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
        10: cilium0@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
            link/ether 0c:c4:7a:e1:3d:cc brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535
            ipvlan  mode l2 bridge numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
            inet 10.41.197.43/32 brd 10.41.197.43 scope global cilium0
               valid_lft forever preferred_lft forever
      
      * In hostns:
      
        # ip -d a
        [...]
        8: cilium_host@bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
            link/ether 0c:c4:7a:e1:3d:cc brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535
            ipvlan  mode l2 bridge numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
            inet 10.41.0.1/32 scope link cilium_host
               valid_lft forever preferred_lft forever
        [...]
      
      One way to mitigate it is to check CAP_NET_ADMIN permissions of
      the ipvlan master device's ns, and only then allow to change
      mode or flags for all devices bound to it. Above two cases are
      then disallowed after the patch.
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: NMahesh Bandewar <maheshb@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1856bbbe
    • G
      team: use operstate consistently for linkup · e5c31b5a
      George Wilkie 提交于
      [ Upstream commit 8c7a77267eec81dd81af8412f29e50c0b1082548 ]
      
      When a port is added to a team, its initial state is derived
      from netif_carrier_ok rather than netif_oper_up.
      If it is carrier up but operationally down at the time of being
      added, the port state.linkup will be set prematurely.
      port state.linkup should be set consistently using
      netif_oper_up rather than netif_carrier_ok.
      
      Fixes: f1d22a1e ("team: account for oper state")
      Signed-off-by: NGeorge Wilkie <gwilkie@vyatta.att-mail.com>
      Acked-by: NJiri Pirko <jiri@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e5c31b5a
    • P
      ipv6: route: purge exception on removal · b9d0cb75
      Paolo Abeni 提交于
      [ Upstream commit f5b51fe804ec2a6edce0f8f6b11ea57283f5857b ]
      
      When a netdevice is unregistered, we flush the relevant exception
      via rt6_sync_down_dev() -> fib6_ifdown() -> fib6_del() -> fib6_del_route().
      
      Finally, we end-up calling rt6_remove_exception(), where we release
      the relevant dst, while we keep the references to the related fib6_info and
      dev. Such references should be released later when the dst will be
      destroyed.
      
      There are a number of caches that can keep the exception around for an
      unlimited amount of time - namely dst_cache, possibly even socket cache.
      As a result device registration may hang, as demonstrated by this script:
      
      ip netns add cl
      ip netns add rt
      ip netns add srv
      ip netns exec rt sysctl -w net.ipv6.conf.all.forwarding=1
      
      ip link add name cl_veth type veth peer name cl_rt_veth
      ip link set dev cl_veth netns cl
      ip -n cl link set dev cl_veth up
      ip -n cl addr add dev cl_veth 2001::2/64
      ip -n cl route add default via 2001::1
      
      ip -n cl link add tunv6 type ip6tnl mode ip6ip6 local 2001::2 remote 2002::1 hoplimit 64 dev cl_veth
      ip -n cl link set tunv6 up
      ip -n cl addr add 2013::2/64 dev tunv6
      
      ip link set dev cl_rt_veth netns rt
      ip -n rt link set dev cl_rt_veth up
      ip -n rt addr add dev cl_rt_veth 2001::1/64
      
      ip link add name rt_srv_veth type veth peer name srv_veth
      ip link set dev srv_veth netns srv
      ip -n srv link set dev srv_veth up
      ip -n srv addr add dev srv_veth 2002::1/64
      ip -n srv route add default via 2002::2
      
      ip -n srv link add tunv6 type ip6tnl mode ip6ip6 local 2002::1 remote 2001::2 hoplimit 64 dev srv_veth
      ip -n srv link set tunv6 up
      ip -n srv addr add 2013::1/64 dev tunv6
      
      ip link set dev rt_srv_veth netns rt
      ip -n rt link set dev rt_srv_veth up
      ip -n rt addr add dev rt_srv_veth 2002::2/64
      
      ip netns exec srv netserver & sleep 0.1
      ip netns exec cl ping6 -c 4 2013::1
      ip netns exec cl netperf -H 2013::1 -t TCP_STREAM -l 3 & sleep 1
      ip -n rt link set dev rt_srv_veth mtu 1400
      wait %2
      
      ip -n cl link del cl_veth
      
      This commit addresses the issue purging all the references held by the
      exception at time, as we currently do for e.g. ipv6 pcpu dst entries.
      
      v1 -> v2:
       - re-order the code to avoid accessing dst and net after dst_dev_put()
      
      Fixes: 93531c67 ("net/ipv6: separate handling of FIB entries from dst based routes")
      Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
      Reviewed-by: NDavid Ahern <dsahern@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b9d0cb75