1. 16 Mar 2020 (2 commits)
  2. 14 Mar 2020 (6 commits)
  3. 12 Mar 2020 (9 commits)
  4. 11 Mar 2020 (4 commits)
  5. 05 Mar 2020 (19 commits)
    • livepatch/x86: enable livepatch config openeuler · 4b90845b
      Committed by Cheng Jian
      hulk inclusion
      category: feature
      bugzilla: 5507
      CVE: NA
      
      ---------------------------
      
      We have completed livepatch without ftrace for x86_64, so we
      can now enable it.
      Signed-off-by: Cheng Jian <cj.chengjian@huawei.com>
      Reviewed-By: Xie XiuQi <xiexiuqi@huawei.com>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
      4b90845b
    • livepatch/x86: enable livepatch config for hulk · 6a07930a
      Committed by Cheng Jian
      hulk inclusion
      category: feature
      bugzilla: 5507
      CVE: NA
      
      ---------------------------
      
      We have completed livepatch without ftrace for x86_64, so we
      can now enable it.
      Signed-off-by: Cheng Jian <cj.chengjian@huawei.com>
      Reviewed-By: Xie XiuQi <xiexiuqi@huawei.com>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
      6a07930a
    • livepatch/arm64: check active func in consistency stack checking · b6f7ad60
      Committed by Cheng Jian
      hulk inclusion
      category: bugfix
      bugzilla: 5507/31358
      CVE: NA
      ---------------------------
      
      When doing consistency stack checking, if we try to patch a
      function which has already been patched, we should check the
      new function (not the original one) that is currently active;
      it is always the first entry in the list func_node->func_stack.
      
      Example :
      	module : origin			livepatch v1		livepatch v2
      	func   : old func A -[enable]=> new func A' -[enable]=> new func A''
      	check  :		A			A'
      
      When we try to patch function A to the new function A'' with
      livepatch v2, function A has already been patched to A' by
      livepatch v1, so the function A' provided by livepatch v1 is
      active on the stack instead of the original function A. Even
      when the long jump method is used, we jump to the new function
      A' using a call without LR, so the original function A will not
      appear on the stack. We must therefore check the active function
      A' in consistency stack checking.
      Signed-off-by: Cheng Jian <cj.chengjian@huawei.com>
      Reviewed-By: Xie XiuQi <xiexiuqi@huawei.com>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
      b6f7ad60
    • livepatch/x86: check active func in consistency stack checking · 27057710
      Committed by Cheng Jian
      hulk inclusion
      category: bugfix
      bugzilla: 5507/31358
      CVE: NA
      ---------------------------
      
      When doing consistency stack checking, if we try to patch a
      function which has already been patched, we should check the
      new function (not the original one) that is currently active;
      it is always the first entry in the list func_node->func_stack.
      
      Example :
      	module : origin			livepatch v1		livepatch v2
      	func   : old func A -[enable]=> new func A' -[enable]=> new func A''
      	check  :		A			A'
      
      When we try to patch function A to the new function A'' with
      livepatch v2, function A has already been patched to A' by
      livepatch v1, so the function A' provided by livepatch v1 is
      active on the stack instead of the original function A. Even
      when the long jump method is used, we jump to the new function
      A' using a call without LR, so the original function A will not
      appear on the stack. We must therefore check the active function
      A' in consistency stack checking (a minimal sketch follows this
      entry).
      Signed-off-by: Cheng Jian <cj.chengjian@huawei.com>
      Reviewed-By: Xie XiuQi <xiexiuqi@huawei.com>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
      27057710
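      A minimal sketch of the check described in the two entries above. The
      struct klp_func_node layout here is an assumption based on the commit
      text (upstream livepatch keeps the same list in struct klp_ops); struct
      klp_func with its new_func and stack_node members is the regular
      livepatch structure.

      #include <linux/list.h>
      #include <linux/livepatch.h>

      struct klp_func_node {
              struct list_head func_stack;    /* applied patches, newest first */
              void *old_func;                 /* address of the original function */
      };

      /* Address that must not be found on any task's stack before patching. */
      static void *klp_addr_to_check(struct klp_func_node *func_node)
      {
              struct klp_func *func;

              if (list_empty(&func_node->func_stack))
                      return func_node->old_func;     /* never patched: check the original */

              /* Already patched: the active code is the newest patch on the stack. */
              func = list_first_entry(&func_node->func_stack,
                                      struct klp_func, stack_node);
              return func->new_func;
      }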
    • livepatch/x86: support livepatch without ftrace · 7e2ab91e
      Committed by Cheng Jian
      hulk inclusion
      category: feature
      bugzilla: 5507
      CVE: NA
      
      ----------------------------------------
      
      Support livepatch without ftrace for x86_64.
      
      Supported now:
              livepatch relocation when init_patch after load_module;
              instruction patching when enabled;
              activeness function check;
              enforcing the patch stacking principle.
      
      x86_64 uses variable-length instructions, so no extra long-jump
      implementation is needed (a minimal sketch of the patched jump
      follows this entry).
      Signed-off-by: Cheng Jian <cj.chengjian@huawei.com>
      Signed-off-by: Li Bin <huawei.libin@huawei.com>
      Tested-by: Yang ZuoTing <yangzuoting@huawei.com>
      Tested-by: Cheng Jian <cj.chengjian@huawei.com>
      Reviewed-by: Xie XiuQi <xiexiuqi@huawei.com>
      Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>
      Reviewed-By: Xie XiuQi <xiexiuqi@huawei.com>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
      7e2ab91e
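      A minimal user-space sketch of the instruction patching mentioned above:
      on x86_64 the old function can be redirected with a 5-byte relative jump
      (opcode 0xE9 followed by a 32-bit displacement), where the displacement
      is measured from the end of that instruction. The real kernel code must
      also deal with W^X protections and cross-modifying-code rules; this only
      shows the encoding.

      #include <stdint.h>
      #include <string.h>

      #define JMP_E9_SIZE 5

      static void encode_jmp_e9(const void *old_func, const void *new_func,
                                unsigned char insn[JMP_E9_SIZE])
      {
              /* displacement is relative to the byte after the 5-byte jump */
              int32_t rel = (int32_t)((intptr_t)new_func -
                                      ((intptr_t)old_func + JMP_E9_SIZE));

              insn[0] = 0xe9;                         /* JMP rel32 */
              memcpy(&insn[1], &rel, sizeof(rel));    /* little-endian displacement */
      }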
    • KVM: nVMX: Check IO instruction VM-exit conditions · eb232fcd
      Committed by Oliver Upton
      commit 35a571346a94fb93b5b3b6a599675ef3384bc75c upstream.
      
      Consult the 'unconditional IO exiting' and 'use IO bitmaps' VM-execution
      controls when checking instruction interception. If the 'use IO bitmaps'
      VM-execution control is 1, check the instruction access against the IO
      bitmaps to determine if the instruction causes a VM-exit.
      Signed-off-by: Oliver Upton <oupton@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
      eb232fcd
    • KVM: nVMX: Refactor IO bitmap checks into helper function · 6959f6e4
      Committed by Oliver Upton
      commit e71237d3ff1abf9f3388337cfebf53b96df2020d upstream.
      
      Checks against the IO bitmap are useful for both instruction emulation
      and VM-exit reflection. Refactor the IO bitmap checks into a helper
      function.
      Signed-off-by: Oliver Upton <oupton@google.com>
      Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
      6959f6e4
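      A simplified, pure-C illustration of the I/O-bitmap rule used by the two
      KVM entries above (it is not the actual KVM helper, which also honors the
      'unconditional IO exiting' control and reads the bitmaps from guest
      memory): bitmap A covers ports 0x0000-0x7fff, bitmap B covers
      0x8000-0xffff, one bit per port, and a multi-byte access exits if any
      covered port's bit is set.

      #include <stdbool.h>
      #include <stdint.h>

      static bool io_access_intercepted(const uint8_t *bitmap_a,
                                        const uint8_t *bitmap_b,
                                        unsigned int port, unsigned int size)
      {
              unsigned int i;

              for (i = 0; i < size; i++) {
                      unsigned int p = port + i;
                      const uint8_t *bitmap;

                      if (p >= 0x10000)
                              return true;    /* beyond port space: treat as intercepted */

                      bitmap = (p < 0x8000) ? bitmap_a : bitmap_b;
                      if (bitmap[(p & 0x7fff) / 8] & (1u << (p & 7)))
                              return true;    /* this byte's port causes a VM-exit */
              }
              return false;
      }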
    • KVM: nVMX: Don't emulate instructions in guest mode · eb561a7a
      Committed by Paolo Bonzini
      commit 07721feee46b4b248402133228235318199b05ec upstream.
      
      vmx_check_intercept is not yet fully implemented. To avoid emulating
      instructions disallowed by the L1 hypervisor, refuse to emulate
      instructions by default.
      
      Cc: stable@vger.kernel.org
      [Made commit, added commit msg - Oliver]
      Signed-off-by: Oliver Upton <oupton@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
      eb561a7a
    • floppy: check FDC index for errors before assigning it · 2ad1a109
      Committed by Linus Torvalds
      commit 2e90ca68b0d2f5548804f22f0dd61145516171e3 upstream.
      
      Jordy Zomer reported a KASAN out-of-bounds read in the floppy driver in
      wait_til_ready().
      
      Which on the face of it can't happen, since as Willy Tarreau points out,
      the function does no particular memory access.  Except through the FDCS
      macro, which just indexes a static allocation through the current fdc,
      which is always checked against N_FDC.
      
      Except the checking happens after we've already assigned the value.
      
      The floppy driver is a disgrace (a lot of it going back to my original
      horrid "design"), and has no real maintainer.  Nobody has the hardware,
      and nobody really cares.  But it still gets used in virtual environment
      because it's one of those things that everybody supports.
      
      The whole thing should be re-written, or at least parts of it should be
      seriously cleaned up.  The 'current fdc' index, which is used by the
      FDCS macro, and which is often shadowed by a local 'fdc' variable, is a
      prime example of how not to write code.
      
      But because nobody has the hardware or the motivation, let's just fix up
      the immediate problem with a nasty band-aid: test the fdc index before
      actually assigning it to the static 'fdc' variable.
      Reported-by: Jordy Zomer <jordy@simplyhacker.com>
      Cc: Willy Tarreau <w@1wt.eu>
      Cc: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
      2ad1a109
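      A minimal sketch of the band-aid described above (illustrative only, not
      the driver's actual code): validate the requested controller index before
      it is stored in the global, so later accesses that index a static array
      through that global can never go out of bounds.

      #include <linux/errno.h>

      #define N_FDC 2

      static int current_fdc;                 /* global controller index */

      static int set_fdc_index(int new_fdc)
      {
              if (new_fdc < 0 || new_fdc >= N_FDC)
                      return -EINVAL;         /* reject before the assignment happens */

              current_fdc = new_fdc;
              return 0;
      }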
    • ext4: add cond_resched() to __ext4_find_entry() · ef91f0f3
      Committed by Shijie Luo via Kernel
      mainline inclusion
      from mainline-v5.6-rc3
      commit 9424ef56
      category: bugfix
      bugzilla: 31127
      CVE: NA
      
      -------------------------------------------------
      We ran into a soft lockup problem in Linux 4.19 which can also be
      found in Linux 5.x.
      
      When a directory inode takes up a large number of blocks, and the
      directory keeps growing while we are searching, the restart branch
      may be taken many times, so the do-while loop can hold the CPU for
      a long time.
      
      Here is the call trace in linux 4.19.
      
      [  473.756186] Call trace:
      [  473.756196]  dump_backtrace+0x0/0x198
      [  473.756199]  show_stack+0x24/0x30
      [  473.756205]  dump_stack+0xa4/0xcc
      [  473.756210]  watchdog_timer_fn+0x300/0x3e8
      [  473.756215]  __hrtimer_run_queues+0x114/0x358
      [  473.756217]  hrtimer_interrupt+0x104/0x2d8
      [  473.756222]  arch_timer_handler_virt+0x38/0x58
      [  473.756226]  handle_percpu_devid_irq+0x90/0x248
      [  473.756231]  generic_handle_irq+0x34/0x50
      [  473.756234]  __handle_domain_irq+0x68/0xc0
      [  473.756236]  gic_handle_irq+0x6c/0x150
      [  473.756238]  el1_irq+0xb8/0x140
      [  473.756286]  ext4_es_lookup_extent+0xdc/0x258 [ext4]
      [  473.756310]  ext4_map_blocks+0x64/0x5c0 [ext4]
      [  473.756333]  ext4_getblk+0x6c/0x1d0 [ext4]
      [  473.756356]  ext4_bread_batch+0x7c/0x1f8 [ext4]
      [  473.756379]  ext4_find_entry+0x124/0x3f8 [ext4]
      [  473.756402]  ext4_lookup+0x8c/0x258 [ext4]
      [  473.756407]  __lookup_hash+0x8c/0xe8
      [  473.756411]  filename_create+0xa0/0x170
      [  473.756413]  do_mkdirat+0x6c/0x140
      [  473.756415]  __arm64_sys_mkdirat+0x28/0x38
      [  473.756419]  el0_svc_common+0x78/0x130
      [  473.756421]  el0_svc_handler+0x38/0x78
      [  473.756423]  el0_svc+0x8/0xc
      [  485.755156] watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [tmp:5149]
      
      Add cond_resched() to avoid the soft lockup and to make the system
      more responsive (a minimal sketch follows this entry).
      
      Link: https://lore.kernel.org/r/20200215080206.13293-1-luoshijie1@huawei.com
      Signed-off-by: Shijie Luo <luoshijie1@huawei.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Cc: stable@kernel.org
      Reviewed-by: zhangyi (F) <yi.zhang@huawei.com>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
      ef91f0f3
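      A minimal sketch of the fix described above, not the actual ext4 code;
      scan_one_pass() is a hypothetical stand-in for one pass over the
      directory blocks. The point is simply that the restart path now yields
      the CPU, so a lookup racing with directory growth cannot keep a CPU
      busy long enough to trip the watchdog.

      #include <linux/errno.h>
      #include <linux/sched.h>

      static int scan_one_pass(void);         /* hypothetical helper */

      static int search_with_restart(void)
      {
              int ret;

      restart:
              cond_resched();                 /* let other tasks run during long scans */
              ret = scan_one_pass();
              if (ret == -EAGAIN)             /* directory grew while we were searching */
                      goto restart;
              return ret;
      }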
    • x86 / config: add openeuler_defconfig · 95a54772
      Committed by Xiongfeng Wang
      hulk inclusion
      category: config
      bugzilla: 31089
      CVE: NA
      
      -----------------------------
      
      Add openeuler_defconfig for openeuler itself.
      Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
      Reviewed-By: Xie XiuQi <xiexiuqi@huawei.com>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
      95a54772
    • files_cgroup: Fix soft lockup when refcnt overflow. · 22f98d8e
      Committed by Zhang Xiaoxu
      hulk inclusion
      category: bugfix
      bugzilla: 31087
      CVE: NA
      
      ---------------------
      
      There is a soft lockup call trace as below:
        CPU: 0 PID: 1360 Comm: imapsvcd Kdump: loaded Tainted: G           OE
        task: ffff8a7296e1eeb0 ti: ffff8a7296aa0000 task.ti: ffff8a7296aa0000
        RIP: 0010:[<ffffffffb691ecb4>]  [<ffffffffb691ecb4>]
        __css_tryget+0x24/0x50
        RSP: 0018:ffff8a7296aa3db8  EFLAGS: 00000a87
        RAX: 0000000080000000 RBX: ffff8a7296aa3df8 RCX: ffff8a72820d9a08
        RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8a72820d9a00
        RBP: ffff8a7296aa3db8 R08: 000000000001c360 R09: ffffffffb6a478f4
        R10: ffffffffb6935e83 R11: ffffffffffffffd0 R12: 0000000057d35cd8
        R13: 000000d000000002 R14: ffffffffb6892fbe R15: 000000d000000002
        FS:  0000000000000000(0000) GS:ffff8a72fec00000(0063)
        knlGS:00000000c6e65b40
        CS:  0010 DS: 002b ES: 002b CR0: 0000000080050033
        CR2: 0000000057d35cd8 CR3: 00000007e8008000 CR4: 00000000003607f0
        DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
        Call Trace:
         [<ffffffffb6a93578>] files_cgroup_assign+0x48/0x60
         [<ffffffffb6a47972>] dup_fd+0xb2/0x2f0
         [<ffffffffb6935e83>] ? audit_alloc+0xe3/0x180
         [<ffffffffb6893a03>] copy_process+0xbd3/0x1a40
         [<ffffffffb6894a21>] do_fork+0x91/0x320
         [<ffffffffb6f329e6>] ? trace_do_page_fault+0x56/0x150
         [<ffffffffb6894d36>] SyS_clone+0x16/0x20
         [<ffffffffb6f3bf8c>] ia32_ptregs_common+0x4c/0xfc
         code: 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 8d 4f 08 48 89 e5 8b
               47 08 8d 90 00 00 00 80 85 c0 0f 49 d0 8d 72 01 89 d0 f0 0f b1
      
      When the child process exits, we do not decrement the refcnt, so the
      refcnt may overflow. Then task_get_css() loops forever because
      css_refcnt() returns an unbiased refcnt: once the refcnt goes negative,
      __css_tryget() always returns false, so task_get_css() spins in a dead
      loop.
      
      The child process always calls close_files() on exit, so add the
      refcnt decrement there (a minimal sketch follows this entry).
      Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
      Reviewed-by: Hou Tao <houtao1@huawei.com>
      Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
      22f98d8e
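      A minimal sketch of the refcount symmetry being restored;
      files_cgroup_assign() appears in the trace above, while
      files_cgroup_remove() is a hypothetical name for the decrement the patch
      adds to close_files(). The point is that the reference taken when a
      files_struct is charged on fork must be dropped when that files_struct
      is torn down, otherwise the counter keeps growing until it overflows
      negative and __css_tryget() can never succeed.

      #include <linux/fdtable.h>

      void files_cgroup_assign(struct files_struct *files);  /* from the out-of-tree patch */
      void files_cgroup_remove(struct files_struct *files);  /* hypothetical counterpart */

      /* fork path: dup_fd() charges the new files_struct to the cgroup */
      static void example_fork_side(struct files_struct *newf)
      {
              files_cgroup_assign(newf);      /* takes a css reference */
      }

      /* exit path: close_files() must release what dup_fd() took */
      static void example_exit_side(struct files_struct *files)
      {
              files_cgroup_remove(files);     /* drops the css reference */
      }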
    • vt: selection, close sel_buffer race · 099d032f
      Committed by Jiri Slaby
      mainline inclusion
      from mainline-v5.6-rc2
      commit 07e6124a1a46b4b5a9b3cacc0c306b50da87abf5
      category: bugfix
      bugzilla: 13690
      CVE: CVE-2020-8648
      
      -------------------------------------------------
      
      syzkaller reported this UAF:
      BUG: KASAN: use-after-free in n_tty_receive_buf_common+0x2481/0x2940 drivers/tty/n_tty.c:1741
      Read of size 1 at addr ffff8880089e40e9 by task syz-executor.1/13184
      
      CPU: 0 PID: 13184 Comm: syz-executor.1 Not tainted 5.4.7 #1
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
      Call Trace:
      ...
       kasan_report+0xe/0x20 mm/kasan/common.c:634
       n_tty_receive_buf_common+0x2481/0x2940 drivers/tty/n_tty.c:1741
       tty_ldisc_receive_buf+0xac/0x190 drivers/tty/tty_buffer.c:461
       paste_selection+0x297/0x400 drivers/tty/vt/selection.c:372
       tioclinux+0x20d/0x4e0 drivers/tty/vt/vt.c:3044
       vt_ioctl+0x1bcf/0x28d0 drivers/tty/vt/vt_ioctl.c:364
       tty_ioctl+0x525/0x15a0 drivers/tty/tty_io.c:2657
       vfs_ioctl fs/ioctl.c:47 [inline]
      
      It is due to a race between parallel paste_selection (TIOCL_PASTESEL)
      and set_selection_user (TIOCL_SETSEL) invocations. One uses sel_buffer,
      while the other frees it and reallocates a new one for another
      selection. Add a mutex to close this race.
      
      The mutex takes care properly of sel_buffer and sel_buffer_lth only. The
      other selection global variables (like sel_start, sel_end, and sel_cons)
      are protected only in set_selection_user. The other functions need quite
      some more work to close the races of the variables there. This is going
      to happen later.
      
      This likely fixes (I am unsure as there is no reproducer provided) bug
      206361 too. It was marked as CVE-2020-8648.
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      Reported-by: syzbot+59997e8d5cbdc486e6f6@syzkaller.appspotmail.com
      References: https://bugzilla.kernel.org/show_bug.cgi?id=206361
      Cc: stable <stable@vger.kernel.org>
      Link: https://lore.kernel.org/r/20200210081131.23572-2-jslaby@suse.cz
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
      Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
      099d032f
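      A minimal sketch of the locking scheme described above (not the actual
      driver code): both the ioctl that replaces the selection buffer and the
      ioctl that pastes from it serialize on one mutex, so the paste path can
      never read a buffer that a concurrent set-selection has just freed.

      #include <linux/errno.h>
      #include <linux/mutex.h>
      #include <linux/slab.h>
      #include <linux/string.h>

      static DEFINE_MUTEX(sel_lock);          /* protects sel_buffer/sel_buffer_lth */
      static char *sel_buffer;
      static int sel_buffer_lth;

      static int example_set_selection(const char *src, int len)
      {
              char *new_buf = kmemdup(src, len, GFP_KERNEL);

              if (!new_buf)
                      return -ENOMEM;

              mutex_lock(&sel_lock);
              kfree(sel_buffer);              /* the old buffer dies under the lock */
              sel_buffer = new_buf;
              sel_buffer_lth = len;
              mutex_unlock(&sel_lock);
              return 0;
      }

      static void example_paste_selection(void (*push_char)(char c))
      {
              int i;

              mutex_lock(&sel_lock);          /* paste sees a consistent buffer */
              for (i = 0; i < sel_buffer_lth; i++)
                      push_char(sel_buffer[i]);
              mutex_unlock(&sel_lock);
      }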
    • vt: selection, handle pending signals in paste_selection · 734874b5
      Committed by Jiri Slaby
      mainline inclusion
      from mainline-v5.6-rc2
      commit 687bff0cd08f790d540cfb7b2349f0d876cdddec
      category: bugfix
      bugzilla: 13690
      CVE: CVE-2020-8648
      
      -------------------------------------------------
      
      When pasting a selection to a vt, the task is set as INTERRUPTIBLE while
      waiting for a tty to unthrottle. But signals are not handled at all.
      Normally, this is not a problem as tty_ldisc_receive_buf receives all
      the goods and a user has no reason to interrupt the task.
      
      There are two scenarios where this matters:
      1) when the tty is throttled and a signal is sent to the process, it
         spins on a CPU until the tty is unthrottled. schedule() does not
         really schedule, but returns immediately, of course.
      2) when the sel_buffer becomes invalid, KASAN prevents any reads from it
         and the loop simply does not proceed and spins forever (causing the
         tty to throttle, but the code never sleeps, the same as above). This
         sometimes happens as there is a race in the sel_buffer handling code.
      
      So add signal handling to this ioctl (TIOCL_PASTESEL) and return -EINTR
      in case a signal is pending.
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      Cc: stable <stable@vger.kernel.org>
      Link: https://lore.kernel.org/r/20200210081131.23572-1-jslaby@suse.cz
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
      Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
      734874b5
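      A minimal sketch of the change described above (not the actual vt code):
      the wait-until-unthrottled loop now checks for pending signals and bails
      out with -EINTR instead of spinning or sleeping with signals ignored.

      #include <linux/errno.h>
      #include <linux/sched.h>
      #include <linux/sched/signal.h>

      static int example_wait_until_unthrottled(bool (*throttled)(void))
      {
              set_current_state(TASK_INTERRUPTIBLE);
              while (throttled()) {
                      if (signal_pending(current)) {
                              __set_current_state(TASK_RUNNING);
                              return -EINTR;  /* let the user interrupt the paste */
                      }
                      schedule();             /* sleep until woken, e.g. on unthrottle */
                      set_current_state(TASK_INTERRUPTIBLE);
              }
              __set_current_state(TASK_RUNNING);
              return 0;
      }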
    • RDMA/hns: Compilation Configuration update · f8930e91
      Committed by Gao Xun
      driver inclusion
      category: Bugfix
      bugzilla: NA
      CVE: NA
      
      We updated the conditional compilation layout of the dfx module to
      ensure the driver builds properly when dfx is turned off in the
      .config file.
      Signed-off-by: Gao Xun <gaoxun3@huawei.com>
      Reviewed-by: Hu Chunzhi <huchunzhi@huawei.com>
      Reviewed-by: Wang Lin <wanglin137@huawei.com>
      Reviewed-by: Zhao Weibo <zhaoweibo3@huawei.com>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
      f8930e91
    • jbd2: do not clear the BH_Mapped flag when forgetting a metadata buffer · 9e486a76
      Committed by zhangyi (F)
      [ Upstream commit c96dceea ]
      
      Commit 904cdbd4 ("jbd2: clear dirty flag when revoking a buffer from
      an older transaction") sets the BH_Freed flag when forgetting a metadata
      buffer which belongs to the committing transaction, to indicate that the
      committing process should clear the dirty bits when it is done with the
      buffer. But it also clears the BH_Mapped flag at the same time, which may
      trigger the NULL pointer oops below when block_size < PAGE_SIZE.
      
      rmdir 1             kjournald2                 mkdir 2
                          jbd2_journal_commit_transaction
      		    commit transaction N
      jbd2_journal_forget
      set_buffer_freed(bh1)
                          jbd2_journal_commit_transaction
                           commit transaction N+1
                           ...
                           clear_buffer_mapped(bh1)
                                                     ext4_getblk(bh2 ummapped)
                                                     ...
                                                     grow_dev_page
                                                      init_page_buffers
                                                       bh1->b_private=NULL
                                                       bh2->b_private=NULL
                           jbd2_journal_put_journal_head(jh1)
                            __journal_remove_journal_head(hb1)
      		       jh1 is NULL and trigger oops
      
      *) Directory entry blocks bh1 and bh2 belong to the same page, and bh2
         has already been unmapped.
      
      For the metadata buffer we are forgetting, keeping the mapped flag and
      clearing the dirty flags is enough, so this patch picks out these
      buffers and keeps their BH_Mapped flag (a minimal sketch follows this
      entry).
      
      Link: https://lore.kernel.org/r/20200213063821.30455-3-yi.zhang@huawei.com
      Fixes: 904cdbd4 ("jbd2: clear dirty flag when revoking a buffer from an older transaction")
      Reviewed-by: Jan Kara <jack@suse.cz>
      Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Cc: stable@kernel.org
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
      9e486a76
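      A minimal sketch of the rule stated above (not the actual jbd2 commit
      code, which also has to look at the journal head's transaction state):
      when the committing transaction is done with a forgotten metadata
      buffer, dropping the dirty state is enough; the buffer's mapping to a
      disk block is still valid, so BH_Mapped is left alone.

      #include <linux/jbd2.h>

      static void example_forget_metadata_buffer(struct buffer_head *bh)
      {
              clear_buffer_freed(bh);         /* commit handled the "forget" request */
              clear_buffer_jbddirty(bh);      /* drop the journal dirty state */
              clear_buffer_dirty(bh);         /* drop the normal dirty state */
              /* BH_Mapped is deliberately not cleared */
      }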
    • jbd2: move the clearing of b_modified flag to the journal_unmap_buffer() · c61ee205
      Committed by zhangyi (F)
      [ Upstream commit 6a66a7ded12baa6ebbb2e3e82f8cb91382814839 ]
      
      There is no need to delay clearing the b_modified flag until transaction
      commit time when unmapping a journalled buffer, so just move it to
      journal_unmap_buffer().
      
      Link: https://lore.kernel.org/r/20200213063821.30455-2-yi.zhang@huawei.com
      Reviewed-by: Jan Kara <jack@suse.cz>
      Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Cc: stable@kernel.org
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
      c61ee205
    • iscsi: use dynamic single thread workqueue to improve performance · c2cc4a1f
      Committed by Biaoxiang Ye
      euleros inclusion
      category: feature
      feature: Implement NUMA affinity for order workqueue
      
      -------------------------------------------------
      
      On aarch64 NUMA machines, the iscsi kworker keeps jumping around
      across node boundaries. If it runs on a different node (or even a
      different CPU package) from the network interface's softirq, the
      memcpy in iscsi_tcp_segment_recv slows down and iscsi performance
      becomes terrible.
      
      In this patch, we track the CPU of the softirq and tell queue_work_on
      to execute iscsi_xmitworker on the same NUMA node (a minimal sketch
      follows this entry).
      
      The performance data as below:
      fio cmd:
      fio -filename=/dev/disk/by-id/wwn-0x6883fd3100a2ad260036281700000000
      -direct=1 -iodepth=32 -rw=read -bs=64k -size=30G -ioengine=libaio
      -numjobs=1 -group_reporting -name=mytest -time_based -ramp_time=60
      -runtime=60
      
      before patch:
      Jobs: 1 (f=1): [R] [52.5% done] [852.3MB/0KB/0KB /s] [13.7K/0/0 iops] [eta 00m:57s]
      Jobs: 1 (f=1): [R] [53.3% done] [861.4MB/0KB/0KB /s] [13.8K/0/0 iops] [eta 00m:56s]
      Jobs: 1 (f=1): [R] [54.2% done] [868.2MB/0KB/0KB /s] [13.9K/0/0 iops] [eta 00m:55s]
      
      after patch:
      Jobs: 1 (f=1): [R] [53.3% done] [1070MB/0KB/0KB /s] [17.2K/0/0 iops] [eta 00m:56s]
      Jobs: 1 (f=1): [R] [55.0% done] [1064MB/0KB/0KB /s] [17.3K/0/0 iops] [eta 00m:54s]
      Jobs: 1 (f=1): [R] [56.7% done] [1069MB/0KB/0KB /s] [17.1K/0/0 iops] [eta 00m:52s]
      
      cpu info:
      Architecture:          aarch64
      Byte Order:            Little Endian
      CPU(s):                128
      On-line CPU(s) list:   0-127
      Thread(s) per core:    1
      Core(s) per socket:    64
      Socket(s):             2
      NUMA node(s):          4
      Model:                 0
      CPU max MHz:           2600.0000
      CPU min MHz:           200.0000
      BogoMIPS:              200.00
      L1d cache:             64K
      L1i cache:             64K
      L2 cache:              512K
      L3 cache:              32768K
      NUMA node0 CPU(s):     0-31
      NUMA node1 CPU(s):     32-63
      NUMA node2 CPU(s):     64-95
      NUMA node3 CPU(s):     96-127
      Signed-off-by: Biaoxiang Ye <yebiaoxiang@huawei.com>
      Acked-by: Hanjun Guo <guohanjun@huawei.com>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
      c2cc4a1f
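      A minimal sketch of the idea described above (not the actual iscsi
      code): the receive/softirq path records the CPU it ran on, and the
      transmit work is queued on that same CPU so the kworker and the network
      softirq stay on one NUMA node.

      #include <linux/smp.h>
      #include <linux/workqueue.h>

      struct example_conn {
              int last_recv_cpu;              /* CPU that ran the rx softirq */
              struct work_struct xmitwork;
              struct workqueue_struct *wq;
      };

      /* called from the data_ready / softirq path */
      static void example_data_ready(struct example_conn *conn)
      {
              conn->last_recv_cpu = smp_processor_id();
      }

      /* called when there is data to transmit */
      static void example_kick_xmit(struct example_conn *conn)
      {
              /* run the xmit worker near the rx softirq instead of anywhere */
              queue_work_on(conn->last_recv_cpu, conn->wq, &conn->xmitwork);
      }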
    • workqueue: implement NUMA affinity for single thread workqueue · df86cc94
      Committed by Biaoxiang Ye
      euleros inclusion
      category: feature
      feature: Implement NUMA affinity for order workqueue
      
      -------------------------------------------------
      
      Currently, a single thread workqueue has only a single pwq, and all
      works are queued to the same worker pool. This is not optimal on NUMA
      machines and causes workers to jump around across nodes.
      
      This patch adds a new wq flag, __WQ_DYNAMIC. This new kind of single
      thread workqueue creates a separate pwq covering the intersecting CPUs
      for each NUMA node that has online CPUs in @attrs->cpumask, instead of
      mapping all entries of numa_pwq_tbl[] to the same pwq. After this, we
      can specify the @cpu of queue_work_on, so the work is executed on the
      same NUMA node as the specified @cpu (a minimal usage sketch follows
      this entry).
      This kind of wq only supports a single work; with multiple works the
      ordering cannot be guaranteed.
      Signed-off-by: Biaoxiang Ye <yebiaoxiang@huawei.com>
      Acked-by: Hanjun Guo <guohanjun@huawei.com>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
      df86cc94
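      A minimal usage sketch for the behaviour described above. __WQ_DYNAMIC
      is the out-of-tree flag added by this patch and does not exist upstream,
      and the exact flag combination used to create such a queue here is an
      assumption; the point is only that, once the queue has per-node pwqs,
      queue_work_on() steers the single work to a worker on the chosen CPU's
      NUMA node.

      #include <linux/errno.h>
      #include <linux/workqueue.h>

      static struct workqueue_struct *example_wq;
      static struct work_struct example_work;

      static int example_init(void)
      {
              /* assumed flags: an unbound, ordered, per-node "dynamic" queue */
              example_wq = alloc_workqueue("example_wq",
                                           WQ_UNBOUND | __WQ_ORDERED | __WQ_DYNAMIC, 1);
              if (!example_wq)
                      return -ENOMEM;
              return 0;
      }

      static void example_submit_near(int cpu)
      {
              /* the single work runs on a worker bound to cpu's NUMA node */
              queue_work_on(cpu, example_wq, &example_work);
      }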