1. 04 9月, 2021 1 次提交
  2. 30 7月, 2021 1 次提交
  3. 17 7月, 2020 2 次提交
  4. 08 7月, 2020 1 次提交
  5. 30 6月, 2020 1 次提交
    • P
      net: sched: Pass root lock to Qdisc_ops.enqueue · aebe4426
      Petr Machata 提交于
      A following patch introduces qevents, points in qdisc algorithm where
      packet can be processed by user-defined filters. Should this processing
      lead to a situation where a new packet is to be enqueued on the same port,
      holding the root lock would lead to deadlocks. To solve the issue, qevent
      handler needs to unlock and relock the root lock when necessary.
      
      To that end, add the root lock argument to the qdisc op enqueue, and
      propagate throughout.
      Signed-off-by: NPetr Machata <petrm@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      aebe4426
  6. 21 6月, 2020 1 次提交
  7. 28 4月, 2020 1 次提交
  8. 23 10月, 2019 1 次提交
  9. 09 8月, 2019 1 次提交
  10. 07 8月, 2019 2 次提交
    • D
      fq_codel: Kill useless per-flow dropped statistic · 77ddaff2
      Dave Taht 提交于
      It is almost impossible to get anything other than a 0 out of
      flow->dropped statistic with a tc class dump, as it resets to 0
      on every round.
      
      It also conflates ecn marks with drops.
      
      It would have been useful had it kept a cumulative drop count, but
      it doesn't. This patch doesn't change the API, it just stops
      tracking a stat and state that is impossible to measure and nobody
      uses.
      Signed-off-by: NDave Taht <dave.taht@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      77ddaff2
    • D
      Increase fq_codel count in the bulk dropper · ae697f3b
      Dave Taht 提交于
      In the field fq_codel is often used with a smaller memory or
      packet limit than the default, and when the bulk dropper is hit,
      the drop pattern bifircates into one that more slowly increases
      the codel drop rate and hits the bulk dropper more than it should.
      
      The scan through the 1024 queues happens more often than it needs to.
      
      This patch increases the codel count in the bulk dropper, but
      does not change the drop rate there, relying on the next codel round
      to deliver the next packet at the original drop rate
      (after that burst of loss), then escalate to a higher signaling rate.
      Signed-off-by: NDave Taht <dave.taht@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ae697f3b
  11. 18 7月, 2019 1 次提交
    • C
      net_sched: unset TCQ_F_CAN_BYPASS when adding filters · 3f05e688
      Cong Wang 提交于
      For qdisc's that support TC filters and set TCQ_F_CAN_BYPASS,
      notably fq_codel, it makes no sense to let packets bypass the TC
      filters we setup in any scenario, otherwise our packets steering
      policy could not be enforced.
      
      This can be reproduced easily with the following script:
      
       ip li add dev dummy0 type dummy
       ifconfig dummy0 up
       tc qd add dev dummy0 root fq_codel
       tc filter add dev dummy0 parent 8001: protocol arp basic action mirred egress redirect dev lo
       tc filter add dev dummy0 parent 8001: protocol ip basic action mirred egress redirect dev lo
       ping -I dummy0 192.168.112.1
      
      Without this patch, packets are sent directly to dummy0 without
      hitting any of the filters. With this patch, packets are redirected
      to loopback as expected.
      
      This fix is not perfect, it only unsets the flag but does not set it back
      because we have to save the information somewhere in the qdisc if we
      really want that. Note, both fq_codel and sfq clear this flag in their
      ->bind_tcf() but this is clearly not sufficient when we don't use any
      class ID.
      
      Fixes: 23624935 ("net_sched: TCQ_F_CAN_BYPASS generalization")
      Cc: Eric Dumazet <edumazet@google.com>
      Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
      Reviewed-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3f05e688
  12. 31 5月, 2019 1 次提交
  13. 28 4月, 2019 2 次提交
    • J
      netlink: make validation more configurable for future strictness · 8cb08174
      Johannes Berg 提交于
      We currently have two levels of strict validation:
      
       1) liberal (default)
           - undefined (type >= max) & NLA_UNSPEC attributes accepted
           - attribute length >= expected accepted
           - garbage at end of message accepted
       2) strict (opt-in)
           - NLA_UNSPEC attributes accepted
           - attribute length >= expected accepted
      
      Split out parsing strictness into four different options:
       * TRAILING     - check that there's no trailing data after parsing
                        attributes (in message or nested)
       * MAXTYPE      - reject attrs > max known type
       * UNSPEC       - reject attributes with NLA_UNSPEC policy entries
       * STRICT_ATTRS - strictly validate attribute size
      
      The default for future things should be *everything*.
      The current *_strict() is a combination of TRAILING and MAXTYPE,
      and is renamed to _deprecated_strict().
      The current regular parsing has none of this, and is renamed to
      *_parse_deprecated().
      
      Additionally it allows us to selectively set one of the new flags
      even on old policies. Notably, the UNSPEC flag could be useful in
      this case, since it can be arranged (by filling in the policy) to
      not be an incompatible userspace ABI change, but would then going
      forward prevent forgetting attribute entries. Similar can apply
      to the POLICY flag.
      
      We end up with the following renames:
       * nla_parse           -> nla_parse_deprecated
       * nla_parse_strict    -> nla_parse_deprecated_strict
       * nlmsg_parse         -> nlmsg_parse_deprecated
       * nlmsg_parse_strict  -> nlmsg_parse_deprecated_strict
       * nla_parse_nested    -> nla_parse_nested_deprecated
       * nla_validate_nested -> nla_validate_nested_deprecated
      
      Using spatch, of course:
          @@
          expression TB, MAX, HEAD, LEN, POL, EXT;
          @@
          -nla_parse(TB, MAX, HEAD, LEN, POL, EXT)
          +nla_parse_deprecated(TB, MAX, HEAD, LEN, POL, EXT)
      
          @@
          expression NLH, HDRLEN, TB, MAX, POL, EXT;
          @@
          -nlmsg_parse(NLH, HDRLEN, TB, MAX, POL, EXT)
          +nlmsg_parse_deprecated(NLH, HDRLEN, TB, MAX, POL, EXT)
      
          @@
          expression NLH, HDRLEN, TB, MAX, POL, EXT;
          @@
          -nlmsg_parse_strict(NLH, HDRLEN, TB, MAX, POL, EXT)
          +nlmsg_parse_deprecated_strict(NLH, HDRLEN, TB, MAX, POL, EXT)
      
          @@
          expression TB, MAX, NLA, POL, EXT;
          @@
          -nla_parse_nested(TB, MAX, NLA, POL, EXT)
          +nla_parse_nested_deprecated(TB, MAX, NLA, POL, EXT)
      
          @@
          expression START, MAX, POL, EXT;
          @@
          -nla_validate_nested(START, MAX, POL, EXT)
          +nla_validate_nested_deprecated(START, MAX, POL, EXT)
      
          @@
          expression NLH, HDRLEN, MAX, POL, EXT;
          @@
          -nlmsg_validate(NLH, HDRLEN, MAX, POL, EXT)
          +nlmsg_validate_deprecated(NLH, HDRLEN, MAX, POL, EXT)
      
      For this patch, don't actually add the strict, non-renamed versions
      yet so that it breaks compile if I get it wrong.
      
      Also, while at it, make nla_validate and nla_parse go down to a
      common __nla_validate_parse() function to avoid code duplication.
      
      Ultimately, this allows us to have very strict validation for every
      new caller of nla_parse()/nlmsg_parse() etc as re-introduced in the
      next patch, while existing things will continue to work as is.
      
      In effect then, this adds fully strict validation for any new command.
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8cb08174
    • M
      netlink: make nla_nest_start() add NLA_F_NESTED flag · ae0be8de
      Michal Kubecek 提交于
      Even if the NLA_F_NESTED flag was introduced more than 11 years ago, most
      netlink based interfaces (including recently added ones) are still not
      setting it in kernel generated messages. Without the flag, message parsers
      not aware of attribute semantics (e.g. wireshark dissector or libmnl's
      mnl_nlmsg_fprintf()) cannot recognize nested attributes and won't display
      the structure of their contents.
      
      Unfortunately we cannot just add the flag everywhere as there may be
      userspace applications which check nlattr::nla_type directly rather than
      through a helper masking out the flags. Therefore the patch renames
      nla_nest_start() to nla_nest_start_noflag() and introduces nla_nest_start()
      as a wrapper adding NLA_F_NESTED. The calls which add NLA_F_NESTED manually
      are rewritten to use nla_nest_start().
      
      Except for changes in include/net/netlink.h, the patch was generated using
      this semantic patch:
      
      @@ expression E1, E2; @@
      -nla_nest_start(E1, E2)
      +nla_nest_start_noflag(E1, E2)
      
      @@ expression E1, E2; @@
      -nla_nest_start_noflag(E1, E2 | NLA_F_NESTED)
      +nla_nest_start(E1, E2)
      Signed-off-by: NMichal Kubecek <mkubecek@suse.cz>
      Acked-by: NJiri Pirko <jiri@mellanox.com>
      Acked-by: NDavid Ahern <dsahern@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ae0be8de
  14. 11 9月, 2018 1 次提交
  15. 13 7月, 2018 1 次提交
    • J
      sch_fq_codel: zero q->flows_cnt when fq_codel_init fails · 83fe6b87
      Jacob Keller 提交于
      When fq_codel_init fails, qdisc_create_dflt will cleanup by using
      qdisc_destroy. This function calls the ->reset() op prior to calling the
      ->destroy() op.
      
      Unfortunately, during the failure flow for sch_fq_codel, the ->flows
      parameter is not initialized, so the fq_codel_reset function will null
      pointer dereference.
      
         kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
         kernel: IP: fq_codel_reset+0x58/0xd0 [sch_fq_codel]
         kernel: PGD 0 P4D 0
         kernel: Oops: 0000 [#1] SMP PTI
         kernel: Modules linked in: i40iw i40e(OE) xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack tun bridge stp llc devlink ebtable_filter ebtables ip6table_filter ip6_tables rpcrdma ib_isert iscsi_target_mod sunrpc ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm intel_rapl sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel intel_cstate iTCO_wdt iTCO_vendor_support intel_uncore ib_core intel_rapl_perf mei_me mei joydev i2c_i801 lpc_ich ioatdma shpchp wmi sch_fq_codel xfs libcrc32c mgag200 ixgbe drm_kms_helper isci ttm firewire_ohci
         kernel:  mdio drm igb libsas crc32c_intel firewire_core ptp pps_core scsi_transport_sas crc_itu_t dca i2c_algo_bit ipmi_si ipmi_devintf ipmi_msghandler [last unloaded: i40e]
         kernel: CPU: 10 PID: 4219 Comm: ip Tainted: G           OE    4.16.13custom-fq-codel-test+ #3
         kernel: Hardware name: Intel Corporation S2600CO/S2600CO, BIOS SE5C600.86B.02.05.0004.051120151007 05/11/2015
         kernel: RIP: 0010:fq_codel_reset+0x58/0xd0 [sch_fq_codel]
         kernel: RSP: 0018:ffffbfbf4c1fb620 EFLAGS: 00010246
         kernel: RAX: 0000000000000400 RBX: 0000000000000000 RCX: 00000000000005b9
         kernel: RDX: 0000000000000000 RSI: ffff9d03264a60c0 RDI: ffff9cfd17b31c00
         kernel: RBP: 0000000000000001 R08: 00000000000260c0 R09: ffffffffb679c3e9
         kernel: R10: fffff1dab06a0e80 R11: ffff9cfd163af800 R12: ffff9cfd17b31c00
         kernel: R13: 0000000000000001 R14: ffff9cfd153de600 R15: 0000000000000001
         kernel: FS:  00007fdec2f92800(0000) GS:ffff9d0326480000(0000) knlGS:0000000000000000
         kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
         kernel: CR2: 0000000000000008 CR3: 0000000c1956a006 CR4: 00000000000606e0
         kernel: Call Trace:
         kernel:  qdisc_destroy+0x56/0x140
         kernel:  qdisc_create_dflt+0x8b/0xb0
         kernel:  mq_init+0xc1/0xf0
         kernel:  qdisc_create_dflt+0x5a/0xb0
         kernel:  dev_activate+0x205/0x230
         kernel:  __dev_open+0xf5/0x160
         kernel:  __dev_change_flags+0x1a3/0x210
         kernel:  dev_change_flags+0x21/0x60
         kernel:  do_setlink+0x660/0xdf0
         kernel:  ? down_trylock+0x25/0x30
         kernel:  ? xfs_buf_trylock+0x1a/0xd0 [xfs]
         kernel:  ? rtnl_newlink+0x816/0x990
         kernel:  ? _xfs_buf_find+0x327/0x580 [xfs]
         kernel:  ? _cond_resched+0x15/0x30
         kernel:  ? kmem_cache_alloc+0x20/0x1b0
         kernel:  ? rtnetlink_rcv_msg+0x200/0x2f0
         kernel:  ? rtnl_calcit.isra.30+0x100/0x100
         kernel:  ? netlink_rcv_skb+0x4c/0x120
         kernel:  ? netlink_unicast+0x19e/0x260
         kernel:  ? netlink_sendmsg+0x1ff/0x3c0
         kernel:  ? sock_sendmsg+0x36/0x40
         kernel:  ? ___sys_sendmsg+0x295/0x2f0
         kernel:  ? ebitmap_cmp+0x6d/0x90
         kernel:  ? dev_get_by_name_rcu+0x73/0x90
         kernel:  ? skb_dequeue+0x52/0x60
         kernel:  ? __inode_wait_for_writeback+0x7f/0xf0
         kernel:  ? bit_waitqueue+0x30/0x30
         kernel:  ? fsnotify_grab_connector+0x3c/0x60
         kernel:  ? __sys_sendmsg+0x51/0x90
         kernel:  ? do_syscall_64+0x74/0x180
         kernel:  ? entry_SYSCALL_64_after_hwframe+0x3d/0xa2
         kernel: Code: 00 00 48 89 87 00 02 00 00 8b 87 a0 01 00 00 85 c0 0f 84 84 00 00 00 31 ed 48 63 dd 83 c5 01 48 c1 e3 06 49 03 9c 24 90 01 00 00 <48> 8b 73 08 48 8b 3b e8 6c 9a 4f f6 48 8d 43 10 48 c7 03 00 00
         kernel: RIP: fq_codel_reset+0x58/0xd0 [sch_fq_codel] RSP: ffffbfbf4c1fb620
         kernel: CR2: 0000000000000008
         kernel: ---[ end trace e81a62bede66274e ]---
      
      This is caused because flows_cnt is non-zero, but flows hasn't been
      initialized. fq_codel_init has left the private data in a partially
      initialized state.
      
      To fix this, reset flows_cnt to 0 when we fail to initialize.
      Additionally, to make the state more consistent, also cleanup the flows
      pointer when the allocation of backlogs fails.
      
      This fixes the NULL pointer dereference, since both the for-loop and
      memset in fq_codel_reset will be no-ops when flow_cnt is zero.
      Signed-off-by: NJacob Keller <jacob.e.keller@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      83fe6b87
  16. 13 6月, 2018 1 次提交
    • K
      treewide: kvzalloc() -> kvcalloc() · 778e1cdd
      Kees Cook 提交于
      The kvzalloc() function has a 2-factor argument form, kvcalloc(). This
      patch replaces cases of:
      
              kvzalloc(a * b, gfp)
      
      with:
              kvcalloc(a * b, gfp)
      
      as well as handling cases of:
      
              kvzalloc(a * b * c, gfp)
      
      with:
      
              kvzalloc(array3_size(a, b, c), gfp)
      
      as it's slightly less ugly than:
      
              kvcalloc(array_size(a, b), c, gfp)
      
      This does, however, attempt to ignore constant size factors like:
      
              kvzalloc(4 * 1024, gfp)
      
      though any constants defined via macros get caught up in the conversion.
      
      Any factors with a sizeof() of "unsigned char", "char", and "u8" were
      dropped, since they're redundant.
      
      The Coccinelle script used for this was:
      
      // Fix redundant parens around sizeof().
      @@
      type TYPE;
      expression THING, E;
      @@
      
      (
        kvzalloc(
      -	(sizeof(TYPE)) * E
      +	sizeof(TYPE) * E
        , ...)
      |
        kvzalloc(
      -	(sizeof(THING)) * E
      +	sizeof(THING) * E
        , ...)
      )
      
      // Drop single-byte sizes and redundant parens.
      @@
      expression COUNT;
      typedef u8;
      typedef __u8;
      @@
      
      (
        kvzalloc(
      -	sizeof(u8) * (COUNT)
      +	COUNT
        , ...)
      |
        kvzalloc(
      -	sizeof(__u8) * (COUNT)
      +	COUNT
        , ...)
      |
        kvzalloc(
      -	sizeof(char) * (COUNT)
      +	COUNT
        , ...)
      |
        kvzalloc(
      -	sizeof(unsigned char) * (COUNT)
      +	COUNT
        , ...)
      |
        kvzalloc(
      -	sizeof(u8) * COUNT
      +	COUNT
        , ...)
      |
        kvzalloc(
      -	sizeof(__u8) * COUNT
      +	COUNT
        , ...)
      |
        kvzalloc(
      -	sizeof(char) * COUNT
      +	COUNT
        , ...)
      |
        kvzalloc(
      -	sizeof(unsigned char) * COUNT
      +	COUNT
        , ...)
      )
      
      // 2-factor product with sizeof(type/expression) and identifier or constant.
      @@
      type TYPE;
      expression THING;
      identifier COUNT_ID;
      constant COUNT_CONST;
      @@
      
      (
      - kvzalloc
      + kvcalloc
        (
      -	sizeof(TYPE) * (COUNT_ID)
      +	COUNT_ID, sizeof(TYPE)
        , ...)
      |
      - kvzalloc
      + kvcalloc
        (
      -	sizeof(TYPE) * COUNT_ID
      +	COUNT_ID, sizeof(TYPE)
        , ...)
      |
      - kvzalloc
      + kvcalloc
        (
      -	sizeof(TYPE) * (COUNT_CONST)
      +	COUNT_CONST, sizeof(TYPE)
        , ...)
      |
      - kvzalloc
      + kvcalloc
        (
      -	sizeof(TYPE) * COUNT_CONST
      +	COUNT_CONST, sizeof(TYPE)
        , ...)
      |
      - kvzalloc
      + kvcalloc
        (
      -	sizeof(THING) * (COUNT_ID)
      +	COUNT_ID, sizeof(THING)
        , ...)
      |
      - kvzalloc
      + kvcalloc
        (
      -	sizeof(THING) * COUNT_ID
      +	COUNT_ID, sizeof(THING)
        , ...)
      |
      - kvzalloc
      + kvcalloc
        (
      -	sizeof(THING) * (COUNT_CONST)
      +	COUNT_CONST, sizeof(THING)
        , ...)
      |
      - kvzalloc
      + kvcalloc
        (
      -	sizeof(THING) * COUNT_CONST
      +	COUNT_CONST, sizeof(THING)
        , ...)
      )
      
      // 2-factor product, only identifiers.
      @@
      identifier SIZE, COUNT;
      @@
      
      - kvzalloc
      + kvcalloc
        (
      -	SIZE * COUNT
      +	COUNT, SIZE
        , ...)
      
      // 3-factor product with 1 sizeof(type) or sizeof(expression), with
      // redundant parens removed.
      @@
      expression THING;
      identifier STRIDE, COUNT;
      type TYPE;
      @@
      
      (
        kvzalloc(
      -	sizeof(TYPE) * (COUNT) * (STRIDE)
      +	array3_size(COUNT, STRIDE, sizeof(TYPE))
        , ...)
      |
        kvzalloc(
      -	sizeof(TYPE) * (COUNT) * STRIDE
      +	array3_size(COUNT, STRIDE, sizeof(TYPE))
        , ...)
      |
        kvzalloc(
      -	sizeof(TYPE) * COUNT * (STRIDE)
      +	array3_size(COUNT, STRIDE, sizeof(TYPE))
        , ...)
      |
        kvzalloc(
      -	sizeof(TYPE) * COUNT * STRIDE
      +	array3_size(COUNT, STRIDE, sizeof(TYPE))
        , ...)
      |
        kvzalloc(
      -	sizeof(THING) * (COUNT) * (STRIDE)
      +	array3_size(COUNT, STRIDE, sizeof(THING))
        , ...)
      |
        kvzalloc(
      -	sizeof(THING) * (COUNT) * STRIDE
      +	array3_size(COUNT, STRIDE, sizeof(THING))
        , ...)
      |
        kvzalloc(
      -	sizeof(THING) * COUNT * (STRIDE)
      +	array3_size(COUNT, STRIDE, sizeof(THING))
        , ...)
      |
        kvzalloc(
      -	sizeof(THING) * COUNT * STRIDE
      +	array3_size(COUNT, STRIDE, sizeof(THING))
        , ...)
      )
      
      // 3-factor product with 2 sizeof(variable), with redundant parens removed.
      @@
      expression THING1, THING2;
      identifier COUNT;
      type TYPE1, TYPE2;
      @@
      
      (
        kvzalloc(
      -	sizeof(TYPE1) * sizeof(TYPE2) * COUNT
      +	array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
        , ...)
      |
        kvzalloc(
      -	sizeof(TYPE1) * sizeof(THING2) * (COUNT)
      +	array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
        , ...)
      |
        kvzalloc(
      -	sizeof(THING1) * sizeof(THING2) * COUNT
      +	array3_size(COUNT, sizeof(THING1), sizeof(THING2))
        , ...)
      |
        kvzalloc(
      -	sizeof(THING1) * sizeof(THING2) * (COUNT)
      +	array3_size(COUNT, sizeof(THING1), sizeof(THING2))
        , ...)
      |
        kvzalloc(
      -	sizeof(TYPE1) * sizeof(THING2) * COUNT
      +	array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
        , ...)
      |
        kvzalloc(
      -	sizeof(TYPE1) * sizeof(THING2) * (COUNT)
      +	array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
        , ...)
      )
      
      // 3-factor product, only identifiers, with redundant parens removed.
      @@
      identifier STRIDE, SIZE, COUNT;
      @@
      
      (
        kvzalloc(
      -	(COUNT) * STRIDE * SIZE
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kvzalloc(
      -	COUNT * (STRIDE) * SIZE
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kvzalloc(
      -	COUNT * STRIDE * (SIZE)
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kvzalloc(
      -	(COUNT) * (STRIDE) * SIZE
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kvzalloc(
      -	COUNT * (STRIDE) * (SIZE)
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kvzalloc(
      -	(COUNT) * STRIDE * (SIZE)
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kvzalloc(
      -	(COUNT) * (STRIDE) * (SIZE)
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kvzalloc(
      -	COUNT * STRIDE * SIZE
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      )
      
      // Any remaining multi-factor products, first at least 3-factor products,
      // when they're not all constants...
      @@
      expression E1, E2, E3;
      constant C1, C2, C3;
      @@
      
      (
        kvzalloc(C1 * C2 * C3, ...)
      |
        kvzalloc(
      -	(E1) * E2 * E3
      +	array3_size(E1, E2, E3)
        , ...)
      |
        kvzalloc(
      -	(E1) * (E2) * E3
      +	array3_size(E1, E2, E3)
        , ...)
      |
        kvzalloc(
      -	(E1) * (E2) * (E3)
      +	array3_size(E1, E2, E3)
        , ...)
      |
        kvzalloc(
      -	E1 * E2 * E3
      +	array3_size(E1, E2, E3)
        , ...)
      )
      
      // And then all remaining 2 factors products when they're not all constants,
      // keeping sizeof() as the second factor argument.
      @@
      expression THING, E1, E2;
      type TYPE;
      constant C1, C2, C3;
      @@
      
      (
        kvzalloc(sizeof(THING) * C2, ...)
      |
        kvzalloc(sizeof(TYPE) * C2, ...)
      |
        kvzalloc(C1 * C2 * C3, ...)
      |
        kvzalloc(C1 * C2, ...)
      |
      - kvzalloc
      + kvcalloc
        (
      -	sizeof(TYPE) * (E2)
      +	E2, sizeof(TYPE)
        , ...)
      |
      - kvzalloc
      + kvcalloc
        (
      -	sizeof(TYPE) * E2
      +	E2, sizeof(TYPE)
        , ...)
      |
      - kvzalloc
      + kvcalloc
        (
      -	sizeof(THING) * (E2)
      +	E2, sizeof(THING)
        , ...)
      |
      - kvzalloc
      + kvcalloc
        (
      -	sizeof(THING) * E2
      +	E2, sizeof(THING)
        , ...)
      |
      - kvzalloc
      + kvcalloc
        (
      -	(E1) * E2
      +	E1, E2
        , ...)
      |
      - kvzalloc
      + kvcalloc
        (
      -	(E1) * (E2)
      +	E1, E2
        , ...)
      |
      - kvzalloc
      + kvcalloc
        (
      -	E1 * E2
      +	E1, E2
        , ...)
      )
      Signed-off-by: NKees Cook <keescook@chromium.org>
      778e1cdd
  17. 22 12月, 2017 5 次提交
  18. 22 10月, 2017 1 次提交
  19. 17 10月, 2017 1 次提交
  20. 31 8月, 2017 1 次提交
  21. 26 8月, 2017 1 次提交
    • W
      net_sched: remove tc class reference counting · 143976ce
      WANG Cong 提交于
      For TC classes, their ->get() and ->put() are always paired, and the
      reference counting is completely useless, because:
      
      1) For class modification and dumping paths, we already hold RTNL lock,
         so all of these ->get(),->change(),->put() are atomic.
      
      2) For filter bindiing/unbinding, we use other reference counter than
         this one, and they should have RTNL lock too.
      
      3) For ->qlen_notify(), it is special because it is called on ->enqueue()
         path, but we already hold qdisc tree lock there, and we hold this
         tree lock when graft or delete the class too, so it should not be gone
         or changed until we release the tree lock.
      
      Therefore, this patch removes ->get() and ->put(), but:
      
      1) Adds a new ->find() to find the pointer to a class by classid, no
         refcnt.
      
      2) Move the original class destroy upon the last refcnt into ->delete(),
         right after releasing tree lock. This is fine because the class is
         already removed from hash when holding the lock.
      
      For those who also use ->put() as ->unbind(), just rename them to reflect
      this change.
      
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
      Acked-by: NJiri Pirko <jiri@mellanox.com>
      Acked-by: NJamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      143976ce
  22. 07 6月, 2017 1 次提交
  23. 18 5月, 2017 2 次提交
  24. 09 5月, 2017 1 次提交
    • M
      treewide: use kv[mz]alloc* rather than opencoded variants · 752ade68
      Michal Hocko 提交于
      There are many code paths opencoding kvmalloc.  Let's use the helper
      instead.  The main difference to kvmalloc is that those users are
      usually not considering all the aspects of the memory allocator.  E.g.
      allocation requests <= 32kB (with 4kB pages) are basically never failing
      and invoke OOM killer to satisfy the allocation.  This sounds too
      disruptive for something that has a reasonable fallback - the vmalloc.
      On the other hand those requests might fallback to vmalloc even when the
      memory allocator would succeed after several more reclaim/compaction
      attempts previously.  There is no guarantee something like that happens
      though.
      
      This patch converts many of those places to kv[mz]alloc* helpers because
      they are more conservative.
      
      Link: http://lkml.kernel.org/r/20170306103327.2766-2-mhocko@kernel.orgSigned-off-by: NMichal Hocko <mhocko@suse.com>
      Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> # Xen bits
      Acked-by: NKees Cook <keescook@chromium.org>
      Acked-by: NVlastimil Babka <vbabka@suse.cz>
      Acked-by: Andreas Dilger <andreas.dilger@intel.com> # Lustre
      Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> # KVM/s390
      Acked-by: Dan Williams <dan.j.williams@intel.com> # nvdim
      Acked-by: David Sterba <dsterba@suse.com> # btrfs
      Acked-by: Ilya Dryomov <idryomov@gmail.com> # Ceph
      Acked-by: Tariq Toukan <tariqt@mellanox.com> # mlx4
      Acked-by: Leon Romanovsky <leonro@mellanox.com> # mlx5
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Anton Vorontsov <anton@enomsg.org>
      Cc: Colin Cross <ccross@android.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: Ben Skeggs <bskeggs@redhat.com>
      Cc: Kent Overstreet <kent.overstreet@gmail.com>
      Cc: Santosh Raspatur <santosh@chelsio.com>
      Cc: Hariprasad S <hariprasad@chelsio.com>
      Cc: Yishai Hadas <yishaih@mellanox.com>
      Cc: Oleg Drokin <oleg.drokin@intel.com>
      Cc: "Yan, Zheng" <zyan@redhat.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: David Miller <davem@davemloft.net>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      752ade68
  25. 14 4月, 2017 1 次提交
  26. 17 3月, 2017 1 次提交
  27. 11 2月, 2017 1 次提交
  28. 21 1月, 2017 1 次提交
  29. 26 6月, 2016 2 次提交
  30. 16 6月, 2016 1 次提交
  31. 09 6月, 2016 1 次提交