1. 20 Jun, 2020 1 commit
  2. 27 Mar, 2020 1 commit
  3. 31 Oct, 2019 2 commits
  4. 30 Oct, 2019 1 commit
  5. 28 Aug, 2019 1 commit
    • net_sched: fix a NULL pointer deref in ipt action · 981471bd
      Cong Wang authored
      The net pointer in struct xt_tgdtor_param is not explicitly
      initialized and is therefore still NULL when it is dereferenced.
      So we have to find a way to pass the correct net pointer to
      ipt_destroy_target().
      
      The best way I find is just saving the net pointer inside the per
      netns struct tcf_idrinfo, which could make this patch smaller.
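
The idea of keeping the namespace pointer in the per-netns bookkeeping can be sketched in user-space C. This is only an illustration, not the kernel patch: `struct net` here is a toy type, and `idrinfo_sketch`, `dtor_param_sketch`, `idrinfo_init`, `make_dtor_param` and `destroy_target` are hypothetical names standing in for the structures mentioned above.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for the kernel's network namespace object. */
struct net { int id; };

/* Hypothetical per-namespace bookkeeping that remembers its owner. */
struct idrinfo_sketch {
    struct net *net;   /* saved once when the per-netns state is set up */
};

/* Hypothetical destructor parameters; net must be valid before use. */
struct dtor_param_sketch {
    struct net *net;
};

/* Set up the per-namespace state and record which namespace owns it. */
static void idrinfo_init(struct idrinfo_sketch *info, struct net *net)
{
    assert(net != NULL);
    info->net = net;
}

/* Fill the destructor parameters from the saved pointer instead of
 * leaving the field uninitialized (NULL). */
static struct dtor_param_sketch make_dtor_param(const struct idrinfo_sketch *info)
{
    struct dtor_param_sketch par = { .net = info->net };
    return par;
}

/* Destructor-style consumer: dereferences par->net, which is now safe. */
static int destroy_target(const struct dtor_param_sketch *par)
{
    return par->net->id;
}
```

Because the pointer travels with the per-namespace state, the destroy path does not need an extra argument threaded through every caller, which is presumably why the message says this keeps the patch smaller.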
      
      Fixes: 0c66dc1e ("netfilter: conntrack: register hooks in netns when needed by ruleset")
      Reported-and-tested-by: itugrok@yahoo.com
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  6. 09 Aug, 2019 1 commit
  7. 06 Aug, 2019 1 commit
    • net: sched: use temporary variable for actions indexes · 7be8ef2c
      Dmytro Linkin authored
      Currently init call of all actions (except ipt) init their 'parm'
      structure as a direct pointer to nla data in skb. This leads to race
      condition when some of the filter actions were initialized successfully
      (and were assigned with idr action index that was written directly
      into nla data), but then were deleted and retried (due to following
      action module missing or classifier-initiated retry), in which case
      action init code tries to insert action to idr with index that was
      assigned on previous iteration. During retry the index can be reused
      by another action that was inserted concurrently, which causes
      unintended action sharing between filters.
      To fix the described race condition, save the action idr index to a
      temporary stack-allocated variable instead of in nla data.
      
      Fixes: 0190c1d4 ("net: sched: atomically check-allocate action")
      Signed-off-by: Dmytro Linkin <dmitrolin@mellanox.com>
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  8. 31 May, 2019 1 commit
  9. 28 Apr, 2019 1 commit
    • netlink: make validation more configurable for future strictness · 8cb08174
      Johannes Berg authored
      We currently have two levels of strict validation:
      
       1) liberal (default)
           - undefined (type >= max) & NLA_UNSPEC attributes accepted
           - attribute length >= expected accepted
           - garbage at end of message accepted
       2) strict (opt-in)
           - NLA_UNSPEC attributes accepted
           - attribute length >= expected accepted
      
      Split out parsing strictness into four different options:
       * TRAILING     - check that there's no trailing data after parsing
                        attributes (in message or nested)
       * MAXTYPE      - reject attrs > max known type
       * UNSPEC       - reject attributes with NLA_UNSPEC policy entries
       * STRICT_ATTRS - strictly validate attribute size
      
      The default for future things should be *everything*.
      The current *_strict() is a combination of TRAILING and MAXTYPE,
      and is renamed to _deprecated_strict().
      The current regular parsing has none of this, and is renamed to
      *_parse_deprecated().
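
The split into independent options maps naturally onto a bitmask. The sketch below illustrates that idea only: the flag names follow the option names listed above, but the `VALIDATE_` prefix, the bit positions and the helper `rejects_unknown_type` are assumptions made for this example and are not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag set; prefix and bit positions are assumptions. */
enum validate_flags_sketch {
    VALIDATE_LIBERAL      = 0,        /* old default: none of the checks */
    VALIDATE_TRAILING     = 1 << 0,   /* no trailing data after attributes */
    VALIDATE_MAXTYPE      = 1 << 1,   /* reject attrs > max known type */
    VALIDATE_UNSPEC       = 1 << 2,   /* reject NLA_UNSPEC policy entries */
    VALIDATE_STRICT_ATTRS = 1 << 3,   /* strictly validate attribute size */

    /* What the old *_strict() variants checked. */
    VALIDATE_DEPRECATED_STRICT = VALIDATE_TRAILING | VALIDATE_MAXTYPE,

    /* "Everything": the intended default for new users. */
    VALIDATE_STRICT = VALIDATE_TRAILING | VALIDATE_MAXTYPE |
                      VALIDATE_UNSPEC | VALIDATE_STRICT_ATTRS,
};

/* Decide whether an attribute with a type above the known maximum is
 * rejected under the given validation flags. */
static bool rejects_unknown_type(unsigned int validate, int type, int maxtype)
{
    assert(maxtype >= 0);
    return type > maxtype && (validate & VALIDATE_MAXTYPE);
}
```

Keeping each check as its own bit is what lets an old policy opt into a single new check (for example UNSPEC) without taking on the others.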
      
      Additionally it allows us to selectively set one of the new flags
      even on old policies. Notably, the UNSPEC flag could be useful in
      this case, since it can be arranged (by filling in the policy) to
      not be an incompatible userspace ABI change, but would then going
      forward prevent forgetting attribute entries. Similar can apply
      to the POLICY flag.
      
      We end up with the following renames:
       * nla_parse           -> nla_parse_deprecated
       * nla_parse_strict    -> nla_parse_deprecated_strict
       * nlmsg_parse         -> nlmsg_parse_deprecated
       * nlmsg_parse_strict  -> nlmsg_parse_deprecated_strict
       * nla_parse_nested    -> nla_parse_nested_deprecated
       * nla_validate_nested -> nla_validate_nested_deprecated
      
      Using spatch, of course:
          @@
          expression TB, MAX, HEAD, LEN, POL, EXT;
          @@
          -nla_parse(TB, MAX, HEAD, LEN, POL, EXT)
          +nla_parse_deprecated(TB, MAX, HEAD, LEN, POL, EXT)
      
          @@
          expression NLH, HDRLEN, TB, MAX, POL, EXT;
          @@
          -nlmsg_parse(NLH, HDRLEN, TB, MAX, POL, EXT)
          +nlmsg_parse_deprecated(NLH, HDRLEN, TB, MAX, POL, EXT)
      
          @@
          expression NLH, HDRLEN, TB, MAX, POL, EXT;
          @@
          -nlmsg_parse_strict(NLH, HDRLEN, TB, MAX, POL, EXT)
          +nlmsg_parse_deprecated_strict(NLH, HDRLEN, TB, MAX, POL, EXT)
      
          @@
          expression TB, MAX, NLA, POL, EXT;
          @@
          -nla_parse_nested(TB, MAX, NLA, POL, EXT)
          +nla_parse_nested_deprecated(TB, MAX, NLA, POL, EXT)
      
          @@
          expression START, MAX, POL, EXT;
          @@
          -nla_validate_nested(START, MAX, POL, EXT)
          +nla_validate_nested_deprecated(START, MAX, POL, EXT)
      
          @@
          expression NLH, HDRLEN, MAX, POL, EXT;
          @@
          -nlmsg_validate(NLH, HDRLEN, MAX, POL, EXT)
          +nlmsg_validate_deprecated(NLH, HDRLEN, MAX, POL, EXT)
      
      For this patch, don't actually add the strict, non-renamed versions
      yet so that it breaks compile if I get it wrong.
      
      Also, while at it, make nla_validate and nla_parse go down to a
      common __nla_validate_parse() function to avoid code duplication.
      
      Ultimately, this allows us to have very strict validation for every
      new caller of nla_parse()/nlmsg_parse() etc as re-introduced in the
      next patch, while existing things will continue to work as is.
      
      In effect then, this adds fully strict validation for any new command.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  10. 22 Mar, 2019 2 commits
    • net/sched: act_skbedit: validate the control action inside init() · ec7727bb
      Davide Caratti authored
      the following script:
      
       # tc qdisc add dev crash0 clsact
       # tc filter add dev crash0 egress matchall \
       > action skbedit ptype host pass index 90
       # tc actions replace action skbedit \
       > ptype host goto chain 42 index 90 cookie c1a0c1a0
       # tc actions show action skbedit
      
      had the following output:
      
       Error: Failed to init TC action chain.
       We have an error talking to the kernel
       total acts 1
      
               action order 0: skbedit  ptype host goto chain 42
                index 90 ref 2 bind 1
               cookie c1a0c1a0
      
      Then, the first packet transmitted by crash0 made the kernel crash:
      
       BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
       #PF error: [normal kernel read fault]
       PGD 0 P4D 0
       Oops: 0000 [#1] SMP PTI
       CPU: 3 PID: 3467 Comm: kworker/3:3 Not tainted 5.0.0-rc4.gotochain_crash+ #536
       Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
       Workqueue: ipv6_addrconf addrconf_dad_work
       RIP: 0010:tcf_action_exec+0xb8/0x100
       Code: 00 00 00 20 74 1d 83 f8 03 75 09 49 83 c4 08 4d 39 ec 75 bc 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 49 8b 97 a8 00 00 00 <48> 8b 12 48 89 55 00 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3
       RSP: 0018:ffffb50a81e1fad0 EFLAGS: 00010246
       RAX: 000000002000002a RBX: ffff9aa47ba4ea00 RCX: 0000000000000001
       RDX: 0000000000000000 RSI: ffff9aa469eeb3c0 RDI: ffff9aa47ba4ea00
       RBP: ffffb50a81e1fb70 R08: 0000000000000000 R09: 0000000000000000
       R10: 0000000000000000 R11: ffff9aa47bce0638 R12: ffff9aa4793b0c00
       R13: ffff9aa4793b0c08 R14: 0000000000000001 R15: ffff9aa469eeb3c0
       FS:  0000000000000000(0000) GS:ffff9aa474780000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 0000000000000000 CR3: 000000007360e005 CR4: 00000000001606e0
       Call Trace:
        tcf_classify+0x58/0x120
        __dev_queue_xmit+0x40a/0x890
        ? ndisc_next_option+0x50/0x50
        ? ___neigh_create+0x4d5/0x680
        ? ip6_finish_output2+0x1b5/0x590
        ip6_finish_output2+0x1b5/0x590
        ? ip6_output+0x68/0x110
        ip6_output+0x68/0x110
        ? nf_hook.constprop.28+0x79/0xc0
        ndisc_send_skb+0x248/0x2e0
        ndisc_send_ns+0xf8/0x200
        ? addrconf_dad_work+0x389/0x4b0
        addrconf_dad_work+0x389/0x4b0
        ? __switch_to_asm+0x34/0x70
        ? process_one_work+0x195/0x380
        ? addrconf_dad_completed+0x370/0x370
        process_one_work+0x195/0x380
        worker_thread+0x30/0x390
        ? process_one_work+0x380/0x380
        kthread+0x113/0x130
        ? kthread_park+0x90/0x90
        ret_from_fork+0x35/0x40
       Modules linked in: act_skbedit veth ip6table_filter ip6_tables iptable_filter binfmt_misc crct10dif_pclmul crc32_pclmul ghash_clmulni_intel ext4 snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep mbcache snd_hda_core jbd2 snd_seq snd_seq_device snd_pcm aesni_intel crypto_simd cryptd snd_timer glue_helper snd joydev soundcore pcspkr virtio_balloon i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs ata_generic pata_acpi qxl drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm virtio_net net_failover drm failover virtio_blk virtio_console ata_piix virtio_pci crc32c_intel serio_raw libata virtio_ring virtio floppy dm_mirror dm_region_hash dm_log dm_mod
       CR2: 0000000000000000
      
      Validating the control action within tcf_skbedit_init() proved to fix the
      above issue. A TDC selftest is added to verify the correct behavior.
      
      Fixes: db50514f ("net: sched: add termination action to allow goto chain")
      Fixes: 97763dc0 ("net_sched: reject unknown tcfa_action values")
      Signed-off-by: Davide Caratti <dcaratti@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/sched: prepare TC actions to properly validate the control action · 85d0966f
      Davide Caratti authored
      - pass a pointer to struct tcf_proto in each action's init() handler,
        to allow validating the control action, checking whether the chain
        exists and (eventually) refcounting it.
      - remove code that validates the control action after a successful call
        to the action's init() handler, and replace it with a test that forbids
        addition of actions having 'goto_chain' and NULL goto_chain pointer at
        the same time.
      - add tcf_action_check_ctrlact(), that will validate the control action
        and eventually allocate the action 'goto_chain' within the init()
        handler.
      - add tcf_action_set_ctrlact(), that will assign the control action and
        swap the current 'goto_chain' pointer with the new given one.
      
      This disallows 'goto_chain' on actions that don't initialize it properly
      in their init() handler, i.e. calling tcf_action_check_ctrlact() after
      successful IDR reservation and then calling tcf_action_set_ctrlact()
      to assign 'goto_chain' and 'tcf_action' consistently.
      
      By doing this, the kernel no longer leaks refcounts when a valid
      'goto chain' handle is replaced in TC actions, which caused kmemleak
      splats like the following one:
      
       # tc chain add dev dd0 chain 42 ingress protocol ip flower \
       > ip_proto tcp action drop
       # tc chain add dev dd0 chain 43 ingress protocol ip flower \
       > ip_proto udp action drop
       # tc filter add dev dd0 ingress matchall \
       > action gact goto chain 42 index 66
       # tc filter replace dev dd0 ingress matchall \
       > action gact goto chain 43 index 66
       # echo scan >/sys/kernel/debug/kmemleak
       <...>
       unreferenced object 0xffff93c0ee09f000 (size 1024):
       comm "tc", pid 2565, jiffies 4295339808 (age 65.426s)
       hex dump (first 32 bytes):
         00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
         00 00 00 00 08 00 06 00 00 00 00 00 00 00 00 00  ................
       backtrace:
         [<000000009b63f92d>] tc_ctl_chain+0x3d2/0x4c0
         [<00000000683a8d72>] rtnetlink_rcv_msg+0x263/0x2d0
         [<00000000ddd88f8e>] netlink_rcv_skb+0x4a/0x110
         [<000000006126a348>] netlink_unicast+0x1a0/0x250
         [<00000000b3340877>] netlink_sendmsg+0x2c1/0x3c0
         [<00000000a25a2171>] sock_sendmsg+0x36/0x40
         [<00000000f19ee1ec>] ___sys_sendmsg+0x280/0x2f0
         [<00000000d0422042>] __sys_sendmsg+0x5e/0xa0
         [<000000007a6c61f9>] do_syscall_64+0x5b/0x180
         [<00000000ccd07542>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
         [<0000000013eaa334>] 0xffffffffffffffff
      
      Fixes: db50514f ("net: sched: add termination action to allow goto chain")
      Fixes: 97763dc0 ("net_sched: reject unknown tcfa_action values")
      Signed-off-by: Davide Caratti <dcaratti@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  11. 25 Feb, 2019 1 commit
    • net/sched: act_skbedit: fix refcount leak when replace fails · 6191da98
      Davide Caratti authored
      when act_skbedit was converted to use RCU in the data plane, we added an
      error path, but we forgot to drop the action refcount in case of failure
      during a 'replace' operation:
      
       # tc actions add action skbedit ptype otherhost pass index 100
       # tc action show action skbedit
       total acts 1
      
               action order 0: skbedit  ptype otherhost pass
                index 100 ref 1 bind 0
       # tc actions replace action skbedit ptype otherhost drop index 100
       RTNETLINK answers: Cannot allocate memory
       We have an error talking to the kernel
       # tc action show action skbedit
       total acts 1
      
               action order 0: skbedit  ptype otherhost pass
                index 100 ref 2 bind 0
      
      Ensure we call tcf_idr_release(), in case 'params_new' allocation failed,
      also when the action is being replaced.
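
The pattern behind this fix is the usual "every exit path gives back what it took" rule. The following user-space sketch is a simplification under stated assumptions: `action_sketch` and `action_replace` are hypothetical, the lookup is reduced to a plain refcount increment, and `simulate_oom` stands in for the failing 'params_new' allocation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical action with a reference count and replaceable parameters. */
struct action_sketch {
    int refcnt;
    int *params;
};

/* Replace the parameters of an existing action. Looking the action up
 * takes a reference, and every exit path has to drop it again. */
static int action_replace(struct action_sketch *act, int value, bool simulate_oom)
{
    int *params_new;

    assert(act != NULL);
    act->refcnt++;                       /* reference taken by the lookup */

    params_new = simulate_oom ? NULL : malloc(sizeof(*params_new));
    if (!params_new) {
        act->refcnt--;                   /* the release the error path lacked */
        return -1;
    }

    *params_new = value;
    free(act->params);                   /* free(NULL) is a no-op */
    act->params = params_new;
    act->refcnt--;                       /* normal path drops the reference too */
    return 0;
}
```

Without the decrement in the error branch, a failed replace would leave the count one higher than before, which matches the `ref 1` to `ref 2` change shown in the output above.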
      
      Fixes: c749cdda ("net/sched: act_skbedit: don't use spinlock in the data path")
      Signed-off-by: Davide Caratti <dcaratti@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  12. 11 Feb, 2019 1 commit
  13. 09 Sep, 2018 1 commit
    • net: sched: act_skbedit: remove dependency on rtnl lock · 6d7a8df6
      Vlad Buslov authored
      According to the new locking rule, we have to take tcf_lock for both
      ->init() and ->dump(), as RTNL will be removed.
      
      Use tcf lock to protect skbedit action struct private data from concurrent
      modification in init and dump. Use rcu swap operation to reassign params
      pointer under protection of tcf lock. (old params value is not used by
      init, so there is no need of standalone rcu dereference step)
      
      Remove rtnl lock assertion that is no longer required.
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  14. 01 Sep, 2018 1 commit
  15. 22 Aug, 2018 1 commit
  16. 14 Aug, 2018 1 commit
  17. 31 Jul, 2018 1 commit
    • tc/act: remove unneeded RCU lock in action callback · 7fd4b288
      Paolo Abeni authored
      Each lockless action currently does its own RCU locking in ->act().
      This allows using plain RCU accessor, even if the context
      is really RCU BH.
      
      This change drops the per-action RCU lock, replaces the accessors
      with the _bh variant, cleans up the surrounding code a bit and
      documents the RCU status in the relevant header.
      No functional nor performance change is intended.
      
      The goal of this patch is clarifying that the RCU critical section
      used by the tc actions extends up to the classifier's caller.
      
      v1 -> v2:
       - preserve rcu lock in act_bpf: it's needed by eBPF helpers,
         as pointed out by Daniel
      
      v3 -> v4:
       - fixed some typos in the commit message (JiriP)
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  18. 13 Jul, 2018 2 commits
  19. 08 Jul, 2018 5 commits
  20. 04 Jul, 2018 1 commit
  21. 12 May, 2018 1 commit
    • net sched actions: fix invalid pointer dereferencing if skbedit flags missing · af5d0184
      Roman Mashak authored
      When an application fails to pass flags in the netlink TLV for a new skbedit
      action, the kernel produces the following oops:
      
      [    8.307732] BUG: unable to handle kernel paging request at 0000000000021130
      [    8.309167] PGD 80000000193d1067 P4D 80000000193d1067 PUD 180e0067 PMD 0
      [    8.310595] Oops: 0000 [#1] SMP PTI
      [    8.311334] Modules linked in: kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd cryptd glue_helper serio_raw
      [    8.314190] CPU: 1 PID: 397 Comm: tc Not tainted 4.17.0-rc3+ #357
      [    8.315252] RIP: 0010:__tcf_idr_release+0x33/0x140
      [    8.316203] RSP: 0018:ffffa0718038f840 EFLAGS: 00010246
      [    8.317123] RAX: 0000000000000001 RBX: 0000000000021100 RCX: 0000000000000000
      [    8.319831] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000021100
      [    8.321181] RBP: 0000000000000000 R08: 000000000004adf8 R09: 0000000000000122
      [    8.322645] R10: 0000000000000000 R11: ffffffff9e5b01ed R12: 0000000000000000
      [    8.324157] R13: ffffffff9e0d3cc0 R14: 0000000000000000 R15: 0000000000000000
      [    8.325590] FS:  00007f591292e700(0000) GS:ffff8fcf5bc40000(0000) knlGS:0000000000000000
      [    8.327001] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [    8.327987] CR2: 0000000000021130 CR3: 00000000180e6004 CR4: 00000000001606a0
      [    8.329289] Call Trace:
      [    8.329735]  tcf_skbedit_init+0xa7/0xb0
      [    8.330423]  tcf_action_init_1+0x362/0x410
      [    8.331139]  ? try_to_wake_up+0x44/0x430
      [    8.331817]  tcf_action_init+0x103/0x190
      [    8.332511]  tc_ctl_action+0x11a/0x220
      [    8.333174]  rtnetlink_rcv_msg+0x23d/0x2e0
      [    8.333902]  ? _cond_resched+0x16/0x40
      [    8.334569]  ? __kmalloc_node_track_caller+0x5b/0x2c0
      [    8.335440]  ? rtnl_calcit.isra.31+0xf0/0xf0
      [    8.336178]  netlink_rcv_skb+0xdb/0x110
      [    8.336855]  netlink_unicast+0x167/0x220
      [    8.337550]  netlink_sendmsg+0x2a7/0x390
      [    8.338258]  sock_sendmsg+0x30/0x40
      [    8.338865]  ___sys_sendmsg+0x2c5/0x2e0
      [    8.339531]  ? pagecache_get_page+0x27/0x210
      [    8.340271]  ? filemap_fault+0xa2/0x630
      [    8.340943]  ? page_add_file_rmap+0x108/0x200
      [    8.341732]  ? alloc_set_pte+0x2aa/0x530
      [    8.342573]  ? finish_fault+0x4e/0x70
      [    8.343332]  ? __handle_mm_fault+0xbc1/0x10d0
      [    8.344337]  ? __sys_sendmsg+0x53/0x80
      [    8.345040]  __sys_sendmsg+0x53/0x80
      [    8.345678]  do_syscall_64+0x4f/0x100
      [    8.346339]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      [    8.347206] RIP: 0033:0x7f591191da67
      [    8.347831] RSP: 002b:00007fff745abd48 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      [    8.349179] RAX: ffffffffffffffda RBX: 00007fff745abe70 RCX: 00007f591191da67
      [    8.350431] RDX: 0000000000000000 RSI: 00007fff745abdc0 RDI: 0000000000000003
      [    8.351659] RBP: 000000005af35251 R08: 0000000000000001 R09: 0000000000000000
      [    8.352922] R10: 00000000000005f1 R11: 0000000000000246 R12: 0000000000000000
      [    8.354183] R13: 00007fff745afed0 R14: 0000000000000001 R15: 00000000006767c0
      [    8.355400] Code: 41 89 d4 53 89 f5 48 89 fb e8 aa 20 fd ff 85 c0 0f 84 ed 00
      00 00 48 85 db 0f 84 cf 00 00 00 40 84 ed 0f 85 cd 00 00 00 45 84 e4 <8b> 53 30
      74 0d 85 d2 b8 ff ff ff ff 0f 8f b3 00 00 00 8b 43 2c
      [    8.358699] RIP: __tcf_idr_release+0x33/0x140 RSP: ffffa0718038f840
      [    8.359770] CR2: 0000000000021130
      [    8.360438] ---[ end trace 60c66be45dfc14f0 ]---
      
      The caller calls action's ->init() and passes pointer to "struct tc_action *a",
      which later may be initialized to point at the existing action, otherwise
      "struct tc_action *a" is still invalid, and therefore dereferencing it is an
      error as happens in tcf_idr_release, where refcnt is decremented.
      
      So in case of missing flags tcf_idr_release must be called only for
      existing actions.
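
The rule "release only what a lookup actually returned" can be sketched as follows. This is a user-space illustration under assumptions: `act_sketch`, `init_sketch` and `SKETCH_EINVAL` are hypothetical, and the lookup result is passed in as `found` (NULL when no action exists yet).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_EINVAL 22  /* stand-in for the kernel's EINVAL value */

/* Hypothetical action carrying only a reference count. */
struct act_sketch { int refcnt; };

/* Init-style handler. *a only becomes meaningful when an existing
 * action was found; otherwise it must not be touched on error. */
static int init_sketch(struct act_sketch *found, unsigned int flags,
                       struct act_sketch **a)
{
    bool exists = false;

    assert(a != NULL);
    if (found) {
        found->refcnt++;       /* lookup takes a reference */
        *a = found;
        exists = true;
    }

    if (!flags) {
        if (exists)            /* release only what the lookup returned */
            (*a)->refcnt--;
        return -SKETCH_EINVAL;
    }

    return 0;                  /* creation or update would continue here */
}
```

The `exists` guard is the whole point: in the "new action, no flags" case nothing was looked up, so there is no reference to drop and no valid pointer to dereference.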
      
      v2:
          - prepare patch for net tree
      
      Fixes: 5e1567ae ("net sched: skbedit action fix late binding")
      Signed-off-by: Roman Mashak <mrv@mojatatu.com>
      Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  22. 28 Mar, 2018 1 commit
  23. 28 Feb, 2018 1 commit
    • net: Convert tc_action_net_init() and tc_action_net_exit() based pernet_operations · 685ecfb1
      Kirill Tkhai authored
      These pernet_operations are from net/sched directory, and they call only
      tc_action_net_init() and tc_action_net_exit():
      
      bpf_net_ops
      connmark_net_ops
      csum_net_ops
      gact_net_ops
      ife_net_ops
      ipt_net_ops
      xt_net_ops
      mirred_net_ops
      nat_net_ops
      pedit_net_ops
      police_net_ops
      sample_net_ops
      simp_net_ops
      skbedit_net_ops
      skbmod_net_ops
      tunnel_key_net_ops
      vlan_net_ops
      
      1) tc_action_net_init() just allocates and initializes per-net memory.
      2) There should not be in-flight packets at the time of the
      tc_action_net_exit() call, nor other pernet_operations sending packets
      to the dying net (except netlink). So, it seems they can be marked as async.
      Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  24. 17 Feb, 2018 4 commits
  25. 14 Dec, 2017 1 commit
  26. 09 Nov, 2017 1 commit
  27. 03 Nov, 2017 1 commit
  28. 31 Aug, 2017 1 commit
  29. 14 Apr, 2017 1 commit
  30. 18 Nov, 2016 1 commit
    • netns: make struct pernet_operations::id unsigned int · c7d03a00
      Alexey Dobriyan authored
      Make struct pernet_operations::id unsigned.
      
      There are 2 reasons to do so:
      
      1)
      This field is really an index into a zero-based array and
      is thus an unsigned entity. Using a negative value is an out-of-bounds
      access by definition.
      
      2)
      On x86_64 unsigned 32-bit data which are mixed with pointers
      via array indexing or offsets added or subtracted to pointers
      are preferred to signed 32-bit data.
      
      "int" being used as an array index needs to be sign-extended
      to 64-bit before being used.
      
      	void f(long *p, int i)
      	{
      		g(p[i]);
      	}
      
        roughly translates to
      
      	movsx	rsi, esi
      	mov	rdi, [rsi+...]
      	call 	g
      
      MOVSX is a 3-byte instruction which isn't necessary if the variable is
      unsigned because x86_64 zero-extends by default.
      
      Now, there is net_generic() function which, you guessed it right, uses
      "int" as an array index:
      
      	static inline void *net_generic(const struct net *net, int id)
      	{
      		...
      		ptr = ng->ptr[id - 1];
      		...
      	}
      
      And this function is used a lot, so those sign extensions add up.
      
      Patch snipes ~1730 bytes on allyesconfig kernel (without all junk
      messing with code generation):
      
      	add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730)
      
      Unfortunately some functions actually grow bigger.
      This is a seemingly random artefact of code generation with the register
      allocator being used differently. gcc decides that some variable
      needs to live in the new r8+ registers and every access now requires a REX
      prefix. Or it is shifted into r12, so the [r12+0] addressing mode has to be
      used, which is longer than [r8].
      
      However, overall balance is in negative direction:
      
      	add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730)
      	function                                     old     new   delta
      	nfsd4_lock                                  3886    3959     +73
      	tipc_link_build_proto_msg                   1096    1140     +44
      	mac80211_hwsim_new_radio                    2776    2808     +32
      	tipc_mon_rcv                                1032    1058     +26
      	svcauth_gss_legacy_init                     1413    1429     +16
      	tipc_bcbase_select_primary                   379     392     +13
      	nfsd4_exchange_id                           1247    1260     +13
      	nfsd4_setclientid_confirm                    782     793     +11
      		...
      	put_client_renew_locked                      494     480     -14
      	ip_set_sockfn_get                            730     716     -14
      	geneve_sock_add                              829     813     -16
      	nfsd4_sequence_done                          721     703     -18
      	nlmclnt_lookup_host                          708     686     -22
      	nfsd4_lockt                                 1085    1063     -22
      	nfs_get_client                              1077    1050     -27
      	tcf_bpf_init                                1106    1076     -30
      	nfsd4_encode_fattr                          5997    5930     -67
      	Total: Before=154856051, After=154854321, chg -0.00%
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>