1. 31 3月, 2015 13 次提交
  2. 23 3月, 2015 1 次提交
  3. 21 3月, 2015 5 次提交
  4. 20 3月, 2015 1 次提交
  5. 19 3月, 2015 1 次提交
    • P
      netfilter: restore rule tracing via nfnetlink_log · 4017a7ee
      Pablo Neira Ayuso 提交于
      Since fab4085f ("netfilter: log: nf_log_packet() as real unified
      interface"), the loginfo structure that is passed to nf_log_packet() is
      used to explicitly indicate the logger type you want to use.
      
      This is a problem for people tracing rules through nfnetlink_log since
      packets are always routed to the NF_LOG_TYPE logger after the
      aforementioned patch.
      
      We can fix this by removing the trace loginfo structures, but that still
      changes the log level from 4 to 5 for tracing messages and there may be
      someone relying on this outthere. So let's just introduce a new
      nf_log_trace() function that restores the former behaviour.
      Reported-by: NMarkus Kötter <koetter@rrzn.uni-hannover.de>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      4017a7ee
  6. 18 3月, 2015 3 次提交
    • D
      act_bpf: allow non-default TC_ACT opcodes as BPF exec outcome · ced585c8
      Daniel Borkmann 提交于
      Revisiting commit d23b8ad8 ("tc: add BPF based action") with regards
      to eBPF support, I was thinking that it might be better to improve
      return semantics from a BPF program invoked through BPF_PROG_RUN().
      
      Currently, in case filter_res is 0, we overwrite the default action
      opcode with TC_ACT_SHOT. A default action opcode configured through tc's
      m_bpf can be: TC_ACT_RECLASSIFY, TC_ACT_PIPE, TC_ACT_SHOT, TC_ACT_UNSPEC,
      TC_ACT_OK.
      
      In cls_bpf, we have the possibility to overwrite the default class
      associated with the classifier in case filter_res is _not_ 0xffffffff
      (-1).
      
      That allows us to fold multiple [e]BPF programs into a single one, where
      they would otherwise need to be defined as a separate classifier with
      its own classid, needlessly redoing parsing work, etc.
      
      Similarly, we could do better in act_bpf: Since above TC_ACT* opcodes
      are exported to UAPI anyway, we reuse them for return-code-to-tc-opcode
      mapping, where we would allow above possibilities. Thus, like in cls_bpf,
      a filter_res of 0xffffffff (-1) means that the configured _default_ action
      is used. Any unkown return code from the BPF program would fail in
      tcf_bpf() with TC_ACT_UNSPEC.
      
      Should we one day want to make use of TC_ACT_STOLEN or TC_ACT_QUEUED,
      which both have the same semantics, we have the option to either use
      that as a default action (filter_res of 0xffffffff) or non-default BPF
      return code.
      
      All that will allow us to transparently use tcf_bpf() for both BPF
      flavours.
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Cc: Jiri Pirko <jiri@resnulli.us>
      Cc: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Acked-by: NJiri Pirko <jiri@resnulli.us>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ced585c8
    • E
      inet: Clean up inet_csk_wait_for_connect() vs. might_sleep() · cb7cf8a3
      Eric Dumazet 提交于
      I got the following trace with current net-next kernel :
      
      [14723.885290] WARNING: CPU: 26 PID: 22658 at kernel/sched/core.c:7285 __might_sleep+0x89/0xa0()
      [14723.885325] do not call blocking ops when !TASK_RUNNING; state=1 set at [<ffffffff810e8734>] prepare_to_wait_exclusive+0x34/0xa0
      [14723.885355] CPU: 26 PID: 22658 Comm: netserver Not tainted 4.0.0-dbg-DEV #1379
      [14723.885359]  ffffffff81a223a8 ffff881fae9e7ca8 ffffffff81650b5d 0000000000000001
      [14723.885364]  ffff881fae9e7cf8 ffff881fae9e7ce8 ffffffff810a72e7 0000000000000000
      [14723.885367]  ffffffff81a57620 000000000000093a 0000000000000000 ffff881fae9e7e64
      [14723.885371] Call Trace:
      [14723.885377]  [<ffffffff81650b5d>] dump_stack+0x4c/0x65
      [14723.885382]  [<ffffffff810a72e7>] warn_slowpath_common+0x97/0xe0
      [14723.885386]  [<ffffffff810a73e6>] warn_slowpath_fmt+0x46/0x50
      [14723.885390]  [<ffffffff810f4c5d>] ? trace_hardirqs_on_caller+0x10d/0x1d0
      [14723.885393]  [<ffffffff810e8734>] ? prepare_to_wait_exclusive+0x34/0xa0
      [14723.885396]  [<ffffffff810e8734>] ? prepare_to_wait_exclusive+0x34/0xa0
      [14723.885399]  [<ffffffff810ccdc9>] __might_sleep+0x89/0xa0
      [14723.885403]  [<ffffffff81581846>] lock_sock_nested+0x36/0xb0
      [14723.885406]  [<ffffffff815829a3>] ? release_sock+0x173/0x1c0
      [14723.885411]  [<ffffffff815ea1f7>] inet_csk_accept+0x157/0x2a0
      [14723.885415]  [<ffffffff810e8900>] ? abort_exclusive_wait+0xc0/0xc0
      [14723.885419]  [<ffffffff8161b96d>] inet_accept+0x2d/0x150
      [14723.885424]  [<ffffffff8157db6f>] SYSC_accept4+0xff/0x210
      [14723.885428]  [<ffffffff8165a451>] ? retint_swapgs+0xe/0x44
      [14723.885431]  [<ffffffff810f4c5d>] ? trace_hardirqs_on_caller+0x10d/0x1d0
      [14723.885437]  [<ffffffff81369c0e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
      [14723.885441]  [<ffffffff8157ef40>] SyS_accept+0x10/0x20
      [14723.885444]  [<ffffffff81659872>] system_call_fastpath+0x12/0x17
      [14723.885447] ---[ end trace ff74cd83355b1873 ]---
      
      In commit 26cabd31
      Peter added a sched_annotate_sleep() in sk_wait_event()
      
      Is the following patch needed as well ?
      
      Alternative would be to use sk_wait_event() from inet_csk_wait_for_connect()
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      cb7cf8a3
    • N
      ip6_tunnel: fix error code when tunnel exists · 37355565
      Nicolas Dichtel 提交于
      After commit 2b0bb01b, the kernel returns -ENOBUFS when user tries to add
      an existing tunnel with ioctl API:
      $ ip -6 tunnel add ip6tnl1 mode ip6ip6 dev eth1
      add tunnel "ip6tnl0" failed: No buffer space available
      
      It's confusing, the right error is EEXIST.
      
      This patch also change a bit the code returned:
       - ENOBUFS -> ENOMEM
       - ENOENT -> ENODEV
      
      Fixes: 2b0bb01b ("ip6_tunnel: Return an error when adding an existing tunnel.")
      CC: Steffen Klassert <steffen.klassert@secunet.com>
      Reported-by: NPierre Cheynier <me@pierre-cheynier.net>
      Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      37355565
  7. 17 3月, 2015 1 次提交
  8. 16 3月, 2015 6 次提交
  9. 15 3月, 2015 1 次提交
  10. 14 3月, 2015 1 次提交
  11. 13 3月, 2015 2 次提交
  12. 12 3月, 2015 5 次提交
    • I
      netfilter: Zero the tuple in nfnl_cthelper_parse_tuple() · 78146572
      Ian Wilson 提交于
      nfnl_cthelper_parse_tuple() is called from nfnl_cthelper_new(),
      nfnl_cthelper_get() and nfnl_cthelper_del().  In each case they pass
      a pointer to an nf_conntrack_tuple data structure local variable:
      
          struct nf_conntrack_tuple tuple;
          ...
          ret = nfnl_cthelper_parse_tuple(&tuple, tb[NFCTH_TUPLE]);
      
      The problem is that this local variable is not initialized, and
      nfnl_cthelper_parse_tuple() only initializes two fields: src.l3num and
      dst.protonum.  This leaves all other fields with undefined values
      based on whatever is on the stack:
      
          tuple->src.l3num = ntohs(nla_get_be16(tb[NFCTH_TUPLE_L3PROTONUM]));
          tuple->dst.protonum = nla_get_u8(tb[NFCTH_TUPLE_L4PROTONUM]);
      
      The symptom observed was that when the rpc and tns helpers were added
      then traffic to port 1536 was being sent to user-space.
      Signed-off-by: NIan Wilson <iwilson@brocade.com>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      78146572
    • A
      rds: avoid potential stack overflow · f862e07c
      Arnd Bergmann 提交于
      The rds_iw_update_cm_id function stores a large 'struct rds_sock' object
      on the stack in order to pass a pair of addresses. This happens to just
      fit withint the 1024 byte stack size warning limit on x86, but just
      exceed that limit on ARM, which gives us this warning:
      
      net/rds/iw_rdma.c:200:1: warning: the frame size of 1056 bytes is larger than 1024 bytes [-Wframe-larger-than=]
      
      As the use of this large variable is basically bogus, we can rearrange
      the code to not do that. Instead of passing an rds socket into
      rds_iw_get_device, we now just pass the two addresses that we have
      available in rds_iw_update_cm_id, and we change rds_iw_get_mr accordingly,
      to create two address structures on the stack there.
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Acked-by: NSowmini Varadhan <sowmini.varadhan@oracle.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f862e07c
    • W
      sock: fix possible NULL sk dereference in __skb_tstamp_tx · 3a8dd971
      Willem de Bruijn 提交于
      Test that sk != NULL before reading sk->sk_tsflags.
      
      Fixes: 49ca0d8b ("net-timestamp: no-payload option")
      Reported-by: NOne Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk>
      Signed-off-by: NWillem de Bruijn <willemb@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3a8dd971
    • E
      xps: must clear sender_cpu before forwarding · c29390c6
      Eric Dumazet 提交于
      John reported that my previous commit added a regression
      on his router.
      
      This is because sender_cpu & napi_id share a common location,
      so get_xps_queue() can see garbage and perform an out of bound access.
      
      We need to make sure sender_cpu is cleared before doing the transmit,
      otherwise any NIC busy poll enabled (skb_mark_napi_id()) can trigger
      this bug.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: NJohn <jw@nuclearfallout.net>
      Bisected-by: NJohn <jw@nuclearfallout.net>
      Fixes: 2bd82484 ("xps: fix xps for stacked devices")
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c29390c6
    • A
      net: sysctl_net_core: check SNDBUF and RCVBUF for min length · b1cb59cf
      Alexey Kodanev 提交于
      sysctl has sysctl.net.core.rmem_*/wmem_* parameters which can be
      set to incorrect values. Given that 'struct sk_buff' allocates from
      rcvbuf, incorrectly set buffer length could result to memory
      allocation failures. For example, set them as follows:
      
          # sysctl net.core.rmem_default=64
            net.core.wmem_default = 64
          # sysctl net.core.wmem_default=64
            net.core.wmem_default = 64
          # ping localhost -s 1024 -i 0 > /dev/null
      
      This could result to the following failure:
      
      skbuff: skb_over_panic: text:ffffffff81628db4 len:-32 put:-32
      head:ffff88003a1cc200 data:ffff88003a1cc200 tail:0xffffffe0 end:0xc0 dev:<NULL>
      kernel BUG at net/core/skbuff.c:102!
      invalid opcode: 0000 [#1] SMP
      ...
      task: ffff88003b7f5550 ti: ffff88003ae88000 task.ti: ffff88003ae88000
      RIP: 0010:[<ffffffff8155fbd1>]  [<ffffffff8155fbd1>] skb_put+0xa1/0xb0
      RSP: 0018:ffff88003ae8bc68  EFLAGS: 00010296
      RAX: 000000000000008d RBX: 00000000ffffffe0 RCX: 0000000000000000
      RDX: ffff88003fdcf598 RSI: ffff88003fdcd9c8 RDI: ffff88003fdcd9c8
      RBP: ffff88003ae8bc88 R08: 0000000000000001 R09: 0000000000000000
      R10: 0000000000000001 R11: 00000000000002b2 R12: 0000000000000000
      R13: 0000000000000000 R14: ffff88003d3f7300 R15: ffff88000012a900
      FS:  00007fa0e2b4a840(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000000d0f7e0 CR3: 000000003b8fb000 CR4: 00000000000006f0
      Stack:
       ffff88003a1cc200 00000000ffffffe0 00000000000000c0 ffffffff818cab1d
       ffff88003ae8bd68 ffffffff81628db4 ffff88003ae8bd48 ffff88003b7f5550
       ffff880031a09408 ffff88003b7f5550 ffff88000012aa48 ffff88000012ab00
      Call Trace:
       [<ffffffff81628db4>] unix_stream_sendmsg+0x2c4/0x470
       [<ffffffff81556f56>] sock_write_iter+0x146/0x160
       [<ffffffff811d9612>] new_sync_write+0x92/0xd0
       [<ffffffff811d9cd6>] vfs_write+0xd6/0x180
       [<ffffffff811da499>] SyS_write+0x59/0xd0
       [<ffffffff81651532>] system_call_fastpath+0x12/0x17
      Code: 00 00 48 89 44 24 10 8b 87 c8 00 00 00 48 89 44 24 08 48 8b 87 d8 00
            00 00 48 c7 c7 30 db 91 81 48 89 04 24 31 c0 e8 4f a8 0e 00 <0f> 0b
            eb fe 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 48 83
      RIP  [<ffffffff8155fbd1>] skb_put+0xa1/0xb0
      RSP <ffff88003ae8bc68>
      Kernel panic - not syncing: Fatal exception
      
      Moreover, the possible minimum is 1, so we can get another kernel panic:
      ...
      BUG: unable to handle kernel paging request at ffff88013caee5c0
      IP: [<ffffffff815604cf>] __alloc_skb+0x12f/0x1f0
      ...
      Signed-off-by: NAlexey Kodanev <alexey.kodanev@oracle.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b1cb59cf