1. 10 9月, 2014 17 次提交
  2. 09 9月, 2014 2 次提交
  3. 08 9月, 2014 2 次提交
  4. 07 9月, 2014 1 次提交
  5. 06 9月, 2014 18 次提交
    • E
      tcp: remove TCP_SKB_CB(skb)->when · 7faee5c0
      Eric Dumazet 提交于
      After commit 740b0f18 ("tcp: switch rtt estimations to usec resolution"),
      we no longer need to maintain timestamps in two different fields.
      
      TCP_SKB_CB(skb)->when can be removed, as same information sits in skb_mstamp.stamp_jiffies
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Acked-by: NYuchung Cheng <ycheng@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7faee5c0
    • E
      tcp: introduce TCP_SKB_CB(skb)->tcp_tw_isn · 04317daf
      Eric Dumazet 提交于
      TCP_SKB_CB(skb)->when has different meaning in output and input paths.
      
      In output path, it contains a timestamp.
      In input path, it contains an ISN, chosen by tcp_timewait_state_process()
      
      Lets add a different name to ease code comprehension.
      
      Note that 'when' field will disappear in following patch,
      as skb_mstamp already contains timestamp, the anonymous
      union will promptly disappear as well.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Acked-by: NYuchung Cheng <ycheng@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      04317daf
    • A
      net: Add function for parsing the header length out of linear ethernet frames · 56193d1b
      Alexander Duyck 提交于
      This patch updates some of the flow_dissector api so that it can be used to
      parse the length of ethernet buffers stored in fragments.  Most of the
      changes needed were to __skb_get_poff as it needed to be updated to support
      sending a linear buffer instead of a skb.
      
      I have split __skb_get_poff into two functions, the first is skb_get_poff
      and it retains the functionality of the original __skb_get_poff.  The other
      function is __skb_get_poff which now works much like __skb_flow_dissect in
      relation to skb_flow_dissect in that it provides the same functionality but
      works with just a data buffer and hlen instead of needing an skb.
      Signed-off-by: NAlexander Duyck <alexander.h.duyck@intel.com>
      Acked-by: NAlexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      56193d1b
    • A
      net: merge cases where sock_efree and sock_edemux are the same function · 82eabd9e
      Alexander Duyck 提交于
      Since sock_efree and sock_demux are essentially the same code for non-TCP
      sockets and the case where CONFIG_INET is not defined we can combine the
      code or replace the call to sock_edemux in several spots.  As a result we
      can avoid a bit of unnecessary code or code duplication.
      Signed-off-by: NAlexander Duyck <alexander.h.duyck@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      82eabd9e
    • A
      net-timestamp: Make the clone operation stand-alone from phy timestamping · 62bccb8c
      Alexander Duyck 提交于
      The phy timestamping takes a different path than the regular timestamping
      does in that it will create a clone first so that the packets needing to be
      timestamped can be placed in a queue, or the context block could be used.
      
      In order to support these use cases I am pulling the core of the code out
      so it can be used in other drivers beyond just phy devices.
      
      In addition I have added a destructor named sock_efree which is meant to
      provide a simple way for dropping the reference to skb exceptions that
      aren't part of either the receive or send windows for the socket, and I
      have removed some duplication in spots where this destructor could be used
      in place of sock_edemux.
      Signed-off-by: NAlexander Duyck <alexander.h.duyck@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      62bccb8c
    • A
      net-timestamp: Merge shared code between phy and regular timestamping · 37846ef0
      Alexander Duyck 提交于
      This change merges the shared bits that exist between skb_tx_tstamp and
      skb_complete_tx_timestamp.  By doing this we can avoid the two diverging as
      there were already changes pushed into skb_tx_tstamp that hadn't made it
      into the other function.
      
      In addition this resolves issues with the fact that
      skb_complete_tx_timestamp was included in linux/skbuff.h even though it was
      only compiled in if phy timestamping was enabled.
      Signed-off-by: NAlexander Duyck <alexander.h.duyck@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      37846ef0
    • E
      ipv4: harden fnhe_hashfun() · d546c621
      Eric Dumazet 提交于
      Lets make this hash function a bit secure, as ICMP attacks are still
      in the wild.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d546c621
    • M
      net: treewide: Fix typo found in DocBook/networking.xml · e793c0f7
      Masanari Iida 提交于
      This patch fix spelling typo found in DocBook/networking.xml.
      It is because the neworking.xml is generated from comments
      in the source, I have to fix typo in comments within the source.
      Signed-off-by: NMasanari Iida <standby24x7@gmail.com>
      Acked-by: NRandy Dunlap <rdunlap@infradead.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e793c0f7
    • P
      netfilter: add explicit Kconfig for NETFILTER_XT_NAT · 84a59ca5
      Pablo Neira Ayuso 提交于
      Paul Bolle reports that 'select NETFILTER_XT_NAT' from the IPV4 and IPV6
      NAT tables becomes noop since there is no Kconfig switch for it. Add the
      Kconfig switch to resolve this problem.
      
      Fixes: 8993cf8e netfilter: move NAT Kconfig switches out of the iptables scope
      Reported-by: NPaul Bolle <pebolle@tiscali.nl>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      84a59ca5
    • E
      ipv4: fix a race in update_or_create_fnhe() · caa41527
      Eric Dumazet 提交于
      nh_exceptions is effectively used under rcu, but lacks proper
      barriers. Between kzalloc() and setting of nh->nh_exceptions(),
      we need a proper memory barrier.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Fixes: 4895c771 ("ipv4: Add FIB nexthop exceptions.")
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      caa41527
    • N
      ipv6: use addrconf_get_prefix_route() to remove peer addr · e7478dfc
      Nicolas Dichtel 提交于
      addrconf_get_prefix_route() ensures to get the right route in the right table.
      Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com>
      Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e7478dfc
    • N
      ipv6: fix a refcnt leak with peer addr · f24062b0
      Nicolas Dichtel 提交于
      There is no reason to take a refcnt before deleting the peer address route.
      It's done some lines below for the local prefix route because
      inet6_ifa_finish_destroy() will release it at the end.
      For the peer address route, we want to free it right now.
      
      This bug has been introduced by commit
      caeaba79 ("ipv6: add support of peer address").
      Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com>
      Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f24062b0
    • A
      l2tp: fix missing line continuation · 29abe2fd
      Andy Zhou 提交于
      This syntax error was covered by L2TP_REFCNT_DEBUG not being set by
      default.
      Signed-off-by: NAndy Zhou <azhou@nicira.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      29abe2fd
    • W
      net-timestamp: only report sw timestamp if reporting bit is set · c199105d
      Willem de Bruijn 提交于
      The timestamping API has separate bits for generating and reporting
      timestamps. A software timestamp should only be reported for a packet
      when the packet has the relevant generation flag (SKBTX_..) set
      and the socket has reporting bit SOF_TIMESTAMPING_SOFTWARE set.
      
      The second check was accidentally removed. Reinstitute the original
      behavior.
      
      Tested:
        Without this patch, Documentation/networking/txtimestamp reports
        timestamps regardless of whether SOF_TIMESTAMPING_SOFTWARE is set.
        After the patch, it only reports them when the flag is set.
      
      Fixes: f24b9be5 ("net-timestamp: extend SCM_TIMESTAMPING ancillary data struct")
      Signed-off-by: NWillem de Bruijn <willemb@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c199105d
    • G
      l2tp: fix race while getting PMTU on PPP pseudo-wire · eed4d839
      Guillaume Nault 提交于
      Use dst_entry held by sk_dst_get() to retrieve tunnel's PMTU.
      
      The dst_mtu(__sk_dst_get(tunnel->sock)) call was racy. __sk_dst_get()
      could return NULL if tunnel->sock->sk_dst_cache was reset just before the
      call, thus making dst_mtu() dereference a NULL pointer:
      
      [ 1937.661598] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
      [ 1937.664005] IP: [<ffffffffa049db88>] pppol2tp_connect+0x33d/0x41e [l2tp_ppp]
      [ 1937.664005] PGD daf0c067 PUD d9f93067 PMD 0
      [ 1937.664005] Oops: 0000 [#1] SMP
      [ 1937.664005] Modules linked in: l2tp_ppp l2tp_netlink l2tp_core ip6table_filter ip6_tables iptable_filter ip_tables ebtable_nat ebtables x_tables udp_tunnel pppoe pppox ppp_generic slhc deflate ctr twofish_generic twofish_x86_64_3way xts lrw gf128mul glue_helper twofish_x86_64 twofish_common blowfish_generic blowfish_x86_64 blowfish_common des_generic cbc xcbc rmd160 sha512_generic hmac crypto_null af_key xfrm_algo 8021q garp bridge stp llc tun atmtcp clip atm ext3 mbcache jbd iTCO_wdt coretemp kvm_intel iTCO_vendor_support kvm pcspkr evdev ehci_pci lpc_ich mfd_core i5400_edac edac_core i5k_amb shpchp button processor thermal_sys xfs crc32c_generic libcrc32c dm_mod usbhid sg hid sr_mod sd_mod cdrom crc_t10dif crct10dif_common ata_generic ahci ata_piix tg3 libahci libata uhci_hcd ptp ehci_hcd pps_core usbcore scsi_mod libphy usb_common [last unloaded: l2tp_core]
      [ 1937.664005] CPU: 0 PID: 10022 Comm: l2tpstress Tainted: G           O   3.17.0-rc1 #1
      [ 1937.664005] Hardware name: HP ProLiant DL160 G5, BIOS O12 08/22/2008
      [ 1937.664005] task: ffff8800d8fda790 ti: ffff8800c43c4000 task.ti: ffff8800c43c4000
      [ 1937.664005] RIP: 0010:[<ffffffffa049db88>]  [<ffffffffa049db88>] pppol2tp_connect+0x33d/0x41e [l2tp_ppp]
      [ 1937.664005] RSP: 0018:ffff8800c43c7de8  EFLAGS: 00010282
      [ 1937.664005] RAX: ffff8800da8a7240 RBX: ffff8800d8c64600 RCX: 000001c325a137b5
      [ 1937.664005] RDX: 8c6318c6318c6320 RSI: 000000000000010c RDI: 0000000000000000
      [ 1937.664005] RBP: ffff8800c43c7ea8 R08: 0000000000000000 R09: 0000000000000000
      [ 1937.664005] R10: ffffffffa048e2c0 R11: ffff8800d8c64600 R12: ffff8800ca7a5000
      [ 1937.664005] R13: ffff8800c439bf40 R14: 000000000000000c R15: 0000000000000009
      [ 1937.664005] FS:  00007fd7f610f700(0000) GS:ffff88011a600000(0000) knlGS:0000000000000000
      [ 1937.664005] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [ 1937.664005] CR2: 0000000000000020 CR3: 00000000d9d75000 CR4: 00000000000027e0
      [ 1937.664005] Stack:
      [ 1937.664005]  ffffffffa049da80 ffff8800d8fda790 000000000000005b ffff880000000009
      [ 1937.664005]  ffff8800daf3f200 0000000000000003 ffff8800c43c7e48 ffffffff81109b57
      [ 1937.664005]  ffffffff81109b0e ffffffff8114c566 0000000000000000 0000000000000000
      [ 1937.664005] Call Trace:
      [ 1937.664005]  [<ffffffffa049da80>] ? pppol2tp_connect+0x235/0x41e [l2tp_ppp]
      [ 1937.664005]  [<ffffffff81109b57>] ? might_fault+0x9e/0xa5
      [ 1937.664005]  [<ffffffff81109b0e>] ? might_fault+0x55/0xa5
      [ 1937.664005]  [<ffffffff8114c566>] ? rcu_read_unlock+0x1c/0x26
      [ 1937.664005]  [<ffffffff81309196>] SYSC_connect+0x87/0xb1
      [ 1937.664005]  [<ffffffff813e56f7>] ? sysret_check+0x1b/0x56
      [ 1937.664005]  [<ffffffff8107590d>] ? trace_hardirqs_on_caller+0x145/0x1a1
      [ 1937.664005]  [<ffffffff81213dee>] ? trace_hardirqs_on_thunk+0x3a/0x3f
      [ 1937.664005]  [<ffffffff8114c262>] ? spin_lock+0x9/0xb
      [ 1937.664005]  [<ffffffff813092b4>] SyS_connect+0x9/0xb
      [ 1937.664005]  [<ffffffff813e56d2>] system_call_fastpath+0x16/0x1b
      [ 1937.664005] Code: 10 2a 84 81 e8 65 76 bd e0 65 ff 0c 25 10 bb 00 00 4d 85 ed 74 37 48 8b 85 60 ff ff ff 48 8b 80 88 01 00 00 48 8b b8 10 02 00 00 <48> 8b 47 20 ff 50 20 85 c0 74 0f 83 e8 28 89 83 10 01 00 00 89
      [ 1937.664005] RIP  [<ffffffffa049db88>] pppol2tp_connect+0x33d/0x41e [l2tp_ppp]
      [ 1937.664005]  RSP <ffff8800c43c7de8>
      [ 1937.664005] CR2: 0000000000000020
      [ 1939.559375] ---[ end trace 82d44500f28f8708 ]---
      
      Fixes: f34c4a35 ("l2tp: take PMTU from tunnel UDP socket")
      Signed-off-by: NGuillaume Nault <g.nault@alphalink.fr>
      Acked-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      eed4d839
    • G
      ethtool: Add generic options for tunables · f0db9b07
      Govindarajulu Varadarajan 提交于
      This patch adds new ethtool cmd, ETHTOOL_GTUNABLE & ETHTOOL_STUNABLE for getting
      tunable values from driver.
      
      Add get_tunable and set_tunable to ethtool_ops. Driver implements these
      functions for getting/setting tunable value.
      Signed-off-by: NGovindarajulu Varadarajan <_govind@gmx.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f0db9b07
    • D
      dev_ioctl: remove dev_load() CAP_SYS_MODULE message · e020836d
      Daniel Borkmann 提交于
      Marcel reported to see the following message when autoloading
      is being triggered when adding nlmon device:
      
        Loading kernel module for a network device with
        CAP_SYS_MODULE (deprecated). Use CAP_NET_ADMIN and alias
        netdev-nlmon instead.
      
      This false-positive happens despite with having correct
      capabilities set, e.g. through issuing `ip link del dev nlmon`
      more than once on a valid device with name nlmon, but Marcel
      has also seen it on creation time when no nlmon module is
      previously compiled-in or loaded as module and the device
      name equals a link type name (e.g. nlmon, vxlan, team).
      
      Stephen says:
      
        The netdev module alias is a hold over from the past. For
        normal devices, people used to create a alias eth0 to and
        point it to the type of network device used, that was back
        in the bad old ISA days before real discovery.
      
        Also, the tunnels create module alias for the control device
        and ip used to use this to autoload the tunnel device.
      
        The message is bogus and should just be removed, I also see
        it in a couple of other cases where tap devices are renamed
        for other usese.
      
      As mentioned in 8909c9ad ("net: don't allow CAP_NET_ADMIN
      to load non-netdev kernel modules"), we nevertheless still
      might want to leave the old autoloading behaviour in place
      as it could break old scripts, so for now, lets just remove
      the log message as Stephen suggests.
      
      Reference: http://thread.gmane.org/gmane.linux.kernel/1105168Reported-by: NMarcel Holtmann <marcel@holtmann.org>
      Suggested-by: NStephen Hemminger <stephen@networkplumber.org>
      Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
      Cc: Vasiliy Kulikov <segoon@openwall.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e020836d
    • D
      net: bpf: make eBPF interpreter images read-only · 60a3b225
      Daniel Borkmann 提交于
      With eBPF getting more extended and exposure to user space is on it's way,
      hardening the memory range the interpreter uses to steer its command flow
      seems appropriate.  This patch moves the to be interpreted bytecode to
      read-only pages.
      
      In case we execute a corrupted BPF interpreter image for some reason e.g.
      caused by an attacker which got past a verifier stage, it would not only
      provide arbitrary read/write memory access but arbitrary function calls
      as well. After setting up the BPF interpreter image, its contents do not
      change until destruction time, thus we can setup the image on immutable
      made pages in order to mitigate modifications to that code. The idea
      is derived from commit 314beb9b ("x86: bpf_jit_comp: secure bpf jit
      against spraying attacks").
      
      This is possible because bpf_prog is not part of sk_filter anymore.
      After setup bpf_prog cannot be altered during its life-time. This prevents
      any modifications to the entire bpf_prog structure (incl. function/JIT
      image pointer).
      
      Every eBPF program (including classic BPF that are migrated) have to call
      bpf_prog_select_runtime() to select either interpreter or a JIT image
      as a last setup step, and they all are being freed via bpf_prog_free(),
      including non-JIT. Therefore, we can easily integrate this into the
      eBPF life-time, plus since we directly allocate a bpf_prog, we have no
      performance penalty.
      
      Tested with seccomp and test_bpf testsuite in JIT/non-JIT mode and manual
      inspection of kernel_page_tables.  Brad Spengler proposed the same idea
      via Twitter during development of this patch.
      
      Joint work with Hannes Frederic Sowa.
      Suggested-by: NBrad Spengler <spender@grsecurity.net>
      Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
      Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Cc: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Kees Cook <keescook@chromium.org>
      Acked-by: NAlexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      60a3b225