1. 26 Jun, 2013 1 commit
  2. 21 Jun, 2013 1 commit
    • netfilter: xt_socket: add XT_SOCKET_NOWILDCARD flag · 681f130f
      Committed by Eric Dumazet
      The xt_socket module can be a nice replacement for the conntrack
      module in some cases (SYN filtering, for example).

      But it lacks the ability to match the 3rd packet of the TCP
      handshake (the ACK coming from the client).

      Add an XT_SOCKET_NOWILDCARD flag to disable the wildcard mechanism.

      The wildcard is the legacy socket match behavior, which ignores
      LISTEN sockets bound to INADDR_ANY (or the IPv6 equivalent).
      
      iptables -I INPUT -p tcp --syn -j SYN_CHAIN
      iptables -I INPUT -m socket --nowildcard -j ACCEPT
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Patrick McHardy <kaber@trash.net>
      Cc: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
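The effect of the flag on matching can be modeled in a few lines. This is only a sketch of the behavior described in the commit message, not the kernel's code: `Socket`, `socket_match`, and the string state names are hypothetical, and the real lookup considers more than the bound address.

```python
from collections import namedtuple

INADDR_ANY = "0.0.0.0"

# Hypothetical stand-in for the socket found by the packet's lookup.
Socket = namedtuple("Socket", ["bound_addr", "state"])


def socket_match(sk, nowildcard=False):
    """Return True if the socket found for a packet counts as a match."""
    if sk is None:
        return False  # no socket owns this packet
    # Legacy ("wildcard") behavior: a LISTEN socket bound to INADDR_ANY
    # is ignored, so the handshake's final ACK never matches.
    wildcard = sk.state == "LISTEN" and sk.bound_addr == INADDR_ANY
    if wildcard and not nowildcard:
        return False
    return True


if __name__ == "__main__":
    listener = Socket(INADDR_ANY, "LISTEN")
    print("legacy match:", socket_match(listener))
    print("--nowildcard match:", socket_match(listener, nowildcard=True))
```

With the flag set, the wildcard listener matches, which is what lets the second iptables rule accept the client's ACK.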
  3. 20 Jun, 2013 12 commits
  4. 18 Jun, 2013 6 commits
  5. 15 Jun, 2013 2 commits
    • smp.h: Use local_irq_{save,restore}() in !SMP version of on_each_cpu(). · f21afc25
      Committed by David Daney
      Thanks to commit f91eb62f ("init: scream bloody murder if interrupts
      are enabled too early"), "bloody murder" is now being screamed.
      
      With a MIPS OCTEON config, we use on_each_cpu() in our
      irq_chip.irq_bus_sync_unlock() function.  This gets called early as a
      result of the time_init() call.  Because the !SMP version of
      on_each_cpu() unconditionally enables irqs, we get:
      
          WARNING: at init/main.c:560 start_kernel+0x250/0x410()
          Interrupts were enabled early
          CPU: 0 PID: 0 Comm: swapper Not tainted 3.10.0-rc5-Cavium-Octeon+ #801
          Call Trace:
            show_stack+0x68/0x80
            warn_slowpath_common+0x78/0xb0
            warn_slowpath_fmt+0x38/0x48
            start_kernel+0x250/0x410
      
      Suggested fix: Do what we already do in the SMP version of
      on_each_cpu(), and use local_irq_save/local_irq_restore.  Because we
      need a flags variable, make it a static inline to avoid name space
      issues.
      
      [ Change from v1: Convert on_each_cpu to a static inline function, add
        #include <linux/irqflags.h> to avoid build breakage on some files.
      
        on_each_cpu_mask() and on_each_cpu_cond() suffer the same problem as
        on_each_cpu(), but they are not causing !SMP bugs for me, so I will
        defer changing them to a less urgent patch. ]
      Signed-off-by: David Daney <david.daney@cavium.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
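The difference between the two approaches can be shown with a toy model of the interrupt-enable state. This is a sketch, not kernel code: `Cpu`, `on_each_cpu_old`, and `on_each_cpu_new` are hypothetical names, and a single boolean stands in for the real IRQ flags.

```python
class Cpu:
    """Toy CPU whose only state is whether interrupts are enabled."""

    def __init__(self, irqs_enabled):
        self.irqs_enabled = irqs_enabled


def on_each_cpu_old(cpu, func):
    # Models local_irq_disable(); func(); local_irq_enable():
    # interrupts end up enabled no matter how they started.
    cpu.irqs_enabled = False
    func()
    cpu.irqs_enabled = True


def on_each_cpu_new(cpu, func):
    # Models local_irq_save(flags); func(); local_irq_restore(flags):
    # the caller's interrupt state is preserved.
    saved = cpu.irqs_enabled
    cpu.irqs_enabled = False
    func()
    cpu.irqs_enabled = saved


if __name__ == "__main__":
    early_boot = Cpu(irqs_enabled=False)
    on_each_cpu_old(early_boot, lambda: None)
    print("old version leaves irqs enabled:", early_boot.irqs_enabled)
```

Calling the old variant during early boot, when interrupts are still off, is the situation that triggers the "Interrupts were enabled early" warning.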
    • openvswitch: Fix struct comment. · 45bfa52e
      Committed by Pravin B Shelar
      Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
      Signed-off-by: Jesse Gross <jesse@nicira.com>
  6. 14 Jun, 2013 3 commits
    • net/mlx4: Add VF link state support · 948e306d
      Committed by Rony Efraim
      Add support for changing the link state of a VF (vPort).
      Signed-off-by: Rony Efraim <ronye@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/core: Add VF link state control · 1d8faf48
      Committed by Rony Efraim
      Add netlink directives and an ndo entry to allow for controlling
      VF link, which can be in one of three states:
      
      Auto - VF link state reflects the PF link state (default)
      
      Up - VF link state is up, traffic from VF to VF works even if
      the actual PF link is down
      
      Down - VF link state is down, no traffic from/to this VF, can be of
      use while configuring the VF
      Signed-off-by: Rony Efraim <ronye@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
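The three states listed in this commit reduce to a small decision function. The sketch below is illustrative only; `VfLinkState` and `vf_link_is_up` are hypothetical names, not the kernel's netlink constants or ndo callback.

```python
from enum import Enum


class VfLinkState(Enum):
    AUTO = "auto"  # follow the PF link (default)
    UP = "up"      # forced up, even if the PF link is down
    DOWN = "down"  # forced down, no traffic from/to this VF


def vf_link_is_up(state, pf_link_up):
    """Return the link state the VF observes for a given PF link state."""
    if state is VfLinkState.AUTO:
        return pf_link_up
    return state is VfLinkState.UP


if __name__ == "__main__":
    for state in VfLinkState:
        print(state.value, "with PF down ->", vf_link_is_up(state, False))
```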
    • net-rps: fixes for rps flow limit · 5f121b9a
      Committed by Willem de Bruijn
      Caught by sparse:
      - __rcu: missing annotation to sd->flow_limit
      - __user: direct access in cpumask_scnprintf
      
      Also
      - add endline character when printing bitmap if room in buffer
      - avoid bucket overflow by reducing FLOW_LIMIT_HISTORY
      
      The last item warrants some explanation. The hashtable buckets are
      subject to overflow if FLOW_LIMIT_HISTORY is larger than or equal
      to bucket size, since all packets may end up in a single bucket. The
      current (rather arbitrary) history value of 256 happens to match the
      buffer size (u8).
      
      As a result, with a single flow, the first 128 packets are accepted
      (correct), the second 128 packets dropped (correct) and then the
      history[] array has filled, so that each subsequent new packet
      causes an increment in the bucket for new_flow plus a decrement
      for old_flow: a steady state.
      
      This is fine if packets are dropped, as the steady state goes away
      as soon as a mix of traffic reappears. But, because the 256th packet
      overflowed the bucket to 0: no packets are dropped.
      
      Instead of explicitly adding an overflow check, this patch changes
      FLOW_LIMIT_HISTORY to never be able to overflow a single bucket.
      Reported-by: Fengguang Wu <fengguang.wu@intel.com> (first item)
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
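The overflow described in this commit can be reproduced with a small simulation. This is a sketch under stated assumptions: it models only the bookkeeping the message describes (a history ring of bucket indices and 8-bit bucket counters, with a drop once a bucket exceeds half the history length), `simulate_single_flow` and its parameters are hypothetical names, and the reduced history length of 128 is an illustrative value that the commit message itself does not state.

```python
def simulate_single_flow(history_len, packets, flow_bucket=5, num_buckets=64):
    """Feed `packets` packets of one flow; return a list of drop decisions."""
    buckets = [0] * num_buckets   # each counter is treated as a u8
    history = [0] * history_len   # ring of recently seen bucket indices
    head = 0
    dropped = []
    for _ in range(packets):
        old_flow = history[head]
        history[head] = flow_bucket
        head = (head + 1) & (history_len - 1)  # history_len is a power of two
        if buckets[old_flow]:
            buckets[old_flow] -= 1
        # u8 arithmetic: incrementing 255 wraps around to 0.
        buckets[flow_bucket] = (buckets[flow_bucket] + 1) & 0xFF
        dropped.append(buckets[flow_bucket] > (history_len >> 1))
    return dropped


if __name__ == "__main__":
    for history_len in (256, 128):
        drops = simulate_single_flow(history_len, 1000)
        print(history_len, "late drops:", sum(drops[history_len:]))
```

In this model a history of 256 stops dropping once the counter wraps, while a history of 128 keeps the counter within u8 range and the single flow stays limited.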
  7. 13 Jun, 2013 8 commits
    • ip_tunnel: remove __net_init/exit from exported functions · d3b6f614
      Committed by Eric Dumazet
      If CONFIG_NET_NS is not set then __net_init is the same as __init and
      __net_exit is the same as __exit. These functions will be removed from
      memory after the module loads or is removed. Functions that are exported
      for use by other functions should never be labeled for removal.
      
      Bug introduced by commit c5441932
      ("GRE: Refactor GRE tunneling code.")
      Reported-by: Steinar H. Gunderson <sgunderson@bigfoot.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: properly send new data in fast recovery in first RTT · 85f16525
      Committed by Yuchung Cheng
      Linux sends new unsent data during disorder and recovery state if all
      (suspected) lost packets have been retransmitted (RFC5681, section
      3.2 step 1 & 2, RFC3517 section 4, NextSeg() Rule 2).  One requirement
      is to keep the receive window about twice the estimated sender's
      congestion window (tcp_rcv_space_adjust()), assuming the fast
      retransmits repair the losses in the next round trip.

      But currently it's not the case on the first round trip in either
      normal or Fast Open connections, because the initial receive window
      is identical to the (expected) sender's initial congestion window. The
      fix is to double it.
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
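The arithmetic behind the fix can be sketched as follows. This is a simplified model, not the kernel's logic: it counts whole segments, ignores the congestion window, assumes the first segment of the initial flight is the one lost (so the left edge of the window does not advance), and uses an initial window of 10 segments purely as an example value. `new_segments_sendable` is a hypothetical helper.

```python
def new_segments_sendable(rcv_wnd, outstanding):
    """Segments of new data the advertised receive window still permits."""
    return max(0, rcv_wnd - outstanding)


if __name__ == "__main__":
    initial_cwnd = 10  # example value, in segments
    # Receive window equal to the sender's initial window: no room for new data.
    print("rwnd == IW:", new_segments_sendable(initial_cwnd, initial_cwnd))
    # Receive window doubled: a full window of new data can go out in recovery.
    print("rwnd == 2*IW:", new_segments_sendable(2 * initial_cwnd, initial_cwnd))
```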
    • macvtap: slient sparse warnings · d9a90a31
      Committed by Jason Wang
      This patch silences the following sparse warnings:
      
      drivers/net/macvtap.c:98:9: warning: incorrect type in assignment (different
      address spaces)
      drivers/net/macvtap.c:98:9:    expected struct macvtap_queue *<noident>
      drivers/net/macvtap.c:98:9:    got struct macvtap_queue [noderef]
      <asn:4>*<noident>
      drivers/net/macvtap.c:120:9: warning: incorrect type in assignment (different
      address spaces)
      drivers/net/macvtap.c:120:9:    expected struct macvtap_queue *<noident>
      drivers/net/macvtap.c:120:9:    got struct macvtap_queue [noderef]
      <asn:4>*<noident>
      drivers/net/macvtap.c:151:22: error: incompatible types in comparison expression
      (different address spaces)
      drivers/net/macvtap.c:233:23: error: incompatible types in comparison expression
      (different address spaces)
      drivers/net/macvtap.c:243:23: error: incompatible types in comparison expression
      (different address spaces)
      drivers/net/macvtap.c:247:15: error: incompatible types in comparison expression
      (different address spaces)
        CC [M]  drivers/net/macvtap.o
      drivers/net/macvlan.c:232:24: error: incompatible types in comparison expression
      (different address spaces)
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • include/linux/math64.h: add div64_ul() · c2853c8d
      Committed by Alex Shi
      There is div64_long() to handle the s64/long division, but no macro to
      do u64/ul division.  It is necessary in some scenarios, so add this
      function.
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: Alex Shi <alex.shi@intel.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: migration: add migrate_entry_wait_huge() · 30dad309
      Committed by Naoya Horiguchi
      When a page fault occurs on an address backed by a hugepage that is
      under migration, the kernel can't wait correctly and busy-loops on
      the hugepage fault until the migration finishes.  As a result, users who
      try to kick hugepage migration (via soft offlining, for example)
      occasionally experience long delays or soft lockups.
      
      This is because pte_offset_map_lock() can't get a correct migration entry
      or a correct page table lock for hugepage.  This patch introduces
      migration_entry_wait_huge() to solve this.
      Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Reviewed-by: Rik van Riel <riel@redhat.com>
      Reviewed-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
      Reviewed-by: Michal Hocko <mhocko@suse.cz>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: <stable@vger.kernel.org>	[2.6.35+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • kmsg: honor dmesg_restrict sysctl on /dev/kmsg · 637241a9
      Committed by Kees Cook
      The dmesg_restrict sysctl currently covers the syslog method for
      accessing dmesg, however /dev/kmsg isn't covered by the same
      protections.  Most people haven't noticed because util-linux dmesg(1)
      defaults to using the syslog method for access in older versions.
      Newer versions of util-linux dmesg(1) default to reading directly from
      /dev/kmsg.
      
      To fix /dev/kmsg, let's compare the existing interfaces and what they
      allow:
      
       - /proc/kmsg allows:
        - open (SYSLOG_ACTION_OPEN) if CAP_SYSLOG since it uses a destructive
          single-reader interface (SYSLOG_ACTION_READ).
        - everything, after an open.
      
       - syslog syscall allows:
        - anything, if CAP_SYSLOG.
        - SYSLOG_ACTION_READ_ALL and SYSLOG_ACTION_SIZE_BUFFER, if
          dmesg_restrict==0.
        - nothing else (EPERM).
      
      The use-cases were:
       - dmesg(1) needs to do non-destructive SYSLOG_ACTION_READ_ALLs.
       - sysklog(1) needs to open /proc/kmsg, drop privs, and still issue the
         destructive SYSLOG_ACTION_READs.
      
      AIUI, dmesg(1) is moving to /dev/kmsg, and systemd-journald doesn't
      clear the ring buffer.
      
      Based on the comments in devkmsg_llseek, it sounds like actions besides
      reading aren't going to be supported by /dev/kmsg (i.e.
      SYSLOG_ACTION_CLEAR), so we have a strict subset of the non-destructive
      syslog syscall actions.
      
      To this end, move the check as Josh had done, but also rename the
      constants to reflect their new uses (SYSLOG_FROM_CALL becomes
      SYSLOG_FROM_READER, and SYSLOG_FROM_FILE becomes SYSLOG_FROM_PROC).
      SYSLOG_FROM_READER allows non-destructive actions, and SYSLOG_FROM_PROC
      allows destructive actions after a capabilities-constrained
      SYSLOG_ACTION_OPEN check.
      
       - /dev/kmsg allows:
        - open if CAP_SYSLOG or dmesg_restrict==0
        - reading/polling, after open
      
      Addresses https://bugzilla.redhat.com/show_bug.cgi?id=903192
      
      [akpm@linux-foundation.org: use pr_warn_once()]
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Reported-by: Christian Kujau <lists@nerdbynature.de>
      Tested-by: Josh Boyer <jwboyer@redhat.com>
      Cc: Kay Sievers <kay@vrfy.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
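The access rules listed in this commit message can be written out as a small decision function. This is a model of the rules as the message states them, not the kernel's implementation: `syslog_allowed` and `devkmsg_open_allowed` are hypothetical names, the constants are plain strings mirroring the names used above, and treating a /dev/kmsg open like a non-destructive SYSLOG_ACTION_READ_ALL is an assumption made here to match the stated "CAP_SYSLOG or dmesg_restrict==0" rule.

```python
READ_ALL = "SYSLOG_ACTION_READ_ALL"
SIZE_BUFFER = "SYSLOG_ACTION_SIZE_BUFFER"
OPEN = "SYSLOG_ACTION_OPEN"
READ = "SYSLOG_ACTION_READ"
CLEAR = "SYSLOG_ACTION_CLEAR"

FROM_READER = "SYSLOG_FROM_READER"  # syslog syscall and /dev/kmsg
FROM_PROC = "SYSLOG_FROM_PROC"      # /proc/kmsg


def syslog_allowed(action, source, cap_syslog, dmesg_restrict):
    """Return True if the action is permitted under the rules listed above."""
    # /proc/kmsg: the capability check happens at open; everything after
    # a successful open is allowed.
    if source == FROM_PROC and action != OPEN:
        return True
    # With CAP_SYSLOG anything is allowed.
    if cap_syslog:
        return True
    # Without it, only the non-destructive actions, and only when
    # dmesg_restrict is 0.
    if not dmesg_restrict and action in (READ_ALL, SIZE_BUFFER):
        return True
    return False  # corresponds to EPERM


def devkmsg_open_allowed(cap_syslog, dmesg_restrict):
    """Model opening /dev/kmsg as a non-destructive read from a reader."""
    return syslog_allowed(READ_ALL, FROM_READER, cap_syslog, dmesg_restrict)


if __name__ == "__main__":
    print("unprivileged, dmesg_restrict=1:", devkmsg_open_allowed(False, True))
    print("unprivileged, dmesg_restrict=0:", devkmsg_open_allowed(False, False))
```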
    • CPU hotplug: provide a generic helper to disable/enable CPU hotplug · 16e53dbf
      Committed by Srivatsa S. Bhat
      There are instances in the kernel where we would like to disable CPU
      hotplug (from sysfs) during some important operation.  Today the freezer
      code depends on this and the code to do it was kinda tailor-made for
      that.
      
      Restructure the code and make it generic enough to be useful for other
      usecases too.
      Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Signed-off-by: Robin Holt <holt@sgi.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Russ Anderson <rja@sgi.com>
      Cc: Robin Holt <holt@sgi.com>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Cc: Shawn Guo <shawn.guo@linaro.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • tun: Report "persist" flag to userspace · 274038f8
      Committed by Pavel Emelyanov
      The TUN_PERSIST flag is not reported at all -- both TUNGETIFF and the sysfs
      "flags" attribute skip it. Knowing whether a device is persistent or not
      is critical for checkpoint-restore, thus I propose to add the read-only
      IFF_PERSIST one for this.
      
      Setting this new IFF_PERSIST is hardly possible, as TUNSETIFF doesn't check
      for unknown flags being zero and thus there can be trash.
      Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  8. 12 Jun, 2013 7 commits