1. 14 Sep, 2014 8 commits
  2. 11 Sep, 2014 1 commit
  3. 10 Sep, 2014 7 commits
    • net-timestamp: optimize sock_tx_timestamp default path · 67cc0d40
      Willem de Bruijn committed
      Few packets have timestamping enabled. Exit sock_tx_timestamp quickly
      in this common case.
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
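The early exit described in 67cc0d40 can be sketched in user-space C. This is only an illustration of the pattern, not the kernel code: the `demo_*` names and flag values are hypothetical, and `__builtin_expect` (a GCC/Clang builtin) stands in for the kernel's `unlikely()` hint.

```c
#include <stdint.h>

/* Hypothetical per-socket timestamping options (illustration only). */
#define DEMO_TSFLAG_TX_HARDWARE 0x1u
#define DEMO_TSFLAG_TX_SOFTWARE 0x2u

/* Hypothetical per-packet transmit flags (illustration only). */
#define DEMO_TX_HW_TSTAMP 0x1u
#define DEMO_TX_SW_TSTAMP 0x2u

struct demo_sock {
	uint32_t tsflags; /* zero for the vast majority of sockets */
};

/* Slow path: translate each enabled option into a per-packet flag. */
static void demo_tx_timestamp_slow(const struct demo_sock *sk, uint8_t *tx_flags)
{
	if (sk->tsflags & DEMO_TSFLAG_TX_HARDWARE)
		*tx_flags |= DEMO_TX_HW_TSTAMP;
	if (sk->tsflags & DEMO_TSFLAG_TX_SOFTWARE)
		*tx_flags |= DEMO_TX_SW_TSTAMP;
}

/* Fast path: a single test decides whether any work is needed at all. */
static inline void demo_tx_timestamp(const struct demo_sock *sk, uint8_t *tx_flags)
{
	if (__builtin_expect(sk->tsflags != 0, 0))
		demo_tx_timestamp_slow(sk, tx_flags);
}
```

With `tsflags == 0` the caller pays for one comparison and no function call, which is the common case the commit message refers to.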
    • net: bpf: be friendly to kmemcheck · 286aad3c
      Daniel Borkmann committed
      Reported by Mikulas Patocka, kmemcheck currently barks out a
      false positive since we don't have special kmemcheck annotation
      for bitfields used in bpf_prog structure.
      
      We currently have jited:1, len:31 and thus when accessing len
      while CONFIG_KMEMCHECK is enabled, kmemcheck throws a warning that
      we're reading uninitialized memory.
      
      As we don't need the whole bit universe for pages member, we
      can just split it to u16 and use a bool flag for jited instead
      of a bitfield.
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Acked-by: Alexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
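The change described in 286aad3c can be pictured with two hypothetical structs. This is a sketch under assumptions: the `demo_*` names and the exact member order are illustrative and not taken from the kernel sources; only the idea (a `u16` for pages, a plain `bool` for jited, no bitfields) comes from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Before (illustrative): flag and length share one word as bitfields,
 * so touching one field involves the storage of the other. */
struct demo_prog_bitfield {
	unsigned int jited:1;
	unsigned int len:31;
};

/* After (illustrative): every member is a whole object with its own
 * storage, so each one is initialized by its own plain store. */
struct demo_prog_split {
	uint16_t pages; /* the page count does not need a wider type */
	bool     jited; /* plain flag instead of a 1-bit field */
	uint32_t len;   /* instruction count */
};

/* Initialize every member explicitly; no partially written word remains. */
static void demo_prog_init(struct demo_prog_split *p, uint16_t pages, uint32_t len)
{
	p->pages = pages;
	p->jited = false;
	p->len = len;
}
```

A byte-granular checker can then track each member separately, which is presumably why the bitfield annotation is no longer needed.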
    • net: bpf: consolidate JIT binary allocator · 738cbe72
      Daniel Borkmann committed
      Commit 314beb9b ("x86: bpf_jit_comp: secure bpf jit against spraying
      attacks") added write protection for BPF JIT images and a random start
      address for the JIT code, so that it no longer sits on a page boundary.
      Commit aa2d2c73 ("s390/bpf,jit: address randomize and write protect jit
      code") later replicated this for the s390 architecture.
      
      Since both use a very similar allocator for the BPF binary header,
      we can consolidate this code into the BPF core, as it is mostly JIT
      independent anyway.
      
      This also allows future archs that support DEBUG_SET_MODULE_RONX
      to reuse the allocator instead of reimplementing it.
      
      JIT tested on x86_64 and s390x with BPF test suite.
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Acked-by: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
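The "random start address" idea from 738cbe72 can be sketched as plain arithmetic. This is a user-space illustration under assumptions, not the kernel allocator: the `demo_*` names, the 4096-byte page size, the 128-byte slack and the header layout are all hypothetical, and the random value is passed in by the caller.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE 4096u /* assumed page size */
#define DEMO_SLACK      128u /* assumed extra room for randomization */

/* Hypothetical header placed at the start of the page-aligned allocation. */
struct demo_binary_header {
	uint32_t pages;
	uint8_t  image[];
};

static size_t demo_round_up(size_t n, size_t to)
{
	return (n + to - 1) / to * to;
}

/*
 * Pick the offset of the JIT image inside header->image.
 * 'alignment' must be a power of two; 'rnd' is any random number.
 * The total allocation size is returned through 'alloc_size'.
 */
static size_t demo_pick_start(size_t proglen, size_t alignment,
			      uint32_t rnd, size_t *alloc_size)
{
	size_t hdr = sizeof(struct demo_binary_header);
	size_t size = demo_round_up(proglen + hdr + DEMO_SLACK, DEMO_PAGE_SIZE);
	size_t hole = size - (proglen + hdr); /* unused bytes, at least DEMO_SLACK */
	size_t max_hole = DEMO_PAGE_SIZE - hdr;

	if (hole > max_hole)
		hole = max_hole;

	*alloc_size = size;
	/* Random offset within the hole, rounded down to the alignment. */
	return (rnd % hole) & ~(alignment - 1);
}
```

Because the offset is always smaller than the hole, the header, the offset and the program together still fit inside the allocation, while the image no longer starts at a predictable page boundary.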
    • bridge: implement rtnl_link_ops->get_size and rtnl_link_ops->fill_info · e5c3ea5c
      Jiri Pirko committed
      Allow rtnetlink users to get bridge master info in the IFLA_INFO_DATA
      attribute. This initial part implements the forward_delay, hello_time
      and max_age options.
      Signed-off-by: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/ipv4: bind ip_nonlocal_bind to current netns · 49a60158
      Vincent Bernat committed
      The net.ipv4.ip_nonlocal_bind sysctl was global to all network
      namespaces. This patch allows setting a different value for each
      network namespace.
      Signed-off-by: Vincent Bernat <vincent@bernat.im>
      Signed-off-by: David S. Miller <davem@davemloft.net>
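The effect of 49a60158 can be illustrated by moving a setting from a global variable into a per-namespace struct. This is a deliberately simplified sketch: `demo_net` and `demo_may_bind` are hypothetical names, and the real bind path takes more conditions into account than the two shown here.

```c
#include <stdbool.h>

/* Hypothetical per-namespace state; each namespace has its own copy. */
struct demo_net {
	int sysctl_ip_nonlocal_bind;
};

/*
 * Simplified bind check: a local address is always fine; a non-local one
 * is accepted only if the namespace of the calling socket allows it.
 */
static bool demo_may_bind(const struct demo_net *net, bool addr_is_local)
{
	return addr_is_local || net->sysctl_ip_nonlocal_bind != 0;
}
```

Two namespaces can now give different answers for the same non-local address, because the check consults the caller's namespace rather than one host-wide value.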
    • net: filter: split filter.h and expose eBPF to user space · daedfb22
      Alexei Starovoitov committed
      allow user space to generate eBPF programs
      
      uapi/linux/bpf.h: eBPF instruction set definition
      
      linux/filter.h: the rest
      
      This patch only moves macro definitions, but in practice it freezes the existing
      eBPF instruction set, though new instructions can still be added in the future.
      
      These eBPF definitions cannot go into uapi/linux/filter.h, since the names
      may conflict with existing applications.
      
      Full eBPF ISA description is in Documentation/networking/filter.txt
      Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
      Acked-by: Daniel Borkmann <dborkman@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: filter: add "load 64-bit immediate" eBPF instruction · 02ab695b
      Alexei Starovoitov committed
      Add the BPF_LD_IMM64 instruction to load a 64-bit immediate value into a register.
      All previous instructions were 8 bytes long. This is the first 16-byte instruction.
      Two consecutive 'struct bpf_insn' blocks are interpreted as a single instruction:
      insn[0].code = BPF_LD | BPF_DW | BPF_IMM
      insn[0].dst_reg = destination register
      insn[0].imm = lower 32-bit
      insn[1].code = 0
      insn[1].imm = upper 32-bit
      All unused fields must be zero.
      
      Classic BPF has similar instruction: BPF_LD | BPF_W | BPF_IMM
      which loads 32-bit immediate value into a register.
      
      x64 JITs it as a single 'movabsq $imm64, %rax'
      arm64 may JIT it as a sequence of four 'movk x0, #imm16, lsl #shift' insns
      
      Note that old eBPF programs are binary compatible with new interpreter.
      
      It helps eBPF programs load 64-bit constant into a register with one
      instruction instead of using two registers and 4 instructions:
      BPF_MOV32_IMM(R1, imm32)
      BPF_ALU64_IMM(BPF_LSH, R1, 32)
      BPF_MOV32_IMM(R2, imm32)
      BPF_ALU64_REG(BPF_OR, R1, R2)
      
      User space generated programs will use this instruction to load constants only.
      
      To tell the kernel that user space needs a pointer, a _pseudo_ variant of
      this instruction may be added later, which will use extra bits of the encoding
      to indicate what type of pointer user space is asking the kernel to provide.
      For example, the 'off' or 'src_reg' fields can be used for this purpose:
      src_reg = 1 could mean that user space is asking the kernel to validate and
      load an in-kernel map pointer.
      src_reg = 2 could mean that user space needs a read-only data section pointer.
      src_reg = 3 could mean that user space needs a pointer to per-cpu local data.
      All such future pseudo instructions will not carry the actual pointer
      as part of the instruction, but rather will be treated as a request to the kernel
      to provide one. The kernel will verify the request_for_a_pointer, then
      drop the _pseudo_ marking and store the actual internal pointer inside
      the instruction, so the end result is that the interpreter and JITs never
      see pseudo BPF_LD_IMM64 insns and only operate on the generic BPF_LD_IMM64 that
      loads a 64-bit immediate into a register. User space never operates on direct
      pointers, and the verifier can easily recognize request_for_pointer vs. other
      instructions.
      Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
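The two-slot encoding listed in the 02ab695b message can be sketched as follows. This is an illustration, not kernel code: `demo_insn` is a hypothetical struct meant to mirror the 8-byte `struct bpf_insn` layout, the `DEMO_BPF_*` constants are assumed to match the corresponding opcode field values, and the helper names are made up for this example.

```c
#include <stdint.h>

/* Assumed opcode field values for the LD class, DW size and IMM mode. */
#define DEMO_BPF_LD  0x00
#define DEMO_BPF_DW  0x18
#define DEMO_BPF_IMM 0x00

/* Hypothetical mirror of an 8-byte eBPF instruction slot.
 * (uint8_t bitfields are accepted by GCC and Clang.) */
struct demo_insn {
	uint8_t code;      /* opcode */
	uint8_t dst_reg:4; /* destination register */
	uint8_t src_reg:4; /* source register */
	int16_t off;       /* signed offset */
	int32_t imm;       /* signed 32-bit immediate */
};

/* Encode a 64-bit constant into two consecutive slots. */
static void demo_ld_imm64(struct demo_insn out[2], uint8_t dst, uint64_t value)
{
	out[0] = (struct demo_insn){
		.code = DEMO_BPF_LD | DEMO_BPF_DW | DEMO_BPF_IMM,
		.dst_reg = dst & 0xf,
		.imm = (int32_t)(uint32_t)value,         /* lower 32 bits */
	};
	out[1] = (struct demo_insn){
		.code = 0,                               /* unused fields stay zero */
		.imm = (int32_t)(uint32_t)(value >> 32), /* upper 32 bits */
	};
}

/* Reassemble the constant the way an interpreter would read it. */
static uint64_t demo_ld_imm64_value(const struct demo_insn in[2])
{
	return (uint64_t)(uint32_t)in[0].imm |
	       ((uint64_t)(uint32_t)in[1].imm << 32);
}
```

The decode step is the same shift-and-or that the four-instruction sequence in the message performs with two registers; the new instruction lets the interpreter or a JIT do it in one step.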
  4. 09 Sep, 2014 7 commits
  5. 06 Sep, 2014 11 commits
  6. 05 Sep, 2014 4 commits
    • ipv4: implement igmp_qrv sysctl to tune igmp robustness variable · a9fe8e29
      Hannes Frederic Sowa committed
      As in IPv6, people might want to increase the IGMP query robustness variable
      to make sure unsolicited state change reports aren't lost on the network.
      Add and document this new knob in the IGMP code.
      
      RFCs have allowed tuning this parameter since the first IGMP RFC, so we
      also use this setting for all counters, including source-specific multicast.
      
      Also take over the sysctl value when bringing the interface up, and don't
      reuse the last one seen on the interface.
      
      Cc: Flavio Leitner <fbl@redhat.com>
      Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Acked-by: Flavio Leitner <fbl@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: add sysctl_mld_qrv to configure query robustness variable · 2f711939
      Hannes Frederic Sowa committed
      This patch adds a new sysctl_mld_qrv knob to configure the MLDv1/v2 query
      robustness variable. It specifies how many times unsolicited MLD reports
      are retransmitted. Admins might want to tune this on lossy links.
      
      Also reset mld state on interface down/up, so we pick up new sysctl
      settings during interface up event.
      
      IPv6 certification requests this knob to be available.
      
      I didn't make this knob netns specific, as it is mostly a setting in a
      physical environment and should be per host.
      
      Cc: Flavio Leitner <fbl@redhat.com>
      Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Acked-by: Flavio Leitner <fbl@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • nohz: Restore NMI safe local irq work for local nohz kick · 40bea039
      Frederic Weisbecker committed
      The local nohz kick is currently used by perf which needs it to be
      NMI-safe. Recent commit though (7d1311b9)
      changed its implementation to fire the local kick using the remote kick
      API. It was convenient to make the code more generic but the remote kick
      isn't NMI-safe.
      
      As a result:
      
      	WARNING: CPU: 3 PID: 18062 at kernel/irq_work.c:72 irq_work_queue_on+0x11e/0x140()
      	CPU: 3 PID: 18062 Comm: trinity-subchil Not tainted 3.16.0+ #34
      	0000000000000009 00000000903774d1 ffff880244e06c00 ffffffff9a7f1e37
      	0000000000000000 ffff880244e06c38 ffffffff9a0791dd ffff880244fce180
      	0000000000000003 ffff880244e06d58 ffff880244e06ef8 0000000000000000
      	Call Trace:
      	<NMI>  [<ffffffff9a7f1e37>] dump_stack+0x4e/0x7a
      	[<ffffffff9a0791dd>] warn_slowpath_common+0x7d/0xa0
      	[<ffffffff9a07930a>] warn_slowpath_null+0x1a/0x20
      	[<ffffffff9a17ca1e>] irq_work_queue_on+0x11e/0x140
      	[<ffffffff9a10a2c7>] tick_nohz_full_kick_cpu+0x57/0x90
      	[<ffffffff9a186cd5>] __perf_event_overflow+0x275/0x350
      	[<ffffffff9a184f80>] ? perf_event_task_disable+0xa0/0xa0
      	[<ffffffff9a01a4cf>] ? x86_perf_event_set_period+0xbf/0x150
      	[<ffffffff9a187934>] perf_event_overflow+0x14/0x20
      	[<ffffffff9a020386>] intel_pmu_handle_irq+0x206/0x410
      	[<ffffffff9a0b54d3>] ? arch_vtime_task_switch+0x63/0x130
      	[<ffffffff9a01937b>] perf_event_nmi_handler+0x2b/0x50
      	[<ffffffff9a007b72>] nmi_handle+0xd2/0x390
      	[<ffffffff9a007aa5>] ? nmi_handle+0x5/0x390
      	[<ffffffff9a0d131b>] ? lock_release+0xab/0x330
      	[<ffffffff9a008062>] default_do_nmi+0x72/0x1c0
      	[<ffffffff9a0c925f>] ? cpuacct_account_field+0xcf/0x200
      	[<ffffffff9a008268>] do_nmi+0xb8/0x100
      
      Let's fix this by restoring the use of local irq work for the nohz local
      kick.
      Reported-by: Catalin Iacob <iacobcatalin@gmail.com>
      Reported-and-tested-by: Dave Jones <davej@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
    • bcma: get info about flash type SoC booted from · 87fed556
      Rafał Miłecki committed
      There is ongoing work on cleaning up MIPS's nvram support so that it can be
      re-used on other platforms (bcm53xx, to be precise).
      This requires a bit of extra logic in bcma, which this patch implements.
      Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
      Signed-off-by: John W. Linville <linville@tuxdriver.com>
  7. 04 Sep, 2014 2 commits
    • lib/rhashtable: allow user to set the minimum shifts of shrinking · 94000176
      Ying Xue committed
      Although the rhashtable library allows the user to specify a quite big
      size for a newly created hash table, the table may be shrunk to a
      very small size, HASH_MIN_SIZE (4), the first time an object is removed
      from it. Subsequently, even if the number of objects stored in the table
      stays well below the user's initial setting for a long time, the table
      size is still adjusted dynamically by rhashtable_shrink() or
      rhashtable_expand() each time an object is inserted or removed. As
      synchronize_rcu() has to be called whenever the table is shrunk or
      expanded by these two functions, we should permit the user to set the
      minimum table size by configuring the minimum number of shifts
      according to the user's requirements, avoiding the expensive
      synchronize_rcu() calls that shrinking and expanding incur.
      Signed-off-by: Ying Xue <ying.xue@windriver.com>
      Acked-by: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
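The floor described in 94000176 can be sketched as a guard in front of the shrink decision. This is an illustration under assumptions: `demo_table`, its fields and the 30% load threshold are hypothetical and chosen for the example; only the idea of refusing to shrink below a user-configured minimum shift comes from the commit message.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical table state: the bucket count is 1 << shift. */
struct demo_table {
	unsigned int shift;     /* current size exponent */
	unsigned int min_shift; /* user-configured floor */
	size_t       nelems;    /* stored objects */
};

/* Decide whether a removal should trigger a (costly) shrink. */
static bool demo_should_shrink(const struct demo_table *t)
{
	size_t size = (size_t)1 << t->shift;

	if (t->shift <= t->min_shift)
		return false; /* floor reached: skip the resize entirely */

	return t->nelems < size * 3 / 10; /* assumed 30% load threshold */
}

/* Remove one object and halve the table if the policy allows it. */
static void demo_remove(struct demo_table *t)
{
	if (t->nelems > 0)
		t->nelems--;
	if (demo_should_shrink(t))
		t->shift--;
}
```

A table created with `min_shift` equal to its initial shift never shrinks, so a workload that repeatedly fills and empties it avoids the resize and the synchronization that comes with it.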
    • ACPI / scan: not cache _SUN value in struct acpi_device_pnp · a383b68d
      Yasuaki Ishimatsu committed
      The _SUN device identification object is not guaranteed to return
      the same value every time it is executed, so we should not cache its
      return value, but rather execute it every time as needed.  If it is
      cached, an incorrect stale value may be used in some situations.
      
      This issue was exposed by commit 202317a5 (ACPI / scan: Add
      acpi_device objects for all device nodes in the namespace).  Fix it
      by not caching the return value of _SUN.
      
      Fixes: 202317a5 (ACPI / scan: Add acpi_device objects for all device nodes in the namespace)
      Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: 3.14+ <stable@vger.kernel.org> # 3.14+
      [ rjw: Changelog ]
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
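The "evaluate every time instead of caching" approach from a383b68d can be sketched with a callback. This is a user-space illustration only: `demo_device`, `demo_read_sun` and `demo_slot_eval` are hypothetical names, and the callback stands in for whatever actually evaluates the _SUN object.

```c
/* Hypothetical device: it stores how to evaluate the value, not the value. */
struct demo_device {
	unsigned long (*eval_sun)(void *ctx); /* stand-in for evaluating _SUN */
	void *ctx;
};

/* Evaluate on every read, so a changed result is never hidden by a cache. */
static unsigned long demo_read_sun(const struct demo_device *dev)
{
	return dev->eval_sun(dev->ctx);
}

/* Example evaluator: reports whatever the backing slot currently holds. */
static unsigned long demo_slot_eval(void *ctx)
{
	const unsigned long *slot = ctx;

	return *slot;
}
```

If the backing value changes between two reads, the second read reflects it, whereas a value copied into the device struct at scan time would stay stale.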