1. 20 3月, 2014 4 次提交
  2. 08 3月, 2014 1 次提交
  3. 15 1月, 2014 3 次提交
  4. 14 1月, 2014 6 次提交
  5. 12 1月, 2014 1 次提交
  6. 11 1月, 2014 1 次提交
    • J
      net: core: explicitly select a txq before doing l2 forwarding · f663dd9a
      Jason Wang 提交于
      Currently, the tx queue were selected implicitly in ndo_dfwd_start_xmit(). The
      will cause several issues:
      
      - NETIF_F_LLTX were removed for macvlan, so txq lock were done for macvlan
        instead of lower device which misses the necessary txq synchronization for
        lower device such as txq stopping or frozen required by dev watchdog or
        control path.
      - dev_hard_start_xmit() was called with NULL txq which bypasses the net device
        watchdog.
      - dev_hard_start_xmit() does not check txq everywhere which will lead a crash
        when tso is disabled for lower device.
      
      Fix this by explicitly introducing a new param for .ndo_select_queue() for just
      selecting queues in the case of l2 forwarding offload. netdev_pick_tx() was also
      extended to accept this parameter and dev_queue_xmit_accel() was used to do l2
      forwarding transmission.
      
      With this fixes, NETIF_F_LLTX could be preserved for macvlan and there's no need
      to check txq against NULL in dev_hard_start_xmit(). Also there's no need to keep
      a dedicated ndo_dfwd_start_xmit() and we can just reuse the code of
      dev_queue_xmit() to do the transmission.
      
      In the future, it was also required for macvtap l2 forwarding support since it
      provides a necessary synchronization method.
      
      Cc: John Fastabend <john.r.fastabend@intel.com>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Cc: e1000-devel@lists.sourceforge.net
      Signed-off-by: NJason Wang <jasowang@redhat.com>
      Acked-by: NNeil Horman <nhorman@tuxdriver.com>
      Acked-by: NJohn Fastabend <john.r.fastabend@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f663dd9a
  7. 03 1月, 2014 1 次提交
    • W
      ipv4: fix tunneled VM traffic over hw VXLAN/GRE GSO NIC · 7a7ffbab
      Wei-Chun Chao 提交于
      VM to VM GSO traffic is broken if it goes through VXLAN or GRE
      tunnel and the physical NIC on the host supports hardware VXLAN/GRE
      GSO offload (e.g. bnx2x and next-gen mlx4).
      
      Two issues -
      (VXLAN) VM traffic has SKB_GSO_DODGY and SKB_GSO_UDP_TUNNEL with
      SKB_GSO_TCP/UDP set depending on the inner protocol. GSO header
      integrity check fails in udp4_ufo_fragment if inner protocol is
      TCP. Also gso_segs is calculated incorrectly using skb->len that
      includes tunnel header. Fix: robust check should only be applied
      to the inner packet.
      
      (VXLAN & GRE) Once GSO header integrity check passes, NULL segs
      is returned and the original skb is sent to hardware. However the
      tunnel header is already pulled. Fix: tunnel header needs to be
      restored so that hardware can perform GSO properly on the original
      packet.
      Signed-off-by: NWei-Chun Chao <weichunc@plumgrid.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7a7ffbab
  8. 01 1月, 2014 1 次提交
    • D
      vlan: Fix header ops passthru when doing TX VLAN offload. · 2205369a
      David S. Miller 提交于
      When the vlan code detects that the real device can do TX VLAN offloads
      in hardware, it tries to arrange for the real device's header_ops to
      be invoked directly.
      
      But it does so illegally, by simply hooking the real device's
      header_ops up to the VLAN device.
      
      This doesn't work because we will end up invoking a set of header_ops
      routines which expect a device type which matches the real device, but
      will see a VLAN device instead.
      
      Fix this by providing a pass-thru set of header_ops which will arrange
      to pass the proper real device instead.
      
      To facilitate this add a dev_rebuild_header().  There are
      implementations which provide a ->cache and ->create but not a
      ->rebuild (f.e. PLIP).  So we need a helper function just like
      dev_hard_header() to avoid crashes.
      
      Use this helper in the one existing place where the
      header_ops->rebuild was being invoked, the neighbour code.
      
      With lots of help from Florian Westphal.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2205369a
  9. 28 12月, 2013 1 次提交
  10. 25 12月, 2013 1 次提交
  11. 22 12月, 2013 1 次提交
    • B
      aio/migratepages: make aio migrate pages sane · 8e321fef
      Benjamin LaHaise 提交于
      The arbitrary restriction on page counts offered by the core
      migrate_page_move_mapping() code results in rather suspicious looking
      fiddling with page reference counts in the aio_migratepage() operation.
      To fix this, make migrate_page_move_mapping() take an extra_count parameter
      that allows aio to tell the code about its own reference count on the page
      being migrated.
      
      While cleaning up aio_migratepage(), make it validate that the old page
      being passed in is actually what aio_migratepage() expects to prevent
      misbehaviour in the case of races.
      Signed-off-by: NBenjamin LaHaise <bcrl@kvack.org>
      8e321fef
  12. 21 12月, 2013 2 次提交
  13. 19 12月, 2013 5 次提交
  14. 18 12月, 2013 1 次提交
  15. 17 12月, 2013 1 次提交
  16. 13 12月, 2013 5 次提交
  17. 12 12月, 2013 1 次提交
  18. 11 12月, 2013 4 次提交
    • P
      sched/fair: Rework sched_fair time accounting · 9dbdb155
      Peter Zijlstra 提交于
      Christian suffers from a bad BIOS that wrecks his i5's TSC sync. This
      results in him occasionally seeing time going backwards - which
      crashes the scheduler ...
      
      Most of our time accounting can actually handle that except the most
      common one; the tick time update of sched_fair.
      
      There is a further problem with that code; previously we assumed that
      because we get a tick every TICK_NSEC our time delta could never
      exceed 32bits and math was simpler.
      
      However, ever since Frederic managed to get NO_HZ_FULL merged; this is
      no longer the case since now a task can run for a long time indeed
      without getting a tick. It only takes about ~4.2 seconds to overflow
      our u32 in nanoseconds.
      
      This means we not only need to better deal with time going backwards;
      but also means we need to be able to deal with large deltas.
      
      This patch reworks the entire code and uses mul_u64_u32_shr() as
      proposed by Andy a long while ago.
      
      We express our virtual time scale factor in a u32 multiplier and shift
      right and the 32bit mul_u64_u32_shr() implementation reduces to a
      single 32x32->64 multiply if the time delta is still short (common
      case).
      
      For 64bit a 64x64->128 multiply can be used if ARCH_SUPPORTS_INT128.
      Reported-and-Tested-by: NChristian Engelmayer <cengelma@gmx.at>
      Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
      Cc: fweisbec@gmail.com
      Cc: Paul Turner <pjt@google.com>
      Cc: Stanislaw Gruszka <sgruszka@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Link: http://lkml.kernel.org/r/20131118172706.GI3866@twins.programming.kicks-ass.netSigned-off-by: NIngo Molnar <mingo@kernel.org>
      9dbdb155
    • P
      math64: Add mul_u64_u32_shr() · be5e610c
      Peter Zijlstra 提交于
      Introduce mul_u64_u32_shr() as proposed by Andy a while back; it
      allows using 64x64->128 muls on 64bit archs and recent GCC
      which defines __SIZEOF_INT128__ and __int128.
      
      (This new method will be used by the scheduler.)
      Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
      Cc: fweisbec@gmail.com
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/n/tip-hxjoeuzmrcaumR0uZwjpe2pv@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      be5e610c
    • P
      sched: Remove PREEMPT_NEED_RESCHED from generic code · ba1f14fb
      Peter Zijlstra 提交于
      While hunting a preemption issue with Alexander, Ben noticed that the
      currently generic PREEMPT_NEED_RESCHED stuff is horribly broken for
      load-store architectures.
      
      We currently rely on the IPI to fold TIF_NEED_RESCHED into
      PREEMPT_NEED_RESCHED, but when this IPI lands while we already have
      a load for the preempt-count but before the store, the store will erase
      the PREEMPT_NEED_RESCHED change.
      
      The current preempt-count only works on load-store archs because
      interrupts are assumed to be completely balanced wrt their preempt_count
      fiddling; the previous preempt_count load will match the preempt_count
      state after the interrupt and therefore nothing gets lost.
      
      This patch removes the PREEMPT_NEED_RESCHED usage from generic code and
      pushes it into x86 arch code; the generic code goes back to relying on
      TIF_NEED_RESCHED.
      
      Boot tested on x86_64 and compile tested on ppc64.
      Reported-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Reported-and-Tested-by: NAlexander Graf <agraf@suse.de>
      Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/20131128132641.GP10022@twins.programming.kicks-ass.netSigned-off-by: NIngo Molnar <mingo@kernel.org>
      ba1f14fb
    • S
      net: unix: allow set_peek_off to fail · 12663bfc
      Sasha Levin 提交于
      unix_dgram_recvmsg() will hold the readlock of the socket until recv
      is complete.
      
      In the same time, we may try to setsockopt(SO_PEEK_OFF) which will hang until
      unix_dgram_recvmsg() will complete (which can take a while) without allowing
      us to break out of it, triggering a hung task spew.
      
      Instead, allow set_peek_off to fail, this way userspace will not hang.
      Signed-off-by: NSasha Levin <sasha.levin@oracle.com>
      Acked-by: NPavel Emelyanov <xemul@parallels.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      12663bfc