1. 06 8月, 2013 1 次提交
    • F
      sctp: Pack dst_cookie into 1st cacheline hole for 64bit host · 5a139296
      fan.du 提交于
      As dst_cookie is used in fast path sctp_transport_dst_check.
      
      Before:
      struct sctp_transport {
      	struct list_head           transports;           /*     0    16 */
      	atomic_t                   refcnt;               /*    16     4 */
      	__u32                      dead:1;               /*    20:31  4 */
      	__u32                      rto_pending:1;        /*    20:30  4 */
      	__u32                      hb_sent:1;            /*    20:29  4 */
      	__u32                      pmtu_pending:1;       /*    20:28  4 */
      
      	/* XXX 28 bits hole, try to pack */
      
      	__u32                      sack_generation;      /*    24     4 */
      
      	/* XXX 4 bytes hole, try to pack */
      
      	struct flowi               fl;                   /*    32    64 */
      	/* --- cacheline 1 boundary (64 bytes) was 32 bytes ago --- */
      	union sctp_addr            ipaddr;               /*    96    28 */
      
      After:
      struct sctp_transport {
      	struct list_head           transports;           /*     0    16 */
      	atomic_t                   refcnt;               /*    16     4 */
      	__u32                      dead:1;               /*    20:31  4 */
      	__u32                      rto_pending:1;        /*    20:30  4 */
      	__u32                      hb_sent:1;            /*    20:29  4 */
      	__u32                      pmtu_pending:1;       /*    20:28  4 */
      
      	/* XXX 28 bits hole, try to pack */
      
      	__u32                      sack_generation;      /*    24     4 */
      	u32                        dst_cookie;           /*    28     4 */
      	struct flowi               fl;                   /*    32    64 */
      	/* --- cacheline 1 boundary (64 bytes) was 32 bytes ago --- */
      	union sctp_addr            ipaddr;               /*    96    28 */
      Signed-off-by: NFan Du <fan.du@windriver.com>
      Acked-by: NNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5a139296
  2. 04 8月, 2013 2 次提交
  3. 03 8月, 2013 4 次提交
  4. 02 8月, 2013 4 次提交
    • C
      net: rename CONFIG_NET_LL_RX_POLL to CONFIG_NET_RX_BUSY_POLL · e0d1095a
      Cong Wang 提交于
      Eliezer renames several *ll_poll to *busy_poll, but forgets
      CONFIG_NET_LL_RX_POLL, so in case of confusion, rename it too.
      
      Cc: Eliezer Tamir <eliezer.tamir@linux.intel.com>
      Cc: David S. Miller <davem@davemloft.net>
      Signed-off-by: NCong Wang <amwang@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e0d1095a
    • C
      net: fix a compile error when CONFIG_NET_LL_RX_POLL is not set · dfcefb0b
      Cong Wang 提交于
      When CONFIG_NET_LL_RX_POLL is not set, I got:
      
      net/socket.c: In function ‘sock_poll’:
      net/socket.c:1165:4: error: implicit declaration of function ‘sk_busy_loop’ [-Werror=implicit-function-declaration]
      
      Fix this by adding a nop when !CONFIG_NET_LL_RX_POLL.
      
      Cc: Eliezer Tamir <eliezer.tamir@linux.intel.com>
      Cc: David S. Miller <davem@davemloft.net>
      Signed-off-by: NCong Wang <amwang@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      dfcefb0b
    • M
      ipv6: prevent fib6_run_gc() contention · 2ac3ac8f
      Michal Kubeček 提交于
      On a high-traffic router with many processors and many IPv6 dst
      entries, soft lockup in fib6_run_gc() can occur when number of
      entries reaches gc_thresh.
      
      This happens because fib6_run_gc() uses fib6_gc_lock to allow
      only one thread to run the garbage collector but ip6_dst_gc()
      doesn't update net->ipv6.ip6_rt_last_gc until fib6_run_gc()
      returns. On a system with many entries, this can take some time
      so that in the meantime, other threads pass the tests in
      ip6_dst_gc() (ip6_rt_last_gc is still not updated) and wait for
      the lock. They then have to run the garbage collector one after
      another which blocks them for quite long.
      
      Resolve this by replacing special value ~0UL of expire parameter
      to fib6_run_gc() by explicit "force" parameter to choose between
      spin_lock_bh() and spin_trylock_bh() and call fib6_run_gc() with
      force=false if gc_thresh is reached but not max_size.
      Signed-off-by: NMichal Kubecek <mkubecek@suse.cz>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2ac3ac8f
    • E
      net: add a temporary sanity check in skb_orphan() · 376c7311
      Eric Dumazet 提交于
      David suggested to add a BUG_ON() to catch if some layer
      sets skb->sk pointer without a corresponding destructor.
      
      As skb can sit in a queue, it's mandatory to make sure the
      socket cannot disappear, and it's usually done by taking a
      reference on the socket, then releasing it from the skb
      destructor.
      
      This patch is a follow-up to commit c34a7612
      ("net: skb_orphan() changes") and will be reverted after
      catching all possible offenders if any.
      Suggested-by: NDavid Miller <davem@davemloft.net>
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      376c7311
  5. 01 8月, 2013 18 次提交
  6. 31 7月, 2013 7 次提交
  7. 30 7月, 2013 1 次提交
    • C
      freezer: set PF_SUSPEND_TASK flag on tasks that call freeze_processes · 2b44c4db
      Colin Cross 提交于
      Calling freeze_processes sets a global flag that will cause any
      process that calls try_to_freeze to enter the refrigerator.  It
      skips sending a signal to the current task, but if the current
      task ever hits try_to_freeze, all threads will be frozen and the
      system will deadlock.
      
      Set a new flag, PF_SUSPEND_TASK, on the task that calls
      freeze_processes.  The flag notifies the freezer that the thread
      is involved in suspend and should not be frozen.  Also add a
      WARN_ON in thaw_processes if the caller does not have the
      PF_SUSPEND_TASK flag set to catch if a different task calls
      thaw_processes than the one that called freeze_processes, leaving
      a task with PF_SUSPEND_TASK permanently set on it.
      
      Threads that spawn off a task with PF_SUSPEND_TASK set (which
      swsusp does) will also have PF_SUSPEND_TASK set, preventing them
      from freezing while they are helping with suspend, but they need
      to be dead by the time suspend is triggered, otherwise they may
      run when userspace is expected to be frozen.  Add a WARN_ON in
      thaw_processes if more than one thread has the PF_SUSPEND_TASK
      flag set.
      Reported-and-tested-by: NMichael Leun <lkml20130126@newton.leun.net>
      Signed-off-by: NColin Cross <ccross@android.com>
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      2b44c4db
  8. 29 7月, 2013 2 次提交
    • R
      Revert "cpuidle: Quickly notice prediction failure for repeat mode" · 14851912
      Rafael J. Wysocki 提交于
      Revert commit 69a37bea (cpuidle: Quickly notice prediction failure for
      repeat mode), because it has been identified as the source of a
      significant performance regression in v3.8 and later as explained by
      Jeremy Eder:
      
        We believe we've identified a particular commit to the cpuidle code
        that seems to be impacting performance of variety of workloads.
        The simplest way to reproduce is using netperf TCP_RR test, so
        we're using that, on a pair of Sandy Bridge based servers.  We also
        have data from a large database setup where performance is also
        measurably/positively impacted, though that test data isn't easily
        share-able.
      
        Included below are test results from 3 test kernels:
      
        kernel       reverts
        -----------------------------------------------------------
        1) vanilla   upstream (no reverts)
      
        2) perfteam2 reverts e11538d1
      
        3) test      reverts 69a37bea
                             e11538d1
      
        In summary, netperf TCP_RR numbers improve by approximately 4%
        after reverting 69a37bea.  When
        69a37bea is included, C0 residency
        never seems to get above 40%.  Taking that patch out gets C0 near
        100% quite often, and performance increases.
      
        The below data are histograms representing the %c0 residency @
        1-second sample rates (using turbostat), while under netperf test.
      
        - If you look at the first 4 histograms, you can see %c0 residency
          almost entirely in the 30,40% bin.
        - The last pair, which reverts 69a37bea,
          shows %c0 in the 80,90,100% bins.
      
        Below each kernel name are netperf TCP_RR trans/s numbers for the
        particular kernel that can be disclosed publicly, comparing the 3
        test kernels.  We ran a 4th test with the vanilla kernel where
        we've also set /dev/cpu_dma_latency=0 to show overall impact
        boosting single-threaded TCP_RR performance over 11% above
        baseline.
      
        3.10-rc2 vanilla RX + c0 lock (/dev/cpu_dma_latency=0):
        TCP_RR trans/s 54323.78
      
        -----------------------------------------------------------
        3.10-rc2 vanilla RX (no reverts)
        TCP_RR trans/s 48192.47
      
        Receiver %c0
            0.0000 -    10.0000 [     1]: *
           10.0000 -    20.0000 [     0]:
           20.0000 -    30.0000 [     0]:
           30.0000 -    40.0000 [    59]:
        ***********************************************************
           40.0000 -    50.0000 [     1]: *
           50.0000 -    60.0000 [     0]:
           60.0000 -    70.0000 [     0]:
           70.0000 -    80.0000 [     0]:
           80.0000 -    90.0000 [     0]:
           90.0000 -   100.0000 [     0]:
      
        Sender %c0
            0.0000 -    10.0000 [     1]: *
           10.0000 -    20.0000 [     0]:
           20.0000 -    30.0000 [     0]:
           30.0000 -    40.0000 [    11]: ***********
           40.0000 -    50.0000 [    49]:
        *************************************************
           50.0000 -    60.0000 [     0]:
           60.0000 -    70.0000 [     0]:
           70.0000 -    80.0000 [     0]:
           80.0000 -    90.0000 [     0]:
           90.0000 -   100.0000 [     0]:
      
        -----------------------------------------------------------
        3.10-rc2 perfteam2 RX (reverts commit
        e11538d1)
        TCP_RR trans/s 49698.69
      
        Receiver %c0
            0.0000 -    10.0000 [     1]: *
           10.0000 -    20.0000 [     1]: *
           20.0000 -    30.0000 [     0]:
           30.0000 -    40.0000 [    59]:
        ***********************************************************
           40.0000 -    50.0000 [     0]:
           50.0000 -    60.0000 [     0]:
           60.0000 -    70.0000 [     0]:
           70.0000 -    80.0000 [     0]:
           80.0000 -    90.0000 [     0]:
           90.0000 -   100.0000 [     0]:
      
        Sender %c0
            0.0000 -    10.0000 [     1]: *
           10.0000 -    20.0000 [     0]:
           20.0000 -    30.0000 [     0]:
           30.0000 -    40.0000 [     2]: **
           40.0000 -    50.0000 [    58]:
        **********************************************************
           50.0000 -    60.0000 [     0]:
           60.0000 -    70.0000 [     0]:
           70.0000 -    80.0000 [     0]:
           80.0000 -    90.0000 [     0]:
           90.0000 -   100.0000 [     0]:
      
        -----------------------------------------------------------
        3.10-rc2 test RX (reverts 69a37bea
        and e11538d1)
        TCP_RR trans/s 47766.95
      
        Receiver %c0
            0.0000 -    10.0000 [     1]: *
           10.0000 -    20.0000 [     1]: *
           20.0000 -    30.0000 [     0]:
           30.0000 -    40.0000 [    27]: ***************************
           40.0000 -    50.0000 [     2]: **
           50.0000 -    60.0000 [     0]:
           60.0000 -    70.0000 [     2]: **
           70.0000 -    80.0000 [     0]:
           80.0000 -    90.0000 [     0]:
           90.0000 -   100.0000 [    28]: ****************************
      
        Sender:
            0.0000 -    10.0000 [     1]: *
           10.0000 -    20.0000 [     0]:
           20.0000 -    30.0000 [     0]:
           30.0000 -    40.0000 [    11]: ***********
           40.0000 -    50.0000 [     0]:
           50.0000 -    60.0000 [     1]: *
           60.0000 -    70.0000 [     0]:
           70.0000 -    80.0000 [     3]: ***
           80.0000 -    90.0000 [     7]: *******
           90.0000 -   100.0000 [    38]: **************************************
      
        These results demonstrate gaining back the tendency of the CPU to
        stay in more responsive, performant C-states (and thus yield
        measurably better performance), by reverting commit
        69a37bea.
      Requested-by: NJeremy Eder <jeder@redhat.com>
      Tested-by: NLen Brown <len.brown@intel.com>
      Cc: 3.8+ <stable@vger.kernel.org>
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      14851912
    • Y
      net/mlx4_core: Respond to operation request by firmware · fe6f700d
      Yevgeny Petrilin 提交于
      This commit adds new firmware command and new firmware event.  The firmware
      raises the MLX4_EVENT_TYPE_OP_REQUIRED event in order to signal the driver it
      needs to perform an administrative operation throughout the MLX4_CMD_GET_OP_REQ
      command. At the moment the supported operation is adding/removing multicast
      entries which are used by the firmware for handling NCSI traffic in B0
      steering mode.
      
      Also, had to swap the order of mlx4_init_mcg_table() and
      mlx4_init_eq_table() to make sure that driver will get events only after
      resources are initialized to handle it.
      Signed-off-by: NYevgeny Petrilin <yevgenyp@mellanox.com>
      Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.com>
      Signed-off-by: NEugenia Emantayev <eugenia@mellanox.com>
      Signed-off-by: NAmir Vadai <amirv@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      fe6f700d
  9. 28 7月, 2013 1 次提交