1. 19 Oct, 2011 1 commit
  2. 18 Oct, 2011 1 commit
  3. 17 Oct, 2011 1 commit
    • G
      if_link: Add additional parameter to IFLA_VF_INFO for spoof checking · 5f8444a3
      Greg Rose committed
      Add configuration setting for drivers to turn spoof checking on or off
      for discrete VFs.
      
      v2 - Fix indentation problem, wrap the ifla_vf_info structure in
           #ifdef __KERNEL__ to prevent user space from accessing it, and
           change the function parameter for the spoof check setting netdev
           op from u8 to bool.
      v3 - Preset spoof check setting to -1 so that user space tools such
           as ip can detect that the driver didn't report a spoofcheck
           setting.  Prevents incorrect display of spoof check settings
           for drivers that don't report it.
      Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
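The v3 note above describes a tri-state convention: -1 means the driver never reported a spoof check setting. A minimal sketch of how a user-space tool might honor it follows. The struct and function names are hypothetical and are not the kernel's `ifla_vf_info` or the code in `ip`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space view of one VF; not the kernel's ifla_vf_info. */
struct vf_settings {
    int vf;        /* VF index */
    int spoofchk;  /* -1: not reported by driver, 0: off, 1: on */
};

/* Preset to "not reported" so a silent driver is distinguishable from "off". */
static void vf_settings_init(struct vf_settings *s, int vf)
{
    assert(s != NULL);
    s->vf = vf;
    s->spoofchk = -1;
}

/* Text to display; empty when the driver did not report a setting. */
static const char *spoofchk_to_str(const struct vf_settings *s)
{
    assert(s != NULL);
    if (s->spoofchk < 0)
        return "";
    return s->spoofchk ? "spoof checking on" : "spoof checking off";
}
```

Presetting the field before the driver fills it in is what lets the display code skip the setting instead of printing a misleading "off".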
  4. 15 Oct, 2011 2 commits
  5. 14 Oct, 2011 1 commit
    • E
      net: more accurate skb truesize · 87fb4b7b
      Eric Dumazet committed
      skb truesize currently accounts for sk_buff struct and part of skb head.
      kmalloc() roundings are also ignored.
      
      Considering that skb_shared_info is larger than sk_buff, it's time to
      take it into account for better memory accounting.
      
      This patch introduces SKB_TRUESIZE(X) macro to centralize various
      assumptions into a single place.
      
      At skb alloc phase, we put skb_shared_info struct at the exact end of
      skb head, to allow a better use of memory (lowering number of
      reallocations), since kmalloc() gives us power-of-two memory blocks.
      
      Unless SLUB/SLUB debug is active, both skb->head and skb_shared_info are
      aligned to cache lines, as before.
      
      Note: This patch might trigger performance regressions because of
      misconfigured protocol stacks, hitting per socket or global memory
      limits that were previously not reached. But it's a necessary step for
      more accurate memory accounting.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      CC: Andi Kleen <ak@linux.intel.com>
      CC: Ben Hutchings <bhutchings@solarflare.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
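The accounting idea behind SKB_TRUESIZE(X) can be sketched in user space: charge the payload plus the cache-line-aligned sizes of both bookkeeping structures. The struct sizes, the 64-byte cache line, and the macro names below are assumptions for illustration, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_LINE 64  /* assumed cache line size */

/* Round x up to the next multiple of a (a must be a power of two). */
#define ALIGN_UP(x, a)  (((size_t)(x) + ((size_t)(a) - 1)) & ~((size_t)(a) - 1))
#define DATA_ALIGN(x)   ALIGN_UP((x), CACHE_LINE)

/* Stand-in structures with made-up sizes. */
struct fake_sk_buff     { char pad[232]; };
struct fake_shared_info { char pad[320]; };

/* Payload plus the aligned overhead of both structures. */
#define TRUESIZE(x) ((size_t)(x) + \
                     DATA_ALIGN(sizeof(struct fake_sk_buff)) + \
                     DATA_ALIGN(sizeof(struct fake_shared_info)))

/* Function form, convenient for callers that want a checked argument. */
static size_t truesize(size_t payload)
{
    assert(payload <= (size_t)-1 - 1024);  /* guard against overflow */
    return TRUESIZE(payload);
}
```

With these stand-in sizes the fixed overhead is 256 + 320 = 576 bytes, so a 1000-byte payload is charged 1576 bytes.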
  6. 13 Oct, 2011 1 commit
  7. 12 Oct, 2011 2 commits
  8. 07 Oct, 2011 3 commits
  9. 05 Oct, 2011 1 commit
  10. 04 Oct, 2011 7 commits
  11. 01 Oct, 2011 13 commits
    • J
      mac80211: document client powersave · 4b801bc9
      Johannes Berg committed
      With the addition of uAPSD and driver buffering
      the powersave handling has gotten quite complex.
      Add a section to the documentation to explain it
      for anyone wanting to implement it.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: John W. Linville <linville@tuxdriver.com>
    • J
      mac80211: allow out-of-band EOSP notification · 37fbd908
      Johannes Berg committed
      iwlwifi has a separate EOSP notification from
      the device, and to make use of that properly
      it needs to be passed to mac80211. To be able
      to mix with tx_status_irqsafe and rx_irqsafe
      it also needs to be an "_irqsafe" version in
      the sense that it goes through the tasklet;
      the actual flag clearing would be IRQ-safe,
      but doing it directly would cause reordering
      issues.
      
      This is needed in the case of a P2P GO going
      into an absence period without transmitting
      any frames that should be driver-released as
      in this case there's no other way to inform
      mac80211 that the service period ended. Note
      that for drivers that don't use the _irqsafe
      functions another version of this function
      will be required.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: John W. Linville <linville@tuxdriver.com>
    • J
      mac80211: explicitly notify drivers of frame release · 40b96408
      Johannes Berg committed
      iwlwifi needs to know the number of frames that are
      going to be sent to a station while it is asleep so
      it can properly handle the uCode blocking of that
      station.
      
      Before uAPSD, we got by by telling the device that
      a single frame was going to be released whenever we
      encountered IEEE80211_TX_CTL_POLL_RESPONSE. With
      uAPSD, however, that is no longer possible since
      there could be more than a single frame.
      
      To support this model, add a new callback to notify
      drivers when frames are going to be released.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: John W. Linville <linville@tuxdriver.com>
    • J
      mac80211: reply only once to each PS-poll · deeaee19
      Johannes Berg committed
      If a PS-poll frame is retried (but was received)
      there is no way to detect that since it has no
      sequence number. As a consequence, the standard
      asks us to not react to PS-poll frames until the
      response to one made it out (was ACKed or lost).
      
      Implement this by using the WLAN_STA_SP flags to
      also indicate a PS-Poll "service period" and the
      IEEE80211_TX_STATUS_EOSP flag for the response
      packet to indicate the end of the "SP" as usual.
      
      We could use separate flags, but that will most
      likely completely confuse drivers, and while the
      standard doesn't exclude simultaneously polling
      using uAPSD and PS-Poll, doing that seems quite
      problematic.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: John W. Linville <linville@tuxdriver.com>
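The "reply only once" rule above amounts to a one-bit state machine per station. A minimal sketch follows, in which a hypothetical `in_service_period` field stands in for the WLAN_STA_SP flag and the function names are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-station state; stands in for the WLAN_STA_SP flag. */
struct sta {
    bool in_service_period;
};

/* Returns true if this PS-Poll should be answered with a buffered frame. */
static bool handle_ps_poll(struct sta *sta)
{
    assert(sta != NULL);
    if (sta->in_service_period)
        return false;               /* response still in flight: ignore retry */
    sta->in_service_period = true;  /* open the PS-Poll "service period" */
    return true;
}

/* Called when TX status for the response arrives (ACKed or lost). */
static void response_tx_done(struct sta *sta)
{
    assert(sta != NULL);
    sta->in_service_period = false; /* end of "SP": next PS-Poll is honored */
}
```

Because a retried PS-Poll carries no sequence number, the only safe discriminator is whether the previous response has completed, which is what the flag records.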
    • J
      mac80211: implement uAPSD · 47086fc5
      Johannes Berg committed
      Add uAPSD support to mac80211. This is probably not
      possible with all devices, so advertising it with
      the cfg80211 flag will be left up to drivers that
      want it.
      
      Due to my previous patches it is now a fairly
      straight-forward extension. Drivers need to have
      accurate TX status reporting for the EOSP frame.
      For drivers that buffer themselves, the provided
      APIs allow releasing the right number of frames,
      but then drivers need to set EOSP and more-data
      themselves. This is documented in more detail in
      the new code itself.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: John W. Linville <linville@tuxdriver.com>
    • J
      mac80211: allow releasing driver-buffered frames · 4049e09a
      Johannes Berg committed
      If there are frames for a station buffered in
      the driver, mac80211 announces those in the TIM
      IE but there's no way to release them. Add new
      API to release such frames and use it when the
      station polls for a frame.
      
      Since the API will soon also be used for uAPSD
      it is easily extensible.
      
      Note that before this change, drivers announcing
      driver-buffered frames in the TIM bit would actually
      respond to a PS-Poll with a potentially lower
      priority frame (if there were any frames buffered
      in mac80211). After this patch, a driver that
      hasn't been changed will no longer respond at all.
      This only affects ath9k, which will need to be
      fixed to implement the new API.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: John W. Linville <linville@tuxdriver.com>
    • J
      mac80211: split PS buffers into ACs · 948d887d
      Johannes Berg committed
      For uAPSD support we'll need to have per-AC PS
      buffers. As this is a major undertaking, split
      the buffers before really adding support for
      uAPSD. This already makes some reference to the
      uapsd_queues variable, but for now that will
      never be non-zero.
      
      Since book-keeping is complicated, also change
      the logic for keeping a maximum of frames only
      and allow 64 frames per AC (up from 128 for a
      station).
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: John W. Linville <linville@tuxdriver.com>
    • J
      mac80211: let drivers inform it about per TID buffered frames · 042ec453
      Johannes Berg committed
      For uAPSD implementation, it is necessary to know on
      which ACs frames are buffered. mac80211 obviously
      knows about the frames it has buffered itself, but
      with aggregation many drivers buffer frames. Thus,
      mac80211 needs to be informed about this.
      
      For now, since we don't have APSD in any form, this
      will unconditionally set the TIM bit for the station
      but later with uAPSD only some ACs might cause the
      TIM bit to be set.
      
      ath9k is the only driver using this API and I only
      modify it in the most basic way; it won't be able
      to implement uAPSD with this yet. But it can't do
      that anyway since there's no way to selectively
      release frames to the peer yet.
      
      Since drivers will buffer frames per TID, let them
      inform mac80211 on a per TID basis; mac80211 will
      then sort out the AC mapping itself.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: John W. Linville <linville@tuxdriver.com>
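The per-TID to per-AC step described above can be sketched as folding a TID bitmap into an AC bitmap. The table follows the usual 802.1D priority to access category mapping. The enum values and function names are illustrative assumptions, not mac80211's identifiers.

```c
#include <assert.h>
#include <stdint.h>

/* Access categories; numbering chosen for this sketch. */
enum ac { AC_VO = 0, AC_VI = 1, AC_BE = 2, AC_BK = 3 };

/* Map a TID (0..15) to its access category via the low three priority bits. */
static enum ac tid_to_ac(unsigned int tid)
{
    static const enum ac map[8] = {
        AC_BE, AC_BK, AC_BK, AC_BE,  /* priorities 0-3 */
        AC_VI, AC_VI, AC_VO, AC_VO,  /* priorities 4-7 */
    };
    assert(tid < 16);
    return map[tid & 7];
}

/* Fold a per-TID "frames buffered" bitmap into a per-AC bitmap. */
static uint8_t tids_to_ac_mask(uint16_t tid_mask)
{
    uint8_t ac_mask = 0;
    unsigned int tid;

    for (tid = 0; tid < 16; tid++)
        if (tid_mask & (1u << tid))
            ac_mask |= (uint8_t)(1u << tid_to_ac(tid));
    return ac_mask;
}
```

Keeping the mapping on the stack side means a driver only has to report what it already tracks (frames per TID), and the decision about which ACs should set the TIM bit stays in one place.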
    • A
      nl80211/mac80211: allow adding TDLS peers as stations · 07ba55d7
      Arik Nemtsov committed
      When adding a TDLS peer STA, mark it with a new flag in both nl80211 and
      mac80211. Before adding a peer, make sure the wiphy supports TDLS and
      our operating mode is appropriate (managed).
      
      In addition, make sure all peers are removed on disassociation.
      
      A TDLS peer is first added just before link setup is initiated. In later
      setup stages we have more info about peer supported rates, capabilities,
      etc. This info is reported via nl80211_set_station().
      Signed-off-by: Arik Nemtsov <arik@wizery.com>
      Cc: Kalyan C Gaddam <chakkal@iit.edu>
      Signed-off-by: John W. Linville <linville@tuxdriver.com>
    • A
      mac80211: handle TDLS high-level commands and frames · dfe018bf
      Arik Nemtsov committed
      Register and implement the TDLS cfg80211 callback functions.
      
      Internally prepare and send TDLS management frames. We incorporate
      local STA capabilities and supported rates with extra IEs given by
      usermode. The resulting packet is either encapsulated in a data frame,
      or assembled as an action frame. It is transmitted either directly or
      through the AP, as mandated by the TDLS specification.
      
      Declare support for the TDLS external setup wiphy capability. This
      tells usermode to handle link setup and discovery on its own, and use the
      kernel driver for sending TDLS mgmt packets.
      Signed-off-by: Arik Nemtsov <arik@wizery.com>
      Cc: Kalyan C Gaddam <chakkal@iit.edu>
      Signed-off-by: John W. Linville <linville@tuxdriver.com>
    • A
      mac80211: standardize adding supported rates IEs · 768db343
      Arik Nemtsov committed
      Relocate the mesh implementation of adding the (extended) supported
      rates IE to util.c, anticipating its use by other parts of mac80211.
      Signed-off-by: Arik Nemtsov <arik@wizery.com>
      Cc: Kalyan C Gaddam <chakkal@iit.edu>
      Signed-off-by: John W. Linville <linville@tuxdriver.com>
    • A
      nl80211: support sending TDLS commands/frames · 109086ce
      Arik Nemtsov committed
      Add support for sending high-level TDLS commands and TDLS frames via
      NL80211_CMD_TDLS_OPER and NL80211_CMD_TDLS_MGMT, respectively. Add
      appropriate cfg80211 callbacks for lower level drivers.
      
      Add wiphy capability flags for TDLS support and advertise them via
      nl80211.
      Signed-off-by: Arik Nemtsov <arik@wizery.com>
      Cc: Kalyan C Gaddam <chakkal@iit.edu>
      Signed-off-by: John W. Linville <linville@tuxdriver.com>
    • J
      cfg80211/mac80211: apply station uAPSD parameters selectively · 3b9ce80c
      Johannes Berg committed
      Currently, when hostapd sets the station as authorized
      we also overwrite its uAPSD parameter. This obviously
      leads to buggy behaviour (later, with my patches that
      actually add uAPSD support). To fix this, only apply
      those parameters if they were actually set in nl80211,
      and to achieve that add a bitmap of things to apply.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: John W. Linville <linville@tuxdriver.com>
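The "bitmap of things to apply" can be sketched as an update struct whose fields only take effect when their bit is set. All names below are hypothetical; the real nl80211/cfg80211 structures differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical apply bits; only one is needed for this sketch. */
#define APPLY_UAPSD (1u << 0)

struct sta_state {
    bool    authorized;
    uint8_t uapsd_queues;
    uint8_t max_sp;
};

struct sta_update {
    uint32_t apply_mask;    /* which optional fields below are valid */
    bool     authorized;    /* always applied in this sketch */
    uint8_t  uapsd_queues;  /* valid only with APPLY_UAPSD */
    uint8_t  max_sp;        /* valid only with APPLY_UAPSD */
};

/* Apply an update without clobbering fields the caller never set. */
static void sta_apply_update(struct sta_state *sta, const struct sta_update *upd)
{
    assert(sta != NULL && upd != NULL);
    sta->authorized = upd->authorized;
    if (upd->apply_mask & APPLY_UAPSD) {
        sta->uapsd_queues = upd->uapsd_queues;
        sta->max_sp = upd->max_sp;
    }
}
```

An "authorize" request that leaves `apply_mask` at zero then keeps the station's earlier uAPSD parameters intact, which is the bug the commit describes.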
  12. 30 Sep, 2011 1 commit
    • P
      posix-cpu-timers: Cure SMP wobbles · d670ec13
      Peter Zijlstra committed
      David reported:
      
        Attached below is a watered-down version of rt/tst-cpuclock2.c from
        GLIBC.  Just build it with "gcc -o test test.c -lpthread -lrt" or
        similar.
      
        Run it several times, and you will see cases where the main thread
        will measure a process clock difference before and after the nanosleep
        which is smaller than the cpu-burner thread's individual thread clock
        difference.  This doesn't make any sense since the cpu-burner thread
        is part of the top-level process's thread group.
      
        I've reproduced this on both x86-64 and sparc64 (using both 32-bit and
        64-bit binaries).
      
        For example:
      
        [davem@boricha build-x86_64-linux]$ ./test
        process: before(0.001221967) after(0.498624371) diff(497402404)
        thread:  before(0.000081692) after(0.498316431) diff(498234739)
        self:    before(0.001223521) after(0.001240219) diff(16698)
        [davem@boricha build-x86_64-linux]$ 
      
        The diff of 'process' should always be >= the diff of 'thread'.
      
        I make sure to wrap the 'thread' clock measurements the most tightly
        around the nanosleep() call, and that the 'process' clock measurements
        are the outer-most ones.
      
        ---
        #include <unistd.h>
        #include <stdio.h>
        #include <stdlib.h>
        #include <time.h>
        #include <fcntl.h>
        #include <string.h>
        #include <errno.h>
        #include <pthread.h>
      
        static pthread_barrier_t barrier;
      
        static void *chew_cpu(void *arg)
        {
      	  pthread_barrier_wait(&barrier);
      	  while (1)
      		  __asm__ __volatile__("" : : : "memory");
      	  return NULL;
        }
      
        int main(void)
        {
      	  clockid_t process_clock, my_thread_clock, th_clock;
      	  struct timespec process_before, process_after;
      	  struct timespec me_before, me_after;
      	  struct timespec th_before, th_after;
      	  struct timespec sleeptime;
      	  unsigned long diff;
      	  pthread_t th;
      	  int err;
      
      	  err = clock_getcpuclockid(0, &process_clock);
      	  if (err)
      		  return 1;
      
      	  err = pthread_getcpuclockid(pthread_self(), &my_thread_clock);
      	  if (err)
      		  return 1;
      
      	  pthread_barrier_init(&barrier, NULL, 2);
      	  err = pthread_create(&th, NULL, chew_cpu, NULL);
      	  if (err)
      		  return 1;
      
      	  err = pthread_getcpuclockid(th, &th_clock);
      	  if (err)
      		  return 1;
      
      	  pthread_barrier_wait(&barrier);
      
      	  err = clock_gettime(process_clock, &process_before);
      	  if (err)
      		  return 1;
      
      	  err = clock_gettime(my_thread_clock, &me_before);
      	  if (err)
      		  return 1;
      
      	  err = clock_gettime(th_clock, &th_before);
      	  if (err)
      		  return 1;
      
      	  sleeptime.tv_sec = 0;
      	  sleeptime.tv_nsec = 500000000;
      	  nanosleep(&sleeptime, NULL);
      
      	  err = clock_gettime(th_clock, &th_after);
      	  if (err)
      		  return 1;
      
      	  err = clock_gettime(my_thread_clock, &me_after);
      	  if (err)
      		  return 1;
      
      	  err = clock_gettime(process_clock, &process_after);
      	  if (err)
      		  return 1;
      
      	  diff = process_after.tv_nsec - process_before.tv_nsec;
      	  printf("process: before(%lu.%.9lu) after(%lu.%.9lu) diff(%lu)\n",
      		 process_before.tv_sec, process_before.tv_nsec,
      		 process_after.tv_sec, process_after.tv_nsec, diff);
      	  diff = th_after.tv_nsec - th_before.tv_nsec;
      	  printf("thread:  before(%lu.%.9lu) after(%lu.%.9lu) diff(%lu)\n",
      		 th_before.tv_sec, th_before.tv_nsec,
      		 th_after.tv_sec, th_after.tv_nsec, diff);
      	  diff = me_after.tv_nsec - me_before.tv_nsec;
      	  printf("self:    before(%lu.%.9lu) after(%lu.%.9lu) diff(%lu)\n",
      		 me_before.tv_sec, me_before.tv_nsec,
      		 me_after.tv_sec, me_after.tv_nsec, diff);
      
      	  return 0;
        }
      
      This is due to us using p->se.sum_exec_runtime in
      thread_group_cputime() where we iterate the thread group and sum all
      data. This does not take time since the last schedule operation (tick
      or otherwise) into account. We can cure this by using
      task_sched_runtime() at the cost of having to take locks.
      
      This also means we can (and must) do away with
      thread_group_sched_runtime() since the modified thread_group_cputime()
      is now more accurate and would deadlock when called from
      thread_group_sched_runtime().
      
      Aside from that, it makes the function safe on 32 bit systems. The old
      code added t->se.sum_exec_runtime unprotected. sum_exec_runtime is a
      64bit value and could be changed on another cpu at the same time.
      Reported-by: David Miller <davem@davemloft.net>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: stable@kernel.org
      Link: http://lkml.kernel.org/r/1314874459.7945.22.camel@twins
      Tested-by: David Miller <davem@davemloft.net>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
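The invariant stated in the report (the process clock delta must be at least the thread clock delta) can be checked with a small helper. The reported test subtracts only `tv_nsec`, which is enough while no measurement crosses a second boundary. The sketch below, with hypothetical function names, also accounts for `tv_sec`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Full difference in nanoseconds, including the seconds field. */
static int64_t timespec_diff_ns(const struct timespec *before,
                                const struct timespec *after)
{
    assert(before != NULL && after != NULL);
    return (int64_t)(after->tv_sec - before->tv_sec) * 1000000000LL
         + (int64_t)(after->tv_nsec - before->tv_nsec);
}

/* The thread belongs to the process, so its CPU time cannot exceed
 * the process-wide CPU time over an enclosing interval. */
static bool process_covers_thread(int64_t process_diff_ns, int64_t thread_diff_ns)
{
    return process_diff_ns >= thread_diff_ns;
}
```

Fed with the numbers from the report (497402404 ns for the process, 498234739 ns for the thread), `process_covers_thread()` returns false, which is the symptom the patch addresses.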
  13. 29 Sep, 2011 4 commits
    • R
      ptp: fix L2 event message recognition · f75159e9
      Richard Cochran committed
      The IEEE 1588 standard defines two kinds of messages, event and general
      messages. Event messages require time stamping, and general do not. When
      using UDP transport, two separate ports are used for the two message
      types.
      
      The BPF designed to recognize event messages incorrectly classifies L2
      general messages as event messages. This commit fixes the issue by
      extending the filter to check the message type field for L2 PTP packets.
      Event messages are distinguished from general messages by testing
      the "general" bit.
      Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
      Cc: <stable@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
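The check described above can be sketched in plain C rather than as a BPF program. The sketch assumes an untagged Ethernet frame (14-byte header, no VLAN tag) with the PTP EtherType 0x88F7, where the low nibble of the first PTP header byte is the message type and bit 3 of that nibble marks general messages. The function and macro names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_HDR_LEN    14      /* dst MAC + src MAC + EtherType, no VLAN */
#define ETHERTYPE_PTP  0x88F7  /* PTP over Ethernet (L2 transport) */
#define PTP_GENERAL    0x08    /* set in the message type of general messages */

/* True only for L2 PTP event messages, i.e. those needing a time stamp. */
static bool ptp_l2_is_event(const uint8_t *frame, size_t len)
{
    uint16_t ethertype;

    assert(frame != NULL);
    if (len < ETH_HDR_LEN + 1)
        return false;                       /* too short to hold a PTP header */
    ethertype = (uint16_t)((frame[12] << 8) | frame[13]);
    if (ethertype != ETHERTYPE_PTP)
        return false;
    return (frame[ETH_HDR_LEN] & PTP_GENERAL) == 0;
}
```

Matching on the EtherType alone is what misclassified general messages; the extra test on the message type byte separates the two kinds.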
    • Y
    • V
      connector: add comm change event report to proc connector · f786ecba
      Vladimir Zapolskiy committed
      Add an event to monitor changes to a task's comm value.  Such an event
      becomes vital if someone wants to control the threads of a process in
      different ways.
      
      A thread's comm value is a natural characteristic of it, and, helpfully,
      application developers can change it at runtime. Reporting such events
      via the proc connector allows finer-grained monitoring and control. For
      instance, a process control daemon that listens to the proc connector
      and follows comm value policies can place specific threads into
      assigned cgroup partitions.
      
      It might be possible to achieve a pale, partial, one-shot likeness
      without this update: an application changes the comm value of a thread
      generator task beforehand, a new thread is then cloned, and the proc
      connector listener gets the fork event and reads the new thread's comm
      value from the procfs stat file. This change, however, visibly
      simplifies and extends the matter.
      Signed-off-by: Vladimir Zapolskiy <vzapolskiy@gmail.com>
      Acked-by: Evgeniy Polyakov <zbr@ioremap.net>
      Cc: David Miller <davem@davemloft.net>
      Signed-off-by: Andrew Morton <akpm@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • E
      af_unix: dont send SCM_CREDENTIALS by default · 16e57262
      Eric Dumazet committed
      Since commit 7361c36c (af_unix: Allow credentials to work across
      user and pid namespaces) af_unix performance dropped a lot.
      
      This is because we now take a reference on pid and cred in each write(),
      and release them in read(), usually done from another process,
      eventually from another cpu. This triggers false sharing.
      
      # Events: 154K cycles
      #
      # Overhead  Command       Shared Object        Symbol
      # ........  .......  ..................  .........................
      #
          10.40%  hackbench  [kernel.kallsyms]   [k] put_pid
           8.60%  hackbench  [kernel.kallsyms]   [k] unix_stream_recvmsg
           7.87%  hackbench  [kernel.kallsyms]   [k] unix_stream_sendmsg
           6.11%  hackbench  [kernel.kallsyms]   [k] do_raw_spin_lock
           4.95%  hackbench  [kernel.kallsyms]   [k] unix_scm_to_skb
           4.87%  hackbench  [kernel.kallsyms]   [k] pid_nr_ns
           4.34%  hackbench  [kernel.kallsyms]   [k] cred_to_ucred
           2.39%  hackbench  [kernel.kallsyms]   [k] unix_destruct_scm
           2.24%  hackbench  [kernel.kallsyms]   [k] sub_preempt_count
           1.75%  hackbench  [kernel.kallsyms]   [k] fget_light
           1.51%  hackbench  [kernel.kallsyms]   [k] __mutex_lock_interruptible_slowpath
           1.42%  hackbench  [kernel.kallsyms]   [k] sock_alloc_send_pskb
      
      This patch includes SCM_CREDENTIALS information in an af_unix message/skb
      only if requested by the sender (see man 7 unix for details on how to
      include ancillary data using the sendmsg() system call).
      
      Note: This might break buggy applications that expected SCM_CREDENTIALS
      from an unaware write() system call, with the receiver not using the
      SO_PASSCRED socket option.
      
      If SOCK_PASSCRED is set on source or destination socket, we still
      include credentials for mere write() syscalls.
      
      Performance boost in hackbench : more than 50% gain on a 16 thread
      machine (2 quad-core cpus, 2 threads per core)
      
      hackbench 20 thread 2000
      
      4.228 sec instead of 9.102 sec
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Acked-by: Tim Chen <tim.c.chen@linux.intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
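A sender that wants credentials delivered after this change has to attach them as ancillary data. The sketch below shows one way to do that on Linux with `sendmsg()` and an `SCM_CREDENTIALS` control message, along with a receiver that has enabled `SO_PASSCRED`. The helper names are hypothetical and error handling is kept minimal.

```c
#define _GNU_SOURCE            /* for struct ucred */
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized and aligned for exactly one struct ucred. */
union cred_buf {
    char buf[CMSG_SPACE(sizeof(struct ucred))];
    struct cmsghdr align;
};

/* Ask the kernel to deliver credentials on this receiving socket. */
static int enable_passcred(int fd)
{
    int on = 1;
    return setsockopt(fd, SOL_SOCKET, SO_PASSCRED, &on, sizeof(on));
}

/* Send one byte with the caller's own credentials attached explicitly. */
static int send_with_creds(int fd)
{
    char byte = 'x';
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    struct ucred cred = { .pid = getpid(), .uid = getuid(), .gid = getgid() };
    union cred_buf u;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    memset(&u, 0, sizeof(u));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_CREDENTIALS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(cred));
    memcpy(CMSG_DATA(cmsg), &cred, sizeof(cred));

    return sendmsg(fd, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one byte; returns 0 and fills *out if credentials came with it. */
static int recv_creds(int fd, struct ucred *out)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union cred_buf u;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    assert(out != NULL);
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    if (recvmsg(fd, &msg, 0) != 1)
        return -1;
    for (cmsg = CMSG_FIRSTHDR(&msg); cmsg; cmsg = CMSG_NXTHDR(&msg, cmsg)) {
        if (cmsg->cmsg_level == SOL_SOCKET && cmsg->cmsg_type == SCM_CREDENTIALS) {
            memcpy(out, CMSG_DATA(cmsg), sizeof(*out));
            return 0;
        }
    }
    return -1;
}
```

A typical use would create a pair with `socketpair(AF_UNIX, SOCK_STREAM, 0, sv)`, call `enable_passcred(sv[1])`, then `send_with_creds(sv[0])` and `recv_creds(sv[1], &cred)`.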
  14. 28 Sep, 2011 2 commits