1. 19 12月, 2008 1 次提交
    • V
      x86: PAT: store vm_pgoff for all linear_over_vma_region mappings - v3 · 3c8bb73a
      venkatesh.pallipadi@intel.com 提交于
      Impact: Code transformation, new functions added should have no effect.
      
      Drivers use mmap followed by pgprot_* and remap_pfn_range or vm_insert_pfn,
      in order to export reserved memory to userspace. Currently, such mappings are
      not tracked and hence not kept consistent with other mappings (/dev/mem,
      pci resource, ioremap) for the sme memory, that may exist in the system.
      
      The following patchset adds x86 PAT attribute tracking and untracking for
      pfnmap related APIs.
      
      First three patches in the patchset are changing the generic mm code to fit
      in this tracking. Last four patches are x86 specific to make things work
      with x86 PAT code. The patchset aso introduces pgprot_writecombine interface,
      which gives writecombine mapping when enabled, falling back to
      pgprot_noncached otherwise.
      
      This patch:
      
      While working on x86 PAT, we faced some hurdles with trackking
      remap_pfn_range() regions, as we do not have any information to say
      whether that PFNMAP mapping is linear for the entire vma range or
      it is smaller granularity regions within the vma.
      
      A simple solution to this is to use vm_pgoff as an indicator for
      linear mapping over the vma region. Currently, remap_pfn_range
      only sets vm_pgoff for COW mappings. Below patch changes the
      logic and sets the vm_pgoff irrespective of COW. This will still not
      be enough for the case where pfn is zero (vma region mapped to
      physical address zero). But, for all the other cases, we can look at
      pfnmap VMAs and say whether the mappng is for the entire vma region
      or not.
      Signed-off-by: NVenkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      3c8bb73a
  2. 18 12月, 2008 1 次提交
  3. 16 12月, 2008 1 次提交
  4. 11 12月, 2008 6 次提交
  5. 10 12月, 2008 1 次提交
    • N
      netpoll: fix race on poll_list resulting in garbage entry · 7b363e44
      Neil Horman 提交于
      	A few months back a race was discused between the netpoll napi service
      path, and the fast path through net_rx_action:
      http://kerneltrap.org/mailarchive/linux-netdev/2007/10/16/345470
      
      A patch was submitted for that bug, but I think we missed a case.
      
      Consider the following scenario:
      
      INITIAL STATE
      CPU0 has one napi_struct A on its poll_list
      CPU1 is calling netpoll_send_skb and needs to call poll_napi on the same
      napi_struct A that CPU0 has on its list
      
      
      
      CPU0						CPU1
      net_rx_action					poll_napi
      !list_empty (returns true)			locks poll_lock for A
      						 poll_one_napi
      						  napi->poll
      						   netif_rx_complete
      						    __napi_complete
      						    (removes A from poll_list)
      list_entry(list->next)
      
      
      In the above scenario, net_rx_action assumes that the per-cpu poll_list is
      exclusive to that cpu.  netpoll of course violates that, and because the netpoll
      path can dequeue from the poll list, its possible for CPU0 to detect a non-empty
      list at the top of the while loop in net_rx_action, but have it become empty by
      the time it calls list_entry.  Since the poll_list isn't surrounded by any other
      structure, the returned data from that list_entry call in this situation is
      garbage, and any number of crashes can result based on what exactly that garbage
      is.
      
      Given that its not fasible for performance reasons to place exclusive locks
      arround each cpus poll list to provide that mutal exclusion, I think the best
      solution is modify the netpoll path in such a way that we continue to guarantee
      that the poll_list for a cpu is in fact exclusive to that cpu.  To do this I've
      implemented the patch below.  It adds an additional bit to the state field in
      the napi_struct.  When executing napi->poll from the netpoll_path, this bit will
      be set. When a driver calls netif_rx_complete, if that bit is set, it will not
      remove the napi_struct from the poll_list.  That work will be saved for the next
      iteration of net_rx_action.
      
      I've tested this and it seems to work well.  About the biggest drawback I can
      see to it is the fact that it might result in an extra loop through
      net_rx_action in the event that the device is actually contended for (i.e. the
      netpoll path actually preforms all the needed work no the device, and the call
      to net_rx_action winds up doing nothing, except removing the napi_struct from
      the poll_list.  However I think this is probably a small price to pay, given
      that the alternative is a crash.
      Signed-off-by: NNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7b363e44
  6. 09 12月, 2008 3 次提交
  7. 06 12月, 2008 1 次提交
  8. 04 12月, 2008 3 次提交
  9. 03 12月, 2008 4 次提交
    • M
      block: fix setting of max_segment_size and seg_boundary mask · 0e435ac2
      Milan Broz 提交于
      Fix setting of max_segment_size and seg_boundary mask for stacked md/dm
      devices.
      
      When stacking devices (LVM over MD over SCSI) some of the request queue
      parameters are not set up correctly in some cases by default, namely
      max_segment_size and and seg_boundary mask.
      
      If you create MD device over SCSI, these attributes are zeroed.
      
      Problem become when there is over this mapping next device-mapper mapping
      - queue attributes are set in DM this way:
      
      request_queue   max_segment_size  seg_boundary_mask
      SCSI                65536             0xffffffff
      MD RAID1                0                      0
      LVM                 65536                 -1 (64bit)
      
      Unfortunately bio_add_page (resp.  bio_phys_segments) calculates number of
      physical segments according to these parameters.
      
      During the generic_make_request() is segment cout recalculated and can
      increase bio->bi_phys_segments count over the allowed limit.  (After
      bio_clone() in stack operation.)
      
      Thi is specially problem in CCISS driver, where it produce OOPS here
      
          BUG_ON(creq->nr_phys_segments > MAXSGENTRIES);
      
      (MAXSEGENTRIES is 31 by default.)
      
      Sometimes even this command is enough to cause oops:
      
        dd iflag=direct if=/dev/<vg>/<lv> of=/dev/null bs=128000 count=10
      
      This command generates bios with 250 sectors, allocated in 32 4k-pages
      (last page uses only 1024 bytes).
      
      For LVM layer, it allocates bio with 31 segments (still OK for CCISS),
      unfortunatelly on lower layer it is recalculated to 32 segments and this
      violates CCISS restriction and triggers BUG_ON().
      
      The patch tries to fix it by:
      
       * initializing attributes above in queue request constructor
         blk_queue_make_request()
      
       * make sure that blk_queue_stack_limits() inherits setting
      
       (DM uses its own function to set the limits because it
       blk_queue_stack_limits() was introduced later.  It should probably switch
       to use generic stack limit function too.)
      
       * sets the default seg_boundary value in one place (blkdev.h)
      
       * use this mask as default in DM (instead of -1, which differs in 64bit)
      
      Bugs related to this:
      https://bugzilla.redhat.com/show_bug.cgi?id=471639
      http://bugzilla.kernel.org/show_bug.cgi?id=8672Signed-off-by: NMilan Broz <mbroz@redhat.com>
      Reviewed-by: NAlasdair G Kergon <agk@redhat.com>
      Cc: Neil Brown <neilb@suse.de>
      Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Cc: Tejun Heo <htejun@gmail.com>
      Cc: Mike Miller <mike.miller@hp.com>
      Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
      0e435ac2
    • T
      block: internal dequeue shouldn't start timer · 53a08807
      Tejun Heo 提交于
      blkdev_dequeue_request() and elv_dequeue_request() are equivalent and
      both start the timeout timer.  Barrier code dequeues the original
      barrier request but doesn't passes the request itself to lower level
      driver, only broken down proxy requests; however, as the original
      barrier code goes through the same dequeue path and timeout timer is
      started on it.  If barrier sequence takes long enough, this timer
      expires but the low level driver has no idea about this request and
      oops follows.
      
      Timeout timer shouldn't have been started on the original barrier
      request as it never goes through actual IO.  This patch unexports
      elv_dequeue_request(), which has no external user anyway, and makes it
      operate on elevator proper w/o adding the timer and make
      blkdev_dequeue_request() call elv_dequeue_request() and add timer.
      Internal users which don't pass the request to driver - barrier code
      and end_that_request_last() - are converted to use
      elv_dequeue_request().
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Mike Anderson <andmike@linux.vnet.ibm.com>
      Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
      53a08807
    • J
      nfsd: fix vm overcommit crash fix #2 · 1b79cd04
      Junjiro R. Okajima 提交于
      The previous patch from Alan Cox ("nfsd: fix vm overcommit crash",
      commit 731572d3) fixed the problem where
      knfsd crashes on exported shmemfs objects and strict overcommit is set.
      
      But the patch forgot supporting the case when CONFIG_SECURITY is
      disabled.
      
      This patch copies a part of his fix which is mainly for detecting a bug
      earlier.
      Acked-by: NJames Morris <jmorris@namei.org>
      Signed-off-by: NAlan Cox <alan@redhat.com>
      Signed-off-by: NJunjiro R. Okajima <hooanon05@yahoo.co.jp>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      1b79cd04
    • B
      amd74xx: workaround unreliable AltStatus register for nVidia controllers · 6636487e
      Bartlomiej Zolnierkiewicz 提交于
      It seems that on some nVidia controllers using AltStatus register
      can be unreliable so default to Status register if the PCI device
      is in Compatibility Mode.  In order to achieve this:
      
      * Add ide_pci_is_in_compatibility_mode() inline helper to <linux/ide.h>.
      
      * Add IDE_HFLAG_BROKEN_ALTSTATUS host flag and set it in amd74xx host
        driver for nVidia controllers in Compatibility Mode.
      
      * Teach actual_try_to_identify() and drive_is_ready() about the new flag.
      
      This fixes the regression caused by removal of CONFIG_IDEPCI_SHARE_IRQ
      config option in 2.6.25 and using AltStatus register unconditionally when
      available (kernel.org bugs #11659 and #10216).
      
      [ Moreover for CONFIG_IDEPCI_SHARE_IRQ=y (which is what most people
        and distributions use) it never worked correctly. ]
      
      Thanks to Remy LABENE and Lars Winterfeld for help with debugging the problem.
      
      More info at:
      http://bugzilla.kernel.org/show_bug.cgi?id=11659
      http://bugzilla.kernel.org/show_bug.cgi?id=10216Reported-by: NRemy LABENE <remy.labene@free.fr>
      Tested-by: NRemy LABENE <remy.labene@free.fr>
      Tested-by: NLars Winterfeld <lars.winterfeld@tu-ilmenau.de>
      Acked-by: NBorislav Petkov <petkovbb@gmail.com>
      Signed-off-by: NBartlomiej Zolnierkiewicz <bzolnier@gmail.com>
      6636487e
  10. 02 12月, 2008 3 次提交
    • M
      lib/idr.c: fix rcu related race with idr_find · 6ff2d39b
      Manfred Spraul 提交于
      2nd part of the fixes needed for
      http://bugzilla.kernel.org/show_bug.cgi?id=11796.
      
      When the idr tree is either grown or shrunk, then the update to the number
      of layers and the top pointer were not atomic.  This race caused crashes.
      
      The attached patch fixes that by replicating the layers counter in each
      layer, thus idr_find doesn't need idp->layers anymore.
      Signed-off-by: NManfred Spraul <manfred@colorfullife.com>
      Cc: Clement Calmels <cboulte@gmail.com>
      Cc: Nadia Derbey <Nadia.Derbey@bull.net>
      Cc: Pierre Peiffer <peifferp@gmail.com>
      Cc: <stable@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      6ff2d39b
    • D
      epoll: introduce resource usage limits · 7ef9964e
      Davide Libenzi 提交于
      It has been thought that the per-user file descriptors limit would also
      limit the resources that a normal user can request via the epoll
      interface.  Vegard Nossum reported a very simple program (a modified
      version attached) that can make a normal user to request a pretty large
      amount of kernel memory, well within the its maximum number of fds.  To
      solve such problem, default limits are now imposed, and /proc based
      configuration has been introduced.  A new directory has been created,
      named /proc/sys/fs/epoll/ and inside there, there are two configuration
      points:
      
        max_user_instances = Maximum number of devices - per user
      
        max_user_watches   = Maximum number of "watched" fds - per user
      
      The current default for "max_user_watches" limits the memory used by epoll
      to store "watches", to 1/32 of the amount of the low RAM.  As example, a
      256MB 32bit machine, will have "max_user_watches" set to roughly 90000.
      That should be enough to not break existing heavy epoll users.  The
      default value for "max_user_instances" is set to 128, that should be
      enough too.
      
      This also changes the userspace, because a new error code can now come out
      from EPOLL_CTL_ADD (-ENOSPC).  The EMFILE from epoll_create() was already
      listed, so that should be ok.
      
      [akpm@linux-foundation.org: use get_current_user()]
      Signed-off-by: NDavide Libenzi <davidel@xmailserver.org>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: <stable@kernel.org>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Reported-by: NVegard Nossum <vegardno@ifi.uio.no>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      7ef9964e
    • T
      libata: blacklist Seagate drives which time out FLUSH_CACHE when used with NCQ · ac70a964
      Tejun Heo 提交于
      Some recent Seagate harddrives have firmware bug which causes FLUSH
      CACHE to timeout under certain circumstances if NCQ is being used.
      This can be worked around by disabling NCQ and fixed by updating the
      firmware.  Implement ATA_HORKAGE_FIRMWARE_UPDATE and blacklist these
      devices.
      
      The wiki page has been updated to contain information on this issue.
      
        http://ata.wiki.kernel.org/index.php/Known_issuesSigned-off-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
      ac70a964
  11. 01 12月, 2008 3 次提交
  12. 29 11月, 2008 1 次提交
  13. 28 11月, 2008 1 次提交
    • R
      Allow architectures to override copy_user_highpage() · 487ff320
      Russell King 提交于
      With aliasing VIPT cache support, the ARM implementation of
      clear_user_page() and copy_user_page() sets up a temporary kernel space
      mapping such that we have the same cache colour as the userspace page.
      This avoids having to consider any userspace aliases from this operation.
      
      However, when highmem is enabled, kmap_atomic() have to setup mappings.
      The copy_user_highpage() and clear_user_highpage() call these functions
      before delegating the copies to copy_user_page() and clear_user_page().
      
      The effect of this is that each of the *_user_highpage() functions setup
      their own kmap mapping, followed by the *_user_page() functions setting
      up another mapping.  This is rather wasteful.
      
      Thankfully, copy_user_highpage() can be overriden by architectures by
      defining __HAVE_ARCH_COPY_USER_HIGHPAGE.  However, replacement of
      clear_user_highpage() is more difficult because its inline definition
      is not conditional.  It seems that you're expected to define
      __HAVE_ARCH_ALLOC_ZEROED_USER_HIGHPAGE and provide a replacement
      __alloc_zeroed_user_highpage() implementation instead.
      
      The allocation itself is fine, so we don't want to override that.  What
      we really want to do is to override clear_user_highpage() with our own
      version which doesn't kmap_atomic() unnecessarily.
      
      Other VIPT architectures (PARISC and SH) would also like to override
      this function as well.
      Acked-by: NHugh Dickins <hugh@veritas.com>
      Acked-by: NJames Bottomley <James.Bottomley@HansenPartnership.com>
      Acked-by: NPaul Mundt <lethal@linux-sh.org>
      Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
      487ff320
  14. 27 11月, 2008 3 次提交
  15. 25 11月, 2008 2 次提交
  16. 23 11月, 2008 1 次提交
  17. 22 11月, 2008 1 次提交
  18. 21 11月, 2008 1 次提交
  19. 20 11月, 2008 2 次提交
    • M
      cpuset: update top cpuset's mems after adding a node · f481891f
      Miao Xie 提交于
      After adding a node into the machine, top cpuset's mems isn't updated.
      
      By reviewing the code, we found that the update function
      
        cpuset_track_online_nodes()
      
      was invoked after node_states[N_ONLINE] changes.  It is wrong because
      N_ONLINE just means node has pgdat, and if node has/added memory, we use
      N_HIGH_MEMORY.  So, We should invoke the update function after
      node_states[N_HIGH_MEMORY] changes, just like its commit says.
      
      This patch fixes it.  And we use notifier of memory hotplug instead of
      direct calling of cpuset_track_online_nodes().
      Signed-off-by: NMiao Xie <miaox@cn.fujitsu.com>
      Acked-by: NYasunori Goto <y-goto@jp.fujitsu.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Paul Menage <menage@google.com
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f481891f
    • U
      reintroduce accept4 · de11defe
      Ulrich Drepper 提交于
      Introduce a new accept4() system call.  The addition of this system call
      matches analogous changes in 2.6.27 (dup3(), evenfd2(), signalfd4(),
      inotify_init1(), epoll_create1(), pipe2()) which added new system calls
      that differed from analogous traditional system calls in adding a flags
      argument that can be used to access additional functionality.
      
      The accept4() system call is exactly the same as accept(), except that
      it adds a flags bit-mask argument.  Two flags are initially implemented.
      (Most of the new system calls in 2.6.27 also had both of these flags.)
      
      SOCK_CLOEXEC causes the close-on-exec (FD_CLOEXEC) flag to be enabled
      for the new file descriptor returned by accept4().  This is a useful
      security feature to avoid leaking information in a multithreaded
      program where one thread is doing an accept() at the same time as
      another thread is doing a fork() plus exec().  More details here:
      http://udrepper.livejournal.com/20407.html "Secure File Descriptor Handling",
      Ulrich Drepper).
      
      The other flag is SOCK_NONBLOCK, which causes the O_NONBLOCK flag
      to be enabled on the new open file description created by accept4().
      (This flag is merely a convenience, saving the use of additional calls
      fcntl(F_GETFL) and fcntl (F_SETFL) to achieve the same result.
      
      Here's a test program.  Works on x86-32.  Should work on x86-64, but
      I (mtk) don't have a system to hand to test with.
      
      It tests accept4() with each of the four possible combinations of
      SOCK_CLOEXEC and SOCK_NONBLOCK set/clear in 'flags', and verifies
      that the appropriate flags are set on the file descriptor/open file
      description returned by accept4().
      
      I tested Ulrich's patch in this thread by applying against 2.6.28-rc2,
      and it passes according to my test program.
      
      /* test_accept4.c
      
        Copyright (C) 2008, Linux Foundation, written by Michael Kerrisk
             <mtk.manpages@gmail.com>
      
        Licensed under the GNU GPLv2 or later.
      */
      #define _GNU_SOURCE
      #include <unistd.h>
      #include <sys/syscall.h>
      #include <sys/socket.h>
      #include <netinet/in.h>
      #include <stdlib.h>
      #include <fcntl.h>
      #include <stdio.h>
      #include <string.h>
      
      #define PORT_NUM 33333
      
      #define die(msg) do { perror(msg); exit(EXIT_FAILURE); } while (0)
      
      /**********************************************************************/
      
      /* The following is what we need until glibc gets a wrapper for
        accept4() */
      
      /* Flags for socket(), socketpair(), accept4() */
      #ifndef SOCK_CLOEXEC
      #define SOCK_CLOEXEC    O_CLOEXEC
      #endif
      #ifndef SOCK_NONBLOCK
      #define SOCK_NONBLOCK   O_NONBLOCK
      #endif
      
      #ifdef __x86_64__
      #define SYS_accept4 288
      #elif __i386__
      #define USE_SOCKETCALL 1
      #define SYS_ACCEPT4 18
      #else
      #error "Sorry -- don't know the syscall # on this architecture"
      #endif
      
      static int
      accept4(int fd, struct sockaddr *sockaddr, socklen_t *addrlen, int flags)
      {
         printf("Calling accept4(): flags = %x", flags);
         if (flags != 0) {
             printf(" (");
             if (flags & SOCK_CLOEXEC)
                 printf("SOCK_CLOEXEC");
             if ((flags & SOCK_CLOEXEC) && (flags & SOCK_NONBLOCK))
                 printf(" ");
             if (flags & SOCK_NONBLOCK)
                 printf("SOCK_NONBLOCK");
             printf(")");
         }
         printf("\n");
      
      #if USE_SOCKETCALL
         long args[6];
      
         args[0] = fd;
         args[1] = (long) sockaddr;
         args[2] = (long) addrlen;
         args[3] = flags;
      
         return syscall(SYS_socketcall, SYS_ACCEPT4, args);
      #else
         return syscall(SYS_accept4, fd, sockaddr, addrlen, flags);
      #endif
      }
      
      /**********************************************************************/
      
      static int
      do_test(int lfd, struct sockaddr_in *conn_addr,
             int closeonexec_flag, int nonblock_flag)
      {
         int connfd, acceptfd;
         int fdf, flf, fdf_pass, flf_pass;
         struct sockaddr_in claddr;
         socklen_t addrlen;
      
         printf("=======================================\n");
      
         connfd = socket(AF_INET, SOCK_STREAM, 0);
         if (connfd == -1)
             die("socket");
         if (connect(connfd, (struct sockaddr *) conn_addr,
                     sizeof(struct sockaddr_in)) == -1)
             die("connect");
      
         addrlen = sizeof(struct sockaddr_in);
         acceptfd = accept4(lfd, (struct sockaddr *) &claddr, &addrlen,
                            closeonexec_flag | nonblock_flag);
         if (acceptfd == -1) {
             perror("accept4()");
             close(connfd);
             return 0;
         }
      
         fdf = fcntl(acceptfd, F_GETFD);
         if (fdf == -1)
             die("fcntl:F_GETFD");
         fdf_pass = ((fdf & FD_CLOEXEC) != 0) ==
                    ((closeonexec_flag & SOCK_CLOEXEC) != 0);
         printf("Close-on-exec flag is %sset (%s); ",
                 (fdf & FD_CLOEXEC) ? "" : "not ",
                 fdf_pass ? "OK" : "failed");
      
         flf = fcntl(acceptfd, F_GETFL);
         if (flf == -1)
             die("fcntl:F_GETFD");
         flf_pass = ((flf & O_NONBLOCK) != 0) ==
                    ((nonblock_flag & SOCK_NONBLOCK) !=0);
         printf("nonblock flag is %sset (%s)\n",
                 (flf & O_NONBLOCK) ? "" : "not ",
                 flf_pass ? "OK" : "failed");
      
         close(acceptfd);
         close(connfd);
      
         printf("Test result: %s\n", (fdf_pass && flf_pass) ? "PASS" : "FAIL");
         return fdf_pass && flf_pass;
      }
      
      static int
      create_listening_socket(int port_num)
      {
         struct sockaddr_in svaddr;
         int lfd;
         int optval;
      
         memset(&svaddr, 0, sizeof(struct sockaddr_in));
         svaddr.sin_family = AF_INET;
         svaddr.sin_addr.s_addr = htonl(INADDR_ANY);
         svaddr.sin_port = htons(port_num);
      
         lfd = socket(AF_INET, SOCK_STREAM, 0);
         if (lfd == -1)
             die("socket");
      
         optval = 1;
         if (setsockopt(lfd, SOL_SOCKET, SO_REUSEADDR, &optval,
                        sizeof(optval)) == -1)
             die("setsockopt");
      
         if (bind(lfd, (struct sockaddr *) &svaddr,
                  sizeof(struct sockaddr_in)) == -1)
             die("bind");
      
         if (listen(lfd, 5) == -1)
             die("listen");
      
         return lfd;
      }
      
      int
      main(int argc, char *argv[])
      {
         struct sockaddr_in conn_addr;
         int lfd;
         int port_num;
         int passed;
      
         passed = 1;
      
         port_num = (argc > 1) ? atoi(argv[1]) : PORT_NUM;
      
         memset(&conn_addr, 0, sizeof(struct sockaddr_in));
         conn_addr.sin_family = AF_INET;
         conn_addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
         conn_addr.sin_port = htons(port_num);
      
         lfd = create_listening_socket(port_num);
      
         if (!do_test(lfd, &conn_addr, 0, 0))
             passed = 0;
         if (!do_test(lfd, &conn_addr, SOCK_CLOEXEC, 0))
             passed = 0;
         if (!do_test(lfd, &conn_addr, 0, SOCK_NONBLOCK))
             passed = 0;
         if (!do_test(lfd, &conn_addr, SOCK_CLOEXEC, SOCK_NONBLOCK))
             passed = 0;
      
         close(lfd);
      
         exit(passed ? EXIT_SUCCESS : EXIT_FAILURE);
      }
      
      [mtk.manpages@gmail.com: rewrote changelog, updated test program]
      Signed-off-by: NUlrich Drepper <drepper@redhat.com>
      Tested-by: NMichael Kerrisk <mtk.manpages@gmail.com>
      Acked-by: NMichael Kerrisk <mtk.manpages@gmail.com>
      Cc: <linux-api@vger.kernel.org>
      Cc: <linux-arch@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      de11defe
  20. 19 11月, 2008 1 次提交
    • J
      mac80211: remove ieee80211_notify_mac · 8e3bad65
      Johannes Berg 提交于
      Before ieee80211_notify_mac() was added, it was presented with the
      use case of using it to tell mac80211 that the association may
      have been lost because the firmware crashed/reset.
      
      Since then, it has also been used by iwlwifi to (slightly) speed
      up re-association after resume, a workaround around the fact that
      mac80211 has no suspend/resume handling yet. It is also not used
      by any other drivers, so clearly it cannot be necessary for "good
      enough" suspend/resume.
      
      Unfortunately, the callback suffers from a severe problem: It only
      works for station mode. If suspend/resume happens while in IBSS or
      any other mode (but station), then the callback is pointless.
      
      Recently, it has created a number of locking issues, first because
      it required rtnl locking rather than RCU due to calling sleeping
      functions within the critical section, and now because it's called
      by iwlwifi from the mac80211 workqueue that may not use the rtnl
      because it is flushed under rtnl.
      (cf. http://bugzilla.kernel.org/show_bug.cgi?id=12046)
      
      I think, therefore, that we should take a step back, remove it
      entirely for now and add the small feature it provided properly.
      For suspend and resume we will need to introduce new hooks, and for
      the case where the firmware was reset the driver will probably
      simply just pretend it has done a suspend/resume cycle to get
      mac80211 to reprogram the hardware completely, not just try to
      connect to the current AP again in station mode. When doing so, we
      will need to take into account locking issues and possibly defer
      to schedule_work from within mac80211 for the resume operation,
      while the suspend operation must be done directly.
      
      Proper suspend/resume should also not necessarily try to reconnect
      to the current AP, the time spent in suspend may have been short
      enough to not be disconnected from the AP, mac80211 will detect
      that the AP went out of range quickly if it did, and if the
      association is lost then the AP will disassoc as soon as a data
      frame is sent. We might also take into account WWOL then, and
      have mac80211 program the hardware into such a mode where it is
      available and requested.
      Signed-off-by: NJohannes Berg <johannes@sipsolutions.net>
      Signed-off-by: NJohn W. Linville <linville@tuxdriver.com>
      8e3bad65