1. 27 3月, 2007 3 次提交
    • R
      [PATCH] Add const to pointer qualifiers for __chk_user_ptr and __chk_io_ptr. · 04a39523
      Russ Cox 提交于
      Change prototypes for __chk_user_ptr and __chk_io_ptr to take const
      void* instead of void*, so that code can pass "const void *" to them.
      
      (Right now sparse does not warn about passing const void* to void*
      functions, but that is a separate bug that I believe Josh is working on,
      and once sparse does check this, the changed prototypes will be
      necessary.)
      Signed-off-by: NRuss Cox <rsc@swtch.com>
      Signed-off-by: NJosh Triplett <josh@freedesktop.org>
      Acked-by: NChristopher Li <sparse@chrisli.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      04a39523
    • S
      ide: use correct IDE error recovery · 513daadd
      Suleiman Souhlal 提交于
      IDE error recovery is using IDLE IMMEDIATE if the drive is busy or has DRQ set.
      This violates the ATA spec (can only send IDLE IMMEDIATE when drive is not
      busy) and really hoses up some drives (modern drives will not be able to
      recover using this error handling).  The correct thing to do is issue a SRST
      followed by a SET FEATURES command.  This is what Western Digital recommends
      for error recovery and what Western Digital says Windows does.  It also does
      not violate the ATA spec as far as I can tell.
      
      Bart:
      * port the patch over the current tree
      * undo the recalibration code removal
      * send SET FEATURES command after checking for good drive status
      * don't check whether the current request is of REQ_TYPE_ATA_{CMD,TASK}
        type because we need to send SET FEATURES before handling any requests
      * some pre-ATA4 drives require INITIALIZE DEVICE PARAMETERS command before
        other commands (except IDENTIFY) so send SET FEATURES only if there are
        no pending drive->special requests
      * update comments and patch description
      * any bugs introduced by this patch are mine and not Suleiman's :-)
      Signed-off-by: NSuleiman Souhlal <suleiman@google.com>
      Acked-by: NAlan Cox <alan@redhat.com>
      Signed-off-by: NBartlomiej Zolnierkiewicz <bzolnier@gmail.com>
      513daadd
    • H
      [S390] Fix TCP/UDP pseudo header checksum computation. · afbc1e99
      Heiko Carstens 提交于
      git commit f994aae1 changed the
      function declaration of csum_tcpudp_nofold. Argument types were
      changed from unsigned long to __be32 (unsigned int). Therefore we
      lost the implicit type conversion that zeroed the upper half of the
      registers that are used to pass parameters. Since the inline assembly
      relied on this we ended up adding random values and wrong checksums
      were created.
      Showed only up on machines with more than 4GB since gcc produced code
      where the registers that are used to pass 'saddr' and 'daddr' previously
      contained addresses before calling this function.
      Fix this by using 32 bit arithmetics and convert code to C, since gcc
      produces better code than these hand-optimized versions.
      
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
      afbc1e99
  2. 26 3月, 2007 3 次提交
    • D
      [IPV6]: Fix routing round-robin locking. · f11e6659
      David S. Miller 提交于
      As per RFC2461, section 6.3.6, item #2, when no routers on the
      matching list are known to be reachable or probably reachable we
      do round robin on those available routes so that we make sure
      to probe as many of them as possible to detect when one becomes
      reachable faster.
      
      Each routing table has a rwlock protecting the tree and the linked
      list of routes at each leaf.  The round robin code executes during
      lookup and thus with the rwlock taken as a reader.  A small local
      spinlock tries to provide protection but this does not work at all
      for two reasons:
      
      1) The round-robin list manipulation, as coded, goes like this (with
         read lock held):
      
      	walk routes finding head and tail
      
      	spin_lock();
      	rotate list using head and tail
      	spin_unlock();
      
         While one thread is rotating the list, another thread can
         end up with stale values of head and tail and then proceed
         to corrupt the list when it gets the lock.  This ends up causing
         the OOPS in fib6_add() later onthat many people have been hitting.
      
      2) All the other code paths that run with the rwlock held as
         a reader do not expect the list to change on them, they
         expect it to remain completely fixed while they hold the
         lock in that way.
      
      So, simply stated, it is impossible to implement this correctly using
      a manipulation of the list without violating the rwlock locking
      semantics.
      
      Reimplement using a per-fib6_node round-robin pointer.  This way we
      don't need to manipulate the list at all, and since the round-robin
      pointer can only ever point to real existing entries we don't need
      to perform any locking on the changing of the round-robin pointer
      itself.  We only need to reset the round-robin pointer to NULL when
      the entry it is pointing to is removed.
      
      The idea is from Thomas Graf and it is very similar to how this
      was implemented before the advanced router selection code when in.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f11e6659
    • A
      [NET]: Fix neighbour destructor handling. · ecbb4169
      Alexey Kuznetsov 提交于
      ->neigh_destructor() is killed (not used), replaced with
      ->neigh_cleanup(), which is called when neighbor entry goes to dead
      state. At this point everything is still valid: neigh->dev,
      neigh->parms etc.
      
      The device should guarantee that dead neighbor entries (neigh->dead !=
      0) do not get private part initialized, otherwise nobody will cleanup
      it.
      
      I think this is enough for ipoib which is the only user of this thing.
      Initialization private part of neighbor entries happens in ipib
      start_xmit routine, which is not reached when device is down.  But it
      would be better to add explicit test for neigh->dead in any case.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ecbb4169
    • T
      [NET]: Fix fib_rules compatibility breakage · e1701c68
      Thomas Graf 提交于
      Based upon a patch from Patrick McHardy.
      
      The fib_rules netlink attribute policy introduced in 2.6.19 broke
      userspace compatibilty. When specifying a rule with "from all"
      or "to all", iproute adds a zero byte long netlink attribute,
      but the policy requires all addresses to have a size equal to
      sizeof(struct in_addr)/sizeof(struct in6_addr), resulting in a
      validation error.
      
      Check attribute length of FRA_SRC/FRA_DST in the generic framework
      by letting the family specific rules implementation provide the
      length of an address. Report an error if address length is non
      zero but no address attribute is provided. Fix actual bug by
      checking address length for non-zero instead of relying on
      availability of attribute.
      Signed-off-by: NThomas Graf <tgraf@suug.ch>
      Signed-off-by: NPatrick McHardy <kaber@trash.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e1701c68
  3. 25 3月, 2007 7 次提交
  4. 24 3月, 2007 3 次提交
    • R
      [PATCH] i386: clear segment register padding in core dumps · 6ea65ff7
      Roland McGrath 提交于
      The segment register slots in struct pt_regs are padded to 32 bits.
      Some of these are stored with instructions like "pushl %es", which
      leaves the high 16 bits as they were.  So the high bits of these
      fields in struct pt_regs contain kernel stack garbage.  These bits are
      ignored by everything and never leak to user space, except in core
      dumps.  The user struct pt_regs is always at the base of the thread's
      kernel stack and so it seems unlikely the information that leaks from
      here is ever worthwhile so as to be a security concern, but I'm not
      sure about that.  It has been this way for ages; userland consumers of
      core dumps all mask off these high bits themselves.  So it is not urgent.
      
      This change masks off the padding bits of the segment register slots
      in core dumps.  ptrace already masks off these high bits, so this
      makes the values in core dumps consistent with what ptrace would
      report just before the process died.
      
      As I read the processor manuals, the cs and ss values will always be
      padded with zero bits rather than stack garbage.  But unlike "pushl %es",
      this is not simple to test with a userland program.  So I added the two
      instructions rather than wonder if they are really never necessary.
      
      I think that x86_64 does not have this problem (for either 32-bit or
      64-bit processes).  It only uses "mov" instructions from segment
      registers, which zero-extend.
      Signed-off-by: NRoland McGrath <roland@redhat.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      6ea65ff7
    • L
      x86-64: add "local_apic_timer_c2_ok" here too · 2e7c2838
      Linus Torvalds 提交于
      Needed for any architecture that claims ARCH_APICTIMER_STOPS_ON_C3,
      not just i386.
      
      I'm hoping Thomas will clean this up a bit later..
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2e7c2838
    • T
      [PATCH] i386: add command line option "local_apic_timer_c2_ok" · e585bef8
      Thomas Gleixner 提交于
      It turned out that it is almost impossible to trust ACPI, BIOS & Co.
      regarding the C states. This was the reason to switch the local apic
      timer off in C2 state already. OTOH there are sane and well behaving
      systems, which get punished by that decision.
      
      Allow the user to confirm that the local apic timer is trustworthy in C2
      state. This keeps the default behaviour on the safe side.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Acked-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      e585bef8
  5. 23 3月, 2007 2 次提交
  6. 22 3月, 2007 1 次提交
  7. 21 3月, 2007 2 次提交
    • Z
      [IA64] min_low_pfn and max_low_pfn calculation fix · a3f5c338
      Zou Nan hai 提交于
      We have seen bad_pte_print when testing crashdump on an SN machine in
      recent 2.6.20 kernel.  There are tons of bad pte print (pfn < max_low_pfn)
      reports when the crash kernel boots up, all those reported bad pages
      are inside initmem range; That is because if the crash kernel code and
      data happens to be at the beginning of the 1st node. build_node_maps in
      discontig.c will bypass reserved regions with filter_rsvd_memory. Since
      min_low_pfn is calculated in build_node_map, so in this case, min_low_pfn
      will be greater than kernel code and data.
      
      Because pages inside initmem are freed and reused later, we saw
      pfn_valid check fail on those pages.
      
      I think this theoretically happen on a normal kernel. When I check
      min_low_pfn and max_low_pfn calculation in contig.c and discontig.c.
      I found more issues than this.
      
      1. min_low_pfn and max_low_pfn calculation is inconsistent between
      contig.c and discontig.c,
      min_low_pfn is calculated as the first page number of boot memmap in
      contig.c (Why? Though this may work at the most of the time, I don't
      think it is the right logic). It is calculated as the lowest physical
      memory page number bypass reserved regions in discontig.c.
      max_low_pfn is calculated include reserved regions in contig.c. It is
      calculated exclude reserved regions in discontig.c.
      
      2. If kernel code and data region is happen to be at the begin or the
      end of physical memory, when min_low_pfn and max_low_pfn calculation is
      bypassed kernel code and data, pages in initmem will report bad.
      
      3. initrd is also in reserved regions, if it is at the begin or at the
      end of physical memory, kernel will refuse to reuse the memory. Because
      the virt_addr_valid check in free_initrd_mem.
      
      So it is better to fix and clean up those issues.
      Calculate min_low_pfn and max_low_pfn in a consistent way.
      Signed-off-by: NZou Nan hai <nanhai.zou@intel.com>
      Acked-by: NJay Lan <jlan@sgi.com>
      Signed-off-by: NTony Luck <tony.luck@intel.com>
      a3f5c338
    • U
      [ARM] 4235/1: ns9xxx: declare the clock functions as "const" · 9d5cf5ad
      Uwe Kleine-König 提交于
      This patch removes some "const"s that I introduced thinking they mean
      the same thing as the "const"s introduced here.  So it fixes three warnings.
      Signed-off-by: NUwe Kleine-König <ukleinek@informatik.uni-freiburg.de>
      Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
      9d5cf5ad
  8. 20 3月, 2007 3 次提交
  9. 19 3月, 2007 4 次提交
  10. 18 3月, 2007 1 次提交
  11. 17 3月, 2007 11 次提交