1. 24 2月, 2010 1 次提交
  2. 12 2月, 2010 2 次提交
    • S
      ptrace: Add support for generic PTRACE_GETREGSET/PTRACE_SETREGSET · 2225a122
      Suresh Siddha 提交于
      Generic support for PTRACE_GETREGSET/PTRACE_SETREGSET commands which
      export the regsets supported by each architecture using the correponding
      NT_* types. These NT_* types are already part of the userland ABI, used
      in representing the architecture specific register sets as different NOTES
      in an ELF core file.
      
      'addr' parameter for the ptrace system call encode the REGSET type (using
      the corresppnding NT_* type) and the 'data' parameter points to the
      struct iovec having the user buffer and the length of that buffer.
      
      	struct iovec iov = { buf, len};
      	ret = ptrace(PTRACE_GETREGSET/PTRACE_SETREGSET, pid, NT_XXX_TYPE, &iov);
      
      On successful completion, iov.len will be updated by the kernel specifying
      how much the kernel has written/read to/from the user's iov.buf.
      
      x86 extended state registers are primarily exported using this interface.
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      LKML-Reference: <20100211195614.886724710@sbs-t61.sc.intel.com>
      Acked-by: NHongjiu Lu <hjl.tools@gmail.com>
      Cc: Roland McGrath <roland@redhat.com>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      2225a122
    • S
      x86, ptrace: regset extensions to support xstate · 5b3efd50
      Suresh Siddha 提交于
      Add the xstate regset support which helps extend the kernel ptrace and the
      core-dump interfaces to support AVX state etc.
      
      This regset interface is designed to support all the future state that gets
      supported using xsave/xrstor infrastructure.
      
      Looking at the memory layout saved by "xsave", one can't say which state
      is represented in the memory layout. This is because if a particular state is
      in init state, in the xsave hdr it can be represented by bit '0'. And hence
      we can't really say by the xsave header wether a state is in init state or
      the state is not saved in the memory layout.
      
      And hence the xsave memory layout available through this regset
      interface uses SW usable bytes [464..511] to convey what state is represented
      in the memory layout.
      
      First 8 bytes of the sw_usable_bytes[464..467] will be set to OS enabled xstate
      mask(which is same as the 64bit mask returned by the xgetbv's xCR0).
      
      The note NT_X86_XSTATE represents the extended state information in the
      core file, using the above mentioned memory layout.
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      LKML-Reference: <20100211195614.802495327@sbs-t61.sc.intel.com>
      Signed-off-by: NHongjiu Lu <hjl.tools@gmail.com>
      Cc: Roland McGrath <roland@redhat.com>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      5b3efd50
  3. 11 2月, 2010 2 次提交
  4. 09 2月, 2010 3 次提交
    • M
      drm/nouveau: Add getparam to get available PGRAPH units. · 69c9700b
      Marcin Kościelnicki 提交于
      On nv50, this will be needed by applications using CUDA to know
      how much stack/local memory to allocate.
      Signed-off-by: NMarcin Kościelnicki <koriakin@0x04.net>
      Signed-off-by: NBen Skeggs <bskeggs@redhat.com>
      69c9700b
    • P
      netfilter: nf_conntrack: fix hash resizing with namespaces · d696c7bd
      Patrick McHardy 提交于
      As noticed by Jon Masters <jonathan@jonmasters.org>, the conntrack hash
      size is global and not per namespace, but modifiable at runtime through
      /sys/module/nf_conntrack/hashsize. Changing the hash size will only
      resize the hash in the current namespace however, so other namespaces
      will use an invalid hash size. This can cause crashes when enlarging
      the hashsize, or false negative lookups when shrinking it.
      
      Move the hash size into the per-namespace data and only use the global
      hash size to initialize the per-namespace value when instanciating a
      new namespace. Additionally restrict hash resizing to init_net for
      now as other namespaces are not handled currently.
      
      Cc: stable@kernel.org
      Signed-off-by: NPatrick McHardy <kaber@trash.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d696c7bd
    • E
      netfilter: nf_conntrack: per netns nf_conntrack_cachep · 5b3501fa
      Eric Dumazet 提交于
      nf_conntrack_cachep is currently shared by all netns instances, but
      because of SLAB_DESTROY_BY_RCU special semantics, this is wrong.
      
      If we use a shared slab cache, one object can instantly flight between
      one hash table (netns ONE) to another one (netns TWO), and concurrent
      reader (doing a lookup in netns ONE, 'finding' an object of netns TWO)
      can be fooled without notice, because no RCU grace period has to be
      observed between object freeing and its reuse.
      
      We dont have this problem with UDP/TCP slab caches because TCP/UDP
      hashtables are global to the machine (and each object has a pointer to
      its netns).
      
      If we use per netns conntrack hash tables, we also *must* use per netns
      conntrack slab caches, to guarantee an object can not escape from one
      namespace to another one.
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      [Patrick: added unique slab name allocation]
      Cc: stable@kernel.org
      Signed-off-by: NPatrick McHardy <kaber@trash.net>
      5b3501fa
  5. 07 2月, 2010 2 次提交
  6. 05 2月, 2010 1 次提交
  7. 04 2月, 2010 1 次提交
  8. 03 2月, 2010 1 次提交
    • E
      connector: Delete buggy notification code. · f98bfbd7
      Evgeniy Polyakov 提交于
      On Tue, Feb 02, 2010 at 02:57:14PM -0800, Greg KH (gregkh@suse.de) wrote:
      > > There are at least two ways to fix it: using a big cannon and a small
      > > one. The former way is to disable notification registration, since it is
      > > not used by anyone at all. Second way is to check whether calling
      > > process is root and its destination group is -1 (kind of priveledged
      > > one) before command is dispatched to workqueue.
      > 
      > Well if no one is using it, removing it makes the most sense, right?
      > 
      > No objection from me, care to make up a patch either way for this?
      
      Getting it is not used, let's drop support for notifications about
      (un)registered events from connector.
      Another option was to check credentials on receiving, but we can always
      restore it without bugs if needed, but genetlink has a wider code base
      and none complained, that userspace can not get notification when some
      other clients were (un)registered.
      
      Kudos for Sebastian Krahmer <krahmer@suse.de>, who found a bug in the
      code.
      Signed-off-by: NEvgeniy Polyakov <zbr@ioremap.net>
      Acked-by: NGreg Kroah-Hartman <gregkh@suse.de>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f98bfbd7
  9. 01 2月, 2010 2 次提交
    • J
      softlockup: Add sched_clock_tick() to avoid kernel warning on kgdb resume · d6ad3e28
      Jason Wessel 提交于
      When CONFIG_HAVE_UNSTABLE_SCHED_CLOCK is set, sched_clock() gets
      the time from hardware such as the TSC on x86. In this
      configuration kgdb will report a softlock warning message on
      resuming or detaching from a debug session.
      
      Sequence of events in the problem case:
      
       1) "cpu sched clock" and "hardware time" are at 100 sec prior
          to a call to kgdb_handle_exception()
      
       2) Debugger waits in kgdb_handle_exception() for 80 sec and on
          exit the following is called ...  touch_softlockup_watchdog() -->
          __raw_get_cpu_var(touch_timestamp) = 0;
      
       3) "cpu sched clock" = 100s (it was not updated, because the
          interrupt was disabled in kgdb) but the "hardware time" = 180 sec
      
       4) The first timer interrupt after resuming from
          kgdb_handle_exception updates the watchdog from the "cpu sched clock"
      
      update_process_times() { ...  run_local_timers() -->
      softlockup_tick() --> check (touch_timestamp == 0) (it is "YES"
      here, we have set "touch_timestamp = 0" at kgdb) -->
      __touch_softlockup_watchdog() ***(A)--> reset "touch_timestamp"
      to "get_timestamp()" (Here, the "touch_timestamp" will still be
      set to 100s.)  ...
      
          scheduler_tick() ***(B)--> sched_clock_tick() (update "cpu sched
          clock" to "hardware time" = 180s) ...  }
      
       5) The Second timer interrupt handler appears to have a large
          jump and trips the softlockup warning.
      
      update_process_times() { ...  run_local_timers() -->
      softlockup_tick() --> "cpu sched clock" - "touch_timestamp" =
      180s-100s > 60s --> printk "soft lockup error messages" ...  }
      
      note: ***(A) reset "touch_timestamp" to
      "get_timestamp(this_cpu)"
      
      Why is "touch_timestamp" 100 sec, instead of 180 sec?
      
      When CONFIG_HAVE_UNSTABLE_SCHED_CLOCK is set, the call trace of
      get_timestamp() is:
      
      get_timestamp(this_cpu)
       -->cpu_clock(this_cpu)
       -->sched_clock_cpu(this_cpu)
       -->__update_sched_clock(sched_clock_data, now)
      
      The __update_sched_clock() function uses the GTOD tick value to
      create a window to normalize the "now" values.  So if "now"
      value is too big for sched_clock_data, it will be ignored.
      
      The fix is to invoke sched_clock_tick() to update "cpu sched
      clock" in order to recover from this state.  This is done by
      introducing the function touch_softlockup_watchdog_sync(). This
      allows kgdb to request that the sched clock is updated when the
      watchdog thread runs the first time after a resume from kgdb.
      
      [yong.zhang0@gmail.com: Use per cpu instead of an array]
      Signed-off-by: NJason Wessel <jason.wessel@windriver.com>
      Signed-off-by: NDongdong Deng <Dongdong.Deng@windriver.com>
      Cc: kgdb-bugreport@lists.sourceforge.net
      Cc: peterz@infradead.org
      LKML-Reference: <1264631124-4837-2-git-send-email-jason.wessel@windriver.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      d6ad3e28
    • M
  10. 30 1月, 2010 2 次提交
    • J
      perf, hw_breakpoint, kgdb: Do not take mutex for kernel debugger · 5352ae63
      Jason Wessel 提交于
      This patch fixes the regression in functionality where the
      kernel debugger and the perf API do not nicely share hw
      breakpoint reservations.
      
      The kernel debugger cannot use any mutex_lock() calls because it
      can start the kernel running from an invalid context.
      
      A mutex free version of the reservation API needed to get
      created for the kernel debugger to safely update hw breakpoint
      reservations.
      
      The possibility for a breakpoint reservation to be concurrently
      processed at the time that kgdb interrupts the system is
      improbable. Should this corner case occur the end user is
      warned, and the kernel debugger will prohibit updating the
      hardware breakpoint reservations.
      
      Any time the kernel debugger reserves a hardware breakpoint it
      will be a system wide reservation.
      Signed-off-by: NJason Wessel <jason.wessel@windriver.com>
      Acked-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: kgdb-bugreport@lists.sourceforge.net
      Cc: K.Prasad <prasad@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: torvalds@linux-foundation.org
      LKML-Reference: <1264719883-7285-3-git-send-email-jason.wessel@windriver.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      5352ae63
    • L
      Split 'flush_old_exec' into two functions · 221af7f8
      Linus Torvalds 提交于
      'flush_old_exec()' is the point of no return when doing an execve(), and
      it is pretty badly misnamed.  It doesn't just flush the old executable
      environment, it also starts up the new one.
      
      Which is very inconvenient for things like setting up the new
      personality, because we want the new personality to affect the starting
      of the new environment, but at the same time we do _not_ want the new
      personality to take effect if flushing the old one fails.
      
      As a result, the x86-64 '32-bit' personality is actually done using this
      insane "I'm going to change the ABI, but I haven't done it yet" bit
      (TIF_ABI_PENDING), with SET_PERSONALITY() not actually setting the
      personality, but just the "pending" bit, so that "flush_thread()" can do
      the actual personality magic.
      
      This patch in no way changes any of that insanity, but it does split the
      'flush_old_exec()' function up into a preparatory part that can fail
      (still called flush_old_exec()), and a new part that will actually set
      up the new exec environment (setup_new_exec()).  All callers are changed
      to trivially comply with the new world order.
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      Cc: stable@kernel.org
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      221af7f8
  11. 29 1月, 2010 2 次提交
  12. 28 1月, 2010 1 次提交
    • L
      mm: add new 'read_cache_page_gfp()' helper function · 0531b2aa
      Linus Torvalds 提交于
      It's a simplified 'read_cache_page()' which takes a page allocation
      flag, so that different paths can control how aggressive the memory
      allocations are that populate a address space.
      
      In particular, the intel GPU object mapping code wants to be able to do
      a certain amount of own internal memory management by automatically
      shrinking the address space when memory starts getting tight.  This
      allows it to dynamically use different memory allocation policies on a
      per-allocation basis, rather than depend on the (static) address space
      gfp policy.
      
      The actual new function is a one-liner, but re-organizing the helper
      functions to the point where you can do this with a single line of code
      is what most of the patch is all about.
      Tested-by: NChris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      0531b2aa
  13. 27 1月, 2010 1 次提交
  14. 25 1月, 2010 2 次提交
  15. 24 1月, 2010 1 次提交
  16. 21 1月, 2010 3 次提交
    • P
      perf: Change the is_software_event() definition · 92b67598
      Peter Zijlstra 提交于
      The is_software_event() definition always confuses me because its an
      exclusive expression, make it an inclusive one.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      92b67598
    • M
      sched: Fix vmark regression on big machines · 50b926e4
      Mike Galbraith 提交于
      SD_PREFER_SIBLING is set at the CPU domain level if power saving isn't
      enabled, leading to many cache misses on large machines as we traverse
      looking for an idle shared cache to wake to.  Change the enabler of
      select_idle_sibling() to SD_SHARE_PKG_RESOURCES, and enable same at the
      sibling domain level.
      Reported-by: NLin Ming <ming.m.lin@intel.com>
      Signed-off-by: NMike Galbraith <efault@gmx.de>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1262612696.15495.15.camel@marge.simson.net>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      50b926e4
    • S
      USB: Fix duplicate sysfs problem after device reset. · 04a723ea
      Sarah Sharp 提交于
      Borislav Petkov reports issues with duplicate sysfs endpoint files after a
      resume from a hibernate.  It turns out that the code to support alternate
      settings under xHCI has issues when a device with a non-default alternate
      setting is reset during the hibernate:
      
      [  427.681810] Restarting tasks ...
      [  427.681995] hub 1-0:1.0: state 7 ports 6 chg 0004 evt 0000
      [  427.682019] usb usb3: usb resume
      [  427.682030] ohci_hcd 0000:00:12.0: wakeup root hub
      [  427.682191] hub 1-0:1.0: port 2, status 0501, change 0000, 480 Mb/s
      [  427.682205] usb 1-2: usb wakeup-resume
      [  427.682226] usb 1-2: finish reset-resume
      [  427.682886] done.
      [  427.734658] ehci_hcd 0000:00:12.2: port 2 high speed
      [  427.734663] ehci_hcd 0000:00:12.2: GetStatus port 2 status 001005 POWER sig=se0 PE CONNECT
      [  427.746682] hub 3-0:1.0: hub_reset_resume
      [  427.746693] hub 3-0:1.0: trying to enable port power on non-switchable hub
      [  427.786715] usb 1-2: reset high speed USB device using ehci_hcd and address 2
      [  427.839653] ehci_hcd 0000:00:12.2: port 2 high speed
      [  427.839666] ehci_hcd 0000:00:12.2: GetStatus port 2 status 001005 POWER sig=se0 PE CONNECT
      [  427.847717] ohci_hcd 0000:00:12.0: GetStatus roothub.portstatus [1] = 0x00010100 CSC PPS
      [  427.915497] hub 1-2:1.0: remove_intf_ep_devs: if: ffff88022f9e8800 ->ep_devs_created: 1
      [  427.915774] hub 1-2:1.0: remove_intf_ep_devs: bNumEndpoints: 1
      [  427.915934] hub 1-2:1.0: if: ffff88022f9e8800: endpoint devs removed.
      [  427.916158] hub 1-2:1.0: create_intf_ep_devs: if: ffff88022f9e8800 ->ep_devs_created: 0, ->unregistering: 0
      [  427.916434] hub 1-2:1.0: create_intf_ep_devs: bNumEndpoints: 1
      [  427.916609]  ep_81: create, parent hub
      [  427.916632] ------------[ cut here ]------------
      [  427.916644] WARNING: at fs/sysfs/dir.c:477 sysfs_add_one+0x82/0x96()
      [  427.916649] Hardware name: System Product Name
      [  427.916653] sysfs: cannot create duplicate filename '/devices/pci0000:00/0000:00:12.2/usb1/1-2/1-2:1.0/ep_81'
      [  427.916658] Modules linked in: binfmt_misc kvm_amd kvm powernow_k8 cpufreq_ondemand cpufreq_powersave cpufreq_userspace freq_table cpufreq_conservative ipv6 vfat fat
      +8250_pnp 8250 pcspkr ohci_hcd serial_core k10temp edac_core
      [  427.916694] Pid: 278, comm: khubd Not tainted 2.6.33-rc2-00187-g08d869aa-dirty #13
      [  427.916699] Call Trace:
      
      The problem is caused by a mismatch between the USB core's view of the
      device state and the USB device and xHCI host's view of the device state.
      
      After the device reset and re-configuration, the device and the xHCI host
      think they are using alternate setting 0 of all interfaces.  However, the
      USB core keeps track of the old state, which may include non-zero
      alternate settings.  It uses intf->cur_altsetting to keep the endpoint
      sysfs files for the old state across the reset.
      
      The bandwidth allocation functions need to know what the xHCI host thinks
      the current alternate settings are, so original patch set
      intf->cur_altsetting to the alternate setting 0.  This caused duplicate
      endpoint files to be created.
      
      The solution is to not set intf->cur_altsetting before calling
      usb_set_interface() in usb_reset_and_verify_device().  Instead, we add a
      new flag to struct usb_interface to tell usb_hcd_alloc_bandwidth() to use
      alternate setting 0 as the currently installed alternate setting.
      Signed-off-by: NSarah Sharp <sarah.a.sharp@linux.intel.com>
      Tested-by: NBorislav Petkov <petkovbb@googlemail.com>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
      04a723ea
  17. 19 1月, 2010 2 次提交
  18. 18 1月, 2010 3 次提交
  19. 17 1月, 2010 8 次提交