1. 19 May 2010, 4 commits
  2. 11 May 2010, 1 commit
  3. 09 May 2010, 3 commits
    • tracing: Factorize lock events in a lock class · 2c193c73
      Committed by Frederic Weisbecker
      lock_acquired, lock_contended and lock_release now share the
      same prototype and format. Let's factorize them into a lock
      event class.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      2c193c73
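      For illustration, the factorization above maps onto the tracepoint
      event-class macros roughly as follows (a minimal sketch, not the exact
      upstream definition):

        /* One class carries the shared prototype and format... */
        DECLARE_EVENT_CLASS(lock,
                TP_PROTO(struct lockdep_map *lock, unsigned long ip),
                TP_ARGS(lock, ip),
                TP_STRUCT__entry(
                        __string(name, lock->name)
                ),
                TP_fast_assign(
                        __assign_str(name, lock->name);
                ),
                TP_printk("%s", __get_str(name))
        );

        /* ...and each event becomes a one-line instantiation of it. */
        DEFINE_EVENT(lock, lock_contended,
                TP_PROTO(struct lockdep_map *lock, unsigned long ip),
                TP_ARGS(lock, ip)
        );
        DEFINE_EVENT(lock, lock_acquired,
                TP_PROTO(struct lockdep_map *lock, unsigned long ip),
                TP_ARGS(lock, ip)
        );
        DEFINE_EVENT(lock, lock_release,
                TP_PROTO(struct lockdep_map *lock, unsigned long ip),
                TP_ARGS(lock, ip)
        );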
    • tracing: Drop the nested field from lock_release event · 93135439
      Committed by Frederic Weisbecker
      Drop the nested field as we don't use it. Every nested state can
      already be computed by a state machine during post-processing.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      93135439
    • tracing: Drop lock_acquired waittime field · 883a2a31
      Committed by Frederic Weisbecker
      Drop the waittime field from the lock_acquired event; we can
      calculate it by subtracting the timestamp of the matching
      lock_acquire event from that of lock_acquired.

      It is not needed and only wastes space in the traces.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      883a2a31
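      For illustration, recovering the wait time in post-processing is just a
      timestamp subtraction; a hypothetical helper (names are ours, not from
      the tree):

        /* Wait time is the delta between the matching lock_acquire and
         * lock_acquired timestamps (both in the same clock, e.g. ns). */
        static inline u64 lock_wait_ns(u64 acquire_ts, u64 acquired_ts)
        {
                return acquired_ts - acquire_ts;
        }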
  4. 07 May 2010, 3 commits
    • perf: Add group scheduling transactional APIs · 6bde9b6c
      Committed by Lin Ming
      Add group scheduling transactional APIs to struct pmu.
      These APIs will be implemented in arch code, based on Peter's idea,
      quoted below.
      
      > the idea behind hw_perf_group_sched_in() is to not perform
      > schedulability tests on each event in the group, but to add the group
      > as a whole and then perform one test.
      >
      > Of course, when that test fails, you'll have to roll-back the whole
      > group again.
      >
      > So start_txn (or a better name) would simply toggle a flag in the pmu
      > implementation that will make pmu::enable() not perform the
      > schedulability test.
      >
      > Then commit_txn() will perform the schedulability test (so note the
      > method has to have a !void return value).
      >
      > This will allow us to use the regular
      > kernel/perf_event.c::group_sched_in() and all the rollback code.
      > Currently each hw_perf_group_sched_in() implementation duplicates all
      > the rollback code (with various bugs).
      
      ->start_txn:
      Start a group-event scheduling transaction: set a flag so that
      pmu::enable() skips the schedulability test; it will be performed
      at commit time.

      ->commit_txn:
      Commit the group-event scheduling transaction: perform the
      schedulability test for the group as a whole.

      ->cancel_txn:
      Cancel the group-event scheduling transaction: clear the flag so
      that pmu::enable() performs the schedulability test again.
      Reviewed-by: Stephane Eranian <eranian@google.com>
      Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Lin Ming <ming.m.lin@intel.com>
      Cc: David Miller <davem@davemloft.net>
      Cc: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1272002160.5707.60.camel@minggr.sh.intel.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      6bde9b6c
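      Roughly, the new hooks on struct pmu and their intended use look like the
      sketch below (simplified and hedged: helper names such as event_sched_in()
      stand in for the real ones, and details may differ from the tree):

        struct pmu {
                int  (*enable)(struct perf_event *event);
                void (*disable)(struct perf_event *event);
                /* ... */
                void (*start_txn)(const struct pmu *pmu);  /* open a group transaction  */
                int  (*commit_txn)(const struct pmu *pmu); /* one schedulability test   */
                void (*cancel_txn)(const struct pmu *pmu); /* roll the whole group back */
        };

        /* Simplified core-side usage for scheduling a whole group at once. */
        static int group_sched_in(struct perf_event *leader, const struct pmu *pmu)
        {
                struct perf_event *sub;

                pmu->start_txn(pmu);    /* pmu::enable() now skips per-event tests */

                if (event_sched_in(leader))
                        goto fail;
                list_for_each_entry(sub, &leader->sibling_list, group_entry)
                        if (event_sched_in(sub))
                                goto fail;

                if (!pmu->commit_txn(pmu))      /* test the group as a whole */
                        return 0;
        fail:
                pmu->cancel_txn(pmu);   /* undo whatever was scheduled so far */
                return -EAGAIN;
        }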
    • perf, x86: Improve the PEBS ABI · ab608344
      Committed by Peter Zijlstra
      Rename perf_event_attr::precise to perf_event_attr::precise_ip and
      widen it to 2 bits. This new field describes the required precision of
      the PERF_SAMPLE_IP field:
      
        0 - SAMPLE_IP can have arbitrary skid
        1 - SAMPLE_IP must have constant skid
        2 - SAMPLE_IP requested to have 0 skid
        3 - SAMPLE_IP must have 0 skid
      
      And modify the Intel PEBS code accordingly. The PEBS implementation
      now supports up to precise_ip == 2, where we perform the IP fixup.
      
      Also s/PERF_RECORD_MISC_EXACT/&_IP/ to clarify its meaning: this bit
      should be set for each PERF_SAMPLE_IP field known to match the actual
      instruction triggering the event.
      
      This new scheme allows for a PEBS mode that uses the buffer for more
      than a single event.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Stephane Eranian <eranian@google.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      ab608344
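      From user space the new field is simply a value in perf_event_attr before
      perf_event_open(); a minimal example requesting zero skid (error handling
      omitted; the open fails if the PMU cannot provide the requested precision):

        #include <linux/perf_event.h>
        #include <sys/syscall.h>
        #include <sys/types.h>
        #include <unistd.h>
        #include <string.h>

        static int open_precise_cycles(pid_t pid, int cpu)
        {
                struct perf_event_attr attr;

                memset(&attr, 0, sizeof(attr));
                attr.size          = sizeof(attr);
                attr.type          = PERF_TYPE_HARDWARE;
                attr.config        = PERF_COUNT_HW_CPU_CYCLES;
                attr.sample_type   = PERF_SAMPLE_IP;
                attr.sample_period = 100000;
                attr.precise_ip    = 2; /* SAMPLE_IP requested to have 0 skid */

                return syscall(__NR_perf_event_open, &attr, pid, cpu, -1, 0);
        }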
    • perf: Fix exit() vs PERF_FORMAT_GROUP · 4fd38e45
      Committed by Peter Zijlstra
      Both Stephane and Corey reported that PERF_FORMAT_GROUP didn't work
      as expected if the task the counters were attached to quit before
      the read() call.
      
      The cause is that we unconditionally destroy the grouping when we
      remove counters from their context. Fix this by destroying the
      grouping only when we free the counter itself.
      Reported-by: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Reported-by: Stephane Eranian <eranian@google.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1273160566.5605.404.camel@twins>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      4fd38e45
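      For context, PERF_FORMAT_GROUP makes a single read() on the group leader
      return every member's count at once; without the optional TOTAL_TIME/ID
      bits the layout is roughly:

        #include <linux/types.h>

        /* Simplified read() layout when attr.read_format == PERF_FORMAT_GROUP. */
        struct group_read_format {
                __u64 nr;       /* number of events in the group */
                __u64 values[]; /* one counter value per event   */
        };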
  5. 04 May 2010, 1 commit
  6. 03 May 2010, 1 commit
  7. 01 May 2010, 4 commits
    • hw-breakpoints: Get the number of available registers on boot dynamically · feef47d0
      Committed by Frederic Weisbecker
      The breakpoint generic layer assumes that archs always know in advance
      the static number of address registers available to host breakpoints
      through the HBP_NUM macro.
      
      However, this is not true for every arch. For example, ARM needs to
      get this information dynamically to handle compatibility between
      different versions.

      To solve this, this patch proposes to drop the static HBP_NUM macro
      and let the arch provide the number of available slots through a
      new hw_breakpoint_slots() function. For archs that have
      CONFIG_HAVE_MIXED_BREAKPOINTS_REGS selected, it will be called once,
      since the same register pool serves instruction and data breakpoints
      together.
      For the others it will be called once to get the number of
      instruction breakpoint registers and again to get the number of
      data breakpoint registers; the targeted type is passed as a
      parameter to hw_breakpoint_slots().
      Reported-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Acked-by: Paul Mundt <lethal@linux-sh.org>
      Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Cc: K. Prasad <prasad@linux.vnet.ibm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Jason Wessel <jason.wessel@windriver.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      feef47d0
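      An arch-side implementation of the new hook can be as small as the
      hypothetical sketch below; the probe helpers are made up, only the
      hw_breakpoint_slots() entry point and the instruction/data split come
      from the patch:

        /* Hypothetical arch hook: report how many breakpoint registers are
         * available for the requested type, probed at boot. */
        int hw_breakpoint_slots(int type)
        {
                switch (type) {
                case TYPE_INST:                         /* instruction breakpoints */
                        return arch_probe_inst_bp_regs();   /* made-up helper */
                case TYPE_DATA:                         /* data breakpoints */
                        return arch_probe_data_bp_regs();   /* made-up helper */
                default:
                        return 0;
                }
        }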
    • hw-breakpoints: Separate constraint space for data and instruction breakpoints · 0102752e
      Committed by Frederic Weisbecker
      There are two prevailing ways for archs to implement hardware
      breakpoints.

      The first keeps the breakpoint address definition space separate
      for data and instruction breakpoints. We then typically have
      distinct instruction address breakpoint registers and data address
      breakpoint registers, along with separate control registers for
      data and instruction breakpoints. This is the case for PowerPC and
      ARM, for example.

      The second merges the breakpoint address space for data and
      instruction breakpoints: an address register can host either an
      instruction or a data address, and the access mode for the
      breakpoint is defined in a control register. This is the case for
      x86 and SuperH.

      This patch adds a new CONFIG_HAVE_MIXED_BREAKPOINTS_REGS option
      that archs can select if they belong to the second case. Those
      will have their slot allocation merged for instruction and data
      breakpoints. The others will track data and instruction breakpoint
      slots separately (see the sketch after this entry).
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Acked-by: Paul Mundt <lethal@linux-sh.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Cc: K. Prasad <prasad@linux.vnet.ibm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      0102752e
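      One way the generic layer can express the merged vs. separate slot
      tracking is to let both type indices collapse onto one slot pool when the
      new option is selected (a sketch under that assumption):

        #ifdef CONFIG_HAVE_MIXED_BREAKPOINTS_REGS
        enum bp_type_idx {
                TYPE_INST = 0,
                TYPE_DATA = 0,  /* shares the instruction breakpoint pool */
                TYPE_MAX
        };
        #else
        enum bp_type_idx {
                TYPE_INST = 0,
                TYPE_DATA,      /* tracked in its own pool */
                TYPE_MAX
        };
        #endif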
    • hw-breakpoints: Tag ptrace breakpoint as exclude_kernel · 73266fc1
      Committed by Frederic Weisbecker
      Tag ptrace breakpoints with the exclude_kernel attribute set. This
      will make it easier to apply generic policies on breakpoints when it
      comes to ensuring that unprivileged users cannot set breakpoints on
      the kernel (a user-space illustration follows this entry).
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Acked-by: Paul Mundt <lethal@linux-sh.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Cc: K. Prasad <prasad@linux.vnet.ibm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      73266fc1
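      In terms of the attribute a ptrace-style breakpoint is created with, the
      change simply means exclude_kernel is set; a hedged user-space
      illustration for a plain write watchpoint:

        #include <linux/perf_event.h>
        #include <linux/hw_breakpoint.h>
        #include <string.h>

        static void init_user_watchpoint(struct perf_event_attr *attr, void *addr)
        {
                memset(attr, 0, sizeof(*attr));
                attr->size           = sizeof(*attr);
                attr->type           = PERF_TYPE_BREAKPOINT;
                attr->bp_type        = HW_BREAKPOINT_W;
                attr->bp_addr        = (unsigned long)addr;
                attr->bp_len         = HW_BREAKPOINT_LEN_4;
                attr->exclude_kernel = 1; /* never fire on kernel-mode accesses */
        }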
    • USB: rename usb_buffer_alloc() and usb_buffer_free() · 073900a2
      Committed by Daniel Mack
      To make clearer what the functions actually do,
      
        usb_buffer_alloc() is renamed to usb_alloc_coherent()
        usb_buffer_free()  is renamed to usb_free_coherent()
      
      They should only be used in code which really needs DMA coherency.
      
      [added compatibility macros so we can convert things more easily - gregkh]
      Signed-off-by: Daniel Mack <daniel@caiaq.de>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: Pedro Ribeiro <pedrib@gmail.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      073900a2
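      The compatibility macros mentioned in the bracketed note presumably boil
      down to plain aliases; a sketch (not necessarily the exact upstream
      definitions):

        /* Old names kept as wrappers around the new, clearer ones. */
        #define usb_buffer_alloc(dev, size, mem_flags, dma) \
                usb_alloc_coherent((dev), (size), (mem_flags), (dma))
        #define usb_buffer_free(dev, size, addr, dma) \
                usb_free_coherent((dev), (size), (addr), (dma))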
  8. 29 April 2010, 3 commits
    • sctp: Fix skb_over_panic resulting from multiple invalid parameter errors (CVE-2010-1173) (v4) · 5fa782c2
      Committed by Neil Horman
      Ok, version 4
      
      Change Notes:
      1) Minor cleanups, from Vlad's notes
      
      Summary:
      
      Hey-
      	Recently, it was reported to me that the kernel could oops in the
      following way:
      
      <5> kernel BUG at net/core/skbuff.c:91!
      <5> invalid operand: 0000 [#1]
      <5> Modules linked in: sctp netconsole nls_utf8 autofs4 sunrpc iptable_filter
      ip_tables cpufreq_powersave parport_pc lp parport vmblock(U) vsock(U) vmci(U)
      vmxnet(U) vmmemctl(U) vmhgfs(U) acpiphp dm_mirror dm_mod button battery ac md5
      ipv6 uhci_hcd ehci_hcd snd_ens1371 snd_rawmidi snd_seq_device snd_pcm_oss
      snd_mixer_oss snd_pcm snd_timer snd_page_alloc snd_ac97_codec snd soundcore
      pcnet32 mii floppy ext3 jbd ata_piix libata mptscsih mptsas mptspi mptscsi
      mptbase sd_mod scsi_mod
      <5> CPU:    0
      <5> EIP:    0060:[<c02bff27>]    Not tainted VLI
      <5> EFLAGS: 00010216   (2.6.9-89.0.25.EL)
      <5> EIP is at skb_over_panic+0x1f/0x2d
      <5> eax: 0000002c   ebx: c033f461   ecx: c0357d96   edx: c040fd44
      <5> esi: c033f461   edi: df653280   ebp: 00000000   esp: c040fd40
      <5> ds: 007b   es: 007b   ss: 0068
      <5> Process swapper (pid: 0, threadinfo=c040f000 task=c0370be0)
      <5> Stack: c0357d96 e0c29478 00000084 00000004 c033f461 df653280 d7883180
      e0c2947d
      <5>        00000000 00000080 df653490 00000004 de4f1ac0 de4f1ac0 00000004
      df653490
      <5>        00000001 e0c2877a 08000800 de4f1ac0 df653490 00000000 e0c29d2e
      00000004
      <5> Call Trace:
      <5>  [<e0c29478>] sctp_addto_chunk+0xb0/0x128 [sctp]
      <5>  [<e0c2947d>] sctp_addto_chunk+0xb5/0x128 [sctp]
      <5>  [<e0c2877a>] sctp_init_cause+0x3f/0x47 [sctp]
      <5>  [<e0c29d2e>] sctp_process_unk_param+0xac/0xb8 [sctp]
      <5>  [<e0c29e90>] sctp_verify_init+0xcc/0x134 [sctp]
      <5>  [<e0c20322>] sctp_sf_do_5_1B_init+0x83/0x28e [sctp]
      <5>  [<e0c25333>] sctp_do_sm+0x41/0x77 [sctp]
      <5>  [<c01555a4>] cache_grow+0x140/0x233
      <5>  [<e0c26ba1>] sctp_endpoint_bh_rcv+0xc5/0x108 [sctp]
      <5>  [<e0c2b863>] sctp_inq_push+0xe/0x10 [sctp]
      <5>  [<e0c34600>] sctp_rcv+0x454/0x509 [sctp]
      <5>  [<e084e017>] ipt_hook+0x17/0x1c [iptable_filter]
      <5>  [<c02d005e>] nf_iterate+0x40/0x81
      <5>  [<c02e0bb9>] ip_local_deliver_finish+0x0/0x151
      <5>  [<c02e0c7f>] ip_local_deliver_finish+0xc6/0x151
      <5>  [<c02d0362>] nf_hook_slow+0x83/0xb5
      <5>  [<c02e0bb2>] ip_local_deliver+0x1a2/0x1a9
      <5>  [<c02e0bb9>] ip_local_deliver_finish+0x0/0x151
      <5>  [<c02e103e>] ip_rcv+0x334/0x3b4
      <5>  [<c02c66fd>] netif_receive_skb+0x320/0x35b
      <5>  [<e0a0928b>] init_stall_timer+0x67/0x6a [uhci_hcd]
      <5>  [<c02c67a4>] process_backlog+0x6c/0xd9
      <5>  [<c02c690f>] net_rx_action+0xfe/0x1f8
      <5>  [<c012a7b1>] __do_softirq+0x35/0x79
      <5>  [<c0107efb>] handle_IRQ_event+0x0/0x4f
      <5>  [<c01094de>] do_softirq+0x46/0x4d
      
      It's an skb_over_panic BUG halt that results from processing an init chunk in
      which too many of its variable-length parameters are in some way malformed.
      
      The problem is in sctp_process_unk_param:
      if (NULL == *errp)
      	*errp = sctp_make_op_error_space(asoc, chunk,
      					 ntohs(chunk->chunk_hdr->length));
      
      	if (*errp) {
      		sctp_init_cause(*errp, SCTP_ERROR_UNKNOWN_PARAM,
      				 WORD_ROUND(ntohs(param.p->length)));
      		sctp_addto_chunk(*errp,
      			WORD_ROUND(ntohs(param.p->length)),
      				  param.v);
      
      When we allocate an error chunk, we assume that the worst-case scenario
      requires chunk_hdr->length bytes of data, which would nominally be correct,
      given that we call sctp_addto_chunk for the violating parameter. Unfortunately,
      sctp_init_cause also inserts an sctp_errhdr_t structure into the error
      chunk, so the worst case, in which all parameters are in violation,
      requires chunk_hdr->length+(sizeof(sctp_errhdr_t)*param_count) bytes of data.
      
      The result of this error is that a deliberately malformed packet sent to a
      listening host can cause a remote DoS, described in CVE-2010-1173:
      http://cve.mitre.org/cgi-bin/cvename.cgi?name=2010-1173
      
      I've tested the fix below and confirmed that it resolves the issue. We move to a
      strategy whereby we allocate a fixed-size error chunk and ignore errors we don't
      have space to report.
      Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
      Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5fa782c2
    • sctp: Fix oops when sending queued ASCONF chunks · c0786693
      Committed by Vlad Yasevich
      When we finish processing an ASCONF_ACK chunk, we try to send
      the next queued ASCONF. This runs the sctp state machine
      recursively, which it is not prepared to handle.
      
      kernel BUG at kernel/timer.c:790!
      invalid opcode: 0000 [#1] SMP
      last sysfs file: /sys/module/ipv6/initstate
      Modules linked in: sha256_generic sctp libcrc32c ipv6 dm_multipath
      uinput 8139too i2c_piix4 8139cp mii i2c_core pcspkr virtio_net joydev
      floppy virtio_blk virtio_pci [last unloaded: scsi_wait_scan]
      
      Pid: 0, comm: swapper Not tainted 2.6.34-rc4 #15 /Bochs
      EIP: 0060:[<c044a2ef>] EFLAGS: 00010286 CPU: 0
      EIP is at add_timer+0xd/0x1b
      EAX: cecbab14 EBX: 000000f0 ECX: c0957b1c EDX: 03595cf4
      ESI: cecba800 EDI: cf276f00 EBP: c0957aa0 ESP: c0957aa0
       DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
      Process swapper (pid: 0, ti=c0956000 task=c0988ba0 task.ti=c0956000)
      Stack:
       c0957ae0 d1851214 c0ab62e4 c0ab5f26 0500ffff 00000004 00000005 00000004
      <0> 00000000 d18694fd 00000004 1666b892 cecba800 cecba800 c0957b14
      00000004
      <0> c0957b94 d1851b11 ceda8b00 cecba800 cf276f00 00000001 c0957b14
      000000d0
      Call Trace:
       [<d1851214>] ? sctp_side_effects+0x607/0xdfc [sctp]
       [<d1851b11>] ? sctp_do_sm+0x108/0x159 [sctp]
       [<d1863386>] ? sctp_pname+0x0/0x1d [sctp]
       [<d1861a56>] ? sctp_primitive_ASCONF+0x36/0x3b [sctp]
       [<d185657c>] ? sctp_process_asconf_ack+0x2a4/0x2d3 [sctp]
       [<d184e35c>] ? sctp_sf_do_asconf_ack+0x1dd/0x2b4 [sctp]
       [<d1851ac1>] ? sctp_do_sm+0xb8/0x159 [sctp]
       [<d1863334>] ? sctp_cname+0x0/0x52 [sctp]
       [<d1854377>] ? sctp_assoc_bh_rcv+0xac/0xe1 [sctp]
       [<d1858f0f>] ? sctp_inq_push+0x2d/0x30 [sctp]
       [<d186329d>] ? sctp_rcv+0x797/0x82e [sctp]
      Tested-by: Wei Yongjun <yjwei@cn.fujitsu.com>
      Signed-off-by: Yuansong Qiao <ysqiao@research.ait.ie>
      Signed-off-by: Shuaijun Zhang <szhang@research.ait.ie>
      Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c0786693
    • sctp: avoid irq lock inversion while call sk->sk_data_ready() · 561b1733
      Committed by Wei Yongjun
      sk->sk_data_ready() of an SCTP socket can be called from both BH and non-BH
      contexts, but the default sk->sk_data_ready(), sock_def_readable(), cannot
      be used in this case. Therefore, we have to add a new function,
      sctp_data_ready(), which does the wakeup with BH disabled (a sketch follows
      this entry).
      
      =========================================================
      [ INFO: possible irq lock inversion dependency detected ]
      2.6.33-rc6 #129
      ---------------------------------------------------------
      sctp_darn/1517 just changed the state of lock:
       (clock-AF_INET){++.?..}, at: [<c06aab60>] sock_def_readable+0x20/0x80
      but this lock took another, SOFTIRQ-unsafe lock in the past:
       (slock-AF_INET){+.-...}
      
      and interrupts could create inverse lock ordering between them.
      
      other info that might help us debug this:
      1 lock held by sctp_darn/1517:
       #0:  (sk_lock-AF_INET){+.+.+.}, at: [<cdfe363d>] sctp_sendmsg+0x23d/0xc00 [sctp]
      Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
      Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      561b1733
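      The shape of the fix is a copy of the default readable callback that
      takes the callback lock with bottom halves disabled; a hedged sketch
      modeled on sock_def_readable(), not necessarily the exact committed code:

        void sctp_data_ready(struct sock *sk, int len)
        {
                /* read_lock_bh() instead of read_lock(): safe from both BH
                 * and process context, avoiding the reported inversion. */
                read_lock_bh(&sk->sk_callback_lock);
                if (sk_has_sleeper(sk))
                        wake_up_interruptible_sync_poll(sk->sk_sleep, POLLIN |
                                                        POLLRDNORM | POLLRDBAND);
                sk_wake_async(sk, SOCK_WAKE_WAITD, POLL_IN);
                read_unlock_bh(&sk->sk_callback_lock);
        }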
  9. 28 April 2010, 1 commit
  10. 25 April 2010, 2 commits
  11. 24 April 2010, 1 commit
  12. 23 April 2010, 1 commit
    • NFS: Fix an unstable write data integrity race · 71d0a611
      Committed by Trond Myklebust
      Commit 2c61be0a (NFS: Ensure that the WRITE
      and COMMIT RPC calls are always uninterruptible) exposed a race on file
      close. In order to ensure correct close-to-open behaviour, we want to wait
      for all outstanding background commit operations to complete.
      
      This patch adds an inode flag that indicates if a commit operation is under
      way, and provides a mechanism to allow ->write_inode() to wait for its
      completion if this is a data integrity flush.
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      71d0a611
  13. 22 April 2010, 4 commits
  14. 21 April 2010, 1 commit
  15. 20 April 2010, 3 commits
  16. 19 April 2010, 4 commits
  17. 16 April 2010, 1 commit
  18. 15 April 2010, 2 commits
    • firewire: cdev: change license of exported header files to MIT license · 19b3eecc
      Committed by Stefan Richter
      Among else, this allows projects like libdc1394 to carry copies of the
      ABI related header files without them or distributors having to worry
      about effects on the project's overall license terms.  Switch to MIT
      license as suggested by Kristian.  Also update the year in the
      copyright statement according to source history.
      
      Cc: Jay Fenlason <fenlason@redhat.com>
      Acked-by: Clemens Ladisch <clemens@ladisch.de>
      Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
      Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
      19b3eecc
    • perf: Store active software events in a hashlist · 76e1d904
      Committed by Frederic Weisbecker
      Each time a software event triggers, we need to walk through
      the entire list of events from the current cpu and task contexts
      to retrieve a running perf event that matches.
      We also need to check that a matching perf event is actually counting.

      This walk is wasteful and makes the event fast path scale poorly
      as the number of events running in the same contexts grows.

      To solve this, we store the running perf events in a hashlist so
      that they can be looked up immediately by their type:event_id when
      they trigger.
      
      v2: - Fix SWEVENT_HLIST_SIZE definition (and re-learn some basic
            maths along the way)
          - Only allocate hlist for online cpus, but keep track of the
            refcount on offline possible cpus too, so that we allocate it
            if needed when it becomes online.
          - Drop the kref use as it's not adapted to our tricks anymore.
      
      v3: - Fix bad refcount check (address instead of value). Thanks to
            Eric Dumazet who spotted this.
          - While exiting cpu, move the hlist release out of the IPI path
            to lock the hlist mutex sanely.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      76e1d904
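      A minimal sketch of the hashlist idea (hypothetical names and sizes; the
      real code also handles per-cpu allocation, the refcounting and the cpu
      hotplug cases described above):

        #include <linux/types.h>
        #include <linux/hash.h>
        #include <linux/list.h>

        #define SWEVENT_HLIST_BITS      8
        #define SWEVENT_HLIST_SIZE      (1 << SWEVENT_HLIST_BITS)

        /* Active software events are bucketed by (type, event_id) so a
         * triggering event is found without walking the whole context. */
        struct swevent_hlist {
                struct hlist_head heads[SWEVENT_HLIST_SIZE];
        };

        static inline unsigned int swevent_bucket(u32 type, u32 event_id)
        {
                u64 key = ((u64)type << 32) | event_id;

                return hash_64(key, SWEVENT_HLIST_BITS);
        }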