1. 28 Sep, 2017 10 commits
  2. 29 Jun, 2017 1 commit
  3. 20 Jun, 2017 1 commit
    • T
      ipmi: use rcu lock around call to intf->handlers->sender() · cdea4656
      Tony Camuso authored
      A vendor with a system having more than 128 CPUs occasionally encounters
      the following crash during shutdown. This is not an easily reproducible
      event, but the vendor was able to provide the following analysis of the
      crash, which exhibits the same footprint each time.
      
      crash> bt
      PID: 0      TASK: ffff88017c70ce70  CPU: 5   COMMAND: "swapper/5"
       #0 [ffff88085c143ac8] machine_kexec at ffffffff81059c8b
       #1 [ffff88085c143b28] __crash_kexec at ffffffff811052e2
       #2 [ffff88085c143bf8] crash_kexec at ffffffff811053d0
       #3 [ffff88085c143c10] oops_end at ffffffff8168ef88
       #4 [ffff88085c143c38] no_context at ffffffff8167ebb3
       #5 [ffff88085c143c88] __bad_area_nosemaphore at ffffffff8167ec49
       #6 [ffff88085c143cd0] bad_area_nosemaphore at ffffffff8167edb3
       #7 [ffff88085c143ce0] __do_page_fault at ffffffff81691d1e
       #8 [ffff88085c143d40] do_page_fault at ffffffff81691ec5
       #9 [ffff88085c143d70] page_fault at ffffffff8168e188
          [exception RIP: unknown or invalid address]
          RIP: ffffffffa053c800  RSP: ffff88085c143e28  RFLAGS: 00010206
          RAX: ffff88017c72bfd8  RBX: ffff88017a8dc000  RCX: ffff8810588b5ac8
          RDX: ffff8810588b5a00  RSI: ffffffffa053c800  RDI: ffff8810588b5a00
          RBP: ffff88085c143e58   R8: ffff88017c70d408   R9: ffff88017a8dc000
          R10: 0000000000000002  R11: ffff88085c143da0  R12: ffff8810588b5ac8
          R13: 0000000000000100  R14: ffffffffa053c800  R15: ffff8810588b5a00
          ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
          <IRQ stack>
          [exception RIP: cpuidle_enter_state+82]
          RIP: ffffffff81514192  RSP: ffff88017c72be50  RFLAGS: 00000202
          RAX: 0000001e4c3c6f16  RBX: 000000000000f8a0  RCX: 0000000000000018
          RDX: 0000000225c17d03  RSI: ffff88017c72bfd8  RDI: 0000001e4c3c6f16
          RBP: ffff88017c72be78   R8: 000000000000237e   R9: 0000000000000018
          R10: 0000000000002494  R11: 0000000000000001  R12: ffff88017c72be20
          R13: ffff88085c14f8e0  R14: 0000000000000082  R15: 0000001e4c3bb400
          ORIG_RAX: ffffffffffffff10  CS: 0010  SS: 0018
      
      This is the corresponding stack trace.
      
      The crash occurred because the code at the address in RIP, which was
      taken from a timer element, had already been removed during shutdown.
      
      The function is smi_timeout().
      
      We think ffff8810588b5a00 in RDX is its struct smi_info parameter.
      
      crash> rd ffff8810588b5a00 20
      ffff8810588b5a00:  ffff8810588b6000 0000000000000000   .`.X............
      ffff8810588b5a10:  ffff880853264400 ffffffffa05417e0   .D&S......T.....
      ffff8810588b5a20:  24a024a000000000 0000000000000000   .....$.$........
      ffff8810588b5a30:  0000000000000000 0000000000000000   ................
      ffff8810588b5a40:  ffffffffa053a040 ffffffffa053a060   @.S.....`.S.....
      ffff8810588b5a50:  0000000000000000 0000000100000001   ................
      ffff8810588b5a60:  0000000000000000 0000000000000e00   ................
      ffff8810588b5a70:  ffffffffa053a580 ffffffffa053a6e0   ..S.......S.....
      ffff8810588b5a80:  ffffffffa053a4a0 ffffffffa053a250   ..S.....P.S.....
      ffff8810588b5a90:  0000000500000002 0000000000000000   ................
      
      Unfortunately the top of this area has already been destroyed by someone.
      But for two reasons we think this is a struct smi_info:
       1) The addresses stored between ffff8810588b5a70 and ffff8810588b5a80
        are inside ipmi_si_intf.c; see crash> module ffff88085779d2c0
      
       2) We found the area which points to it.
        It is at offset 0x68 of ffff880859df4000
      
      crash> rd  ffff880859df4000 100
      ffff880859df4000:  0000000000000000 0000000000000001   ................
      ffff880859df4010:  ffffffffa0535290 dead000000000200   .RS.............
      ffff880859df4020:  ffff880859df4020 ffff880859df4020    @.Y.... @.Y....
      ffff880859df4030:  0000000000000002 0000000000100010   ................
      ffff880859df4040:  ffff880859df4040 ffff880859df4040   @@.Y....@@.Y....
      ffff880859df4050:  0000000000000000 0000000000000000   ................
      ffff880859df4060:  0000000000000000 ffff8810588b5a00   .........Z.X....
      ffff880859df4070:  0000000000000001 ffff880859df4078   ........x@.Y....
      
       If we regard it as a struct ipmi_smi during the shutdown process,
       it looks consistent.
      
      The remedy for this apparent race is affixed below.
      Signed-off-by: Tony Camuso <tcamuso@redhat.com>
      Cc: stable@vger.kernel.org # 3.19
      
      This was first introduced in 7ea0ed2b ipmi: Make the
      message handler easier to use for SMI interfaces
      where some code was moved outside of the rcu_read_lock()
      and the lock was not added.
      Signed-off-by: Corey Minyard <cminyard@mvista.com>
  4. 06 Jan, 2017 1 commit
  5. 13 Dec, 2016 1 commit
  6. 25 Nov, 2016 1 commit
    • C
      ipmi: Fix sequence number handling · a24b5dd5
      Corey Minyard authored
      The IPMI message handler uses a message id that the lower-layer
      preserved to track the sequence number of the message.  The macros
      that handled these sequence numbers were somewhat broken as they
      could result in sequence number truncation and they were not
      doing an "and" of the proper number of bits.
      
      I think this actually is not a problem, because the truncation
      should be harmless and the improper "and" didn't hurt anything
      because sequence number generation used the same improper "and"
      and wouldn't generate a sequence number that would get
      truncated wrong.  However, it should be fixed.
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Corey Minyard <cminyard@mvista.com>
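
The masking issue described in this message can be illustrated with a small userspace sketch. It assumes a 32-bit message id split into a 6-bit sequence slot and a 26-bit rolling identifier; those widths and all names below are illustrative assumptions, not taken from the commit. The point is only that packing, unpacking, and incrementing must all use the same masks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout (illustrative): upper 6 bits = seq slot, lower 26 bits = seqid. */
#define SEQ_BITS    6u
#define SEQID_BITS  26u
#define SEQ_MASK    ((1u << SEQ_BITS) - 1u)    /* 0x3f */
#define SEQID_MASK  ((1u << SEQID_BITS) - 1u)  /* 0x3ffffff */

/* Pack seq and seqid into one message id, masking each to its own width. */
static inline uint32_t store_seq_in_msgid(uint32_t seq, uint32_t seqid)
{
    return ((seq & SEQ_MASK) << SEQID_BITS) | (seqid & SEQID_MASK);
}

/* Unpack with the same masks that were used for packing. */
static inline void get_seq_from_msgid(uint32_t msgid, uint32_t *seq,
                                      uint32_t *seqid)
{
    assert(seq != NULL && seqid != NULL);
    *seq = (msgid >> SEQID_BITS) & SEQ_MASK;
    *seqid = msgid & SEQID_MASK;
}

/* Advance the rolling identifier, wrapping at the full seqid width. */
static inline uint32_t next_seqid(uint32_t seqid)
{
    return (seqid + 1u) & SEQID_MASK;
}
```

If the mask used in `next_seqid()` were narrower than the one used when packing, the generator would simply never produce the upper values, which matches the message's observation that a consistent but wrong mask is harmless yet still worth fixing.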
  7. 03 Oct, 2016 1 commit
    • X
      ipmi: fix crash on reading version from proc after unregisted bmc · bd85f4b3
      Xie XiuQi authored
      I hit a crash that can be reproduced with:
      1) while true; do cat /proc/ipmi/0/version; done
      2) modprobe -rv ipmi_si ipmi_msghandler ipmi_devintf
      
      [82761.021137] IPMI BT: req2rsp=5 secs retries=2
      [82761.034524] ipmi device interface
      [82761.222218] ipmi_si ipmi_si.0: Found new BMC (man_id: 0x0007db, prod_id: 0x0001, dev_id: 0x01)
      [82761.222230] ipmi_si ipmi_si.0: IPMI bt interface initialized
      [82903.922740] BUG: unable to handle kernel NULL pointer dereference at 00000000000002d4
      [82903.930952] IP: [<ffffffffa030d9e8>] smi_version_proc_show+0x18/0x40 [ipmi_msghandler]
      [82903.939220] PGD 86693a067 PUD 865304067 PMD 0
      [82903.943893] Thread overran stack, or stack corrupted
      [82903.949034] Oops: 0000 [#1] SMP
      [82903.983091] Modules linked in: ipmi_si(-) ipmi_msghandler binfmt_misc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter
      ...
      [82904.057285]  pps_core scsi_transport_sas dm_mod vfio_iommu_type1 vfio xt_sctp nf_conntrack_proto_sctp nf_nat_proto_sctp
                      nf_nat nf_conntrack sctp libcrc32c [last unloaded: ipmi_devintf]
      [82904.073169] CPU: 37 PID: 28089 Comm: cat Tainted: GF          O   ---- -------   3.10.0-327.28.3.el7.x86_64 #1
      [82904.083373] Hardware name: Huawei RH2288H V3/BC11HGSA0, BIOS 3.22 05/16/2016
      [82904.090592] task: ffff880101cc2e00 ti: ffff880369c54000 task.ti: ffff880369c54000
      [82904.098414] RIP: 0010:[<ffffffffa030d9e8>]  [<ffffffffa030d9e8>] smi_version_proc_show+0x18/0x40 [ipmi_msghandler]
      [82904.109124] RSP: 0018:ffff880369c57e70  EFLAGS: 00010203
      [82904.114608] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000024688470
      [82904.121912] RDX: fffffffffffffff4 RSI: ffffffffa0313404 RDI: ffff8808670ce200
      [82904.129218] RBP: ffff880369c57e70 R08: 0000000000019720 R09: ffffffff81204a27
      [82904.136521] R10: ffff88046f803300 R11: 0000000000000246 R12: ffff880662399700
      [82904.143828] R13: 0000000000000001 R14: ffff880369c57f48 R15: ffff8808670ce200
      [82904.151128] FS:  00007fb70c9ca740(0000) GS:ffff88086e340000(0000) knlGS:0000000000000000
      [82904.159557] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [82904.165473] CR2: 00000000000002d4 CR3: 0000000864c0c000 CR4: 00000000003407e0
      [82904.172778] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [82904.180084] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [82904.187385] Stack:
      [82904.189573]  ffff880369c57ee0 ffffffff81204f1a 00000000122a2427 0000000001426000
      [82904.197392]  ffff8808670ce238 0000000000010000 0000000000000000 0000000000000fff
      [82904.205198]  00000000122a2427 ffff880862079600 0000000001426000 ffff880369c57f48
      [82904.212962] Call Trace:
      [82904.219667]  [<ffffffff81204f1a>] seq_read+0xfa/0x3a0
      [82904.224893]  [<ffffffff8124ce2d>] proc_reg_read+0x3d/0x80
      [82904.230468]  [<ffffffff811e102c>] vfs_read+0x9c/0x170
      [82904.235689]  [<ffffffff811e1b7f>] SyS_read+0x7f/0xe0
      [82904.240816]  [<ffffffff81649209>] system_call_fastpath+0x16/0x1b
      [82904.246991] Code: 30 a0 e8 0c 6f ef e0 5b 5d c3 66 0f 1f 84 00 00 00 00 00 0f 1f
                     44 00 00 48 8b 47 78 55 48 c7 c6 04 34 31 a0 48 89 e5 48 8b 40 50 <0f>
      	       b6 90 d4 02 00 00 31 c0 89 d1 83 e2 0f c0 e9 04 0f b6 c9 e8
      [82904.267710] RIP  [<ffffffffa030d9e8>] smi_version_proc_show+0x18/0x40 [ipmi_msghandler]
      [82904.276079]  RSP <ffff880369c57e70>
      [82904.279734] CR2: 00000000000002d4
      [82904.283731] ---[ end trace a69e4328b49dd7c4 ]---
      [82904.328118] Kernel panic - not syncing: Fatal exception
      
      Reading the version from /proc requires the bmc device struct to be available.
      So in this patch we move add/remove_proc_entries between ipmi_bmc_register
      and ipmi_bmc_unregister.
      
      Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
      Signed-off-by: Xie XiuQi <xiexiuqi@huawei.com>
      Signed-off-by: Corey Minyard <cminyard@mvista.com>
  8. 27 Jul, 2016 1 commit
    • T
      ipmi: remove trydefaults parameter and default init · b07b58a3
      Tony Camuso authored
      Parameter trydefaults=1 causes the ipmi_init to initialize ipmi through
      the legacy port io space that was designated for ipmi. Architectures
      that do not map legacy port io can panic when trydefaults=1.
      
      Rather than implement build-time conditional exceptions for each
      architecture that does not map legacy port io, we have removed legacy
      port io from the driver.
      
      Parameter 'trydefaults' has been removed. Attempts to use it hereafter
      will evoke the "Unknown symbol in module, or unknown parameter" message.
      
      The patch was built against a number of architectures and tested for
      regressions and functionality on x86_64 and ARM64.
      Signed-off-by: Tony Camuso <tcamuso@redhat.com>
      
      Removed the config entry and the address source entry for default,
      since neither were used any more.
      Signed-off-by: Corey Minyard <cminyard@mvista.com>
  9. 13 Jun, 2016 1 commit
    • J
      ipmi: Remove smi_msg from waiting_rcv_msgs list before handle_one_recv_msg() · ae4ea9a2
      Junichi Nomura authored
      Commit 7ea0ed2b ("ipmi: Make the message handler easier to use for
      SMI interfaces") changed handle_new_recv_msgs() to call handle_one_recv_msg()
      for a smi_msg while the smi_msg is still connected to waiting_rcv_msgs list.
      That could lead to the following list corruption problems:
      
      1) low-level function treats smi_msg as not connected to list
      
        handle_one_recv_msg() could end up calling smi_send(), which
        assumes the msg is not connected to list.
      
        For example, the following sequence could corrupt list by
        doing list_add_tail() for the entry still connected to other list.
      
          handle_new_recv_msgs()
            msg = list_entry(waiting_rcv_msgs)
            handle_one_recv_msg(msg)
              handle_ipmb_get_msg_cmd(msg)
                smi_send(msg)
                  spin_lock(xmit_msgs_lock)
                  list_add_tail(msg)
                  spin_unlock(xmit_msgs_lock)
      
      2) race between multiple handle_new_recv_msgs() instances
      
        handle_new_recv_msgs() releases waiting_rcv_msgs_lock before calling
        handle_one_recv_msg(), then retakes the lock and calls list_del() on the msg.
      
        If another caller enters handle_new_recv_msgs() during the window shown
        below, list_del() will be done twice for the same smi_msg.
      
        handle_new_recv_msgs()
          spin_lock(waiting_rcv_msgs_lock)
          msg = list_entry(waiting_rcv_msgs)
          spin_unlock(waiting_rcv_msgs_lock)
        |
        | handle_one_recv_msg(msg)
        |
          spin_lock(waiting_rcv_msgs_lock)
          list_del(msg)
          spin_unlock(waiting_rcv_msgs_lock)
      
      Fixes: 7ea0ed2b ("ipmi: Make the message handler easier to use for SMI interfaces")
      Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
      [Added a comment to describe why this works.]
      Signed-off-by: Corey Minyard <cminyard@mvista.com>
      Cc: stable@vger.kernel.org # 3.19
      Tested-by: Ye Feng <yefeng.yl@alibaba-inc.com>
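
The pattern named in the commit title, unlinking the message under the lock and only then handling it, can be sketched in userspace as follows. This is a minimal illustration under stated assumptions, not the driver code: `struct msg`, `queue_pop()`, and `drain()` are hypothetical names, a pthread mutex stands in for the kernel spinlock, and requeueing of messages that could not be handled is left out.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct msg {
    struct msg *prev, *next;
    int payload;
};

struct msg_queue {
    pthread_mutex_t lock;
    struct msg head;            /* sentinel; empty when head.next == &head */
};

static void queue_init(struct msg_queue *q)
{
    pthread_mutex_init(&q->lock, NULL);
    q->head.prev = q->head.next = &q->head;
}

static void queue_push(struct msg_queue *q, struct msg *m)
{
    pthread_mutex_lock(&q->lock);
    m->prev = q->head.prev;
    m->next = &q->head;
    q->head.prev->next = m;
    q->head.prev = m;
    pthread_mutex_unlock(&q->lock);
}

/* Detach the first message while holding the lock, so that no other
 * caller can find it on the list afterwards. */
static struct msg *queue_pop(struct msg_queue *q)
{
    struct msg *m = NULL;

    pthread_mutex_lock(&q->lock);
    if (q->head.next != &q->head) {
        m = q->head.next;
        m->prev->next = m->next;
        m->next->prev = m->prev;
        m->prev = m->next = m;  /* self-linked: safe to add to another list */
    }
    pthread_mutex_unlock(&q->lock);
    return m;
}

/* Stand-in handler: it may assume the message is on no list. */
static int handle_one(struct msg *m)
{
    assert(m->next == m && m->prev == m);
    return m->payload;
}

/* Drain the queue; every message is unlinked before its handler runs. */
static int drain(struct msg_queue *q)
{
    struct msg *m;
    int total = 0;

    while ((m = queue_pop(q)) != NULL)
        total += handle_one(m);
    return total;
}
```

Because each message leaves the list inside the same critical section that finds it, two concurrent `drain()` callers can never pick the same entry, and the handler is free to put the message on a different list.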
  10. 13 Jan, 2016 1 commit
  11. 04 Sep, 2015 6 commits
  12. 06 May, 2015 1 commit
    • J
      ipmi: Remove incorrect use of seq_has_overflowed · 5e33cd0c
      Joe Perches authored
      commit d6c5dc18 ("ipmi: Remove uses of return value of seq_printf")
      incorrectly changed the return value of various proc_show functions
      to use seq_has_overflowed().
      
      These functions should return 0 on completion rather than 1/true
      on overflow.  1 is the same as #define SEQ_SKIP which would cause
      the output to not be emitted (skipped) instead.
      
      This is a logical defect only, as the lengths of these outputs are
      all smaller than the initial allocation done by the seq filesystem.
      Signed-off-by: Joe Perches <joe@perches.com>
      Signed-off-by: Corey Minyard <cminyard@mvista.com>
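
The return-value convention described in this message can be modelled in userspace. The sketch below is an assumption-laden mock, not kernel code: `struct mock_seq` and the `mock_*` helpers only imitate the overflow bookkeeping of a seq buffer, and `version_show()` with its printed values is a hypothetical show function. It illustrates that a show function reports success with 0 and leaves overflow detection to the caller.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define MOCK_BUF_SIZE 16

/* Hypothetical stand-in for a seq buffer: full means count == size. */
struct mock_seq {
    char buf[MOCK_BUF_SIZE];
    size_t size;
    size_t count;
};

static void mock_seq_init(struct mock_seq *m)
{
    memset(m, 0, sizeof(*m));
    m->size = MOCK_BUF_SIZE;
}

/* Append formatted text; on truncation, mark the buffer as overflowed. */
static void mock_seq_printf(struct mock_seq *m, const char *fmt, ...)
{
    va_list args;
    int len;

    if (m->count < m->size) {
        va_start(args, fmt);
        len = vsnprintf(m->buf + m->count, m->size - m->count, fmt, args);
        va_end(args);
        if (len >= 0 && m->count + (size_t)len < m->size) {
            m->count += (size_t)len;
            return;
        }
    }
    m->count = m->size;         /* overflow marker */
}

static bool mock_seq_has_overflowed(const struct mock_seq *m)
{
    return m->count == m->size;
}

/* A show function returns 0 on completion. Returning the overflow flag
 * (1 on overflow) would instead read as "skip this record" to the caller. */
static int version_show(struct mock_seq *m)
{
    mock_seq_printf(m, "%u.%u\n", 2u, 0u);  /* illustrative values */
    return 0;
}
```

In this model the caller, not the show function, inspects `mock_seq_has_overflowed()` and decides whether to retry with a larger buffer.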
  13. 20 Feb, 2015 4 commits
  14. 22 Dec, 2014 1 commit
    • C
      ipmi: Finish cleanup of BMC attributes · 9c633317
      Corey Minyard authored
      The previous cleanup of BMC attributes left a few holes, and if
      you run with lockdep debugging with a BMC with the proper attributes,
      you could get a warning.
      
      This patch removes all the unused attributes from the BMC structure,
      since they are all declared in the .data section now.  It makes
      the attributes all static.  It fixes the referencing of the
      attributes in a couple of cases that dynamically added the files
      depending on BMC information.
      Signed-off-by: Corey Minyard <cminyard@mvista.com>
      Cc: Huang Ying <ying.huang@intel.com>
      Tested-by: Alexei Starovoitov <ast@plumgrid.com>
  15. 12 Dec, 2014 9 commits