1. 25 1月, 2014 2 次提交
  2. 24 1月, 2014 1 次提交
    • K
      percpu_ida: Make percpu_ida_alloc + callers accept task state bitmask · 6f6b5d1e
      Kent Overstreet 提交于
      This patch changes percpu_ida_alloc() + callers to accept task state
      bitmask for prepare_to_wait() for code like target/iscsi that needs
      it for interruptible sleep, that is provided in a subsequent patch.
      
      It now expects TASK_UNINTERRUPTIBLE when the caller is able to sleep
      waiting for a new tag, or TASK_RUNNING when the caller cannot sleep,
      and is forced to return a negative value when no tags are available.
      
      v2 changes:
        - Include blk-mq + tcm_fc + vhost/scsi + target/iscsi changes
        - Drop signal_pending_state() call
      v3 changes:
        - Only call prepare_to_wait() + finish_wait() when != TASK_RUNNING
          (PeterZ)
      Reported-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: NKent Overstreet <kmo@daterainc.com>
      Cc: <stable@vger.kernel.org> #3.12+
      Signed-off-by: NNicholas Bellinger <nab@linux-iscsi.org>
      6f6b5d1e
  3. 18 1月, 2014 4 次提交
    • N
      target: Add protection SGLs to target_submit_cmd_map_sgls · def2b339
      Nicholas Bellinger 提交于
      This patch adds support to target_submit_cmd_map_sgls() for
      accepting 'sgl_prot' + 'sgl_prot_count' parameters for
      DIF protection information.
      
      Note the passed parameters are stored at se_cmd->t_prot_sg
      and se_cmd->t_prot_nents respectively.
      
      Also, update tcm_loop and vhost-scsi fabrics usage of
      target_submit_cmd_map_sgls() to take into account the
      new parameters.
      
      Cc: Martin K. Petersen <martin.petersen@oracle.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Hannes Reinecke <hare@suse.de>
      Cc: Sagi Grimberg <sagig@mellanox.com>
      Cc: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: NNicholas Bellinger <nab@linux-iscsi.org>
      def2b339
    • N
      target/sbc: Add DIF TYPE1+TYPE3 read/write verify emulation · 41861fa8
      Nicholas Bellinger 提交于
      This patch adds support for DIF read/write verify emulation
      for TARGET_DIF_TYPE1_PROT + TARGET_DIF_TYPE3_PROT operation.
      
      This includes sbc_dif_verify_write() + sbc_dif_verify_read()
      calls accessable by backend drivers to perform DIF verify
      for SGL based data and protection information.
      
      Also included is sbc_dif_copy_prot() logic to copy protection
      information to/from backend provided protection SGLs.
      
      Based on scsi_debug.c DIF TYPE1+TYPE3 emulation.
      
      v2 changes:
        - Select CRC_T10DIF for TARGET_CORE in Kconfig (Fengguang)
        - Drop IP checksum logic from sbc_dif_v1_verify (MKP)
        - Fix offset on app_tag = 0xffff in sbc_dif_verify_read()
      
      Cc: Martin K. Petersen <martin.petersen@oracle.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Hannes Reinecke <hare@suse.de>
      Cc: Sagi Grimberg <sagig@mellanox.com>
      Cc: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: NNicholas Bellinger <nab@linux-iscsi.org>
      41861fa8
    • N
      target: Add DIF CHECK_CONDITION ASC/ASCQ exception cases · fcc4f17b
      Nicholas Bellinger 提交于
      This patch adds support for DIF related CHECK_CONDITION ASC/ASCQ
      exception cases into transport_send_check_condition_and_sense().
      
      This includes:
      
        LOGICAL BLOCK GUARD CHECK FAILED
        LOGICAL BLOCK APPLICATION TAG CHECK FAILED
        LOGICAL BLOCK REFERENCE TAG CHECK FAILED
      
      that used by DIF TYPE1 and TYPE3 failure cases.
      
      Cc: Martin K. Petersen <martin.petersen@oracle.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Hannes Reinecke <hare@suse.de>
      Cc: Sagi Grimberg <sagig@mellanox.com>
      Cc: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: NNicholas Bellinger <nab@linux-iscsi.org>
      fcc4f17b
    • N
      target: Add DIF related base definitions · ce65e5b9
      Nicholas Bellinger 提交于
      This patch adds DIF related definitions to target_core_base.h
      that includes enums for target_prot_op + target_prot_type +
      target_prot_version + target_guard_type + target_pi_error.
      
      Also included is struct se_dif_v1_tuple, along with changes
      to struct se_cmd, struct se_dev_attrib, and struct se_device.
      
      Also, add new se_subsystem_api->[init,format,free]_prot() callers
      used by target core code to setup backend specific protection
      information after the device has been configured.
      
      Enums taken from Sagi Grimberg's original patch.
      
      v2 changes:
        - Drop guard_type related definitions
        - Update target_prot_op + target_prot_ho definitions (Sagi)
        - Drop SCF_PROT + pi_prot_version flag
        - Add se_subsystem_api->format_prot() (Sagi)
        - Add hw_pi_prot_type device attribute
      
      Cc: Martin K. Petersen <martin.petersen@oracle.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Hannes Reinecke <hare@suse.de>
      Cc: Sagi Grimberg <sagig@mellanox.com>
      Cc: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: NNicholas Bellinger <nab@linux-iscsi.org>
      ce65e5b9
  4. 10 1月, 2014 2 次提交
  5. 18 12月, 2013 4 次提交
  6. 17 12月, 2013 1 次提交
  7. 03 12月, 2013 2 次提交
  8. 01 12月, 2013 1 次提交
  9. 29 11月, 2013 5 次提交
    • M
      [SCSI] Disable WRITE SAME for RAID and virtual host adapter drivers · 54b2b50c
      Martin K. Petersen 提交于
      Some host adapters do not pass commands through to the target disk
      directly. Instead they provide an emulated target which may or may not
      accurately report its capabilities. In some cases the physical device
      characteristics are reported even when the host adapter is processing
      commands on the device's behalf. This can lead to adapter firmware hangs
      or excessive I/O errors.
      
      This patch disables WRITE SAME for devices connected to host adapters
      that provide an emulated target. Driver writers can disable WRITE SAME
      by setting the no_write_same flag in the host adapter template.
      
      [jejb: fix up rejections due to eh_deadline patch]
      Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
      Cc: stable@kernel.org
      Signed-off-by: NJames Bottomley <JBottomley@Parallels.com>
      54b2b50c
    • X
      sctp: Restore 'resent' bit to avoid retransmitted chunks for RTT measurements · 6eabca54
      Xufeng Zhang 提交于
      Currently retransmitted DATA chunks could also be used for
      RTT measurements since there are no flag to identify whether
      the transmitted DATA chunk is a new one or a retransmitted one.
      This problem is introduced by commit ae19c548 ("sctp: remove
      'resent' bit from the chunk") which inappropriately removed the
      'resent' bit completely, instead of doing this, we should set
      the resent bit only for the retransmitted DATA chunks.
      Signed-off-by: NXufeng Zhang <xufeng.zhang@windriver.com>
      Acked-by: NVlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6eabca54
    • J
      genetlink/pmcraid: use proper genetlink multicast API · 5e53e689
      Johannes Berg 提交于
      The pmcraid driver is abusing the genetlink API and is using its
      family ID as the multicast group ID, which is invalid and may
      belong to somebody else (and likely will.)
      
      Make it use the correct API, but since this may already be used
      as-is by userspace, reserve a family ID for this code and also
      reserve that group ID to not break userspace assumptions.
      
      My previous patch broke event delivery in the driver as I missed
      that it wasn't using the right API and forgot to update it later
      in my series.
      
      While changing this, I noticed that the genetlink code could use
      the static group ID instead of a strcmp(), so also do that for
      the VFS_DQUOT family.
      
      Cc: Anil Ravindranath <anil_ravindranath@pmc-sierra.com>
      Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5e53e689
    • N
      diag: warn about missing first netlink attribute · 31e20bad
      Nicolas Dichtel 提交于
      The first netlink attribute (value 0) must always be defined as none/unspec.
      This is correctly done in inet_diag.h, but other diag interfaces are wrong.
      
      Because we cannot change an existing API, I add a comment to point the mistake
      and avoid to propagate it in a new diag API in the future.
      
      CC: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com>
      Acked-by: NThomas Graf <tgraf@suug.ch>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      31e20bad
    • S
      efivars, efi-pstore: Hold off deletion of sysfs entry until the scan is completed · e0d59733
      Seiji Aguchi 提交于
      Currently, when mounting pstore file system, a read callback of
      efi_pstore driver runs mutiple times as below.
      
      - In the first read callback, scan efivar_sysfs_list from head and pass
        a kmsg buffer of a entry to an upper pstore layer.
      - In the second read callback, rescan efivar_sysfs_list from the entry
        and pass another kmsg buffer to it.
      - Repeat the scan and pass until the end of efivar_sysfs_list.
      
      In this process, an entry is read across the multiple read function
      calls. To avoid race between the read and erasion, the whole process
      above is protected by a spinlock, holding in open() and releasing in
      close().
      
      At the same time, kmemdup() is called to pass the buffer to pstore
      filesystem during it. And then, it causes a following lockdep warning.
      
      To make the dynamic memory allocation runnable without taking spinlock,
      holding off a deletion of sysfs entry if it happens while scanning it
      via efi_pstore, and deleting it after the scan is completed.
      
      To implement it, this patch introduces two flags, scanning and deleting,
      to efivar_entry.
      
      On the code basis, it seems that all the scanning and deleting logic is
      not needed because __efivars->lock are not dropped when reading from the
      EFI variable store.
      
      But, the scanning and deleting logic is still needed because an
      efi-pstore and a pstore filesystem works as follows.
      
      In case an entry(A) is found, the pointer is saved to psi->data.  And
      efi_pstore_read() passes the entry(A) to a pstore filesystem by
      releasing  __efivars->lock.
      
      And then, the pstore filesystem calls efi_pstore_read() again and the
      same entry(A), which is saved to psi->data, is used for resuming to scan
      a sysfs-list.
      
      So, to protect the entry(A), the logic is needed.
      
      [    1.143710] ------------[ cut here ]------------
      [    1.144058] WARNING: CPU: 1 PID: 1 at kernel/lockdep.c:2740 lockdep_trace_alloc+0x104/0x110()
      [    1.144058] DEBUG_LOCKS_WARN_ON(irqs_disabled_flags(flags))
      [    1.144058] Modules linked in:
      [    1.144058] CPU: 1 PID: 1 Comm: systemd Not tainted 3.11.0-rc5 #2
      [    1.144058]  0000000000000009 ffff8800797e9ae0 ffffffff816614a5 ffff8800797e9b28
      [    1.144058]  ffff8800797e9b18 ffffffff8105510d 0000000000000080 0000000000000046
      [    1.144058]  00000000000000d0 00000000000003af ffffffff81ccd0c0 ffff8800797e9b78
      [    1.144058] Call Trace:
      [    1.144058]  [<ffffffff816614a5>] dump_stack+0x54/0x74
      [    1.144058]  [<ffffffff8105510d>] warn_slowpath_common+0x7d/0xa0
      [    1.144058]  [<ffffffff8105517c>] warn_slowpath_fmt+0x4c/0x50
      [    1.144058]  [<ffffffff8131290f>] ? vsscanf+0x57f/0x7b0
      [    1.144058]  [<ffffffff810bbd74>] lockdep_trace_alloc+0x104/0x110
      [    1.144058]  [<ffffffff81192da0>] __kmalloc_track_caller+0x50/0x280
      [    1.144058]  [<ffffffff815147bb>] ? efi_pstore_read_func.part.1+0x12b/0x170
      [    1.144058]  [<ffffffff8115b260>] kmemdup+0x20/0x50
      [    1.144058]  [<ffffffff815147bb>] efi_pstore_read_func.part.1+0x12b/0x170
      [    1.144058]  [<ffffffff81514800>] ? efi_pstore_read_func.part.1+0x170/0x170
      [    1.144058]  [<ffffffff815148b4>] efi_pstore_read_func+0xb4/0xe0
      [    1.144058]  [<ffffffff81512b7b>] __efivar_entry_iter+0xfb/0x120
      [    1.144058]  [<ffffffff8151428f>] efi_pstore_read+0x3f/0x50
      [    1.144058]  [<ffffffff8128d7ba>] pstore_get_records+0x9a/0x150
      [    1.158207]  [<ffffffff812af25c>] ? selinux_d_instantiate+0x1c/0x20
      [    1.158207]  [<ffffffff8128ce30>] ? parse_options+0x80/0x80
      [    1.158207]  [<ffffffff8128ced5>] pstore_fill_super+0xa5/0xc0
      [    1.158207]  [<ffffffff811ae7d2>] mount_single+0xa2/0xd0
      [    1.158207]  [<ffffffff8128ccf8>] pstore_mount+0x18/0x20
      [    1.158207]  [<ffffffff811ae8b9>] mount_fs+0x39/0x1b0
      [    1.158207]  [<ffffffff81160550>] ? __alloc_percpu+0x10/0x20
      [    1.158207]  [<ffffffff811c9493>] vfs_kern_mount+0x63/0xf0
      [    1.158207]  [<ffffffff811cbb0e>] do_mount+0x23e/0xa20
      [    1.158207]  [<ffffffff8115b51b>] ? strndup_user+0x4b/0xf0
      [    1.158207]  [<ffffffff811cc373>] SyS_mount+0x83/0xc0
      [    1.158207]  [<ffffffff81673cc2>] system_call_fastpath+0x16/0x1b
      [    1.158207] ---[ end trace 61981bc62de9f6f4 ]---
      Signed-off-by: NSeiji Aguchi <seiji.aguchi@hds.com>
      Tested-by: NMadper Xie <cxie@redhat.com>
      Cc: stable@kernel.org
      Signed-off-by: NMatt Fleming <matt.fleming@intel.com>
      e0d59733
  10. 28 11月, 2013 2 次提交
  11. 26 11月, 2013 2 次提交
    • S
      tracing: Allow events to have NULL strings · 4e58e547
      Steven Rostedt (Red Hat) 提交于
      If an TRACE_EVENT() uses __assign_str() or __get_str on a NULL pointer
      then the following oops will happen:
      
      BUG: unable to handle kernel NULL pointer dereference at   (null)
      IP: [<c127a17b>] strlen+0x10/0x1a
      *pde = 00000000 ^M
      Oops: 0000 [#1] PREEMPT SMP
      Modules linked in:
      CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.13.0-rc1-test+ #2
      Hardware name:                  /DG965MQ, BIOS MQ96510J.86A.0372.2006.0605.1717 06/05/2006^M
      task: f5cde9f0 ti: f5e5e000 task.ti: f5e5e000
      EIP: 0060:[<c127a17b>] EFLAGS: 00210046 CPU: 1
      EIP is at strlen+0x10/0x1a
      EAX: 00000000 EBX: c2472da8 ECX: ffffffff EDX: c2472da8
      ESI: c1c5e5fc EDI: 00000000 EBP: f5e5fe84 ESP: f5e5fe80
       DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
      CR0: 8005003b CR2: 00000000 CR3: 01f32000 CR4: 000007d0
      Stack:
       f5f18b90 f5e5feb8 c10687a8 0759004f 00000005 00000005 00000005 00200046
       00000002 00000000 c1082a93 f56c7e28 c2472da8 c1082a93 f5e5fee4 c106bc61^M
       00000000 c1082a93 00000000 00000000 00000001 00200046 00200082 00000000
      Call Trace:
       [<c10687a8>] ftrace_raw_event_lock+0x39/0xc0
       [<c1082a93>] ? ktime_get+0x29/0x69
       [<c1082a93>] ? ktime_get+0x29/0x69
       [<c106bc61>] lock_release+0x57/0x1a5
       [<c1082a93>] ? ktime_get+0x29/0x69
       [<c10824dd>] read_seqcount_begin.constprop.7+0x4d/0x75
       [<c1082a93>] ? ktime_get+0x29/0x69^M
       [<c1082a93>] ktime_get+0x29/0x69
       [<c108a46a>] __tick_nohz_idle_enter+0x1e/0x426
       [<c10690e8>] ? lock_release_holdtime.part.19+0x48/0x4d
       [<c10bc184>] ? time_hardirqs_off+0xe/0x28
       [<c1068c82>] ? trace_hardirqs_off_caller+0x3f/0xaf
       [<c108a8cb>] tick_nohz_idle_enter+0x59/0x62
       [<c1079242>] cpu_startup_entry+0x64/0x192
       [<c102299c>] start_secondary+0x277/0x27c
      Code: 90 89 c6 89 d0 88 c4 ac 38 e0 74 09 84 c0 75 f7 be 01 00 00 00 89 f0 48 5e 5d c3 55 89 e5 57 66 66 66 66 90 83 c9 ff 89 c7 31 c0 <f2> ae f7 d1 8d 41 ff 5f 5d c3 55 89 e5 57 66 66 66 66 90 31 ff
      EIP: [<c127a17b>] strlen+0x10/0x1a SS:ESP 0068:f5e5fe80
      CR2: 0000000000000000
      ---[ end trace 01bc47bf519ec1b2 ]---
      
      New tracepoints have been added that have allowed for NULL pointers
      being assigned to strings. To fix this, change the TRACE_EVENT() code
      to check for NULL and if it is, it will assign "(null)" to it instead
      (similar to what glibc printf does).
      Reported-by: NShuah Khan <shuah.kh@samsung.com>
      Reported-by: NJovi Zhangwei <jovi.zhangwei@gmail.com>
      Link: http://lkml.kernel.org/r/CAGdX0WFeEuy+DtpsJzyzn0343qEEjLX97+o1VREFkUEhndC+5Q@mail.gmail.com
      Link: http://lkml.kernel.org/r/528D6972.9010702@samsung.com
      Fixes: 9cbf1176 ("tracing/events: provide string with undefined size support")
      Cc: stable@vger.kernel.org # 2.6.31+
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      4e58e547
    • T
      ARM: tegra: Provide dummy powergate implementation · 9886e1fd
      Thierry Reding 提交于
      In order to support increased build test coverage for drivers, implement
      dummies for the powergate implementation. This will allow the drivers to
      be built without requiring support for Tegra to be selected.
      
      This patch solves the following build errors, which can be triggered in
      v3.13-rc1 by selecting DRM_TEGRA without ARCH_TEGRA:
      
      drivers/built-in.o: In function `gr3d_remove':
      drivers/gpu/drm/tegra/gr3d.c:321: undefined reference to `tegra_powergate_power_off'
      drivers/gpu/drm/tegra/gr3d.c:325: undefined reference to `tegra_powergate_power_off'
      drivers/built-in.o: In function `gr3d_probe':
      drivers/gpu/drm/tegra/gr3d.c:266: undefined reference to `tegra_powergate_sequence_power_up'
      drivers/gpu/drm/tegra/gr3d.c:273: undefined reference to `tegra_powergate_sequence_power_up'
      Signed-off-by: NThierry Reding <treding@nvidia.com>
      [swarren, updated commit description]
      Signed-off-by: NStephen Warren <swarren@nvidia.com>
      Signed-off-by: NOlof Johansson <olof@lixom.net>
      9886e1fd
  12. 25 11月, 2013 2 次提交
  13. 24 11月, 2013 2 次提交
  14. 22 11月, 2013 5 次提交
    • K
      mm: place page->pmd_huge_pte to right union · 7aa555bf
      Kirill A. Shutemov 提交于
      I don't know what went wrong, mis-merge or something, but ->pmd_huge_pte
      placed in wrong union within struct page.
      
      In original patch[1] it's placed to union with ->lru and ->slab, but in
      commit e009bb30 ("mm: implement split page table lock for PMD
      level") it's in union with ->index and ->freelist.
      
      That union seems also unused for pages with table tables and safe to
      re-use, but it's not what I've tested.
      
      Let's move it to original place.  It fixes indentation at least.  :)
      
      [1] https://lkml.org/lkml/2013/10/7/288Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Reviewed-by: NNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      7aa555bf
    • A
      mm: hugetlbfs: fix hugetlbfs optimization · 27c73ae7
      Andrea Arcangeli 提交于
      Commit 7cb2ef56 ("mm: fix aio performance regression for database
      caused by THP") can cause dereference of a dangling pointer if
      split_huge_page runs during PageHuge() if there are updates to the
      tail_page->private field.
      
      Also it is repeating compound_head twice for hugetlbfs and it is running
      compound_head+compound_trans_head for THP when a single one is needed in
      both cases.
      
      The new code within the PageSlab() check doesn't need to verify that the
      THP page size is never bigger than the smallest hugetlbfs page size, to
      avoid memory corruption.
      
      A longstanding theoretical race condition was found while fixing the
      above (see the change right after the skip_unlock label, that is
      relevant for the compound_lock path too).
      
      By re-establishing the _mapcount tail refcounting for all compound
      pages, this also fixes the below problem:
      
        echo 0 >/sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
      
        BUG: Bad page state in process bash  pfn:59a01
        page:ffffea000139b038 count:0 mapcount:10 mapping:          (null) index:0x0
        page flags: 0x1c00000000008000(tail)
        Modules linked in:
        CPU: 6 PID: 2018 Comm: bash Not tainted 3.12.0+ #25
        Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
        Call Trace:
          dump_stack+0x55/0x76
          bad_page+0xd5/0x130
          free_pages_prepare+0x213/0x280
          __free_pages+0x36/0x80
          update_and_free_page+0xc1/0xd0
          free_pool_huge_page+0xc2/0xe0
          set_max_huge_pages.part.58+0x14c/0x220
          nr_hugepages_store_common.isra.60+0xd0/0xf0
          nr_hugepages_store+0x13/0x20
          kobj_attr_store+0xf/0x20
          sysfs_write_file+0x189/0x1e0
          vfs_write+0xc5/0x1f0
          SyS_write+0x55/0xb0
          system_call_fastpath+0x16/0x1b
      Signed-off-by: NKhalid Aziz <khalid.aziz@oracle.com>
      Signed-off-by: NAndrea Arcangeli <aarcange@redhat.com>
      Tested-by: NKhalid Aziz <khalid.aziz@oracle.com>
      Cc: Pravin Shelar <pshelar@nicira.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Ben Hutchings <bhutchings@solarflare.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Johannes Weiner <jweiner@redhat.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Minchan Kim <minchan@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      27c73ae7
    • D
      mm: thp: give transparent hugepage code a separate copy_page · 30b0a105
      Dave Hansen 提交于
      Right now, the migration code in migrate_page_copy() uses copy_huge_page()
      for hugetlbfs and thp pages:
      
             if (PageHuge(page) || PageTransHuge(page))
                      copy_huge_page(newpage, page);
      
      So, yay for code reuse.  But:
      
        void copy_huge_page(struct page *dst, struct page *src)
        {
              struct hstate *h = page_hstate(src);
      
      and a non-hugetlbfs page has no page_hstate().  This works 99% of the
      time because page_hstate() determines the hstate from the page order
      alone.  Since the page order of a THP page matches the default hugetlbfs
      page order, it works.
      
      But, if you change the default huge page size on the boot command-line
      (say default_hugepagesz=1G), then we might not even *have* a 2MB hstate
      so page_hstate() returns null and copy_huge_page() oopses pretty fast
      since copy_huge_page() dereferences the hstate:
      
        void copy_huge_page(struct page *dst, struct page *src)
        {
              struct hstate *h = page_hstate(src);
              if (unlikely(pages_per_huge_page(h) > MAX_ORDER_NR_PAGES)) {
        ...
      
      Mel noticed that the migration code is really the only user of these
      functions.  This moves all the copy code over to migrate.c and makes
      copy_huge_page() work for THP by checking for it explicitly.
      
      I believe the bug was introduced in commit b32967ff ("mm: numa: Add
      THP migration for the NUMA working set scanning fault case")
      
      [akpm@linux-foundation.org: fix coding-style and comment text, per Naoya Horiguchi]
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Acked-by: NMel Gorman <mgorman@suse.de>
      Reviewed-by: NNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Hillf Danton <dhillf@gmail.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Tested-by: NDave Jiang <dave.jiang@intel.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      30b0a105
    • J
      genetlink: fix genl_set_err() group ID · 91398a09
      Johannes Berg 提交于
      Fix another really stupid bug - I introduced genl_set_err()
      precisely to be able to adjust the group and reject invalid
      ones, but then forgot to do so.
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      91398a09
    • J
      genetlink: fix genlmsg_multicast() bug · 220815a9
      Johannes Berg 提交于
      Unfortunately, I introduced a tremendously stupid bug into
      genlmsg_multicast() when doing all those multicast group
      changes: it adjusts the group number, but then passes it
      to genlmsg_multicast_netns() which does that again.
      
      Somehow, my tests failed to catch this, so add a warning
      into genlmsg_multicast_netns() and remove the offending
      group ID adjustment.
      
      Also add a warning to the similar code in other functions
      so people who misuse them are more loudly warned.
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      220815a9
  15. 21 11月, 2013 5 次提交