1. 18 4月, 2013 8 次提交
  2. 08 4月, 2013 1 次提交
  3. 19 3月, 2013 1 次提交
  4. 13 3月, 2013 1 次提交
  5. 05 3月, 2013 1 次提交
  6. 04 3月, 2013 1 次提交
    • E
      fs: Limit sys_mount to only request filesystem modules. · 7f78e035
      Eric W. Biederman 提交于
      Modify the request_module to prefix the file system type with "fs-"
      and add aliases to all of the filesystems that can be built as modules
      to match.
      
      A common practice is to build all of the kernel code and leave code
      that is not commonly needed as modules, with the result that many
      users are exposed to any bug anywhere in the kernel.
      
      Looking for filesystems with a fs- prefix limits the pool of possible
      modules that can be loaded by mount to just filesystems trivially
      making things safer with no real cost.
      
      Using aliases means user space can control the policy of which
      filesystem modules are auto-loaded by editing /etc/modprobe.d/*.conf
      with blacklist and alias directives.  Allowing simple, safe,
      well understood work-arounds to known problematic software.
      
      This also addresses a rare but unfortunate problem where the filesystem
      name is not the same as it's module name and module auto-loading
      would not work.  While writing this patch I saw a handful of such
      cases.  The most significant being autofs that lives in the module
      autofs4.
      
      This is relevant to user namespaces because we can reach the request
      module in get_fs_type() without having any special permissions, and
      people get uncomfortable when a user specified string (in this case
      the filesystem type) goes all of the way to request_module.
      
      After having looked at this issue I don't think there is any
      particular reason to perform any filtering or permission checks beyond
      making it clear in the module request that we want a filesystem
      module.  The common pattern in the kernel is to call request_module()
      without regards to the users permissions.  In general all a filesystem
      module does once loaded is call register_filesystem() and go to sleep.
      Which means there is not much attack surface exposed by loading a
      filesytem module unless the filesystem is mounted.  In a user
      namespace filesystems are not mounted unless .fs_flags = FS_USERNS_MOUNT,
      which most filesystems do not set today.
      Acked-by: NSerge Hallyn <serge.hallyn@canonical.com>
      Acked-by: NKees Cook <keescook@chromium.org>
      Reported-by: NKees Cook <keescook@google.com>
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      7f78e035
  7. 23 2月, 2013 2 次提交
  8. 16 2月, 2013 4 次提交
  9. 15 2月, 2013 1 次提交
  10. 13 2月, 2013 5 次提交
  11. 11 2月, 2013 1 次提交
  12. 08 2月, 2013 2 次提交
    • N
      pseries/iommu: Remove DDW on kexec · 14b6f00f
      Nishanth Aravamudan 提交于
      pseries/iommu: remove DDW on kexec
      
      We currently insert a property in the device-tree when we successfully
      configure DDW for a given slot. This was meant to be an optimization to
      speed up kexec/kdump, so that we don't need to make the RTAS calls again
      to re-configured DDW in the new kernel.
      
      However, we end up tripping a plpar_tce_stuff failure on kexec/kdump
      because we unconditionally parse the ibm,dma-window property for the
      node at bus/dev setup time. This property contains the 32-bit DMA window
      LIOBN, which is distinct from the DDW window's. We pass that LIOBN (via
      iommu_table_init -> iommu_table_clear -> tce_free ->
      tce_freemulti_pSeriesLP) to plpar_tce_stuff, which fails because that
      32-bit window is no longer present after
      25ebc45b ("powerpc/pseries/iommu: remove
      default window before attempting DDW manipulation").
      
      I believe the simplest, easiest-to-maintain fix is to just change our
      initcall to, rather than detecting and updating the new kernel's DDW
      knowledge, just remove all DDW configurations. When the drivers
      re-initialize, we will set everything back up as it was before.
      Signed-off-by: NNishanth Aravamudan <nacc@us.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      14b6f00f
    • N
      pseries/iommu: Restore_default_window does not use liobn parameter · a1dabade
      Nishanth Aravamudan 提交于
      The parameter is unused, and complicates a following fix. Just remove
      it.
      Signed-off-by: NNishanth Aravamudan <nacc@us.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      a1dabade
  13. 05 2月, 2013 3 次提交
  14. 29 1月, 2013 5 次提交
    • A
      powerpc/512x: initialize clocks before bus probing · f29bc0a4
      Anatolij Gustschin 提交于
      Early driver probing can fail due to not available clocks
      (clk_get() fails) since the clk API init didn't take place yet.
      Move clocks init before bus probing.
      Signed-off-by: NAnatolij Gustschin <agust@denx.de>
      f29bc0a4
    • N
      pseries/iommu: Ensure TCEs are cleared with non-huge DDW · 71cf1def
      Nishanth Aravamudan 提交于
      There are now two kinds of DMA windows that might be presented by
      PowerVM DDW support -- huge windows (that can map all of system memory
      regardless of the LPAR configuration) and non-huge windows (which
      can't). They are implemented slightly differently in PowerVM, and thus
      have different characteristics. The most obvious is that slot isolate
      doesn't clear the TCEs/window for us with non-huge windows. Thus, when a
      DLPAR operation occurs on a slot using a non-huge window, TCEs are still
      present (the notifier chain doesn't currently remove them explicitly)
      and the DLPAR fails. Fix this by calling remove_ddw() first, which will
      unmap the DDW TCEs.
      
      Note: a corresponding change to drmgr is needed to actually successfully
      DLPAR, such that the device-tree update (which causes the notifier chain
      to fire) occurs before slot isolate.
      Signed-off-by: NNishanth Aravamudan <nacc@us.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      71cf1def
    • N
      pseries/iommu: Fix iteration in DDW TCE clearrange · 22b38298
      Nishanth Aravamudan 提交于
      tce_clearrange_multi_pSeriesLP is attempting to iterate over all TCEs in
      a given range. However, is it not advancing the dma_offset value passed
      to plpar_tce_stuff via the next value. This prevents DLPAR from
      completing, because TCEs are still present at slot isolation time.
      Signed-off-by: NNishanth Aravamudan <nacc@us.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      22b38298
    • B
      powerpc: Add support for CTS-1000 GPIO controlled system poweroff · 5611fe48
      Benjamin Collins 提交于
      CTS-1000 is based on P4080. GPIO 27 is used to signal the FPGA to
      switch off power, and also associates IRQ 8 with front-panel button
      press (which we use to call orderly_poweroff()).
      
      The relevant device-tree looks like this:
      
      	gpio0: gpio@130000 {
      		compatible = "fsl,qoriq-gpio";
      		reg = <0x130000 0x1000>;
      		interrupts = <55 2 0 0>;
      		#gpio-cells = <2>;
      		gpio-controller;
      
      		/* Allows powering off the system via GPIO signal. */
      		gpio-halt@27 {
      			compatible = "sgy,gpio-halt";
      			gpios = <&gpio0 27 0>;
      			interrupts = <8 1 0 0>;
      		};
      	};
      
      Because the driver cannot match on sgy,gpio-halt (because the node is never
      processed through of_platform), it matches on fsl,qoriq-gpio and then
      checks child nodes for the matching sgy,gpio-halt. This also ensures that
      the GPIO controller is detected prior to sgy_cts1000's probe callback,
      since that node wont match via of_platform until the controller is
      registered.
      
      Also, because the GPIO handler for triggering system poweroff might sleep,
      the IRQ uses a workqueue to call orderly_poweroff().
      
      As a final note, this driver may be expanded for other features specific to
      the CTS-1000.
      Signed-off-by: NBen Collins <ben.c@servergy.com>
      Cc: Jack Smith <jack.s@servergy.com>
      Cc: Vihar Rai <vihar.r@servergy.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      5611fe48
    • S
      powerpc/pasemi: Fix crash on reboot · 72640d88
      Steven Rostedt 提交于
      commit f96972f2 "kernel/sys.c: call disable_nonboot_cpus() in
      kernel_restart()"
      
      added a call to disable_nonboot_cpus() on kernel_restart(), which tries
      to shutdown all the CPUs except the first one. The issue with the PA
      Semi, is that it does not support CPU hotplug.
      
      When the call is made to __cpu_down(), it calls the notifiers
      CPU_DOWN_PREPARE, and then tries to take the CPU down.
      
      One of the notifiers to the CPU hotplug code, is the cpufreq. The
      DOWN_PREPARE will call __cpufreq_remove_dev() which calls
      cpufreq_driver->exit. The PA Semi exit handler unmaps regions of I/O
      that is used by an interrupt that goes off constantly
      (system_reset_common, but it goes off during normal system operations
      too). I'm not sure exactly what this interrupt does.
      
      Running a simple function trace, you can see it goes off quite a bit:
      
      # tracer: function
      #
      #           TASK-PID    CPU#    TIMESTAMP  FUNCTION
      #              | |       |          |         |
                <idle>-0     [001]  1558.859363: .pasemi_system_reset_exception <-.system_reset_exception
                <idle>-0     [000]  1558.860112: .pasemi_system_reset_exception <-.system_reset_exception
                <idle>-0     [000]  1558.861109: .pasemi_system_reset_exception <-.system_reset_exception
                <idle>-0     [001]  1558.861361: .pasemi_system_reset_exception <-.system_reset_exception
                <idle>-0     [000]  1558.861437: .pasemi_system_reset_exception <-.system_reset_exception
      
      When the region is unmapped, the system crashes with:
      
      Disabling non-boot CPUs ...
      Error taking CPU1 down: -38
      Unable to handle kernel paging request for data at address 0xd0000800903a0100
      Faulting instruction address: 0xc000000000055fcc
      Oops: Kernel access of bad area, sig: 11 [#1]
      PREEMPT SMP NR_CPUS=64 NUMA PA Semi PWRficient
      Modules linked in: shpchp
      NIP: c000000000055fcc LR: c000000000055fb4 CTR: c0000000000df1fc
      REGS: c0000000012175d0 TRAP: 0300   Not tainted  (3.8.0-rc4-test-dirty)
      MSR: 9000000000009032 <SF,HV,EE,ME,IR,DR,RI>  CR: 24000088  XER: 00000000
      SOFTE: 0
      DAR: d0000800903a0100, DSISR: 42000000
      TASK = c0000000010e9008[0] 'swapper/0' THREAD: c000000001214000 CPU: 0
      GPR00: d0000800903a0000 c000000001217850 c0000000012167e0 0000000000000000
      GPR04: 0000000000000000 0000000000000724 0000000000000724 0000000000000000
      GPR08: 0000000000000000 0000000000000000 0000000000000001 0000000000a70000
      GPR12: 0000000024000080 c00000000fff0000 ffffffffffffffff 000000003ffffae0
      GPR16: ffffffffffffffff 0000000000a21198 0000000000000060 0000000000000000
      GPR20: 00000000008fdd35 0000000000a21258 000000003ffffaf0 0000000000000417
      GPR24: 0000000000a226d0 c000000000000000 0000000000000000 0000000000000000
      GPR28: c00000000138b358 0000000000000000 c000000001144818 d0000800903a0100
      NIP [c000000000055fcc] .set_astate+0x5c/0xa4
      LR [c000000000055fb4] .set_astate+0x44/0xa4
      Call Trace:
      [c000000001217850] [c000000000055fb4] .set_astate+0x44/0xa4 (unreliable)
      [c0000000012178f0] [c00000000005647c] .restore_astate+0x2c/0x34
      [c000000001217980] [c000000000054668] .pasemi_system_reset_exception+0x6c/0x88
      [c000000001217a00] [c000000000019ef0] .system_reset_exception+0x48/0x84
      [c000000001217a80] [c000000000001e40] system_reset_common+0x140/0x180
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      72640d88
  15. 28 1月, 2013 1 次提交
    • F
      cputime: Generic on-demand virtual cputime accounting · abf917cd
      Frederic Weisbecker 提交于
      If we want to stop the tick further idle, we need to be
      able to account the cputime without using the tick.
      
      Virtual based cputime accounting solves that problem by
      hooking into kernel/user boundaries.
      
      However implementing CONFIG_VIRT_CPU_ACCOUNTING require
      low level hooks and involves more overhead. But we already
      have a generic context tracking subsystem that is required
      for RCU needs by archs which plan to shut down the tick
      outside idle.
      
      This patch implements a generic virtual based cputime
      accounting that relies on these generic kernel/user hooks.
      
      There are some upsides of doing this:
      
      - This requires no arch code to implement CONFIG_VIRT_CPU_ACCOUNTING
      if context tracking is already built (already necessary for RCU in full
      tickless mode).
      
      - We can rely on the generic context tracking subsystem to dynamically
      (de)activate the hooks, so that we can switch anytime between virtual
      and tick based accounting. This way we don't have the overhead
      of the virtual accounting when the tick is running periodically.
      
      And one downside:
      
      - There is probably more overhead than a native virtual based cputime
      accounting. But this relies on hooks that are already set anyway.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Namhyung Kim <namhyung.kim@lge.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      abf917cd
  16. 22 1月, 2013 2 次提交
  17. 16 1月, 2013 1 次提交