1. 10 Nov, 2012 4 commits
    • PCI: Provide method to reduce the number of total VFs supported · bff73156
      Committed by Donald Dutile
      Some implementations of SRIOV provide a capability structure
      value of TotalVFs that is greater than what the software can support.
      Provide a method to reduce the capability structure reported value
      to the value the driver can support.
      This ensures sysfs reports the current capability of the system,
      hardware and software.
      Example for its use: igb & ixgbe -- report 8 & 64 as TotalVFs,
      but drivers only support 7 & 63 maximum.
      Signed-off-by: Donald Dutile <ddutile@redhat.com>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
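The clamping described in commit bff73156 can be sketched as a small user-space model. This is only an illustration under stated assumptions: `struct sriov_model`, `model_set_totalvfs()` and `model_get_totalvfs()` are hypothetical names, not the kernel's data structures or API, and the rule that a limit above the hardware value is rejected is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical user-space model of an SR-IOV physical function. */
struct sriov_model {
    uint16_t hw_total_vfs;    /* TotalVFs as read from the capability */
    uint16_t driver_max_vfs;  /* driver-imposed limit, 0 = none set */
};

/* Record a driver limit. Returns 0 on success, -1 if the limit is
 * larger than what the hardware reports (assumed to be an error). */
int model_set_totalvfs(struct sriov_model *pf, uint16_t limit)
{
    assert(pf);
    if (limit > pf->hw_total_vfs)
        return -1;
    pf->driver_max_vfs = limit;
    return 0;
}

/* Value a sysfs reader would see: the driver limit if one was set,
 * otherwise the raw hardware value. */
uint16_t model_get_totalvfs(const struct sriov_model *pf)
{
    assert(pf);
    return pf->driver_max_vfs ? pf->driver_max_vfs : pf->hw_total_vfs;
}
```

With the igb numbers from the commit message, a model initialised with a hardware value of 8 reports 7 once the driver has called `model_set_totalvfs(&pf, 7)`.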
    • PCI: SRIOV control and status via sysfs · 1789382a
      Committed by Donald Dutile
      Provide files under sysfs to determine the maximum number of VFs
      an SR-IOV-capable PCIe device supports, and methods to enable and
      disable the VFs on a per-device basis.
      
      Currently, VF enablement by SR-IOV-capable PCIe devices is done
      via driver-specific module parameters.  If not set up in modprobe files,
      it requires the admin to unload & reload PF drivers with the number of
      desired VFs to enable.  Additionally, the enablement is system wide: all
      devices controlled by the same driver have the same number of VFs
      enabled.  Although the latter is probably desired, there are PCI
      configurations set up by system BIOS that may not enable that to occur.
      
      Two files are created for the PF of PCIe devices with SR-IOV support:
      
          sriov_totalvfs	Contains the maximum number of VFs the device
      			could support as reported by the TotalVFs register
      			in the SR-IOV extended capability.
      
          sriov_numvfs	Contains the number of VFs currently enabled on
      			this device as reported by the NumVFs register in
      			the SR-IOV extended capability.
      
      			Writing zero to this file disables all VFs.
      
      			Writing a positive number to this file enables that
      			number of VFs.
      
      These files are readable for all SR-IOV PF devices.  Writes to the
      sriov_numvfs file are effective only if a driver that supports the
      sriov_configure() method is attached.
      Signed-off-by: Donald Dutile <ddutile@redhat.com>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
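The write semantics of sriov_numvfs described in commit 1789382a can be modelled roughly as follows. This is a sketch, not the kernel implementation: `struct pf_model` and `model_write_numvfs()` are hypothetical names, error codes are not modelled, and the check against sriov_totalvfs is an assumption. A real kernel may apply further restrictions that this model ignores.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a PF as seen through the two sysfs files. */
struct pf_model {
    unsigned total_vfs;        /* what sriov_totalvfs would report */
    unsigned num_vfs;          /* what sriov_numvfs would report */
    bool has_sriov_configure;  /* bound driver implements sriov_configure() */
};

/* Model of a write to sriov_numvfs. Returns 0 if the write took
 * effect and -1 if it was rejected. */
int model_write_numvfs(struct pf_model *pf, unsigned requested)
{
    assert(pf);
    if (!pf->has_sriov_configure)
        return -1;            /* writes need a driver with sriov_configure() */
    if (requested > pf->total_vfs)
        return -1;            /* assumption: cannot exceed TotalVFs */
    pf->num_vfs = requested;  /* 0 disables all VFs, N > 0 enables N */
    return 0;
}
```

Reads are not modelled because both files are readable for every SR-IOV PF regardless of the bound driver.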
    • PCI: Use is_visible() with boot_vga attribute for pci_dev · 625e1d59
      Committed by Yinghai Lu
      Should make pci_create_sysfs_dev_files() simpler.  Also fix possible
      memleak in remove path.
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
    • PCI: Add pci_device_type to pdev's device struct · 4e15c46b
      Committed by Yinghai Lu
      Need type filled in device structure so it can be used for visible
      attribute control in sysfs for pci_dev.
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
  2. 22 Aug, 2012 1 commit
  3. 24 Jun, 2012 1 commit
    • PCI/PM: add PCIe runtime D3cold support · 448bd857
      Committed by Huang Ying
      This patch adds runtime D3cold support and corresponding ACPI platform
      support.  This patch only enables runtime D3cold support; it does not
      enable D3cold support during system suspend/hibernate.
      
      D3cold is the deepest power saving state for a PCIe device, where its main
      power is removed.  While it is in D3cold, you can't access the device at
      all, not even its configuration space (which is still accessible in D3hot).
      Therefore the PCI PM registers can not be used to transition into/out of
      the D3cold state; that must be done by platform logic such as ACPI _PR3.
      
      To support wakeup from D3cold, a system may provide auxiliary power, which
      allows a device to request wakeup using a Beacon or the sideband WAKE#
      signal.  WAKE# is usually connected to platform logic such as ACPI GPE.
      This is quite different from other power saving states, where devices
      request wakeup via a PME message on the PCIe link.
      
      Some devices, such as those in plug-in slots, have no direct platform
      logic.  For example, there is usually no ACPI _PR3 for them.  D3cold
      support for these devices can be done via the PCIe Downstream Port leading
      to the device.  When the PCIe port is powered on/off, the device is powered
      on/off too.  Wakeup events from the device will be notified to the
      corresponding PCIe port.
      
      For more information about PCIe D3cold and corresponding ACPI support,
      please refer to:
      
      - PCI Express Base Specification Revision 2.0
      - Advanced Configuration and Power Interface Specification Revision 5.0
      
      [bhelgaas: changelog]
      Reviewed-by: Rafael J. Wysocki <rjw@sisk.pl>
      Originally-by: Zheng Yan <zheng.z.yan@intel.com>
      Signed-off-by: Huang Ying <ying.huang@intel.com>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
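The access and wakeup rules stated in commit 448bd857 can be condensed into a few predicates. This is a reading aid only: the enum and function names are hypothetical and none of this is kernel code.

```c
#include <assert.h>
#include <stdbool.h>

enum pm_state { PM_D0, PM_D1, PM_D2, PM_D3HOT, PM_D3COLD };
enum wake_path { WAKE_PME_MESSAGE, WAKE_BEACON_OR_WAKE_PIN };

/* Config space stays reachable in every state except D3cold,
 * where main power is removed. */
bool config_space_accessible(enum pm_state s)
{
    return s != PM_D3COLD;
}

/* Entering or leaving D3cold cannot use the PCI PM registers, so it
 * needs platform logic such as ACPI _PR3. */
bool needs_platform_power_control(enum pm_state from, enum pm_state to)
{
    return from == PM_D3COLD || to == PM_D3COLD;
}

/* In D3cold wakeup goes through a Beacon or WAKE#; in the other
 * power-saving states it is a PME message on the PCIe link. */
enum wake_path wake_path_for(enum pm_state s)
{
    return s == PM_D3COLD ? WAKE_BEACON_OR_WAKE_PIN : WAKE_PME_MESSAGE;
}
```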
  4. 21 Jun, 2012 1 commit
  5. 24 Apr, 2012 1 commit
  6. 28 Feb, 2012 1 commit
  7. 15 Feb, 2012 1 commit
  8. 06 Jan, 2012 1 commit
    • capabilities: reverse arguments to security_capable · b7e724d3
      Committed by Eric Paris
      security_capable takes ns, cred, cap.  But the LSM capable() hook takes
      cred, ns, cap.  The capability helper functions also take cred, ns, cap.
      Rather than flip argument order just to flip it back, leave them alone.
      Heck, this should be a little faster since argument will be in the right
      place!
      Signed-off-by: Eric Paris <eparis@redhat.com>
  9. 01 Nov, 2011 1 commit
  10. 22 May, 2011 2 commits
  11. 31 Mar, 2011 1 commit
  12. 24 Mar, 2011 1 commit
    • userns: security: make capabilities relative to the user namespace · 3486740a
      Committed by Serge E. Hallyn
      - Introduce ns_capable to test for a capability in a non-default
        user namespace.
      - Teach cap_capable to handle capabilities in a non-default
        user namespace.
      
      The motivation is to get to the unprivileged creation of new
      namespaces.  It looks like this gets us 90% of the way there, with
      only potential uid confusion issues left.
      
      I still need to handle getting all caps after creation but otherwise I
      think I have a good starter patch that achieves all of your goals.
      
      Changelog:
      	11/05/2010: [serge] add apparmor
      	12/14/2010: [serge] fix capabilities to created user namespaces
      	Without this, if user serge creates a user_ns, he won't have
      	capabilities to the user_ns he created.  This is because we
      	were first checking whether his effective caps had the caps
      	he needed and returning -EPERM if not, and THEN checking whether
      	he was the creator.  Reverse those checks.
      	12/16/2010: [serge] security_real_capable needs ns argument in !security case
      	01/11/2011: [serge] add task_ns_capable helper
      	01/11/2011: [serge] add nsown_capable() helper per Bastian Blank suggestion
      	02/16/2011: [serge] fix a logic bug: the root user is always creator of
      		    init_user_ns, but should not always have capabilities to
      		    it!  Fix the check in cap_capable().
      	02/21/2011: Add the required user_ns parameter to security_capable,
      		    fixing a compile failure.
      	02/23/2011: Convert some macros to functions as per akpm comments.  Some
      		    couldn't be converted because we can't easily forward-declare
      		    them (they are inline if !SECURITY, extern if SECURITY).  Add
      		    a current_user_ns function so we can use it in capability.h
      		    without #including cred.h.  Move all forward declarations
      		    together to the top of the #ifdef __KERNEL__ section, and use
      		    kernel-doc format.
      	02/23/2011: Per dhowells, clean up comment in cap_capable().
      	02/23/2011: Per akpm, remove unreachable 'return -EPERM' in cap_capable.
      
      (Original written and signed off by Eric;  latest, modified version
      acked by him)
      
      [akpm@linux-foundation.org: fix build]
      [akpm@linux-foundation.org: export current_user_ns() for ecryptfs]
      [serge.hallyn@canonical.com: remove unneeded extra argument in selinux's task_has_capability]
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
      Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Acked-by: Daniel Lezcano <daniel.lezcano@free.fr>
      Acked-by: David Howells <dhowells@redhat.com>
      Cc: James Morris <jmorris@namei.org>
      Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
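The ordering of checks that the changelog of commit 3486740a discusses (creator first, then effective capabilities, never treating root as the privileged creator of the initial namespace) can be sketched in user-space C. All types and names here are hypothetical simplifications: real credentials are not a uid plus a bitmask, and matching the creator by uid and parent namespace is an assumption made to keep the model small.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct user_ns_model {
    const struct user_ns_model *parent;  /* NULL for the initial namespace */
    int creator_uid;                     /* uid that created this namespace */
};

struct cred_model {
    const struct user_ns_model *ns;  /* namespace the task lives in */
    int uid;
    uint64_t effective_caps;         /* bit n set = capability n raised */
};

/* Returns true if cred holds capability cap relative to target. */
bool model_ns_capable(const struct cred_model *cred,
                      const struct user_ns_model *target, int cap)
{
    assert(cred && target && cap >= 0 && cap < 64);
    for (const struct user_ns_model *ns = target; ; ns = ns->parent) {
        /* Creator check comes first, but never for the initial namespace. */
        if (ns->parent != NULL && ns->parent == cred->ns &&
            ns->creator_uid == cred->uid)
            return true;
        /* Same namespace: fall back to the effective capability set. */
        if (ns == cred->ns)
            return (cred->effective_caps >> cap) & 1;
        /* No ancestors left and the task's namespace was never reached. */
        if (ns->parent == NULL)
            return false;
    }
}
```

In this model a user with no effective capabilities is still capable in a namespace they created, while uid 0 gains nothing from being the nominal creator of the initial namespace.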
  13. 15 Feb, 2011 1 commit
  14. 13 Feb, 2011 1 commit
  15. 11 Feb, 2011 1 commit
  16. 09 Feb, 2011 1 commit
  17. 15 Jan, 2011 1 commit
  18. 17 Nov, 2010 1 commit
    • PCI: fix offset check for sysfs mmapped files · 8c05cd08
      Committed by Darrick J. Wong
      I just loaded 2.6.37-rc2 on my machines, and I noticed that X no longer starts.
      Running an strace of the X server shows that it's doing this:
      
      open("/sys/bus/pci/devices/0000:07:00.0/resource0", O_RDWR) = 10
      mmap(NULL, 16777216, PROT_READ|PROT_WRITE, MAP_SHARED, 10, 0) = -1 EINVAL (Invalid argument)
      
      This code seems to be asking for a shared read/write mapping of 16MB worth of
      BAR0 starting at file offset 0, and letting the kernel assign a starting
      address.  Unfortunately, this -EINVAL causes X not to start.  Looking into
      dmesg, there's a complaint like so:
      
      process "Xorg" tried to map 0x01000000 bytes at page 0x00000000 on 0000:07:00.0 BAR 0 (start 0x        96000000, size 0x         1000000)
      
      ...with the following code in pci_mmap_fits:
      
      	pci_start = (mmap_api == PCI_MMAP_SYSFS) ?
      		pci_resource_start(pdev, resno) >> PAGE_SHIFT : 0;
              if (start >= pci_start && start < pci_start + size &&
                              start + nr <= pci_start + size)
      
      It looks like the logic here is set up such that when the mmap call comes via
      sysfs, the check in pci_mmap_fits wants vma->vm_pgoff to be between the
      resource's start and end address, and the end of the vma to be no farther than
      the end.  However, the sysfs PCI resource files always start at offset zero,
      which means that this test always fails for programs that mmap the sysfs files.
      Given the comment in the original commit
      3b519e4e, I _think_ the old procfs files
      require that the file offset be equal to the resource's base address when
      mmapping.
      
      I think what we want here is for pci_start to be 0 when mmap_api ==
      PCI_MMAP_SYSFS, and the resource's start page only for PCI_MMAP_PROCFS.
      The following patch makes that change, after which the Matrox and Mach64
      X drivers work again.
      Acked-by: Martin Wilck <martin.wilck@ts.fujitsu.com>
      Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
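The corrected offset check from commit 8c05cd08 can be reproduced in a self-contained model. The comparison itself is taken from the snippet quoted in the commit message. The rest is assumed for illustration: the names, the 4 KiB page size, and the page-count formula, which is chosen to agree with the empty-resource size quoted in commit 3b519e4e.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12  /* assumption: 4 KiB pages */

enum mmap_api { MMAP_SYSFS, MMAP_PROCFS };

/* start_pgoff: requested file offset in pages (vma->vm_pgoff)
 * nr_pages:    length of the mapping in pages
 * res_start:   BAR base address in bytes
 * res_len:     BAR size in bytes, must be non-zero */
bool model_mmap_fits(uint64_t start_pgoff, uint64_t nr_pages,
                     uint64_t res_start, uint64_t res_len,
                     enum mmap_api api)
{
    assert(res_len > 0);
    uint64_t size = ((res_len - 1) >> MODEL_PAGE_SHIFT) + 1;
    /* sysfs resource files begin at offset 0; procfs offsets are
     * expected to equal the resource's base address. */
    uint64_t pci_start =
        (api == MMAP_PROCFS) ? res_start >> MODEL_PAGE_SHIFT : 0;

    return start_pgoff >= pci_start && start_pgoff < pci_start + size &&
           start_pgoff + nr_pages <= pci_start + size;
}
```

With the values from the strace above (a 16 MiB mapping at offset 0 of a BAR at 0x96000000), the sysfs case now fits, while the same offset through procfs is still rejected.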
  19. 16 Nov, 2010 1 commit
  20. 12 Nov, 2010 1 commit
    • PCI: fix size checks for mmap() on /proc/bus/pci files · 3b519e4e
      Committed by Martin Wilck
      The checks for valid mmaps of PCI resources made through /proc/bus/pci files
      that were introduced in 9eff02e2 have several
      problems:
      
      1. mmap() calls on /proc/bus/pci files are made with real file offsets > 0,
      whereas under /sys/bus/pci/devices, the start of the resource corresponds
      to offset 0. This may lead to false negatives in pci_mmap_fits(), which
      implicitly assumes the /sys/bus/pci/devices layout.
      
      2. The loop in proc_bus_pci_mmap doesn't skip empty resources. This leads
      to false positives, because pci_mmap_fits() doesn't treat empty resources
      correctly (the calculated size is 1 << (8*sizeof(resource_size_t)-PAGE_SHIFT)
      in this case!).
      
      3. If a user maps resources with BAR > 0, pci_mmap_fits will emit bogus
      WARNINGS for the first resources that don't fit until the correct one is found.
      
      On many controllers the first 2-4 BARs are used, and the others are empty.
      In this case, an mmap attempt will first fail on the non-empty BARs
      (including the "right" BAR because of 1.) and emit bogus WARNINGS because
      of 3., and finally succeed on the first empty BAR because of 2.
      This is certainly not the intended behaviour.
      
      This patch addresses all 3 issues.
      Updated with an enum type for the additional parameter for pci_mmap_fits().
      
      Cc: stable@kernel.org
      Signed-off-by: Martin Wilck <martin.wilck@ts.fujitsu.com>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
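Problem 2 from commit 3b519e4e is easiest to see with numbers. The sketch below assumes a page-count formula consistent with the value quoted in the message, a 64-bit resource_size_t and 4 KiB pages. `model_size_in_pages()` and `model_find_bar()` are hypothetical helpers, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12  /* assumption: 4 KiB pages */

/* Page count derived from a byte length. For length 0 the subtraction
 * wraps around and the result is 1 << (64 - MODEL_PAGE_SHIFT). */
uint64_t model_size_in_pages(uint64_t res_len)
{
    return ((res_len - 1) >> MODEL_PAGE_SHIFT) + 1;
}

/* Return the index of the first non-empty BAR whose page range
 * contains the procfs-style request, or -1 if none does. */
int model_find_bar(const uint64_t *starts, const uint64_t *lens, int n,
                   uint64_t pgoff, uint64_t nr_pages)
{
    assert(starts && lens);
    for (int i = 0; i < n; i++) {
        if (lens[i] == 0)
            continue;  /* skip empty resources instead of "fitting" them */
        uint64_t first = starts[i] >> MODEL_PAGE_SHIFT;
        uint64_t size = model_size_in_pages(lens[i]);
        if (pgoff >= first && pgoff + nr_pages <= first + size)
            return i;
    }
    return -1;
}
```

Without the `lens[i] == 0` test, an empty BAR at address 0 would appear to span 2^52 pages and would accept almost any request, which is the false positive the commit describes.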
  21. 31 Jul, 2010 3 commits
  22. 12 Jun, 2010 1 commit
  23. 22 May, 2010 2 commits
    • pci: check caps from sysfs file open to read device dependent config space · de139a33
      Committed by Chris Wright
      The PCI config space bin_attr read handler has a hardcoded CAP_SYS_ADMIN
      check to verify privileges before allowing a user to read device
      dependent config space.  This is meant to protect from an unprivileged
      user potentially locking up the box.
      
      When assigning a PCI device directly to a guest with libvirt and KVM,
      the sysfs config space file is chown'd to the unprivileged user that
      the KVM guest will run as.  The guest needs to have full access to the
      device's config space since it's responsible for driving the device.
      However, despite being the owner of the sysfs file, the CAP_SYS_ADMIN
      check will not allow read access beyond the config header.
      
      With this patch we check privileges against the capabilities used when
      opening the sysfs file.  This allows a privileged process to open the
      file and hand it to an unprivileged process, and the unprivileged process
      can still read all of the config space.
      Signed-off-by: Chris Wright <chrisw@sous-sol.org>
      Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
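The behavioural change in commit de139a33 is that the privilege decision follows the open file rather than the process calling read(). A minimal model, assuming the standard 64-byte configuration header as the unprivileged limit and using hypothetical names throughout:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_CFG_HEADER_SIZE 64  /* standard PCI configuration header */

/* Hypothetical stand-in for the credentials captured at open() time. */
struct file_model {
    bool opener_had_sys_admin;
};

/* Number of config-space bytes a read through this open file may
 * reach, regardless of which process performs the read. */
size_t model_readable_config_bytes(const struct file_model *filp,
                                   size_t cfg_size)
{
    assert(filp);
    if (filp->opener_had_sys_admin)
        return cfg_size;
    return cfg_size < MODEL_CFG_HEADER_SIZE ? cfg_size
                                            : MODEL_CFG_HEADER_SIZE;
}
```

A descriptor opened by a privileged process and handed to an unprivileged one therefore keeps full access in this model, which matches the libvirt and KVM use case in the message.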
    • sysfs: add struct file* to bin_attr callbacks · 2c3c8bea
      Committed by Chris Wright
      This allows bin_attr->read,write,mmap callbacks to check file specific data
      (such as inode owner) as part of any privilege validation.
      Signed-off-by: Chris Wright <chrisw@sous-sol.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
  24. 12 May, 2010 2 commits
  25. 30 Mar, 2010 1 commit
    • include cleanup: Update gfp.h and slab.h includes to prepare for breaking... · 5a0e3ad6
      Committed by Tejun Heo
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
      
      percpu.h is included by sched.h and module.h and thus ends up being
      included when building most .c files.  percpu.h includes slab.h which
      in turn includes gfp.h making everything defined by the two files
      universally available and complicating inclusion dependencies.
      
      percpu.h -> slab.h dependency is about to be removed.  Prepare for
      this change by updating users of gfp and slab facilities to include those
      headers directly instead of assuming availability.  As this conversion
      needs to touch large number of source files, the following script is
      used as the basis of conversion.
      
        http://userweb.kernel.org/~tj/misc/slabh-sweep.py
      
      The script does the following.
      
      * Scan files for gfp and slab usages and update includes such that
        only the necessary includes are there.  ie. if only gfp is used,
        gfp.h, if slab is used, slab.h.
      
      * When the script inserts a new include, it looks at the include
        blocks and tries to put the new include such that its order conforms
        to its surrounding.  It's put in the include block which contains
        core kernel includes, in the same order that the rest are ordered -
        alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
        doesn't seem to be any matching order.
      
      * If the script can't find a place to put a new include (mostly
        because the file doesn't have fitting include block), it prints out
        an error message indicating which .h file needs to be added to the
        file.
      
      The conversion was done in the following steps.
      
      1. The initial automatic conversion of all .c files updated slightly
         over 4000 files, deleting around 700 includes and adding ~480 gfp.h
         and ~3000 slab.h inclusions.  The script emitted errors for ~400
         files.
      
      2. Each error was manually checked.  Some didn't need the inclusion,
         some needed manual addition while adding it to implementation .h or
         embedding .c file was more appropriate for others.  This step added
         inclusions to around 150 files.
      
      3. The script was run again and the output was compared to the edits
         from #2 to make sure no file was left behind.
      
      4. Several build tests were done and a couple of problems were fixed.
         e.g. lib/decompress_*.c used malloc/free() wrappers around slab
         APIs requiring slab.h to be added manually.
      
      5. The script was run on all .h files but without automatically
         editing them as sprinkling gfp.h and slab.h inclusions around .h
         files could easily lead to inclusion dependency hell.  Most gfp.h
         inclusion directives were ignored as stuff from gfp.h was usually
         widely available and often used in preprocessor macros.  Each
         slab.h inclusion directive was examined and added manually as
         necessary.
      
      6. percpu.h was updated not to include slab.h.
      
      7. Build tests were done on the following configurations and failures
         were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
         distributed build env didn't work with gcov compiles) and a few
         more options had to be turned off depending on archs to make things
         build (like ipr on powerpc/64 which failed due to missing writeq).
      
         * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
         * powerpc and powerpc64 SMP allmodconfig
         * sparc and sparc64 SMP allmodconfig
         * ia64 SMP allmodconfig
         * s390 SMP allmodconfig
         * alpha SMP allmodconfig
         * um on x86_64 SMP allmodconfig
      
      8. percpu.h modifications were reverted so that it could be applied as
         a separate patch and serve as bisection point.
      
      Given the fact that I had only a couple of failures from tests on step
      6, I'm fairly confident about the coverage of this conversion patch.
      If there is a breakage, it's likely to be something in one of the arch
      headers which should be easily discoverable on most builds of
      the specific arch.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
  26. 19 Mar, 2010 1 commit
  27. 08 Mar, 2010 2 commits
  28. 05 Jan, 2010 1 commit
  29. 05 Dec, 2009 1 commit
  30. 07 Nov, 2009 1 commit
    • PCI: derive nearby CPUs from device's instead of bus' NUMA information · e0cd5160
      Committed by Andreas Herrmann
      In case of AMD CPU northbridge functions this NUMA information might
      differ.  Here is an example from a 4-socket system.
      
      Currently Linux shows
      
        root@hagen:/sys/devices/pci0000:00/0000:00:1a.4# cat numa_node
        0
        root@hagen:/sys/devices/pci0000:00/0000:00:1a.4# cat local_cpu*
        0-3
        00000000,0000000f
      
      which is not correct for northbridge functions as the local CPUs
      are those of the same socket.
      
      With this patch and a quirk for AMD CPU NB functions Linux can
      do better and correctly show
      
        root@hagen:/sys/devices/pci0000:00/0000:00:1a.4# cat numa_node
        2
        root@hagen:/sys/devices/pci0000:00/0000:00:1a.4# cat local_cpu*
        8-11
        00000000,00000f00
      Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
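The two local_cpu* outputs in commit e0cd5160 are the same set of CPUs written as a list and as a bitmask. The helpers below show how the range 8-11 maps to 00000000,00000f00. They are hypothetical illustrations limited to 64 CPUs, not the kernel's cpumask code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Bitmask with bits first..last set (inclusive); CPUs 0-63 only. */
uint64_t model_cpu_range_mask(unsigned first, unsigned last)
{
    assert(first <= last && last < 64);
    uint64_t upto_last =
        (last == 63) ? UINT64_MAX : ((UINT64_C(1) << (last + 1)) - 1);
    uint64_t below_first = (UINT64_C(1) << first) - 1;
    return upto_last & ~below_first;
}

/* Format the mask as comma-separated 32-bit hex words, most
 * significant word first. Returns what snprintf() returns. */
int model_format_cpumask(uint64_t mask, char *buf, size_t len)
{
    return snprintf(buf, len, "%08x,%08x",
                    (unsigned)(mask >> 32),
                    (unsigned)(mask & 0xffffffffu));
}
```

CPUs 0-3 give the mask 0xf shown before the patch, and CPUs 8-11 give 0xf00 as shown after it.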
  31. 10 Sep, 2009 1 commit