So drivers can use them. This can be used to replace
duplicate code in the drm subsystem.
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Seeing there's been some confusion about the use of pci_add_dma_alias(),
expand the comment to describe why it must be called early and how
early it must be called.
Also, expand on the purpose of this function and common reasons it would
be used.
[The comment was reworded to some extent by Alex Williamson]
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Doug Meyer <dmeyer@gigaio.com>
Move early dump functionality into common code so that it is available for
all architectures. No need to carry arch-specific reads around as the read
hooks are already initialized by the time pci_setup_device() is getting
called during scan.
Tested-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Cleanup PCI_REBAR_CTRL_BAR_SHIFT handling. That was hard coded instead of
properly defined in the header for some reason.
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Resize BARs after resume to the expected size again.
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=199959
Fixes: d6895ad39f ("drm/amdgpu: resize VRAM BAR for CPU access v6")
Fixes: 276b738deb ("PCI: Add resizable BAR infrastructure")
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: stable@vger.kernel.org # v4.15+
-----BEGIN PGP SIGNATURE-----
iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAlsZdg0UHGJoZWxnYWFz
QGdvb2dsZS5jb20ACgkQWYigwDrT+vwJOBAAsuuWsOdiJRRhQLU5WfEMFgzcL02R
gsumqZkK7E8LOq0DPNMtcgv9O0KgYZyCiZyTMJ8N7sEYohg04lMz8mtYXOibjcwI
p+nVMko8jQXV9FXwSMGVqigEaLLcrbtkbf/mPriD63DDnRMa/+/Jh15SwfLTydIH
QRTJbIxkS3EiOauj5C8QY3UwzjlvV9mDilzM/x+MSK27k2HFU9Pw/3lIWHY716rr
grPZTwBTfIT+QFZjwOm6iKzHjxRM830sofXARkcH4CgSNaTeq5UbtvAs293MHvc+
v/v/1dfzUh00NxfZDWKHvTUMhjazeTeD9jEVS7T+HUcGzvwGxMSml6bBdznvwKCa
46ynePOd1VcEBlMYYS+P4njRYBLWeUwt6/TzqR4yVwb0keQ6Yj3Y9H2UpzscYiCl
O+0qz6RwyjKY0TpxfjoojgHn4U5ByI5fzVDJHbfr2MFTqqRNaabVrfl6xU4sVuhh
OluT5ym+/dOCTI/wjlolnKNb0XThVre8e2Busr3TRvuwTMKMIWqJ9sXLovntdbqE
furPD/UnuZHkjSFhQ1SQwYdWmsZI5qAq2C9haY8sEWsXEBEcBGLJ2BEleMxm8UsL
KXuy4ER+R4M+sFtCkoWf3D4NTOBUdPHi4jyk6Ooo1idOwXCsASVvUjUEG5YcQC6R
kpJ1VPTKK1XN64I=
=aFAi
-----END PGP SIGNATURE-----
Merge tag 'pci-v4.18-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI updates from Bjorn Helgaas:
- unify AER decoding for native and ACPI CPER sources (Alexandru
Gagniuc)
- add TLP header info to AER tracepoint (Thomas Tai)
- add generic pcie_wait_for_link() interface (Oza Pawandeep)
- handle AER ERR_FATAL by removing and re-enumerating devices, as
Downstream Port Containment does (Oza Pawandeep)
- factor out common code between AER and DPC recovery (Oza Pawandeep)
- stop triggering DPC for ERR_NONFATAL errors (Oza Pawandeep)
- share ERR_FATAL recovery path between AER and DPC (Oza Pawandeep)
- disable ASPM L1.2 substate if we don't have LTR (Bjorn Helgaas)
- respect platform ownership of LTR (Bjorn Helgaas)
- clear interrupt status in top half to avoid interrupt storm (Oza
Pawandeep)
- neaten pci=earlydump output (Andy Shevchenko)
- avoid errors when extended config space inaccessible (Gilles Buloz)
- prevent sysfs disable of device while driver attached (Christoph
Hellwig)
- use core interface to report PCIe link properties in bnx2x, bnxt_en,
cxgb4, ixgbe (Bjorn Helgaas)
- remove unused pcie_get_minimum_link() (Bjorn Helgaas)
- fix use-before-set error in ibmphp (Dan Carpenter)
- fix pciehp timeouts caused by Command Completed errata (Bjorn
Helgaas)
- fix refcounting in pnv_php hotplug (Julia Lawall)
- clear pciehp Presence Detect and Data Link Layer Status Changed on
resume so we don't miss hotplug events (Mika Westerberg)
- only request pciehp control if we support it, so platform can use
ACPI hotplug otherwise (Mika Westerberg)
- convert SHPC to be builtin only (Mika Westerberg)
- request SHPC control via _OSC if we support it (Mika Westerberg)
- simplify SHPC handoff from firmware (Mika Westerberg)
- fix an SHPC quirk that mistakenly included *all* AMD bridges as well
as devices from any vendor with device ID 0x7458 (Bjorn Helgaas)
- assign a bus number even to non-native hotplug bridges to leave
space for acpiphp additions, to fix a common Thunderbolt xHCI
hot-add failure (Mika Westerberg)
- keep acpiphp from scanning native hotplug bridges, to fix common
Thunderbolt hot-add failures (Mika Westerberg)
- improve "partially hidden behind bridge" messages from core (Mika
Westerberg)
- add macros for PCIe Link Control 2 register (Frederick Lawler)
- replace IB/hfi1 custom macros with PCI core versions (Frederick
Lawler)
- remove dead microblaze and xtensa code (Bjorn Helgaas)
- use dev_printk() when possible in xtensa and mips (Bjorn Helgaas)
- remove unused pcie_port_acpi_setup() and portdrv_acpi.c (Bjorn
Helgaas)
- add managed interface to get PCI host bridge resources from OF (Jan
Kiszka)
- add support for unbinding generic PCI host controller (Jan Kiszka)
- fix memory leaks when unbinding generic PCI host controller (Jan
Kiszka)
- request legacy VGA framebuffer only for VGA devices to avoid false
device conflicts (Bjorn Helgaas)
- turn on PCI_COMMAND_IO & PCI_COMMAND_MEMORY in pci_enable_device()
like everybody else, not in pcibios_fixup_bus() (Bjorn Helgaas)
- add generic enable function for simple SR-IOV hardware (Alexander
Duyck)
- use generic SR-IOV enable for ena, nvme (Alexander Duyck)
- add ACS quirk for Intel 7th & 8th Gen mobile (Alex Williamson)
- add ACS quirk for Intel 300 series (Mika Westerberg)
- enable register clock for Armada 7K/8K (Gregory CLEMENT)
- reduce Keystone "link already up" log level (Fabio Estevam)
- move private DT functions to drivers/pci/ (Rob Herring)
- factor out dwc CONFIG_PCI Kconfig dependencies (Rob Herring)
- add DesignWare support to the endpoint test driver (Gustavo
Pimentel)
- add DesignWare support for endpoint mode (Gustavo Pimentel)
- use devm_ioremap_resource() instead of devm_ioremap() in dra7xx and
artpec6 (Gustavo Pimentel)
- fix Qualcomm bitwise NOT issue (Dan Carpenter)
- add Qualcomm runtime PM support (Srinivas Kandagatla)
- fix DesignWare enumeration below bridges (Koen Vandeputte)
- use usleep() instead of mdelay() in endpoint test (Jia-Ju Bai)
- add configfs entries for pci_epf_driver device IDs (Kishon Vijay
Abraham I)
- clean up pci_endpoint_test driver (Gustavo Pimentel)
- update Layerscape maintainer email addresses (Minghuan Lian)
- add COMPILE_TEST to improve build test coverage (Rob Herring)
- fix Hyper-V bus registration failure caused by domain/serial number
confusion (Sridhar Pitchai)
- improve Hyper-V refcounting and coding style (Stephen Hemminger)
- avoid potential Hyper-V hang waiting for a response that will never
come (Dexuan Cui)
- implement Mediatek chained IRQ handling (Honghui Zhang)
- fix vendor ID & class type for Mediatek MT7622 (Honghui Zhang)
- add Mobiveil PCIe host controller driver (Subrahmanya Lingappa)
- add Mobiveil MSI support (Subrahmanya Lingappa)
- clean up clocks, MSI, IRQ mappings in R-Car probe failure paths
(Marek Vasut)
- poll more frequently (5us vs 5ms) while waiting for R-Car data link
active (Marek Vasut)
- use generic OF parsing interface in R-Car (Vladimir Zapolskiy)
- add R-Car V3H (R8A77980) "compatible" string (Sergei Shtylyov)
- add R-Car gen3 PHY support (Sergei Shtylyov)
- improve R-Car PHYRDY polling (Sergei Shtylyov)
- clean up R-Car macros (Marek Vasut)
- use runtime PM for R-Car controller clock (Dien Pham)
- update arm64 defconfig for Rockchip (Shawn Lin)
- refactor Rockchip code to facilitate both root port and endpoint
mode (Shawn Lin)
- add Rockchip endpoint mode driver (Shawn Lin)
- support VMD "membar shadow" feature (Jon Derrick)
- support VMD bus number offsets (Jon Derrick)
- add VMD "no AER source ID" quirk for more device IDs (Jon Derrick)
- remove unnecessary host controller CONFIG_PCIEPORTBUS Kconfig
selections (Bjorn Helgaas)
- clean up quirks.c organization and whitespace (Bjorn Helgaas)
* tag 'pci-v4.18-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (144 commits)
PCI/AER: Replace struct pcie_device with pci_dev
PCI/AER: Remove unused parameters
PCI: qcom: Include gpio/consumer.h
PCI: Improve "partially hidden behind bridge" log message
PCI: Improve pci_scan_bridge() and pci_scan_bridge_extend() doc
PCI: Move resource distribution for single bridge outside loop
PCI: Account for all bridges on bus when distributing bus numbers
ACPI / hotplug / PCI: Drop unnecessary parentheses
ACPI / hotplug / PCI: Mark stale PCI devices disconnected
ACPI / hotplug / PCI: Don't scan bridges managed by native hotplug
PCI: hotplug: Add hotplug_is_native()
PCI: shpchp: Add shpchp_is_native()
PCI: shpchp: Fix AMD POGO identification
PCI: mobiveil: Add MSI support
PCI: mobiveil: Add Mobiveil PCIe Host Bridge IP driver
PCI/AER: Decode Error Source Requester ID
PCI/AER: Remove aer_recover_work_func() forward declaration
PCI/DPC: Use the generic pcie_do_fatal_recovery() path
PCI/AER: Pass service type to pcie_do_fatal_recovery()
PCI/DPC: Disable ERR_NONFATAL handling by DPC
...
- add generic enable function for simple SR-IOV hardware (Alexander
Duyck)
- use generic SR-IOV enable for ena, nvme (Alexander Duyck)
- add ACS quirk for Intel 7th & 8th Gen mobile (Alex Williamson)
- add ACS quirk for Intel 300 series (Mika Westerberg)
* pci/virtualization:
PCI/IOV: Allow PF drivers to limit total_VFs to 0
PCI: Add "pci=noats" boot parameter
PCI: Add ACS quirk for Intel 300 series
PCI: Add ACS quirk for Intel 7th & 8th Gen mobile
nvme-pci: Use pci_sriov_configure_simple() to enable VFs
net: ena: Use pci_sriov_configure_simple() to enable VFs
PCI/IOV: Add pci-pf-stub driver for PFs that only enable VFs
PCI/IOV: Add pci_sriov_configure_simple()
- neaten pci=earlydump output (Andy Shevchenko)
- avoid errors when extended config space inaccessible (Gilles Buloz)
- prevent sysfs disable of device while driver attached (Christoph
Hellwig)
- use core interface to report PCIe link properties in bnx2x, bnxt_en,
cxgb4, ixgbe (Bjorn Helgaas)
- remove unused pcie_get_minimum_link() (Bjorn Helgaas)
* pci/enumeration:
PCI: Remove unused pcie_get_minimum_link()
ixgbe: Report PCIe link properties with pcie_print_link_status()
cxgb4: Report PCIe link properties with pcie_print_link_status()
bnxt_en: Report PCIe link properties with pcie_print_link_status()
bnx2x: Report PCIe link properties with pcie_print_link_status()
PCI: Prevent sysfs disable of device while driver is attached
PCI: Check whether bridges allow access to extended config space
x86/PCI: Make pci=earlydump output neat
In some cases pcie_get_minimum_link() returned misleading information
because it found the slowest link and the narrowest link without
considering the total bandwidth of the link.
For example, consider a path with these two links:
- 16.0 GT/s x1 link (16.0 * 10^9 * 128 / 130) * 1 / 8 = 1969 MB/s
- 2.5 GT/s x16 link ( 2.5 * 10^9 * 8 / 10) * 16 / 8 = 4000 MB/s
The available bandwidth of the path is limited by the 16 GT/s link to about
1969 MB/s, but pcie_get_minimum_link() returned 2.5 GT/s x1, which
corresponds to only 250 MB/s.
Callers should use pcie_print_link_status() instead, or
pcie_bandwidth_available() if they need more detailed information.
Remove pcie_get_minimum_link() since there are no callers left.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Two comments in pci_target_state() are outdated, as the function
doesn't set the target power state for the device any more, only
finds one for it, so fix them accordingly.
Reported-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Clients such as hotplug and Downstream Port Containment (DPC) both need to
wait until a link becomes active or inactive.
Add a generic pcie_wait_link_active() interface and use it instead of
duplicating the code.
Signed-off-by: Oza Pawandeep <poza@codeaurora.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
The only user of pci_get_new_domain_nr() is of_pci_bus_find_domain_nr().
Since they are defined in the same file, pci_get_new_domain_nr() can be
made static, which also simplifies preprocessor conditionals.
No functional change intended.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Adds a "pci=noats" boot parameter. When supplied, all ATS related
functions fail immediately and the IOMMU is configured to not use
device-IOTLB.
Any function that checks for ATS capabilities directly against the devices
should also check this flag. Currently, such functions exist only in IOMMU
drivers, and they are covered by this patch.
The motivation behind this patch is the existence of malicious devices.
Lots of research has been done about how to use the IOMMU as protection
from such devices. When ATS is supported, any I/O device can access any
physical address by faking device-IOTLB entries. Adding the ability to
ignore these entries lets sysadmins enhance system security.
Signed-off-by: Gil Kupfer <gilkup@cs.technion.ac.il>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Joerg Roedel <jroedel@suse.de>
Commit 0847684cfc (PCI / PM: Simplify device wakeup settings code)
went too far and dropped the device_may_wakeup() check from
pci_enable_wake() which causes wakeup to be enabled during system
suspend, hibernation or shutdown for some PCI devices that are not
allowed by user space to wake up the system from sleep (or power off).
As a result of this, excessive power is drawn by some of the affected
systems while in sleep states or off.
Restore the device_may_wakeup() check in pci_enable_wake(), but make
sure that the PCI bus type's runtime suspend callback will not call
device_may_wakeup() which is about system wakeup from sleep and not
about device wakeup from runtime suspend.
Fixes: 0847684cfc (PCI / PM: Simplify device wakeup settings code)
Reported-by: Joseph Salisbury <joseph.salisbury@canonical.com>
Cc: 4.13+ <stable@vger.kernel.org> # 4.13+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
USB controller ASM1042 stops working after commit de3ef1eb1c (PM /
core: Drop run_wake flag from struct dev_pm_info).
The device in question is not power managed by platform firmware,
furthermore, it only supports PME# from D3cold:
Capabilities: [78] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=55mA PME(D0-,D1-,D2-,D3hot-,D3cold+)
Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
Before commit de3ef1eb1c, the device never gets runtime suspended.
After that commit, the device gets runtime suspended to D3hot, which can
not generate any PME#.
usb_hcd_pci_probe() unconditionally calls device_wakeup_enable(), hence
device_can_wakeup() in pci_dev_run_wake() always returns true.
So pci_dev_run_wake() needs to check PME wakeup capability as its first
condition.
In addition, change wakeup flag passed to pci_target_state() from false
to true, because we want to find the deepest state different from D3cold
that the device can still generate PME#. In this case, it's D0 for the
device in question.
Fixes: de3ef1eb1c (PM / core: Drop run_wake flag from struct dev_pm_info)
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Cc: 4.13+ <stable@vger.kernel.org> # 4.13+
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Currently the pcie_print_link_status() will print PCIe bandwidth and link
width information but does not mention it is pertaining to the PCIe. Since
this and related functions are used exclusively by networking drivers today
users may get confused into thinking that it's the NIC bandwidth that is
being talked about. Insert a "PCIe" into the messages.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
When reassigning device resources to increase their alignment, e.g.,
because of a "pci=resource_alignment=" kernel parameter or because the
platform aligns resources to its page size, we previously emitted messages
like this:
pci 0000:00:00.0: Disabling memory decoding and releasing memory resources
pci 0000:00:00.0: disabling bridge mem windows
These messages don't convey any useful information, so remove them.
Fixes: 3827463769 ("powerpc/powernv: Override pcibios_default_alignment() to force PCI devices to be page aligned")
Signed-off-by: Desnes A. Nunes do Rosario <desnesn@linux.vnet.ibm.com>
[bhelgaas: changelog]
Signed-off-by: Bjorn Helgaas <helgaas@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAlrHeY8UHGJoZWxnYWFz
QGdvb2dsZS5jb20ACgkQWYigwDrT+vxhLRAAndV/0NDyWZU0eZNM6twri2SEFnF7
E4ar+YthxDxxJG4TLJbIA12jc5NgHZy4WuttDa6Jb99KreBXIHJFlNi/V/tme6zf
+yXUuxWae7wJzBiaay57VqLGSc80gt/LTgjLa1siwQqjTbO3wSXR6JJXNaE9FtQ4
/jL61t8bD1Peb5cWTpt9p0hrnKI0/pHwASdReyFS4F/HDKdvpof7BxE/OU3HSxxA
XKC2v6RjY4S93vkzvApDXQ+vhKquVRK7/ojyTXQUO/GIzcARprO7H4k62N4ar0x/
qbXLkR8IMkwA8ecsNmcL92ftb/cXoHfd+wdK8WpijqzF4kW4SdteVWbIhUzI0gbr
0gjDYIzjplvH3pZGv/qvx+8sFtAP95OdPjuAAW2qJ9TCVfmiS8naNFCvcxg87RhD
gjyQD3If1X7F8wy309lhq7VNyRexTHgIMgTXHyFvuZMzn/Qe1huL2XCwDcEAg/OX
AvU2iuSE5tWAh7gIUMF/aWi3uoeJUyyoru5ZR//gqdFfx9YxpSimO1UDXnpPi8SR
Iz/jzHJc0aWGYdQ9l6HiSbJF3P/QQcWYs9igt0A7BRGB05SPdWCh7sSO70FJa8ME
f4WID5/qEiaH26kiSRX4cUqpc8Amk8bT0DXw2OT57qy3JM0ZdV5ENQX11pSpr9hv
uLEf0DU7AEmdvzQ=
=T++R
-----END PGP SIGNATURE-----
Merge tag 'pci-v4.17-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI updates from Bjorn Helgaas:
- move pci_uevent_ers() out of pci.h (Michael Ellerman)
- skip ASPM common clock warning if BIOS already configured it (Sinan
Kaya)
- fix ASPM Coverity warning about threshold_ns (Gustavo A. R. Silva)
- remove last user of pci_get_bus_and_slot() and the function itself
(Sinan Kaya)
- add decoding for 16 GT/s link speed (Jay Fang)
- add interfaces to get max link speed and width (Tal Gilboa)
- add pcie_bandwidth_capable() to compute max supported link bandwidth
(Tal Gilboa)
- add pcie_bandwidth_available() to compute bandwidth available to
device (Tal Gilboa)
- add pcie_print_link_status() to log link speed and whether it's
limited (Tal Gilboa)
- use PCI core interfaces to report when device performance may be
limited by its slot instead of doing it in each driver (Tal Gilboa)
- fix possible cpqphp NULL pointer dereference (Shawn Lin)
- rescan more of the hierarchy on ACPI hotplug to fix Thunderbolt/xHCI
hotplug (Mika Westerberg)
- add support for PCI I/O port space that's neither directly accessible
via CPU in/out instructions nor directly mapped into CPU physical
memory space. This is fairly intrusive and includes minor changes to
interfaces used for I/O space on most platforms (Zhichang Yuan, John
Garry)
- add support for HiSilicon Hip06/Hip07 LPC I/O space (Zhichang Yuan,
John Garry)
- use PCI_EXP_DEVCTL2_COMP_TIMEOUT in rapidio/tsi721 (Bjorn Helgaas)
- remove possible NULL pointer dereference in of_pci_bus_find_domain_nr()
(Shawn Lin)
- report quirk timings with dev_info (Bjorn Helgaas)
- report quirks that take longer than 10ms (Bjorn Helgaas)
- add and use Altera Vendor ID (Johannes Thumshirn)
- tidy Makefiles and comments (Bjorn Helgaas)
- don't set up INTx if MSI or MSI-X is enabled to align cris, frv,
ia64, and mn10300 with x86 (Bjorn Helgaas)
- move pcieport_if.h to drivers/pci/pcie/ to encapsulate it (Frederick
Lawler)
- merge pcieport_if.h into portdrv.h (Bjorn Helgaas)
- move workaround for BIOS PME issue from portdrv to PCI core (Bjorn
Helgaas)
- completely disable portdrv with "pcie_ports=compat" (Bjorn Helgaas)
- remove portdrv link order dependency (Bjorn Helgaas)
- remove support for unused VC portdrv service (Bjorn Helgaas)
- simplify portdrv feature permission checking (Bjorn Helgaas)
- remove "pcie_hp=nomsi" parameter (use "pci=nomsi" instead) (Bjorn
Helgaas)
- remove unnecessary "pcie_ports=auto" parameter (Bjorn Helgaas)
- use cached AER capability offset (Frederick Lawler)
- don't enable DPC if BIOS hasn't granted AER control (Mika Westerberg)
- rename pcie-dpc.c to dpc.c (Bjorn Helgaas)
- use generic pci_mmap_resource_range() instead of powerpc and xtensa
arch-specific versions (David Woodhouse)
- support arbitrary PCI host bridge offsets on sparc (Yinghai Lu)
- remove System and Video ROM reservations on sparc (Bjorn Helgaas)
- probe for device reset support during enumeration instead of runtime
(Bjorn Helgaas)
- add ACS quirk for Ampere (née APM) root ports (Feng Kan)
- add function 1 DMA alias quirk for Marvell 88SE9220 (Thomas
Vincent-Cross)
- protect device restore with device lock (Sinan Kaya)
- handle failure of FLR gracefully (Sinan Kaya)
- handle CRS (config retry status) after device resets (Sinan Kaya)
- skip various config reads for SR-IOV VFs as an optimization
(KarimAllah Ahmed)
- consolidate VPD code in vpd.c (Bjorn Helgaas)
- add Tegra dependency on PCI_MSI_IRQ_DOMAIN (Arnd Bergmann)
- add DT support for R-Car r8a7743 (Biju Das)
- fix a PCI_EJECT vs PCI_BUS_RELATIONS race condition in Hyper-V host
bridge driver that causes a general protection fault (Dexuan Cui)
- fix Hyper-V host bridge hang in MSI setup on 1-vCPU VMs with SR-IOV
(Dexuan Cui)
- fix Hyper-V host bridge hang when ejecting a VF before setting up MSI
(Dexuan Cui)
- make several structures static (Fengguang Wu)
- increase number of MSI IRQs supported by Synopsys DesignWare bridges
from 32 to 256 (Gustavo Pimentel)
- implemented multiplexed IRQ domain API and remove obsolete MSI IRQ
API from DesignWare drivers (Gustavo Pimentel)
- add Tegra power management support (Manikanta Maddireddy)
- add Tegra loadable module support (Manikanta Maddireddy)
- handle 64-bit BARs correctly in endpoint support (Niklas Cassel)
- support optional regulator for HiSilicon STB (Shawn Guo)
- use regulator bulk API for Qualcomm apq8064 (Srinivas Kandagatla)
- support power supplies for Qualcomm msm8996 (Srinivas Kandagatla)
* tag 'pci-v4.17-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (123 commits)
MAINTAINERS: Add John Garry as maintainer for HiSilicon LPC driver
HISI LPC: Add ACPI support
ACPI / scan: Do not enumerate Indirect IO host children
ACPI / scan: Rename acpi_is_serial_bus_slave() for more general use
HISI LPC: Support the LPC host on Hip06/Hip07 with DT bindings
of: Add missing I/O range exception for indirect-IO devices
PCI: Apply the new generic I/O management on PCI IO hosts
PCI: Add fwnode handler as input param of pci_register_io_range()
PCI: Remove __weak tag from pci_register_io_range()
MAINTAINERS: Add missing /drivers/pci/cadence directory entry
fm10k: Report PCIe link properties with pcie_print_link_status()
net/mlx5e: Use pcie_bandwidth_available() to compute bandwidth
net/mlx5: Report PCIe link properties with pcie_print_link_status()
net/mlx4_core: Report PCIe link properties with pcie_print_link_status()
PCI: Add pcie_print_link_status() to log link speed and whether it's limited
PCI: Add pcie_bandwidth_available() to compute bandwidth available to device
misc: pci_endpoint_test: Handle 64-bit BARs properly
PCI: designware-ep: Make dw_pcie_ep_reset_bar() handle 64-bit BARs properly
PCI: endpoint: Make sure that BAR_5 does not have 64-bit flag set when clearing
PCI: endpoint: Make epc->ops->clear_bar()/pci_epc_clear_bar() take struct *epf_bar
...
- probe for device reset support during enumeration instead of runtime
(Bjorn Helgaas)
- add ACS quirk for Ampere (née APM) root ports (Feng Kan)
- add function 1 DMA alias quirk for Marvell 88SE9220 (Thomas
Vincent-Cross)
- protect device restore with device lock (Sinan Kaya)
- handle failure of FLR gracefully (Sinan Kaya)
- handle CRS (config retry status) after device resets (Sinan Kaya)
- skip various config reads for SR-IOV VFs as an optimization (KarimAllah
Ahmed)
* pci/virtualization:
PCI/IOV: Add missing prototypes for powerpc pcibios interfaces
PCI/IOV: Use VF0 cached config registers for other VFs
PCI/IOV: Skip BAR sizing for VFs
PCI/IOV: Skip INTx config reads for VFs
PCI: Wait for device to become ready after secondary bus reset
PCI: Add a return type for pci_reset_bridge_secondary_bus()
PCI: Wait for device to become ready after a power management reset
PCI: Rename pci_flr_wait() to pci_dev_wait() and make it generic
PCI: Handle FLR failure and allow other reset types
PCI: Protect restore with device lock to be consistent
PCI: Add function 1 DMA alias quirk for Marvell 88SE9220
PCI: Add ACS quirk for Ampere root ports
PCI: Remove redundant probes for device reset support
PCI: Probe for device reset support during enumeration
Conflicts:
include/linux/pci.h
- move pcieport_if.h to drivers/pci/pcie/ to encapsulate it (Frederick
Lawler)
- merge pcieport_if.h into portdrv.h (Bjorn Helgaas)
- move workaround for BIOS PME issue from portdrv to PCI core (Bjorn
Helgaas)
- completely disable portdrv with "pcie_ports=compat" (Bjorn Helgaas)
- remove portdrv link order dependency (Bjorn Helgaas)
- remove support for unused VC portdrv service (Bjorn Helgaas)
- simplify portdrv feature permission checking (Bjorn Helgaas)
- remove "pcie_hp=nomsi" parameter (use "pci=nomsi" instead) (Bjorn
Helgaas)
- remove unnecessary "pcie_ports=auto" parameter (Bjorn Helgaas)
- use cached AER capability offset (Frederick Lawler)
- don't enable DPC if BIOS hasn't granted AER control (Mika Westerberg)
- rename pcie-dpc.c to dpc.c (Bjorn Helgaas)
* pci/portdrv:
PCI/DPC: Rename from pcie-dpc.c to dpc.c
PCI/DPC: Do not enable DPC if AER control is not allowed by the BIOS
PCI/AER: Use cached AER Capability offset
PCI/portdrv: Rename and reverse sense of pcie_ports_auto
PCI/portdrv: Encapsulate pcie_ports_auto inside the port driver
PCI/portdrv: Remove unnecessary "pcie_ports=auto" parameter
PCI/portdrv: Remove "pcie_hp=nomsi" kernel parameter
PCI/portdrv: Remove unnecessary include of <linux/pci-aspm.h>
PCI/portdrv: Simplify PCIe feature permission checking
PCI/portdrv: Remove unused PCIE_PORT_SERVICE_VC
PCI/portdrv: Remove pcie_port_bus_type link order dependency
PCI/portdrv: Disable port driver in compat mode
PCI/PM: Clear PCIe PME Status bit for Root Complex Event Collectors
PCI/PM: Clear PCIe PME Status bit in core, not PCIe port driver
PCI/PM: Move pcie_clear_root_pme_status() to core
PCI/portdrv: Merge pcieport_if.h into portdrv.h
PCI/portdrv: Move pcieport_if.h to drivers/pci/pcie/
Conflicts:
drivers/pci/pcie/Makefile
drivers/pci/pcie/portdrv.h
- use PCI_EXP_DEVCTL2_COMP_TIMEOUT in rapidio/tsi721 (Bjorn Helgaas)
- remove possible NULL pointer dereference in of_pci_bus_find_domain_nr()
(Shawn Lin)
- report quirk timings with dev_info (Bjorn Helgaas)
- report quirks that take longer than 10ms (Bjorn Helgaas)
- add and use Altera Vendor ID (Johannes Thumshirn)
- tidy Makefiles and comments (Bjorn Helgaas)
* pci/misc:
PCI: Always define the of_node helpers
PCI: Tidy comments
PCI: Tidy Makefiles
mcb: Add Altera PCI ID to mcb-pci
PCI: Add Altera vendor ID
PCI: Report quirks that take more than 10ms
PCI: Report quirk timings with pci_info() instead of pr_debug()
PCI: Fix NULL pointer dereference in of_pci_bus_find_domain_nr()
rapidio/tsi721: use PCI_EXP_DEVCTL2_COMP_TIMEOUT macro
- add support for PCI I/O port space that's neither directly accessible
via CPU in/out instructions nor directly mapped into CPU physical
memory space (Zhichang Yuan)
- add support for HiSilicon Hip06/Hip07 LPC I/O space (Zhichang Yuan,
John Garry)
* pci/lpc:
MAINTAINERS: Add John Garry as maintainer for HiSilicon LPC driver
HISI LPC: Add ACPI support
ACPI / scan: Do not enumerate Indirect IO host children
ACPI / scan: Rename acpi_is_serial_bus_slave() for more general use
HISI LPC: Support the LPC host on Hip06/Hip07 with DT bindings
of: Add missing I/O range exception for indirect-IO devices
PCI: Apply the new generic I/O management on PCI IO hosts
PCI: Add fwnode handler as input param of pci_register_io_range()
PCI: Remove __weak tag from pci_register_io_range()
lib: Add generic PIO mapping method
After introducing the new generic I/O space management (Logical PIO), the
original PCI MMIO relevant helpers need to be updated based on the new
interfaces defined in logical PIO.
Adapt the corresponding code to match the changes introduced by logical
PIO.
Tested-by: dann frazier <dann.frazier@canonical.com>
Signed-off-by: Zhichang Yuan <yuanzhichang@hisilicon.com>
Signed-off-by: Gabriele Paoloni <gabriele.paoloni@huawei.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de> # earlier draft
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
In preparation for having the PCI MMIO helpers use the new generic I/O
space management (logical PIO) we need to add the fwnode handler as an
extra input parameter.
Changes the signature of pci_register_io_range() and its callers as
needed.
Tested-by: dann frazier <dann.frazier@canonical.com>
Signed-off-by: Gabriele Paoloni <gabriele.paoloni@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Acked-by: Rob Herring <robh@kernel.org>
pci_register_io_range() has only one definition, so there is no need for
the __weak attribute. Remove it.
Tested-by: dann frazier <dann.frazier@canonical.com>
Signed-off-by: Gabriele Paoloni <gabriele.paoloni@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Add pcie_print_link_status(). This logs the current settings of the link
(speed, width, and total available bandwidth).
If the device is capable of more bandwidth but is limited by a slower
upstream link, we include information about the link that limits the
device's performance.
The user may be able to move the device to a different slot for better
performance.
This provides a unified method for all PCI devices to report status and
issues, instead of each device reporting in a different way, using
different code.
Signed-off-by: Tal Gilboa <talgi@mellanox.com>
[bhelgaas: changelog, reword log messages, print device capabilities when
not limited, print bandwidth in Gb/s]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Add pcie_bandwidth_available() to compute the bandwidth available to a
device. This may be limited by the device itself or by a slower upstream
link leading to the device.
The available bandwidth at each link along the path is computed as:
link_width * link_speed * (1 - encoding_overhead)
2.5 and 5.0 GT/s links use 8b/10b encoding, which reduces the raw bandwidth
available by 20%; 8.0 GT/s and faster links use 128b/130b encoding, which
reduces it by about 1.5%.
The result is in Mb/s, i.e., megabits/second, of raw bandwidth.
Also return the device with the slowest link and the speed and width of
that link.
Signed-off-by: Tal Gilboa <talgi@mellanox.com>
[bhelgaas: changelog, leave pcie_get_minimum_link() alone for now, return
bw directly, use pci_upstream_bridge(), check "next_bw <= bw" to find
uppermost limiting device, return speed/width of the limiting device]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Pull x86 platform updates from Ingo Molnar:
"The main changes in this cycle were:
- Add "Jailhouse" hypervisor support (Jan Kiszka)
- Update DeviceTree support (Ivan Gorinov)
- Improve DMI date handling (Andy Shevchenko)"
* 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/PCI: Fix a potential regression when using dmi_get_bios_year()
firmware/dmi_scan: Uninline dmi_get_bios_year() helper
x86/devicetree: Use CPU description from Device Tree
of/Documentation: Specify local APIC ID in "reg"
MAINTAINERS: Add entry for Jailhouse
x86/jailhouse: Allow to use PCI_MMCONFIG without ACPI
x86: Consolidate PCI_MMCONFIG configs
x86: Align x86_64 PCI_MMCONFIG with 32-bit variant
x86/jailhouse: Enable PCI mmconfig access in inmates
PCI: Scan all functions when running over Jailhouse
jailhouse: Provide detection for non-x86 systems
x86/devicetree: Fix device IRQ settings in DT
x86/devicetree: Initialize device tree before using it
pci: Simplify code by using the new dmi_get_bios_year() helper
ACPI/sleep: Simplify code by using the new dmi_get_bios_year() helper
x86/pci: Simplify code by using the new dmi_get_bios_year() helper
dmi: Introduce the dmi_get_bios_year() helper function
x86/platform/quark: Re-use DEFINE_SHOW_ATTRIBUTE() macro
x86/platform/atom: Re-use DEFINE_SHOW_ATTRIBUTE() macro
Add pcie_bandwidth_capable() to compute the max link bandwidth supported by
a device, based on the max link speed and width, adjusted by the encoding
overhead.
The maximum bandwidth of the link is computed as:
max_link_width * max_link_speed * (1 - encoding_overhead)
2.5 and 5.0 GT/s links use 8b/10b encoding, which reduces the raw bandwidth
available by 20%; 8.0 GT/s and faster links use 128b/130b encoding, which
reduces it by about 1.5%.
The result is in Mb/s, i.e., megabits/second, of raw bandwidth.
Signed-off-by: Tal Gilboa <talgi@mellanox.com>
[bhelgaas: add 16 GT/s, adjust for pcie_get_speed_cap() and
pcie_get_width_cap() signatures, don't export outside drivers/pci]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Add pcie_get_width_cap() to find the max link width supported by a device.
Change max_link_width_show() to use pcie_get_width_cap().
Signed-off-by: Tal Gilboa <talgi@mellanox.com>
[bhelgaas: return width directly instead of error and *width, don't export
outside drivers/pci]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Add pcie_get_speed_cap() to find the max link speed supported by a device.
Change max_link_speed_show() to use pcie_get_speed_cap().
Signed-off-by: Tal Gilboa <talgi@mellanox.com>
[bhelgaas: return speed directly instead of error and *speed, don't export
outside drivers/pci]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Remove pointless comments that tell us the file name, remove blank line
comments, follow multi-line comment conventions. No functional change
intended.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
There are PCI devices which are power-manageable by a nonstandard means,
such as a custom ACPI method. One example are discrete GPUs in hybrid
graphics laptops, another are Thunderbolt controllers in Macs.
Such devices can't be put into D3cold with pci_set_power_state() because
pci_platform_power_transition() fails with -ENODEV. Instead they're put
into D3hot by pci_set_power_state() and subsequently into D3cold by
invoking the nonstandard means. However as a consequence the cached
current_state is incorrectly left at D3hot.
What we need to do is walk the hierarchy below such a PCI device on
powerdown and update the current_state to D3cold. On powerup the PCI
device itself and the hierarchy below it is in D0uninitialized, so we
need to walk the hierarchy again and wake all devices, causing them to
be put into D0active and then letting them autosuspend as they see fit.
To this end make pci_wakeup_bus() & pci_bus_set_current_state() public
so PCI drivers don't have to reinvent the wheel.
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Link: https://patchwork.freedesktop.org/patch/msgid/2962443259e7faec577274b4ef8c54aad66f9a94.1520068884.git.lukas@wunner.de
Move pcie_clear_root_pme_status() from the port driver to the PCI core so
it will be available even when the port driver isn't present. No
functional change intended.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Setting Secondary Bus Reset of a downstream port sends a hot reset. PCIe
r4.0, sec 2.3.1, Request Handling Rules, indicates that a device can return
CRS Completion Status following such a reset. Wait until the device
becomes ready in that situation.
Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
Signed-off-by: Bjorn Helgaas <helgaas@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Add a return value to pci_reset_bridge_secondary_bus() so we can return an
error if the device doesn't become ready after the reset.
Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
Signed-off-by: Bjorn Helgaas <helgaas@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
PCIe r4.0, sec 2.3.1, Request Handling Rules, indicates that a device can
return CRS Completion Status following a D3hot to D0 transition. Wait
until the device becomes ready in that situation.
Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
Signed-off-by: Bjorn Helgaas <helgaas@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
If the "parent" pointer passed to of_pci_bus_find_domain_nr() is NULL,
don't dereference it.
Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
PCIe r4.0, sec 2.3.1, Request Handling Rules, says:
Valid reset conditions after which a device is permitted to return CRS
are:
* Cold, Warm, and Hot Resets,
* FLR
* A reset initiated in response to a D3hot to D0 uninitialized
Try to reuse FLR implementation towards other reset types.
Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
pci_flr_wait() and pci_af_flr() functions assume graceful return even
though the device is inaccessible under error conditions.
Return -ENOTTY in error cases so that __pci_reset_function_locked() can
try other reset types if AF_FLR/FLR reset fails.
Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Commit b014e96d1a ("PCI: Protect pci_error_handlers->reset_notify() usage
with device_lock()") added protection around pci_dev_restore() function so
a device-specific remove callback does not cause a race condition with
hotplug.
pci_dev_lock() usage has been forgotten in two places. Add locks for
pci_slot_restore() and moving pci_dev_restore() inside the locks for
pci_try_reset_function().
Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
Signed-off-by: Bjorn Helgaas <helgaas@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
...instead of open coding its functionality.
No changes in functionality.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Jean Delvare <jdelvare@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-acpi@vger.kernel.org
Cc: linux-pci@vger.kernel.org
Link: http://lkml.kernel.org/r/20180222125923.57385-4-andriy.shevchenko@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
We probe every device for whether it supports reset so we can tell whether
to create a sysfs "reset" file for it. We do that probe in
pci_init_capabilities() during enumeration and save the result in
dev->reset_fn. The result doesn't depend on any other devices on the bus
and shouldn't change after boot, so we don't need to do the probe again.
Remove the pci_probe_reset_function() calls and rely on the dev->reset_fn
we found during enumeration. No functional change intended.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJad5lgAAoJEFmIoMA60/r8s2kQAI3PztawDpaCP9Z12pkbBHSt
Ho0xTyk9rCZi9kQJbNjc+a+QrlA3QmTHXIXerB3LSWoh7M+XhsECjem92eHpgLNS
JvYPhTfOrCr0vdiAmOz6hD0AqN/psrbfzgiJhSwomsGEFS77k7kERSJckRv81sxb
Aj5F/WjucAgLorwm4auveAJEQ7atE7/6pkXzoqYm4G6NLOb46jUcRGndrnvXZBlz
fws8fBM4BHyi7i25CYQl24tFq1CGax1rIPgLg+4KnH76bQk/N6Ju0sGVSzfh+hG8
SIerK9bJbzGRAuNKoxB3aO1dyzsK3x9WztE2mG98w5trOISPIR1FqnvC/225FWAU
d6eIXiC7wKnEx+DElNTzCjzfHc7SAJoupO32H7CoiTe5zPUlWlxJ1zLYkK1gt50q
m8PRBiYTglxyznzrO0drtcdjEzvbdZNRrsYnul4wi1vSHzjk6F6XLtzT10XWM1M1
1pXLB8384FTj0Hu4bq6Y3Aivkmz0Sf+eQM2NaOwe+Zj7/1VV0d3lvi4LUXkqzLCA
FoXPJSMxG2Qu+iflCeYRQBJjExaZH3eNLZ3dT6QpcJrjaFVedd9u5DeeFqNL27zV
bhr8TdqrR4p4rc8EBAGoCapw96IxLZROKB3gxbrZVOpfIZpzthwHbElHX6aqUgF4
w/EV1JWs36WXWaxFk8wd
=ttq9
-----END PGP SIGNATURE-----
Merge tag 'pci-v4.16-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI updates from Bjorn Helgaas:
- skip AER driver error recovery callbacks for correctable errors
reported via ACPI APEI, as we already do for errors reported via the
native path (Tyler Baicar)
- fix DPC shared interrupt handling (Alex Williamson)
- print full DPC interrupt number (Keith Busch)
- enable DPC only if AER is available (Keith Busch)
- simplify DPC code (Bjorn Helgaas)
- calculate ASPM L1 substate parameter instead of hardcoding it (Bjorn
Helgaas)
- enable Latency Tolerance Reporting for ASPM L1 substates (Bjorn
Helgaas)
- move ASPM internal interfaces out of public header (Bjorn Helgaas)
- allow hot-removal of VGA devices (Mika Westerberg)
- speed up unplug and shutdown by assuming Thunderbolt controllers
don't support Command Completed events (Lukas Wunner)
- add AtomicOps support for GPU and Infiniband drivers (Felix Kuehling,
Jay Cornwall)
- expose "ari_enabled" in sysfs to help NIC naming (Stuart Hayes)
- clean up PCI DMA interface usage (Christoph Hellwig)
- remove PCI pool API (replaced with DMA pool) (Romain Perier)
- deprecate pci_get_bus_and_slot(), which assumed PCI domain 0 (Sinan
Kaya)
- move DT PCI code from drivers/of/ to drivers/pci/ (Rob Herring)
- add PCI-specific wrappers for dev_info(), etc (Frederick Lawler)
- remove warnings on sysfs mmap failure (Bjorn Helgaas)
- quiet ROM validation messages (Alex Deucher)
- remove redundant memory alloc failure messages (Markus Elfring)
- fill in types for compile-time VGA and other I/O port resources
(Bjorn Helgaas)
- make "pci=pcie_scan_all" work for Root Ports as well as Downstream
Ports to help AmigaOne X1000 (Bjorn Helgaas)
- add SPDX tags to all PCI files (Bjorn Helgaas)
- quirk Marvell 9128 DMA aliases (Alex Williamson)
- quirk broken INTx disable on Ceton InfiniTV4 (Bjorn Helgaas)
- fix CONFIG_PCI=n build by adding dummy pci_irqd_intx_xlate() (Niklas
Cassel)
- use DMA API to get MSI address for DesignWare IP (Niklas Cassel)
- fix endpoint-mode DMA mask configuration (Kishon Vijay Abraham I)
- fix ARTPEC-6 incorrect IS_ERR() usage (Wei Yongjun)
- add support for ARTPEC-7 SoC (Niklas Cassel)
- add endpoint-mode support for ARTPEC (Niklas Cassel)
- add Cadence PCIe host and endpoint controller driver (Cyrille
Pitchen)
- handle multiple INTx status bits being set in dra7xx (Vignesh R)
- translate dra7xx hwirq range to fix INTD handling (Vignesh R)
- remove deprecated Exynos PHY initialization code (Jaehoon Chung)
- fix MSI erratum workaround for HiSilicon Hip06/Hip07 (Dongdong Liu)
- fix NULL pointer dereference in iProc BCMA driver (Ray Jui)
- fix Keystone interrupt-controller-node lookup (Johan Hovold)
- constify qcom driver structures (Julia Lawall)
- rework Tegra config space mapping to increase space available for
endpoints (Vidya Sagar)
- simplify Tegra driver by using bus->sysdata (Manikanta Maddireddy)
- remove PCI_REASSIGN_ALL_BUS usage on Tegra (Manikanta Maddireddy)
- add support for Global Fabric Manager Server (GFMS) event to
Microsemi Switchtec switch driver (Logan Gunthorpe)
- add IDs for Switchtec PSX 24xG3 and PSX 48xG3 (Kelvin Cao)
* tag 'pci-v4.16-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (140 commits)
PCI: cadence: Add EndPoint Controller driver for Cadence PCIe controller
dt-bindings: PCI: cadence: Add DT bindings for Cadence PCIe endpoint controller
PCI: endpoint: Fix EPF device name to support multi-function devices
PCI: endpoint: Add the function number as argument to EPC ops
PCI: cadence: Add host driver for Cadence PCIe controller
dt-bindings: PCI: cadence: Add DT bindings for Cadence PCIe host controller
PCI: Add vendor ID for Cadence
PCI: Add generic function to probe PCI host controllers
PCI: generic: fix missing call of pci_free_resource_list()
PCI: OF: Add generic function to parse and allocate PCI resources
PCI: Regroup all PCI related entries into drivers/pci/Makefile
PCI/DPC: Reformat DPC register definitions
PCI/DPC: Add and use DPC Status register field definitions
PCI/DPC: Squash dpc_rp_pio_get_info() into dpc_process_rp_pio_error()
PCI/DPC: Remove unnecessary RP PIO register structs
PCI/DPC: Push dpc->rp_pio_status assignment into dpc_rp_pio_get_info()
PCI/DPC: Squash dpc_rp_pio_print_error() into dpc_rp_pio_get_info()
PCI/DPC: Make RP PIO log size check more generic
PCI/DPC: Rename local "status" to "dpc_status"
PCI/DPC: Squash dpc_rp_pio_print_tlp_header() into dpc_rp_pio_print_error()
...
* pci/spdx:
PCI: Add SPDX GPL-2.0+ to replace implicit GPL v2 or later statement
PCI: Add SPDX GPL-2.0+ to replace GPL v2 or later boilerplate
PCI: Add SPDX GPL-2.0 to replace COPYING boilerplate
PCI: Add SPDX GPL-2.0 to replace GPL v2 boilerplate
PCI: Add SPDX GPL-2.0 when no license was specified
* pci/misc:
PCI: Add dummy pci_irqd_intx_xlate() for CONFIG_PCI=n build
PCI: Add wrappers for dev_printk()
PCI: Remove unnecessary messages for memory allocation failures
PCI: Add #defines for Completion Timeout Disable feature
hinic: Replace PCI pool old API
net: e100: Replace PCI pool old API
block: DAC960: Replace PCI pool old API
MAINTAINERS: Include more PCI files
PCI: Remove unneeded kallsyms include
powerpc/pci: Unroll two pass loop when scanning bridges
powerpc/pci: Use for_each_pci_bridge() helper
b24413180f ("License cleanup: add SPDX GPL-2.0 license identifier to
files with no license") added SPDX GPL-2.0 to several PCI files that
previously contained no license information.
Add SPDX GPL-2.0 to all other PCI files that did not contain any license
information and hence were under the default GPL version 2 license of the
kernel.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The Atomic Operations feature (PCIe r4.0, sec 6.15) allows atomic
transctions to be requested by, routed through and completed by PCIe
components. Routing and completion do not require software support.
Component support for each is detectable via the DEVCAP2 register.
A Requester may use AtomicOps only if its PCI_EXP_DEVCTL2_ATOMIC_REQ is
set. This should be set only if the Completer and all intermediate routing
elements support AtomicOps.
A concrete example is the AMD Fiji-class GPU (which is capable of making
AtomicOp requests), below a PLX 8747 switch (advertising AtomicOp routing)
with a Haswell host bridge (advertising AtomicOp completion support).
Add pci_enable_atomic_ops_to_root() for per-device control over AtomicOp
requests. This checks to be sure the Root Port supports completion of the
desired AtomicOp sizes and the path to the Root Port supports routing the
AtomicOps.
Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
[bhelgaas: changelog, comments, whitespace]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Add PCI-specific dev_printk() wrappers and use them to simplify the code
slightly. No functional change intended.
Signed-off-by: Frederick Lawler <fred@fredlawl.com>
[bhelgaas: squash into one patch]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Add pcim_set_mwi(), a device-managed version of pci_set_mwi().
First user is the Realtek r8169 driver.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* pci/host-thunder:
PCI: Avoid slot reset if bridge itself is broken
PCI: Avoid bus reset if bridge itself is broken
PCI: Mark Cavium CN8xxx to avoid bus reset
* pci/virtualization:
PCI: Document reset method return values
PCI: Detach driver before procfs & sysfs teardown on device remove
PCI: Apply Cavium ThunderX ACS quirk to more Root Ports
PCI: Set Cavium ACS capability quirk flags to assert RR/CR/SV/UF
PCI: Restore ARI Capable Hierarchy before setting numVFs
PCI: Create SR-IOV virtfn/physfn links before attaching driver
PCI: Expose SR-IOV offset, stride, and VF device ID via sysfs
PCI: Cache the VF device ID in the SR-IOV structure
PCI: Add Kconfig PCI_IOV dependency for PCI_REALLOC_ENABLE_AUTO
PCI: Remove unused function __pci_reset_function()
PCI: Remove reset argument from pci_iov_{add,remove}_virtfn()
* pci/resource:
PCI: Fail pci_map_rom() if the option ROM is invalid
PCI: Move pci_map_rom() error path
x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 00-1f, 30-3f, 60-7f)
PCI: Add pci_resize_resource() for resizing BARs
PCI: Add resizable BAR infrastructure
PCI: Add PCI resource type mask #define
Fix build error in kernel-doc notation:
../drivers/pci/pci.c:3479: ERROR: Unexpected indentation.
"::" tells the kernel-doc "reStructuredText" processor that the following
block is a literal block of some blob that should be kept as is.
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
[bhelgaas: add hint about "::" meaning]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Replace the PCI-specific flag PCI_DEV_FLAGS_NEEDS_RESUME with the
PM core's DPM_FLAG_NEVER_SKIP one everywhere and drop it.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
The pci_reset_function() path may try several different reset methods:
device-specific resets, PCIe Function Level Resets, PCI Advanced Features
Function Level Reset, etc.
Add a comment about what the return values from these methods mean. If one
of the methods fails, in some cases we want to continue and try the next
one in the list, but sometimes we want to stop trying.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Add resizable BAR infrastructure, including defines and helper functions to
read the possible sizes of a BAR and update its size. See PCIe r3.1, sec
7.22.
Link: https://pcisig.com/sites/default/files/specification_documents/ECN_Resizable-BAR_24Apr2008.pdf
Signed-off-by: Christian König <christian.koenig@amd.com>
[bhelgaas: rename to functions with "rebar" (to match #defines), drop shift
#defines, drop "_MASK" suffixes, fix typos, fix kerneldoc]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
When checking to see if a PCI slot can safely be reset, we previously
checked to see if any of the children had their PCI_DEV_FLAGS_NO_BUS_RESET
flag set.
Some PCIe root port bridges do not behave well after a slot reset, and may
cause the device in the slot to become unusable.
Add a check for PCI_DEV_FLAGS_NO_BUS_RESET being set in the bridge device
to prevent the slot from being reset.
Signed-off-by: Jan Glauber <jglauber@cavium.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
When checking to see if a PCI bus can safely be reset, we previously
checked to see if any of the children had their PCI_DEV_FLAGS_NO_BUS_RESET
flag set. Children marked with that flag are known not to behave well
after a bus reset.
Some PCIe root port bridges also do not behave well after a bus reset,
sometimes causing the devices behind the bridge to become unusable.
Add a check for PCI_DEV_FLAGS_NO_BUS_RESET being set in the bridge device
to allow these bridges to be flagged, and prevent their secondary buses
from being reset.
Signed-off-by: David Daney <david.daney@cavium.com>
[jglauber@cavium.com: fixed typo]
Signed-off-by: Jan Glauber <jglauber@cavium.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
The last caller of __pci_reset_function() has been removed. Remove the
function as well.
Signed-off-by: Jan H. Schönherr <jschoenh@amazon.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
This reverts commit 40f11adc7c.
Jens found that iwlwifi firmware loading failed on a Lenovo X1 Carbon,
gen4:
iwlwifi 0000:04:00.0: Direct firmware load for iwlwifi-8000C-34.ucode failed with error -2
iwlwifi 0000:04:00.0: Direct firmware load for iwlwifi-8000C-33.ucode failed with error -2
iwlwifi 0000:04:00.0: Direct firmware load for iwlwifi-8000C-32.ucode failed with error -2
iwlwifi 0000:04:00.0: loaded firmware version 31.532993.0 op_mode iwlmvm
iwlwifi 0000:04:00.0: Detected Intel(R) Dual Band Wireless AC 8260, REV=0x208
...
iwlwifi 0000:04:00.0: Failed to load firmware chunk!
iwlwifi 0000:04:00.0: Could not load the [0] uCode section
iwlwifi 0000:04:00.0: Failed to start INIT ucode: -110
iwlwifi 0000:04:00.0: Failed to run INIT ucode: -110
He bisected it to 40f11adc7c ("PCI: Avoid race while enabling upstream
bridges"). Revert that commit to fix the regression.
Link: http://lkml.kernel.org/r/4bcbcbc1-7c79-09f0-5071-bc2f53bf6574@kernel.dk
Fixes: 40f11adc7c ("PCI: Avoid race while enabling upstream bridges")
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Srinath Mannam <srinath.mannam@broadcom.com>
CC: Jens Axboe <axboe@kernel.dk>
CC: Luca Coelho <luca@coelho.fi>
CC: Johannes Berg <johannes@sipsolutions.net>
CC: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJZsr8cAAoJEFmIoMA60/r8lXYQAKViYIRMJDD4n3NhjMeLOsnJ
vwaBmWlLRjSFIEpag5kMjS1RJE17qAvmkBZnDvSNZ6cT28INkkZnVM2IW96WECVq
64MIvDijVPcvqGuWePCfWdDiSXApiDWwJuw55BOhmvV996wGy0gYgzpPY+1g0Knh
XzH9IOzDL79hZleLfsxX0MLV6FGBVtOsr0jvQ04k4IgEMIxEDTlbw85rnrvzQUtc
0Vj2koaxWIESZsq7G/wiZb2n6ekaFdXO/VlVvvhmTSDLCBaJ63Hb/gfOhwMuVkS6
B3cVprNrCT0dSzWmU4ZXf+wpOyDpBexlemW/OR/6CQUkC6AUS6kQ5si1X44dbGmJ
nBPh414tdlm/6V4h/A3UFPOajSGa/ZWZ/uQZPfvKs1R6WfjUerWVBfUpAzPbgjam
c/mhJ19HYT1J7vFBfhekBMeY2Px3JgSJ9rNsrFl48ynAALaX5GEwdpo4aqBfscKz
4/f9fU4ysumopvCEuKD2SsJvsPKd5gMQGGtvAhXM1TxvAoQ5V4cc99qEetAPXXPf
h2EqWm4ph7YP4a+n/OZBjzluHCmZJn1CntH5+//6wpUk6HnmzsftGELuO9n12cLE
GGkreI3T9ctV1eOkzVVa0l0QTE1X/VLyEyKCtb9obXsDaG4Ud7uKQoZgB19DwyTJ
EG76ridTolUFVV+wzJD9
=9cLP
-----END PGP SIGNATURE-----
Merge tag 'pci-v4.14-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI updates from Bjorn Helgaas:
- add enhanced Downstream Port Containment support, which prints more
details about Root Port Programmed I/O errors (Dongdong Liu)
- add Layerscape ls1088a and ls2088a support (Hou Zhiqiang)
- add MediaTek MT2712 and MT7622 support (Ryder Lee)
- add MediaTek MT2712 and MT7622 MSI support (Honghui Zhang)
- add Qualcom IPQ8074 support (Varadarajan Narayanan)
- add R-Car r8a7743/5 device tree support (Biju Das)
- add Rockchip per-lane PHY support for better power management (Shawn
Lin)
- fix IRQ mapping for hot-added devices by replacing the
pci_fixup_irqs() boot-time design with a host bridge hook called at
probe-time (Lorenzo Pieralisi, Matthew Minter)
- fix race when enabling two devices that results in upstream bridge
not being enabled correctly (Srinath Mannam)
- fix pciehp power fault infinite loop (Keith Busch)
- fix SHPC bridge MSI hotplug events by enabling bus mastering
(Aleksandr Bezzubikov)
- fix a VFIO issue by correcting PCIe capability sizes (Alex
Williamson)
- fix an INTD issue on Xilinx and possibly other drivers by unifying
INTx IRQ domain support (Paul Burton)
- avoid IOMMU stalls by marking AMD Stoney GPU ATS as broken (Joerg
Roedel)
- allow APM X-Gene device assignment to guests by adding an ACS quirk
(Feng Kan)
- fix driver crashes by disabling Extended Tags on Broadcom HT2100
(Extended Tags support is required for PCIe Receivers but not
Requesters, and we now enable them by default when Requesters support
them) (Sinan Kaya)
- fix MSIs for devices that use phantom RIDs for DMA by assuming MSIs
use the real Requester ID (not a phantom RID) (Robin Murphy)
- prevent assignment of Intel VMD children to guests (which may be
supported eventually, but isn't yet) by not associating an IOMMU with
them (Jon Derrick)
- fix Intel VMD suspend/resume by releasing IRQs on suspend (Scott
Bauer)
- fix a Function-Level Reset issue with Intel 750 NVMe by waiting
longer (up to 60sec instead of 1sec) for device to become ready
(Sinan Kaya)
- fix a Function-Level Reset issue on iProc Stingray by working around
hardware defects in the CRS implementation (Oza Pawandeep)
- fix an issue with Intel NVMe P3700 after an iProc reset by adding a
delay during shutdown (Oza Pawandeep)
- fix a Microsoft Hyper-V lockdep issue by polling instead of blocking
in compose_msi_msg() (Stephen Hemminger)
- fix a wireless LAN driver timeout by clearing DesignWare MSI
interrupt status after it is handled, not before (Faiz Abbas)
- fix DesignWare ATU enable checking (Jisheng Zhang)
- reduce Layerscape dependencies on the bootloader by doing more
initialization in the driver (Hou Zhiqiang)
- improve Intel VMD performance allowing allocation of more IRQ vectors
than present CPUs (Keith Busch)
- improve endpoint framework support for initial DMA mask, different
BAR sizes, configurable page sizes, MSI, test driver, etc (Kishon
Vijay Abraham I, Stan Drozd)
- rework CRS support to add periodic messages while we poll during
enumeration and after Function-Level Reset and prepare for possible
other uses of CRS (Sinan Kaya)
- clean up Root Port AER handling by removing unnecessary code and
moving error handler methods to struct pcie_port_service_driver
(Christoph Hellwig)
- clean up error handling paths in various drivers (Bjorn Andersson,
Fabio Estevam, Gustavo A. R. Silva, Harunobu Kurokawa, Jeffy Chen,
Lorenzo Pieralisi, Sergei Shtylyov)
- clean up SR-IOV resource handling by disabling VF decoding before
updating the corresponding resource structs (Gavin Shan)
- clean up DesignWare-based drivers by unifying quirks to update Class
Code and Interrupt Pin and related handling of write-protected
registers (Hou Zhiqiang)
- clean up by adding empty generic pcibios_align_resource() and
pcibios_fixup_bus() and removing empty arch-specific implementations
(Palmer Dabbelt)
- request exclusive reset control for several drivers to allow cleanup
elsewhere (Philipp Zabel)
- constify various structures (Arvind Yadav, Bhumika Goyal)
- convert from full_name() to %pOF (Rob Herring)
- remove unused variables from iProc, HiSi, Altera, Keystone (Shawn
Lin)
* tag 'pci-v4.14-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (170 commits)
PCI: xgene: Clean up whitespace
PCI: xgene: Define XGENE_PCI_EXP_CAP and use generic PCI_EXP_RTCTL offset
PCI: xgene: Fix platform_get_irq() error handling
PCI: xilinx-nwl: Fix platform_get_irq() error handling
PCI: rockchip: Fix platform_get_irq() error handling
PCI: altera: Fix platform_get_irq() error handling
PCI: spear13xx: Fix platform_get_irq() error handling
PCI: artpec6: Fix platform_get_irq() error handling
PCI: armada8k: Fix platform_get_irq() error handling
PCI: dra7xx: Fix platform_get_irq() error handling
PCI: exynos: Fix platform_get_irq() error handling
PCI: iproc: Clean up whitespace
PCI: iproc: Rename PCI_EXP_CAP to IPROC_PCI_EXP_CAP
PCI: iproc: Add 500ms delay during device shutdown
PCI: Fix typos and whitespace errors
PCI: Remove unused "res" variable from pci_resource_io()
PCI: Correct kernel-doc of pci_vpd_srdt_size(), pci_vpd_srdt_tag()
PCI/AER: Reformat AER register definitions
iommu/vt-d: Prevent VMD child devices from being remapping targets
x86/PCI: Use is_vmd() rather than relying on the domain number
...
Sporadic reset issues have been observed with an Intel 750 NVMe drive while
assigning the physical function to the guest machine. The sequence of
events observed is as follows:
- perform a Function Level Reset (FLR)
- sleep up to 1000ms total
- read ~0 from PCI_COMMAND (CRS completion for config read)
- warn that the device didn't return from FLR
- touch the device before it's ready
- device drops config writes when we restore register settings (there's
no mechanism for software to learn about CRS completions for writes)
- incomplete register restore leaves device in inconsistent state
- device probe fails because device is in inconsistent state
After reset, an endpoint may respond to config requests with Configuration
Request Retry Status (CRS) to indicate that it is not ready to accept new
requests. See PCIe r3.1, sec 2.3.1 and 6.6.2.
Increase the timeout value from 1 second to 60 seconds to cover the period
where device responds with CRS and also report polling progress.
Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
[bhelgaas: include the mandatory 100ms in the delays we print]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Now that we have a custom printf format specifier, convert users of
full_name() to use %pOF instead. This is preparation for removing storing
of the full path string for each node.
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Thierry Reding <thierry.reding@gmail.com>
Cc: Jonathan Hunter <jonathanh@nvidia.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
When we enable a device, we first enable any upstream bridges. If a bridge
has multiple downstream devices and we enable them simultaneously, the race
to enable the upstream bridge may cause problems. Consider this hierarchy:
bridge A --+-- device B
+-- device C
If drivers for B and C call pci_enable_device() simultaneously, both will
attempt to enable A, which involves setting PCI_COMMAND_MASTER via
pci_set_master() and PCI_COMMAND_MEMORY via pci_enable_resources().
In the following sequence, B's update to set A's PCI_COMMAND_MEMORY is
lost, and neither B nor C will work correctly:
B C
pci_set_master(A)
cmd = read(A, PCI_COMMAND)
cmd |= PCI_COMMAND_MASTER
pci_set_master(A)
cmd = read(A, PCI_COMMAND)
cmd |= PCI_COMMAND_MASTER
write(A, PCI_COMMAND, cmd)
pci_enable_device(A)
pci_enable_resources(A)
cmd = read(A, PCI_COMMAND)
cmd |= PCI_COMMAND_MEMORY
write(A, PCI_COMMAND, cmd)
write(A, PCI_COMMAND, cmd)
Avoid this race by holding a new pci_bridge_mutex while enabling a bridge.
This ensures that both PCI_COMMAND_MASTER and PCI_COMMAND_MEMORY will be
updated before another thread can start enabling the bridge.
Note that although pci_enable_bridge() is recursive, it enables any
upstream bridges *before* acquiring the mutex. When it acquires the mutex
and calls pci_set_master() and pci_enable_device(), any upstream bridges
have already been enabled so pci_enable_device() will not deadlock by
calling pci_enable_bridge() again.
Signed-off-by: Srinath Mannam <srinath.mannam@broadcom.com>
[bhelgaas: changelog, comment]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
If the pci_find_pcie_root_port() function is called on a root port
itself, return the root port rather than NULL.
This effectively reverts commit 0e40523287 ("PCI: fix oops when
try to find Root Port for a PCI device") which added an extra check
that would now be redundant.
Fixes: a99b646afa ("PCI: Disable PCIe Relaxed Ordering if unsupported")
Fixes: c56d4450eb ("PCI: Turn off Request Attributes to avoid Chelsio T5 Completion erratum")
Signed-off-by: Thierry Reding <treding@nvidia.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Shawn Lin <shawn.lin@rock-chips.com>
Tested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking fixes from David Miller:
1) Fix TCP checksum offload handling in iwlwifi driver, from Emmanuel
Grumbach.
2) In ksz DSA tagging code, free SKB if skb_put_padto() fails. From
Vivien Didelot.
3) Fix two regressions with bonding on wireless, from Andreas Born.
4) Fix build when busypoll is disabled, from Daniel Borkmann.
5) Fix copy_linear_skb() wrt. SO_PEEK_OFF, from Eric Dumazet.
6) Set SKB cached route properly in inet_rtm_getroute(), from Florian
Westphal.
7) Fix PCI-E relaxed ordering handling in cxgb4 driver, from Ding
Tianhong.
8) Fix module refcnt leak in ULP code, from Sabrina Dubroca.
9) Fix use of GFP_KERNEL in atomic contexts in AF_KEY code, from Eric
Dumazet.
10) Need to purge socket write queue in dccp_destroy_sock(), also from
Eric Dumazet.
11) Make bpf_trace_printk() work properly on 32-bit architectures, from
Daniel Borkmann.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (47 commits)
bpf: fix bpf_trace_printk on 32 bit archs
PCI: fix oops when try to find Root Port for a PCI device
sfc: don't try and read ef10 data on non-ef10 NIC
net_sched: remove warning from qdisc_hash_add
net_sched/sfq: update hierarchical backlog when drop packet
net_sched: reset pointers to tcf blocks in classful qdiscs' destructors
ipv4: fix NULL dereference in free_fib_info_rcu()
net: Fix a typo in comment about sock flags.
ipv6: fix NULL dereference in ip6_route_dev_notify()
tcp: fix possible deadlock in TCP stack vs BPF filter
dccp: purge write queue in dccp_destroy_sock()
udp: fix linear skb reception with PEEK_OFF
ipv6: release rt6->rt6i_idev properly during ifdown
af_key: do not use GFP_KERNEL in atomic contexts
tcp: ulp: avoid module refcnt leak in tcp_set_ulp
net/cxgb4vf: Use new PCI_DEV_FLAGS_NO_RELAXED_ORDERING flag
net/cxgb4: Use new PCI_DEV_FLAGS_NO_RELAXED_ORDERING flag
PCI: Disable Relaxed Ordering Attributes for AMD A1100
PCI: Disable Relaxed Ordering for some Intel processors
PCI: Disable PCIe Relaxed Ordering if unsupported
...
Add two reasons for returning 0 value to the description of
pci_set_power_state() to include the cases when:
- the transition is to D1 or D2 but D1 and D2 are not supported
- the transition is to D3 but D3 is not supported
Signed-off-by: Piotr Gregor <piotrgregor@rsyncme.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The implementation of PCI workarounds may require that the device is reset
from its probe function. This implies that the PCI device lock is already
held, and makes calling pci_reset_function() impossible (since it will
itself try to take that lock).
Add pci_reset_function_locked(), which is the equivalent of
pci_reset_function(), except that it requires the PCI device lock to be
already held by the caller.
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
[bhelgaas: folded in fix for conflict with 52354b9d1f ("PCI: Remove
__pci_dev_reset() and pci_dev_reset()")]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org # 4.11: 52354b9d1f46: PCI: Remove __pci_dev_reset() and pci_dev_reset()
Cc: stable@vger.kernel.org # 4.11
PCI bridges only have a reason to generate wakeup signals on behalf
of devices below them, so avoid preparing bridges for wakeup directly
in pci_enable_wake().
Also drop the pci_has_subordinate() check from pci_pm_default_resume()
as this will be done by pci_enable_wake() itself now.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Commit dc15e71eef (PCI / PM: Restore PME Enable if skipping wakeup
setup) introduced a mechanism by which the PME Enable bit can be
restored by pci_enable_wake() if dev->wakeup_prepared is set in
case it has been overwritten by PCI config space restoration.
However, that commit overlooked the fact that on some systems (Dell
XPS13 9360 in particular) the AML handling wakeup events checks PME
Status and PME Enable and it won't trigger a Notify() for devices
where those bits are not set while it is running.
That happens during resume from suspend-to-idle when pci_restore_state()
invoked by pci_pm_default_resume_early() clears PME Enable before the
wakeup events are processed by AML, effectively causing those wakeup
events to be ignored.
Fix this issue by restoring the PME Enable configuration right after
pci_restore_state() has been called instead of doing that in
pci_enable_wake().
Fixes: dc15e71eef (PCI / PM: Restore PME Enable if skipping wakeup setup)
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
* pci/virtualization:
PCI: Remove __pci_dev_reset() and pci_dev_reset()
PCI: Split ->reset_notify() method into ->reset_prepare() and ->reset_done()
PCI: Protect pci_error_handlers->reset_notify() usage with device_lock()
PCI: Protect pci_driver->sriov_configure() usage with device_lock()
PCI: Mark Intel XXV710 NIC INTx masking as broken
PCI: Restore PRI and PASID state after Function-Level Reset
PCI: Cache PRI and PASID bits in pci_dev
The pci_error_handlers->reset_notify() method had a flag to indicate
whether to prepare for or clean up after a reset. The prepare and done
cases have no shared functionality whatsoever, so split them into separate
methods.
[bhelgaas: changelog, update locking comments]
Link: http://lkml.kernel.org/r/20170601111039.8913-3-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
* pci/resource:
PCI: Work around poweroff & suspend-to-RAM issue on Macbook Pro 11
PCI: Do not disregard parent resources starting at 0x0
Conflicts:
arch/x86/pci/fixup.c
* pci/pm:
PCI/PM: Avoid using device_may_wakeup() for runtime PM
x86/PCI: Avoid AMD SB7xx EHCI USB wakeup defect
PCI/PM: Restore the status of PCI devices across hibernation
drm/radeon: make MacBook Pro d3_delay quirk more generic
drm/amdgpu: remove unnecessary save/restore of pdev->d3_delay
PCI/PM: Add needs_resume flag to avoid suspend complete optimization
PCI: imx6: Fix config read timeout handling
switchtec: Fix minor bug with partition ID register
switchtec: Use new cdev_device_add() helper function
PCI: endpoint: Make PCI_ENDPOINT depend on HAS_DMA
pci_target_state() calls device_may_wakeup() which checks whether or not
the device may wake up the system from sleep states, but pci_target_state()
is used for runtime PM too.
Since runtime PM is expected to always enable remote wakeup if possible,
modify pci_target_state() to take additional argument indicating whether or
not it should look for a state from which the device can signal wakeup and
pass either the return value of device_can_wakeup(), or "false" (if the
device itself is not wakeup-capable) to it from the code related to runtime
PM.
While at it, fix the comment in pci_dev_run_wake() which is not about sleep
states.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
The run_wake flag in struct dev_pm_info is used to indicate whether
or not the device is capable of generating remote wakeup signals at
run time (or in the system working state), but the distinction
between runtime remote wakeup and system wakeup signaling has always
been rather artificial. The only practical reason for it to exist
at the core level was that ACPI and PCI treated those two cases
differently, but that's not the case any more after recent changes.
For this reason, get rid of the run_wake flag and, when applicable,
use device_set_wakeup_capable() and device_can_wakeup() instead of
device_set_run_wake() and device_run_wake(), respectively.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
After previous changes it is not necessary to distinguish between
device wakeup for run time and device wakeup from system sleep states
any more, so rework the PCI device wakeup settings code accordingly.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
The test for INTx masking via PCI_COMMAND_INTX_DISABLE performed in
pci_intx_mask_supported() should be done before the device can be used.
This is to avoid writing PCI_COMMAND while the driver owns the device, in
case that has any effect on MSI/MSI-X interrupts.
Move the content of pci_intx_mask_supported() to pci_intx_mask_broken() and
call it from pci_setup_device().
The test result can be queried at any time later using the same
pci_intx_mask_supported() interface as before (though with changed
implementation), so callers (uio, vfio) should be unaffected.
Signed-off-by: Piotr Gregor <piotrgregor@rsyncme.org>
[bhelgaas: changelog, remove quirk check, remove locking, move
dev->broken_intx_masking assignment to caller]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Every method in struct device_driver or structures derived from it like
struct pci_driver MUST provide exclusion vs the driver's ->remove() method,
usually by using device_lock().
Protect use of pci_error_handlers->reset_notify() by holding the device
lock while calling it.
Note:
- pci_dev_lock() calls device_lock() in addition to blocking user-space
config accesses.
- pci_err_handlers->reset_notify() is used inside
pci_dev_save_and_disable() and pci_dev_restore(). We could hold the
device lock directly in pci_reset_notify(), but we expand the region
since we have several calls following each other.
Without this, ->reset_notify() may race with ->remove() calls, which can be
easily triggered in NVMe.
[bhelgaas: changelog, add pci_reset_notify() comment]
[bhelgaas: fold in fix from Dan Carpenter <dan.carpenter@oracle.com>:
http://lkml.kernel.org/r/20170701135323.x5vaj4e2wcs2mcro@mwanda]
Link: http://lkml.kernel.org/r/20170601111039.8913-2-hch@lst.de
Reported-by: Rakesh Pandit <rakesh@tuxera.com>
Tested-by: Rakesh Pandit <rakesh@tuxera.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The wakeup_prepared PCI device flag is used for preventing subsequent
changes of PCI device wakeup settings in the same way (e.g. enabling
device wakeup twice in a row).
However, in some cases PME Enable may be updated by things like PCI
configuration space restoration in the meantime and it may need to be
set again even though the rest of the settings need not change, so
modify __pci_enable_wake() to do that when it is about to return
early.
Also, it is reasonable to expect that __pci_enable_wake() will always
clear PME Status when invoked to disable device wakeup, so make it do
so even if it is going to return early then.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
After a Function-Level Reset, PCI states need to be restored. Save PASID
features and PRI reqs cached.
[bhelgaas: search for capability only if PRI/PASID were enabled]
Signed-off-by: CQ Tang <cq.tang@intel.com>
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Jean-Phillipe Brucker <jean-philippe.brucker@arm.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Some drivers - like i915 - may not support the system suspend direct
complete optimization due to differences in their runtime and system
suspend sequence. Add a flag that when set resumes the device before
calling the driver's system suspend handlers which effectively disables
the optimization.
Needed by a future patch fixing suspend/resume on i915.
Suggested by Rafael.
Signed-off-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: stable@vger.kernel.org
Commit f44116ae88 ("PCI: Remove pci_find_parent_resource() use for
allocation") updated the logic that iterates over all bus resources and
compares them to a given resource, in order to decide whether one is the
parent of the latter.
This change inadvertently causes pci_find_parent_resource() to disregard
resources starting at address 0x0, resulting in an error such as the one
below on ARM systems whose I/O window starts at 0x0.
pci_bus 0000:00: root bus resource [mem 0x10000000-0x3efeffff window]
pci_bus 0000:00: root bus resource [io 0x0000-0xffff window]
pci_bus 0000:00: root bus resource [mem 0x8000000000-0xffffffffff window]
pci_bus 0000:00: root bus resource [bus 00-0f]
pci 0000:00:01.0: PCI bridge to [bus 01]
pci 0000:00:02.0: PCI bridge to [bus 02]
pci 0000:00:03.0: PCI bridge to [bus 03]
pci 0000:00:03.0: can't claim BAR 13 [io 0x0000-0x0fff]: no compatible bridge window
pci 0000:03:01.0: can't claim BAR 0 [io 0x0000-0x001f]: no compatible bridge window
While this never happens on x86, it is perfectly legal in general for a PCI
MMIO or IO window to start at address 0x0, and it was supported in the code
before commit f44116ae88.
Drop the test for res->start != 0; resource_contains() already checks
whether [start, end) completely covers the resource, and so it should be
redundant.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
* pci/virtualization:
ixgbe: Use pcie_flr() instead of duplicating it
IB/hfi1: Use pcie_flr() instead of duplicating it
PCI: Call pcie_flr() from reset_chelsio_generic_dev()
PCI: Call pcie_flr() from reset_intel_82599_sfp_virtfn()
PCI: Export pcie_flr()
PCI: Add sysfs sriov_drivers_autoprobe to control VF driver binding
PCI: Avoid FLR for Intel 82579 NICs
Conflicts:
include/linux/pci.h
* pci/resource:
PCI: Don't resize resources when realigning all devices in system
PCI: Don't reassign resources that are already aligned
PCI: Factor pci_reassigndev_resource_alignment()
powerpc/powernv: Override pcibios_default_alignment() to force PCI devices to be page aligned
PCI: Add pcibios_default_alignment() for arch-specific alignment control
PCI: Fix calculation of bridge window's size and alignment
PCI: Ignore requested alignment for IOV BARs
PCI: Make PCI_ROM_ADDRESS_MASK a 32-bit constant
* pci/ioremap:
PCI: versatile: Update PCI config space remap function
PCI: keystone-dw: Update PCI config space remap function
PCI: layerscape: Update PCI config space remap function
PCI: hisi: Update PCI config space remap function
PCI: tegra: Update PCI config space remap function
PCI: xgene: Update PCI config space remap function
PCI: armada8k: Update PCI config space remap function
PCI: designware: Update PCI config space remap function
PCI: iproc-platform: Update PCI config space remap function
PCI: qcom: Update PCI config space remap function
PCI: rockchip: Update PCI config space remap function
PCI: spear13xx: Update PCI config space remap function
PCI: xilinx-nwl: Update PCI config space remap function
PCI: xilinx: Update PCI config space remap function
PCI: ECAM: Map config region with pci_remap_cfgspace()
PCI: Implement devm_pci_remap_cfgspace()
devres: fix devm_ioremap_*() offset parameter kerneldoc description
ARM: Implement pci_remap_cfgspace() interface
ARM64: Implement pci_remap_cfgspace() interface
linux/io.h: Add pci_remap_cfgspace() interface
PCI: Remove __weak tag from pci_remap_iospace()
The introduction of the pci_remap_cfgspace() interface allows PCI host
controller drivers to map PCI config space through a dedicated kernel
interface. Current PCI host controller drivers use the devm_ioremap_*()
devres interfaces to map PCI configuration space regions so in order to
update them to the new pci_remap_cfgspace() mapping interface a new set of
devres interfaces should be implemented so that PCI host controller drivers
can make use of them.
Introduce two new functions in the PCI kernel layer and Devres
documentation:
- devm_pci_remap_cfgspace()
- devm_pci_remap_cfg_resource()
so that PCI host controller drivers can make use of them to map PCI
configuration space regions.
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
These are useful for PCIe host drivers, and those drivers can be modules.
[bhelgaas: don't remove __weak; it's removed elsewhere]
Signed-off-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Shawn Lin <shawn.lin@rock-chips.com>
Currently we opencode the FLR sequence in lots of place; export a core
helper instead. We split out the probing for FLR support as all the
non-core callers already know their hardware.
Note that in the new pci_has_flr() function the quirk check has been moved
before the capability check as there is no point in reading the capability
in this case.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
pci_remap_iospace() is marked as a weak symbol even though no architecture
is currently overriding it; given that its implementation internals have
already code paths that are arch specific (ie PCI_IOBASE and
ioremap_page_range() attributes) there is no need to leave the weak symbol
in the kernel since the same functionality can be achieved by customizing
per-arch the corresponding functionality.
Remove the __weak symbol from pci_remap_iospace().
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
The "pci=resource_alignment" argument aligns BARs of designated devices by
artificially increasing their size. Increasing the size increases the
alignment and prevents other resources from being assigned in the same
alignment region, e.g., in the same page, but it can break drivers that use
the BAR size to locate things, e.g., ilo_map_device() does this:
off = pci_resource_len(pdev, bar) - 0x2000;
The new pcibios_default_alignment() interface allows an arch to request
that *all* BARs in the system be aligned to a larger size. In this case,
we don't need to artificially increase the resource size because we know
every BAR of every device will be realigned, so nothing will share the same
alignment region.
Use IORESOURCE_STARTALIGN to request realignment of PCI BARs when we know
we're realigning all BARs in the system.
[bhelgaas: comment, changelog]
Signed-off-by: Yongji Xie <elohimes@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The "pci=resource_alignment=" kernel argument designates devices for which
we want alignment greater than is required by the PCI specs. Previously we
set IORESOURCE_UNSET for every MEM resource of those devices, even if the
resource was *already* sufficiently aligned.
If a resource is already sufficiently aligned, leave it alone and don't try
to reassign it.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Pull the BAR size adjustment out into a new function,
pci_request_resource_alignment(), and add a comment about how and why we
increase the resource size and alignment.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
When VFIO passes through a PCI device to a guest, it does not allow the
guest to mmap BARs that are smaller than PAGE_SIZE unless it can reserve
the rest of the page (see vfio_pci_probe_mmaps()). This is because a page
might contain several small BARs for unrelated devices and a guest should
not be able to access all of them.
VFIO emulates guest accesses to non-mappable BARs, which is functional but
slow. On systems with large page sizes, e.g., PowerNV with 64K pages, BARs
are more likely to share a page and performance is more likely to be a
problem.
Add a weak function to set default alignment for all PCI devices. An arch
can override it to force the PCI core to place memory BARs on their own
pages.
Signed-off-by: Yongji Xie <elohimes@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Laurent Pinchart reported that the Renesas R-Car H2 Lager board (r8a7790)
crashes during suspend tests. Geert Uytterhoeven managed to reproduce the
issue on an M2-W Koelsch board (r8a7791):
It occurs when the PME scan runs, once per second. During PME scan, the
PCI host bridge (rcar-pci) registers are accessed while its module clock
has already been disabled, leading to the crash.
One reproducer is to configure s2ram to use "s2idle" instead of "deep"
suspend:
# echo 0 > /sys/module/printk/parameters/console_suspend
# echo s2idle > /sys/power/mem_sleep
# echo mem > /sys/power/state
Another reproducer is to write either "platform" or "processors" to
/sys/power/pm_test. It does not (or is less likely) to happen during full
system suspend ("core" or "none") because system suspend also disables
timers, and thus the workqueue handling PME scans no longer runs. Geert
believes the issue may still happen in the small window between disabling
module clocks and disabling timers:
# echo 0 > /sys/module/printk/parameters/console_suspend
# echo platform > /sys/power/pm_test # Or "processors"
# echo mem > /sys/power/state
(Make sure CONFIG_PCI_RCAR_GEN2 and CONFIG_USB_OHCI_HCD_PCI are enabled.)
Rafael Wysocki agrees that PME scans should be suspended before the host
bridge registers become inaccessible. To that end, queue the task on a
workqueue that gets frozen before devices suspend.
Rafael notes however that as a result, some wakeup events may be missed if
they are delivered via PME from a device without working IRQ (which hence
must be polled) and occur after the workqueue has been frozen. If that
turns out to be an issue in practice, it may be possible to solve it by
calling pci_pme_list_scan() once directly from one of the host bridge's
pm_ops callbacks.
Stacktrace for posterity:
PM: Syncing filesystems ... [ 38.566237] done.
PM: Preparing system for sleep (mem)
Freezing user space processes ... [ 38.579813] (elapsed 0.001 seconds) done.
Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
PM: Suspending system (mem)
PM: suspend of devices complete after 152.456 msecs
PM: late suspend of devices complete after 2.809 msecs
PM: noirq suspend of devices complete after 29.863 msecs
suspend debug: Waiting for 5 second(s).
Unhandled fault: asynchronous external abort (0x1211) at 0x00000000
pgd = c0003000
[00000000] *pgd=80000040004003, *pmd=00000000
Internal error: : 1211 [#1] SMP ARM
Modules linked in:
CPU: 1 PID: 20 Comm: kworker/1:1 Not tainted
4.9.0-rc1-koelsch-00011-g68db9bc814362e7f #3383
Hardware name: Generic R8A7791 (Flattened Device Tree)
Workqueue: events pci_pme_list_scan
task: eb56e140 task.stack: eb58e000
PC is at pci_generic_config_read+0x64/0x6c
LR is at rcar_pci_cfg_base+0x64/0x84
pc : [<c041d7b4>] lr : [<c04309a0>] psr: 600d0093
sp : eb58fe98 ip : c041d750 fp : 00000008
r10: c0e2283c r9 : 00000000 r8 : 600d0013
r7 : 00000008 r6 : eb58fed6 r5 : 00000002 r4 : eb58feb4
r3 : 00000000 r2 : 00000044 r1 : 00000008 r0 : 00000000
Flags: nZCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment user
Control: 30c5387d Table: 6a9f6c80 DAC: 55555555
Process kworker/1:1 (pid: 20, stack limit = 0xeb58e210)
Stack: (0xeb58fe98 to 0xeb590000)
fe80: 00000002 00000044
fea0: eb6f5800 c041d9b0 eb58feb4 00000008 00000044 00000000 eb78a000 eb78a000
fec0: 00000044 00000000 eb9aff00 c0424bf0 eb78a000 00000000 eb78a000 c0e22830
fee0: ea8a6fc0 c0424c5c eaae79c0 c0424ce0 eb55f380 c0e22838 eb9a9800 c0235fbc
ff00: eb55f380 c0e22838 eb55f380 eb9a9800 eb9a9800 eb58e000 eb9a9824 c0e02100
ff20: eb55f398 c02366c4 eb56e140 eb5631c0 00000000 eb55f380 c023641c 00000000
ff40: 00000000 00000000 00000000 c023a928 cd105598 00000000 40506a34 eb55f380
ff60: 00000000 00000000 dead4ead ffffffff ffffffff eb58ff74 eb58ff74 00000000
ff80: 00000000 dead4ead ffffffff ffffffff eb58ff90 eb58ff90 eb58ffac eb5631c0
ffa0: c023a844 00000000 00000000 c0206d68 00000000 00000000 00000000 00000000
ffc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
ffe0: 00000000 00000000 00000000 00000000 00000013 00000000 3a81336c 10ccd1dd
[<c041d7b4>] (pci_generic_config_read) from [<c041d9b0>]
(pci_bus_read_config_word+0x58/0x80)
[<c041d9b0>] (pci_bus_read_config_word) from [<c0424bf0>]
(pci_check_pme_status+0x34/0x78)
[<c0424bf0>] (pci_check_pme_status) from [<c0424c5c>] (pci_pme_wakeup+0x28/0x54)
[<c0424c5c>] (pci_pme_wakeup) from [<c0424ce0>] (pci_pme_list_scan+0x58/0xb4)
[<c0424ce0>] (pci_pme_list_scan) from [<c0235fbc>]
(process_one_work+0x1bc/0x308)
[<c0235fbc>] (process_one_work) from [<c02366c4>] (worker_thread+0x2a8/0x3e0)
[<c02366c4>] (worker_thread) from [<c023a928>] (kthread+0xe4/0xfc)
[<c023a928>] (kthread) from [<c0206d68>] (ret_from_fork+0x14/0x2c)
Code: ea000000 e5903000 f57ff04f e3a00000 (e5843000)
---[ end trace 667d43ba3aa9e589 ]---
Fixes: df17e62e5b ("PCI: Add support for polling PME state on suspended legacy PCI devices")
Reported-and-tested-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Reported-and-tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: stable@vger.kernel.org # 2.6.37+
Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Cc: Simon Horman <horms+renesas@verge.net.au>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
We would call pci_reassigndev_resource_alignment() before
pci_init_capabilities(). So the requested alignment would never work for
IOV BARs.
Furthermore, it's meaningless to request additional alignment for IOV BARs,
the IOV BAR alignment is only determined by the VF BAR size.
Signed-off-by: Yongji Xie <xyjxie@linux.vnet.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Per Intel Specification Update 335553-002 (see link below), some 82579
network adapters advertise a Function Level Reset (FLR) capability, but
they can hang when an FLR is triggered.
To reproduce the problem, attach the device to a VM, then detach and try to
attach again.
Add a quirk to prevent the use of FLR on these devices.
[bhelgaas: changelog, comments]
Link: http://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/82579lm-82579v-gigabit-network-connection-spec-update.pdf
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
If the PCI device is disconnected, return false immediately from
pci_device_is_present(). pci_device_is_present() uses the bus accessors,
so the early return in the device accessors doesn't help here.
Tested-by: Krishna Dhulipala <krishnad@fb.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Wei Zhang <wzhang@fb.com>
msleep() still sleeps 1 jiffy even when told to sleep for zero
milliseconds. That can end up being 1-2 milliseconds or more. In the
cases of d3_delay and d3cold_delay, that unnecessarily increases suspend
and/or resume latencies.
Do not sleep at all for the respective cases if d3_delay is zero or
d3cold_delay is zero.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* pci/virtualization:
PCI: Add comments about ROM BAR updating
PCI: Decouple IORESOURCE_ROM_ENABLE and PCI_ROM_ADDRESS_ENABLE
PCI: Remove pci_resource_bar() and pci_iov_resource_bar()
PCI: Don't update VF BARs while VF memory space is enabled
PCI: Separate VF BAR updates from standard BAR updates
PCI: Update BARs using property bits appropriate for type
PCI: Ignore BAR updates on virtual functions
PCI: Do any VF BAR updates before enabling the BARs
PCI: Support INTx masking on ConnectX-4 with firmware x.14.1100+
PCI: Convert Mellanox broken INTx quirks to be for listed devices only
PCI: Convert broken INTx masking quirks from HEADER to FINAL
net/mlx4_core: Use device ID defines
PCI: Add Mellanox device IDs
pci_std_update_resource() only deals with standard BARs, so we don't have
to worry about the complications of VF BARs in an SR-IOV capability.
Compute the BAR address inline and remove pci_resource_bar(). That makes
pci_iov_resource_bar() unused, so remove that as well.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
VF BARs are read-only zero, so updating VF BARs will not have any effect.
See the SR-IOV spec r1.1, sec 3.4.1.11.
We already ignore these updates because of 70675e0b6a ("PCI: Don't try to
restore VF BARs"); this merely restructures it slightly to make it easier
to split updates for standard and SR-IOV BARs.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Linux 4.8 added support for runtime suspending PCIe ports to D3hot with
commit 006d44e49a ("PCI: Add runtime PM support for PCIe ports"), but
excluded hotplug ports. Those are now afforded runtime PM by the present
commit.
Hotplug ports require a few extra considerations:
- The configuration space of the port remains accessible in D3hot, so all
the functions to read or modify the Slot Status and Slot Control
registers need not be modified. Even turning on slot power doesn't seem
to require the port to be in D0, at least the PCIe spec doesn't say so
and I confirmed that by testing with a Thunderbolt controller.
- However D0 is required to access devices on the secondary bus. This
happens in pciehp_check_link_status() and pciehp_configure_device() (both
called from board_added()) and in pciehp_unconfigure_device() (called
from remove_board()), so acquire a runtime PM ref for their invocation.
- The hotplug port stays active as long as it has active children. If all
hotplugged devices below the port runtime suspend, the port is allowed to
runtime suspend as well. Plug and unplug detection continues to work in
D3hot.
- Hotplug interrupts are delivered in-band, so while the hotplug port
itself is allowed to go to D3hot, its parent ports must stay in D0 for
interrupts to come through. Add a corresponding restriction to
pci_dev_check_d3cold().
- Runtime PM may only be allowed if the hotplug port is handled natively by
the OS. On ACPI systems, the port may alternatively be handled by the
firmware and things break if the OS puts the port into D3 behind the
firmware's back: E.g. Thunderbolt hotplug ports on non-Macs are handled
by Intel's firmware in System Management Mode and the firmware is known
to access devices on the port's secondary bus without checking first if
the port is in D0: https://bugzilla.kernel.org/show_bug.cgi?id=53811
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
CC: Mika Westerberg <mika.westerberg@linux.intel.com>
The conditions to block D3 on parent ports are currently condensed into a
single expression in pci_dev_check_d3cold(). Upcoming commits will add
further conditions for hotplug ports, making this expression fairly large
and impenetrable. Unfold the conditions to maintain readability when they
are amended.
No functional change intended.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Mika Westerberg <mika.westerberg@linux.intel.com>
CC: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The conditions to allow runtime PM on PCIe ports are currently spread
across two different files: The condition relating to hotplug ports is
located in portdrv_pci.c whereas all other conditions are located in pci.c.
Consolidate all conditions in a single place in pci.c, thus making it
easier to follow the logic and amend conditions down the road.
Note that the condition relating to hotplug ports is inserted *before* the
condition relating to the "pcie_port_pm=force" command line option, so
runtime PM is not afforded to hotplug ports even if this option is given.
That's exactly how the code behaved up until now. If this is not desired,
the ordering of the conditions can simply be reversed.
No functional change intended.
Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Currently pcie_portdrv_probe() activates runtime PM on a PCIe port even
if it will never actually suspend because the BIOS is too old or the
"pcie_port_pm=off" option was specified on the kernel command line.
A few CPU cycles can be saved by not activating runtime PM at all in these
cases, because rpm_idle() and rpm_suspend() will bail out right at the
beginning when calling rpm_check_suspend_allowed(), instead of carrying out
various locking and assignments, invoking rpm_callback(), getting back
-EBUSY and rolling everything back.
The conditions checked in pci_bridge_d3_possible() are all static, they
never change during uptime of the system, hence it's safe to call this to
determine if runtime PM should be activated.
No functional change intended.
Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
After a device has been added, removed or had its D3cold attributes
changed, we recheck whether its parent bridge may runtime suspend to D3hot
with pci_bridge_d3_update().
The most naive algorithm would be to iterate over the bridge's children and
check if any of them are blocking D3.
The function already tries to be a bit smarter than that by first checking
the device that was changed. If this device already blocks D3 on the
bridge, then walking over all the other children can be skipped. A
drawback of this approach is that if the device is *not* blocking D3, it
will be checked a second time by pci_walk_bus(). But that's cheap and is
outweighed by the performance gain of potentially skipping pci_walk_bus()
altogether.
The algorithm can be optimized further by taking into account if D3 is
currently allowed for the bridge, as shown in the following truth table:
(a) remove && bridge_d3: D3 is currently allowed for the bridge and
removing one of its children won't change
that. No action necessary.
(b) remove && !bridge_d3: D3 may now be allowed for the bridge if the
removed child was the only one blocking it.
Check all its siblings to verify that.
(c) !remove && bridge_d3: D3 may now be disallowed but this can only
be caused by the added/changed child, not
any of its siblings. Check only that single
device.
(d) !remove && !bridge_d3: D3 may now be allowed for the bridge if the
changed child was the only one blocking it.
Check all its siblings to verify that.
By checking beforehand if the changed child
is blocking D3, we may be able to skip
checking its siblings.
Currently we do not special-case option (a) and in case of option (c) we
gratuitously call pci_walk_bus(). Speed up the algorithm by adding these
optimizations. Reword the comments a bit in an attempt to improve clarity.
No functional change intended.
Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The algorithm to update the flag indicating whether a bridge may go to D3
makes a few optimizations based on whether the update was caused by the
removal of a device on the one hand, versus the addition of a device or the
change of its D3cold flags on the other hand.
The information whether the update pertains to a removal is currently
passed in by the caller, but the function may as well determine that itself
by examining the device in question, thereby allowing for a considerable
simplification and reduction of the code.
Out of several options to determine removal, I've chosen the function
device_is_registered() because it's cheap: It merely returns the
dev->kobj.state_in_sysfs flag. That flag is set through device_add() when
the root bus is scanned and cleared through device_remove(). The call to
pci_bridge_d3_update() happens after each of these calls, respectively, so
the ordering is correct.
No functional change intended.
Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This function is always called with an existing pci_dev struct, which
holds a reference on the pci_bus struct it resides on, which in turn
holds a reference on pci_bus->bridge, which is the pci_dev's parent.
Hence there's no need to acquire an additional ref on the parent.
More specifically, the pci_dev exists until pci_destroy_dev() drops the
final reference on it, so all calls to pci_bridge_d3_update() must be
finished before that. It is arguably the caller's responsibility to ensure
that it doesn't call pci_bridge_d3_update() with a pci_dev that might
suddenly disappear, but in any case the existing callers are all safe:
- The call in pci_destroy_dev() happens before the call to put_device().
- The call in pci_bus_add_device() is synchronized with pci_destroy_dev()
using pci_lock_rescan_remove().
- The calls to pci_d3cold_disable() from the xhci and nouveau drivers
are safe because a ref on the pci_dev is held as long as it's bound to
a driver.
- The calls to pci_d3cold_enable() / pci_d3cold_disable() when modifying
the sysfs "d3cold_allowed" entry are also safe because kernfs_drain()
waits for existing sysfs users to finish before removing the entry,
and pci_destroy_dev() is called way after that.
No functional change intended.
Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
One some systems, the firmware does not allow certain PCI devices to be put
in deep D-states. This can cause problems for wakeup signalling, if the
device does not support PME# in the deepest allowed suspend state. For
example, Pierre reports that on his system, ACPI does not permit his xHCI
host controller to go into D3 during runtime suspend -- but D3 is the only
state in which the controller can generate PME# signals. As a result, the
controller goes into runtime suspend but never wakes up, so it doesn't work
properly. USB devices plugged into the controller are never detected.
If the device relies on PME# for wakeup signals but is not capable of
generating PME# in the target state, the PCI core should accurately report
that it cannot do wakeup from runtime suspend. This patch modifies the
pci_dev_run_wake() routine to add this check.
Reported-by: Pierre de Villemereuil <flyos@mailoo.org>
Tested-by: Pierre de Villemereuil <flyos@mailoo.org>
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
CC: stable@vger.kernel.org
CC: Lukas Wunner <lukas@wunner.de>
Resource allocation for VFs is done via the VF BARx registers in the PF's
SR-IOV Capability, and the BARs in the VFs themselves are read-only zeros
(see SR-IOV spec r1.1, secs 3.3.14 and 3.4.1.11).
Even though the actual VF BARs are read-only zeros, the VF dev->resource[]
structs describe the space allocated for the VF (this is a piece of the
space described by the VF BARx register in the PF's SR-IOV capability).
It's meaningless to request additional alignment for a VF: the VF BAR
alignment is completely determined by the alignment of the VF BARx in the
PF and the size of the VF BAR.
Ignore the user's alignment requests for VF devices.
Signed-off-by: Yongji Xie <xyjxie@linux.vnet.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Users may request additional alignment of PCI resources, e.g., to align
BARs on page boundaries so they can be shared with guests via VFIO. This
of course may require reallocation if firmware has already assigned the
BARs with smaller alignments.
If the platform has requested PCI_PROBE_ONLY, we should never change any
PCI BARs, so we can't provide any additional alignment. Also, if a BAR is
marked as IORESOURCE_PCI_FIXED, e.g., for PCI Enhanced Allocation or if the
firmware depends on the current BAR value, we can't change the alignment.
In these cases, log a message and ignore the user's alignment requests.
[bhelgaas: changelog, use goto to simplify PCI_PROBE_ONLY check]
Signed-off-by: Yongji Xie <xyjxie@linux.vnet.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Whenever a device is resumed or its power state is changed using the
platform, its new power state is read from the PM Control & Status Register
and cached in pci_dev->current_state by calling pci_update_current_state().
If the device is in D3cold, reading from config space typically results in
a fabricated "all ones" response. But if it's in D3hot, the two bits
representing the power state in the PMCSR are *also* set to 1. Thus D3hot
and D3cold are not discernible by just reading the PMCSR.
To account for this, pci_update_current_state() uses two workarounds:
- When transitioning to D3cold using pci_platform_power_transition(), the
new power state is set blindly by pci_update_current_state(), i.e.
without verifying that the device actually *is* in D3cold. This is
achieved by setting the "state" argument to PCI_D3cold. The "state"
argument was originally intended to convey the new state in case the
device doesn't have the PM capability. It is *also* used to convey the
device state if the PM capability is present and the new state is D3cold,
but this was never explained in the kerneldoc.
- Once the current_state is set to D3cold, further invocations of
pci_update_current_state() will blindly assume that the device is still
in D3cold and leave the current_state unmodified. To get out of this
impasse, the current_state has to be set directly, typically by calling
pci_raw_set_power_state() or pci_enable_device().
It would be desirable if pci_update_current_state() could reliably detect
D3cold by itself. That would allow us to do away with these workarounds,
and it would allow for a smarter, more energy conserving runtime resume
strategy after system sleep: Currently devices which utilize
direct_complete are mandatorily runtime resumed in their ->complete stage.
This can be avoided if their power state after system sleep is the same as
before, but it requires a mechanism to detect the power state reliably.
We've just gained the ability to query the platform firmware for its
opinion on the device's power state. On platforms conforming to ACPI 4.0
or newer, this allows recognition of D3cold. Pre-4.0 platforms lack _PR3
and therefore the deepest power state that will ever be reported is D3hot,
even though the device may actually be in D3cold. To detect D3cold in
those cases, accessibility of the vendor ID in config space is probed using
pci_device_is_present(). This also works for devices which are not
platform-power-manageable at all, but can be suspended to D3cold using a
nonstandard mechanism (e.g. some hybrid graphics laptops or Thunderbolt on
the Mac).
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Usually the most accurate way to determine a PCI device's power state is to
read its PM Control & Status Register. There are two cases however when
this is not an option: If the device doesn't have the PM capability at
all, or if it is in D3cold (in which case its config space is
inaccessible).
In both cases, we can alternatively query the platform firmware for its
opinion on the device's power state. To facilitate this, augment struct
pci_platform_pm_ops with a ->get_power callback and implement it for
acpi_pci_platform_pm (the only pci_platform_pm_ops existing so far).
It is used by a forthcoming commit to let pci_update_current_state()
recognize D3cold.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
There are devices not power-manageable by the platform, but still able to
runtime suspend to D3cold with a non-standard mechanism. One example is
laptop hybrid graphics where the discrete GPU and its built-in HDA
controller are power-managed either with a _DSM (AMD PowerXpress, Nvidia
Optimus) or a separate gmux controller (MacBook Pro). Another example is
Thunderbolt on Macs which is power-managed with custom ACPI methods.
When putting the system to sleep, we currently handle such devices
improperly by transitioning them from D3cold to D3hot (the default power
state defined at the top of pci_target_state()). This wastes energy and
prolongs the suspend sequence (powering up the Thunderbolt controller takes
2 seconds).
Avoid that by assuming that a non-standard PM mechanism is at work if the
device is not platform-power-manageable but currently in D3cold.
If the device is wakeup enabled, we might still have to wake it up from
D3cold if PME cannot be signaled from that power state.
The check for devices without PM capability comes before the check for
D3cold since such devices could in theory also be powered down by
non-standard means and should then be afforded direct-complete as well.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Add a new helper function pci_find_resource() that can be used to find out
whether a given resource (for example from a child device) is contained
within given PCI device's standard resources.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* pci/resource:
unicore32/PCI: Remove pci=firmware command line parameter handling
ARM/PCI: Remove arch-specific pcibios_enable_device()
ARM64/PCI: Remove arch-specific pcibios_enable_device()
MIPS/PCI: Claim bus resources on PCI_PROBE_ONLY set-ups
ARM/PCI: Claim bus resources on PCI_PROBE_ONLY set-ups
PCI: generic: Claim bus resources on PCI_PROBE_ONLY set-ups
PCI: Add generic pci_bus_claim_resources()
alx: Use pci_(request|release)_mem_regions
ethernet/intel: Use pci_(request|release)_mem_regions
GenWQE: Use pci_(request|release)_mem_regions
lpfc: Use pci_(request|release)_mem_regions
NVMe: Use pci_(request|release)_mem_regions
PCI: Add helpers to request/release memory and I/O regions
PCI: Extending pci=resource_alignment to specify device/vendor IDs
sparc/PCI: Implement pci_resource_to_user() with pcibios_resource_to_bus()
powerpc/pci: Implement pci_resource_to_user() with pcibios_resource_to_bus()
microblaze/PCI: Implement pci_resource_to_user() with pcibios_resource_to_bus()
PCI: Unify pci_resource_to_user() declarations
microblaze/PCI: Remove useless __pci_mmap_set_pgprot()
powerpc/pci: Remove __pci_mmap_set_pgprot()
PCI: Ignore write combining when mapping I/O port space
* pci/aspm:
PCI/ASPM: Remove redundant check of pcie_set_clkpm
* pci/dpc:
PCI: Remove DPC tristate module option
PCI: Bind DPC to Root Ports as well as Downstream Ports
PCI: Fix whitespace in struct dpc_dev
PCI: Convert Downstream Port Containment driver to use devm_* functions
* pci/hotplug:
PCI: Allow additional bus numbers for hotplug bridges
* pci/misc:
PCI: Include <asm/dma.h> for isa_dma_bridge_buggy
PCI: Make bus_attr_resource_alignment static
MAINTAINERS: Add file patterns for PCI device tree bindings
PCI: Fix comment typo
* pci/msi:
PCI/MSI: irqchip: Fix PCI_MSI dependencies
* pci/pm:
PCI: pciehp: Ignore interrupts during D3cold
PCI: Document connection between pci_power_t and hardware PM capability
PCI: Add runtime PM support for PCIe ports
ACPI / hotplug / PCI: Runtime resume bridge before rescan
PCI: Power on bridges before scanning new devices
PCI: Put PCIe ports into D3 during suspend
PCI: Don't clear d3cold_allowed for PCIe ports
PCI / PM: Enforce type casting for pci_power_t
* pci/virtualization:
PCI: Add ACS quirk for Solarflare SFC9220
PCI: Add DMA alias quirk for Adaptec 3805
PCI: Mark Atheros AR9485 and QCA9882 to avoid bus reset
PCI: Add function 1 DMA alias quirk for Marvell 88SE9182
A user may hot add a switch requiring more than one bus to enumerate. This
previously required a system reboot if BIOS did not sufficiently pad the
bus resource, which they frequently don't do.
Add a kernel parameter so a user can specify the minimum number of bus
numbers to reserve for a hotplug bridge's subordinate buses so rebooting
won't be necessary.
The default is 1, which is equivalent to previous behavior.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
When assign new PCI platform PM operations check for all mandatory fields to
prevent NULL pointer dereference.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
At least on arm, <asm/dma.h> does not get included when building
drivers/pci/pci.o. This causes the following build warning which can be
fixed by including <asm/dma.h>:
drivers/pci/pci.c:37:5: warning: symbol 'isa_dma_bridge_buggy' was not declared. Should it be static?
Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Some uio-based PCI drivers, e.g., uio_cif do not work if the assigned PCI
memory resources are not page aligned.
By using the kernel option "pci=resource_alignment" it is possible to force
single PCI boards to use page alignment for their memory resources.
However, this is fairly cumbersome if several of these boards are in use
as the specification of the cards has to be done via PCI bus/slot/function
number which might change, e.g., by adding another board.
Extend the kernel option "pci=resource_alignment" to allow specification of
relevant devices via PCI device/vendor (and subdevice/subvendor) IDs. The
specification of the devices via device/vendor is indicated by a leading
string "pci:" as argument to "pci=resource_alignment". The format of the
specification is pci:<vendor>:<device>[:<subvendor>:<subdevice>]
Signed-off-by: Mathias Koehrer <mathias.koehrer@etas.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Currently the Linux PCI core does not touch power state of PCI bridges and
PCIe ports when system suspend is entered. Leaving them in D0 consumes
power unnecessarily and may prevent the CPU from entering deeper C-states.
With recent PCIe hardware we can power down the ports to save power given
that we take into account few restrictions:
- The PCIe port hardware is recent enough, starting from 2015.
- Devices connected to PCIe ports are effectively in D3cold once the port
is transitioned to D3 (the config space is not accessible anymore and
the link may be powered down).
- Devices behind the PCIe port need to be allowed to transition to D3cold
and back. There is a way both drivers and userspace can forbid this.
- If the device behind the PCIe port is capable of waking the system it
needs to be able to do so from D3cold.
This patch adds a new flag to struct pci_device called 'bridge_d3'. This
flag is set and cleared by the PCI core whenever there is a change in power
management state of any of the devices behind the PCIe port. When system
later on is suspended we only need to check this flag and if it is true
transition the port to D3 otherwise we leave it in D0.
Also provide override mechanism via command line parameter
"pcie_port_pm=[off|force]" that can be used to disable or enable the
feature regardless of the BIOS manufacturing date.
Tested-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The symbol bus_attr_resource_alignment is not exported or declared
elsewhere, so make it static to fix the following warning:
drivers/pci/pci.c:4900:1: warning: symbol 'bus_attr_resource_alignment' was not declared. Should it be static?
Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Extend pci_bus_find_domain_nr() so it can find the domain from either:
- ACPI, via the new acpi_pci_bus_find_domain_nr() interface, or
- DT, via of_pci_bus_find_domain_nr()
Note that this is only used for CONFIG_PCI_DOMAINS_GENERIC=y, so it does
not affect x86 or ia64.
[bhelgaas: changelog]
Signed-off-by: Tomasz Nowicki <tn@semihalf.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
pci_bus_find_domain_nr() retrieves the host bridge domain number in a
DT-specific way. Rename it to of_pci_bus_find_domain_nr() to reflect that,
so we can add a corresponding function for ACPI.
[bhelgaas: changelog]
Signed-off-by: Tomasz Nowicki <tn@semihalf.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Instead of assigning bus->domain_nr inside pci_bus_assign_domain_nr(),
return the domain and let the caller do the assignment. Rename
pci_bus_assign_domain_nr() to pci_bus_find_domain_nr() to reflect this.
No functional change intended.
[bhelgaas: changelog]
Signed-off-by: Tomasz Nowicki <tn@semihalf.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Add pci_unmap_iospace() to undo what pci_remap_iospace() did.
This is needed to support hotplug removal of host bridges that use
pci_remap_iospace().
[bhelgaas: changelog]
Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
Signed-off-by: Tomasz Nowicki <tn@semihalf.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
* pci/arm64:
PCI, of: Move PCI I/O space management to PCI core code
PCI: generic, thunder: Use generic ECAM API
PCI: Provide common functions for ECAM mapping
* pci/host-hv:
PCI: hv: Add explicit barriers to config space access
* pci/hotplug:
PCI: Use cached copy of PCI_EXP_SLTCAP_HPC bit
* pci/resource:
PCI: Disable all BAR sizing for devices with non-compliant BARs
x86/PCI: Mark Broadwell-EP Home Agent 1 as having non-compliant BARs
PCI: Identify Enhanced Allocation (EA) BAR Equivalent resources in sysfs
Resource flags are exposed to userspace via the sysfs "resource" file.
lspci reads the sysfs file to determine resource properties.
Add a "BAR Equivalent Indicator" flag so lspci can distinguish between
[virtual] and [enhanced] resources.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Sean O. Stalley <sean.stalley@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
No functional changes in this patch.
PCI I/O space mapping code does not depend on OF; therefore it can be moved
to PCI core code. This way we will be able to use it, e.g., in ACPI PCI
code.
Suggested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Tomasz Nowicki <tn@semihalf.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Arnd Bergmann <arnd@arndb.de>
CC: Liviu Dudau <Liviu.Dudau@arm.com>
The original thought was that if a device implemented ACS, then surely
we want to use that... well, it turns out that devices can make an ACS
capability so broken that we still need to fall back to quirks.
Reverse the order of ACS enabling to give quirks first shot at it.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Solve IOMMU support issues with PCIe non-transparent bridges that use
Requester ID look-up tables (RID-LUT), e.g., the PEX8733.
The NTB connects devices in two independent PCI domains. Devices separated
by the NTB are not able to discover each other. A PCI packet being
forwared from one domain to another has to have its RID modified so it
appears on correct bus and completions are forwarded back to the original
domain through the NTB. The RID is translated using a preprogrammed table
(LUT) and the PCI packet propagates upstream away from the NTB. If the
destination system has IOMMU enabled, the packet will be discarded because
the new RID is unknown to the IOMMU. Adding a DMA alias for the new RID
allows IOMMU to properly recognize the packet.
Each device behind the NTB has a unique RID assigned in the RID-LUT. The
current DMA alias implementation supports only a single alias, so it's not
possible to support mutiple devices behind the NTB when IOMMU is enabled.
Enable all possible aliases on a given bus (256) that are stored in a
bitset. Alias devfn is directly translated to a bit number. The bitset is
not allocated for devices that have no need for DMA aliases.
More details can be found in the following article:
http://www.plxtech.com/files/pdf/technical/expresslane/RTC_Enabling%20MulitHostSystemDesigns.pdf
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: David Woodhouse <David.Woodhouse@intel.com>
Acked-by: Joerg Roedel <jroedel@suse.de>
One of the quirks that adds DMA aliases logs an informational message in
dmesg. Move that to pci_add_dma_alias() so all users log the message
consistently. No functional change intended (except extra message).
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Add a pci_add_dma_alias() interface to encapsulate the details of adding an
alias. No functional change intended.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Fix spelling of "initalization".
[bhelgaas: also fix pci/pci.c]
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
* pci/aer:
PCI/AER: Log aer_inject error injections
PCI/AER: Log actual error causes in aer_inject
PCI/AER: Use dev_warn() in aer_inject
PCI/AER: Fix aer_inject error codes
* pci/enumeration:
PCI: Fix broken URL for Dell biosdevname
* pci/kconfig:
PCI: Cleanup pci/pcie/Kconfig whitespace
PCI: Include pci/hotplug Kconfig directly from pci/Kconfig
PCI: Include pci/pcie/Kconfig directly from pci/Kconfig
* pci/misc:
PCI: Add PCI_CLASS_SERIAL_USB_DEVICE definition
PCI: Add QEMU top-level IDs for (sub)vendor & device
unicore32: Remove unused HAVE_ARCH_PCI_SET_DMA_MASK definition
PCI: Consolidate PCI DMA constants and interfaces in linux/pci-dma-compat.h
PCI: Move pci_dma_* helpers to common code
frv/PCI: Remove stray pci_{alloc,free}_consistent() declaration
* pci/virtualization:
PCI: Wait for up to 1000ms after FLR reset
PCI: Support SR-IOV on any function type
* pci/vpd:
PCI: Prevent VPD access for buggy devices
PCI: Sleep rather than busy-wait for VPD access completion
PCI: Fold struct pci_vpd_pci22 into struct pci_vpd
PCI: Rename VPD symbols to remove unnecessary "pci22"
PCI: Remove struct pci_vpd_ops.release function pointer
PCI: Move pci_vpd_release() from header file to pci/access.c
PCI: Move pci_read_vpd() and pci_write_vpd() close to other VPD code
PCI: Determine actual VPD size on first access
PCI: Use bitfield instead of bool for struct pci_vpd_pci22.busy
PCI: Allow access to VPD attributes with size 0
PCI: Update VPD definitions
Some devices take longer than the spec indicates to return from FLR reset,
a notable case of this is Intel integrated graphics (IGD), which can often
take an additional 300ms powering down an attached LCD panel as part of the
FLR. Allow devices up to 1000ms, testing every 100ms whether the second
dword of config space is read as -1.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
pci_create_root_bus() passes a "parent" pointer to
pci_bus_assign_domain_nr(). When CONFIG_PCI_DOMAINS_GENERIC is defined,
pci_bus_assign_domain_nr() dereferences that pointer. Many callers of
pci_create_root_bus() supply a NULL "parent" pointer, which leads to a NULL
pointer dereference error.
7c67470009 ("PCI: Move domain assignment from arm64 to generic code")
moved the "parent" dereference from arm64 to generic code. Only arm64 used
that code (because only arm64 defined CONFIG_PCI_DOMAINS_GENERIC), and it
always supplied a valid "parent" pointer. Other arches supplied NULL
"parent" pointers but didn't defined CONFIG_PCI_DOMAINS_GENERIC, so they
used a no-op version of pci_bus_assign_domain_nr().
8c7d14746a ("ARM/PCI: Move to generic PCI domains") defined
CONFIG_PCI_DOMAINS_GENERIC on ARM, and many ARM platforms use
pci_common_init(), which supplies a NULL "parent" pointer.
These platforms (cns3xxx, dove, footbridge, iop13xx, etc.) crash
with a NULL pointer dereference like this while probing PCI:
Unable to handle kernel NULL pointer dereference at virtual address 000000a4
PC is at pci_bus_assign_domain_nr+0x10/0x84
LR is at pci_create_root_bus+0x48/0x2e4
Kernel panic - not syncing: Attempted to kill init!
[bhelgaas: changelog, add "Reported:" and "Fixes:" tags]
Reported: http://forum.doozan.com/read.php?2,17868,22070,quote=1
Fixes: 8c7d14746a ("ARM/PCI: Move to generic PCI domains")
Fixes: 7c67470009 ("PCI: Move domain assignment from arm64 to generic code")
Signed-off-by: Krzysztof Hałasa <khalasa@piap.pl>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
CC: stable@vger.kernel.org # v4.0+
Christoph added a generic include/linux/pci-dma-compat.h, so now there's
one place with most of the PCI DMA interfaces. Move more PCI DMA-related
things there:
- The PCI_DMA_* direction constants from linux/pci.h
- The pci_set_dma_max_seg_size() and pci_set_dma_seg_boundary()
CONFIG_PCI implementations from drivers/pci/pci.c
- The pci_set_dma_max_seg_size() and pci_set_dma_seg_boundary()
!CONFIG_PCI stubs from linux/pci.h
- The pci_set_dma_mask() and pci_set_consistent_dma_mask()
!CONFIG_PCI stubs from linux/pci.h
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
include/asm-generic/pci-bridge.h is now empty, so remove every #include of
it.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Will Deacon <will.deacon@arm.com> (arm64)
Fix all whitespace issues (missing or needed whitespace) in all files in
drivers/pci. Code is compiled with allyesconfig before and after code
changes and objects are recorded and checked with objdiff and they are not
changed after this commit.
Signed-off-by: Bogicevic Sasa <brutallesale@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The pci_platform_pm_ops structure is never modified, so declare it as
const.
Done with the help of Coccinelle.
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* pci/aer:
PCI/AER: Clear error status registers during enumeration and restore
* pci/hotplug:
PCI: pciehp: Queue power work requests in dedicated function
* pci/misc:
PCI: Turn off Request Attributes to avoid Chelsio T5 Completion erratum
x86/PCI: Make pci_subsys_init() static
PCI: Add builtin_pci_driver() to avoid registration boilerplate
PCI: Remove unnecessary "if" statement
* pci/msi:
x86/PCI: Don't alloc pcibios-irq when MSI is enabled
PCI/MSI: Export all remapped MSIs to sysfs attributes
PCI: Disable MSI on SiS 761
* pci/resource:
sparc/PCI: Add mem64 resource parsing for root bus
PCI: Expand Enhanced Allocation BAR output
PCI: Make Enhanced Allocation bitmasks more obvious
PCI: Handle Enhanced Allocation capability for SR-IOV devices
PCI: Add support for Enhanced Allocation devices
PCI: Add Enhanced Allocation register entries
PCI: Handle IORESOURCE_PCI_FIXED when assigning resources
PCI: Handle IORESOURCE_PCI_FIXED when sizing resources
PCI: Clear IORESOURCE_UNSET when reverting to firmware-assigned address
* pci/virtualization:
PCI: Fix sriov_enable() error path for pcibios_enable_sriov() failures
PCI: Wait 1 second between disabling VFs and clearing NumVFs
PCI: Reorder pcibios_sriov_disable()
PCI: Remove VFs in reverse order if virtfn_add() fails
PCI: Remove redundant validation of SR-IOV offset/stride registers
PCI: Set SR-IOV NumVFs to zero after enumeration
PCI: Enable SR-IOV ARI Capable Hierarchy before reading TotalVFs
PCI: Don't try to restore VF BARs
An Enhanced Allocation Capability entry with BEI 0 fills in
dev->resource[0] just like a real BAR 0 would, but non-EA experts might not
connect "EA - BEI 0" with BAR 0.
Decode the EA jargon a little bit, e.g., change this:
pci 0002:01:00.0: EA - BEI 0, Prop 0x00: [mem 0x84300000-0x84303fff]
to this:
pci 0002:01:00.0: BAR 0: [mem 0x84300000-0x84303fff] (from Enhanced Allocation, properties 0x00)
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Expand bitmask #defines completely. This puts the shift in the code
instead of in the #define, but it makes it more obvious in the header file
how fields in the register are laid out.
No functional change.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
SR-IOV BARs can be specified via EA entries. Extend the EA parser to
extract the SRIOV BAR resources, and modify sriov_init() to use resources
previously obtained via EA.
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Sean O. Stalley <sean.stalley@intel.com>
Add support for devices using Enhanced Allocation entries instead of BARs.
This allows the kernel to parse the EA Extended Capability structure in PCI
config space and claim the BAR-equivalent resources.
See https://pcisig.com/sites/default/files/specification_documents/ECN_Enhanced_Allocation_23_Oct_2014_Final.pdf
[bhelgaas: add spec URL, s/pci_ea_set_flags/pci_ea_flags/, consolidate
declarations, print unknown property in hex to match spec]
Signed-off-by: Sean O. Stalley <sean.stalley@intel.com>
[david.daney@cavium.com: Add more support/checking for Entry Properties,
allow EA behind bridges, rewrite some error messages.]
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The Chelsio T5 has a PCIe compliance erratum that causes Malformed TLP or
Unexpected Completion errors in some systems, which may cause device access
timeouts.
Per PCIe r3.0, sec 2.2.9, "Completion headers must supply the same values
for the Attribute as were supplied in the header of the corresponding
Request, except as explicitly allowed when IDO is used."
Instead of copying the Attributes from the Request to the Completion, the
T5 always generates Completions with zero Attributes. The receiver of a
Completion whose Attributes don't match the Request may accept it (which
itself seems non-compliant based on sec 2.3.2), or it may handle it as a
Malformed TLP or an Unexpected Completion, which will probably lead to a
device access timeout.
Work around this by disabling "Relaxed Ordering" and "No Snoop" in the Root
Port so it always generate Requests with zero Attributes.
This does affect all other devices which are downstream of that Root Port,
but these are performance optimizations that should not make a functional
difference.
Note that Configuration Space accesses are never supposed to have TLP
Attributes, so we're safe waiting till after any Configuration Space
accesses to do the Root Port "fixup".
Based on original work by Casey Leedom <leedom@chelsio.com>
[bhelgaas: changelog, comments, rename to pci_find_pcie_root_port(), rework
to use pci_upstream_bridge() and check for Root Port device type, edit
diagnostics to clarify intent and devices affected]
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Commit bac2a909a0 (PCI / PM: Avoid resuming PCI devices during
system suspend) introduced a mechanism by which some PCI devices that
were runtime-suspended at the system suspend time might be left in
that state for the duration of the system suspend-resume cycle.
However, it overlooked devices that were marked as capable of waking
up the system just because PME support was detected in their PCI
config space.
Namely, in that case, device_can_wakeup(dev) returns 'true' for the
device and if the device is not configured for system wakeup,
device_may_wakeup(dev) returns 'false' and it will be resumed during
system suspend even though configuring it for system wakeup may not
really make sense at all.
To avoid this problem, simply disable PME for PCI devices that have
not been configured for system wakeup and are runtime-suspended at
the system suspend time for the duration of the suspend-resume cycle.
If the device is in D3cold, its config space is not available and it
shouldn't be written to, but that's only possible if the device
has platform PM support and the platform code is responsible for
checking whether or not the device's configuration is suitable for
system suspend in that case.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
AER errors might be recorded when powering-on devices. These errors can be
ignored, so firmware usually clears them before the OS enumerates devices.
However, firmware is not involved when devices are added via hotplug, so
the OS may discover power-up errors that should be ignored. The same may
happen when powering up devices when resuming after suspend.
Clear the AER error status registers during enumeration and resume.
[bhelgaas: changelog, remove repetitive comments]
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
VF BARs are read-only zero, so updating VF BARs will not have any effect.
See the SR-IOV spec r1.1, sec 3.4.1.11.
Don't update VF BARs in pci_restore_bars().
This avoids spurious "BAR %d: error updating" messages that we see when
doing vfio pass-through after 6eb7018705 ("vfio-pci: Move idle devices to
D3hot power state").
[bhelgaas: changelog, fix whitespace]
Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Pull x86 mm updates from Ingo Molnar:
"The dominant change in this cycle was the continued work to isolate
kernel drivers from MTRR legacies: this tree gets rid of all kernel
internal driver interfaces to MTRRs (mostly by rewriting it to proper
PAT interfaces), the only access left is the /proc/mtrr ABI.
This work was done by Luis R Rodriguez.
There's also some related PCI interface additions for which I've
Cc:-ed Bjorn"
* 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits)
x86/mm/mtrr: Remove kernel internal MTRR interfaces: unexport mtrr_add() and mtrr_del()
s390/io: Add pci_iomap_wc() and pci_iomap_wc_range()
drivers/dma/iop-adma: Use dma_alloc_writecombine() kernel-style
drivers/video/fbdev/vt8623fb: Use arch_phys_wc_add() and pci_iomap_wc()
drivers/video/fbdev/s3fb: Use arch_phys_wc_add() and pci_iomap_wc()
drivers/video/fbdev/arkfb.c: Use arch_phys_wc_add() and pci_iomap_wc()
PCI: Add pci_iomap_wc() variants
drivers/video/fbdev/gxt4500: Use pci_ioremap_wc_bar() to map framebuffer
drivers/video/fbdev/kyrofb: Use arch_phys_wc_add() and pci_ioremap_wc_bar()
drivers/video/fbdev/i740fb: Use arch_phys_wc_add() and pci_ioremap_wc_bar()
PCI: Add pci_ioremap_wc_bar()
x86/mm: Make kernel/check.c explicitly non-modular
x86/mm/pat: Make mm/pageattr[-test].c explicitly non-modular
x86/mm/pat: Add comments to cachemode translation tables
arch/*/io.h: Add ioremap_uc() to all architectures
drivers/video/fbdev/atyfb: Use arch_phys_wc_add() and ioremap_wc()
drivers/video/fbdev/atyfb: Replace MTRR UC hole with strong UC
drivers/video/fbdev/atyfb: Clarify ioremap() base and length used
drivers/video/fbdev/atyfb: Carve out framebuffer length fudging into a helper
x86/mm, asm-generic: Add IOMMU ioremap_uc() variant default
...
This lets drivers take advantage of PAT when available. It
should help with the transition of converting video drivers over
to ioremap_wc() to help with the goal of eventually using
_PAGE_CACHE_UC over _PAGE_CACHE_UC_MINUS on x86 on
ioremap_nocache(), see:
de33c442ed ("x86 PAT: fix performance drop for glx, use UC minus for ioremap(), ioremap_nocache() and pci_mmap_page_range()")
Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Cc: <syrjala@sci.fi>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Antonino Daplas <adaplas@gmail.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jean-Christophe Plagniol-Villard <plagnioj@jcrosoft.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suresh Siddha <sbsiddha@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: Ville Syrjälä <syrjala@sci.fi>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: airlied@linux.ie
Cc: benh@kernel.crashing.org
Cc: dan.j.williams@intel.com
Cc: konrad.wilk@oracle.com
Cc: linux-fbdev@vger.kernel.org
Cc: linux-pci@vger.kernel.org
Cc: mst@redhat.com
Cc: vinod.koul@intel.com
Cc: xen-devel@lists.xensource.com
Link: http://lkml.kernel.org/r/1440443613-13696-2-git-send-email-mcgrof@do-not-panic.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Firmware typically configures the PCIe fabric with a consistent Max Payload
Size setting based on the devices present at boot. A hot-added device
typically has the power-on default MPS setting (128 bytes), which may not
match the fabric.
The previous Linux default, in the absence of any "pci=pcie_bus_*" options,
was PCIE_BUS_TUNE_OFF, in which we never touch MPS, even for hot-added
devices.
Add a new default setting, PCIE_BUS_DEFAULT, in which we make sure every
device's MPS setting matches the upstream bridge. This makes it more
likely that a hot-added device will work in a system with optimized MPS
configuration.
Note that if we hot-add a device that only supports 128-byte MPS, it still
likely won't work because we don't reconfigure the rest of the fabric.
Booting with "pci=pcie_bus_peer2peer" is a workaround for this because it
sets MPS to 128 for everything.
[bhelgaas: changelog, new default, rework for pci_configure_device() path]
Tested-by: Keith Busch <keith.busch@intel.com>
Tested-by: Jordan Hargrave <jharg93@gmail.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
* pci/irq:
PCI/MSI: Free legacy IRQ when enabling MSI/MSI-X
PCI: Add helpers to manage pci_dev->irq and pci_dev->irq_managed
PCI, x86: Implement pcibios_alloc_irq() and pcibios_free_irq()
PCI: Add pcibios_alloc_irq() and pcibios_free_irq()
* pci/misc:
PCI: Remove unused "pci_probe" flags
PCI: Add VPD function 0 quirk for Intel Ethernet devices
PCI: Add dev_flags bit to access VPD through function 0
PCI / ACPI: Fix pci_acpi_optimize_delay() comment
PCI: Remove a broken link in quirks.c
PCI: Remove useless redundant code
PCI: Simplify pci_find_(ext_)capability() return value checks
PCI: Move PCI_FIND_CAP_TTL to pci.h and use it in quirks
PCI: Add pcie_downstream_port() (true for Root and Switch Downstream Ports)
PCI: Fix pcie_port_device_resume() comment
PCI: Shift PCI_CLASS_NOT_DEFINED consistently with other classes
PCI: Revert aeb30016fe ("PCI: add Intel USB specific reset method")
PCI: Fix TI816X class code quirk
PCI: Fix generic NCR 53c810 class code quirk
PCI: Use PCI_CLASS_SERIAL_USB instead of bare number
PCI: Add quirk for Intersil/Techwell TW686[4589] AV capture cards
PCI: Remove Intel Cherrytrail D3 delays
* pci/resource:
PCI: Call pci_read_bridge_bases() from core instead of arch code
* pci/virtualization:
PCI: Restore ACS configuration as part of pci_restore_state()
Previously we did not restore ACS state after a PCIe reset. This meant
that we could not reassign interfaces after a system suspend because the
D0->D3 transition disabled ACS, and we didn't restore it when going back to
D0.
Restore ACS configuration in pci_restore_state().
[bhelgaas: changelog]
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Allen Kay <allen.m.kay@intel.com>
CC: Chris Wright <chris@sous-sol.org>
CC: Alex Williamson <alex.williamson@redhat.com>
The return value of the pci_find_(ext_)capability() is either zero or the
position of a capability. It is never negative.
This patch consolidates the form of check from (pos <= 0) to (!pos).
Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Some quirks search for a HyperTransport capability and use a hard-coded TTL
value of 48 to avoid an infinite loop.
Move the definition of PCI_FIND_CAP_TTL to pci.h and use it instead of the
hard-coded TTL values.
[bhelgaas: changelog]
Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Refine the mechanism introduced by commit f244d8b623 ("ACPIPHP / radeon /
nouveau: Fix VGA switcheroo problem related to hotplug") to propagate the
ignore_hotplug setting of the device to its parent bridge in case hotplug
notifications related to the graphics adapter switching are given for the
bridge rather than for the device itself (they need to be ignored in both
cases).
Link: https://bugzilla.kernel.org/show_bug.cgi?id=61891
Link: https://bugs.freedesktop.org/show_bug.cgi?id=88927
Fixes: b440bde74f ("PCI: Add pci_ignore_hotplug() to ignore hotplug events for a device")
Reported-and-tested-by: tiagdtd-lava <tiagdtd-lava@yahoo.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: stable@vger.kernel.org # v3.17+
* pci/misc:
PCI: Read capability list as dwords, not bytes
PCI: Don't clear ASPM bits when the FADT declares it's unsupported
PCI: Clarify policy for vendor IDs in pci.txt
PCI/ACPI: Optimize device state transition delays
PCI: Export pci_find_host_bridge() for use inside PCI core
PCI: Make a shareable UUID for PCI firmware ACPI _DSM
PCI: Fix typo in Thunderbolt kernel message
Reading both the capability ID and "next" pointer at the same time lets us
parse the list with half the number of config reads.
Signed-off-by: Sean O. Stalley <sean.stalley@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Export the following symbols so they can be referenced by a PCI host bridge
driver compiled as a kernel loadable module:
pci_common_swizzle
pci_create_root_bus
pci_stop_root_bus
pci_remove_root_bus
pci_assign_unassigned_bus_resources
pci_fixup_irqs
Signed-off-by: Ray Jui <rjui@broadcom.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Make pci_ioremap_bar() fail if we're trying to map a BAR that hasn't been
assigned.
Normally pci_enable_device() will fail if a BAR hasn't been assigned, but a
driver can successfully call pci_enable_device_io() even if a memory BAR
hasn't been assigned. That driver should not be able to use
pci_ioremap_bar() to map that unassigned memory BAR.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Use dev_warn() to complain about a pci_ioremap_bar() failure so we can
include the driver name, BAR number, and the resource itself. We could use
dev_WARN() to also get the backtrace as we did previously, but I think
that's more information than we need.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
- Rework of the core ACPI resources parsing code to fix issues
in it and make using resource offsets more convenient and
consolidation of some resource-handing code in a couple of places
that have grown analagous data structures and code to cover the
the same gap in the core (Jiang Liu, Thomas Gleixner, Lv Zheng).
- ACPI-based IOAPIC hotplug support on top of the resources handling
rework (Jiang Liu, Yinghai Lu).
- ACPICA update to upstream release 20150204 including an interrupt
handling rework that allows drivers to install raw handlers for
ACPI GPEs which then become entirely responsible for the given GPE
and the ACPICA core code won't touch it (Lv Zheng, David E Box,
Octavian Purdila).
- ACPI EC driver rework to fix several concurrency issues and other
problems related to events handling on top of the ACPICA's new
support for raw GPE handlers (Lv Zheng).
- New ACPI driver for AMD SoCs analogous to the LPSS (Low-Power
Subsystem) driver for Intel chips (Ken Xue).
- Two minor fixes of the ACPI LPSS driver (Heikki Krogerus,
Jarkko Nikula).
- Two new blacklist entries for machines (Samsung 730U3E/740U3E and
510R) where the native backlight interface doesn't work correctly
while the ACPI one does (Hans de Goede).
- Rework of the ACPI processor driver's handling of idle states
to make the code more straightforward and less bloated overall
(Rafael J Wysocki).
- Assorted minor fixes related to ACPI and SFI (Andreas Ruprecht,
Andy Shevchenko, Hanjun Guo, Jan Beulich, Rafael J Wysocki,
Yaowei Bai).
- PCI core power management modification to avoid resuming (some)
runtime-suspended devices during system suspend if they are in
the right states already (Rafael J Wysocki).
- New SFI-based cpufreq driver for Intel platforms using SFI
(Srinidhi Kasagar).
- cpufreq core fixes, cleanups and simplifications (Viresh Kumar,
Doug Anderson, Wolfram Sang).
- SkyLake CPU support and other updates for the intel_pstate driver
(Kristen Carlson Accardi, Srinivas Pandruvada).
- cpufreq-dt driver cleanup (Markus Elfring).
- Init fix for the ARM big.LITTLE cpuidle driver (Sudeep Holla).
- Generic power domains core code fixes and cleanups (Ulf Hansson).
- Operating Performance Points (OPP) core code cleanups and kernel
documentation update (Nishanth Menon).
- New dabugfs interface to make the list of PM QoS constraints
available to user space (Nishanth Menon).
- New devfreq driver for Tegra Activity Monitor (Tomeu Vizoso).
- New devfreq class (devfreq_event) to provide raw utilization data
to devfreq governors (Chanwoo Choi).
- Assorted minor fixes and cleanups related to power management
(Andreas Ruprecht, Krzysztof Kozlowski, Rickard Strandqvist,
Pavel Machek, Todd E Brandt, Wonhong Kwon).
- turbostat updates (Len Brown) and cpupower Makefile improvement
(Sriram Raghunathan).
/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABCAAGBQJU2neOAAoJEILEb/54YlRx51QP/jrv1Wb5eMaemzMksPIWI5Zn
I8IbxzToxu7wDDsrTBRv+LuyllMPrnppFOHHvB35gUYu7Y6I066s3ErwuqeFlbmy
+VicmyGMahv3yN74qg49MXzWtaJZa8hrFXn8ItujiUIcs08yELi0vBQFlZImIbTB
PdQngO88VfiOVjDvmKkYUU//9Sc9LCU0ZcdUQXSnA1oNOxuUHjiARz98R03hhSqu
BWR+7M0uaFbu6XeK+BExMXJTpKicIBZ1GAF6hWrS8V4aYg+hH1cwjf2neDAzZkcU
UkXieJlLJrCq+ZBNcy7WEhkWQkqJNWei5WYiy6eoQeQpNoliY2V+2OtSMJaKqDye
PIiMwXstyDc5rgyULN0d1UUzY6mbcUt2rOL0VN2bsFVIJ1HWCq8mr8qq689pQUYv
tcH18VQ2/6r2zW28sTO/ByWLYomklD/Y6bw2onMhGx3Knl0D8xYJKapVnTGhr5eY
d4k41ybHSWNKfXsZxdJc+RxndhPwj9rFLfvY/CZEhLcW+2pAiMarRDOPXDoUI7/l
aJpmPzy/6mPXGBnTfr6jKDSY3gXNazRIvfPbAdiGayKcHcdRM4glbSbNH0/h1Iq6
HKa8v9Fx87k1X5r4ZbhiPdABWlxuKDiM7725rfGpvjlWC3GNFOq7YTVMOuuBA225
Mu9PRZbOsZsnyNkixBpX
=zZER
-----END PGP SIGNATURE-----
Merge tag 'pm+acpi-3.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI and power management updates from Rafael Wysocki:
"We have a few new features this time, including a new SFI-based
cpufreq driver, a new devfreq driver for Tegra Activity Monitor, a new
devfreq class for providing its governors with raw utilization data
and a new ACPI driver for AMD SoCs.
Still, the majority of changes here are reworks of existing code to
make it more straightforward or to prepare it for implementing new
features on top of it. The primary example is the rework of ACPI
resources handling from Jiang Liu, Thomas Gleixner and Lv Zheng with
support for IOAPIC hotplug implemented on top of it, but there is
quite a number of changes of this kind in the cpufreq core, ACPICA,
ACPI EC driver, ACPI processor driver and the generic power domains
core code too.
The most active developer is Viresh Kumar with his cpufreq changes.
Specifics:
- Rework of the core ACPI resources parsing code to fix issues in it
and make using resource offsets more convenient and consolidation
of some resource-handing code in a couple of places that have grown
analagous data structures and code to cover the the same gap in the
core (Jiang Liu, Thomas Gleixner, Lv Zheng).
- ACPI-based IOAPIC hotplug support on top of the resources handling
rework (Jiang Liu, Yinghai Lu).
- ACPICA update to upstream release 20150204 including an interrupt
handling rework that allows drivers to install raw handlers for
ACPI GPEs which then become entirely responsible for the given GPE
and the ACPICA core code won't touch it (Lv Zheng, David E Box,
Octavian Purdila).
- ACPI EC driver rework to fix several concurrency issues and other
problems related to events handling on top of the ACPICA's new
support for raw GPE handlers (Lv Zheng).
- New ACPI driver for AMD SoCs analogous to the LPSS (Low-Power
Subsystem) driver for Intel chips (Ken Xue).
- Two minor fixes of the ACPI LPSS driver (Heikki Krogerus, Jarkko
Nikula).
- Two new blacklist entries for machines (Samsung 730U3E/740U3E and
510R) where the native backlight interface doesn't work correctly
while the ACPI one does (Hans de Goede).
- Rework of the ACPI processor driver's handling of idle states to
make the code more straightforward and less bloated overall (Rafael
J Wysocki).
- Assorted minor fixes related to ACPI and SFI (Andreas Ruprecht,
Andy Shevchenko, Hanjun Guo, Jan Beulich, Rafael J Wysocki, Yaowei
Bai).
- PCI core power management modification to avoid resuming (some)
runtime-suspended devices during system suspend if they are in the
right states already (Rafael J Wysocki).
- New SFI-based cpufreq driver for Intel platforms using SFI
(Srinidhi Kasagar).
- cpufreq core fixes, cleanups and simplifications (Viresh Kumar,
Doug Anderson, Wolfram Sang).
- SkyLake CPU support and other updates for the intel_pstate driver
(Kristen Carlson Accardi, Srinivas Pandruvada).
- cpufreq-dt driver cleanup (Markus Elfring).
- Init fix for the ARM big.LITTLE cpuidle driver (Sudeep Holla).
- Generic power domains core code fixes and cleanups (Ulf Hansson).
- Operating Performance Points (OPP) core code cleanups and kernel
documentation update (Nishanth Menon).
- New dabugfs interface to make the list of PM QoS constraints
available to user space (Nishanth Menon).
- New devfreq driver for Tegra Activity Monitor (Tomeu Vizoso).
- New devfreq class (devfreq_event) to provide raw utilization data
to devfreq governors (Chanwoo Choi).
- Assorted minor fixes and cleanups related to power management
(Andreas Ruprecht, Krzysztof Kozlowski, Rickard Strandqvist, Pavel
Machek, Todd E Brandt, Wonhong Kwon).
- turbostat updates (Len Brown) and cpupower Makefile improvement
(Sriram Raghunathan)"
* tag 'pm+acpi-3.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (151 commits)
tools/power turbostat: relax dependency on APERF_MSR
tools/power turbostat: relax dependency on invariant TSC
Merge branch 'pci/host-generic' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci into acpi-resources
tools/power turbostat: decode MSR_*_PERF_LIMIT_REASONS
tools/power turbostat: relax dependency on root permission
ACPI / video: Add disable_native_backlight quirk for Samsung 510R
ACPI / PM: Remove unneeded nested #ifdef
USB / PM: Remove unneeded #ifdef and associated dead code
intel_pstate: provide option to only use intel_pstate with HWP
ACPI / EC: Add GPE reference counting debugging messages
ACPI / EC: Add query flushing support
ACPI / EC: Refine command storm prevention support
ACPI / EC: Add command flushing support.
ACPI / EC: Introduce STARTED/STOPPED flags to replace BLOCKED flag
ACPI: add AMD ACPI2Platform device support for x86 system
ACPI / table: remove duplicate NULL check for the handler of acpi_table_parse()
ACPI / EC: Update revision due to raw handler mode.
ACPI / EC: Reduce ec_poll() by referencing the last register access timestamp.
ACPI / EC: Fix several GPE handling issues by deploying ACPI_GPE_DISPATCH_RAW_HANDLER mode.
ACPICA: Events: Enable APIs to allow interrupt/polling adaptive request based GPE handling model
...
* pci/enumeration:
PCI: Generate uppercase hex for modalias var in uevent
* pci/hotplug:
PCI: pciehp: Handle surprise add even if surprise removal isn't supported
* pci/resource:
PCI: Fix infinite loop with ROM image of size 0
* pci/virtualization:
PCI: Add Wellsburg (X99) to Intel PCH root port ACS quirk
PCI: Add DMA alias quirk for Adaptec 3405
PCI: Add ACS quirk for Emulex NICs
PCI: Mark AMD/ATI VGA devices that don't reset on D3hot->D0 transition
PCI: Add flag for devices that don't reset on D3hot->D0 transition
PCI: Mark Atheros AR93xx to avoid bus reset
PCI: Add flag for devices where we can't use bus reset
Commit f25c0ae2b4 (ACPI / PM: Avoid resuming devices in ACPI PM
domain during system suspend) modified the ACPI PM domain's system
suspend callbacks to allow devices attached to it to be left in the
runtime-suspended state during system suspend so as to optimize
the suspend process.
This was based on the general mechanism introduced by commit
aae4518b31 (PM / sleep: Mechanism to avoid resuming runtime-suspended
devices unnecessarily).
Extend that approach to PCI devices by modifying the PCI bus type's
->prepare callback to return 1 for devices that are runtime-suspended
when it is being executed and that are in a suitable power state and
need not be resumed going forward.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Per the PCI Power Management spec r1.2, sec 3.2.4, a device that advertises
No_Soft_Reset == 0 in the PMCSR register (reported by lspci as "NoSoftRst-")
should perform an internal reset when transitioning from D3hot to D0 via
software control. Configuration context is lost and the device requires a
full reinitialization sequence.
Unfortunately the definition of "internal reset", beyond the application of
the configuration context, is largely left to the interpretation of the
specific device. Some devices don't seem to perform an "internal reset"
even if they report No_Soft_Reset == 0.
We still need to honor the PCI specification and restore PCI config context
in the event that we do a PM reset, so we don't cache and modify the
PCI_PM_CTRL_NO_SOFT_RESET bit for the device, but for interfaces where the
intention is to reset the device, like pci_reset_function(), we need a
mechanism to flag that PM reset (a D3hot->D0 transition) doesn't perform
any significant "internal reset" of the device.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Enable a mechanism for devices to quirk that they do not behave when
doing a PCI bus reset. We require a modest level of spec compliant
behavior in order to do a reset, for instance the device should come
out of reset without throwing errors and PCI config space should be
accessible after reset. This is too much to ask for some devices.
Link: http://lkml.kernel.org/r/20140923210318.498dacbd@dualc.maya.org
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: stable@vger.kernel.org # v3.14+
The current logic in arm64 pci_bus_assign_domain_nr() is flawed in that
depending on the host controller configuration for a platform and the
initialization sequence, core code may end up allocating PCI domain numbers
from both DT and the generic domain counter, which would result in PCI
domain allocation aliases/errors.
Fix the logic behind the PCI domain number assignment and move the
resulting code to the PCI core so the same domain allocation logic is used
on all platforms that select CONFIG_PCI_DOMAINS_GENERIC.
[bhelgaas: tidy changelog]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Liviu Dudau <Liviu.Dudau@arm.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
CC: Rob Herring <robh+dt@kernel.org>
CC: Catalin Marinas <catalin.marinas@arm.com>
- Fully support non-coherent devices on ARM by introducing the
mechanisms to request the hypervisor to perform the required cache
maintainance operations.
- A number of pciback bug fixes and cleanups. Notably a deadlock fix
if a PCI device was manually uunbound and a fix for incorrectly
restoring state after a function reset.
- In x86 PVHVM guests, use the APIC for interrupts if this has been
virtualized by the hardware. This reduces the number of interrupt-
related VM exits on such hardware.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQEcBAABAgAGBQJUiYb+AAoJEFxbo/MsZsTRwmEH+gNaJz5r8gIJlq8Q51+nOIs4
Gw6HdjUB5MOT47vDV4treEOx0Bk8hYTfgWUWvAC81JMJ1sMWOVrUGuG/0lmzaomW
zXvSk+o0n4LafwEhHb8LIccZMbaH7f9o3PNdNchrTkPrIl8Gf2nmBXCkDsT4mRye
5ZFpc4ntgBrznh3baPYDS8PCAmlyZ0uVEnz1ofYI6S80dC13siEiPG0c9TrNEKzO
glhvgCRmR0C4ZNLblM36HWBEqrdLuGCoNJSH+7okygyP2TLD3aO4R+9aD5JWYNdf
fO2WmivX/zK+UGVAElrLx+rb8R2dv3ddeaE5piZhIBUieopIWJd32L3LhQORdtc=
=N6DP
-----END PGP SIGNATURE-----
Merge tag 'stable/for-linus-3.19-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen features and fixes from David Vrabel:
- Fully support non-coherent devices on ARM by introducing the
mechanisms to request the hypervisor to perform the required cache
maintainance operations.
- A number of pciback bug fixes and cleanups. Notably a deadlock fix
if a PCI device was manually uunbound and a fix for incorrectly
restoring state after a function reset.
- In x86 PVHVM guests, use the APIC for interrupts if this has been
virtualized by the hardware. This reduces the number of interrupt-
related VM exits on such hardware.
* tag 'stable/for-linus-3.19-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: (26 commits)
Revert "swiotlb-xen: pass dev_addr to swiotlb_tbl_unmap_single"
xen/pci: Use APIC directly when APIC virtualization hardware is available
xen/pci: Defer initialization of MSI ops on HVM guests
xen-pciback: drop SR-IOV VFs when PF driver unloads
xen/pciback: Restore configuration space when detaching from a guest.
PCI: Expose pci_load_saved_state for public consumption.
xen/pciback: Remove tons of dereferences
xen/pciback: Print out the domain owning the device.
xen/pciback: Include the domain id if removing the device whilst still in use
driver core: Provide an wrapper around the mutex to do lockdep warnings
xen/pciback: Don't deadlock when unbinding.
swiotlb-xen: pass dev_addr to swiotlb_tbl_unmap_single
swiotlb-xen: call xen_dma_sync_single_for_device when appropriate
swiotlb-xen: remove BUG_ON in xen_bus_to_phys
swiotlb-xen: pass dev_addr to xen_dma_unmap_page and xen_dma_sync_single_for_cpu
xen/arm: introduce GNTTABOP_cache_flush
xen/arm/arm64: introduce xen_arch_need_swiotlb
xen/arm/arm64: merge xen/mm32.c into xen/mm.c
xen/arm: use hypercall to flush caches in map_page
xen: add a dma_addr_t dev_addr argument to xen_dma_map_page
...
We have the pci_load_and_free_saved_state, and pci_store_saved_state
but are missing the functionality to just load the state
multiple times in the PCI device without having to free/save
the state.
This patch makes it possible to use this function.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
pci_iov_resource_bar() always sets its 'pci_bar_type' parameter to
'pci_bar_unknown'. Drop the parameter and just use 'pci_bar_unknown'
directly in the callers.
No functional change intended.
Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Chris Wright <chrisw@sous-sol.org>
CC: Yu Zhao <yuzhao@google.com>
We have same warning message for FLR and AF FLR and users can't know which
type of resets the PCI device is taking when there are pending
transactions. Print different messages for FLR and AF FLR cases.
[bhelgaas: make code structure parallel, add "anyway" to suggest risk]
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
* pci/host-generic:
arm64: Add architectural support for PCI
PCI: Add pci_remap_iospace() to map bus I/O resources
of/pci: Add support for parsing PCI host bridge resources from DT
of/pci: Add pci_get_new_domain_nr() and of_get_pci_domain_nr()
PCI: Add generic domain handling
of/pci: Fix the conversion of IO ranges into IO resources
of/pci: Move of_pci_range_to_resource() to of/address.c
ARM: Define PCI_IOBASE as the base of virtual PCI IO space
of/pci: Add pci_register_io_range() and pci_pio_to_address()
asm-generic/io.h: Fix ioport_map() for !CONFIG_GENERIC_IOMAP
Conflicts:
drivers/pci/host/pci-tegra.c
Add pci_remap_iospace() to map bus I/O resources into the CPU virtual
address space. Architectures with special needs may provide their own
version, but most should be able to use this one.
This function is useful for PCI host bridge drivers that need to map the
PCI I/O resources into virtual memory space.
[bhelgaas: phys_addr description, drop temporary "err" variable]
Signed-off-by: Liviu Dudau <Liviu.Dudau@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
CC: Arnd Bergmann <arnd@arndb.de>
Add pci_get_new_domain_nr() to allocate a new domain number and
of_get_pci_domain_nr() to retrieve the PCI domain number of a given device
from DT. Host bridge drivers or architecture-specific code can choose to
implement their PCI domain number policy using these two functions.
Using of_get_pci_domain_nr() guarantees a stable PCI domain number on every
boot provided that all host bridge controllers are assigned a number in the
device tree using "linux,pci-domain" property. Mixing use of
pci_get_new_domain_nr() and of_get_pci_domain_nr() is not recommended as it
can lead to potentially conflicting domain numbers being assigned to root
buses behind different host bridges.
Signed-off-by: Liviu Dudau <Liviu.Dudau@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Arnd Bergmann <arnd@arndb.de>
CC: Grant Likely <grant.likely@linaro.org>
CC: Rob Herring <robh+dt@kernel.org>
CC: Catalin Marinas <catalin.marinas@arm.com>
The following Coccinelle semantic patch was used to find and correct cases
of assignments in "if" conditions:
@@
expression var, expr;
statement S;
@@
+ var = expr;
if(
- (var = expr)
+ var
) S
Signed-off-by: Quentin Lambert <lambert.quentin@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Commit 448bd857d4 ("PCI/PM: add PCIe runtime D3cold support") added a
check to prevent PCI devices from being put into D3cold during system
suspend without giving any particular reason.
Also the check isn't really necessary, because acpi_pci_set_power_state()
maps PCI_D3hot to ACPI_STATE_D3_COLD anyway.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
* pci/host-generic:
PCI: generic: Fix GPL v2 license string typo
* pci/host-mvebu:
PCI: mvebu: Fix GPL v2 license string typo
* pci/host-rcar:
PCI: rcar: Fix GPL v2 license string typo
* pci/host-tegra:
PCI: tegra: Fix GPL v2 license string typo
* pci/msi:
PCI/MSI: Use irq_get_msi_desc() to simplify code
PCI/MSI: Remove unused list access in __pci_restore_msix_state()
PCI/MSI: Retrieve first MSI IRQ from msi_desc rather than pci_dev
PCI/MSI: Remove unused function msi_remove_pci_irq_vectors()
PCI/MSI: Add msi_setup_entry() to clean up MSI initialization
* pci/misc:
PCI: Configure ASPM when enabling device
x86: don't exclude low BIOS area when allocating address space for non-PCI cards
PCI: Add include guard to include/linux/pci_ids.h
x86, ia64: Move EFI_FB vga_default_device() initialization to pci_vga_fixup()
* pci/resource:
PCI: Tidy resource assignment messages
PCI: Return conventional error values from pci_revert_fw_address()
PCI: Cleanup control flow
PCI: Support BAR sizes up to 128GB
PCI: Keep original resource if we fail to expand it
* pci/virtualization:
powerpc/pci: Remove duplicate logic
PCI: Make resetting secondary bus logic common
We can't do ASPM configuration at enumeration-time because enabling it
makes some defective hardware unresponsive, even if ASPM is disabled later
(see 41cd766b06 ("PCI: Don't enable aspm before drivers have had a chance
to veto it"). Therefore, we have to do it after a driver claims the
device.
We previously configured ASPM in pci_set_power_state(), but that's not a
very good place because it's not really related to setting the PCI device
power state, and doing it there means:
- We incorrectly skipped ASPM config when setting a device that's
already in D0 to D0.
- We unnecessarily configured ASPM when setting a device to a low-power
state (the ASPM feature only applies when the device is in D0).
- We unnecessarily configured ASPM when called from a .resume() method
(ASPM configuration needs to be restored during resume, but
pci_restore_pcie_state() should already do this).
Move ASPM configuration from pci_set_power_state() to
do_pci_enable_device() so we do it when a driver enables a device.
[bhelgaas: changelog]
Link: https://bugzilla.kernel.org/show_bug.cgi?id=79621
Fixes: db288c9c5f ("PCI / PM: restore the original behavior of pci_set_power_state()")
Suggested-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Vidya Sagar <sagar.tv@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: stable@vger.kernel.org # v3.6+
Commit d92a208d08 ("powerpc/pci: Mask linkDown on resetting PCI bus")
implemented same logic (resetting PCI secondary bus by bridge's config
register PCI_BRIDGE_CTL_BUS_RESET) in PCI core and arch-dependent code. To
avoid the duplication, move the logic to pci_reset_secondary_bus().
That commit did not declare the pcibios_reset_secondary_bus() interface in
linux/include/pci.h. Add the declaration.
No functional change.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
pci_wait_for_pending() uses word access, so we shouldn't be passing
an offset that is only byte aligned. Use the control register offset
instead, shifting the mask to match.
Fixes: d0b4cc4e32 ("PCI: Wrong register used to check pending traffic")
Fixes: 157e876ffe ("PCI: Add pci_wait_for_pending() (refactor pci_wait_for_pending_transaction())
Reported-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
CC: stable@vger.kernel.org # v3.14+
IOMMU
- Add DMA alias iterator (Alex Williamson)
- Add DMA alias quirks for ASMedia, ITE, Tundra bridges (Alex Williamson)
- Add DMA alias quirks for Marvell, Ricoh devices (Alex Williamson)
- Add DMA alias quirk for HighPoint devices (Jérôme Carretero)
MSI
- Fix leak in free_msi_irqs() (Alexei Starovoitov)
Marvell MVEBU
- Remove unnecessary use of 'conf_lock' spinlock (Andrew Murray)
- Avoid setting an undefined window size (Jason Gunthorpe)
- Allow several windows with the same target/attribute (Thomas Petazzoni)
- Split PCIe BARs into multiple MBus windows when needed (Thomas Petazzoni)
- Fix off-by-one in the computed size of the mbus windows (Willy Tarreau)
NVIDIA Tegra
- Use new OF interrupt mapping when possible (Lucas Stach)
Synopsys DesignWare
- Remove unnecessary use of 'conf_lock' spinlock (Andrew Murray)
- Use new OF interrupt mapping when possible (Lucas Stach)
- Split Exynos and i.MX bindings (Lucas Stach)
- Fix comment for setting number of lanes (Mohit Kumar)
- Fix iATU programming for cfg1, io and mem viewport (Mohit Kumar)
Miscellaneous
- EXPORT_SYMBOL cleanup (Ryan Desfosses)
- Whitespace cleanup (Ryan Desfosses)
- Merge multi-line quoted strings (Ryan Desfosses)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJTmfzfAAoJEFmIoMA60/r8/YUQAJMrbeoUqtc/DqDKVOxed1ZG
itkQ2tDgdQAo9G0axerpQS82+j/3SUnaQ/5FaX4+vyu5DA/Po9grQjRrJO+ppNzH
31pbeWwz3mUvAVeBkFZAf93+mbdlwQ6EMpPFsX86AftGVS81xh9eUZGek2sm/v9t
VLM2AcVDbj//Bu4+7oC8l/ws1GcG17dLgvlZydvN9O07aj7mEDWlh/6+fnTKiYXU
OcOwWK/ey5npeedcBHlUmQkiM/DVc8R3pHHQCSc40DZJR7YHiGoQnMxpQ/KNBin+
42ZcOT5gPYvdft+TX6kygYgsgO6NbIU2Vnl3LvO9vlv8nq4qsCz3lbpPEz0+xGiF
W0oXWXw3VjdLUsSfwL30PGSYpQ8DbVs5jGl6vJXyDNzDiF36yvPCIOM69Y6gyvvM
hKdSeleG6e+tYjeW1qdmGbrnSNj+zmt6muSY16U+lkeOifcWnMTtx8BYpxrjDC76
2lXCp47V05flRapU2ZU5ocR57uTLoEsLNx01yFbnI+50SEBUamEGuKT3YMROOQnm
zUiwDONhRAVT55m8QwUHXbryOGmALwCL0Zvb1RG3QetnSyGiK4mWC59t0mWubf4X
nlKSE0Buaxiy9UdBIKvMt1oFLV8s0kElac5Rl2K0I2CiIzX/zBRT65GvpNGrWrwo
91pap9RTTOHGpICGSewx
=woH9
-----END PGP SIGNATURE-----
Merge tag 'pci-v3.16-changes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull more PCI updates from Bjorn Helgaas:
"Here are some more things I'd like to see in v3.16-rc1:
- DMA alias iterator, part of some work to fix IOMMU issues
- MVEBU, Tegra, DesignWare changes that I forgot to include before
- Some whitespace code cleanup
Details:
IOMMU
- Add DMA alias iterator (Alex Williamson)
- Add DMA alias quirks for ASMedia, ITE, Tundra bridges (Alex Williamson)
- Add DMA alias quirks for Marvell, Ricoh devices (Alex Williamson)
- Add DMA alias quirk for HighPoint devices (Jérôme Carretero)
MSI
- Fix leak in free_msi_irqs() (Alexei Starovoitov)
Marvell MVEBU
- Remove unnecessary use of 'conf_lock' spinlock (Andrew Murray)
- Avoid setting an undefined window size (Jason Gunthorpe)
- Allow several windows with the same target/attribute (Thomas Petazzoni)
- Split PCIe BARs into multiple MBus windows when needed (Thomas Petazzoni)
- Fix off-by-one in the computed size of the mbus windows (Willy Tarreau)
NVIDIA Tegra
- Use new OF interrupt mapping when possible (Lucas Stach)
Synopsys DesignWare
- Remove unnecessary use of 'conf_lock' spinlock (Andrew Murray)
- Use new OF interrupt mapping when possible (Lucas Stach)
- Split Exynos and i.MX bindings (Lucas Stach)
- Fix comment for setting number of lanes (Mohit Kumar)
- Fix iATU programming for cfg1, io and mem viewport (Mohit Kumar)
Miscellaneous
- EXPORT_SYMBOL cleanup (Ryan Desfosses)
- Whitespace cleanup (Ryan Desfosses)
- Merge multi-line quoted strings (Ryan Desfosses)"
* tag 'pci-v3.16-changes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (21 commits)
PCI: Add function 1 DMA alias quirk for HighPoint RocketRaid 642L
PCI/MSI: Fix memory leak in free_msi_irqs()
PCI: Merge multi-line quoted strings
PCI: Whitespace cleanup
PCI: Move EXPORT_SYMBOL so it immediately follows function/variable
PCI: Add bridge DMA alias quirk for ITE bridge
PCI: designware: Split Exynos and i.MX bindings
PCI: Add bridge DMA alias quirk for ASMedia and Tundra bridges
PCI: Add support for PCIe-to-PCI bridge DMA alias quirks
PCI: Add function 1 DMA alias quirk for Marvell devices
PCI: Add function 0 DMA alias quirk for Ricoh devices
PCI: Add support for DMA alias quirks
PCI: Convert pci_dev_flags definitions to bit shifts
PCI: Add DMA alias iterator
PCI: mvebu: Use '%pa' for printing 'phys_addr_t' type
PCI: mvebu: Remove unnecessary use of 'conf_lock' spinlock
PCI: designware: Remove unnecessary use of 'conf_lock' spinlock
PCI: designware: Use new OF interrupt mapping when possible
PCI: designware: Fix iATU programming for cfg1, io and mem viewport
PCI: designware: Fix comment for setting number of lanes
...
Merge quoted strings that are broken across lines into a single entity.
The compiler merges them anyway, but checkpatch complains about it, and
merging them makes it easier to grep for strings.
No functional change.
[bhelgaas: changelog, do the same for everything under drivers/pci]
Signed-off-by: Ryan Desfosses <ryan@desfo.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Fix various whitespace errors.
No functional change.
[bhelgaas: fix other similar problems]
Signed-off-by: Ryan Desfosses <ryan@desfo.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Pull powerpc updates from Ben Herrenschmidt:
"Here is the bulk of the powerpc changes for this merge window. It got
a bit delayed in part because I wasn't paying attention, and in part
because I discovered I had a core PCI change without a PCI maintainer
ack in it. Bjorn eventually agreed it was ok to merge it though we'll
probably improve it later and I didn't want to rebase to add his ack.
There is going to be a bit more next week, essentially fixes that I
still want to sort through and test.
The biggest item this time is the support to build the ppc64 LE kernel
with our new v2 ABI. We previously supported v2 userspace but the
kernel itself was a tougher nut to crack. This is now sorted mostly
thanks to Anton and Rusty.
We also have a fairly big series from Cedric that add support for
64-bit LE zImage boot wrapper. This was made harder by the fact that
traditionally our zImage wrapper was always 32-bit, but our new LE
toolchains don't really support 32-bit anymore (it's somewhat there
but not really "supported") so we didn't want to rely on it. This
meant more churn that just endian fixes.
This brings some more LE bits as well, such as the ability to run in
LE mode without a hypervisor (ie. under OPAL firmware) by doing the
right OPAL call to reinitialize the CPU to take HV interrupts in the
right mode and the usual pile of endian fixes.
There's another series from Gavin adding EEH improvements (one day we
*will* have a release with less than 20 EEH patches, I promise!).
Another highlight is the support for the "Split core" functionality on
P8 by Michael. This allows a P8 core to be split into "sub cores" of
4 threads which allows the subcores to run different guests under KVM
(the HW still doesn't support a partition per thread).
And then the usual misc bits and fixes ..."
[ Further delayed by gmail deciding that BenH is a dirty spammer.
Google knows. ]
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (155 commits)
powerpc/powernv: Add missing include to LPC code
selftests/powerpc: Test the THP bug we fixed in the previous commit
powerpc/mm: Check paca psize is up to date for huge mappings
powerpc/powernv: Pass buffer size to OPAL validate flash call
powerpc/pseries: hcall functions are exported to modules, need _GLOBAL_TOC()
powerpc: Exported functions __clear_user and copy_page use r2 so need _GLOBAL_TOC()
powerpc/powernv: Set memory_block_size_bytes to 256MB
powerpc: Allow ppc_md platform hook to override memory_block_size_bytes
powerpc/powernv: Fix endian issues in memory error handling code
powerpc/eeh: Skip eeh sysfs when eeh is disabled
powerpc: 64bit sendfile is capped at 2GB
powerpc/powernv: Provide debugfs access to the LPC bus via OPAL
powerpc/serial: Use saner flags when creating legacy ports
powerpc: Add cpu family documentation
powerpc/xmon: Fix up xmon format strings
powerpc/powernv: Add calls to support little endian host
powerpc: Document sysfs DSCR interface
powerpc: Fix regression of per-CPU DSCR setting
powerpc: Split __SYSFS_SPRSETUP macro
arch: powerpc/fadump: Cleaning up inconsistent NULL checks
...
Move EXPORT_SYMBOL so it immediately follows the function or variable.
No functional change.
[bhelgaas: squash similar changes, fix hotplug, probe, rom, search, too]
Signed-off-by: Ryan Desfosses <ryan@desfo.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Pull trivial tree changes from Jiri Kosina:
"Usual pile of patches from trivial tree that make the world go round"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (23 commits)
staging: go7007: remove reference to CONFIG_KMOD
aic7xxx: Remove obsolete preprocessor define
of: dma: doc fixes
doc: fix incorrect formula to calculate CommitLimit value
doc: Note need of bc in the kernel build from 3.10 onwards
mm: Fix printk typo in dmapool.c
modpost: Fix comment typo "Modules.symvers"
Kconfig.debug: Grammar s/addition/additional/
wimax: Spelling s/than/that/, wording s/destinatary/recipient/
aic7xxx: Spelling s/termnation/termination/
arm64: mm: Remove superfluous "the" in comment
of: Spelling s/anonymouns/anonymous/
dma: imx-sdma: Spelling s/determnine/determine/
ath10k: Improve grammar in comments
ath6kl: Spelling s/determnine/determine/
of: Improve grammar for of_alias_get_id() documentation
drm/exynos: Spelling s/contro/control/
radio-bcm2048.c: fix wrong overflow check
doc: printk-formats: do not mention casts for u64/s64
doc: spelling error changes
...
* pci/misc:
PCI: Fix return value from pci_user_{read,write}_config_*()
PCI: Turn pcibios_penalize_isa_irq() into a weak function
PCI: Test for std config alias when testing extended config space
* pci/hotplug:
PCI: cpqphp: Fix possible null pointer dereference
NVMe: Implement PCIe reset notification callback
PCI: Notify driver before and after device reset
* pci/pci_is_bridge:
pcmcia: Use pci_is_bridge() to simplify code
PCI: pciehp: Use pci_is_bridge() to simplify code
PCI: acpiphp: Use pci_is_bridge() to simplify code
PCI: cpcihp: Use pci_is_bridge() to simplify code
PCI: shpchp: Use pci_is_bridge() to simplify code
PCI: rpaphp: Use pci_is_bridge() to simplify code
sparc/PCI: Use pci_is_bridge() to simplify code
powerpc/PCI: Use pci_is_bridge() to simplify code
ia64/PCI: Use pci_is_bridge() to simplify code
x86/PCI: Use pci_is_bridge() to simplify code
PCI: Use pci_is_bridge() to simplify code
PCI: Add new pci_is_bridge() interface
PCI: Rename pci_is_bridge() to pci_has_subordinate()
* pci/virtualization:
PCI: Introduce new device binding path using pci_dev.driver_override
Conflicts:
drivers/pci/pci-sysfs.c
pcibios_penalize_isa_irq() is only implemented by x86 now, and legacy ISA
is not used by some architectures. Make pcibios_penalize_isa_irq() a
__weak function to simplify the code. This removes the need for new
platforms to add stub implementations of pcibios_penalize_isa_irq().
[bhelgaas: changelog, comments]
Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Notify a PCI device driver when its device's access is about to be disabled
for an impending reset attempt, then after the attempt completes and device
access is restored. The notification is via the pci_error_handlers
interface.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
change made to resolve following checkpatch message:
drivers/pci/pci.c:109: ERROR: "foo* bar" should be "foo *bar"
branch: Linux 3.14-rc8
Signed-off-by: Ryan Desfosses <ryan@desfo.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
The problem was initially reported by Wendy who tried pass through
IPR adapter, which was connected to PHB root port directly, to KVM
based guest. When doing that, pci_reset_bridge_secondary_bus() was
called by VFIO driver and linkDown was detected by the root port.
That caused all PEs to be frozen.
The patch fixes the issue by routing the reset for the secondary bus
of root port to underly firmware. For that, one more weak function
pci_reset_secondary_bus() is introduced so that the individual platforms
can override that and do specific reset for bridge's secondary bus.
Reported-by: Wendy Xiong <wenxiong@linux.vnet.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
3448a19da4 "vgaarb: use bridges to control VGA routing where possible"
added the "flags & PCI_VGA_STATE_CHANGE_DECODES" condition to an existing
WARN_ON(), but used bitwise AND (&) instead of logical AND (&&), so the
condition is never true. Replace with logical AND.
Found by Coverity (CID 142811).
Fixes: 3448a19da4 "vgaarb: use bridges to control VGA routing where possible"
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: David Airlie <airlied@redhat.com>
* pci/resource: (26 commits)
Revert "[PATCH] Insert GART region into resource map"
PCI: Log IDE resource quirk in dmesg
PCI: Change pci_bus_alloc_resource() type_mask to unsigned long
PCI: Check all IORESOURCE_TYPE_BITS in pci_bus_alloc_from_region()
resources: Set type in __request_region()
PCI: Don't check resource_size() in pci_bus_alloc_resource()
s390/PCI: Use generic pci_enable_resources()
tile PCI RC: Use default pcibios_enable_device()
sparc/PCI: Use default pcibios_enable_device() (Leon only)
sh/PCI: Use default pcibios_enable_device()
microblaze/PCI: Use default pcibios_enable_device()
alpha/PCI: Use default pcibios_enable_device()
PCI: Add "weak" generic pcibios_enable_device() implementation
PCI: Don't enable decoding if BAR hasn't been assigned an address
PCI: Mark 64-bit resource as IORESOURCE_UNSET if we only support 32-bit
PCI: Don't try to claim IORESOURCE_UNSET resources
PCI: Check IORESOURCE_UNSET before updating BAR
PCI: Don't clear IORESOURCE_UNSET when updating BAR
PCI: Mark resources as IORESOURCE_UNSET if we can't assign them
PCI: Remove pci_find_parent_resource() use for allocation
...
Many architectures implement pcibios_enable_device() the same way, so
provide a default implementation in the core.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Andreas reported that after 1f42db786b ("PCI: Enable INTx if BIOS left
them disabled"), pciehp surprise removal stopped working.
This happens because pci_reenable_device() on the hotplug bridge (used in
the pciehp_configure_device() path) clears the Interrupt Disable bit, which
apparently breaks the bridge's MSI hotplug event reporting.
Previously we cleared the Interrupt Disable bit in do_pci_enable_device(),
which is used by both pci_enable_device() and pci_reenable_device(). But
we use pci_reenable_device() after the driver may have enabled MSI or
MSI-X, and we *set* Interrupt Disable as part of enabling MSI/MSI-X.
This patch clears Interrupt Disable only when MSI/MSI-X has not been
enabled.
Fixes: 1f42db786b PCI: Enable INTx if BIOS left them disabled
Link: https://bugzilla.kernel.org/show_bug.cgi?id=71691
Reported-and-tested-by: Andreas Noever <andreas.noever@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: stable@vger.kernel.org
CC: Sarah Sharp <sarah.a.sharp@linux.intel.com>
When assigning addresses to resources, mark them with IORESOURCE_UNSET
before we start and clear IORESOURCE_UNSET if assignment is successful.
That means that if we print the resource during assignment, we will show
the size, not a meaningless address.
Also, clear IORESOURCE_UNSET if we do assign an address, so we print the
address when it is valid.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
If the resource hasn't been allocated yet, pci_find_parent_resource() is
documented as returning the region "where it should be allocated from."
This is impossible in general because there may be several candidates: a
prefetchable BAR can be put in either a prefetchable or non-prefetchable
window, a transparent bridge may have overlapping positively- and
subtractively-decoded windows, and a root bus may have several windows of
the same type.
Allocation should be done by pci_bus_alloc_resource(), which iterates
through all bus resources and looks for the best match, e.g., one with the
desired prefetchability attributes, and falls back to less-desired
possibilities.
The only valid use of pci_find_parent_resource() is to find the parent of
an already-allocated resource so we can claim it via request_resource(),
and all we need for that is a bus region of the correct type that contains
the resource.
Note that like 8c8def26bf ("PCI: allow matching of prefetchable resources
to non-prefetchable windows"), this depends on pci_bus_for_each_resource()
iterating through positively-decoded regions before subtractively-decoded
ones. We prefer not to return a subtractively-decoded region because
requesting from it will likely conflict with the overlapping positively-
decoded window (see Launchpad report below).
Link: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/424142
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Linus Torvalds <torvalds@linux-foundation.org>
* pci/misc:
PCI: Enable INTx if BIOS left them disabled
ia64/PCI: Set IORESOURCE_ROM_SHADOW only for the default VGA device
x86/PCI: Set IORESOURCE_ROM_SHADOW only for the default VGA device
PCI: Update outdated comment for pcibios_bus_report_status()
PCI: Cleanup per-arch list of object files
PCI: cpqphp: Fix hex vs decimal typo in cpqhpc_probe()
x86/PCI: Fix function definition whitespace
x86/PCI: Reword comments
x86/PCI: Remove unnecessary local variable initialization
PCI: Remove unnecessary list_empty(&pci_pme_list) check
Some firmware leaves the Interrupt Disable bit set even if the device uses
INTx interrupts. Clear Interrupt Disable so we get those interrupts.
Based on the report mentioned below, if the user selects the "EHCI only"
option in the Intel Baytrail BIOS, the EHCI device is handed off to the OS
with the PCI_COMMAND_INTX_DISABLE bit set.
Link: http://lkml.kernel.org/r/20140114181721.GC12126@xanatos
Link: https://bugzilla.kernel.org/show_bug.cgi?id=70601
Reported-by: Chris Cheng <chris.cheng@atrustcorp.com>
Reported-and-tested-by: Jamie Chen <jamie.chen@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: stable@vger.kernel.org
CC: Sarah Sharp <sarah.a.sharp@linux.intel.com>
* pci/list-for-each-entry:
PCI: Remove pci_bus_b() and use list_for_each_entry() directly
pcmcia: Use list_for_each_entry() for bus traversal
powerpc/PCI: Use list_for_each_entry() for bus traversal
drm: Use list_for_each_entry() for bus traversal
ARM/PCI: Use list_for_each_entry() for bus traversal
ACPI / hotplug / PCI: Use list_for_each_entry() for bus traversal
Replace list_for_each() with list_for_each_entry(), which means we no
longer need pci_bus_b() and can remove it.
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Some devices support PCI ACS-like features, but don't report it using the
standard PCIe capabilities. We already provide hooks for device-specific
testing of ACS, but not for device-specific enabling of ACS. This provides
that setup hook.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
list_for_each_entry() handles empty lists just fine, so there's no need to
check whether the list is empty first.
No functional change.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rjw@rjwysocki.net>
When doing a function/slot/bus reset PCI grabs the device_lock for each
device to block things like suspend and driver probes, but call paths exist
where this lock may already be held. This creates an opportunity for
deadlock. For instance, vfio allows userspace to issue resets so long as
it owns the device(s). If a driver unbind .remove callback races with
userspace issuing a reset, we have a deadlock as userspace gets stuck
waiting on device_lock while another thread has device_lock and waits for
.remove to complete. To resolve this, we can make a version of the reset
interfaces which use trylock. With this, we can safely attempt a reset and
return error to userspace if there is contention.
[bhelgaas: the deadlock happens when A (userspace) has a file descriptor for
the device, and B waits in this path:
driver_detach
device_lock # take device_lock
__device_release_driver
pci_device_remove # pci_bus_type.remove
vfio_pci_remove # pci_driver .remove
vfio_del_group_dev
wait_event(vfio.release_q, !vfio_dev_present) # wait (holding device_lock)
Now B is stuck until A gives up the file descriptor. If A tries to acquire
device_lock for any reason, we deadlock because A is waiting for B to release
the lock, and B is waiting for A to release the file descriptor.]
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Using 'make namespacecheck' identify code which should be declared static.
Checked for users in other driver/archs as well. Compile tested only.
This stops exporting the following interfaces to modules:
pci_target_state()
pci_load_saved_state()
[bhelgaas: retained pci_find_next_ext_capability() and pci_cfg_space_size()]
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
My philosophy is unused code is dead code. And dead code is subject to bit
rot and is a likely source of bugs. Use it or lose it.
This reverts b48d4425b6 ("PCI: add ID-based ordering enable/disable
support"), removing these interfaces:
pci_enable_ido()
pci_disable_ido()
[bhelgaas: split to separate patch, also remove prototypes from pci.h]
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Jesse Barnes <jbarnes@virtuousgeek.org>
My philosophy is unused code is dead code. And dead code is subject to bit
rot and is a likely source of bugs. Use it or lose it.
This reverts 48a92a8179 ("PCI: add OBFF enable/disable support"),
removing these interfaces:
pci_enable_obff()
pci_disable_obff()
[bhelgaas: split to separate patch, also remove prototypes from pci.h]
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Jesse Barnes <jbarnes@virtuousgeek.org>
My philosophy is unused code is dead code. And dead code is subject to bit
rot and is a likely source of bugs. Use it or lose it.
This reverts 51c2e0a7e5 ("PCI: add latency tolerance reporting
enable/disable support"), removing these interfaces:
pci_enable_ltr()
pci_disable_ltr()
pci_set_ltr()
[bhelgaas: split to separate patch, also remove prototypes from pci.h]
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Jesse Barnes <jbarnes@virtuousgeek.org>
While we don't really have any infrastructure for making use of VC
support, the system BIOS can configure the topology to non-default
VC values prior to boot. This may be due to silicon bugs, desire to
reserve traffic classes, or perhaps just BIOS bugs. When we reset
devices, the VC configuration may return to default values, which can
be incompatible with devices upstream. For instance, Nvidia GRID
cards provide a PCIe switch and some number of GPUs, all supporting
VC. The power-on default for VC is to support TC0-7 across VC0,
however some platforms will only enable TC0/VC0 mapping across the
topology. When we do a secondary bus reset on the downstream switch
port, the GPU is reset to a TC0-7/VC0 mapping while the opposite end
of the link only enables TC0/VC0. If the GPU attempts to use TC1-7,
it fails.
This patch attempts to provide complete support for VC save/restore,
even beyond the minimally required use case above. This includes
save/restore and reload of the arbitration table, save/restore and
reload of the port arbitration tables, and re-enabling of the
channels for VC, VC9, and MFVC capabilities.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Current save/restore is specific to standard capabilities.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
We currently have two instance of this loop which waits for a pending bit
to clear in a status dword. Generalize the function for future users.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Modify tg3_chip_reset() and tg3_close() to check if the PCI network
adapter device is accessible at all in order to skip poking it or
trying to handle a carrier loss in vain when that's not the case.
Introduce a special PCI helper function pci_device_is_present()
for this purpose.
Of course, this uncovers the lack of the appropriate RTNL locking
in tg3_suspend() and tg3_resume(), so add that locking in there
too.
These changes prevent tg3 from burning a CPU at 100% load level for
solid several seconds after the Thunderbolt link is disconnected from
a Matrox DS1 docking station.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix whitespace, capitalization, and spelling errors. No functional change.
I know "busses" is not an error, but "buses" was more common, so I used it
consistently.
Signed-off-by: Marta Rybczynska <rybczynska@gmail.com> (pci_reset_bridge_secondary_bus())
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* pci/misc:
PCI: Enable upstream bridges even for VFs on virtual buses
PCI: Add pci_upstream_bridge()
PCI: Add x86_msi.msi_mask_irq() and msix_mask_irq()
Previously we enabled the upstream PCI-to-PCI bridge only when
"dev->bus->self != NULL". In the case of a VF on a virtual bus, where
"bus->self == NULL", we didn't enable the upstream bridge.
This fixes that by enabling the upstream bridge of the PF corresponding to
the VF.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
* pci/misc:
PCI: Warn on driver probe return value greater than zero
PCI: Drop warning about drivers that don't use pci_set_master()
PCI: Workaround missing pci_set_master in pci drivers
PCI: Update pcie_ports 'auto' behavior for non-ACPI platforms
f41f064cf4 ("PCI: Workaround missing pci_set_master in pci drivers") made
pci_enable_bridge() turn on bus mastering if the driver hadn't done so
already. It also added a warning in this case. But there's no reason to
warn about it unless it's actually a problem to enable bus mastering here.
This patch drops the warning because I'm not aware of any such problem.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Paul Bolle <pebolle@tiscali.nl>
Ben Herrenschmidt found that commit 928bea9648 ("PCI: Delay enabling
bridges until they're needed") breaks PCI in some powerpc environments.
The reason is that the PCIe port driver will call pci_enable_device() on
the bridge, so the device is enabled, but skips pci_set_master because
pcie_port_auto and no acpi on powerpc.
Because of that, pci_enable_bridge() later on (called as a result of the
child device driver doing pci_enable_device) will see the bridge as
already enabled and will not call pci_set_master() on it.
Fixed by add checking in pci_enable_bridge, and call pci_set_master
if driver skip that.
That will make the code more robot and wade off problem for missing
pci_set_master in drivers.
Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* pci/yijing-mps-v1:
drm/radeon: use pcie_get_readrq() and pcie_set_readrq() to simplify code
staging: et131x: Use pci_dev->pcie_mpss and pcie_set_readrq() to simplify code
IB/qib: Drop qib_tune_pcie_caps() and qib_tune_pcie_coalesce() return values
IB/qib: Use pcie_set_mps() and pcie_get_mps() to simplify code
IB/qib: Use pci_is_root_bus() to check whether it is a root bus
tile/PCI: use cached pci_dev->pcie_mpss to simplify code
PCI: Export pcie_set_mps() and pcie_get_mps()
Previously, if kmalloc() failed, we claimed "PME# enabled" in dmesg,
even though we didn't add the device to the pci_pme_list. This prints
a more correct warning.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Export pcie_get_mps() and pcie_set_mps() functions so drivers can use
them to simplify code.
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Pull networking changes from David Miller:
"Noteworthy changes this time around:
1) Multicast rejoin support for team driver, from Jiri Pirko.
2) Centralize and simplify TCP RTT measurement handling in order to
reduce the impact of bad RTO seeding from SYN/ACKs. Also, when
both timestamps and local RTT measurements are available prefer
the later because there are broken middleware devices which
scramble the timestamp.
From Yuchung Cheng.
3) Add TCP_NOTSENT_LOWAT socket option to limit the amount of kernel
memory consumed to queue up unsend user data. From Eric Dumazet.
4) Add a "physical port ID" abstraction for network devices, from
Jiri Pirko.
5) Add a "suppress" operation to influence fib_rules lookups, from
Stefan Tomanek.
6) Add a networking development FAQ, from Paul Gortmaker.
7) Extend the information provided by tcp_probe and add ipv6 support,
from Daniel Borkmann.
8) Use RCU locking more extensively in openvswitch data paths, from
Pravin B Shelar.
9) Add SCTP support to openvswitch, from Joe Stringer.
10) Add EF10 chip support to SFC driver, from Ben Hutchings.
11) Add new SYNPROXY netfilter target, from Patrick McHardy.
12) Compute a rate approximation for sending in TCP sockets, and use
this to more intelligently coalesce TSO frames. Furthermore, add
a new packet scheduler which takes advantage of this estimate when
available. From Eric Dumazet.
13) Allow AF_PACKET fanouts with random selection, from Daniel
Borkmann.
14) Add ipv6 support to vxlan driver, from Cong Wang"
Resolved conflicts as per discussion.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1218 commits)
openvswitch: Fix alignment of struct sw_flow_key.
netfilter: Fix build errors with xt_socket.c
tcp: Add missing braces to do_tcp_setsockopt
caif: Add missing braces to multiline if in cfctrl_linkup_request
bnx2x: Add missing braces in bnx2x:bnx2x_link_initialize
vxlan: Fix kernel panic on device delete.
net: mvneta: implement ->ndo_do_ioctl() to support PHY ioctls
net: mvneta: properly disable HW PHY polling and ensure adjust_link() works
icplus: Use netif_running to determine device state
ethernet/arc/arc_emac: Fix huge delays in large file copies
tuntap: orphan frags before trying to set tx timestamp
tuntap: purge socket error queue on detach
qlcnic: use standard NAPI weights
ipv6:introduce function to find route for redirect
bnx2x: VF RSS support - VF side
bnx2x: VF RSS support - PF side
vxlan: Notify drivers for listening UDP port changes
net: usbnet: update addr_assign_type if appropriate
driver/net: enic: update enic maintainers and driver
driver/net: enic: Exposing symbols for Cisco's low latency driver
...
* pci/misc:
PCI: Remove pcie_cap_has_devctl()
PCI: Support PCIe Capability Slot registers only for ports with slots
PCI: Remove PCIe Capability version checks
PCI: Allow PCIe Capability link-related register access for switches
PCI: Add offsets of PCIe capability registers
PCI: Tidy bitmasks and spacing of PCIe capability definitions
PCI: Remove obsolete comment reference to pci_pcie_cap2()
PCI: Clarify PCI_EXP_TYPE_PCI_BRIDGE comment
PCI: Rename PCIe capability definitions to follow convention
PCI: Disable decoding for BAR sizing only when it was actually enabled
PCI: Add comment about needing pci_msi_off() even when CONFIG_PCI_MSI=n
PCI: Add pcibios_pm_ops for optional arch-specific hibernate functionality
All other PCIe capability register fields include "PCI_EXP" + <reg-name> +
<field-name>. This renames PCI_EXP_OBFF_MASK, PCI_EXP_IDO_REQ_EN,
PCI_EXP_LTR_EN, and related fields using the same convention.
No functional change.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Samuel Ortiz <sameo@linux.intel.com> # for MFD driver
* pci/yinghai-assign-unassigned-v6:
PCI: Assign resources for hot-added host bridge more aggressively
PCI: Move resource reallocation code to non-__init
PCI: Delay enabling bridges until they're needed
PCI: Assign resources on a per-bus basis
PCI: Enable unassigned resource reallocation on per-bus basis
PCI: Turn on reallocation for unassigned resources with host bridge offset
PCI: Look for unassigned resources on per-bus basis
PCI: Drop temporary variable in pci_assign_unassigned_resources()
Per f5f2b13129 ("msi: sanely support hardware level msi disabling"), we
want pci_msi_off() to work even if MSI support is not compiled into the
kernel, and there are existing callers that use it when CONFIG_PCI_MSI=n.
This adds a comment to that effect.
No functional change.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
After 59875ae489 ("PCI/core: Use PCI Express Capability accessors"),
pcie_get_mps() never returns an error, so don't bother to check for it.
No functional change.
[bhelgaas: changelog, fix pcie_get_mps() doc]
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Users of pci_reset_bus() and pci_reset_slot() need a way to probe
whether the bus or slot supports reset. Add trivial helper functions
and export them as vfio-pci will make use of these.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The PCI spec indicates that with stable power, reset needs to be
asserted for a minimum of 1ms (Trst). We should be able to assume
stable power for a Hot Reset, but we add another millisecond as
a fudge factor to make sure the reset is seen on the bus for at least
a full 1ms.
After reset is de-asserted we must wait for devices to complete
initialization. The specs refer to this as "recovery time" (Trhfa).
For PCI this is 2^25 clock cycles or 2^26 for PCI-X. For minimum
bus speeds, both of those come to 1s. PCIe "softens" this
requirement with the Configuration Request Retry Status (CRS)
completion status. Theoretically we could use CRS to shorten the
wait time. We don't make use of that here, using a fixed 1s delay
to allow devices to re-initialize.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Devices come out of reset in D0. Restoring a device to a different
post-reset state takes more smarts than our simple config space
restore, which can leave devices in an inconsistent state. For
example, if a device is reset in D3, but the restore doesn't
successfully return the device to D3, then the actual state of the
device and dev->current_state are contradictory. Put everything
in D0 going into the reset, then we don't need to do anything
special on the way out.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Sometimes pci_reset_function() is not sufficient. We have cases where
devices do not support any kind of reset, but there might be multiple
functions on the bus preventing pci_reset_function() from doing a
secondary bus reset. We also have cases where a device will advertise
that it supports a PM reset, but really does nothing on D3hot->D0
(graphics cards are notorious for this). These devices often also
have more than one function, so even blacklisting PM reset for them
wouldn't allow a secondary bus reset through pci_reset_function().
If a driver supports multiple devices it should have the ability to
induce a bus reset when it needs to. This patch provides that ability
through pci_reset_slot() and pci_reset_bus(). It's the caller's
responsibility when using these interfaces to understand that all of
the devices in or below the slot (or on or below the bus) will be
reset and therefore should be under control of the caller. PCI state
of all the affected devices is saved and restored around these resets,
but internal state of all of the affected devices is reset (which
should be the intention).
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Only cosmetic code changes to existing paths. Expand the comment in
the new pci_dev_save_and_disable() function since there's a lot
hidden in that Command register write.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
If the hotplug controller provides a way to reset a slot, use that
before a direct parent bus reset. Like the bus reset option, this is
only available when a single pci_dev occupies the slot.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
* pci/vipul-chelsio-reset-v2:
PCI: Use pci_wait_for_pending_transaction() instead of for loop
bnx2x: Use pci_wait_for_pending_transaction() instead of for loop
PCI: Chelsio quirk: Enable Bus Master during Function-Level Reset
PCI: Add pci_wait_for_pending_transaction()
New routine to avoid duplication of code to wait for pending PCI
transactions to complete.
Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Move the secondary bus reset code from pci_parent_bus_reset() into its own
function. Export it as we'll later be calling it from hotplug controllers
and elsewhere.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
* pci/misc:
PCI: Fix comment typo for pci_add_cap_save_buffer()
PCI: Return -ENOSYS for SR-IOV operations on non-SR-IOV devices
PCI: Update NumVFs register when disabling SR-IOV
x86/PCI: MMCONFIG: Check earlier for MMCONFIG region at address zero
PCI: Convert class code to use dev_groups
frv/PCI: Mark pcibios_fixup_bus() as non-init
x86/pci/mrst: Cleanup checkpatch.pl warnings
PCI: Rename "PCI Express support" kconfig title
PCI: Fix comment typo in iov.c
A PCI Express device can potentially report a link width and speed which it will
not properly fulfill due to being plugged into a slower link higher in the
chain. This function walks up the PCI bus chain and calculates the minimum link
width and speed of this entire chain. This can be useful to enable a device to
determine if it has enough bandwidth for optimum functionality.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
We currently enable PCI bridges after scanning a bus and assigning
resources. This is often done in arch code.
This patch changes this so we don't enable a bridge until necessary, i.e.,
until we enable a PCI device behind the bridge. We do this in the generic
pci_enable_device() path, so this also removes the arch-specific code to
enable bridges.
[bhelgaas: changelog]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
We currently misinterpret that in order for an ACS feature to be
enabled it must be set in the control field. In reality, this means
that the feature is not only enabled, but controllable. Many of the
ACS capability bits are not required if the device behaves by default
in the way specified when both the capability and control bit are set
and does not support or allow the alternate mode. We therefore need
to check the capabilities and mask out flags that are enabled but not
controllable. Egress control seems to be the only flag which is
purely optional.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Donald Dutile <ddutile@redhat.com>
The multifunction ACS rules do not apply to downstream ports. Those
should be tested regardless of whether they are single function or
multifunction. The PCIe spec also fully specifies which PCIe types
are subject to the multifunction rules and excludes event collectors
and PCIe-to-PCI bridges entirely. Document each rule to the section
of the PCIe spec and provide overall documentation of the function.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Donald Dutile <ddutile@redhat.com>
PCI PM cap register offset has been saved in pci_pm_init(),
so we can use pdev->pm_cap instead of using pci_find_capability(..)
here.
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Platforms may want to provide architecture-specific functionality when
a PCI device is released. Add a pcibios_release_device() call that
architectures can override to do so.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The INTx pin should be INIT[ABCD]. Fix the typo "3=INTC".
Signed-off-by: Wang Sheng-Hui <shhuiw@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Make pci_pme_active() ignore devices without PME support, so that
it doesn't print the "PME enabled" or "PME disabled" debug messages
for devices that don't support PME.
So that pci_pme_active() doesn't have to check pm_cap in addition
to pme_support, make pci_pm_init() clear pme_support upfront to
make sure that it will be 0 for pm_cap equal to 0.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Commit b51306c (PCI: Set device power state to PCI_D0 for device
without native PM support) modified pci_platform_power_transition()
by adding code causing dev->current_state for devices that don't
support native PCI PM but are power-manageable by the platform to be
changed to PCI_D0 regardless of the value returned by the preceding
platform_pci_set_power_state(). In particular, that also is done
if the platform_pci_set_power_state() has been successful, which
causes the correct power state of the device set by
pci_update_current_state() in that case to be overwritten by PCI_D0.
Fix that mistake by making the fallback to PCI_D0 only happen if
the platform_pci_set_power_state() has returned an error.
[bhelgaas: folded in Yinghai's simplification, added URL & stable info]
Reference: http://lkml.kernel.org/r/27806FC4E5928A408B78E88BBC67A2306F466BBA@ORSMSX101.amr.corp.intel.com
Reported-by: Chris J. Benenati <chris.j.benenati@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Cc: <stable@vger.kernel.org> # v3.2+
I'm not sure why, but the hlist for each entry iterators were conceived
list_for_each_entry(pos, head, member)
The hlist ones were greedy and wanted an extra parameter:
hlist_for_each_entry(tpos, pos, head, member)
Why did they need an extra pos parameter? I'm not quite sure. Not only
they don't really need it, it also prevents the iterator from looking
exactly like the list iterator, which is unfortunate.
Besides the semantic patch, there was some manual work required:
- Fix up the actual hlist iterators in linux/list.h
- Fix up the declaration of other iterators based on the hlist ones.
- A very small amount of places were using the 'node' parameter, this
was modified to use 'obj->member' instead.
- Coccinelle didn't handle the hlist_for_each_entry_safe iterator
properly, so those had to be fixed up manually.
The semantic patch which is mostly the work of Peter Senna Tschudin is here:
@@
iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;
type T;
expression a,c,d,e;
identifier b;
statement S;
@@
-T b;
<+... when != b
(
hlist_for_each_entry(a,
- b,
c, d) S
|
hlist_for_each_entry_continue(a,
- b,
c) S
|
hlist_for_each_entry_from(a,
- b,
c) S
|
hlist_for_each_entry_rcu(a,
- b,
c, d) S
|
hlist_for_each_entry_rcu_bh(a,
- b,
c, d) S
|
hlist_for_each_entry_continue_rcu_bh(a,
- b,
c) S
|
for_each_busy_worker(a, c,
- b,
d) S
|
ax25_uid_for_each(a,
- b,
c) S
|
ax25_for_each(a,
- b,
c) S
|
inet_bind_bucket_for_each(a,
- b,
c) S
|
sctp_for_each_hentry(a,
- b,
c) S
|
sk_for_each(a,
- b,
c) S
|
sk_for_each_rcu(a,
- b,
c) S
|
sk_for_each_from
-(a, b)
+(a)
S
+ sk_for_each_from(a) S
|
sk_for_each_safe(a,
- b,
c, d) S
|
sk_for_each_bound(a,
- b,
c) S
|
hlist_for_each_entry_safe(a,
- b,
c, d, e) S
|
hlist_for_each_entry_continue_rcu(a,
- b,
c) S
|
nr_neigh_for_each(a,
- b,
c) S
|
nr_neigh_for_each_safe(a,
- b,
c, d) S
|
nr_node_for_each(a,
- b,
c) S
|
nr_node_for_each_safe(a,
- b,
c, d) S
|
- for_each_gfn_sp(a, c, d, b) S
+ for_each_gfn_sp(a, c, d) S
|
- for_each_gfn_indirect_valid_sp(a, c, d, b) S
+ for_each_gfn_indirect_valid_sp(a, c, d) S
|
for_each_host(a,
- b,
c) S
|
for_each_host_safe(a,
- b,
c, d) S
|
for_each_mesh_entry(a,
- b,
c, d) S
)
...+>
[akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
[akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
[akpm@linux-foundation.org: checkpatch fixes]
[akpm@linux-foundation.org: fix warnings]
[akpm@linux-foudnation.org: redo intrusive kvm changes]
Tested-by: Peter Senna Tschudin <peter.senna@gmail.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* pci/konstantin-runtime-pm:
PCI/PM: Clear state_saved during suspend
PCI: Use atomic_inc_return() rather than atomic_add_return()
PCI: Catch attempts to disable already-disabled devices
PCI: Disable Bus Master unconditionally in pci_device_shutdown()
No functional change; just use atomic_inc_return() rather than the
general-purpose atomic_add_return().
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Warn when disabling a device that has already been disabled.
[bhelgaas: message wording]
Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* pci/yijing-ari:
PCI: shpchp: Iterate over all devices in slot, not functions 0-7
PCI: sgihp: Iterate over all devices in slot, not functions 0-7
PCI: cpcihp: Iterate over all devices in slot, not functions 0-7
PCI: pciehp: Iterate over all devices in slot, not functions 0-7
PCI: Consolidate "next-function" functions
PCI: Rename pci_enable_ari() to pci_configure_ari()
PCI: Enable ARI if dev and upstream bridge support it; disable otherwise
pci_enable_ari() now supports enabling or disabling ARI forwarding. So
rename pci_enable_ari() to pci_configure_ari() for easy understanding.
No functional change.
[bhelgaas: changelog]
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
pci_reassigndev_resource_alignment() is the only user of
pci_is_reassigndev(). If we just use pci_specified_resource_alignment()
directly, we only need to call it once instead of twice, and we can get
rid of pci_is_reassigndev() altogether. No functional change.
[bhelgaas: changelog]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Currently, we enable ARI in a device's upstream bridge if the bridge and
the device support it. But we never disable ARI, even if the device is
removed and replaced with a device that doesn't support ARI.
This means that if we hot-remove an ARI device and replace it with a
non-ARI multi-function device, we find only function 0 of the new device
because the upstream bridge still has ARI enabled, and next_ari_fn()
only returns function 0 for the new non-ARI device.
This patch disables ARI in the upstream bridge if the device doesn't
support ARI. See the PCIe spec, r3.0, sec 6.13.
[bhelgaas: changelog, function comment]
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
* pci/yinghai-survey-resources+acpi-scan:
ACPI / scan: Treat power resources in a special way
ACPI: Remove unused struct acpi_pci_root.id member
ACPI: Drop ACPI device .bind() and .unbind() callbacks
ACPI / PCI: Move the _PRT setup and cleanup code to pci-acpi.c
ACPI / PCI: Rework the setup and cleanup of device wakeup
ACPI: Add .setup() and .cleanup() callbacks to struct acpi_bus_type
ACPI: Make acpi_bus_scan() and acpi_bus_add() take only one argument
ACPI: Replace ACPI device add_type field with a match_driver flag
ACPI: Drop the second argument of acpi_bus_scan()
ACPI: Remove the arguments of acpi_bus_add() that are not used
ACPI: Remove acpi_start_single_object() and acpi_bus_start()
ACPI / PCI: Fold acpi_pci_root_start() into acpi_pci_root_add()
ACPI: Change the ordering of acpi_bus_check_add()
ACPI: Replace struct acpi_bus_ops with enum type
ACPI: Reduce the usage of struct acpi_bus_ops
ACPI: Make acpi_bus_add() and acpi_bus_start() visibly different
ACPI: Change the ordering of PCI root bridge driver registrarion
ACPI: Separate adding ACPI device objects from probing ACPI drivers
__pci_enable_device_flags() takes values like IORESOURCE_IO and
IORESOURCE_MEM, which are values for struct resource.flags, which is
"unsigned long", not "resource_size_t".
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Currently, the ACPI wakeup capability of PCI devices is set up
in two different places, partially in acpi_pci_bind() where
runtime wakeup is initialized and partially in
platform_pci_wakeup_init(), where system wakeup is initialized.
The cleanup is only done in acpi_pci_unbind() and it only covers
runtime wakeup.
Use the new .setup() and .cleanup() callbacks in struct acpi_bus_type
to consolidate that code and do the setup and the cleanup each in one
place.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Toshi Kani <toshi.kani@hp.com>
Host bridge hotplug:
- Untangle _PRT from struct pci_bus (Bjorn Helgaas)
- Request _OSC control before scanning root bus (Taku Izumi)
- Assign resources when adding host bridge (Yinghai Lu)
- Remove root bus when removing host bridge (Yinghai Lu)
- Remove _PRT during hot remove (Yinghai Lu)
SRIOV
- Add sysfs knobs to control numVFs (Don Dutile)
Power management
- Notify devices when power resource turned on (Huang Ying)
Bug fixes
- Work around broken _SEG on HP xw9300 (Bjorn Helgaas)
- Keep runtime PM enabled for unbound PCI devices (Huang Ying)
- Fix Optimus dual-GPU runtime D3 suspend issue (Dave Airlie)
- Fix xen frontend shutdown issue (David Vrabel)
- Work around PLX PCI 9050 BAR alignment erratum (Ian Abbott)
Miscellaneous
- Add GPL license for drivers/pci/ioapic (Andrew Cooks)
- Add standard PCI-X, PCIe ASPM register #defines (Bjorn Helgaas)
- NumaChip remote PCI support (Daniel Blueman)
- Fix PCIe Link Capabilities Supported Link Speed definition (Jingoo Han)
- Convert dev_printk() to dev_info(), etc (Joe Perches)
- Add support for non PCI BAR ROM data (Matthew Garrett)
- Add x86 support for host bridge translation offset (Mike Yoknis)
- Report success only when every driver supports AER (Vijay Pandarathil)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
iQIcBAABAgAGBQJQyKwSAAoJEPGMOI97Hn6zScgQAJZK2VDfCv74mKrgSDNokIzH
5nVDrc9AHKJm7CUODs6keJK5d4TD/za3Zao68zrYHsJJKes2ni2Z3W34HP2RXKK2
eOmePXOHYPPZMlimP9r9cVxNu1ZJCyp/yWSBcsPF4zUgWhBWLRaSj85I049gQ0sz
+05nZYfLjVd3HNiaXsG4CQyMrNF46XEsLhF9vs+Nr2GHPwrpzhfScgYv63oDS86C
3ICKsjmiRUZcNelxIFYmyxa5u89QdW5XHjzc9eHGQuus24Vxw+TZzsdfc17sUJEE
HTyXY+RjDpOVhdtwwUjrCEOiyZYvy3g9+3sKxoxgt/76ghdUaR7fxITwB97qVMFD
T0ESlKjSV/Qv5QYdyy5uP4zwNs/PXCWXkTg/L1m71F30BxKWDa7tgiA6uK7Z7fl5
1aokKBdk3mtJJJIDJG1YkxPXx/JItTGCNYrx7CcFj49rSjrUWLQdmrYahersRIsB
3wiD2xTi9e4dXeP/+VGzGOWB/sHk+73jvrvZe/REa1FCnMINDz4+9V9WaGROMqyq
MQ8kX0KfYcNVNxy1GOXjU5wLpMN/t/QbvI7gwzRP1DAUCJPoOgFy7AjvSTVG3zuy
8CtdOFttVkUn5dqsbQR0gVbyQVTS3PGSKz5XC/s8kVDWhja0xZTBYwrskM/4zdSD
Xf48OyYV5EjpC3FYUSiU
=OE3Q
-----END PGP SIGNATURE-----
Merge tag 'for-3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI update from Bjorn Helgaas:
"Host bridge hotplug:
- Untangle _PRT from struct pci_bus (Bjorn Helgaas)
- Request _OSC control before scanning root bus (Taku Izumi)
- Assign resources when adding host bridge (Yinghai Lu)
- Remove root bus when removing host bridge (Yinghai Lu)
- Remove _PRT during hot remove (Yinghai Lu)
SRIOV
- Add sysfs knobs to control numVFs (Don Dutile)
Power management
- Notify devices when power resource turned on (Huang Ying)
Bug fixes
- Work around broken _SEG on HP xw9300 (Bjorn Helgaas)
- Keep runtime PM enabled for unbound PCI devices (Huang Ying)
- Fix Optimus dual-GPU runtime D3 suspend issue (Dave Airlie)
- Fix xen frontend shutdown issue (David Vrabel)
- Work around PLX PCI 9050 BAR alignment erratum (Ian Abbott)
Miscellaneous
- Add GPL license for drivers/pci/ioapic (Andrew Cooks)
- Add standard PCI-X, PCIe ASPM register #defines (Bjorn Helgaas)
- NumaChip remote PCI support (Daniel Blueman)
- Fix PCIe Link Capabilities Supported Link Speed definition (Jingoo
Han)
- Convert dev_printk() to dev_info(), etc (Joe Perches)
- Add support for non PCI BAR ROM data (Matthew Garrett)
- Add x86 support for host bridge translation offset (Mike Yoknis)
- Report success only when every driver supports AER (Vijay
Pandarathil)"
Fix up trivial conflicts.
* tag 'for-3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (48 commits)
PCI: Use phys_addr_t for physical ROM address
x86/PCI: Add NumaChip remote PCI support
ath9k: Use standard #defines for PCIe Capability ASPM fields
iwlwifi: Use standard #defines for PCIe Capability ASPM fields
iwlwifi: collapse wrapper for pcie_capability_read_word()
iwlegacy: Use standard #defines for PCIe Capability ASPM fields
iwlegacy: collapse wrapper for pcie_capability_read_word()
cxgb3: Use standard #defines for PCIe Capability ASPM fields
PCI: Add standard PCIe Capability Link ASPM field names
PCI/portdrv: Use PCI Express Capability accessors
PCI: Use standard PCIe Capability Link register field names
x86: Use PCI setup data
PCI: Add support for non-BAR ROMs
PCI: Add pcibios_add_device
EFI: Stash ROMs if they're not in the PCI BAR
PCI: Add and use standard PCI-X Capability register names
PCI/PM: Keep runtime PM enabled for unbound PCI devices
xen-pcifront: Handle backend CLOSED without CLOSING
PCI: SRIOV control and status via sysfs (documentation)
PCI/AER: Report success only when every device has AER-aware driver
...
* pci/mjg-pci-roms-from-efi:
x86: Use PCI setup data
PCI: Add support for non-BAR ROMs
PCI: Add pcibios_add_device
EFI: Stash ROMs if they're not in the PCI BAR
Platforms may want to provide architecture-specific functionality during
PCI enumeration. Add a pcibios_add_device() call that architectures can
override to do so.
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Seth Forshee <seth.forshee@canonical.com>
For unbound PCI devices, what we need is:
- Always in D0 state, because some devices do not work again after
being put into D3 by the PCI bus.
- In SUSPENDED state if allowed, so that the parent devices can still
be put into low power state.
To satisfy these requirements, the runtime PM for the unbound PCI
devices are disabled and set to SUSPENDED state. One issue of this
solution is that the PCI devices will be put into SUSPENDED state even
if the SUSPENDED state is forbidden via the sysfs interface
(.../power/control) of the device. This is not an issue for most
devices, because most PCI devices are not used at all if unbound.
But there are exceptions. For example, unbound VGA card can be used
for display, but suspending its parents makes it stop working.
To fix the issue, we keep the runtime PM enabled when the PCI devices
are unbound. But the runtime PM callbacks will do nothing if the PCI
devices are unbound. This way, we can put the PCI devices into
SUSPENDED state without putting the PCI devices into D3 state.
Reference: https://bugzilla.kernel.org/show_bug.cgi?id=48201
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
CC: stable@vger.kernel.org # v3.6+
CONFIG_HOTPLUG is going away as an option so __devexit_p, __devint,
__devinitdata, __devinitconst, and _devexit are no longer needed.
Signed-off-by: Bill Pemberton <wfp5p@virginia.edu>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* pci/taku-prt-cleanup:
PCI/ACPI: Request _OSC control before scanning PCI root bus
PCI: Don't pass pci_dev to pci_ext_cfg_avail()
PCI/ACPI: Add _PRT interrupt routing info before enumerating devices
ACPI: Pass segment/bus to _PRT add/del so they don't depend on pci_bus
There are comments on why PME poll support is necessary for PCI
devices, but not for PCIe devices. That may lead to misunderstanding
that PME poll is only necessary for PCI devices. So add comments
related to PCIe PME poll to make it more clear.
The content of comments comes from the changelog of commit:
379021d5c0
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
pci_ext_cfg_avail() doesn't use the "struct pci_dev *" passed to
it, and there's no requirement that a host bridge even be represented
by a pci_dev. This drops the pci_ext_cfg_avail() parameter.
[bhelgaas: changelog]
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
In https://bugzilla.kernel.org/show_bug.cgi?id=48981
Peter reported that /proc/bus/pci/??/??.? does not work for 3.6.
This is because the device configuration space registers are
not accessible if the corresponding parent bridge is suspended or
the device is put into D3cold state.
This is the same as /sys/bus/pci/devices/0000:??:??.?/config access
issue. So the function used to solve sysfs issue is used to solve
this issue.
This patch moves pci_config_pm_runtime_get()/_put() from pci/pci-sysfs.c
to pci/pci.c and makes them extern so they can be used by both the
sysfs and proc paths.
[bhelgaas: changelog, references, reporters]
Reference: https://bugzilla.kernel.org/show_bug.cgi?id=48981
Reference: https://bugzilla.kernel.org/show_bug.cgi?id=49031
Reported-by: Forrest Loomis <cybercyst@gmail.com>
Reported-by: Peter <lekensteyn@gmail.com>
Reported-by: Micael Dias <kam1kaz3@gmail.com>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
CC: stable@vger.kernel.org # v3.6+
* pci/trivial:
PCI: Drop duplicate const in DECLARE_PCI_FIXUP_SECTION
PCI: Drop bogus default from ARCH_SUPPORTS_MSI
PCI: cpqphp: Remove unreachable path
PCI: Remove bus number resource debug messages
PCI/AER: Print completion message at KERN_INFO to match starting message
PCI: Fix drivers/pci/pci.c kernel-doc warnings
* commit 'v3.6-rc5': (1098 commits)
Linux 3.6-rc5
HID: tpkbd: work even if the new Lenovo Keyboard driver is not configured
Remove user-triggerable BUG from mpol_to_str
xen/pciback: Fix proper FLR steps.
uml: fix compile error in deliver_alarm()
dj: memory scribble in logi_dj
Fix order of arguments to compat_put_time[spec|val]
xen: Use correct masking in xen_swiotlb_alloc_coherent.
xen: fix logical error in tlb flushing
xen/p2m: Fix one-off error in checking the P2M tree directory.
powerpc: Don't use __put_user() in patch_instruction
powerpc: Make sure IPI handlers see data written by IPI senders
powerpc: Restore correct DSCR in context switch
powerpc: Fix DSCR inheritance in copy_thread()
powerpc: Keep thread.dscr and thread.dscr_inherit in sync
powerpc: Update DSCR on all CPUs when writing sysfs dscr_default
powerpc/powernv: Always go into nap mode when CPU is offline
powerpc: Give hypervisor decrementer interrupts their own handler
powerpc/vphn: Fix arch_update_cpu_topology() return value
ARM: gemini: fix the gemini build
...
Conflicts:
drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
drivers/rapidio/devices/tsi721.c
Introduce an inline function pci_pcie_type(dev) to extract PCIe
device type from pci_dev->pcie_flags_reg field, and prepare for
removing pci_dev->pcie_type.
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Some extended capabilities, e.g., the vendor-specific capability, can
occur several times. The existing pci_find_ext_capability() only finds
the first occurrence. This adds pci_find_next_ext_capability(), which
can iterate through all of them.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Fix kernel-doc warnings in drivers/pci/pci.c:
Warning(drivers/pci/pci.c:1550): No description found for parameter 'pci_dev'
Warning(drivers/pci/pci.c:1550): Excess function parameter 'dev' description in 'pci_wakeup'
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
This patch fixes the following bug:
http://marc.info/?l=linux-usb&m=134318961120825&w=2
Originally, device lower power states include D1, D2, D3. After that,
D3 is further divided into D3hot and D3cold. To support both scenario
safely, original D3 is mapped to D3cold.
When adding D3cold support, because worry about some device may have
broken D3cold support, D3cold is disabled by default. This disable D3
on original platform too. But some original platform may only have
working D3, but no working D1, D2. The root cause of the above bug is
it too.
To deal with this, this patch enables D3/D3cold by default for most
devices. This restores the original behavior. For some devices that
suspected to have broken D3cold support, such as PCIe port, D3cold is
disabled by default.
Reported-by: Bjorn Mork <bjorn@mork.no>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rjw@sisk.pl>
* pci/myron-pcibios_setup:
xtensa/PCI: factor out pcibios_setup()
x86/PCI: adjust section annotations for pcibios_setup()
unicore32/PCI: adjust section annotations for pcibios_setup()
tile/PCI: factor out pcibios_setup()
sparc/PCI: factor out pcibios_setup()
sh/PCI: adjust section annotations for pcibios_setup()
sh/PCI: factor out pcibios_setup()
powerpc/PCI: factor out pcibios_setup()
parisc/PCI: factor out pcibios_setup()
MIPS/PCI: adjust section annotations for pcibios_setup()
MIPS/PCI: factor out pcibios_setup()
microblaze/PCI: factor out pcibios_setup()
ia64/PCI: factor out pcibios_setup()
cris/PCI: factor out pcibios_setup()
alpha/PCI: factor out pcibios_setup()
PCI: pull pcibios_setup() up into core
Commit cc2893b6 (PCI: Ensure we re-enable devices on resume)
addressed the problem with USB not being powered after resume on
recent Lenovo machines, but it did that in a suboptimal way.
Namely, it should have changed the relevant code paths only,
which are pci_pm_resume_noirq() and pci_pm_restore_noirq() supposed
to restore the device's power and standard configuration registers
after system resume from suspend or hibernation. Instead, however,
it modified pci_set_power_state() which is executed in several
other situations too. That resulted in some undesirable effects,
like attempting to change a device's power state in the same way
multiple times in a row (up to as many as 4 times in a row in the
snd_hda_intel driver).
Fix the bug addressed by commit cc2893b6 in an alternative way,
by forcibly powering up all devices in pci_pm_default_resume_early(),
which is called by pci_pm_resume_noirq() and pci_pm_restore_noirq()
to restore the device's power and standard configuration registers,
and modifying pci_pm_runtime_resume() to avoid the forcible power-up
if not necessary. Then, revert the changes made by commit cc2893b6
to make the confusion introduced by it go away.
Acked-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Currently, all of the architectures implement their own pcibios_setup()
routine. Most of the implementations do nothing so this patch introduces
a generic (__weak) routine in the core that can be used by all
architectures as a default. If necessary, it can be overridden by
architecture-specific code.
Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
* topic/huang-d3cold-v7:
PCI/PM: add PCIe runtime D3cold support
PCI: do not call pci_set_power_state with PCI_D3cold
PCI/PM: add runtime PM support to PCIe port
ACPI/PM: specify lowest allowed state for device sleep state
This patch adds runtime D3cold support and corresponding ACPI platform
support. This patch only enables runtime D3cold support; it does not
enable D3cold support during system suspend/hibernate.
D3cold is the deepest power saving state for a PCIe device, where its main
power is removed. While it is in D3cold, you can't access the device at
all, not even its configuration space (which is still accessible in D3hot).
Therefore the PCI PM registers can not be used to transition into/out of
the D3cold state; that must be done by platform logic such as ACPI _PR3.
To support wakeup from D3cold, a system may provide auxiliary power, which
allows a device to request wakeup using a Beacon or the sideband WAKE#
signal. WAKE# is usually connected to platform logic such as ACPI GPE.
This is quite different from other power saving states, where devices
request wakeup via a PME message on the PCIe link.
Some devices, such as those in plug-in slots, have no direct platform
logic. For example, there is usually no ACPI _PR3 for them. D3cold
support for these devices can be done via the PCIe Downstream Port leading
to the device. When the PCIe port is powered on/off, the device is powered
on/off too. Wakeup events from the device will be notified to the
corresponding PCIe port.
For more information about PCIe D3cold and corresponding ACPI support,
please refer to:
- PCI Express Base Specification Revision 2.0
- Advanced Configuration and Power Interface Specification Revision 5.0
[bhelgaas: changelog]
Reviewed-by: Rafael J. Wysocki <rjw@sisk.pl>
Originally-by: Zheng Yan <zheng.z.yan@intel.com>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
This patch adds runtime PM support to PCIe port. This is needed by
PCIe D3cold support, where PCIe device without ACPI node may be
powered on/off by PCIe port.
Because runtime suspend is broken for some chipsets, a black list is
used to disable runtime PM support for these chipsets.
Reviewed-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Zheng Yan <zheng.z.yan@intel.com>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
For a valid pci_dev, dev->bus != NULL always, so remove this
unnecessary test.
Found by Coverity (CID 101680).
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
pci_enable_obff() and pci_enable_ltr() incorrectly check "dev->bus" instead
of "dev->bus->self" to determine whether the upstream device is a P2P
bridge or a host bridge. For devices on the root bus, the upstream device
is a host bridge, "dev->bus != NULL" and "dev->bus->self == NULL", and we
panic with a null pointer dereference.
These functions should previously have panicked when called on devices
supporting OBFF or LTR, so they should be regarded as untested.
Found by Coverity (CID 143038 and CID 143039).
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
pci_intx_mask_supported() assumes INTx masking is supported if the
PCI_COMMAND_INTX_DISABLE bit is writable. But when that bit is set,
some devices don't actually mask INTx or update PCI_STATUS_INTERRUPT
as we expect.
This patch adds a way for quirks to identify these broken devices.
[bhelgaas: split out from Chelsio quirk addition]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Replace the struct pci_bus secondary/subordinate members with the
struct resource busn_res. Later we'll build a resource tree of these
bus numbers.
[bhelgaas: changelog]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
In a PCI environment, transactions aren't always required to reach
the root bus before being re-routed. Intermediate switches between
an endpoint and the root bus can redirect DMA back downstream before
things like IOMMUs have a chance to intervene. Legacy PCI is always
susceptible to this as it operates on a shared bus. PCIe added a
new capability to describe and control this behavior, Access Control
Services, or ACS.
The utility function pci_acs_enabled() allows us to test the ACS
capabilities of an individual devices against a set of flags while
pci_acs_path_enabled() tests a complete path from a given downstream
device up to the specified upstream device. We also include the
ability to add device specific tests as it's likely we'll see
devices that do not implement ACS, but want to indicate support
for various capabilities in this space.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Unlike PCI Express v1's Capabilities Structure, v2's requires the entire
structure to be implemented. In v2 structures, register fields that
are not implemented are present but hardwired to 0x0. These may
include: Link Capabilities, Status, and Control; Slot Capabilities,
Status, and Control; Root Capabilities, Status, and Control; and all of
the '2' (Device, Link, and Slot) Capabilities, Status, and Control
registers.
This patch removes the redundant capability checks corresponding to the
Link 2's and Slot 2's, Capabilities, Status, and Control registers as they
will be present if Device Capabilities 2's registers are (which explains
why the macros for each of the three are identical).
Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
This patch resolves potential issues when accessing PCI Express
Capability structures. The makeup of the capability varies
substantially between v1 and v2:
Version 1 of the PCI Express Capability (defined by PCI Express
1.0 and 1.1 base) neither requires the endpoint to implement the
entire PCIe capability structure nor specifies default values of
registers that are not implemented by the device.
Version 2 of the PCI Express Capability (defined by PCIe 1.1
Capability Structure Expansion ECN, PCIe 2.0, 2.1, and 3.0) added
additional registers to the structure and requires all registers
to be either implemented or hardwired to 0.
Due to the differences in the capability structures, code dealing with
capability features must be careful not to access the additional
registers introduced with v2 unless the device is specifically known to
be a v2 capable device. Otherwise, attempts to access non-existant
registers will occur. This is a subtle issue that is hard to track down
when it occurs (and it has - see commit 864d296cf9).
To try and help mitigate such occurrences, this patch introduces
pci_pcie_cap2() which is similar to pci_pcie_cap() but also checks
that the PCIe capability version is >= 2. pci_pcie_cap2() should be
used for qualifying PCIe capability features introduced after v1.
Suggested by Don Dutile.
Acked-by: Donald Dutile <ddutile@redhat.com>
Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
There are a number of redundant pci_is_pcie() checks in various PCI
Express capabilities related routines like the following:
if (!pci_is_pcie(dev))
return false;
pos = pci_pcie_cap(dev);
if (!pos)
return false;
The current pci_is_pcie() implementation is merely:
static inline bool pci_is_pcie(struct pci_dev *dev)
{
return !!pci_pcie_cap(dev);
}
so we can just drop the pci_is_pcie() test in such cases.
Acked-by: Donald Dutile <ddutile@redhat.com>
Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The PCI Express Latency Tolerance Reporting (LTR) feature's
pci_ltr_supported() routine is currently only used within
drivers/pci/pci.c so make it static.
Acked-by: Donald Dutile <ddutile@redhat.com>
Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
pci_max_busnr() has been commented out for years (since 54c762fe62), and
this patch removes it completely.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Pull MIPS updates from Ralf Baechle:
"The whole series has been sitting in -next for quite a while with no
complaints. The last change to the series was before the weekend the
removal of an SPI patch which Grant - even though previously acked by
himself - appeared to raise objections. So I removed it until the
situation is clarified. Other than that all the patches have the acks
from their respective maintainers, all MIPS and x86 defconfigs are
building fine and I'm not aware of any problems introduced by this
series.
Among the key features for this patch series is a sizable patchset for
Lantiq which among other things introduces support for Lantiq's
flagship product, the FALCON SOC. It also means that the opensource
developers behind this patchset have overtaken Lantiq's competing
inhouse development team that was working behind closed doors.
Less noteworthy the ath79 patchset which adds support for a few more
chip variants, cleanups and fixes. Finally the usual dose of tweaking
of generic code."
Fix up trivial conflicts in arch/mips/lantiq/xway/gpio_{ebu,stp}.c where
printk spelling fixes clashed with file move and eventual removal of the
printk.
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (81 commits)
MIPS: lantiq: remove orphaned code
MIPS: Remove all -Wall and almost all -Werror usage from arch/mips.
MIPS: lantiq: implement support for FALCON soc
MTD: MIPS: lantiq: verify that the NOR interface is available on falcon soc
MTD: MIPS: lantiq: implement OF support
watchdog: MIPS: lantiq: implement OF support and minor fixes
SERIAL: MIPS: lantiq: implement OF support
GPIO: MIPS: lantiq: convert gpio-stp-xway to OF
GPIO: MIPS: lantiq: convert gpio-mm-lantiq to OF and of_mm_gpio
GPIO: MIPS: lantiq: move gpio-stp and gpio-ebu to the subsystem folder
MIPS: pci: convert lantiq driver to OF
MIPS: lantiq: convert dma to platform driver
MIPS: lantiq: implement support for clkdev api
MIPS: lantiq: drop ltq_gpio_request() and gpio_to_irq()
OF: MIPS: lantiq: implement irq_domain support
OF: MIPS: lantiq: implement OF support
MIPS: lantiq: drop mips_machine support
OF: PCI: const usage needed by MIPS
MIPS: Cavium: Remove smp_reserve_lock.
MIPS: Move cache setup to setup_arch().
...
On MIPS we want to call of_irq_map_pci from inside
arch/mips/include/asm/pci.h:extern int pcibios_map_irq(
const struct pci_dev *dev, u8 slot, u8 pin);
For this to work we need to change several functions to const usage.
Signed-off-by: John Crispin <blogic@openwrt.org>
Cc: linux-pci@vger.kernel.org
Cc: devicetree-discuss@lists.ozlabs.org
Cc: linux-mips@linux-mips.org
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Patchwork: https://patchwork.linux-mips.org/patch/3710/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The intent of git commit 6fbf9e7a90
"PCI: Introduce __pci_reset_function_locked to be used when holding
device_lock." was to have a non-locking function that would call
pci_dev_reset function.
But it fell short of that by just probing and not actually reseting
the device. To make that work we need a way to move the lock
around device_lock to not be in pci_dev_reset (as the caller of
__pci_reset_function_locked already holds said lock). We do this by
renaming pci_dev_reset to __pci_dev_reset and bubbling said mutex out
of __pci_dev_reset to pci_dev_reset (a wrapper around __pci_dev_reset).
The __pci_reset_function_locked can now call __pci_dev_reset without
having to worry about the dead-lock.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
A PCIe downstream port is a P2P bridge. Its secondary interface is
a link that should lead only to device 0 (unless ARI is enabled)[1], so
we don't probe for non-zero device numbers.
Some Stratus ftServer systems have a PCIe downstream port (02:00.0) that
leads to both an upstream port (03:00.0) and a downstream port (03:01.0),
and 03:01.0 has important devices below it:
[0000:02]-+-00.0-[03-3c]--+-00.0-[04-09]--...
\-01.0-[0a-0d]--+-[USB]
+-[NIC]
+-...
Previously, we didn't enumerate device 03:01.0, so USB and the network
didn't work. This patch adds a DMI quirk to scan all device numbers,
not just 0, below a downstream port.
Based on a patch by Prarit Bhargava.
[1] PCIe spec r3.0, sec 7.3.1
CC: Myron Stowe <mstowe@redhat.com>
CC: Don Dutile <ddutile@redhat.com>
CC: James Paradis <james.paradis@stratus.com>
CC: Matthew Wilcox <matthew.r.wilcox@intel.com>
CC: Jesse Barnes <jbarnes@virtuousgeek.org>
CC: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Some shortcomings introduced into pci_restore_state() by commit
26f41062f2 ("PCI: check for pci bar restore completion and retry")
have been fixed by recent commit ebfc5b802f ("PCI: Fix regression in
pci_restore_state(), v3"), but that commit treats all PCI devices as
those with Type 0 configuration headers.
That is not entirely correct, because Type 1 and Type 2 headers have
different layouts. In particular, the area occupied by BARs in Type 0
config headers contains the secondary status register in Type 1 ones and
it doesn't make sense to retry the restoration of that register even if
the value read back from it after a write is not the same as the written
one (it very well may be different).
For this reason, make pci_restore_state() only retry the restoration
of BARs for Type 0 config headers. This effectively makes it behave
as before commit 26f41062f2 for all header types except for Type 0.
Tested-by: Mikko Vinni <mmvinni@yahoo.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 26f41062f2 ("PCI: check for pci bar restore completion and
retry") attempted to address problems with PCI BAR restoration on
systems where FLR had not been completed before pci_restore_state() was
called, but it did that in an utterly wrong way.
First off, instead of retrying the writes for the BAR registers only, it
did that for all of the PCI config space of the device, including the
status register (whose value after the write quite obviously need not be
the same as the written one). Second, it added arbitrary delay to
pci_restore_state() even for systems where the PCI config space
restoration was successful at first attempt. Finally, the mdelay(10) it
added to every iteration of the writing loop was way too much of a delay
for any reasonable device.
All of this actually caused resume failures for some devices on Mikko's
system.
To fix the regression, make pci_restore_state() only retry the writes
for BAR registers and only wait if the first read from the register
doesn't return the written value. Additionaly, make it wait for 1 ms,
instead of 10 ms, after every failing attempt to write into config
space.
Reported-by: Mikko Vinni <mmvinni@yahoo.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There are PCIe devices on the market that report ARI support but
then fail to initialize correctly when ARI is actually used. This
leads to situations in which kernels 2.6.34 and newer fail to handle
systems where the previous kernels worked without any apparent
problems. Unfortunately, it is currently unknown how many such
devices are there.
For this reason, introduce a new kernel command line option,
pci=noari, allowing users to disable PCIe ARI altogether if they
see problems with PCIe device initialization.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This isn't really a quirk; calling it directly from pci_add_device makes
more sense.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Let the user could enable and disable with pci=realloc=on or pci=realloc=off
Also
1. move variable and functions near the place they are used.
2. change macro to function
3. change related functions and variable to static and _init
4. update parameter description accordingly.
This will let us add a config option to control default behavior, and
still allow the user to turn off automatic reallocation if it fails on
their platform until a permanent solution is found.
-v2: still honor pci=realloc, and treat it as pci=realloc=on
also use enum instead of ...
-v3: update kernel-paramenters.txt according to Jesse.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Only one user in driver/pci/pci.c, so we don't need to put it in global
pci.h
Reviewed-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
On some OEM systems, pci_restore_state() is called while FLR has not yet
completed. As a result, PCI BAR register restore is not successful. This fix
reads back the restored value and compares it with saved value and re-tries 10
times before giving up.
Signed-off-by: Jean Guyader <jean.guyader@eu.citrix.com>
Signed-off-by: Eric Chanudet <eric.chanudet@citrix.com>
Signed-off-by: Allen Kay <allen.m.kay@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The use case of this is when a driver wants to call FLR when a device
is attached to it using the SysFS "bind" or "unbind" functionality.
The call chain when a user does "bind" looks as so:
echo "0000:01.07.0" > /sys/bus/pci/drivers/XXXX/bind
and ends up calling:
driver_bind:
device_lock(dev); <=== TAKES LOCK
XXXX_probe:
.. pci_enable_device()
...__pci_reset_function(), which calls
pci_dev_reset(dev, 0):
if (!0) {
device_lock(dev) <==== DEADLOCK
The __pci_reset_function_locked function allows the the drivers
'probe' function to call the "pci_reset_function" while still holding
the driver mutex lock.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Fix new kernel-doc warnings:
Warning(drivers/pci/pci.c:2811): No description found for parameter 'dev'
Warning(drivers/pci/pci.c:2811): Excess function parameter 'pdev' description in 'pci_intx_mask_supported'
Warning(drivers/pci/pci.c:2894): No description found for parameter 'dev'
Warning(drivers/pci/pci.c:2894): Excess function parameter 'pdev' description in 'pci_check_and_mask_intx'
Warning(drivers/pci/pci.c:2908): No description found for parameter 'dev'
Warning(drivers/pci/pci.c:2908): Excess function parameter 'pdev' description in 'pci_check_and_unmask_intx'
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
During S3 or S4 resume or PCI reset, ATS regs aren't restored correctly.
This patch enables ATS at the device state restore if PCI device has ATS
capability.
Signed-off-by: Xudong Hao <xudong.hao@intel.com>
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
When the runtime PM is activated on PCI, if a device switches state
frequently (e.g. an EHCI controller with autosuspending USB devices
connected) the PCI configuration traces might be very verbose in the
kernel log. Let's guard those traces with DEBUG condition.
Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Vincent Palatin <vpalatin@chromium.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The latency timer is read-only and hardwired to zero for all PCIe
devices, both Type 0 and Type 1, so don't bother trying to update it
and cluttering the dmesg log with meaningless "setting latency timer
to 64" messages.
Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The 'latency timer' of PCI devices, both Type 0 and Type 1,
is setup in architecture-specific code [see: 'pcibios_set_master()'].
There are two approaches being taken by all the architectures - check
if the 'latency timer' is currently set between 16 and 255 and if not
bring it within bounds, or, do nothing (and then there is the
gratuitously different PA-RISC implementation).
There is nothing architecture-specific about PCI's 'latency timer' so
this patch pulls its setup functionality up into the PCI core by
creating a generic 'pcibios_set_master()' function using the '__weak'
attribute which can be used by all architectures as a default which,
if necessary, can then be over-ridden by architecture-specific code.
No functional change.
Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
These new PCI services allow to probe for 2.3-compliant INTx masking
support and then use the feature from PCI interrupt handlers. The
services are properly synchronized with concurrent config space access
via sysfs or on device reset.
This enables generic PCI device drivers like uio_pci_generic or KVM's
device assignment to implement the necessary kernel-side IRQ handling
without any knowledge about device-specific interrupt status and control
registers.
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
pci_block_user_cfg_access was designed for the use case that a single
context, the IPR driver, temporarily delays user space accesses to the
config space via sysfs. This assumption became invalid by the time
pci_dev_reset was added as locking instance. Today, if you run two loops
in parallel that reset the same device via sysfs, you end up with a
kernel BUG as pci_block_user_cfg_access detect the broken assumption.
This reworks the pci_block_user_cfg_access to a sleeping service
pci_cfg_access_lock and an atomic-compatible variant called
pci_cfg_access_trylock. The former not only blocks user space access as
before but also waits if access was already locked. The latter service
just returns false in this case, allowing the caller to resolve the
conflict instead of raising a BUG.
Adaptions of the ipr driver were originally written by Brian King.
Acked-by: Brian King <brking@linux.vnet.ibm.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
I noticed that hotplug of one setup does not work with recent change in
pci tree.
After checking the bridge conf setup, I noticed that the bridges get
assigned but do not get enabled.
The reason is the following commit, while simply ignores bridge
resources when enabling a pci device:
| commit bbef98ab0f
| Author: Ram Pai <linuxram@us.ibm.com>
| Date: Sun Nov 6 10:33:10 2011 +0800
|
| PCI: defer enablement of SRIOV BARS
|...
| NOTE: Note, there is subtle change in the pci_enable_device() API. Any
| driver that depends on SRIOV BARS to be enabled in pci_enable_device()
| can fail.
Put back bridge resource and ROM resource checking to fix the problem.
That should fix regression like BIOS does not assign correct resource to
bridge.
Discussion can be found at:
http://www.spinics.net/lists/linux-pci/msg12874.html
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
During test of one IB card with guest VM, found that, msi is not
initialized properly.
It turns out __write_msi_msg will do nothing if device current_state is
not PCI_D0. And, that pci device does not have pm_cap in guest VM.
There is an error in setting of power state to PCI_D0 in
pci_enable_device(), but error is not returned for this. Following is
code flow:
pci_enable_device() --> __pci_enable_device_flags() -->
do_pci_enable_device() --> pci_set_power_state() -->
__pci_start_power_transition()
We have following condition inside __pci_start_power_transition():
if (platform_pci_power_manageable(dev)) {
error = platform_pci_set_power_state(dev, state);
if (!error)
pci_update_current_state(dev, state);
} else {
error = -ENODEV;
/* Fall back to PCI_D0 if native PM is not supported */
if (!dev->pm_cap)
dev->current_state = PCI_D0;
}
Here, from platform_pci_set_power_state(), acpi_pci_set_power_state() is
getting called and that is failing with ENODEV because of following
condition:
if (!handle || ACPI_SUCCESS(acpi_get_handle(handle, "_EJ0",&tmp)))
return -ENODEV;
Because of that, pci_update_current_state() is not getting called.
With this patch, if device power state can not be set via
platform_pci_set_power_state and that device does not have native pm
support, then PCI device power state will be set to PCI_D0.
-v2: This also reverts 47e9037ac1, as it's
not needed after this change.
Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Ajaykumar Hotchandani<ajaykumar.hotchandani@oracle.com>
Signed-off-by: Yinghai Lu<yinghai.lu@oracle.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
All the PCI BARs of a device are enabled when the device is enabled
using pci_enable_device(). This unnecessarily enables SRIOV BARs of the
device.
On some platforms, which do not support SRIOV as yet, the
pci_enable_device() fails to enable the device if its SRIOV BARs are not
allocated resources correctly.
The following patch fixes the above problem. The SRIOV BARs are now
enabled when IOV capability of the device is enabled in sriov_enable().
NOTE: Note, there is subtle change in the pci_enable_device() API. Any
driver that depends on SRIOV BARS to be enabled in pci_enable_device()
can fail.
The patch has been touch tested on power and x86 platform.
Tested-by: Michael Wang <wangyun@linux.vnet.ibm.com>
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
When configuring the PCIe settings for "performance", we allow parents
to have a larger Max Payload Size than children and rely on children
Max Read Request Size to not be larger than their own MPS to avoid
having the host bridge generate responses they can't cope with.
However, various drivers in Linux call pci_set_readrq() with arbitrary
values, assuming this to be a simple performance tweak. This breaks
under our "performance" configuration.
Fix that by making sure the value programmed by pcie_set_readrq() is
never larger than the configured MPS for that device.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Jon Mason <mason@myri.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The land of PCI power management is a land of sorrow and ugliness,
especially in the area of signaling events by devices. There are
devices that set their PME Status bits, but don't really bother
to send a PME message or assert PME#. There are hardware vendors
who don't connect PME# lines to the system core logic (they know
who they are). There are PCI Express Root Ports that don't bother
to trigger interrupts when they receive PME messages from the devices
below. There are ACPI BIOSes that forget to provide _PRW methods for
devices capable of signaling wakeup. Finally, there are BIOSes that
do provide _PRW methods for such devices, but then don't bother to
call Notify() for those devices from the corresponding _Lxx/_Exx
GPE-handling methods. In all of these cases the kernel doesn't have
a chance to receive a proper notification that it should wake up a
device, so devices stay in low-power states forever. Worse yet, in
some cases they continuously send PME Messages that are silently
ignored, because the kernel simply doesn't know that it should clear
the device's PME Status bit.
This problem was first observed for "parallel" (non-Express) PCI
devices on add-on cards and Matthew Garrett addressed it by adding
code that polls PME Status bits of such devices, if they are enabled
to signal PME, to the kernel. Recently, however, it has turned out
that PCI Express devices are also affected by this issue and that it
is not limited to add-on devices, so it seems necessary to extend
the PME polling to all PCI devices, including PCI Express and planar
ones. Still, it would be wasteful to poll the PME Status bits of
devices that are known to receive proper PME notifications, so make
the kernel (1) poll the PME Status bits of all PCI and PCIe devices
enabled to signal PME and (2) disable the PME Status polling for
devices for which correct PME notifications are received.
Tested-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Add the ability to disable PCI-E MPS turning and using the BIOS
configured MPS defaults. Due to the number of issues recently
discovered on some x86 chipsets, make this the default behavior.
Also, add the option for peer to peer DMA MPS configuration. Peer to
peer DMA is outside the scope of this patch, but MPS configuration could
prevent it from working by having the MPS on one root port different
than the MPS on another. To work around this, simply make the system
wide MPS the smallest possible value (128B).
Signed-off-by: Jon Mason <mason@myri.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Modifying the Maximum Read Request Size to 0 (value of 128Bytes) has
massive negative ramifications on some devices. Without knowing which
devices have this issue, do not modify from the default value when
walking the PCI-E bus in pcie_bus_safe mode. Also, make pcie_bus_safe
the default procedure.
Tested-by: Sven Schnelle <svens@stackframe.org>
Tested-by: Simon Kirby <sim@hostway.ca>
Tested-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Reported-and-tested-by: Eric Dumazet <eric.dumazet@gmail.com>
Reported-and-tested-by: Niels Ole Salscheider <niels_ole@salscheider-online.de>
References: https://bugzilla.kernel.org/show_bug.cgi?id=42162
Signed-off-by: Jon Mason <mason@myri.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix new kernel-doc warning in pci.c:
Warning(drivers/pci/pci.c:3259): No description found for parameter 'mps'
Warning(drivers/pci/pci.c:3259): Excess function parameter 'rq' description in 'pcie_set_mps'
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
On a given PCI-E fabric, each device, bridge, and root port can have a
different PCI-E maximum payload size. There is a sizable performance
boost for having the largest possible maximum payload size on each PCI-E
device. However, if improperly configured, fatal bus errors can occur.
Thus, it is important to ensure that PCI-E payloads sends by a device
are never larger than the MPS setting of all devices on the way to the
destination.
This can be achieved two ways:
- A conservative approach is to use the smallest common denominator of
the entire tree below a root complex for every device on that fabric.
This means for example that having a 128 bytes MPS USB controller on one
leg of a switch will dramatically reduce performances of a video card or
10GE adapter on another leg of that same switch.
It also means that any hierarchy supporting hotplug slots (including
expresscard or thunderbolt I suppose, dbl check that) will have to be
entirely clamped to 128 bytes since we cannot predict what will be
plugged into those slots, and we cannot change the MPS on a "live"
system.
- A more optimal way is possible, if it falls within a couple of
constraints:
* The top-level host bridge will never generate packets larger than the
smallest TLP (or if it can be controlled independently from its MPS at
least)
* The device will never generate packets larger than MPS (which can be
configured via MRRS)
* No support of direct PCI-E <-> PCI-E transfers between devices without
some additional code to specifically deal with that case
Then we can use an approach that basically ignores downstream requests
and focuses exclusively on upstream requests. In that case, all we need
to care about is that a device MPS is no larger than its parent MPS,
which allows us to keep all switches/bridges to the max MPS supported by
their parent and eventually the PHB.
In this case, your USB controller would no longer "starve" your 10GE
Ethernet and your hotplug slots won't affect your global MPS.
Additionally, the hotplugged devices themselves can be configured to a
larger MPS up to the value configured in the hotplug bridge.
To choose between the two available options, two PCI kernel boot args
have been added to the PCI calls. "pcie_bus_safe" will provide the
former behavior, while "pcie_bus_perf" will perform the latter behavior.
By default, the latter behavior is used.
NOTE: due to the location of the enablement, each arch will need to add
calls to this function. This patch only enables x86.
This patch includes a number of changes recommended by Benjamin
Herrenschmidt.
Tested-by: Jordan_Hargrave@dell.com
Signed-off-by: Jon Mason <mason@myri.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
When setting the PCI-E MRRS, pcie_set_readrq queries the current
settings via a pci_read_config_word call but writes the modified result
via a pci_write_config_dword. This results in writing 16 more bits than
were queried.
Also, the function description comment is slightly incorrect.
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The function pci_enable_ari() may mistakenly set the downstream port
of a v1 PCIe switch in ARI Forwarding mode. This is a PCIe v2 feature,
and with an SR-IOV device on that switch port believing the switch above
is ARI capable it may attempt to use functions 8-255, translating into
invalid (non-zero) device numbers for that bus. This has been seen
to cause Completion Timeouts and general misbehaviour including hangs
and panics.
Cc: stable@kernel.org
Acked-by: Don Dutile <ddutile@redhat.com>
Tested-by: Don Dutile <ddutile@redhat.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Multiple attempts to dynamically reallocate pci resources have
unfortunately lead to regressions. Though we continue to fix the
regressions and fine tune the dynamic-reallocation behavior, we have not
reached a acceptable state yet.
This patch provides a interim solution. It disables dynamic reallocation
by default, but adds the ability to enable it through pci=realloc kernel
command line parameter.
Tested-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
x86/PCI/ACPI: fix type mismatch
PCI: fix new kernel-doc warning
PCI: Fix warning in drivers/pci/probe.c on sparc64
When I added 3448a19da4
I forgot about the special uv handling code for this, so this
patch fixes it up.
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Acked-by: Ingo Molnar
Signed-off-by: Dave Airlie <airlied@redhat.com>
Fix pci.c kernel-doc warnings:
Warning(drivers/pci/pci.c:3292): No description found for parameter 'flags'
Warning(drivers/pci/pci.c:3292): Excess function parameter 'change_bridge_flags' description in 'pci_set_vga_state'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
* 'drm-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: (169 commits)
drivers/gpu/drm/radeon/atom.c: fix warning
drm/radeon/kms: bump kms version number
drm/radeon/kms: properly set num banks for fusion asics
drm/radeon/kms/atom: move dig phy init out of modesetting
drm/radeon/kms/cayman: fix typo in register mask
drm/radeon/kms: fix typo in spread spectrum code
drm/radeon/kms: fix tile_config value reported to userspace on cayman.
drm/radeon/kms: fix incorrect comparison in cayman setup code.
drm/radeon/kms: add wait idle ioctl for eg->cayman
drm/radeon/cayman: setup hdp to invalidate and flush when asked
drm/radeon/evergreen/btc/fusion: setup hdp to invalidate and flush when asked
agp/uninorth: Fix lockups with radeon KMS and >1x.
drm/radeon/kms: the SS_Id field in the LCD table if for LVDS only
drm/radeon/kms: properly set the CLK_REF bit for DCE3 devices
drm/radeon/kms: fixup eDP connector handling
drm/radeon/kms: bail early for eDP in hotplug callback
drm/radeon/kms: simplify hotplug handler logic
drm/radeon/kms: rewrite DP handling
drm/radeon/kms/atom: add support for setting DP panel mode
drm/radeon/kms: atombios.h updates for DP panel mode
...
For KVM device assignment, we'd like to save off the state of a device
prior to passing it to the guest and restore it later. We also want
to allow pci_reset_funciton() to be called while the device is owned
by the guest. This however overwrites and invalidates the struct pci_dev
buffers, so we can't just manually call save and restore. Add generic
interfaces for the saved state to be stored and reloaded back into
struct pci_dev at a later time.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This will allow us to store and load it later.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Latency tolerance reporting allows devices to send messages to the root
complex indicating their latency tolerance for snooped & unsnooped
memory transactions. Add support for enabling & disabling this
feature, along with a routine to set the max latencies a device should
send upstream.
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
OBFF (optimized buffer flush/fill), where supported, can help improve
energy efficiency by giving devices information about when interrupts
and other activity will have a reduced power impact. It requires
support from both the device and system (i.e. not only does the device
need to respond to OBFF messages, but the platform must be capable of
generating and routing them to the end point).
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Add support to allow drivers to enable/disable ID-based ordering. Where
supported, ID-based ordering can significantly improve the latency of
individual requests by preventing them from queueing up behind unrelated
traffic.
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The pci_pm_reset() function is not a very nice interface due to its
limitations and conditional behavior (e.g. it doesn't affect devices
in low-power states), but it cannot be simply dropped, because
existing device drivers may depend on it. However, its behavior and
limitations should be well documented, so add an appropriate
kerneldoc comment to it.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
So in a lot of modern systems, a GPU will always be below a parent bridge that won't share with any other GPUs. This means VGA arbitration on those GPUs can be controlled by using the bridge routing instead of io/mem decodes.
The problem is locating which GPUs share which upstream bridges. This patch attempts to identify all the GPUs which can be controlled via bridges, and ones that can't. This patch endeavours to work out the bridge sharing semantics.
When disabling GPUs via a bridge, it doesn't do irq callbacks or touch the io/mem decodes for the gpu.
Signed-off-by: Dave Airlie <airlied@redhat.com>
v3 -> v2: Moved ASPM enabling logic to pci_set_power_state()
v2 -> v1: Preserved the logic in pci_raw_set_power_state()
: Added ASPM enabling logic after scanning Root Bridge
: http://marc.info/?l=linux-pci&m=130046996216391&w=2
v1 : http://marc.info/?l=linux-pci&m=130013164703283&w=2
The assumption made in commit 41cd766b06
(PCI: Don't enable aspm before drivers have had a chance to veto it) that
pci_enable_device() will result in re-configuring ASPM when aspm_policy is
POWERSAVE is no longer valid. This is due to commit
97c145f7c8 (PCI: read current power state
at enable time) which resets dev->current_state to D0. Due to this the
call to pcie_aspm_pm_state_change() is never made. Note the equality check
(below) that returns early:
./drivers/pci/pci.c: pci_raw_set_pci_power_state()
546 /* Check if we're already there */
547 if (dev->current_state == state)
548 return 0;
Therefore OSPM never configures the PCIe links for ASPM to turn them "on".
Fix it by configuring ASPM from the pci_enable_device() code path. This
also allows a driver such as the e1000e networking driver a chance to
disable ASPM (L0s, L1), if need be, prior to enabling the device. A
driver may perform this action if the device is known to mis-behave
wrt ASPM.
Signed-off-by: Naga Chumbalkar <nagananda.chumbalkar@hp.com>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Make wakeup events be reported by the PCI subsystem before attempting to
resume devices or queuing up runtime resume requests for them, because
wakeup events should be reported as soon as they have been detected.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>