linux-yocto/Documentation/accel/qaic/qaic.rst
Linus Torvalds de848da12f drm next for 6.12-rc1
string:
 - add mem_is_zero()
 
 core:
 - support more device numbers
 - use XArray for minor ids
 - add backlight constants
 - Split dma fence array creation into alloc and arm
 
 fbdev:
 - remove usage of old fbdev hooks
 
 kms:
 - Add might_fault() to drm_modeset_lock priming
 - Add dynamic per-crtc vblank configuration support
 
 dma-buf:
 - docs cleanup
 
 buddy:
 - Add start address support for trim function
 
 printk:
 - pass description to kmsg_dump
 
 scheduler;
 - Remove full_recover from drm_sched_start
 
 ttm:
 - Make LRU walk restartable after dropping locks
 - Allow direct reclaim to allocate local memory
 
 panic:
 - add display QR code (in rust)
 
 displayport:
 - mst: GUID improvements
 
 bridge:
 - Silence error message on -EPROBE_DEFER
 - analogix: Clean aup
 - bridge-connector: Fix double free
 - lt6505: Disable interrupt when powered off
 - tc358767: Make default DP port preemphasis configurable
 - lt9611uxc: require DRM_BRIDGE_ATTACH_NO_CONNECTOR
 - anx7625: simplify OF array handling
 - dw-hdmi: simplify clock handling
 - lontium-lt8912b: fix mode validation
 - nwl-dsi: fix mode vsync/hsync polarity
 
 xe:
 - Enable LunarLake and Battlemage support
 - Introducing Xe2 ccs modifiers for integrated and discrete graphics
 - rename xe perf to xe observation
 - use wb caching on DGFX for system memory
 - add fence timeouts
 - Lunar Lake graphics/media/display workarounds
 - Battlemage workarounds
 - Battlemage GSC support
 - GSC and HuC fw updates for LL/BM
 - use dma_fence_chain_free
 - refactor hw engine lookup and mmio access
 - enable priority mem read for Xe2
 - Add first GuC BMG fw
 - fix dma-resv lock
 - Fix DGFX display suspend/resume
 - Use xe_managed for kernel BOs
 - Use reserved copy engine for user binds on faulting devices
 - Allow mixing dma-fence jobs and long-running faulting jobs
 - fix media TLB invalidation
 - fix rpm in TTM swapout path
 - track resources and VF state by PF
 
 i915:
 - Type-C programming fix for MTL+
 - FBC cleanup
 - Calc vblank delay more accurately
 - On DP MST, Enable LT fallback for UHBR<->non-UHBR rates
 - Fix DP LTTPR detection
 - limit relocations to INT_MAX
 - fix long hangs in buddy allocator on DG2/A380
 
 amdgpu:
 - Per-queue reset support
 - SDMA devcoredump support
 - DCN 4.0.1 updates
 - GFX12/VCN4/JPEG4 updates
 - Convert vbios embedded EDID to drm_edid
 - GFX9.3/9.4 devcoredump support
 - process isolation framework for GFX 9.4.3/4
 - take IOMMU mappings into account for P2P DMA
 
 amdkfd:
 - CRIU fixes
 - HMM fix
 - Enable process isolation support for GFX 9.4.3/4
 - Allow users to target recommended SDMA engines
 - KFD support for targetting queues on recommended SDMA engines
 
 radeon:
 - remove .load and drm_dev_alloc
 - Fix vbios embedded EDID size handling
 - Convert vbios embedded EDID to drm_edid
 - Use GEM references instead of TTM
 - r100 cp init cleanup
 - Fix potential overflows in evergreen CS offset tracking
 
 msm:
 - DPU:
 - implement DP/PHY mapping on SC8180X
 - Enable writeback on SM8150, SC8180X, SM6125, SM6350
 - DP:
 - Enable widebus on all relevant chipsets
 - MSM8998 HDMI support
 - GPU:
 - A642L speedbin support
 - A615/A306/A621 support
 - A7xx devcoredump support
 
 ast:
 - astdp: Support AST2600 with VGA
 - Clean up HPD
 - Fix timeout loop for DP link training
 - reorganize output code by type (VGA, DP, etc)
 - convert to struct drm_edid
 - fix BMC handling for all outputs
 
 exynos:
 - drop stale MAINTAINERS pattern
 - constify struct
 
 loongson:
 - use GEM refcount over TTM
 
 mgag200:
 - Improve BMC handling
 - Support VBLANK intterupts
 - transparently support BMC outputs
 
 nouveau:
 - Refactor and clean up internals
 - Use GEM refcount over TTM's
 
 gm12u320:
 - convert to struct drm_edid
 
 gma500:
 - update i2c terms
 
 lcdif:
 - pixel clock fix
 
 host1x:
 - fix syncpoint IRQ during resume
 - use iommu_paging_domain_alloc()
 
 imx:
 - ipuv3: convert to struct drm_edid
 
 omapdrm:
 - improve error handling
 - use common helper for_each_endpoint_of_node()
 
 panel:
 - add support for BOE TV101WUM-LL2 plus DT bindings
 - novatek-nt35950: improve error handling
 - nv3051d: improve error handling
 - panel-edp: add support for BOE NE140WUM-N6G; revert support for
   SDC ATNA45AF01
 - visionox-vtdr6130: improve error handling; use
   devm_regulator_bulk_get_const()
 - boe-th101mb31ig002: Support for starry-er88577 MIPI-DSI panel plus
   DT; Fix porch parameter
 - edp: Support AOU B116XTN02.3, AUO B116XAN06.1, AOU B116XAT04.1,
   BOE NV140WUM-N41, BOE NV133WUM-N63, BOE NV116WHM-A4D, CMN N116BCA-EA2,
   CMN N116BCP-EA2, CSW MNB601LS1-4
 - himax-hx8394: Support Microchip AC40T08A MIPI Display panel plus DT
 - ilitek-ili9806e: Support Densitron DMT028VGHMCMI-1D TFT plus DT
 - jd9365da: Support Melfas lmfbx101117480 MIPI-DSI panel plus DT; Refactor
   for code sharing
 - panel-edp: fix name for HKC MB116AN01
 - jd9365da: fix "exit sleep" commands
 - jdi-fhd-r63452: simplify error handling with DSI multi-style
   helpers
 - mantix-mlaf057we51: simplify error handling with DSI multi-style
   helpers
 - simple:
   support Innolux G070ACE-LH3 plus DT bindings
   support On Tat Industrial Company KD50G21-40NT-A1 plus DT bindings
 - st7701:
   decouple DSI and DRM code
   add SPI support
   support Anbernic RG28XX plus DT bindings
 
 mediatek:
 - support alpha blending
 - remove cl in struct cmdq_pkt
 - ovl adaptor fix
 - add power domain binding for mediatek DPI controller
 
 renesas:
 - rz-du: add support for RZ/G2UL plus DT bindings
 
 rockchip:
 - Improve DP sink-capability reporting
 - dw_hdmi: Support 4k@60Hz
 - vop: Support RGB display on Rockchip RK3066; Support 4096px width
 
 sti:
 - convert to struct drm_edid
 
 stm:
 - Avoid UAF wih managed plane and CRTC helpers
 - Fix module owner
 - Fix error handling in probe
 - Depend on COMMON_CLK
 - ltdc: Fix transparency after disabling plane; Remove unused interrupt
 
 tegra:
 - gr3d: improve PM domain handling
 - convert to struct drm_edid
 - Call drm_atomic_helper_shutdown()
 
 vc4:
 - fix PM during detect
 - replace DRM_ERROR() with drm_error()
 - v3d: simplify clock retrieval
 
 v3d:
 - Clean up perfmon
 
 virtio:
 - add DRM capset
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmbq43gACgkQDHTzWXnE
 hr4+lg/+O/r41E7ioitcM0DWeWem0dTlvQr41pJ8jujHvw+bXNdg0BMGWtsTyTLA
 eOft2AwofsFjg+O7l8IFXOT37mQLdIdfjb3+w5brI198InL3OWC3QV8ZSwY9VGET
 n8crO9jFoxNmHZnFniBZbtI6egTyl6H+2ey3E0MTnKiPUKZQvsK/4+x532yVLPob
 UUOze5wcjyGZc7LJEIZPohPVneCb9ki7sabDQqh4cxIQ0Eg+nqPpWjYM4XVd+lTS
 8QmssbR49LrJ7z9m90qVE+8TjYUCn+ChDPMs61KZAAnc8k++nK41btjGZ23mDKPb
 YEguahCYthWJ4U8K18iXBPnLPxZv5+harQ8OIWAUYqdIOWSXHozvuJ2Z84eHV13a
 9mQ5vIymXang8G1nEXwX/vml9uhVhBCeWu3qfdse2jfaTWYUb1YzhqUoFvqI0R0K
 8wT03MyNdx965CSqAhpH5Jd559ueZmpd+jsHOfhAS+1gxfD6NgoPXv7lpnMUmGWX
 SnaeC9RLD4cgy7j2Swo7TEqQHrvK5XhZSwX94kU6RPmFE5RRKqWgFVQmwuikDMId
 UpNqDnPT5NL2UX4TNG4V4coyTXvKgVcSB9TA7j8NSLfwdGHhiz73pkYosaZXKyxe
 u6qKMwMONfZiT20nhD7RhH0AFnnKosAcO14dhn0TKFZPY6Ce9O8=
 =7jR+
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2024-09-19' of https://gitlab.freedesktop.org/drm/kernel

Pull drm updates from Dave Airlie:
 "This adds a couple of patches outside the drm core, all should be
  acked appropriately, the string and pstore ones are the main ones that
  come to mind.

  Otherwise it's the usual drivers, xe is getting enabled by default on
  some new hardware, we've changed the device number handling to allow
  more devices, and we added some optional rust code to create QR codes
  in the panic handler, an idea first suggested I think 10 years ago :-)

  string:
   - add mem_is_zero()

  core:
   - support more device numbers
   - use XArray for minor ids
   - add backlight constants
   - Split dma fence array creation into alloc and arm

  fbdev:
   - remove usage of old fbdev hooks

  kms:
   - Add might_fault() to drm_modeset_lock priming
   - Add dynamic per-crtc vblank configuration support

  dma-buf:
   - docs cleanup

  buddy:
   - Add start address support for trim function

  printk:
   - pass description to kmsg_dump

  scheduler:
   - Remove full_recover from drm_sched_start

  ttm:
   - Make LRU walk restartable after dropping locks
   - Allow direct reclaim to allocate local memory

  panic:
   - add display QR code (in rust)

  displayport:
   - mst: GUID improvements

  bridge:
   - Silence error message on -EPROBE_DEFER
   - analogix: Clean aup
   - bridge-connector: Fix double free
   - lt6505: Disable interrupt when powered off
   - tc358767: Make default DP port preemphasis configurable
   - lt9611uxc: require DRM_BRIDGE_ATTACH_NO_CONNECTOR
   - anx7625: simplify OF array handling
   - dw-hdmi: simplify clock handling
   - lontium-lt8912b: fix mode validation
   - nwl-dsi: fix mode vsync/hsync polarity

  xe:
   - Enable LunarLake and Battlemage support
   - Introducing Xe2 ccs modifiers for integrated and discrete graphics
   - rename xe perf to xe observation
   - use wb caching on DGFX for system memory
   - add fence timeouts
   - Lunar Lake graphics/media/display workarounds
   - Battlemage workarounds
   - Battlemage GSC support
   - GSC and HuC fw updates for LL/BM
   - use dma_fence_chain_free
   - refactor hw engine lookup and mmio access
   - enable priority mem read for Xe2
   - Add first GuC BMG fw
   - fix dma-resv lock
   - Fix DGFX display suspend/resume
   - Use xe_managed for kernel BOs
   - Use reserved copy engine for user binds on faulting devices
   - Allow mixing dma-fence jobs and long-running faulting jobs
   - fix media TLB invalidation
   - fix rpm in TTM swapout path
   - track resources and VF state by PF

  i915:
   - Type-C programming fix for MTL+
   - FBC cleanup
   - Calc vblank delay more accurately
   - On DP MST, Enable LT fallback for UHBR<->non-UHBR rates
   - Fix DP LTTPR detection
   - limit relocations to INT_MAX
   - fix long hangs in buddy allocator on DG2/A380

  amdgpu:
   - Per-queue reset support
   - SDMA devcoredump support
   - DCN 4.0.1 updates
   - GFX12/VCN4/JPEG4 updates
   - Convert vbios embedded EDID to drm_edid
   - GFX9.3/9.4 devcoredump support
   - process isolation framework for GFX 9.4.3/4
   - take IOMMU mappings into account for P2P DMA

  amdkfd:
   - CRIU fixes
   - HMM fix
   - Enable process isolation support for GFX 9.4.3/4
   - Allow users to target recommended SDMA engines
   - KFD support for targetting queues on recommended SDMA engines

  radeon:
   - remove .load and drm_dev_alloc
   - Fix vbios embedded EDID size handling
   - Convert vbios embedded EDID to drm_edid
   - Use GEM references instead of TTM
   - r100 cp init cleanup
   - Fix potential overflows in evergreen CS offset tracking

  msm:
   - DPU:
      - implement DP/PHY mapping on SC8180X
      - Enable writeback on SM8150, SC8180X, SM6125, SM6350
   - DP:
      - Enable widebus on all relevant chipsets
      - MSM8998 HDMI support
   - GPU:
      - A642L speedbin support
      - A615/A306/A621 support
      - A7xx devcoredump support

  ast:
   - astdp: Support AST2600 with VGA
   - Clean up HPD
   - Fix timeout loop for DP link training
   - reorganize output code by type (VGA, DP, etc)
   - convert to struct drm_edid
   - fix BMC handling for all outputs

  exynos:
   - drop stale MAINTAINERS pattern
   - constify struct

  loongson:
   - use GEM refcount over TTM

  mgag200:
   - Improve BMC handling
   - Support VBLANK intterupts
   - transparently support BMC outputs

  nouveau:
   - Refactor and clean up internals
   - Use GEM refcount over TTM's

  gm12u320:
   - convert to struct drm_edid

  gma500:
   - update i2c terms

  lcdif:
   - pixel clock fix

  host1x:
   - fix syncpoint IRQ during resume
   - use iommu_paging_domain_alloc()

  imx:
   - ipuv3: convert to struct drm_edid

  omapdrm:
   - improve error handling
   - use common helper for_each_endpoint_of_node()

  panel:
   - add support for BOE TV101WUM-LL2 plus DT bindings
   - novatek-nt35950: improve error handling
   - nv3051d: improve error handling
   - panel-edp:
      - add support for BOE NE140WUM-N6G
      - revert support for SDC ATNA45AF01
   - visionox-vtdr6130:
      - improve error handling
      - use devm_regulator_bulk_get_const()
   - boe-th101mb31ig002:
      - Support for starry-er88577 MIPI-DSI panel plus DT
      - Fix porch parameter
   - edp: Support AOU B116XTN02.3, AUO B116XAN06.1, AOU B116XAT04.1, BOE
     NV140WUM-N41, BOE NV133WUM-N63, BOE NV116WHM-A4D, CMN N116BCA-EA2,
     CMN N116BCP-EA2, CSW MNB601LS1-4
   - himax-hx8394: Support Microchip AC40T08A MIPI Display panel plus DT
   - ilitek-ili9806e: Support Densitron DMT028VGHMCMI-1D TFT plus DT
   - jd9365da:
      - Support Melfas lmfbx101117480 MIPI-DSI panel plus DT
      - Refactor for code sharing
   - panel-edp: fix name for HKC MB116AN01
   - jd9365da: fix "exit sleep" commands
   - jdi-fhd-r63452: simplify error handling with DSI multi-style
     helpers
   - mantix-mlaf057we51: simplify error handling with DSI multi-style
     helpers
   - simple:
      - support Innolux G070ACE-LH3 plus DT bindings
      - support On Tat Industrial Company KD50G21-40NT-A1 plus DT
        bindings
   - st7701:
      - decouple DSI and DRM code
      - add SPI support
      - support Anbernic RG28XX plus DT bindings

  mediatek:
   - support alpha blending
   - remove cl in struct cmdq_pkt
   - ovl adaptor fix
   - add power domain binding for mediatek DPI controller

  renesas:
   - rz-du: add support for RZ/G2UL plus DT bindings

  rockchip:
   - Improve DP sink-capability reporting
   - dw_hdmi: Support 4k@60Hz
   - vop:
      - Support RGB display on Rockchip RK3066
      - Support 4096px width

  sti:
   - convert to struct drm_edid

  stm:
   - Avoid UAF wih managed plane and CRTC helpers
   - Fix module owner
   - Fix error handling in probe
   - Depend on COMMON_CLK
   - ltdc:
      - Fix transparency after disabling plane
      - Remove unused interrupt

  tegra:
   - gr3d: improve PM domain handling
   - convert to struct drm_edid
   - Call drm_atomic_helper_shutdown()

  vc4:
   - fix PM during detect
   - replace DRM_ERROR() with drm_error()
   - v3d: simplify clock retrieval

  v3d:
   - Clean up perfmon

  virtio:
   - add DRM capset"

* tag 'drm-next-2024-09-19' of https://gitlab.freedesktop.org/drm/kernel: (1326 commits)
  drm/xe: Fix missing conversion to xe_display_pm_runtime_resume
  drm/xe/xe2hpg: Add Wa_15016589081
  drm/xe: Don't keep stale pointer to bo->ggtt_node
  drm/xe: fix missing 'xe_vm_put'
  drm/xe: fix build warning with CONFIG_PM=n
  drm/xe: Suppress missing outer rpm protection warning
  drm/xe: prevent potential UAF in pf_provision_vf_ggtt()
  drm/amd/display: Add all planes on CRTC to state for overlay cursor
  drm/i915/bios: fix printk format width
  drm/i915/display: Fix BMG CCS modifiers
  drm/amdgpu: get rid of bogus includes of fdtable.h
  drm/amdkfd: CRIU fixes
  drm/amdgpu: fix a race in kfd_mem_export_dmabuf()
  drm: new helper: drm_gem_prime_handle_to_dmabuf()
  drm/amdgpu/atomfirmware: Silence UBSAN warning
  drm/amdgpu: Fix kdoc entry in 'amdgpu_vm_cpu_prepare'
  drm/amd/amdgpu: apply command submission parser for JPEG v1
  drm/amd/amdgpu: apply command submission parser for JPEG v2+
  drm/amd/pm: fix the pp_dpm_pcie issue on smu v14.0.2/3
  drm/amd/pm: update the features set on smu v14.0.2/3
  ...
2024-09-19 10:18:15 +02:00

210 lines
9.5 KiB
ReStructuredText

.. SPDX-License-Identifier: GPL-2.0-only
=============
QAIC driver
=============
The QAIC driver is the Kernel Mode Driver (KMD) for the AIC100 family of AI
accelerator products.
Interrupts
==========
IRQ Storm Mitigation
--------------------
While the AIC100 DMA Bridge hardware implements an IRQ storm mitigation
mechanism, it is still possible for an IRQ storm to occur. A storm can happen
if the workload is particularly quick, and the host is responsive. If the host
can drain the response FIFO as quickly as the device can insert elements into
it, then the device will frequently transition the response FIFO from empty to
non-empty and generate MSIs at a rate equivalent to the speed of the
workload's ability to process inputs. The lprnet (license plate reader network)
workload is known to trigger this condition, and can generate in excess of 100k
MSIs per second. It has been observed that most systems cannot tolerate this
for long, and will crash due to some form of watchdog due to the overhead of
the interrupt controller interrupting the host CPU.
To mitigate this issue, the QAIC driver implements specific IRQ handling. When
QAIC receives an IRQ, it disables that line. This prevents the interrupt
controller from interrupting the CPU. Then AIC drains the FIFO. Once the FIFO
is drained, QAIC implements a "last chance" polling algorithm where QAIC will
sleep for a time to see if the workload will generate more activity. The IRQ
line remains disabled during this time. If no activity is detected, QAIC exits
polling mode and reenables the IRQ line.
This mitigation in QAIC is very effective. The same lprnet usecase that
generates 100k IRQs per second (per /proc/interrupts) is reduced to roughly 64
IRQs over 5 minutes while keeping the host system stable, and having the same
workload throughput performance (within run to run noise variation).
Single MSI Mode
---------------
MultiMSI is not well supported on all systems; virtualized ones even less so
(circa 2023). Between hypervisors masking the PCIe MSI capability structure to
large memory requirements for vIOMMUs (required for supporting MultiMSI), it is
useful to be able to fall back to a single MSI when needed.
To support this fallback, we allow the case where only one MSI is able to be
allocated, and share that one MSI between MHI and the DBCs. The device detects
when only one MSI has been configured and directs the interrupts for the DBCs
to the interrupt normally used for MHI. Unfortunately this means that the
interrupt handlers for every DBC and MHI wake up for every interrupt that
arrives; however, the DBC threaded irq handlers only are started when work to be
done is detected (MHI will always start its threaded handler).
If the DBC is configured to force MSI interrupts, this can circumvent the
software IRQ storm mitigation mentioned above. Since the MSI is shared it is
never disabled, allowing each new entry to the FIFO to trigger a new interrupt.
Neural Network Control (NNC) Protocol
=====================================
The implementation of NNC is split between the KMD (QAIC) and UMD. In general
QAIC understands how to encode/decode NNC wire protocol, and elements of the
protocol which require kernel space knowledge to process (for example, mapping
host memory to device IOVAs). QAIC understands the structure of a message, and
all of the transactions. QAIC does not understand commands (the payload of a
passthrough transaction).
QAIC handles and enforces the required little endianness and 64-bit alignment,
to the degree that it can. Since QAIC does not know the contents of a
passthrough transaction, it relies on the UMD to satisfy the requirements.
The terminate transaction is of particular use to QAIC. QAIC is not aware of
the resources that are loaded onto a device since the majority of that activity
occurs within NNC commands. As a result, QAIC does not have the means to
roll back userspace activity. To ensure that a userspace client's resources
are fully released in the case of a process crash, or a bug, QAIC uses the
terminate command to let QSM know when a user has gone away, and the resources
can be released.
QSM can report a version number of the NNC protocol it supports. This is in the
form of a Major number and a Minor number.
Major number updates indicate changes to the NNC protocol which impact the
message format, or transactions (impacts QAIC).
Minor number updates indicate changes to the NNC protocol which impact the
commands (does not impact QAIC).
uAPI
====
QAIC creates an accel device per physical PCIe device. This accel device exists
for as long as the PCIe device is known to Linux.
The PCIe device may not be in the state to accept requests from userspace at
all times. QAIC will trigger KOBJ_ONLINE/OFFLINE uevents to advertise when the
device can accept requests (ONLINE) and when the device is no longer accepting
requests (OFFLINE) because of a reset or other state transition.
QAIC defines a number of driver specific IOCTLs as part of the userspace API.
DRM_IOCTL_QAIC_MANAGE
This IOCTL allows userspace to send a NNC request to the QSM. The call will
block until a response is received, or the request has timed out.
DRM_IOCTL_QAIC_CREATE_BO
This IOCTL allows userspace to allocate a buffer object (BO) which can send
or receive data from a workload. The call will return a GEM handle that
represents the allocated buffer. The BO is not usable until it has been
sliced (see DRM_IOCTL_QAIC_ATTACH_SLICE_BO).
DRM_IOCTL_QAIC_MMAP_BO
This IOCTL allows userspace to prepare an allocated BO to be mmap'd into the
userspace process.
DRM_IOCTL_QAIC_ATTACH_SLICE_BO
This IOCTL allows userspace to slice a BO in preparation for sending the BO
to the device. Slicing is the operation of describing what portions of a BO
get sent where to a workload. This requires a set of DMA transfers for the
DMA Bridge, and as such, locks the BO to a specific DBC.
DRM_IOCTL_QAIC_EXECUTE_BO
This IOCTL allows userspace to submit a set of sliced BOs to the device. The
call is non-blocking. Success only indicates that the BOs have been queued
to the device, but does not guarantee they have been executed.
DRM_IOCTL_QAIC_PARTIAL_EXECUTE_BO
This IOCTL operates like DRM_IOCTL_QAIC_EXECUTE_BO, but it allows userspace
to shrink the BOs sent to the device for this specific call. If a BO
typically has N inputs, but only a subset of those is available, this IOCTL
allows userspace to indicate that only the first M bytes of the BO should be
sent to the device to minimize data transfer overhead. This IOCTL dynamically
recomputes the slicing, and therefore has some processing overhead before the
BOs can be queued to the device.
DRM_IOCTL_QAIC_WAIT_BO
This IOCTL allows userspace to determine when a particular BO has been
processed by the device. The call will block until either the BO has been
processed and can be re-queued to the device, or a timeout occurs.
DRM_IOCTL_QAIC_PERF_STATS_BO
This IOCTL allows userspace to collect performance statistics on the most
recent execution of a BO. This allows userspace to construct an end to end
timeline of the BO processing for a performance analysis.
DRM_IOCTL_QAIC_DETACH_SLICE_BO
This IOCTL allows userspace to remove the slicing information from a BO that
was originally provided by a call to DRM_IOCTL_QAIC_ATTACH_SLICE_BO. This
is the inverse of DRM_IOCTL_QAIC_ATTACH_SLICE_BO. The BO must be idle for
DRM_IOCTL_QAIC_DETACH_SLICE_BO to be called. After a successful detach slice
operation the BO may have new slicing information attached with a new call
to DRM_IOCTL_QAIC_ATTACH_SLICE_BO. After detach slice, the BO cannot be
executed until after a new attach slice operation. Combining attach slice
and detach slice calls allows userspace to use a BO with multiple workloads.
Userspace Client Isolation
==========================
AIC100 supports multiple clients. Multiple DBCs can be consumed by a single
client, and multiple clients can each consume one or more DBCs. Workloads
may contain sensitive information therefore only the client that owns the
workload should be allowed to interface with the DBC.
Clients are identified by the instance associated with their open(). A client
may only use memory they allocate, and DBCs that are assigned to their
workloads. Attempts to access resources assigned to other clients will be
rejected.
Module parameters
=================
QAIC supports the following module parameters:
**datapath_polling (bool)**
Configures QAIC to use a polling thread for datapath events instead of relying
on the device interrupts. Useful for platforms with broken multiMSI. Must be
set at QAIC driver initialization. Default is 0 (off).
**mhi_timeout_ms (unsigned int)**
Sets the timeout value for MHI operations in milliseconds (ms). Must be set
at the time the driver detects a device. Default is 2000 (2 seconds).
**control_resp_timeout_s (unsigned int)**
Sets the timeout value for QSM responses to NNC messages in seconds (s). Must
be set at the time the driver is sending a request to QSM. Default is 60 (one
minute).
**wait_exec_default_timeout_ms (unsigned int)**
Sets the default timeout for the wait_exec ioctl in milliseconds (ms). Must be
set prior to the waic_exec ioctl call. A value specified in the ioctl call
overrides this for that call. Default is 5000 (5 seconds).
**datapath_poll_interval_us (unsigned int)**
Sets the polling interval in microseconds (us) when datapath polling is active.
Takes effect at the next polling interval. Default is 100 (100 us).
**timesync_delay_ms (unsigned int)**
Sets the time interval in milliseconds (ms) between two consecutive timesync
operations. Default is 1000 (1000 ms).