Support huge (PMD-size and PUD-size) page-table entries by providing a
huge_fault() callback.
We still support private mappings and write-notify by splitting the huge
page-table entries on write-access.
Note that for huge page-faults to occur, either the kernel needs to be
compiled with trans-huge-pages always enabled, or the kernel needs to be
compiled with trans-huge-pages enabled using madvise, and the user-space
app needs to call madvise() to enable trans-huge pages on a per-mapping
basis.
Furthermore huge page-faults will not succeed unless buffer objects and
user-space addresses are aligned on huge page size boundaries.
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Thomas Hellstrom (VMware) <thomas_os@shipmail.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
TTM graphics buffer objects may, transparently to user-space, move
between IO and system memory. When that happens, all PTEs pointing to the
old location are zapped before the move and then faulted in again if
needed. When that happens, the page protection caching mode- and
encryption bits may change and be different from those of
struct vm_area_struct::vm_page_prot.
We were using an ugly hack to set the page protection correctly.
Fix that and instead export and use vmf_insert_mixed_prot() or use
vmf_insert_pfn_prot().
Also get the default page protection from
struct vm_area_struct::vm_page_prot rather than using vm_get_page_prot().
This way we catch modifications done by the vm system for drivers that
want write-notification.
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: "Christian König" <christian.koenig@amd.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
UAPI Changes:
- Add support for DMA-BUF HEAPS.
Cross-subsystem Changes:
- mipi dsi definition updates, pulled into drm-intel as well.
- Add lockdep annotations for dma_resv vs mmap_sem and fs_reclaim.
- Remove support for dma-buf kmap/kunmap.
- Constify fb_ops in all fbdev drivers, including drm drivers and drm-core, and media as well.
Core Changes:
- Small cleanups to ttm.
- Fix SCDC definition.
- Assorted cleanups to core.
- Add todo to remove load/unload hooks, and use generic fbdev emulation.
- Assorted documentation updates.
- Use blocking ww lock in ttm fault handler.
- Remove drm_fb_helper_fbdev_setup/teardown.
- Warning fixes with W=1 for atomic.
- Use drm_debug_enabled() instead of drm_debug flag testing in various drivers.
- Fallback to nontiled mode in fbdev emulation when not all tiles are present. (Later on reverted)
- Various kconfig indentation fixes in core and drivers.
- Fix freeing transactions in dp-mst correctly.
- Sean Paul is steping down as core maintainer. :-(
- Add lockdep annotations for atomic locks vs dma-resv.
- Prevent use-after-free for a bad job in drm_scheduler.
- Fill out all block sizes in the P01x and P210 definitions.
- Avoid division by zero in drm/rect, and fix bounds.
- Add drm/rect selftests.
- Add aspect ratio and alternate clocks for HDMI 4k modes.
- Add todo for drm_framebuffer_funcs and fb_create cleanup.
- Drop DRM_AUTH for prime import/export ioctls.
- Clear DP-MST payload id tables downstream when initializating.
- Fix for DSC throughput definition.
- Add extra FEC definitions.
- Fix fake offset in drm_gem_object_funs.mmap.
- Stop using encoder->bridge in core directly
- Handle bridge chaining slightly better.
- Add backlight support to drm/panel, and use it in many panel drivers.
- Increase max number of y420 modes from 128 to 256, as preparation to add the new modes.
Driver Changes:
- Small fixes all over.
- Fix documentation in vkms.
- Fix mmap_sem vs dma_resv in nouveau.
- Small cleanup in komeda.
- Add page flip support in gma500 for psb/cdv.
- Add ddc symlink in the connector sysfs directory for many drivers.
- Add support for analogic an6345, and fix small bugs in it.
- Add atomic modesetting support to ast.
- Fix radeon fault handler VMA race.
- Switch udl to use generic shmem helpers.
- Unconditional vblank handling for mcde.
- Miscellaneous fixes to mcde.
- Tweak debug output from komeda using debugfs.
- Add gamma and color transform support to komeda for DOU-IPS.
- Add support for sony acx424AKP panel.
- Various small cleanups to gma500.
- Use generic fbdev emulation in udl, and replace udl_framebuffer with generic implementation.
- Add support for Logic PD Type 28 panel.
- Use drm_panel_* wrapper functions in exynos/tegra/msm.
- Add devicetree bindings for generic DSI panels.
- Don't include drm_pci.h directly in many drivers.
- Add support for begin/end_cpu_access in udmabuf.
- Stop using drm_get_pci_dev in gma500 and mga200.
- Fixes to UDL damage handling, and use dma_buf_begin/end_cpu_access.
- Add devfreq thermal support to panfrost.
- Fix hotplug with daisy chained monitors by removing VCPI when disabling topology manager.
- meson: Add support for OSD1 plane AFBC commit.
- Stop displaying garbage when toggling ast primary plane on/off.
- More cleanups and fixes to UDL.
- Add D32 suport to komeda.
- Remove globle copy of drm_dev in gma500.
- Add support for Boe Himax8279d MIPI-DSI LCD panel.
- Add support for ingenic JZ4770 panel.
- Small null pointer deference fix in ingenic.
- Remove support for the special tfp420 driver, as there is a generic way to do it.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEuXvWqAysSYEJGuVH/lWMcqZwE8MFAl34lkkACgkQ/lWMcqZw
E8M76g//WRYl9fWnV063s44FBVJYjGxaus0vQJSGidaPCIE6Ep6TNjXp8DVzV82M
HR79P9glL02DC9B8pflioNNXdIRGSVk/FJcKVB2seFAqEFCAknvWDM/X/y+mOUpp
fUeFl+Znlwx3YlM8f4Qujdbm+CbTewfbya4VAWeWd8XG2V8jfq5cmODPPlUMNenZ
J6Ja+W3ph741uSIfAKaP69LVJgOcuUjXINE4SWhRk/i5QF3GIRej/A7ZjWGLQ/t2
2zUUF7EiCzhPomM40H3ddKtXb4ZjNJuc5pOD4GpxR8ciNbe2gUOHEZ5aenwYBdsU
5MwbxNKyMbKXATtn3yv3fSc4jH3DtmEKpmovONeO8ZDBrQBnxeYa3tQvfkNghA2f
acoZMzYUImV+ft6DMIgpXppASvo7mQYDAbLPOGEJ9E44AL4UP00jesEjnK5FOHSR
3BEzGUnK/6QL5zFNPni8YZQ8dan4jDIno1mqIV+cQ4WCGlaKckzIWO6243Bf13b/
kROSJpgWkiK6Ngq0ofhD0MHyT/m1QnqUzWRKTJhRtPflSWRBsDZqWCQ5Vx1QlNIE
/HfTNbTpXWwa+5wXbbB8TkDw5t9cQGnR+QcrEd9HgoIec7B5Re8rx9i0TJAT4N05
03RCQCecSfD8gwKd2wgaFIpFGRl9lTdLYSpffSmyL2X5a20lZhM=
=b15X
-----END PGP SIGNATURE-----
Merge tag 'drm-misc-next-2019-12-16' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
drm-misc-next for v5.6:
UAPI Changes:
- Add support for DMA-BUF HEAPS.
Cross-subsystem Changes:
- mipi dsi definition updates, pulled into drm-intel as well.
- Add lockdep annotations for dma_resv vs mmap_sem and fs_reclaim.
- Remove support for dma-buf kmap/kunmap.
- Constify fb_ops in all fbdev drivers, including drm drivers and drm-core, and media as well.
Core Changes:
- Small cleanups to ttm.
- Fix SCDC definition.
- Assorted cleanups to core.
- Add todo to remove load/unload hooks, and use generic fbdev emulation.
- Assorted documentation updates.
- Use blocking ww lock in ttm fault handler.
- Remove drm_fb_helper_fbdev_setup/teardown.
- Warning fixes with W=1 for atomic.
- Use drm_debug_enabled() instead of drm_debug flag testing in various drivers.
- Fallback to nontiled mode in fbdev emulation when not all tiles are present. (Later on reverted)
- Various kconfig indentation fixes in core and drivers.
- Fix freeing transactions in dp-mst correctly.
- Sean Paul is steping down as core maintainer. :-(
- Add lockdep annotations for atomic locks vs dma-resv.
- Prevent use-after-free for a bad job in drm_scheduler.
- Fill out all block sizes in the P01x and P210 definitions.
- Avoid division by zero in drm/rect, and fix bounds.
- Add drm/rect selftests.
- Add aspect ratio and alternate clocks for HDMI 4k modes.
- Add todo for drm_framebuffer_funcs and fb_create cleanup.
- Drop DRM_AUTH for prime import/export ioctls.
- Clear DP-MST payload id tables downstream when initializating.
- Fix for DSC throughput definition.
- Add extra FEC definitions.
- Fix fake offset in drm_gem_object_funs.mmap.
- Stop using encoder->bridge in core directly
- Handle bridge chaining slightly better.
- Add backlight support to drm/panel, and use it in many panel drivers.
- Increase max number of y420 modes from 128 to 256, as preparation to add the new modes.
Driver Changes:
- Small fixes all over.
- Fix documentation in vkms.
- Fix mmap_sem vs dma_resv in nouveau.
- Small cleanup in komeda.
- Add page flip support in gma500 for psb/cdv.
- Add ddc symlink in the connector sysfs directory for many drivers.
- Add support for analogic an6345, and fix small bugs in it.
- Add atomic modesetting support to ast.
- Fix radeon fault handler VMA race.
- Switch udl to use generic shmem helpers.
- Unconditional vblank handling for mcde.
- Miscellaneous fixes to mcde.
- Tweak debug output from komeda using debugfs.
- Add gamma and color transform support to komeda for DOU-IPS.
- Add support for sony acx424AKP panel.
- Various small cleanups to gma500.
- Use generic fbdev emulation in udl, and replace udl_framebuffer with generic implementation.
- Add support for Logic PD Type 28 panel.
- Use drm_panel_* wrapper functions in exynos/tegra/msm.
- Add devicetree bindings for generic DSI panels.
- Don't include drm_pci.h directly in many drivers.
- Add support for begin/end_cpu_access in udmabuf.
- Stop using drm_get_pci_dev in gma500 and mga200.
- Fixes to UDL damage handling, and use dma_buf_begin/end_cpu_access.
- Add devfreq thermal support to panfrost.
- Fix hotplug with daisy chained monitors by removing VCPI when disabling topology manager.
- meson: Add support for OSD1 plane AFBC commit.
- Stop displaying garbage when toggling ast primary plane on/off.
- More cleanups and fixes to UDL.
- Add D32 suport to komeda.
- Remove globle copy of drm_dev in gma500.
- Add support for Boe Himax8279d MIPI-DSI LCD panel.
- Add support for ingenic JZ4770 panel.
- Small null pointer deference fix in ingenic.
- Remove support for the special tfp420 driver, as there is a generic way to do it.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ba73535a-9334-5302-2e1f-5208bd7390bd@linux.intel.com
The fake offset is going to stay, so change the calling convention for
drm_gem_object_funcs.mmap to include the fake offset. Update all users
accordingly.
Note that this reverts 83b8a6f242 ("drm/gem: Fix mmap fake offset
handling for drm_gem_object_funcs.mmap") and on top then adds the fake
offset to drm_gem_prime_mmap to make sure all paths leading to
obj->funcs->mmap are consistent.
v3: move fake-offset tweak in drm_gem_prime_mmap() so we have this code
only once in the function (Rob Herring).
Fixes: 83b8a6f242 ("drm/gem: Fix mmap fake offset handling for drm_gem_object_funcs.mmap")
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Rob Herring <robh@kernel.org>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20191127092523.5620-2-kraxel@redhat.com
That is needed by at least a cleanup in radeon.
v2: also export ttm_bo_vm_access
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/339353/
The default TTM fault handler may not be completely sufficient
(vmwgfx needs to do some bookkeeping, control the write protectionand also
needs to restrict the number of prefaults).
Also make it possible replicate ttm_bo_vm_reserve() functionality for,
for example, mkwrite handlers.
So turn the TTM vm code into helpers: ttm_bo_vm_fault_reserved(),
ttm_bo_vm_open(), ttm_bo_vm_close() and ttm_bo_vm_reserve(). Also provide
a default TTM fault handler for other drivers to use.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
The explicit typcasts are meaningless, so remove them.
Suggested-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
With nouveau fixed all ttm-using drives have the correct nesting of
mmap_sem vs dma_resv, and we can just lock the buffer.
Assuming I didn't screw up anything with my audit of course.
v2:
- Dont forget wu_mutex (Christian König)
- Keep the mmap_sem-less wait optimization (Thomas)
- Use _lock_interruptible to be good citizens (Thomas)
v3: Rebase over fault handler helperification.
Reviewed-by: Christian König <christian.koenig@amd.com> (v2)
Reviewed-by: Thomas Hellström <thellstrom@vmware.com> (v2)
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: "VMware Graphics" <linux-graphics-maintainer@vmware.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191104173801.2972-3-daniel.vetter@ffwll.ch
The default TTM fault handler may not be completely sufficient
(vmwgfx needs to do some bookkeeping, control the write protectionand also
needs to restrict the number of prefaults).
Also make it possible replicate ttm_bo_vm_reserve() functionality for,
for example, mkwrite handlers.
So turn the TTM vm code into helpers: ttm_bo_vm_fault_reserved(),
ttm_bo_vm_open(), ttm_bo_vm_close() and ttm_bo_vm_reserve(). Also provide
a default TTM fault handler for other drivers to use.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/332900/?series=67217&rev=1
Signed-off-by: Christian König <christian.koenig@amd.com>
The explicit typcasts are meaningless, so remove them.
Suggested-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/332899/
Signed-off-by: Christian König <christian.koenig@amd.com>
Commit c40069cb7b ("drm: add mmap() to drm_gem_object_funcs")
introduced a GEM object mmap() hook which is expected to subtract the
fake offset from vm_pgoff. However, for mmap() on dmabufs, there is not
a fake offset.
To fix this, let's always call mmap() object callback with an offset of 0,
and leave it up to drm_gem_mmap_obj() to remove the fake offset.
TTM still needs the fake offset, so we have to add it back until that's
fixed.
Fixes: c40069cb7b ("drm: add mmap() to drm_gem_object_funcs")
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Rob Herring <robh@kernel.org>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20191024191859.31700-1-robh@kernel.org
As the name says global memory and bo accounting is global. So it doesn't
make to much sense having pointers to global structures all around the code.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Thomas Hellström <thellstrom@vmware.com>
Link: https://patchwork.freedesktop.org/patch/332879/
Parroting Daniel's backmerge justification from
2e79e22e092acd55da0b2db066e4826d7d152c41:
Thierry needs fd70c7755b ("drm/bridge: tc358767: fix max_tu_symbol
value") to be able to merge his dp_link patch series.
Signed-off-by: Sean Paul <seanpaul@chromium.org>
-----BEGIN PGP SIGNATURE-----
iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl2su/AeHHRvcnZhbGRz
QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGvm4H/1jkheCrvB/GJS69
wd18vizAg+eFmNCzxlGVhpQTKGymNRy+g6clnoli3cNJ3pSVKcYgVyB3oXaONIhp
g/ANudnBjTdjqYgJzfLij5AGecrGwDpF3YL0kuKrCB63s2I/HwQGYy/aPrYY8emy
gAYdaf1DGRu5/DIIB6soTo/TnpKoAyTE+XY5MaPSug++t/Flov19tlU40IZxXW94
bjTXbm0yklrsIx+LL5mYYGGnygSTCF66JjFg1qhDCBQaS2MZ21h1ZgaOtGZTwZcc
WgEiqLC5S1Iyj96zir1t78RcVQ4RzgvDbhUOgIqUFsYAO2wOicvxyFE3Hj8rPOKd
uGgVPRM=
=xgZa
-----END PGP SIGNATURE-----
Merge v5.4-rc4 into drm-next
Thierry needs fd70c7755b ("drm/bridge: tc358767: fix max_tu_symbol
value") to be able to merge his dp_link patch series.
Some adjacent changes conflicts, plus some clashes in i915 due to
cherry-picking and git trying to be helpful and leaving both versions
in.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Rename ttm_fbdev_mmap to ttm_bo_mmap_obj. Move the vm_pgoff sanity
check to amdgpu_bo_fbdev_mmap (only ttm_fbdev_mmap user in tree).
The ttm_bo_mmap_obj function can now be used to map any buffer object.
This allows to implement &drm_gem_object_funcs.mmap in gem ttm helpers.
v3: patch added to series
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20191016115203.20095-8-kraxel@redhat.com
Factor out ttm vma setup to a new function.
Reduces code duplication a bit.
v2: don't change vm_flags (moved to separate patch).
v4: make ttm_bo_mmap_vma_setup static.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20191016115203.20095-7-kraxel@redhat.com
Commit 4daa4fba3a ("gpu: drm: ttm: Adding new return type vm_fault_t")
broke TTM prefaulting. Since vmf_insert_mixed() typically always returns
VM_FAULT_NOPAGE, prefaulting stops after the second PTE.
Restore (almost) the original behaviour. Unfortunately we can no longer
with the new vm_fault_t return type determine whether a prefaulting
PTE insertion hit an already populated PTE, and terminate the insertion
loop. Instead we continue with the pre-determined number of prefaults.
Fixes: 4daa4fba3a ("gpu: drm: ttm: Adding new return type vm_fault_t")
Cc: Souptick Joarder <jrdr.linux@gmail.com>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Cc: stable@vger.kernel.org # v4.19+
Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/330387/
Rename the embedded struct vma_offset_manager, new name is _vma_manager.
ttm_bo_device.vma_manager changed to a pointer.
The ttm_bo_device_init() function gets an additional vma_manager
argument which allows to initialize ttm with a different vma manager.
When passing NULL the embedded _vma_manager is used.
All callers are updated to pass NULL, so the behavior doesn't change.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20190905070509.22407-2-kraxel@redhat.com
Be more consistent with the naming of the other DMA-buf objects.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/323401/
GEM defines DRM_FILE_PAGE_OFFSET_{START,SIZE} constants for the
mmap-able range of addresses. TTM can use them as well.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
A BO's address has to be at least the minimum offset. Sharing this
test in ttm_bo_mmap() removes code from drivers. A full buffer-address
validation is still done within drm_vma_offset_lockup_locked().
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Move the BO on the LRU only when it is actually moved by a DMA
operation.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-And-Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Export ttm_bo_get_unless_zero() to be used when looking up buffer
objects that are removed from the lookup structure in the destructor.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
A call to ttm_bo_unref() clears the supplied pointer to NULL, while
ttm_bo_put() does not. None of the converted call sites requires the
pointer to become NULL, so the respective assign operations has been
left out from the patch.
Signed-off-by: Thomas Zimmermann <contact@tzimmermann.org>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Thomas Zimmermann <contact@tzimmermann.org>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use new return type vm_fault_t for fault handler. For
now, this is just documenting that the function returns
a VM_FAULT value rather than an errno. Once all instances
are converted, vm_fault_t will become a distinct type.
Ref-> commit 1c8f422059 ("mm: change return type to vm_fault_t")
Previously vm_insert_{mixed,pfn} returns err which driver
mapped into VM_FAULT_* type. The new function
vmf_insert_{mixed,pfn} will replace this inefficiency by
returning VM_FAULT_* type.
Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This is dual licensed under GPL-2.0 or MIT.
Signed-off-by: Dirk Hohndel (VMware) <dirk@hohndel.org>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
set TTM_OPT_FLAG_FORCE_ALLOC when we are servicing for page
fault routine.
for ttm_mem_global_reserve if in page fault routine, allow the gtt
pages reservation always. because page fault routing already grabbed
system memory and the allowance of this exception is harmless.
Otherwise, it will trigger OOM killer.
will be used later.
v2: set the FORCE_ALLOC always
v3: minor refine
Signed-off-by: Roger He <Hongbo.He@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
To aid debugging set the page mapping during allocation instead of
during VM faults.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Roger He <Hongbo.He@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Stop calling the driver callback directly.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Roger He <Hongbo.He@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The dual ret/retval was more complex than need be. Now
we drop the retval variable and assign the appropriate VM
codes to ret instead.
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The buf pointer was not being incremented inside the loop
meaning the same block of data would be read or written
repeatedly.
(v2) Change 'buf' pointer to uint8_t* type
Cc: stable@vger.kernel.org
Fixes: 09ac4fcb3f ("drm/ttm: Implement vm_operations_struct.access v2")
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
No one will use this function except ttm_bo_io_mem_pfn() now, so move
the calculation of ttm_bo_default_io_mem_pfn() into ttm_bo_io_mem_pfn()
and do some cleanup.
Signed-off-by: Tan Xiaojun <tanxiaojun@huawei.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
forward the operation context to ttm_tt_populate as well,
and the ultimate goal is swapout enablement for reserved BOs.
v2: squash in fix for vboxvideo
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Roger He <Hongbo.He@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The io_mem_pfn field was added in commit ea642c3216 ("drm/ttm: add
io_mem_pfn callback") and is called unconditionally. However, not all
drivers were updated to set it.
Use the ttm_bo_default_io_mem_pfn function if a driver did not set its
own. And add new function ttm_bo_io_mem_pfn() as wrapper.
Signed-off-by: Michal Srb <msrb@suse.com>
Signed-off-by: Tan Xiaojun <tanxiaojun@huawei.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Pull x86 mm changes from Ingo Molnar:
"PCID support, 5-level paging support, Secure Memory Encryption support
The main changes in this cycle are support for three new, complex
hardware features of x86 CPUs:
- Add 5-level paging support, which is a new hardware feature on
upcoming Intel CPUs allowing up to 128 PB of virtual address space
and 4 PB of physical RAM space - a 512-fold increase over the old
limits. (Supercomputers of the future forecasting hurricanes on an
ever warming planet can certainly make good use of more RAM.)
Many of the necessary changes went upstream in previous cycles,
v4.14 is the first kernel that can enable 5-level paging.
This feature is activated via CONFIG_X86_5LEVEL=y - disabled by
default.
(By Kirill A. Shutemov)
- Add 'encrypted memory' support, which is a new hardware feature on
upcoming AMD CPUs ('Secure Memory Encryption', SME) allowing system
RAM to be encrypted and decrypted (mostly) transparently by the
CPU, with a little help from the kernel to transition to/from
encrypted RAM. Such RAM should be more secure against various
attacks like RAM access via the memory bus and should make the
radio signature of memory bus traffic harder to intercept (and
decrypt) as well.
This feature is activated via CONFIG_AMD_MEM_ENCRYPT=y - disabled
by default.
(By Tom Lendacky)
- Enable PCID optimized TLB flushing on newer Intel CPUs: PCID is a
hardware feature that attaches an address space tag to TLB entries
and thus allows to skip TLB flushing in many cases, even if we
switch mm's.
(By Andy Lutomirski)
All three of these features were in the works for a long time, and
it's coincidence of the three independent development paths that they
are all enabled in v4.14 at once"
* 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (65 commits)
x86/mm: Enable RCU based page table freeing (CONFIG_HAVE_RCU_TABLE_FREE=y)
x86/mm: Use pr_cont() in dump_pagetable()
x86/mm: Fix SME encryption stack ptr handling
kvm/x86: Avoid clearing the C-bit in rsvd_bits()
x86/CPU: Align CR3 defines
x86/mm, mm/hwpoison: Clear PRESENT bit for kernel 1:1 mappings of poison pages
acpi, x86/mm: Remove encryption mask from ACPI page protection type
x86/mm, kexec: Fix memory corruption with SME on successive kexecs
x86/mm/pkeys: Fix typo in Documentation/x86/protection-keys.txt
x86/mm/dump_pagetables: Speed up page tables dump for CONFIG_KASAN=y
x86/mm: Implement PCID based optimization: try to preserve old TLB entries using PCID
x86: Enable 5-level paging support via CONFIG_X86_5LEVEL=y
x86/mm: Allow userspace have mappings above 47-bit
x86/mm: Prepare to expose larger address space to userspace
x86/mpx: Do not allow MPX if we have mappings above 47-bit
x86/mm: Rename tasksize_32bit/64bit to task_size_32bit/64bit()
x86/xen: Redefine XEN_ELFNOTE_INIT_P2M using PUD_SIZE * PTRS_PER_PUD
x86/mm/dump_pagetables: Fix printout of p4d level
x86/mm/dump_pagetables: Generalize address normalization
x86/boot: Fix memremap() related build failure
...
Allows gdb to access contents of user mode mapped BOs. System memory
is handled by TTM using kmap. Other memory pools require a new driver
callback in ttm_bo_driver.
v2:
* kmap only one page at a time
* swap in BO if needed
* make driver callback more generic to handle private memory pools
* document callback return value
* WARN_ON -> WARN_ON_ONCE
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Since video memory needs to be accessed decrypted, be sure that the
memory encryption mask is not set for the video ranges.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brijesh Singh <brijesh.singh@amd.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Larry Woodman <lwoodman@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Toshimitsu Kani <toshi.kani@hpe.com>
Cc: kasan-dev@googlegroups.com
Cc: kvm@vger.kernel.org
Cc: linux-arch@vger.kernel.org
Cc: linux-doc@vger.kernel.org
Cc: linux-efi@vger.kernel.org
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/a19436f30424402e01f63a09b32ab103272acced.1500319216.git.thomas.lendacky@amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
For the C file, include <drm/*.h> instead of relative path from
include/drm.
For headers in include/drm/ttm, simplify the <tty/*.h> with "*.h".
This allows us to remove the -Iinclude/drm compiler flag from
drivers/gpu/drm/ttm/Makefile (and from other drivers' Makefiles).
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1493009447-31524-3-git-send-email-yamada.masahiro@socionext.com
This allows the driver to handle io_mem mappings on their own.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
->fault(), ->page_mkwrite(), and ->pfn_mkwrite() calls do not need to
take a vma and vmf parameter when the vma already resides in vmf.
Remove the vma parameter to simplify things.
[arnd@arndb.de: fix ARM build]
Link: http://lkml.kernel.org/r/20170125223558.1451224-1-arnd@arndb.de
Link: http://lkml.kernel.org/r/148521301778.19116.10840599906674778980.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jan Kara <jack@suse.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The vm fault handler relies on the fact that the VMA owns a reference
to the BO. However, once mmap_sem is released, other tasks are free to
destroy the VMA, which can lead to the BO being freed. Fix two code
paths where that can happen, both related to vm fault retries.
Found via a lock debugging warning which flagged &bo->wu_mutex as
locked while being destroyed.
Fixes: cbe12e74ee ("drm/ttm: Allow vm fault retries")
Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Every single user of vmf->virtual_address typed that entry to unsigned
long before doing anything with it so the type of virtual_address does
not really provide us any additional safety. Just use masked
vmf->address which already has the appropriate type.
Link: http://lkml.kernel.org/r/1479460644-25076-3-git-send-email-jack@suse.cz
Signed-off-by: Jan Kara <jack@suse.cz>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Instead of using the flag just remember the fence of the last move operation.
This avoids waiting for command submissions pipelined after the move, but
before accessing the BO with the CPU again.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Not used any more.
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Not used any more.
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Convert the raw unsigned long 'pfn' argument to pfn_t for the purpose of
evaluating the PFN_MAP and PFN_DEV flags. When both are set it triggers
_PAGE_DEVMAP to be set in the resulting pte.
There are no functional changes to the gpu drivers as a result of this
conversion.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Hansen <dave@sr71.net>
Cc: David Airlie <airlied@linux.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Today, most callers of ttm_io_prot() check TTM_PL_FLAG_CACHED before
calling it since on some archs it will unconditionally create non-cached
mappings.
But not all callers do which is incorrect as far as I can tell.
Instead, move that check inside ttm_io_port() itself for all archs
and make powerpc use the same implementation as ia64 and arm
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
No users are left, kill it off! :D
Conversion to the reservation api is next on the list, after
that the functionality can be restored with rcu.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
A performance regression was introduced in TTM in linux 3.13 when we started using
VM_PFNMAP for shared mappings. In theory this should've been faster due to
less page book-keeping but it appears like VM_PFNMAP + x86 PAT + write-combine
is a particularly cpu-hungry combination, as seen by largely increased
cpu-usage on r200 GL video playback.
Until we've sorted out why, revert to always use VM_MIXEDMAP.
Reference: freedesktop.org bugzilla bug #75719
Reported-and-tested-by: <smoki00790@gmail.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Cc: stable@vger.kernel.org
drm-intel-next-2014-01-10:
- final bits for runtime D3 on Haswell from Paul (now enabled fully)
- parse the backlight modulation freq information in the VBT from Jani
(but not yet used)
- more watermark improvements from Ville for ilk-ivb and bdw
- bugfixes for fastboot from Jesse
- watermark fix for i830M (but not yet everything)
- vlv vga hotplug w/a (Imre)
- piles of other small improvements, cleanups and fixes all over
Note that the pull request includes a backmerge of the last drm-fixes
pulled into Linus' tree - things where getting a bit too messy. So the
shortlog also contains a bunch of patches from Linus tree. Please yell if
you want me to frob it for you a bit.
* 'drm-intel-next' of git://people.freedesktop.org/~danvet/drm-intel: (609 commits)
drm/i915/bdw: make sure south port interrupts are enabled properly v2
drm/i915: Include more information in disabled hotplug interrupt warning
drm/i915: Only complain about a rogue hotplug IRQ after disabling
drm/i915: Only WARN about a stuck hotplug irq ONCE
drm/i915: s/hotplugt_status_gen4/hotplug_status_g4x/
Needed for some vm operations; most notably unmap_mapping_range() with
even_cows = 0.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
This is illegal for at least two reasons:
1) While it may work on some platforms / iommus, obtaining page pointers from
mapped sg-lists is illegal, since the DMA API allows page pointer information
to be destroyed in the sg mapping process.
2) TTM has no way of determining the linear kernel map caching state of the
underlying pages. PTEs with conflicting caching state pointing to the same
pfn is not allowed.
TTM operations touching pages of imported sg-tables should be redirected through
the proper dma-buf operations.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Acked-by: Daniel Vetter <daniel@ffwll.ch>
VM_PFNMAP is faster than VM_MIXEDMAP due to reduced page administration so
use it for shared maps where we don't have any Copy-On-Write pages. For
private maps, we continue to use VM_MIXEDMAP.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
VMAs covering a bo but that didn't start at the same address space offset as
the bo they were mapping were incorrectly generating SEGFAULT errors in
the fault handler.
Reported-by: Joseph Dolinak <kanilo2@yahoo.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
Cc: stable@vger.kernel.org
Addresses
"[BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE".
In the first occurence it was used to try to be nice while releasing the
mmap_sem and retrying the fault to work around a locking inversion.
The second occurence was never used.
There has been some discussion whether we should change the locking order to
mmap_sem -> bo_reserve. This patch doesn't address that issue, and leaves
that locking order undefined. The solution that we release the mmap_sem if
tryreserve fails and wait for the buffer to become unreserved is something
we want in any case, and follows how the core vm system waits for pages
to be come unlocked while releasing the mmap_sem.
The code also outlines what needs to be changed if we want to establish the
locking order as mmap_sem -> bo::reserve.
One slight issue that remains with this code is that the fault handler might
be prone to starvation if another thread countinously reserves the buffer.
IMO that usage pattern is highly unlikely.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Fix a long-standing TTM issue where we manipulated the vma page_prot
bits while mmap_sem was taken in read mode only. We now make a local
copy of the vma structure which we pass when we set the ptes.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
Make use of the FAULT_FLAG_ALLOW_RETRY flag to allow dropping the
mmap_sem while waiting for bo idle.
FAULT_FLAG_ALLOW_RETRY appears to be primarily designed for disk waits
but should work just as fine for GPU waits..
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
Use the new vma-manager infrastructure. This doesn't change any
implementation details as the vma-offset-manager is nearly copied 1-to-1
from TTM.
The vm_lock is moved into the offset manager so we can drop it from TTM.
During lookup, we use the vma locking helpers to take a reference to the
found object.
In all other scenarios, locking stays the same as before. We always
guarantee that drm_vma_offset_remove() is called only during destruction.
Hence, helpers like drm_vma_node_offset_addr() are always safe as long as
the node has a valid offset.
This also drops the addr_space_offset member as it is a copy of vm_start
in vma_node objects. Use the accessor functions instead.
v4:
- remove vm_lock
- use drm_vma_offset_lock_lookup() to protect lookup (instead of vm_lock)
Cc: Dave Airlie <airlied@redhat.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Cc: Martin Peres <martin.peres@labri.fr>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@gmail.com>
(*->vm_end - *->vm_start) >> PAGE_SHIFT operation is implemented
as a inline funcion vma_pages() in linux/mm.h, so using it.
Signed-off-by: Libin <huawei.libin@huawei.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Removes the need for a write lock each time we call ttm_bo_unref().
v2: Remove an unused variable.
v3: Really remove the unused variable.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
A long time ago, in v2.4, VM_RESERVED kept swapout process off VMA,
currently it lost original meaning but still has some effects:
| effect | alternative flags
-+------------------------+---------------------------------------------
1| account as reserved_vm | VM_IO
2| skip in core dump | VM_IO, VM_DONTDUMP
3| do not merge or expand | VM_IO, VM_DONTEXPAND, VM_HUGETLB, VM_PFNMAP
4| do not mlock | VM_IO, VM_DONTEXPAND, VM_HUGETLB, VM_PFNMAP
This patch removes reserved_vm counter from mm_struct. Seems like nobody
cares about it, it does not exported into userspace directly, it only
reduces total_vm showed in proc.
Thus VM_RESERVED can be replaced with VM_IO or pair VM_DONTEXPAND | VM_DONTDUMP.
remap_pfn_range() and io_remap_pfn_range() set VM_IO|VM_DONTEXPAND|VM_DONTDUMP.
remap_vmalloc_range() set VM_DONTEXPAND | VM_DONTDUMP.
[akpm@linux-foundation.org: drivers/vfio/pci/vfio_pci.c fixup]
Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Carsten Otte <cotte@de.ibm.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Eric Paris <eparis@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Morris <james.l.morris@oracle.com>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Kentaro Takeda <takedakn@nttdata.co.jp>
Cc: Matt Helsley <matthltc@us.ibm.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Venkatesh Pallipadi <venki@google.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Use the more current logging style.
Add pr_fmt and remove the TTM_PFX uses.
Coalesce formats and align arguments.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Move the page allocation and freeing to driver callback and
provide ttm code helper function for those.
Most intrusive change, is the fact that we now only fully
populate an object this simplify some of code designed around
the page fault design.
V2 Rebase on top of memory accounting overhaul
V3 New rebase on top of more memory accouting changes
V4 Rebase on top of no memory account changes (where/when is my
delorean when i need it ?)
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
This patch attempts to fix up shortcomings with the current calling
sequences.
1) There's a fastpath where no locking occurs and only io_mem_reserved is
called to obtain needed info for mapping. The fastpath is set per
memory type manager.
2) If the fastpath is disabled, io_mem_reserve and io_mem_free will be exactly
balanced and not called recursively for the same struct ttm_mem_reg.
3) Optionally the driver can choose to enable a per memory type manager LRU
eviction mechanism that, when io_mem_reserve returns -EAGAIN will attempt
to kill user-space mappings of memory in that manager to free up needed
resources
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
The bo lock used only to protect the bo sync object members, and since it
is a per bo lock, fencing a buffer list will see a lot of locks and unlocks.
Replace it with a per-device lock that protects the sync object members on
*all* bos. Reading and setting these members will always be very quick, so
the risc of heavy lock contention is microscopic. Note that waiting for
sync objects will always take place outside of this lock.
The bo device fence lock will eventually be replaced with a seqlock /
rcu mechanism so we can determine that a bo is idle under a
rcu / read seqlock.
However this change will allow us to batch fencing and unreserving of
buffers with a minimal amount of locking.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jerome Glisse <j.glisse@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
On fault the driver is given the opportunity to perform any operation
it sees fit in order to place the buffer into a CPU visible area of
memory. This patch doesn't break TTM users, nouveau, vmwgfx and radeon
should keep working properly. Future patch will take advantage of this
infrastructure and remove the old path from TTM once driver are
converted.
V2 return VM_FAULT_NOPAGE if callback return -EBUSY or -ERESTARTSYS
V3 balance io_mem_reserve and io_mem_free call, fault_reserve_notify
is responsible to perform any necessary task for mapping to succeed
V4 minor cleanup, atomic_t -> bool as member is protected by reserve
mecanism from concurent access
V5 the callback is now responsible for iomapping the bo and providing
a virtual address this simplify TTM and will allow to get rid of
TTM_MEMTYPE_FLAG_NEEDS_IOREMAP
V6 use the bus addr data to decide to ioremap or this isn't needed
but we don't necesarily need to ioremap in the callback but still
allow driver to use static mapping
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This path isn't used by radeon yet, but future drivers will want it,
so fix it right.
Reported-by: Luca Barbieri <luca@luca-barbieri.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Return -ERESTARTSYS instead of -ERESTART when interrupted by a signal.
The -ERESTARTSYS is converted to an -EINTR by the kernel signal layer
before returned to user-space.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
* mark struct vm_area_struct::vm_ops as const
* mark vm_ops in AGP code
But leave TTM code alone, something is fishy there with global vm_ops
being used.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'drm-radeon-kms' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: (35 commits)
drm/radeon: set fb aperture sizes for framebuffer handoff.
drm/ttm: fix highuser vs dma32 confusion.
drm/radeon: Fix size used for benchmarking BO copies.
drm/radeon: Add radeon.test parameter for running BO GPU copy tests.
drm/radeon/kms: allow interruptible waits for objects.
drm/ttm: powerpc: Fix Highmem cache flushing.
x86: Export kmap_atomic_prot() needed for TTM.
drm/ttm: Fix ttm in-kernel copying of pages with non-standard caching attributes.
drm/ttm: Fix an oops and sync object leak.
drm/radeon/kms: vram sizing on certain r100 chips needs workaround.
drm/radeon: Pay more attention to object placement requested by userspace.
drm/radeon: Fall back to evicting BOs with memcpy if necessary.
drm/radeon: Don't unreserve twice on failure to validate.
drm/radeon/kms: fix bandwidth computation on avivo hardware
drm/radeon/kms: add initial colortiling support.
drm/radeon/kms: fix hotspot handling on pre-avivo chips
drm/radeon/kms: enable frac fb divs on rs600/rs690/rs740
drm/radeon/kms: add PLL flag to prefer frequencies <= the target freq
drm/radeon/kms: block RN50 from using 3D engine.
drm/radeon/kms: fix VRAM sizing like DDX does it.
...
This adds new set/get tiling interfaces where the pitch
and macro/micro tiling enables can be set. Along with
a flag to decide if this object should have a surface when mapped.
The only thing we need to allocate with a mapped surface should be
the frontbuffer. Note rotate scanout shouldn't require one, and
back/depth shouldn't either, though mesa needs some fixes.
It fixes the TTM interfaces along Thomas's suggestions, and I've tested
the surface stealing code with two X servers and not seen any lockdep issues.
I've stopped tiling the fbcon frontbuffer, as I don't see there being
any advantage other than testing, I've left the testing commands in there,
just flip the fb_tiled to true in radeon_fb.c
Open: Can we integrate endian swapping in with this?
Future features:
texture tiling - need to relocate texture registers TXOFFSET* with tiling info.
This also merges Michel's cleanup surfaces regs at init time patch
even though it makes sense on its own, this patch really relies on it.
Some PowerMac firmwares set up a tiling surface at the beginning of VRAM
which messes us up otherwise.
that patch is:
Signed-off-by: Michel Dänzer <daenzer@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
TTM is a GPU memory manager subsystem designed for use with GPU
devices with various memory types (On-card VRAM, AGP,
PCI apertures etc.). It's essentially a helper library that assists
the DRM driver in creating and managing persistent buffer objects.
TTM manages placement of data and CPU map setup and teardown on
data movement. It can also optionally manage synchronization of
data on a per-buffer-object level.
TTM takes care to provide an always valid virtual user-space address
to a buffer object which makes user-space sub-allocation of
big buffer objects feasible.
TTM uses a fine-grained per buffer-object locking scheme, taking
care to release all relevant locks when waiting for the GPU.
Although this implies some locking overhead, it's probably a big
win for devices with multiple command submission mechanisms, since
the lock contention will be minimal.
TTM can be used with whatever user-space interface the driver
chooses, including GEM. It's used by the upcoming Radeon KMS DRM driver
and is also the GPU memory management core of various new experimental
DRM drivers.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>