[PATCH kernel RFC 0/3] powerpc/pseries/iommu: GPU coherent memory pass through

Alexey Kardashevskiy aik at ozlabs.ru
Mon Sep 17 17:05:07 AEST 2018


Ping?

The problem is still there...


On 24/08/2018 13:04, Alexey Kardashevskiy wrote:
> 
> 
> On 09/08/2018 14:41, Alexey Kardashevskiy wrote:
>>
>>
>> On 25/07/2018 19:50, Alexey Kardashevskiy wrote:
>>> I am trying to pass through a 3D controller:
>>> [0302]: NVIDIA Corporation GV100GL [Tesla V100 SXM2] [10de:1db1] (rev a1)
>>>
>>> which has a quite unique feature as coherent memory directly accessible
>>> from a POWER9 CPU via an NVLink2 transport.
>>>
>>> So in addition to passing a PCI device + accompanying NPU devices,
>>> we will also be passing the host physical address range as it is done
>>> on the bare metal system.
>>>
>>> The memory on the host is presented as:
>>>
>>> ===
>>> [aik at yc02goos ~]$ lsprop /proc/device-tree/memory at 42000000000
>>> ibm,chip-id      000000fe (254)
>>> device_type      "memory"
>>> compatible       "ibm,coherent-device-memory"
>>> reg              00000420 00000000 00000020 00000000
>>> linux,usable-memory
>>>                  00000420 00000000 00000000 00000000
>>> phandle          00000726 (1830)
>>> name             "memory"
>>> ibm,associativity
>>>                  00000004 000000fe 000000fe 000000fe 000000fe
>>> ===
>>>
>>> and the host does not touch it as the second 64bit value of
>>> "linux,usable-memory" - the size - is null. Later on the NVIDIA driver
>>> trains the NVLink2 and probes this memory and this is how it becomes
>>> onlined.
>>>
>>> In the virtual environment I am planning on doing the same thing,
>>> however there is a difference in 64bit DMA handling. The powernv
>>> platform uses a PHB3 bypass mode and that just works but
>>> the pseries platform uses DDW RTAS API to achieve the same
>>> result and the problem with this is that we need a huge DMA
>>> window to start from zero (because this GPU supports less than
>>> 50bits for DMA address space) and cover not just present memory
>>> but also this new coherent memory.
>>>
>>>
>>> This is based on sha1
>>> d72e90f3 Linus Torvalds "Linux 4.18-rc6".
>>>
>>> Please comment. Thanks.
>>
>>
>> Ping?
> 
> 
> Ping?
> 
>>
>>
>>>
>>>
>>>
>>> Alexey Kardashevskiy (3):
>>>   powerpc/pseries/iommu: Allow dynamic window to start from zero
>>>   powerpc/pseries/iommu: Force default DMA window removal
>>>   powerpc/pseries/iommu: Use memory@ nodes in max RAM address
>>>     calculation
>>>
>>>  arch/powerpc/platforms/pseries/iommu.c | 77 ++++++++++++++++++++++++++++++----
>>>  1 file changed, 70 insertions(+), 7 deletions(-)
>>>
>>
> 

-- 
Alexey


More information about the Linuxppc-dev mailing list