Memory allocation for DMA transfer

The DMA framework is not responsible for managing the memory buffers used for the DMA transfer. This is left to the users of the framework.

You can't safely perform a DMA transfer directly to or from memory that has been allocated to a user process in the normal way, that is, to user chunks. There are a number of reasons for this:

1. While a DMA transfer is in progress to the user chunk, the owning user process might free the memory, or the kernel might do so if the process dies. This is a problem because the freed memory could be reused for other purposes. Unaware of the reallocation, the DMA controller would continue with the transfer, using the physical addresses supplied, and trash the contents of the newly allocated memory. You can overcome this problem by ensuring that the driver opens the chunk representing the user memory for the duration of the transfer, but this can be inefficient.

2. A process context switch may change the location of the memory. To be suitable for DMA, the memory needs to be available to the kernel at a fixed location.

3. The peripheral may mandate DMA to a specific physical memory region, and the allocation of user-mode memory doesn't allow this attribute to be specified.

4. Since the DMA controller interfaces directly with the physical address space, it bypasses the MMU, cache and write buffer. Hence, it is important to ensure that the DMA memory buffer and the cache are coherent. One way to achieve this is to disable caching in the buffers used for DMA transfer. Again, the allocation of user-mode memory doesn't allow these caching and buffering options to be specified.

You can avoid all these problems by allocating the DMA buffers kernel-side, so device drivers that support DMA usually allocate the memory used for DMA transfers themselves.

The example code that follows shows how a driver would use a hardware chunk to allocate a buffer that is non-cacheable and non-bufferable to avoid cache incoherency issues. This creates a global memory buffer that is accessible kernel-side only. By using RAM pages that are physically contiguous, this also avoids the issue of memory fragmentation.

TUint32 physAddr=0;
TUint32 size=Kern::RoundToPageSize(aBuffersize);

// Get contiguous pages of RAM from the system's free pool
if (Epoc::AllocPhysicalRam(size,physAddr) != KErrNone)
    return(NULL);

// EMapAttrSupRw - supervisor read/write, user no access
// EMapAttrFullyBlocking - uncached, unbuffered
DPlatChunkHw* chunk;
if(DPlatChunkHw::New(chunk,physAddr,size,EMapAttrSupRw|EMapAttrFullyBlocking) != KErrNone)
    {
    // Creating the chunk failed, so return the RAM to the free pool
    Epoc::FreePhysicalRam(physAddr,size);
    return(NULL);
    }

// Kernel-side linear address of the new buffer
TUint8* buf;
buf=reinterpret_cast<TUint8*>(chunk->LinearAddress());
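The example rounds the requested size up to a whole number of pages before asking for physical RAM. As a rough, standalone illustration of that arithmetic, here is a small sketch in portable C++. It is not the kernel's implementation of Kern::RoundToPageSize(): the 4 KB page size and the names kPageSize, RoundUpToPage and PageCount are assumptions made purely for this example.

```cpp
#include <cassert>
#include <cstdint>

// Assumed page size, for illustration only; the real value is platform-specific.
constexpr std::uint32_t kPageSize = 4096;

static_assert((kPageSize & (kPageSize - 1)) == 0,
              "the masking trick below needs a power-of-two page size");

// Round a byte count up to the next multiple of the page size.
// Sizes within one page of the 32-bit limit would wrap around.
constexpr std::uint32_t RoundUpToPage(std::uint32_t aSize) {
    return (aSize + kPageSize - 1) & ~(kPageSize - 1);
}

// Number of whole pages needed to hold aSize bytes.
inline std::uint32_t PageCount(std::uint32_t aSize) {
    const std::uint32_t rounded = RoundUpToPage(aSize);
    assert(rounded >= aSize);  // fails only if the rounding wrapped around
    return rounded / kPageSize;
}
```

With these assumptions, a request for 4097 bytes is rounded up to 8192 bytes, that is, two pages.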

On the other hand, you may have reasons that make it preferable to allocate DMA buffers that are cached, for instance if you want to perform significant data processing on data in the buffer. You can do this using the same example code, but with the cache attribute EMapAttrFullyBlocking replaced with EMapAttrCachedMax. However, to maintain cache coherency, you must then flush the cache for each DMA transfer.

For a DMA transfer from cacheable memory to peripheral (that is, a DMA write), the memory cache must be flushed before the transfer. The kernel provides the following method for this:

void Cache::SyncMemoryBeforeDmaWrite(TLinAddr aBase, TUint aSize,
                                     TUint32 aMapAttr);

For a DMA transfer from peripheral to cacheable memory (that is, a DMA read), the cache may have to be flushed both before and after the transfer. Again, methods are provided for this:

void Cache::SyncMemoryBeforeDmaRead(TLinAddr aBase, TUint aSize,
                                    TUint32 aMapAttr);

void Cache::SyncMemoryAfterDmaRead(TLinAddr aBase, TUint aSize);
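To see why this synchronization matters, it can help to play with a toy model of a write-back cache. The sketch below is plain, standalone C++ and says nothing about how the Cache:: methods are implemented. The class ToyCachedMemory and all its member names are invented for this illustration, and the model assumes one cache line per byte, which no real cache does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy write-back cache sitting in front of RAM (hypothetical model).
class ToyCachedMemory {
public:
    explicit ToyCachedMemory(std::size_t aSize)
        : iRam(aSize, 0), iCache(aSize, 0),
          iValid(aSize, false), iDirty(aSize, false) {}

    // CPU accesses go through the cache.
    void CpuWrite(std::size_t aPos, std::uint8_t aValue) {
        assert(aPos < iRam.size());
        iCache[aPos] = aValue;
        iValid[aPos] = true;
        iDirty[aPos] = true;   // RAM is now out of date
    }
    std::uint8_t CpuRead(std::size_t aPos) {
        assert(aPos < iRam.size());
        if (!iValid[aPos]) {   // cache miss: fetch from RAM
            iCache[aPos] = iRam[aPos];
            iValid[aPos] = true;
        }
        return iCache[aPos];
    }

    // The DMA controller bypasses the cache and touches RAM only.
    void ControllerWritesRam(std::size_t aPos, std::uint8_t aValue) {
        assert(aPos < iRam.size());
        iRam[aPos] = aValue;
    }
    std::uint8_t ControllerReadsRam(std::size_t aPos) const {
        assert(aPos < iRam.size());
        return iRam[aPos];
    }

    // Write dirty lines back, so the controller reads up-to-date RAM.
    void Clean() {
        for (std::size_t i = 0; i < iRam.size(); ++i) {
            if (iValid[i] && iDirty[i]) {
                iRam[i] = iCache[i];
                iDirty[i] = false;
            }
        }
    }
    // Discard cached copies, so the CPU re-reads what the controller wrote.
    void Invalidate() {
        iValid.assign(iValid.size(), false);
        iDirty.assign(iDirty.size(), false);
    }

private:
    std::vector<std::uint8_t> iRam;
    std::vector<std::uint8_t> iCache;
    std::vector<bool> iValid;
    std::vector<bool> iDirty;
};
```

In this model, a value written with CpuWrite() is invisible to ControllerReadsRam() until Clean() is called, which mirrors the flush needed before a DMA write. Likewise, once the CPU has cached a location, CpuRead() keeps returning the old value after ControllerWritesRam() until Invalidate() is called, which mirrors the synchronization needed around a DMA read.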

It's worth pointing out that only kernel-side code can access the types of hardware chunk I've described so far. So, if the ultimate source or destination of a data transfer request is in normal user memory, you must perform a two-stage transfer between peripheral and user-side client:

1. DMA transfer between peripheral and device driver DMA buffer.

2. Memory copy between driver DMA buffer and user memory.
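The two stages above can be sketched in standalone C++ for the peripheral-to-client direction. This is only a simulation under stated assumptions: SimulatedDmaFromPeripheral and CopyToClient are hypothetical names, ordinary vectors and arrays stand in for the driver's DMA buffer and the client's memory, and a real driver would program the DMA controller for stage 1 and use the kernel's own services to copy into the client's address space for stage 2.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Stage 1 (simulated): data from the "peripheral" lands in the driver's
// DMA buffer. Here a plain copy stands in for the DMA controller.
void SimulatedDmaFromPeripheral(const std::uint8_t* aSource,
                                std::size_t aLength,
                                std::vector<std::uint8_t>& aDmaBuffer) {
    assert(aSource != nullptr || aLength == 0);
    aDmaBuffer.assign(aSource, aSource + aLength);
}

// Stage 2: copy from the driver's DMA buffer into the client's memory.
// Copies no more than the client buffer can hold and returns the number
// of bytes copied.
std::size_t CopyToClient(const std::vector<std::uint8_t>& aDmaBuffer,
                         std::uint8_t* aClient,
                         std::size_t aClientSize) {
    assert(aClient != nullptr || aClientSize == 0);
    const std::size_t count = std::min(aDmaBuffer.size(), aClientSize);
    if (count > 0) {
        std::memcpy(aClient, aDmaBuffer.data(), count);
    }
    return count;
}
```

The point of the sketch is simply that every byte is moved twice: once by the (simulated) DMA controller and once by the CPU.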

Obviously a two-stage transfer process wastes time. How can it be avoided? In the previous example buffer-allocation code, if you set the access permission attribute to EMapAttrUserRw rather than EMapAttrSupRw, the driver creates a user-accessible global memory buffer. The driver must then provide a function to report the address of this buffer as part of its user-side API. Note that you can't make these chunks accessible to just a limited set of user processes, and so they are not suitable for use when the chunk's contents must remain private or secure.

A much more robust scheme for avoiding the second transfer stage is for client and driver to use a shared chunk as the source or destination of data transfer requests between peripheral and user-accessible memory. I will discuss this in the next section.
