diff -r 578be2adaf3e -r 307f4279f433 Adaptation/GUID-E7C55048-5B7A-5BF2-B7F4-4D731659B88C.dita
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/Adaptation/GUID-E7C55048-5B7A-5BF2-B7F4-4D731659B88C.dita	Fri Oct 15 14:32:18 2010 +0100
@@ -0,0 +1,206 @@

Device Driver Writing and Migration Technology Tutorial

Explains techniques for writing device drivers on data paged systems and migrating device drivers to data paged systems.
Impact of data paging on kernel APIs

The use of data paging impacts on the task of writing and migrating device drivers in two main ways: the preconditions for kernel API calls and the performance of the kernel APIs.

Firstly, kernel APIs which access user memory may only be called subject to the following preconditions:

  • no fast mutex may be held while calling them, and
  • no kernel mutex may be held while calling them.

The APIs are these (a sketch of a compliant call follows the list):

  • Kern::KUDesInfo
  • Kern::InfoCopy
  • Kern::KUDesGet
  • Kern::KUDesPut
  • Kern::KUDesSetLength
  • umemget, kumemget, umemget32, kumemget32
  • umemput, kumemput, umemput32, kumemput32
  • umemset, kumemset, umemset32, kumemset32
  • Kern::RequestComplete
  • Kern::ThreadRawRead
  • Kern::ThreadRawWrite
  • Kern::ThreadDesRead
  • Kern::ThreadDesWrite
  • Kern::ThreadGetDesLength
  • Kern::ThreadGetDesMaxLength
  • Kern::ThreadGetDesInfo
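The following minimal sketch, using a hypothetical driver class and member names (DExampleChannel, iStateLock, iKernelBuffer, KMaxTransfer), shows a call pattern that satisfies these preconditions: the fast mutex protecting driver state is released before the user-memory API is called.

// Hypothetical helper in a driver: copy data from a user-side address
// supplied by the client into a kernel-side buffer.
void DExampleChannel::CopyFromClient(const TAny* aUserSrc, TInt aLength)
    {
    NKern::FMWait(&iStateLock);            // fast mutex guarding driver state
    TInt length = aLength;
    if (length > KMaxTransfer)
        length = KMaxTransfer;
    NKern::FMSignal(&iStateLock);          // release it BEFORE touching user memory

    // Precondition satisfied: no fast mutex or kernel mutex is held here,
    // so the page fault this copy may take on paged memory is acceptable.
    kumemget(iKernelBuffer, aUserSrc, length);
    }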
Impact of data paging on execution

Secondly, device drivers use kernel side APIs to access user memory, and even when those APIs are called in accordance with their preconditions they are no longer guaranteed to execute in a short, bounded time. This is because they may access paged memory and incur page faults, and the resulting delays can propagate from one thread to another. The rest of this document discusses how to mitigate the impact of data paging on device drivers.

Mitigating data paging: general principles

Three general principles are involved in mitigating the impact of data paging on device drivers:

  • Device drivers should not use shared DFC queues.
  • Device drivers should, as far as possible, not access paged memory.
  • If a device driver needs to access paged memory, it should do so in the context of the client thread.
Driver frameworks

There are three main categories of device driver:

  • Boot-loaded non-channel drivers

    Boot-loaded drivers are built as kernel extensions. They are typically simple device drivers, such as keyboard drivers, with a limited or no client side interface, and are not much impacted by data paging. It is generally safe for them to pass data structures using the HAL and to execute in the context of a kernel thread; however, this assumption must always be verified for individual cases.

  • Media drivers

    Media drivers are both channel based drivers and kernel extensions. When written according to the recommended model, they either execute wholly in the context of their clients or use a unique DFC queue and associated kernel thread. If these recommendations are followed, no additional measures to mitigate the impact of data paging are required.

  • Dynamically loaded, channel based I/O device drivers

    Channel based I/O device drivers are based on various models, but all are dynamically loaded. They are derived either from DLogicalChannelBase or from DLogicalChannel.

    Drivers derived from DLogicalChannelBase usually execute in the context of their client, which in itself mitigates the impact of data paging. Where they are multi-threaded, they typically create separate, unique kernel threads and do not use shared DFC queues, which again mitigates the impact of data paging; if they do use a shared DFC queue and its associated kernel thread, they are impacted by data paging and must be written so as to mitigate the effects.

    Drivers derived from DLogicalChannel may communicate with the hardware directly (LDD to hardware) or indirectly (LDD to PDD to hardware). If a PDD is involved, mitigation of data paging should take place at that level and not in the LDD. These drivers may have single or multiple clients, channels and hardware units, and it is they which require the most work to mitigate the impact of data paging.
Mitigation techniques

The impact of data paging on device drivers is mitigated by a number of techniques, which are the subject of the rest of this document.

Passing data by value

Clients should pass data by value, not as pointers, and the return values of calls should be return codes, not data.
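As an illustration (the class, enumeration and function names here are hypothetical), a user side wrapper can pass a small argument by value in the control call itself, and return only an error code:

// User side wrapper around a hypothetical channel. The frequency is passed
// by value in the argument word, so the driver never needs to dereference
// a pointer into (possibly paged) user memory for this request.
class RExampleDriver : public RBusLogicalChannel
    {
public:
    enum TControl { ESetFrequency };
    inline TInt SetFrequency(TInt aHz)
        {
        // The return value is an error code, not data read back from the driver.
        return DoControl(ESetFrequency, reinterpret_cast<TAny*>(aHz));
        }
    };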

Using dedicated DFC queues

All drivers which use DFCs should use a dedicated DFC queue to service them. You should not use the kernel queues Kern::DfcQue0 and Kern::DfcQue1 for this purpose. How you create a dedicated DFC queue depends on the nature of the driver.

To service boot loaded drivers and media drivers, you create a DFC queue by calling Kern::DfcQCreate().

To service dynamically loaded drivers derived from DLogicalChannelBase, you call Kern::DynamicDfcQCreate() with a TDynamicDfcQue as argument:

TInt Kern::DynamicDfcQCreate(TDynamicDfcQue*& aDfcQ, TInt aPriority, const TDesC& aBaseName);

To service a dynamically loaded driver derived from DLogicalChannel, you use the DFC queue supplied with it (the member iDfcQ, accessed by pointer). To use the queue you call the SetDfcQ() function during the second phase construction of the LDD.

You destroy such queues by calling their Destroy() function, which also terminates the associated thread.
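A minimal sketch of this for a hypothetical DLogicalChannel-derived driver follows; the class name, queue member, priority and base name are illustrative.

_LIT(KExampleDfcQName, "ExampleDfcQ");     // illustrative base name for the queue's thread
const TInt KExampleDfcQPriority = 27;      // illustrative thread priority

// Second phase construction: create a dedicated DFC queue rather than
// using the shared kernel queues, and attach it to the channel.
TInt DExampleChannel::DoCreate(TInt /*aUnit*/, const TDesC8* /*aInfo*/, const TVersion& /*aVer*/)
    {
    TInt r = Kern::DynamicDfcQCreate(iDynamicDfcQ, KExampleDfcQPriority, KExampleDfcQName);
    if (r != KErrNone)
        return r;
    SetDfcQ(iDynamicDfcQ);   // DLogicalChannel now delivers messages on this queue
    iMsgQ.Receive();         // start accepting client messages
    return KErrNone;
    }

DExampleChannel::~DExampleChannel()
    {
    if (iDynamicDfcQ)
        iDynamicDfcQ->Destroy();   // also terminates the associated thread
    }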

Setting realtime state

The realtime state of a thread determines whether it is allowed to access paged memory. If a thread is realtime (its realtime state is on), it is guaranteed not to access paged memory, avoiding unpredictable delays. The realtime state of a thread may be set to ERealtimeStateOn, ERealtimeStateOff or ERealtimeStateWarn, as defined in the enumeration TThreadRealtimeState, and is set by the kernel function Kern::SetRealtimeState().

If a driver uses DFC threads and is subject to performance guarantees, their realtime state should be set to on (this is the default when data paging is enabled). Otherwise the state should be set to off; the warning state is used for debugging.
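Kern::SetRealtimeState() acts on the current thread, so one way of configuring a DFC thread, sketched below with hypothetical names, is to queue a one-off initialisation DFC on the dedicated queue and set the state from inside it:

// Static DFC callback, queued once on the dedicated DFC queue during
// construction; it therefore runs in the DFC thread itself.
void DExampleChannel::InitDfcThread(TAny* /*aPtr*/)
    {
    // This hypothetical driver is not subject to performance guarantees and
    // needs to access paged memory from its DFC thread, so turn the state
    // off; drivers with performance guarantees should leave it on
    // (ERealtimeStateOn is the default when data paging is enabled).
    Kern::SetRealtimeState(ERealtimeStateOff);
    }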

Validating arguments in client context

It is often necessary to validate the arguments of a request function. This should be done in the context of the client thread as far as possible.

When a request is made on a driver derived from the class DLogicalChannelBase, this happens automatically, because the call to the Request() function takes place in the client thread. When the driver is derived from the class DLogicalChannel, the request involves a call to the SendMsg() function inherited from the base class, and it is necessary to override the base implementation to force validation of the arguments within the client thread.
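A sketch of such an override for a hypothetical channel follows; the ValidateRequest() helper is illustrative.

TInt DExampleChannel::SendMsg(TMessageBase* aMsg)
    {
    TThreadMessage& m = *(TThreadMessage*)aMsg;
    TInt id = m.iValue;
    // Skip the close and cancel messages; validate everything else here,
    // in the context of the client thread.
    if (id != (TInt)ECloseMsg && id != KMaxTInt)
        {
        TInt r = ValidateRequest(id, m.Ptr0(), m.Ptr1());   // hypothetical helper
        if (r != KErrNone)
            return r;                                       // reject before queuing
        }
    return DLogicalChannel::SendMsg(aMsg);                  // hand on to the DFC thread
    }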

Accessing user memory from client context

The DFC should access user memory as little as possible. Whenever there is a need to access user memory and it can be accessed in the context of the client thread, it should be.

When the driver is derived from the class DLogicalChannelBase, read and write operations on user memory can be performed in the Request() function, which executes in the context of the client thread.

When the driver is derived from the class DLogicalChannel, it is possible to read from and write to user memory in an overridden SendMsg() function before passing the message on to be processed by the DFC thread, where that is necessary. If the message is passed on, the data must be stored kernel side, either on the client thread's kernel stack or in the channel object.

Message data can only be stored on the client thread's kernel stack if the message is synchronous and the size of the data is less than 4KB. Because the kernel stack is local to the client thread, this technique remains safe when more than one thread uses the driver. One way of doing this is to implement the SendMsg() override so that it dispatches to a SendControl() helper, which performs the copying in the client thread context and then itself calls the SendMsg() function of the parent class, as in the sketch below.
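A sketch of the synchronous case follows, with hypothetical names throughout (TConfigData is an illustrative fixed size structure and EControlSetConfig an illustrative request number): the data is copied onto the client thread's kernel stack in the client context, and the message argument is redirected to that kernel-side copy before the DFC thread sees it.

TInt DExampleChannel::SendControl(TMessageBase* aMsg)
    {
    TThreadMessage& m = *(TThreadMessage*)aMsg;
    TConfigData kernelCopy;                          // lives on the client thread's kernel stack
    if (m.iValue == EControlSetConfig)
        {
        kumemget(&kernelCopy, m.Ptr0(), sizeof(kernelCopy));   // copy in the client context
        m.iArg[0] = &kernelCopy;                     // the DFC thread now reads kernel memory only
        }
    // The message is synchronous: the client thread blocks in the call below
    // until the DFC thread has handled it, so the stack copy stays valid.
    return DLogicalChannel::SendMsg(aMsg);
    }

The overridden SendMsg() would dispatch synchronous control messages to this helper and pass other messages straight to the parent class.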

Where the message is asynchronous, you can use a similar strategy of overriding the SendMsg() function, but this time the data is copied into a buffer owned by the channel before the SendMsg() function of the parent class is called. In this case the size of the data must be small (in the region of 4KB), there must be only one client using the buffer, and data cannot be written back to the client.

Using TClientDataRequest

A driver often needs to copy a structure of fixed size to its client in order to complete an asynchronous request. The TClientDataRequest object exists for this purpose: it writes a fixed size structure to user memory and completes the request. It is used in the following way (a sketch follows the list).

  1. The driver creates a TClientDataRequest object for each asynchronous request which may be outstanding concurrently: either one per client or one per request as appropriate.
  2. When the client makes a request, the TClientDataRequest object is set to contain the address of the client's buffer or descriptor and the address of the client's TRequestStatus. This takes place in the client context.
  3. The data to be written is copied into the buffer of the TClientDataRequest object.
  4. A call to Kern::QueueRequestComplete() passes the address of the TClientDataRequest object.
  5. The client is signalled immediately.
  6. When the client thread next runs, the buffer contents and completion value are written to the client.
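The fragments below sketch these steps for a hypothetical channel; the result type TExampleResult, the member names iResultRequest and iClient, and the request functions are illustrative, and the argument lists are indicative rather than definitive.

// Hypothetical channel members:
//     TClientDataRequest<TExampleResult>* iResultRequest;
//     DThread* iClient;

// Step 1, second phase construction: create the request object.
TInt DExampleChannel::CreateRequestObjects()
    {
    return Kern::CreateClientDataRequest(iResultRequest);
    }

// Step 2, in the client context: record the client's TRequestStatus and the
// address of the client's buffer for the result.
TInt DExampleChannel::StartGetResult(TRequestStatus* aStatus, TAny* aUserResultPtr)
    {
    TInt r = iResultRequest->SetStatus(aStatus);
    if (r != KErrNone)
        return r;                          // for example, a request is already outstanding
    iResultRequest->SetDestPtr(aUserResultPtr);
    return KErrNone;
    }

// Steps 3 to 5, typically in the DFC: fill the kernel side copy of the data
// and queue completion; the client is signalled immediately, and the buffer
// contents and completion value are written back when the client thread
// next runs (step 6).
void DExampleChannel::CompleteGetResult(const TExampleResult& aResult)
    {
    iResultRequest->Data() = aResult;
    Kern::QueueRequestComplete(iClient, iResultRequest, KErrNone);
    }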

Using TClientBufferRequest

When it is necessary to access user memory from a DFC thread context, that memory must be pinned for the duration of the request and unpinned when the request is completed. The pinning must be performed in the context of the client thread. The TClientBufferRequest object exists for this purpose. It is used in the following way (a sketch follows the list).

  1. The driver creates a TClientBufferRequest object for each client request which may be outstanding concurrently: either one per client or one per request as appropriate.
  2. When a client makes a request, the TClientBufferRequest object is set to contain the address of any buffers used and the address of the client's TRequestStatus. Doing so pins the contents of the buffers: they can be specified as descriptors or by start address and length. This takes place in the client context.
  3. The driver calls Kern::ThreadBufRead() and Kern::ThreadBufWrite() to access the client's buffers. This takes place in the context of the DFC.
  4. When the request is complete, the driver calls Kern::QueueBufferRequestComplete(), passing the TClientBufferRequest object. This signals the client immediately and unpins the buffers.
  5. When the client thread next runs, the completion value is written back to the client along with the updated length of any descriptors.
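The fragments below sketch these steps for a hypothetical channel. The member names are illustrative and the argument lists indicative only; consult the kernel headers for the exact signatures.

// Hypothetical channel members:
//     TClientBufferRequest* iBufferRequest;
//     TClientBuffer* iClientBuffer;
//     DThread* iClient;

// Step 1, second phase construction: create the request object, asking for
// buffers to be pinned as they are added.
TInt DExampleChannel::CreateBufferRequest()
    {
    return Kern::CreateClientBufferRequest(iBufferRequest, 1, TClientBufferRequest::EPinVirtual);
    }

// Step 2, in the client context: record the TRequestStatus and pin the
// client's descriptor.
TInt DExampleChannel::StartReadResult(TRequestStatus* aStatus, TAny* aUserDes)
    {
    TInt r = iBufferRequest->StartSetup(aStatus);
    if (r == KErrNone)
        r = iBufferRequest->AddBuffer(iClientBuffer, aUserDes);   // pins the buffer
    if (r == KErrNone)
        iBufferRequest->EndSetup();
    return r;
    }

// Steps 3 and 4, in the DFC context: access the pinned buffer and complete.
void DExampleChannel::CompleteReadResult(const TDesC8& aResult)
    {
    Kern::ThreadBufWrite(iClient, iClientBuffer, aResult, 0, 0, iClient);
    // Signals the client immediately and unpins the buffers; the completion
    // value and descriptor lengths are written back when the client runs.
    Kern::QueueBufferRequestComplete(iClient, iBufferRequest, KErrNone);
    }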

Using Kern::RequestComplete()

The function Kern::RequestComplete() exists in two versions:

static void Kern::RequestComplete(DThread* aThread, TRequestStatus*& aStatus, TInt aReason);

which is now deprecated, and its overloaded replacement

static void Kern::RequestComplete(TRequestStatus*& aStatus, TInt aReason);

The overloaded version should always be used, as it does not take a thread pointer argument.
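For example, where iStatus is a hypothetical TRequestStatus* member recorded when the client made its request:

// Complete the outstanding request. The overload takes the pointer by
// reference and sets it to NULL once the request has been completed.
Kern::RequestComplete(iStatus, KErrNone);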

Using shared chunks

Shared chunks are a mechanism by which kernel side code shares buffers with user side code. As an alternative to pinning memory, they have the following advantages:

  • Shared chunks cannot be paged, and therefore page faults never arise when they are accessed.
  • Shared chunks transfer data with a minimum number of copying operations and are useful where high speeds and large volumes of data are required.

Shared chunks present disadvantages when a driver is being migrated rather than written from scratch, as the client API must be rewritten as well as the driver code.
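Writing a shared chunk based driver is a substantial topic in its own right; the fragment below is only a minimal sketch of creating and committing a chunk kernel side, with an illustrative size constant and chunk member name.

TInt DExampleChannel::CreateSharedChunk()
    {
    TChunkCreateInfo info;
    info.iType       = TChunkCreateInfo::ESharedKernelSingle;   // one user side mapping at a time
    info.iMaxSize    = KExampleChunkSize;                       // illustrative constant
    info.iMapAttr    = EMapAttrFullyBlocking;                   // uncached; choose to suit the hardware
    info.iOwnsMemory = ETrue;                                   // commit RAM owned by the chunk

    TLinAddr kernAddr;
    TUint32 mapAttr;
    TInt r = Kern::ChunkCreate(info, iChunk, kernAddr, mapAttr);
    if (r != KErrNone)
        return r;

    // Commit memory to the chunk. Shared chunk memory is never paged, so no
    // page faults can occur when the driver or its client accesses it.
    r = Kern::ChunkCommit(iChunk, 0, KExampleChunkSize);
    if (r != KErrNone)
        {
        Kern::ChunkClose(iChunk);
        iChunk = NULL;
        }
    return r;
    }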

See Also

Performance Guarantee Tutorial

\ No newline at end of file