Performance considerations for Whisker


The problem


Because Whisker runs in a multitasking environment, it is unavoidable that other applications that consume processor time can slow down Whisker.

Nevertheless, we find the ability to run other applications incredibly useful, and on the hardware we have used, it does not pose a performance problem. It is up to you whether you run other applications at the same time, and Whisker does its best to provide good performance and to help you make this decision by providing performance information.


The way Whisker works


Essentially, the server asks Windows to send it a message every millisecond, or as close to it as Windows can manage. When this message arrives, Whisker does two things. Firstly, it checks the state of the digital input lines, and if any lines have changed state since the last time Whisker looked, the server flags the change; if any clients are using those lines and have asked to be notified of a change in their state, it sends the appropriate message to the clients. Secondly, Whisker runs through all the timers being used by the clients; if any have timed out, it sends a message to the client.
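
The two steps performed on each poll can be sketched as follows. This is a minimal illustration in Python, not WhiskerServer's actual code (which is C++); the class names, message strings, and the outbox list standing in for socket sends are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Timer:
    client: str          # hypothetical client identifier
    due_ms: int          # absolute time at which the timer expires
    fired: bool = False


@dataclass
class Server:
    last_inputs: dict                                  # line name -> last seen state
    subscribers: dict = field(default_factory=dict)    # line name -> list of clients
    timers: list = field(default_factory=list)
    outbox: list = field(default_factory=list)         # (client, message) pairs "sent"

    def poll(self, now_ms, inputs):
        # Step 1: flag any input line whose state differs from the last poll,
        # and notify only the clients that asked to hear about that line.
        for line, state in inputs.items():
            if state != self.last_inputs.get(line):
                self.last_inputs[line] = state
                word = "on" if state else "off"
                for client in self.subscribers.get(line, []):
                    self.outbox.append((client, f"Event: {line} {word}"))
        # Step 2: run through the timers and message the owner of any
        # timer that has timed out.
        for timer in self.timers:
            if not timer.fired and now_ms >= timer.due_ms:
                timer.fired = True
                self.outbox.append((timer.client, "TimerExpired"))


server = Server(last_inputs={"lever": False}, subscribers={"lever": ["client1"]})
server.timers.append(Timer(client="client1", due_ms=5))
server.poll(2, {"lever": True})   # lever pressed: client1 is notified
server.poll(5, {"lever": True})   # no line change, but the timer has expired
```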


If the computer is slow…


From Whisker's point of view, the computer can be slow either because it is a genuinely slow computer, or because it's doing lots of other processor-intensive things at the same time. Either way, there are two potential consequences.


1. The system's 'reaction times' may deteriorate.


For example, suppose a client is running that switches on a light whenever a rat presses a lever. The client will have requested to be informed whenever that lever is depressed. You could think of the system's reaction time as the time it takes for Whisker to send that message to the client, for the client to receive it, process it, and send a command back to the server to switch on the light, and for the server to receive that command and act on it. If the computer is slowed, this sequence might take longer. (If you're living dangerously and the client and server are on two different computers, then you're relying on the performance of both computers, plus the network that links them!)


Monitoring round-trip performance


Frankly, I don't expect this to be a problem. In an attempt to assess it, I added the TestNetLatency command (see the Programmer's Guide). If you run WhiskerTestClient, connect to the server, and type in TestNetLatency, the following sequence occurs:


client sends TestNetLatency to server

server receives TestNetLatency, notes the time (Time 1) and responds with Ping

client receives the Ping and responds with PingAcknowledged

server receives PingAcknowledged, notes the time (Time 2)

server sends the difference between Time 1 and Time 2 in an Info message


The upshot is that you have timed a complete round trip, including processing times, and you will receive a message like this:


Info: Network latency is 0 ms


On fast computers (e.g. Pentium-III/750, Windows 2000), the network latency is 0–2 ms; on one slower computer I tried (Pentium-II/233, Windows 2000) it reached 3–7 ms. I've tried to load our machines quite heavily, and have not managed to get these latencies much higher; basically, TCP/IP communication within a single computer is fast.
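
The timing logic of that exchange can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical, and a direct function call stands in for the TCP/IP link between server and client, so the figure it reports says nothing about real network latency.

```python
import time


def measure_round_trip(send_ping, clock=time.monotonic):
    """Time one Ping / PingAcknowledged exchange and return it in whole ms."""
    time_1 = clock()                 # server notes Time 1 and sends Ping
    reply = send_ping("Ping")        # client receives Ping and replies
    time_2 = clock()                 # server notes Time 2 on receiving the reply
    if reply != "PingAcknowledged":
        raise ValueError(f"unexpected reply: {reply!r}")
    return round((time_2 - time_1) * 1000)


def echo_client(message):
    """Stand-in client: acknowledges a Ping immediately."""
    return "PingAcknowledged" if message == "Ping" else "Unknown"


print(f"Info: Network latency is {measure_round_trip(echo_client)} ms")
```

Note that the measured interval includes the client's processing time as well as transmission in both directions, which is what makes it a useful estimate of the whole round trip.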


A delay of 7 ms might be unacceptable in an EEG experiment, but in simple behavioural control I don't think it's a problem. For a start, it is significantly shorter than the time it takes for a filament bulb lamp to reach a reasonable proportion of its final brightness, and certainly an incredibly brief time compared to the times involved in activating lever retracting devices or infusion pumps. It is also extremely brief compared to reaction times in typical tasks. Individual neuronal action potentials last 1 ms; large myelinated neurons conduct at 70–90 m/s; a typical chemical synapse takes 1 ms to conduct; peak muscle contraction takes 15 ms to develop in the fastest muscles in the mammalian body (the extraocular muscles) and 30–100 ms for most skeletal muscles; similarly, the somatosensory foot–to–cortex nerve conduction time in a human is of the order of 25 ms, and conscious awareness of stimulus perception requires of the order of 100 ms.


However, repeated measurement of the same task allows much finer resolution to be reached; thus, reaction time effects of 10–15 ms may be reliably measured. For such experiments, I would suggest that you use a fast computer and follow the tips given below for optimal performance. The most accurate technique for measuring reaction times is to use the time-stamping feature of the server, removing inaccuracy due to network latencies; see TimeStamps in the Programmer's Guide for a worked example.


2. The system may miss events.


In the situations discussed above, the system might respond to events slowly, but it would still notice every single one. The potential to miss events completely is a much more important problem. It might arise like this:


detector device is initially OFF

poll #1: Whisker scans the hardware, finds the device off

device goes ON

device goes OFF

poll #2: Whisker scans the hardware, finds the device off


In this situation, the server would never notice the change. Is this a real problem? Well, if the time between successive polls is 1 ms, probably not. There are no actions a rat can make that are completely over in less than 1 ms! If you have a detector device that genuinely does generate pulses in the sub-millisecond range, there is a real problem. We don't.
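
A small simulation makes the point concrete. The Python sketch below is illustrative only (the function name and the example timings are made up): it counts how many state changes a polling observer sees for a single ON pulse, given the times at which polls occur.

```python
def detected_changes(poll_times_ms, pulse_on_ms, pulse_off_ms):
    """Count the state changes seen by polling a device that starts OFF,
    goes ON at pulse_on_ms and goes OFF again at pulse_off_ms."""
    last_state = False
    changes = 0
    for t in poll_times_ms:
        state = pulse_on_ms <= t < pulse_off_ms   # device state at this poll
        if state != last_state:
            changes += 1
            last_state = state
    return changes


fast_polls = [0, 1, 2, 3, 4]        # one poll per millisecond
slow_polls = [0, 30, 60]            # a badly delayed server

print(detected_changes(fast_polls, 1.2, 1.7))   # 0.5 ms pulse between two polls
print(detected_changes(fast_polls, 1.2, 2.5))   # 1.3 ms pulse spans a poll
print(detected_changes(slow_polls, 5, 25))      # 20 ms press, 30 ms poll gap
```

A pulse is guaranteed to be seen only if it lasts at least as long as the longest gap between polls, which is why the worst-case inter-poll time is the figure to watch.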


However, if the inter-poll time is considerably prolonged, a problem might arise. Therefore, Whisker provides facilities to check that the inter-poll time is acceptable.


Monitoring inter-poll times


Part of the server status view is dedicated to performance monitoring. Here's a snapshot of this view:




Every second, this display is updated (together with several other clock-related server displays). This display shows that since the display was last updated, there have been 1100 polls, so the average inter-poll time was about 1 ms. All well and good. The worst single inter-poll time in the last second was 2 ms; some indication of the distribution of inter-poll times is also given on the last line of the display.


In this case, the longest inter-poll time since the server started running was also 2 ms. If this number were higher, the long inter-poll time might have occurred when the server first started (this sometimes occurs and is nothing to worry about, because no clients are connected then). When the first client connects, and the server is not in real-time mode, there is sometimes a brief (e.g. 30 ms) pause. But it's just possible (or if you're not running the server in real-time mode, fairly likely) that the long inter-poll time might have occurred because you just ran some very processor-intensive program.


By using the Server → Reset timing statistics command, you can assess the impact of running a given program. When you have no clients running (or none that matter!), reset the statistics, then run your program and see how bad the inter-poll time gets. If it exceeds your personal threshold for worrying, don't run that program when you have an experiment going.
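
The bookkeeping behind such a display might look like the Python sketch below. This is an assumption about how statistics of this kind can be kept, not a description of WhiskerServer's internals; the class and method names are hypothetical.

```python
class PollStats:
    """Track inter-poll times: a per-update window plus a worst case
    since the statistics were last reset."""

    def __init__(self):
        self.reset()

    def reset(self):
        # Equivalent in spirit to resetting the timing statistics.
        self.last_poll_ms = None
        self.worst_since_reset_ms = 0
        self.window = []

    def record_poll(self, now_ms):
        if self.last_poll_ms is not None:
            gap = now_ms - self.last_poll_ms
            self.window.append(gap)
            self.worst_since_reset_ms = max(self.worst_since_reset_ms, gap)
        self.last_poll_ms = now_ms

    def summarise_window(self):
        """Summarise the polls since the last display update, then start a new window."""
        gaps, self.window = self.window, []
        if not gaps:
            return None
        return {
            "polls": len(gaps),
            "mean_ms": sum(gaps) / len(gaps),
            "worst_ms": max(gaps),
        }


stats = PollStats()
for t in [0, 1, 2, 4, 5]:            # one poll arrived a millisecond late
    stats.record_poll(t)
print(stats.summarise_window(), stats.worst_since_reset_ms)
```

To assess a particular program, call reset(), run the program, and then read worst_since_reset_ms.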


Bear in mind that you only have a problem if you run a processor-intensive program and it hogs the processor at the same instant that your subject makes a response. The server display gives you the worst-case scenario.


3. Timer accuracy


Timer accuracy is not affected by long inter-poll times occurring while the timer is running (though potentially affected by a long inter-poll time occurring just at the moment when the timer expires). All Whisker does on every poll is to calculate the absolute time that has elapsed since the timer was created, and if this time equals or exceeds the programmed timer duration, it sends the timer message.
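
The following Python sketch illustrates why this is so. The function is hypothetical and simply replays a list of poll times; the point is that the test depends on elapsed time, not on a count of polls.

```python
def fire_time(created_ms, duration_ms, poll_times_ms):
    """Return the poll time at which a one-shot timer fires, or None."""
    for now in poll_times_ms:
        # Absolute elapsed time since creation, compared with the duration.
        if now - created_ms >= duration_ms:
            return now
    return None


regular = list(range(0, 201))                                # a poll every 1 ms
gap_mid_run = [t for t in regular if not 40 <= t < 70]       # 30 ms stall early on
gap_at_expiry = [t for t in regular if not 95 <= t <= 110]   # stall spanning expiry

print(fire_time(0, 100, regular))
print(fire_time(0, 100, gap_mid_run))
print(fire_time(0, 100, gap_at_expiry))
```

With a stall in the middle of the run the timer still fires at 100 ms; only a stall that covers the moment of expiry delays it, to the first poll after the stall.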


Technical notes: Repetitive timers


Suppose a timer is set up to fire repetitively, once every 100 ms. What could cause this to be inaccurate?


Transmission delay. If a system delay slows message transmission down by 10 ms, one message will arrive late, and the subsequent message will 'catch up' and be on the originally scheduled time.


A late clock tick. If the server isn't polled by Windows when the timer elapses, the message will be sent late. In this situation, the server could do two things. (1) If it schedules the next occurrence 100 ms later, the client will receive one late message, and the subsequent messages will be phase-shifted and all of them will be late from the point of view of the original command, so the whole sequence will finish late. (2) Alternatively, the server could schedule the next occurrence so it is 'on time', in which case the client will see one late message and one that 'catches up', just as with a transmission delay. The overall sequence will be of the desired length.


Whisker adopts strategy (2).
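
The difference between the two strategies can be shown with a short simulation. This Python sketch is illustrative only (the function and parameter names are made up): it replays a list of poll times containing one late tick and records when a 100 ms repetitive timer fires under each strategy.

```python
def firing_times(period_ms, count, poll_times_ms, catch_up):
    """Simulate a repetitive timer that fires `count` times.

    catch_up=False: strategy (1), next occurrence is one period after the
                    actual (possibly late) firing.
    catch_up=True:  strategy (2), next occurrence stays on the original
                    schedule, one period after the previous due time.
    """
    fired = []
    next_due = period_ms
    for now in poll_times_ms:
        if len(fired) == count:
            break
        if now >= next_due:
            fired.append(now)
            base = next_due if catch_up else now
            next_due = base + period_ms
    return fired


# A poll every 1 ms, except that no polls occur from 95 ms to 109 ms,
# so the first firing (due at 100 ms) happens late, at 110 ms.
polls = [t for t in range(0, 501) if not 95 <= t < 110]

print(firing_times(100, 3, polls, catch_up=False))
print(firing_times(100, 3, polls, catch_up=True))
```

Under strategy (1) every later firing inherits the 10 ms delay; under strategy (2) only the first firing is late and the sequence ends on time.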


These scenarios are illustrated below:




Ways that Whisker tries to improve performance


Unless you're interested in the technical aspects of Whisker, skip this section.


High process priority


Under Windows NT, it is possible to set the priority of a running program. WhiskerServer, by default, runs as a real-time (very high priority) process.


If Whisker consumed significant CPU time, running it as a high-priority process would detract from the CPU time available to Whisker's clients. However, Whisker does not use significant processor time for its critical functions — it makes frequent, brief checks of the hardware.




Multithreading


WhiskerServer is effectively composed of several mini-programs or threads running simultaneously. A consequence of this is that delays in communication with one client (or in updating the graphical console displays) do not affect any other clients, or the regular polling of hardware devices. Multithreading is discussed in more detail below.


But remember, it's not just about the server…


Although the server's speed is much more critical than the clients', as clients do not have to poll the hardware, client speed will be affected by other programs. Programs that consume significant CPU resources (e.g. Microsoft Access 97; Adobe Illustrator 8.0) may slow your clients down if run at critical moments.


Suggestions for optimal performance


All common sense, really; I suggest you view the ability to use the testing computer for other tasks as a luxury and judge for yourself whether it would cause problems for your task.


If you have really time-critical experiments running, don't use the computer for anything else at the same time.


Know what programs slow the system down and avoid them when you're testing. You may find that no program ever slows Whisker enough to matter, because Whisker works hard to receive frequent attention from Windows.


Don't push it. If you buy a slow computer, disable Whisker's high-performance features, set the computer up as a web server, dedicate 25% of its CPU time to scanning for extraterrestrial intelligence, and defragment your disk in the background while checking for viruses, running a complicated screensaver and redrawing your plans for a nuclear power station, your system may find itself a little overloaded.


Implement some general tips for high-performance computing.


Technical notes


Process and thread priorities


Windows allows processes (effectively, programs) to run in four different priority classes: idle, normal, high, and real-time, in ascending order. Within each process, Windows NT allows you to specify the thread priority. The process priority is designed to reflect the importance of the application in the system, and the thread priority to reflect the importance of different threads within that application. Windows NT itself sometimes alters the thread priority slightly – for example, to give a boost to threads that are receiving user input.


Whisker sets the process priority class, but does not alter thread priorities within the class (leaving them at the default, 'normal' priority). These settings can be configured on a per-user basis. They are stored in the registry within \\HKEY_CURRENT_USER\Software\WhiskerControl\WhiskerServer.


Hardware interrupts


Whisker uses polling to respond to two classes of event: changes in the state of the digital input lines, and timers. If the digital I/O cards were capable of generating interrupts, then a different approach could be taken with the input lines: instead of checking them regularly, Whisker could sit back and wait to be informed via a high-priority interrupt signal when one changes. Of course, timers would still have to be serviced by regular polling, and I don't know whether a whole flood of hardware interrupts would have adverse performance consequences, but it would be an additional mechanism of gaining performance.


Extremely technical notes


Threads used by WhiskerServer


Whisker is multithreaded. Multithreading is a complex topic. These notes are here as a reminder to me and as something that may interest you – nothing more.

WhiskerServer is written in C++, using Microsoft Foundation Classes (MFC) and the MCL library (Cohen A & Woodring M, 1998, Win32 Multithreaded Programming, O'Reilly), with permission. It uses the following threads:

A user-interface (UI) thread, which (1) is the primary owner of all MFC objects, including the main server, client, and socket classes; (2) deals with all user input, including mouse/keyboard input; (3) owns and draws all visible windows.
Client communication threads (one thread per client), handling the TCP sockets used to communicate with the client. All communication from the client arrives via this thread. (Under Windows, TCP sockets are attached to an invisible window, which receives WM_SOCKET_NOTIFY messages that are passed through to the MFC socket classes.) If this thread needs something drawn to a window (e.g. if the client requested that a bitmap be displayed), it does not do it directly, but sends a message to the UI thread requesting that the window be redrawn at the next available opportunity.
A high-performance hardware polling thread. A multimedia timer ('MM timer') is set up to fire every 1 ms, using its own thread. When it does so, the digital I/O boards and timers are polled through this thread, and if communication with the client is necessary, this thread will send messages to the client's primary socket.


Insuring against starvation


A design such as this is vulnerable to 'starvation locking' if the threads are of high enough priority. For example, if the polling thread took 1 ms or more to execute, it would 'starve' all other threads of CPU time (the OS may allow other high-priority threads within Whisker to pre-empt, but essentially Whisker would consume all of the CPU resources). This could be monitored by communicating with a low-priority thread, and checking that it was being serviced occasionally. However, under Win32 all threads within a process (application) must share their priority group, so even if the lowest-priority thread in Whisker was being serviced, other applications' threads might not be. The solution we have come up with is for the polling thread to pause for 1 ms as soon as it is called. This means that on a normally running machine the polling thread will still execute every millisecond; however, on a machine that has been slowed for some reason outside Whisker, polling will occur 'as quickly as possible without starving other threads'.


The threads access the same data, and as two threads must not access the same data simultaneously, 'thread safety' is ensured by locking all data against such access. MFC classes do not expect to be addressed by multiple threads, requiring particular precautions.
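
The general idea of locking shared data can be sketched as follows. This is a generic Python illustration of mutual exclusion, not the MFC/MCL mechanism WhiskerServer uses; the class and function names are hypothetical.

```python
import threading


class SharedLineState:
    """Line states shared between threads; every access takes the lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._states = {}

    def toggle(self, line):
        # Read-modify-write must happen under one lock acquisition,
        # otherwise two threads could both read the old value.
        with self._lock:
            new_state = not self._states.get(line, False)
            self._states[line] = new_state
            return new_state

    def snapshot(self):
        # Return a copy so callers never hold a reference to the live data.
        with self._lock:
            return dict(self._states)


def hammer(shared, line, times):
    """Toggle one line repeatedly; run from several threads at once."""
    for _ in range(times):
        shared.toggle(line)


shared = SharedLineState()
threads = [threading.Thread(target=hammer, args=(shared, "lever", 1000))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(shared.snapshot())
```

Because each toggle is atomic with respect to the others, 4000 toggles always leave the line in its starting (off) state; without the lock, some updates could be lost.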


The server's process priority setting applies, as the name suggests, to the process – all the threads are in the same process, and all receive the priority boost from this setting. The default (real-time) setting is recommended.


Limits on performance – 1 – multimedia display update processing is shared between clients. Threads execute simultaneously. Real-time priority threads are rarely slowed down by other programs running under Windows (it's possible, but it's very unlikely). Therefore, it should be apparent from this list that nothing will interfere with client–server communication or hardware polling. However, one task executed by the UI thread will pause execution of another task also executed by the UI thread. This primarily affects multimedia displays, as they are drawn by the UI thread.


For example, display-document access won't affect other clients, but display creation may temporarily pause another client that's also trying to create a display / destroy a display / show a document on a display (and vice versa for any of these functions) – because both tasks use the UI thread. Display redrawing may slow down redrawing of other displays (though it won't affect client communication speed at all).


I estimate these delays as a few milliseconds or less (most such delays will be sub-millisecond). They will also occur rarely. I doubt that they will cause problems.


The time taken to load bitmaps etc. from disk does not contribute to these delays, as that is not part of the redrawing process (the object is loaded from disk by the client thread and only drawn by the UI thread). If disk access delays slow displays noticeably, simply create a document in advance and load the bitmaps into it, then switch that document into the display window (which is a fast operation).


In principle, slow activities on the WhiskerServer console can affect and be affected by the speed of displays. Such delays are also incredibly small. WhiskerServer is even invulnerable to a test that pauses execution of many Windows programs – a situation generated by (1) disabling the Windows option to 'show window contents while dragging' (under Windows 2000, this is in Control Panel → Display → Effects); (2) dragging the WhiskerServer main window; (3) holding the mouse still with the button still down. Even this combination of events, which prevents Windows messages getting through to many windows, does not affect WhiskerServer.


If this limit were significant, it would be possible to create further threads (one client, one display thread). However, due to MFC restrictions, this would create the requirement to make temporary duplicates of the display data (specifically, because view windows on an MFC document must all be in the same thread, and the display documents used by Whisker may be viewed from the console – the UI thread – as well as in the window intended to be seen by the test subject) and this would incur a performance hit and use memory. As it's not at all clear that there's a problem with display speed at the moment, I haven't gone down this road. WhiskerServer is fast.


Limits on performance – 2 – network subsystem. If you run your tasks across a network, you are subject to the network's performance. This is not advised – the network facilities of Whisker are designed for monitoring task status (it doesn't matter if the monitor program runs slowly), not for running behavioural tasks themselves. Keep the task and the server on the same computer; this uses the internal TCP stack and is fast.


Limits on performance – 3 – the impact of other programs on Whisker and its clients. As discussed above, the high priority with which WhiskerServer runs means that it is unlikely to be significantly affected by anything else happening on your computer (this includes disk accesses, intense computation, etc.). It is very rare for Windows programs to use the 'real-time' priority, because very few Windows programs need real-time performance. Consequently, few programs (including most internal Windows functions) use a priority high enough to interfere with WhiskerServer.


However, this may not be true of behavioural clients. Their performance is usually less critical, because they don't have to poll the hardware – they just wait for the server to inform them of a hardware event. If task performance is critical, you must consider whether you want to run other programs (like your database program) that may slow down your task.


As part of this topic, consider implementing my general performance suggestions.


Limits on performance – 4 – CPU speed and memory. It goes without saying that faster computers are more capable of real-time performance. More memory improves performance by reducing the chance that any program gets swapped out to virtual memory (i.e. hard disk).