CPU OVERHEAD II,
Via Cycle Timing
(a.k.a. fixed time)
Last Modified:  22 May 2001

Notes after initial post:  a) CPU overhead timings have also been accomplished via a fixed work method.
                           b) If data set 1 in "raw data" is thrown out, the cost to the user per interrupt in nanoseconds is as follows:
                                          average:           11,354
                                          sample STD:         499
                              These statistics correlate closely with data from the fixed work method.

The purpose and use of this test are quite similar to those of the initial CPU Overhead method, so some details are not restated here.  The main difference is that the timing code has been completely rewritten so that it no longer relies on the utilities getrusage and gettimeofday.  On aztec, these utilities are limited to a resolution of ten microseconds.  Instead, the new timing code counts cycles of nested for-loops and uses the alarm utility and a signal handler to stop the for-loop execution.
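As an aside, a resolution limit like this is easy to observe directly.  The short program below is a hypothetical probe (it is not part of the test code): it calls gettimeofday repeatedly and reports the smallest nonzero step the clock ever takes.  On a system whose clock interface is limited to ten-microsecond resolution, the smallest step reported will be 10 usec rather than 1 usec.

// Hypothetical probe of gettimeofday granularity (not part of the test code).
#include <sys/time.h>
#include <stdio.h>

int main()
{
    struct timeval prev, cur;
    long minStep = -1;

    gettimeofday(&prev, 0);
    for (int i = 0; i < 1000000; i++) {
        gettimeofday(&cur, 0);
        long step = (cur.tv_sec - prev.tv_sec) * 1000000L
                  + (cur.tv_usec - prev.tv_usec);
        if (step > 0 && (minStep < 0 || step < minStep))
            minStep = step;          // remember the smallest nonzero advance
        prev = cur;
    }
    printf("smallest observed gettimeofday step: %ld usec\n", minStep);
    return 0;
}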

NEW TIMING CODE
The timing code for this test involves a cycleTimer class (cycleTimer.h, cycleTimer.C) that runs two nested for-loops for a user-specified period of time.  The class provides accessor functions for the loop counters.  The test files (test.h, test.C) run the cycle timer for a specified number of runs, each run lasting the same specified number of seconds.  The test files calculate the average loop iterations in terms of the outer loop and can calculate the standard deviation.  Values of loop iterations are formatted in the test files as X.YYYY, where X is the number of outer loops and YYYY is the fraction of inner loops with respect to the outer loop.  In other words, the iterations of the inner loop are divided by the maximum inner loop iterations (i.e., 2^32) before being added to the outer loop value.  This could be changed to just count actual iterations of a single for-loop if the timing runs are kept short.  Note: the standard deviation output is only meaningful if the "multiplier" is used to raise the average "value" significantly above one.  During dry runs, the "multiplier" was used in this manner to tune the test parameters (i.e., number of runs, length of a run, and number of throw-away runs) so that the standard deviation was minimized.
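The sketch below illustrates the approach; it is not the actual cycleTimer.C source, and the names and run length are illustrative.  It shows the two pieces described above: the alarm/signal-handler termination of the loops, and the X.YYYY combination of the two loop counters.

// Minimal sketch of the cycle-timing approach (illustrative only).
// alarm() delivers SIGALRM after the requested number of seconds; the
// handler sets a flag that terminates the nested loops.
#include <signal.h>
#include <unistd.h>
#include <stdio.h>

static volatile sig_atomic_t expired = 0;

static void onAlarm(int) { expired = 1; }

int main()
{
    const unsigned int seconds = 10;      // length of one run
    unsigned long outer = 0;
    volatile unsigned long inner = 0;     // volatile: keep the counting real

    signal(SIGALRM, onAlarm);
    alarm(seconds);

    while (!expired) {                    // outer loop
        for (inner = 0; inner < 0xffffffffUL && !expired; inner++)
            ;                             // inner loop: pure counting work
        if (!expired)
            outer++;
    }

    // Combined value X.YYYY: the partial inner count expressed as a
    // fraction of the maximum inner count (2^32), added to the outer count.
    double value = outer + inner / 4294967296.0;
    printf("loop value: %f\n", value);
    return 0;
}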

The Makefile provides the usual targets, including the executable "CycleTimer."
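For reference, a Makefile of roughly this shape would suffice; the file names come from the text above, while the compiler and flags are assumptions (note that the recipe line must begin with a tab).

# Hypothetical sketch, not the actual Makefile.
CXX      = g++
CXXFLAGS = -O0 -Wall

CycleTimer: test.C test.h cycleTimer.C cycleTimer.h
	$(CXX) $(CXXFLAGS) -o CycleTimer test.C cycleTimer.C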

USE
As before, the tester must ensure that the system is not disturbed (e.g., no mouse movement) during the tests.  The Myrinet Control Program and GM driver are set up as before.  Note that the MCP has been slightly modified to avoid spiking to 100% user time; see the original CPU Overhead web page for details on the spiking problem.  The driver and MCP are installed as before, and the MCP begins sending the specified stream of interrupts.  Once the MCP begins sending the interrupts, CycleTimer is executed.  Also, at the beginning and end of the test, the procinfo utility is used to determine the actual number of interrupts handled by the host; the intent of this check was simply to ensure that the percent user spikes did not recur.  Note that future repeated use of this test might justify automating the stream initiation so that it is triggered by CycleTimer rather than by installing the MCP/GM driver.
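procinfo gets its interrupt counts from /proc/interrupts, so the before/after check could also be captured with a trivial program like the hypothetical one below (or simply by saving a copy of that file at each point).

// Hypothetical helper (the actual test used procinfo): dump /proc/interrupts
// so the per-IRQ running totals can be recorded before and after a run.
#include <stdio.h>

int main()
{
    FILE* f = fopen("/proc/interrupts", "r");
    if (!f) {
        perror("/proc/interrupts");
        return 1;
    }
    int c;
    while ((c = getc(f)) != EOF)
        putchar(c);        // one line per IRQ: count and handler name
    fclose(f);
    return 0;
}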

SAMPLE DATA
Raw data is available, and a graph of percent user time versus nanoseconds between interrupts is provided below.  This curve should be, and is, very similar to the CPU Overhead curve.  The nanoseconds between interrupts is based on the number of RTC units between interrupts as specified in the GM driver; to convert to nanoseconds, the nanoseconds per RTC unit was determined in another test.
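The conversion itself is a single multiply; a sketch, with the nanoseconds-per-RTC-unit figure left as a parameter since it comes from that separate test:

// Sketch of the x-axis conversion.  nsPerRtcUnit must be taken from the
// separate RTC calibration test; no measured value is assumed here.
double nsBetweenInterrupts(unsigned long rtcUnits, double nsPerRtcUnit)
{
    return rtcUnits * nsPerRtcUnit;
}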

NOTES
As with the original CPU Overhead test, this data was used to estimate the time per interrupt needed by the kernel to handle this very basic interrupt.  All of the collected data points in this test were used to calculate the kernel time.  The results: 11,542 average nanoseconds per interrupt with a standard deviation of 987 nanoseconds.  This is approximately 2 microseconds greater than the cost calculated from the original CPU Overhead test; see user interrupt cost.  Since CPU Overhead II uses more precise methods, this 11,542-nanosecond average may also be a more precise measure of the user cost per interrupt.
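The page does not spell out the arithmetic, but presumably each data point yields a cost estimate along the following lines (a sketch of the assumed formula, not the actual calculation code):

// Presumed per-point arithmetic: if the host spends pctUser percent of its
// time servicing interrupts and one interrupt arrives every nsBetween
// nanoseconds, each interrupt costs the user this many nanoseconds.
double costPerInterruptNs(double pctUser, double nsBetween)
{
    return (pctUser / 100.0) * nsBetween;
}
// Example with made-up numbers: 11.5% user time at one interrupt per
// 100,000 ns gives 0.115 * 100000 = 11,500 ns per interrupt.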

The graph below uses this average to plot the expected percent user time.  The expected percent user time correlates well with the actual, although this merely indicates that the shape and orientation of the curve are as expected and that the data indicate a constant cost per interrupt over a varying range of interrupt frequencies.
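The expected curve is just the inverse of the formula above with the measured average plugged in:

// Expected percent user time for a given interrupt spacing, using the
// measured average cost of 11,542 ns per interrupt from this test.
double expectedPctUser(double nsBetween)
{
    return 100.0 * 11542.0 / nsBetween;
}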
[Graph: actual and expected percent user time versus nanoseconds between interrupts]

CARD AND HOST INFORMATION:  The tests were run on an FIC VA-503+ motherboard, a 500 MHz AMD-K6(tm)-2 processor, and Red Hat 6.2 with Linux kernel 2.2.14-5.0.  The tests used a single Myrinet card with the following information.

LANai 7.2 9947
PCIDMA 1.0 9947
Myrinet M2L-PCI64/2-3.0
1999 Myricom Inc.
357957
M2L-PCI64A-2
40681
A-3935  3952
00:60:dd:7f:cd:21


Bill Lawry