BW_MEM(8) | LMBENCH | BW_MEM(8) |
bw_mem - time memory bandwidth
bw_mem [ -P <parallelism> ] [ -W <warmups> ] [ -N <repetitions> ] size rd|wr|rdwr|cp|fwr|frd|fcp|bzero|bcopy [align]
bw_mem allocates twice the specified amount of memory, zeros it, and then times the copying of the first half to the second half. Results are reported in megabytes moved per second.
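The procedure described above can be illustrated with a minimal Python sketch. It is not the bw_mem implementation: the function name time_copy, the choice to report the best of several runs, and the use of 1024 * 1024 bytes per reported megabyte are all assumptions made for illustration.

```python
import time


def time_copy(size_bytes, repetitions=5):
    """Copy size_bytes from the first half of a buffer to the second half.

    Returns (megabytes, megabytes_per_second). Hypothetical helper; the
    megabyte divisor (1024 * 1024) is an assumption, not taken from bw_mem.
    """
    # Allocate twice the requested size; a new bytearray is zero-filled.
    buf = bytearray(2 * size_bytes)
    view = memoryview(buf)
    src, dst = view[:size_bytes], view[size_bytes:]

    best = float("inf")
    for _ in range(repetitions):
        start = time.perf_counter()
        dst[:] = src  # direct copy of the first half onto the second half
        best = min(best, time.perf_counter() - start)

    best = max(best, 1e-9)  # guard against a zero reading on coarse clocks
    megabytes = size_bytes / (1024 * 1024)
    return megabytes, megabytes / best
```

A call such as time_copy(8 * 1024 * 1024) returns the copied size in megabytes and an estimated copy rate. Interpreter overhead makes the rate only a rough indication.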
The size specification may end with ``k'' or ``m'' to mean kilobytes (* 1024) or megabytes (* 1024 * 1024).
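As an illustration of the suffix rule, the following Python sketch converts a size specification to bytes. The function name parse_size is hypothetical, and accepting upper-case suffixes is an assumption that goes beyond what is stated here.

```python
def parse_size(spec):
    """Convert a size such as '8m' or '512k' to a byte count.

    Hypothetical helper; accepting 'K' and 'M' as well as 'k' and 'm'
    is an assumption.
    """
    multipliers = {"k": 1024, "m": 1024 * 1024}
    suffix = spec[-1:].lower()
    if suffix in multipliers:
        return int(spec[:-1]) * multipliers[suffix]
    return int(spec)  # no suffix: the value is already in bytes
```

For example, parse_size("8m") yields 8388608 and parse_size("512k") yields 524288.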
Output format is "%0.2f %.2f\n", megabytes, megabytes_per_second, for example:
8.00 25.33
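A script that collects results might produce or read such a line as in the sketch below. The names format_result and parse_result are hypothetical; only the format string comes from the description above.

```python
def format_result(megabytes, mb_per_sec):
    """Render one result line using the documented format string."""
    return "%0.2f %.2f\n" % (megabytes, mb_per_sec)


def parse_result(line):
    """Split one result line into (megabytes, megabytes_per_second)."""
    megabytes, mb_per_sec = line.split()
    return float(megabytes), float(mb_per_sec)
```

Here format_result(8, 25.33) gives "8.00 25.33\n", and parse_result reverses it.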
There are nine different memory benchmarks in bw_mem. Each measures a slightly different method of reading, writing, or copying data.
This benchmark can move up to three times the requested memory. Bcopy will use 2-3 times the requested size in memory bandwidth: there is one read from the source and one write to the destination. The write usually results in a cache line read and then a write back of the cache line at some later point. Memory utilization might be reduced by 1/3 if the processor architecture implemented ``load cache line'' and ``store cache line'' instructions (as well as ``getcachelinesize'').
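The accounting behind the 2-3 times figure can be written out as a small Python sketch. The function name bcopy_traffic and the write_allocate flag are hypothetical; the model counts only the three transfers named above and ignores any other cache effects.

```python
def bcopy_traffic(size_bytes, write_allocate=True):
    """Estimate bytes moved to and from memory when copying size_bytes.

    Simplified model, assumed for illustration: the source is read once,
    the destination is written back once, and with write_allocate each
    destination cache line is also read before it is written.
    """
    source_reads = size_bytes
    destination_writes = size_bytes
    destination_fills = size_bytes if write_allocate else 0
    return source_reads + destination_writes + destination_fills
```

For an 8 megabyte copy the model gives 24 megabytes of traffic with the cache line read and 16 megabytes without it, which is the 1/3 reduction mentioned above.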
lmbench(8).
Carl Staelin and Larry McVoy
Comments, suggestions, and bug reports are always welcome.
$Date$ | (c)1994-2000 Larry McVoy and Carl Staelin |