When developing Parabix applications, it is often useful
to compare the performance of different algorithm choices.
A useful and flexible script for this can be built using the QA/perf_stat_runner.py tool.
The tool runs a particular program on its inputs with combinations of algorithmic choices that are selected by command-line options. The Linux perf program is used to execute the program and collect performance measures including instruction counts, cycle counts and branch misses. Each combination of performance parameters is first run once, both to populate the object cache (and so eliminate the JIT compile time from later runs) and to check that each variation produces the same result. The program is then run several times to obtain averaged measurements of the counters.
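The procedure described above can be sketched in Python as follows. This is only an illustration of the idea, not the code of perf_stat_runner.py: the function names are hypothetical, the flag=value spelling of options is an assumption, and the choice of perf events and repeat count is a placeholder. The perf stat options used (-r to repeat and average, -x to select a CSV field separator, -e to choose events) are standard.

```python
import itertools
import subprocess

def key_combinations(keys):
    """Yield one {flag: value} dict per combination of performance key values."""
    names = list(keys)
    for values in itertools.product(*(keys[name] for name in names)):
        yield dict(zip(names, values))

def build_command(program, positional, combo):
    """Build an argv list. The flag=value spelling is an assumption."""
    flags = [f"{flag}={value}" for flag, value in combo.items()]
    return [program] + flags + list(positional)

def perf_command(argv, repeats=5,
                 events=("instructions", "cycles", "branch-misses")):
    """Wrap argv in perf stat, repeating the run and emitting CSV counters."""
    return ["perf", "stat", "-r", str(repeats), "-x", ",",
            "-e", ",".join(events), "--"] + argv

def measure(program, positional, keys, repeats=5):
    """Warm up each combination, check that outputs agree, then measure."""
    reference = None
    results = []
    for combo in key_combinations(keys):
        argv = build_command(program, positional, combo)
        # Warm-up run: populates the object cache and yields output to compare.
        warmup = subprocess.run(argv, capture_output=True, text=True, check=True)
        if reference is None:
            reference = warmup.stdout
        elif warmup.stdout != reference:
            raise RuntimeError(f"output differs for {combo}")
        # Measured runs: perf stat writes its counters to stderr.
        perf = subprocess.run(perf_command(argv, repeats),
                              capture_output=True, text=True)
        results.append((combo, perf.stderr))
    return results
```

Keeping the warm-up run separate from the measured runs means that JIT compilation cost is paid before any counters are collected, and a variation that produces different output is reported before time is spent measuring it.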
Here is a sample script using perf_stat_runner.
# NFD_perf.py
from perf_stat_runner import *

if __name__ == '__main__':
    tester = PerformanceTester("../build16/bin/nfd")
    tester.addPositionalParameter("input", ["/home/cameron/Wikibooks/wiki-books-all.xml"])
    tester.addPerformanceKey("--ByteMerging", ["0", "1"])
    tester.addPerformanceKey("--ByteReplace", ["0", "1"])
    tester.addPerformanceKey("--LateU21", ["0", "1"])
    tester.addPerformanceKey("--thread-num", ["1"])
    tester.run_tests("nfd-stats.csv")
It produces performance results in the nfd-stats.csv file, with one row for each combination of performance key values and columns for instructions, cycle counts and branch data.
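Once the CSV file exists, the rows can be ranked to find the cheapest configuration. The helper below is a minimal sketch that assumes the file starts with a header row naming its columns; the exact column names written by perf_stat_runner are not given here, so the column is passed in as a parameter.

```python
import csv

def rank_rows(csv_file, column):
    """Return the rows of a results CSV sorted by a numeric column, smallest first.

    csv_file is an open text file; column is a header name (an assumption
    about the file layout, since the actual header is not documented here).
    """
    rows = list(csv.DictReader(csv_file))
    return sorted(rows, key=lambda row: float(row[column]))
```

For example, after opening nfd-stats.csv with open("nfd-stats.csv", newline=""), calling rank_rows on the file object with the name of the cycle-count column would list the key combinations from fewest to most cycles.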
Here is another for testing UTF compiler options with the ucount program.
# UTF_perf.py
from perf_stat_runner import *

if __name__ == '__main__':
    tester = PerformanceTester("../build16/bin/ucount", ["-c"])
    tester.addPositionalParameter("RE",
        [
            "\\p{Greek}",
            "[\\u{1234}]",
            "\\p{Han}",
            "\\p{Old_Uyghur}",
            "\\p{letter}",
            "\\p{lu}",
            "\\p{unassigned}",
            "\\p{Arabic}",
            "[\\u{12}-\\u{10FF85}]",
            "\\p{digit}",
            "虫"
        ])
    tester.addPositionalParameter("input", ["/home/cameron/Wikibooks/wiki-books-all.xml"])
    tester.addPerformanceKey("--lookahead", ["0", "1"])
    tester.addPerformanceKey("--InitialTest", ["PrefixCC", "RangeCC", "NonASCII"])
    tester.addPerformanceKey("--PartitioningFactor", ["2", "3", "4", "5"])
    tester.addPerformanceKey("--CCmode", ["BixNumCCs", "SyntheticBasis", "TruncatedBasis"])
    tester.addPerformanceKey("--u21", ["1"])
    tester.addPerformanceKey("--thread-num", ["1"])
    tester.run_tests("ucount-stats.csv")
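Before launching a script with this many options, it can help to estimate how many configurations it will measure. The sketch below assumes that the tool runs the full cross product of all positional parameter values and performance key values; under that assumption the count is simply the product of the list lengths.

```python
import math

# Number of values listed for each parameter in UTF_perf.py.
ucount_value_counts = {
    "RE": 11,
    "input": 1,
    "--lookahead": 2,
    "--InitialTest": 3,
    "--PartitioningFactor": 4,
    "--CCmode": 3,
    "--u21": 1,
    "--thread-num": 1,
}

def total_combinations(value_counts):
    """Product of the number of values per parameter (full cross product)."""
    return math.prod(value_counts.values())

print(total_combinations(ucount_value_counts))  # 792
```

Each of these configurations is run once to warm the object cache and then several more times for measurement, so trimming a value list by even one entry can shorten the whole experiment noticeably.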