Since I can only work from the command line on the server, I needed a command to replace -O2 with -O3 in every Makefile. The easiest one I found was the following:
find -name Makefile | xargs sed -i 's/-O2/-O3/'
The following benchmarks were done on ARCH64, server Betty.
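As an aside on the find/sed command above: it works as long as no path contains spaces and each line holds at most one -O2. A slightly more defensive sketch, assuming GNU find, xargs, sed and grep (the function names bump_opt_level and show_opt_flags are made up for this example), could look like this:

```shell
# Hypothetical helper: replace -O2 with -O3 in every Makefile below a
# directory (default: the current one). Assumes GNU find, xargs and sed.
bump_opt_level() {
    dir="${1:-.}"
    # -print0 / -0 keep file names with spaces intact, -r skips sed when
    # no Makefile is found, and the trailing g replaces every -O2 on a line.
    find "$dir" -name Makefile -print0 | xargs -0 -r sed -i 's/-O2/-O3/g'
}

# Hypothetical helper: list the optimization flags left in the Makefiles
# so the change can be checked by eye. Assumes GNU grep.
show_opt_flags() {
    grep -rn --include=Makefile -e '-O[0-3s]' "${1:-.}"
}
```

Running bump_opt_level from the top of the source tree and then show_opt_flags gives a quick check that every Makefile says -O3 before the benchmarks are rerun.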
With the -O2 flag:
10.5 MB file | 105 MB file | 1.5 GB file |
---|---|---|
real: 0m0.037s | real: 0m0.345s | real: 0m4.792s |
user: 0m0.028s | user: 0m0.323s | user: 0m4.551s |
sys: 0m0.001s | sys: 0m0.027s | sys: 0m0.400s |
With the -O3 flag:
10.5 MB file | 105 MB file | 1.5 GB file |
---|---|---|
real: 0m0.036s | real: 0m0.343s | real: 0m4.768s |
user: 0m0.028s | user: 0m0.323s | user: 0m4.499s |
sys: 0m0.001s | sys: 0m0.027s | sys: 0m0.426s |
As you can see, -O3 made no measurable difference on ARCH64. I thought this was very strange and checked whether the executable had changed at all. It had: it got larger, as expected. Yet there is no real gain in speed. Comparing the real time for the 1.5 GB file, -O3 was only about 0.5% faster (4.768s versus 4.792s), and a difference that small could simply be run-to-run variation. So for ARCH64 I recommend -O2: the run time is practically the same and the executable is smaller.
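The "not even 1%" figure can be checked directly from the tables. A small sketch, assuming awk is available (to_seconds and speedup_percent are hypothetical helper names), converts the time output to seconds and computes the relative difference:

```shell
# Convert a `time`-style duration such as 0m4.792s to seconds.
to_seconds() {
    # Split on "m" and "s": field 1 is minutes, field 2 is seconds.
    echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Percentage by which the second run is faster than the first.
speedup_percent() {
    awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" \
        'BEGIN { printf "%.2f\n", (a - b) / a * 100 }'
}

speedup_percent 0m4.792s 0m4.768s   # real time, 1.5 GB file: prints 0.50
speedup_percent 0m4.551s 0m4.499s   # user time, 1.5 GB file: prints 1.14
```

User time for the same file improves by about 1.1%, while sys time goes up slightly, so the overall picture stays the same.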
For Betty I'll have to look for another optimization, although I don't yet know what that would be. One option is to write something specifically for ARCH64, but that would cost far more time.
Another way to go is to add the compiler flag -fopt-info-missed, which makes GCC report missed optimizations, and see if I can do something about those. Source: https://gcc.gnu.org/onlinedocs/gccint/Dump-examples.html
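As a rough idea of what trying that flag could look like, here is a minimal sketch that compiles a single C file at -O3 and writes the missed-optimization report to a file. The function name report_missed and the file names are placeholders, and it assumes gcc is installed:

```shell
# Hypothetical helper: compile one C file at -O3 and send GCC's
# missed-optimization report to a separate file.
#   $1 = C source file, $2 = report file (both names chosen by the caller)
report_missed() {
    src="$1"
    report="$2"
    # -fopt-info-missed=FILE writes the report to FILE instead of stderr;
    # -c stops after compiling, so no linking is needed.
    gcc -O3 -fopt-info-missed="$report" -c "$src" -o "${src%.c}.o"
}
```

For a whole project, and assuming the Makefiles accept a CFLAGS override on the command line, something like make CFLAGS="-O3 -fopt-info-missed" 2> missed.txt should collect the same information, since GCC prints the report to stderr when no file name is given. The report can be long; -fopt-info-vec-missed narrows it to the vectorizer.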