Title: Comparing Intel C++ and Microsoft Visual C++ Compilers
1Comparing Intel C++ and Microsoft Visual C++ Compilers
- Michael Baum
- David Boyett
- Holly Garrison
2Agenda
- Problem Statement
- System Environment
- Programs Used for Comparison
- Matrix Processing Programs Results and Analysis
- SPEC Benchmark Results and Analysis
- Conclusion
3Problem Statement
- The general purpose of our project is to verify
Intel's claim that its compiler is 10% better
than the Microsoft Visual C++ compiler.
- Data will be gathered using the Intel VTune tool from
both SPEC CPU 2000 benchmarks and simple
matrix processing programs.
4System Environment
- Programs were run on a single-processor system
with an Intel P4 2.4 GHz processor and 512 MB RAM.
- Windows 2000 operating system
- Microsoft Visual C++ .NET compiler
- Intel C++ Compiler 7.1 for Windows
- Intel VTune Performance Analyzer 7.0
5Programs Used for Comparison
- SPEC CPU 2000 Benchmark
- 164.gzip
- 300.twolf
- Simple Matrix Processing Programs
- Array Summation of 10000 elements
- Matrix Multiplication of 250x250 matrices
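The slides do not show the source of the two simple test programs, so the following is only a minimal sketch of what such kernels might look like. The function names, the `double` element type, the row-major layout, and the naive triple loop are all assumptions made for illustration, not the authors' actual code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kernel: sum the n elements of a.
 * n = 10000 would match the array size named on the slide. */
double array_sum(const double *a, size_t n)
{
    assert(a != NULL || n == 0);
    double total = 0.0;
    for (size_t i = 0; i < n; ++i)
        total += a[i];
    return total;
}

/* Hypothetical kernel: naive triple-loop product c = a * b of
 * n x n matrices stored in row-major order.
 * n = 250 would match the matrix size named on the slide. */
void matrix_multiply(size_t n, const double *a, const double *b, double *c)
{
    assert(a != NULL && b != NULL && c != NULL);
    for (size_t i = 0; i < n; ++i) {
        for (size_t j = 0; j < n; ++j) {
            double acc = 0.0;
            for (size_t k = 0; k < n; ++k)
                acc += a[i * n + k] * b[k * n + j];
            c[i * n + j] = acc;
        }
    }
}
```

Small loops like these give an optimizing compiler a lot of freedom, so results measured on them can depend heavily on how the inputs are initialized and on whether the outputs are actually used.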
6VTune Setup
- Using Intel's VTune application, the following
events were measured:
- Instruction Count
- Clockticks and Clockticks per Instruction
- Loads and Stores
- Level 1 cache misses
- Mispredicted Calls and Branches
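Clockticks per Instruction is a derived value: the Clockticks count divided by the Instruction Count. A minimal sketch of that arithmetic follows; the function name is hypothetical and is not part of VTune.

```c
#include <assert.h>

/* Clockticks per instruction: total clockticks divided by the
 * instruction count collected over the same run. */
double clockticks_per_instruction(unsigned long long clockticks,
                                  unsigned long long instructions)
{
    assert(instructions > 0);
    return (double)clockticks / (double)instructions;
}
```

For example, 18,995,295 clockticks over 981,030 instructions gives roughly 19.36, which matches the Array Sum 10000 (Intel) row of the results.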
7Matrix Processing Programs Results
Executable (.exe)       | Mispredicted Calls | Mispredicted Branches | 1st Level Cache Misses |      Loads |    Stores | Clockticks | Instruction Count | Clockticks per Instruction
Array Sum 10000 (Intel) |              1,518 |                22,285 |                 49,890 |  1,268,145 |   844,962 | 18,995,295 |           981,030 |                      19.36
Array Sum 10000 (VC)    |              4,536 |                39,123 |                186,760 |    863,772 | 1,162,239 | 13,069,242 |         1,462,053 |                       8.94
Matrix Mult 250 (Intel) |                220 |                 5,132 |                      0 |          0 |   657,324 |  9,502,532 |         1,979,090 |                       4.80
Matrix Mult 250 (VC)    |                289 |                68,354 |             18,640,249 | 31,728,270 |   657,328 | 88,513,594 |        54,242,733 |                       1.63
10Matrix Processing Analysis
- For simple matrix and array processing, the results
supported Intel's claim of a 10% better
compiler.
- With the exception of the number of Stores
executed, the Intel compiler showed approximately
a 50% savings in the measured operations.
- The Matrix Multiplication program showed one
noteworthy result: the Intel compiler had zero
events for both 1st Level Cache Misses and
Loads.
- Verified by multiple builds and runs
11SPEC Benchmark Results
Executable (.exe) | Mispredicted Calls | Mispredicted Branches | 1st Level Cache Misses |          Loads |         Stores |      Clockticks | Instruction Count | Clockticks per Instruction
164.gzip (Intel)  |             11,725 |           871,754,172 |          2,267,577,936 | 22,054,374,342 | 11,101,416,840 | 106,412,563,515 |    76,670,596,520 |                       1.39
164.gzip (VC)     |              7,695 |           869,317,015 |          2,273,066,852 | 22,074,844,248 | 11,108,909,049 | 107,286,054,470 |    76,671,138,915 |                       1.40
300.twolf (Intel) |                346 |             4,874,982 |              7,639,211 |     77,060,025 |     32,577,657 |     484,933,215 |       210,922,988 |                       2.30
300.twolf (VC)    |                537 |             4,797,552 |              7,526,588 |     76,831,638 |     33,214,416 |     473,946,742 |       211,425,444 |                       2.24
14SPEC CPU 2000 Analysis
- SPEC CPU 2000 benchmarks did not show any
significant difference between the two compilers.
- SPEC benchmarks were re-compiled and data sets
were collected multiple times to verify the
validity of the original data.
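One way to put a number on "no significant difference" is to express the Intel build's clockticks as a percentage saving relative to the Visual C++ build. The sketch below is illustrative only; the function name is hypothetical, and treating Clockticks as the metric behind the 10% claim is an assumption, since the slides do not say which metric the claim refers to.

```c
#include <assert.h>

/* Percentage by which `candidate` is lower than `baseline`.
 * The result is negative when the candidate is higher. */
double percent_savings(double baseline, double candidate)
{
    assert(baseline > 0.0);
    return 100.0 * (baseline - candidate) / baseline;
}
```

With the Clockticks values from the SPEC results table, `percent_savings(107286054470.0, 106412563515.0)` comes to about 0.8% for 164.gzip, and `percent_savings(473946742.0, 484933215.0)` comes to about -2.3% for 300.twolf, meaning the Intel build used slightly more clockticks there. Both figures are well under 10%.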
15Conclusions
- Even though our group saw significant
improvements in performance for our small test
programs, these same gains could not be
duplicated for the benchmark applications.
- These variations might be the result of
differences in program complexity.
16Conclusions (cont.)
- The Intel C++ Compiler showed results that were
equal to or in some cases better than those of
Microsoft Visual C++.
- While Intel's claim of 10% better results may not
be true in all cases, our group still considers it
the superior compiler.