CS552 Course Wiki: Spring 2017
Homework 5
Due 04/13
Problem 1 must be done with your project partner.
Overview
Download the supplied tar file to fill in your answers.

Problem 1
a) Write an assembly program to demonstrate forwarding in a pipelined processor implementation. Write your code in
b) Also write an explanation of your program, including where and why forwarding takes place. Write your answer in

Problem 2
a) Write an assembly program to demonstrate why branch prediction is necessary and useful. Write your code in
b) Write an explanation of your program and how branch prediction helps in
c) Will branch prediction always take only 1 cycle? Write your answer in

The remaining problems will not be graded but are recommended for better understanding of the course material.

Problem 3
Given a 2 KB, 2-way set-associative cache with 16-byte lines and the following code:

for (int i = 0; i < 1000; i++) {
    A[i] = 40 * B[i];
}

a) Compute the overall miss rate (assuming that each array entry requires one word, that each word is 4 bytes, and that the base address of each array is aligned with a cache line boundary).
b) What kind of cache locality is being exploited?

Problem 4
Consider a direct-mapped cache with 32-byte blocks and a total capacity of 512 bytes in a system with a 32-bit address space. Assume this is a byte-addressable cache.
0x0000a796
0x000092e8
0x000092f4
0x00004182
0x0000780a
0x0000a690
0x0000408e
0x0000a798
0x00007800
0x000092fc
0x00027c02
0x0000408a
0x00004198
0x00006710
0x0000670c
0x00027c04
0x0000a790

Problem 5
Re-do Problem 4, but using a two-way set-associative cache. When replacing a block, the least-recently-used block is chosen to be replaced. Everything else (block size and total capacity) remains the same. Determine the speedup over the direct-mapped cache in Problem 4. Assume both caches can be accessed in 1 cycle, that the CPI without misses is 1.0, and that the miss penalty is 25 cycles.

Problem 6
Consider a cache with the following characteristics:
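Looking back at the address trace shared by Problems 4 and 5, the hit/miss bookkeeping can be cross-checked with a small simulator. The following is a minimal sketch, not an official solution: the names `simulate` and `TRACE` are made up for illustration, and it assumes the addresses are accessed in the order listed, starting from an empty cache.

```python
from collections import OrderedDict

# Address trace from Problem 4, assumed to be accessed in the order listed.
TRACE = [
    0x0000A796, 0x000092E8, 0x000092F4, 0x00004182, 0x0000780A, 0x0000A690,
    0x0000408E, 0x0000A798, 0x00007800, 0x000092FC, 0x00027C02, 0x0000408A,
    0x00004198, 0x00006710, 0x0000670C, 0x00027C04, 0x0000A790,
]


def simulate(addresses, capacity=512, block_size=32, ways=1):
    """Return 'hit' or 'miss' for each address, using LRU replacement.

    ways=1 models a direct-mapped cache; ways=2 models a two-way
    set-associative cache with the same capacity and block size.
    """
    num_sets = capacity // (block_size * ways)
    # One ordered dict per set: keys are tags, least recently used first.
    sets = [OrderedDict() for _ in range(num_sets)]
    results = []
    for addr in addresses:
        block = addr // block_size   # drop the byte-offset bits
        index = block % num_sets     # set index
        tag = block // num_sets      # remaining upper bits
        lines = sets[index]
        if tag in lines:
            lines.move_to_end(tag)   # mark as most recently used
            results.append("hit")
        else:
            if len(lines) == ways:
                lines.popitem(last=False)  # evict the least recently used tag
            lines[tag] = None
            results.append("miss")
    return results


for ways in (1, 2):
    outcome = simulate(TRACE, ways=ways)
    print(f"{ways}-way: {outcome.count('miss')} misses out of {len(outcome)} accesses")
```

With the miss counts in hand, the Problem 5 speedup follows from comparing 1.0 + (miss rate × 25) for the two configurations, if every access in the trace is treated as one instruction.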

Problem 7
How many storage bits are required to implement a 256 KB cache with 16-byte blocks that is 4-way set-associative and uses a write-back policy and LRU replacement, assuming a byte-addressable address space of 2^36 bytes? Bits are required for:
1. The data
2. The tags
3. The valid bits
4. The dirty bits
5. The LRU bits

Problem 8
Do problems 5.4.1 to 5.4.3 on page 551 of the textbook.

Problem 9
Do problems 5.7.1 to 5.7.3 on page 554 of the textbook.

Problem 10
Given a processor running at 2 GHz with a base CPI of 1.0 (the CPI without considering memory access delay, stalls, etc.). About 30% of the instructions in a program involve data memory access. The access delay of instruction memory is ignored. The data memory access time is 100 ns, including miss handling. Its primary (L1) cache has a hit rate of 99% and no access penalty on a hit. Now consider adding an L2 cache between the L1 cache and main memory. Suppose the L2 cache has a miss ratio of 20% and an access delay of 5 ns. How much does performance improve with the L2 cache compared to without it?
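
The arithmetic behind Problem 10 can be laid out as a short script. This is a sketch under stated assumptions, not a reference answer: it treats the 20% L2 miss ratio as a local miss ratio (a fraction of L1 misses), and it assumes an L2 miss pays the L2 access delay plus the full 100 ns memory access. The helper name `cpi_with_memory` is made up for illustration.

```python
def cpi_with_memory(base_cpi, mem_fraction, l1_miss_rate, miss_penalty_cycles):
    """Effective CPI = base CPI + stall cycles per instruction from L1 misses."""
    return base_cpi + mem_fraction * l1_miss_rate * miss_penalty_cycles


CYCLES_PER_NS = 2.0        # 2 GHz clock: 2 cycles per nanosecond
BASE_CPI = 1.0
MEM_FRACTION = 0.30        # fraction of instructions that access data memory
L1_MISS_RATE = 0.01        # 99% L1 hit rate
L2_LOCAL_MISS_RATE = 0.20  # assumption: 20% of L1 misses also miss in L2

mem_cycles = 100 * CYCLES_PER_NS  # main memory access time in cycles
l2_cycles = 5 * CYCLES_PER_NS     # L2 access delay in cycles

# Without L2, every L1 miss goes to main memory.
cpi_no_l2 = cpi_with_memory(BASE_CPI, MEM_FRACTION, L1_MISS_RATE, mem_cycles)

# With L2, every L1 miss pays the L2 delay; L2 misses also pay main memory
# (assumption: the two delays add).
l1_miss_penalty = l2_cycles + L2_LOCAL_MISS_RATE * mem_cycles
cpi_l2 = cpi_with_memory(BASE_CPI, MEM_FRACTION, L1_MISS_RATE, l1_miss_penalty)

speedup = cpi_no_l2 / cpi_l2
print(f"CPI without L2: {cpi_no_l2:.2f}")
print(f"CPI with L2:    {cpi_l2:.2f}")
print(f"Speedup:        {speedup:.2f}")
```

Under these assumptions the CPI comes to about 1.6 without the L2 cache and about 1.15 with it. A different reading of the miss ratio or of the L2 miss penalty changes the numbers, so state which interpretation you use.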
Page last modified on January 24, 2017