- Format: ePub
The next generation of computer system designers will be less concerned about details of processors and memories, and more concerned about the elements of a system tailored to particular applications. These designers will have a fundamental knowledge of processors and other elements in the system, but the success of their design will depend on the skills in making system-level tradeoffs that optimize the cost, performance and other attributes to meet application requirements. This book provides a new treatment of computer system design, particularly for System-on-Chip (SOC), which addresses…
- Devices: eReader
- With copy protection
- Size: 7.41MB
For legal reasons, this download can only be delivered to a billing address in A, B, BG, CY, CZ, D, DK, EW, E, FIN, F, GR, HR, H, IRL, I, LT, L, LR, M, NL, PL, P, R, S, SLO, SK.
- Product details
- Publisher: John Wiley & Sons
- Number of pages: 360
- Publication date: 8 August 2011
- Language: English
- ISBN-13: 9781118009918
- Item no.: 37345210
- Systems Approach
  - 1.1 System Architecture: An Overview
  - 1.2 Components of the System: Processors, Memories, and Interconnects
  - 1.3 Hardware and Software: Programmability Versus Performance
  - 1.4 Processor Architectures
    - 1.4.1 Processor: A Functional View
    - 1.4.2 Processor: An Architectural View
  - 1.5 Memory and Addressing
    - 1.5.1 SOC Memory Examples
    - 1.5.2 Addressing: The Architecture of Memory
    - 1.5.3 Memory for SOC Operating System
  - 1.6 System-Level Interconnection
    - 1.6.1 Bus-Based Approach
    - 1.6.2 Network-on-Chip Approach
  - 1.7 An Approach for SOC Design
    - 1.7.1 Requirements and Specifications
    - 1.7.2 Design Iteration
  - 1.8 System Architecture and Complexity
  - 1.9 Product Economics and Implications for SOC
    - 1.9.1 Factors Affecting Product Costs
    - 1.9.2 Modeling Product Economics and Technology Complexity: The Lesson for SOC
  - 1.10 Dealing with Design Complexity
    - 1.10.1 Buying IP
    - 1.10.2 Reconfiguration
  - 1.11 Conclusions
  - 1.12 Problem Set
- 2 Chip Basics: Time, Area, Power, Reliability, and Configurability
  - 2.1 Introduction
    - 2.1.1 Design Trade-Offs
    - 2.1.2 Requirements and Specifications
  - 2.2 Cycle Time
    - 2.2.1 Defining a Cycle
    - 2.2.2 Optimum Pipeline
    - 2.2.3 Performance
  - 2.3 Die Area and Cost
    - 2.3.1 Processor Area
    - 2.3.2 Processor Subunits
  - 2.4 Ideal and Practical Scaling
  - 2.5 Power
  - 2.6 Area-Time-Power Trade-Offs in Processor Design
    - 2.6.1 Workstation Processor
    - 2.6.2 Embedded Processor
  - 2.7 Reliability
    - 2.7.1 Dealing with Physical Faults
    - 2.7.2 Error Detection and Correction
    - 2.7.3 Dealing with Manufacturing Faults
    - 2.7.4 Memory and Function Scrubbing
  - 2.8 Configurability
    - 2.8.1 Why Reconfigurable Design?
    - 2.8.2 Area Estimate of Reconfigurable Devices
  - 2.9 Conclusion
  - 2.10 Problem Set
- 3 Processors
  - 3.1 Introduction
  - 3.2 Processor Selection for SOC
    - 3.2.1 Overview
    - 3.2.2 Example: Soft Processors
    - 3.2.3 Examples: Processor Core Selection
  - 3.3 Basic Concepts in Processor Architecture
    - 3.3.1 Instruction Set
    - 3.3.2 Some Instruction Set Conventions
    - 3.3.3 Branches
    - 3.3.4 Interrupts and Exceptions
  - 3.4 Basic Concepts in Processor Microarchitecture
  - 3.5 Basic Elements in Instruction Handling
    - 3.5.1 The Instruction Decoder and Interlocks
    - 3.5.2 Bypassing
    - 3.5.3 Execution Unit
  - 3.6 Buffers: Minimizing Pipeline Delays
    - 3.6.1 Mean Request Rate Buffers
    - 3.6.2 Buffers Designed for a Fixed or Maximum Request Rate
  - 3.7 Branches: Reducing the Cost of Branches
    - 3.7.1 Branch Target Capture: Branch Target Buffers (BTBs)
    - 3.7.2 Branch Prediction
  - 3.8 More Robust Processors: Vector, Very Long Instruction Word (VLIW), and Superscalar
  - 3.9 Vector Processors and Vector Instruction Extensions
    - 3.9.1 Vector Functional Units
  - 3.10 VLIW Processors
  - 3.11 Superscalar Processors
    - 3.11.1 Data Dependencies
    - 3.11.2 Detecting Instruction Concurrency
    - 3.11.3 A Simple Implementation
    - 3.11.4 Preserving State with Out-of-Order Execution
  - 3.12 Processor Evolution and Two Examples
    - 3.12.1 Soft and Firm Processor Designs: The Processor as IP
    - 3.12.2 High-Performance, Custom-Designed Processors
  - 3.13 Conclusions
  - 3.14 Problem Set
- 4 Memory Design: System-on-Chip and Board-Based Systems
  - 4.1 Introduction
  - 4.2 Overview
    - 4.2.1 SOC External Memory: Flash
    - 4.2.2 SOC Internal Memory: Placement
    - 4.2.3 The Size of Memory
  - 4.3 Scratchpads and Cache Memory
  - 4.4 Basic Notions
  - 4.5 Cache Organization
  - 4.6 Cache Data
  - 4.7 Write Policies
  - 4.8 Strategies for Line Replacement at Miss Time
    - 4.8.1 Fetching a Line
    - 4.8.2 Line Replacement
    - 4.8.3 Cache Environment: Effects of System, Transactions, and Multiprogramming
  - 4.9 Other Types of Cache
  - 4.10 Split I- and D-Caches and the Effect of Code Density
  - 4.11 Multilevel Caches
    - 4.11.1 Limits on Cache Array Size
    - 4.11.2 Evaluating Multilevel Caches
    - 4.11.3 Logical Inclusion
  - 4.12 Virtual-to-Real Translation
  - 4.13 SOC (On-Die) Memory Systems
  - 4.14 Board-based (Off-Die) Memory Systems
  - 4.15 Simple DRAM and the Memory Array
    - 4.15.1 SDRAM and DDR SDRAM
    - 4.15.2 Memory Buffers
  - 4.16 Models of Simple Processor-Memory Interaction
    - 4.16.1 Models of Multiple Simple Processors and Memory
    - 4.16.2 The Strecker-Ravi Model
    - 4.16.3 Interleaved Caches
  - 4.17 Conclusions
  - 4.18 Problem Set
- 5 Interconnect
  - 5.1 Introduction
  - 5.2 Overview: Interconnect Architectures
  - 5.3 Bus: Basic Architecture
    - 5.3.1 Arbitration and Protocols
    - 5.3.2 Bus Bridge
    - 5.3.3 Physical Bus Structure
    - 5.3.4 Bus Varieties
  - 5.4 SOC Standard Buses
    - 5.4.1 AMBA
    - 5.4.2 CoreConnect
    - 5.4.3 Bus Interface Units: Bus Sockets and Bus Wrappers
  - 5.5 Analytic Bus Models
    - 5.5.1 Contention and Shared Bus
    - 5.5.2 Simple Bus Model: Without Resubmission
    - 5.5.3 Bus Model with Request Resubmission
    - 5.5.4 Using the Bus Model: Computing the Offered Occupancy
    - 5.5.5 Effect of Bus Transactions and Contention Time
  - 5.6 Beyond the Bus: NOC with Switch Interconnects
    - 5.6.1 Static Networks
    - 5.6.2 Dynamic Networks
  - 5.7 Some NOC Switch Examples
    - 5.7.1 A 2-D Grid Example of Direct Networks
    - 5.7.2 Asynchronous Crossbar Interconnect for Synchronous SOC (Dynamic Network)
    - 5.7.3 Blocking versus Nonblocking
  - 5.8 Layered Architecture and Network Interface Unit
    - 5.8.1 NOC Layered Architecture
    - 5.8.2 NOC and NIU Example
    - 5.8.3 Bus versus NOC
  - 5.9 Evaluating Interconnect Networks
    - 5.9.1 Static versus Dynamic Networks
    - 5.9.2 Comparing Networks: Example
  - 5.10 Conclusions
  - 5.11 Problem Set
- 6 Customization and Configurability
  - 6.1 Introduction
  - 6.2 Estimating Effectiveness of Customization
  - 6.3 SOC Customization: An Overview
  - 6.4 Customizing Instruction Processors
    - 6.4.1 Processor Customization Approaches
    - 6.4.2 Architecture Description
    - 6.4.3 Identifying Custom Instructions Automatically
  - 6.5 Reconfigurable Technologies
    - 6.5.1 Reconfigurable Functional Units (FUs)
    - 6.5.2 Reconfigurable Interconnects
    - 6.5.3 Software Configurable Processors
  - 6.6 Mapping Designs Onto Reconfigurable Devices
  - 6.7 Instance-Specific Design
  - 6.8 Customizable Soft Processor: An Example
  - 6.9 Reconfiguration
    - 6.9.1 Reconfiguration Overhead Analysis
    - 6.9.2 Trade-Off Analysis: Reconfigurable Parallelism
  - 6.10 Conclusions
  - 6.11 Problem Set
- 7 Application Studies
  - 7.1 Introduction
  - 7.2 SOC Design Approach
  - 7.3 Application Study: AES
    - 7.3.1 AES: Algorithm and Requirements
    - 7.3.2 AES: Design and Evaluation
  - 7.4 Application Study: 3-D Graphics Processors
    - 7.4.1 Analysis: Processing
    - 7.4.2 Analysis: Interconnection
    - 7.4.3 Prototyping
  - 7.5 Application Study: Image Compression
    - 7.5.1 JPEG Compression
    - 7.5.2 Example JPEG System for Digital Still Camera
  - 7.6 Application Study: Video Compression
    - 7.6.1 MPEG and H.26X Video Compression: Requirements
    - 7.6.2 H.264 Acceleration: Designs
  - 7.7 Further Application Studies
    - 7.7.1 MP3 Audio Decoding
    - 7.7.2 Software-Defined Radio with 802.16
  - 7.8 Conclusions
  - 7.9 Problem Set
- 8 What's Next: Challenges Ahead
  - 8.1 Introduction
  - 8.2 Overview
  - 8.3 Technology
  - 8.4 Powering the ASOC
  - 8.5 The Shape of the ASOC
  - 8.6 Computer Module and Memory
  - 8.7 RF or Light Communications
    - 8.7.1 Lasers
    - 8.7.2 RF
    - 8.7.3 Potential for Laser/RF Communications
    - 8.7.4 Networked ASOC
  - 8.8 Sensing
    - 8.8.1 Visual
    - 8.8.2 Audio
  - 8.9 Motion, Flight, and the Fruit Fly
  - 8.10 Motivation
  - 8.11 Overview
  - 8.12 Pre-Deployment
  - 8.13 Post-Deployment
    - 8.13.1 Situation-Specific Optimization
    - 8.13.2 Autonomous Optimization Control
  - 8.14 Roadmap and Challenges
  - 8.15 Summary
- Appendix: Tools for Processor Evaluation
- References
- Index