ADSP-21161 SHARC DSP Hardware Reference

Third Edition, May 2002

Part Number 82-001944-01

Revision 3.0

Analog Devices, Inc.

Digital Signal Processor Division

One Technology Way

Norwood, Mass. 02062-9106

Copyright Information

©1996–2002 Analog Devices, Inc., ALL RIGHTS RESERVED. This document may not be reproduced in any form without prior, express written consent from Analog Devices, Inc.

Printed in the USA.

Disclaimer

Analog Devices, Inc. reserves the right to change this product without prior notice. Information furnished by Analog Devices is believed to be accurate and reliable. However, no responsibility is assumed by Analog Devices for its use; nor for any infringement of patents or other rights of third parties which may result from its use. No license is granted by implication or otherwise under the patent rights of Analog Devices, Inc.

Trademark and Service Mark Notice

The Analog Devices logo, SHARC, and the SHARC logo are registered trademarks; and VisualDSP++ and EZ-KIT Lite are trademarks of Analog Devices, Inc.

All other brand and product names are trademarks or service marks of their respective owners.

CONTENTS

INTRODUCTION

Purpose ... 1-1

Audience ... 1-1

Overview—Why Floating-Point DSP? ... 1-2

ADSP-21161 Design Advantages ... 1-2

ADSP-21161 Architecture Overview ... 1-7

Processor Core ... 1-8

Processing Elements ... 1-8

Program Sequence Control ... 1-9

Processor Internal Buses ... 1-12

Processor Peripherals ... 1-13

Dual-Ported Internal Memory (SRAM) ... 1-13

External Port ... 1-14

I/O Processor ... 1-16

JTAG Port ... 1-17

Development Tools ... 1-18

Differences From Previous SHARC DSPs ... 1-20

Processor Core Enhancements ... 1-21

Processor Internal Bus Enhancements ... 1-21

Memory Organization Enhancements ... 1-22

External Port Enhancements ... 1-22

Host Interface Enhancements ... 1-22

Multiprocessor Interface Enhancements ... 1-23

IO Architecture Enhancements ... 1-23

DMA Controller Enhancements ... 1-23

Link Port Enhancements ... 1-23

Instruction Set Enhancements ... 1-23

For More Information About Analog Products ... 1-24

For Technical or Customer Support ... 1-25

What’s New in This Manual ... 1-26

Related Documents ... 1-26

Conventions ... 1-27

PROCESSING ELEMENTS

Overview ... 2-1

Setting Computational Modes ... 2-3

32-bit (Normal Word) Floating-Point Format ... 2-3

40-bit Floating-Point Format ... 2-4

16-bit (Short Word) Floating-Point Format ... 2-5

32-Bit Fixed-Point Format ... 2-5

Rounding Mode ... 2-6

Using Computational Status ... 2-7

Arithmetic Logic Unit (ALU) ... 2-7

ALU Operation ... 2-8

ALU Saturation ... 2-9

ALU Status Flags ... 2-9

ALU Instruction Summary ... 2-10

Multiply—Accumulator (Multiplier) ... 2-13

Multiplier Operation ... 2-14

Multiplier (Fixed-Point) Result Register ... 2-15

Multiplier Status Flags ... 2-18

Multiplier Instruction Summary ... 2-19

Barrel-Shifter (Shifter) ... 2-22

Shifter Operation ... 2-22

Shifter Status Flags ... 2-24

Shifter Instruction Summary ... 2-27

Data Register File ... 2-29

Alternate (Secondary) Data Registers ... 2-31

Multifunction Computations ... 2-32

Secondary Processing Element (PEy) ... 2-36

Dual Compute Units Sets ... 2-38

Dual Register Files ... 2-39

Dual Alternate Registers ... 2-40

SIMD (Computational) Operations ... 2-40

SIMD And Status Flags ... 2-43

PROGRAM SEQUENCER

Overview ... 3-1

Instruction Pipeline ... 3-8

Instruction Cache ... 3-9

Using the Cache ... 3-12

Optimizing Cache Usage ... 3-12

Branches and Sequencing ... 3-14

Conditional Branches ... 3-16

Delayed Branches ... 3-16

Loops and Sequencing ... 3-20

Restrictions On Ending Loops ... 3-23

Restrictions On Short Loops ... 3-24

Loop Address Stack ... 3-29

Loop Counter Stack ... 3-30

Interrupts and Sequencing ... 3-34

Sensing Interrupts ... 3-40

Masking Interrupts ... 3-41

Latching Interrupts ... 3-42

Stacking Status During Interrupts ... 3-44

Nesting Interrupts ... 3-45

Re-using Interrupts ... 3-47

Interrupting IDLE ... 3-49

Multiprocessing Interrupts ... 3-49

Timer and Sequencing ... 3-50

Stacks and Sequencing ... 3-52

Conditional Sequencing ... 3-54

SIMD Mode and Sequencing ... 3-58

Conditional Compute Operations ... 3-58

Conditional Branches and Loops ... 3-59

Conditional Data Moves ... 3-60

Conditional DAG Operations ... 3-67

DATA ADDRESS GENERATORS

Overview ... 4-1

Setting DAG Modes ... 4-2

Circular Buffering Mode ... 4-4

Broadcast Loading Mode ... 4-5

Alternate (Secondary) DAG Registers ... 4-6

Bit-reverse Addressing Mode ... 4-8

Using DAG Status ... 4-9

DAG Operations ... 4-10

Addressing With DAGs ... 4-10

Addressing Circular Buffers ... 4-12

Modifying DAG Registers ... 4-17

Addressing in SISD & SIMD Modes ... 4-18

DAGs, Registers, & Memory ... 4-18

DAG Register-to-bus Alignment ... 4-19

DAG Register Transfer Restrictions ... 4-21

DAG Instruction Summary ... 4-23

MEMORY

Overview ... 5-1

Internal Address and Data Buses ... 5-7

Internal Data Bus Exchange ... 5-10

ADSP-21161 Memory Map ... 5-16

Internal Memory ... 5-16

Multiprocessor Memory ... 5-19

External Memory ... 5-23

Shadow Write FIFO ... 5-23

Memory Organization & Word Size ... 5-25

Placing 32-Bit Words & 48-Bit Words ... 5-25

Mixing 32-Bit & 48-Bit Words ... 5-26

Restrictions on Mixing 32-Bit & 48-Bit Words ... 5-28

48-bit Word Allocation ... 5-31

Setting Data Access Modes ... 5-32

SYSCON Register Control Bits ... 5-32

Mode 1 Register Control Bits ... 5-34

Mode 2 Register Control Bits ... 5-34

Wait Register Control Bits ... 5-34

Using Boot Memory ... 5-35

Reading from Boot Memory ... 5-35

Writing to Boot Memory ... 5-36

Internal Interrupt Vector Table ... 5-37

Internal Memory Data Width ... 5-37

Memory Bank Size ... 5-38

External Bus Priority ... 5-39

Secondary Processor Element (PEy) ... 5-39

Broadcast Register Loads ... 5-40

Illegal I/O Processor Register Access ... 5-41

Unaligned 64-bit Memory Access ... 5-41

External Bank X Access Mode ... 5-42

External Bank X Waitstates ... 5-44

Using Memory Access Status ... 5-45

Accessing Memory ... 5-46

Access Word Size ... 5-47

Long Word (64-Bit) Accesses ... 5-47

Instruction Word (48-Bit) and Extended Precision Normal Word (40-Bit) Accesses ... 5-49

Normal Word (32-Bit) Accesses ... 5-50

Short Word (16-Bit) Accesses ... 5-50

SISD, SIMD, and Broadcast Load Modes ... 5-51

Single- and Dual-Data Accesses ... 5-51

Data Access Options ... 5-52

Short Word Addressing of Single Data in SISD Mode ... 5-54

Short Word Addressing of Single Data in SIMD Mode ... 5-56

Short Word Addressing of Dual-Data in SISD Mode ... 5-58

Short Word Addressing of Dual-Data in SIMD Mode ... 5-60

32-Bit Normal Word Addressing of Single Data in SISD Mode ... 5-62

32-Bit Normal Word Addressing of Single Data in SIMD Mode ... 5-62

32-Bit Normal Word Addressing of Dual Data in SISD Mode ... 5-66

32-Bit Normal Word Addressing of Dual Data in SIMD Mode ... 5-68

Extended Precision Normal Word Addressing of Single Data ... 5-70

Extended Precision Normal Word Addressing of Dual Data in SISD Mode ... 5-72

Extended Precision Normal Word Addressing of Dual Data in SIMD Mode ... 5-74

Long Word Addressing of Single Data ... 5-76

Long Word Addressing of Dual Data in SISD Mode ... 5-78

Long Word Addressing of Dual Data in SIMD Mode ... 5-80

Mixed Word Width Addressing of Dual Data in SISD Mode ... 5-82

Mixed Word Width Addressing of Dual Data in SIMD Mode ... 5-84

Broadcast Load Access ... 5-86

Shadow Write FIFO Considerations In SIMD Mode ... 5-95

Arranging Data in Memory ... 5-101

Executing Instructions from External Memory ... 5-102

32- to 48-Bit Packing Address Generation Scheme ... 5-111

Total Program Size (32- to 48-Bit Packing) ... 5-112

16- to 48-Bit Packing Address Generation Scheme ... 5-112

Total Program Size (16- to 48-Bit Packing) ... 5-113

8- to 48-Bit Packing Address Generation Scheme ... 5-113

Total Program Size (8- to 48-Bit Packing) ... 5-114

No Packing (48- to 48-bit) Address Generation Scheme ... 5-115

I/O PROCESSOR

Overview ... 6-1

DMA Channel Allocation and Priorities ... 6-16

DMA Interrupt Vector Locations ... 6-20

Booting Modes ... 6-21

DMA Controller Operation ... 6-22

Managing DMA Channel Priority ... 6-23

Chaining DMA Processes ... 6-26

Transfer Control Block (TCB) Chain Loading ... 6-28

Setting Up and Starting the Chain ... 6-29

Inserting a TCB in an Active Chain ... 6-30

External Port DMA ... 6-32

External Port Registers ... 6-33

External Port FIFO Buffers ... 6-35

External Port DMA Data Packing ... 6-36

Boot Memory DMA Mode ... 6-45

External Port Buffer Modes ... 6-45

External Port Channel Priority Modes ... 6-46

External Port Channel Transfer Modes ... 6-48

External Port Channel Handshake Modes ... 6-50

Master Mode ... 6-54

Paced Master Mode ... 6-57

Slave Mode ... 6-58

Handshake Mode ... 6-61

DMA Handshake Idle Cycle ... 6-67

External-Handshake Mode ... 6-69

Setting up External Port DMA ... 6-72

Bootloading Through The External Port ... 6-74

Host Processor Booting ... 6-76

PROM Booting ... 6-78

External Port DMA Programming Examples ... 6-81

Link Port DMA ... 6-84

Link Port Registers ... 6-84

Link Port Buffer Modes ... 6-86

Link Port Channel Priority Modes ... 6-86

Link Port Channel Transfer Modes ... 6-88

Setting up Link Port DMA ... 6-89

Bootloading Through The Link Port ... 6-91

Link Port DMA Programming Examples ... 6-93

Serial Port DMA ... 6-97

Serial Port Registers ... 6-97

Serial Port Buffer Modes ... 6-100

Serial Port Channel Priority Modes ... 6-101

Serial Port Channel Transfer Modes ... 6-101

Setting up Serial Port DMA ... 6-102

SPORT DMA Programming Examples ... 6-104

SPI Port DMA ... 6-109

SPI Port Registers ... 6-109

SPI Port Buffer ... 6-111

SPI DMA Channel Priority ... 6-113

Setting up SPI Port DMA ... 6-113

Bootloading Through the SPI Port ... 6-115

SPI Port DMA Programming Examples ... 6-117

Using I/O Processor Status ... 6-122

External Port Status ... 6-128

Link Port Status ... 6-133

Serial Port Status ... 6-136

SPI Port Status ... 6-139

Optimizing DMA Throughput ... 6-141

Internal Memory DMA ... 6-142

External Memory DMA ... 6-142

System-Level Considerations ... 6-147

EXTERNAL PORT

Overview ... 7-1

Setting External Port Modes ... 7-3

External Memory Interface ... 7-3

Banked External Memory ... 7-10

Boot Memory ... 7-10

Idle Cycle ... 7-11

Data Hold Cycle ... 7-13

Multiprocessor Memory Space Waitstates and Acknowledge ... 7-14

Timing External Memory Accesses ... 7-15

Asynchronous Mode Interface Timing ... 7-15

Synchronous Mode Interface Timing ... 7-20

Synchronous Burst Mode Interface Timing ... 7-29

Using External SBSRAM ... 7-40

SBSRAM Restrictions ... 7-46

Host Processor Interface ... 7-47

Acquiring the Bus ... 7-50

Asynchronous Transfers ... 7-54

Host Transfer Timing ... 7-56

Host Interface Deadlock Resolution With SBTS ... 7-59

Slave Reads and Writes ... 7-60

IOP Shadow Registers ... 7-60

Instruction Transfers ... 7-61

Slave Write Latency ... 7-61

Slave Reads ... 7-62

Broadcast Writes ... 7-62

Data Transfers Through the EPBx Buffers ... 7-63

DMA Transfers ... 7-64

Host Data Packing ... 7-64

Packing Mode Variations For Host Accesses ... 7-66

IOP Register Host Accesses ... 7-67

LINK Port Buffer Access ... 7-68

EPBx Buffer Accesses ... 7-69

8- to 32-bit Data Packing ... 7-71

16- to 32-bit Packing ... 7-74

48-Bit Instruction Packing ... 7-80

Host Interface Status ... 7-82

Interprocessor Messages and Vector Interrupts ... 7-82

Message Passing (MSGRx) ... 7-83

Host Vector Interrupts (VIRPT) ... 7-84

System Bus Interfacing ... 7-84

Access to the DSP Bus—Slave DSP ... 7-85

Access to the System Bus—Master DSP ... 7-85

Processor Core Access To System Bus ... 7-88

Deadlock Resolution ... 7-88

DSP DMA Access To System Bus ... 7-90

Multiprocessing with Local Memory ... 7-90

DSP To Microprocessor Interface ... 7-92

Multiprocessor (MP) Interface ... 7-93

Multiprocessing System Architectures ... 7-96

Data Flow Multiprocessing ... 7-96

Cluster Multiprocessing ... 7-97

Multiprocessor Bus Arbitration ... 7-99

Bus Arbitration Protocol ... 7-102

Bus Arbitration Priority (RPBA) ... 7-105

Bus Mastership Timeout ... 7-108

Priority Access ... 7-109

Bus Synchronization After Reset ... 7-112

Booting Another DSP ... 7-115

Multiprocessor Writes and Reads ... 7-115

Instruction Transfers ... 7-117

Bus Lock and Semaphores ... 7-117

Multiprocessor Interface Status ... 7-119

SDRAM INTERFACE

Overview ... 8-1

SDRAM Pin Connections ... 8-7

SDRAM Timing Specifications ... 8-8

SDRAM Control Register (SDCTL) ... 8-9

SDRAM Configuration for Runtime ... 8-10

Setting the Refresh Counter Value (SDRDIV) ... 8-12

Setting the SDRAM Clock Enables ... 8-14

Setting the Number of SDRAM Banks (SDBN) ... 8-15

Setting the External Memory Bank (SDEMx) ... 8-15

Setting the SDRAM Buffering Option (SDBUF) ... 8-16

Selecting the CAS Latency Value (SDCL) ... 8-18

Selecting the SDRAM's Page Size (SDPGS) ... 8-19

Setting the SDRAM Power-Up Mode (SDPM) ... 8-19

Starting the SDRAM Power-Up Sequence (SDPSS) ... 8-20

Starting Self-Refresh mode (SDSRF) ... 8-20

Selecting the Active Command Delay (SDTRAS) ... 8-21

Selecting the Precharge Delay (SDTRP) ... 8-21

Selecting the RAS-to-CAS Delay (SDTRCD) ... 8-22

SDRAM Controller Standard Operation ... 8-23

Understanding DAG and DMA Operation ... 8-25

Multiprocessing Operation ... 8-26

Accessing SDRAM ... 8-27

Tables: ADSP-21161 Address Mapping for SDRAM ... 8-28

Understanding DQM Operation ... 8-30

Executing a Parallel Refresh Command During Host Control ... 8-31

Powering Up After Reset ... 8-32

Entering and Exiting Self-Refresh Mode ... 8-33

SDRAM Controller Commands ... 8-33

Bank Activate (ACT) Command ... 8-34

Mode Register Set (MRS) ... 8-34

Precharge Command (PRE) ... 8-35

Read / Write Command ... 8-36

Read Commands ... 8-37

Write Commands ... 8-39

DMA Transfers ... 8-40

Refresh (REF) Command ... 8-40

Setting the Delay Between Refresh Commands ... 8-41

Understanding Multiprocessing Operation ... 8-41

Self Refresh Command (SREF) ... 8-42

Programming Example ... 8-43

LINK PORTS

Overview ... 9-1

Link Port To Link Buffer Assignment ... 9-3

Link Port DMA Channels ... 9-4

Link Port Booting ... 9-5

Setting Link Port Modes ... 9-5

Link Port Control Register (LCTL) Bit Descriptions ... 9-7

Link Data Path and Compatibility Modes ... 9-9

Using Link Port Handshake Signals ... 9-10

Using Link Buffers ... 9-12

Core Processor Access To Link Buffers ... 9-13

Host Processor Access To Link Buffers ... 9-14

Using Link Port DMA ... 9-17

Using Link Port Interrupts ... 9-17

Link Port Interrupts With DMA Enabled ... 9-19

Link Port Interrupts With DMA Disabled ... 9-19

Link Port Service Request Interrupts (LSRQ) ... 9-20

Detecting Errors On Link Transmissions ... 9-22

Link Port Programming Examples ... 9-23

Using Token Passing With Link Ports ... 9-26

Designing Link Port Systems ... 9-29

Terminations For Link Transmission Lines ... 9-29

Peripheral I/O Using Link Ports ... 9-30

Data Flow Multiprocessing With Link Ports ... 9-31

SERIAL PORTS

Overview ... 10-1

Serial Port Pins ... 10-5

SPORT Interrupts ... 10-8

SPORT Reset ... 10-9

SPORT Control Registers and Data Buffers ... 10-9

Serial Port Control Registers (SPCTLx) ... 10-15

Register Writes and Effect Latency ... 10-33

Transmit and Receive Data Buffers (TXxA/B, RXxA/B) ... 10-34

Clock and Frame Sync Frequencies (DIV) ... 10-36

Data Word Formats ... 10-39

Word Length ... 10-39

Endian Format ... 10-39

Data Packing and Unpacking ... 10-40

Data Type ... 10-41

Companding ... 10-42

Clock Signal Options ... 10-43

Frame Sync Options ... 10-44

Framed Versus Unframed ... 10-44

Internal vs. External Frame Syncs ... 10-45

Active Low Versus Active High Frame Syncs ... 10-46

Sampling Edge for Data and Frame Syncs ... 10-46

Early Versus Late Frame Syncs ... 10-47

Data-Independent Transmit Frame Sync ... 10-48

SPORT Loopback ... 10-49

SPORT Operation Modes ... 10-50

I2S Mode ... 10-51

Setting the Internal Serial Clock and Frame Sync Rates ... 10-52

I2S Control Bits ... 10-52

Setting Word Length (SLEN) ... 10-52

Selecting Transmit and Receive Channel Order (L_FIRST) ... 10-52

Selecting the Frame Sync Options (FS_BOTH) ... 10-53

Enabling SPORT Master Mode (MSTR) ... 10-54

Enabling SPORT DMA (SDEN) ... 10-54

Multichannel Operation ... 10-55

Moving Data Between SPORTs and Memory ... 10-61

DMA Block Transfers ... 10-62

Setting Up DMA on SPORT Channels ... 10-63

SPORT DMA Parameter Registers ... 10-65

SPORT DMA Chaining ... 10-69

Single-Word Transfers ... 10-69

SPORT Pin/Line Terminations ... 10-70

SPORT Programming Examples ... 10-71

SERIAL PERIPHERAL INTERFACE (SPI)

Overview ... 11-1

Functional Description ... 11-2

SPI Interface Signals ... 11-3

SPICLK ... 11-4

SPIDS ... 11-5

FLAG ... 11-5

MOSI ... 11-6

MISO ... 11-6

SPI Interrupts ... 11-8

SPI IOP Registers ... 11-9

SPI Control Register (SPICTL) ... 11-10

SPI Status Register (SPISTAT) ... 11-17

SPI Transmit Data Buffer (SPITX) ... 11-22

SPI Receive Data Buffer (SPIRX) ... 11-23

SPI Shift Registers ... 11-23

SPI Data Word Formats ... 11-24

... 11-26

SPI Word Packing ... 11-27

SPI Operation Modes ... 11-27

Master Mode Operation ... 11-27

Interrupt and DMA Driven Transfers ... 11-28

Core Driven Transfers ... 11-29

Automatic Slave Selection ... 11-29

User Controlled Slave Selection ... 11-30

Slave Mode Operation ... 11-31

Error Signals and Flags ... 11-32

Multi-Master Error (MME) ... 11-32

Transmission Error (TXE) ... 11-33

Reception Error (RBSY) ... 11-33

SPI/Link Port DMA ... 11-34

DMA Operation in SPI Master Mode ... 11-35

DMA Operation in Slave Mode ... 11-35

SPI Booting ... 11-36

32-bit SPI Host Boot ... 11-41

16-bit SPI Host Boot ... 11-42

8-bit SPI Host Boot ... 11-43

Multiprocessor SPI Port Booting ... 11-44

SPI Programming Example ... 11-47

JTAG TEST-EMULATION PORT

Overview ... 12-1

JTAG Test Access Port ... 12-3

Instruction Register ... 12-4

EMUPMD Shift Register ... 12-7

EMUPX Shift Register ... 12-7

EMU64PX Shift Register ... 12-7

EMUPC Shift Register ... 12-8

EMUCTL Shift Register ... 12-8

EMUSTAT Shift Register ... 12-12

BRKSTAT Shift Register ... 12-13

MEMTST Shift Register ... 12-14

PSx, DMx, IOx, and EPx (Breakpoint) Registers ... 12-14

EMUN Register ... 12-17

EMUCLK and EMUCLK2 Registers ... 12-18

EMUIDLE Instruction ... 12-18

In Circuit Signal Analyzer (ICSA) Function ... 12-18

Boundary Register ... 12-19

Device Identification Register ... 12-30

Built-in Self-test Operation (BIST) ... 12-30

Private Instructions ... 12-30

References ... 12-30

SYSTEM DESIGN

Overview ... 13-1

DSP Pin Descriptions ... 13-2

Input Synchronization Delay ... 13-19

Pin States At Reset ... 13-20

Pull-up and Pull-down Resistors ... 13-24

Clock Derivation ... 13-27

Timing Specifications ... 13-28

RESET and CLKIN ... 13-32

Reset Generators ... 13-35

Interrupt and Timer Pins ... 13-37

Core-Based Flag Pins ... 13-38

Flag Inputs ... 13-38

Flag Outputs ... 13-39

Programmable I/O Flags ... 13-40

System Design Considerations for Flags ... 13-43

JTAG Interface Pins ... 13-45

Dual-Voltage Powerup Sequencing ... 13-47

PLL Start-up (Revisions 1.0/1.1) ... 13-50

POR Circuit ... 13-50

PLL CLKIN Enable Circuit ... 13-52

PLL Start-up (Revision 1.2) ... 13-53

Designing For JTAG Emulation ... 13-55

Target Board Connector ... 13-56

Layout Requirements ... 13-61

Power Sequence for Emulation ... 13-61

Additional JTAG Emulator References ... 13-62

Pod Specifications ... 13-62

DSP JTAG Pod Connector ... 13-62

DSP 3.3V Pod Logic ... 13-63

DSP 2.5V Pod Logic ... 13-64

Conditioning Input Signals ... 13-66

Link Port Input Filter Circuits ... 13-66

RESET Input Hysteresis ... 13-67

Designing For High Frequency Operation ... 13-67

Clock Specifications and Jitter ... 13-68

Clock Distribution ... 13-69

Point-To-Point Connections ... 13-72

Signal Integrity ... 13-73

Other Recommendations and Suggestions ... 13-74

Decoupling Capacitors and Ground Planes ... 13-75

Oscilloscope Probes ... 13-77

Recommended Reading ... 13-78

Booting Single and Multiple Processors ... 13-79

Multiprocessor Host Booting ... 13-80

Multiprocessor EPROM Booting ... 13-80

Booting From a Single EPROM ... 13-80

Sequential Booting ... 13-81

Multiprocessor Link Port Booting ... 13-83

Multiprocessor Booting From External Memory ... 13-83

Data Delays, Latencies, and Throughput ... 13-83

Execution Stalls ... 13-84

DAG Stalls ... 13-84

Memory Stalls ... 13-84

IOP Register Stalls ... 13-85

DMA Stalls ... 13-85

Link Port and Serial Port Stalls ... 13-85

REGISTERS

Overview ... A-1

Control and Status System Registers ... A-2

Mode Control 1 Register (MODE1) ... A-3

Mode Mask Register (MMASK) ... A-9

Mode Control 2 Register (MODE2) ... A-11

Arithmetic Status Registers (ASTATx and ASTATy) ... A-14

Sticky Status Registers (STKYx and STKYy) ... A-21

User-Defined Status Registers (USTATx) ... A-27

Processing Element Registers ... A-28

Data File Data Registers (Rx, Fx, Sx) ... A-28

Multiplier Results Registers (MRFx, MRBx) ... A-29

Program Memory Bus Exchange Register (PX) ... A-30

Program Sequencer Registers ... A-31

Interrupt Latch Register (IRPTL) ... A-33

Interrupt Mask Register (IMASK) ... A-39

Interrupt Mask Pointer Register (IMASKP) ... A-39

Link Port Interrupt Register (LIRPTL) ... A-41

Flag Value Register (FLAGS) ... A-44

IOFLAG Value Register ... A-46

Program Counter Register (PC) ... A-50

Program Counter Stack Register (PCSTK) ... A-52

Program Counter Stack Pointer Register (PCSTKP) ... A-52

Fetch Address Register (FADDR) ... A-52

Decode Address Register (DADDR) ... A-53

Loop Address Stack Register (LADDR) ... A-53

Current Loop Counter Register (CURLCNTR) ... A-54

Loop Counter Register (LCNTR) ... A-54

Timer Period Register (TPERIOD) ... A-54

Timer Count Register (TCOUNT) ... A-54

Data Address Generator Registers ... A-55

Index Registers (Ix) ... A-55

Modify Registers (Mx) ... A-55

Length and Base Registers (Lx,Bx) ... A-56

I/O Processor Registers ... A-57

System Configuration Register (SYSCON) ... A-71

Vector Interrupt Address Register (VIRPT) ... A-75

External Memory Waitstate and Access Mode Register (WAIT) ... A-76

System Status Register (SYSTAT) ... A-79

SDRDIV Register (SDRDIV) ... A-83

SDRAM Control Register (SDCTL) ... A-84

External Port DMA Buffer Registers (EPBx) ... A-88

Message Registers (MSGRx) ... A-88

PC Shadow Register (PC_SHDW) ... A-89

MODE2 Shadow Register (MODE2_SHDW) ... A-90

Bus Time-Out Maximum Register (BMAX) ... A-91

Bus (Time-Out) Counter Register (BCNT) ... A-92

External Port DMA Control Registers (DMACx) ... A-92

Internal Memory DMA Index Registers (IIx) ... A-98

Internal Memory DMA Modifier Registers (IMx) ... A-99

Internal Memory DMA Count Registers (Cx) ... A-99

Chain Pointer For Next DMA TCB Registers (CPx) ... A-101

General Purpose DMA Registers (GPx) ... A-101

External Memory DMA Index Registers (EIEPx) ... A-101

External Memory DMA Modifier Registers (EMEPx) ... A-102

External Memory DMA Count Registers (ECEPx) ... A-102

DMA Channel Status Register (DMASTAT) ... A-103

Link Port Buffer Registers (LBUFx) ... A-105

Link Port Buffer Control Register (LCTL) ... A-106

Link Port Service Request & Mask Register (LSRQ) ... A-112

SPORT Serial Control Registers (SPCTLx) ... A-115

SPORT Multichannel Control Registers (SPxyMCTL) ... A-125

SPORT Transmit Buffer Registers (TXx) ... A-128

SPORT Receive Buffer Registers (RXx) ... A-128

SPORT Divisor Registers (DIVx) ... A-128

SPORT Count Registers (CNTx) ... A-129

SPORT Transmit Select Registers (MT2CSx and MT3CSx) ... A-129

SPORT Transmit Compand Registers (MT2CCSx and MT3CCSx) ... A-130

SPORT Receive Select Registers ... A-131

SPORT Receive Compand Registers ... A-131

SPI Port Status Register ... A-132

SPI Control Register (SPICTL) ... A-134

SPI Receive Buffer Register (SPIRX) ... A-139

SPI Transmit Buffer Register (SPITX) ... A-140

Register and Bit #Defines File (def21161.h) ... A-141

INTERRUPT VECTOR ADDRESSES

Interrupt Vector Table ... B-1

NUMERIC FORMATS

Overview ... C-1

IEEE Single-Precision Floating-point Data Format ... C-1

Extended Precision Floating-Point Format ... C-3

Short Word Floating-Point Format ... C-3

Packing for Floating-Point Data ... C-4

Fixed-point Formats ... C-6

GLOSSARY

Terms ... G-1

INDEX

Purpose

The ADSP-21161 SHARC DSP Hardware Reference provides architectural information on the ADSP-21161 Super Harvard Architecture (SHARC) Digital Signal Processor (DSP). The architectural descriptions cover functional blocks, buses, and ports, including all features and processes they support. For programming information, see the ADSP-21160 SHARC DSP Instruction Set Reference.

Audience

DSP system designers and programmers who are familiar with signal processing concepts are the primary audience for this manual. This manual assumes that the audience has a working knowledge of microcomputer technology and DSP-related mathematics.

DSP system designers and programmers who are unfamiliar with signal processing can use this manual, but should supplement it with other texts describing DSP techniques.

All readers, particularly system designers, should refer to the DSP’s data sheet for timing, electrical, and package specifications. For additional suggested reading, see “For More Information About Analog Products” on page 1-24.

Overview—Why Floating-Point DSP?

A digital signal processor’s data format determines its ability to handle signals of differing precision, dynamic range, and signal-to-noise ratios.

Because floating-point DSP math reduces the need for scaling and the probability of overflow, using a floating-point DSP can ease algorithm and software development. The extent to which this is true depends on the floating-point processor’s architecture. Consistency with IEEE workstation simulations and the elimination of scaling are two clear ease-of-use advantages. High-level language programmability, large address spaces, and wide dynamic range allow system development time to be spent on algorithms and signal processing concerns, rather than assembly language coding, code paging, and error handling. The ADSP-21161 is a highly integrated, lower-cost 32-bit floating-point DSP that provides many of these design advantages.
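The scaling and overflow point can be illustrated with a small host-side model. The sketch below is not ADSP-21161 code; it is a Python illustration whose helper names (`to_q31`, `sum_fixed`, `sum_float`) are hypothetical. It assumes a signed 1.31 fractional fixed-point format with saturation, and it uses Python's double-precision floats as a stand-in for floating-point arithmetic in general.

```python
Q31_MAX = 2**31 - 1
Q31_MIN = -(2**31)


def saturate(value):
    """Clamp an integer to the signed 32-bit range."""
    return max(Q31_MIN, min(Q31_MAX, value))


def to_q31(x):
    """Quantize a real value to a signed 1.31 fractional integer, saturating."""
    return saturate(int(round(x * 2**31)))


def sum_fixed(samples, shift=0):
    """Accumulate samples in Q31.

    Each sample is pre-scaled right by `shift` bits (the manual scaling a
    fixed-point programmer has to choose), and the result is scaled back.
    """
    acc = 0
    for s in samples:
        acc = saturate(acc + (to_q31(s) >> shift))
    return acc / 2**31 * (1 << shift)


def sum_float(samples):
    """Accumulate samples in floating point; no scaling decision is needed."""
    acc = 0.0
    for s in samples:
        acc += s
    return acc
```

With three samples of 0.75, `sum_float` returns 2.25 directly. `sum_fixed` with no scaling saturates just below 1.0, and it recovers 2.25 only when the caller picks a suitable `shift` (2 in this case) ahead of time.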

ADSP-21161 Design Advantages

The ADSP-21161 is a high-performance 32-bit DSP used for medical imaging, communications, military, audio, test equipment, 3D graphics, speech recognition, motor control, imaging, and other applications. This DSP builds on the ADSP-21000 Family DSP core to form a complete system-on-a-chip, adding a dual-ported on-chip SRAM, integrated I/O peripherals, and an additional processing element for Single-Instruction-Multiple-Data (SIMD) support.

The SHARC architecture balances a high performance processor core with high performance buses (PM, DM, IO). In the core, every instruction can execute in a single cycle. The buses and instruction cache provide rapid, unimpeded data flow to the core to maintain the execution rate.

Figure 1-1 shows a detailed block diagram of the processor, illustrating the following architectural features:

• Two processing elements (PEx and PEy), each containing 32-bit IEEE floating-point computation units (multiplier, ALU, and shifter) and a data register file

• Program sequencer with related instruction cache, interval timer, and Data Address Generators (DAG1 and DAG2)

• Dual-ported SRAM

• External port for interfacing to off-chip memory such as SDRAM, peripherals, hosts, and multiprocessor systems

• Input/Output (IO) processor with integrated DMA controller, SPI-compatible port, serial ports, and link ports for point-to-point multiprocessor communications

• JTAG Test Access Port for emulation

Figure 1-1. ADSP-21161 SHARC Block Diagram

Figure 1-1 also shows the three on-chip buses of the ADSP-21161: the Program Memory (PM) bus, Data Memory (DM) bus, and Input/Output (IO) bus. The PM bus provides access to either instructions or data. During a single cycle, these buses let the processor access two data operands from memory, access an instruction (from the cache), and perform a DMA transfer.
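A multiply-accumulate loop shows why two data operands per cycle matter. The sketch below is a plain Python model, not processor code, and the function name `fir_output` is hypothetical. Each filter tap consumes one coefficient and one sample; on a processor with two data paths, one operand could arrive over the PM bus and the other over the DM bus in the same cycle (an assumption about data placement, not something this model enforces).

```python
def fir_output(coeffs, samples):
    """Compute one FIR output as the sum of coefficient * sample products."""
    if len(coeffs) != len(samples):
        raise ValueError("coeffs and samples must have the same length")
    acc = 0.0
    for c, x in zip(coeffs, samples):
        # Each tap needs two memory operands: one coefficient and one sample.
        acc += c * x
    return acc
```

For example, `fir_output([0.5, 0.25, 0.25], [4.0, 2.0, 2.0])` evaluates three taps and returns 3.0.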

The buses connect to the ADSP-21161 external port, which provides the processor interface to external memory, memory-mapped I/O, a host processor, and additional multiprocessing ADSP-21161s. The external port performs bus arbitration and supplies control signals to shared, global memory and I/O devices.

[Figure 1-1 block labels: core processor (program sequencer, timer, 32 x 48-bit instruction cache, DAG1 and DAG2 at 8 x 4 x 32 each, bus connect (PX), and processing elements PEx and PEy, each with multiplier, ALU, barrel shifter, and 16 x 40-bit data register file); dual-ported SRAM (two independent dual-ported blocks, Block 0 and Block 1, each with a processor port and an I/O port); PM and DM address and data buses; IOA and IOD buses; external port (address bus mux, data bus mux, host port, multiprocessor interface, SDRAM controller); I/O processor (memory-mapped IOP registers for control, status, and data buffers; DMA controller; four serial ports; two link ports; one SPI port); GPIO flags; JTAG test and emulation.]

Figure 1-2 illustrates a typical single-processor system. The ADSP-21161 includes extensive support for multiprocessor systems as well. For more information, see “Multiprocessor (MP) Interface” on page 7-93.

Further, the ADSP-21161 addresses the five central requirements for DSPs:

• Fast, flexible arithmetic computation units

• Unconstrained data flow to and from the computation units

• Extended precision and dynamic range in the computation units

• Dual address generators with circular buffering support

• Efficient program sequencing

Figure 1-2. ADSP-21161 Typical Single Processor System

[Figure 1-2 block labels: an ADSP-21161 connected to a clock source (CLKIN, XTAL), reset and JTAG signals, an optional boot EPROM, optional external memory and peripherals, optional SDRAM, an optional DMA device, an optional host processor interface, up to two optional link devices, up to four optional serial devices, and an optional SPI-compatible device (host or slave).]

Fast, Flexible Arithmetic. The ADSP-21000 Family processors execute all instructions in a single cycle. They provide fast cycle times and a complete set of arithmetic operations. The DSP is IEEE floating-point compatible and allows either interrupt on arithmetic exception or latched status exception handling.

Unconstrained Data Flow. The ADSP-21161 has a Super Harvard Architecture combined with a 10-port data register file. In every cycle, the DSP can write or read two operands to or from the register file, supply two operands to the ALU, supply two operands to the multiplier, and receive three results from the ALU and multiplier. The processor’s 48-bit orthogonal instruction word supports parallel data transfers and arithmetic operations in the same instruction.

40-Bit Extended Precision. The DSP handles 32-bit IEEE floating-point format, 32-bit integer and fractional formats (twos-complement and unsigned), and extended-precision 40-bit floating-point format. The processors carry extended precision throughout their computation units, limiting intermediate data truncation errors.

Dual Address Generators. The DSP has two Data Address Generators (DAGs) that provide immediate or indirect (pre- and post-modify) addressing. Modulus, bit-reverse, and broadcast operations are supported with no constraints on data buffer placement.
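Bit-reversed addressing is typically used to reorder FFT data. The following is a minimal Python sketch of the index mapping only. The function name bit_reverse and the 3-bit width are illustrative assumptions, and the sketch does not describe how the DAG hardware applies the reversal.

```python
def bit_reverse(index: int, width: int) -> int:
    """Return index with its lowest `width` bits in reversed order."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (index & 1)  # move the current LSB into the result
        index >>= 1
    return result

# Access order for an 8-point FFT (3 address bits).
order = [bit_reverse(i, 3) for i in range(8)]
```

For eight points, the sequential indices 0 through 7 map to 0, 4, 2, 6, 1, 5, 3, 7.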

Efficient Program Sequencing. In addition to zero-overhead loops, the DSP supports single-cycle setup and exit for loops. Loops are both nestable (six levels in hardware) and interruptible. The processors support both delayed and non-delayed branches.

ADSP-21161 Architecture Overview

The ADSP-21161 forms a complete system-on-a-chip, integrating a large, high-speed SRAM and I/O peripherals supported by a dedicated I/O bus.

The following sections summarize the features of each functional block in the ADSP-21161 SHARC architecture, which appears in Figure 1-1. With each summary, a cross reference points to the sections where the features are described in greater detail.


Processor Core

The processor core of the ADSP-21161 consists of two processing elements (each with three computation units and data register file), a program sequencer, two data address generators, a timer, and an instruction cache. All digital signal processing occurs in the processor core.

Processing Elements

The processor core contains two processing elements (PEx and PEy). Each element contains a data register file and three independent computation units: an ALU, a multiplier with a fixed-point accumulator, and a shifter.

To meet a wide variety of processing needs, the computation units process data in three formats: 32-bit fixed-point, 32-bit floating-point, and 40-bit floating-point. The floating-point operations are single-precision IEEE-compatible. The 32-bit floating-point format is the standard IEEE format, whereas the 40-bit extended-precision format has eight additional Least Significant Bits (LSBs) of mantissa for greater accuracy.
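The effect of the eight additional mantissa bits can be illustrated numerically. The sketch below is an assumption-laden model, not the DSP's arithmetic: it rounds a value to 24 significant bits (IEEE single precision: 23 stored bits plus the implied leading bit) and to 32 significant bits (eight more), using round-to-nearest. The helper name quantize is hypothetical.

```python
import math

def quantize(value: float, significant_bits: int) -> float:
    """Round value to a binary mantissa with the given number of significant bits."""
    if value == 0.0:
        return 0.0
    mantissa, exponent = math.frexp(value)             # value = mantissa * 2**exponent, 0.5 <= |mantissa| < 1
    scaled = round(mantissa * 2 ** significant_bits)   # keep only the requested bits
    return math.ldexp(scaled, exponent - significant_bits)

x = 1.0 / 3.0
err_32bit_format = abs(quantize(x, 24) - x)   # 32-bit format: 24 significant bits
err_40bit_format = abs(quantize(x, 32) - x)   # 40-bit format: 32 significant bits
```

For a value such as 1/3, which has no finite binary expansion, the rounding error of the wider mantissa is smaller by roughly a factor of 2^8.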

The ALU performs a set of arithmetic and logic operations on both fixed-point and floating-point formats. The multiplier performs floating-point or fixed-point multiplication and fixed-point multiply/add or multiply/subtract operations. The shifter performs logical and arithmetic shifts, bit manipulation, field deposit and extraction, and exponent derivation operations on 32-bit operands. These computation units perform single-cycle operations; there is no computation pipeline. All units are connected in parallel, rather than serially. The output of any unit may serve as the input of any unit on the next cycle. In a multifunction computation, the ALU and multiplier perform independent, simultaneous operations.
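The way a unit's output feeds another unit on the next cycle can be pictured with a dot product. In this behavioral sketch (an illustration, not DSP code), each loop pass stands for one cycle in which the multiplier forms a new product while the ALU adds the product from the previous cycle. The function name multifunction_dot is an assumption.

```python
def multifunction_dot(a, b):
    """Dot product where each loop pass models one cycle of parallel multiply and add."""
    acc = 0.0
    product = 0.0                      # multiplier output from the previous cycle
    for x, y in zip(a, b):
        # Both right-hand sides use last cycle's values, as if the units ran in parallel.
        acc, product = acc + product, x * y
    return acc + product               # one final add drains the last product
```

For example, multifunction_dot([1, 2, 3], [4, 5, 6]) accumulates 4, then 10, then 18, for a result of 32.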

Each processing element has a general-purpose data register file that transfers data between the computation units and the data buses and stores intermediate results. A register file has two sets (primary and secondary) of sixteen registers each, for fast context switching. All of the registers are 40 bits wide. The register file, combined with the core processor’s Super Harvard architecture, allows unconstrained data flow between computation units and internal memory.

Primary Processing Element (PEx). PEx processes all computational instructions whether the DSP is in Single-Instruction, Single-Data (SISD) or Single-Instruction, Multiple-Data (SIMD) mode. This element corresponds to the computational units and register file in previous ADSP-21000 family DSPs.

Secondary Processing Element (PEy). PEy processes each computational instruction in lock-step with PEx, but only processes these instructions when the DSP is in SIMD mode. Because many operations are influenced by this mode, more information on SIMD is available in multiple locations:

• For information on PEy operations, see “Processing Elements” on page 2-1

• For information on data addressing in SIMD mode, see “Addressing in SISD & SIMD Modes” on page 4-18

• For information on data accesses in SIMD mode, see “SISD, SIMD, and Broadcast Load Modes” on page 5-51

• For information on multiprocessing in SIMD mode, see “Multiprocessor (MP) Interface” on page 7-93

• For information on SIMD programming, see the ADSP-21160 SHARC DSP Instruction Set Reference
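The lock-step relationship between PEx and PEy can be pictured as one operation applied to two operand sets. The sketch below is a simplified behavioral model; the function name execute and the way operands are passed are assumptions made for illustration only.

```python
import operator

def execute(op, pex_operands, pey_operands, simd: bool):
    """Apply one computational instruction; PEy takes part only in SIMD mode."""
    pex_result = op(*pex_operands)
    pey_result = op(*pey_operands) if simd else None  # PEy is idle in SISD mode
    return pex_result, pey_result
```

With simd set, execute(operator.add, (1, 2), (10, 20), True) yields a result from each element; with simd clear, only the PEx result is produced.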

Program Sequence Control

Internal controls for ADSP-21161 program execution come from four functional blocks: program sequencer, data address generators, timer, and instruction cache. Two dedicated address generators and a program sequencer supply addresses for memory accesses. Together the sequencer and data address generators allow computational operations to execute with maximum efficiency since the computation units can be devoted exclusively to processing data. With its instruction cache, the ADSP-21161 can simultaneously fetch an instruction from the cache and access two data operands from memory. The data address generators implement circular data buffers in hardware.

Program Sequencer. The program sequencer supplies instruction addresses to program memory. It controls loop iterations and evaluates conditional instructions. With an internal loop counter and loop stack, the ADSP-21161 executes looped code with zero overhead. No explicit jump instructions are required to loop or to decrement and test the counter.

The ADSP-21161 achieves its fast execution rate by means of pipelined fetch, decode, and execute cycles. If external memories are used, they are allowed more time to complete an access than if there were no decode cycle.
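The overlap of fetch, decode, and execute can be visualized with a small trace. This sketch assumes an ideal three-stage pipeline with no stalls or branches; the function name pipeline_trace and the instruction labels are hypothetical.

```python
def pipeline_trace(instructions):
    """List (fetch, decode, execute) for each cycle of an ideal three-stage pipeline."""
    n = len(instructions)

    def at(index):
        # Instruction occupying a stage, or None while the pipeline fills or drains.
        return instructions[index] if 0 <= index < n else None

    # Two extra cycles are needed for the last instruction to reach execute.
    return [(at(c), at(c - 1), at(c - 2)) for c in range(n + 2)]

trace = pipeline_trace(["i0", "i1", "i2"])
```

In the third cycle of this trace, i2 is fetched while i1 is decoded and i0 executes, so one instruction completes per cycle once the pipeline is full.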

Data Address Generators. The Data Address Generators (DAGs) provide memory addresses when data is transferred between memory and registers.

Dual data address generators enable the processor to output simultaneous addresses for two operand reads or writes. DAG1 supplies 32-bit addresses to data memory. DAG2 supplies 32-bit addresses to program memory for program memory data accesses.

Each DAG keeps track of up to eight address pointers, eight modifiers and eight length values. A pointer used for indirect addressing can be modified by a value in a specified register, either before (pre-modify) or after (post-modify) the access. A length value may be associated with each pointer to perform automatic modulo addressing for circular data buffers; the circular buffers can be located at arbitrary boundaries in memory. Each DAG register has a secondary register that can be activated for fast context switching.


Circular buffers allow efficient implementation of delay lines and other data structures required in digital signal processing, and are commonly used in digital filters and Fourier transforms. The DAGs automatically handle address pointer wraparound, reducing overhead, increasing performance, and simplifying implementation.
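Pointer wraparound for a circular buffer can be modeled in a few lines. The sketch below is a behavioral illustration under stated assumptions: the class name CircularPointer is hypothetical, the modifier magnitude is assumed to be smaller than the buffer length, and only post-modify addressing is shown.

```python
class CircularPointer:
    """Post-modify pointer that wraps inside [base, base + length)."""

    def __init__(self, base: int, length: int):
        self.base = base        # start address of the buffer (any boundary)
        self.length = length    # number of locations in the buffer
        self.index = base       # current address pointer

    def post_modify(self, modifier: int) -> int:
        """Return the current address, then advance the pointer with wraparound."""
        address = self.index
        nxt = self.index + modifier
        if nxt >= self.base + self.length:
            nxt -= self.length              # wrapped past the end of the buffer
        elif nxt < self.base:
            nxt += self.length              # wrapped before the start of the buffer
        self.index = nxt
        return address
```

A four-location buffer at address 0x100 stepped by 1 produces 0x100, 0x101, 0x102, 0x103, and then 0x100 again, with no explicit boundary test in the calling code.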

Interrupts. The ADSP-21161 has four external hardware interrupts: three general-purpose interrupts, IRQ2-0, and a special interrupt for reset. The processor also has internally generated interrupts for the timer, DMA controller operations, circular buffer overflow, stack overflows, arithmetic exceptions, multiprocessor vector interrupts, and user-defined software interrupts.

For the general-purpose external interrupts and the internal timer interrupt, the ADSP-21161 automatically stacks the arithmetic status and mode (MODE1) registers in parallel with the interrupt servicing, allowing fifteen nesting levels of very fast service for these interrupts.

Context Switch. Many of the processor’s registers have secondary registers that can be activated during interrupt servicing for a fast context switch.

The data registers in the register file, the DAG registers, and the multiplier result register all have secondary registers. The primary registers are active at reset, while the secondary registers are activated by control bits in a mode control register.
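The primary and secondary register sets can be pictured as two banks selected by a single mode bit. The following sketch is a simplified model: the class name BankedRegisters and its methods are assumptions, and it does not reflect the actual bit assignments in the mode control register.

```python
class BankedRegisters:
    """Sixteen registers with a primary and a secondary set."""

    def __init__(self):
        self.sets = {"primary": [0] * 16, "secondary": [0] * 16}
        self.active = "primary"          # primary registers are active at reset

    def select(self, use_secondary: bool) -> None:
        """Model the mode-control bit that picks the active set."""
        self.active = "secondary" if use_secondary else "primary"

    def write(self, number: int, value: int) -> None:
        self.sets[self.active][number] = value

    def read(self, number: int) -> int:
        return self.sets[self.active][number]
```

An interrupt routine that selects the secondary set can use the registers freely; switching back restores the interrupted context without saving or reloading any values.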

Timer. The programmable interval timer provides periodic interrupt generation. When enabled, the timer decrements a 32-bit count register every cycle. When this count register reaches zero, the ADSP-21161 generates an interrupt and asserts its timer expired output. The count register is automatically reloaded from a 32-bit period register and the count resumes immediately.
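The decrement-and-reload behavior can be sketched as a small state machine. This model is an approximation: the class name IntervalTimer is hypothetical, and the exact cycle on which the interrupt is raised relative to the reload is an assumption of the sketch, not a statement about the hardware timing.

```python
class IntervalTimer:
    """Count register that decrements each cycle and reloads from a period register."""

    def __init__(self, period: int):
        self.period = period & 0xFFFFFFFF   # 32-bit period register
        self.count = self.period            # 32-bit count register

    def tick(self) -> bool:
        """Advance one cycle; return True on the cycle the timer expires."""
        if self.count == 0:
            self.count = self.period        # automatic reload, counting resumes
            return True
        self.count -= 1
        return False
```

In this model a period value of 3 produces an expiry on every fourth call to tick().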

Instruction Cache. The program sequencer includes a 32-word instruction cache that enables three-bus operation for fetching an instruction and two data values. The cache is selective; only instructions whose fetches conflict with program memory data accesses are cached. This caching allows full-speed execution of core, looped operations such as digital filter multiply-accumulates and FFT butterfly processing.
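The benefit of selective caching in a loop can be estimated with a simple count. The sketch below assumes that each cache miss on a conflicting fetch costs one extra cycle and that the cache never evicts an entry; it ignores the 32-word capacity and the cache organization. The function name extra_cycles and the addresses are hypothetical.

```python
def extra_cycles(conflicting_addresses, iterations: int) -> int:
    """Count stall cycles for a loop whose listed fetches collide with PM data accesses."""
    cache = set()
    stalls = 0
    for _ in range(iterations):
        for address in conflicting_addresses:
            if address not in cache:
                stalls += 1          # miss: the fetch and the data access share the PM bus
                cache.add(address)   # only conflicting instructions are cached
    return stalls
```

Under these assumptions, a loop with two conflicting fetches stalls twice on its first pass and not at all afterward, however many times it repeats.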

Processor Internal Buses

The processor core has six buses: PM address, PM data, DM address, DM data, IO address, and IO data. Because of the processor’s Super Harvard Architecture, data memory stores data operands, while program memory can store both instructions and data. This architecture allows dual data fetches when the instruction is supplied by the cache.

Bus Capacities. The PM address bus and DM address bus transfer the addresses for instructions and data. The PM data bus and DM data bus transfer the data or instructions from each type of memory. The PM address bus is 32 bits wide, allowing access of up to 62.68 Mwords for non-SDRAM and 254.68 Mwords for SDRAM banks of mixed instructions and data. The PM data bus is 64 bits wide to accommodate the 48-bit instructions and 32-bit data.

The DM address bus is 32 bits wide, allowing direct access of up to 4G words of data. The DM data bus is 64 bits wide. The DM data bus provides a path for the contents of any register in the processor to be transferred to any other register or to any data memory location in a single cycle. The data memory address comes from one of two sources: an absolute value specified in the instruction code (direct addressing) or the output of a data address generator (indirect addressing).

The IO address and IO data buses let the IO processor access internal memory for DMA without delaying the processor core. The IO address bus is 18 bits wide, and the IO data bus is 64 bits wide.

Data Transfers. Nearly every register in the processor core is classified as a Universal Register (UREG). Instructions allow transferring data between any two universal registers or between a universal register and memory. This support includes transfers between control registers, status registers, and data registers in the register file. The PM bus connect (PX) registers permit data to be passed between the 64-bit PM data bus and the 64-bit DM data bus, or between the 40-bit register file and the PM data bus. These registers contain hardware to handle the data width difference. For more information, see “Processing Element Registers” on page A-28.

Processor Peripherals

The term processor peripherals refers to everything outside the processor core. The ADSP-21161 peripherals include internal memory, external port, I/O processor, JTAG port, and any external devices that connect to the DSP.

Dual-Ported Internal Memory (SRAM)

The ADSP-21161 contains 1 megabit of on-chip SRAM, organized as two blocks of 0.5 Mbits. Each block can be configured for different combinations of code and data storage. Each memory block is dual-ported for single-cycle, independent accesses by the core processor and I/O processor or DMA controller. The dual-ported memory and separate on-chip buses allow two data transfers from the core and one from I/O, all in a single cycle.

All of the memory can be accessed as 16-, 32-, 48-, or 64-bit words. On the ADSP-21161, the memory can be configured as a maximum of 32K words of 32-bit data, 64K words of 16-bit data, 21.25K words of 48-bit instructions (and 40-bit data), or combinations of different word sizes up to 1.0 Mbit.

The DSP supports a 16-bit floating-point storage format, which effectively doubles the amount of data that may be stored on chip. Conversion between the 32-bit floating-point and 16-bit floating-point formats completes in a single instruction.

While each memory block can store combinations of code and data, accesses are most efficient when one block stores data, using the DM bus for transfers, and the other block stores instructions and data, using the PM bus for transfers. Using the DM bus and PM bus in this way, with one dedicated to each memory block, assures single-cycle execution with two data transfers. In this case, the instruction must be available in the cache. The DSP also maintains single-cycle execution when one of the data operands is transferred to or from off-chip, using the DSP external port.

External Port

The ADSP-21161 external port provides the processor interface to off-chip memory and peripherals. The 254.68 Mword off-chip address space is included in the ADSP-21161’s unified address space. The separate on-chip buses—for PM address, PM data, DM address, DM data, IO address, and IO data—multiplex at the external port to create an external system bus with a single 24-bit address bus and a single 32-bit data bus.

The ADSP-21161 on-chip DMA controller automatically packs external data into the appropriate word width during transfers.

The ADSP-21161 supports instruction packing modes to execute from 48-, 32-, 16-, and 8-bit wide memories. With the link ports disabled, the additional link port pins can be used to execute 48-bit wide instructions. The ADSP-21161 also includes 32- to 48-bit, 16- to 48-bit, and 8- to 48-bit execution packing for executing instructions directly from 32-bit, 16-bit, or 8-bit wide external memories. External SDRAM, SRAM, or SBSRAM can be 8, 16, or 32 bits wide for DMA transfers to or from external memory.
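The idea of execution packing can be illustrated by assembling one 48-bit instruction from narrower memory reads. The sketch below is illustrative only: the function name pack_instruction is hypothetical, the placement of the first read in the least significant bits is an assumption, and the sketch covers only widths that divide 48 evenly (8 and 16 bits), not the 32- to 48-bit mode.

```python
def pack_instruction(units, unit_bits: int, word_bits: int = 48) -> int:
    """Combine narrow memory reads into one instruction word, first unit in the low bits."""
    if len(units) * unit_bits != word_bits:
        raise ValueError("units do not fill the word exactly")
    mask = (1 << unit_bits) - 1
    word = 0
    for position, unit in enumerate(units):
        word |= (unit & mask) << (position * unit_bits)  # place each read in its slot
    return word
```

Six reads from an 8-bit memory, or three reads from a 16-bit memory, fill one 48-bit instruction word in this model.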

On-chip decoding of high-order address lines generates memory bank select signals for addressing external memory devices. The ADSP-21161 provides programmable memory waitstates and external memory acknowledge controls for interfacing to peripherals with variable access, hold, and disable time requirements.

SDRAM Interface. The ADSP-21161 integrated on-chip SDRAM controller transfers data to and from synchronous DRAM (SDRAM) at the core clock frequency or one-half the core clock frequency. The synchronous approach, coupled with the core clock frequency, supports data transfer at a high throughput: up to 400 Mbytes/second for 32-bit transfers and 600 Mbytes/second for 48-bit transfers.
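The quoted peak rates follow from the transfer width and the transfer clock. The sketch below assumes one transfer per clock and a 100 MHz clock; the 100 MHz value is inferred from the 400 and 600 Mbytes/second figures and is an assumption, as is the function name peak_throughput_mbytes.

```python
def peak_throughput_mbytes(clock_mhz: float, transfer_bits: int) -> float:
    """Peak Mbytes/second when one transfer of transfer_bits completes every clock."""
    return clock_mhz * transfer_bits / 8   # bits per transfer converted to bytes
```

With these assumptions, 32-bit transfers give 400 Mbytes/second and 48-bit transfers give 600 Mbytes/second; running the interface at one-half the clock frequency halves both figures.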

The SDRAM interface provides a glueless interface with standard SDRAMs (16 Mbits, 64 Mbits, 128 Mbits, and 256 Mbits) and includes options to support additional buffers between the ADSP-21161 and SDRAM. The SDRAM interface is extremely flexible and can connect SDRAMs to any one of the ADSP-21161’s four external memory banks, with up to all four banks mapped to SDRAM.

Systems with several SDRAM devices connected in parallel may require buffering to meet overall system timing requirements. The ADSP-21161 supports pipelining of the address and control signals to enable such buffering between itself and multiple SDRAM devices.

Host Processor Interface. The ADSP-21161 host interface allows easy connection to standard 8-bit, 16-bit, and 32-bit microprocessor buses, with little additional hardware required. The interface supports asynchronous and synchronous transfers at speeds up to half the internal core clock rate of the ADSP-21161. The host interface operates through the ADSP-21161 external port and maps into the unified address space. Four channels of DMA are available for the host interface; code and data transfers occur with low software overhead. The host can directly read and write the IOP register space of the ADSP-21161 and can access the DMA channel setup and mailbox registers. The host can also perform DMA transfers to and from the internal memory of the DSP. Vector interrupt support provides for efficient execution of host commands.

Multiprocessor System Interface. The ADSP-21161 offers powerful features tailored to multiprocessing DSP systems. The unified address space allows direct interprocessor accesses of each ADSP-21161’s internal IOP registers. Distributed bus arbitration logic on the DSP allows simple, glueless connection of systems containing up to six ADSP-21161s and a host processor. Master processor changeover incurs only one cycle of overhead. Bus arbitration handles either fixed or rotating priority. Processor bus lock allows indivisible read-modify-write sequences for semaphores. A vector interrupt capability is provided for interprocessor commands.
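The difference between fixed and rotating priority can be sketched with a simple arbiter model. Everything in this sketch is an assumption made for illustration: the function name next_master, the rule that the lowest processor ID wins under fixed priority, and the rule that rotating priority searches upward from the current master. It does not describe the DSP's arbitration logic or timing.

```python
def next_master(requests, current: int, rotating: bool, processors: int = 6) -> int:
    """Pick the next bus master among processor IDs 1..processors.

    requests: set of IDs asserting a bus request.
    current:  ID of the present bus master.
    """
    if not requests:
        return current                       # nobody asks, so the master keeps the bus
    if not rotating:
        return min(requests)                 # fixed priority: lowest ID wins (assumed)
    for step in range(1, processors + 1):    # rotating: search upward from the current master
        candidate = (current - 1 + step) % processors + 1
        if candidate in requests:
            return candidate
    return current
```

With processors 2 and 5 requesting while processor 3 is master, the fixed scheme in this model grants the bus to processor 2 and the rotating scheme grants it to processor 5.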

I/O Processor

The ADSP-21161 Input/Output Processor (IOP) includes four serial ports, two link ports, a SPI-compatible port, and a DMA controller. One of the processes that the IO processor automates is booting. The DSP can boot from the external port (with data from an 8-bit EPROM or a host processor) or a link port. Alternatively, a no-boot mode lets the DSP start by executing instructions from external memory without booting.

Serial Ports. The ADSP-21161 features four synchronous serial ports that provide an inexpensive interface to a wide variety of digital and mixed-signal peripheral devices. The serial ports can operate at up to half the processor core clock rate. Programmable data direction provides greater flexibility for serial communications. Serial port data can automatically transfer to and from on-chip memory using DMA. Each of the serial ports offers a TDM multichannel mode (up to 128 channels) and supports µ-law or A-law companding. I²S support is also provided with the ADSP-21161.
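Companding compresses the dynamic range of a sample before transmission and expands it on reception. The sketch below uses the continuous µ-law formula with µ = 255 on samples normalized to the range -1 to 1. It illustrates the principle only; the serial port hardware operates on encoded integer samples, and the function names here are hypothetical.

```python
import math

MU = 255.0  # value used by the common 8-bit mu-law standard

def mu_law_compress(x: float) -> float:
    """Map a sample in [-1, 1] to the companded range [-1, 1]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mu_law_expand(y: float) -> float:
    """Inverse of mu_law_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)
```

Small amplitudes occupy a larger share of the companded range than large ones, which is why companded 8-bit samples preserve quiet signals better than linear 8-bit samples.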

The serial ports can operate with little-endian or big-endian transmission formats, with word lengths from 3 to 32 bits. The serial ports offer selectable synchronization and transmit modes. Serial port clocks and frame syncs can be internally or externally generated.

Link Ports. The ADSP-21161 features two 8-bit link ports that provide additional I/O capabilities. Link port I/O is especially useful for point-to-point interprocessor communication in multiprocessing systems. The link ports can operate independently and simultaneously. The data packs into 32-bit or 48-bit words, which the processor core can directly read or the IO processor can DMA-transfer to on-chip memory. Clock and acknowledge handshaking signals control link port transfers. Transfers are programmable as either transmit or receive.


Serial Peripheral (Compatible) Interface. The ADSP-21161 Serial Peripheral Interface (SPI) is an industry-standard synchronous serial link that enables the ADSP-21161 SPI-compatible port to communicate with other SPI-compatible devices. SPI is a 4-wire interface consisting of two data pins, one device select pin, and one clock pin. It is a full-duplex synchronous serial interface, supporting both master and slave modes. It can operate in a multi-master environment by interfacing with up to four other SPI-compatible devices, either acting as a master or slave device. The ADSP-21161 SPI-compatible peripheral implementation also supports programmable baud rate and clock phase/polarities, and the use of open drain drivers to support the multi-master scenario to avoid data contention.
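Full-duplex operation means that the master and the slave exchange data on every clock. The sketch below models one 8-bit exchange as two shift registers connected through the two data pins. The function name spi_exchange, the 8-bit word size, and the MSB-first order are assumptions of the sketch; clock phase and polarity are not modeled.

```python
def spi_exchange(master_byte: int, slave_byte: int):
    """Swap one byte between master and slave, MSB first, one bit per clock."""
    master, slave = master_byte & 0xFF, slave_byte & 0xFF
    for _ in range(8):
        mosi = (master >> 7) & 1                 # master drives its top bit
        miso = (slave >> 7) & 1                  # slave drives its top bit
        master = ((master << 1) | miso) & 0xFF   # each side shifts in the other's bit
        slave = ((slave << 1) | mosi) & 0xFF
    return master, slave
```

After eight clocks the two registers have traded contents, so a transmit and a receive always complete together.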

DMA Controller. The ADSP-21161 on-chip DMA controller allows zero-overhead data transfers without processor intervention. The DMA controller operates independently and invisibly to the processor core, allowing DMA operations to occur while the core is simultaneously executing its program. Both code and data can be downloaded to the ADSP-21161 using DMA transfers.

DMA transfers can occur between the ADSP-21161 internal memory and external memory, external peripherals, or a host processor. DMA transfers between external memory and external peripheral devices are another option. External bus packing to 8-, 16-, 32-, 48-, or 64-bit words is automatically performed during DMA transfers.

Fourteen channels of DMA are available on the ADSP-21161—two over the link ports (shared with SPI), eight over the serial ports, and four over the processor’s external port. The external port DMA channels serve for host processor, other ADSP-21161 DSPs, memory, or I/O transfers.

JTAG Port

The JTAG port on the ADSP-21161 supports the IEEE 1149.1 Joint Test Action Group (JTAG) standard for system test. This standard defines a method for serially scanning the I/O status of each component in a system. Emulators use the JTAG port to monitor and control the DSP during emulation. Emulators using this port provide full-speed emulation with access to inspect and modify memory, registers, and processor stacks. JTAG-based emulation is non-intrusive and does not affect target system loading or timing.

Development Tools

The ADSP-21161 is supported by VisualDSP++®, an easy-to-use project management environment consisting of an Integrated Development Environment (IDE) and Debugger. VisualDSP++ lets you manage projects from start to finish from within a single, integrated interface. Because the project development and debug environments are integrated, you can move easily between editing, building, and debugging activities.

Integrated Development Environment. The IDE provides flexible project management for the development of DSP applications. The IDE includes access to all the activities necessary to create and debug DSP projects. You can create or modify source files or view listing or map files with the IDE Editor. This powerful Editor is part of the IDE and includes multiple language syntax highlighting, OLE drag and drop, bookmarks, and standard editing operations such as undo/redo, find/replace, copy/paste/cut, and go to.

Also, the IDE includes access to the SHARC® DSP C/C++ Compiler, C/C++ Runtime Library, Assembler, Linker, Loader, Simulator, and Splitter. Options for these SHARC tools can be specified through Property Page dialogs. Property Page dialogs are easy to use, and make configuring, changing, and managing projects simple. These options control how the tools process inputs and generate outputs, and have a one-to-one correspondence to the tools’ command line switches. You can define these options once, or modify them to meet changing development needs. You can also access the SHARC Tools from the operating system command line if you choose.

Debugger. The Debugger has an easy-to-use, common interface for all processor simulators and emulators available through Analog Devices and third parties or custom developments. The Debugger has many features that greatly reduce debugging time. You can view C/C++ source interspersed with the resulting Assembly code. You can profile execution of a range of instructions in a program; set simulated watch points on hardware and software registers, program and data memory; and trace instruction execution and memory accesses. These features enable you to correct coding errors, identify bottlenecks, and examine DSP performance. You can use the custom register option to select any combination of registers to view in a single window. The Debugger can also generate inputs, outputs, and interrupts so you can simulate real world application conditions.

SHARC Software Development Tools. SHARC Software Development Tools, which support the SHARC Family, allow you to develop applications that take full advantage of the SHARC architecture, including multiprocessing, shared memory, and memory overlays. SHARC Software Development Tools include C Compiler, C Runtime Library, DSP and Math Libraries, Assembler, Linker, Loader, Simulator, and Splitter.

C/C++ Compiler and Assembler. The C/C++ Compiler generates efficient code that is optimized for both code density and execution time. The Compiler allows you to include Assembly language statements inline. Because of this, you can program in C and still use Assembly for time-critical loops. You can also use pretested Math, DSP, and C Runtime Library routines to help shorten the time to market. The SHARC Family Assembly language is based on an algebraic syntax that is easy to learn, program, and debug. The add instruction, for example, is written in the same manner as the actual equation.

Linker and Loader. The Linker provides flexible system definition through Linker Description Files (.LDF). In a single LDF, you can define different types of executables for a single or multiprocessor system. The Linker resolves symbols over multiple executables, maximizes memory use, and easily shares common code among multiple processors. The Loader supports creation of host, link port, and PROM boot images. Along with the Linker, the Loader allows multiprocessor system configuration with smaller code and faster boot time. The Simulator is a cycle-accurate, instruction-level simulator, allowing you to simulate your application in real time.

Third-Party Products. The VisualDSP++ environment enables third-party companies to add value using Analog Devices' published set of Application Programming Interfaces (API). Third party products (runtime operating systems, emulators, high-level language compilers, multiprocessor hardware) can interface seamlessly with VisualDSP++, thereby simplifying the tools integration task. VisualDSP++ follows the COM API format. Two API tools, Target Wizard and API Tester, are also available for use with the API set. These tools help speed the time-to-market for vendor products. Target Wizard builds the programming shell based on API features the vendor requires. The API Tester exercises the individual features independently of VisualDSP++. Third parties can use a subset of these APIs that meets their application needs. The interfaces are fully supported and backward compatible.

Further details and ordering information are available in the VisualDSP++ Development Tools Data Sheet. This data sheet can be requested from any Analog Devices sales office or distributor.

Differences From Previous SHARC DSPs

This section identifies differences between the ADSP-21161 DSP and previous SHARC DSPs: ADSP-21160, ADSP-21060, ADSP-21061, ADSP-21062, and ADSP-21065. The ADSP-21161 preserves much of the ADSP-2106x architecture and is compatible with the ADSP-21160, while extending performance and functionality. For background information on
