Silicon Graphics Origin200/2000

Technical Reference Manual

Last Revised: December 30, 1996
by Curt McDowell, csm@sgi.com


Document Overview

About This Document

This manual contains information needed to use the IP27prom to boot or debug an Origin2000 system. Coverage includes the following, but does not extend into the IO6prom or IRIX Kernel:

A lot of basic information about the Origin2000 is included to minimize the number of other documents required to use the IP27prom. Information is presented in a reference format.

Intended Audience

This document contains no proprietary information and is suitable for use by SGI Power Customers.

Related Documents (SGI Access Only)


Origin2000/200 System Overview

Slot Numbering

Picture Courtesy of Steve Whitney

The two Router card slots in the front right of the machine are numbered R1 and R2, going left to right. Router cable ports are numbered 1, 2, and 3 from top to bottom. Router ports 4, 5, and 6 are internal to the system.

The four Node card slots in the back of the machine are numbered N1 through N4, going right to left. Each Node card contains two CPUs called A and B. Therefore, each module may contain up to 8 CPUs, which are called 1A, 1B, 2A, 2B, 3A, 3B, 4A, and 4B (or sometimes simply CPU 0 through CPU 7).
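
The slot-and-slice names can be mapped onto the flat CPU numbers with a little arithmetic. The following Python sketch assumes the flat numbering follows the order in which the names are listed above (1A is CPU 0, 4B is CPU 7); that ordering and the function names are illustrative assumptions, not part of any SGI tool.

```python
def cpu_name_to_index(name):
    """Convert a slot/slice name such as '2A' to a flat CPU number (0-7)."""
    slot = int(name[:-1])             # Node card slot, 1 through 4
    slice_letter = name[-1].upper()   # CPU slice, 'A' or 'B'
    if slot not in range(1, 5) or slice_letter not in ("A", "B"):
        raise ValueError(f"bad CPU name: {name!r}")
    return (slot - 1) * 2 + "AB".index(slice_letter)


def cpu_index_to_name(index):
    """Convert a flat CPU number (0-7) back to a slot/slice name."""
    if index not in range(8):
        raise ValueError(f"bad CPU index: {index!r}")
    slot, slice_number = divmod(index, 2)
    return f"{slot + 1}{'AB'[slice_number]}"
```

For example, `cpu_name_to_index("3B")` gives 5 under this assumed ordering.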

R1 is connected directly to N1 and N2, while R2 is connected directly to N3 and N4 (see Vector Addressing for a more thorough treatment of Router port connectivity).

IO1 through IO6 are connected directly to the Crossbows on N1 and N3, while IO7 through IO12 are connected directly to the Crossbows on N2 and N4.

Part Numbers

Common Part Numbers
Number Part
013-1547 Origin2000 Midplane (8 CPU + 12 I/O)
013-1025 Origin200 Motherboard
030-0841 Origin2000 Router Card
030-0733 Origin2000 IP27 Node Card
030-1124 Server BaseIO Card
030-0872 MSCSI Card
030-0873 MENET Card
030-0880 MIO Card
030-0927 FibreChannel Card
030-0968 HIPPI_S Card
030-0956 HPCX Card

Node IDs (NASIDs)

What is a Node ID?

During the boot process, the PROM assigns a Node ID to each Hub in the system. Node IDs are small integer values ranging from 0 to 255. Since there is one Hub on each node card, each Node card receives its own Node ID and the ID is shared between the CPUs on the same node card.

Node IDs are more correctly referred to as NASIDs (NUMA Address Space Identifiers). Hardware folks use the term Node ID, while software folks use the term NASID to avoid confusion with several other types of node identifiers used internal to the IRIX kernel.

What are Node IDs used for?

Node IDs allow any node to access the full address spaces of any other node. Addressing a remote node's space is done simply by placing the remote node's ID in bits 32 through 40 of the desired physical address. This is true whether talking to the node's memory, I/O devices, Hub chip, PROM, etc. For example, one could talk to various parts of remote node 7's address space using addresses as follows:
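
The address composition can be sketched in Python as below. The sketch only relies on what is stated above (the Node ID field begins at bit 32 and Node IDs range from 0 to 255); the function names and the sample offset are hypothetical and do not correspond to any real register or memory location.

```python
NASID_SHIFT = 32      # the Node ID field starts at bit 32, per the text above
NASID_FIELD = 0x1FF   # bits 32 through 40 form a 9-bit field


def remote_address(nasid, offset):
    """Build a physical address that targets `offset` within node `nasid`."""
    if not 0 <= nasid <= 255:
        raise ValueError("Node IDs range from 0 to 255")
    if not 0 <= offset < (1 << NASID_SHIFT):
        raise ValueError("offset must fit below the Node ID field")
    return (nasid << NASID_SHIFT) | offset


def split_address(address):
    """Return (nasid, offset) for a physical address."""
    nasid = (address >> NASID_SHIFT) & NASID_FIELD
    offset = address & ((1 << NASID_SHIFT) - 1)
    return nasid, offset
```

With these definitions, `hex(remote_address(7, 0x1000))` is `'0x700001000'`: the same offset that would be used locally, with node 7 selected in the upper bits.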

How do node IDs get assigned?

After a system reset, there is a period of time when no Node IDs are assigned. In this situation, all Node IDs are zero and it is not possible to access the address spaces of remote nodes.

A special Hub feature called Vector Routing provides a way to access a small subset of Hub registers on a remote node. It is through Vector Routing that the CrayLink Interconnect topology is discovered, a global master is arbitrated, node IDs are assigned and distributed to all nodes in the system, and the full address space access becomes defined.

After the Node IDs are assigned, routing tables in each Hub and Router in the system are programmed with information that allows nodes to find one another, so that a Node ID is effectively used as a network address. The Node ID is used as an index into the routing tables at each network hop in order to determine to which router output port a request or response should be forwarded.

NICs (Number In a Can)

A tiny chip from Dallas Semiconductor called a NIC is used extensively throughout the Origin2000 system. There is a NIC on each type of board in the system, including the Node card, Router card, BaseIO card (with two NICs), module midplane, etc.

The NIC contains a 48-bit number that is permanently laser-burned in at the time the NIC is manufactured and is guaranteed to be unique from all other NICs. The Node card NICs help identify Nodes during the boot process when the system is probing the network. The NIC on the midplane is used for software licensing.

In addition to the 48-bit number, NICs also contain several pages of non-volatile memory that can be read or written by the system. This memory is used to store additional information, primarily for purposes of inventory tracking, such as:

In general, Origin2000 software never writes to the NIC. Rather, it is programmed once by an external NIC programming device in the manufacturing plant, and subsequently reprogrammed by Manufacturing should a board be upgraded to a new revision level.

Memory Configurations

The Origin family memory consists of two parts, the system memory and the directory memory. Directory memory is used for storing cache coherence protocol information and is much smaller than system memory.

The Origin2000 supports from 64 MB to 4 GB of RAM per node card. Two types of directory memory are available for the Origin2000, referred to as standard and premium. Premium directory memory is required for systems that are not a subset of a standard 32-processor hypercube. It can also be used in smaller systems to provide greater process migration accuracy.

System memory must be populated one bank at a time. Each bank consists of two memory DIMM slots and one directory DIMM slot (be careful, because the three slots are separated from one another on the node card). The two memory DIMMs making up one bank must be the same size. Each bank may contain a different amount of memory. There should be memory in bank zero.
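
The population rules above can be expressed as a small checker. This is a minimal Python sketch, not PROM code: the data layout (one pair of DIMM sizes per bank, with 0 meaning an empty slot) and the function name are assumptions made for illustration.

```python
def check_banks(banks):
    """Check a node card's memory banks against the population rules.

    `banks` is a list with one (dimm0_mb, dimm1_mb) pair per bank,
    starting at bank 0; a size of 0 means the slot is empty.
    Returns a list of human-readable problems (empty if none).
    """
    problems = []
    for number, (first, second) in enumerate(banks):
        # Both memory DIMMs in a bank must match (or both be empty).
        if first != second:
            problems.append(
                f"bank {number}: DIMM sizes differ ({first} MB vs {second} MB)"
            )
    # There should be memory in bank zero.
    if not banks or not any(banks[0]):
        problems.append("bank 0 is empty")
    return problems
```

Banks of different sizes pass the check, since each bank may hold a different amount of memory: `check_banks([(32, 32), (64, 64)])` returns an empty list.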

Standard directory memory resides in the same DIMMs in which the system memory resides, and is automatically populated when the system memory is populated. The directory DIMM slots are left empty.

Premium directory memory is populated by inserting additional smaller DIMMs into the directory DIMM slots, adding more precision to the standard directory memory.

The Origin200 supports from 32 MB to 4 GB of RAM per node card and does not support premium directory memory. Banks are populated from the two outermost DIMM slots (making up bank 0) to the two innermost DIMM slots (making up bank 3).

Module Numbers

In an Origin2000 system, each Module (4 node cards, 8 processors) is assigned a unique number from 1 to 255, called the module number. The module number is the mechanism by which Irix identifies particular modules. The module number is stored in MSC NVRAM. A backup copy of the module number is stored in each node card PROM Log, so that in the event that the MSC NVRAM becomes inaccessible, the module number is voted from the last value stored in the PROM Logs. A module number of 0 indicates that a module number has not yet been assigned.

The most important use of module numbers is device naming in the Irix Hardware Graph. Disks and other devices are referred to by the number of the module containing them, as well as the slot within that module. For example, the root disk device might be called

/hw/module/1/slot/io1/baseio/pci/0/scsi_ctlr/0/target/1/lun/0/disk/partition/0/block

indicating that the disk is in module 1, slot io1, BaseIO card, PCI slot 0, etc.
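
A script that needs to know which module and slot a device lives in can pull both out of the path. The Python sketch below assumes the path layout shown in the example above and that the module number appears in decimal; the function name is hypothetical.

```python
def module_and_slot(path):
    """Extract (module number, slot name) from a hardware graph path."""
    parts = path.strip("/").split("/")
    # Expect the form hw/module/<number>/slot/<slot>/...
    if len(parts) < 5 or parts[:2] != ["hw", "module"] or parts[3] != "slot":
        raise ValueError(f"not a module/slot path: {path!r}")
    return int(parts[2]), parts[4]


root_disk = ("/hw/module/1/slot/io1/baseio/pci/0/scsi_ctlr/0/"
             "target/1/lun/0/disk/partition/0/block")
print(module_and_slot(root_disk))   # (1, 'io1')
```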

If a module number is changed, all affected devices in the hardware graph will change names. This would most likely require the system administrator to reconfigure filesystem mounts, among other things.

Maintenance of module numbers may be performed using the IP27prom module command, or using the MSC mod command. When module numbers are changed in this manner, the change will not take place until the next time the system is rebooted.

The IO6prom verifies the consistency of module numbers. To be consistent, every module must have a number assigned to it, and there must not be any duplicate numbers.

Modules found to be without numbers will have a number automatically assigned to them, and then the system will reboot. Therefore, if a new module is added to an existing system, a module number will be automatically assigned and any attached devices will show up in the hardware graph.

Duplicate module numbers will prevent the system from booting until the problem is rectified. A message indicating the problem will display on the system console.
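
The consistency rules can be illustrated with a short Python sketch. It is not the IO6prom's implementation; the function name and the idea of passing one module number per module in a list are assumptions for illustration.

```python
from collections import Counter


def check_module_numbers(numbers):
    """Return (unassigned, duplicates) for a list of module numbers.

    `unassigned` lists the positions of modules whose number is 0
    (not yet assigned); `duplicates` lists every nonzero number that
    is used by more than one module.
    """
    unassigned = [index for index, number in enumerate(numbers) if number == 0]
    counts = Counter(number for number in numbers if number != 0)
    duplicates = sorted(number for number, count in counts.items() if count > 1)
    return unassigned, duplicates
```

A result of `([], [])` corresponds to a consistent system. A nonempty first list corresponds to the automatic-assignment case, and a nonempty second list to the case that prevents booting.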

Module System Controller

Note: The MSC was at one time called the Entry Level System Controller, or ELSC. They are one and the same.

The MSC front panel is shown in the figure and comprises the following elements:


Alternate (Diagnostic) Console Port

The Origin2000 Module System Controller (MSC) front panel has a DIN-type RS-232 serial port, labelled the Diagnostic Port, and internally referred to as the Alternate Console Port (ACP). A second connector for the same serial port is provided in the back of the module for when it is to be connected to a Multi-Module System Controller (MMSC). The ACP is always available and is primarily used for debugging during manufacturing bring-up of a system, as well as system debugging when for some reason the regular serial port console is not available.

An RS-232 dumb terminal connected to the ACP can talk to the MSC, and through the MSC to the individual CPUs. All of the CPUs in a single module must share this console. For this reason, output from multiple CPUs appears on the console simultaneously, interleaving on a line by line basis, with CPU identification at the start of each line in the form of a slot (1 to 4) and slice (A or B). The MSC can be directed to send ACP keyboard input to a particular CPU. It can also be directed to show only the output from a particular CPU.

The MSC itself has a command set. Commands are sent to the MSC by typing the escape character ^T (Control-T), followed by the text of the command and ENTER. Ordinary ACP output is discarded between the time that the ^T and the ENTER are received. If echoing is enabled (default), the prompt MSC> will be displayed upon the reception of the ^T and characters typed will be visible. If echoing is not enabled (see the ech command), nothing will be visible as the command is typed.

Security

MSC commands which could potentially be destructive to system operation may only be executed when the MSC is in supervisor mode. If the MSC is not in supervisor mode, these commands result in the following error:

    err perm

The MSC is automatically in supervisor mode when the front panel keyswitch is in the Diag position. It may also be placed in supervisor mode by issuing the MSC command pas none to enter the four-character MSC password, where none is the default MSC password. The command pas s abcd would change the MSC password to abcd.

Commands

Commands are typed to the MSC ACP by prefixing them with a ^T (Control-T) character. Commands are visible as they are typed only if echoing is turned on (which is true after power-on). Each command is three letters in length. Some of them take parameters. Numeric parameters are always in hexadecimal. All commands return responses consisting of ok and possibly some hexadecimal values, or one of the three error responses:

MSC Responses
Response Reason
err perm Permission denied. Keyswitch must be in diagnostic position, or password entered via the pas command.
err cmd Unrecognized command mnemonic.
err arg Invalid command argument(s).
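
A host program driving the ACP over a serial line might frame commands and interpret responses roughly as follows. This Python sketch makes several assumptions that the text does not confirm: ENTER is sent as a carriage return, responses arrive as a single line of space-separated words, and the helper names are invented for illustration.

```python
ESCAPE = b"\x14"   # ^T (Control-T)
ENTER = b"\r"      # assumption: ENTER is transmitted as a carriage return

ERRORS = {
    "perm": "Permission denied",
    "cmd": "Unrecognized command mnemonic",
    "arg": "Invalid command argument(s)",
}


def build_command(mnemonic, *args):
    """Frame an MSC command; integer arguments are written in hexadecimal."""
    if len(mnemonic) != 3:
        raise ValueError("MSC commands are three letters long")
    words = [mnemonic]
    for arg in args:
        words.append(format(arg, "x") if isinstance(arg, int) else str(arg))
    return ESCAPE + " ".join(words).encode("ascii") + ENTER


def parse_response(line):
    """Return (True, values) for an ok response or (False, reason) for an error."""
    words = line.split()
    if words[:1] == ["ok"]:
        return True, words[1:]
    if len(words) == 2 and words[0] == "err" and words[1] in ERRORS:
        return False, ERRORS[words[1]]
    raise ValueError(f"unexpected MSC response: {line!r}")
```

For instance, `build_command("pwr", "d", 600)` yields `b"\x14pwr d 258\r"`, because a 600-second delay is written as hexadecimal 258.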

The supported commands are:

MSC Commands
Command Function
aut 1 Turns on automatic power-on mode so the MSC automatically issues a pwr u when the MSC is powered on (Origin200 only).
aut 0 Turns off automatic power-on mode (Origin200 only).
clr Resets all MSC options to their power-up defaults. This includes echo mode, no heartbeats, etc.
dbg Displays the virtual and physical Debug Switch bytes.
dbg V P Sets the virtual Debug Switch byte to V, and the physical Debug Switch byte to P.
dsp M Displays message M on the 8-character alphanumeric display. M may contain up to 8 ASCII characters other than NUL. For example, dsp testing would overwrite the first seven characters with "testing," while leaving the eighth character alone.
dsc N C Modifies only the Nth character on the alphanumeric display. N is a digit from 0 to 7, and C is any ASCII character other than NUL.
ech 0 Turns off echoing as MSC commands are entered.
ech 1 Turns on echoing.
ech Toggles the echoing. Echoing is on by default after reset.
fan If no fan has failed, returns n or h, according to whether the fans are currently operating at Normal or High speed. If a fan has failed on an Origin200, returns f x, where x is a bit map of the failed fans (fan 1 = 1, fan 2 = 2, fan 3 = 4). If a fan has failed on an Origin2000, returns f xyz, where xyz are bit maps of the failed fans by row (fan 1 = 1, fan 2 = 2, fan 3 = 4). When the MSC detects that a fan has failed, it speeds up the remaining fans to maintain cooling levels. The failed fan should be replaced because the increased speed reduces the life of the remaining fans. If more than one fan fails, the system is powered down.
fan n Sets the fans to normal speed.
fan h Sets the fans to high speed.
key Returns the status of the key-switch as off, on, or diag.
mod Displays the number of the module containing the MSC.
mod xx Sets the number of the module containing the MSC to xx, where xx is a hexadecimal module number from 01 to ff (a module number of 00 indicates no module number is yet assigned).
nmi Sends a hardware NMI to all node cards in the MSC's module.
pas xxxx When xxxx is replaced with the correct four-character password, places the MSC into supervisor mode, in which various restricted commands may be used. If the MSC is not in supervisor mode, restricted commands will result in a permission error (err perm). The MSC is automatically in supervisor mode if the key-switch is in the diag position.
pas s xxxx Sets the password to xxxx. This is a restricted command. The password is stored in NVRAM and defaults to none.
pwr Returns u or d, according to whether the system is powered up or down.
pwr u Powers the system up.
pwr d Powers the system down.
pwr d N Waits N seconds and then powers the system down. N is a hex value from 5 to 258 (5 to 600 decimal).
pwr c N Powers the system down, waits N seconds, then powers the system back up. N is a hex value from 5 to 258 (5 to 600 decimal).
rst Sends a hardware reset to all node cards in the MSC's module.
rsw Returns the current Debug Switch settings as an inverted hexadecimal byte. See section on Debug Switch Use for bit correspondence.
sel cpu Selects which CPU is to receive input from the ACP. Anything typed on the ACP which is not an escaped command is sent through to the selected CPU. CPUs are named by slot and slice as described in Slot and CPU Numbering. For example, sel 2a would select CPU A in slot N2 for input.
sel Displays which CPU is currently selected to receive ACP input, or none if no CPU is selected.
sel auto Causes the CPU that most recently produced output to be selected automatically to receive ACP input (this is the power-up default).
sel none Causes no CPU to be selected to receive ACP input.
see cpu Causes output from all CPUs other than a specific CPU to be discarded (ordinarily, the output from all CPUs is displayed intermixed by line). CPUs are named by slot and slice as described in Slot and CPU Numbering. For example, see 2a would cause only CPU 2A's output to be shown.
see Displays which CPU is currently being shown, or all if all CPUs are being shown.
see all Causes output from all CPUs to be displayed (this is the power-up default).
tmp Returns o, h, or n, indicating whether the system is over-temperature, dangerously high-temperature, or normal temperature, respectively.
ver Reports the MSC firmware revision number.

The following MSC commands are not documented here: get, hbt, rcf, scf, tas, and vlm.
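
As an illustration of the fan bit maps described for the fan command, the following Python sketch decodes a fan status string. It assumes the status is passed in exactly as written in the table (n, h, f x, or f xyz), that each of x, y, and z is a single hexadecimal digit, and that the digits correspond to rows 1, 2, and 3 in order; none of these details is confirmed by the text, and the function name is hypothetical.

```python
def decode_fan(status):
    """Decode an MSC fan status into a speed and a list of failed fans.

    Failed fans are reported as (row, fan) pairs. An Origin200 status
    has a single bit map, which is reported as row 1.
    """
    words = status.split()
    if words == ["n"]:
        return {"speed": "normal", "failed": []}
    if words == ["h"]:
        return {"speed": "high", "failed": []}
    if len(words) == 2 and words[0] == "f":
        failed = []
        for row, digit in enumerate(words[1], start=1):
            bits = int(digit, 16)
            for fan in (1, 2, 3):
                # fan 1 = 1, fan 2 = 2, fan 3 = 4
                if bits & (1 << (fan - 1)):
                    failed.append((row, fan))
        return {"speed": None, "failed": failed}
    raise ValueError(f"unexpected fan status: {status!r}")
```

Under these assumptions, `decode_fan("f 4")` reports fan 3 as failed on an Origin200.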


Multi-Module System Controller

Note: The MMSC was at one time called the Full-Featured System Controller, or FFSC. They are one and the same.

The Multi-Module System Controller provides a way to manage Origin2000 systems consisting of more than one module, and also provides an enhanced graphical display of system activity. An MMSC is specified in the standard configuration for Origin2000 systems consisting of more than one module.

Multi-module systems can be operated without an MMSC. However, certain aspects of system management are less convenient. Moreover, if no MMSC is provided, the individual modules in the system must be manually powered on one at a time, with each switch being powered on less than 15 seconds after the previous one.

For detailed information about MMSC operations and command set, please refer to the Multi-Module System Controller document.

Functions of the MMSC

The Multi-Module System Controller serves the following purposes:

MMSC Connectivity

A typical multi-module installation consists of:

Origin2000 Hierarchical System Control

IP27prom

There is one IP27prom per node card, connected directly to the Hub chip. The device is physically an AMDF080 containing 1,048,576 bytes. It is possible to erase individual sectors consisting of 65,536 bytes, a process which sets all of the bits in the sector to 1. It is then possible to program any individual 1 bit into a 0 bit, but not vice versa.

The first 14 sectors of the PROM are used for the IP27prom firmware. The last 2 sectors are used for the PROM Log which stores environment variables and log messages.
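
The erase and program behavior described above can be modeled in a few lines of Python. This is a software model for illustration only, not a driver: it assumes the device starts fully erased, and the class and method names are invented.

```python
SECTOR_SIZE = 65536
SECTOR_COUNT = 16          # 1,048,576 bytes / 65,536 bytes per sector
FIRMWARE_SECTORS = 14      # sectors 0-13 hold firmware, 14-15 the PROM Log


class FlashModel:
    """Toy model of a flash part whose bits can only be programmed 1 -> 0."""

    def __init__(self):
        self.data = bytearray(b"\xff" * (SECTOR_SIZE * SECTOR_COUNT))

    def erase_sector(self, sector):
        """Erasing sets every bit in the sector to 1."""
        start = sector * SECTOR_SIZE
        self.data[start:start + SECTOR_SIZE] = b"\xff" * SECTOR_SIZE

    def program(self, address, value):
        """Programming can clear bits but never set them."""
        self.data[address] &= value


def sector_use(sector):
    """Return which region a sector belongs to."""
    if not 0 <= sector < SECTOR_COUNT:
        raise ValueError("sector out of range")
    return "firmware" if sector < FIRMWARE_SECTORS else "PROM Log"
```

In this model, programming 0xF0 and then 0x3C to the same byte leaves 0x30, since the second write cannot turn any 0 bit back into a 1. Only erasing the whole sector restores 0xFF.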

How the IP27prom Fits Into the Origin2000 System

Picture Courtesy of Steve Whitney

PROM Compatibility

It is illegal to run different versions of the IP27prom on different nodes because the details of the boot sequence vary from release to release. Each node relies on the implicit actions of other nodes in order to maintain proper synchronization. The IP27prom cross-checks version numbers with other IP27proms and displays a promrev mismatch warning if there is an incompatibility.

There are also restrictions on which versions of the IP27prom may run with which versions of the IO6prom, although they do not necessarily need to be synchronized on every release. Refer to the individual PROM release notes for compatibility information.

System Boot Process

When the system is powered on or reset, the following processes take place.

  1. Initialize CPU
  2. Test CPU caches
  3. Set up Dex mode
  4. Disable CPU A and/or CPU B
  5. Arbitrate local master (one per node card)
  6. Read Debug Switch settings from the MSC
  7. Determine the initial console device, and initialize it
  8. Record error information that may indicate the cause of a prior crash
  9. Initialize I/O
  10. Check if there is a BaseIO card with a console port
  11. Display the PROM boot banner on the console
  12. Display which Debug Switch settings are set to other than the default
  13. Determine node's serial number and advertise it to other nodes
  14. Run Hub Chip Self-Test if Heavy or Manufacturing diagnostics are selected
  15. Configure local memory
  16. Perform basic memory tests
  17. Download PROM to memory
  18. Transfer program counter to uncached RAM
  19. Switch crucial structures from the cache into uncached memory
  20. Test and invalidate the secondary cache
  21. Transfer the stack to uncached RAM
  22. Transfer the program counter and stack to cached RAM
  23. Initialize the first 32 MB of bank 0 memory for use by the PROM
  24. Initialize permanent low-memory system data structures
  25. Run diagnostics on the local CrayLink port
  26. Discover the CrayLink Interconnect Topology
  27. Verify all PROMs are running the same firmware version
  28. Arbitrate global master
  29. Global Master configures CrayLink Interconnect
  30. All nodes switch back to uncached memory
  31. Change Node ID
  32. Switch back to cached memory in new Node ID space
  33. Test and initialize all of the rest of local memory (above 32 MB)
  34. Initialize any headless nodes (nodes without functional CPUs)
  35. Display error state information (if not cold power-on)
  36. Transfer control to IO6prom

As it boots, the IO6prom displays several messages about booting and probing devices, then proceeds to the IO6prom menu:

Numbering nodes...
2 node(s) found.
Clocks synchronized.
Modules numbered.
IO6 PROM Monitor SGI Version 2.1 Rev A IP27,   Sep 17, 1996 (BE64)
Sizing caches...
Sizing caches...
Initializing exception vectors.
Initializing environment
Initing environment
Initializing software and devices.
Reiniting caches..
Initing saio...
Installing Devices...

Walking SCSI Adapter 0
1- 2- 3- 4- 5- 6- 7- 8- 9- 10- 11- 12- 13- 14- 15- = 0 device(s)

Walking SCSI Adapter 1
1- 2+ 3- 4- 5- 6- 7- 8- 9- 10- 11- 12- 13- 14- 15- = 1 device(s)
Initializing devices...



System Maintenance Menu

1) Start System
2) Install System Software
3) Run Diagnostics
4) Recover System
5) Enter Command Monitor

Option?

Using the System Controller Debug Switches

There are two sets of Debug Switches maintained in NVRAM by the MSC:

These switches are set by the MSC dbg xx yy command, where xx and yy are hexadecimal bytes. The Virtual Debug Switches are set to xx and the Physical Debug Switches are set to yy. The most significant bit of xx corresponds to Debug Switch 16, while the least significant bit of yy corresponds to Debug Switch 1. Using the dbg command without arguments displays what the current settings are. Ordinarily, both sets of Debug Switches should be set to zero. An equivalent dbg command is also available in IP27prom POD mode (but be careful because it takes decimal by default).

There are also eight Hardware Debug Switches that directly correspond to the Physical Debug Switches. The Hardware Debug Switches are exclusive-ORed with the Physical Debug Switches so debug functions can be controlled via both MSC serial port commands and MSC Hardware Debug Switches. The exclusive-OR allows Hardware Debug Switches that are on to be turned off remotely, and switches that are off to be turned on. The Origin2000 Hardware Debug Switches are mounted on a blue block below the MSC keyswitch.
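
The bit numbering and the exclusive-OR can be sketched in Python as follows. The sketch assumes that the virtual byte covers Debug Switches 9 through 16 and the physical byte covers switches 1 through 8, as the bit correspondence above implies, and that `hardware` is given in true sense (bit set means the switch is ON). The function names are hypothetical.

```python
def effective_switches(virtual, physical, hardware=0):
    """Combine the switch bytes into one 16-bit word.

    Bit 0 of the result is Debug Switch 1 and bit 15 is Debug Switch 16.
    The Hardware Debug Switches are exclusive-ORed into the physical byte.
    """
    effective_physical = (physical ^ hardware) & 0xFF
    return ((virtual & 0xFF) << 8) | effective_physical


def switches_on(word):
    """List the Debug Switch numbers (1-16) that are set in a 16-bit word."""
    return [number for number in range(1, 17) if word & (1 << (number - 1))]
```

For example, `switches_on(effective_switches(0x80, 0x01))` gives `[1, 16]`. A hardware switch that is on can be cancelled remotely: `effective_switches(0x00, 0x01, hardware=0x01)` is 0.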

The Origin200 also has Hardware Debug Switches, which are accessible by removing a small EMI panel cover on the top of the chassis, and are slightly different than the Origin2000 switches in appearance (however, they are still OFF when switched away from the labelled numbers).

Examples:

All switches should be OFF for normal system operation. Changing Hardware Debug Switch settings requires using a sharp stylus to press the switch in on the top (switch ON) or bottom (switch OFF). The Debug Switch block shown here has switches 1, 2, and 6 set to ON and all others OFF.

When reading the raw binary value of the Hardware Debug Switches using the rsw command, a hexadecimal byte value is returned. The most significant bit is switch 8, and the bit sense is inverted: a bit is 1 when its switch is OFF and 0 when it is ON. Reading the Hardware Debug Switch block shown here would return 0xdc.
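
Converting between an rsw value and a list of switches is a matter of inverting the byte. A minimal Python sketch, assuming the least significant bit is switch 1 (the text states only that the most significant bit is switch 8); the function names are illustrative.

```python
def decode_rsw(raw):
    """Return the Hardware Debug Switches (1-8) that are ON for an rsw byte."""
    on_bits = ~raw & 0xFF          # undo the inversion: 1 now means ON
    return [number for number in range(1, 9) if on_bits & (1 << (number - 1))]


def encode_rsw(switches):
    """Return the byte rsw would report for the given set of ON switches."""
    on_bits = 0
    for number in switches:
        on_bits |= 1 << (number - 1)
    return ~on_bits & 0xFF


print(decode_rsw(0xDC))            # [1, 2, 6]
print(hex(encode_rsw([1, 2, 6])))  # 0xdc
```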

System Controller Debug Switch Assignments

PROM Images

Of interest to developers only

The IP27prom build directory and the Irix flash commands deal with PROM images in the promgen format. The file extension for such images is .img. Under Irix, they reside in the directory /usr/cpu/firmware.

To verify an image and view the version number of an image, use the command promgen -h file.img. The promgen utility resides in stand/arcs/tools/promgen.


Power-On Diagnostics

Boot Status LEDs

During system boot, the node board LEDs are constantly updated with values indicating the boot progress, so that if the system were to crash during any phase, the LEDs would indicate what it was doing at the time. Also, diagnostic values are displayed on the Origin2000/200 Module System Controller display during boot. Further into the boot process, a console becomes available to report more detailed information on failures.

If a single processor fails very early during boot, before a console is available, the PROM will present a non-flashing FLED (failure LED) value and completely disable that processor by setting its PI_CPU_ENABLE bit to zero. The system will continue to boot without that processor.

If a single processor fails after a console is available, the PROM will flash a FLED value, and wait for ^C to be entered on the console, whereupon it will enter Dex POD mode. The system will continue to boot without that processor.

Reading the LEDs


Each node card has two side-by-side columns of 8 discrete LEDs. The left column presents a status value from CPU (slice) A, and the right column presents a status value from CPU B.

The columns have the most significant bit on top. The sense of the LED bits is reversed such that a lit LED indicates a zero bit and an extinguished LED indicates a one bit. For example, in the figure at right, CPU A is showing 0xf8, while CPU B is showing 0x03.
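
Reading a column by eye is error-prone because of the reversed sense, so the conversion is sketched here in Python. The representation of a column as a top-to-bottom list of booleans (True meaning lit) and the function names are assumptions for illustration.

```python
def led_value(lit):
    """Convert 8 LED states (top first, True = lit) to a status byte."""
    if len(lit) != 8:
        raise ValueError("a column has exactly 8 LEDs")
    value = 0
    for is_lit in lit:
        # A lit LED is a zero bit; an extinguished LED is a one bit.
        value = (value << 1) | (0 if is_lit else 1)
    return value


def led_pattern(value):
    """Convert a status byte to 8 LED states (top first, True = lit)."""
    return [((value >> bit) & 1) == 0 for bit in range(7, -1, -1)]
```

A column whose top five LEDs are dark and bottom three are lit reads as 0xf8, and a value of 0x03 appears as six lit LEDs above two dark ones.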

During the boot process, the CPU changes the LEDs before each phase of initialization. If a CPU were to hang during any phase, the residual LED value would help to indicate which phase hung and perhaps pinpoint the failing component (for example, the R10000 data cache). LED values from 0x00 to 0x7f, as shown in Table 1, are used for this purpose.

Progress LED Values
LED Name Phase
0x00 RESET -
0x01 INITCPU Initializing R10000 GPRS, FPRS, and COP0
0x02 TESTCP1 Testing R10000 COP1 registers
0x03 RUNTLB Switch to mapped mode
0x04 TESTICACHE Test R10000 primary instruction cache
0x05 TESTDCACHE Test R10000 primary data cache
0x06 TESTSCACHE Test secondary cache
0x07 FLUSHCACHES Flush all caches
0x08 CKHUBLOCAL -
0x09 CKHUBCONFIG -
0x0a INVICACHE Invalidate R10000 primary instruction cache
0x0b INVDCACHE Invalidate R10000 primary data cache
0x0c INVSCACHE Invalidate secondary cache
0x0d INMAIN Succeeded in jumping to main()
0x0e SPEEDUP About to increase PROM access speed
0x0f SPEEDUPOK Increased PROM access speed
0x10 INITDCACHE -
0x11 INITICACHE -
0x12 INITCOP0 -
0x13 FLUSHTLB -
0x14 CLEARTAGS -
0x15 CCLFAILED_INITUART -
0x16 HUBINIT -
0x17 HUBCFAILED_INITUART -
0x18 NOCLOCK_INITUART -
0x19 HUBINITDONE -
0x1a MSCPROBE About to probe for presence of MSC
0x1b JUNKPROBE About to probe for presence of Junk UART
0x1c DONEPROBE Done probing for presence of MSC
0x1d UARTINIT About to initialize selected UART
0x1e UARTINITDONE Done initializing selected UART
0x1f CKHUBCHIP -
0x20 PODMAIN -
0x21 PODLOOP About to enter POD mode, C portion
0x22 PODPROMPT Just about to enter POD prompt loop
0x23 PODMODE About to enter POD mode, assembler portion
0x24 LOCALARB Performing local arbitration (CPU A/B)
0x25 SCINIT -
0x26 BMARB -
0x27 BMASTER -
0x28 BARRIER About to perform first local barrier
0x29 CKPDCACHE1 -
0x2a MAKESTACK About to configure Dex mode stack and data
0x2b MAIN Reached main()
0x2c LOADPROM -
0x2d CKSCACHE1 -
0x2e CKBT -
0x2f INSLAVE -
0x30 PROMJUMP -
0x31 NMI -
0x32 INV_IDCACHES -
0x33 INV_SCACHE -
0x34 WRCONFIG -
0x35 RTCINIT About to initialize Hub Real Time Counter
0x36 RTCINITDONE Done initializing Hub Real Time Counter
0x37 LOCK -
0x38 BARRIEROK First local barrier succeeded
0x39 LOCKOK -
0x3a FPROMINIT -
0x3b FPROMINITDONE -
0x3c JUMPRAMU About to jump to UALIAS space
0x3d JUMPRAMUOK Jumped to UALIAS space
0x3e JUMPRAMC About to jump to cached space
0x3f JUMPRAMCOK Jumped to cached space
0x40 STACKRAM About to test stack area of memory
0x41 STACKRAMOK Done testing stack area of memory
0x42 SLAVEINT Slave saw command request interrupt
0x43 SLAVECALL Slave about to call requested command
0x44 SLAVEREND Slave command completed
0x45 LAUNCHLOOP About to enter slave launch loop
0x46 LAUNCHINTR Received launch interrupt
0x47 LAUNCHCALL Calling launched function
0x48 LAUNCHDONE Launched function returned
0x49 UARTBASE -
0x4a MDIRINIT About to initialize Hub MD and SIMM controls
0x4b MDIRCONFIG About to probe and configure memory size
0x4c I2CINIT About to initialize PCF8584 I2C chip
0x4d I2CDONE Done initializing PCF8584 I2C chip
0x4e CONFIG_INIT -
0x4f IODISCOVER About to discover Hub I/O
0x50 HUB_CONFIG -
0x51 ROUTER_CONFIG About to write Router cfg info into KLCONFIG
0x52 INITII About to initialize I/O section of Hub
0x53 CONSOLE_GET About to probe I/O section for console
0x54 CONSOLE_GET_OK Console probing completed
0x55 NOT_USED_55 -
0x56 INITIODONE Done initializing I/O section of Hub
0x57 STASH2 Reset error state saved
0x58 STASH3 Hub error registers cleared
0x59 STASH4 Hub error checking enabled
0x5a IODISCOVER_DONE Done discovering Hub I/O
0x5b NMI_INIT About to initialize NMI handler area
0x5c TEST_INTS About to test Hub interrupts
0x5d IORESET About to perform early reset of Hub I/O section

In addition to indicating boot progress, the LEDs are used to indicate fatal hardware problems found during diagnostics the PROM performs in each boot phase. If a fatal problem is found, the CPU sets the LEDs to a failure value between 0x80 and 0xff, as shown in Table 2, and automatically disables itself.

Failure LED Values
LED Name Reason
0x81 CP1 R10000 COP1 register test failed
0x82 RESTART Restart Master unable to load io6prom
0x83 ICACHE R10000 primary instruction cache test failed
0x84 DCACHE R10000 primary data cache test failed
0x85 SCACHE Secondary cache test failed
0x86 KILLED CPU disabled by another node
0x87 RTC Real-time counter not counting
0x88 ECC -
0x89 XTLBMISS -
0x8a UTLBMISS -
0x8b KTLBMISS -
0x8c GENERAL -
0x8d NOTIMPL -
0x8e CACHE -
0x8f OS -
0x90 HUBINTS -
0x91 HUBLOCAL -
0x92 HUBCONFIG -
0x93 PREM_DIR_REQ All nodes must have premium DIMMs for this configuration
0x94 UNUSED1 -
0x95 HUBUART -
0x96 HUBCCS -
0x97 MAINRET main() returned
0x98 NOMEM Node card has no local memory
0x99 I2CFATAL -
0x9a DISABLED CPU is disabled by environment variable
0x9b DOWNLOAD Error downloading IO6prom into RAM
0x9c COREDEBUG Can't set CORE debug registers
0x9d IODISCOVER Hub I/O discovery failed
0x9e HUB_CONFIG Failed writing Hub info into KLCONFIG
0x9f ROUTER_CONFIG Failed writing Router info into KLCONFIG
0xa0 HUBIO_INIT Hub I/O initialization failed
0xa1 CONFIG_INIT Failed initializing KLCONFIG
0xa2 RTRCHIP Router chip failed diags
0xa3 LINKDEAD LLP link failed diags
0xa4 HUBBIST Hub chip failed built-in self test (BIST)
0xa5 RTRBIST Router chip failed built-in self test (BIST)
0xa6 RESETWAIT Waiting for reset to go
0xa7 LLP_FAIL LLP failed after reset
0xa8 LLP_NORESET LLP never came up after reset
0xa9 BADMEM No good local memory
0xaa NOT_USED_AA -
0xab NET_DISCOVER Hub Network discovery failed
0xac NASID_CALC NASID calculation failed
0xad ROUTE_CALC Route calculation failed
0xae ROUTE_DIST Route distribution failed
0xaf NASID_DIST NASID distribution failed
0xb0 NO_NASID Master did not assign a NASID
0xb1 NO_MODULEID Module ID arbitration failed
0xb2 MIXED_SN00 Origin200 mixed with Origin2000
0xb3 ERRPART Error partition config
0xb3 MODEBIT Error copying modebits

The following codes blink and indicate that an exception occurred so early in the PROM that no TTY device was available to print information about the exception:

Early Exception LED Values
LED Name Exception Type
0xf2 EXC_GENERAL (Blinking) General exception
0xf3 EXC_ECC (Blinking) ECC exception
0xf4 EXC_TLB (Blinking) TLB exception
0xf5 EXC_XTLB (Blinking) XTLB exception
0xf6 EXC_UNIMPL (Blinking) Unimplemented exception
0xf7 EXC_CACHE (Blinking) Cache Error exception
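
The three ranges can be told apart with a simple classifier. This Python sketch uses only the ranges given in this section (0x00 to 0x7f for progress, 0x80 to 0xff for failures, and blinking 0xf2 to 0xf7 for early exceptions). The function name and return strings are illustrative, and a blinking value outside 0xf2 to 0xf7 is simply treated as a failure here.

```python
def classify_led(value, blinking=False):
    """Classify a node card LED value read during boot."""
    if not 0 <= value <= 0xFF:
        raise ValueError("LED values are one byte")
    if blinking and 0xF2 <= value <= 0xF7:
        return "early exception"
    if value >= 0x80:
        return "failure"
    return "progress"
```

For example, a steady 0x85 is a failure (the secondary cache test), a steady 0x4b is a progress value, and a blinking 0xf3 is an early exception.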

POD Mode (PROM Command Interpreter)

POD mode is a command interpreter present in the PROM which is most often used for debugging of a crashed system. It can be used to examine the contents of CPU registers, support chip registers, and memory. It can also enable or disable certain node card features, such as CPUs and memory banks.

POD can be run in three modes: Dex, Cac, and Unc.

When running in Dex mode, POD requires very few system resources to run. It does not require memory -- instead, it accesses its program text directly from the PROM and uses the R10000 microprocessor's data cache as memory for its stack. The secondary cache is not used. NMI and uncaught exceptions typically result in a Dex mode POD prompt.

Dex mode is generally very slow because PROM instruction fetches are very slow. You would want to avoid performing long memory tests or flashing remote PROMs from Dex mode. Certain commands may not be executed in Dex mode, especially ones that attempt to program the PROM. See the go cac command for getting out of Dex mode.

Also, when loading or storing data or running memory tests, be sure not to access cached memory addresses while running in Dex mode. See the go dex command.

When running in Cac mode, POD places its program text, data, and stack in cached memory. This may only occur after memory has been probed and configured. POD runs very quickly out of cached memory. When the PROM takes an exception or NMI and enters POD mode, it goes into Dex mode. In the right circumstances it is possible to get back to Cac mode using the go cac command.

When running in Unc mode, POD places its program text, data, and stack in uncached memory. This is similar to Cac mode, except the cache is not used and it runs slower.

The PROM can be forced to enter POD mode at different stages of the boot process by setting Debug Switches.

The POD mode prompt is in the format:


         POD Dex MSC 001   3A>
         ___ ___ ___ ___ _ __
          1   2   3   4  5  6

where the fields are:

  1. Always POD, indicating POD mode.
  2. Dex, indicating POD mode is running memoryless out of the cache only, or
    Cac, indicating POD mode is running out of cached memory, or
    Unc, indicating POD mode is running out of uncached memory.
  3. MSC, indicating the POD prompt is being displayed on the MSC ACP port, or
    Junk, indicating the POD prompt is being displayed on the Junk UART, or
    IOC3, indicating the POD prompt is being displayed on an IOC3 UART, or
    Talk, indicating the POD prompt is being displayed on the Net UART.
  4. The NASID (node ID) of the current node.
  5. Occasionally, status fields may be inserted here indicating errors or alternate operating modes. MDerr indicates MD errors are pending (use the error command to view them and clear to clear them). AltRegs indicates the kernel register set is in effect rather than the PROM register set.
  6. The slot (1 to 4) and slice (A or B) of the CPU displaying the prompt.

Many POD mode commands, including memory tests, flashing PROMs, and loop or repeat statements, can be aborted by typing ^C.
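
When capturing console logs, it can be handy to split the prompt format described above into its fields mechanically. The following is a minimal sketch, not part of the PROM: the function name and the returned dictionary keys are hypothetical, and the NASID is kept as text because its number base is not stated here.

```python
def parse_pod_prompt(prompt: str) -> dict:
    """Split a POD prompt such as 'POD Dex MSC 001   3A>' into named fields."""
    tokens = prompt.strip().rstrip(">").split()
    if len(tokens) < 5 or tokens[0] != "POD":
        raise ValueError(f"not a POD prompt: {prompt!r}")
    cpu = tokens[-1]  # field 6: slot digit(s) followed by the slice letter
    return {
        "mode": tokens[1],        # field 2: Dex, Cac, or Unc
        "console": tokens[2],     # field 3: MSC, Junk, IOC3, or Talk
        "nasid": tokens[3],       # field 4: kept as text, base not assumed
        "status": tokens[4:-1],   # field 5: optional, e.g. MDerr or AltRegs
        "slot": int(cpu[:-1]),
        "slice": cpu[-1],
    }


fields = parse_pod_prompt("POD Dex MSC 001   3A>")
print(fields["mode"], fields["console"], fields["slot"], fields["slice"])
```

Any tokens found between the NASID and the CPU name are treated as status fields, which matches the description of field 5 but is otherwise an assumption.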

The following editing characters are provided at the POD prompt:

        ^C                              Abort
        ^H, Backspace, or DEL           Backspace
        ^R                              Redisplay line
        ^U                              Erase line

Three handy command line editing features are borrowed from the C shell:

  1. Wherever the symbol !! appears in a command line, it is substituted with the contents of the previous command line.
  2. Wherever the symbol !$ appears in a command line, it is substituted with the last word from the previous command line.
  3. The command line ^xxx^yyy repeats the previous command line, except substitutes the first occurrence of the string xxx with the string yyy.

In addition, environment variables (see setenv command) are substituted wherever the name of an environment variable appears in backquotes (``). This is useful as a form of aliases. For example,

        POD Dex Hub 000 1A> setenv range
        Set "range" to? n:1 u: 0x3ff0000 2m
        POD Dex Hub 000 1A> dirtest `range`
        POD Dex Hub 000 1A> dirinit `range`
        POD Dex Hub 000 1A> memtest `range`
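
The substitution rules above can be illustrated with a short Python sketch. This is only a model of the behaviour as described, not the PROM's implementation: the function name is hypothetical, and the order in which history and environment substitutions are applied is an assumption.

```python
import re


def expand_line(line: str, previous: str, env: dict) -> str:
    """Expand ^old^new, !!, !$ and `var` against the previous command line."""
    if line.startswith("^"):
        parts = line.split("^")  # "^xxx^yyy" -> ["", "xxx", "yyy"]
        if len(parts) >= 3:
            # Repeat the previous line, replacing only the first occurrence.
            return previous.replace(parts[1], parts[2], 1)
    words = previous.split()
    last_word = words[-1] if words else ""
    line = line.replace("!!", previous).replace("!$", last_word)
    # Replace `name` with the variable's value; unknown names are left alone.
    return re.sub(r"`([^`]*)`", lambda m: env.get(m.group(1), m.group(0)), line)


env = {"range": "n:1 u: 0x3ff0000 2m"}
print(expand_line("dirtest `range`", "", env))
print(expand_line("^dirtest^memtest", "dirtest u:0 1m", env))
```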

POD Consoles

POD mode will present a prompt on several possible I/O devices. It probes for devices in the order listed below and uses the first one found. When the CPU is at the POD prompt, it will blink an LED pattern at 0.5 Hz indicating on which UART input is expected.

    junk        If a Junk UART is plugged in, it is used preferentially.
                Junk UARTs are available only in manufacturing and bring-up.
                LED pattern: 0x80/0xb3

    ioc3        If a BaseIO (IO6) is found, and one of the ports on it is
                marked as the system console, and it can be initialized and
                accessed without causing an exception, the ioc3 uart is used.
                LED pattern: 0x80/0x8c

    msc         If the Module System Controller is responsive, the MSC
                is used for I/O.  (The MSC was at one time called the
                Entry Level System Controller, or ELSC).
                LED pattern: 0x80/0xbc

    netuart     If no UART is available, the netuart is used (see below).
                This is not a real UART, but an emulation that allows one
                Hub to talk to another.
                LED pattern: 0x80/0xbf
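
The probe order and LED patterns listed above amount to a first-match search. A minimal Python sketch of that selection follows; the names `CONSOLE_PROBE_ORDER` and `select_console` are hypothetical, and the set of responsive devices stands in for the hardware probes that the PROM actually performs.

```python
# Devices in probe order, each with the two LED values it alternates between.
CONSOLE_PROBE_ORDER = [
    ("junk", (0x80, 0xB3)),
    ("ioc3", (0x80, 0x8C)),
    ("msc", (0x80, 0xBC)),
    ("netuart", (0x80, 0xBF)),
]


def select_console(responsive: set) -> tuple:
    """Return (device, LED pattern) for the first usable console device."""
    for name, led_pattern in CONSOLE_PROBE_ORDER:
        # The netuart is an emulation and is used when no real UART is found.
        if name in responsive or name == "netuart":
            return name, led_pattern
    raise AssertionError("unreachable: netuart is always the fallback")


name, leds = select_console({"msc", "ioc3"})
print(name, [hex(v) for v in leds])
```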

Netuart

When a CPU is using the netuart, it monitors some shared memory locations and interrupt lines used for communication. From any node that has a UART and is at the POD prompt, you can use the talk command to communicate with CPUs that are using the netuart. Use of the netuart is not recommended.

Constants

PROM constants are similar to C constants. Hexadecimal numbers are prefixed with 0x; octal numbers are prefixed with 0; binary numbers are prefixed with 0b; all other numbers are in decimal.

Three special suffixes are provided to facilitate entering large constants:

        g       Gigabytes               e.g.: 1g == (1 << 30)
        m       Megabytes               e.g.: 32m == (32 << 20)
        k       Kilobytes               e.g.: 100k == (100 << 10)
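
As an illustration of these rules, the following Python sketch converts a POD-style constant into an integer. It is not the PROM's parser: the function name is hypothetical, and accepting upper-case input and allowing a suffix on any number base are assumptions.

```python
def parse_pod_constant(text: str) -> int:
    """Parse a constant such as 0x1f, 017, 0b101, 32m or 1g."""
    suffix_shift = {"k": 10, "m": 20, "g": 30}
    text = text.strip().lower()
    shift = 0
    # k, m and g are not hexadecimal digits, so a trailing one is a suffix.
    if text and text[-1] in suffix_shift:
        shift = suffix_shift[text[-1]]
        text = text[:-1]
    if text.startswith("0x"):
        value = int(text[2:], 16)
    elif text.startswith("0b"):
        value = int(text[2:], 2)
    elif len(text) > 1 and text.startswith("0"):
        value = int(text[1:], 8)  # a leading 0 means octal, as in C
    else:
        value = int(text, 10)
    return value << shift


print(hex(parse_pod_constant("32m")))
```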

Expressions

Arithmetic expressions are permitted and are similar to C expressions in operators and order of evaluation. Many commands require expressions to be parenthesized to avoid ambiguity. For example:

        px (0xdeadbeef >> 6) & 0xff
        sd md_memory_config (ld md_memory_config | 077777777)

The supported operators are:

        Unary: () ~ ! - + LD LW LH LB, and * (same as LD)
        Binary: + - * / % | & ^ << >> == != < > <= >= && ||

Commands that require strings may sometimes require quotes on the strings. Within quotes, standard C escape sequences (\b, \t, \n, \r, \\, and \ddd) are permitted. There is a 31-character limit on string length.

Statements

More than one command may be performed on a command line by using a command list in which commands are separated by semicolons; e.g.:

        sd u:0 1; ld u:0

Commands that take other commands as arguments, such as loop, may take a single command or a compound command consisting of a command list in curly braces; e.g.:

        loop ld pi_rt_count
        loop { ld pi_rt_count; sd pi_rt_count 0 }

A pound sign (#) begins a comment that lasts to the end of the input line. This may be useful when cutting and pasting into a terminal window from command files.

Address Modifiers

For convenience in entering constants which represent R10000 and Origin2000 addresses, a number of modifiers are available. The modifiers should precede the constants. Multiple modifiers may be used.

Address Modifiers
Modifier  Action                                        Example         Equivalent To
h:        Set top byte to 0x90 (HSPEC).                 h:256m          0x9000000010000000 (LBOOT)
i:        Set top byte to 0x92 (IO).                    i:(16m+32)      0x9200000001000020
m:        Set top byte to 0x94 (MSPEC).                 m:0             0x9400000000000000
u:        Set top byte to 0x96 (UNCAC).                 u:0             0x9600000000000000
c:        Set top byte to 0xa8 (CAC).                   c:0             0xa800000000000000
p:        Set top byte to 0x00 (PHYS).                  p:c:0x100       0x100
s:        Sign extend word.                             s:0xbfc00000    0xffffffffbfc00000
n:#       Add NASID, where # is NASID. May only be      n:2 c:1         0xa800000200000001
          used if CrayLink routing is configured.
w:#       Set top byte to 0x92. Sets bit 24. Adds       w:1 0x200       0x9200000001000200
          Widget, where # is widget. If remote,         n:1 w:1 0x200   0x9200000101800200
          adds bit 23.
b:#       Adds offset for memory bank (# times          n:3 b:5 u:2k    0x96000003a0000800
          512 MB).
L:        Converts address to corresponding             L:0x2100560     0x90000000c08402a0
          directory entry LO in HSPEC.
H:        Converts address to corresponding             H:u:0x2100560   0x90000000c08402a8
          directory entry HI in HSPEC.
P:#       Converts address to corresponding             P:5 0x2100560   0x90000000c0840028
          protection entry, where # is the region.
E:        Converts address to corresponding ECC         E:0x2100560     0x9000000080840158
          word address.
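
The arithmetic behind the simpler modifiers can be reproduced in a few lines of Python. This is a sketch under stated assumptions rather than the PROM's code: the function names are hypothetical, the NASID position (bit 32) is inferred from the n:2 c:1 example in the table, and the right-to-left order of application is inferred from the p:c:0x100 example. The w:, L:, H:, P:, and E: modifiers are not modelled.

```python
MASK64 = (1 << 64) - 1
TOP_BYTE = {"h": 0x90, "i": 0x92, "m": 0x94, "u": 0x96, "c": 0xA8, "p": 0x00}
BANK_STRIDE = 512 << 20  # 512 MB between bank base addresses
NASID_SHIFT = 32         # assumption inferred from the table's n:2 c:1 row


def set_top_byte(addr: int, top: int) -> int:
    """Replace bits 63:56 of addr with the given byte."""
    return (addr & 0x00FFFFFFFFFFFFFF) | (top << 56)


def apply_modifiers(modifiers: list, value: int) -> int:
    """Apply (name, arg) modifiers right to left, e.g. [("n", 2), ("c", None)]."""
    addr = value & MASK64
    for name, arg in reversed(modifiers):
        if name in TOP_BYTE:
            addr = set_top_byte(addr, TOP_BYTE[name])
        elif name == "s":  # sign-extend the low 32-bit word
            addr &= 0xFFFFFFFF
            if addr & 0x80000000:
                addr |= 0xFFFFFFFF00000000
        elif name == "n":
            addr = (addr + (arg << NASID_SHIFT)) & MASK64
        elif name == "b":
            addr = (addr + arg * BANK_STRIDE) & MASK64
        else:
            raise ValueError(f"modifier {name!r} is not modelled in this sketch")
    return addr


# n:3 b:5 u:2k from the table
print(hex(apply_modifiers([("n", 3), ("b", 5), ("u", None)], 2 << 10)))
```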

Hardware Registers

The PROM knows the names and bit field assignments for approximately 500 hardware registers in the CPU, Hub, Router, and Crossbow, so you do not need to manually look up and type their addresses (or values in the case of CPU registers). Names of registers may appear in arithmetic expressions and are replaced with the registers' addresses.

The names of R10000 Cop0 registers are recognized, and include Cause, Status, etc. The names of the R10000 general purpose registers are recognized (v0, a1, ta0, gp, r5, etc). General purpose registers may also be referred to as $0, $1, $2, etc. Floating point registers may be referenced as $f0, $f1, $f2, etc.

The names of Hub registers are prefixed according to the five sections of the Hub with PI_, MD_, II_, NI_, and CORE_.

The names of Hub NI and Router registers can be used in place of the vector address in some vector commands. Router registers are prefixed with RR_. The names of the Crossbow (XBOW) registers are prefixed with XB_.

Register names are case-insensitive and may be truncated to the extent that they are unambiguous. For example:

        POD Dex IOC3 000 4A> px (md_refresh_control + 8)
        0x9200000001200028
        POD Dex IOC3 000 4A> pr n:1 md_ref
        Register: MD_REFRESH_CONTROL (0x9200000101a00020)
        Value   : 0x0000000000000000 (loaded from remote register)
           <63> RW  ENABLE                    0x1
        <23:12> RW  COUNTER                   0x2e0
        <11:00> RW  CNT_THRESH                0x504

Hub PI (Processor Interface) Registers

PI_BIST_COUNT_TARGET PI_BIST_ENTER_RUN PI_BIST_RDY PI_BIST_READ_DATA PI_BIST_SHIFT_LOAD PI_BIST_SHIFT_UNLOAD PI_BIST_WRITE_DATA PI_CALIAS_SIZE PI_CC_MASK PI_CC_PEND_CLR_A PI_CC_PEND_CLR_B PI_CC_PEND_SET_A PI_CC_PEND_SET_B PI_CPU_ENABLE_A PI_CPU_ENABLE_B PI_CPU_NUM PI_CPU_PRESENT_A PI_CPU_PRESENT_B PI_CPU_PROTECT PI_CRB_SFACTOR PI_CRB_TIMEOUT_A PI_CRB_TIMEOUT_B PI_ERR_INT_MASK_A PI_ERR_INT_MASK_B PI_ERR_INT_PEND PI_ERR_STACK_ADDR_A PI_ERR_STACK_ADDR_B PI_ERR_STACK_FORMAT PI_ERR_STACK_SIZE PI_ERR_STATUS0_A PI_ERR_STATUS0_A_CLEAR PI_ERR_STATUS0_B PI_ERR_STATUS0_B_CLEAR PI_ERR_STATUS1_A PI_ERR_STATUS1_A_CLEAR PI_ERR_STATUS1_B PI_ERR_STATUS1_B_CLEAR PI_FORCE_BAD_CHECK_BIT_A PI_FORCE_BAD_CHECK_BIT_B PI_GFX_BIAS_A PI_GFX_BIAS_B PI_GFX_CREDIT_CNTR_A PI_GFX_CREDIT_CNTR_B PI_GFX_INT_CMP_A PI_GFX_INT_CMP_B PI_GFX_INT_CNTR_A PI_GFX_INT_CNTR_B PI_GFX_PAGE_A PI_GFX_PAGE_B PI_INT_MASK0_A PI_INT_MASK0_B PI_INT_MASK1_A PI_INT_MASK1_B PI_INT_PEND0 PI_INT_PEND1 PI_INT_PEND_MOD PI_IO_PROTECT PI_MAX_CRB_TIMEOUT PI_NACK_CMP PI_NACK_CNT_A PI_NACK_CNT_B PI_NMI_A PI_NMI_B PI_PROFILE_COMPARE PI_PROF_INT_EN_A PI_PROF_INT_EN_B PI_PROF_INT_PEND_A PI_PROF_INT_PEND_B PI_PROT_OVRRD PI_REGION_PRESENT PI_REPLY_LEVEL PI_RTC_INT_EN_A PI_RTC_INT_EN_B PI_RT_COMPARE_A PI_RT_COMPARE_B PI_RT_COUNTER PI_RT_FILTER_CTRL PI_RT_INT_PEND_A PI_RT_INT_PEND_B PI_RT_LOCAL_CTRL PI_SOFTRESET PI_SPOOL_CMP_A PI_SPOOL_CMP_B PI_SYSAD_ERRCHK_EN PI_UNUSED

Hub MD (Memory/Directory) Registers

MD_DIR_DIMM_INIT MD_DIR_ERROR MD_DIR_ERROR_CLR MD_FANDOP_CAC_STAT MD_HSPEC_PROTECT MD_IO_PROTECT MD_IO_PROT_OVRRD MD_LED0 MD_LED1 MD_MEMORY_CONFIG MD_MEM_DIMM_INIT MD_MEM_ERROR MD_MEM_ERROR_CLR MD_MIG_CANDIDATE MD_MIG_CANDIDATE_CLR MD_MIG_DIFF_THRESH MD_MIG_VALUE_THRESH MD_MISC_ERROR MD_MISC_ERROR_CLR MD_MLAN_CTL MD_MOQ_SIZE MD_PERF_CNT0 MD_PERF_CNT1 MD_PERF_CNT2 MD_PERF_CNT3 MD_PERF_CNT4 MD_PERF_CNT5 MD_PERF_SEL MD_PROTOCOL_ERROR MD_PROTOCOL_ERROR_CLR MD_REFRESH_CONTROL MD_SLOT_ID_USTATUS MD_UREG0_0 MD_UREG0_1 MD_UREG0_2 MD_UREG0_3 MD_UREG0_4 MD_UREG0_5 MD_UREG0_6 MD_UREG0_7 MD_UREG1_0 MD_UREG1_1 MD_UREG1_10 MD_UREG1_11 MD_UREG1_12 MD_UREG1_13 MD_UREG1_14 MD_UREG1_15 MD_UREG1_2 MD_UREG1_3 MD_UREG1_4 MD_UREG1_5 MD_UREG1_6 MD_UREG1_7 MD_UREG1_8 MD_UREG1_9

Hub II (I/O) Registers

II_IBCT0 II_IBCT1 II_IBDA0 II_IBDA1 II_IBIA0 II_IBIA1 II_IBLS0 II_IBLS1 II_IBNA0 II_IBNA1 II_IBSA0 II_IBSA1 II_ICCR II_ICDR II_ICMR II_ICRB0_A II_ICRB0_B II_ICRB0_C II_ICRB0_D II_ICRB1_A II_ICRB1_B II_ICRB1_C II_ICRB1_D II_ICRB2_A II_ICRB2_B II_ICRB2_C II_ICRB2_D II_ICRB3_A II_ICRB3_B II_ICRB3_C II_ICRB3_D II_ICRB4_A II_ICRB4_B II_ICRB4_C II_ICRB4_D II_ICRB5_A II_ICRB5_B II_ICRB5_C II_ICRB5_D II_ICRB6_A II_ICRB6_B II_ICRB6_C II_ICRB6_D II_ICRB7_A II_ICRB7_B II_ICRB7_C II_ICRB7_D II_ICRB8_A II_ICRB8_B II_ICRB8_C II_ICRB8_D II_ICRB9_A II_ICRB9_B II_ICRB9_C II_ICRB9_D II_ICRBA_A II_ICRBA_B II_ICRBA_C II_ICRBA_D II_ICRBB_A II_ICRBB_B II_ICRBB_C II_ICRBB_D II_ICRBC_A II_ICRBC_B II_ICRBC_C II_ICRBC_D II_ICRBD_A II_ICRBD_B II_ICRBD_C II_ICRBD_D II_ICRBE_A II_ICRBE_B II_ICRBE_C II_ICRBE_D II_ICTO II_ICTP II_IECLR II_IFDR II_IGFX0 II_IGFX1 II_IIAP II_IIDEM II_IIDSR II_IIWA II_ILAPO II_ILAPR II_ILCSR II_ILLR II_IMEM II_IOWA II_IPCA II_IPCR II_IPDR II_IPPR II_IPRB0 II_IPRB8 II_IPRB9 II_IPRBA II_IPRBB II_IPRBC II_IPRBD II_IPRBE II_IPRBF II_IPRTE II_IPRTE0 II_IPRTE1 II_IPRTE2 II_IPRTE3 II_IPRTE4 II_IPRTE5 II_IPRTE6 II_IPRTE7 II_IPTP0 II_IPTP1 II_ITTE1 II_ITTE2 II_ITTE3 II_ITTE4 II_ITTE5 II_ITTE6 II_ITTE7 II_IXCC II_IXTT II_NOT_DONE II_WCR II_WID II_WSTAT

Hub NI (Network Interface) Registers

NI_AGE_CPU0_MEMORY NI_AGE_CPU0_PIO NI_AGE_CPU1_MEMORY NI_AGE_CPU1_PIO NI_AGE_GBR_MEMORY NI_AGE_GBR_PIO NI_AGE_IO_MEMORY NI_AGE_IO_PIO NI_DIAG_PARMS NI_ERROR_CLEAR NI_GLOBAL_PARMS NI_IO_PROTECT NI_LOCAL_TABLE<00> NI_META_TABLE<00> NI_PORT_ERROR NI_PORT_PARMS NI_PORT_RESET NI_PROTECTION_CONFIG NI_PROT_OVRRD NI_RETURN_VECTOR NI_SCRATCH_REG0 NI_SCRATCH_REG1 NI_STATUS_REV_ID NI_VECTOR NI_VECTOR_CLEAR NI_VECTOR_DATA NI_VECTOR_PARMS NI_VECTOR_READ_DATA NI_VECTOR_STATUS

Hub CORE (Internal crossbar control) Registers

CORE_CACR CORE_CDBGSEL CORE_CFDR CORE_CMDAB CORE_CNDAB CORE_CRPQD CORE_CRQQD

Crossbow Registers (midplane)

XB_ARB_RELOAD XB_CTRL XB_ERR_CMDWORD XB_ERR_LOWER XB_ERR_UPPER XB_ID XB_INT_LOWER XB_INT_UPPER XB_LINK_ARB_LOWER_8 XB_LINK_ARB_LOWER_9 XB_LINK_ARB_LOWER_A XB_LINK_ARB_LOWER_B XB_LINK_ARB_LOWER_C XB_LINK_ARB_LOWER_D XB_LINK_ARB_LOWER_E XB_LINK_ARB_LOWER_F XB_LINK_ARB_UPPER_8 XB_LINK_ARB_UPPER_9 XB_LINK_ARB_UPPER_A XB_LINK_ARB_UPPER_B XB_LINK_ARB_UPPER_C XB_LINK_ARB_UPPER_D XB_LINK_ARB_UPPER_E XB_LINK_ARB_UPPER_F XB_LINK_AUX_STAT_8 XB_LINK_AUX_STAT_9 XB_LINK_AUX_STAT_A XB_LINK_AUX_STAT_B XB_LINK_AUX_STAT_C XB_LINK_AUX_STAT_D XB_LINK_AUX_STAT_E XB_LINK_AUX_STAT_F XB_LINK_CTRL_8 XB_LINK_CTRL_9 XB_LINK_CTRL_A XB_LINK_CTRL_B XB_LINK_CTRL_C XB_LINK_CTRL_D XB_LINK_CTRL_E XB_LINK_CTRL_F XB_LINK_IBUF_FLUSH_8 XB_LINK_IBUF_FLUSH_9 XB_LINK_IBUF_FLUSH_A XB_LINK_IBUF_FLUSH_B XB_LINK_IBUF_FLUSH_C XB_LINK_IBUF_FLUSH_D XB_LINK_IBUF_FLUSH_E XB_LINK_IBUF_FLUSH_F XB_LINK_RESET_8 XB_LINK_RESET_9 XB_LINK_RESET_A XB_LINK_RESET_B XB_LINK_RESET_C XB_LINK_RESET_D XB_LINK_RESET_E XB_LINK_RESET_F XB_LINK_STAT_8 XB_LINK_STAT_9 XB_LINK_STAT_A XB_LINK_STAT_B XB_LINK_STAT_C XB_LINK_STAT_CLR_8 XB_LINK_STAT_CLR_9 XB_LINK_STAT_CLR_A XB_LINK_STAT_CLR_B XB_LINK_STAT_CLR_C XB_LINK_STAT_CLR_D XB_LINK_STAT_CLR_E XB_LINK_STAT_CLR_F XB_LINK_STAT_D XB_LINK_STAT_E XB_LINK_STAT_F XB_LLP_CTRL XB_NIC XB_PERF_CTR_A XB_PERF_CTR_B XB_PKT_TO XB_STAT XB_STAT_CLR

Router Registers

RR_BIST_COUNT_TARGET RR_BIST_DATA RR_BIST_ENTER_RUN RR_BIST_READY RR_BIST_SHIFT_LOAD RR_BIST_SHIFT_UNLOAD RR_DIAG_PARMS RR_ERROR_CLEAR1 RR_ERROR_CLEAR2 RR_ERROR_CLEAR3 RR_ERROR_CLEAR4 RR_ERROR_CLEAR5 RR_ERROR_CLEAR6 RR_GLOBAL_PARMS RR_HISTOGRAM1 RR_HISTOGRAM2 RR_HISTOGRAM3 RR_HISTOGRAM4 RR_HISTOGRAM5 RR_HISTOGRAM6 RR_LOCAL_TABLE<00> RR_META_TABLE<00> RR_NIC_ULAN RR_PORT_PARMS1 RR_PORT_PARMS2 RR_PORT_PARMS3 RR_PORT_PARMS4 RR_PORT_PARMS5 RR_PORT_PARMS6 RR_PORT_RESET RR_PROTECTION_CONFIG RR_RESET_MASK1 RR_RESET_MASK2 RR_RESET_MASK3 RR_RESET_MASK4 RR_RESET_MASK5 RR_RESET_MASK6 RR_SCRATCH_REG0 RR_SCRATCH_REG1 RR_STATUS_ERROR1 RR_STATUS_ERROR2 RR_STATUS_ERROR3 RR_STATUS_ERROR4 RR_STATUS_ERROR5 RR_STATUS_ERROR6 RR_STATUS_REV_ID

R10000 Coprocessor 0 Registers

C0_23 C0_24 C0_7 C0_BADVADDR C0_BRDIAG C0_CACHEERR C0_CAUSE C0_COMPARE C0_CONFIG C0_CONTEXT C0_COUNT C0_ECC C0_ENTRYHI C0_ENTRYLO0 C0_ENTRYLO1 C0_EPC C0_ERROREPC C0_FRAMEMASK C0_INDEX C0_LLADDR C0_PAGEMASK C0_PERFCOUNT C0_PRID C0_RANDOM C0_STATUS C0_TAGHI C0_TAGLO C0_WATCHHI C0_WATCHLO C0_WIRED C0_XCONTEXT

Symbols

The PROM knows the names of symbols within the PROM (and within the Kernel if one is loaded; see the kern_sym command). Symbols may appear in arithmetic expressions and are replaced with their addresses. For example,

        POD Dex IOC3 000 4A> px main
        0xc00000001fc07784
        POD Dex IOC3 000 4A> dis (main+0x1c) 4
        [main+0x001c, 0x1fc077a0]:      0001083c        dsll32  at,at,#0
        [main+0x0020, 0x1fc077a4]:      64420000        daddiu  v0,v0,0x0
        [main+0x0024, 0x1fc077a8]:      0039082c        dadd    at,at,t9
        [main+0x0028, 0x1fc077ac]:      0002103c        dsll32  v0,v0,#0
        POD Dex IOC3 000 4A> px "memcpy"
        0xc00000001fc1db14

Note that certain symbols, such as memcpy, must be quoted because they are also the names of POD commands and are therefore reserved words in the PROM.


POD Mode Commands

Conventions

Data types shown in all caps are not to be input literally, but indicate the type of input expected. Some of the types are:

Data Types
Type Represents
EXPR Arithmetic expression
GPRNAME GPR register name (symbolic or $num)
REGNO Register number, from 0 to 31
BITNO Bit number, from 0 to 63
ADDR Address, or arithmetic expression enclosed in parentheses
START Start address
LEN Length in bytes
VAL Number, or arithmetic expression enclosed in parentheses
COUNT Count or length
CMD Command
NODE Node ID (also called NASID: NUMA Address Space Identifier)
SLICE CPU, either A or B.
VEC CrayLink Vector path (e.g. 0x15)
VADDR CrayLink Vector address (offset or NI or RR register name)

Parts of a command enclosed in square brackets ([]) are optional.

Multiple possibilities are separated by vertical bars (|).

Command Summary

The help or ? command, used by itself, prints a command summary similar to that shown in the table below. The help or ? command may also take one command name argument to print the summary for a specific command.

Command Summary
Command      Purpose                                    Syntax
px           Print hex                                  px EXPR
pd           Print decimal                              pd EXPR
po           Print octal                                po EXPR
pb           Print binary                               pb EXPR
pr           Print register                             pr [GPRNAME [VAL]]
pf           Print fpreg                                pf [REGNO]
pa           Print address                              pa ADDR [BITNO]
lb           Load byte                                  lb ADDR [COUNT]
lh           Load half-word                             lh ADDR [COUNT]
lw           Load word                                  lw ADDR [COUNT]
ld           Load double-word                           ld ADDR [COUNT]
sb           Store byte                                 sb ADDR [VAL [COUNT]]
sh           Store half-word                            sh ADDR [VAL [COUNT]]
sw           Store word                                 sw ADDR [VAL [COUNT]]
sd           Store double-word                          sd ADDR [VAL [COUNT]]
sdv          sd with verify                             sdv ADDR COUNT
sr           Store register                             sr REG VAL
sf           Store fpreg                                sf REGNO VAL
vr           Vector read                                vr VEC VADDR
vw           Vector write                               vw VEC VADDR VAL
vx           Vector exchange                            vx VEC VADDR VAL
repeat       Repeat count                               repeat COUNT CMD
loop         Repeat forever                             loop CMD
while        While loop                                 while (EXPR) CMD
for          For loop                                   for (CMD;EXPR;CMD) CMD
if           If statement                               if (EXPR) CMD
delay        Delay                                      delay MICROSEC
sleep        Sleep                                      sleep SEC
time         Benchmark timing                           time CMD
echo         Echo a string                              echo "STRING"
jump         Inv. cache & jump                          jump ADDR [A0 [A1]]
call         Call subroutine                            call ADDR [A0 [A1]]
reset        Reset the system                           reset
softreset    Softreset a node                           softreset n:NODE
nmi          Send NMI to node                           nmi n:NODE [a|b]
help         Display help                               help [CMDNAME]
inval        Inv. cache(s)                              inval [i][d][s]
flush        Flush+inv caches                           flush
tlbc         Clear TLB                                  tlbc [INDEX]
tlbr         Read TLB                                   tlbr [INDEX]
dtag         Dump dcache tag                            dtag line
itag         Dump icache tag                            itag line
stag         Dump scache tag                            stag line
dline        Dump dcache line                           dline line
iline        Dump icache line                           iline line
sline        Dump scache line                           sline line
adtag        Dump dcache tag                            adtag line
aitag        Dump icache tag                            aitag line
astag        Dump scache tag                            astag line
adline       Dump dcache line                           adline line
ailine       Dump icache line                           ailine line
asline       Dump scache line                           asline line
go           Set memory mode                            go dex|unc|cac
bist         Self-test Hub                              bist le|ae|lr|ar [n:NODE]
rbist        Self-test Router                           rbist le|ae|lr|ar VEC
disable      Disable unit                               disable n:NODE [SLICE]
enable       Enable unit                                enable n:NODE [SLICE]
tdisable     Temp. disable                              tdisable n:NODE [SLICE]
dirinit      Dir/prot init                              dirinit START LEN
meminit      Memory clear                               meminit START LEN
dirtest      Dir. test/init                             dirtest START LEN
memtest      Memory test/init                           memtest START LEN
santest      Mem. sanity test                           santest ADDR
slave        Goto slave mode                            slave
segs         List segments                              segs [FLAG]
exec         Load/exec segment                          exec [SEGNAME [FLAG]]
why          Why are we here?                           why
fru          Run FRU Analyzer                           fru [1 | 2]
clear        Clear errors                               clear
error        Display errors                             error
ioc3         Use IOC3 UART                              ioc3
junk         Use JunkBus UART                           junk
elsc         Use MSC (SysCtlr UART)                     elsc
talk         Use Net UART                               talk [n:NODE a|b]
nm           Look up PROM addr                          nm ADDR
disc         Discover CrayLink Interconnect             disc
pcfg         Dump pcfg struct                           pcfg [n:NODE] [v]
node         Get/set node ID                            node [[VEC] ID]
crb          Dump II CRBs                               crb [n:NODE]
crbx         137-col wide crb                           crbx [n:NODE]
route        Set up route                               route [VEC NODE]
flash        Prgm remote PROM                           flash NODE [...]
nic          Read Hub NIC                               nic [n:NODE]
rnic         Read Router NIC                            rnic [VEC]
sc           System ctlr cmd                            sc ["STRING"]
scw          Wr sysctlr nvram                           scw ADDR [VAL [COUNT]]
scr          Rd sysctlr nvram                           scr ADDR [COUNT]
dips         Rd hardware debug switches                 dips
dbg          Set/Rd debug switches                      dbg [VIRT_VAL PHYS_VAL]
pas          Set/Rd MSC password                        pas ["PASW"]
module       Set/Rd module number                       module [VAL]
modnic       Get module NIC                             modnic
qual         Quality mode                               qual [1|0]
ecc          ECC mode                                   ecc [1|0]
cpu          Switch to cpu                              cpu [[n:NODE] a|b]
dis          Disassemble                                dis ADDR [COUNT]
im           Set R10k int mask                          im [BYTE]
dumpspool    Dump PI err spool                          dumpspool [n:NODE SLICE]
error_dump   Dump Hub error info                        error_dump
edump_bri    Dump Bridge error info                     edump_bri [n:NODE]
maxerr       Test error limit                           maxerr COUNT
rtab         Dump route table                           rtab [VEC]
rstat        Dmp/clr rtr stat                           rstat VEC
version      Show PROM version                          version
memset       Fill mem w/ byte                           memset DST BYTE LEN
memcpy       Copy memory bytes                          memcpy DST SRC LEN
memcmp       Cmp memory bytes                           memcmp DST SRC LEN
memsum       Add memory bytes                           memsum SRC LEN
scandir      Scan dir states                            scandir ADDR [LEN]
dirstate     Directory state                            dirstate [STATE [BASE [LEN]]]
altregs      Use alt. regs                              altregs NUM
traplog      Trap log                                   traplog ADDR
kern_sym     Use kernel symtab                          kern_sym
kdebug       Enable kernel debug functions              kdebug [STACKADDR]
initlog      Init. PROM log                             initlog [n:NODE]
initalllogs  Init. all PROM logs                        initalllogs
setenv       Set variable                               setenv [n:NODE] VAR ["STRING"]
unsetenv     Remove variable                            unsetenv [n:NODE] VAR
printenv     Print variables                            printenv [n:NODE] [VAR]
log          Print PROM log entries                     log [n:NODE] [SKIP [COUNT]]
rlog         Print PROM log entries in reverse          rlog [n:NODE] [SKIP [COUNT]]
setpart      Set up partition                           setpart RTR_LIST
hubsde       Send Hub data error                        hubsde
rtrsde       Send Router data error                     rtrsde
chklink      Check local link                           chklink
dgxbow       Run Crossbow diagnostic                    dgxbow [m<n|h|m>] [n<NODE>]
dgbrdg       Run Bridge diagnostic                      dgbrdg [m<n|h|m>] [n<NODE>] [s<SLOT>]
dgconf       Run BaseIO configuration space diagnostic  dgconf [m<n|h|m>] [n<NODE>] [s<SLOT>]
dgpci        Run PCI bus diagnostic                     dgpci [m<n|h|m>] [n<NODE>] [s<SLOT>] [p<PCI#>]
dgspio       Run serial PIO diagnostic                  dgspio [m<n|h|m|x>] [n<NODE>] [s<SLOT>] [p<PCI#>]
dgsdma       Run serial DMA diagnostic                  dgsdma [m<n|h|m|x>] [n<NODE>] [s<SLOT>] [p<PCI#>]

Command Descriptions

Evaluate Expressions

Display Registers

Load From and Store To Memory

Set Registers

Vector Operation

Control

Non-Maskable Interrupts

Help

Cache

TLB

Operating Mode

Built-in Self Test (BIST)

Enable and Disable Hardware

Memory Test

Memory Region

Directory Region

Slave (Launch) Loop

IO6prom

Console Selection

CrayLink Interconnect

I/O Debug

PROM Flash

NIC (Number In a Can)

System Controller

Disassembler

Error Dump

PROM Log and Environment Variables

Partitioning

I/O Diagnostics

Debugging with the IP27prom

Reset

The hardware reset signal is generated by pressing the left button on the front panel of the Origin2000 Module System Controller, or by using the MSC rst command. All CPUs and other hardware in the module are reset. Reset may also be sent by means of software, such as when the reset command is used in POD mode.

The reset signal is propagated by hardware over the CrayLinks into other modules in the same partition. On system power-up, there are no partitions and the reset propagates throughout the entire CrayLink network. Subsequently, reset barriers are put in place to separate the system into separately resettable partitions.

Non-Maskable Interrupt (NMI)

NMIs are used for debugging a system that is hung, usually because of a software (PROM or Kernel) bug. An NMI to a CPU causes the CPU to stop what it's doing and jump to a fixed address in the PROM. The code at that address, called the NMI handler, attempts to bring the system to a PROM or Kernel monitor where the state of the system can be examined to determine the cause of the crash. The NMI handler is written carefully to avoid making assumptions about the state of the system and to avoid using system resources unnecessarily.

NMIs can be sent by means of software or by hardware. Software can send an NMI over the CrayLink Interconnect to any individual CPU. Pressing the right button on the front panel of the Origin2000 Module System Controller, or using the MSC nmi command, sends a hardware NMI to all of the CPUs plugged into the same module -- these CPUs will attempt to propagate the NMI to other modules in the same partition by means of a software NMI.

The NMI handler determines if the system was running in the Kernel or in the PROM at the time of the NMI by examining the NMI program counter. If the system was running in the PROM, the CPU immediately enters POD mode. Otherwise, the handler checks if a Kernel debugging facility (such as symmon) is available, and if so, transfers control to it. Sometimes the Kernel debugging facility may subsequently hang due to data corruption. For this reason, if another NMI is received (called a second NMI), the PROM will ignore the Kernel debugging facility and enter POD mode.
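
The decision the NMI handler makes can be summarized as a small function. This Python sketch only restates the paragraph above; the function and parameter names are hypothetical, and falling back to POD mode when no Kernel debugging facility is available is an assumption, since the text does not spell out that case.

```python
def nmi_destination(was_in_prom: bool,
                    kernel_debugger_available: bool,
                    is_second_nmi: bool) -> str:
    """Return where the NMI handler transfers control: 'pod' or 'kernel_debugger'."""
    if was_in_prom:
        # The NMI program counter pointed into the PROM.
        return "pod"
    if kernel_debugger_available and not is_second_nmi:
        # First NMI while in the Kernel: hand off to symmon or similar.
        return "kernel_debugger"
    # Second NMI, or (assumed) no debugger present: enter POD mode.
    return "pod"


print(nmi_destination(False, True, False))
print(nmi_destination(False, True, True))
```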

Occasionally, a hung Node may not respond to an NMI if the Hub or R10000 is completely hung. When a Node does respond to an NMI, a four-stage pattern flashes along the Node LEDs.

Kernel Debugging

Courtesy of Uday Hegde

The IP27prom supports dumping kernel data structures via idbg. This is especially useful if symmon does not respond to the first NMI and you use a second NMI to drop into the IP27prom.

On NMI, all the kernel registers are saved and the CPU enters PROM Dex mode. To switch to kernel debug mode, type kdebug. This enables kernel symbols in the PROM, switches the stack to memory so that cached references can be made, and switches the register set to the one that was saved on NMI. The stack is automatically placed somewhere near 64 MB on each node, but this can be overridden by passing an optional address to kdebug.

If any exceptions are taken while printing/inspecting kernel data structures, the PROM will re-enter Dex mode. However, the kernel register state is still intact so just typing kdebug will get you back in business.

You can use call idbg_func args to dump kernel data structures. Some of these calls may take TLB exceptions, since TLB misses are not processed normally. Any idbg function that prints with printf or another printing routine instead of qprintf will fail.

Here are some useful commands:

Here are some examples:

  1. kdebug (or kdebug 0xa800000001000000 to force sp to an address)
    1A 000: POD MSC Dex>kdebug
    1A 000: 
    1A 000: *** Turning on kernel symbol table lookup
    1A 000: Using Alternate register set at 0x9600000000011400
    1A 000: 
    1A 000: *** Entering POD mode with new stack pointer on node 0
    1A 000: POD MSC Dex AltRegs 0x9600000000011400> why
    1A 000: EPC : 0xc000000000534378 (wait_for_interrupt+0x28)
    1A 000: ERREPC : 0xc0000000004b44f4 (local_idle+0x8)
    1A 000: CACERR : 0x00000000dfffffff
    1A 000: Status : 0x0000000024400080
    1A 000: BadVA : 0x0000000010008848 (0x10008848)
    1A 000: RA : 0xc000000000534378 (wait_for_interrupt+0x28)
    1A 000: SP : 0xa800000000deff88
    1A 000: A0 : 0x0000000000000000
    1A 000: Cause : 0x0000000000009000 (INT:8--5----)
    1A 000: Reason : 253 (POD mode switch set or POD key pressed.)
    1A 000: POD mode was called from: 0x0000000000000000 (0x0)
    
  2. altregs (this example switches the register set to nasid 1, cpu 1's register state at NMI)
    1A 000: POD MSC Dex AltRegs 0x9600000000011400> altregs 3
    1A 000: Using Alternate register set at 0x9600000100011800
    1A 000: POD MSC Dex AltRegs 0x9600000100011800> why
    1A 000: EPC : 0xc0000000004b44fc (local_idle+0x10)
    1A 000: ERREPC : 0xc000000000534378 (wait_for_interrupt+0x28)
    1A 000: CACERR : 0x00000000dfffffff
    1A 000: Status : 0x0000000024400080
    1A 000: BadVA : 0x000000001013fff0 (0x1013fff0)
    1A 000: RA : 0xc000000000534378 (wait_for_interrupt+0x28)
    1A 000: SP : 0xa800000100e03f88
    1A 000: A0 : 0x0000000000000000
    1A 000: Cause : 0x0000000000009000 (INT:8--5----)
    
  3. Something that symmon cannot do
    1A 000: POD MSC Dex AltRegs 0x9600000000011400> a0=0x400; while(a0<0x800) {call idbg_pfn a0; a0=a0+1}
    1A 000: pfdat: addr 0xa800000001ff0000 [1024 0x400] flags 0x2800 <dump cstale > 
    1A 000: pageno 0x0 tag 0x0 use 1 rawcnt 1069
    1A 000: next 0x0 [-1] prev 0x0 [-1] hchain 0x0 [-1]
    1A 000: Bitmap base (pure) a800000001ffff00 (tainted) a800000001fffe00
    1A 000: pdep1: 0x0 pdep2: 0x0 
    1A 000: PFMS: 0x0
    1A 000: 
    1A 000: pfdat: addr 0xa800000001ff0040 [1025 0x401] flags 0x2800 <dump cstale > 
    1A 000: pageno 0x0 tag 0x0 use 1 rawcnt 1031
    1A 000: next 0x0 [-1] prev 0x0 [-1] hchain 0x0 [-1]
    1A 000: Bitmap base (pure) a800000001ffff00 (tainted) a800000001fffe00
    1A 000: pdep1: 0x0 pdep2: 0x0 
    1A 000: PFMS: 0x0
    1A 000: 
    1A 000: pfdat: addr 0xa800000001ff0080 [1026 0x402] flags 0x2800 <dump cstale > 
    1A 000: pageno 0x0 tag 0x0 use 1 rawcnt 1031
    1A 000: next 0x0 [-1] prev 0x0 [-1] hchain 0x0 [-1]
    1A 000: Bitmap base (pure) a800000001ffff00 (tainted) a800000001fffe00
    1A 000: pdep1: 0x0 pdep2: 0x0 
    1A 000: PFMS: 0x0
    1A 000: 
    1A 000: pfdat: addr 0xa800000001ff00c0 [1027 0x403] flags 0x2800 <dump cstale > 
    1A 000: pageno 0x0 tag 0x0 use 1 rawcnt 1031
    1A 000: next 0x0 [-1] prev 0x0 [-1] hchain 0x0 [-1]
    1A 000: Bitmap base (pure) a800000001ffff00 (tainted) a800000001fffe00
    1A 000: pdep1: 0x0 pdep2: 0x0 
    1A 000: PFMS: 0x0
    1A 000: 
    1A 000: pfdat: addr 0xa800000001ff0100 [1028 0x404] flags 0x2800 <dump cstale > 
    1A 000: pageno 0x0 tag 0x0 use 1 rawcnt 1031
    1A 000: next 0x0 [-1] prev 0x0 [-1] hchain 0x0 [-1]
    1A 000: Bitmap base (pure) a800000001ffff00 (tainted) a800000001fffe00
    1A 000: pdep1: 0x0 pdep2: 0x0 
    
  4. idbg_runq
    1A 000: POD MSC Dex AltRegs 0x9600000000011400> call idbg_runq 1
    1A 000: Runq(0xc000000001cfb7f8): active(0xf)
    1A 000: 
    1A 000: CPU 0: 
    1A 000: CPU 1: 
    1A 000: CPU 2: 
    1A 000: CPU 3: 
    1A 000: 
    1A 000: vmp 0 at 0xa800000000f34000
    1A 000: vmp 1 at 0xa800000000f34080
    1A 000: vmp 2 at 0xa800000000f34100
    1A 000: vmp 3 at 0xa800000000f34180
    1A 000: 
    
  5. pb
    1A 000: POD MSC Dex AltRegs 0x9600000000011400> call idbg_print putbuf 0xffffffffffffffff
    1A 000: Memfit Initialized...
    1A 000: <5>NOTICE: [memsched_init]: There are 2 <= 2^1 memory nodes.
    1A 000: b.0.210
    1A 000: <4>WARNING: Nasid 0 interrupt bit 31 set in INT_PEND1
    1A 000: <4>WARNING: Nasid 0 interrupt bit 63 set in INT_PEND1
    1A 000: <4>WARNING: Nasid 1 interrupt bit 31 set in INT_PEND1
    1A 000: <4>WARNING: Nasid 1 interrupt bit 63 set in INT_PEND1
    1A 000: <4>WARNING: CPU 0 (/hw/module/1/slot/n1/node/cpu/0) is downrev (925)
    1A 000: <4>WARNING: CPU 1 (/hw/module/1/slot/n1/node/cpu/1) is downrev (925)
    1A 000: <4>WARNING: CPU 2 (/hw/module/1/slot/n2/node/cpu/0) is downrev (925)
    1A 000: <4>WARNING: CPU 3 (/hw/module/1/slot/n2/node/cpu/1) is downrev 000: Realtime overflow q:
    
  6. nodenuma (and it ultimately fails since it uses printf)
    1A 000: POD MSC Dex AltRegs 0x9600000000011400> call idbg_nodepda_numa 1
    1A 000: Dumping node numa parameters at 0xa800000100c0c000
    1A 000: memoryd semaphore @ 0xa800000100c0c008 ((noname)) count -1 flags:
    1A 000: wq: 0xa800000001aa0000; no meter info
    1A 000: Migration Control Parameters, (mcd)@: 0xa800000100c108b0
    1A 000: Migration Per Address Space Default Kernel Parameters:
    1A 000: migration_base_enabled: 0x1
    1A 000: migr_base_threshold: 0x3c
    .......
    1A 000: migr_unpegger_npages: 27
    1A 000: migr_unpegger_lastnpages: 0
    1A 000: 
    1A 000: *** General Exception on node 0
    1A 000: *** EPC: 0xc000000000532664 (0xc000000000532664)
    1A 000: *** Press ENTER to continue.
    1A 000: POD MSC Dex> kdebug
    1A 000: 
    1A 000: POD MSC Dex AltRegs 0x9600000000011400
    

Use of the MSC

The MSC alternate console port is shared between CPUs. Upon taking an NMI or exception, each CPU prints a brief three-line message on the MSC indicating what happened, and waits for the user to press ENTER.

The user can then use the MSC sel command to select a particular CPU, press ENTER to get a POD prompt on that CPU, and use the why and pr commands to obtain detailed information on where the system was executing, why it stopped, and what state the CPU registers and memory were in.

Fast loops

Since PROM command lines are precompiled, then executed by a fast stack-based engine, a loop of register stores or loads (such as for testing I/O devices) can be executed at relatively high speed. For example:

        loop { sd ADDR1 value1; sd ADDR2 value2 }
        loop v0 = ld ADDR

Memory Tests

Bank Organization

Memory is organized in up to 8 banks of up to 512 MB each. The base address of each bank is separated by 512 MB regardless of the actual size of the bank. Thus, there is a hole in the memory address space wherever a bank smaller than 512 MB is followed by another bank. Care should be taken to avoid testing a region of memory that crosses a hole.

For example, if there are four banks of 64 MB in the system, there would be memory in the following address ranges (assuming cached space):

    c:0     through c:(64m-1)     or    c:b:0 0 through c:b:0 (64m-1)
    c:512m  through c:(576m-1)    or    c:b:1 0 through c:b:1 (64m-1)
    c:1024m through c:(1088m-1)   or    c:b:2 0 through c:b:2 (64m-1)
    c:1536m through c:(1600m-1)   or    c:b:3 0 through c:b:3 (64m-1)
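
The bank layout above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not PROM code; the function names bank_ranges and crosses_hole are hypothetical, and offsets are relative to the start of the chosen address space (c: or u:).

```python
MB = 1 << 20
BANK_STRIDE = 512 * MB   # bank bases are 512 MB apart regardless of bank size
MAX_BANKS = 8


def bank_ranges(bank_sizes_mb):
    """Return (first_offset, last_offset) for each populated bank."""
    if len(bank_sizes_mb) > MAX_BANKS:
        raise ValueError("at most 8 banks are supported")
    ranges = []
    for bank, size_mb in enumerate(bank_sizes_mb):
        if not 0 <= size_mb <= 512:
            raise ValueError("bank size must be between 0 and 512 MB")
        if size_mb == 0:
            continue  # empty bank contributes no memory
        base = bank * BANK_STRIDE
        ranges.append((base, base + size_mb * MB - 1))
    return ranges


def crosses_hole(start, length, ranges):
    """True if [start, start + length) is not wholly inside one bank."""
    end = start + length - 1
    return not any(lo <= start and end <= hi for lo, hi in ranges)


# Four banks of 64 MB, as in the example above.
ranges = bank_ranges([64, 64, 64, 64])
for lo, hi in ranges:
    print(f"{lo // MB}m through ({(hi + 1) // MB}m-1)")
```

A check such as crosses_hole(32 * MB, 64 * MB, ranges) can be used before choosing a test range; it reports True here because the range runs past the end of bank 0.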

Running the Tests

Memory tests clear the error registers before they start. During the test, if a data miscompare occurs, the contents of the error registers are decoded and displayed along with the test address and data.

If the qual on command is used, the data quality checking features of the R10000 will be enabled (PI_SYSAD_ERRCHK_EN). If data quality mode is on, then events such as ECC errors and SysAD parity errors will cause the R10000 to take exceptions during a memory test. On any exception during a memory test, the contents of the error registers are decoded and displayed along with the test address and data.

The ecc off command must be used before testing directory memory, and also before testing regular memory while the directory memory is still uninitialized, to prevent false ECC errors from occurring during the test.

Note that even when ECC checking is turned off, ECC correcting remains enabled. It is not possible to turn off error correcting in the Hub. For this reason, memory tests must be performed in the following order: test directory memory; initialize directories; test regular memory.

The standard memory test performs multiple phases:

  1. Store/load

    For each double-word address in the memory range, do a store of all 5's, and immediately load it back and compare, then do a store of all A's, and immediately load it back and compare. This test may find problems with RAM cells that take too long to change state.

  2. A5 fill/verify

    The memory range is filled with alternating double-word patterns of all 5's and all A's, then the range is read back and verified.

  3. 5A fill/verify

    The memory range is filled with alternating double-word patterns of all A's and all 5's, then the range is read back and verified.

  4. C3 fill/verify

    The memory range is filled with alternating double-word patterns of 0xc3c3c3c3c3c3c3c3 and 0x3c3c3c3c3c3c3c3c, then the range is read back and verified.

  5. 3C fill/verify

    The memory range is filled with alternating double-word patterns of 0x3c3c3c3c3c3c3c3c and 0xc3c3c3c3c3c3c3c3, then the range is read back and verified.

  6. Random fill/verify

    The memory range is filled with pseudo-random double-words (same pattern every time), then read back and verified. This is an address test that verifies that all memory addresses are unique.

  7. Address fill/verify

    The memory range is filled with double-words corresponding to the address being written, then read back and verified. This is an address test that verifies that all memory addresses are unique. Since this is the last test, memory is left filled with incrementing double-words.
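
The seven phases above can be modeled on a simulated memory (a Python list of 64-bit double-words). This is a minimal sketch of the patterns only, assuming nothing about the actual PROM implementation: the names run_phases, fill_verify, and StuckBitMemory are hypothetical, and the pseudo_random function is an arbitrary fixed mixing function standing in for the PROM's undocumented generator.

```python
MASK64 = (1 << 64) - 1
FIVES, AS = 0x5555555555555555, 0xAAAAAAAAAAAAAAAA
C3, X3C = 0xC3C3C3C3C3C3C3C3, 0x3C3C3C3C3C3C3C3C


def store_load(mem):
    """Phase 1: store each pattern and immediately load it back."""
    errors = []
    for i in range(len(mem)):
        for pattern in (FIVES, AS):
            mem[i] = pattern
            if mem[i] != pattern:
                errors.append((i, pattern, mem[i]))
    return errors


def fill_verify(mem, value_for):
    """Fill the whole range, then read it all back and compare."""
    for i in range(len(mem)):
        mem[i] = value_for(i) & MASK64
    return [(i, value_for(i) & MASK64, mem[i])
            for i in range(len(mem)) if mem[i] != value_for(i) & MASK64]


def pseudo_random(i):
    # Hypothetical stand-in: same value for the same index on every run.
    return (i + 1) * 0x9E3779B97F4A7C15


def run_phases(mem, base=0):
    """Run the seven phases in order; return {phase: list of miscompares}."""
    return {
        "store/load": store_load(mem),
        "A5": fill_verify(mem, lambda i: AS if i % 2 else FIVES),
        "5A": fill_verify(mem, lambda i: FIVES if i % 2 else AS),
        "C3": fill_verify(mem, lambda i: X3C if i % 2 else C3),
        "3C": fill_verify(mem, lambda i: C3 if i % 2 else X3C),
        "random": fill_verify(mem, pseudo_random),
        "address": fill_verify(mem, lambda i: base + 8 * i),
    }


class StuckBitMemory(list):
    """Simulated memory in which one bit of one double-word is stuck at 1."""

    def __init__(self, n, index, bit):
        super().__init__([0] * n)
        self._index, self._mask = index, 1 << bit

    def __setitem__(self, i, value):
        if i == self._index:
            value |= self._mask
        super().__setitem__(i, value)


results = run_phases(StuckBitMemory(16, index=3, bit=0))
print({name: len(errs) for name, errs in results.items()})
```

Running the phases against StuckBitMemory shows why complementary patterns are used: a stuck bit is only caught by the phases that try to write the opposite value to that bit.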

The commands for performing memory initialization and tests are:

        dirinit
        dirtest
        meminit
        memtest

It's not recommended to run memory tests in Dex mode since they are so slow; they take a very long time and don't stress the memory. Also, tests cannot be run to cached memory while in Dex mode since the cache is already being used for POD memory.

Examples

  1. memtest u:32m 1m
    memtest 0x9600000002000000 0x100000

    These are equivalent and test one megabyte of uncached memory space starting at an offset of 32 megabytes. The first argument is the start address, uncached 32 megabytes, and the second is the length.

  2. memtest c:32m 32m

    Tests the second 32 megabytes of cached memory. Tests on cached memory are very fast. However, it doesn't make sense to test a very small region of memory since you would only be testing out of the cache.

  3. memtest u:b:3 0 64m

    Tests 64 megabytes of uncached memory in bank 3. The b: is a shorthand way to add multiples of the bank size, 512 megabytes, to the address.

  4. memtest n:7 b:3 c: 1m 63m
    memtest (n:7 b:3 c: 1m) 63m
    memtest 0xa800000760100000 0x3f00000

    These are equivalent and test 63 megabytes of cached memory on node (NASID) 7, beginning at the one megabyte point of bank 3.
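
The equivalences in examples 1 and 4 can be checked with a short Python sketch. The base addresses are taken from the examples themselves (0x9600000000000000 for u: and 0xa800000000000000 for c:), and the node shift of 32 comes from the Node ID occupying physical address bits 40:32; the function name pod_address is hypothetical and is not a POD command.

```python
CACHED_BASE = 0xA800000000000000     # c: prefix, taken from example 4
UNCACHED_BASE = 0x9600000000000000   # u: prefix, taken from example 1
NODE_SHIFT = 32                      # Node ID sits in address bits 40:32
BANK_SIZE = 512 << 20                # b: adds multiples of 512 MB
MB = 1 << 20


def pod_address(offset, space="c", node=0, bank=0):
    """Combine the c:/u:, n:, and b: modifiers with an offset."""
    if space not in ("c", "u"):
        raise ValueError("space must be 'c' or 'u'")
    if not 0 <= node < (1 << 9):
        raise ValueError("node must fit in 9 bits")
    if not 0 <= bank < 8:
        raise ValueError("bank must be 0 through 7")
    base = CACHED_BASE if space == "c" else UNCACHED_BASE
    return base + (node << NODE_SHIFT) + bank * BANK_SIZE + offset


print(hex(pod_address(32 * MB, space="u")))                 # memtest u:32m
print(hex(pod_address(1 * MB, space="c", node=7, bank=3)))  # n:7 b:3 c: 1m
```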

When testing directory/protection memory, as opposed to regular memory, the dirtest command is used. It doesn't matter whether one uses a physical, cached, or uncached address. ECC checking should be turned off when testing directory memory.

The maxerr command controls the number of errors the memory test prints before it gives up and goes on to the next phase of the memory test. The default is to print 32 errors.

A final memory test available is the sanity test (santest) which tests a single address according to the following scheme:

  1. Clear the error registers and verify that they clear
  2. Test that no bits in the corresponding directory memory entry LO/HI are stuck at 1 or 0, with error register checking.
  3. Initialize the corresponding directory memory entry LO/HI to the Unowned state.
  4. Test that no bits in the memory location are stuck at 1 or 0, with error register checking.

Problem Reporting

There are four general types of problems that may occur in the PROM:

  1. The PROM takes an unexpected exception and enters POD mode.
  2. The PROM hangs, but can be made to enter POD mode by sending an NMI.
  3. The PROM hangs, but does not respond to NMI.
  4. The PROM waits for something to happen that never occurs.

If either of the first two cases occurs, useful debugging information can be dumped that may aid a PROM developer in quickly pinpointing the problem. At a minimum, the output of the following POD mode commands should be captured:

    why          <-- Prints the exception type and status registers
    pr           <-- Prints all R10000 general purpose registers
    error_dump   <-- Displays detailed Hub error register dump
    crbx         <-- If it appears to be I/O-related
    edump_bri    <-- If it appears to be Bridge-related
    fru 1        <-- Runs the FRU Analyzer on the error dump data

The third case usually indicates the Hub chip has been hung, typically because of I/O problems, and is very difficult to debug. The only available information is the last thing the PROM printed (or the last progress LED value).

The fourth case usually indicates a CrayLink communications problem. One node may be waiting on another, but never hear from it due to link errors. Multiple nodes may believe they are the global master. It is very useful to check for extinguished Router card LEDs to see if any of the CrayLink connections have gone down, effectively isolating a CPU while the system was running (see Reading the Router LEDs).

Flashing PROMs

The flashing method described here should not be necessary except in case of blank systems or special emergencies. Ordinarily, the IO6prom and Irix flashcpu, flashio, and flash commands are all that is needed for flashing, but if a system can't come up to the IO6prom or Irix due to a PROM problem, this may come in handy.

It is possible to use one node to flash another node's PROM using the IP27prom flash command. The contents of the PROM of the node doing the flash is copied into the remote node.

Warning: The remote node must perfectly match the node doing the flashing. Never copy a PROM to another node if the other node has a different secondary cache size or clock speed. The IP27prom flash command does not automatically reconfigure PROMs as the IO6prom and Irix flash commands do.

  1. Get to a POD prompt on the node from which the flash will be done. Try setting Hardware Debug Switches 4 and 5 to enter memoryless POD mode. Alternatively, use ^Tnmi from the MSC alternate console port part way through system initialization. Either way will result in Dex POD mode.
  2. Enter cached mode using the go cac command. Flashing should be done from cached mode if possible because it's deathly slow in Dex mode, plus the route command sometimes has trouble changing remote nodes' NASIDs when in Dex mode.
  3. Establish NASIDs and routes for the node(s) to be flashed. If the system was previously booted past global arbitration, NASIDs may already be assigned. Check the NASIDs that appear in the POD prompts coming out on the MSC ports to see if they may have already been assigned; if they are non-zero, route command(s) may be unnecessary and it may become self-evident which nodes have which NASIDs assigned.

    To establish routes manually, use the route command, which takes a CrayLink vector and a NASID to assign. Any NASID less than 32 will do. Multiple nodes may be routed at the same time for convenience. For example, if you're talking to node N1 and wish to flash N2, N3, and N4 within the same module, the following commands may suffice:

    The route command may sometimes be used without arguments to automatically determine all nodes present in the system, and assign routes and NASIDs automatically in one step.

    The route command will most likely cause the remote nodes to crash since their NASIDs are changed out from under them, but this is okay.

  4. Execute the flash command, specifying the NASID of the node to be flashed. More than one NASID may be flashed in parallel to save time; for example,

    After flashing, the system may be rebooted, or additional route and flash commands may be executed.


PROM Log Overview

The IP27 Flash PROM is an AMD29F080 containing 1 MB of data, organized as 16 sectors (pages) of 64 KB. The first 14 sectors are used for storing the IP27prom code, and the last 2 sectors are used for storing the PROM log. Sectors can be flashed (erased) independently, so when new code is flashed into the IP27prom, the last two sectors are left intact, and when the PROM log is initialized, only the last two sectors are affected.

The PROM Log is used to store three types of data:

  1. PROM environment variables, which consist of a key/value pair of strings.
  2. PROM environment lists, which consist of multiple value strings stored under the same key string.
  3. Error log messages, which can be dumped and viewed from the PROM and get copied into /var/adm/SYSLOG each time Irix boots.

Variables can be set and cleared, and log entries can be added to the first (primary) sector. Space is used up each time anything is done to the PROM Log (since data can't be erased except by flashing). When the primary sector becomes full, the second (alternate) sector is flashed clean and all persistent data (variables and lists) is compacted and copied into it. The new sector then becomes the primary sector, and the old sector becomes the alternate sector. Swapping back and forth in this manner minimizes the chance that all data will be lost in case of a problem.

The setenv, unsetenv, and printenv POD commands are used to view and modify environment variables stored in the PROM Log. Variable keys may be up to 15 ASCII characters in length, and variable values may be up to 127 ASCII characters.
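
The append-only behavior and sector swapping described above can be modeled roughly as follows. This is a sketch under stated assumptions, not the PROM's on-flash format: the record layout, the two-byte per-record overhead, and the class name PromLogModel are all hypothetical, lists are not modeled, and error log entries are assumed not to survive compaction because only variables and lists are described as persistent.

```python
SECTOR_SIZE = 64 * 1024
MAX_KEY, MAX_VALUE = 15, 127   # limits on variable keys and values


class PromLogModel:
    """Toy model of a two-sector, append-only PROM Log."""

    def __init__(self, sector_size=SECTOR_SIZE):
        self.sector_size = sector_size
        self.sectors = [[], []]   # each sector is a list of (kind, key, value)
        self.primary = 0
        self.swaps = 0

    @staticmethod
    def _cost(record):
        _kind, key, value = record
        return len(key) + len(value) + 2   # hypothetical record overhead

    def _used(self):
        return sum(self._cost(r) for r in self.sectors[self.primary])

    def _append(self, record):
        if self._used() + self._cost(record) > self.sector_size:
            self._swap_sectors()
        if self._used() + self._cost(record) > self.sector_size:
            raise RuntimeError("PROM log full even after compaction")
        self.sectors[self.primary].append(record)

    def _swap_sectors(self):
        """Erase the alternate sector, copy live variables, swap roles."""
        live = self.variables()
        alternate = 1 - self.primary
        self.sectors[alternate] = [("var", k, v) for k, v in live.items()]
        self.primary = alternate
        self.swaps += 1

    def variables(self):
        """Replay the primary sector; later records override earlier ones."""
        live = {}
        for kind, key, value in self.sectors[self.primary]:
            if kind == "var":
                live[key] = value
            elif kind == "unset":
                live.pop(key, None)
        return live

    def setenv(self, key, value):
        if not (1 <= len(key) <= MAX_KEY and key.isascii()):
            raise ValueError("key must be 1 to 15 ASCII characters")
        if not (len(value) <= MAX_VALUE and value.isascii()):
            raise ValueError("value must be at most 127 ASCII characters")
        self._append(("var", key, value))

    def unsetenv(self, key):
        self._append(("unset", key, ""))

    def log(self, key, message):
        self._append(("log", key, message))


# A tiny sector size makes the swap easy to observe.
prom = PromLogModel(sector_size=64)
prom.setenv("SwapBank", "1")
for _ in range(10):
    prom.log("Info", "x" * 10)
print(prom.swaps, prom.variables())
```

The model makes one point of the design visible: every setenv, unsetenv, or log call consumes space, and space is only reclaimed when the sectors swap.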

Reserved PROM Log Variables

Where the value field is numeric, it is generally stored as a decimal number, or a hex number if prefixed by 0x. The symbol # appears where the data is numeric.

Reserved PROM Log Variables
Key              Value     Purpose
DisableA         reason    CPU A disabled if variable defined. The value is generally a description of the reason the CPU is disabled.
DisableB         reason    CPU B disabled if variable defined. The value is generally a description of the reason the CPU is disabled.
DisableMem       #...      List of memory banks (in the range 0 to 7) to disable.
DisableMemSz     ########  Positional list of the sizes of all disabled banks, generated automatically when memory is disabled by POD command.
DisableMemMask   #...      List of disabled banks, generated automatically when memory is disabled by POD command.
MemDisReason     reason    String indicating why memory bank(s) were disabled.
DisableIO        reason    I/O probing disabled if present. Hub I/O section will not be initialized.
SwapBank         #         Alternate bank swapped with bank 0 if bank 0 is bad. Automatically incremented if bank 0 is found to be bad.
OverrideNIC      #         Override Hub NIC, useful if NIC broken (this NIC is not the one used for software licensing).
GlobalMaster     reason    If so, node wants to be global master (if multiple have this, lowest NIC).
LastModule       #         Number of the module this node card was last plugged into, updated automatically (Origin2000 only).
Origin200Module  #         Module number for master/slave configuration (Origin200 only).
ForceConsole     -         Temporary variable automatically set if no console is found. After automatic reboot, a default console will be selected.
DisableExpress   reason    If this variable is set, express links are disabled on subsequent reboots. This is mostly useful for benchmarking the effects of express links.
NASIDOffset      #         Adds a value to the calculated NASID. May prevent successful booting -- for developer use only.
NASID            #         Variable automatically assigned during partitioning. Used to reassign NASID if one partition is rebooted.
FencePorts       #...      List of router ports to fence off. Automatically assigned and used during partitioning.

By convention, in Log entries the first part of the value should consist of the date in local time, formatted in ASCII as follows: YY/MM/DD hh:mm:ss. This applies only if the date is available when the entry is made. The PROM may not have a real-time clock, but the kernel may.
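
As an illustration of this convention, a log value could be built as follows. This is a sketch only: the function name log_value is hypothetical, and the single space between the date and the message is an assumption, since only the date format itself is specified here.

```python
from datetime import datetime


def log_value(message, when=None):
    """Prefix message with YY/MM/DD hh:mm:ss when a local time is available."""
    if when is None:
        return message   # no clock available: no date prefix
    return when.strftime("%y/%m/%d %H:%M:%S") + " " + message


print(log_value("Example message", datetime(1996, 12, 30, 14, 5, 9)))
print(log_value("Example message"))
```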

Error Log Keys

Log entries are to be used very sparingly to avoid filling up the log. Any given message should generally not be logged more than once per reboot. Additional message types may be added later.

When the Irix kernel boots, it dumps the contents of the error log since the last reboot into the /var/adm/SYSLOG file, and sets a marker entry in the log for the next reboot.

Error Log Keys
Key    Value   Meaning
Fatal  string  Fatal error
Error  string  Non-fatal error
Info   string  Important info message

List Keys

List Keys
Key    Value   Meaning
-      -       There are currently no list keys in use.


Origin2000 Routers

Reading the Router LEDs

Each router card has two side-by-side columns of 6 discrete LEDs.

The left side (green) LEDs indicate the connection status of router ports 1 through 6. They light up if a port has successfully negotiated and maintained a connection with another device, and turn off if the port is disconnected (physically or due to severe connection errors). These lights are very useful for pinpointing a cable or router Field Replaceable Unit (FRU) and should be one of the first things inspected if link(s) appear to be down. Note that although ports 4 through 6 connections are on the Origin2000 module midplane, the connection lights are still useful because an improperly seated card may not connect to the router properly.

The right side (yellow) LEDs are controlled by system software and typically light up when there is traffic across the corresponding link.

Vector Addressing

Early in the system boot, before the PROM has configured the CrayLink Interconnect, Node IDs are not yet assigned (they're all zero) and Router tables are not yet programmed. In this state, it is not yet valid to access the memory and I/O spaces of remote nodes using regular remote memory accesses (where the Node ID has been placed into physical address bits 40:32, such as by using the n: modifier).

The Origin2000 Hub and Routers have a feature called vector addressing which allows access to any remote Hub's Network Interface (NI_) register subset, and to all Router (RR_) registers. To read or write a remote Hub or Router register requires specifying a path called a vector. A vector contains a string of output port numbers from 1 to 6 (corresponding directly to the hardware labeling A through F), one per nybble, appearing in reverse order. For example, the vector 0x512 indicates the request should leave the current Hub into a Router, leave port 2 of that Router into another Router, leave port 1 of that Router into another Router, and finally leave port 5 of that Router into either a Hub or Router, depending on which appears on that last port. The response automatically follows the reverse path and arrives back at the local Hub. The vector 0x0 accesses the device immediately connected to the local Hub, which is usually a Router but may be a Hub if a Null Router is being used.
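
The nybble ordering can be illustrated with a short Python sketch that converts between a list of output ports (in the order they are traversed) and a vector value. The function names encode_vector, decode_vector, and port_label are hypothetical helpers for illustration, not POD commands.

```python
def encode_vector(ports):
    """Pack output ports (first hop first) into a vector, one per nybble."""
    vector = 0
    for hop, port in enumerate(ports):
        if not 1 <= port <= 6:
            raise ValueError("router ports are numbered 1 through 6")
        vector |= port << (4 * hop)   # first hop lands in the lowest nybble
    return vector


def decode_vector(vector):
    """Return the output ports in the order the request traverses them."""
    ports = []
    while vector:
        ports.append(vector & 0xF)
        vector >>= 4
    return ports


def port_label(port):
    """Map port numbers 1 through 6 to the hardware labels A through F."""
    return "ABCDEF"[port - 1]


print(hex(encode_vector([2, 1, 5])))                   # ports 2, then 1, then 5
print(decode_vector(0x512))
print([port_label(p) for p in decode_vector(0x512)])
```

Because the first hop occupies the lowest nybble, the hex digits of a vector read right to left along the path, which is why leaving ports 2, 1, and 5 in that order is written 0x512.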

POD contains several commands for performing vector operations, among them vector read (vr) and vector write (vw). Here are several examples:

  1. vr 0 0

    Reads address 0 from the device immediately connected to the local Hub. Address 0 is the NI_STATUS_REV_ID on a Hub, and RR_STATUS_REV_ID on a Router. The status register was deliberately placed at address 0 on both Hubs and Routers, and contains a bit field that can be used to tell which type is connected.

  2. vr 1 rr_scratch_reg0

    Goes out the Hub into the connected Router, out port 1 of that Router to another Router, and reads one of that Router's scratch registers.

  3. vw 0x512 ni_port_reset 0x80

    Goes through a string of 3 Routers to reach a Hub and writes 0x80 (warm reset code) into that Hub's NI_PORT_RESET register.

  4. loop vr 0 0

    Loops repeatedly reading the status register of the immediately connected device. May be useful for getting a rough idea of the link quality since the number of polling iterations for each read is printed (if this number is not constant or is very high, the link may be performing excessive retries).

System Configurations

Full Router Configuration

Fully-Populated Origin2000 Module Connectivity

Consider the example CrayLink Interconnect configuration shown in the figure above, which is a typical fully-populated single-module system having two Routers (slots R1-R2) and four node cards (slots N1-N4). The five links shown are permanent and internal to the system mid-plane, and nothing is connected to either Router's external cable ports (1, 2, and 3). Any source node can talk to any destination node or router using vectors shown in the following table.

Vector Routes
        Destination
Source  N1    N2    N3    N4    R1  R2
N1      5     4     0x56  0x46  0   6
N2      5     4     0x56  0x46  0   6
N3      0x56  0x46  5     4     6   0
N4      0x56  0x46  5     4     6   0
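
The table above can be reproduced from a small connectivity model. In this sketch the port assignments (R1 port 5 to N1, port 4 to N2, port 6 to R2, and likewise for R2) are inferred from the table entries themselves rather than taken from a schematic, and the function name vector_route is hypothetical.

```python
from collections import deque

# Port maps inferred from the Vector Routes table (an assumption).
ROUTER_PORTS = {
    "R1": {4: "N2", 5: "N1", 6: "R2"},
    "R2": {4: "N4", 5: "N3", 6: "R1"},
}
# Router that each node's Hub is attached to.
NODE_ROUTER = {"N1": "R1", "N2": "R1", "N3": "R2", "N4": "R2"}


def vector_route(source, destination):
    """Breadth-first search for the shortest vector from source to destination."""
    start = NODE_ROUTER[source]          # vector 0 reaches this device
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        device, ports = queue.popleft()
        if device == destination:
            vector = 0
            for hop, port in enumerate(ports):
                vector |= port << (4 * hop)   # first hop in lowest nybble
            return vector
        # Only Routers forward; a Hub has no entry and ends the path.
        for port, neighbor in sorted(ROUTER_PORTS.get(device, {}).items()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, ports + [port]))
    raise ValueError(f"no route from {source} to {destination}")


for src in ("N1", "N2", "N3", "N4"):
    row = [hex(vector_route(src, dst)) for dst in ("N1", "N2", "N3", "N4", "R1", "R2")]
    print(src, *row)
```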

Star Router Configuration (First Possibility)

Fully-Populated Star Router Connectivity (First Possibility)

A Star Router configuration is another way of populating an Origin2000 module. The figure above shows the connectivity of a fully populated Star Router configuration with the Router in slot 1 and the Star Router in slot 2. It is also possible to place the Star Router in slot 1 and the Router in slot 2 (see next configuration).

The midplane connection from R1 port 6 is internal to the system and connects directly to N3. The connection from R1 port 3 is provided by an external jumper (or cable) connecting to SR2, and is converted from differential to single-ended by the XC chip in the Star Router. The remaining ports on R1 (1 and 2) may be connected externally.

This configuration offers two advantages:

The disadvantage is limited scalability.

Any source node can talk to any destination node (or R1) using vectors shown in the table below.

Vector Routes
        Destination
Source  N1  N2  N3  N4  R1
N1      5   4   6   3   0
N2      5   4   6   3   0
N3      5   4   6   3   0
N4      5   4   6   3   0

Star Router Configuration (Second Possibility)

Fully-Populated Star Router Connectivity (Second Possibility)

The second Star Router configuration has the Router in slot 2 and the Star Router in slot 1. See the previous configuration for more details on the Star Router configurations.

Any source node can talk to any destination node (or R2) using vectors shown in the table below.

Vector Routes
        Destination
Source  N1  N2  N3  N4  R2
N1      6   3   5   4   0
N2      6   3   5   4   0
N3      6   3   5   4   0
N4      6   3   5   4   0

Null Router Configuration

Null Router Connectivity (Two Possibilities)

The Null Router configuration is the simplest way to create a 4-CPU system. The figure above shows the two possible connectivities of a module using the Null Router configuration.

In this configuration, only one inexpensive Null Router card is needed. Null Routers contain very little componentry and simply connect two Hub slots together. The system is not at all scalable, and may be only half-populated (populating both halves as two separate systems is not supported).

Either node may talk to the other using the vector 0. No other vector routes are possible.


****

"I give this document 5 cheeses, my highest rating."-Cheeser, The Dairy News.