Syscon Error Codes

From PS3 Developer wiki

Description

Syscon memory contains a 0x100-byte table for storing error codes. Each entry consists of a 4-byte error code plus a 4-byte timestamp, so the table can hold 32 errors in total (0x100 / 8). When the table is full and a new error needs to be stored, syscon deletes the oldest entry.
Timestamps are in UTC, stored as the number of seconds elapsed since 2000. If the battery/cell was empty or removed when the error was triggered, the timestamp is recorded as FFFFFFFF.
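
The timestamp rule can be sketched in Python as follows. This is a minimal illustration, assuming the epoch is exactly 2000-01-01 00:00:00 UTC (the text only says "since 2000"); the name decode_timestamp is hypothetical and not part of any syscon tool.

```python
from datetime import datetime, timedelta, timezone

# Assumption: "seconds since 2000" means since 2000-01-01 00:00:00 UTC.
EPOCH_2000 = datetime(2000, 1, 1, tzinfo=timezone.utc)
INVALID_TIMESTAMP = 0xFFFFFFFF  # recorded when the battery was empty or removed


def decode_timestamp(raw):
    """Convert a raw 32-bit timestamp to a UTC datetime.

    Returns None for FFFFFFFF, which marks an error logged without a valid clock.
    """
    if raw == INVALID_TIMESTAMP:
        return None
    return EPOCH_2000 + timedelta(seconds=raw)


if __name__ == "__main__":
    for raw in (0, 86400, INVALID_TIMESTAMP):
        print(f"{raw:08X} ->", decode_timestamp(raw))
```

Returning None rather than a placeholder date keeps invalid entries easy to tell apart from real ones when sorting or printing a log.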


Errorlog syscon EEPROM dump from CECH-20xx, DYN-001, SW2-301
Offset(h) 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F

00000900  02 10 80 A0 67 52 8E 0B 02 10 80 A0 A7 52 8E 0B  ..€ gRŽ...€ §RŽ.
00000910  02 10 80 A0 C8 52 8E 0B 02 10 80 A0 32 35 8F 0B  ..€ ÈRŽ...€ 25..
00000920  02 10 80 A0 47 35 8F 0B 02 10 80 A0 51 35 8F 0B  ..€ G5....€ Q5..
00000930  FF FF FF FF FF FF FF FF FF 14 80 A0 81 63 86 0B  ÿÿÿÿÿÿÿÿÿ.€ .c†.
00000940  02 10 80 A0 82 63 86 0B 02 10 80 A0 91 64 86 0B  ..€ ‚c†...€ ‘d†.
00000950  02 10 80 A0 37 9C 87 0B 02 10 80 A0 46 9C 87 0B  ..€ 7œ‡...€ Fœ‡.
00000960  02 10 80 A0 53 9C 87 0B 02 10 80 A0 C1 AC 87 0B  ..€ Sœ‡...€ Á¬‡.
00000970  02 10 80 A0 CF AC 87 0B 02 10 80 A0 DC AC 87 0B  ..€ Ϭ‡...€ ܬ‡.
00000980  02 10 80 A0 EA AC 87 0B 02 10 80 A0 F4 AC 87 0B  ..€ ꬇...€ ô¬‡.
00000990  02 10 80 A0 FF AC 87 0B 02 10 80 A0 0C AD 87 0B  ..€ ÿ¬‡...€ .­‡.
000009A0  02 10 80 A0 18 AD 87 0B 01 13 80 A0 19 AD 87 0B  ..€ .­‡...€ .­‡.
000009B0  02 10 80 A0 24 AD 87 0B 02 10 80 A0 2F AD 87 0B  ..€ $­‡...€ /­‡.
000009C0  02 10 80 A0 3F AD 87 0B 02 10 80 A0 46 AD 87 0B  ..€ ?­‡...€ F­‡.
000009D0  02 10 80 A0 5C AD 87 0B 02 10 80 A0 71 AD 87 0B  ..€ \­‡...€ q­‡.
000009E0  02 10 80 A0 9F AD 87 0B 02 10 80 A0 B5 AD 87 0B  ..€ Ÿ­‡...€ µ­‡.
000009F0  02 10 80 A0 3A B7 87 0B 02 10 80 A0 F6 51 8E 0B  ..€ :·‡...€ öQŽ.
  • In the errorlog sample above:
    • Contains errors: A080 1 002, A080 1 301, A080 1 4FF
    • Timestamps are valid
Errorlog syscon EEPROM dump from CECH-42xx, PQX-001, SW3-304
Offset(h) 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F

00000900  02 18 61 A0 FF FF FF FF 02 18 61 A0 FF FF FF FF  ..a ÿÿÿÿ..a ÿÿÿÿ
00000910  02 18 61 A0 FF FF FF FF 02 18 61 A0 FF FF FF FF  ..a ÿÿÿÿ..a ÿÿÿÿ
00000920  02 18 61 A0 FF FF FF FF 02 18 61 A0 FF FF FF FF  ..a ÿÿÿÿ..a ÿÿÿÿ
00000930  02 18 61 A0 FF FF FF FF 02 18 61 A0 FF FF FF FF  ..a ÿÿÿÿ..a ÿÿÿÿ
00000940  02 18 61 A0 FF FF FF FF 02 18 61 A0 FF FF FF FF  ..a ÿÿÿÿ..a ÿÿÿÿ
00000950  02 18 61 A0 FF FF FF FF 02 18 61 A0 FF FF FF FF  ..a ÿÿÿÿ..a ÿÿÿÿ
00000960  02 18 61 A0 FF FF FF FF 02 40 40 A0 FF FF FF FF  ..a ÿÿÿÿ.@@ ÿÿÿÿ
00000970  34 30 40 A0 FF FF FF FF 02 40 40 A0 FF FF FF FF  40@ ÿÿÿÿ.@@ ÿÿÿÿ
00000980  34 30 40 A0 FF FF FF FF 02 40 40 A0 FF FF FF FF  40@ ÿÿÿÿ.@@ ÿÿÿÿ
00000990  34 30 40 A0 FF FF FF FF 02 40 40 A0 FF FF FF FF  40@ ÿÿÿÿ.@@ ÿÿÿÿ
000009A0  34 30 40 A0 FF FF FF FF 02 40 40 A0 FF FF FF FF  40@ ÿÿÿÿ.@@ ÿÿÿÿ
000009B0  34 30 40 A0 FF FF FF FF 02 40 40 A0 FF FF FF FF  40@ ÿÿÿÿ.@@ ÿÿÿÿ
000009C0  34 30 40 A0 FF FF FF FF FF FF FF FF FF FF FF FF  40@ ÿÿÿÿÿÿÿÿÿÿÿÿ
000009D0  FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF  ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ
000009E0  FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF  ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ
000009F0  FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF  ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ
  • In the errorlog sample above:
    • Contains errors: A061 1 802, A040 4 002, A040 3 034
    • Timestamps are invalid (caused by a missing battery while doing repair jobs)
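
Both samples are consistent with little-endian storage: the bytes 02 10 80 A0 correspond to the listed error A0801002. The sketch below splits a 0x100-byte table into entries under that assumption. The function name parse_errorlog is hypothetical, and the sample input is only the first row of the CECH-20xx dump padded with empty slots for illustration.

```python
import struct

ENTRY_SIZE = 8      # 4-byte error code + 4-byte timestamp
TABLE_SIZE = 0x100  # 32 entries
EMPTY = 0xFFFFFFFF  # unused code slot, or timestamp logged without a valid clock


def parse_errorlog(table):
    """Split an errorlog table into (code, timestamp) pairs in storage order.

    Assumption: both fields are little-endian 32-bit values, inferred from
    the sample dumps. Slots whose code is FFFFFFFF are treated as unused.
    """
    if len(table) != TABLE_SIZE:
        raise ValueError("expected a 0x100-byte table")
    entries = []
    for code, stamp in struct.iter_unpack("<II", table):
        if code == EMPTY:
            continue  # skip unused slot
        entries.append((f"{code:08X}", None if stamp == EMPTY else stamp))
    return entries


if __name__ == "__main__":
    row = bytes.fromhex("021080A067528E0B021080A0A7528E0B")
    sample = row + b"\xff" * (TABLE_SIZE - len(row))
    for code, stamp in parse_errorlog(sample):
        print(code, stamp)
```

Entries come back in storage order, which is not necessarily chronological: in the first sample the empty slot sits in the middle of the table, with older timestamps after it and newer ones before it.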

How to get the syscon error log

If the PS3 still boots to the XMB and is able to install and run apps, you can use programs like the ones mentioned at the top of the Platform ID page.
If the PS3 does not boot, it is still possible to retrieve the syscon error log by connecting a PC to the syscon UART port with a "USB to TTL UART adapter" and running the command errlog. There is also the command clearerrlog, which empties the error table (handy to prevent confusion with old error codes that accumulated over months or years and are not related to the current problem).

Error code format

Error codes follow the format A R ST C ERR, where:

  •  A  (Fixed)
    • A = This is always "A"
  •  R  (Reserved)
    • 0-E = Unknown
    • F = Frequent error (For example, Motherboard Damage/Breakdown, etc.)
  •  ST  (Step Number)
    • 00-7F = Step Number of the Power On Sequence (POS). This is the Power On Self Test (POST) process. If successful, the BOOT process begins, which loads the OS.
    • 80 = Static State (Power ON). The console completed the POST and was in a static state. The error happened when the PS3 was powered on. You can get an error with Step No. 80 if your error occurs in game. For example, 80 1002 errors can happen if your NEC/TOKINs are going bad.
    • 90 = Static State (Power OFF). The error happened when the PS3 was powering off. For example, if a problem causes the system to hang while shutting down the console will beep before powering off. An error with step no. 90 will be recorded in the errorlog.
    • A0 = Immediately after SYSCON reset. A reset pulse is sent to the console's main chipset to coordinate and synchronize them. If an error occurs immediately after SYSCON reset, it means it occurred before anything else can happen. For example, if the CPU is completely dead it will not respond to the reset pulse and an error will be generated immediately after reset.
  •  C  (Category)
    • 1 = System Error
    • 2 = Fatal Error
    • 3 = Boot Error
    • 4 = Data Error
  •  ERR  (Error)
    • Any number in hex

Examples:
A0801002

  • System Error 002 (RSX VRAM Power Fail), which occurred after the system had powered on successfully (step no. 80).
  • 1002 errors are known to be caused by bad NEC/TOKINs, but these may not be the only cause. See the Error codes section below for more details.

A0403034

  • Fatal Boot Error 034 (RSX/CELL Communication Error), which occurred at step no. 40, before the Power On Sequence completed.
  • 3034 errors are known to be caused by BGA defects (among other issues). See the Error codes section below for more details.

While the Reserved Area and Step Number help to work out when the error occurred and how frequent it is, the last four digits matter most for working out what the error means. The Error codes section below therefore lists only the last 4 digits (category + error).
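
The field layout above can be sketched as a small Python decoder. This is an illustration of the A R ST C ERR split only; the names SysconError, decode_error and describe_step are hypothetical, and the descriptions are copied from the lists above.

```python
from typing import NamedTuple

CATEGORIES = {1: "System Error", 2: "Fatal Error", 3: "Boot Error", 4: "Data Error"}
HEX_DIGITS = set("0123456789ABCDEF")


class SysconError(NamedTuple):
    reserved: str  # R: one hex digit
    step: int      # ST: step number
    category: int  # C: category digit
    error: str     # ERR: three hex digits

    @property
    def short_code(self) -> str:
        """Category + error, the form used in the error list (e.g. '1002')."""
        return f"{self.category:X}{self.error}"


def describe_step(step: int) -> str:
    """Map a step number to the description given in the format list."""
    if 0x00 <= step <= 0x7F:
        return f"Power On Sequence step {step:02X}"
    return {
        0x80: "Static State (Power ON)",
        0x90: "Static State (Power OFF)",
        0xA0: "Immediately after SYSCON reset",
    }.get(step, "Unknown step")


def decode_error(code: str) -> SysconError:
    """Split an 8-digit code such as 'A0801002' into its fields."""
    digits = code.replace(" ", "").upper()
    if len(digits) != 8 or digits[0] != "A" or not set(digits) <= HEX_DIGITS:
        raise ValueError(f"not a syscon error code: {code!r}")
    return SysconError(
        reserved=digits[1],
        step=int(digits[2:4], 16),
        category=int(digits[4], 16),
        error=digits[5:8],
    )


if __name__ == "__main__":
    for raw in ("A0801002", "A0403034"):
        err = decode_error(raw)
        print(raw, "->", describe_step(err.step), "|",
              CATEGORIES.get(err.category, "Unknown category"), "|", err.short_code)
```

The short_code property yields the 4-digit form used as the headings in the Error codes section, so it can serve as a lookup key.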

Error codes

System Errors


1001 (Power CELL)

  • Components Involved:
    • CELL (IC1001 on COK-001)
    • NEC/TOKIN Proadlizers (C6140/C6141/C6142/C6143 on COK-001)
    • Other nearby components of the power block

Speculation:
1001 errors happen when the system encounters an unexpected shutdown. They often occur during testing, when the console is repeatedly switched on and off instead of being shut down gracefully. They have been associated with other errors, but there does not appear to be any single cause.

The hypothesis that this error is associated with insufficient filtering on the CPU's core voltage (VDDC) has not been confirmed. There is a range of voltage ripple/noise that "should" cause errors before it gets bad enough to cause a CELL VDDC Power Failure (3003). Numerous SMD components are involved in filtering, but the main concern is the NEC/TOKIN Proadlizers (capacitors). 1002 errors are the fingerprint of bad NEC/TOKINs on the GPU, but 1001 has not been shown to have the same association with the CPU's filter. However, a connection is strongly suspected.

1002 (Power RSX)

  • Components Involved:
    • RSX (IC2001 on COK-001)
    • NEC/TOKIN Proadlizers (C6229/C6230/C6231/C6232 on COK-001)
    • Other nearby components of the power block

This error has been associated with insufficient filtering on the RSX_VDDC power line. There is a range of voltage ripple/noise that will cause this error before it gets bad enough to cause an RSX_VDDC Power Failure (3004). YLODs that log 1002 vary widely: some appear about 2 seconds after power on, others only during intense games.

Numerous SMD components are involved in filtering, but the main concern is the NEC/TOKIN Proadlizers (capacitors). 1002 errors are the fingerprint of bad NEC/TOKINs.

1004 (Power AC/DC)

1103 (Thermal)

1200 (Thermal CELL)

  • Components Involved:

CPU overheat. This is a common error. The usual culprit is failed Thermal Interface Material (TIM). As the material ages it "dries", allowing air inside. Air is a heat insulator, so it reduces the TIM's ability to transfer enough heat away from the processor. The system fan steadily gets louder over time until it cannot keep up. Once the processor approaches its Thermal Shutdown Temperature, a yellow LED begins flashing on the console (early Phat models). Once it reaches the Thermal Shutdown Temperature, the console beeps three times and performs a hard shutdown, flashing red until the console is unplugged and the error state is reset. Error 1200 is recorded in the SYSCON errorlog.

First make sure the system fan is working. If so, apply new TIM between the Internal Heat Spreader (IHS) and Heatsink (HS). If that does not resolve the problem, carefully remove the IHS (delid) and replace the TIM between the IHS and the processor die.

If that still does not work, it could be an issue with the temperature monitor chip (IC1101). Beyond that, some users have noted that dead CPUs can throw error 1200. However, that is the limit of our current understanding. The CPU could be dead, or have another unexplained issue, but usually reflowing or reballing is the last-ditch effort to revive such a console.

1201 (Thermal RSX)

  • Components Involved:

GPU overheat. This is the same as error 1200 above, except that it applies to the GPU. The same repair steps apply, except that its temperature monitor chip is IC2101.

1203 (Thermal CELL VR)

Some PS3 motherboards (TMU-520, COK-001, COK-002) have a temperature monitor located somewhere in the CELL power block. The other retail PS3 motherboard models do not measure the temperature of the CELL VR.

All the PS3 temperature monitor chips have an integrated internal thermal sensor plus 2 pins for an optional external sensor. The temperature monitors for CELL and RSX are configured to use the external sensor, but the one for the CELL VR probably uses the internal sensor.

1204 (Thermal South Bridge)

1205 (Thermal EE/GS)

This error is specific to COK-001/CXD2953AGB (with full PS2 hardware compatibility, EE+GS) or COK-002/CXD2972GB (with partial PS2 hardware compatibility, GS only).

1301

CELL PLL

14FF

Check stop

1601

BE Livelock Detection

Speculation: If a YLOD turns into a GLOD after reball/reflow then 1601 (with or without 1701) could mean the RSX RAM was damaged. This is a loose association based on a few user reports.

1701

CELL attention

1802

RSX init

1900 (RTC Voltage)

RTC voltage

1901 (RTC Oscillator)

RTC oscillator

1902 (RTC Access)

RTC access


Fatal Errors


  • These error codes seem to be repeated up to 3 times for 3 special cases. For example, errors 2003, 2103, and 2203 are all related to the southbridge; the only thing that changes in the error code is the second digit (located immediately after the category). If at some point we find out what that digit means, we can join the wiki page sections together (with titles: "2001 & 2101", "2002 & 2102", "2003 & 2103", etc.)
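
When reading a log it can help to group such repeated codes together. The Python sketch below masks the second digit, whose meaning is unknown, and groups 4-digit codes that are otherwise identical. The names family_key and group_by_family are hypothetical, and the grouping is only meaningful for categories where this repetition has been observed.

```python
from collections import defaultdict


def family_key(short_code):
    """Replace the digit after the category with 'x': '2103' -> '2x03'."""
    if len(short_code) != 4:
        raise ValueError("expected a 4-digit code (category + error)")
    return short_code[0] + "x" + short_code[2:]


def group_by_family(short_codes):
    """Group codes that differ only in the digit after the category."""
    families = defaultdict(list)
    for code in short_codes:
        families[family_key(code)].append(code)
    return dict(families)


if __name__ == "__main__":
    for key, members in group_by_family(["2003", "2103", "2203", "2024", "2124"]).items():
        print(key, members)
```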

2001 (CELL)

CELL (IC1001)

2002 (RSX)

RSX (IC2001)

2003 (South Bridge)

South Bridge Error (IC3001)

2010 (Clock Subsystems)

Clock Generator Error (IC5001)

2011 (Clock CELL)

Clock Generator Error (IC5003)

2012 (Clock CELL)

Clock Generator Error (IC5002)

2013 (Clock CELL, RSX, South Bridge)

Clock Generator Error (IC5004)

2020 (HDMI)

HDMI Error (IC2502)

2022 (DVE)

DVE Error (IC2406, CXM4024R MultiAV controller for analog out)

2024 (AV)

This error tends to cause a delayed Yellow Light Of Death (10s - 1min). Sometimes described as a Green Light Of Death (GLOD) or Red Light Of Death (RLOD).

2124 and 2024 errors have been fixed by replacing both the AV and HDMI encoders. One user reported 2024/2124 errors resolved by replacing the HDMI encoder. Another removed the HDMI encoder and tested the console without it. That console primarily filled the errorlog with 2124 errors, but a few 2024's as well. So it is unclear if 2124 is specific to the HDMI Encoder or AV Encoder. It seems it could be either.

2030 (Thermal Sensor, CELL)

Speculation: 2030-33 errors reported in case of dodgy PWR/EJT daughter board.

2031 (Thermal Sensor, RSX)

2033 (Thermal Sensor, South Bridge)

2101 (CELL)

CELL (IC1001)

2102 (RSX)

RSX (IC2001)

2103 (South Bridge)

Southbridge Error (IC3001)

2110 (Clock Subsystems)

Clock Generator Error (IC5001)

This error has been resolved by a number of users who had a short on F6001. It is important to note that something usually causes that fuse to blow, like a short. So it's important to troubleshoot the board to find and repair the shorting component before replacing the fuse. Otherwise the new one will blow too.

One user, who resolved this error on his C model PS3, noted "very short YLOD. Error code shows 2110[...]Some earlier code shows 1001 and 1002." The 1001 & 1002 errors he noted in the log before the 2110 appeared may have been a clue that C6019 was deteriorating. Further investigation is needed to confirm this hypothesis, however. In his case, C6019 was shorting and caused F6001 to blow. This short overloaded F6001 and cut power to many Subsystems, such as the HDD, USB ports, South bridge, CPU, GPU, etc.

One particularly noteworthy component is IC6020, which supplies +3.3v_MK_Vdd to the clock generator (IC5001). When F6001 blows, a 02 2110 is generated. A step number of 02 is very early in the Power On Sequence (POS), which explains why 2110 is triggered instead of another error code. Since the clock generator is critical for timing, it is one of the first things the SYSCON checks during the POS.

2111 (Clock CELL)

Clock Generator Error (IC5003)

2112 (Clock CELL)

Clock Generator Error (IC5002)

2113 (Clock CELL, RSX, South Bridge)

Clock Generator Error (IC5004)

2120 (HDMI)

HDMI Error (IC2502)

2122 (DVE)

DVE Error (IC2406, CXM4024R MultiAV controller for analog out)

2124 (AV)

This error tends to cause a delayed Yellow Light Of Death (10s - 1min). Sometimes described as a Green Light Of Death (GLOD) or Red Light Of Death (RLOD).

2124 and 2024 errors have been fixed by replacing both the AV and HDMI encoders. One user reported 2024/2124 errors resolved by replacing the HDMI encoder. Another removed the HDMI encoder and tested the console without it. That console primarily filled the errorlog with 2124 errors, but a few 2024's as well. So it is unclear if 2124 is specific to the HDMI Encoder or AV Encoder. It seems it could be either.

2130 (Thermal Sensor, CELL)

2131 (Thermal Sensor, RSX)

2133 (Thermal Sensor, South Bridge)

2203 (South Bridge)

South Bridge Error (IC3001)


Fatal Boot Errors


3000

Power Failure

3001

12v Power Failure

This is usually caused by a bad Power Supply Unit (PSU).

Alternatively, a failure on the 12v_main line can cause it. Check the fuses, capacitors, resistors, and ICs on the 12v line. Measure the resistance across the large 2-prong 12v connector on the motherboard. It should read in the kilohm range if there is sufficient separation. Otherwise there may be a short somewhere on the line.

3002

Power Failure

3003

VDDC CELL Power Failure

This error occurs when power fails on the main core voltage of the CPU, for example if the filtering capacitors (NEC/TOKINs) are severely damaged. There are other SMDs in that filter, so it could be related to them as well.

3004

VDDC RSX Power Failure

This error occurs when power fails on the main core voltage of the GPU, for example if the filtering capacitors (NEC/TOKINs) are severely damaged. There are other SMDs in that filter, so it could be related to them as well.

3010

CELL Error

Observation: A user triggered this error by injecting 3.3V into PWRGD (power good) of IC6103 (NCP5318 CPU Buck Controller). It generated error 20 1001 and 20 3010.

3011

CELL

3012

CELL

3013

BE_SPI DI/DO ERROR

CELL is not communicating with syscon via SPI (no output on 1.2V MC2_VDDIO and 1.2V BE_VCS). Possible shorts on the line: check C4001 and trailing caps. Possibly a dead CPU?

Another user got this error on a CPU he damaged while delidding.

3020

CELL

3030

CELL

3031

CELL

3032

CELL Error

+1.2v_YC_RC_VDDIO PWR Fail?

3033

CELL

3034

CELL / RSX Communication Error

This is the most common error seen in early Phat model PS3s with the hottest 90nm RSX and CELL processors. It is the hallmark of a BGA defect (such as a cracked solder ball). It is by no means limited to the early models, however. These errors have been seen in every model of PS3 with varying frequency. The most reliable consoles appear to be those with a CPU/GPU built on a smaller manufacturing process, such as the Super Slim (SS) models (42xx and later), which have a 45nm CELL and 28nm RSX. The least reliable are the PS2 backwards compatible A-E models, which have a 90nm RSX/CELL.

The root cause is mechanical fatigue due to thermal cycling. The materials used to construct the motherboard and processors have different properties. For example, the coefficient of thermal expansion of the FR4 fiberglass used in the motherboard and processor substrate differs from that of the copper BGA pads, which in turn differs from that of the lead-free solder used to join them. They therefore expand and contract at different rates as the chip heats up and cools down, which applies shearing force to the BGA. Over many thermal cycles this deforms the solder balls and causes a defect (such as a solder crack, a torn trace, or a ball pulling away from its pad).

3034 is triggered when the voltage or data lines connecting the CPU/GPU are broken. A data error (4XXX) often appears as well, but not always. The most common cause is a BGA defect on the RSX, which usually requires a reball/reflow to repair. Something about the RSX construction or workload causes it to fail more frequently, but the CPU can fail too. However, the cause is not always a BGA defect. The bumps on either chip can fail, the Flex IO traces (the data lines that connect the CPU/GPU) can be broken or scratched, or accumulated damage from wear and tear (electromigration) can also cause this error. The true percentage of consoles with BGA defects that can be fixed with a reball/reflow is unknown. However, there is evidence to suggest that the underfill used to reinforce the CPU/GPU die and RSX RAM bumps was not as effective when the PS3 was manufactured. This could explain the many consoles whose reball fails prematurely.

If a reflow/reball of both the CPU and GPU fails, the chip is beyond repair and needs to be replaced. The RSX can be replaced with the same model without modification. It can also be replaced with a different model by using a modchip that injects the correct RSX ID during boot. This has been nicknamed a "Frankenstein Mod." Since they are married to each other, the CPU can only be replaced if the chipset (NAND/NOR and SYSCON chips) is replaced as well. Since the CPU cannot be replaced as easily, a dead CPU is usually considered unrepairable.

3035

CELL and RSX

3036

CELL and RSX

3037

CELL and RSX

3038

CELL and RSX

3039

CELL and RSX

3040

Flash


Data Errors


  • These error codes seem to be repeated up to 5 times for 5 special cases. For example, errors 4001, 4101, 4201, 4301, and 4401 are all related to CELL; the only thing that changes in the error code is the second digit (located immediately after the category). If at some point we find out what that digit means, we can join the wiki page sections together (with titles: "4001, 4101, 4201, 4301, 4401", etc.)

4001

CELL

4002

RSX

4003

Southbridge

4011

CELL

4101

CELL

4102

RSX

4103

Southbridge

4111

CELL

4201

CELL

4202

RSX

4203

Southbridge

4211

CELL

4212

RSX

4221

CELL

4222

RSX

4231

CELL

4261

CELL

4301

CELL

4302

RSX

4303

Southbridge

4311

CELL

4312

RSX

4321

CELL

4322

RSX

4332

RSX

4341

CELL

4401

CELL or RSX

4402

CELL or RSX

4403

CELL or RSX

4411

CELL or RSX

4412

CELL or RSX

4421

CELL or RSX

4422

CELL or RSX

4432

CELL or RSX

4441

CELL or RSX