ECC error in PNOR flash in section offset 0x00091000 #5823
Comments
There are very few people around the last couple of weeks of the year, and unfortunately nobody who's familiar with Power 8. I can give you a couple of things to try though:
|
Thanks for the reply. The suggestions you gave should technically resolve the issue. Any idea how I would re-flash the PNOR image? Replacement is not an option, since this part is not readily available and the few replacement options I found cost more than the server itself. I have downloaded the latest firmware package, which contains the PNOR and BMC firmware. Unfortunately, I cannot access the machine through ipmitool to flash the firmware because I don't remember the machine's IP address. Any idea how I can find the IP address so that I can connect through IPMI? I tried sniffing for the IP with Wireshark but was not successful. Thanks for your time. |
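You could ping the machine if you remember the alias; that will give you its IP. I found this in our P8 documentation to flash the new images:
ipmitool -H <IP> -z 20000 -I lanplus -U <user> -P <password> hpm upgrade <image> component <0|1|2>
Components 0 and 1 are BMC images; 2 is the PNOR. |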
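The flashing command quoted in this thread can be wrapped in a small helper so that the component number is checked before anything is sent to the BMC. This is only a sketch built from the flags given in the comment; the function name, the example address, the credentials and the image file name are all hypothetical placeholders, and nothing here has been verified against an 8335-GTB.

```python
import shlex

# Component numbering as stated in the thread: 0 and 1 are BMC images, 2 is the PNOR.
HPM_COMPONENTS = {0: "BMC image", 1: "BMC image", 2: "PNOR"}


def build_hpm_upgrade_cmd(host, user, password, image, component):
    """Return the ipmitool argument list for one HPM upgrade (hypothetical helper)."""
    if component not in HPM_COMPONENTS:
        raise ValueError(f"component must be one of {sorted(HPM_COMPONENTS)}")
    return [
        "ipmitool", "-H", host, "-z", "20000", "-I", "lanplus",
        "-U", user, "-P", password,
        "hpm", "upgrade", image, "component", str(component),
    ]


# Placeholder values for illustration only; substitute the real BMC address,
# credentials and image file from the firmware package.
example_cmd = build_hpm_upgrade_cmd("192.0.2.10", "user", "password", "pnor.hpm", 2)
example_line = shlex.join(example_cmd)
```

Printing `example_line` gives a command that can be reviewed and pasted into a shell; passing `example_cmd` to `subprocess.run` would execute it directly once the BMC is reachable.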
Thanks.
I have the firmware update instructions and the latest firmware.
But I don't remember the IP address or the host name, so I can't ping or access the machine through IPMI.
Any idea how I can get access in this situation?
Best regards,
Adeel Akram
…On Wed, Dec 27, 2023, 7:31 PM Ilya Smirnov ***@***.***> wrote:
You could ping the machine if you remember the alias - that will give you its IP.
I found this in our P8 documentation to flash the new images:
ipmitool -H <IP> -z 20000 -I lanplus -U <user> -P <password> hpm upgrade <image> component <0|1|2>
0,1 are BMC images, 2 is the PNOR
|
Without the BMC's IP address your options are pretty limited. The entire service model is based around the BMC. Note that the BMC should have a completely separate ethernet connection compared to the "system" itself. The PDF at https://public.dhe.ibm.com/systems/power/docs/hw/p8/p8eik_install_8335.pdf has a good diagram in Figure 17. Use the left Ethernet port for the BMC/IPMI interface (as eth0). Use the right Ethernet port for any direct OS usage (as eth1). Once you get BMC access again there are a few things you can try. Do you see multiple failed boot attempts on each power on? There are multiple sides to the PNOR and a golden side fallback that is supposed to kick in to recover from failures like this. |
Thanks for your time. I am aware of the separate BMC port, and that is the one I am connected to. I know this because it is the port that gives me display output over the serial connection to the machine. The only issue is that I am unable to establish an IPMI connection, since I don't remember the IP address or hostname of the machine. I tried sniffing the network connection with Wireshark but was not successful in detecting any IP address. I only see the same boot failure message shown in the screenshot attached to my first message. Is there a way to manually switch to the golden side of the PNOR image on this machine? |
The BMC is where all of the control is; there are no other external interfaces. If you can't get into the BMC somehow, there isn't much you can do. Have you gone through all of the service documents at the page I posted? There might be some other way of getting into the BMC. I'm pretty sure there is a raw serial port somewhere that you can use for BMC (vs Host) access. |
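One thing that may be worth trying when passive sniffing finds nothing is an active sweep. IPMI-over-LAN uses RMCP on UDP port 623, and many BMCs answer an ASF presence ping on that port without credentials. The sketch below sends that ping to every host in a subnet and lists the addresses that answer with a pong. It assumes the BMC has an address in a subnet you can guess and that its firmware answers presence pings; neither is confirmed for this machine, and the function names are illustrative.

```python
import ipaddress
import socket
import struct
import time

RMCP_PORT = 623   # UDP port used by RMCP / IPMI-over-LAN
ASF_IANA = 4542   # ASF IANA enterprise number (0x000011BE)


def build_presence_ping(tag=0):
    """Build the 12-byte RMCP/ASF presence ping."""
    # RMCP header: version 0x06, reserved, sequence 0xFF (no ack), class 0x06 (ASF)
    rmcp_header = bytes([0x06, 0x00, 0xFF, 0x06])
    # ASF body: IANA number, message type 0x80 (ping), tag, reserved, data length 0
    asf_body = struct.pack(">IBBBB", ASF_IANA, 0x80, tag & 0xFF, 0x00, 0x00)
    return rmcp_header + asf_body


def is_presence_pong(data):
    """Return True if data looks like an ASF presence pong (message type 0x40)."""
    if len(data) < 12:
        return False
    iana, msg_type = struct.unpack(">IB", data[4:9])
    return data[0] == 0x06 and data[3] == 0x06 and iana == ASF_IANA and msg_type == 0x40


def sweep(cidr, wait_s=2.0):
    """Ping every host in cidr and return the addresses that answered with a pong."""
    ping = build_presence_ping()
    found = set()
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(0.2)
        for host in ipaddress.ip_network(cidr, strict=False).hosts():
            try:
                sock.sendto(ping, (str(host), RMCP_PORT))
            except OSError:
                continue  # unreachable or unroutable address; skip it
        deadline = time.monotonic() + wait_s
        while time.monotonic() < deadline:
            try:
                data, (addr, _port) = sock.recvfrom(512)
            except OSError:
                continue  # timeout or ICMP error; keep listening until the deadline
            if is_presence_pong(data):
                found.add(addr)
    return sorted(found, key=ipaddress.ip_address)
```

Calling, for example, `sweep("192.168.0.0/24")` from a laptop cabled to the BMC port would return any responders in that range; common private ranges can be tried in turn. An empty result does not prove the BMC is down, since it may sit in a different subnet or have no address at all if it expects DHCP and no server is present.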
Hi Gurus,
I am trying to bring an S822LC (8335-GTB) back to life and use it for AI workloads.
The system has 2 x POWER8 10-core processors, 512GB RAM and 4 x Nvidia P100 GPUs.
After a power failure, the machine gets stuck at boot with the error message below:
ECC error in PNOR flash in section offset 0x00091000
System shutting down with error status 0x60F
System shutting down with error status 0x90000A79
Can anyone suggest how to recover from this?
I am willing to compensate anyone for the time and effort they put into helping me resolve this.