ia32-generic-qemu: intermittent system reboots during fork() and exit #1012

Open
mateusz-bloch opened this issue Feb 13, 2024 · 11 comments · Fixed by phoenix-rtos/phoenix-rtos-kernel#532

@mateusz-bloch
Member

mateusz-bloch commented Feb 13, 2024

Update

The issue has been reopened as it may also be related to #885 and problems with the psh runfile test on ia32-generic-qemu. Currently, I haven't observed it occurring directly in exit tests.


The problem started occurring with the merge of 28ab383e627fe1d26df5737b12a938fe5ec473a3 in phoenix-rtos-kernel.

We are encountering intermittent system reboots on ia32-generic-qemu. The issue occurs in approximately 5 out of 100 runs of a test that calls fork() followed by test_common.test_exitPtr(EXIT_SUCCESS); in the child. The expected behavior is that a SIGCHLD signal is sent to the parent after the child process exits, but instead the system reboots unexpectedly.

TEST(unistd_exit, SIGCHLD_sent)
{
	/* Test that the SIGCHLD signal is sent after the child exits */
	pid_t pid;
	struct sigaction sa;

	sa.sa_handler = test_sigchldHandler;
	TEST_ASSERT_EQUAL_INT(0, sigemptyset(&sa.sa_mask));
	sa.sa_flags = 0;
	TEST_ASSERT_EQUAL_INT(0, sigaction(SIGCHLD, &sa, NULL));

	/* Check handlerFlag has initial value */
	TEST_ASSERT_EQUAL_INT(0, test_common.test_handlerFlag);

	pid = fork();
	TEST_ASSERT_GREATER_OR_EQUAL(0, pid);
	/* child */
	if (pid == 0) {
		/* Exit right away */
		test_common.test_exitPtr(EXIT_SUCCESS);
	}
	/* parent */
	else {
		time_t start, curr;
		double timeout = 3.0;

		start = curr = time(NULL);
		while (difftime(curr, start) <= timeout) {
			curr = time(NULL);
			if (test_common.test_handlerFlag != 0) {
				break;
			}
		}

		TEST_ASSERT_EQUAL_INT(TEST_EXIT_DUMMY_VAL, test_common.test_handlerFlag);

		int status;
		/* Reap a child */
		TEST_ASSERT_EQUAL_INT(pid, wait(&status));
	}
}

Output from CI: two screenshots (images not preserved here).

Example workflow from github:
https://github.com/phoenix-rtos/phoenix-rtos-ports/actions/runs/7846751690/job/21414331565

Project version: 0f35de2

@damianloew
Contributor

It may be related to:
#885

@mateusz-bloch mateusz-bloch reopened this Apr 12, 2024
@agkaminski
Member

Why reopen? It wasn't related; this issue was caused by a kstack overflow on ia32.

@damianloew
Contributor

  • To investigate this issue further, we will consider adding -d cpu_reset to the QEMU arguments
  • phoenix-rtos-project revision at which the issue occurred in the runfile test:
    9c05930

@damianloew
Contributor

damianloew commented May 14, 2024

The issue was caught again in the exit test. Here is the output with -d cpu_reset enabled:

https://github.com/phoenix-rtos/phoenix-rtos-project/actions/runs/9072269435/job/24927640850

(psh)% /bin/test-libc-exit
  Unity test run 1 of 1
  Triple fault
  CPU Reset (CPU 0)
  EAX=00000013 EBX=000000c8 ECX=000000ba EDX=c011340e
  ESI=00000000 EDI=00000000 EBP=000000c7 ESP=c02d0000
  EIP=c0113409 EFL=00003046 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
  ES =0023 00000000 c0000fff 00ccf300 DPL=3 DS   [-WA]
  CS =0008 00000000 ffffffff 00cf9a00 DPL=0 CS32 [-R-]
  SS =0010 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
  DS =0023 00000000 c0000fff 00ccf300 DPL=3 DS   [-WA]
  FS =0023 00000000 c0000fff 00ccf300 DPL=3 DS   [-WA]
  GS =0033 00001004 bfffffff 00cbf300 DPL=3 DS   [-WA]
  LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
  TR =0028 c0147800 00000068 00008900 DPL=0 TSS32-avl
  GDT=     c0001000 000007ff
  IDT=     c0001800 000007ff
  CR0=80000033 CR2=c02cfffc CR3=002e7000 CR4=00000010
  DR0=00000000 DR1=00000000 DR2=00000000 DR3=00000000 
  DR6=ffff0ff0 DR7=00000400
  CCS=0000000c CCD=00000000 CCO=LOGICL
  EFER=0000000000000000
  FCW=ff90 FSW=8184 [ST=0] FTW=ff MXCSR=00001f80
  FPR0=00000004c031d354 4f84 FPR1=4f8800000004c02e c02e
  FPR2=c01cb65000000004 5000 FPR3=0a0000000234c02e c031
  FPR4=00000001c031d300 d0e0 FPR5=0000c01223f9c02d 0000
  FPR6=0000329200000000 2390 FPR7=000100000000c012 0000
  XMM00=0000000000000000 0000000000000000 XMM01=0000000000000000 0000000000000000
  XMM02=0000000000000000 0000000000000000 XMM03=0000000000000000 0000000000000000
  XMM04=0000000000000000 0000000000000000 XMM05=0000000000000000 0000000000000000
  XMM06=0000000000000000 0000000000000000 XMM07=0000000000000000 0000000000000000
  Triple fault
  CPU Reset (CPU 0)
  EAX=000f6106 EBX=000f3e0a ECX=00000000 EDX=00000cf9
  ESI=00000000 EDI=00100000 EBP=00000000 ESP=00000fc8
  EIP=000efb0a EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
  ES =0010 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
  CS =0008 00000000 ffffffff 00cf9b00 DPL=0 CS32 [-RA]
  SS =0010 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
  DS =0010 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
  FS =0010 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
  GS =0010 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
  LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
  TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS32-busy
  GDT=     000f6180 00000037
  IDT=     000f61be 00000000
  CR0=00000011 CR2=00000000 CR3=00000000 CR4=00000000
  DR0=00000000 DR1=00000000 DR2=00000000 DR3=00000000 
  DR6=ffff0ff0 DR7=00000400
  CCS=000f61c8 CCD=00009e34 CCO=SUBL
  EFER=0000000000000000
  FCW=037f FSW=0000 [ST=0] FTW=00 MXCSR=00001f80
  FPR0=0000000000000000 0000 FPR1=0000000000000000 0000
  FPR2=0000000000000000 0000 FPR3=0000000000000000 0000
  FPR4=0000000000000000 0000 FPR5=0000000000000000 0000
  FPR6=0000000000000000 0000 FPR7=0000000000000000 0000
  XMM00=0000000000000000 0000000000000000 XMM01=0000000000000000 0000000000000000
  XMM02=0000000000000000 0000000000000000 XMM03=0000000000000000 0000000000000000
  XMM04=0000000000000000 0000000000000000 XMM05=0000000000000000 0000000000000000
  XMM06=0000000000000000 0000000000000000 XMM07=0000000000000000 0000000000000000
  SeaBIOS (version 1.15.0-1)
  
  
  iPXE (https://ipxe.org/) 00:03.0 CA00 PCI2.10 PnP PMM+07F8B4A0+07ECB4A0 CA00
  Press Ctrl-B to configure iPXE (PCI 00:03.0)...
                                                                                 
  
  
  Booting from Hard Disk...
  Phoenix-RTOS loader v. 1.21 rev: 2e24763
  hal: IA-32 Generic
  cmd: Executing pre-init script
  console: Setting console to 0.0
  
                                   
  Waiting for input,   900 [ms]
                                   
  Waiting for input,   800 [ms]
                                   
  Waiting for input,   700 [ms]
                                   
  Waiting for input,   600 [ms]
                                   
  Waiting for input,   500 [ms]
                                   
  Waiting for input,   400 [ms]
                                   
  Waiting for input,   300 [ms]
                                   
  Waiting for input,   200 [ms]
                                   
  Waiting for input,   100 [ms]
                                   
  Waiting for input,     0 [ms]
                                   
  Waiting for input,     0 [ms]
  Phoenix-RTOS microkernel v. 3.2 rev: 5570581
  hal: GenuineIntel Family 6 Model 7 Stepping 3 (3/), cores=1
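
One detail worth reading off the first register dump above: CR2 (the address of the last page fault) is c02cfffc, while ESP is c02d0000. The faulting address is exactly one 32-bit push below the stack pointer and lies in the preceding page, which is consistent with a kernel stack that has run out of space. The check below is only a sketch of that arithmetic; the function names are hypothetical and the 4 KiB page size is an assumption about how the kernel stack is mapped.

```c
#include <stdint.h>

/* Assumed page size for the kernel stack mapping (4 KiB IA-32 pages) */
#define PAGE_BYTES 4096u

/* 1 when the faulting address is exactly one 32-bit push below ESP */
int fault_is_push_below_esp(uint32_t esp, uint32_t cr2)
{
	return (uint32_t)(esp - cr2) == 4u;
}

/* 1 when ESP and the faulting address lie in different pages */
int crosses_page(uint32_t esp, uint32_t cr2)
{
	return (esp / PAGE_BYTES) != (cr2 / PAGE_BYTES);
}
```

With esp = 0xc02d0000 and cr2 = 0xc02cfffc both functions return 1. This does not prove a stack overflow on its own, since it only shows that the last page fault was caused by an access just below a page-aligned stack pointer.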

@astalke
Contributor

astalke commented May 21, 2024

I've encountered this issue in an automated test.

phoenix-rtos-tests/libc/exit: FAIL
EXPECTED:
	0: ASSERTION (?P<path>[\\S]+):(?P<line>\\d+):(?P<status>FAIL|INFO|IGNORE): (?P<msg>.*?)\\r
	1: TEST\\((?P<group>\\w+), (?P<name>\\w+)\\) (?P<status>PASS|IGNORE)
	2: TEST\\((?P<group>\\w+), (?P<name>\\w+)\\) (?P<status>FAIL) at (?P<path>.*?):(?P<line>\\d+)\\r
	3: (?P<total>\\d+) Tests (?P<fail>\\d+) Failures (?P<ignore>\\d+) Ignored \\r+\\n(?P<result>OK|FAIL)
GOT:
Unity test run 1 of 1
Triple fault
CPU Reset (CPU 0)
EAX=00000008 EBX=bffffe90 ECX=000000f7 EDX=c011340e
ESI=000000d5 EDI=00000000 EBP=000000f7 ESP=c02d0000
EIP=c0113409 EFL=00003046 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0023 00000000 c0000fff 00ccf300 DPL=3 DS   [-WA]
CS =0008 00000000 ffffffff 00cf9a00 DPL=0 CS32 [-R-]
SS =0010 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
DS =0023 00000000 c0000fff 00ccf300 DPL=3 DS   [-WA]
FS =0023 00000000 c0000fff 00ccf300 DPL=3 DS   [-WA]
GS =0033 00001004 bfffffff 00cbf300 DPL=3 DS   [-WA]
LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
TR =0028 c0147800 00000068 00008900 DPL=0 TSS32-avl
GDT=     c0001000 000007ff
IDT=     c0001800 000007ff
CR0=80000033 CR2=c02cfffc CR3=07fad000 CR4=00000010
DR0=00000000 DR1=00000000 DR2=00000000 DR3=00000000 
DR6=ffff0ff0 DR7=00000400
CCS=0000000c CCD=00000000 CCO=LOGICL
EFER=0000000000000000
FCW=ff90 FSW=8184 [ST=0] FTW=ff MXCSR=00001f80
FPR0=00000004c02e1f54 4f84 FPR1=4f8800000004c02e c02e
FPR2=c01cb65000000004 5000 FPR3=fb0000000234c02e c030
FPR4=00000001c02e1f00 d0e0 FPR5=0000c0122429c02d 0000
FPR6=0000329200000000 23c0 FPR7=000100000000c012 0000
XMM00=0000000000000000 0000000000000000 XMM01=0000000000000000 0000000000000000
XMM02=0000000000000000 0000000000000000 XMM03=0000000000000000 0000000000000000
XMM04=0000000000000000 0000000000000000 XMM05=0000000000000000 0000000000000000
XMM06=0000000000000000 0000000000000000 XMM07=0000000000000000 0000000000000000
Triple fault
CPU Reset (CPU 0)
EAX=000f6106 EBX=000f3e0a ECX=00000000 EDX=00000cf9
ESI=00000000 EDI=00100000 EBP=00000000 ESP=00000fc8
EIP=000efb0a EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0010 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
CS =0008 00000000 ffffffff 00cf9b00 DPL=0 CS32 [-RA]
SS =0010 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
DS =0010 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
FS =0010 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
GS =0010 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS32-busy
GDT=     000f6180 00000037
IDT=     000f61be 00000000
CR0=00000011 CR2=00000000 CR3=00000000 CR4=00000000
DR0=00000000 DR1=00000000 DR2=00000000 DR3=00000000 
DR6=ffff0ff0 DR7=00000400
CCS=000f61c8 CCD=00009e34 CCO=SUBL
EFER=0000000000000000
FCW=037f FSW=0000 [ST=0] FTW=00 MXCSR=00001f80
FPR0=0000000000000000 0000 FPR1=0000000000000000 0000
FPR2=0000000000000000 0000 FPR3=0000000000000000 0000
FPR4=0000000000000000 0000 FPR5=0000000000000000 0000
FPR6=0000000000000000 0000 FPR7=0000000000000000 0000
XMM00=0000000000000000 0000000000000000 XMM01=0000000000000000 0000000000000000
XMM02=0000000000000000 0000000000000000 XMM03=0000000000000000 0000000000000000
XMM04=0000000000000000 0000000000000000 XMM05=0000000000000000 0000000000000000
XMM06=0000000000000000 0000000000000000 XMM07=0000000000000000 0000000000000000
SeaBIOS (version 1.15.0-1)

@mateusz-bloch
Member Author

mateusz-bloch commented Jun 12, 2024

Issue encountered in the psh-history test: https://github.com/phoenix-rtos/phoenix-rtos-project/actions/runs/9477002770/job/26110890308

  (psh)% g,zm_Aux4kpbeNvy
  psh: g,zm_Aux4kpbeNvy not found
  Triple fault
  CPU Reset (CPU 0)
  EAX=00000008 EBX=000000f5 ECX=0807b491 EDX=c011340e
  ESI=bffffd5c EDI=00000003 EBP=bfffff18 ESP=c02d0000
  EIP=c0113409 EFL=00003046 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
  ES =0023 00000000 c0000fff 00ccf300 DPL=3 DS   [-WA]
  CS =0008 00000000 ffffffff 00cf9a00 DPL=0 CS32 [-R-]
  SS =0010 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
  DS =0023 00000000 c0000fff 00ccf300 DPL=3 DS   [-WA]
  FS =0023 00000000 c0000fff 00ccf300 DPL=3 DS   [-WA]
  GS =0033 00001004 bfffffff 00cbf300 DPL=3 DS   [-WA]
  LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
  TR =0028 c0147800 00000068 00008900 DPL=0 TSS32-avl
  GDT=     c0001000 000007ff
  IDT=     c0001800 000007ff
  CR0=80000033 CR2=c02cfffc CR3=07fad000 CR4=00000010
  DR0=00000000 DR1=00000000 DR2=00000000 DR3=00000000 
  DR6=ffff0ff0 DR7=00000400
  CCS=0000000c CCD=00000000 CCO=LOGICL
  EFER=0000000000000000
  FCW=ffa0 FSW=88c5 [ST=1] FTW=ff MXCSR=00001f80
  FPR0=000100000000c012 0000 FPR1=00000004c03266d4 4f84
  FPR2=4f8800000004c02e c02e FPR3=c01cb65000000004 7000
  FPR4=3a0000000234c02e c031 FPR5=00000001c0326680 1040
  FPR6=0000c0122429c02d 0000 FPR7=0000329200000000 23c0
  XMM00=0000000000000000 0000000000000000 XMM01=0000000000000000 0000000000000000
  XMM02=0000000000000000 0000000000000000 XMM03=0000000000000000 0000000000000000
  XMM04=0000000000000000 0000000000000000 XMM05=0000000000000000 0000000000000000
  XMM06=0000000000000000 0000000000000000 XMM07=0000000000000000 0000000000000000
  Triple fault
  CPU Reset (CPU 0)
  EAX=000f6106 EBX=000f3e0a ECX=00000000 EDX=00000cf9
  ESI=00000000 EDI=00100000 EBP=00000000 ESP=00000fc8
  EIP=000efb0a EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
  ES =0010 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
  CS =0008 00000000 ffffffff 00cf9b00 DPL=0 CS32 [-RA]
  SS =0010 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
  DS =0010 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
  FS =0010 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
  GS =0010 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
  LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
  TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS32-busy
  GDT=     000f6180 00000037
  IDT=     000f61be 00000000
  CR0=00000011 CR2=00000000 CR3=00000000 CR4=00000000
  DR0=00000000 DR1=00000000 DR2=00000000 DR3=00000000 
  DR6=ffff0ff0 DR7=00000400
  CCS=000f61c8 CCD=00009e34 CCO=SUBL
  EFER=0000000000000000
  FCW=037f FSW=0000 [ST=0] FTW=00 MXCSR=00001f80
  FPR0=0000000000000000 0000 FPR1=0000000000000000 0000
  FPR2=0000000000000000 0000 FPR3=0000000000000000 0000
  FPR4=0000000000000000 0000 FPR5=0000000000000000 0000
  FPR6=0000000000000000 0000 FPR7=0000000000000000 0000
  XMM00=0000000000000000 0000000000000000 XMM01=0000000000000000 0000000000000000
  XMM02=0000000000000000 0000000000000000 XMM03=0000000000000000 0000000000000000
  XMM04=0000000000000000 0000000000000000 XMM05=0000000000000000 0000000000000000
  XMM06=0000000000000000 0000000000000000 XMM07=0000000000000000 0000000000000000

@astalke
Contributor

astalke commented Jun 13, 2024

I've noticed that in every crash report in this thread, the FPU registers contain garbage that looks like data from the stack. AFAIK this is not supposed to happen, but in theory it shouldn't damage the stack.
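
To make "looks like data from the stack" a bit more concrete, here is a rough sketch that splits the 64-bit mantissa QEMU prints for each FPR into two 32-bit words and counts those that fall into what appears to be the kernel address range. KERNEL_BASE = 0xc0000000 is an assumption inferred from the EIP and ESP values in the dumps, and the function names are hypothetical; a matching word is only a hint, not proof that it is a pointer.

```c
#include <stdint.h>

/* Assumed start of the kernel address range (inferred from EIP/ESP in the dumps) */
#define KERNEL_BASE 0xc0000000u

/* 1 when a 32-bit word falls into the assumed kernel address range */
int looks_like_kernel_pointer(uint32_t word)
{
	return word >= KERNEL_BASE;
}

/* Count the 32-bit halves of n FPR mantissas that look like kernel addresses */
int count_kernel_like_words(const uint64_t *mantissas, int n)
{
	int count = 0;

	for (int i = 0; i < n; ++i) {
		uint32_t lo = (uint32_t)(mantissas[i] & 0xffffffffu);
		uint32_t hi = (uint32_t)(mantissas[i] >> 32);

		count += looks_like_kernel_pointer(lo);
		count += looks_like_kernel_pointer(hi);
	}

	return count;
}
```

For FPR0 and FPR2 of the first crash dump in this thread (00000004c031d354 and c01cb65000000004), one half of each mantissa lands in that range, whereas the all-zero registers printed after the reset give no matches.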

@astalke
Contributor

astalke commented Jun 13, 2024

I've noticed that in every crash report in this thread, the FPU registers contain garbage that looks like data from the stack. AFAIK this is not supposed to happen, but in theory it shouldn't damage the stack.

I've decided to look into it. Since there is an issue with vfork() (#1077), I've checked whether forking works correctly with the FPU (I've checked only fork()). It doesn't, and somehow it caused a page fault at exit. I'll fix that; maybe it will fix this issue as well.

@astalke
Contributor

astalke commented Jun 25, 2024

I've managed to reproduce the issue locally on QEMU 6.2.0 (with 4096M of RAM allocated for the machine). The reproducer is not very efficient: the additional RAM is required to avoid crashes caused by zombie processes. In my case it crashed during the second execution of this program:

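#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h> /* wait() */
#include <unistd.h>   /* fork() */

static void func(size_t id)
{
	if (fork() == 0) {
		/* Grandchild: keep using the x87 FPU for a long time */
		for (size_t i = 0; i < 10000000; ++i) {
			__asm__ volatile ("fwait");
			__asm__ volatile ("fldz");
			__asm__ volatile ("nop");
		}
	}
	else {
		/* Child: touch the FPU once, then reap the grandchild */
		int status;
		__asm__ volatile ("fwait");
		__asm__ volatile ("fldz");
		__asm__ volatile ("nop");
		wait(&status);
		printf("%zu\n", id);
	}
	exit(0);
}

int main(void)
{
	/* Spawn 12800 children; each of them forks once more in func() */
	for (size_t i = 0; i < 12800; ++i) {
		if (fork() == 0) {
			func(i);
		}
	}
	/* Reap the children */
	for (size_t i = 0; i < 12800; ++i) {
		int status;
		wait(&status);
	}
	puts("");

	return 0;
}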

Crash register dump:

Triple fault
CPU Reset (CPU 0)
EAX=00000011 EBX=00000025 ECX=bfffff70 EDX=c011340e
ESI=00000000 EDI=c19cf000 EBP=bfffff58 ESP=c19ba000
EIP=c0113409 EFL=00003046 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0023 00000000 c0000fff 00ccf300 DPL=3 DS [-WA]
CS =0008 00000000 ffffffff 00cf9a00 DPL=0 CS32 [-R-]
SS =0010 00000000 ffffffff 00cf9300 DPL=0 DS [-WA]
DS =0023 00000000 c0000fff 00ccf300 DPL=3 DS [-WA]
FS =0023 00000000 c0000fff 00ccf300 DPL=3 DS [-WA]
GS =0033 00001004 bfffffff 00cbf300 DPL=3 DS [-WA]
LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
TR =0028 c0149800 00000068 00008900 DPL=0 TSS32-avl
GDT= c0001000 000007ff
IDT= c0001800 000007ff
CR0=80000033 CR2=c19b9ffc CR3=bf2c8000 CR4=00000010
DR0=00000000 DR1=00000000 DR2=00000000 DR3=00000000
DR6=ffff0ff0 DR7=00000400
CCS=0000000c CCD=00000000 CCO=LOGICL
EFER=0000000000000000
FCW=ff90 FSW=8187 [ST=0] FTW=ff MXCSR=00001f80
FPR0=00000004c1a0dfd8 ef84 FPR1=ef8800000004c19c c19c
FPR2=c01cd65000000004 1000 FPR3=f90000000234c19d c19f
FPR4=00000001c1a0df80 b040 FPR5=0000c01224c9c19b 0000
FPR6=0000329200000000 2460 FPR7=000100000000c012 0000
XMM00=0000000000000000 0000000000000000 XMM01=0000000000000000 0000000000000000
XMM02=0000000000000000 0000000000000000 XMM03=0000000000000000 0000000000000000
XMM04=0000000000000000 0000000000000000 XMM05=0000000000000000 0000000000000000
XMM06=0000000000000000 0000000000000000 XMM07=0000000000000000 0000000000000000

As you can see, once again there is an issue with garbage data in the FPU. I'll try to reproduce this error again, and then check if my patch works.

EDIT: Another Triple fault.

EAX=00000011 EBX=00000025 ECX=bfffff70 EDX=c011340e
ESI=00000000 EDI=c19cf000 EBP=bfffff58 ESP=c19ba000
EIP=c0113409 EFL=00003046 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0023 00000000 c0000fff 00ccf300 DPL=3 DS [-WA]
CS =0008 00000000 ffffffff 00cf9a00 DPL=0 CS32 [-R-]
SS =0010 00000000 ffffffff 00cf9300 DPL=0 DS [-WA]
DS =0023 00000000 c0000fff 00ccf300 DPL=3 DS [-WA]
FS =0023 00000000 c0000fff 00ccf300 DPL=3 DS [-WA]
GS =0033 00001004 bfffffff 00cbf300 DPL=3 DS [-WA]
LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
TR =0028 c0149800 00000068 00008900 DPL=0 TSS32-avl
GDT= c0001000 000007ff
IDT= c0001800 000007ff
CR0=80000033 CR2=c19b9ffc CR3=bf31b000 CR4=00000010
DR0=00000000 DR1=00000000 DR2=00000000 DR3=00000000
DR6=ffff0ff0 DR7=00000400
CCS=0000000c CCD=00000000 CCO=LOGICL
EFER=0000000000000000
FCW=ff90 FSW=8187 [ST=0] FTW=ff MXCSR=00001f80
FPR0=00000004c19d20d8 ef84 FPR1=ef8800000004c19c c19c
FPR2=c01cd65000000004 1000 FPR3=da0000000234c19d c19f
FPR4=00000001c19d2080 b040 FPR5=0000c01224c9c19b 0000
FPR6=0000329200000000 2460 FPR7=000100000000c012 0000
XMM00=0000000000000000 0000000000000000 XMM01=0000000000000000 0000000000000000
XMM02=0000000000000000 0000000000000000 XMM03=0000000000000000 0000000000000000
XMM04=0000000000000000 0000000000000000 XMM05=0000000000000000 0000000000000000
XMM06=0000000000000000 0000000000000000 XMM07=0000000000000000 0000000000000000

@astalke
Contributor

astalke commented Jun 26, 2024

I've found the reason for the triple fault. When we execute fsave in the exception handler, the CPU reports exception 16 (x87 floating-point error). Since the exception handler itself contains fsave, the exception is raised again on every entry, so we are stuck in an infinite loop of nested exceptions until we run out of stack space and triple fault.
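
The loop described above can be illustrated with a toy model in plain C. This is not kernel code: the structure, its field names and the sizes used with it are hypothetical. It relies on one architectural fact, namely that fsave is the waiting form (it behaves like fwait followed by fnsave), so a pending unmasked x87 exception is reported before the state is saved, while fnsave does not perform that check. Whether switching to the non-waiting form would be an appropriate fix for the kernel is not established in this thread.

```c
#include <stddef.h>

/* Toy model of nested exception 16 entries; all values are hypothetical */
typedef struct {
	size_t stack_bytes;  /* kernel stack size */
	size_t frame_bytes;  /* stack consumed by one exception entry, must be > 0 */
	int pending_fpu_exc; /* nonzero: an unmasked x87 exception is pending */
	int waiting_save;    /* nonzero: the handler saves state with the waiting form (fsave) */
} fault_model_t;

/* Returns the nesting depth at which the handler completes,
 * or -1 when the stack is exhausted first (or frame_bytes is 0). */
long simulate_exc16(const fault_model_t *m)
{
	size_t used = 0;
	long depth = 0;

	if (m->frame_bytes == 0) {
		return -1;
	}

	for (;;) {
		/* Entering the handler pushes another frame */
		if (used + m->frame_bytes > m->stack_bytes) {
			return -1;
		}
		used += m->frame_bytes;
		depth++;

		/* The waiting save reports the still pending exception again,
		 * so the handler is re-entered before the state is saved. */
		if (!(m->pending_fpu_exc && m->waiting_save)) {
			return depth;
		}
	}
}
```

With a pending exception and waiting_save set, the model always ends with -1 regardless of the stack size, which mirrors the behavior seen in the dumps; with either flag cleared it completes at depth 1.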

@astalke
Contributor

astalke commented Jun 27, 2024

I've submitted changes that decrease the likelihood of a crash in this branch: https://github.com/phoenix-rtos/phoenix-rtos-kernel/tree/astalke/RTOS-858 (at least with the test code I included in one of the comments above).

Unfortunately, these changes don't fix the issue, and I think the last commit may cause errors in FPU calculations. I don't have enough time to make a proper fix.
