syscall: syscall/sysretq subsystem for ring 3 → ring 0 transitions #31
Open
emilf wants to merge 1 commit into
Implements the x86-64 syscall path for TheseusOS:

GDT:
- Add ring 3 user segments (user_data at 0x18, user_code at 0x20)
- Move TSS descriptor from 0x18/0x20 to 0x28/0x30
- Expose KERNEL_CS, KERNEL_SS, USER_DS, USER_CS constants

Syscall MSRs (STAR/LSTAR/SFMASK):
- STAR: kernel CS=0x08, sysretq computes user CS=0x20, SS=0x18
- LSTAR: syscall_entry trampoline address
- SFMASK: clears IF/TF/AC on entry
- Verified at runtime against programmed values

Assembly trampoline (kernel/src/syscall/entry.rs):
- swapgs → save user RSP → kernel stack switch → push frame
- SysV (user) → Microsoft x64 (kernel) ABI bridge
- call syscall_dispatch → restore → swapgs → sysretq

Dispatch table (kernel/src/syscall/dispatch.rs):
- Static 16-entry table, SYS_NULL(0), SYS_WRITE_SERIAL(1), SYS_GET_TICKS(2)
- STAC/CLAC for SMAP-safe user buffer access
- SyscallFrame exposes all 6 SysV argument registers

Per-CPU data (kernel/src/syscall/percpu.rs):
- BSP static PerCpuData struct for GS-relative access
- 32 KiB dedicated syscall kernel stack (.bss.stack)

Ring 3 test (kernel/src/syscall/usermode.rs):
- Embedded flat binary (nasm) at 4 GiB user address
- Mapped code + stack pages, switched to user-accessible after copy
- iretq transition to ring 3
- All three syscalls verified in QEMU output

Boot integration:
- Syscall stack mapped alongside IST stacks in bring-up
- syscall_init() called after setup_msrs() in continue_after_stack_switch()
- RUN_SYSCALL_TEST config flag toggles ring 3 self-test

Design doc at docs/plans/syscall-system.md
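The STAR and SFMASK values implied by the selector layout above can be checked with a little arithmetic. The following is only a sketch, not the code in this PR: the helper names (`star_value`, `syscall_selectors`, `sysret_selectors`) are hypothetical, `KERNEL_SS = 0x10` is inferred from the rule that `syscall` loads SS from the kernel CS selector plus 8, and the RPL handling follows Intel's documented `sysretq` behavior (RPL forced to 3 on return).

```rust
// Architectural MSR indices for the syscall mechanism.
pub const IA32_STAR: u32 = 0xC000_0081;
pub const IA32_LSTAR: u32 = 0xC000_0082;
pub const IA32_FMASK: u32 = 0xC000_0084;

// Selector constants named in the commit message. KERNEL_SS is an
// assumption: the source only states kernel CS = 0x08.
pub const KERNEL_CS: u16 = 0x08;
pub const KERNEL_SS: u16 = 0x10;
pub const USER_DS: u16 = 0x18;
pub const USER_CS: u16 = 0x20;

// RFLAGS bits cleared on entry: TF (bit 8), IF (bit 9), AC (bit 18).
pub const SFMASK_VALUE: u64 = (1 << 8) | (1 << 9) | (1 << 18);

/// Hypothetical helper: build STAR from the kernel CS and the 64-bit user CS.
/// Bits 47:32 hold the syscall CS; bits 63:48 hold the sysret base, from
/// which 64-bit sysretq derives CS = base + 16 and SS = base + 8.
pub const fn star_value(kernel_cs: u16, user_cs: u16) -> u64 {
    let sysret_base = (user_cs - 16) as u64;
    (sysret_base << 48) | ((kernel_cs as u64) << 32)
}

/// Selectors loaded by `syscall`: (CS, SS) = (STAR[47:32], STAR[47:32] + 8).
pub const fn syscall_selectors(star: u64) -> (u16, u16) {
    let cs = (star >> 32) as u16;
    (cs, cs + 8)
}

/// Selectors loaded by 64-bit `sysretq`, with RPL 3 applied: (CS, SS).
pub const fn sysret_selectors(star: u64) -> (u16, u16) {
    let base = (star >> 48) as u16;
    ((base + 16) | 3, (base + 8) | 3)
}
```

With the layout above, `star_value(KERNEL_CS, USER_CS)` yields a sysret base of 0x10, which is why user_data has to sit at 0x18 directly below user_code at 0x20, and why the TSS descriptor had to move out of those slots.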
Summary
Implements the full x86-64 `syscall`/`sysretq` path for ring 3 → ring 0 transitions, with an embedded ring 3 self-test that prints to the debug serial port via syscall.

Architecture
The central design challenge is the ABI mismatch: the UEFI PE target compiles `extern "C"` as the Microsoft x64 ABI (RCX/RDX/R8/R9), while `syscall` uses the SysV convention (RAX=number, args RDI/RSI/RDX/R10/R8/R9). The solution is an assembly trampoline that saves all user registers into a `SyscallFrame`, then calls the Rust dispatcher with just 2 arguments (fitting cleanly in the Microsoft ABI).

Files Changed
- `kernel/src/gdt.rs`
- `kernel/src/lib.rs`: `pub mod syscall`
- `kernel/src/config.rs`: `RUN_SYSCALL_TEST` flag
- `kernel/src/environment.rs`: `syscall_init()` call, test hook
- `kernel/src/syscall/mod.rs`
- `kernel/src/syscall/entry.rs`
- `kernel/src/syscall/dispatch.rs`: `SyscallFrame`, static dispatch table, handlers
- `kernel/src/syscall/percpu.rs`: `PerCpuData` for GS-relative access, syscall stack
- `kernel/src/syscall/usermode.rs`: (`iretq`), embedded test binary
- `kernel/src/syscall/user_test.asm`
- `docs/plans/syscall-system.md`

Test Results
With `RUN_SYSCALL_TEST=true`, the kernel boots, enters ring 3, and produces:

Three syscalls verified end-to-end:
MSR verification at boot (downgraded to `log_debug` in production):

Known Limitations (addressed in design doc)
- `SYS_WRITE_SERIAL`: trusts embedded test binary for now
- `RUN_SYSCALL_TEST` defaults to `false`; kernel boots cleanly without it

Reviewers
Please pay attention to:
- `sys_write_serial` (STAC/CLAC)
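
As a reference point when reviewing the dispatch path, here is a minimal hosted sketch of a static 16-entry table dispatching on a `SyscallFrame`. It is not the code in `dispatch.rs`: the frame layout, the `(nr, frame)` dispatcher signature, the `ENOSYS` value, the `SERIAL_SINK` buffer, and the `TICKS` counter are all assumptions for illustration, and it uses `std` so it can run outside the kernel. In the real handler, the user-buffer read would sit between STAC and CLAC.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Mutex;

/// The six SysV argument registers saved by the trampoline (layout assumed).
#[repr(C)]
#[derive(Clone, Copy, Debug, Default)]
pub struct SyscallFrame {
    pub rdi: u64,
    pub rsi: u64,
    pub rdx: u64,
    pub r10: u64,
    pub r8: u64,
    pub r9: u64,
}

pub const SYS_NULL: usize = 0;
pub const SYS_WRITE_SERIAL: usize = 1;
pub const SYS_GET_TICKS: usize = 2;
/// Hypothetical error value for unknown syscall numbers.
pub const ENOSYS: u64 = u64::MAX;

type Handler = fn(&SyscallFrame) -> u64;

// Hosted stand-ins for the debug serial port and the tick counter.
pub static SERIAL_SINK: Mutex<Vec<u8>> = Mutex::new(Vec::new());
pub static TICKS: AtomicU64 = AtomicU64::new(0);

fn sys_null(_frame: &SyscallFrame) -> u64 {
    0
}

fn sys_write_serial(frame: &SyscallFrame) -> u64 {
    let (ptr, len) = (frame.rdi as *const u8, frame.rsi as usize);
    // Kernel version: STAC before this read, CLAC right after it, and the
    // pointer range should be validated first (not done in this sketch).
    let bytes = unsafe { std::slice::from_raw_parts(ptr, len) };
    SERIAL_SINK.lock().unwrap().extend_from_slice(bytes);
    len as u64
}

fn sys_get_ticks(_frame: &SyscallFrame) -> u64 {
    TICKS.load(Ordering::Relaxed)
}

/// Static 16-entry table; unused slots stay `None`.
static TABLE: [Option<Handler>; 16] = {
    let mut table: [Option<Handler>; 16] = [None; 16];
    table[SYS_NULL] = Some(sys_null);
    table[SYS_WRITE_SERIAL] = Some(sys_write_serial);
    table[SYS_GET_TICKS] = Some(sys_get_ticks);
    table
};

/// Two-argument dispatcher: syscall number plus a pointer to the saved frame.
pub fn syscall_dispatch(nr: u64, frame: &SyscallFrame) -> u64 {
    match TABLE.get(nr as usize).copied().flatten() {
        Some(handler) => handler(frame),
        None => ENOSYS,
    }
}
```

Keeping the dispatcher at two arguments means the trampoline only has to load RCX and RDX before the call, so the SysV register set never needs to be remapped argument by argument.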