syscall: syscall/sysretq subsystem for ring 3 → ring 0 transitions #31

Open
emilf wants to merge 1 commit into main from syscall-system

@emilf emilf commented May 26, 2026

Summary

Implements the full x86-64 syscall/sysretq path for ring 3 → ring 0 transitions, with an embedded ring 3 self-test that prints to the debug serial port via syscall.

Architecture

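The central design challenge is an ABI mismatch. The UEFI PE target compiles extern "C" functions with the Microsoft x64 ABI (arguments in RCX/RDX/R8/R9), while user code issues syscall with a SysV-style register convention (RAX = syscall number; arguments in RDI/RSI/RDX/R10/R8/R9, where R10 stands in for RCX because the syscall instruction overwrites RCX with the return RIP). The solution is an assembly trampoline that saves all user registers into a SyscallFrame, then calls the Rust dispatcher with just 2 arguments, which fit cleanly in the Microsoft ABI's register arguments.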
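To make the bridge concrete, here is a minimal sketch of what the frame and the two-argument entry point could look like. The field order, the `args` helper, `ENOSYS_SKETCH`, and the name `syscall_dispatch_sketch` are assumptions for illustration; the PR's actual `SyscallFrame` and `syscall_dispatch` may differ.

```rust
/// Hypothetical register snapshot pushed by the trampoline.
/// Field names and order are assumptions, not the PR's actual layout.
#[repr(C)]
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
pub struct SyscallFrame {
    pub rax: u64, // syscall number on entry, return value on exit
    pub rdi: u64, // argument 0
    pub rsi: u64, // argument 1
    pub rdx: u64, // argument 2
    pub r10: u64, // argument 3 (RCX is overwritten by `syscall`)
    pub r8: u64,  // argument 4
    pub r9: u64,  // argument 5
    pub rcx: u64, // user RIP saved by `syscall`
    pub r11: u64, // user RFLAGS saved by `syscall`
}

impl SyscallFrame {
    /// The six argument registers in syscall order.
    pub fn args(&self) -> [u64; 6] {
        [self.rdi, self.rsi, self.rdx, self.r10, self.r8, self.r9]
    }
}

/// Hypothetical "not implemented" return value.
pub const ENOSYS_SKETCH: u64 = u64::MAX;

/// Two-argument dispatcher entry. On a Microsoft x64 target, `frame`
/// arrives in RCX and `nr` in RDX, so no stack arguments are needed.
pub extern "C" fn syscall_dispatch_sketch(frame: &mut SyscallFrame, nr: u64) -> u64 {
    let ret = match nr {
        0 => 0, // a no-op syscall
        _ => ENOSYS_SKETCH,
    };
    // A restore path would pop this value back into RAX before sysretq.
    frame.rax = ret;
    ret
}
```

Passing the frame by pointer keeps the Rust side independent of how many registers the trampoline saves: adding a field changes the struct, not the call signature.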

Files Changed

| File | Change |
| --- | --- |
| kernel/src/gdt.rs | Added user_data (0x18) and user_code (0x20) segments; TSS moved to 0x28/0x30 |
| kernel/src/lib.rs | Added pub mod syscall |
| kernel/src/config.rs | Added RUN_SYSCALL_TEST flag |
| kernel/src/environment.rs | Mapped syscall stack, added syscall_init() call, test hook |
| kernel/src/syscall/mod.rs | Subsystem init, MSR programming (STAR/LSTAR/SFMASK) |
| kernel/src/syscall/entry.rs | Assembly trampoline with ABI bridge |
| kernel/src/syscall/dispatch.rs | SyscallFrame, static dispatch table, handlers |
| kernel/src/syscall/percpu.rs | PerCpuData for GS-relative access, syscall stack |
| kernel/src/syscall/usermode.rs | Ring 3 launcher (iretq), embedded test binary |
| kernel/src/syscall/user_test.asm | NASM flat binary for ring 3 self-test |
| docs/plans/syscall-system.md | Design document |

Test Results

With RUN_SYSCALL_TEST=true, the kernel boots, enters ring 3, and produces:

Three syscalls verified end-to-end:

  • SYS_NULL (0): no-op, returns 0
  • SYS_GET_TICKS (2): returns LAPIC timer tick count
  • SYS_WRITE_SERIAL (1): writes to QEMU debug port 0xE9 (with SMAP-safe STAC/CLAC)

MSR verification at boot (downgraded to log_debug in production):

Known Limitations (addressed in design doc)

  • User stack is minimal (single page at 0xFFFFFFF0); a timer interrupt arriving while in ring 3 will page-fault (#PF), since there is no user interrupt handler yet
  • SYS_WRITE_SERIAL does no pointer validation; it trusts the embedded test binary for now
  • RUN_SYSCALL_TEST defaults to false; kernel boots cleanly without it
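For reference, the missing pointer validation could start with a plain range check like the sketch below. It assumes user space occupies the canonical lower half of the address space, which this PR does not state; `USER_ADDR_LIMIT` and `user_range_ok` are hypothetical names. A range check alone is not sufficient: a real implementation would also have to confirm that the pages are mapped and user-accessible.

```rust
/// First address above the canonical lower half with 48-bit virtual
/// addressing. Treating this as the user-space limit is an assumption.
pub const USER_ADDR_LIMIT: u64 = 0x0000_8000_0000_0000;

/// Returns true if `[ptr, ptr + len)` lies entirely below the limit,
/// does not wrap around, and does not start at the null address.
pub fn user_range_ok(ptr: u64, len: u64) -> bool {
    match ptr.checked_add(len) {
        Some(end) => ptr != 0 && end <= USER_ADDR_LIMIT,
        None => false, // ptr + len overflowed u64
    }
}
```

A handler such as sys_write_serial would call this before executing STAC and reject the request if it returns false.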

Reviewers

Please pay attention to:

  1. The STAR MSR calculation — GDT layout comments verify it produces correct selectors
  2. The assembly entry/restore path — verify no register leaks between ring 3 and ring 0
  3. SMAP handling in sys_write_serial (STAC/CLAC)

Implements the x86-64 syscall path for TheseusOS:

GDT:
  - Add ring 3 user segments (user_data at 0x18, user_code at 0x20)
  - Move TSS descriptor from 0x18/0x20 to 0x28/0x30
  - Expose KERNEL_CS, KERNEL_SS, USER_DS, USER_CS constants

Syscall MSRs (STAR/LSTAR/SFMASK):
  - STAR: syscall loads kernel CS=0x08 (and SS=0x10); sysretq derives user CS=0x20 and SS=0x18 from base 0x10, with RPL 3 applied by hardware (0x23/0x1B)
  - LSTAR: syscall_entry trampoline address
  - SFMASK: clears IF/TF/AC on entry
  - Verified at runtime against programmed values
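
The selector arithmetic behind these MSR values can be checked with a small sketch. The function names below are hypothetical; the bit positions follow the architectural definition of STAR and RFLAGS, and the sysret base of 0x10 is derived from the user CS of 0x20 listed above.

```rust
pub const KERNEL_CS: u16 = 0x08;
pub const USER_DS: u16 = 0x18;
pub const USER_CS: u16 = 0x20;

/// Builds a STAR value.
/// STAR[47:32]: `syscall` loads CS from this field and SS from field + 8.
/// STAR[63:48]: `sysretq` loads CS from field + 16 and SS from field + 8.
pub const fn star_value(syscall_base: u16, sysret_base: u16) -> u64 {
    ((sysret_base as u64) << 48) | ((syscall_base as u64) << 32)
}

/// Selectors that a 64-bit `sysretq` loads for a given STAR[63:48] base.
/// The CPU forces the requested privilege level to 3.
pub const fn sysret_selectors(sysret_base: u16) -> (u16, u16) {
    let cs = (sysret_base + 16) | 3;
    let ss = (sysret_base + 8) | 3;
    (cs, ss)
}

pub const RFLAGS_TF: u64 = 1 << 8; // trap flag
pub const RFLAGS_IF: u64 = 1 << 9; // interrupt enable flag
pub const RFLAGS_AC: u64 = 1 << 18; // alignment check / SMAP override

/// Bits set in SFMASK are cleared in RFLAGS on `syscall` entry.
pub const SFMASK_VALUE: u64 = RFLAGS_IF | RFLAGS_TF | RFLAGS_AC;
```

With a sysret base of 0x10, the derived selectors are 0x23 and 0x1B, which are USER_CS and USER_DS with RPL 3. This is also why user_data must sit directly below user_code in the GDT.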

Assembly trampoline (kernel/src/syscall/entry.rs):
  - swapgs → save user RSP → kernel stack switch → push frame
  - SysV (user) → Microsoft x64 (kernel) ABI bridge
  - call syscall_dispatch → restore → swapgs → sysretq

Dispatch table (kernel/src/syscall/dispatch.rs):
  - Static 16-entry table, SYS_NULL(0), SYS_WRITE_SERIAL(1), SYS_GET_TICKS(2)
  - STAC/CLAC for SMAP-safe user buffer access
  - SyscallFrame exposes all 6 SysV argument registers
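
A static table of this shape might look roughly like the following sketch. The handler signature, the `ENOSYS` value, and the placeholder handler bodies are assumptions; only the table size and the three syscall numbers come from this PR.

```rust
pub const SYS_NULL: usize = 0;
pub const SYS_WRITE_SERIAL: usize = 1;
pub const SYS_GET_TICKS: usize = 2;
pub const TABLE_LEN: usize = 16;

/// Hypothetical error value for unknown syscall numbers.
pub const ENOSYS: u64 = u64::MAX;

/// Hypothetical handler type: receives the six argument registers.
pub type Handler = fn(&[u64; 6]) -> u64;

fn sys_null(_args: &[u64; 6]) -> u64 {
    0
}

/// Placeholder: reports the requested length (args[1]) as written.
fn sys_write_serial(args: &[u64; 6]) -> u64 {
    args[1]
}

/// Placeholder: a real handler would read the timer tick counter.
fn sys_get_ticks(_args: &[u64; 6]) -> u64 {
    42
}

/// Built at compile time; unused slots stay `None`.
pub static TABLE: [Option<Handler>; TABLE_LEN] = {
    let mut t: [Option<Handler>; TABLE_LEN] = [None; TABLE_LEN];
    t[SYS_NULL] = Some(sys_null as Handler);
    t[SYS_WRITE_SERIAL] = Some(sys_write_serial as Handler);
    t[SYS_GET_TICKS] = Some(sys_get_ticks as Handler);
    t
};

/// Bounds-checked lookup: out-of-range or empty slots return ENOSYS.
pub fn dispatch(nr: u64, args: &[u64; 6]) -> u64 {
    let slot = usize::try_from(nr)
        .ok()
        .and_then(|i| TABLE.get(i))
        .copied()
        .flatten();
    match slot {
        Some(handler) => handler(args),
        None => ENOSYS,
    }
}
```

Using `Option<Handler>` rather than a default stub makes an unassigned slot distinguishable from a handler that legitimately returns an error.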

Per-CPU data (kernel/src/syscall/percpu.rs):
  - BSP static PerCpuData struct for GS-relative access
  - 32 KiB dedicated syscall kernel stack (.bss.stack)
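
As an illustration of GS-relative access, the sketch below pins field offsets with `repr(C)` so that assembly can address them as fixed displacements from the GS base. The field names, their order, and the `stack_top` helper are assumptions; only the struct name and the 32 KiB stack size come from this PR.

```rust
/// Hypothetical per-CPU block. With `repr(C)` the offsets are stable,
/// so the trampoline can use fixed displacements such as gs:[0], gs:[8].
#[repr(C)]
pub struct PerCpuData {
    pub kernel_rsp: u64,       // top of the syscall kernel stack
    pub user_rsp_scratch: u64, // user RSP parked here during a syscall
    pub cpu_id: u64,
}

// Offsets exported as constants so assembly and Rust cannot drift apart.
pub const OFF_KERNEL_RSP: usize = core::mem::offset_of!(PerCpuData, kernel_rsp);
pub const OFF_USER_RSP: usize = core::mem::offset_of!(PerCpuData, user_rsp_scratch);

pub const SYSCALL_STACK_SIZE: usize = 32 * 1024;

/// Backing storage for the syscall stack, 16-byte aligned for the ABI.
#[repr(C, align(16))]
pub struct SyscallStack(pub [u8; SYSCALL_STACK_SIZE]);

/// The stack grows down, so the initial RSP is one past the last byte.
pub fn stack_top(base: u64) -> u64 {
    base + SYSCALL_STACK_SIZE as u64
}
```

`core::mem::offset_of!` requires Rust 1.77 or newer; on older toolchains the offsets would have to be written by hand and checked with assertions.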

Ring 3 test (kernel/src/syscall/usermode.rs):
  - Embedded flat binary (nasm) at 4 GiB user address
  - Mapped code + stack pages, switched to user-accessible after copy
  - iretq transition to ring 3
  - All three syscalls verified in QEMU output

Boot integration:
  - Syscall stack mapped alongside IST stacks in bring-up
  - syscall_init() called after setup_msrs() in continue_after_stack_switch()
  - RUN_SYSCALL_TEST config flag toggles ring 3 self-test

Design doc at docs/plans/syscall-system.md