
HLSL backend (codegen + D3D11 renderer)#2897

Open
soufianekhiat wants to merge 17 commits into
AcademySoftwareFoundation:mainfrom
soufianekhiat:main

Conversation

@soufianekhiat soufianekhiat commented May 7, 2026

[Image: HLSL GLSL Diff (x8)]

Adds MaterialXGenHlsl and MaterialXRenderHlsl, Python and JS bindings, and the matching libraries under libraries/{stdlib,pbrlib,nprlib,lights}/genhlsl/. Existing GLSL / MSL / Slang code is untouched.

The codegen reuses the GLSL .glsl impl files via file="../genglsl/..." and runs a small post-emit pass for the GLSL→HLSL deltas (mix→lerp, dFdx→ddx, vector-splat C-cast, texture()→mx_texture_sample, etc.). Two HLSL-native helpers (mx_math.hlsl, mx_texture.hlsl) fill in the rest.

The renderer is D3D11, using FXC for SM 5.x and DXC (loaded dynamically) for SM 6+. HlslMaterial owns per-stage cbuffers with a CPU mirror so partial uniform writes don't clobber neighbours: D3D11 cbuffers are stateful and have no glProgramUniform analogue. HlslTextureHandler is an ImageHandler subclass. HlslRenderer auto-binds camera, lights, env, file textures and multi-mesh geometry from reflection.

Tests: 7 cases / 3369 assertions in [genhlsl], 19 / 213 in [renderhlsl]. Headless CI safe: every D3D-touching case skips cleanly when tryCreateContext() returns null. The full FXC compile sweep (~22 min) is tagged [!slow] so quick CI runs can skip it.

Validation: 31 materials rendered side-by-side against GLSL on the shaderball (StandardSurface, OpenPBR, glTF PBR, Disney, SimpleHair), plus the 15-material chess_set scene. All visually identical; 30/31 single-material RMSE under 2 on the 0-255 scale. See Compare.
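For readers who want to reproduce the RMSE figure, here is a minimal C++ sketch of an RMSE over two 8-bit buffers on the 0-255 scale. The PR does not say how its metric was computed (per channel, per pixel, or over a cropped region), so the function name `rmse8Bit` and the choice to average over every byte are assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of the PR: root-mean-square error between
// two equally sized 8-bit buffers, reported on the 0-255 scale.
double rmse8Bit(const std::vector<std::uint8_t>& a, const std::vector<std::uint8_t>& b)
{
    // Both images must have the same number of samples and be non-empty.
    assert(a.size() == b.size() && !a.empty());

    double sumSquares = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i)
    {
        // Widen before subtracting so the difference cannot wrap around.
        const double diff = static_cast<double>(a[i]) - static_cast<double>(b[i]);
        sumSquares += diff * diff;
    }
    return std::sqrt(sumSquares / static_cast<double>(a.size()));
}
```

Under this definition, two buffers that differ by exactly 2 in every byte have an RMSE of 2, which is the threshold quoted above.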

Build: MaterialXGenHlsl cross-platform, MaterialXRenderHlsl gated on WIN32. Stages dxcompiler.dll + dxil.dll from the Windows SDK.

Why not "just emit HLSL with Slang"?

Slang would give us HLSL source, not a D3D11 renderer - we'd still need most of MaterialXRenderHlsl. And FXC/DXC reflection round-trips MaterialX uniform names directly with native emit; through Slang they get mangled, breaking the per-uniform setVariable API that GLSL/MSL already provide.

Disclosure: This PR was created with assistance from Claude Opus 4.7.

Cross-platform HLSL codegen as a sibling of GenGlsl/GenMsl/GenSlang.
Reuses GLSL node implementations via a small post-emit pass: three
regex rewrites (single-arg vector splat, texture() to mx_texture_sample,
ClosureData return constructor) plus a token table (mix->lerp,
dFdx/dFdy->ddx/ddy, mod->fmod, fract->frac, inversesqrt->rsqrt).
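The token table described above can be sketched in C++ as a set of whole-word replacements. This is an illustration only, not the PR's implementation: the function name `rewriteGlslTokens` is hypothetical, and the three regex rewrites (vector splat, texture(), ClosureData constructor) are deliberately left out because the commit message does not give their exact patterns.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper, not the PR's code: applies whole-word GLSL -> HLSL
// renames to a string of emitted shader source.
std::string rewriteGlslTokens(const std::string& source)
{
    // Rename pairs taken from the commit message above.
    static const std::vector<std::pair<std::string, std::string>> table = {
        {"mix", "lerp"}, {"dFdx", "ddx"}, {"dFdy", "ddy"},
        {"mod", "fmod"}, {"fract", "frac"}, {"inversesqrt", "rsqrt"}};

    std::string result = source;
    for (const auto& entry : table)
    {
        // \b anchors the match to word boundaries, so identifiers such as
        // "mixColor" or "mx_mod" are left untouched.
        const std::regex wholeWord("\\b" + entry.first + "\\b");
        result = std::regex_replace(result, wholeWord, entry.second);
    }
    return result;
}

// Usage: only standalone tokens are renamed.
void checkRewriteExample()
{
    assert(rewriteGlslTokens("mix(a, b, fract(t))") == "lerp(a, b, frac(t))");
}
```

Matching on word boundaries rather than raw substrings matters here because `mod` is a substring of its own replacement `fmod`; a plain substring replace applied twice would corrupt the output.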

Adds:
  - source/MaterialXGenHlsl/{HlslShaderGenerator,HlslSyntax}.{h,cpp}
  - libraries/{stdlib,pbrlib,nprlib,lights}/genhlsl/*.mtlx
  - libraries/stdlib/genhlsl/lib/{mx_math,mx_texture}.hlsl
  - libraries/targets/genhlsl.mtlx
  - source/MaterialXTest/MaterialXGenHlsl/* (codegen tests)
  - CHANGELOG.md entry

Mirrors MslResourceBindingContext. When attached to GenContext, splits
uniforms into Private/Public/LightData cbuffers with explicit register
annotations and suppresses inline LightData struct emission (the
binding context emits it once at the cbuffer level).
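To make "explicit register annotations" concrete, the following C++ sketch emits one HLSL cbuffer declaration bound to a `b#` slot. The `cbuffer Name : register(bN)` form is standard HLSL, but the block name `PrivateUniforms`, the slot number, and the helper `emitCBuffer` are assumptions for illustration; the PR's actual block names and slot assignment are not shown in this conversation.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical description of one uniform: HLSL type name plus variable name.
struct Uniform
{
    std::string type;
    std::string name;
};

// Hypothetical helper, not the PR's code: emits a cbuffer block with an
// explicit register(b#) annotation.
std::string emitCBuffer(const std::string& blockName, unsigned slot,
                        const std::vector<Uniform>& uniforms)
{
    // An unnamed block would not be valid HLSL.
    assert(!blockName.empty());

    std::ostringstream out;
    out << "cbuffer " << blockName << " : register(b" << slot << ")\n{\n";
    for (const Uniform& u : uniforms)
    {
        out << "    " << u.type << " " << u.name << ";\n";
    }
    out << "};\n";
    return out.str();
}
```

Calling `emitCBuffer("PrivateUniforms", 0, {{"float4x4", "u_worldMatrix"}})` produces a block that a host can bind by slot index without having to query reflection first.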
…ction

Renderer-free compile + reflect entry point. FXC via d3dcompiler_47
(default, SM 5.x); DXC via dxcompiler.dll loaded dynamically (SM 6.x).
Reflection unifies on HlslResourceBinding for both DXBC and DXIL.
Suitable for headless CI to validate that generated HLSL compiles
without requiring a D3D11 device.

HlslContext owns ID3D11Device + DeviceContext with hardware-or-WARP
fallback and tryCreateContext() for headless test environments.
HlslFramebuffer wraps an RTV + DSV + readback path so tests can verify
real pixels. End-to-end draw test compiles a trivial VS+PS, draws a
fullscreen triangle, and reads back the centre pixel.

Bridges D3D11's stateful cbuffer model to MaterialX's per-uniform
update pattern. One ID3D11Buffer per stage per cbuffer with a CPU
mirror so partial writes do not clobber unrelated members. Reflection-
driven member offset lookup (lookupVariableOffset) so callers can name
'u_worldMatrix' and get back its byte offset inside vertexCB.
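The CPU-mirror idea can be sketched without any D3D types. The class below is a hypothetical stand-in, not the PR's HlslMaterial: member offsets are registered by hand where the real code would take them from shader reflection, and `flush()` simply returns the bytes that a renderer would upload as one whole buffer (for example with `ID3D11DeviceContext::UpdateSubresource`).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for one constant buffer and its CPU-side mirror.
class MirroredCBuffer
{
  private:
    struct Member
    {
        std::size_t offset;
        std::size_t size;
    };

  public:
    explicit MirroredCBuffer(std::size_t byteSize) : _mirror(byteSize, 0) {}

    // Register a member at the byte offset that reflection would report.
    void addMember(const std::string& name, std::size_t offset, std::size_t size)
    {
        assert(offset + size <= _mirror.size());
        _members[name] = Member{offset, size};
    }

    // Write one member into the mirror; neighbouring bytes keep their values.
    bool setVariable(const std::string& name, const void* data, std::size_t size)
    {
        const auto it = _members.find(name);
        if (it == _members.end() || it->second.size != size)
        {
            return false;
        }
        std::memcpy(_mirror.data() + it->second.offset, data, size);
        _dirty = true;
        return true;
    }

    bool isDirty() const { return _dirty; }

    // Hand back the full buffer contents for upload and clear the dirty flag.
    const std::vector<std::uint8_t>& flush()
    {
        _dirty = false;
        return _mirror;
    }

  private:
    std::vector<std::uint8_t> _mirror;
    std::unordered_map<std::string, Member> _members;
    bool _dirty = false;
};
```

Because every upload sends the complete mirror, updating `u_worldMatrix` alone cannot reset other members of the same cbuffer to stale or zero values.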

Includes a generated-shader test that compiles standard_surface_carpaint,
allocates the reflection-driven cbuffer pool, and draws into the
framebuffer to verify the pixel shader actually executed.

SRV + sampler cache keyed by Image::getResourceId. Subclassing
ImageHandler gives Python clients the full bindImage / unbindImage /
releaseRenderResources interface. getBoundSrv / getBoundSampler expose
the COM pointers so the renderer can bind t# / s# slots without
re-walking the cache.

Per-frame validateRender walks the program's reflected bindings and
auto-binds: camera/world matrices into vertexCB; lighting scalars
into pixelCB; environment radiance/irradiance via LightHandler;
file textures from PUBLIC_UNIFORMS via ImageHandler; per-light
parameters via reflection's indexed-name lookup. patchVariable lets
the renderer be cbuffer-agnostic so it works with and without an
attached HlslResourceBindingContext.

One-line CRTP-style specialisation of TextureBaker<HlslRenderer,
HlslShaderGenerator>. All baking machinery comes from the templated
base class; this subclass only wires the backend types in.

PyMaterialXGenHlsl: HlslShaderGenerator, HlslResourceBindingContext,
HlslSyntax. PyMaterialXRenderHlsl: HlslContext, HlslFramebuffer,
HlslProgram, HlslMaterial, HlslTextureHandler (as ImageHandler
subclass), HlslRenderer. D3D COM pointers are never crossed into
Python.

Embind binding for HlslShaderGenerator, gated on EMSCRIPTEN.
Generator only - no renderer in WASM since D3D11 is not portable.

@linux-foundation-easycla

linux-foundation-easycla Bot commented May 7, 2026

CLA Signed

The committers listed above are authorized under a signed CLA.

Five tests were asserting raw pixel byte values that no longer hold
since the framebuffer defaults to sRGB encoding and the texture handler
builds a mip chain by default:

- ClearAndReadback, Draw Triangle, Material BindCbufferAndDraw,
  DrawsMeshWhenAvailable: opt the framebuffer into linear pass-through
  via setEncodeSrgb(false) before bind so the linear RTV is used.
- Texture SampleAndDraw: a 2x2 source under trilinear filtering blends
  mip 0 with the gray average mip 1, so corner reads don't return the
  original texels. Set ImageSamplingProperties::filterType = CLOSEST
  so the test sees raw texel values.
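The arithmetic behind the Texture SampleAndDraw fix can be sketched as follows. This is a simplified single-channel model, not the D3D11 sampler: `mip1Average` and `trilinearBlend` are hypothetical names, and the LOD fraction used in a real draw depends on the screen-space derivatives, which this sketch takes as a plain parameter.

```cpp
#include <array>
#include <cassert>

// Mip 1 of a 2x2 image is a single texel: the average of the four mip 0 texels.
float mip1Average(const std::array<float, 4>& texels)
{
    return (texels[0] + texels[1] + texels[2] + texels[3]) / 4.0f;
}

// Trilinear filtering interpolates between the two nearest mip levels.
// lodFraction is 0 at mip 0 and 1 at mip 1.
float trilinearBlend(float mip0Sample, float mip1Sample, float lodFraction)
{
    assert(lodFraction >= 0.0f && lodFraction <= 1.0f);
    return mip0Sample + (mip1Sample - mip0Sample) * lodFraction;
}
```

For a black and white checker with texels {0, 1, 1, 0}, mip 1 is 0.5 gray, so a black corner sampled halfway between the levels reads 0.25 instead of 0. With a LOD fraction of 0, which is what point sampling of mip 0 amounts to in this model, the original texel comes back unchanged.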
@soufianekhiat soufianekhiat marked this pull request as ready for review May 8, 2026 10:31
@jstone-lucasfilm

Member

This proposal looks very promising, thanks @soufianekhiat, and I'd be interested in thoughts from @ashwinbhat, who has given a good deal of thought to strategies for HLSL integration in MaterialX.

Back in 2022, Ashwin presented an overview of different HLSL approaches to the MaterialX TSC, and I'll share the Google Doc from that presentation:

https://docs.google.com/document/d/1UamANfIivGFcWAB3URlJ6CRf_KEsuAGLSRbXkfzcsxU/edit?tab=t.0#heading=h.lyr31crlybwd

@ZapAndersson

Contributor

Heh, I was looking at having a Robot Friend do the same thing, but you beat me to it!

For my use, I have a very specific requirement, though, because I need to glue my HLSL into some special constraint.

I need the per-node genhlsl/*.hlsl library files to be guaranteed valid standalone HLSL functions (no entry-point glue, callable in isolation). If they are, that's worth documenting because it opens the door to host integrations that consume MaterialX nodes as fragments rather than full shaders.

Will your implementation be able to do that?

@ZapAndersson

Contributor

Ok I downloaded built and tested this, and it seems to do everything I need already - awesome!

Comment on lines +27 to +30
/// The GLSL we are piggybacking on has all its matrices transposed compared to HLSL and the MaterialX spec.
/// (The matrices are defined like mat3(1, 2, 3, 4, 5, 6, 7, 8, 9) where the spec says it should be row-major order, but GLSL creates it as col-major)
/// So when GLSL code says "mul(M, v)" it means "v * transpose(M)", and since in HLSL the matrices are stored
/// in row-major order (when declared without a layout qualifier), we need to reverse the order of multiplication to get the same result.
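The identity this code comment leans on, that multiplying a column vector by M gives the same numbers as multiplying a row vector by transpose(M), can be checked with a small C++ sketch. The types and function names below are hypothetical and independent of both GLSL and HLSL storage rules; matrices are indexed as `m[row][col]`.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using Vec3 = std::array<float, 3>;
using Mat3 = std::array<std::array<float, 3>, 3>; // indexed as m[row][col]

Mat3 transpose(const Mat3& m)
{
    Mat3 t{};
    for (std::size_t r = 0; r < 3; ++r)
        for (std::size_t c = 0; c < 3; ++c)
            t[c][r] = m[r][c];
    return t;
}

// M * v, treating v as a column vector.
Vec3 mulMatVec(const Mat3& m, const Vec3& v)
{
    Vec3 out{};
    for (std::size_t r = 0; r < 3; ++r)
        for (std::size_t c = 0; c < 3; ++c)
            out[r] += m[r][c] * v[c];
    return out;
}

// v * M, treating v as a row vector.
Vec3 mulVecMat(const Vec3& v, const Mat3& m)
{
    Vec3 out{};
    for (std::size_t c = 0; c < 3; ++c)
        for (std::size_t r = 0; r < 3; ++r)
            out[c] += v[r] * m[r][c];
    return out;
}

// Usage: swapping the operand order compensates for a transposed matrix.
void checkTransposeIdentity(const Mat3& m, const Vec3& v)
{
    assert(mulMatVec(m, v) == mulVecMat(v, transpose(m)));
}
```

This is why a generator that receives matrices in the transposed layout can keep results unchanged by reversing the multiplication order, and also why the result changes if a host uploads matrices in the other layout without the generator knowing.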

Contributor

Okay, super-nitpicky detail here (and I hope I'm not misunderstanding the comment): we need to very clearly state and document that your code expects the host to upload matrices in the "GLSL order", because that is what MaterialX currently does. However, in my use case I may want to put this into a completely different, HLSL-native host, which will then upload matrices in the original row-major order.

So my suggestion is to do one of the following:

a) State this matrix upload contract very clearly up front, with blinking arrows and warning signs
or
b) Add a GenOption where I can set what format my particular host code happens to upload matrices in.

Either works for me, it just stood out as a minor detail.

Contributor

Or maybe I am confusing myself? :)

@ashwinbhat

Contributor

> This proposal looks very promising, thanks @soufianekhiat, and I'd be interested in thoughts from @ashwinbhat, who has given a good deal of thought to strategies for HLSL integration in MaterialX.
>
> Back in 2022, Ashwin presented an overview of different HLSL approaches to the MaterialX TSC, and I'll share the Google Doc from that presentation:
>
> https://docs.google.com/document/d/1UamANfIivGFcWAB3URlJ6CRf_KEsuAGLSRbXkfzcsxU/edit?tab=t.0#heading=h.lyr31crlybwd

+1 on the proposal. Great work and I'm glad that this proposal includes MaterialXRenderHlsl for testing purposes to ensure future maintainability.

The pipeline I had proposed in 2022 focused on ease of maintainability and leveraged existing transpiling tools so that we would have a single shadergen (GLSL), but generate additional targets for Metal, Vulkan and DX.

Some downsides of using transpiling tools are:

  • the pipeline can be cumbersome to set up for realtime workflows
  • generated shaders are hard to read and debug, especially because code comments are lost in the SPIR-V representation.

Since we have direct implementations for other HW targets, I would support an HLSL backend that can be refined to make better use of new HLSL semantics and feature levels.

@ashwinbhat

ashwinbhat commented May 27, 2026

Contributor

@soufianekhiat Would it be a lot of effort to support DX12? Do you think we might need some utilities to support extraction of shader metadata for root signature generation?

@soufianekhiat

Author

> @soufianekhiat Would it be a lot of effort to support DX12? Do you think we might need some utilities to support extraction of shader metadata for root signature generation?

It depends. If it's for production readiness, I would add D3D12MemoryAllocator as a dependency. Otherwise I can build a simpler version only for the viewer.

@ashwinbhat

Contributor

> > @soufianekhiat Would it be a lot of effort to support DX12? Do you think we might need some utilities to support extraction of shader metadata for root signature generation?
>
> It depends. If it's for production readiness, I would add D3D12MemoryAllocator as a dependency. Otherwise I can build a simpler version only for the viewer.

For production, I think the HLSL shader should suffice. I was suggesting DX12 for the test renderer and viewer.
