Add inline asm support for amdgpu#149793
Conversation
This comment has been minimized.
This comment has been minimized.
|
Some changes occurred in compiler/rustc_codegen_gcc |
|
This seems okay to me, but I'd rather someone more familiar with this part of the compiler give the final signoff. @bors r? |
|
Error: Parsing assign command in comment failed: ...'' | error: specify user to assign to at >| ''... Please file an issue on GitHub at triagebot if there's a problem with this bot, or reach out on #triagebot on Zulip. |
|
@bors r? compiler |
|
@rustbot reroll |
bdb726b to
9db5dca
Compare
|
Removed return type from tests to fix conflict with #149991, which starts disallowing returns in |
|
The change seems Ok, i'd like people with more background to take a look. |
|
That's not me (sorry it took me a while because of holidays). But iirc that could be amanieu? r? @Amanieu |
|
☔ The latest upstream changes (presumably #150866) made this pull request unmergeable. Please resolve the merge conflicts. |
|
@bors r+ |
Add inline asm support for amdgpu Add support for inline assembly for the amdgpu backend (the amdgcn-amd-amdhsa target). Add register classes for `vgpr` (vector general purpose register) and `sgpr` (scalar general purpose register). The LLVM backend supports two more classes, `reg`, which is either VGPR or SGPR, up to the compiler to decide. As instructions often rely on a register being either a VGPR or SGPR for the assembly to be valid, reg doesn’t seem that useful (I struggled to write correct tests for it), so I didn’t end up adding it. The fourth register class is AGPRs, which only exist on some hardware versions (not the consumer ones) and they have restricted ways to write and read from them, which makes it hard to write a Rust variable into them. They could be used inside assembly blocks, but I didn’t add them as Rust register class. There is one change affecting general inline assembly code, that is `InlineAsmReg::name()` now returns a `Cow` instead of a `&'static str`. Because amdgpu has many registers, 256 VGPRs plus combinations of 2 or 4 VGPRs, and I didn’t want to list hundreds of static strings, the amdgpu reg stores the register number(s) and a non-static String is generated at runtime for the register name. Tracking issue: rust-lang#135024
Add inline asm support for amdgpu Add support for inline assembly for the amdgpu backend (the amdgcn-amd-amdhsa target). Add register classes for `vgpr` (vector general purpose register) and `sgpr` (scalar general purpose register). The LLVM backend supports two more classes, `reg`, which is either VGPR or SGPR, up to the compiler to decide. As instructions often rely on a register being either a VGPR or SGPR for the assembly to be valid, reg doesn’t seem that useful (I struggled to write correct tests for it), so I didn’t end up adding it. The fourth register class is AGPRs, which only exist on some hardware versions (not the consumer ones) and they have restricted ways to write and read from them, which makes it hard to write a Rust variable into them. They could be used inside assembly blocks, but I didn’t add them as Rust register class. There is one change affecting general inline assembly code, that is `InlineAsmReg::name()` now returns a `Cow` instead of a `&'static str`. Because amdgpu has many registers, 256 VGPRs plus combinations of 2 or 4 VGPRs, and I didn’t want to list hundreds of static strings, the amdgpu reg stores the register number(s) and a non-static String is generated at runtime for the register name. Tracking issue: rust-lang#135024
|
This pull request was unapproved. |
This comment has been minimized.
This comment has been minimized.
|
Building with debug assertions cought an underflowing subtraction. Diff: diff --git a/compiler/rustc_target/src/asm/amdgpu.rs b/compiler/rustc_target/src/asm/amdgpu.rs
index 191cf6baa26..cc6f612cdb8 100644
--- a/compiler/rustc_target/src/asm/amdgpu.rs
+++ b/compiler/rustc_target/src/asm/amdgpu.rs
@@ -478,7 +478,8 @@ pub fn overlapping_regs(self, mut cb: impl FnMut(AmdgpuInlineAsmReg)) {
| AmdgpuRegStart::Full(start)) = self.range;
let size_range = size - 1;
- for overlap_start in (start - size_range)..=(start + self.class.bytes().div_ceil(4) - 1)
+ for overlap_start in
+ start.saturating_sub(size_range)..=(start + self.class.bytes().div_ceil(4) - 1)
{
let class = match self.class {
AmdgpuInlineAsmRegClass::Sgpr(_) => AmdgpuInlineAsmRegClass::Sgpr(size * 32),
diff --git a/tests/assembly-llvm/asm/amdgpu-types.rs b/tests/assembly-llvm/asm/amdgpu-types.rs
index fe8ae88ee83..c68cbe4fd96 100644
--- a/tests/assembly-llvm/asm/amdgpu-types.rs
+++ b/tests/assembly-llvm/asm/amdgpu-types.rs
@@ -223,7 +223,7 @@ macro_rules! check_reg {
// CHECK: #ASMSTART
// CHECK: s_load_b128 s{{\[[0-9]+:[0-9]+\]}}, s{{\[[0-9]+:[0-9]+\]}}, s{{[0-9]+}}
// CHECK: #ASMEND
-check_reg!(s0_i128 i128 "s[0:3]" x: ptr "s[0:1]", y: i32 "s0", "s_load_b128");
+check_reg!(s0_i128 i128 "s[0:3]" x: ptr "s[0:1]", y: i32 "s2", "s_load_b128");
// CHECK-LABEL: v0_i128:
// CHECK: #ASMSTART |
|
(forgot to mark as ready) |
|
@bors r+ |
This comment has been minimized.
This comment has been minimized.
|
@bors delegate+ r=me once conflicts are resolved |
Add support for inline assembly for the amdgpu backend (the amdgcn-amd-amdhsa target). Add register classes for `vgpr` (vector general purpose register) and `sgpr` (scalar general purpose register). The LLVM backend supports two more classes, `reg`, which is either VGPR or SGPR, up to the compiler to decide. As instructions often rely on a register being either a VGPR or SGPR for the assembly to be valid, reg doesn’t seem that useful (I struggled to write correct tests for it), so I didn’t end up adding it. The fourth register class is AGPRs, which only exist on some hardware versions (not the consumer ones) and they have restricted ways to write and read from them, which makes it hard to write a Rust variable into them. They could be used inside assembly blocks, but I didn’t add them as Rust register class. There are a few change affecting general inline assembly code, that is `InlineAsmReg::name()` now returns a `Cow` instead of a `&'static str`. Because amdgpu has many registers, 256 VGPRs plus combinations of 2 or 4 VGPRs, and I didn’t want to list hundreds of static strings, the amdgpu reg stores the register number(s) and a non-static String is generated at runtime for the register name. Similar for register classes and supported_types. Vectors of 64-bit types are supported by the LLVM backend, but omitted here to make the code simpler. There is currently no systematic support in LLVM of which vectors of 64-bit types are supported. Also, they are likely seldomly unused, vectors of 16- and 32-bit types are important.
|
This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed. Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers. |
View all comments
Add support for inline assembly for the amdgpu backend (the
amdgcn-amd-amdhsa target).
Add register classes for
vgpr(vector general purpose register) andsgpr(scalar general purpose register).The LLVM backend supports two more classes,
reg, which is either VGPRor SGPR, up to the compiler to decide. As instructions often rely on a
register being either a VGPR or SGPR for the assembly to be valid, reg
doesn’t seem that useful (I struggled to write correct tests for it), so
I didn’t end up adding it.
The fourth register class is AGPRs, which only exist on some hardware
versions (not the consumer ones) and they have restricted ways to write
and read from them, which makes it hard to write a Rust variable into
them. They could be used inside assembly blocks, but I didn’t add them
as Rust register class.
There are a few change affecting general inline assembly code, that is
InlineAsmReg::name()now returns aCowinstead of a&'static str.Because amdgpu has many registers, 256 VGPRs plus combinations of 2 or 4
VGPRs, and I didn’t want to list hundreds of static strings, the amdgpu
reg stores the register number(s) and a non-static String is generated
at runtime for the register name.
Similar for register classes and supported_types.
Vectors of 64-bit types are supported by the LLVM backend, but omitted
here to make the code simpler. There is currently no systematic support
in LLVM of which vectors of 64-bit types are supported. Also, they are
likely seldomly unused, vectors of 16- and 32-bit types are important.
Tracking issue: #135024