I'm a big fan of aarch64's csel family of instructions. A single instruction can evaluate rd = cond ? rs1 : f(rs2), where cond is any condition code and f is any of f0(x) = x, f1(x) = x+1, f2(x) = ~x, or f3(x) = -x. Want to convert a condition to a boolean? Use f1 with rs1 == rs2 == xzr (the zero register). Want to convert a condition to a mask? Use f2 with rs1 == rs2 == xzr. Want to compute an absolute value? Use f3 with rs1 == rs2. It is pleasing that the composition of f1 and f2 is f3. I could go on extolling its virtues, but hopefully you get the idea.
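To make the four variants concrete, here is a minimal Python sketch that models them on 64-bit values. The function names follow the aarch64 mnemonics (csel, csinc, csinv, csneg), but the Python signatures, the MASK constant, and the abs64 helper are illustrative assumptions, not anything defined by the architecture.

```python
MASK = (1 << 64) - 1  # assumption: model 64-bit X registers as Python ints


def csel(cond, rs1, rs2):
    """rd = cond ? rs1 : f0(rs2), with f0(x) = x."""
    return rs1 if cond else (rs2 & MASK)


def csinc(cond, rs1, rs2):
    """rd = cond ? rs1 : f1(rs2), with f1(x) = x + 1."""
    return rs1 if cond else ((rs2 + 1) & MASK)


def csinv(cond, rs1, rs2):
    """rd = cond ? rs1 : f2(rs2), with f2(x) = ~x."""
    return rs1 if cond else (~rs2 & MASK)


def csneg(cond, rs1, rs2):
    """rd = cond ? rs1 : f3(rs2), with f3(x) = -x."""
    return rs1 if cond else (-rs2 & MASK)


def abs64(x):
    """Absolute value of a two's complement value: keep x if non-negative, else negate."""
    non_negative = (x >> 63) == 0
    return csneg(non_negative, x, x)


# Condition to boolean: both sources are the zero register, so the
# "else" arm yields 0 + 1 = 1 and the "then" arm yields 0.
assert csinc(False, 0, 0) == 1 and csinc(True, 0, 0) == 0

# Condition to mask: the "else" arm yields ~0, i.e. all ones.
assert csinv(False, 0, 0) == MASK and csinv(True, 0, 0) == 0

# f1 composed with f2 is f3, because -x == ~x + 1 in two's complement.
for x in (0, 1, 5, MASK):
    assert csinc(False, 0, csinv(False, 0, x)) == csneg(False, 0, x)
```

Note that with both sources zero, the result is 1 (or all ones) when cond is false, so the boolean and mask idioms are encoded with the inverse of the condition being tested.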
RISC-V is the hot new thing, but it lacks a direct equivalent to csel . Some cases of converting conditions to booleans are possible with the slt family of instructions in the base instruction set. Beyond that, a few special cases are implemented by instruction set extensions: Zbb adds min and max instructions which are a particular pattern of compare and select, and Zicond adds czero.eqz and czero.nez which again are particular patterns of compare and select. But the general case? Considered and rejected, as per this direct quote from The RISC-V Instruction Set Manual Volume I Version 20250508:
We considered but did not include conditional moves or predicated instructions, which can effectively replace unpredictable short forward branches. Conditional moves are the simpler of the two, but are difficult to use ...
That quote hints at short forward branches being the recommended alternative. It doesn't quite go as far as to say that out-of-order cores are encouraged to perform macro fusion in the frontend to convert short forward branches back into conditional moves (when possible), but it is commonly taken to mean this, and some SiFive cores implement exactly this fusion.
Continuing to quote from The RISC-V Instruction Set Manual Volume I Version 20250508, the introductory text motivating Zicond also mentions fusion:
Using these [Zicond] instructions, branchless sequences can be implemented (typically in two-instruction sequences) without the need for instruction fusion, special provisions during the decoding of architectural instructions, or other microarchitectural provisions. One of the shortcomings of RISC-V, compared to competing instruction set architectures, is the absence of conditional operations to support branchless code-generation: this includes conditional arithmetic, conditional select and conditional move operations. The design principles of RISC-V (e.g. the absence of an instruction-format that supports 3 source registers and an output register) make it unlikely that direct equivalents of the competing instructions will be introduced.
The design principles mentioned in passing mean that czero.eqz has slightly odd semantics. Assuming rd is distinct from both rs1 and rs2 (the branchy sequence writes rd before it reads either source), the intent is that these two instruction sequences compute the same thing:
Base instruction set              With Zicond
  mv rd, x0                         czero.eqz rd, rs1, rs2
  beq rs2, x0, skip_next
  mv rd, rs1
skip_next:
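The two sequences can be compared with a small Python model. This is only a sketch of the register-level results, not of timing or memory ordering. The register-dictionary representation and the helper names branchy, zicond, and select are assumptions made for illustration. The select helper shows one common way to build a full conditional select from the czero pair plus an or.

```python
def czero_eqz(rs1, rs2):
    """czero.eqz: result is 0 if rs2 == 0, otherwise rs1."""
    return 0 if rs2 == 0 else rs1


def czero_nez(rs1, rs2):
    """czero.nez: result is 0 if rs2 != 0, otherwise rs1."""
    return 0 if rs2 != 0 else rs1


def branchy(regs, rd, rs1, rs2):
    """Step through the base-ISA sequence on a copy of the register dict."""
    regs = dict(regs)
    regs[rd] = 0                  # mv rd, x0
    if regs[rs2] != 0:            # beq rs2, x0, skip_next (falls through when non-zero)
        regs[rd] = regs[rs1]      # mv rd, rs1
    return regs                   # skip_next:


def zicond(regs, rd, rs1, rs2):
    """Execute the single czero.eqz on a copy of the register dict."""
    regs = dict(regs)
    regs[rd] = czero_eqz(regs[rs1], regs[rs2])
    return regs


def select(cond, a, b):
    """cond != 0 ? a : b, built as czero.eqz + czero.nez + or."""
    return czero_eqz(a, cond) | czero_nez(b, cond)


# With rd distinct from both sources, the two sequences agree.
for a in (0, 1, 42):
    for c in (0, 1, 7):
        regs = {"t0": 99, "a0": a, "a1": c}
        assert branchy(regs, "t0", "a0", "a1") == zicond(regs, "t0", "a0", "a1")

# If rd aliases a source, the branchy sequence clobbers it before reading it.
regs = {"a0": 5, "a1": 3}
assert branchy(regs, "a1", "a0", "a1")["a1"] == 0   # rd == rs2
assert zicond(regs, "a1", "a0", "a1")["a1"] == 5
assert branchy(regs, "a0", "a0", "a1")["a0"] == 0   # rd == rs1
assert zicond(regs, "a0", "a0", "a1")["a0"] == 5
```

Agreement here only means the final register values match. It says nothing about whether the two sequences are interchangeable under the memory consistency model.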
The whole premise of fusion is predicated on the idea that it is valid for a core to transform code similar to the branchy code on the left into code similar to the branch-free code on the right. I wish to cast doubt on this validity: it is true that the two instruction sequences compute the same thing, but details of the RISC-V memory consistency model mean that the two sequences are very much not equivalent, and therefore a core cannot blindly turn one into the other.
To see why, consider this example, again from The RISC-V Instruction Set Manual Volume I Version 20250508: