Move the code to generate the definition of an overflow-safe
divide/modulo SPIR-V function into its own Rust function, to reduce
indentation and clarify influences. This commit isn't intended to
cause any change in behavior.
Explain we need the wrapper functions not just to avoid undefined
behaviour, but also to ensure we adhere to the WGSL spec. Additionally
link to issue #7109 in cases where our workaround needs follow-up work
for non-32-bit integer types.
Explain we need the wrapper functions not just to avoid undefined
behaviour (or unspecified in the case of division), but also to ensure
we adhere to the WGSL spec.
Replace the link to the resolved WGSL spec issue about floating-point
division by zero (gpuweb/gpuweb#2798) with links to the Direct3D 11
functional specification (which Direct3D 12 inherits) and the DXIL
specification, explaining that HLSL does what WGSL wants here.
Integer division or modulo is undefined behaviour in SPIR-V when the
divisor is zero, or when the dividend is the most negative number
representable by the result type and the divisor is negative one.
This patch makes us avoid this undefined behaviour and instead ensures
we adhere to the WGSL spec in these cases: for divisions the
expression evaluates to the value of the dividend, and for modulos the
expression evaluates to zero.
Similarily to how we handle these cases for the MSL and HLSL backends,
prior to emitting each function we emit code for any helper functions
required by that function's expressions. In this case that is helper
functions for integer division and modulo. Then, when emitting the
actual function's body, if we encounter an expression which needs
wrapped we instead emit a function call to the helper.
Emit helper functions for MathFunction::Abs and UnaryOperator::Negate
with a signed integer scalar or vector operand. And for
BinaryOperator::Divide and BinaryOperator::Modulo with signed or
unsigned integer scalar or vector operands.
Abs and Negate are written to avoid signed integer overflow when the
operand equals INT_MIN. This is achieved by bitcasting the value to
unsigned, using the negation operator, then bitcasting the result back
to signed. As HLSL's bitcast functions asint() and asuint() only work
for 32-bit types, we only use this workaround in such cases.
Division and Modulo avoid undefined behaviour for INT_MIN / -1 and
divide-by-zero by using 1 for the divisor instead. Additionally we
avoid undefined behaviour when using the modulo operator on operands
of mixed signedness by using the equation from the WGSL spec, using
division, subtraction and multiplication, rather than HLSL's modulus
operator.
Though not explicitly specified one way or the other, we have been
informed by DirectX engineers that signed integer overflow may be
undefined behaviour in some cases. To avoid this, we therefore bitcast
signed operands to unsigned prior to performing addition, subtraction,
or multiplication, then bitcast the result back to signed. As signed
types are represented as two's complement, this gives the correct
result whilst avoid any potential undefined behaviour.
Unfortunately HLSL's bitcast functions asint() and asuint() only work
for the 32-bit int and uint types. We therefore only apply this
workaround for 32-bit signed arithmetic. Support for other bit widths
could be added in the future, but extra care must be taken when
converting from unsigned to signed to avoid undefined or
implemented-defined behaviour.
There is no literal suffix in HLSL for the integer type. We can,
however, wrap integer literals in a constructor style cast. This
avoids ambiguity when passing literals to overloaded functions, which
we'll make use of in the subsequent patch in this PR.
Adds infrastructure to the MSL backend for emitting helper functions,
based upon the existing HLSL backend equivalent. Emit helper functions
for MathFunction::Abs and UnaryOperator::Negate with a signed integer
scalar or vector operand. And for BinaryOperator::Divide and
BinaryOperator::Modulo with signed or unsigned integer scalar or
vector operands.
Abs and Negate are written to avoid signed integer overflow when the
operand equals INT_MIN. This is achieved by bitcasting the value to
unsigned and using the negation operator, then bitcasting the result
back to signed.
Division and Modulo avoid undefined bevaviour for INT_MIN / -1 and
divide-by-zero by using 1 for the divisor instead. Additionally we
avoid undefined behaviour when using the modulo operator when either
or both operands are negative by using the equation from the WGSL
spec, using division, subtraction and multiplication, rather than
MSL's modulus operator.
Lastly, as the usage of the wrapper function for unary integer
negation makes the negation_avoids_prefix_decrement() testcase less
interesting, we extend it to additionally test negating floats.
This allows using `StagingBelt` for copying to textures or any other
operation where copying to an existing buffer is not the desired outcome.
One unfortunate feature of the API is that `allocate()` returns the entire
buffer and an offset, so applications could accidentally touch parts of
the belt buffer outside the intended allocation. It might make more sense
to return `wgpu::BufferSlice`, but that struct cannot be used in
operations like `copy_buffer_to_texture` and does not have getters, so it
is not currently suitable for that purpose.
I also moved `Exclusive` into a module so that its unsafe-to-access field
is properly private, and made various improvements to the `StagingBelt`
documentation, such as acknowledging that `write_buffer_with()` exists.
This allows users to skip creation of `BufferSlice` if they have no use
for it, and brings `wgpu` closer to the WebGPU API without removing any
Rust convenience. New functions:
* `BufferSlice::slice()`
* `Buffer::map_async()`
* `Buffer::get_mapped_range()`
* `Buffer::get_mapped_range_mut()`
* `Buffer::get_mapped_range()`
This will allow using it in tests of wgpu resource management code
that does not actually require a backend.
* `enumerate_adapters()` returns an adapter instead of failing.
* `open()` returns a device instead of failing.
* `create_buffer()` allocates actual memory.
* `map_buffer()` actually provides access to that memory.
* `clear_buffer()` actually clears the buffer.
* `copy_buffer_to_buffer()` actually copies data.
* Fences actually work (trivially, because all operations are
synchronous).
Future work could include implementing texture copies,
timestamp queries, and the clearing part of render passes.
Ensure that unnamed constants' types are retained, even when the
constants are used only by global expressions. Add a test case.
The old code marked all named constants as used; then processed
function bodies; and then marked the types of all used constants as
used. This ensured that, as long as a constant was used by a function,
its type would be retained, but didn't address the case where a
constant is used only by a global expression.
Attempting to convert a non-abstract type will always fail, which can
result in unrelated errors being misreported as type conversion
errors. For example, in #7035 we made it so that abstract types can be
used as return values. This changed the final testcase in the
invalid_functions() test to fail due to failing to convert a u32 to an
atomic<u32>, rather than with a NonConstructibleReturnType error.
This patch makes it so that we do not attempt to convert non-abstract
types in the first place. This avoids the AutoConversion error for
that testcase, meaning the NonConstructibleReturnType is once again
reported.
Various callsites of try_automatic_conversions() have been updated to
ensure they still return an InitializationTypeMismatch error if the
types are mismatched, even if try_automatic_conversions() succeeds (eg
when conversion was not attempted due to the type being concrete). To
reduce code duplication these callsites were all adapted to use the
existing type_and_init() helper function.
Lastly, a couple of tests that expected to result in a AutoConversion
error have been adjusted to ensure this still occurs, by ensuring the
type used is abstract.
When converting the underlying scalar type of an array during const
evaluation, we currently use the resulting base type's size as the array
stride for the resulting type. For certain types, this may not match the
required alignment and will therefore result in a validation error.
For example, `array<vec3<f32>, N>` should have a stride of 16. But if
declared with an abstract initializer, eg `array(vec3(0.0))` we will
incorrectly determine the stride to be 12.
To solve this, we use the proc::Layouter struct to determine the
required array stride during const evaluation. To avoid repeating
layouting work, we reuse the Lowerer's layouter, passing it through the
various *Contexts through to the ConstantEvaluator.