VPP  0.7
A high-level modern C++ API for Vulkan
How to create reusable GPU functions

The chapter How to write shader and computation code already showed how to create functions in GPU code, but those functions were local to a single shader. What if you want a function that is reusable across multiple shaders, or even a whole function library?

There are two approaches to this task. The first is to write a macro function; the second produces a fully reusable GPU-level function.

Macro functions

As we already know, regular C++ constructs used inside GPU code behave like macro constructs: the code is unfolded and inlined into the generated SPIR-V. Therefore we can define a function like this:

vpp::Int Gcd ( const vpp::Int& _n, const vpp::Int& _k )
{
    using namespace vpp;

    VInt x = _n;
    VInt y = _k;
    VInt s = 0;
    VInt t = 0;

    // Make x the larger of the two values.
    If ( x < y );
        t = x; x = y; y = t;
    Fi();

    // Strip common factors of two, counting them in s.
    Do(); While ( ( ( x & 1 ) | ( y & 1 ) ) == 0 );
        x >>= 1u; y >>= 1u; ++s;
    Od();

    Do(); While ( x != 0 );
    {
        Do(); While ( ( x & 1 ) == 0 );
            x >>= 1u;
        Od();

        Do(); While ( ( y & 1 ) == 0 );
            y >>= 1u;
        Od();

        If ( x >= y );
            x = ( x - y ) >> 1u;
        Else();
            y = ( y - x ) >> 1u;
        Fi();
    }
    Od();

    // In macros, use plain C++ 'return'.
    return y << s;
}

What happens if we call this Gcd function from GPU code, like this?

Int p = Gcd ( a, b );
// ...

The entire Gcd code will be inlined into the caller's code and the program will behave as expected - i.e. as if the Gcd function were called.
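For readers who want to check what the listing computes, the same control flow can be written as ordinary host-side C++. This is only a sketch for comparison, not part of VPP: the name gcdReference is hypothetical, it assumes that Do(); While(...); ... Od(); behaves like a C++ while loop, and it assumes strictly positive inputs, since the algorithm as written does not terminate when an argument is zero.

```cpp
#include <cassert>
#include <cstdint>

// Host-side reference for the binary GCD used in the GPU listing.
// Assumption: both inputs are strictly positive. Like the GPU code,
// this version has no guard against zero and would loop forever on it.
std::int32_t gcdReference(std::int32_t n, std::int32_t k)
{
    assert(n > 0 && k > 0);

    std::int32_t x = n;
    std::int32_t y = k;
    std::int32_t s = 0;

    // Make x the larger of the two values (the GPU code swaps through t).
    if (x < y) { const std::int32_t t = x; x = y; y = t; }

    // Strip common factors of two and remember how many were removed.
    while (((x & 1) | (y & 1)) == 0) { x >>= 1; y >>= 1; ++s; }

    while (x != 0)
    {
        // Remaining factors of two are not shared, so they can be dropped.
        while ((x & 1) == 0) x >>= 1;
        while ((y & 1) == 0) y >>= 1;

        // Both values are odd here, so their difference is even.
        if (x >= y) x = (x - y) >> 1;
        else        y = (y - x) >> 1;
    }

    // Restore the common factors of two.
    return y << s;
}
```

Comparing results read back from the GPU against such a reference is one way to validate either variant of the GPU function.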

This approach is very useful for small functions. The Gcd function is not a very good candidate, though. It defines several mutable variables, and if it is called multiple times, these variables are declared anew each time. This produces an excessive number of mutable variables and can hurt performance (immutable variables do not have this problem). The C++ optimizer cannot help here. Code optimizers in drivers theoretically could, but we should not rely on this.
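The duplication can be pictured with a toy model in plain C++. Nothing below is VPP API: ToyShader, toyMacroGcd and the textual "instructions" are hypothetical stand-ins, chosen only to show that a function which emits its body at the call site emits its variable declarations once per call.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a shader under construction: a flat list of
// emitted "instructions" (hypothetical, not SPIR-V).
struct ToyShader
{
    std::vector<std::string> code;
    int nextVarId = 0;

    // Emits a declaration of a mutable variable and returns its name.
    std::string declareVar()
    {
        std::string name = "v" + std::to_string(nextVarId++);
        code.push_back("var " + name);
        return name;
    }

    // Counts emitted instructions that start with the given prefix.
    std::size_t count(const std::string& prefix) const
    {
        std::size_t n = 0;
        for (const std::string& line : code)
            if (line.compare(0, prefix.size(), prefix) == 0)
                ++n;
        return n;
    }
};

// "Macro function": an ordinary C++ function whose body is emitted
// into the shader every time it is invoked.
std::string toyMacroGcd(ToyShader& sh, const std::string& a, const std::string& b)
{
    // Four mutable locals, mirroring x, y, s and t in Gcd.
    std::string x = sh.declareVar();
    std::string y = sh.declareVar();
    std::string s = sh.declareVar();
    std::string t = sh.declareVar();
    sh.code.push_back("body " + x + " " + y + " " + s + " " + t + " <- " + a + " " + b);
    return y;
}

// Invokes the macro twice and reports how many declarations were emitted.
std::size_t declarationsAfterTwoMacroCalls()
{
    ToyShader sh;
    toyMacroGcd(sh, "a1", "b1");
    toyMacroGcd(sh, "a2", "b2");
    return sh.count("var ");
}
```

In this model two invocations leave eight variable declarations in the shader, four for each call site.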

For such complex functions with local mutable variables, a full-fledged GPU function is better.

Real GPU functions

So we fall back to the vpp::Function() syntax. This time, however, we wrap it in a C++ class:

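class GFunGCD : public vpp::Function< vpp::Int, vpp::Int, vpp::Int >
{
public:
    GFunGCD();
private:
    // Function parameters, read in the body as _n and _k
    // (assumed to be declared as vpp::Par members).
    vpp::Par< vpp::Int > _n;
    vpp::Par< vpp::Int > _k;
};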
GFunGCD :: GFunGCD() : vpp::Function< vpp::Int, vpp::Int, vpp::Int >( "gcd" )
{
    using namespace vpp;

    Begin();

    VInt x = _n;
    VInt y = _k;
    VInt s = 0;
    VInt t = 0;

    If ( x < y );
        t = x; x = y; y = t;
    Fi();

    Do(); While ( ( ( x & 1 ) | ( y & 1 ) ) == 0 );
        x >>= 1u; y >>= 1u; ++s;
    Od();

    Do(); While ( x != 0 );
    {
        Do(); While ( ( x & 1 ) == 0 );
            x >>= 1u;
        Od();

        Do(); While ( ( y & 1 ) == 0 );
            y >>= 1u;
        Od();

        If ( x >= y );
            x = ( x - y ) >> 1u;
        Else();
            y = ( y - x ) >> 1u;
        Fi();
    }
    Od();

    // Note that this is VPP 'Return' now, not C++ 'return'!
    Return ( y << s );

    End();
}

Then, in the client code we use it like this:

// Declare it once at the beginning.
GFunGCD Gcd;
// ...
// Use it multiple times later ...
Int g1 = Gcd ( a1, b1 );
Int g2 = Gcd ( a2, b2 );
// ...

Variables are not declared multiple times, because they are hidden inside the SPIR-V level function definition. The definition itself is generated when the GFunGCD object is initialized. Each of the calls that follow generates just a SPIR-V call instruction.
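The functor pattern can be mimicked with a toy model in plain C++. This is an illustration only, under the assumption that the real mechanism follows the description above; ToyModule, ToyGpuFunction and the textual "instructions" are hypothetical and are not VPP or SPIR-V. The constructor emits the function definition once, and operator() emits a single call.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a generated module: a flat list of "instructions".
struct ToyModule
{
    std::vector<std::string> code;

    // Counts emitted instructions that start with the given prefix.
    std::size_t count(const std::string& prefix) const
    {
        std::size_t n = 0;
        for (const std::string& line : code)
            if (line.compare(0, prefix.size(), prefix) == 0)
                ++n;
        return n;
    }
};

// Toy analogue of a functor such as GFunGCD.
class ToyGpuFunction
{
public:
    // Construction emits the whole definition, including its locals, once.
    ToyGpuFunction(ToyModule& module, const std::string& name)
        : d_module(module), d_name(name)
    {
        d_module.code.push_back("function " + d_name);
        const char* locals[] = { "x", "y", "s", "t" };
        for (const char* local : locals)
            d_module.code.push_back(std::string("var ") + local);
        d_module.code.push_back("end " + d_name);
    }

    // Each invocation emits only a call instruction.
    void operator()(const std::string& a, const std::string& b)
    {
        d_module.code.push_back("call " + d_name + " " + a + " " + b);
    }

private:
    ToyModule& d_module;
    std::string d_name;
};

// Declares the function once, calls it twice, and returns the result.
ToyModule buildToyModule()
{
    ToyModule module;
    ToyGpuFunction gcd(module, "gcd");
    gcd("a1", "b1");
    gcd("a2", "b2");
    return module;
}
```

In this model the module ends up with one definition, four variable declarations and two call instructions, regardless of how many more calls are added.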

As you can see, creating reusable GPU functions is simple: just turn them into C++ functors, using the syntax presented above.

For small functions that do not create mutable variables, consider using macro functions instead, as they have a chance to be faster.