Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multiplying two matrices - not working in 3D range #173

Open
ZahariStoyanov opened this issue Dec 3, 2023 · 1 comment
Open

Multiplying two matrices - not working in 3D range #173

ZahariStoyanov opened this issue Dec 3, 2023 · 1 comment

Comments

@ZahariStoyanov
Copy link

ZahariStoyanov commented Dec 3, 2023

Currently working on a Java library for matrix operations and ML. I use Aparapi to utilize the GPU.
I've written this code to multiply two matrices:

   public static NDmatrix matMul(float[][] a, float[][] b) {
        int[] aDim = new int[]{a.length, a[0].length};
        int[] bDim = new int[]{b.length, b[0].length};
        if(aDim[1] == bDim[0]){
            int[] Dim = new int[]{aDim[0], bDim[1]};
            int aVSize = aDim[0] * aDim[1];
            float[] aVector = new float[aVSize];
            for(int i = 0; i < aDim[0]; i++)
                System.arraycopy(a[i], 0, aVector, i * aDim[1], aDim[1]);
            int bVSize = bDim[0] * bDim[1];
            float[] bVector = new float[bVSize];
            for(int i = 0; i < bDim[0]; i++)
                System.arraycopy(b[i], 0, bVector, i * bDim[1], bDim[1]);
            int resVSize = Dim[0] * Dim[1];
            float[] resVector = new float[resVSize];
            int d[] = new int[]{aDim[1]};
            Kernel mKernel = new Kernel() {
                final int ht = Dim[0];
                final int wt = Dim[1];
                final int dpt = d[0];
                public void run() {
                    int c = getGlobalId(0);
                    int r = getGlobalId(1);
                    int l = getGlobalId(2); // used with 3D range
                    localBarrier(); // used with 3D range
                    //for(int l = 0; l < dpt; l++) // used with 2D range
                    resVector[r * wt + c] += aVector[r * dpt + l] * bVector[l * wt + c]; // in 3D range, this only works for l = 0
                }
            };
            mKernel.setExplicit(true);
            mKernel.put(aVector);
            mKernel.put(bVector);
            mKernel.put(resVector);
            mKernel.put(Dim);
            mKernel.put(d);
            mKernel.execute(Range.create3D(Dim[1], Dim[0], d[0]));
            //mKernel.execute(Range.create2D(Dim[1], Dim[0]));
            mKernel.get(resVector);
            mKernel.dispose();
            return new NDmatrix(Dim, resVector, null);
        }
        System.out.println("The number of columns in left matrix and number of rows in right matrix do not match.");
        System.out.println();
        return null;
    }

However, it looks like resVector[some_index] only gets updated once(r + aVector[0 * dpt + 0] * bVector[0 * wt + c]). If, instead, I use a 2D range and a loop(the commented bits in the code), it works correctly. What could be the reason for such behavior and how can I force it to work fully parallel?

Interestingly, I tried one thing that "worked" - after updating resVector[some_index], I called this.put(resVector). However, it then couldn't compile in OpenCL and ended up using Java's multi-threading instead, eventually resulting in a correct result.

@ZahariStoyanov
Copy link
Author

ZahariStoyanov commented Dec 3, 2023

So... I added this loop in conjunction with the 3D range and it works:

[...]
                    for(int i = 0; i < dpt; i++)
                        if(i == l) resVector[r * wt + c] += aVector[r * dpt + l] * bVector[l * wt + c];
[...]

Why would it need such a bizarre force? What am I missing?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant