Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dense-linear-algebra: Getting stream fusion to work across Matrix's #60

Open
Magalame opened this issue May 26, 2019 · 5 comments
Open

Comments

@Magalame
Copy link
Contributor

Magalame commented May 26, 2019

The current problem is as follows:

(U.sum . flip M.column 0) a does not fuse. It seems to boil down to:

testRewrite1 :: Matrix -> Double  --fuses
testRewrite1 (Matrix r c v) = U.sum . flip (\u j -> U.generate r (\i -> u `U.unsafeIndex` (j + i * c))) 0 $ v

testRewrite2 :: Matrix -> Double -- does NOT fuse
testRewrite2 m = U.sum . flip (\(Matrix r c v) j -> U.generate r (\i -> v `U.unsafeIndex` (j + i * c))) 0 $ m

note: the flip isn't important, it's just by convenience, since this is from https://github.com/Magalame/fastest-matrices

So the thing that seems to happen is that stream fusion cannot "go through" Matrix's, I'm not sure exactly why

@Magalame
Copy link
Contributor Author

Magalame commented May 27, 2019

Interestingly, this fuses:

testRewrite3 :: Matrix -> Double 
testRewrite3 (Matrix r c v) = U.sum . flip (\(r,c,v) j -> U.generate r (\i -> v `U.unsafeIndex` (j + i * c))) 0 $ (r,c,v)

@Magalame
Copy link
Contributor Author

Magalame commented May 27, 2019

It seems that pattern matching is the cause?
This also fuses:

testRewrite4 :: Matrix -> Double 
testRewrite4 m = U.sum . flip (\m1 j -> U.generate (rows m1) (\i -> (_vector m1) `U.unsafeIndex` (j + i * (cols m1)))) 0 $ m

@Magalame
Copy link
Contributor Author

Magalame commented May 27, 2019

Bottom line:

column :: Matrix -> Int -> Vector Double
column (Matrix r c v) j= U.generate r (\i -> v `U.unsafeIndex` (j + i * c))
{-# INLINE column #-}

column2 :: Matrix -> Int -> Vector Double
column2 m j = U.generate (rows m) (\i -> (_vector m) `U.unsafeIndex` (j + i * (cols m)))
{-# INLINE column2 #-}

testRewrite0 :: Matrix -> Double
testRewrite0 a = (U.sum . flip column2 0) a

testRewrite1 :: Matrix -> Double
testRewrite1 a = (U.sum . flip column 0) a

testRewrite0 fuses, testRewrite1 doesn't.

However, the fusion is depending on the INLINE and I can't get M.column2 to get inlined across modules (moving the definition of column2 from the main file to Matrix.hs)

@Magalame
Copy link
Contributor Author

The results are actually a bit disappointing, I'll keep digging

@Magalame
Copy link
Contributor Author

Magalame commented Mar 9, 2020

Back to it! Turns out GHC makes wonders with rewrite rules/inlining. So throwing in for example:

{-# INLINE [1] row #-}
{-# RULES
      "row/fuse"    forall c v r i.  row (M.Matrix r c v) i = U.slice (c*i) c v
  #-}

works amazing.

It gives us some nice stuff:

Say I have matrices A,B, and C = A*B.
Then now I can take the sum of a row of C, without allocating the whole matrix, and a ~10x speedup on my machine.

I would expect to hit a point of diminishing return though, so that's to be explored

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant