Haskell memory usage and IO

Question

I recently wrote a piece of Haskell code and, to debug it, put in a bunch of print statements (so my most important function returned IO t when it only needed to return t). On a successful run, this function took up a lot of memory (roughly 1.2 GB). Once I saw that the program was working fine, I removed all the print statements from the function and ran it again, only to get this error:

Stack space overflow: current size 8388608 bytes.
Use `+RTS -Ksize -RTS' to increase it.

Even though the code was otherwise exactly the same, the version with the print statements somehow avoided the stack space overflow. Can anyone enlighten me as to why this happens?

I know I haven't provided my code, which might make this question harder to answer, but I've hacked a bunch of things together and it doesn't look very pretty, so I doubt it would be useful. I am fairly certain that the only difference is the print statements.

EDIT:

Since people really wanted to see the code, here is the relevant part:

linkCallers :: ([Int], Int, Int, I.IntDisjointSet, IntMap Int) -> ([Int], Int, Int, I.IntDisjointSet, IntMap Int)
linkCallers ([], x, y, us, im) = ([], x, y, us, im) 
linkCallers ((a:b:r), x, y, us, im) = if a == b
    then (r, x, y+1, us, im) 
    else if sameRep
        then (r, x+1, y+1, us, im) 
        else (r, x+1, y+1, us', im')
        where
            ar = fst $ I.lookup a us
            br = fst $ I.lookup b us  
            sameRep = case ar of
                Nothing -> False
                _ -> ar == br
            as' = ar >>= flip lookup im
            bs' = br >>= flip lookup im
            totalSize = do
                asize <- as' 
                bsize <- bs' 
                return $ asize + bsize
            maxSize = (convertMaybe as') + (convertMaybe bs')
            us' = I.union a b $ I.insert a $ I.insert b $ us
            newRep = fromJust $ fst $ I.lookup a us' 
            newRep' = fromJust $ fst $ I.lookup b us' 
            im'' = case ar of
                Nothing -> case br of
                    Nothing -> im
                    Just bk -> delete bk im
                Just ak -> delete ak $ case br of
                    Nothing -> im
                    Just bk -> delete bk im
            im' = case totalSize of  
                Nothing -> insert newRep maxSize im''
                Just t -> insert newRep t im''

startLinkingAux' (c,x,y,us,im) = let t@(_,x',_,us',im') = linkCallers (c,x,y,us,im) in
    case (fst $ I.lookup primeMinister us') >>= flip lookup im' >>= return . (>=990000) of
        Just True -> x'
        _ -> startLinkingAux' t

startLinkingAux' used to look something like this:

startLinkingAux' (c,x,y,us,im) = do
    print (c,x,y,us,im)
    let t@(_,x',_,us',im') = linkCallers (c,x,y,us,im)
    case (fst $ I.lookup primeMinister us') >>= flip lookup im' >>= return . (>=990000) of
        Just True -> return x'
        _ -> startLinkingAux' t

You should show at least one function where you removed the print. — Zeta, Commented Jan 20, 2014 at 23:35
Just a hunch: try forcing (evaluating) the values where you used to print them and see if it makes a difference. — fjh, Commented Jan 20, 2014 at 23:46
Just a guess - the print statement forced evaluation of a thunk (lazy computation), and without it the thunks accumulated until they overflowed the stack. See this article for an explanation of how thunks can build up: haskell.org/haskellwiki/Foldr_Foldl_Foldl' — ErikR, Commented Jan 20, 2014 at 23:47
Since the program consumed a lot of (heap) memory with printing, my guess would be that there is a memory leak (in both IO and pure version), but in the pure version compiler optimizes it to use stack instead of heap, hence the overflow exception. — Petr, Commented Jan 21, 2014 at 7:17

Petr · Accepted Answer · 2014-01-21 20:33:13Z

There could be a memory leak in one of the arguments. Probably the first thing I'd try would be to ask the author of disjoint-set to add an NFData instance for IntDisjointSet (or do it yourself; looking at the source code, it'd be fairly easy). Then try calling force on all values returned by linkCallers to see if it helps.
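
To illustrate the `force` suggestion, here is a minimal sketch. It assumes a simplified state (a counter and an IntMap) in place of the real tuple, because IntDisjointSet has no NFData instance out of the box. The names `step` and `runStrict` are hypothetical stand-ins for linkCallers and startLinkingAux'; only `force` (from deepseq) and the IntMap functions are real library calls.

```haskell
import Control.DeepSeq (force)
import qualified Data.IntMap.Strict as IM

-- Hypothetical stand-in for linkCallers: one step that bumps a counter
-- and updates a map. Nothing here forces the counter by itself.
step :: (Int, IM.IntMap Int) -> (Int, IM.IntMap Int)
step (n, m) = (n + 1, IM.insertWith (+) (n `mod` 10) 1 m)

-- Hypothetical stand-in for startLinkingAux': run `step` k times,
-- evaluating the whole state to normal form before each recursive call
-- so that no chain of suspended (+ 1) applications can build up.
runStrict :: Int -> (Int, IM.IntMap Int) -> (Int, IM.IntMap Int)
runStrict 0 s = s
runStrict k s =
  let s' = force (step s)
  in s' `seq` runStrict (k - 1) s'
```

Note that `force` walks the entire structure on every iteration, so this is probably more useful as a diagnostic (does the overflow go away?) than as a permanent fix; once the leaking component is found, making just that component strict is likely cheaper.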

Second, you're not using disjoint-set right. The main idea of the algorithm is that lookups compress paths in the set. This is what gives it its great performance! So every time you make a lookup, you should replace your old set with the new one that the lookup returns. But this makes using a disjoint set quite clumsy in a functional language. I'd suggest using the State monad for this and using it internally in linkCallers, as one big do block instead of where, just passing in the starting set and extracting the final one. And define functions like

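-- assumes: import Control.Monad.State (MonadState, modify, state)
--          import qualified Data.IntDisjointSet as I
-- (older GHCs also need the FlexibleContexts extension for these signatures)
insertS :: (MonadState I.IntDisjointSet m) => Int -> m ()
insertS x = modify (I.insert x)

lookupS :: (MonadState I.IntDisjointSet m) => Int -> m (Maybe Int)
lookupS x = state (I.lookup x)

-- etc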

to use inside State. (Perhaps they'd make a good contribution to the library as well, since this is probably a common problem.)
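
The idea of threading the compressed set through State can be sketched without the library. The following is only an illustration under stated assumptions: `DSet`, `findRep`, and `unionS` are hypothetical, a plain IntMap of parent pointers stands in for IntDisjointSet, and there is no union by rank. It shows that a lookup returns an updated set and that State keeps that updated set for later operations.

```haskell
import Control.Monad.State (State, runState, modify, state)
import qualified Data.IntMap.Strict as IM

-- Toy stand-in for IntDisjointSet (not the library's type):
-- each element maps to its parent; a root maps to itself.
type DSet = IM.IntMap Int

-- Find the representative of x. The returned set has every element on
-- the path from x to the root pointing directly at the root.
findRep :: Int -> DSet -> (Maybe Int, DSet)
findRep x s = case IM.lookup x s of
  Nothing -> (Nothing, s)
  Just p
    | p == x    -> (Just x, s)
    | otherwise -> case findRep p s of
        (Just r, s')  -> (Just r, IM.insert x r s')
        (Nothing, s') -> (Nothing, s')

-- Add x as its own singleton class unless it is already present.
insertS :: Int -> State DSet ()
insertS x = modify (IM.insertWith (\_ old -> old) x x)

-- Look up the representative; the compressed set replaces the old one.
lookupS :: Int -> State DSet (Maybe Int)
lookupS x = state (findRep x)

-- Merge the classes of a and b (simplified: no union by rank).
unionS :: Int -> Int -> State DSet ()
unionS a b = do
  ra <- lookupS a
  rb <- lookupS b
  case (ra, rb) of
    (Just x, Just y) | x /= y -> modify (IM.insert x y)
    _                         -> pure ()

-- Build the chain 1 -> 2 -> 3, then look up 1.
demo :: (Maybe Int, DSet)
demo = runState (mapM_ insertS [1, 2, 3] >> unionS 1 2 >> unionS 2 3 >> lookupS 1) IM.empty
```

After `demo` runs, element 1 points directly at its representative in the final set, which is the effect that is lost when the set returned by a lookup is discarded with `fst`.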

Finally, there are a lot of small improvements that can make the code more readable:

Many times you're applying a single function to two values. I'd suggest defining something like

onPair :: (a -> b) -> (a, a) -> (b, b)
onPair f (x, y) = (f x, f y)
-- and use it like:
(ar, br) = onPair (fst . flip I.lookup us) (a, b)

Also using Applicative functions can make things shorter:

sameRep = fromMaybe False $ (==) <$> ar <*> br
totalSize = (+) <$> as' <*> bs'

then also

im'' = maybe id delete ar . maybe id delete br $ im
im' = insert newRep (fromMaybe maxSize totalSize) im''
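
To check how these rewrites behave on their own, here is a small sketch with the values passed in as arguments instead of taken from the where block. The function `dropReps` is a hypothetical name for the `im''` expression; the other two reuse the names from the question.

```haskell
import Data.IntMap (IntMap, delete, fromList)
import Data.Maybe (fromMaybe)

-- True only when both representatives exist and are equal.
sameRep :: Maybe Int -> Maybe Int -> Bool
sameRep ar br = fromMaybe False ((==) <$> ar <*> br)

-- Sum of both sizes, or Nothing if either one is missing.
totalSize :: Maybe Int -> Maybe Int -> Maybe Int
totalSize sa sb = (+) <$> sa <*> sb

-- Delete whichever representative keys are present; a Nothing
-- contributes `id`, which leaves the map unchanged.
dropReps :: Maybe Int -> Maybe Int -> IntMap Int -> IntMap Int
dropReps ar br = maybe id delete ar . maybe id delete br
```

For instance, `sameRep Nothing Nothing` is False, which matches the original case expression, and `dropReps (Just 1) Nothing` removes only key 1.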

Hope it helps.
