I have a function ascArr :: String -> BigData
for parsing some big strict data from a string and another one, altitude :: BigData -> Pt -> Maybe Double
, for getting something useful from the parsed data. I want to parse the big data once and then use the altitude
function with the first argument fixed and the second one varying. Here's the code (TupleSections are enabled):
exampleParseAsc :: IO ()
exampleParseAsc = do
asc <- readFile "foo.asc"
let arr = ascArr asc
print $ map (altitude arr . (, 45)) [15, 15.01 .. 16]
This is all ok. Then I want to connect the two functions together and to use partial application for caching the big data. I use three versions of the same function:
parseAsc3 :: String -> Pt -> Maybe Double
parseAsc3 str = altitude d
where d = ascArr str
parseAsc4 :: String -> Pt -> Maybe Double
parseAsc4 str pt = altitude d pt
where d = ascArr str
parseAsc5 :: String -> Pt -> Maybe Double
parseAsc5 = curry (uncurry altitude . first ascArr)
And I call them like this:
exampleParseAsc2 :: IO ()
exampleParseAsc2 = do
asc <- readFile "foo.asc"
let alt = parseAsc5 asc
print $ map (alt . (, 45)) [15, 15.01 .. 16]
Only the parseAsc3 works like in the exampleParseAsc
: Memory usage rises at the beginning (when allocating memory for the UArray in the BigData), then it is constant while parsing, then altitude
quickly evaluates the result and then everything is done and the memory is freed. The other two versions are different: The memory usage rises multiple times until all the memory is consumed, I think that the parsed big data is not cached inside the alt
closure. Could someone explain the behaviour? Why are the versions 3 and 4 not equivalent? In fact I started with something like parseAsc2
function and just after hours of trial I found out the parseAsc3 solution. And I am not satisfied without knowing the reason...
Here you can see all my effort (only the parseAsc3
does not consume whole the memory; parseAsc
is a bit different from the others - it uses parsec and it was really greedy for memory, I'd be glad if some one explained me why, but I think that the reason is different than the main point of this question, you may just skip it):
type Pt = (Double, Double)
type BigData = (UArray (Int, Int) Double, Double, Double, Double)
parseAsc :: String -> Pt -> Maybe Double
parseAsc str (x, y) =
case parse ascParse "" str of
Left err -> error "no parse"
Right (x1, y1, coef, m) ->
let bnds = bounds m
i = (round $ (x - x1) / coef, round $ (y - y1) / coef)
in if inRange bnds i then Just $ m ! i else Nothing
where
ascParse :: Parsec String () (Double, Double, Double, UArray (Int, Int) Double)
ascParse = do
[w, h] <- mapM ((read <$>) . keyValParse digit) ["ncols", "nrows"]
[x1, y1, coef] <- mapM ((read <$>) . keyValParse (digit <|> char '.'))
["xllcorner", "yllcorner", "cellsize"]
keyValParse anyChar "NODATA_value"
replicateM 6 $ manyTill anyChar newline
rows <- replicateM h . replicateM w
$ read <$> (spaces *> many1 digit)
return (x1, y1, coef, listArray ((0, 0), (w - 1, h - 1)) (concat rows))
keyValParse :: Parsec String () Char -> String -> Parsec String () String
keyValParse format key = string key *> spaces *> manyTill format newline
parseAsc2 :: String -> Pt -> Maybe Double
parseAsc2 str (x, y) = if all (inRange bnds) (is :: [(Int, Int)])
then Just $ (ff * (1 - px) + cf * px) * (1 - py)
+ (fc * (1 - px) + cc * px) * py
else Nothing
where (header, elevs) = splitAt 6 $ lines str
header' = map ((!! 1) . words) header
[w, h] = map read $ take 2 header'
[x1, y1, coef, _] = map read $ drop 2 header'
bnds = ((0, 0), (w - 1, h - 1))
arr :: UArray (Int, Int) Double
arr = listArray bnds (concatMap (map read . words) elevs)
i = [(x - x1) / coef, (y - y1) / coef]
[ixf, iyf, ixc, iyc] = [floor, ceiling] >>= (<$> i)
is = [(ix, iy) | ix <- [ixf, ixc], iy <- [iyf, iyc]]
[px, py] = map (snd . properFraction) i
[ff, cf, fc, cc] = map (arr !) is
ascArr :: String -> BigData
ascArr str = (listArray bnds (concatMap (map read . words) elevs), x1, y1, coef)
where (header, elevs) = splitAt 6 $ lines str
header' = map ((!! 1) . words) header
[w, h] = map read $ take 2 header'
[x1, y1, coef, _] = map read $ drop 2 header'
bnds = ((0, 0), (w - 1, h - 1))
altitude :: BigData -> Pt -> Maybe Double
altitude d (x, y) = if all (inRange bnds) (is :: [(Int, Int)])
then Just $ (ff * (1 - px) + cf * px) * (1 - py)
+ (fc * (1 - px) + cc * px) * py
else Nothing
where (arr, x1, y1, coef) = d
bnds = bounds arr
i = [(x - x1) / coef, (y - y1) / coef]
[ixf, iyf, ixc, iyc] = [floor, ceiling] >>= (<$> i)
is = [(ix, iy) | ix <- [ixf, ixc], iy <- [iyf, iyc]]
[px, py] = map (snd . properFraction) i
[ff, cf, fc, cc] = map (arr !) is
parseAsc3 :: String -> Pt -> Maybe Double
parseAsc3 str = altitude d
where d = ascArr str
parseAsc4 :: String -> Pt -> Maybe Double
parseAsc4 str pt = altitude d pt
where d = ascArr str
parseAsc5 :: String -> Pt -> Maybe Double
parseAsc5 = curry (uncurry altitude . first ascArr)
Compiled with GHC 7.10.3, with -O optimization.
Thank you.