7
\$\begingroup\$

As a first step in learning Haskell I am solving this problem, which involves finding the Rith-ranked numbers, given some input list and many Ri. In an imperative language I would make a zeroed array c of length 200, increment c[h] for each height h, compute cumulative sums of c and then binary search c to determine the height corresponding to each given index. Because max_height is fixed, this has runtime linear in the size of input and bounded memory(*) excluding the input.

Here's my Haskell code:

max_height = 200
count_eq e = foldl (\c f -> if e == f then c + 1 else c) 0
counts heights = map (flip count_eq heights) [0..max_height]
first_gt e l = f l 0 where f (x:xs) i = if x > e then i else f xs (i+1)
solve heights indices = let accum = scanl1 (+) (counts heights) in
  map (flip first_gt accum) (map (subtract 1) indices)

It is correct but slow. I would like to know how to (A) reason about and (B) improve the performance. Also (C) can I achieve the same asymptotic performance as the imperative code?

(*) assuming each c[i] fits in a machine int. I believe the runtime statement holds regardless.

\$\endgroup\$
1
  • \$\begingroup\$ Welcome to Code Review! Good job on your first question. \$\endgroup\$
    – SirPython
    Commented Apr 7, 2016 at 23:52

2 Answers 2

1
\$\begingroup\$

Maps as arrays

In counts heights you are taversing heights 200 times. It is possible to emulate array with Data.IntMap and do this in one pass:

import Data.List (foldl')
import qualified Data.IntMap.Strict as Map

count = foldl' (\m h -> Map.insertWith (+) h 1 m)
  (Map.fromList [(i,0) | i <- [1..200]])

Note that I'm using strict version of left fold (see here about foldl vs foldl') and strict Map. This means that map of height counts constructed in single pass without thunks.

solve heights = map (\ix -> 1 + length (takeWhile (<ix) counts))
  where
    counts = scanl1 (+) $ Map.elems $ count heights

I'm using 1 + length (takeWhile (<ix) counts) instead of bare recursion in first_gt.

A bit more code is required for binary search:

solve heights = map (maybe 0 snd . (`Map.lookupLT` countsMap))
  where
    counts = scanl1 (+) $ Map.elems $ count heights
    countsMap = Map.fromList $ zip counts [2..]

Vectors as arrays

If log(n) overhead of trees is too much for you, it is possible to use Data.Vector which is for real arrays in Haskell.

import qualified Data.Vector.Unboxed as V
import qualified Data.Vector.Unboxed.Mutable as M
import qualified Data.Vector.Algorithms.Search as S

max_height = 200

solve :: [Int] -> [Int] -> IO [Int]
solve heights indices = do
  v <- M.replicate (max_height+1) 0
  mapM_ (M.modify v (+1)) heights
  counts <- V.unsafeFreeze v >>= V.unsafeThaw . V.scanl1' (+)
  mapM (S.binarySearchL counts) indices

binarySearchL is from vector-algorithms package.


Some utility code to parse input:

main = getContents >>= mapM_ print . map (uncurry solve) . parse

parse :: String -> [([Int], [Int])]
parse
  = pairs
  . map
    ( takeWhile (/= 0) -- drop trailing zeroes
    . map read . words -- convert string to ints
    )
  . tail . lines       -- skip first line


pairs :: [a] -> [(a, a)]
pairs (x:y:xs) = (x, y) : pairs xs
pairs _        = []
\$\endgroup\$
1
\$\begingroup\$

Lists are not for random access. As far as I know, Vectors are the modern way to have fixed-length random-access listlikes. They will allow you to do the traversing updates you want in one pass.

import qualified Data.Vector as V
counts = V.toList . V.accum (+) (V.replicate 200 0) . map (,1)

first_gt doesn't need a manual implementation:

import Safe
first_gt e = findIndexJust (> e)

And just for fun, this should only traverse the counts result once. (Requires that indices is sorted, though.)

import Control.Monad.Trans.State
solve heights = scanl1 (+) . map length . splitsOnFoo . map (subtract 1) where
  splitsOnFoo = evalState $ traverse (state . span . (<)) $ scanl1 (+) $ counts heights
\$\endgroup\$

Not the answer you're looking for? Browse other questions tagged or ask your own question.