Discussion:
[Haskell-beginners] Longest common prefix of a list of lists
philipp siegmantel
2011-04-29 14:07:00 UTC
Permalink
Hi,

For one of my toy project I had to find the longest common prefix of a list
of strings.
I figured, that this function should only work on list-structure, meaning it
should have type
longestCommonPrefix :: [[a]] -> [a]
Sadly the best I could come up with was using explicit recursion. It works
as far as I tested it, but I have the feeling that there's a better way to
do it.
longestCommonPrefix :: Eq a => [[a]] -> [a]
longestCommonPrefix [] = []
longestCommonPrefix xss | any null xss = []
| otherwise = loop xss []
where loop ::Eq b => [[b]] -> [b] -> [b]
loop xss acc =
let xs = concatMap (take 1) xss
in if any (\x -> x /= head xs) (tail xs)
then reverse acc
else loop (map tail xss) (head xs : acc)
Daniel Fischer
2011-04-29 14:32:39 UTC
Permalink
Post by philipp siegmantel
Hi,
For one of my toy project I had to find the longest common prefix of a
list of strings.
I figured, that this function should only work on list-structure,
meaning it should have type
longestCommonPrefix :: [[a]] -> [a]
Sadly the best I could come up with was using explicit recursion. It
works as far as I tested it, but I have the feeling that there's a
better way to do it.
longestCommonPrefix :: Eq a => [[a]] -> [a]
longestCommonPrefix [] = []
longestCommonPrefix xss | any null xss = []
| otherwise = loop xss []
where loop ::Eq b => [[b]] -> [b] -> [b]
loop xss acc =
let xs = concatMap (take 1) xss
in if any (\x -> x /= head xs) (tail xs)
then reverse acc
else loop (map tail xss) (head xs : acc)
A different point of view:

given at least two lists, lists@(list1:more@(list2:_)), the longest common
prefix of lists is the common prefix of list1 and the longest common prefix
of more, so

longestCommonPrefix :: Eq a => [[a]] -> [a]
longestCommonPrefix [] = [] -- what else?
longestCommonPrefix lists = foldr1 commonPrefix lists

-- to get the common prefix of two lists, we use explicit recursion, I
don't see an elegant way to avoid that.

commonPrefix :: Eq a => [a] -> [a] -> [a]
commonPrefix (x:xs) (y:ys)
| x == y = x : commonPrefix xs ys
commonPrefix _ _ = []

produces the result incrementally.
Aai
2011-04-29 18:17:27 UTC
Permalink
I don't know for sure if the following is elegant or not, but:

a = "aaabbbbbbcccccc"
b = "aaabbbccccccd"
c = "aaabbbbbbbbbccccffffdd"

takeCommonPrefix xs = flip take xs. length . filter (`isPrefixOf`xs). drop 1. inits

*Main> foldl1 takeCommomPrefix [a,b,c]
"aaabbb"

*Main> foldl1 takeCommonPrefix [a,b,c,""]
""

May be the use of 'length' is a bit ugly. :-)
Post by Daniel Fischer
-- to get the common prefix of two lists, we use explicit recursion, I
don't see an elegant way to avoid that.
--
Met vriendelijke groet,
=@@i
Daniel Fischer
2011-04-29 19:15:13 UTC
Permalink
In a way, yes. It's a composition of library functions, so you delegate the
explicit recursion to those. There are a few points where you can increase
elegance and performance here, see inline comments.

But it's very inefficient and the suggestions don't change that.
Post by Aai
a = "aaabbbbbbcccccc"
b = "aaabbbccccccd"
c = "aaabbbbbbbbbccccffffdd"
takeCommonPrefix xs =
flip take xs. length . filter (`isPrefixOf`xs) . drop 1. inits
Don't use filter here, as soon as you hit an initial segment which isn't a
prefix of xs, no later one will be, so it's a job for takeWhile.

Also, you already have the common prefixes, so why count and take from xs?
Just take the longest:

takeCommonPrefix xs = last . takeWhile (`isPrefixOf` xs) . inits
Post by Aai
*Main> foldl1 takeCommomPrefix [a,b,c]
"aaabbb"
*Main> foldl1 takeCommonPrefix [a,b,c,""]
""
May be the use of 'length' is a bit ugly. :-)
Yes, and unneeded. But perforance-wise, that is not what hurts.

The trouble is, if xs and ys have a longest common prefix of length p, you
check whether (take k ys) is a prefix of xs for k <- [0 .. p+1], so that is
an O(p^2) algorithm. Also, you can't start delivering until the longest
common prefix has been determined, which may cause high memory requirements
and bad performance.

Consider

a = replicate 1000000 'a'
b = replicate 1000001 'a'
c = "abacab"

foldl1 takeCommonPrefix [a,b,c] =
takeCommonPrefix ab c
where
ab = takeCommonPrefix a b

If the common prefix is delivered incrementally, all we need is the first
two characters to determine the result. That's fast and needs very little
memory. If the common prefix has to be entirely determined before it can be
delivered, you need to have two one-million characters long Strings in
memory at the same time, which needs a couple dozen MB, and to determine
the longest common prefix of a and b by the above method, you need about
5*10^11 Char comparisons, so it'll need a bit of time too.
Post by Aai
Post by Daniel Fischer
-- to get the common prefix of two lists, we use explicit recursion, I
don't see an elegant way to avoid that.
James Cook
2011-04-29 19:42:17 UTC
Permalink
Post by Daniel Fischer
-- to get the common prefix of two lists, we use explicit recursion, I
don't see an elegant way to avoid that.
commonPrefix :: Eq a => [a] -> [a] -> [a]
commonPrefix (x:xs) (y:ys)
| x == y = x : commonPrefix xs ys
commonPrefix _ _ = []
produces the result incrementally.
Personally, that's the definition I'd prefer. It's simple, clear, and
fast. If I were forced to hide the recursion, I would probably write
Post by Daniel Fischer
commonPrefix a b = map fst (takeWhile (uncurry (==)) (zip a b))
modulo usage of whatever combination of (.) and/or ($) is in style
today.

-- James
aditya siram
2011-04-29 23:10:33 UTC
Permalink
-- Below is a solution that doesn't use length.
-- Documentation for each function and trial runs
-- are shown in a comment block at the end.

import Data.List.Ordered(isSortedBy)
import Safe(headMay)
import Maybe(fromJust)

a = "ababaaaa"
b = "ababcccc"
c = "ab"
d = ""

safeTail :: [a] -> [a]
safeTail [] = []
safeTail xs = tail xs

maybeTranspose :: [[a]] -> [[Maybe a]]
maybeTranspose [] = []
maybeTranspose xs | all null xs = []
| otherwise = map headMay xs : maybeTranspose (map
safeTail xs)

commonPrefix :: Eq a => [[a]] -> [a]
commonPrefix = map fromJust .
fmap (head . snd) .
takeWhile fst .
fmap (\a -> ((isSortedBy (==) a), a)) .
maybeTranspose

{-
commonPrefix [a,b] => "abab"
commonPrefix [a,b,c] => "ab"
commonPrefix [a,b,c,d] => ""
Functions:
'safeTail' only returns the tail of the list of a non-empty list
or an empty list

'maybeTranspose' is just like List.transpose except it
returns the biggest matrix, not the smallest
eg. transpose [[1,2],[],[4,5]] =>
[[1,4],[2,5]]
maybeTranspose [[1,2][][4,5]] =>
[[Just 1,Nothing,Just 4],
[Just 2,Nothing,Just 5],
[Nothing,Nothing,Nothing]]

'commonPrefix' transposes the input lists , extracts
the first n lists in the transposition where all the elements
are equal, extracts the element from each and returns a concatenation
of them.
eg. Given input ["ab", "abab", "aba"]
1. Transpose
[[Just 'a',Just 'a',Just 'a'],
[Just 'b',Just 'b',Just 'b'],
[Nothing,Just 'a',Just 'a'],
[Nothing,Just 'b',Nothing]]
2. Tag lists with equal elements
[(True,[Just 'a',Just 'a',Just 'a']),
(True,[Just 'b',Just 'b',Just 'b']),
(False,[Nothing,Just 'a',Just 'a']),
(False,[Nothing,Just 'b',Nothing])]
3. Take the first n lists with equal elements
[(True,[Just 'a',Just 'a',Just 'a']),
(True,[Just 'b',Just 'b',Just 'b'])]
4. Grab the equal element from each
[Just 'a',Just 'b']
5. Concatenate
"ab"

I didn't like using 'fromJust' but I don't see how it
can throw an exception in this case.

Working with the (Bool, [Maybe a]) tuples directly could
probably have been avoided by using Arrows, but it wasn't
obvious to me.
-}
Christopher Done
2011-04-30 02:54:59 UTC
Permalink
Post by aditya siram
I didn't like using 'fromJust' but I don't see how it
can throw an exception in this case.
λ> catMaybes [Just 1,Nothing,Just 2]
[1,2]

Loading...