how would one merge two strings that are ordered alphabetically in lisp using recursion

Tag: recursion , lisp , tailrecursion-modulo-cons Author: gssh409 Date: 2012-04-08

I am learning Lisp. I have implemented a common lisp function that merge two strings that are ordered alphabetically using recursion. Here is my code, but there is something wrong and I didn't figure it out.

(defun merge (F L)
    (if (null F)
        (if (null L)
            F         ; return f
            ( L ))    ; else return L
        ;else if
        (if (null L)
            F)        ; return F
    ;else if
    (if (string< (substring F 0 1) (substring L 0 1)
        (concat 'string (substring F 0 1) (merge (substring F 1 (length F)) L)))
        (concat 'string (substring L 0 1) (merge F (substring L 1 (length L)) ))

Edit : I simply want to merge two strings such as; inputs are string a = adf and string b = beg and the result or output should be; abdefg

Thanks in advance.

or is there a better way to do?
You should describe what errors you're getting and exactly what's not working. Providing a small test case with what you want and what it's actually doing is also good.
could you please format the code correctly?
(L) will call L with 0 arguments. Remove all your extraneous parentheses as a first step. Also, there's no need to have a space after opening parenthesis. It makes your code very hard to read. If you want to return L, call (return L), not just L, if not in terminal position.
next, mind the types of your values. null is meaningful to call on lists, to check whether they are empty or not, not on strings. (null "") returns NIL. (length "") returns 0. The full name of the function you're using is concatenate, not concat. But before all, please format your code!

Best Answer

Judging by your comments, it looks like you're trying to use if with a series of conditions (like a series of else ifs in some other languages). For that, you probably want cond.

I replaced that if with cond and cleaned up some other errors, and it worked.

(defun empty (s) (= (length s) 0))

(defun my-merge (F L)
   ((empty F)
    (if (empty L)
   ((empty L)
    (if (string< (subseq F 0 1) (subseq L 0 1))
      (concatenate 'string (subseq F 0 1) (my-merge (subseq F 1 (length F)) L)) 
      (concatenate 'string (subseq L 0 1) (my-merge F (subseq L 1 (length L))))))))

Your test case came out as you wanted it to:

* (my-merge "adf" "beg")



thank you very much.
the first clause in the cond is unnecessarily complex. L can just be returned in any case.
@Will: That's true. I thought it would be more helpful just to get the original code to work first, but I can add that to my answer.

Other Answer1

Using string< is an overkill, char< should be used instead, as shown by Kaz. Recalculating length at each step would make this algorithm quadratic, so should be avoided. Using sort to "fake it" makes it O(n log n) instead of O(n). Using concatenate 'string all the time probably incurs extra costs of unneeded traversals too.

Here's a natural recursive solution:

(defun str-merge (F L)
  (labels ((g (a b)
               ((null a) b)
               ((null b) a)
               ((char< (car b) (car a))
                  (cons (car b) (g a (cdr b))))
               (t (cons (car a) (g (cdr a) b))))))
    (coerce (g (coerce F 'list) (coerce L 'list)) 'string)))

But, Common Lisp does not have a tail call optimization guarantee, let alone tail recursion modulo cons optimization guarantee (even if the latter was described as early as 1974, using "Lisp 1.6's rplaca and rplacd field assignment operators"). So we must hand-code this as a loop:

(defun str-merge (F L &aux (s (list nil)) )
  (do ( (p s (cdr p))
        (a (coerce F 'list) (if q a (cdr a)))
        (b (coerce L 'list) (if q (cdr b) b))
        (q nil))
      ( (or (null a) (null b))
          (if a (rplacd p a) (rplacd p b))
          (coerce (cdr s) 'string))
    (setq q (char< (car b) (car a)))
    (if q
      (rplacd p (list (car b)))
      (rplacd p (list (car a))))))


You can use macros (like tailprog) to rewrite your code into guaranteed tail recursion. Here: There is no example usage of tailprog, but it basically looks like labels. Some examples occur in archived Usenet discussion.
PS. since go is an operator in Common Lisp, you should not use it as the name of a function. Your code can break macros which use go in their macro expansion. Speaking of which, the tailprog macro will actually expand your tail recursive calls to (go ...).
@Kaz thanks, forgot about that. :) On second thought, it's my code, and I don't use macros which use go. I don't have tagbody (or what's the name) here either (explicit, at least...). So can I just leave it as it is?
@Kaz oh I get it, you meant in general. About tailprog, does it handle tail-rec-mod-cons? Just TCO isn't enough here I think.
Tail calls under tailprog do not return under any circumstances, so code which expects to be able to cons something onto the return value will not work.

Other Answer2

There were quite a few good answers, so why would I add one more? Well, the below is probably more efficient then the other answers here.

(defun merge-strings (a b)
  (let* ((lena (length a))
         (lenb (length b))
         (len (+ lena lenb))
         (s (make-string len)))
        ((safe-char< (x y)
           (if (and x y) (char< x y)
               (not (null x))))
         (choose-next (x y)
           (let ((ax (when (< x lena) (aref a x)))
                 (by (when (< y lenb) (aref b y)))
                 (xy (+ x y)))
               ((= xy len) s)
               ((safe-char< ax by)
                (setf (aref s xy) ax)
                (choose-next (1+ x) y))
                (setf (aref s xy) by)
                (choose-next x (1+ y)))))))
      (choose-next 0 0))))

(merge-strings "adf" "beg")

It is more efficient specifically in the sense of memory allocations - it only allocates enough memory to write the result string, never coerces anything (from list to string or from array to string etc.) It may not look very pretty, but this is because it is trying to do every calculation only once.

This is, of course, not the most efficient way to write this function, but programming absolutely w/o efficiency in mind is not going to get you far.


there's some distance between programming "absolutely w/o efficiency" and "as efficient as possible". I specifically tried to provide a linear solution, building the result in top-down manner. You've came one step further and worked on the results stream itself. Very nice. (why "not pretty" - is too, pretty). If only you were using i,j instead of x,y for indices, I'd be more happy. :) One thing though: for mergesort to be stable, merge should be left-preferential. I.e. when the two chars are equal, take the one from the left.

Other Answer3

A recursive way to do it (fixed according to comment- other solutions can get an IF form as well).

(defun merge-strings (a b)
  (concatenate 'string
               (merge-strings-under a b)))

(defun merge-strings-under (a b)
  (when (and
       (= (length a)
          (length b))
       (> (length a) 0))
    (append (if (string< (aref a 0) (aref b 0))
                (list (aref a 0) (aref b 0))
                (list (aref b 0) (aref a 0)))
            (merge-strings-under (subseq a 1)
                           (subseq b 1)))))

Here's a iterative way to do it.

(concatenate 'string 
    (loop for i across "adf" for j across "beg" nconc (list i j)))

Note that these rely on building the string into a list of characters, then vectorizing it ( a string is a vector of characters).

You can also write a more C-esque approach...

(defun merge-strings-vector (a b)
  (let ((retstr (make-array (list (+
                                   (length a)
                                   (length b)))
                            :element-type 'character)))
    (labels ((merge-str (a b i)
               (when (and
                    (= (length a)
                       (length b))
                    (/= i (length a)))
                 (setf (aref retstr (* 2 i)) (aref a i))
                 (setf (aref retstr (1+ (* 2 i))) (aref b i))
                 (merge-str a b (1+ i)))))

      (merge-str a b 0)

Note that this one - unlike the other 2 - has side effects within the function. It also, imo, is more difficult to understand.

All 3 take varying numbers of cycles to execute on SBCL 56; each seems to take between 6K and 11K on most of my trials. I'm not sure why.


could you please explain your recursive solution?
I think your recursive code doesn't work correctly. When I run this code with the arguments "abc" and "def", the output is (#\a #\d #\b #\e #\c #\f).
you missed the point of this. The two inputs are ordered but the output must be ordered too.
@WillNess: it's a quick fix.
Your solutions implement a pairwise interleave, not a merge of sorted sequences. Interleaving just happens to work for the ill-chosen "adf"/"beg" test case that student presented.