Finding the character frequency in Common Lisp

Tag: lisp , common-lisp Author: wangluochuanbo Date: 2011-10-19

For example, if I enter the sequence of characters

"Hello world"

the output should be:

H = 1
e = 1
l = 3
o = 2
r = 1
w = 1
d = 1

Can someone help me?

I found this code online, but I don't understand it. I want a simpler one.

(defun letter-freq (file)
 (with-open-file (stream file)
  (let ((str (make-string (file-length stream)))
        (arr (make-array 256 :element-type 'integer :initial-element 0)))
    (read-sequence str stream)
    (loop for c across str do (incf (aref arr (char-code c))))
    (loop for c from 32 to 126 for i from 1 do
      (format t "~c: ~d~a"
        (code-char c) (aref arr c)
        (if (zerop (rem i 8)) #\newline #\tab))))))

(letter-freq "test.lisp")

Other Answer1

The above code is quite specific to ASCII characters. If you want to do the same for any possible character, you can use a hash-table.

(defun letter-freq (file)
  (with-open-file (stream file)
    (let ((str (make-string (file-length stream)))
          (ht (make-hash-table)))
      (read-sequence str stream)
      (loop :for char :across str :do
        (incf (gethash char ht 0)))
      (maphash (lambda (k v)
                 (format t "~@C: ~D~%" k v))
               ht))))

The ~@C format directive prints the character as if by prin1.


Your method of reading the file is not good, especially if your file can have non-ascii characters. file-length returns the length in bytes, but characters can be one or more bytes. [edited out example code that isn't formatted correctly] Better just loop around read-line.
I agree, but I just used the method from the original code so as not to change it too much. The better one can be found here: click me
@angus: At least the string will never be too short, and I believe there's an upper bound on how wrong it will be. Combine it with a fill-pointer (I think those are automatically set by READ-SEQUENCE) and "all" you're doing is possibly wasting RAM.
@Vatine: no, fill pointers aren't handled automatically by read-sequence. You would have garbage characters at the end of the string, whose frequency you would then incorrectly calculate.
@angus READ-SEQUENCE returns the last position updated, so it would be possible to set a fill-pointer (or limit a loop).
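
As a rough sketch of the last point in the comments above, the position returned by READ-SEQUENCE can be used to limit the counting loop, so that leftover characters in the buffer are never counted. The function name count-chars-buffered and the default buffer size are made up for this illustration; they are not from the original code.

```lisp
;; Hypothetical helper: counts characters from any character input stream.
;; It reads in fixed-size chunks and only looks at the part of the buffer
;; that READ-SEQUENCE reports as filled.
(defun count-chars-buffered (stream &optional (buffer-size 4096))
  "Return a hash table mapping each character read from STREAM to its count."
  (let ((buffer (make-string buffer-size))
        (counts (make-hash-table)))
    (loop for end = (read-sequence buffer stream)
          ;; END is the index of the first buffer element NOT updated,
          ;; so only indices below END hold fresh data.
          do (loop for i from 0 below end
                   do (incf (gethash (char buffer i) counts 0)))
          ;; A short (or empty) read means the stream is exhausted.
          while (= end buffer-size))
    counts))
```

Because it takes a stream rather than a file name, it can be tried at the REPL with with-input-from-string, or on a file by wrapping the call in with-open-file.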

Other Answer2

This code isn't that hard to understand. It opens the file and reads it into a string. It also makes an array to hold the results (size 256 because theoretically you could have non-printing chars above 128, I guess). Then it loops over the string and, for each character, increments the corresponding element in the array. For instance, the char code of 'a' is 97, so when it finds an 'a' it increments array element 97.

At the end it loops over just the printable character results and prints them out.
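
To make that description concrete, here is a minimal sketch of the same array technique applied directly to a string instead of a file. The names char-freq and print-char-freq are made up for this example, and it assumes an implementation whose character codes agree with ASCII for these characters (the usual case). Characters with codes of 256 or above are simply skipped.

```lisp
;; Hypothetical helper: one counter per character code from 0 to 255.
(defun char-freq (str)
  "Return a 256-element vector; element N counts the characters with code N."
  (let ((counts (make-array 256 :element-type 'fixnum :initial-element 0)))
    (loop for c across str
          for code = (char-code c)
          when (< code 256)              ; ignore characters outside the array
            do (incf (aref counts code)))
    counts))

;; Hypothetical helper: print only the characters that actually occurred.
(defun print-char-freq (str)
  (let ((counts (char-freq str)))
    (loop for code from 0 below 256
          for n = (aref counts code)
          when (plusp n)
            do (format t "~C = ~D~%" (code-char code) n))))

(print-char-freq "Hello world")
```

The results come out in character-code order rather than in order of first appearance, and the space between the two words is counted like any other character.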

Other Answer3

I would tend to agree with drysdam. I haven't touched any Common Lisp code in a while and was able to read this example with general understanding, as he has described it.

I don't know what kind of Lisp environment you are using, but even within a bare CL REPL (read-eval-print loop) you can ask the system to (describe 'some-unknown-symbol). And if you happen to be "forced" to use Emacs, it has a SLIMEy gobful of features.

I see this is your second lisp related question today. Perhaps it would be better to hit some books.


Yes, I highly recommend that book too. In fact, I've hardly done any real CL programming and I could still understand that quoted code because I read that book.

Other Answer4

I would separate the task into 2 smaller tasks:

1 Read from a file and return a string
2 Count the letter frequency of the characters in the string

For 1, you can use the file-string function (

For 2, you could use a 'bag' data structure, and the fset package library (

(defun letter-freq (str)
  (let ((bg (fset:convert 'fset:bag str)))
    (fset:do-bag-pairs (value mult bg)
                       (format t "~a: ~a~%" value mult))))
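
For comparison, the same two-step split can be sketched with standard Common Lisp only. This is an illustration under assumptions, not the file-string function or the fset code referred to above: the names file-to-string, count-characters and print-letter-freq are all made up here.

```lisp
;; Step 1 (hypothetical helper): read a whole file into a string,
;; one character at a time, so multi-byte encodings are not an issue.
(defun file-to-string (path)
  (with-open-file (in path :direction :input)
    (with-output-to-string (out)
      (loop for ch = (read-char in nil nil)
            while ch
            do (write-char ch out)))))

;; Step 2 (hypothetical helper): count how often each character occurs.
(defun count-characters (str)
  (let ((counts (make-hash-table)))
    (loop for ch across str
          do (incf (gethash ch counts 0)))
    counts))

;; Glue: print one "character: count" line per distinct character.
(defun print-letter-freq (path)
  (maphash (lambda (ch n) (format t "~S: ~D~%" ch n))
           (count-characters (file-to-string path))))
```

Keeping the counting step separate from the file handling means it can be tried directly on a string at the REPL, for example (gethash #\l (count-characters "Hello world")).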