Having a hard time understanding some of Paul Graham's example code in his book “On Lisp”

Question

https://sep.yimg.com/ty/cdn/paulgraham/onlisp.pdf On page 36 Paul introduces the following function:

(defun flatten (x)
  (labels ((rec (x acc)
                (cond ((null x) acc)
                      ((atom x) (cons x acc))
                      (t (rec (car x) (rec (cdr x) acc))))))
    (rec x nil)))

I'm trying to gain an intuition as to how he came to this implementation. I know that recursion is a big theme in Emacs Lisp (and Lisp in general), but trying to work through some of this code gives me a headache. Is there a strategy for studying code like this? Or does there need to be a paradigm shift in the way I'm reasoning about code such as this? Or an external resource that could give me a better intuition? Or, over time, will these idioms become more apparent to me (if this is in fact idiomatic Lisp)?

For example, given this input to function flatten: (flatten '(1 (2 3 (4 5))))
This is the trace of the given output:

{ flatten args: ((1 (2 3 (4 5))))
:{ rec@cl-flet@381 args: ((1 (2 3 (4 5))) nil)
::{ rec@cl-flet@381 args: (((2 3 (4 5))) nil)
:::{ rec@cl-flet@381 args: (nil nil)
:::} rec@cl-flet@381 result: nil
:::{ rec@cl-flet@381 args: ((2 3 (4 5)) nil)
::::{ rec@cl-flet@381 args: ((3 (4 5)) nil)
:::::{ rec@cl-flet@381 args: (((4 5)) nil)
::::::{ rec@cl-flet@381 args: (nil nil)
::::::} rec@cl-flet@381 result: nil
::::::{ rec@cl-flet@381 args: ((4 5) nil)
:::::::{ rec@cl-flet@381 args: ((5) nil)
::::::::{ rec@cl-flet@381 args: (nil nil)
::::::::} rec@cl-flet@381 result: nil
::::::::{ rec@cl-flet@381 args: (5 nil)
::::::::} rec@cl-flet@381 result: (5)
:::::::} rec@cl-flet@381 result: (5)
:::::::{ rec@cl-flet@381 args: (4 (5))
:::::::} rec@cl-flet@381 result: (4 5)
::::::} rec@cl-flet@381 result: (4 5)
:::::} rec@cl-flet@381 result: (4 5)
:::::{ rec@cl-flet@381 args: (3 (4 5))
:::::} rec@cl-flet@381 result: (3 4 5)
::::} rec@cl-flet@381 result: (3 4 5)
::::{ rec@cl-flet@381 args: (2 (3 4 5))
::::} rec@cl-flet@381 result: (2 3 4 5)
:::} rec@cl-flet@381 result: (2 3 4 5)
::} rec@cl-flet@381 result: (2 3 4 5)
::{ rec@cl-flet@381 args: (1 (2 3 4 5))
::} rec@cl-flet@381 result: (1 2 3 4 5)
:} rec@cl-flet@381 result: (1 2 3 4 5)
} flatten result: (1 2 3 4 5)

What seems to be trivial input ends up becoming a web of recursive calls. So when you think about functions such as these, do you, in your head, reason from certain base-cases and proceed? Like, for example, would you consider the (nil) case first, then the (1) case, then (1 2), then (1 (2 3)) and so on? Or is that not necessary?

To rephrase part of db48x's excellent answer, recursive logic is all about *simplifying* things using the exact functionality that you are implementing. The merge sort algorithm is a classic example: Given two pre-sorted lists, you can merge them into a single sorted list in linear time, always simply comparing the front item from each list. So given a single *unsorted* list, if you were to split it arbitrarily into two sub-lists, *sort each sub-list*, and then merge them, you end up with a sorted version of the original list. How do you sort each sub-list? You've already done it! — phils, Feb 26 '21 at 06:12
Note that recursion hasn't actually (traditionally) been a big theme in Emacs Lisp, because Emacs Lisp has limits on how many nested function calls can be made, and so it is not safe to use for unknown/arbitrary inputs. Some other lisps, such as Scheme, implement so-called Tail-Call Optimisation when the recursive call is the very last thing to happen in the function, effectively converting the recursion into a loop, avoiding issues with function call depth, and being more efficient in general. — phils, Feb 26 '21 at 06:30
@phils oh, I thought Emacs Lisp implemented Tail-Call Optimization — John DeBord, Feb 26 '21 at 06:55
Some support for it is in the works via https://akrl.sdf.org/gccemacs.html but I think it's going to be a good many years before it would become safe to presume that most Emacs users were running elisp that way. — phils, Feb 26 '21 at 07:38
These are interesting questions, but they don't really have anything to do with Emacs. General programming questions are more appropriate at stackoverflow.com — Tyler, Feb 26 '21 at 14:15

score 2 · Accepted Answer · answered Feb 26 '21 at 03:23

I recommend watching some lectures from MIT's 6.001 course using The Structure and Interpretation of Computer Programs. It's the best software engineering textbook ever written, and the lectures are well done. (On the other hand the recordings are positively ancient, and display a lot of obvious defects from the equipment of the day. The material is timeless though, so don't let those defects distract you.

https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-001-structure-and-interpretation-of-computer-programs-spring-2005/video-lectures/

The reason I recommend these lectures is that they go over how you write software like this several times, in several different guises. You do indeed want to consider the base cases very carefully. In a case like this you would want to have a case for every kind of thing you could come across in a list that someone handed you. It could be nothing, it could be a number, a string, a symbol, or it could be a list. Luckily, in this case you'll end up doing the same thing with a number as you will do with a string, so we can group most of the cases into a single case that looks for atomic objects (things that aren't lists).

And then it's just a matter of working out what to do with the complicated case. In this case you want to flatten the cdr of the list. That seems like a lot of work until you remember that you could just call a function that flattens any list, if you have one available. Guess what? We do have a function called rec which will flatten lists, so we just call that. Never mind that we haven't actually finished writing it yet; once we do finish writing it it will flatten lists for us.

Looking at the trace is also a very good idea. You'll get something very similar to the trace if you apply the substitution model that is presented in lecture 1B. In lecture 9A they describe a register machine that can execute Scheme (aka Lisp) code, and describe how that chain of function calls that builds up in the substitution model turns into a stack stored in memory.

Having a hard time understanding some of Paul Graham's example code in his book “On Lisp”

1 Answers1