To answer the original question, I need to ask a number of related questions and attempt some answers to those. I will then get to the essence of the OP's question, which is very interesting, at the end.
Question: Let's start with: What is AI research?
AI research might well have started with Alan Turing, even before the first computer had been built. Turing asked what tasks a machine could be made to do that were considered the province of human thought. Early AI research included aspects of search, for example solving complex puzzles and solving mathematical problems in an automated manner. (Playing checkers and chess are later examples from this line of thinking.) During World War II, unbeknownst to most of us until decades later, Turing was working on complex search procedures to improve the decoding of the German "Enigma" encryption. Other aspects included parsing and, of course, processing of written human language (in computer-coded form).[1][Other ref needed?]
Turing speculated not only on algorithmic means to solve problems that humans could solve, but also speculated on connections of nodes similar to what we call "neural networks" today.
We note that, depending on whom one asks, the meaning of "AI research" has evolved considerably over time. I will discuss some of the problems with the original methods in a bit, because the somewhat incorrect assumptions in older research are important.
Question: What does early AI research have to do with modern computing?
LISP is apparently the second-oldest computer language still in use today (FORTRAN being the oldest). It was developed to aid in what was considered AI research at the time.[2]
"Lisp became the 'pathfinder' for many ideas which found application in the modern programming languages: tree-like structures, dynamic typing, higher-order functions and many others."[2]
The importance will become apparent below; note that the Python language[3]:
"...uses dynamic typing and a combination of reference counting and a cycle-detecting garbage collector for memory management. .....
"Its design offers some support for functional programming in the Lisp tradition. It has filter, map and reduce functions; list comprehensions, dictionaries, sets, and generator expressions."
One of the most important early developments in LISP was interactive debugging: when a failure occurred, you could interactively investigate the state of the computation. This was possible primarily because LISP was largely an interpreted language, so you could invoke LISP functions to examine the data right at the point where execution stopped. I also note that LISP was an example of an interpreted language that later gained compilation and could combine aspects of both, which matters for later...
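As a rough modern analogue of that Lisp-style inspection at the point of failure, Python can examine the local variables of the frame where an exception occurred. This is only an illustrative sketch in Python (the `divide` function and its arguments are invented), not a claim about how Lisp itself did it:

```python
import sys

def divide(a, b):
    ratio = a / b  # fails when b == 0, before ratio is assigned
    return ratio

try:
    divide(10, 0)
except ZeroDivisionError:
    # Walk to the innermost frame of the failed call and read its locals,
    # much as a Lisp programmer could poke at state where execution stopped.
    tb = sys.exc_info()[2]
    while tb.tb_next:
        tb = tb.tb_next
    frame_locals = dict(tb.tb_frame.f_locals)
    print(frame_locals)  # {'a': 10, 'b': 0}
```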
Many modern elements of computer languages were developed in the AI programming communities. As the nature of AI computing changed and features became available in other languages, the usefulness of LISP itself as compared to other languages decreased.
Question: What were early assumptions that later AI research improved on?
One of the most important assumptions was that "crisp logic" based search could solve some of the most important problems. I will note that in the early days there were essentially two entirely separate camps of AI research: symbolic and programming approaches, and "neural network" approaches. These two communities pretty much did not talk to each other. [Ref needed]
The main issue is that completely exact, crisp logic and search are so "brittle" that finding answers in large bodies of stored information would most often fail: very small differences could not be resolved because they did not exactly match the stored form or representation. More "fuzzy" representations and matching processes work better. Later research, in both "AI" and "machine learning," often combines or utilizes concepts from the "neural network" approach, which allows matching that is less exact to the pattern of interest, and which can furthermore form its own matching links from data fed to the system rather than being preprogrammed or based on human-coded data structures. In essence, the "neural network" and functional/procedural strands of AI research are no longer separate communities. [Ref needed]
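A toy Python sketch of the brittleness point, contrasting exact matching with similarity-based matching; the stored entries, the query, and the 0.8 threshold are all invented for illustration:

```python
from difflib import SequenceMatcher

stored = ["neural network", "search algorithm", "pattern matching"]

def crisp_lookup(query, entries):
    # Exact-match search: the slightest variation in form breaks it.
    return [e for e in entries if e == query]

def fuzzy_lookup(query, entries, threshold=0.8):
    # Similarity-based search: tolerates small representational differences.
    return [e for e in entries
            if SequenceMatcher(None, query, e).ratio() >= threshold]

print(crisp_lookup("neural networks", stored))  # [] -- one trailing 's' fails
print(fuzzy_lookup("neural networks", stored))  # ['neural network']
```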
Question: What computer language is the most commonly cited example of AI today, ChatGPT, written in?
One of the major languages ChatGPT is written in is Python (along with several supporting or related packages attached to the language).[4]
I bring this up to show that the history of LISP feeds into the modern tools needed for today's AI implementations. The LISP language itself no longer has the relevance it once had in modern approaches, because the features that were most helpful are now implemented in other languages and tools, and LISP has a readability problem with all its nested parentheses. Even features of early tools such as the EMACS editor, which helped format the language to be readable, have been implemented in modern development environments for a plethora of languages.
Now to the OP question: What language design features made Lisp apt for the task of AI research?
First, it was easy to work with parts of a program because of the interpreted and immediate, online nature of the language. (By "online" I mean that even when run on a mainframe of the day, the user had a time-sharing "terminal" or other means to work almost as though on a personal computer as we do today.) Other useful programming aspects of LISP were already explored in the questions above.
The language naturally parsed words into lists ("LISt Processing"). Interpreting a list of words immediately connected to the dictionary of those words, so the "property" lists could be searched and connected immediately to the stored data associated with each word. Recursion was easily supported, which matched well with search operations that involved recursive pattern matching of one sort or another (noting these were almost always "procedural" sequences even when implemented in "functional" form). Since data structures could be stored in the same form the language naturally used for code, the combination of data and algorithm was easily supported.
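A toy Python sketch of these ideas, with an invented "property list" dictionary and a recursive walk over nested lists standing in for Lisp's native idioms:

```python
# A toy "property list": each word maps to its stored associations,
# analogous to Lisp property lists attached to symbols.
properties = {
    "bird": {"can-fly": True, "is-a": "animal"},
    "penguin": {"can-fly": False, "is-a": "bird"},
}

def find_words(tree, predicate):
    """Recursively walk a nested list, collecting words whose
    property list satisfies the predicate."""
    found = []
    for item in tree:
        if isinstance(item, list):
            found.extend(find_words(item, predicate))  # recurse into sublists
        elif item in properties and predicate(properties[item]):
            found.append(item)
    return found

# Nested lists of words, as a Lisp reader would naturally produce:
sentence = ["the", ["small", "penguin"], "saw", ["a", "bird"]]
print(find_words(sentence, lambda p: p.get("can-fly")))  # ['bird']
```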
Other aspects can be found in the rest of this discussion, in the other questions I ask and partly answer.
Please note my references were the first hits on each search; I'm sure someone can do more extensive research on references.
[1] https://news.harvard.edu/gazette/story/2012/09/alan-turing-at-100/
[2] https://typeable.io/blog/2021-10-04-lisp-usage.html
[3] https://en.wikipedia.org/wiki/Python_(programming_language)
[4] https://botpress.com/blog/list-of-languages-supported-by-chatgpt