Why $d^*(q, \epsilon)$ has definition when $d(q, \varepsilon)$ does not in DFA?

Question

I'm reading an online book about DFA and NFA but it confuses me.

Given a DFA say $D=(Q,\Sigma, q_0, \delta, F_Q)$, its transition function is a total function defined on every symbol from a given alphabet (set). That is:

$$\delta: Q\times\Sigma\to Q$$

So it shows me that $\delta$ has no definition on an empty symbol $\varepsilon$. This is reasonable since $\varepsilon\not\in\Sigma$. But later on, the author simply mentions a notation $\delta^*$:

Following the usual practice of using $∗$ to designate “$0$ or more”, we define $\delta^*(q,w)$ as a convenient shorthand for “the state that the automaton will be in if it starts in state $q$ and consumes the input string $w$”. For any string, it is easy to see, based on $\delta$, what steps the machine will make as those symbols are consumed, and what $\delta^*(q,w)$ will be for any $q$ and $w$. Note that if no input is consumed, a DFA makes no move, and so $\delta^*(q, \varepsilon) = q$ for any state $q$.

The bolded part by me is what confuses me, my argument:

If $\delta$ itself is not defined for the idea of "no input is consumed", then why the author defines "no input is consumed" for the generalized idea $\delta$, i.e. the $\delta^*$ instead?

Footnote: It would be great if you could provide some formal references. On the other hand, could anyone help me confirm the quality of the link I provided?

Well, I don't see what more there is to say. I am in state $x$ and I want to know where I'll be in the DFA after I follow the path identified by the empty string. Where do I arrive? In $x$, obviously. — Sassatelli Giulio, Commented Jun 4 at 9:25
@SassatelliGiulio But if it's just $\delta$, that's not allowed. — linear_combinatori_probabi, Commented Jun 4 at 9:26
@linear_combination_probabi For that matter, if it were $\delta$, then only single characters would be allowed. That's why they define a new function $\delta^*$ on pairs of states and strings. — Sassatelli Giulio, Commented Jun 4 at 9:27
“ If $\delta$ itself is not defined for the idea of "no input is consumed", then why the author defines "no input is consumed" for the generalized idea $\delta$, i.e. the $\delta^*$ instead?” - because it is a natural and useful way of extending the transition function from the alphabet $\Sigma$ to $\Sigma^*$. Also, $\delta$ itself is not defined on length $\geq 2$ strings either so you should have a gripe with that as well — user1001001, Commented Jun 4 at 9:28
@linear_combinatori_probabi "$\delta^*$ is just a shorthand (...) to apply $\delta$ many times"... and when you apply $\delta$ zero times you stay where you are. — Sassatelli Giulio, Commented Jun 4 at 10:17

Andreas Blass · Accepted Answer · 2024-06-04 23:04:14Z

Let me start with a basic fact about how $\delta^*$ behaves on nonempty strings (so I'm not worrying yet about the $\epsilon$ that is the topic of your question), namely $$ \delta^*(q,wa)=\delta(\delta^*(q,w),a), $$ where $w$ is a nonempty string, $a$ is a single letter from your alphabet $\Sigma$, and $wa$ is the string consisting of $w$ followed by the single letter $a$. I hope this equation is in the book you're reading, but if it isn't, you should convince yourself that it's correct. It just says that where you end up starting from $q$ and following $wa$ is the same as if you started from $q$, followed only $w$, and then took one more step by reading $a$.

This equation (or something very similar) is used a lot, in proofs by induction on the length of strings, since if $w$ has length $l$ then $wa$ has length $l+1$. So it would be convenient if we could define $\delta^*$, even when the second argument is $\epsilon$ rather than a nonempty string $w$, and still have that equation hold.

So let's look at the equation above and try to make it work when we have $\epsilon$ in place of $w$. Since $\epsilon a$ is just $a$ (a string cnosisting of the single letter $a$), I'm hoping to get $$ \delta^*(q,a)=\delta(\delta^*(q,\epsilon),a). $$ On the left side of this equation, we have simply $\delta(q,a)$, since $\delta$ tells us what to do when we read a single letter. So what I want is $$ \delta(q,a)=\delta(\delta^*(q,\epsilon),a), $$ and there's an obvious way to define $\delta^*(q,\epsilon)$ to make this true, namely $\delta^*(q,\epsilon)=q$.

So, by making this definition, we get the original equation to work for all strings $w$, whether empty or not. And that's the technical motivation for $\delta^*(q,\epsilon)=q$.

The intuitive motivation, on the other hand, is what was essentially contained in the passage you quoted from your book. $\delta^*(q,w)$ is supposed to be where you end up if you start at $q$ and proceed to make all the transitions caused by the letters in $w$. ("Caused" means according to the instructions $\delta$.) So if $w$ has length $l$, you'll proceed $l$ steps from $q$. Now literally the same explanation applies even if $w$ is empty: $\delta^*(q,\epsilon)$ is supposed to be where you end up if you start at $q$ and proceed to make all the transitions caused by the letters in $\epsilon$. Since $\epsilon$ has length $0$, you'll proceed $0$ steps from $q$. So you'll end up still at $q$.

(I've written this in terms of you reading strings; your book talks rather about the automaton consuming strings. So to match things up, pretend that you're the automaton and that you consume what you read.)

"So it would be convenient if [...]" (in your first part) - Can I say that it's necessary as the recursive equation has to have an initial condition to stop? — linear_combinatori_probabi, Commented Jun 5 at 21:17
On the other hand, I also came up with a conclusion similar to your second part: I realized that when I thought about the transition function $\delta$, I subconsciously assumed that the transition has to happen once. Or, put in another way, the transition function itself cannot express "no consumption". With this idea, we have to define it, i.e. the no/empty consumption, through the extension $\delta^*$. — linear_combinatori_probabi, Commented Jun 5 at 21:26

D.W. · Accepted Answer · 2024-06-04 22:12:48Z

Why do we define $\delta^*$ this way? Because it is useful. Keep reading in your textbook, you'll probably see places where $\delta^*$ is used, and this definition often makes the presentation cleaner.

Of course, if you think there is some other definition that would make more sense, you can try imagining writing your own textbook using your definition, and see what happens to the exposition in the remaining chapters. Often you'll discover there is a good reason the author made that choice (some discussion gets messier). Sometimes it's arbitrary and doesn't matter. In any case, doing that exercise is a lot of work. Since you are still learning, I recommend you trust that the author has thought about these definitions and chosen some that are sensible.

In this case, I invite you to trust me that the math is cleaner if you define $\delta^*$ in this way. It's messier if you try to have a special exclusion for the case of length-0 strings.

Stack Exchange Network

Why $d^*(q, \epsilon)$ has definition when $d(q, \varepsilon)$ does not in DFA?

2 Answers 2

You must log in to answer this question.

Not the answer you're looking for? Browse other questions tagged
discrete-mathematics
computer-science
automata
.

Hot Network Questions

Why $d^*(q, \epsilon)$ has definition when $d(q, \varepsilon)$ does not in DFA?

2 Answers 2

You must log in to answer this question.

Not the answer you're looking for? Browse other questions tagged discrete-mathematicscomputer-scienceautomata.

Related

Hot Network Questions

Not the answer you're looking for? Browse other questions tagged
discrete-mathematics
computer-science
automata
.