The Recursion Theorem in Computability Theory
Topics in computability theory covering Turing machines, undecidable languages, Turing-recognizable languages, and the Recursion Theorem. Exploring the boundaries of computation using theoretical models.
ECE461: Theoretical CS
The Recursion Theorem
Brief Recap
- In our recent topics, we have been discussing Turing machines (TMs), which provide the most powerful known formalism for modeling computation.
- If we accept the Church-Turing thesis, all algorithms can be implemented by Turing machines.
- However, we have seen that Turing machines are not without limits: some languages are undecidable, and some languages are not Turing-recognizable.
- After proving that A_TM is undecidable using an indirect proof and diagonalization, we proved that other languages are undecidable using reductions.
- Also, since A_TM is Turing-recognizable but not decidable, the complement of A_TM must not be Turing-recognizable.
- We were then able to prove that other languages are not Turing-recognizable using mapping reducibility (a well-defined, particular type of reduction).
- All the topics mentioned in this recap are related to computability theory, the subject of the second of three parts of our textbook.
- Chapter 6, the final chapter of this part, is titled "Advanced Topics in Computability Theory"; each section is (more or less) independent of the others, and the rest of the book will not depend on these topics.
- Our current topic is related to Section 6.1, titled "The Recursion Theorem". We will cover material related to Sections 6.2 and 6.3 in our next topic; we will not cover Section 6.4.
Pretending
- As explained in earlier topics, we sometimes pretend (for lack of a better word) that Turing machines can do certain things that they cannot actually do.
- One example is when we pretend that a TM can use another TM as a subroutine; another example is when we pretend a TM can mark symbols on its tape.
- Although Turing machines cannot truly do these things, they can do other things that are equivalent; these hypothetical abilities would not increase the power of Turing machines.
- Therefore, allowing casual descriptions of TMs to specify such actions seems reasonable.
- In this topic, we will add to the list of things that we will pretend TMs can do; namely, they will be allowed to examine their own encodings!
- Although most TMs cannot do this, for any TM, T, we can imagine another TM, which does have access to its own encoding, that otherwise behaves the same way as T.
- This is related to a theorem that our textbook calls the recursion theorem.
A Paradox (not really)
- The book introduces this topic by discussing what it calls "a paradox that arises in the study of life". They summarize the paradox as follows (I am keeping the book's wording exactly for the three points below):
  1. Living things are machines.
  2. Living things can self-reproduce.
  3. Machines cannot self-reproduce.
- Still quoting the book, they claim: "Statement 1 is a tenet of modern biology. We believe that organisms operate in a mechanistic way. Statement 2 is obvious."
- For statement 3, the intuition behind it is that a machine that produces something (e.g., an automated factory that produces cars) would be expected to be more complex than the thing that it produces.
- Later, they ask a question and then immediately answer it, as follows: "How can we resolve this paradox? The answer is simple: Statement 3 is incorrect."
- They then explain that some machines can self-reproduce, and we will soon look at a TM that can produce its own encoding.
- Personally, I think statement 2 above is false, statement 3 above is clearly false, and statement 1 above depends on our definition of "machine" plus other philosophical viewpoints that we might hold.
Q
- Regardless of what we think about the book's non-paradox, constructing a TM that can produce its own encoding is not trivial.
- We will examine a self-reproducing TM before we discuss the recursion theorem directly.
- Before we do that, we need to understand an important lemma related to computable functions (which were introduced during our previous topic, when we discussed mapping reducibility).
- Lemma 6.1 of the textbook states: "There is a computable function q : Σ* → Σ*, where if w is any string, q(w) is the description of a Turing machine P_w that prints out w and then halts."
- The proof of the lemma is straightforward; a TM, Q, can create an encoding, <P_w>, for a TM, P_w, that performs the following three steps:
  1. Erase its input (ignoring the input altogether)
  2. Write w on its tape
  3. Halt
- After Q computes <P_w>, Q can erase w and output <P_w> on its tape.
- The TM, Q, is implementing the computable function q(w); note that Q has access to w.
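To make the lemma concrete, here is a program-level sketch in Python, with "descriptions of Turing machines" replaced by source-code strings. The function name q mirrors the lemma; everything else is illustrative, not the textbook's construction.

```python
def q(w: str) -> str:
    """Program-level analog of Lemma 6.1: return the source code of a
    program P_w that ignores its input, prints w, and halts."""
    # repr(w) handles quoting and escaping, so the generated source
    # is valid Python for any string w.
    return f"print({w!r})"

# Example: q("hello") is the one-line program  print('hello')
print(q("hello"))
```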
Sequential Turing Machines
- The book calls its self-reproducing Turing machine SELF, with the encoding <SELF>.
- They write, "The Turing machine SELF is in two parts: A and B." They express this as <SELF> = <AB>, and they explain, "Part A runs first and upon completion passes control to B."
- I want to point out a few things here:
  - We have never explained how to encode a Turing machine, and in fact, the method of encoding shouldn't really matter, as long as it is reasonable.
  - There is no reason to think that the concatenation of two encodings would be a valid encoding, let alone one that would let the first part run and then pass control to the second. Therefore, we cannot conclude that <AB> is the concatenation of <A> and <B>.
  - However, there must be some way to combine A and B to create a TM that executes first A, then B. Of course, if we have algorithms to perform the tasks of A and B, then there exists an algorithm that can perform the task of A followed by the task of B. Therefore, if we accept the Church-Turing thesis, there is a TM that can do it. (A program-level sketch follows below.)
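As an aside, at the level of real programs the naive combination does work: in Python, concatenating two source strings yields a program that runs the first part and then the second. This is only an analogy for the TM construction; the variable names below are illustrative.

```python
# Sequential composition of two program texts by concatenation.
# The two parts share one namespace, loosely playing the role of
# the shared tape.
A_src = "x = 'output left behind by A'\n"
B_src = "print('B sees:', x)\n"
exec(A_src + B_src)   # prints: B sees: output left behind by A
```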
SELF (part 1)
- From the last slide, we will set <SELF> = <AB>, and now we need to determine what A and B should do.
- A by itself is P_<B>, the TM described by q(<B>); that is, A ignores and erases its input, generates the encoding of B, and writes <B> on its output tape.
- It may be tempting, then, to have B do something similar, related to q(<A>). However, this would be circular (B would be defined in terms of A, which is defined in terms of B); we need to do something else.
- Book: "B computes A from the output that A produces."
- That is, B accepts input <M>, "where M is a portion of a TM", and behaves as follows:
  - B computes q(<M>); that is, it produces the encoding <T> of a TM, T, that prints <M>.
  - B combines <T> with <M>; that is, it creates the encoding <TM>.
  - B erases what's on the tape so far and outputs <TM>.
- Note: We followed the book's lead, describing A before B; but I think it is helpful to now look back at the whole process, thinking about B first, since the description of B does not depend on A.
- Once we have constructed B, which does not depend on A, constructing A to produce <B> should be relatively straightforward (it's just an application of the TM Q).
SELF (part 2)
- Recall that SELF runs A first, then B; so, SELF proceeds as follows:
  - First, A runs, erasing the original tape input and printing <B> on the tape.
  - Then B runs, processing its input, which is <B>.
  - B computes q(<B>) = <A> and combines the result with <B>, producing <AB> = <SELF>.
  - B erases the tape and outputs <SELF>.
- (A figure on the original slide illustrates how SELF works; a program-level sketch is given below.)
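The same two-part structure can be realized as a tiny Python program. Line 1 plays the role of A's output: it places the text of B on the "tape" as the data b. Line 2 is B: it reconstructs A from b using repr (the analog of q) and then prints B itself. Comments are omitted because the program must reproduce its source exactly; running it prints its own two lines.

```python
b = "print('b = ' + repr(b)); print(b)"
print('b = ' + repr(b)); print(b)
```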
Quines
- The Turing machine SELF is related to a concept in programming called a quine (not mentioned in the book, although it is related to Exercise 6.1).
- From Wikipedia: "A quine is a computer program which takes no input and produces a copy of its own source code as its only output."
- It may be a fun exercise to try to create a quine in the programming language of your choice (but I decided not to make this a homework question)!
- As far as I know, quines are possible in all modern programming languages, but it is easier to create one in some languages than in others.
- Note that the program is not allowed to examine its source code file (that would generally be considered cheating, since the file would count as input).
- Complications of producing quines include dealing with newline characters and quotes appropriately.
- When I tried (not recently), I found it was much simpler in standard C than in proper C++ or Python (which is not to say that it was easy in standard C; it wasn't). I only successfully did it in standard C, but you can search for solutions in various languages.
- In C, you can use a "printf" statement with "%c" format specifiers, and fill in integer ASCII values to deal with special characters (this is not a full solution, but perhaps a helpful hint).
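If you attempt the exercise, a quick way to check a candidate is to run it and compare its output byte-for-byte with its own source. A minimal sketch, assuming your candidate is saved as quine.py (a hypothetical filename):

```python
import subprocess
import sys

# Run the candidate and compare its output with its source text.
src = open("quine.py").read()
out = subprocess.run([sys.executable, "quine.py"],
                     capture_output=True, text=True).stdout
print("is a quine:", out == src)
```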
Brief History of the Recursion Theorem
- Before stating the recursion theorem, there are a few points I want to make.
- I believe that the textbook's proof of the recursion theorem proves something more general than what the theorem states (but both things are true, and the proved result implies what the theorem states).
- Later in the section, they present a "fixed-point version of the recursion theorem", and they prove it using what they previously proved for the first version. I believe that the fixed-point formulation states something that is conceptually similar to what the first version states (we'll discuss this later).
- The Arora and Barak book does not mention the recursion theorem; the Papadimitriou book mentions it in a note at the end of a chapter, matching our textbook's fixed-point version.
- Based on other sources, the recursion theorem was first proven by Kleene in 1938 (just a couple of years after Turing invented the concept of a Turing machine). Some sources refer to the recursion theorem as Kleene's recursion theorem; Wikipedia refers to the theorem as Kleene's second recursion theorem.
- The Wikipedia article also explains what it calls Kleene's first recursion theorem, but that is less relevant to us.
- The article also discusses Rogers's fixed-point theorem, which it describes as a simpler version of Kleene's second recursion theorem; this matches what our textbook calls the fixed-point version of the recursion theorem.
- The article cites Jones and briefly relates Kleene's second recursion theorem to reflexive, or reflective, programming, which refers to the ability of a program to "examine, introspect, and modify its own structure and behavior".
What the Recursion Theorem States
- The recursion theorem, Theorem 6.3 in the textbook, states (I am keeping the textbook's wording exactly as is, but modifying the indentation): "Let T be a Turing machine that computes a function t : Σ* × Σ* → Σ*. There is a Turing machine R that computes a function r : Σ* → Σ*, where for every w, r(w) = t(<R>, w)."
- Let's discuss the meaning of this theorem, as stated:
  - We start with a TM, T, that accepts two parameters as input. The first parameter does not necessarily have to be the encoding of another TM, but it can be, and we will generally think of it this way.
  - So, the first parameter of T represents a TM, M, encoded as <M>, and the second input is w.
  - Based on the input (<M>, w), T computes some output, say o. Note that o does not have to be related to a simulation of M on w; T can potentially perform any algorithm that operates on (<M>, w) to compute o.
  - The recursion theorem tells us that for any such T, there is some other TM, R, such that R applied to input w computes the same thing as T applied to (<R>, w).
  - Although R does not start with access to its own encoding (the initial input on its tape is just w), it computes the same thing as T when T's original input includes <R> as well as w.
  - Note that the recursion theorem as stated does not say that every TM has this self-referential property; rather, it says that for every such T that takes two inputs, there exists some R for which this is true.
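To illustrate the quantifiers, here is one concrete (and arbitrary) choice of t, sketched in Python; the theorem then promises a matching R. All names here are illustrative.

```python
def t(desc: str, w: str) -> str:
    # An arbitrary computable two-argument function: it ignores w and
    # reports the length of the description it was handed.
    return f"my description has {len(desc)} symbols"

print(t("<some machine>", "abc"))

# The recursion theorem promises some machine R satisfying, for every w,
#     r(w) == t("<R>", w)
# i.e., R can report the length of its *own* encoding, even though its
# input tape initially holds only w.
```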
What the Textbook Proves
- I believe that the proof, which is short, proves the following (the wording here is my own): Consider any Turing machine T, encoded as <T>, that accepts input w. We can construct some other Turing machine, R, that has access to its own encoding, <R>, as well as w, and otherwise behaves the same as T.
- I have tried looking this up in other sources, and here is what I found:
  - Some of my favorite sources don't mention the recursion theorem at all (e.g., Arora and Barak), and some others mention it very briefly (e.g., Papadimitriou mentions it in a note at the end of a chapter).
  - Online sources vary in how they talk about the recursion theorem.
  - Some talk about it as Kleene's recursion theorem (or one of Kleene's recursion theorems); such sources (e.g., Wikipedia) generally do not say anything about a TM having access to its own description.
  - Other online sources talk about the recursion theorem in a similar manner to our textbook, but most of those sources are clearly based on our textbook.
  - These sources are split in how they word their English descriptions of the implications of the recursion theorem; the wordings below are my own, but I'm interpreting what the online sources I've seen are saying:
    - Some sources say that given any TM, T, that accepts input w, we can construct another TM, R, that has access to <R> as well as w, and otherwise behaves like T (this is equivalent to what I stated above).
    - Other sources say that given any TM, T, that accepts input w, we can construct another TM, R, that has access to <T> as well as w, and otherwise behaves like T.
- I think that both of these things are true, but I believe that the way I explained it at the top of the slide matches our textbook's proof and matches the way the recursion theorem is used in the rest of the section.
The Proof of the Recursion Theorem (part 1)
- We will discuss the textbook's proof of the recursion theorem, which, again, I believe proves something more general than what the recursion theorem states, and which also implies that the theorem is true.
- That is, we will prove the following: Consider any TM, T, encoded as <T>, that accepts input w. We can construct a TM, R, that has access to its own encoding, <R>, as well as w, and otherwise behaves the same as T.
- (A figure on the original slide illustrates this.) In the figure, parts A and B of R are taken from SELF, described earlier, except that we redesign the function q so that the output of P_<BT> is concatenated with the input w (explained in more detail on the next slide).
- In practice, when we apply the recursion theorem, T will typically process input of the form <M, w>, and <M> will be filled in with <R>; but the proof is general and works for any T.
The Proof of the Recursion Theorem (part 2)
- We are explaining a proof by construction. We are constructing a Turing machine R, where <R> = <ABT>.
- As previously explained, <ABT> does not represent a concatenation of <A>, <B>, and <T>; rather, it represents an encoding of a combination of A, B, and T such that the three parts run sequentially, in that order.
- A, where <A> = q(<BT>) (that is, A is P_<BT>), runs first and writes <BT> on its tape. This uses the modified version of q, so that the output is appended to the original input on the tape; the tape thus contains w<BT> after A runs. (That's what the book says; it's unclear to me whether w here should be before or after <BT>.)
- Then, control is passed to B, which was previously explained as part of the creation of SELF:
  - B examines its tape (containing w<BT>) and applies q to the <BT> part, producing q(<BT>) = <A>.
  - <A> then gets combined with the rest, producing the final string <R, w>, which gets written on the tape.
- It seems to me that the description of the proof is inconsistent as to whether w is placed before or after the produced TM encoding, but either order should be possible, and I suppose it's fine for the various stages to handle this in different ways. I believe the final output of B places <R> to the left of w, with the read-write head over the first symbol of <R>.
- Then, control is passed to T, a TM that behaves a certain way.
- This construction describes R, a TM that produces (and then has access to) its own encoding, <R>, which is combined with its original input, and then passes control to T. It is therefore a proof by construction that given any TM, T, we can construct another TM, R, that has access to its own encoding, as well as w, and otherwise behaves the same as T. (A program-level sketch of the whole construction follows below.)
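The same A/B/T construction can be written out for real programs. Below is a hedged Python sketch: the template plays the role of the text of BT, template.format(template) plays the role of A's output combined with B's work, and the final line of the generated program hands control to t. The module name mymod and the function t are assumptions for illustration; the generated program needs t to be importable from somewhere.

```python
# Build, for an arbitrary computable t(desc, w), a program R whose
# behavior on input w is t(<R>, w).  (A sketch, not the TM-level proof.)
template = """\
from mymod import t                  # hypothetical home of t
template = {!r}
me = template.format(template)       # <R>: this program's own full text
import sys
print(t(me, sys.stdin.read()))       # r(w) = t(<R>, w)
"""
R_source = template.format(template)  # the complete source of R
print(R_source)
```

Inside the generated program, template.format(template) rebuilds exactly the string R_source, so the program really does pass its own text to t, mirroring how parts A and B leave <R> on the tape before T runs.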
The Recursion Theorem In Practice
- As mentioned a couple of slides ago, in practice, when we apply the recursion theorem (as used in our textbook), we typically have in mind some T meant to process input of the form <M, w>.
- However, we instead apply R to some input, w', analogous to only the w part of the intended input for T.
- As explained in the proof by construction, R creates its own encoding, <R>, and combines it with w'; this happens in phases A and B of R (the figure from the previous slides, not reproduced here, indicates this).
- So, by the time that control is passed to T, the tape of the Turing machine R contains <R, w'>, with the read-write head over the first symbol of <R> (at the left end of the tape). In a sense, we are filling in the first intended parameter of T with R's encoding.
- When the book relies on the recursion theorem, it typically casually states that a TM, M, can "obtain own description <M>", sometimes adding "via the recursion theorem", as a step in its algorithm.
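In real programming languages, the "obtain own description" step is often available directly through reflection, which is one reason the casual usage feels natural. A small sketch of the analog in Python (inspect.getsource is a real library function; the surrounding function is illustrative and must be run from a saved file):

```python
import inspect
import sys

def report(w: str) -> str:
    # Real languages let a program examine its own text via reflection;
    # the recursion theorem justifies the analogous step for TMs.
    own_source = inspect.getsource(sys.modules[__name__])
    return f"my description has {len(own_source)} chars; input was {w!r}"

if __name__ == "__main__":
    print(report("abc"))
```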
Recursion Theorem Example 1
- The textbook describes how the recursion theorem can be used to create a TM, SELF, that produces its own encoding (I am keeping the book's exact wording to describe SELF):
  SELF = "On any input:
  1. Obtain, via the recursion theorem, own description <SELF>.
  2. Print <SELF>."
- To create this version of SELF, we first produce a TM, T, that acts as follows (again in the book's exact wording):
  T = "On input <M, w>:
  1. Print <M> and halt."
- Given this T, we can create R, as explained in our proof by construction of the recursion theorem.
- For any given input, w', originally on R's tape, the first two phases of R (A and B) will produce <R, w'> on the tape. Then, control will be passed to T, which processes <R, w'> and outputs <R> on its tape.
- Thus, R is equivalent to SELF.
- The book points out that the ability of an actual program to reproduce its own implementation is used by computer viruses.
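In the Python sketch from the proof slide, this example corresponds to choosing a t that simply returns its first argument; the resulting R prints its own source, i.e., it is a quine. A hypothetical mymod.t for that sketch:

```python
def t(desc: str, w: str) -> str:
    # Example 1's T: "On input <M, w>: print <M> and halt."
    # The R built from this t outputs its own description.
    return desc
```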
Recursion Theorem Example 2
- The textbook uses the recursion theorem to give what they call "a new and simpler proof" that A_TM is undecidable.
- They use an indirect proof, and start by assuming the existence of a TM, H, that decides A_TM.
- Given H, we can construct another TM, B, that processes input w and behaves as follows:
  - First, "obtain, via the recursion theorem, own description <B>" (I kept that wording exact from the textbook).
  - Use H to determine if B accepts w (i.e., "run H on input <B, w>").
  - If B accepts w (according to H), reject; if B does not accept w, accept.
- What happens when we run B on w? If B accepts w, B rejects w, and if B does not accept w, B accepts w! This is a contradiction; thus, the assumption that there is an H that decides A_TM must be false.
- Note that I don't personally consider this simpler than the original proof:
  - The original proof needed the background of diagonalization (although even that is debatable; we could have expressed it as an indirect proof without explaining the concept of diagonalization).
  - This proof relies on the recursion theorem (and in particular, the textbook's interpretation of it).
  - As to the question of which proof is easier to understand, that's subjective!
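The contradiction can be sketched as code. Everything here is hypothetical: H stands for the assumed decider, and the description string is a stand-in for what the recursion theorem actually provides.

```python
def make_B(H):
    """Build the self-defeating machine B from a purported decider H
    for A_TM.  H(desc, w) is assumed to return True iff the machine
    described by desc accepts w."""
    def B(w):
        desc_B = "<B>"       # placeholder: supplied by the recursion theorem
        if H(desc_B, w):     # H predicts that B accepts w...
            return False     # ...so B rejects
        else:
            return True      # ...otherwise B accepts
    return B

# Whatever H answers about (B, w), B does the opposite, so no total,
# correct H can exist.
```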
MIN_TM (part 1)
- The textbook defines the length of the description, <M>, of a Turing machine, M, to be the number of symbols in <M>.
- Book: "Say that M is minimal if there is no Turing machine equivalent to M that has a shorter description."
- I add: This definition seems to depend on the format of an encoding and the alphabet of symbols it uses; for the sake of the theorem and proof we are about to cover, the details don't matter.
- We can define a language containing all minimal TM descriptions: MIN_TM = {<M> | M is a minimal TM}.
- Theorem 6.7 states: "MIN_TM is not Turing-recognizable" (this also implies MIN_TM is not decidable).
- Our proof will rely on the recursion theorem. It also relies on a fact that we proved during a previous topic:
  - Recall that an enumerator is a hypothetical machine (the book describes it as a variant of a Turing machine) that eventually displays all strings of a language (and no other strings). Strings from the language may be listed in any order, possibly with repetition, and the enumerator may run forever.
  - Theorem 3.21, which we proved during our topic about Turing machines, states: "A language is Turing-recognizable if and only if some enumerator enumerates it."
MIN_TM (part 2)
- We will prove that MIN_TM is not Turing-recognizable with an indirect proof.
- Assume that MIN_TM is Turing-recognizable; then there must be an enumerator, E, that enumerates it.
- Then, we would be able to construct a TM, C, that processes its input w and behaves as follows:
  - "Obtain, via the recursion theorem, own description <C>" (exact quote from the book).
  - Run the enumerator, E, until a TM, D, appears with a longer description than C (this will eventually happen, because MIN_TM is infinite).
  - Simulate D on w.
- C's behavior is equivalent to D's behavior, and C has a shorter description than D. This contradicts the fact that D is a member of MIN_TM.
- The assumption that led to this contradiction must be false; hence, MIN_TM is not Turing-recognizable.
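Here is C as a code sketch, with the same caveats as before: enumerate_min (the assumed enumerator E), simulate, and the description string are all hypothetical stand-ins.

```python
def make_C(enumerate_min, simulate):
    """Build the machine C from an assumed enumerator of MIN_TM.
    enumerate_min() yields minimal TM descriptions; simulate(d, w)
    runs the machine described by d on input w."""
    def C(w):
        desc_C = "<C>"                # via the recursion theorem (placeholder)
        for d in enumerate_min():     # run E until a longer description appears
            if len(d) > len(desc_C):
                return simulate(d, w) # from here on, C behaves exactly like D
    return C

# C behaves like some D in MIN_TM yet has a shorter description,
# contradicting D's minimality.
```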
Fixed-Point Version of the Recursion Theorem
- Book: "A fixed point of a function is a value that isn't changed by the application of the function."
- We will consider functions that operate on encodings of TMs to produce other encodings of TMs. The fixed-point version of the recursion theorem states that for any such function, there is some TM whose behavior is unchanged by the function.
- Theorem 6.8 in the textbook states (I am keeping the book's exact wording): "Let t : Σ* → Σ* be a computable function. Then there is a Turing machine F for which t(<F>) describes a Turing machine equivalent to F. Here we assume that if a string isn't a proper Turing machine encoding, it describes a Turing machine that always rejects immediately."
- In this theorem, t is a function that transforms a TM, and F is its fixed point.
- As mentioned earlier, Wikipedia refers to this version of the recursion theorem as Rogers's fixed-point theorem in its article on Kleene's recursion theorem.
Proof of the Fixed-Point Version
- We will use the result from our previous version of the recursion theorem to prove the fixed-point version, using a proof by construction.
- Consider the Turing machine, F, that processes input w and behaves as follows:
  - First, "obtain, via the recursion theorem, own description <F>".
  - Compute t(<F>); let's call the result <G>.
  - Simulate G on w.
- Clearly, what F does is identical to what t(<F>) does; therefore, <F> is a fixed point of the function t. (A code sketch follows below.)
- I want to point out some thoughts about the fixed-point version of the recursion theorem:
  - I find this version of the recursion theorem extremely unintuitive. Intuitively, I feel like it should be possible to produce a function that processes an encoding of a TM and modifies it in such a way that it is bound to do something different than the original TM. However, this theorem, which we have now proven, shows that this is not possible (I'll discuss this more in class).
  - I mentioned earlier that I think the fixed-point version of the recursion theorem states something conceptually similar to the first version. This is because we can think of the function t from the first version, applied to input (<R>, w), as a function that potentially modifies <R> and applies the modified TM to w (although t is allowed to be a more general function). Thinking of it this way, the original version of the recursion theorem states that for some <R>, the modified version of <R> applied to w produces the same result as <R> applied to w, for all inputs. Such an <R> seems closely related to the notion of a fixed point from the fixed-point version of the theorem.
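Finally, the fixed-point construction in the same hypothetical code style: t is any computable transformation of descriptions, simulate runs a description, and the description string is a stand-in for what the recursion theorem guarantees.

```python
def make_F(t, simulate):
    """Build the fixed-point machine F for a transformation t on TM
    descriptions.  simulate(d, w) runs the machine described by d."""
    def F(w):
        desc_F = "<F>"               # via the recursion theorem (placeholder)
        desc_G = t(desc_F)           # transform F's own description
        return simulate(desc_G, w)   # behave exactly like the transformed machine
    return F

# For every w, F's behavior equals the behavior of t(<F>), so <F> is a
# fixed point of t: transforming the description does not change the
# machine's behavior.
```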