| |||||||||||||||||||||||||||||||
HOMETheory of X-Machines
ReferenceMOTIVE |
X-ComputabilityHistorically the question what does computability mean in terms of X-machines has foundered on the observation that, given any set-theoretic relation, f, we can construct a datatype X and a simple one-arrow X-machine that seems, intuitively at least, to 'compute' that relation.
Given this situation, it appears that every relation is computable by X-machine, so that our question is vacuous. This intuition is wrong in an important way, but understanding the error requires a more careful discussion of what we mean by the term computability itself. We are accustomed in theoretical computer science to such statements as the square-root function, sqrt, is computable, but we can end up using terms so freely that we forget their exact meaning. We contend that, even in standard computational theory, the phrase "sqrt is computable" has no independent meaning. Rather, it is a convenient abbreviation for the much more cumbersome phrase "given the rules of Turing-machine computation, and given certain basic interpretations of bit patterns expressed by marking various squares on a tape, there is a Turing-machine program which can be thought of as a representation of sqrt". In other words, while we appear to be talking about a single, well-defined relation sqrt, we are actually discussing a set of relations, of which sqrt is just one constituent part. The question, then, is not "when is a relation computable by X-machine?," but rather, "given a set S of relations, what does it mean to say, in the context of X-machines, that S is a mutually computable set of relations?" To use a biological metaphor, it is helpful to think of X-machines as describing species rather than individual animals. Species are generally defined by the ability of animals to interbreed, but this definition is only ever relevant to collections of animals rather than individuals, because it is entirely possible for two individuals of the same species to be incapable of successful breeding. In the same way, "being computable by X-machine" is something that has little meaning for individuals, but can nonetheless identify collections of relations rather accurately. Now that it has been properly phrased, our question does have an answer, and one that is both meaningful and useful. Recall (see Physical Description) that an X-machine is essentially a finite state machine (FSM) whose labels are interpreted not as abstract symbols in an alphabet, but as relations on some set X. We can think of X as an auxiliary memory, the control system of a second machine, or whatever else seems appropriate for the system being modelled. Given that X-machines and FSMs differ only in their interpretation of labels, our investigation of X-machines, and especially our understanding of the ways in which X-machines go beyond FSMs, must begin with a correct understanding of labels. The standard notation we have developed for this context is as follows.
Accordingly, we think of an X-machine, M, as an FSM equipped with a labelling function, viz.
so that traversing an arc labelled a in the FSM corresponds to the X-machine behaviour "now perform the action a^{L}." Similarly, traversing a path a = a_{0}...a_{n} in the FSM is equivalent to applying the action
Regular Languages vs. Rel_{X}Suppose, then, that we've got some alphabet A from which FSM labels will be chosen. In line with Eilenberg's original formulation, certain of the underlying FSM's states are marked out as valid initial states and others as terminal states, and the computation paths of the machine are those that lead from an initial state, through consecutives arrows of the FSM, to end at a terminal state. In traversing a computation path, the arc labels "spell out" a word one character at a time, and we say that the machine recognises the word. The set of all recognised strings is the language recognised by the machine, and we refer to the machine as a finite state recogniser for that language. Languages that can be recognised (or if you prefer, generated) in this way by finite state machines are called regular languages and play an important part in theoretical computer science.Now just as Rel_{X} forms a semigroup under relational composition, so A*, the set of all finite strings written using characters from A (including e, the empty string) forms a semigroup under string concatenation, because combining any two finite strings over A forms a third. What's more, because
for every pair of strings s, t, it's clear that these two semigroups have closely related structures. The labelling function L mediates this relationship by linking the two semigroups together while preserving their essential family resemblance. We call structure-preserving functions between semigroups (in the sense of Equ. 6) homomorphisms. Formally, then, we can regard L as the homomorphism that maps each recognised string a = a_{0}...a_{n} to the action a^{L} of (Equ. 5). (Equ. 6) tells us that recognised strings in the world of finite state machines are closely related to path computations in X-machines, and in consequence that X-machine computations are closely related to regular languages. We've seen already that each string in a regular language corresponds via L to a path computation in an X-machine. Now the relation computed by an X-machine is defined to be simply the union of all its path computations. So, if we obtain the X-machine M by attaching the labelling L to the finite state machine, F - written
- then the relation |M| computed by M is related to the language |F| recognised by F by the simple formula
A language over A just means "a subset of A*." Taking our cue from (Equ. 8), let's consider the function
which maps each language L, via the labelling L, to a well-defined action L^{L} in Rel_{X}. Combining (Equ. 7, Equ. 8, Equ. 9) tells us that, whenever F is an FSM, X is a datatype and L is a labelling into Rel_{X}, the operations |·| and ·^{L} commute, viz.
Labellings of NAlthough we've been discussing a particular alphabet A, it should be obvious that the actual identity of A is irrelevant. For suppose B is another set of the same cardinaility as A, and that subst: B ----> A is a 1-1 correspondence. Given an X-machine M = F^{L}, where A = Labels(F), we can construct an equivalent representation of M using the label-set B instead of A as follows. Keeping the same underlying structure of arcs, we construct a new recogniser F~ by replacing every label a of F with the label b = subst^{-1}(a). Similarly, we replace L with L~, where for each b in B,
and now it is a straightforward matter to verify that
Consequently, it doesn't much matter what we call the characters in our alphabet, just as long as we have enough of them to label each machine appropriately. To simplify matters, then, we shall choose a fixed alphabet, and use it for every FSM we need to construct. Although any given recogniser can employ at most finitely many labels, there is no upper bound to the number of labels that might be needed, so we'll take as our alphabet the set N of natural numbers. Each labelling L is accordingly assumed to be a labelling of N, L: N ----> Rel_{X}.
X-computable subsets of Rel_{X}We have seen that each labelling L induces a mapping ·^{L} of regular languages to relations on X. If f = L^{L} for some regular language we shall say that f is L-computable, and for any given L, the set
of all L-computable relations will be called the L-computable subset of Rel_{X}. A subset of Rel_{X} is X-computable if it is L-computable for some L. The extent to which L-computable subsets of Rel_{X} are "computable" relative to standard conceptions of the term will be discussed in our next entry, but it should already be clear that, general though this definition is, it can happen that X-computability is no longer vacuous. If X is infinite, not every subset of Rel_{X} is X-computable. This is because there are at most |Rel_{X}^{N}| labellings available, but 2^|Rel_{X}| subsets of Rel_{X}. Now, since Rel_{X} = 2^{X×X}, we have |Rel_{X}| = 2^{|X|} whenever X is infinite. Consequently, for infinite X, we have
whence 2^|Rel_{X}| (the number of subsets of Rel_{X}) is exponentially larger than |Rel_{X}^{N}| (the number of distinct labellings available). |