Full Schemas and Categoricity

FullSchemasandCategoricityHandout

Start Presentation

Thanks

Thanks for inviting me.

Full Schemas and Categoricity

What I intend to do today is introduce the notion of a schema from scratch, complete with a formalized logic, and prove that a suitable version of arithmetic is in a suitable sense categorical. Today will be the first time that I have presented such a result by itself, instead of along with allied results about other logics. I have tried to restrict myself to the case at hand, but if some unnecessary generality seems to have crept in, it probably has: I've failed to expunge it in adapting earlier more general accounts.

Full Schemas

J. H. Harris—and, following him, McGee—says that the rules of logic are open ended—they apply not just in the present language but in any extension of it, whether that extension has been envisioned or not.

Full Schemas

We do in fact take our logical rules to be open ended: Consider, for example, the rule $ \phi \land \psi \vdash \phi $ or, for that matter, its informal counterpart in natural language. We do not stop to reëvaluate the rule when moving to an expanded vocabulary or even when introducing modalities—all the new formulas are automatically to be allowed into the rule. As McGee puts it for the case of reductio ad absurdum:

We do not accept reductio ad absurdum because we have surveyed the forms of expression found in English and found that its expressive power is circumscribed in such a way as to validate the rule.

Full Schemas

For example, the rule $ \phi\land\psi\vdash\phi $ of a formalized logic in a language ${\mathcal L} $ is not, if the rule is to reflect our ordinary usage,
if $ \phi\land\psi$ is a formula of ${\mathcal L}$, then from $\phi\land\psi$ infer $\phi $,
but rather,
if $ \phi$ and $ \psi$ are any formulas whatsoever, then from $ \phi\land\psi$ infer $\phi $.

Full Schemas

The notion of 'any formula whatsoever' is, of course, hopelessly vague. I see no reason to provide a precise account here, and indeed some reason not to: since our formalism is going proxy for natural language, we should surely avoid unnecessary commitment to a particular semantics.

Full Schemas

That is all very well, but how can we live with rules as imprecise as the one just suggested? The rule expresses a very precise idea: It is used to express a commitment that however we expand or modify our language in the future, we shall always continue to uphold certain principles, in this case, the conjunction elimination principle. Even imprecise rules have clear cases of application, and we shall only need the rules of logic and mathematics in clear cases in this work. Given that we shall only be interested in clear cases, we have no present reason to worry about the ambiguity of the formulation.

Formalizing rules of logic

The "open ended" rules of logic are clearly in a certain sense very general, but what is the nature of that generality? We surely do not, if at all possible, wish to understand
\[\phi\land\psi\vdash\phi\]
(or its natural language counterpart) along the lines of
\[(\forall \phi )(\forall \psi )(\phi\land\psi\vdash\phi ).\]

Formalizing rules of logic

\[(\forall \phi )(\forall \psi )(\phi\land\psi\vdash\phi )\]

It is not even clear that the universally quantified "formula" is well formed: the turnstyle '$\vdash$' is not a part of the language, but a symbol that is used to specify a rule—from the antecedent, infer the consequent.

Formalizing rules of logic

If the formula is not well formed, it is surely not an appropriate way in which to understand the relevant generality. But even if we brush such niceties aside, expressing the generality of the open-ended rules of logic using quantification will still be deeply problematic.

Formalizing rules of logic

To use such a formalism would be to employ the formalism of logic in introducing that very formalism—an undesirable circularity that can be avoided. One might even consider it to be worse than a circularity, since it requires the comparatively sophisticated notion of quantification even to introduce sentential logic.

Formalizing rules of logic

To make the circularity more stark, consider how we would apply the rule universal universal specification: $^{\text{footnote}}$ $^{\text{footnote}}$
\[(\forall \phi )(\forall x)(\forall \tau )((\forall x)\phi \vdash \phi (\tau ))\qquad{\text{UUS}}\]
to show that, say,
\[(\forall u)P(u)\vdash P(c)\]
is a valid inference.

Formalizing rules of logic

We would first apply UUS to UUS (circularity!) with $\phi = (\forall x)(\forall \tau )((\forall x)\phi \vdash \phi (\tau )),$ $x=\phi $, and $\tau =P(x)$ to obtain
\[(\forall \phi )(\forall x)(\forall \tau )((\forall x)\phi\]
\[\qquad \vdash \phi (\tau ))\vdash(\forall x)(\forall \tau )((\forall x)P(x) \vdash P (\tau )).\]

Formalizing rules of logic

We would then apply MP (Actually, meta--modus ponens, since '$\vdash$' is not in the language. But we have already agreed to put such considerations aside.) to UUS and the formula immediately above to yield
\[(\forall x)(\forall \tau )\,((\forall x)P \vdash P (\tau )),\]
knocking off the first quantifier ($(\forall \phi )$), and then apply UUS twice more. We would have to apply UUS to use the rule UUS.

Formalizing rules of logic

The point is entirely parallel to one made by Quine in "Truth by Convention." There is nothing inconsistent about describing what we mean by universal specification in this circular way, but it eliminates the possibility of telling any straightforward story about how quantification could be introduced or learned, and it ignores the existence of a pre-quantificational mechanism that has generally been thought to exist and to play a genuine role in our reasoning: the schema.

Formalizing rules of logic schematically

The schema is another, more primitive, form of generality that will do the job of introducing logical rules without any attendant circularity: we can take $\phi\land\psi\vdash\phi $ and the other logical rules to be schemas used to declare that any substitution instance is valid. In that case, $\phi $ and $\psi $ are schematic variables, and to say that the schema is open ended is to declare that the variables are what I have elsewhere called full schematic variables: 'full' is added to indicate that what counts as an acceptable substitution instance is open ended and automatically expands as the language in use expands.

Full schemas are not implicitly universally quantified

We may use the following example to see how a schema differs in conception from a universally quantified formula, even an implicitly universally quantified formula. The sentence $(\forall x)x=x$ is a truth bearer, and it is indeed true. In contrast, the schema $x=x$, with $x$ a schematic variable is not a truth bearer at all, though it does deserve some sort of honorific, since it can be used to declare that all of its suitable instances are true. The free schematic variable in a schematic formula, like an ordinary free variable, prevents the formula from being a truth bearer.

Schematic generality is not referential generality

The schematic variable cannot be regarded as a free referential variable: it does not take on values and has no domain. To assert that a formula involving a schematic variable is an axiom schema does not commit us to any true axioms involving a schematic—or any other kind of—free variable: Rather, it commits us to closed instances of the schema as axioms.

Full schemas

The idea that schemas are involved in the basis of logic has been endorsed by Quine and schemas are arguably employed by Russell—his typical ambiguity—and Hilbert.

Full schemas

Full schemas are used by Feferman (who introduced them as early as 1977, in discussing Gödel-type independence phenomena, by Tyler Burge in his theory of truth, by McGee in his theory of "How we learn mathematical language," and by Field, Parsons, and I—we have all taken mathematical induction to be a full schematic principle. In each case, full schemas are used to codify the idea that certain principles are open ended—that we are committed to upholding them even through additions to and modifications of our language. The same applies to the axiom of replacement of set theory, as I shall argue below.

Full schematic generality is not referential or substitutional generality

The case against assimilating schematic generality to that of referential or substitutional quantification is strongest in the case of a full schema: Full schemas, which are the ones of chief interest to us here, are explicitly required to allow new instances as our language expands, and so the very idea of a full schema is incompatible with the notion of a fixed domain or substitution class.

Full schematic generality is not referential or substitutional generality

Quantificational variables are only auxiliaries for quantification: formulas with free variables are not truth bearers. One might therefore wish to avoid the notion that anything is a consequence of a formula with a free variable. One can, for example, infer $0=0$ from $(\forall x)x=x$ as well as from $x=x$, and so perhaps one might wish to argue that the schematic version of $x=x$ is really just an alternative notation for the more familiar universally quantified sentence, perhaps construing the hypothesized implicit quantifier as referential, perhaps as substitutional. That cannot be made to work for several reasons.

Full schematic generality is not referential or substitutional generality

A schematic variable cannot be viewed as a free variable, whether substitutional or referential, one that could be bound by a quantifier, since schematic and quantifiable variables have different inferential roles: If $x$ is a schematic variable, one can infer $0=0$ from $x=x$, but that is not so if $x$ is a quantifiable variable—in that case in some proof systems the formula cannot appear in a proof at all, while in those that do allow it, the inference is valid only if $x$ does not occur free in any of the premises of the argument. No such proviso is required in the case of a schematic variable.

Full schematic generality is not referential or substitutional generality

Schematic variables are more closely allied to substitutional than to referential quantification: When we give a schema that is not full, we typically restrict the schematic variable to all terms or formulas in a language, not to a range of all objects or properties in a domain. Intuitively, however, no schema has a domain of application like the substitution class of a substitutional quantifier—a schema isn't used to make a general assertion about a class of possible instances.

Full schematic generality is not referential or substitutional generality

A schema provides a general method of obtaining particular assertions about suitable instances. Thus, for example, one can infer $(\forall n)\phi (n)$ from $\phi (0)$ and $(\forall n)(\phi (n)\rightarrow \phi (Sn))$, but not, if $n$ is a schematic letter, from $\phi (0)$ and $\phi (n)\rightarrow \phi (Sn)$, since the schema, rather than making an assertion about all numbers, which is what is required to reach the conclusion, provides a mechanism for making assertions about particular numbers.

Full schemas

To summarize, schematic variables are neither referential nor substitutional variables. Schematic generality has important preformal differences from the generality of substitutional universal quantification. Finally no universally quantified sentences can have the same consequences as does a suitable corresponding full schema.

Full schematic categorical axiomatization of arithmetic

Full schemas codify the open-ended principles that will make it possible to provide categorical axiomatizations of the natural numbers. We therefore discuss the formalism of axiom schemas in sufficient detail to provide the background for a categoricity proof. We shall only need to employ schemas in one or two first-order schematic variables and two or three second-order schematic variables, and we reserve the letters $s$, $t$, $P$, $Q$ and $R$ for the purpose. We allow them to have subscripts so that in principle we have infinitely many schematic variables of each order.

Full schematic logic

In our applications the schematic variables will have various numbers of substitutable places. We leave it to the reader to determine the number of substitutable places of a schematic variable by counting the number of terms enclosed in parentheses immediately following it. In this work we shall always allow extra places for parameters. For a fully general treatment, we would have to specify infinitely many schematic variables of each order and number of places.

Substituting for schematic variables

For any term or formula $ \phi $, variables $x_{1},\dots ,x_{n}$, and terms $t_{1},\dots ,t_{n}$, let $\phi \left[\!\!\left[\!{{x_{1},\dots ,x_{n}}\atop{t_{1},\dots ,t_{n}}}\!\right]\!\!\right]$ denote the result of simultaneously replacing every free occurrence of every $x_i$, $i=1,\dots ,n$, in $\phi$ by $t_i$ and renaming bound variables as necessary to prevent collisions with the (free) variables of the terms $t_{1},\dots ,t_{n}$.

Substituting for schematic variables, cont.

Given a schematic formula $\Phi ({\bf P})$, $n$ variables $x_{1},\dots ,x_{n}$, and a formula $\phi$, where $\bf P$ is a first- or second-order schematic variable that has $n$ substitutable places, we define $\Phi\left[\!\!\left[\!{{\bf P}\atop{\hat{x}_{1}\cdots\hat{x}_{n}}\phi \!\right]\!\!\right]}$ or, usually, but less precisely, $\Phi(\hat{x}_{1}\cdots\hat{x}_{n}\phi )$,

Substituting for schematic variables, cont.

We define $\Phi\left[\!\!\left[\!{{\bf P}\atop{\hat{x}_{1}\cdots\hat{x}_{n}}\phi \!\right]\!\!\right]}$ to be the formula that results from $\Phi ({\bf P})$ when, for every list of $n$ terms $t_{1},\dots ,t_{n}$, every occurrence of ${\bf P}(t_{1},\dots ,t_{n})$ in $\Phi ({\bf P})$ is replaced by $\phi \left[\!\!\left[\!{{x_{1},\dots ,x_{n}}\atop{t_{1},\dots ,t_{n}}}\!\right]\!\!\right]$,

Substituting for schematic variables, cont.

We must change bound variable in $ \Phi $ as necessary to prevent collisions with the free variables of $\phi\left[\!\!\left[\!{{x_{1},\dots ,x_{n}}\atop{t_{1},\dots ,t_{n}}}\!\right]\!\!\right]$ that are not among the (free) variables of the terms $t_{1},\dots ,t_{n}$.

Substituting for schematic variables

For example, $ (\forall x) (\exists y)P(x)\left[\!\!\left[\!{{P}\atop{\hat{z}(\exists x) R(x,z,y)}}\!\right]\!\!\right] = (\forall x) (\exists v)(\exists u) R(u,x,y)$, where $u$ replaced $x$ to avoid a collision in $ (\exists x) R(x,z,y)\left[\!\!\left[\!{{z}\atop{x}}\!\right]\!\!\right]$ and $v$ replaced $y$ to avoid a collision in $ (\forall x) (\exists y)P(x)\left[\!\!\left[\!{{P}\atop{ (\exists u) R(u,x,y)}}\!\right]\!\!\right]$.

Substituting for schematic variables

The resulting formula may have free variables. We take them to be implicitly universally quantified.

Rule of inference for schematic variables

For any language $\mathcal L$, we introduce the $\mathcal L$ substitution rule: from any formula $\Phi ({\bf P})$ in which $\bf P$ has $n$ substitutable places, infer $\Phi (\hat{x}_{1}\cdots\hat{x}_{n}\phi )$, where ${x_{1},\dots ,x_{n}}$ are any $n$ variables and $\phi$ is any term of $\mathcal{L} $ if $\bf P$ is a first-order schematic variable and any formula of $\mathcal L$ if $\bf P$ is a second-order schematic variable.

Full schematic logic

We now associate a full schematic theory with any base theory $T_{0}$ in a language $\mathcal L$. The base theory $T_{0}$ may include some formulas with schematic variables in them. We do not regard a theory as a set of sentences in a language $\mathcal L$ as is usual, but instead as a function from languages ${\mathcal L}^{+}\supset{\mathcal L}$ to sets of axioms and rules.

Full schematic logic

If the schematic variable $\bf P$, whether of first or second order, has $n$ substitutable places, let $\overline{x}$ stand for some fixed $n$-tuple of distinct variables. The full schematic theory associated with $T_0$ over $\mathcal L$, $T^{(+)}$, associates with each ${\mathcal L}^{+}\supset{\mathcal L}$ the theory with axioms $T_{0}$ and with the ${\mathcal L}^{+}$-substitution rule. When $T_{0}$ is empty, we refer to the corresponding theory as full schematic logic.

Full schematic logic

Let us define what it is for a structure to be a model of a full schematic theory in line with the redefinition of the notion of a theory.

Full schematic logic

An assignment $s$ satisfies the formula $ \phi $ with respect to full schematic logic in a structure $ {\mathfrak A} $ if $ {\mathfrak A} $ is a structure for a language including ${\mathcal L}$, $s$ is an assignment for $ {\mathfrak A} $, and $s$ satisfies in the familiar sense every schematic-variable-free deductive consequence with respect to full schematic logic of $\phi $ in any language ${\mathcal L}^{+}\supset{\mathcal L}$ in any expansion of $ {\mathfrak A}$ to a structure for a language including ${\mathcal L}^{+}$.

Full schematic logic

Let $T $ be a theory in a language ${\mathcal L}$. A structure ${\mathfrak A} $ is a model of the full schematic theory $T$ if every assignment for ${\mathfrak A}$ satisfies every formula in $T$ with respect to full schematic logic.

Full schematic logic

To say that any expansion has a certain property is not the universally quantified statement that every expansion has the property—that would require a notion of the class of all possible expansions, a notion to which I am not entitled. It is rather a full schematic metalinguistic statement that will let us infer, whenever we are given an expansion, that it has the property. That is a perfectly clear and formally precise account of a constraint on how we may expand our language. It can be used to represent our metaphysical commitments concerning models of a theory, that is for describing what properties we are committed to any model maintaining even as we revise our language.

Aside on vagueness

The models of vagueness introduced in the vagueness seminar were for a fixed language, but it would be routine to modify them to allow different languages at different leaves. Each full scheme would have each linguistic object of suitable kind (that is, of the right order and number and obeying whatever restrictions are imposed by the schema) of the total language as a potential substituend at each node. Of course, that would include different items at different nodes. I do not pursue the details further because typical cases of adding new items to the language that are of interest for our purposes are not cases in which it is vague whether a given item is part of the language, but cases of genuine change—additions to the language that are new and represent an expansion of our linguistic resources.

Full schematic logic

The present definition is the correct one in the present setting. When a theory is defined, not for a single language, but for a single language and all of its expansions, it only seems natural to require not only that a model of the theory in the original language satisfy the theory, but also that expansions of the model continue to satisfy the theory in the corresponding expanded languages.

Peano Arithmetic with Primitive Recursion

Though primitive recursion is not a part of the axiomatic basis of Peano arithmetic, it is central to another standard arithmetic theory, namely, primitive recursive arithmetic. The first theory I wish to consider is a kind of amalgam of Peano arithmetic and primitive recursive arithmetic. The fact that primitive recursive arithmetic is a standard codification of what many have considered to be the most fundamental and basic properties of the natural numbers serves nicely to make the point that taking primitive recursion to be a fundamental part of the basis of arithmetic is not an ad hoc trick in defense of categoricity, but simply a recognition of the importance in that context of a long and deeply held view.

Peano Arithmetic with Primitive Recursion

The language of our schematic theory PAPR$^{(+)}$ (full schematic Peano arithmetic with primitive recursion), which will be a first-order theory with two schemas, will be $\{0,S\}$ plus a functor $\mathbb P$ that takes two formulas of our language plus four variables to a relation symbol of our language—we give a more complete specification below. Since we take the functor itself to be part of the language, relation symbols formed using it will be relation symbols of the language and, if we expand the language with new symbols then new relation symbols formed using them in conjunction with the functor will also become relation symbols of the augmented language.

Peano Arithmetic with Primitive Recursion

The axioms will be the successor axiom of Peano arithmetic, the induction schema, and an additional schema:

The primitive recursion schema

\[({\mathbb P}(P,Q)(0,w)\leftrightarrow P(w))\land\]
\[({\mathbb P}(P,Q)(Sx,z)\leftrightarrow\]
\[\qquad (\exists y) ({\mathbb P}(P,Q)(x,y)\land Q(x,y,z))).\]

Substitution into the primitive recursion schema

I must explain how to substitute $\hat{w}\phi$ for $P$ and $\hat{x}\hat{y}\hat{z}\psi $ for $Q$ in ${\mathbb P}(P,Q)(t_{1},t_{2})$, where $w$, $x$, $y$, and $z$ are any variables, $\phi$ and $\psi$ are any formulas, and $t_1$ and $t_2$ are any terms. For any formula $\theta$, let $\Fv (\theta)$ be the set of free first-order variables in it.

Substitution into the primitive recursion schema, cont.

The relation symbol ${\mathbb P}(\hat{w}\phi ,\hat{x}\hat{y}\hat{z}\psi)$ has $n+2$ places, where $n$ is the number of variables in
\[p=(\Fv (\phi )-\{ w\})\cup (\Fv (\psi )-\{ x,y,z\} ).\]
The atomic formula that results from the indicated substitution into ${\mathbb P}(P,Q)(t_{1},t_{2})$ is ${\mathbb P}(\hat{w}\phi ,\hat{x}\hat{y}\hat{z}\psi )(t_{1},t_{2},\overline{x})$, where $\overline{x}$ is the variables (parameters) in $p$ listed in alphabetical order.

Substitution into the primitive recursion schema

We interpret the $\mathcal L$-substitution rule to apply to substitutions in the newly extended sense and extend it to apply to both $P$ and $Q$, and we similarly extend the definitions that employ the rule.

Full schematic primitive recursion

Since we are taking the functor $\mathbb P$ to be in the base language, we shall be able to prove theorems about relation symbols constructed from $\mathbb P$ by induction. In particular, we shall be able to prove that if there is a unique $w$ such that $\phi (w)$ and for every $x$ and $y$ there is a unique $z$ such that $\psi (x,y,z)$, then ${\mathbb P}(\hat{w}\phi ,\hat{x}\hat{y}\hat{z}\psi )(x,y,z)$ is a well-defined function—I omit showing the parameters here and frequently below.

Addition

We can define addition by
\[x+y=z&\leftrightarrow{\mathbb P}(\hat{w}w=u,\hat{x}\hat{y}\hat{z}Sy=z)(y,z,x)\]
.

Multiplication

We can define multiplication by
\[x\cdot y=z&\leftrightarrow\]
\[{\mathbb P}(\hat{w}w=0,\hat{x}\hat{y}\hat{z}{\mathbb P}(\hat{w}w=u,\hat{x}\hat{y}\hat{z}Sy=z)(u,z,y))(y,z,x).\]

The theory

PAPR$^{(+)}$ has the conceptual advantage of employing a uniform primitive recursion schema instead of treating each primitive recursive definition separately. That, for example, makes it possible to prove in schematic form that if there is a unique $w$ such that $\phi (w)$ and for every $x$ there is a unique $y$ such that $\psi (x,y)$, then ${\mathbb P}(\hat{w}\phi ,\hat{x}\hat{y}\psi )(x,y)$ is a well-defined function, instead of proving that separately for each primitive recursive definition.

The theory

The fragment of PAPR$^{(+)}$ in the fixed language
\[\{0,S,{\mathbb P}(\hat{w}w=u,\hat{x}\hat{y}\hat{z}Sy=z),\]
\[\quad{\mathbb P}(\hat{w}w=0,\hat{x}\hat{y}\hat{z}{\mathbb P}(\hat{w}w=u,\hat{x}\hat{y}\hat{z}Sy=z)(u,z,y))\}\]
(where the last two relation symbols are the primitive recursive definitions of addition and multiplication given above is definitionally equivalent to PA, that is, that fragment and PA share a common extension by definitions. I shall therefore say, speaking somewhat loosely, that PA$^{(+)}$ is included in PAPR$^{(+)}$.

The theory

The theory PAPR$^{(+)}$ does not include primitive recursive arithmetic for the simple reason that primitive recursive arithmetic is formulated in terms of function symbols instead of relation symbols. It is, however, a straightforward exercise to show that primitive recursive arithmetic is included in an extension by definitions of PAPR$^{(+)}$, which I take to mean that PAPR$^{(+)}$ incorporates primitive recursive arithmetic.

The theory

It is a deeper and more surprising fact, due to Gödel, that all of PAPR$^{(+)}$ restricted to the base language is included in an extension by definitions of PA, and hence that PAPR$^{(+)}$ restricted to that language is in effect merely a somewhat redundant formulation of PA. It is presumably for that reason that axiomatic theories like PAPR are not familiar. Nonetheless, PAPR is an axiomatization that employs only core ideas about the natural numbers.

The theory

When we consider adding a new unary relation $R$ to the base language of PAPR$^{(+)}$ things become more interesting. (We consider adding a unary relation for definiteness—the same kinds of considerations would apply for any relation or function.) The theory PAPR$^{(+)}$ automatically adds the requisite new instances of both the primitive recursion and induction schemas when the language is expanded.

The theory

Since we are adding the new symbols into the induction schema, we must establish that the addition is acceptable. I will offer two arguments for that. The first argument is direct: The use of primitive recursion in concert with new relations and even across domains without any presumption of a background theory is such a well-established part of traditional mathematical practice that no extra-mathematical objection to it could be tenable. It just is a part of standard mathematical practice that we do take arbitrary primitive recursive definitions to be unexceptionable, and we allow them into the induction schema freely. Moreover, that mathematical practice long antedates the idea of a background set theory, which provides some additional support for the common view that the practice does not require the presupposition of a background set theory.

The theory

The second argument for acceptability is that primitive recursive definitions of relations are acceptable because they are canonically equivalent to positive inductive definitions, which I shall argue below are acceptable on independent grounds.

The theory

The theory PAPR$^{(+)}$ in the language formed by adding the new predicate $R$ to the base language has an extension by definitions that has as parts both PA$^{(+)}$ and the theory of functions primitive recursive in $R$. The proofs are straightforward extensions of the proofs of the comparable results without the new predicate $R$.

The theory

Those facts provide an additional argument that the full theory PAPR$^{(+)}$ is the best codification of our preformal intuitions concerning the natural numbers. It is the theory that extends in the way that reflects our unreflective expectations:

The theory

Anyone who understands primitive recursion automatically understands primitive recursion in additional relations. The idea is not so much a new one as an automatic extension of the old one. It is the theories PAPR$^{(+)}$ that reflect that reality.

The theory

Perhaps surprisingly, PAPR$^{(+)}$ is not an essentially new theory: it can be obtained as an extension by definitions of PA$^{(+)}$ by straightforward application of Gödel's method.

The relativized theory

It is only when we consider the relativized theory PAPR$^{(+)U}$ that the reason for considering PAPR$^{(+)}$ and not just PA$^{(+)}$ becomes apparent. The theory PAPR$^{(+)U}$ is PAPR$^{(+)}$ with the quantifiers in the axioms and schemas relativized to a unary predicate $U$. Unrelativized formulas are allowed as substituends for the schematic letters in the substitution rule. In addition, we take PAPR$^{(+)U}$ to have the formulas $U(0)$ and $(\forall x)(U(x)\rightarrow U(Sx))$ as axioms.

The relativized theory

The need for the additional axioms could be avoided by formulating the theory without constant or function symbols in the first place. Comparable axioms for addition and multiplication are not needed: they are theorems proved by induction.

The relativized theory

The two arguments mentioned earlier for believing the primitive recursion schema to be an acceptable addition to PAPR$^{(+)}$ are equally arguments for believing the primitive recursion schema to be an acceptable addition to PAPR$^{(+)U}$, where, of course, the schema is now taken to be relativized to $U$.

The relativized theory

The raison d'être of PAPR$^{(+)U}$ is that it makes it possible to consider the natural numbers in conjunction with other systems. For example, consider the natural numbers along with a group. The axioms may be taken to be PAPR$^{(+)U}\cup{\text{G}}^{U'}$, where ${\text{G}}$ is the axioms for the group. Natural number exponentiation is
\[{\mathbb P}(\hat{w}w=e,\hat{x}\hat{y}\hat{z}y\times u=z)(x,z,y),\]

The relativized theory

Existence and uniqueness on the appropriate domains is readily proved by induction. With PAPR$^{(+)U}$ it is not necessary to add anything new to get exponentiation.

The relativized theory

Gödel's method does not apply in the present setting, since it does not yield codes for sequences of members of a group, and so PAPR$^{(+)U}$ is truly stronger than PA$^{(+)U}$ in the sense that for an arbitrary theory $T$ the theory PAPR$^{(+)U}\cup T^{U'}$ need not be an extension by definitions of PA$^{(+)U}\cup T^{U'}$. $^{\text{footnote}}$

The relativized theory

The theory PAPR$^{(+)U}$ is my candidate for a codification of our basic preformal intuitions concerning the natural numbers. I shall not attempt to give here all my reasons for taking such primitive recursion to be central, resting content with noting that many such functions have historically been employed without the special notice that would have been required had they been essentially new (for example, natural number exponentiation of fractions, polynomials, and complex numbers and iteration of functions) and that those same functions are introduced today in secondary schools without any idea that a new method or procedure is being invoked: it is taken for granted, I believe quite rightly, that such procedures are an integral part of what has already been taught about the natural numbers.

Categoricity

Now that I have shown that PAPR$^{(+)U}$ is a natural, independently motivated theory of the natural numbers not introduced in an ad hoc way to solve the Skolem problem, we can see how it helps with that problem:

Categoricity

Theorem. Let $T$ be the theory in the language ${\mathcal A}^{2}\cup\{{\mathbb P},{\mathbb P}'\}$ that is the union of the theory PAPR$^{(+)U}$ and the primed version PAPR$^{(+)U}{}'$ of the very same theory. Then there is a binary relation symbol $I$ in the language such that the following formulas are theorems of $T$:

Categoricity

1. $(\forall x)(U(x)\rightarrow (\exists y)(U'(y)\land (\forall z)(I(x,z)\leftrightarrow z=y)))$
$\quad$ ($I$ is a function from $U$ to $U'$),
2. $(\forall x)(U'(x)\rightarrow (\exists y)(U(y)\land (\forall z)(I(z,x)\leftrightarrow z=y)))$
$\quad$ ($I$ is one-to-one and onto),

Categoricity

3. $I(0,0')$,
4. $(\forall x)(\forall x')(\forall y)(\forall y')(I(x,x')\land I(y,y')$
\[\quad\rightarrow Sx=y\leftrightarrow S'x'=y')\]
,

Categoricity

5. for any variables $w$, $x$, $y$, and $z$ and formulas $\phi$ and $\psi$ that are obtained from formulas of the base language of PAPR by bounding all quantifiers by $U$,

Categoricity

\[(\forall x_{1})(\forall x'_{1})\dotsm (\forall x_{n})(\forall x'_{n})\]
\[(I(x_{1},x'_{1})\land\dots\land I(x_{n},x'_{n})\]
\[ \rightarrow{\mathbb P}(\hat{w}\phi ,\hat{x}\hat{y}\hat{z}\psi)(x_{1}\dots x_{n})\leftrightarrow\]
\[\qquad{\mathbb P}'(\hat{w}\phi ',\hat{x}\hat{y}\hat{z}\psi ')(x'_{1}\dots x'_{n})),\]

Categoricity

where $n$ is the number of places of ${\mathbb P}(\hat{w}\phi ,\hat{x}\hat{y}\hat{z}\psi )$ and $\phi '$ and $\psi '$ are the primed versions of $\phi$ and $\psi$.

Categoricity

In other words, in any model of $T$, $I$ is an isomorphism between the unprimed and primed models of PAPR.

Categoricity

The theorem should be compared with work of Parsons, who noted that one can prove two models of arithmetic isomorphic via a primitive recursively defined isomorphism. The $I$ of the theorem is
\[{\mathbb P}(\hat{w}w=0',\hat{x}\hat{y}\hat{z}z=S'y).\]

Categoricity

The theorem does not require adding any symbols or definitions to the combination of the theory with the primed version. The theory PAPR$^{(+)U}$ provides a class of functions, the primitive recursive functions, that can be defined between the natural numbers and other structures without a background notion of an extension. Thus, one can hold that we can successfully characterize the natural numbers with a fully formal theory, PAPR$^{(+)U}$, without any need for the ill-defined notion of acceptability of expansions of a language.

Why primitive recursion?

The very notion of a full schematic theory involves systematically adding new axioms to an antecedently given theory. There is, of course, no formal obstacle to adding any axioms whatsoever to any theory whatsoever. The main question is when the resulting theory will be satisfiable—when it will have a model, and in the case of full schematic theories, when it will have a model in our new specially defined sense.

Consistency is not enough

That question has a well-known answer in the case of ordinary first-order theories: by the completeness theorem, a necessary and sufficient condition for a theory to be satisfiable is that it be consistent. That condition is not sufficient for full schematic theories: for example, the theory ${\text{PA}}^{(+)}+\lnot{\text{Con(PA)}}$, where Con(PA) is the sentence asserting the consistency of PA used in Gödel's second incompleteness theorem, is (if PA is consistent) consistent but it has no models. Proof: A model in the ordinary first-order sense of this theory must be nonstandard, since PA is consistent, but PA$^{(+)}$ has no nonstandard models by the categoricity theorem.

Acceptability

I do not know of a nontrivial necessary and sufficient condition for a full schematic theory to be satisfiable, certainly not one as straightforward as simple consistency, but there is a useful general sufficient condition. There is also an additional sufficient condition useful for some special cases, including PA.

Acceptability

My main interest in full schematic theories has been their use in characterizing mathematical domains. The full schematic principles of interest concern characteristics of the domain that survive the arbitrary addition of new relations. It is therefore natural to ask when the addition of a theory $A$ to a full schematic theory $T^{(+)}$ adds new relations without changing the domain. I shall call such an addition an acceptable addition. For example, adding $\lnot{\text{Con(PA)}}$ to a theory of arithmetic is not acceptable, because satisfying it may require adding new elements (nonstandard numbers) to the domain.

Acceptability

If an addition to a full schematic theory is an acceptable addition, then any intended model of the full schematic theory can be expanded to a model of the addition. Thus, if a full schematic theory is satisfiable, so is the theory plus any acceptable addition.

Acceptability

The condition is sufficient, but it is surely not necessary. For example, large cardinal axioms are not acceptable additions to set theory—the whole point is that they add things to the domain. Nonetheless, there may well be good reason to allow adding on large cardinal axioms.

Acceptability

We can give the notion of an acceptable addition a mathematically precise explication as follows: an addition $A$ to a theory $T$ is acceptable with respect to a class of models $\Gamma$ if every model of $T$ in $\Gamma$ has an expansion to a model of $A$.

Acceptability

Actually, we shall need a slightly more general definition to allow additions with free variables for parameters: an addition $A$ to a theory $T$ is acceptable with respect to a class of ordered pairs of the form $\langle {\mathfrak A},s\rangle$, where ${\mathfrak A}$ is a model and $s$ is an assignment for ${\mathfrak A}$, if every pair $\langle {\mathfrak A},s\rangle$ in $\Gamma$ is such that if ${\mathfrak A}$ is a model of $T$ then ${\mathfrak A}$ has an expansion to a model in which $s$ satisfies $A$.

Acceptability

In the case of PA$^{(+)}$, the intended model is a smallest model. It is isomorphic, in a suitable sense, to a substructure of every structure that is a model of PA$^{(+)}$ with respect to a fixed language. That follows from the following theorem.

Acceptability

Theorem. Let $T$ be a theory that is complete for atomic sentences (that is, if $\sigma$ is an atomic sentence then either $\sigma$ or $\lnot\sigma$ is a consequence of $T$). Then, with respect to a fixed language, any model of $T$ in which every member of the domain is denoted by a closed term is a smallest model of $T$.

Acceptability

The theory PA, and hence PA$^{(+)}$, satisfies the hypothesis: The intended model is one in which every member of the domain is denoted by a closed term. The atomic sentences are all variable-free equations, and PA suffices to prove all the true ones and to disprove all the false ones.

Acceptability

Proof: Let $T$ be a theory and $\mathfrak A$ be a model as in the hypothesis of the theorem. Let $\Delta$ be the set of all atomic and negation atomic sentences that are consequences of $T$. Then $\mathfrak A$ is a model of $\Delta$, and $\Delta$ is a maximal consistent set of atomic sentences. Thus, $\Delta$ is the atomic diagram of $\mathfrak A$. Since every model of $T$ is a model of $\Delta$, it follows that $\mathfrak A$ is isomorphic to a substructure of every model of $T$. $\square$

Acceptability

Since PA$^{(+)}$ has a smallest model, it will be useful to show that any universal theory (that is, any set of formulas of the form $(\forall \overline{x})\phi$, where $\phi$ is free of quantifiers) is an acceptable addition to any theory $T^{(+)}$ with respect to the class of smallest models of $T^{(+)}$ in a fixed language if $T^{(+)}+A$ is consistent:

Acceptability

Theorem. Suppose that some theory $T$ has a smallest model $\mathfrak A$. Then ${\mathfrak A}$ can be expanded to a model of any universal theory $A$ that is consistent with $T$.

Acceptability

Proof: Since $T+A$ is consistent, it has a model $\mathfrak B$. Since $\mathfrak A$ is a smallest model of $T$, we may as well assume that $\mathfrak A$ is a substructure of the reduct of $\mathfrak B$ to the language of $\mathfrak A$. The substructure of $\mathfrak B$ with universe the universe of $\mathfrak A$, call it ${\mathfrak A}^+$, is thus a model of $T$. Since ${\mathfrak A}^+$ is a submodel of $\mathfrak B$ and $\mathfrak B$ is a model of $A$, it follows that ${\mathfrak A}^+$ is a model of $A$—the theory $A$, being universal, has the submodel property. $\square$

Acceptability

By the Matijasevi