The empty set Æ = {} is a regular language.
The singleton set consisting of the empty string {e} is a regular language.
Recall that Æ and {e} are distinct sets, hence distinct languages.
The singleton set {w} such that w Î S is a regular language.
If L and M are regular languages, then so is L ^ M = {st: s is in L and t is in M}
The concatenation of two regular languages is a regular language.
If L and M are regular languages, then so is L È M = {s: s is in L or s is in M}
The union (disjunction) of two regular languages is a regular language.
If L is a regular language, then so is L+ {sin: n > 0 and si is in L}.
The unbounded repetition of the members of a regular language is a regular language. Note that a different choice of si can be made on each repetition. If n is large enough, then the same si will recur.
Nothing else is a regular language. The class of regular languages over S is closed under concatenation, union and unbounded repetition.
If L and M are regular languages, then so is L Ç M = {s: s is in L and s is in M}
The intersection (conjunction) of two regular languages is a regular language.
If L is a regular language, then so is Ø L = {s : s is in S* and s is not in L}.
The complement with respect to S* (negation) of a regular language is a regular language.
A regular expression is a compact notation for representing a particular regular language, using concatenation, union and unbounded repetition.
The notations that we use for regular expressions include the following repetition operators.
The Kleene star operator is different from the "wildcard star", as in the DOS command delete *.*. Wildcard star means 'any string', including e, and so is equivalent to S*. Also the optional operator is different from the wildcard question mark, as in delete ??.???. Wildcard question mark means 'any single character', and so is equivalent to S.
Here is the notation for union.
We can take the Kleene plus and pipe operators as the basic notations, and define the Kleene star and optionality operators in terms of them as follows. However, we allow the use of all four operators in homework answers.
The scope of any string operator can be indicated explicitly by parentheses. Without parentheses, the scope of n, +, * and ? is the string to their immediate left, and the scope of | is the strings to its immediate left and right. Here are some examples.
Let R the language consisting of the set of rational numbers in binary notation.
Write a regular expression for R. (Hint: First formulate a set of rules for the "use" of the members of S. Then "translate" these into regular-expression notation.)
The minus sign:
The division sign:
Zero:
One:
Given that that the pipe as a whole is parenthesized, it is not necessary to parenthesize its right-hand side, but no harm is done if you do. That is, the above expression is equivalent to: (0 | (-? 1 (0 | 1)*))
Question: what does the unparenthesized pipe expression 0 | -? 1 (0 | 1)* represent?
A regular grammar is a 4-tuple {N, S, S, P}, where:
A derivation of the string s in a regular grammar RG is a sequence of lines starting with A0 and ending with s, where s Î V*. Each line in the derivation except the first is formed from the previous one by replacing a nonterminal symbol A by the right hand side of a rule whose left hand side is A. If s Î S*, the derivation is said to be terminated. The language generated by RG is the set L consisting of strings that have terminated derivations.
Here is a regular grammar G for the regular language consisting of the rational numbers in binary notation. (The productions are numbered for ease of reference.)
G = {N, S, S, P}, where:
1 |
A ® 0 |
9 |
D ® 1 D |
2 |
A ® 1 |
10 |
D ® / E |
3 |
A ® 0 B |
11 |
E ® 1 F |
4 |
A ® 1 D |
12 |
F ® 0 |
5 |
A ® - C |
13 |
F ® 0 F |
6 |
B ® / E |
14 |
F ® 1 |
7 |
C ® 1 D |
15 |
F ® 1 F |
8 |
D ® 0 D |
|
|
start |
A |
5 |
- C |
7 |
- 1 D |
8 |
- 1 0 D |
10 |
- 1 0 / E |
11 |
- 1 0 / 1 F |
14 |
- 1 0 / 1 1 |