#|------------------------------------------------------------------------------ | | NAME | | LR(1)AndLALR(1)ParserGenerator.lsp | | | DESCRIPTION | | LR parser generator which produces goto graphs and action and goto tables | for both LR(1) and LALR(1) grammars. | | It gives the same parsing tables (and conflicts) as UNIX's yacc | compiler-compiler, except that some states may be numbered in a different | order. | | | CALLING SEQUENCE | | Once you are in a Common Lisp interpreter, load this file using | your path: | | (load "LR(1)AndLALR(1)ParserGenerator.lsp") | | For an LR(1) grammar, type | | (parser-generator "grammar.dat" "parser.dat" :parser-type 'LR1) | | For an LALR(1) grammar, type | | (parser-generator "grammar.dat" "parser.dat") | or | (parser-generator "grammar.dat" "parser.dat" :parser-type 'LALR1) | | Parser-generator prints the warning message "Conflicts were detected" to | the console if any shift-reduce or reduce-reduce conflicts occur. | | For testing, you can also call | | (test-parser-generator) | | but you need to modify this function to your taste by setting the | file paths. | | Online documentation when you're in the lisp interpreter is given by the | standard documentation functions; for example, | | (apropos 'getHead) | => GETHEAD | GETHEADOFLISTUPTO (fbound) | (describe 'getHeadOfListUpTo) | COMMON-LISP-USER::GETHEADOFLISTUPTO | [symbol] | | GETHEADOFLISTUPTO names a compiled function: | Lambda-list: (ITEM LIST) | Derived type: (FUNCTION (T T) (VALUES LIST &OPTIONAL)) | Documentation: | ------------------------------------------------------------------------------- | | | | DESCRIPTION | | | | Return the list from the beginning up to but not including a given item, | | or the whole list if the item wasn't found. | ... | |------------------------------------------------------------------------------- | Source file: /Users/seanoconnor/ParserGeneratorAndParser/SourceCode/ParserGenerator/LR(1)AndLALR(1)ParserGenerator.lsp | | (describe '*productions*) | COMMON-LISP-USER::*PRODUCTIONS* | [symbol] | | *PRODUCTIONS* names a special variable: | Value: (((1) (S -> POLY MOD)) ((2) (MOD -> COMMA INTEGER)) | ((3) (MOD -> EPSILON)) ((4) (POLY -> POLY + TERM)) | ((5) (POLY -> TERM)) ((6) (TERM -> MULTIPLIER POWER)) | ((7) (MULTIPLIER -> INTEGER)) ((8) (MULTIPLIER -> EPSILON)) | ((9) (POWER -> X)) ((10) (POWER -> X ^ INTEGER)) | ((11) (POWER -> EPSILON))) | Documentation: | List of productions of the unaugmented grammar. | | | INPUT FILES: | | grammar.dat A list of the productions of the grammar followed by | a list of terminal symbols. The file grammar.dat | shows an example. Epsilon productions are allowed. | | We assume the start symbol is the one which begins the first production | listed in grammar.dat. | | Don't include $ (the right endmarker) in the list of terminals. It is | added automatically by the program. | | | | OUTPUT FILES: | | parser.dat A numbered list of productions, followed by the LR(1) | or LALR(1) goto graph (i.e. set of items) of the | grammar and the action and goto tables. See the files | parser.dat and lalrparser.dat for examples. | | The LALR(1) tables are the same as the ones in the y.output file | generated by UNIX's yacc compiler-compiler running with the -v | option. The only difference is that some states may be numbered in | a different order. | | Shift-reduce or reduce-reduce conflicts are inserted into the action | and goto tables at the end of the line for the state in which they | occur. | | You can feed the action and goto tables to my Common Lisp LR parser | program "parser.lisp". The goto graph indicates the state of the | parse, just as in yacc's output, and can help to define the parsing | error messages. | | | AUTHOR | | Sean E. O'Connor 01 Jun 1989 Version 1.0 | 11 Mar 2008 Version 5.6 released. | | LEGAL | | LR(1)AndLALR(1)ParserGenerator Version 5.6 | An LR(1) and LALR(1) Parser Generator written in Common Lisp. | | Copyright (C) 1989-2024 by Sean Erik O'Connor. All Rights Reserved. | | This program is free software: you can redistribute it and/or modify | it under the terms of the GNU General Public License as published by | the Free Software Foundation, either version 3 of the License, or | (at your option) any later version. | | This program is distributed in the hope that it will be useful, | but WITHOUT ANY WARRANTY; without even the implied warranty of | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the | GNU General Public License for more details. | | You should have received a copy of the GNU General Public License | along with this program. If not, see . | | The author's address is seanerikoconnor!AT!gmail!DOT!com | with the !DOT! replaced by . and the !AT! replaced by @ | | | METHOD | | This is a Common Lisp implementation and will run under CLISP. | The software design is layered, the simpler list manipulation | utilities coming first, building up gradually to the specialized | and higher level parser functions. I've put lots of examples to | ease the pain. | | To construct the LR(1) goto graph (i.e. set of items) we use Algorithm | 4.9 of [Aho 86, pg. 231-232]. To create the cannonical LR(1) parsing | action and goto tables, algorithm 4.10 [Aho 86, pg. 234] is used. | | To construct the LALR(1) parsing tables, we use the much simpler | algorithm of [Aho 74, pg. 115] instead of algorithm 4.11 in [Aho 86, | pgs. 238-239]. | | For computing FIRST (first derived terminals) we use algorithm 5.5 of | [Aho 72, pgs. 357-359]. | | The function EFF (epsilon-free first derived terminals) is described | in [Aho 72, pg. 381]. We base the algorithm used in the function | first-terminals-of-symbol on exercise 5.2.19 [Aho 72, pg. 398]. The | modifications to algorithm 5.5 to make it compute EFF are my own and | are described in my notes. | | In the first version of this program, we used the algorithm for FIRST | of [Aho 86, pgs. 188-189]. But this algorithm does not always | terminate! In particular, it fails for the grammar, | | S -> A S | b | A -> S A | a | | of example 4.33 [Aho 86, pg. 272] by getting into the following | infinite loop: FIRST( S ) = FIRST( A ) = FIRST( S ) ... The algorithm | we use always terminates. | | | REFERENCES | | See http://www.seanerikoconnor.freeservers.com for a review of the | parsing theory behind this program. | | | [Aho 86] COMPILERS: PRINCIPLES, TECHNIQUES, AND TOOLS, | Alfred V. Aho, Ravi Sethi, and Jeffrey D. Ullman, | Addison-Wesley, 1986. | | [Aho 74] "LR Parsing", Alfred V. Aho and Stephen C. Johnson, | Computing Surveys, Vol. 6, No. 2, June 1974, pg. 99-124. | | [Aho 72] THE THEORY OF PARSING, TRANSLATION AND COMPILING, VOLUME 1: | PARSING, Alfred V. Aho and Jeffrey D. Ullman, Prentice-Hall, | 1972. | | BUGS | | Have the output look like the file y.output generated by yacc -v, or | eyacc -v. | | NOTES | | In Common Lisp, functions and variables don't share the same namespace. | So you need to tell LISP that the variable func is actually a function by using funcall. | | (defun apply-func (func arg1 arg2) (funcall func arg1 arg2)) | | So you can't do the simpler expression (func arg1 arg2). | | It goes the other way too. You can't say (apply-func + 1 2) because + denotes a variable. | You have to tell LISP it's a function, | | (apply-func #'+ 1 2) | | which is a shorthand for | | (apply-func (function +) 1 2) | | first and car are synonyms as are rest and cdr | (null '()) => T | (null nil) => T | (null 'a) => NIL | | Optional arguments using keywords example: | | (defun doggie-looks( dog &key (nose-color 'red) (hair-color 'white)) | (list 'my dog 'has 'a nose-color 'nose 'and hair-color 'hair)) | | * (doggie-looks 'husky) => (MY HUSKY HAS A RED NOSE AND WHITE HAIR) | * (doggie-looks 'husky :nose-color 'black) => (MY HUSKY HAS A BLACK NOSE AND WHITE HAIR) | +-------------------------------------------------------------------------------|# ; ============================================================================== ; | Constants | ; ============================================================================== ; Use the naming convention +variable-name+ to denote constants. (defconstant +initial-hash-table-size+ 100 "------------------------------------------------------------------------------- | Initial hash table length. Don't worry, lisp hash tables are extensible | at run time. --------------------------------------------------------------------------------" ) (defconstant +hash-value-upper-limit+ 65536 "------------------------------------------------------------------------------- | Upper limit on hash value on core of items. |-------------------------------------------------------------------------------" ) ; ============================================================================== ; | Dynamically Bound (i.e. Global) Variables | ; ============================================================================== ; Use the naming convention *variable-name* to denote them as global. (defvar *productions* nil "------------------------------------------------------------------------------- | List of productions of the unaugmented grammar (without S' -> S). | e.g. ( (S -> S |a| S |b| / EPSILON) ) | which represents S -> S a S b, S -> EPSILON |-------------------------------------------------------------------------------" ) (defvar *has-epsilon-productions* nil "------------------------------------------------------------------------------- | T if we have any epsilon productions of the form A -> EPSILON, but NIL otherwise. --------------------------------------------------------------------------------" ) (defvar *terminals* nil "------------------------------------------------------------------------------- | List of terminal symbols for the grammar. e.g. ( |a| |b| ) |-------------------------------------------------------------------------------" ) (defvar *first-derived-terminals* nil "------------------------------------------------------------------------------- | Hash table containing the first derived terminals for each grammar symbol. |-------------------------------------------------------------------------------" ) (defvar *epsilon-free-first-derived-terminals* nil "------------------------------------------------------------------------------- | Hash table containing the epsilon-free first derived terminals for each grammar symbol. |-------------------------------------------------------------------------------" ) (defvar *goto-graph* nil "--------------------------------------------------------------------------------------------------------------- | Goto graph of the LR(1) or LALR(1) grammar of the form | | ( | ( ------+ | (6 |a| 4) <-- Transition in Goto graph from | | state 6 to state 4 on symbol a. +------ List of graph edges and transitions. | (1 |a| 2) <-- Transition from state 1 to state 2 | | on a. | | ) ------+ | | ) --------+ | ( 0 <-- State number 0. | | 3668 <-- Hash value of core of items. | | ( | | (SP -> DOT S |,| $) ----+ | | ( S -> DOT S |a| S |b| |,| $) | | | ( S -> DOT EPSILON |,| $) +---- Set of items for state 0. | | ( S -> DOT S |a| S |b| |,| |a|) | | | ( S -> DOT EPSILON |,| |a|) | | | ) ----+ | | ) +-- List of sets of items. | | | ( 2 <-- State number 2. | | 5168 <-- Hash values of core of items. | | ( | | (S -> S |a| DOT S |b| |,| $) ----+ | | (S -> S |a| DOT S |b| |,| |a|) | | | (S -> DOT S |a| S |b| |,| |b|) | | | (S -> DOT EPSILON |,| |b|) +-- Set of items for state 2. | | (S -> DOT S |a| S |b| |,| |a|) | | | (S -> DOT EPSILON |,| |a|) ----+ | | ) | | ) | | ) --------+ | ) |--------------------------------------------------------------------------------------------------------------" ) (defvar *action-table* nil "------------------------------------------------------------------------------- | Action table of the form, | | ( | ( (0) <-- state number | ( | ($ (R 2)) <-- reduce action on end of input $ | (|a| (R 2)) <-- reduce action on symbol a. | (DEFAULT (ERROR)) <-- otherwise must be error | ) | ) | | ( (1) <-- next line of action table. | ( | ($ (ACC NIL)) <-- accept action on end of input $ | (|a| (S 2)) <-- shift action on symbol a. | (DEFAULT (ERROR)) | ) | ) | ) |-------------------------------------------------------------------------------" ) (defvar *goto-table* nil "------------------------------------------------------------------------------- | Goto table of the form, | | ( | ( (0) <-- state number | ( | (S 1) <-- transition to state 1 on symbol S | (DEFAULT (ERROR)) <-- otherwise error | ) | ) | | ( (2) | ( | (S 3) | (DEFAULT (ERROR)) | ) | ) | ) |-------------------------------------------------------------------------------" ) (defvar *conflicts* nil "------------------------------------------------------------------------------- | Set to true if we have any shift-reduce or reduce-reduce conflicts. |-------------------------------------------------------------------------------" ) ; ============================================================================== ; | General Purpose List Processing Primitives | ; ============================================================================== (defun getHeadOfListUpTo( item list ) "------------------------------------------------------------------------------- | | DESCRIPTION | | Return the list from the beginning up to but not including a given item, | or the whole list if the item wasn't found. | | CALLING SEQUENCE | | (getHeadOfListUpTo item list) | => New list of all symbols before the item. | | EXAMPLE | | (getHeadOfListUpTo 'rat '(you are a rat fink)) => (YOU ARE A) | (getHeadOfListUpTo 'cat '(you are a rat fink)) => (YOU ARE A RAT FINK) | (getHeadOfListUpTo 'rat '(rat) ) => nil | (getHeadOfListUpTo 'rat nil ) => nil | |-------------------------------------------------------------------------------" (cond ( (null list) nil) ; Empty list. ( (equal (first list) item) nil) ; List = (item). Return (). ; Recurse. ( (cons (first list) (getHeadOfListUpTo item (rest list))))) ) (defun removeItemFromList( item list &key (equalityTest #'equal) ) "------------------------------------------------------------------------------- | | DESCRIPTION | | Remove all occurences of a given item from a list. Test item equality | with a function. | | CALLING SEQUENCE | | (removeItemFromList item list :equalityTest testFunction) | => New list with all occurrences of symbol taken out. | | testFunction The name of the function which tests if two symbols are | equal. It should be a function of two arguments which | returns T if the symbols are equal and NIL otherwise. | It defaults to #'equal. | | EXAMPLE | | (removeItemFromList '(rat bad) '( (cat good) (rat good))) | => ( (CAT GOOD) (RAT GOOD) ) | | (defun sameAnimal( s1 s2 ) (equal (first s1) (first s2))) | (funcall #'sameAnimal '(rat good) '(rat bad)) => T | | (removeItemFromList '(rat bad) '( (cat good) (rat good)) | :equalityTest #'sameAnimal) | => ( (CAT GOOD) ) | +-------------------------------------------------------------------------------" (cond ( (null list) nil) ; Nothing in the list. ( (funcall equalityTest ; First item matches. item (first list)) ; according to equality ; test. (removeItemFromList item (rest list) ; Discard it and remove ; all other :equalityTest equalityTest)) ; items too. ( t (cons (first list) ; First item does not match. (removeItemFromList item ; Add it back and remove the (rest list) ; remaining items. :equalityTest equalityTest)))) ) (defun itemInList( element list &key (test #'equal) ) "------------------------------------------------------------------------------- | | DESCRIPTION | | Find out if an atom or a list is a member of a given list. Test for | equality with a function. | | CALLING SEQUENCE | | (itemInList item list :equalityTest testFunc) | => T if item is in list; NIL if not. | | testFunc The name of the function which tests if two symbols are | equal. It should be a function of two arguments which | returns T if the symbols are equal and NIL otherwise. | test defaults #'equal. | | EXAMPLE | | (itemInList '(hot dog) '((cool cat) (cool dog)) ) => NIL | | (defun sameAnimal( s1 s2 ) (equal (first s1) (first s2))) | | (itemInList '(hot dog) '((cool cat) (cool dog)) | :equalityTest #'sameAnimal) => T | +---------------------------------------------------------------------------------" (cond ( (null list) nil) ; Not in the list. ( (funcall test element (first list)) ; First item matches. t) ( t (itemInList element (rest list) ; Try again on rest of list. :test test))) ) (defun positionInList( item list ) "------------------------------------------------------------------------------- | | DESCRIPTION | | Find the position of an item in a list. | | CALLING SEQUENCE | | (positionInList item list) | | item Atom or list to be found. | | list Any list. | | Returns: The position of item in the list or NIL if it is not there. | The first position is zero. | | EXAMPLE | | (positionInList '(winter mute) '(I am (winter mute))) => 2 | (positionInList 'ratfinn '(Who you ? ratfink ?)) => NIL | +---------------------------------------------------------------------------------" (cond ( (null list) nil ) ; Nothing in the list. ( (equal item (first list)) 0) ; list = (item ...), return ; position = 0. ; If the item is in the rest of the list, find its position in the ; rest of the list, then add 1 to fix up the count. ( (itemInList item (rest list)) (1+ (positionInList item (rest list)))) ( t nil )) ; Item was not found --- ; return NIL. ) (defun insertItemIntoList( item L &key (test #'equal) (precedence nil) ) "------------------------------------------------------------------------------ | | DESCRIPTION | | If an object isn't already in the list, add it to the end. If it is, | overwrite it (see below). | | CALLING SEQUENCE | | (insertItemIntoList item L :test test :precedence precedence) | | item An atom or list. | | test The test to perform to see if an item is is in the list. | It is the name of a function with two arguments which should | return T if its arguments are equal and NIL if they aren't. | The test function defaults to #'equal if omitted. | | precedence The test function to perform to say which object has the | higher precedence when both are equal. The one of higher | precedence is kept. An item of higher precedence overwrites | its lower precedence brother in the list. The function should | be of the form (precedence x y), returning the object of | higher precedence. Defaults to NIL (Don't care). | | L List of non-duplicated elements (according to equality test | specified above). | | Returns: Unchanged list if item is already in it. Otherwise, returns | the list L with the item in the last position. | | EXAMPLE | | (insertItemIntoList '(rat good) '( (rat bad) (bat good) ) ) | => ((RAT BAD) (BAT GOOD) (RAT GOOD)) | We compared for exact equality, so the new item gets inserted. | | | (defun sameAnimal( s1 s2 ) (equal (first s1) (first s2))) | | (insertItemIntoList '(rat good) | '((rat bad) (bat good)) | :test #'sameAnimal) | => ((RAT BAD) (BAT GOOD)) | Rats are already in the list, so don't add the item. | | (defun good-always-wins( x y ) (cond ((equal (second x) 'good) x) (t y))) | | (insertItemIntoList '(rat good) '( (rat bad) (bat good) ) | :test #'sameAnimal | :precedence 'good-always-wins) => | => ((RAT GOOD) (BAT GOOD)) | Rats are already in the list, but we now compare equal items further | to see which have higher precedence. | |------------------------------------------------------------------------------" (cond ( (null L) (list item)) ; Nothing there. Add the item. ( (funcall test item (first L)) ; Item is already in the list. ; Of the two equal objects --- item and the first element in the list --- ; keep the one of higher precedence. (if (not (null precedence)) (cons (funcall precedence item (first L)) (rest L)) L)) ; Don't care about precedence, so ; keep the original list. ( t (cons (first L) (insertItemIntoList item (rest L) :test test :precedence precedence)))) ) (defun combine( list1 list2 &key (test #'equal) (precedence nil) ) "------------------------------------------------------------------------------ | | DESCRIPTION | | Take the union of two lists. We can do a generalized test for | equality of elements. Also, if two elements are equal, we can | keep the one of higher precedence. | | CALLING SEQUENCE | | (combine list1 list2 :test test :precedence precedence) | | list1 Arbitrary lists. | | list2 | | item An atom or list. | | test The test to perform to see if an item is is in the list. | It is the name of a function with two arguments which should | return T if its arguments are equal and NIL if they aren't. | The test function defaults to #'equal if omitted. | | precedence The test function to perform to say which object has the | higher precedence when both are equal. The one of higher | precedence is kept. An item of higher precedence overwrites | its lower precedence brother in the list. The function should | be of the form (precedence x y), returning the object of higher | precedence. precedence defaults to NIL (Don't care). | | Returns: The set theoretic union of the two lists, except that we | always keep the element of highest precedence when two | elements are the same. | | EXAMPLE | | (combine '((rat good) (rat awful)) '((rat bad) (bat good))) | => ((RAT AWFUL) (RAT GOOD) (RAT BAD) (BAT GOOD)) | | (defun sameAnimal( s1 s2 ) (equal (first s1) (first s2))) | | (combine '((rat good) (rat awful)) '((rat bad) (bat good)) | :test #'sameAnimal) | => ((RAT AWFUL) (BAT GOOD)) | | (defun good-always-wins( x y ) (cond ((equal (second x) 'good) x) (t y))) | | (combine '((rat good) (rat awful)) '((rat bad) (bat good)) | :test #'sameAnimal | :precedence #'good-always-wins) => ((RAT GOOD) (BAT GOOD)) | |------------------------------------------------------------------------------" ; Successively add elements from both lists to nil, eliminating ; duplicated or low precedence items. If both lists are nil, dolist ; does not loop, and we return nil. (let ( (new-list nil) ) (dolist (item (union list1 list2 :test #'equal)) (setq new-list (insertItemIntoList item new-list :test test :precedence precedence))) new-list) ) ; ------------------------------------------------------------------------------ ; | core-of-item! | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Get the core of an item: the item but without the lookahead. ; ; CALLING SEQUENCE ; ; (core-of-item! item) ; ; item [A -> alpha . beta , gamma ] ; ; Returns: [A -> alpha . beta] ; ; EXAMPLE ; ; (core-of-item! '(sandwich -> bread meat DOT bread |,| knife)) ; => (SANDWICH -> BREAD MEAT DOT BREAD) ; ; ------------------------------------------------------------------------------ (defun core-of-item!( item ) (getHeadOfListUpTo '|,| item) ) ; ------------------------------------------------------------------------------ ; | element-of-item? | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Find out if an item or its core is in a set of items. ; ; CALLING SEQUENCE ; ; (element-of-item item set-of-items compare-type) ; ; item [A -> alpha . beta , gamma] ; ; set-of-items ... [A' -> alpha' . beta' , gamma'] ... ; ; compare-type Whether to compare the whole item or only its core. ; ; Returns: if compare-type = 'core then T if A = A', alpha = ; alpha', beta = beta'. gamma need not equal gamma'. ; if compare-type = item then T if in addition, ; gamma = gamma', and NIL otherwise. ; ; EXAMPLE ; ; (element-of-item? '(eat -> living death |,| scum) ; '( (eat -> hot fudge |,| scum) ; (eat -> living death |,| wimp)) ; 'core) => T ; ; But (element-of-item? . . . 'item) => NIL ; ; ------------------------------------------------------------------------------ (defun element-of-item?( item set-of-items compare-type ) (cond ((null set-of-items) nil) ; No items, no match. ( (if (equal compare-type 'core) (equal (core-of-item! item) ; Core of first item (core-of-item! (first set-of-items))) ; was found. (equal item (first set-of-items))) ; First items match. T) ; First item is in set ( t (element-of-item? item ; Continue to search. (rest set-of-items) compare-type))) ) ; ------------------------------------------------------------------------------ ; | contained-in-item? | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Find if the first set of items is contained in the second ; set of items. Alternatively, find out if the cores of the ; first set of items are contained in the cores of the second set. ; ; CALLING SEQUENCE ; ; (contained-in-item? set-of-items1 set-of-items2 compare-type) ; ; set-of-items1 First and ... ; ; set-of-items2 ... second sets of of items. ; ; compare-type Type of comparison: 'item or 'core. ; ; Returns: T if compare-type = 'item and the first set of items ; is contained in the second set of items. ; T if compare-type = 'core and the cores of the first ; set of items are contained in the cores of the second ; set of items. ; EXAMPLE ; ; (contained-in-item? '( (a -> b DOT c |,| x) ; (e -> f DOT |,| g)) ; ; '( (a -> b DOT c |,| h) ; (e -> f DOT |,| i) ; (f -> g DOT h i |,| j) ) ; 'core ) => T ; ; However, (contained-in-item? . . . 'item) => NIL ; ; ------------------------------------------------------------------------------ (defun contained-in-item?( set-of-items1 set-of-items2 compare-type ) (cond ((null set-of-items1) T) ; Null set is contained in ; every set. ((element-of-item? (first set-of-items1) ; First item (or its core) set-of-items2 ; is in the second set. compare-type) (if (null (rest set-of-items1)) ; No other elements in first set T (contained-in-item? (rest set-of-items1) ; Are the remaining set-of-items2 ; items of the first compare-type))) ; set in the second? ( t nil )) ; First element of first set isn't in ; the second set: first set can't be ; contained in second set. ) ; ------------------------------------------------------------------------------ ; | equal-sets-of-items? | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Check if two sets of items are the same (or have the same core). ; You can also use it to check if two arbitrary lists contain the ; same elements. ; ; CALLING SEQUENCE ; ; (equal-sets-of-items? set-of-items1 set-of-items2 :compare-type type) ; ; set-of-items1 First and ... ; ; set-of-items2 ... second sets of of items. ; ; type Optional argument defaulting to 'item. ; ; Returns: T for type = 'item if both sets of items are identical. ; T for type = 'core if both sets of items have the same ; cores. ; METHOD ; ; Two sets of items are the same if each one is contained within the ; other. They have the same core if the core of one is contained in ; the core of the other and vice-versa. We can't just test for equality ; because the order of the items could be different. ; ; EXAMPLE ; ; (equal-sets-of-items? '( (a -> b DOT c |,| d) ) ; '( (a -> b DOT c |,| d) )) => T ; ; (equal-sets-of-items? '( (a -> b DOT c |,| d) ) ; '( (a -> b DOT c |,| e) ) ; :compare-type 'core ) => T ; ; (equal-sets-of-items? '(a (b c) d) '(d a (b c))) => T ; ; ------------------------------------------------------------------------------ (defun equal-sets-of-items?( set-of-items1 set-of-items2 &key (compare-type 'item)) (and (contained-in-item? set-of-items1 set-of-items2 compare-type) (contained-in-item? set-of-items2 set-of-items1 compare-type)) ) ; ============================================================================== ; | Helper Functions on Symbols and Productions | ; ============================================================================== ; ------------------------------------------------------------------------------ ; | terminal? | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Find out if a symbol of the grammar is a terminal. ; ; CALLING SEQUENCE ; ; (terminal? symbol) ; ; *terminals* Global list of terminal symbols for the grammar. ; ; symbol: A grammar symbol ; ; Returns: T if the symbol is a terminal symbol, NIL otherwise. ; EPSILON is not a terminal. ; ; EXAMPLE ; ; Let *terminals* = (c C), ; ; (terminal? '|c| ) => T ; (terminal? 'C ) => NIL ; (terminal? 'EPSILON) => NIL ; ; ------------------------------------------------------------------------------ (defun terminal?( symbol ) (itemInList symbol *terminals*) ) ; ------------------------------------------------------------------------------ ; | nonterminal? | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Find out if a symbol of the grammar is a nonterminal. ; ; CALLING SEQUENCE ; ; (nonterminal? symbol) ; ; symbol: A grammar symbol ; ; Returns: T if the symbol is a nonterminal symbol, NIL otherwise. ; EPSILON is not a non-terminal. ; ; EXAMPLE ; ; (nonterminal? 'C ) => T ; (nonterminal? '|c| ) => NIL ; (nonterminal? 'EPSILON ) => NIL ; ; ------------------------------------------------------------------------------ (defun nonterminal?( symbol ) (and (not (equal symbol 'EPSILON)) (not (terminal? symbol))) ) ; ------------------------------------------------------------------------------ ; | derives-leading-terminal? | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Check if a production derives a leading terminal. ; ; CALLING SEQUENCE ; ; (derives-leading-terminal? production) ; ; production: A production of the form X -> a Y ; ; Returns: T if a is a terminal, NIL otherwise. ; ; EXAMPLE ; ; (derives-leading-terminal? '(C -> |c| C)) => T ; (derives-leading-terminal? '(C -> C C)) => NIL ; ; ------------------------------------------------------------------------------ (defun derives-leading-terminal?( production ) (terminal? (third production)) ) ; ------------------------------------------------------------------------------ ; | derives-leading-nonterminal? | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Check if a production derives a leading nonterminal. ; ; CALLING SEQUENCE ; ; (derives-leading-nonterminal? production) ; ; production: A production of the form X -> a Y ; ; Returns: T if a is a terminal, NIL otherwise. ; ; EXAMPLE ; ; (derives-leading-nonterminal? '(C -> C C)) => T ; (derives-leading-nonterminal? '(C -> |c| C)) => NIL ; ; ------------------------------------------------------------------------------ (defun derives-leading-nonterminal?( production ) (nonterminal? (third production)) ) ; ------------------------------------------------------------------------------ ; | valid-production? | ; ------------------------------------------------------------------------------ ; ; DESCRPTION ; ; Find out if a production starts with the given symbol. ; ; CALLING SEQUENCE ; ; (valid-production? symbol production) ; ; symbol Grammar symbol ; ; production A production of the form X -> alpha ; ; Returns: T if X = symbol, NIL otherwise. ; ; EXAMPLE ; ; (valid-production? 'C '(C -> |c| C)) => T ; (valid-production? 'S '(C -> |c| C)) => NIL ; ; ------------------------------------------------------------------------------ (defun valid-production?( symbol production ) (equal symbol (first production)) ) ; ------------------------------------------------------------------------------ ; | reduction? | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Find out if an item calls for a reduction. ; ; CALLING SEQUENCE ; ; (reduction? item) ; ; item Any item [A -> alpha . beta , gamma]. ; ; Returns: T if the item is of the form [A -> alpha . , gamma ] ; and NIL otherwise. ; Note [A -> alpha . EPSILON , gamma] = ; [A -> alpha . , gamma], to this is a reduction too. ; ; EXAMPLE ; ; (reduction? '(C -> |d| DOT |,| |c|)) => T ; (reduction? '(C -> |d| DOT |e| |,| |c|)) => NIL ; (reduction? '(C -> |d| DOT EPSILON |,| |c|)) => T ; ; ------------------------------------------------------------------------------ (defun reduction?( item ) ; Get everything between the dot and comma. It will be empty for a reduction. (let ((between-dot-and-comma (getHeadOfListUpTo 'DOT (reverse (getHeadOfListUpTo '|,| item ))))) (or (null between-dot-and-comma) (equal (first between-dot-and-comma) 'epsilon))) ) ; ------------------------------------------------------------------------------ ; | is-accept? | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Find out if an item calls for an accept. ; ; CALLING SEQUENCE ; ; (is-accept? item) ; ; item Arbitrary item [A -> alpha . beta , gamma]. ; ; *productions* Global list of productions for our grammar. ; ; Returns: T if the item is of the form [S' -> S . , $] ; S is the start symbol --- the left hand side symbol ; of the first production. S' (represented as SP) is ; the extra start symbol of the augmented grammar. ; EXAMPLE ; ; *productions* => ( (S -> C C) (C -> |c| C) (C -> |d|) ) ; (is-accept? '(SP -> S DOT |,| $)) => T ; ; ------------------------------------------------------------------------------ (defun is-accept?( item ) (equal item `(SP -> ,(first (first *productions*)) DOT |,| $)) ) ; ------------------------------------------------------------------------------ ; | symbol-after-dot! | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Find the symbol following the dot in an item. ; ; ; CALLING SEQUENCE ; ; (symbol-after-dot! item) ; ; item Item of the form [A -> alpha . B beta , gamma ] ; ; Returns: The symbol B, or NIL if there is none. ; ; EXAMPLE ; ; (symbol-after-dot! '(frogs -> are DOT keen |,| you bet)) => KEEN ; (symbol-after-dot! '(toads -> are not)) => NIL ; ; ------------------------------------------------------------------------------ (defun symbol-after-dot!( item ) (cond ( (null item) nil ) ; No dot was ever found. ; Return the symbol after the dot or nil if there is none. ( (equal (first item) 'DOT) (second item) ) ( T (symbol-after-dot! (rest item)))) ; Keep ; looking. ) ; ------------------------------------------------------------------------------ ; | terminal-after-dot? | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Check if an item has a terminal symbol after the dot. ; ; CALLING SEQUENCE ; ; (terminal-after-dot? item) ; ; item Any item [A -> alpha . beta delta , gamma] ; ; Returns: T if beta is a terminal symbol and NIL otherwise. ; ; EXAMPLE ; ; (terminal-after-dot? '(C -> C DOT |c| |,| $ )) => T ; (terminal-after-dot? '(C -> C DOT C |,| $ )) => NIL ; ; ------------------------------------------------------------------------------ (defun terminal-after-dot?( item ) (if (null (symbol-after-dot! item)) ; No symbol after the dot. nil (terminal? (symbol-after-dot! item))) ; Check if the symbol after the dot ; is a terminal. ) ; ------------------------------------------------------------------------------ ; | epsilon-production? | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Find out if a production derives only epsilon. ; ; CALLING SEQUENCE ; ; (epsilon-production? production) ; ; production [A -> alpha]. ; ; Returns: T if alpha = EPSILON. ; ; EXAMPLE ; ; (epsilon-production? '(A -> EPSILON)) => T ; ; ------------------------------------------------------------------------------ (defun epsilon-production?( production ) (and (equal (length production) 3) (equal (third production) 'EPSILON)) ) ; ------------------------------------------------------------------------------ ; | same-symbol? | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Test if two tagged grammar symbols are equal. ; ; CALLING SEQUENCE ; ; (same-symbol? s1 s2) ; ; s1, s2 Tagged grammar symbols of the form (symbol tag), with ; tag = 'EPSILON-FREE or NIL. ; ; Returns T if the symbol parts are equal. ; EXAMPLE ; ; (same-symbol? '(a NIL) '(a EPSILON-FREE)) => T ; (same-symbol? '(a EPSILON-FREE) '(b EPSILON-FREE)) => NIL ; ; ------------------------------------------------------------------------------ (defun same-symbol?( s1 s2 ) (equal (first s1) (first s2)) ) ; ------------------------------------------------------------------------------ ; | first-alternate! | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Return the first alternate of a production. ; ; CALLING SEQUENCE ; ; (first-alternate! rhs) ; ; rhs The right hand side of a production containing alternates: ; alpha / beta / ... ; ; Returns: alpha ; ; EXAMPLE ; ; (first-alternate! '(A B / C D)) => (A B) ; ; ------------------------------------------------------------------------------ (defun first-alternate!( rhs ) (getHeadOfListUpTo '/ rhs) ) ; ------------------------------------------------------------------------------ ; | all-but-first-alternate! | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Return all but the first alternates of of a production. ; ; CALLING SEQUENCE ; ; (all-but-first-alternate! rhs) ; ; rhs The right hand side of a production, with alternates ; (alpha / beta / ...) ; ; Returns: (beta / ...) or NIL if rhs has only one alternate: ; (alpha) ; ; EXAMPLE ; ; (all-but-first-alternate! '(B C / D E / F)) => (D E / F) ; (all-but-first-alternate! '(B C)) => NIL ; (all-but-first-alternate! '(B C / D E / F)) => (D E / F) ; ; ------------------------------------------------------------------------------ (defun all-but-first-alternate!( rhs ) ; Get the first alternate, then strip it off. (nthcdr (1+ (length (getHeadOfListUpTo '/ rhs))) rhs) ) ; ------------------------------------------------------------------------------ ; | production-rhs! | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Return the right hand side of a production. ; ; CALLING SEQUENCE ; ; (production-rhs! production) ; ; production Production of the form [A -> alpha]. ; ; Returns: alpha ; ; EXAMPLE ; ; (production-rhs! '(sandwich -> bread meat bread)) => (BREAD MEAT BREAD) ; ; ------------------------------------------------------------------------------ (defun production-rhs!( production ) (nthcdr 2 production) ) ; ------------------------------------------------------------------------------ ; | string-before-comma! | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Return the symbols between the first symbol after the dot and the comma ; in an item. ; ; CALLING SEQUENCE ; ; (string-before-comma! item) ; ; item An item of the form [A -> alpha . B beta , gamma] ; ; Returns: beta or NIL if beta is empty. ; ; EXAMPLE ; ; (string-before-comma! '(A -> + DOT B * + + |,| a)) => (* + +) ; (string-before-comma! '(A -> + DOT B |,| a)) => NIL ; (string-before-comma! '(A -> + DOT |,| a)) => NIL ; ; ------------------------------------------------------------------------------ (defun string-before-comma!( item ) (let ((temp (reverse (getHeadOfListUpTo 'DOT (reverse item))))) ; Get everything past the ; dot. (if (equal (first temp) '|,|) ; No symbol after the dot. nil (getHeadOfListUpTo '|,| (rest temp)))) ; Get everything past the ; symbol after the dot ; (which could be nil). ) ; ------------------------------------------------------------------------------ ; | lookahead-of! | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Return the lookahead symbol of an item. ; ; CALLING SEQUENCE ; ; (lookahead-of! item) ; ; item [A -> alpha . beta , gamma] ; ; Returns: gamma ; ; EXAMPLE ; ; (lookahead-of! '(SP -> DOT |d| |,| |c|)) => |c| ; ; ------------------------------------------------------------------------------ (defun lookahead-of!( item ) (cond ( (null item) nil ) ; Nothing at all. ( (equal (first item) '|,|) (second item)) ; Lookahead (or nothing) ; follows the comma. ( T (lookahead-of! (rest item)))) ; Search ) ; ------------------------------------------------------------------------------ ; | split-up-production | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Break a single production with alternates into separate productions. ; ; CALLING SEQUENCE ; ; (split-up-production production) ; ; production [A -> alpha / beta / ... ] ; ; Returns: The list [A -> alpha], [A -> beta], ... ; ; EXAMPLE ; ; (split-up-production '(A -> B C / D E / F)) ; => ( (A -> B C) (A -> D E) (A -> F) ) ; ; ------------------------------------------------------------------------------ (defun split-up-production( production ) ; production = A -> B | C | ... (let ( (head `(,(first production) ,(second production))) ; Get the left hand side: A -> (tail (nthcdr 2 production)) ; Get the right hand side: B | C | ... (new-productions nil) (new-production nil) ) (loop (if (null tail) (return)) (setq new-production ; A -> B, A -> C, etc. (append head (first-alternate! tail))) ; Strip off next list up to bar. (setq tail (all-but-first-alternate! tail)) (setq new-productions (append new-productions (list new-production)))) new-productions) ) ; ------------------------------------------------------------------------------ ; | split-up-productions | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Break a list of productions with alternates into a list of separate ; productions. ; ; CALLING SEQUENCE ; ; (split-up-productions production-list) ; ; production-list [A -> alpha / beta / ... ] [B -> gamma / delta ...] ... ; ; Returns: The list [A -> alpha], [A -> beta], ... [B -> gamma] ; [B -> delta], ... ; EXAMPLE ; ; (split-up-productions '((S -> C C) (C -> |c| C / |d|))) ; => ((S -> C C) (C -> |c| C) (C -> |d|)) ; ; ------------------------------------------------------------------------------ (defun split-up-productions( production-list ) (let ((new-production-list nil)) (dolist (production production-list) ; Split up each production. (setq new-production-list ; Add it to the growing list. (append new-production-list (split-up-production production)))) new-production-list) ) ; ------------------------------------------------------------------------------ ; | make-item | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Create an item with a leading dot from a production and a ; lookahead symbol. ; ; CALLING SEQUENCE ; ; (make-item production lookahead) ; ; production Any production [A -> alpha] ; ; lookahead Any lookahead symbol b. ; ; Returns: The item [A -> . alpha , b] ; ; EXAMPLE ; ; (make-item '(A -> B C) '|d|) => (A -> DOT B C |,| |d|) ; ; ------------------------------------------------------------------------------ (defun make-item( production lookahead ) `( ,(first production) ; Get the first symbol A. ,(second production) ; Get the arrow -> DOT ; Add the leading dot. ,@(nthcdr 2 production) ; Add the right hand side of the production. |,| ; Comma. ,lookahead ) ; Add the lookahead symbol last. ) ; ------------------------------------------------------------------------------ ; | move-dot-right | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Move the dot in an item to the right if possible. ; ; CALLING SEQUENCE ; ; (move-dot-right item) ; ; item [A -> alpha . X beta, b] ; ; Returns: [A -> alpha X . beta, b] ; If the item is of the form [A -> alpha . , b], we return it ; unchanged. ; ; EXAMPLE ; ; (move-dot-right '( A -> B DOT C |,| D)) => ( A -> B C DOT |,| D) ; (move-dot-right '( A -> B C DOT |,| D)) => ( A -> B C |,| DOT D) ; ; ------------------------------------------------------------------------------ (defun move-dot-right( item ) (cond ( (null item) nil ) ( (equal (first item) 'DOT) ; The item begins with a dot. (if (null (second item)) ; item = DOT item ; Leave it alone. ; Move the dot right over the next symbol. We change [ . b c d ] to [b . c d]. `( ,(second item) ; b DOT ; Add a dot. ,@(nthcdr 2 item)))) ; The remainder, (c d). ( t (cons (first item) ; Item doesn't begin with a dot. (move-dot-right (rest item))))) ) ; ------------------------------------------------------------------------------ ; | create-augmenting-item | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Create the accept item of the augmented grammar. ; ; CALLING SEQUENCE ; ; (create-augmenting-item) ; ; *productions* List of productions for this grammar. ; ; Returns: The item (SP -> DOT S |,| $) where S is the start ; symbol: the left hand side nonterminal of the first ; production S -> alpha. ; EXAMPLE ; ; *productions* => ((E -> E T) (E -> id) (T -> id)) ; ; (create-augmenting-item) => (SP -> DOT E |,| $) ; ; ------------------------------------------------------------------------------ (defun create-augmenting-item() ; Assume the first symbol of the left hand side of the first production is ; the start symbol. `(SP -> DOT ,(first (first *productions*)) |,| $) ) ; ------------------------------------------------------------------------------ ; | find-grammar-symbols | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Create all the grammar symbols (terminal and nonterminal) by looking ; at the list of productions. ; ; CALLING SEQUENCE ; ; (find-grammar-symbols) ; ; Returns: List of grammar symbols. Note we don't include the endmarker, ; $ or the null string, EPSILON. ; ; EXAMPLE ; ; For productions S -> C C, S -> |c| C | d, ; ; (find-grammar-symbols) => (S C |c| |d|) ; ; ------------------------------------------------------------------------------ (defun find-grammar-symbols() (let ((symbols nil)) ; Scan through all productions, collecting all terminals and nonterminals. (dolist (production *productions*) (setq symbols (append symbols (removeItemFromList '-> production)))) ; Remove duplicated elements which occur later in the sequence. Remove any ; EPSILON's introduced by epsilon productions, A -> EPSILON. (removeItemFromList 'EPSILON (remove-duplicates symbols :from-end T))) ) ; ------------------------------------------------------------------------------ ; | item-to-production | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Change an item to a production by removing the dot and lookahead part. ; ; CALLING SEQUENCE ; ; (item-to-production item) ; ; item [A -> alpha . beta , gamma] ; ; Returns: [A -> alpha beta] ; ; EXAMPLE ; ; (item-to-production '(rat -> on DOT rye |,| tail)) => (RAT -> ON RYE) ; ; ------------------------------------------------------------------------------ (defun item-to-production( item ) (removeItemFromList 'DOT (getHeadOfListUpTo '|,| item ))) ; ------------------------------------------------------------------------------ ; | production-number | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Return the number of a production. ; ; CALLING SEQUENCE ; ; (production-number production) ; ; production Any production. ; *production* List of productions for the grammar. ; ; Returns: The number of the production in the list *productions*. ; The first production is numbered 1. Recall alternates ; of productions were split off into separate productions ; in the function load-input-and-initialize. ; EXAMPLE ; ; (production-number '(S -> C C)) => 1 ; ; ------------------------------------------------------------------------------ (defun production-number( production ) (1+ (positionInList production *productions*)) ) ; ============================================================================== ; | First Derived Symbol Utilities | ; ============================================================================== ; | | ; | NOTE: In this section we will be using the sample grammar, | ; | | ; | S -> A B | ; | A -> C a | EPSILON | ; | B -> b | ; | C -> c | EPSILON | ; | | ; | with terminal symbols a b c | ; | | ; ============================================================================== ; ------------------------------------------------------------------------------ ; | tag-symbol | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Convert a grammar symbol A to tagged form (A NIL). ; ; CALLING SEQUENCE ; ; (tag-symbol symbol) ; ; symbol A grammar symbol. ; ; Returns: The list, (symbol NIL). ; ; EXAMPLE ; ; (tag-symbol 'a) => (A NIL) ; ; ------------------------------------------------------------------------------ (defun tag-symbol( s ) `(,s NIL) ) ; ------------------------------------------------------------------------------ ; | flag-epsilon-free! | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Flag a tagged grammar symbol as coming from an epsilon-free derivation. ; ; CALLING SEQUENCE ; ; (flag-epsilon-free! tagged-symbol) ; ; tagged-symbol Tagged grammar symbol of the form (symbol tag). ; ; Returns: (symbol NIL) ; ; EXAMPLE ; ; (flag-epsilon-free! '(a nil)) => (A EPSILON-FREE) ; ; ------------------------------------------------------------------------------ (defun flag-epsilon-free!( s ) `(,(first s) epsilon-free) ) ; ------------------------------------------------------------------------------ ; | epsilon-free-only | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Return only tagged grammar symbols with epsilon-free derivations. ; ; CALLING SEQUENCE ; ; (epsilon-free-only list) ; ; list List of tagged symbols, ( (s1 tag1) ... ) ; ; Returns: Only those symbols in the list for which ; tag = 'epsilon-free ; EXAMPLE ; ; (epsilon-free-only '((a NIL) (b EPSILON-FREE) (c NIL))) ; => ((B EPSILON-FREE)) ; ; ------------------------------------------------------------------------------ (defun epsilon-free-only( l ) (cond ( (null l) nil ) ( (equal (second (first l)) ; Keep this symbol: It is epsilon-free. 'epsilon-free) (cons (first l) (epsilon-free-only (rest l)))) ( t (epsilon-free-only (rest l)))) ; Discard this symbol. ) ; ------------------------------------------------------------------------------ ; | untag-list | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Remove the tags from a list of tagged grammar symbols. ; ; CALLING SEQUENCE ; ; (untag-list list) ; ; list List of tagged grammar symbols, ; ((a NIL) (b EPSILON-FREE) ... ). ; ; Returns: Untagged symbols, (a b ...). ; ; EXAMPLE ; ; (untag-list '( (a NIL) (b EPSILON-FREE) (c nil))) => (A B C) ; ; ------------------------------------------------------------------------------ (defun untag-list( list ) (mapcar #'car list) ) ; ------------------------------------------------------------------------------ ; | flag-non-epsilon-free | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Flag a list of tagged symbols as being all not epsilon-free derived. ; ; CALLING SEQUENCE ; ; (flag-epsilon-free tagged-list) ; ; tagged-list Tagged list of grammar symbols, ( (s1 tag1) ... ) ; ; Returns: List with all tags set to NIL, ( (s1 NIL) ... ) ; ; EXAMPLE ; ; (flag-non-epsilon-free '((a epsilon-free) (b nil) (c epsilon-free))) => ; ((A NIL) (B NIL) (C NIL)) ; ; ------------------------------------------------------------------------------ (defun flag-non-epsilon-free( s ) ; Apply an anonymous function to all elements of the list. (mapcar #'(lambda (x) (cons (first x) '(NIL))) s) ) ; ------------------------------------------------------------------------------ ; | precedence | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Compare two tagged symbols and return the one of higher precedence. ; ; CALLING SEQUENCE ; ; (precedence s1 s2) ; ; s1, s2 Tagged grammar symbols of the form (s1 tag1), (s2 tag2) with ; tag1, tag2 = 'EPSILON-FREE or NIL. ; ; Returns (s1 tag1) if tag1 = EPSILON-FREE, (s2 tag2) otherwise. ; ; EXAMPLE ; ; (precedence '(a NIL) '(b EPSILON-FREE)) => (B EPSILON-FREE) ; (precedence '(a EPSILON-FREE) '(b EPSILON-FREE)) => (A EPSILON-FREE) ; ; ------------------------------------------------------------------------------ (defun precedence( s1 s2 ) (cond ((equal (second s1) 'epsilon-free) s1) ; Return the epsilon-free one. ( t s2)) ) ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Return the derived leading terminal of a production. ; e.g. the derived leading terminal a of the production X -> a Y ; ; CALLING SEQUENCE ; ; (derived-leading-terminal production) ; ; production [A -> a beta] ; ; Returns: a ; ; EXAMPLE ; ; (derived-leading-terminal '(A -> + B A)) => ; ; ------------------------------------------------------------------------------ (defun derived-leading-terminal( production ) (third production) ) ; ------------------------------------------------------------------------------ ; | first-terminals-of-rhs | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Find the ith approximation of the first derived terminals of the right ; hand side of a production. ; ; CALLING SEQUENCE ; ; (first-terminals-of-rhs rhs hash-table) ; ; rhs The right hand side Y1 ... Yn of the production. ; ; hash-table The current approximation to the FIRST() function for ; the non-terminals. FIRST of the terminals is already exact. ; ; Returns: The first derived terminals of the string Y1 ... Yn. We ; tag the epsilon-free first derived symbols. ; ; METHOD ; ; F ( X ) = F( Y1 ) + ... + F( Yn ) ; i 1 1 ; ; EXAMPLE ; ; Assume we have just called initial-first-derived-terminals, so that ; hash-table contains the zeroth approximation to FIRST. ; ; (first-terminals-of-rhs '(A B) hash-table) => ( (b NIL) ) ; because S => A B => EPSILON b => b, which is not in EFF(). ; ; ------------------------------------------------------------------------------ (defun first-terminals-of-rhs( rhs hash-table ) ; Compute FIRST( Y1 ) and FIRST( Y1 ) - EPSILON. (let* ((first-terms (gethash (first rhs) hash-table)) (first-terms-minus-epsilon (removeItemFromList '(EPSILON NIL) first-terms :equalityTest 'same-symbol?))) (cond ( (null rhs) NIL) ; If we have the case A -> alpha beta with FIRST( alpha ) = {}, we want to ; return {}. ( (null first-terms) nil ) ; If epsilon is in FIRST( Y1 ) , add all non-epsilon symbols in FIRST( Y1 ) ; to the first derived terminals in the rest of the list. Flag all these new ; symbols as epsilon-derived. If there are duplicated symbols, keep only ; the epsilon-free ones. ( (itemInList '(EPSILON NIL) first-terms :test 'same-symbol?) (combine first-terms-minus-epsilon (flag-non-epsilon-free (first-terminals-of-rhs (rest rhs) hash-table)) :test 'same-symbol? :precedence 'precedence)) ; Otherwise, Y1 has only non-epsilon terminals. Return FIRST( Y1 ). Whether ; these symbols are epsilon-free depends on their previous flags. ( t first-terms ))) ) ; ------------------------------------------------------------------------------ ; | update-first-derived-function | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Update the first derived terminals function to create a new version. ; ; CALLING SEQUENCE ; ; (update-first-derived-function hash-table update-hash-table) ; ; hash-table Old hash table. ; ; update-hash-table Changes to the old table. If an entry is NIL, ; it indicates no change is to be made to hash-table. ; ; Returns: Updated hash-table of the first derived terminals. ; ; EXAMPLE ; ; (update-first-derived-function) => Updated hash table. ; ; ------------------------------------------------------------------------------ (defun update-first-derived-function( hash-table update-hash-table ) ; Update only changes to nonterminals because the FIRST of a terminal symbol ; does not change. (dolist (symbol (find-grammar-symbols)) (if (nonterminal? symbol) (if (not (null (gethash symbol update-hash-table))) (setf (gethash symbol hash-table) (gethash symbol update-hash-table))))) ) ; ------------------------------------------------------------------------------ ; | initial-first-derived-terminals | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; The zeroth approximation to the first derived terminals function. ; It is exact for all terminals and epsilon. ; ; CALLING SEQUENCE ; ; (initial-first-derived-terminals hash-table :type type) ; ; *terminals* List of all the terminal symbols including the endmarker $ ; ; hash-table Empty, but allocated hash table. ; ; Returns: Hash table of the zeroth approximation to the first ; derived terminals function for every grammar symbol. ; Epsilon-free first derived symbols are tagged with the ; flag 'epsilon-free. ; ; EXAMPLE ; ; (setq F0 (make-hash-table :size 100)) ; (setq F0 (initial-first-derived-terminals F0)) ; (gethash 'A F0) => ( (EPSILON NIL) ) ; ; Grammar FIRST Grammar FIRST ; Symbol Symbol ; ------------------- ------------------- ; S NIL |a| ((|a| EPSILON-FREE)) ; A ((EPSILON NIL)) |b| ((|b| EPSILON-FREE)) ; B ((|b| EPSILON-FREE)) |c| ((|c| EPSILON-FREE)) ; C ((|c| EPSILON-FREE) EPSILON ((EPSILON NIL)) ; (EPSILON NIL)) $ (($ EPSILON-FREE)) ; ; |c| is flagged as being in EFF( C ). EPSILON is in FIRST( C ) but ; not in EFF( C ). ; ; ------------------------------------------------------------------------------ (defun initial-first-derived-terminals( hash-table ) (let ( (first-symbols nil) (nonterm nil) (new-symbol nil) ) ; FIRST( X ) = { (X 'epsilon-free) } if X is a terminal. (dolist (terminal *terminals*) (setf (gethash terminal hash-table) (list (flag-epsilon-free! (tag-symbol terminal))))) ; FIRST( EPSILON ) = { (EPSILON NIL) }. (setf (gethash 'EPSILON hash-table) (list (tag-symbol 'EPSILON))) ; Every nonterminal appears as the left hand side of some production. ; Thus we can scan through the productions to define FIRST( A ) for every ; nonterminal A. ; Compute the zeroth approximation to FIRST(). Look for a production of ; the form A -> a alpha, where a is a nonterminal. Find the entry for ; A in the table, and add a to it. ; We tag productions of the form A -> EPSILON as not being epsilon-free ; derivations. (dolist (production *productions*) (cond ( (or (derives-leading-terminal? production) (epsilon-production? production)) (setq nonterm (first production)) (setq first-symbols (gethash nonterm hash-table)) ; Get a or EPSILON. (setq new-symbol (tag-symbol (derived-leading-terminal production))) ; Flag a as an epsilon-free derivation, but EPSILON as not. (if (not (epsilon-production? production)) (setq new-symbol (flag-epsilon-free! new-symbol))) ; Add a to FIRST( A ). (setq first-symbols (insertItemIntoList new-symbol first-symbols)) (setf (gethash nonterm hash-table) first-symbols)))) hash-table) ) ; ------------------------------------------------------------------------------ ; | create-all-first-derived-terminals | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Create a hash table of all first derived terminals for every grammar ; symbol. ; ; CALLING SEQUENCE ; ; (create-all-first-derived-terminals) ; ; Returns: A hash table of the first derived terminals for every grammar ; symbol, including EPSILON and $. Flag the epsilon-free ; derived terminals. ; METHOD ; ; Successive approximation by transitive closure. ; ; EXAMPLE ; ; (setq h (create-all-first-derived-terminals)) => # ; ; (gethash 'S h) => ((|a| NIL) (|c| EPSILON-FREE) (|b| NIL)) ; i.e. FIRST( S ) = { |a| |b| |c| }, but EFF( S ) = { |c| } ; ; ; Grammar FIRST Grammar FIRST ; Symbol Symbol ; -------------------------------- -------------------------------- ; S ((|a| NIL) |a| ((|a| EPSILON-FREE)) ; (|b| NIL) |b| ((|a| EPSILON-FREE)) ; (|c| EPSILON-FREE)) |c| ((|c| EPSILON-FREE)) ; A ((EPSILON NIL) EPSILON ((EPSILON NIL)) ; (|a| NIL) $ (($ EPSILON-FREE)) ; (|c| EPSILON-FREE)) ; B ((|b| EPSILON-FREE)) ; C ((|c| EPSILON-FREE) ; (EPSILON NIL)) ; ; ------------------------------------------------------------------------------ (defun create-all-first-derived-terminals() ; Initialize the hash table. The size is extensible at run time. (let ( (hash-table (make-hash-table :size +initial-hash-table-size+)) (update-hash-table (make-hash-table :size +initial-hash-table-size+)) (nonterm nil) (new-first-symbols nil) (old-first-symbols nil) (change-flag T) ) ; Create the zeroth approximation to FIRST(), accurate for all terminals. (initial-first-derived-terminals hash-table) ; Loop until no more changes occur in the approximation to FIRST. (loop (setq change-flag nil) ; Compute FIRST[i+1](A) for all the nonterminals A. (dolist (production *productions*) ; Scan all productions A -> alpha (setq nonterm (first production)) ; A ; FIRST[i+1]( A ) = ; first terminal of( FIRST[i]( Y1 ) ... FIRST[Yn]) U FIRST[i]( A ). (setq old-first-symbols (gethash nonterm hash-table)) (setq new-first-symbols (combine old-first-symbols ; FIRST[i](A) (first-terminals-of-rhs (production-rhs! production) hash-table) :test 'same-symbol? :precedence 'precedence)) ; Record if any changes occurred, and save FIRST[i+1]( A ) in a separate ; update hash table. (cond ((not (equal-sets-of-items? new-first-symbols old-first-symbols)) (setq change-flag T) (setf (gethash nonterm update-hash-table) new-first-symbols)))) ; Add updates to the old hash table for FIRST[i]() to create FIRST[i+1](), ; then clear out the update hash table. (update-first-derived-function hash-table update-hash-table) (clrhash update-hash-table) (if (null change-flag) (return))) ; No more changes --- exit. ; Return the hash table of first derived terminals for every grammar symbol. hash-table) ) ; ------------------------------------------------------------------------------ ; | first-terminals-of-symbol | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Return a list of the first derived terminals of a grammar symbol. ; ; CALLING SEQUENCE ; ; (first-terminals-of-symbol s) ; ; s Any grammar symbol (or EPSILON or $). ; ; *first-derived-terminals* Hash table of the first derived terminals ; for every grammar symbol. ; ; *epsilon-free-first-derived-terminals* ; ; Hash table of the epsilon-free first ; derived terminals for every grammar symbol. ; ; type Defaults to NIL for computing FIRST() and ; equals 'epsilon-free for computing EFF(). ; ; Returns: List of first derived terminals of s. ; If type = 'epsilon-free, return the ; epsilon-free first derived terminals ; instead. ; ; Creates *first-derived-terminals* and ; *epsilon-free-first-derived-terminals* ; if they do not already exist. ; EXAMPLE ; ; (first-terminals-of-symbol 'S) ; => (|a| |c| |b|) ; (first-terminals-of-symbol 'S :type 'epsilon-free) ; => (|c|) ; ; ------------------------------------------------------------------------------ (defun first-terminals-of-symbol( symbol &key (type NIL) ) ; Create the hash tables for FIRST() and EFF() if they do not exist. (cond ( (or (null *first-derived-terminals*) (null *epsilon-free-first-derived-terminals*)) ; Sort out first derived terminals from epsilon-free first derived terminals. (setq *first-derived-terminals* (make-hash-table :size +initial-hash-table-size+)) (setq *epsilon-free-first-derived-terminals* (make-hash-table :size +initial-hash-table-size+)) (let ( (old-hash-table (create-all-first-derived-terminals)) ) (dolist (symbol (cons '$ (find-grammar-symbols))) (setf (gethash symbol *first-derived-terminals*) (untag-list (gethash symbol old-hash-table))) (setf (gethash symbol *epsilon-free-first-derived-terminals*) (untag-list (epsilon-free-only (gethash symbol old-hash-table)))))))) ; Return FIRST() or EFF() depending on the customer's request. (if (equal type 'epsilon-free) (gethash symbol *epsilon-free-first-derived-terminals*) (gethash symbol *first-derived-terminals*)) ) ; ------------------------------------------------------------------------------ ; | first-derived-terminals | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Return a list of the first derived terminals of a grammar string. ; ; CALLING SEQUENCE ; ; (first-derived-terminals string :type type) ; ; string A list of grammar symbols, X1 ... Xn ; ; type NIL by default. ; ; Returns: First derived terminals of the list, FIRST( X1 ... Xn ) ; if type = NIL, but EFF( X1 ... Xn ) if type = 'epsilon-free. ; ; METHOD ; ; FIRST( X1 ... Xn ) = FIRST( X1 ) + ... + FIRST( Xn ) ; 1 1 ; ; EFF( X1 ... Xn ) = EFF( X1 ) + FIRST( X2 ... Xn ) ; 1 ; ; EXAMPLE ; ; (first-derived-terminals '(S A)) => (|a| |c| |b|) ; (first-derived-terminals '(S A) :type 'epsilon-free) => (|c|) ; ; because FIRST( S ) = (|a| |b| |c|) and FIRST( A ) = (|a| |c| EPSILON) ; and |c| is the only terminal with a non-epsilon derivation, ; S => A B => C a b => c a b. |b|, for example has only the derivation ; S => A B => A b => EPSILON b = b, in which we must replace a leading ; non-terminal A with EPSILON. ; ; (first-derived-terminals '(A B)) => (|c| |a| |b|) ; because FIRST( B ) = { |b| } ; ; (first-derived-terminals '(A B)) => (|c|) ; ; ------------------------------------------------------------------------------ (defun first-derived-terminals( string &key (type NIL) ) ; We want FIRST( EPSILON ) = (EPSILON) and EFF( EPSILON ) = NIL. (cond ( (null string) (if (equal type 'epsilon-free) NIL (list 'EPSILON))) ; If EPSILON is in FIRST( Y1 ), add all non-epsilon terminals of FIRST( Y1 ). ; If we are computing EFF(), we do EFF( Y1 ) instead. ( (itemInList 'EPSILON (first-terminals-of-symbol (first string) :type type)) (union (removeItemFromList 'EPSILON (first-terminals-of-symbol (first string))) (first-derived-terminals (rest string)) :test #'equal ) ) ; Otherwise, return the non-epsilon symbols of FIRST( Y1 ) or of EFF( Y1 ). (t (first-terminals-of-symbol (first string) :type type))) ) ; ============================================================================== ; | Item functions: closure, goto, cores, etc. | ; ============================================================================== ; ------------------------------------------------------------------------------ ; | closure | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Return the closure of a list of items. ; ; CALLING SEQUENCE ; ; (closure set-of-items) ; ; *productions* Global list of productions. ; set-of-items ; ; Returns: For each item [A -> alpha . B beta , a] in the set, add ; [B -> . gamma , b] where (B -> gamma) is a production ; and b is the first derived symbol of the string "beta a". ; METHOD ; ; Intuitively, we saw alpha already and expect to see B next. ; That is, we expect to see any string of terminals gamma derived from B. ; The next symbol we expect is the lookahead, which is the first ; derived terminal symbol of the string beta a. ; ; EXAMPLE ; ; (closure '( (SP -> DOT S |,| $) ) ) => ; ; ( (SP -> DOT S |,| $ ) ; (S -> DOT C C |,| $ ) ; (C -> DOT |c| C |,| |c|) ; (C -> DOT |c| C |,| |d|) ; (C -> DOT |d| |,| |c|) ; (C -> DOT |d| |,| |d|) ) ; ; ; ------------------------------------------------------------------------------ (defun closure( item-list ) (let ((closed-item-list item-list) ; Closure of item-list. (item-num -1) ; nth item in item-list. (nonterm nil) ; B (first-syms nil) ; FIRST[ beta a ] (item nil) ; Current item in item-list. (new-item nil)) ; [B -> . gamma, b] (loop ; Loop over each item. (setq item-num (1+ item-num)) ; Advance to next item. (setq item (nth item-num closed-item-list)) ; Get current item, ; [A -> alpha . B beta , a] (if (null item) (return)) ; End of the list. (setq nonterm (symbol-after-dot! item)) ; Get B. (if (nonterminal? nonterm) ; B is nonterminal. (dolist (production *productions*) (cond ((valid-production? nonterm ; production = [B -> gamma] production) (setq first-syms ; FIRST[ beta a ] (first-derived-terminals `(,@(string-before-comma! item) ; Get beta. ,(lookahead-of! item)))) ; Get a. (dolist (lookahead first-syms) ; for each b in ; FIRST[ beta a ] (setq new-item ; [ B -> . gamma , b ] (make-item production lookahead)) (setq closed-item-list ; Add to end of list. (insertItemIntoList new-item closed-item-list)))))))) closed-item-list) ) ; ------------------------------------------------------------------------------ ; | compute-goto | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Compute the goto on a set-of-items and a grammar symbol. ; ; CALLING SEQUENCE ; ; (compute-goto set-of-items grammar-symbol) ; ; set-of-items Set of items I. ; ; grammar-symbol Grammar symbol X. ; ; Returns: Goto function GOTO( I, X ) defined as follows: ; ; For all items [A -> alpha . X beta , a] in I, ; add together the closures of [A -> alpha X . beta, a]. ; EXAMPLE ; ; (compute-goto '( (SP -> DOT s |,| $) ; ( S -> DOT c c |,| $) ; ( c -> |c| c |,| c) ; ( c -> |c| c |,| d) ; ( c -> d |,| c) ; ( c -> d |,| d)) ; ; 'S) => ( (SP -> S DOT |,| $)) ; ; ------------------------------------------------------------------------------ (defun compute-goto( set-of-items grammar-symbol ) (let ( (new-set nil) ) (dolist (item set-of-items) ; Examine each of the form [A -> alpha . X beta, a] (if (equal (symbol-after-dot! item) grammar-symbol) ; Add [A -> alpha X . beta , a ] if not already there. (setq new-set (insertItemIntoList (move-dot-right item) new-set)) ) ) ; Closure of the new list. (closure new-set) ) ) ; ============================================================================== ; | Goto Graph Functions | ; ============================================================================== ; ------------------------------------------------------------------------------ ; | create-new-node | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Create a new node in the goto graph given the data. ; ; CALLING SEQUENCE ; ; (create-new-node state hash-value set-of-items) ; ; hash-value Hash value of core of items. ; ; Returns: ; ; EXAMPLE ; ; (create-new-node 1 3668 '((SP -> S DOT |,| $)) 3648) ; => (1 3668 ((SP -> S DOT |,| $))) ; ; ------------------------------------------------------------------------------ (defun create-new-node( current-state hash-value-of-core-of-items list-of-items ) `(,current-state ,hash-value-of-core-of-items ,list-of-items) ) (defun create-new-link( current-state grammar-symbol next-state) `(,current-state ,grammar-symbol ,next-state) ) ; ------------------------------------------------------------------------------ ; | select-items! | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Select out the set of items in a node of the goto graph. ; ; CALLING SEQUENCE ; ; (select-items! node) ; ; node A node in the goto graph, (i2 i1 X SET-OF-ITEMS) ; ; Returns: SET-OF-ITEMS ; ; EXAMPLE ; ; (select-items! ; '( 0 ; 3668 ; ( ; (SP -> DOT S |,| $) ; ( S -> DOT S |a| S |b| |,| $) ; ) ; ) ; ) ; ; => ; ; ------------------------------------------------------------------------------ (defun select-items!( node ) (third node) ) (defun hash-value!( node ) (second node) ) (defun links!( goto-graph ) (first goto-graph) ) (defun nodes!( goto-graph ) (second goto-graph) ) (defun nth-node!( node-num goto-graph ) (nth node-num (nodes! goto-graph)) ) (defun first-node( goto-graph ) (first (nodes! goto-graph)) ) (defun rest-node( goto-graph ) (rest (nodes! goto-graph)) ) (defun insert-node( node goto-graph ) `( ,(links! goto-graph) ,(insertItemIntoList node (nodes! goto-graph)) ) ) (defun insert-link( link goto-graph ) `( ,(insertItemIntoList link (links! goto-graph)) ,(nodes! goto-graph) ) ) ; ------------------------------------------------------------------------------ ; | current-state! | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Get the number of a node in the goto graph. ; ; CALLING SEQUENCE ; ; (current-state! node) ; ; node A node in the goto graph, (i1 i2 X SET-OF-ITEMS)) ; ; Returns: i1 ; ; EXAMPLE ; ; (current-state! '(1 0 S ((SP -> S DOT |,| $)))) => 1 ; ; ----------------------------------------------------------------------------- (defun current-state!( node ) (first node) ) ; ------------------------------------------------------------------------------ ; | transition-symbol! | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Return the symbol upon which an action is performed. ; ; CALLING SEQUENCE ; ; (transition-symbol! node) ; ; node A node in the goto graph, (i1 i2 X SET-OF-ITEMS)) ; ; Returns: X ; ; EXAMPLE ; ; (transition-symbol! '(1 0 S ((SP -> S DOT |,| $)))) => S ; ; ------------------------------------------------------------------------------ (defun transition-symbol!( node ) (third node) ) ; ------------------------------------------------------------------------------ ; | set-of-items-in-graph? | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Find out if a goto graph contains a given a set of items (or ; their cores). ; ; CALLING SEQUENCE ; ; (set-of-items-in-graph? set-of-items goto-graph :compare-type type) ; ; set-of-items Any set of items. ; ; goto-graph The goto graph of the grammar. ; ; type Optional keyword. If omitted, it defaults to 'item. ; ; Returns T if any node in goto-graph has the same set of items ; (for type = 'item) or the same core (for type = 'core) ; as set-of-items. ; ; EXAMPLE ; ; (set-of-items-in-graph? ; '( (a -> b DOT c |,| d) (e -> f DOT g |,| h) ) ; '( ( (-1 nil 0) (0 a 1) ) ; ( (0 12 ( (a -> b DOT c |,| ddd) (e -> f DOT g |,| h) ) ) ; (1 23 ( (i -> j DOT |,| k) ) ) )) ; ) => nil ; ; but ; ; (set-of-items-in-graph? ; '( (a -> b DOT c |,| d) (e -> f DOT g |,| h) ) ; '( ( (-1 nil 0) (0 a 1) ) ; ( (0 12 ( (a -> b DOT c |,| d) (e -> f DOT g |,| h) ) ) ; (1 23 ( (i -> j DOT |,| k) ) ) )) ; :compare-type 'core) => T ; ; ; ------------------------------------------------------------------------------ (defun set-of-items-in-graph?( set-of-items goto-graph &key (compare-type 'item) ) (cond ( (null goto-graph) nil) ; goto graph = () ( (null (first-node goto-graph)) nil) ; goto graph = ( (...) () ) ( t ; Scan all nodes in the goto-graph, looking for one which has ; a matching item. (dolist (node (nodes! goto-graph)) (if (equal-sets-of-items? set-of-items (select-items! node) :compare-type compare-type) (return t) ) ) ; return nil by default ) ) ) ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Find out the current node (state) number of a node in the goto ; graph which contains the given set of items or their cores. ; ; CALLING SEQUENCE ; ; (node-number set-of-items goto-graph :compare-type compare-type) ; ; set-of-items Set of items to search for. ; ; goto-graph Goto graph of the grammar. ; ; compare-type If 'item, find identical set of items, but if 'core, ; find identical cores. Defaults to 'item. ; ; Returns: Number of the node in the goto graph containing the given ; set of items or core. ; ; EXAMPLE ; ; (node-number '(( S -> C C DOT |,| $)) *goto-graph*) => 5 ; (node-number '(( S -> C C DOT |,| |e|)) *goto-graph*) => NIL ; but ; (node-number '(( S -> C C DOT |,| |e|)) *goto-graph* ; :compare-type 'core) => 5 ; ; ------------------------------------------------------------------------------ (defun node-number( set-of-items goto-graph &key (compare-type 'item) ) (cond ( (null goto-graph) -1) ; goto graph = () ( (null (first-node goto-graph)) -1) ; goto graph = ( (...) () ) ( t ; Scan all nodes in the goto-graph, looking for one which has ; a matching item. (dolist (node (nodes! goto-graph)) (if (equal-sets-of-items? set-of-items (select-items! node) :compare-type compare-type) (return (current-state! node)) ) ) ; return nil by default ) ) ) ; Hash value of the core of an item. ; (core-hash-value-of-item '(S -> S DOT |a| S |b| |,| $)) => 1790 ; ; Sum up the integer value of all characters in each of the symbols. ; Multiply by the position of DOT in the item to distinguish items ; with the same symbols. ; (defun core-hash-value-of-item( item ) (let ( (hash-value 0) (symbol-position -1) (string-of-symbol "") (length-of-string 0) ) ; Hash the core of the item only. (dolist (s (core-of-item! item)) ; Symbol index in the item, starting with 0. (setq symbol-position (+ 1 symbol-position)) ; Convert symbol to string and get its length. (setq string-of-symbol (symbol-name s)) (setq length-of-string (length string-of-symbol)) ; Sum the integer values of each character in the symbol. (dotimes (i length-of-string) (setq hash-value (+ hash-value (char-int (char string-of-symbol i))) ) ) ; Multiply by the index position of the DOT symbol ; to distinguish between same items with dots in different ; locations such as ; [S -> a . b , c] and [S -> a b . , c] (if (equal s 'DOT) (setq hash-value (* hash-value symbol-position)) ) ) hash-value) ) ; Only the core matters: ; (core-hash-value-of-set-of-items ; '( (SP -> S DOT |,| $) ; ( S -> S DOT |a| S |b| |,| $) ; ( S -> S DOT |a| S |b| |,| |a|))) => 3542 ; ; (core-hash-value-of-set-of-items ; '( (SP -> S DOT |,| $) ; ( S -> S DOT |a| S |b| |,| $))) => 3542 ; ; (core-hash-value-of-set-of-items ; '( (SP -> S DOT |,| $))) => 1752 ; (defun core-hash-value-of-set-of-items( set-of-items ) (let ( (hashes-of-items (mapcar #'core-hash-value-of-item set-of-items)) (sum 0) ) ; Hash value on the entire set of items. ; Don't count duplicate items. (dolist (i (remove-duplicates hashes-of-items)) (setq sum (+ sum i)) ) ; Modulo to keep within size of an integer. (mod sum +hash-value-upper-limit+) ) ) ; ============================================================================== ; | LALR(1) Core Merging Functions ; ============================================================================== ; ; Partition ; ; merge-equivalence-classes( '(2 4) '() ) ; => ( (2 4) ) ; ; merge-equivalence-classes( '(2 4) '( (4 5) (6 7) ) ) ; => ( (2 4 5) (6 7) ) ; ; merge-equivalence-classes( '(2 4) '( (4 5) (2 7) (3 6) ) ) ; => ( (3 6) (2 4 5 7) ) ; (defun merge-equivalence-classes( equivalence partition ) (cond ; Dispose of trivial inputs. ( (null equivalence) partition) ( (= (length equivalence) 1) partition) ; Partition is empty. ( (null partition) (list equivalence) ) ; First set in the partition has common elements ; with the equivalence. ( (intersection equivalence (first partition) ) ; Sort the elements in the equivalence classes. (mapcar #'(lambda (x) (sort x #'(lambda (x y) (< x y)))) ; Remerge the other sets in the partition. (merge-equivalence-classes ; Merge the equivalence into the first set in the ; partition. (union equivalence (first partition) :test #'equal) (rest partition) ) ) ) (t ; Sort the elements in the equivalence classes. (mapcar #'(lambda (x) (sort x #'(lambda (x y) (< x y)))) ; Merge the other sets in the partition. (cons (first partition) (merge-equivalence-classes equivalence (rest partition)) ) ) ) ) ) ; If member of equiv. class in the partition, return the smallest ; equivalent element. (defun remap-equivalent( num partition ) (cond ( (null partition) num) ( (member num (first partition)) (caar partition) ) (t (remap-equivalent num (rest partition)) ) ) ) ; ------------------------------------------------------------------------------ ; | merge-lookaheads | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Collapse together two sets of items, eliminating duplicated items. ; ; CALLING SEQUENCE ; ; (merge-lookaheads set-of-items1 set-of-items2) ; ; set-of-items ; ; node A node in the goto graph with the same cores as in ; set-of-items. ; ; Returns: Updated node with the same core as before, but ; added lookaheads. ; ; METHOD ; ; Remove duplicates and use a set union to merge the lookaheads. ; ; EXAMPLE ; ; (merge-lookaheads '( (D -> E DOT F |,| |b|) ; (A -> B DOT C |,| |a|) ) ; ; '( (A -> B DOT C |,| |a|) ; (A -> B DOT C |,| |a|) ; (D -> E DOT F |,| |c|) )) => ; ; ( (D -> E DOT F |,| |b|) ; (A -> B DOT C |,| |a|) ; (D -> E DOT F |,| |c|) ) ; ; ------------------------------------------------------------------------------ (defun merge-lookaheads( set-of-items1 set-of-items2 ) ; Use the equal function to test for duplicates, since we are handling ; elements which are lists, not atoms. (union (remove-duplicates set-of-items1) (remove-duplicates set-of-items2) :test #'equal) ) ; ------------------------------------------------------------------------------ ; | merge-cores | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; If the given set of items has the same core as a node in the goto graph, ; merge it into the node. ; ; CALLING SEQUENCE ; ; merge-cores( goto-graph ) ; ; goto-graph Goto graph of the grammar. ; ; Returns: All sets of items with the same cores are merged, ; and states and links are renumbered. ; ; EXAMPLE ; ; ------------------------------------------------------------------------------ (defun merge-cores( goto-graph ) (let ( (nodes (nodes! goto-graph)) ; Get the nodes out (links (links! goto-graph)) ; and the links. (previous-node nil) (previous-state -1) (previous-hash-value -1) (equiv-class nil) (merged-goto-graph '(() ()) ) ; New merged goto graph. ) ; Sort the nodes on hash value to keep sets of items with the same ; cores adjacent. (setq nodes (sort nodes #'(lambda (x y) (< (hash-value! x) (hash-value! y))))) ; Scan through the nodes, looking for sets of items with the ; same cores. (dolist (node nodes) ; Current node and previous node have same cores. (cond ( (= (hash-value! node) previous-hash-value) (setq equiv-class (merge-equivalence-classes (list (current-state! node) previous-state) equiv-class ) ) ; Create a new merged node. (setq node (create-new-node ; Use the lowest numbered state ; for the new node number. (if (< (current-state! node) (current-state! previous-node)) (current-state! node) (current-state! previous-node) ) ; Hash value. (hash-value! node) ; Merge the cores in the items. (merge-lookaheads (select-items! node) (select-items! previous-node)) ) ) ) ; Current node differs, send off previous node. (t (if (not (null previous-node)) (setq merged-goto-graph (insert-node previous-node merged-goto-graph)) ) ) ) (setq previous-node node) (setq previous-hash-value (hash-value! node)) (setq previous-state (current-state! node)) ) ; Send off last node, merged or otherwise, in any case. (setq merged-goto-graph (insert-node previous-node merged-goto-graph)) ; Renumber states in the links. (dolist (link links) (setq link `( ,(remap-equivalent (first link) equiv-class) ,(second link) ,(remap-equivalent (third link) equiv-class) ) ) (setq merged-goto-graph (insert-link link merged-goto-graph)) ) merged-goto-graph ) ) ; ============================================================================== ; | LR(1) Action and Goto table utilities | ; ============================================================================== ; ------------------------------------------------------------------------------ ; | create-goto-graph | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Create a goto graph containing the sets of items for the grammar. ; ; CALLING SEQUENCE ; ; (create-goto-graph parser-type) ; ; parser-type The type of grammar: 'LR1 or 'LALR1 ; ; Returns: Goto graph of the grammar. ; ; METHOD ; ; We create a DFA which recognizes the viable prefixes of the grammar. ; ; The DFA is called the goto graph. Each node in the graph is of the ; form (i2 i1 X SET-OF-ITEMS). ; ; i1 = V( gamma ) = (set of all items valid for viable prefix gamma). ; i2 = V( gamma X ) ; X = a grammar symbol (but not EPSILON). ; ; An item [A -> alpha . beta] is valid for viable prefix gamma alpha if ; gamma A is also a viable prefix. ; ; The prefix gamma is viable if there is a rightmost derivation ; S =>* gamma w. ; ; The first state is I0 = V( EPSILON ) = { [S -> . S , $] }. ; ; We process nodes as follows: ; ; node-num ---> 0 ; ... ; node-num ---> 3 ----+----+ ; | a | ; 4 <---+ | b ; | ; 5 <---+----+ ; ... ; 3 ----+----+ ; | a | ; 4 <---+ | b ; | ; node-num ---> 5 <---+----+ ; ; ----------------------------------------------------------------------------- (defun create-goto-graph( parser-type ) (let* ( (goto-of-item nil) ; Set of items, GOTO( I, X ) (node nil) ; Node in graph. (node-num 0) ; Next node in goto graph to process. (new-node nil) ; New node in Goto graph. (new-link nil) (new-state-num 1) ; State number of the next node. (goto-graph '( () () ) ); Initial goto-graph. ) ; Our very first set of items I0 is the closure of [S' -> .S, $] (setq goto-of-item (closure (list (create-augmenting-item)))) ; The initial node has state 0, items as above, and hash value. (setq node (create-new-node 0 (core-hash-value-of-set-of-items goto-of-item) goto-of-item)) (setq new-link (create-new-link -1 nil 0)) ; Insert nodes and links into the goto graph. (setq goto-graph (insert-node node goto-graph)) (setq goto-graph (insert-link new-link goto-graph)) (loop ; Latest unprocessed node in the goto graph. Starting with I0. (setq node (nth-node! node-num goto-graph)) (if (null node) (return)) ; No more sets of items. ; For each grammar symbol X ... (dolist (grammar-symbol (find-grammar-symbols)) ; ...compute GOTO( I, X ), the new set of items. (setq goto-of-item (compute-goto (select-items! node) grammar-symbol)) ; Create a new node with set of items GOTO( I, X ), (setq new-node (create-new-node new-state-num (core-hash-value-of-set-of-items goto-of-item) goto-of-item ) ) ; GOTO( I, X ) is empty. (if (not (null goto-of-item)) (cond ( ; Our GOTO( I, X ) has computed the same sets of ; items. (set-of-items-in-graph? goto-of-item goto-graph) ; Insert a new link ; X ; I ---> (setq new-link (create-new-link node-num grammar-symbol (node-number goto-of-item goto-graph)) ) (setq goto-graph (insert-link new-link goto-graph)) ) ; Add a new node with a new set of items and ; a new link. ; Increment the current state number. (t (setq goto-graph (insert-node new-node goto-graph)) (setq new-link (create-new-link (current-state! node) grammar-symbol new-state-num)) (setq goto-graph (insert-link new-link goto-graph)) (setq new-state-num (1+ new-state-num)) ) ) ; end cond ) ; end if empty GOTO( I, X ) ) ; end dolist ; Bump up the node number. (setq node-num (1+ node-num )) ) ; end loop ; For LALR(1) languages, sort the goto graph on core hash value ; then merge states with the same cores. (if (equal parser-type 'LALR1) (setq goto-graph (merge-cores goto-graph)) ) goto-graph ) ; end let ) ; ------------------------------------------------------------------------------ ; | goto | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; The LR GOTO function derived from the goto graph. ; ; CALLING SEQUENCE ; ; (goto i A goto-graph) ; ; goto-graph The goto graph with entries of the form ; (i2 i1 X ) ; ; state The initial state i1. ; ; symbol The transition symbol X. ; ; Returns: The next state i2, or NIL if GOTO is undefined. ; ; EXAMPLE ; ; Suppose *goto-graph* = ( ( (6 |a| 4) ) (nodes) ) ; ; (goto 6 '|a| goto-graph) => 4 ; ; ------------------------------------------------------------------------------ (defun goto( state symbol goto-graph) (dolist (link (links! goto-graph)) (if (and (= state (first link)) (equal symbol (second link)) ) (return (third link)) ) ) ; Return nil by default. ) ; ------------------------------------------------------------------------------ ; | action-list! | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Return the list of actions in a line of the action table. ; ; CALLING SEQUENCE ; ; (action-list action-table-line) ; ; action-table-line One line of action table of the form ; ( (stateNumber) (listOfActions) ) ; ; Returns: (listOfActions) ; ; EXAMPLE ; ; (action-list! '( (0) ((|c| (S 3)) (|d| (S 4))))) ; => ((|c| (S 3)) (|d| (S 4))) ; ; ------------------------------------------------------------------------------ (defun action-list!( line-of-table ) (second line-of-table) ) ; ------------------------------------------------------------------------------ ; | action-line-state! | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Return the state of a line of the action table. ; ; CALLING SEQUENCE ; ; (action-line-state! action-table) ; ; action-table Table of the form ( (stateNum) (listOfActions) ) ; ; Returns: stateNum ; ; EXAMPLE ; ; (action-line-state! '( (0) ((|c| (S 3)) (|d| (S 4)) (DEFAULT (ERROR)))) ) ; => 0 ; ; ------------------------------------------------------------------------------ (defun action-line-state!( action-table-line ) (first (first action-table-line)) ) ; ------------------------------------------------------------------------------ ; | action-trigger-symbol! | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Return the transition symbol in an action pair. ; ; CALLING SEQUENCE ; ; (action-trigger-symbol! action-pair) ; ; action-pair An action/new-state pair of a line in the action table ; of the form (X (action i)). ; ; Returns: X ; ; EXAMPLE ; ; (action-trigger-symbol! '(|c| (S 3))) => |c| ; ; ------------------------------------------------------------------------------ (defun action-trigger-symbol!( action-pair ) (first action-pair) ) ; ------------------------------------------------------------------------------ ; | insert-action-or-goto-into-list | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Insert an action into a line of the action table. Check for conflicts. ; ; CALLING SEQUENCE ; ; (insert-action-or-goto-into-list symbol new-state list-of-actions action) ; ; symbol The transition symbol X. ; ; new-state The new state i. ; ; list-of-actions The action list part of one line of the action or ; goto table. ; ; action If 'NONE, update a goto list, else add this action ; to an action list. ; ; Returns: Augmented action list containing a new action pair ; (X (action i)) or a conflict pair ; (CONFLICT (X (action i)) (X (old-action j))) ; Similarly for a goto list. ; EXAMPLE ; ; (insert-action-or-goto-into-list 'a 666 '((b (s 3)) (c (r 2))) :action 's) ; =>((B (S 3)) (C (R 2)) (A (S 666))) ; ; (insert-action-or-goto-into-list 'b 666 '((b (s 3)) (c (r 2))) :action 's) ; => ((B (S 3)) (C (R 2)) (CONFLICT ((B (S 666)) (B (S 3))))) ; ; (insert-action-or-goto-into-list 'a 666 '((b 5) (c 6))) ; => ((B 5) (C 6) (A 666)) ; ; ------------------------------------------------------------------------------ (defun insert-action-or-goto-into-list( symbol new-state list-of-actions &key (action 'NONE) ) ; Nothing or only a default in the list. Insert a new action. (cond ( (or (null list-of-actions) (equal (first list-of-actions) '(default (error)))) (cons (if (equal action 'NONE) `(,symbol ,new-state) ; Insert a goto. `(,symbol (,action ,new-state))) ; Insert an action. list-of-actions)) ; Ignore duplicate actions. ((equal (first list-of-actions) (if (equal action 'NONE) `(,symbol ,new-state) ; Compare a goto. `(,symbol (,action ,new-state)))) ; Compare an action. list-of-actions) ; Return list unchanged. ; We have a conflict on the first action. Insert a conflict report at the ; end of the row, unless it is there already. ((equal symbol (action-trigger-symbol! (first list-of-actions))) (setq *conflicts* T) (insertItemIntoList (if (equal action 'NONE) `(conflict ((,symbol ,new-state) (,@(first list-of-actions)))) `(conflict ((,symbol (,action ,new-state)) (,@(first list-of-actions))))) list-of-actions)) ; No conflict yet --- try insertion in the rest of the list. ( t (cons (first list-of-actions) (insert-action-or-goto-into-list symbol new-state (rest list-of-actions) :action action)))) ) ; ------------------------------------------------------------------------------ ; | add-action-or-goto | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Add an action to the action table or a goto to the goto table. ; ; ; CALLING SEQUENCE ; ; (add-action-or-goto( state symbol new-state table action) ; ; state The current state i1. ; ; new-state The new state i2 ; ; table The action or goto table. ; ; action Defaults to 'NONE for the goto table, otherwise, the ; action to take (e.g. S, R, ACC, ERROR) ; ; Returns: The updated action or goto table. ; ; EXAMPLE ; ; Let table = ( ( (2) ( (a (s 5)) ; (b (r 2)) ; (default (error)))) ; ( (4) ( ($ (acc nil)) ; (default (error))))) ; ; To insert ACTION[ 2, b ] = (shift 6) into the table, call ; ; (add-action-or-goto 2 'b 6 table :action 's) => ; ; ( ( (2) ( (A (S 5)) ; (B (R 2)) ; (DEFAULT (ERROR)) ; (CONFLICT ((B (S 6)) (B (R 2)))))) ; ( (4) ( ($ (ACC NIL)) ; (DEFAULT (ERROR))))) ; ; We detect a shift/reduce conflict on symbol b and report it. ; ; On the other hand, ; ; (add-action-or-goto 2 'c 6 table :action 's) => ; ; ( ( (2) ( (A (S 5)) ; (B (R 2)) ; (C (S 6)) ; (DEFAULT (ERROR)))) ; ( (4) ( ($ (ACC NIL)) ; (DEFAULT (ERROR))))) ; ; Suppose we have a goto table, ; ; table = ( ( (0) ( (a 10) ; (b 20) ; (default (error)))) ; ( (4) ( (a 11) ; (default (error))))) ; ; To insert GOTO[ 0, c ] = 6 call ; ; (add-action-or-goto 0 'c 6 table) => ; ; ( ( (0) ( (A 10) ; (B 20) ; (C 6) ; (DEFAULT (ERROR)))) ; ( (4) ( (A 11) ; (DEFAULT (ERROR))))) ; ; ----------------------------------------------------------------------------- (defun add-action-or-goto( state symbol new-state table &key (action 'NONE)) ; The table has no entries. Create a new action table of the form ; ( (State) ( (TransitionSymbol (Action NewState)) (default (error)))) ; or Goto table of the form, ; ( (State) ( (TransitionSymbol (NewState)) (default (error)))). ; ; NOTE: ; We assume the Goto graph starts with state 0. ; Since we insert new states into the action table in order, ; the order will be maintained as we scan through the Goto graph. (cond ( (null table) `( ( (,state) ( ,(if (equal action 'NONE) `(,symbol ,new-state) ; goto table `(,symbol (,action ,new-state)) ) (default (error)) ) ) ) ) ; Found state in first line of table. Add the new action to ; this line. ( (= (action-line-state! (first table)) state) (cons (list (first (first table)) ; Get state of first line. (insert-action-or-goto-into-list symbol new-state (action-list! (first table)) :action action)) (rest table))) ; State is smaller than first line's state. Create a new line ; containing a new state, action and (default (error)) and add it ; before the first line. ( (< state (action-line-state! (first table))) (cons `( (,state) ( ,(if (equal action 'NONE) `(,symbol ,new-state) ; goto table `(,symbol (,action ,new-state)) ) (default (error)) ) ) table) ) ; State is bigger than the first line's state. Decide later. ( t (cons (first table) (add-action-or-goto state symbol new-state (rest table) :action action)))) ) ; ------------------------------------------------------------------------------ ; | build-action-table | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Build the ACTION table of a cannonical LR(1) parser. ; ; ; CALLING SEQUENCE ; ; (build-action-table goto-graph) ; ; goto-graph The goto graph generated by make-items. ; ; Returns: Action table of the grammar. ; ; METHOD ; ; We initially add the action (default (error)) to each line of the table. ; If we generate any shift-reduce or reduce-reduce conflicts, ; we record them in the action table and check them later. ; ; EXAMPLE ; ; ------------------------------------------------------------------------------ (defun build-action-table( goto-graph ) (let ( (action-table nil) (first-symbols-after-dot nil) ) (dolist (node (nodes! goto-graph)) ; Scan through every node in ; in the goto graph. (dolist (item (select-items! node)) (cond ( ; For the item [S' -> S. , $], ; ACTION[ i, $ ] = (accept). (is-accept? item) (setq action-table (add-action-or-goto (current-state! node) ; Current state i '$ ; Transition. 'nil ; No state. action-table :action 'acc) ; No state. ) ) ; For the item [A -> alpha . , b] ; ACTION[ i, b ] = (reduce k) ; where k is the number of the production A -> alpha ( (reduction? item) (setq action-table (add-action-or-goto (current-state! node) ; Current state i (lookahead-of! item) ; b. (production-number ; Production num. (item-to-production item)) action-table :action 'r) ) ) ; Prepare to add a possible shift. ( t (if *has-epsilon-productions* ; When the grammar has epsilon-productions, for the item ; [A -> alpha . beta , b] ; where beta is not equal to the null-string EPSILON, ; for all a in EFF( beta b ), we add ; ACTION[ i, a ] = (shift j) where j = GOTO( i , a ). (setq first-symbols-after-dot (if (reduction? item) ; beta = epsilon nil (first-derived-terminals `(,(symbol-after-dot! item) ,@(string-before-comma! item) ,(lookahead-of! item)) :type 'epsilon-free))) ; For a grammar with no epsilon productions, for the item ; [A -> alpha . a beta , b] ; where a is a terminal, we add ; ACTION[ i, a ] = (shift j) where j = GOTO( i , a ). (setq first-symbols-after-dot (if (terminal-after-dot? item) (list (symbol-after-dot! item)) nil)) ) ; Add a shift, if any. (dolist (term first-symbols-after-dot) (setq action-table (add-action-or-goto (current-state! node) ; Current state i term ; Terminal a. (goto ; into state j. (current-state! node) term goto-graph) action-table :action 's))))))) ; Do a shift. action-table) ) ; ------------------------------------------------------------------------------ ; | build-goto-table | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Build the GOTO table for an LR(1) parser. ; ; ; CALLING SEQUENCE ; ; (build-goto-table) ; ; goto-graph The goto graph. ; ; Returns: The goto table. ; ; ; METHOD ; ; Whenever we have a link i which has a transition ; on a nonterminal A to the link j, we fill the table with ; GOTO( i, A ) = j. ; ; EXAMPLE ; ; ------------------------------------------------------------------------------ (defun build-goto-table( goto-graph ) (let ((goto-table nil)) (dolist (link (links! goto-graph)) (if (and (> (first link) -1) (nonterminal? (second link))) (setq goto-table (add-action-or-goto (first link) (second link) (third link) goto-table) ) ) ) goto-table) ) ; ============================================================================== ; | Input and Output Functions | ; ============================================================================== ; ------------------------------------------------------------------------------ ; | write-header | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Write a header for the parse tables file. ; ; CALLING SEQUENCE ; ; (write-header fp parser-type) ; ; fp Pointer to the currently open file. ; ; parser-type 'LR1 or 'LALR1. The title will be adjusted ; automatically based on the parser type. ; ; Returns: Header text written to file. ; ; EXAMPLE ; ; ------------------------------------------------------------------------------ (defun write-header( fp parser-type ) (format fp "~A~%~A~%~A~%~A~%~A~%~A~%~A~%~A~%~%" ";---------------------" (if (equal parser-type 'LR1) "; LR(1) parse tables" "; LALR(1) parse tables") ";---------------------" ";" "; Suitable for input to the Common Lisp program " ";" "; LR(1)AndLALR(1)Parser.lsp" ";" ) ) ; ------------------------------------------------------------------------------ ; | write-terminals | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Write a header for the parse tables file. ; ; CALLING SEQUENCE ; ; (write-terminals fp) ; ; fp Pointer to the currently open file. ; ; Returns: Terminal symbols written to file. ; ; EXAMPLE ; ; ------------------------------------------------------------------------------ (defun write-terminals( fp terminals ) (format fp "~A~%~A~%~%" "; TERMINALS" ";" ) (format fp "~S~%~%" terminals) (fresh-line fp) (fresh-line fp) ) ; ------------------------------------------------------------------------------ ; | write-productions | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Write the split up productions with their numbers. ; ; CALLING SEQUENCE ; ; (write-productions fp productions) ; ; fp Pointer to the currently open file. ; ; productions List of productions to write. ; ; Returns: Neat list of numbered productions. (These are ; the ones expanded from the alternates.) ; ; EXAMPLE ; ; See the files lalrparser.dat and parser.dat for examples. ; ; ------------------------------------------------------------------------------ (defun write-productions( fp productions ) (format fp "~A~%~A~%~A~%~A~%~%~A~%" "; PRODUCTIONS" ";" "; Productions are numbered starting with 1." "; All alternates were expanded into separate productions." "(" ) (dolist (production productions) ; Print each production. (format fp "~A~D~A~S~A~%" " ( " `(,(production-number production)) " " production " )" ) ) (format fp "~A~%~%" ")" ) ) (defun construct-error-messages( action-table ) (let ( (error-messages nil) (transition-symbols nil) ) ; Scan through each line of the action table. (dolist (action-line action-table) (setq transition-symbols nil) ; Scan through the actions in each line. (dolist (action (action-list! action-line)) ; Found an error state; add message to the list. (if (equal (second action) '(ERROR)) (push `( (,(action-line-state! action-line)) (,(concatenate 'string "error - expecting one of the symbols " (string-trim "(" (string-trim ")" (write-to-string transition-symbols)))))) error-messages) ; else keep collecting transition symbols. (setq transition-symbols (cons (first action) transition-symbols)) ) ) ) (reverse error-messages)) ) ; ------------------------------------------------------------------------------ ; | write-error-message-table | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Write the error message table with templates for the user to fill in. ; ; CALLING SEQUENCE ; ; (write-error-message-table fp action-table) ; ; fp Pointer to the currently open file. ; ; action-table ; ; Returns: Error message table. ; ; EXAMPLE ; ; ; ------------------------------------------------------------------------------ (defun write-error-message-table( fp action-table ) (format fp "~A~%~%~%" " ; ERROR MESSAGE TABLE ; ; If the action table has an error state, the other non-error ; actions show which symbol was failed to appear next on the input. ; ; The user can modify these minimal error messages. " ) ; Opening parenthesis. (format fp "(~%~%") ; Iterate over error states. (dolist (error-message (construct-error-messages action-table)) (format fp " ~S ~%" error-message) ) ; Closing parenthesis. (format fp ")~%~%") ) ; ------------------------------------------------------------------------------ ; | write-goto-graph | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Write the formatted goto graph to a file. ; ; CALLING SEQUENCE ; ; (write-goto-graph fp goto-graph) ; ; fp Pointer to (open) file which is to contain the graph. ; ; goto-graph Goto graph itself, which will be pretty-printed. ; ; ; EXAMPLE ; ; (write-goto-graph fp " *goto-graph*) ; => ... see the sample output files parser.dat and lalrparser.dat. ; ; ------------------------------------------------------------------------------ (defun write-goto-graph( fp goto-graph ) ; Write the title first and the opening parenthesis. (format fp "~A~%~%" "; GOTO GRAPH ; ; Not needed for the parser, but here for reference and debugging. ; ********** ; Goto graph of the LR(1) or LALR(1) grammar of the form ; ; ( ; ( <-- List of links. ; (6 |a| 4) <-- Transition in Goto graph from state 6 to ; state 4 on symbol a. ; (1 |a| 2) <-- Transition from state 1 to state 2 on a. ; ) ; ; ( <-- List of sets of items. ; ( 0 <-- State number 0. ; 3668 <-- Hash value of core. ; ( ; (SP -> DOT S |,| $) ----+ ; ( S -> DOT S |a| S |b| |,| $) | ; ( S -> DOT EPSILON |,| $) +---- Set of items for state 0 ; ( S -> DOT S |a| S |b| |,| |a|) | ; ( S -> DOT EPSILON |,| |a|) | ; ) ----+ ; ) " ) ; Opening parenthesis of graph. (format fp "(~%") ; Opening parenthesis of links. (format fp "~3,4@T(~%") ; Print each link. (dolist (link (links! goto-graph)) (format fp "~3,8@T(~D ~S ~D)~%" (first link) (second link) (third link)) ) ; Closing parenthesis of links. (format fp "~3,4@T)~%") ; Opening parenthesis of nodes. (format fp "~3,4@T(~%") ; Print each node in the graph. (dolist (node (nodes! goto-graph)) ; Print open paren of node, state and hash value. (format fp "~3,8@T(~D~%~3,8@T~D~%" (current-state! node) (hash-value! node)) ; Print out each item. (dolist (item (select-items! node)) (format fp "~3,12@T~S~%" item)) ; Closing paren of node. (format fp "~3,8@T)~%") ) ; Closing parenthesis of nodes. (format fp "~3,4@T)~%") ; Closing parenthesis of graph. (format fp ")~%~%") ) ; ------------------------------------------------------------------------------ ; | write-action-or-goto-table | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Write the formatted action or goto table to a file. ; ; CALLING SEQUENCE ; ; (write-action-or-goto-table fp table) ; ; fp Pointer to (open) file which is to contain the ; action-table. ; ; table Action or goto table itself, which will be ; pretty-printed. ; ; ; EXAMPLE ; ; (write-action-or-goto-table fp " *action-table) ; => ... see the sample output files parser.dat and lalrparser.dat. ; ; ------------------------------------------------------------------------------ (defun write-action-or-goto-table( fp table &key (table-type 'ACTION)) ; Write the title first and the opening parenthesis. (format fp "~A~%~A~%~A~%~A~%~A~%~%~A~%" (cond ( (equal table-type 'ACTION) "; ACTION TABLE") ( (equal table-type 'GOTO) "; GOTO TABLE" )) ";" "; (state" "; (item)" "; ..." "(" ) ; Print actions for each state. (dolist (state table) ; Print the opening paren of the table and the state ; number in parentheses. (format fp "~3,4@T( (~D) ~%" (action-line-state! state) ) ; Print the word NIL explicitly if the list of items is empty. (if (null (action-list! state)) (format fp "~3,8@TNIL~%") ; Print out the list of actions. (progn ; Print first paren of action list. (format fp "~3,8@T(~%") ; Print actions. (dolist (item (action-list! state)) (format fp "~3,12@T~S~%" item)) ; Print first paren of action list. (format fp "~3,8@T)~%") ) ) (format fp "~3,4@T)~%") ) ; Closing parenthesis. (format fp ")~%~%") ) ; ------------------------------------------------------------------------------ ; | print-legal-notice | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Write legal notice when the program starts up. ; ; CALLING SEQUENCE ; ; Returns: Legal notice to standard output. ; ; EXAMPLE ; ; ------------------------------------------------------------------------------ (defun print-legal-notice() ; Print a few newlines, the notice and a few more newlines. (format t "~%~%~A~%~%" " LR(1)AndLALR(1)ParserGenerator Version 5.6 An LR(1) and LALR(1) Parser Generator written in Common Lisp. Copyright (C) 1989-2024 by Sean Erik O'Connor. All Rights Reserved. This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see . The author's address is seanerikoconnor!AT!gmail!DOT!com with the !DOT! replaced by . and the !AT! replaced by @" ) ) ; ------------------------------------------------------------------------------ ; ; NAME ; ; load-input-and-initialize ; ; DESCRIPTION ; ; Load the grammar from file. Initialize global variables. ; ; CALLING SEQUENCE ; ; (load-input-and-initialize filename) ; ; filename Name of the file containing the productions and terminals for ; the grammar. ; ; Returns: *terminals* After this is read from file, we add the extra ; terminal $ (the language's right endmarker). ; ; *productions* After reading the list of productions, ; ; [A -> alpha | beta ...] ; ; from file, we split up the alternates to ; generate the set of productions ; ; [A -> alpha], [A -> beta], ... ; ; *first-derived-terminals* ; *epsilon-free-first-derived-terminals* ; ; Set to NIL. ; ; *has-epsilon-productions* ; ; Set to NIL unless we have epsilon ; productions of the ; form A -> EPSILON. ; ; *conflicts* Set to NIL. ; ; *goto-graph*, ; *action-table* ; *goto-table* Set to NIL just for the hell of it. ; ; EXAMPLE ; ; (load-input-and-initialize "grammar.dat") ; ; *productions* => ((S -> S |a| S |b|) (S -> EPSILON) ; *terminals* => (|a| |b| $) ; *first-derived-terminals* => NIL ; *epsilon-free-first-derived-terminals* => NIL ; *conflicts* => NIL ; *has-epsilon-productions* => NIL ; ; ------------------------------------------------------------------------------ (defun load-input-and-initialize( grammar-file ) ; Better safe than sorry. (setq *goto-graph* nil) (setq *action-table* nil) (setq *goto-table* nil) (setq *conflicts* nil) (setq *first-derived-terminals* nil) (setq *epsilon-free-first-derived-terminals* nil) ; Split up productions and add the endmarker to the list of terminals. (let ( (fp (open grammar-file :direction :input)) ) (setq *productions* (read fp)) (setq *terminals* (read fp)) ; Add the endmarker to the list of terminals. (setq *terminals* (append *terminals* '($))) ; Split up productions (so we don't handle alternates directly) (setq *productions* (split-up-productions *productions*)) ; Detect epsilon productions. (setq *has-epsilon-productions* nil) (dolist (production *productions*) (if (epsilon-production? production) (setq *has-epsilon-productions* T))) (close fp)) ) ; ------------------------------------------------------------------------------ ; | compile-all | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Compile all the functions in this program, except compile-all itself. ; ; CALLING SEQUENCE ; ; (compile-all) ; ; EXAMPLE ; ; (compile-all) => ; ; ;;; Compiling function LOAD-INPUT-AND-INITIALIZE...tail-merging... ; assembling...emitting...done. ; ; --- and so on, with pauses for garbage collection --- ; ; ;;; Compiling function PARSER-GENERATOR...assembling...emitting...done ; NIL ; ; ------------------------------------------------------------------------------ (defun compile-all() ; Tell the compiler the following variables are global (have dynamic binding). (proclaim '(special *productions*)) (proclaim '(special *has-epsilon-productions*)) (proclaim '(special *terminals*)) (proclaim '(special *first-derived-terminals*)) (proclaim '(special *epsilon-free-first-derived-terminals*)) (proclaim '(special *goto-graph*)) (proclaim '(special *action-table*)) (proclaim '(special *goto-table*)) (proclaim '(special *conflicts*)) (let ( (functions-to-compile '( print-legal-notice load-input-and-initialize getHeadOfListUpTo removeItemFromList positionInList insertItemIntoList combine itemInList terminal? nonterminal? derives-leading-terminal? derives-leading-nonterminal? valid-production? set-of-items-in-graph? reduction? is-accept? terminal-after-dot? equal-sets-of-items? contained-in-item? element-of-item? epsilon-production? same-symbol? first-alternate! all-but-first-alternate! production-rhs! symbol-after-dot! string-before-comma! lookahead-of! select-items! hash-value! current-state! action-list! transition-symbol! action-line-state! action-trigger-symbol! core-of-item! core-hash-value-of-item core-hash-value-of-set-of-items merge-lookaheads merge-cores split-up-production split-up-productions make-item move-dot-right create-augmenting-item find-grammar-symbols create-new-node create-new-link node-number item-to-production production-number tag-symbol flag-epsilon-free! epsilon-free-only untag-list flag-non-epsilon-free precedence derived-leading-terminal initial-first-derived-terminals first-terminals-of-rhs update-first-derived-function create-all-first-derived-terminals first-terminals-of-symbol first-derived-terminals add-action-or-goto insert-action-or-goto-into-list goto closure compute-goto create-goto-graph build-action-table build-goto-table write-header write-terminals write-productions write-goto-graph write-action-or-goto-table write-error-message-table construct-error-messages parser-generator file-exists? base-path! test-parser-generator))) ; Compile all the functions, except compile-all itself. (dolist (function-to-compile functions-to-compile) (compile function-to-compile))) ) ; ============================================================================== ; | Main Program | ; ============================================================================== ; ------------------------------------------------------------------------------ ; | parser-generator | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Main program which produces the LR(1) and LALR(1) parsing tables. ; ; CALLING SEQUENCE ; ; (parser-generator in-file out-file :parser-type parser-type) ; ; in-file Productions and terminals for the grammar. See ; the file grammar.dat for an example. ; ; out-file The numbered productions, goto graph, action and ; parsing tables for the grammar. See the files ; lalrparser.dat and parser.dat for examples. ; ; parser-type 'LR1 or 'LALR1 parsing. The default is 'LALR1. ; ; Returns: A string indicating if any conflicts have occurred. ; ; EXAMPLE ; ; (parser-generator "grammar.dat" "parser.dat" :parser-type 'lr1) ; => NIL and the file parser.dat ; ; (parser-generator "grammar.dat" "lalrparser.dat" :parser-type 'lalr1) ; => NIL and the file lalrparser.dat ; ; (parser-generator "grammar.dat" "lalrparser.dat") ; => same as above ; ; (parser-generator "grammar4.dat" "junk" :parser-type 'lalr1) ; => "Conflicts were detected" and the file junk. ; ; ------------------------------------------------------------------------------ (defun parser-generator( in-file out-file &key (parser-type 'LALR1) ) ; Keep my lawyer happy. (print-legal-notice) ; Read in the grammar file productions and terminals. (load-input-and-initialize in-file) (let ( (fp (open out-file :direction :output :if-exists :supersede)) ) ; Compute the goto graph for the grammar. (setq *goto-graph* (create-goto-graph parser-type)) ; Construct the action and goto parsing tables. (setq *action-table* (build-action-table *goto-graph*)) (setq *goto-table* (build-goto-table *goto-graph*)) ; Write out the terminals and productions for reference. (write-header fp parser-type) (write-terminals fp *terminals*) (write-productions fp *productions*) ; Write out the goto graph. (write-goto-graph fp *goto-graph*) ; Write out the action and goto parse tables. (write-action-or-goto-table fp *action-table* :table-type 'ACTION) (write-action-or-goto-table fp *goto-table* :table-type 'GOTO) ; Write out the error message template. (write-error-message-table fp *action-table*) (close fp) (if *conflicts* "Conflicts were detected") ) ) ; ------------------------------------------------------------------------------ ; | print-file-to-console | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; List the lines of a file to the console. ; ; CALLING SEQUENCE ; ; (print-file-to-console filename) ; ; filename Name of the file. ; ; Returns: ; ; EXAMPLE ; ; (print-file-to-console "grammar.dat") ; => ; GrammarE=E+T_T.dat ; --------------------------------------------------------------------------- ; ; A grammar of arithmetic expressions, ; ; E -> E + T | T ; ... ; ; ------------------------------------------------------------------------------ (defun print-file-to-console( file-name ) (format t "~%~%=========================== ~A =============================~%~%~%" file-name) (with-open-file (stream file-name) (do ( (line (read-line stream nil) ; nil inhibits throw at eof (read-line stream nil) ) ; and read-line returns nil at eof ) ( (null line) ) ; Terminate at eof (format t "~A~%" line) ) ) ) (defun component-present-p (value) (and value (not (eql value :unspecific)))) (defun directory-pathname-p (p) (and (not (component-present-p (pathname-name p))) (not (component-present-p (pathname-type p))) p)) (defun pathname-as-directory (name) (let ((pathname (pathname name))) (when (wild-pathname-p pathname) (error "Can't reliably convert wild pathnames.")) (if (not (directory-pathname-p name)) (make-pathname :directory (append (or (pathname-directory pathname) (list :relative)) (list (file-namestring pathname))) :name nil :type nil :defaults pathname) pathname))) ; ------------------------------------------------------------------------------ ; | file-exists? | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Portable way to check if a file or directory exists. ; ; CALLING SEQUENCE ; ; (file-exists? directory-or-file) ; ; directory-or-file Pathname for directory or file ; Returns: t if it is there, nil if not. ; ; ; EXAMPLES ; ; (file-exists? "/NotThere") => nil ; (file-exists? "/Volumes/seanoconnor") => t ; ; ------------------------------------------------------------------------------ (defun file-exists? (pathname) "Check if the file exists" #+(or sbcl lispworks openmcl) (probe-file pathname) #+(or allegro cmu) (or (probe-file (pathname-as-directory pathname)) (probe-file pathname)) #+clisp (or (ignore-errors (probe-file (pathname-as-file pathname))) (ignore-errors (let ((directory-form (pathname-as-directory pathname))) (when (ext:probe-directory directory-form) directory-form)))) #-(or sbcl cmu lispworks openmcl allegro clisp) (error "file-exists-p not implemented") ) ; ------------------------------------------------------------------------------ ; | base-path! | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Try to find out where the base directory for the web page is located. ; ; CALLING SEQUENCE ; ; (base-path!) ; ; Returns: String of base path or nil if it can't find it. ; ; ; EXAMPLES ; ; (base-path!) => "C:/Sean/WebSite" ; Got it. ; (base-path!) => nil ; Could't find it. ; ; ------------------------------------------------------------------------------ (defun base-path!() (let ( (possible-directories-list '( "/cygdrive/c/Sean/WebSite" ; Windows / Cygwin "/Users/seanoconnor/Desktop/Sean/WebSite" ; Mac OS "/home/seanoconnor/Desktop/Sean/WebSite" ; Ubuntu Linux ))) (dolist (base-path possible-directories-list) ; (format t "base path = ~S exists = ~S~%" base-path (file-exists? base-path) ) (if (file-exists? base-path) (return (concatenate 'string base-path "/")))) ) ) ; ------------------------------------------------------------------------------ ; | test-parser-generator | ; ------------------------------------------------------------------------------ ; ; DESCRIPTION ; ; Run the parser generator on a test grammar and produce test parsing ; tables. ; ; CALLING SEQUENCE ; ; (test-parser-generator) ; ; Set the files and paths to your requirements. I'm assuming you've ; installed cygwin if you're on a Windows machine. ; ; ------------------------------------------------------------------------------ (defun test-parser-generator() ; Compile all the functions for speed. (compile-all) ; Garbage collect. (gc) ; Generate a set of parse tables from a test grammar, both LR(1) and ; LALR(1). (let* ( ; Set up the base directory paths. (base-path (base-path!)) (sub-path "ComputerScience/Compiler/ParserGeneratorAndParser/") (grammar-path "Grammars/" ) (parse-table-path "ParseTables/") ; List the grammar files (input) and parse table files (output). (grammar-file '( "GrammarS=SaSbEPSILON.dat" "GrammarE=E+T_T.dat" "GrammarPoly.dat" "GrammarLR(1)NotLALR(1).dat" "GrammarNotLR(1)NotLALR(1).dat") ) (parse-file-LR1 '( "ParseTablesLR(1)_S=SaSbEPSILON.dat" "ParseTablesLR(1)_E=E+T_T.dat" "ParseTablesLR(1)_Poly.dat" "ParseTablesLR(1)_NotLALR(1).dat" "ParseTablesLR(1)_NotLR(1)NotLALR(1).dat") ) (parse-file-LALR1 '( "ParseTablesLALR(1)_S=SaSbEPSILON.dat" "ParseTablesLALR(1)_E=E+T_T.dat" "ParseTablesLALR(1)_Poly.dat" "ParseTablesLALR(1)_NotLALR(1).dat" "ParseTablesLALR(1)_NotLR(1)NotLALR(1).dat") ) ) (dotimes (i (length grammar-file)) (let* ( ; Create the full file path. (full-grammar-file (concatenate 'string base-path sub-path grammar-path (nth i grammar-file)) ) (full-parse-file-LR1 (concatenate 'string base-path sub-path parse-table-path (nth i parse-file-LR1)) ) (full-parse-file-LALR1 (concatenate 'string base-path sub-path parse-table-path (nth i parse-file-LALR1)) ) ) ; Call the parser generator to generate parse tables for ; both LR(1) and LALR(1). (parser-generator full-grammar-file full-parse-file-LR1 :parser-type 'LR1) (parser-generator full-grammar-file full-parse-file-LALR1) ; Display the results to the console. (print-file-to-console full-grammar-file) (print-file-to-console full-parse-file-LR1) (print-file-to-console full-parse-file-LALR1) ) ) ) )