1 #|------------------------------------------------------------------------------ 2 | 3 | NAME 4 | 5 | LR(1)AndLALR(1)ParserGenerator.lsp 6 | 7 | 8 | DESCRIPTION 9 | 10 | LR parser generator which produces goto graphs and action and goto tables 11 | for both LR(1) and LALR(1) grammars. 12 | 13 | It gives the same parsing tables (and conflicts) as UNIX's yacc 14 | compiler-compiler, except that some states may be numbered in a different 15 | order. 16 | 17 | 18 | CALLING SEQUENCE 19 | 20 | Once you are in a Common Lisp interpreter, load this file using 21 | your path: 22 | 23 | (load "LR(1)AndLALR(1)ParserGenerator.lsp") 24 | 25 | For an LR(1) grammar, type 26 | 27 | (parser-generator "grammar.dat" "parser.dat" :parser-type 'LR1) 28 | 29 | For an LALR(1) grammar, type 30 | 31 | (parser-generator "grammar.dat" "parser.dat") 32 | or 33 | (parser-generator "grammar.dat" "parser.dat" :parser-type 'LALR1) 34 | 35 | Parser-generator prints the warning message "Conflicts were detected" to 36 | the console if any shift-reduce or reduce-reduce conflicts occur. 37 | 38 | For testing, you can also call 39 | 40 | (test-parser-generator) 41 | 42 | but you need to modify this function to your taste by setting the 43 | file paths. 44 | 45 | Online documentation when you're in the lisp interpreter is given by the 46 | standard documentation functions; for example, 47 | 48 | (apropos 'getHead) 49 | => GETHEAD 50 | GETHEADOFLISTUPTO (fbound) 51 | (describe 'getHeadOfListUpTo) 52 | COMMON-LISP-USER::GETHEADOFLISTUPTO 53 | [symbol] 54 | 55 | GETHEADOFLISTUPTO names a compiled function: 56 | Lambda-list: (ITEM LIST) 57 | Derived type: (FUNCTION (T T) (VALUES LIST &OPTIONAL)) 58 | Documentation: 59 | ------------------------------------------------------------------------------- 60 | | 61 | | DESCRIPTION 62 | | 63 | | Return the list from the beginning up to but not including a given item, 64 | | or the whole list if the item wasn't found. 65 | ... 66 | |------------------------------------------------------------------------------- 67 | Source file: /Users/seanoconnor/ParserGeneratorAndParser/SourceCode/ParserGenerator/LR(1)AndLALR(1)ParserGenerator.lsp 68 | 69 | (describe '*productions*) 70 | COMMON-LISP-USER::*PRODUCTIONS* 71 | [symbol] 72 | 73 | *PRODUCTIONS* names a special variable: 74 | Value: (((1) (S -> POLY MOD)) ((2) (MOD -> COMMA INTEGER)) 75 | ((3) (MOD -> EPSILON)) ((4) (POLY -> POLY + TERM)) 76 | ((5) (POLY -> TERM)) ((6) (TERM -> MULTIPLIER POWER)) 77 | ((7) (MULTIPLIER -> INTEGER)) ((8) (MULTIPLIER -> EPSILON)) 78 | ((9) (POWER -> X)) ((10) (POWER -> X ^ INTEGER)) 79 | ((11) (POWER -> EPSILON))) 80 | Documentation: 81 | List of productions of the unaugmented grammar. 82 | 83 | 84 | INPUT FILES: 85 | 86 | grammar.dat A list of the productions of the grammar followed by 87 | a list of terminal symbols. The file grammar.dat 88 | shows an example. Epsilon productions are allowed. 89 | 90 | We assume the start symbol is the one which begins the first production 91 | listed in grammar.dat. 92 | 93 | Don't include $ (the right endmarker) in the list of terminals. It is 94 | added automatically by the program. 95 | 96 | 97 | 98 | OUTPUT FILES: 99 | 100 | parser.dat A numbered list of productions, followed by the LR(1) 101 | or LALR(1) goto graph (i.e. set of items) of the 102 | grammar and the action and goto tables. See the files 103 | parser.dat and lalrparser.dat for examples. 104 | 105 | The LALR(1) tables are the same as the ones in the y.output file 106 | generated by UNIX's yacc compiler-compiler running with the -v 107 | option. The only difference is that some states may be numbered in 108 | a different order. 109 | 110 | Shift-reduce or reduce-reduce conflicts are inserted into the action 111 | and goto tables at the end of the line for the state in which they 112 | occur. 113 | 114 | You can feed the action and goto tables to my Common Lisp LR parser 115 | program "parser.lisp". The goto graph indicates the state of the 116 | parse, just as in yacc's output, and can help to define the parsing 117 | error messages. 118 | 119 | 120 | AUTHOR 121 | 122 | Sean E. O'Connor 01 Jun 1989 Version 1.0 123 | 11 Mar 2008 Version 5.6 released. 124 | 125 | LEGAL 126 | 127 | LR(1)AndLALR(1)ParserGenerator Version 5.6 128 | An LR(1) and LALR(1) Parser Generator written in Common Lisp. 129 | 130 | Copyright (C) 1989-2024 by Sean Erik O'Connor. All Rights Reserved. 131 | 132 | This program is free software: you can redistribute it and/or modify 133 | it under the terms of the GNU General Public License as published by 134 | the Free Software Foundation, either version 3 of the License, or 135 | (at your option) any later version. 136 | 137 | This program is distributed in the hope that it will be useful, 138 | but WITHOUT ANY WARRANTY; without even the implied warranty of 139 | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 140 | GNU General Public License for more details. 141 | 142 | You should have received a copy of the GNU General Public License 143 | along with this program. If not, see <http://www.gnu.org/licenses/>. 144 | 145 | The author's address is seanerikoconnor!AT!gmail!DOT!com 146 | with the !DOT! replaced by . and the !AT! replaced by @ 147 | 148 | 149 | METHOD 150 | 151 | This is a Common Lisp implementation and will run under CLISP. 152 | The software design is layered, the simpler list manipulation 153 | utilities coming first, building up gradually to the specialized 154 | and higher level parser functions. I've put lots of examples to 155 | ease the pain. 156 | 157 | To construct the LR(1) goto graph (i.e. set of items) we use Algorithm 158 | 4.9 of [Aho 86, pg. 231-232]. To create the cannonical LR(1) parsing 159 | action and goto tables, algorithm 4.10 [Aho 86, pg. 234] is used. 160 | 161 | To construct the LALR(1) parsing tables, we use the much simpler 162 | algorithm of [Aho 74, pg. 115] instead of algorithm 4.11 in [Aho 86, 163 | pgs. 238-239]. 164 | 165 | For computing FIRST (first derived terminals) we use algorithm 5.5 of 166 | [Aho 72, pgs. 357-359]. 167 | 168 | The function EFF (epsilon-free first derived terminals) is described 169 | in [Aho 72, pg. 381]. We base the algorithm used in the function 170 | first-terminals-of-symbol on exercise 5.2.19 [Aho 72, pg. 398]. The 171 | modifications to algorithm 5.5 to make it compute EFF are my own and 172 | are described in my notes. 173 | 174 | In the first version of this program, we used the algorithm for FIRST 175 | of [Aho 86, pgs. 188-189]. But this algorithm does not always 176 | terminate! In particular, it fails for the grammar, 177 | 178 | S -> A S | b 179 | A -> S A | a 180 | 181 | of example 4.33 [Aho 86, pg. 272] by getting into the following 182 | infinite loop: FIRST( S ) = FIRST( A ) = FIRST( S ) ... The algorithm 183 | we use always terminates. 184 | 185 | 186 | REFERENCES 187 | 188 | See http://www.seanerikoconnor.freeservers.com for a review of the 189 | parsing theory behind this program. 190 | 191 | 192 | [Aho 86] COMPILERS: PRINCIPLES, TECHNIQUES, AND TOOLS, 193 | Alfred V. Aho, Ravi Sethi, and Jeffrey D. Ullman, 194 | Addison-Wesley, 1986. 195 | 196 | [Aho 74] "LR Parsing", Alfred V. Aho and Stephen C. Johnson, 197 | Computing Surveys, Vol. 6, No. 2, June 1974, pg. 99-124. 198 | 199 | [Aho 72] THE THEORY OF PARSING, TRANSLATION AND COMPILING, VOLUME 1: 200 | PARSING, Alfred V. Aho and Jeffrey D. Ullman, Prentice-Hall, 201 | 1972. 202 | 203 | BUGS 204 | 205 | Have the output look like the file y.output generated by yacc -v, or 206 | eyacc -v. 207 | 208 | NOTES 209 | 210 | In Common Lisp, functions and variables don't share the same namespace. 211 | So you need to tell LISP that the variable func is actually a function by using funcall. 212 | 213 | (defun apply-func (func arg1 arg2) (funcall func arg1 arg2)) 214 | 215 | So you can't do the simpler expression (func arg1 arg2). 216 | 217 | It goes the other way too. You can't say (apply-func + 1 2) because + denotes a variable. 218 | You have to tell LISP it's a function, 219 | 220 | (apply-func #'+ 1 2) 221 | 222 | which is a shorthand for 223 | 224 | (apply-func (function +) 1 2) 225 | 226 | first and car are synonyms as are rest and cdr 227 | (null '()) => T 228 | (null nil) => T 229 | (null 'a) => NIL 230 | 231 | Optional arguments using keywords example: 232 | 233 | (defun doggie-looks( dog &key (nose-color 'red) (hair-color 'white)) 234 | (list 'my dog 'has 'a nose-color 'nose 'and hair-color 'hair)) 235 | 236 | * (doggie-looks 'husky) => (MY HUSKY HAS A RED NOSE AND WHITE HAIR) 237 | * (doggie-looks 'husky :nose-color 'black) => (MY HUSKY HAS A BLACK NOSE AND WHITE HAIR) 238 | 239 +-------------------------------------------------------------------------------|# 240 241 242 ; ============================================================================== 243 ; | Constants | 244 ; ============================================================================== 245 246 ; Use the naming convention +variable-name+ to denote constants. 247 248 (defconstant +initial-hash-table-size+ 100 249 "------------------------------------------------------------------------------- 250 | Initial hash table length. Don't worry, lisp hash tables are extensible 251 | at run time. 252 --------------------------------------------------------------------------------" 253 ) 254 255 256 257 (defconstant +hash-value-upper-limit+ 65536 258 "------------------------------------------------------------------------------- 259 | Upper limit on hash value on core of items. 260 |-------------------------------------------------------------------------------" 261 ) 262 263 264 265 ; ============================================================================== 266 ; | Dynamically Bound (i.e. Global) Variables | 267 ; ============================================================================== 268 269 ; Use the naming convention *variable-name* to denote them as global. 270 271 (defvar *productions* nil 272 "------------------------------------------------------------------------------- 273 | List of productions of the unaugmented grammar (without S' -> S). 274 | e.g. ( (S -> S |a| S |b| / EPSILON) ) 275 | which represents S -> S a S b, S -> EPSILON 276 |-------------------------------------------------------------------------------" 277 ) 278 279 280 281 (defvar *has-epsilon-productions* nil 282 "------------------------------------------------------------------------------- 283 | T if we have any epsilon productions of the form A -> EPSILON, but NIL otherwise. 284 --------------------------------------------------------------------------------" 285 ) 286 287 288 289 (defvar *terminals* nil 290 "------------------------------------------------------------------------------- 291 | List of terminal symbols for the grammar. e.g. ( |a| |b| ) 292 |-------------------------------------------------------------------------------" 293 ) 294 295 296 297 (defvar *first-derived-terminals* nil 298 "------------------------------------------------------------------------------- 299 | Hash table containing the first derived terminals for each grammar symbol. 300 |-------------------------------------------------------------------------------" 301 ) 302 303 304 305 (defvar *epsilon-free-first-derived-terminals* nil 306 "------------------------------------------------------------------------------- 307 | Hash table containing the epsilon-free first derived terminals for each grammar symbol. 308 |-------------------------------------------------------------------------------" 309 ) 310 311 312 313 (defvar *goto-graph* nil 314 "--------------------------------------------------------------------------------------------------------------- 315 | Goto graph of the LR(1) or LALR(1) grammar of the form 316 | 317 | ( 318 | ( ------+ 319 | (6 |a| 4) <-- Transition in Goto graph from | 320 | state 6 to state 4 on symbol a. +------ List of graph edges and transitions. 321 | (1 |a| 2) <-- Transition from state 1 to state 2 | 322 | on a. | 323 | ) ------+ 324 | 325 | ) --------+ 326 | ( 0 <-- State number 0. | 327 | 3668 <-- Hash value of core of items. | 328 | ( | 329 | (SP -> DOT S |,| $) ----+ | 330 | ( S -> DOT S |a| S |b| |,| $) | | 331 | ( S -> DOT EPSILON |,| $) +---- Set of items for state 0. | 332 | ( S -> DOT S |a| S |b| |,| |a|) | | 333 | ( S -> DOT EPSILON |,| |a|) | | 334 | ) ----+ | 335 | ) +-- List of sets of items. 336 | | 337 | ( 2 <-- State number 2. | 338 | 5168 <-- Hash values of core of items. | 339 | ( | 340 | (S -> S |a| DOT S |b| |,| $) ----+ | 341 | (S -> S |a| DOT S |b| |,| |a|) | | 342 | (S -> DOT S |a| S |b| |,| |b|) | | 343 | (S -> DOT EPSILON |,| |b|) +-- Set of items for state 2. | 344 | (S -> DOT S |a| S |b| |,| |a|) | | 345 | (S -> DOT EPSILON |,| |a|) ----+ | 346 | ) | 347 | ) | 348 | ) --------+ 349 | ) 350 |--------------------------------------------------------------------------------------------------------------" 351 ) 352 353 354 355 (defvar *action-table* nil 356 "------------------------------------------------------------------------------- 357 | Action table of the form, 358 | 359 | ( 360 | ( (0) <-- state number 361 | ( 362 | ($ (R 2)) <-- reduce action on end of input $ 363 | (|a| (R 2)) <-- reduce action on symbol a. 364 | (DEFAULT (ERROR)) <-- otherwise must be error 365 | ) 366 | ) 367 | 368 | ( (1) <-- next line of action table. 369 | ( 370 | ($ (ACC NIL)) <-- accept action on end of input $ 371 | (|a| (S 2)) <-- shift action on symbol a. 372 | (DEFAULT (ERROR)) 373 | ) 374 | ) 375 | ) 376 |-------------------------------------------------------------------------------" 377 ) 378 379 380 381 (defvar *goto-table* nil 382 "------------------------------------------------------------------------------- 383 | Goto table of the form, 384 | 385 | ( 386 | ( (0) <-- state number 387 | ( 388 | (S 1) <-- transition to state 1 on symbol S 389 | (DEFAULT (ERROR)) <-- otherwise error 390 | ) 391 | ) 392 | 393 | ( (2) 394 | ( 395 | (S 3) 396 | (DEFAULT (ERROR)) 397 | ) 398 | ) 399 | ) 400 |-------------------------------------------------------------------------------" 401 ) 402 403 404 405 (defvar *conflicts* nil 406 "------------------------------------------------------------------------------- 407 | Set to true if we have any shift-reduce or reduce-reduce conflicts. 408 |-------------------------------------------------------------------------------" 409 ) 410 411 412 413 ; ============================================================================== 414 ; | General Purpose List Processing Primitives | 415 ; ============================================================================== 416 417 (defun getHeadOfListUpTo( item list ) 418 419 "------------------------------------------------------------------------------- 420 | 421 | DESCRIPTION 422 | 423 | Return the list from the beginning up to but not including a given item, 424 | or the whole list if the item wasn't found. 425 | 426 | CALLING SEQUENCE 427 | 428 | (getHeadOfListUpTo item list) 429 | => New list of all symbols before the item. 430 | 431 | EXAMPLE 432 | 433 | (getHeadOfListUpTo 'rat '(you are a rat fink)) => (YOU ARE A) 434 | (getHeadOfListUpTo 'cat '(you are a rat fink)) => (YOU ARE A RAT FINK) 435 | (getHeadOfListUpTo 'rat '(rat) ) => nil 436 | (getHeadOfListUpTo 'rat nil ) => nil 437 | 438 |-------------------------------------------------------------------------------" 439 440 (cond ( (null list) nil) ; Empty list. 441 ( (equal (first list) item) nil) ; List = (item). Return (). 442 443 ; Recurse. 444 ( (cons (first list) 445 (getHeadOfListUpTo item (rest list))))) 446 ) 447 448 449 450 451 (defun removeItemFromList( item list &key (equalityTest #'equal) ) 452 453 "------------------------------------------------------------------------------- 454 | 455 | DESCRIPTION 456 | 457 | Remove all occurences of a given item from a list. Test item equality 458 | with a function. 459 | 460 | CALLING SEQUENCE 461 | 462 | (removeItemFromList item list :equalityTest testFunction) 463 | => New list with all occurrences of symbol taken out. 464 | 465 | testFunction The name of the function which tests if two symbols are 466 | equal. It should be a function of two arguments which 467 | returns T if the symbols are equal and NIL otherwise. 468 | It defaults to #'equal. 469 | 470 | EXAMPLE 471 | 472 | (removeItemFromList '(rat bad) '( (cat good) (rat good))) 473 | => ( (CAT GOOD) (RAT GOOD) ) 474 | 475 | (defun sameAnimal( s1 s2 ) (equal (first s1) (first s2))) 476 | (funcall #'sameAnimal '(rat good) '(rat bad)) => T 477 | 478 | (removeItemFromList '(rat bad) '( (cat good) (rat good)) 479 | :equalityTest #'sameAnimal) 480 | => ( (CAT GOOD) ) 481 | 482 +-------------------------------------------------------------------------------" 483 484 (cond ( (null list) nil) ; Nothing in the list. 485 486 ( (funcall equalityTest ; First item matches. 487 item (first list)) ; according to equality 488 ; test. 489 490 (removeItemFromList item (rest list) ; Discard it and remove 491 ; all other 492 :equalityTest equalityTest)) ; items too. 493 494 ( t (cons (first list) ; First item does not match. 495 496 (removeItemFromList item ; Add it back and remove the 497 (rest list) ; remaining items. 498 :equalityTest equalityTest)))) 499 ) 500 501 502 503 (defun itemInList( element list &key (test #'equal) ) 504 505 "------------------------------------------------------------------------------- 506 | 507 | DESCRIPTION 508 | 509 | Find out if an atom or a list is a member of a given list. Test for 510 | equality with a function. 511 | 512 | CALLING SEQUENCE 513 | 514 | (itemInList item list :equalityTest testFunc) 515 | => T if item is in list; NIL if not. 516 | 517 | testFunc The name of the function which tests if two symbols are 518 | equal. It should be a function of two arguments which 519 | returns T if the symbols are equal and NIL otherwise. 520 | test defaults #'equal. 521 | 522 | EXAMPLE 523 | 524 | (itemInList '(hot dog) '((cool cat) (cool dog)) ) => NIL 525 | 526 | (defun sameAnimal( s1 s2 ) (equal (first s1) (first s2))) 527 | 528 | (itemInList '(hot dog) '((cool cat) (cool dog)) 529 | :equalityTest #'sameAnimal) => T 530 | 531 +---------------------------------------------------------------------------------" 532 533 (cond ( (null list) nil) ; Not in the list. 534 535 ( (funcall test element (first list)) ; First item matches. 536 537 t) 538 539 ( t (itemInList element (rest list) ; Try again on rest of list. 540 :test test))) 541 ) 542 543 544 545 (defun positionInList( item list ) 546 547 "------------------------------------------------------------------------------- 548 | 549 | DESCRIPTION 550 | 551 | Find the position of an item in a list. 552 | 553 | CALLING SEQUENCE 554 | 555 | (positionInList item list) 556 | 557 | item Atom or list to be found. 558 | 559 | list Any list. 560 | 561 | Returns: The position of item in the list or NIL if it is not there. 562 | The first position is zero. 563 | 564 | EXAMPLE 565 | 566 | (positionInList '(winter mute) '(I am (winter mute))) => 2 567 | (positionInList 'ratfinn '(Who you ? ratfink ?)) => NIL 568 | 569 +---------------------------------------------------------------------------------" 570 571 (cond ( (null list) nil ) ; Nothing in the list. 572 573 ( (equal item (first list)) 0) ; list = (item ...), return 574 ; position = 0. 575 576 ; If the item is in the rest of the list, find its position in the 577 ; rest of the list, then add 1 to fix up the count. 578 579 ( (itemInList item (rest list)) 580 (1+ (positionInList item (rest list)))) 581 582 ( t nil )) ; Item was not found --- 583 ; return NIL. 584 ) 585 586 587 588 (defun insertItemIntoList( item L &key (test #'equal) (precedence nil) ) 589 590 "------------------------------------------------------------------------------ 591 | 592 | DESCRIPTION 593 | 594 | If an object isn't already in the list, add it to the end. If it is, 595 | overwrite it (see below). 596 | 597 | CALLING SEQUENCE 598 | 599 | (insertItemIntoList item L :test test :precedence precedence) 600 | 601 | item An atom or list. 602 | 603 | test The test to perform to see if an item is is in the list. 604 | It is the name of a function with two arguments which should 605 | return T if its arguments are equal and NIL if they aren't. 606 | The test function defaults to #'equal if omitted. 607 | 608 | precedence The test function to perform to say which object has the 609 | higher precedence when both are equal. The one of higher 610 | precedence is kept. An item of higher precedence overwrites 611 | its lower precedence brother in the list. The function should 612 | be of the form (precedence x y), returning the object of 613 | higher precedence. Defaults to NIL (Don't care). 614 | 615 | L List of non-duplicated elements (according to equality test 616 | specified above). 617 | 618 | Returns: Unchanged list if item is already in it. Otherwise, returns 619 | the list L with the item in the last position. 620 | 621 | EXAMPLE 622 | 623 | (insertItemIntoList '(rat good) '( (rat bad) (bat good) ) ) 624 | => ((RAT BAD) (BAT GOOD) (RAT GOOD)) 625 | We compared for exact equality, so the new item gets inserted. 626 | 627 | 628 | (defun sameAnimal( s1 s2 ) (equal (first s1) (first s2))) 629 | 630 | (insertItemIntoList '(rat good) 631 | '((rat bad) (bat good)) 632 | :test #'sameAnimal) 633 | => ((RAT BAD) (BAT GOOD)) 634 | Rats are already in the list, so don't add the item. 635 | 636 | (defun good-always-wins( x y ) (cond ((equal (second x) 'good) x) (t y))) 637 | 638 | (insertItemIntoList '(rat good) '( (rat bad) (bat good) ) 639 | :test #'sameAnimal 640 | :precedence 'good-always-wins) => 641 | => ((RAT GOOD) (BAT GOOD)) 642 | Rats are already in the list, but we now compare equal items further 643 | to see which have higher precedence. 644 | 645 |------------------------------------------------------------------------------" 646 647 (cond ( (null L) (list item)) ; Nothing there. Add the item. 648 649 ( (funcall test item (first L)) ; Item is already in the list. 650 651 ; Of the two equal objects --- item and the first element in the list --- 652 ; keep the one of higher precedence. 653 654 (if (not (null precedence)) 655 656 (cons (funcall precedence item (first L)) (rest L)) 657 658 L)) ; Don't care about precedence, so 659 ; keep the original list. 660 661 ( t (cons (first L) 662 (insertItemIntoList item 663 (rest L) 664 :test test 665 :precedence precedence)))) 666 ) 667 668 669 670 (defun combine( list1 list2 &key (test #'equal) (precedence nil) ) 671 672 "------------------------------------------------------------------------------ 673 | 674 | DESCRIPTION 675 | 676 | Take the union of two lists. We can do a generalized test for 677 | equality of elements. Also, if two elements are equal, we can 678 | keep the one of higher precedence. 679 | 680 | CALLING SEQUENCE 681 | 682 | (combine list1 list2 :test test :precedence precedence) 683 | 684 | list1 Arbitrary lists. 685 | 686 | list2 687 | 688 | item An atom or list. 689 | 690 | test The test to perform to see if an item is is in the list. 691 | It is the name of a function with two arguments which should 692 | return T if its arguments are equal and NIL if they aren't. 693 | The test function defaults to #'equal if omitted. 694 | 695 | precedence The test function to perform to say which object has the 696 | higher precedence when both are equal. The one of higher 697 | precedence is kept. An item of higher precedence overwrites 698 | its lower precedence brother in the list. The function should 699 | be of the form (precedence x y), returning the object of higher 700 | precedence. precedence defaults to NIL (Don't care). 701 | 702 | Returns: The set theoretic union of the two lists, except that we 703 | always keep the element of highest precedence when two 704 | elements are the same. 705 | 706 | EXAMPLE 707 | 708 | (combine '((rat good) (rat awful)) '((rat bad) (bat good))) 709 | => ((RAT AWFUL) (RAT GOOD) (RAT BAD) (BAT GOOD)) 710 | 711 | (defun sameAnimal( s1 s2 ) (equal (first s1) (first s2))) 712 | 713 | (combine '((rat good) (rat awful)) '((rat bad) (bat good)) 714 | :test #'sameAnimal) 715 | => ((RAT AWFUL) (BAT GOOD)) 716 | 717 | (defun good-always-wins( x y ) (cond ((equal (second x) 'good) x) (t y))) 718 | 719 | (combine '((rat good) (rat awful)) '((rat bad) (bat good)) 720 | :test #'sameAnimal 721 | :precedence #'good-always-wins) => ((RAT GOOD) (BAT GOOD)) 722 | 723 |------------------------------------------------------------------------------" 724 725 ; Successively add elements from both lists to nil, eliminating 726 ; duplicated or low precedence items. If both lists are nil, dolist 727 ; does not loop, and we return nil. 728 729 (let ( (new-list nil) ) 730 731 (dolist (item (union list1 list2 :test #'equal)) 732 733 (setq new-list (insertItemIntoList item new-list 734 :test test :precedence precedence))) 735 new-list) 736 ) 737 738 739 740 741 ; ------------------------------------------------------------------------------ 742 ; | core-of-item! | 743 ; ------------------------------------------------------------------------------ 744 ; 745 ; DESCRIPTION 746 ; 747 ; Get the core of an item: the item but without the lookahead. 748 ; 749 ; CALLING SEQUENCE 750 ; 751 ; (core-of-item! item) 752 ; 753 ; item [A -> alpha . beta , gamma ] 754 ; 755 ; Returns: [A -> alpha . beta] 756 ; 757 ; EXAMPLE 758 ; 759 ; (core-of-item! '(sandwich -> bread meat DOT bread |,| knife)) 760 ; => (SANDWICH -> BREAD MEAT DOT BREAD) 761 ; 762 ; ------------------------------------------------------------------------------ 763 764 (defun core-of-item!( item ) 765 766 (getHeadOfListUpTo '|,| item) 767 768 ) 769 770 771 772 773 ; ------------------------------------------------------------------------------ 774 ; | element-of-item? | 775 ; ------------------------------------------------------------------------------ 776 ; 777 ; DESCRIPTION 778 ; 779 ; Find out if an item or its core is in a set of items. 780 ; 781 ; CALLING SEQUENCE 782 ; 783 ; (element-of-item item set-of-items compare-type) 784 ; 785 ; item [A -> alpha . beta , gamma] 786 ; 787 ; set-of-items ... [A' -> alpha' . beta' , gamma'] ... 788 ; 789 ; compare-type Whether to compare the whole item or only its core. 790 ; 791 ; Returns: if compare-type = 'core then T if A = A', alpha = 792 ; alpha', beta = beta'. gamma need not equal gamma'. 793 ; if compare-type = item then T if in addition, 794 ; gamma = gamma', and NIL otherwise. 795 ; 796 ; EXAMPLE 797 ; 798 ; (element-of-item? '(eat -> living death |,| scum) 799 ; '( (eat -> hot fudge |,| scum) 800 ; (eat -> living death |,| wimp)) 801 ; 'core) => T 802 ; 803 ; But (element-of-item? . . . 'item) => NIL 804 ; 805 ; ------------------------------------------------------------------------------ 806 807 (defun element-of-item?( item set-of-items compare-type ) 808 809 (cond ((null set-of-items) nil) ; No items, no match. 810 811 ( (if (equal compare-type 'core) 812 813 (equal (core-of-item! item) ; Core of first item 814 (core-of-item! (first set-of-items))) ; was found. 815 816 (equal item (first set-of-items))) ; First items match. 817 818 T) ; First item is in set 819 820 ( t (element-of-item? item ; Continue to search. 821 (rest set-of-items) 822 compare-type))) 823 ) 824 825 826 827 828 ; ------------------------------------------------------------------------------ 829 ; | contained-in-item? | 830 ; ------------------------------------------------------------------------------ 831 ; 832 ; DESCRIPTION 833 ; 834 ; Find if the first set of items is contained in the second 835 ; set of items. Alternatively, find out if the cores of the 836 ; first set of items are contained in the cores of the second set. 837 ; 838 ; CALLING SEQUENCE 839 ; 840 ; (contained-in-item? set-of-items1 set-of-items2 compare-type) 841 ; 842 ; set-of-items1 First and ... 843 ; 844 ; set-of-items2 ... second sets of of items. 845 ; 846 ; compare-type Type of comparison: 'item or 'core. 847 ; 848 ; Returns: T if compare-type = 'item and the first set of items 849 ; is contained in the second set of items. 850 ; T if compare-type = 'core and the cores of the first 851 ; set of items are contained in the cores of the second 852 ; set of items. 853 ; EXAMPLE 854 ; 855 ; (contained-in-item? '( (a -> b DOT c |,| x) 856 ; (e -> f DOT |,| g)) 857 ; 858 ; '( (a -> b DOT c |,| h) 859 ; (e -> f DOT |,| i) 860 ; (f -> g DOT h i |,| j) ) 861 ; 'core ) => T 862 ; 863 ; However, (contained-in-item? . . . 'item) => NIL 864 ; 865 ; ------------------------------------------------------------------------------ 866 867 (defun contained-in-item?( set-of-items1 set-of-items2 compare-type ) 868 869 (cond ((null set-of-items1) T) ; Null set is contained in 870 ; every set. 871 872 ((element-of-item? (first set-of-items1) ; First item (or its core) 873 set-of-items2 ; is in the second set. 874 compare-type) 875 876 (if (null (rest set-of-items1)) ; No other elements in first set 877 878 T 879 880 (contained-in-item? (rest set-of-items1) ; Are the remaining 881 set-of-items2 ; items of the first 882 compare-type))) ; set in the second? 883 884 ( t nil )) ; First element of first set isn't in 885 ; the second set: first set can't be 886 ; contained in second set. 887 ) 888 889 890 891 892 ; ------------------------------------------------------------------------------ 893 ; | equal-sets-of-items? | 894 ; ------------------------------------------------------------------------------ 895 ; 896 ; DESCRIPTION 897 ; 898 ; Check if two sets of items are the same (or have the same core). 899 ; You can also use it to check if two arbitrary lists contain the 900 ; same elements. 901 ; 902 ; CALLING SEQUENCE 903 ; 904 ; (equal-sets-of-items? set-of-items1 set-of-items2 :compare-type type) 905 ; 906 ; set-of-items1 First and ... 907 ; 908 ; set-of-items2 ... second sets of of items. 909 ; 910 ; type Optional argument defaulting to 'item. 911 ; 912 ; Returns: T for type = 'item if both sets of items are identical. 913 ; T for type = 'core if both sets of items have the same 914 ; cores. 915 ; METHOD 916 ; 917 ; Two sets of items are the same if each one is contained within the 918 ; other. They have the same core if the core of one is contained in 919 ; the core of the other and vice-versa. We can't just test for equality 920 ; because the order of the items could be different. 921 ; 922 ; EXAMPLE 923 ; 924 ; (equal-sets-of-items? '( (a -> b DOT c |,| d) ) 925 ; '( (a -> b DOT c |,| d) )) => T 926 ; 927 ; (equal-sets-of-items? '( (a -> b DOT c |,| d) ) 928 ; '( (a -> b DOT c |,| e) ) 929 ; :compare-type 'core ) => T 930 ; 931 ; (equal-sets-of-items? '(a (b c) d) '(d a (b c))) => T 932 ; 933 ; ------------------------------------------------------------------------------ 934 935 (defun equal-sets-of-items?( set-of-items1 set-of-items2 936 &key (compare-type 'item)) 937 938 (and (contained-in-item? set-of-items1 set-of-items2 compare-type) 939 (contained-in-item? set-of-items2 set-of-items1 compare-type)) 940 ) 941 942 943 944 945 ; ============================================================================== 946 ; | Helper Functions on Symbols and Productions | 947 ; ============================================================================== 948 949 ; ------------------------------------------------------------------------------ 950 ; | terminal? | 951 ; ------------------------------------------------------------------------------ 952 ; 953 ; DESCRIPTION 954 ; 955 ; Find out if a symbol of the grammar is a terminal. 956 ; 957 ; CALLING SEQUENCE 958 ; 959 ; (terminal? symbol) 960 ; 961 ; *terminals* Global list of terminal symbols for the grammar. 962 ; 963 ; symbol: A grammar symbol 964 ; 965 ; Returns: T if the symbol is a terminal symbol, NIL otherwise. 966 ; EPSILON is not a terminal. 967 ; 968 ; EXAMPLE 969 ; 970 ; Let *terminals* = (c C), 971 ; 972 ; (terminal? '|c| ) => T 973 ; (terminal? 'C ) => NIL 974 ; (terminal? 'EPSILON) => NIL 975 ; 976 ; ------------------------------------------------------------------------------ 977 978 (defun terminal?( symbol ) 979 980 (itemInList symbol *terminals*) 981 ) 982 983 984 985 986 ; ------------------------------------------------------------------------------ 987 ; | nonterminal? | 988 ; ------------------------------------------------------------------------------ 989 ; 990 ; DESCRIPTION 991 ; 992 ; Find out if a symbol of the grammar is a nonterminal. 993 ; 994 ; CALLING SEQUENCE 995 ; 996 ; (nonterminal? symbol) 997 ; 998 ; symbol: A grammar symbol 999 ; 1000 ; Returns: T if the symbol is a nonterminal symbol, NIL otherwise. 1001 ; EPSILON is not a non-terminal. 1002 ; 1003 ; EXAMPLE 1004 ; 1005 ; (nonterminal? 'C ) => T 1006 ; (nonterminal? '|c| ) => NIL 1007 ; (nonterminal? 'EPSILON ) => NIL 1008 ; 1009 ; ------------------------------------------------------------------------------ 1010 1011 (defun nonterminal?( symbol ) 1012 1013 (and (not (equal symbol 'EPSILON)) 1014 (not (terminal? symbol))) 1015 1016 ) 1017 1018 1019 1020 1021 ; ------------------------------------------------------------------------------ 1022 ; | derives-leading-terminal? | 1023 ; ------------------------------------------------------------------------------ 1024 ; 1025 ; DESCRIPTION 1026 ; 1027 ; Check if a production derives a leading terminal. 1028 ; 1029 ; CALLING SEQUENCE 1030 ; 1031 ; (derives-leading-terminal? production) 1032 ; 1033 ; production: A production of the form X -> a Y 1034 ; 1035 ; Returns: T if a is a terminal, NIL otherwise. 1036 ; 1037 ; EXAMPLE 1038 ; 1039 ; (derives-leading-terminal? '(C -> |c| C)) => T 1040 ; (derives-leading-terminal? '(C -> C C)) => NIL 1041 ; 1042 ; ------------------------------------------------------------------------------ 1043 1044 (defun derives-leading-terminal?( production ) 1045 1046 (terminal? (third production)) 1047 1048 ) 1049 1050 1051 1052 1053 ; ------------------------------------------------------------------------------ 1054 ; | derives-leading-nonterminal? | 1055 ; ------------------------------------------------------------------------------ 1056 ; 1057 ; DESCRIPTION 1058 ; 1059 ; Check if a production derives a leading nonterminal. 1060 ; 1061 ; CALLING SEQUENCE 1062 ; 1063 ; (derives-leading-nonterminal? production) 1064 ; 1065 ; production: A production of the form X -> a Y 1066 ; 1067 ; Returns: T if a is a terminal, NIL otherwise. 1068 ; 1069 ; EXAMPLE 1070 ; 1071 ; (derives-leading-nonterminal? '(C -> C C)) => T 1072 ; (derives-leading-nonterminal? '(C -> |c| C)) => NIL 1073 ; 1074 ; ------------------------------------------------------------------------------ 1075 1076 (defun derives-leading-nonterminal?( production ) 1077 1078 (nonterminal? (third production)) 1079 1080 ) 1081 1082 1083 1084 1085 ; ------------------------------------------------------------------------------ 1086 ; | valid-production? | 1087 ; ------------------------------------------------------------------------------ 1088 ; 1089 ; DESCRPTION 1090 ; 1091 ; Find out if a production starts with the given symbol. 1092 ; 1093 ; CALLING SEQUENCE 1094 ; 1095 ; (valid-production? symbol production) 1096 ; 1097 ; symbol Grammar symbol 1098 ; 1099 ; production A production of the form X -> alpha 1100 ; 1101 ; Returns: T if X = symbol, NIL otherwise. 1102 ; 1103 ; EXAMPLE 1104 ; 1105 ; (valid-production? 'C '(C -> |c| C)) => T 1106 ; (valid-production? 'S '(C -> |c| C)) => NIL 1107 ; 1108 ; ------------------------------------------------------------------------------ 1109 1110 (defun valid-production?( symbol production ) 1111 1112 (equal symbol (first production)) 1113 1114 ) 1115 1116 1117 1118 ; ------------------------------------------------------------------------------ 1119 ; | reduction? | 1120 ; ------------------------------------------------------------------------------ 1121 ; 1122 ; DESCRIPTION 1123 ; 1124 ; Find out if an item calls for a reduction. 1125 ; 1126 ; CALLING SEQUENCE 1127 ; 1128 ; (reduction? item) 1129 ; 1130 ; item Any item [A -> alpha . beta , gamma]. 1131 ; 1132 ; Returns: T if the item is of the form [A -> alpha . , gamma ] 1133 ; and NIL otherwise. 1134 ; Note [A -> alpha . EPSILON , gamma] = 1135 ; [A -> alpha . , gamma], to this is a reduction too. 1136 ; 1137 ; EXAMPLE 1138 ; 1139 ; (reduction? '(C -> |d| DOT |,| |c|)) => T 1140 ; (reduction? '(C -> |d| DOT |e| |,| |c|)) => NIL 1141 ; (reduction? '(C -> |d| DOT EPSILON |,| |c|)) => T 1142 ; 1143 ; ------------------------------------------------------------------------------ 1144 1145 (defun reduction?( item ) 1146 1147 ; Get everything between the dot and comma. It will be empty for a reduction. 1148 1149 (let ((between-dot-and-comma (getHeadOfListUpTo 'DOT (reverse (getHeadOfListUpTo '|,| item ))))) 1150 1151 (or (null between-dot-and-comma) 1152 (equal (first between-dot-and-comma) 'epsilon))) 1153 ) 1154 1155 1156 1157 1158 ; ------------------------------------------------------------------------------ 1159 ; | is-accept? | 1160 ; ------------------------------------------------------------------------------ 1161 ; 1162 ; DESCRIPTION 1163 ; 1164 ; Find out if an item calls for an accept. 1165 ; 1166 ; CALLING SEQUENCE 1167 ; 1168 ; (is-accept? item) 1169 ; 1170 ; item Arbitrary item [A -> alpha . beta , gamma]. 1171 ; 1172 ; *productions* Global list of productions for our grammar. 1173 ; 1174 ; Returns: T if the item is of the form [S' -> S . , $] 1175 ; S is the start symbol --- the left hand side symbol 1176 ; of the first production. S' (represented as SP) is 1177 ; the extra start symbol of the augmented grammar. 1178 ; EXAMPLE 1179 ; 1180 ; *productions* => ( (S -> C C) (C -> |c| C) (C -> |d|) ) 1181 ; (is-accept? '(SP -> S DOT |,| $)) => T 1182 ; 1183 ; ------------------------------------------------------------------------------ 1184 1185 (defun is-accept?( item ) 1186 1187 (equal item `(SP -> ,(first (first *productions*)) DOT |,| $)) 1188 1189 ) 1190 1191 1192 1193 1194 ; ------------------------------------------------------------------------------ 1195 ; | symbol-after-dot! | 1196 ; ------------------------------------------------------------------------------ 1197 ; 1198 ; DESCRIPTION 1199 ; 1200 ; Find the symbol following the dot in an item. 1201 ; 1202 ; 1203 ; CALLING SEQUENCE 1204 ; 1205 ; (symbol-after-dot! item) 1206 ; 1207 ; item Item of the form [A -> alpha . B beta , gamma ] 1208 ; 1209 ; Returns: The symbol B, or NIL if there is none. 1210 ; 1211 ; EXAMPLE 1212 ; 1213 ; (symbol-after-dot! '(frogs -> are DOT keen |,| you bet)) => KEEN 1214 ; (symbol-after-dot! '(toads -> are not)) => NIL 1215 ; 1216 ; ------------------------------------------------------------------------------ 1217 1218 (defun symbol-after-dot!( item ) 1219 1220 (cond ( (null item) nil ) ; No dot was ever found. 1221 1222 ; Return the symbol after the dot or nil if there is none. 1223 1224 ( (equal (first item) 'DOT) (second item) ) 1225 1226 ( T (symbol-after-dot! (rest item)))) ; Keep 1227 ; looking. 1228 ) 1229 1230 1231 1232 1233 ; ------------------------------------------------------------------------------ 1234 ; | terminal-after-dot? | 1235 ; ------------------------------------------------------------------------------ 1236 ; 1237 ; DESCRIPTION 1238 ; 1239 ; Check if an item has a terminal symbol after the dot. 1240 ; 1241 ; CALLING SEQUENCE 1242 ; 1243 ; (terminal-after-dot? item) 1244 ; 1245 ; item Any item [A -> alpha . beta delta , gamma] 1246 ; 1247 ; Returns: T if beta is a terminal symbol and NIL otherwise. 1248 ; 1249 ; EXAMPLE 1250 ; 1251 ; (terminal-after-dot? '(C -> C DOT |c| |,| $ )) => T 1252 ; (terminal-after-dot? '(C -> C DOT C |,| $ )) => NIL 1253 ; 1254 ; ------------------------------------------------------------------------------ 1255 1256 (defun terminal-after-dot?( item ) 1257 1258 (if (null (symbol-after-dot! item)) ; No symbol after the dot. 1259 1260 nil 1261 1262 (terminal? (symbol-after-dot! item))) ; Check if the symbol after the dot 1263 ; is a terminal. 1264 ) 1265 1266 1267 1268 1269 1270 ; ------------------------------------------------------------------------------ 1271 ; | epsilon-production? | 1272 ; ------------------------------------------------------------------------------ 1273 ; 1274 ; DESCRIPTION 1275 ; 1276 ; Find out if a production derives only epsilon. 1277 ; 1278 ; CALLING SEQUENCE 1279 ; 1280 ; (epsilon-production? production) 1281 ; 1282 ; production [A -> alpha]. 1283 ; 1284 ; Returns: T if alpha = EPSILON. 1285 ; 1286 ; EXAMPLE 1287 ; 1288 ; (epsilon-production? '(A -> EPSILON)) => T 1289 ; 1290 ; ------------------------------------------------------------------------------ 1291 1292 (defun epsilon-production?( production ) 1293 1294 (and (equal (length production) 3) 1295 (equal (third production) 'EPSILON)) 1296 ) 1297 1298 1299 1300 1301 ; ------------------------------------------------------------------------------ 1302 ; | same-symbol? | 1303 ; ------------------------------------------------------------------------------ 1304 ; 1305 ; DESCRIPTION 1306 ; 1307 ; Test if two tagged grammar symbols are equal. 1308 ; 1309 ; CALLING SEQUENCE 1310 ; 1311 ; (same-symbol? s1 s2) 1312 ; 1313 ; s1, s2 Tagged grammar symbols of the form (symbol tag), with 1314 ; tag = 'EPSILON-FREE or NIL. 1315 ; 1316 ; Returns T if the symbol parts are equal. 1317 1318 ; EXAMPLE 1319 ; 1320 ; (same-symbol? '(a NIL) '(a EPSILON-FREE)) => T 1321 ; (same-symbol? '(a EPSILON-FREE) '(b EPSILON-FREE)) => NIL 1322 ; 1323 ; ------------------------------------------------------------------------------ 1324 1325 (defun same-symbol?( s1 s2 ) 1326 1327 (equal (first s1) (first s2)) 1328 1329 ) 1330 1331 1332 1333 ; ------------------------------------------------------------------------------ 1334 ; | first-alternate! | 1335 ; ------------------------------------------------------------------------------ 1336 ; 1337 ; DESCRIPTION 1338 ; 1339 ; Return the first alternate of a production. 1340 ; 1341 ; CALLING SEQUENCE 1342 ; 1343 ; (first-alternate! rhs) 1344 ; 1345 ; rhs The right hand side of a production containing alternates: 1346 ; alpha / beta / ... 1347 ; 1348 ; Returns: alpha 1349 ; 1350 ; EXAMPLE 1351 ; 1352 ; (first-alternate! '(A B / C D)) => (A B) 1353 ; 1354 ; ------------------------------------------------------------------------------ 1355 1356 (defun first-alternate!( rhs ) 1357 1358 (getHeadOfListUpTo '/ rhs) 1359 ) 1360 1361 1362 1363 1364 ; ------------------------------------------------------------------------------ 1365 ; | all-but-first-alternate! | 1366 ; ------------------------------------------------------------------------------ 1367 ; 1368 ; DESCRIPTION 1369 ; 1370 ; Return all but the first alternates of of a production. 1371 ; 1372 ; CALLING SEQUENCE 1373 ; 1374 ; (all-but-first-alternate! rhs) 1375 ; 1376 ; rhs The right hand side of a production, with alternates 1377 ; (alpha / beta / ...) 1378 ; 1379 ; Returns: (beta / ...) or NIL if rhs has only one alternate: 1380 ; (alpha) 1381 ; 1382 ; EXAMPLE 1383 ; 1384 ; (all-but-first-alternate! '(B C / D E / F)) => (D E / F) 1385 ; (all-but-first-alternate! '(B C)) => NIL 1386 ; (all-but-first-alternate! '(B C / D E / F)) => (D E / F) 1387 ; 1388 ; ------------------------------------------------------------------------------ 1389 1390 (defun all-but-first-alternate!( rhs ) 1391 1392 ; Get the first alternate, then strip it off. 1393 1394 (nthcdr (1+ (length (getHeadOfListUpTo '/ rhs))) rhs) 1395 ) 1396 1397 1398 1399 1400 ; ------------------------------------------------------------------------------ 1401 ; | production-rhs! | 1402 ; ------------------------------------------------------------------------------ 1403 ; 1404 ; DESCRIPTION 1405 ; 1406 ; Return the right hand side of a production. 1407 ; 1408 ; CALLING SEQUENCE 1409 ; 1410 ; (production-rhs! production) 1411 ; 1412 ; production Production of the form [A -> alpha]. 1413 ; 1414 ; Returns: alpha 1415 ; 1416 ; EXAMPLE 1417 ; 1418 ; (production-rhs! '(sandwich -> bread meat bread)) => (BREAD MEAT BREAD) 1419 ; 1420 ; ------------------------------------------------------------------------------ 1421 1422 (defun production-rhs!( production ) 1423 1424 (nthcdr 2 production) 1425 ) 1426 1427 1428 1429 1430 ; ------------------------------------------------------------------------------ 1431 ; | string-before-comma! | 1432 ; ------------------------------------------------------------------------------ 1433 ; 1434 ; DESCRIPTION 1435 ; 1436 ; Return the symbols between the first symbol after the dot and the comma 1437 ; in an item. 1438 ; 1439 ; CALLING SEQUENCE 1440 ; 1441 ; (string-before-comma! item) 1442 ; 1443 ; item An item of the form [A -> alpha . B beta , gamma] 1444 ; 1445 ; Returns: beta or NIL if beta is empty. 1446 ; 1447 ; EXAMPLE 1448 ; 1449 ; (string-before-comma! '(A -> + DOT B * + + |,| a)) => (* + +) 1450 ; (string-before-comma! '(A -> + DOT B |,| a)) => NIL 1451 ; (string-before-comma! '(A -> + DOT |,| a)) => NIL 1452 ; 1453 ; ------------------------------------------------------------------------------ 1454 1455 (defun string-before-comma!( item ) 1456 1457 (let ((temp (reverse (getHeadOfListUpTo 'DOT (reverse item))))) ; Get everything past the 1458 ; dot. 1459 1460 (if (equal (first temp) '|,|) ; No symbol after the dot. 1461 1462 nil 1463 1464 (getHeadOfListUpTo '|,| (rest temp)))) ; Get everything past the 1465 ; symbol after the dot 1466 ; (which could be nil). 1467 ) 1468 1469 1470 1471 1472 ; ------------------------------------------------------------------------------ 1473 ; | lookahead-of! | 1474 ; ------------------------------------------------------------------------------ 1475 ; 1476 ; DESCRIPTION 1477 ; 1478 ; Return the lookahead symbol of an item. 1479 ; 1480 ; CALLING SEQUENCE 1481 ; 1482 ; (lookahead-of! item) 1483 ; 1484 ; item [A -> alpha . beta , gamma] 1485 ; 1486 ; Returns: gamma 1487 ; 1488 ; EXAMPLE 1489 ; 1490 ; (lookahead-of! '(SP -> DOT |d| |,| |c|)) => |c| 1491 ; 1492 ; ------------------------------------------------------------------------------ 1493 1494 (defun lookahead-of!( item ) 1495 1496 (cond ( (null item) nil ) ; Nothing at all. 1497 1498 ( (equal (first item) '|,|) (second item)) ; Lookahead (or nothing) 1499 ; follows the comma. 1500 1501 ( T (lookahead-of! (rest item)))) ; Search 1502 ) 1503 1504 1505 1506 1507 ; ------------------------------------------------------------------------------ 1508 ; | split-up-production | 1509 ; ------------------------------------------------------------------------------ 1510 ; 1511 ; DESCRIPTION 1512 ; 1513 ; Break a single production with alternates into separate productions. 1514 ; 1515 ; CALLING SEQUENCE 1516 ; 1517 ; (split-up-production production) 1518 ; 1519 ; production [A -> alpha / beta / ... ] 1520 ; 1521 ; Returns: The list [A -> alpha], [A -> beta], ... 1522 ; 1523 ; EXAMPLE 1524 ; 1525 ; (split-up-production '(A -> B C / D E / F)) 1526 ; => ( (A -> B C) (A -> D E) (A -> F) ) 1527 ; 1528 ; ------------------------------------------------------------------------------ 1529 1530 (defun split-up-production( production ) 1531 1532 ; production = A -> B | C | ... 1533 1534 (let ( (head `(,(first production) 1535 ,(second production))) ; Get the left hand side: A -> 1536 1537 (tail (nthcdr 2 production)) ; Get the right hand side: B | C | ... 1538 1539 (new-productions nil) 1540 (new-production nil) ) 1541 1542 (loop (if (null tail) (return)) 1543 1544 (setq new-production ; A -> B, A -> C, etc. 1545 (append head 1546 (first-alternate! tail))) 1547 1548 ; Strip off next list up to bar. 1549 1550 (setq tail (all-but-first-alternate! tail)) 1551 1552 (setq new-productions (append new-productions (list new-production)))) 1553 1554 new-productions) 1555 ) 1556 1557 1558 1559 1560 ; ------------------------------------------------------------------------------ 1561 ; | split-up-productions | 1562 ; ------------------------------------------------------------------------------ 1563 ; 1564 ; DESCRIPTION 1565 ; 1566 ; Break a list of productions with alternates into a list of separate 1567 ; productions. 1568 ; 1569 ; CALLING SEQUENCE 1570 ; 1571 ; (split-up-productions production-list) 1572 ; 1573 ; production-list [A -> alpha / beta / ... ] [B -> gamma / delta ...] ... 1574 ; 1575 ; Returns: The list [A -> alpha], [A -> beta], ... [B -> gamma] 1576 ; [B -> delta], ... 1577 ; EXAMPLE 1578 ; 1579 ; (split-up-productions '((S -> C C) (C -> |c| C / |d|))) 1580 ; => ((S -> C C) (C -> |c| C) (C -> |d|)) 1581 ; 1582 ; ------------------------------------------------------------------------------ 1583 1584 (defun split-up-productions( production-list ) 1585 1586 (let ((new-production-list nil)) 1587 1588 (dolist (production production-list) ; Split up each production. 1589 1590 (setq new-production-list ; Add it to the growing list. 1591 (append new-production-list 1592 (split-up-production production)))) 1593 1594 new-production-list) 1595 ) 1596 1597 1598 1599 1600 ; ------------------------------------------------------------------------------ 1601 ; | make-item | 1602 ; ------------------------------------------------------------------------------ 1603 ; 1604 ; DESCRIPTION 1605 ; 1606 ; Create an item with a leading dot from a production and a 1607 ; lookahead symbol. 1608 ; 1609 ; CALLING SEQUENCE 1610 ; 1611 ; (make-item production lookahead) 1612 ; 1613 ; production Any production [A -> alpha] 1614 ; 1615 ; lookahead Any lookahead symbol b. 1616 ; 1617 ; Returns: The item [A -> . alpha , b] 1618 ; 1619 ; EXAMPLE 1620 ; 1621 ; (make-item '(A -> B C) '|d|) => (A -> DOT B C |,| |d|) 1622 ; 1623 ; ------------------------------------------------------------------------------ 1624 1625 (defun make-item( production lookahead ) 1626 1627 `( ,(first production) ; Get the first symbol A. 1628 ,(second production) ; Get the arrow -> 1629 DOT ; Add the leading dot. 1630 ,@(nthcdr 2 production) ; Add the right hand side of the production. 1631 |,| ; Comma. 1632 ,lookahead ) ; Add the lookahead symbol last. 1633 ) 1634 1635 1636 1637 1638 ; ------------------------------------------------------------------------------ 1639 ; | move-dot-right | 1640 ; ------------------------------------------------------------------------------ 1641 ; 1642 ; DESCRIPTION 1643 ; 1644 ; Move the dot in an item to the right if possible. 1645 ; 1646 ; CALLING SEQUENCE 1647 ; 1648 ; (move-dot-right item) 1649 ; 1650 ; item [A -> alpha . X beta, b] 1651 ; 1652 ; Returns: [A -> alpha X . beta, b] 1653 ; If the item is of the form [A -> alpha . , b], we return it 1654 ; unchanged. 1655 ; 1656 ; EXAMPLE 1657 ; 1658 ; (move-dot-right '( A -> B DOT C |,| D)) => ( A -> B C DOT |,| D) 1659 ; (move-dot-right '( A -> B C DOT |,| D)) => ( A -> B C |,| DOT D) 1660 ; 1661 ; ------------------------------------------------------------------------------ 1662 1663 (defun move-dot-right( item ) 1664 1665 (cond ( (null item) nil ) 1666 1667 ( (equal (first item) 'DOT) ; The item begins with a dot. 1668 1669 (if (null (second item)) ; item = DOT 1670 1671 item ; Leave it alone. 1672 1673 ; Move the dot right over the next symbol. We change [ . b c d ] to [b . c d]. 1674 1675 `( ,(second item) ; b 1676 DOT ; Add a dot. 1677 ,@(nthcdr 2 item)))) ; The remainder, (c d). 1678 1679 ( t (cons (first item) ; Item doesn't begin with a dot. 1680 1681 (move-dot-right (rest item))))) 1682 ) 1683 1684 1685 1686 1687 ; ------------------------------------------------------------------------------ 1688 ; | create-augmenting-item | 1689 ; ------------------------------------------------------------------------------ 1690 ; 1691 ; DESCRIPTION 1692 ; 1693 ; Create the accept item of the augmented grammar. 1694 ; 1695 ; CALLING SEQUENCE 1696 ; 1697 ; (create-augmenting-item) 1698 ; 1699 ; *productions* List of productions for this grammar. 1700 ; 1701 ; Returns: The item (SP -> DOT S |,| $) where S is the start 1702 ; symbol: the left hand side nonterminal of the first 1703 ; production S -> alpha. 1704 ; EXAMPLE 1705 ; 1706 ; *productions* => ((E -> E T) (E -> id) (T -> id)) 1707 ; 1708 ; (create-augmenting-item) => (SP -> DOT E |,| $) 1709 ; 1710 ; ------------------------------------------------------------------------------ 1711 1712 (defun create-augmenting-item() 1713 1714 ; Assume the first symbol of the left hand side of the first production is 1715 ; the start symbol. 1716 1717 `(SP -> DOT ,(first (first *productions*)) |,| $) 1718 1719 ) 1720 1721 1722 1723 1724 ; ------------------------------------------------------------------------------ 1725 ; | find-grammar-symbols | 1726 ; ------------------------------------------------------------------------------ 1727 ; 1728 ; DESCRIPTION 1729 ; 1730 ; Create all the grammar symbols (terminal and nonterminal) by looking 1731 ; at the list of productions. 1732 ; 1733 ; CALLING SEQUENCE 1734 ; 1735 ; (find-grammar-symbols) 1736 ; 1737 ; Returns: List of grammar symbols. Note we don't include the endmarker, 1738 ; $ or the null string, EPSILON. 1739 ; 1740 ; EXAMPLE 1741 ; 1742 ; For productions S -> C C, S -> |c| C | d, 1743 ; 1744 ; (find-grammar-symbols) => (S C |c| |d|) 1745 ; 1746 ; ------------------------------------------------------------------------------ 1747 1748 (defun find-grammar-symbols() 1749 1750 (let ((symbols nil)) 1751 1752 ; Scan through all productions, collecting all terminals and nonterminals. 1753 1754 (dolist (production *productions*) 1755 1756 (setq symbols (append symbols (removeItemFromList '-> production)))) 1757 1758 1759 ; Remove duplicated elements which occur later in the sequence. Remove any 1760 ; EPSILON's introduced by epsilon productions, A -> EPSILON. 1761 1762 (removeItemFromList 'EPSILON (remove-duplicates symbols :from-end T))) 1763 1764 ) 1765 1766 1767 ; ------------------------------------------------------------------------------ 1768 ; | item-to-production | 1769 ; ------------------------------------------------------------------------------ 1770 ; 1771 ; DESCRIPTION 1772 ; 1773 ; Change an item to a production by removing the dot and lookahead part. 1774 ; 1775 ; CALLING SEQUENCE 1776 ; 1777 ; (item-to-production item) 1778 ; 1779 ; item [A -> alpha . beta , gamma] 1780 ; 1781 ; Returns: [A -> alpha beta] 1782 ; 1783 ; EXAMPLE 1784 ; 1785 ; (item-to-production '(rat -> on DOT rye |,| tail)) => (RAT -> ON RYE) 1786 ; 1787 ; ------------------------------------------------------------------------------ 1788 1789 (defun item-to-production( item ) 1790 1791 (removeItemFromList 'DOT (getHeadOfListUpTo '|,| item ))) 1792 1793 1794 1795 1796 ; ------------------------------------------------------------------------------ 1797 ; | production-number | 1798 ; ------------------------------------------------------------------------------ 1799 ; 1800 ; DESCRIPTION 1801 ; 1802 ; Return the number of a production. 1803 ; 1804 ; CALLING SEQUENCE 1805 ; 1806 ; (production-number production) 1807 ; 1808 ; production Any production. 1809 ; *production* List of productions for the grammar. 1810 ; 1811 ; Returns: The number of the production in the list *productions*. 1812 ; The first production is numbered 1. Recall alternates 1813 ; of productions were split off into separate productions 1814 ; in the function load-input-and-initialize. 1815 ; EXAMPLE 1816 ; 1817 ; (production-number '(S -> C C)) => 1 1818 ; 1819 ; ------------------------------------------------------------------------------ 1820 1821 (defun production-number( production ) 1822 1823 (1+ (positionInList production *productions*)) 1824 1825 ) 1826 1827 1828 1829 1830 ; ============================================================================== 1831 ; | First Derived Symbol Utilities | 1832 ; ============================================================================== 1833 ; | | 1834 ; | NOTE: In this section we will be using the sample grammar, | 1835 ; | | 1836 ; | S -> A B | 1837 ; | A -> C a | EPSILON | 1838 ; | B -> b | 1839 ; | C -> c | EPSILON | 1840 ; | | 1841 ; | with terminal symbols a b c | 1842 ; | | 1843 ; ============================================================================== 1844 1845 1846 ; ------------------------------------------------------------------------------ 1847 ; | tag-symbol | 1848 ; ------------------------------------------------------------------------------ 1849 ; 1850 ; DESCRIPTION 1851 ; 1852 ; Convert a grammar symbol A to tagged form (A NIL). 1853 ; 1854 ; CALLING SEQUENCE 1855 ; 1856 ; (tag-symbol symbol) 1857 ; 1858 ; symbol A grammar symbol. 1859 ; 1860 ; Returns: The list, (symbol NIL). 1861 ; 1862 ; EXAMPLE 1863 ; 1864 ; (tag-symbol 'a) => (A NIL) 1865 ; 1866 ; ------------------------------------------------------------------------------ 1867 1868 (defun tag-symbol( s ) 1869 1870 `(,s NIL) 1871 1872 ) 1873 1874 1875 1876 ; ------------------------------------------------------------------------------ 1877 ; | flag-epsilon-free! | 1878 ; ------------------------------------------------------------------------------ 1879 ; 1880 ; DESCRIPTION 1881 ; 1882 ; Flag a tagged grammar symbol as coming from an epsilon-free derivation. 1883 ; 1884 ; CALLING SEQUENCE 1885 ; 1886 ; (flag-epsilon-free! tagged-symbol) 1887 ; 1888 ; tagged-symbol Tagged grammar symbol of the form (symbol tag). 1889 ; 1890 ; Returns: (symbol NIL) 1891 ; 1892 ; EXAMPLE 1893 ; 1894 ; (flag-epsilon-free! '(a nil)) => (A EPSILON-FREE) 1895 ; 1896 ; ------------------------------------------------------------------------------ 1897 1898 (defun flag-epsilon-free!( s ) 1899 1900 `(,(first s) epsilon-free) 1901 1902 ) 1903 1904 1905 1906 ; ------------------------------------------------------------------------------ 1907 ; | epsilon-free-only | 1908 ; ------------------------------------------------------------------------------ 1909 ; 1910 ; DESCRIPTION 1911 ; 1912 ; Return only tagged grammar symbols with epsilon-free derivations. 1913 ; 1914 ; CALLING SEQUENCE 1915 ; 1916 ; (epsilon-free-only list) 1917 ; 1918 ; list List of tagged symbols, ( (s1 tag1) ... ) 1919 ; 1920 ; Returns: Only those symbols in the list for which 1921 ; tag = 'epsilon-free 1922 ; EXAMPLE 1923 ; 1924 ; (epsilon-free-only '((a NIL) (b EPSILON-FREE) (c NIL))) 1925 ; => ((B EPSILON-FREE)) 1926 ; 1927 ; ------------------------------------------------------------------------------ 1928 1929 (defun epsilon-free-only( l ) 1930 1931 (cond ( (null l) nil ) 1932 1933 ( (equal (second (first l)) ; Keep this symbol: It is epsilon-free. 1934 'epsilon-free) 1935 (cons (first l) (epsilon-free-only (rest l)))) 1936 1937 ( t (epsilon-free-only (rest l)))) ; Discard this symbol. 1938 ) 1939 1940 1941 1942 ; ------------------------------------------------------------------------------ 1943 ; | untag-list | 1944 ; ------------------------------------------------------------------------------ 1945 ; 1946 ; DESCRIPTION 1947 ; 1948 ; Remove the tags from a list of tagged grammar symbols. 1949 ; 1950 ; CALLING SEQUENCE 1951 ; 1952 ; (untag-list list) 1953 ; 1954 ; list List of tagged grammar symbols, 1955 ; ((a NIL) (b EPSILON-FREE) ... ). 1956 ; 1957 ; Returns: Untagged symbols, (a b ...). 1958 ; 1959 ; EXAMPLE 1960 ; 1961 ; (untag-list '( (a NIL) (b EPSILON-FREE) (c nil))) => (A B C) 1962 ; 1963 ; ------------------------------------------------------------------------------ 1964 1965 (defun untag-list( list ) 1966 1967 (mapcar #'car list) 1968 1969 ) 1970 1971 1972 1973 ; ------------------------------------------------------------------------------ 1974 ; | flag-non-epsilon-free | 1975 ; ------------------------------------------------------------------------------ 1976 ; 1977 ; DESCRIPTION 1978 ; 1979 ; Flag a list of tagged symbols as being all not epsilon-free derived. 1980 ; 1981 ; CALLING SEQUENCE 1982 ; 1983 ; (flag-epsilon-free tagged-list) 1984 ; 1985 ; tagged-list Tagged list of grammar symbols, ( (s1 tag1) ... ) 1986 ; 1987 ; Returns: List with all tags set to NIL, ( (s1 NIL) ... ) 1988 ; 1989 ; EXAMPLE 1990 ; 1991 ; (flag-non-epsilon-free '((a epsilon-free) (b nil) (c epsilon-free))) => 1992 ; ((A NIL) (B NIL) (C NIL)) 1993 ; 1994 ; ------------------------------------------------------------------------------ 1995 1996 (defun flag-non-epsilon-free( s ) 1997 1998 ; Apply an anonymous function to all elements of the list. 1999 (mapcar #'(lambda (x) (cons (first x) '(NIL))) s) 2000 ) 2001 2002 2003 2004 ; ------------------------------------------------------------------------------ 2005 ; | precedence | 2006 ; ------------------------------------------------------------------------------ 2007 ; 2008 ; DESCRIPTION 2009 ; 2010 ; Compare two tagged symbols and return the one of higher precedence. 2011 ; 2012 ; CALLING SEQUENCE 2013 ; 2014 ; (precedence s1 s2) 2015 ; 2016 ; s1, s2 Tagged grammar symbols of the form (s1 tag1), (s2 tag2) with 2017 ; tag1, tag2 = 'EPSILON-FREE or NIL. 2018 ; 2019 ; Returns (s1 tag1) if tag1 = EPSILON-FREE, (s2 tag2) otherwise. 2020 ; 2021 ; EXAMPLE 2022 ; 2023 ; (precedence '(a NIL) '(b EPSILON-FREE)) => (B EPSILON-FREE) 2024 ; (precedence '(a EPSILON-FREE) '(b EPSILON-FREE)) => (A EPSILON-FREE) 2025 ; 2026 ; ------------------------------------------------------------------------------ 2027 2028 (defun precedence( s1 s2 ) 2029 2030 (cond ((equal (second s1) 'epsilon-free) s1) ; Return the epsilon-free one. 2031 ( t s2)) 2032 ) 2033 2034 2035 2036 ; ------------------------------------------------------------------------------ 2037 ; 2038 ; DESCRIPTION 2039 ; 2040 ; Return the derived leading terminal of a production. 2041 ; e.g. the derived leading terminal a of the production X -> a Y 2042 ; 2043 ; CALLING SEQUENCE 2044 ; 2045 ; (derived-leading-terminal production) 2046 ; 2047 ; production [A -> a beta] 2048 ; 2049 ; Returns: a 2050 ; 2051 ; EXAMPLE 2052 ; 2053 ; (derived-leading-terminal '(A -> + B A)) => 2054 ; 2055 ; ------------------------------------------------------------------------------ 2056 2057 (defun derived-leading-terminal( production ) 2058 2059 (third production) 2060 2061 ) 2062 2063 2064 2065 ; ------------------------------------------------------------------------------ 2066 ; | first-terminals-of-rhs | 2067 ; ------------------------------------------------------------------------------ 2068 ; 2069 ; DESCRIPTION 2070 ; 2071 ; Find the ith approximation of the first derived terminals of the right 2072 ; hand side of a production. 2073 ; 2074 ; CALLING SEQUENCE 2075 ; 2076 ; (first-terminals-of-rhs rhs hash-table) 2077 ; 2078 ; rhs The right hand side Y1 ... Yn of the production. 2079 ; 2080 ; hash-table The current approximation to the FIRST() function for 2081 ; the non-terminals. FIRST of the terminals is already exact. 2082 ; 2083 ; Returns: The first derived terminals of the string Y1 ... Yn. We 2084 ; tag the epsilon-free first derived symbols. 2085 ; 2086 ; METHOD 2087 ; 2088 ; F ( X ) = F( Y1 ) + ... + F( Yn ) 2089 ; i 1 1 2090 ; 2091 ; EXAMPLE 2092 ; 2093 ; Assume we have just called initial-first-derived-terminals, so that 2094 ; hash-table contains the zeroth approximation to FIRST. 2095 ; 2096 ; (first-terminals-of-rhs '(A B) hash-table) => ( (b NIL) ) 2097 ; because S => A B => EPSILON b => b, which is not in EFF(). 2098 ; 2099 ; ------------------------------------------------------------------------------ 2100 2101 (defun first-terminals-of-rhs( rhs hash-table ) 2102 2103 ; Compute FIRST( Y1 ) and FIRST( Y1 ) - EPSILON. 2104 2105 (let* ((first-terms (gethash (first rhs) hash-table)) 2106 (first-terms-minus-epsilon (removeItemFromList '(EPSILON NIL) first-terms 2107 :equalityTest 'same-symbol?))) 2108 (cond ( (null rhs) NIL) 2109 2110 ; If we have the case A -> alpha beta with FIRST( alpha ) = {}, we want to 2111 ; return {}. 2112 2113 ( (null first-terms) nil ) 2114 2115 2116 ; If epsilon is in FIRST( Y1 ) , add all non-epsilon symbols in FIRST( Y1 ) 2117 ; to the first derived terminals in the rest of the list. Flag all these new 2118 ; symbols as epsilon-derived. If there are duplicated symbols, keep only 2119 ; the epsilon-free ones. 2120 2121 ( (itemInList '(EPSILON NIL) first-terms :test 'same-symbol?) 2122 2123 (combine first-terms-minus-epsilon 2124 (flag-non-epsilon-free 2125 (first-terminals-of-rhs (rest rhs) hash-table)) 2126 :test 'same-symbol? :precedence 'precedence)) 2127 2128 2129 ; Otherwise, Y1 has only non-epsilon terminals. Return FIRST( Y1 ). Whether 2130 ; these symbols are epsilon-free depends on their previous flags. 2131 2132 ( t first-terms ))) 2133 ) 2134 2135 2136 2137 ; ------------------------------------------------------------------------------ 2138 ; | update-first-derived-function | 2139 ; ------------------------------------------------------------------------------ 2140 ; 2141 ; DESCRIPTION 2142 ; 2143 ; Update the first derived terminals function to create a new version. 2144 ; 2145 ; CALLING SEQUENCE 2146 ; 2147 ; (update-first-derived-function hash-table update-hash-table) 2148 ; 2149 ; hash-table Old hash table. 2150 ; 2151 ; update-hash-table Changes to the old table. If an entry is NIL, 2152 ; it indicates no change is to be made to hash-table. 2153 ; 2154 ; Returns: Updated hash-table of the first derived terminals. 2155 ; 2156 ; EXAMPLE 2157 ; 2158 ; (update-first-derived-function) => Updated hash table. 2159 ; 2160 ; ------------------------------------------------------------------------------ 2161 2162 (defun update-first-derived-function( hash-table update-hash-table ) 2163 2164 ; Update only changes to nonterminals because the FIRST of a terminal symbol 2165 ; does not change. 2166 2167 (dolist (symbol (find-grammar-symbols)) 2168 2169 (if (nonterminal? symbol) 2170 2171 (if (not (null (gethash symbol update-hash-table))) 2172 2173 (setf (gethash symbol hash-table) 2174 (gethash symbol update-hash-table))))) 2175 ) 2176 2177 2178 2179 ; ------------------------------------------------------------------------------ 2180 ; | initial-first-derived-terminals | 2181 ; ------------------------------------------------------------------------------ 2182 ; 2183 ; DESCRIPTION 2184 ; 2185 ; The zeroth approximation to the first derived terminals function. 2186 ; It is exact for all terminals and epsilon. 2187 ; 2188 ; CALLING SEQUENCE 2189 ; 2190 ; (initial-first-derived-terminals hash-table :type type) 2191 ; 2192 ; *terminals* List of all the terminal symbols including the endmarker $ 2193 ; 2194 ; hash-table Empty, but allocated hash table. 2195 ; 2196 ; Returns: Hash table of the zeroth approximation to the first 2197 ; derived terminals function for every grammar symbol. 2198 ; Epsilon-free first derived symbols are tagged with the 2199 ; flag 'epsilon-free. 2200 ; 2201 ; EXAMPLE 2202 ; 2203 ; (setq F0 (make-hash-table :size 100)) 2204 ; (setq F0 (initial-first-derived-terminals F0)) 2205 ; (gethash 'A F0) => ( (EPSILON NIL) ) 2206 ; 2207 ; Grammar FIRST Grammar FIRST 2208 ; Symbol Symbol 2209 ; ------------------- ------------------- 2210 ; S NIL |a| ((|a| EPSILON-FREE)) 2211 ; A ((EPSILON NIL)) |b| ((|b| EPSILON-FREE)) 2212 ; B ((|b| EPSILON-FREE)) |c| ((|c| EPSILON-FREE)) 2213 ; C ((|c| EPSILON-FREE) EPSILON ((EPSILON NIL)) 2214 ; (EPSILON NIL)) $ (($ EPSILON-FREE)) 2215 ; 2216 ; |c| is flagged as being in EFF( C ). EPSILON is in FIRST( C ) but 2217 ; not in EFF( C ). 2218 ; 2219 ; ------------------------------------------------------------------------------ 2220 2221 (defun initial-first-derived-terminals( hash-table ) 2222 2223 (let ( (first-symbols nil) 2224 (nonterm nil) 2225 (new-symbol nil) ) 2226 2227 ; FIRST( X ) = { (X 'epsilon-free) } if X is a terminal. 2228 2229 (dolist (terminal *terminals*) 2230 2231 (setf (gethash terminal hash-table) 2232 (list (flag-epsilon-free! (tag-symbol terminal))))) 2233 2234 2235 ; FIRST( EPSILON ) = { (EPSILON NIL) }. 2236 2237 (setf (gethash 'EPSILON hash-table) (list (tag-symbol 'EPSILON))) 2238 2239 2240 ; Every nonterminal appears as the left hand side of some production. 2241 ; Thus we can scan through the productions to define FIRST( A ) for every 2242 ; nonterminal A. 2243 ; Compute the zeroth approximation to FIRST(). Look for a production of 2244 ; the form A -> a alpha, where a is a nonterminal. Find the entry for 2245 ; A in the table, and add a to it. 2246 ; We tag productions of the form A -> EPSILON as not being epsilon-free 2247 ; derivations. 2248 2249 (dolist (production *productions*) 2250 2251 (cond ( (or (derives-leading-terminal? production) 2252 (epsilon-production? production)) 2253 2254 (setq nonterm (first production)) 2255 2256 (setq first-symbols (gethash nonterm hash-table)) 2257 2258 ; Get a or EPSILON. 2259 (setq new-symbol 2260 (tag-symbol (derived-leading-terminal production))) 2261 2262 ; Flag a as an epsilon-free derivation, but EPSILON as not. 2263 2264 (if (not (epsilon-production? production)) 2265 (setq new-symbol (flag-epsilon-free! new-symbol))) 2266 2267 ; Add a to FIRST( A ). 2268 2269 (setq first-symbols 2270 (insertItemIntoList new-symbol first-symbols)) 2271 2272 (setf (gethash nonterm hash-table) first-symbols)))) 2273 hash-table) 2274 ) 2275 2276 2277 2278 ; ------------------------------------------------------------------------------ 2279 ; | create-all-first-derived-terminals | 2280 ; ------------------------------------------------------------------------------ 2281 ; 2282 ; DESCRIPTION 2283 ; 2284 ; Create a hash table of all first derived terminals for every grammar 2285 ; symbol. 2286 ; 2287 ; CALLING SEQUENCE 2288 ; 2289 ; (create-all-first-derived-terminals) 2290 ; 2291 ; Returns: A hash table of the first derived terminals for every grammar 2292 ; symbol, including EPSILON and $. Flag the epsilon-free 2293 ; derived terminals. 2294 ; METHOD 2295 ; 2296 ; Successive approximation by transitive closure. 2297 ; 2298 ; EXAMPLE 2299 ; 2300 ; (setq h (create-all-first-derived-terminals)) => #<Hash-Table 8EDA53> 2301 ; 2302 ; (gethash 'S h) => ((|a| NIL) (|c| EPSILON-FREE) (|b| NIL)) 2303 ; i.e. FIRST( S ) = { |a| |b| |c| }, but EFF( S ) = { |c| } 2304 ; 2305 ; 2306 ; Grammar FIRST Grammar FIRST 2307 ; Symbol Symbol 2308 ; -------------------------------- -------------------------------- 2309 ; S ((|a| NIL) |a| ((|a| EPSILON-FREE)) 2310 ; (|b| NIL) |b| ((|a| EPSILON-FREE)) 2311 ; (|c| EPSILON-FREE)) |c| ((|c| EPSILON-FREE)) 2312 ; A ((EPSILON NIL) EPSILON ((EPSILON NIL)) 2313 ; (|a| NIL) $ (($ EPSILON-FREE)) 2314 ; (|c| EPSILON-FREE)) 2315 ; B ((|b| EPSILON-FREE)) 2316 ; C ((|c| EPSILON-FREE) 2317 ; (EPSILON NIL)) 2318 ; 2319 ; ------------------------------------------------------------------------------ 2320 2321 (defun create-all-first-derived-terminals() 2322 2323 ; Initialize the hash table. The size is extensible at run time. 2324 2325 (let ( (hash-table (make-hash-table :size +initial-hash-table-size+)) 2326 (update-hash-table (make-hash-table :size +initial-hash-table-size+)) 2327 (nonterm nil) 2328 (new-first-symbols nil) 2329 (old-first-symbols nil) 2330 (change-flag T) ) 2331 2332 ; Create the zeroth approximation to FIRST(), accurate for all terminals. 2333 2334 (initial-first-derived-terminals hash-table) 2335 2336 2337 ; Loop until no more changes occur in the approximation to FIRST. 2338 2339 (loop 2340 2341 (setq change-flag nil) 2342 2343 2344 ; Compute FIRST[i+1](A) for all the nonterminals A. 2345 2346 (dolist (production *productions*) ; Scan all productions A -> alpha 2347 2348 (setq nonterm (first production)) ; A 2349 2350 2351 ; FIRST[i+1]( A ) = 2352 ; first terminal of( FIRST[i]( Y1 ) ... FIRST[Yn]) U FIRST[i]( A ). 2353 2354 2355 (setq old-first-symbols (gethash nonterm hash-table)) 2356 2357 (setq new-first-symbols 2358 2359 (combine old-first-symbols ; FIRST[i](A) 2360 2361 (first-terminals-of-rhs (production-rhs! production) 2362 hash-table) 2363 :test 'same-symbol? :precedence 'precedence)) 2364 2365 ; Record if any changes occurred, and save FIRST[i+1]( A ) in a separate 2366 ; update hash table. 2367 2368 (cond ((not (equal-sets-of-items? new-first-symbols 2369 old-first-symbols)) 2370 2371 (setq change-flag T) 2372 2373 (setf (gethash nonterm update-hash-table) 2374 new-first-symbols)))) 2375 2376 ; Add updates to the old hash table for FIRST[i]() to create FIRST[i+1](), 2377 ; then clear out the update hash table. 2378 2379 (update-first-derived-function hash-table update-hash-table) 2380 2381 (clrhash update-hash-table) 2382 2383 (if (null change-flag) (return))) ; No more changes --- exit. 2384 2385 ; Return the hash table of first derived terminals for every grammar symbol. 2386 2387 hash-table) 2388 2389 ) 2390 2391 2392 2393 ; ------------------------------------------------------------------------------ 2394 ; | first-terminals-of-symbol | 2395 ; ------------------------------------------------------------------------------ 2396 ; 2397 ; DESCRIPTION 2398 ; 2399 ; Return a list of the first derived terminals of a grammar symbol. 2400 ; 2401 ; CALLING SEQUENCE 2402 ; 2403 ; (first-terminals-of-symbol s) 2404 ; 2405 ; s Any grammar symbol (or EPSILON or $). 2406 ; 2407 ; *first-derived-terminals* Hash table of the first derived terminals 2408 ; for every grammar symbol. 2409 ; 2410 ; *epsilon-free-first-derived-terminals* 2411 ; 2412 ; Hash table of the epsilon-free first 2413 ; derived terminals for every grammar symbol. 2414 ; 2415 ; type Defaults to NIL for computing FIRST() and 2416 ; equals 'epsilon-free for computing EFF(). 2417 ; 2418 ; Returns: List of first derived terminals of s. 2419 ; If type = 'epsilon-free, return the 2420 ; epsilon-free first derived terminals 2421 ; instead. 2422 ; 2423 ; Creates *first-derived-terminals* and 2424 ; *epsilon-free-first-derived-terminals* 2425 ; if they do not already exist. 2426 ; EXAMPLE 2427 ; 2428 ; (first-terminals-of-symbol 'S) 2429 ; => (|a| |c| |b|) 2430 ; (first-terminals-of-symbol 'S :type 'epsilon-free) 2431 ; => (|c|) 2432 ; 2433 ; ------------------------------------------------------------------------------ 2434 2435 (defun first-terminals-of-symbol( symbol &key (type NIL) ) 2436 2437 ; Create the hash tables for FIRST() and EFF() if they do not exist. 2438 2439 (cond ( (or (null *first-derived-terminals*) 2440 (null *epsilon-free-first-derived-terminals*)) 2441 2442 ; Sort out first derived terminals from epsilon-free first derived terminals. 2443 2444 (setq *first-derived-terminals* 2445 (make-hash-table :size +initial-hash-table-size+)) 2446 (setq *epsilon-free-first-derived-terminals* 2447 (make-hash-table :size +initial-hash-table-size+)) 2448 2449 (let ( (old-hash-table (create-all-first-derived-terminals)) ) 2450 2451 (dolist (symbol (cons '$ (find-grammar-symbols))) 2452 2453 (setf (gethash symbol *first-derived-terminals*) 2454 (untag-list (gethash symbol old-hash-table))) 2455 2456 (setf (gethash symbol *epsilon-free-first-derived-terminals*) 2457 (untag-list 2458 (epsilon-free-only 2459 (gethash symbol old-hash-table)))))))) 2460 2461 2462 2463 ; Return FIRST() or EFF() depending on the customer's request. 2464 2465 (if (equal type 'epsilon-free) 2466 2467 (gethash symbol *epsilon-free-first-derived-terminals*) 2468 2469 (gethash symbol *first-derived-terminals*)) 2470 ) 2471 2472 2473 2474 ; ------------------------------------------------------------------------------ 2475 ; | first-derived-terminals | 2476 ; ------------------------------------------------------------------------------ 2477 ; 2478 ; DESCRIPTION 2479 ; 2480 ; Return a list of the first derived terminals of a grammar string. 2481 ; 2482 ; CALLING SEQUENCE 2483 ; 2484 ; (first-derived-terminals string :type type) 2485 ; 2486 ; string A list of grammar symbols, X1 ... Xn 2487 ; 2488 ; type NIL by default. 2489 ; 2490 ; Returns: First derived terminals of the list, FIRST( X1 ... Xn ) 2491 ; if type = NIL, but EFF( X1 ... Xn ) if type = 'epsilon-free. 2492 ; 2493 ; METHOD 2494 ; 2495 ; FIRST( X1 ... Xn ) = FIRST( X1 ) + ... + FIRST( Xn ) 2496 ; 1 1 2497 ; 2498 ; EFF( X1 ... Xn ) = EFF( X1 ) + FIRST( X2 ... Xn ) 2499 ; 1 2500 ; 2501 ; EXAMPLE 2502 ; 2503 ; (first-derived-terminals '(S A)) => (|a| |c| |b|) 2504 ; (first-derived-terminals '(S A) :type 'epsilon-free) => (|c|) 2505 ; 2506 ; because FIRST( S ) = (|a| |b| |c|) and FIRST( A ) = (|a| |c| EPSILON) 2507 ; and |c| is the only terminal with a non-epsilon derivation, 2508 ; S => A B => C a b => c a b. |b|, for example has only the derivation 2509 ; S => A B => A b => EPSILON b = b, in which we must replace a leading 2510 ; non-terminal A with EPSILON. 2511 ; 2512 ; (first-derived-terminals '(A B)) => (|c| |a| |b|) 2513 ; because FIRST( B ) = { |b| } 2514 ; 2515 ; (first-derived-terminals '(A B)) => (|c|) 2516 ; 2517 ; ------------------------------------------------------------------------------ 2518 2519 (defun first-derived-terminals( string &key (type NIL) ) 2520 2521 ; We want FIRST( EPSILON ) = (EPSILON) and EFF( EPSILON ) = NIL. 2522 2523 (cond ( (null string) (if (equal type 'epsilon-free) 2524 NIL 2525 (list 'EPSILON))) 2526 2527 ; If EPSILON is in FIRST( Y1 ), add all non-epsilon terminals of FIRST( Y1 ). 2528 ; If we are computing EFF(), we do EFF( Y1 ) instead. 2529 2530 ( (itemInList 'EPSILON 2531 (first-terminals-of-symbol (first string) :type type)) 2532 2533 (union (removeItemFromList 'EPSILON 2534 (first-terminals-of-symbol (first string))) 2535 (first-derived-terminals (rest string)) 2536 :test #'equal 2537 ) 2538 ) 2539 2540 ; Otherwise, return the non-epsilon symbols of FIRST( Y1 ) or of EFF( Y1 ). 2541 2542 (t (first-terminals-of-symbol (first string) :type type))) 2543 ) 2544 2545 2546 2547 2548 ; ============================================================================== 2549 ; | Item functions: closure, goto, cores, etc. | 2550 ; ============================================================================== 2551 2552 2553 ; ------------------------------------------------------------------------------ 2554 ; | closure | 2555 ; ------------------------------------------------------------------------------ 2556 ; 2557 ; DESCRIPTION 2558 ; 2559 ; Return the closure of a list of items. 2560 ; 2561 ; CALLING SEQUENCE 2562 ; 2563 ; (closure set-of-items) 2564 ; 2565 ; *productions* Global list of productions. 2566 ; set-of-items 2567 ; 2568 ; Returns: For each item [A -> alpha . B beta , a] in the set, add 2569 ; [B -> . gamma , b] where (B -> gamma) is a production 2570 ; and b is the first derived symbol of the string "beta a". 2571 ; METHOD 2572 ; 2573 ; Intuitively, we saw alpha already and expect to see B next. 2574 ; That is, we expect to see any string of terminals gamma derived from B. 2575 ; The next symbol we expect is the lookahead, which is the first 2576 ; derived terminal symbol of the string beta a. 2577 ; 2578 ; EXAMPLE 2579 ; 2580 ; (closure '( (SP -> DOT S |,| $) ) ) => 2581 ; 2582 ; ( (SP -> DOT S |,| $ ) 2583 ; (S -> DOT C C |,| $ ) 2584 ; (C -> DOT |c| C |,| |c|) 2585 ; (C -> DOT |c| C |,| |d|) 2586 ; (C -> DOT |d| |,| |c|) 2587 ; (C -> DOT |d| |,| |d|) ) 2588 ; 2589 ; 2590 ; ------------------------------------------------------------------------------ 2591 2592 (defun closure( item-list ) 2593 2594 (let ((closed-item-list item-list) ; Closure of item-list. 2595 (item-num -1) ; nth item in item-list. 2596 (nonterm nil) ; B 2597 (first-syms nil) ; FIRST[ beta a ] 2598 (item nil) ; Current item in item-list. 2599 (new-item nil)) ; [B -> . gamma, b] 2600 2601 (loop ; Loop over each item. 2602 2603 (setq item-num (1+ item-num)) ; Advance to next item. 2604 2605 (setq item (nth item-num closed-item-list)) ; Get current item, 2606 ; [A -> alpha . B beta , a] 2607 2608 (if (null item) (return)) ; End of the list. 2609 2610 (setq nonterm (symbol-after-dot! item)) ; Get B. 2611 2612 2613 (if (nonterminal? nonterm) ; B is nonterminal. 2614 2615 (dolist (production *productions*) 2616 2617 (cond ((valid-production? nonterm ; production = [B -> gamma] 2618 production) 2619 2620 (setq first-syms ; FIRST[ beta a ] 2621 (first-derived-terminals 2622 `(,@(string-before-comma! item) ; Get beta. 2623 ,(lookahead-of! item)))) ; Get a. 2624 2625 (dolist (lookahead first-syms) ; for each b in 2626 ; FIRST[ beta a ] 2627 2628 (setq new-item ; [ B -> . gamma , b ] 2629 (make-item production 2630 lookahead)) 2631 2632 (setq closed-item-list ; Add to end of list. 2633 (insertItemIntoList new-item 2634 closed-item-list)))))))) 2635 closed-item-list) 2636 ) 2637 2638 2639 2640 2641 ; ------------------------------------------------------------------------------ 2642 ; | compute-goto | 2643 ; ------------------------------------------------------------------------------ 2644 ; 2645 ; DESCRIPTION 2646 ; 2647 ; Compute the goto on a set-of-items and a grammar symbol. 2648 ; 2649 ; CALLING SEQUENCE 2650 ; 2651 ; (compute-goto set-of-items grammar-symbol) 2652 ; 2653 ; set-of-items Set of items I. 2654 ; 2655 ; grammar-symbol Grammar symbol X. 2656 ; 2657 ; Returns: Goto function GOTO( I, X ) defined as follows: 2658 ; 2659 ; For all items [A -> alpha . X beta , a] in I, 2660 ; add together the closures of [A -> alpha X . beta, a]. 2661 ; EXAMPLE 2662 ; 2663 ; (compute-goto '( (SP -> DOT s |,| $) 2664 ; ( S -> DOT c c |,| $) 2665 ; ( c -> |c| c |,| c) 2666 ; ( c -> |c| c |,| d) 2667 ; ( c -> d |,| c) 2668 ; ( c -> d |,| d)) 2669 ; 2670 ; 'S) => ( (SP -> S DOT |,| $)) 2671 ; 2672 ; ------------------------------------------------------------------------------ 2673 2674 (defun compute-goto( set-of-items grammar-symbol ) 2675 2676 (let ( (new-set nil) ) 2677 2678 (dolist (item set-of-items) 2679 2680 ; Examine each of the form [A -> alpha . X beta, a] 2681 (if (equal (symbol-after-dot! item) 2682 grammar-symbol) 2683 2684 ; Add [A -> alpha X . beta , a ] if not already there. 2685 (setq new-set 2686 (insertItemIntoList (move-dot-right item) 2687 new-set)) 2688 ) 2689 ) 2690 2691 ; Closure of the new list. 2692 (closure new-set) 2693 ) 2694 ) 2695 2696 2697 2698 2699 ; ============================================================================== 2700 ; | Goto Graph Functions | 2701 ; ============================================================================== 2702 2703 2704 ; ------------------------------------------------------------------------------ 2705 ; | create-new-node | 2706 ; ------------------------------------------------------------------------------ 2707 ; 2708 ; DESCRIPTION 2709 ; 2710 ; Create a new node in the goto graph given the data. 2711 ; 2712 ; CALLING SEQUENCE 2713 ; 2714 ; (create-new-node state hash-value set-of-items) 2715 ; 2716 ; hash-value Hash value of core of items. 2717 ; 2718 ; Returns: 2719 ; 2720 ; EXAMPLE 2721 ; 2722 ; (create-new-node 1 3668 '((SP -> S DOT |,| $)) 3648) 2723 ; => (1 3668 ((SP -> S DOT |,| $))) 2724 ; 2725 ; ------------------------------------------------------------------------------ 2726 2727 (defun create-new-node( current-state 2728 hash-value-of-core-of-items 2729 list-of-items ) 2730 2731 `(,current-state ,hash-value-of-core-of-items ,list-of-items) 2732 ) 2733 2734 (defun create-new-link( current-state 2735 grammar-symbol 2736 next-state) 2737 2738 `(,current-state ,grammar-symbol ,next-state) 2739 ) 2740 2741 2742 ; ------------------------------------------------------------------------------ 2743 ; | select-items! | 2744 ; ------------------------------------------------------------------------------ 2745 ; 2746 ; DESCRIPTION 2747 ; 2748 ; Select out the set of items in a node of the goto graph. 2749 ; 2750 ; CALLING SEQUENCE 2751 ; 2752 ; (select-items! node) 2753 ; 2754 ; node A node in the goto graph, (i2 i1 X SET-OF-ITEMS) 2755 ; 2756 ; Returns: SET-OF-ITEMS 2757 ; 2758 ; EXAMPLE 2759 ; 2760 ; (select-items! 2761 ; '( 0 2762 ; 3668 2763 ; ( 2764 ; (SP -> DOT S |,| $) 2765 ; ( S -> DOT S |a| S |b| |,| $) 2766 ; ) 2767 ; ) 2768 ; ) 2769 ; 2770 ; => 2771 ; 2772 ; ------------------------------------------------------------------------------ 2773 2774 (defun select-items!( node ) 2775 2776 (third node) 2777 2778 ) 2779 2780 2781 (defun hash-value!( node ) 2782 (second node) 2783 ) 2784 2785 (defun links!( goto-graph ) 2786 (first goto-graph) 2787 ) 2788 2789 (defun nodes!( goto-graph ) 2790 (second goto-graph) 2791 ) 2792 2793 (defun nth-node!( node-num goto-graph ) 2794 2795 (nth node-num (nodes! goto-graph)) 2796 2797 ) 2798 2799 (defun first-node( goto-graph ) 2800 (first (nodes! goto-graph)) 2801 ) 2802 2803 (defun rest-node( goto-graph ) 2804 (rest (nodes! goto-graph)) 2805 ) 2806 2807 (defun insert-node( node goto-graph ) 2808 2809 `( ,(links! goto-graph) 2810 ,(insertItemIntoList node 2811 (nodes! goto-graph)) 2812 ) 2813 ) 2814 2815 (defun insert-link( link goto-graph ) 2816 2817 `( ,(insertItemIntoList link (links! goto-graph)) 2818 ,(nodes! goto-graph) 2819 ) 2820 ) 2821 2822 2823 2824 2825 ; ------------------------------------------------------------------------------ 2826 ; | current-state! | 2827 ; ------------------------------------------------------------------------------ 2828 ; 2829 ; DESCRIPTION 2830 ; 2831 ; Get the number of a node in the goto graph. 2832 ; 2833 ; CALLING SEQUENCE 2834 ; 2835 ; (current-state! node) 2836 ; 2837 ; node A node in the goto graph, (i1 i2 X SET-OF-ITEMS)) 2838 ; 2839 ; Returns: i1 2840 ; 2841 ; EXAMPLE 2842 ; 2843 ; (current-state! '(1 0 S ((SP -> S DOT |,| $)))) => 1 2844 ; 2845 ; ----------------------------------------------------------------------------- 2846 2847 (defun current-state!( node ) 2848 2849 (first node) 2850 2851 ) 2852 2853 2854 2855 2856 ; ------------------------------------------------------------------------------ 2857 ; | transition-symbol! | 2858 ; ------------------------------------------------------------------------------ 2859 ; 2860 ; DESCRIPTION 2861 ; 2862 ; Return the symbol upon which an action is performed. 2863 ; 2864 ; CALLING SEQUENCE 2865 ; 2866 ; (transition-symbol! node) 2867 ; 2868 ; node A node in the goto graph, (i1 i2 X SET-OF-ITEMS)) 2869 ; 2870 ; Returns: X 2871 ; 2872 ; EXAMPLE 2873 ; 2874 ; (transition-symbol! '(1 0 S ((SP -> S DOT |,| $)))) => S 2875 ; 2876 ; ------------------------------------------------------------------------------ 2877 2878 (defun transition-symbol!( node ) 2879 2880 (third node) 2881 2882 ) 2883 2884 2885 ; ------------------------------------------------------------------------------ 2886 ; | set-of-items-in-graph? | 2887 ; ------------------------------------------------------------------------------ 2888 ; 2889 ; DESCRIPTION 2890 ; 2891 ; Find out if a goto graph contains a given a set of items (or 2892 ; their cores). 2893 ; 2894 ; CALLING SEQUENCE 2895 ; 2896 ; (set-of-items-in-graph? set-of-items goto-graph :compare-type type) 2897 ; 2898 ; set-of-items Any set of items. 2899 ; 2900 ; goto-graph The goto graph of the grammar. 2901 ; 2902 ; type Optional keyword. If omitted, it defaults to 'item. 2903 ; 2904 ; Returns T if any node in goto-graph has the same set of items 2905 ; (for type = 'item) or the same core (for type = 'core) 2906 ; as set-of-items. 2907 ; 2908 ; EXAMPLE 2909 ; 2910 ; (set-of-items-in-graph? 2911 ; '( (a -> b DOT c |,| d) (e -> f DOT g |,| h) ) 2912 ; '( ( (-1 nil 0) (0 a 1) ) 2913 ; ( (0 12 ( (a -> b DOT c |,| ddd) (e -> f DOT g |,| h) ) ) 2914 ; (1 23 ( (i -> j DOT |,| k) ) ) )) 2915 ; ) => nil 2916 ; 2917 ; but 2918 ; 2919 ; (set-of-items-in-graph? 2920 ; '( (a -> b DOT c |,| d) (e -> f DOT g |,| h) ) 2921 ; '( ( (-1 nil 0) (0 a 1) ) 2922 ; ( (0 12 ( (a -> b DOT c |,| d) (e -> f DOT g |,| h) ) ) 2923 ; (1 23 ( (i -> j DOT |,| k) ) ) )) 2924 ; :compare-type 'core) => T 2925 ; 2926 ; 2927 ; ------------------------------------------------------------------------------ 2928 2929 (defun set-of-items-in-graph?( set-of-items goto-graph 2930 &key (compare-type 'item) ) 2931 2932 (cond ( (null goto-graph) nil) ; goto graph = () 2933 ( (null (first-node goto-graph)) nil) ; goto graph = ( (...) () ) 2934 2935 ( t 2936 2937 ; Scan all nodes in the goto-graph, looking for one which has 2938 ; a matching item. 2939 (dolist (node (nodes! goto-graph)) 2940 2941 (if (equal-sets-of-items? set-of-items (select-items! node) 2942 :compare-type compare-type) 2943 (return t) 2944 ) 2945 ) 2946 ; return nil by default 2947 ) 2948 ) 2949 2950 ) 2951 2952 2953 2954 ; ------------------------------------------------------------------------------ 2955 ; 2956 ; DESCRIPTION 2957 ; 2958 ; Find out the current node (state) number of a node in the goto 2959 ; graph which contains the given set of items or their cores. 2960 ; 2961 ; CALLING SEQUENCE 2962 ; 2963 ; (node-number set-of-items goto-graph :compare-type compare-type) 2964 ; 2965 ; set-of-items Set of items to search for. 2966 ; 2967 ; goto-graph Goto graph of the grammar. 2968 ; 2969 ; compare-type If 'item, find identical set of items, but if 'core, 2970 ; find identical cores. Defaults to 'item. 2971 ; 2972 ; Returns: Number of the node in the goto graph containing the given 2973 ; set of items or core. 2974 ; 2975 ; EXAMPLE 2976 ; 2977 ; (node-number '(( S -> C C DOT |,| $)) *goto-graph*) => 5 2978 ; (node-number '(( S -> C C DOT |,| |e|)) *goto-graph*) => NIL 2979 ; but 2980 ; (node-number '(( S -> C C DOT |,| |e|)) *goto-graph* 2981 ; :compare-type 'core) => 5 2982 ; 2983 ; ------------------------------------------------------------------------------ 2984 2985 (defun node-number( set-of-items goto-graph &key (compare-type 'item) ) 2986 2987 (cond ( (null goto-graph) -1) ; goto graph = () 2988 ( (null (first-node goto-graph)) -1) ; goto graph = ( (...) () ) 2989 2990 ( t 2991 2992 ; Scan all nodes in the goto-graph, looking for one which has 2993 ; a matching item. 2994 (dolist (node (nodes! goto-graph)) 2995 2996 (if (equal-sets-of-items? set-of-items (select-items! node) 2997 :compare-type compare-type) 2998 (return (current-state! node)) 2999 ) 3000 ) 3001 ; return nil by default 3002 ) 3003 ) 3004 3005 ) 3006 3007 3008 3009 ; Hash value of the core of an item. 3010 ; (core-hash-value-of-item '(S -> S DOT |a| S |b| |,| $)) => 1790 3011 ; 3012 ; Sum up the integer value of all characters in each of the symbols. 3013 ; Multiply by the position of DOT in the item to distinguish items 3014 ; with the same symbols. 3015 ; 3016 (defun core-hash-value-of-item( item ) 3017 3018 (let ( (hash-value 0) 3019 (symbol-position -1) 3020 (string-of-symbol "") 3021 (length-of-string 0) ) 3022 3023 ; Hash the core of the item only. 3024 (dolist (s (core-of-item! item)) 3025 3026 ; Symbol index in the item, starting with 0. 3027 (setq symbol-position (+ 1 symbol-position)) 3028 3029 ; Convert symbol to string and get its length. 3030 (setq string-of-symbol (symbol-name s)) 3031 (setq length-of-string (length string-of-symbol)) 3032 3033 ; Sum the integer values of each character in the symbol. 3034 (dotimes (i length-of-string) 3035 (setq hash-value 3036 (+ hash-value 3037 (char-int (char string-of-symbol i))) 3038 ) 3039 ) 3040 3041 ; Multiply by the index position of the DOT symbol 3042 ; to distinguish between same items with dots in different 3043 ; locations such as 3044 ; [S -> a . b , c] and [S -> a b . , c] 3045 (if (equal s 'DOT) 3046 (setq hash-value (* hash-value symbol-position)) 3047 ) 3048 ) 3049 hash-value) 3050 ) 3051 3052 ; Only the core matters: 3053 ; (core-hash-value-of-set-of-items 3054 ; '( (SP -> S DOT |,| $) 3055 ; ( S -> S DOT |a| S |b| |,| $) 3056 ; ( S -> S DOT |a| S |b| |,| |a|))) => 3542 3057 ; 3058 ; (core-hash-value-of-set-of-items 3059 ; '( (SP -> S DOT |,| $) 3060 ; ( S -> S DOT |a| S |b| |,| $))) => 3542 3061 ; 3062 ; (core-hash-value-of-set-of-items 3063 ; '( (SP -> S DOT |,| $))) => 1752 3064 ; 3065 (defun core-hash-value-of-set-of-items( set-of-items ) 3066 3067 (let ( (hashes-of-items (mapcar #'core-hash-value-of-item set-of-items)) 3068 (sum 0) 3069 ) 3070 3071 ; Hash value on the entire set of items. 3072 ; Don't count duplicate items. 3073 (dolist (i (remove-duplicates hashes-of-items)) 3074 (setq sum (+ sum i)) 3075 ) 3076 3077 ; Modulo to keep within size of an integer. 3078 (mod sum +hash-value-upper-limit+) 3079 ) 3080 ) 3081 3082 3083 3084 3085 ; ============================================================================== 3086 ; | LALR(1) Core Merging Functions 3087 ; ============================================================================== 3088 3089 ; 3090 ; Partition 3091 ; 3092 ; merge-equivalence-classes( '(2 4) '() ) 3093 ; => ( (2 4) ) 3094 ; 3095 ; merge-equivalence-classes( '(2 4) '( (4 5) (6 7) ) ) 3096 ; => ( (2 4 5) (6 7) ) 3097 ; 3098 ; merge-equivalence-classes( '(2 4) '( (4 5) (2 7) (3 6) ) ) 3099 ; => ( (3 6) (2 4 5 7) ) 3100 ; 3101 (defun merge-equivalence-classes( equivalence partition ) 3102 3103 (cond 3104 ; Dispose of trivial inputs. 3105 ( (null equivalence) partition) 3106 ( (= (length equivalence) 1) partition) 3107 3108 ; Partition is empty. 3109 ( (null partition) (list equivalence) ) 3110 3111 ; First set in the partition has common elements 3112 ; with the equivalence. 3113 ( (intersection equivalence (first partition) ) 3114 3115 ; Sort the elements in the equivalence classes. 3116 (mapcar #'(lambda (x) (sort x #'(lambda (x y) (< x y)))) 3117 3118 ; Remerge the other sets in the partition. 3119 (merge-equivalence-classes 3120 3121 ; Merge the equivalence into the first set in the 3122 ; partition. 3123 (union equivalence (first partition) :test #'equal) 3124 (rest partition) 3125 ) 3126 ) 3127 ) 3128 3129 (t 3130 3131 ; Sort the elements in the equivalence classes. 3132 (mapcar #'(lambda (x) (sort x #'(lambda (x y) (< x y)))) 3133 3134 ; Merge the other sets in the partition. 3135 (cons (first partition) 3136 (merge-equivalence-classes equivalence (rest partition)) 3137 ) 3138 ) 3139 ) 3140 ) 3141 ) 3142 3143 3144 ; If member of equiv. class in the partition, return the smallest 3145 ; equivalent element. 3146 (defun remap-equivalent( num partition ) 3147 3148 (cond ( (null partition) num) 3149 3150 ( (member num (first partition)) 3151 (caar partition) 3152 ) 3153 3154 (t 3155 (remap-equivalent num (rest partition)) 3156 ) 3157 ) 3158 ) 3159 3160 3161 3162 ; ------------------------------------------------------------------------------ 3163 ; | merge-lookaheads | 3164 ; ------------------------------------------------------------------------------ 3165 ; 3166 ; DESCRIPTION 3167 ; 3168 ; Collapse together two sets of items, eliminating duplicated items. 3169 ; 3170 ; CALLING SEQUENCE 3171 ; 3172 ; (merge-lookaheads set-of-items1 set-of-items2) 3173 ; 3174 ; set-of-items 3175 ; 3176 ; node A node in the goto graph with the same cores as in 3177 ; set-of-items. 3178 ; 3179 ; Returns: Updated node with the same core as before, but 3180 ; added lookaheads. 3181 ; 3182 ; METHOD 3183 ; 3184 ; Remove duplicates and use a set union to merge the lookaheads. 3185 ; 3186 ; EXAMPLE 3187 ; 3188 ; (merge-lookaheads '( (D -> E DOT F |,| |b|) 3189 ; (A -> B DOT C |,| |a|) ) 3190 ; 3191 ; '( (A -> B DOT C |,| |a|) 3192 ; (A -> B DOT C |,| |a|) 3193 ; (D -> E DOT F |,| |c|) )) => 3194 ; 3195 ; ( (D -> E DOT F |,| |b|) 3196 ; (A -> B DOT C |,| |a|) 3197 ; (D -> E DOT F |,| |c|) ) 3198 ; 3199 ; ------------------------------------------------------------------------------ 3200 3201 (defun merge-lookaheads( set-of-items1 set-of-items2 ) 3202 3203 ; Use the equal function to test for duplicates, since we are handling 3204 ; elements which are lists, not atoms. 3205 (union (remove-duplicates set-of-items1) 3206 (remove-duplicates set-of-items2) 3207 :test #'equal) 3208 ) 3209 3210 3211 3212 3213 ; ------------------------------------------------------------------------------ 3214 ; | merge-cores | 3215 ; ------------------------------------------------------------------------------ 3216 ; 3217 ; DESCRIPTION 3218 ; 3219 ; If the given set of items has the same core as a node in the goto graph, 3220 ; merge it into the node. 3221 ; 3222 ; CALLING SEQUENCE 3223 ; 3224 ; merge-cores( goto-graph ) 3225 ; 3226 ; goto-graph Goto graph of the grammar. 3227 ; 3228 ; Returns: All sets of items with the same cores are merged, 3229 ; and states and links are renumbered. 3230 ; 3231 ; EXAMPLE 3232 ; 3233 ; ------------------------------------------------------------------------------ 3234 3235 (defun merge-cores( goto-graph ) 3236 3237 (let ( (nodes (nodes! goto-graph)) ; Get the nodes out 3238 (links (links! goto-graph)) ; and the links. 3239 (previous-node nil) 3240 (previous-state -1) 3241 (previous-hash-value -1) 3242 (equiv-class nil) 3243 (merged-goto-graph '(() ()) ) ; New merged goto graph. 3244 ) 3245 3246 ; Sort the nodes on hash value to keep sets of items with the same 3247 ; cores adjacent. 3248 (setq nodes 3249 (sort nodes #'(lambda (x y) (< (hash-value! x) (hash-value! y))))) 3250 3251 ; Scan through the nodes, looking for sets of items with the 3252 ; same cores. 3253 (dolist (node nodes) 3254 3255 ; Current node and previous node have same cores. 3256 (cond ( (= (hash-value! node) previous-hash-value) 3257 3258 (setq equiv-class 3259 (merge-equivalence-classes 3260 (list (current-state! node) previous-state) 3261 equiv-class 3262 ) 3263 ) 3264 3265 ; Create a new merged node. 3266 (setq node 3267 (create-new-node 3268 3269 ; Use the lowest numbered state 3270 ; for the new node number. 3271 (if (< (current-state! node) 3272 (current-state! previous-node)) 3273 (current-state! node) 3274 (current-state! previous-node) 3275 ) 3276 3277 ; Hash value. 3278 (hash-value! node) 3279 3280 ; Merge the cores in the items. 3281 (merge-lookaheads (select-items! node) 3282 (select-items! previous-node)) 3283 ) 3284 ) 3285 ) 3286 3287 ; Current node differs, send off previous node. 3288 (t 3289 (if (not (null previous-node)) 3290 (setq merged-goto-graph 3291 (insert-node previous-node merged-goto-graph)) 3292 ) 3293 ) 3294 ) 3295 3296 (setq previous-node node) 3297 (setq previous-hash-value (hash-value! node)) 3298 (setq previous-state (current-state! node)) 3299 ) 3300 3301 ; Send off last node, merged or otherwise, in any case. 3302 (setq merged-goto-graph 3303 (insert-node previous-node merged-goto-graph)) 3304 3305 ; Renumber states in the links. 3306 (dolist (link links) 3307 3308 (setq link 3309 `( ,(remap-equivalent (first link) equiv-class) 3310 ,(second link) 3311 ,(remap-equivalent (third link) equiv-class) 3312 ) 3313 ) 3314 (setq merged-goto-graph (insert-link link merged-goto-graph)) 3315 ) 3316 merged-goto-graph 3317 ) 3318 3319 ) 3320 3321 3322 3323 3324 3325 ; ============================================================================== 3326 ; | LR(1) Action and Goto table utilities | 3327 ; ============================================================================== 3328 3329 3330 3331 ; ------------------------------------------------------------------------------ 3332 ; | create-goto-graph | 3333 ; ------------------------------------------------------------------------------ 3334 ; 3335 ; DESCRIPTION 3336 ; 3337 ; Create a goto graph containing the sets of items for the grammar. 3338 ; 3339 ; CALLING SEQUENCE 3340 ; 3341 ; (create-goto-graph parser-type) 3342 ; 3343 ; parser-type The type of grammar: 'LR1 or 'LALR1 3344 ; 3345 ; Returns: Goto graph of the grammar. 3346 ; 3347 ; METHOD 3348 ; 3349 ; We create a DFA which recognizes the viable prefixes of the grammar. 3350 ; 3351 ; The DFA is called the goto graph. Each node in the graph is of the 3352 ; form (i2 i1 X SET-OF-ITEMS). 3353 ; 3354 ; i1 = V( gamma ) = (set of all items valid for viable prefix gamma). 3355 ; i2 = V( gamma X ) 3356 ; X = a grammar symbol (but not EPSILON). 3357 ; 3358 ; An item [A -> alpha . beta] is valid for viable prefix gamma alpha if 3359 ; gamma A is also a viable prefix. 3360 ; 3361 ; The prefix gamma is viable if there is a rightmost derivation 3362 ; S =>* gamma w. 3363 ; 3364 ; The first state is I0 = V( EPSILON ) = { [S -> . S , $] }. 3365 ; 3366 ; We process nodes as follows: 3367 ; 3368 ; node-num ---> 0 3369 ; ... 3370 ; node-num ---> 3 ----+----+ 3371 ; | a | 3372 ; 4 <---+ | b 3373 ; | 3374 ; 5 <---+----+ 3375 ; ... 3376 ; 3 ----+----+ 3377 ; | a | 3378 ; 4 <---+ | b 3379 ; | 3380 ; node-num ---> 5 <---+----+ 3381 ; 3382 ; ----------------------------------------------------------------------------- 3383 3384 (defun create-goto-graph( parser-type ) 3385 3386 (let* ( (goto-of-item nil) ; Set of items, GOTO( I, X ) 3387 (node nil) ; Node in graph. 3388 (node-num 0) ; Next node in goto graph to process. 3389 (new-node nil) ; New node in Goto graph. 3390 (new-link nil) 3391 (new-state-num 1) ; State number of the next node. 3392 (goto-graph '( () () ) ); Initial goto-graph. 3393 ) 3394 3395 3396 ; Our very first set of items I0 is the closure of [S' -> .S, $] 3397 (setq goto-of-item (closure (list (create-augmenting-item)))) 3398 3399 ; The initial node has state 0, items as above, and hash value. 3400 (setq node (create-new-node 0 3401 (core-hash-value-of-set-of-items 3402 goto-of-item) 3403 goto-of-item)) 3404 3405 (setq new-link (create-new-link -1 nil 0)) 3406 3407 ; Insert nodes and links into the goto graph. 3408 (setq goto-graph (insert-node node goto-graph)) 3409 (setq goto-graph (insert-link new-link goto-graph)) 3410 3411 (loop 3412 ; Latest unprocessed node in the goto graph. Starting with I0. 3413 (setq node (nth-node! node-num goto-graph)) 3414 3415 (if (null node) (return)) ; No more sets of items. 3416 3417 ; For each grammar symbol X ... 3418 (dolist (grammar-symbol (find-grammar-symbols)) 3419 3420 ; ...compute GOTO( I, X ), the new set of items. 3421 (setq goto-of-item 3422 (compute-goto (select-items! node) 3423 grammar-symbol)) 3424 3425 ; Create a new node with set of items GOTO( I, X ), 3426 (setq new-node 3427 (create-new-node new-state-num 3428 (core-hash-value-of-set-of-items 3429 goto-of-item) 3430 goto-of-item ) 3431 ) 3432 3433 ; GOTO( I, X ) is empty. 3434 (if (not (null goto-of-item)) 3435 3436 (cond ( 3437 ; Our GOTO( I, X ) has computed the same sets of 3438 ; items. 3439 (set-of-items-in-graph? goto-of-item 3440 goto-graph) 3441 3442 ; Insert a new link 3443 ; X 3444 ; I ---> <existing node in graph> 3445 (setq new-link (create-new-link 3446 node-num 3447 grammar-symbol 3448 (node-number goto-of-item 3449 goto-graph)) 3450 ) 3451 3452 (setq goto-graph (insert-link new-link goto-graph)) 3453 ) 3454 3455 3456 3457 ; Add a new node with a new set of items and 3458 ; a new link. 3459 ; Increment the current state number. 3460 (t 3461 (setq goto-graph (insert-node new-node goto-graph)) 3462 3463 (setq new-link (create-new-link (current-state! node) 3464 grammar-symbol 3465 new-state-num)) 3466 (setq goto-graph (insert-link new-link goto-graph)) 3467 3468 (setq new-state-num (1+ new-state-num)) 3469 ) 3470 3471 ) ; end cond 3472 3473 ) ; end if empty GOTO( I, X ) 3474 ) ; end dolist 3475 3476 ; Bump up the node number. 3477 (setq node-num (1+ node-num )) 3478 3479 ) ; end loop 3480 3481 3482 ; For LALR(1) languages, sort the goto graph on core hash value 3483 ; then merge states with the same cores. 3484 (if (equal parser-type 'LALR1) 3485 (setq goto-graph (merge-cores goto-graph)) 3486 ) 3487 goto-graph 3488 3489 ) ; end let 3490 3491 ) 3492 3493 ; ------------------------------------------------------------------------------ 3494 ; | goto | 3495 ; ------------------------------------------------------------------------------ 3496 ; 3497 ; DESCRIPTION 3498 ; 3499 ; The LR GOTO function derived from the goto graph. 3500 ; 3501 ; CALLING SEQUENCE 3502 ; 3503 ; (goto i A goto-graph) 3504 ; 3505 ; goto-graph The goto graph with entries of the form 3506 ; (i2 i1 X <list of items>) 3507 ; 3508 ; state The initial state i1. 3509 ; 3510 ; symbol The transition symbol X. 3511 ; 3512 ; Returns: The next state i2, or NIL if GOTO is undefined. 3513 ; 3514 ; EXAMPLE 3515 ; 3516 ; Suppose *goto-graph* = ( ( (6 |a| 4) ) (nodes) ) 3517 ; 3518 ; (goto 6 '|a| goto-graph) => 4 3519 ; 3520 ; ------------------------------------------------------------------------------ 3521 3522 (defun goto( state symbol goto-graph) 3523 3524 (dolist (link (links! goto-graph)) 3525 3526 (if (and (= state (first link)) 3527 (equal symbol (second link)) 3528 ) 3529 3530 (return (third link)) 3531 ) 3532 ) 3533 ; Return nil by default. 3534 ) 3535 3536 3537 ; ------------------------------------------------------------------------------ 3538 ; | action-list! | 3539 ; ------------------------------------------------------------------------------ 3540 ; 3541 ; DESCRIPTION 3542 ; 3543 ; Return the list of actions in a line of the action table. 3544 ; 3545 ; CALLING SEQUENCE 3546 ; 3547 ; (action-list action-table-line) 3548 ; 3549 ; action-table-line One line of action table of the form 3550 ; ( (stateNumber) (listOfActions) ) 3551 ; 3552 ; Returns: (listOfActions) 3553 ; 3554 ; EXAMPLE 3555 ; 3556 ; (action-list! '( (0) ((|c| (S 3)) (|d| (S 4))))) 3557 ; => ((|c| (S 3)) (|d| (S 4))) 3558 ; 3559 ; ------------------------------------------------------------------------------ 3560 3561 (defun action-list!( line-of-table ) 3562 3563 (second line-of-table) 3564 ) 3565 3566 3567 3568 ; ------------------------------------------------------------------------------ 3569 ; | action-line-state! | 3570 ; ------------------------------------------------------------------------------ 3571 ; 3572 ; DESCRIPTION 3573 ; 3574 ; Return the state of a line of the action table. 3575 ; 3576 ; CALLING SEQUENCE 3577 ; 3578 ; (action-line-state! action-table) 3579 ; 3580 ; action-table Table of the form ( (stateNum) (listOfActions) ) 3581 ; 3582 ; Returns: stateNum 3583 ; 3584 ; EXAMPLE 3585 ; 3586 ; (action-line-state! '( (0) ((|c| (S 3)) (|d| (S 4)) (DEFAULT (ERROR)))) ) 3587 ; => 0 3588 ; 3589 ; ------------------------------------------------------------------------------ 3590 3591 (defun action-line-state!( action-table-line ) 3592 3593 (first (first action-table-line)) 3594 3595 ) 3596 3597 3598 3599 3600 ; ------------------------------------------------------------------------------ 3601 ; | action-trigger-symbol! | 3602 ; ------------------------------------------------------------------------------ 3603 ; 3604 ; DESCRIPTION 3605 ; 3606 ; Return the transition symbol in an action pair. 3607 ; 3608 ; CALLING SEQUENCE 3609 ; 3610 ; (action-trigger-symbol! action-pair) 3611 ; 3612 ; action-pair An action/new-state pair of a line in the action table 3613 ; of the form (X (action i)). 3614 ; 3615 ; Returns: X 3616 ; 3617 ; EXAMPLE 3618 ; 3619 ; (action-trigger-symbol! '(|c| (S 3))) => |c| 3620 ; 3621 ; ------------------------------------------------------------------------------ 3622 3623 (defun action-trigger-symbol!( action-pair ) 3624 3625 (first action-pair) 3626 3627 ) 3628 3629 3630 3631 ; ------------------------------------------------------------------------------ 3632 ; | insert-action-or-goto-into-list | 3633 ; ------------------------------------------------------------------------------ 3634 ; 3635 ; DESCRIPTION 3636 ; 3637 ; Insert an action into a line of the action table. Check for conflicts. 3638 ; 3639 ; CALLING SEQUENCE 3640 ; 3641 ; (insert-action-or-goto-into-list symbol new-state list-of-actions action) 3642 ; 3643 ; symbol The transition symbol X. 3644 ; 3645 ; new-state The new state i. 3646 ; 3647 ; list-of-actions The action list part of one line of the action or 3648 ; goto table. 3649 ; 3650 ; action If 'NONE, update a goto list, else add this action 3651 ; to an action list. 3652 ; 3653 ; Returns: Augmented action list containing a new action pair 3654 ; (X (action i)) or a conflict pair 3655 ; (CONFLICT (X (action i)) (X (old-action j))) 3656 ; Similarly for a goto list. 3657 ; EXAMPLE 3658 ; 3659 ; (insert-action-or-goto-into-list 'a 666 '((b (s 3)) (c (r 2))) :action 's) 3660 ; =>((B (S 3)) (C (R 2)) (A (S 666))) 3661 ; 3662 ; (insert-action-or-goto-into-list 'b 666 '((b (s 3)) (c (r 2))) :action 's) 3663 ; => ((B (S 3)) (C (R 2)) (CONFLICT ((B (S 666)) (B (S 3))))) 3664 ; 3665 ; (insert-action-or-goto-into-list 'a 666 '((b 5) (c 6))) 3666 ; => ((B 5) (C 6) (A 666)) 3667 ; 3668 ; ------------------------------------------------------------------------------ 3669 3670 (defun insert-action-or-goto-into-list( symbol new-state list-of-actions 3671 &key (action 'NONE) ) 3672 3673 3674 ; Nothing or only a default in the list. Insert a new action. 3675 3676 (cond ( (or (null list-of-actions) 3677 (equal (first list-of-actions) '(default (error)))) 3678 3679 (cons (if (equal action 'NONE) 3680 `(,symbol ,new-state) ; Insert a goto. 3681 `(,symbol (,action ,new-state))) ; Insert an action. 3682 list-of-actions)) 3683 3684 ; Ignore duplicate actions. 3685 3686 ((equal (first list-of-actions) 3687 (if (equal action 'NONE) 3688 `(,symbol ,new-state) ; Compare a goto. 3689 `(,symbol (,action ,new-state)))) ; Compare an action. 3690 3691 list-of-actions) ; Return list unchanged. 3692 3693 3694 ; We have a conflict on the first action. Insert a conflict report at the 3695 ; end of the row, unless it is there already. 3696 3697 ((equal symbol (action-trigger-symbol! (first list-of-actions))) 3698 3699 (setq *conflicts* T) 3700 3701 (insertItemIntoList (if (equal action 'NONE) 3702 `(conflict ((,symbol ,new-state) 3703 (,@(first list-of-actions)))) 3704 `(conflict ((,symbol (,action ,new-state)) 3705 (,@(first list-of-actions))))) 3706 3707 list-of-actions)) 3708 3709 3710 ; No conflict yet --- try insertion in the rest of the list. 3711 3712 ( t (cons (first list-of-actions) 3713 (insert-action-or-goto-into-list symbol 3714 new-state 3715 (rest list-of-actions) 3716 :action action)))) 3717 ) 3718 3719 3720 3721 ; ------------------------------------------------------------------------------ 3722 ; | add-action-or-goto | 3723 ; ------------------------------------------------------------------------------ 3724 ; 3725 ; DESCRIPTION 3726 ; 3727 ; Add an action to the action table or a goto to the goto table. 3728 ; 3729 ; 3730 ; CALLING SEQUENCE 3731 ; 3732 ; (add-action-or-goto( state symbol new-state table action) 3733 ; 3734 ; state The current state i1. 3735 ; 3736 ; new-state The new state i2 3737 ; 3738 ; table The action or goto table. 3739 ; 3740 ; action Defaults to 'NONE for the goto table, otherwise, the 3741 ; action to take (e.g. S, R, ACC, ERROR) 3742 ; 3743 ; Returns: The updated action or goto table. 3744 ; 3745 ; EXAMPLE 3746 ; 3747 ; Let table = ( ( (2) ( (a (s 5)) 3748 ; (b (r 2)) 3749 ; (default (error)))) 3750 ; ( (4) ( ($ (acc nil)) 3751 ; (default (error))))) 3752 ; 3753 ; To insert ACTION[ 2, b ] = (shift 6) into the table, call 3754 ; 3755 ; (add-action-or-goto 2 'b 6 table :action 's) => 3756 ; 3757 ; ( ( (2) ( (A (S 5)) 3758 ; (B (R 2)) 3759 ; (DEFAULT (ERROR)) 3760 ; (CONFLICT ((B (S 6)) (B (R 2)))))) 3761 ; ( (4) ( ($ (ACC NIL)) 3762 ; (DEFAULT (ERROR))))) 3763 ; 3764 ; We detect a shift/reduce conflict on symbol b and report it. 3765 ; 3766 ; On the other hand, 3767 ; 3768 ; (add-action-or-goto 2 'c 6 table :action 's) => 3769 ; 3770 ; ( ( (2) ( (A (S 5)) 3771 ; (B (R 2)) 3772 ; (C (S 6)) 3773 ; (DEFAULT (ERROR)))) 3774 ; ( (4) ( ($ (ACC NIL)) 3775 ; (DEFAULT (ERROR))))) 3776 ; 3777 ; Suppose we have a goto table, 3778 ; 3779 ; table = ( ( (0) ( (a 10) 3780 ; (b 20) 3781 ; (default (error)))) 3782 ; ( (4) ( (a 11) 3783 ; (default (error))))) 3784 ; 3785 ; To insert GOTO[ 0, c ] = 6 call 3786 ; 3787 ; (add-action-or-goto 0 'c 6 table) => 3788 ; 3789 ; ( ( (0) ( (A 10) 3790 ; (B 20) 3791 ; (C 6) 3792 ; (DEFAULT (ERROR)))) 3793 ; ( (4) ( (A 11) 3794 ; (DEFAULT (ERROR))))) 3795 ; 3796 ; ----------------------------------------------------------------------------- 3797 3798 (defun add-action-or-goto( state symbol new-state table 3799 &key (action 'NONE)) 3800 3801 ; The table has no entries. Create a new action table of the form 3802 ; ( (State) ( (TransitionSymbol (Action NewState)) (default (error)))) 3803 ; or Goto table of the form, 3804 ; ( (State) ( (TransitionSymbol (NewState)) (default (error)))). 3805 ; 3806 ; NOTE: 3807 ; We assume the Goto graph starts with state 0. 3808 ; Since we insert new states into the action table in order, 3809 ; the order will be maintained as we scan through the Goto graph. 3810 3811 (cond ( (null table) `( 3812 ( (,state) 3813 3814 ( 3815 ,(if (equal action 'NONE) 3816 `(,symbol ,new-state) ; goto table 3817 `(,symbol (,action ,new-state)) 3818 ) 3819 3820 (default (error)) 3821 ) 3822 ) 3823 ) 3824 ) 3825 3826 3827 ; Found state in first line of table. Add the new action to 3828 ; this line. 3829 ( (= (action-line-state! (first table)) 3830 state) 3831 3832 (cons (list (first (first table)) ; Get state of first line. 3833 3834 (insert-action-or-goto-into-list symbol 3835 new-state 3836 (action-list! (first table)) 3837 :action action)) 3838 (rest table))) 3839 3840 ; State is smaller than first line's state. Create a new line 3841 ; containing a new state, action and (default (error)) and add it 3842 ; before the first line. 3843 ( (< state (action-line-state! (first table))) 3844 3845 (cons `( (,state) 3846 ( ,(if (equal action 'NONE) 3847 `(,symbol ,new-state) ; goto table 3848 `(,symbol (,action ,new-state)) 3849 ) 3850 (default (error)) 3851 ) 3852 ) 3853 table) 3854 ) 3855 3856 ; State is bigger than the first line's state. Decide later. 3857 ( t (cons (first table) 3858 (add-action-or-goto state symbol new-state (rest table) 3859 :action action)))) 3860 ) 3861 3862 3863 ; ------------------------------------------------------------------------------ 3864 ; | build-action-table | 3865 ; ------------------------------------------------------------------------------ 3866 ; 3867 ; DESCRIPTION 3868 ; 3869 ; Build the ACTION table of a cannonical LR(1) parser. 3870 ; 3871 ; 3872 ; CALLING SEQUENCE 3873 ; 3874 ; (build-action-table goto-graph) 3875 ; 3876 ; goto-graph The goto graph generated by make-items. 3877 ; 3878 ; Returns: Action table of the grammar. 3879 ; 3880 ; METHOD 3881 ; 3882 ; We initially add the action (default (error)) to each line of the table. 3883 ; If we generate any shift-reduce or reduce-reduce conflicts, 3884 ; we record them in the action table and check them later. 3885 ; 3886 ; EXAMPLE 3887 ; 3888 ; ------------------------------------------------------------------------------ 3889 3890 (defun build-action-table( goto-graph ) 3891 3892 (let ( (action-table nil) 3893 (first-symbols-after-dot nil) ) 3894 3895 (dolist (node (nodes! goto-graph)) ; Scan through every node in 3896 ; in the goto graph. 3897 (dolist (item (select-items! node)) 3898 3899 (cond ( 3900 ; For the item [S' -> S. , $], 3901 ; ACTION[ i, $ ] = (accept). 3902 (is-accept? item) 3903 3904 (setq action-table 3905 (add-action-or-goto 3906 (current-state! node) ; Current state i 3907 '$ ; Transition. 3908 'nil ; No state. 3909 action-table 3910 :action 'acc) ; No state. 3911 ) 3912 ) 3913 3914 ; For the item [A -> alpha . , b] 3915 ; ACTION[ i, b ] = (reduce k) 3916 ; where k is the number of the production A -> alpha 3917 ( (reduction? item) 3918 3919 (setq action-table 3920 (add-action-or-goto 3921 (current-state! node) ; Current state i 3922 (lookahead-of! item) ; b. 3923 (production-number ; Production num. 3924 (item-to-production item)) 3925 action-table 3926 :action 'r) 3927 ) 3928 ) 3929 3930 ; Prepare to add a possible shift. 3931 3932 ( t 3933 3934 (if *has-epsilon-productions* 3935 3936 ; When the grammar has epsilon-productions, for the item 3937 ; [A -> alpha . beta , b] 3938 ; where beta is not equal to the null-string EPSILON, 3939 ; for all a in EFF( beta b ), we add 3940 ; ACTION[ i, a ] = (shift j) where j = GOTO( i , a ). 3941 (setq first-symbols-after-dot 3942 3943 (if (reduction? item) ; beta = epsilon 3944 3945 nil 3946 3947 (first-derived-terminals `(,(symbol-after-dot! item) 3948 ,@(string-before-comma! item) 3949 ,(lookahead-of! item)) 3950 :type 'epsilon-free))) 3951 3952 ; For a grammar with no epsilon productions, for the item 3953 ; [A -> alpha . a beta , b] 3954 ; where a is a terminal, we add 3955 ; ACTION[ i, a ] = (shift j) where j = GOTO( i , a ). 3956 (setq first-symbols-after-dot 3957 3958 (if (terminal-after-dot? item) 3959 3960 (list (symbol-after-dot! item)) 3961 3962 nil)) 3963 ) 3964 3965 3966 ; Add a shift, if any. 3967 (dolist (term first-symbols-after-dot) 3968 3969 (setq action-table 3970 (add-action-or-goto (current-state! node) ; Current state i 3971 term ; Terminal a. 3972 (goto ; into state j. 3973 (current-state! node) 3974 term 3975 goto-graph) 3976 action-table 3977 :action 's))))))) ; Do a shift. 3978 3979 action-table) 3980 ) 3981 3982 3983 3984 ; ------------------------------------------------------------------------------ 3985 ; | build-goto-table | 3986 ; ------------------------------------------------------------------------------ 3987 ; 3988 ; DESCRIPTION 3989 ; 3990 ; Build the GOTO table for an LR(1) parser. 3991 ; 3992 ; 3993 ; CALLING SEQUENCE 3994 ; 3995 ; (build-goto-table) 3996 ; 3997 ; goto-graph The goto graph. 3998 ; 3999 ; Returns: The goto table. 4000 ; 4001 ; 4002 ; METHOD 4003 ; 4004 ; Whenever we have a link i which has a transition 4005 ; on a nonterminal A to the link j, we fill the table with 4006 ; GOTO( i, A ) = j. 4007 ; 4008 ; EXAMPLE 4009 ; 4010 ; ------------------------------------------------------------------------------ 4011 4012 (defun build-goto-table( goto-graph ) 4013 4014 (let ((goto-table nil)) 4015 4016 (dolist (link (links! goto-graph)) 4017 4018 (if (and (> (first link) -1) 4019 (nonterminal? (second link))) 4020 4021 (setq goto-table 4022 (add-action-or-goto (first link) 4023 (second link) 4024 (third link) 4025 goto-table) 4026 ) 4027 ) 4028 ) 4029 goto-table) 4030 ) 4031 4032 4033 4034 4035 4036 ; ============================================================================== 4037 ; | Input and Output Functions | 4038 ; ============================================================================== 4039 4040 ; ------------------------------------------------------------------------------ 4041 ; | write-header | 4042 ; ------------------------------------------------------------------------------ 4043 ; 4044 ; DESCRIPTION 4045 ; 4046 ; Write a header for the parse tables file. 4047 ; 4048 ; CALLING SEQUENCE 4049 ; 4050 ; (write-header fp parser-type) 4051 ; 4052 ; fp Pointer to the currently open file. 4053 ; 4054 ; parser-type 'LR1 or 'LALR1. The title will be adjusted 4055 ; automatically based on the parser type. 4056 ; 4057 ; Returns: Header text written to file. 4058 ; 4059 ; EXAMPLE 4060 ; 4061 ; ------------------------------------------------------------------------------ 4062 4063 (defun write-header( fp parser-type ) 4064 4065 (format fp "~A~%~A~%~A~%~A~%~A~%~A~%~A~%~A~%~%" 4066 ";---------------------" 4067 4068 (if (equal parser-type 'LR1) 4069 "; LR(1) parse tables" 4070 "; LALR(1) parse tables") 4071 4072 ";---------------------" 4073 ";" 4074 "; Suitable for input to the Common Lisp program " 4075 ";" 4076 "; LR(1)AndLALR(1)Parser.lsp" 4077 ";" 4078 ) 4079 4080 ) 4081 4082 4083 ; ------------------------------------------------------------------------------ 4084 ; | write-terminals | 4085 ; ------------------------------------------------------------------------------ 4086 ; 4087 ; DESCRIPTION 4088 ; 4089 ; Write a header for the parse tables file. 4090 ; 4091 ; CALLING SEQUENCE 4092 ; 4093 ; (write-terminals fp) 4094 ; 4095 ; fp Pointer to the currently open file. 4096 ; 4097 ; Returns: Terminal symbols written to file. 4098 ; 4099 ; EXAMPLE 4100 ; 4101 ; ------------------------------------------------------------------------------ 4102 4103 (defun write-terminals( fp terminals ) 4104 4105 (format fp "~A~%~A~%~%" 4106 "; TERMINALS" 4107 ";" 4108 ) 4109 4110 (format fp "~S~%~%" terminals) 4111 4112 (fresh-line fp) 4113 (fresh-line fp) 4114 ) 4115 4116 4117 ; ------------------------------------------------------------------------------ 4118 ; | write-productions | 4119 ; ------------------------------------------------------------------------------ 4120 ; 4121 ; DESCRIPTION 4122 ; 4123 ; Write the split up productions with their numbers. 4124 ; 4125 ; CALLING SEQUENCE 4126 ; 4127 ; (write-productions fp productions) 4128 ; 4129 ; fp Pointer to the currently open file. 4130 ; 4131 ; productions List of productions to write. 4132 ; 4133 ; Returns: Neat list of numbered productions. (These are 4134 ; the ones expanded from the alternates.) 4135 ; 4136 ; EXAMPLE 4137 ; 4138 ; See the files lalrparser.dat and parser.dat for examples. 4139 ; 4140 ; ------------------------------------------------------------------------------ 4141 4142 (defun write-productions( fp productions ) 4143 4144 (format fp "~A~%~A~%~A~%~A~%~%~A~%" 4145 "; PRODUCTIONS" 4146 ";" 4147 "; Productions are numbered starting with 1." 4148 "; All alternates were expanded into separate productions." 4149 "(" ) 4150 4151 (dolist (production productions) 4152 4153 ; Print each production. 4154 (format fp "~A~D~A~S~A~%" 4155 " ( " 4156 `(,(production-number production)) 4157 " " 4158 production 4159 " )" ) 4160 ) 4161 4162 (format fp "~A~%~%" 4163 ")" ) 4164 ) 4165 4166 4167 4168 (defun construct-error-messages( action-table ) 4169 4170 (let ( (error-messages nil) 4171 (transition-symbols nil) ) 4172 4173 ; Scan through each line of the action table. 4174 (dolist (action-line action-table) 4175 4176 (setq transition-symbols nil) 4177 4178 ; Scan through the actions in each line. 4179 (dolist (action (action-list! action-line)) 4180 4181 ; Found an error state; add message to the list. 4182 (if (equal (second action) '(ERROR)) 4183 (push `( (,(action-line-state! action-line)) 4184 (,(concatenate 'string 4185 "error - expecting one of the symbols " 4186 (string-trim "(" 4187 (string-trim ")" 4188 (write-to-string transition-symbols)))))) 4189 error-messages) 4190 4191 ; else keep collecting transition symbols. 4192 (setq transition-symbols 4193 (cons (first action) 4194 transition-symbols)) 4195 4196 ) 4197 ) 4198 ) 4199 4200 (reverse error-messages)) 4201 ) 4202 4203 4204 ; ------------------------------------------------------------------------------ 4205 ; | write-error-message-table | 4206 ; ------------------------------------------------------------------------------ 4207 ; 4208 ; DESCRIPTION 4209 ; 4210 ; Write the error message table with templates for the user to fill in. 4211 ; 4212 ; CALLING SEQUENCE 4213 ; 4214 ; (write-error-message-table fp action-table) 4215 ; 4216 ; fp Pointer to the currently open file. 4217 ; 4218 ; action-table 4219 ; 4220 ; Returns: Error message table. 4221 ; 4222 ; EXAMPLE 4223 ; 4224 ; 4225 ; ------------------------------------------------------------------------------ 4226 4227 (defun write-error-message-table( fp action-table ) 4228 4229 (format fp "~A~%~%~%" 4230 " 4231 ; ERROR MESSAGE TABLE 4232 ; 4233 ; If the action table has an error state, the other non-error 4234 ; actions show which symbol was failed to appear next on the input. 4235 ; 4236 ; The user can modify these minimal error messages. 4237 " ) 4238 4239 ; Opening parenthesis. 4240 (format fp "(~%~%") 4241 4242 ; Iterate over error states. 4243 (dolist (error-message (construct-error-messages action-table)) 4244 4245 (format fp " ~S ~%" error-message) 4246 ) 4247 4248 ; Closing parenthesis. 4249 (format fp ")~%~%") 4250 ) 4251 4252 4253 4254 ; ------------------------------------------------------------------------------ 4255 ; | write-goto-graph | 4256 ; ------------------------------------------------------------------------------ 4257 ; 4258 ; DESCRIPTION 4259 ; 4260 ; Write the formatted goto graph to a file. 4261 ; 4262 ; CALLING SEQUENCE 4263 ; 4264 ; (write-goto-graph fp goto-graph) 4265 ; 4266 ; fp Pointer to (open) file which is to contain the graph. 4267 ; 4268 ; goto-graph Goto graph itself, which will be pretty-printed. 4269 ; 4270 ; 4271 ; EXAMPLE 4272 ; 4273 ; (write-goto-graph fp " *goto-graph*) 4274 ; => ... see the sample output files parser.dat and lalrparser.dat. 4275 ; 4276 ; ------------------------------------------------------------------------------ 4277 4278 (defun write-goto-graph( fp goto-graph ) 4279 4280 ; Write the title first and the opening parenthesis. 4281 (format fp "~A~%~%" 4282 "; GOTO GRAPH 4283 ; 4284 ; Not needed for the parser, but here for reference and debugging. 4285 ; ********** 4286 ; Goto graph of the LR(1) or LALR(1) grammar of the form 4287 ; 4288 ; ( 4289 ; ( <-- List of links. 4290 ; (6 |a| 4) <-- Transition in Goto graph from state 6 to 4291 ; state 4 on symbol a. 4292 ; (1 |a| 2) <-- Transition from state 1 to state 2 on a. 4293 ; ) 4294 ; 4295 ; ( <-- List of sets of items. 4296 ; ( 0 <-- State number 0. 4297 ; 3668 <-- Hash value of core. 4298 ; ( 4299 ; (SP -> DOT S |,| $) ----+ 4300 ; ( S -> DOT S |a| S |b| |,| $) | 4301 ; ( S -> DOT EPSILON |,| $) +---- Set of items for state 0 4302 ; ( S -> DOT S |a| S |b| |,| |a|) | 4303 ; ( S -> DOT EPSILON |,| |a|) | 4304 ; ) ----+ 4305 ; ) " 4306 ) 4307 4308 ; Opening parenthesis of graph. 4309 (format fp "(~%") 4310 4311 ; Opening parenthesis of links. 4312 (format fp "~3,4@T(~%") 4313 4314 ; Print each link. 4315 (dolist (link (links! goto-graph)) 4316 4317 (format fp "~3,8@T(~D ~S ~D)~%" 4318 (first link) 4319 (second link) 4320 (third link)) 4321 ) 4322 4323 ; Closing parenthesis of links. 4324 (format fp "~3,4@T)~%") 4325 4326 ; Opening parenthesis of nodes. 4327 (format fp "~3,4@T(~%") 4328 4329 ; Print each node in the graph. 4330 (dolist (node (nodes! goto-graph)) 4331 4332 ; Print open paren of node, state and hash value. 4333 (format fp "~3,8@T(~D~%~3,8@T~D~%" 4334 (current-state! node) 4335 (hash-value! node)) 4336 4337 ; Print out each item. 4338 (dolist (item (select-items! node)) 4339 (format fp "~3,12@T~S~%" item)) 4340 4341 ; Closing paren of node. 4342 (format fp "~3,8@T)~%") 4343 ) 4344 4345 ; Closing parenthesis of nodes. 4346 (format fp "~3,4@T)~%") 4347 4348 ; Closing parenthesis of graph. 4349 (format fp ")~%~%") 4350 ) 4351 4352 4353 ; ------------------------------------------------------------------------------ 4354 ; | write-action-or-goto-table | 4355 ; ------------------------------------------------------------------------------ 4356 ; 4357 ; DESCRIPTION 4358 ; 4359 ; Write the formatted action or goto table to a file. 4360 ; 4361 ; CALLING SEQUENCE 4362 ; 4363 ; (write-action-or-goto-table fp table) 4364 ; 4365 ; fp Pointer to (open) file which is to contain the 4366 ; action-table. 4367 ; 4368 ; table Action or goto table itself, which will be 4369 ; pretty-printed. 4370 ; 4371 ; 4372 ; EXAMPLE 4373 ; 4374 ; (write-action-or-goto-table fp " *action-table) 4375 ; => ... see the sample output files parser.dat and lalrparser.dat. 4376 ; 4377 ; ------------------------------------------------------------------------------ 4378 4379 (defun write-action-or-goto-table( fp table &key (table-type 'ACTION)) 4380 4381 ; Write the title first and the opening parenthesis. 4382 (format fp "~A~%~A~%~A~%~A~%~A~%~%~A~%" 4383 (cond ( (equal table-type 'ACTION) "; ACTION TABLE") 4384 ( (equal table-type 'GOTO) "; GOTO TABLE" )) 4385 ";" 4386 "; (state" 4387 "; (item)" 4388 "; ..." 4389 "(" ) 4390 4391 ; Print actions for each state. 4392 (dolist (state table) 4393 4394 ; Print the opening paren of the table and the state 4395 ; number in parentheses. 4396 (format fp "~3,4@T( (~D) ~%" 4397 (action-line-state! state) 4398 ) 4399 4400 ; Print the word NIL explicitly if the list of items is empty. 4401 (if (null (action-list! state)) 4402 (format fp "~3,8@TNIL~%") 4403 4404 ; Print out the list of actions. 4405 (progn 4406 ; Print first paren of action list. 4407 (format fp "~3,8@T(~%") 4408 4409 ; Print actions. 4410 (dolist (item (action-list! state)) 4411 (format fp "~3,12@T~S~%" item)) 4412 4413 ; Print first paren of action list. 4414 (format fp "~3,8@T)~%") 4415 ) 4416 ) 4417 4418 (format fp "~3,4@T)~%") 4419 ) 4420 4421 ; Closing parenthesis. 4422 (format fp ")~%~%") 4423 ) 4424 4425 4426 ; ------------------------------------------------------------------------------ 4427 ; | print-legal-notice | 4428 ; ------------------------------------------------------------------------------ 4429 ; 4430 ; DESCRIPTION 4431 ; 4432 ; Write legal notice when the program starts up. 4433 ; 4434 ; CALLING SEQUENCE 4435 ; 4436 ; Returns: Legal notice to standard output. 4437 ; 4438 ; EXAMPLE 4439 ; 4440 ; ------------------------------------------------------------------------------ 4441 4442 (defun print-legal-notice() 4443 4444 ; Print a few newlines, the notice and a few more newlines. 4445 (format t "~%~%~A~%~%" 4446 " 4447 LR(1)AndLALR(1)ParserGenerator Version 5.6 4448 4449 An LR(1) and LALR(1) Parser Generator written in Common Lisp. 4450 4451 Copyright (C) 1989-2024 by Sean Erik O'Connor. All Rights Reserved. 4452 4453 This program is free software: you can redistribute it and/or modify 4454 it under the terms of the GNU General Public License as published by 4455 the Free Software Foundation, either version 3 of the License, or 4456 (at your option) any later version. 4457 4458 This program is distributed in the hope that it will be useful, 4459 but WITHOUT ANY WARRANTY; without even the implied warranty of 4460 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 4461 GNU General Public License for more details. 4462 4463 You should have received a copy of the GNU General Public License 4464 along with this program. If not, see <http://www.gnu.org/licenses/>. 4465 4466 The author's address is seanerikoconnor!AT!gmail!DOT!com 4467 with the !DOT! replaced by . and the !AT! replaced by @" 4468 ) 4469 ) 4470 4471 4472 4473 ; ------------------------------------------------------------------------------ 4474 ; 4475 ; NAME 4476 ; 4477 ; load-input-and-initialize 4478 ; 4479 ; DESCRIPTION 4480 ; 4481 ; Load the grammar from file. Initialize global variables. 4482 ; 4483 ; CALLING SEQUENCE 4484 ; 4485 ; (load-input-and-initialize filename) 4486 ; 4487 ; filename Name of the file containing the productions and terminals for 4488 ; the grammar. 4489 ; 4490 ; Returns: *terminals* After this is read from file, we add the extra 4491 ; terminal $ (the language's right endmarker). 4492 ; 4493 ; *productions* After reading the list of productions, 4494 ; 4495 ; [A -> alpha | beta ...] 4496 ; 4497 ; from file, we split up the alternates to 4498 ; generate the set of productions 4499 ; 4500 ; [A -> alpha], [A -> beta], ... 4501 ; 4502 ; *first-derived-terminals* 4503 ; *epsilon-free-first-derived-terminals* 4504 ; 4505 ; Set to NIL. 4506 ; 4507 ; *has-epsilon-productions* 4508 ; 4509 ; Set to NIL unless we have epsilon 4510 ; productions of the 4511 ; form A -> EPSILON. 4512 ; 4513 ; *conflicts* Set to NIL. 4514 ; 4515 ; *goto-graph*, 4516 ; *action-table* 4517 ; *goto-table* Set to NIL just for the hell of it. 4518 ; 4519 ; EXAMPLE 4520 ; 4521 ; (load-input-and-initialize "grammar.dat") 4522 ; 4523 ; *productions* => ((S -> S |a| S |b|) (S -> EPSILON) 4524 ; *terminals* => (|a| |b| $) 4525 ; *first-derived-terminals* => NIL 4526 ; *epsilon-free-first-derived-terminals* => NIL 4527 ; *conflicts* => NIL 4528 ; *has-epsilon-productions* => NIL 4529 ; 4530 ; ------------------------------------------------------------------------------ 4531 4532 (defun load-input-and-initialize( grammar-file ) 4533 4534 ; Better safe than sorry. 4535 (setq *goto-graph* nil) 4536 (setq *action-table* nil) 4537 (setq *goto-table* nil) 4538 (setq *conflicts* nil) 4539 (setq *first-derived-terminals* nil) 4540 (setq *epsilon-free-first-derived-terminals* nil) 4541 4542 4543 ; Split up productions and add the endmarker to the list of terminals. 4544 (let ( (fp (open grammar-file :direction :input)) ) 4545 4546 (setq *productions* (read fp)) 4547 (setq *terminals* (read fp)) 4548 4549 ; Add the endmarker to the list of terminals. 4550 (setq *terminals* (append *terminals* '($))) 4551 4552 ; Split up productions (so we don't handle alternates directly) 4553 (setq *productions* (split-up-productions *productions*)) 4554 4555 4556 ; Detect epsilon productions. 4557 (setq *has-epsilon-productions* nil) 4558 4559 (dolist (production *productions*) 4560 4561 (if (epsilon-production? production) 4562 4563 (setq *has-epsilon-productions* T))) 4564 4565 (close fp)) 4566 ) 4567 4568 4569 ; ------------------------------------------------------------------------------ 4570 ; | compile-all | 4571 ; ------------------------------------------------------------------------------ 4572 ; 4573 ; DESCRIPTION 4574 ; 4575 ; Compile all the functions in this program, except compile-all itself. 4576 ; 4577 ; CALLING SEQUENCE 4578 ; 4579 ; (compile-all) 4580 ; 4581 ; EXAMPLE 4582 ; 4583 ; (compile-all) => 4584 ; 4585 ; ;;; Compiling function LOAD-INPUT-AND-INITIALIZE...tail-merging... 4586 ; assembling...emitting...done. 4587 ; 4588 ; --- and so on, with pauses for garbage collection --- 4589 ; 4590 ; ;;; Compiling function PARSER-GENERATOR...assembling...emitting...done 4591 ; NIL 4592 ; 4593 ; ------------------------------------------------------------------------------ 4594 4595 (defun compile-all() 4596 4597 ; Tell the compiler the following variables are global (have dynamic binding). 4598 4599 (proclaim '(special *productions*)) 4600 (proclaim '(special *has-epsilon-productions*)) 4601 (proclaim '(special *terminals*)) 4602 (proclaim '(special *first-derived-terminals*)) 4603 (proclaim '(special *epsilon-free-first-derived-terminals*)) 4604 (proclaim '(special *goto-graph*)) 4605 (proclaim '(special *action-table*)) 4606 (proclaim '(special *goto-table*)) 4607 (proclaim '(special *conflicts*)) 4608 4609 (let ( (functions-to-compile 4610 4611 '( print-legal-notice 4612 load-input-and-initialize 4613 4614 getHeadOfListUpTo 4615 removeItemFromList 4616 positionInList 4617 insertItemIntoList 4618 combine 4619 itemInList 4620 4621 terminal? 4622 nonterminal? 4623 derives-leading-terminal? 4624 derives-leading-nonterminal? 4625 valid-production? 4626 set-of-items-in-graph? reduction? 4627 is-accept? terminal-after-dot? 4628 equal-sets-of-items? 4629 contained-in-item? 4630 element-of-item? 4631 epsilon-production? 4632 same-symbol? 4633 4634 first-alternate! 4635 all-but-first-alternate! 4636 production-rhs! 4637 symbol-after-dot! 4638 string-before-comma! 4639 lookahead-of! 4640 select-items! 4641 hash-value! 4642 current-state! 4643 action-list! 4644 transition-symbol! 4645 action-line-state! 4646 action-trigger-symbol! 4647 4648 core-of-item! 4649 core-hash-value-of-item 4650 core-hash-value-of-set-of-items 4651 merge-lookaheads 4652 merge-cores 4653 4654 split-up-production 4655 split-up-productions 4656 make-item 4657 move-dot-right 4658 create-augmenting-item 4659 find-grammar-symbols 4660 create-new-node 4661 create-new-link 4662 4663 node-number 4664 item-to-production 4665 production-number 4666 tag-symbol 4667 flag-epsilon-free! 4668 epsilon-free-only 4669 untag-list 4670 flag-non-epsilon-free precedence 4671 4672 derived-leading-terminal 4673 initial-first-derived-terminals 4674 first-terminals-of-rhs 4675 update-first-derived-function 4676 create-all-first-derived-terminals 4677 first-terminals-of-symbol 4678 first-derived-terminals 4679 4680 add-action-or-goto 4681 insert-action-or-goto-into-list 4682 goto closure compute-goto 4683 4684 create-goto-graph 4685 build-action-table 4686 build-goto-table 4687 4688 write-header 4689 write-terminals 4690 write-productions 4691 write-goto-graph 4692 write-action-or-goto-table 4693 write-error-message-table 4694 construct-error-messages 4695 4696 parser-generator 4697 file-exists? 4698 base-path! 4699 test-parser-generator))) 4700 4701 ; Compile all the functions, except compile-all itself. 4702 4703 (dolist (function-to-compile functions-to-compile) 4704 4705 (compile function-to-compile))) 4706 ) 4707 4708 4709 4710 ; ============================================================================== 4711 ; | Main Program | 4712 ; ============================================================================== 4713 4714 ; ------------------------------------------------------------------------------ 4715 ; | parser-generator | 4716 ; ------------------------------------------------------------------------------ 4717 ; 4718 ; DESCRIPTION 4719 ; 4720 ; Main program which produces the LR(1) and LALR(1) parsing tables. 4721 ; 4722 ; CALLING SEQUENCE 4723 ; 4724 ; (parser-generator in-file out-file :parser-type parser-type) 4725 ; 4726 ; in-file Productions and terminals for the grammar. See 4727 ; the file grammar.dat for an example. 4728 ; 4729 ; out-file The numbered productions, goto graph, action and 4730 ; parsing tables for the grammar. See the files 4731 ; lalrparser.dat and parser.dat for examples. 4732 ; 4733 ; parser-type 'LR1 or 'LALR1 parsing. The default is 'LALR1. 4734 ; 4735 ; Returns: A string indicating if any conflicts have occurred. 4736 ; 4737 ; EXAMPLE 4738 ; 4739 ; (parser-generator "grammar.dat" "parser.dat" :parser-type 'lr1) 4740 ; => NIL and the file parser.dat 4741 ; 4742 ; (parser-generator "grammar.dat" "lalrparser.dat" :parser-type 'lalr1) 4743 ; => NIL and the file lalrparser.dat 4744 ; 4745 ; (parser-generator "grammar.dat" "lalrparser.dat") 4746 ; => same as above 4747 ; 4748 ; (parser-generator "grammar4.dat" "junk" :parser-type 'lalr1) 4749 ; => "Conflicts were detected" and the file junk. 4750 ; 4751 ; ------------------------------------------------------------------------------ 4752 4753 (defun parser-generator( in-file out-file &key (parser-type 'LALR1) ) 4754 4755 ; Keep my lawyer happy. 4756 (print-legal-notice) 4757 4758 ; Read in the grammar file productions and terminals. 4759 (load-input-and-initialize in-file) 4760 4761 (let ( (fp (open out-file :direction :output :if-exists :supersede)) ) 4762 4763 ; Compute the goto graph for the grammar. 4764 (setq *goto-graph* (create-goto-graph parser-type)) 4765 4766 ; Construct the action and goto parsing tables. 4767 (setq *action-table* (build-action-table *goto-graph*)) 4768 (setq *goto-table* (build-goto-table *goto-graph*)) 4769 4770 ; Write out the terminals and productions for reference. 4771 (write-header fp parser-type) 4772 (write-terminals fp *terminals*) 4773 (write-productions fp *productions*) 4774 4775 ; Write out the goto graph. 4776 (write-goto-graph fp *goto-graph*) 4777 4778 ; Write out the action and goto parse tables. 4779 (write-action-or-goto-table fp *action-table* :table-type 'ACTION) 4780 (write-action-or-goto-table fp *goto-table* :table-type 'GOTO) 4781 4782 ; Write out the error message template. 4783 (write-error-message-table fp *action-table*) 4784 4785 (close fp) 4786 4787 (if *conflicts* 4788 "Conflicts were detected") 4789 ) 4790 ) 4791 4792 4793 4794 ; ------------------------------------------------------------------------------ 4795 ; | print-file-to-console | 4796 ; ------------------------------------------------------------------------------ 4797 ; 4798 ; DESCRIPTION 4799 ; 4800 ; List the lines of a file to the console. 4801 ; 4802 ; CALLING SEQUENCE 4803 ; 4804 ; (print-file-to-console filename) 4805 ; 4806 ; filename Name of the file. 4807 ; 4808 ; Returns: 4809 ; 4810 ; EXAMPLE 4811 ; 4812 ; (print-file-to-console "grammar.dat") 4813 ; => ; GrammarE=E+T_T.dat 4814 ; --------------------------------------------------------------------------- 4815 ; 4816 ; A grammar of arithmetic expressions, 4817 ; 4818 ; E -> E + T | T 4819 ; ... 4820 ; 4821 ; ------------------------------------------------------------------------------ 4822 4823 (defun print-file-to-console( file-name ) 4824 (format t "~%~%=========================== ~A =============================~%~%~%" file-name) 4825 4826 (with-open-file (stream file-name) 4827 (do ( (line (read-line stream nil) ; nil inhibits throw at eof 4828 (read-line stream nil) ) ; and read-line returns nil at eof 4829 ) 4830 ( (null line) ) ; Terminate at eof 4831 (format t "~A~%" line) 4832 ) 4833 ) 4834 ) 4835 4836 4837 (defun component-present-p (value) 4838 (and value (not (eql value :unspecific)))) 4839 4840 (defun directory-pathname-p (p) 4841 (and 4842 (not (component-present-p (pathname-name p))) 4843 (not (component-present-p (pathname-type p))) 4844 p)) 4845 4846 (defun pathname-as-directory (name) 4847 (let ((pathname (pathname name))) 4848 (when (wild-pathname-p pathname) 4849 (error "Can't reliably convert wild pathnames.")) 4850 (if (not (directory-pathname-p name)) 4851 (make-pathname 4852 :directory (append (or (pathname-directory pathname) (list :relative)) 4853 (list (file-namestring pathname))) 4854 :name nil 4855 :type nil 4856 :defaults pathname) 4857 pathname))) 4858 4859 4860 ; ------------------------------------------------------------------------------ 4861 ; | file-exists? | 4862 ; ------------------------------------------------------------------------------ 4863 ; 4864 ; DESCRIPTION 4865 ; 4866 ; Portable way to check if a file or directory exists. 4867 ; 4868 ; CALLING SEQUENCE 4869 ; 4870 ; (file-exists? directory-or-file) 4871 ; 4872 ; directory-or-file Pathname for directory or file 4873 ; Returns: t if it is there, nil if not. 4874 ; 4875 ; 4876 ; EXAMPLES 4877 ; 4878 ; (file-exists? "/NotThere") => nil 4879 ; (file-exists? "/Volumes/seanoconnor") => t 4880 ; 4881 ; ------------------------------------------------------------------------------ 4882 4883 (defun file-exists? (pathname) 4884 "Check if the file exists" 4885 #+(or sbcl lispworks openmcl) 4886 (probe-file pathname) 4887 4888 #+(or allegro cmu) 4889 (or (probe-file (pathname-as-directory pathname)) 4890 (probe-file pathname)) 4891 4892 #+clisp 4893 (or (ignore-errors 4894 (probe-file (pathname-as-file pathname))) 4895 (ignore-errors 4896 (let ((directory-form (pathname-as-directory pathname))) 4897 (when (ext:probe-directory directory-form) 4898 directory-form)))) 4899 4900 #-(or sbcl cmu lispworks openmcl allegro clisp) 4901 (error "file-exists-p not implemented") 4902 ) 4903 4904 4905 4906 ; ------------------------------------------------------------------------------ 4907 ; | base-path! | 4908 ; ------------------------------------------------------------------------------ 4909 ; 4910 ; DESCRIPTION 4911 ; 4912 ; Try to find out where the base directory for the web page is located. 4913 ; 4914 ; CALLING SEQUENCE 4915 ; 4916 ; (base-path!) 4917 ; 4918 ; Returns: String of base path or nil if it can't find it. 4919 ; 4920 ; 4921 ; EXAMPLES 4922 ; 4923 ; (base-path!) => "C:/Sean/WebSite" ; Got it. 4924 ; (base-path!) => nil ; Could't find it. 4925 ; 4926 ; ------------------------------------------------------------------------------ 4927 4928 (defun base-path!() 4929 (let ( (possible-directories-list '( 4930 "/cygdrive/c/Sean/WebSite" ; Windows / Cygwin 4931 "/Users/seanoconnor/Desktop/Sean/WebSite" ; Mac OS 4932 "/home/seanoconnor/Desktop/Sean/WebSite" ; Ubuntu Linux 4933 ))) 4934 4935 (dolist (base-path possible-directories-list) 4936 ; (format t "base path = ~S exists = ~S~%" base-path (file-exists? base-path) ) 4937 (if (file-exists? base-path) (return (concatenate 'string base-path "/")))) 4938 ) 4939 ) 4940 4941 4942 4943 ; ------------------------------------------------------------------------------ 4944 ; | test-parser-generator | 4945 ; ------------------------------------------------------------------------------ 4946 ; 4947 ; DESCRIPTION 4948 ; 4949 ; Run the parser generator on a test grammar and produce test parsing 4950 ; tables. 4951 ; 4952 ; CALLING SEQUENCE 4953 ; 4954 ; (test-parser-generator) 4955 ; 4956 ; Set the files and paths to your requirements. I'm assuming you've 4957 ; installed cygwin if you're on a Windows machine. 4958 ; 4959 ; ------------------------------------------------------------------------------ 4960 4961 (defun test-parser-generator() 4962 4963 ; Compile all the functions for speed. 4964 (compile-all) 4965 4966 ; Garbage collect. 4967 (gc) 4968 4969 ; Generate a set of parse tables from a test grammar, both LR(1) and 4970 ; LALR(1). 4971 (let* ( 4972 ; Set up the base directory paths. 4973 (base-path (base-path!)) 4974 4975 (sub-path "ComputerScience/Compiler/ParserGeneratorAndParser/") 4976 (grammar-path "Grammars/" ) 4977 (parse-table-path "ParseTables/") 4978 4979 ; List the grammar files (input) and parse table files (output). 4980 (grammar-file '( "GrammarS=SaSbEPSILON.dat" 4981 "GrammarE=E+T_T.dat" 4982 "GrammarPoly.dat" 4983 "GrammarLR(1)NotLALR(1).dat" 4984 "GrammarNotLR(1)NotLALR(1).dat") ) 4985 (parse-file-LR1 '( "ParseTablesLR(1)_S=SaSbEPSILON.dat" 4986 "ParseTablesLR(1)_E=E+T_T.dat" 4987 "ParseTablesLR(1)_Poly.dat" 4988 "ParseTablesLR(1)_NotLALR(1).dat" 4989 "ParseTablesLR(1)_NotLR(1)NotLALR(1).dat") ) 4990 (parse-file-LALR1 '( "ParseTablesLALR(1)_S=SaSbEPSILON.dat" 4991 "ParseTablesLALR(1)_E=E+T_T.dat" 4992 "ParseTablesLALR(1)_Poly.dat" 4993 "ParseTablesLALR(1)_NotLALR(1).dat" 4994 "ParseTablesLALR(1)_NotLR(1)NotLALR(1).dat") ) 4995 ) 4996 4997 (dotimes (i (length grammar-file)) 4998 4999 (let* ( 5000 ; Create the full file path. 5001 (full-grammar-file 5002 (concatenate 'string 5003 base-path sub-path grammar-path 5004 (nth i grammar-file)) 5005 ) 5006 5007 (full-parse-file-LR1 5008 (concatenate 'string 5009 base-path sub-path parse-table-path 5010 (nth i parse-file-LR1)) 5011 ) 5012 5013 (full-parse-file-LALR1 5014 (concatenate 'string 5015 base-path sub-path parse-table-path 5016 (nth i parse-file-LALR1)) 5017 ) 5018 ) 5019 5020 ; Call the parser generator to generate parse tables for 5021 ; both LR(1) and LALR(1). 5022 (parser-generator full-grammar-file full-parse-file-LR1 5023 :parser-type 'LR1) 5024 5025 (parser-generator full-grammar-file full-parse-file-LALR1) 5026 5027 ; Display the results to the console. 5028 (print-file-to-console full-grammar-file) 5029 (print-file-to-console full-parse-file-LR1) 5030 (print-file-to-console full-parse-file-LALR1) 5031 ) 5032 ) 5033 ) 5034 )