-
Notifications
You must be signed in to change notification settings - Fork 6
Prop Compiler Compiler is a an augmented C++ to C++ translator. Adding Alegebraic Data Types, pattern matching, rewriting, parser generator and lexer generator, amongst other augmentations.
AaronNGray/prop-cc
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
Prop is a pattern matching language based on C++. It implements algebraic datatypes, pattern matching and rewriting, and generates C++ code as output. ------------------------------------------------------------------------------ Release 2.4 Updated projects and solution to Visual Studio 2010. Removed "using namespace std"'s to local std::'s converted <string.h>'s to <string>'s support for .pcc and .ph in Visual Studio Release 2.3.8 Converted tabs to 2 spaces. Release 2.3.7 Port to Cygwin GCC 3.4.3, Linux GCC 4.0, MS Visual C++ Express 2005 See port.txt Release 2.3.4 (i) Fixed a few bugs in the translator that caused it not to compile on g++ 2.7.0. (ii) Added additional state compression for rewriting tree automata. We'll now use 1-byte representation for rule set and state tables whenever possible. One-byte state representation is used whenever the number of states is <= 256. One-byte rule representation is used whenever the number of rules is <= 127. Release 2.3.3 (i) Added system V gc timer code. (ii) The old view mechanism has been reissued as a new feature, with some syntactic changes. It should now work with rewriting. (iii) Added better debugging facilities for rewriting. (iv) The rewriting modifiers before:, topdown:, bottomup: and after: have been renamed into the following: before: preorder: topdown: bottomup: postorder: The meaning of preorder: is analogous to that of the old topdown: while the meaning of postorder: is analogous to the old after: The topdown mode now correctly normalizes a term. (v) Missing types in traversal list are now correctly reported. (vi) A state caching bug creeped into version 2.3.0: the index is not activated if there are no explicit rewrite statements in the TRS. This is now fixed. ------------------------------------------------------------------------------ Release 2.3.2 (i) The datatype mapping process has been completely redone. This affects pretty much everything that has to do with datatypes, including garbage collection, persistence, and pretty printing. Previous releases' generated template code had problems. Code generated by old versions of Prop will NOT work with this release. (ii) Pretty printing methods generated by Prop now uses the library class PrettyPOstream defined in <AD/pretty/postream.h>. This class is still quite rudimentary. I'll have to think about what to do with pretty printing. (iii) Added addition code for checking for correct datatype inheritance, especially inheritance of collectable classes. Currently only one collectable class may be inherited. (iv) Adding synthesized value checking in parser generation. Missing synthesized value in production will give warnings. (v) Fixed some BigInt, AVL, SplayTree bugs. (vi) Added template instantiation stuff for g++'s -fno-implicit-templates. Please see the manual for details. (vii) Improved syntax error reporting; we now prints out the set of terminals that can be part of the valid input stream. Release 2.3.1 (i) Fixed various problems with collectable datatypes. ------------------------------------------------------------------------------ Release 2.3.0 (i) rewriting Added topdown:, bottomup:, before: and after: modes in rewriting. Added the indexing mechanism for rewriting. External indices are still not very well tested, especially GCRewriteCache implemented with weak pointers. Added the cutrewrite construct for replacement. (ii) weakpointers Changed the GC classes so that weakpointer collection is automatically performed. Previously, weakpointer collection is tied with finalization, which makes it possible to get unsafe results if the user forgets to turn on finalization when using weakpointers. Weakpointers collection can be disabled with the set_weak_pointer_collection() method. In order to reclaim entries the weak pointer table, we'll try to perform a garbage collection before the weak pointer table is expanded. [This may cause excessive garbage collection if a lot of weakpointers are manipulated. To be fixed later.] (iii) Fixed quark literal naming bug. Generated quark literals are now prefixed with the mangled filename of the source program. ------------------------------------------------------------------------------ Release 2.1 This is *yet* another rewrite of Prop. It has been bootscrapped from version 2.0.6 (not released), then rewritten using the new features. Rarely used features and features that have been poorly implemented in releases 2.0.x have been removed. Among these are: (1) lexer and parser generation. (2) hash consing and reference counting. (3) let ... in { ... } (4) cost minimization in matching (5) rule class. (6) dataview Some of these will be reintroduced later when better integration with existing features is developed. CHANGES ------- Here are some important changes to the translator. [1] Mapping of polymorphic datatypes. Mapping of polymorphic datatypes have been altered. Previously we use a very elaborate smart pointer class to simulate pointer semantics, which failed to work consistently on buggy compilers. Now we use macros instead. [2] Optimizations on pattern matching. An adaptive traversal algorithm based on the work of Sekar, Ramesh, and Ramakrishnan (``Adaptive Pattern Matching'' TR SUNY at Stony Brook, year 199?) has been implemented. This optimization should further speed up pattern matching and reduce the size of the generated code. [3] Optimizations on inference. A limited form of decomposition has been implemented on guard expressions. Specifically, guards in the form of conjuncts are decomposed and hoisted to speed up inference object matching. This is similar to ``pushing down selects'' in database. [The next step would be to implement indexing on the keys and faster joins.] [4] The following changes have been made to the syntax: [a] The pattern connectives or:, and: and not: have been eliminated in favor of ||, &&, and ! respectively. [b] Guards can now be written using a few different syntaxes: For example, foo(?x,?y) where predicate(x?,y?): foo(?x,?y) if predicate(x?,y?); foo(?x,?y) | predicate(x?,y?); are all equivalent. [c] We can now say foo(?x,?y): ?x in place of foo(?x,?y): rewrite(?x); or even foo(?x,?y): { rewrite(?x); } in a rewrite class. [d] Certain parenthesis are now optional. For example, datatype Tree = empty | leaf (int) | node (Tree, int, Tree); can now be written as: datatype Tree = empty | leaf int | node Tree, int, Tree; [5] Semiunification of non-linear patterns has been (partially) implemented. Equality tests are automatically generated for repeated variables within a pattern. This feature needs a typeclass extension to the type system to be fully functional. [6] Minor variant to 'match' called 'match while' has been added: match[all] while (e1) and (e2) and ... and (en) { p1: { action1 } | p2: { action2 } | ... | ... | pn: { actionn } } repeats the matching process until none of the rules matches. [7] We can now define functions using pattern matching using the 'fun' declaration form. It is an abbreviation for 'match' and function definition. Mutually recursive functions are defined using the 'and' connective. The advantage of this form is that we can omit naming the arguments and declaring the types and long as the type inferencer can deduce the types at compile time. datatype Tree = empty | node Tree, int, Tree; fun depth empty => int: { return 0; } | depth node(a,_,b): { return max(depth(a),depth(b)) + 1; } and flip empty => Tree: { return empty; } | flip node(a,i,b): { return node(flip(b),i,flip(a)); } ; [8] Embedded tags optimizations on term representations. It is now possible to use compact representations that embeds the variant tag of a datatype into the lower bits of its pointer. For example, the datatype Tree in datatype Tree = empty | leaf int | node Tree, int, Tree ; is mapped into the following code by default: #define empty (a_Tree *)0 class a_Tree { public: enum Tag_Tree { tag_leaf = 0, tag_node = 1; }; protected: Tag_Tree _tag_; ... }; class Tree_leaf : public a_Tree { ... }; class Tree_node : public a_Tree { ... }; On the other hand, with the embedded tag anotations '!': // embedded variant tag datatype Tree ! = empty | leaf int ! // embedded argument | node Tree, int, Tree ; the class hierarchy can be collasped into just the class class a_Tree { ... }; The tag field in the base class a_Tree will also be eliminated. Instead the tag is embedded in the lower bits of the leaf and node pointers. Furthermore, the integer argument in the leaf node will actually be imbedded within the pointer for the leaf, and no heap storage is consumed(the integer will loose a few bits of precision however, typically 2). Using this optimization, the variant tag of a term can be accessed without a load from memory. Furthermore, the terms can all be 1 word smaller (due to alignment this may not be always true). Of course, bit masking must now be done to strip the tag bits before we dereference the pointers. In most machines this requires a single instruction; but since we can eliminate one load with this optimization, I think things even out after all even if it's not improved upon. These optimizations are largely transparent from the application: the correct pattern matching code will be generated. One possible exception is that an embedded argument is not mutable. As always, we generate code assuming that the C++ compiler performs agressive optimizations such as common subexpression elimination. [9] User definable lists and array-like constructors
About
Prop Compiler Compiler is a an augmented C++ to C++ translator. Adding Alegebraic Data Types, pattern matching, rewriting, parser generator and lexer generator, amongst other augmentations.
Resources
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published