Skip to content

Prop Compiler Compiler is a an augmented C++ to C++ translator. Adding Alegebraic Data Types, pattern matching, rewriting, parser generator and lexer generator, amongst other augmentations.

Notifications You must be signed in to change notification settings

AaronNGray/prop-cc

Repository files navigation

Prop is a pattern matching language based on C++.
It implements algebraic datatypes, pattern matching and rewriting,
and generates C++ code as output.
------------------------------------------------------------------------------
Release 2.4
     Updated projects and solution to Visual Studio 2010.
	 Removed "using namespace std"'s to local std::'s
	 converted <string.h>'s to <string>'s
	 support for .pcc and .ph in Visual Studio

Release 2.3.8
     Converted tabs to 2 spaces.
Release 2.3.7
     Port to Cygwin GCC 3.4.3, Linux GCC 4.0, MS Visual C++ Express 2005
     See port.txt
Release 2.3.4
(i)  Fixed a few bugs in the translator that caused it not to compile
     on g++ 2.7.0.
(ii) Added additional state compression for rewriting tree automata.
     We'll now use 1-byte representation for rule set and state tables
     whenever possible.  One-byte state representation is used whenever
     the number of states is <= 256.  One-byte rule representation
     is used whenever the number of rules is <= 127.

Release 2.3.3
(i)  Added system V gc timer code.
(ii) The old view mechanism has been reissued as a new feature, with
     some syntactic changes.  It should now work with rewriting.
(iii) Added better debugging facilities for rewriting.
(iv) The rewriting modifiers before:, topdown:, bottomup: and after: have been
     renamed into the following:

	  before:
	  preorder:
	  topdown:
	  bottomup:
	  postorder:

     The meaning of preorder: is analogous to that of the old topdown:
     while the meaning of postorder: is analogous to the old after:

     The topdown mode now correctly normalizes a term.

(v)  Missing types in traversal list are now correctly reported.

(vi) A state caching bug creeped into version 2.3.0:  the index
     is not activated if there are no explicit rewrite statements
     in the TRS.  This is now fixed.

------------------------------------------------------------------------------
Release 2.3.2
(i)  The datatype mapping process has been completely redone.
     This affects pretty much everything that has to do with datatypes,
     including garbage collection, persistence, and pretty printing.

     Previous releases' generated template code had problems.

     Code generated by old versions of Prop will NOT work with this release.

(ii) Pretty printing methods generated by Prop now uses the
     library class PrettyPOstream defined in <AD/pretty/postream.h>.
     This class is still quite rudimentary.

     I'll have to think about what to do with pretty printing.

(iii) Added addition code for checking for correct datatype
      inheritance, especially inheritance of collectable classes.
      Currently only one collectable class may be inherited.

(iv)  Adding synthesized value checking in parser generation.
      Missing synthesized value in production will give warnings.

(v)   Fixed some BigInt, AVL, SplayTree bugs.

(vi)  Added template instantiation stuff for g++'s -fno-implicit-templates.
      Please see the manual for details.

(vii) Improved syntax error reporting; we now prints out the set of
      terminals that can be part of the valid input stream.

Release 2.3.1
(i)  Fixed various problems with collectable datatypes.

------------------------------------------------------------------------------
Release 2.3.0

(i) rewriting
   Added topdown:, bottomup:, before: and after: modes in rewriting.

   Added the indexing mechanism for rewriting.  External indices are
   still not very well tested, especially GCRewriteCache implemented
   with weak pointers.

   Added the cutrewrite construct for replacement.

(ii) weakpointers
   Changed the GC classes so that weakpointer collection
   is automatically performed.  Previously, weakpointer collection
   is tied with finalization, which makes it possible to get unsafe
   results if the user forgets to turn on finalization when using
   weakpointers.

   Weakpointers collection can be disabled with the
   set_weak_pointer_collection() method.
   In order to reclaim entries the weak pointer table, we'll try to
   perform a garbage collection before the weak pointer table is expanded.
   [This may cause excessive garbage collection if a lot of weakpointers
    are manipulated.  To be fixed later.]

(iii)
   Fixed quark literal naming bug.  Generated quark literals are now prefixed
   with the mangled filename of the source program.

------------------------------------------------------------------------------
Release 2.1

This is *yet* another rewrite of Prop.  It has been bootscrapped from
version 2.0.6 (not released), then rewritten using the new features.

Rarely used features and features that have been poorly implemented in
releases 2.0.x have been removed.  Among these are:

    (1) lexer and parser generation.
    (2) hash consing and reference counting.
    (3) let ... in { ... }
    (4) cost minimization in matching
    (5) rule class.
    (6) dataview

Some of these will be reintroduced later when better integration with
existing features is developed.

CHANGES
-------

   Here are some important changes to the translator.

[1]  Mapping of polymorphic datatypes.

     Mapping of polymorphic datatypes have been altered.  Previously
   we use a very elaborate smart pointer class to simulate pointer semantics,
   which failed to work consistently on buggy compilers.  Now we use macros
   instead.

[2]  Optimizations on pattern matching.

    An adaptive traversal algorithm based on the work of Sekar, Ramesh,
   and Ramakrishnan (``Adaptive Pattern Matching'' TR SUNY at Stony Brook, year
   199?) has been implemented.  This optimization should further
   speed up pattern matching and reduce the size of the generated code.

[3]  Optimizations on inference.

     A limited form of decomposition has been implemented on guard expressions.
  Specifically, guards in the form of conjuncts are decomposed and hoisted
  to speed up inference object matching.  This is similar to ``pushing down
  selects'' in database.

     [The next step would be to implement indexing on the keys
   and faster joins.]

[4]  The following changes have been made to the syntax:

   [a]  The pattern connectives or:, and: and not: have been
        eliminated in favor of ||, &&, and ! respectively.
   [b]  Guards can now be written using a few different syntaxes:
        For example,

          foo(?x,?y) where predicate(x?,y?):
          foo(?x,?y) if    predicate(x?,y?);
          foo(?x,?y) |     predicate(x?,y?);

        are all equivalent.

    [c]  We can now say

           foo(?x,?y): ?x

         in place of    foo(?x,?y): rewrite(?x);
         or even        foo(?x,?y): { rewrite(?x); }

         in a rewrite class.

    [d]  Certain parenthesis are now optional.
         For example,

             datatype Tree = empty | leaf (int) | node (Tree, int, Tree);

         can now be written as:

             datatype Tree = empty | leaf int | node Tree, int, Tree;

[5]  Semiunification of non-linear patterns has been (partially) implemented.
     Equality tests are automatically generated for repeated variables within
     a pattern.

     This feature needs a typeclass extension to the type system
     to be fully functional.

[6]  Minor variant to 'match' called 'match while' has been added:

     match[all] while (e1) and (e2) and ... and (en)
     { p1: { action1 }
     | p2: { action2 }
     | ...
     | ...
     | pn: { actionn }
     }

     repeats the matching process until none of the rules matches.

[7]  We can now define functions using pattern matching using the
     'fun' declaration form.  It is an abbreviation for 'match' and
     function definition.  Mutually recursive functions are
     defined using the 'and' connective.

     The advantage of this form is that
     we can omit naming the arguments and declaring the types
     and long as the type inferencer can deduce the types at compile time.

     datatype Tree = empty | node Tree, int, Tree;

     fun depth empty => int:  { return 0; }
       | depth node(a,_,b):   { return max(depth(a),depth(b)) + 1; }
     and flip  empty => Tree: { return empty; }
       | flip  node(a,i,b):   { return node(flip(b),i,flip(a)); }
     ;

[8]  Embedded tags optimizations on term representations.

     It is now possible to use compact representations that embeds the
     variant tag of a datatype into the lower bits of its pointer.

     For example, the datatype Tree in

       datatype Tree = empty
                     | leaf int
                     | node Tree, int, Tree
                     ;

     is mapped into the following code by default:

     #define empty (a_Tree *)0

     class a_Tree
     {
     public:
        enum Tag_Tree { tag_leaf = 0, tag_node = 1; };
     protected:
        Tag_Tree _tag_;
        ...
     };

     class Tree_leaf : public a_Tree { ... };
     class Tree_node : public a_Tree { ... };


     On the other hand, with the embedded tag anotations '!':

                   // embedded variant tag
       datatype Tree ! = empty
                       | leaf int !            // embedded argument
                       | node Tree, int, Tree
                       ;

     the class hierarchy can be collasped into just the class

     class a_Tree { ... };

     The tag field in the base class a_Tree will also be eliminated.
     Instead the tag is embedded in the lower bits of the leaf
     and node pointers.  Furthermore, the integer argument in the
     leaf node will actually be imbedded within the pointer for the leaf,
     and no heap storage is consumed(the integer will loose a few bits
     of precision however, typically 2).

     Using this optimization, the variant tag of a term can be accessed
     without a load from memory.  Furthermore, the terms can all be 1 word
     smaller (due to alignment this may not be always true).  Of course, bit
     masking must now be done to strip the tag bits before we dereference
     the pointers.  In most machines this requires a single instruction; but
     since we can eliminate one load with this optimization, I think things
     even out after all even if it's not improved upon.

     These optimizations are largely transparent from the application:
     the correct pattern matching code will be generated.  One possible
     exception is that an embedded argument is not mutable.

     As always, we generate code assuming that the C++ compiler performs
     agressive optimizations such as common subexpression elimination.

[9]  User definable lists and array-like constructors

About

Prop Compiler Compiler is a an augmented C++ to C++ translator. Adding Alegebraic Data Types, pattern matching, rewriting, parser generator and lexer generator, amongst other augmentations.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published