1 files changed, 357 insertions, 0 deletions
diff --git a/fpoptimizer/fpoptimizer.txt b/fpoptimizer/fpoptimizer.txt
new file mode 100644
index 0000000..35bb70b
--- /dev/null
+++ b/fpoptimizer/fpoptimizer.txt
@@ -0,0 +1,357 @@
+FILES:
+  fpoptimizer.hh  - C++ structures
+  fpoptimizer.dat - Optimization operations
+
+PLAN:
+1.  Convert fpoptimizer.dat into code that initializes Grammar
+    - either automatically or manually
+    - avoid runtime parsing of .dat file
+    - prefer static const structs over constructor code
+    *DONE: See fpoptimizer_grammar_gen.y .
+           Produces fpoptimizer_grammar.cc .
+
+2.  Augment the CodeTree mechanism the following ways:
+    - Add mechanism that generates a hash value for the contents
+      of the tree, for easy comparison of identical subtrees
+      *DONE
+    - Any time a CodeTree changes in any way that may affect the
+      hash, update the hash value for this node and recursively
+      to all parents
+      (Note: Ensure that a CodeTree may only have one parent.)
+
+2.  Create code that matches a given CodeTree node to Grammar,
+    and finds the rules that match (only one should match at most,
+    but it can match a number of different ways, but that does not
+    matter; only utilize the first way)
+    - If a matching rule is found, replace either the matched params
+      or the entire node as instructed in the rule. Ensure that the
+      node gets its hash updated, along with its parents recursively.
+    - The matching can be optimized by grouping the rules by the Opcode
+    Note: The matching can get really complex and/or heavy in the
+    cases where the same operand (or worse, the same set of operands)
+    is expected to be found in two different subtrees. It may require
+    some O(n!) algorithms. However, those rules can be safely skipped
+    if the runtime sees that it would be too costly to check them.
+
+3. Algorithm for matching trees:
+     First,
+       match all nodes in the entire tree. (Do this only once.)
+     Any time a hash of a node is updated,
+       match that node.
+       (When the hashes are changed recursively for a node and all
+        its parents, match all those affected nodes after the hashes
+        have been regenerated.)
+     Repeat until no substitutions are done.
+
+     Overall, first apply this algorithm using the [ENTRY] rules only.
+     Then, apply the rules with [INTERMEDIATE] rules only.
+     Finally, apply the rules with [FINAL] rules only.
+
+3.5. Algorithm for matching nodes:
+       TODO
+
+4a. Algorithm for handling bytecode:
+     First, convert the bytecode into a CodeTree.
+     Then run the matching algorithm.
+     Last, convert the produced CodeTree back into bytecode.
+
+     When generating the bytecode, additional optimizations can be performed:
+     - Converting raise-to-integer-powers into sequences of cInv/cDup/cFetch/cMul
+       using the powi algorithm already existing in fpoptimizer.cc
+       (be wary of generating too long sequences, though)
+       *DONE
+     - Converting multiply-by-integers into sequences of cNeg/cDup/cFetch/cAdd
+       using the powi algorithm already existing in fpoptimizer.cc
+       (be wary of generating too long sequences, though)
+       *DONE
+     - Reordering cMul operands such that inverted operands
+       don't come first, to avoid creating cInv unless necessary.
+       When inverted operands come in non-head positions, use cDiv instead of cMul.
+       *DONE
+     - Reordering cAdd operands such that negated operands
+       don't come first, to avoid creating cNeg unless necessary.
+       When negated operands come in non-head positions, use cSub instead of cAdd.
+       *DONE
+     - When an identical subtree is synthesized into bytecode more than
+       once in a row, use cDup for the subsequential occurrences.
+       To that point, reorder any commutative operands so as to
+       increase the chances of cDup being utilized.
+       *TODO
+
+4b. Optional algorithm if an SSA evaluator is supported
+   (possibly only if a JIT backend exists):
+     1. First, convert the bytecode into a CodeTree.
+     2. Then run the matching algorithm.
+     3. Then convert the produced CodeTree into SSA format.
+       SSA is a medium-level intermediate language (chapter 4.3 in [1], [2]),
+       in which each instruction takes one of the following forms:
+           target = command source1 source2 <...>
+             - commands such as add, mul, pow, cos, max, less
+           jump <label>
+           branch <label> if source == 0
+           target = phi source1 source2
+             - phi is used after two branches merge, to select
+               either result from the branches into one variable;
+               it is not a fparser function.
+       and in which every variable is written to only once,
+       i.e. "x = x+5" never occurs, but is instead written
+         as "x2 = x1+5"
+       <label> can be a pointer to another codeblock.
+     4. Then run generic optimizations on the SSA code, such
+       as common subexpression elimination (CSE, described
+       in chapter 13.1 in [1]).
+       Try also global value numbering (GVN, see [3]).
+       Note: After CSE, it's difficult to convert
+        the code back into bytecode.
+       Note: Due to the hashing, it is possible that some
+        of the CSE can be done automatically in the SSA
+        generation phase by peeling the CodeTree depth-first,
+        storing identical trees only once.
+     5a. (Apply this option if we do not have JIT.)
+        Eliminate the "single assignment" property of SSA
+        by mapping out the lifetimes of the variables and
+        reusing the same variables where possible. This
+        decreases the memory usage of the evaluator and
+        improves the cache efficiency.
+     5b. (Apply this option if we do have JIT.)
+        Perform register allocation for the SSA code,
+        as per chapter 16 in [1]. Doing this before
+        actual JIT will make the JIT more straightforward.
+        Doing code scheduling could be useful as well,
+        though it gets somewhat complex. (Chapter 17 in [1].)
+   If there's a library we can use from step 3 forward, it'd be great.
+   Note: There are many typical optimizations that we don't need to do.
+   For instance, we don't need to do "dead code elimination", because
+   dead code is never produced in the first place.
+
+[1] = Advanced Compiler Design & Implementation by Steven S. Muchnick, ISBN 1-55860-320-4
+[2] = http://en.wikipedia.org/wiki/Static_single_assignment_form
+[3] = http://en.wikipedia.org/wiki/Global_value_numbering
+
+
+
+
+----------------------------------------
+------------------------------------------------------
+CODETREE - A REPRESENTATION OF MATHEMATICAL EXPRESSION
+------------------------------------------------------
+FPoptimizer changes the bytecode expression into tree format.
+
+An FPoptimizer expression is called a CodeTree.
+
+A CodeTree node has:
+	- 0 to N children
+	- Type (Opcode)
+	- Type-dependent other fields
+	Different type of nodes:
+		cVar:
+			Has a type-dependent field "Var", which
+			identifies the var number
+		cImmed:
+			Has a type-dependent field "Value", which
+			describes the constant value of this node
+		cFCall and cPCall:
+			Has a type-dependent field "FuncNo", which
+			describes the function number called
+		Anything else:
+			Has a number of children which describe
+			the parameters to the operation.
+			For example, if the type (Opcode) is cSin,
+			it has a single child, which is a CodeTree
+			that describes the parameter of the sin() call.
+			If the type (Opcode) is cAdd, it has an arbitrary
+			number of children, which are all added together.
+	Examples of some expressions in tree format:
+		Expression: sin(x)*cos(y)
+		Tree:	Opcode=cMul
+			Child[0]:
+				Opcode=cSin
+				Child[0]:
+		       			Opcode=cVar
+		       			Var=0
+			Child[1]:
+				Opcode=cCos
+				Child[0]:
+		       			Opcode=cVar
+		       			Var=1
+		Expression: 1.0 + 2.0 - 3.0 - (4.0 + -5.5)
+		Actual expression (converted by bytecode reader):
+		            1.0 + 2.0 + (-1 * 3.0) + (-1 * (4.0 + -5.5))
+		Tree:	Opcode=cAdd
+			Child[0]:
+				Opcode=cImmed
+				Value=1.0
+			Child[1]:
+				Opcode=cImmed
+				Value=2.0
+			Child[2]:
+				Opcode=cMul
+				Child[0]:
+					Opcode=cImmed
+					Value=-1.0
+				Child[1]:
+					Opcode=cImmed
+					Value=3.0
+			Child[3]:
+				Opcode=cMul
+				Child[0]:
+					Opcode=cImmed
+					Value=-1.0
+				Child[1]:
+					Opcode=cAdd
+					Child[0]:
+						Opcode=cImmed
+						Value=4.0
+					Child[1]:
+						Opcode=cImmed
+						Value=-5.5
+
+------------------------------------------------------
+GRAMMAR RULE - A TREE SUBSTITUTION RULE DESCRIPTION
+------------------------------------------------------
+The Grammar describes changes (optimizations) that
+can be performed to the CodeTree.
+
+The Grammar is composed of Rules, each of which describes
+a matching-tree, and a replacement.
+
+A matching-tree node (ParamSpec) has:
+	-------
+	- Opcode
+	- Method of parameter matching
+	- 0 to N params (ParamSpec)
+	Different types of methods:
+		PositionalParams:
+			Indicates that the tree given must match
+			the params given, in exactly that order;
+			there may not be other children in the tree.
+		SelectedParams:
+			Indicates that the tree given must match
+			the params given, in any order;
+			there may not be other children in the tree.
+		AnyParams:
+			Indicates that the tree given must match
+			the params given; in any order;
+			the tree may additionally have other children,
+			too. If the parameter list contains a RestHolder,
+			those other children are captured there.
+
+List of parameters (an index to MatchedParams)
+	- Single parent
+	- Type (opcode)
+	- 0 to N children
+	- Type-dependent other fields
+	Different types of nodes:
+		NumConstant:
+			Indicates that a CodeTree
+			with type=cImmed and
+			a particular Value is matched.
+			The value is found in pack.clist[node.index].
+		ImmedHolder:
+			Indicates that a CodeTree
+			with type=cImmed and
+			some Value is matched.
+			The ImmedHolder fields are enumerated
+			such that in a single CodeTree matching
+			operation, all ImmedHolders having the
+			same index value must match CodeTrees
+			evaluating to the same numeric value.
+			For example, if an ImmedHolder with index=0
+			happens to match a CodeTree with Value=1.443,
+			all other ImmedHolders with index=0 must
+			also match a CodeTree with Value=1.443.
+		NamedHolder
+			Indicates that any CodeTree is matched regardless of type.
+			The NamedHolder fields are enumerated
+			such that in a single CodeTree matching
+			operation, all NamedHolders having the
+			same index value must match CodeTrees
+			that are identical to the first one.
+			For example, if an NamedHolder with index=0
+			happens to match a CodeTree representing
+			the expression sin(x),
+			all other NamedHolders with index=0 must
+			also match a CodeTree representing
+			the expression sin(x).
+		SubFunction
+			Indicates a subtree that must be matched.
+		RestHolder
+			Within the target CodeTree, matches anything not
+			matched by the other rules in the parent ParamSpec.
+		Anything else:
+			Indicates a mathematical expression whose
+			value is treated by the NumConstant rules above.
+
+Example rules:
+	Grammar file syntax: cMin x x
+	Internal structure: Function.opcode = cMin
+	                    Function.index refers to MatchedParams where
+	                      MatchedParams.type  = AnyParams
+	                      MatchedParams.count = 2
+	                      MatchedParams.index refers to ParamSpec where
+	                        ParamSpec[0].opcode = NamedHolder
+	                        ParamSpec[0].index  = 0   -- "x" is namedholder 0
+	                        ParamSpec[1].opcode = NamedHolder
+	                        ParamSpec[1].index  = 0   -- another x
+	Explanation:
+		This rule matches a CodeTree
+	        where the opcode = cMin and that
+	        tree has two identical subtrees.
+
+	Grammar file syntax: cMul x <1> (cPow [(cMul x <2>) -1])
+	Internal structure: Function.opcode = cMul
+	                    Function.index refers to MatchedParams where
+	                      MatchedParams.type = AnyParams
+	                      MatchedParams.count = 3
+	                      MatchedParams.index refers to ParamSpec where
+	                        ParamSpec[0].opcode = NamedHolder
+	                        ParamSpec[0].index  = 0   -- "x" is namedholder 0
+	                        ParamSpec[1].opcode = RestHolder
+	                        ParamSpec[1].index  = 1   -- RestHolder #1 captures anything else within that cMul group
+	                        ParamSpec[2].opcode = SubFunction
+	                        ParamSpec[2].index refers to Function where
+	                          Function.opcode = cPow
+	                          Function.index refers to MatchedParams where
+	                            MatchedParams.type = PositionalParams
+	                            MatchedParams.count = 2
+	                            MatchedParams.index refers to ParamSpec where
+	                              ParamSpec[0].opcode = SubFunction
+	                              ParamSpec[0].index refers to Function where
+	                                Function.opcode = cMul
+	                                Function.index refers to MatchedParams where
+	                                  MatchedParams.type  = AnyParams
+	                                  MatchedParams.count = 2
+	                                  MatchedParams.index refers to ParamSpec where
+	                                    ParamSpec[0].opcode = NamedHolder
+	                                    ParamSpec[0].index  = 0 -- must match a subtree identical to that of "x" earlier
+	                                    ParamSpec[1].opcode = RestHolder
+	                                    ParamSpec[1].index  = 2 -- RestHolder #2 captures anything else within that cMul group
+	                              ParamSpec[1].opcode = NumConstant
+	                              ParamSpec[1].value  = -1
+	Explanation:
+		This rule matches a CodeTree of type cMul,
+		 where there exists at least two children,
+		 where one of them is an arbitrary expression,
+		  call that "x",
+		 and one of them is a CodeTree of type cPow,
+		  where there are exactly two children,
+		  where the first child is a CodeTree of type cMul,
+		   where there exists at least one child,
+		   where that one child is an arbitrary expression,
+		    which must match the "x" captured earlier,
+		   and anything else from that inner cMul codetree
+		   is captured into restholder <2>;
+		  and the second child of the cPow codetree
+		   is a CodeTree of type cImmed, where its Value equals -1;
+		 and anything else from that outer cMul codetree
+		 is captured into restholder <1>.
+		This rule matches expressions such as:
+			sin(z)*5*( (sin(z)*7) ^ (-1))
+			sin(z) is captured into NamedHolder x
+			5      is captured into RestHolder <1>
+			7      is captured into RestHolder <2>
+		Or
+			6*((6*w)^(-1))
+			value 6 is captured into NamedHolder x
+			nothing is captured into RestHolder <1>
+			w       is captured into RestHolder <2>.