I mentioned the following expression as an example:
In LaTeX the code for this expression looks like this:
$E = \{\langle a, n, n' \rangle \subseteq I \times N \times N \mid Pa \text{ and } n < n' \}$
I set myself the goal of generating the same expression from the following code:
E = {<a, n, n'> :subseteq I :times N :times N | Pa \and n < n'}
I mostly have something like this (but better) working!
A heuristic approach
One of the driving ideas behind Termcat's syntax is that when it recognizes an infix operator, it should know that there is normally a mathematical expression to the left and to the right of that operator. It should be able to make similar inferences from prefix and postfix operators.
By way of example, the 'raw' syntax for operators is as follows:
~=~ : infix operator =, automatic spacing
~~|~~ : infix operator |, forces normal spacing
~~! : postfix operator !, normal spacing to the left
#~~~ : prefix operator #, wide spacing to the right
Operators must be surrounded by whitespace on the side of the tildes.
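To make the notation concrete, here is a minimal Python sketch of how such a token could be classified. It is not Termcat's implementation: the names `parse_operator` and `Operator` are made up for illustration, and the reading that one, two, and three tildes mean automatic, normal, and wide spacing is an assumption extrapolated from the four examples above.

```python
import re
from typing import NamedTuple, Optional

# Assumption: the number of tildes on a side selects the spacing on that side.
SPACING = {0: None, 1: "automatic", 2: "normal", 3: "wide"}


class Operator(NamedTuple):
    symbol: str
    fixity: str            # "infix", "prefix" or "postfix"
    left: Optional[str]    # spacing to the left, None if no tildes there
    right: Optional[str]   # spacing to the right, None if no tildes there


# Up to three tildes, a tilde-free symbol, up to three tildes.
_TOKEN = re.compile(r"^(~{0,3})([^~\s]+)(~{0,3})$")


def parse_operator(token: str) -> Optional[Operator]:
    """Classify a whitespace-delimited token; return None for ordinary words."""
    match = _TOKEN.match(token)
    if match is None:
        return None
    left, symbol, right = match.groups()
    if not left and not right:
        return None  # no tildes at all: not an operator
    if left and right:
        fixity = "infix"    # operands expected on both sides
    elif left:
        fixity = "postfix"  # tildes face the operand on the left
    else:
        fixity = "prefix"   # tildes face the operand on the right
    return Operator(symbol, fixity, SPACING[len(left)], SPACING[len(right)])
```

Under these assumptions, `parse_operator("~~!")` describes a postfix operator with normal spacing on its left, while a plain word such as `Pa` is not treated as an operator at all.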
Using this syntax the expression above can be encoded like this:
E ~=~ {~ ⟨~ a ~,~ n ~,~ n' ~⟩ ~⊆~ I ~×~ N ~×~ N ~~|~~ Pa and n ~<~ n' ~}
(There's magic involved in getting n' to display correctly, but let's ignore that. Also, MathML doesn't seem to define default spacing for '|' so it needs to be surrounded by double tildes.)
Termcat also has heuristics for parentheses, brackets, braces, and chevrons, and this allows us to drop some of the tildes:
E ~=~ {<a ~,~ n ~,~ n'> ~⊆~ I ~×~ N ~×~ N ~~|~~ Pa and n ~<~ n'}
The output is nearly identical to what LaTeX generates:
The Termcat code can be simplified further. First, however, I need to introduce another Termcat feature.
Intermezzo: lexical bindings
I'm currently working on adding user-defined 'bindings' or substitutions of standalone words to Termcat. The idea is that
!bind(test)(*test*)
test
should be rewritten into
*test*
Bindings are lexically scoped, where scope is delimited by parentheses, brackets, braces, chevrons, indentation, and bullet list syntax. Hence
(!bind(test)(*test*)
test)
test
is rewritten into
*test*
test
Towards a more natural syntax for mathematical expressions
Bindings can be used to remove the remaining tildes in the Termcat code. Consider the following declarations:
!bind
- =
- ~=~
!bind
- ,
- ~,~
!bind
- subseteq
- ~⊆~
!bind
- *
- ~×~
!bind
- |
- ~~|~~
!bind
- \<
- ~<~
For now the idea is that these bindings have to be set in every Termcat document. I may add a default bindings table at a later point though. In any case, after the above bindings have been defined it should be possible to write the following code:
E = {<a , n , n'> subseteq I * N * N | Pa and n < n'}
That looks a lot more readable than the LaTeX code if you ask me! In fact, I think it's nicer than the syntax I originally envisioned.
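As a sanity check, the substitution can be modelled in a few lines of Python. This is only a sketch resting on assumptions that do not come from Termcat itself: `apply_bindings` is a hypothetical helper, a 'standalone word' is taken to be anything delimited by whitespace, scoping is ignored, and the declaration `\<` is read as binding a bare `<`.

```python
# Hypothetical table mirroring the !bind declarations above.
BINDINGS = {
    "=": "~=~",
    ",": "~,~",
    "subseteq": "~⊆~",
    "*": "~×~",
    "|": "~~|~~",
    "<": "~<~",  # assumption: "\<" in the declaration binds a bare "<"
}


def apply_bindings(source: str, bindings: dict) -> str:
    """Replace every whitespace-delimited word that has a binding."""
    return " ".join(bindings.get(word, word) for word in source.split())


readable = "E = {<a , n , n'> subseteq I * N * N | Pa and n < n'}"
raw = apply_bindings(readable, BINDINGS)
```

Under these assumptions `raw` comes out identical to the tilde-laden version with bracket heuristics shown earlier. Words such as `{<a` and `n'>` are left alone because they are not standalone, which is also why the commas still need surrounding whitespace.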
One obvious further improvement might be to treat commas (and semicolons) as special by default. This would obviate the need to surround commas by whitespace. I will look into this too.