I've been working on a personal project that I'm quite fond of, and it's taken probably the most research and generally just thinking about how to approach each section of any project I've ever taken up. As such I wanted to write out my thoughts on how I structured and designed the workings of the program here.
The program itself is essentially a text editor made specifically for mathematical expressions. I've always found it frustrating that I chose one of the few subjects at
university that still require a pen and paper (that is, physics), while
everyone else gets to use laptops, so I've decided to try and solve the
problem of writing and handling mathematical expressions being much
harder on a keyboard than on paper with this program. The core idea is that from a terminal with ncurses you can use a set of keybinds to write an arbitrary expression out in free form quickly and easily, then switch to 'operation mode, much like default/insert mode in vim, to use a new set of keybinds to carry out operations that follow mathematical law on your written expression. The program automatically simplifies expression for you, so if your expression is simply 'x' and you add an x, it'll know to present the resulting expression as '2x', rather than 'x+x'. Likewise if you add a y, you'll be left with 'x+y'. It should be noted that this is not wolframalpha though - you cannot ask it to *solve* mathematical equations for you, by making certain variables the subject or finding roots etc. You still do all of the working out, as you would on paper, the goal here is that when you command an operation to be done, the equation that's returned is the same as what you would had wrote if you were writing it yourself on paper.
So lets get into the details of how this can be done. Currently only the backend is being worked on before the interface is started, so this will only deal with how that works, however I'll come back and add to this blog as more progress is made.
What we're essentially writing is a basic interpreter. We need something that can take an arbitrary string, cut it down into component parts (it's tokens), figure out how these tokens are to be parsed and then carry evaluation. The tokeniser is simple, as it just needs to spit out numbers, variables and operators as it sees them. The only real job is to collect whole numbers as one token, so that '10+5' is returned as '10', '+', '5' rather than '1', '0', '+', '5', but this too is trivial. It also needs to add in implicit operations left out by common short hand such as '10x', so that '10', '*', 'x' is returned, but this is again trivial.
Once we have our tokens then it's a matter of figuring out how to parse them. In interpreters for languages, an abstract syntax tree is used, so lets use something like that. Maths is very simple in this respect and so a parse tree is a natural choice, where each binary operator like a plus has two children which can either be numbers, variables or more operators that eventually terminate with numbers or variables. This of course changes slightly for more advanced operations, like trig or log functions, but they are only unary operators rather than binary, so they would just have one child instead of two. Likewise calculus operations can be considered unary operations. However, the core of the program is first being build with only 'basic' algebra in mind, as to simply work as a proof of concept for the approach being taken.
I've drawn out examples of what the tree should look like given a typical expression:
It's parsed simply by taking the bottom leaves, applying the operator
stored in their parent, then replacing the key in that parent node with
the result. and recursively doing so until only the root node and
therefore result of the operation remains.
However, we're getting ahead of ourselves here. Before we can parse the tree, we need to build it from a stream of tokens. This was initially quite difficult to think about as how could you cleanly take an arbitrary expression like the one seen at the top of the above picture, and construct a tree that when parsed would evaluate the expression in the correct order according to operator precedence? There seems to be no obvious way, based on an *in*fix expression at least. What we can do to make life easier is convert our given equation to *post*fix (aka reverse polish notation) and then simply build the tree recursively using a modified shunting-yard algorithm for postfix expression evaluation, the modification here of course being rather than direct evaluation - which the shunting-yard algorithm isn't equipped for when dealing with algebraic variables - simply constructing a parse tree instead. What makes postfix notation perfect is that it already contains the correct evaluation precedence within it, so brackets don't ever need to be used, meaning you can add nodes to a tree directly rather than having to on the fly deal with bracketing order. I did think about how this could be possible, but also so much more of a mess that the extra step to convert to postfix notation is certainly worth it. It's also worth noting though that what's great about how the parse tree is constructed is that even though brackets are lost in postfix conversion, they can be recovered in the resulting evaluated expression by adding them back in to the rest of what's already been evaluated whenever the next operator seen has a higher evaluation precedence than the previous one.
So, we have our parse tree constructed in a beautiful and recursive fashion thanks for postfix notation, now for the meaty part of actual evaluation, and where this program in my mind becomes unique. Every term involved in the parse tree - regular numbers, variables or a mix - can be split into three distinct parts: the variable, the coefficient and the exponent. The variable and exponent parts are the determining factors on if combination is possible, while coefficients are just regular numbers and can be operated on each other at will. As such using a 'term' class for every token seems like the best idea, that can store all this information as it gets manipulated over the evaluation and used for checking for simplification opportunities. However things get more complicated when you can't combine two terms and have to leave operators in - for instance the 'x+y' example gave earlier. This case can be tackled in a similar fashion though, it's just a collection of terms, and can be treated like it's own term in evaluation. As such container classes can be used to store different incompatible terms and which operators are connecting them, and when these container terms are evaluated they too can be added to larger container terms, where each term can be retrieved recursively. For instance, 'x*(x+1)' will iterate over the container term '(x+1)' and multiply 'x' to every individual term inside - 'x' and '1'.
I believe the core idea of this part to be solid, however currently there are still some implementation issues I'm considering. Namely, the best way to actually structure these container term classes that makes them easiest to parse. My initial idea was to hold separate variables in the class for numerator, denominator and exponent, but this causes evaluation to get messy with a lot of condition checks required at every turn, so I'm thinking there should be a better and cleaner way to approach this, which is currently where I'm at on this project. I will be excited to update this post as new advancements in the program comes along, though!