This paper shows that a sequence-to-sequence model trained on a sufficiently large dataset can solve symbolic mathematics problems with very high accuracy. The setup involves no explicit reasoning: the task is formatted directly as mapping a problem to its final answer. The authors consider two tasks, symbolic integration and solving differential equations. That is, given an input function, the model is trained to write down its integral (no justification or proof, just the final function); same for ODEs.
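Concretely, the paper serializes expression trees into prefix (Polish) token sequences so a standard seq2seq model can consume them. A minimal sketch of that encoding, assuming SymPy and my own helper name `to_prefix` (the paper's actual vocabulary and number encoding are more elaborate):

```python
import sympy as sp

def to_prefix(expr):
    """Serialize a SymPy expression tree into prefix (Polish) tokens,
    roughly in the spirit of the paper's input encoding.
    Note: a real encoder would fix operator arities (SymPy's Add/Mul
    are n-ary) so the sequence parses back unambiguously."""
    if expr.is_Symbol or expr.is_Integer:
        return [str(expr)]
    if expr.is_Rational:
        # encode p/q as an explicit division
        return ["div", str(expr.p), str(expr.q)]
    tokens = [type(expr).__name__.lower()]
    for arg in expr.args:
        tokens += to_prefix(arg)
    return tokens

x = sp.Symbol("x")
print(to_prefix(sp.cos(x)))  # ['cos', 'x']
```

The model then learns a plain sequence-to-sequence mapping from the tokenized problem to the tokenized answer, with no intermediate steps.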
One of the paper's main contributions is a method for generating a balanced dataset of symbolic expressions using a set of transformation rules. Some of these ideas, especially the procedure in Appendix C, have already proved useful to me in my own work.
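A key trick behind the data generation is to exploit easy inverse transformations: for integration, sample a random expression f, differentiate it, and train on the pair (f', f), since differentiation is mechanical while integration is hard. A toy sketch of this backward generation, assuming SymPy; `random_expr` and `bwd_pair` are my own illustrative names, and the paper's real generator samples trees much more carefully to keep the dataset balanced:

```python
import random
import sympy as sp

x = sp.Symbol("x")

def random_expr(depth=2):
    """Sample a small random expression tree (a toy stand-in for the
    paper's generator, which samples trees of a given size uniformly)."""
    if depth == 0:
        return random.choice([x, sp.Integer(random.randint(1, 5))])
    op = random.choice(["add", "mul", "sin", "exp"])
    if op == "add":
        return random_expr(depth - 1) + random_expr(depth - 1)
    if op == "mul":
        return random_expr(depth - 1) * random_expr(depth - 1)
    if op == "sin":
        return sp.sin(random_expr(depth - 1))
    return sp.exp(random_expr(depth - 1))

def bwd_pair():
    """Backward generation: pick f at random, differentiate it,
    and emit (f', f) as a (problem, solution) training pair."""
    f = random_expr()
    return sp.diff(f, x), f

random.seed(0)
deriv, f = bwd_pair()
# sanity check: the pair is consistent by construction
assert sp.simplify(sp.diff(f, x) - deriv) == 0
```

The same asymmetry (cheap forward transform, hard inverse) is what makes it possible to mass-produce supervised pairs for a problem whose solver you don't have.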
As for the actual results, they mainly show that deep learning models are remarkably powerful density estimators, even for quite complicated distributions. It is not the case, however, that this model replaces the computer algebra systems the authors compare against. A later AAAI paper showed that these models are not as robust as the headline results might suggest; I'll comment on that later and link it from here. This is a general theme with pure deep learning models: they often fail in strange ways once you leave the training distribution. You can make the training distribution very large (e.g., by generating random expressions, as the authors do), but even then there is no guarantee of nice generalization beyond it. And in combinatorial domains like expression trees, no finite dataset, however large, can cover the infinite variation one can produce even at quite modest tree depths.