All elementary functions from a single binary operator

(arxiv.org)

189 points | by pizza 4 hours ago

21 comments

  • DoctorOetker 1 hour ago
    EDIT: please change the article link to the most recent version (as of now, v2); it is currently pointing to the v1 version, which is missing the figures.

    I'm still reading this, but if this checks out, this is one of the most significant discoveries in years.

    Why use splines or polynomials or haphazardly chosen basis functions if you can just fit (gradient descent) your data or wave functions to the proper computational EML tree?

    Got a multidimensional and multivariate function to model (with random samples or a full map)? Just do gradient descent and convert it to approximant EML trees.

    Perform gradient descent on EML function tree "phi" so that the derivatives in the Schroedinger equation match.

    But as I said, still reading, this sounds too good to be true, but I have witnessed such things before :)

    • ikrima 1 hour ago
      From my experience of working in this problem domain for the last year, I'd say it is pretty powerful, but the "too good to be true" part comes from the fact that EML buys elegance through exponential expression blow-up. Multiplication alone requires depth-8 trees with 41+ leaves, i.e. a minimal operator vocabulary trades off against expression length. There's likely an information-theoretic sweet spot between these extremes.

      It's interesting to see his EML approach whereas mine was more on generating a context sensitive homoiconic grammar.

      I've had lots of success combining spectral neural nets (GNNs, FNOs, Neural Tangent Kernels) with symbolic regression and using Operad Theory and Category Theory as my guiding mathematical machinery

      • gopalv 31 minutes ago
        > Multiplication alone requires depth-8 trees with 41+ leaves i.e. minimal operator vocabulary trades off against expression length.

        That is sort of comparable to how NAND simplifies scaling.

        Division is hell on gates.

        The single component was the reason scaling went like it did.

        There was only one gate structure which had to improve to make chips smaller - if a chip used 3 different kinds, then the scaling would've required more than one parallel innovation to go (sort of like how LED lighting had to wait for blue).

        If you need two or more components, then you have to keep switching tools instead of hammer, hammer, hammer.

      • canjobear 59 minutes ago
        Link to your work?
    • gilgoomesh 1 hour ago
      > Why use splines or polynomials or haphazardly chosen basis functions if you can just fit (gradient descent) your data or wave functions to the proper computational EML tree?

      Same reason all boolean logic isn't performed with combinations of NAND – it's computationally inefficient. Polynomials are (for their expressivity) very quick to compute.

      • Nevermark 1 hour ago
        They are done with transistors though. Transistors form an efficient, single element, universal digital basis.

      And they are a much less arbitrary choice than NAND vs. NOR, XOR, etc.

        Using transistors as conceptual digital logic primitives, where power dissipation isn't a thing, Pass Logic is "The Way".

    • canjobear 1 hour ago
      > Why use splines or polynomials or haphazardly chosen basis functions if you can just fit (gradient descent) your data or wave functions to the proper computational EML tree?

      Because the EML basis makes simple functions (like +) hard to express.

      Not to diminish this very cool discovery!

    • eru 1 hour ago
      > I'm still reading this, but if this checks out, this is one of the most significant discoveries in years.

      It seems like a neat parlour trick, indeed. But significant discovery?

  • entaloneralie 1 hour ago
    This is amazing! I love seeing FRACTRAN-shaped things on the homepage :) This reminds me of how 1-bit stacks are encoded in binary:

    A stack of zeros and ones can be encoded in a single number by bit-shifting and incrementing:

        Pushing a 0 onto the stack is equivalent to doubling the number.
        Pushing a 1 is equivalent to doubling and adding 1.
        Popping is equivalent to dividing by 2, where the remainder is the popped bit.
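
    Those three rules, sketched in JavaScript (my own snippet; BigInt keeps the stack from being limited to 53 bits, and the leading 1 bit is a sentinel marking the stack bottom):

    ```javascript
    // A stack of bits lives in a single integer; the top of the stack
    // is the number's least significant bit.
    const push = (s, bit) => s * 2n + BigInt(bit); // double (and add 1 for a 1-bit)
    const pop  = (s) => [s / 2n, s % 2n];          // halve; the remainder is the bit

    let stack = 1n;            // sentinel only: empty stack
    stack = push(stack, 0);    // stack is now [0]
    stack = push(stack, 1);    // stack is now [0, 1]
    let bit;
    [stack, bit] = pop(stack); // bit is 1n, stack back to [0]
    ```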
    
    I use something not too far off in my daily programming, based on a similar idea:

    Rejoice is a concatenative programming language in which data is encoded as multisets that compose by multiplication. Think Fractran, without the rule-searching, or Forth without a stack.

    https://wiki.xxiivv.com/site/rejoice

  • eugene3306 24 minutes ago
    This makes a good benchmark for LLMs:

    ```
    look at this paper: https://arxiv.org/pdf/2603.21852

    now please produce 2x+y as a composition on EMLs
    ```

    Opus (paid) - claimed that "2" is circular. Once I told it that ChatGPT had already done this, it finished successfully.

    ChatGPT(free) - did it from the first try.

    Grok - produced an estimate of the depth of the formula.

    Gemini - success

    Deepseek - Assumed some pre-existing knowledge of what EML is. Unable to fetch the PDF from the link, unable to consume the PDF from "Attach file".

    Kimi - produced long output, stopped and asked to upgrade

    GLM - looks ok

  • lioeters 1 hour ago
    > A calculator with just two buttons, EML and the digit 1, can compute everything a full scientific calculator does

    Reminds me of the Iota combinator, one of the smallest formal systems that can be combined to produce a universal Turing machine, meaning it can express all of computation.

  • krick 1 hour ago
    > using EML trees as trainable circuits ..., I demonstrate the feasibility of exact recovery of closed-form elementary functions from numerical data at shallow tree depths up to 4

    That's awesome. I always wondered if there is some way to do this.

  • qiller 1 hour ago
    For completeness, there is also Peirce's arrow, aka the NOR operation, which is functionally complete. Fun application: IIRC the VMProtect copy-protection system has an internal VM based on NOR.

    A quick Google search brings up https://github.com/pr701/nor_vm_core, which shows the basic idea

  • testaccount28 17 minutes ago
    derivation of -x seems wrong. we can look at the execution trace on a stack machine, but it's actually not hard to see. starting from the last node before the output, we see that the tree has the form

        eml(z, eml(x, 1))
          = e^z - ln(eml(x, 1))
          = e^z - ln(e^x)
          = e^z - x
    
    and the claim is that, after it's expanded, z will be such that this whole thing equals -x. but with some algebra, this happens only if

        e^z = 0,
    
    and there is no complex number z that satisfies this equation. indeed, if we laboriously expand the given formula for z (the left branch of the tree), we see that it goes through ln(0) inside compound expressions.

    x^-1 has the same problem.

    both formulae work ...sort of... if we allow ln(0) = -Infinity and some other moxie, such as x / Infinity = 0 for all finite x.
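
    The IEEE-754 escape hatch is easy to check numerically (my sketch: I plug z = ln(0) = -Infinity in by hand rather than expanding the paper's tree for z):

    ```javascript
    const eml = (x, y) => Math.exp(x) - Math.log(y);

    // Under IEEE-754, ln(0) = -Infinity and e^(-Infinity) = 0,
    // so "e^z = 0" becomes reachable with z = ln(0):
    const z = Math.log(0);                // -Infinity
    const neg = (x) => eml(z, eml(x, 1)); // e^z - ln(e^x) = 0 - x = -x
    ```

    As the derivation above says, no finite (or complex) z does this; it only works in the extended reals.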

  • notorandit 27 minutes ago
    Not sure it really compares to NAND() and the likes.

    Simply because bool algebra doesn't have that many functions and all of them are very simple to implement.

    A complex bool function made out of NANDs (or the likes) is only a little more complex than the same function made out of the other operators.

    Implementing even simple real functions out of eml() seems to me to add a lot of computational complexity, even with both exp() and ln() implemented in hardware in O(1). I'm thinking of stuff like sum(), div() and mod().

    Of course, I might be badly wrong as I am not a mathematician (not even by far).

    But I don't see, at the moment, the big win on this.

  • simplesighman 2 hours ago
    > For example, exp(x)=eml(x,1), ln(x)=eml(1,eml(eml(1,x),1)), and likewise for all other operations

    I read the paper. Is there a table covering all other math operations translated to eml(x,y) form?

    • sandrocksand 1 hour ago
      I think what you want is the supplementary information, part II "completeness proof sketch" on page 12. You already spotted the formulas for "exp" and real natural "L"og; then x - y = eml(L(x), exp(y)) and from there apparently it is all "standard" identities. They list the arithmetic operators then some constants, the square root, and exponentials, then the trig stuff is on the next page.

      You can find this link on the right side of the arxiv page:

      https://arxiv.org/src/2603.21852v2/anc/SupplementaryInformat...

    • vbezhenar 54 minutes ago
      Didn't read the paper, but it was easy for me to derive constants 0, 1, e and functions x + y, x - y, exp(x), ln(x), x * y, x / y. So seems to be enough for everything. Very elegant.
    • saratogacx 1 hour ago
      The last page of the PDF has several trees that represent a few common math functions.
    • jmyeet 1 hour ago
      I was curious about that too. Gemini actually gave a decent list. Trig functions come from Euler's identity:

          e^ix = cos x + i sin x
      
      which means:

          e^-ix = cos -x + i sin -x
                = cos x - i sin x
      
      so adding them together:

          e^ix + e^-ix = 2 cos x
          cos x = (e^ix + e^-ix) / 2
      
      So I guess the real part of that.

      Multiplication, division, addition and subtraction are all straightforward. So are hyperbolic trig functions. All other trig functions can be derived as per above.
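
      The hyperbolic functions need no complex arithmetic at all, so they're easy to spot-check from the eml-derived exp (my sketch; note I use JavaScript's unary minus for -x, which in pure eml form has its own, much larger, tree):

      ```javascript
      const eml  = (x, y) => Math.exp(x) - Math.log(y);
      const exp_ = (a) => eml(a, 1); // exp(x) = eml(x, 1), as in the paper

      // cosh and sinh straight from the exponential definitions:
      const cosh_ = (x) => (exp_(x) + exp_(-x)) / 2;
      const sinh_ = (x) => (exp_(x) - exp_(-x)) / 2;
      ```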

  • prvc 33 minutes ago
    This is neat, but could someone explain the significance or practical (or even theoretical) utility of it?
  • nonfamous 2 hours ago
    How would an architecture with a highly-optimized hardware implementation of EML compare with a traditional math coprocessor?
    • wildzzz 2 hours ago
      Dreadfully slow for integer math, but probably similar in performance to something like a CORDIC for specific operations. If you can build an FPU that does exp() and ln() really fast, it's simple binary tree traversal to find the solution.
      • AlotOfReading 1 hour ago
        You already have an FPU that approximates exp() and ln() really fast, because float<->integer bit conversions approximate the base-2 exponential and logarithm respectively. Doing it accurately runs face-first into the tablemaker's dilemma, but you could do it with just 2 conversions, 2 FMAs (for power adjustments), and a subtraction per operation. A lot of cases would be even faster. Whether that's worth it will be situational.
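
        I read this as the classic bit-reinterpretation trick; a float32 sketch of both directions (my own illustration):

        ```javascript
        // Reinterpreting a float32's bits as an integer gives a piecewise-linear
        // approximation of log2 (127 is the float32 exponent bias):
        const buf = new ArrayBuffer(4);
        const f32 = new Float32Array(buf);
        const u32 = new Uint32Array(buf);

        const log2Approx = (x) => { f32[0] = x; return u32[0] / 2 ** 23 - 127; };
        const exp2Approx = (x) => { u32[0] = (x + 127) * 2 ** 23; return f32[0]; };
        ```

        Exact at powers of two, with a worst-case error of about 0.09 in between; the FMAs mentioned above would handle base and bias adjustments.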
  • tripdout 2 hours ago
    Interesting, but is the required combination of EML gates less complex than using other primitives?
    • eru 1 hour ago
      In general, no.
  • jekude 2 hours ago
    How would physical EML gates be implemented in reality?

    Posts like these are the reason i check HN every day

  • peterlk 2 hours ago
    Reminds me a bit of the coolest talk I ever got to see in person: https://youtu.be/FITJMJjASUs?si=Fx4hmo77A62zHqzy

    It’s a derivation of the Y combinator from ruby lambdas

    • Analemma_ 2 hours ago
      If you've never worked through a derivation/explanation of the Y combinator, definitely find one (there are many across the internet) and work through it until the light bulb goes off. It's pretty incredible, it almost seems like "matter ex nihilo" which shouldn't work, and yet does.

      It's one of those facts that tends to blow minds when it's first encountered, I can see why one would name a company after it.

    • thaumasiotes 2 hours ago
      Have you gone through The Little Schemer?

      More on topic:

      > No comparable primitive has been known for continuous mathematics: computing elementary functions such as sin, cos, sqrt, and log has always required multiple distinct operations.

      I was taught that these were all hypergeometric functions. What distinction is being drawn here?

  • nurettin 37 minutes ago
    The problem for symbolic regression is that ln(y) is undefined at y = 0 (and for y < 0), so you can't freely generate expressions with it. We need to guard it with something like ln(1+y*y) or ln(1+|y|), or return undefined.
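
    The suggested guards are one-liners if anyone wants to play with them (my sketch):

    ```javascript
    // Total (everywhere-defined) replacements for ln in generated expressions:
    const ln1sq  = (y) => Math.log(1 + y * y);         // ln(1 + y^2), always >= 0
    const ln1abs = (y) => Math.log(1 + Math.abs(y));   // ln(1 + |y|), gentler growth
    const lnSafe = (y) => (y > 0 ? Math.log(y) : NaN); // explicit "undefined"
    ```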
  • hyperhello 2 hours ago
    > eml(x,y)=exp(x)-ln(y)

    Exp and ln, isn't the operation its own inverse depending on the parameter? What a neat find.

    • thaumasiotes 1 hour ago
      > isn't the operation its own inverse depending on the parameter?

      This is a function from ℝ² to ℝ. It can't be its own inverse; what would that mean?

      • woopsn 1 hour ago
        It's a kind of superposition representation à la Kolmogorov-Arnold: a learnable functional basis for elementary functions of the form g(x,y) = f(x) - f^{-1}(y), here with f = exp.
      • hyperhello 1 hour ago
        eml(eml(1,eml(eml(1,x),1)),1) = exp(ln(x)) = x, and eml(1,eml(eml(1,eml(x,1)),1)) = ln(exp(x)) = x
  • supermdguy 2 hours ago
    Next step is to build an analog scientific calculator with only EML gates
  • selcuka 2 hours ago
    So, like brainf*ck (the esoteric programming language), but for maths?
  • noobermin 1 hour ago
    I don't mean to shit on their interesting result, but exp or ln are not really that elementary themselves... it's still an interesting result, but there's a reason that all approximations are done using series of polynomials (taylor expansion).
    • traes 54 minutes ago
      Elementary function is a technical term that this paper uses correctly, not a generic prescription of simplicity.

      See https://en.wikipedia.org/wiki/Elementary_function.

    • xpe 1 hour ago
      > but there's a reason that all approximations are done using series of polynomials (taylor expansion).

      "All" is a tall claim. Have a look at https://perso.ens-lyon.fr/jean-michel.muller/FP5.pdf for example. Jump to slide 18:

      > Forget about Taylor series

      > Taylor series are local best approximations: they cannot compete on a whole interval.

      There is no need to worry about "sh-tt-ng" on their result when there is so much to learn about other approximation techniques.

  • zephen 2 hours ago
    Judging by the title, I thought I would have a good laugh, like when the doctor discovered numerical integration and published a paper.

    But no...

    This is about continuous math, not ones and zeroes. Assuming peer review proves it out, this is outstanding.

    • paulpauper 1 hour ago
      I don't think this is ever making it past the editor of any journal, let alone peer review.

      > Elementary functions such as exponentiation, logarithms and trigonometric functions are the standard vocabulary of STEM education. Each comes with its own rules and a dedicated button on a scientific calculator;

      What?

      And:

      > No comparable primitive has been known for continuous mathematics: computing elementary functions such as sin, cos, √, and log has always required multiple distinct operations. Here we show that a single binary operator

      Yeah, this is done by using tables and series. His method does not actually facilitate the computation of these functions.

      There is no such thing as "continuous mathematics". Maybe he meant to say continuous functions?

      Looking at page 14, it looks like he reinvented the concept of the vector valued function or something. The whole thing is rediscovering something that already exists.

      • traes 44 minutes ago
        This preprint was written by a researcher at an accredited university with a PhD in physics. I'm sure they know what a vector valued function is.

        The point of this paper is not to revolutionize how a scientific calculator functions overnight; it's to establish a single binary operation that can reproduce the rest of the typical continuous elementary operations via repeated application, analogous to how a NAND or NOR gate creates all of the discrete logic gates. Hence "continuous mathematics," as opposed to discrete mathematics. It seems to me you're being overly negative without solid reasoning.

  • BobbyTables2 3 hours ago
    How does one actually add with this?
    • curtisf 1 hour ago
      It's basically using the "-" embedded in the definition of the eml operator.

      Table 4 shows the "size" of the operators when fully expanded to "eml" applications, which is quite large for +, -, ×, and /.

      Here's one approach which agrees with the minimum sizes they present:

              eml(x, y             ) = exp(x) − ln(y) # 1 + x + y
              eml(x, 1             ) = exp(x)         # 2 + x
              eml(1, y             ) = e - ln(y)      # 2 + y
              eml(1, exp(e - ln(y))) = ln(y)          # 6 + y; construction from eq (5)
                               ln(1) = 0              # 7
      
      After you have ln and exp, you can invert their applications in the eml function

                    eml(ln x, exp y) = x - y          # 9 + x + y
      
      Using a subtraction-of-subtraction to get addition leads to the cost of "27" in Table 4; I'm not sure what formula leads to 19 but I'm guessing it avoids the expensive construction of 0 by using something simpler that cancels:

                   x - (0 - y) = x + y          # 25 + x + y
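
      The identities in that table are easy to spot-check numerically (my sketch; the cost counts above are the parent comment's and aren't verified here):

      ```javascript
      const eml  = (x, y) => Math.exp(x) - Math.log(y);
      const exp_ = (a) => eml(a, 1);               // e^a
      const ln_  = (y) => eml(1, exp_(eml(1, y))); // eml(1, exp(e - ln y)) = ln y
      const sub  = (x, y) => eml(ln_(x), exp_(y)); // e^(ln x) - ln(e^y) = x - y
      const zero = ln_(1);                         // ~0, up to rounding
      ```

      Note sub only works here when its first argument is positive, since this floating-point version takes ln of it directly.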
    • bzax 2 hours ago
      Well, once you've derived unary exp and ln you can get subtraction, which then gets you unary negation and you have addition.
      • freehorse 27 minutes ago
        And then by using the fact that the exponential turns addition into multiplication, you get multiplication (and subtraction gives division).
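
        Spot-checking that path (my sketch; valid only on a restricted domain, since each helper detours through ln — and negLn is my shortcut for -ln y via (e - ln y) - e, dodging a full eml-expansion of 0):

        ```javascript
        const eml   = (x, y) => Math.exp(x) - Math.log(y);
        const exp_  = (a) => eml(a, 1);                // e^a
        const ln_   = (y) => eml(1, exp_(eml(1, y)));  // ln y
        const sub   = (x, y) => eml(ln_(x), exp_(y));  // x - y (needs x > 0)
        const negLn = (y) => sub(eml(1, y), exp_(1));  // (e - ln y) - e = -ln y

        const mul = (x, y) => exp_(sub(ln_(x), negLn(y))); // e^(ln x + ln y)
        const div = (x, y) => exp_(sub(ln_(x), ln_(y)));   // e^(ln x - ln y)
        ```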
    • nick238 2 hours ago
      Don't know about adding, but multiplication has a diagram on the last page of the PDF.

      xy = eml(eml(1, eml(eml(eml(eml(1, eml(eml(1, eml(1, x)), 1)), eml(1, eml(eml(1, eml(y, 1)), 1))), 1), 1)), 1)

      From Table 4, I think addition is slightly more complicated?

      • simplesighman 1 hour ago
        Thanks for posting that. You had a transcription typo, which is corrected in the ECMAScript below. Here's the calculation for 5 x 7:

            const eml = (x,y) => Math.exp(x) - Math.log(y);
            const mul = (x,y) => eml(eml(1,eml(eml(eml(1,eml(eml(1,eml(1,x)),1)),eml(1,eml(eml(1,eml(y,1)),1))),1)),1);
            console.log(mul(5,7));
        
        > 35.00000000000001

        For larger or negative inputs you get a NaN, because ECMAScript has limited precision and range and doesn't handle complex numbers.

      • Charon77 1 hour ago
        x+y = ln(exp(x) * exp(y))

        exp(a) = eml(a, 1)

        ln(a) = eml(1, eml(eml(1,a), 1))

        Plugging those in is an exercise for the reader

        • jcgrillo 1 hour ago
          might need to turn the paper sideways