Hill Space: Neural nets that do perfect arithmetic (to 10⁻¹⁶ precision)
When understood and used properly, the constraint W = tanh(Ŵ) ⊙ σ(M̂) (introduced in NALU by Trask et al., 2018) creates a unique parameter topology in which the optimal weights for discrete arithmetic operations can be calculated rather than learned. During training, these weights converge with remarkable speed and reliability toward that optimum. Most neural networks struggle with basic arithmetic: they approximate, they fail to extrapolate, and they're inconsistent. But what if there were a way to make them exact?
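To make the constraint concrete, here is a minimal NumPy sketch (an illustration, not the original NALU implementation). Because tanh is bounded in (-1, 1) and the sigmoid in (0, 1), pushing the unconstrained parameters Ŵ and M̂ toward large magnitudes saturates each effective weight at one of the discrete values {-1, 0, +1} to within floating-point precision:

```python
import numpy as np

def hill_weight(w_hat, m_hat):
    """NALU-style constraint: W = tanh(W_hat) * sigmoid(M_hat)."""
    return np.tanh(w_hat) / (1.0 + np.exp(-m_hat))

# Saturating the unconstrained parameters pins W at discrete values:
#   (w_hat=+40, m_hat=+40) -> W ~ +1
#   (w_hat=-40, m_hat=+40) -> W ~ -1
#   (w_hat=+40, m_hat=-40) -> W ~  0   (sigmoid gates the weight off)
W = hill_weight(np.array([40.0, -40.0, 40.0]),
                np.array([40.0, 40.0, -40.0]))
print(W)  # ~ [1., -1., ~4e-18]
```

This saturation is what makes the optimal weights for an operation computable in closed form: choosing which inputs get weight +1, -1, or 0 fully specifies an exact add/subtract (or, in log space, multiply/divide) circuit.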