Do you know *sin(x)*? Every kid in 5th grade knows what it is. It is a property of an angle. First, how do you mathematically describe an angle? Kids in school are shown two lines starting from a single point: one line is fixed and the other rotates around that point. The quantity *x*, how much the second line is rotated, is called the angle. So *x* goes from 0 to 360º, which is the same as 0 to 2 * π. Then the teacher shows the kids that *x* can be negative too, in which case the rotation goes in the opposite direction. But it does not stop there, you can rotate multiple times! In that case *x* starts from 0 again on every full turn. That is about angles, now back to *sin(x)*. If you have a triangle with two sides perpendicular to each other, then *sin(x)*, where *x* is one of its other angles, is the ratio of the opposite side to the hypotenuse. If you imagine such a triangle and stretch its sides while focusing on one angle *x*, you see the possible values of *sin(x)*: it goes from 0, when the triangle collapses into a horizontal line, all the way up to 1, when the triangle collapses into a vertical line, and lands somewhere in between otherwise. Then, similarly to how angles *x* can go below 0 and beyond 2 * π, the value of *sin(x)* can be defined for *x* beyond 90º, in which case you start using Cartesian coordinates to describe your triangle! Now, in the Cartesian plane you have the *x* and *y* axes going from -∞ to +∞; they are perpendicular to each other; and they cross each other at 0. So! Now place one corner of the triangle at *x=0,y=0* and draw a circle with radius 1 around it. Pick one point on the circle, which will be the 2nd corner of the triangle, draw a vertical line through it, and mark where that line crosses the horizontal axis, which is the 3rd corner, and you get your triangle! What is cool is that the lengths of both perpendicular sides now always match points on the horizontal and vertical axes. Their values are always from -1 to +1, and your angle *x* can take any value! So! 
You can see that *sin(x)* now accepts any value of *x*, and the values of *sin(x)* range from *-1* to *+1*! The teacher points out that *sin(x)=0* at specific points where *x* is 0 or -2 * π, that *sin(x)=1* at 1/2 * π, and that *sin(x)=-1* at 3/2 * π. The teacher also shows that you can increment or decrement *x* by 2 * π and that will not change the value of *sin(x)*. How nice! You still don’t really know the exact values of *sin(x)* for many values of *x*. However, the teacher assures you that when you really need to know a value of *sin(x)*, you can use large books with look-up tables where someone else has already computed it for everybody with high precision!
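
If you want to poke at the unit-circle picture yourself, here is a minimal Python sketch of it. It is only an illustration under my own assumptions: the helper name `sin_from_unit_circle` is made up, and it rotates the point (1, 0) by *x* radians with `cmath.exp` and reads off the vertical coordinate.

```python
import cmath
import math


def sin_from_unit_circle(x: float) -> float:
    """Vertical coordinate of the unit-circle point at angle x (radians).

    Hypothetical helper for illustration: rotating the point (1, 0) by x
    is the same as computing e^(i*x); its imaginary part is the height
    of the 2nd corner of the triangle above the horizontal axis.
    """
    point = cmath.exp(1j * x)
    return point.imag


# The special angles the teacher points out.
special_angles = [
    ("0", 0.0),
    ("1/2 * pi", math.pi / 2),
    ("3/2 * pi", 3 * math.pi / 2),
    ("-2 * pi", -2 * math.pi),
]
for label, angle in special_angles:
    print(f"sin({label}) is about {sin_from_unit_circle(angle):+.6f}")

# Shifting x by a full turn lands on the same point of the circle.
x = 0.7
shift = abs(sin_from_unit_circle(x) - sin_from_unit_circle(x + 2 * math.pi))
print(f"difference after a full turn: {shift:.1e}")
```

The printed numbers are floating-point approximations, so expect tiny rounding noise rather than exact zeros and ones.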

Now, that is how humans learn *sin(x)*. Let’s have a look at how the machine learning of 2021 approaches the same problem in its quest to build General Artificial Intelligence and to understand how the brain works. Machine learning guys would make a computational model that somewhat mimics the human brain. The input of the model is just a plain value of *x*. The output of the model is a number, a straight-up value of *sin(x)*. Inside the model, they would make a multi-layer artificial neural network, loosely based on a very simplified version of the structural and functional plasticity of the brain and of Hebbian learning, with a learning procedure that updates the weights of the connections between neurons. They would use back-propagation and stochastic gradient descent over a loss function defined as how much the output of the model diverges from the real value of *sin(x)*. It is supervised learning. They would use tons of points in their dataset, maybe hundreds, maybe thousands. They would train the model for hundreds or thousands of epochs. When they finally see that the error no longer improves, they show you their final work. What can it do? When you zoom in, for a given *x* it will never give the exact value of *sin(x)*; it is always an approximation. You can’t tell the model “alrighty, model, please take a couple of hours or days but give me a result with a precision of more than 9000 digits”. When you zoom out, you see that the model gives a more-or-less correct result only within the interval where it saw training samples! The model absolutely did not get the fact that the value of *sin(x)* cannot go beyond +1 or below -1. Beyond the training range, the model gives values wildly out of touch with reality. Next, you might ask “okay model, do you know some interesting values of *sin(x)*? How about *sin(1/2 * π)*, is it exactly 1 or something *approximate* again?”. Your model will give you some “approximation”, not exactly +1. 
You might say “oh, how do you type 1/2 * π into your computer model? π is not a representable number!” Well, yeah, but show me where you type π into the human brain; it is not represented exactly anywhere there either. Moving on, you say “let’s check if the model understands periodicity”. Nope, the model will give slightly different results within the trained region and wildly different ones outside. Ah, the model did not get pretty much anything about *sin(x)*.
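
To make this concrete, here is a minimal sketch of such an experiment in plain NumPy. Everything specific in it is an assumption for illustration, not anyone's actual setup: a single hidden layer of 32 tanh units (smaller than a real multi-layer model), 256 random training points on [-π, π], full-batch gradient descent with a hand-written backward pass, an arbitrary seed and learning rate, and the made-up names `predict` and `mean_abs_error`.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary fixed seed

# Training data: the model only ever sees x inside [-pi, pi].
x_train = rng.uniform(-np.pi, np.pi, size=(256, 1))
y_train = np.sin(x_train)

# Tiny network: 1 input -> 32 tanh units -> 1 output (sizes are assumptions).
hidden = 32
W1 = rng.normal(0.0, 1.0, size=(1, hidden))
b1 = rng.normal(0.0, 1.0, size=(1, hidden))
W2 = rng.normal(0.0, 1.0 / np.sqrt(hidden), size=(hidden, 1))
b2 = np.zeros((1, 1))


def predict(x):
    """Model output for scalar or array x, returned as a column vector."""
    x = np.asarray(x, dtype=float).reshape(-1, 1)
    return np.tanh(x @ W1 + b1) @ W2 + b2


# Full-batch gradient descent on the loss 0.5 * mean((pred - y)^2).
lr = 0.02
for step in range(20000):
    h = np.tanh(x_train @ W1 + b1)          # hidden activations, (N, hidden)
    pred = h @ W2 + b2                      # model output, (N, 1)
    err = (pred - y_train) / len(x_train)   # d(loss)/d(pred)
    grad_W2 = h.T @ err
    grad_b2 = err.sum(axis=0, keepdims=True)
    dh = (err @ W2.T) * (1.0 - h ** 2)      # back-propagate through tanh
    grad_W1 = x_train.T @ dh
    grad_b1 = dh.sum(axis=0, keepdims=True)
    W1 -= lr * grad_W1
    b1 -= lr * grad_b1
    W2 -= lr * grad_W2
    b2 -= lr * grad_b2


def mean_abs_error(lo, hi):
    """Average |model(x) - sin(x)| over 1000 evenly spaced points in [lo, hi]."""
    x = np.linspace(lo, hi, 1000)
    return float(np.mean(np.abs(predict(x).ravel() - np.sin(x))))


in_range_error = mean_abs_error(-np.pi, np.pi)
out_of_range_error = mean_abs_error(3 * np.pi, 5 * np.pi)

print("mean error inside the training interval:", in_range_error)
print("mean error one period further out:      ", out_of_range_error)
print("model at 1/2 * pi (should be 1):        ", predict(np.pi / 2).item())
print("model(1) vs model(1 + 2 * pi):          ",
      predict(1.0).item(), predict(1.0 + 2 * np.pi).item())
```

On runs like this, the error inside the training interval is typically small, while a full period away it is much larger. The exact numbers depend on the seed and the hyperparameters, so treat them as a demonstration of the pattern, not as a benchmark.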

And it is not just the value of *sin(x)*. When humans learn about it, they use geometry, analogies from the physical world, and algebraic properties of numbers, and mix them all together. Does a machine learning model know anything about angles or triangles? More importantly, humans do not learn from any “training” samples at all. In fact, even after we have learned about *sin(x)*, we do not know most of its values. Like, at all. Throughout our whole life. Even engineers and mathematicians! Yet we can give you an answer with ultimate precision (in other words, with literally 0 error) for some values, and with any precision you want for the rest. We will *never* fail on properties like periodicity. We will give the correct answer for any value of *x*, whether it is beyond 2 ^ 32, 2 ^ 64, or 2 ^ 128, or so small that it underflows float64. The magnitude of *x* does not matter. After we get introduced to derivatives and integrals, we can use our knowledge of *sin(x)* to take its derivatives or integrals. Again with no training samples, again with no errors.
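
For contrast, the human-style recipe is easy to write down as a program. The sketch below is an illustration under stated assumptions, not a definitive implementation: `sin_of_pi_multiple` and `sin_to_digits` are made-up names, the exact table covers only four textbook angles, and the Taylor series is just one possible way to get "any precision you want", meant here for small |x| only.

```python
from decimal import Decimal, localcontext
from fractions import Fraction

# Exact values of sin(q * pi) for a few textbook angles, with q in [0, 2).
EXACT_VALUES = {
    Fraction(0): 0,
    Fraction(1, 2): 1,
    Fraction(1): 0,
    Fraction(3, 2): -1,
}


def sin_of_pi_multiple(q):
    """Exact sin(q * pi) for rational q, or None if q is not in the table.

    The angle is kept as a rational multiple of pi, so periodicity can be
    applied exactly, no matter how large q is.
    """
    reduced = Fraction(q) % 2  # sin(x + 2 * pi) == sin(x)
    return EXACT_VALUES.get(reduced)


def sin_to_digits(x, digits):
    """sin(x) rounded to `digits` significant digits via the Taylor series.

    Pass x as a string or int so it is read exactly. Intended for small |x|;
    the series converges slowly for large arguments.
    """
    with localcontext() as ctx:
        ctx.prec = digits + 10                    # extra guard digits
        x = Decimal(x)
        term = x                                  # first term: x^1 / 1!
        total = x
        n = 1
        threshold = Decimal(10) ** -(digits + 5)
        while abs(term) > threshold:
            n += 2
            term = -term * x * x / ((n - 1) * n)  # next odd-power term
            total += term
        ctx.prec = digits
        return +total                             # unary plus rounds to `digits`


print(sin_of_pi_multiple(Fraction(1, 2)))           # a textbook angle
print(sin_of_pi_multiple(2 ** 128 + Fraction(1, 2)))  # same angle, 2^127 turns later
print(sin_of_pi_multiple(Fraction(-1, 2)))          # negative rotation
print(sin_to_digits("1", 50))                       # sin(1) to 50 digits
```

The exactness comes from never turning π into a float: the angle stays a fraction of π, which is the same symbolic bookkeeping we do in our heads.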

Machine learning of today is like *sin(x)*. It really needs to achieve *zero*-training data and *zero*-error. It may even resemble programming. But what is more important, in training and inference alike, it has to be more humane. ❤️