No issue, just a finding I had... #1
Hi, thanks for the test.
I will play around with it a little more, because I really find it interesting, and if one finds a way to have multiple layers it might be a very good and quick learner. Considering binary inputs, this would mean we would have to somehow add a binary clock to the input, like 0,1,0,1,0,1 (for example a 128-bit value, something we would never exhaust) per datapoint. I saw this effect helps when you try to predict a decaying sinusoidal wave that has spikes at specific points. The example you have can predict everything that repeats evenly in the prediction, but it can't predict uneven things, like every 3rd value, as it moves across time. If one adds the binary timer, it is able to predict the uneven values as well. Again, thanks for all your hard work.
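The binary-clock idea above can be sketched as a preprocessing step. This is a minimal illustration, not code from the project: the helper name `add_binary_clock` and the clock width are my own choices (the comment mentions e.g. a 128-bit value; a small width is used here for readability).

```python
import numpy as np

def add_binary_clock(inputs, clock_bits=8):
    """Append the bits of a wrapping counter to each datapoint.

    inputs: array of shape (n_samples, n_features)
    clock_bits: width of the binary clock (illustrative value)
    """
    n = inputs.shape[0]
    # Counter value for each timestep, wrapping at 2**clock_bits.
    t = np.arange(n) % (2 ** clock_bits)
    # Extract each bit of the counter, least-significant bit first.
    clock = ((t[:, None] >> np.arange(clock_bits)) & 1).astype(inputs.dtype)
    return np.hstack([inputs, clock])

# Example: three timesteps of a 1-feature signal gain 3 clock bits.
x = np.array([[0.5], [0.7], [0.9]])
print(add_binary_clock(x, clock_bits=3).shape)  # (3, 4)
```

With the clock bits concatenated, patterns that depend on absolute position in time (like "every 3rd value") become linearly distinguishable to the learner.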
First of all, a lot of thanks for publishing your code. In your video you mention that it is unknown how many layers/neurons should be sufficient, so I ran some very basic tests on RLS. From my findings, I estimate that to get 100% accuracy you would require 1.1x as many neurons as there are possible outcomes.
To do the test I chose the majority bit as a task: you have 3-12 input bits, and if more bits are 1 the output is 1; if more input bits are 0, the output should be 0 too. I ran it for different neuron counts and different majority-gate sizes, all in one shot (so the training data is only seen once), as mentioned in the video and the GitHub description.
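The majority-bit task described above can be generated exhaustively like this. A minimal sketch only; the actual test harness referenced below is not reproduced here, and the function name is illustrative.

```python
import numpy as np

def majority_dataset(n_bits):
    """Enumerate all 2**n_bits input patterns with their majority label.

    The label is 1 when more input bits are 1 than 0, else 0.
    (For even n_bits a tie is possible; ties are labeled 0 here.)
    """
    patterns = np.arange(2 ** n_bits)
    # Unpack each integer into its bits, least-significant bit first.
    X = ((patterns[:, None] >> np.arange(n_bits)) & 1).astype(np.float64)
    y = (X.sum(axis=1) > n_bits / 2).astype(np.float64)
    return X, y

X, y = majority_dataset(3)
print(X.shape)       # (8, 3)
print(y.tolist())    # [0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 1.0]
```

Training once over such an exhaustive enumeration matches the one-shot setup described in the test: every pattern is seen exactly once.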
Here is the adjusted code I used:
Here is the output with the accuracies:
Considering using it for a large language model (just for fun), which has a practically unlimited number of measurable outputs, it would be impossible to use this network, as it would need on the order of 10^80 neurons. It would learn all languages and code perfectly, but we would be unable to run it anywhere with current compute.
Am I doing something wrong, or is this the expected behaviour? Did you also run some tests on it?
Update: I ran a test to check whether epochs matter, and statistically they don't; 1 epoch is enough, but the neuron count has to be at least 10% greater than the number of different possible outputs. I ran the test from 1-512 epochs...
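The sizing rule observed above can be turned into a back-of-the-envelope estimate. This is a sketch of my own reading of the finding: "possible outcomes" is interpreted here as the number of distinct input patterns (2^k for a k-bit majority gate), which is an assumption.

```python
import math

def estimated_neurons(n_input_bits, factor=1.1):
    """Estimate the neurons needed for ~100% one-shot accuracy,
    assuming the observed rule of ~1.1x the number of distinct
    input patterns (2**n_input_bits). The 1.1 factor comes from
    the finding reported in this issue."""
    return math.ceil(factor * 2 ** n_input_bits)

for k in (3, 8, 12):
    print(k, estimated_neurons(k))  # 3 -> 9, 8 -> 282, 12 -> 4506
```

This exponential growth in the input width is exactly why the LLM thought experiment above lands at an astronomically large neuron count.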
Best regards