Very interesting behaviour...

When

```python
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.94, random_state=42)
```

using only 6% of the data to learn (with the one-hot encoded data) and 94% as the test set... we still get 81% accuracy...
This is an absolutely amazing result!
It seems Method 1 > Method 3 > Method 2
I ran multiple different tests, and all came up with the same result: Method 1 is the best of the three.
It is also (afaik) a record, from everything I remember on MNIST, to get 81% accuracy on a single-threaded CPU within 15 seconds!!! Never saw this before (in Python, not C)...
Will probably also try it with CIFAR-10 and see how well it does...
Result:

```
----------------------------
Size of training set: 4200
Size of testing set: 65800
----------------------------
De-correlating all the xs with each other
----------------------------
Method 1. regression on ys using multiple y_classes in the form of one_hot matrix
train accuracy: 0.9147619047619048
test accuracy: 0.8130851063829787
---------------------------------
Method 2. regression on ys with simple rounding & thresholding of the predicted y classes.....
train accuracy: 0.27404761904761904
test accuracy: 0.2289209726443769
---------------------------------
Method 3. regression on ys using multiple y_classes in the form of random vectors (embeddings)
train accuracy: 0.7904761904761904
test accuracy: 0.6704103343465045
```
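For anyone wanting to reproduce Method 1 independently, here is a minimal sketch of the one-hot regression setup — assuming MNIST fetched via scikit-learn's `fetch_openml` and a plain least-squares fit; the repo's actual code (including the de-correlation step) may differ, so don't expect the exact numbers above:

```python
# Minimal sketch of Method 1 (illustrative reconstruction, not the repo's exact code):
# least-squares regression onto one-hot targets, predicted class = argmax.
import numpy as np
from sklearn.datasets import fetch_openml
from sklearn.model_selection import train_test_split

X, y = fetch_openml("mnist_784", version=1, return_X_y=True, as_frame=False)
X = X / 255.0
y = y.astype(int)

# 6% train / 94% test, as in the split above
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.94, random_state=42)

Y_onehot = np.eye(10)[y_train]  # one-hot target matrix, shape (n_train, 10)

# Ordinary least squares: solve X_train @ W ~= Y_onehot
W, *_ = np.linalg.lstsq(X_train, Y_onehot, rcond=None)

print("train accuracy:", np.mean((X_train @ W).argmax(1) == y_train))
print("test accuracy:", np.mean((X_test @ W).argmax(1) == y_test))
```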
Thanks for the addition
You can increase the performance further by passing the data through a non-linear layer of weights, e.g. ReLU.
I have added the following lines to the beginning of the code, which improve the performance of Method 1 to 0.91 on the test set.
The more nodes, the better the performance (with 1000 nodes you can reach 95%), but the computation becomes very expensive due to the quadratic nature of the "xs" de-correlation. The risk of over-fitting also increases with more nodes.
```python
import numpy as np

def activate(w, x):
    linear = w.T @ x             # project through the fixed random weights
    # out = linear               # (identity / no activation, for comparison)
    out = np.maximum(linear, 0)  # ReLU
    return out

nodes = 500
w = np.random.randn(xs.shape[0], nodes)
xs = activate(w, xs)
X_train = activate(w, X_train.T).T
X_test = activate(w, X_test.T).T
```
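These lines replace the raw pixel features with `nodes` random ReLU features before the de-correlation and regression steps run, which is why the de-correlation cost (quadratic in the number of features, as noted above) climbs quickly as `nodes` grows.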
P.S. You can also increase the performance of Method 3 by increasing the `embed_size` — see the sketch below.
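Method 3's code isn't shown in this thread, but the random-embedding idea could look roughly like this (a sketch continuing from the train/test split above; `embed_size`, `E`, and the least-squares fit are illustrative stand-ins, not the repo's actual implementation):

```python
import numpy as np

embed_size = 64                            # the knob mentioned above; larger tends to help
rng = np.random.default_rng(0)
E = rng.standard_normal((10, embed_size))  # one random target vector per class

Y_embed = E[y_train]                       # regression targets, shape (n_train, embed_size)
W, *_ = np.linalg.lstsq(X_train, Y_embed, rcond=None)

pred = X_test @ W                          # predicted embeddings, shape (n_test, embed_size)
# nearest class embedding: argmin ||p - e||^2  ==  argmax (p @ e - ||e||^2 / 2)
scores = pred @ E.T - 0.5 * (E ** 2).sum(axis=1)
print("test accuracy:", np.mean(scores.argmax(axis=1) == y_test))
```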