Something wrong with data formatter? #153
I'll respond better later when I'm at a computer, but it looks like Sci kit
|
That is really strange. I installed Python(x,y), then installed scikit-learn, and actually did some tutorials and used the data (8x8 images of digits) to train, and it was all fine. Edit: I also ran installPythonDependencies.sh |
Hi @TasmaniaKrama! Thanks for using the library, and for giving feedback.
The initial error you're getting (before it cascades through many other parts of the project) is: "{ [Error: AttributeError: 'module' object has no attribute 'MaxAbsScaler']"
MaxAbsScaler is a bit of functionality we import from scikit-learn, and for some reason it's not being found on your system. It was recently introduced in v0.17, so my current guess is that you have an older (working) version of scikit-learn installed on your machine that the script is trying to pull from. It's entirely possible to end up with two versions of scikit-learn if they were installed in different ways, or under two different user accounts (I've seen this pop up with sudo, Anaconda, and people who just have multiple accounts on the same machine).
I'd recommend uninstalling scikit-learn using whatever method you used to install it, and then reinstalling it fresh from GitHub. You can follow the instructions in installPythonDependencies.sh for installing from GitHub if you'd like, but it's probably best to run them yourself from the command line, since the shell script I provided might be installing as a different user.
Let me know how it goes! |
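One way to test the "older copy of scikit-learn" guess is to ask the interpreter which copy it actually imports. The following is a minimal sketch, not part of machineJS; the helper name `describe_module` is made up for illustration. It reports the version and file path of a module and whether a given attribute exists on it, and it should be run with the same `python` that machineJS launches.

```python
import importlib


def describe_module(module_name, attribute=None):
    """Report where a module is imported from, its version, and whether it exposes `attribute`."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return {"installed": False, "version": None, "path": None, "has_attribute": False}
    return {
        "installed": True,
        # Not every module defines __version__, so fall back to None.
        "version": getattr(module, "__version__", None),
        "path": getattr(module, "__file__", None),
        "has_attribute": attribute is None or hasattr(module, attribute),
    }


if __name__ == "__main__":
    # The path shows which installation wins when several copies exist.
    print(describe_module("sklearn"))
    # MaxAbsScaler is the attribute named in the error message.
    print(describe_module("sklearn.preprocessing", "MaxAbsScaler"))
```

If the reported version is below 0.17, or the path points somewhere unexpected (for example inside a bundled distribution rather than the fresh install), that would be consistent with the explanation above.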
C:\Users\Filip>git clone https://github.com/ClimbsRocks/machineJS.git
Resolving deltas: 100% (1820/1820), done.
C:\Users\Filip>cd machineJS
C:\Users\Filip\machineJS>npm install
[email protected] node_modules\python-shell
[email protected] node_modules\minimist
[email protected] node_modules\data-formatter
[email protected] node_modules\mkdirp
[email protected] node_modules\chai
[email protected] node_modules\longjohn
[email protected] node_modules\csv
[email protected] node_modules\rimraf
[email protected] node_modules\data-for-tests
[email protected] node_modules\mocha
[email protected] node_modules\fast-csv
[email protected] node_modules\ensembler
C:\Users\Filip\machineJS>installPythonDependencies.sh
C:\Users\Filip\machineJS>
So this was during the install (I removed everything first, both Python(x,y) and machineJS). However, now when I run it I still get an error, but at least it is shorter and different (so it's a plus :D )
Microsoft Windows [Version 6.1.7601]
C:\Users\Filip>cd machineJS
C:\Users\Filip\machineJS>node machineJS.js e0.csv --predict e1.csv
we heard an unexpected shutdown event that is causing everything to close
Error: spawn python ENOENT
Do you happen to have some example training data and test data? Perhaps I am stupid (probably) and I am labeling it incorrectly (the ID, Output Regression, Continuous, etc.). I am doing the editing in Excel and saving it as CSV.
Thanks for your time :) |
Hi @TasmaniaKrama, It looks like there's probably still an issue with the install somewhere. I haven't had a chance to test this on a Windows computer, so I'm glad to have a collaborator in this! It seems that it might be having trouble finding Python to launch a child process. Try some of the suggestions in this issue thread: |
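For context, `spawn python ENOENT` generally means Node could not find an executable named `python` on the PATH when it tried to start the child process. A rough way to check what the PATH resolves, sketched here with a hypothetical helper `find_executables` and requiring Python 3.3+ for `shutil.which`:

```python
import shutil


def find_executables(names):
    """Map each command name to the full path found on PATH, or None if it is absent."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    # "py" is the Windows launcher; the other names are the usual candidates.
    for name, path in find_executables(["python", "python3", "py", "pip"]).items():
        print("{0}: {1}".format(name, path or "not found on PATH"))
```

If `python` itself does not start from the command prompt, the script can still be run through the interpreter's full path or the `py` launcher. A `python` entry that comes back as "not found on PATH" would match the ENOENT error above.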
Hi @TasmaniaKrama, Let me know how this is going, and if I can help troubleshoot with you in any way! |
Hey, I was away for some time, but here is what happened. I decided to use VB and install Linux on it; however, once I had set everything up, there was still trouble when trying to use pip to install certain modules. |
@TasmaniaKrama yeah, i was afraid we'd run into that. i just updated the install sequence, so that it more heavily leverages python's popular package manager pip. hopefully that should make the install easier. if you're up for trying it again (on a linux box- i just found some new issues that'll prevent it from running on a windows machine still), i'd love to hear any issues you run into! if you do try it again, start with a fresh install following the directions in the README. Thanks for responding! the feedback is really helpful to make this easier for everyone (which is the whole point of this project). |
I am a total noob with Linux, so I have almost zero knowledge of it. I'm installing Mint now, and as I recall I had to install git and then use the git clone command, and npm had to be installed too. Anyway, I will go step by step through your directions now, and if it doesn't work, might I ask which Linux distro you are using?
OK, so I had to use sudo apt-get to install git, npm and pip, since those commands didn't work. That all went through, and then I ran pip install -r requirements.txt and this came out:
pip install -r requirements.txt
Downloading/unpacking scipy (from -r requirements.txt (line 4))
Downloading/unpacking cython (from -r requirements.txt (line 5))
Downloading/unpacking xgboost (from -r requirements.txt (line 6))
As I recall, there was a problem last time with xgboost. |
interesting. thanks for following up!
worst case scenario, we can just comment out xgboost. if everything else is running, try just going to pySetup/classifierList.js and commenting out XGBoost wherever you see it (and adjusting the commas in the surrounding lines if necessary). we'll probably have to comment it out in one or two other places as well (like pySetup/makeClassifiers.py), but they should throw some pretty obvious errors and be easy to find. |
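As an alternative to commenting entries out by hand, the same idea can be sketched as an import check that skips any classifier whose backing package is missing. This is only an illustration and not how machineJS is written: the function name, the `clXGBoost` key, and the mapping itself are assumptions; only `clRfGini` appears in this thread.

```python
import importlib


def available_classifiers(candidates):
    """Return the classifier names whose backing module can be imported."""
    usable = []
    for name, module_name in candidates.items():
        try:
            importlib.import_module(module_name)
        except ImportError:
            # Skip this classifier instead of failing the whole run.
            continue
        usable.append(name)
    return usable


if __name__ == "__main__":
    # Hypothetical mapping from classifier name to the package it needs.
    candidates = {"clRfGini": "sklearn.ensemble", "clXGBoost": "xgboost"}
    print(available_classifiers(candidates))
```

On a machine where xgboost failed to install, the second entry would simply be dropped from the list.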
Running the 0.4a30 xgboost yields the same error. And about the previous packages, not really sure how to answer that one. It seems like they didn't since a lot of errors were thrown. |
So I am quite a noob at machine learning, but I really tried hard and followed the machineJS install instructions exactly. It also took me a while to format the data, but I formatted it as requested in https://github.com/ClimbsRocks/data-formatter
If needed I could upload the data I formatted, but I seriously have no idea what I am doing wrong here.
C:\Users\Filip\machineJS>node machineJS.js final.csv --predict test123.csv
thanks for inviting us along on your machine learning journey!
heard an error!
{ [Error: AttributeError: 'module' object has no attribute 'MaxAbsScaler']
traceback: 'Traceback (most recent call last):\r\n File "C:\Users\Filip\machineJS\node_modules\data-formatter\mainPythonProcess.py", line 27, in <module>\r\n from helperFunctions import minMax\r\n File "C:\Users\Filip\machineJS\node_modules\data-formatter\helperFunctions\minMax.py", line 8, in <module>\r\n max_abs_scaler = preprocessing.MaxAbsScaler()\r\nAttributeError: 'module' object has no attribute 'MaxAbsScaler'\r\n',
executable: 'python',
options: null,
script: 'C:\Users\Filip\machineJS\node_modules\data-formatter\mainPythonProcess.py',
args: [ '{"trainingData":"final.csv","testingData":"test123.csv","trainingPrettyName":"final","testingPrettyName":"test123","joinFileName":"","on":false,"allFeatureCombinations":false,"keepAllFeatures":false,"outputFolder":"C:\Users\Filip\machineJS\pySetup\data-formatterResults","test":false,"verbose":1,"join":false}' ],
exitCode: 1 }
Here are the fileNames from data-formatter. If you want to skip the data-formatter part next time you want to play with this dataset, copy and paste this object into machineJS/pySetup/testingFileNames.js, following the instructions included in that file.
{}
{ [Error: KeyError: 'X_train']
traceback: 'Traceback (most recent call last):\r\n File "C:\Users\Filip\machineJS\pySetup\splitDatasets.py", line 17, in <module>\r\n XFileName = fileNames['X_train']\r\nKeyError: 'X_train'\r\n',
executable: 'python',
options: null,
script: 'C:\Users\Filip\machineJS\pySetup\splitDatasets.py',
args:
[ 'C:\Users\Filip\machineJS\ignoreMe.csv',
'{"":["C:\Users\Filip\machineJS\machineJS.js","final.csv"],"pr
edict":"test123.csv","dev":false,"computerTotalCPUs":4,"machineJSLocation":"C:
\Users\Filip\machineJS","dataFile":"final.csv","dataFileName":"final.csv"
,"dataFilePretty":"final","binaryOutput":false,"outputFileName":"final","join":"
","on":"","allFeatureCombinations":"","keepAllFeatures":"","dfOutputFolder":"C:
\Users\Filip\machineJS\pySetup\data-formatterResults","matrixOutpu
t":"","testFileName":"test123.csv","testFilePretty":"test123","testOutputFileNam
e":"test123","searchPercent":0.3,"validationPercent":0.3,"numRounds":10,"numIter
ationsPerRound":10,"predictionsFolder":"C:\Users\Filip\machineJS\pre
dictions\test123","validationFolder":"C:\Users\Filip\machineJS\pr
edictions\test123\validation","bestClassifiersFolder":"C:\Users\Fili
p\machineJS\pySetup\bestClassifiers\final","ensemblerOutputFolder":"
C:\Users\Filip\machineJS","validationRound":false,"ensemblerArgs":{"inp
utFolder":"C:\Users\Filip\machineJS\predictions\test123","outputF
older":"C:\Users\Filip\machineJS","validationFolder":"C:\Users\Fi
lip\machineJS\predictions\test123\validation","fileNameIdentifier":"
final","validationRound":true},"numCPUs":3,"longTrainThreshold":0.97,"continueTo
TrainThreshold":0.97,"alreadyFormatted":false}',
'{}' ],
exitCode: 1 }
{ [Error: ImportError: No module named joblib]
traceback: 'Traceback (most recent call last):\r\n File "C:\Users\Filip\machineJS\pySetup\training.py", line 6, in <module>\r\n import joblib\r\nImportError: No module named joblib\r\n',
executable: 'python',
options: null,
script: 'C:\Users\Filip\machineJS\pySetup\training.py',
args:
[ 'C:\Users\Filip\machineJS\final.csv',
'{"":["C:\Users\Filip\machineJS\machineJS.js","final.csv"],"pr
edict":"test123.csv","dev":false,"computerTotalCPUs":4,"machineJSLocation":"C:
\Users\Filip\machineJS","dataFile":"final.csv","dataFileName":"final.csv"
,"dataFilePretty":"final","binaryOutput":false,"outputFileName":"final","join":"
","on":"","allFeatureCombinations":"","keepAllFeatures":"","dfOutputFolder":"C:
\Users\Filip\machineJS\pySetup\data-formatterResults","matrixOutpu
t":"","testFileName":"test123.csv","testFilePretty":"test123","testOutputFileNam
e":"test123","searchPercent":0.3,"validationPercent":0.3,"numRounds":10,"numIter
ationsPerRound":10,"predictionsFolder":"C:\Users\Filip\machineJS\pre
dictions\test123","validationFolder":"C:\Users\Filip\machineJS\pr
edictions\test123\validation","bestClassifiersFolder":"C:\Users\Fili
p\machineJS\pySetup\bestClassifiers\final","ensemblerOutputFolder":"
C:\Users\Filip\machineJS","validationRound":false,"ensemblerArgs":{"inp
utFolder":"C:\Users\Filip\machineJS\predictions\test123","outputF
older":"C:\Users\Filip\machineJS","validationFolder":"C:\Users\Fi
lip\machineJS\predictions\test123\validation","fileNameIdentifier":"
final","validationRound":true},"numCPUs":3,"longTrainThreshold":0.97,"continueTo
TrainThreshold":0.97,"alreadyFormatted":false}',
'{}',
'clRfGini',
undefined,
0 ],
exitCode: 1 }
kicking off the process of making predictions on the predicting data set for: clRfGini
we heard an unexpected shutdown event that is causing everything to close
C:\Users\Filip\machineJS\shutDown.js:19
throw error;
^
TypeError: Cannot read property 'longTrainScore' of undefined
at startPredictionsScript (C:\Users\Filip\machineJS\pySetup\utils.js:129:58)
tils.js:144:5)
at Object.module.exports.makePredictions (C:\Users\Filip\machineJS\pySetup\controllerPython.js:142:11)
at C:\Users\Filip\machineJS\pySetup\controllerPython.js:32:24
at emitFinishedTrainingCallback (C:\Users\Filip\machineJS\pySetup\utils.js:87:7)
at C:\Users\Filip\machineJS\pySetup\utilsPyShell.js:60:7
at null._endCallback (C:\Users\Filip\machineJS\node_modules\python-shell\index.js:148:25)
at ChildProcess.<anonymous> (C:\Users\Filip\machineJS\node_modules\python-shell\index.js:99:35)
at emitTwo (events.js:87:13)
at ChildProcess.emit (events.js:172:7)
C:\Users\Filip\machineJS>
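The log above shows one failure feeding the next: data-formatter exits with the MaxAbsScaler error, the fileNames object it prints is `{}`, and splitDatasets.py then fails on `fileNames['X_train']`. A minimal sketch of a fail-fast check that would surface the first error more directly; the function name and the list of required keys are assumptions for illustration, not machineJS's actual code:

```python
import json

# Assumption: only 'X_train' is known to be required, since it is the key named in the log.
REQUIRED_KEYS = ("X_train",)


def parse_file_names(raw_json, required_keys=REQUIRED_KEYS):
    """Parse the fileNames JSON and fail early if expected keys are missing."""
    file_names = json.loads(raw_json)
    missing = [key for key in required_keys if key not in file_names]
    if missing:
        raise ValueError(
            "fileNames has no entry for {0}; look for an earlier error in the log".format(
                ", ".join(missing)
            )
        )
    return file_names


if __name__ == "__main__":
    try:
        parse_file_names("{}")  # the empty object printed in the log
    except ValueError as error:
        print(error)
```

With a check like this, the empty `{}` would point back at the earlier data-formatter failure rather than producing a bare KeyError.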