GDELT
dkakkar edited this page Mar 23, 2020
- Copy all GDELT files that need to be analyzed to a folder on FASRC. Put the unzipped CSV files in the folder. For initial testing, start with 1 month of unzipped CSV files and 64 GB of RAM.
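To pick out one month of files for the initial test, a small helper like the one below may be useful. It is only a sketch: it assumes the unzipped files keep names that begin with the date (for example `20200301.export.CSV`), and the function names `files_for_month` and `total_size_gb` are made up for this example.

```python
import os


def files_for_month(folder, yyyymm):
    """Return sorted paths of unzipped CSV files in `folder` whose names start with `yyyymm`.

    Assumption: file names begin with the date, e.g. 20200301.export.CSV.
    The extension check is case-insensitive and skips files still zipped.
    """
    selected = []
    for name in sorted(os.listdir(folder)):
        if name.startswith(yyyymm) and name.lower().endswith(".csv"):
            selected.append(os.path.join(folder, name))
    return selected


def total_size_gb(paths):
    """Return the combined size of the given files in GiB."""
    return sum(os.path.getsize(p) for p in paths) / 1024 ** 3


# Example usage (hypothetical folder path):
# march = files_for_month("/path/to/your/gdelt/folder", "202003")
# print(len(march), "files,", round(total_size_gb(march), 2), "GiB")
```

Comparing the total size against the RAM you requested gives a rough idea of whether the test month will fit.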
- Log in (ssh) to the node where OmniSci is running and run the following commands on the command line:
module load Anaconda3/5.0.1-fasrc02
conda create -n gdelt python=3.6
source activate gdelt
pip install pymapd
pip install pandas
pip install pyarrow
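Before going further, it can help to confirm that the three packages are visible from the activated environment. The check below is a minimal sketch that uses only the standard library. The helper name `missing_packages` is made up for this example.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(["pymapd", "pandas", "pyarrow"])
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```

If anything is reported missing, make sure the `gdelt` environment is active and repeat the matching `pip install` command.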
- Copy the script from '/n/holyscratch01/cga/dkakkar/scripts/gdelt.py' to your home directory.
- Edit the script to set the path of your folder (the one containing the GDELT files) and your OmniSci port in the connection string.
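The contents of gdelt.py are not reproduced on this page, so the snippet below is only a sketch of what the edited values might look like, not the actual script. It assumes the script connects with `pymapd.connect`. The variable and function names are made up, the folder path is a placeholder, and the user, password and database are the OmniSci defaults, which may differ on your session.

```python
# Placeholders: replace with your own folder and your session's backend port.
GDELT_FOLDER = "/path/to/your/gdelt/folder"  # hypothetical path
BACKEND_PORT = 6274                          # OmniSci default; use your own port


def connection_kwargs(port, host="localhost", user="admin",
                      password="HyperInteractive", dbname="omnisci"):
    """Collect the keyword arguments passed to pymapd.connect().

    The defaults are the stock OmniSci credentials (an assumption here).
    """
    return {
        "user": user,
        "password": password,
        "host": host,
        "port": int(port),
        "dbname": dbname,
    }


def open_connection(port):
    """Open a connection to the OmniSci backend on the given port."""
    import pymapd  # imported here so the helpers above work without pymapd
    return pymapd.connect(**connection_kwargs(port))


# Example usage on the node where OmniSci is running:
# con = open_connection(BACKEND_PORT)
```

Keeping the folder and port in two named values at the top means only those two lines need to change between sessions.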
- The connection string needs the backend port number. To find it, follow the steps below:
  - Click on the session ID of your OmniSci session
  - Open the output.log file
  - Look for "Backend TCP" in the file and copy the port number from there
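The lookup in output.log can also be done with a few lines of Python. This is a sketch under one assumption: the port number is the last thing on the line that contains "Backend TCP" (for example `Backend TCP: localhost:6274`). If your log uses a different layout, the function returns `None` and the port has to be read by hand. The name `find_backend_port` is made up for this example.

```python
import re


def find_backend_port(log_text):
    """Return the port from the first line mentioning "Backend TCP", or None.

    Assumption: the port is the final run of digits on that line.
    """
    match = re.search(r"Backend TCP.*?(\d+)\s*$", log_text, flags=re.MULTILINE)
    return int(match.group(1)) if match else None


# Example usage, run from the folder that holds output.log:
# with open("output.log") as f:
#     print(find_backend_port(f.read()))
```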
- After activating the conda environment, run the script using:
python3 gdelt.py