This is intended to be a detailed local setup document with explanations for this repository.
- Cloning this template
- Setting up local environment
- Creating an Open Humans project
- Final steps of app setup
- Heroku deployment
- Adding dummy data
- Next steps
- Getting help
In your terminal, navigate to the folder in which you want to store this repo, and enter the command
$ git clone [email protected]:OpenHumans/oh-data-demo-template.git
This should create a new folder named oh-data-demo-template
which contains all the code to create the working demo.
The Heroku CLI is a local command line tool that helps us run and eventually deploy our application to Heroku. To install:
macOS:
if you have Homebrew installed (recommended):
$ brew install heroku/brew/heroku
otherwise follow these instructions
Linux:
$ wget -qO- https://cli-assets.heroku.com/install-ubuntu.sh | sh
Windows:
Click the link and choose an installer to download
Redis is an open source, in-memory data structure store, used as a database, cache and message broker used by this application for Celery and Requests Respectful.
To install Redis you have a few options (just do 1 of them):
- Follow these instructions
- (macOS only)
brew install redis
and thenredis-server /usr/local/etc/redis.conf
- Install with Docker and run container in background:
$ docker create --name redis-db -p 6379:6379 redis:alpine
$ sudo service docker start
To set it running with Docker on very popular Ubuntu and other Debian based systems, it will likely be started for you after you install the package, but can also start it manually with:
$ docker start redis-db
.
This version of the template runs on Python 3. We recommend that you use this version, but you can still use the old version which requires only Python 2.
Please note that if you are working on a Mac, it is strongly advised that you install a fresh version of Python. The version that ships with OSX is not suitable for development and use with third party packages. Instructions for setting up your Python environment properly in OSX can be found here.
pip is a package management system used to install and manage software packages written in Python. It is available here. Check your pip
version by entering $ pip --version
to make sure that it is working with Python 3; you may have to use the command pip3
.
Virtual environments are useful when developing apps with lots of dependencies since they enable us to install software locally for a specific project, without it being present globally. Using a virtual environment allows us to use specific versions of each program and/or package for this project only, without affecting the versions that are used elsewhere on your machine.
We will set up the virtual environment here, and then work from within it for the remainder of this guide.
- install the Python package pipenv, using pip:
$ pip install pipenv
or$ pip3 install pipenv
- navigate to your project folder for this template repo, and enter the command
$ pipenv --python 3.6
This command should output some information about how it's creating a virtual environment for us with some path information about it.
Whenever we use pip or python commands, this virtual environment will be used for the remainder of this tutorial.
You can work from within this environment with $ pipenv shell
If you want to run commands from outside of this shell, you can type $ pipenv run ___
, for example, $ pipenv run python
.
You can install all dependencies with:
$ pipenv install
Note: This will install the dependencies for this project from the Pipfile and Pipfile.lock. If you have issues with installing all of the requirements at once, read the error(s) as there may be some other requirement missing locally (such as Postgres). If you still have problems, raise an issue on Github.
The environment file (.env
) contains configurations for running the application. This file will be used to set up the environmental variables that Django
uses to set up everything (The deployment to heroku
does not use this file, instead you can use their website to set up these variables. See more below under heroku
). The .env
file should never be committed to git and should be kept private as it contains secrets. First copy the contents of the template environment file, env.example
, paste into a new file, and save with the filename .env
(use $ cp env.example .env
) we will go back and alter the contents after creating a project on the Open Humans site. The .env
filename should already be in your .gitignore
, but it is worth double-checking to make sure.
Head to http://openhumans.org/direct-sharing/projects/manage to create an OAuth2 project in Open Humans. If you do not yet have an Open Humans account, you will need to create one first.
- Click the button to
Create a new OAuth2 data request project
- Fill out the form for your project description. All of this information can be edited later, so don't worry if you aren't sure about it all just yet. However do make sure you fill out the following fields:
- Description of data you plan to upload to member accounts - if you leave this field blank, Open Humans will assume that your project doesn't plan to add data
- Enrollment URL - set this to
http://127.0.0.1:5000
, this should then automatically set the redirect URL tohttp://127.0.0.1:5000/complete
When you have created the project, you'll be able to click on its name in the project management page
to show its information. From here, get the activity page
, client ID
, and client secret
and set them in your .env
file. The ID and secret identify and authorize your app. They are used for user authorization and data management.
Keep your client secret private, it should not be committed to a repository. In Heroku it will be kept private as an environment variable, and locally it will be available from the .env
file which should not be committed to git.
Finally we need to initialize the database and static assets to be able to get the app running. Django will use SQLite3 by default if you do not specify a DATABASE_URL
in .env
. For more information on databases you can check out the Django docs.
In the main project directory, run the migrate
command followed by collectstatic
as follows:
$ pipenv run python manage.py migrate
$ pipenv run python manage.py collectstatic
Now we are ready to run the app locally. Enter the command $ pipenv run heroku local
, and don't worry if you see the following warning:
warnings.warn('Using settings.DEBUG leads to a memory leak, never')
If you are curious, the cause of this warning is outlined here.
Now head over to http://127.0.0.1:5000 in your browser to see your app running locally. It should look like this:
Now you have your application built and running locally, we'll head over to Heroku where the app will be deployed remotely.
If you have hit any problems so far, please do let us know in Github issues or come and chat with us over at our Slack channel.
If you don't already have a Heroku account, head to http://www.heroku.com/ to create a free account now. If you are new to app development, you may also want to go through their getting started with Heroku/Python guide before continuing with your Open Humans app.
Make sure you have installed the Heroku command line interface, then, from your terminal, you can log in and create your app with the following commands:
$ heroku login
you will be asked for your Heroku credentials
$ heroku apps:create your-app-name
If you use Heroku's free default domain, this will be set by the name you choose here, i.e. you will have
https://your-app-name.herokuapp.com
In your browser, head over to http://dashboard.heroku.com/apps
and log in to see the app you just created.
Go to the Resources
tab, and add the following Add-ons:
Heroku Redis
Heroku Postgres
Next go to the settings
tab and add the environment variables as in the .env
file.
OPENHUMANS_CLIENT_ID
OPENHUMANS_CLIENT_SECRET
OPEHUMANS_ACTIVITY_PAGE
DEBUG
= true when neededREMOTE
= true
Head back over to your terminal and run the following command to initialize and update your code remotely in Heroku:
$ git push heroku master
You can watch logs with the command $ heroku logs -t
.
If you make changes you may have to migrate again, to do this run:
$ heroku run python manage.py makemigrations
$ heroku run python manage.py migrate
$ git push heroku master
Before starting to edit the code in this demo to create your own project, it may be useful to understand what the existing code is doing.
- Upon arrival at the app, the user sees the homepage (the
index.html
file), on this page is a button which transfers the user to Open Humans for authorization - The user logs in to their Open Humans account, and clicks to authorize the app to add data, this directs them back to the
complete.html
file - The
complete
function in the fileviews.py
receives a code from Open Humans, exchanges it for a token, and uses this token to retrieve the project member ID which is stored in theOpenHumansMember
model - As this page is loaded, an automatic, asynchronous data upload to Open Humans is triggered via the method
xfer_to_open_humans
xfer_to_open_humans
takes the member ID (which was retrieved during authentication over at the Open Humans site), and runs the methodadd_data_to_open_humans
add_data_to_open_humans
runs with three steps:- creates a dummy data file, for demo purposes only (using
make_example_datafile
method) - deletes any previous files uploaded with this name (using
delete_oh_file_by_name
method) - takes the file path, metadata, and user ID, and uploads to Open Humans (using
upload_file_to_oh
method)
- creates a dummy data file, for demo purposes only (using
upload_file_to_oh
performs the following steps:- gets S3 target for storage
- performs data upload to this target
- notifies when the upload is complete
The celery.py
file sets up asynchronous tasks for the app. The function xfer_to_open_humans
in tasks.py
is called (from complete
function in views.py
) asynchronously, by the presence of .delay
in the function call:
xfer_to_open_humans.delay(oh_id=oh_member.oh_id)
Sometimes uploads can take a long time so we advise using the delay so that this does not prevent the app from processing other events. However if you wish to run your app without asynchronosity, you can simply remove the .delay
component from this function call.
External APIs will often limit how much data you are allowed to fetch per unit time. This template uses the requests-respectful
package to easily set rate limits when making requests. The requester can be set up as follows
- find out the limitations of the external API by looking at their documentation
- specify the limitations in a realm, found in
demotemplate/settings.py
- when making your request, use the function
make_request_respectful_get
indatauploader/tasks.py
Now you have worked through to create a working demo, and should understand roughly how the demo works, you are ready to customise the code to create your own Open Humans data source. Use the code you have in this repository as a template for your app.
You are likely to want to start making changes in the tasks.py
file, which is where much of the logic is stored. Instead of generating a dummy data file you will want to think about how to get your own data into the app, whether it is a previously downloaded file, which needs to be processed and/or vetted by the app, or you are working from an external API.
Good luck, and please do get in touch to ask questions, give suggestions, or join in with our community chat!