Commit
Showing 9 changed files with 189 additions and 79 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,78 @@ | ||
# Making a release | ||
# Deploying ShiCo | ||
If you want to run your own instance of ShiCo, there are a few things you will need: | ||
|
||
- Merge changes on branch `demo` | ||
- Run `gulp build` | ||
- Make github release | ||
|
||
- A set of word2vec models which your ShiCo instance will use. | ||
- Run the Python back end on a server (you will need a server with enough memory to hold your word2vec models). | ||
- Run a web server to serve the front end to the browser. | ||
|
||
## Word2vec models | ||
|
||
You are welcome to use our [existing w2v models](https://github.com/NLeSC/ShiCo/tree/master/word2vecModels); you might need to use [git-lfs](https://git-lfs.github.com/) to download them. If you do, please contact us for more details on how the models were built and on how to cite our work. You can also [create your own](./docs/buildingModels.md) models, based on your own corpus. | ||
|
||
## Launching the back end | ||
|
||
Once you have downloaded the code (or cloned this repo) and installed all Python requirements (listed in *requirements.txt*), you can launch the Flask server as follows: | ||
``` | ||
$ python shico/server/app.py -f "word2vecModels/????_????.w2v" | ||
``` | ||
|
||
*Note:* loading the word2vec models takes some time and may consume a large amount of memory. | ||
|
||
You can check that the server is up and running by connecting to the server using curl (or your web browser): | ||
``` | ||
http://localhost:5000/load-settings | ||
``` | ||
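The same check can be scripted. The following is a minimal sketch, assuming the back end listens on `http://localhost:5000` as in the example above; the helper names `settings_url` and `server_is_up` are hypothetical and not part of ShiCo. It only checks that the request succeeds and makes no assumption about the content of the response.

```python
import urllib.error
import urllib.request


def settings_url(base_url):
    """Build the /load-settings URL for a given back end address."""
    return base_url.rstrip("/") + "/load-settings"


def server_is_up(base_url="http://localhost:5000", timeout=10):
    """Return True if the back end answers /load-settings with HTTP 200.

    Hypothetical helper: the address and timeout are assumptions.
    """
    try:
        with urllib.request.urlopen(settings_url(base_url), timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False
```

Calling `server_is_up()` right after launching the back end may return `False` for a while, since the word2vec models need to be loaded first.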
|
||
Alternatively, you can use [Gunicorn](http://gunicorn.org/) by setting your configuration in *shico/server/config.py* and then running: | ||
|
||
``` | ||
$ gunicorn --bind 0.0.0.0:8000 --timeout 1200 shico.server.wsgi:app | ||
``` | ||
|
||
## Launching the front end | ||
|
||
The necessary files for serving the front end are located in the *webapp* folder. You will need to edit your configuration file (*webapp/src/config.json*) to tell the front end where your back end is running. For example, if your back end is running on *localhost* port 5000 as in the example above, you would set your configuration file as follows: | ||
|
||
``` | ||
{ | ||
"baseURL": "http://localhost:5000" | ||
} | ||
``` | ||
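If you deploy to several environments, the configuration file can also be generated by a script. This is a minimal sketch, assuming `baseURL` is the only key you need to set (it is the only one shown above); the function name `write_frontend_config` is hypothetical.

```python
import json


def write_frontend_config(path, base_url):
    """Write a front-end configuration file containing only the baseURL key."""
    config = {"baseURL": base_url}
    with open(path, "w") as handle:
        json.dump(config, handle, indent=2)
        handle.write("\n")
    return config
```

For example, `write_frontend_config("config.json", "http://localhost:5000")` produces a file equivalent to the one shown above.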
|
||
If you are familiar with the JavaScript world, you can use the *gulp* tasks provided. You can serve your front end as follows (from the *webapp* folder): | ||
``` | ||
$ gulp serve | ||
``` | ||
|
||
You can build a deployable version (minified, uglified, etc.) as follows: | ||
``` | ||
$ gulp build | ||
``` | ||
This will build a deployable version in the *webapp/dist* folder. | ||
|
||
## Pre-built deployable version | ||
|
||
If you are not familiar with the JavaScript world (or just don't feel like building your own deployable version), the *demo* branch of this repository contains a pre-built version of the front end. You can check out (or download) that branch, and then you are ready to go. | ||
|
||
## Serve with your favorite web server | ||
|
||
Once you have a *webapp/dist* folder (whether downloaded or self-built), you can serve its contents using your favorite web server. For example, you could use Python's SimpleHTTPServer as follows (from the *webapp/dist* folder; `SimpleHTTPServer` is the Python 2 module name, in Python 3 the module is `http.server`): | ||
``` | ||
$ python -m SimpleHTTPServer | ||
``` | ||
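On Python 3 (3.7 or later), the same idea can be sketched with the standard library's `http.server` module. The function name `start_static_server`, the default host and the default port are assumptions made for this illustration; this is meant for local testing, not as a production web server.

```python
import functools
import http.server
import threading


def start_static_server(directory, port=8000, host="127.0.0.1"):
    """Serve the files in `directory` over HTTP from a background thread."""
    # Bind the request handler to the folder to serve (e.g. webapp/dist).
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    server = http.server.ThreadingHTTPServer((host, port), handler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server
```

Call `server.shutdown()` followed by `server.server_close()` on the returned object to stop serving.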
|
||
## Cleaning functions | ||
In some cases, the resulting vocabularies may contain words that you would like to filter out. ShiCo lets you use a *cleaning* function to filter vocabularies after they have been generated. To use this option, pass the name of the cleaning function when starting the ShiCo server. A sample cleaning function is provided (*shico.extras.cleanTermList*). You can use this function as follows: | ||
``` | ||
$ python shico/server/app.py -c "shico.extras.cleanTermList" | ||
``` | ||
|
||
If you are using Gunicorn, set `cleaningFunctionStr` in your *config.py* to the name of your cleaning function, for instance: | ||
|
||
``` | ||
cleaningFunctionStr = "shico.extras.cleanTermList" | ||
``` | ||
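The cleaning function is identified by a dotted name given as a string. As an illustration of how such a string can be turned into a callable, here is a minimal sketch using `importlib`; it is not ShiCo's actual loading code, and the helper name `resolve_function` is hypothetical.

```python
import importlib


def resolve_function(dotted_name):
    """Resolve a string such as 'package.module.function' to a callable."""
    module_name, _, attribute = dotted_name.rpartition(".")
    if not module_name:
        raise ValueError("expected 'package.module.function', got %r" % dotted_name)
    module = importlib.import_module(module_name)
    function = getattr(module, attribute)
    if not callable(function):
        raise TypeError("%s is not callable" % dotted_name)
    return function
```

With a scheme like this, your own cleaning function only needs to live in a module that is importable from where the server is started.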
|
||
## Speeding up ShiCo | ||
|
||
The current implementation of ShiCo relies on the `most_similar` function of the gensim word2vec model, which in turn requires calculating the dot product between two large matrices via the `numpy.dot` function. For this reason, ShiCo benefits greatly from libraries that accelerate matrix multiplication, such as OpenBLAS. ShiCo has been tested using [Numpy with OpenBLAS](https://hunseblog.wordpress.com/2014/09/15/installing-numpy-and-openblas/), producing a significant increase in speed. |
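To see why dot products dominate the running time, here is a toy, pure-Python sketch of a nearest-neighbour query by cosine similarity. It is not gensim's code: gensim performs the equivalent computation over the whole vocabulary at once with `numpy.dot`, which is the step a fast BLAS library accelerates. The function names and the tiny example vectors are assumptions made for illustration.

```python
import math


def normalize(vector):
    """Scale a vector to unit length."""
    length = math.sqrt(sum(x * x for x in vector))
    return [x / length for x in vector]


def most_similar(vectors, word, topn=3):
    """Rank all other words by cosine similarity to `word`.

    `vectors` is a dict mapping each word to a list of floats.
    """
    query = normalize(vectors[word])
    scores = []
    for other, vector in vectors.items():
        if other == word:
            continue
        # One dot product per vocabulary word: this is the costly part.
        similarity = sum(a * b for a, b in zip(query, normalize(vector)))
        scores.append((other, similarity))
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:topn]
```

With a real vocabulary of several hundred thousand words and a few hundred dimensions per vector, every query repeats this loop over the full vocabulary, which is why an optimized matrix multiplication makes a noticeable difference.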
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,48 @@ | ||
# What should you do if you want to modify ShiCo? | ||
|
||
Be brave! And get in touch if you need help. Pull requests are very welcome. | ||
|
||
## Backend | ||
|
||
Written in Python. | ||
|
||
### Unit testing | ||
If you modify the ShiCo back end, make sure to write unit tests for your code. | ||
|
||
To run Python unit tests, run: | ||
``` | ||
$ nosetests | ||
``` | ||
|
||
## Web app | ||
|
||
Written in JavaScript (Angular). | ||
|
||
### Adding hooks | ||
|
||
You can add your own custom behaviour to the force-directed graphs like this: | ||
``` | ||
(function() { | ||
  'use strict'; | ||
  angular | ||
    .module('shico') | ||
    .run(runBlock); | ||
  function runBlock(GraphConfigService) { | ||
    // Register a hook that is applied to the nodes of the force-directed graph. | ||
    GraphConfigService.addForceGraphHook(function(node) { | ||
      // Set each circle's radius to the length of the node's name. | ||
      node.select('circle').attr('r', function(d) { | ||
        return d.name.length; | ||
      }); | ||
    }); | ||
  } | ||
})(); | ||
``` | ||
|
||
This snippet sets the radius of each node in the force-directed graph to the length of the name in the node's data. | ||
|
||
## Making a release on GitHub | ||
- Merge changes on branch `demo` | ||
- Run `gulp build` | ||
- Make a GitHub release |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,57 @@ | ||
# How to use ShiCo? | ||
|
||
This guide introduces the elements of ShiCo's user interface. | ||
|
||
## User interface components | ||
|
||
When you first open ShiCo in your browser, you will see a simple search bar: | ||
|
||
![Search bar](./searchBar.png) | ||
|
||
You can enter one or multiple (comma separated) *seed terms*. These seed terms are the entry point for your concept search. Click *Submit* to begin your search. The results from your search will be displayed in the results panel below the search bar. | ||
|
||
The search bar has some additional features: | ||
- It allows you to modify the search parameters. Click the *+* button to display additional search parameters. | ||
- It allows you to save the parameters of your current search, or load the parameters of a previous search. | ||
|
||
## Search parameters | ||
|
||
The following is the list of parameters (with a link to a brief explanation) which can be used to control your concept search: | ||
|
||
- [Max Terms](/webapp/src/help/maxTerms.md) | ||
- [Max related terms](/webapp/src/help/maxRelatedTerms.md) | ||
- [Minimum concept similarity](/webapp/src/help/minSim.md) | ||
- [Word boost](/webapp/src/help/wordBoost.md) | ||
- [Boost method](/webapp/src/help/boostMethod.md) | ||
- [Algorithm](/webapp/src/help/algorithm.md) | ||
- [Track direction](/webapp/src/help/direction.md) | ||
- [Years in interval](/webapp/src/help/yearsInInterval.md) | ||
- [Words per year](/webapp/src/help/wordsPerYear.md) | ||
- [Weighing function](/webapp/src/help/weighFunc.md) | ||
- [Function shape](/webapp/src/help/wFParam.md) | ||
- [Do cleaning ?](/webapp/src/help/doCleaning.md) (only shown if your backend uses a cleaning function). | ||
- [Year period](/webapp/src/help/yearPeriod.md) | ||
|
||
## Produced graphics | ||
|
||
Once a search is complete, ShiCo displays results in the results panel. Results are displayed using various graphs: | ||
|
||
- Stream graph -- this shows each word of the resulting vocabulary as a stream over time. The stream gets wider or narrower according to the weight the word is given in the vocabulary. | ||
|
||
![Stream graph](./streamGraph.png) | ||
|
||
- Network graphs -- this shows a collection of graphs displaying the resulting vocabulary as a network graph. Words which are related to each other are connected with an arrow. The direction of the arrow indicates which word was the product of which seed word. | ||
|
||
![Network graph](./networkGraph.png) | ||
|
||
- Space embedding -- this shows an estimate of the spatial relationship between words in the final vocabulary at every time step. Please keep in mind that these spatial relations are approximate and should be considered with care. | ||
|
||
![Space embedding graph](./embeddingGraph.png) | ||
|
||
- Plain text vocabulary -- this shows a text representation of the concept search. This consists, for each time step, of the seed words used and the produced vocabulary. | ||
|
||
## Saving and loading search parameters | ||
|
||
When you click the *Save parameters* button, a text box with your search parameters will be displayed. Copy these parameters and save them somewhere. Click *Ok* to hide the text box. | ||
|
||
When you click the *Load parameters* button, another text box will be displayed. Enter previously saved search parameters in this box and click *Ok* to load the parameters. |