02 Working on the data

Duncan Paterson edited this page Nov 19, 2018 · 7 revisions

Files and File-format

The authoritative data is in the comma-separated-values (.csv) spreadsheets in the /data folder.

You can edit these directly in the browser on GitHub (good for quick fixes) or in the spreadsheet application of your choice. However, to ensure that non-Latin data remains readable, the files have to meet the requirements below.

!!! The csv files must: !!!

  • be stored in UTF-8 encoding,
  • and use commas , as separators (on Windows the default may be a semicolon ; depending on the regional settings).

You can find a good summary of best practices and common mistakes, such as typing a literal , inside a data field, at How to Write Good CSV. Our test suite automatically checks whether our csv files are well-formed. Please make sure that any pull request passes all the checks (all green).
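If you want a quick local sanity check before opening a pull request, the two requirements above can be sketched in a few lines of Python. This is only an illustration using the standard library, not the project's actual test suite; the function name `check_csv_bytes` and the messages it returns are made up for this example.

```python
import csv
import io


def check_csv_bytes(raw: bytes) -> list[str]:
    """Return a list of problems; an empty list means the basic checks passed.

    Hypothetical helper for illustration, not part of the project's tests.
    """
    # Requirement 1: the file must decode as UTF-8.
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError as err:
        return [f"not valid UTF-8: {err}"]

    records = list(csv.reader(io.StringIO(text, newline="")))
    if not records:
        return ["file is empty"]

    problems = []
    header = records[0]

    # Requirement 2: a single header cell that contains semicolons
    # suggests the file was exported with ; as the separator.
    if len(header) == 1 and ";" in header[0]:
        problems.append("header looks semicolon-separated, expected commas")

    # Every record should have as many fields as the header.
    for number, record in enumerate(records[1:], start=2):
        if len(record) != len(header):
            problems.append(
                f"record {number}: expected {len(header)} fields, found {len(record)}"
            )
    return problems
```

To try it on a file, read the file in binary mode (`open(path, "rb").read()`) and pass the bytes to the function. A passing result here does not guarantee that the automated checks on the pull request will pass as well.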

LibreOffice csv import settings

Depending on your editor, the following might look slightly different, but similar options are available in most spreadsheet editors.

A common source of errors is the auto-formatting feature, which applies unwanted changes to your input. To avoid this, you can either disable auto-formatting altogether, treat the whole spreadsheet (esp. Person and Place) as text, or treat only the columns known to cause problems as text (as in the screenshot below).

[Screenshot: LibreOffice import settings]
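The same "treat everything as text" principle applies if you read the files from a script instead of a spreadsheet editor. The sketch below assumes Python's standard `csv` module, which returns every field as a string and therefore leaves values untouched. The helper name `read_as_text` and the sample column names are invented for this example and do not reflect the actual tables.

```python
import csv
import io


def read_as_text(text: str) -> list[dict[str, str]]:
    """Parse CSV text into rows whose values all stay plain strings."""
    return list(csv.DictReader(io.StringIO(text, newline="")))


# Hypothetical sample data; real column names and values will differ.
sample = "id,date\n0007,1644-03\n"
rows = read_as_text(sample)

# The leading zeros and the partial date survive because nothing is converted.
print(rows[0]["id"], rows[0]["date"])

# Converting to a number is the kind of change auto-formatting makes:
# the leading zeros are dropped and cannot be recovered afterwards.
print(int(rows[0]["id"]))
```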

Issues and Branches

Please help everyone involved to keep the work flowing smoothly. Before creating a new working branch, check the names of other open branches to see if somebody is already doing what you were planning to do. In return, adopt a sensible name for your branch so others can see what you are doing without having to check your files. If you want to, e.g., input 50 more place names into the place table, check out a branch and give it a meaningful name, such as:

places-P0150-P0199

Secondly, if you notice that something will need to get done but you can't do it right away, open an issue describing what needs to get done. You can assign yourself or someone else to take on an issue. This makes it even more visible who is working on what. It is also a great way to assign a more senior team member to a task that you feel ill-equipped to deal with. To show that your work is connected to a certain issue, simply add the following to a commit message:

see #[issue-number]

You can do the same from the comments section of a pull request. If you have solved an issue, the keyword becomes:

close #[issue-number]

This will automatically close the issue once the change is merged into the master branch.