Welcome, and thanks for your interest in contributing to the Synapse Python client!
By contributing, you are agreeing that we may redistribute your work under this license.
Note: Please don't file an issue to ask a question. You'll get faster results by using the resources below.
We have an official forum and a detailed FAQ where the community and maintainers can chime in with helpful advice if you have questions.
Bug reports and feature requests can be made in two ways. The first (preferred) method is by posting a question in the Synapse Help Forum. The second is by opening an issue in this repository. In either case, providing enough details for the developers to verify and troubleshoot your issue is paramount:
- Use a clear and descriptive title for the issue to identify the problem.
- Describe the exact steps which reproduce the problem in as much detail as possible. If you are following examples from somewhere (e.g., the Synapse Docs site), provide a link.
- Provide specific examples to demonstrate the steps. Include copy/pasteable snippets. If you are providing snippets in the issue, use Markdown code blocks.
- Describe the behavior you observed after following the steps and point out what exactly is the problem with that behavior.
- Explain which behavior you expected to see instead and why.
After a bug report is received, expect a Sage Bionetworks staff member to contact you through the submission method you chose (Synapse Help Forum or GitHub issue). Once there is enough detail for the bug report or feature request, a JIRA issue will be opened. If you want to work on the fix or feature yourself, follow the next sections!
For internal Sage collaborators, you will have access and permissions to create branches within the central repository for this project. As a result, instead of creating a fork of this repository you should just clone the repository as is and work off a feature branch. This is because there is additional overhead required to make sure that integration tests and SonarCloud scans run properly in a forked repo, as secrets are not copied to forks.
- Clone the repository to your local machine so you can begin making changes.
- On your local machine make sure you have the latest version of the `develop` branch:
    git checkout develop
    git pull origin develop
- See the GitHub docs for how to make a copy (a fork) of a repository to your own GitHub account.
- Clone the repository to your local machine so you can begin making changes.
- Add this repository as an upstream remote on your local git repository so that you are able to fetch the latest commits. ("Fetch the latest commits" means you are able to update your forked repository with the latest changes from the original repository.)
- On your local machine make sure you have the latest version of the `develop` branch:
    git remote add upstream https://github.com/Sage-Bionetworks/synapsePythonClient.git
    git checkout develop
    git pull upstream develop
Perform the following one-time steps to set up your local environment.
- This package uses Python. If you have not already, please install `pyenv` to manage your Python versions. This package supports all Python versions >=3.9 and <=3.12. If you do not install `pyenv`, make sure that Python and `pip` are installed correctly and have been added to your PATH by running `python3 --version` and `pip3 --version`. If your installation was successful, your terminal will return the versions of Python and `pip` that you installed. Note: If you have `pyenv`, it will install a specific version of Python for you.
- Install `pipenv` by running `pip install pipenv`.
    - If you already have `pipenv` installed, ensure that the version is >=2023.9.8 to avoid compatibility issues.
- Install `synapseclient` locally using pipenv:
    # Verify you are at the root directory for the cloned repository (ie: `cd synapsePythonClient`)
    # To develop locally you want to add --dev
    pipenv install --dev
    # Set your active session to the virtual environment you created
    pipenv shell
    # Note: The 'Python Environment Manager' extension in vscode is recommended here
- Once completed you are ready to start developing. Commands run through the CLI, or through an IDE like Visual Studio Code, within the virtual environment will have all required dependencies automatically installed. Try running `synapse -h` in your shell to read over the available CLI commands, or view the `Usage as a library` section in the README.md to get started using the library to write more Python.
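As a quick sanity check on the supported range stated in the setup steps (Python >=3.9 and <=3.12), a minimal sketch like the following can be run with the interpreter you intend to use. The helper name `is_supported_python` is hypothetical and is not part of the client.

```python
import sys

# Supported range as stated in this guide: >=3.9 and <=3.12.
MIN_SUPPORTED = (3, 9)
MAX_SUPPORTED = (3, 12)


def is_supported_python(version_info=sys.version_info) -> bool:
    """Return True if the (major, minor) version is inside the supported range."""
    major_minor = tuple(version_info[:2])
    return MIN_SUPPORTED <= major_minor <= MAX_SUPPORTED


if __name__ == "__main__":
    current = f"{sys.version_info.major}.{sys.version_info.minor}"
    print(f"Python {current} supported: {is_supported_python()}")
```

Only the major and minor components are compared, so any patch release within 3.9 through 3.12 is treated as supported.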
Once your virtual environment is created, the `pre-commit` command will be available through your terminal. Install `pre-commit` into your git hooks via:
pre-commit install
When you commit your files via git, it will automatically use the `.pre-commit-config.yaml` to run various checks to enforce style.
If you want to manually run all pre-commit hooks use:
pre-commit run --all-files
Learn about the multiple ways one can log in to Synapse here.
Developing on the Python client starts with picking an issue to work on in JIRA! The open work items (bugs and new features) are tracked in JIRA in the SYNPY Project. Issues marked as `Open` are ready for your contributions! Please make sure you assign yourself the ticket and check with the maintainers if the issue you picked is suitable.
Now that you have chosen a JIRA ticket and have your own fork of this repository, it's time to start development!
Note
To ensure the most fluid development, try not to push to your develop or main branch.
- (Assuming you have followed all 4 steps above in the "fork and clone this repository" section.) Navigate to your cloned repository on your computer/server.
- Create a feature branch off the `develop` branch. The branch should be named the same as the JIRA issue you are working on (e.g., `SYNPY-1234-{feature-here}`). I recommend adding details of the actual feature in the branch name so that you don't need to go back and forth between JIRA and GitHub.
    git checkout develop
    git checkout -b SYNPY-1234-{feature-here}
- At this point, you have only created the branch locally. You need to push it to your fork on GitHub.
    git push --set-upstream origin SYNPY-1234-{feature-here}
- You should now be able to see the branch on GitHub. Make commits as you deem necessary. It helps to provide useful commit messages: a commit message saying 'Update' is a lot less helpful than saying 'Remove X parameter because it was unused'. Try to avoid using `git commit -am` unless you know for a fact that those commits are grouped.
    Note: It is good practice to run `git diff` or `git status` to first view your changes prior to pushing!
    # Make sure that you have set up `pre-commit` as noted in the getting started section
    git commit changed_file.txt -m "Remove X parameter because it was unused"
    git push
- Once you have made your additions or changes, make sure you write tests and run the test suite. More information on testing below.
- Once you have completed all the steps above, in GitHub, create a pull request from the feature branch of your fork to the `develop` branch of `Sage-Bionetworks/synapsePythonClient`.
    Note: A code maintainer must review and accept your pull request. A code review ideally happens with both the contributor and the reviewer present, but is not strictly required for contributing. This can be performed remotely (e.g., Zoom, Hangout, or other video or phone conference).
The status of an issue can be tracked in JIRA. Once an issue has passed a code review with a Sage Bionetworks engineer, they will update its status in JIRA appropriately.
All code added to the client must have tests. These might include unit tests (to test specific functionality of code that was added to support fixing the bug or feature), integration tests (to test that the feature is usable, e.g., it should complete the expected behavior as reported in the feature request or bug report), or both.
The Python client uses `pytest` to run tests. The test code is located in the `tests` subdirectory.
Here's how to run the test suite:
Note: The entire set of tests takes approximately 20 minutes to run.
# Unit tests
pytest -vs tests/unit
# Integration tests - The integration tests should be run against the `dev` synapse server
pytest -vs tests/integration
To test a specific feature, specify the full path to the function to run:
# Test table query functionality from the command line client
pytest -vs tests/integration/synapseclient/test_command_line_client.py::test_table_query
The easiest way to specify the HTTP endpoints to use for all Synapse requests is to modify a few key-value pairs in the `~/.synapseConfig` file, as shown below. Note that this is also where you will specify the dev authentication:
[authentication]
username=<dev username>
authtoken=<dev authtoken>
[endpoints]
repoEndpoint=https://repo-dev.dev.sagebase.org/repo/v1
authEndpoint=https://repo-dev.dev.sagebase.org/auth/v1
fileHandleEndpoint=https://repo-dev.dev.sagebase.org/file/v1
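Before running integration tests it can help to confirm that a config really points at the dev stack. The following is a minimal sketch using the standard library `configparser`. The helper `uses_dev_endpoints` and the constant names are hypothetical and not part of the client, and the sample text simply mirrors the key-value pairs above.

```python
import configparser

# Sample content mirroring the key-value pairs shown above.
SAMPLE_CONFIG = """
[authentication]
username=<dev username>
authtoken=<dev authtoken>

[endpoints]
repoEndpoint=https://repo-dev.dev.sagebase.org/repo/v1
authEndpoint=https://repo-dev.dev.sagebase.org/auth/v1
fileHandleEndpoint=https://repo-dev.dev.sagebase.org/file/v1
"""

DEV_HOST = "repo-dev.dev.sagebase.org"
ENDPOINT_KEYS = ("repoEndpoint", "authEndpoint", "fileHandleEndpoint")


def uses_dev_endpoints(config_text: str) -> bool:
    """Return True only if every endpoint key is present and targets the dev host."""
    # Interpolation is disabled so that '%' characters in tokens are read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    return all(
        DEV_HOST in parser.get("endpoints", key, fallback="")
        for key in ENDPOINT_KEYS
    )


if __name__ == "__main__":
    print(uses_dev_endpoints(SAMPLE_CONFIG))
```

To check a real file, read its text first (for example with `pathlib.Path.home() / ".synapseConfig"`) and pass the contents to the helper.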
`tests/integration/conftest.py` is where we define which trace exporter to use. Set the `SYNAPSE_OTEL_INTEGRATION_TEST_EXPORTER` environment variable to `otlp` or `console` depending on your use case.
When integration tests are run in the GitHub CI/CD pipeline, the trace data is uploaded automatically using OTLP.
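As a rough illustration of how an environment variable like this can drive exporter selection, here is a minimal sketch. It is not the code in `conftest.py`. The function name `exporter_choice`, the case-insensitive matching, and treating an unset or unknown value as "no exporter" are all assumptions made for the example.

```python
import os
from typing import Mapping, Optional

ENV_VAR = "SYNAPSE_OTEL_INTEGRATION_TEST_EXPORTER"
VALID_CHOICES = ("otlp", "console")


def exporter_choice(environ: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return 'otlp' or 'console' if requested, otherwise None (assumed default)."""
    value = environ.get(ENV_VAR, "").strip().lower()
    return value if value in VALID_CHOICES else None


if __name__ == "__main__":
    print(f"Selected exporter: {exporter_choice()}")
```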
As an external collaborator you will not have access to a development account and environment to run the integration tests against. Either request that a Sage Bionetworks staff member run your integration tests via a pull request, or, contact us via the Service Desk to requisition a development account for integration testing only.
Asyncio is the future of the Synapse Python Client. As such, the expectation is that all future methods that rely on async methods or network calls are asynchronous themselves.
When a new async method is created ask yourself if the method will be:
- Accessed by someone using the client
- An internal method only called within the client
When an async method is expected to be called by someone using the client:
- We will need to provide a non-async method for them to call.
If the async method is expected to be called by an internal method only:
- There is no need to create a non-async method.
Read up on the expected syntax for an async method here.
For an async method that client users will call:
- Create the new method and make sure that it has an `_async` suffix.
    - For example `async def my_method_async(self)`
- Use the `async_to_sync` decorator found in the `async_utils.py` script to automatically generate a non-async version of your code at runtime. Add the decorator to the class where the method exists.
- For static type checkers to see that a non-suffixed method will be available at runtime:
    - Create or update a Protocol class that is a copy of the method definitions without the async keyword. See `project_protocol.py` for an example.
    - Copy the docstring of the original method to the method defined in the protocol.
    - Update the examples in the docstring to remove the await or async function calls.
    - Import the protocol class you created and add it to the class definition so that the class inherits from the protocol.
- Write unit and integration tests for BOTH the async and non-async versions.
    - Write your tests once with async in mind.
    - Copy them to a non-async testing directory.
    - Remove the async-related keywords and imports.
- Add the method definitions to the appropriate markdown file for generated doc pages.
For an async method that is only called internally:
- Create the new method with the async keyword. The name of the method should not be suffixed by `_async`, to prevent it from accidentally being included with any runtime generation of non-async code.
- Write unit and integration tests for the async method.
- Add the method definitions to the appropriate markdown file for generated doc pages.
When you make a modification to an async method, please also copy any changes to the definition of the method OR docstring into the non-async method definition. It is expected that you manually keep them in sync.
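The pattern described in this section (an `_async` method, a runtime-generated non-async twin, and a Protocol that tells type checkers about the twin) can be sketched in a self-contained way. The decorator below, `async_to_sync_sketch`, is a simplified hypothetical stand-in and not the client's actual `async_to_sync` implementation. `Greeter` and `GreeterSynchronousProtocol` are likewise made-up names for illustration.

```python
import asyncio
import inspect
from typing import Protocol


def _make_sync(coro_func, sync_name):
    """Build a blocking wrapper that runs `coro_func` to completion."""

    def sync_method(self, *args, **kwargs):
        return asyncio.run(coro_func(self, *args, **kwargs))

    sync_method.__name__ = sync_name
    sync_method.__doc__ = coro_func.__doc__
    return sync_method


def async_to_sync_sketch(cls):
    """Hypothetical stand-in: add a non-async twin for each `*_async` coroutine method."""
    for name, attr in list(vars(cls).items()):
        if name.endswith("_async") and inspect.iscoroutinefunction(attr):
            sync_name = name[: -len("_async")]
            setattr(cls, sync_name, _make_sync(attr, sync_name))
    return cls


class GreeterSynchronousProtocol(Protocol):
    """Copy of the method definitions without the async keyword, for type checkers."""

    def greet(self, name: str) -> str:
        """Return a greeting for `name`."""
        return ""


@async_to_sync_sketch
class Greeter(GreeterSynchronousProtocol):
    async def greet_async(self, name: str) -> str:
        """Return a greeting for `name`."""
        await asyncio.sleep(0)  # placeholder for a network call
        return f"Hello, {name}!"


if __name__ == "__main__":
    print(Greeter().greet("Ada"))
    print(asyncio.run(Greeter().greet_async("Ada")))
```

In this sketch the generated `greet` replaces the placeholder inherited from the protocol at class-creation time, so both spellings run the same coroutine. Because the wrapper relies on `asyncio.run`, it is only suitable for callers that are not already inside a running event loop.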
The Synapse Python Client uses `flake8` to enforce PEP8 style consistency and to check for possible errors.
You can verify your code matches these expectations by running the flake8 command from the project root directory:
# ensure that you have the flake8 package installed
# Note: This is not required if using the pipenv virtual environment
pip install flake8
flake8
In addition, it is expected that all docstrings follow the Google Python Style Guide here. This is needed because it is how our auto-doc creates our Python documentation pages. We should be creating the docstrings without types.
- Any new dataclasses should document the class attributes in both a docstring and under the class attribute itself. This is in effect until pylance resolves: microsoft/pylance-release#4759
- Create 1 or more examples of all new functions you create. An example format of a function is below. Note that the specific spacing and tabbing is required for it to show up as a code block:
from dataclasses import dataclass


@dataclass
class MyDataClass:
    """
    The description of this class

    Attributes:
        my_attribute: My attribute description.

    Example: Example
        Doing something cool

            my_instance = MyDataClass()

        Doing something else cool

            my_instance = MyDataClass()
    """

    my_attribute: bool = None
    """My attribute description."""
- mkdocstrings has an autorefs plugin that allows you to link to other resources like:
[synapseclient.entity.Project][]
OpenTelemetry helps support the analysis of traces and spans which can provide insights into latency, errors, and other performance metrics. During development it may prove useful to collect information about the execution of your code. The following is meant to be a starting point for additions to the current traces being collected.
To learn more about how to modify trace collection read the documentation here
Attributes that are collected within traces should not contain any sensitive data. We should follow the Common specification concepts defined by OpenTelemetry when it comes to naming attribute keys.
All Synapse-related attributes should live within the `synapse` namespace, for example: `synapse.id`, `synapse.parent_id`.
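One way to keep attribute keys consistently namespaced is to build them through a small helper. The sketch below is illustrative only. `synapse_attributes` is a hypothetical helper, not part of the client, and the IDs are placeholder values.

```python
from typing import Any, Dict

NAMESPACE = "synapse"


def synapse_attributes(**values: Any) -> Dict[str, Any]:
    """Prefix each keyword with the `synapse.` namespace, skipping unset (None) values."""
    return {
        f"{NAMESPACE}.{key}": value
        for key, value in values.items()
        if value is not None
    }


# Placeholder IDs, for illustration only.
attributes = synapse_attributes(id="syn123", parent_id="syn456", version=None)

if __name__ == "__main__":
    print(attributes)
```

A dictionary built this way can then be handed to an OpenTelemetry span via its `set_attributes` method. Remember that values must not contain sensitive data.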
- All new integration tests should create a new span for the test.
- New spans within the Synapse Python Client should only be added if it will bring value to those looking at the spans. Some initial questions to ask yourself are:
- "Will this information help someone in the future to review an error in the code?"
- "Is this an external call or an entry point into the python client?"
- "Is it useful to know how long this section of code takes to execute?"
The core of the doc pages is `mkdocstrings`. It is a series of markdown pages that use a few plugins to link to other documents (aka: `autorefs`).
At the root directory of the project you'll find a `mkdocs.yml`; this is where all commands are run from.
To start a local HTTP server to host the documents use:
mkdocs serve
On push to the master branch, the GitHub workflow `build-docs.yml` is set to automatically use the `mkdocs gh-deploy` command to build and deploy changes to the live doc site.
In each of the docs folders where articles live, there is a README with further information about the expected content in that folder.
The re-use of connection pools during integration tests needs to be considered. During April 2024 it was found that during a single run of all integration tests almost 400 connections were created and subsequently closed during the test run. As such the following set of guidelines should be followed:
- All tests should use the `async` keyword. This allows any tests to share the underlying HTTPX async client for requests.
- Any non-`session` scoped fixtures should not execute an HTTP request. If the fixture does need to execute a request it should not be scoped to `function`. This is because each scope level runs its own event loop, and connection pools cannot be shared between event loops.
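The event-loop constraint behind these guidelines can be seen in a small standard-library sketch. `LoopBoundPool` is a hypothetical stand-in for a connection pool that is tied to the loop it was created on. It is not HTTPX code, and the two `asyncio.run` calls merely imitate fixtures running at different scope levels.

```python
import asyncio


class LoopBoundPool:
    """Hypothetical stand-in for a connection pool tied to the loop that created it."""

    def __init__(self) -> None:
        self.loop = asyncio.get_running_loop()

    def usable_here(self) -> bool:
        # The pool can only be reused from the event loop it was created on.
        return self.loop is asyncio.get_running_loop()


async def create_pool() -> LoopBoundPool:
    return LoopBoundPool()


async def check_pool(pool: LoopBoundPool) -> bool:
    return pool.usable_here()


async def same_loop_reuse() -> bool:
    pool = await create_pool()
    return await check_pool(pool)


# One event loop: the pool created earlier is still usable.
shared_ok = asyncio.run(same_loop_reuse())

# Two event loops (imitating fixtures at different scope levels): the pool is stranded.
stale_pool = asyncio.run(create_pool())
stranded_ok = asyncio.run(check_pool(stale_pool))

if __name__ == "__main__":
    print(f"same loop: {shared_ok}, different loop: {stranded_ok}")
```

Each `asyncio.run` call creates a fresh event loop, which is why the pool created in the first call is not usable from the second.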