REST code generation for 2024-07 prerelease #361

Merged: 13 commits into prerelease from jhamon/prerelease-codegen, Jul 3, 2024

@jhamon jhamon commented Jun 26, 2024

Problem

We need to generate code from the OpenAPI specs stored in our apis repo. We'll also need to massage the output a bit to avoid confusing duplication of generated exception classes and other utilities.

Solution

  • I created a new codegen directory to act as the home for everything codegen related. Inside this directory there are several things:
    • Git submodule of our template files (slightly tweaked from the open source generator, but in a separate repo so they can be shared with plugins)
    • Git submodule of our apis repo
    • A build-oas.sh script which does everything needed to generate the code from the spec, adjust import paths, and place things into the pinecone package structure.

The script does the following:

  • Pulls in the latest from the apis repo
  • Builds the apis repo to produce versioned spec files
  • For each of data plane and control plane, generates a Python package
  • Performs surgery to extract the files duplicated across these two packages into a shared package
  • Uses sed to adjust import paths to reflect the modified code structure
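
To illustrate the import-path step, here is a minimal Python sketch of the same idea. It is not the actual build-oas.sh logic, which uses sed. The function name `rewrite_imports` and the package prefix passed to it are assumptions for illustration; the list of shared modules is taken from the generated `shared` directory.

```python
import re

# Modules that end up in the shared package (see the output structure).
SHARED_MODULES = ("api_client", "configuration", "exceptions", "model_utils", "rest")


def rewrite_imports(source: str, package: str) -> str:
    """Point references to duplicated helper modules at the shared package.

    `package` is the dotted prefix the generator used for one package.
    The value used below is an assumption, not taken from the script.
    """
    names = "|".join(SHARED_MODULES)
    pattern = re.compile(rf"\b{re.escape(package)}\.({names})\b")
    return pattern.sub(r"pinecone.core.openapi.shared.\1", source)


# Hypothetical generated source: one shared helper import, one model import.
before = (
    "from pinecone.core.openapi.control.exceptions import ApiException\n"
    "from pinecone.core.openapi.control.model.index_model import IndexModel\n"
)
after = rewrite_imports(before, "pinecone.core.openapi.control")
print(after)
```

Only the helper-module import is redirected; the model import keeps its package-specific path.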

Extracting the shared code is more than a cosmetic change. Without it, configuration and error handling get confusing: each generated package ships its own copy of these classes, and although the copies are identical in name and behavior, Python treats them as distinct objects in isinstance checks, .__class__ comparisons, and except clauses.
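
The problem can be reproduced in a few lines. This sketch simulates two generated packages that each define their own copy of an exception class. The module names and the `ApiException` class name are hypothetical stand-ins, not the real generated identifiers.

```python
import types


def make_generated_module(module_name: str) -> types.ModuleType:
    """Simulate one generated package shipping its own copy of exceptions.py."""
    module = types.ModuleType(module_name)
    source = "class ApiException(Exception):\n    pass\n"
    exec(source, module.__dict__)
    return module


# Hypothetical stand-ins for the control plane and data plane packages.
control = make_generated_module("control_exceptions")
data = make_generated_module("data_exceptions")

# Same name and same source, but two distinct class objects.
same_class = control.ApiException is data.ApiException

error = data.ApiException("request failed")
matches_control = isinstance(error, control.ApiException)

try:
    raise error
except control.ApiException:
    caught_by = "control"
except data.ApiException:
    caught_by = "data"

print(same_class, matches_control, caught_by)  # False False data
```

An except clause written against one copy silently misses errors raised by the other, which is why a single shared definition matters.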

When the script is done running, the generated outputs have this structure:

pinecone/core
└── openapi
    ├── control
    │   ├── api
    │   │   ├── inference_api.py
    │   │   └── manage_indexes_api.py
    │   └── model
    │       ├── __init__.py
    │       ├── create_index_request.py
    │       ├── index_model.py
    │       └── ...etc
    ├── data
    │   ├── api
    │   │   └── data_plane_api.py
    │   └── model
    │       ├── __init__.py
    │       ├── upsert_request.py
    │       ├── vector.py
    │       └── ...etc
    └── shared
        ├── api_client.py
        ├── configuration.py
        ├── exceptions.py
        ├── model_utils.py
        └── rest.py

What actually changed in here?

The main substantive change appears to be in how the spec info is expected to be passed when creating an index, so I had to write some additional tests and modify the create_index method.

Code for the inference API is also generated. For now, though, the wrapper implementations for that functionality are only available by installing a separate plugin, pinecone-plugin-inference.

Type of Change

  • Breaking change (fix or feature that would cause existing functionality to not work as expected)

Test Plan

To run the script, run make generate-oas. Before pushing, I ran tests locally with make test-unit and PINECONE_API_KEY='key' make test-integration. This helped me catch a lot of small issues related to the sed rewrite of import paths.

I want to see the tests pass, since I changed quite a bit about how the generated code is structured.

@@ -1,22 +1,32 @@
import time
import warnings
import logging
-from typing import Optional, Dict, Any, Union, List, cast, NamedTuple
+from typing import Optional, Dict, Any, Union, List, Tuple, cast, NamedTuple

A decent amount of change was needed in this file due to changes in generated code related to index spec.

@jhamon jhamon marked this pull request as ready for review June 27, 2024 17:56
@jhamon jhamon changed the title Code generation for 2024-07 prerelease REST code generation for 2024-07 prerelease Jun 27, 2024

@haruska haruska left a comment

I didn't thoroughly review all the code changes for the generated code but overall I really like this approach to code-gen for the python client.

@aulorbe aulorbe left a comment

Looks awesome!

@austin-denoble austin-denoble left a comment

Really nice work on this Jen, the approach feels clear and easy to follow especially given everything going on under the hood. 🚢

Comment on lines +11 to +13
update_apis_repo() {
echo "Updating apis repo"
pushd codegen/apis

Would it be useful to have something similar for making sure the codegen/python-oas-templates submodule is also up to date? I doubt it would have as much churn as the APIs repo, but just in case.

Comment on lines +103 to +105
# Remove the docstring headers that aren't really correct in the
# context of this new shared package structure
find "$target_directory" -name "*.py" -print0 | xargs -0 -I {} sh -c 'sed -i "" "/^\"\"\"/,/^\"\"\"/d" "{}"'

Nice cleanup! 👍

Comment on lines +109 to +110
# Adjust import paths in every file
find "${destination}" -name "*.py" | while IFS= read -r file; do

Also cool, very clever piece of scripting here, nice work.

@jhamon jhamon merged commit 21448c7 into prerelease Jul 3, 2024
83 checks passed
@jhamon jhamon deleted the jhamon/prerelease-codegen branch July 3, 2024 22:03
4 participants