Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add Avro Schema input processor #1753

Merged
merged 20 commits into from
Apr 26, 2024

Conversation

akkshitgupta
Copy link
Collaborator

@akkshitgupta akkshitgupta commented Jan 23, 2024

Description

Add Avro Schema support.

Related Issue

fixes #1741

Checklist

  • The code follows the project's coding standards and is properly linted (npm run lint).
  • Tests have been added or updated to cover the changes.
  • All tests pass successfully locally.(npm run test).
  • Documentation has been updated to reflect the changes.

Copy link

netlify bot commented Jan 23, 2024

Deploy Preview for modelina canceled.

Name Link
🔨 Latest commit ef792c6
🔍 Latest deploy log https://app.netlify.com/sites/modelina/deploys/662b6363d172e50008a01f9a

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Welcome to AsyncAPI. Thanks a lot for creating your first pull request. Please check out our contributors guide useful for opening a pull request.
Keep in mind there are also other channels you can use to interact with AsyncAPI community. For more details check out this issue.

src/processors/AvroSchemaInputProcessor.ts Outdated Show resolved Hide resolved
src/processors/AvroSchemaInputProcessor.ts Outdated Show resolved Hide resolved
src/processors/AvroSchemaInputProcessor.ts Outdated Show resolved Hide resolved
Copy link
Member

@jonaslagoni jonaslagoni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some more comments

@jonaslagoni
Copy link
Member

hey @akkshitgupta how is it coming along? Are you stuck on anything? ✌️

@akkshitgupta
Copy link
Collaborator Author

since it is required to obtain the meta model directly from the Avro Schema bypassing the common model. I am trying to figure out how a schema is using common model in between to obtain the meta model. I read the different meta models but stuck at the implementation of it.

Screenshot

@jonaslagoni
Copy link
Member

jonaslagoni commented Feb 7, 2024

What are you currently stuck on? Getting started with converting them or a specific example you cant convert?

@akkshitgupta
Copy link
Collaborator Author

akkshitgupta commented Feb 7, 2024

Getting started with it. I was reading the code base and connecting some of the dots for reverse engineering to find the way backward from meta model. but unfortunately could not use the meta models. perhaps failing to understand the flow for generating the meta model.

PS: I selected this issue to work on so that I can understand the working of Common Model and Meta Model in a different way.

I will much obliged to you if you can share some insights either here or on Slack(for clean history)

@jonaslagoni
Copy link
Member

@akkshitgupta looks like you got figured out some of it, let me know if you have anything blocking you 😄

@akkshitgupta
Copy link
Collaborator Author

Hey @jonaslagoni I forgot to ping you in this particular comment. can you please go through this once. I am trying to test the processor but stuck at the bindings.
image

Also wanna confirm that whether we are going with the same .json file extension or gonna adopt the .avsc file extension and add a support for that as well. :)

Copy link
Member

@jonaslagoni jonaslagoni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall it looks like you are on the right track! Just make sure that you are not reusing code that is ONLY relevant for JSON Schema inputs and focus on what is possible with Avro and do so with tests ✌️

If I where you I would take the example in the issue and add it as you star to guide towards, if the MetaModel that is being created does not match what you expect it means there is a bug somewhere ✌️ Also I would break down the example and add individual tests for each small inputs where possible to create better tests 😄

src/helpers/AvroToMetaModel.ts Show resolved Hide resolved
src/helpers/AvroToMetaModel.ts Outdated Show resolved Hide resolved
src/helpers/AvroToMetaModel.ts Outdated Show resolved Hide resolved
src/helpers/AvroToMetaModel.ts Outdated Show resolved Hide resolved
@jonaslagoni
Copy link
Member

What those bindings are I have no idea 🤨 Are you using node v18?

@jonaslagoni
Copy link
Member

Hey @akkshitgupta, have you had a chance to take a look at the review comments?

@akkshitgupta
Copy link
Collaborator Author

Apologies for the delay @jonaslagoni I am working on it. some tests broke yesterday figuring out what happened.

Copy link

sonarcloud bot commented Feb 23, 2024

Quality Gate Failed Quality Gate failed

Failed conditions
8.4% Duplication on New Code (required ≤ 3%)

See analysis details on SonarCloud

@akkshitgupta
Copy link
Collaborator Author

Hey @jonaslagoni, I am stuck at the process of converting an array as I could not figure out the keyword which would represent the entire array.

image

here the default keyword is used to represent an empty array.

@jonaslagoni
Copy link
Member

jonaslagoni commented Feb 23, 2024

@akkshitgupta in AsyncAPI spec-json-schema repo we have this JSON Schema document that can validates Avro documents: https://github.com/asyncapi/spec-json-schemas/blob/master/common/avroSchema_v1.json

I suggest we focus on that, because yea, the spec is not clear on what items: the schema of the array’s items. can contain.

According to the JSON Schema: https://github.com/asyncapi/spec-json-schemas/blob/35ea79055d09bdd319ed816e4ec169a12f0bb7a5/common/avroSchema_v1.json#L225 it can contain types, which is: https://github.com/asyncapi/spec-json-schemas/blob/35ea79055d09bdd319ed816e4ec169a12f0bb7a5/common/avroSchema_v1.json#L14

here the default keyword is used to represent an empty array.

Default is just hmmm, yea I see what you mean, what is default 🤔 I actually think you can ignore it ✌️

@akkshitgupta
Copy link
Collaborator Author

I have some doubts regarding spec-json-schema

  • are we gonna implement spec-json-schema validation instead of Avro Schema defined here or we just need to implement the scenario/concept used there for avro array and use that implementation while getting the Array Meta Model
  • not understood this particular thing, what you are referring to in here

    I suggest we focus on that

Note: I have some academic exams and I might be late in replying during this coming week. Apologies for that.

@jonaslagoni
Copy link
Member

I have some doubts regarding spec-json-schema

* are we gonna implement `spec-json-schema` validation instead of [Avro Schema](https://github.com/asyncapi/modelina/blob/73f47d7f7a326cb7f40e71658574c3b9e9510f71/src/models/AvroSchema.ts) defined here or we just need to implement the scenario/concept used there for `avro array` and use that implementation while getting the Array Meta Model

* not understood this particular thing, what you are referring to in here
  > I suggest we focus on that

Note: I have some academic exams and I might be late in replying during this coming week. Apologies for that.

The only reason why I linked the spec-json-schema files, was to show the structure of how an array is defined.

But I guess we could validate the input as well 🤔 I would forget about it in this PR though.

No worries, no rush ✌️ Good luck in the exams!

@akkshitgupta
Copy link
Collaborator Author

akkshitgupta commented Mar 26, 2024

Hey @jonaslagoni, I am getting an error whenever I run the integration test using test/processors/AvroSchemaInputProcessor.spec.ts.

image

However, the unit tests are running fine and there is no typeerror. I tried to google the solution but could not find success. can you help me with this particular thing 😄

@jonaslagoni
Copy link
Member

jonaslagoni commented Mar 27, 2024

@akkshitgupta what happens when you try running includes on object type such as here: https://github.com/asyncapi/modelina/pull/1753/files#diff-92b54f6485623708e919c7944aa95ef0c21d22401eeb4d2ee3c35cdafd57eb1eR24 😄

@jonaslagoni
Copy link
Member

@akkshitgupta for some reason I cannot see them at all 🤨

Do you mind posting them again or as a comment? 🤔

@akkshitgupta

This comment was marked as duplicate.

@jonaslagoni
Copy link
Member

@akkshitgupta do you see the pending label? It means that you wrote those comments as part of a review that you have not published. You have to add them as single comments 🙂

@akkshitgupta
Copy link
Collaborator Author

@jonaslagoni extremely sorry for the confusion 🙇🏻🙇🏻

I thought these comments are the part of unresolved conversation thats why it is showing pending 😓

avroSchemaModel: AvroSchema,
name: string
): FloatModel | undefined {
if (avroSchemaModel.type !== 'float' && avroSchemaModel.type !== 'double') {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got my mistake 😅 didn't think of that

i guess need to make a check in every function for the AvroSchema object type to consider the value of type as an Avro Schema itself.

 if ((typeof avroSchemaModel.type !== 'string' && !Array.isArray(avroSchemaModel.type)) || (avroSchemaModel.type !== 'int' && avroSchemaModel.type !== 'long')) {
return undefined;
}

@jonaslagoni, Please correct me If I am heading heading wrong or if there is any alternative. 🙇🏻

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is a bit hard, because we have two inputs that have a type keyword. JSON Schema and Avro, each with overlapping types such as string and boolean.

So the answer to the question lies within "what is the difference between Avro and JSON Schema and how can you easily detect it" 😄

I don't have the perfect answer here unfortunately, other then look at the keywords that are different between them and match with those 🙂

Copy link
Collaborator Author

@akkshitgupta akkshitgupta Apr 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jonaslagoni How about introducing a check explicitly in the shouldProcess() function using an additional schema property with a value of Avro to differentiate an Avro Schema before processing the input?

have you checked these comments: #1753 (comment) and #1753 (comment)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jonaslagoni How about introducing a check explicitly in the shouldProcess() function using an additional schema property with a value of Avro to differentiate an Avro Schema before processing the input?

Can you give an example of what you mean?

Copy link
Collaborator Author

@akkshitgupta akkshitgupta Apr 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Example:

{
  "name": "Person",
  "type": "int",
  "schema": "avro"
}

@jonaslagoni here, we can add a check for the schema property against the value avro. This would be to verify that only avro schema is passed to the Avro processor.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would never happen no, because schema is not part of the Avro standard and never will be probably 🙂

It's okay if the solution is not bullet proof, as long as we document the edge cases.

docs/usage.md Outdated

The Avro input processor expects the `type` property, as per [Avro Schema specs](https://avro.apache.org/docs/1.11.1/specification/#schema-declaration), in the input object in order to proceed successfully.

> Note: Currently, we support `record` datatype for generating the `Object Model`.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jonaslagoni As of now, we do not support map, fixed, byte datatypes. I don't get your point about enum, it is being handled as an Enum Model separately.

src/models/AvroSchema.ts Outdated Show resolved Hide resolved
@jonaslagoni
Copy link
Member

@akkshitgupta is it something you want to include in this PR, or want to merge this one and add issues to describe what is missing?

@akkshitgupta
Copy link
Collaborator Author

hello @jonaslagoni I have not been active for the past few days for some reason and also have my end-of-semester exams in the coming week. It would be great to merge this PR with other PRs to mention related issues, if that makes sense 😅😅. However, I would make sure to complete this ASAP. Thanks a lot for your kind support in my journey 😄

@jonaslagoni
Copy link
Member

@akkshitgupta do you mind listing all the current short comings of the implementation right now?

@akkshitgupta
Copy link
Collaborator Author

Here are the pending tasks

  • implementation for map, fixed and byte data type
  • find a work around to differentiate Avro Schema from JSON Schema
  • update docs/usage.md with the edge cases and other related things.

@jonaslagoni jonaslagoni changed the base branch from master to next April 26, 2024 08:14
@jonaslagoni
Copy link
Member

Here are the pending tasks

  • implementation for map, fixed and byte data type
  • find a work around to differentiate Avro Schema from JSON Schema
  • update docs/usage.md with the edge cases and other related things.

@akkshitgupta I am gonna merge this into the next branch, do you have time to add issues for each of those tasks, or are you full with stuff to do in school? ✌️

@jonaslagoni
Copy link
Member

Btw, you are free to join as a code owner for the Avro input processor if you want to 💪

Copy link

sonarcloud bot commented Apr 26, 2024

Quality Gate Passed Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
No data about Duplication

See analysis details on SonarCloud

Copy link
Member

@jonaslagoni jonaslagoni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love the perseverance @akkshitgupta, thanks for sticking with the PR and fighting through the difficulties 💪

@jonaslagoni
Copy link
Member

/rtm

@asyncapi-bot asyncapi-bot merged commit 7b26a2a into asyncapi:next Apr 26, 2024
16 checks passed
@jonaslagoni
Copy link
Member

@all-contributors please add @akkshitgupta for code, example, test, docs

Copy link
Contributor

@jonaslagoni

I've put up a pull request to add @akkshitgupta! 🎉

@akkshitgupta
Copy link
Collaborator Author

akkshitgupta commented Apr 26, 2024

Btw, you are free to join as a code owner for the Avro input processor if you want to 💪

Thanks a lot @jonaslagoni for the opportunity. I would love to become a code owner. 🙏🏻

do you have time to add issues for each of those tasks

created a new issue covering all the tasks

@jonaslagoni
Copy link
Member

Thanks a lot @jonaslagoni for the opportunity. I would love to become a code owner. 🙏🏻

Go ahead and add the Avro input processor to the codeowners file and add yourself ✌️

# Input Champions for AsyncAPI input

@asyncapi-bot
Copy link
Contributor

🎉 This PR is included in version 4.0.0-next.36 🎉

The release is available on:

Your semantic-release bot 📦🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add Avro support
4 participants