Asynchronous Import #330

Open
wants to merge 2 commits into
base: master

@naydav naydav commented Oct 23, 2019

  • Intro
  • REST API
  • Modularity (API / Extension points)

-- Intro
-- REST API
-- Modularity (API / Extension points)
"format': {
"escape": "|"
"enclosure" : "|"
"delimiter" => "|"

Please replace `=>` with `:`.


```
{
"UUID": "uuid_string"
```

"UUID" as Lowercase?

- Import Data Exchanging;
- Import configuration;
- Product import;
- Stock status import;

I'm a bit confused by "stock status". Do you mean stock quantity, and not only enabled/disabled?
Also, how about differentiating between Single Stock and MSI source/stock?

ticaje commented Nov 3, 2019

After digging a little into this issue, I arrived at the following assessment:

It turns out that the class RetrieveSourceData contains the source validator, which wraps all the validators that apply to a specific source data retriever (strategy). This means that each strategy must comply with the validations defined in di.xml:

[image]

This forces me to assume a sort of data contract that any new strategy (JSON in this case) must fit: sourceType, sourceDefinition, and sourceDataFormat must not be empty, and sourceDefinition must be valid base64.

Coming down to the JSON strategy, I understand that the sourceDefinition field is JSON encoded in base64. If that is right, I assume it is actually a CSV translated into JSON, so the data retriever strategy takes this chunk and translates it into an array iterator, as if it were a standard Magento CSV.
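
For illustration, the contract described above could be sketched roughly as follows. This is a minimal sketch, not the PR's implementation: the function name `retrieveJsonSourceData` and all sample values are hypothetical, and it assumes that sourceDefinition holds a base64-encoded JSON array of rows.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper, not part of the PR: checks the assumed data contract
 * and turns a base64-encoded JSON source definition into an iterator of rows.
 */
function retrieveJsonSourceData(array $source): \ArrayIterator
{
    // The three fields named in the comment above must not be empty.
    foreach (['sourceType', 'sourceDefinition', 'sourceDataFormat'] as $field) {
        if (empty($source[$field])) {
            throw new \InvalidArgumentException("Field '$field' must not be empty.");
        }
    }

    // Strict mode makes base64_decode() return false for characters outside the base64 alphabet.
    $json = base64_decode($source['sourceDefinition'], true);
    if ($json === false) {
        throw new \InvalidArgumentException('sourceDefinition is not valid base64.');
    }

    // JSON_THROW_ON_ERROR (PHP 7.3+) raises \JsonException on malformed JSON.
    $rows = json_decode($json, true, 512, JSON_THROW_ON_ERROR);
    if (!is_array($rows)) {
        throw new \InvalidArgumentException('Decoded sourceDefinition must be a JSON array of rows.');
    }

    return new \ArrayIterator($rows);
}

// Example values only; none of them come from the PR.
$source = [
    'sourceType' => 'example_type',
    'sourceDataFormat' => 'json',
    'sourceDefinition' => base64_encode('[{"sku":"demo-1","qty":"5"}]'),
];

foreach (retrieveJsonSourceData($source) as $row) {
    echo $row['sku'], ' => ', $row['qty'], PHP_EOL;
}
```

Each decoded row is an associative array keyed by column name, which is the same shape a CSV reader with a header row would produce.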

- Advanced pricing import;
- Documentation;

## TODO

Let's change the priorities here:

  • Design of Import configuration;
  • Design of Get Import status;
  • Design of Import processing - Sync / Async
  • Design of Restart failed operations;

*
* @api
*/
interface CsvFormatInterface extends ExtensibleDataInterface

Isn't this a duplicate? You already defined this interface above.

*
* @api
*/
interface SourceValidatorInterface

How and where are we going to use this extension point?


It will be used here:
https://github.com/magento/architecture/pull/330/files#diff-610ea2fd7eb8081598349a79459e9313R40
at the step where we retrieve the data, in order to validate it.


Is it injected through DI configuration?


right
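
A rough sketch of what "injected through DI configuration" could look like on the PHP side. Everything here is an assumption for illustration: `ExampleSourceValidatorInterface` is a hypothetical stand-in (the real signature of `SourceValidatorInterface` is not shown in this thread), and `NotEmptyValidator` and `ValidatorChain` are invented names. In Magento the validator list would be supplied by di.xml rather than built by hand.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for the PR's SourceValidatorInterface.
interface ExampleSourceValidatorInterface
{
    /** @return string[] error messages; empty when the source is valid */
    public function validate(array $source): array;
}

// Hypothetical validator: reports an error when one field is empty.
final class NotEmptyValidator implements ExampleSourceValidatorInterface
{
    private $field;

    public function __construct(string $field)
    {
        $this->field = $field;
    }

    public function validate(array $source): array
    {
        return empty($source[$this->field]) ? ["{$this->field} must not be empty"] : [];
    }
}

// Hypothetical composite: runs every injected validator and collects the errors.
final class ValidatorChain implements ExampleSourceValidatorInterface
{
    private $validators;

    /** @param ExampleSourceValidatorInterface[] $validators would come from di.xml in Magento */
    public function __construct(array $validators)
    {
        $this->validators = $validators;
    }

    public function validate(array $source): array
    {
        $errors = [];
        foreach ($this->validators as $validator) {
            $errors = array_merge($errors, $validator->validate($source));
        }
        return $errors;
    }
}

// Built by hand here; the object manager would do this from configuration.
$chain = new ValidatorChain([
    new NotEmptyValidator('sourceType'),
    new NotEmptyValidator('sourceDefinition'),
]);
print_r($chain->validate(['sourceType' => 'example']));
```

With this shape, a third-party module adds a validator by extending the configured list, without touching the class that retrieves the data.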


- [REST API](rest-api.md)

## Modularity (API / Extension points)

@tariqjawed83 tariqjawed83 Nov 8, 2019

1. It would be helpful to add a diagram showing the flow of interaction between the modules.
2. A simple class/interface UML diagram would make the concepts really easy to grasp.

*/
interface SourceDataInterface extends \IteratorAggregate
{
public const ITERATOR = 'iterator';

How do we use this?

*
* @return string|null
*/
public function getUuid(): ?string;

Do we need to generalize on UUID? Can't we just say ID? The problem with UUIDs is that they are large and don't index well. As the data in the table grows, the index grows with it, which hurts query performance.

https://www.callicoder.com/distributed-unique-id-sequence-number-generator/
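
To put a number on the size concern: a canonical UUID is 36 characters as text, but it carries only 16 bytes of information. The sketch below shows the difference. The helper `uuidToBinary` is hypothetical and not part of the PR, and storing the packed form is only one possible mitigation, not something this proposal specifies.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper, not part of the PR: packs a canonical UUID string into 16 raw bytes.
 */
function uuidToBinary(string $uuid): string
{
    $hex = str_replace('-', '', $uuid);
    // A canonical UUID has exactly 32 hexadecimal digits once the dashes are removed.
    if (strlen($hex) !== 32 || !ctype_xdigit($hex)) {
        throw new \InvalidArgumentException('Not a canonical UUID string.');
    }
    return hex2bin($hex);
}

$uuid = '123e4567-e89b-12d3-a456-426614174000'; // example value only
echo strlen($uuid), ' characters as text, ', strlen(uuidToBinary($uuid)), ' bytes packed', PHP_EOL;
```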


I do not expect a really big data flow here. Even if you import 100+ files daily (which is not a common case), we would need some cleanup functionality for such a table.
UUIDs were introduced in Asynchronous Operations, and as far as I know the Architecture team's idea was to move in the UUID direction. That discussion is still open, isn't it?


There are some other PRs currently open that replace DB sequences for IDs with another strategy when the ID is not externally provided; we are discussing this there.


/**
* Describes how to change data before import
*

Please add an example use or a bit more explanation of this.

@@ -0,0 +1,146 @@
# Data converting before import

@tariqjawed83 tariqjawed83 Nov 12, 2019

I would also recommend that you not implement your own ETL code; there are many good PHP libraries that can do all of this for you. Please have a look at these:

https://php-etl.readthedocs.io/en/v1/
https://blog.panoply.io/6-of-the-best-php-etl-tools

And if you are designing your own interface, use standard, mature libraries to manipulate the data.


Thanks for the info, I will check it!


@tariqjawed83, I expect it would take a lot of time to get the Architecture team's approval for using such a library in Magento Core, wouldn't it?

*
* @return string
*/
public function getImportBehaviour(): string;

What is this behavior, and why is it a string? I'm trying to understand whether there is a better way to represent it than a string.


Behaviour here is one of "add", "update", "delete", "replace", and so on.


So during the import process we can add new objects, update existing ones, delete some, or replace them.
We will start with "add" and "delete" for the MVP.
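
On the question of representing the behaviour as something stricter than a free-form string, one option is to keep the `string` return type but pin the allowed values as class constants. This is a minimal sketch under that assumption: the class `ImportBehaviour` and its method are hypothetical and not part of the PR, and the supported set simply mirrors the MVP scope mentioned above.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical value list, not part of the PR: names the behaviours discussed above.
 */
final class ImportBehaviour
{
    public const ADD = 'add';
    public const UPDATE = 'update';
    public const DELETE = 'delete';
    public const REPLACE = 'replace';

    // Assumption taken from the comment above: the MVP starts with "add" and "delete" only.
    private const MVP_SUPPORTED = [self::ADD, self::DELETE];

    public static function isSupported(string $behaviour): bool
    {
        // Strict comparison so that only exact string matches are accepted.
        return in_array($behaviour, self::MVP_SUPPORTED, true);
    }
}

var_dump(ImportBehaviour::isSupported(ImportBehaviour::ADD));
var_dump(ImportBehaviour::isSupported(ImportBehaviour::REPLACE));
```

Callers can then compare against `ImportBehaviour::ADD` instead of repeating string literals, and an unsupported value can be rejected before any import work starts.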


nuzil commented Nov 19, 2019

Related to #330 (comment)

[image: AsyncImportUML]

It's a high-level diagram; I hope it helps a bit.


5 participants