-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add salesforce connector #57
base: main
Are you sure you want to change the base?
Conversation
if (nextJobLocator.IsEmpty) | ||
{ | ||
this.currentJob = Option<SalesForceJob>.None; | ||
this.ScheduleOnce(TimerKey, this.source.changeCaptureInterval); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something is not working here.
The stream keeps trying to fetch data instead of waiting 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason why this issue is happening is how Akka actually works.
Every Akka stream is a set of an objects (stages) running in parallel. Each stage can push an element to the downstream stage or pull an element from the upstream stage. When a sink is scheduled for execution and has no element inside its processing que, it will pull the source to obtain a new element for processing.
If we look in the execution stack when execution hits the PullChanges
method, we can observe two kinds of stack:
To ensure this source operates correctly, we can examine the source code of the Pulse
stage. This stage's function is essentially very similar to what you need to accomplish: it attempts to retrieve data from a certain data source, and if data is available, it pushes it to the subsequent stage. However, if data is not available or if the sink pulls from your source too frequently, the appropriate response is to do nothing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added some comments, we can discuss the code a bit next week.
[ExcludeFromCodeCoverage(Justification = "Trivial")] | ||
public class SalesForceJobAbortedException : SalesForceJobException | ||
{ | ||
public SalesForceJobAbortedException(string message) : base(message) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The warnings about comment should be fixed, here and below.
/// <summary> | ||
/// Source for reading data from SalesForce BULK v2 API. | ||
/// </summary> | ||
public class SalesForceSource : GraphStage<SourceShape<List<DataCell>>>, IParquetSource, ITaggedSource |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this source support backfill?
{ | ||
return new SalesForceSource(jobProvider | ||
, entityName, httpClient, changeCaptureInterval); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
{ | |
return new SalesForceSource(jobProvider | |
, entityName, httpClient, changeCaptureInterval); | |
} | |
{ | |
return new SalesForceSource(jobProvider, entityName, httpClient, changeCaptureInterval); | |
} |
/// <summary> | ||
/// Salesforce Job provider that handles talking with the Bulk V2 API | ||
/// </summary> | ||
public record SalesForceJobProvider |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I can see, in this class you are providing HttpClient as a parameter in every call. It would be better to provide HttpClient from constructor and make it a private field and implement the IDisposable
interface fot this class.
/// <returns>Authenticated message</returns> | ||
public Task<HttpRequestMessage> GetAuthenticatedMessage(HttpClient httpClient) | ||
{ | ||
if (this.validTo.GetValueOrDefault(DateTimeOffset.MaxValue) < |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it the only way to update the token? What happens if for some reason the old token is expired?
/// <param name="job"></param> | ||
public Task<Option<SalesForceJob>> GetJobStatus(HttpClient httpClient, SalesForceJob job) | ||
{ | ||
return this.GetAuthenticatedMessage(httpClient).Map(msg => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe FlatMap here?
}); | ||
} | ||
|
||
private (Type, object) ConvertToSalesforceType(string salesforceDataType, string value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be static?
if (nextJobLocator.IsEmpty) | ||
{ | ||
this.currentJob = Option<SalesForceJob>.None; | ||
this.ScheduleOnce(TimerKey, this.source.changeCaptureInterval); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason why this issue is happening is how Akka actually works.
Every Akka stream is a set of an objects (stages) running in parallel. Each stage can push an element to the downstream stage or pull an element from the upstream stage. When a sink is scheduled for execution and has no element inside its processing que, it will pull the source to obtain a new element for processing.
If we look in the execution stack when execution hits the PullChanges
method, we can observe two kinds of stack:
To ensure this source operates correctly, we can examine the source code of the Pulse
stage. This stage's function is essentially very similar to what you need to accomplish: it attempts to retrieve data from a certain data source, and if data is available, it pushes it to the subsequent stage. However, if data is not available or if the sink pulls from your source too frequently, the appropriate response is to do nothing.
} | ||
|
||
[Fact] | ||
public async Task TokenGeneration() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's usually better to distinguish the different steps of a test (like Arrange
, Act
and Assert
) with comments as described here: https://automationpanda.com/2020/07/07/arrange-act-assert-a-pattern-for-writing-good-tests/
You can see an example in Arcane.Operator
/// <summary> | ||
/// Represents Salesforce entity | ||
/// </summary> | ||
public class SalesForceEntity |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The classes that are not intended for use by plugins should always be internal
, not public
. This is because the documentation for internal classes is not published as API docs
Implements #56
Scope
Implemented:
Rough steps done to fetch data
Additional changes:
CsvOperations
to common extensions since it's used both in CDM and SalesforceChecklist
latest
commit.