[GOBBLIN-1942] Create MySQL util class for re-usable methods and setup MysqlDagActio… #3812

Merged
7 commits merged on Nov 1, 2023

Conversation

umustafi
Contributor

…nStore retention

Dear Gobblin maintainers,

Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below!

JIRA

Description

  • Here are some details about my PR, including screenshots (if applicable):
    Defines a new class MySQLStoreUtils used for common functionality between MySQL-based implementations of stores. It includes a new method to run a SQL command in a ScheduledThreadPoolExecutor at a fixed interval T, which is used for retention on the MysqlDagActionStore and MysqlMultiActiveLeaseArbiter.
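
The fixed-interval execution described above could look roughly like the sketch below. It is only an illustration, not the code from this PR: `SqlRunner`, `RepeatingSqlScheduler`, and `scheduleAtFixedInterval` are hypothetical names, and the `SqlRunner` callback stands in for whatever actually executes the statement against the data source.

```java
import java.io.IOException;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical callback that runs one SQL statement; stands in for real JDBC access. */
interface SqlRunner {
  void run(String sql) throws IOException;
}

/** Illustrative sketch: re-runs a SQL command (for example a retention DELETE) every interval T. */
final class RepeatingSqlScheduler {
  private final ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(1);
  private final SqlRunner runner;

  RepeatingSqlScheduler(SqlRunner runner) {
    this.runner = runner;
  }

  /** Schedules {@code sql} to run immediately and then once per interval. */
  ScheduledFuture<?> scheduleAtFixedInterval(String sql, long interval, TimeUnit unit) {
    return executor.scheduleAtFixedRate(() -> {
      try {
        runner.run(sql);
      } catch (IOException | RuntimeException e) {
        // Report and continue: an exception escaping the task would suppress all later runs.
        System.err.println("Scheduled SQL failed: " + e.getMessage());
      }
    }, 0, interval, unit);
  }

  /** Stops the executor so that no further runs are started. */
  void shutdown() {
    executor.shutdownNow();
  }
}
```

Catching inside the task matters because `scheduleAtFixedRate` stops rescheduling a task once one execution throws.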

Tests

  • My PR adds the following unit tests OR does not need testing for this extremely good reason:

Commits

  • My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Contributor

@phet phet left a comment

excellent work on refactoring for common leverage!

I see the challenge in naming... I may have a suggestion once I find a few moments to stop and consider.

also, I believe we have other classes w/ withPreparedStatement that would likely also benefit from reuse. if you agree, I suggest dropping a TODO in them to refactor


protected final DataSource dataSource;
private final MySQLStoreUtils mySQLStoreUtils;
Contributor

pretty unintuitive name for an instance (generally 'utils' suggests a grab-bag of static functionality). solely based on the name, I immediately wonder: what is this meant to be used for? if it's a cohesive class, it ought to name itself accordingly.

Contributor

how about DBStatementExecutor?

     log.info("MysqlMultiActiveLeaseArbiter initialized");
   }

   // Initialize Constants table if needed and insert row into it if one does not exist
   private void initializeConstantsTable() throws IOException {
     String createConstantsStatement = String.format(CREATE_CONSTANTS_TABLE_STATEMENT, this.constantsTableName);
-    withPreparedStatement(createConstantsStatement, createStatement -> createStatement.executeUpdate(), true);
+    mySQLStoreUtils.withPreparedStatement(createConstantsStatement, createStatement -> createStatement.executeUpdate(),
Contributor

seeing this use--which is quite reasonable IMO--makes me wonder whether this ought to be a common base class. what are the args for vs. against that approach? are you concerned about multiple inheritance, since some mysql store classes need another different base class?

Contributor Author

Yes, I avoided this approach because almost all of our mysql store classes extend another base class which is meant to have different store implementations (although we typically use mysql).

@@ -38,14 +39,19 @@
import org.apache.gobblin.service.ServiceConfigKeys;
import org.apache.gobblin.util.ConfigUtils;
import org.apache.gobblin.util.ExponentialBackoff;
import org.apache.gobblin.util.MySQLStoreUtils;


@Slf4j
public class MysqlDagActionStore implements DagActionStore {
Contributor

is withPreparedStatement useful here too?

Contributor Author

Updated for methods within this class

* intervals.
*/
public class MySQLStoreUtils {
private final DataSource dataSource;
Contributor

javadoc ought to delineate whose responsibility for resource mgmt of DataSource--this class or caller/user

Contributor Author

Addressed in comments

Contributor

would be clearer to say something like "MUST maintain ownership of the {@link DataSource} and arrange for it to be closed, but only once this instance will no longer be used"

... and with that in mind, how would the repeating SQL mesh with that? should this instance hold on to what it schedules, and provide a .close() method that would unschedule them all? if so, the advised shutdown protocol might be:

storeUtils.close();
dataSource.close();

?

Contributor Author

That's a good pt. I added a close method to unschedule any existing executor threads and close them.
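
A close method of the kind discussed in this thread might be sketched as follows. This is a hypothetical illustration rather than the merged implementation: `CloseableRepeater` and `repeat` are made-up names, and the class simply remembers each `ScheduledFuture` so that `close()` can cancel everything it scheduled.

```java
import java.io.Closeable;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Illustrative sketch (hypothetical class): tracks what it schedules so close() can cancel it all. */
final class CloseableRepeater implements Closeable {
  private final ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
  private final List<ScheduledFuture<?>> scheduled = new ArrayList<>();

  /** Runs the task immediately and then once per interval until close() is called. */
  synchronized ScheduledFuture<?> repeat(Runnable task, long interval, TimeUnit unit) {
    ScheduledFuture<?> future = executor.scheduleAtFixedRate(task, 0, interval, unit);
    scheduled.add(future);
    return future;
  }

  /** Unschedules every repeating task, then stops the worker thread. Safe to call more than once. */
  @Override
  public synchronized void close() {
    for (ScheduledFuture<?> future : scheduled) {
      future.cancel(false);
    }
    scheduled.clear();
    executor.shutdownNow();
  }

  boolean isShutdown() {
    return executor.isShutdown();
  }
}
```

With try-with-resources, resources are closed in the reverse order of their declaration, so declaring a closeable data source first and the executor second would give the `storeUtils.close(); dataSource.close();` order suggested above.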

* functionality includes executing prepared statements on a data source object and executing SQL queries at fixed
* intervals. The instantiater of the class should provide the data source used within this utility.
*/
public class MySQLStoreUtils {
Contributor

naming-wise, what's mysql-specific about the impl? wouldn't it work for any DB? (is it merely for that fancy HikariDataSource logging trick) if that's all, let's not codify mysql in the name...

Contributor Author

DBStatementExecutor has a nice ring to it

Comment on lines +202 to +203
} catch (SQLException e) {
throw new IOException(String.format("Failure get dag actions from table %s ", tableName), e);
Contributor

withPreparedStatement will already map SQLException to IOException... do you do this here just for a specific error message?

Contributor Author

I didn't write the original error messages but I do believe it's to provide more context in the error message about consequence of the failed insert/update/delete etc...
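
One way to keep both properties discussed here (a single place that maps `SQLException` to `IOException`, plus a call-site-specific message) is to let the caller pass the context string into the helper. The sketch below is a hypothetical illustration of that idea only: `SqlCall`, `SqlErrors`, and `withContext` are assumed names and are not part of this PR.

```java
import java.io.IOException;
import java.sql.SQLException;

/** Hypothetical functional interface for a JDBC step that may throw SQLException. */
interface SqlCall<T> {
  T call() throws SQLException;
}

/** Illustrative sketch: maps SQLException to IOException once, with caller-supplied context. */
final class SqlErrors {
  private SqlErrors() {}

  /** Runs the call; on failure rethrows as IOException carrying the context and the original cause. */
  static <T> T withContext(String context, SqlCall<T> call) throws IOException {
    try {
      return call.call();
    } catch (SQLException e) {
      throw new IOException(context, e);
    }
  }
}
```

The original `SQLException` stays reachable through `getCause()`, so the extra context does not hide the underlying database error.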

}

/** Abstracts recurring pattern around resource management and exception re-mapping. */
public <T> T withPreparedStatement(String sql, CheckedFunction<PreparedStatement, T> f, boolean shouldCommit)
Contributor

what's your thinking on other classes with their own withPreparedStatement, such as MysqlBaseSpecStore--do they deserve a TODO comment because you recommend to migrate them to use this impl-in-common?

Contributor Author

Added the TODO comment there to refactor it if working on the class in the future.

@umustafi
Contributor Author

running checks in own fork https://github.com/umustafi/gobblin/pull/7/checks

Contributor

@phet phet left a comment

excellent, useful abstraction!

  */
-public class MySQLStoreUtils {
+public class DBStatementExecutor {
Contributor

might be nice to implement Closeable or AutoCloseable... what do you think?

Contributor

@phet phet left a comment

looks great!

@umustafi
Contributor Author

@codecov-commenter

codecov-commenter commented Oct 31, 2023

Codecov Report

Merging #3812 (e6ec812) into master (78bdf92) will increase coverage by 1.41%.
Report is 6 commits behind head on master.
The diff coverage is 67.53%.

@@             Coverage Diff              @@
##             master    #3812      +/-   ##
============================================
+ Coverage     47.63%   49.05%   +1.41%     
+ Complexity    11050     8073    -2977     
============================================
  Files          2155     1487     -668     
  Lines         85314    59091   -26223     
  Branches       9486     6808    -2678     
============================================
- Hits          40643    28986   -11657     
+ Misses        40982    27426   -13556     
+ Partials       3689     2679    -1010     
Files Coverage Δ
...blin/runtime/api/MysqlMultiActiveLeaseArbiter.java 77.05% <100.00%> (+1.76%) ⬆️
...gobblin/runtime/spec_store/MysqlBaseSpecStore.java 87.05% <ø> (ø)
.../runtime/dag_action_store/MysqlDagActionStore.java 65.82% <62.50%> (+2.18%) ⬆️
...a/org/apache/gobblin/util/DBStatementExecutor.java 61.76% <61.76%> (ø)

... and 680 files with indirect coverage changes


Comment on lines 205 to 207
if (rs != null) {
rs.close();
}
Contributor

Can rs be encapsulated with try with resources if it needs to be closed?

new DagAction(flowGroup, flowName, flowExecutionId, flowActionType), tableName), e);
} finally {
if (rs != null) {
rs.close();
Contributor

same comment here on try-with-resources
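
The try-with-resources suggestion can be illustrated with a small stand-in. The sketch below is not JDBC code from this PR: `FakeResultSet` is a hypothetical substitute for `java.sql.ResultSet` so the example runs without a database, and `countRows` is an assumed helper name. It shows that the resource is closed on normal return and on an exception, with no `finally` block or null check.

```java
/** Hypothetical stand-in for a JDBC ResultSet; records whether close() was called. */
final class FakeResultSet implements AutoCloseable {
  private final String[] rows;
  private int position = -1;
  boolean closed = false;

  FakeResultSet(String... rows) {
    this.rows = rows;
  }

  /** Advances to the next row; returns false once the rows are exhausted. */
  boolean next() {
    return ++position < rows.length;
  }

  String getString() {
    return rows[position];
  }

  @Override
  public void close() {
    closed = true;
  }
}

final class TryWithResourcesSketch {
  private TryWithResourcesSketch() {}

  /** Counts rows; the result set is closed automatically whether this returns or throws. */
  static int countRows(FakeResultSet resultSet) {
    try (FakeResultSet rs = resultSet) {
      int count = 0;
      while (rs.next()) {
        if (rs.getString() == null) {
          throw new IllegalStateException("null row");
        }
        count++;
      }
      return count;
    }
  }
}
```

A resource that is initialized to null is simply skipped when the block exits, which is why the explicit `if (rs != null)` guard becomes unnecessary.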

Comment on lines 172 to 176
if (exponentialBackoff.awaitNextRetryIfAvailable()) {
return getDagActionWithRetry(flowGroup, flowName, flowExecutionId, flowActionType, exponentialBackoff);
} else {
log.warn(String.format("Can not find dag action: %s with flowGroup: %s, flowName: %s, flowExecutionId: %s",
flowActionType, flowGroup, flowName, flowExecutionId));
Contributor

Is it possible to encapsulate an exponential retry with the db statement executor? Seems useful to reuse in other areas if needed.

Also if not, prefer to have else-if at the same level then else

Contributor Author

This function is actually unused at the moment and there's a TODO comment about potential refactoring. Until we decide how to update it, I am just changing the formatting in this PR to remove one of the nested if's to else if.
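
If the retry were ever pulled out into a reusable helper, as the question above raises, it might look something like this sketch. It is an assumption-laden illustration only: `BackoffRetry` and `retry` are hypothetical names, the sketch does not use Gobblin's `ExponentialBackoff` class, and it treats a null result as "not found yet". The loop form also avoids the nested if/else that the review comment mentions.

```java
import java.util.concurrent.Callable;

/** Illustrative sketch (hypothetical helper): retries a lookup with exponentially growing pauses. */
final class BackoffRetry {
  private BackoffRetry() {}

  /**
   * Calls the action until it returns a non-null value or the attempts run out,
   * doubling the sleep between attempts. Returns null if every attempt came back empty.
   */
  static <T> T retry(Callable<T> action, int maxAttempts, long initialDelayMillis) throws Exception {
    long delay = initialDelayMillis;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      T result = action.call();
      if (result != null) {
        return result;
      }
      if (attempt < maxAttempts) {
        // Wait before the next attempt, then double the pause.
        Thread.sleep(delay);
        delay *= 2;
      }
    }
    return null;
  }
}
```

A caller could wrap a database lookup in the `Callable` and log a warning when the helper returns null.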

@Will-Lo Will-Lo merged commit c865b6a into apache:master Nov 1, 2023
6 checks passed
Will-Lo added a commit to Will-Lo/incubator-gobblin that referenced this pull request Dec 20, 2023
* [GOBBLIN-1921] Properly handle reminder events (apache#3790)

* Add millisecond level precision to timestamp cols & proper timezone conversion

	- existing tests pass with minor modifications

* Handle reminder events properly

* Fix compilation errors & add isReminder flag

* Add unit tests

* Address review comments

* Add newline to address comment

* Include reminder/original tag in logging

* Clarify timezone issues in comment

---------

Co-authored-by: Urmi Mustafi <[email protected]>

* [GOBBLIN-1924] Reminder event flag true (apache#3795)

* Set reminder event flag to true for reminders

* Update unit tests

* remove unused variable

---------

Co-authored-by: Urmi Mustafi <[email protected]>

* [GOBBLIN-1923] Add retention for lease arbiter table (apache#3792)

* Add retention for lease arbiter table

* Replace blocking thread with scheduled thread pool executor

* Make Calendar instance thread-safe

* Rename variables, make values more clear

* Update timestamp related cols

---------

Co-authored-by: Urmi Mustafi <[email protected]>

* Change debug statements to info temporarily to debug (apache#3796)

Co-authored-by: Urmi Mustafi <[email protected]>

* [GOBBLIN-1926] Fix Reminder Event Epsilon Comparison (apache#3797)

* Fix Reminder Event Epsilon Comparison

* Add TODO comment

---------

Co-authored-by: Urmi Mustafi <[email protected]>

* [GOBBLIN-1930] Improve Multi-active related logs and metrics (apache#3800)

* Improve Multi-active related logs and metrics

* Add more metrics and logs around forwarding dag action to DagManager

* Improve logs in response to review comments

* Replace flow execution id with trigger timestamp from multi-active

* Update flow action execution id within lease arbiter

* Fix test & make Lease Statuses more lean

* Update javadoc

---------

Co-authored-by: Urmi Mustafi <[email protected]>

* add dataset root in some common type of datasets

* [GOBBLIN-1927] Add topic validation support in KafkaSource, and add TopicNameValidator (apache#3793)

* * Add generic topic validation support
* Add the first validator TopicNameValidator into the validator chain, as a refactor of existing codes

* Refine to address comments

* Refine

---------

Co-authored-by: Tao Qin <[email protected]>

* [GOBBLIN-1931] Refactor dag action updating method & add clarifying comment (apache#3801)

* Refactor dag action updating method & add clarifying comment

* Log filtered out duplicate messages

* logs and metrics for missing messages from change monitor

* Only add gobblin.service prefix for dagActionStoreChangeMonitor

---------

Co-authored-by: Urmi Mustafi <[email protected]>

* [GOBBLIN-1934] Monitor High Level Consumer queue size (apache#3805)

* Emit metrics to monitor high level consumer queue size

* Empty commit to trigger tests

* Use BlockingQueue.size() func instead of atomic integer array

* Remove unused import & add DagActionChangeMonitor prefix to metric

* Refactor to avoid repeating code

* Make protected variables private where possible

* Fix white space

---------

Co-authored-by: Urmi Mustafi <[email protected]>

* [GOBBLIN-1935] Skip null dag action types unable to be processed (apache#3807)

* Skip over null dag actions from malformed messages

* Add new metric for skipped messages

---------

Co-authored-by: Urmi Mustafi <[email protected]>

* [GOBBLIN-1922]Add function in Kafka Source to recompute workUnits for filtered partitions (apache#3798)

* add function in Kafka Source to recompute workUnits for filtered partitions

* address comments

* set default min container value to 1

* add condition when create empty wu

* update the condition

* Expose functions to fetch record partitionColumn value (apache#3810)

* [GOBBLIN-1938] preserve x bit in manifest file based copy (apache#3804)

* preserve x bit in manifest file based copy
* fix project structure preventing running unit tests from intellij
* fix unit test

* [GOBBLIN-1919] Simplify a few elements of MR-related job exec before reusing code in Temporal-based execution (apache#3784)

* Simplify a few elements of MR-related job exec before reusing code in Temporal-based execution

* Add JSON-ification to several foundational config-state representations, plus encapsulated convience method `JobState.getJobIdFromProps`

* Update javadoc comments

* Encapsulate check for whether a path has the extension of a multi-work-unit

* [GOBBLIN-1939] Bump AWS version to use a compatible version of Jackson with Gobblin (apache#3809)

* Bump AWS version to use a compatible version of jackson with Gobblin

* use shared aws version

* [GOBBLIN-1937] Quantify Missed Work Completed by Reminders (apache#3808)

* Quantify Missed Work Completed by Reminders
   Also fix bug to filter out heartbeat events before extracting field

* Refactor changeMonitorUtils & add delimiter to metrics prefix

* Re-order params to group similar ones

---------

Co-authored-by: Urmi Mustafi <[email protected]>

* GOBBLIN-1933]Change the logic in completeness verifier to support multi reference tier (apache#3806)

* address comments

* use connectionmanager when httpclient is not cloesable

* [GOBBLIN-1933] Change the logic in completeness verifier to support multi reference tier

* add uite test

* fix typo

* change the javadoc

* change the javadoc

---------

Co-authored-by: Zihan Li <[email protected]>

* [GOBBLIN-1943] Use AWS version 1.12.261 to fix a security vulnerability in the previous version (apache#3813)

* [GOBBLIN-1941] Develop Temporal abstractions, including `Workload` for workflows of unbounded size through sub-workflow nesting (apache#3811)

* Define `Workload` abstraction for Temporal workflows of unbounded size through sub-workflow nesting

* Adjust Gobblin-Temporal configurability for consistency and abstraction

* Define `WorkerConfig`, to pass the `TemporalWorker`'s configuration to the workflows and activities it hosts

* Improve javadoc

* Javadoc fixup

* Minor changes

* Update per review suggestions

* Insert pause, to spread the load on the temporal server, before launch of each child workflow that may have direct leaves of its own

* Appease findbugs by having `SeqSliceBackedWorkSpan::next` throw `NoSuchElementException`

* Add comment

* [GOBBLIN-1944] Add gobblin-temporal load generator for a single subsuming super-workflow with a configurable number of activities nested beneath (apache#3815)

* Add gobblin-temporal load generator for a single subsuming super-workflow with a configurable number of activities nested beneath

* Update per findbugs advice

* Improve processing of int props

* [GOBBLIN-1945] Implement Distributed Data Movement (DDM) Gobblin-on-Temporal `WorkUnit` evaluation (apache#3816)

* Implement Distributed Data Movement (DDM) Gobblin-on-Temporal `WorkUnit` evaluation

* Adjust work unit processing tuning for start-to-close timeout and nested execution branching

* Rework `ProcessWorkUnitImpl` and fix `FileSystem` misuse; plus convenience abstractions to load `FileSystem`, `JobState`, and `StateStore<TaskState>`

* Fix `FileSystem` resource lifecycle, uniquely name each workflow, and drastically reduce worker concurrent task execution

* Heed findbugs advice

* prep before commit

* Improve processing of required props

* Update comment in response to PR feedback

* [GOBBLIN-1942] Create MySQL util class for re-usable methods and setup MysqlDagActio… (apache#3812)

* Create MySQL util class for re-usable methods and setup MysqlDagActionStore retention

* Add a java doc

* Address review comments

* Close scheduled executors on shutdown & clarify naming and comments

* Remove extra period making config key invalid

* implement Closeable

* Use try with resources

---------

Co-authored-by: Urmi Mustafi <[email protected]>

* [Hotfix][GOBBLIN-1949] add option to detect malformed orc during commit (apache#3818)

* add option to detect malformed ORC during commit phase

* better logging

* address comment

* catch more generic exception

* validate ORC file after close

* move validate in between close and commit

* syntax

* whitespace

* update log

* [GOBBLIN-1948] Use same flowExecutionId across participants (apache#3819)

* Use same flowExecutionId across participants
* Set config field as well in new FlowSpec
* Use gobblin util to create config
* Rename function and move to util
---------
Co-authored-by: Urmi Mustafi <[email protected]>

* Allow extension of functions in GobblinMCEPublisher and customization of fileList file metrics are calculated for (apache#3820)

* [GOBBLIN-1951] Emit GTE when deleting corrupted ORC files (apache#3821)

* [GOBBLIN-1951] Emit GTE when deleting corrupted ORC files

This commit adds ORC file validation during the commit phase and deletes
corrupted files. It also includes a test for ORC file validation.

* Linter fixes

* Add framework and unit tests for DagActionStoreChangeMonitor (apache#3817)

* Add framework and unit tests for DagActionStoreChangeMonitor

* Add more test cases and validation

* Add header for new file

* Move FlowSpec static function to Utils class

* Remove unused import

* Fix compile error

* Fix unit tests

---------

Co-authored-by: Urmi Mustafi <[email protected]>

* [GOBBLIN-1952] Make jobname shortening in GaaS more aggressive (apache#3822)

* Make jobname shortening in GaaS more aggressive

* Change long name prefix to flowgroup

* Make KafkaTopicGroupingWorkUnitPacker pack with desired num of container (apache#3814)

* Make KafkaTopicGroupingWorkUnitPacker pack with desired num of container

* update comment

* [GOBBLIN-1953] Add an exception message to orc writer validation GTE (apache#3826)

* Fix FlowSpec Updating Function (apache#3823)

* Fix FlowSpec Updating Function
   * makes Config object with FlowSpec mutable
   * adds unit test to ensure flow compiles after updating FlowSpec
   * ensure DagManager resilient to exceptions on leadership change

* Only update Properties obj not Config to avoid GC overhead

* Address findbugs error

* Avoid updating or creating new FlowSpec objects by passing flowExecutionId directly to metadata

* Remove changes that are not needed anymore

* Add TODO to handle failed DagManager leadership change

* Overload function and add more documentation

---------

Co-authored-by: Urmi Mustafi <[email protected]>

* Emit metric to tune LeaseArbiter Linger metric  (apache#3824)

* Monitor number of failed persisting leases to tune linger

* Increase default linger and epsilon values

* Add metric for lease persisting success

* Rename metrics

---------

Co-authored-by: Urmi Mustafi <[email protected]>

* [GOBBLIN-1956]Make Kafka streaming pipeline be able to config the max poll records during runtime (apache#3827)

* address comments

* use connectionmanager when httpclient is not cloesable

* add uite test

* fix typo

* [GOBBLIN-1956] Make Kafka streaming pipeline be able to config the max poll records during runtime

* small refractor

---------

Co-authored-by: Zihan Li <[email protected]>

* Add semantics for failure on partial success (apache#3831)

* Consistly handle Rest.li /flowexecutions KILL and RESUME actions (apache#3830)

* [GOBBLIN-1957] GobblinOrcwriter improvements for large records (apache#3828)

* WIP

* Optimization to limit batchsize based on large record sizes

* Address review

* Use DB-qualified table ID as `IcebergTable` dataset descriptor (apache#3834)

* [GOBBLIN-1961] Allow `IcebergDatasetFinder` to use separate names for source vs. destination-side DB and table (apache#3835)

* Allow `IcebergDatasetFinder` to use separate names for source vs. destination-side DB and table

* Adjust Mockito.verify to pass test

* Prevent NPE in `FlowCompilationValidationHelper.validateAndHandleConcurrentExecution` (apache#3836)

* Prevent NPE in `FlowCompilationValidationHelper.validateAndHandleConcurrentExecution`

* improved `MultiHopFlowCompiler` javadoc

* Delete Launch Action Events After Processing (apache#3837)

* Delete launch action event after persisting

* Fix default value for flowExecutionId retrieval from metadata map

* Address review comments and add unit test

* Code clean up

---------

Co-authored-by: Urmi Mustafi <[email protected]>

* [GOBBLIN-1960] Emit audit count after commit in IcebergMetadataWriter (apache#3833)

* Emit audit count after commit in IcebergMetadataWriter

* Unit tests by extracting to a post commit

* Emit audit count first

* find bugs complaint

* [GOBBLIN-1967] Add external data node for generic ingress/egress on GaaS (apache#3838)

* Add external data node for generic ingress/egress on GaaS

* Address reviews and cleanup

* Use URI representation for external dataset descriptor node

* Fix error message in containing check

* Address review

* [GOBBLIN-1971] Allow `IcebergCatalog` to specify the `DatasetDescriptor` name for the `IcebergTable`s it creates (apache#3842)

* Allow `IcebergCatalog` to specify the `DatasetDescriptor` name for the `IcebergTable`s it creates

* small method javadoc

* [GOBBLIN-1970] Consolidate processing dag actions to one code path (apache#3841)

* Consolidate processing dag actions to one code path

* Delete dag action in failure cases too

* Distinguish metrics for startup

* Refactor to avoid duplicated code and create static metrics proxy class

* Remove DagManager checks that don't apply on startup

* Add test to check kill/resume dag action removal after processing

* Remove unused import

* Initialize metrics proxy with Null Pattern

---------

Co-authored-by: Urmi Mustafi <[email protected]>

* [GOBBLIN-1972] Fix `CopyDataPublisher` to avoid committing post-publish WUs before they've actually run (apache#3844)

* Fix `CopyDataPublisher` to avoid committing post-publish WUs before they've actually run

* fixup findbugsMain

* [GOBBLIN-1975] Keep job.name configuration immutable if specified on GaaS (apache#3847)

* Revert "[GOBBLIN-1952] Make jobname shortening in GaaS more aggressive (apache#3822)"

This reverts commit 5619a0a.

* use configuration to keep specified jobname if enabled

* Cleanup

* [GOBBLIN-1974] Ensure Adhoc Flows can be Executed in Multi-active Scheduler state (apache#3846)

* Ensure Adhoc Flows can be Executed in Multi-active Scheduler state

* Only delete spec for adhoc flows & always after orchestration

* Delete adhoc flows when dagManager is not present as well

* Fix flaky test for scheduler

* Add clarifying comment about failure recovery

* Re-ordered private method

* Move private methods again

* Enforce sequential ordering of unit tests to make more reliable

---------

Co-authored-by: Urmi Mustafi <[email protected]>

* [GOBBLIN-1973] Change Manifest distcp logic to compare permissions of source and dest files even when source is older (apache#3845)

* change should copy logic

* Add tests, address review

* Fix checkstyle

* Remove unused imports

* [GOBBLIN-1976] Allow an `IcebergCatalog` to override the `DatasetDescriptor` platform name for the `IcebergTable`s it creates (apache#3848)

* Allow an `IcebergCatalog` to override the `DatasetDescriptor` platform for the `IcebergTable`s it creates

* fixup javadoc

* Log when `PasswordManager` fails to load any master password (apache#3849)

* [GOBBLIN-1968] Temporal commit step integration (apache#3829)

Add commit step to Gobblin temporal workflow for job publish

* Add codeql analysis

* Make gradle specific

* Add codeql as part of build script

* Initialize codeql

* Use separate workflow for codeql instead with custom build function as autobuild seems to not work

* Add jdk jar for global dependencies script

---------

Co-authored-by: umustafi <[email protected]>
Co-authored-by: Urmi Mustafi <[email protected]>
Co-authored-by: Arjun <[email protected]>
Co-authored-by: Tao Qin <[email protected]>
Co-authored-by: Tao Qin <[email protected]>
Co-authored-by: Hanghang Nate Liu <[email protected]>
Co-authored-by: Andy Jiang <[email protected]>
Co-authored-by: Kip Kohn <[email protected]>
Co-authored-by: Zihan Li <[email protected]>
Co-authored-by: Zihan Li <[email protected]>
Co-authored-by: Matthew Ho <[email protected]>
4 participants