-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Draft][Please review] Make Paginator use simpler queries when there are no sql joins used (#8278) #11595
base: 3.3.x
Are you sure you want to change the base?
[Draft][Please review] Make Paginator use simpler queries when there are no sql joins used (#8278) #11595
Changes from 2 commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
|
@@ -10,6 +10,7 @@ | |||||
use Doctrine\ORM\Internal\SQLResultCasing; | ||||||
use Doctrine\ORM\NoResultException; | ||||||
use Doctrine\ORM\Query; | ||||||
use Doctrine\ORM\Query\Expr\Join; | ||||||
use Doctrine\ORM\Query\Parameter; | ||||||
use Doctrine\ORM\Query\Parser; | ||||||
use Doctrine\ORM\Query\ResultSetMapping; | ||||||
|
@@ -21,6 +22,7 @@ | |||||
use function array_map; | ||||||
use function array_sum; | ||||||
use function assert; | ||||||
use function count; | ||||||
use function is_string; | ||||||
|
||||||
/** | ||||||
|
@@ -38,14 +40,43 @@ | |||||
private readonly Query $query; | ||||||
private bool|null $useOutputWalkers = null; | ||||||
private int|null $count = null; | ||||||
private bool $fetchJoinCollection; | ||||||
/** | ||||||
* @var bool The auto-detection of queries style was added a lot later to this class, and this | ||||||
* class historically was by default using the more complex queries style, which means that | ||||||
* the simple queries style is potentially very under-tested in production systems. The purpose | ||||||
* of this variable is to not introduce breaking changes until an impression is developed that | ||||||
* the simple queries style has been battle-tested enough. | ||||||
*/ | ||||||
private bool $queryStyleAutoDetectionEnabled = false; | ||||||
/** @var bool|null Null means "undetermined". */ | ||||||
private bool|null $queryHasHavingClause = null; | ||||||
|
||||||
/** @param bool $fetchJoinCollection Whether the query joins a collection (true by default). */ | ||||||
/** | ||||||
* @param bool|null $fetchJoinCollection Whether the query joins a collection (true by default). Set | ||||||
* to null to enable auto-detection of this parameter, however note that the auto-detection requires | ||||||
* a QueryBuilder to be provided to Paginator, otherwise this parameter goes back to its default value. | ||||||
* Also, for now, when the auto-detection of this parameter is enabled, then auto-detection | ||||||
* of $useOutputWalkers and of the CountWalker::HINT_DISTINCT is also enabled. | ||||||
*/ | ||||||
public function __construct( | ||||||
Query|QueryBuilder $query, | ||||||
private readonly bool $fetchJoinCollection = true, | ||||||
bool|null $fetchJoinCollection = true, | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I would revert this and instead introduce a static method. public static function newWithAutoDetection(QueryBuilder $queryBuilder): self
{
// do the autodetection here
// use the unaltered constructor here
} That way:
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Good idea. I've introduced the static function as you advised and the constructor returned to its original form in its entirety. Thanks. |
||||||
) { | ||||||
if ($fetchJoinCollection === null) { | ||||||
$fetchJoinCollection = $this->autoDetectFetchJoinCollection($query); | ||||||
} | ||||||
|
||||||
if ($fetchJoinCollection === null) { | ||||||
$this->fetchJoinCollection = true; | ||||||
} else { | ||||||
$this->fetchJoinCollection = $fetchJoinCollection; | ||||||
$this->queryStyleAutoDetectionEnabled = true; | ||||||
} | ||||||
|
||||||
if ($query instanceof QueryBuilder) { | ||||||
$query = $query->getQuery(); | ||||||
$this->queryHasHavingClause = $query->getDQLPart('having') !== null; | ||||||
$query = $query->getQuery(); | ||||||
} | ||||||
|
||||||
$this->query = $query; | ||||||
|
@@ -154,6 +185,23 @@ | |||||
return new ArrayIterator($result); | ||||||
} | ||||||
|
||||||
/** @return bool|null Null means that auto-detection could not be carried out. */ | ||||||
private function autoDetectFetchJoinCollection(Query|QueryBuilder $query): bool|null | ||||||
{ | ||||||
// For now, only working with QueryBuilder is supported. | ||||||
if (! $query instanceof QueryBuilder) { | ||||||
return null; | ||||||
} | ||||||
|
||||||
/** @var array<string, Join[]> $joinsPerRootAlias */ | ||||||
$joinsPerRootAlias = $query->getDQLPart('join'); | ||||||
|
||||||
// For now, do not try to investigate what kind of joins are used. It is, however, doable | ||||||
// to detect a presence of only *ToOne joins via the access to joined entity classes' | ||||||
// metadata (see: QueryBuilder::getEntityManager()->getClassMetadata(className)). | ||||||
return count($joinsPerRootAlias) > 0; | ||||||
} | ||||||
|
||||||
private function cloneQuery(Query $query): Query | ||||||
{ | ||||||
$cloneQuery = clone $query; | ||||||
|
@@ -171,13 +219,30 @@ | |||||
/** | ||||||
* Determines whether to use an output walker for the query. | ||||||
*/ | ||||||
private function useOutputWalker(Query $query): bool | ||||||
private function useOutputWalker(Query $query, bool $forCountQuery = false): bool | ||||||
{ | ||||||
if ($this->useOutputWalkers === null) { | ||||||
return (bool) $query->getHint(Query::HINT_CUSTOM_OUTPUT_WALKER) === false; | ||||||
if ($this->useOutputWalkers !== null) { | ||||||
return $this->useOutputWalkers; | ||||||
} | ||||||
|
||||||
return $this->useOutputWalkers; | ||||||
// When a custom output walker already present, then do not use the Paginator's. | ||||||
if ($query->getHint(Query::HINT_CUSTOM_OUTPUT_WALKER) !== false) { | ||||||
return false; | ||||||
} | ||||||
|
||||||
// When not joining onto *ToMany relations, then do not use the more complex CountOutputWalker. | ||||||
// phpcs:ignore SlevomatCodingStandard.ControlStructures.UselessIfConditionWithReturn.UselessIfCondition | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Note: I think it's better not to enforce the failing coding standard on that if-condition because otherwise it's a multiline There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It's unclear to me why the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
These points are semantics + I'm the "guest" here, so I've merged the last two return statements, as per phpstan detection. |
||||||
if ( | ||||||
$forCountQuery | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. if we have a separate detection for count query and result query, I would rather extract the check for There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You said:
Did you mean to say "extract (...) in a separate function"? In other words, are you proposing splitting the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. IIRC, I meant This would look like that: // ...
if (!$this->queryStyleAutoDetectionEnabled) {
return true; // Old simple guessing
}
if ($forCountQuery) {
return /* condition for the support of CountWalker */;
}
return /* condition for the support of LimitSubqueryWalker */; |
||||||
&& $this->queryStyleAutoDetectionEnabled | ||||||
&& ! $this->fetchJoinCollection === false | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Don't compare boolean variable with boolean constants
Suggested change
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It's surprising to me that I left that in the code, despite reviewing it twice myself. Thanks. |
||||||
// CountWalker doesn't support the "having" clause, while CountOutputWalker does. | ||||||
&& $this->queryHasHavingClause === false | ||||||
) { | ||||||
return false; | ||||||
} | ||||||
|
||||||
return true; | ||||||
} | ||||||
|
||||||
/** | ||||||
|
@@ -205,10 +270,20 @@ | |||||
$countQuery = $this->cloneQuery($this->query); | ||||||
|
||||||
if (! $countQuery->hasHint(CountWalker::HINT_DISTINCT)) { | ||||||
$countQuery->setHint(CountWalker::HINT_DISTINCT, true); | ||||||
$hintDistinctDefaultTrue = true; | ||||||
|
||||||
// When not joining onto *ToMany relations, then use a simpler COUNT query in the CountWalker. | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I see what you mean. It appears I misunderstood the purpose of that variable. I gave a thought about what to do with it, and I initially concluded that that variable is just misnamed. Consider the following:
In other words, both the following DQL queries require the extra "SELECT DISTINCT" subquery to happen in the
It appears then that the docblocks are under an impression that that variable is "true" regardless of whether a *ToMany relation is fetch-loaded or not. Do you agree that that instance variable is not named correctly (or the author incorrectly thought that not fetch-loading a *ToMany relation is immune to the "duplicated records being returned from the database" problem)? Would you agree for me to just rename that instance variable to something like: "queryHasToManyJoins"? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. hmm, maybe it is indeed badly named. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Note that there is a case where joined collections that are not selected will not impact the count or the iteration: when using There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Just so that I understand: you're trying to say that running the following DQLs without an "outputWalker" would be fine because it would produce the correct results, therefore the literal meaning of
I agree, however note that in the current code there is nothing that would ensure that DISTINCT is added/present in the In case there is a legitimate worry that there might be someone in the userspace that does the (undocumented) due diligence and adds the DISTINCT keyword to their queries when they set Other that that, I admit that there is a potential future improvement, where So, are you ok for me to rename that instance variable, or would you rather that I should not touch it, and instead introduce a new instance variable? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That was a lot to take in for me. After a substantial amount of thought, I would like to propose and get your opinion on the following.
Q1. Are you personally ok with the proposition above?
Q2. Could you please confirm that you meant:
Thanks for mentioning the case of
That is good to know, however note that I do not intend to touch the region of the code that makes the decision whether the walker-based or tree-walker fetching query should be used.
Fair point, however this again is not something that I'm going to change or touch in my PR. Note that if a query uses composite primary keys AND produces duplicate rows (i.e. is not compatible with the "fast" result query), then the Paginator will not able to be used at all, and it's no different to the current situation in the Paginator.
This again is good to know, however I don't intend to auto-detect/guess in this PR which of the two walkers should be used by the result-fetching query. Could you reply to Q1 and Q2, please, and share any other thought you have about what I said above? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
your code provides guessing for There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sorry, are you referring to the following diff in the current PR? private function useOutputWalker(Query $query): bool
{
if ($this->useOutputWalkers === null) {
return (bool) $query->getHint(Query::HINT_CUSTOM_OUTPUT_WALKER) === false;
}
return $this->useOutputWalkers;
} -> private function useOutputWalker(Query $query, bool $forCountQuery = false): bool
{
if ($this->useOutputWalkers !== null) {
return $this->useOutputWalkers;
}
// When a custom output walker already present, then do not use the Paginator's.
if ($query->getHint(Query::HINT_CUSTOM_OUTPUT_WALKER) !== false) {
return false;
}
// When not joining onto *ToMany relations, then do not use the more complex CountOutputWalker.
return ! (
$forCountQuery
// (rest of the diff not relevant)
);
} Do we both agree that this code diff does not result in any change to whether the walker-based or the tree-walker result-fetching query should be used? Or do you refer to the proposition of splitting Note that I started this PR with only the intention of adding code that "if there are any dql-joins present, then use the fast version of the queries". I believe this alone would immediately address 80% of the use cases in the wild. Anything more than this, I consider a nicety. However, I do agree that going the extra mile and checking for *ToMany dql-joins should be easy enough, so as an exception from my original plan, I'm willing to attempt to make that change in the current PR. Anything other than that is a "less than 5% of use cases" matter in my mind, and although I do appreciate us discussing it, I do not intend to address it in this PR. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. when There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I agree.
I agree.
Do we both agree that my code diff does not change the behaviour of
Are you trying to say that because the new If you're not referring to the consequences of the |
||||||
if ( | ||||||
$this->queryStyleAutoDetectionEnabled | ||||||
&& $this->fetchJoinCollection === false | ||||||
d-ph marked this conversation as resolved.
Show resolved
Hide resolved
|
||||||
) { | ||||||
$hintDistinctDefaultTrue = false; | ||||||
} | ||||||
|
||||||
$countQuery->setHint(CountWalker::HINT_DISTINCT, $hintDistinctDefaultTrue); | ||||||
} | ||||||
|
||||||
if ($this->useOutputWalker($countQuery)) { | ||||||
if ($this->useOutputWalker($countQuery, forCountQuery: true)) { | ||||||
$platform = $countQuery->getEntityManager()->getConnection()->getDatabasePlatform(); // law of demeter win | ||||||
|
||||||
$rsm = new ResultSetMapping(); | ||||||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For a property phpdoc, the description is just the phpdoc description, not something attached to the tag (and then, we don't need
@var
at all as there is a native type).