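The table definition itself is not reproduced here. As a rough sketch, the queries that follow would run against a table along these lines. Only the created_at column and its index come from the text; the id type, the title column, and the TIMESTAMP type are assumptions (a key_len of 4 in the EXPLAIN output is consistent with a non-null TIMESTAMP).

```sql
-- Hypothetical definition: only created_at and its index are taken from the text.
CREATE TABLE todos (
  id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  title VARCHAR(255) NOT NULL,                              -- assumed column, for illustration
  created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
  INDEX created_at (created_at)                             -- name matches the key shown by EXPLAIN
);
```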
In this table, we've added an index to the created_at column to quickly filter by that timestamp. When we query against the created_at column to find records created in the last 24 hours, we see that MySQL is using the index as we'd expect:
EXPLAIN SELECT * FROM todos WHERE created_at > NOW() - INTERVAL 24 HOUR;
-- | id | type  | possible_keys | key        | key_len | ref | rows | filtered | Extra                 |
-- |----|-------|---------------|------------|---------|-----|------|----------|-----------------------|
-- | 1  | range | created_at    | created_at | 4       |     | 1    | 100.00   | Using index condition |
However, if we wrap this column in a function, we're obfuscating the column from MySQL, and it can no longer use the index.
EXPLAIN SELECT * FROM todos WHERE YEAR(created_at) = 2023;
-- | id | type | possible_keys | key | key_len | ref | rows  | filtered | Extra       |
-- |----|------|---------------|-----|---------|-----|-------|----------|-------------|
-- | 1  | ALL  |               |     |         |     | 39746 | 100.00   | Using where |
In some cases, there are ways around index obfuscation. In this example, we could use a range scan instead of the YEAR function to obtain the same result.
EXPLAIN SELECT * FROM todos WHERE created_at BETWEEN '2023-01-01 00:00:00' AND '2023-12-31 23:59:59';
-- | id | type  | possible_keys | key        | key_len | ref | rows | filtered | Extra                 |
-- |----|-------|---------------|------------|---------|-----|------|----------|-----------------------|
-- | 1  | range | created_at    | created_at | 4       |     | 1    | 100.00   | Using index condition |
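One caveat with the BETWEEN form: if created_at stored fractional seconds, an upper bound of 23:59:59 would miss rows stamped in the final second of the year. A half-open range sidesteps that while still comparing the bare column, so it should remain eligible for the same range scan. This is a sketch of an alternative, not something the query above requires.

```sql
-- Half-open range: covers all of 2023 and excludes 2024-01-01 00:00:00 onward.
EXPLAIN SELECT * FROM todos
WHERE created_at >= '2023-01-01 00:00:00'
  AND created_at <  '2024-01-01 00:00:00';
```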
Redundant conditions in MySQL
What are redundant conditions?
Let's take a look at a contrived example to illustrate the point. In this example, we're selecting the todos with an id of less than five.
SELECT * FROM
  todos
WHERE
  id < 5 AND
  id < 10 -- This does... nothing
In this case, the redundant condition is id < 10: every row with an id below five already satisfies it, so removing it would not change the result.
Use case
We're going to expand our todos table definition a little bit to add due_date and due_time columns. (Storing date and time separately is usually not advised, but it helps us prove the point.)
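The expanded definition is not shown here, so the following is only a sketch of what it might look like. The due_date and due_time columns and the index on due_date come from the text; the column types, the NOT NULL constraints, and the title column are assumptions (a key_len of 3 in the later EXPLAIN output is consistent with a non-null DATE).

```sql
-- Hypothetical expanded definition; types and constraints are assumed.
CREATE TABLE todos (
  id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  title VARCHAR(255) NOT NULL,                              -- assumed column, for illustration
  created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
  due_date DATE NOT NULL,
  due_time TIME NOT NULL,
  INDEX created_at (created_at),
  INDEX due_date (due_date)                                 -- name matches the key shown by EXPLAIN
);
```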
Given this table, if you want to query for todos that are due in the next day, you're stuck using the ADDTIME function:
EXPLAIN SELECT * FROM
  todos
WHERE
  ADDTIME(due_date, due_time) BETWEEN NOW() AND NOW() + INTERVAL 1 DAY;
-- | id | type | possible_keys | key | key_len | ref | rows  | filtered | Extra       |
-- |----|------|---------------|-----|---------|-----|-------|----------|-------------|
-- | 1  | ALL  |               |     |         |     | 39746 | 100.00   | Using where |
We do have an index on due_date, but the index cannot be used because we're performing an operation on it (adding the time). There is no easy way to de-obfuscate this column either since the due_time is different for every row.
To work around this, let's add a redundant condition on due_date alone. When adding the condition, we need to make sure that it's logically impossible to change the result set, which means our redundant condition should be broader than our actual condition.
EXPLAIN SELECT * FROM
  todos
WHERE
  -- The real condition
  ADDTIME(due_date, due_time) BETWEEN NOW() AND NOW() + INTERVAL 1 DAY
  AND
  -- The redundant condition
  due_date BETWEEN CURRENT_DATE AND CURRENT_DATE + INTERVAL 1 DAY;
-- | id | type  | possible_keys | key      | key_len | ref | rows | filtered | Extra                              |
-- |----|-------|---------------|----------|---------|-----|------|----------|------------------------------------|
-- | 1  | range | due_date      | due_date | 3       |     | 1    | 100.00   | Using index condition; Using where |
MySQL will first use the index to eliminate most of the table, then the slower ADDTIME will be used to eliminate the few remaining false positives. The redundant condition is doing its job perfectly!
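The redundant condition is only broader than the real one if due_time always holds a time of day between 00:00:00 and 23:59:59. That is an assumption about the data, since a MySQL TIME column can also hold negative values and values beyond 24 hours. One way to sanity-check it is to count the rows that the real condition accepts but the redundant condition would reject; a sketch:

```sql
-- Rows matched by the real condition but excluded by the redundant one.
-- If the redundant condition is truly broader, this count is 0.
SELECT COUNT(*) AS dropped_by_redundant_condition
FROM todos
WHERE ADDTIME(due_date, due_time) BETWEEN NOW() AND NOW() + INTERVAL 1 DAY
  AND NOT (due_date BETWEEN CURRENT_DATE AND CURRENT_DATE + INTERVAL 1 DAY);
```

This check scans the whole table, so it is meant as a one-off verification rather than something to run alongside the optimized query.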
Domain-specific redundant conditions
In the case of our todos table, let's add an updated_at column that will be populated with the timestamp of the last time the record was changed.
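The statement that adds the column is not shown here. One common way to get this behavior in MySQL is an automatically updating TIMESTAMP; the sketch below assumes that approach and that column type, neither of which is confirmed by the text.

```sql
-- Hypothetical migration: updated_at is set on insert and refreshed on every update.
-- Note that no index is added on updated_at.
ALTER TABLE todos
  ADD COLUMN updated_at TIMESTAMP NOT NULL
    DEFAULT CURRENT_TIMESTAMP
    ON UPDATE CURRENT_TIMESTAMP;
```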
In this scenario, we still only have an index on created_at, but if we want to query against updated_at, we might be able to add a redundant condition based on our knowledge of the application. If, given our understanding of the application, we can be sure that created_at is always equal to or earlier than updated_at, we can use this to our advantage.
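The reasoning is a simple chain: if created_at <= updated_at holds for every row, then any row with updated_at < X must also have created_at < X, so adding the second comparison cannot remove a matching row. Because everything rests on that invariant, it may be worth checking the existing data before relying on it. A minimal sketch:

```sql
-- Counts rows that violate the assumed invariant created_at <= updated_at.
-- The redundant condition on created_at is only safe if this returns 0.
SELECT COUNT(*) AS violations
FROM todos
WHERE created_at > updated_at;
```

A result of 0 only describes the data as it is now; the application still has to keep the invariant true for future writes.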
This query, which looks for records that were last modified before January 1st of 2023, will scan the entire table because there is no index on updated_at:
SELECT * FROM
  todos
WHERE
  updated_at < '2023-01-01 00:00:00'
This query will return the same results but uses the created_at index to eliminate records and then filters out the false positives.
SELECT * FROM
  todos
WHERE
  updated_at < '2023-01-01 00:00:00' AND
  created_at < '2023-01-01 00:00:00'
https://planetscale.com/blog/redundant-and-approximate-conditions
Let's say you have a todos table with a created_at column that records a timestamp of when the record was created.
Obfuscated indexes