Today I was asked a question about database indexes and was completely stumped, so I searched online and compiled these notes. The question was:
If a composite index (multi-column index) is created on columns A and B in a table, will this index be used when I query column B?
The answer becomes quite simple after organizing the key points.
What is an index?
An index is a data structure that stores the values of one or more columns of a table in a form that is fast to search. It is typically a B-tree, though hash indexes and R-trees also exist; we'll only discuss B-trees here. An index is created on specific columns of a table and contains only the values of those columns, kept in a single data structure.
When querying a column without an index, every record has to be compared. With an index, the values are already sorted, so a quick search finds all the matching entries. For example, when searching for "Chris" in a list of names without an index, the whole list has to be checked, right down to the last row, because another "Chris" might still be there. With an index, the data is sorted and all the "Chris" values sit next to each other, which makes the query very efficient.
An index stores a pointer to a specific row in a table.
Once we have found an entry in the index that matches the value we are looking for, how do we get the other columns of that record? It's quite simple: a database index also stores pointers to the corresponding rows in the table. Here a pointer is a reference that records where the row's data is stored on disk. So, in addition to the column values, an index also stores a pointer to each row's data.
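The two ideas above (sorted values, plus a pointer back to the row) can be sketched in a few lines of Python. This is only a toy model: a real database uses a B-tree on disk rather than a Python list, and the names `rows`, `name_index`, and `find_by_name` are made up for this illustration.

```python
import bisect

# Toy "table": the list position plays the role of the row's location on disk.
rows = [
    {"name": "Mike", "age": 17},
    {"name": "Chris", "age": 30},
    {"name": "Anna", "age": 25},
    {"name": "Chris", "age": 41},
]

# Toy "index" on name: sorted (value, row_pointer) pairs.
name_index = sorted((row["name"], pos) for pos, row in enumerate(rows))


def find_by_name(name):
    """Binary-search the sorted index, then follow the pointers to the rows."""
    start = bisect.bisect_left(name_index, (name, -1))
    matches = []
    for value, pos in name_index[start:]:
        if value != name:
            break  # sorted order: once the value changes, there are no more matches
        matches.append(rows[pos])
    return matches


chris_rows = find_by_name("Chris")
```

Because the entries are sorted, both "Chris" entries are adjacent in `name_index`, and the lookup stops as soon as it reaches a different name instead of walking the whole table.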
How does a database know when to use an index?
SELECT * FROM Employee WHERE Employee_Name = 'Chris'
When this SQL statement runs, the database checks if there are indexes on the columns being queried. Assuming an index has indeed been created on the Employee_Name column, the database will then check whether using this index for the query is appropriate – because in some scenarios, using an index is less efficient than a full table scan. The index will be used if it is deemed appropriate.
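Most databases let you ask which plan the optimizer picked; in MySQL this is done by putting EXPLAIN in front of the query. The sketch below uses Python's built-in sqlite3 module instead, purely so that it runs without a database server. That substitution is an assumption of this example: SQLite's planner and its output format differ from MySQL's, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (Employee_Name TEXT, Employee_Age INTEGER)")
conn.execute("CREATE INDEX name_index ON Employee (Employee_Name)")
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?)",
    [("Chris", 30), ("Mike", 17), ("Anna", 25)],
)

# Each plan row ends with a human-readable description of one step.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM Employee WHERE Employee_Name = 'Chris'"
).fetchall()
plan_text = " ".join(row[-1] for row in plan)
print(plan_text)  # mentions name_index when the planner chose the index
```

Reading the plan like this is the practical way to check whether an index is actually being used, rather than assuming it is.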
How to create an index
CREATE INDEX name_index ON Employee (Employee_Name)
If it is a composite index (multi-column index)
CREATE INDEX name_age_index ON Employee (Employee_Name, Employee_Age)
What is the cost of indexing?
So, what are the disadvantages of using database indexes? First, indexes consume space: the larger your table, the more space the index will occupy. Second, there is a performance penalty, mainly for write operations. When you add, delete, or update rows in a table, the database has to make the corresponding change in every index on that table as well.
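The write penalty can be illustrated with a toy model in which every insert has to touch the table and each of its indexes. This is a sketch under simplifying assumptions (in-memory sorted lists rather than on-disk B-trees), and `table`, `index_by_name`, `index_by_age`, and `insert_row` are hypothetical names.

```python
import bisect

table = []           # row storage
index_by_name = []   # sorted (name, row_pointer) pairs
index_by_age = []    # sorted (age, row_pointer) pairs


def insert_row(name, age):
    """Append the row, then update every index; return how many structures were written."""
    table.append((name, age))
    pos = len(table) - 1
    writes = 1  # the table itself
    for index, key in ((index_by_name, name), (index_by_age, age)):
        bisect.insort(index, (key, pos))  # keep the index sorted
        writes += 1
    return writes


writes_for_one_insert = insert_row("Chris", 30)
insert_row("Anna", 25)
```

With two indexes, one logical insert turns into three writes, and each extra index adds one more.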
Single-column index vs. multi-column index
Suppose we want to filter the people table by firstname, lastname, and age at the same time. How should we create the indexes?
SELECT id FROM people WHERE firstname='Mike' AND lastname='Sullivan' AND age=17;
We can consider creating an index on a single column, such as the firstname, lastname, or age column. If we create an index on the firstname column, MySQL will quickly narrow the search to records where firstname = 'Mike', and then perform further searches on this "intermediate result set": it first excludes records where lastname is not equal to "Sullivan", then excludes records where age is not equal to 17. Once all search criteria are met, MySQL returns the final search results.
Because an index was created on the firstname column, MySQL's efficiency was significantly improved compared to performing a full table scan. However, the number of records that MySQL needs to scan still far exceeds the actual requirement. Although we could delete the index on the firstname column and then create an index on the lastname or age column, overall, the search efficiency would remain similar regardless of which column the index was created on.
At this point, we need to create a multi-column index for firstname, lastname, and age.
If we create single-column indexes on the firstname, lastname, and age columns, will the effect be the same as creating a multi-column index on firstname, lastname, and age?
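The answer is no; they are completely different. For a query like this, MySQL usually uses only one index per table. If you have three single-column indexes, MySQL will try to choose the most restrictive one. (Its index merge optimization can sometimes combine several single-column indexes, but that is generally less efficient than one well-chosen multi-column index.) Either way, even the most restrictive single-column index is far less restrictive than a multi-column index on the firstname, lastname, and age columns.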
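To make "restrictive" concrete, the sketch below counts how many candidate rows remain after filtering on each single column, compared with filtering on all three at once. It only models the row counts, not a real index, and the sample data in `people` is invented for illustration.

```python
people = [
    ("Mike", "Sullivan", 17),
    ("Mike", "Sullivan", 45),
    ("Mike", "Jones", 17),
    ("Mike", "Brown", 62),
    ("Anna", "Sullivan", 17),
    ("Chris", "Sullivan", 33),
]

target = ("Mike", "Sullivan", 17)

# Rows still to be checked after using just one single-column index:
by_firstname = [p for p in people if p[0] == target[0]]
by_lastname = [p for p in people if p[1] == target[1]]
by_age = [p for p in people if p[2] == target[2]]

# Rows left after using an index that covers all three columns:
by_all_three = [p for p in people if p == target]

candidates = {
    "firstname": len(by_firstname),
    "lastname": len(by_lastname),
    "age": len(by_age),
    "firstname+lastname+age": len(by_all_three),
}
```

In this made-up data set, every single-column choice leaves three or four rows to re-check, while the three-column lookup lands directly on the one matching row.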
Leftmost prefix in a multi-column index
Multi-column indexes have another advantage, which is manifested through a concept called leftmost prefixing. Continuing with the previous example, we now have a multi-column index on the firstname, lastname, and age columns, which we call fname_lname_age. MySQL will use the fname_lname_age index when the search criteria are combinations of the following columns:
- firstname, lastname, age
- firstname, lastname
- firstname
In other words, it's equivalent to us creating indexes on the column combinations (firstname, lastname, age), (firstname, lastname), and (firstname).
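Why only leftmost prefixes work follows from the sort order of the index. Here is a minimal sketch, assuming the index can be modeled as a sorted list of tuples (a real B-tree keeps the same ordering); `prefix_lookup` and `scan_by_lastname` are hypothetical helper names and the data is invented.

```python
import bisect

# Toy composite index on (firstname, lastname, age): tuples kept in sorted order.
fname_lname_age = sorted([
    ("Anna", "Sullivan", 17),
    ("Chris", "Brown", 33),
    ("Mike", "Jones", 17),
    ("Mike", "Sullivan", 17),
    ("Mike", "Sullivan", 45),
])


def prefix_lookup(prefix):
    """Binary-search for entries whose leading columns equal `prefix`."""
    start = bisect.bisect_left(fname_lname_age, prefix)
    result = []
    for entry in fname_lname_age[start:]:
        if entry[:len(prefix)] != prefix:
            break  # matches for a leftmost prefix are contiguous
        result.append(entry)
    return result


def scan_by_lastname(lastname):
    """Not a leftmost prefix: matches are scattered, so every entry is checked."""
    return [e for e in fname_lname_age if e[1] == lastname]


mikes = prefix_lookup(("Mike",))
mike_sullivans = prefix_lookup(("Mike", "Sullivan"))
sullivans = scan_by_lastname("Sullivan")
```

Entries sharing a firstname, or a firstname and lastname, sit next to each other, so a binary search can jump straight to them. Entries sharing only a lastname are spread across the whole list, which is why a filter on lastname alone cannot take the same shortcut.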
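So you have the answer! With a composite index on columns A and B, a query that filters only on column B does not match a leftmost prefix of the index, so the index generally cannot be used to look up the matching rows directly.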
Original articles on this site are licensed under "Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0)". Please retain the following attribution when sharing or adapting:
Original author: Jake Tao, source: Can a multi-column index on a database be used to query a single column?
Comment list (1 item)
I have a question. Suppose a data table has both a multi-column index (fname_lname_age index) and three single-column indexes (firstname, lastname, age). When using a WHERE clause to filter queries based on firstname, which index will the database use?