In MySQL, it is commonly understood that creating an index for each column may negatively impact query performance. Therefore, we typically adhere to two straightforward guidelines when choosing non-compounding indexes:
Now, let's introduce an extra guideline. Consider an example of an Orders table that is properly indexed for its key fields. This Orders table is then joined with both a Products table and a Warehouses table using a left join. Each of these tables is appropriately indexed.
Next, we establish a connection between this sequence of tables and an additional Vendors table using another left-join. This table contains only three records, as our online store example comprises of three vendors. The Vendors table comprises of the following columns: vendor_id (integer), vendor_name (varchar 255), vendor_addr (varchar 255), and vendor_terms (text).
Our left join query looks like this:
select ..., vendor_name from orders
left join products ...
left join warehouses ...
left join vendors on orders.vendorid=vendors.vendorid
where order_status=2;
Upon initial examination, it may appear unnecessary to index the vendor_name field. Given that the Vendors table contains just 3 records, the impact would seem minimal. Furthermore, since the vendor_name field is not utilized in the Where clause and is not involved in sorting, there seems to be even less urgency.
It is noteworthy that the query is not significantly affected by the number of records in the Vendors table, but rather by the width of the table. The presence of 3 records serves as a vertical multiplier, however, even with just one record, the joined table becomes considerably wider.
Even though the vendor_name is not specified in the Where clause, the primary Orders table is designed to extract only the necessary information, such as the name of the vendor, from the Vendors table which is left joined. However, due to variable-length text fields causing misalignment in the storage of the Vendors table, the left join query has to "search" for the start of the vendor_name field in a row without the assistance of an index.
In our hypothetical example, where we are not using scientific methodology, the query took 0.7 seconds to execute without indexing the vendor_name, and only 0.001 seconds with the index. When we simplified the left join to only involve the Orders and Vendors tables, the non-indexed query time reduced to 0.5 seconds. It is important to note that the actual execution time may vary depending on factors such as the number of records, table structure, and hardware parameters. Interestingly, MySQL's "Explain" statement does not recognize the impact of this index.