As far as I know, when you use one of the equality or inequality operators, most synthesis tools will create a comparator as wide as the bit vectors being compared. Most FPGAs have carry chains that are optimized for addition, subtraction, and comparison. These carry chains require all the bits to be placed in physically adjacent, aligned logic cells in the FPGA fabric.
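To make the link between comparison and a carry chain concrete, here is a minimal bit-level sketch in Python. It is only a software model, not what any particular synthesis tool emits, and the function names `less_than_ripple` and `equal_ripple` are made up for illustration. The idea is that `a < b` can be decided from the final carry of `a + ~b + 1`, and `a == b` from a carry that only survives while every bit pair matches.

```python
def less_than_ripple(a: int, b: int, width: int) -> bool:
    """Unsigned a < b, decided by rippling a carry from LSB to MSB."""
    carry = 1  # carry-in of 1 turns a + ~b into a - b (two's complement)
    for i in range(width):
        a_bit = (a >> i) & 1
        nb_bit = ((b >> i) & 1) ^ 1  # inverted bit of b
        # Carry-out of a one-bit full adder: generate OR (propagate AND carry-in)
        carry = (a_bit & nb_bit) | (carry & (a_bit ^ nb_bit))
    # Final carry is 1 when a >= b and 0 when a < b
    return carry == 0


def equal_ripple(a: int, b: int, width: int) -> bool:
    """a == b, using the chain as a wide AND of per-bit matches."""
    carry = 1
    for i in range(width):
        differs = ((a >> i) ^ (b >> i)) & 1
        carry &= differs ^ 1  # carry is killed by the first mismatching bit
    return carry == 1
```

Each loop iteration stands in for one bit position on the chain, which is why the comparator grows with the width of the vectors.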
For more information, check out the MachXO2 datasheet: http://www.latticesemi.com/~/media/LatticeSemi/Documents/DataSheets/MachXO23/MachXO2FamilyDataSheet.pdf
The MachXO2 logic is made up of PFUs (programmable function units):
Each PFU contains eight 4-input lookup tables (LUTs) and eight flip-flops along a high-speed carry chain, divided into 4 slices. The datasheet includes a diagram of the slice structure.
In the datasheet's slice diagram, the carry chain runs straight through all the slices in the PFU. Multiple PFUs can also be chained together for carry-chain functions longer than 8 bits.
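As a rough illustration of chaining, the sketch below splits a wide comparison into 8-bit segments and hands each segment's carry-out to the next one as its carry-in. The 8-bits-per-PFU figure comes from the description above; the names `PFU_BITS`, `pfus_needed`, `segment_carry`, and `less_than_chained` are hypothetical, and the actual mapping chosen by a tool may differ.

```python
import math

PFU_BITS = 8  # assumption: one PFU covers 8 bits of carry chain, as described above


def pfus_needed(width: int) -> int:
    """Number of 8-bit segments needed to cover a comparator of this width."""
    return math.ceil(width / PFU_BITS)


def segment_carry(a_seg: int, b_seg: int, carry_in: int, bits: int = PFU_BITS) -> int:
    """Ripple the carry of a + ~b through one segment and return its carry-out."""
    carry = carry_in
    for i in range(bits):
        a_bit = (a_seg >> i) & 1
        nb_bit = ((b_seg >> i) & 1) ^ 1  # inverted bit of b
        carry = (a_bit & nb_bit) | (carry & (a_bit ^ nb_bit))
    return carry


def less_than_chained(a: int, b: int, width: int) -> bool:
    """Unsigned a < b, with the carry passed from segment to segment."""
    carry = 1  # carry-in of 1 makes the chain compute a - b
    for seg in range(pfus_needed(width)):
        shift = seg * PFU_BITS
        bits = min(PFU_BITS, width - shift)  # last segment may be partial
        mask = (1 << bits) - 1
        carry = segment_carry((a >> shift) & mask, (b >> shift) & mask, carry, bits)
    return carry == 0  # final carry of 0 means a < b
```

Under these assumptions a 32-bit comparison spans four segments, and the result is only known once the carry has passed through all of them.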