Provide an ARM specific a_ctz_32 helper function for architecture versions for which it can be implemented efficiently via the "rbit" instruction (ie all Thumb-2 capable versions of ARM v6 and above).