Christoph Mallon authored
For architectures without a swap instruction (all except the general-purpose register set on amd64 and ia32), this results in shorter code. In many cases (probably all except swapping two registers), it is also better on amd64/ia32, because it needs fewer uops and modern processors eliminate register-to-register mov instructions at register renaming.