Subsentient writes:
"I'm a C programmer and Linux enthusiast. For some time, I've had it on my agenda to build the new version of my i586/Pentium 1 compatible distro, since I have a lot of machines that aren't i686 that are still pretty useful.
Let me tell you, since I started working on this, I've been in hell these last few days! The Pentium Pro was the first chip to support CMOV (Conditional move), and although that was many years ago, lots of chips were still manufactured that didn't support this (or had it broken), including many semi-modern VIA chips, and the old AMD K6.
Just about every package that has to deal with multimedia has lots of inline assembler, and most of it contains CMOV. Most packages let you disable it, either with a switch like ./configure --disable-asm or by tricking the build into thinking your chip doesn't support it, but some of them (like MPlayer, libvpx/vp9) do NOT. This means that although my machines are otherwise full-blown, good, honest x86-32 chips, I cannot use that software at all, because it always builds in bad instructions, thanks to these huge amounts of inline assembly!
Of course, then there's the fact that these packages, which could otherwise possibly build and work on all types of chips, are now limited to what's usually the ARM/PPC/x86 triumvirate (sorry, no SPARC Linux!), and the small issue that inline assembly is not actually supported by the C standard.
Is assembly worth it for the handicaps and trouble that it brings? Personally I am a language lawyer/standard Nazi, so inline ASM doesn't sit well with me for additional reasons."
(Score: 2, Informative) by dacut on Sunday March 09 2014, @02:38AM
It used to be necessary for high-speed integer code due to pointer aliasing. For an example (a tad contrived*, but it displays the issue with brevity):
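A minimal sketch of such a function, assuming a hypothetical name add_all and an int element type (only src and dest come from the text):

```c
#include <stddef.h>

/* Hypothetical example: add the value that src points at to every
 * element of dest. Nothing tells the compiler that src does not point
 * into dest, so each store to dest[i] might change *src. */
void add_all(int *dest, const int *src, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        dest[i] += *src;   /* *src may have to be re-read every iteration */
}
```

Calling it as add_all(a, &a[0], n) is legal, and in that case the first store really does change the value being added, which is why the compiler cannot simply assume the load is loop-invariant.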
You might have intended for src and dest to be completely different arrays, but the compiler doesn't know that. Instead, it reloads *src on each iteration through the loop, e.g.:
Back in ye olden days, you would time this, scratch your head, examine the assembly output, curse, and then recode this in assembly to move the reload instruction outside of the loop. (If there was a Fortran guy in the house, he would laugh at you and your pitiful C compiler to rub some salt into the wound.)
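The hoisting described above can also be sketched in plain C by copying *src into a local before the loop. The name add_all_hoisted is hypothetical, and this rewrite only behaves like the per-iteration reload when src does not point into dest:

```c
#include <stddef.h>

/* Hypothetical hand-hoisted variant: read *src once, then reuse it. */
void add_all_hoisted(int *dest, const int *src, size_t n)
{
    const int addend = *src;   /* single load, done before the loop */
    for (size_t i = 0; i < n; ++i)
        dest[i] += addend;     /* no memory read of *src in the loop body */
}
```

If src happens to alias an element of dest, this version adds the original value throughout, so it is a change in behavior for that case and not just an optimization.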
Thankfully, C99 (sadly not C++11, but available as a nonstandard extension in almost every C++ compiler) adds the restrict qualifier, which lets you promise the compiler that the pointers do not alias and so fixes this issue:
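A minimal sketch of the restrict-qualified form, assuming a hypothetical function name add_all_restrict and an int element type:

```c
#include <stddef.h>

/* Hypothetical example: restrict promises that, for the duration of the
 * call, dest and src do not refer to overlapping objects, so the
 * compiler is free to load *src once and keep it in a register. */
void add_all_restrict(int *restrict dest, const int *restrict src, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        dest[i] += *src;
}
```

The promise is the caller's to keep: passing a src that points into dest is undefined behavior here. In C++ the same idea is usually spelled __restrict, which GCC, Clang, and MSVC all accept as an extension.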
The generated assembly (on gcc 4.8.1, at least) ends up moving the load outside of the loop. Huzzah! No need for assembly here!
* There are cases where this might not be so contrived; for example, if you have to code to a certain interface so this can be used as a callback function.