  • See also stackoverflow.com/questions/13153584/…. Not quite a duplicate because there's more to say about movss's weird design and difference from movd.
    Commented May 23, 2016 at 2:58
  • Intel really seems stuck in the assembly age. If the actual instruction shuffles bits without assigning meaning, the C intrinsics should have float and int versions. There's no reason why two intrinsics with different signatures couldn't map to the same instruction.
    – MSalters
    Commented May 23, 2016 at 7:40
  • I know, right? Kind of makes you want to write some new intrinsic inline functions to fill in the gaps.
    – NoelC
    Commented May 23, 2016 at 15:53
  • @MSalters: Intel finally did this for AVX, with both __m256 and __m256i intrinsics for vinsertf128. (vinserti128 is only in AVX2.) Of course, there's not much you can usefully do with __m256i with only AVX1. But that's a great idea. They should absolutely introduce integer intrinsics for shufps, since there's nothing else like it for combining data from two registers until AVX512's vpermt2d (permute 2 vectors, overwriting the Table).
    Commented May 23, 2016 at 21:09
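
For anyone who wants to fill the gap themselves in the meantime, here is a minimal sketch of an integer wrapper around shufps, built only from the existing cast intrinsics. The names `my_shuffle2_epi32`, `low_a_high_b`, and `demo_shuffle2` are made up for illustration and are not Intel intrinsics. The casts are free at the instruction level, but using a float shuffle on integer data may cost extra bypass latency on some microarchitectures.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2: __m128i and the cast intrinsics; also provides _mm_shuffle_ps */

/* Hypothetical integer wrapper around shufps (not an Intel intrinsic).
 * It is a macro because the shuffle control must be a compile-time constant. */
#define my_shuffle2_epi32(a, b, imm8) \
    _mm_castps_si128(_mm_shuffle_ps(_mm_castsi128_ps(a), _mm_castsi128_ps(b), (imm8)))

/* Low two dwords come from a, high two dwords come from b. */
static inline __m128i low_a_high_b(__m128i a, __m128i b)
{
    return my_shuffle2_epi32(a, b, _MM_SHUFFLE(3, 2, 1, 0));
}

/* Small self-check: writes the four result dwords to out. */
void demo_shuffle2(int out[4])
{
    __m128i a = _mm_setr_epi32(0, 1, 2, 3);
    __m128i b = _mm_setr_epi32(10, 11, 12, 13);
    _mm_storeu_si128((__m128i *)out, low_a_high_b(a, b));
    assert(out[0] == 0 && out[1] == 1);    /* elements 0..1 selected from a */
    assert(out[2] == 12 && out[3] == 13);  /* elements 2..3 selected from b */
}
```

The two low result elements are always picked from the first operand and the two high ones from the second, which is the shufps behaviour the comment above refers to.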