25 Sep

spirv-stats update – exposing more information

In a previous post I introduced spirv-stats, a little command line tool I’ve been working on. Since that initial version, I’ve extended the information the tool reports, prompted by some questions I had about the SPIR-V binaries we were using. The changes are in the GitHub pull request here.

For the most commonly used opcodes, I’ve tried to break them down to understand a little more about the shape of the opcodes.
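
All of the breakdowns below rest on the same bit of SPIR-V layout: a module is a five-word header followed by instructions, and the first word of each instruction packs the word count in its high 16 bits and the opcode in its low 16 bits. As a rough Python sketch of that walk (this is not the tool's own code, and the names iter_instructions and opcode_histogram are made up for illustration):

```python
from collections import Counter

SPIRV_MAGIC = 0x07230203


def iter_instructions(words):
    """Yield (opcode, operands) for each instruction in a list of 32-bit words.

    `operands` excludes the first word of the instruction.
    """
    if words[0] != SPIRV_MAGIC:
        raise ValueError("not a SPIR-V module (or wrong endianness)")
    i = 5  # skip magic, version, generator, bound, schema
    while i < len(words):
        word_count = words[i] >> 16
        opcode = words[i] & 0xFFFF
        if word_count == 0:
            raise ValueError("malformed instruction with zero word count")
        yield opcode, words[i + 1 : i + word_count]
        i += word_count


def opcode_histogram(words):
    """Return (hits, bytes) counters keyed by opcode number."""
    hits, size = Counter(), Counter()
    for opcode, operands in iter_instructions(words):
        hits[opcode] += 1
        size[opcode] += 4 * (1 + len(operands))
    return hits, size
```

The per-opcode hit and byte columns in the sample output further down are this kind of histogram; the extra breakdowns come from looking at the operands of specific opcodes.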

OpLoad & OpStore

OpLoad and OpStore both take an optional memory access operand. So I wondered, given the SPIR-V shaders we have as input, how many of the OpLoads and OpStores actually carry that optional operand? It turns out none of them do!
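
Because the operand is optional, its presence shows up purely in the instruction's word count: an OpLoad without it is four words, an OpStore three. A minimal sketch of that check, assuming you already have the first word of the instruction (has_memory_access is a hypothetical helper name, not something from spirv-stats):

```python
OP_LOAD, OP_STORE = 61, 62

# Word counts when the optional memory access operand is absent:
# OpLoad  = first word, result type, result id, pointer
# OpStore = first word, pointer, object
BASE_WORDS = {OP_LOAD: 4, OP_STORE: 3}


def has_memory_access(first_word):
    """True if an OpLoad/OpStore carries the optional memory access operand."""
    opcode = first_word & 0xFFFF
    word_count = first_word >> 16
    return word_count > BASE_WORDS[opcode]
```

This agrees with the byte columns in the sample output: 49915 loads at 16 bytes each is 798640 bytes, and 29233 stores at 12 bytes each is 350796 bytes.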

OpDecorate & OpMemberDecorate

For OpDecorate the first thing I wanted to know was how many of the decorations used had any additional literals. It turns out that 71% of OpDecorates have no additional literal, and the remaining 29% have one additional literal. The next query I had was which decorations were used most in the SPIR-V shaders. The most used was RelaxedPrecision (decoration 0 in the output below), which alone accounts for 66% of the uses of OpDecorate. The next most used was Location (decoration 30) with 11%.

I then extended these checks to OpMemberDecorate, and it turns out that 90% of decorations on OpMemberDecorate have one literal! The reason is that a cool 85% of the decorations used on OpMemberDecorate are Offset (decoration 35), which encodes the offset of a struct member.
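
The output reports decorations by number, so it helps to have the names to hand. The sketch below counts literals and decoration kinds for both opcodes; it assumes the instructions arrive as (opcode, operands) pairs with the first word stripped, and summarize_decorations is an illustrative name rather than anything in the tool. The numbers in the table are the Decoration enumerants from the SPIR-V specification that appear in the sample output.

```python
from collections import Counter

OP_DECORATE, OP_MEMBER_DECORATE = 71, 72

# Only the decorations that appear in the sample output are listed here;
# anything else is reported by its number.
DECORATION_NAMES = {
    0: "RelaxedPrecision", 2: "Block", 3: "BufferBlock", 4: "RowMajor",
    6: "ArrayStride", 7: "MatrixStride", 11: "BuiltIn", 14: "Flat",
    15: "Patch", 18: "Invariant", 24: "NonWritable", 30: "Location",
    33: "Binding", 34: "DescriptorSet", 35: "Offset",
}


def summarize_decorations(instructions):
    """Count literals and decoration kinds, separately per opcode."""
    literal_counts = {OP_DECORATE: Counter(), OP_MEMBER_DECORATE: Counter()}
    kinds = {OP_DECORATE: Counter(), OP_MEMBER_DECORATE: Counter()}
    for opcode, operands in instructions:
        if opcode == OP_DECORATE:
            # operands: target id, decoration, literals...
            decoration, literals = operands[1], operands[2:]
        elif opcode == OP_MEMBER_DECORATE:
            # operands: struct type id, member index, decoration, literals...
            decoration, literals = operands[2], operands[3:]
        else:
            continue
        literal_counts[opcode][len(literals)] += 1
        kinds[opcode][DECORATION_NAMES.get(decoration, str(decoration))] += 1
    return literal_counts, kinds
```

With the names filled in, the one-literal totals make sense: for OpDecorate, ArrayStride, BuiltIn, Location, Binding and DescriptorSet add up to exactly the 6931 hits with one literal, and for OpMemberDecorate, MatrixStride, BuiltIn and Offset add up to the 12901.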

OpAccessChain

OpAccessChain can have an arbitrarily long list of IDs used to index into the pointer object. So I wondered how many of these were using a small number of indices? It turns out that 78% of the uses of OpAccessChain have only one index, 19% have two indices, and a mere 2% have three indices.
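
The index count falls straight out of the word count: an OpAccessChain has four fixed words (the first word, result type, result ID and base pointer) and everything after that is an index. A small sketch of the bucketing, with the 5+ cap chosen to match the output below and access_chain_histogram being a made-up name:

```python
from collections import Counter

OP_ACCESS_CHAIN = 65
FIXED_WORDS = 4  # first word, result type, result id, base pointer


def access_chain_histogram(first_words, cap=5):
    """Bucket OpAccessChain instructions by index count.

    `first_words` is the first word of every instruction in the module;
    counts of `cap` or more are folded into the `cap` bucket.
    """
    buckets = Counter()
    for word in first_words:
        if word & 0xFFFF != OP_ACCESS_CHAIN:
            continue
        indices = (word >> 16) - FIXED_WORDS
        buckets[min(indices, cap)] += 1
    return buckets
```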

OpVariable

I wondered how many of the variables in our SPIR-V shaders had initializers (that is, an initial value). None of them do! Of all 19041 uses of OpVariable, not one had an initializer.

OpConstant

Of the constants used in the SPIR-V shaders, all of them use exactly one literal. Each literal is one 32-bit word, so only types wider than 32 bits would need a second one. This is unsurprising because int64/double types are not widely supported or used in shaders, but I wanted to be sure.
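
To make the one-literal versus two-literal distinction concrete, here is a hedged sketch of how an OpConstant would be laid out for 32-bit and 64-bit values. The helper encode_constant is hypothetical and only handles types whose size is a multiple of 32 bits; for wider types SPIR-V stores the low-order word first, which is what a little-endian unpack gives.

```python
import struct

OP_CONSTANT = 43


def encode_constant(result_type, result_id, value, fmt):
    """Return the words of an OpConstant.

    `fmt` is a little-endian struct format for a 32-bit or 64-bit type,
    for example "<f", "<i", "<d" or "<q".
    """
    raw = struct.pack(fmt, value)
    if len(raw) % 4:
        raise ValueError("only 32-bit and 64-bit formats are handled here")
    # One literal word per 32 bits of value, low-order word first.
    literals = list(struct.unpack(f"<{len(raw) // 4}I", raw))
    word_count = 3 + len(literals)  # first word, result type, result id
    return [(word_count << 16) | OP_CONSTANT, result_type, result_id, *literals]
```

A float constant comes out as four words (16 bytes) and a double as five. In the sample output, 10823 constants at 16 bytes each gives the reported 173168 bytes, which matches every constant having a single literal.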

OpVectorShuffle

The first thing I wanted to know about OpVectorShuffle was how many literals were being used when shuffling the vectors. Remember that the number of literals corresponds to the width of the output vector. It turns out that 31% of shuffles have two literals (a common case when extracting the coordinates for an image sample from a vec4), 45% of shuffles have three literals, and 23% have four literals.

The next question I had was to do with the undef literal that can be used in a shuffle. 0xFFFFFFFFu (-1 in signed) can be used to signify that that element of the resulting vector is undefined. I wondered if the SPIR-V shaders we had were using this? It turns out none of them are (currently at least).

The next question I had was how many shuffles were using only literals lower than 4, and how many only literals lower than 8. These two numbers correspond to shuffling an individual vec4, or shuffling two vec4s. 82% of the shuffles use only literals lower than 4, so these could either be shuffling two vec2s together, or one vec4. The remaining 18% use literals up to 7.

The next question then is how many OpVectorShuffles are using the same vector ID for both vector operands. This pattern is used when you actually only want to shuffle elements from the one vector. Well, it turns out exactly 82% of shuffles were using the same vector for both!
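
All of these shuffle questions can be answered from the operands of a single instruction: result type, result ID, the two vector IDs, then one component literal per output element. A sketch of the classification follows; classify_shuffle is an illustrative name, and leaving undef components out of the range checks is my assumption, not necessarily what the tool does.

```python
UNDEF_COMPONENT = 0xFFFFFFFF


def classify_shuffle(operands):
    """Classify one OpVectorShuffle from its operands (first word excluded)."""
    _result_type, _result_id, vector1, vector2, *components = operands
    # Assumption: undef components are ignored when checking index ranges.
    defined = [c for c in components if c != UNDEF_COMPONENT]
    return {
        "width": len(components),
        "has_undef": UNDEF_COMPONENT in components,
        "all_below_4": all(c < 4 for c in defined),
        "all_below_8": all(c < 8 for c in defined),
        "same_vector": vector1 == vector2,
    }
```

Note that the output below appears to report the "< 8" line as the shuffles that did not already fit the "< 4" line (the two add up to 100%), whereas all_below_8 in this sketch is also true for anything that is all_below_4.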

OpCompositeExtract & OpCompositeConstruct

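The last two opcodes I have looked at so far are OpCompositeExtract and OpCompositeConstruct. For both I wondered what the most common number of literals was. For extract, 97% use exactly one literal and 3% use two. For construct, 17% fall in the one-literal bucket, 41% in the two-literal bucket and 41% in the three-literal bucket. Note that OpCompositeConstruct’s operands are constituent IDs rather than literals, and the byte total in the sample output below (166680 bytes for 6678 hits) only adds up if each instruction has one more constituent than its bucket says, so these are most likely two, three and four constituents.

Also, for extract, I wondered how many of the extracts were being used to pull a single element out of a vector. So I checked how many times the literal was zero to three. Roughly a quarter each access the zeroth, first and second elements (27%, 25% and 25%), and 20% the third.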
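
A quick way to sanity-check bucket counts like these is to recompute the byte total the tool reports from the histogram itself: every instruction is its fixed words plus its variable operands, at four bytes per word. The sketch below does that for the two composite opcodes using the figures from the sample output; bytes_from_histogram is a name made up for this example.

```python
def bytes_from_histogram(fixed_words, histogram):
    """Total instruction bytes implied by {variable operand count: hits}."""
    return 4 * sum(hits * (fixed_words + n) for n, hits in histogram.items())


# OpCompositeExtract: first word, result type, result id, composite, then literals.
extract = bytes_from_histogram(4, {1: 9265, 2: 330})
print(extract)  # 193220, matching the reported byte total

# OpCompositeConstruct: first word, result type, result id, then constituents.
construct_as_reported = bytes_from_histogram(3, {1: 1161, 2: 2754, 3: 2763})
construct_shifted = bytes_from_histogram(3, {2: 1161, 3: 2754, 4: 2763})
print(construct_as_reported)  # 139968, short of the reported 166680
print(construct_shifted)      # 166680, matching the reported byte total
```

The extract buckets reproduce the reported bytes exactly. The construct buckets only do so when shifted up by one, which would also fit what you would expect from building vec2, vec3 and vec4 values out of scalars.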

Sample Output

Below is a sample output run over the shaders that smol-v uses for testing. The lines added since the previous version are the indented breakdowns beneath each of the opcodes discussed above.

Totals: 286932 hits 4985312 bytes
                        OpLoad[ 61] =  49915 hits (17.40%) 798640 bytes (16.02%)
                                    0 hits with memory access  0.00%
                       OpStore[ 62] =  29233 hits (10.19%) 350796 bytes ( 7.04%)
                                    0 hits with memory access  0.00%
                    OpDecorate[ 71] =  23770 hits ( 8.28%) 312964 bytes ( 6.28%)
                                16839 hits with no literals   70.84%
                                 6931 hits with 1 literal     29.16%
                                    0 hits with 2+ literals    0.00%
                                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                15742 hits of decoration  0   66.23%
                                 1036 hits of decoration  2    4.36%
                                    1 hits of decoration  3    0.00%
                                  971 hits of decoration  6    4.08%
                                  206 hits of decoration 11    0.87%
                                    6 hits of decoration 14    0.03%
                                   17 hits of decoration 15    0.07%
                                   37 hits of decoration 18    0.16%
                                 2648 hits of decoration 30   11.14%
                                 1517 hits of decoration 33    6.38%
                                 1589 hits of decoration 34    6.68%
                 OpAccessChain[ 65] =  20116 hits ( 7.01%) 421496 bytes ( 8.45%)
                                    0 hits with 0 indices      0.00%
                                15768 hits with 1 index       78.39%
                                 3904 hits with 2 indices     19.41%
                                  442 hits with 3 indices      2.20%
                                    2 hits with 4 indices      0.01%
                                    0 hits with 5+ indices     0.00%
                    OpVariable[ 59] =  19041 hits ( 6.64%) 304656 bytes ( 6.11%)
                                    0 hits with initializer    0.00%
              OpMemberDecorate[ 72] =  14332 hits ( 4.99%) 280916 bytes ( 5.63%)
                                 1431 hits with no literals    9.98%
                                12901 hits with 1 literal     90.02%
                                    0 hits with 2+ literals    0.00%
                                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                  850 hits of decoration  0    5.93%
                                  579 hits of decoration  4    4.04%
                                  579 hits of decoration  7    4.04%
                                  180 hits of decoration 11    1.26%
                                    2 hits of decoration 24    0.01%
                                12142 hits of decoration 35   84.72%
                    OpConstant[ 43] =  10823 hits ( 3.77%) 173168 bytes ( 3.47%)
                                10823 hits have 1 literal    100.00%
                                    0 hits have 2+ literals    0.00%
                       OpLabel[248] =   9915 hits ( 3.46%)  79320 bytes ( 1.59%)
               OpVectorShuffle[ 79] =   9732 hits ( 3.39%) 308372 bytes ( 6.19%)
                                    0 hits with 0 literals     0.00%
                                    0 hits with 1 literal      0.00%
                                 3045 hits with 2 literals    31.29%
                                 4405 hits with 3 literals    45.26%
                                 2282 hits with 4 literals    23.45%
                                    0 hits with 5+ literals    0.00%
                                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                    0 hits with undef literal  0.00%
                                 7980 hits with literals < 4  82.00%
                                 1752 hits with literals < 8  18.00%
                                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                 7980 hits with same vector   82.00%
            OpCompositeExtract[ 81] =   9595 hits ( 3.34%) 193220 bytes ( 3.88%)
                                    0 hits with 0 literals     0.00%
                                 9265 hits with 1 literal     96.56%
                                  330 hits with 2 literals     3.44%
                                    0 hits with 3+ literals    0.00%
                                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                 2547 hits with literal =  0  26.55%
                                 2414 hits with literal =  1  25.16%
                                 2398 hits with literal =  2  24.99%
                                 1906 hits with literal =  3  19.86%
                                    0 hits with literal =  4   0.00%
                        OpName[  5] =   9233 hits ( 3.22%) 164092 bytes ( 3.29%)
                        OpFMul[133] =   8532 hits ( 2.97%) 170640 bytes ( 3.42%)
          OpCompositeConstruct[ 80] =   6678 hits ( 2.33%) 166680 bytes ( 3.34%)
                                    0 hits with 0 literals     0.00%
                                 1161 hits with 1 literal     17.39%
                                 2754 hits with 2 literals    41.24%
                                 2763 hits with 3 literals    41.37%
                                    0 hits with 4 literals     0.00%
                                    0 hits with 5+ literals    0.00%
                        OpFAdd[129] =   5922 hits ( 2.06%) 118440 bytes ( 2.38%)
                 OpTypePointer[ 32] =   5486 hits ( 1.91%)  87776 bytes ( 1.76%)
                     OpExtInst[ 12] =   5257 hits ( 1.83%) 145980 bytes ( 2.93%)
                      OpBranch[249] =   5229 hits ( 1.82%)  41832 bytes ( 0.84%)
           OpBranchConditional[250] =   3193 hits ( 1.11%)  51088 bytes ( 1.02%)
              OpSelectionMerge[247] =   3109 hits ( 1.08%)  37308 bytes ( 0.75%)
                        OpFSub[131] =   2668 hits ( 0.93%)  53360 bytes ( 1.07%)
                  OpMemberName[  6] =   2507 hits ( 0.87%)  78044 bytes ( 1.57%)
                OpFunctionCall[ 57] =   2198 hits ( 0.77%)  58520 bytes ( 1.17%)
           OpConstantComposite[ 44] =   2155 hits ( 0.75%)  50928 bytes ( 1.02%)
           OpFunctionParameter[ 55] =   2117 hits ( 0.74%)  25404 bytes ( 0.51%)
                         OpDot[148] =   1911 hits ( 0.67%)  38220 bytes ( 0.77%)
           OpVectorTimesScalar[142] =   1488 hits ( 0.52%)  29760 bytes ( 0.60%)
                    OpFunction[ 54] =   1398 hits ( 0.49%)  27960 bytes ( 0.56%)
                 OpFunctionEnd[ 56] =   1398 hits ( 0.49%)   5592 bytes ( 0.11%)
                OpTypeFunction[ 33] =   1175 hits ( 0.41%)  21076 bytes ( 0.42%)
                  OpTypeVector[ 23] =   1110 hits ( 0.39%)  17760 bytes ( 0.36%)
                  OpTypeStruct[ 30] =   1065 hits ( 0.37%)  58496 bytes ( 1.17%)
                   OpTypeArray[ 28] =   1038 hits ( 0.36%)  16608 bytes ( 0.33%)
                     OpFNegate[127] =   1038 hits ( 0.36%)  16608 bytes ( 0.33%)
      OpImageSampleImplicitLod[ 87] =    969 hits ( 0.34%)  19660 bytes ( 0.39%)
                        OpFDiv[136] =    961 hits ( 0.33%)  19220 bytes ( 0.39%)
                 OpReturnValue[254] =    928 hits ( 0.32%)   7424 bytes ( 0.15%)
                   OpFOrdEqual[180] =    722 hits ( 0.25%)  14440 bytes ( 0.29%)
                     OpTypeInt[ 21] =    661 hits ( 0.23%)  10576 bytes ( 0.21%)
           OpFOrdLessThanEqual[188] =    595 hits ( 0.21%)  11900 bytes ( 0.24%)
                OpFOrdLessThan[184] =    588 hits ( 0.20%)  11760 bytes ( 0.24%)
                      OpIEqual[170] =    586 hits ( 0.20%)  11720 bytes ( 0.24%)
                      OpReturn[253] =    525 hits ( 0.18%)   2100 bytes ( 0.04%)
                        OpIAdd[128] =    465 hits ( 0.16%)   9300 bytes ( 0.19%)
                   OpTypeImage[ 25] =    437 hits ( 0.15%)  15732 bytes ( 0.32%)
            OpTypeSampledImage[ 27] =    437 hits ( 0.15%)   5244 bytes ( 0.11%)
        OpFOrdGreaterThanEqual[190] =    412 hits ( 0.14%)   8240 bytes ( 0.17%)
             OpFOrdGreaterThan[186] =    391 hits ( 0.14%)   7820 bytes ( 0.16%)
      OpImageSampleExplicitLod[ 88] =    376 hits ( 0.13%)  11128 bytes ( 0.22%)
                  OpCapability[ 17] =    372 hits ( 0.13%)   2976 bytes ( 0.06%)
                 OpMemoryModel[ 14] =    341 hits ( 0.12%)   4092 bytes ( 0.08%)
                  OpEntryPoint[ 15] =    341 hits ( 0.12%)  17808 bytes ( 0.36%)
                    OpTypeVoid[ 19] =    341 hits ( 0.12%)   2728 bytes ( 0.05%)
               OpExtInstImport[ 11] =    341 hits ( 0.12%)   8184 bytes ( 0.16%)
                   OpTypeFloat[ 22] =    341 hits ( 0.12%)   4092 bytes ( 0.08%)
                  OpLogicalAnd[167] =    331 hits ( 0.12%)   6620 bytes ( 0.13%)
                         OpPhi[245] =    281 hits ( 0.10%)   7868 bytes ( 0.16%)
                  OpTypeMatrix[ 24] =    255 hits ( 0.09%)   4080 bytes ( 0.08%)
               OpExecutionMode[ 16] =    235 hits ( 0.08%)   2852 bytes ( 0.06%)
  OpImageSampleDrefExplicitLod[ 90] =    226 hits ( 0.08%)   7232 bytes ( 0.15%)
                    OpTypeBool[ 20] =    212 hits ( 0.07%)   1696 bytes ( 0.03%)
           OpVectorTimesMatrix[144] =    194 hits ( 0.07%)   3880 bytes ( 0.08%)
             OpSourceExtension[  4] =    167 hits ( 0.06%)   5732 bytes ( 0.11%)
                  OpLogicalNot[168] =    160 hits ( 0.06%)   2560 bytes ( 0.05%)
                      OpSource[  3] =    141 hits ( 0.05%)   1692 bytes ( 0.03%)
                        OpIMul[132] =    135 hits ( 0.05%)   2700 bytes ( 0.05%)
                 OpConvertSToF[111] =    116 hits ( 0.04%)   1856 bytes ( 0.04%)
                        OpFMod[141] =    114 hits ( 0.04%)   2280 bytes ( 0.05%)
                   OpLogicalOr[166] =     93 hits ( 0.03%)   1860 bytes ( 0.04%)
                   OpSLessThan[177] =     92 hits ( 0.03%)   1840 bytes ( 0.04%)
                   OpLoopMerge[246] =     84 hits ( 0.03%)   1344 bytes ( 0.03%)
                      OpSelect[169] =     68 hits ( 0.02%)   1632 bytes ( 0.03%)
                 OpConvertFToS[110] =     67 hits ( 0.02%)   1072 bytes ( 0.02%)
                     OpBitcast[124] =     66 hits ( 0.02%)   1056 bytes ( 0.02%)
           OpMatrixTimesVector[145] =     48 hits ( 0.02%)    960 bytes ( 0.02%)
            OpShiftLeftLogical[196] =     46 hits ( 0.02%)    920 bytes ( 0.02%)
                OpFOrdNotEqual[182] =     43 hits ( 0.01%)    860 bytes ( 0.02%)
                        OpKill[252] =     40 hits ( 0.01%)    160 bytes ( 0.00%)
           OpSGreaterThanEqual[175] =     39 hits ( 0.01%)    780 bytes ( 0.02%)
           OpMatrixTimesScalar[143] =     32 hits ( 0.01%)    640 bytes ( 0.01%)
              OpSLessThanEqual[179] =     21 hits ( 0.01%)    420 bytes ( 0.01%)
                        OpISub[130] =     21 hits ( 0.01%)    420 bytes ( 0.01%)
           OpMatrixTimesMatrix[146] =     15 hits ( 0.01%)    300 bytes ( 0.01%)
                        OpSDiv[135] =     12 hits ( 0.00%)    240 bytes ( 0.00%)
                      OpFwidth[209] =      8 hits ( 0.00%)    128 bytes ( 0.00%)
                 OpConvertUToF[112] =      6 hits ( 0.00%)     96 bytes ( 0.00%)
                   OpTranspose[ 84] =      6 hits ( 0.00%)     96 bytes ( 0.00%)
                  OpEmitVertex[218] =      5 hits ( 0.00%)     20 bytes ( 0.00%)
                   OpINotEqual[171] =      5 hits ( 0.00%)    100 bytes ( 0.00%)
               OpConstantFalse[ 42] =      5 hits ( 0.00%)     60 bytes ( 0.00%)
                OpConstantTrue[ 41] =      5 hits ( 0.00%)     60 bytes ( 0.00%)
                         OpAny[154] =      4 hits ( 0.00%)     64 bytes ( 0.00%)
              OpControlBarrier[224] =      4 hits ( 0.00%)     64 bytes ( 0.00%)
             OpLogicalNotEqual[165] =      3 hits ( 0.00%)     60 bytes ( 0.00%)
                     OpSNegate[126] =      3 hits ( 0.00%)     48 bytes ( 0.00%)
        OpVectorExtractDynamic[ 77] =      3 hits ( 0.00%)     60 bytes ( 0.00%)
                OpEndPrimitive[219] =      2 hits ( 0.00%)      8 bytes ( 0.00%)
                        OpDPdy[208] =      2 hits ( 0.00%)     32 bytes ( 0.00%)
                        OpDPdx[207] =      2 hits ( 0.00%)     32 bytes ( 0.00%)
                   OpULessThan[176] =      2 hits ( 0.00%)     40 bytes ( 0.00%)
                        OpUMod[137] =      2 hits ( 0.00%)     40 bytes ( 0.00%)
                OpSGreaterThan[173] =      1 hits ( 0.00%)     20 bytes ( 0.00%)
             OpCompositeInsert[ 82] =      1 hits ( 0.00%)     24 bytes ( 0.00%)
            OpTypeRuntimeArray[ 29] =      1 hits ( 0.00%)     12 bytes ( 0.00%)
                       OpUndef[  1] =      1 hits ( 0.00%)     12 bytes ( 0.00%)