============================= test session starts ==============================
platform linux -- Python 3.12.3, pytest-9.0.3, pluggy-1.6.0
benchmark: 5.2.3 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /home/ubuntu/trnfft
configfile: pyproject.toml
plugins: anyio-4.13.0, benchmark-5.2.3
collected 70 items

benchmarks/bench_fft.py ................................................ [ 68%]
......................                                                   [100%]
Wrote benchmark data in: <_io.BufferedWriter name='/home/ubuntu/bench.json'>


=============================== warnings summary ===============================
benchmarks/bench_fft.py::TestSTFT::test_stft_torch
  /opt/aws_neuronx_venv_pytorch_2_9/lib/python3.12/site-packages/torch/functional.py:681: UserWarning: A window was not provided. A rectangular window will be applied,which is known to cause spectral leakage. Other windows such as torch.hann_window or torch.hamming_window are recommended to reduce spectral leakage.To suppress this warning and use a rectangular window, explicitly set `window=torch.ones(n_fft, device=<device>)`. (Triggered internally at /pytorch/aten/src/ATen/native/SpectralOps.cpp:836.)
    return _VF.stft(  # type: ignore[attr-defined]

../../../opt/aws_neuronx_venv_pytorch_2_9/lib/python3.12/site-packages/_pytest/cacheprovider.py:475
  /opt/aws_neuronx_venv_pytorch_2_9/lib/python3.12/site-packages/_pytest/cacheprovider.py:475: PytestCacheWarning: cache could not write path /home/ubuntu/trnfft/.pytest_cache/v/cache/nodeids: [Errno 13] Permission denied: '/home/ubuntu/trnfft/.pytest_cache/v/cache/nodeids'
    config.cache.set("cache/nodeids", sorted(self.cached_nodeids))

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html

-------------------------------------------------------------------------------------------------------------------- benchmark: 70 tests --------------------------------------------------------------------------------------------------------------------
Name (time in us)                                              Min                       Max                      Mean                 StdDev                    Median                    IQR            Outliers          OPS            Rounds  Iterations
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_fft_torch[256]                                         8.5440 (1.0)             25.9820 (1.0)             10.1923 (1.0)           2.3366 (1.69)            10.5230 (1.0)           1.8330 (19.71)         1;1  98,113.7191 (1.0)          55           1
test_bluestein_torch[127]                                  11.4300 (1.34)            31.9800 (1.23)            13.6905 (1.34)          2.2531 (1.63)            13.3920 (1.27)          0.0947 (1.02)         5;33  73,043.4423 (0.74)        205           1
test_mask_trnfft_pytorch[mask_shape0]                      14.5710 (1.71)            40.1500 (1.55)            15.1705 (1.49)          1.3862 (1.0)             14.9700 (1.42)          0.0940 (1.01)     427;1156  65,917.4989 (0.67)      17964           1
test_fftn_torch[fftn_shape0]                               16.3080 (1.91)            43.1980 (1.66)            17.4267 (1.71)          3.5363 (2.55)            16.6345 (1.58)          0.2520 (2.71)         5;12  57,383.0637 (0.58)         94           1
test_fft2_torch[fft2_shape0]                               17.8930 (2.09)            43.5630 (1.68)            20.8736 (2.05)          3.1103 (2.24)            20.3760 (1.94)          0.3430 (3.69)         4;16  47,907.3618 (0.49)        118           1
test_fft_torch[1024]                                       23.4950 (2.75)            41.5750 (1.60)            28.6960 (2.82)          2.1875 (1.58)            28.2980 (2.69)          0.0930 (1.0)         22;48  34,848.1058 (0.36)        504           1
test_gemm_torch_complex64[128]                             33.2690 (3.89)            50.8470 (1.96)            34.3169 (3.37)          2.9673 (2.14)            33.5750 (3.19)          0.1722 (1.85)        14;25  29,140.1917 (0.30)        243           1
test_stft_torch                                            43.1630 (5.05)            90.1510 (3.47)            48.3076 (4.74)          4.3089 (3.11)            47.0750 (4.47)          0.3600 (3.87)        26;56  20,700.6947 (0.21)        348           1
test_batched_fft_torch[batched_shape0]                     44.1780 (5.17)           108.8850 (4.19)            50.6261 (4.97)          3.1401 (2.27)            49.8830 (4.74)          0.8072 (8.68)      506;555  19,752.6589 (0.20)       7777           1
test_mask_trnfft_pytorch[mask_shape1]                      49.2250 (5.76)            81.0620 (3.12)            51.0641 (5.01)          3.2236 (2.33)            50.1785 (4.77)          0.7425 (7.98)      216;260  19,583.2190 (0.20)       3364           1
test_bluestein_torch[997]                                  50.5630 (5.92)            78.6210 (3.03)            57.5960 (5.65)          3.2179 (2.32)            56.6355 (5.38)          0.2620 (2.82)       66;138  17,362.3107 (0.18)        882           1
test_gemm_trnfft_pytorch[128]                              72.0680 (8.43)            98.7330 (3.80)            74.4106 (7.30)          5.5078 (3.97)            72.5585 (6.90)          0.5380 (5.78)          6;8  13,438.9475 (0.14)         56           1
test_batched_fft_torch[batched_shape1]                     74.3840 (8.71)           108.9910 (4.19)            82.3026 (8.08)          3.6350 (2.62)            81.1970 (7.72)          0.9885 (10.63)     701;727  12,150.2888 (0.12)       6788           1
test_fftn_torch[fftn_shape1]                               82.7840 (9.69)           168.0210 (6.47)            87.2707 (8.56)          5.2847 (3.81)            85.5540 (8.13)          0.9805 (10.54)     125;181  11,458.5977 (0.12)       1117           1
test_fft_torch[4096]                                       92.0430 (10.77)          136.4600 (5.25)            94.2820 (9.25)          4.2794 (3.09)            92.7950 (8.82)          0.3950 (4.25)        59;69  10,606.4822 (0.11)        476           1
test_fft2_torch[fft2_shape1]                              103.4180 (12.10)          216.6180 (8.34)           107.2813 (10.53)         4.9706 (3.59)           105.5680 (10.03)         1.3995 (15.05)     694;764   9,321.2899 (0.10)       5164           1
test_fft_torch[16384]                                     164.3110 (19.23)          273.9760 (10.54)          169.3131 (16.61)        10.1664 (7.33)           166.4540 (15.82)         2.4603 (26.45)        4;30   5,906.2161 (0.06)        143           1
test_linear_trnfft_pytorch[linear_shape0]                 199.2120 (23.32)          345.9510 (13.32)          206.9397 (20.30)        13.1664 (9.50)           201.4040 (19.14)        13.8635 (149.07)       24;1   4,832.3261 (0.05)        189           1
test_gemm_torch_complex64[256]                            218.8870 (25.62)          306.5640 (11.80)          223.9023 (21.97)         6.7551 (4.87)           220.1570 (20.92)        12.8080 (137.72)      685;2   4,466.2332 (0.05)       2494           1
test_bluestein_torch[4097]                                311.5230 (36.46)          422.8970 (16.28)          319.1220 (31.31)         7.4907 (5.40)           314.5250 (29.89)        12.4690 (134.08)      500;9   3,133.5975 (0.03)       2912           1
test_gemm_trnfft_pytorch[256]                             318.4780 (37.28)          450.6370 (17.34)          326.7536 (32.06)         9.0181 (6.51)           320.9570 (30.50)        14.9200 (160.43)      261;3   3,060.4100 (0.03)       1476           1
test_fft_torch[65536]                                     501.6230 (58.71)          878.1520 (33.80)          559.3983 (54.88)        19.9344 (14.38)          559.5515 (53.17)        12.1725 (130.89)      14;12   1,787.6352 (0.02)        572           1
test_mask_trnfft_pytorch[mask_shape2]                     503.8050 (58.97)          626.5530 (24.11)          517.8616 (50.81)         8.6385 (6.23)           519.7270 (49.39)        13.7320 (147.66)      331;1   1,931.0179 (0.02)       1085           1
test_fft2_torch[fft2_shape2]                            1,053.2560 (123.27)       1,228.4590 (47.28)        1,069.1803 (104.90)       14.3090 (10.32)        1,067.0580 (101.40)       15.5605 (167.32)       39;4     935.2959 (0.01)        345           1
test_mask_nki[mask_shape0]                              1,361.8580 (159.39)       3,373.1530 (129.83)       1,427.7956 (140.09)       95.2563 (68.72)        1,412.7035 (134.25)       39.4440 (424.13)      29;34     700.3804 (0.01)        562           1
test_gemm_nki[128]                                      1,434.5180 (167.90)       2,312.5310 (89.01)        1,503.4824 (147.51)       58.0876 (41.90)        1,493.6330 (141.94)       24.3870 (262.23)      31;34     665.1225 (0.01)        562           1
test_mask_nki[mask_shape1]                              1,461.4600 (171.05)       2,237.9210 (86.13)        1,538.3494 (150.93)       56.3415 (40.64)        1,527.5835 (145.17)       32.0360 (344.47)      43;38     650.0474 (0.01)        600           1
test_linear_nki[linear_shape0]                          1,568.8210 (183.62)       6,581.4220 (253.31)       1,680.7986 (164.91)      222.4211 (160.45)       1,659.4300 (157.70)       36.0725 (387.88)       8;31     594.9553 (0.01)        521           1
test_gemm_torch_complex64[512]                          1,638.2910 (191.75)       1,773.7260 (68.27)        1,695.5270 (166.35)       27.4408 (19.80)        1,709.2680 (162.43)       45.8467 (492.98)      178;0     589.7871 (0.01)        533           1
test_gemm_nki[256]                                      1,640.7350 (192.03)       3,336.1260 (128.40)       1,709.5422 (167.73)      105.4122 (76.04)        1,685.8360 (160.20)       25.9223 (278.73)      40;57     584.9519 (0.01)        491           1
test_gemm_trnfft_pytorch[512]                           1,854.6470 (217.07)       2,053.6990 (79.04)        1,881.0402 (184.56)       15.1990 (10.96)        1,877.0460 (178.38)       16.5835 (178.32)       42;6     531.6207 (0.01)        444           1
test_gemm_nki[512]                                      2,240.9980 (262.29)       4,030.5530 (155.13)       2,317.6026 (227.39)      141.2931 (101.93)       2,294.3320 (218.03)       39.7935 (427.89)      13;28     431.4804 (0.00)        381           1
test_mask_nki[mask_shape2]                              2,910.8140 (340.69)       5,021.8570 (193.28)       2,994.3804 (293.79)      192.6168 (138.95)       2,965.7580 (281.84)       28.0678 (301.80)       8;18     333.9589 (0.00)        253           1
test_linear_nki[linear_shape1]                          3,163.0620 (370.21)       6,068.4590 (233.56)       3,316.3819 (325.38)      296.8158 (214.12)       3,268.9905 (310.65)       87.1935 (937.56)       7;16     301.5334 (0.00)        288           1
test_linear_trnfft_pytorch[linear_shape1]               3,959.8900 (463.47)       4,186.3930 (161.13)       4,067.1239 (399.04)       27.2937 (19.69)        4,069.6510 (386.74)        8.1120 (87.23)       21;33     245.8740 (0.00)        220           1
test_fftn_trnfft_pytorch[fftn_shape0]                   4,073.3960 (476.76)       4,479.2570 (172.40)       4,135.2233 (405.72)       39.2922 (28.34)        4,132.5525 (392.72)       33.2870 (357.92)       24;7     241.8249 (0.00)        232           1
test_gemm_nki[1024]                                     4,750.1590 (555.96)       8,560.1750 (329.47)       4,894.8760 (480.25)      408.4735 (294.66)       4,823.6600 (458.39)       66.8330 (718.63)       3;27     204.2953 (0.00)        174           1
test_fft_nki[256]                                       9,890.2760 (>1000.0)     10,644.0830 (409.67)      10,073.8098 (988.38)      141.5910 (102.14)      10,034.7875 (953.61)      186.7020 (>1000.0)      17;3      99.2673 (0.00)         90           1
test_fft2_trnfft_pytorch[fft2_shape0]                  12,168.2420 (>1000.0)     12,375.2620 (476.30)      12,242.9179 (>1000.0)      39.5613 (28.54)       12,240.9070 (>1000.0)      50.8715 (547.01)       27;2      81.6799 (0.00)         81           1
test_gemm_torch_complex64[1024]                        12,953.7960 (>1000.0)     13,932.3320 (536.23)      12,992.5052 (>1000.0)     115.9928 (83.67)       12,973.4880 (>1000.0)      12.6950 (136.51)        1;6      76.9675 (0.00)         70           1
test_gemm_trnfft_pytorch[1024]                         13,531.6170 (>1000.0)     14,108.3800 (543.01)      13,581.1859 (>1000.0)      87.4190 (63.06)       13,560.7500 (>1000.0)      29.7567 (319.97)        3;3      73.6313 (0.00)         61           1
test_fftn_nki[fftn_shape0]                             13,790.1130 (>1000.0)     17,753.7670 (683.31)      14,176.2653 (>1000.0)     527.7913 (380.74)      14,076.2620 (>1000.0)     236.4553 (>1000.0)       2;4      70.5404 (0.00)         71           1
test_fft2_nki[fft2_shape0]                             14,838.5810 (>1000.0)     16,665.7710 (641.44)      15,142.0672 (>1000.0)     275.8136 (198.97)      15,063.0130 (>1000.0)     212.9423 (>1000.0)      10;4      66.0412 (0.00)         65           1
test_fft_nki[1024]                                     15,749.3980 (>1000.0)     16,614.6300 (639.47)      16,009.4863 (>1000.0)     171.6238 (123.81)      15,981.9725 (>1000.0)     197.0750 (>1000.0)      14;3      62.4630 (0.00)         60           1
test_fft_trnfft_pytorch[256]                           21,698.6200 (>1000.0)     21,915.0970 (843.47)      21,761.5099 (>1000.0)      41.3412 (29.82)       21,755.0700 (>1000.0)      39.5963 (425.77)        7;1      45.9527 (0.00)         31           1
test_batched_fft_nki[batched_shape0]                   25,039.0990 (>1000.0)     26,047.5840 (>1000.0)     25,419.1840 (>1000.0)     259.2189 (186.99)      25,399.1610 (>1000.0)     263.9770 (>1000.0)      10;3      39.3404 (0.00)         31           1
test_stft_nki                                          27,395.5150 (>1000.0)     37,076.6930 (>1000.0)     28,971.3087 (>1000.0)   1,687.5324 (>1000.0)     29,339.1750 (>1000.0)   1,868.8298 (>1000.0)       1;1      34.5169 (0.00)         35           1
test_bluestein_nki[127]                                31,183.5960 (>1000.0)     32,826.6120 (>1000.0)     31,522.7551 (>1000.0)     350.8722 (253.11)      31,420.6390 (>1000.0)     359.6648 (>1000.0)       3;2      31.7231 (0.00)         31           1
test_fftn_trnfft_pytorch[fftn_shape1]                  32,610.2270 (>1000.0)     32,843.8670 (>1000.0)     32,722.7830 (>1000.0)      73.9625 (53.35)       32,736.5800 (>1000.0)     135.0040 (>1000.0)      13;0      30.5597 (0.00)         30           1
test_fft_nki[4096]                                     39,036.0400 (>1000.0)     41,417.2720 (>1000.0)     39,454.6263 (>1000.0)     473.2300 (341.38)      39,297.5430 (>1000.0)     311.3633 (>1000.0)       2;2      25.3456 (0.00)         25           1
test_fft2_nki[fft2_shape1]                             44,209.6000 (>1000.0)     51,519.9730 (>1000.0)     45,574.3828 (>1000.0)   1,741.3593 (>1000.0)     44,826.7690 (>1000.0)   2,419.4222 (>1000.0)       2;1      21.9422 (0.00)         23           1
test_stft_trnfft_pytorch                               48,595.5790 (>1000.0)     49,101.2410 (>1000.0)     48,790.6357 (>1000.0)     113.2379 (81.69)       48,769.2940 (>1000.0)     128.3555 (>1000.0)       5;1      20.4957 (0.00)         21           1
test_batched_fft_nki[batched_shape1]                   52,289.1790 (>1000.0)     54,937.4200 (>1000.0)     53,071.3427 (>1000.0)     594.1389 (428.60)      52,993.4025 (>1000.0)     507.7760 (>1000.0)       2;1      18.8426 (0.00)         16           1
test_fft2_trnfft_pytorch[fft2_shape1]                  57,249.9600 (>1000.0)     62,414.3020 (>1000.0)     60,005.8704 (>1000.0)   1,880.7530 (>1000.0)     60,770.9510 (>1000.0)   3,762.8633 (>1000.0)       7;0      16.6650 (0.00)         17           1
test_bluestein_trnfft_pytorch[127]                     66,743.0400 (>1000.0)     66,994.6880 (>1000.0)     66,852.3601 (>1000.0)      65.3999 (47.18)       66,836.7610 (>1000.0)      56.4273 (606.74)        5;2      14.9583 (0.00)         15           1
test_fftn_nki[fftn_shape1]                             69,488.7010 (>1000.0)     85,739.7660 (>1000.0)     71,177.1755 (>1000.0)   4,054.0451 (>1000.0)     70,156.6930 (>1000.0)     805.8295 (>1000.0)       1;1      14.0494 (0.00)         15           1
test_bluestein_nki[997]                                82,005.9260 (>1000.0)     83,114.6530 (>1000.0)     82,469.2558 (>1000.0)     346.9963 (250.31)      82,359.1250 (>1000.0)     445.7915 (>1000.0)       4;0      12.1257 (0.00)         12           1
test_fft_trnfft_pytorch[1024]                          87,205.1480 (>1000.0)     87,834.8180 (>1000.0)     87,583.3208 (>1000.0)     203.3386 (146.68)      87,642.9680 (>1000.0)     344.1615 (>1000.0)       4;0      11.4177 (0.00)         12           1
test_batched_fft_trnfft_pytorch[batched_shape0]        93,815.2220 (>1000.0)     94,840.8020 (>1000.0)     94,341.1749 (>1000.0)     304.0312 (219.32)      94,313.6490 (>1000.0)     468.9387 (>1000.0)       4;0      10.5998 (0.00)         11           1
test_batched_fft_trnfft_pytorch[batched_shape1]       108,193.0170 (>1000.0)    108,687.1250 (>1000.0)    108,441.9676 (>1000.0)     139.1466 (100.38)     108,431.5295 (>1000.0)     198.1430 (>1000.0)       2;0       9.2215 (0.00)         10           1
test_fft_nki[16384]                                   130,062.7450 (>1000.0)    137,412.6770 (>1000.0)    132,241.6176 (>1000.0)   2,299.9911 (>1000.0)    131,765.4620 (>1000.0)   1,763.9115 (>1000.0)       1;1       7.5619 (0.00)          8           1
test_fft_trnfft_pytorch[4096]                         354,539.8780 (>1000.0)    356,272.0190 (>1000.0)    355,496.5950 (>1000.0)     748.4853 (539.94)     355,413.5150 (>1000.0)   1,309.2730 (>1000.0)       2;0       2.8130 (0.00)          5           1
test_bluestein_nki[4097]                              430,411.5280 (>1000.0)    433,757.0080 (>1000.0)    432,626.0260 (>1000.0)   1,320.3475 (952.47)     433,125.2300 (>1000.0)   1,481.7480 (>1000.0)       1;0       2.3115 (0.00)          5           1
test_fft2_trnfft_pytorch[fft2_shape2]                 473,599.8120 (>1000.0)    474,963.5380 (>1000.0)    474,266.8300 (>1000.0)     613.3327 (442.44)     474,073.1590 (>1000.0)   1,108.3098 (>1000.0)       2;0       2.1085 (0.00)          5           1
test_fft2_nki[fft2_shape2]                            536,438.6310 (>1000.0)    594,182.8580 (>1000.0)    549,747.8616 (>1000.0)  25,027.5613 (>1000.0)    537,623.8790 (>1000.0)  19,884.3763 (>1000.0)       1;1       1.8190 (0.00)          5           1
test_bluestein_trnfft_pytorch[997]                    539,343.9870 (>1000.0)    540,476.0300 (>1000.0)    540,059.6514 (>1000.0)     451.1101 (325.42)     540,116.1100 (>1000.0)     611.2752 (>1000.0)       1;0       1.8516 (0.00)          5           1
test_fft_nki[65536]                                   554,156.2550 (>1000.0)    568,437.2830 (>1000.0)    557,873.2996 (>1000.0)   5,982.0177 (>1000.0)    555,985.3940 (>1000.0)   5,018.2760 (>1000.0)       1;1       1.7925 (0.00)          5           1
test_fft_trnfft_pytorch[16384]                      1,430,993.5850 (>1000.0)  1,435,905.8820 (>1000.0)  1,434,448.0160 (>1000.0)   2,074.6668 (>1000.0)  1,435,530.3860 (>1000.0)   2,556.6970 (>1000.0)       1;0       0.6971 (0.00)          5           1
test_bluestein_trnfft_pytorch[4097]                 4,333,486.9990 (>1000.0)  4,349,312.5440 (>1000.0)  4,342,078.4318 (>1000.0)   6,366.5317 (>1000.0)  4,342,374.4700 (>1000.0)  10,261.2957 (>1000.0)       2;0       0.2303 (0.00)          5           1
test_fft_trnfft_pytorch[65536]                      5,777,302.7910 (>1000.0)  5,809,858.1770 (>1000.0)  5,794,233.5806 (>1000.0)  13,952.2199 (>1000.0)  5,794,943.2630 (>1000.0)  24,815.0605 (>1000.0)       2;0       0.1726 (0.00)          5           1
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Legend:
  Outliers: 1 Standard Deviation from Mean; 1.5 IQR (InterQuartile Range) from 1st Quartile and 3rd Quartile.
  OPS: Operations Per Second, computed as 1 / Mean
================== 70 passed, 2 warnings in 242.02s (0:04:02) ==================
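
The legend above states that OPS is computed as 1 / Mean, and the parenthesized figures in the Mean column appear to be each row's ratio to the smallest mean. The sketch below re-derives both for three rows copied from the table. The helper names `ops_from_mean_us` and `relative_to_fastest` are hypothetical and not part of pytest-benchmark. Because the table prints means rounded to four decimals, the recomputed values can differ from the printed ones in the last digits.

```python
# Mean times in microseconds, copied from three rows of the table above.
mean_us = {
    "test_fft_torch[256]": 10.1923,
    "test_fft_nki[256]": 10_073.8098,
    "test_fft_trnfft_pytorch[256]": 21_761.5099,
}


def ops_from_mean_us(mean: float) -> float:
    """Operations per second for a mean time given in microseconds (OPS = 1 / Mean)."""
    return 1e6 / mean


def relative_to_fastest(means: dict[str, float]) -> dict[str, float]:
    """Ratio of each mean to the smallest mean in the given set (hypothetical helper)."""
    fastest = min(means.values())
    return {name: value / fastest for name, value in means.items()}


ratios = relative_to_fastest(mean_us)
for name, mean in mean_us.items():
    print(f"{name:32s} {ops_from_mean_us(mean):12.2f} ops/s  {ratios[name]:9.2f}x")
```

Read this way, the 256-point rows put `test_fft_nki` at roughly three orders of magnitude slower than `test_fft_torch` on mean time, and `test_fft_trnfft_pytorch` at about twice that again.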
