Age | Commit message | Author | Lines
2025-05-09 | milos-copy: reduce nr_threads configuration space [master] | Birte Kristina Friesel | -4/+4
2025-05-09 | benchmark scripts: Add second CXL card; log git revision | Birte Kristina Friesel | -21/+31
2025-05-09 | read/write avx512: ensure alignment; do not use input array for writes | Birte Kristina Friesel | -9/+7
2025-05-09 | benchmark scripts: rename milos-roofline to milos-copy for consistency | Birte Kristina Friesel | -0/+0
2025-03-24 | only enable AVX512 tests by default if compiled with AVX512 | Birte Kristina Friesel | -1/+3
2025-03-24 | Enable AVX512 data transfer test by default (if compiled with AVX51) | Birte Kristina Friesel | -0/+1
2025-03-11 | benchmark scripts: milos: Add CXL.mem card (in legacy mode) on NUMA node 16 | Birte Kristina Friesel | -12/+12
2025-01-27 | Multi-threaded AVX512 read/write: Simplify offset calculation | Birte Kristina Friesel | -5/+5
2025-01-27 | add sanity check for multi-threaded reads | Birte Kristina Friesel | -5/+28
2025-01-27 | add a simple sanity check for (single-threaded) read tests | Birte Kristina Friesel | -7/+34
2025-01-27 | Fix AVX512 read/write tests. | Birte Kristina Friesel | -4/+4
    512 bits is 64 Bytes, not 512 Bytes...
2025-01-10 | Fix compilation without avx512 | Birte Kristina Friesel | -1/+1
2025-01-10 | Use native=0 to compile without -march=native (needed for, e.g., valgrind) | Birte Kristina Friesel | -5/+16
2025-01-06 | benchmark-scripts: milos: include AVX512 measurements | Birte Kristina Friesel | -4/+33
2025-01-03 | force 512-byte alignment | Birte Kristina Friesel | -24/+24
    Note that this may cause parts of the input / output arrays to be unprocessed. This is deliberate: we are interested in raw bandwidth and do not want to bother with edge cases.
2025-01-03 | only call move_pages if arr_a / arr_b are allocated | Birte Kristina Friesel | -19/+24
2025-01-03 | Makefile: Add debug mode | Birte Kristina Friesel | -0/+4
2024-12-20 | Add AVX512 read/write tests | Birte Kristina Friesel | -13/+85
2024-11-29 | also run read/write tests when no others are requested | Birte Kristina Friesel | -0/+2
2024-11-29 | Only allocate input/output arrays when needed | Birte Kristina Friesel | -5/+14
2024-11-29 | Print an error message when requesting AVX512 in non-AVX512 build | Birte Kristina Friesel | -1/+8
2024-11-29 | add read-only and write-only benchmarks on Xeon Max HBM | Birte Kristina Friesel | -0/+28
2024-11-28 | remove milos-roofline eval script | Birte Kristina Friesel | -14/+0
2024-10-25 | add benchmark script for milos (DRAM + HBM) | Birte Kristina Friesel | -0/+14
2024-09-30 | add a plain write test | Birte Kristina Friesel | -5/+19
2024-09-30 | Update README | Birte Kristina Friesel | -0/+11
2024-09-27 | Rename "dumb" to "plain" | Birte Kristina Friesel | -18/+18
2024-09-27 | Fix -t4 in non-pthread mode | Birte Kristina Friesel | -0/+2
2024-09-23 | add a simple read benchmark. No AVX or anything. | Birte Kristina Friesel | -3/+19
2024-09-19 | Add AVX512 copy variant. Not particularly efficient yet, might be missing sth | Birte Kristina Friesel | -5/+276
2024-09-19 | Compile with -O3 -march=native (does not seem to have a notable effect) | Birte Kristina Friesel | -1/+1
2024-07-24 | add benchmark script for tinos | Birte Kristina Friesel | -0/+14
2024-07-22 | Makefile: do not overwrite libs / flags | Birte Kristina Friesel | -3/+3
2024-07-19 | mbw.c: fix dfatool output format | Birte Kristina Friesel | -3/+3
2024-07-16 | add roofline copy benchmark script on milos | Birte Kristina Friesel | -0/+14
2024-07-16 | use distinct dfatool keys for copy types | Birte Kristina Friesel | -4/+11
2024-06-17 | pthread=1: allocate threads after initializing global variables | Birte Kristina Friesel | -22/+21
2024-05-23 | explicitly specify NUMA regions for source and target memories | Birte Kristina Friesel | -11/+32
2024-05-23 | Add NUMA support | Birte Kristina Friesel | -2/+66
2023-06-07 | run.sh: variable array size | Daniel Friesel | -9/+9
2023-06-07 | adjust for new dfatool output format | Daniel Friesel | -3/+3
2023-05-02 | use monotonic clock for benchmarks. gettimeofday is a really bad idea | Daniel Friesel | -10/+10
2023-05-02 | add n_threads to parameter output | Daniel Friesel | -0/+5
2023-05-02 | add multi-threaded benchmarks for NUMA evaluation and the likes | Daniel Friesel | -44/+169
2023-04-28 | produce dfatool-compatible output, add run script | Daniel Friesel | -9/+21
2023-04-28 | simplify makefile for SMAUG benchmarks | Daniel Friesel | -114/+3
2023-04-17 | Merge pull request #17 from raas/add-license-1 | raas | -2/+680
    Add LICENSE, GPL v3.0
2023-04-17 | Update spec file with new license, bump to 2.0 | raas | -2/+6
2023-04-17 | Add LICENSE, GPL v3.0 | raas | -0/+674
2023-01-24 | Merge pull request #15 from iboB/no-mman | raas | -1/+0
    Removed unused header