Age | Commit message | Author | Lines
7 days | STREAM: Re-enable -flto (HEAD, main) | Birte Kristina Friesel | -1/+1
    SDK 2024.1.0 shipped a compiler bug. With 2024.2.0, it is working again.
7 days | CPU-DPU microbenchmark: use default number of threads per pool | Birte Kristina Friesel | -6/+2
7 days | README: Document changes made to benchmarks | Birte Kristina Friesel | -0/+32
2024-10-30 | README: dimes24: Add PDF and DOI links | Birte Kristina Friesel | -1/+2
2024-10-30 | README: Add citation information | Birte Kristina Friesel | -1/+3
2024-10-28 | README: Add Git repo links | Birte Kristina Friesel | -0/+4
2024-10-28 | COUNT: Add baseline | Birte Kristina Friesel | -0/+246
2024-10-24 | Update README | Birte Kristina Friesel | -2/+6
2024-10-24 | Add COUNT benchmark (based on SEL) | Birte Kristina Friesel | -0/+645
2024-10-11 | README: fix typo | Birte Kristina Friesel | -1/+1
2024-09-27 | README: -v; add references | Birte Kristina Friesel | -2/+49
2024-08-23 | Make STREAM Microbenchmark work with SDK 2024.1.0 | Birte Kristina Friesel | -1/+1
    For whatever reason, -flto causes a "Heap Full" error even in an otherwise empty program:
        * thread #1, name = 'DPUthread0', stop reason = fault 1 (Heap Full)
          * frame #0: 0x80000350 dpu_code`mem_alloc_nolock(size=64) at alloc.c:32:5
            frame #1: 0x80000170 dpu_code`main_kernel1 [inlined] mem_alloc(size=64) at alloc.c:52:21
            frame #2: 0x80000168 dpu_code`main_kernel1 at add.c:73
            frame #3: 0x80000308 dpu_code`main at add.c:42:12
            frame #4: 0x80000050 dpu_code`__bootstrap at crt0.c:36:5
    I do not know why this is the case.
2024-08-19 | HST-S: Add memcpy variant and update HBM eval script | Birte Kristina Friesel | -69/+139
2024-08-15 | GEMV: measure each write (dpu_push_xfer call) separately | Birte Kristina Friesel | -17/+37
2024-07-28 | TRNS DPU: Correctly initalize task so that it's actually repeatable | Birte Kristina Friesel | -0/+1
2024-07-28 | TRNS host: set active_dpus_before at proper location | Birte Kristina Friesel | -1/+1
2024-07-26 | TRNS: decrease number of repetitions | Birte Kristina Friesel | -5/+5
2024-07-26 | GEMV: specify SDK version | Birte Kristina Friesel | -0/+2
2024-07-26 | CPU-DPU: specify SDK version | Birte Kristina Friesel | -1/+5
2024-07-26 | TS: re-add --resume | Birte Kristina Friesel | -12/+11
2024-07-26 | TRNS host: do not needlessly re-allocate DPUs | Birte Kristina Friesel | -0/+1
2024-07-26 | VA hbm: reduce number of repetitions | Birte Kristina Friesel | -1/+1
2024-07-25 | VA nmc: explicitly source SDK | Birte Kristina Friesel | -0/+2
2024-07-25 | TRNS: fix write1 documentation | Birte Kristina Friesel | -4/+4
2024-07-25 | TRNS: specify memcpy node | Birte Kristina Friesel | -10/+32
2024-07-25 | TRNS nmc: explicitly source SDK | Birte Kristina Friesel | -0/+2
2024-07-24 | CPU-DPU: correctly log numa_node_rank | Birte Kristina Friesel | -9/+12
2024-07-24 | CPU-DPU dimes24-transfer: decrease input size; do not pin memory for >20 ranks | Birte Kristina Friesel | -6/+8
2024-07-23 | GEMV: specify memcpy cpu node | Birte Kristina Friesel | -45/+87
2024-07-23 | VA: remove single-node with in/out on same node | Birte Kristina Friesel | -13/+3
2024-07-23 | BS baseline: Add memcpy variant | Birte Kristina Friesel | -42/+153
2024-07-22 | VA baseline: configurable memcpy NUMA binding | Birte Kristina Friesel | -18/+50
2024-07-22 | VA dimes-hetsim-nmc: variable input node | Birte Kristina Friesel | -2/+2
2024-07-18 | GEMV baseline: check for numa_alloc errors | Birte Kristina Friesel | -0/+7
2024-07-18 | GEMV baseline: decrease repetitions | Birte Kristina Friesel | -1/+1
2024-07-18 | GEMV baseline: ooopsie | Birte Kristina Friesel | -1/+1
2024-07-18 | GEMV: add MEMCPY variant | Birte Kristina Friesel | -7/+89
2024-07-18 | GEMV: move ifndef T out of NUMA block; set membind only once | Birte Kristina Friesel | -8/+10
2024-07-18 | TRNS: Update HBM script | Birte Kristina Friesel | -11/+23
2024-07-18 | TRNS dimes nmc: update for new baseline version | Birte Kristina Friesel | -18/+28
2024-07-17 | TRNS baseline: Add NUMA_MEMCPY support | Birte Kristina Friesel | -56/+121
2024-07-17 | VA: do not count move_pages towards timer | Birte Kristina Friesel | -1/+1
2024-07-17 | VA hetsim nmc: nits | Birte Kristina Friesel | -5/+5
2024-07-17 | VA memcpy: HBM support | Birte Kristina Friesel | -25/+63
2024-07-17 | VA dimes benchmarks: re-introduce --resume; add memcpy baseline | Birte Kristina Friesel | -25/+31
2024-07-17 | VA: Remove legacy run scripts | Birte Kristina Friesel | -136/+0
2024-07-17 | VA: Add baseline variant with memcpy overhead (always use local input data) | Birte Kristina Friesel | -5/+81
2024-07-17 | CPU-DPU dimes24-hetsim-transfer: decrease number of repetitions | Birte Kristina Friesel | -3/+3
2024-07-17 | TRNS DPU version: more fine-grained latency output | Birte Kristina Friesel | -37/+65
2024-07-17 | TRNS: document and update dimes-hetsim scripts | Birte Kristina Friesel | -20/+72