Weka's SPECstorage Records: How Benchmark Transparency Should Work

Weka took the #1 ranking across all five SPECstorage Solution 2020 workloads in January 2025. More importantly, the results are independently audited and publicly verifiable—the standard every vendor should meet.

Weka and HPE announced SPECstorage Solution 2020 benchmark results in January 2025, claiming the #1 ranking across all five workloads: AI_IMAGE, EDA_BLENDED, GENOMICS, SWBUILD, and VDA [1]. The results reportedly include “significantly lower latency—in some cases up to 6.5x lower than previous records” [2].

These are substantial claims. Unlike most vendor benchmark announcements, however, they can be verified. SPECstorage results are independently audited and published on spec.org with full configuration disclosure. Anyone can review the methodology, compare against competitors, and evaluate whether the claims hold up to scrutiny [3].

This is how benchmark transparency should work.

What SPECstorage Measures

SPECstorage Solution 2020 is an industry-standard benchmark maintained by the Standard Performance Evaluation Corporation (SPEC). It measures storage system performance across five workloads designed to reflect real enterprise use cases:

AI_IMAGE simulates image training pipelines for machine learning—small file random reads with high concurrency, characteristic of GPU training workflows accessing millions of small image files.

EDA_BLENDED models electronic design automation workloads—mixed small and large file operations with complex dependency patterns typical of semiconductor design flows.

GENOMICS reflects bioinformatics pipelines—large sequential reads and writes characteristic of genome assembly and analysis workflows.

SWBUILD simulates software build environments—metadata-heavy operations with many small files, typical of large codebase compilation.

VDA represents video analytics—high-bandwidth streaming reads with consistent throughput requirements.

These workloads test different storage subsystems: metadata performance, random IOPS, sequential bandwidth, and consistency under load. Achieving #1 across all five requires balanced architecture rather than optimization for a single pattern.
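To help map these profiles onto your own environment, here is a minimal sketch that condenses the descriptions above into a lookup table and picks the published workload closest to a described I/O profile. The field values and the matching heuristic are informal simplifications, not SPEC terminology.

```python
# Condensed I/O profiles for the five workloads, per the descriptions above.
# The category labels are informal shorthand, not SPEC terminology.
WORKLOADS = {
    "AI_IMAGE":    {"files": "small", "pattern": "random read"},
    "EDA_BLENDED": {"files": "mixed", "pattern": "mixed"},
    "GENOMICS":    {"files": "large", "pattern": "sequential"},
    "SWBUILD":     {"files": "small", "pattern": "metadata"},
    "VDA":         {"files": "large", "pattern": "streaming read"},
}

def closest_workload(files: str, pattern: str) -> str:
    """Pick the workload whose profile matches the most fields."""
    def score(name: str) -> int:
        p = WORKLOADS[name]
        return (p["files"] == files) + (p["pattern"] == pattern)
    return max(WORKLOADS, key=score)

print(closest_workload("small", "random read"))  # -> AI_IMAGE
```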

The Verification Standard

SPECstorage results undergo independent auditing before publication. The process requires:

Full configuration disclosure: Hardware specifications (servers, storage, network), software versions, filesystem parameters, and client configuration must be documented. The Weka/HPE result specifies HPE Alletra Storage Server 4110 hardware with Intel Xeon processors [1][4].

Defined test execution: Benchmark runs follow prescribed procedures with specific ramp-up, measurement, and cool-down periods. Results reflect sustained performance, not burst capability.

Auditor review: SPEC reviews submissions for compliance before publication. Non-compliant results are rejected or require correction.

Public availability: Published results appear on spec.org where anyone can access full disclosure reports [3]. Competitors, customers, and analysts can examine methodology and compare configurations.

This process produces engineering data rather than marketing claims. When Weka claims #1 across all five workloads, that claim can be verified against published audit reports.
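Verifying is straightforward in practice. The sketch below pulls the public results index from spec.org [3] and prints links to disclosure reports that mention WEKA; it assumes the index is an HTML table with one row per submission, which may change if SPEC revises the page.

```python
# Minimal sketch, assuming the results index is an HTML table with one row
# per submission linking to its disclosure report. The markup details are
# an assumption; adjust the patterns if SPEC revises the page.
import re
import urllib.parse
import urllib.request

INDEX_URL = "https://www.spec.org/storage2020/results/"

with urllib.request.urlopen(INDEX_URL) as resp:
    html = resp.read().decode("utf-8", errors="replace")

# Scan table rows for the vendor name and print linked disclosure reports.
for row in re.findall(r"<tr.*?</tr>", html, flags=re.S | re.I):
    if "weka" in row.lower():
        for link in re.findall(r'href="([^"]+\.html)"', row, flags=re.I):
            print(urllib.parse.urljoin(INDEX_URL, link))
```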

The 6.5x Latency Claim

Weka’s announcement cites latency “up to 6.5x lower than previous records” in some workloads [2]. This claim requires context.

SPECstorage reports both a throughput-oriented business metric (jobs, builds, or streams completed) and overall response time (latency at each load point). The 6.5x improvement likely reflects specific workload and load-point combinations rather than universal latency reduction. Storage latency varies with queue depth, IO size, and workload pattern—a 6.5x improvement at one operating point may coexist with smaller improvements elsewhere.

The “up to” qualifier appropriately signals this is best-case, not typical. Marketing naturally emphasizes maximum improvements. The important point is that the underlying data is publicly available—interested parties can examine latency curves across all load points rather than relying on marketing’s selected highlights.
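A toy calculation makes the distinction concrete. The latency numbers below are invented for illustration (the real curves live in the spec.org disclosure reports); the point is that "up to 6.5x" is the maximum of a ratio taken across load points, which can sit well above the typical improvement.

```python
# Hypothetical latency curves (ms) at five increasing load points for a
# previous record holder and a new result. All numbers are invented for
# illustration; real curves come from the spec.org disclosure reports.
previous = [0.9, 1.3, 2.6, 5.2, 13.0]
new      = [0.7, 0.8, 1.0, 1.6,  2.0]

ratios = [p / n for p, n in zip(previous, new)]
print([round(r, 1) for r in ratios])              # [1.3, 1.6, 2.6, 3.2, 6.5]
print(f"up to {max(ratios):.1f}x lower latency")  # the headline number
print(f"median improvement: {sorted(ratios)[len(ratios) // 2]:.1f}x")  # 2.6x
```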

Why This Matters

Consider the contrast with unverified vendor claims:

VAST Data claims to be 25% faster than Iceberg, with 60x faster updates, without publishing methodology. Pure Storage claims to be “30% better than competitors” without specifying which competitors or test conditions. These claims cannot be independently evaluated—customers must trust vendor assertions.

Weka’s SPECstorage submission provides verifiable data. The same customers can access spec.org, review the full disclosure report, and draw independent conclusions. This shifts the relationship from “trust our marketing” to “verify our engineering.”

The verification standard matters beyond individual benchmarks. Vendors willing to face independent scrutiny demonstrate confidence in their technology. Vendors publishing only internal benchmarks—however impressive the numbers—leave customers dependent on marketing claims.

What Weka Gets Right

Participation in audited benchmarks: SPECstorage submission demonstrates willingness to face independent verification. Not all vendors make this choice.

Consistent cross-workload performance: Achieving #1 across five diverse workloads indicates architectural strength rather than workload-specific optimization. Systems optimized for sequential bandwidth often struggle with metadata operations, and vice versa.

Public methodology: Full configuration disclosure enables customers to evaluate relevance to their environments. A benchmark on different hardware than your deployment has limited applicability—knowing the test configuration enables informed assessment.

Historical track record: Weka has participated in SPECstorage previously, building a benchmark history that demonstrates consistent methodology rather than cherry-picked one-time results.

The Remaining Questions

Even verified benchmarks warrant scrutiny:

Configuration relevance: Benchmark configurations often differ from typical deployments. A maximum-performance reference architecture with expensive networking may not represent cost-effective production configurations. Customers should evaluate whether the tested configuration matches their intended deployment.

Workload applicability: SPECstorage workloads approximate but don’t exactly match production applications. AI_IMAGE simulates image training but may differ from your specific framework and access patterns. Benchmarks provide directional guidance, not guarantees.

Competitive context: Being #1 today doesn’t guarantee leadership tomorrow. Storage performance evolves as competitors release new products and update software. Point-in-time benchmark leadership is valuable but not permanent.

Cost-performance: SPECstorage reports performance but not cost. A #1 result at 10x the cost of #2 has different implications than #1 at comparable pricing. Total cost of ownership analysis requires additional data beyond benchmark reports.
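To make the cost-performance point concrete, here is a toy score-per-dollar calculation. Both the scores and the prices are invented, since SPEC reports contain no pricing; the structure of the comparison is what matters.

```python
# Toy score-per-dollar comparison. Scores stand in for a benchmark business
# metric; prices are invented, since SPEC reports include no cost data.
systems = {
    "Vendor A (#1)": {"score": 1200, "price_usd": 2_000_000},
    "Vendor B (#2)": {"score": 1000, "price_usd": 400_000},
}

for name, s in systems.items():
    per_million = s["score"] / s["price_usd"] * 1_000_000
    print(f"{name}: {per_million:.0f} score units per $1M")
# Vendor A leads on raw score but trails roughly 4x on score per dollar.
```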

These considerations apply to all benchmarks, including verified ones. The advantage of audited results is that customers can investigate these questions with actual data rather than marketing assertions.

The Industry Comparison

Weka’s benchmark transparency compares favorably to industry practice:

Weka: SPECstorage submissions with full disclosure, independently audited, publicly available.

Pure Storage: STAC-M3 submissions (audited) combined with unverified competitive claims.

VAST Data: Marketing benchmarks without published methodology or independent verification.

MinIO: Open-source benchmark tools (warp) enabling independent validation, though it does not participate in formal audit programs.

NetApp: Some SPEC submissions historically, mixed with unverified marketing claims.

The industry lacks consistent standards. Weka’s SPECstorage participation represents best practice that other vendors could adopt. The benchmarking infrastructure exists—SPEC, STAC, and TPC maintain audit programs for various workloads. Vendor participation is a choice.

For Evaluators

Organizations evaluating Weka should:

Review the disclosure report: Access the full report on spec.org and examine hardware configuration, software versions, and performance curves across load points. Determine whether the tested configuration matches your intended deployment.

Compare across vendors: SPEC publishes results from multiple vendors in one place, so direct comparison is possible using standardized methodology—something impossible with vendor-specific benchmarks. A sketch of such a comparison follows this list.

Conduct proof-of-concept testing: Even verified benchmarks don’t replace testing with your actual workloads. Weka’s SPECstorage results provide confidence for evaluation, not a substitute for validation.

Evaluate cost-performance: Performance leadership has value, but TCO analysis requires pricing data that benchmarks don’t include. Request pricing for configurations similar to benchmarked systems.
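As a sketch of the cross-vendor comparison mentioned above, suppose you hand-copy headline numbers from several disclosure reports into a CSV. The file name and column schema here are hypothetical; a few lines then surface the per-workload leaders.

```python
# Sketch of a cross-vendor comparison over a hand-built CSV. The file name
# and the column schema (vendor, workload, metric) are hypothetical.
import csv
from collections import defaultdict

best = defaultdict(lambda: ("", 0.0))  # workload -> (vendor, best score)

with open("specstorage_results.csv") as f:
    for row in csv.DictReader(f):
        score = float(row["metric"])
        if score > best[row["workload"]][1]:
            best[row["workload"]] = (row["vendor"], score)

for workload, (vendor, score) in sorted(best.items()):
    print(f"{workload}: {vendor} leads at {score:g}")
```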

The verification advantage is starting-point credibility. Weka’s SPECstorage results provide confidence that performance claims reflect actual capability rather than marketing invention. Detailed evaluation should follow that foundation.

The Standard We Should Expect

Storage vendors collectively would serve customers better by adopting verification standards:

Publish methodology: Every benchmark claim should include hardware configuration, software versions, test parameters, and raw data sufficient for reproduction.

Submit to audited programs: SPEC, STAC, and TPC maintain infrastructure for independent verification. Vendor participation demonstrates confidence and provides customer assurance.

Distinguish verified from marketing: When mixing audited results with internal benchmarks, clearly label which is which. Customers deserve to know the verification status of performance claims.

Weka’s SPECstorage participation demonstrates these practices are achievable. The question for other vendors is why they choose differently.

The Bottom Line

Weka claims #1 SPECstorage rankings across all five workloads with up to 6.5x latency improvements. These claims are verifiable—audited by SPEC and published with full methodology disclosure.

This represents how storage benchmarks should work. Independent verification, public methodology, and auditable results enable customers to evaluate claims rather than trust marketing. Weka’s willingness to face this scrutiny deserves recognition.

The technology may or may not be right for your workload—that requires detailed evaluation. But the benchmark claims meet verification standards that many competitors don’t match. In an industry prone to unverifiable marketing numbers, transparent methodology matters.

When vendors prove their claims through independent auditing, customers can make informed decisions. When vendors publish impressive numbers without verification, customers are left guessing. Weka chose verification.


References

[1] StorageNewsletter, “Weka.IO and HPE Achieve Unmatched SPECstorage Performance,” February 2025. https://www.storagenewsletter.com/2025/02/18/weka-io-and-hpe-achieve-unmatched-specstorage-performance/

[2] Weka Blog, “Making and Breaking Records: Do Benchmarks Matter?” https://www.weka.io/blog/distributed-file-systems/benchmarks-spec-2020/

[3] SPEC, “SPECstorage Solution 2020 Results.” https://www.spec.org/storage2020/results/

[4] SPEC, “HPE WEKA Physical Server Reference Result,” January 2025. https://www.spec.org/storage2020/results/res2025q1/storage2020-20250113-00107.html


StorageMath advocates for benchmark transparency across the storage industry. Weka’s SPECstorage participation demonstrates the standard vendors should meet. Independent verification enables informed decisions—marketing numbers without methodology do not.