The Benchmark Problem: When Storage Vendors Claim 'Record-Setting' Performance Without Showing the Tests

Analysis of unverifiable performance claims from Cloudian and Dell ObjectScale - from '74% improvement' to 'record-setting' - and why benchmark numbers without methodology are marketing, not engineering.

Cloudian announces “74 percent improvement in data processing performance” [1]. Dell claims ObjectScale delivers “up to 2X greater throughput per node than the closest competitor” [2]. Both vendors publish specific numbers: 52,000 images per second, 230% higher throughput, 98% reduced CPU load. These benchmarks appear in press releases, blog posts, and sales materials as quantifiable proof of technical superiority.

The numbers look impressive. The problem: neither vendor publishes reproducible test methodologies, independent verification, or sufficient detail for customers to validate the claims. When pressed for specifics, Dell cites “internal analysis of publicly available data” without identifying which data. Cloudian describes test hardware but omits workload characteristics, batch sizes, or baseline configurations.

This pattern extends across the storage industry. Vendors claim “breakthrough performance,” “record-setting throughput,” and superiority over “competing systems” while withholding the test details that would enable verification. Let’s examine specific claims, what information is missing, and why benchmark transparency matters for purchasing decisions.

Cloudian’s PyTorch Performance Claims

Cloudian’s July 2025 announcement claims their RDMA connector for PyTorch delivers “74 percent improvement in data processing performance while simultaneously reducing CPU utilization by 43 percent” [1]. The specific benchmark: 52,000 images per second versus 30,000 with a standard S3 connector, based on “TorchBench testing.”
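The percentage itself is simple arithmetic on the two quoted throughput figures - the quick check below, using the rounded numbers from the announcement, reproduces roughly the headline improvement. Everything that determines whether that number transfers to another environment sits in the unpublished details.

```python
# Quick check using the rounded figures quoted in the announcement.
# The percentage is trivially derived; what would need publishing is the
# methodology behind the two throughput numbers themselves.
rdma_images_per_sec = 52_000      # claimed with the RDMA connector
baseline_images_per_sec = 30_000  # claimed with a standard S3 connector

improvement = (rdma_images_per_sec - baseline_images_per_sec) / baseline_images_per_sec
print(f"Improvement: {improvement:.1%}")  # ~73.3% from these rounded figures
```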

The test configuration description includes:

- The test hardware Cloudian ran on
- TorchBench as the benchmark suite
- The two headline figures: 52,000 images per second with the RDMA connector versus 30,000 with a standard S3 connector

What's missing:

- The workload characteristics: which TorchBench model, dataset, and image sizes
- Batch sizes and data-loader concurrency
- The configuration of the baseline S3 connector being compared against
- The measurement methodology: run duration, warm-up, and number of repetitions

Without these details, the “74% improvement” claim can’t be reproduced. An organization evaluating Cloudian can’t determine whether their workload would achieve similar results. The number might be accurate - or it might represent best-case conditions that don’t generalize.

Chris Mellor’s Blocks and Files coverage reports the claim without questioning the methodology gap [1]. The article presents vendor-supplied numbers as fact rather than examining what information is missing for independent verification.

Dell’s “2X Greater Throughput” Assertion

Dell’s April 2025 announcement claims ObjectScale XF960 delivers “up to 2X greater throughput per node than the closest competitor” [2]. The methodology disclosed: “based on large object read throughput per node and cluster configurations configured with ObjectScale XF960 and Ethernet networking…based on Dell internal analysis of publicly available data as of Mar. 2025.”

Break this down:

- “Up to 2X” states a ceiling, not a typical result.
- “Per node” says nothing about how whole clusters compare.
- “The closest competitor” is never named, and its configuration is never described.
- “Internal analysis of publicly available data” suggests Dell compared its own lab numbers against competitors’ published figures rather than running a head-to-head test - and the data is never identified.
- “As of Mar. 2025” fixes the comparison to whatever figures existed at that point.

Dell provides hardware specifications: PowerEdge R7725xd servers with AMD EPYC gen 5 CPUs, Nvidia BlueField-3 DPUs, and Spectrum-4 switches providing “up to 800 Gb per second” network connectivity. But the performance claim lacks corresponding detail. Which competing system was tested? What were its specifications? How were the tests conducted to ensure fair comparison?

The phrase “actual results may vary” appears as a disclaimer. This is standard legal protection, but it also acknowledges that the claimed 2X improvement may not materialize in customer deployments. Without test methodology, customers can’t determine whether their environment will match Dell’s optimized configuration.

Dell’s S3 over RDMA Numbers

The same Dell announcement claims S3 over RDMA provides “up to 230 percent higher throughput and up to 80 percent lower latency” with “98 percent reduced CPU load compared to traditional S3 data transfers” [3]. These are dramatic improvements - exactly the kind of claims that warrant detailed scrutiny.

The Blocks and Files coverage notes: “Dell has not provided specific throughput figures, which makes independent comparison difficult” [3]. The article mentions a tweet referencing 97Gbps but notes this wasn’t in the official announcement. The methodology? “Internal and preliminary Dell testing.”

Consider what “98% reduced CPU load” means operationally. If traditional S3 consumes 50% CPU during transfers, does 98% reduction mean 1% CPU? Or does it mean reducing from 10% to 0.2%? The baseline matters enormously for understanding whether this improvement is relevant to a specific workload.
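A quick worked example makes the baseline dependence concrete; the two starting points below are hypothetical, chosen only to mirror the question above.

```python
# Hypothetical baselines showing why "98% reduced CPU load" means little
# without knowing the starting point.
reduction = 0.98

for baseline_cpu_pct in (50.0, 10.0):
    after = baseline_cpu_pct * (1 - reduction)
    freed = baseline_cpu_pct - after
    print(f"baseline {baseline_cpu_pct:.0f}% CPU -> {after:.1f}% after the reduction "
          f"({freed:.1f} percentage points freed for other work)")
```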

The 230% throughput improvement similarly lacks context. Throughput for what object sizes? Sequential or random access? Single stream or concurrent? Network-bound or storage-bound workload? All these variables affect whether a customer’s use case would see similar gains.

Cloudian HyperStore 8 “Record-Setting Performance”

Cloudian’s November 2023 HyperStore 8 announcement uses “revolutionary,” “breakthrough,” and “record-setting” throughout [4]. The specific claim: “17.7GB/s write and 24.9GB/s read from a cluster of six power-efficient, single-processor servers.”

This provides absolute numbers rather than just percentages, which is better. But critical details remain absent:

- The object sizes and read/write mix behind the 17.7GB/s and 24.9GB/s figures
- The number of clients and concurrent streams driving the cluster
- The network configuration between the clients and the six servers
- The drive configuration inside each server
- Run duration, and whether the numbers are peaks or sustained averages

The announcement also claims “74% improvement in power efficiency compared to HDD-based systems.” Compared to what HDD system? With how many drives? At what utilization level? HDDs and SSDs have fundamentally different performance and power characteristics. A vague “74% improvement” doesn’t provide actionable information.

Cloudian CEO Jon Toor is quoted: “HyperStore 8 sets a new benchmark for the industry” [4]. But benchmarks that can’t be reproduced aren’t benchmarks - they’re marketing claims with numbers attached.

The “Breakthrough” and “Revolutionary” Language

Both Cloudian and Dell announcements deploy superlatives and unqualified comparatives extensively:

Cloudian uses:

- “Revolutionary”
- “Breakthrough”
- “Record-setting”
- “Sets a new benchmark for the industry”

Dell uses:

- “Up to 2X greater throughput per node than the closest competitor”
- “Up to 230 percent higher throughput and up to 80 percent lower latency”
- “98 percent reduced CPU load compared to traditional S3 data transfers”

This language pattern signals marketing content rather than technical documentation. Engineering papers describe performance characteristics with specificity: workload, configuration, measurement methodology. Marketing materials claim superiority using adjectives.

The Register’s forum discussion about durability claims applies equally to performance assertions: these numbers become “virtually meaningless” without methodology [5]. Storage professionals can’t make informed decisions based on “breakthrough” and “revolutionary.” They need reproducible tests.

What Transparency Looks Like

Contrast these vendor claims with transparent benchmark practices:

Backblaze publishes drive stats with complete data sets available for download [6]. They show failure rates across different drive models, deployment dates, and usage patterns. Anyone can verify their analysis or draw different conclusions from the raw data.
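As an illustration of what that openness enables, here is a minimal sketch of computing an annualized failure rate per model from the downloadable data, assuming the daily-snapshot CSV layout (one row per drive per day, with model and failure columns); the filename is a placeholder.

```python
# Sketch: annualized failure rate (AFR) per drive model from Backblaze's
# published drive-stats data. Assumes the daily-snapshot layout; adjust
# column names if the download you grab differs.
import pandas as pd

df = pd.read_csv("drive_stats_2024.csv", usecols=["date", "model", "failure"])

per_model = df.groupby("model").agg(
    drive_days=("failure", "size"),  # one row = one drive observed for one day
    failures=("failure", "sum"),     # 'failure' is 1 on the day a drive fails
)

# AFR = failures per drive-year, expressed as a percentage
per_model["afr_pct"] = per_model["failures"] / (per_model["drive_days"] / 365.0) * 100

print(per_model.sort_values("afr_pct", ascending=False).head(10))
```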

The Storage Performance Council (SPC) requires full disclosure for submitted benchmarks [7]. SPC-1 results must include hardware configuration, software versions, pricing, and availability dates. Third-party auditors verify the testing. Results are comparable across vendors because methodology is standardized.

Academic research demands reproducibility. Papers describing storage system performance include: complete system specifications, workload generators with parameters, measurement methodology with statistical analysis, and source code or data for independent verification.

Google’s transparency about their infrastructure, while limited, provides enough detail that storage engineers can reason about design decisions and performance characteristics [8]. When they claim performance improvements, the context and methodology are typically described.

What would transparent performance claims look like for Cloudian and Dell?

Cloudian’s PyTorch claim should include:

- The exact TorchBench workload: model, dataset, and image sizes
- Batch sizes, data-loader settings, and GPU server configuration
- The baseline S3 connector and how it was configured
- Storage cluster and network specifications
- Measurement methodology: run length, repetitions, and variance across runs

Dell’s 2X throughput claim should include:

- The identity and configuration of “the closest competitor”
- The “publicly available data” the internal analysis drew on
- Object sizes, access pattern, and concurrency for the tested workload
- Per-node and full-cluster results, not just a per-node ratio
- Typical results alongside the “up to” peak

None of this requires revealing trade secrets. It requires treating performance claims as engineering assertions subject to verification rather than marketing statements optimized for sales impact.

The Industry Problem

The storage vendor benchmark problem isn’t limited to Cloudian and Dell. It’s systemic:

VAST Data claims “100 GB/s sustained performance” without publishing test methodology [9]. What workload? What configuration? Unverifiable.

Pure Storage markets performance improvements with percentages but limited reproducible detail.

NetApp publishes benchmark results but often through controlled POCs rather than independently reproducible tests.

The pattern creates a race to the most impressive number rather than the most verifiable claim. When Vendor A announces “50% better performance,” Vendor B needs to claim “2X better” to compete, regardless of whether either claim can be substantiated.

This hurts customers who need to make informed purchasing decisions. When every vendor claims superior performance without providing verification methodology, how do organizations evaluate alternatives? The RFP process devolves into comparing marketing claims rather than engineering reality.

Why Vendors Resist Transparency

Publishing detailed benchmark methodology creates accountability. If a vendor claims 2X performance improvement and publishes complete test details, competitors can reproduce the test. If the results don’t match, the claim gets challenged publicly. If the test configuration is unrealistic, industry experts point out the gaps.

Vague claims with “internal testing” provide plausible deniability. If a customer doesn’t achieve claimed performance, the vendor can point to environmental differences, configuration variations, or workload mismatches. “Actual results may vary” becomes a liability shield rather than honest acknowledgment of configuration sensitivity.

Marketing departments prefer flexibility. “Up to 230% improvement” sounds impressive in sales presentations. The “up to” qualifier legally protects against false advertising while allowing the maximum number in marketing materials. Publishing methodology would force specifying typical or average performance - often less impressive than peak numbers.

Competitive secrecy provides cover. Vendors can claim they don’t want to reveal optimization techniques to competitors. But benchmark methodology doesn’t require revealing source code. It requires describing test configuration, workload characteristics, and measurement techniques - information necessary for customer verification.

What Customers Need

Organizations evaluating storage systems need different information than marketing departments want to provide:

Typical performance, not best case: “Up to 2X” tells you the ceiling. What’s the floor? What’s typical? What’s p50, p95, p99? A short sketch of percentile reporting follows this list.

Workload relevance: Does the benchmark match actual use cases? AI training workload benchmarks don’t predict video surveillance performance.

Configuration transparency: What hardware, network, and software configuration produced these numbers? Can that configuration be purchased?

Scalability characteristics: How does performance scale with capacity? With concurrent users? Under degraded conditions?

Failure mode behavior: What happens to performance when drives fail? During rebuilds? Under high CPU load?

Cost normalization: Is this 2X throughput at 3X cost? Performance per dollar matters more than peak performance.

Independent verification: Has anyone outside the vendor’s engineering team reproduced these results?
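To make the “typical, not best case” item concrete, here is a minimal sketch of percentile reporting; the latency samples are invented purely for illustration.

```python
# Minimal sketch: report a latency distribution instead of a single
# best-case number. The samples are invented for illustration only.
import statistics

latencies_ms = [2.1, 2.3, 2.2, 2.4, 2.2, 9.8, 2.3, 2.1, 2.5, 2.2,
                2.3, 14.2, 2.4, 2.2, 2.3, 2.1, 2.6, 2.2, 7.5, 2.3]

quantiles = statistics.quantiles(latencies_ms, n=100)  # 99 cut points
p50, p95, p99 = quantiles[49], quantiles[94], quantiles[98]

print(f"best case: {min(latencies_ms):.1f} ms")  # the number a press release quotes
print(f"p50: {p50:.1f} ms   p95: {p95:.1f} ms   p99: {p99:.1f} ms")  # what operations sees
```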

None of these questions can be answered with “74% improvement in data processing performance” claims lacking methodology. The impressive-sounding number obscures rather than illuminates operational reality.

The Path Forward

Storage vendors should adopt transparency standards:

Publish reproducible test methodology for all performance claims. Include hardware specifications, software versions, workload characteristics, and measurement techniques. A minimal sketch of what such a disclosure could look like follows this list.

Use third-party verification for competitive comparison claims. When claiming superiority over “the closest competitor,” name them and use standardized benchmarks.

Provide typical performance, not just peak. Show performance distributions, not just best-case numbers.

Include failure mode testing. Real-world performance includes degraded operation, not just optimal conditions.

Make data available for independent analysis. Raw benchmark data enables verification and builds trust.

Stop using superlatives without substantiation. “Revolutionary” and “breakthrough” without reproducible proof are marketing, not engineering.
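Here is the minimal disclosure sketch promised above - a hypothetical record a vendor could publish alongside a headline number. Every field name is illustrative; this is not an existing standard or any vendor’s actual format.

```python
# Hypothetical minimal benchmark disclosure - illustrative field names only.
benchmark_disclosure = {
    "claim": "X GB/s aggregate read throughput",
    "hardware": {
        "nodes": 6,
        "server_model": "...",     # exact SKU, CPU, RAM, drive count and model
        "network": "...",          # NICs, switches, link speed, topology
    },
    "software": {
        "storage_version": "...",  # exact build under test
        "client_stack": "...",     # OS, drivers, benchmark tool and version
    },
    "workload": {
        "object_size": "...",      # plus read/write mix and access pattern
        "concurrency": "...",      # clients, threads, queue depth
        "duration_s": 0,           # including how warm-up was handled
    },
    "results": {
        "runs": 0,
        "p50": 0.0, "p95": 0.0, "p99": 0.0,  # a distribution, not just the peak
        "raw_data_url": "...",     # downloadable data for independent analysis
    },
}
```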

Some vendors already follow better practices. Others could improve with modest effort. The industry benefits when customers can make informed decisions based on verifiable engineering rather than impressive-sounding marketing claims.

The Core Issue

Cloudian and Dell both build legitimate storage technology. Cloudian’s RDMA connector for PyTorch likely does improve performance. Dell’s ObjectScale XF960 with all-flash and modern networking probably does deliver good throughput. The engineering is real.

But “74% improvement” and “2X greater throughput” without reproducible methodology aren’t useful information for purchasing decisions. They’re marketing claims optimized for press releases rather than technical assertions subject to verification.

Storage technology is complex enough without vendors publishing benchmark numbers that can’t be reproduced. When Cloudian claims “record-setting performance” without showing the test, they’re asking customers to trust rather than verify. When Dell cites “internal analysis of publicly available data” without identifying the data, they’re avoiding accountability rather than enabling informed evaluation.

Customers making million-dollar storage decisions deserve better. They need reproducible benchmarks with complete methodology, independent verification where possible, and honest acknowledgment of limitations and variability.

Because when vendors claim “revolutionary” performance improvements without showing the tests, they’re selling marketing narratives, not engineering truth.


References

[1] Blocks and Files, “Cloudian plugs PyTorch into GPUDirect to juice AI training speeds,” July 15, 2025. https://blocksandfiles.com/2025/07/15/cloudian-rdma-connector-pytorch/

[2] Dell Technologies, “Dell Technologies Unveils Infrastructure Innovations Built to Power Modern AI-Ready Data Centers,” April 8, 2025. https://www.dell.com/en-us/dt/corporate/newsroom/announcements/detailpage.press-releases~usa~2025~04~dell-technologies-unveils-infrastructure-innovations-built-to-power-modern-ai-ready-data-centers.htm

[3] Blocks and Files, “Dell updates PowerScale, ObjectScale to accelerate AI Factory rollout,” May 20, 2025. https://blocksandfiles.com/2025/05/20/dell-updates-objectscale-and-powerscale-for-ai-nvidia-style/

[4] GlobeNewswire, “Cloudian Unveils HyperStore 8: A Breakthrough Global Unified File and Object Storage Platform,” November 14, 2023. https://www.globenewswire.com/news-release/2023/11/14/2779809/0/en/Cloudian-Unveils-HyperStore-8-A-Breakthrough-Global-Unified-File-and-Object-Storage-Platform.html

[5] The Register Forums, “Mmm, yes. 11-nines data durability? Mmmm, that sounds good. Except it’s virtually meaningless,” https://forums.theregister.com/forum/all/2018/07/19/data_durability_statements/

[6] Backblaze, “Hard Drive Stats,” https://www.backblaze.com/cloud-storage/resources/hard-drive-test-data

[7] Storage Performance Council, “SPC-1 Specification,” https://www.storageperformance.org/specs/SPC-1_v3.9.pdf

[8] Google Research, “Design Lessons and Advice from Building Large Scale Distributed Systems,” https://research.google/pubs/pub36737/

[9] StorageMath, “Understanding VAST Data’s Erasure Coding Architecture,” https://storagemath.com/posts/vast-data-erasure-coding/


StorageMath applies equal scrutiny to all vendors. Cloudian and Dell both build capable storage systems. Their benchmark claims need methodology, not just marketing numbers. Customers deserve reproducible tests, not impressive-sounding assertions.