Humanoid Readiness Level (HRL) Framework

The Humanoid Readiness Level (HRL) is a nine-point scale developed by HumanoidRR to give technical leaders a common vocabulary and structured methodology for assessing whether a humanoid robot platform is ready to move from evaluation into live production deployment.

It is adapted from NASA's Technology Readiness Level (TRL) framework, reoriented specifically for the operational, integration, and safety requirements of commercial humanoid deployments in industrial and logistics environments.

Unlike TRL, which measures how mature a technology is in isolation, HRL measures readiness in context: how ready is this robot, performing this task, in this facility, under these operational conditions? The same platform may be at HRL 7 in a structured logistics environment and HRL 3 in an unstructured warehouse with mixed human traffic.

"HRL is not a score you give a robot. It is a score you give a deployment scenario."

The HRL Scale

The scale runs from HRL 1 (concept only, no hardware) through HRL 9 (full autonomous production operation at scale). Levels 1–3 are pre-deployment research phases. Levels 4–6 are controlled testing and piloting. Levels 7–9 represent live operational deployment at increasing scale and autonomy.

HRL SCALE OVERVIEW

Level Definitions

Each level has a name, a definition, and a set of evidence criteria — the observable, testable conditions that must be met before a deployment can be assessed at that level. The criteria are intentionally concrete. "The robot performs well" is not evidence. "The robot completes 95% of pick tasks at the designated waypoints within 30 seconds, with no human intervention, across 500 consecutive cycles" is evidence.
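
As an illustration of what "concrete" means here, the example criterion above can be turned into a check over logged cycles. This is a minimal sketch, not part of the framework itself: the Cycle record layout and the function name are hypothetical, and it assumes a cycle only counts if it completes within the time limit without intervention.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cycle:
    """One logged pick cycle (hypothetical record layout)."""
    completed: bool           # pick finished at the designated waypoint
    duration_s: float         # time from task start to completion
    human_intervention: bool  # True if an operator had to step in


def pick_criterion_met(cycles, min_cycles=500, min_rate=0.95, max_duration_s=30.0):
    """Check the example criterion over the most recent `min_cycles` cycles."""
    if len(cycles) < min_cycles:
        return False  # not enough consecutive evidence yet
    window = cycles[-min_cycles:]
    passed = sum(
        1
        for c in window
        if c.completed and c.duration_s <= max_duration_s and not c.human_intervention
    )
    return passed / len(window) >= min_rate


# Hypothetical log: 480 clean cycles followed by 20 that ran too long.
log = [Cycle(True, 22.0, False)] * 480 + [Cycle(True, 41.0, False)] * 20
print(pick_criterion_met(log))  # True (480 of 500 cycles pass)
```

The point of the sketch is that the result is a yes or no derived from logged data, which is what separates evidence from an impression of performance.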

Phase 1 — Research & Evaluation (HRL 1–3)

HRL 1 Concept Stage

The organisation has identified a use case for humanoid robots but has not begun hardware evaluation. Deployment is a hypothesis, not a project.

Evidence Criteria

— Use case documented with defined task scope and facility context
— Initial market survey of available platforms completed
— Internal stakeholder alignment on intent to evaluate

HRL 2 Technology Concept Defined

A specific robot platform has been selected for evaluation. The integration approach has been designed on paper — how the robot will connect to facility systems, what fleet management software will be used, and what the operator workflow will look like.

Evidence Criteria

—Platform vendor selected, hardware specification confirmed
—Fleet management software identified (e.g. Open-RMF)
—Integration architecture documented: fleet adapter, nav graph, workcell interfaces
—IT and OT infrastructure requirements identified
—Preliminary safety assessment completed

HRL 3 Component Integration Validated

All hardware and software components are integrated and functional together. The fleet adapter is communicating with the robot. The nav graph is built. The robot can receive and execute commands from the fleet management system. No production tasks have been attempted.

Evidence Criteria

—Fleet adapter live: robot state reporting to RMF at ≤2s poll interval
—Nav graph validated: all waypoints reachable, single-robot coordinate spot-check within 0.3m
—Task submission confirmed: Go-To-Place and Delivery tasks accepted and executed
—Alarm handling validated: all three alarm tiers tested and operator response documented
—E-stop tested: robot halts within 500ms of emergency stop signal
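
Three of the HRL 3 criteria above carry numeric thresholds, so they can be checked directly from test measurements. The sketch below is one possible way to do that, assuming the measurements have already been collected; the function names and input layout are hypothetical, and it does not call any real fleet management API.

```python
import math


def max_report_gap_s(timestamps_s):
    """Largest gap, in seconds, between consecutive robot-state reports."""
    if len(timestamps_s) < 2:
        raise ValueError("need at least two state reports")
    return max(later - earlier for earlier, later in zip(timestamps_s, timestamps_s[1:]))


def hrl3_measurements(timestamps_s, waypoint_pairs, estop_latency_ms):
    """Return a pass/fail result for each measurable HRL 3 threshold.

    waypoint_pairs: list of ((x, y) from the nav graph, (x, y) surveyed on site),
    both in metres.
    """
    return {
        "state_reporting": max_report_gap_s(timestamps_s) <= 2.0,
        "nav_spot_check": all(math.dist(g, s) <= 0.3 for g, s in waypoint_pairs),
        "estop": estop_latency_ms <= 500,
    }


# Hypothetical measurements from an integration test session.
result = hrl3_measurements(
    timestamps_s=[0.0, 1.5, 3.0, 4.5],
    waypoint_pairs=[((0.0, 0.0), (0.1, 0.2)), ((5.0, 2.0), (5.0, 2.25))],
    estop_latency_ms=420,
)
print(result)
```

The remaining HRL 3 criteria (task submission and alarm handling) are procedural and would still need documented test records rather than a numeric check.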

Phase 2 — Piloting (HRL 4–6)

HRL 4 Lab Validation — Full Task Cycle

The complete end-to-end task cycle — from task submission through execution, completion, and reporting — has been validated in a controlled lab environment. The lab replicates key aspects of the target deployment but is not the production facility.

Evidence Criteria

—100 consecutive task cycles completed with <5% failure rate
—Multi-robot test: 3+ robots dispatched simultaneously with no deadlocks
—Failure recovery validated: robot recovers from navigation failure and API timeout within defined miss counter thresholds
—Charging cycle validated: robot autonomously routes to charger at low battery and resumes tasks post-charge
—Adapter crash recovery tested: persistent state file correctly restores task context after restart
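
The failure recovery criterion refers to miss counter thresholds without defining them. One common reading, assumed here, is a counter of consecutive failed checks (for example API polls that time out) that triggers escalation once it reaches a limit. The class and function below are hypothetical and only sketch that idea.

```python
class MissCounter:
    """Counts consecutive failed checks and signals when a threshold is reached."""

    def __init__(self, threshold):
        if threshold < 1:
            raise ValueError("threshold must be at least 1")
        self.threshold = threshold
        self.misses = 0

    def record(self, ok):
        """Record one check result; return True if escalation is due."""
        if ok:
            self.misses = 0  # any success resets the streak
        else:
            self.misses += 1
        return self.misses >= self.threshold


def recovered_within_threshold(results, threshold):
    """True if a run of check results never reaches `threshold` consecutive misses."""
    counter = MissCounter(threshold)
    return not any(counter.record(ok) for ok in results)


# Hypothetical poll results during an injected API timeout (False = missed poll).
print(recovered_within_threshold([True, False, False, True, True], threshold=3))
```

Under this reading, a recovery test passes when the robot resumes normal reporting before the counter reaches the threshold chosen for that failure type.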

HRL 5 Controlled Pilot — Representative Environment

The deployment is tested in an environment that closely mimics the target production facility — same floor surface, similar obstacle profile, representative traffic conditions — but is not yet the live production line. Human workers may be present but are briefed participants, not uninformed co-workers.

Evidence Criteria

—500+ task cycles in representative environment, <5% failure rate sustained
—Human co-presence validated: robot yields correctly to humans crossing path, no unsafe proximities recorded
—Full shift duration test: 8-hour continuous operation with no unrecovered faults
—Operator runbook complete: all alarm types, response procedures, and escalation paths documented
—Safety authority sign-off obtained for human co-presence operation
—Baseline KPIs established: task throughput/shift, failure rate, fleet utilisation, avg task duration

HRL 6 Full Pilot — Production Environment

The robot is operating in the actual target facility, executing real tasks alongside uninformed co-workers. Task outcomes are tracked but not yet counted toward production targets. This is the critical gate before live production — issues found at HRL 6 are fixable. Issues found at HRL 7 are incidents.

Evidence Criteria

—2-week continuous operation in live facility, no production-counted tasks
—All co-workers given a basic briefing on robot presence and right-of-way rules (they are otherwise uninformed, not pilot participants)
—Zero safety incidents requiring regulatory reporting
—KPIs tracking within 15% of HRL 5 baseline (no significant degradation in real environment)
—Facility layout drift check completed: SLAM map validated against current physical layout
—Maintenance schedule established and first scheduled maintenance completed
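
The KPI criterion above compares live-facility results against the HRL 5 baseline. A minimal sketch of that comparison follows. It assumes "within 15%" means relative degradation per KPI, that improvements never count against the deployment, and that baselines are non-zero; the KPI names, directions, and values are hypothetical.

```python
# KPI name -> True if a higher value is better (assumed directions).
HIGHER_IS_BETTER = {
    "throughput_per_shift": True,
    "failure_rate": False,
    "fleet_utilisation": True,
    "avg_task_duration_s": False,
}


def degradation(baseline, current):
    """Relative degradation per KPI versus baseline; improvements count as 0."""
    result = {}
    for name, base in baseline.items():
        change = (current[name] - base) / base  # assumes base is non-zero
        worse = -change if HIGHER_IS_BETTER[name] else change
        result[name] = max(0.0, worse)
    return result


def kpis_within_baseline(baseline, current, tolerance=0.15):
    """True if no KPI has degraded by more than `tolerance` relative to baseline."""
    return all(d <= tolerance for d in degradation(baseline, current).values())


# Hypothetical HRL 5 baseline and HRL 6 measurements.
baseline = {"throughput_per_shift": 120, "failure_rate": 0.04,
            "fleet_utilisation": 0.70, "avg_task_duration_s": 90}
current = {"throughput_per_shift": 110, "failure_rate": 0.045,
           "fleet_utilisation": 0.72, "avg_task_duration_s": 95}
print(kpis_within_baseline(baseline, current))
```

Whether the tolerance is read as relative change or as percentage points matters most for failure rate, so the assessment record should state which reading was used.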

Phase 3 — Production Deployment (HRL 7–9)

HRL 7 Live Production — Initial Deployment

A single robot is operating in live production. Task completions count toward production targets. A dedicated operator is present and monitoring continuously. Incident reporting is active and all faults are logged, investigated, and resolved before the next shift.

Evidence Criteria

—Production task completion rate ≥90% over first 30 days
—Fleet utilisation 60–85% across measured shifts
—All faults classified, root-caused, and resolved within 24 hours
—KPI trending stable or improving across 4-week rolling average
—Operator competency confirmed: operator resolves Tier 1 and Tier 2 alarms without technical escalation
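
The trend criterion above can be checked with a simple rolling average over weekly figures. The sketch below assumes a KPI where higher is better (such as completion rate) and treats "stable or improving" as a rolling mean that never decreases beyond an optional slack; both function names and the data are hypothetical.

```python
def rolling_mean(values, window=4):
    """Rolling mean over `window` consecutive values."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


def stable_or_improving(weekly_values, window=4, slack=0.0):
    """True if each rolling mean is at least the previous one minus `slack`."""
    means = rolling_mean(weekly_values, window)
    return all(later >= earlier - slack for earlier, later in zip(means, means[1:]))


# Hypothetical weekly production task completion rates.
weekly_completion = [0.90, 0.91, 0.92, 0.93, 0.94, 0.95]
print(stable_or_improving(weekly_completion))
```

For KPIs where lower is better, such as failure rate, the comparison would be reversed. A small slack value can be used so that ordinary week-to-week noise is not read as a decline.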

HRL 8 Supervised Production at Scale

Multiple robots are operating in production. The operator monitors rather than accompanies. KPIs are formally tracked against targets. The deployment is generating reliable performance data that informs scaling decisions.

Evidence Criteria

—3+ robots in simultaneous production operation
—Task failure rate <5% sustained over 90-day rolling period
—Formal KPI reporting to management on weekly cadence
—ROI model updated with actual cost and output data
—Scaling readiness assessment completed: facility capacity, charger density, operator ratio confirmed for next fleet increment

HRL 9 Full Autonomous Production

The fleet operates autonomously within defined parameters. Human oversight is supervisory, not operational. The system self-manages charging, task queuing, and routine fault recovery. Operational decisions are data-driven and the deployment is considered mature.

Evidence Criteria

—Fleet operates full shifts without operator intervention on >85% of shifts
—Task failure rate <3% over 6-month rolling period
—Actual ROI within 20% of pre-deployment model
—No regulatory safety incidents in preceding 6 months
—Continuous improvement process in place: KPI trends reviewed monthly, nav graph and adapter tuned based on performance data

Applying the Framework

HRL assessment is conducted as a structured review — typically a half-day workshop with the technical lead, operations manager, and safety officer. The assessor works through the evidence criteria for the claimed level and the level above, documenting what evidence exists and what is outstanding.

A deployment is at the highest level for which all evidence criteria are met. Meeting 5 of 6 criteria at HRL 6 means the deployment is at HRL 5, not HRL 6. The criteria are binary: met or not met. Partial credit is not part of the methodology.
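
The scoring rule can be sketched as a short function. This is an illustration under two assumptions that the text implies but does not state outright: levels are cumulative (a level only counts if every level below it is also fully met), and a deployment that does not yet meet HRL 1 is reported as 0. The function name and input layout are hypothetical.

```python
def assess_hrl(criteria_met):
    """Return the assessed HRL for a deployment scenario.

    criteria_met maps each level (1 to 9) to a list of booleans,
    one per evidence criterion: True if met, False if not.
    """
    level = 0
    for candidate in range(1, 10):
        results = criteria_met.get(candidate, [])
        if not results or not all(results):
            break  # binary rule: one unmet criterion fails the whole level
        level = candidate
    return level


# Hypothetical assessment: HRL 1 to 5 fully evidenced, one HRL 6 criterion outstanding.
assessment = {lvl: [True] * 5 for lvl in range(1, 6)}
assessment[6] = [True, True, True, True, True, False]
print(assess_hrl(assessment))  # 5
```

The same function supports reassessment after a regression: re-evaluate each criterion against current evidence and the assessed level falls to wherever the evidence still holds.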

Common Mistake — Organisations frequently assess themselves at a higher HRL than evidence supports. The most common gap is between HRL 5 and HRL 6: a controlled pilot in a representative environment is often described as a "production pilot" because the robot is physically in the production facility for part of the test. HRL 6 requires uninformed co-workers, production-environment conditions, and 2 weeks of continuous operation — not a single demonstration day.

Typical Assessment Timelines

The following timelines are based on HumanoidRR's observations across commercial humanoid deployments in Australia and internationally. They represent typical durations for well-resourced deployments with experienced integration teams. Under-resourced or first-time deployments should add 40–60% to each phase.

TYPICAL TIMELINE — HRL 1 TO HRL 7

Why Timelines Blow Out — The most common cause of timeline overrun between HRL 3 and HRL 6 is underestimating the nav graph and facility preparation work. Building the integration is typically 60% of the schedule organisations plan for. Validating it in a real facility with real workers is the other 40% — and it almost never goes to plan on the first attempt. Plan for two full cycles of nav graph revision before reaching HRL 6.

Using HRL in Practice

The framework is designed to be used in three distinct ways depending on your role:

For CTOs evaluating a deployment: Use HRL as a gate structure. Define the evidence criteria you require at each gate before the project starts. Require vendors to demonstrate evidence, not claim a level. The most useful gate is the HRL 5→6 transition — requiring 2 weeks of uninformed co-worker operation before counting any production output forces the deployment to prove itself in realistic conditions.

For operations managers running a live fleet: Use HRL to classify incidents. A deployment that was at HRL 8 and experiences a sustained failure rate above 10% has regressed — reassess at what level it now sits and what evidence is needed to restore it. This framing makes remediation concrete rather than vague.

For boards and executives approving spend: Use HRL to anchor the conversation about timelines and scope. When a vendor says "we can be in production in six months," ask what HRL that represents. If the answer is HRL 6, the honest answer is that live production — HRL 7 — is likely 3–6 months beyond that.

This framework is published by HumanoidRR and is free to use, adapt, and share with attribution. If you are applying HRL to an active deployment and would like an independent assessment, contact our enterprise team. If you want to go deeper on the technical integration criteria behind HRL 3–6, our Deployment Management Course covers each criterion in detail.
