Quallaa

Risk Scoring for
Public Facing AI

Eight Dimensions. Compound Scoring.
Proportional Safety.

Jeff Toffoli

Founder, Quallaa

The Problem

Same Model, Different Risk

🔧

Plumber's Text-Back Bot

"Sorry we missed your call! How can we help?"

50 contacts · Books appointments · Unregulated

📺

Autonomous Super Bowl Ad

AI writes, produces, and broadcasts to 100M viewers. No human review.

100M audience · Fully autonomous · Brand at stake

Both powered by the same foundation model. Both get identical safety treatment.

The Problem

The Gap

NIST AI RMF

7 trustworthy AI characteristics, 211 actions

No scoring system. No deployment-level assessment.

EU AI Act

4 risk tiers. Same model = different risk by deployment.

Binary classification (high/not-high). No graduation.

OWASP Agentic

10 agentic risk categories. Least Agency principle.

Security checklist, not a scoring framework.

No one scores deployment risk.

The question "Is this specific deployment configured safely?" has no standardized answer.

The Framework

Eight Risk Dimensions

Independent axes. Each scored 1-5. Together they define a deployment's risk profile. Each dimension below is shown with its level 1 (lowest-risk) description.

Autonomy

Agent drafts; human reviews and sends every response

Action Capability

Can only read context and generate text responses

Consequence Severity

Wrong information, wasted time

Reversibility

Text response

Audience Exposure

< 50 known contacts

Domain Sensitivity

Retail, food service, general services

Identity Representation

Generic AI assistant

Data Sensitivity

Works with publicly available information
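One way to hold a deployment's eight scores is a small validated record. This is a minimal sketch, not part of the framework itself: the class name `RiskProfile`, its field names, and the `as_list` helper are hypothetical, chosen only to mirror the eight dimensions and the 1-5 range described above.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class RiskProfile:
    """Hypothetical container: one integer score (1-5) per dimension."""
    autonomy: int
    action_capability: int
    consequence_severity: int
    reversibility: int
    audience_exposure: int
    domain_sensitivity: int
    identity_representation: int
    data_sensitivity: int

    def __post_init__(self):
        # Reject anything outside the 1-5 scale used by every dimension.
        for f in fields(self):
            value = getattr(self, f.name)
            if not isinstance(value, int) or not 1 <= value <= 5:
                raise ValueError(f"{f.name} must be an integer 1-5, got {value!r}")

    def as_list(self):
        # Scores in the same order the dimensions are listed in the deck.
        return [getattr(self, f.name) for f in fields(self)]


# The lowest-risk profile: every dimension at level 1.
lowest = RiskProfile(1, 1, 1, 1, 1, 1, 1, 1)
print(lowest.as_list())
```

Keeping the scores in deck order makes it easy to compare a profile against the bracketed score lists used later in the slides.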

Dimension

Autonomy

Level 1: Human-controlled

Agent drafts; human reviews and sends every response

Example

Email draft assistant with send approval

Scale: 1 (low risk) to 5 (high risk)

Dimension

Action Capability

Level 1: Read and respond

Can only read context and generate text responses

Example

Simple Q&A chatbot with no integrations

Scale: 1 (low risk) to 5 (high risk)

Dimension

Consequence Severity

Level 1: Inconvenience

Wrong information, wasted time. No financial or legal exposure

Example

Agent gives wrong business hours

Scale: 1 (low risk) to 5 (high risk)

Dimension

Reversibility

Level 1: Instantly reversible

Text response. Customer can ignore it.

Example

Wrong info in a text — next message corrects it

Scale: 1 (low risk) to 5 (high risk)

Dimension

Audience Exposure

Level 1: Known individuals

< 50 known contacts. Owner knows everyone.

Example

Plumber's agent serving existing customers

Scale: 1 (low risk) to 5 (high risk)

Dimension

Domain Sensitivity

Level 1: Unregulated

Retail, food service, general services

Example

Plumber, restaurant, hair salon

Scale: 1 (low risk) to 5 (high risk)

Dimension

Identity Representation

Level 1: Clearly labeled tool

Generic AI assistant. No business identity.

Example

"AI Assistant" widget on a website

Scale: 1 (low risk) to 5 (high risk)

Dimension

Data Sensitivity

Level 1: Public info only

Works with publicly available information

Example

FAQ bot using public website content

Scale: 1 (low risk) to 5 (high risk)

The Framework

The Full Spectrum

Autonomy

Human-controlled

Human-gated

Supervised autonomous

Broadly autonomous

Fully autonomous

Action Capability

Read and respond

Read external systems

Write to controlled systems

Write to external systems

Transact and commit

Consequence Severity

Inconvenience

Minor financial

Significant financial

Legal or regulatory

Safety or irreversible

Reversibility

Instantly reversible

Easily reversible

Reversible with effort

Difficult to reverse

Irreversible

Audience Exposure

Known individuals

Known community

Open public, local

Open public, broad

Open public, massive

Domain Sensitivity

Unregulated

Lightly regulated

Moderately regulated

Heavily regulated

Critical infrastructure

Identity Representation

Clearly labeled tool

Business rep, disclosed

Business rep, contextual

Professional authority

Institutional authority

Data Sensitivity

Public info only

Basic contact info

Business-sensitive

Personal sensitive

Protected categories

8 dimensions, each with 5 levels: 5^8 = 390,625 possible combinations. That's why you need compound scoring.
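The spectrum above can be written down as a lookup table, which also makes the combination count easy to check. This is an illustrative sketch: the `LEVELS` dictionary and the `label` helper are hypothetical names, while the level labels themselves are copied from the slide.

```python
# Level names for each dimension, ordered from level 1 to level 5.
LEVELS = {
    "Autonomy": ["Human-controlled", "Human-gated", "Supervised autonomous",
                 "Broadly autonomous", "Fully autonomous"],
    "Action Capability": ["Read and respond", "Read external systems",
                          "Write to controlled systems",
                          "Write to external systems", "Transact and commit"],
    "Consequence Severity": ["Inconvenience", "Minor financial",
                             "Significant financial", "Legal or regulatory",
                             "Safety or irreversible"],
    "Reversibility": ["Instantly reversible", "Easily reversible",
                      "Reversible with effort", "Difficult to reverse",
                      "Irreversible"],
    "Audience Exposure": ["Known individuals", "Known community",
                          "Open public, local", "Open public, broad",
                          "Open public, massive"],
    "Domain Sensitivity": ["Unregulated", "Lightly regulated",
                           "Moderately regulated", "Heavily regulated",
                           "Critical infrastructure"],
    "Identity Representation": ["Clearly labeled tool", "Business rep, disclosed",
                                "Business rep, contextual",
                                "Professional authority",
                                "Institutional authority"],
    "Data Sensitivity": ["Public info only", "Basic contact info",
                         "Business-sensitive", "Personal sensitive",
                         "Protected categories"],
}


def label(dimension, score):
    """Return the level name for a 1-5 score on the given dimension."""
    if not 1 <= score <= 5:
        raise ValueError("score must be between 1 and 5")
    return LEVELS[dimension][score - 1]


# Each dimension is chosen independently, so the counts multiply.
total_combinations = 1
for names in LEVELS.values():
    total_combinations *= len(names)

print(label("Autonomy", 3))
print(total_combinations)
```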

Compound Scoring

Why Not Simple Math?

Deployment A: One Spike

[3, 1, 5, 1, 1, 1, 1, 1]

One dimension at max, one moderate, rest minimal

Simple average: 1.8

Deployment B: Uniform Moderate

[3, 3, 3, 3, 3, 3, 3, 3]

Every dimension at moderate

Simple average: 3.0

A simple average treats all eight dimensions as equal. But uniformly moderate risk across 8 dimensions is far more dangerous than one spike.
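The arithmetic behind the two figures can be checked directly. The sketch below only computes the plain arithmetic mean and the maximum of each score list; it makes no attempt at the compound score, whose formula is not given on these slides. Variable names are illustrative.

```python
# Score lists as shown on the slide, in dimension order.
deployment_a = [3, 1, 5, 1, 1, 1, 1, 1]  # one spike
deployment_b = [3, 3, 3, 3, 3, 3, 3, 3]  # uniform moderate

# Plain arithmetic mean: every dimension gets the same weight.
avg_a = sum(deployment_a) / len(deployment_a)
avg_b = sum(deployment_b) / len(deployment_b)

# The mean alone says nothing about shape; the maximum shows the spike.
peak_a = max(deployment_a)
peak_b = max(deployment_b)

print(f"A: mean={avg_a:.2f}, max={peak_a}")
print(f"B: mean={avg_b:.2f}, max={peak_b}")
```

Deployment A's mean is 14 / 8 = 1.75, which the slide shows rounded to 1.8; Deployment B's is 24 / 8 = 3.0. A single number of this kind hides that A contains a level 5 dimension and B does not.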

Hard Stops

Dangerous Combinations

Regardless of compound score, these combinations trigger mandatory requirements

Any dimension = 5

Human oversight plan + incident response

Consequence >= 4 AND Autonomy >= 4

Human-in-the-loop on all consequential actions

Data Sensitivity >= 4 AND Action >= 3

Data access audit logging + retention policy

Domain Sensitivity >= 4

Domain-specific compliance review

Identity >= 4 AND Consequence >= 3

Explicit AI disclosure at first contact
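The five rules above translate directly into a rule check. The following is a minimal sketch assuming scores arrive as a dictionary; the function name `hard_stops` and the dictionary keys are hypothetical, while the thresholds and requirement texts are taken from the slide. The two sample profiles use the scores from the Plumber and Financial Services examples later in the deck.

```python
def hard_stops(scores):
    """Return the mandatory requirements triggered by a score dictionary.

    `scores` maps each of the eight dimension keys to an integer 1-5.
    """
    required = []
    if any(value == 5 for value in scores.values()):
        required.append("Human oversight plan + incident response")
    if scores["consequence"] >= 4 and scores["autonomy"] >= 4:
        required.append("Human-in-the-loop on all consequential actions")
    if scores["data"] >= 4 and scores["action"] >= 3:
        required.append("Data access audit logging + retention policy")
    if scores["domain"] >= 4:
        required.append("Domain-specific compliance review")
    if scores["identity"] >= 4 and scores["consequence"] >= 3:
        required.append("Explicit AI disclosure at first contact")
    return required


KEYS = ["autonomy", "action", "consequence", "reversibility",
        "audience", "domain", "identity", "data"]

plumber = dict(zip(KEYS, [3, 3, 1, 2, 1, 1, 3, 2]))
financial = dict(zip(KEYS, [4, 5, 5, 4, 4, 4, 5, 5]))

print(hard_stops(plumber))          # no rule fires for the plumber
print(len(hard_stops(financial)))   # all five rules fire
```

Because the rules are evaluated independently of any aggregate score, a deployment with a low overall score can still pick up a mandatory requirement from a single dimension.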

Risk Tiers

Five Tiers, Proportional Safety

Live Scoring

The Plumber

Missed-call text-back agent

Autonomy

3

Supervised autonomous

Action Capability

3

Write to controlled systems

Consequence Severity

1

Inconvenience

Reversibility

2

Easily reversible

Audience Exposure

1

Known individuals

Domain Sensitivity

1

Unregulated

Identity Representation

3

Business rep, contextual

Data Sensitivity

2

Basic contact info

Compound Risk Score

1.73

Tier 1: Low Risk

Live Scoring

The Restaurant

Booking agent with calendar access

Autonomy

3

Supervised autonomous

Action Capability

3

Write to controlled systems

Consequence Severity

2

Minor financial

Reversibility

2

Easily reversible

Audience Exposure

3

Open public, local

Domain Sensitivity

1

Unregulated

Identity Representation

2

Business rep, disclosed

Data Sensitivity

2

Basic contact info

Compound Risk Score

2.11

Tier 2: Moderate Risk

Live Scoring

Healthcare Intake

Clinic intake agent collecting symptoms

Autonomy

2

Human-gated

Action Capability

2

Read external systems

Consequence Severity

4

Legal or regulatory

Reversibility

4

Difficult to reverse

Audience Exposure

3

Open public, local

Domain Sensitivity

4

Heavily regulated

Identity Representation

4

Professional authority

Data Sensitivity

5

Protected categories

Compound Risk Score

3.26

Tier 3: Elevated Risk

Hard-Stops Triggered

Domain Sensitivity >= 4: compliance review required

Data Sensitivity = 5: human oversight plan and incident response required

Identity >= 4 AND Consequence >= 3: explicit AI disclosure required

Live Scoring

Financial Services

Transaction-capable financial agent

Autonomy

4

Broadly autonomous

Action Capability

5

Transact and commit

Consequence Severity

5

Safety or irreversible

Reversibility

4

Difficult to reverse

Audience Exposure

4

Open public, broad

Domain Sensitivity

4

Heavily regulated

Identity Representation

5

Institutional authority

Data Sensitivity

5

Protected categories

Compound Risk Score

4.50

Tier 5: Critical Risk

Hard-Stops Triggered

Consequence >= 4 AND Autonomy >= 4: human-in-the-loop required

Data Sensitivity >= 4 AND Action >= 3: audit logging required

Domain Sensitivity >= 4: compliance review required

Identity >= 4 AND Consequence >= 3: explicit AI disclosure required

Multiple dimensions = 5: human oversight plan and incident response required
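For reference, the four worked examples can be set beside their plain averages. The slides do not state the compound formula, so this sketch does not try to reproduce it; it only computes the arithmetic mean of each score list and prints it next to the compound score shown on the slide. The `examples` and `averages` names are illustrative.

```python
# (dimension scores in deck order, compound score shown on the slide)
examples = {
    "The Plumber": ([3, 3, 1, 2, 1, 1, 3, 2], 1.73),
    "The Restaurant": ([3, 3, 2, 2, 3, 1, 2, 2], 2.11),
    "Healthcare Intake": ([2, 2, 4, 4, 3, 4, 4, 5], 3.26),
    "Financial Services": ([4, 5, 5, 4, 4, 4, 5, 5], 4.50),
}

averages = {}
for name, (scores, compound) in examples.items():
    # Plain arithmetic mean, for comparison only.
    averages[name] = sum(scores) / len(scores)
    print(f"{name}: average={averages[name]:.2f}, compound={compound:.2f}")
```

The averages come out to 2.00, 2.25, 3.50, and 4.50, so the slide's compound score differs from the plain average in the first three cases and coincides with it for Financial Services.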

The Product

The Trust Layer

The scoring system becomes infrastructure.

Risk Assessment

Score the deployment across 8 dimensions

Proportional Safety

Match safety measures to the actual risk tier

Citations

Every claim traced to its source document

Audit Trail

Complete record of what the agent did and why

Quallaa builds problem solving machines.

The trust layer makes previously unsolvable problems solvable.
