Security & Trust

Every skill is checked before it is listed.

A skill runs inside your agent, with access to the deal files you point it at. Before one is listed, the site owner reviews it against a published rubric, and that review and its evidence show on the skill’s own page. A maintainer review is not a formal or third-party audit. Below is exactly what it covers, and what it does not.

Why this matters for CRE

Skills add risk. Vet them like any vendor.

CRE skills run against confidential files: rent rolls, T-12s, IC memos, tenant PII, sponsor financials under NDA. A bad skill could leak that data, reach for an undisclosed credential, or hand you a number you shouldn't trust. So the rubric weights what a skill is allowed to do (its purpose, its instruction scope, how it installs) above whether it writes a file. Vetting a skill before you trust it with a deal is the same discipline as vetting a vendor before granting data-room access.

The rubric

Five dimensions, weighted by blast radius.

Dimension	Weight	What it asks	What a score of 5 looks like
Purpose & Capability	×3	Is what the skill is allowed to do actually limited to what it claims to do? Does it request file, shell, network, or tool access beyond producing the analysis it advertises?	A read-and-reason skill that works only from the input you give it, requests no system, shell, or network capability, and whose stated purpose matches its actual footprint, e.g. screening a deal you paste in, with no hidden side effects.
Instruction Scope	×3	Are the skill's instructions narrowly scoped to its CRE task, with explicit do-not-trigger boundaries, and free of hidden directives, prompt-injection bait, or instructions to exfiltrate or misuse the data it sees?	Tightly scoped, on-topic instructions with clear activation and exclusion rules, no concealed system-prompt overrides, and no language that would coax the agent into leaking a rent roll, T-12, or PII it was given.
Install Mechanism	×2	How does the skill arrive on your machine? Does it run install-time hooks, post-install scripts, or fetch remote code, or is it plain, inspectable text installed through a transparent, version-pinned mechanism?	Ships as human-readable Markdown/YAML through an open-source, version-pinned package with no install-time code execution of its own, so you can read exactly what you installed before you run it.
Credentials	×2	Does the skill read environment variables, request API keys or tokens, embed secrets, or otherwise touch credentials, and if so, is that handling minimal, disclosed, and necessary?	No credential surface at all: it reads no environment variables, asks for no keys or tokens, embeds no secrets, and needs none to do its job.
Persistence & Privilege	×1	Does the skill hold state between runs, write to disk outside your explicit output, escalate privileges, or run with more access than the task requires?	Stateless and least-privilege: nothing retained between runs, nothing written to disk unless you ask for it, and any bundled calculator is pure, standard-library code with no privileged access.

Weights reflect how much damage a skill can do: its purpose, instruction scope, and how it installs matter more than whether it writes a file. Scores are 1–5 per dimension; the weighted total is divided by the sum of the weights (11) to normalize it to a 0–5 scale. The maintainer reviews every skill on the site against this rubric and reports a concern band. The Verified/Caution/Flagged verdict tiers (≥ 4.0, 3.0–3.99, below 3.0) are a separate, stronger claim reserved for a formal third-party audit, which no skill has received yet.

The score

How the weighted score is computed.

Each dimension is scored 1–5. The scores are weighted, summed, and divided by the sum of the weights (11) to normalize back onto a 0–5 scale:

score = [ (Purpose & Capability × 3) + (Instruction Scope × 3) + (Install Mechanism × 2) + (Credentials × 2) + (Persistence & Privilege × 1) ]
  ÷ (3 + 3 + 2 + 2 + 1)  (= 11)  →  score on a 0–5 scale
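The formula above can be sketched in a few lines of Python. This is an illustrative sketch only, not the site's implementation: the dictionary keys and the function name `weighted_score` are hypothetical, while the weights come from the rubric table.

```python
# Rubric weights as listed in the table above. The key names are
# hypothetical labels chosen for this sketch.
WEIGHTS = {
    "purpose_capability": 3,
    "instruction_scope": 3,
    "install_mechanism": 2,
    "credentials": 2,
    "persistence_privilege": 1,
}


def weighted_score(scores: dict[str, int]) -> float:
    """Return the weighted score: sum(score * weight) / sum(weights)."""
    if set(scores) != set(WEIGHTS):
        raise ValueError("scores must cover exactly the five rubric dimensions")
    for name, value in scores.items():
        if not 1 <= value <= 5:
            raise ValueError(f"{name} must be scored 1-5, got {value}")
    weighted_total = sum(scores[name] * weight for name, weight in WEIGHTS.items())
    return weighted_total / sum(WEIGHTS.values())  # the denominator is 11


# Example with hypothetical scores: 5, 5, 4, 5, 5 gives 53 / 11.
example = {
    "purpose_capability": 5,
    "instruction_scope": 5,
    "install_mechanism": 4,
    "credentials": 5,
    "persistence_privilege": 5,
}
print(round(weighted_score(example), 2))
```

Because every dimension scores at least 1, the result of this sketch always falls between 1.0 and 5.0.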

The same thresholds carry two different vocabularies, because the strength of the evidence differs:

Owner-reviewed methodology check (every skill today): ≥ 4.0 is Low concern, 3.0–3.99 is Moderate concern, below 3.0 is Elevated concern. It can flag a concern but never certifies a skill as safe.
Formal audit (a committed, evidence-backed record — none exist yet): ≥ 4.0 is Verified, 3.0–3.99 is Caution, below 3.0 is Flagged. The verdict words and the green “Verified” styling are reserved for this state alone.
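As a rough illustration of the two vocabularies, the sketch below maps one score to either a concern band or a verdict, depending on the kind of evidence behind it. The function name `label_for` and the `formally_audited` flag are assumptions made for this example; only the thresholds and the label words come from the text above.

```python
CONCERN_BANDS = ("Low concern", "Moderate concern", "Elevated concern")
AUDIT_VERDICTS = ("Verified", "Caution", "Flagged")


def label_for(score: float, formally_audited: bool = False) -> str:
    """Map a 0-5 weighted score to a label in the matching vocabulary.

    Hypothetical helper: the same thresholds apply in both cases, but the
    verdict words are only used when a formal audit record exists.
    """
    if not 0.0 <= score <= 5.0:
        raise ValueError("score must be on the 0-5 scale")
    if score >= 4.0:
        tier = 0
    elif score >= 3.0:
        tier = 1
    else:
        tier = 2
    labels = AUDIT_VERDICTS if formally_audited else CONCERN_BANDS
    return labels[tier]


print(label_for(4.82))                         # owner-reviewed check
print(label_for(3.5, formally_audited=True))   # formal audit (none exist yet)
```

Keeping the two label sets in separate tuples mirrors the rule that a maintainer review never borrows the verdict words.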

Formal-audit verdicts

Three outcomes, reserved for audit-backed records.

These verdict tiers apply only when a real, committed audit record exists. No skill is audit-backed today, so no skill carries a Verified, Caution, or Flagged verdict. The owner-reviewed methodology checks below report a concern band instead.

Verified

≥ 4.0

Tightly scoped, no credential surface, transparent install. The kind of skill you can read and run with confidence.

Caution

3.0 – 3.99

Useful but with a wider surface: broader instructions, an install hook, or some state. Read it before pointing it at sensitive data.

Flagged

< 3.0

A capability, scope, or credential concern serious enough that the skill would not be listed without changes. Treat with care.

By the numbers

The catalog and the methodology checks.

127

skills in catalog

124

owner-reviewed (not audited)

0

formally / independently audited

5

trust dimensions in the rubric

11

weighted-score denominator

How the 124 owner-reviewed skills fall across the concern bands. Most are focused, local read-and-reason tools and land in Low concern; the workspace routers that ship their own runtime subsystem are the Moderate outliers. Elevated concern is reserved for a concrete red flag — meaningful runtime complexity, credential handling, an external connector, or mutation behavior — and none of these read-and-reason skills carry one, so the band is empty. The bands are what the maintainer review found against the rubric, not a rubber stamp:

122

Low concern

2

Moderate concern

0

Elevated concern

A methodology check, end to end

How the rubric scores a real skill.

Methodology check · Low concern

Deal QuickScreen

4.82 / 5 (53/55 weighted · owner reviewed)

Subject: what the skill itself is declared to read, run, or persist — not the quality of the deal verdicts it produces. Reviewed by the site owner against the plugin manifest, source, and bundled calculator; a maintainer review, not a formal audit.

Purpose & Capability · ×3 · 5 / 5
runtime_role=callable_tool, classification=normal, calculator_file present. Reviewed against the rubric: a focused, single-task read-and-reason skill whose declared purpose matches its footprint. The plugin declares no allowed-tools, so the host agent you run it in (not the skill) bounds what it can read, run, or reach. Scored 5.

Instruction Scope · ×3 · 5 / 5
pii_policy=business_contact, classification=normal. Reviewed: instructions are narrowly scoped to the CRE task with explicit do-not-trigger rules and no embedded directive to leak or misuse the data the skill is shown. Like any prompt it stays steerable by adversarial text it is asked to summarize, so treat untrusted source documents with normal care. Scored 5.

Install Mechanism · ×2 · 4 / 5
Installs with the cre-skills plugin, which registers SessionStart/PostToolUse/Stop hooks and a stdio MCP server, so this is not a zero-execution install (never a 5). Those hooks are transparent, version-pinned, Apache-2.0, and source-readable; telemetry and feedback are opt-in and default-off. Reviewed and scored 4. The bundled Python calculator (src/calculators/quick_screen.py) was reviewed as pure standard-library code, so it does not lower this score.

Credentials · ×2 · 5 / 5
Reviewed and source-verified: neither the skill nor its bundled Python calculator (src/calculators/quick_screen.py) reads environment variables or secrets, and .mcp.json declares env:{}. No credential surface, which is the rubric's definition of a 5.

Persistence & Privilege · ×1 · 5 / 5
produces_artifact_kind=calculator_result, workspace_scope=deal. Stateless by declaration: nothing retained between runs, and nothing written outside the output you ask for. A memo, model, or calculator result you request is that output, not hidden state. Plugin-level telemetry and session hooks write to ~/.cre-skills only when you opt in (default-off).

This is the same record shown on the skill’s own page, pinned to version 0.1.0 at plugin commit 761c5a5. Every dimension links to the exact manifest and source files it was scored from.

Low concern is not a pass or a guarantee. It reflects what the maintainer review found against the rubric: a focused read-and-reason screen with a reviewed, standard-library calculator and no credential or connector surface. It remains a maintainer review, not a formal or third-party audit.
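For reference, the headline figures for Deal QuickScreen follow directly from the five dimension scores listed above (5, 5, 4, 5, 5) and the rubric weights. The worked arithmetic, written out as a sketch:

```latex
\[
\frac{(5 \times 3) + (5 \times 3) + (4 \times 2) + (5 \times 2) + (5 \times 1)}{3 + 3 + 2 + 2 + 1}
= \frac{15 + 15 + 8 + 10 + 5}{11}
= \frac{53}{11}
\approx 4.82
\]
```

The "53/55 weighted" figure uses the same numerator against the maximum possible weighted total, 5 × 11 = 55.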

Scope & limitations

What this review does and does not cover.

Each methodology check is a maintainer review: the site owner scores the skill against this rubric from its declared catalog/manifest metadata, current plugin source, and any calculator or runtime files. It reviews the skill's security surface, not the correctness of its analysis or its full runtime behavior under every input.
The plugin declares no tool/capability permissions (no allowed-tools), so the review cannot assert what a skill reads, writes, or accesses at runtime. Dimensions that depend on declared capability are scored conservatively, and effective capability is set by the agent you run the skill in.
This is a maintainer review, not a formal or independent third-party audit. Zero skills are audit-backed. A low concern band means the review found no concrete red flag against the rubric; it is not a certification of safety.
The plugin is open source and changes independently; an upstream commit can alter a skill's behavior after any check here, so a check reflects a point in time. If the upstream version drifts from what the site shows, the check auto-hides rather than mislabel a changed skill.
Skill behavior depends on the host agent, the model version, your inputs, and your local environment; identical instructions can produce different results across setups.
Everything here is provided "as is," without warranty of any kind, under the Apache License 2.0.
You are solely responsible for any sensitive, confidential, or regulated client and portfolio data you choose to put in front of any skill or agent.
Nothing on this site is investment, legal, tax, or accounting advice.

Accessibility

Color contrast.

Every text color pairing used on this site is verified against the WCAG 2.1 AA threshold at build time. The table below is generated from the same source module the test suite asserts, so it cannot drift from the actual tokens in use.

Pairing	Sample	Foreground	Background	Ratio	WCAG AA
ink on page	Aa	`#e6e1d6`	`#000000`	16.11:1	Pass
muted on page	Aa	`#aaaaaa`	`#000000`	9.04:1	Pass
muted-2 on page	Aa	`#999999`	`#000000`	7.37:1	Pass
green accent on page	Aa	`#00ff66`	`#000000`	15.50:1	Pass
verified text on page	Aa	`#4caf50`	`#000000`	7.56:1	Pass
ink on card	Aa	`#e6e1d6`	`#001100`	14.91:1	Pass
muted on card	Aa	`#aaaaaa`	`#001100`	8.37:1	Pass
muted-2 on card	Aa	`#999999`	`#001100`	6.82:1	Pass
green accent on card	Aa	`#00ff66`	`#001100`	14.35:1	Pass
verified text on card	Aa	`#4caf50`	`#001100`	6.99:1	Pass
ink on raised fill	Aa	`#e6e1d6`	`#001900`	14.14:1	Pass
muted-2 on raised fill	Aa	`#999999`	`#001900`	6.47:1	Pass
accent on accent tint	Aa	`#00ff99`	`#001a0d`	13.67:1	Pass
caution on amber tint	Aa	`#ffcc44`	`#1f1500`	11.99:1	Pass
flagged on red tint	Aa	`#ff6666`	`#1f0a0a`	6.64:1	Pass
verified on green tint	Aa	`#4caf50`	`#0a1f0e`	6.22:1	Pass
neon on chrome bar	Aa	`#00ff66`	`#001900`	13.60:1	Pass
ondark body on panel	Aa	`#e6e1d6`	`#001100`	14.91:1	Pass
ondark muted on bar	Aa	`#5cc999`	`#001900`	9.02:1	Pass
ondark muted on panel	Aa	`#5cc999`	`#001100`	9.51:1	Pass
ondark muted on install band	Aa	`#5cc999`	`#000000`	10.27:1	Pass
trust stamp on panel	Aa	`#ffcc44`	`#001100`	12.94:1	Pass
elevated concern on panel	Aa	`#ff6666`	`#001100`	6.80:1	Pass
crt link on page	Aa	`#5ab0ff`	`#000000`	9.09:1	Pass
code fg on code bg	Aa	`#e6e1d6`	`#0a0a0a`	15.18:1	Pass
code accent on page	Aa	`#5eead4`	`#000000`	14.20:1	Pass
code accent on panel	Aa	`#5eead4`	`#001100`	13.14:1	Pass
code accent on code bg	Aa	`#5eead4`	`#0a0a0a`	13.38:1	Pass
black on green button	Aa	`#000000`	`#00ff66`	15.50:1	Pass

Every text pairing above is asserted at build and in CI by a unit test (29 pairings, all meeting WCAG AA). Three decorative (non-text) pairings are excluded from the gate.
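The ratios in the table can be reproduced with the WCAG 2.1 relative-luminance and contrast-ratio definitions. The sketch below is not the site's source module. The function names are hypothetical, and the 4.5:1 threshold is an assumption: it is the AA minimum for normal-size text, and this page does not say which AA threshold its test uses.

```python
def _linearize(channel: int) -> float:
    """Convert one 8-bit sRGB channel to linear light, per WCAG 2.1."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance of a color given as '#rrggbb'."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linearize(r) + 0.7152 * _linearize(g) + 0.0722 * _linearize(b)


def contrast_ratio(foreground: str, background: str) -> float:
    """WCAG contrast ratio: (lighter + 0.05) / (darker + 0.05)."""
    lum_a = relative_luminance(foreground)
    lum_b = relative_luminance(background)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa(foreground: str, background: str, threshold: float = 4.5) -> bool:
    """True if the pairing meets the assumed AA threshold for normal text."""
    return contrast_ratio(foreground, background) >= threshold


# "muted on page" from the table: #aaaaaa on #000000
print(f"{contrast_ratio('#aaaaaa', '#000000'):.2f}:1", passes_aa("#aaaaaa", "#000000"))
```

A build-time gate along these lines would loop over every pairing and fail if any `passes_aa` call returns false.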

The state model

Four states, weakest to strongest claim.

A skill’s trust signal can only ever be one of these four states. The site sits at “methodology assessed” today: every skill is checked, no skill is formally audited, and the two can never be confused because they never share a render path.

Not assessed
The baseline. A skill with no committed assessment pinned to its current version shows “Not formally audited” and no score. One example is a skill whose upstream version has drifted from what this site shows, so its methodology check no longer matches.
Today: methodology assessed
Every skill carries a maintainer review that scores it against the rubric above (its catalog entry, manifest, declared runtime behavior, and any calculator files) and reports a Low/Moderate/Elevated concern band. This is the site owner's review, not a formal or independent third-party audit: it can surface a concern but never certifies a skill as safe, and it never uses the verdict words below. Each skill still shows “Not formally audited,” because a maintainer review is not an audit.
Next: audit pending
When a skill is queued for a real, evidence-backed review, the contract supports an “audit pending” state. It shows no formal score or verdict (only that a review is in progress), so a pending review can never be mistaken for a completed one.
Audit-backed
A skill earns a formal Verified/Caution/Flagged verdict only when a committed audit record exists that pins to the exact reviewed commit and version, scores all five dimensions, and whose verdict is derived from that score. If the upstream plugin changes, the record stops matching and the skill returns to neutral. A trust signal can never outlive the code it described.
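The four states and the version-pinning rule can be sketched as a small resolver. Everything named here is hypothetical (`TrustState`, `Record`, `resolve_state`, and the `kind` strings), and the sketch assumes every record type is pinned to a version and commit in the same way. It illustrates the rule that a record stops counting once the upstream code drifts; it is not the site's actual contract.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class TrustState(Enum):
    NOT_ASSESSED = "not assessed"
    METHODOLOGY_ASSESSED = "methodology assessed"
    AUDIT_PENDING = "audit pending"
    AUDIT_BACKED = "audit-backed"


@dataclass(frozen=True)
class Record:
    kind: str     # assumed values: "methodology", "audit_pending", "audit"
    version: str  # skill version the record was pinned to
    commit: str   # plugin commit the record was pinned to


_KIND_TO_STATE = {
    "methodology": TrustState.METHODOLOGY_ASSESSED,
    "audit_pending": TrustState.AUDIT_PENDING,
    "audit": TrustState.AUDIT_BACKED,
}


def resolve_state(record: Optional[Record], current_version: str,
                  current_commit: str) -> TrustState:
    """Return the trust state for a skill at its current version and commit."""
    if record is None:
        return TrustState.NOT_ASSESSED
    # Drift: the record describes different code, so it no longer applies.
    if (record.version, record.commit) != (current_version, current_commit):
        return TrustState.NOT_ASSESSED
    if record.kind not in _KIND_TO_STATE:
        raise ValueError(f"unknown record kind: {record.kind!r}")
    return _KIND_TO_STATE[record.kind]


check = Record(kind="methodology", version="0.1.0", commit="761c5a5")
print(resolve_state(check, "0.1.0", "761c5a5").value)
print(resolve_state(check, "0.1.1", "0000000").value)  # drifted upstream
```

Resolving to exactly one enum member per skill is one way to keep a maintainer review and a formal audit from ever being shown together.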

Custom skills go through the same check.

Skills built for your firm are scored against the same rubric before they ship to your team.
