Every skill is reviewed before it is listed.
A skill runs inside your agent, next to your most sensitive deal data. Before a skill appears in this catalog it is read against a published trust rubric — so you can see how it was reviewed, not just take our word that it was.
A skill extends your fiduciary surface.
A CRE skill rarely runs on toy data — you point it at rent rolls, T-12s, IC memos, tenant PII, and sponsor financials that sit under NDAs and fiduciary duty. The moment a skill can read that material, it extends your security and fiduciary surface to whatever agent runs it: a skill that quietly persisted data, reached for a credential, or carried a buried instruction could leak a sponsor's numbers or a tenant's identity as easily as it could miscalculate a cap rate. That is why this rubric weights what a skill is allowed to do — its purpose, its instruction scope, and how it got onto your machine — above whether it happens to write a file. Reviewing a skill before you feed it a deal is the same discipline as reviewing a vendor before you grant data-room access.
Five dimensions, weighted by blast radius.
| Dimension | Weight | What it asks | What a score of 5 looks like |
|---|---|---|---|
| Purpose & Capability | ×3 | Is what the skill is allowed to do actually limited to what it claims to do? Does it request file, shell, network, or tool access beyond producing the analysis it advertises? | A read-and-reason skill that works only from the input you give it, requests no system, shell, or network capability, and whose stated purpose matches its actual footprint — e.g. screening a deal you paste in, with no hidden side effects. |
| Instruction Scope | ×3 | Are the skill's instructions narrowly scoped to its CRE task, with explicit do-not-trigger boundaries, and free of hidden directives, prompt-injection bait, or instructions to exfiltrate or misuse the data it sees? | Tightly scoped, on-topic instructions with clear activation and exclusion rules, no concealed system-prompt overrides, and no language that would coax the agent into leaking a rent roll, T-12, or PII it was given. |
| Install Mechanism | ×2 | How does the skill arrive on your machine? Does it run install-time hooks, post-install scripts, or fetch remote code — or is it plain, inspectable text installed through a transparent, version-pinned mechanism? | Ships as human-readable Markdown/YAML through an open-source, version-pinned package with no install-time code execution of its own, so you can read exactly what you installed before you run it. |
| Credentials | ×2 | Does the skill read environment variables, request API keys or tokens, embed secrets, or otherwise touch credentials — and if so, is that handling minimal, disclosed, and necessary? | No credential surface at all: it reads no environment variables, asks for no keys or tokens, embeds no secrets, and needs none to do its job. |
| Persistence & Privilege | ×1 | Does the skill hold state between runs, write to disk outside your explicit output, escalate privileges, or run with more access than the task requires? | Stateless and least-privilege: nothing retained between runs, nothing written to disk unless you ask for it, and any bundled calculator is pure, standard-library code with no privileged access. |
Weights reflect blast radius: a skill's purpose, instruction scope, and how it arrives on your machine matter more than whether it writes a file. Scores are 1–5 per dimension; the weighted total is divided by the sum of the weights (11) to normalize it to a 0–5 scale. ≥ 4.0 = Verified, 3.0–3.99 = Caution, below 3.0 = Flagged. This is the rubric we publish; it is not a claim that any skill has been formally scored against it.
How the weighted score is computed.
Each dimension is scored 1–5. The scores are weighted, summed, and divided by the sum of the weights (11) to normalize back onto a 0–5 scale:
Score = [(Purpose & Capability × 3) + (Instruction Scope × 3) + (Install Mechanism × 2) + (Credentials × 2) + (Persistence & Privilege × 1)] ÷ 11, where 11 = 3 + 3 + 2 + 2 + 1 is the sum of the weights → score on a 0–5 scale
Thresholds: ≥ 4.0 is Verified, 3.0–3.99 is Caution, and below 3.0 is Flagged.
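The arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not the site's implementation: the function name `weighted_score`, the dictionary keys, and the sample scores are all hypothetical, and the scores are made up rather than taken from any review.

```python
# Weights come from the rubric table; the key names are illustrative only.
WEIGHTS = {
    "purpose_capability": 3,
    "instruction_scope": 3,
    "install_mechanism": 2,
    "credentials": 2,
    "persistence_privilege": 1,
}


def weighted_score(scores: dict[str, int]) -> float:
    """Return the weighted average of five 1-5 dimension scores."""
    if set(scores) != set(WEIGHTS):
        raise ValueError("scores must cover exactly the five rubric dimensions")
    for name, value in scores.items():
        if not 1 <= value <= 5:
            raise ValueError(f"{name} must be scored 1-5, got {value}")
    total = sum(scores[name] * weight for name, weight in WEIGHTS.items())
    return total / sum(WEIGHTS.values())  # the weights sum to 11


# Made-up scores for illustration; not a finding about any skill.
example = {
    "purpose_capability": 5,
    "instruction_scope": 4,
    "install_mechanism": 4,
    "credentials": 5,
    "persistence_privilege": 3,
}
print(round(weighted_score(example), 2))  # (15 + 12 + 8 + 10 + 3) / 11 = 48 / 11 -> 4.36
```

Because every dimension is scored at least 1, the result of this sketch always falls between 1.0 and 5.0.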
Three outcomes.
Verified: Tightly scoped, no credential surface, transparent install. The kind of skill you can read and run with confidence.
Caution: Useful but with a wider surface, such as broader instructions, an install hook, or some state. Read it before pointing it at sensitive data.
Flagged: A capability, scope, or credential concern serious enough that the skill would not be listed without changes. Treat with care.
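As a rough sketch, the mapping from a normalized score to these three outcomes could look like the following. The function name `verdict` is hypothetical, and treating everything from 3.0 up to (but not including) 4.0 as Caution is an assumption about how the published "3.0–3.99" band is meant to be read.

```python
def verdict(score: float) -> str:
    """Map a normalized score to one of the three published outcomes."""
    if score >= 4.0:
        return "Verified"
    if score >= 3.0:  # assumption: the Caution band runs up to, not including, 4.0
        return "Caution"
    return "Flagged"


# Illustrative inputs only; these are not scores for any real skill.
for sample in (4.36, 3.5, 2.9):
    print(sample, verdict(sample))
```

With integer dimension scores and weights that sum to 11, every total is a multiple of 1/11, so the closest a score can come to 4.0 from below is 43/11 ≈ 3.91. The assumed boundary handling therefore does not change any reachable result.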
The catalog and the methodology.
How the rubric scores a real skill.
Subject: the deal-quick-screen skill's security and trust surface (what the skill itself can read, run, or persist) — not the quality of the deal verdicts it produces.
Scores above are illustrative judgments about this rubric's application, not the output of a completed third-party audit.
What this review does and does not cover.
- Audits cover the catalog copy, metadata, and methodology presented on this site — not the upstream plugin source code.
- The plugin is open source and changes independently; an upstream commit can alter a skill's behavior after any review here, so a review reflects a point in time, not a guarantee about the code you install today.
- Skill behavior depends on the host agent, the model version, your inputs, and your local environment; identical instructions can produce different results across setups.
- No per-skill security audit has been completed. The score, dimensions, and verdict shown anywhere on this site as a worked example are illustrative methodology, not findings.
- Everything here is provided "as is," without warranty of any kind, under the Apache License 2.0.
- You are solely responsible for any sensitive, confidential, or regulated client and portfolio data you choose to put in front of any skill or agent.
- Nothing on this site is investment, legal, tax, or accounting advice.
From methodology to audit-backed signals.
The trust system is built but deliberately neutral: no skill shows a score until a real, committed audit record exists for it. Here is what would have to change — and the order it happens in — for a skill to carry an evidence-backed trust signal.
- Today — methodology only
Every skill is read against the rubric above before it is listed, but no per-skill audit record exists yet. So each skill shows “Not formally audited,” and the worked example above is illustrative methodology, not a finding. Nothing on the site reports a per-skill score.
- Next — audit pending
When a skill is queued for a real review, the contract supports an “audit pending” state. It still shows no score or verdict — only that a review is in progress — so a pending review can never be mistaken for a completed one.
- Audit-backed
A skill shows a score and verdict only when a committed audit record exists that pins to the exact reviewed commit and version, scores all five dimensions, and whose verdict is derived from that score. If the upstream plugin changes, the record stops matching and the skill returns to neutral — so a trust signal can never outlive the code it described.
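One way to picture the three states is as a small resolution function. The sketch below is a hypothetical model, not the site's actual contract: `AuditRecord`, `trust_state`, the field names, and the dimension keys are all invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical keys for the five rubric dimensions.
REQUIRED_DIMENSIONS = {
    "purpose_capability",
    "instruction_scope",
    "install_mechanism",
    "credentials",
    "persistence_privilege",
}


@dataclass
class AuditRecord:
    commit: str   # exact upstream commit that was reviewed
    version: str  # exact version that was reviewed
    scores: dict  # dimension name -> 1-5 score


def trust_state(
    record: Optional[AuditRecord],
    current_commit: str,
    current_version: str,
    review_queued: bool = False,
) -> str:
    """Resolve which trust signal a skill may show right now."""
    record_matches = (
        record is not None
        and record.commit == current_commit
        and record.version == current_version
        and set(record.scores) == REQUIRED_DIMENSIONS
    )
    if record_matches:
        return "audit-backed"  # score and verdict may be displayed
    if review_queued:
        return "audit pending"  # no score or verdict is shown
    return "not formally audited"  # neutral default, also used for stale records


record = AuditRecord("abc123", "1.0.0", {d: 5 for d in REQUIRED_DIMENSIONS})
print(trust_state(record, "abc123", "1.0.0"))  # record matches the current code
print(trust_state(record, "def456", "1.0.0"))  # upstream changed, back to neutral
```

The point of this design is that the score is never stored as a standalone fact about the skill. It is shown only while the record still matches the current commit and version, so a stale record falls back to the neutral state automatically.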
Custom skills go through the same review.
Skills built for your firm are held to the same rubric before they ship to your team.