Two Weeks Ago I Had a Working Dashboard

Two weeks ago I had a working dashboard for a small matching prototype. Four test profiles, live scraping against a few German-language freelance platforms, tier-ranked suggestions on one screen every morning. It worked. I was drafting a short announcement post when a client I had been talking to asked the question I had not budgeted for: is this Annex III?

Three days of reading and one afternoon with a lawyer later, the answer was yes. The dashboard came off the public internet. The announcement post came down with it. EU AI Act classification follows what a system does, not how small the deployment is, and the gap between “this scores humans against projects” and “this is a high-risk AI system” turns out to be thin.

What Annex III Actually Covers

Annex III is the EU AI Act’s catalogue of domains where an AI system is treated as high-risk by default. Eight areas: biometrics, critical infrastructure, education, employment and workforce management, access to essential services, law enforcement, migration and border control, and administration of justice. A system deployed in one of these areas is high-risk by default under Article 6(2); the Article 6(3) carve-out exempts only systems performing narrow procedural or preparatory tasks that do not materially influence the decision.

The area I landed in was employment and workforce management. The operative language, condensed: AI systems intended for the recruitment or selection of natural persons, or for making decisions on promotion, termination, task allocation, or monitoring and evaluation of performance. A matching engine that scores consultants against projects lives inside the task-allocation clause, even when no employment relationship exists. The Act treats the freelance market the same way it treats hiring, because the decision class is the same.

Scale does not carve you out. Article 2 has narrow exclusions — personal non-professional use, scientific research prior to deployment, open-source components that are not placed on the market — but a working prototype that a client is evaluating qualifies for none of them. Solo does not mean small in the eyes of the statute.

The Self-Assessment That Stopped Me

Before you call a lawyer, three questions are worth trying to answer on your own. They do not replace a legal review, but they frame the conversation and tell you whether you need one.

First: what kind of decision does your system influence? If the honest answer is “none, it’s just a dashboard”, you are probably outside high-risk. If the answer includes hiring, firing, work allocation, compensation, credit, housing, education, law enforcement, or healthcare triage, you are in the orbit of Annex III and need to check the specific category language.

Second: how much automation sits between the input and the decision? Fully automated pipelines that score, rank, and send outreach without human review are one thing. A dashboard that ranks suggestions where a reviewer clicks through each one is another. Article 14 requires high-risk systems to be designed so a natural person can effectively oversee them during use — not just nominally, but in a way that would let the reviewer override or cancel the output. If your reviewer cannot do that in practice, you have an oversight gap before you have a compliance gap.

Third: what personal data is in scope? Inputs, training data, outputs. For employment AI, the critical question is whether the inputs include protected characteristics (Art. 9 GDPR and, in Germany, the employment-data protections of § 26 BDSG) — and even nominally neutral data like job history can encode them.

My honest answers were all bad. Work allocation, near-full automation with a human-review button that was in practice always clicked through, and hourly rates plus full consultant profiles as input data. That set pushed the prototype squarely into Annex III scope and made the compliance path unavoidable.

The Three Changes That Had to Happen

Three things changed between the draft dashboard and the research prototype it became.

Scope was reframed. The version I had drafted read as a product pitch with a demo link and pricing copy. I pulled both. The current framing is internal research prototype, not available for external use — a load-bearing statement rather than a marketing softening. An internal research prototype operating on synthetic data sits under a narrower set of obligations than a product third parties can sign up for. The framing change was paired with taking the hosted demo offline, pausing external API plans, and rewriting every autonomy-adjacent phrase in the UI from automated matching to suggestion list.

Personal data was replaced with synthetic data. Real consultant profiles, example-match screenshots with identifiable postings, any field that could be traced back to a natural person — all replaced with synthetic test data generated from templates. The scoring pipeline is unchanged; the data that flows through it during testing is different. The trap here is that the article accompanying the prototype is itself a publication surface. A screenshot in a blog post is a data-controller moment, not just a PR asset, and the synthetic-data switch had to happen in the content too, not only in the code.
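For concreteness, template-based generation can look like the sketch below. The field names, value pools, and seed-derived IDs are illustrative assumptions on my part, not the prototype’s actual schema — the point is only that every field comes from a fixed template and nothing traces back to a natural person.

```python
import random
import uuid

# Illustrative template pools -- none of these values come from real profiles.
SKILLS = [["python", "etl"], ["react", "typescript"], ["sap", "abap"]]
CITIES = ["Berlin", "Hamburg", "Muenchen", "Koeln"]

def synthetic_profile(seed: int) -> dict:
    """One synthetic consultant profile, generated from fixed templates.

    Every field is drawn from a template pool and the ID is derived from
    the seed, so no record can be traced back to a natural person and the
    test fixtures are reproducible across runs.
    """
    rng = random.Random(seed)
    return {
        "id": str(uuid.UUID(int=rng.getrandbits(128))),
        "name": f"Testperson {seed:03d}",               # obviously synthetic label
        "city": rng.choice(CITIES),
        "skills": rng.choice(SKILLS),
        "hourly_rate_eur": rng.randrange(60, 141, 5),   # plausible band, not real rates
    }

# The four test profiles the dashboard runs on during testing.
fixtures = [synthetic_profile(i) for i in range(4)]
```

Seeding the generator matters more than it looks: deterministic fixtures mean screenshots and test runs stay reproducible without ever re-touching real data.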

Decision authority was documented. Every score surface now carries a human-review requirement. UI language says the system proposes; a human decides. Internal documentation names who the human is for each deployment context and what their override authority looks like. The auto-tune function that recalibrates scoring weights no longer updates weights silently — it proposes new weights, and a human approves them before they take effect. This pattern covers the Article 14 human-oversight obligation in spirit and on paper, and it is the change I would make first if I were starting over.
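A minimal sketch of that propose-then-approve pattern, with class and field names invented for illustration (the real pipeline is not public): auto-tune emits a proposal, the live weights change only when a named human approves it, and the approval lands in an audit log.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class WeightProposal:
    """Auto-tune output held for human sign-off; names are illustrative."""
    new_weights: dict
    rationale: str
    approved_by: Optional[str] = None

class ScoringConfig:
    """Scoring weights that only change through an approved proposal."""

    def __init__(self, weights: dict):
        self.weights = dict(weights)   # the only weights the pipeline reads
        self.audit_log = []            # (reviewer, weights) pairs, on paper

    def propose(self, new_weights: dict, rationale: str) -> WeightProposal:
        # Auto-tune calls this; nothing takes effect until a human approves.
        return WeightProposal(new_weights=dict(new_weights), rationale=rationale)

    def approve(self, proposal: WeightProposal, reviewer: str) -> None:
        # A named human is recorded before the weights change.
        proposal.approved_by = reviewer
        self.audit_log.append((reviewer, dict(proposal.new_weights)))
        self.weights = dict(proposal.new_weights)

config = ScoringConfig({"skills": 0.6, "rate": 0.4})
proposal = config.propose({"skills": 0.7, "rate": 0.3}, rationale="recalibration run")
assert config.weights == {"skills": 0.6, "rate": 0.4}   # unchanged until approval
config.approve(proposal, reviewer="named.reviewer")
```

The design choice worth copying is that the pipeline reads weights from exactly one place, and that place is only writable through the approval path — silent updates become structurally impossible rather than merely forbidden.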

The Paperwork Nobody Warns You About

The technical changes were the visible part. The paperwork was most of the work.

Processor agreements (AVV). Every cloud service that sees personal data needs a Data Processing Agreement under Art. 28 GDPR — in German practice called Auftragsverarbeitungsvertrag. In my stack: Supabase for the database, Vercel for hosting, any analytics provider that records sessions. The AVV names the scope of processing, the technical and organisational measures, the sub-processors, and the data-return and deletion conditions. Cloud providers ship templates. The work is reading them carefully and understanding what you are agreeing to.

Freelancer classification. Anyone working on the project who is not an employee needs a real classification. In Germany the Scheinselbständigkeit risk under § 611a BGB is a separate existential concern, independent of the AI Act. A solo builder with no collaborators skips the worst of this, but the moment you bring in a contractor for a few days, a written contractor agreement that would survive a Rentenversicherung audit is a floor, not a ceiling.

Data inventory with per-column sensitivity flags. A spreadsheet — or Postgres column comments in Supabase, depending on taste — listing every column that might contain personal data, its sensitivity level, its retention policy, its encryption status. Hourly-rate fields on a freelancer matching system are the hottest line item per § 26 BDSG, because compensation data carries extra German employment-data protection. Putting that in writing makes the risk legible in a way that the hand-wave “we process consultant data” does not.
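In code form, such an inventory can be as small as a checked-in table. The column names, sensitivity labels, and retention values below are invented for illustration, not the real schema — what matters is that the hottest columns become queryable rather than tribal knowledge.

```python
# Per-column inventory: (table.column, sensitivity, retention, encrypted).
# All names and values are illustrative, not an actual production schema.
INVENTORY = [
    ("consultants.name",        "personal",          "contract+3y", True),
    ("consultants.hourly_rate", "special-sec26bdsg", "contract+3y", True),
    ("consultants.skills",      "personal",          "contract+3y", True),
    ("matches.score",           "derived-personal",  "90d",         True),
    ("sessions.page_views",     "pseudonymous",      "30d",         False),
]

def hottest_columns(inventory):
    """Columns needing the strictest handling, e.g. compensation data."""
    return [col for col, sensitivity, _, _ in inventory
            if sensitivity.startswith("special")]
```

Because the inventory lives in the repo, a review of "what touches compensation data" is a one-line function call instead of an archaeology session.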

Terms-of-service review per source platform. If your system scrapes, mirrors, or integrates with external platforms, each one has a ToS governing what you can do with the data you fetch. A compliance position that considers the Act but skips the ToS is half-done. My scraper set covered platforms whose ToS ranged from open for non-commercial research to no access without a written licence agreement. I dropped one source entirely after re-reading it with compliance glasses on.

Fundamental-rights impact assessment (Art. 27). Mandatory for deployers that are bodies governed by public law or private operators providing public services, and for deployers of the credit-scoring and insurance-risk systems in Annex III points 5(b) and 5(c). Optional for most employment-AI deployments, but worth doing anyway if Annex III applies — it is the closest thing the Act has to a structured self-review.

What You Can’t Insure

The penalty ceiling for high-risk non-compliance under Art. 99(4) is up to 15 million EUR or 3% of global annual turnover, whichever is higher. Enforcement at the solo-builder scale has been thin so far; most of the early action targets larger systems, and SME-proportionality is written into Art. 99(6). That said, there is no insurance product I know of that backstops regulatory fines at this scale.

Standard professional indemnity insurance in Germany excludes regulatory fines and penalties — insuring a fine is widely treated as contrary to public policy, and similar exclusions appear across EU jurisdictions. Errors-and-omissions coverage protects against client-liability lawsuits, not against regulators. The practical consequence is that your risk-management strategy has to be avoidance and compliance, because there is no policy to buy.

The Decision Template

Before you build your next AI thing, four questions you can answer in an afternoon:

  1. What decision class does the system touch? Hiring, firing, task allocation, compensation, credit, healthcare, education, law enforcement, migration, justice. If any of these, read the Annex III categories directly — the language is terse and specific.
  2. What is the automation ratio? Fully automated, human-in-the-loop-with-teeth, human-in-the-loop-for-show, advisory-only. The ratio changes your Article 14 obligations and in some cases influences whether the system is classified as high-risk at all.
  3. What personal data is in scope? Inputs, training data, outputs. Special-category data under Art. 9 GDPR and protected characteristics are automatic yellow flags. Compensation data in Germany carries the extra § 26 BDSG load.
  4. Who is the deployer? You, your client, a third party. Deployer obligations under Art. 26 are stricter for public bodies and organisations providing public services. Provider obligations under Art. 16 land on whoever puts the system on the market — which might be you, even when you are still thinking of it as just a prototype.

If the honest answer is yes on an Annex III category, a meaningful automation ratio, personal data in scope, and you as the deployer — you are in scope. A short conversation with a lawyer is cheaper than a short conversation with a regulator.
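The four questions above compress into a rough triage function — a screening aid under the stated assumptions, not a legal determination. The category names and automation labels are my own shorthand, not statutory terms.

```python
# Illustrative shorthand for Annex III decision classes, not statutory terms.
ANNEX_III_CLASSES = {
    "hiring", "firing", "task_allocation", "compensation", "credit",
    "healthcare", "education", "law_enforcement", "migration", "justice",
}

def needs_legal_review(decision_classes: set,
                       automation: str,        # "full" | "hitl" | "advisory"
                       personal_data: bool,
                       you_are_deployer: bool) -> bool:
    """Rough triage of the four questions; a yes means call a lawyer."""
    in_annex_iii = bool(decision_classes & ANNEX_III_CLASSES)
    meaningful_automation = automation in {"full", "hitl"}
    return (in_annex_iii and meaningful_automation
            and personal_data and you_are_deployer)

# The prototype's honest answers from earlier in the article:
in_scope = needs_legal_review({"task_allocation"}, "full", True, True)
```

Advisory-only systems with no personal data fall straight through; anything that clears all four gates is exactly the conversation a lawyer should frame.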

What This Is Not

This is not legal advice. I am a developer who paid for a few hours of legal guidance and did not pay for the right to pass it on. Every specific system depends on details no article can carry, so for your own case a lawyer is the person to call.

This is also not a full EU AI Act primer. There is no coverage here of prohibited systems under Art. 5, general-purpose models under Chapter V, transparency obligations for providers, or conformity-assessment procedures. Those apply to different classes of AI and different actors, and each merits its own read of the statute.

Closing

The compliance work ended up in the repo alongside the code. AVV templates, the sensitivity inventory, the ToS review notes, the lawyer’s memo — all under version control in a compliance/ directory, one file per concern. That directory is now a prerequisite for any future AI system I build that touches people, not an afterthought filed later. The pattern is the same one that turns a CLAUDE.md into an agent harness: configuration that used to live in someone’s head ends up in git, where it can be reviewed, diffed, and versioned like the code it governs.