We Ran the Numbers. Then We Ran the Scoping. They Didn’t Agree.

TL;DR: The ROI model from Part 6 produced a compelling case for a custom ATS at 86,000 PLN. Then we scoped the build. The number came back at 283–320K PLN. The assessment model we’re advocating for helped us find this gap. The experiment taught us what the right question actually was. Here is what happened, what we decided, and why the refined answer is more useful than the one we started with.

What we expected to decide

Part 6 ended with a specific promise: the ROI model and the scoping result had to be in the same room before a decision was possible. We put them there.

Going into that meeting, the working assumption was straightforward. Discovery was complete. Twenty-four problems, seven clusters, 150,648 PLN in estimated annual costs. The financial case had been stress-tested: on direct costs alone the project does not pay back; at 50% attribution of opportunity costs, it produces a strong positive return. The build budget assumption was 86,000 PLN, an estimate based on scoping a system against our documented requirements, with AI-assisted engineering factored into the productivity assumptions.

The question on the table was: does the ROI justify 86K?

That turned out to be the wrong question.

What scoping actually found

The development scoping process starts from documented requirements and estimates the minimum viable system that addresses them. It is not a quick exercise: a rigorous scope requires translating 24 documented problems into functional requirements, then into feature specifications, then into time and cost estimates with confidence intervals.

What came back: 283–320K PLN. Not 86K.

The first thing to understand about that number is that it is not a negotiating position. It reflects what a full ATS replacement, one that genuinely addresses the workflow we actually use, costs to build and maintain at production quality.

The second thing to understand is what drove the gap. The “10% of SaaS features” benchmark, the idea that most companies use only a fraction of what their tools provide, is accurate on average. It was not accurate for Recruitee. The audit from Part 5 identified 24 problems across 7 clusters, but the same audit also mapped what Recruitee does adequately for our process. The answer was: a lot. Candidate management, pipeline visibility, communication logging, offer management, job board integrations. We were not paying for 100% and using 10%. We were paying for 100% and using 40–50%.

When you remove the features Recruitee handles well from the scope, you do not get a smaller replacement system. You still need those features; you have just moved them from “off the shelf” to “on our build list.” A full ATS replacement requires building a full ATS. AI-assisted engineering compresses certain development cycles and reduces the cost of well-understood modules. It does not convert an 11-month project into a 3-month one.

The 86K assumption was based on scoping to our critical gaps. The scoping process revealed that a system covering only those gaps was not a complete system. It was a set of features that would need to integrate with or replace a substantial working ATS. The baseline was larger than the initial estimate assumed.

Where the real gaps are

The scoping process did something more useful than produce a number: it forced precision about which problems actually required custom software, versus which ones Recruitee handled adequately, versus which ones could be addressed without building anything.

Two clusters emerged as genuinely unaddressable through Recruitee’s existing architecture.

Cluster 6: Scheduling and automation. The calendar synchronization problem is not a configuration issue. It requires real-time availability checks across multiple interviewers, automatic stage transitions triggered by calendar events, and logic that handles reschedules without breaking pipeline state. Recruitee’s scheduling is built for manual coordination. The scoping confirmed this is a structural gap — not a feature Recruitee will eventually ship, but a capability that conflicts with how Recruitee’s architecture handles state. Every workaround adds a manual step that belongs inside the process, not alongside it.

Cluster 4: Competency-driven evaluation. No ATS we evaluated handles structured competency matrices and longitudinal candidate records as native features. The problem in Cluster 4 is not that Recruitee does it badly, but that the concept does not exist in Recruitee’s data model. A candidate’s competency evaluation from a previous application cannot inform the current one because Recruitee does not store candidates across applications in a queryable way. Building this requires building a separate data layer, not extending an existing one.

Both clusters rated 5/5 severity in the original audit. Both survived the scoping process as confirmed custom-build requirements. The other five clusters, including Cluster 1, which carries 99,792 PLN of the opportunity cost estimate, either had partial mitigations available within Recruitee or were addressable through process changes that did not require a new system.

That last point matters for how to read the ROI model. Cluster 1’s opportunity cost number was the dominant variable in the full-replacement case. Once the scoping process clarified that Cluster 1 did not require custom software, the justification for full replacement lost its most important economic pillar.
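The weight of Cluster 1 can be checked with simple arithmetic on the figures quoted in this series. The sketch below is illustrative only. It assumes the 150,648 PLN total includes Cluster 1’s 99,792 PLN, that a build recovers 100% of the cost it targets (an optimistic ceiling), and it uses an undiscounted payback. None of these assumptions come from the Part 6 model itself.

```python
# Figures quoted in the post (PLN per year, and PLN for the build range).
TOTAL_ANNUAL_COST = 150_648        # all seven clusters
CLUSTER_1_OPPORTUNITY = 99_792     # Cluster 1's opportunity cost estimate
FULL_BUILD_RANGE = (283_000, 320_000)

# Assumption: the total above includes Cluster 1's opportunity cost.
cluster_1_share = CLUSTER_1_OPPORTUNITY / TOTAL_ANNUAL_COST
remaining = TOTAL_ANNUAL_COST - CLUSTER_1_OPPORTUNITY

print(f"Cluster 1 share of annual cost: {cluster_1_share:.0%}")
print(f"Annual cost left for a build to address: {remaining:,} PLN")

# Assumption: the build recovers 100% of the cost it targets, no discounting.
for build_cost in FULL_BUILD_RANGE:
    before = build_cost / TOTAL_ANNUAL_COST
    after = build_cost / remaining
    print(f"{build_cost:,} PLN build: {before:.1f} -> {after:.1f} years to recover")
```

Under these assumptions Cluster 1 accounts for roughly two thirds of the annual cost, and taking it out of the build’s benefit side stretches the recovery period of a full replacement from about two years to well over five.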

The interface dimension

There is one finding worth naming separately because it affects any cost model that assumes AI-assisted interfaces reduce development cost.

During scoping, a text-interface approach was evaluated for parts of the scheduling and evaluation workflow. The argument: a natural language interface for interview coordination and evaluation entry would reduce UI development cost by approximately 10%, while keeping the underlying logic intact.

The People Team declined it. Not as a preference but as a hard constraint. The team responsible for using this system every day assessed a text-driven interface as a material adoption risk. Not the kind of risk that goes away with training. The kind that produces a technically sound system that gets abandoned within three months because it does not fit how the team actually works.

Adoption risk belongs in the cost model. A 10% reduction in development hours is not a 10% net savings if adoption failure probability increases in any meaningful way. We excluded the text-interface approach and accepted the higher UI development cost.
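One way to put adoption risk into the cost model is to treat it as an expected loss and compare it with the development saving. The sketch below is a minimal illustration, not the calculation used in the scoping. The function names and every input value are hypothetical, and it assumes an abandoned system writes off the whole build cost.

```python
def net_saving(dev_saving: float, build_cost: float,
               extra_failure_prob: float) -> float:
    """Expected net saving of a cheaper interface once adoption risk is priced in.

    dev_saving:          development cost avoided by the cheaper interface
    build_cost:          amount written off if the system is abandoned (assumption)
    extra_failure_prob:  added probability of abandonment, between 0 and 1
    """
    return dev_saving - extra_failure_prob * build_cost


def break_even_probability(dev_saving: float, build_cost: float) -> float:
    """Added abandonment probability at which the saving is fully cancelled."""
    return dev_saving / build_cost


# HYPOTHETICAL inputs, chosen only to show the shape of the trade-off.
build_cost = 50_000               # PLN
ui_cost = 20_000                  # PLN, portion of the build spent on UI
dev_saving = ui_cost * 10 / 100   # a 10% cut in UI development cost

print(f"Break-even added risk: {break_even_probability(dev_saving, build_cost):.0%}")
print(f"Net saving at +10% abandonment risk: {net_saving(dev_saving, build_cost, 0.10):,.0f} PLN")
```

With inputs like these, a small rise in abandonment probability is enough to cancel the saving, which is why a hard constraint from the people who use the system every day outweighs a modest reduction in UI cost.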

This matters more broadly: AI-assisted development changes the cost structure of building a system. It does not change what it means to build a system people will actually use. Change management is not a line item that AI compresses. If you are modeling the economics of a custom build, include it.

What we decided

Full ATS replacement: no.

Targeted build covering Clusters 4 and 6: yes.

The targeted build is scoped at approximately 50,000 PLN. The economics are substantially cleaner: 50K against two clusters with a combined annual cost of 13,980 PLN in direct costs and a defensible opportunity cost argument specific to those clusters. The payback horizon is realistic without requiring the full opportunity cost attribution to hold.
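The payback claim can be sketched as a small function with opportunity cost attribution as an explicit dial. This is a rough illustration under stated assumptions: the build eliminates the full direct cost of both clusters, there is no discounting or maintenance cost, and the opportunity cost figure is a placeholder because the post does not give one for Clusters 4 and 6.

```python
BUILD_COST = 50_000      # PLN, targeted build for Clusters 4 and 6 (from the post)
DIRECT_ANNUAL = 13_980   # PLN per year, combined direct cost (from the post)


def payback_years(build_cost: float, direct_annual: float,
                  opportunity_annual: float = 0.0,
                  attribution: float = 0.0) -> float:
    """Undiscounted payback in years.

    attribution is the fraction (0 to 1) of the opportunity cost that is
    counted as a real, recoverable benefit.
    """
    annual_benefit = direct_annual + attribution * opportunity_annual
    if annual_benefit <= 0:
        return float("inf")
    return build_cost / annual_benefit


# Direct costs only: no opportunity cost counted at all.
print(f"direct only: {payback_years(BUILD_COST, DIRECT_ANNUAL):.1f} years")  # 3.6 years

# HYPOTHETICAL opportunity cost for the two clusters, for illustration only.
HYPOTHETICAL_OPPORTUNITY = 20_000
for share in (0.25, 0.5):
    years = payback_years(BUILD_COST, DIRECT_ANNUAL, HYPOTHETICAL_OPPORTUNITY, share)
    print(f"attribution {share:.0%}: {years:.1f} years")
```

The direct-cost line is the conservative floor. Any attribution above zero only shortens the horizon, which is the sense in which the case does not depend on the full opportunity cost holding.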

This is not a consolation prize. It is what the experiment was supposed to produce: a specific answer, grounded in actual data, to a question we could not have asked precisely before we ran the process.

The question we started with was: should we replace Recruitee? That question was underspecified. The question the experiment gave us was: where exactly are our critical gaps, and can a targeted build address them at a cost that makes sense? The answer to that question is yes, for two specific clusters, at a defined budget, with a payback timeline we can defend.

Recruitee continues to handle everything it was handling before. It does what it was built to do. The targeted build handles the two capabilities Recruitee was never designed to provide.

The refined hypothesis

The manifesto in Part 2 made a claim: AI-assisted engineering has changed the build-vs-buy economics enough that a full ATS replacement could be justified at 2× annual subscription cost. That claim needs refinement.

AI has changed the cost threshold in both directions. For targeted builds, meaning systems that cover a narrow, well-defined capability gap, AI-assisted development makes viable what was previously not worth attempting. A 50K build at current quality levels would have required considerably more to deliver five years ago. That threshold shift is real. The Paradox of Cheaper Code argument holds.

For full replacements, the dynamics depend on usage depth. If you are using 10% of your SaaS, a full replacement means rebuilding that 10% and discarding the rest. AI makes that core 10% cheaper to build, and the economics can work cleanly. If you are using 40–50%, a full replacement means building a lot of functionality that was previously off the shelf and working. AI helps with the build cost, but the baseline is larger, and the payback math reflects that.

The scoring framework from Part 3 still holds. The four evaluation questions are the right questions. But one variable in that framework now deserves more weight than we initially gave it: usage depth. That means not just what percentage of the SaaS you use, but whether the features you use are commodity functionality that rebuilds quickly, or non-trivial workflow logic that takes real engineering time to replicate regardless of how much AI you apply.

The refined hypothesis: custom software is economically viable when critical differentiating capabilities are absent from the SaaS, AND when actual usage depth leaves room for a targeted extension rather than requiring full replacement, or when usage depth is high but the SaaS’s entire feature set is commodity. AI lowers the cost threshold significantly. It does not change the usage depth calculus.
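The refined hypothesis reads naturally as a small decision rule. The sketch below is one possible reading, not a formal part of the scoring framework. The type, the field names, the wording of the three outcomes and the 25% cut-off between shallow and deep usage are all hypothetical; the post only offers 10% and 40–50% as reference points.

```python
from dataclasses import dataclass


@dataclass
class SaasAssessment:
    has_critical_gaps: bool            # differentiating capabilities missing from the SaaS
    usage_depth: float                 # share of the feature set in real use, 0.0 to 1.0
    used_features_are_commodity: bool  # do the features in use rebuild quickly?


# HYPOTHETICAL cut-off between shallow and deep usage.
DEEP_USAGE_THRESHOLD = 0.25


def recommend(a: SaasAssessment) -> str:
    """Map an assessment to one branch of the build-vs-buy decision."""
    if not a.has_critical_gaps:
        return "keep SaaS"
    deep = a.usage_depth >= DEEP_USAGE_THRESHOLD
    if deep and not a.used_features_are_commodity:
        # Deep usage of non-trivial workflow logic: extend, do not replace.
        return "targeted build alongside SaaS"
    # Shallow usage, or deep usage of purely commodity features.
    return "full replacement can be viable"


# Our own case as described in this post: gaps exist, usage is around 45%,
# and the features in use are not all commodity.
print(recommend(SaasAssessment(True, 0.45, False)))  # targeted build alongside SaaS
```

The point of writing it this way is that usage depth and commodity status are separate inputs. Collapsing them into a single “percentage used” number is what made the original 86K assumption look plausible.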

How this lands in a market that just declared SaaS dead

The SaaSocalypse headlines are real. The wave of enterprises replacing or consolidating SaaS tools is measurable. Thirty-five percent of enterprises have replaced at least one SaaS tool already. The macro case for build-vs-buy has strengthened, and the economics of AI-assisted development are real.

But “replace SaaS with custom” is not one answer. It is a decision tree. The branch you land on depends on your specific usage pattern, where your gaps are, and whether those gaps are in commodity functionality or differentiating capability.

We ran the process on ourselves and got a specific answer: targeted build for two clusters, keep Recruitee for everything else. That answer is less marketable than “we built our own ATS and it worked.” It is more useful. The analysis in this series (the audit, the cost model, the scoping) exists because Appunite runs this process for clients. The fact that it produced a “partial build” recommendation rather than a “replace everything” recommendation is not a qualified success. It is what a rigorous process should produce when the data supports it.

The finding that mattered was not the final number. It was the discovery that the right question was different from the one we started with. That is what we would have found for a client. It is what we found for ourselves.

The targeted build is underway. The next posts in this series will cover what we actually built, how it performs against the estimates, and whether the opportunity cost assumptions behind the two clusters hold in practice. Frameworks are hypotheses. Usage data is evidence. That part of the story is just starting.

Sources

Part 2, “Hold My Beer (The Manifesto)”: https://www.appunite.com/blog/manifesto-building-our-own-ats
Part 3, “Four Questions to Score Any SaaS Tool”: https://www.appunite.com/blog/four-questions-to-score-any-saas-tool
Part 4, “How to Discover What’s Actually Broken in Your SaaS Tool”: https://www.appunite.com/blog/how-to-discover-whats-actually-broken-in-your-saas-tool
Part 5, “24 Problems, 7 Clusters: What We Found Wrong with Our ATS”: https://www.appunite.com/blog/what-we-found-wrong-with-our-ats
Part 6, “How to Estimate the ROI of Replacing a SaaS Tool (With Real Numbers)”: https://www.appunite.com/blog/how-to-estimate-the-roi-of-replacing-a-saas-tool
“The Paradox of Cheaper Code: Why AI is Making Custom Software More Valuable”: https://www.appunite.com/blog/why-ai-is-making-custom-software-development-more-valuable
