Delivery Stack

The stack we use to transform operations.

We help teams modernize workflow products, legacy services, and internal dashboards with a simpler delivery stack that makes metrics, integrations, and tool access easier to manage.

The stack at a glance

Where we work

  • AI workflow systems
  • Legacy service upgrades
  • Operator and executive dashboards

What we set up first

  • Clear telemetry and core metrics
  • Service contracts for legacy systems
  • Safe access into business tools

What clients get

  • Better executive visibility
  • Faster delivery decisions
  • A cleaner path to production

The Sherlock product set is the main product layer

These are the named Baker Street products that package the method itself: case shaping, repo intelligence, evals, rescue, and adversarial hardening. The partner tools below support delivery, but this is the core product set clients actually buy into.

Repo Intelligence Hub

221B

Open-source hub

The operating layer for fast software investigation.

221B gives Baker Street investigations a shared operating system: repo intake, evidence grading, reusable templates, and MCP access to the underlying knowledge base.

Primary outcome: Faster orientation in unfamiliar systems with clearer evidence and handoff

  • MCP-enabled knowledge base
  • Repo registry and intake workflow
  • Report and decision templates

Strategic Insight Layer

Adler

Open-source investigation toolkit

Turn an ambiguous AI opportunity into a scoped next move.

Adler packages the Baker Street investigation sprint into reusable templates, example outputs, and executive-ready decision support for teams that need clarity before build work starts.

Primary outcome: A ranked recommendation and implementation brief in days, not weeks

  • Hypothesis backlog
  • Executive readout templates
  • Worked customer-support case

Workflow Intelligence Layer

Mycroft

Open-source eval harness

Create an evidence loop before AI changes hit production.

Mycroft is the evaluation and release-readiness layer for AI workflows. It combines offline and model-backed evals with keyword checks, latency gates, and Markdown and JSON report outputs.

Primary outcome: Objective pass/fail signals before a workflow ships or relaunches

  • Offline and OpenAI modes
  • Keyword and latency gates
  • Markdown and JSON reports
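
To make the gate idea concrete, here is a minimal sketch of a keyword and latency gate with JSON and Markdown output. It is an illustration only, not Mycroft's actual API: the names GateResult, run_gate, to_json, to_markdown, and offline_workflow are hypothetical, and the workflow is a canned offline stub rather than a model call.

```python
import json
import time
from dataclasses import dataclass, asdict
from typing import Callable, List


@dataclass
class GateResult:
    """Outcome of one eval case (hypothetical structure, not Mycroft's schema)."""
    case_id: str
    keyword_pass: bool
    latency_pass: bool
    latency_ms: float

    @property
    def passed(self) -> bool:
        # A case passes only if both gates pass.
        return self.keyword_pass and self.latency_pass


def run_gate(case_id: str, prompt: str, workflow: Callable[[str], str],
             required_keywords: List[str], max_latency_ms: float) -> GateResult:
    """Run the workflow once, then apply the keyword gate and the latency gate."""
    start = time.perf_counter()
    output = workflow(prompt)
    latency_ms = (time.perf_counter() - start) * 1000.0
    text = output.lower()
    keyword_pass = all(k.lower() in text for k in required_keywords)
    return GateResult(case_id, keyword_pass, latency_ms <= max_latency_ms, latency_ms)


def to_json(results: List[GateResult]) -> str:
    """Machine-readable report: overall verdict plus per-case detail."""
    cases = [{**asdict(r), "passed": r.passed} for r in results]
    return json.dumps({"passed": all(r.passed for r in results), "cases": cases}, indent=2)


def to_markdown(results: List[GateResult]) -> str:
    """Human-readable report as a Markdown table."""
    lines = ["| case | keywords | latency (ms) | result |", "| --- | --- | --- | --- |"]
    for r in results:
        lines.append(
            f"| {r.case_id} | {'ok' if r.keyword_pass else 'missing'} "
            f"| {r.latency_ms:.1f} | {'PASS' if r.passed else 'FAIL'} |"
        )
    return "\n".join(lines)


def offline_workflow(prompt: str) -> str:
    # Offline mode stand-in: a canned answer instead of a model-backed call.
    return "Your refund was issued to the original payment method."


if __name__ == "__main__":
    results = [
        run_gate("refund", "Where is my refund?", offline_workflow,
                 ["refund", "payment"], max_latency_ms=1000.0),
    ]
    print(to_markdown(results))
    print(to_json(results))
```

Swapping offline_workflow for a function that calls a model would give the model-backed variant of the same gate. The thresholds and keywords are placeholders that a team would set per workflow.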

Pilot Doctor

Watson

Open-source rescue playbook

Recover the AI pilot before it turns into an account problem.

Watson is a practical rescue system for stalled pilots, combining triage, stabilization, relaunch gates, KPI recovery tracking, and production handoff assets.

Primary outcome: A credible, evidence-backed decision to relaunch, rescope, or stop

  • Failure-mode triage
  • KPI recovery scorecard
  • Production handoff templates
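
One way to picture a KPI recovery scorecard feeding a relaunch gate is the sketch below. Everything in it is an assumption made for illustration: the Kpi class, the decide function, and the 0.8 and 0.4 thresholds are hypothetical and are not taken from Watson itself.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Kpi:
    """One scorecard row: where the pilot started, where it is, where it must get to."""
    name: str
    baseline: float
    current: float
    target: float

    def recovery(self) -> float:
        """Fraction of the baseline-to-target gap closed so far, clamped to [0, 1].

        Works for both higher-is-better and lower-is-better KPIs, because the
        sign of the gap follows the direction of the target.
        """
        gap = self.target - self.baseline
        if gap == 0:
            return 1.0
        return max(0.0, min(1.0, (self.current - self.baseline) / gap))


def decide(kpis: List[Kpi], relaunch_at: float = 0.8, rescope_at: float = 0.4) -> str:
    """Map the weakest KPI's recovery to relaunch, rescope, or stop.

    The thresholds are illustrative defaults, not recommended values.
    """
    if not kpis:
        raise ValueError("scorecard needs at least one KPI")
    worst = min(k.recovery() for k in kpis)
    if worst >= relaunch_at:
        return "relaunch"
    if worst >= rescope_at:
        return "rescope"
    return "stop"


if __name__ == "__main__":
    scorecard = [
        Kpi("resolution_rate", baseline=0.50, current=0.68, target=0.70),
        Kpi("handle_time_min", baseline=10.0, current=5.0, target=4.0),
    ]
    for k in scorecard:
        print(f"{k.name}: {k.recovery():.0%} recovered")
    print("decision:", decide(scorecard))
```

Gating on the weakest KPI rather than the average is a deliberate choice in this sketch: one unrecovered metric is enough to hold back a relaunch.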

Adversarial Testing Layer

Moriarty

Open-source red-team starter kit

Pressure-test the workflow before the workflow embarrasses you.

Moriarty is the red-team and failure-seeking layer in the Sherlock product set, built for prompt injection, policy evasion, tool misuse, and remediation verification work.

Primary outcome: A clearer view of exploit paths, severity, and whether fixes actually worked

  • Scenario packs
  • Rules-of-engagement checklists
  • Worked support-assistant red-team example

The infrastructure and access layers that support the products

These are the foundations we reach for early because they make the Sherlock products easier to instrument, integrate, and govern in real delivery.

The supporting tools we use to move faster

Once the core stack is clear, these tools help us tighten testing, Python setup, dashboard quality, and product pricing without adding unnecessary complexity.

Testing

Blacksmith

Speeds up CI runs so builds, tests, and container changes do not slow delivery down.

  • Faster GitHub Actions runs
  • Less time lost to queues and caches
  • Quicker feedback during active sprints

Python tooling

Astral

Improves Python setup, linting, and environment consistency across development and CI.

  • Cleaner Python environment setup
  • Faster local development loops
  • More reliable backend and eval workflows

Front-end dashboards

TanStack

Powers data-heavy front-end products such as review queues, reporting views, and operator dashboards.

  • Stronger query and caching patterns
  • Better tables and operational data views
  • A more robust base for internal tools

Pricing

Polar

Useful when a workflow becomes a product and needs pricing, subscriptions, or usage-based billing.

  • Supports packaged pricing models
  • Helps with usage-based billing setup
  • Makes product monetization less bespoke

Need to modernize an operational workflow or legacy service?

Bring the workflow, the tooling constraint, and the outcome you need to prove. We can shape the first sprint around analytics, access, and one production-grade slice of the system.