Building effective enterprise AI agents from a total cost of ownership lens
Harnessing business value by minimizing the cost of setting up co-pilots to high security and compliance standards, with efficient change management and ecosystem-driven synergies
Context
Continuing from part 1 of this three-part series, this part delves into the components of the Total Cost of Ownership (TCO) for AI co-pilots, including licensing, setup, data, infrastructure, change management, performance monitoring, security, and compliance costs, and examines the factors that contribute to the accrual of business value for AI co-pilots through various phases.
A re-usable tabular version of this part of the framework can be found on this Coda page.
Executive Summary
The initial setup and integration phase is crucial, considering the diverse infrastructure and security standards in enterprise environments. Key considerations include deployment architecture, activation, installation, distribution, integration with existing systems, interoperability, private cloud/on-premise/hybrid support, and user management.
Configuration and change management are ongoing processes, requiring the ability to fine-tune input and output formats, incorporate human feedback, and maintain brand consistency & tonality. Scalability is essential, with attention to increased transaction volume, intensity, types of transactions, and concurrent users. Monitoring and management tools play a vital role in overseeing AI co-pilots at enterprise scale.
Security and compliance are paramount, with measures such as securing model parameters, federated learning, input and output validation, data storage, and adherence to industry standards. Transparency and accountability are necessary to understand the decision-making processes of AI co-pilots.
The ecosystem and network effects are highlighted, emphasizing the benefits of sharing privately hosted models, federated learning, data and integrations marketplaces, and the importance of aligning customer demand with available supply.
Total Cost of Ownership (TCO) for AI Co-Pilots
Total cost of ownership is a commonly used metric for characterizing the value of adopting and using a SaaS solution over its entire lifecycle. TCO takes into account cost components beyond the subscription or licensing fees, helping organizations make informed decisions about the true financial implications of adopting a specific software service. Putting AI co-pilots in perspective, here's how this cost can be broken down:
Licensing / subscription fees, which have typically followed a per-user model based on what we know from Microsoft Copilot and OpenAI. This cost, however, may not increase linearly with incremental users or transactions due to volume discounts.
Initial setup / implementation costs cover the initial configuration, integration, data migration, and training required to get the co-pilot solution up and running. This may include consulting fees, customization, and the time spent by a customer's IT team. In the context of AI co-pilots, the following costs may apply, especially for scenarios where private models are developed and deployed within customer environments:
Data Acquisition, Storage and Processing costs for obtaining high-quality data for training and validation, and for storing, pre-processing and cleaning it.
Data labelling costs as a pre-requisite for training high quality models.
Model development costs
Deployment and testing costs
Infrastructure Costs include the cost of any internal infrastructure that the customer may need to maintain / manage the solution. This may include:
On premise / private cloud compute costs for hosting private model inference.
Cost of GPUs for localized training / private model deployment and management
Costs to integrate the AI-driven software into existing systems and workflows.
Change Management Costs are incurred for maintaining the solution on an ongoing basis. For AI-driven co-pilots, this includes:
Costs for training employees and users on how to interact with and benefit from the AI-driven software.
Additional configuration costs required to make changes to the AI co-pilot as the business process evolves and to fine-tune the co-pilots.
Performance & monitoring costs for tools and services tracking the performance and accuracy of AI co-pilots.
Security & Compliance Costs include any costs needed to adhere to enterprise security and compliance standards.
Security related expenses for implementing security measures to protect AI models and data.
Compliance adherence costs related to complying with data protection and privacy regulations, such as GDPR or HIPAA.
The more compliant the solution, the lower these costs. The flip side of these costs is the cost incurred in the event of a non-compliance or security incident, adjusted for the probability of such events.
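To make the breakdown concrete, here is a minimal sketch of a multi-year TCO model in Python. Every figure and line item below is a hypothetical placeholder, not vendor pricing; real inputs would come from quotes and internal cost accounting.

```python
# Hypothetical three-year TCO model for an AI co-pilot deployment.
# All figures below are illustrative placeholders.

SEATS = 200
YEARS = 3

costs = {
    # Recurring: per-seat licensing with a simple volume-discount tier.
    "licensing": SEATS * 30 * 12 * YEARS * (0.85 if SEATS >= 100 else 1.0),
    # One-time: setup, integration, data acquisition and labelling, model work.
    "initial_setup": 120_000,
    "data_and_labelling": 60_000,
    # Recurring: customer-side GPUs / private-cloud compute for inference.
    "infrastructure": 40_000 * YEARS,
    # Recurring: user training, reconfiguration, performance monitoring.
    "change_management": 25_000 * YEARS,
    # Recurring: security controls plus the expected cost of incidents,
    # i.e. a hypothetical incident cost weighted by its probability.
    "security_and_compliance": 15_000 * YEARS + 500_000 * 0.02,
}

tco = sum(costs.values())
print(f"3-year TCO: ${tco:,.0f} (${tco / SEATS / YEARS:,.0f} per seat per year)")
```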
Initial Setup & Integration
While the modern SaaS stack has made the introduction of a new tool or service considerably simpler, enterprise environments still adhere to varying infrastructure and security standards. The effectiveness of a product feature is not merely a function of the value it creates for the end user but also of the ease with which it can be implemented and introduced. This is driven by the following factors:
Deployment Architecture: Co-pilots can utilize publicly available APIs from OpenAI, Anthropic, Cohere or opt for an in-house model. Efficiency in building a SaaS or PaaS architecture is crucial for rapid adoption. This encompasses specifics like selecting cloud services for co-pilot interaction and optimizing traffic flow through internal firewalls and proxies. Customers usually adhere to a strict zero-trust firewall policy, potentially blocking traffic to public LLM services by default, necessitating an enhanced networking design for seamless connectivity.
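As a rough illustration of that networking point, here is a minimal sketch of routing co-pilot traffic to a public LLM API through a customer's egress proxy, using the `requests` library. The proxy host, API endpoint, and CA bundle path are all hypothetical.

```python
import os
import requests

# Hypothetical corporate egress proxy; under a zero-trust policy, only this
# proxy has outbound access, and the LLM API host must be allowlisted on it.
PROXIES = {
    "https": os.environ.get("CORP_PROXY", "http://proxy.corp.example:3128"),
}

def call_llm(prompt: str) -> str:
    # Placeholder endpoint and response shape; a real deployment would use
    # the vendor's SDK configured with the same proxy and CA settings.
    resp = requests.post(
        "https://api.llm-vendor.example/v1/complete",
        json={"prompt": prompt},
        headers={"Authorization": f"Bearer {os.environ['LLM_API_KEY']}"},
        proxies=PROXIES,
        verify="/etc/ssl/corp-ca-bundle.pem",  # corporate TLS-inspection CA
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["completion"]
```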
Activation, Installation & Distribution: For AI agents installed locally on user workstations or browsers, creating a seamless distribution and installation channel streamlines adoption. Additionally, when these agents execute tasks locally, it's essential to assess and assign the necessary privileges for specific functions. This consideration becomes more relevant when the agents overlay existing applications rather than being deeply integrated into them. In the latter case, configuring the source applications to exchange data with the agent may be required, making a user-friendly activation process essential for boosting usage.
Integration with customer’s business systems: Integration with existing enterprise systems is often a pre-requisite to adoption of any SaaS software, not just a co-pilot. Some of the most common pre-requisites are:
Integration with existing customer databases, which may be hosted on premise or on private cloud instance.
Integrations with systems of record, reference, action and insight.
Integrations with the customer's credentials management system, such as CyberArk.
In these cases, an agent's capability to quickly integrate is typically achieved by the following (see the sketch after this list):
Using pre-built connectors based on target system APIs that reduce cost of building custom integrations
Allowing customers access to agent APIs and event data for custom connectors.
Configuring systems for data scraping when no integration is available. The capability to fine tune and configure such tools also streamlines integration.
Implementing mid-tier agents for scenarios where customer systems are locked down in on-premise environments.
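A minimal sketch of the pre-built-connector idea above: a common interface that each target system (ERP, CRM, credential vault) implements, so new integrations reuse the agent's core logic. All class and method names here are illustrative, not from any specific product.

```python
from abc import ABC, abstractmethod
from typing import Any

class Connector(ABC):
    """Uniform contract every pre-built or custom connector implements."""

    @abstractmethod
    def authenticate(self) -> None: ...

    @abstractmethod
    def fetch(self, entity: str, query: dict[str, Any]) -> list[dict]: ...

    @abstractmethod
    def push(self, entity: str, record: dict[str, Any]) -> str: ...

class NetSuiteConnector(Connector):
    """Hypothetical pre-built connector wrapping a target system's API."""

    def authenticate(self) -> None:
        ...  # e.g. retrieve credentials from the customer's vault (CyberArk)

    def fetch(self, entity, query):
        ...  # call the system-of-record API with pre-mapped fields
        return []

    def push(self, entity, record):
        ...  # write back, returning the created record's ID
        return "rec-123"

# The agent core is written once against the Connector contract; adding a
# new system means adding one class, not rewriting agent logic.
def process_invoices(connector: Connector) -> None:
    connector.authenticate()
    for invoice in connector.fetch("invoice", {"status": "pending"}):
        connector.push("approval_queue", invoice)
```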
Interoperability: For enterprise co-pilots, supporting legacy browsers and operating systems, like Edge, IE, and older systems, is vital for sectors like finance and healthcare, where technical debt is common. This broad compatibility accelerates adoption, especially in cases where customers have strict security settings, such as disabled JavaScript on browsers.
Private Cloud / On-premise / Hybrid support: In traditional and highly regulated enterprises, there's often a reluctance to use public cloud-based co-pilots, particularly when dealing with sensitive data. To accommodate these concerns, customers may opt for co-pilot installations in their private cloud or choose a hybrid setup, keeping data sources internal while processing in the public cloud. Google Cloud's Vertex AI, for instance, offers support for such deployment models.
User Management: Most enterprise applications require user authentication to be integrated with the customer's identity providers (IdPs) such as Okta, Azure AD, or Auth0. The capability of a co-pilot to quickly integrate with the customer's Single Sign-On (SSO) systems also contributes to initial setup costs.
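As a rough illustration of that SSO integration, here is a sketch of validating an OIDC access token issued by a customer's IdP, using the PyJWT library. The JWKS URL and audience are placeholders; Okta, Azure AD, and Auth0 all publish comparable JWKS endpoints.

```python
import jwt  # PyJWT, installed with its optional `cryptography` dependency

# Hypothetical customer IdP; the JWKS URL advertises the IdP's signing keys.
JWKS_URL = "https://idp.customer.example/oauth2/v1/keys"
AUDIENCE = "api://copilot"

jwks_client = jwt.PyJWKClient(JWKS_URL)

def validate_token(token: str) -> dict:
    # Look up the signing key referenced in the token header, then verify
    # signature, expiry, and audience in a single call.
    signing_key = jwks_client.get_signing_key_from_jwt(token)
    return jwt.decode(
        token,
        signing_key.key,
        algorithms=["RS256"],
        audience=AUDIENCE,
    )
```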
Licensing: In a seat / user-based licensing paradigm, specific customers may allocate co-pilot licenses on a rolling basis to facilitate moving licenses between different teams based on the timing of use. A great example is a business process outsourcing provider for outsourced support services that may leverage co-pilot licenses for a team in the Pacific region and then want to re-allocate those licenses to the team in the EU region. While this floating model can play against software providers, such enterprise-wide licensing agreements are largely priced on business cases, and thus moving seats around may not affect the total contract value. Such flexibility in licensing eases adoption.
Configuration & Change Management
The co-pilot may require ongoing calibration / tuning to achieve higher levels of accuracy on existing use cases or to accommodate a broader variety of use cases. Moreover, business processes are always bound to change, creating the need for users to accommodate those changes to maintain accuracy & compliance.
Tweaking Input & Output Formats: AI co-pilots may leverage a standard set of user-configured prompt templates as inputs and generate outputs in a pre-configured format which eventually feeds into downstream systems that expect that format. Since business processes and systems are always subject to change, exposing the capability for users to quickly configure, and eventually change, input and output templates streamlines the time and cost of building and managing such co-pilots. Examples of input formats include the capability for users to inject only questions, or additional documents along with questions. Examples of output formats include JSON, XML, CSV etc.
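A minimal sketch of that configurability: a user-editable prompt template plus pluggable output serializers, so a format change is a configuration edit rather than a code change. The template text and serializer names are illustrative.

```python
import csv
import io
import json
from string import Template

# User-configurable prompt template, stored as data rather than code,
# so admins can edit it without redeploying the co-pilot.
INPUT_TEMPLATE = Template(
    "Answer the question using the attached context.\n"
    "Question: $question\nContext: $context"
)

# Pluggable serializers so downstream systems can switch between
# JSON, CSV, etc. by changing a single configuration value.
def to_json(rows: list[dict]) -> str:
    return json.dumps(rows, indent=2)

def to_csv(rows: list[dict]) -> str:
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=rows[0].keys())
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

SERIALIZERS = {"json": to_json, "csv": to_csv}

prompt = INPUT_TEMPLATE.substitute(question="Total due?", context="Invoice #42 ...")
output = SERIALIZERS["json"]([{"invoice": 42, "total_due": 118.50}])
```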
Fine Tuning via Human-in-the-loop: Fine-tuning in AI co-pilots, especially those employing methods like supervised fine-tuning or reinforcement learning, hinges on a few key factors for effectiveness:
Ad-Hoc Training Data: The ability for customers to upload tailored training sets, such as use-case-specific documents or chat transcripts, and seamlessly fine-tune out-of-the-box models to enhance accuracy for specific tasks.
Human Feedback Loop: Incorporating human feedback into ongoing outputs is crucial. For instance, a co-pilot aiding an accounts payable specialist can present uncertain invoices for human review within the source system (e.g., NetSuite). Changes made are tracked, with possible follow-up questions for context, all seamlessly integrated with existing workflows. This minimizes friction and enriches the context required for optimal performance.
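A sketch of what such a feedback loop might capture per reviewed item, assuming a simple record written back for the next fine-tuning round; all field names are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class ReviewEvent:
    """One human correction, captured where the work happens (e.g. inside
    NetSuite) and fed back as labelled data for future fine-tuning."""
    item_id: str                 # e.g. the invoice flagged as uncertain
    model_output: dict           # what the co-pilot proposed
    human_output: dict           # what the reviewer actually approved
    confidence: float            # model confidence that triggered review
    follow_up_notes: str = ""    # optional context from the reviewer
    reviewed_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

event = ReviewEvent(
    item_id="INV-1042",
    model_output={"vendor": "Acme Corp", "total": 1180.00},
    human_output={"vendor": "Acme Corp.", "total": 1180.50},
    confidence=0.62,
)
```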
Brand, Tonality & Moderation: For co-pilots generating content for widespread consumption, maintaining brand compliance is crucial. This involves adhering to specific design elements, language guidelines, and desired tonality, both for internal and customer-facing materials. Effective co-pilots offer customizable settings and utilize contextual brand data to seamlessly integrate these requirements. Additionally, they should allow easy editing and adaptation to changes in brand and tonal preferences, ultimately reducing maintenance costs.
Scalability: As co-pilot software scales across various users, use cases and diverse data sets, scalability encompasses several aspects such as:
Increased transaction volume per user, e.g. processing a higher number of invoices per AP co-pilot.
Enhanced transaction intensity, e.g. processing more pages per invoice by an AP co-pilot.
Expanding the types of transactions handled, e.g. an accounts payable co-pilot managing both purchase order and invoice document types.
Growing number of concurrent users leveraging the co-pilot.
Scaling these transactions can lead to substantial compute demands unless optimized. OpenAI employs distributed computing, load balancing, auto-scaling, resource pooling, caching, prioritization, and algorithmic resource management to handle concurrent requests efficiently. Scalability profoundly affects pricing, as discussed in part 3. In PaaS scenarios requiring customer-side infrastructure, the incremental cost of customer-side compute infrastructure raises total ownership costs. The rate of cost increase with linear scalability impacts ROI, making solutions less cost-effective at scale.
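To illustrate two of the resource-management techniques named above, here is a minimal sketch combining response caching with a concurrency cap, using only the standard library. The cap, the cache, and the stand-in model call are all placeholders.

```python
import asyncio

MAX_CONCURRENT = 8                      # hypothetical per-tenant cap
semaphore = asyncio.Semaphore(MAX_CONCURRENT)
cache: dict[str, str] = {}              # toy response cache (no TTL/eviction)

async def handle_request(prompt: str) -> str:
    if prompt in cache:                 # serve repeats without a model call
        return cache[prompt]
    async with semaphore:               # bound concurrent model calls
        await asyncio.sleep(0.1)        # stand-in for the actual inference call
        result = f"answer for: {prompt}"
    cache[prompt] = result
    return result

async def main() -> None:
    prompts = [f"invoice {i % 5}" for i in range(20)]  # repeats hit the cache
    results = await asyncio.gather(*(handle_request(p) for p in prompts))
    print(f"{len(results)} requests served, {len(cache)} unique model calls")

asyncio.run(main())
```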
Monitoring & Management: The capability to monitor the usage and activities conducted by AI co-pilots across various users, bodies of work, systems and functions, supplemented with the capability to initiate, terminate or edit the existing workflows is key to effectively managing co-pilots at enterprise scale.
Security & Compliance
Heavily regulated customers such as financial institutions have some of the highest standards for enterprise security. Given that the architecture of modern-day co-pilots relies heavily on data for training and fine-tuning, the need for stronger security controls becomes even more evident. Here are some key pillars to keep in mind to minimize the cost of compliance and, more importantly, the financial & reputational risk from non-compliance:
Securing Model Parameters: For co-pilot software with custom / open-source model deployment, some commonly used approaches to maintain high levels of security are the following (a minimal sketch follows this list):
Encrypting model parameters to protect learned knowledge.
Restricting access for model parameters to authorized users/systems.
Using HSMs/key management for secure key storage.
Utilizing techniques like zero-knowledge proofs for secure computation on parameters without revealing them.
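A minimal sketch of the first two bullets, encrypting a serialized model checkpoint at rest with the `cryptography` library. In production, the key would live in an HSM or key-management service (bullet three), never inline as it is here for illustration.

```python
from cryptography.fernet import Fernet

# For illustration only: a real deployment would fetch this key from an
# HSM / KMS, gated by the access controls described above.
key = Fernet.generate_key()
fernet = Fernet(key)

def encrypt_checkpoint(path: str) -> None:
    # Encrypt serialized model parameters so the learned knowledge is
    # protected at rest.
    with open(path, "rb") as f:
        ciphertext = fernet.encrypt(f.read())
    with open(path + ".enc", "wb") as f:
        f.write(ciphertext)

def load_checkpoint(path: str) -> bytes:
    # Decryption is gated by access to the key, which should be granted
    # only to authorized users/systems.
    with open(path, "rb") as f:
        return fernet.decrypt(f.read())
```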
Training on Sensitive Data: Co-pilot software can leverage techniques such as federated learning, where the model is trained collaboratively without sharing sensitive data, reducing the exposure of confidential information. I’ve covered this in more detail in the ecosystem section below.
Input & Output Validation & Sanitization: To mitigate injection attacks and prevent inappropriate or harmful content in LLMs trained on a broad dataset, specific techniques include the following (sketched after this list):
Thoroughly validating input data by checking data type, length, format, and range.
Developing custom filters and rules to flag unacceptable content, such as hate speech or personal attacks.
Implementing human review for sensitive or high-risk applications, leveraging human judgment that automated filters may overlook.
Creating feedback loops where user-reported issues and false positives/negatives from content moderation enhance filtering mechanisms and train the model to produce more suitable content.
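A minimal sketch of the validation and filtering steps above; the length bound, blocklist patterns, escalation threshold, and toxicity scorer are all hypothetical.

```python
import re

MAX_PROMPT_CHARS = 4_000                 # hypothetical length bound
BLOCKLIST = re.compile(r"\b(ignore previous instructions|system prompt)\b", re.I)

def validate_input(prompt: str) -> str:
    # Type, length and format checks before anything reaches the model.
    if not isinstance(prompt, str):
        raise TypeError("prompt must be a string")
    if not 0 < len(prompt) <= MAX_PROMPT_CHARS:
        raise ValueError("prompt length out of range")
    if BLOCKLIST.search(prompt):
        raise ValueError("prompt matched injection blocklist")
    return prompt.strip()

def moderate_output(text: str, toxicity_score: float) -> str:
    # Custom rules flag unacceptable content; high-risk cases escalate to
    # the human review described above.
    if toxicity_score >= 0.8:            # placeholder threshold and scorer
        raise RuntimeError("output escalated for human review")
    return text

safe_prompt = validate_input("Summarize invoice INV-1042 for the AP team.")
reply = moderate_output("Invoice INV-1042 totals $1,180.50.", toxicity_score=0.01)
```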
Information Storage & Retention: For SaaS software, secure storage and retention of PII & PFI data, if collected, are paramount. This entails securing data both at rest and in transit using industry-standard protocols. Additionally, clearly communicating data retention policies and empowering users to manage their data and retention preferences are essential for compliance and adoption. While not a new concern, efficient implementation within the modern LLM stack significantly impacts usage and compliance costs.
Industry Standard Compliance: Building on the point above, adherence to standard certifications and data protection frameworks such as GDPR and CCPA is a key pre-requisite to enterprise adoption.
Transparency and Accountability: LLM outputs combine probabilistic elements with deterministic rules and customer system knowledge, which can yield inconsistent outcomes for the same input. To address this, most enterprises seek accountability and require an audit trail to understand the rationale behind a co-pilot's decisions based on the given data. Exposing the co-pilot's workflow and methodology, including how inputs were processed by multiple models or deterministic systems, is crucial for internal and external accountability. This can be presented as audit trails or decision flows, similar to how, for example, a service manager audits support ticket transcripts. Understanding how an LLM handles customer requests, classifies them, accesses knowledge bases, and provides solutions or recommendations is vital for security and performance monitoring.
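A sketch of the kind of audit-trail record that could back such a decision flow; the step names, fields, and model identifier are illustrative only.

```python
import json
from datetime import datetime, timezone

def audit_step(trail: list, step: str, detail: dict) -> None:
    # Append one entry per model or deterministic step the request passes
    # through, so the full decision path can be replayed later.
    trail.append({
        "ts": datetime.now(timezone.utc).isoformat(),
        "step": step,
        "detail": detail,
    })

trail: list[dict] = []
audit_step(trail, "classify_request", {"label": "billing_question", "p": 0.91})
audit_step(trail, "kb_lookup", {"article_ids": ["KB-204", "KB-317"]})
audit_step(trail, "draft_response", {"model": "private-llm-v2", "tokens": 214})
audit_step(trail, "policy_check", {"pii_redacted": True})

print(json.dumps(trail, indent=2))      # exportable for auditors / reviewers
```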
Ecosystem & Network Effects
In my article on Business Process Automation and the Generative AI Turbocharge, I highlighted that co-pilots can be implemented in very similar ways across end users from different organizations, which creates redundant implementation & fine-tuning work that can be streamlined to drive quicker adoption and reduce the cost of ownership. In the context of modern-day AI co-pilots, such commonalities can be attributed to the fine-tuning of models, the data used, and the systems that the co-pilots integrate with:
Model Ecosystem: A model ecosystem can be characterized by:
Sharing Privately Hosted Models: This involves sharing fine-tuned models for specific use cases within a customer ecosystem, reducing the need for individual fine-tuning. HuggingFace provides an example, though it's limited to models that can be imported into private cloud instances for custom co-pilot development.
Federated Learning: In this approach, an initial global model is created and shared among participating entities. Each entity then trains the model iteratively on its local data without sharing the data itself. After each round of training, only the computed model updates are sent, never the data. A central server aggregates these updates into a global model update, which is then distributed back to the participating entities. Federated learning has the potential to create true network effects within co-pilot ecosystems while maintaining customer compliance and keeping data secure (see the aggregation sketch below).
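A minimal sketch of that aggregation step using NumPy: the weighting by local example counts follows the standard federated averaging (FedAvg) formulation, while the participant counts and model shape are illustrative.

```python
import numpy as np

def fedavg(updates: list[np.ndarray], n_examples: list[int]) -> np.ndarray:
    # Each participant sends only its locally trained parameters (or deltas),
    # never its data; the server averages them weighted by dataset size.
    weights = np.asarray(n_examples, dtype=float)
    weights /= weights.sum()
    return np.average(np.stack(updates), axis=0, weights=weights)

# Three hypothetical participants with differently sized local datasets.
local_models = [np.random.randn(10) for _ in range(3)]
global_model = fedavg(local_models, n_examples=[1_200, 400, 2_400])

# The new global model is redistributed for the next local training round.
print(global_model.shape)  # (10,)
```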
Data Marketplace: While enterprises are less likely to share proprietary datasets with each other, systems integrators and service providers, who offer initial implementation services, may have a stronger incentive to host and potentially monetize their collected datasets as part of their business strategy.
Integrations Marketplace: In addition to the initial set of integrations offered, there's value in a community-driven approach. Allowing customers and developers to create and share integrations fosters implementation synergies and encourages network effects. An excellent example is Zapier, which has successfully built its ecosystem around this concept.
In the modern AI landscape, the success of an enterprise ecosystem hinges on the speed at which models, parameters, datasets, and integrations can seamlessly integrate into existing co-pilot workflows and adapt to specific use case requirements. Aligning customer demand with the availability of these resources has consistently been the driving force behind the creation of thriving marketplaces.
Bringing it home
Beyond the design of the co-pilot, its product-market fit, user experience and data types, the ease with which it can be introduced and implemented within complex enterprise environments, and then managed and maintained in a secure and compliant manner as the business language keeps changing, contributes significantly to the long-term success of both the software maker and the customer. Finally, the capability of such co-pilots to create an ecosystem that maps the demand for the latest and greatest fine-tuned models, data and systems integrations to reduce redundant pockets of implementation across the customer base can turbocharge adoption and reduce the time to execution. The final part of this three-part series will focus on the mechanisms for realizing business value from a cost-benefit perspective, and on how such co-pilot software will be sold as work, not software.