
Open Architecture Was the First Principle. Open Weights Are the Second.
Open architecture in wealth management was never a marketing campaign. It was a structural response to changing advisor and client demands. Captive distribution, where product manufacturing and distribution sat under the same roof, produced three compounding challenges: misaligned incentives (real or perceived) between advisor and client, constrained optionality for the advisor obligated to act in the client's interest, and single-vendor fragility, where the entire operation's capability ceiling was set by a single provider's strategic priorities, performance, and offering.
These were not theoretical concerns. They drove a series of industry-changing events: asset sales and business separations, new regulatory requirements such as Regulation Best Interest [1] and the DOL's fiduciary rule [2], and the long migration of assets from wirehouses to independent RIAs [3]. That history matters now because wealth firms are making similarly consequential decisions about AI infrastructure, and many are making them by default rather than by design. I am referring specifically to the decision to adopt a given frontier model into your business practices - Gemini, Copilot, Claude, ChatGPT, Grok - without considering the longer-term implications to the business model of being tied to a single platform and model provider.
The parallel is not metaphorical and the mapping is surprisingly precise:
>> The captive product shelf maps to single-provider API dependency.
>> The open product shelf maps to a provider-agnostic model layer.
>> Asset portability maps to a semantic data and execution layer that makes model portability possible.
Consider the three challenges of captive distribution and their current analogs.
First, incentive misalignment. Frontier model providers operate under an economic logic that rewards lock-in. The more deeply a firm integrates with a specific provider's API surface, fine-tuning ecosystem, and proprietary features (function calling formats, system prompt conventions, model-specific optimization techniques), the higher the switching cost and the more durable the revenue stream [5]. The provider is rewarded for making exit costly. Firms should be designing for the opposite.
Second, constrained optionality. Committing to a single provider's ecosystem means inheriting their capability roadmap, pricing trajectory, and strategic priorities. If a better-suited model emerges, the firm cannot adopt it without unwinding months of integration work. Open-weight models are already closing this gap. The LMSYS Chatbot Arena [6], the DeepSeek-R1 technical report [7], and Stanford's HELM [8] all show convergence on extraction, classification, and summarization – precisely the tasks that constitute the bulk of operational AI workloads in wealth management.
Third, single-vendor fragility. The Financial Stability Board flagged AI provider concentration risk for financial institutions in 2017 [9], and reinforced it in 2023 with specific guidance on technology vendor concentration [10]. When your extraction pipeline, reconciliation workflow, and compliance review all depend on the same provider's API, you have not built a technology stack. You have built a dependency graph with a single point of failure whose pricing, uptime, and capability evolution you do not control.
The strategic mistake is treating model selection as a procurement decision when it is an architectural one. The real question is not which model is best today but whether the system is built so that the answer can change tomorrow as the business adapts.
Wealth firms do not merely suffer from fragmented data. They suffer from unencoded operating reality. Accounts, households, products, advisory programs, permissions, fee schedules, approval paths, and compliance obligations exist across systems but not as a governed semantic model. Until that structure exists, AI can summarize fragments but cannot reliably reason over the firm. This is also why agent overlays disappoint in wealth management; they inherit the fragmentation they are supposed to solve. An agent that cannot traverse the firm's actual entity relationships, fee logic, program structures, and permission boundaries is not augmenting operations. It is automating guesswork.
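To make "governed semantic model" concrete, here is a minimal sketch of what encoding entity relationships and fee logic looks like in code. Every name and the tiered fee structure are illustrative assumptions, not any firm's actual schema; the point is only that household-to-account-to-program traversal becomes explicit and queryable rather than implied across systems.

```python
from dataclasses import dataclass, field

# Hypothetical, minimal entity model. Names, tiers, and rates are
# illustrative assumptions, not a real firm's schema or pricing.

@dataclass
class FeeSchedule:
    tiers: list[tuple[float, float]]  # (asset ceiling, annual rate)

    def rate_for(self, assets: float) -> float:
        # Return the rate of the first tier whose ceiling covers the assets.
        for ceiling, rate in self.tiers:
            if assets <= ceiling:
                return rate
        return self.tiers[-1][1]

@dataclass
class AdvisoryProgram:
    name: str
    fee_schedule: FeeSchedule

@dataclass
class Account:
    account_id: str
    assets: float
    program: AdvisoryProgram

@dataclass
class Household:
    household_id: str
    accounts: list[Account] = field(default_factory=list)

    def annual_fee(self) -> float:
        # Traverse account -> program -> fee schedule: exactly the kind of
        # relationship an agent cannot reliably guess from raw tables.
        return sum(a.assets * a.program.fee_schedule.rate_for(a.assets)
                   for a in self.accounts)
```

An agent operating over a model like this asks the ontology for an answer; an agent operating over raw exports reverse-engineers fee logic from column names and hopes.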
Most firms building AI into financial workflows still let the model interact with data too directly: generating SQL, composing API calls, or translating natural language into execution logic with very little structural mediation. That works until it fails, and when it fails, the failure is hard to reproduce or govern.
This distinction matters for model optionality in a way that is not immediately obvious. When the model's job is to emit structured intermediate representation against a well-defined ontology, rather than raw execution logic, the coupling between the model and the data layer becomes loose by design. The ontology does not change when you swap models. The metric definitions do not change. The safety invariants - tenant isolation, query budgets, provenance tracking - do not change. What changes is the component that translates natural language into structured intent, and that turns out to be the most substitutable part of the system.
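A sketch of that mediation layer, under stated assumptions: the ontology vocabulary and intent schema below are hypothetical placeholders for a firm's actual semantic layer. The model emits JSON intent, and this system-owned code validates it and stamps the safety invariants before anything downstream generates a query.

```python
import json

# Hypothetical ontology vocabulary: illustrative, not a real firm's
# semantic layer.
ONTOLOGY = {
    "entities": {"account", "household", "program"},
    "metrics": {"assets", "annual_fee", "net_flows"},
}

def validate_intent(raw_model_output: str, tenant_id: str) -> dict:
    """Parse a model's JSON intent and enforce ontology and safety checks.

    The model never emits SQL. It emits a structured intent; this layer
    rejects anything outside the governed vocabulary, and the system, not
    the model, stamps tenant isolation onto the request.
    """
    intent = json.loads(raw_model_output)
    if intent.get("entity") not in ONTOLOGY["entities"]:
        raise ValueError(f"unknown entity: {intent.get('entity')!r}")
    if intent.get("metric") not in ONTOLOGY["metrics"]:
        raise ValueError(f"unknown metric: {intent.get('metric')!r}")
    intent["tenant_id"] = tenant_id  # invariant the model cannot override
    return intent
```

Swap the model and this layer does not change; only the component producing the JSON does. That is the loose coupling the paragraph above describes.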
Cost matters here because the gap is small and cumulative, and at operational volume it can become the difference between an AI feature and a viable operating model. As of early 2025, inference pricing for a 70B-parameter open-weight model (Llama 3.1 70B, for instance) on managed infrastructure through providers like Fireworks AI or Together AI runs approximately $0.88 to $0.90 per million input tokens. Frontier API pricing for comparable capability ranges from $2.50 to $15.00 per million input tokens, depending on the model and provider. For a wealth management firm processing thousands of client records, fund statements, custodial reports, and compliance documents, the per-document cost differential is not modest. At volume, it can approach an order of magnitude or more once output token pricing and batch processing patterns are included. Artificial Analysis publishes performance charts that make the spread clear enough [11].
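The arithmetic is simple enough to put on one screen. The per-token prices are the ones cited above; the document volume and token counts are illustrative assumptions, not benchmarks, and input tokens only are counted.

```python
# Back-of-envelope using the input-token prices cited above. Volume and
# tokens-per-document are illustrative assumptions.
OPEN_WEIGHT_PER_M = 0.90   # $/M input tokens, managed 70B open-weight
FRONTIER_PER_M = 15.00     # $/M input tokens, upper end of frontier range

docs_per_day = 10_000      # statements, reports, agreements
tokens_per_doc = 4_000     # a few pages of dense text
days_per_year = 250

annual_tokens_m = docs_per_day * tokens_per_doc * days_per_year / 1_000_000

open_cost = annual_tokens_m * OPEN_WEIGHT_PER_M
frontier_cost = annual_tokens_m * FRONTIER_PER_M
print(f"open-weight: ${open_cost:,.0f}/yr  frontier: ${frontier_cost:,.0f}/yr "
      f"({frontier_cost / open_cost:.1f}x)")
```

Under these assumptions the same workload costs roughly $9,000 a year on open-weight inference and $150,000 at the top of the frontier range, before output tokens widen the spread further.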
But cost, stated in isolation, still misses the deeper point. The question is not only which model is cheaper. It is how the model interacts with the firm's data, and whether the system is designed so that the model layer remains subordinate to a semantic data layer that encodes the firm's actual operating reality.
This is where multi-model routing becomes less a preference than an architectural requirement. Not every task in a financial-services workflow demands the same model. Structured extraction from a semi-standardized client agreement - classification, entity resolution, field mapping - is a case where open-weight models often perform near parity with frontier offerings, and where the cost differential is hardest to justify. Complex multi-step reasoning, novel analytical synthesis across heterogeneous data sources, and nuanced interpretation of ambiguous disclosure language remain areas where larger frontier models still have real advantages. The gap is still real on the hardest tasks. But most operational volume is not the hardest work, and a well-designed system routes each request to the model whose capability profile and cost structure match the task's actual complexity. If most operational volume runs on lower cost open-weight inference and only genuinely complex work routes to frontier models, the economics of the system change materially.
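The routing layer itself can be modest. A sketch, with the caveat that the task categories and model names here are placeholders for whatever a firm's own evaluation process actually selects:

```python
# Illustrative router. Task categories and model tier names are
# assumptions, stand-ins for a firm's evaluated model roster.
ROUTES = {
    "extraction":           "open-weight-70b",
    "classification":       "open-weight-70b",
    "summarization":        "open-weight-70b",
    "multi_step_reasoning": "frontier-large",
    "ambiguous_disclosure": "frontier-large",
}

def route(task_type: str) -> str:
    # Default unrecognized tasks to the frontier tier: fail toward
    # capability, not toward cost.
    return ROUTES.get(task_type, "frontier-large")
```

The table is the policy, and the policy is the firm's to change: when a new open-weight model clears the evaluation bar for a task, one entry moves and no integration unwinds.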
None of this happens by accident. Open architecture in wealth management required actual infrastructure: ACAT transfer protocols, clearing and custody separation, standardized account structures, open product shelves. It was an engineering project, not a philosophical position, and it required intentional decisions made early.
AI model optionality requires the same kind of deliberate infrastructure: a semantic layer that governs all data access through a firm-controlled ontology, a provider-agnostic abstraction that normalizes tool calling and model outputs, an execution environment where models emit validated intermediate representations rather than raw logic, and a routing layer that matches task complexity to model capability. That way, frontier pricing is reserved for frontier-grade work. I have built these layers. The engineering cost is real but bounded. Retrofitting model independence after the fact is technically possible but practically prohibitive.
The wealth management industry took roughly two decades to internalize the open architecture lesson. Some firms never did, and they now read like case studies in structural decline. The AI version of this decision is being made right now, mostly in implementation choices that executives still tend to treat as technical details. In wealth management the semantic layer is not an AI feature. It is the beginning of computable firm infrastructure. The uncomfortable question is whether firms will recognize that before their model provider becomes the de facto control plane.

Sources
[1] U.S. Securities and Exchange Commission, "Regulation Best Interest: The Broker-Dealer Standard of Conduct," Release No. 34-86031, June 2019. https://www.sec.gov/rules/final/2019/34-86031.pdf
[2] U.S. Department of Labor, "Definition of the Term 'Fiduciary'; Conflict of Interest Rule — Retirement Investment Advice," 81 FR 20946, April 2016. https://www.federalregister.gov/documents/2016/04/08/2016-07924/definition-of-the-term-fiduciary-conflict-of-interest-rule-retirement-investment-advice
[3] Cerulli Associates, "U.S. RIA Marketplace" report series, ongoing. https://www.cerulli.com
[4] Financial Planning Association v. SEC, No. 04-1242 (D.C. Cir. 2007). https://media.cadc.uscourts.gov/opinions/docs/2007/03/04-1242a.pdf
[5] Carl Shapiro and Hal R. Varian, *Information Rules: A Strategic Guide to the Network Economy* (Harvard Business School Press, 1999), Chapters 5-6. https://www.inforules.com
[6] LMSYS Chatbot Arena Leaderboard, UC Berkeley. https://chat.lmsys.org/?leaderboard
[7] DeepSeek AI, "DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning," January 2025. https://arxiv.org/abs/2501.12948
[8] Stanford Center for Research on Foundation Models, "HELM: Holistic Evaluation of Language Models." https://crfm.stanford.edu/helm/
[9] Financial Stability Board, "Artificial Intelligence and Machine Learning in Financial Services: Market Developments and Financial Stability Implications," November 2017. https://www.fsb.org/2017/11/artificial-intelligence-and-machine-learning-in-financial-services/
[10] Financial Stability Board, "Enhancing Third-Party Risk Management and Oversight: A Toolkit for Financial Authorities and Financial Institutions," December 2023. https://www.fsb.org/2023/12/enhancing-third-party-risk-management-and-oversight-a-toolkit-for-financial-authorities-and-financial-institutions/
[11] Artificial Analysis, LLM Performance and Pricing Tracker. https://artificialanalysis.ai/