The Missing Model: Why Finance Has No Good Framework for Commodity Procurement Under Geopolitical Risk
And an open-source attempt to build one.
There is an enormous gap at the intersection of portfolio theory, commodity procurement, and geopolitical risk. It is a gap that matters — for energy security, for critical minerals strategy, for anyone trying to think clearly about how nations should source the physical inputs their economies depend on. And it is a gap that, as far as I can tell, nobody has seriously tried to fill.
I have built a model and open-source application that takes a first pass at this problem using agent-based modelling and Q-learning. It is imperfect in ways I will be upfront about, which you can find outlined here. But even in its current form, it learns things that I think are genuinely interesting: things that match the intuitions of good policymakers but that have never, to my knowledge, been derived from a formal optimization framework. The model and app are available on GitHub at commodity-portfolio-abm. The app is here.
The literature gap
Modern portfolio theory is more than seventy years old. Markowitz gave us mean-variance optimization in 1952. Since then, the financial economics literature has developed extraordinarily sophisticated tools for portfolio construction under uncertainty: factor models, robust optimization, Black-Litterman, CVaR constraints, hierarchical risk parity. If you want to optimize a portfolio of equities, or bonds, or even financial commodity derivatives, you are spoiled for choice.
But if you are a government or a large industrial buyer trying to decide how to physically procure a commodity — how much to contract from which suppliers, how much domestic capacity to maintain, how much to store, how much to leave to the spot market — the literature offers almost nothing.
This is not because the problem is unimportant. Europe’s catastrophic dependence on Russian gas, the scramble for rare earth alternatives after Chinese export controls, the fertiliser shock following the invasion of Ukraine — these are among the most consequential economic disruptions of recent years. They are all, at their core, problems of portfolio construction in physical commodity procurement under geopolitical risk. And yet the formal tools available to the people making these decisions are startlingly thin.
The credit risk literature comes closest. CVA pricing, wrong-way risk models, Duffie-Singleton, Gordy’s asymptotic single-risk-factor framework: these give us machinery for thinking about counterparty default in financial contexts. However, physical commodity offtakes differ from financial contracts in ways that matter enormously. When your largest supplier defaults, they do not just fail to deliver; their default is the supply shock that moves the replacement price against you. The loss given default is endogenous to the default event itself in a way that creates convexity: the relationship between supplier concentration and conditional expected loss is nonlinear, which means standard credit portfolio models systematically underestimate the diversification benefit. To compare across securities: if a bond defaults, you lose some of your principal. If you short a stock, your losses are uncapped. The downside of a commodity offtake default is more like a stock short sale: you have to buy back in at a higher price.
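The convexity argument can be illustrated with a toy calculation. This is a sketch under an assumed linear price-impact rule (the spot price rises in proportion to the share of supply that defaulted). The function names, the `price_sensitivity` and `fixed_premium` parameters, and all numbers are hypothetical and are not taken from the model's code.

```python
def endogenous_loss(share, demand=100.0, base_price=10.0, price_sensitivity=2.0):
    """Extra cost of replacing a defaulted supplier's volume on the spot market.

    Assumption: the spot price rises linearly with the share of supply lost,
    spot = base_price * (1 + price_sensitivity * share).
    """
    spot_price = base_price * (1.0 + price_sensitivity * share)
    lost_volume = share * demand
    return lost_volume * (spot_price - base_price)


def exogenous_loss(share, demand=100.0, base_price=10.0, fixed_premium=0.5):
    """Benchmark in the spirit of a standard credit model: the replacement
    premium is fixed, regardless of how large the defaulting supplier was."""
    return share * demand * base_price * fixed_premium


for share in (0.25, 0.5):
    print(share, endogenous_loss(share), exogenous_loss(share))
```

Under these assumptions, doubling the defaulting supplier's share doubles the loss in the fixed-premium benchmark but quadruples it when the price impact is endogenous. That gap is the diversification benefit a linear model would miss.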
On the operations research side, there is good work on supply chain optimization and logistics, the kind of work done by scholars like Jun Ukita Shepard, Lincoln Pratson, and Gosens, Turnbull, and Jotzo on coal trade flows and energy logistics. But this work tends to optimize for efficiency under normal conditions rather than resilience under disruption, although network flow models can be stressed for scenario analysis. It does not typically model the geopolitical risk dimension or the strategic feedback loops between procurement decisions and supplier behavior.
And the geopolitical risk literature — Caldara and Iacoviello’s GPR index, the resource curse and resource war traditions — gives us measures and narratives but not decision frameworks. Knowing that geopolitical risk is elevated does not tell you how to restructure your offtake portfolio in response.
What is missing is a framework that jointly considers: the portfolio of supplier contracts and their covariance structure under disruption; the option value of domestic production capacity (sometimes expensive but reliable); the insurance value of strategic storage; the feedback effects of procurement decisions on supplier capacity and market structure; and the spot market dynamics that emerge when multiple buyers compete for scarce supply during a crisis.
An agent-based approach
The model I have built attacks this problem from a different direction than the analytical frameworks above. Rather than trying to derive a closed-form solution — which is likely intractable given the feedback loops and non-stationarities involved — it uses reinforcement learning within an agent-based simulation to discover good procurement strategies through trial and error.
The setup is straightforward. A simulated world contains suppliers, each with a capacity, a cost structure (fixed plus variable), and disruption and recovery probabilities modelled as a two-state Markov chain. Countries have demand, domestic production capacity (more expensive but reliable), strategic storage, and access to a spot market. The spot market clears via a price mechanism: when supply is disrupted and multiple countries are competing for scarce supply, the price rises until demand meets available supply, with more price-elastic buyers reducing their purchases more.
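The two mechanisms in this setup can be sketched in a few lines. The first function is a two-state Markov chain for a single supplier. The second clears the spot market under an assumed constant-elasticity demand curve, solved by bisection. The functional form, the function names, and the parameters are illustrative assumptions; the repository may implement clearing differently.

```python
import random


def step_supplier(online, p_disrupt, p_recover, rng):
    """One step of a two-state Markov chain (online <-> disrupted).

    Returns True if the supplier is online in the next period.
    """
    if online:
        return rng.random() >= p_disrupt
    return rng.random() < p_recover


def clear_spot(base_demands, elasticities, supply, base_price=1.0, max_price=1e6):
    """Find the price at which total demand equals available supply.

    Assumption: buyer i demands d_i * (price / base_price) ** (-e_i), with
    e_i > 0. Returns base_price if supply covers demand at that price, and
    max_price if no price below the cap clears the market.
    """
    def total_demand(price):
        return sum(d * (price / base_price) ** (-e)
                   for d, e in zip(base_demands, elasticities))

    if total_demand(base_price) <= supply:
        return base_price
    lo, hi = base_price, max_price
    for _ in range(200):  # bisection on a monotonically decreasing function
        mid = (lo + hi) / 2.0
        if total_demand(mid) > supply:
            lo = mid
        else:
            hi = mid
    return hi


rng = random.Random(42)
online = step_supplier(True, p_disrupt=0.05, p_recover=0.3, rng=rng)
price = clear_spot([50.0, 50.0], [0.5, 1.0], supply=80.0)
print(online, round(price, 3))
```

With two buyers of equal base demand, the buyer with the higher elasticity ends up with the smaller share of the scarce supply at the clearing price, which is the behaviour described above.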
The critical feedback loop is on the supply side. Suppliers adjust their capacity based on utilization. If a supplier’s output is being underutilized (because buyers are stockpiling, sourcing elsewhere, or leaning on domestic production), it gradually cuts capacity. This means the “safe” strategy of heavy domestic production and storage can, paradoxically, make the international market structurally tighter and crises worse when they do hit. This is important to consider today when looking at “Bidenist” policies: pulling all capacity onshore can embrittle global supply chains. Similarly, Chinese overcapacity that drives out refining capacity abroad creates problems later, as we see in Asian refined products today, with China stopping exports after the capacity elsewhere has been lost.
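A minimal sketch of this feedback rule, assuming a supplier trims capacity by a fixed fraction whenever utilization falls below a target. The `target_utilization` and `adjust_rate` parameters and their values are hypothetical, and capacity expansion is left out for brevity.

```python
def adjust_capacity(capacity, sold, target_utilization=0.8, adjust_rate=0.05):
    """Cut capacity by adjust_rate when utilization is below the target.

    Assumption: a simple proportional cut; the real model may use a
    different rule and may also let capacity grow when utilization is high.
    """
    utilization = sold / capacity if capacity > 0 else 0.0
    if utilization < target_utilization:
        capacity *= 1.0 - adjust_rate
    return capacity


# A supplier with 100 units of capacity whose buyers only take 50 units.
capacity = 100.0
for period in range(20):
    capacity = adjust_capacity(capacity, sold=min(50.0, capacity))
print(round(capacity, 2))
```

In this toy run the supplier keeps shrinking until its remaining capacity is back near the target utilization. The slack that would have absorbed a disruption elsewhere is gone by the time it is needed.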
Each country is controlled by a Q-learning agent that observes a compressed state — storage levels, spot prices, disruption severity, and supply tightness — and chooses from a discrete set of actions covering allocation profiles (how to split purchases across contract suppliers, spot, and domestic) and storage decisions (build, hold, or draw). Over thousands of training episodes, the agent converges on a policy that minimizes average procurement cost across the full distribution of disruption scenarios.
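For readers unfamiliar with Q-learning, the core of such an agent fits in a short class. This is a generic tabular sketch with epsilon-greedy exploration, written for cost minimization by treating reward as negative cost. The class name, hyperparameter values, and state encoding are assumptions for illustration, not the repository's implementation.

```python
import random
from collections import defaultdict


class QAgent:
    """Tabular Q-learning agent that minimises cost (reward = -cost)."""

    def __init__(self, n_actions, alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
        self.n_actions = n_actions
        self.alpha = alpha      # learning rate
        self.gamma = gamma      # discount factor
        self.epsilon = epsilon  # exploration probability
        self.rng = random.Random(seed)
        # Q-table: state -> list of action values, initialised to zero.
        self.q = defaultdict(lambda: [0.0] * n_actions)

    def greedy(self, state):
        """Index of the action with the highest estimated value."""
        values = self.q[state]
        return max(range(self.n_actions), key=values.__getitem__)

    def act(self, state):
        """Epsilon-greedy action selection."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        return self.greedy(state)

    def update(self, state, action, cost, next_state):
        """Standard one-step Q-learning update with reward = -cost."""
        target = -cost + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.alpha * (target - self.q[state][action])


# Hypothetical compressed state: (storage, spot price, disruption, tightness).
agent = QAgent(n_actions=6)
state = ("storage_low", "price_high", "severe", "tight")
action = agent.act(state)
agent.update(state, action, cost=12.5, next_state=state)
```

Because the state is a hashable tuple of discretized observations, the table only grows for states the agent actually visits, which keeps a compressed state space like the one described above manageable.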
What the model learns
Three findings emerge consistently, and I think they are worth highlighting because they match real-world policy intuitions while being derived purely from the optimization:
First, diversification is robustly valuable, and more valuable than naive models suggest. The trained agent consistently spreads procurement across multiple contracted suppliers rather than concentrating on the cheapest one, even when cost differentials are significant. This is the endogeneity-of-LGD result showing up in a different guise: when your largest supplier goes offline, the spot price spikes precisely because of that disruption, and your replacement cost is worst exactly when you need to replace the most volume. The agent discovers this convexity through experience without anyone programming it in.
Second, storage is used strategically, not passively. The agent does not simply fill storage and hold it. It learns to build reserves when spot prices are low and disruption severity is low, and to draw them down during crises or when the market is tight. Storage acts as a dynamic buffer, absorbing surplus in calm markets and releasing it during stress, rather than as a static strategic reserve. The cost of holding inventory is weighed against the tail-risk insurance it provides, and the agent finds a balance that would be very difficult to derive analytically given the path-dependence involved. If this sounds exactly like the SPR work I have done with Employ America, Arnab Datta, and others, that is because it is exactly that: reserves need to be dynamically managed.
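The qualitative shape of this behaviour can be written down as a hand-coded rule. To be clear, this is not the learned policy: it is a hypothetical threshold rule that mimics the pattern described, with made-up threshold values, and is useful mainly as a baseline to compare a trained agent against.

```python
def storage_action(spot_price, base_price, disruption_severity, storage_level,
                   storage_capacity, calm_severity=0.2, crisis_severity=0.5,
                   tight_price_ratio=1.5):
    """Return "build", "hold", or "draw" from simple thresholds.

    Assumptions: disruption_severity lies in [0, 1], and a market counts as
    tight when spot is at least tight_price_ratio times the base price.
    """
    in_crisis = disruption_severity >= crisis_severity
    market_tight = spot_price >= tight_price_ratio * base_price
    if (in_crisis or market_tight) and storage_level > 0:
        return "draw"
    cheap = spot_price <= base_price
    calm = disruption_severity <= calm_severity
    if cheap and calm and storage_level < storage_capacity:
        return "build"
    return "hold"


print(storage_action(0.9, 1.0, 0.0, storage_level=10, storage_capacity=100))
print(storage_action(2.0, 1.0, 0.8, storage_level=10, storage_capacity=100))
```

A fixed rule like this cannot weigh holding costs against tail risk the way the trained agent does, which is part of why the learned, state-dependent policy is the more interesting object.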
Third, the agent pays up for domestic capacity. Even though domestic production is modelled as more expensive than international supply on a per-unit basis, the learned policy maintains and uses domestic capacity rather than letting it atrophy. The key insight is that domestic capacity’s value is not primarily in the units it produces during normal times — it is in the units it can produce during a crisis, when the alternative is buying on a spot market that has spiked to multiples of the base price. The agent learns to maintain domestic capacity as a real option against extreme spot price outcomes, even when this raises average-case costs. This has obvious resonance with debates around energy independence, onshoring, and the value of maintaining “uncompetitive” domestic industries as strategic buffers. One deep question from an extension of this model is how you manage this when private sector incentives are short term (“make money or get raided by Elliott”) and governments have electoral cycles. Japan does a good job here with JOGMEC and JERA but few others do.
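The real-option logic can be checked with back-of-the-envelope arithmetic. The sketch below assumes a single crisis state with a known probability, a spot price that jumps to a fixed multiple of the base price in that state, and domestic capacity that is paid for every period (a standby cost) but only dispatched in a crisis. All names and numbers are hypothetical, and this simplification differs from the model, where the agent also uses domestic capacity in normal times.

```python
def expected_unit_cost(crisis_prob, base_price, crisis_multiple,
                       domestic_cost=None, standby_cost=0.0):
    """Expected cost per unit of demand.

    With domestic capacity, the price paid in a crisis is capped at
    domestic_cost, at the expense of a standby cost paid in every state.
    """
    crisis_price = base_price * crisis_multiple
    if domestic_cost is not None:
        crisis_price = min(crisis_price, domestic_cost)
    return (standby_cost
            + (1.0 - crisis_prob) * base_price
            + crisis_prob * crisis_price)


def breakeven_crisis_prob(base_price, crisis_multiple, domestic_cost, standby_cost):
    """Crisis probability above which paying for standby capacity is cheaper."""
    return standby_cost / (base_price * crisis_multiple - domestic_cost)


without = expected_unit_cost(0.1, base_price=10.0, crisis_multiple=5.0)
with_domestic = expected_unit_cost(0.1, base_price=10.0, crisis_multiple=5.0,
                                   domestic_cost=14.0, standby_cost=1.0)
print(without, with_domestic)
print(breakeven_crisis_prob(10.0, 5.0, domestic_cost=14.0, standby_cost=1.0))
```

With these illustrative numbers, domestic supply that costs 40% more than the base price still pays for itself once the crisis probability exceeds roughly 3%, because its value comes from capping the crisis price rather than from its normal-times unit cost.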
What the model does not do is equally important. It does not model strategic behavior by suppliers — disruptions are purely stochastic, not adversarial. It does not model transport costs, demand growth, multiple commodities, or the production lags that characterize mining and heavy industry. Countries learn independently without game-theoretic interaction. These are all directions for extension, and I have laid them out in detail in the project README.
Why this matters for policy
The policy relevance should be clear. Governments are currently making enormous bets on supply chain restructuring — friendshoring, critical mineral stockpiles, domestic production subsidies — largely on the basis of narrative reasoning and ad hoc analysis. The analytical frameworks they can draw on are either too abstract (geopolitical risk indices), too narrow (logistics optimization without disruption risk), or borrowed from domains with fundamentally different structure (financial portfolio theory applied without adaptation to physical commodities).
What I think this model demonstrates is that even a relatively simple simulation framework can generate non-trivial insights about optimal procurement strategy. The finding that diversification benefits are larger than naive models suggest has direct implications for how we should think about concentration risk in critical mineral supply chains. The finding that storage should be managed dynamically, not passively, has implications for how strategic petroleum reserves and other stockpiles are operated. And the finding that it can be cost-effective to maintain higher-cost domestic capacity as a strategic buffer provides analytical grounding for industrial policies that are currently justified mainly on national security grounds. One does this via either a strategic resilience reserve or a command economy. I prefer the former.
The bigger question of how one establishes something like bank regulation, with capital charges or incentives to diversify domestic supply, is trickier. If no individual country should hold a 20% weighting on Qatar, then how did the LNG market get there in aggregate? Most governments have not developed the long view of thousands of lifetimes afforded by reinforcement learning, a kind of remembrance of past lives, like that of a Tulku. If we cannot be governed by enlightened beings, perhaps we should model more?
The model is open-source, comes with a React dashboard for exploring scenarios interactively, and runs via a FastAPI backend. I would welcome contributions, extensions, and, most importantly, critique. If anyone wants to publish this as a paper with me (a lowly non-PhD holder), let me know. The gap in the literature is real, and filling it will require work from people with deeper expertise than mine in operations research, mechanism design, and the specific commodity markets where these dynamics play out.
This is a first pass. But I think it is a first pass at the right problem, and that the problem is well posed.
The model is available at github.com/nemoincognito/commodity-portfolio-abm.


Capital asset pricing model (CAPM) pricing and the ‘efficiency’ that was mandatory orthodoxy when I was a finance MBA major have simply failed us in the real geopolitical world, especially, as you illustrate, in critical commodity supply chains such as fossil fuel energy and critical minerals. I do remember that in the mid-1980s a key mantra in Finance 101 was ‘diversify your equity portfolio to about 20 stocks’; in a way this is an offshoot of that. But we did not apply diversification portfolio theory to commodities and geopolitical risk. The battle between efficiency on one side and sufficiency and resilience on the other is a real competition in the real disrupted and disruptible world. A narrow idea of ‘efficiency’, meaning cheapest is always best, has led some countries down a dangerous path on multiple fronts. Efficiency is just not sufficient. Very good that you are trying to address this via open-source modelling. Nice work kiddo!