
Quants are using LLMs to create Causal Graphs to find Alpha

WHY THIS MATTERS IN BRIEF

Humans know what affects what, but AI is not yet very good at working it out. If it gets there, it could help the finance industry find good investments faster.

 


Divorce rates in Maine and the consumption of margarine, or ice cream sales and drowning incidents: it’s easy to find examples of spurious statistical links. Those handling data should know well that correlation and causation are different things. After all, nobody would expect ice cream to cause drowning. In investing, though, true cause and effect can be harder to establish, especially when you’re trying to do it at huge scale to find better ways to make money. Now one group of quants thinks a lack of rigour in this area is a problem for the industry. And an opportunity.

 


 

AI quantitative trading models should be treated as unscientific unless they’re preceded by a detailed causal analysis, they argue. Otherwise, mis-specified models are likely to find their way into production and crowd out truer representations of how markets really work, dashing many investors’ hopes of making returns on their investments.

In pursuit of a purer understanding of the cogs and wheels of markets, some quants have started to test different approaches. It isn’t altogether easy. The objective is to create a so-called directed acyclic graph (DAG): essentially a map of these causal relationships, drawn as a network of arrows pointing from one variable to another, with no chain of arrows looping back to where it started.
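To make the idea of a DAG concrete, here is a minimal Python sketch using only the standard library. The variable names echo the macro variables mentioned later in this article, but the arrows themselves are invented purely for illustration and are not claims about how markets work. The helper name `causal_order` is likewise hypothetical.

```python
from graphlib import TopologicalSorter, CycleError

# Hypothetical edges, chosen only for illustration: cause -> list of effects.
edges = {
    "oil_demand": ["us_inflation"],
    "food_prices": ["us_inflation"],
    "us_inflation": ["dollar_strength", "gold"],
    "dollar_strength": ["gold"],
}

def causal_order(cause_to_effects):
    """Return the variables ordered causes-first, or None if there is a cycle."""
    # TopologicalSorter expects node -> predecessors, so invert the mapping.
    predecessors = {}
    for cause, effects in cause_to_effects.items():
        predecessors.setdefault(cause, set())
        for effect in effects:
            predecessors.setdefault(effect, set()).add(cause)
    try:
        # static_order() raises CycleError if the graph is not acyclic.
        return list(TopologicalSorter(predecessors).static_order())
    except CycleError:
        return None

order = causal_order(edges)
print(order)  # every cause appears before its effects

# Adding an arrow from gold back to oil_demand creates a loop, so this is no longer a DAG.
cyclic = dict(edges, gold=["oil_demand"])
print(causal_order(cyclic))  # None
```

A valid ordering exists only when the graph is acyclic, which is why a cycle check is a natural sanity test for any graph proposed by an algorithm, an expert or a language model.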

Documenting causality in such a way goes further than formulating broad hypotheses about a given strategy earning a positive return, the causal camp argues. The in-vogue way to create DAGs, then, is to use causal discovery algorithms. These algorithms try to infer from raw observational data what’s driving what.

 


 

“Is it going to be better at this stage than the best humans? I don’t know. But is it going to be useful when applied at scale? Potentially,” says Alik Sokolov, managing director of machine learning at RiskLab at the University of Toronto and co-founder and CEO of Sibli, which builds AI tools for investors.

Quants have access to more data about the world. And more computing power should help, too. But the algorithms have to evaluate vast numbers of combinations of variables, and that number grows exponentially with the complexity of the model. In finance, with its mountains of data, the process can be time-consuming, and sometimes impossible based on data alone. Another route, then, is simply to rely on human expertise. Quants following this approach have run into problems too, though.
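On the combinatorial point above, one way to get a feel for the size of the search space is to count how many distinct DAGs can be drawn on a given number of labelled variables. The sketch below uses Robinson's recurrence for that count. The article's sources do not name this formula; it is brought in here only as an illustration, and the function name `count_dags` is an assumption.

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def count_dags(n):
    """Number of distinct DAGs on n labelled nodes (Robinson's recurrence)."""
    if n == 0:
        return 1
    # Inclusion-exclusion over the k nodes that have no incoming arrows.
    return sum(
        (-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k)) * count_dags(n - k)
        for k in range(1, n + 1)
    )

for n in range(1, 7):
    print(n, count_dags(n))
# 1 1
# 2 3
# 3 25
# 4 543
# 5 29281
# 6 3781503
```

Six variables already allow millions of candidate graphs, which helps explain why exhaustive search over a large set of factors is impractical.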

It can take multiple experts to draw a relatively simple causal graph. And in markets that move rapidly, such an exercise can prove obsolete before it is even complete. So quants have come up with a third idea, and an obvious one in today’s world: apply large language models (LLMs) to the task. A 2021 project used a large language model built by the firm Causal Link to generate causal graphs based on the opinions of experts, which it gathered from 50,000 news articles a day.

 


 

The researchers constructed example graphs linking macro variables such as dollar strength, food prices, gold, oil demand, US inflation and so on. The build time of causal graphs with this “wisdom of the crowds” approach can be reduced to a “matter of seconds”, the researchers stated.
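The article does not describe how the 2021 project turned expert opinions into a graph, so the following is not that method. It is only a minimal sketch of one plausible way to aggregate many cause-and-effect statements: count the votes in each direction and keep an arrow when the net support clears a threshold. The statements, the function name `crowd_edges` and the `min_net_votes` parameter are all hypothetical.

```python
from collections import Counter

# Hypothetical (cause, effect) statements, as if already extracted from articles.
statements = [
    ("oil_demand", "us_inflation"),
    ("oil_demand", "us_inflation"),
    ("food_prices", "us_inflation"),
    ("us_inflation", "gold"),
    ("us_inflation", "gold"),
    ("gold", "us_inflation"),  # a dissenting view on the direction
    ("dollar_strength", "gold"),
]

def crowd_edges(statements, min_net_votes=1):
    """Keep cause -> effect arrows whose votes outnumber the reverse direction."""
    votes = Counter(statements)
    edges = {}
    for (cause, effect), count in votes.items():
        # A Counter returns 0 for pairs nobody mentioned.
        net = count - votes[(effect, cause)]
        if net >= min_net_votes:
            edges[(cause, effect)] = net
    return edges

for (cause, effect), net in crowd_edges(statements).items():
    print(f"{cause} -> {effect} (net votes: {net})")
```

Once the statements are extracted, the aggregation itself is cheap. Raising `min_net_votes` trades coverage for confidence, and the result would still need a cycle check and expert review.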

In another project, conducted late last year, quants employed GPT-4 to organise 153 factors into clusters and map out causal charts within those clusters. The groupings generated by GPT-4 predicted monthly returns just as well as conventional correlation-based versions, the researchers found, and were less correlated and easier to interpret. A high share, two thirds, of the relationships proposed by GPT-4 aligned with statistical causality tests.

The trick to using large language models in this way, says Sokolov, who worked on the project, is to interrogate the model correctly, sometimes with chains of prompts. Sokolov reckons firms could in future set up a strategy search loop using models in this way.

 


 

“Potentially you come up with 10 candidate strategies that are much more likely to be sound from first principles,” he says. “Is it going to be better at this stage than the best humans? I don’t know. But is it going to be useful when applied at scale? Potentially.”

Crowd-sourced causal graphs could help build conviction in a strategy, he reckons. “A six-month research cycle could become a three-month research cycle.” The process is not foolproof, of course. Human experts are needed to review the output. But in a world where establishing causality becomes a starting point for quant models, and some quants believe that will happen, LLMs may have a role to play.
