How AI Can Reshape the Work of Portfolio Managers
Presentation given at the Jefferies Asia Forum on 20 March 2024
As someone who has spent decades at the forefront of the technology industry, I've seen firsthand how advancements like the PC, the Internet, mobile and cloud have transformed the ways we work and live. Now we are on the cusp of another transformative technology: deep-learning-based artificial intelligence (AI). And nowhere is the potential for AI to drive significant improvements more apparent than in the investment management process.
Photo credit: Kennevia Photography
I recently had the opportunity to speak at the Jefferies Asia Forum alongside Vikram Dewan, Chief Information Officer, and Conor O’Mara, Managing Director, both from Jefferies, about how the firm is leveraging AI and natural language processing (NLP) to streamline workflows and unlock productivity for their analysts and portfolio managers. The key takeaway: AI is already delivering significant value, and its impact will only grow in the years ahead.
Distilling Insights from Vast Amounts of Financial Documents
One of the most impactful applications is using AI to automatically generate summaries of the massive amounts of information investment professionals need to stay on top of, from earnings call transcripts to research reports. By distilling these documents into their key insights, AI can save analysts countless hours and ensure crucial details are not overlooked.
For example, using Anthropic’s Claude 3 Opus model, I fed in both the transcript (created with Otter.ai) of a Jefferies conference session and the presentation deck, and the AI generated a comprehensive summary - complete with key insights, actionable takeaways, and overall sentiment analysis - within seconds. Synthesizing information from multiple sources and formats like this would take even the most skilled analyst substantial time and effort.
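To make the workflow concrete, here is a minimal sketch of how a transcript and deck could be summarized with Anthropic's Python SDK. The file names and prompt wording are illustrative assumptions, not the exact setup used in the demo.

```python
# Minimal sketch: summarizing a conference transcript and slide deck with Claude.
# File names and prompt wording are illustrative assumptions, not the exact demo setup.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Assume the Otter.ai transcript and the deck have been exported to plain text.
transcript = open("jefferies_session_transcript.txt").read()
deck_text = open("jefferies_presentation_deck.txt").read()

prompt = (
    "You are an equity research assistant. Summarize the following conference "
    "session for a portfolio manager. Include key insights, actionable "
    "takeaways, and an overall sentiment assessment.\n\n"
    f"TRANSCRIPT:\n{transcript}\n\nPRESENTATION DECK:\n{deck_text}"
)

response = client.messages.create(
    model="claude-3-opus-20240229",
    max_tokens=1024,
    messages=[{"role": "user", "content": prompt}],
)
print(response.content[0].text)
```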
But beyond summarization, these foundation models - large language models (LLMs) - can be leveraged to build powerful research chatbots and knowledge bases. By ingesting and embedding a firm's research reports, meeting transcripts, and other proprietary data sources, these AI systems can instantly retrieve highly relevant information in response to an analyst's natural language queries - acting as an always-available research assistant that augments the investment research process.
Enhancing LLMs with Contextual Knowledge for Investment Research using RAG
In my presentation at the Jefferies Asia Forum, I demonstrated a system I built with GPTBots, an easy-to-use chatbot creation platform, that leverages a technique called Retrieval-Augmented Generation (RAG) to augment language models with highly relevant, domain-specific information. The key idea behind RAG is to combine the broad general knowledge and powerful language generation capabilities of LLMs with a contextual retrieval system that surfaces the most salient information from a substantial corpus of documents.
To showcase the potential of this approach for investment managers, I ingested a wide range of financial data sources into the GPTBots RAG system, including Jefferies research reports, earnings call transcripts, 10-K filings, news articles from Bloomberg and the Wall Street Journal, and even the websites of private companies. With this proprietary information embedded in a vector database, the system can perform a semantic search to find the snippets of text most relevant to an analyst's query - going beyond simple keyword matching to understand the true intent behind the question, as the sketch below illustrates.
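GPTBots handles the embedding and retrieval internally, so the following is only a generic sketch of the same ingestion-and-search idea, using sentence-transformers and FAISS as open-source stand-ins; the model name, chunk contents, and helper function are illustrative assumptions.

```python
# Generic sketch of the RAG ingestion and retrieval step described above.
# GPTBots handles this internally; sentence-transformers and FAISS are used
# here purely as illustrative open-source stand-ins.
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

# Assume each source document (research report, transcript, 10-K, article)
# has already been split into text chunks.
chunks = [
    "NVIDIA unveiled the Blackwell GPU architecture at its 2024 GTC keynote...",
    "Jefferies research note: semiconductor capital-expenditure trends...",
    # ...thousands more chunks from the ingested corpus
]

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative model choice
embeddings = embedder.encode(chunks, normalize_embeddings=True)

# Inner product on normalized vectors is cosine similarity.
index = faiss.IndexFlatIP(embeddings.shape[1])
index.add(np.asarray(embeddings, dtype="float32"))

def retrieve(query: str, k: int = 5) -> list[str]:
    """Return the k chunks most semantically similar to the query."""
    k = min(k, index.ntotal)
    q = embedder.encode([query], normalize_embeddings=True)
    _, idx = index.search(np.asarray(q, dtype="float32"), k)
    return [chunks[i] for i in idx[0]]

print(retrieve("What were the major announcements from NVIDIA's GTC this year?"))
```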
The impact of augmenting the LLM with this curated domain knowledge is substantial. Rather than the model trying to generate an answer based solely on its training data - which may be outdated or biased towards more general information - it can now draw upon the expertise contained within the firm's own research to inform its outputs. This reduces the risk of hallucinations or inaccuracies and allows the model to provide highly targeted insights that are directly applicable to the companies and themes that the portfolio managers care about.
But beyond just serving as an information retrieval system, the integration of the LLM and the vector database through the RAG architecture enables powerful natural language interactions. Analysts can engage with the system much like they would converse with a highly knowledgeable colleague - asking follow-up questions, seeking clarification or additional context, and rapidly zeroing in on the key insights they need to inform their investment theses.
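To show how the retrieval and generation pieces fit together in such a conversation, here is a hedged sketch that stitches the retrieved snippets into the model's prompt. It reuses the retrieve() helper from the previous sketch and Anthropic's SDK; the prompt wording and function shape are assumptions, not GPTBots' actual implementation.

```python
# Sketch of the generation step: retrieved snippets are stitched into the prompt
# so the model answers from the firm's own documents rather than memory alone.
# Reuses retrieve() from the previous sketch; prompt wording and function shape
# are illustrative assumptions, not GPTBots' actual implementation.
import anthropic

client = anthropic.Anthropic()

def answer(question: str, history: list[dict] | None = None) -> str:
    """Answer a question over the embedded corpus, optionally continuing a chat."""
    context = "\n\n".join(retrieve(question, k=5))
    messages = list(history or [])
    messages.append({
        "role": "user",
        "content": (
            "Answer the question using only the context below. If the context "
            "is insufficient, say so.\n\n"
            f"CONTEXT:\n{context}\n\nQUESTION: {question}"
        ),
    })
    response = client.messages.create(
        model="claude-3-opus-20240229",
        max_tokens=1024,
        messages=messages,
    )
    return response.content[0].text

# Example: an analyst's first question, followed by a more targeted one.
print(answer("What were the major announcements from NVIDIA's GTC conference this year?"))
print(answer("What are the key differences between NVIDIA's H100 and A100 GPUs?"))
```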
To run through a concrete example, let's say an analyst wanted to quickly get up to speed on the key takeaways from NVIDIA's latest GPU Technology Conference (GTC). Rather than having to manually read through multiple research notes or watch hours of presentation videos, they could simply ask the RAG system "What were the major announcements from NVIDIA's GTC conference this year?" The model would then search through its embedded knowledge base of conference transcripts, research reports, and news articles to surface the key details on NVIDIA's latest product and strategy announcements, generating a succinct summary that highlights the most important implications for investors as shown below.
During NVIDIA's GTC conference this year, several major announcements were made:
Blackwell GPU Architecture: NVIDIA unveiled the Blackwell GPU architecture, which offers significant improvements over its predecessor, particularly in terms of performance and power efficiency. The Blackwell GPUs are designed using a custom 4NP TSMC process, contain 208 billion transistors, and introduce six transformative technologies for AI training and real-time Large Language Model (LLM) inference for models scaling up to 10 trillion parameters.
Performance and Efficiency: The Blackwell architecture is said to deliver 20 PetaFLOPS of AI performance, which is 4x faster on AI Training workloads and 30x faster on AI Inferencing workloads compared to the H100 architecture. It also boasts up to 25x more power efficiency than its predecessor.
Second-Generation Transformer Engine: This technology within the Blackwell platform allows for 4-bit floating point AI calculations, doubling the compute performance and model sizes from the previous generation.
NVLink 5.0 Technology: The new NVLink technology enables up to 576 GPUs to be linked together at speeds up to 1.8 TB/sec, essential for complex AI models.
Grace Blackwell Superchip: This combines Blackwell GPUs with an Nvidia Grace CPU, offering a highly specialized system for demanding AI workloads.
Software Suite AI Enterprise 5.0 and Nvidia Inference Microservices (NIM): These ecosystem improvements simplify the deployment of AI models, making it easier for businesses to integrate AI capabilities into their operations.
Expansion of Applications and Industries: NVIDIA is working to bring critical GenAI compute solutions to various industries, including healthcare, heavy industry, automotive, robotics, manufacturing, telecommunications (6G), weather forecasting, and more.
Grace Hopper Superchip: NVIDIA showcased its Grace Hopper superchip, designed for AI applications, further cementing its commitment to the AI and GPU market.
These advancements are expected to solidify NVIDIA's position as a leader in the AI and GPU market, significantly widening its competitive moat.
Or if a portfolio manager wanted to dive deeper into a particular product line, they could ask a more targeted question like "What are the key differences between NVIDIA's H100 and A100 GPUs?" The system would again comb through its contextual knowledge to pull out the relevant performance and specification details, empowering the PM with the information they need to make faster and more informed investment decisions.
The key differences between NVIDIA's H100 and A100 GPUs are as follows:
Architecture: The H100 is based on the new NVIDIA Hopper architecture, while the A100 is based on the previous NVIDIA Ampere architecture.
Performance: The H100 delivers a significant performance leap over the A100. For instance, it provides 6x the peak compute throughput of the A100 for the most compute-hungry workloads.
Tensor Cores: The H100 features fourth-generation Tensor Cores that are up to 6x faster chip-to-chip compared to the A100's Tensor Cores. The H100 Tensor Cores also support the new FP8 data type, offering 4x the rate of A100's 16-bit floating-point options.
Memory: The H100 uses HBM3 memory, which is faster than the A100's HBM2, providing a substantial increase in memory bandwidth. The H100 offers up to 80 GB of memory size and a data rate of 2619 MHz DDR, compared to A100's 40 GB and 1215 MHz DDR.
Energy Efficiency: The H100 is designed to be up to 25x more power-efficient compared to its predecessor.
Compute Capability: The H100 supports Compute Capability 9.0, which introduces new features and improvements over the A100's Compute Capability 8.0.
Security Features: The H100 introduces new security enhancements and confidential computing features not present in the A100.
L2 Cache Size: The H100 has a larger L2 cache size of 50 MB compared to the A100's 40 MB.
Manufacturing Process: Both the H100 and A100 are manufactured using a 4N process customized for NVIDIA, but the H100 has 80 billion transistors compared to the A100's 54.2 billion, indicating a more complex and capable GPU.
TDP (Thermal Design Power): The H100 has a higher TDP of 700 Watts compared to the A100's 400 Watts, reflecting its higher performance capabilities.
These differences collectively contribute to the H100's superior performance in AI, HPC, and data analytics applications, making it a more advanced and efficient solution for complex computational tasks.
As I showed in my demo with the answers above, these are just a few examples of the power of augmenting LLMs with targeted information retrieval. By letting analysts use natural language to quickly extract insights from the mountain of financial data they are inundated with every day, RAG systems have the potential to dramatically boost the efficiency and effectiveness of the investment research process.
The Future of AI-Driven Investment Management: Opportunities, Challenges, and the Imperative to Adapt
To be clear, today's AI is not without limitations. These language models can hallucinate information, struggle with numerical analysis, and lack the domain expertise to independently drive investment decisions. But by strategically leveraging AI to augment human capabilities - automating the most laborious parts of the investment research process while still relying on the judgment of experienced PMs in the loop - the technology can be transformational in improving the productivity and quality of the investment management process.
Of course, successfully implementing AI in a regulated and hyper-competitive industry like investment management is not without challenges. But firms like Jefferies are showing that with the right approach, the benefits far outweigh the risks. As more and more investment managers embrace AI, those that fail to adapt will undoubtedly be left behind.
Looking to the future, we are just scratching the surface of AI's potential in the investment industry. As the technology behind language models continues to advance and key challenges like incorporating real-time market data are solved, AI's capacity to boost productivity - and to free PMs to focus on generating alpha - will only grow. With the asset management industry's smartest minds increasingly augmented by ever more capable AI research assistants, technology-driven investment management will become the new normal, much as spreadsheets became an indispensable tool for portfolio managers in the PC era decades ago.
Note: The article was written with the assistance of Claude 3 Opus based on an AI-generated transcript of my presentation.