Research and Development of Next-Generation Equity Investment Model and Quant Analysis System

R&D of Next-Generation Equity Investment Model and Quant Analysis System Integrated with LLM

We maintain daily and monthly databases of the Japanese stock market spanning more than 45 years, and on top of them we research and develop alpha analysis, portfolio analysis, and risk management systems. Because the database was designed from scratch, it supports high-speed daily calculations and large-scale data processing.

Practical portfolio construction poses various challenges. Using our own stratified sampling method, we can construct passive funds that remain workable at scales of hundreds of billions of yen. Our alpha analysis system also lets us implement and simulate alpha from cutting-edge research immediately. By allocating the generated alpha appropriately across sectors or segments and then optimizing, we can design active (Long/Short) funds exactly as intended.

In addition, by incorporating Large Language Models (LLMs) into the model, we are developing a system that automatically evaluates the characteristics of the resulting portfolio in detail and explains the market phases in which it performs effectively.


Key Research Areas

  1. High-Precision Alpha Generation: Return prediction using multi-factor models and segment-specific approaches.
  2. Large-Scale Optimization Technology: Construction of practical-scale portfolios using optimization methods.
  3. Integration of LLM: Automatic evaluation and interpretation of portfolio characteristics by embedding Generative AI into the models.

System Configuration

In this research, we develop and operate the following integrated systems.

System               | Overview            | Key Functions
---------------------+---------------------+---------------------------------------------------------------------
QuantDB              | Financial Data Lake | Management of 45 years of daily/monthly data, Delta Lake integration
FactorAnalysis       | Alpha Analysis      | IC analysis, Quintile analysis, Factor return verification
MultiFactor Model    | Multi-Factor        | Segment-specific synthetic factors, Adaptive weight adjustment
TOPIX Index          | Passive Fund        | Large-scale index construction using Stratified Sampling
HighAlpha Long/Short | Active Fund         | Optimization of Market Neutral and 130/30 type portfolios
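
The stratified sampling behind the TOPIX Index system is described as the group's own method and is not specified here. As a rough illustration of the general idea only, the sketch below uses a textbook variant: stocks are grouped into strata (here sector and size bucket, an assumption), the largest names in each stratum are held, and they are scaled so that each stratum keeps its index weight. The function name, field names, and the six-stock universe are all hypothetical.

```python
from collections import defaultdict


def stratified_sample(stocks, per_stratum=1):
    """Pick the largest `per_stratum` names in each (sector, size) stratum.

    `stocks` is a list of dicts with hypothetical keys 'code', 'sector',
    'size_bucket' and 'weight' (index weight). Returns {code: weight}.
    """
    strata = defaultdict(list)
    for s in stocks:
        strata[(s["sector"], s["size_bucket"])].append(s)

    portfolio = {}
    for members in strata.values():
        stratum_weight = sum(m["weight"] for m in members)
        picked = sorted(members, key=lambda m: m["weight"], reverse=True)
        picked = picked[:per_stratum]
        picked_weight = sum(m["weight"] for m in picked)
        for m in picked:
            # Scale so the held names carry the full weight of their stratum.
            portfolio[m["code"]] = stratum_weight * m["weight"] / picked_weight
    return portfolio


# Made-up universe for illustration; index weights sum to 1.
universe = [
    {"code": "A", "sector": "Banks", "size_bucket": "large", "weight": 0.30},
    {"code": "B", "sector": "Banks", "size_bucket": "large", "weight": 0.10},
    {"code": "C", "sector": "Banks", "size_bucket": "small", "weight": 0.05},
    {"code": "D", "sector": "Autos", "size_bucket": "large", "weight": 0.35},
    {"code": "E", "sector": "Autos", "size_bucket": "small", "weight": 0.15},
    {"code": "F", "sector": "Autos", "size_bucket": "small", "weight": 0.05},
]
sampled = stratified_sample(universe, per_stratum=1)
```

With one name per stratum, the sample holds four of the six stocks while matching the index weight of every sector and size bucket. A production method would also have to handle liquidity and tracking error, which this sketch ignores.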

QuantDB: Financial Database Infrastructure

Data Structure

Category    | Content                                          | Data Period
------------+--------------------------------------------------+---------------
Price Data  | Daily stock price, Adjusted close, Return        | 1980 – Present
Fundamental | PBR, ROE, Dividend Yield, etc.                   | 1980 – Present
Market      | Market Cap, Trading Volume, Floating Stock Ratio | 1990 – Present
Benchmark   | TOPIX, Sector Indices, Segments                  | 1990 – Present
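
The relationship between the price fields listed above can be made concrete with a small sketch. The actual QuantDB schema is not given, so the record type and field names below are assumptions; the point is only that returns are normally derived from the adjusted close, so that corporate actions such as splits do not show up as spurious price moves.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DailyPrice:
    """Hypothetical daily price record (field names are assumptions)."""
    code: str
    day: date
    close: float      # raw closing price
    adj_close: float  # closing price adjusted for splits and similar events


def daily_returns(rows):
    """Simple daily returns for one stock, computed from adjusted closes."""
    rows = sorted(rows, key=lambda r: r.day)
    return [
        (cur.day, cur.adj_close / prev.adj_close - 1.0)
        for prev, cur in zip(rows, rows[1:])
    ]


# Made-up example: a 2-for-1 split between the two days halves the raw
# close, while the adjusted series shows the true 1% gain.
rows = [
    DailyPrice("0000", date(2024, 1, 4), close=1000.0, adj_close=500.0),
    DailyPrice("0000", date(2024, 1, 5), close=505.0, adj_close=505.0),
]
returns = daily_returns(rows)
```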

FactorAnalysis: Alpha Analysis System

Analysis Modules

Module            | Analysis Content                                   | Output
------------------+----------------------------------------------------+-------------------------------------
IC Analysis       | Time-series evaluation of Information Coefficient  | IC Mean, ICIR, Decay rate
Quintile Analysis | Return verification by factor strength             | Quintile return, Spread
Factor Return     | Cross-sectional regression                         | t-value, Return contribution
Segment Analysis  | Validity verification by size/style                | Interaction between segments
Synthetic Factor  | Composite signal construction                      | Optimal weights, Correlation matrix
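
To make the IC and quintile outputs concrete, here is a minimal sketch using only the Python standard library. It assumes the Information Coefficient is the Spearman rank correlation between a factor and the subsequent return across stocks, and that ICIR is the mean IC divided by its standard deviation over time; both are common conventions, but the source does not say which definitions the FactorAnalysis system uses. All function names are hypothetical, and the decay rate is not covered.

```python
from statistics import mean, stdev


def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def rank_ic(factor, fwd_ret):
    """Cross-sectional rank IC for one date (Spearman correlation)."""
    return pearson(ranks(factor), ranks(fwd_ret))


def ic_summary(ics):
    """IC mean and ICIR from a time series of per-date ICs."""
    m = mean(ics)
    return {"ic_mean": m, "icir": m / stdev(ics)}


def quintile_spread(factor, fwd_ret):
    """Mean forward return of the top factor quintile minus the bottom one."""
    order = sorted(range(len(factor)), key=lambda i: factor[i])
    n = max(1, len(order) // 5)
    bottom = mean(fwd_ret[i] for i in order[:n])
    top = mean(fwd_ret[i] for i in order[-n:])
    return top - bottom
```

In use, `rank_ic` would be called once per date on that date's cross-section, the resulting series passed to `ic_summary`, and `quintile_spread` tracked alongside it as a check that the ranking translates into a return difference.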

HighAlpha Long/Short: Active Fund System

Investment Strategy Types

Type                | Long | Short | Benchmark
--------------------+------+-------+-------------------------
Market Neutral      | 100% | 100%  | Short-term interest rate
Long-Short (130/30) | 130% | 30%   | TOPIX
Extension (120/20)  | 120% | 20%   | TOPIX
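
The long and short percentages above imply a net and a gross exposure for each type, which is one way to read the benchmark column: a net exposure of zero pairs naturally with a cash-rate benchmark, while a net exposure of 100% pairs with the equity index. The sketch below simply does that arithmetic; the class and attribute names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StrategyType:
    """One row of the strategy table (class name is an assumption)."""
    name: str
    long_pct: float
    short_pct: float
    benchmark: str

    @property
    def net_pct(self):
        # Net market exposure: long minus short.
        return self.long_pct - self.short_pct

    @property
    def gross_pct(self):
        # Gross exposure: long plus short.
        return self.long_pct + self.short_pct


STRATEGIES = [
    StrategyType("Market Neutral", 100, 100, "Short-term interest rate"),
    StrategyType("Long-Short (130/30)", 130, 30, "TOPIX"),
    StrategyType("Extension (120/20)", 120, 20, "TOPIX"),
]

for s in STRATEGIES:
    print(f"{s.name}: net {s.net_pct}%, gross {s.gross_pct}%")
```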

Optimization Engine

Using high-speed optimization with Linear Programming (HiGHS solver), we construct portfolios that simultaneously satisfy the following constraints:

Objective Function: Maximize Alpha - Penalty Terms

Constraints:
├── Segment Constraints (Market weight ± tolerance)
├── Industry Constraints (Net / Total)
├── Factor Exposure Constraints
├── Turnover Constraints
└── Individual Stock Limits
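
A reduced version of such a problem can be sketched with SciPy's `linprog`, which uses the HiGHS solvers when called with `method="highs"`. This is an illustrative formulation under stated assumptions, not the system's actual model: it covers only the alpha objective, sector net-weight bounds, and individual stock limits for a 130/30 portfolio, and it omits penalty terms, factor exposure, and turnover constraints. The function name, parameters, and the six-stock data are hypothetical. Long and short positions are modeled as separate non-negative variables so that the problem stays linear.

```python
import numpy as np
from scipy.optimize import linprog


def build_130_30(alpha, sector, bench_w, cap=0.4, tol=0.05,
                 long_total=1.3, short_total=0.3):
    """Return net weights maximizing alpha under sector and per-stock limits."""
    alpha = np.asarray(alpha, dtype=float)
    bench_w = np.asarray(bench_w, dtype=float)
    n = len(alpha)

    # Decision vector x = [long_1..long_n, short_1..short_n], all >= 0.
    # linprog minimizes, so the alpha of the net position is negated.
    c = np.concatenate([-alpha, alpha])

    # Equality constraints: total long and total short exposure.
    A_eq = np.zeros((2, 2 * n))
    A_eq[0, :n] = 1.0
    A_eq[1, n:] = 1.0
    b_eq = [long_total, short_total]

    # Sector net weight must stay within benchmark sector weight +/- tol.
    A_ub, b_ub = [], []
    for sec in sorted(set(sector)):
        mask = np.array([s == sec for s in sector], dtype=float)
        net_row = np.concatenate([mask, -mask])
        target = float(bench_w @ mask)
        A_ub.append(net_row)
        b_ub.append(target + tol)
        A_ub.append(-net_row)
        b_ub.append(-(target - tol))

    res = linprog(c, A_ub=np.array(A_ub), b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0.0, cap)] * (2 * n), method="highs")
    if not res.success:
        raise ValueError(res.message)
    return res.x[:n] - res.x[n:]  # net weight per stock


# Made-up inputs: six stocks in two sectors with equal benchmark sector weights.
alpha = [0.03, 0.01, -0.02, 0.02, -0.01, -0.03]
sector = ["A", "A", "A", "B", "B", "B"]
bench_w = [0.2, 0.2, 0.1, 0.2, 0.2, 0.1]
weights = build_130_30(alpha, sector, bench_w)
```

The net weights sum to 100% (130% long minus 30% short) and each sector stays within five points of its benchmark weight. Note that this formulation does not prevent the solver from holding a long and a short position in the same name; a fuller model would add the remaining constraints from the list above as further rows of the same linear program.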

Integration of LLM Advisor

Overview

Detailed information about the portfolio construction is assembled internally and sent to the LLM, which automatically generates qualitative evaluations and recommendations.

Evaluation Items

Evaluation Category          | Content
-----------------------------+-----------------------------------------------
Risk Analysis                | Concentration risk, Sector bias, Factor tilt
Market Condition Suitability | Consistency with current market environment
Improvement Proposals        | Specific proposals for portfolio adjustment
Scenario Analysis            | Assumed impact during market fluctuations
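
One plausible way to pass portfolio details to an LLM is to serialize a summary as JSON and place it in a prompt that lists the evaluation items above. The sketch below shows only that assembly step; it does not call any model, and the function name, summary keys, and numbers are made-up placeholders rather than the system's actual payload.

```python
import json

# The four evaluation categories from the table above.
EVALUATION_ITEMS = [
    "Risk Analysis: concentration risk, sector bias, factor tilt",
    "Market Condition Suitability: consistency with current market environment",
    "Improvement Proposals: specific proposals for portfolio adjustment",
    "Scenario Analysis: assumed impact during market fluctuations",
]


def build_advisor_prompt(summary: dict) -> str:
    """Combine the evaluation items and a portfolio summary into one prompt."""
    items = "\n".join(
        f"{i}. {item}" for i, item in enumerate(EVALUATION_ITEMS, 1)
    )
    return (
        "You are reviewing an equity portfolio. "
        "Evaluate it on each of the following items:\n"
        f"{items}\n\n"
        "Portfolio construction details (JSON):\n"
        f"{json.dumps(summary, indent=2, sort_keys=True)}"
    )


# Placeholder summary; keys and values are illustrative only.
example_summary = {
    "strategy": "Long-Short (130/30)",
    "benchmark": "TOPIX",
    "long_pct": 130,
    "short_pct": 30,
    "sector_active_weights_pct": {"Banks": 1.5, "Autos": -0.8},
}
prompt = build_advisor_prompt(example_summary)
```

Keeping the payload as structured JSON rather than free text makes it easier to log exactly what the model was shown alongside the evaluation it returned.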

Future Research Direction

  • Real-time data integration
  • Machine learning-based factor discovery
  • Integration of ESG factors
  • Multi-asset expansion
  • Sophistication of risk models (CVaR, MAD)
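
For reference, the two risk measures named in the last item can be estimated from a return history as follows. This is a simple empirical sketch of the standard definitions (CVaR as the average loss in the worst tail, MAD as the mean absolute deviation from the mean return), not a description of any planned implementation; the function names and the tail-size rounding rule are assumptions.

```python
from statistics import mean


def historical_cvar(returns, level=0.95):
    """Average loss over the worst (1 - level) share of observations."""
    losses = sorted((-r for r in returns), reverse=True)  # loss = -return
    k = max(1, int(round(len(losses) * (1 - level))))     # tail size
    return mean(losses[:k])


def mean_absolute_deviation(returns):
    """Mean absolute deviation of returns around their mean."""
    m = mean(returns)
    return mean(abs(r - m) for r in returns)
```

Both measures can be expressed with linear constraints, which is one reason they fit naturally alongside a linear programming optimizer.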

Conclusion

We focus on quickly identifying what next-generation investment and analysis models require, forming hypotheses, and implementing them. Our goal is to carry research freely through to the point where it can be used in practice.