Ecosystem

A connected open-source ecosystem spanning financial language models, reinforcement learning, trading systems, and AI agents.

FinGPT

Open-source financial AI platform for language models, benchmarking, APIs, and research workflows.

FinRL

Deep reinforcement learning framework for quantitative finance and portfolio research.

FinRL-Trading

AI-native trading infrastructure for strategy development, backtesting, and live execution.

FinRobot

Agent-based framework for financial analysis, research automation, and task orchestration.

FinGPT Architecture

A modular stack connecting financial applications, domain tasks, models, data engineering, and data sources.

Applications

Robo-advisor, Sentiment Analysis, Portfolio Optimization, Risk Management, Quantitative Trading, ESG Scoring, Fraud Detection, Credit Scoring, M&A Forecasting

Tasks

Summarization, NER, Information Extraction, Sentiment Analysis, Data Analysis, Numerical Reasoning, Intent Detection

LLMs

APIs:

ChatGPT, Claude, Gemini, Mistral

Trainable:

Llama3, ChatGLM3, Qwen, Falcon, InternLM

Methods:

LoRA / QLoRA, RAG, Chain-of-Thought, RLSP

Data Engineering

Data Cleaning, Tokenization, Vector Embedding, Feature Extraction, Data Augmentation

Data Sources

News:

Finnhub, Yahoo Finance, CNBC

Social:

Twitter, Reddit, Weibo

Filings:

SEC, NYSE, NASDAQ

Datasets:

AShare, stocknet-dataset

FinGPT is part of the broader AI4Finance open-source ecosystem, connecting research innovation with deployable financial AI systems.

FinGPT-Benchmark

FinGPT uses instruction tuning to adapt open-source LLMs for financial tasks — enabling cost-effective fine-tuning across sentiment analysis, entity recognition, and more with task-specific, multi-task, and zero-shot paradigms.
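A minimal sketch of how an instruction-tuning record might be assembled, using three of the instruction templates listed in this section. The record fields (`instruction`, `input`, `output`), the `Instruction: / Input: / Answer:` prompt layout, the helper names, and the example label are illustrative assumptions, not a format confirmed by this page.

```python
# Instruction templates copied from this section; everything else is hypothetical.
TEMPLATES = {
    "SA": "What is the sentiment of this news? "
          "Please choose from {negative / neutral / positive}.",
    "Headline": "Does the news headline talk about price going up? "
                "Please choose from {Yes / No}.",
    "NER": 'Find all entities in the input text. '
           'Answer with format "entity1: type1; entity2: type2".',
}


def build_record(task, text, answer):
    """Pair a raw text and its label with the task's instruction template."""
    return {"instruction": TEMPLATES[task], "input": text, "output": answer}


def to_prompt(record):
    """Render the model-facing prompt (assumed layout)."""
    return (
        f"Instruction: {record['instruction']}\n"
        f"Input: {record['input']}\n"
        "Answer: "
    )


def to_training_text(record):
    """Prompt followed by the target answer, as used for supervised tuning."""
    return to_prompt(record) + record["output"]


HEADLINE = "$ENR - Energizer shakes off JPMorgan's bear call."
record = build_record("SA", HEADLINE, "positive")  # label chosen for illustration
training_text = to_training_text(record)
```

The same `build_record` call works for any task in `TEMPLATES`, which is what lets one dataset format serve the task-specific, multi-task, and zero-shot setups.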

Tasks and Instruction Construction

SA (Sentiment Analysis): What is the sentiment of this news? Please choose from {negative / neutral / positive}.

Headline (Headline Analysis): Does the news headline talk about price going up? Please choose from {Yes / No}.

NER (Named Entity Recognition): Find all entities in the input text. Answer with format "entity1: type1; entity2: type2".

RE (Relation Extraction): Extract the word/phrase pair and the corresponding lexical relationship from the input text.

NER (CLS) (NER Classification): What is the entity type of 'Bank' in the input sentence? Options: person, location, organization.

RE (CLS) (RE Classification): Choose the right relationship between 'Apple Inc' and 'Steve Jobs'. Options: industry, founded by, owner of...

Base Models

Llama3, ChatGLM3, BLOOM, Falcon, MPT, Qwen, and others

Instruction Tuning Paradigm

Step 1: Task-Specific Instruction Tuning

Each task trains its own model independently.

Task 1 → Base Model 1 → Respond 1

Task 2 → Base Model 2 → Respond 2

Task 3 → Base Model 3 → Respond 3

Step 2: Multi-Task Instruction Tuning

All tasks train a single shared model jointly.

Task 1, Task 2, Task 3 → Shared Model → Respond 1, Respond 2, Respond 3

Step 3: Zero-Shot Instruction Tuning

Hold out one task, train on the others, and test zero-shot transfer on the held-out task.

Task 1, Task 2, Task 3 (held out) → Base Model → Respond 1, Respond 2, Respond 3 (zero-shot)
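The three paradigms differ mainly in how the per-task data is partitioned before training. The sketch below illustrates only that partitioning step; the function names and the toy dataset are hypothetical, and no actual model training is shown.

```python
def task_specific_splits(datasets):
    """Step 1: one training set per task, each for its own model."""
    return {task: list(examples) for task, examples in datasets.items()}


def multi_task_split(datasets):
    """Step 2: pool every task into one training set for a shared model."""
    return [ex for examples in datasets.values() for ex in examples]


def zero_shot_split(datasets, held_out):
    """Step 3: train on all tasks except `held_out`, evaluate on it unseen."""
    if held_out not in datasets:
        raise KeyError(f"unknown task: {held_out}")
    train = [
        ex
        for task, examples in datasets.items()
        if task != held_out
        for ex in examples
    ]
    test = list(datasets[held_out])
    return train, test


# Toy placeholder examples; real records would be instruction/input/output triples.
datasets = {
    "SA": ["sa-1", "sa-2"],
    "NER": ["ner-1"],
    "RE": ["re-1", "re-2"],
}

per_task = task_specific_splits(datasets)
pooled = multi_task_split(datasets)
zs_train, zs_test = zero_shot_split(datasets, held_out="RE")
```

Rotating `held_out` over every task name gives one zero-shot experiment per task.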

FinNLP — Data Curation

FinGPT's data pipeline covers financial news, social media, filings, and research datasets — with feature engineering, data cleaning, and unified data access across 30+ providers.

NLP Data Sources

Financial text data from news, social media, regulatory filings, and research datasets.

News

Yahoo Finance, Reuters, SeekingAlpha, PennyStocks, MarketWatch, CNBC, The Fly, TalkMarkets, Alliance News, GuruFocus, InvestorPlace, TipRanks, Finnhub, Akshare, Eastmoney, Sina, Tushare

Social Media

Twitter, Reddit, Weibo, Xueqiu, StockTwits, Eastmoney, Facebook

Filings

SEC, Juchao

Research Datasets

AShare, CHRNN, FiQA, Stocknet, Trade The Event, FPB

Feature Engineering

Fundamental Features

Financial Ratios, Assets, Liabilities, Sales

Market Features

Open, High, Low, Close, Volume

Analytics Features

News Sentiment

Alternative Features

Social Media, ESG, Google Trends

FinGPT-RAG

A retrieval-augmented generation framework for financial sentiment analysis. Most financial news lacks adequate context — FinGPT-RAG uses instruction tuning combined with multi-source knowledge retrieval to fill context gaps and enhance information depth.

By integrating external knowledge retrieval, the LLMs respond more accurately to financial sentiment analysis tasks, achieving performance improvements of 15% to 48% in accuracy and F1 scores.

RAG Pipeline

End-to-end flow from knowledge retrieval to instruction-tuned inference.

Retrieval-Augmented Generation

1. Multi-Source Knowledge Querying

2. Similarity-based Retrieval

Prompt Construction

1. Prompt with Query

2. Retrieved Context (Full Context)

LLM Call

1. Inference

2. Training

Instruction Tuning

1. Supervised Sentiment Analysis Dataset

2. Instruction-Following Data Construction

3. Base Model Selection (Llama2-7B, ChatGLM2-6B, etc.)
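The retrieval and prompt-construction stages above can be sketched as follows. This page does not specify the similarity measure, so the example assumes a toy bag-of-words cosine similarity in place of whatever FinGPT-RAG actually uses. The function names, the prompt layout, and the second (unrelated) document are hypothetical; the query and the first document come from the example on this page.

```python
import math
import re
from collections import Counter

INSTRUCTION = (
    "What is the sentiment of this news? "
    "Please choose from {negative / neutral / positive}."
)


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, documents, top_k=1):
    """Return up to `top_k` documents with non-zero similarity to the query."""
    q = Counter(tokenize(query))
    scored = [(cosine_similarity(q, Counter(tokenize(d))), d) for d in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:top_k] if score > 0]


def build_prompt(query, contexts):
    """Combine instruction, retrieved context, and query (assumed layout)."""
    context_block = "\n".join(contexts)
    return (
        f"Instruction: {INSTRUCTION}\n"
        f"Context: {context_block}\n"
        f"Input: {query}\n"
        "Answer: "
    )


QUERY = "$ENR - Energizer shakes off JPMorgan's bear call."
DOCUMENTS = [
    "JPMorgan hikes Energizer Holdings (NYSE:ENR) to a Neutral rating from Underweight.",
    "Crude oil futures settle lower as inventories rise.",  # hypothetical distractor
]

contexts = retrieve(QUERY, DOCUMENTS, top_k=1)
prompt = build_prompt(QUERY, contexts)
```

The resulting `prompt` would then be passed to an instruction-tuned model for inference. A production system would more likely use dense embeddings for retrieval; token overlap is used here only to keep the sketch dependency-free.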

Financial Knowledge Sources

Multi-source retrieval from news, research platforms, and social media for richer context.

News Sources

Bloomberg, Reuters, Yahoo Finance, CNBC, MarketWatch

Research Platforms

Goldman Sachs Marquee, Citi Velocity, Seeking Alpha

Social Media

Twitter, Reddit, StockTwits, Weibo

RAG in Action — Sentiment Analysis Example

Without RAG

"$ENR - Energizer shakes off JPMorgan's bear call." → Instruction-tuned LLM → Neutral

With RAG

"$ENR - Energizer shakes off JPMorgan's bear call." → Multi-source retrieval → Instruction-tuned LLM → Positive

Retrieved Context

"JPMorgan hikes Energizer Holdings (NYSE:ENR) to a Neutral rating from Underweight... We came away encouraged by some of the company's initiatives and believe their focus on innovation and brand investment can lead to relative outperformance going forward... Shares of Energizer are 0.46% premarket to $50.44."

RAG Performance — Twitter Validation Dataset

Accuracy and F1 scores with and without retrieval-augmented generation.

Model	Accuracy	F1
ChatGPT 4.0 w/o RAG	0.788	0.652
ChatGPT 4.0 w/ RAG	0.813	0.708
FinGPT w/o RAG	0.863	0.811
FinGPT w/ RAG	0.881	0.842
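The gains in the table above can be computed directly from its four rows. The sketch below hard-codes those values; the `gains` helper and its field names are hypothetical, and the figures describe only this Twitter validation set.

```python
# (accuracy, F1) pairs copied from the table above.
RESULTS = {
    "ChatGPT 4.0": {"without": (0.788, 0.652), "with": (0.813, 0.708)},
    "FinGPT": {"without": (0.863, 0.811), "with": (0.881, 0.842)},
}


def gains(model):
    """Absolute and relative change in accuracy and F1 when RAG is added."""
    acc0, f1_0 = RESULTS[model]["without"]
    acc1, f1_1 = RESULTS[model]["with"]
    return {
        "accuracy_abs": acc1 - acc0,
        "f1_abs": f1_1 - f1_0,
        "accuracy_rel": (acc1 - acc0) / acc0,
        "f1_rel": (f1_1 - f1_0) / f1_0,
    }


for name in RESULTS:
    g = gains(name)
    print(
        f"{name}: accuracy +{g['accuracy_abs']:.3f} ({g['accuracy_rel']:.1%}), "
        f"F1 +{g['f1_abs']:.3f} ({g['f1_rel']:.1%})"
    )
```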