
The AI4Finance Foundation has several interesting projects on the application of AI in finance. One of them is FinGPT, which is introduced and available here. Part of this project is about sentiment analysis using LLMs. The team used Llama2 and ChatGLM as base models and then performed instruction fine-tuning on them. The results were promising.
In this mini project, we attempt to replicate the same sentiment analysis with an LLM, but using Google’s FLAN-T5 model instead. This model is well known for its strong instruction-following performance and comes in several sizes. My experiments showed that a model as small as FLAN-T5 Base, with only 250 million parameters, can significantly improve its performance through PEFT (LoRA) fine-tuning. More importantly, I ran the entire process on a 10th-generation Intel i9 CPU!
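As a preview of the fine-tuning part, a minimal LoRA setup with the peft library might look like the sketch below. The hyperparameters (r, lora_alpha, dropout) are illustrative choices, not necessarily the ones used in this project; target_modules=["q", "v"] matches the attention projection names in T5.

from peft import LoraConfig, TaskType, get_peft_model
from transformers import T5ForConditionalGeneration

# Wrap FLAN-T5 Base with LoRA adapters; only the adapter weights are trained
base = T5ForConditionalGeneration.from_pretrained("google/flan-t5-base")
lora_cfg = LoraConfig(task_type=TaskType.SEQ_2_SEQ_LM, r=8, lora_alpha=32,
                      lora_dropout=0.05, target_modules=["q", "v"])
peft_model = get_peft_model(base, lora_cfg)
peft_model.print_trainable_parameters()  # a small fraction of the 250M parameters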
Let’s start exploring the project. In this part, I introduce the FLAN-T5 model and analyze its performance and the different ways to use it. We will see how the quantized 8-bit and 4-bit models perform and compare their sizes with those of the original models. You can find the code on my GitHub page.
from transformers import pipeline

# Download the four FLAN-T5 checkpoints as text2text-generation pipelines
model_small = pipeline("text2text-generation", model="google/flan-t5-small")
model_base = pipeline("text2text-generation", model="google/flan-t5-base")
model_large = pipeline("text2text-generation", model="google/flan-t5-large")
model_xl = pipeline("text2text-generation", model="google/flan-t5-xl")
models = [["Small", model_small], ["Base", model_base], ["Large", model_large], ["X-Large", model_xl]]

# Save each pipeline locally so later runs load from disk instead of re-downloading
for name, model in models:
    model.save_pretrained(f"./saved_models/{name}")

from transformers import pipeline

# Reload the pipelines from the local copies
models = [["Small"], ["Base"], ["Large"], ["X-Large"]]
for i in range(len(models)):
    models[i].append(pipeline("text2text-generation", model=f"./saved_models/{models[i][0]}"))

models

[['Small',
<transformers.pipelines.text2text_generation.Text2TextGenerationPipeline at 0x275147d6b10>],
['Base',
<transformers.pipelines.text2text_generation.Text2TextGenerationPipeline at 0x27514db3410>],
['Large',
<transformers.pipelines.text2text_generation.Text2TextGenerationPipeline at 0x27514d50490>],
['X-Large',
<transformers.pipelines.text2text_generation.Text2TextGenerationPipeline at 0x27514d505a0>]]

# Ask the X-Large model to summarize a longer passage
models[3][1]("When a recession begins, the Federal Reserve typically lowers interest rates as part of its monetary easing strategy. Here's how and why:\
What the Fed Does:\n\
- Cuts the Federal Funds Rate: The Fed reduces the benchmark interest rate to make borrowing cheaper for businesses and consumers.\n\
- Stimulates Economic Activity: Lower rates encourage spending and investment, which can help revive economic growth.\n\
- Supports Employment: By boosting demand, the Fed aims to reduce unemployment, which often rises during recessions.\n\
- Manages Inflation: If inflation is low or falling, rate cuts are more aggressive. If inflation remains high, the Fed may be more cautious.\n\n\
Historical Patterns:\n\
- In past recessions (e.g., 2001, 2008, 2020), the Fed slashed rates significantly—sometimes to near zero.\n\
- These moves are often accompanied by other tools like quantitative easing or forward guidance to reinforce the impact.\n\n\
Strategic Considerations:\n\
- The Fed doesn’t always act immediately. It assesses inflation trends, labor market data, and financial stability before deciding.\n\
- If inflation is still elevated—as seen in recent cycles—the Fed may delay or moderate rate cuts.\n\n\
Summarize the above text")[0]["generated_text"]

"The Federal Reserve typically lowers interest rates as part of its monetary easing strategy. Here's how and why."

# A simple arithmetic word problem for the X-Large model
models[3][1]("I bought 2 apples and three oranges for a total of 10 dollars. Each orange is 2 dollars. How much is the price of each apple?")

[{'generated_text': '3 oranges cost 3 * 2 = 6 dollars. 2 apples cost 10 - 6 = 4 dollars. Each apple is 4 / 2 = 2 dollars.'}]

# A small set of test prompts for all four model sizes
prompts = ["How old the planet Earth is?'",
           "Write a Python code to calculate sum off all elements in a list",
           "Who is the current president of the USA?",
           "What is a common hedging method against currency volatility?"]
models[3][1](prompts)

To speed this up, each (prompt, model) pair runs in its own thread; this helps even on a CPU because PyTorch releases the GIL during heavy tensor operations.

from threading import Thread

threads = []
# results[i][j] will hold model j's answer to prompt i
results = [[None for _ in range(len(models))] for _ in range(len(prompts))]

def run_models(i, j):
    output = models[j][1](prompts[i])
    results[i][j] = output[0]["generated_text"]
    print(f"Model {j} finished prompt {i}")

# Launch one thread per (prompt, model) combination
for i in range(len(prompts)):
    for j in range(len(models)):
        thread = Thread(target=run_models, args=(i, j))
        thread.start()
        threads.append(thread)

Model 0 finished prompt 0
Model 0 finished prompt 3
Model 0 finished prompt 2
Model 1 finished prompt 0
Model 1 finished prompt 3
Model 1 finished prompt 2
Model 2 finished prompt 0
Model 0 finished prompt 1
Model 3 finished prompt 0
Model 3 finished prompt 3
Model 3 finished prompt 2
Model 2 finished prompt 3
Model 2 finished prompt 2
Model 1 finished prompt 1
Model 2 finished prompt 1
Model 3 finished prompt 1

# Wait for all threads to finish, then print each model's answer per prompt
for thread in threads:
    thread.join()

for i in range(len(prompts)):
    print(f"\n\n{prompts[i]}")
    for j in range(len(models)):
        print(f" {models[j][0]}: {results[i][j]}")

Tell me how old the planet Earth is?
 Small: 10 billion years
 Base: 4.5 billion years old
 Large: billions of years
 X-Large: 4.5 billion years

What is the Python method that returns sum off all elements in a list
 Small: List of elements in a list is a list of elements in a list.
 Base: sum = sum(list(map(int, input().split())))
 Large: s = sum(list(map(int,input().split())))
 X-Large: s = 0 for i in list(map(int, input().split())): s += i print(s)

Who is the current president of the United States?
 Small: george w. bush
 Base: gerald ford
 Large: barack obama
 X-Large: barack obama

What is a common hedging method against currency volatility?
 Small: hedging
 Base: hedge fund
 Large: hedging with a foreign currency
 X-Large: hedging with forward contracts

Loading a prepared T5 quantized 8-bit model
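The CTranslate2 model loaded below was converted ahead of time. Assuming the standard CTranslate2 Transformers converter API, the one-time conversion looks roughly like this sketch (the output path simply matches the directory loaded below):

import ctranslate2

# Convert the Hugging Face checkpoint to CTranslate2 format with int8 weights
converter = ctranslate2.converters.TransformersConverter("google/flan-t5-xl")
converter.convert("./saved_models/X-Large-8bit", quantization="int8")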
import ctranslate2
import transformers

# Load the int8 CTranslate2 model and the matching tokenizer
translator = ctranslate2.Translator("./saved_models/X-Large-8bit/")
tokenizer = transformers.AutoTokenizer.from_pretrained("google/flan-t5-xl")

prompts = ["How old the planet Earth is?'",
           "Write a Python code to calculate sum off all elements in a list",
           "Who is the current president of the USA?",
           "What is a common hedging method against currency volatility?"]

# CTranslate2 works on token strings, so encode, translate, then decode
for prompt in prompts:
    input_tokens = tokenizer.convert_ids_to_tokens(tokenizer.encode(prompt))
    results = translator.translate_batch([input_tokens])
    output_tokens = results[0].hypotheses[0]
    output_text = tokenizer.decode(tokenizer.convert_tokens_to_ids(output_tokens))
    print(output_text)

4.5 billion
list = list(map(int, input().rstrip().replace(" "))) list.replace(" ".join(list)) list.replace(list[1]) list.replace(list[2]) list.replace(list[3]) list.replace(list[4]) list.replace(list[5]) list.replace(list[6]) list.replace(list[7]) list.replace(list[8]) list.replace(list[9]) list.replace(list[10]) list.replace(list[11]) list.replace(list[8]) list.replace(list[9]) list.replace(list[6]) list.replace(list[7]) list.replace(list[8]) list.replace(list[9]) list.replace(list[0]) list.replace(list[1]) list.replace(list[2]) list.replace(list[0]) list.replace(list[
barack obama
foreign currency forward transactions

Quantized 8-bit and 4-bit models
from transformers import T5Tokenizer, T5ForConditionalGeneration, BitsAndBytesConfig as bnb_cfg

# bitsandbytes configs for 8-bit and 4-bit loading
qcfg = bnb_cfg(load_in_8bit=True)
qqcfg = bnb_cfg(load_in_4bit=True)

tokenizer = T5Tokenizer.from_pretrained("google/flan-t5-xl")
model = T5ForConditionalGeneration.from_pretrained("google/flan-t5-xl", device_map="auto")
qmodel = T5ForConditionalGeneration.from_pretrained("google/flan-t5-xl", device_map="auto", quantization_config=qcfg)
qqmodel = T5ForConditionalGeneration.from_pretrained("google/flan-t5-xl", device_map="auto", quantization_config=qqcfg)

# Save the quantized weights for later reuse
qmodel.save_pretrained("qXL")
qqmodel.save_pretrained("qqXL")
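To compare the footprints of the original and quantized variants, one option is the get_memory_footprint() method that transformers models expose (a quick sketch; the exact numbers depend on your setup):

# Rough in-memory size of each variant, in gigabytes
for name, m in [("Original XL", model), ("8-bit XL", qmodel), ("4-bit XL", qqmodel)]:
    print(f"{name}: {m.get_memory_footprint() / 1e9:.2f} GB")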
from transformers import T5Tokenizer, T5ForConditionalGeneration

# Reload the original and quantized models from disk
tokenizer = T5Tokenizer.from_pretrained("google/flan-t5-xl")
model = T5ForConditionalGeneration.from_pretrained("./saved_models/X-Large", device_map="auto")
qmodel = T5ForConditionalGeneration.from_pretrained("./saved_models/qXL", device_map="auto")
qqmodel = T5ForConditionalGeneration.from_pretrained("./saved_models/qqXL", device_map="auto")

input_text = "I bought 2 apples and three oranges for a total of 10 dollars. Each orange is 2 dollars. How much is the price of each apple?"
input_ids = tokenizer(input_text, return_tensors="pt").input_ids

# Compare the three variants on the same word problem
outputs = model.generate(input_ids)
print(f"Original XL: {tokenizer.decode(outputs[0])}")

Original XL: <pad>3 oranges cost 3 * 2 = $6. 2 apples cost 10 - 6 = $4.

outputs = qmodel.generate(input_ids)
print(f"8-bit XL: {tokenizer.decode(outputs[0])}")

8-bit XL: <pad>3 oranges cost 3 * 2 = $6. 2 apples cost 10 - 6 = $4.

outputs = qqmodel.generate(input_ids)
print(f"4-bit XL: {tokenizer.decode(outputs[0])}")

4-bit XL: <pad>3 oranges cost 3 * 2 = $6. So the apples cost 10 - 6 = $4

prompts = ["How old the planet Earth is?'",
           "Write a Python code to calculate sum off all elements in a list",
           "Who is the current president of the USA?",
           "What is a common hedging method against currency volatility?"]
# Run the test prompts through the reloaded original model
for prompt in prompts:
    input_tokens = tokenizer(prompt, return_tensors="pt").input_ids
    outputs = model.generate(input_tokens)
    output_text = tokenizer.decode(outputs[0])
    print(output_text)

<pad> 4.5 billion years</s>
<pad>s = 0 for i in range(len(list(map(int
<pad> barack obama</s>
<pad> hedging with forward contracts</s>
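Before moving on to fine-tuning, here is a taste of the sentiment task itself. The prompt below mirrors a FinGPT-style instruction format, but the exact wording and the sample headline are my own illustration:

# Zero-shot financial sentiment with the Base pipeline loaded earlier
sentiment_prompt = ("What is the sentiment of this news? "
                    "Please choose an answer from {negative/neutral/positive}: "
                    "Company X shares plunge after earnings miss.")
print(models[1][1](sentiment_prompt)[0]["generated_text"])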