import braintrust
import fastcore.all as fc
from dotenv import load_dotenv
from tqdm import tqdm
from wattbot import retriever, eda, evaluate, utils, generator
Submissions
Including neighbour chunks
%load_ext autoreload
%autoreload 2
load_dotenv()
True
embedding_model = 'accounts/fireworks/models/qwen3-embedding-8b'
llm_model = 'accounts/fireworks/models/kimi-k2p5'
all_chunks = retriever.chunk_all(retriever.chunk_doc)
With Lexical Search
ls = retriever.LexicalSearch(all_chunks, neighbour_chunks=True)
rag = generator.RAG(ls, utils.fw(), model=llm_model)
experiment_metadata = {
    'pdf_extraction': 'pypdf',
    'chunking': 'character_level',
    'chunk_size': 1500,
    'chunk_step': 1400,
    'retrieval': 'lexical_search',
    'neighbour_chunks': True
}
evaluate.evaluate_train(rag, experiment_metadata, n_rc=10)
Skipping git metadata. This is likely because the repository has not been published to a remote yet. Remote named 'origin' didn't exist
Processing Rows: 100%|██████████| 41/41 [01:50<00:00, 2.68s/it]
31.475
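The metadata above records character-level chunks of 1500 characters taken every 1400 characters, with neighbour chunks included at retrieval time. The sketch below illustrates what those settings could mean in practice. It is not the code in `wattbot.retriever`: the helper names `chunk_text` and `with_neighbours`, the `window` parameter, and the choice to ignore document boundaries are all assumptions made for illustration.

```python
def chunk_text(text, chunk_size=1500, chunk_step=1400):
    """Split text into overlapping character windows.

    With size 1500 and step 1400, consecutive chunks share 100 characters.
    """
    if not text:
        return []
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_step)]


def with_neighbours(hit_indices, n_chunks, window=1):
    """Expand retrieved chunk indices with their adjacent chunks.

    Keeps first-seen order, drops duplicates, and clips at the ends.
    Assumption: all chunks belong to one document, so neighbours never
    cross a document boundary.
    """
    expanded = []
    for idx in hit_indices:
        for j in range(idx - window, idx + window + 1):
            if 0 <= j < n_chunks and j not in expanded:
                expanded.append(j)
    return expanded


chunks = chunk_text("x" * 3000)
print([len(c) for c in chunks])                       # chunk lengths
print(with_neighbours([0, 2], n_chunks=len(chunks)))  # hits plus neighbours
```

Including neighbours trades a longer prompt for a better chance that an answer split across a chunk boundary reaches the model intact.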
experiment_metadata['output_path'] = 'submission_v11.csv'
evaluate.create_submission(rag, experiment_metadata)
With Semantic Search
ss = retriever.SemanticSearch(all_chunks, model=embedding_model, neighbour_chunks=True)
rag = generator.RAG(ss, utils.fw(), model=llm_model)
rag.r.chunks_embeddings[-1].shape
(1927, 4096)
experiment_metadata = {
    'pdf_extraction': 'pypdf',
    'chunking': 'character_level',
    'chunk_size': 1500,
    'chunk_step': 1400,
    'retrieval': 'semantic_search',
    'neighbour_chunks': True
}
evaluate.evaluate_train(rag, experiment_metadata, n_rc=10)
Processing Rows: 100%|██████████| 41/41 [02:45<00:00, 4.02s/it]
32.375
experiment_metadata['output_path'] = 'submission_v6.csv'
evaluate.create_submission(rag, experiment_metadata, n_rc=10)
/Users/anubhavmaity/projects/wattbot/wattbot/evaluate.py:109: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise an error in a future version of pandas. Value 'The total energy consumption of U.S. data centers increased by about 4% between 2010 and 2014' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
df.loc[i, 'answer'] = str(answer['answer'])
/Users/anubhavmaity/projects/wattbot/wattbot/evaluate.py:110: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise an error in a future version of pandas. Value '4' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
df.loc[i, 'answer_value'] = str(answer['answer_value'])
/Users/anubhavmaity/projects/wattbot/wattbot/evaluate.py:112: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise an error in a future version of pandas. Value '['wu2021b']' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
df.loc[i, 'ref_id'] = str(answer['ref_id'])
/Users/anubhavmaity/projects/wattbot/wattbot/evaluate.py:113: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise an error in a future version of pandas. Value '['https://arxiv.org/pdf/2108.06738']' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
df.loc[i, 'ref_url'] = str(answer['ref_url'])
/Users/anubhavmaity/projects/wattbot/wattbot/evaluate.py:114: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise an error in a future version of pandas. Value 'The total energy consumption of the US data centers increased by about 4% from 2010-2014' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
df.loc[i, 'supporting_materials'] = str(answer['supporting_materials'])
/Users/anubhavmaity/projects/wattbot/wattbot/evaluate.py:115: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise an error in a future version of pandas. Value 'Chunk 18 from wu2021b explicitly states that U.S. data center energy consumption increased by about 4% between 2010 and 2014, providing the exact percentage increase requested.' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
df.loc[i, 'explanation'] = str(answer['explanation'])
Answering question: 100%|██████████| 282/282 [17:44<00:00, 3.77s/it]
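The FutureWarning output above comes from writing strings into columns that pandas inferred as float64, which is what happens when a column contains only missing values. One way to silence it is to cast the text columns to `object` before the assignment loop. The sketch below uses a small stand-in frame. The column names are taken from the warnings, but the frame contents and the `id` column are assumptions, and this is a suggested fix rather than the current code in `evaluate.py`.

```python
import numpy as np
import pandas as pd

# Stand-in for the submission frame: all-NaN answer columns are inferred as float64.
df = pd.DataFrame({
    "id": ["q001", "q002"],  # hypothetical identifier column
    "answer": [np.nan, np.nan],
    "answer_value": [np.nan, np.nan],
    "explanation": [np.nan, np.nan],
})

# Cast every column that will receive text to object dtype up front,
# so that later string assignments do not need an implicit upcast.
text_cols = ["answer", "answer_value", "explanation"]
df = df.astype({col: "object" for col in text_cols})

df.loc[0, "answer"] = "4%"
df.loc[0, "answer_value"] = "4"
df.loc[0, "explanation"] = "Stated directly in the retrieved chunk."
print(df.dtypes)
```

In `evaluate.py` the same cast would cover `ref_id`, `ref_url`, and `supporting_materials` as well, since the warnings name those columns too.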
Hybrid Search
ls = retriever.LexicalSearch(all_chunks)
ss = retriever.SemanticSearch(all_chunks, model=embedding_model)
hs = retriever.HybridSearch(ls, ss, neighbour_chunks=True)
rag = generator.RAG(hs, utils.fw(), model=llm_model)
experiment_metadata = {
    'pdf_extraction': 'pypdf',
    'chunking': 'character_level',
    'chunk_size': 1500,
    'chunk_step': 1400,
    'retrieval': 'hybrid_search',
    'neighbour_chunks': True
}
evaluate.evaluate_train(rag, experiment_metadata, n_rc=10)
Processing Rows: 100%|██████████| 41/41 [05:34<00:00, 8.17s/it]
33.25
experiment_metadata['output_path'] = 'submission_v13.csv'
evaluate.create_submission(rag, experiment_metadata, n_rc=10)
/Users/anubhavmaity/projects/wattbot/wattbot/evaluate.py:103: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise an error in a future version of pandas. Value '4%' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
df.loc[i, 'answer'] = str(answer['answer'])
/Users/anubhavmaity/projects/wattbot/wattbot/evaluate.py:104: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise an error in a future version of pandas. Value '4' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
df.loc[i, 'answer_value'] = str(answer['answer_value'])
/Users/anubhavmaity/projects/wattbot/wattbot/evaluate.py:106: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise an error in a future version of pandas. Value '['wu2021b']' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
df.loc[i, 'ref_id'] = str(answer['ref_id'])
/Users/anubhavmaity/projects/wattbot/wattbot/evaluate.py:107: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise an error in a future version of pandas. Value '['https://arxiv.org/pdf/2108.06738']' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
df.loc[i, 'ref_url'] = str(answer['ref_url'])
/Users/anubhavmaity/projects/wattbot/wattbot/evaluate.py:108: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise an error in a future version of pandas. Value '"The total energy consumption of the US data centers increased by about 4% from 2010-2014, compared with the estimated 24% increase from 2005-10 and nearly 90% increase from 2000-05 [Masanet et al., 2020]."' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
df.loc[i, 'supporting_materials'] = str(answer['supporting_materials'])
/Users/anubhavmaity/projects/wattbot/wattbot/evaluate.py:109: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise an error in a future version of pandas. Value 'The answer is explicitly stated in Chunk 2 of the provided context. The text directly states that U.S. data center energy consumption increased by about 4% from 2010-2014, citing Masanet et al., 2020. This is the only mention of the 2010-2014 period in the context, and it provides a clear percentage increase without requiring any calculation.' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
df.loc[i, 'explanation'] = str(answer['explanation'])
Answering question: 100%|██████████| 282/282 [30:46<00:00, 6.55s/it]
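How `retriever.HybridSearch` merges the lexical and semantic results is not shown in this notebook. One common option is reciprocal rank fusion (RRF), sketched below. This is an illustration only and may differ from what wattbot does: the function name `reciprocal_rank_fusion`, the constant `k=60`, and the example chunk ids are assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of chunk ids into one ranking.

    Each list contributes 1 / (k + rank) to every id it contains,
    with rank starting at 1 for the top result.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by chunk id for determinism.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


lexical_hits = [3, 7, 1]   # hypothetical chunk ids from lexical search
semantic_hits = [7, 2, 3]  # hypothetical chunk ids from semantic search
print(reciprocal_rank_fusion([lexical_hits, semantic_hits]))
```

RRF uses only rank positions, so lexical scores and embedding similarities do not need to be put on a common scale before fusing.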