Submissions

Including neighbour chunks

%load_ext autoreload
%autoreload 2

import braintrust
import fastcore.all as fc
from dotenv import load_dotenv
from tqdm import tqdm
from wattbot import retriever, eda, evaluate, utils, generator

load_dotenv()

True

embedding_model = 'accounts/fireworks/models/qwen3-embedding-8b'
llm_model = 'accounts/fireworks/models/kimi-k2p5'

all_chunks = retriever.chunk_all(retriever.chunk_doc)

With Lexical Search

ls = retriever.LexicalSearch(all_chunks, neighbour_chunks=True)
rag = generator.RAG(ls, utils.fw(), model=llm_model)
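The `neighbour_chunks=True` flag is not explained in this notebook. One plausible reading is that each retrieved chunk is returned together with the chunks directly before and after it, so that an answer split across a chunk boundary still reaches the LLM. The sketch below illustrates that idea only: `with_neighbours` and its `window` parameter are hypothetical names, not part of `wattbot`, and the sketch ignores document boundaries, which a real implementation would probably respect.

```python
def with_neighbours(hit_ids, n_chunks, window=1):
    """Expand retrieved chunk indices with adjacent chunks, keeping rank order.

    hit_ids  : chunk indices returned by the retriever, best first
    n_chunks : total number of chunks in the corpus
    window   : how many neighbours to add on each side (assumed to be 1)
    """
    seen, expanded = set(), []
    for i in hit_ids:
        for j in range(i - window, i + window + 1):
            # Skip indices outside the corpus and chunks already included.
            if 0 <= j < n_chunks and j not in seen:
                seen.add(j)
                expanded.append(j)
    return expanded


print(with_neighbours([5, 0, 6], n_chunks=8))  # [4, 5, 6, 0, 1, 7]
```

With a window of 1, the context passed to the generator can grow to roughly three times the number of retrieved chunks, which may explain part of the longer run times in this notebook.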

experiment_metadata = {
    'pdf_extraction': 'pypdf',
    'chunking': 'character_level',
    'chunk_size': 1500,
    'chunk_step': 1400,
    'retrieval': 'lexical_search',
    'neighbour_chunks': True
}
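The metadata records character-level chunking with `chunk_size` 1500 and `chunk_step` 1400, which would give consecutive chunks an overlap of 100 characters. A minimal sketch of such a sliding window follows. `chunk_text` is a hypothetical helper written for illustration; the actual logic lives in `retriever.chunk_doc`, which is not shown here and may differ.

```python
def chunk_text(text, size=1500, step=1400):
    """Split text into fixed-size character windows that overlap by size - step."""
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        # Stop once a window reaches the end, to avoid a redundant tail chunk.
        if start + size >= len(text):
            break
    return chunks


sample = ''.join(chr(97 + i % 26) for i in range(3000))
chunks = chunk_text(sample)
print([len(c) for c in chunks])           # [1500, 1500, 200]
print(chunks[0][1400:] == chunks[1][:100])  # True: 100 shared characters
```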

evaluate.evaluate_train(rag, experiment_metadata, n_rc=10)

Skipping git metadata. This is likely because the repository has not been published to a remote yet. Remote named 'origin' didn't exist
Processing Rows: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 41/41 [01:50<00:00,  2.68s/it]

31.475


experiment_metadata['output_path'] = 'submission_v11.csv'
evaluate.create_submission(rag, experiment_metadata)

With Semantic Search

ss = retriever.SemanticSearch(all_chunks, model=embedding_model, neighbour_chunks=True)
rag = generator.RAG(ss, utils.fw(), model=llm_model)

rag.r.chunks_embeddings[-1].shape

(1927, 4096)
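The shape suggests one 4096-dimensional embedding for each of 1927 chunks. Semantic retrieval over such a matrix is commonly done by ranking chunks by cosine similarity to the query embedding. The sketch below shows that ranking with NumPy on random toy vectors. `top_k_cosine` is a hypothetical function, and whether `SemanticSearch` uses cosine similarity is an assumption.

```python
import numpy as np


def top_k_cosine(query_vec, chunk_matrix, k=10):
    """Return the indices and scores of the k rows most similar to query_vec."""
    q = query_vec / np.linalg.norm(query_vec)
    m = chunk_matrix / np.linalg.norm(chunk_matrix, axis=1, keepdims=True)
    scores = m @ q                      # cosine similarity per chunk
    order = np.argsort(-scores)[:k]     # highest similarity first
    return order, scores[order]


# Toy data: 1927 rows as in the notebook, but only 64 dimensions to keep it small.
rng = np.random.default_rng(0)
toy_chunks = rng.normal(size=(1927, 64))
toy_query = toy_chunks[42] + 0.01 * rng.normal(size=64)

ids, sims = top_k_cosine(toy_query, toy_chunks, k=10)
print(ids[0])  # 42, the chunk the query was derived from
```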

experiment_metadata = {
    'pdf_extraction': 'pypdf',
    'chunking': 'character_level',
    'chunk_size': 1500,
    'chunk_step': 1400,
    'retrieval': 'semantic_search',
    'neighbour_chunks': True
}

evaluate.evaluate_train(rag, experiment_metadata, n_rc=10)

Processing Rows: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 41/41 [02:45<00:00,  4.02s/it]

32.375

experiment_metadata['output_path'] = 'submission_v6.csv'
evaluate.create_submission(rag, experiment_metadata, n_rc=10)

/Users/anubhavmaity/projects/wattbot/wattbot/evaluate.py:109: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise an error in a future version of pandas. Value 'The total energy consumption of U.S. data centers increased by about 4% between 2010 and 2014' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  df.loc[i, 'answer'] = str(answer['answer'])
/Users/anubhavmaity/projects/wattbot/wattbot/evaluate.py:110: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise an error in a future version of pandas. Value '4' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  df.loc[i, 'answer_value'] = str(answer['answer_value'])
/Users/anubhavmaity/projects/wattbot/wattbot/evaluate.py:112: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise an error in a future version of pandas. Value '['wu2021b']' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  df.loc[i, 'ref_id'] = str(answer['ref_id'])
/Users/anubhavmaity/projects/wattbot/wattbot/evaluate.py:113: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise an error in a future version of pandas. Value '['https://arxiv.org/pdf/2108.06738']' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  df.loc[i, 'ref_url'] = str(answer['ref_url'])
/Users/anubhavmaity/projects/wattbot/wattbot/evaluate.py:114: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise an error in a future version of pandas. Value 'The total energy consumption of the US data centers increased by about 4% from 2010-2014' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  df.loc[i, 'supporting_materials'] = str(answer['supporting_materials'])
/Users/anubhavmaity/projects/wattbot/wattbot/evaluate.py:115: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise an error in a future version of pandas. Value 'Chunk 18 from wu2021b explicitly states that U.S. data center energy consumption increased by about 4% between 2010 and 2014, providing the exact percentage increase requested.' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  df.loc[i, 'explanation'] = str(answer['explanation'])
Answering question: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 282/282 [17:44<00:00,  3.77s/it]
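The `FutureWarning` messages above indicate that the answer columns are `float64` (probably because they are entirely empty when the submission frame is loaded) and are then filled with strings. Casting those columns to `object` before the loop should silence the warning and keep the code working once pandas turns it into an error. The sketch below uses the column names from the warnings, but the DataFrame itself is a made-up stand-in, not the real submission file.

```python
import pandas as pd

# Stand-in for the submission frame: answer columns start out as all-NaN floats.
text_cols = ['answer', 'answer_value', 'ref_id', 'ref_url',
             'supporting_materials', 'explanation']
df = pd.DataFrame({'id': ['q001', 'q002']})
for col in text_cols:
    df[col] = float('nan')

# Cast once, up front, so string assignments no longer clash with float64.
df = df.astype({col: 'object' for col in text_cols})

df.loc[0, 'answer'] = '4%'
df.loc[0, 'ref_id'] = str(['wu2021b'])
print(df.dtypes['answer'])  # object
```

An alternative with the same effect would be to pass an explicit `dtype` for these columns when reading the CSV.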

Hybrid Search

ls = retriever.LexicalSearch(all_chunks)
ss = retriever.SemanticSearch(all_chunks, model=embedding_model)
hs = retriever.HybridSearch(ls, ss, neighbour_chunks=True)
rag = generator.RAG(hs, utils.fw(), model=llm_model)
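This notebook does not show how `HybridSearch` combines the lexical and semantic results. A common choice for merging two ranked lists is reciprocal rank fusion (RRF), sketched below as an illustration only. `reciprocal_rank_fusion` is a hypothetical function, and the constant `k=60` is the conventional default rather than anything taken from `wattbot`.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of chunk ids into one list, best first.

    Each id scores sum(1 / (k + rank)) over the lists it appears in,
    so ids ranked highly by several retrievers rise to the top.
    """
    scores = {}
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)


lexical_hits = [3, 1, 7]    # toy ids, best first
semantic_hits = [1, 9, 3]
print(reciprocal_rank_fusion([lexical_hits, semantic_hits]))  # [1, 3, 9, 7]
```

RRF uses only ranks, so it needs no calibration between BM25-style lexical scores and embedding similarities, which live on different scales.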

experiment_metadata = {
    'pdf_extraction': 'pypdf',
    'chunking': 'character_level',
    'chunk_size': 1500,
    'chunk_step': 1400,
    'retrieval': 'hybrid_search',
    'neighbour_chunks': True
}

evaluate.evaluate_train(rag, experiment_metadata, n_rc=10)

Processing Rows: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 41/41 [05:34<00:00,  8.17s/it]

33.25

experiment_metadata['output_path'] = 'submission_v13.csv'
evaluate.create_submission(rag, experiment_metadata, n_rc=10)

/Users/anubhavmaity/projects/wattbot/wattbot/evaluate.py:103: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise an error in a future version of pandas. Value '4%' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  df.loc[i, 'answer'] = str(answer['answer'])
/Users/anubhavmaity/projects/wattbot/wattbot/evaluate.py:104: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise an error in a future version of pandas. Value '4' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  df.loc[i, 'answer_value'] = str(answer['answer_value'])
/Users/anubhavmaity/projects/wattbot/wattbot/evaluate.py:106: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise an error in a future version of pandas. Value '['wu2021b']' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  df.loc[i, 'ref_id'] = str(answer['ref_id'])
/Users/anubhavmaity/projects/wattbot/wattbot/evaluate.py:107: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise an error in a future version of pandas. Value '['https://arxiv.org/pdf/2108.06738']' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  df.loc[i, 'ref_url'] = str(answer['ref_url'])
/Users/anubhavmaity/projects/wattbot/wattbot/evaluate.py:108: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise an error in a future version of pandas. Value '"The total energy consumption of the US data centers increased by about 4% from 2010-2014, compared with the estimated 24% increase from 2005-10 and nearly 90% increase from 2000-05 [Masanet et al., 2020]."' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  df.loc[i, 'supporting_materials'] = str(answer['supporting_materials'])
/Users/anubhavmaity/projects/wattbot/wattbot/evaluate.py:109: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise an error in a future version of pandas. Value 'The answer is explicitly stated in Chunk 2 of the provided context. The text directly states that U.S. data center energy consumption increased by about 4% from 2010-2014, citing Masanet et al., 2020. This is the only mention of the 2010-2014 period in the context, and it provides a clear percentage increase without requiring any calculation.' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  df.loc[i, 'explanation'] = str(answer['explanation'])
Answering question: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 282/282 [30:46<00:00,  6.55s/it]