A local Q&A engine using Llama and FAISS

FAISS implements similarity search over vectors. Sentence Transformers encodes sentences into vectors that FAISS can index. Llama can then answer questions using the passages retrieved from a FAISS vector store. Combining the three, we can build a local GenAI-powered assistant that runs semantic queries against a local corpus of documents and answers from what it finds. Let’s see how.
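To make the retrieval idea concrete, here is a minimal sketch of a similarity search in plain Python. The sentences, the three-dimensional vectors and the `search` helper are all made up for illustration. Real embeddings from Sentence Transformers have hundreds of dimensions, and FAISS uses optimized indexes rather than this brute-force loop.

```python
import math

# Toy "embeddings": hand-written vectors standing in for encoded sentences.
# Both the sentences and the numbers are invented for this example.
corpus = {
    "The cat sleeps on the sofa.": [0.9, 0.1, 0.0],
    "Stock prices fell on Monday.": [0.0, 0.2, 0.9],
    "The recipe needs two eggs.": [0.3, 0.9, 0.1],
}

def l2_distance(a, b):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def search(query_vector, k=2):
    """Return the k sentences whose vectors are closest to the query."""
    ranked = sorted(corpus.items(),
                    key=lambda item: l2_distance(query_vector, item[1]))
    return [(sentence, l2_distance(query_vector, vector))
            for sentence, vector in ranked[:k]]

# A query vector that sits close to the first sentence's vector.
query = [0.8, 0.2, 0.1]
results = search(query, k=2)
for sentence, distance in results:
    print(f"{distance:.3f}  {sentence}")
```

The smaller the distance, the closer the match. The `k` argument here plays the same role as `search_kwargs={'k': 2}` in the retriever: only the two nearest entries are returned.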

For this example I’m using llama-2-7b-chat.ggmlv3.q4_0.bin and a document (notes.txt) that I’d like my agent to answer questions about. I’m also using faiss-gpu, as I have a compatible graphics card, but the CPU version works too.

# conda install faiss-gpu
# pip install llama-index langchain ctransformers sentence-transformers unstructured

from langchain.document_loaders import UnstructuredFileLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import FAISS
from langchain.llms import CTransformers
from langchain_core.vectorstores import VectorStoreRetriever
from langchain.prompts import PromptTemplate
from langchain.chains import RetrievalQA

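# Encode the document into FAISS
loader = UnstructuredFileLoader('data/notes.txt')
documents = loader.load()
# Split the document into overlapping chunks small enough to embed and retrieve
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=20)
text_chunks = text_splitter.split_documents(documents)
# Assumption: the library's default Sentence Transformers model; the model actually used is not stated
embeddings = HuggingFaceEmbeddings()
vector_store = FAISS.from_documents(text_chunks, embeddings)
# Return the 2 most similar chunks for each query
retriever = VectorStoreRetriever(vectorstore=vector_store, search_kwargs={'k': 2})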

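# Load Llama
llm = CTransformers(
    model='models/llama-2-7b-chat.ggmlv3.q4_0.bin',  # assumed local path to the model file
    model_type='llama',
    config={'max_new_tokens':500,'temperature':0.1, 'context_length': 2048})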

# Define the agent behaviour
template="""Use the following pieces of information to answer the user's question.
If you don't know the answer, just say you don't know; don't try to make up an answer.
Context: {context}
Question: {question}
Only return the helpful answer below and nothing else.
Helpful answer:
"""

# Create the agent
qa_prompt=PromptTemplate(template=template, input_variables=['context', 'question'])
chain = RetrievalQA.from_chain_type(llm=llm,
                                   chain_type='stuff',
                                   retriever=retriever,
                                   chain_type_kwargs={'prompt': qa_prompt})

# Example question
question="How should windows be left before leaving the flat?"
response = chain.invoke({'query': question})
print(response['result'])

# response: Windows should be cleaned prior to departure.
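
For intuition about what the chain does between retrieval and generation, here is a simplified sketch of prompt "stuffing" in plain Python. It is a stand-in, not LangChain's actual code. The `build_prompt` helper, the shortened template and the two chunks are hypothetical, and joining chunks with a blank line is an assumption about the default behaviour.

```python
# Shortened prompt with the two placeholders the chain fills in.
TEMPLATE = """Use the following pieces of information to answer the user's question.
Context: {context}
Question: {question}
Helpful answer:"""

def build_prompt(chunks, question):
    """Concatenate the retrieved chunks and fill in the template."""
    # Assumption: chunks are joined with a blank line between them.
    context = "\n\n".join(chunks)
    return TEMPLATE.format(context=context, question=question)

# Hypothetical retrieved chunks (k=2), invented for this example.
retrieved = [
    "Chunk A: first passage returned by the retriever.",
    "Chunk B: second passage returned by the retriever.",
]
prompt = build_prompt(retrieved, "What do the passages say?")
print(prompt)
```

The model only ever sees this assembled text, which is why the chunk size, the number of retrieved chunks (`k`) and the model's `context_length` have to be chosen together: the stuffed prompt plus the generated answer must fit in the context window.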