In the fast-paced world of business operations, having swift access to accurate and relevant data is a pivotal advantage. As such, enterprise search solutions have emerged as indispensable tools that facilitate the streamlined retrieval of data from diverse sources, whether internal databases or externally facing applications and websites. Today, the advancement of generative AI, especially large language models (LLMs) and Retrieval Augmented Generation (RAG), is revolutionizing the enterprise search paradigm. This article delves deep into the transformative roles of RAG and LLMs in enhancing enterprise search solutions, sharing first-hand experience and insights from our own building journey.
What is an enterprise search solution?
An enterprise search solution is designed to create an intuitive interface for users across roles and teams to conduct advanced, intelligent searches, thus enabling businesses to operate more efficiently and effectively. These solutions enable users to collect and update information from varied data sources, types, and formats. They also facilitate data indexing or archiving and offer intelligent search options such as autocomplete, ‘find similar’, and ‘rank by relevance’. Users can refine their searches using advanced filters for more targeted results. Crucially, these solutions also consider data governance and security, defining different user permissions for information access.
How RAG transforms the landscape of enterprise search
What is Retrieval Augmented Generation (RAG)?
Retrieval Augmented Generation, or RAG, refers to a dual-component framework: an information retrieval component, and a large language model (LLM) capable of generating answers in response to user queries.
Introduced by Meta AI researchers, this framework addresses two inherent challenges associated with LLMs: the potential for inaccurate answers, and the provision of outdated information. By providing LLMs with credible sources, RAG effectively curbs the hallucination issues of LLMs, enhancing the accuracy and trustworthiness of the generated content. It also empowers LLMs to tap into the most recent information via information retrieval, compensating for the static nature of LLMs’ parametric knowledge.
How does Retrieval Augmented Generation work in enterprise search?
RAG takes a user inquiry and retrieves a set of relevant documents from a given source (e.g. company data stored in a vector database). The retrieved documents, treated as context, are sent to the LLM together with the original query prompt. The LLM then generates an answer to the user inquiry based on information extracted from the retrieved documents.
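The retrieve-then-generate flow just described can be sketched in a few lines. This is a minimal illustration rather than production code; `vector_db.search` and `llm.generate` are hypothetical placeholders for whatever retrieval and generation backends are in use.

```python
def answer_query(query, vector_db, llm, top_k=3):
    """Minimal RAG sketch: retrieve context, then generate a grounded answer."""
    # 1. Retrieve the most relevant documents for the user's query.
    docs = vector_db.search(query, top_k=top_k)
    # 2. Assemble the retrieved documents into a context block.
    context = "\n\n".join(doc["text"] for doc in docs)
    # 3. Send context plus the original query to the LLM.
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )
    return llm.generate(prompt)
```

In practice `vector_db.search` would perform an embedding-similarity lookup and `llm.generate` would call the model API, but the division of labor is exactly as shown.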
Enhancing enterprise search: benefits of leveraging LLMs under the RAG framework
Precision is paramount in enterprise search solutions. With RAG, we can guide LLMs to retrieve content from the database first, taking both the retrieved information and the user’s query into consideration, and only then generate the answer. This effectively prevents LLMs from fabricating plausible-sounding responses and improves AI accuracy.
Another key benefit is the enhancement of semantic search. It delivers more relevant results than traditional keyword-based searches. Leveraging natural language processing and machine learning techniques, semantic search can comprehend user intent, content, and the context of the search. This makes the search results more meaningful to users.
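The ranking mechanics behind semantic search can be illustrated with a toy example. In production, an embedding model produces the vectors; here a simple bag-of-words vector stands in so the core operation, cosine similarity between query and document vectors, is visible end to end.

```python
import math
from collections import Counter

def embed(text):
    # Toy stand-in for a real embedding model: bag-of-words counts.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def rank_by_relevance(query, documents):
    # Return documents ordered by similarity to the query.
    q = embed(query)
    return sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
```

Swapping the toy `embed` for a trained embedding model is what turns this word-overlap ranking into true semantic search, since learned vectors place paraphrases near each other even when they share no keywords.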
RAG also expedites knowledge discovery. Rather than presenting searched documents directly to users, LLMs summarize and extract critical information from these documents, creating tailor-made answers to address users’ questions in a matter of seconds. This allows users to quickly access crucial knowledge and insights, enhancing productivity.
Behind the scenes: the creation of WIZ.AI’s enterprise search solution
WIZ.AI is building RAG-powered enterprise search solutions for different clients. The solution is a versatile enterprise copilot that helps both an enterprise’s employees and its customers retrieve the information they need.
Our approach is twofold. First, we develop a comprehensive vector database stocked with the most recent and relevant knowledge from the targeted enterprises. Second, we fine-tune pre-trained LLMs using specific, enterprise-focused knowledge and workflows. This enhances the LLMs’ domain comprehension and cultural alignment within individual organizations.
Throughout this journey of exploration, our product and R&D experts have gained invaluable insights and lessons, enriching our first-hand experience in the field.
The art of data pre-processing
There is diverse and valuable data to be incorporated into the database for LLMs’ information retrieval. This includes public data such as a company’s FAQs on their website, as well as internal data like customer service recordings, documentation, meeting notes, emails, and more.
To enhance LLMs’ response accuracy and user experience, we preprocess available data before incorporating it into the database. One strategy that has shown remarkable effectiveness is dividing lengthy documents into multiple question-answer pairs. These Q&A pairs are not randomly generated; rather, they are tailor-made for specific business scenarios. For queries that require precise responses, we label the answers and guide LLMs to adhere to the standard answers without generating their own interpretations.
### Long Q&A Example:

#### Question: What should I consider when choosing a savings account at a bank?

#### Answer: When selecting a savings account, it's important to consider several factors to ensure it meets your financial needs. Firstly, look at the interest rate offered, as this determines how much your savings will grow over time. A higher interest rate will yield more returns on your deposits. Secondly, be aware of any fees that may be associated with the account, such as monthly maintenance fees, minimum balance fees, or transaction fees, as these can significantly reduce your savings. Thirdly, check the accessibility of your funds. Some savings accounts offer ATM cards, online banking, and mobile banking apps for convenient access, while others may have more restrictive withdrawal policies. Additionally, consider the minimum balance requirements and whether you can maintain them to avoid fees. Lastly, ensure that the bank is insured by a reputable government agency, which in the United States would be the Federal Deposit Insurance Corporation (FDIC), providing protection for your funds up to a certain limit. By carefully evaluating these aspects, you can choose a savings account that best suits your saving goals and financial situation.

### Smaller Q&A Examples:

#### Q1: What is the importance of the interest rate on a savings account?
#### A1: The interest rate is crucial as it dictates the growth of your savings. A higher rate compounds over time, increasing your savings more significantly than a lower rate.

#### Q2: How do fees impact a savings account?
#### A2: Fees can diminish your savings, especially if they're frequent or large. Monthly maintenance, transaction, and minimum balance fees are common and should be considered when choosing an account.

#### Q3: What should I know about accessing my funds in a savings account?
#### A3: Accessibility varies; some accounts offer ATM cards and online banking for easy access, while others may limit withdrawals. Consider how often you'll need to access your funds before choosing an account.

#### Q4: Why is it important to consider minimum balance requirements for a savings account?
#### A4: Minimum balance requirements are important because failing to maintain them can result in fees or lower interest rates, which could negate the benefits of saving.

#### Q5: How does FDIC insurance affect my savings account?
#### A5: FDIC insurance protects your savings up to a certain limit if the bank fails, ensuring that your money is safe. It's a critical factor for the security of your funds in a savings account.
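Q&A pairs like those above then need to be indexed so that answers requiring precision can be flagged for verbatim use. The sketch below illustrates one way this might look; the field names and the exact-match lookup are illustrative placeholders (a real system would use embedding similarity over the question field).

```python
def build_index(qa_pairs):
    """Index pre-processed Q&A pairs, flagging answers that must be returned verbatim."""
    index = []
    for pair in qa_pairs:
        index.append({
            "question": pair["q"],
            "answer": pair["a"],
            # Queries needing precise responses are labeled so the LLM
            # adheres to the standard answer instead of paraphrasing.
            "standard_answer": pair.get("standard", False),
        })
    return index

def retrieve(index, query):
    # Placeholder lookup via exact question match; production systems
    # would rank by embedding similarity instead.
    for entry in index:
        if entry["question"].lower() == query.lower():
            return entry
    return None
```

At generation time, the `standard_answer` flag tells the pipeline whether to pass the answer through unchanged or to let the LLM rephrase it.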
Fine-tuning LLMs: tailoring large language models for bespoke services
During the fine-tuning process, we equip LLMs with enterprise-specific knowledge and workflows. An essential part of this process involves a thorough examination of our collected data to discover any underlying workflows. For instance, our data analysis revealed that human agents from a Singapore government entity typically start their responses by verifying applicants’ ages. This stems from the entity’s distinct procedures for adults (aged 16 and older) and children (under 16 years of age). Unearthing such hidden knowledge and workflows during the fine-tuning process is vital. We ensure that LLMs are trained to adopt these necessary behaviors before they start generating responses.
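A workflow like the age check above can also be made explicit as a pre-generation routing step rather than left implicit in the model's behavior. The sketch below is illustrative; the route names are hypothetical, while the 16-year threshold comes from the entity's distinct procedures for adults (16 and older) and children (under 16).

```python
def route_applicant(age):
    """Encode the age-verification workflow as an explicit pre-check."""
    if age is None:
        # Agents verify the applicant's age before anything else.
        return "ask_age"
    # Distinct procedures apply to adults (16+) and children (under 16).
    return "adult_flow" if age >= 16 else "child_flow"
```

Making such branches explicit in code, in addition to teaching them via fine-tuning, gives a deterministic guardrail for the cases where the procedure must never be skipped.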
Capability boundaries of LLMs
Additionally, we’ve made some noteworthy discoveries about limitations during our development of Retrieval Augmented Generation (RAG) applications for enterprise search. One such observation relates to the limited capacity of LLMs to understand and execute complex prompts. For example, we tested the prompt below:
# Global Instructions
...

# Context

## Context 1

### Question: What are the benefits of opening a savings account with a bank?

### Answer: Opening a savings account with a bank offers several benefits including earning interest on your deposits, which helps grow your savings over time. It also provides a safe place to keep your money, as savings accounts are typically insured by a government agency like the FDIC in the United States up to a certain amount. Additionally, having a savings account can encourage financial discipline through regular deposits and can offer conveniences like online banking, mobile deposits, and easy access to funds through ATMs or debit cards.

### Instruction: Please provide a detailed comparison of the interest rates, fees, and services associated with savings accounts from three different banks.

## Context 2
...
Here, the question, answer, and instruction belonging to Context 1 are all stored within the vector database (VDB). These elements are retrieved and amalgamated into the prompt. Our expectation is that the LLM applies the instruction from Context 1 during inference when addressing questions related to that context.
However, disappointingly, only GPT-4 could consistently execute these instructions as anticipated. Other models, including GPT-3.5 and all variants of Llama 2, failed to do so consistently. Our analysis is that such prompts require the LLM to possess advanced abstraction capabilities: the ability to translate complex prompt formats into actionable business logic and execute the derived logic effectively. Unfortunately, these abilities are difficult to acquire through fine-tuning alone, especially on smaller-scale models.
Nonetheless, our curiosity drives us to continuously explore new possibilities. One avenue under preliminary consideration is formalization: storing instructions within their context in a structured, VDB-based format, so that they are reliably and correctly applied when assembled into the prompt.