In our previous article we delved into the value of GenAI for eDiscovery, and the areas in which we’re currently seeing it show the most promise. We also touched on the fact that, as powerful as AI is, it still requires human input to deliver high quality results.
Today, we’re going to take a closer look at what that input looks like, and the techniques that can be used to influence the relevance and precision of GenAI output.
How input influences output
At the risk of oversimplifying a complex and nuanced process – GenAI is ultimately about predicting the next word in a sequence. The relevance of that synthesised output is influenced primarily by two factors: context and prompt.
Context
GenAI can only formulate a response based on the information to which it is exposed. The amount of information a model can consider at once is known as its “context window”. If the model only sees a small piece of the picture – like looking through a very small window – its response will be similarly limited, with potentially important context left out. As GenAI evolves, these windows on the world are becoming larger. But in general terms, the larger they are, the more costly they are to use.
Exposing the GenAI model to as rich a dataset as possible will provide the most relevant material for it to review and analyse (greater context), resulting in a more meaningful and comprehensive synthesized response.
The challenge, therefore, is making sure that the relevant content is presented to – and fits within – the context window, whatever its size, through the use of intelligent keyword and concept search, along with any other filters at your disposal.
Prompts
The prompt a GenAI model receives also influences the output. In general, the more specific and bounded the request, the more relevant the response will be.
For example, a prompt like “Is the accused guilty?” is unlikely to deliver a useful response. A non-cognitive and non-sentient algorithm simply isn’t capable of this kind of open-ended deductive reasoning.
We see GenAI’s role being to distil information for human interpretation – whilst it may provide some level of interpretation to assist the reviewer of its output, it should not be the ultimate arbiter of that information itself. As such, a more responsive prompt might be, “Provide a timeline of the events, places and parties involved in the lead-up to incident X”.
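The contrast between an open-ended and a bounded prompt can be sketched as a simple template. The incident name and the document placeholder below are illustrative only.

```python
# Illustrative only: contrasting an open-ended prompt with a bounded one.
# "X" and the documents placeholder are stand-ins, not real case material.

open_ended = "Is the accused guilty?"  # too open-ended to yield a useful response

bounded = (
    "Provide a timeline of the events, places and parties involved "
    "in the lead-up to incident {incident}.\n"
    "Base your answer only on the documents provided below.\n"
    "Cite the source document for each timeline entry.\n\n"
    "Documents:\n{documents}"
).format(incident="X", documents="<retrieved documents go here>")
```

Constraining the model to the supplied documents and asking for citations keeps the output verifiable by a human reviewer, in line with the distil-for-interpretation role described above.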
This so-called “prompt engineering” is a whole discipline in itself.
Data security/leakage
Data security has become a hot topic for GenAI. Organisations are justifiably concerned about IP being fed into the GenAI engines and then leaking into the public domain.
To address this concern, many vendors are adopting closed models. GPT-3.5, GPT-4 and Claude are popular examples. These “prepackaged” LLMs (Large Language Models) are trained on vast quantities of content before being “sealed” for use. From there, they use this pretrained understanding of the world to synthesise a response based on the content they are exposed to (the context window) combined with the prompt they are given. No new content is added to the LLMs (barring a few exceptions that do allow further training), which means no proprietary content can be leaked.
Finetuning GenAI output
Grounding
The importance of context cannot be overstated, no less so for applications in eDiscovery. Whilst the LLM has a generic understanding based on its training, the quality and relevance of its synthesised responses can be improved by “grounding” it with use-case-specific information that did not form part of its training. This information is not trained into the LLM; rather, it complements the generic perspective the LLM would otherwise provide – in effect, providing boundaries to guide it.
Retrieval Augmented Generation
The primary technique for achieving this grounding in an eDiscovery context is known as Retrieval Augmented Generation. At the risk of oversimplifying, consider this as the process of using search to identify the most relevant content from a larger corpus and pass it to the LLM’s context window, along with the prompt. The search techniques used to locate this subset of data are themselves important, as the language and context provided by these search results ground the responses the LLM provides. Optimising the search by combining a range of keyword, conceptual and other search techniques will influence the overall output.
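A minimal sketch of the retrieval step might look like the following. The keyword-overlap scorer here is a toy stand-in for the real keyword and conceptual search techniques mentioned above, and the corpus entries are invented examples.

```python
# Minimal Retrieval Augmented Generation sketch. score() is a toy
# keyword-overlap ranker standing in for real keyword/conceptual search;
# the corpus entries are invented for illustration.

def score(query: str, doc: str) -> int:
    # Count query terms that appear in the document (toy keyword search).
    terms = set(query.lower().split())
    return sum(1 for t in terms if t in doc.lower())

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    # Return the k documents most relevant to the query.
    ranked = sorted(corpus, key=lambda d: score(query, d), reverse=True)
    return ranked[:k]

def build_prompt(query: str, corpus: list[str]) -> str:
    # Combine the retrieved documents (the grounding) with the prompt.
    context = "\n---\n".join(retrieve(query, corpus))
    return f"Using only the context below, answer: {query}\n\n{context}"

corpus = [
    "Minutes: parties met on 3 May to discuss the incident.",
    "Invoice for catering services, June.",
    "Email: timeline of events before the incident was disputed.",
]
prompt = build_prompt("timeline of events before the incident", corpus)
```

The resulting prompt – retrieved context plus the user’s question – is what gets passed to the LLM’s context window; everything outside the retrieved subset never reaches the model.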
Optimising LLMs for precision vs recall
The relevance of GenAI output is often described in terms of “accuracy”. Given that LLMs generate their results using probabilistic algorithms, data scientists generally prefer the terms “precision” and “recall”.
It’s important to note that, whilst both of these terms are based on relevance, their definition of relevance is very different. As such, an LLM may need to be optimised for one or the other, depending on the specific use case.
Precision
Precision (also known as positive predictive value) is the fraction of relevant instances among the retrieved instances. Its formula is:
precision = relevant retrieved instances / all retrieved instances
Optimising for precision is all about ensuring that the content the model surfaces is genuinely relevant to your query. This lends itself to the assessment of large and diverse data corpuses – i.e. ECA – where surfacing the most relevant material first helps you understand the concepts involved without drowning in noise.
Recall
Recall (also known as sensitivity) is the fraction of relevant instances that were retrieved. Its formula is:
recall = relevant retrieved instances / all relevant instances
Optimising for recall is all about making sure no relevant detail is missed. It lends itself to the later stages of the EDRM – for example, building the specifics of a case in preparation for court from content which has already undergone human review, where completeness matters more than filtering out noise.
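The two measures can be made concrete with a small worked example over sets of document IDs. The document IDs and counts below are invented for illustration.

```python
# Worked example of precision and recall over sets of document IDs.
# retrieved = what the search returned; relevant = ground-truth relevant docs.

def precision(retrieved: set, relevant: set) -> float:
    # Fraction of retrieved documents that are actually relevant.
    return len(retrieved & relevant) / len(retrieved)

def recall(retrieved: set, relevant: set) -> float:
    # Fraction of all relevant documents that were retrieved.
    return len(retrieved & relevant) / len(relevant)

retrieved = {"doc1", "doc2", "doc3", "doc4"}
relevant = {"doc1", "doc2", "doc5"}

p = precision(retrieved, relevant)  # 2 of 4 retrieved are relevant -> 0.5
r = recall(retrieved, relevant)     # 2 of 3 relevant were retrieved -> ~0.67
```

High precision with low recall means little noise but missed material; high recall with low precision means comprehensive coverage at the cost of reviewing irrelevant items – which is exactly the trade-off the use cases above pull in different directions.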
If all this technical discussion sounds daunting, fear not. Vendors generally don’t expect users to be data science experts, so you can expect to see a number of GenAI solutions or variants appearing that are tuned to specific applications rather than being ‘generalist’.
Call in the experts
GenAI is not a blunt instrument. It’s a precision tool that needs to be wielded expertly to deliver the best results. For most firms, achieving that level of AI expertise in-house is unrealistic in the short term.
That’s where a partner like Salient comes in, providing the specialist insight and skill you need to unlock the full potential of GenAI in your eDiscovery. Whether you’re looking for a full-service eDiscovery support partner or an expert extension of your own team, we’ll help you extract every ounce of time- and cost-saving potential from today’s latest and leading technology. Get in touch.
Read more of our series on practical AI for eDiscovery
Practical AI for eDiscovery: today, tomorrow and in future
We’re still a long way away from discovering where artificial intelligence will lead us. But preparing for that mysterious future shouldn’t stop us from making the most of what we have here and now.
1. Intelligent culling - a critical component of cost-effective eDiscovery
The most effective way to reduce data volumes for review and hosting is by using AI to intelligently cull irrelevant, duplicate and otherwise non-responsive data.
2. Using AI to improve inclusion
How do you find what you need when you don’t necessarily know what you’re looking for? With the help of AI and an investigative mindset, it’s possible to automatically expose leading indicators from within a much larger dataset.
3. Expectation vs reality: what Generative AI really offers eDiscovery
Is generative AI really the next frontier in eDiscovery? How much of the hype is grounded in reality? We explore practical applications for GenAI in eDiscovery.
4. Finetuning Generative AI for eDiscovery
AI may be powerful, but it still requires human input to deliver high quality results. We share our GenAI learnings around how context and prompt influence output and how GenAI output can be finetuned.