Researchers from Stanford and Cornell have presented a novel method dubbed EVAPORATE that can cut the inference cost of language models by 110x while producing results of higher quality. The method has been tested on 16 sets of documents spanning a variety of formats, themes, and attribute types. It identifies redundancies across documents and exploits them to increase efficiency.
The researchers note that language models are expensive to run over large document collections, citing as an example that conducting inference over 55 million Wikipedia pages would cost more than $100,000.
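A rough sanity check of that figure can be sketched with back-of-envelope arithmetic; the per-page token count and per-token price below are assumptions for illustration, not numbers from the paper.

```python
# Back-of-envelope estimate of the cost of LLM inference over Wikipedia.
# tokens_per_page and price_per_1k_tokens are assumed values; the paper's
# exact accounting may differ.

pages = 55_000_000                   # Wikipedia pages (from the article)
tokens_per_page = 1_000              # assumed average page length in tokens
price_per_1k_tokens = 0.002          # assumed API price in USD per 1k tokens

cost = pages * tokens_per_page / 1_000 * price_per_1k_tokens
print(f"${cost:,.0f}")               # on the order of the >$100k cited
```

Under these assumptions the estimate lands at $110,000, consistent with the "more than $100,000" figure the researchers cite.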
🤖 The EVAPORATE prototype system is powered by LLMs, and two possible implementation strategies are identified: prompting the LLM to directly extract values from documents, or prompting the LLM to synthesize code that performs the extraction. Code synthesis was far cheaper, but also less accurate than processing each document individually with the LLM. To close this gap, the team proposed EVAPORATE-CODE+, an extended code-synthesis implementation.
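The contrast between the two strategies can be sketched as follows. The `llm()` function is a stand-in stub for a real model API call, and the prompts and document format are invented for illustration; EVAPORATE's actual prompts differ.

```python
# Sketch of the two implementation strategies: direct extraction vs. code
# synthesis. llm() is a canned stub standing in for a real LLM API call.

def llm(prompt: str) -> str:
    """Hypothetical LLM call; returns fixed outputs for this demo."""
    if "extract the value" in prompt:
        return "1969-07-20"  # direct extraction: the LLM reads the document
    # code synthesis: the LLM returns a reusable extraction function
    return ("def extract(doc):\n"
            "    return doc.split('date_of_landing: ')[1].splitlines()[0]")

doc = "mission: Apollo 11\ndate_of_landing: 1969-07-20"

# Strategy 1 (direct): prompt the LLM once per document -- accurate but
# the token cost scales with the number of documents.
direct_value = llm(f"extract the value of 'date_of_landing':\n{doc}")

# Strategy 2 (code synthesis): prompt the LLM once for a function, then run
# that function over every document at near-zero marginal token cost.
code = llm(f"Write a Python function extract(doc) for 'date_of_landing':\n{doc}")
namespace = {}
exec(code, namespace)
code_value = namespace["extract"](doc)

print(direct_value, code_value)  # both recover the same attribute value
```

The trade-off the article describes follows directly: the synthesized function costs one LLM call regardless of corpus size, but any bug in it propagates to every document it processes, which is the accuracy gap EVAPORATE-CODE+ targets.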
By passing only a sublinear sample of the documents through the LLM, EVAPORATE-CODE+ beats state-of-the-art systems while reducing the number of tokens the LLM must process by 110x on average across the 16 evaluation settings of 10k documents each.
📝 By offering a practical method for automating the extraction of tables from semi-structured documents using LLMs, this work advances the state of the art in the data management community.