Stanford Researchers Propose a New LLM Approach That Reduces Inference Cost

Researchers from Stanford and Cornell have presented a novel method, dubbed EVAPORATE, that cuts the inference cost of language models by 110x while producing higher-quality results. The method was evaluated on 16 document sets spanning a variety of formats, topics, and attribute types. It identifies redundancies across documents and exploits them to improve efficiency.
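To illustrate the amortization idea behind this kind of approach, here is a minimal, hypothetical sketch: rather than paying for an LLM call on every document, the expensive call is made once to produce reusable extraction code, which then runs cheaply over the whole document set. The `synthesize_extractor` function below mocks the LLM synthesis step with a fixed regex and is not EVAPORATE's actual implementation.

```python
import re

def synthesize_extractor(sample_doc: str):
    """Mock of the one-time, expensive synthesis step.

    In a real system, an LLM would inspect sample documents and emit
    extraction code; here we stand in a hand-written regex extractor.
    """
    pattern = re.compile(r"Date:\s*(\S+)")

    def extract(doc: str):
        m = pattern.search(doc)
        return m.group(1) if m else None

    return extract

# A small document set sharing a redundant structure (a "Date:" field).
docs = [f"Report {i}\nDate: 2023-0{i}-01\nBody..." for i in range(1, 4)]

# One expensive synthesis call, amortized across all documents.
extract = synthesize_extractor(docs[0])
dates = [extract(d) for d in docs]
print(dates)  # ['2023-01-01', '2023-02-01', '2023-03-01']
```

The efficiency gain comes from the structural redundancy: because the documents share a format, one synthesized extractor serves all of them, so the per-document cost is just cheap code execution instead of an LLM inference.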