Token Efficient Augmented LLM


LARGE LANGUAGE MODELS, AI

6/7/2023, 2 min read

Throughout my explorations in Artificial Intelligence, I've often been drawn to how AI can augment and redefine gameplay, from unique dialogue to richer reactions and automation. This fascination led me to a side project built on Mineflayer, a JavaScript package for creating Minecraft bots. The bot I built was simple: it could interpret unstructured commands and perform the corresponding actions in the game, such as following me around, cutting down trees, and exploring. However, its range of actions was limited, and it had its share of shortcomings.

This rudimentary form of AI interaction in Minecraft was taken to the next level by MineCLIP. Leveraging large pre-trained video-language models as a learned reward function, MineCLIP's agent learning algorithm allowed for the creation of MineAgents. These agents could solve a variety of open-ended tasks specified in free-form language, all without any manually designed dense shaping reward. Groundbreaking as it was, the approach still required training the agent's model parameters.

Building on these foundations, the AI gaming world has now welcomed Voyager, the first Large Language Model (LLM) powered embodied lifelong learning agent in Minecraft. Voyager surpasses the capabilities of previous AI bots with its ability to continuously explore the world, acquire diverse skills, and make novel discoveries, all without human intervention.

Voyager's automatic curriculum maximizes exploration, while its ever-growing skill library stores and retrieves complex behaviors. An iterative prompting mechanism that incorporates environmental feedback, execution errors, and self-verification improves the programs it writes. Voyager interacts with GPT-4 via black-box queries, bypassing the need for model parameter fine-tuning. It shows exceptional proficiency in playing Minecraft, outperforming previous state-of-the-art approaches.
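To make the iterative prompting idea concrete, here is a minimal Python sketch of such a loop. It is not Voyager's actual code: the function names, the prompt layout, and the three stub callbacks (`fake_generate`, `fake_execute`, `fake_verify`) are all assumptions made for illustration, and a real setup would call an LLM and a Minecraft bot instead.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Attempt:
    code: str
    feedback: str  # execution error if there was one, else environment feedback
    success: bool


def iterative_prompting(
    task: str,
    generate: Callable[[str], str],
    execute: Callable[[str], Tuple[str, Optional[str]]],
    verify: Callable[[str, str], bool],
    max_rounds: int = 4,
) -> List[Attempt]:
    """Regenerate code until it runs without error and passes verification."""
    history: List[Attempt] = []
    prompt = f"Task: {task}"
    for _ in range(max_rounds):
        code = generate(prompt)
        env_feedback, error = execute(code)
        success = error is None and verify(task, env_feedback)
        history.append(Attempt(code, error or env_feedback, success))
        if success:
            break
        # Feed the failure back into the next prompt.
        prompt = (
            f"Task: {task}\nPrevious code: {code}\n"
            f"Environment feedback: {env_feedback}\nExecution error: {error}"
        )
    return history


# Hypothetical stubs standing in for the LLM, the game, and the self-check.
def fake_generate(prompt: str) -> str:
    if "Execution error" in prompt:
        return "mine('oak_log', tool='axe')"
    return "mine('oak_log')"


def fake_execute(code: str) -> Tuple[str, Optional[str]]:
    if "axe" not in code:
        return "", "no tool equipped"
    return "inventory: oak_log x1", None


def fake_verify(task: str, feedback: str) -> bool:
    return "oak_log" in feedback


history = iterative_prompting(
    "collect one oak log", fake_generate, fake_execute, fake_verify
)
for attempt in history:
    print(attempt.success, attempt.feedback)
```

With these stubs the first attempt fails with an execution error and the second succeeds. Note that every retry re-sends the task plus the previous attempt, which is part of why this style of loop is token-hungry.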

Voyager's advanced functionality, however, consumes a significant number of tokens. Enter ReWOO (Reasoning WithOut Observation), a framework that splits an augmented LLM into Planner, Worker, and Solver modules and has the potential to make Voyager considerably more token-efficient.

ReWOO could let Voyager's planning module run with far fewer tokens. It separates the reasoning process from external observations: the plan is drafted up front instead of re-prompting the model after every observation, which sharply reduces repeated prompts. That matters because planning is likely the most token-intensive part of Voyager.
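A rough back-of-the-envelope calculation shows where the savings would come from. The Python sketch below compares prompt tokens for an interleaved loop, which re-sends the context and all earlier steps on every call, with a decoupled setup that makes one planning call and one solving call. The cost model and every number in it are assumptions chosen for illustration, not measurements from Voyager or ReWOO.

```python
def interleaved_prompt_tokens(context: int, step: int, k: int) -> int:
    """Call i re-sends the context plus the i earlier steps (i = 0..k-1)."""
    return sum(context + i * step for i in range(k))


def decoupled_prompt_tokens(context: int, step: int, k: int, task: int) -> int:
    """One planner call with the context, one solver call with task + k steps."""
    planner = context
    solver = task + k * step
    return planner + solver


# Hypothetical sizes: 1500-token context, 200 tokens per step, 5 steps,
# 50-token task description.
print(interleaved_prompt_tokens(1500, 200, 5))    # 9500
print(decoupled_prompt_tokens(1500, 200, 5, 50))  # 2550
```

Under this simplified model the interleaved cost grows quadratically with the number of steps, while the decoupled cost grows linearly.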

The Worker module of ReWOO could be interfaced with Voyager's Skill Library. Voyager would then avoid passing unnecessary tokens through the LLM, making it leaner and more efficient to access and execute skills.
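As a sketch of how a Worker might sit in front of a skill library, the Python below runs a plan whose steps refer to earlier results through placeholders such as `#E1`. The skill names, the plan tuples, and the `run_workers` helper are hypothetical; a real skill library would hold executable bot programs, not lambdas that return strings.

```python
import re
from typing import Callable, Dict, List, Tuple

# Hypothetical skill library: skill name -> callable taking one string argument.
SKILLS: Dict[str, Callable[[str], str]] = {
    "find_block": lambda name: f"{name} at (12, 64, -3)",
    "mine_block": lambda target: f"mined {target}",
}

# Each step: (evidence id, skill name, argument that may reference evidence).
PLAN: List[Tuple[str, str, str]] = [
    ("#E1", "find_block", "oak_log"),
    ("#E2", "mine_block", "#E1"),
]


def run_workers(
    plan: List[Tuple[str, str, str]],
    skills: Dict[str, Callable[[str], str]],
) -> Dict[str, str]:
    """Execute plan steps in order, substituting earlier evidence into arguments."""
    evidence: Dict[str, str] = {}
    for evidence_id, skill_name, arg in plan:
        # Replace placeholders like #E1 with the evidence already collected.
        resolved = re.sub(r"#E\d+", lambda m: evidence[m.group(0)], arg)
        evidence[evidence_id] = skills[skill_name](resolved)
    return evidence


evidence = run_workers(PLAN, SKILLS)
print(evidence["#E2"])  # mined oak_log at (12, 64, -3)
```

No LLM call happens inside this loop, which is the point: the skills run on plain function calls, and only their results need to reach a model afterwards.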

Lastly, the Solver module could synthesize the plan and the collected evidence into actions. Each of these modules could potentially be run by a different LLM, so that each one uses the model best suited to it. This could even extend to fine-tuned models crafted for each task, replacing the blanket use of GPT-4 with more specialized, efficient models.
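One way to picture per-module model selection is a small routing table that maps each module to its own model client. The Python sketch below is only an assumption about how that could look: `big_model` and `small_model` are stubs returning canned text, not real API clients, and the prompt format is made up for illustration.

```python
from typing import Callable, Dict


# Hypothetical stand-ins for real model clients.
def big_model(prompt: str) -> str:
    return "#E1 = find_block[oak_log]\n#E2 = mine_block[#E1]"


def small_model(prompt: str) -> str:
    return "Action: walk to the log and mine it"


# Route each module to the model assumed to suit it best.
MODULE_MODELS: Dict[str, Callable[[str], str]] = {
    "planner": big_model,
    "solver": small_model,
}


def build_solver_prompt(task: str, plan: str, evidence: Dict[str, str]) -> str:
    """Combine the task, the plan, and the worker evidence into one prompt."""
    lines = [f"Task: {task}", "Plan:", plan, "Evidence:"]
    lines.extend(f"{key}: {value}" for key, value in evidence.items())
    return "\n".join(lines)


def solve(task: str, plan: str, evidence: Dict[str, str]) -> str:
    """Ask the solver's model for an action, given the plan and evidence."""
    return MODULE_MODELS["solver"](build_solver_prompt(task, plan, evidence))


task = "collect one oak log"
plan = MODULE_MODELS["planner"](f"Task: {task}")
action = solve(task, plan, {"#E1": "oak_log at (12, 64, -3)"})
print(action)
```

Swapping a model then means changing one entry in `MODULE_MODELS`, without touching the planner or solver logic.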

In essence, ReWOO could revolutionize the way AI functions in games like Minecraft. By enhancing efficiency and optimizing task allocation, it could usher in a new era of advanced, self-sufficient AI bots capable of unparalleled performance. The future of AI-driven gameplay is exciting indeed.


kss239@cornell.edu