
"A current challenge with LLMs is that they have a limited input size (aka context window) and often struggle with tasks that require a long context. The key idea of RLMs is, instead of passing the prompt directly to the LLM, to give the LLM access to a programming language such as Python. The LLM then generates code to manipulate the prompt and perform preprocessing such as breaking it into chunks or searching for regular expressions."
"While RLMs show strong performance on tasks beyond the context window limitations of existing LMs at reasonable inference costs, the optimal mechanism for implementing RLMs remains under-explored...Our results across multiple settings and models demonstrated that RLMs are an effective task-agnostic paradigm for both long-context problems and general reasoning. We are excited to see future work that explicitly trains models to reason as RLMs, which could result in another axis of scale for the next generation of language model systems."
"Although frontier LLMs often have very large context windows, users have noticed that once the context gets large, the models start to show context rot. That is, they struggle to recall data from the context. This is even more visible for needle in a haystack tasks: finding random facts from a large context. MIT designed the RLM to solve these problems."
Recursive Language Models (RLMs) give an LLM access to a programming-language interface so that, instead of reading the prompt directly, the model generates code to manipulate and preprocess it. The generated code performs operations such as chunking and regex search, and can recursively invoke additional RLM calls to decompose large prompts. RLMs thus form a programmatic, task-agnostic paradigm that works around context-window limitations and mitigates context rot on needle-in-a-haystack retrieval tasks. RLM implementations have shown strong performance at reasonable inference cost, often outperforming context compaction. Optimal implementation details remain under-explored, and training models explicitly to reason as RLMs could open a new axis of scaling.
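To make the idea concrete, here is a minimal sketch of the kind of preprocessing code an RLM might generate. The helper names (`chunk`, `grep`, `recursive_lm_call`) are hypothetical illustrations, not an API from the paper; the recursive call is stubbed out where a real system would invoke the LLM on the smaller sub-prompt.

```python
import re

def chunk(text, size=1000):
    """Split a long prompt into fixed-size pieces the model can inspect one at a time."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def grep(text, pattern):
    """Return the lines of the prompt matching a regular expression."""
    return [line for line in text.splitlines() if re.search(pattern, line)]

def recursive_lm_call(sub_prompt):
    # Hypothetical stub: a real RLM would invoke the LLM (possibly another
    # RLM instance) on this much smaller sub-prompt.
    return f"summary({len(sub_prompt)} chars)"

# Needle-in-a-haystack toy example: the full prompt would not fit in one
# context window, so the generated code filters it down first.
haystack = "\n".join(f"line {i}: filler" for i in range(10_000))
haystack += "\nline 10000: the secret code is 42"

matches = grep(haystack, r"secret code")
# Only the relevant slice is passed on to the (recursive) LM call.
answer = recursive_lm_call("\n".join(matches))
```

The point of the sketch is the division of labor: the surrounding program searches and slices the oversized prompt, and only small, relevant fragments ever reach a language-model call.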
Read at InfoQ