Motivation: Traditional BBO techniques struggle with multi-modality and task generalization
The position paper by [Son24P] advocates using LLM-based foundation models for Black Box Optimization (BBO). The goal of BBO is to optimize an objective function given only evaluations of that function (i.e., no gradients or other higher-order information). A common example of a BBO task is neural network architecture search, where the objective is to maximize classification accuracy over different architectures. Classical BBO approaches include grid search, random search, and Bayesian optimization.
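To make the setting concrete, here is a minimal random-search sketch on a toy objective; the search space, the `black_box_objective` stand-in, and all parameter names are illustrative assumptions, not taken from the paper.

```python
import random

def black_box_objective(x: dict) -> float:
    # Illustrative stand-in: in a real BBO task this would, e.g., train a
    # network with these hyperparameters and return validation accuracy.
    return -(x["lr"] - 0.01) ** 2 - 0.1 * abs(x["layers"] - 4)

def random_search(n_trials: int = 50, seed: int = 0):
    rng = random.Random(seed)
    best_x, best_y = None, float("-inf")
    for _ in range(n_trials):
        # Sample a candidate; the optimizer only ever sees f(x), no gradients.
        x = {"lr": 10 ** rng.uniform(-4, -1), "layers": rng.randint(1, 8)}
        y = black_box_objective(x)
        if y > best_y:
            best_x, best_y = x, y
    return best_x, best_y

print(random_search())
```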
More recent BBO algorithms typically try to incorporate inductive biases or priors into the search problem, e.g., domain knowledge, parameter constraints, or the search history. One particular goal of these approaches is meta-learning, i.e., developing algorithms that can automatically provide priors for tasks from different domains without additional task-specific training. However, constructing reliable priors that work across multiple tasks and can take in data from multiple modalities (values, text, images) is challenging.
Position: LLMs can process multi-modal data and be fine-tuned to different tasks
The central point made in this paper is that LLMs are a promising candidate for tackling this challenge (Figure 1). The key idea is to interpret BBO as a sequence-learning problem: given a search space $\mathcal{X}$ of hyperparameter settings $x \in \mathcal{X}$ and a history $h_{1:t-1}$ of previous settings $x_{1:t-1}$ with corresponding objective function values $y_{1:t-1}$, the goal is to predict the next element of the sequence, i.e., a new hyperparameter setting $x_t$.
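As a rough illustration of this sequence view, the snippet below serializes a toy history $h_{1:t-1}$ into a text prompt from which an LLM could be asked to propose $x_t$; the prompt format, the parameter names, and the commented-out `query_llm`/`parse_setting` calls are assumptions for illustration, not the encoding used in the paper.

```python
def history_to_prompt(history):
    # history: list of (x, y) pairs, i.e., h_{1:t-1} in the notation above.
    lines = ["Propose the next hyperparameter setting x to maximize y."]
    for i, (x, y) in enumerate(history, start=1):
        lines.append(f"x_{i}: lr={x['lr']:.4g}, layers={x['layers']} -> y_{i}={y:.4f}")
    lines.append(f"x_{len(history) + 1}:")  # the LLM continues the sequence here
    return "\n".join(lines)

history = [({"lr": 0.1, "layers": 2}, 0.71), ({"lr": 0.01, "layers": 4}, 0.83)]
prompt = history_to_prompt(history)
# next_x = parse_setting(query_llm(prompt))  # hypothetical LLM call and parser
print(prompt)
```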
Transformer-based LLMs excel at sequence learning and meet several critical requirements for a BBO foundation model:
- Multi-modality: They can process large amounts of data from various modalities.
- Pre-training: They can be pre-trained to acquire extensive world knowledge.
- Fine-tuning: They can be fine-tuned with task-specific information.
The workflow for using LLM-based foundation models for BBO is visualized in Figure 2.
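In code, that workflow amounts to a propose-evaluate-append loop; the sketch below assumes hypothetical `query_llm` and `black_box_objective` callables and is not the authors' implementation.

```python
def llm_bbo_loop(query_llm, black_box_objective, n_steps: int = 20):
    """Generic LLM-in-the-loop BBO: the LLM proposes x_t from the history,
    the black box returns y_t, and the pair extends the sequence."""
    history = []  # h_{1:t-1} as a list of (x_i, y_i) pairs
    for _ in range(n_steps):
        x = query_llm(history)       # LLM suggests the next setting x_t
        y = black_box_objective(x)   # only the function value is observed
        history.append((x, y))
    return max(history, key=lambda pair: pair[1])  # best (x, y) found
```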
The authors also give an overview of common techniques for BBO, summarizing the increasing capabilities as one moves from hand-crafted genetic algorithms, model-based BBO, and feature-based meta-learning to sequence-based, attention-based, token-based, and finally LLM-based algorithms (Table 1, paper Section 3.2).
Finally, the authors collect a set of challenges and open questions for BBO with LLMs (paper section 4). They argue that there is a need for
- better data representation and multi-modal datasets for training models on multi-modal tasks
- a common guideline or format for encoding BBO (meta-)data so it can be processed by LLMs (see the sketch after this list)
- large open-source evaluation datasets
- better generalization and customization of LLMs for different tasks
- new benchmarks for metadata-rich BBO to better test the capabilities of LLMs
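As an illustration of the second point (a shared encoding format), one might imagine a record like the following for a single trial plus its metadata; this schema is a hypothetical example, not a format proposed by the authors.

```python
# Hypothetical, assumed schema for one BBO trial with metadata.
trial_record = {
    "task": "neural architecture search (image classification)",
    "search_space": {
        "lr": {"type": "float", "log_scale": True, "range": [1e-4, 1e-1]},
        "layers": {"type": "int", "range": [1, 8]},
    },
    "observation": {"x": {"lr": 0.01, "layers": 4}, "y": 0.83},
    "metadata": {"dataset": "CIFAR-10", "budget": "1 GPU-hour"},
}
```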
This paper is an interesting read and provides a comprehensive overview of the limitations of classical BBO methods and the possibilities of Large Language Models for Black Box Optimization.