diff --git a/paper.md b/paper.md
index 30b9107..d6f0d8a 100644
--- a/paper.md
+++ b/paper.md
@@ -52,11 +52,11 @@ bibliography: paper.bib

# Summary

-We introduce with [CBXPy](https://pdips.github.io/CBXpy/) and [ConsensusBasedX.jl](https://pdips.github.io/ConsensusBasedX.jl/) Python and Julia implementations of consensus-based interacting particle systems (CBX) (generalising consensus-based optimization methods (CBO)) for global, derivative-free optimisation. The _raison d'être_ of our libraries is twofold. On the one side, to offer high-performance implementations of CBX methods that can be used outside of academic tests, while, on the other, to provide a general interface that can accommodate and be extended to other members of the CBX family, not just standard CBO. Python and Julia were selected as the leading high-level languages in terms of usage and performance, as well as their popularity among the scientific computing community. Both libraries have been developed with a common _ethos_, ensuring a similar API and core functionality, while leveraging the strengths of each language and writing idiomatic code.
+We introduce [CBXPy](https://pdips.github.io/CBXpy/) and [ConsensusBasedX.jl](https://pdips.github.io/ConsensusBasedX.jl/), Python and Julia implementations of consensus-based interacting particle systems (CBX), which generalise consensus-based optimisation methods (CBO) for global, derivative-free optimisation. The _raison d'être_ of our libraries is twofold: on the one hand, to offer high-performance implementations of CBX methods that can be used outside of academic tests; on the other, to provide a general interface that can accommodate and be extended to other members of the CBX family, not just standard CBO. Python and Julia were selected as the leading high-level languages in terms of usage and performance, as well as for their popularity among the scientific computing community. Both libraries have been developed with a common _ethos_, ensuring a similar API and core functionality, while leveraging the strengths of each language and writing idiomatic code.

# Mathematical background

-Consensus-based optimisation (CBO) is an approach to solve for a given (continuous) _objective function_ $f:\mathbb{R}^d \rightarrow \mathbb{R}$ the _global minimisation problem_
+Consensus-based optimisation (CBO) is an approach to solve, for a given (continuous) _objective function_ $f:\mathbb{R}^d \rightarrow \mathbb{R}$, the _global minimisation problem_

$$
x^* = \operatorname*{argmin}_{x\in\mathbb{R}^d} f(x),
@@ -64,7 +64,7 @@ $$

i.e., the task of finding the point where $f$ attains its lowest value. Such problems arise in a variety of disciplines including engineering, where $x$ might represent a vector of design parameters for a structure and $f$ a function related to its cost and structural integrity, or machine learning, where $x$ could comprise the neural network parameters and $f$ the empirical risk, which measures the discrepancy of the neural network prediction with the observed data.

-In some cases, so-called _gradient-based methods_ (those that involve updating a guess of $x^*$ by evaluating the gradient $\nabla f$) achieve state-of-the-art performance in the global minimisation problem. However, in scenarios where $f$ is _non-convex_ (when $f$ could have many _local minima_), where $f$ is _non-smooth_ ($\nabla f$ is not well-defined), or where the evaluation of $\nabla f$ is impractical due to cost or complexity, it is necessary to resort to _derivative-free_ methods. Numerous techniques exist for derivative-free optimisation, such as _random_ or _pattern search_ [@friedman1947planning;@rastrigin1963convergence;@hooke1961direct], _Bayesian optimisation_ [@movckus1975bayesian] or _simulated annealing_ [@henderson2003theory]. Here, we focus on _particle-based methods_, specifically, consensus-based optimisation (CBO), as proposed by @pinnau2017consensus, and the consensus-based taxonomy of related techniques, which we term _CBX_.
+In some cases, so-called _gradient-based methods_ (those that involve updating a guess of $x^*$ by evaluating the gradient $\nabla f$) achieve state-of-the-art performance in the global minimisation problem. However, in scenarios where $f$ is _non-convex_ (when $f$ could have many _local minima_), where $f$ is _non-smooth_ ($\nabla f$ is not well-defined), or where the evaluation of $\nabla f$ is impractical due to cost or complexity, _derivative-free_ methods are needed. Numerous techniques exist for derivative-free optimisation, such as _random_ or _pattern search_ [@friedman1947planning;@rastrigin1963convergence;@hooke1961direct], _Bayesian optimisation_ [@movckus1975bayesian] or _simulated annealing_ [@henderson2003theory]. Here, we focus on _particle-based methods_, specifically consensus-based optimisation (CBO), as proposed by @pinnau2017consensus, and the consensus-based taxonomy of related techniques, which we term _CBX_.

CBO uses a finite number $N$ of _agents_ (particles), $x_t=(x_t^1,\dots,x_t^N)$, to explore the landscape of $f$ without evaluating any of its derivatives (as do other CBX methods). At each time $t$, the agents evaluate the objective function at their position, $f(x_t^i)$, and define a _consensus point_ $c_\alpha$. This point is an approximation of the global minimiser $x^*$, and is constructed by weighting each agent's position according to the _Gibbs-like distribution_ $\exp(-\alpha f)$ [@boltzmann1868studien]. More rigorously,
@@ -76,7 +76,7 @@ c_\alpha(x_t) =
\omega_\alpha(\,\cdot\,) = \mathrm{exp}(-\alpha f(\,\cdot\,))
$$

-for some $\alpha>0$. The exponential weights in the definition favour those points $x_t^i$ where $f(x_t^i)$ is lowest, and comparatively ignore the rest, in particular for larger $\alpha$. If all the found values of the objective function are approximately the same, $c_\alpha(x_t)$ is roughly an arithmetic mean. Instead, if one particle is much better than the rest, $c_\alpha(x_t)$ will be very close to its position.
+for some $\alpha>0$. The exponential weights in the definition favour those points $x_t^i$ where $f(x_t^i)$ is lowest, and comparatively ignore the rest, particularly for larger $\alpha$. If all the observed values of the objective function are approximately equal, $c_\alpha(x_t)$ is roughly an arithmetic mean; if instead one particle attains a much lower value than the rest, $c_\alpha(x_t)$ will lie very close to its position. (A schematic implementation of the resulting method is sketched further below.)

Once the consensus point is computed, the particles evolve in time following the _stochastic differential equation_ (SDE)
@@ -118,7 +118,7 @@ CBX methods have been successfully applied and extended to several different set

In general, few implementations of CBO exist, and none have been designed with the generality of other CBX methods in mind. We summarise here the related software:

-Regarding Python, we refer to @duan2023pypop7 and @scikitopt for a collection of various derivative-free optimisation strategies. A very recent implementation of Bayesian optimisation is described by @Kim2023. PSO and SA implementations are already available [@miranda2018pyswarms;@scikitopt;@deapJMLR2012;@pagmo2017], They are widely used by the community and provide a rich framework for the respective methods. However, adjusting these implementations to CBO is not straightforward. The first publicly available Python packages implementing CBX algorithms were given by some of the authors together with collaborators. @Igor_CBOinPython implement standard CBO [@pinnau2017consensus], and @Roith_polarcbo provides an implementation of polarised CBO [@bungert2022polarized]. [CBXPy](https://pdips.github.io/CBXpy/) is a significant extension of the latter.
+Regarding Python, we refer to @duan2023pypop7 and @scikitopt for collections of various derivative-free optimisation strategies. A recent implementation of Bayesian optimisation is described by @Kim2023. Implementations of particle swarm optimisation (PSO) and simulated annealing (SA) are already available [@miranda2018pyswarms;@scikitopt;@deapJMLR2012;@pagmo2017]. They are widely used by the community and provide a rich framework for the respective methods. However, adjusting these implementations to CBO is not straightforward. The first publicly available Python packages implementing CBX algorithms were developed by some of the authors together with collaborators: @Igor_CBOinPython implements standard CBO [@pinnau2017consensus], and @Roith_polarcbo provides an implementation of polarised CBO [@bungert2022polarized]. [CBXPy](https://pdips.github.io/CBXpy/) is a significant extension of the latter.

Regarding Julia, PSO and SA methods are, among others, implemented by @mogensen2018optim, @mejia2022metaheuristics, and @Bergmann2022. PSO and SA are also included in the meta-library [@DR2023], as well as Nelder-Mead, which is a direct search method. One of the authors provided the first dedicated Julia implementation of standard CBO [@Bailo_consensus]; it has since been deprecated in favour of [ConsensusBasedX.jl](https://pdips.github.io/ConsensusBasedX.jl/), which offers additional CBX methods and a far more general interface.
@@ -134,7 +134,7 @@ Ultimately, a low-level interface (including documentation and full-code example

![CBXPy logo.](CBXPy.png){ width=50% }

-Most of the [CBXPy](https://pdips.github.io/CBXpy/) implementation uses basic Python functionality, and the agents are handled as an array-like structure. For certain specific features, like broadcasting-behavior, array copying, and index selection, we fall back to the `numpy` implementation [@harris2020array]. However, it should be noted that an adaption to other array or tensor libraries like PyTorch [@paszke2019pytorch] is straightforward. Compatibility with the latter, enables gradient-free deep learning directly on the GPU, as demonstrated in the documentation.
+Most of the [CBXPy](https://pdips.github.io/CBXpy/) implementation uses basic Python functionality, and the agents are handled as an array-like structure. For certain features, such as broadcasting behaviour, array copying, and index selection, we fall back to the `numpy` implementation [@harris2020array]. However, adaptation to other array or tensor libraries, such as PyTorch [@paszke2019pytorch], is straightforward. Compatibility with the latter enables gradient-free deep learning directly on the GPU, as demonstrated in the documentation.

The library is available on [GitHub](https://github.com/pdips/CBXpy) and can be installed via `pip`. It is licensed under the MIT license. The [documentation](https://pdips.github.io/CBXpy/) is available online.
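+
+As a minimal usage sketch (assuming the quick-start interface shown in the CBXPy documentation; the entry point `cbx.dynamics.CBO`, its keyword arguments `d` and `N`, and the method `optimize` should be checked against the current docs), minimising a quadratic objective in two dimensions looks roughly as follows:
+
+```python
+import numpy as np
+from cbx.dynamics import CBO  # assumed entry point for the standard CBO dynamic
+
+# Objective evaluated along the last array axis (the spatial dimension d),
+# matching the array-like handling of the agents described above.
+f = lambda x: np.linalg.norm(x, axis=-1) ** 2
+
+dyn = CBO(f, d=2, N=50)  # 50 agents in dimension d = 2 (illustrative values)
+x = dyn.optimize()       # approximation of the minimiser x* = 0
+```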
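+
+To make the connection with the mathematical background explicit, the following schematic NumPy sketch implements one iteration of standard CBO [@pinnau2017consensus]: the consensus point $c_\alpha$ with Gibbs-like weights $\exp(-\alpha f)$, followed by an Euler-Maruyama step of the SDE with drift strength $\lambda$, noise strength $\sigma$ and step size $\Delta t$. It illustrates the scheme only; it is not the code of either library, and all parameter values are illustrative.
+
+```python
+import numpy as np
+
+def consensus_point(x, f, alpha):
+    """Weighted average of the agents x, of shape (N, d), with weights exp(-alpha * f)."""
+    fx = f(x)                             # objective values, shape (N,)
+    w = np.exp(-alpha * (fx - fx.min()))  # shifting by min(f) avoids overflow; the ratio is unchanged
+    return (w[:, None] * x).sum(axis=0) / w.sum()
+
+def cbo_step(x, f, alpha=30.0, lam=1.0, sigma=0.8, dt=0.01, rng=None):
+    """One Euler-Maruyama step of the (isotropic) CBO dynamics."""
+    rng = np.random.default_rng() if rng is None else rng
+    c = consensus_point(x, f, alpha)  # current consensus point
+    drift = -lam * (x - c) * dt       # contraction towards the consensus
+    scale = sigma * np.linalg.norm(x - c, axis=1, keepdims=True)
+    return x + drift + scale * np.sqrt(dt) * rng.standard_normal(x.shape)
+
+# N = 100 agents in d = 2 minimising f(x) = |x|^2; the ensemble contracts towards x* = 0.
+f = lambda x: (x ** 2).sum(axis=-1)
+x = np.random.default_rng(0).normal(size=(100, 2))
+for _ in range(500):
+    x = cbo_step(x, f)
+```
+
+Both libraries build on this basic loop, adding practical features such as stopping criteria and alternative noise models; the two ingredients above, a consensus point and a stochastic update, are common to all CBX methods.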
@@ -151,7 +151,7 @@ The library is available on [GitHub](https://github.com/PdIPS/ConsensusBasedX.jl

RB was supported by the Advanced Grant Nonlocal-CPD (Nonlocal PDEs for Complex Particle Dynamics: Phase Transitions, Patterns and Synchronisation) of the European Research Council Executive Agency (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No. 883363) and by the EPSRC grant EP/T022132/1 "Spectral element methods for fractional differential equations, with applications in applied analysis and medical imaging".
KR acknowledges support from the German Federal Ministry of Education and Research and the Bavarian State Ministry for Science and the Arts.
-TR acknowledges support from DESY (Hamburg, Germany), a member of the Helmholtz Association HGF. This research was supported in part through the Maxwell computational resources operated at Deutsches Elektronen-Synchrotron DESY, Hamburg, Germany.
+TR acknowledges support from DESY (Hamburg, Germany), a member of the Helmholtz Association HGF. This research was supported in part through the Maxwell computational resources operated at Deutsches Elektronen-Synchrotron DESY, Hamburg, Germany. UV acknowledges support from the Agence Nationale de la Recherche under grant ANR-23-CE40-0027 (IPSO).
We thank the Lorentz Center in Leiden for their kind hospitality during the workshop "Purpose-driven particle systems" in Spring 2023, where this work was initiated.