Simple statistical tests selection based parallel computating method ensures the guaranteed global extremum identification

Viacheslav Kovtun; Torki Altameem; Mohammed Al-Maitah; Wojciech Kempa

doi:10.1016/j.jksus.2024.103165

View/Download PDF

Buy Reprints

PDF

Translate this page into:

05 2024

:36;

103165

doi:

10.1016/j.jksus.2024.103165

Simple statistical tests selection based parallel computating method ensures the guaranteed global extremum identification

Viacheslav Kovtun^{^a,⁎}, Torki Altameem^{^b}, Mohammed Al-Maitah^{^b}, Wojciech Kempa^{^c}

a Department of Computer Control Systems, Faculty of Intelligent Information Technologies and Automation, Vinnytsia National Technical University, Khmelnitske Shose str., 95, Vinnytsia 21000, Ukraine

b Computer Science Department, Community College, King Saud University, 11451 Riyadh, Saudi Arabia

c Department of Mathematics Applications and Methods for Artificial Intelligence, Faculty of Applied Mathematics, Silesian University of Technology, ul. Akademicka 2A, 44-100 Gliwice, Poland

⁎Corresponding author. kovtun_v_v@vntu.edu.ua (Viacheslav Kovtun),

Received: 2023-7-18, Accepted: 2024-3-13,

Disclaimer:
This article was originally published by Elsevier and was migrated to Scientific Scholar after the change of Publisher.

Abstract

The article proposes a parallel computing oriented method for solving the global minimum finding problems, in which continuous objective functions satisfy the Hölder condition, and the control parameters domain limited by continuous functions is characterized by a positive Lebesgue measure. A typical example of such a task is the discrepancy minimizing problem between the left and right parts of some large system of equilibrium equations (this is a usual situation when describing real process using Markov chain). The method is based on simple statistical tests, thanks to which, at each iteration, growing sets of potential global minima and sets of decrements necessary for estimating the values of the Hölder constants are formed. The article theoretically substantiates and empirically proves the guaranteed convergence of the authors’ method to the real global minimum, which occurs at an exponential rate. For the continuous iterations number, analytical upper estimates of the spacing between the potential global minima and real global minima are formalized, as well as an estimate of the probability of overcoming this spacing is formalized. The decrements sequence approximation, estimation of a priori unknown Hölder constants, estimation of the average number of iterations of the method and probabilistic characteristics of the final solution are analytically justified. In addition to the theoretical proof, the adequacy of the authors’ method has been confirmed empirically. It turned out that both the quality characteristics of the initial results calculated by the authors’ method and the time to obtain them are practically independent of the size of the search area. This expected result is a significant advantage of the authors’ method over analogues.

Keywords

Parallel computating

Applied mathematics

Optimization problem

Global extremum

Statistical tests method

Equilibrium

Show Related Articles from PubMed

Nomenclature

$q$: is the dimension of the vector $z$
$N_{i}$: is the volume of the set of stochastic vectors at the i-th iteration $(N_{i} = M_{i})$
$Z_{i} = \{z^{(1)}, \dots, z^{(N_{i})}\}$: is a set of stochastic, uniformly distributed $Z_{+}^{n}$ , vectors $z$
$Y_{i} = \{y^{(1)}, \dots, y^{({\tilde{N}}_{i})}\}$: is a set of admissible stochastic vectors belonging to the admissible set $L$ (see (4)), ${\tilde{N}}_{i} ⩽ N_{i}$
${\tilde{N}}_{i}$: is the volume of the set $Y_{i}$
$F_{i}^{*} = \min_{y \in Y_{i}} F (y)$: is the minimum of the objective function $F$ on the admissible set $Y_{i}$
$F^{(i)} = \{F_{0}^{*}, F_{1}^{*}, \dots, F_{i}^{*}\}$: is the cumulative set of potentially global minima
$δ_{k} = |F_{k}^{*} - F_{k - 1}^{*}|$: is the decrement of the set $F^{(i)}$ , $k = \bar{1, i}$
$Δ^{(i)} = \{δ_{1}, \dots, δ_{i}\}$: is the cumulative set of decrements; $w_{k} = \log δ_{k}$ , $k = \bar{1, i}$
$W^{(i)} = \{w_{1}, \dots, w_{i}\}$: is the accumulated set of logarithms of decrements
${\tilde{W}}^{(i)} (ρ) = \{{\tilde{w}}_{i - ρ}, {\tilde{w}}_{i - ρ + 1}, \dots, {\tilde{w}}_{i}\}$: is the accumulated set of logarithms of decrements of smaller $\log γ$ ; $n_{k} = \log {\tilde{N}}_{k}$ , $k = \bar{0, i}$
$N^{(i)} = \{n_{0}, n_{1}, \dots, n_{i}\}$: is the accumulated set of logarithms of the volumes of sets
$γ$: is the maximum acceptable error for calculating the values of decrements
$ρ$: is the applied number of elements of the accumulated sequence ${\tilde{W}}^{(i)} (ρ)$
${\hat{C}}^{(i)}$ , ${\hat{s}}^{(i)}$: are estimates of the values of the Hölder constants (see expression (24))
$ξ_{j}$: is a situation when $w_{j} ⩽ \log γ$
$r_{ϕ}^{+}$: is the upper estimate of the spacing from the current potentially optimal minimum $F_{i}^{*}$ , obtained on the $i = ϕ$ -th iteration of the method, to the real minimum $F^{*}$ (see expression (33))
$P_{r_{ϕ}^{+}}^{*}$: is the lower probability estimate (34)

1 Introduction

Optimization is the process of finding the point of the extreme value of a certain (target) function (Stracquadanio and Pardalos, 2019; McNaughton, 2023). It is one of the biggest cornerstones of applied mathematics, physics, engineering, economics, and industry. The scope of its application is vast and includes, for example, minimization of physical quantities at micro and macro levels, maximization of profit or efficiency of logistics chains, etc. Machine learning is also focused on optimization (Huang et al., 2022; Alexiadis, 2023): various regressions and neural networks try to minimize the discrepancy between predicted and real data.

Optimization is an active and relevant sector of scientific research. This sector includes a huge number of thematic fields and their extreme variations. The number of optimization problem statements and their solution methods has been growing like mushrooms after the rain for decades. For example, there is a huge field of Mixed-Integer Programming (Alfant et al., 2023; Zhang et al., 2023), which deals with discrete scenarios. Nondeterministic optimization (Zhang et al., 2022; Yang et al., 2019), based on stochastic principles, is known. There is robust optimization (Fransen and Langelaar, 2023; Castelli et al., 2023), in which fixed parameters are taken into account. The optimization of dynamic systems evolving in time (Yang et al., 2023; Xie et al., 2022) is known. There are various meta-heuristic methods (Marulanda-Durango and Zuluaga-Ríos, 2023; Hirsching et al., 2022): simulated annealing method, genetic algorithms, and swarm evolution methods. There are well-known approaches to optimization using fuzzy logic (Pei et al., 2023).

Let's focus on the global optimization problem of continuously differentiable functions (this will allow us to narrow the area of interest at least a little). The analysis involves evaluating the applicants in a certain quality criteria metric. In this context, first of all, we will mention such criteria as:

- global convergence (global convergence in optimization refers to the ability of an optimization algorithm to guarantee convergence to the global minimum or maximum of the objective function, regardless of the initial conditions. It ensures that the algorithm approximates the optimal solution across the entire search space, not just a local minimum or maximum. This property is crucial to avoid getting stuck in suboptimal solutions, ensuring the discovery of the best solution in the general case of the optimization problem);

- speed of local convergence (the speed of local convergence in optimization refers to the rate at which an optimization algorithm approaches the local minimum or maximum of the objective function near the convergence point. This characteristic determines how quickly the algorithm converges to the optimal solution in the vicinity of the current point, assessing the rate of decrease in the function value. A high speed of local convergence implies rapid approximation to the local optimum, which can be crucial for efficiently solving optimization problems);

- the dimension of the optimization problem, the solution of which is aimed at the method;

- the need to store matrices in the computer memory during the solution process (yes or no);

- use of the Hesse’s matrises in the solution process (yes or no);

- the need for scaling (yes or no) (scaling in optimization refers to the process of transforming variables or parameters in an optimization problem in such a way that they have a similar scale. This is important because different variables may have different orders of magnitude, which can hinder the convergence of optimization algorithms. Scaling helps balance the contribution of each variable, thereby facilitating the optimization process and enhancing the efficiency of converging to the optimal solution).

The first two criteria are analytical, the others characterize the aspects of the directly applied implementation of a particular method.

Let us pay closer attention to the first two of the mentioned criteria. An optimization method is said to have the property of global convergence if its iterations converge to a local minimizer regardless of the initial position. The speed of local convergence shows how quickly the method will find the minimizer after getting into the vicinity of the latter. Note that these two criteria are potentially competing. The so-called hybrid methods (Xu et al., 2023; Wu et al., 2023) are an actual answer to the conflict that manifests itself in an attempt to achieve simultaneous compliance with both the first and second criteria.

Now, having mentioned the main optimization method's qualitative criteria, we can go directly to the methods as such (at least, to the most used of them). Let’s recall the most used methods of smooth unconditional optimization (Ali et al., 2021; Smith and Nair, 2005): the lines method, the fastest descent method, the Newton method, quasi-Newton methods (incl, DFP, SR1, BFGS (Broyden, Fletcher, Goldfarb, Shanno) and its modifications: muffled, with limited memory, etc.), the nonlinear conjugate gradients method, the truncated Newton method. Note that from this list, only the fastest descent method and the truncated Newton method can claim (with limitations) to pass the first criterion (global convergence).

Of course, there are methods for which compliance with the first criterion was the dominant requirement at the creation time. We will distinguish three classes of such methods. The methods we refer to in the first class are focused on the configuration of the objective function and the admissible solutions domain. The most striking representative of this class is the DC functions minimizing method (Fang et al., 2012; Montano et al., 2022). In the method, both the objective function and the constraint functions for the admissible solutions domain are represented as the differences between two convex functions. A complete view of the evolution of this class of global optimization methods can be made by reading the works (Fang et al., 2012; Montano et al., 2022; Shilaja et al., 2022 ). Methods of the second class are focused on solving global optimization problems with simple configurations of admissible solutions regions (which have, for example, the shape of a parallelepiped) and objective functions characterized a priori by known Hölder constants (Chen and Zheng, 2021; Sovrasov, 2016). The same class can also include methods focused on reducing a multidimensional optimization problem to a one-dimensional one using Peano curves (Sovrasov, 2016). However, if the Hölder constants are unknown, the application of second-class methods is accompanied by considerable uncertainty. Finally, methods belonging to the third class focus on random search and the formalization of intellectualized heuristics to interpret its results (Gorawski et al., 2021).

We would like to separately mention Sub-gradient methods for non-smooth optimization (Tymchenko et al., 2019; Zaiats et al., 2019). Sub-gradient methods for non-smooth optimization are optimization techniques specifically designed to address problems where the objective function is not everywhere differentiable. Unlike traditional gradient methods applicable to smooth functions, sub-gradient methods use sub-gradients, a generalization of gradients, to navigate through optimization landscapes containing non-differentiable components. These methods are particularly useful in convex optimization scenarios involving functions with nonsmooth elements, such as those encountered in support vector machines and lasso regression. Sub-gradient methods iteratively adjust the solution in the direction of the sub-gradient, aiming to converge to a point where the sub-gradient approaches zero, indicating a potential solution to the non-smooth optimization problem. While they may exhibit slower convergence rates compared to methods for smooth optimization, sub-gradient methods are essential for effectively tackling problems with inherent nonsmoothness.

One of the essential problems accompanying the use of stochastic tests in the context of solving optimization problems is the generation of uniformly distributed random vectors in a certain region of the search space. Scientists are looking for ways to effectively apply this problem (Bisikalo and Kharchenko, 2023; Izonin et al., 2022). In this context, we note Hit-and-Run-like methods, Markov chains using methods, and methods that take into account relative entropy. However, a universal method has not yet been found.

Taking into account the strengths and weaknesses of the mentioned analogue methods, we will formulate the necessary attributes of scientific research.

Research object is a process of solving the global minimum finding problems, which continuous objective functions satisfy the Hölder condition, and the control parameters domain limited by continuous functions is characterized by a positive Lebesgue measure and is limited by multidimensional parallelepipeds.

Research subject includes functional analysis theory, probability theory and mathematical statistics, experiment planning theory, and computational methods.

Research aim is to formalize the process of guaranteed solutions to the optimization problems outlined by the research object in the form of a parallel computing oriented method.

Research objectives are:

- formalize the parallel computing oriented method of the guaranteed finding of the global solution of the optimization problem with the objective function and restrictions imposed by algorithmically calculated continuous functions that satisfy the Hölder condition (with a priori unknown values of the corresponding constants);

- formalize the process of generating sets of potentially global extrema as the results of simple statistical tests (as a basic element of each iteration of the method);

- formalize the probabilistic characteristics of the approximate solution, which is the result of the implementation of a finite number of iterations of the authors’ method;

- justify the adequacy of the proposed mathematical apparatus and demonstrate its functionality with an example.

Main contribution. The article proposes a parallel computing oriented method for solving the global minimum finding problems, in which continuous objective functions satisfy the Hölder condition, and the control parameters domain limited by continuous functions is characterized by a positive Lebesgue measure and is limited by multidimensional parallelepipeds. A typical example of such a task is the discrepancy minimizing problem between the left and right parts of some system of equilibrium equations. The method is based on simple statistical tests, thanks to which, at each iteration, growing sets of potential global minima and sets of decrements necessary for estimating the values of the Hölder constants are formed. The article theoretically substantiates and empirically proves the guaranteed convergence of the authors’ method to the real global minimum, which occurs at an exponential rate. For the continuous iterations number, analytical upper estimates of the spacing between the potential global minima and real global minima are formalized, as well as an estimate of the probability of overcoming this spacing is formalized. The decrements sequence approximation, estimation of a priori unknown Hölder constants, estimation of the average number of iterations of the method and probabilistic characteristics of the final solution are analytically justified.

- Statement of the research, which introduces general definitions and clarifies the purpose and objective of the research;

- Formalization of the parallel computing oriented method of the guaranteed finding of global extremum with simple statistical tests selection;

- Specific aspects of the applied application of the presented method;

- Sections devoted to demonstration and analysis of the results of the intended use of the proposed method.

2 Materials and methods

2.1

2.1 Statement of the research

The problem of finding the extremum of a function is to determine its largest (maximum) or smallest (minimum) value in a certain range of values of its arguments. The task of the boundaries of this area (as well as the rest of the additional conditions) is implemented in the form of a system of equations and (or) inequalities. In this case, we talk about the conditional problem of finding the extremum of a function or the problem of finding a local extremum. If the range of admissible values of the arguments is not limited, then we are dealing with the problem of finding the global extremum of the function or the optimization problem. Next, we will investigate the problem of finding the global extremum of the function $f (x)$ :

(1)

g l o b \min f (x)

for

x \in H_{x}

H_{x} = \{x : h (x) ⩽ 0, x \in R^{q}\}

h \in R^{r}

r < q

, functions

h (x)

are continuous. Function (1) satisfies the Hölder condition with constant

G > 0

v > 0

, i.e.

υ (t) = \max_{\begin{matrix} (\tilde{w}, \tilde{y}) \in R^{q} : \\ ∥\tilde{w} - \tilde{y}∥ ⩽ t \end{matrix}} |f (\tilde{w}) - f (\tilde{y})| ⩽ G t^{v}

We formalize the canonical form of the optimization problem (1):

(2)

g l o b \min F (z)

for

z \in L_{z}

L_{z} = H_{z} \cap Z_{+}^{q}

Z_{+}^{q} = \{z : 0 ⩽ z ⩽ 1\} \subset R^{q}

H_{z} = \{z : v (z) ⩽ 0, z \in Z_{+}^{q}\}

v \in R^{r}

r < q

The functional relationship between the optimization problems (1) and (2) can be expressed through the continuously differentiable and mutually unique transformation

(3)

z = T (x) x = 1 / T (z)

A functional instance of (3) is, for example, the transformation $z = 1 / (1 + \vec{\exp} (- c \otimes x))$ , $x = (- 1 / c) \otimes \vec{\ln} ((1 - z) / z)$ , where $\vec{\exp} (- ∙ \otimes ∙)$ and $\vec{\ln} (∙)$ - $n$ -dimensional vectors are formed by the components $\exp (°_{i} \otimes °_{i})$ and $\ln (- °_{i})$ , respectively.

In the problems of applied design, the area of determination of controlled parameters x is not the entire space $R^{q}$ , but its area is limited by the $q$ -dimensional parallelepiped $P_{x}$ . Let's redefine the optimization problem (1) taking into account this restriction:

(4)

\min_{x} f (x)

for

x \in P_{x} \cap H_{x}

P_{x} = \{x : c ⩽ x ⩽ d\}

c = \{c_{i}\}

d = \{d_{i}\}

i = \bar{1, n}

x \in R^{q}

;

H_{x} = \{x : h (x) = \{h_{j} (x), j = \bar{1, r}\} ⩽ 0, x \in R^{q}\}

h \in R^{r}

r < q

, where the generalized functions

h (x)

are continuous. In the context of the optimization problem (4), the transformation (3) is significantly simplified:

z = (x - c) \otimes (1 / (d - c))

Let the objective function $F (z)$ of the optimization problem (4) acquire a global minimum at the values of the argument $z^{*} \in Z^{*}$ : $F (z^{*}) = F^{*}$ . Taking into account the continuity of the transformation (3), we can state that the function $F (z)$ is bounded from below by $L_{z}$ , that is, there is a constant a for which the inequality $F^{*} ⩾ a = const$ holds.

In the general case, a is unknown, but $a \to 0$ in the context of specific optimization problems (for example, where the function $F (z)$ characterizes the discrepancy between the left and right parts of some system of equilibrium equations). At the same time, the function $F (z)$ also satisfies the Hölder condition, but with constants $K$ and $s$ , that is:

(5)

υ (g) = \max_{\begin{matrix} (w, y) \in L_{x} : \\ ∥w - y∥ ⩽ g \end{matrix}} |F (w) - F (y)| ⩽ K g^{s}

Therefore, the aim of our investigation is to create a parallel computing oriented method that will allow us to geting the global extreme $z^{*} \in Z^{*}$ of the objective function $F (z)$ on the set $L_{z}$ . The subject of our research will be the simple statistical tests method.

2.2

2.2 Formalization of the parallel computing oriented method of the guaranteed finding of global extremum with simple statistical tests selection

The basic element of the authors’ method is the elementary positive cube $Z_{+}^{n}$ mentioned in the formulation of the optimization problem (4). Each i-th iteration $(i = 0, 1, 2, \dots)$ of the method begins with the generation of a set $Z_{i}$ , which contains $N_{i}$ uniformly distributed and independent $Z_{+}^{n}$ stochastic vectors $\{z^{(1)}, \dots, z^{(N_{i})}\}$ . Next, among the calculated for the i-th set of the function $F (z^{(j)})$ values, $j = \bar{1, N_{i}}$ , the minimum value $F_{i}^{*}$ and the corresponding value of the argument $z_{i}^{*}$ are determined and fixed. Considering that the generation of the set $Z_{i}$ is carried out by the statistical tests method, the determined optimal value $F_{i}^{*}$ is stochastic. The transition to the (i+1)-th iteration is accompanied by a regulated increase in the number of stochastic vectors in the set $Z_{i + 1}$ : $N_{i + 1} > N_{i}$ . The result of execution of the (i+1)-th iteration is the set $\{F_{i + 1}^{*}, z_{i + 1}^{*}\}$ . The course of the iterative process is accompanied by the growth of accumulated sets $F^{*} = \{\bar{F_{i}^{*}}\}$ and $Δ = \{\bar{δ_{i + 1}}, δ_{i + 1} = F_{i}^{*} - F_{i + 1}^{*}\}$ , $i = 0, 1, 2, \dots$ . The iterative process is completed if during $ρ$ iterations the dynamics of changes in the values in the set $Δ$ does not exceed the certain threshold $γ$ :

(6)

ϕ = \min \{i : \max_{i - ρ ⩽ s ⩽ i} δ_{s} ⩽ γ\}

The output result of the method described above is the set $S = \{F_{ϕ}^{*}, z_{ϕ}^{*}, r_{ϕ}^{+}, P (ξ_{r_{ϕ}^{+}})\}$ , where $F_{ϕ}^{*}$ is the found approximate value of the global extremum of the objective function (4), $z_{ϕ}^{*}$ is the found approximate value of the argument of the objective function (4) for $F_{ϕ}^{*}$ , $r_{ϕ}^{+}$ is an estimate of the spacing between the approximate and actual values of the global extremum of the objective function (4), $P (ξ_{r_{ϕ}^{+}})$ is an estimate of the event $ξ_{r_{ϕ}^{+}} = \{F_{ϕ}^{*} - F^{*} ⩽ r_{ϕ}^{*}\}$ probability.

The set $Z_{i}$ is formed from a matrix of independent stochastic values of dimension $(q \times M_{i})$ , the elements of which are uniformly distributed over the segment $[0, 1]$ . The source of this matrix for the i-th iteration is a random number generator. This matrix is the basis for the formation of $N_{i}$ independent and uniformly distributed positive elementary parallelepiped stochastic vectors $Z_{i} = \{z^{(j)}\}$ , $j = \bar{1, N_{i}}$ . Let's examine the probabilistic characteristics of the set $Z_{i}$ .

Let's form the desired set $Z_{i}$ from the existing stochastic matrix of dimension $(q \times M_{i})$ in the following way: $N_{i} = M_{i}^{q}$ or $N_{i} = M_{i}$ , where the parameter $N_{i}$ characterizes the number of stochastic vectors in the set $Z_{i}$ , and the parameter $M_{i}$ characterizes the coordinates of $z_{j}^{(i)}$ the vector $z^{(i)}$ . Taking this $N_{i} = M_{i}$ , we formulate the rule for the formation of the set $Z_{i + 1}$ (taking into account the declared iterative increase in the volume of the latter: $N_{i + 1} > N_{i}$ ):

(7)

N_{i + 1} = α N_{i} M_{i + 1} = α M_{i} α > 1 i = 0, 1, 2, \dots

Let us divide the $q$ -dimensional parallelepiped $Z_{+}^{q}$ mentioned in (4) into unit cubes in each of the dimensions using a lattice with a uniform step

(8)

μ_{i} = M_{i}^{- q} 0 < φ ⩽ o ⩽ β < 1

where the parameter

o

characterizes the volume of the unit cube and the values of the boundary parameters

φ

β

should be chosen taking into account the results of the analysis of probabilistic characteristics of the set of global minima

F^{*}

and the set of their corresponding arguments

Z^{*}

We characterize as $P (N_{i}, q, o)$ the probability of such an situation that there is an cube that does not contain any of the $N_{i}$ vectors generated on the i-th iteration:

(9)

P (N_{i}, q, o) = {((1 + N_{i}^{o}) {(1 - N_{i}^{- o})}^{N_{i}})}^{q} = Q (i, q, o)

For $i \to \infty$ , expression (9) will take the form

(10)

Q (i, q, o) \sim Q (i, q, o) = N_{i}^{qo} \exp (- q N_{i}^{1 - o})

Let’s justify analytically the correctness of expressions (9), (10) in the context of the optimization problem (4). Let us characterize by the probability $P (A_{k})$ the situation $A_{k}$ when none of the $M_{i}$ generated independent and uniformly distributed $[0, 1]$ stochastic values falls into the interval $μ_{i}$ for the k-th dimension: $P (A_{k}) ⩽ R (i, q, o) = (1 + M_{i}^{- o}) {(1 - M_{i}^{- o})}^{M_{i}}$ where $(1 + M_{i}^{- o})$ is the upper limit of the number of segments formed as a result of dividing the interval $[0, 1]$ by a grid with a step $M_{k}^{- q}$ .

Considering that $\lim_{x \to \infty} \frac{{(1 - x^{- o})}^{x}}{\exp (- x^{1 - o})} = 1$ , the expression (10) can be transformed into the form $P (A_{k}) ⩽ R (i, q, o) \underset{i \to \infty}{\sim} M_{i}^{o} \exp (- M_{i}^{1 - o})$ .

Let us characterize B the situation when an $o$ -dimensional elementary parallelepiped is found that does not contain any of the $N_{i}$ generated vectors, i.e.: $B = ⋂_{k = 1}^{o} A_{k}$ . Since an independent set of stochastic values $M_{i}$ is generated for each i-th dimension, then for $i \to \infty$ we get

(11)

P (B) = P (N_{i}, q, o) = {(P (A_{i}))}^{o} ⩽ {((1 + N_{i}^{o}) (1 - N_{i}^{- o}))}^{o}

Expression (11) is identical to the left side of expression (9). Therefore, with a sufficiently large number of iterations i, at least one of the stochastic vectors $N_{i}$ from the set $Z_{i}$ generated on the i-th iteration of the method falls into each unit cube with a side $M_{i}^{- o}$ with a probability not less than $(1 - Q (i, q, o))$ .

We established that for optimization problem (4), both the set of global minima v and the set of $Z^{*} = \{z_{0}^{*}, \dots, z_{i}^{*}, \dots\}$ (their corresponding arguments) are not empty. Let us focus attention on the set of global minima $F^{*}$ and investigate its probabilistic characteristics.

Suppose, $Z^{*}$ be the set of arguments of the global estrema points of the function $F$ on the unit cube. Let’s define the spacing $l$ from a sufficiently compact set $Z^{*}$ to an arbitrary point $z$ as

(12)

l (z, Z^{*}) = \min_{y \in Z^{*}} ∥z - y∥

Suppose that the point $\hat{z} \in Z_{i}$ is the closest to the $Z^{*}$ in the context of the spacing (12). As we proved earlier, with probability $(1 - Q (i, q, o))$ at least one of the points of the set $Z_{i}$ falls into the unit cube with side $μ_{i}$ . Taking into account this fact, we redefine the expression (12) for the point $\hat{z}$ : $∥z^{*} - \hat{z}∥ ⩽ μ_{i} \sqrt{q} / 2 = g_{i}$ . This expression characterizes the situation when the point $z^{*}$ is located in the center of the unit cube, and the point $\hat{z}$ is located at one of its vertices. Accordingly: $|F^{*} - F (\hat{z})| ⩽ υ (g_{i})$ , where $υ (g_{i})$ is the modulus of continuity of the function $F (z)$ characterized by expression (5). On the other hand: $F^{*} = F (z^{*}) = \min_{z \in Z_{+}^{q}} F (z) ⩽ \min_{1 ⩽ s ⩽ i} F_{s}^{*} < F (\hat{z})$ . Let's generalize the last two expressions and get $- υ (g_{i}) ⩽ F^{*} - F (\hat{z}) ⩽ \min_{z \in Z_{+}^{q}} F (z) - \min_{1 ⩽ s ⩽ i} F (z_{s}) ⩽ 0$ or

(13)

|\min_{z \in Z_{+}^{q}} F (z) - \min_{1 ⩽ s ⩽ i} F (z_{s})| = |F^{*} - F_{i}^{*}| ⩽ υ (g_{i})

Inequality (13) is fulfilled with a probability not less than $(1 - Q (i, q, o))$ . Relying on inequality (13), it can be asserted that for the function of the form (5) there are such positive constants $K$ , $s$ that the estimate

(14)

r_{i} (o) = K {(\sqrt{q} / 2)}^{p} N_{i}^{- o p} ⩾ |F^{*} - F_{i}^{*}|

is reliable with a probability not less than

(1 - Q (i, q, o))

. The estimate (14) depends on the values of the parameters

o

and

φ

β

entered in (8).

Operating with probabilistic characteristics (10), (14) it is possible to substantiate the convergence of the parallel computing oriented method proposed at the beginning of Section 2.2. Let's take a sufficiently small positive number $ξ$ and find the value of the parameter $L (ξ) = \min \{i : r_{i} ⩽ ξ\}$ (the value of the estimate $r_{i}$ is determined according to (14)). For $i < L (ξ)$ , the situation $\{|F^{*} - F_{i}^{*}| > ξ\} \subset \{|F^{*} - F_{i}^{*}| > r_{i}\}$ is relevant. The probability of realization of this situation in the context of expressions (9), (14) is characterized by the expression

(15)

P \{|F^{*} - F_{i}^{*}| > ξ\} ⩽ Q (i, q, o)

Based on expression (7), $α > 1$ , $i = 1, 2, \dots$ we write

(16)

M_{i} = α^{i} M_{0}

We present the estimate of the probability characteristic (9) as

(17)

Q (i, n, o) = (α^{qoi} M_{0}^{qo}) \exp (- (q / 2) α^{(1 - o) i} M_{0}^{1 - o}) \exp (- (i (1 - o) / 2) M_{0}^{1 - o})

Considering (16), we note that in expression (17) the third multiplier is greater than the second in terms of value. Taking this into account, as well as $x = α^{i} M_{0}$ , we present the expression (17) in a compact form: $Q (i, q, o) ⩽ \tilde{Q} (- (i (1 - o) / 2) M_{0}^{1 - o})$ , where $\tilde{Q} = \max_{x ⩾ 0} x^{qo} \exp (- q x^{1 - o} / 2)$ . The result allows for $i > L (ξ)$ to estimate the probability (15) as

(18)

P \{|F^{*} - F_{i}^{*}| > ξ\} ⩽ \tilde{Q} \exp (- A i)

where

A = (1 - o) M^{1 - o} / 2

. Analyzing expression (18)

i > L (ξ / 2)

, we write

(19)

P \{\sup_{s ⩾ i} |F_{i}^{*} - F_{s + i}^{*}| > ξ\} ⩽ 2 P \{\sup_{s ⩾ i} |F^{*} - F_{s}^{*}| > \frac{ξ}{2}\} ⩽ 2 Q \sum_{s = i}^{\infty} \exp (- A s)

Examining the limiting values of (19) for sufficiently small positives $ξ$ , we obtain

(20)

\lim_{i \to \infty} P \{\sup_{s ⩾ 0} |F_{i}^{*} - F_{i + s}^{*}| ⩾ ξ\} = 0

Expression (20) can be interpreted as a necessary and sufficient condition for the convergence of the set $F^{*} = \{F_{i}^{*}\}$ , $i = 0, 1, 2, \dots$ . Expression (19) allows us to state that the speed of this convergence is exponential.

Based on the proven conclusion regarding the convergence of the $F^{*}$ to the global extrema of the optimization problem (4) value, it can be predicted that the set of arguments $z_{i}^{*}$ is guaranteed to converge to the set $Z^{*}$ in the spacing metric (12). Suppose that there is a subset $\{z_{k}^{*}\}$ that converges to the $\overset{⌣}{z} \notin Z^{*}$ point with non-zero probability. This (considering the continuity of the $F$ function) means that the $\lim_{i \to \infty} F (z_{i}^{*}) = F (\overset{⌣}{z}) > F^{*}$ inequality holds with nonzero probability. But, relying on (20), we can write: $\lim_{i \to \infty} F (z_{i}^{*}) = \lim_{i \to \infty} F_{i}^{*} = F^{*} = F (\overset{⌣}{z})$ . In the resulting contradiction, it is the second construction that is analytically justified (see the sequence of obtaining expression (20)). Therefore, the hypothesis that the set of arguments $z_{i}^{*}$ is guaranteed to coincide with the set $Z^{*}$ in the spacing metric (12) can be considered proven. Analyzing expression (20), we can conclude that the set of decrements $Δ$ introduced at the beginning of Section 2.2 converges to zero.

2.3

2.3 Specific points of applied use of the authors’ method

The iterative process of solving the optimization problem (4) by the authors’ method is accompanied by both the calculation of the elements of the set of decrements $Δ$ and the determination of the constants $K$ and $s$ (to comply with the Hölder condition (5)). Let us approximate the sequence of decrements $Δ$ and formalize the process of estimating the values of the constants $K$ , $s$ .

We adapt condition (14) for functions that satisfy the Hölder condition. For the i-th iteration, we get

(21)

|F_{i}^{*} - F^{*}| \approx K {(\sqrt{q} / 2 N_{i}^{o})}^{s} = r_{i} (o)

and for the i+1-th iteration, we get

(22)

|F_{i + 1}^{*} - F^{*}| \approx K {(\sqrt{q} / 2 N_{i + 1}^{o})}^{s}

Subtract expression (22) from expression (21) taking into account rule (7) and that $α^{- o s} < < 1$ : $δ_{i} \approx K (\sqrt{q} / 2 N_{i}^{o}) (1 - α^{- o p}) \sim K (\sqrt{q} / 2 N_{i}^{o}) = {\tilde{δ}}_{i} \partial$ .

After differentiating the obtained expression, we write

(23)

w_{i} = \log {\tilde{δ}}_{i} = \log K - s (o \log N_{i} - \log (\sqrt{q} / 2))

Denoting $C^{(i)} = \log K$ , $n_{i} = \log N_{i}$ , ${\tilde{n}}_{i} (o) = o n_{i} - \log (\sqrt{q} / 2)$ , we present the expression (23) as

(24)

w_{i} = C^{(i)} - s^{(i)} {\tilde{n}}_{i} (o) + ψ_{i}

where

ψ_{i}

is a set of stochastic values with zero mathematical expectation and variance

V

, and

n_{i}

is considered sufficiently large.

The constants $K$ , $s$ are related to the parameters $C^{(i)}$ , $s^{(i)}$ using expression (24). Estimates ${\hat{C}}^{(i)}$ ${\hat{s}}^{(i)}$ can be determined by the “Least Squares” method based on the current values of the elements of sets $\{w_{i}\}$ , $\{n_{i}\}$ . At the same time, we take into account ${\hat{K}}^{(i)} (o) = c^{C^{(i)}}$ , where $c$ is the base of the logarithm. Considering the determined estimates of the constants $K$ , $s$ , we present expression (21) as

(25)

r_{i} (o) = {\hat{K}}^{(i)} (o) {(\sqrt{q} / 2 N_{i}^{o})}^{{\hat{s}}^{(i)} (o)}

Expression (25) allows us to estimate the spacing $r_{i}$ to the exact global extrema value, taking into account the unit cube volume o, characteristic of the i-th iteration of the parallel computing oriented method.

Now let's analytically estimate the average number of iterations until the realization of the situation $ξ$ when condition (6) is fulfilled. Suppose that this situation is realized on the ϕ-th iteration. Let's introduce a positive integer variable $T$ for which $ξ = \{ϕ > T\}$ . For the situation $ξ$ , we can write $ξ \subset ⋃_{s = T - ρ}^{T} \{w_{s} > γ\}$ , based on which we will get

(26)

P (ξ) ⩽ \sum_{s = T - ρ}^{T} P \{w_{s} > \log γ\} ⩽ (ρ + 1) \max_{T - ρ ⩽ s ⩽ T} P \{w_{s} > \log γ\}

Let us present expression (24) in the context of the Bienaime inequality:

(27)

P \{w_{s} > \log γ\} = P \{ψ_{s} > \log γ - {\hat{C}}^{(T)} + {\hat{s}}^{(T)} s\} ⩽ {\hat{V}}^{(T)} / {(\log γ - {\hat{C}}^{(T)} + {\hat{s}}^{(T)} s)}^{2}

Substitute expression (27) into expression (26) and get

(28)

\begin{matrix} P (ξ) = (ρ + 1) \max_{T - ρ ⩽ s ⩽ T} {\hat{V}}^{(T)} / {(\log γ - {\hat{C}}^{(T)} + {\hat{s}}^{(T)} s)}^{2} ⩽ \\ ⩽ (ρ + 1) {\hat{V}}^{(T)} / {(\log γ - {\hat{C}}^{(T)} + {\hat{s}}^{(T)} (T - ρ))}^{2} = I (1 / T^{2}), \end{matrix}

where

{\hat{V}}^{(T)}

is an estimate of the variance

V

of the stochastic set

ψ_{i}

Based on the terminology used, we express the average number of iterations of the parallel computing oriented method before the event $ξ$ occurs as

(29)

E \{ϕ\} = \sum_{m = 1}^{\infty} m P \{ϕ = m\}

Let's open the expression (29) for $T > 2$ :

(30)

\begin{matrix} \sum_{m = 1}^{T} m P \{ϕ = m\} = \sum_{m = 2}^{T} m (P \{ϕ > m - 1\} - P \{ϕ > m\}) = \\ = \sum_{m = 2}^{T} (m - 1) P \{ϕ > m - 1\} - \sum_{m = 2}^{T} m P \{ϕ > m\} + \\ + \sum_{m = 2}^{T} P \{ϕ > m - 1\} = \sum_{m = 1}^{T - 1} P \{ϕ > m\} - T P \{ϕ > T\} . \end{matrix}

Taking expression (28) into account, we write:

(31)

\lim_{T \to \infty} T P \{ϕ > T\} ⩽ \lim_{T \to \infty} T I (1 / T^{2}) = 0

Consider inequality (31) when passing to the limiting form of expression (30) for $T \to \infty$ :

(32)

E \{ϕ\} = \sum_{T = 1}^{\infty} P \{ϕ > T\} ⩽ I (\sum_{T = 1}^{\infty} 1 / T^{2})

Expression (32) is an estimate of the average number of iterations of the method before it stops at the ϕ-th iteration. Assume that condition (32) is fulfilled. Based on expressions (14), (18), we can conclude that with a sufficiently large $i$ , probability that at least one of the values of $F_{ϕ - ρ}^{*}$ , $F_{ϕ - ρ + 1}^{*}$ , …, $F_{ϕ}^{*}$ is not in the vicinity of the exact global minimum of the optimization problem (4) tends to zero when the situation $ξ$ arises.

When the situation $ξ$ occurs on the ϕ-th iteration, it is possible to calculate the estimates ${\hat{K}}^{(ϕ)} (o)$ , $s^{(ϕ)} (o)$ using expression (25). In turn, these estimates can be used to estimate the spacing of the found global minimum $F_{ϕ}^{*}$ from the exact value $F^{*}$ . Taking into account expressions (14), (25), we write:

(33)

|F_{ϕ}^{*} - F^{*}| ⩽ r_{ϕ}^{+}

r_{ϕ}^{+} = \max_{0 < φ ⩽ o ⩽ β < 1} r_{ϕ} (o)

Taking this into account, we characterize the lower limit of the probability of occurrence of the situation $ξ_{r_{ϕ}^{+}}$ from expression (6) as

(34)

P_{r_{ϕ}^{+}}^{*} = \min_{0 < φ ⩽ o ⩽ β < 1} (1 - N_{ϕ}^{qo} \exp (- q N_{ϕ}^{1 - o}))

3 Results

The authors’ method is focused on finding the global minimum of the optimization problem (4), which is an adaptation of the canonical form of the optimization problem (2). We algorithmically implement the method as a composition of two functions. The function $MinGen ()$ implements the generation of a set of local minima $Z_{i}^{*}$ and a set of decrements $δ_{i}$ , as well as checking the fulfilment of the stop condition (6). The function $MinEst ()$ implements the quality check of the found extrema using the probabilistic characteristic (34), which allows for determining the global minimum. The search for the extremum is implemented iteratively, and at each i-th iteration, $s$ computational operations are performed. For the convenience of readers, all variables used below are presented in the Nomenclature at the end of the article.

At each i-th iteration of the application of the function $MinGen ()$ , the following actions are performed:

1. Using the stochastic tests method a pool of stochastic vectors $Z_{i} = \{z^{(1)}, \dots, z^{N_{i}}\}$ , $N_{i} = α N_{i - 1}$ is formed;

2. From the pool $Z_{i}$ , stochastic vectors belonging to the admissible set $L$ are selected. As a result, a set $Y_{i}$ with capacity ${\tilde{N}}_{i}$ is formed;

3. The values of the objective function (4) on the set of admissible points $y^{(i)} \in Y$ are calculated: $F (y^{(k)})$ , $k = \bar{1, {\tilde{N}}_{i}}$ ;

4. From the values $F (y^{(k)})$ calculated at stage 3, the smallest one is chosen: $\min_{1 ⩽ k ⩽ {\tilde{N}}_{i}} F (y^{(k)})$ ;

5. The decrement $δ_{i} = |F_{i}^{*} - F_{i - 1}^{*}|$ and derived parameters $w_{i} = \log δ_{i}$ , $n_{i} = \log N_{i}$ are calculated.

Stages 1–5 are repeated for $ρ$ iterations (as long as the condition $w_{i} < γ$ is fulfilled). After performing $ρ$ iterations, the function $MinGen ()$ completes its work. The output parameter of the function is a tuple of storage sets $〈F, Δ, W, N〉$ of capacity $ρ$ each. Next, the function $MinEst ()$ comes into action, which for the input tuple $〈F, Δ, W, N〉$ implements the following structured calculation procedure:

1. Estimates ${\hat{C}}^{(i)} π$ , ${\hat{s}}^{(i)}$ are calculated for sets $N$ , $W$ by the least squares method (see expression (24));

2. Such quality indicators of the found solution as the spacing $r_{ϕ}^{+}$ (see expression (33)) and the probability characteristic $P_{r_{ϕ}^{+}}^{*}$ (see expression (34)) are calculated.

The algorithmic constructions described above were implemented in the Python programming environment with a focus on supporting GPU Programming with Python and CUDA parallel computing technologies, the description of which is freely available at the link https://github.com/PacktPublishing/Hands-On-GPU-Programming-with-Python-and-CUDA. The created software was based on a software platform consisting of Windows, Python 2.7, Anaconda 5, CUDA 10 (incl. cuBLAS, cuSolver), PyCUDA, Scikit-CUDA. The hardware configuration of the test bench included a 64-bit Intel CPU, 32 Gigabytes of RAM, NVIDIA GPU RTX 3070.

The plan of experiments is designed to empirically confirm the adequacy of the presented method, as well as to obtain data for evaluating its effectiveness.

Empirical confirmation of the adequacy of the authors’ method was carried out based on the content of the Optimization Test Problems section of the specialized resource Virtual Library of Simulation Experiments: Test Functions and Datasets, freely available at the link https://www.sfu.ca/∼ssurjano/optimization.html.

Among the instances available in the Optimization Test Problems section, we selected ten optimization problems, the of admissible solutions domains of which are limited by $P_{x}$ -shaped parallelograms. Here is a list of selected problems with the values of the corresponding controlled parameters and the output results $x^{*}$ , $f^{*}$ in the context of the terminology used in the article:

T1: $f (x) = \sum_{i = 1}^{k} 1 / (a_{i} + \sum_{j = 1}^{q} {(x_{j} - c_{ji})}^{2})$ . For $k = 5$ , $q = 4$ , $A = (0, 1; 0, 2; 0, 2; 0, 4; 0, 4)$ , $C = (\begin{matrix} 4 & 1 & 6 & 8 & 3 \\ 4 & 1 & 6 & 8 & 7 \\ 4 & 1 & 6 & 8 & 3 \\ 4 & 1 & 6 & 8 & 7 \end{matrix})$ we have: $x^{*} = 4, 000$ , $f^{*} = - 10, 153$ ;

T2: $f (x) = (\sum_{i = 1}^{q} x_{i}^{4} - 16 x_{i}^{2} + 5 x_{i}) / 2$ . For $q = 4$ we have: $x^{*} = 2, 904$ , $f^{*} = - 156, 664$ ;

T3: $f (x) = \sum_{i = 1}^{q} ({(x_{i} - 1)}^{2} + 100 {(x_{i + 1} - x_{i})}^{2})$ . For $q = 2$ we have: $x^{*} = 1, 000$ , $f^{*} = 0, 000$ ;

T4: $f (x) = {(2, 625 - x_{1} + x_{1} x_{2}^{3})}^{2} + {(2, 250 - x_{1} + x_{1} x_{2}^{2})}^{2} + {(1, 500 - x_{1} + x_{1} x_{2})}^{2} .$ We have: $x^{*} = \{3, 000; 0, 500\}$ , $f^{*} = 0, 000$ ;

T5: $f (x) = (1 + {(x_{1} + x_{2} + 1)}^{2} (19 - 14 x_{1} - 14 x_{2} + 6 x_{1} x_{2} + 3 x_{1}^{2} + 3 x_{2}^{2})) (30 + {(2 x_{1} + 3 x_{2})}^{2} (18 - 32 x_{1} + 48 x_{2} - 36 x_{1} x_{2} + 12 x_{1}^{2} + 27 x_{2}^{2})) .$

We have: $x^{*} = \{0, 000; - 1, 000\}$ , $f^{*} = 3, 000$ ;

T6: $f (x) = - \exp (0, 5 (\cos 2 π x_{1} + \cos 2 π x_{2})) - 20 \exp (- 0, 2 \sqrt{0, 5 (x_{1}^{2} + x_{2}^{2})}) + 20 + e .$ We have: $x^{*} = 0, 000$ , $f^{*} = 0, 000$ ;

T7: $f (x) = - x_{1} \sin \sqrt{|x_{1} - x_{2} - 47|} - (x_{2} + 47) \sin \sqrt{|x_{1} / 2 + x_{2} + 47|}$ . We have: $x^{*} = \{512, 404; 2319, 000\}$ , $f^{*} = - 959, 641$ ;

T8: $f (x) = \sum_{i = 1}^{k} 1 / (a_{i} + \sum_{j = 1}^{q} {(x_{j} - c_{ji})}^{2})$ . For $A = (0, 1; 0, 2; 0, 2; 0, 4; 0, 4; 0, 6; 0, 3; 0, 7; 0, 5; 0, 5)$ , $C = (\begin{matrix} 4 & 1 & 8 & 6 & 3 & 2 & 5 & 8 & 6 & 7 \\ 4 & 1 & 8 & 6 & 7 & 9 & 5 & 1 & 2 & 3, 6 \end{matrix})$ , $k = 10$ , $q = 2$ we have: $x^{*} = 4, 000$ , $f^{*} = - 11, 030$ ;

T9: $f (x) = 1 / (0, 002 + \sum_{j = 1}^{k} \sqrt{j + \sum_{i = 1}^{q} {(x_{i} - c_{ji})}^{6}})$ . For $k = 25$ , $q = 2$ , $c_{j 1} = (- 32; - 16; 0; 16; 32; - 32; - 16; 0; 16; 32; {- 32; - 16; 0; 16; 32; - 32; - 16; 0; 16; 32; - 32; - 16; 0; 16; 32)}^{T},$ $c_{j 2} = (- 32; - 32; - 32; - 32; - 32; - 16; - 16; - 16; - 16; - 16; 0; 0; 0; 0; 0; 16; 16; 16; 16; 16; 32; 36; 32; 36; 32),$ we have: $x^{*} = - 32, 000$ , $f^{*} = 0, 998$ ;

T10: $f (x) = \sum_{i = 1}^{q} (5 x_{i} - 16 x_{i}^{2} + x_{i}^{4}) / 2$ . For $q = 2$ we have: $x^{*} = 2, 904$ , $f^{*} = - 39, 166$ .

Test problems T1-T10 were solved using the authors’ method with the following initial data: $N_{0} = 10$ , $α = 10$ , $γ = 10^{- 3}$ , $ρ = 3$ , $ϕ = 9$ , $δ_{ϕ} = |f^{*} - f_{ϕ}^{*}|$ . The obtained empirical solutions are presented below in graphical and tabular form in Figs. 1-4 . The results of the calculations are presented in the metric $\{P_{x}, f_{ϕ}^{*}, x_{ϕ}^{*}, δ_{ϕ}, r_{ϕ}, T\}$ , where $f_{ϕ}^{*}$ is the optimal value of the corresponding optimization problem found on the ϕ-th iteration; $x_{ϕ}^{*}$ - the value of the argument, which corresponds to the value of $f_{ϕ}^{*}$ ; $r_{ϕ}$ - quality assessment (21); $T$ is the time spent searching for the value of $f_{ϕ}^{*}$ .

Results of solving test problems T1, T2 by the authors’ method.

Results of solving test problems T3, T4, T5 by the authors’ method.

Results of solving test problems T6, T7 by the authors’ method.

Results of solving test problems T8, T9, T10 by the authors’ method.

Let's finish the experimental part by checking the adequacy of the model embodied in the expression (24). Fig. 5 presents the results of a computational experiment to identify the functional dependence of $\log δ_{i} = \log (F_{i}^{*} - F^{*}) = f (\log (i))$ at $F^{*} = 0$ .

Real and simulated functional dependence of log δ i = f log i at F * = 0 .

According to the results presented in Fig. 5, the parameters and the root mean square deviation of the approximation (21) were determined by the least squares method. Note that the calculated root mean square deviation turned out to be smaller $12 %$ , which allows us to consider model (24) as adequate.

4 Discussion

Let's start the discussion by stating the empirically proven fact of the adequacy of the proposed method for solving the global optimization problems, in which continuous objective functions satisfy the Hölder condition, and the control parameters domain limited by continuous functions is characterized by a positive Lebesgue measure and is limited by multidimensional parallelepipeds. This fact is confirmed by the results of comparing the solutions obtained using the authors’ method with etalon solutions of optimization problems selected from the Optimization Test Problems section of the specialized resource Virtual Library of Simulation Experiments: Test Functions and Datasets. Figs. 1-4 shows that for all optimization test problems T1-T10, approximate values of the global minimum with an error in the range of $[10^{- 9}, 10^{- 2}]$ were found. Moreover, for test problems, T1-T7, adequate values of the upper estimates of the spacing to the exact global minimum (metric $r_{ϕ}$ ) with a probability close to one were obtained. Separately, we note that both the quality characteristics of the initial results calculated according to the authors’ method and the time to obtain them turned out to be practically independent of the size of the search area. This expected result (provided by a selection of simple statistical tests performed on iterations of the method) is a significant advantage of the authors’ method over analogues.

We will also note several theoretically grounded remarks concerning the authors’ method. In particular, the values of estimate (14) depend on the value of the established parameter $o$ , which is associated with the selected value of the unit cube (see (8)). The fact that the set $F_{i}^{*}$ is guaranteed to converge with an exponential rate is theoretically confirmed by the guaranteed convergence of the strictly monotonic set $F^{*} = \{F_{0}^{*} > F_{1}^{*} > \dots\}$ to the value of the exact global extrema with an exponential rate. Recall that the cumulative set $F^{*}$ grows with each transition from the i-th to the i+1-th iteration of the authors’ method (see the beginning of Section 2.2). Finally, if the admissible set is part of a unit cube (4), then when evaluating the probabilistic characteristics mentioned in Section 2.3, only the number of unit cubes in the partition of the admissible set will change. Accordingly, the estimates of the probabilistic characteristics from Section 2.3 (as well as the statements regarding the convergence of the authors’ method) are valid under the condition of compactness (4), which is characterized by the positivity of the Lebesgue measure. With this in mind, when implementing the authors’ method, at each iteration, you should check whether each stochastic point belongs to an admissible set.

5 Conclusion and future directions

The article proposes a parallel computing oriented method for solving the global minimum finding problems, in which continuous objective functions satisfy the Hölder condition, and the control parameters domain limited by continuous functions is characterized by a positive Lebesgue measure and is limited by multidimensional parallelepipeds. A typical example of such a task is the discrepancy minimizing problem between the left and right parts of some system of equilibrium equations. The method is based on simple statistical tests, thanks to which, at each iteration, growing sets of potential global minima and sets of decrements necessary for estimating the values of the Hölder constants are formed. The article theoretically substantiates and empirically proves the guaranteed convergence of the authors’ method to the real global minimum, which occurs at an exponential rate. For the continuous iterations number, analytical upper estimates of the spacing between the potential global minima and real global minima are formalized, as well as an estimate of the probability of overcoming this spacing is formalized. The decrements sequence approximation, estimation of a priori unknown Hölder constants, estimation of the average number of iterations of the method and probabilistic characteristics of the final solution are analytically justified. In addition to the theoretical proof, the adequacy of the authors’ method has been confirmed empirically. It turned out that both the quality characteristics of the initial results calculated by the authors’ method and the time to obtain them are practically independent of the size of the search area. This expected result is a significant advantage of the authors’ method over analogues.

Further research is planned to be devoted to the applied use of the proposed method, in particular, in the field of analysis of small stochastic data. The theoretical basis for these studies are articles such as (Kovtun et al., 2024; Kovtun et al., 2023).

Funding

This work was funded by Researchers Supporting Project number (RSP2024R503), King Saud University, Saudi Arabia.

Acknowledgments

The authors are grateful to all colleagues and institutions that contributed to the research and made it possible to publish its results.

References

Alexiadis A., . A minimalistic approach to physics-informed machine learning using neighbour lists as physics-optimized convolutions for inverse problems involving particle systems. J. Comput. Phys.. 2023;473:111750
[CrossRef] [Google Scholar]
Alfant R.M., Ajayi T., Schaefer A.J., . Evaluating mixed-integer programming models over multiple right-hand sides. Oper. Res. Lett. 2023
[CrossRef] [Google Scholar]
Ali, H., Tariq, U. U., Hardy, J., 2021. Bensaali, F.; Amira, A.; Fatema, K.; Antonopoulos, N. A Survey on System Level Energy Optimisation for MPSoCs in IoT and Consumer Electronics. Comput. Sci. Rev., 41, 100416. 10.1016/j.cosrev.2021.100416.
Bisikalo O., Kharchenko V., . Parameterization of the stochastic model for evaluating variable small data in the Shannon entropy basis. Entropy. 2023;25:184.
[CrossRef] [Google Scholar]
Castelli A.F., Moretti L., Manzolini G., . Robust optimization of seasonal, day-ahead and real time operation of aggregated energy systems. Int. J. Electr. Power Energy Syst.. 2023;152:109190
[CrossRef] [Google Scholar]
Chen W., Zheng M., . Multi-objective optimization for pavement maintenance and rehabilitation decision-making: a critical review and future directions. Autom. Constr.. 2021;130:103840
[CrossRef] [Google Scholar]
Fang D.H., Li C., Yang X.Q., . Asymptotic closure condition and Fenchel duality for DC optimization problems in locally convex spaces. Nonlinear Anal. Theory Methods Appl.. 2012;75:3672-3681.
[CrossRef] [Google Scholar]
Fransen, M. P., Langelaar, M., 2023. Schott, D. L. Deterministic vs. Robust Design Optimization Using DEM-Based Metamodels. Powder Technology, 425, 118526. 10.1016/j.powtec.2023.118526.
Gorawski M., Grochla K., Marjasz R., . Energy minimization algorithm for estimation of clock skew and reception window selection in wireless networks. Sensors. 2021;2021(21):1768.
[CrossRef] [Google Scholar]
Hirsching C., de Jongh S., Eser D., . Meta-heuristic optimization of control structure and design for MMC-HVdc applications. Electr. Pow. Syst. Res.. 2022;213:108371
[CrossRef] [Google Scholar]
Huang M., Du Z., Liu C., . Problem-independent machine learning (PIML)-based topology optimization—A universal approach. Extreme Mech. Lett.. 2022;56:101887
[CrossRef] [Google Scholar]
Izonin I., Tkachenko R., Shakhovska N., . Two-step data normalization approach for improving classification accuracy in the medical diagnosis domain. Mathematics. 2022;10:1942.
[CrossRef] [Google Scholar]
Kovtun V., Grochla K., Kharchenko V., . Stochastic forecasting of variable small data as a basis for analyzing an early stage of a cyber epidemic. Sci. Rep.. 2023;13
[CrossRef] [Google Scholar]
Kovtun V., Altameem T., Al-Maitah M., Kempa W., . Entropy-metric estimation of the small data models with stochastic parameters. Heliyon. 2024;10:e24708.
[Google Scholar]
Marulanda-Durango J.J., Zuluaga-Ríos C.D., . A meta-heuristic optimization-based method for parameter estimation of an electric arc furnace model. Results Eng.. 2023;17:100850
[CrossRef] [Google Scholar]
McNaughton, S., 2023, Optimizing Learning: An Overview. International Encyclopedia of Education(Fourth Edition), 560–567. 10.1016/b978-0-12-818630-5.14065-5.
Montano J., Garzón O.D., Rosales Muñoz A.A., . Application of the arithmetic optimization algorithm to solve the optimal power flow problem in direct current networks. Results Eng.. 2022;16:100654
[CrossRef] [Google Scholar]
Pei P., Quek S.T., Peng Y., . Enriched global-local multi-objective optimization scheme for fuzzy logic controller-driven magnetorheological damper-based structural system. Mech. Syst. Sig. Process.. 2023;193:110267
[CrossRef] [Google Scholar]
Shilaja C., Kiran S.R., Murali M., . Design and analysis of global optimization methods for proton exchange membrane fuel cell powered electric vehicle system with single switch DC-DC converter. Mater. Today:. Proc.. 2022;52:2057-2064.
[CrossRef] [Google Scholar]
Smith J.E., Nair R., . Dynamic binary optimization. Virtual Mach.. 2005;147–219
[CrossRef] [Google Scholar]
Sovrasov V.V., . Local tuning in peano curves-based global optimization scheme. Procedia Comput. Sci.. 2016;101:27-34.
[CrossRef] [Google Scholar]
Stracquadanio G., Pardalos P.M., . Stochastic methods for global optimization and problem solving. Encyclopedia Bioinformat. Computat. Biol.. 2019;321–327
[CrossRef] [Google Scholar]
Tymchenko O., Havrysh B., Khamula O., . Methods of converting weight sequences in digital subtraction filtration. In: 2019 IEEE 14th International Conference on Computer Sciences and Information Technologies (CSIT). 2019.
[CrossRef] [Google Scholar]
Wu H., Niu W., Zhang Y., . A hybrid polynomial-based optimization method for underwater gliders with parameter uncertainty. Appl. Ocean Res.. 2023;133:103486
[CrossRef] [Google Scholar]
Xie W., Tang W., Kuang Y., . A new hybrid optimizer for stochastic optimization acceleration of deep neural networks: dynamical system perspective. Neurocomputing. 2022;514:341-350.
[CrossRef] [Google Scholar]
Xu A., Li S., Fu J., . A hybrid method for optimization of frame structures with good constructability. Eng. Struct.. 2023;276:115338
[CrossRef] [Google Scholar]
Yang Z., Kang R., Luo X., . Rigorous modelling and deterministic multi-objective optimization of a super-critical CO2 power system based on equation of state and non-linear programming. Energ. Conver. Manage.. 2019;198:111798
[CrossRef] [Google Scholar]
Yang C., Zhu Y., Zhou J., . Dynamic flexibility optimization of integrated energy system based on two-timescale model predictive control. Energy. 2023;276:127501
[CrossRef] [Google Scholar]
Zaiats V.M., Rybytska O.M., Zaiats M.M., . An approach to assessment of the value and quantity of information in queueing systems based on pattern recognition and fuzzy sets theories. Cybern. Syst. Anal.. 2019;638–648
[CrossRef] [Google Scholar]
Zhang X., Huang G., Xie Y., . A coupled non-deterministic optimization and mixed-level factorial analysis model for power generation expansion planning – a case study of Jing-Jin-Ji metropolitan region, China. Appl. Energy. 2022;311:118621
[CrossRef] [Google Scholar]
Zhang J., Liu C., Li X., . A survey for solving mixed integer programming via machine learning. Neurocomputing. 2023;519:205-217.
[CrossRef] [Google Scholar]

Introduction

Materials and methods

Statement of the research
Formalization of the parallel computing oriented method of the guaranteed finding of global extremum with simple statistical tests selection
Specific points of applied use of the authors’ method

Results

Discussion

Conclusion and future directions

Fulltext Views
47

PDF downloads
41

View/Download PDF
Download Citations

BibTeX
RIS

Show Sections

[1] Alexiadis A., . A minimalistic approach to physics-informed machine learning using neighbour lists as physics-optimized convolutions for inverse problems involving particle systems. J. Comput. Phys.. 2023;473:111750
[CrossRef] [Google Scholar]

[2] Alfant R.M., Ajayi T., Schaefer A.J., . Evaluating mixed-integer programming models over multiple right-hand sides. Oper. Res. Lett. 2023
[CrossRef] [Google Scholar]

[3] Ali, H., Tariq, U. U., Hardy, J., 2021. Bensaali, F.; Amira, A.; Fatema, K.; Antonopoulos, N. A Survey on System Level Energy Optimisation for MPSoCs in IoT and Consumer Electronics. Comput. Sci. Rev., 41, 100416. 10.1016/j.cosrev.2021.100416.

[4] Bisikalo O., Kharchenko V., . Parameterization of the stochastic model for evaluating variable small data in the Shannon entropy basis. Entropy. 2023;25:184.
[CrossRef] [Google Scholar]

[5] Castelli A.F., Moretti L., Manzolini G., . Robust optimization of seasonal, day-ahead and real time operation of aggregated energy systems. Int. J. Electr. Power Energy Syst.. 2023;152:109190
[CrossRef] [Google Scholar]

[6] Chen W., Zheng M., . Multi-objective optimization for pavement maintenance and rehabilitation decision-making: a critical review and future directions. Autom. Constr.. 2021;130:103840
[CrossRef] [Google Scholar]

[7] Fang D.H., Li C., Yang X.Q., . Asymptotic closure condition and Fenchel duality for DC optimization problems in locally convex spaces. Nonlinear Anal. Theory Methods Appl.. 2012;75:3672-3681.
[CrossRef] [Google Scholar]

[8] Fransen, M. P., Langelaar, M., 2023. Schott, D. L. Deterministic vs. Robust Design Optimization Using DEM-Based Metamodels. Powder Technology, 425, 118526. 10.1016/j.powtec.2023.118526.

[9] Gorawski M., Grochla K., Marjasz R., . Energy minimization algorithm for estimation of clock skew and reception window selection in wireless networks. Sensors. 2021;2021(21):1768.
[CrossRef] [Google Scholar]

[10] Hirsching C., de Jongh S., Eser D., . Meta-heuristic optimization of control structure and design for MMC-HVdc applications. Electr. Pow. Syst. Res.. 2022;213:108371
[CrossRef] [Google Scholar]

[11] Huang M., Du Z., Liu C., . Problem-independent machine learning (PIML)-based topology optimization—A universal approach. Extreme Mech. Lett.. 2022;56:101887
[CrossRef] [Google Scholar]

[12] Izonin I., Tkachenko R., Shakhovska N., . Two-step data normalization approach for improving classification accuracy in the medical diagnosis domain. Mathematics. 2022;10:1942.
[CrossRef] [Google Scholar]

[13] Kovtun V., Grochla K., Kharchenko V., . Stochastic forecasting of variable small data as a basis for analyzing an early stage of a cyber epidemic. Sci. Rep.. 2023;13
[CrossRef] [Google Scholar]

[14] Kovtun V., Altameem T., Al-Maitah M., Kempa W., . Entropy-metric estimation of the small data models with stochastic parameters. Heliyon. 2024;10:e24708.
[Google Scholar]

[15] Marulanda-Durango J.J., Zuluaga-Ríos C.D., . A meta-heuristic optimization-based method for parameter estimation of an electric arc furnace model. Results Eng.. 2023;17:100850
[CrossRef] [Google Scholar]

[16] McNaughton, S., 2023, Optimizing Learning: An Overview. International Encyclopedia of Education(Fourth Edition), 560–567. 10.1016/b978-0-12-818630-5.14065-5.

[17] Montano J., Garzón O.D., Rosales Muñoz A.A., . Application of the arithmetic optimization algorithm to solve the optimal power flow problem in direct current networks. Results Eng.. 2022;16:100654
[CrossRef] [Google Scholar]

[18] Pei P., Quek S.T., Peng Y., . Enriched global-local multi-objective optimization scheme for fuzzy logic controller-driven magnetorheological damper-based structural system. Mech. Syst. Sig. Process.. 2023;193:110267
[CrossRef] [Google Scholar]

[19] Shilaja C., Kiran S.R., Murali M., . Design and analysis of global optimization methods for proton exchange membrane fuel cell powered electric vehicle system with single switch DC-DC converter. Mater. Today:. Proc.. 2022;52:2057-2064.
[CrossRef] [Google Scholar]

[20] Smith J.E., Nair R., . Dynamic binary optimization. Virtual Mach.. 2005;147–219
[CrossRef] [Google Scholar]

[21] Sovrasov V.V., . Local tuning in peano curves-based global optimization scheme. Procedia Comput. Sci.. 2016;101:27-34.
[CrossRef] [Google Scholar]

[22] Stracquadanio G., Pardalos P.M., . Stochastic methods for global optimization and problem solving. Encyclopedia Bioinformat. Computat. Biol.. 2019;321–327
[CrossRef] [Google Scholar]

[23] Tymchenko O., Havrysh B., Khamula O., . Methods of converting weight sequences in digital subtraction filtration. In: 2019 IEEE 14th International Conference on Computer Sciences and Information Technologies (CSIT). 2019.
[CrossRef] [Google Scholar]

[24] Wu H., Niu W., Zhang Y., . A hybrid polynomial-based optimization method for underwater gliders with parameter uncertainty. Appl. Ocean Res.. 2023;133:103486
[CrossRef] [Google Scholar]

[25] Xie W., Tang W., Kuang Y., . A new hybrid optimizer for stochastic optimization acceleration of deep neural networks: dynamical system perspective. Neurocomputing. 2022;514:341-350.
[CrossRef] [Google Scholar]

[26] Xu A., Li S., Fu J., . A hybrid method for optimization of frame structures with good constructability. Eng. Struct.. 2023;276:115338
[CrossRef] [Google Scholar]

[27] Yang Z., Kang R., Luo X., . Rigorous modelling and deterministic multi-objective optimization of a super-critical CO2 power system based on equation of state and non-linear programming. Energ. Conver. Manage.. 2019;198:111798
[CrossRef] [Google Scholar]

[28] Yang C., Zhu Y., Zhou J., . Dynamic flexibility optimization of integrated energy system based on two-timescale model predictive control. Energy. 2023;276:127501
[CrossRef] [Google Scholar]

[29] Zaiats V.M., Rybytska O.M., Zaiats M.M., . An approach to assessment of the value and quantity of information in queueing systems based on pattern recognition and fuzzy sets theories. Cybern. Syst. Anal.. 2019;638–648
[CrossRef] [Google Scholar]

[30] Zhang X., Huang G., Xie Y., . A coupled non-deterministic optimization and mixed-level factorial analysis model for power generation expansion planning – a case study of Jing-Jin-Ji metropolitan region, China. Appl. Energy. 2022;311:118621
[CrossRef] [Google Scholar]

[31] Zhang J., Liu C., Li X., . A survey for solving mixed integer programming via machine learning. Neurocomputing. 2023;519:205-217.
[CrossRef] [Google Scholar]

Simple statistical tests selection based parallel computating method ensures the guaranteed global extremum identification

Abstract

Keywords

Parallel computating

Applied mathematics

Optimization problem

Global extremum

Statistical tests method

Equilibrium

Nomenclature

1 Introduction

2 Materials and methods

2.1 Statement of the research

2.2 Formalization of the parallel computing oriented method of the guaranteed finding of global extremum with simple statistical tests selection

2.3 Specific points of applied use of the authors’ method

3 Results

4 Discussion

5 Conclusion and future directions

Acknowledgments

References

Suggested read for related articles: