Unary quality metrics#

Inverted Generational Distance (IGD and IGD+) and Averaged Hausdorff Distance#

igd(data, /, ref, *[, maximise])

Inverted Generational Distance (IGD).

igd_plus(data, /, ref, *[, maximise])

Modified IGD (IGD+).

avg_hausdorff_dist(data, /, ref, *[, ...])

Average Hausdorff distance.

Functions to compute the inverted generational distance (IGD and IGD+) and the averaged Hausdorff distance between nondominated sets of points.

The generational distance (GD) of a set \(A\) is defined as the distance between each point \(a \in A\) and the closest point \(r\) in a reference set \(R\), averaged over the size of \(A\). Formally,

\[GD_p(A,R) = \left(\frac{1}{|A|}\sum_{a\in A}\min_{r\in R} d(a,r)^p\right)^{\frac{1}{p}}\]

where the distance in our implementation is the Euclidean distance:

\[d(a,r) = \sqrt{\sum_{k=1}^M (a_k - r_k)^2}\]

The inverted generational distance (IGD) is calculated as \(IGD_p(A,R) = GD_p(R,A)\).
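As a sketch of the definitions above (plain Python with `p=1` and Euclidean distance; the function names here are illustrative stand-ins, not the implementation behind the `igd()` entry listed at the top):

```python
import math

def euclidean(a, r):
    # d(a, r) = sqrt(sum_k (a_k - r_k)^2)
    return math.sqrt(sum((ak - rk) ** 2 for ak, rk in zip(a, r)))

def gd(A, R):
    # GD_1(A, R): mean distance from each a in A to its nearest r in R.
    return sum(min(euclidean(a, r) for r in R) for a in A) / len(A)

def igd(A, R):
    # IGD_1(A, R) = GD_1(R, A): mean distance from each reference point
    # in R to its nearest point of A.
    return gd(R, A)
```

For example, `igd([(1.0, 1.0)], [(0.0, 0.0), (2.0, 2.0)])` averages the distance from each of the two reference points to the single point of the set, giving \(\sqrt{2}\).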

The modified inverted generational distance (IGD+) was proposed by Ishibuchi et al.1 to ensure that IGD+ is weakly Pareto-compliant, similarly to epsilon_additive() or epsilon_mult(). It modifies the distance measure as:

\[d^+(r,a) = \sqrt{\sum_{k=1}^M (\max\{a_k - r_k, 0\})^2}\]
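A minimal sketch of IGD+ with \(p=1\), following Ishibuchi et al.'s modified distance for minimization (illustrative only; `d_plus` and `igd_plus` here are stand-ins, not the package's implementation):

```python
import math

def d_plus(r, a):
    # d+(r, a): only coordinates where the solution a is worse than the
    # reference point r (for minimization) contribute to the distance.
    return math.sqrt(sum(max(ak - rk, 0.0) ** 2 for rk, ak in zip(r, a)))

def igd_plus(A, R):
    # IGD+ with p = 1: mean modified distance from each reference point
    # in R to its nearest point of A.
    return sum(min(d_plus(r, a) for a in A) for r in R) / len(R)
```

Note that `d_plus` is zero whenever the solution weakly dominates the reference point, which is what makes IGD+ weakly Pareto-compliant.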

The average Hausdorff distance (\(\Delta_p\)) was proposed by Schütze et al.2 and is calculated as:

\[\Delta_p(A,R) = \max\{ IGD_p(A,R), IGD_p(R,A) \}\]
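The formula above translates directly into code; the following sketch (with \(p=1\) and illustrative names, not the `avg_hausdorff_dist()` implementation) takes the maximum of the two directed IGD values:

```python
import math

def _igd1(A, R):
    # IGD_1(A, R): mean Euclidean distance from each r in R to its
    # nearest point of A.
    d = lambda a, r: math.sqrt(sum((x - y) ** 2 for x, y in zip(a, r)))
    return sum(min(d(a, r) for a in A) for r in R) / len(R)

def avg_hausdorff(A, R):
    # Delta_1(A, R) = max{ IGD_1(A, R), IGD_1(R, A) }
    return max(_igd1(A, R), _igd1(R, A))
```

Taking the maximum makes the measure symmetric: it penalizes both reference points far from the set (IGD) and set points far from the reference (GD).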

IGDX 3 is the application of IGD to decision vectors instead of objective vectors to measure closeness and diversity in decision space. One can use the functions igd() or igd_plus() (recommended) directly, just passing the decision vectors as data.

There are different formulations of the GD and IGD metrics in the literature that differ in the value of \(p\), in the distance metric used, and in whether the term \(|A|^{-1}\) is inside (as above) or outside the exponent \(1/p\). GD was first proposed by Van Veldhuizen and Lamont4 with \(p=2\) and the term \(|A|^{-1}\) outside the exponent. IGD seems to have been mentioned first by Coello Coello and Reyes-Sierra5; however, the name D-metric was also used for the same concept with \(p=1\), and later papers have often used IGD/GD with \(p=1\). Schütze et al.2 proposed to place the term \(|A|^{-1}\) inside the exponent, as in the formulation shown above. This has a significant effect for GD and less so for IGD given a constant reference set. IGD+ also follows this formulation. We refer to Ishibuchi et al.1 and Bezerra et al.6 for a more detailed historical perspective and a comparison of the various variants.

Following Ishibuchi et al.1, we always use \(p=1\) in our implementation of IGD and IGD+ because (1) it is the setting most used in recent works; (2) it makes it irrelevant whether the term \(|A|^{-1}\) is inside or outside the exponent \(1/p\); and (3) the meaning of IGD becomes the average Euclidean distance from each reference point to its nearest objective vector. It is also slightly faster to compute.

GD should never be used directly to compare the quality of approximations to a Pareto front, as it often contradicts Pareto optimality (it is not weakly Pareto-compliant). We recommend IGD+ instead of IGD, since the latter contradicts Pareto optimality in some cases (see examples in igd_plus()) whereas IGD+ is weakly Pareto-compliant. We nevertheless implement IGD because it remains popular for historical reasons.

The average Hausdorff distance (\(\Delta_p(A,R)\)) is also not weakly Pareto-compliant, as shown in the examples in igd_plus().

Epsilon metric#

epsilon_additive(data, /, ref, *[, maximise])

Additive epsilon metric.

epsilon_mult(data, /, ref, *[, maximise])

Multiplicative epsilon metric.

The epsilon metric of a set \(A\) with respect to a reference set \(R\) is defined as 7

\[\epsilon(A,R) = \max_{r \in R} \min_{a \in A} \max_{1 \leq i \leq n} \epsilon(a_i, r_i)\]

where \(a\) and \(r\) are objective vectors. In the case of minimization of objective \(i\), \(\epsilon(a_i,r_i)\) is computed as \(a_i/r_i\) for the multiplicative variant (respectively, \(a_i - r_i\) for the additive variant), whereas in the case of maximization of objective \(i\), \(\epsilon(a_i,r_i) = r_i/a_i\) for the multiplicative variant (respectively, \(r_i - a_i\) for the additive variant). This allows computing a single value for problems where some objectives are to be maximized while others are to be minimized. Moreover, a lower value corresponds to a better approximation set, independently of the type of problem (minimization, maximization or mixed). However, the meaning of the value is different for each objective type. For example, imagine that objective 1 is to be minimized and objective 2 is to be maximized, and the multiplicative epsilon computed here is \(\epsilon(A,R) = 3\). This means that every \(a_1\) value in \(A\) needs to be multiplied by 1/3 and every \(a_2\) value by 3 in order for \(A\) to weakly dominate \(R\). The multiplicative version does not make sense for negative or zero values.

Computation of the epsilon indicator requires \(O(n \cdot |A| \cdot |R|)\) time, where \(n\) is the number of objectives (dimension of the vectors).
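The triple maximin structure of the definition makes the \(O(n \cdot |A| \cdot |R|)\) cost apparent. A sketch of the additive variant for pure minimization problems (an illustrative stand-in for `epsilon_additive()`, which also handles maximized objectives):

```python
def epsilon_additive_min(A, R):
    # Additive epsilon for minimization: the smallest eps such that
    # shifting every point of A by -eps in every objective makes A
    # weakly dominate R.  Three nested loops: O(n * |A| * |R|).
    return max(
        min(max(a_i - r_i for a_i, r_i in zip(a, r)) for a in A)
        for r in R
    )
```

When `A` equals `R` the result is 0, since each reference point is matched exactly by itself.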

Hypervolume metric#

hypervolume(data, /, ref[, maximise])

Hypervolume indicator.

Hypervolume(ref[, maximise])

Object-oriented interface for the hypervolume indicator.

total_whv_rect(x, /, rectangles, *, ref[, ...])

Compute total weighted hypervolume given a set of rectangles.

whv_rect(x, /, rectangles, *, ref[, maximise])

Compute weighted hypervolume given a set of rectangles.

The hypervolume of a set of multidimensional points \(A\) with respect to a reference point \(\vec{r}\) is the volume of the region dominated by the set and bounded by the reference point 8.
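In two dimensions the definition reduces to a simple sweep: sorting by the first objective, each point that improves the best second objective seen so far adds a rectangle bounded by the reference point. This sketch (for minimization; `hv2d` is an illustrative name, not the algorithm behind `hypervolume()`, which also handles higher dimensions) assumes every point dominates the reference point:

```python
def hv2d(points, ref):
    # 2-D hypervolume (area) for minimization, bounded by the
    # reference point.  Sweep in ascending order of the first
    # objective; dominated points never improve best_y and are skipped.
    hv, best_y = 0.0, ref[1]
    for x, y in sorted(points):
        if y < best_y:
            hv += (ref[0] - x) * (best_y - y)
            best_y = y
    return hv
```

For example, the set `[(1, 2), (2, 1)]` with reference point `(3, 3)` covers an area of 3: a 2×1 rectangle plus a 1×1 rectangle.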

Approximating the hypervolume metric#

hv_approx(data, /, ref[, maximise, ...])

Approximate the hypervolume indicator.

whv_hype(data, /, *, ref, ideal[, maximise, ...])

Approximation of the (weighted) hypervolume by Monte-Carlo sampling (2D only).

Computing the hypervolume can be time-consuming, thus several approaches have been proposed in the literature to approximate its value via Monte-Carlo sampling. These methods are implemented in whv_hype() and hv_approx().

Scalarized Hypervolume (DZ2019)#

Deng and Zhang9 proposed the following approximation of the hypervolume:

\[\widehat{HV}_r(A) = \frac{\pi^\frac{m}{2}}{2^m \Gamma(\frac{m}{2} + 1)}\frac{1}{n}\sum_{i=1}^n \max_{y \in A} s(w^{(i)}, y)^m\]

where \(m\) is the number of objectives, \(n\) is the number of sampled weights \(w^{(i)}\), \(\Gamma\) is the gamma function (math.gamma()), i.e., the analytical continuation of the factorial function, and \(s(w, y) = \min_{k=1}^m (r_k - y_k)/w_k\). The weights \(w^{(i)}, i=1\ldots n\) are sampled uniformly from the positive orthant of the unit hypersphere, by setting \(w = \frac{|x|}{\|x\|_2}\), where each component of \(x\) is independently sampled from the standard normal distribution.
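A Monte-Carlo sketch of this estimator (for minimization, assuming every point dominates the reference point; `hv_dz2019` and its parameters are illustrative names, not the `hv_approx()` API):

```python
import math
import random

def hv_dz2019(points, ref, n=10_000, seed=0):
    # Deng & Zhang (2019) polar-coordinate estimator of the hypervolume.
    rng = random.Random(seed)
    m = len(ref)
    # pi^(m/2) / (2^m * Gamma(m/2 + 1)): volume of the positive orthant
    # of the unit m-ball.
    c = math.pi ** (m / 2) / (2 ** m * math.gamma(m / 2 + 1))
    total = 0.0
    for _ in range(n):
        # Direction sampled uniformly on the positive orthant of the
        # unit hypersphere: w = |x| / ||x||_2 with x standard normal.
        x = [rng.gauss(0.0, 1.0) for _ in range(m)]
        norm = math.sqrt(sum(v * v for v in x))
        w = [abs(v) / norm for v in x]
        # s(w, y) = min_k (r_k - y_k) / w_k, maximized over the set:
        # length of the longest ray from ref along -w that stays inside
        # the dominated region.
        s = max(min((ref[k] - y[k]) / w[k] for k in range(m)) for y in points)
        total += s ** m
    return c * total / n
```

The estimate converges to the exact hypervolume as `n` grows; with a single point `(0.5, 0.5)` and reference `(1.0, 1.0)` it approaches the exact value 0.25.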

Bibliography#

[1]

Hisao Ishibuchi, Hiroyuki Masuda, Yuki Tanigaki, and Yusuke Nojima. Modified distance calculation in generational distance and inverted generational distance. In António Gaspar-Cunha, Carlos Henggeler Antunes, and Carlos A. Coello Coello, editors, Evolutionary Multi-criterion Optimization, EMO 2015 Part I, volume 9018 of Lecture Notes in Computer Science, pages 110–125. Springer, Heidelberg, Germany, 2015.

[2]

Oliver Schütze, X. Esquivel, A. Lara, and Carlos A. Coello Coello. Using the averaged Hausdorff distance as a performance measure in evolutionary multiobjective optimization. IEEE Transactions on Evolutionary Computation, 16(4):504–522, 2012.

[3]

A. Zhou, Qingfu Zhang, and Yaochu Jin. Approximating the set of Pareto-optimal solutions in both the decision and objective spaces by an estimation of distribution algorithm. IEEE Transactions on Evolutionary Computation, 13(5):1167–1189, 2009. doi:10.1109/TEVC.2009.2021467.

[4]

David A. Van Veldhuizen and Gary B. Lamont. Evolutionary computation and convergence to a Pareto front. In John R. Koza, editor, Late Breaking Papers at the Genetic Programming 1998 Conference, 221–228. Stanford University, California, July 1998. Stanford University Bookstore.

[5]

Carlos A. Coello Coello and Margarita Reyes-Sierra. A study of the parallelization of a coevolutionary multi-objective evolutionary algorithm. In Raúl Monroy, Gustavo Arroyo-Figueroa, Luis Enrique Sucar, and Humberto Sossa, editors, Proceedings of MICAI, volume 2972 of Lecture Notes in Artificial Intelligence, pages 688–697. Springer, Heidelberg, Germany, 2004.

[6]

Leonardo C. T. Bezerra, Manuel López-Ibáñez, and Thomas Stützle. An empirical assessment of the properties of inverted generational distance indicators on multi- and many-objective optimization. In Heike Trautmann, Günter Rudolph, Kathrin Klamroth, Oliver Schütze, Margaret M. Wiecek, Yaochu Jin, and Christian Grimme, editors, Evolutionary Multi-criterion Optimization, EMO 2017, volume 10173 of Lecture Notes in Computer Science, pages 31–45. Springer International Publishing, Cham, Switzerland, 2017. doi:10.1007/978-3-319-54157-0_3.

[7]

Eckart Zitzler, Lothar Thiele, Marco Laumanns, Carlos M. Fonseca, and Viviane Grunert da Fonseca. Performance assessment of multiobjective optimizers: an analysis and review. IEEE Transactions on Evolutionary Computation, 7(2):117–132, 2003. doi:10.1109/TEVC.2003.810758.

[8]

Eckart Zitzler and Lothar Thiele. Multiobjective optimization using evolutionary algorithms - A comparative case study. In Agoston E. Eiben, Thomas Bäck, Marc Schoenauer, and Hans-Paul Schwefel, editors, Parallel Problem Solving from Nature – PPSN V, volume 1498 of Lecture Notes in Computer Science, pages 292–301. Springer, Heidelberg, Germany, 1998. doi:10.1007/BFb0056872.

[9]

Jingda Deng and Qingfu Zhang. Approximating hypervolume and hypervolume contributions using polar coordinate. IEEE Transactions on Evolutionary Computation, 23(5):913–918, October 2019. doi:10.1109/tevc.2019.2895108.