BALSAM: Beyond WorstCase
Analysis in Approximation Algorithms and Mechanism Design
Funded by the
Hellenic Foundation for Research
and Innovation (H.F.R.I.) under the "1st Call for H.F.R.I.
Research Projects to Support Faculty Members and Researchers and Procure
HighValue Research Equipment", project number 1424.
Synopsis
We aim at circumventing strong lower bounds and impossibility results, imposed by the traditional approach of worstcase analysis,
and at a deeper understanding of the algorithmic properties of fundamental problems in the areas of Approximation Algorithms,
Algorithmic Mechanism Design and Computational Learning Theory for “typical” instances that usually appear and are important for practical applications.
Adopting the principles of beyond worstcase analysis in algorithms, our research effort focused on the following directions:
 Approximation and online algorithms for optimization problems with timeevolving costs.
 Approximation algorithms for scheduling malleable tasks on parallel machines
 Truthful approximation mechanisms for perturbationresilient and general instances
 Statistically and computationally efficient learning algorithms from samples that may have coarse labels, be truncated or have incomplete information
 Analysis of the distortion of preference aggregation rules based on ordinal information
 Learning and taking advantage of the implicit structure and properties of the instances in order to solve computationally difficult problems more efficiently.
 Practically efficient algorithms for timeseries clustering.
The project deliverables are research papers presented in highly competitive (and selective) conferences and/or published in peerreviewed journals
in the areas of Theoretical Computer Science (ICALP, IPCO, WINE, SAGT and Algorithmica, Mathematical Programming, Operations Research, Theoretical Computer Science,
Theory of Computing Systems), Artificial Intelligence (AAAI, AISTATS and Journal of Artificial Intelligence Research) and Computational Learning Theory (COLT, NeurIPS, ICML).
Some of the algorithms proposed and analyzed have been implemented and experimentally evaluated.
Originality and Objectives
The research agenda of socalled “beyond worstcase analysis”, which aims at a deeper understanding of the algorithmic
properties of fundamental computational problems for practically relevant instances, has been significantly developed in the last few
years and keeps developing rapidly.
The project aims to make a substantial research contribution to the research agenda of
beyond worstcase analysis. Its originality concerns the following, among others:
 This is the first time that the methods and the techniques of beyondworstcase analysis have been applied to the design of efficient truthful
mechanisms and of efficient approximation algorithms for optimization problems with timeevolving costs.
 This is the first time that “typical” practical instances of optimization problems with timeevolving costs and combinatorial auctions have been
investigated, and the notion of perturbation stability has been used as a means of characterizing such instances.
 The project develops novel algorithmic techniques for optimization problems with timeevolving costs and aims at a deeper and unified understanding
of their algorithmic properties, which is missing from current literature.
 The project studies, for the first time, how the design of efficient truthful mechanisms can be facilitated, if we focus on practically relevant
instances, and proposes new forms of truthful mechanisms that best exploit this direction.
Results
 Approximation and online algorithms for timeevolving optimization problems: Our research focused on facility location in timeevolving metrics and
the problem of minsum set cover. We presented a polynomialtime algorithm for
timeevolving facility location
on the real line and computationally efficient algorithms with asymptotically optimal competitive ratio for the
online version of timeevolving facility location. We introduced and were the first
to study the online and multistage versions of minsum set cover, which comprise interesting generalizations of the classical list update problem. For
online minsum set cover, we presented deterministic and
randomized algorithms with optimal competitive ratio, some of which are based on
online learning and dimension reduction
(i.e., we consider doubly stochastic matrices instead of rankings). For the
multistage minsum set cover,
we presented the first polynomialtime approximation algorithm with squarelogarithmic approximation ratio and prove that a logarithmic approximation
ratio cannot be achieved in polynomial time (unless P = NP). Our algorithms are mostly inspired by and can be regarded as interesting generalizations of the
classical MoveToFront algorithm for list update.
 Malleable task scheduling: We introduced and were the first to study malleable task scheduling on parallel machines to minimize the makespan
in the general and particularly interesting case where each task’s processing time is given by (taskdependent set) function of the set of machines where
the task is assigned. We showed that for subadditive processing power functions, there is a simple nearoptimal schedule, where most tasks are assigned to
a single machine. Hence, by losing a small constant factor in the approximation ratio, we can ignore the scheduling aspect of the problem and focus on
approximating a nearoptimal assignment of the tasks to sets of machines. We present the first polynomialtime algorithms with a logarithmic approximation ratio for
submodular processing power functions and with a
constant approximation ratio
for the case where machines behave as gross substitutes and the
processing power functions are M#concave. We further proved that the algorithm with logarithmic approximation
ratio for submodular functions can be extended to subadditive processing power functions, assuming access to a demand oracle, with a squarelogarithmic
approximation ratio. Extensive experimental evaluation demonstrates that in practically interesting instances with submodular processing power functions,
the makespan achieved by our algorithms deviates from the optimal makespan by a small constant factor.
 Truthful approximation mechanisms: We characterized the best possible approximation ratio achievable
(i) by truthful mechanisms without monetary payments for
kfacility location on the real line for perturbationstable instances; and
(ii) by polynomialtime truthful mechanisms for
combinatorial auctions with perturbationstable valuations. An interesting contribution of our work is the formal definition of the class of perturbationstable valuations, which arises as a natural generalization of the notion of endowed valuations, which have been used in previous work dealing with the existence and efficient computation of Walrasian equilibrium.
Moreover, we proposed a class of truthful mechanisms, known as
equalcost mechanisms, which are based on moderate payments and guarantee identical utility (i.e., valuation minus payment) to all participants, and an implementation framework that guarantees
best possible communication complexity for optimal truthful mechanisms
in the settings of multiitem and multiunit auctions.
 Efficient learning from imperfect samples: Our research effort focused on formulating minimal sets of sufficient and necessary assumptions under which statistically and computationally efficient algorithms for learning from imperfect samples can be formulated. Adopting this general approach, we studied the problems of
parameter estimation of truncated Boolean product distributions and
learning from coarse labels, for each of which we formulated a set of sufficient and necessary assumptions that allow for efficient learning algorithms with polynomial statistical and computational complexity.
We further proved that certain fundamental parameters (e.g., number of cycles or small cliques) of an unknown network can be
efficiently estimated from a noisy sample of the network
and a small number of adaptive neighborhood queries. Finally, our research effort resulted in statistically and computationally efficient
regression algorithms with differential privacy guarantees, when the input data follow a
ddimensional Gaussian distribution and may have unbounded covariates.
 Distortion of preference aggregation rules based on ordinal preferences:
We analyzed the best possible distortion of voting and preference aggregation rules, which are based on ordinal preferences, against an optimal utilitarian solution, which has access to and takes into account the voters’ cardinal preferences.
We established almost tight upper and lower bounds on the distortion ratio for 1facility location (i.e., singlewinner selection) in general metric spaces, when the voting rule has access only to
a small part of the voters’ ordinal preferences.
Next, we introduced and were the first to study the distortion of
kfacility location (kcommittee election) on the real line, under the assumption that the voting rule may use a small number of distances between selected pairs of candidates. We proved strong (and asymptotically tight) upper and lower bounds on the number of distances that the voting rule needs to know in order to achieve first bounded and then constant distortion with respect to the social cost and the maximum cost of the voters. We should underline that the number of distances that the voting rule needs to know for constant distortion does not depend on the number of voters or the number of candidates, it only depends on the number k of candidates selected by the voting rule.
 Algorithmic exploitation of instance structure: Our research effort focused on formulating algorithmic approaches that can efficiently learn and exploit the implicit structure and properties of the instances in order to solve computationally difficult problems more efficiently.
A key contribution in this research direction is
a novel theoretical framework aiming to investigate the possibility of efficient parameterization of computationally difficult problems, such as MAXCUT and TSP. Under this framework, we ask whether there exist generative models that (i) are expressive enough to generate approximately optimal solutions; (ii) have a tractable, i.e, of polynomial size, parameterization; (iii) their optimization landscape is benign in the sense that it does not contain suboptimal stationary points. Using such a parameterization, we show how to approximate potentially intractable problems with computational complexity that depends on the complexity of sampling from the above class of distributions.
Moreover, we present the first known statistically and computationally efficient algorithms for the (extensively studied in practice) label ranking problem, when rankings of the labels are obtained by linear transformations of input features. We also present and analyze theoretically an algorithmic framework that allows for
efficient sampling from a discrete distribution from noisy pairwise comparisons between the elements in its support.
 Practically efficient algorithms for timeseries clustering: We proposed, implemented and evaluated experimentally a new
algorithmic framework for timeseries clustering that first applies sparse Guassian
process regression in order to simplify the input timeseries, and then the wellknown kmeans clustering algorithm to the simplified
timeseries. For the widely used Dynamic Time Warping (DTW) distance, the new algorithmic framework achieves comparable clustering quality
and asymptotic running time about two orders of magnitude faster than directly applying kmeans to the input timeseries.
Impact
 A deeper understanding of the algorithmic properties of optimization
problems with timeevolving costs.
 The extension of the notion of the stability of the optimal solution to small perturbations of the input and its exploitation for the characterization of
“typical” practical instances in the contexts of combinatorial auctions
and facility location games.
 The development of a framework related to the design of efficient truthful mechanisms for perturbation stable instances.
 A deeper understanding of efficient learning from imperfect samples.
 The formation of a research group, consisting of young researchers, with topnotch expertise in the important and timely research directions
studied within the project.
 Support of two completed PhD theses and two PhD theses which will be
completed soon.
 Accumulation of stateoftheart algorithmic knowledge, with significant practical applications and potential of economic and social growth though its application and exploitation.
 Further development of research culture related to high quality, high risk and high gain algorithmic research.
Research Team
Workshops  Dissemination

The workshop on New Trends and Beyond Worstcase Analysis on Mechanism Design and Approximation Algorithms, February 2425, 2022 (online event) was coorganized by BALSAM.
Distinguished invited speakers
presented state of the art results on the project's topic and Thanasis Lianeas, Stratis Skoulakis, Alkis Kalavasis and Panagiotis Pasilinakos presented some of the project's research results.
 Thursday, February 24:
Morning talks
of Aris FilosRatsikas and Carmine Ventre (also
Webex link, password: BaLSaMCoMRaDE2022)
 Thursday, February 24:
Afternoon talks
of Artem Tsikiridis, Georgios Papasotiropoulos, Michail Fasoulakis, Ioannis Caragiannis and Guido Schaefer (also
Webex link, password: BaLSaMCoMRaDE2022)
 Friday, February 25:
Morning talks
of Martin Hoefer, Georgios Amanatidis, Stratis Skoulakis, Thanasis Lianeas, Panagiotis Patsilinakos and Alkis Kalavasis (also
Webex link, password: BaLSaMCoMRaDE2022)
 Friday, February 25:
Afternoon talks
of Vasilis Gkatzelis and Christos Tzamos (also
Webex link, password: BaLSaMCoMRaDE2022)
 The 16th Athens Colloquium on Algorithms and Complexity (ACAC 2021), August 2627, 2021, National Technical University of Athens (online event) was sponsored by
BALSAM. Stratis Skoulakis, Alkis Kalavasis and Panagiotis Pasilinakos presented some of the project's research results.
Publications
 Dimitris Fotakis, Vardis Kandiros, Thanasis Lianeas, Nikos Mouzakis, Panagiotis Patsilinakos, Stratis Skoulakis:
NodeMaxCut and the Complexity of Equilibrium in Linear Weighted Congestion Games. ICALP 2020: 50:150:19.
 Dimitris Fotakis, Loukas Kavouras, Grigorios Koumoutsos, Stratis Skoulakis, Manolis Vardas:
The Online MinSum Set Cover Problem.
ICALP 2020: 51:151:16.
 Dimitris Fotakis, Alkis Kalavasis, Christos Tzamos:
Efficient Parameter Estimation of Truncated Boolean Product Distributions. COLT 2020: 15861600.
Full version in Algorithmica 84(8): 21862221, Springer, 2022.
 Giannis Fikioris, Dimitris Fotakis:
Mechanism Design for Perturbation Stable Combinatorial Auctions.
SAGT 2020: 4763.
Full version in Theory of Computing Systems 66(4): 778801, Springer, 2022.
 Ioannis Anagnostides, Dimitris Fotakis, Panagiotis Patsilinakos:
Asymptotically Optimal Communication in Simple Mechanisms. SAGT 2020:
1731. Revised and extended version to appear, under the title “Sampling and Optimal Preference Elicitation in Simple Mechanisms“, in
Theory of Computing Systems, Springer, 2024.
 Dimitris Fotakis, Thanasis Lianeas, Georgios Piliouras, Stratis Skoulakis:
Efficient Online Learning of Optimal Rankings: Dimensionality Reduction via Gradient Descent.
NeurIPS 2020.
 Dimitris Fotakis, Alkis Kalavasis, Konstantinos Stavropoulos:
Aggregating Incomplete and Noisy Rankings.
AISTATS 2021: 22782286.
 Dimitris Fotakis, Thanasis Pittas, Stratis Skoulakis:
Estimating the Number of Induced Subgraphs from Incomplete Data and Neighborhood Queries.
AAAI 2021: 40454053.
 Dimitris Fotakis, Piotr Krysta, Carmine Ventre:
Efficient Truthful Scheduling and Resource Allocation through Monitoring.
AAAI 2021: 54235431.
 Dimitris Fotakis, Loukas Kavouras, Lydia Zakynthinou:
Online Facility Location in Evolving Metrics.
Algorithms 14(3): 73, 2021.
 Dimitris Fotakis, Loukas Kavouras, Panagiotis Kostopanagiotis, Philip Lazos, Stratis Skoulakis, Nikos Zarifis:
Reallocating multiple facilities on the line.
Theoretical Computer Science 858: 1334, 2021.
 Dimitris Fotakis, Panagiotis Kostopanagiotis, Vasileios Nakos, Georgios Piliouras, Stratis Skoulakis:
On the Approximability of Multistage MinSum Set Cover.
ICALP 2021: 65:165:19.
 Dimitris Fotakis, Georgios Piliouras, Stratis Skoulakis:
Efficient Online Learning for Dynamic kClustering.
ICML 2021: 33963406.
 Dimitris Fotakis, Alkis Kalavasis, Vasilis Kontonis, Christos Tzamos:
Efficient Algorithms for Learning from Coarse Labels.
COLT 2021: 20602079.
 Ioannis Anagnostides, Dimitris Fotakis, Panagiotis Patsilinakos:
MetricDistortion Bounds Under Limited Information.
SAGT 2021: 299313.
Full version in
Journal of Artificial Intelligence Research (JAIR) 74: 14491483, 2022.
 Dimitris Fotakis, Panagiotis Patsilinakos:
Strategyproof Facility Location in Perturbation Stable Instances.WINE
2021: 95112 (full version).
 Robert Istvan BusaFekete, Dimitris Fotakis, Balazs Szorenyi, Manolis Zampetakis:
Identity testing for Mallows model.
NeurIPS 2021.

Robert Istvan BusaFekete, Dimitris Fotakis, Manolis Zampetakis:
Private and Nonprivate Uniformity Testing for Ranking Data.
NeurIPS 2021.
 Jason Milionis, Alkis Kalavasis, Dimitris Fotakis, Stratis Ioannidis:
Differentially Private Regression with Unbounded Covariates.
AISTATS 2022: 32423273.
 Dimitris Fotakis, Alkis Kalavasis, Vasilis Kontonis, Christos Tzamos.
Linear Label Ranking with Bounded Noise.
NeurIPS 2022.
 Dimitris Fotakis, Alkis Kalavasis, Christos Tzamos.
Sampling from Pairwise Comparisons. NeurIPS 2022.
 Dimitris Fotakis, Laurent Gourves, Panagiotis Patsilinakos.
On the Distortion of Committee Election with Linear Preferences and Few Distance Queries.
Technical Report, 2022.
 Dimitris Fotakis, Jannik Matuschke, Orestis Papadigenopoulos:
A ConstantFactor Approximation for Generalized Malleable Scheduling Under M#Concave Processing Speeds.
IPCO 2022: 237250. Full
version to appear in Mathematical Programming, 2024.
 Dimitris Fotakis, Jannik Matuschke, Orestis Papadigenopoulos:
Assigning and Scheduling Generalized Malleable Jobs under Subadditive or Submodular Processing Speeds.
Full version to appear in Operations Research, 2024.
 Constantine Caramanis, Dimitris Fotakis, Alkis Kalavasis, Vasilis Kontonis, Christos Tzamos.
Optimizing SolutionSamplers for Combinatorial Problems:
The Landscape of PolicyGradient Method. NeurIPS 2023.
 Dimitris Fotakis, Panagiotis Patsilinakos, Eleni Psaroudaki and Michalis Xefteris.
Efficient TimeSeries Clustering through Sparse Gaussian Modeling.
Algorithms 17(2): 61, 2024.
Implementation and experimental evaluation platform.