Xu and Mannor provide a theoretical paper on robustness and generalization where their notion of robustness is based on the idea that the difference in loss should be small for samples that are close. This implies that, e.g., for a test sample close to a training sample, the loss on both samples should be similar. The authors formalize this notion as follows:
Definition: Let $A$ be a learning algorithm and $S \subset Z$ be a training set such that $A(S)$ denotes the model learned on $S$ by $A$; the algorithm $A$ is $(K, \epsilon(S))$-robust if $Z$ can be partitioned into $K$ disjoint sets, denoted $C_i$ such that $\forall s \in S$ it holds:
$s,z \in C_i \rightarrow |l(A(S), s) – l(A(S), z)| \leq \epsilon(S)$.
In words, this means that we can partition the space $Z$ (which is $X \times Y$ for a supervised problem) into a finite set of subsets and whenever a sample falls into the same partition as a training sample, the learned model should have nearly the same loss on both samples. Note that this notion does not entirely match the notion of adversarial robustness as commonly referred to nowadays. The main difference is that the partition can be chosen, while for adversarial robustness, the “partition” (usually in form of epsilon-balls around training and testing samples) is fixed.
Based on the above notion of robustness, the authors provide PAC bounds for robust algorithms, i.e. generalization performance of $A$ is linked to its generalization. Furthermore, in several examples, common machine learning techniques such as SVMs and neural networks are shown to be robust under specific conditions. For neural networks, for example, an upper bound on the $L_1$ norm of weights and the requirement of Lipschitz continuity is enough. This actually related to work on adversarial robustness, where Lipschitz continuity and weight regularization is also studied.