Sandy H. Huang, Nicolas Papernot, Ian J. Goodfellow, Yan Duan, Pieter Abbeel. Adversarial Attacks on Neural Network Policies. CoRR abs/1702.02284, 2017.

Huang et al. study adversarial attacks on reinforcement learning policies. The main obstacle, in contrast to supervised learning, is that rewards may be sparse or delayed; there is no clear per-time-step loss to maximize, even though crafting adversarial examples usually relies on exactly that. To circumvent this problem, Huang et al. assume a well-trained policy that outputs a distribution over actions. Treating the policy's most-likely action as the ground-truth label then reduces the problem to the supervised setting: adversarial examples can be computed by maximizing the cross-entropy loss between the policy's output and this label, for example with the fast gradient sign method (FGSM).
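
As a minimal sketch of this reduction (assuming a batched PyTorch policy network mapping observations to action logits; the function name, `policy`, and `epsilon` are hypothetical placeholders, not from the paper), the attack amounts to a standard FGSM step with the policy's own prediction as label:

```python
import torch
import torch.nn.functional as F

def fgsm_on_policy(policy, observation, epsilon):
    """Craft an FGSM adversarial observation for a trained policy.

    Since the RL setting provides no per-time-step labels, the
    policy's most-likely action is used as the ground-truth label,
    reducing the attack to the supervised case.
    """
    observation = observation.clone().detach().requires_grad_(True)

    logits = policy(observation)                 # action logits, shape (N, A)
    target = logits.argmax(dim=-1).detach()      # most-likely action as label

    loss = F.cross_entropy(logits, target)       # supervised-style objective
    loss.backward()

    # One signed gradient step maximizes the loss within an L_inf ball.
    # In practice one would also clamp to the valid observation range.
    perturbed = observation + epsilon * observation.grad.sign()
    return perturbed.detach()
```

Because the perturbation pushes the policy away from its own preferred action at every step, even small values of `epsilon` can degrade the accumulated reward substantially.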

Also find this summary on ShortScience.org.
What is your opinion on this article? Let me know your thoughts on Twitter @davidstutz92 or LinkedIn in/davidstutz92.