We seek to develop a deep learning-based system that is sufficiently interpretable for use in the medical domain. Because evaluating interpretability is inherently difficult, we give a precise description of how we assess the methods we introduce. The first goal is to create a system that produces counterfactual examples in the input space, ultimately answering the question: "In this situation, why did you produce this prediction, and for which examples would the prediction have changed?" For example, given an input space of images, the system generates 'nearby' images for which the neural network would have predicted a different class. Extensions of this immediate project are described below; all fall under the topic of improving interpretability in deep learning, driven by a particular purpose or domain.
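To make the counterfactual objective concrete, a minimal sketch follows. It assumes a toy logistic classifier rather than the deep network described above, and searches for a nearby input by gradient descent on a weighted sum of a proximity term and a classification loss toward the target class (in the spirit of gradient-based counterfactual search); the function name, model, and hyperparameters are illustrative placeholders, not part of the proposed system.

```python
import numpy as np

def counterfactual(x, w, b, target=1, lam=0.1, lr=0.5, steps=200):
    """Find x' near x that a logistic model sigmoid(w.x + b)
    classifies as `target`, by minimising
        lam * ||x' - x||^2 + cross_entropy(f(x'), target)
    with plain gradient descent. Illustrative toy model only."""
    xp = x.astype(float).copy()
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(w @ xp + b)))  # model's probability of class 1
        grad_loss = (p - target) * w             # gradient of cross-entropy w.r.t. x'
        grad_dist = 2.0 * lam * (xp - x)         # gradient of the proximity penalty
        xp -= lr * (grad_loss + grad_dist)
    return xp
```

The proximity weight `lam` trades off how close the counterfactual stays to the original input against how confidently it crosses the decision boundary; for images, the same objective would be optimised over pixel space with the network's loss in place of the logistic term.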