State-of-the-art methods for diminished reality propagate pixel information from a keyframe to subsequent frames for real-time inpainting. However, these approaches produce artifacts if the scene geometry is not sufficiently planar. In this paper, we present InpaintFusion, a new real-time method that extends inpainting to non-planar scenes by considering both color and depth information in the inpainting process. We use an RGB-D sensor for simultaneous localization and mapping, in order to both track the camera and obtain a surfel map in addition to RGB images. The RGB-D information enters a cost function over both color and geometric appearance, from which we derive a global optimization for simultaneous inpainting of color and depth. The inpainted depth is merged into a global map by depth fusion. For the final rendering, we project the map model into image space, where it can be used for effects such as relighting and stereo rendering of otherwise hidden structures. We demonstrate the capabilities of our method by comparing its results to those of inpainting methods that use planar geometric proxies.