Most current statistical machine translation systems learn models from parallel data. For a small number of language pairs, such resources exist in the form of government texts. For the vast majority of language pairs, however, there is almost nothing. Fortunately, other types of data can be exploited: monolingual and comparable data. Meanwhile, a substantial amount of work has applied representation learning to various NLP tasks. My PhD project will explore how non-parallel data and representation learning approaches can be used to help low-resource machine translation.