We introduce Joint Probability Trees (JPTs), a novel approach that makes learning and reasoning about joint probability distributions tractable for practical applications. JPTs support both symbolic and subsymbolic variables in a single hybrid model, and they do not rely on prior knowledge about variable dependencies or families of distributions. JPT representations build on tree structures that partition the problem space into relevant subregions, which are elicited from the training data rather than derived from a rigid dependency model postulated prior to learning. Learning and reasoning scale linearly in JPTs, and the tree structure allows white-box reasoning about any posterior probability P(Q|E), such that interpretable explanations can be provided for any inference result. Our experiments showcase the practical applicability of JPTs in high-dimensional heterogeneous probability spaces with millions of training samples, making them a promising alternative to classic probabilistic graphical models.
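To make the posterior P(Q|E) over a partitioned space concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each leaf of the tree stores a prior weight (its share of the training samples) and one independent discrete distribution per variable, so that a posterior is a ratio of two sums over leaves. The names `Leaf`, `likelihood`, and `posterior`, the two toy variables, and all numbers are hypothetical; numeric (subsymbolic) variables would need interval probabilities instead of table lookups and are omitted here.

```python
from dataclasses import dataclass


@dataclass
class Leaf:
    """One subregion of the partition (hypothetical structure)."""
    prior: float  # share of training samples that fall into this leaf
    dists: dict   # variable name -> {value: probability within the leaf}

    def likelihood(self, assignment):
        # Assumption: variables are independent inside a leaf,
        # so the probability of an assignment is a plain product.
        p = 1.0
        for var, val in assignment.items():
            p *= self.dists[var].get(val, 0.0)
        return p


def posterior(leaves, query, evidence):
    """P(query | evidence) by the law of total probability over leaves."""
    joint = {**evidence, **query}
    numerator = sum(l.prior * l.likelihood(joint) for l in leaves)
    denominator = sum(l.prior * l.likelihood(evidence) for l in leaves)
    if denominator == 0.0:
        raise ValueError("evidence has zero probability under the model")
    return numerator / denominator


# Toy model with two leaves and two symbolic variables (made-up numbers).
leaves = [
    Leaf(0.6, {"weather": {"sunny": 0.9, "rainy": 0.1},
               "umbrella": {"yes": 0.1, "no": 0.9}}),
    Leaf(0.4, {"weather": {"sunny": 0.2, "rainy": 0.8},
               "umbrella": {"yes": 0.7, "no": 0.3}}),
]

p_rainy = posterior(leaves, {"weather": "rainy"}, {"umbrella": "yes"})
print(round(p_rainy, 4))
```

In this sketch every leaf contributes one term to each sum, so the cost of a query grows linearly with the number of leaves, and the per-leaf terms can be inspected individually to see which subregions drive a result.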