Mathematical foundation for probability and statistics

I’ve been reading the new book by Christopher Bishop, Pattern Recognition and Machine Learning, and once again I realize that I really lack a strong mathematical foundation for statistics. Not that Bishop’s book requires one: it does not go very deep into the mathematical side, and anyone with an undergraduate level in calculus should be able to follow it quite easily. But I would really like to see proofs for some results (convergence of EM, why and under which conditions Variational Bayes approaches give a good approximation, etc…). One of the points I keep being confused by is everything related to conditional expectation. I have a hard time really ‘getting it’ and using it comfortably.

What I am looking for are books which are:

  1. at a graduate level
  2. mathematically sound (with proofs of convergence, for example)
  3. ideally, usable for self-study (exercises + solutions).

Some references that seem worth looking at:

  1. A Course in Probability Theory by Kai Lai Chung -> I just bought it. Seems concise.
  2. The Elements of Statistical Learning by T. Hastie, R. Tibshirani and J. H. Friedman.
  3. Probability: A Graduate Course by Allan Gut. This one looks good, with exercises (no solutions, though). It may seem like a detail, but the typography is really good: it is basic LaTeX style, which is what I prefer by far for math-heavy textbooks.
  4. Foundations of Modern Probability by O. Kallenberg. I once borrowed it: it looks like all the points where I keep getting lost are covered, but the level is well above mine for the moment.
  5. Mathematical Statistics by Jun Shao. This one has a companion book with solutions.

I will see how it goes with the first one, which I’ve just bought, and whether it is enough to follow The Bayesian Choice, which I am still waiting for and which was recommended by A. Doucet as a good introduction to rigorous Bayesian statistics.