A cold dark matter (CDM) halo massive enough to host the Milky Way contains hundreds of subhalos that are themselves massive enough to host dwarf galaxies. The discrepancy between this number and the much smaller number of observed satellite galaxies appears to be a problem for CDM. The radial number density profile and disk-like configuration of the observed satellites also differ from those of the total subhalo population in CDM simulations. Several models of dwarf galaxy formation that reproduce the right number of luminous subhalos have been proposed. Some of them also give the right radial distribution and make disk-like configurations more probable. Additional information about the typical formation times and sites of dwarf galaxies can be found in the stellar halo of the Milky Way, i.e. in the stellar debris of tidally disrupted dwarfs. A stellar halo with a realistic concentration is obtained when most dwarfs form early (before redshift 10) in small halos (virial mass above $10^8$ solar masses). This mass scale, found from simulations of dark matter structure formation, coincides with the mass at which the virial temperature reaches $10^4$ K, the value needed for efficient atomic cooling.
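The correspondence between a virial mass of $10^8$ solar masses at redshift 10 and a virial temperature near $10^4$ K can be checked with a short order-of-magnitude estimate. The sketch below is not taken from the text: it assumes a spherical top-hat halo with a mean density of $18\pi^2$ times the critical density, a flat cosmology with $h = 0.7$ and $\Omega_m = 0.3$, and a mean molecular weight $\mu = 0.6$ (ionized primordial gas). The function name `virial_temperature` and all parameter defaults are illustrative choices.

```python
import math

# Physical constants in SI units
G = 6.674e-11         # gravitational constant [m^3 kg^-1 s^-2]
M_SUN = 1.989e30      # solar mass [kg]
M_PROTON = 1.6726e-27 # proton mass [kg]
K_B = 1.3807e-23      # Boltzmann constant [J/K]
MPC = 3.0857e22       # megaparsec [m]


def virial_temperature(mass_msun, z, h=0.7, omega_m=0.3, mu=0.6,
                       delta_vir=18.0 * math.pi ** 2):
    """Estimate the virial temperature [K] of a halo of given virial mass.

    All defaults are assumptions for illustration: flat LCDM background,
    top-hat overdensity delta_vir relative to the critical density at z,
    and mean molecular weight mu of the gas.
    """
    # Hubble constant in s^-1 and critical density at redshift z
    h0 = 100.0 * h * 1.0e3 / MPC
    e2 = omega_m * (1.0 + z) ** 3 + (1.0 - omega_m)
    rho_crit = 3.0 * h0 ** 2 * e2 / (8.0 * math.pi * G)

    # Virial radius from M = (4/3) pi r^3 * delta_vir * rho_crit
    mass = mass_msun * M_SUN
    r_vir = (3.0 * mass / (4.0 * math.pi * delta_vir * rho_crit)) ** (1.0 / 3.0)

    # Circular velocity squared at the virial radius, then T = mu m_p V^2 / (2 k_B)
    v_c2 = G * mass / r_vir
    return mu * M_PROTON * v_c2 / (2.0 * K_B)


if __name__ == "__main__":
    t_vir = virial_temperature(1.0e8, 10.0)
    print(f"T_vir(1e8 Msun, z=10) ~ {t_vir:.2e} K")
```

Under these assumptions the estimate comes out at roughly $10^4$ K, and it scales as $M^{2/3}$ at fixed redshift. The exact value shifts by a factor of order unity with the choice of $\mu$, the overdensity definition, and the cosmological parameters.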