The shape gradient is a local sensitivity function defined on the surface of an object which provides the change in a characteristic quantity, or figure of merit, associated with a perturbation to the shape of the object. The shape gradient can be used for gradient-based optimization, sensitivity analysis and tolerance calculations. However, it is generally expensive to compute from finite-difference derivatives for shapes that are described by many parameters, as is the case for typical stellarator geometry. In an accompanying work (Antonsen, Paul & Landreman J. Plasma Phys., vol. 85 (2), 2019), generalized self-adjointness relations are obtained for magnetohydrodynamic (MHD) equilibria. These describe the relation between perturbed equilibria due to changes in the rotational transform or toroidal current profiles, displacements of the plasma boundary, modifications of currents in the vacuum region or the addition of bulk forces. These are applied to efficiently compute the shape gradient of functions of MHD equilibria with an adjoint approach. In this way, the shape derivative with respect to any perturbation applied to the plasma boundary or coil shapes can be computed with only one additional MHD equilibrium solution. We demonstrate that this approach is applicable for several figures of merit of interest for stellarator configuration optimization: the magnetic well, the magnetic ripple on axis, the departure from quasisymmetry, the effective ripple in the low-collisionality $1/\nu$ regime $(\epsilon_{\text{eff}}^{3/2})$ (Nemov et al. Phys. Plasmas, vol. 6 (12), 1999, pp. 4622–4632) and several finite-collisionality neoclassical quantities. Numerical verification of this method is demonstrated for the magnetic well figure of merit with the VMEC code (Hirshman & Whitson Phys. Fluids, vol. 26 (12), 1983, p. 3553) and for the magnetic ripple with modification of the ANIMEC code (Cooper et al. Comput. Phys. Commun., vol. 72 (1), 1992, pp. 1–13). Comparisons with the direct approach demonstrate that, in order to obtain agreement within several per cent, the adjoint approach provides a factor of $O(10^{3})$ in computational savings.
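
The cost argument above (one additional solve for the adjoint approach versus a number of solves proportional to the number of shape parameters for finite differences) can be illustrated with a minimal sketch on a toy problem. This is not the MHD formulation of the paper: as an assumption made purely for illustration, the equilibrium is replaced by a linear system $A(p)x=b$ whose matrix depends linearly on the parameters $p$, and the figure of merit is $f=c^{T}x$. All names (`solve_state`, `gradient_adjoint`, `gradient_finite_difference`) are hypothetical.

```python
import numpy as np

def solve_state(A0, dA, p, b):
    """Toy 'equilibrium' solve: A(p) x = b with A(p) = A0 + sum_k p[k] * dA[k]."""
    A = A0 + np.tensordot(p, dA, axes=1)
    return A, np.linalg.solve(A, b)

def gradient_adjoint(A0, dA, p, b, c):
    """Gradient of f = c^T x using one forward solve and one adjoint solve."""
    A, x = solve_state(A0, dA, p, b)
    lam = np.linalg.solve(A.T, c)  # the single additional (adjoint) solve
    # Differentiating A x = b gives A dx/dp_k = -dA_k x, so df/dp_k = -lam^T dA_k x.
    return -np.einsum("i,kij,j->k", lam, dA, x)

def gradient_finite_difference(A0, dA, p, b, c, h=1e-6):
    """Direct approach: centered differences, two solves per parameter."""
    grad = np.empty_like(p)
    for k in range(p.size):
        step = np.zeros_like(p)
        step[k] = h
        _, x_plus = solve_state(A0, dA, p + step, b)
        _, x_minus = solve_state(A0, dA, p - step, b)
        grad[k] = (c @ x_plus - c @ x_minus) / (2.0 * h)
    return grad

# Assumed toy sizes and random data; nothing here comes from VMEC or ANIMEC.
rng = np.random.default_rng(0)
n, n_params = 20, 50
A0 = n * np.eye(n) + rng.standard_normal((n, n))
dA = 0.01 * rng.standard_normal((n_params, n, n))
p = rng.standard_normal(n_params)
b = rng.standard_normal(n)
c = rng.standard_normal(n)

g_adj = gradient_adjoint(A0, dA, p, b, c)           # 2 linear solves in total
g_fd = gradient_finite_difference(A0, dA, p, b, c)  # 2 * n_params linear solves
print("max abs difference:", np.max(np.abs(g_adj - g_fd)))
```

In this sketch the adjoint gradient costs two linear solves regardless of `n_params`, whereas the centered-difference gradient costs `2 * n_params` solves, which mirrors the scaling that motivates the adjoint approach for shapes described by many parameters.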