Once a genetic region involved in a complex disease has been localized through linkage or
association studies, we need methods to help us identify the actual disease predisposing genetic
variant(s) in the region. A large number of single nucleotide polymorphism (SNP) sites may exist in
this region. It is important to distinguish genetic variants directly involved in disease from those
merely in linkage disequilibrium with, and thus associated with, the disease-predisposing variant(s).
A question of great interest is whether the SNP, or combination of SNPs, that influences the trait
under investigation has been identified. For many complex HLA-associated diseases, patterns of amino
acid site variability raise the possibility that the association of HLA variation with a disease may
be due not to a given allele but rather to one or more variable amino acid sites occurring on several
alleles.
Here the question is whether the amino acid variant, or combination of amino acid variants, involved
in disease has been identified. To address this question, this paper proposes a permutation procedure
for the haplotype method that uses haplotypic data from patients and controls to test whether all the
sites involved in the disease have been identified. The method is based on the theoretical result of
Valdes
and Thomson, that, for each haplotype combination containing all the amino acid sites involved in
the disease process, the relative frequencies of amino acid variants at sites not involved in disease,
but in linkage disequilibrium with the disease-predisposing sites, are expected to be the same in
patients and controls. The procedure takes into account the non-independence of the sites sampled,
is robust to the mode of inheritance and penetrance of the disease, and can detect when not all
the disease-predisposing sites have been identified. Application to both simulated data and
real data sets on type 1 diabetes and alcoholism indicates that the proposed procedure performs well
in testing the important null hypothesis that all the predisposing sites have been identified.
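
The idea behind such a test can be illustrated with a minimal sketch. It is not the procedure
developed in this paper: the statistic (a Pearson chi-square summed over strata), the data layout,
and the function names stratified_chi2 and permutation_p_value are all assumptions made for
illustration. Each haplotype is reduced to a triple (variants at the candidate sites, variant at
one other site, patient/control status), and the patient/control labels are shuffled only within
groups of haplotypes that share the same candidate-site combination. For simplicity the sketch
treats haplotypes as exchangeable units and ignores their pairing within individuals.

```python
import random
from collections import Counter, defaultdict


def stratified_chi2(records):
    """Sum of Pearson chi-square statistics over candidate-site strata.

    records: list of (candidate_haplotype, other_variant, is_patient).
    Within each stratum, compares the distribution of other_variant
    between patients and controls.
    """
    strata = defaultdict(list)
    for cand, other, is_patient in records:
        strata[cand].append((other, is_patient))

    total = 0.0
    for rows in strata.values():
        n = len(rows)
        variant_counts = Counter(v for v, _ in rows)
        status_counts = Counter(s for _, s in rows)
        cell_counts = Counter(rows)
        for v, nv in variant_counts.items():
            for s, ns in status_counts.items():
                expected = nv * ns / n
                total += (cell_counts[(v, s)] - expected) ** 2 / expected
    return total


def permutation_p_value(records, n_perm=1000, seed=0):
    """Permutation p-value, shuffling status labels within each stratum."""
    rng = random.Random(seed)
    observed = stratified_chi2(records)

    # Indices of the records belonging to each candidate-site stratum.
    by_stratum = defaultdict(list)
    for i, (cand, _, _) in enumerate(records):
        by_stratum[cand].append(i)

    hits = 0
    for _ in range(n_perm):
        labels = [r[2] for r in records]
        for idx in by_stratum.values():
            shuffled = [labels[i] for i in idx]
            rng.shuffle(shuffled)
            for i, lab in zip(idx, shuffled):
                labels[i] = lab
        permuted = [(c, o, lab) for (c, o, _), lab in zip(records, labels)]
        # Small tolerance guards against floating-point ties.
        if stratified_chi2(permuted) >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_perm + 1)
```

Under these assumptions, a small p-value suggests that variants at the other site still differ
between patients and controls after conditioning on the candidate sites, that is, that not all
predisposing sites have been included in the candidate set.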