We present the first (polynomial-time) algorithm for reducing
a given deterministic finite state automaton (DFA) to
a hyper-minimized DFA, which may have fewer states than
the classically minimized DFA. The price we pay is that the
language recognized by the new machine can differ from the
original on a finite number of inputs. These hyper-minimized
automata are optimal, in the sense that every DFA with fewer
states must disagree on infinitely many inputs. With small
modifications, the construction also works for finite state
transducers producing outputs. Within a class of finitely
differing languages, the
hyper-minimized automaton is not necessarily unique. There may
exist several non-isomorphic machines using the minimum number of
states, each accepting a distinct language that differs finitely
from the original one. We show that all these smallest automata
nevertheless share substantial structural similarities.
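
To make the notion of "differing on a finite number of inputs" concrete, the
following is a minimal sketch of a test for whether two complete DFAs are
finitely different. It is not the hyper-minimization algorithm itself; it only
checks the relation that the hyper-minimized automaton must satisfy with
respect to the original. The function name, the tuple encoding
(transition dict, start state, set of accepting states), and the example
machines are assumptions made for illustration. The idea is standard: in the
product automaton, the set of inputs on which the machines disagree is finite
exactly when no cycle lies on a path from the start pair to a pair on which
exactly one machine accepts.

```python
from collections import deque


def finitely_different(a, b, alphabet):
    """Return True if two complete DFAs disagree on only finitely many inputs.

    Each DFA is a hypothetical triple (delta, start, finals), where delta maps
    (state, symbol) to the next state.
    """
    delta_a, start_a, finals_a = a
    delta_b, start_b, finals_b = b

    # 1. Explore the product automaton from the pair of start states.
    succ = {}
    queue = deque([(start_a, start_b)])
    while queue:
        pair = queue.popleft()
        if pair in succ:
            continue
        p, q = pair
        succ[pair] = [(delta_a[p, c], delta_b[q, c]) for c in alphabet]
        queue.extend(succ[pair])

    # 2. Pairs on which exactly one of the two machines accepts.
    differing = {(p, q) for (p, q) in succ
                 if (p in finals_a) != (q in finals_b)}

    # 3. Backward search: reachable pairs that can still reach a differing pair.
    pred = {pair: [] for pair in succ}
    for pair, targets in succ.items():
        for t in targets:
            pred[t].append(pair)
    useful = set(differing)
    queue = deque(differing)
    while queue:
        for s in pred[queue.popleft()]:
            if s not in useful:
                useful.add(s)
                queue.append(s)

    # 4. Kahn's algorithm: the disagreement set is finite exactly when the
    #    subgraph induced by the useful pairs contains no cycle.
    indeg = {u: 0 for u in useful}
    for u in useful:
        for t in succ[u]:
            if t in useful:
                indeg[t] += 1
    queue = deque(u for u in useful if indeg[u] == 0)
    removed = 0
    while queue:
        u = queue.popleft()
        removed += 1
        for t in succ[u]:
            if t in useful:
                indeg[t] -= 1
                if indeg[t] == 0:
                    queue.append(t)
    return removed == len(useful)


# Example machines over the one-letter alphabet {a} (illustrative only).
ALL_STRINGS = ({(0, "a"): 0}, 0, {0})                    # accepts a*
NONEMPTY = ({(0, "a"): 1, (1, "a"): 1}, 0, {1})          # accepts a+
EVEN_LENGTH = ({(0, "a"): 1, (1, "a"): 0}, 0, {0})       # accepts (aa)*

same_up_to_finite = finitely_different(ALL_STRINGS, NONEMPTY, "a")
not_finite = finitely_different(ALL_STRINGS, EVEN_LENGTH, "a")
```

In this example the first pair of machines disagrees only on the empty string,
so a one-state machine could stand in for the two-state one in the
hyper-minimization sense, whereas the second pair disagrees on every string of
odd length and is therefore not finitely different.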