Published online by Cambridge University Press: 15 March 2011
An ever present, common sense idea in language modelling research is that, for aword to be a valid phrase, it should comply with multiple constraints atonce. A new language definition model is studied, based on agreement or consensusbetween similar strings. Considering a regular set of strings over a bipartitealphabet made by pairs of unmarked/marked symbols, a match relation isintroduced, in order to specify when such strings agree. Then a regular setover the bipartite alphabet can be interpreted as specifying another languageover the unmarked alphabet, called the consensual language. A word is in theconsensual language if a set of corresponding matching strings is in theoriginal language. The family thus defined includes the regular languages andalso interesting non-semilinear ones. The word problem can be solved inNLOGSPACE, hence in P time. The emptiness problem is undecidable. Closure properties areproved for intersection with regular sets and inverse alphabetical homomorphism.Several conditions for a consensual definition to yield a regular language arepresented, and it is shown that the size of a consensual specification ofregular languages can be in a logarithmic ratio with respect to a DFA. Thefamily is incomparable with context-free and tree-adjoining grammar families.