Published online by Cambridge University Press: 04 June 2014
We have developed a heuristic method for unsupervised parsing of unrestricted text. Our method relies on detecting certain patterns in the part-of-speech tag sequences of the words in sentences. This detection is based on statistical data obtained from the corpus and allows us to classify part-of-speech tags into classes that play specific roles in the parse trees. These classes are then used to construct the parse trees of new sentences via a set of deterministic rules. To assess the viability of the method across languages, we have tested it on English, Spanish, Italian, Hebrew, German, and Chinese. We have obtained a significant improvement over other unsupervised approaches for some languages, including English, and provided, as far as we know, the first results of this kind for others.
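
The abstract does not spell out the statistics or the rules used, so the following is only a toy sketch of the general idea: gather counts over part-of-speech tag sequences from a corpus, then build a tree for a new sentence with a deterministic rule driven by those counts. The function names `bigram_counts` and `greedy_bracket`, the choice of adjacent-pair counts, and the greedy merging rule are all assumptions made for illustration; they are not the method described in the article.

```python
from collections import Counter


def bigram_counts(tagged_corpus):
    """Count adjacent part-of-speech tag pairs over a corpus.

    `tagged_corpus` is an iterable of sentences, each a list of tag strings.
    (Hypothetical statistic; the article's actual statistics are not given.)
    """
    counts = Counter()
    for tags in tagged_corpus:
        counts.update(zip(tags, tags[1:]))
    return counts


def greedy_bracket(tags, counts):
    """Toy deterministic rule: repeatedly merge the adjacent pair of nodes
    whose representative tags form the most frequent bigram, until a single
    binary tree remains. Ties go to the leftmost pair.
    """
    if not tags:
        raise ValueError("cannot parse an empty tag sequence")
    # Each node is (representative_tag, subtree). Keeping the left child's
    # tag as the representative is an arbitrary illustrative choice.
    nodes = [(t, t) for t in tags]
    while len(nodes) > 1:
        best = max(
            range(len(nodes) - 1),
            key=lambda i: counts[(nodes[i][0], nodes[i + 1][0])],
        )
        left, right = nodes[best], nodes[best + 1]
        nodes[best:best + 2] = [(left[0], (left[1], right[1]))]
    return nodes[0][1]


corpus = [
    ["DT", "NN", "VBZ"],
    ["DT", "NN", "VBD", "DT", "NN"],
    ["PRP", "VBZ", "DT", "NN"],
]
counts = bigram_counts(corpus)
tree = greedy_bracket(["DT", "NN", "VBZ"], counts)
```

In this toy corpus the pair `("DT", "NN")` is the most frequent, so the determiner and noun are bracketed together before the verb is attached. A real system would replace both the statistic and the merging rule with the tag classes and rules described in the full article.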