Tools & Resources
To make sure we maintain the quality and coverage of Chambers dictionaries, we are constantly developing in-house tools and resources to support the dictionary writing process. Two of these are described here.
Chambers International Corpus
A corpus is a very large collection of texts stored electronically and annotated in such a way as to allow lexicographers to collect evidence and answer questions about the way we use language. In recent years, we have worked very hard to build and maintain a large, state-of-the-art corpus resource using a combination of sophisticated web-spidering and electronic data processing techniques. This is the Chambers International Corpus (CHIC) – almost a billion words of modern, international English. You can read more about CHIC and how it is used by our lexicographers here.
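As a rough illustration of the kind of evidence a corpus can supply, the sketch below builds a simple keyword-in-context (concordance) listing over a couple of made-up sentences. It is not a description of how CHIC itself is stored or queried; the function name `concordance`, the `width` parameter and the sample sentences are all assumptions made for the example.

```python
import re

def concordance(texts, keyword, width=30):
    """Return keyword-in-context lines for every whole-word match of keyword.

    Hypothetical helper for illustration only; not part of any Chambers tool.
    """
    # Match the keyword as a whole word, ignoring case.
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for text in texts:
        for match in pattern.finditer(text):
            # Take up to `width` characters of context on each side.
            left = text[max(0, match.start() - width):match.start()]
            right = text[match.end():match.end() + width]
            # Right-align the left context so the keywords line up in a column.
            lines.append(f"{left:>{width}} [{match.group(0)}] {right}")
    return lines

# Invented sample sentences standing in for corpus texts.
sample_texts = [
    "She decided to google the recipe before dinner.",
    "Google announced a new product; many people google it daily.",
]

for line in concordance(sample_texts, "google"):
    print(line)
```

Lining up every occurrence of a word with its surrounding context in this way lets a lexicographer see at a glance, for example, whether a word is being used as a noun or a verb.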
Wordtrack
It is crucial that our dictionaries reflect current language use as accurately as possible. To this end, we have developed Wordtrack, a comprehensive approach to tracking and analysing new words which combines corpus-based computational techniques with the reliability of a well-established reading programme. You can read more about Wordtrack here, including how to license our new-word resources for your own use.
For a more detailed description of our tools and resources, you can also read our published paper:
Ruth O'Donovan and Mary O'Neill (2008). A Systematic Approach to the Selection of Neologisms for Inclusion in a Large Monolingual Dictionary. In Proceedings of Euralex 2008, Barcelona, Spain. [pdf]