A collaboration between CFL and Lexegesys Ltd., this project investigated the potential for attributing authorship in the context of micro-blogging. The team built on techniques from forensic linguistics that had been successfully applied to the analysis of SMS text messages in criminal cases, extending their application to tweets. It developed a proof-of-concept system to identify and extract a range of discriminatory features, along with an efficient approach to analysing large numbers of short texts. Nicci’s role on the project was to compile lexicons of discriminatory features and to develop a hierarchical taxonomy for their identification.


