FORENSIC LINGUISTIC TOOLS
All of our research projects include two questions:
- Can the accuracy of a statistical model reach a high enough level with the limited data quantity typically available in a forensic setting to be useful in investigation and adjudication?
- Can the linguistic analysis be fully automated or semi-automated?
These questions are not included in the research project descriptions, but they are guiding principles of all ILE research. Each of the following projects is fully automated; current research focuses on developing the appropriate databases for continued validation testing and on reaching the highest possible accuracy.
GIGISM Gender Guessing: with what degree of accuracy is it possible to guess the gender of a document's author?
AGNESSM Age Estimation: with what degree of accuracy is it possible to estimate the age of a document's author? Can the accuracy reach a high enough level with the limited data quantity typically available in a forensic setting to be useful in investigation and adjudication?
WISERSM Witness Statement Collusion: how accurately can we determine if two witnesses have actually experienced the same event, or have been coached by one who did witness the event?
PRETextSM Predator Text Assessment: how accurately can we predict when an overtly predatory sexual text will occur in a chat?
FIREPANTSSM Veracity Assessment: how accurately can we distinguish between a truthful witness statement and a less than truthful statement? How accurately can we identify less than forthcoming statements in depositions and other dialogic communications?
ONGOING VALIDATION TESTING
SynAIDSM: syntax-based author identification
UniAIDESM: grapheme-based author identification
ThreatAssessSM: determines statistically whether a letter is a real threat or a control document
SNARESM Suicide Note Assessment: how accurately can we determine if a text is or is not a real suicide note?
FUNDAMENTAL TEXT ANALYSIS TOOLS
In developing forensic linguistic methods, we begin by building fundamental text analysis tools. These tools are then tweaked and developed into components of ALIAS: Automated Linguistic Identification and Assessment SystemSM, which underlie the services offered through our sister organization, ALIAS Technology LLC. Some of our fundamental Text Analysis Tools are:
InterTexterSM: derives an n-gram analysis of each document in a set and determines the overlap of n-grams from 2 to 8 words in length.
LexiLapSM: quickly derives several measures of vocabulary overlap between two documents, or one document and a set of documents.
Stripper: divides a text into content words and function words, with an option to strip words to their base form.
Tagger: uses a ~30,000 word lexicon, morphological rules, and syntactic rules to tag words for part of speech.
Scraper: enables the user to scrape text from the web directly into the ALIAS Research Document Library database.
E-TaggerSM: Tagger with the addition of other elements for highly ambiguous tagging.
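To make the kinds of measures described for InterTexter and LexiLap concrete, the following Python sketch shows one simple way that word n-gram overlap (2 to 8 words) and vocabulary overlap between two documents could be computed. It is an illustration only and is not the ALIAS implementation: the function names, the regular-expression tokenizer, and the choice of the Jaccard coefficient as the vocabulary measure are all assumptions made for this example.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens.

    Hypothetical tokenizer for illustration; a real tool would likely
    handle punctuation, numbers, and contractions more carefully.
    """
    return re.findall(r"[a-z0-9']+", text.lower())


def ngrams(tokens, n):
    """Return the set of distinct word n-grams of length n."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def shared_ngrams(text_a, text_b, min_n=2, max_n=8):
    """Map each n in [min_n, max_n] to the n-grams found in both texts."""
    tokens_a, tokens_b = tokenize(text_a), tokenize(text_b)
    return {
        n: ngrams(tokens_a, n) & ngrams(tokens_b, n)
        for n in range(min_n, max_n + 1)
    }


def jaccard_vocabulary_overlap(text_a, text_b):
    """Shared word types divided by all word types (assumed measure)."""
    vocab_a, vocab_b = set(tokenize(text_a)), set(tokenize(text_b))
    union = vocab_a | vocab_b
    return len(vocab_a & vocab_b) / len(union) if union else 0.0


if __name__ == "__main__":
    doc_a = "The cat sat on the mat"
    doc_b = "The cat sat on a rug"
    for n, grams in shared_ngrams(doc_a, doc_b).items():
        print(n, len(grams))
    print(round(jaccard_vocabulary_overlap(doc_a, doc_b), 3))
```

Sets are used so that each distinct n-gram or word type is counted once per document. A frequency-weighted measure would require counts instead and would give different numbers for repetitive texts.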