An investigation of Twitter messages reveals new insights and tools for studying how people use stretched words, such as "duuuuude," "heyyyyy," or "noooooooo." Tyler Gray and colleagues at the University of Vermont in Burlington present these findings in the open-access journal PLOS ONE on May 27, 2020.
In spoken and written language, stretching a word can modify its meaning. For instance, "suuuuure" can imply sarcasm, while "yeeessss" may indicate excitement. Stretched words are rare in formal writing, but the rise of social media has opened up new opportunities to study them.
Gray and colleagues have now completed the most comprehensive study to date of "stretchable" words in social media. They developed a new, more thorough strategy for identifying stretched words in tweets and used it to analyze a randomly selected dataset of about 10 percent of all tweets generated between September 2008 and December 2016, totaling about 100 billion tweets.
The researchers identified thousands of "stretchable" words in the tweets, including "ha" (e.g., "hahaha" or "haaahaha"), "awesome" (e.g., "awesssssommmmmeeeeee") and "goal" (e.g., "ggggoooooaaaaallllll").
They also identified two key ways of measuring the characteristics of stretchable words: balance and stretch. Balance refers to the degree to which different letters tend to be repeated. For instance, "ha" has a high degree of balance because when it is stretched, the "h" and the "a" tend to be repeated just about equally. "Goal" is less balanced, with "o" repeated more than any other letter in the word.
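To make the idea of balance concrete, here is a minimal Python sketch. It is an illustration only, not the authors' code: the function names `letter_runs` and `balance_score` are hypothetical, and normalized entropy is used here as one plausible way to score how evenly the repeats are spread across letters. The study characterizes balance across many tweets, whereas this sketch scores a single stretched token.

```python
from itertools import groupby
from math import log


def letter_runs(word):
    """Run-length encode a word: 'goooal' -> [('g', 1), ('o', 3), ('a', 1), ('l', 1)]."""
    return [(ch, len(list(grp))) for ch, grp in groupby(word)]


def balance_score(word):
    """Illustrative balance score (an assumption, not the paper's definition).

    Returns the normalized entropy of the run lengths: 1.0 when every letter
    is repeated equally often, lower when one letter dominates.
    """
    counts = [n for _, n in letter_runs(word)]
    if len(counts) < 2:
        return 1.0  # a single run is trivially balanced
    total = sum(counts)
    entropy = -sum((n / total) * log(n / total) for n in counts)
    return entropy / log(len(counts))
```

Under these assumptions, an evenly stretched token such as "ggooaall" scores close to 1, while "gooooooal", where only the "o" is repeated, scores lower.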
Stretch refers to how long a word tends to be stretched. For instance, short words or sounds like "ha" have a high degree of stretch because people often repeat them many times (e.g., "hahahahahahahaha"). Meanwhile, regular words like "infinity" have lower stretch, often with just one letter repeated: "infinityyyy."
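The contrast between "ha" and "infinity" can be sketched in a few lines of Python. This is a rough illustration under stated assumptions rather than the method used in the study: `collapse` and `stretch_ratio` are hypothetical helper names, and the per-token length ratio is only a simple stand-in for the study's stretch measure, which describes how a word tends to be stretched across many tweets.

```python
import re


def collapse(word):
    """Replace each immediately repeated unit with a single copy.

    Simplification (assumption): legitimate double letters are collapsed too,
    so 'hello' becomes 'helo'. The substitution runs in a single pass.
    """
    return re.sub(r"(.+?)\1+", r"\1", word)


def stretch_ratio(word):
    """Length of the stretched token divided by the length of its collapsed form."""
    return len(word) / len(collapse(word))
```

With this sketch, "hahahahahahahaha" collapses to "ha" and gets a much larger ratio than "infinityyyy", which collapses to "infinity".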
For this analysis, the researchers developed various tools and methods that could be used in future research on stretchable words, such as investigations of mistypings and misspellings. The tools could also be applied to improve natural language processing, search engines, and spam filters.
The authors add: "We were able to comprehensively collect and count stretched words like 'gooooooaaaalll' and 'hahahaha', and map them across the two dimensions of overall stretchiness and balance of stretch, while developing new tools that will also aid in their continued linguistic study, and in other areas, such as language processing, augmenting dictionaries, improving search engines, analyzing the construction of sequences, and more."
Citation: Gray TJ, Danforth CM, Dodds PS (2020) Hahahahaha, Duuuuude, Yeeessss!: A two-parameter characterization of stretchable words and the dynamics of mistypings and misspellings. PLoS ONE 15(5): e0232938. https://doi.org/10.1371/journal.pone.0232938
Funding: CMD and PSD were supported by National Science Foundation Grant Number IIS-1447634, and TJG, CMD, and PSD were supported by a gift from MassMutual. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing Interests: We have the following interests: TJG, CMD, and PSD were supported by a gift from MassMutual. There are no patents, products in development, or marketed products to declare. This does not alter our adherence to all of the PLOS ONE policies on sharing data and materials.
In your coverage please use this URL to provide access to the freely available article in PLOS ONE: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0232938