Linguistic Patterns in AI
(Image Credit: GeeksforGeeks)
(Image Credit: BotPenguin)
June 3, 2025
Sophia Lin
9th Grade
Brooklyn Technical High School
AI: Computer systems that can perform tasks traditionally thought to require human intelligence, such as learning, problem-solving, and decision-making.
In our current society, humans are regarded as far more intelligent than other animals. As AI advances, its intelligence is increasingly described as "on par" with human intelligence. And just as humans and other animals each have their own forms of communication, AI, which may one day even surpass human intelligence, has a language of its own: embeddings.
The language of embeddings is what AI uses to process and represent information, but it is far too complex for humans to fully comprehend. As a result, our knowledge of this language is very limited. Yet despite "thinking" in a language humans may never fully understand, AI usually communicates with us in human languages. For instance, ChatGPT responds to questions in English, Mandarin, Spanish, and other languages, as do other AI programs. Why is that?
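To get a feel for what embeddings are, here is a minimal sketch in Python. The four-dimensional vectors below are made-up numbers for illustration only; real models learn embeddings with thousands of dimensions from massive datasets. The key idea is that words with related meanings end up with vectors that point in similar directions, which we can measure with cosine similarity.

```python
import math

# Hypothetical 4-dimensional embeddings for three words.
# Real embeddings are learned from data and have far more dimensions.
embeddings = {
    "cat": [0.9, 0.1, 0.3, 0.0],
    "dog": [0.8, 0.2, 0.4, 0.1],
    "car": [0.1, 0.9, 0.0, 0.7],
}

def cosine_similarity(a, b):
    """Measure how closely two embedding vectors point in the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Related words ("cat", "dog") score near 1; unrelated words score much lower.
print(cosine_similarity(embeddings["cat"], embeddings["dog"]))
print(cosine_similarity(embeddings["cat"], embeddings["car"]))
```

In a real model, nobody hand-writes these numbers; they emerge from training, which is part of why the resulting "language" is so hard for humans to interpret.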
This is where linguistic patterns come into play. Linguistic patterns are the regularities and structures found in language data, such as grammar rules, sentence structures, and word relations. Similarly, in AI, linguistic patterns refer to the study of regularities in language data that enable a foundational understanding of language, structure, meaning, and use. AI systems learn these patterns by training on large datasets of text and speech, allowing them to understand, process, and generate human language effectively.
In order for AI to learn human language, it uses NLP (natural language processing) models to help it identify patterns in human language. The development of natural language processing began in the 1950s, and though there were repeated attempts and failures, the first successful statistical machine translation systems emerged in the late 1980s. Before these systems, most language processing relied on complex sets of hand-written rules, which, as you can imagine, was incredibly tedious given the enormous number of grammar rules across the world's thousands of languages. In the late 1980s, however, a revolution in natural language processing began with the introduction of machine learning algorithms. This marked the birth of the NLP models used in AI today.
NLP models go through a series of steps to generate human language effectively. First, the input goes through data preprocessing, which breaks the sentence down and simplifies it so the model can work with it more easily. Next, the simplified input is passed through a series of machine learning models that analyze its meaning. This leads to the final step: generating a response based on, or related to, the topic provided by the user.
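The preprocessing step can be sketched in a few lines of Python. This is a simplified illustration assuming plain English text; real pipelines add more stages (such as stemming or subword tokenization), but the idea of breaking down and simplifying the input is the same.

```python
import re

# A tiny list of common "stop words" that carry little meaning on their own.
STOP_WORDS = {"the", "a", "an", "is", "are", "to", "of", "on"}

def preprocess(sentence):
    # 1. Normalize case so "The" and "the" count as the same word.
    text = sentence.lower()
    # 2. Tokenize: split the sentence into individual word tokens.
    tokens = re.findall(r"[a-z']+", text)
    # 3. Simplify: drop stop words, leaving the content-bearing tokens.
    return [t for t in tokens if t not in STOP_WORDS]

print(preprocess("The cat is sitting on the mat."))
# ['cat', 'sitting', 'mat']
```

The model then works with this cleaned-up list of tokens instead of the raw sentence.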
There are many NLP models in use today, and one of the best known is GPT-3, the model that originally powered ChatGPT. Although ChatGPT is now powered by GPT-3.5, a fine-tuned version of GPT-3, the two work in very similar ways. GPT-3 is a natural language processing model developed by OpenAI that uses machine learning and statistics to predict the next word in a sentence based on the previous words. Because it can identify patterns of language, it is able to predict the next word accurately and generate coherent text.
Over the past decade, researchers from MIT, Cornell University, and McGill University have analyzed the speech and word structures humans use and how those structures might be taught to AI. They have also developed a machine learning model capable of uncovering the rules and patterns of human language by inventing its own rules to explain why certain grammatical forms change. Essentially, it can learn a language on its own. Even more impressively, the model can also learn higher-level patterns that apply across many languages. In one experiment, researchers tested it on problems drawn from linguistics textbooks covering 58 different languages. Each problem provided a set of words with corresponding word-form changes, and the model was able to come up with a correct rule and describe it about 60 percent of the time.
All in all, as AI continues to grow in use around the world, components like NLP models are being fine-tuned and actively studied by many. Perhaps one day, these models will become so advanced that they can introduce humans to their own language, just as they were introduced to ours: embeddings.
Reference Sources
Zhang, Guosheng. "AI Linguistics." Natural Language Processing Journal, vol. 10, 2025, 100137, ISSN 2949-7191, https://doi.org/10.1016/j.nlp.2025.100137, https://www.sciencedirect.com/science/article/pii/S2949719125000135.
Murati, E. (2022). Language & Coding Creativity. Daedalus, 151(2), 156–167.
https://www.jstor.org/stable/48662033.
Wikipedia. "Natural Language Processing." Wikimedia Foundation, 28 June 2019, https://en.wikipedia.org/wiki/Natural_language_processing.
Zewe, Adam. “AI That Can Learn the Patterns of Human Language.” MIT News, Massachusetts Institute of Technology, 30 Aug. 2022,