Terms to know when talking about language

Allison Parrish

These definitions assume a familiarity with English, and extensively refer to American English (specifically, my own accent of American English) in their examples.


The way a language is written is called its “orthography.” For example, when we talk about the orthography of English, we’re talking about:

  • The writing system we use (i.e., the 26 letters of the Roman alphabet)
  • How individual words are composed from these letters (i.e., “spelling”)
  • Rules for determining how a word is spelled, based on how it sounds

Other languages have orthographies that differ in various ways. For example, Spanish also uses the Roman alphabet, but has a number of different variations on the characters themselves (accents, etc.) and stricter rules about how the sound of a word corresponds to its spelling. (In Spanish, as with many languages other than English written in alphabetic systems, it’s much easier to figure out how a word is spelled based on how it sounds.)

Many languages use writing systems where the individual unit of orthography is not a letter that represents a single sound, but a symbol that represents some larger unit, such as a syllable, a morpheme (see below), or a word. Omniglot is an excellent web site for learning about how the writing systems of the world work.

For the purposes of this class, a word’s “spelling” and its “orthography” can be considered the same thing.


A language’s “lexicon” is the list of all valid words in the language. These are the words that you would find in a dictionary of the language. Of course, the idea that there can be such a thing as a “complete” list of words in any language is highly contested: new words are coined all the time, while other words fall out of favor. Words that originate in marginalized groups (e.g., African-American Vernacular English), or words that are borrowed from other languages, are often denied “official” status (by whatever organization deems itself to have the necessary authority to say which words are “official” and which are not).

An individual’s “lexicon” is the list of words that that person in particular knows—colloquially known as that person’s “vocabulary.” The idea that the size of someone’s vocabulary or lexicon is associated with intelligence is common, but contested (see here and here).
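To make the idea concrete, here is a minimal sketch that treats a lexicon as a set of words and a text’s vocabulary as the set of distinct words it contains. The names `toy_lexicon` and `vocabulary` are my own for illustration, and the five-word lexicon is obviously nowhere near a real dictionary.

```python
import re

def vocabulary(text):
    """Return the set of distinct lowercase words in a text."""
    return set(re.findall(r"[a-z']+", text.lower()))

# A toy "lexicon" (hypothetical); a real one would have many thousands of entries.
toy_lexicon = {"the", "dog", "cuddles", "with", "cat"}

text = "The dog cuddles with the cat. The cat yeets the dog."
vocab = vocabulary(text)

# Words used in the text that the toy lexicon does not list.
unlisted = sorted(vocab - toy_lexicon)
print(sorted(vocab))
print(unlisted)  # ['yeets']
```

Note that “yeets” is not “wrong” here; it just isn’t on this particular list, which is exactly the problem with treating any list as complete.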


A language’s “phonology” is the set of sounds used in that language, and the rules for how those sounds interact. Different languages have very different phonologies. For example, the initial sound in the word “the” (in which you put your tongue between your teeth and vibrate your vocal cords; the linguistic term for this is a “voiced interdental fricative”) is found in English but in very few other languages.

A common sound found in other languages but not in English (except in loanwords or personal names) is the voiceless velar fricative: position your tongue to make a “k” sound, but let the air flow through a small opening, instead of closing the opening entirely. (Think of the final sound in the name “Bach”).

A language’s phonology is often divided into two types of sounds: vowels and consonants. Vowels (think “ay”, “ee”, “eye”, “oh!”, “oooh!”) are made by shaping the vocal tract in various ways (with the lips and tongue) while vibrating the vocal cords to produce sound. Consonants (e.g., the “p” and “t” in “pet”) involve making some kind of full or near occlusion in the vocal tract, using the tongue, the lips, the velum, the uvula, etc.

A language’s phonology and its orthography aren’t necessarily related in a logical or consistent way. Chinese characters, for example, make little reference to how the words they represent sound when spoken. Linguists use a special alphabet called the International Phonetic Alphabet to write the sounds of a language, regardless of how those sounds are written in the language’s most common writing system.

English orthography, as everyone knows, has very little to do with English phonology, which is why words like “trough” and “doff” rhyme, even though they’re written completely differently. There’s no hard-and-fast way to automatically determine how a word is pronounced based on how it’s spelled; for most English words, you just have to memorize how they’re pronounced. Most online dictionaries have pronunciation guides that will tell you how a word is pronounced. The CMU Pronouncing Dictionary is a computer-readable dictionary with pronunciations for many thousands of English words.
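As a rough sketch of why a pronunciation dictionary is useful, the code below checks whether two words rhyme by comparing their sounds rather than their spellings. The entries are typed in by hand in the style of the CMU Pronouncing Dictionary (ARPAbet symbols, with a digit on each vowel marking stress); they are an assumption for illustration, not loaded from the real file, and they reflect one accent of American English. The names `toy_pronunciations`, `rhyming_part`, and `rhymes` are hypothetical.

```python
# Hand-written entries in the style of the CMU Pronouncing Dictionary
# (an assumption for illustration; not read from the real dictionary).
toy_pronunciations = {
    "trough": ["T", "R", "AO1", "F"],
    "doff": ["D", "AO1", "F"],
    "dough": ["D", "OW1"],
}

def rhyming_part(phones):
    """Return the sounds from the last stressed vowel to the end of the word."""
    for i in range(len(phones) - 1, -1, -1):
        if phones[i][-1] in "12":  # stressed vowels end in 1 or 2
            return phones[i:]
    return phones

def rhymes(a, b):
    """Two words rhyme here if their rhyming parts are identical."""
    return rhyming_part(toy_pronunciations[a]) == rhyming_part(toy_pronunciations[b])

print(rhymes("trough", "doff"))   # True
print(rhymes("trough", "dough"))  # False
```

“Trough” and “dough” share four letters of spelling but no rhyme, while “trough” and “doff” share almost no spelling and rhyme anyway.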

In this class, we’ll refer to the “five vowels” of English, based on the five letters commonly used to write vowel sounds (a, e, i, o, u). However, it’s important to keep in mind that speakers of most varieties of English distinguish between over a dozen distinct vowel sounds. You can hear many of these in the following words: beet, bit, bait, bet, bat, bought, boat, butt, boot.
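The gap between vowel letters and vowel sounds can be counted directly. In this small sketch, each of the nine words above is paired with an ARPAbet-style symbol for its vowel sound; the pairings are my own hand-written assumption, based on one accent of American English, and the name `vowel_sound` is hypothetical.

```python
# Vowel sound of each word, as a hand-assigned ARPAbet-style symbol
# (an assumption for illustration; accents differ).
vowel_sound = {
    "beet": "IY", "bit": "IH", "bait": "EY", "bet": "EH", "bat": "AE",
    "bought": "AO", "boat": "OW", "butt": "AH", "boot": "UW",
}

# The vowel letters that appear anywhere in the spellings.
vowel_letters = {ch for word in vowel_sound for ch in word if ch in "aeiou"}
distinct_sounds = set(vowel_sound.values())

print(len(vowel_letters), "vowel letters:", sorted(vowel_letters))
print(len(distinct_sounds), "vowel sounds:", sorted(distinct_sounds))
```

Nine different sounds are written with only five letters, which is one reason English spelling is such an unreliable guide to pronunciation.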


A language’s “morphology” is the set of rules that determine how words are put together. In English, most morphology comes in the form of affixes. The word “unfastening,” for example, is made up of the root word “fasten” and two affixes: the prefix “un-” and the suffix “-ing.” “Un-” and “-ing” can’t stand alone as separate words; they’re always used to build up meaning on another word.

Another common affix in English is “-s,” which is used both to indicate that a noun is plural (“cheese” vs. “cheeses”) and to indicate subject agreement on a verb in the present tense (“I read” vs. “she reads”). Morphology in English can be irregular: for example, the plural of “child” is not “childs” but “children”; the past tense of “go” is “went,” not “goed.”

A word stripped of all its morphology is called a “lemma”; the process of removing the morphology from a word (or an entire text) is called “lemmatization.” For example, the lemma of “running” is “run”; the lemma of “children” is “child.”
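Here is a deliberately naive sketch of lemmatization, assuming only two regular suffixes (“-ing” and “-s”) plus a tiny hand-made table of irregular forms. The names `IRREGULAR` and `toy_lemma` are hypothetical, and this is not how any particular library does it; real lemmatizers rely on much larger word lists and on knowing a word’s part of speech.

```python
# Irregular forms have to be listed by hand (a tiny, hypothetical sample).
IRREGULAR = {"children": "child", "went": "go", "ran": "run"}

def toy_lemma(word):
    """Guess the lemma of an English word using a lookup and two suffix rules."""
    word = word.lower()
    if word in IRREGULAR:
        return IRREGULAR[word]
    for suffix in ("ing", "s"):
        # Only strip the suffix if a reasonably long stem is left over.
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            stem = word[: -len(suffix)]
            # Undo consonant doubling before "-ing": "runn" becomes "run".
            if (suffix == "ing" and stem[-1] == stem[-2]
                    and stem[-1] not in "aeiou"):
                stem = stem[:-1]
            return stem
    return word

for w in ["running", "children", "cheeses", "reads", "went"]:
    print(w, "->", toy_lemma(w))
```

The rules are easy to fool: “falling” comes out as “fal” and “glass” as “glas,” which is a good illustration of how much irregularity a real lemmatizer has to account for.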

English morphology allows for the construction of very sophisticated words that have a meaning composed from their units; “antidisestablishmentarianism” is an often cited extreme example of this. Other languages have even more productive morphology. In Turkish, for example, the single verb “Muvaffakiyetsizleştiricileştiriveremeyebileceklerimizdenmişsinizcesineyken” has the meaning “As though you are from those whom we may not be able to easily make into a maker of unsuccessful ones.”


The “syntax” of a language is all of the rules that determine how words fit together in sequence. For example, the sequence of words “The dog cuddles with the cat” is syntactically “valid” in English, but the sequence of words “Cuddles the dog the cat with” is not. (But see below on the word “valid.”)

When learning a language, it’s not enough to simply learn the words of the language—you need to learn the rules for how to form clauses and sentences from those words as well. Some languages, like English, have very strict rules about the order of words: for example, the subject of a verb comes before the verb, and the object of the verb comes after it. (This order is called “SVO”—subject, verb, object. Some languages instead use SOV order, and still others VSO.) In other languages, the rules for order aren’t as strict, or are used for different purposes (such as to show emphasis).
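The three word orders mentioned above can be sketched as a function that arranges the same subject, verb, and object in different sequences. The function name `order_clause` is my own, and using English words for every order is purely for illustration; it shows the ordering pattern, not how any actual SOV or VSO language works.

```python
def order_clause(subject, verb, obj, order="SVO"):
    """Join a subject, verb, and object in the sequence named by `order`."""
    parts = {"S": subject, "V": verb, "O": obj}
    return " ".join(parts[slot] for slot in order)

for order in ("SVO", "SOV", "VSO"):
    print(order, "->", order_clause("the dog", "chases", "the cat", order))
```

Only the first result reads as conventional English, even though all three contain exactly the same words.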

Some linguists believe that all languages have underlying, absolute rules about what is “correct” and “incorrect,” and that it’s possible to apply a procedure (perhaps a very complicated procedure) to determine whether some stretch of language conforms to those rules. As a poet and artist, I find that language can have interesting, meaningful effects even when its words are unconventionally ordered, and that using the term “incorrect” or “invalid” for such language is needlessly stigmatizing (and delegitimizes both creative uses of language and language as used by marginalized groups). So instead of “incorrect” or “invalid,” I’ll usually use the terms “unconventional” or “atypical.” (Not a perfect strategy by any means, but better than the status quo, I think.)


When we use the word “semantics,” we’re talking about what a stretch of language means. We can talk about a word’s phonology (e.g., the word “run” has the sounds “r”, “u”, “n”) and its morphology (the past tense of “run” is “ran”) and its syntax (“run” is a verb which can be either transitive or intransitive). The semantics of the word is what it actually means—i.e., “To move forward quickly upon two feet by alternately making a short jump off either foot” (among other meanings).

All of the elements of language are related to semantics, of course, because one of the primary things people want to do with language is convey meaning. However, it’s possible to talk about language from a strictly structural standpoint without referencing its meaning, just as it’s possible to talk about what someone “meant” by some stretch of language without explicitly talking about its structure.


Finally, “discourse” or “discourse analysis” is the study of how language functions and how language is structured in units and situations beyond the sentence. For example, how do people conduct conversations? It seems like a trivial question (we all have conversations every day!) but the rules and behavior observed in conversations are actually quite sophisticated.

How about other kinds of linguistic interaction? How is holding a conversation with a friend different from ordering food at a restaurant? When we tell stories, what are the rules for determining how the story is structured? What’s the difference between a non-fiction book and a fiction book? These are all questions for discourse analysis.