
The importance of hapax legomena

Author: Mark Aronoff and Kirsten Fudeman

Source: What is Morphology

Part and page: P242-C8

2026-04-24


According to Baayen (1992), if you want to study morphological productivity, it is important to study hapax legomena, words that appear only once in a given corpus, preferably a large one. Why? If you adhere to the theory discussed immediately above, then a productive rule is like a machine that spins out words, throws them into the air, and doesn’t bother to keep track of them. Words that appear only once in a large corpus are more likely than words that are used repeatedly to have been formed by a productive rule.

 

If this seems counterintuitive to you, then think of it in terms of concrete examples. If you look in the dictionary, you probably won’t find giggle-gaggle. But it does not sound odd, because semi-reduplicatives like this are common in English: chitchat, jingle-jangle, flip-flop, zigzag. If giggle-gaggle fell out as a hapax legomenon in a large corpus, it would be precisely because it follows a productive pattern, and speakers who use it can create it on the fly. Memorized words, ones that are not created on the fly but are stored in the lexicon, are more likely to recur in a large corpus. So in a large corpus, we would expect to find multiple examples of words like monitor, third, or get. We are not claiming that words that follow a productive pattern have to be hapax legomena – we would also expect to find multiple examples of inflected forms of common words, like argues or arguing. We are saying only that if a word is a hapax legomenon, it is more likely to have been formed by a productive rule.

 

If you take a huge corpus (say, 30, 50, or 100 million words) and look for words that occur only once, this will be a very good indicator of productivity. The formula that Baayen proposes is quite simple: productivity P is equal to the number of words of a given morphological type occurring only once in a corpus, n_1, divided by the total number of tokens of words of the same morphological type, N:

$$P = \frac{n_1}{N}$$

For example, if we are considering the type X-ness (e.g., redness), then we look for words that occur only once in our corpus (perhaps decidedness), and we divide the total number of such once-only words by the total number of occurrences of the type X-ness in our corpus. This will be our measure of the productivity of the type X-ness in our corpus. The larger and more representative of the language the corpus is, the closer this number comes to the actual productivity of the pattern in the language.
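The X-ness calculation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the corpus is a made-up toy list, the function names are hypothetical, and membership in the type is decided by a crude string test on the ending rather than by real morphological analysis.

```python
from collections import Counter


def baayen_productivity(tokens, belongs_to_type):
    """Return (hapax count, token count, ratio) for one morphological type.

    `belongs_to_type` is a caller-supplied test (an assumption of this
    sketch); a real study would rely on morphological annotation.
    """
    lowered = (t.lower() for t in tokens)
    counts = Counter(t for t in lowered if belongs_to_type(t))
    hapaxes = sum(1 for c in counts.values() if c == 1)  # words seen exactly once
    total = sum(counts.values())                         # all tokens of the type
    if total == 0:
        raise ValueError("no tokens of this type in the corpus")
    return hapaxes, total, hapaxes / total


def ends_in_ness(word):
    # Crude stand-in: this would wrongly accept words like "witness".
    return len(word) > 4 and word.endswith("ness")


# Toy corpus, invented purely for illustration.
toy_corpus = [
    "redness", "kindness", "kindness", "decidedness", "monitor",
    "Redness", "get", "happiness", "kindness",
]

hapaxes, total, ratio = baayen_productivity(toy_corpus, ends_in_ness)
print(hapaxes, total, ratio)
```

Here decidedness and happiness each occur once, while redness and kindness recur, so the sketch counts two hapax legomena among seven X-ness tokens. A toy list this small says nothing about English; the measure only becomes meaningful on a very large corpus.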

 

Baayen’s formula does not take into consideration how many different types of words there are, only the ratio of hapax legomena to tokens. If you find a high ratio of words that occur only once in a given pattern to the total number of tokens of that pattern, you have evidence of productivity. The formula is reasonably predictive, and it offers a technique for indirectly gaining access to the kind of linguistic knowledge speakers possess.

 

There are some caveats to Baayen’s formula, as pointed out by Bauer (2001), who applied the formula to the Wellington Corpus of Written New Zealand English. In that corpus, the suffix -iana occurs only once, in the word Victoriana. If we apply the formula, the number of hapax legomena is one and the total number of tokens containing -iana in the corpus is also one, so the ratio is 1/1 = 1 and -iana appears to be totally productive, an apparently absurd result. As Bauer notes, this doesn’t reflect a problem with Baayen’s formula. Instead, the problem lies with the relatively small sample size.
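The -iana case is easy to reproduce. The sketch below assumes the tokens of the type have already been extracted into a list; the function name is hypothetical, and the second list is an invented contrast, not data from the Wellington Corpus.

```python
from collections import Counter


def hapax_ratio(type_tokens):
    """Hapax legomena divided by total tokens, for tokens of one type."""
    counts = Counter(type_tokens)
    hapaxes = sum(1 for c in counts.values() if c == 1)
    return hapaxes / len(type_tokens)


# The single -iana token reported for the Wellington Corpus.
single_token = hapax_ratio(["Victoriana"])

# Hypothetical contrast: one more occurrence of the same word.
repeated_token = hapax_ratio(["Victoriana", "Victoriana"])

print(single_token, repeated_token)
```

With one token the ratio is 1.0; a single additional occurrence of the same word would drop it to 0.0. A value that swings this far on one extra token shows why so small a sample cannot support a productivity estimate.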

 

The Wellington Corpus of Written New Zealand English contains not much more than a million words and only one example of the suffix -iana. This is not enough for our purposes. (Baayen’s original corpus was about 18 times larger.) It’s also important to keep in mind that the numbers we get by applying Baayen’s formula cannot be compared across corpora of different sizes. The same affix might yield different results depending on the corpus used. This doesn’t invalidate the formula. It comes about because the value produced is relative to the size of the corpus.
