A Proposal To Reform the Vietnamese Writing System into Polysyllabicity
By dchph
A note on this English version
Abstract
INTRODUCTION
THE PRESENT STATE OF THE VIETNAMESE WRITING SYSTEM
WHY THE CURRENT WRITING SYSTEM REQUIRES REFORM
- Vietnamese and its Sinitic foundations
- Vietnamese and Chinese commonalities
- On the evident polysyllabism of Vietnamese
- The politics of polysyllabics
HOW TO REFORM THE CURRENT VIETNAMESE WRITING SYSTEM
- The weakest links
- The other pictures: Lessons from our neighbors
- Polysyllabic writing fosters abstract and collective thought
- Accuracy facilitates data processing
x X x
A Note on This English Version
This English version of Canhtân Cáchviết TiếngViệt presents the subject matter from a perspective tailored to Vietnamese readers. At the same time, it is intended for English speakers who may be interested in Vietnamese linguistic issues but lack familiarity with the language. Many may not realize that the Vietnamese words cited in this work follow a proposed compound formation, an essential concept at the heart of the discussion.
Compared to the Vietnamese proposal, this version is more descriptive. Certain ideas may be self-evident to native speakers but require elaboration for non-native readers. Additionally, some viewpoints are deliberately omitted from the Vietnamese text. While these perspectives may serve as valid supporting arguments for outsiders, they risk being perceived as emotionally sensitive or controversial by Vietnamese audiences. These discussions touch on themes of national pride and cultural heritage, which may provoke unfavorable reactions and reduce receptiveness to arguments concerning the genetic composition of the Vietnamese people and their language. As such, they could detract from otherwise well-supported claims.
The author respectfully asks for understanding that the core argument of this proposal remains vital: the current Vietnamese writing system requires urgent reform toward a polysyllabic structure. Although this work does not constitute a formal academic thesis or scientific study, since some hypotheses warrant further investigation, it is an original composition: a carefully developed analysis advocating for Vietnamese orthographic reform and a serious proposal for improvement.
ABSTRACT
Why Vietnamese2020?
Vietnamese2020 introduces a reimagined Vietnamese writing system designed for future adoption, aiming to redefine how Vietnamese is written. This proposal advocates for reform of the current orthography, transitioning toward a polysyllabic structure with a slightly modified visual form that better reflects the linguistic nature of contemporary Vietnamese.
The proposed reform exposes monolingual native learners to symbolic textual patterns, fostering abstract and collective cognition through polysyllabic writing, where all syllables of a word are grouped in unified formations. This approach mirrors natural cognitive processes, allowing pre-defined text strings to recur in distinct shapes that resemble conceptual units rather than fragmented syllabic spellings, as seen in Vietnamese’s current monosyllabic system.
In a polysyllabic script, word meanings remain tightly bound to their visual configurations, functioning symbolically, akin to ideographs. Languages such as English and German exhibit this symbolic quality through their polysyllabic word structures, which are often perceived abstractly via the shape and rhythm of text strings.
By contrast, Vietnamese’s monosyllabic orthography compels readers to mentally disassemble words, first identifying individual syllables, then assigning meaning, and finally reconstructing the whole. Polysyllabic writing allows the brain to absorb continuous sequences as unified visual forms, echoing the ideogram-like recognition found in logographic systems. Learners proficient in foreign languages, particularly German, may already be familiar with these highly visual effects.
The Limits of Monosyllabic Writing
A monosyllabic writing system inherently restricts expressive capacity, because each syllable conveys only a fragment of a concept. If global databases had been structured like a monosyllabic Vietnamese dictionary, the world would have witnessed far less sophisticated computational systems than those we rely on today.
Although Vietnamese is no longer a monosyllabic language, its orthography remains fragmented, reminiscent of how Vietnamese once transcribed block-written Chinese characters prior to the late 19th century. Words such as họcbổng (scholarship), matuý (narcotic), bângkhuâng (melancholy), and bângquơ (indiscriminate) are inherently polysyllabic, yet continue to be written as disjointed syllables: học bổng, ma tuý, bâng khuâng, bâng quơ. This is typographically and cognitively regressive, akin to rendering English words as “schol ar ship”, “nar cot ic”, “mel an chol y”, or “in dis crim i nate”.
Each concept-word is polysyllabic by nature. Take bângquơ, for example: it is a derivative contraction of the Chinese trisyllabic word 不明确 (bù míng què), where 不明 (bù míng) contracts into the syllable bâng, and 确 (què) evolves into quơ. Only the combined formation bângquơ conveys the full meaning: “vague”, “indiscriminate”, “indefinite”, “unclear”. Neither bâng nor quơ alone functions as a meaningful morpheme in this context.
Across all languages, monosyllabic writing is illogical and unscientific. The Vietnamese examples above should be written in their combined formations, accurately reflecting their dissyllabic nature. Had English adopted a Vietnamese-style monosyllabic model, it would not have achieved its global dominance in computing, technical communication, or linguistic abstraction.
Language evolution drives societal progress. The stagnation of Vietnamese monosyllabic orthography has impeded technological development and data-processing efficiency, with broader implications for Vietnam’s advancement. Reform may be difficult, but it is imperative.
This proposed system establishes foundational principles for standardized polysyllabic writing, ultimately leading to a unified Vietnamese orthography. In the long term, polysyllabic Vietnamese will foster abstract reasoning in children, enhance computational literacy, and stimulate economic and technological growth.
Call to Action
The modernization of Vietnamese orthography begins now, through emails, online posts, and everyday efforts to adopt polysyllabic formations. While awaiting formal guidance from linguistic authorities, speakers can reference equivalent polysyllabic structures in foreign languages to ensure semantic and structural accuracy. Examples include:
- although → mặcdù
- blackboard → bảngđen
- faraway → xaxôi
The German writing system, renowned for its disciplined polysyllabic constructions, offers a compelling reference model for Vietnamese reform.
Vietnamese language modernization depends on early adopters who establish and normalize polysyllabic standards. This initiative is not utopian; it is a pragmatic and necessary step toward a resilient, future-proof linguistic system. With sufficient collective support, this transformation will be not only possible but inevitable.
The time to act is now.
INTRODUCTION
Languages are among the most enduring artifacts of human civilization. They evolve gradually, rarely succumbing to abrupt change. Yet over time, every language undergoes transformation, especially in its writing system. Across history, writing reform has often marked a pivotal phase in societal development. Today, in the era of global digital communication, it is imperative to recognize that the Vietnamese writing system must be reformed, not only to enhance communicative precision, but also to accommodate the structural logic required for modern data processing.
The current Vietnamese orthography fails to reflect the dissyllabic nature of its spoken language. This disconnect has become a critical obstacle to linguistic representation and technological advancement. One of the central motivations behind this proposal is to address that gap. A reformed polysyllabic writing system will not only improve semantic clarity but also foster abstract and collective thinking in children, an essential foundation for cognitive development.
From a computational standpoint, the proposed system offers a structural framework for reform. It enables more accurate data modeling, improves electronic representation, and simplifies algorithmic translation. A Vietnamese translation engine, capable of rendering English webpages for monolingual users, will become feasible, as polysyllabic formations allow for more logical indexing and semantic coding. This reform will streamline spelling, sorting, tagging, and categorization across computing environments.
These tasks have long been hindered by the limitations of the current monosyllabic orthography. In reality, dissyllabic words constitute the majority of Vietnamese vocabulary. The proposed system is built on a dissyllabic principle: all two-syllable words should be written in combined formation to reflect their spoken unity. This approach will reduce semantic fragmentation and preserve conceptual integrity within word boundaries.
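The dissyllabic principle described above can be pictured as a lexicon lookup that joins separately written syllables into combined formations. The sketch below is only an illustration and not part of the proposal itself: the `LEXICON` set, the `join_compounds` function, and the greedy longest-match strategy are assumptions chosen for brevity, and the entries are compounds cited elsewhere in this text. Punctuation handling is deliberately left out.

```python
import unicodedata

# Hypothetical mini-lexicon: each entry is the syllable sequence of one compound.
LEXICON = {
    ("học", "bổng"),
    ("ma", "tuý"),
    ("bâng", "khuâng"),
    ("bâng", "quơ"),
    ("mặc", "dù"),
    ("bảng", "đen"),
}


def nfc(text):
    """Normalize to NFC so visually identical Vietnamese strings compare equal."""
    return unicodedata.normalize("NFC", text)


NORMALIZED = {tuple(nfc(s) for s in entry) for entry in LEXICON}
MAX_LEN = max(len(entry) for entry in NORMALIZED)


def join_compounds(sentence):
    """Rewrite space-separated syllables, fusing any run found in the lexicon."""
    syllables = nfc(sentence).split()
    out = []
    i = 0
    while i < len(syllables):
        # Try the longest candidate first, down to two syllables.
        for n in range(min(MAX_LEN, len(syllables) - i), 1, -1):
            candidate = tuple(s.lower() for s in syllables[i:i + n])
            if candidate in NORMALIZED:
                out.append("".join(syllables[i:i + n]))
                i += n
                break
        else:
            # No compound starts here: keep the syllable as a one-syllable word.
            out.append(syllables[i])
            i += 1
    return " ".join(out)


print(join_compounds("học bổng ma tuý bâng khuâng"))
```

A real system would need a full dictionary and a way to resolve ambiguous segmentations; once words are written in combined formation, however, a plain whitespace split already yields whole words for sorting and indexing.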
Moreover, there is an even more urgent rationale for reform: to strengthen the cognitive development of monolingual Vietnamese speakers, beginning with young native learners. Under the proposed polysyllabic script, children encounter language as holistic, symbol-like units: multi-syllable words fused into single visual forms. Imagine scanning a block of text at a glance and instantly apprehending its concepts from those unified shapes, rather than laboriously decoding each isolated syllable. This shift in reading strategy will not only speed up literacy but also deepen abstract and collective thinking.
Given these factors, writing reform is not merely desirable; it is necessary. It promises long-term scientific and economic benefits for Vietnam. Historically, reform efforts have been sidelined, often due to the perceived difficulty of changing national writing habits. But if we collectively recognize the shortcomings of the current system, we can build momentum for a popular movement.
Only through shared effort can we raise awareness and advocate for reform. This includes urging the government to place writing reform on the national agenda, beginning with the establishment of a language academy tasked with developing a master plan. Only then will the goal of Vietnamese orthographic reform move from aspiration to reality.
With these principles in mind, the following sections will examine the current state of Vietnamese writing, the rationale for reform, and the pathway forward.
THE PRESENT STATE OF THE VIETNAMESE WRITING SYSTEM
This section examines key characteristics of the Vietnamese language and the historical evolution of its scripts, with the aim of clarifying the structural shortcomings of the current writing system.
1) Vietnamese and its Sinitic foundations
In my recent study, What Makes Chinese So Vietnamese, I demonstrate that over 90 percent of Vietnamese vocabulary derives from Chinese. This insight springs from a novel dissyllabic analysis that uncovers Sino-Vietnamese etymologies even in core lexicons. The following section will trace the historical and linguistic forces behind this pervasive Chinese influence. Crucially, if world-renowned linguistic authorities recognize Chinese as a polysyllabic language, then Vietnamese, with its fundamentally dissyllabic structure, should be classified likewise. Establishing this structural parallelism lays the groundwork for the orthographic reform proposed here.
Vietnam’s millennium-long period under Chinese rule (111 BC–936 AD) played a decisive role in embedding Chinese vocabulary into Vietnamese. Over centuries, Vietnamese absorbed thousands of Chinese words, both ancient and modern, through processes of borrowing and localization. These lexical integrations occurred across multiple dialectal layers and historical stages.
Beyond imperial influence, the sustained migration of Chinese populations into Vietnamese territories over the past two millennia further deepened this linguistic convergence. These migrants, often war refugees, impoverished peasants, soldiers, and political exiles, settled permanently, intermarried with local communities, and became assimilated into the dominant ethnic group known as Kinh. Their dialectal features, carried into Vietnamese society, gradually merged with the native linguistic environment.
Over successive generations, the descendants of Chinese immigrants became fully absorbed into Vietnamese society, identifying as part of the Kinh majority alongside Vietnam’s other ethnic groups. Their original dialectal features quietly merged into everyday speech as families blended into the local cultural milieu.
This gradual assimilation finds a modern parallel in the more than fifty thousand Amerasians born to Vietnamese women during the U.S. military presence between 1963 and 1973. A similar pattern emerges in the mestizo populations of Latin America, where centuries of contact produced new biological and linguistic blends.
The deep penetration of Chinese vocabulary into Vietnamese also reflects the linguistic policies enforced during a millennium of Chinese rule. Conquerors mandated Chinese for administration and scholarship, embedding Sinolectal terms from the elite literary register down into the most basic layers of daily speech.
Even after securing independence in the tenth century, Vietnam retained the Chinese writing system as its official script before gradually developing Nôm, a set of vernacular characters derived and adapted from Chinese. By the late nineteenth century, two parallel word stocks had taken shape: Hán-Việt (Sino-Vietnamese) for formal and scholarly vocabulary, and Hán-Nôm (Sinitic-Vietnamese) encompassing all Vietnamese words of Chinese origin, including ancient loanforms.
For further detail and contextual analysis, see What Makes Chinese So Vietnamese.
2) Vietnamese and Chinese commonalities
Vietnamese and Chinese share a wide array of linguistic features, including core vocabulary, morphemic compounding, dialectal and colloquial expressions, grammatical particles, classifiers, and functional words. These attributes are highly specific to languages within the same historical family, once broadly classified under the Sino-Tibetan umbrella. Notably, many foundational Vietnamese lexemes appear to stem from the same etymological roots as their Chinese counterparts.
The influence of Chinese on Vietnamese dates back at least to the Qin and Han dynasties (beginning in 221 BCE), and possibly earlier. Numerous culturally embedded terms of ancient Chinese origin, such as đũa 箸 (chopsticks), bếp 庖 (kitchen), canh 羹 (broth), bàn 案 (table), ghế 椅 (chair), tủ 匵 (cabinet), cũi 櫃 (cupboard), vuquy 于歸 (bridal send-off ceremony), and thángchạp 臘月 (twelfth lunar month), remain actively used in Vietnamese, even as many of these terms have faded from modern Chinese usage.
This enduring presence affirms the depth and permanence of Vietnamese lexical adoption from Chinese. These words are not merely borrowed; they are culturally embedded, structurally integrated, and semantically preserved across centuries of linguistic evolution.
The shared lexicon expands further when considering archaic terms still used in both languages today. Examples include thánggiêng 正月 (January), Tết 春節 (Spring Festival), TếtĐoanngọ 端午節 (Late Spring Festival), and numerous basic words with likely common roots:
- cha 爹 (father)
- mẹ 母 (mother)
- anh 兄 (older brother)
- chị 姐 (older sister)
- thịt 腊 (meat)
- ăn 吃 (eat)
- uống 飲 (drink)
- lúa 來 (rice grain)
- voi 為 (elephant)
- trâu 牛 (water buffalo)
- cọp 虎 (tiger)
- lửa 火 (fire)
- lá 葉 (leaf)
- đất 土 (soil)
(See Appendix B for extended listings.)
This process of linguistic absorption continued long after Vietnam gained independence from China. Archaeological findings from the late 1970s, including inscribed tablets, reveal Sinitic-Vietnamese vocabulary dating to the Ming dynasty (16th century). The influence persists into the modern era, with colloquial expressions such as:
- khôngdámđâu 不敢當 (“It’s not so”)
- basạo 瞎掰 (“all mouth”)
- tầmbậy 三八 (“nonsense”)
- bạtmạng 拼命 (“reckless action”)
- phaocâu 屁股 (“chicken’s butt,” a delicacy)
- dêxồm 婬蟲 (“lecherous”)
These examples underscore a long-standing linguistic convergence that predates even the Han dynasty’s initial incursions into ancient Vietnam. (See Appendices for further documentation.)
Vietnamese has also adopted Chinese methods of vocabulary formation, especially in the creation of dissyllabic compounds, where each syllable carries semantic weight. Like Chinese characters, Vietnamese syllables often function independently as morphemes. However, many dissyllabic words in Vietnamese have evolved into indivisible units, where one or both syllables lack standalone meaning and must be interpreted as a whole.
Examples of such composite formations include:
- càgiựt (ill-behaved)
- càlăm (stammer)
- cùlần (unworldly)
- càmràm (whining)
- lãngnhách (nonsense)
- xíxọn (talkative)
- dưahấu (watermelon)
- basạo (all mouth)
These compounds, numbering in the thousands, have become permanently dissyllabic and morphemic in nature. (See Appendix B.)
3) On the evident polysyllabism of Vietnamese
To demonstrate the dissyllabic nature of Vietnamese, one need only sample entries from any modern Vietnamese dictionary. Across multiple pages, dissyllabic words consistently account for well over two-thirds of the contemporary lexicon. Vietnamese is no longer a monosyllabic language in practice; it has evolved into a predominantly dissyllabic, and increasingly polysyllabic, system. This shift is one of the defining characteristics of the sound of present-day Vietnamese.
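The sampling exercise described above is easy to sketch in code. The snippet below is a hypothetical illustration, not a measurement: `syllable_distribution` is an assumed helper name, and the six headwords are taken from examples in this text rather than from a real dictionary, so the printed shares say nothing about the actual lexicon. It relies on the fact that, in the current orthography, a headword's syllables are separated by spaces.

```python
from collections import Counter


def syllable_distribution(headwords):
    """Return {syllable_count: share of headwords} for space-separated entries."""
    counts = Counter(len(word.split()) for word in headwords)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {n: counts[n] / total for n in sorted(counts)}


# Illustrative sample only; a real survey would read a full dictionary file.
sample = ["học bổng", "ma tuý", "bâng khuâng", "bâng quơ", "ăn", "gió heo may"]

for n, share in syllable_distribution(sample).items():
    print(f"{n}-syllable entries: {share:.0%}")
```

Running the same function over the headword list of a complete dictionary would give the proportion that the paragraph above estimates at well over two-thirds.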
Yet despite this vocal transformation, the current Romanized script fails to reflect dissyllabism accurately. Most two-syllable words are still written as separate syllables with intervening white space, obscuring their semantic unity. This typographic fragmentation mirrors the legacy of Chinese character-based writing, where each syllable was historically rendered as a discrete logograph.
In fact, during the first several decades of Quốcngữ’s official adoption, dissyllabic words were often hyphenated to signal their compound nature. This early convention acknowledged the structural evolution of Vietnamese vocabulary from monosyllabic to dissyllabic forms. Over time, however, the hyphenation practice was abandoned, and the visual integrity of dissyllabic words was lost.
Some linguists have argued that Vietnamese may have originally been polysyllabic, later compressed into monosyllabic forms under Chinese influence, and now re-emerging as dissyllabic. This trajectory, from polysyllabism to monosyllabism to dissyllabism, reflects a layered linguistic history shaped by over a thousand years of Chinese domination.
Evidence for early polysyllabism can be found in ancient Nôm script and early Romanized dictionaries, which record complex consonantal clusters such as bl- in words like blời (for trời, the sun) and blăng (trăng, the moon). These forms may have evolved into mặttrời and mặttrăng, possibly through phonological shifts such as b > m, and the vocalization of mặt as a semantic prefix. Such transformations parallel cases like khlong evolving into khủnglong (恐龍 kǒnglóng, dinosaur) in Chinese.
Further support for dissyllabism lies in native Vietnamese lexicons where syllables form inseparable pairs. Examples include:
- màngtang (temple)
- mỏác (crown of the head)
- đầugối (knee)
- khuỷtay (elbow)
- bảvai (shoulder)
- cùichỏ (elbow)
- mồhôi (sweat)
- cùlét (tickle)
And in polysyllabic compounds:
- xấcbấcxangbang (in tatters)
- bảlápbảxàm (talking nonsense)
- gióheomay (breeze)
- ngủlibì (sleep soundly)
- bayphấtphới (flying flag)
- mưalấtphất (drizzle)
- ngóchămbẳm, nhìnchằmchặp (gaze steadily)
- lộnxàngầu, lộntùngphèo (in chaos)
- mêtítthòlò (totally attracted to)
- thởhồnghộc (breathe heavily)
- bađồngbảyđổi (temperamental)
- tuyệtcúmèo (fabulous)
- bachớpbanháng (absent-minded)
- bãithama (graveyard)
These examples, among thousands, cannot be meaningfully separated into individual syllables. They function as unified semantic units, confirming the structural necessity of polysyllabic representation.
The hypothesis of a trajectory from polysyllabism through monosyllabism to dissyllabism remains tentative, because early Nôm transcriptions are ambiguous: it is unclear whether certain characters represented polysyllabic words or monosyllabic forms with complex consonantal initials.
Nonetheless, the structural patterns found in these cited words consistently point toward a developmental trend of dissyllabism. Phonetically, Vietnamese appears to have transitioned from simplicity to sophistication, from monosyllabic to dissyllabic expression.
This dissyllabic tendency is further evidenced by the presence of synonymous compounds, two-syllable words formed from elements with overlapping meanings. Unlike monosyllabic vocabulary, which often consists of stand-alone units, dissyllabic words in Vietnamese tend to be semantically interdependent and more precise. Their emergence reflects a linguistic strategy to avoid homonymic ambiguity and to encode more specialized meanings.
This phenomenon parallels modern Chinese, where dissyllabic compounds with synonymous syllables are common. Vietnamese examples include:
- tức|giận (angrily/mad)
- trước|tiên (initially/first)
- cũ|kỹ (ancient/old)
- kề|cận (closely/near)
- gấp|rút (urgently/quick)
Recognizing Vietnamese as polysyllabic is not merely a theoretical exercise; it is the foundation for a practical reform. Only by aligning the writing system with the true nature of the language can Vietnamese fully realize its communicative, cognitive, and computational potential.
Why do these linguistic observations matter for the proposed Vietnamese writing reform? They reinforce a central claim: Vietnamese is fundamentally a polysyllabic language. It shares core structural and lexical attributes with Chinese, a language widely recognized by leading linguistic institutions as polysyllabic in its spoken form. The sounds of both languages should therefore be transcribed in polysyllabic formation.
While this conclusion may appear straightforward, it is not universally acknowledged. For some, the dissyllabic character of Vietnamese remains obscured by its fragmented orthography. Yet the deep structural parallels between Vietnamese and Chinese are undeniable. The two languages are so intricately intertwined that any serious study of one is incomplete without reference to the other.
Composite Syntax and Derivational Structures in Vietnamese
Some linguists, misled by the surface features of dissyllabic synonymity, have mistakenly classified Vietnamese as an “isolated language” (in standard typological usage, an “isolating language”), a label implying that both word and sentence structures consist merely of discrete syllables treated as standalone words. What they may have intended to suggest is that Vietnamese remains in an early developmental stage, not yet having evolved into a morphologically mature system in which word forms reflect tense, case, or syntactic relation through inflection.
This view stands in contrast to the concept of a composite language, a term newly introduced in this proposal. A composite language parallels the notion of inflectional languages, such as English, where word and sentence structures are built from derivational forms. In Vietnamese, composite words are formed from syllables that function as integral components, akin to English radicals and affixes. For example, vănsĩ (“writer”), nghệsĩ (“artist”), quốcgia (“nation”), quốctế (“international”) all demonstrate polysyllabic integrity.
Many Vietnamese composite elements, whether affixes, radicals, roots, or suffixes, can be treated analogously to their English counterparts. These elements serve as semantic building blocks, forming complete word-concepts. Beyond this, Vietnamese also employs particles that construct verbal and adverbial expressions: maulên (“be quick”), bànvề (“talk about”), ănđi (“go ahead and eat”), nhấtlà (“especially”), chonên (“therefore”). Unique classifier-compounds such as bầutrời (“sky”), quảđất (“globe”), khuônmặt (“face”), bàntay (“hand”) further illustrate the language’s structural richness. In addition, reduplicatives like bànghoàng (“stunned”), bồihồi (“sorrowful”), bẻnlẻn (“timid”), bộpchộp (“hasty”) reinforce Vietnamese’s deep affinity with Chinese, far more than with any other language in the region, including the Mon-Khmer languages, which lack such connotative formations.
Vietnamese, as a composite language, possesses a distinct grammatical architecture. Structured sentences are formed through the use of grammatical particles and markers such as rồi (“already”), sẽ (“will”), đã (“have”), bị (passive voice), vìvậy (“therefore”), chodù (“though”), along with action particles like lên, đi, and thôimà. These elements function not as inflectional affixes, but as syntactic operators that shape meaning and temporal reference.
To those who have mistakenly claimed that Vietnamese lacks “grammar” simply because it does not encode tense or case through morphological inflection, a misconception that has fueled the “isolated language” label, it must be clarified that grammar is defined by a system of internal rules, not by the presence of inflection alone. Vietnamese grammar operates through composite structuring, semantic pairing, and particle-based syntax.
In fact, the syntactic organization of modern Vietnamese has been significantly influenced by French grammatical conventions, particularly in the construction of complete sentences. This historical layering further reinforces the composite nature of Vietnamese, both in its spoken cadence and written form.
Early Vietnamese texts clearly demonstrate how sentences were constructed, often without explicit subjects or objects, yet still grammatically complete. Remarkably, this syntactic feature persists today. Vietnamese sentence structure relies heavily on tonal and contextual cues, allowing composite constructions to convey precise meaning without overt grammatical markers. Consider the following examples:
- Đã biết vậyrồi saocòn mắcphải?
  (Literally: 'Been known so how come got it?', meaning “If you already knew that, why did you still fall for it?”)
- Chodù thếnào đi chăng nữa, cònnướccòntát.
  (Literally: 'Though how go more, still water still spare.', meaning “No matter what, give it your best shot.”)
- Thậtlà ngu thấyrõ, cơhội đếntay chẳnghiểusao lạiđể vuộtmất?
  (Literally: 'Really dumb seen clearly, opportunity reach hand not understand why cause slipped away?', meaning “That was truly foolish, how could he let the opportunity slip away?”)
- Ănno rồi chỉbiết ngủ thôi. Chả làmnên tíchsự gì!
  (Literally: 'Eat full already only know sleep solely. Not have done things good!', meaning “He just eats and sleeps, completely useless!”)
These examples illustrate how Vietnamese relies on connotative composite structures, where particles and word order shape the tone and meaning. The absence of explicit grammatical subjects or tense markers is compensated by semantic precision and syntactic fluidity.
Vietnamese also exhibits composite derivational behavior that parallels inflectional languages. One way to observe this is through the structural formation of compound words such as:
- nghệsĩ (artist)
- casĩ (singer)
- vănsĩ (writer)
- quốcgia (nation)
- quốctế (international)
Now, imagine a hypothetical system in which Vietnamese suffixes function analogously to English derivational endings like -ist, -er, or -or. If -sĩ were treated as a productive suffix equivalent to -s, we might derive:
- nghệs, văns, hoạs, nhạcs
Similarly, if -gia were rendered as -z, we could imagine:
- tácz (writer), luậtz (lawyer), sángchếz (inventor)
Prefixing sự- as s- yields:
- stình (circumstance), scố (incident), sviệc (matter), sthể (situation)
Treating -thuật as -th yields:
- kỹth (technology), nghệth (arts), math (magic), mỹth (aesthetics)
And phi- as f- yields:
- flý (illogical), fquânsự (demilitarized), fnhân (inhuman), fliênkết (non-aligned), fchínhphủ (non-governmental)
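The hypothetical abbreviations above follow a simple rewriting rule, which can be sketched in a few lines. This is an illustration of the thought experiment only: the `abbreviate` function and the two rule tables are assumptions introduced here, not a proposed spelling standard, and each word is rewritten by at most one rule.

```python
import unicodedata


def nfc(text):
    """Normalize to NFC so affix comparisons are not broken by combining marks."""
    return unicodedata.normalize("NFC", text)


# Hypothetical affix shorthands taken from the examples in the text.
SUFFIX_RULES = [("thuật", "th"), ("sĩ", "s"), ("gia", "z")]
PREFIX_RULES = [("sự", "s"), ("phi", "f")]


def abbreviate(word):
    """Apply the first matching shorthand rule; return the word unchanged otherwise."""
    w = nfc(word)
    for full, short in SUFFIX_RULES:
        full = nfc(full)
        # Require a non-empty stem so a bare affix is never rewritten.
        if w.endswith(full) and len(w) > len(full):
            return w[: -len(full)] + short
    for full, short in PREFIX_RULES:
        full = nfc(full)
        if w.startswith(full) and len(w) > len(full):
            return short + w[len(full):]
    return w


for word in ["nghệsĩ", "luậtgia", "sựcố", "kỹthuật", "phichínhphủ"]:
    print(word, "->", abbreviate(word))
```

The rules only work because the input is already written in combined formation: with the syllables separated by spaces, an affix such as sĩ or phi could not be told apart from an independent word.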
These analogical constructions demonstrate that Vietnamese composite vocabulary shares derivational logic with inflectional languages. The implications are clear: Vietnamese is not structurally isolated, but symbolically composite. Its vocabulary system reflects a layered morphology that, while not inflectional in the traditional sense, operates through systematic semantic pairing and syntactic cohesion.
Vietnamese as a Naturally Dissyllabic Language
Modern Vietnamese is inherently dissyllabic in its spoken rhythm. Even in sentences where individual words appear unrelated, they are vocally paired into two-syllable units that convey complete semantic notions. These units often co-occur with adverbial particles, forming composite expressions that are syntactically whole and connotatively rich. For example:
- Ăn lẹ | cho xong | rồi đi!
  “Eat quickly, finish it, then go!”
- Chờ mãi | không thấy | nó tới | tụi nầy | mới đi!
  “We waited and waited, he never showed up, so we left!”
This cadence is not incidental; it is foundational to Vietnamese conversational structure. In folkloric verse, the rhythm becomes even more pronounced:
- Yêu nhau | cởi áo | cho nhau,
  Về nhà | dối mẹ | qua cầu | gió bay!
  “To love is to give, even if I must lie to my mother that the wind blew off my clothes over the bridge.”
Here, pronouns and tenses are entirely implied within the dissyllabic framework. These are not strings of isolated syllables; they are connotative composites, word-concepts that synthetically blend meaning, tone, and grammatical function, often without explicit markers.
This challenges the outdated classification of Vietnamese as an “isolated language.” Unlike inflectional languages such as Russian, where grammatical cases allow flexible word order, Vietnamese achieves clarity through composite structuring. Though word order is more fixed, the semantic load is distributed across polysyllabic units, allowing speakers to convey nuanced meaning without overt subjects, objects, or tense indicators.
Such constructions are native and intuitive, not artificial. They are spoken fluently by Vietnamese speakers across all registers. In contrast, truly “isolated” utterances, composed of disconnected monosyllables, are characteristic of early language acquisition, such as in young children forming rudimentary phrases without regard for grammar or connotation.
If non-native speakers struggle to grasp these composite dynamics, it is understandable. Mastery requires not just vocabulary, but a native-level fluency capable of perceiving and producing connotatively structured sentences. How many foreign-born specialists in Vietnamese have truly reached this level, beyond the superficial parsing of syllables, to speak naturally as Vietnamese do? Few, if any. Yet many continue to perpetuate the erroneous classification of Vietnamese as isolated and monosyllabic. This is a methodological failure: garbage in, garbage out.
It is time to discard these reductive labels. Vietnamese is not isolated. It is composite and dissyllabic, or more exactly polysyllabic, in both structure and spirit. The orthography must evolve to reflect this reality, rather than continue to misrepresent the language through outdated symbolic conventions.
4) The politics of polysyllabics
Paradoxically, while the dissyllabic nature of Vietnamese is immediately evident to most non-native learners, who instinctively perceive its rhythmic pairing, many so-called “specialists” in Vietnamese consistently misclassify it as a monosyllabic language. This error is not incidental; it is systemic, and it persists across generations of linguistic pedagogy.
In studying Vietnamese, foreign learners must acquire not only monosyllabic lexical items but also dissyllabic composites. Mere familiarity with individual syllables may enable basic recognition and pronunciation, but it does little to foster true fluency. To master Vietnamese, one must learn dissyllabic words in their full, connotative form. Simply stringing syllables together does not yield intelligible or idiomatic speech.
This is no different from the study of Chinese: a non-native speaker may memorize two thousand individual characters, yet still fail to comprehend the thousands of dissyllabic compounds that derive from them. Recognition of radicals is not mastery. Likewise, in English, a learner may identify Latin roots, perhaps acquired through French or another Romance language, but this etymological awareness does not confer command over the full semantic and syntactic range of English polysyllables.
Strictly speaking, linguistic proficiency demands the acquisition of words in their polysyllabic entirety, not fragments, not radicals, not syllables in isolation. Vietnamese is no exception. Its lexical architecture is composite, its rhythm dissyllabic, and its semantic load distributed across paired units that function as grammatical and conceptual wholes.
To ignore this is to misrepresent the language. And to persist in labeling Vietnamese as “isolated” or “monosyllabic” is not merely outdated, it is methodologically bankrupt.
The dissyllabic nature of Vietnamese is not subtle; it is acoustically evident even to non-native listeners with only rudimentary linguistic awareness. When exposed to fluent Vietnamese speech, whether in casual conversation or broadcast media, they can intuitively detect word boundaries. This is because Vietnamese words are uttered in rhythmic pairs, forming a continuous chain of sound. If we let X represent a syllable, the auditory pattern typically unfolds as: XX XX X XX XX X XX…, a cadence of unbroken, paired syllables that signals semantic units with remarkable clarity.
To native speakers, this rhythm is not merely structural; it is musical. It echoes through folk songs and vernacular poetry, where dissyllabic word-concepts are most naturally expressed. Yet in writing, this pairing is obscured. Vietnamese orthography continues to render words as isolated syllables: X X X X X X X…, a typographic fragmentation that misrepresents the spoken language and obstructs its organic evolution. More critically, it imposes cognitive strain on young native readers, who must mentally reconstruct paired rhythms from a visually atomized script.
Before the 20th century, Vietnamese writing was based entirely on Chinese script. Chinese vocabulary served as a referential framework, supplying raw materials for the creation of Vietnamese dissyllabic compounds. From the 10th century onward, the Vietnamese people sought to express their own colloquial voice, distinct in sound and idiom. This led to the invention of Chữ Nôm, a block-script system that adapted Chinese characters to represent native Vietnamese expressions.
By the 16th century, Western missionaries arrived in Vietnam with the aim of translating religious texts. Faced with the dual challenge of Chinese and Nôm scripts, they devised an early form of Quốcngữ, a Romanized orthography tailored to their evangelical mission. Crucially, in transcribing Vietnamese speech, they recognized its dissyllabic structure. Their solution: insert hyphens between syllables to preserve word-concepts. Thus emerged forms like: gia-đình, đồng-bào, ăn-năn, each a typographic reflection of the spoken composite.
As Quốcngữ gained traction in the early 20th century, hyphenation became the norm for dissyllabic words and remained in active use at least until the war between North and South Vietnam ended in 1975. Today, however, hyphenation survives only in academic contexts such as editions of classic literature. Most native speakers now write dissyllabic words with a space between syllables, visually fragmenting what is audibly whole.
The result is an orthography that appears illogical, unscientific, and increasingly disconnected from the true nature of the spoken language. Vietnamese is not monosyllabic. It is dissyllabic, composite, and polysyllabic, in rhythm, in structure, and in spirit. And the writing system must evolve and mature to reflect this reality.
Dissyllabic density and the myth of monosyllabism
The sheer volume of dissyllabic compounds in Vietnamese is sufficient to classify the language as structurally dissyllabic. Consider just a few more examples from the Sino-Vietnamese stratum:
- tổquốc (fatherland)
- phụnữ (woman)
- giađình (family)
- cộngđồng (community)
Add to these the Sinitic-Vietnamese composites:
- sinhđẻ (give birth)
- dạydỗ (educate)
- lạnhlẽo (cold)
- nhờvả (depend on)
And further still, the so-called “pure” Vietnamese dissyllabic lexicons:
- mặccả (bargain)
- bângkhuâng (melancholy)
- ngọtngào (gently sweet)
- mồcôi (orphaned)
- hiuquạnh (desolate and tranquil)
(See Appendix B for extended listings.)
This lexical landscape leaves no doubt: Vietnamese is a dissyllabic language, structurally, rhythmically, and semantically.
The impossibility of true monosyllabism
In practical terms, no living language today is truly monosyllabic. The reason is mathematical as much as linguistic. A monosyllabic system offers a severely limited vocabulary. In Vietnamese, even with tone distinctions, the total number of usable one-syllable combinations is estimated at around 12,000. Many theoretical combinations, like tưp, nhửng, cunh, lẻp, phèp, tac, are either unused or phonotactically implausible.
If tones are excluded, as in many Mon-Khmer languages, an imagined monosyllabic language might be left with only 6,000 usable words. By contrast, English contains over 500,000 lexical entries, with thousands of new terms coined in the computing field alone over the past three decades.
In short, any language that remains truly monosyllabic today is either extinct or on the brink of extinction. Vietnamese is neither. It is vibrant, expansive, and polysyllabic. This statement should decisively retire the outdated notion of Vietnamese as a monosyllabic language.
To further illustrate Vietnamese dissyllabicity, we may compare it with English morphology. Both languages exhibit functional radicals, syllabic units that serve as morphological building blocks. English is unequivocally polysyllabic, yet if we filter out Latinate and Hellenic loanwords, we find a core of Anglo-Saxon monosyllables:
- go, keep, run, walk, eat, sleep
- morning (< morn), evening (< eve)
- before (be + fore), forward (fore + ward)
These basic units parallel Vietnamese monosyllables, some of which may have Sinitic origins:
- ăn (唵 eat)
- uống (飲 drink)
- đái (尿 urinate)
- ỉa (屙 defecate)
- đi (去 go)
- đứng (站 stand)
(See Appendix B for comparative modeling.)
Such parallels reinforce the point: monosyllables exist, but they do not define the language. Vietnamese, like English, is built on polysyllabic and dissyllabic foundations. Its orthography must evolve to reflect that reality.
Composite morphology and the case for dissyllabicity
Some may argue that comparing Vietnamese and English is like comparing apples and oranges; after all, English is an inflectional language, forming words through radicals and affixes (eater, keeper, walker, sleeper), while Vietnamese is often mislabeled as “isolated.” But this is precisely the misconception that must be corrected. Vietnamese is not isolated; it is a composite language, and its word formation reflects that.
As discussed earlier, Vietnamese equivalents to those English compounds include:
- nghệsĩ (artist)
- casĩ (singer)
- vănsĩ (writer)
In these examples, components like sĩ, giả, and gia function analogously to English suffixes such as -er, -ist, or -or. Crucially, these Vietnamese morphemes cannot stand alone, just as -er or -ist cannot function independently in English.
English has long absorbed foreign elements and developed compound formations such as:
- therefore, anybody, however, nevertheless, blackboard, gunship, eyebrow
Vietnamese mirrors this structure with equivalents like:
- vìvậy, bấtcứai, tuynhiên, nhưngmà, bảngđen, tàuchiến, chânmày
Yet while English preserves these compounds as unified words, current Vietnamese orthography continues to split them into isolated syllables, even when the individual components no longer carry independent meaning.
Consider the following dissyllabic composites:
- bângkhuâng (melancholy)
- hồihộp (breathless anticipation)
- mồhôi (sweat)
- taitiếng (infamy)
- mặccả (bargain)
- cùlét (tickle)
What does bâng mean in isolation? Or khuâng? Or mồ? Or hôi? These syllables, severed from their pairings, lose semantic coherence. Yet in writing, they are routinely broken apart, an orthographic practice that undermines the integrity of Vietnamese word-concepts.
This alone is sufficient to classify Vietnamese as a dissyllabic language. If polysyllabicity is defined by the prevalence and frequency of multi-syllable words in a language’s vocabulary stock, then Vietnamese, by virtue of its vast inventory of Sino-Vietnamese and Sinitic-Vietnamese compounds, is indisputably dissyllabic.
The continued use of monosyllabic spacing in Vietnamese writing is not only illogical and unscientific, it actively impairs the language’s capacity to function as a tool for abstract reasoning, cognitive development, and data structuring. Vietnamese deserves an orthography that reflects its true linguistic nature: composite, dissyllabic, and polysyllabic.
WHY THE CURRENT WRITING SYSTEM REQUIRES REFORM
In truth, the idea of reforming Vietnamese orthography is not new. Several distinguished scholars, Lãng-Nhân Phùng Tất-Ðắc (UK), Trịnh Nhật (Australia), Dương Ðức-Nhự, Ðào Trọng-Ðủ, and Phạm Hoàng-Hộ (the latter two having published works in dissyllabic format), alongside advocates such as Hồ Hữu-Tường, Nguyễn-Ðình Hoà, and Bùi Ðức-Tịnh, have long recognized the polysyllabic nature of Vietnamese and criticized the limitations of its current writing system. Yet their insights were largely eclipsed during the upheavals of 20th-century wartime Vietnam.
Today, however, technological progress and the rise of the global internet offer a renewed opportunity. Through digital platforms, websites, email, online publishing, we can reintroduce and actively experiment with a more accurate and efficient way of writing Vietnamese. The reform proposed here is not a radical departure, but a long-overdue correction.
The rationale for reform has already been touched upon throughout this paper. But let us now focus more precisely on the central claim: that replacing the current syllable-by-syllable system with one that reflects polysyllabic principles, writing multi-syllable words in unified, composite formations, will dramatically improve both mental processing and electronic data handling.
This is not merely a typographic adjustment. It is a structural realignment, one that restores Vietnamese to its rightful place among the world’s polysyllabic languages and equips it to function more effectively in modern cognitive, educational, and computational contexts.
Vietnamese, as it stands today, is the product of centuries of linguistic evolution, an amalgam of historical shifts, cultural overlays, and pragmatic adaptations. For hundreds of years prior to the 20th century, Chinese script served as the medium for official records, historical chronicles, and literary expression. Although the Nôm script was devised to transcribe vernacular Vietnamese, its usage remained largely confined to elite literary circles.
This historical trajectory was underpinned by a long-held belief, perhaps once plausible, that Vietnamese and Chinese shared genetic roots within the Sino-Tibetan family. Only in the mid-20th century did André Haudricourt’s groundbreaking work begin to reposition Vietnamese within the Mon-Khmer branch of the Austroasiatic family, challenging entrenched assumptions.
So why revisit this past? Some argue that Vietnamese now possesses its own superior romanized script and no longer needs to concern itself with archaic affiliations. Others claim that spoken language may evolve, but orthography should remain fixed, citing English as a case in point, where spelling has endured despite phonological drift. Predictably, such voices dismiss the need for reform as neither urgent nor necessary.
Yet a closer look at these objections reveals their fragility. As seen in debates like those in Bình luận về “Sửa đổi Cách viết Tiếng Việt” (Vietnamese Forum), resistance often stems from conservative quarters rather than from those who grasp the abstract and collective imperatives of reform. This is not surprising. Every major reform encounters opposition, just as the early Nôm innovators were ridiculed by traditionalists for daring to record Vietnamese sounds in non-Chinese forms.
Today’s anti-reform voices echo those same sentiments. They cling to a fragmented orthography that future generations may well regard as anachronistic. Their fears, that reform will sow confusion or chaos, are shortsighted, obscuring the long-term cognitive and technological benefits of a polysyllabic writing system.
Ask any opponent and you’ll hear a litany of objections, most of them sentimental or superficial. Some say the new script “looks strange”; others fear misunderstanding. But such resistance is precisely the weakest link in our linguistic evolution. It perpetuates backwardness in scientific thought and impedes the development of abstract and collective reasoning, especially among children and monolingual adults.
To be clear, this proposal does not advocate a radical overhaul. While a complete revamp, eliminating diacritics and reconfiguring derivatives, might appeal to second-language learners, our focus is more measured: to present Vietnamese words in their full conceptual unity through polysyllabic formations.
Consider classifiers: “con đường” (road), “bầu trời” (sky), “quả đất” (globe). Each classifier, “con”, “bầu”, “quả”, is semantically bound to its noun. Yet the current orthography severs these units, obscuring their relationship. Non-native learners often ask why we use “con”, “sự”, “bầu”, or “quả” inconsistently. If written as unified words, “conđường”, “bầutrời”, “quảđất”, the logic becomes self-evident.
While Vietnamese does not possess an overwhelming number of classifiers, the confusion surrounding their usage stems not from their quantity but from the way they are visually severed from the nouns they modify. The current orthography, writing classifiers and their associated words as separate syllables, fails to indicate which classifier belongs with which noun. This typographic fragmentation obscures semantic relationships and impedes comprehension.
This is not a minor flaw. Classifiers are among the defining features that distinguish Vietnamese from other languages in the Mon-Khmer branch of the Austroasiatic family. They are also one of the many structural affinities Vietnamese shares with Chinese, a neighboring language widely recognized as polysyllabic. Given these parallels, Vietnamese should likewise be classified as a polysyllabic language.
Importantly, the goal of this reform is not to simplify Vietnamese for foreign learners, nor to radically transform the language by converting classifiers into suffixes (-s, -z, f-, con-, sự-, etc.), nor to eliminate diacritics. The issue at hand is more fundamental: the current transcription of dissyllabic words is inaccurate. It does not reflect how Vietnamese is actually spoken.
In natural speech, dissyllabic words are delivered as unified sound chains, each pair of syllables forming a complete conceptual unit. So why are they broken apart in writing? Some argue it’s habit. Others cite tradition: the current system is widely understood, used from North to South, printed in books, etched on street signs. Change, they say, would be disruptive, unaesthetic, and impractical.
But this defense of the status quo is deeply flawed. Writing Vietnamese as if it were monosyllabic and isolated is unscientific, illogical, and retrograde. It reflects a mindset that resists progress, even when the evidence for reform is overwhelming.
We must confront the limitations of the current system honestly. Only by acknowledging its weaknesses can we begin to devise meaningful solutions. If left unaddressed, the system will continue to evolve in ways that avoid reform altogether, doing more harm than good.
The most insidious harm is cognitive. A writing system that presents language as a string of disconnected syllables trains the brain to think in concrete, fragmented terms. Over time, this shapes a generational mindset incapable of abstract and collective reasoning. Studies have shown that high-performing individuals often begin life with early exposure to polysyllabic languages, languages that foster conceptual thinking and cognitive flexibility. (See Ngôn ngữ và Trí tuệ by Nguyễn Cường.)
So we must ask ourselves: why continue writing our language in a way that diminishes its expressive power, when we have every capacity to do better? If we weigh the benefits against the drawbacks, the case for reform becomes clear. Transitioning from a monosyllabic orthography to a polysyllabic one is not just a linguistic adjustment, it is a cultural imperative.
The current system fails to reflect the true nature of Vietnamese dissyllabic words, which are spoken in paired sounds to convey complete and unique concepts. Once we accept that the writing system is inadequate for modern needs, we must approach reform with clarity, courage, and an open mind.
2) The other pictures: Lessons from our neighbors
Let us glance over our shoulders to observe how our culturally proximate neighbors have approached the question of script reform.
China, at various points in its modern history, earnestly considered abandoning its logographic script in favor of a Latin-based system. Yet despite the ambition, the plan was never realized. One of the principal obstacles was the overwhelming number of homophones in Mandarin, words that sound identical but differ in meaning and character. When early romanization attempts transcribed these homophones as isolated syllables, the resulting ambiguity was even greater than that found in the original block script.
Ironically, Chinese and Vietnamese share deep phonological affinities. Vietnamese has been successfully Romanized; so too could Chinese, had its reformers recognized the polysyllabic nature of their own language. But they did not. Instead, they clung to the notion of monosyllabism, a view reinforced by centuries of character-based writing and compounded by widespread illiteracy. Only with the full adoption of the Pinyin system in the late 1970s, driven by the demands of computerization, did China begin to standardize a Latinized transcription. Even then, the legacy of block script coding locked them into a hybrid system.
Historically, Western missionaries who ventured into China at the same time as those who came to Vietnam failed to introduce a Romanized script. Why? Again, they misunderstood the structural nature of Chinese. The concept of polysyllabism was alien even to Chinese linguists of the time, who were steeped in the tradition of character isolation. When missionaries attempted Latin transcription, they rendered each syllable separately, generating a flood of homonyms and confusion among native learners. Had they adopted a combining formation, or even hyphenation, as in Vietnamese, they might have succeeded.
Another factor lies deeper: the Chinese writing system is not merely functional, it is civilizational. With more than three thousand years of continuous attested use, it has become the symbolic soul of the nation. Even Mao Zedong, who once contemplated full romanization, ultimately abandoned the idea, reportedly out of reverence for Tang poetry. He alone had the authority to enact such a reform, but chose not to.
That moment has passed. China has since institutionalized Pinyin for formal transcription of Putonghua, as seen in global usage of “Beijing” and “Guangdong” rather than “Peking” or “Canton.” In doing so, they tacitly acknowledged the polysyllabic structure of their language, writing compound words in unified Latin formations.
Japan faced similar challenges. Romanization of Japanese would have unleashed an even greater flood of homonyms. Consider the syllable do, which corresponds to over 100 different Chinese characters in Japanese usage, all pronounced nearly identically. Vietnamese equivalents span a wide range: đông, đôn, độc, độn, đồn, đốc, đống, động, đồng, and even não, náo, thuỷ, bách, câu, điện, viễn, thời, nỗ, among others. To manage this complexity, Japan introduced two national phonetic scripts, Katakana and Hiragana, to complement the long-standing use of Kanji. These scripts serve to transcribe foreign words and native polysyllabic expressions, respectively.
This is not to say reform has been absent. Both China and Japan have implemented partial modernization: simplification of traditional characters, horizontal left-to-right writing, and standardized formatting. Though full romanization was never achieved, meaningful steps were taken.
Vietnam, by contrast, stands at a unique crossroads. Having already adopted a Romanized script, we possess the structural foundation to advance further, toward a polysyllabic orthography that reflects the true nature of our language. The lessons from our neighbors are clear: reform is possible, but only when the linguistic structure is correctly understood.
What our neighbors reveal
A provocative question deserves attention: had China and Japan succeeded in fully Romanizing their writing systems, would their scientific, technological, and economic development have accelerated beyond what we see today? The answer is almost certainly yes.
Had China adopted a Latin-based script earlier, mass literacy across its billion-plus population would likely have advanced more rapidly, and the digitization of language, essential for informatics, would have scaled faster and deeper. The economic ripple effects would have been profound. Instead, the complexity of Chinese characters posed significant obstacles to industrial modernization throughout the 1980s. Today, the script is deeply embedded in digital infrastructure, making any future reform a century-scale endeavor.
Some point to Taiwan as a counterexample: it has retained traditional Chinese characters since 1949 and still achieved notable success in computing, long before mainland China began to catch up at the start of the 21st century. True, but Taiwan’s progress in that field has been driven largely by English-language tools (programming languages, for instance, are written in English), not by the Chinese script itself.
Others cite the two Koreas as extreme cases. North Korea abandoned Chinese characters entirely, yet remains technologically stagnant, capable, so to speak, only of fielding divisions of hackers and building nuclear ballistic missiles. South Korea, by contrast, is a global leader in innovation, despite retaining Chinese characters in its writing system until the late 20th century. But here’s the nuance: South Korea recognizes the integral role of Chinese-derived vocabulary in its linguistic structure, just as Vietnamese does with Sino-Vietnamese and Sinitic-Vietnamese compounds. North Korea’s rejection of Chinese script may have inadvertently severed a vital link to shared technological and, consequently, economic development.
It’s tempting to argue that English alone is sufficient for technological advancement. After all, it is the global language of computing, and countries like China, Japan, South Korea, and Taiwan all rely on English for technical domains. So why should Vietnam bother reforming its own writing system?
Because English alone is not a panacea. Countries like India, the Philippines, and Jamaica use English officially, yet lag behind in scientific output. The key difference? Language reform. Singapore, South Korea, Malaysia, and Thailand, all of which have undergone linguistic modernization, stand out as regional success stories. Each has embraced polysyllabic structuring, facilitating smoother integration with digital systems.
What about Vietnam? Some may point to minor reforms, standardizing scientific terms like ốcxíthoá, cạcbônnát, canxum, nitrơát, or replacing y with i, etc. But these superficial changes have often done more harm than good, introducing confusion and burdening learners with parallel lexicons.
Today, Vietnam increasingly retains original foreign spellings for proper nouns, a pragmatic shift from the earlier mandate to transcribe names phonetically, e.g., Xan Phơ-ran-xít-cô (San Francisco), Oátsingtơn (Washington), Ốxtơrália (Australia), Nícơxơn (Nixon). This retention of original Latin placenames allows even monolingual Vietnamese readers to approximate foreign pronunciations and engage more efficiently with global content, i.e., San Francisco, Washington, etc.
Commonly recognized names derived from Chinese forms, such as Mỹ (America), Anh (England), Bỉ (Belgium), Đức (Germany, cf. 'Deutsch'), and Úc (Australia); localized borrowings like xàphòng (soap, from savon), càphê (coffee, from café), kem or càrem (ice cream, from crème), xinê (cinema, from cinéma), and ápphê (affair, from affaire); and recent loans such as sale, free, internet, and web are now fully naturalized and should remain untouched.
In sum, the lesson from our neighbors is clear: linguistic reform, especially one that embraces polysyllabic structure, is not merely aesthetic or academic. It is a strategic imperative for modernization, cognitive development, and global integration. Vietnam has already laid the groundwork with its Romanized script. Now is the time to complete the journey.
3) Polysyllabic writing fosters abstract and collective thought
It is no coincidence that those among us who have acquired a second or third language, especially polysyllabic ones like English or French, tend to excel in academic and scientific domains. These fields demand abstract reasoning and collective cognition, capacities that are not innate but cultivated through sustained linguistic and intellectual training. The process of mastering a polysyllabic language rewires the brain to perceive, process, and synthesize complex ideas.
This cognitive advantage extends far beyond academia. It shapes how we reason, collaborate, and innovate. We can safely assert that acquiring a polysyllabic second language is one of the most powerful disciplines for developing abstract and collective thinking. Those left outside this intellectual circle, often the economically disadvantaged, are typically monolingual Vietnamese speakers, conditioned by a monosyllabic orthography. Tragically, they represent the majority.
Can a nation thrive when most of its citizens are neurologically trained to think in fragments? This is not a rhetorical question, it is a national imperative.
Consider German. Its nouns are famously long:
- Auf Wiedersehen (See you again)
- Willkommenskultur (Welcoming Culture)
- Informationssystemverarbeitung (information system processing)
- Recherchemöglichkeiten (research possibilities)
- Betriebswirtschaft (business administration)
These are not awkward strings; they are unified word-concepts. German speakers accept them as cognitively whole. The capitalization of nouns further reinforces their symbolic integrity, signaling the beginning of a conceptual unit. This typographic convention fortifies abstract thinking. Germans do not read each word syllable by syllable, but by its overall shape, as a whole, so to speak. Nor do they write street signs or slogans in ALL-UPPERCASE strings as the Vietnamese do. And Germany rose strongly again less than twenty years after its near-total destruction in World War II.
In fact, Vietnamese speakers are trained to focus on minute details, a mindset that tends to tie abstract concepts to concrete objects, individually and sentimentally. We often boast among ourselves about how beautiful our language is, how each syllable evokes an object and paints a picturesque image of a word (in truth, a syllable) in the mind, or how orderly it is with regard to social hierarchy, dictating when to address a person by name, title, seniority, or rank (compare India's social classes, which still persist). In many other languages, including Chinese, which once resembled ours in this respect, all first- and second-person forms of address have been abstracted into "I, you" in English, "wo, ni" in Mandarin, or "je, tu" (with "moi", "toi", and "vous") in French. This is not because people in those cultures do not know how to address others respectfully. Such abstraction is not a sign of cultural indifference or disrespect. It is a linguistic transcendence, a cognitive elevation from concrete social markers to generalized human reference, that is, to a higher degree of abstraction.
Our own elaborate system of hierarchical address, by contrast, is not a mark of cultural sophistication; it is a symptom of linguistic descent, not transcendence.
From early childhood, Vietnamese learners are taught to spell syllables individually, not to perceive words as conceptual wholes. This pedagogical model, unchanged for generations, has conditioned the brain to process language in fragments. Meanwhile, spelling curricula in American schools evolve annually, adapting to cognitive research and pedagogical innovation.
Figuratively speaking, we teach our children to identify trees, but not to see forests. Americans, French, Chinese, they teach forests first.
This failure to utilize our writing system as a tool for abstract cognition has left a legacy in limbo, passed from one generation to the next. We continue to implant this fragmented mindset in our children and celebrate it as tradition. We have been intellectually impoverished for millennia, trapped in a stagnant pool of syllabic thinking.
Language is the scaffolding of thought. If the only tool our children have is a monosyllabic script, they will grow up thinking one syllable at a time.
Abstract and collective thinking is essential, for mathematics, science, economics, and beyond. It is not a gift; it is a skill, shaped by language. A poorly designed linguistic tool will yield poor cognitive outcomes. A polysyllabic writing system, by contrast, will stimulate the brain to think differently, and better.
Reading and writing Vietnamese in polysyllabic formations will help children perceive concepts as unified wholes. They will learn to associate meaning with structure, not with dismembered syllables. This is not just a reform, it is a cognitive revolution.
The Korean, Chinese, and Thai models: Writing as cognitive architecture
The Koreans have long understood the cognitive power of polysyllabic structuring. Their national writing system, Hangul, groups syllabic blocks into discrete concept-words, whether derived from native Korean or adapted Chinese vocabulary. Consider:
- Hyundai = hiệnđại (modern)
- Dongnama = ÐôngnamÁ (Southeast Asia)
- Banghwa = phònghoả (fire prevention) or phónghoả (arson)
- Kori = Caoly (Korea)
- Kamsamida = cảmtạ (thank you)
For example,
"그는 여러 代의 임금을 내리 섬긴 歷事로 널리 尊敬받았다."
Geuneun yeoreo dae'ui imgeumeul naeri seomgin yeoksaro neolli jon'gyeongbadatda.
(He was widely respected for his service under successive monarchs.)
And after script reform: 그는 여러 대의 임금을 내리 섬긴 역사로 널리 존경받았다.
Commentary: Their reformed writing system surpasses Chinese characters, Chữ Nôm, and even Quốcngữ. They have advanced from ideographic thinking to iconic representation, leaping toward the “smartphone” and AI era!
If X represents a Korean syllabic block, the visual structure of these words appears as: XX XXX XX XX, four concept-words, not nine isolated syllables. This typographic clarity mirrors the spoken rhythm and reflects a collective cognitive orientation. Korean writing is processed faster, mentally and digitally, because it aligns with the brain’s natural tendency to group meaning-bearing units.
By contrast, Chinese script places symbolic characters sequentially, either vertically or horizontally. While the characters themselves are rich in meaning, their linear arrangement lacks the grouping logic of Korean Hangul. The result is a slower, less efficient processing model, though still more effective than Vietnamese monosyllabic spacing. Could this be one reason why Korea has so often been a step ahead of China?
Across the Mekong River, to the west of Laos, Thai script offers another instructive model. Its writing flows like a train of uninterrupted syllables, with no spacing and no fragmentation. The visual rhythm reinforces semantic continuity. The principle is clear: “see one, catch all.” This is the essence of polysyllabic writing. Lao script is broadly similar, linguistically and orthographically, and, 'gotcha', Laos too came to possess an aircraft, like its neighbor Thailand.
Reform as a cognitive accelerator
Writing reform alone, of course, cannot guarantee technological progress. But it lays the foundation. A polysyllabic orthography enhances data processing, machine translation, and cognitive efficiency. It is a prerequisite for modernization: not a panacea, but a catalyst.
Consider again the German example: 'Informationssystemverarbeitung', a single word, instantly grasped. No German speaker mentally spells out its syllables. The concept is perceived holistically.
Now compare with Vietnamese: 'xử lý thông tin', four separate syllables and four visual shapes. A Vietnamese reader must first decode each syllable, then group them into two concept-words (xửlý, thôngtin), and finally synthesize the phrase. The cognitive load is heavier, the processing slower.
If we wrote it as xửlýthôngtin, the brain would process it in one unified step, just as in German. Even xửlý thôngtin would be an improvement. The same applies to 'chủnghĩaxãhội', 'chủnghĩaquốctế', and even adopted acronyms like 'AI', 'US', 'ASEAN', or 'VNCH'. The key is polysyllabic grouping.
Imagine applying this principle across hundreds of Vietnamese terms. The result: fewer visual units, faster comprehension, more efficient data handling. If the new polysyllabic writing system were already in place, our eyes, scanning a line of text, might recognize fewer distinct word-shapes. But paradoxically, our brains would process more meaning, and at greater speed.
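The 'xử lý thông tin' comparison can be made concrete by counting whitespace-delimited units, a rough proxy for the "visual shapes" a reader meets on the line. This is a minimal sketch: the helper name `visual_units` is hypothetical, and a unit count is only a stand-in for reading effort, not a measurement of it.

```python
def visual_units(text: str) -> int:
    """Count whitespace-delimited units in a piece of text."""
    return len(text.split())


# The same phrase in the current spelling and in the two joined spellings
# discussed above.
forms = {
    "current": "xử lý thông tin",
    "word-joined": "xửlý thôngtin",
    "phrase-joined": "xửlýthôngtin",
}

counts = {name: visual_units(text) for name, text in forms.items()}
for name, n in counts.items():
    print(f"{name}: {n} unit(s)")
```

The count drops from four units to two, and then to one, while the letters themselves are unchanged.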
If this explanation still feels unclear, your mind may still be parsing language one syllable at a time. What it needs is recalibration, polysyllabic training. And that begins with reform.
We have reached a clear conclusion: writing words as they are spoken, what we call the “natural way”, enables faster recognition and processing of concept-word-phrases than the fragmented, syllable-by-syllable method still in use. Readers should not be forced to decode each syllable, then mentally reassemble them into words, just to grasp the meaning of a phrase. In this respect, polysyllabic writing, when rendered in Latin script, can achieve symbolistic effects comparable to those of ideographic systems. It fosters abstract and collective thought.
Of course, not all Latin-script users think alike. But we, who still cling to monosyllabic orthography, have failed to fully harness the power of our writing system. Who else shares this predicament? The Hmong do, though polysyllabic compounds appear sporadically in their writing. So do several indigenous groups in Vietnam’s Central Highlands, whose orthographies were modeled on our own. They write as we do. And so, we have found companions who think as we think.
Let’s return to the bamboo analogy. We excel at distinguishing one bamboo stalk from another, whether in our front yard, along a village path, or deep in a forest. So why the confusion? Why do we still struggle with basic computing tasks? Our current system cannot support consistent font schemes, accurate spell-checking, or even proper alphabetical sorting. And forget about reliable translation of English websites.
The writing system we use today is a relatively recent invention, still full of flaws. We must not treat it as sacred simply because it was handed down to us. It is a tool, a symbolic medium for communication. And tools can be improved. If a better system, grounded in polysyllabic principles, can be created and adopted, that is the one we should value. Not the imperfect one we now endure.
To be clear, we do not advocate radical reform, such as replacing “sĩ” with -s or -ist, “gia” with -z or -er, or “sự” with s- for abstract nouns. Instead, we propose a modest shift: let go of old habits and begin writing Vietnamese the polysyllabic way. Simply combine the syllables of each word, usually two, to form a complete unit that conveys the full concept.
4) Accuracy facilitates data processing
One need not be a database architect to grasp how poorly Vietnamese linguistic logic currently serves digital infrastructure. The inefficiencies (redundant field attributes, convoluted algorithms, and excessive parsing layers) are evident in even the most basic online dictionaries or translation engines. The urgency for reform is clear: Vietnamese must adopt a polysyllabic writing system to meet the demands of modern data processing, especially in the AI era.
When scanning large volumes of information, it is far more efficient to recognize concept-words as unified visual symbols than to mentally reconstruct meaning from fragmented syllables. Take the English word international. A reader does not need to spell it out, in-ter-na-tion-al, to understand it. The shape alone conveys the concept instantly, much like a Chinese ideograph or a pictogram.
This symbolic efficiency extends to derivatives:
- internationalization
- internationalism
- international imperialism
- internationale
Each is processed at nearly the same speed as the root word international, because the visual structure remains anchored to a recognizable radical.
Now compare this to Vietnamese equivalents:
- quốc tế
- quốc tế hoá
- chủ nghĩa quốc tế
- chủ nghĩa đế quốc quốc tế
- thế giới đại đồng
In their current orthographic form, these phrases require multiple cognitive steps: decoding syllables, grouping them into words, and finally synthesizing the concept. But suppose they were written as:
- quốctế
- quốctếhoá
- chủnghĩaquốctế
- chủnghĩađếquốcquốctế
- thếgiớiđạiđồng
Then the brain would process them more rapidly, recognizing fewer shapes while absorbing more meaning.
This efficiency translates directly to computing. A microprocessor can handle polysyllabic strings with greater speed and accuracy. For example, chủnghĩaquốctế saves three bytes of memory compared to its syllable-separated counterpart. It also eliminates ambiguity: no more confusion between chủ nghĩa and chu nghĩa, or chú nghĩa, all of which are legitimate syllables but semantically unrelated.
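The three-byte figure can be checked directly. The sketch below assumes UTF-8 storage and NFC normalization, neither of which the text specifies; the helper name `utf8_size` is hypothetical. A space (U+0020) occupies one byte in UTF-8, so the saving equals the number of spaces removed.

```python
import unicodedata


def utf8_size(text: str) -> int:
    """Size in bytes of the NFC-normalized text when encoded as UTF-8."""
    return len(unicodedata.normalize("NFC", text).encode("utf-8"))


spaced = "chủ nghĩa quốc tế"
joined = spaced.replace(" ", "")  # the polysyllabic spelling

# Each removed space is exactly one byte in UTF-8.
saved = utf8_size(spaced) - utf8_size(joined)
print(f"bytes saved: {saved}")
```

The accented letters take two or three bytes each in UTF-8, so the relative saving is smaller than the character count alone suggests.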
In database architecture, this matters. Translating chủnghĩaquốctế becomes as straightforward as translating internationalism. The system no longer needs to scan through dozens of unrelated entries, chủ nhà, chủ tiệm, chủ chứa, chủ trương, chủ ý, chủ trì, chủ quan, before locating chủ nghĩa, and then repeating the process for quốc tế. The polysyllabic form collapses this complexity into a single, searchable unit.
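A toy sketch of the lookup difference follows. The lexicon entries, English glosses, and function names are illustrative assumptions, not drawn from any real dictionary or translation engine. With syllable-spaced text the program must first guess where words begin and end (greedy longest-match here); with joined text each written word is a single key.

```python
# Toy lexicon keyed by the current, syllable-spaced spelling (illustrative only).
LEXICON = {
    "chủ nghĩa quốc tế": "internationalism",
    "chủ nghĩa": "doctrine",
    "quốc tế": "international",
    "chủ nhà": "host",
    "chủ ý": "intention",
}
# The same lexicon keyed by the joined spelling.
JOINED_LEXICON = {key.replace(" ", ""): gloss for key, gloss in LEXICON.items()}


def translate_spaced(text, lexicon, max_len=4):
    """Greedy longest-match over syllables; returns (glosses, lexicon probes)."""
    syllables = text.split()
    glosses, probes, i = [], 0, 0
    while i < len(syllables):
        for n in range(min(max_len, len(syllables) - i), 0, -1):
            candidate = " ".join(syllables[i:i + n])
            probes += 1
            if candidate in lexicon:
                glosses.append(lexicon[candidate])
                i += n
                break
        else:
            # No entry starts here: keep the syllable untranslated and move on.
            glosses.append(syllables[i])
            i += 1
    return glosses, probes


def translate_joined(text, lexicon):
    """One probe per written word, because word boundaries are already marked."""
    words = text.split()
    return [lexicon.get(word, word) for word in words], len(words)


spaced_text = "chủ ý chủ nhà"
joined_text = " ".join(w.replace(" ", "") for w in ["chủ ý", "chủ nhà"])

spaced_result = translate_spaced(spaced_text, LEXICON)
joined_result = translate_joined(joined_text, JOINED_LEXICON)
print(spaced_result)
print(joined_result)
```

Both calls return the same glosses, but the spaced input needs extra probes to discover the word boundaries. How much this matters in a real system depends on its indexing strategy; the sketch only illustrates where the extra work comes from.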
Even in print, the benefits are tangible. Eliminating unnecessary white spaces between syllables could reduce paper usage by 5–10%, lowering production costs and environmental impact.
In short, polysyllabic reform is not just a linguistic refinement, it is a technological imperative. It enhances accuracy, accelerates processing, and supports scalable infrastructure. As the Vietnamese saying goes, có thực mới vực được đạo, “without sustenance, there is no principle.” Reform begins with the tools we use to think.
HOW TO REFORM THE CURRENT VIETNAMESE WRITING SYSTEM
Before any reform of the Vietnamese writing system can be responsibly implemented, we must first acknowledge and address several cultural and linguistic realities. Controversial as they may be, these facts form the necessary foundation for meaningful change.
First, like most languages, Vietnamese has absorbed a vast number of loanwords from more dominant linguistic spheres, chiefly Chinese. This is not merely historical; it is structural. Just as many of us carry genetic traces of Vietnamese-Chinese ancestry, our language carries the imprint of centuries of Sinitic influence. The analogy holds: linguistic hybridity is not a flaw, it is a fact.
Second, Vietnamese shares numerous typological features with Chinese. This should not surprise us, though younger generations, exposed to Western cultural paradigms, sometimes imagine Vietnamese as a hybrid of Chinese and French. In truth, French and Mon-Khmer contributions to Vietnamese vocabulary are minimal by comparison (see Appendix A). The overwhelming presence of Sino-Vietnamese and Sinitic-Vietnamese compounds affirms Vietnamese as a fundamentally dissyllabic language.
Globally, Chinese is now widely recognized by linguists as a polysyllabic, or more precisely, dissyllabic, language. Given the sheer volume of Chinese-derived vocabulary in Vietnamese, this alone justifies classifying Vietnamese as dissyllabic. It is the linguistic engine behind this proposed reform.
Some have suggested purging Chinese elements from Vietnamese to “purify” the language. But what would remain? A gutted lexicon and a cultural void. Campaigns like giữgìn sựtrongsáng của tiếngViệt (preserving the purity of Vietnamese) have attempted to replace Sino-Vietnamese terms with so-called native alternatives, e.g., máybay for phicơ, tênlửa for hoảtiển, sânbay for phitrường. Ironically, these “pure” words also trace back to Chinese roots.
Even technical or anatomical terms, such as bộphận sinhdục, âmhộ, dươngvật, and giaocấu, are deeply embedded in Sino-Vietnamese morphology. To eliminate them would be to amputate the language’s expressive range. Just as Latin and Greek roots enrich English, Sino-Vietnamese compounds have deepened Vietnamese across every register. Functional particles like và, dù, sởdĩ, nếu, nhưng, all of Chinese origin, are indispensable. One cannot construct a Vietnamese sentence without invoking Chinese etymology. Reform must not become erasure.
Since its inception, Quốcngữ has undergone numerous orthographic adjustments. But since the mid-20th century, Vietnamese spelling has remained relatively stable. This stability allows us to observe phonological shifts over time. For instance:
- thu is pronounced /t'ou/, not /t'u/
- không as /k'owngm/, not /k'ong/
- hộc as /howkm/, not /hok/
- ti as /tei/, not /ti/
- tin remains /tin/, not /tein/
Regional accents, Northern, Central, Southern, further complicate orthographic fidelity. It is likely that the original Quốcngữ creators transcribed sounds as they were spoken in specific locales at specific times. Language evolves; orthography must adapt.
Unlike English, whose spelling often diverges dramatically from pronunciation, Vietnamese has maintained relative phonological consistency. Therefore, in this first stage of reform, we do not propose a full phonetic overhaul. Instead, we focus on correcting the way polysyllabic and dissyllabic words are written, grouping them as they are spoken.
This reform promises tangible benefits: cognitive efficiency, technological compatibility, and linguistic clarity. And we need not wait. Further research may refine our understanding of dissyllabicity, but common sense already confirms what is visible to the eye and audible to the ear: most Vietnamese words consist of two syllables.
If doubt remains, let us begin with what is indisputable: the overwhelming presence of dissyllabic Sino- and Sinitic-Vietnamese compounds, alongside a modest set of French and English loanwords (see Appendix A). That alone is more than enough to designate Vietnamese as a polysyllabic, indeed, dissyllabic, language.
As established throughout this proposal, the most accurate and logical characterization of Vietnamese is that it is, undeniably, a polysyllabic language.
Some traditionalists, especially poets, have voiced concern that reforming Vietnamese by writing dissyllabic words in combining formation would disrupt the structural integrity of poetic forms like lụcbát (six-eight syllable couplets), songthấtlụcbát (seven-seven-six-eight), or thấtngônbátcú (seven-syllable regulated verse). They fear the melodic rhythm would be lost, much like Mao Zedong’s reluctance to Romanize Chinese, out of reverence for Tang poetry.
But this concern is easily resolved. Poets are free to choose their medium. In poetry, it is the spoken rhythm, not the visual spacing, that matters. Artistic expression is not bound by orthographic reform.
The writing reform proposed here is not about poetry, it is about clarity, logic, and scientific precision in communication. Polysyllabic writing enhances semantic transparency. Consider:
- coi cọp (watching tigers) ≠ coicọp (sneaking into a show without paying)
- hoa hồng (red-colored flowers) ≠ hoahồng (roses or commission)
- đánh rớt (to drop) ≠ đánhrớt (to fail a student)
- phá thành (to assault a citadel) ≠ pháthành (to distribute)
These distinctions are not trivial, they are essential for accurate data processing, machine translation, and lexicographic clarity. Scientific fields increasingly rely on precise terminology, and Vietnamese has already begun to coin new terms using polysyllabic principles:
- dữliệu (database)
- dữkiện (data)
- thôngtin (information)
- trangnhà (homepage)
- bệnhthan (anthrax)
- vimô (micro)
- vĩmô (macro)
New compound terms like điệnthoạithôngminh (smartphone), thôngminhnhântạo (AI), lênmạng (online), cổngnối (gateway), nốimạng (connected), trangnhà (homepage) reflect polysyllabic logic, even though they are still written almost everywhere in the outdated monosyllabic form.
This principle allows for flexible word formation, akin to how English uses radicals and affixes. Vietnam, though still developing scientifically, has enriched its technical vocabulary by adapting Sino-Vietnamese roots, often via Chinese characters based on Japanese coinages. For example, chínhtrị (politics), cộnghoà (republic), dânchủ (democracy), tíchcực (positive), and tiêucực (negative) were all coined by Japanese lexicographers and re-imported into Chinese and Vietnamese.
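Returning to the contrasting pairs such as coi cọp / coicọp: the sketch below models how a lexicon entry behaves under each orthography. It uses only the first three pairs, where the current spaced spelling is identical for both readings. The fourth pair is left out because its two readings are already spelled differently when spaced (phá thành versus phát hành). The function name `build_indexes` is hypothetical.

```python
# (phrase reading, compound reading) for each spaced spelling, from the pairs above.
PAIRS = {
    "coi cọp": ("watching tigers", "sneaking into a show without paying"),
    "hoa hồng": ("red-colored flowers", "roses or commission"),
    "đánh rớt": ("to drop", "to fail a student"),
}


def build_indexes(pairs):
    """Return (current, reformed) dictionaries mapping written form -> readings."""
    current, reformed = {}, {}
    for spaced, (phrase_sense, compound_sense) in pairs.items():
        # Current orthography: one spelling carries both readings.
        current[spaced] = [phrase_sense, compound_sense]
        # Proposed orthography: the presence or absence of the space
        # separates the two readings.
        reformed[spaced] = [phrase_sense]
        reformed[spaced.replace(" ", "")] = [compound_sense]
    return current, reformed


current, reformed = build_indexes(PAIRS)
ambiguous_now = sum(len(readings) > 1 for readings in current.values())
ambiguous_after = sum(len(readings) > 1 for readings in reformed.values())
print(ambiguous_now, ambiguous_after)
```

Under these assumptions every spaced entry is ambiguous in the current orthography and none is in the proposed one. Some compounds, such as hoahồng, still carry more than one sense of their own.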
Here are examples of computing terms formed polysyllabically:
- máyvitính (microcomputer)
- tinhọc (informatics)
- liênmạng (internet)
- nângcấp (upgrade)
Meanwhile, English terms like chip, bit, byte (bai), mega (mê), board (bo), font (phông), email, website, unicode, internet are used directly or slightly adapted.
This lexicographic flexibility confirms Vietnamese’s dissyllabic nature. Just as English builds words from roots and affixes, Vietnamese can do the same, if the writing system allows it. Consensus rules in this field: you may prefer to call a computer a "máyvitính" or "máyđiệntoán" rather than "máyđiệnnão", but if everybody calls it a "máyvitính", that becomes the standard.
Accepting the current writing system simply because it is widely used is not a legitimate defense. Its fragmented structure encodes cognitive limitations in young minds. It is a retrograde instrument, a linguistic carcinogen, that stunts abstract and collective reasoning.
This is not conjecture. Ask how many Vietnamese have truly excelled without mastering a foreign language. Perhaps only a few cadres have risen through political channels, but their children, educated abroad, often emerge with entirely different cognitive profiles. Conquan thì lại được làmquan ("the children of officials become officials in turn"). But should intellectual privilege remain hereditary?
Do we want only a narrow elite to benefit from the cognitive power of abstract polysyllabic languages like English? Reforming Vietnamese into a polysyllabic writing system democratizes that advantage. It levels the field.
Writing Vietnamese polysyllabically, like English or German, will elevate the nation intellectually and digitally. The orthography is still young. It deserves refinement. Let us not settle for less. Let us act.
3) Abolish old-fashioned hyphenation, decisively and permanently
The orthographic inertia we tolerate today reflects a broader national stagnation. As long as we do nothing, the old way of writing remains entrenched. This mirrors how Vietnam has approached its own language: as if it were “isolated,” a term once used by Western linguists to imply primitiveness. Though such voices have faded, their texts linger, and their influence persists in Vietnamese scholarship.
Ironically, Vietnamese writing was more structurally accurate until the late 1970s. Dissyllabic words were routinely written with hyphens: quốc-gia (nation), bâng-khuâng (melancholy), lạnh-lẽo (coldly). The hyphen signaled polysyllabic unity. Its disappearance was driven not by linguistic insight, but by convenience: scribes saved time by skipping strokes. Yet hyphenation remains a valid and formal convention in academic writing.
Some speculate that Chinese script influenced this fragmentation, since each character is a self-contained word. But that’s unlikely. Historically, high illiteracy rates and the complexity of Hán and Nôm characters meant few Vietnamese internalized Chinese orthographic logic.
Today, Vietnamese is written with syllables spaced apart, as if each were a standalone word, visually mimicking Chinese, but semantically incoherent. This creates false boundaries between syllables and words, eroding polysyllabic cohesion.
So why did the hyphen vanish? Habit. Laziness. The convenience of dropping hyphens became the orthographic default. But with our polysyllabic reform, we go further: eliminate both hyphens and white spaces within dissyllabic words. This restores cognitive clarity and typographic efficiency.
Let us not merely revive hyphenation. Let us transcend it.
We’ve now laid out the rationale for reforming Vietnamese orthography. The question is no longer why, but how. Are we ready to contribute our part to this linguistic transformation? The answer need not be daunting, this is a reform rooted in simplicity and common sense.
✍️ Principles for Writing the Polysyllabic Way
- Recognize natural pairings. Many syllables consistently co-occur in fixed expressions. Write them in combining formation, as continuous sound strings, just as we speak them. Examples: mặcdù (although), vớinhau (together), nhiềuhơn (more than), đẹpnhất (most beautiful), dođó (therefore), chotớinay (until now), xãhộichủnghĩa (socialism), phầnmềm (software), kểkhôngxiết (uncountable).
- Use foreign languages as scaffolds. When in doubt, consult English or other polysyllabic languages. Their combining formations offer a reliable guide. Examples: although = mặcdù, scholarship = họcbổng, dictionary = từđiển, individualism = chủnghĩacánhân.
- Follow those who know. If you’re unsure how to group syllables, imitate those who’ve adopted the reform. Let usage guide refinement.
- Spread the reform. Practice it yourself. Use every available medium, email, websites, signage, publications. Even without diacritics, polysyllabic combining formation improves recognition and clarity.
- Leverage visibility. Store signs, online posts, and advertising written in polysyllabic formation attract attention. The novelty becomes a tool for advocacy.
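The first principle lends itself to a simple helper for writers who want to try it on existing text. This is a minimal sketch under stated assumptions: the compound list is a tiny hand-picked sample taken from the examples in the first principle, the function name `join_compounds` is hypothetical, and punctuation attached to a syllable is not handled. A usable tool would need a far larger, community-agreed word list.

```python
# Known fixed expressions in the current spaced spelling (sample only).
COMPOUNDS = {"mặc dù", "với nhau", "do đó", "cho tới nay", "phần mềm"}


def join_compounds(text, compounds, max_len=3):
    """Rewrite spaced text so that listed compounds are written as one unit.

    Uses greedy longest-match: at each position, try the longest run of
    syllables first and join it if it is a known compound.
    """
    syllables = text.split()
    out, i = [], 0
    while i < len(syllables):
        for n in range(min(max_len, len(syllables) - i), 1, -1):
            candidate = " ".join(syllables[i:i + n])
            if candidate.lower() in compounds:
                out.append("".join(syllables[i:i + n]))
                i += n
                break
        else:
            # Not part of any listed compound: leave the syllable as it is.
            out.append(syllables[i])
            i += 1
    return " ".join(out)


sample = "cho tới nay phần mềm mới"
result = join_compounds(sample, COMPOUNDS)
print(result)
```

Because unknown syllables are left untouched, the helper never invents a grouping. Choosing which groupings belong in the list remains the human, consensus-driven step described in the principles.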
📣 Reform Through Usage
The more we write in the new polysyllabic way, the stronger our collective voice in shaping reform. We become pioneers of a smarter, clearer Vietnamese.
Yes, early adopters may write the same phrase differently. But over time, usage will stabilize. A future Academy of the Vietnamese Language will codify the most common forms for official adoption.
And what of old books and archives? Once readers embrace ChữViệt2020, economic incentives will drive publishers to reprint in the new format, if books are still printed at all. Government mandates will follow public demand. It has happened before. It will happen again.
Let us not wait for permission. Let us lead.
From Vision to Action
We’ve explored the rationale for reforming Vietnamese orthography. While the case may not yet be exhaustive, if the vision resonates with you, if you feel the stirrings of reform, then hesitate no longer. Pick up your pen, open your keyboard, and begin writing in the new combining formation today.
This reform is not burdensome. In fact, it thrives in our digital age, where experimentation is free and visibility is instant. The Vietnamese version of this very post has already demonstrated how effortless the transition can be.
Vision without action is only a dream,
Action without vision only passes time,
Vision with action can change the world.
Joel Arthur Barker
Let us not merely dream. Let us act, with clarity, with purpose, and with unity.
All comments and reflections are welcome and will be shared on our forum for further discussion. Be among the first to pioneer this movement. Together, we won’t just reform orthography, we’ll make history.
Without your voice, your writing, your contribution, this vision remains a ripple in a teacup. With you, it becomes a wave.
Last updated 8/8/2025
x X x
CRITIQUES OF THE VIETNAMESE2020 ORTHOGRAPHY
Readers' Comments
On the article "Sửađổi Cáchviết TiếngViệt"
by Vũ Vương Thao (David)
I am writing to give my opinions with respect to the proposal "Sửađổi Cáchviết ChữViệt." I am hoping that you can help forward it to the author and share it with everyone involved.
My initial reaction to this proposal is one of disbelief and very strong disagreement. It seems to me that the author, known as "dchph", does not want to give his/her real name or even a contact email address on that page. One would assume that if he/she were to solicit any feedback, as he or she stated in the last paragraph, there would be an email address, a forum, a post-a-comment, a discussion board or some form of an information exchange medium, rather than a one-way broadcast such as it is presented on that web page.
I am a network engineer who has lived away from Vietnam for the past 20 years, and I still speak, read and understand Vietnamese fluently. I do not claim to be a scholar of languages, but I find the argument of Mr/Ms "dchph" (from this point on referred to as "dchph" or the author) illogical, unreasonable, and without basis or real practical benefit. I refer to the following points dchph made:
1) "It is not scientific; words should be written in syllabic combination as some samples cited above. This will scientifically represent the true characteristics of today's Vietnamese"
To me language is not a science, it is an art. Science is something like representing Vietnamese in UNICODE. Words *may be* written in multi-syllabic in some languages, not should. Who dictates this law anyway?
2) "What are the benefits? Examine German, English, Chinese or Korean and think about it! How do you feel if you have to read and write 'scholarship' as 'scho lar ship' or even 'scholar ship'? But 'hoc bong' is written as such. A society progresses if its language progresses."
The only benefit in joining words together, mathematically, is to save space, reduce the redundancy and increase the compressibility of the language.
3) A society (such as American/Western) progresses because of the freedom and the ease to communicate (example: telephone network, fax, Internet). Joining words together makes it more complicated, restricted, rigid. The monosyllabic nature of present day Vietnamese represents simplicity and flexibility, powers that enable the writer to create new words, new vocabulary and still be artistic about it. This is why the Vietnamese language is so beautiful.
4) "This new proposed writing system will speed up the process of absorbing information and facilitate the advancement of science faster"
Again, this is an amusing, unfounded statement. If the author refers to joining words together so that an email message is overall shorter to transport, a book thinner, so that it can be read more quickly, then that does not mean an advancement of any kind.
5) "Join us in this effort NOW by start writing Vietnamese in combined formation of syllables of a word for each concept. In practice, when you are in doubt, think of an equivalent word in English or in another common foreign language. For example, for 'although' we have 'macdu', for 'blackboard' > 'bangden', 'faraway' > 'xaxoi', and so on."
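This is a very blatant example of copying the English language for the sake of copying, with no real benefit.
6) "Rõràng là lốiviết nầy phảnánh tính thiếukhoahọc và khôngtiếnbộ của ngườiViệtnam!"
[English: "Clearly, this way of writing reflects the unscientific and unprogressive character of the Vietnamese people!"]
Obviously, the progress or lack of progress of the Vietnamese people cannot be blamed on the language. I believe there are many other apparent factors that the author simply sidesteps and sweeps under the carpet, such as the history of thousands of years of war, for example.
"-- họ cóthể hàmý tiếngViệt chúngta còn thôsơ, chưapháttriển, lạchậu, và nghèonàn. "
[English: "-- they may imply that our Vietnamese language is still primitive, undeveloped, backward, and impoverished."]
I believe this is the author's strong and belittled perception and feeling, not necessarily the foreigners' or anyone with rational thinking. I for one feel proud to be Vietnamese, but this is supposed to be a scientific discussion.
7) "conđường=road, bầutrời=the sky, quảđất=the globe... "con" đichung với "đường", "bầu" đichung với "trời", và "quả" đichung với "đất"; nhờđó họ không còn phải thắcmắc về cách chúngta nói khi thì "con", khi thì "bầu", khi thì "quả"... tạisao không dùng hết "con" hay "cái" cho nó tiện!"
[English: "conđường = road, bầutrời = the sky, quảđất = the globe... 'con' goes with 'đường', 'bầu' goes with 'trời', and 'quả' goes with 'đất'; that way they no longer have to wonder why we say sometimes 'con', sometimes 'bầu', sometimes 'quả'... why not just use 'con' or 'cái' throughout for convenience!"]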
I believe the author here fails to see that these combinations of the Vietnamese language are what make the language beautiful, flexible and gives the writer a large scope of creativity. For example "bầu trời" is a lot more expressive and descriptive than just the bland "the sky" - "Bầu" gives the reader the magnitude, the texture and the nature of the sky.
To more correctly compare: "bầu trời == the circular wrap-around-you sky". This demonstrates that Vietnamese is a simple yet extremely powerful language, IMO. By joining words for convenience's sake, you destroy the perpetual existence of individual words.
8) A more extreme comparison: I am sure others would have pointed out that "I" and "you" used in English to address two parties regardless of genders, relations, age (the revered and elders) is extremely impersonal and callous, and is debatably why community and family relationship links are not as strongly bonded as in the Vietnamese sense. How would you like your son to call you "mày" (a blunt, familiar "you") and call himself "tao" (a blunt, familiar "I")?
9) "TiếngAnh là vạnnăng! TiếngAnh là ngônngữ kỹthuật! TiếngAnh là tiếngnói của thếgiới! Cứ xửdụng tiếngAnh làm côngcụ ngônngữ kỹthuật là đủ, cảitổ tiếngViệt chi cho phiềntoái! Ðólà nhờ tiếngAnh mởrộng cánhcửa thunhận tấtcả mọi yếutố -- nhờđó nó pháttriển mạnhmẽ chăng?"
[English: "English is all-powerful! English is the language of technology! English is the voice of the world! Just using English as the technical language is enough; why go to the trouble of reforming Vietnamese! Is it because English opened its doors to take in every kind of element that it has developed so strongly?"]
I agree with this viewpoint. English is such a global language that countries like Singapore have had to adopt it as a compulsory language. However, let's remind each other that the progress of the Western world is not due to the language; it is due to the freedom, ease, and low cost of communication. The more information (read: new ideas and technical papers) is published, the more will be read and discussed, which fuels more publications, and the cycle repeats perpetually in a self-accelerating fashion. To improve the progress of a nation such as Vietnam, communication infrastructure is vital in fueling its growth. Five to ten years ago, the progress of a country could be loosely measured by the number of telephones per capita; now it is measured by Internet and broadband infrastructure spending.
10) "Ðầuóc conngười đã xửlý nhanh thì máyvitính xửlý càngnhanh và chínhxáchơn. Thídụ "chủnghiãquốctế" sẽ tiếtkiệm cho bộnhớ của máyvitính 3 bytes cho ba khoảngcáchtrắng (spaces), khi kiểmlỗi chínhtả "speller" sẽ làmviệc nhanhhơn và khôngcòn gặp trườnghợp "chủ nghĩa" nếu được viếtthành "chu nghiã", "chủ nghĩa", chú nghĩa" đềuđược máyvitính dễdàng chothôngqua! Nóivề tiếtkiệm giấyin thì chúngta còn tiếtkiệm tiềnbạc nhiềuhơn là tiếtkiệm khoảngtrống trong bộnhớ của máyvitính, và sáchvở inra bớt tốn giấy thì dĩnhiên giáthành trởnên rẻhơn!"
[English: "If the human mind processes it quickly, the computer processes it even faster and more accurately. For example, 'chủnghiãquốctế' saves the computer's memory 3 bytes for three white spaces, and during spell-checking the speller will work faster and will no longer meet the case where 'chủ nghĩa', whether written as 'chu nghiã', 'chủ nghĩa', or 'chú nghĩa', is easily let through by the computer! As for printing paper, we save even more money than we save space in the computer's memory, and when printed books use less paper, their cost naturally becomes cheaper!"]
Saving space and cost does not make a language *scientifically better*, as the author is always trying to argue. With the computing power of the CPU doubling every 12 months, the saving of a byte here and there is irrelevant. All human languages are very redundant. I cannot see humans, even in 1000 years, converting their written language to binary numbers, as computers use, just to reduce spatial redundancy or make spell checking quicker.
"Cáilợi đã được phântích, tuy chưa được sâusắc, đầyđủ và thuyếtphục lắm, nhưng nếu các bạn nhậnthấy điềuđó đúng và có nhiệttình, bắttay vàolàm ngaybâygiờ, aiai cũng làm thì còn lo gì không thựchiện nổi cuộccảitổ nhỏbé nầy, nhấtlà bước thửnghiệm trên liênmạng chẳng tốnkém gì cả. Bàiviết này là một thídụ điểnhình vậy!"
[English: "The benefits have been analyzed, though not yet very deeply, fully, or convincingly; but if you find this right and are enthusiastic, set to work right now. If everyone does it, why worry that this small reform cannot be carried out, especially since a trial run on the internet costs nothing at all. This article is a typical example!"]
The author fails to explain or convince me of the benefit in converting. What he/she proposes creates confusion, destroys and undermines creativity, makes the language rigid, inflexible, impersonal, indifferent, and most importantly, mechanical. Joining words saves space and paper, yes, but the world is going paperless, or e-paper, very soon. I fail to see how a more compressed language is distinctly scientifically advanced.
A Westerner (and a Vietnam-born engineer like me) would be very amused if the Vietnamese language is changed to save a few hundred bytes, while massive advances in computing power and technology have been made in the past 20 years and in many more years to come. I agree that words "imported" from a foreign language now and in the future can and should be joined or written in its native form, but other Vietnamese mono-syllable words should be left as is.
11) I would like to encourage the author to do some more reading, from the other viewpoint, on the Study of the Vietnamese Language by foreigners. He/she will then have a much more balanced viewpoint.
My suggestions for the movement of Vietnamese2020 are (no order of importance)
1- standardise on a Vietnamese computerised character set. Get rid of VISCII, VPS, VNI, .VN, etc and create one standard common accessible set for all Vietnamese web sites to use, eg. UNICODE.
2- make English a second compulsory language; it cannot be stressed enough that most up-to-date world information is in English. To use an old saying, "Biết mình biết ta trăm trận trăm thắng" ("Know yourself and know your opponent, and you will win a hundred battles out of a hundred"). To be competitive in the global village, every Vietnamese national should be able to follow and keep up to date with the latest information.
3- improve communication and networking infrastructure NOW! Something is easier to say than do, but every thing starts with a thought.
Vũ Vương Thao (David)
30/8/2001
------------------------
Rebuttal
On David Vu's critique of "Sửađổi Cáchviết TiếngViệt"
by dchph
Thank you, David Vu, for writing me regarding this matter.
I would like to take this opportunity to answer some of David Vu's criticisms, and I am happy to have the communication channel open so that everyone interested in this subject can share their opinions.
Even though he disagrees with virtually everything I write in the proposal, I very much enjoyed reading his meticulous letter because it shows that he does care about this matter. Thanks again, David Vu.
dchph is only my penname, just like any others such as TTKh, Trần Thị Ngh., etc. I picked this name partially because I think that the work of advocating the change from Quốcngữ to the new Vietnamese2020 writing system belongs to anybody who shares the same view, not to any particular individual. Anyone who wishes to write me can always do so by emailing dchph (a) yahoo.com. And I hope that in the near future, there will be a discussion board on this website for interested parties to post their different viewpoints.
David Vu is the first among the few who have written to me casting doubt on the validity of the arguments in the proposal "Sửađổi Cáchviết ChữViệt."
He may not be alone in the opposing camp; others may have the same difficulty accepting the fact that Vietnamese is a polysyllabic or, more accurately, a dissyllabic language. Once one sees that a great number of Vietnamese words are composed of two syllables, one can easily accept the new way to write Vietnamese: words made of two or more syllables that clearly belong to one well-defined concept should be written in combined formation.
Here is some food for thought: bảvai, cùlét, cùichỏ, màngtang, bạttai, bỡngỡ, hốthoảng, quốcgia, sơnhà, gióheomay, ngậpngừng, mừngrỡ, loạnxàngầu, bảláp, dưahấu, dưaleo, bônghồngtrắng, bánhdày, bánhđa... You name it!
In addition to the foregoing argument, which seems sufficient as background on why we should reform the current writing system, here are my specific replies to David Vu's criticisms:
1) Linguistics is a science, and historical linguistics, a branch of philology, is one of the tools linguists use to study the nature, characteristics, and history of a language: what it is all about and how it has arrived at its present state. "Scientific" is used in this sense.
"It is not scientific" because the current writing system goes against the true nature of Vietnamese, which is not a monosyllabic language.
In fact, its morphemics testifies that words are made of syllables. If a word comprises two or more syllables but consists of only one morpheme, the smallest meaningful unit used to form words, it should not be broken down into meaningless units as it is written now, e.g., "mồcôi" > "mồ côi", "cùichỏ" > "cùi chỏ", "bângkhuâng" > "bâng khuâng", etc.
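The joining of syllables into one-concept words described above can be sketched mechanically. The following Python snippet is only an illustrative sketch and not part of the proposal itself: it assumes a tiny hand-made lexicon (the hypothetical LEXICON set, filled with examples cited in this text) and applies a greedy longest-match rule, which is an assumed technique chosen for simplicity, to rewrite spaced syllables in combined formation.

```python
import unicodedata

# Hypothetical mini-lexicon of one-concept words, taken from examples cited in
# this text. A real lexicon would be far larger. Each entry is a tuple of syllables.
_RAW_LEXICON = {
    ("mồ", "côi"),
    ("cùi", "chỏ"),
    ("bâng", "khuâng"),
    ("gió", "heo", "may"),
}
# Normalize to NFC so composed and decomposed diacritics compare equal.
LEXICON = {
    tuple(unicodedata.normalize("NFC", s) for s in entry) for entry in _RAW_LEXICON
}
MAX_LEN = max(len(entry) for entry in LEXICON)


def join_syllables(text):
    """Rewrite space-separated syllables, joining any run found in LEXICON."""
    syllables = unicodedata.normalize("NFC", text).split()
    out = []
    i = 0
    while i < len(syllables):
        # Try the longest candidate first, then shorter ones (greedy match).
        for size in range(min(MAX_LEN, len(syllables) - i), 1, -1):
            chunk = tuple(syllables[i:i + size])
            if chunk in LEXICON:
                out.append("".join(chunk))
                i += size
                break
        else:
            # No multi-syllable entry starts here; keep the syllable as is.
            out.append(syllables[i])
            i += 1
    return " ".join(out)


print(join_syllables("tôi bâng khuâng"))
```

A greedy rule like this will misjoin syllables whenever a lexicon entry happens to straddle two separate words, so a real converter would need context; the sketch only shows that the combined spelling is recoverable once a word list exists.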
Also, its semantics shows that "chodù", "sởdĩ", "tuynhiên"... are grammatical entities (hưtự), made of syllables that signify one-concept words and that always come in paired morphemes as elements that help form sentences. However, the borders between syllable and morpheme have been blurred by the fact that each syllable may carry its own independent meaning, so syllables have been treated as separate "words." This may have resulted from the old practice of writing each syllable as one square block, a Chinese or Nôm character. However, one cannot deny the wholeness of each one-concept polysyllabic word, composed of more than one syllable, as signified by its meaning. So it is unscientific to break one-concept words apart and write them as separate, spaced syllables. That may look good in poetic verse, but those dissyllabic words are still pronounced as pairs, just as in speech.
We can go on to analyze its lexicography, syntax, phonology... and we will draw the same conclusion.
- Vietnamese UNICODE, meanwhile, is an application tool that belongs to a different field of science, i.e., computer science. However, this does not contradict David Vu's notion that language is a form of art. In fact, I totally agree with him that language is in a sense an art, but that belongs to a different discipline of human activity, when we use the language to write poetic verses and lyrics or to create literary work.
- As I see it, all known languages now existing on earth, not just some, are polysyllabic in nature and are transcribed as such, even those written in block characters such as Korean and Japanese, and even Chinese when written in Latin pinyin. I wish somebody could show me a purely monosyllabic language so that we could all make a fair comparison.
- "This law" (David Vu's words) has been made into law in some countries, such as South Korea, China, and Thailand. And among us, those who know teach those who don't know, and together we form a consensus to pressure those who have the authority to "make laws."
2) and 10) The benefits are obvious. "To save space, reduce the redundancy and increase the compressibility of the language," as pointed out by David Vu, is only a minor one. Benefits in other areas are much greater, because information will certainly be processed much faster in people's brains when they catch the mere shape of long word strings. Their brains won't have to work hard going through separate syllables bit by bit. Just as in English, our brain does not "spell" each long polysyllabic word in reading, but instantly recognizes and understands its meaning at the sight of the whole word. Consequently, the development of a child's brain will tend to gear toward symbolic and abstract concepts.
In data processing, we will be able to record the usage frequency of whole words, not just syllables, which will certainly benefit Vietnamese lexicography. Vietnamese spelling checkers will also work with much greater accuracy. Machine translators will be much more efficient because they will translate texts based on concepts, not just words. We still have problems in these areas because computer scientists, who are not linguists, like any average person still treat Vietnamese words as monosyllabic ones.
3) and 10) English is one of the richest languages in the world in terms of its vocabulary stock and its flexibility in absorbing new words. This language enjoys its superior position in the communication world today thanks to its capability to evolve from monosyllabic to polysyllabic form, not the other way around. And, of course, it still retains the beauty of artistic articulation in creative work. Accordingly, joining separate syllables to form words makes things simpler, not more complicated, as in computing science, for instance, software as opposed to soft ware, harddisk to hard disk, website to web site, etc.
Moreover, because the total number of syllables available in the Vietnamese language is fixed, "it takes at least two to tango." We cannot coin a new word using only one syllable, except for newly imported English words such as phông, chip, bit, byte, meg... To create new words we need to combine two or more existing Vietnamese syllables, as in "lênmạng", "liênmạng", "trangnhà", "vitính", "nốikết", "sựcố", "dữliệu", "dĩacứng", "virút", "phầnmềm", "lậptrình", etc. This method of making new words works almost the same way as in English, e.g., website, online, email, harddisk, hardware, software, database.... In other words, we cannot create something out of nothing, i.e., randomly pick one or two unused syllables to coin new Vietnamese words. Let's imagine a couple of them now: chụt(chụt) (tiếng hôn, kissing sound), bấc ("bug"), ruốt (for virus), cờ lick (click)... It is really a pain in the neck to pretend to create new words using David Vu's principle that "other Vietnamese mono-syllable words should be left as is" and avoiding (3) "joining words together [because that] makes it more complicated, restricted, rigid. The monosyllabic nature of present day Vietnamese represents simplicity and flexibility, powers that enable the writer to create new words, new vocabulary" because "what [I] proposes creates confusion, destroys and undermines creativity, makes the language rigid, inflexible, impersonal, indifferent, and most importantly, mechanical."
Among the new terms cited above that have appeared in Vietnamese in the last decade, which ones are "mono-syllable words"? As I understand David Vu's view, words like "dĩacứng", "phầnmềm"... are made of "dĩa" and "cứng", so let's write "dĩa cứng", just like "hard" and "disk"; and since, as he writes, the English "harddisk" or "software" has nothing to do with scientific advancement, we should not copy them. So, for him, all other words should similarly be written separately. He sees the trees but not the forest as a whole.
That is to emphasize, again and again, that Vietnamese is not a monosyllabic language; more accurately, it is dissyllabic in nature and for the most part, as far as its large stock of dissyllabic words is concerned. I have written about this matter in considerable detail in my proposal, but it fails to convey the point because of my inability to communicate effectively. That has always been my weakness, I must admit.
4) and 10) The main purpose of writing words in combined formation is not to compact email messages or to save space in bits, bytes, and paper, but to write the right way, logically, as a dissyllabic language like Vietnamese deserves. Again, children's brains will undoubtedly benefit by processing information faster in symbolic and abstract terms, not in concrete detail.
5) In writing dissyllabic words in their true form, as combinations of separate syllables, we are not copying English or any other language. We just use one of them as a guideline in shaping a new way of writing. In the old days we used to place a hyphen [-] between the syllables of all dissyllabic words, using HánViệt words as the model. Today, as more and more people know English, it seems natural to pick up the notion that each word stands for a concept.
As seen in the cited English examples such as "although" or "faraway", such words are semantically one-concept words, which gives rise to the combined formation of polysyllabic words; therefore, it is illogical to break a word, or a concept, into smaller units, as is done with their Vietnamese equivalents "xaxôi," "mặcdù..." Interestingly enough, in these particular examples the words will mean "halves" of something else if their morphemes are broken into smaller meaningless parts. One may ask what is "xôi" and what is "mặc"?
6) I totally agree with David Vu that "the progress or dis-progress of the Vietnamese people can not be blamed on the language." When I wrote "Rõràng là lốiviết nầy phảnánh tính thiếukhoahọc và khôngtiếnbộ của ngườiViệtnam!" ("Clearly this way of writing reflects the unscientific and unprogressive character of the Vietnamese people!"), I did not really mean it. It is a figurative exaggeration meant to emphasize an idea, or one may say a rhetorical device to incite indignation in many of us, patriotic but dormant or complacent Vietnamese. I hope you understand my "dụngtâm lươngkhổ" (letting others misunderstand one's real intention, or even direct hatred at oneself, in order to achieve a common beneficial goal or consensus.)
7) Bầu, con, cái, quả... in these examples are unique classifiers in Vietnamese, as in Chinese (the Chinese do not combine them in pinyin writing.) You have made an excellent point in arguing against combining them. To put things in perspective, David Vu may have misunderstood me as advocating dropping all other classifiers and using only "cái" and "con." I do not advocate this idea at all. If that is the reading, the passage quoted by David Vu here is commented on out of context. This (7) is only a figurative way of emphasizing the nature of dissyllabics in Vietnamese, i.e., a specific classifier can go only with certain words.
The point I am trying to raise in my proposal is whether or not we should combine a classifier with its associated word; I do not advocate getting rid of classifiers altogether. In fact, in my writing I am just putting forward the suggestion of joining classifiers with their associated words, even if that seems like overkill. In practice, whichever way is most widely used, to combine or not to combine, will pave the new way of writing.
8) As for David Vu's raising the issue of "I" and "you", "tao" and "mày", I personally think this is a higher form of address that embraces all aspects of the delicate forms of address in the Vietnamese language, even though this matter is not advocated in my proposal because it belongs to the cultural aspect of the language.
One can cite Chinese examples in this argument because in the Chinese language all forms of address still exist as they have from ancient days until the present; however, the Chinese equivalents of "I" (wo) and "you" (ni) exist in parallel with other forms of address similar to those in Vietnamese. Forms of address in Chinese are mentioned here because all these Vietnamese forms of address were etymologically derived from those of the Chinese language, e.g., cô, cậu, chú, bác, anh, chị, em, etc.
This development seems to me a transcending phenomenon in language evolution, for the better, not for the worse. We give this notion a negative reception just because we are used to associating "I" and "you" with "tao" and "mày". We feel annoyed by the contemptuous connotation of these forms of address. If we think of them as "tôi" (partially embracing all the denotations of English "I", French "moi", or Chinese "wo") and the "other" ("toi" in French, "you" in English, and "ni" in Chinese, but still without an equivalent word in Vietnamese) to imply anh, chị, em, cô, bác... then the negative perception will not be the same. When someone addresses you as "you" in English, "toi" in French, or "ni" in Chinese, you surely do not think that they mean "mầy" as in Vietnamese.
I know that many of us don't like this conception, but that may be the truth. It is quite logical.
9) In the end David Vu agrees with me on this matter, but my whole passage of "TiếngAnh là vạnnăng! ..." is clearly a sarcastic way of saying one thing while really meaning another. If one carefully reads this passage and the paragraphs before and after it in my work, one will see that I am trying to turn the readers' focus back to the urgency of and the need for reforming the present way of writing Vietnamese with the new Vietnamese2020 proposal, and that I do not really mean "TiếngAnh là vạnnăng! ..."
10) Please refer back to the earlier items.
11) Thank you, David Vu, for your suggestions. At least we still share many things in common that are worth discussing. The truth may lie somewhere in between our disagreements, or not at all.
dchph
Revised 03/03/2007
------------------------
x X x
The role of Vietnamese dissyllabism in exploring Vietnamese words of Chinese origin
A new dissyllabic sound change approach to be explored
by dchph
Dec.8.2002 19:33 pm
Abbreviations:
SV: Sino-Vietnamese (HánViệt)
VS: Sinitic-Vietnamese (HánNôm)
Today's Vietnamese vocabulary consists of a great number of two-syllable, or dissyllabic, words. Dissyllabism, the property of a language whose dominant words are composed of two syllables, has become one of the main characteristics of present-day Vietnamese, and it includes two-syllable words built from two synonymous word-syllables. The same is true of modern Chinese, whose synonymous dissyllabic words were coined in the same way and served as the model for their mirrored counterparts in Vietnamese. In fact, modern Vietnamese shows clearly that it is dissyllabic in nature through the abundance of this kind of composite word, in which the two word-syllables are almost synonymous with each other, e.g., tức|giận (mad/angry), trước|tiên (firstly/initially), cũ|kỹ (ancient/old), kề|cận (by/near)...
What do all these matters have to do with Vietnamese etymology? Close examination of the examples cited above reveals sound change patterns that underlie the etymology of those Vietnamese words, which apparently are alterations of Chinese dissyllabic equivalents. As discussed above, the lexical and semantic approach can apply here. Lexically, however, these composite words differ in composition: the two monosyllabic words that make up each dissyllabic word are variations of different Chinese word-syllables, for example,
tức|giận: (~ tứckhí) this dissyllabic word can be further broken into "tức" and "giận", two monosyllabic synonyms in Vietnamese, as are their Chinese equivalents qì 氣 and hèn 恨. However, the modern dissyllabic Chinese word shēngqì 生氣 is a much more plausible cognate of "tứcgiận", for which, interestingly enough, the Vietnamese word order is reversed (this phenomenon, to be explained later, is common in Vietnamese words derived from Chinese dissyllabic words.)
trước|tiên: is cognate with shǒuxiān 首先 (SV: đầutiên), composed of "trước" (a Sinitic-Vietnamese sound of qian, cf. Hainanese /tăi/) plus "tiên" (the Sino-Vietnamese sound of "xiān"). The concept-sound "trước" has taken the place of "đầu" in this case, that is to say, "trước" has been associated with "đầu" shǒu 首 to form this dissyllabic word. This is called the sandhi process of association.
cũ|kỹ /kʊkei/: "kỹ" appears to be a reduplication of "cũ" and also a cognate of "jiù" 舊, which is closer in sound to "kỹ" than to "cũ". The same composition and formation apply equally to
kề|cận /kekʌn/: it is from "kàojìn" 靠近 (~ jiējìn 接近), which is also cognate with "gầngũi" and "gầnkề", whose syllable-words are in reverse order to fit local speech habit.
Dissyllabism has been a later development in both Chinese and Vietnamese; however, "trước", "cũ", and "gần", as opposed to the Sino-Vietnamese "tiên", "cựu", and "cận", respectively, are old materials that point to the same root for the formation of those dissyllabic words, which have the same contextual denotation in both languages.
From this we can see why the Vietnamese language is so Chinese in character: the two are so intertwined that sound change from one language to the other must have occurred in the context of the characteristics that both languages share, in this case, their dissyllabic features.
For the time being, just take some of the many sound change patterns at face value, e.g., -ang > -at, -ong > -aw, n- > d-, etc., even though sound changes do follow linguistic rules, which will be explained later on. The main principle to bear in mind is that sound changes occurred in "phonological batches", or clusters of sounds as whole syllabic units, such as -ương > -ang, -ong > -aw, -ang > -at, -at > -an, etc., and not just phoneme by phoneme, as n-, -at, -u-, -n-, -ng, etc., in a much later development. As Chinese became more and more dissyllabic in nature over time, its dissyllabic words, when they passed into Vietnamese, also changed in dissyllabic clusters of sounds, as whole entities of paired syllables, not singly as one vowel into another vowel or one initial into another initial, and not even syllable by syllable in one-to-one correspondence.
Dissyllabic sound change patterns are an important point in the new approach used in this research on Vietnamese etymology of Chinese origin. The logic behind this argument is that, in terms of historical evolution and linguistic characteristics, if Chinese has already been classified as a polysyllabic language by renowned linguistic circles at the world's major universities, then Vietnamese should be considered as such, too. Only in this context can one see how the sound changes have taken place and why dissyllabic words have the appearance they do in this paper. In other words, dissyllabic words carried their dissyllabic characteristics along with them when they passed into Vietnamese, and that is why with
"qì" 氣 we have "hơi" as in "qìchē" 氣車: "xehơi"; "kiệt" and "xỉn" as in "xiăoqì" 小氣: "keokiệt" and "bủnxỉn"; or "sáo" and "khứa" as in kèqì 客氣 (~ kètào 客套): "kháchsáo" ~ "kháchkhứa"; however, with "shēngqì" 生氣 we have "tứcgiận" (< giận\tức phonetically, in reverse order), and "shēng" 生 by itself is "sống" (live).
Here are some other examples:
jiārén 家人: "ngườinhà" (in reverse order), but with rénjiā 人家, jiā becomes "ta" as in "ngườita", and "cả" as in dàjiā 大家: "tấtcả", while by itself it is "nhà";
bāngmáng 幫忙: bênhvực, while "máng" 忙 has given rise to both "bận" and "mắc";
bāzhăng 巴掌: "bạttai" ~ "bàntay";
As we can see, sound changes are multi-faceted and diverse when dissyllabic words are treated as whole units, whereas the form that the same syllable takes when it stands alone as a monosyllabic word does not affect the whole string of sounds of the dissyllabic word, i.e., the sound changes in dissyllabic words happened without the constraints that apply to monosyllabic words. If one still considers Vietnamese a monosyllabic language, then one will never fully appreciate the notion underlying these hypotheses, which is the basis for a new dissyllabic approach to sound changes.
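The idea of treating a dissyllabic word as a whole unit, rather than gluing together the forms of its separate syllables, can be pictured as a two-level lookup. The Python sketch below is only an illustration under stated assumptions: the two tables hold nothing but correspondences cited in the examples above, and the table layout and the function name vietnamese_equivalents are hypothetical, not a tool used in this research.

```python
# Whole-word correspondences as cited in this text (pinyin -> Vietnamese).
WHOLE_WORD = {
    "jiārén": "ngườinhà",
    "rénjiā": "ngườita",
    "dàjiā": "tấtcả",
    "bāngmáng": "bênhvực",
}

# Forms that the same syllables take when they stand alone, as cited above.
SINGLE_SYLLABLE = {
    "jiā": ["nhà"],
    "máng": ["bận", "mắc"],
}


def vietnamese_equivalents(pinyin_word):
    """Return the cited Vietnamese forms, preferring the whole-word entry."""
    if pinyin_word in WHOLE_WORD:
        # The dissyllabic word is looked up as one unit; the standalone
        # forms of its syllables are never consulted.
        return [WHOLE_WORD[pinyin_word]]
    return SINGLE_SYLLABLE.get(pinyin_word, [])


for key in ("jiā", "jiārén", "rénjiā", "dàjiā"):
    print(key, vietnamese_equivalents(key))
```

The design choice worth noting is that the whole-word table is consulted first and independently: the form of jiā inside each compound is whatever that compound's entry says, regardless of the standalone entry.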
Once one accepts this principle, one will no longer wonder why -ư corresponds to -a, -iê to -a, -au ~ -ông, -at ~ -an, -an ~ -ôt, -ai ~ -ua, etc., and will not insist that -a- must be -ươ-, -ng must be -ng, or d- must be n-, and so on, in one-to-one relationships.
In fact, sound changes did happen within linguistic constraints, such as a cultural factor, as in "mẹ" ~ "mợ", or local speech habit, as in "kháchkhứa". They might also occur following certain patterns and, of course, within the boundary of linguistic kinship, e.g., English "cut" and Vietnamese "cắt" obviously are not cognates, but 隔 "gé" [kə2] and "cắt" are, given the historical context of the linguistic development of Vietnamese, which has gone hand in hand with the evolution of the Chinese language, whose vast vocabulary has penetrated into Vietnamese through various dialectal contacts at different times.
This new approach, based on dissyllabism, to studying Vietnamese words of Chinese origin will be used in this research paper. By centering on the recognition of the dissyllabic nature of the Vietnamese language, we will no longer look at a sound change pattern as an isolated phonemic event, but as a dynamic process in which the whole sound string, or cluster of sounds, changed together, independently of its monosyllabic word equivalents. These sound change patterns occurred just like those of the Latin polysyllabic roots that have given rise to many variations in the vocabulary stocks of the Indo-European languages.
Conventionally, therefore, in romanized transcription, Vietnamese dissyllabic words in this paper shall be written in combined formation, just as their Mandarin counterparts are transcribed in pinyin, such as
廢話 fèihuà ‘non-sense’ Vietnamese bahoa ~ baphải,
溫馨 wēnxīng ‘warm’ V ấmcúng,
開心 kāixīn ~ 高興 gāoxìng ‘pleased’ V vuilòng.
In fact, what is peculiar about words of dissyllabic formation is that their sound changes are dynamic phonological changes that have veered drastically away from, and become independent of, the original sounds. We will examine this phenomenon at length to understand why such words are sometimes both phonologically and semantically distinct from what they originated from. The result will lay the foundation for the new dissyllabic approach and will help us identify a vast majority of Vietnamese words as having a Chinese origin.
Multiple sound changes of the same single syllable within dissyllabic words may, at first sight, help readers see sound change patterns that apply to the word as a whole rather than to isolated syllables; at the same time, however, they may also confuse readers and leave them with the impression that the phonological variants given for the same Chinese monosyllabic root are ad hoc cases.
As to the dissyllabic characteristics of the examples cited above, while one may phonologically reconcile the sound change of 廢 fèi with ba, one will wonder how they can be connected semantically. Obviously this word has nothing to do with ba in the senses of ‘three’ or ‘father’... In fact, conceptually it renders the connotations of phế ‘waste’ and bỏ ‘abandon’ in Vietnamese. Individually, the meaning of each syllable-word is not the same as that of the whole new dissyllabic word that expresses the concept of "baphải" (non-sense). At the same time, neither ba- nor -hoa individually means anything lexically in Vietnamese, as opposed to what we know etymologically of those two syllable-words in Chinese. Together, as bound morphemes, they make bahoa what it is as a unit. In this case, one plus one makes one, not two: one form for one meaning. Structurally it is the same with baphải. In contrast to ba, however, it is easier to see why "fèi" has become "bỏ-" ‘unwanted, deserted’ as in
bỏphế 廢除 fèichú ‘eradicate’,
bỏđi 廢棄 fèiqì ‘abandon’,
đồbỏ 廢物 fèiwù ‘the unwanted’ (in reverse order),
bỏhoang 荒廢 huāngfèi ‘deserted’ (in reverse order).
Like ba, bỏ is not necessarily always associated with 廢 fèi. This is because sound changes from Chinese to Vietnamese are manifold, especially those of dissyllabic words. To gain more understanding of the idea that sound change is independent of the etymological root (originally a one-syllable word or one Chinese character) and is influenced by both phonological and semantic association and dissimilation, let’s further compare some Vietnamese words derived from Chinese dissyllables that result in Vietnamese homophones of bỏ:
bãibỏ 排除 páichú ‘abolish’,
bỏphiếu 投票 tóupiào ‘to cast a ballot’,
bỏrơi 抛棄 pāoqì ‘abandon’ (~ bỏngõ),
bỏđi 離去 líqù ‘leave’ (~ rađi),
bỏqua 放過 fàngguò ‘let go’ (~ bỏlỡ, an alternation of 錯過 cuòguò, a doublet of 放過),
bỏmặc 不理 bùlǐ ‘abandon’,
bỏlỡ dịpmay 放過機會 fàngguò jīhuì ‘let go an opportunity’ (~ bỏqua dịpmay),
bỏ tiền (vô túi) 放錢(進入口袋里) fàngqián (jìnrù kǒudài lǐ) ‘put the money (into the pocket)’,
bỏtiền ra mua 花錢來買 huāqián lái măi ‘spend the money to buy’,
bỏphí 白費 báifèi ‘to waste’,
bỏphiếu 投票 tóupiào ‘to vote’.
The sound changes to bỏ in the above examples, as well as the innovations in the other syllables, are due to different contextual settings. They involve not only phonological and semantic assimilation but also syntactic reshuffling through the reversal of word order, as exemplified in đồbỏ and bỏhoang, which was undoubtedly a local development to fit the Vietnamese speakers’ speech habit syntactically.
Similarly, that 話 huà phonetically evolves into hoa is acceptable, but in which way does it become phải? The sound change rule /hw/ > /fw/ applies here, as this phenomenon is very common in Chinese dialects such as Cantonese and Fukienese when compared with Middle Chinese or Mandarin sounds. Moreover, in dissyllabic formation, /fwa/ can easily evolve into /fai/, while
話 huà as a monosyllabic word evolved into lời ‘spoken word’, Sino-Vietnamese thoại (cf. the corresponding patterns: 火 huǒ lửa, 夥 huǒ lũ).
For the same reasons,
快 kuài may become mau (also a loan graph for ‘happy’, Sinitic-Vietnamese vui),
and it is not hard to understand why
點 diăn becomes -lên.
Of course, lên here has nothing to do with ‘ascend, get on’; it is only a particle indicating a command, similar to ‘up’ in ‘hurry up’. Phonologically, it is easy to see [tjen] ~ [len]. Individually,
點 [tjen] can also be tiếng ‘hour’, châm ‘ignite’, chấm ‘dot’ and ‘dip’, tí ‘a bit’, điểm, đếm ‘count’, etc.,
the different Vietnamese meanings of which match, phonologically and semantically, exactly what /tjen/ means in every definition of the word 點 diăn given in an ancient or modern Chinese dictionary. Let us compare lên in another context:
lênđây 上來 shànglái ‘come up here’.
In this case, shàng corresponds to lên ‘ascend’, and -lái is a particle, while -đây is assimilated to a Vietnamese adverb of direction of the same sound (zhèi 這 in Chinese). Lastly,
溫 wēn can be ấm, but in which way does 馨 xīn become cúng? Of course, it is not the same as
cúng 供 gòng (SV cống) ‘make offerings to spirits’,
but a result of sound change, as 馨 xīn is also pronounced xīng, Sino-Vietnamese hinh, MC xieng < *hing, in which the velar x- becomes a labiovelar /k-/, /k'-/, as commonly occurs in Chinese. Let’s compare 慶 磬 罄 ..., all pronounced qìng and Sino-Vietnamese khánh, and consider its phonological variations as in
thơmlừng ~ thơmlựng 馨香 xīnxiāng ‘fragrant’.
The above examples demonstrate the multifaceted sound changes from Chinese to Vietnamese. Each of the above dissyllabic words is composed of bound morphemes, either or both of which cannot be separated. Each is the result of the sound change of a dissyllabic word, in which any syllable can give rise to a completely new sound that can be entirely different from that of the very same syllable standing alone as a monosyllabic word. The new sound may or may not mean anything when separated from the compound form, depending on the degree of its association with another word similar in sound or meaning. Let’s examine the syllable-word mau- in mauchóng 敏捷 mǐnjié ‘quickly’, which is in fact a variation of 盡快 jìnkuài (> chóng + mau) and of its colloquial variant 馬上 măshàng.
In fact, Chinese dissyllabic words can take on various sounds in Vietnamese, and their order may be reversed to fit local speech habit; this will be discussed at much greater length later on from different perspectives. In any case, homophones and homonyms are plentiful in both Vietnamese and Chinese.
As to its true nature, Vietnamese has been wrongly characterized by monosyllabism (tínhđơnâmtiết 單音節性), the characteristic of a language whose vocabulary is dominated by one-syllable words; that is, Vietnamese is said to be a language that is lexically, semantically, and syntactically composed of one-syllable words. That might have been true in ancient times, but it is certainly not so in modern Vietnamese. We can say that this misconception in linguistic circles has misled specialists of Vietnamese to a degree that has certainly hindered breakthroughs in this field. For this reason, the result of this research is, hopefully, to correct the misconception about monosyllabism and to set out a new way to explore the Chinese origin of the Vietnamese language through this new dissyllabic approach, departing from the old approach, which is limited to isolated monosyllabic and merely basic words. This Sinitic-Vietnamese study is also an attempt to establish the kinship of Chinese and Vietnamese with linguistic proof in all lexical aspects.
Indeed, the two aspects of dissyllabism and Chinese origin are as closely intertwined as the two languages themselves, to the point that studies of either language cannot be done satisfactorily without reference to the other. Karlgren (1915), Haudricourt (1954), Chang (1974), Denlinger (1979), Pulleyblank (1984), and many others used Vietnamese when they studied Ancient Chinese phonology. Specialists in Vietnamese studies such as Haudricourt (1954), Lê (1967), Ðào (1983), and some others did the same, making use of Chinese dialects to shed light on the etymology of Vietnamese words. They all see the affinity, whether genetic or not, between Chinese and Vietnamese, but until now nobody has discovered that most Vietnamese words originated from Chinese, since they have mostly limited their research to monosyllabic words, which has prevented them from seeing other variations in sound changes from the same monosyllabic roots.
In fact, the dissyllabic approach to finding Vietnamese words of Chinese origin is based on two new premises. Firstly, both modern Vietnamese and Chinese are dissyllabic languages, that is, each of the two languages as a whole is composed of a high percentage of two-syllable words. Secondly, once Chinese and Vietnamese basic words are found to be cognates, a kinship may exist between the two languages, since basic words are what a language originally had to start with. As we will see, Vietnamese is closely affiliated with many ancient and modern Chinese dialects, literary as well as vernacular (together called "Chinese" in general). This new approach has indeed enabled me to find a remarkably large number, about 20,000, of Vietnamese words of Chinese origin, many of which have long been regarded as Nôm, or "pure" Vietnamese, words.
Again, this new dissyllabic approach treats each Chinese word, as it should be treated since this is the correct way to deal with Chinese lexicography, as composed of one or more morphemes, or syllables, each represented by a single Chinese character, regardless of the meaning associated with each individual morpheme and whether the word is monosyllabic or polysyllabic. In both Vietnamese and Chinese, a morpheme mostly coincides with a syllable, which is free to combine with other syllables to form other words.
Sometimes the syllabic combinations in Chinese, and consequently in Vietnamese, may convey completely different meanings regardless of the characters with which they are written, for instance,
on the Chinese side,
măshàng 馬上: mauchóng 'quickly'
qímă 起碼: ítra 'at least'
piányì 便宜: bèo 'cheap'
dōngxī 東西: đồđạc 'things'
liáotiān 聊天: tròchuyện 'chat'
wúliáo 無聊: lạtlẽo (~ nhạtnhẽo) 'boring'
mòshēng 陌生: lạlùng 'strange'
huāshēng 花生: đậuphụng 'peanut' (Hainanese /wundow/)
and here on the Vietnamese side,
mặnmà 甜蜜: tiánmì (~ mật\ngọt) 'tasty'
thathiết 體貼: tǐtiē 'heartily'
cẩuthả 苟且: gǒuqiě (~ ẩutả) 'carelessly'
vấtvả 奔波: bēnbō (~ tấttả) 'hand to mouth'
múarối 木偶戲: mùǒuxì 'puppetry'
trờinắng 太陽: tàiyáng 'sunshine'
bồihồi 徘徊: páihuái 'sadly'
chịuđựng 忍受: rěn\shòu 'endure'
bắtđền 賠償: péicháng 'ask for compensation' (~ bắtthường)
For the words on the Chinese side, any linguist of Chinese knows this well. In a Chinese dictionary one can find characters or polysyllabic words that have multiple meanings, where the graphs involved have nothing to do with the meanings they convey. In the case of Chinese words evolving into Vietnamese, examples with the same characteristics as those cited above are endless. It is no surprise that what has passed into Vietnamese is sometimes not exactly what it originally was in Chinese. For instance, 起 qǐ means, among other things, ‘to rise’ (VS: dậy, hence 起義 qǐyì, VS: nổidậy ‘to rise against’), but 起碼 qímă means ‘at least’ (VS: ítra), 興起 xìngqǐ ‘interested’ (VS: hứngchí and mừngrỡ), and 起頭 qǐtóu ‘start’ (VS: bắtđầu).
Other examples include 孝順 xiàoshùn ‘filial piety’ (VS: hiếuthảo), 順利 shùnlì ‘smoothly’ (VS: suôngsẻ and trótlọt), 順風 shùnfēng ‘favorable wind’ (VS: xuôigió and thuậngió), 順手 shùnshǒu ‘conveniently’ (VS: thuậntay, sẵntay and luônthể), and 順便 shùnbiàn ‘conveniently’ (VS: luôntiện and sẵntiện).
The word-morphemes 起 and 順 are in bound form and have evolved into different sounds, meanings and words in Vietnamese. Words containing the morphemes ‘qǐ’ and ‘shùn’ are innumerable in the Chinese language. By actively pursuing this avenue in the search for words of Chinese origin, we will find that almost all Vietnamese words have a Chinese origin!
As we have seen through all the illustrations in this paper, the failure to recognize the dissyllabism of Vietnamese and Chinese has prevented specialists in Vietnamese etymology from seeing that the sound changes of individual syllables in dissyllabic formations are independent of those of their original monosyllabic equivalents. As for monosyllabism, in ancient times both Vietnamese and Chinese might have been monosyllabic. It is easier to confirm the monosyllabic character of Chinese, based on literary works more than two thousand years old, than that of Vietnamese, whose oldest works date back only about ten centuries. However, the basic words that the two languages appear to share do point in the direction of monosyllabism.
In any case, any dictionary of modern Vietnamese contains thousands of dissyllabic words and a few polysyllabic ones, even though they are written as separate syllables. In the past, many experts on Vietnamese insisted on its monosyllabic character, as represented by Barker (1966, p. 10): “With the exception of certain compounds, reduplicative patterns, and loan words, Vietnamese and Muong are both monosyllabic languages.” By the same reasoning, English too would in certain respects be a monosyllabic language! The statement also suggests that this is all he knew about the Vietnamese language. Some Vietnamese linguists may have "worshipped" him, more or less, simply because he was a Western linguist who knew something about Vietnamese. When he wrote “certain compounds, reduplicative patterns, and loan words”, anyone unfamiliar with the language might conclude that only a small number of such words exist in Vietnamese. In reality, almost the whole vocabulary stock of Vietnamese is structured in this way, as any Vietnamese dictionary shows. In other words, his statement can be used to disqualify him as a specialist in Vietnamese. Ironically, many Vietnamese linguists in the field tend to defer to any Westerner who knows something about Vietnamese and has something to say about it!
It is true that many dissyllabic words in Vietnamese can be analyzed into combinations of monosyllables that can be used independently and attached to other syllables to form other compounds. Nevertheless, a great number of words are composed of two or more syllables (or morphemes, as they are to be considered in this case) that cannot be separated into single syllables used as independent words. Good examples are the most basic Vietnamese words for parts of the human body, which must have originated in ancient times, such as cùichỏ ‘elbow’, đầugối ‘knee’, mắccá ‘ankle’, màngtang ‘temple’, mỏác ‘fontanel’, chânmày ‘eyebrow’, etc. All of these are dissyllabic words, since the syllables of each word are as inseparable as those of their English counterparts. The only difference is that, as in its sister language Chinese, each morpheme in its free form as a complete syllable can mean something else. For example, đầu also means ‘head’ and gối means ‘to lean against’. Other examples from the great number of dissyllabic words in different areas are càunhàu ‘growl’, cằnnhằn ‘grumble’, bângkhuâng ‘pensive’, bồihồi ‘melancholy’, bùingùi ‘sorrowful’, mồhôi ‘sweat’, mồcôi ‘orphan’, bằnglòng ‘agree’, taitiếng ‘notorious’, tạmbợ ‘temporary’, tráchmóc ‘reproach’; Sino-Vietnamese words such as hiệndiện ‘presence’, phụnữ ‘woman’, sơnhà ‘fatherland’; and polysyllabic words such as mêtítthòlò ‘irresistible’, húhồnhúvía ‘Oh my Lord!’, bađồngbảyđổi ‘unpredictably’, hằnghàsasố ‘innumerable’, lộntùngphèo ‘upside down’, tuyệtcúmèo ‘wonderful’. (For more detail on this discussion, see Sửađổi Cáchviết ChữViệt.) If such words were written as joined units instead of as separate syllables, they would certainly give foreign learners of Vietnamese a different impression, Barker himself included.
On the matter of polysyllabism, renowned Vietnamese linguists in the past, such as Bùi Ðức Tịnh (1966, p. 82), sided with Hồ Hữu Tường in criticizing and rejecting the idea that Vietnamese is a monosyllabic language. Both of them treated Vietnamese as a dissyllabic language. The high percentage of Sino-Vietnamese words used in today’s Vietnamese, like those quoted above (comparable to words of Latin and Greek origin in English), is by itself sufficient to establish the dissyllabic nature of the language, let alone the polysyllabic words of other categories. Many of those loanwords are unbreakable. The Koreans and Japanese have long recognized this and always, scientifically, write Chinese loanwords in “groups”! Unfortunately, in today’s Vietnamese writing system each such dissyllabic word is still broken into two syllables, each of which, standing alone, may be unrelated to the original meaning or may not mean anything at all!
Exactly the same can be said of the dissyllabic character of the Chinese language. Any Chinese dialect today is also a dissyllabic language. On this issue, Chou (1982, p. 106) quoted others in his article:
Following Kennedy and de Francis, Eugene Chin said: “If we admit that words, not morphemes, are the construction material of Chinese, we cannot but admit that Chinese is polysyllabic. If we may use the majority rule here, we will have no trouble establishing the fact that Chinese is dissyllabic.”
From this premise, given that Vietnamese and Chinese are both dissyllabic, we can trace each dissyllabic word in the two languages and find that, phonologically, a single dissyllabic Chinese word can become quite a few different words in Vietnamese. For instance, the one Chinese word 三八 sānbā (Sino-Vietnamese: tambát), meaning “nonsense”, may have evolved into tầmphào, tầmbậy, tầmbạ, bảláp, bảxàm, basạo, xàbát, xằngbậy... in Vietnamese.
As to sound change from Chinese into Vietnamese, linguists who started from the premise that Chinese and Vietnamese are both monosyllabic languages looked for only one Vietnamese equivalent for each Chinese character, itself taken to be a monosyllabic word, and in most cases associated only one word of Chinese origin with each Vietnamese word. That is, bound to the old approach, they sought the etymology of Vietnamese words by confining themselves to isolated monosyllables and looking for their corresponding Chinese cognates.
Once and for all, let's face it: since both are dissyllabic languages consisting mainly of two-syllable words, the rules of sound change from Chinese dissyllabic words into Vietnamese ones are just like those of other polysyllabic languages. For instance, when polysyllabic words of the same root pass from one Indo-European language into another, at least one of the syllables may not strictly follow the same phonological pattern in every language, as in Latin gelata > French gelée, or the variations of the word “police”: politi, polizei, policia, polizia, polite, polis, polisi, "phúlít" (old VS from French).
What does this rule have to do with Vietnamese words of Chinese origin? In the Chinese > Vietnamese scenario, one Chinese character (coinciding with a syllable and a word) should in theory yield only one equivalent sound (word) in Vietnamese. In reality, in many cases there is more than one Vietnamese sound for each Chinese character, for example,
元 yuán SV nguyên, ngươn, VS (tháng)giêng,
度 dù SV độ, VS đo, đạc,
粉 fén SV phấn, VS bún, bột, phở,
拜 bài SV bái, VS vái, lạy,
etc., or in compounds:
場 chăng SV trường, tràng, but in Vietnamese there are several sounds:
劇場 jùchăng (SV: kịchtrường) sânkhấu 'stage',
試場 shìchăng (SV: thítrường) trườngthi 'examination site',
戰場 zhànchăng (SV: chiếntrường) chiếntrận, hence, trậnchiến 'battle' (note: word order is in reverse in all three cases above),
一場夢 yì chăng mèng (SV: nhất trườngmộng) một giấc/cơn mơ/mộng 'dream',
一場病 yì chăng bìng (SV: nhất trườngbệnh) một trận/cơnbệnh 'illness',
一場戲 yì chăng xì (SV: nhất trường hí) một xuấthát 'a show',
一場空 yì chăng kong (SV: nhất trườngkhông) một khoảngtrống 'nothingness, nada',
在場 zàichăng (SV: tạitrường) tạichỗ ~ tạitrận 'on spot, red-handed', etc.
The sandhi process of association has occurred not only in syllables where neighboring sounds with similar syllable-words and meanings can be assimilated, which may already have taken place before they were introduced into Vietnamese, as in the above cases where zhèn 陣 (trận) or chù 黜 (xuất) had been associated with chăng 場 ...
- Edited by: dchph on Mar. 16, 2003, 19:54
x X x
-----------------------------
Nghia <varianunity@xxx.xxx> (23/01/02)
1) First email:
You are CRAZY!
The language can not to be forced being in this way or that way. I hope you understand, if you are working in linguistics science, that linguistics scientists can not MAKE rules for a language.
All they could do is to SAY the way that some people speak/write is correct or incorrect.
What you are trying now is another attempt of 'language re-form' (gosh, again?), and of course will fall between the forgotten. I'm sorry to give you such a cruel prophet. In fact no Vietnamese needs your conception, except your crazy comrades, real or virtual copies of your self.
Cheers.
Nghia
2) The second email, from the same writer, was an attachment of a piece widely distributed on the internet entitled "Euro-English", which has long been posted on the VNY2K.COM website on the Thugian2 page. In his postscript, the author wrote, "p.s. some how similar, isn't it? Yes, it is the same ideology, the same dream, such as yours."
3) Third email (15/02/02) , from the same author:
Hi Sir.
If I were you, i would cut off the "New Vietnamese Dictionary" article by [Tan Việt]. Remember the "Euro-English" joke I sent you before? I didn't send it just for fun.
By the way, I've read a great deal of articles in your website. Some of them are highly valuable and I enjoyed them and informed my friends. I recognize you as an expert in linguistics. But again, my opinion, don't change the rules! We will waste an infinite amount of time in this fast-stumbling electronical era only for arguing which rules.
Regards, tran
--- ATran29181@xxx.xxx wrote (09/10/2001):
> Dear DChph,
> Happen to know vny2k website and viewed your arguments with David Vu's. I do not want take your time but remind you that written and spoken language should be the most cohesive semantic system and that language is made of pattern and structure (you talking only about structure). Music composers will be not happy to write their lyrics following your "Sua doi cach viet tieng Viet." Surely I believe you understand Pattern and Structure and the impossibility of a mathematical semantics of natural language as your proposal.
> Regards
> Anthony Tran, a supposed to be a French linguistic [sic] for a computer company.
--- Bryan Tran <btran@xxx.com> wrote:
> I think it will be a good reform.
> However, should it be one capital letter for one group word such as
> Tiengviet instead of TiengViet
> Tiengnom instead of TiengNom
> Saigon instead of SaiGon
> Hanoi instead of HaNoi
> Vietnam instead of VietNam
> Best Regards
> Quan Tran
----------------------------------------------------------
Thank you for your email. I'm glad to hear another voice from the opposite camp; it means that voices from our side have been heard. From that starting point, with shared interest in and concern about the same matter, in this case the present state of the Vietnamese language, we can all find the truth and common ground for a widely acceptable solution. The good thing is that we all acknowledge that our present writing system does have a problem. I would like to take the liberty of posting your comments in this webspace for readers' own judgment.
I assume that you have at least read "Sửađổi Cáchviết ChữViệt"; otherwise, my reply to your comments on our Vietnamese2020 language reform campaign would make no sense.
You are right that linguists only state facts and do not make language rules. The same applies to grammarians: they can only state that language usage is correct or incorrect, while good or bad writing is only a matter of personal choice.
It is sad that your prophecy for our reform is that it is doomed, but we believe in our cause. I am also glad to be reminded that we do have supporters on our side from intellectual circles (linguists, professors, researchers, academicians, and the like), as well as many ordinary people outside the linguistic profession. To you we may all look crazy for going against the ignorant mass. But one thing is clear: just like linguists and grammarians who can tell correct from incorrect in language expressions, we know the unknowns and are determined to point out the rights and wrongs to those who are not aware of certain awful facts about the current writing system of our mother tongue.
We initiated the language reform in the hope that others will not be afraid of reform, even in matters of language. Civilizations have progressed, with later ones often improving on earlier ones, seemingly by going through one reform after another. Proof is plentiful in human history. This case is certainly not comparable with the so-called "Euro-English"; the two are just apples and oranges in the same basket.
dchph replies:
Hi Anthony Tran:
Glad that you wrote.
In his argument, David Vu agrees with me that foreign words should be kept as they are, but he means only English ones. He may have forgotten that a vast stock of Vietnamese vocabulary is of Chinese origin and mostly dissyllabic in nature, because those words sound so "Vietnamese". Shall we consider those Vietnamese words to be of "foreign" origin, just like the polysyllabic words of Latin and Greek origin in English? How does the notion of the possibility of "a mathematical semantics of natural language" apply to the English language?
In poetry and song lyrics, Vietnamese words already come in pairs. There is no need to change anything there, is there? Someone has opposed the Vietnamese2020 reform idea because he very much enjoys "seeing" poems written in the current way; maybe that is the pattern you mention. Is it this kind of pattern, "xx xx xx xx xx...", that you are talking about? Regretfully, I really have no idea about it, nor about the notion of the "impossibility of a mathematical semantics of natural language". However, linguistics is a science, and so is mathematics. We cannot live on art alone, and I believe art can go its own way without our intervention, but science cannot. A train can only make progress if it runs on a track, and it should be on the right track.
Thank you.
Hi Quan Tran:
Thanks for writing.
I agree with you that Hanoi, Saigon, or Vietnam should be written the natural way.
However, Viet in TiengViet seems to stand out prominently over the other syllable.
In English, too, have you ever noticed that some words are written as anti-American, anti-Semitism, pro-French?
Regards
dchph
-------------------------------------
Date: Sun, 2 Oct 2005 18:40:06 +0200 (CEST)
From: "Gentium User" <waythrow-sil@xxxxxxxxxxxxxxxxxx.de>
Subject: Sinitic-Vietnamese draft + Vietnamese 2020
Dear dchph,
I had a short glance at your draft of an introduction to Sinitic-Vietnamese Studies. Seems to make for an interesting read.
On www.vny2k.net/vny2k/SiniticVietnamese.htm, you provide a link to a popular Unicode true-type font. This font is proprietary and I am not sure whether it is legal to download it. There is a very beautiful font which is much better suited for Vietnamese. That font is called GentiumAlt and everybody can download it for free from
http://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&item_id=Gentium_download
I suggest you set a link to that download page instead. You can download an archive containing the fonts Gentium and GentiumAlt. When reading Vietnamese, the latter looks better than the former, although I viewed your draft with Gentium and it worked fine.
In your Vietnamese 2020 proposal, you mentioned a couple of German terms which I'd like to comment on /correct:
"Informationssystemverarbeitung" < I've never heard of that, but there might be people who use it. Anybody could basically coin any word just like in English; the only difference with English is really that in German, you should not leave spaces in a word, or at least use hyphens.
"Aufwiedersehen" < this is wrong; should be "auf Wiedersehen" or (as an expression) "Auf Wiedersehen!"
"Recherchemoglichkeiten" = (re)search possibilities (plural!), NOT "possibility research"
"Datenbanken" = data banks? (Plural, NOT "data bank"; besides, aren't those things called data*bases* instead of data*banks*?)
"Betriebwirschaft" should be "Betriebswirtschaft"
Cheers
jns
PS. You write "We are the advocates of the newVietnamese2020 language reform!" - shouldn't this be "I am the advocate of a new Vietnamese 2020 language reform!" instead?
--------------------
dchph replies:
Dear 'Gentium User':
Thank you for taking the time to send me your comments. I'll look into these matters soon.
Your email makes some good suggestions, including those on the German words cited. I'd like to take the liberty of posting it on ziendan.com so that interested parties may learn something about German word formation. During 1972-75 I spent some time studying German at the Goethe Institute, but I have never had a chance to use it since and have now forgotten everything. It is therefore no surprise that I made a mistake as simple as "Aufwiedersehen".
Re the Arial Unicode font: as far as I know, it is the only font that covers all the world's languages and IPA symbols, and it is offered for download on many sites. I'm looking for a similar font to replace it.
As for "We are the advocates..." versus "I'm the advocate...", it really is a "We...", since by now many others have jumped on the bandwagon and expressed their support for the writing reform idea, as seen in their viewpoints posted in the ziendan.com forum.
Regards,
dchph
--------------------------------------------------------
x X x
Change the Way Vietnamese is Written:
Don't write Vietnamese as if it were a monosyllabic language!
By PHU CUONG
I've been wondering for a while about the question: "Why is Vietnamese written as a monosyllabic language when it is not?" The answer I found is based on these theories:
From the very beginning of the modern human races, all languages around the world developed from single sounds or monosyllabic words! The first human languages, which originated somewhere in Africa, certainly had only a few hundred simple monosyllabic words, e.g. eat, drink, jump, go, run, I, you, he, je, nous, vous, etc. This can be proved by observing how babies learn their first language. Later, thanks to the evolution of the human brain and the need for better communication, people started adding more sounds (or syllables) to the original words to express their minds more clearly. In fact, monosyllabic languages are now spoken only by a very few tribal minorities in Asia and Africa, and Vietnamese is not one of them. Unfortunately, Vietnamese, spoken by at least 75 million people, is still considered by many to be one of the monosyllabic languages that still exist on earth.
Even though Vietnamese is now grouped into the Mon-Khmer language family by the majority of Vietnamese linguists, which is still a questionable and debatable matter, it can still be considered a cousin of original Chinese on the basis of similar characteristics in all its linguistic aspects, except for the grammatical position of the adjective (cf. thien thanh vs. troi xanh). Vietnamese is written the way it is today mainly because each of its sounds was originally based on a single Chinese character. However, Chinese in its present state is certainly a polysyllabic language, as most Chinese linguists recognize. Besides Vietnamese and Chinese, Korean and Japanese are also written in a similar way, in block characters, but they are not monosyllabic languages at all. Therefore, the fact that Vietnamese is still written in separate syllables does not make it a monosyllabic language by nature. We can prove this by examining the many basic Vietnamese words that are disyllabic or polysyllabic in nature, for example, mangtang, moac, cuicho, bavai... The fact that the majority of the vocabulary used in Vietnamese originated from Chinese is alone enough to establish the polysyllabic nature of the language. For example, one can say sonha, giangsan, sonthuy, but cannot break the morpheme son or san out of its compound to say "toi len son".
Today's Romanized Vietnamese was invented and evolved when European missionaries came to Vietnam and learned Vietnamese mostly from peasants and ethnic minorities such as the Muong, whose command of Vietnamese was poor. In using the Latin alphabet to transcribe spoken Vietnamese, these missionaries intentionally simplified the way words were written so that the new writing system would be easier for the less-educated mass to read. Vietnamese words were therefore clearly separated by a space or a hyphen between syllables, regardless of whether they were monosyllabic or polysyllabic in nature. For example: ong_Troi, con_gai, Quoc_Tu_Giam... (instead of ongtroi, congai, Quoctugiam), or tu-tuong, thanh-kien, thien-than, vat-chat, tu-do, which should have been correctly transcribed as tutuong, thanhkien, thienthan, vatchat, tudo, etc.
This way of writing Vietnamese was also shaped by the missionaries' own advantage and convenience. Please don't get me wrong: I do not deny the good they contributed to the development of our modern written language. But imagine a math teacher who always gives students easy problems to make them happy. That was how the missionaries kept their followers happy. It has been over 300 years now, and even though the script has been in official use for less than a century, people have grown used to this way of reading and writing. Unfortunately, this has profoundly affected the way we speak and write and, most importantly, the evolution of the genetic structure of our people's brains, too.
Nowadays, people writing Vietnamese have become too lazy to draw a short bar, or hyphen (-), between the syllables of disyllabic or polysyllabic words such as tivi, honda, mayvitinh, vanhocnghethuat, yet many do remember to use hyphens when writing their full names, such as Nguyen-van-Anh-Vu. Why don't they just write Nguyenvan Anhvu instead? Something must be sacred about their names, but the soul of Vietnamese is not; that is somebody else's business. Technically, the separation is just a waste of paper and time. Consider that one must lift the pen or hit the space key on a computer keyboard hundreds of times for a few pages of writing. It is illogical that most newspapers print "Saigon, Ha Noi in Viet Nam" instead of "Saigon, Hanoi in Vietnam". My rough estimate is that we would save up to 5% of paper and 10% of our valuable time if we omitted those separating spaces.
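The saving can be measured directly on any pair of spellings. The following is only a minimal sketch: the function name `space_savings` is made up for illustration, and the single newspaper phrase quoted above is far too small a sample to confirm or refute the 5% estimate for running text.

```python
def space_savings(separated: str, joined: str) -> float:
    """Fraction of characters saved by writing `joined` instead of `separated`."""
    return (len(separated) - len(joined)) / len(separated)


# The newspaper phrase quoted in the paragraph above, in both spellings.
separated = "Saigon, Ha Noi in Viet Nam"
joined = "Saigon, Hanoi in Vietnam"

saved = space_savings(separated, joined)
print(f"{len(separated)} -> {len(joined)} characters, {saved:.1%} saved")
```

Running the same function over a longer text written both ways would give a figure that could be compared with the estimate.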
Nevertheless, this article is not about saving time and paper. There is something more important: the way we write will affect the way we speak and the development of our people's brain structure, especially in children who are learning their first words in Vietnamese. The brain of a child who "must" learn a first language from a more sophisticated system, in this case the written language, would naturally be required to work harder and, as a result, would develop a little better than otherwise. Generation after generation, that will have a significant cumulative effect on the genetic evolution of our people's brains.
Every language on earth evolves, changing slowly over time. Typically, more than a few words and meanings are added to American English every day! Vietnamese needs even more than that, because of the growing role that new technology and science play in daily life and that a developing country like Vietnam badly needs. That change will not be easy to accomplish if our language keeps its monosyllabic way of writing, because Vietnamese as it is today is no longer a monosyllabic language. The monosyllabic languages still in use are those of tribal or less civilized peoples whose brains have not evolved or developed enough to memorize or coordinate different syllables simultaneously in their native tongues. We do not have to be Vietnamese linguists to grasp the problems with our mother tongue; common sense teaches us what is "better" or "better not". So please don't take this lightly. Let's join in a movement to start writing Vietnamese in a more scientific way that reflects the true nature of our polysyllabic language.
Here is the first simple step we can begin with: do not leave a space inside polysyllabic words. For instance: cuicho, chanmay, bavai, bangkhuang, bongo, thanhaonhansi, hanghasaso, hientuongluan, nhansinhquan, utru, tuongtuong, canhchung, xemxet, quansat, suyluan, tuduy... or any words that usually come together, such as cucchangda, tuynhien, buoisang, chodu, macke... In English, one may notice that commonly paired words are written in joined form. For instance, "nevertheless" was "never the less", "albeit" was "all be it", and "afternoon" was clearly from "after" + "noon", etc.
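As a rough illustration of this first step, the joining could be automated with a word list. The sketch below is an assumption, not part of the proposal: the function `join_polysyllabic`, the tiny `LEXICON` and the greedy longest-match strategy are all chosen here for illustration, and greedy matching cannot settle cases where the same syllables may or may not form a word.

```python
# Hypothetical mini-lexicon of joined forms, taken from the examples above.
LEXICON = {"cuicho", "chanmay", "tuynhien", "buoisang", "hanghasaso"}


def join_polysyllabic(text: str, lexicon: set, max_len: int = 4) -> str:
    """Join runs of space-separated syllables that form a word in `lexicon`.

    Greedy longest match: at each position, try the longest run first
    (up to `max_len` syllables) and fall back to the single syllable.
    """
    syllables = text.split()
    out = []
    i = 0
    while i < len(syllables):
        for n in range(min(max_len, len(syllables) - i), 0, -1):
            candidate = "".join(syllables[i:i + n])
            # A single syllable is always accepted, so the loop always advances.
            if n == 1 or candidate.lower() in lexicon:
                out.append(candidate)
                i += n
                break
    return " ".join(out)


print(join_polysyllabic("tuy nhien cui cho bi dau", LEXICON))
print(join_polysyllabic("hang ha sa so", LEXICON))
```

A real tool would need a full dictionary of joined forms and some way of handling ambiguous runs, which a plain word list does not provide.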
Noitomlai, dayla mot loi viet dung cho tieng Viet hiendai:
Chungta cothe batdau lam mot cuoc cachmang nho cho vanhoa Vietnam, bangcach thaydoi loi viet tieng Viet ngaytubaygio. Bietdau trong tuonglai motvai thehe toi se cho ketqua totdep ma hientai khong ai cothe thaytruoc duoc!
PHU CUONG
x X x
Author: lost-dude
The purpose of the orthography is to denote the spoken language. The Vietnamese quốc ngữ writing system has a few little problems that we haven’t attempted to solve, such as ă, â, gi, qu... [The Vietnamese GI, originating from the old Italian GI, has a sound equivalent to the English Z. So the word “gì” is z` (a consonant with an accent). And QU produces a sound like W, so words such as quốc should actually be spelled quuốc = wuốc. Notice how currently quần and quằng are pronounced differently, for one is qu + uần (spelled the Việt cộng style, it’s u – ớ – nờ – uân – quờ – uân – quuân – huyền - quuần), the other qu + ằng (ă – ngờ – ăng – quờ – ăng – quăng – huyền – quằng).] Before reforming the system based on the current quốc ngữ script, perhaps they should try to solve those things first.
Is
making the Vietnamese writing system polysyllabic necessary? It is not a
necessity for the language, per se. The QN script has been monosyllabic for
centuries, so to speak, and it hasn’t diluted the Vietnamese language, at
least to a measurable degree. So why should we change it, if it’s working
okay?
The first thing to look at is the language itself. Is Vietnamese a monosyllabic “language”? (Not to be mistaken for a monosyllabic orthography.) The answer is: mostly so. A huge number of Vietnamese terms consist of only one sound: ba, xanh, đi, nhớ... Yet there are only roughly 6,500 distinct syllables in Vietnamese. So even if every sound had three meanings, that would add up to only around 20,000 words. Twenty thousand is not adequate for a language, and we know that if we open a Vietnamese dictionary, it has more than twenty thousand words. From this illustration alone, we can safely assume that the Vietnamese language itself is not monosyllabic.
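The arithmetic behind this estimate is easy to check. In the sketch below, the syllable count and the three-meanings-per-syllable figure are the post's own rough numbers, not measured data, and the names are invented for illustration.

```python
SYLLABLES = 6_500          # rough syllable inventory quoted in the post
MEANINGS_PER_SYLLABLE = 3  # the post's generous assumption


def max_monosyllabic_words(syllables: int, meanings_per_syllable: int) -> int:
    """Upper bound on the vocabulary if every word were a single syllable."""
    return syllables * meanings_per_syllable


ceiling = max_monosyllabic_words(SYLLABLES, MEANINGS_PER_SYLLABLE)
print(ceiling)         # 19500, i.e. "around 20,000"

# For comparison: the number of possible two-syllable combinations.
print(SYLLABLES ** 2)  # 42250000
```

The second figure only counts combinations that are possible in principle; it says nothing about how many of them are actual words.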
Having spoken Vietnamese, we notice that a lot of the time a “word” is not a term or anything, but just a sound. Like khuâng: what is khuâng? There’s no such thing as a khuâng. But wait a minute! What about bâng khuâng? Well, bâng khuâng is two “words” that constitute a term, an adjective, to be exact. So we have two “words” that have no meaning whatsoever individually, but put together they produce something meaningful. Khuâng isn’t the only word in Vietnamese that doesn’t have any meaning on its own. It’s not a prefix or a suffix, really, but it’s not a “word” in the conventional sense either. What, then, do we make of it? It’s just a sound, a sound used in conjunction with another sound to create a part of speech. Like chopsticks, which are mostly used for picking up food and are almost useless singly. Two sticks together make a pair; two syllables together make a, uhm, for the sake of discussion, let’s call it a “term”. This backs up the previous conclusion that the Vietnamese “language” isn’t monosyllabic.
Secondly, now that we’ve established that Vietnamese isn’t strictly monosyllabic, is there anything wrong with using a monosyllabic writing system to denote a polysyllabic language? Nothing apparent suggests that it’s “wrong” to do so. It’s been like that for centuries, so why bother to change it when it still functions? Under the education system of the Republic of South Vietnam, a missing hyphen was considered a spelling mistake, so I don’t think students would welcome this change. (Back then, if you didn’t spell Việt Nam as “Việt-Nam”, points would be deducted.) But before discussing that and similar objections to change: why the change? I personally welcome it, because I like changes. If we change to something that sucks, we will realize how good what we had was, and we can simply go back to it. But if we don’t change at all, how would we know whether what we have is the best? Tradition is a great thing. Consider it the past, and we all should know history. Nevertheless, tradition shouldn’t be an obstacle to improvement. With the if-it-ain’t-broke-don’t-fix-it mentality, the Vietnamese people have been farming pretty much the same way for millennia. Americans started farming only about three hundred years ago. Experienced practitioners as the Vietnamese are, it’s hilarious that an American family is more productive than a Vietnamese village when it comes to agrarian matters.
Not
all changes are good, but not all changes should be dismissed out of hand
either. So let’s take a look at the proposal on the disyllabicization of
the Vietnamese script. The classic example is the title of Emperor Gia
Long: Khai Thiên Hoằng Đạo Lập Kỷ Thùy Thống Thần Văn Thánh Võ Tuấn Đức Long
Công Chí Nhân Đại Hiếu Cao Hoàng Đế. If you come across this title in a
textbook, how in the world would you know how to read it? “Con cháu vua Thế Tổ
dùng mỹ hiệu “Khai Thiên Hoằng Đạo Lập Kỷ Thùy Thống Thần Văn Thánh [pause
to catch some breath…] Võ Tuấn Đức Long Công Chí Nhân Đại Hiếu Cao Hoàng Đế
[take a little break…] để truy tôn vị khai sáng ra cơ nghiệp nhà Nguyễn.”
Now let’s imagine doing that as a teacher. We can’t expect our listeners to
get it if we don’t get it ourselves. The hyphenation in the past would solve
that problem: Khai-Thiên Hoằng-Đạo Lập-Kỷ Thùy-Thống Thần-Văn Thánh-Võ
Tuấn-Đức Long-Công Chí-Nhân Đại-Hiếu Cao-Hoàng-Đế. Notice how you can
distribute the pauses evenly so there isn’t a huge gap? KhaiThiên HoằngĐạo
LậpKỷ ThùyThống ThầnVăn ThánhVõ TuấnĐức LongCông ChíNhân ĐạiHiếu CaoHoàngĐế,
though exotic-looking, serves the same purpose.
At this point, people
would give the classic case of “pháthành”. “Ông tỉnhtrưởng ra lệnh
pháthành”. Is he ordering [people] to take down the wall (phá thành) or to
publish [something] (phát hành)? Of course, with the context, it should be clear,
but without it, we can’t tell. Phát hành or phá thành: there would be tons of
cases like this. Hmm, confusion already, so why should we accept it? This
suggestion isn’t perfect after all. But one can argue that without context, many,
many things are confusing anyway. If we hear, “It is red”, without context, is
it “It is red” or “It is read”? Em có bận không = Are you busy? Or is it
something perverted?
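The “pháthành” ambiguity can be made concrete with a small sketch. It assumes a tiny hand-picked syllable inventory (hypothetical, for illustration only; a real check would need a full list of Vietnamese syllables) and simply lists every way a joined word can be cut into known syllables. A word with more than one cut is one that a reader must resolve from context.

```python
import unicodedata


def nfc(text):
    """Normalize to composed form so that accented letters compare reliably."""
    return unicodedata.normalize("NFC", text)


# Hypothetical, hand-picked syllable inventory (illustration only).
SYLLABLES = {nfc(s) for s in ("phá", "phát", "thành", "hành", "tỉnh", "trưởng")}


def segmentations(word, syllables=SYLLABLES):
    """Return every way to split `word` into a sequence of known syllables."""
    word = nfc(word)
    if not word:
        return [[]]
    results = []
    for end in range(1, len(word) + 1):
        head = word[:end]
        if head in syllables:
            # Keep this head and try to split whatever is left.
            for rest in segmentations(word[end:], syllables):
                results.append([head] + rest)
    return results


print(segmentations("pháthành"))    # two readings: phá + thành, phát + hành
print(segmentations("tỉnhtrưởng"))  # one reading: tỉnh + trưởng
```

With this inventory, “pháthành” has two readings while “tỉnhtrưởng” has only one, which is the point the example sentence above is making.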
So it has its pros and cons: there are
issues that need to be addressed. This proposal addresses some issues but ignores
others, while creating new ones. My personal conclusion? Send it back
to the drawing board.
We buy computers and then later on
upgrade them. After a while, we throw them away and get newer ones. Why?
Because of the availability of something better. When there’s a newer
version of a program, oftentimes we are cautious about implementing it.
Microsoft updates solve some problems and generate new ones. One thing is
clear, though: the OS in 2003 is better than the one in 1995.
We should
be cautious of changes, but remember the fact that not all changes are bad.
Vietnamese is polysyllabic; a polysyllabic writing system isn’t a bad idea.
I support those people in their cause and encourage them to work on a better
reformation. O ther wise I think they would be ve ry un suc cess ful.
(Notice how much better it is to have a single “word” for unsuccessful? And
it's much easier to skim a polysyllabic text than its monosyllabic
counterpart.)
Source: www.vietcyber.net/forums/showthread.php?t=91775
- Edited by: dchph on Jun.13.2009, 19:39
-------------------------------------------------------------------------------
Stunning! Vietnamese2020 has passed the test on "Free Online Text to Speech (TTS) Reader"
By Vuara
Apr.7.2019 01:44 am
I was just curious to see if the machine would accept Vietnamese2020 and
output the correct pronunciation. I was stunned! The result was 99%!!!
The test engine: The Online Text to Speech Engine @
https://inforobo.com/text-to-speech-online/
The text to be tested: The first 3 paragraphs of the article "Vấnnạn
Liênhoàn" @ http://vny2k.com/vny2k/VannanLienhoan.htm
The words with (only slightly) less than perfect pronunciation:
nhậnthứcrõrệt, hiệnhữu, vấnnạnliênhoàn
Error metrics provided by https://wordcounter.net/: 3 out of 309 words OR
36 out of 2,062 characters
VERY IMPRESSIVE.
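As a quick check of the "99%" figure, the two error counts quoted above can be turned into accuracy percentages. This is only arithmetic on the numbers given in the post; the function name is made up for the illustration.

```python
def accuracy_percent(errors, total):
    """Share of correctly handled units, as a percentage rounded to 2 places."""
    return round(100 * (1 - errors / total), 2)


print(accuracy_percent(3, 309))     # by word: 99.03
print(accuracy_percent(36, 2062))   # by character: 98.25
```

Counted by words the result is just over 99%; counted by characters it is a little over 98%.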
x X x
Eiras
Dec.10.2010 01:36 am
For a long time, many people have wanted to reform written Vietnamese so that writing or typing it is less tiring. I agree with that idea and would like to offer 3 simple, easy-to-apply tricks:
* Case 1: Dropping letters: In expressions made up of two or more words, if
the final letters of one word are the same as the initial letters of the
next, drop one copy and write the words together:
E.g.: "Con người" (human being) is written "congười"; in the context
"Anh co người lại" (he curled up) we write the words separately. That way
the clarity of Vietnamese is not lost.
In the case of "thành nhân", where as many as 2 letters are
the same, writing "thànhân" is enough; if we want to write "thành ân
oán", say, we write the words separately.
* Case 2: Dropping tone marks:
Hardly anyone can tell hat from hát, tôt from tốt, or khac from khác when
they are read aloud; in other words, in words that end in c, ch, t, p
preceded by vowels, I find the acute (sắc) mark redundant, so just drop
it. (Of course there are many other cases waiting for members to discover
them.)
* Case 3: Dropping both marks and letters: In some cases the letter H
is silent, as in: vô địch, tới đích. I think it is enough to write: vô địc, tới
đic.
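The three cases above can be sketched in a few lines. This is an illustration of the rules as the post states them, not Eiras's own code; the function names are made up, and Unicode decomposition is used as one convenient way to reach the sắc mark.

```python
import unicodedata

ACUTE = "\u0301"  # combining acute accent, i.e. the sắc tone mark


def merge_overlap(first, second):
    """Case 1: join two syllables, writing the letters they share only once."""
    first = unicodedata.normalize("NFC", first)
    second = unicodedata.normalize("NFC", second)
    for size in range(min(len(first), len(second)), 0, -1):
        if first.endswith(second[:size]):
            return first + second[size:]
    return first + second


def drop_sac(syllable):
    """Case 2: remove the sắc mark from syllables ending in c, ch, t or p."""
    if not syllable.endswith(("c", "ch", "t", "p")):
        return syllable
    decomposed = unicodedata.normalize("NFD", syllable)
    return unicodedata.normalize("NFC", decomposed.replace(ACUTE, ""))


def drop_final_h(syllable):
    """Case 3: write a final 'ch' as 'c'."""
    return syllable[:-1] if syllable.endswith("ch") else syllable


print(merge_overlap("con", "người"))             # congười
print(merge_overlap("thành", "nhân"))            # thànhân
print(drop_sac("hát"), drop_sac("tốt"))          # hat tôt
print(drop_final_h(drop_sac("đích")))            # đic
print(merge_overlap(drop_sac("phát"), "triển"))  # phatriển
```

Note that `merge_overlap("co", "người")` also yields "congười", which is why the post asks for "co người" to stay separate.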
When you sit typing hundreds of thousands of words, saving a few thousand
letters also saves time, money, and health, don't you
agree?
Granted, every letter has acquired a soul of its own over
hundreds of years of development. Text with missing letters and missing marks looks a
little odd, but what in this world does not keep developing? Our forefathers once gave up
Hán-Nôm characters in favor of quốc ngữ. That was a great revolution! Are we really
afraid to follow their example?
I look forward to
hearing the members' opinions.
[In the Vietnamese original of this closing passage the author applies his own rules, writing "Biêt", "phatriển", "băt chươc" and "cac".]
- Edited by: Eiras on Jun.13.2011, 21:02
-----------------------------
Testing word spelling as suggested by Eiras
by Vuara
Apr.7.2019 02:39 am
The test engine: The Online Text to Speech Engine @
https://inforobo.com/text-to-speech-online/
The text to be tested: as suggested by Eiras
conngười CORRECT
congười INCORRECT
thànhnhân CORRECT
thànhân INCORRECT
hat INCORRECT
hát CORRECT
tôt INCORRECT
tốt CORRECT
khac INCORRECT
khác CORRECT
vôđịch CORRECT
vôđịc CORRECT
tớiđích CORRECT
tớiđic INCORRECT (SLIGHTLY)
pháttriển CORRECT
phatriển INCORRECT
Conclusion: vôđịc ACCEPTABLE; tớiđic TOSS-UP; congười, thànhân, hat, tôt,
khac, phatriển REJECTED
x X x
To: dchph@yahoo.com
Subject: Vietnamese2020 Suggestions
Date: Tue, 30 Nov 2004 21:42:08 +0100
Dear dchph!
Before I start, I want to apologize for not writing in Vietnamese. I grew up in Germany and taught myself to read and write Vietnamese, so it is easier for me to write to you in English.
I found your articles about the Vietnamese way of writing at vny2k.net. I can't say anything about the roots of the Vietnamese language, but I agree with you that the current Vietnamese writing system is not well suited to writing and reading Vietnamese. I also agree with you that polysyllabic words should be written as one word and that reading by word shapes, as in English or Chinese, is superior to reading letter by letter.
I think a modernized Vietnamese script should
- allow reading by word shapes
- be easy to write on paper and on computers
- be able to represent polysyllabic words
- be compatible with the present Vietnamese script
Some say that the new style you propose is harder to read and causes misreadings because of ambiguous syllable boundaries. I think the problem is that standard quoc ngu already fails to provide word shapes that are easy to tell apart. Combining such hard-to-distinguish syllables into polysyllabic words gives a word that is even more difficult to read. I would like to share my ideas about possible ways of solving this problem.
The first thing I suggest is reducing the number of diacritics; the second is avoiding ambiguous syllable boundaries.
Let me show how I reduce the number of diacritics.
The approach is based on a set of 7 vowels. This is not scientific, but it helps a lot in systematizing the language.
Each base vowel has two representations:
A: a - ă
E: e - ê
I: i - iê
O: o - ô
U: u - uô
X: â - ơ
Y: ư - ươ
Each Vietnamese syllable can be traced back to exactly one of these 14 stems. The rest of the syllable can only be a consonant and/or a half vowel (w sound) before it, and a consonant or half vowel (w or y sound) after it. E.g. Yêu is based on iê; think of it as iê + w. Quê is based on ê; think of it as Gw + ê.
Quoc ngu is based on 11 vowels, that is 6 more than the standard Latin alphabet has. My system is based on only 7 vowels, that is only 2 more than the standard Latin alphabet has. I assigned new letter combinations to those vowel stems that use diacritics in quoc ngu. The table then looks like this:
A: a - ae
E: e - ei
I: i - ie
O: o - ou
U: u - uo
X: ah - oh
Y: uh - uoh
To make use of this system, only a few things have to be changed in quoc ngu. The changes are quite easy to remember; after a few minutes you are already able to use it.
ă => ae
ê => ei
ô => ou
â => ah
ơ => oh
ư => uh
iê, yê, uô => ie, ye, uo
The h in ah/oh/uh is written after the whole vowel cluster. E.g. Mưa is written Muah (and not Muha), Mười is written Muòih (and not Muhòhi). Additionally I use z for d and sometimes dh for đ, because đ and d look too similar.
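The vowel substitutions above can be sketched as a small function. It is an illustration under stated assumptions rather than the correspondent's own tool: the name `transcribe` is made up, a "u" directly after "q" is treated as part of the onset (so that Quê keeps its ê stem), and the optional z/dh replacements for d/đ are not applied. Tone marks are left on the letter that carried them.

```python
import unicodedata

BREVE, CIRCUMFLEX, HORN = "\u0306", "\u0302", "\u031b"
TONE_MARKS = {"\u0300", "\u0301", "\u0303", "\u0309", "\u0323"}
VOWELS = set("aeiouy")


def transcribe(text):
    """Rewrite the vowel-quality diacritics of quoc ngu text as plain letters."""
    # Split into [base letter, [combining marks]] units.
    units = []
    for ch in unicodedata.normalize("NFD", text):
        if unicodedata.combining(ch) and units:
            units[-1][1].append(ch)
        else:
            units.append([ch, []])

    out = []
    pending_h = False  # an 'h' owed by â, ơ or ư, written after the vowel cluster
    prev_vowel = ""    # previous vowel letter inside the current cluster
    for i, (base, marks) in enumerate(units):
        low = base.lower()
        after_q = i > 0 and units[i - 1][0].lower() == "q"
        if low not in VOWELS or (low == "u" and after_q):
            if pending_h:  # the vowel cluster has ended: emit the owed 'h'
                out.append("h")
                pending_h = False
            out.append(base + "".join(marks))
            prev_vowel = ""
            continue
        tone = "".join(m for m in marks if m in TONE_MARKS)
        extra = ""
        if BREVE in marks:                       # ă -> ae
            extra = "e"
        elif CIRCUMFLEX in marks:
            if low == "a":                       # â -> ah
                pending_h = True
            elif low == "e" and prev_vowel not in ("i", "y"):
                extra = "i"                      # ê -> ei, but iê/yê -> ie/ye
            elif low == "o" and prev_vowel != "u":
                extra = "u"                      # ô -> ou, but uô -> uo
        elif HORN in marks:                      # ơ -> oh, ư -> uh
            pending_h = True
        out.append(base + tone + extra)
        prev_vowel = low
    if pending_h:
        out.append("h")
    return unicodedata.normalize("NFC", "".join(out))


print(transcribe("Mưa"), transcribe("Mười"))  # Muah Muòih
print(transcribe("nhậnthấy rằng"))            # nhạhntháyh ràeng
```

Because the owed "h" is emitted as soon as a consonant, space, or the end of the text is reached, the same function works on single syllables, joined words, and whole sentences.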
And how is this better to read?
Remember that you got used to quoc ngu because you have probably used it for years. You will need a certain time to get used to the new system so you can memorize common word shapes. The main advantage of the new system is that the use of diacritics is reduced to tone marks only. Everything that has to do with pronunciation (and I don't mean the tone) is expressed in letters only, without any diacritics. The words can be recognized as a whole shape and don't need to be deciphered one letter or diacritic after the other. Compare it with English:
mooh, moe, more is better to read at a glance than mơ, mô, mo.
The same for day, dare vs. đê, đe.
the, though, they are better to read and memorize than thơ, thô, tho.
thoh, thou, tho - don't you also find this better to read than thơ, thô, tho?
Reading in word shapes instead of single letters and diacritics also helps reading polysyllabic words.
How can a polysyllabic word be read comfortably when its syllables cause problems? With a script that provides word shapes that are easy to tell apart, this can be avoided. Even longer words can be recognized at first glance. Another problem caused by polysyllabic words is the syllable boundaries. To avoid misreadings caused by ambiguous syllable boundaries (cốý, chủnghĩa, sátrạt) I have worked out some rules.
1. before a vowel and before an h, place an apostrophe (') to indicate the syllable boundary
2. in the middle of a word ng/ngh in syllable initial position changes to w
3. in the middle of a word tr in syllable initial position changes to j
I personally don't replace tr, but if you want 100% misreading-free syllable boundaries you need all three rules. Only then do you have a unique representation for each possible syllable boundary. These rules can also be applied to normal quoc ngu without the vowel transcription above. Since quoc ngu has so many diacritics, one could easily misread the apostrophe as a sắc tone mark. In that case I use a dash instead of an apostrophe, though this is unsatisfactory because dashes require that both sides of the dash be meaningful by themselves. That is not always the case, e.g. nút-radi-ô vs. nút-radi'ô.
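The three boundary rules can be sketched as a function that joins a list of syllables into one word. The names are hypothetical and the sketch follows the rules as stated; the sample text later in this email does not apply them uniformly (for instance, it writes khoahọc without an apostrophe), so the output may differ from that sample.

```python
import unicodedata

VOWELS = set("aeiouy")


def first_base_letter(syllable):
    """First letter of a syllable, lowercased and stripped of diacritics."""
    return unicodedata.normalize("NFD", syllable)[0].lower()


def join_syllables(syllables, replace_tr=True):
    """Join syllables into one word with unambiguous internal boundaries."""
    word = syllables[0]
    for syl in syllables[1:]:
        low = syl.lower()
        if low.startswith("ngh"):                 # rule 2: ngh -> w
            word += "w" + syl[3:]
        elif low.startswith("ng"):                # rule 2: ng -> w
            word += "w" + syl[2:]
        elif replace_tr and low.startswith("tr"):
            word += "j" + syl[2:]                 # rule 3: tr -> j (optional)
        elif first_base_letter(syl) in VOWELS or low.startswith("h"):
            word += "'" + syl                     # rule 1: apostrophe
        else:
            word += syl
    return word


print(join_syllables(["cố", "ý"]))       # cố'ý
print(join_syllables(["chủ", "nghĩa"]))  # chủwĩa
print(join_syllables(["sát", "rạt"]))    # sátrạt
print(join_syllables(["sá", "trạt"]))    # sájạt
```

With all three rules switched on, the two possible readings of a string such as "sátrạt" receive different spellings, which is the uniqueness property the email describes.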
I found that when using both the vowel transcription and the rules for polysyllabic writing, it is mostly possible to read text with no tone marks or with only very few. One could add a tone mark only on the last syllable of a word and the text would still be perfectly readable. Since the pronunciation lies completely in the letters, the text can be read very well even without diacritics. This is similar to hearing songs, where the melody rarely matches the inherent tone of the words. This is not so easy in quoc ngu: when all diacritics are omitted, not only is the tone lost but the pronunciation is also heavily altered.
Another way to improve readability is noun capitalization. Since I grew up with the German language, I know about the advantages of noun capitalization. For example, if at first glance you see "ca", then it could be ca or cả, but not cá or cà, because cá or cà would have a capital C. When you know the nouns, you also know that the remaining words are verbs, adjectives, particles, etc., but not things, persons, or places.
Here is an example text from your website, once written with quoc ngu and a second time with the changes applied.
Mộtkhi đã nhậnthấy rằng cáchviết hiệnnay không phảnánh đúng một cách khoahọc hiệntrạng của tiếngViệt vì trên thựctế từng từ trong tiếngViệt vẫn còn bị viết rờirạc thành từng tiếng một, bạn hiểu rằng với cáchviết này sẽ làm trìtrệ ócpháttriển của trẻcon khi chúng bắtđầu học đánhvần và nhậndiện chữViệt (xem Ngônngữ và Trítuệ, của Nguyễn Cường) cũng như không tậndụng hết tiềmnăng đadạng của khoa tinhọc hiệnđại.
Mọutkhi đã nhạhntháyh ràeng Cáchviét hiẹnnay khoung phản'ánh đúng mọut Cách-khoahọc hiẹnjạng của Tiéngviẹt vì trein Thụhctéi tùhng tùh trong Tiéngviẹt vãhn còn bị viét ròihrạc thành tùhng Tiéng mọut, Bạn hiẻu ràeng vóih Cáchviét này sẽ làm trìtrẹi Óc-phátjiẻn của Trẻcon khi chúng báetđàu học đánhvàhn và nhạhnyiẹn Chũhviẹt (xem Ngounwũh và Trítuẹi, của Nguyễn Cường) cũng nhuh khoung tạhnyụng héit tièmnaeng đayạng của Khoa tinhọc hiẹnđại.
This script system originally started as a complete alternative to quoc ngu. But I have modified it to be compatible with quoc ngu so both systems can now be mixed within the same text and even within the same word. This is very important for historical and computational aspects.
I hope you can find the ideas useful as a contribution to the development of a modernized writing system.
I would be glad to hear your comments about these suggestions.
Best regards
Nguyen Quoc Trung
From: "yen41975@zzzzz.com" <yen41975@zzzzz.com>
Date: Mon, 10 Jan 2005 04:21:57 GMT
To: editor@vny2k.net
CC: dchph@yahoo.com
Subject: Tran Y - Questions and suggestions about Tâm ÐoànViệt's article
Dear Mr. Tân Đoàn Việt,
Because my television lost its picture today, I went online and happened to read the article you wrote about converting Vietnamese writing to a polysyllabic form.
I am poor at written Vietnamese. So poor, in fact, that my mother often says I “can't recognize a letter bitten in half, yet love to argue.”
I have a few questions and a few suggestions.
1. If you change the way Vietnamese is written to a polysyllabic form, what system will you use to combine words? English has Latin, French, German, etc. roots. How will you attach prefixes and suffixes?
2. As far as I know, students of classical Chinese (Hán Văn) have to learn one character at a time. If Vietnamese is polysyllabic, will students have to learn every word one by one? For example, with the word NGUQUAN you learn only one word. If you learn the word NGU and the word QUAN, you can combine these two with many other words, for example ngu quan, ngu quyền, ngu dân, ngu dốt, ngu đần, ngu muội; THAM QUAN (an official who takes bribes), quan quyền, quan lại, etc.
3. What system will you use for stress and for placing tone marks? Already a great many Vietnamese place tone marks wrongly and use words wrongly. The nearest example is VNY2K. A thousand years from now, will Vietnamese people still remember where to place the marks, or will they be like the French speaking Vietnamese, who say "ăn cức khô" for "ăn cực khổ"? (In the northern accent it comes out as "ăn cứt khô".)
4. Which words do you capitalize and which do you write in lowercase? Take, for example, chữ quốc ngữ, “chữQuốcngữ”. Even you sometimes write “chữQuốcngữ” and sometimes “chưquốcngữ”.
SUGGESTIONS
I would suggest that the VNY2K organizers review their own Vietnamese and English proficiency. I suggest you look again at the following: TRANG CHU, TRANG NHA CÁ NHA, "NZ first Demands All Agent Orange Papers be Made Public” “Strong Earthwake off Disaster-Hit Indonesian Province”. I am sure that if I had time to read VNY2K I would have many more suggestions for you.
5. My fifth suggestion is that you first finish learning all the basic words (chữ chính); only then will you be capable of changing Vietnamese writing to a polysyllabic form.
I very much hope to receive a direct reply from you and from the proofreader.
Thank you,
Tran Y
Thank you for taking time away from the television to write and share your views. On behalf of Tâm ÐoànViệt and vny2k, I will answer your questions one by one, in order from bottom to top, because that seems more logical to me.
(5) Changing Vietnamese writing to the new polysyllabic form is of course not something that I, or any single group of people, could carry out without the approval and support of the great majority of the public. What I propose and write is based only on my own views and understanding. Contributions from other learned people are essential. If we ourselves are not capable of carrying out the reform, many people pooling their efforts can accomplish this historic undertaking.
It was precisely because I saw the weakness of my own Vietnamese that I made the effort to learn, syllable by syllable, what you call 'basic words'. I would tentatively link this notion to the 3 English terms 'root', 'stem', and 'radical', which I call 'từgốc', 'từcăn', and 'từnguyên'. It was also in the course of that study that I recognized how irrational and unscientific the modern way of writing Vietnamese is. The disjointed monosyllabic way of writing has separated the word from the concept that goes with it, and we have unwittingly failed to exploit the property of word shape ('word shape' or 'word formation', for which Tâm ÐoànViệt used the term 'biểuý', which was then confused with the ideographic nature of Chinese characters); that is, on glimpsing the shape of a word we immediately recognize its meaning without having to read each letter.
As for your suggestion that we review our Vietnamese and English proficiency, we take note of it, but we do not know where you took 'TRANG CHU, TRANG NHA CÁ NHA' from or what you meant by it. Perhaps these are spelling slips, or the result of technical display problems on the website: a few grains of sand in the rice pot. As for headlines such as 'NZ first Demands All Agent Orange Papers be Made Public' or 'Strong Earthwake off Disaster-Hit Indonesian Province', these two were supplied by a 'newsfeeder', which in turn took them from the headlines of the websites that carried the stories. Besides, I see no error of English grammar in either of those two sentences.
(4) "chưquốcngữ" is clearly a misspelling of "chữQuốcngữ". Are you perhaps being too rigid? If you printed out all the articles posted on the vny2k site, it would take some 10 thousand printed pages; how could there be no spelling errors among them? As for capitalization, it is only a suggestion based on what seems logical. You may compare English spellings such as 'non-English speaking', 'anti-American', 'trans-Asia', ...
(3) We have only put forward proposals for revising the way Vietnamese is written; we have not yet gone systematically into the details of how polysyllabic words, fixed expressions, rhymes, and tone marks should be written. That is a major task for the linguistic subcommittees that should be set up once most of us agree to reform along the new lines. In short, my proposal and those of the other advocates offer only an outline. As for the concrete details, we hope that you will, if possible, contribute your own deep knowledge.
Furthermore, your observation is probably right: 1000 years from now Vietnamese will no longer resemble today's Vietnamese. The Vietnamese of 300 years ago was already very different from modern Vietnamese, to say nothing of the Vietnamese of 1000 years ago. Leaf through Ðôngphong Tạpchí, read the writings of Trương Vĩnh Ký or Phạm Quỳnh from 100 years ago, or look at the dictionary compiled by Thiền Chửu only a few decades ago, and you will find that the spelling, the style of expression, and the word usage already differ markedly from modern Vietnamese.
We may well have placed tone marks wrongly and used words wrongly; that is the fault of the writers and has nothing to do with the campaign for reform. In general, the articles discussing the reform of the writing system and the reasons for it still retain their logical force.
(2) You are right: Chinese students beginning written Chinese all learn one character at a time first. Nowadays kindergarten and first-grade pupils learn 'pinyin' (phonetic transcription) first, that is, the Latin-letter transcription of modern Mandarin: for example a, o, e, b, d, t, ..., then 'ba', 'ma', 'san', 'he', ..., and only afterwards compound and polysyllabic words such as 'bama' (bamá), 'sanhe' (sơnhà) (note how they transcribe it: 'sanhe', not 'san he'). Only after learning all the basic syllables in Latin letters do the children really begin to learn Chinese characters! There is nothing strange in this, because in every country in the world kindergarten pupils start by learning one syllable at a time. Take English in the United States: small children learn 'mom', 'dad', 'after', 'noon', ... before they learn 'mommy', 'daddy', 'afternoon', 'father', 'mother', 'parent', ...
You ask: 'For example, with the word NGUQUAN you learn only one word. If you learn the word NGU and the word QUAN, you can combine these two with many other words, for example ngu quan, ngu quyền, ngu dân, ngu dốt, ngu đần, ngu muội; THAM QUAN (an official who takes bribes), quan quyền, quan lại, etc.'
My answer: your question already contains its own answer! What you call 'basic words' (chữ chính) are precisely the words 'ngu' and 'quan'. They are both single, monosyllabic words and also 'roots' (từgốc) or 'stems' (từcăn) that can be used to form compounds and polysyllabic words: nguquan, nguquyền, ngudân, ngudốt, ... or thamquan, quanquyền, quanlại, ... Looking closely at the polysyllabic spelling of these words, do you not see the advantages of writing this way?
(1) Similarly, the following examples can be regarded as usable prefixes in Vietnamese:
- 'phi-': phinhân, phichínhphủ, philiênkết, philý, ...
- 'vô-': vônhân, vôlý, vôtình, vôlối, ...
or as suffixes such as
- '-gia': chínhtrịgia, phihànhgia, oangia, ...
- '-giả': họcgiả, tácgiả, ...
- '-sĩ': vănsĩ, hoạsĩ, ...
In short, if a comparison is needed, the role of the Sino-Vietnamese (HánViệt) vocabulary can be regarded as like that of the Latin vocabulary in English.
As for the true nature of Vietnamese, it is clearly a polysyllabic language. When we say 'bângkhuâng', 'bồihồi', 'bảvai', 'cùichỏ', ... we must pronounce the syllables continuously, so why, when we write, must we write them apart? This is the unscientific side of the modern way of writing Vietnamese.
Turning to the benefits for recognition, absorption, abstract thinking, and so on: most successful people in Vietnam and overseas know at least one second, polysyllabic language, for example English, and through the process of learning and absorbing it their capacity for abstract thought has become very high. It is a pity for the less fortunate people inside the country who know no polysyllabic language other than Vietnamese, and for the young generations of the future if they have no chance to learn a foreign language later on: keeping the present disjointed way of writing Vietnamese holds back the fullest development of the country's young, because their capacity for thought will be constrained by the disjointed way we write today!
Please take your time reading the articles on this subject; you will come to understand much of what we are trying to say here.
Regards,
dchph
October 2, 2012 @ 2:48 pm ·
Filed by Victor Mair under
Announcements, Writing systems
There is a movement called
Vietnamese2020 that aims to substantially reform the writing system by the
year 2020. The main change would be to group syllables into words. As the
advocates of this change point out, most words in Vietnamese are
disyllabic (the same is true of Mandarin). The proponents of the reform
believe that, among others, it would reap the following benefits:
1. achieve greater compatibility with the needs of information processing
systems
2. comport better with the findings of cognitive science
3. put the kibosh on the false notion of monosyllabism,
which they say is unnatural and does not exist in real languages
I myself had these additional thoughts:
1. Would the adoption of
polysyllabism (i.e., linking of syllables into words) in Vietnamese
obviate the need for so many diacritics (i.e., reduce homonymy)? Without
knowing the precise details of Vietnamese romanization, the plethora of
diacritical marks has always led me to suspect that the script may be
fraught with redundancy and overspecification, especially if the basic
unit of grammar were taken to be the word rather than the syllable. The
fact that many Vietnamese in their casual writing omit the diacriticals
and are still able to make themselves understood (see below) underscores
this possibility.
2. Would the adoption of polysyllabism make
indexing, dictionary compilation, etc. easier and more user-friendly? This
has certainly been the case with Romanized Chinese and Japanese (e.g., in
dictionaries and encyclopedias arranged according to alphabetical order by
words), and I suspect that the same would be true of Korean as well.
I ran these proposals and ideas by a number of Western specialists in
Vietnamese language and culture. Their reactions were, to put it mildly,
unenthusiastic.
Bill Hannas notes that this sort of proposal
has been around for a few decades at least, and that the following line in
the proposal does not offer much hope for adoption: "In practice, while
awaiting official orthography guidelines, hopefully, from a governmental
body such as a national language academy, …"
Eric Henry
states:
This is the first time I ever encountered this
proposal. The article doesn't make it clear whether this idea has any
government backing or not. To me the idea of pretending that Vietnamese
compound expressions are unitary words in the same sense that "asparagus"
or "daffodil" are words seems silly and artificial. The Vietnamese used to
use hyphens to accomplish the same purpose; thus fangfa 方法 ("method")
was "phương-pháp," and so on. Then people discovered that they could get
along fine without hyphens, and that the absence of hyphens gave the page
a pleasantly uncluttered look. Conjoining syllables in the manner proposed
seems to me a way of reverting to hyphens [VHM: without the hyphens]. But
then it's natural to be attached to whatever one is habituated to, and I
happen to be habituated to un-conjoined syllables.
To which I
replied, "ex cept in Eng lish".
Eric continued:
I
don't see how polysyllabism could reduce the need for diacritics.
Vietnamese people of course write to each other all the time with no
diacritics and can still figure out 98% of the text, but everyone knows
and feels that this is just a makeshift. It would perhaps be nice to
eliminate the need for the circumflex and the half moon by inventing a few
special vowel signs, but I don't see how the tone marks themselves could
be represented in spelling (cf., for comparison, luomazi [National
Romanization for Mandarin]: han, harn, haan, hann), that would just be a
nuisance, especially since Vietnamese has, not four, but six tones.
Vietnamese orthography has already (i.e., centuries ago) made a move in
the direction of new vowel symbols with the letters "ư" and "ơ."
Maybe
a Vietnamese equivalent of DeFrancis's ABC Chinese dictionary could be
created. It might be wonderfully useful for some purposes, as the ABC
dictionary is wonderfully useful for some purposes. But I haven't really
thought this through.
Another correspondent replied:
This
has nothing to do with the government. It looks to me like it's the work
of some overseas Vietnamese linguistics grad student or (former grad
student) who has now gone slightly crazy because of the "East Sea/South
China Sea/Really Far South Mongolian Sea. . ." issue.
The
author has several pages. Another one (hocthuat.org) has a long study that
argues for the linguistic connections between Vietnamese and Chinese, but
it now has the following disclaimer:
STATEMENT OF RENUNCIATION
OF THE SINITIC CAMP
Here comes a painful decision. I would like
to renounce my long standing belief in what I have elaborated in this
electronic publication about Sinitic Vietnamese. That is to say, I no
longer believe in what I used to see as vestiges of sinitic linguistic
elements in Vietnamese vocabulary stock that are postulated in my research
paper. The reason for my taking this course of action is, admittedly,
politically motivated because I do not want my work later to serve for
unforeseen evil purposes, especially in the face of Chinazi's overt
actions trying to impose its hegemonism onto today's Vietnam. My blood is
boiling with revulsion and hatred after seeing a series of unrolling
events currently taking place in the East Vietnam Sea. Civilized people
mostly see that those behaviors could only be committed by warmongers,
descendants of those same savages as vividly and accurately described in
"The Ugly Chinaman" 醜陋的中國人 by Bo Yang 柏楊. Don't take me wrong,
though both matters not related, given the fact that my blood is
genetically embedded with Chinese DNA.
For Heaven's sake,
please forgive me for all what I have been laboring on hitherto. I would
appreciate your understanding and ask that you take this unstate [sic]
moment of truthfulness as a statement of my renunciation of the sinitic
camp and I shall accept all consequences thereof. My apology to my fellow
scholars, too, and yet, if you still need to read my writings for some
reason, focus instead on the antithesis of what is discussed herein, that
is, "de-sinitize" them by taking the opposite view. You may still quote
any material in this paper but remember to annotate your citation with
this statement accordingly. You could post your comments and questions on
Ziendan TiengViet.
It so happens that another language movement
in Vietnam going on right now is called English2020; it aims to make all
school leavers proficient in English by that year.
Steve
O'Harrow comments:
There is an "English 2020″ project being
spearheaded by Professor Nguyen Ngoc Nhung on behalf of the SRVN Ministry
of Education & Training that aims to make English language instruction
available in a broad range of fields at the secondary and tertiary levels
[by 2020]. It is the only domestic national-level language-related
initiative I know of at this time in Viet Nam. One might be forgiven for
suspecting that the proposers of the Vietnamese2020 movement stole the
name "2020″ from the Ministry of Education & Training English
initiative.
The article you link here looks rather "iffy," to
say the least. In reality, it is probably a scheme put on line by some
Viet Kieu ["overseas Vietnamese"] someplace outside of the country itself.
In my opinion, after my 50 years of Vietnamese language teaching and
research in Viet Nam, Europe and America, there is a zero chance of this
spelling movement taking hold. Why? Because the current system works well.
It is known and used by nearly 90 million people.
The
Vietnamese populace is already one of the most literate in Southeast Asia
and it has been literate for a very long time. They are not likely to
change what works well.
"If it ain't broke, don't fix it." And
believe me, they won't.
What is endlessly interesting to this
observer over the years is that for a long time now, the handful of folks
who identify themselves as Vietnamese but who live overseas, are of the
impression that what they cook up in the cafés of Paris or the campuses of
the USA is going to have some magic impact on the millions and millions of
Vietnamese who are actually living their day-to-day lives in Viet Nam
itself. There are all kinds of looney ex-pats out there and each one has a
fantastic plot to do something, reform the language, overthrow the
government, invent a perpetual motion machine that serves pho on the side.
They're constantly going around appointing each other prime minister of
governments in exile or re-claiming the Nguyen Dynasty throne. Mind you,
founding a new goofy religion actually works sometimes - as long as you
are really in Viet Nam, that is.
But if you are abroad,
"fuhged-daboudit," [especially if you live in Brooklyn].
Responding to my technical questions about the possible value of a
polysyllabic approach to Vietnamese writing, Steve remarked:
Short answer:
NO Longer answer: I really do not know enough about the
technology of
information processing, etc. to be 100% sure and I do know
that many
Vietnamese disagree on which words are polysyllabic & which
are
not [Chinese loans are easier to judge, but Mon-Khmer vocabulary is
another
question and mixed lexemes are even fuzzier]. The main obstacle
to
information processing at this point in time seems to be the fact that
we
do not have decent optical character recognition programs, due to
a lack
of typographic consistency and the fact that Vietnamese
printing in the past
has been all over the map. However, none of the
"fixes" will eliminate the
need for the diacritics and there is a lot
of misunderstanding among those
folks who do not actually read/speak
Vietnamese which marks are diacritical
[only the five tone marks] and
which are integral parts of letters [hooks, bars,
and circumflexes].
A Vietnamese native speaker does not see, say, the
letters "o" and
"ô" or "e" and "ê" as being "o with / without a circumflex" or "e with /
without a circumflex" – rather s/he conceives of them simply as completely
distinct
letters, as different as we would think of "e" and "o" in
English. The folks
whom this system confuses are mainly foreigners,
so who gives a damn?
A 2nd point would be that there is a lot
of disagreement on what constitutes
a "word" in Vietnamese. Is "Không
quân" [Airforce] one or two words? I
really don't think we are going
to come to any substantial agreement in the
foreseeable future and I
really don't think it matters a whole helluva lot, at
least not to
the Vietnamese reading public.
Again, the main point is that the
current Vietnamese writing system
works well for Vietnamese people in
Viet Nam itself, so any substantial
changes would likely be
counter-productive. Just remember the old US
saying: IF IT AIN'T
BROKE, DON'T FIX IT! – it is just as true in VN as
it is in the US.
Tinkers be damned.
Finally, just before I was about to make
this post, I received these brilliant remarks from a Vietnamese specialist
who wishes to remain anonymous:
If Vietnamese were written as
words, and not as syllables, there would be less need for diacritics
(tones and "special"–in the sense that they lack Western alphabet
equivalents–letters) because an equivalent amount of information (cues) is
provided by the word division.
By adding information up front
of one sort, you get by with less information of another sort. Word
division in orthography means that society and its individuals have
invested resources in an upgraded system that rewards users with greater
clarity for less effort. You put the effort in at the beginning–deciding
the rules and learning them.
We don't specify every
phonological detail in English writing because we don't need them to get
to meaning. The reader, if s/he cares about it, can supply those details
later, after accessing the word-meaning. Often an unambiguous
pronunciation is possible only after the word has been retrieved from
one's mental lexicon. It surely does not derive from the successive
letter-sounds. By the same logic, written Vietnamese words would be
overspecified if they included all the diacritics in use at present.
Because
indicating tone in computerized writing is such a bother, Vietnamese
usually just leave them out of their informal correspondence, such as
emails. The messages can still be understood, albeit with some difficulty.
Word division would restore the missing redundancy.
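To make the point about dropped diacritics concrete, here is a minimal Python sketch (an illustration added to these remarks, not something the specialist supplied). The helper name `strip_marks` is hypothetical. It removes both the tone marks and the letter-modifying marks, roughly as informal unaccented e-mail does, and shows how six distinct syllables collapse into one spelling.

```python
import unicodedata

def strip_marks(text):
    """Return text with Vietnamese tone and vowel marks removed (hypothetical helper)."""
    # NFD splits each accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop every combining mark (Unicode category Mn).
    base = "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")
    # The barred d does not decompose under NFD, so it is mapped by hand.
    return base.replace("đ", "d").replace("Đ", "D")

# Six different syllables that differ only in tone.
tone_variants = ["ma", "má", "mà", "mả", "mã", "mạ"]
collapsed = {strip_marks(s) for s in tone_variants}

print(collapsed)                 # a single unaccented spelling remains
print(strip_marks("Luân Đôn"))
```

Once the marks are gone, the reader must recover the lost distinctions from context. The argument above is that joining syllables into words would supply part of that context in the spelling itself.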
Information
technology, and indexing in particular, depend on having "tokenized"
units, usually at the word level. Most of the tokenizing work is done
already in languages with word division. For CJV (not K), however, a
tokenizing function is needed.
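A small Python sketch may help show what "tokenized" means here. It is only an illustration under stated assumptions: the sample sentence ("the air force trains pilots") and its word-joined spelling are examples chosen for this sketch, not an agreed standard.

```python
# Current orthography: every syllable is separated by a space.
syllable_spaced = "không quân huấn luyện phi công"

# Hypothetical word-joined spelling of the same sentence (assumed for illustration).
word_spaced = "khôngquân huấnluyện phicông"

# Splitting on whitespace is all an indexer needs when words are delimited.
print(syllable_spaced.split())   # six syllables, word boundaries unknown
print(word_spaced.split())       # three word-level tokens
```

With the current spelling, a program still has to decide which neighbouring syllables belong together, which is the extra tokenizing function mentioned above.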
It all comes down to the same
rule: you can pay the cost once up front (create and learn rules for word
division) or in perpetual installments.
It is remarkable that,
although Chinese, Japanese, Korean, and Vietnamese have four different
writing systems, they all are vexed with the problem of whether or not to
join syllables into words. That, I believe, is the result of the latter
three still retaining vestigial traces or influences of the Chinese
characters. But even character writing could adopt word spacing if enough
of its users would agree to follow such a norm.
[A tip of the
hat to Jonathan Smith and thanks to Liam Kelley and Michele Thompson]
October
2, 2012 @ 2:48 pm · Filed by Victor Mair under Announcements, Writing
systems
--------------------------------------------------------------------------------
45 Comments
J.W. Brewer said,
October 2, 2012 @ 3:20 pm
For
Chinese and Japanese, you may be characterizing the issue backwards - what
is going on is not so much breaks between syllables rather than between
words, but no information-conveying breaks at all except at the end of
sentences and thus no visible distinction between single-character (or
single-syllable) words and multiple-character (or multiple-syllable)
words, although in Japanese many individual kanji of course have
polysyllabic readings. That lack of information-conveying breaks was once
common practice for texts written in our alphabet, but was abandoned in
favor of inserting blank spaces at word-breaks in the latter part of the
first millennium A.D. http://en.wikipedia.org/wiki/Scriptio_continua. The Vietnamese situation may be different altogether.
Sili said,
October 2, 2012 @ 4:21 pm
Really Far South Mongolian Sea
This should probably not amuse me as much as it does.
I award the author a swimming holiday to Austria.
JS said,
October
2, 2012 @ 4:31 pm
^
Chinese writing certainly provides
"breaks between syllables" in the sense that the salient written units,
characters, map (almost) without exception to single syllables of speech;
the addition of physical "blank space" as that called upon to separate
English words would, of course, be redundant.
However, Korean
orthographical standards do call for word separation, meaning that in the
case of (standard) Korean writing, both the syllable and the word are
strongly marked in written text, though as one might expect, there are in
the case of the word many cases in which decisions regarding division are
variable and arbitrary.
Peter said,
October 2, 2012 @ 5:13 pm
^ I agree that characters neatly (and nearly without
exception) subdivide an expression into syllables. Having blank spaces
between the _words_ though–that would not be redundant. It would be kind
of helpful (but no one will ever do it).
Victor Mair said,
October 2, 2012 @ 5:28 pm
@Peter
"that would not be redundant": clear thinking on your part
"but no one will ever do it": actually, a lot of people have done it (e.g., Chow Tse-tsung and
Apollo Wu). Who knows? Someday it might just catch on. That would be a
boon for IT specialists, dictionary makers, indexers, grammarians, and
sundry others.
Peter said,
October 2, 2012 @ 5:52 pm
@Victor
That
would be convenient. Considering that most Chinese (or, I suppose,
Americans) can't tell the difference between a morpheme and a word, I'm
not holding out a great deal of hope.
Ellen K. said,
October
2, 2012 @ 5:57 pm
In English we have cases where whether
something is a word or two is somewhat arbitrary, and even cases where we
don't agree on if it's one word or two. This doesn't seem to get in the
way of our use of the written language. Curious that none of the writers,
all writing in English, mention that we have this in English. My question
for them would be, is this any different from English, other than that in
English we've had time to standardize many of the cases that can go either
way?
Victor Mair said,
October 2, 2012 @ 6:14 pm
@Peter
Most
Americans (and other speakers of English) know what a word is (i.e., know
where to put spaces between words), in 99+% of the cases. Otherwise we
wouldn't be able to hold these conversations on Language Log. And you can
be sure that commenters would jump down the throats of us
bloggersifweforgottoputinthosespaces.
As for what a morpheme
is, that's specialized knowledge that can be left to linguists and others
who delight in the study of languages.
tram said,
October
2, 2012 @ 7:39 pm
Funny example. Is "Airforce" one or two
words?
Ruben Polo-Sherk said,
October 2, 2012 @ 7:56 pm
I
think in understanding this issue it's important to realize that, just
like with Chinese (when it is divided), the compounds aren't really being
divided into syllables–they're being divided into morphemes, and that they
simultaneously get divided into syllables is just a coincidence.
JS
said,
October 2, 2012 @ 8:47 pm
^ Hmmm… from a synchronic
point of view it might be possible to claim that in Chinese and Vietnamese
writing, compounds are being divided into syllables and that it is the
correspondence of those syllables to morphemes which is only a
coincidence… after all, in these two cases, the salient written unit's
relationship to the syllable is (all but) invariant, while its
relationship to the morpheme is much confounded by the significant and
increasing number of morphemes that are longer than one syllable.
However,
historically speaking, your view is reasonable as the preference for
disyllabic "compound" words in both languages (which seems to have
followed on processes of reduction of longer and otherwise more
phonologically complex words to CV[C]?) means that the relationship
between originally logographic Chinese characters and modern-day morphemes
is indeed in some sense original and essential…
Brad said,
October
2, 2012 @ 9:15 pm
I think one of the non-English rebuttals
should be:
So everyone needs to deal with the made up hassles of
distinguishing between compound words, hyphenated compounds, and
multi-word compounds?
It's a distinction that the writing
system makes, yet the organization system for the dictionaries resolutely
ignores it. Does the meaning of 'air' change dramatically when followed by
'man'? If it does, you put 'airman' in the dictionary whether it's
'airman', 'air-man', or 'air man'.
:-/
Every Japanese book
that I have that has spaces between the Japanese words is either a kids
book or a Japanese as a foreign language text. The kids books have spaces
between the words because uninterrupted strings of hiragana or katakana can
be difficult to parse quickly.
And the native Japanese
dictionaries intended for children that I have get along just fine using
Japanese kana ordering for the dictionary, so "alphabetizing" only
benefits the people that have memorized the arbitrary order of 26 letters
instead of memorizing the arbitrary order of 52-some kana.
In
the written form used by adults, spaces would be redundant because the
information is conveyed through other indications:
- grammatical particles indicating the end of the word
- kanji interrupting the hiragana streams
- punctuation
and once someone
gets into things like verb conjugation and so on, distinguishing between
the various components really becomes quite arbitrary.
All of
the electronic dictionary work that I've done has involved looking up
words using longest substring style lookup. So if X and Y are words, but
someone also decided that XY is a word, you don't have to care. So if the
electronic translation people need to build better word tables, that's not
a very compelling argument to change tradition.
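For readers unfamiliar with it, "longest substring style lookup" can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the commenter's actual code: the function name `longest_match_tokens` and the tiny lexicons are hypothetical, and real dictionary software is considerably more elaborate.

```python
def longest_match_tokens(text, lexicon):
    """Greedy left-to-right segmentation that prefers the longest dictionary entry."""
    max_len = max((len(entry) for entry in lexicon), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink one character at a time.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # A single character is accepted as a fallback even if unlisted.
            if size == 1 or candidate in lexicon:
                tokens.append(candidate)
                i += size
                break
    return tokens

# If only X and Y are listed, the string is read as X + Y.
print(longest_match_tokens("airman", {"air", "man"}))
# If XY is also listed, the longer entry wins and nothing else changes.
print(longest_match_tokens("airman", {"air", "man", "airman"}))
```

The design choice is that the lexicon, not the spelling, carries the decision about what counts as a word, which is why the lookup is indifferent to whether the text was written with spaces.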
In other words,
God save us from yet another spelling reform, especially if it's for
someone else's language.
Ran Ari-Gur said,
October 2, 2012
@ 9:29 pm
@Ruben Polo-Sherk: I don't know Vietnamese, so please
correct me if I'm being clueless, but I don't think that's completely
true. For example, the Vietnamese Wikipedia gives "London" as "Luân Đôn",
not, I submit, because it's composed of the morphemes "Luân" and "Đôn".
(However, it also gives "Paris" as "Paris", and "Wikipedia" as
"Wikipedia"; so there's definitely a tendency to write borrowed morphemes
solid even when they're polysyllabic, but it competes with a tendency to
write spaces between syllables even within polysyllabic morphemes.)
michael
farris said,
October 3, 2012 @ 1:54 am
Some initial random
musings.
There's a fair amount of variation in how borrowed
morphemes (which have undergone Vietnamization) are written. If you take
'salad' I've seen all three:
xa lát
xa-lát
xalát
with
the first being the most common.
Words that don't undergo
Vietnamization (like Paris) remain written as one word.
Word
division seems a thornier issue in Vietnamese than any other language I've
examined. When I was actively learning Vietnamese there were times I could
understand a sentence just fine but couldn't have hoped to divide it into
words (or could think of a number of ways of doing so). Learners of Thai
and Khmer I've talked to report very similar experiences while learners of
Mandarin mostly don't. It might be a SEAsia thing…
Yes
Vietnamese speakers can get by in some contexts without diacritics (I used
to receive emails from one which I could understand pretty well) but this
is partly due to diacritics being used most of the time - you can sort of
'see' the diacritics when they're not there. I'm also assuming there's
some deliberate vocabulary and syntactic choices being made to facilitate
understanding. But diacritic-free Vietnamese (minus other massive changes)
seems like a non-starter.
IME unlike most writers of languages
with diacritics, when a diacritic appears over a lower case i in
Vietnamese speakers tend to write the dot and diacritic both (when writing
by hand, in print the diacritic replaces the dot). I'm not sure what, if
anything, this means, but it's sort of distinctive.
You really
should do a post on those Viet Kieu who want a return of Chu Nom
(character based script). They make the word division (or other
orthographic reform) plans seem completely feasible (nb I'm not talking
about scholars who are interested in Chu Nom from an academic point of
view who do very valuable work but those with half-baked plans for
compulsory education and the like)
Ruben Polo-Sherk said,
October
3, 2012 @ 3:14 am
JS, Ran Ari-Gur: Good point bringing up the
polysyllabic morphemes.
First of all, the polysyllabic morphemes
in Chinese or Vietnamese are basically anomalies in one relevant sense:
they cannot combine with other morphemes to form compounds in the way
monosyllabic morphemes generally can. There are also very few of them.
So
it is not unreasonable to ignore them when figuring out how to transcribe
Vietnamese, which has a large substructure of monosyllabic morphemes, and,
because of the importance of these monosyllabic morphemes, decide to simplify
and standardize by making each syllable written separately, which is what
they did with quoc ngu. And so that is how you get Lon Don. With foreign
words, though, as michael farris said, it's not entirely standardized. The
other exception is in cases like with the current featured article on
Vietnamese wikipedia, which has "dreadnought" in it, which is clearly not
written in quoc ngu–it's written English–and so isn't subject to the
syllable-dividing rule.
With pinyin, it's essentially the same.
The substructure of Chinese consists almost entirely of monosyllabic
morphemes and so, if someone decides to write with spaces to separate
those morphemes, they may, for the sake of consistency, separate syllables
of polysyllabic morphemes as well. But the motivation cannot be to
distinguish syllables–that doesn't really make any sense, I think. If you
argue that this is done to mimic the boundaries between Chinese
characters, you get back to the point of morphemic structure, since a
major function of Chinese characters is to support this kind of structure.
It is possible, of course, to write a language like English, with no such
structure, in Chinese characters, but the system of two-character
compounds would not fit in general (and therefore there would really be no
reason to not write each character separately if you transition from that
into an alphabetic script). This is essentially an innate feature of the
language, and not the writing system.
So, to put it simply,
when disyllabic morphemes are split, this is done basically to be
consistent in a system that, in order to accommodate a substructure of
monosyllabic morphemes, has been standardized (by convention or personal
choice) to have spaces between syllables. The chief concern is the
division between morphemes.
(In case anyone doesn't understand
what I mean by "substructure of monosyllabic morphemes", I'll explain it
this way: Vietnamese and Chinese have it, and English doesn't. It's the
thing that makes the issue of word division a real pain in the ass in
Vietnamese and Chinese, and not a problem at all in English. With
Vietnamese and Chinese, because of the importance of the organization at
the morpheme level, the concept of "word" doesn't fit well.)
(Not
really part of my argument, but maybe something to think about: We write
"New York" with a space, but, though originally it was two morphemes, it
is now really just one. So we do sort of have this in English, too.)
(Ran
Ari-Gur: I am not the all-knowing god of Vietnamese.)
richard
howland-bolton said,
October 3, 2012 @ 6:02 am
"ex cept in
Eng lish"?
"ex cept in Engl ish" surely :-)
Victor Mair
said,
October 3, 2012 @ 6:31 am
@richard howland-bolton
surely
not
Ruben Polo-Sherk said,
October 3, 2012 @ 6:44 am
Shouldn't
it be Eng glish?
Gene Buckley said,
October 3, 2012 @ 7:24
am
Linguistically, compounds like air force are single words
composed of other words: this is the beauty of hierarchical structure.
Orthographies make different choices about how to handle that layered
structure in writing. English is inconsistent, sometimes using a space,
hyphen, or no division at all, often related to how familiar or
"lexicalized" the compound is: water tower vs. waterfall.
Spelling
practice varies over time and space; hyphens used to be more common, and
still are relatively more common in British than in American orthography.
German, where these compounds have the same linguistic structure as in
English, has a more consistent orthography, regularly writing compounds as
one word (Wasserturm, Wasserfall) regardless of length; see this dramatic
Afrikaans example, since it (like Dutch) follows the same practice.
In
Chinese, and therefore in Sino-Vietnamese, the compounds mainly at issue
are closer to English per-mit, con-fer, and tele-phone. Because the
meaning of the whole is often not very predictable from the meaning of the
components, speakers shouldn't have much trouble learning to treat most
such items as single written words, although there would no doubt be a
role for (somewhat arbitrary) standardization. I think Victor's point is
that to make no reference at all to word structure (whether by using
spaces nowhere or everywhere) is to leave the reader completely on his or
her own, when an orthography could give some significant information
through the judicious use of spaces.
It's another question
whether further compounding should be written as a single word. Victor, as
I take it, is mainly talking about the equivalent of per mit, although
there will also be words like build ing that are semantically more
transparent. Today Vietnamese writes the equivalent of build ing per mit.
A writing reform that ended with building permit might be superior to
buildingpermit, since the spaces show the relative grouping of (pairs of)
morphemes where they do the most good, while still identifying the
internal constituency of larger compounds. If Vietnamese and German
represent the extremes, English orthography might for once actually be
rather sensible, if only it were more consistent.
Victor Mair
said,
October 3, 2012 @ 7:28 am
That's why it's
"English".
Matt Anderson said,
October 3, 2012 @ 7:55
am
Ruben Polo-Sherk,
Maybe I don't understand your
point exactly, but, in Mandarin, polysyllabic words can certainly combine
with other morphemes to form longer words. For example, húdié 蝴蝶
'butterfly' can combine with gǔ 骨 'bone' to form the word húdiégǔ 蝴蝶骨
'sphenoid'; xìbāo 細胞 'cell' can combine with zhì 質 'substance' to form
xìbāozhì 細胞質 'cytoplasm'; and lǚyóu 旅遊 'tourism' can combine with qū
區 'district' to form lǚyóuqū 旅遊區 'tourist area'. &, while the
individual syllables of xìbāo and lǚyóu can themselves be said to be
morphemes, húdié is itself a single morpheme.
Ruben Polo-Sherk
said,
October 3, 2012 @ 8:30 am
Certainly polysyllabic
words can combine with other morphemes. My point about the restriction on
polysyllabic *morphemes* doing so was with regard to *how* they do it. The
only way they can is basically through the same mechanism that we use to
get "tennis racket" and "toaster oven". 蝴蝶骨 is basically "butterfly
bone" in this same sense. There's an important difference between that
sort of union and the one in, for example 理解, or 看见.
M (was
L) said,
October 3, 2012 @ 9:37 am
Does it make a lot of
sense to bust a gut over foreign names and words? Every written language
is challenged by this. Every language has to deal with it, and often by
special localization rules that differ for each commonly-encountered
foreign language. Often, it's a matter of "drop back ten and punt."
It
seems to me that whatever Vietnamese decides to do with Vietnamese
vocabulary, and with loan-words that have become sufficiently adopted that
they are now de facto Vietnamese vocabulary, is one question - - - but not
a decision that ought to be driven by foreign words. Tail wagging the dog,
no?
Steve said,
October 3, 2012 @ 11:57 am
POINT
ONE: The folks who worry about joining Vietnamese syllables or not joining
Vietnamese syllables are in the same league with theologians worrying
about how many angels can dance on the head of a pin. 90 million
Vietnamese use an orthographic system that works well for them. In the
early post-WW2 period, they undertook a massive literacy campaign that
worked very well because, for a native speaker of Vietnamese, the writing
system is not nearly as difficult to learn as say, the English system is
for native speakers of English.
POINT TWO: If one makes the axiomatic
statement that "As the advocates of this change point out, most words in
Vietnamese are disyllabic (the same is true of Mandarin)," one begs the
question of what constitutes a "word." Many commentators appear to be
judging whether an utterance in Vietnamese is a word based on whether
what is expressed can be called "a (i.e., one) word" in, say, English or
French. This is, in my opinion, a highly subjective stance.
In
any event, judging the matter as a non-native speaking student of both
Vietnamese and Mandarin Chinese for the last 50+ years, it strikes me that
the rate of apparent monosyllabicity in Vietnamese is much greater than in
Mandarin Chinese – indeed, Vietnamese appears to have the highest rate of
monosyllabicity and the lowest rate of phonemic redundancy of any language
I have taken a scholarly interest in. For what it's worth…
Steve
said,
October 3, 2012 @ 12:17 pm
While this discussion is
very interesting for us [and to me especially, since this is basic to what
I have been doing every day for the past half century], it is rather
meaningless from the point of view of the users of the Vietnamese writing
system. It is very unlikely that any writing reforms will be instituted in
the foreseeable future. They would cause more chaos than benefit. For
example, if you look at Ho Chi Minh's manuscripts and other handwritten
materials, you will see that he often liked to write "z" for "d" and "r"
and "gi" – these are reflexions of the similar Northern pronunciation of
the graphs in question [odd, since he spoke with a Central accent in
day-to-day conversation]. Because of Ho's iconic status in much of Viet
Nam [but clearly not all of Viet Nam], some true-believers have pushed the
idea that the writing system should make the same substitution. However,
there are other regions in Viet Nam where there is no "z" sound whatsoever
and where "d" and "r" and "gi" do not represent the same sounds anyway.
And there is even a very small part of the country where "d" and "r" and
"gi" are pronounced as separate contrasting sounds.
What this means
is that one immediately begs political questions of national unity when
one advocates writing reform of a system that is both universally employed
[except in a few private spheres] and widely accepted from the Ca Mau
peninsula to the Chinese border.
So I come back to my sainted
mother's old Indiana wisdom: "if it ain't broke, don't fix it!"
Ran
Ari-Gur said,
October 3, 2012 @ 1:11 pm
@M (was L): I
don't think anyone is suggesting otherwise. I fear you might be refuting a
straw man . . .
michael farris said,
October 3, 2012 @
1:40 pm
Apropos of what Steve has written it's important to
note that Quoc Ngu is not a transcription of a particular dialect or
language variety (which is still arguably the case for Pinyin) but an
orthography that has slowly evolved to work for speakers of dialects with
rather different phonemic inventories.
Each distinction made in
the script reflects a difference made somewhere (except for i and y as
full syllables and for all I know somewhere does make that distinction)
but nowhere makes all the distinctions (though a few dialects might come
pretty close) and which differences are levelled varies from region to
region (or village to village).
It is not calculated to look
appealing to westerners but it does a remarkably good job of providing a
working unified orthography for the language.
M (was L)
said,
October 3, 2012 @ 2:59 pm
@Ran Ari-Gur - I was
responding to the handwringing about Lon Don. What matters is how you
write Hanoi in Vietnamese. How you write London or Paris or East Lansing
doesn't really come into it except as a footnote.
JS said,
October
3, 2012 @ 3:28 pm
Ruben Polo-Sherk:
It would indeed be
interesting if this were a principled distinction… but is noun compounding
really "importantly different" from the sort of example you mention
(li3jie3 理解, from two verbs in "parallel," or kan4jian4 看见, from two
verbs in "series")? It seems possible that, historically, there simply
haven't been enough disyllabic+monomorphemic verbs around to feed such
processes… and such as have appeared more recently do get up to a certain
amount of funny stuff, esp. of a "reduplicative" nature (e.g.,
lao1laodaodao 唠唠叨叨, shu3shuluoluo 数数落落, etc.)
Jongseong
said,
October 3, 2012 @ 4:13 pm
Korean has been written
with spaces between words since at least the 1930s; before that, spacing
depended largely on the author, and before that, spaces were not used.
Spacing
continues to vex Koreans, but this is largely due to the agglutinative
morphology. For example, suffixes are supposed to be written without
spaces and dependent nouns are supposed to be spaced, but Korean is full
of cases where the same form can behave as a suffix or a dependent noun,
as in daero 대로. As a suffix meaning "based on" or "following", you have
beop-daero 법대로 ("following the law") with no space; as a dependent noun
meaning "as", you have mal-han daero 말한 대로 ("as spoken") with space
(I'm using the hyphen to separate morphemes in the romanization). Think of
the confusion in English between "a while" and "awhile" or "maybe" and
"may be", but much more frequent in the language.
Compound
nouns are another source of ambiguity, much as in English (which has the
additional option of hyphenation to confuse matters further, "crybaby",
"cry-baby", or "cry baby"?). Korean rules allow for optional spacing in
many cases, which I guess is pragmatic.
I'm less familiar with
North Korean rules, but in general they use spaces quite a bit less than
in South Korea. Compound nouns are generally written without spaces, and I
think even dependent nouns may be written without spaces, so that the
example above would be mal-han-daero 말한대로 in North Korean spelling.
I
don't think you could come up with a spacing rule for Korean that is at
once simple and can satisfy everyone. However, for all the confusion about
correct spacing, you wouldn't find anyone arguing for going back to no
spaces between words. Korean is so much more readable with spaces. For
what it's worth, Koreans don't have the confusion between syllables and
words regarding their own language, though they have the advantage that
polysyllabic morphemes are so common in Korean.
Knowing next to
nothing about Vietnamese and based on the simple fact that it is an
isolating language with limited affixation, I would think spacing rules
for Vietnamese would be simpler than for Korean.
Ruben
Polo-Sherk said,
October 3, 2012 @ 4:39 pm
The issue is
semantic: Polysyllabic morphemes are independent in a way that the
monosyllabic morphemes, when functioning as part of a compound, are not.
They contain the entirety of the meaning. Now, even if it can be used
independently, the monosyllabic morphemes, when they are serving to
construct a compound, do not–the meaning of each is part of a large set of
fundamental "nuts and bolts" that are put together to have meaning that
can stand by itself. This fundamentalness is what I was talking about, and
there are no (or at least trivially few) disyllabic morphemes in this
group of fundamental ones.
Matt said,
October 3, 2012 @
8:09 pm
One interesting thing about spaces in Japanese kids'
books is that they don't come between words and particles. So in kana it's
"いぬが はなを" (dog-NOM flower-ACC) but in Romaji it's (usually) "inu ga
hana o". (Although the Portuguese missionaries used the same separation as
modern kana: "inuga fanauo".) One useful effect of adding spaces to
Japanese orthography would be the provision of a final, by-fiat answer to
what exactly constitutes a word in Japanese. (Tongue only partly in
cheek.)
wren ng thornton said,
October 3, 2012 @ 8:27
pm
@Ellen K:
There are certainly ambiguous cases in
English, but I think the issue is one of severity. Most of the English
examples I can think of are ones where the compositional structure has
been lost to us (e.g., "a lot", "after all") and we treat the set phrase
as a single word. (The other examples are compound nouns, but German seems
to do fine with eliminating the spaces there.) However, to pick Japanese
as an example, because of its agglutinative nature the issue of
distinguishing words is problematic even for productive structures.
For
example, Japanese uses a lot of verb compounding. This is vaguely similar
to English's system of modal verbs, except that it's extremely productive
instead of involving a closed set of forms. Depending on the verbs
involved, these compounds could be (a) entirely compositional, (b)
syntactically compositional but with non-compositional semantics, (c)
semantically non-compositional to the point of being aspectual/affective
markers, often with phonetic non-compositionality, or (d)
non-compositional to the point that they are considered to be simple
inflections rather than compounds. In the conventional romanization we
treat most of (d) as single words; treat (a), (b), and the remainder of
(d) as separate words; and waffle back and forth over (c). But because
there's a continuum here, from clearly compositional processes through to
tense/aspect/mood/polarity inflections, wherever you draw the line is
going to be problematic.
To pick another issue, in the
traditional romanization we separate off case morphemes from their noun
(etc). This is strange, but then there's a continuum between case
morphemes and postpositions, so again there's this issue of where to draw
the boundary (if indeed any boundary should be drawn). And this gets
confounded into other issues too. For true adjectives, the morpheme
converting them into adverbs is traditionally romanized as part of the
same word. Whereas for adjectival nouns, the morpheme converting them into
adverbs is traditionally written as a separate word (since it's related to
the dative). And that morpheme coincides with one for converting verbal
stems into adverbs, but for verbal stems people waffle back and forth
about whether it should be separated or not. That morpheme is also a form
of the copula, so surely you'd want to be consistent about how you treat
the copula elsewhere right? Etc. Etc.
If Vietnamese is at all
similar, it's no wonder they settled on spaces between each
morpheme/syllable. It's a bit extreme, but at least it's consistent,
eh?
Ran Ari-Gur said,
October 3, 2012 @ 11:04 pm
@M
(was L): Re: "I was responding to the handwringing about Lon Don": I don't
see how you can have been, seeing as there wasn't any . . .
JS said,
October 4, 2012 @ 12:01 am
Ah… I am not clear on all
points, but sense in your last comment a view of Chinese and Vietnamese
word formation rather different from that which I have in mind: where I
tend to think mostly about larger words formed from smaller words proper
by a variety of processes (some of which might be properly called
"compounding" and some not), it seems you view these languages as engaging
in word-building from stores of (often bound) morphemes (the "nuts and
bolts") in a more self-conscious manner, a la "classical" compounding in
English, or novel unions of Sino-Japanese elements in Japanese?
These
two possibilities are not mutually exclusive, of course… but my tendency
to see the latter sort of "compounding" as more exceptional and less
interesting might be the reason I have been slow to appreciate your
suggestion regarding the relative productivity of monosyllabic vs.
disyllabic morphemes in compounds (a difference I suppose I might see as
merely a reflection of the sorts of words available in the language at a
given time.)
Matt said,
October 4, 2012 @ 12:31 am
Also, part of the role that Chinese characters play in Japanese orthography is
indicating word division. The basic principle is that "A change from kana
to kanji usually indicates that a new word has begun."
人類社会のすべての構成員の固有の尊厳と平等で譲ることのできない権利とを承認することは、世界における自由、正義及び平和の基礎であるので、
jinruishakainosubeteno
koseiinno koyuno songento byodode yuzurukotonodekinai kenritoo
shoninsurukotowa, sekainiokeru jiyu, seigioyobi heiwano kisodearunode…
That simple rule above gets us about halfway to a working tokenizer: of the 12
"words" above, at least 6 or 7 are arguably "really words" if you accept
the particles-are-part-of-the-word-they-follow argument. The lexicon
needed to mop up the edge cases isn't unworkably enormous.
Of
course, this doesn't mean that kanji are necessary for Japanese writing to
make sense (as harped on endlessly in other threads), as any shift to a
kanji-free writing system would surely see the introduction of spacing as
well. But in my opinion this is part of the reason why there is such
resistance to ideas like "only write Sino-Japanese words with kanji; write
the native vocabulary (like 譲る in the example above) in kana": the
arrangement of different types of characters conveys the same sort of
information as whitespace, albeit less efficiently and unambiguously.
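The kana-to-kanji rule quoted in Matt's comment above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not anything the commenter wrote: the function names are hypothetical, "kanji" is approximated by the basic CJK Unified Ideographs block plus the iteration mark 々, "kana" by the Hiragana and Katakana blocks, and any other character (such as 、) is treated as a separator. It uses no lexicon, so it reproduces only the "halfway" part of the tokenizer described in the comment.

```python
def is_kanji(ch):
    # Assumption: basic CJK Unified Ideographs block, plus the iteration mark.
    return "\u4e00" <= ch <= "\u9fff" or ch == "々"


def is_kana(ch):
    # Hiragana (U+3040-U+309F) and Katakana (U+30A0-U+30FF) blocks.
    return "\u3040" <= ch <= "\u30ff"


def split_on_kana_to_kanji(text):
    """Start a new token whenever the script changes from kana to kanji.

    Characters that are neither kana nor kanji (punctuation, spaces)
    end the current token and are dropped.
    """
    tokens = []
    current = ""
    prev = ""
    for ch in text:
        if not (is_kanji(ch) or is_kana(ch)):
            # Separator: close the token in progress, if any.
            if current:
                tokens.append(current)
            current = ""
            prev = ""
            continue
        if current and is_kana(prev) and is_kanji(ch):
            # Kana followed by kanji: assume a new word has begun.
            tokens.append(current)
            current = ""
        current += ch
        prev = ch
    if current:
        tokens.append(current)
    return tokens


if __name__ == "__main__":
    sentence = (
        "人類社会のすべての構成員の固有の尊厳と平等で譲ることのできない"
        "権利とを承認することは、世界における自由、正義及び平和の基礎であるので、"
    )
    print(" ".join(split_on_kana_to_kanji(sentence)))
```

Applied to the sentence in the comment, the sketch produces chunks that line up with the spaced romanization given there, including the over-long first chunk 人類社会のすべての, which is one of the edge cases a lexicon would have to clean up.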
Ruben Polo-Sherk said,
October 4, 2012 @ 8:31 am
JS: I'm sorry,
but I'm not entirely sure what you're saying, so forgive me if I'm talking
about something entirely different.
Aren't these two types of
compounding entirely different phenomena? The first one isn't really
particular to Chinese, and doesn't have anything to do with the
morphological substructure, so I left it out of my original post. In fact,
my point was that these two-morpheme compounds are *different* from
compounds like "tennis racket" (if that's what you mean by "'classical'
compounding"?).
Do you mean that you see the mechanism for
establishing the meanings of two-(bound) morpheme compounds from their
constituent parts as irregular to the point that you consider these
compounds to be mostly "set" combinations, and therefore unitary?
If
so, I'll try to explain why I see it the way I described it.
From
my own experience learning them, I find that many (a majority?) of
compounds are understandable entirely from their constituent morphemes.
More specifically, in the past, when I came across an unfamiliar compound,
but knew each morpheme, I would be able to understand what that compound
meant from my knowledge of those morphemes. In fact, there have been times
when I wanted a particular word, but hadn't learned it yet, and was able
to successfully "derive" it from morphemes that I already knew (If you
want examples of the kind of compounds I derived, some I remember now are
区分、両日、根源、外面的、変容).
JS said,
October 4, 2012 @ 9:59 am
^ Thanks for your remarks. Basically I feel that
compounding from bound morphemes in Chinese at least, while it certainly
exists, is not terrifically productive; such words (dian4shi4 电视 and
the like) smell more like our coinages from Greek/Latin roots (what I
imprecisely called "classical" compounding) or the Sino-Japanese
contribution to CJK (ke1xue2 科学). The examples you raised earlier
(li3jie3 理解, kan4jian4 看见, myriad others) are instead in origin
free-free syntactic adjacencies (the latter arguably still phrasal), and I
see no reason in principle why polysyllabic morphemes couldn't wind up
involved in such lexicalization processes. So this second is indeed the
"tennis racket" category, though much richer in practice than such a
designation might suggest.
Incidentally, in neither case would
I see the meanings of these Chinese "compounds" as generally transparent
given their individual components, though the latter sort were at some
point freely composed and thus are arguably so from time to time…
Jason said,
October 4, 2012 @ 11:26 am
@ JS
I think
you are confusing compounds, which are similar to Germanic words in
English (e.g. airport, kitchen table), and agglutination, which accounts
for Greek and Latin words in English (e.g. deconstructionism). Mandarin,
like English, employs both; however, compounding is by far the more
productive form.
Ran Ari-Gur said,
October 4, 2012 @ 3:29 pm
@Jason: By "coinages from Greek/Latin roots" or "'classical'
compounds", I assume that JS is referring to words like "biology",
"telescope", "interject", etc., where a single word is formed by
compounding (?) two bound morphemes ("bio-" and "-ology", "tele-" and
"-scope", "inter-" and "-ject", etc.). Lexically and semantically, they're
very similar to compounds of free morphemes like "life science" and
"distance viewer", and to verb-particle idioms like "throw in".
JS said,
October 4, 2012 @ 8:53 pm
^ So… yeah, dian4shi4 etc.
strike me as "biology"-type words, built self-consciously from the
nuts-and-bolts Ruben Polo-Sherk has referred to, while the core of the
Mandarin lexicon consists more of "life science"-type words (though of
course of very diverse phrasal origins, found across word classes, and
often with constituents no longer free.)
@Jason, not sure what
you would want to call "agglutination" in Mandarin as distinct from
"compounding"… perhaps -de suffixation to create "one who does X"
meanings, -hua suffixation to create "ish"-ish meanings, and the like? In
which case you would have processes limited in number but very productive
indeed…
Apologies if I've derailed discussion… to return to the
point, I might say I've found it interesting that those with knowledge of
Vietnamese language and writing seem to find the suggestion of word
division so asinine. The situation surely can't be so different from that
of Mandarin, where IN THEORY (this naturally being as far as the present
discussion means to extend), word division would be a workable and an at
least marginally useful orthographical device.
Ruben Polo-Sherk said,
October 4, 2012 @ 11:16 pm
Ok, now I see what you're
saying. I think that our disagreement comes from how we are viewing the
processes involved in compounding for the "core" of the lexicon.
We
agree on the fact that polysyllabic morphemes can form part of "life
science" compounds, but I am claiming that there is an important
distinction between "life science"/"tennis racket"/蝴蝶骨 compounds and
ones like 空間/変化. In the former, both parts are stand-alone,
independent words, and you are using the life/tennis/butterfly to specify
the kind of science/racket/bone. This is not the same construction
involved in 空間 or 理解. (If anything, the former is a lot closer to the
"biology" type in construction). Whether or not polysyllabic morphemes can
be involved in a particular process has, obviously, nothing to do with how
many syllables they have; it has to do with the fact that, whatever the
reason, all polysyllabic morphemes in Chinese are stand-alone, independent
words, and not building blocks*.
For the purposes of this
discussion, I'm splitting compounds into three types (some of my examples
are Chinese; others are Sino-Japanese, but the mechanism is the same):
1) 电视, 化学, etc. These are essentially the same as "classical" compounds
in English.
2) tennis racket, 蝴蝶骨, etc. This exists in lots of languages and is
unremarkable.
3) 空間, 想要, 見解, 区分, 理解, 変化. This is what I mean by the core of
the lexicon.
*If you know of any exceptions, please let me know, but I maintain that they'd
still be statistically rare enough to be irrelevant to my larger point.
JS said,
October 5, 2012 @ 11:06 pm
^ To be "compounds" at
all, all items under your (2) as well as (3) must be lexemes in their own
right, with transparency or lack thereof merely a function of time, among
other factors, correct? Surely da4ren(2) 大人 ("descriptive" compound;
currently 'adult' and formerly 'your honor', etc.) is no different from
hu2die2gu3, with li3jie3 and others (though very often of entirely
different first syntactic structure) distinct from these only due to
gradual loss of transparency? So, my claim was only that polysyllabic
morphemes, though relatively few in number, may also engage in such
processes.
I don't think we should speak of privileged
"building blocks" in Mandarin aside from "suffixes" like -de, -jia, -men,
etc., and arguably the bound forms on occasion exploited in your (1).
Ruben Polo-Sherk said,
October 7, 2012 @ 6:21 am
It seems that
we've been using arguments that assume one interpretation or the other on
whether these morphemes are lexemes or not, basically arguing from
inconsistent paradigms. It seems to me that you see every morpheme, with
the exception of things like -学, 电视, and -的, as always functioning as
a lexeme. In Sino-Japanese, that interpretation is absolutely
untenable–there's no question that the compounds themselves are the
lexemes, but in Chinese, it's not so clear. There's only a valid
distinction between polysyllabic morphemes and monosyllabic ones (or, more
precisely, between bound and free ones) if you *don't* see every morpheme
as a lexeme (excepting the agglutinative ones). If things like 理解 are
taken to be clearly two words instead of one, then there is, of course, no
utility to having the concept of a core process for forming the vocabulary
at all (but my earlier point would nevertheless be correct–then every
element of the lexicon, with a few very rare exceptions, is still
monosyllabic). I'm not going to try to convince you or anyone else that
things like 理解 are actually unitary in Chinese, since I don't believe
that myself: many tools for analyzing other languages (for example, the
concepts of parts of speech and word boundaries) are not suitable for
Chinese, and everything looks fuzzy when you look at it from those
perspectives. I'll only conclude with an argument for transparency and
compositionality of these compounds: suppose you know what 理論、理解、
解説、説明、and 回答 mean; you can infer what "meanings" (or parts of
meanings) are represented by 理 and 解. And then if you see 解答 for the
first time, you can understand it compositionally. (I'm not claiming that
*every* compound works like this–there are many, of course, that are
rather opaque–but I do think that the majority remain compositional.)
Gpa said,
October 14, 2012 @ 4:29 pm
Vietnamese borrows mainly
from Cantonese, which is a remnant from Middle Chinese, not Mandarin,
which is a bunch of reduced sounds from Middle Chinese, so using Mandarin
seems irrelevant. And using Japanese is more irrelevant. Most of the words
in Japanese use ancient Chinese monosyllabic combinations with other
monosyllabic words to form a disyllabic or polysyllabic word. Koreans, with
their borrowing from Chinese, are just like the Japanese, who borrow across
the many varieties of Chinese dialects, so any Chinese dialect's original
word is now not their own anymore. Basically, Japanese, Korean, and
Vietnamese use the same method to convey Chinese disyllabism: Using
approximate sounds via their devised writing systems, all via Chinese, to
form the Chinese words, which might or might not sound like the original
Chinese word anymore, due to Japanization, Koreanization and
Vietnamization of these original Chinese words. 蝴蝶: 蝴 & 蝶 both
mean "butterfly/butterflies", which are rarely separated to form other
disyllabic / polysyllabic words in Chinese.
Source: http://languagelog.ldc.upenn.edu/nll/?p=4233
-----------------------------