Thursday, April 25, 2013

Hyphenation in compounds with abbreviation remarks



As far as I understand, hyphenation should aid readability.



Examples [1, 2]:





North America-based company



A Gaussian mixture model-based approach



We propose spherical Gaussian-based approximations to calculate this analytically.




Although this never aligned with my understanding of parse trees, I would still like to apply this rule.




How does it extend to abbreviation remarks?




Gaussian mixture model (GMM)-based approach



Non-negative matrix factorization (NMF)-inspired method




My own understanding of how to parse the phrase is as follows, and it does not seem to be reflected in how hyphens are used:




{
  {
    {
      Gaussian {
        mixture model
      }
    } (GMM)
  }-based
} approach


Answer



Hyphens join constituents, either words or phrases, into single words. Consequently, to know whether a hyphen is appropriate, you have to know the categories of the constituents, not just what the constituents are. Below, I've tried to amend your diagram for "Gaussian mixture model (GMM)-based approach" by adding category (part-of-speech) information. NP means noun phrase (a phrase), N is a noun (a word), A is an adjective or other noun modifier (a word), and Participle is a participle (a word).



{NP
  {A
    {NP
      A Gaussian {N
        N mixture N model
      }
    } (GMM)
  }-Participle based
} N approach


There are two types of compound word in the example. A compound adjective (a word) is made by combining an NP (a phrase) and a Participle (a word), and a compound N (a word) is made by combining two Ns (words). For the latter type of compound, a hyphen is often optional.
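For readers who think in terms of parse trees, the bracketing above can be sketched as a small labeled tree in Python. This is only an illustration under my own assumptions: the names Node, joiner, and flatten are hypothetical, and treating the NP plus "(GMM)" as another NP layer is my reading of the diagram, not an established analysis. The point is that the hyphen appears exactly where a phrase and a participle are joined into one word.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Node:
    """A labeled constituent (hypothetical helper for illustration)."""
    category: str                        # "NP", "N", "A", or "Participle"
    children: List[Union["Node", str]]   # sub-constituents or plain words
    joiner: str = " "                    # text placed between the children


def flatten(node: Union[Node, str]) -> str:
    """Spell out a constituent by joining its children with its joiner."""
    if isinstance(node, str):
        return node
    return node.joiner.join(flatten(child) for child in node.children)


# Compound noun: two Ns joined into one N (written here without a hyphen).
compound_noun = Node("N", [Node("N", ["mixture"]), Node("N", ["model"])])

# Noun phrase: adjective plus compound noun.
noun_phrase = Node("NP", [Node("A", ["Gaussian"]), compound_noun])

# Assumption: the parenthetical abbreviation attaches to the whole NP.
with_abbreviation = Node("NP", [noun_phrase, "(GMM)"])

# Compound adjective: NP (a phrase) + Participle (a word), joined by a hyphen.
compound_adjective = Node(
    "A",
    [with_abbreviation, Node("Participle", ["based"])],
    joiner="-",
)

# The full noun phrase: compound adjective modifying the head noun.
full_phrase = Node("NP", [compound_adjective, Node("N", ["approach"])])

print(flatten(full_phrase))
```

In this sketch the hyphen belongs to the compound adjective as a whole, so it follows whatever the left-hand phrase ends with, including a closing parenthesis.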



I'm not sure I see a problem with the hyphenation. I'm worried, though, about the structure of "Gaussian mixture model", which must be a phrase, not a single word, because "Gaussian" is an adjective, and noun-noun compounds can't contain adjectives. But "Gaussian mixture" should be a constituent, because of the interpretation: mixture of Gaussian distributions.

