Thursday, November 23, 2017

Hyphenation in compounds with abbreviation remarks



As far as I understand, hyphenation should aid readability.



Examples [1, 2]:




North America-based company




A Gaussian mixture model-based approach



We propose spherical Gaussian-based approximations to calculate this analytically.




Although this never aligned with my understanding of parse trees, I would still like to apply this rule.



How does it extend to compounds that include a parenthetical abbreviation?





Gaussian mixture model (GMM)-based approach



Non-negative matrix factorization (NMF)-inspired method




My own understanding of how to parse the words is as follows, which does not seem to be reflected in how hyphens are used:



{
  {
    {
      Gaussian {
        mixture model
      }
    } (GMM)
  }-based
} approach

Answer



Hyphens are used to combine constituents, either words or phrases, into words. Consequently, to know whether a hyphen is appropriate, you have to know the categories of the constituents, not just what the constituents are. Below, I've tried to amend your diagram for "Gaussian mixture model (GMM)-based approach" by adding category (part-of-speech) information. NP means noun phrase (a phrase), N is a noun (a word), A is an adjective or other noun modifier (a word), and Participle is a participle (a word).




{NP
  {A
    {NP
      A Gaussian {N
        N mixture N model
      }
    } (GMM)
  }-Participle based
} N approach



There are two types of word compounds in the example. A compound adjective (a word) is made by combining an NP (a phrase) and a Participle (a word), and a compound N (a word) is made by combining two Ns (words). For the latter type of compound, a hyphen is often optional.
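The two kinds of compound can also be written down as a small labeled tree. The following is only an illustrative sketch in Python, not a linguistic tool: the `Node` class, its `joiner` field, and the `render` function are hypothetical names introduced here, and attaching "(GMM)" to the noun phrase as a sibling is an assumption made for simplicity. The compound N joins its parts with a space (the optional hyphen is left out), while the compound adjective joins the NP and the participle with a hyphen.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Node:
    """A constituent with a category label (hypothetical helper for illustration)."""
    category: str                                   # e.g. "NP", "N", "A"
    children: List[Union["Node", str]] = field(default_factory=list)
    joiner: str = " "                               # text placed between the children


def render(node: Union[Node, str]) -> str:
    """Flatten a constituent into the string it spells out."""
    if isinstance(node, str):
        return node
    return node.joiner.join(render(child) for child in node.children)


# Compound N (a word) made of two Ns; the hyphen is optional and omitted here.
mixture_model = Node("N", ["mixture", "model"])

# NP (a phrase): adjective + compound noun.
gmm_phrase = Node("NP", ["Gaussian", mixture_model])

# Assumption: the parenthetical abbreviation is attached to the NP as a sibling.
gmm_with_abbrev = Node("NP", [gmm_phrase, "(GMM)"])

# Compound adjective (a word): NP + Participle, joined by a hyphen.
gmm_based = Node("A", [gmm_with_abbrev, "based"], joiner="-")

# The whole noun phrase: compound adjective + head noun.
approach = Node("NP", [gmm_based, "approach"])

print(render(approach))
```

In this sketch the hyphen belongs to the A node as a whole, so it attaches to the entire phrase "Gaussian mixture model (GMM)" rather than only to the word next to it.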



I'm not sure I see a problem with the hyphenation. I'm worried, though, about the structure of "Gaussian mixture model", which must be a phrase, not a single word, because "Gaussian" is an adjective, and noun-noun compounds can't contain adjectives. But "Gaussian mixture" should be a constituent, because of the interpretation: mixture of Gaussian distributions.

