author: Eric Mader

Notes:

The property \p{Decomposition_Type=Canonical} will match all characters with a canonical decomposition.

So "[[\\p{Latin}\\p{Greek}\\p{Cyrillic}] & [\\p{Decomposition_Type=Canonical}]]" will match all Latin, Greek and Cyrillic characters with a canonical decomposition.

Are these three scripts enough? Do we want to collect them all at once and distribute by script, or process them one script at a time?
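
To make the "collect them all at once and distribute by script" option concrete, here is a minimal sketch that uses only the JDK, not the set pattern quoted in the notes. It rests on several assumptions: the class name `CanonicalByScript` and its methods are hypothetical, `Character.UnicodeScript.of` stands in for the `\p{Latin}`, `\p{Greek}` and `\p{Cyrillic}` script tests, and "a single code point is not already in NFD" stands in for `\p{Decomposition_Type=Canonical}`. The exact sets it produces depend on the Unicode version of the running JDK.

```java
import java.lang.Character.UnicodeScript;
import java.text.Normalizer;
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of any library.
public class CanonicalByScript {
    // The three scripts named in the notes.
    static final Set<UnicodeScript> SCRIPTS =
        EnumSet.of(UnicodeScript.LATIN, UnicodeScript.GREEK, UnicodeScript.CYRILLIC);

    // Assumption: a code point has a canonical decomposition exactly when
    // the one-code-point string is changed by NFD normalization.
    static boolean hasCanonicalDecomposition(int cp) {
        String s = new String(Character.toChars(cp));
        return !Normalizer.isNormalized(s, Normalizer.Form.NFD);
    }

    // One pass over all code points; matches are distributed by script.
    static Map<UnicodeScript, Set<Integer>> collect() {
        Map<UnicodeScript, Set<Integer>> byScript = new EnumMap<>(UnicodeScript.class);
        for (UnicodeScript script : SCRIPTS) {
            byScript.put(script, new TreeSet<>());
        }
        for (int cp = 0; cp <= Character.MAX_CODE_POINT; cp++) {
            if (!Character.isDefined(cp)) {
                continue; // skip unassigned code points
            }
            UnicodeScript script = UnicodeScript.of(cp);
            if (SCRIPTS.contains(script) && hasCanonicalDecomposition(cp)) {
                byScript.get(script).add(cp);
            }
        }
        return byScript;
    }

    public static void main(String[] args) {
        // Print how many characters were collected for each script.
        collect().forEach((script, cps) ->
            System.out.println(script + ": " + cps.size() + " characters"));
    }
}
```

Processing one script at a time would amount to calling the same loop with a single-element `SCRIPTS` set. The single pass shown here scans the code point range only once, at the cost of holding all three result sets in memory together.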