If the database name contains non-alphanumeric characters, quote it with backticks (`):
CREATE DATABASE `my-db` CHARACTER SET utf8 COLLATE utf8_general_ci;
When running the statement from a shell script inside double quotes, escape each backtick with a backslash (\):
mysql -p -e "CREATE DATABASE \`my-db\` CHARACTER SET utf8 COLLATE utf8_general_ci;"
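The quoting rule above can be sketched as follows. This is only an illustration: the variable names (db_name, sql_single, sql_double) are made up for this example, and the final mysql call is left commented out because it needs a reachable server and credentials.

```shell
#!/bin/sh
# Hypothetical database name, matching the example above.
db_name='my-db'

# Inside single quotes the shell leaves backticks alone,
# so no backslashes are needed.
sql_single='CREATE DATABASE `my-db` CHARACTER SET utf8 COLLATE utf8_general_ci;'

# Inside double quotes a bare backtick starts command substitution,
# so each one is escaped with a backslash. Variables still expand.
sql_double="CREATE DATABASE \`${db_name}\` CHARACTER SET utf8 COLLATE utf8_general_ci;"

# Both forms produce the same SQL text.
printf '%s\n' "$sql_single"
printf '%s\n' "$sql_double"

# To run it against a server (assumes mysql is installed and you can log in):
# mysql -p -e "$sql_double"
```

The double-quoted form is the one to use when the database name comes from a variable; the single-quoted form is simpler when the statement is fixed.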
Question:
What is utf8_general_ci?
Official documentation:
For any Unicode character set, operations performed using the xxx_general_ci collation are faster than those for the xxx_unicode_ci collation. For example, comparisons for the utf8_general_ci collation are faster, but slightly less correct, than comparisons for utf8_unicode_ci. The reason for this is that utf8_unicode_ci supports mappings such as expansions; that is, when one character compares as equal to combinations of other characters. For example, in German and some other languages “ß” is equal to “ss”. utf8_unicode_ci also supports contractions and ignorable characters. utf8_general_ci is a legacy collation that does not support expansions, contractions, or ignorable characters. It can make only one-to-one comparisons between characters.
To further illustrate, the following equalities hold in both utf8_general_ci and utf8_unicode_ci (for the effect this has in comparisons or when doing searches, see Section 11.1.8.7, “Examples of the Effect of Collation”):
Ä = A
Ö = O
Ü = U
A difference between the collations is that this is true for utf8_general_ci:
ß = s
Whereas this is true for utf8_unicode_ci, which supports the German DIN-1 ordering (also known as dictionary order):
ß = ss
MySQL implements language-specific collations for the utf8 character set only if the ordering with utf8_unicode_ci does not work well for a language. For example, utf8_unicode_ci works fine for German dictionary order and French, so there is no need to create special utf8 collations.
utf8_general_ci also is satisfactory for both German and French, except that “ß” is equal to “s”, and not to “ss”. If this is acceptable for your application, you should use utf8_general_ci because it is faster. If this is not acceptable (for example, if you require German dictionary order), use utf8_unicode_ci because it is more accurate.
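To check these equalities on your own server, a small helper like the one below can generate the test query. It is a sketch under assumptions: collation_check_sql is a made-up function name, the _utf8 introducer is used so the COLLATE clause matches the literals' character set, and the commented-out call assumes a client that can connect with the utf8 character set. No expected result is shown here; per the manual text above, each comparison should return 1 or 0 depending on the collation.

```shell
#!/bin/sh
# Print a query that tests 'ß' = 's' and 'ß' = 'ss' under the given collation.
# The _utf8 introducer marks each literal as utf8 so that COLLATE applies.
collation_check_sql() {
    printf "SELECT _utf8'ß' = _utf8's' COLLATE %s AS eq_s, _utf8'ß' = _utf8'ss' COLLATE %s AS eq_ss;\n" "$1" "$1"
}

# Show the generated statements for both collations discussed above.
for collation in utf8_general_ci utf8_unicode_ci; do
    collation_check_sql "$collation"
done

# To run one against a server (assumes mysql is installed and you can log in):
# collation_check_sql utf8_unicode_ci | mysql -p --default-character-set=utf8
```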
MySQL :: MySQL 5.7 Reference Manual :: 11.1.15.1 Unicode Character Sets
http://dev.mysql.com/doc/refman/5.7/en/charset-unicode-sets.html
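Other source 1:
utf8_general_ci is a legacy collation that does not support expansions; it can only compare characters one to one. This means comparisons using utf8_general_ci are fast, but less correct than comparisons using utf8_unicode_ci.
However: utf8_unicode_ci is more accurate, while utf8_general_ci is faster. In most cases the accuracy of utf8_general_ci is sufficient. After reading the source code of many programs, I found that most of them also use utf8_general_ci, so choosing utf8_general_ci when creating a new database is generally fine.
Differences between the utf8_bin, utf8_general_ci, and utf8_general_cs encodings in MySQL - huanleyan's column - Blog channel - CSDN.NET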
http://blog.csdn.net/chenghuan1990/article/details/10078931
Other source 2:
utf8_general_ci is a very simple collation, and on Unicode a very broken one: it gives incorrect results on general Unicode text. What it does is:
- converts to Unicode normalization form D for canonical decomposition
- removes any combining characters
- converts to upper case
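This does not work correctly on Unicode, because Unicode casing is more complicated than this simple approach can handle. For example:
- The lowercase of “ẞ” is “ß”, but the uppercase of “ß” is “SS”.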
- There are two lowercase Greek sigmas, but only one uppercase one; consider “Σίσυφος”.
- Letters like “ø” do not decompose to an “o” plus a diacritic, meaning that it won’t correctly sort.
utf8_unicode_ci uses the standard Unicode Collation Algorithm and supports so-called expansions and ligatures. For example, the German letter ß (U+00DF LATIN SMALL LETTER SHARP S) is sorted near "ss", and the letter Œ (U+0152 LATIN CAPITAL LIGATURE OE) is sorted near "OE".
utf8_general_ci does not support expansions or ligatures; it sorts all these letters as single characters, and sometimes in the wrong order.
utf8_unicode_ci is generally more accurate for all scripts. For example, in the Cyrillic block, utf8_unicode_ci is fine for all of these languages: Russian, Bulgarian, Belarusian, Macedonian, Serbian, and Ukrainian. utf8_general_ci is fine only for the Russian and Bulgarian subset of Cyrillic; the extra letters used in Belarusian, Macedonian, Serbian, and Ukrainian are not sorted well.
The drawback of utf8_unicode_ci is that it is a little bit slower than utf8_general_ci. But that’s the price you pay for correctness. Either you can have a fast answer that’s wrong, or a very slightly slower answer that’s right. Your choice. It is very difficult to ever justify giving wrong answers, so it’s best to assume that utf8_general_ci doesn’t exist and to always use utf8_unicode_ci. Well, unless you want wrong answers.
Source: http://forums.mysql.com/read.php?103,187048,188748#msg-188748
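If you follow that advice, the database from the first example could be created with utf8_unicode_ci instead, and its default collation checked afterwards. This is a rough sketch only: the variable names are invented for illustration, the check reads the information_schema.SCHEMATA table, and the mysql call is commented out because it needs a reachable server.

```shell
#!/bin/sh
# Hypothetical database name, reused from the first example.
db_name='my-db'

# Same CREATE DATABASE statement as before, but with utf8_unicode_ci.
create_sql="CREATE DATABASE \`${db_name}\` CHARACTER SET utf8 COLLATE utf8_unicode_ci;"

# Query that reports the default character set and collation of the database.
verify_sql="SELECT DEFAULT_CHARACTER_SET_NAME, DEFAULT_COLLATION_NAME FROM information_schema.SCHEMATA WHERE SCHEMA_NAME = '${db_name}';"

printf '%s\n' "$create_sql" "$verify_sql"

# To run both against a server (assumes mysql is installed and you can log in):
# mysql -p -e "$create_sql $verify_sql"
```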