Characters and runes
Go has 2 types of characters:
byte is a 1-byte value, an alias for the uint8 type
rune is a 4-byte value holding a Unicode code point, an alias for the int32 type
The zero value of both byte and rune is 0.
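As a minimal sketch of what the alias relationship and the zero value look like in practice, the program below declares both types without initializing them. The helper name zeroValues is hypothetical and exists only for this illustration.

```go
package main

import "fmt"

// zeroValues is a hypothetical helper that returns the zero values
// of byte and rune, taken from uninitialized variables.
func zeroValues() (byte, rune) {
	var b byte
	var r rune
	return b, r
}

func main() {
	b, r := zeroValues()
	fmt.Println(b, r) // both are 0

	// Because byte is an alias for uint8 and rune an alias for int32,
	// values are assignable in both directions without a conversion.
	var u uint8 = b
	var n int32 = r
	b, r = u, n

	// A character literal such as 'a' defaults to the rune type.
	c := 'a'
	fmt.Printf("%T %T %T\n", b, r, c) // uint8 int32 int32
}
```

The %T verb prints the underlying names uint8 and int32 because an alias does not create a distinct type.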
Iterate over a string using bytes
s := "str"
for i := 0; i < len(s); i++ {
	// indexing a string yields a single byte, not a rune
	c := s[i]
	fmt.Printf("Byte at index %d is '%c' (0x%x)\n", i, c, c)
}
Byte at index 0 is 's' (0x73)
Byte at index 1 is 't' (0x74)
Byte at index 2 is 'r' (0x72)
Iterate over a string using runes
s := "日本語"
for i, runeChar := range s {
	fmt.Printf("Rune at byte position %d is %#U\n", i, runeChar)
}
Rune at byte position 0 is U+65E5 '日'
Rune at byte position 3 is U+672C '本'
Rune at byte position 6 is U+8A9E '語'
A string is an immutable sequence of bytes.
When iterating over a string with range, Go assumes the string holds Unicode text encoded as UTF-8 and decodes one rune per iteration.
UTF-8 is a variable-length encoding in which a rune is encoded as 1, 2, 3 or 4 bytes.
The returned index i is the byte position within the string where the rune starts. It is not a rune count.
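The difference between byte positions and rune counts can be sketched with a string whose runes have different encoded widths. This is an illustrative example, not part of the original text: the helper runeStarts is a hypothetical name, and the sample string is an assumption chosen so that each rune takes a different number of bytes.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// runeStarts is a hypothetical helper that collects the byte offset
// at which each rune of s begins, as reported by a range loop.
func runeStarts(s string) []int {
	var starts []int
	for i := range s {
		starts = append(starts, i)
	}
	return starts
}

func main() {
	// 'a', U+00E9, '日' and U+1F600 are encoded in 1, 2, 3 and 4 bytes.
	s := "a\u00e9日\U0001F600"

	fmt.Println(len(s))                    // length in bytes: 10
	fmt.Println(utf8.RuneCountInString(s)) // number of runes: 4
	fmt.Println(runeStarts(s))             // [0 1 3 6]

	for _, r := range s {
		fmt.Printf("%#U takes %d byte(s)\n", r, utf8.RuneLen(r))
	}

	// Converting to []rune allows indexing by rune instead of by byte.
	runes := []rune(s)
	fmt.Println(string(runes[2])) // 日
}
```

Use utf8.RuneCountInString when only the number of runes is needed. The []rune conversion copies the string, so it is best reserved for cases that need random access by rune.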