Strings
Strings in Go are immutable sequences of bytes.
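A minimal sketch of what immutability means in practice: a string cannot be modified in place, so a change has to go through a copy, for example a `[]byte`. The helper name `replaceFirstByte` is hypothetical and used only for illustration.

```go
package main

import "fmt"

// replaceFirstByte (hypothetical helper) returns a copy of s whose first
// byte is replaced by b. The input string itself is never modified.
func replaceFirstByte(s string, b byte) string {
	if len(s) == 0 {
		return s
	}
	buf := []byte(s) // the conversion copies the bytes of s
	buf[0] = b
	return string(buf)
}

func main() {
	s := "hello"
	// s[0] = 'j' // would not compile: cannot assign to s[0]
	t := replaceFirstByte(s, 'j')
	fmt.Println(s, t) // hello jello
}
```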
Unlike strings in languages such as Python or Java, they are not stored internally as decoded Unicode text. Consequently, when reading strings from files or network connections, there is no conversion step from bytes to an internal representation. When writing strings to files, there is no conversion to a code page.
Go strings don’t assume any particular code page. They are just bytes.
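One consequence, sketched below, is that `len` reports the number of bytes rather than the number of characters. The helper `byteAndRuneCount` is a hypothetical name; `utf8.RuneCountInString` comes from the standard `unicode/utf8` package.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// byteAndRuneCount (hypothetical helper) reports the number of bytes in s
// and the number of UTF-8-encoded code points (runes) in s.
func byteAndRuneCount(s string) (int, int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	s := "h\u00e9llo" // "héllo"; 'é' occupies two bytes in UTF-8
	nb, nr := byteAndRuneCount(s)
	fmt.Println(nb, nr)    // 6 5
	fmt.Printf("% x\n", s) // 68 c3 a9 6c 6c 6f
}
```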
Go source code files are always UTF-8, so string literals written in source code are valid UTF-8 unless they contain byte-level escapes such as "\xff".
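Strings that come from elsewhere, or that are built from arbitrary bytes, carry no such guarantee. As a small sketch, `utf8.ValidString` from the standard library can check validity; the wrapper `isValidUTF8` is a hypothetical name used for illustration.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// isValidUTF8 (hypothetical helper) reports whether s consists entirely
// of valid UTF-8 sequences.
func isValidUTF8(s string) bool {
	return utf8.ValidString(s)
}

func main() {
	fmt.Println(isValidUTF8("日本語")) // true: literal text in a UTF-8 source file

	// A string holds arbitrary bytes, so it can also hold invalid UTF-8.
	fromBytes := string([]byte{0xff, 0xfe})
	fmt.Println(isValidUTF8(fromBytes)) // false
}
```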
Additionally, standard library functions that depend on character semantics, such as conversion to upper case or lower case, assume that the raw bytes are UTF-8-encoded Unicode text and perform transformations using Unicode rules.
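As an illustration, the sketch below runs non-ASCII text through `strings.ToUpper` and `strings.EqualFold`. The wrapper names `upper` and `sameIgnoringCase` are hypothetical and exist only to keep the example small.

```go
package main

import (
	"fmt"
	"strings"
)

// upper (hypothetical helper) upper-cases s using Unicode case mapping.
func upper(s string) string {
	return strings.ToUpper(s)
}

// sameIgnoringCase (hypothetical helper) compares a and b under
// Unicode case folding.
func sameIgnoringCase(a, b string) bool {
	return strings.EqualFold(a, b)
}

func main() {
	fmt.Println(upper("h\u00e9llo"))                            // HÉLLO
	fmt.Println(sameIgnoringCase("\u00c9COLE", "\u00e9cole")) // true
}
```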
Note the distinction between string literals delimited by double quotes (aka interpreted literals) as in "bar", and those delimited by backticks (aka raw literals) as in `foo`. Text between double quotes forms the value of the literal, with a backslash used to escape characters as in "\n" for newline. Text between backticks is treated as uninterpreted (implicitly UTF-8-encoded); in particular, backslashes have no special meaning and the string may contain newlines.
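The difference is easy to see by comparing lengths. In this short sketch the function `literals` is a hypothetical helper that returns the same source text written once as an interpreted literal and once as a raw literal.

```go
package main

import "fmt"

// literals (hypothetical helper) returns the text a\nb written as an
// interpreted literal and as a raw literal.
func literals() (interpreted, raw string) {
	interpreted = "a\nb" // backslash-n becomes a single newline byte
	raw = `a\nb`         // backslash and 'n' remain two separate bytes
	return interpreted, raw
}

func main() {
	interpreted, raw := literals()
	fmt.Println(len(interpreted), len(raw)) // 3 4
	fmt.Println(raw == "a\\nb")             // true
}
```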
Basic string usage:
package main

import "fmt"

func main() {
	var s string // empty string ""
	s1 := "string\nliteral\nwith\tescape characters"
	s2 := `raw string literal
which doesn't recognize escape characters like \n
`
	// you can concatenate strings with +
	fmt.Printf("sum of string: %s\n", s+s1+s2)
	// you can compare strings with ==
	if s1 == s2 {
		fmt.Printf("s1 is equal to s2\n")
	} else {
		fmt.Printf("s1 is not equal to s2\n")
	}
	// slicing and indexing operate on bytes, not characters
	fmt.Printf("substring of s1: %s\n", s1[3:5])
	fmt.Printf("byte (character) at position 3 in s1: %d\n", s1[3])
	// C-style string formatting
	s = fmt.Sprintf("%d + %f = %s", 1, float64(3), "4")
	fmt.Printf("s: %s\n", s)
}
Output:
sum of string: string
literal
with escape charactersraw string literal
which doesn't recognize escape characters like \n
s1 is not equal to s2
substring of s1: in
byte (character) at position 3 in s1: 105
s: 1 + 3.000000 = 4
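Indexing returned the number 105 because `s1[3]` is a byte, not a character. For text outside ASCII the distinction matters: a `for ... range` loop over a string decodes UTF-8 and yields runes together with their starting byte offsets. The sketch below assumes a hypothetical helper `runeOffsets`.

```go
package main

import "fmt"

// runeOffsets (hypothetical helper) returns the byte offset at which each
// rune of s starts.
func runeOffsets(s string) []int {
	var offsets []int
	for i := range s { // range decodes UTF-8 and yields each rune's start offset
		offsets = append(offsets, i)
	}
	return offsets
}

func main() {
	s := "h\u00e9llo" // "héllo"
	fmt.Println(s[1])           // 195: the first byte of 'é', not a character
	fmt.Println(runeOffsets(s)) // [0 1 3 4 5]
	for i, r := range s {
		fmt.Printf("%d:%c ", i, r)
	}
	fmt.Println()
}
```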
Important standard library packages for working on strings:
- strings implements string searching, splitting, case conversions
- bytes has the same functionality as the strings package but operates on []byte byte slices
- strconv for conversion between strings and integer and float numbers
- unicode/utf8 decodes from UTF-8-encoded strings and encodes to UTF-8-encoded strings
- regexp implements regular expressions
- text/scanner for scanning and tokenizing UTF-8-encoded text
- text/template for generating larger strings from templates
- html/template has all the functionality of text/template but understands the structure of HTML for generation of HTML that is safe from code injection attacks
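A brief sketch of how a few of these packages combine in practice, assuming a hypothetical helper `sumFields` that parses a comma-separated list of integers with `strings.Split`, `strings.TrimSpace` and `strconv.Atoi`:

```go
package main

import (
	"bytes"
	"fmt"
	"strconv"
	"strings"
)

// sumFields (hypothetical helper) parses a comma-separated list of
// integers and returns their sum, or the first parse error.
func sumFields(csv string) (int, error) {
	total := 0
	for _, field := range strings.Split(csv, ",") {
		n, err := strconv.Atoi(strings.TrimSpace(field))
		if err != nil {
			return 0, err
		}
		total += n
	}
	return total, nil
}

func main() {
	sum, err := sumFields("1, 2, 39")
	fmt.Println(sum, err) // 42 <nil>

	// bytes mirrors many functions of strings for []byte values
	fmt.Println(string(bytes.ToUpper([]byte("go")))) // GO

	// strconv also converts numbers back to strings
	fmt.Println(strconv.Itoa(sum) + "!") // 42!
}
```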