Learn Golang Tutorials - Rune Types Explained with examples

What is the Rune data type in Golang?

Rune is a data type in the Go programming language.

A rune holds an int32 value, and each value identifies a Unicode code point (https://en.wikipedia.org/wiki/Code_point).

It can represent any character from any alphabet in any language in the world.

Each language has its own characters. English, for example, uses characters such as ‘B’, ‘b’, and ‘&’, while Chinese uses characters such as ‘我’.

Unlike some other programming languages, Go doesn’t have a dedicated character data type.

Instead, byte and rune are used to represent characters. A byte can hold an ASCII code. A rune holds a Unicode code point, which a string stores as one to four UTF-8 encoded bytes.

ASCII characters and the byte type:

The ASCII character set contains 128 characters with integer values from 0 to 127, so any of them can be stored in an 8-bit type, i.e. byte.

Unicode characters and the rune type:

Unicode is a superset of the ASCII character set. Its code points range from 0 to 0x10FFFF, which fits in 21 bits. Not every integer in that range is mapped to a character; some are reserved or unassigned.
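The difference between the two types can be sketched as follows. The helper isASCII is hypothetical (it is not part of the standard library) and only checks whether a code point falls in the ASCII range.

```go
package main

import "fmt"

// isASCII is a hypothetical helper for illustration: it reports whether
// the code point r lies in the ASCII range 0..127.
func isASCII(r rune) bool {
	return r >= 0 && r <= 127
}

func main() {
	var b byte = 'B' // an ASCII character fits in a byte
	var r rune = '我' // a non-ASCII character needs a rune

	fmt.Printf("%c %d ascii=%t\n", b, b, isASCII(rune(b)))
	fmt.Printf("%c %d ascii=%t\n", r, r, isASCII(r))
}
```

Assigning '我' to a byte variable would not compile, because its code point does not fit in 8 bits.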

The following are important points about the rune type:

  • It is an alias for int32, so it holds values from -2^31 to 2^31 - 1, although valid Unicode code points only go up to U+10FFFF
  • Represents a Unicode code point
  • A rune literal is a character enclosed in single quotes
  • A string can be converted to a slice of runes ([]rune) and back
  • rune is a predeclared identifier in Go, not a keyword
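Because rune is an alias for int32, the two types can be used interchangeably without a conversion. A minimal sketch, where addOne is a hypothetical helper written only for this illustration:

```go
package main

import (
	"fmt"
	"math"
)

// addOne is a hypothetical helper that takes an int32. Because rune is
// an alias for int32, a rune can be passed to it without conversion.
func addOne(v int32) int32 {
	return v + 1
}

func main() {
	var r rune = 'A'
	next := addOne(r) // no conversion needed
	fmt.Printf("%c %d\n", next, next)

	// The value range of a rune is the value range of int32.
	fmt.Println(math.MinInt32, math.MaxInt32)
}
```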

How to declare a rune type variable in Golang?

You can create a rune variable like any other data type.

var runeVariable rune // filled with default zero value -  U+0000

If the variable is declared without an assigned value, the compiler fills it with the default zero value, i.e. U+0000

var runeVariable rune = 'Z'

In a single statement, the variable is declared and assigned a character literal value.

Please note that character literals are always enclosed in single quotes. You can also omit the rune type when declaring a variable and assigning a literal; the compiler will infer the type as rune. So the default type for a character literal is rune.

var charVariable = 'C' // compiler infers its type as rune for character literals

Like any other variable, you can use the short variable declaration operator (:=), omitting both the var keyword and the rune type

runeVariable := 'Z'

The compiler infers the type from the right-hand value.

How to create Rune type using Rune Literal values?

Rune literals represent a single Unicode character enclosed in single quotes. The character can be written directly, or as an escape sequence: \u followed by exactly four hexadecimal digits, or \U followed by exactly eight hexadecimal digits. Some examples of rune literals:

var v1 = 'B'      // same as '\u0042'
var v2 = '\u2665' // same as '♥'
var letter = '\n' // same as '\u000A'

Unicode assigns each character an integer called a code point. By convention, a code point is written as U+ followed by its hexadecimal value, for example U+0042 for 'B'.

var charVariable = 'C' // Type inferred as `rune` (Default type for character values)
var myvariable rune = '♥'
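The different ways of writing a rune literal can be compared with a short sketch. The helper describe is hypothetical; it only formats a rune with the %c and %U verbs of the fmt package.

```go
package main

import "fmt"

// describe is a hypothetical helper that formats a rune as
// "<character> <code point>", using the %c and %U verbs.
func describe(r rune) string {
	return fmt.Sprintf("%c %U", r, r)
}

func main() {
	fmt.Println(describe('B'))          // written directly
	fmt.Println(describe('\u0042'))     // \u with four hex digits, same rune as 'B'
	fmt.Println(describe('\u2665'))     // the heart symbol
	fmt.Println(describe('\U0001F600')) // \U with eight hex digits
}
```

The \U form is needed for code points above U+FFFF, which do not fit in four hexadecimal digits.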

Let’s see some examples using the rune data type in Go.

How to check whether a variable is a rune?

rune is an alias for int32 used to represent Unicode characters. Because it is an alias and not a distinct type, reflect.TypeOf called on a rune variable returns int32.

package main

import (
    "fmt"
    "reflect"
)

func main() {
    variable := 'A'
    fmt.Printf("%d \n", variable)
    fmt.Println(reflect.TypeOf(variable))
}

The output of the above program is

65
int32
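The same check can be done without the reflect package by using the %T verb of fmt. This is a minimal sketch; typeName is a hypothetical helper, not a standard library function.

```go
package main

import "fmt"

// typeName is a hypothetical helper that returns the dynamic type of v
// as text, using the %T verb of the fmt package.
func typeName(v interface{}) string {
	return fmt.Sprintf("%T", v)
}

func main() {
	var r rune = 'A'
	var i int32 = 65

	fmt.Println(typeName(r)) // int32
	fmt.Println(typeName(i)) // int32
	fmt.Println(r == i)      // true: same type, same value
}
```

Since both variables report int32, a rune cannot be told apart from an int32 at run time.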

How to Iterate over a string using rune type?

A string holds UTF-8 encoded bytes. The range form of the for loop decodes it one rune at a time, yielding the starting byte index and the rune at each step.

package main

import "fmt"

func main() {
    str1 := "kiran"
    // range decodes the string one rune at a time; i is the byte offset
    for i, r := range str1 {
        fmt.Printf("%d: %c:%U\n", i, r, r)
    }
}

When the above code is compiled and executed, the output is as follows

0: k:U+006B
1: i:U+0069
2: r:U+0072
3: a:U+0061
4: n:U+006E
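In the output above the index increases by one each time because every character is ASCII. With a multi-byte character the index is a byte offset and can jump. A minimal sketch, where runeOffsets is a hypothetical helper:

```go
package main

import "fmt"

// runeOffsets is a hypothetical helper that returns the byte offset at
// which each rune of s starts.
func runeOffsets(s string) []int {
	var offsets []int
	for i := range s {
		offsets = append(offsets, i)
	}
	return offsets
}

func main() {
	s := "a♥b"
	for i, r := range s {
		fmt.Printf("%d: %c\n", i, r)
	}
	// ♥ takes three bytes in UTF-8, so 'b' starts at offset 4.
	fmt.Println(runeOffsets(s)) // [0 1 4]
}
```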

How to check the Unicode code point and integer value of a character or string?

A string can be processed as a sequence of runes. Iterate over it using the range form of the for loop and, for each rune, print the character, its integer value, and its Unicode code point.

The following program prints the Unicode code point of each character in a string, and of a single rune

package main

import "fmt"

func main() {
    // input is a string
    str := "my♥"
    for _, r := range str {
        fmt.Printf("%c - %d - %U\n", r, r, r) // character - integer - Unicode code point
    }

    // input is a single character, i.e. a rune
    char := '♥'
    fmt.Printf("%c - %d - %U\n", char, char, char)
}

The output of the above program is

m - 109 - U+006D
y - 121 - U+0079
♥ - 9829 - U+2665
♥ - 9829 - U+2665

How to Convert String to/from rune array or slices?

When a string is converted to a rune slice, a new slice is returned and each element contains one Unicode code point.

If the string contains bytes that are not valid UTF-8, each such byte is converted to 0xFFFD, the Unicode replacement character.

Here is an example program that converts a string to a rune slice

package main

import (
    "fmt"
)

func main() {

    strVariable := "my♥"
    runeArray := []rune(strVariable)
    fmt.Printf("%v\n", runeArray) // code points as decimal integers
    fmt.Printf("%U\n", runeArray) // code points in U+ notation

}

The output of the above program is

[109 121 9829]
[U+006D U+0079 U+2665]
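The handling of invalid UTF-8 mentioned earlier can be sketched as follows. The helper toRunes is hypothetical and simply wraps the conversion; utf8.ValidString and utf8.RuneError come from the standard unicode/utf8 package.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// toRunes is a hypothetical helper that wraps the string to []rune
// conversion so the result can be inspected.
func toRunes(s string) []rune {
	return []rune(s)
}

func main() {
	// "\xff" is a single byte that never appears in valid UTF-8.
	s := "a\xffb"

	runes := toRunes(s)
	fmt.Printf("%U\n", runes)               // [U+0061 U+FFFD U+0062]
	fmt.Println(utf8.ValidString(s))        // false
	fmt.Println(runes[1] == utf8.RuneError) // true
}
```

Note that a rune equal to utf8.RuneError does not prove the input was invalid, because a string may legitimately contain U+FFFD. utf8.ValidString is the safer check.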

When you convert a rune slice to a string, all the Unicode code points in the slice are encoded in UTF-8 and appended to a new string.

If any value is outside the range of valid Unicode code points, it is converted to \uFFFD, which is displayed as the symbol �.

Here is an example program that converts a rune slice to a string

package main

import (
    "fmt"
)
func main() {
    // -1 is not a valid code point, so it becomes U+FFFD in the string
    runeArray := []rune{'\u006D', '\u0079', '\u2665', -1}
    str := string(runeArray)
    fmt.Println(str) // my♥�

}

Output:

my♥�
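To detect such values before converting, the standard utf8.ValidRune function can be used. The sketch below assumes a hypothetical helper safeString that converts a rune slice and also reports whether every rune was valid.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// safeString is a hypothetical helper: it converts runes to a string and
// also reports whether every rune was a valid Unicode code point.
func safeString(runes []rune) (string, bool) {
	ok := true
	for _, r := range runes {
		if !utf8.ValidRune(r) {
			ok = false
		}
	}
	return string(runes), ok
}

func main() {
	s, ok := safeString([]rune{'m', 'y', '\u2665'})
	fmt.Println(s, ok)

	// -1 is out of range and 0xD800 is a surrogate half; both become U+FFFD.
	s, ok = safeString([]rune{'m', 'y', -1, 0xD800})
	fmt.Printf("%q %t\n", s, ok)
}
```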

How to get the length of a string in runes and bytes

The program below gets the length of a string in bytes and in runes.

The built-in len() function returns the length in bytes. To get the length in runes, the unicode/utf8 package is used: its RuneCountInString() function accepts a string and returns the number of runes in it as an integer.

package main

import (
    "fmt"
    "unicode/utf8"
)
func main() {
    str := "This is test string"
    fmt.Println(len(str)) // return the length in bytes
    fmt.Println(utf8.RuneCountInString(str)) // return the length in runes

}

Output:

19
19
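Both values are equal here because the string contains only ASCII characters, each of which takes one byte. With a non-ASCII character the two lengths differ. A minimal sketch, where lengths is a hypothetical helper:

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// lengths is a hypothetical helper that returns the size of s in bytes
// and in runes.
func lengths(s string) (byteLen, runeLen int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	b, r := lengths("my♥")
	// 'm' and 'y' take one byte each, ♥ takes three.
	fmt.Println(b, r) // 5 3
}
```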

Conclusion

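In this tutorial, you learned about the rune type in Golang: how to declare it, write rune literals, iterate over strings, convert between strings and rune slices, and measure string length, with examples.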