Friday, November 30, 2018

Learn Golang Tutorials - Rune Types Explained with examples

Golang Rune type introduction

Rune is a data type in the Go programming language. A rune literal is a character enclosed in single quotes that represents an int32 value, and each value maps to a Unicode code point (https://en.wikipedia.org/wiki/Code_point).

A rune can represent any character of any alphabet from any language in the world.
Each character has its own code point. For example, English text uses characters such as 'B', 'b', and '&', and Chinese text uses characters such as '我'.

Unlike many other programming languages, Go does not have a dedicated character data type. The byte and rune types are used to represent character values. A byte holds an ASCII code,
and a rune holds a Unicode code point. Go strings store their text as bytes encoded in UTF-8.
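The difference between byte and rune can be sketched with a short program. The helper name describe is hypothetical and only serves the illustration.

```go
package main

import "fmt"

// describe reports how many bytes and how many runes the string s holds.
func describe(s string) (byteCount, runeCount int) {
	return len([]byte(s)), len([]rune(s))
}

func main() {
	var b byte = 'B' // an ASCII character fits in 8 bits
	var r rune = '我' // this character needs more than 8 bits
	fmt.Println(b, r) // 66 25105

	nb, nr := describe("B我")
	fmt.Println(nb, nr) // 4 2
}
```

The string "B我" holds two characters but four bytes, because '我' takes three bytes in UTF-8.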

ASCII characters
The ASCII set contains 128 characters with integer values from 0 to 127. Each value fits in an 8-bit type, i.e. byte.

Unicode characters
Unicode is a superset of the ASCII character set. Its code points range from 0 to 0x10FFFF, which fits in 21 bits. Not every integer in that range maps to a character; some are reserved.
Following are important points about the rune type:
  • It is an alias for int32, which means it holds values from -2^31 to 2^31 - 1
  • It represents a Unicode code point
  • A rune literal is a character enclosed in single quotes
  • A string can be converted to a slice of runes, []rune, and back
  • rune is a predeclared identifier in Go, not a keyword
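
As a minimal sketch of the points above, the program below treats a rune as an int32 without any conversion. The function name nextLetter is hypothetical.

```go
package main

import "fmt"

// nextLetter returns the rune one code point after r.
func nextLetter(r rune) rune {
	return r + 1
}

func main() {
	var r rune = 'a'
	var n int32 = r // no conversion needed: rune and int32 are the same type

	fmt.Println(n)                        // 97
	fmt.Printf("%c\n", nextLetter(r))     // b
	fmt.Println(string([]rune{'G', 'o'})) // Go
}
```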

Rune type variable creation

You can create a rune variable like a variable of any other data type.
var runeVariable rune // filled with default zero value - U+0000
If the variable is declared without an assigned value, the compiler fills it with the zero value, i.e. U+0000.
var runeVariable rune = 'Z'
In a single statement, the variable is declared and assigned a character literal value. Note that character literals are always enclosed in single quotes. You can also omit the rune type name: if you declare a variable and assign a character literal, the compiler infers the type as rune. So rune is the default type for a character literal.
var charVariable = 'C' // compiler infers its type as rune for character literals
Like any other variable declaration inside a function, you can use the short declaration operator := and omit both the var keyword and the rune type name.
runeVariable := 'Z'
The compiler infers the type from the right-hand value.
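
The declaration forms above can be combined into one runnable sketch. The variable names and the helper typeName are hypothetical.

```go
package main

import "fmt"

// typeName returns the type name that fmt reports for a rune value.
func typeName(r rune) string {
	return fmt.Sprintf("%T", r)
}

func main() {
	var zero rune           // zero value
	var declared rune = 'Z' // explicit type
	var inferred = 'C'      // type inferred from the literal
	short := 'Z'            // short declaration, valid only inside a function

	fmt.Printf("%U\n", zero)                            // U+0000
	fmt.Printf("%c %c %c\n", declared, inferred, short) // Z C Z
	fmt.Println(typeName(inferred))                     // int32
}
```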

Create Rune type using Rune Literal values 

Rune literals represent single Unicode characters. A rune literal is enclosed in single quotes and holds either the character itself or an escape sequence: \u followed by exactly 4 hexadecimal digits, or \U followed by exactly 8 hexadecimal digits. Some examples of rune literals:
var v1 = 'B' // same as '\u0042'
var v2 = '\u2665' // same as '♥'
var letter = '\n' // same as '\u000A'
Unicode code points
Each character is mapped to an integer number. These numbers are called code points. A code point is written as U+ followed by its hexadecimal value, for example U+2665.
var charVariable = 'C' // Type inferred as `rune` (Default type for character values)
var myvariable rune = '♥'
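
The sketch below compares the different ways of writing the same rune literal. The helper name codePoint is hypothetical. The \x form, with exactly 2 hexadecimal digits, is another escape that Go accepts in rune literals.

```go
package main

import "fmt"

// codePoint returns the numeric code point of r.
func codePoint(r rune) int {
	return int(r)
}

func main() {
	// Four spellings of the same rune.
	fmt.Println('B' == '\u0042', 'B' == '\U00000042', 'B' == '\x42') // true true true

	fmt.Println(codePoint('\u2665')) // 9829
	fmt.Println(codePoint('\n'))     // 10

	// Code points above U+FFFF need the 8-digit \U form.
	fmt.Printf("%c %U\n", '\U0001F600', '\U0001F600') // 😀 U+1F600
}
```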

rune usage Examples

Check rune variable type 

rune is an alias for int32 and represents a Unicode character. The reflect.TypeOf function reports the type of a rune variable as int32.


package main

import (
 "fmt"
 "reflect"
)

func main() {
 variable := 'A'
 fmt.Printf("%d \n", variable)
 fmt.Println(reflect.TypeOf(variable))
}
The output of the above program is


65
int32

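Iterate over the runes of a string

A string is a sequence of bytes that usually holds UTF-8 encoded text. The range form of the for loop decodes the string and iterates over its runes. The index is the byte offset at which each rune starts.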
package main

import "fmt"

func main() {
 str1 := "kiran"
 // i is the byte offset of each rune, r is the rune itself
 for i, r := range str1 {
  fmt.Printf("%d: %c:%U\n", i, r, r)
 }
}
When the above code is compiled and executed, the output is as follows
0: k:U+006B
1: i:U+0069
2: r:U+0072
3: a:U+0061
4: n:U+006E
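
With an ASCII-only string the index grows by one on each step. With multi-byte characters it jumps, because the index is a byte offset. A minimal sketch, with the hypothetical helper runeOffsets:

```go
package main

import "fmt"

// runeOffsets returns the byte offset at which each rune of s starts.
func runeOffsets(s string) []int {
	var offsets []int
	for i := range s {
		offsets = append(offsets, i)
	}
	return offsets
}

func main() {
	// '♥' takes three bytes in UTF-8, so 'b' starts at offset 4.
	fmt.Println(runeOffsets("a♥b")) // [0 1 4]
}
```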

Check the Unicode code point and integer value of a character or string

A string can be treated as a sequence of runes. The range form of the for loop iterates over each rune, and the program below prints the character, its integer value, and its Unicode code point in U+ notation. It does this for both a string and a single character.
package main

import "fmt"

func main() {

 // input is string
 str := "my♥"
 for _, r := range str {
  fmt.Printf("%c - %d - %U\n", r, r, r) // character - integer - unicode
 }
 // input is character ie rune
 char := '♥'
 fmt.Printf("%c - %d - %U\n", char, char, char)

}
Output of the above program is

m - 109 - U+006D
y - 121 - U+0079
♥ - 9829 - U+2665
♥ - 9829 - U+2665

Convert String to/from rune array or slices

When a string is converted to a rune slice, a new slice is returned in which each element holds one Unicode code point. If the string contains invalid UTF-8, each invalid byte is converted to 0xFFFD. Here is an example program that converts a string to a rune slice:
package main

import (
 "fmt"
)

func main() {

 strVariable := "my♥"
 runeArray := []rune(strVariable)
 fmt.Printf("%v\n", runeArray) // code points as integers
 fmt.Printf("%U\n", runeArray) // code points in U+ notation

}
The output of the above program is
[109 121 9829]
[U+006D U+0079 U+2665]
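
The handling of invalid UTF-8 can be sketched as follows. The helper name decode and the sample string are assumptions made for illustration. utf8.ValidString and utf8.RuneError come from the standard unicode/utf8 package.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// decode converts s to a slice of code points.
func decode(s string) []rune {
	return []rune(s)
}

func main() {
	s := "a\xffb" // \xff is a single byte that is never valid in UTF-8

	fmt.Println(utf8.ValidString(s))            // false
	fmt.Printf("%U\n", decode(s))               // [U+0061 U+FFFD U+0062]
	fmt.Println(decode(s)[1] == utf8.RuneError) // true
}
```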
When you convert a rune slice to a string, every code point in the slice is encoded in UTF-8 and appended to a new string. If a value is outside the range of valid Unicode code points, it is converted to \uFFFD, which is displayed as the symbol �. Here is an example program that converts a rune slice to a string:
package main

import (
 "fmt"
)
func main() {
 runeArray := []rune{'\u006D', '\u0079', '\u2665', -1}
 str := string(runeArray)
 fmt.Println(str) // my♥�

}
The output is
my♥�

How to get the length of a string in runes and bytes

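The program below gets the length of a string in bytes and in runes. The built-in len function returns the number of bytes. To count runes, the unicode/utf8 package is used: its RuneCountInString() function accepts a string and returns the number of runes as an integer. Because the example string contains only ASCII characters, both lengths are equal.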
package main

import (
 "fmt"
 "unicode/utf8"
)
func main() {
 str := "This is test string"
 fmt.Println(len(str))                    // return the length in bytes
 fmt.Println(utf8.RuneCountInString(str)) // return the length in runes

}
Output is
19
19
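
The two lengths differ as soon as the string contains a character outside ASCII. A minimal sketch, with the hypothetical helper lengths:

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// lengths returns the length of s in bytes and in runes.
func lengths(s string) (byteLen, runeLen int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	b, r := lengths("my♥")
	fmt.Println(b, r) // 5 3
}
```

Here '♥' counts as one rune but takes three bytes in UTF-8.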
