Strings
In Julia, as in other programming languages, a string is a sequence of zero or more characters and can be created using double quotes.
julia> str = "Hello, world."
"Hello, world."
Strings are immutable and therefore cannot be changed after creation. However, it is simple to create a new string from parts of existing strings. Individual characters of a string can be accessed via square brackets and indices (the same syntax as for arrays).
julia> str[1] # returns the first character
'H': ASCII/Unicode U+0048 (category Lu: Letter, uppercase)
The return type, in this case, is a Char.
julia> typeof(str[1])
Char
A Char value represents a single character. It is just a 32-bit primitive type with a special literal representation and appropriate arithmetic behaviour. Chars can be created using single quotes (apostrophes).
julia> 'x'
'x': ASCII/Unicode U+0078 (category Ll: Letter, lowercase)
It is also possible to convert a character to the numeric value of its Unicode code point and vice versa.
julia> Int('x')
120
julia> Char(120)
'x': ASCII/Unicode U+0078 (category Ll: Letter, lowercase)
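The arithmetic behaviour of Char mentioned above can be illustrated with a minimal sketch. It is not an exhaustive list of supported operations, and the variable names are chosen only for this illustration.

```julia
# Adding an integer to a Char moves along the Unicode code points.
next_char = 'a' + 1

# Subtracting two Chars gives the integer distance between their code points.
distance = 'z' - 'a'

# Chars can be compared; the comparison follows code point order.
is_ordered = 'a' < 'b'

println(next_char, " ", distance, " ", is_ordered)
```

Adding two Chars is not defined, since the sum of two code points has no natural meaning as a character.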
Substrings from the existing string can be extracted via square brackets. The indexing syntax is similar to the one for arrays.
julia> str[1:5] # returns the first five characters
"Hello"
julia> str[[1,2,5,6]]
"Heo,"
We used the range 1:5 to access the first five elements of the string (further details on ranges are given in the section on arrays). Be aware that the expressions str[k] and str[k:k] do not give the same results.
julia> str[1] # returns the first character as Char
'H': ASCII/Unicode U+0048 (category Lu: Letter, uppercase)
julia> str[1:1] # returns the first character as String
"H"
When using strings, we have to pay attention to the following characters with special meaning: \, ", and $. To use them as regular characters, they need to be escaped with a backslash (\). For example, an unescaped double quote (") would end the string prematurely, forcing the rest to be interpreted as Julia code. This is a common malicious attack vector called code injection.
julia> str1 = "This is how a string is created: \"string\"."
"This is how a string is created: \"string\"."
Similarly, the dollar sign is reserved for string interpolation (explained later in this text). If we want to use it as a regular character, we have to escape it with a backslash too.
julia> str2 = "\$\$\$ dollars everywhere \$\$\$"
"\$\$\$ dollars everywhere \$\$\$"
julia> "The $ will be fine."
ERROR: ParseError:
# Error @ none:1:7
"The $ will be fine."
# └ ── identifier or parenthesized expression expected after $ in string
[...]
No, it won't. If the dollar sign is used incorrectly, Julia throws an error. Strings can be printed with the print function or the println function, which also adds a newline at the end of the string.
julia> println(str1)
This is how a string is created: "string".
julia> println(str2)
$$$ dollars everywhere $$$
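The difference between print and println can be made visible by capturing their output. The sketch below assumes the standard sprint function, which calls the given function with an IO buffer and returns what was written as a string.

```julia
# Capture what print and println write for the same input.
without_newline = sprint(print, "Hello")
with_newline = sprint(println, "Hello")

# repr shows the captured strings with escape sequences visible.
println(repr(without_newline))
println(repr(with_newline))

# Both functions accept several arguments and print them one after another.
println("a = ", 1, ", b = ", 2.5)
```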
There is one exception to using quotes inside a string: quotes without backslashes can be used in multi-line strings. Multi-line strings can be created using triple quotes syntax as follows:
julia> mstr = """
This is how a string is created: "string".
"""
"This is how a string is created: \"string\".\n"
julia> print(mstr)
This is how a string is created: "string".
This syntax is usually used for docstrings of functions. The string will have the same form after printing it in the REPL.
julia> str = """
        Hello,
        world.
       """
" Hello,\n world.\n"
julia> print(str)
 Hello,
 world.
Create a string with the following text
    Quotation is the repetition or copy of someone else's statement or thoughts.
    Quotation marks are punctuation marks used in text to indicate a quotation.
    Both of these words are sometimes abbreviated as "quote(s)".
and print it in the REPL. The printed string should look the same as the text above, i.e., each sentence should be on a separate line. Use an indent of length 4 for each sentence.
Solution:
There are two basic ways to get the right result. The first is to use a multi-line string and write the message in the correct form.
julia> str = """
    Quotation is the repetition or copy of someone else's statement or thoughts.
    Quotation marks are punctuation marks used in text to indicate a quotation.
    Both of these words are sometimes abbreviated as "quote(s)".
""";
julia> println(str)
    Quotation is the repetition or copy of someone else's statement or thoughts.
    Quotation marks are punctuation marks used in text to indicate a quotation.
    Both of these words are sometimes abbreviated as "quote(s)".
We do not have to add backslashes to escape quotation marks in the text. The second way is to use a regular string and the newline symbol \n. In this case, it is necessary to use backslashes to escape quotation marks. Also, we have to add four spaces before each sentence to get the proper indentation.
julia> str = "    Quotation is the repetition or copy of someone else's statement or thoughts.\n    Quotation marks are punctuation marks used in text to indicate a quotation.\n    Both of these words are sometimes abbreviated as \"quote(s)\".";
julia> println(str)
    Quotation is the repetition or copy of someone else's statement or thoughts.
    Quotation marks are punctuation marks used in text to indicate a quotation.
    Both of these words are sometimes abbreviated as "quote(s)".
String concatenation and interpolation
One of the most common operations on strings is concatenation. It can be done using the string function, which accepts any number of input arguments and converts them to a single string.
julia> string("Hello,", " world")
"Hello, world"
Note that it is possible to concatenate strings with numbers and other types that can be converted to strings.
julia> a = 1.123
1.123
julia> string("The variable a is of type ", typeof(a), " and its value is ", a)
"The variable a is of type Float64 and its value is 1.123"
In general, it is not possible to perform mathematical operations on strings, even if the strings look like numbers. However, there are two exceptions. The * operator performs string concatenation.
julia> "Hello," * " world"
"Hello, world"
Unlike the string function, which works for other types, this approach can only be applied to strings and characters. The second exception is the ^ operator, which performs repetition.
julia> "Hello"^3
"HelloHelloHello"
The example above is equivalent to calling the repeat function.
julia> repeat("Hello", 3)
"HelloHelloHello"
Using the string function to concatenate strings can be cumbersome due to long expressions. To simplify string construction, Julia allows interpolation into string literals with the $ symbol.
julia> a = 1.123
1.123
julia> string("The variable a is of type ", typeof(a), " and its value is ", a)
"The variable a is of type Float64 and its value is 1.123"
julia> "The variable a is of type $(typeof(a)), and its value is $(a)"
"The variable a is of type Float64, and its value is 1.123"
We use parentheses to delimit expressions that should be interpolated into a string. They can be omitted when interpolating a single variable, but using them prevents mistakes. In the example below, we can see different results with and without parentheses.
julia> "$typeof(a)"
"typeof(a)"
julia> "$(typeof(a))"
"Float64"
In the case without parentheses, only the function name is interpolated into the string. In the second case, the value of the expression typeof(a) is interpolated into the string literal. This is more apparent when we declare a variable myfunc that refers to the typeof function:
julia> myfunc = typeof
typeof (built-in function)
julia> "$myfunc(a)"
"typeof(a)"
julia> "$(myfunc(a))"
"Float64"
Both concatenation and string interpolation call string to convert objects into string form. Most non-AbstractString objects are converted to strings closely corresponding to how they are entered as literal expressions.
julia> v = [1,2,3]
3-element Vector{Int64}:
1
2
3
julia> "vector: $v"
"vector: [1, 2, 3]"
julia> t = (1,2,3)
(1, 2, 3)
julia> "tuple: $(t)"
"tuple: (1, 2, 3)"
Print the following message for a given vector
"<vec> is a vector of length <len> with elements of type <type>"
where <vec> is the string representation of the given vector, <len> is the actual length of the given vector, and <type> is the type of its elements. Use the following two vectors.
a = [1,2,3]
b = [:a, :b, :c, :d]
Hint: use the length and eltype functions.
Solution:
We will show two ways to solve this exercise. The first way is to use the string function in combination with the length function to get the length of the vector, and the eltype function to get the type of its elements.
julia> a = [1,2,3];
julia> str = string(a, " is a vector of length ", length(a), " with elements of type ", eltype(a));
julia> println(str)
[1, 2, 3] is a vector of length 3 with elements of type Int64
The second way is to use string interpolation.
julia> b = [:a, :b, :c, :d];
julia> str = "$(b) is a vector of length $(length(b)) with elements of type $(eltype(b))";
julia> println(str)
[:a, :b, :c, :d] is a vector of length 4 with elements of type Symbol
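To avoid repeating the same expression for each vector, the interpolation can be wrapped in a small function. This is only a sketch: describe_vector is a hypothetical helper name introduced here, not part of Julia or of the exercise.

```julia
# describe_vector is a hypothetical helper, defined only for this illustration.
describe_vector(v) = "$(v) is a vector of length $(length(v)) with elements of type $(eltype(v))"

# Apply the helper to both vectors from the exercise.
for v in ([1, 2, 3], [:a, :b, :c, :d])
    println(describe_vector(v))
end
```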
Useful functions
A handy function is the join function, which concatenates the elements of a collection into a single string. Additionally, it supports defining a custom separator and a different separator for the last element.
julia> join(["apples", "bananas", "pineapples"], ", ", " and ")
"apples, bananas and pineapples"
In many cases, it is necessary to split a given string according to some conditions. In such cases, the split function can be used.
julia> str = "JuliaLang is a pretty cool language!"
"JuliaLang is a pretty cool language!"
julia> split(str)
6-element Vector{SubString{String}}:
"JuliaLang"
"is"
"a"
"pretty"
"cool"
"language!"
By default, the function splits the given string based on whitespace characters. This can be changed by defining a delimiter.
julia> split(str, " a ")
2-element Vector{SubString{String}}:
"JuliaLang is"
"pretty cool language!"
Julia also provides multiple functions that can be used to find specific characters or substrings in a given string. The contains function checks if the string contains a specific substring or character. Similarly, the occursin function determines if the specified string or character occurs in the given string. These two functions differ only in the order of arguments.
julia> contains("JuliaLang is pretty cool!", "Julia")
true
julia> occursin("Julia", "JuliaLang is pretty cool!")
true
Another useful function is endswith, which checks if the given string ends with the given substring or character. It can be used, for example, to check that a file name has the proper suffix.
julia> endswith("figure.png", "png")
true
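These checks are case-sensitive, which matters for file suffixes. The sketch below uses a made-up file name and the standard startswith and lowercase functions to show one possible way to make the check case-insensitive.

```julia
filename = "Figure.PNG"

# startswith is the counterpart of endswith for the beginning of a string.
has_prefix = startswith(filename, "Fig")

# The check is case-sensitive, so the upper-case suffix does not match ".png".
strict_match = endswith(filename, ".png")

# Converting to lowercase first makes the check case-insensitive.
relaxed_match = endswith(lowercase(filename), ".png")

println(has_prefix, " ", strict_match, " ", relaxed_match)
```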
Sometimes, it is necessary to find indices of characters in the string based on some conditions. For such cases, Julia provides several find functions.
julia> str = "JuliaLang is a pretty cool language!"
"JuliaLang is a pretty cool language!"
julia> findall(isequal('a'), str)
5-element Vector{Int64}:
5
7
14
29
33
julia> findfirst(isequal('a'), str)
5
julia> findlast(isequal('a'), str)
33
The first argument isequal('a') creates a function that checks if its argument equals the character a.
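The find functions accept other kinds of conditions as well. The following minimal sketch shows an anonymous function used as the condition, a search for a substring, and the result when there is no match.

```julia
str = "JuliaLang is a pretty cool language!"

# An anonymous function can be used in place of isequal('a').
first_a = findfirst(c -> c == 'a', str)

# Searching for a substring returns the range of indices of the first match.
lang_range = findfirst("Lang", str)

# If there is no match, the result is the value nothing.
no_match = findfirst(isequal('z'), str)

println(first_a)
println(lang_range)
println(no_match === nothing)
```

Because a failed search returns nothing, the result should be checked before it is used as an index.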
As we said before, strings are immutable and cannot be changed. However, we can easily create new strings. The replace function returns a new string in which occurrences of a substring are replaced with something else:
julia> replace("Sherlock Holmes", "e" => "ee")
"Sheerlock Holmees"
It is also possible to apply a function to each matching substring using the replace function. The following example shows how to change all e letters in the given string to uppercase.
julia> replace("Sherlock Holmes", "e" => uppercase)
"ShErlock HolmEs"
It is even possible to replace a whole substring:
julia> replace("Sherlock Holmes", "Holmes" => "Homeless")
"Sherlock Homeless"
Use the split function to split the following string
"Julia!"
into a vector of single-character strings.
Hint: we can say that an empty string "" separates the characters in the string.
Solution:
To split a string into single-character strings, we can use the split function with an empty string ("") as the delimiter.
julia> split("Julia!", "")
6-element Vector{SubString{String}}:
"J"
"u"
"l"
"i"
"a"
"!"