Analyze Text Data with String Arrays
This example shows how to store text from a file as a string array, sort the words by their
frequency, plot the result, and collect basic statistics for the words found in the file.
Import Text File to String Array
Read text from Shakespeare's Sonnets with the fileread function. fileread returns
the text as a 1-by-100266 character vector.
sonnets = fileread('sonnets.txt');
sonnets(1:35)
ans =
'THE SONNETS

by William Shakespeare'
Convert the text to a string using the string function. Then, split it on newline
characters using the splitlines function. sonnets becomes a 2625-by-1 string array,
where each string contains one line from the poems. Display the first five lines of
sonnets.
sonnets = string(sonnets);
sonnets = splitlines(sonnets);
sonnets(1:5)
ans = 5x1 string array
"THE SONNETS"
""
"by William Shakespeare"
""
""
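The conversion and split steps above can be tried on a small piece of text without the file. The following is a minimal sketch; the variable names txt, str, and textLines are hypothetical, and the sample text is built with sprintf so that it contains the same first 35 characters shown earlier.

```matlab
% Build a short character vector with two newline (line feed) characters.
% The names txt, str, and textLines are illustrative only.
txt = sprintf('THE SONNETS\n\nby William Shakespeare');

% Convert the character vector to a 1-by-1 string scalar.
str = string(txt);
strlength(str)              % 35 characters, counting the two newlines

% Split at the newlines; the blank line becomes an empty string.
textLines = splitlines(str);
size(textLines)             % 3-by-1 string array
textLines(2) == ""          % true: the middle element is the empty string
```

The empty element in the middle is the kind of entry that the cleaning step in the next section removes.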
Clean String Array
To calculate the frequency of the words in sonnets, first clean it by removing empty
strings and punctuation marks. Then reshape it into a string array that contains individual
words as elements.
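As a rough preview of this plan, here is a minimal sketch on a small hypothetical sample rather than on sonnets itself. It assumes R2017a or later for the double-quoted string syntax, and it uses erase, join, and split as one possible way to carry out the steps; the sample text and the variable names sample and words are illustrative.

```matlab
% Hypothetical three-element sample: two lines of text and one empty string.
sample = ["Shall I compare thee"; ""; "to a summer's day?"];

% Step 1: drop the elements that equal the empty string.
sample(sample == "") = [];

% Step 2: delete punctuation marks (only "?" and "," in this sketch).
sample = erase(sample, ["?" ","]);

% Step 3: join the lines into one string, then split at whitespace
% so that each element holds a single word.
words = split(join(sample));
numel(words)                % 8 words
```

Erasing punctuation outright keeps the sketch short. Replacing punctuation with a space character instead avoids fusing words that are separated only by a punctuation mark, at the cost of an extra step to remove the surplus whitespace.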
Remove the strings with zero characters ("") from the string array. Compare each
element of sonnets to "", the empty string. Starting in R2017a, you can create strings,