How to make and visualize a word cloud in R

in visualization •  7 years ago 

There are many packages in R for text processing.

It is thus possible to analyze a text, extract its most common words, and visualize that set of words as a cloud. This is where the name of the visualization type comes from: word cloud.

The R code for the word cloud visualization relies on two libraries. Install the following packages if you do not have them yet, then load them:

library(tm)

library(wordcloud)
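If the packages are not installed yet, the library() calls will fail. Here is a minimal sketch, using only base R, for checking which of them are missing before loading them. The helper name missing_packages is made up for this example.

```r
# Hypothetical helper: return the names of packages that are not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1),
                      quietly = TRUE, USE.NAMES = FALSE)
  pkgs[!installed]
}

to_install <- missing_packages(c("tm", "wordcloud"))
print(to_install)

# Uncomment to install whatever is missing from CRAN:
# if (length(to_install) > 0) install.packages(to_install)
```

The install line is left commented out so that running the snippet never changes your R library by surprise.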

What about the text we are going to visualize? It is the Wikipedia page of a country. We will find out which country from the word cloud itself. I copied that page into a .txt file. So here is the code. It is well commented to explain what each line does.

read the text file, line by line

page = readLines("italy.txt")

produce a corpus of the text

corpus = Corpus(VectorSource(page))

convert all of the text to lower case (standard practice for text processing)

corpus = tm_map(corpus, content_transformer(tolower))

remove any kind of punctuation

corpus = tm_map(corpus, removePunctuation)

remove all the numbers

corpus = tm_map(corpus, removeNumbers)

remove English stop words

corpus = tm_map(corpus, removeWords, stopwords("english"))
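To get a feel for what these cleaning steps do, here is a rough base R sketch applied to a single made-up sentence. It only approximates the tm transformations: clean_text is a hypothetical helper, it handles one string at a time, and the tiny stop word list stands in for the much longer stopwords("english").

```r
# Hypothetical mini stop word list; tm's stopwords("english") is much longer.
stop_words <- c("the", "is", "of", "a", "in", "and", "has")

# Hypothetical helper: clean one string roughly the way the tm steps do.
clean_text <- function(x, stop_words) {
  x <- tolower(x)                                    # lower case
  x <- gsub("[[:punct:]]", "", x)                    # drop punctuation
  x <- gsub("[[:digit:]]", "", x)                    # drop numbers
  words <- strsplit(x, "[[:space:]]+")[[1]]          # split into words
  words <- words[nzchar(words) & !(words %in% stop_words)]
  paste(words, collapse = " ")
}

cleaned <- clean_text("Rome is the capital of Italy, and Italy has 20 regions.",
                      stop_words)
print(cleaned)
# → "rome capital italy italy regions"
```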

create a term-document matrix (terms as rows, documents as columns)

dtm = TermDocumentMatrix(corpus)

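with some versions of tm, this step stops with an error if tolower was passed to tm_map directly instead of being wrapped in content_transformer(), because the documents are then plain character strings rather than tm text documents

Error: inherits(doc, "TextDocument") is not TRUE

if you hit this error, convert the corpus back to text documents and build the matrix again

corpus = tm_map(corpus, PlainTextDocument)
dtm = TermDocumentMatrix(corpus)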
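The error above is commonly avoided by wrapping base R functions such as tolower in tm's content_transformer(), so the corpus keeps holding tm text documents. Here is a small sketch on two made-up sentences. It assumes the tm package is installed and simply does nothing otherwise.

```r
# Assumes the tm package is installed; the sample sentences are made up.
has_tm <- requireNamespace("tm", quietly = TRUE)

if (has_tm) {
  library(tm)
  docs <- c("Rome is the CAPITAL of Italy", "ITALY has twenty regions")
  corpus <- VCorpus(VectorSource(docs))
  # content_transformer() keeps each element a tm text document,
  # so TermDocumentMatrix() accepts the corpus afterwards
  corpus <- tm_map(corpus, content_transformer(tolower))
  dtm <- TermDocumentMatrix(corpus)
  print(rownames(as.matrix(dtm)))
}
```

With this approach the extra PlainTextDocument conversion should not be needed.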

convert the term-document matrix to a standard matrix

m = as.matrix(dtm)

sum each word's counts across all lines and sort them, so the most frequent words come first and are drawn biggest

v = sort(rowSums(m), decreasing = TRUE)
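To see what kind of object v is without running tm, the same idea (count every word, then sort in decreasing order) can be sketched in base R. The sample words are made up, and table() stands in for the term-document matrix.

```r
# Made-up word list standing in for the cleaned text
words <- c("italy", "rome", "italy", "pasta", "rome", "italy")

# Count each distinct word, then sort so the most frequent comes first
counts <- table(words)
freq <- sort(setNames(as.integer(counts), names(counts)), decreasing = TRUE)
print(freq)
```

As in the article's v, the result is a named numeric vector: the names are the words and the values are their frequencies.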

finally produce the word cloud

wordcloud(names(v), v, min.freq = 10)
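The min.freq argument controls which words appear: words with a frequency below it are not plotted. The sketch below uses made-up frequencies to show which words would survive min.freq = 10. The drawing call is only run in an interactive session with wordcloud installed; max.words, random.order and colors are optional wordcloud() arguments, and the values chosen here are just an illustration.

```r
# Made-up frequencies, sorted like the article's v
v_demo <- c(italy = 42, italian = 25, rome = 12, pasta = 9, alps = 3)

# Words that would be drawn with min.freq = 10
shown <- v_demo[v_demo >= 10]
print(names(shown))

# Optional: draw the cloud, most frequent words placed first, at most 100 words
if (interactive() && requireNamespace("wordcloud", quietly = TRUE)) {
  wordcloud::wordcloud(names(v_demo), v_demo, min.freq = 10,
                       max.words = 100, random.order = FALSE,
                       colors = c("grey40", "steelblue", "darkred"))
}
```

If your text is short and the cloud comes out nearly empty, lowering min.freq is the first thing to try.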

Go ahead and try the example. Depending on your version of tm, the code may complain at some point with a warning or a small error, but in the end it will run, so no problem.

Here is the source of this article: http://www.datatreemap.com/vis4r/wordcloud_in_R.php

For more examples of data collection, analysis, and visualization, see http://www.datatreemap.com

P.S. Did you find out which country the Wikipedia page is about? Italy, of course ;-)



I am sorry, but is this R?

Yeap it is R programming :-)

Thanks for sharing.

You're welcome!

The @OriginalWorks bot has determined this post by @alketcecaj to be original material and upvoted it!
