Python Script 16: Generating a word cloud image from text using Python


A word cloud is an image composed of the words used in a particular text or subject, in which the size of each word indicates its frequency or importance.

In this Python script, we will generate a word cloud image from the text of a CNN news article.


Dependencies:

- wordcloud 1.5.0
- matplotlib 3.0.3

Create and activate a virtual environment, then install the dependencies in it.
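
To check that the environment matches the versions listed above, you can query the installed package metadata. The snippet below is a minimal sketch that assumes Python 3.8 or newer (for importlib.metadata); the helper name installed_version is made up for this illustration.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Report the two dependencies used in this article.
    for name in ("wordcloud", "matplotlib"):
        print(name, installed_version(name) or "not installed")
```

If either package is reported as not installed, the virtual environment is probably not activated.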


Image configurations:

The image we want to generate will use the following configuration.

# image configurations
background_color = "#101010"
height = 720
width = 1080


I copied the content of the news article into a text file. Read the file and split its contents into a list of words.

# Read the text file and split its contents into words
with open("/tmp/sample_text.txt", "r") as f:
    words = f.read().split()
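
Splitting on whitespace keeps punctuation attached to words, so "news," and "news" are counted separately. One possible way around this is sketched below with a regular expression. The function name tokenize and the pattern are assumptions made for this illustration, not part of the original script.

```python
import re


def tokenize(text):
    """Lowercase the text and return its words without surrounding punctuation.

    The pattern keeps ASCII letters and digits, plus straight apostrophes
    inside a word (as in "don't"). Curly apostrophes and accented letters
    are not handled by this simple sketch.
    """
    return re.findall(r"[a-z0-9]+(?:'[a-z0-9]+)*", text.lower())


if __name__ == "__main__":
    print(tokenize("Breaking news, breaking NEWS: don't panic."))
```

The resulting list can be used in place of the words list read from the file.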


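Now build a dictionary with words as keys and their frequencies as values. We will ignore stop words, that is, very common words such as "the" and "is" that carry little meaning. The stop_words collection used here holds those words; ways to obtain one are covered below.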

data = dict()

for word in words:
    # Normalize case so that "News" and "news" count as the same word
    word = word.lower()
    if word in stop_words:
        continue

    data[word] = data.get(word, 0) + 1
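
The same counting step can also be written with collections.Counter from the standard library. This is only an alternative sketch; the function name word_frequencies is hypothetical and does not appear in the original script.

```python
from collections import Counter


def word_frequencies(words, stop_words):
    """Return a {word: count} dict for the given words, ignoring stop words."""
    stop_words = set(stop_words)  # set membership tests are fast
    lowered = (word.lower() for word in words)
    counts = Counter(word for word in lowered if word not in stop_words)
    return dict(counts)


if __name__ == "__main__":
    sample = ["The", "cat", "the", "Cat", "sat"]
    print(word_frequencies(sample, {"the"}))
```

The returned dictionary has the same shape as data above, so it can be passed to generate_from_frequencies in the same way.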


You can get a list of stop words from the nltk library or from a resource available online.

import nltk
from nltk.corpus import stopwords

# Download the stop word corpus on first use
nltk.download('stopwords')
stop_words = set(stopwords.words('english'))
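
The complete script further down reads its stop words from a plain text file rather than from nltk. A small helper for that could look like the sketch below. The name load_stop_words is an assumption, and the file is assumed to contain whitespace-separated words.

```python
def load_stop_words(path):
    """Read whitespace-separated stop words from a text file into a set."""
    with open(path, "r", encoding="utf-8") as f:
        # Lowercase the entries so they match the lowercased article words.
        return {word.lower() for word in f.read().split()}
```

Returning a set rather than a list keeps the "word in stop_words" check fast for long articles.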


Now create a WordCloud object and initialize it with the image configuration.

from wordcloud import WordCloud

word_cloud = WordCloud(
    background_color=background_color,
    width=width,
    height=height
)

word_cloud.generate_from_frequencies(data)
word_cloud.to_file('image.png')


In the code above, the generate_from_frequencies method builds the cloud from the data dictionary, and to_file saves the generated image to a file.
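
matplotlib is listed as a dependency, but the script itself only writes the image to a file. If you would rather preview the cloud in a window, something along these lines should work. This is a sketch: the function name show_word_cloud and the figure sizing are assumptions, and it needs a display to be available.

```python
def show_word_cloud(frequencies, width=1080, height=720, background_color="#101010"):
    """Render a word cloud from a {word: count} dict and show it in a window."""
    # Imported here so that loading this module does not fail when the
    # third-party packages are missing.
    import matplotlib.pyplot as plt
    from wordcloud import WordCloud

    word_cloud = WordCloud(
        background_color=background_color,
        width=width,
        height=height,
    ).generate_from_frequencies(frequencies)

    # Size the figure in inches so it roughly matches the image at 100 dpi.
    plt.figure(figsize=(width / 100, height / 100))
    plt.imshow(word_cloud.to_array(), interpolation="bilinear")
    plt.axis("off")  # hide the axes, only the image matters
    plt.show()
```

Calling show_word_cloud(data) with the frequency dictionary built earlier opens the preview.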


The code is available on GitHub.


Complete Script:

"""
Python script to generate word cloud image.
Author - Anurag Rana
Read more on - https://www.pythoncircle.com
"""

from wordcloud import WordCloud

# image configurations
background_color = "#101010"
height = 720
width = 1080

# Load the stop words; a set makes the membership check below fast
with open("stopwords.txt", "r") as f:
    stop_words = set(f.read().split())

# Read the text file and split its contents into words
with open("/tmp/sample_text.txt", "r") as f:
    words = f.read().split()

# Count how often each word occurs, skipping stop words
data = dict()

for word in words:
    word = word.lower()
    if word in stop_words:
        continue

    data[word] = data.get(word, 0) + 1

word_cloud = WordCloud(
    background_color=background_color,
    width=width,
    height=height
)

word_cloud.generate_from_frequencies(data)
word_cloud.to_file('image.png')


For more details, see the official documentation of the wordcloud library.

© 2017-2020 Python Circle