To access the Gmail inbox and read emails, we will use Python's imaplib module. To connect to the Gmail server, we need the following information.
- IMAP server hostname. Its value will be 'imap.gmail.com' in our case.
- IMAP server port. The value will be 993, the port used for the Internet Message Access Protocol (IMAP) over TLS/SSL.
Gmail won't allow external applications to access the Inbox with the Google account password. To access the Gmail Inbox, we need to create an app password for our script.
Go to your Google account -> security -> Signing in to Google -> App Password and create a new App.
Click on the 'Generate' button and a new password will be generated for you to use in your Python script. This password will let us access the emails from Gmail.
Copy and paste this password into a separate text file. Do not hard-code the password in the main Python script.
import imaplib

def get_mail_client(email_address):
    IMAP_SERVER = "imap.gmail.com"
    IMAP_PORT = 993
    # Read the app password from a separate file instead of hard-coding it.
    with open("password.txt", "r") as f:
        password = f.read().strip()
    mail = imaplib.IMAP4_SSL(IMAP_SERVER, IMAP_PORT)
    mail.login(email_address, password)
    return mail
In the call imaplib.IMAP4_SSL(IMAP_SERVER, IMAP_PORT), both arguments are optional. If the server is omitted, localhost is used by default. If the port is omitted, 993 is used by default.
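The default port mentioned above is exposed by the standard library as a constant, so it can be checked without connecting. The sketch below also isolates the password-file step; the helper name read_app_password and its path argument are illustrative assumptions, not part of the original script.

```python
import imaplib
from pathlib import Path

# Port that imaplib.IMAP4_SSL falls back to when no port is passed.
DEFAULT_IMAPS_PORT = imaplib.IMAP4_SSL_PORT


def read_app_password(path="password.txt"):
    """Return the app password stored in a text file, stripped of whitespace.

    Hypothetical helper: the file name mirrors the one used in this article.
    """
    return Path(path).read_text(encoding="utf-8").strip()
```

When the script is finished with the mailbox, calling mail.logout() on the client closes the connection.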
To access a specific folder, we first need to select it. For example, to search emails in the Inbox folder, select the Inbox folder.
mail.select('INBOX')
To search the Spam folder,
mail.select('[Gmail]/Spam')
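To find out which folder names are available, the client's list() method returns one line per folder. The sketch below parses such a line; the regular expression and the helper name parse_folder_name are assumptions for illustration, and the sample lines only approximate what Gmail typically returns. Folder names containing non-ASCII characters are encoded and are not handled here.

```python
import re

# One line of an IMAP LIST response looks like: (flags) "delimiter" "name"
LIST_LINE = re.compile(rb'\((?P<flags>[^)]*)\) "(?P<delim>[^"]*)" "?(?P<name>[^"]*)"?')


def parse_folder_name(line):
    """Return the folder name from one LIST response line, or None if it does not match."""
    match = LIST_LINE.match(line)
    if match is None:
        return None
    return match.group("name").decode("utf-8")


# Illustrative sample lines (assumed shape, not captured from a real account).
sample_lines = [
    b'(\\HasNoChildren) "/" "INBOX"',
    b'(\\HasNoChildren \\Junk) "/" "[Gmail]/Spam"',
]
folder_names = [parse_folder_name(line) for line in sample_lines]
```

With a live connection, the lines would come from status, lines = mail.list().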
We can also search for a specific category (Promotions / Updates / Forums) in the Inbox.
def get_top_10_emails(category):
    # category can be 'promotions', 'updates' or 'forums'
    # assumes `mail` is the logged-in client with the Inbox selected
    # mail.uid returns a (status, data) tuple
    status, response = mail.uid('search', 'X-GM-RAW "category:' + category + '"')
    # get the list of email uids
    response = response[0].decode('utf-8').split()
    # reverse so that the most recent emails come first
    response.reverse()
    response = response[:10]
    return response
The above function will return a list of the uids of the 10 most recent emails in the specified category in the Inbox folder. A uid is a unique identifier for an email within a folder. Even if an email is deleted, the uids of the remaining emails stay the same. This is different from the message number (email id), which we will see later.
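The parsing step can be tried without a connection. The following sketch takes data shaped like the second element of a search response (a list holding one bytes object of space-separated ids) and returns the newest ids first; the helper name newest_first is an assumption made for this example.

```python
def newest_first(search_data, limit=10):
    """Return up to `limit` ids from an IMAP SEARCH response, most recent first."""
    ids = search_data[0].decode("utf-8").split()
    ids.reverse()
    # Slicing past the end of a list is safe, so no min() is needed.
    return ids[:limit]


# Illustrative response data, shaped like what a search call returns.
sample_ids = newest_first([b"101 102 103 104"], limit=2)
```

An empty search result arrives as an empty bytes object, which this helper turns into an empty list.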
Now we can iterate over this list to fetch email details.
for uid in get_top_10_emails(category):
    status, data = mail.uid('fetch', uid, '(RFC822)')
    msg = get_msg_object(data)
    print(msg)
The mail.uid('fetch', uid, '(RFC822)') call returns a status and a list that contains tuples of bytes objects. get_msg_object iterates over the items in the list and picks the second item from the first tuple it finds. Using the message_from_bytes function of the email module, it then returns the message object.
import email

def get_msg_object(data):
    for response_part in data:
        if isinstance(response_part, tuple):
            # response_part[1] holds the raw bytes of the email
            return email.message_from_bytes(response_part[1])
When msg is printed, the output will look like the following.
Delivered-To: your-email-address@gmail.com
Received: by 4567:123:123:123:0:0:0:0 with SMTP id 21dfr2888589nju; Sat, 17 Oct 2020 22:04:41 -0700 (PDT)
X-Received: by xxxx:xxx:xxxx:: with SMTP id q129xxxxxxxxxx.13.160xxxxxxxx04; Sat, 17 Oct 2020 22:04:41 -0700 (PDT)
ARC-Seal: i=1; a=rsa-sha256; t=1602997481; cv=none; d=google.com; s=arc-20160816; b=oTK3SGWhT5udjSpaqY1c2kEyHPp+XHIQhihpF07XSxs6ljTZl2J1pIYFVECfER2OMU AYiS/ENv1talc9GIn0hsWHaKixxxxxxxxxx+x/9E5nLt0/0qIWOKiNUOJ2X5 AlzRwxROayvStYYIZmqqWtZAprXjVSvz2ZGi1yZvF6V56JPiG6rhoUDbA+qX5g94zQ1I A+ZhPtfMoE3vJcVQ2nJgtZvK+xxxxxxxxxxxxxx+/wVkL1PCAfHB0OMnP2LqdKoXx ymE/OrtH96AvFMuVRi/hgQRiQU/9SrpmFypn1GDUSJrhYw/RgE1KoWHQ+hL/X64W6rJg 8oFg==
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=to:from:subject:message-id:feedback-id:date:mime-version :dkim-signature; bh=mAnxiGxOqVCr9z/t0t7s8dHkCKtcZLFHscPaqmyVmf4=; b=sVrizYJ9oyHDhLhmvoRPk6Q18p4eYFtCX2vEWlc2p+5X3riOqotK+qY7FiZU09Afsb XR8BO7uyBm3A33k2CMMd0qfAG50uvz8j79glhmKJP+bNgMHNtLB78Izl4pHDm0VUzfUc 6APb+kSJ9iChYvIssZC7c3Yxxxxxxxxxxxxxx++RSrAC0QayceqwbIe soZ++3BklOil2iMv6fOxnbVTa7MyJJL1s8YQlUfuUMG4ZryVT/sag1aYZ2PDoTQdYcf2 QrTSAHPiYG6JZKitdPpOW2BTiZGot0QGHbj5sSHe6FN3RFiRGraKM+Ibnij6eyo7dFB7 QK0w==
ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@accounts.google.com header.s=20161025 header.b=jHcNbSwC; spf=pass (google.com: domain of 36cylxwgtd7qhi-xxxxxxxx.aiiafy.xxxxxx.wig@gaia.bounces.google.com designates 209.85.220.73 as permitted sender) smtp.mailfrom=36cyLXwgTD7Qhi-lYjfsUWWiohnm.aiiafY.WigUholUalUhUxvv23agUcf.Wig@gaia.bounces.google.com; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=accounts.google.com
Return-Path: <36cyLXwgTD7Qhi-xxxxxx.aiiafY.xxxxxxxx.Wig@gaia.bounces.google.com>
Received: from mail-sor-f73.google.com (mail-sor-f73.google.com. [209.85.220.73]) by mx.google.com with SMTPS id k184sor1479766vsk.94.2020.10.17.22.04.41 for <your-email-address@gmail.com> (Google Transport Security); Sat, 17 Oct 2020 22:04:41 -0700 (PDT)
Received-SPF: pass (google.com: domain of 36cylxwgtd7qhi-xxxxxx.xxxx.wiguholualuhuxvv23agucf.wig@gaia.bounces.google.com designates 209.85.220.73 as permitted sender) client-ip=209.85.220.73;
Authentication-Results: mx.google.com; dkim=pass header.i=@accounts.google.com header.s=20161025 header.b=jHcNbSwC; spf=pass (google.com: domain of 36cylxwgtd7qhi-lyjfsuwwiohnm.xxxxxx.xxxxxxx.wig@gaia.bounces.google.com designates 209.85.220.73 as permitted sender) smtp.mailfrom=36cyLXwgTD7Qhi-lYjfsUWWiohnm.aiiafY.WigUholUalUhUxvv23agUcf.Wig@gaia.bounces.google.com; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=accounts.google.com
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=accounts.google.com; s=20161025; h=mime-version:date:feedback-id:message-id:subject:from:to; bh=mAnxiGxOqVCr9z/t0t7s8dHkCKtcZLFHscPaqmyVmf4=; b=jHcNbSwC0dEII0cWC2A4TaZHF3VCGPZZcWmuXXy/L5URXMJaEhJo0sjH2i39UA/bvC GLoGmnNw/I077ILDMfVXMQHP0DGpPp5m5vU3hG0k3/XXJKntyTF601OfeyuwvgjxOOm4 xxxxxxxxxxxxxxxxxxxxxxxx+zuhkiVQsNa24RZqTdX6TOm8+kzt7M ohumaD67phkS5b1Rb086k/jGtWcwqwJj6F7T3gdgT6aEcKQcdPvZ/rR/Ph6FdYPvxhJ2 u39hnNjUd1F/Q1ODv5ESuTOYgc8BMNQrMulEjVn52UlSmURnEMfefj/lEIX8DG5AeFWr PKwQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:date:feedback-id:message-id:subject :from:to; bh=mAnxiGxOqVCr9z/t0t7s8dHkCKtcZLFHscPaqmyVmf4=; b=m/a4UtLRYq+5mTjTRJsPxVk4qaKkaov43DLr0B7zQ5st/uY1Bd5frWiN1BBRmyEDIS 31yw2XUWcrd89FRsLEgAQ4E16CS1A8/117uCaU5IMDpgR+vqPZFX514r1bzql2XRPU8Z xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx gJlVL3qZjqQwUOGiYQqj0BtumMqNz6wcGZoWmFy3vsugHhttdydtiSJs3phAL5qlmrQY cIfutlen+v3KYFFC6bOtCohB3KxtKSsCWFXsVKJb8HVEpcGC2OodVyALwHqsB1URFf7a yxQg==
X-Gm-Message-State: AOAM530j8mOsvWgyElljWsqH1qqyt+xxxxxx/GtDkY0nrnmvYnGVzr ZATPx3ISOMU7DxxnO7T+q39an0WsrOwn
X-Google-Smtp-Source: ABdhPJyBHNYWZ9UYMgsSI/xxxxxxxxxxxx+v2m4CN2Rdm/hEq1e/nhkvW+LpFOuWg4v5pNCnA==
MIME-Version: 1.0
X-Received: by 2002:a67:edcb:: with SMTP id xxxxxxxxx.11.1602997481361; Sat, 17 Oct 2020 22:04:41 -0700 (PDT)
Date: Sun, 18 Oct 2020 05:04:40 GMT
X-Account-Notification-Type: 20
Feedback-ID: 20:account-notifier
X-Notifications: xxxxxxxxxxx
Message-ID: <xxxxxxxxxx.0@notifications.google.com>
Subject: App password created
From: Google <no-reply@accounts.google.com>
To: your-email-address@gmail.com
Content-Type: multipart/alternative; boundary="00000000000073d34005b1eaef17"

--00000000000073d34005b1eaef17
Content-Type: text/plain; charset="UTF-8"; format=flowed; delsp=yes
Content-Transfer-Encoding: base64

IlB5dGhvblNjcmlwdCIgcGFzc3dvcmQgc3VjY2Vzc2Z1bGx5IGNyZWF0ZWQNCg0KDQoNCkhpIEFu
dXJhZywNCllvdSBoYXZlIHN1Y2Nlc3NmdWxseSBjcmVhdGVkIGFuIGFwcCBwYXNzd29yZCBmb3Ig
-------------- CONTENT REMOVED ---------------------------------------------
IGFjY291bnQuDQoNCsKpIDIwMjAgR29vZ2xlIExMQywgMTYwMCBBbXBoaXRoZWF0cmUgUGFya3dh
eSwgTW91bnRhaW4gVmlldywgQ0EgOTQwNDMsIFVTQQ0K
--00000000000073d34005b1eaef17
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

<!DOCTYPE html><html lang=3D"en"><head><meta name=3D"format-detection" cont=
ent=3D"email=3Dno"/><meta name=3D"format-detection" content=3D"date=3Dno"/>=
<style nonce=3D"xxxxxxxxxxxx">.awl a {color: #FFFFFF; text-decora=
tion: none;} .abml a {color: #000000; font-family: Roboto-Medium,Helvetica,=
------------ CONTENT REMOVED -----------------------------------------------
USA</div></td></tr></table></td></tr></table></td></tr></table></td><td wid=
th=3D"32" style=3D"width: 32px;"></td></tr><tr height=3D"32" style=3D"heigh=
t: 32px;"><td></td></tr></table></body></html>
--00000000000073d34005b1eaef17--
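The body in the output above is still base64 or quoted-printable encoded. As a minimal sketch, the readable text can be pulled out of a message object with the standard email API; the helper name get_plain_text_body is an assumption for this example, and the sample message is built locally rather than fetched from Gmail.

```python
import email
from email.message import EmailMessage


def get_plain_text_body(msg):
    """Return the decoded text/plain body of a message, or None if there is none."""
    for part in msg.walk():
        if part.get_content_type() == "text/plain":
            # decode=True undoes base64 / quoted-printable transfer encoding
            payload = part.get_payload(decode=True)
            charset = part.get_content_charset() or "utf-8"
            return payload.decode(charset, errors="replace")
    return None


# Build a small multipart/alternative message to try the helper on.
sample = EmailMessage()
sample["Subject"] = "App password created"
sample.set_content("Hello from the plain-text part")
sample.add_alternative("<p>Hello from the HTML part</p>", subtype="html")

parsed = email.message_from_bytes(sample.as_bytes())
print(get_plain_text_body(parsed).strip())
```

The same function can be applied to the msg object returned by get_msg_object.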
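Instead of uids, we can also work with message numbers (email ids). For example, to look at the most recent emails in the Spam folder:
# search in spam
mail.select('[Gmail]/Spam')
print("Looking in Spam")
id_list = get_email_ids(mail, "[Gmail]/Spam", max_mails_to_look=50)
for email_id in id_list:
    msg = get_email_msg(email_id)
    print(msg)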
The get_email_ids function returns a list of email ids in reverse order, i.e. the most recent email first.
def get_email_ids(mail, label='INBOX', criteria='ALL', max_mails_to_look=10):
    mail.select(label)
    # search returns a (status, data) tuple; data[0] holds space-separated ids
    status, data = mail.search(None, criteria)
    mail_ids = data[0]
    id_list = mail_ids.split()
    # reverse so that the latest emails are at the front
    id_list.reverse()
    id_list = id_list[:max_mails_to_look]
    return id_list
Please note that id_list is not a list of uids. These ids, or message numbers, can change if an email is deleted. Hence it is recommended to work with uids.
get_email_msg returns the message details for a given message number.
import email

def get_email_msg(email_id):
    # email_id arrives as bytes (e.g. b'12'); convert it to a string
    email_id = str(int(email_id))
    status, data = mail.fetch(email_id, '(RFC822)')
    for response_part in data:
        if isinstance(response_part, tuple):
            return email.message_from_bytes(response_part[1])
We can fetch any header from a message using the get method. To get the SPF, DMARC, and DKIM details from the message headers, first get the Authentication-Results header.
auth_results = msg.get("Authentication-Results", None)
auth_results will contain the value of the header (without the header name), as shown below.
mx.google.com; dkim=pass header.i=@accounts.google.com header.s=20161025 header.b=jHcNbSwC; spf=pass (google.com: domain of 36cylxwgtd7qhi-lyjfsuwwiohnm.aiiafy.wiguholualuhuxvv23agucf.wig@gaia.bounces.google.com designates 209.85.220.73 as permitted sender) smtp.mailfrom=36cyLXwgTD7Qhi-lYjfsUWWiohnm.aiiafY.WigUholUalUhUxvv23agUcf.Wig@gaia.bounces.google.com; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=accounts.google.com
Now we can parse this string to get the DKIM, SPF, and DMARC results.
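One possible way to do this parsing is a small regular-expression search per method. This is a sketch under the assumption that each result appears as method=result in the header value, as in the sample above; the helper name parse_auth_results is illustrative, and the sample string is shortened from the output shown earlier.

```python
import re


def parse_auth_results(auth_results):
    """Return a dict with the dkim, spf and dmarc results (None when absent)."""
    results = {}
    for method in ("dkim", "spf", "dmarc"):
        match = re.search(r"\b" + method + r"=(\w+)", auth_results or "", re.IGNORECASE)
        results[method] = match.group(1).lower() if match else None
    return results


# Shortened version of the header value shown in this article.
sample_value = (
    "mx.google.com; dkim=pass header.i=@accounts.google.com header.s=20161025; "
    "spf=pass (google.com: domain designates 209.85.220.73 as permitted sender); "
    "dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=accounts.google.com"
)
auth_summary = parse_auth_results(sample_value)
```

A message can carry more than one Authentication-Results header; msg.get_all("Authentication-Results") returns all of them if that matters for your check.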
Similarly, we can check if the unsubscribe header is present in the message.
def is_unsubscribe_present(msg):
    # membership tests on a message object ignore the case of the header name
    return "List-Unsubscribe" in msg
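If the header is present, its value usually lists one or more targets in angle brackets. A minimal sketch for extracting them follows; the helper name get_unsubscribe_targets and the sample addresses are assumptions for illustration.

```python
import email
import re


def get_unsubscribe_targets(msg):
    """Return the URLs / mailto links listed in the List-Unsubscribe header."""
    header = msg.get("List-Unsubscribe", "")
    return re.findall(r"<([^>]+)>", str(header))


# Illustrative message with a made-up unsubscribe header.
sample_msg = email.message_from_string(
    "List-Unsubscribe: <mailto:unsubscribe@example.com>, <https://example.com/unsubscribe>\n"
    "\n"
    "body\n"
)
targets = get_unsubscribe_targets(sample_msg)
```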
To search for an email by subject, we can iterate over the emails, get the subject of each one, and compare it with the expected subject.
def search_by_subject(email_ids_list, subject_substring):
    for email_id in email_ids_list:
        msg = get_email_msg(email_id)
        # skip ids for which no message could be fetched
        if msg is not None and "Subject" in msg:
            subject = str(msg.get("Subject", ""))
            print(subject)
            # case-insensitive substring match
            if subject_substring.lower() in subject.lower():
                return msg
    return None
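Subjects containing non-ASCII characters are often stored in an encoded form (for example =?utf-8?b?...?=), so a plain substring comparison may miss them. A minimal sketch for decoding the subject with the standard library is shown below; the helper name decode_subject is an assumption for this example.

```python
import email
from email.header import decode_header, make_header


def decode_subject(msg):
    """Return the Subject header as readable text, decoding encoded words if present."""
    raw = msg.get("Subject", "")
    return str(make_header(decode_header(raw)))


# Illustrative message whose subject is the base64-encoded text "Hello World".
encoded_msg = email.message_from_string(
    "Subject: =?utf-8?b?SGVsbG8gV29ybGQ=?=\n"
    "\n"
    "body\n"
)
decoded_subject = decode_subject(encoded_msg)
```

The decoded string can then be compared with subject_substring in the same way as in search_by_subject.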