Create your own Telegram bot with Django on Heroku – Part 4 – pull vs. push method

This entry is part 4 of 11 in the series Create your own Telegram bot with Django on Heroku

⚠️ This article is outdated and discontinued: in August 2022, Heroku decided to no longer offer the free tiers this article series relies on. Please see this post for details. ⚠️

In the previous part of this series, we started to get familiar with telepot, a Python module to interact with Telegram bots and had a short look at how the Telegram bot API is providing messages as JSON structures.

Today we will talk about the webhook method (push) as an alternative to the previously introduced getUpdates method (pull).

What’s the difference between getUpdates and a webhook?

The Default: getUpdates (pull)

By default (when a new bot is registered), a bot is configured to cache messages until they are actively fetched by a getUpdates request. This is also called the “pull method”.
When a message is sent to the bot while it is configured for this mode, it is held in a buffer within Telegram’s infrastructure until it is delivered to the bot, but no longer than 24 hours, as described in the docs.

While this mode offers an easy start because you do not need to have any infrastructure or code prepared to create the bot and start exchanging messages with it, it has some downsides:

You have to make sure your code asks for updates frequently.
You will often ask for updates when there are none, generating unnecessary load and traffic.
You cannot implement a real-time service, since updates are always delayed until the next poll.
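To make these downsides more tangible, here is a minimal sketch of what such a polling loop could look like using only Python's standard library. The function names (`next_offset`, `get_updates`, `poll_forever`) and the 5-second interval are my own assumptions for illustration; `getUpdates` with its `offset` and `timeout` parameters is part of the Telegram bot API.

```python
import json
import time
import urllib.parse
import urllib.request

API_BASE = "https://api.telegram.org"


def next_offset(updates, current=None):
    """Return the offset that confirms every update in `updates`.

    Telegram treats an update as confirmed once getUpdates is called
    with an offset higher than its update_id.
    """
    if not updates:
        return current
    return max(update["update_id"] for update in updates) + 1


def get_updates(token, offset=None, timeout=0):
    """Call getUpdates once and return the list of update dicts."""
    params = {"timeout": timeout}
    if offset is not None:
        params["offset"] = offset
    url = f"{API_BASE}/bot{token}/getUpdates?{urllib.parse.urlencode(params)}"
    with urllib.request.urlopen(url, timeout=timeout + 10) as response:
        return json.load(response)["result"]


def poll_forever(token, interval=5):
    """Ask for updates every `interval` seconds (hypothetical helper)."""
    offset = None
    while True:
        updates = get_updates(token, offset)  # often an empty list
        for update in updates:
            print(update)
        offset = next_offset(updates, offset)
        time.sleep(interval)  # messages wait here until the next round
```

Calling `poll_forever("<your bot token>")` would run the loop; the `time.sleep()` call is exactly where the delay described above comes from.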

There might be good reasons to keep it like this nevertheless: for example, you may not be able to meet the technical requirements of the alternative method, the webhook, because you cannot expose an interface to the public internet, or you may want to prepare and test some code before firing up a deployed service. But most serious bots will probably switch to the alternative webhook (push) method sooner or later.

GET and POST HTTP methods

Before we look at the webhook push method, let’s first refresh some basics about how HTTP works. This will only cover what is needed to better understand the next sections. There is more to say about the methods shown here, and the protocol has more methods as well; it is not without reason that O’Reilly’s book “HTTP: The Definitive Guide” runs to 656 pages. Since this is obviously beyond the scope of this article, please read up on it yourself if you want to know more.

GET

When you are surfing the web with your favorite browser, GET requests are sent by that browser to the web servers providing your favorite websites, asking for a specific resource (like “GET / HTTP/1.1” is asking for the page index). To fetch the article you are currently reading, your browser most certainly sent something like this to my web server:

“GET /create-your-own-telegram-bot-with-django-on-heroku-part-3/?somethingIWantToSet=HubbaHubba HTTP/1.1”

followed by a myriad of additional requests for each asset (like CSS files, JS, images, …). The web server then does its magic and delivers the associated content in its answer; classically: a webpage.

The “?” in that request string separates the path from an optional, additional section of that GET request: the query string (?somethingIWantToSet=HubbaHubba). Even though GET is a method to request data from an HTTP resource, data can be sent to it this way. Since this quickly becomes unwieldy, and everyone can easily read and manipulate the data sent to the server, it is only advisable for really short things, like selecting a bright or dark theme for the page you are requesting:

http://www.my-great-site.com/index.php?theme=dark

To sum this up: GET is used to request data from a specified resource.
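If you want to see how a server-side program might take such a URL apart, the following small sketch uses Python's standard `urllib.parse` module on the example URL from above. It only illustrates the split between path and query string; it is not how any particular web server does it.

```python
from urllib.parse import urlsplit, parse_qs

url = "http://www.my-great-site.com/index.php?theme=dark"

parts = urlsplit(url)          # separates scheme, host, path and query
query = parse_qs(parts.query)  # turns "theme=dark" into a dictionary

print(parts.path)  # /index.php
print(query)       # {'theme': ['dark']}
```

Each value is a list because a key may appear more than once in a query string.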

POST

Nowadays, most websites are not delivering just static content anymore. Instead, you can interact with them. You can, for example, send an email by filling a contact form or publish an answer to some forum’s thread or similar. To make this work, you need to send data to the remote website instead of sending a request to the remote web server to have something sent back to you. The method in the HTTP protocol enabling you to do so is called POST.

If, for example, somebody is sending me an email using the contact form on my website, I can find a request like this in the access logs of my server:

162.158.89.81 - - [18/Aug/2018:13:11:30 +0200] "POST /contact/ HTTP/1.1" 200 21185 "https://www.marc-richter.info/contact/" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:61.0) Gecko/20100101 Firefox/61.0"

There is some detail in that, like the HTTP status (200) indicating that the request was successful, the referer of the request (https://www.marc-richter.info/contact/) telling where the user came from, some details about the user agent (Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:61.0) Gecko/20100101 Firefox/61.0) telling what OS and Webbrowser the visitor was using, etc. But the important part for us is:

POST /contact/ HTTP/1.1

This looks pretty much the same as the GET request, doesn’t it?

… wait a moment: Has somebody sent you an empty mail?

No. In fact, I sent it to myself with the following content:

To: Marc Richter
Name: Marc Richter
Email: [email protected]
Subject: Testmail

Message:
Hi Marc,
great site!
BR

The reason you do not see this in the log line above is not that I cut that part. It doesn’t show up there because, unlike a GET request with a query string, a POST request names only the destination path on its request line, without any additional in-URL data visible. The content (payload) of the request is not part of the request URL; it is carried in the request body, which follows the headers.
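The following sketch builds (but does not send) such a POST request with Python's standard library, to show that the URL and the payload are kept in separate places. The host `www.example.com` and the form field names are assumptions made up for this illustration; the real field names of the contact form are not shown here.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical form fields, encoded the way a browser encodes a simple form
form = {
    "name": "Marc Richter",
    "subject": "Testmail",
    "message": "Hi Marc, great site!",
}
body = urlencode(form).encode("utf-8")

# Passing data= makes urllib use POST instead of GET
request = Request(
    "https://www.example.com/contact/",
    data=body,
    headers={"Content-Type": "application/x-www-form-urlencoded"},
)

print(request.get_method())  # POST
print(request.selector)      # /contact/  (this is all an access log would show)
print(request.data)          # the payload, which travels in the request body
```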

This is not only a more discreet way to exchange data with the server, since the payload stays out of URLs and access logs, but also a more convenient one for bigger amounts of data. Imagine if all data ever sent to your favorite social network could be read by any admin who can read that server’s log files … or by your employer, since you are routed over their proxy servers …

To understand webhooks and to sum this up: Telegram sends the JSON data we already saw in the previous part of this series to whatever URL we configure for the bot, using POST requests. The JSON data is then part of these requests’ payload and can easily be extracted by Django (or any other application).
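As a rough sketch of that extraction step, the function below takes the raw bytes of such a POST body and pulls out a few fields. The name `extract_message` and the sample values are made up for illustration; in a Django view, the raw bytes would come from `request.body`.

```python
import json


def extract_message(raw_body: bytes) -> dict:
    """Pull a few commonly needed fields out of a Telegram update."""
    update = json.loads(raw_body.decode("utf-8"))
    message = update.get("message", {})
    return {
        "update_id": update.get("update_id"),
        "chat_id": message.get("chat", {}).get("id"),
        "text": message.get("text"),
    }


# Shortened sample payload with made-up values
sample_body = (
    b'{"update_id": 1, "message": {"message_id": 2, '
    b'"chat": {"id": 3, "type": "private"}, '
    b'"date": 1534590000, "text": "Hello bot"}}'
)

print(extract_message(sample_body))
```

Using `.get()` throughout keeps the function from crashing on updates that carry no `message` key or no text.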

Again: If you want to get additional details on this, look them up for yourself; to understand how webhooks are working, what I explained so far is already sufficient to follow this article.

Webhook (push) ⚓

Now that we know what POST is and how it is working, let’s talk about how webhooks work.
There is a section in the official bot guide of Telegram, which explains how Webhooks work in three different levels of details; if you want to understand more details about Webhooks, I recommend reading at least the shortest version of it; I think it really is a great start for anybody not familiar with the concept of a webhook.

In my own words: A webhook is primarily a normal HTTP(S) endpoint, provided by a web server or similar software that listens for HTTP requests. Some piece of software picks up everything that is sent to this interface and applies some pre-defined logic to the data received. Which logic applies is defined by your code, which picks up that data.
This happens in real-time and is aware of the delivery result: If the message could not be delivered to the defined interface, maybe because the server is down or there are temporary connection issues, Telegram caches the message just like it does with the getUpdates pull method and retries delivery after some time. As soon as the message has been handed over to your interface successfully, Telegram forgets about it and will never re-transmit it to your bot again.
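To give an idea of what "some piece of software picking up everything" can mean, here is a deliberately tiny webhook receiver built on Python's standard `http.server` module. It is only a sketch: the names `process_body`, `WebhookHandler` and `make_server` are my own, the port is arbitrary, and I am assuming that answering with status 200 is what counts as a successful handover. A real bot would also need HTTPS and a publicly reachable address.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

received_updates = []  # the "pre-defined logic" here is just: remember it


def process_body(raw_body):
    """Return (http_status, update) for a raw POST body."""
    try:
        return 200, json.loads(raw_body)
    except ValueError:  # not valid JSON (or not valid UTF-8)
        return 400, None


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # The payload sits in the body; its size is given in a header
        length = int(self.headers.get("Content-Length", 0))
        status, update = process_body(self.rfile.read(length))
        if update is not None:
            received_updates.append(update)
        self.send_response(status)
        self.end_headers()


def make_server(port=8000):
    """Create (but do not start) a server listening on localhost."""
    return HTTPServer(("127.0.0.1", port), WebhookHandler)
```

Running `make_server().serve_forever()` starts the listener; every JSON document POSTed to it ends up in `received_updates`.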

Giving webhooks a try

Enough of the boring theory: I bet you are desperate to see all of this in action, aren’t you? Let’s switch your bot to webhook mode and have everything sent to it forwarded, so we can examine what we need to prepare our bot application for. After that, we will switch back to the getUpdates pull method again.

Attention: Please be aware that everything sent to the bot will be forwarded to a third party’s web service that I have no control over. Please make sure you are OK with the data privacy regulations of your country and the privacy policy of that service before following the next lines, since I do not take responsibility for these things (for example, if you reveal your online banking TAN lists because you felt your bot could be interested in them, or if your partner sends nudes unaware of this change).

First, you need to visit https://webhook.site in your favorite browser. You will be presented with a page telling you all the details needed to use a unique webhook which was just created for you automatically, without any registration – pretty awesome, isn’t it?

Copy that link and click the “Open in a new tab” link once. That will open a new tab, showing an empty page; but – wait! The welcome page of your webhook will have changed immediately, listing a new GET request that has just arrived. That was your browser opening that empty page.
It works the same with POST or any other HTTP request anybody makes towards that interface.

Let’s now configure our bot to use the URL you just copied as the destination for its webhook. You will need two things:

WEBHOOK_URL: The URL of the webhook you just created
BOT_TOKEN: The token of your bot, which we created in the previous parts of this series; I advised you to take note of it because we would need it later. Now is “later”.

After you have crafted the full URL, navigate to it in your browser:

https://api.telegram.org/bot{BOT_TOKEN}/setWebhook?url={WEBHOOK_URL}

Alternatively, you can use something like curl to do this. You need to replace the two “{}” markers with your own data. You should be presented with something like this:

{"ok":true,"result":true,"description":"Webhook was set"}
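If you prefer to craft that URL in Python rather than by hand, a sketch like the following takes care of URL-encoding the webhook address, so that characters like “:” and “/” survive as part of the query string. The function name `set_webhook_url`, the token and the webhook address are placeholders I made up; substitute your own values.

```python
from urllib.parse import urlencode


def set_webhook_url(bot_token, webhook_url):
    """Build the setWebhook URL for a bot (hypothetical helper)."""
    query = urlencode({"url": webhook_url})  # percent-encodes the value
    return f"https://api.telegram.org/bot{bot_token}/setWebhook?{query}"


# Placeholder token and webhook address, not real ones
print(set_webhook_url("123456:ABC-DEF", "https://webhook.site/your-unique-id"))
```

The printed URL can be pasted into the browser, or opened with `urllib.request.urlopen()` if you want to stay in Python.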

And: You are good to go!
Make sure the page of your webhook is opened in your browser and then, send something to your bot in Telegram. Immediately (!) you should see that new message as a POST request in the “requests” bar on the left of your webhook page. You can tick the “Format JSON” checkbox at the upper right corner of that page to make the JSON data a bit easier to read.
And that’s it! This is a preview of how the messages which are sent to your bot will be forwarded to your webhook.

Go on, play around with that. Have different datatypes (like images, text, vcards) sent to your bot and check out how it will arrive at the webhook.

You can rely on this being the same format that is about to hit your Python code soon. Better be prepared for everything ✌
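One way to be prepared is to check which kind of content a message carries before doing anything with it. The sketch below looks for a handful of keys that the bot API uses for different message types (a shared vcard arrives under `contact`, for example). The function name `content_type` and the selection of keys are my own choices and far from complete.

```python
# A few of the keys under which the bot API delivers message content
CONTENT_KEYS = (
    "text", "photo", "contact", "document", "sticker",
    "location", "voice", "video", "audio",
)


def content_type(message):
    """Return the first known content key found in a message dict."""
    for key in CONTENT_KEYS:
        if key in message:
            return key
    return "unknown"


print(content_type({"message_id": 1, "text": "Hello"}))
print(content_type({"message_id": 2, "photo": [], "caption": "Look!"}))
```

Comparing its result with what you see on the webhook page is a quick way to find out which message types your code does not know about yet.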

Switching back to the getUpdates pull method

When you are done, better switch back to the getUpdates pull method, so you do not accidentally expose sensitive data to this publicly available service.

This can be done in two different ways:
The first and probably most official one: use the deleteWebhook method like this:

https://api.telegram.org/bot{BOT_TOKEN}/deleteWebhook

The second one: send another GET request to the bot API like the one we used to enable the hook. The only difference is that we won’t provide a webhook URL this time:

https://api.telegram.org/bot{BOT_TOKEN}/setWebhook?url=

Success should be indicated by the following message:

{"ok":true,"result":true,"description":"Webhook was deleted"}
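When doing this from a script instead of the browser, it makes sense to check the answer rather than just assume it worked. Here is a small sketch that builds the deleteWebhook URL and interprets the JSON reply; both function names and the token are placeholders of mine, and the error reply in the example is only an illustration of what a failed call might look like.

```python
import json


def delete_webhook_url(bot_token):
    """Build the deleteWebhook URL for a bot (hypothetical helper)."""
    return f"https://api.telegram.org/bot{bot_token}/deleteWebhook"


def describe_result(response_text):
    """Turn the API's JSON reply into a short human-readable string."""
    reply = json.loads(response_text)
    if reply.get("ok"):
        return reply.get("description", "ok")
    return f"error: {reply.get('description', 'unknown')}"


print(delete_webhook_url("123456:ABC-DEF"))  # placeholder token
print(describe_result('{"ok":true,"result":true,"description":"Webhook was deleted"}'))
print(describe_result('{"ok":false,"error_code":401,"description":"Unauthorized"}'))
```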

Outlook for the next part of the series

So far so good! That’s it for this part of the series!
We just learned how webhooks work, how to make your bot use that method, how to analyze the data it delivers to any webhook, and how to disable it again, re-enabling the getUpdates pull method.

Next time, I will show you how to prepare Heroku to host the Django app we will create in later chapters and already upload a first demo site to it.

I hope you enjoyed this part! Please let me know in the comments about all the things you either enjoyed or did not like that much, and how I can do better.


Marc Richter

Born in 1982, Marc Richter has been an IT enthusiast since 1994. He became addicted when he first put his hands on his family’s PC and has never stopped investigating and exploring new things since then.
He is married to Jennifer and a proud father of two wonderful children.
His current professional focus is DevOps and Python development.

An exhaustive bio can be found in this blog post.