What is Message Queue Telemetry Transport (MQTT)? How to secure MQTT?
Table of Contents
2.2.6. Who is the Client, who is the Server?
2.4. MQTT Quality of Service (QoS)
2.5. Load Balancing of brokers
2.5.1. Extend Natively with NGINX
3.1. Most popular MQTT brokers
4.1. Security Context and Attack surface
4.2. IoT devices generally don’t encrypt communication
4.3. Authentication and access control
5. How can NGINX enrich your MQTT use cases?
1. What is MQTT?
1.1. Presentation
Message Queue Telemetry Transport (MQTT) is a standard messaging protocol for the Internet of Things (IoT).
MQTT is designed with a small footprint and a lightweight publish/subscribe messaging transport, which makes it ideal for connecting IoT devices for various use cases, from connecting smart home devices to industrial robots or connected cars.
Benefits of MQTT
- Lightweight: The hardware requirements to run MQTT clients are very small, so it can run on very small microcontrollers.
- Network Efficiency: MQTT messages are small, so they do not require high bandwidth.
- Reliable: Reliability of message delivery is important for many IoT use cases. This is why MQTT has different quality of service levels.
- Support for Unreliable Networks: Many IoT devices connect over unreliable cellular networks. MQTT’s support for persistent sessions reduces the time to reconnect the client with the broker.
- Security Enabled: MQTT has support for message encryption and authentication. We will see in this document that we often make trade-offs on security to keep devices lightweight.
MQTT Publish / Subscribe Architecture
The current version of the protocol is MQTT v5, which brings more scalability, improved features and better error handling than its predecessor, MQTT v3.
1.2. Constrained devices
When you talk about IoT you directly think about your connected coffee machine, connected toothbrush or sensors hidden everywhere in your apartment. You certainly don’t want to have a computer sized add-on to your devices that consumes more power than the device itself.
That’s why there is a strong need to have constrained devices that have low power consumption requirements and have a small footprint. However, this also means limited compute performance including security processing.
There are different classes of constrained devices, depending on their RAM and flash sizes, which directly impact their power consumption.
Classes of Constrained Devices
| Class | RAM Size | Flash Size |
|---|---|---|
| Class 0, C0 | << 10 KiB | << 100 KiB |
| Class 1, C1 | ~ 10 KiB | ~ 100 KiB |
| Class 2, C2 | ~ 50 KiB | ~ 250 KiB |
https://datatracker.ietf.org/doc/html/rfc7228
1.3. Use Cases
When I started looking into messaging protocols and MQTT, I only had 2 use cases in mind: smart homes and connected cars. An example of a smart home use case would be a connected temperature sensor. These tend to be more expensive than traditional thermometers, but they can provide nice graphs on my mobile device and convert between °F and °C automatically. In terms of connected cars, more and more vehicle manufacturers offer remote telemetry, mobile device connectivity and data-enriched emergency notification services.
In my research, I was amazed to discover how prevalent IoT and Messaging Protocols are across various industries. Here are some examples:
- Automotive: Manufacturers can use MQTT to collect data from a car’s sensors and process it centrally, to give more assistance to the driver and to improve the driving experience and security. It is also used to deliver software updates over-the-air or to augment the capabilities of navigation systems.
- Logistics: MQTT can be used for real-time tracking of assets and transportation vehicles. It is also used to keep stock inventories constantly up to date.
- Retail: Smart stores, inventory, customer tracking and analytics, Point-of-Sale integration.
- Manufacturing: Monitoring robots and equipment to identify failures, reduce energy consumption, optimize productivity and coordinate supply chains.
- Medical & Healthcare: With the progress in medical technologies, the cost of medical consultations and the lack of medical care in rural areas, there is a greater need than ever for remote patient monitoring of vital metrics such as blood pressure or blood sugar. Data can be collected by connected medical equipment at home and sent to a central location for analysis and alerting.
- Smart Home: anything that can make you change your Internet router configuration from a /24 to a /16 network, like multimedia, light controls, temperature, home security…
- Smart Cities: Monitoring environmental factors such as air quality, weather or temperature, and reacting rapidly to emergency situations.
- Oil & Gas: Petrochemical industries are very sensitive and require constant monitoring and control, safety monitoring and alerting and environment observability.
- …
1.4. Alternatives to MQTT
When it comes to choosing a communication protocol for IoT or machine-to-machine (M2M) data exchange, you need to consider the capabilities and constraints of the devices themselves:
- Energy efficiency, so the protocol does not drain too much power when devices are powered by USB or batteries.
- Performance: depending on the use case, the device could receive multiple measurements per minute and may need to respond to consumers in under a second.
- Resource requirements – devices that are affordable and can fit in a constrained space.
- Network quality: In many cases, IoT devices need to communicate over-the-air via mobile networks where network quality is not always guaranteed.
- Message reliability: Unfortunately, packet loss can be a reality. Depending on the use case and the importance of every single message, the protocol should have a built-in mechanism to detect lost messages and retransmit them.
- Queueing: In large-scale deployments, thousands or up to millions of devices will be exchanging messages rapidly. The infrastructure should scale and have a modern message bus that clients can connect to in order to publish or subscribe to messages.
- …
Although there are multiple choices for communication protocols such as: Constrained Application Protocol (CoAP), eXtensible Messaging and Presence Protocol (XMPP), Advanced Message Queuing Protocol (AMQP) and of course Hypertext Transfer Protocol (HTTP), MQTT is, by far, the one that has the best balance between network/hardware/energy requirements vs reliability/queueing/network capabilities.
Of course, when ample physical space and power supply afford you the ability to introduce edge computing capabilities, other protocols like WebSockets and gRPC can also be considered. However, you may need to explicitly manage resiliency and message bus distribution capabilities when using these protocols.
References:
https://core.ac.uk/download/pdf/160743474.pdf
2. Architecture
2.1. Overall Architecture
2.2. Terminology
2.2.1. Topic
An MQTT Topic is an identifier or a filter for the message.
For example, a temperature sensor can send a message with the following topic “temperature/house/floor1/room3”.
2.2.2. Message
The message is the payload of the MQTT publication for a specific topic.
Following the same temperature example, the message can be “20 degrees Celsius”
2.2.3. Subscriber
An MQTT subscriber “subscribes” to particular topics available on an MQTT broker and can read the related messages.
A subscriber can ask for specific topics “temperature/house/floor1/room3” or wildcard the topics to get all messages under a specific topic level. For example: I can read all MQTT messages related to temperature by subscribing to “temperature/#”
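To make the wildcard behaviour concrete, here is a minimal sketch of how a topic filter could be matched against a published topic. The function name `topic_matches` is made up for this illustration, and the sketch assumes a well-formed filter (with “#” only in last position). It also handles “+”, the single-level wildcard defined by the MQTT specification, which is not discussed above.

```python
def topic_matches(topic_filter: str, topic: str) -> bool:
    """Return True if `topic` matches the subscription `topic_filter`."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard: matches this level and everything below it.
            return True
        if i >= len(topic_levels):
            # The filter is deeper than the topic.
            return False
        if level != "+" and level != topic_levels[i]:
            # "+" matches exactly one level; anything else must be identical.
            return False
    # No "#" was seen, so both must have the same number of levels.
    return len(filter_levels) == len(topic_levels)


matches_all_temperatures = topic_matches("temperature/#", "temperature/house/floor1/room3")
matches_other_room = topic_matches("temperature/house/floor1/room3", "temperature/house/floor1/room4")
```

In this sketch, `matches_all_temperatures` is true and `matches_other_room` is false. Real brokers apply additional rules (for example around topics starting with “$”) that are left out here.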
2.2.4. Publisher
An MQTT publisher sends messages with a topic to an MQTT broker.
2.2.5. Broker
A broker is a server that receives MQTT messages from publishers and, in turn, publishes them to subscribers.
2.2.6. Who is the Client, who is the Server?
A client is a device or an application, such as an IoT device or a mobile app, making an MQTT connection to the broker.
The server is the broker. A client can be an MQTT publisher, a subscriber, or both.
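The roles defined in this terminology section can be sketched with a toy, in-memory broker. This is only an illustration of the publish/subscribe pattern: the class `ToyBroker` and its methods are invented for the example, there is no network involved, and subscriptions match exact topics only.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Handler = Callable[[str, str], None]


class ToyBroker:
    """In-memory stand-in for a broker: exact-topic subscriptions only."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        # A subscriber registers a callback for one topic.
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: str) -> int:
        """Deliver to every subscriber of `topic`; return the delivery count."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(topic, message)
        return len(handlers)


broker = ToyBroker()
received = []
broker.subscribe("temperature/house/floor1/room3",
                 lambda topic, message: received.append((topic, message)))

# The publisher never talks to the subscriber directly, only to the broker.
delivered = broker.publish("temperature/house/floor1/room3", "20 degrees Celsius")
ignored = broker.publish("temperature/house/floor2/room1", "19 degrees Celsius")
```

Here the first publication reaches the single subscriber, while the second one is dropped because nobody subscribed to that topic.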
2.3. Traffic overview
2.4. MQTT Quality of Service (QoS)
Unfortunately, the reality is that we are delivering messages to and from constrained devices over unreliable networks. For a temperature sensor, missing one measurement within a 5-minute time slot may not be catastrophic, but for other use cases like healthcare or automotive, omissions can have tragic consequences. Therefore, we need to consider the Quality of Service (QoS) levels that MQTT offers.
MQTT has 3 levels of QoS:
- QoS 0 – “at most once” aka fire and forget: the message is sent without confirmation or follow-up
- QoS 1 – “at least once”: the publisher keeps a copy of the published message until it receives a PUBACK acknowledgement from the broker. If the PUBACK does not arrive before a timeout expires, it resends the message.
- QoS 2 – “exactly once”: when the publisher sends a message to the broker, it expects a PUBREC acknowledgement. On receiving the PUBREC, the sender discards its copy of the message and sends a PUBREL to tell the broker to release the message; the broker confirms with a PUBCOMP. If the sender does not receive the PUBREC within a timeout period, it resends the PUBLISH packet with the duplicate flag (DUP) set.
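The QoS 1 retransmission behaviour can be sketched as a small simulation. Nothing here talks to a real broker: the function `publish_qos1` and its `acks` argument are hypothetical, with `acks` standing in for the network by saying, attempt after attempt, whether the PUBACK arrived before the timeout.

```python
from typing import Iterator, List, Tuple


def publish_qos1(payload: str, acks: Iterator[bool],
                 max_attempts: int = 5) -> List[Tuple[str, bool]]:
    """Send `payload` until an acknowledgement arrives.

    Returns the list of (payload, dup_flag) packets that were sent.
    """
    sent = []
    for attempt in range(max_attempts):
        dup = attempt > 0            # retransmissions carry the DUP flag
        sent.append((payload, dup))
        if next(acks, False):        # PUBACK received: the copy can be dropped
            return sent
    raise TimeoutError(f"no PUBACK after {max_attempts} attempts")


# The first two PUBACKs are lost, the third one arrives.
packets = publish_qos1("20 degrees Celsius", iter([False, False, True]))
```

In this run the message is sent three times, and the broker may therefore see it more than once. That is exactly the “at least once” guarantee: no loss, but possible duplicates.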
So, how do you select an appropriate QoS value? My first instinct would be to just pick “2”, on the principle that whoever can do more can do less, but then I think again about the constrained devices. The higher the QoS level, the more bandwidth and compute it requires.
Here are a few tips on how to choose the best QoS:
- QoS0 is faster than QoS1 which is faster than QoS2.
- Use QoS0 if you have a very reliable network.
- Use QoS1 if every message must arrive and your subscribers can tolerate duplicate messages.
- Use QoS0 when you don’t mind losing some messages occasionally.
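To put a rough number on the cost of each level, the following sketch counts the control packets exchanged for one successful delivery on one hop (publisher to broker, or broker to subscriber), ignoring retransmissions. The packet names come from the MQTT specification; the helper names and the one-message-every-5-minutes sensor are assumptions made for the example.

```python
# Control packets for one successful delivery on one hop, no retransmission.
QOS_FLOW = {
    0: ["PUBLISH"],
    1: ["PUBLISH", "PUBACK"],
    2: ["PUBLISH", "PUBREC", "PUBREL", "PUBCOMP"],
}


def packets_per_delivery(qos: int) -> int:
    """Number of control packets needed to deliver one message at `qos`."""
    return len(QOS_FLOW[qos])


def total_packets(messages: int, qos: int) -> int:
    """Packets exchanged for `messages` deliveries at the given QoS level."""
    return messages * packets_per_delivery(qos)


# A sensor publishing once every 5 minutes sends 288 messages per day.
messages_per_day = 24 * 60 // 5
daily_cost = {qos: total_packets(messages_per_day, qos) for qos in QOS_FLOW}
```

For that sensor, `daily_cost` is 288 packets at QoS 0, 576 at QoS 1 and 1152 at QoS 2, which is why the highest level is not a sensible default for constrained devices.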
2.5. Load Balancing of brokers
Of course, like any networking service, MQTT brokers should be deployed as N redundant instances to provide capacity and high availability.
As with any stateful protocol you try to load balance, you need stickiness, or session persistence, to make sure a given MQTT client keeps reaching the same broker for the same session.
What happens when you have set up QoS but do not have an appropriate session persistence mechanism? A retransmitted message or an acknowledgement can land on a broker that holds no state for that session, which breaks the QoS exchange. You cannot rely completely on the source IP address as a persistence criterion either, as connected devices can roam across networks and thus present themselves with a different IP address and be connected to a different broker every time.
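The idea of persisting on something more stable than the source IP address can be sketched as follows. This is not how any particular load balancer implements it: the broker names, the client ID and the function `pick_broker` are all hypothetical, and a plain hash-modulo is used where a production setup would more likely use consistent hashing, so that adding or removing a broker remaps fewer clients.

```python
import hashlib
from typing import List

BROKERS = ["broker-a", "broker-b", "broker-c"]  # hypothetical broker pool


def pick_broker(key: str, brokers: List[str] = BROKERS) -> str:
    """Map a persistence key to a broker with a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(brokers)
    return brokers[index]


client_id = "sensor-0042"  # hypothetical device identifier

# Keyed on the client ID, the choice does not depend on the network in use.
broker_at_home = pick_broker(client_id)
broker_while_roaming = pick_broker(client_id)

# Keyed on the source IP, the choice can change every time the device roams.
broker_for_ip_1 = pick_broker("192.0.2.10")
broker_for_ip_2 = pick_broker("198.51.100.7")
```

Because the key is the client ID, `broker_at_home` and `broker_while_roaming` are always the same broker, whereas nothing guarantees that the two IP-based choices agree.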
2.5.1. Extend Natively with NGINX
NGINX can read the client ID from the MQTT CONNECT packet and use it as the persistence key.
NGINX can also substitute the client ID in the MQTT CONNECT message. Most of the time, the client ID should be a unique identifier such as the serial number of the device, and NGINX can replace it or combine it with another identifier (such as the serial number of the client SSL certificate), as in the following configuration:
stream {
    mqtt on;    # enable MQTT processing

    server {
        listen 2883 ssl;
        ssl_certificate        /etc/nginx/certs/emqx.pem;
        ssl_certificate_key    /etc/nginx/certs/emqx.key;
        ssl_client_certificate /etc/nginx/certs/ca.crt;
        ssl_session_cache      shared:SSL:10m;
        ssl_verify_client      on;    # require a client certificate (mutual TLS)

        proxy_pass            10.0.0.113:1883;
        proxy_connect_timeout 1s;

        # Replace the client ID in the CONNECT message with the client certificate serial number
        mqtt_set_connect clientid $ssl_client_serial;
    }
}
2.5.2. Extend with NJS
njs is a subset of the JavaScript language that allows extending nginx functionality. The traditional use cases for njs are:
- Complex access control and security checks in njs before a request reaches an upstream server
- Manipulating response headers
- Writing flexible asynchronous content handlers and filters
You can find a great example of an MQTTv5 implementation that extends these capabilities with NJS in Doug Gallarda’s personal GitHub repository: https://github.com/gallarda/mqtt5
3. MQTT Brokers
3.1. Most popular MQTT brokers
The purpose of this document is not to compare MQTT brokers, but simply to introduce you to the concept. You can find plenty of comparisons on the Internet, for example https://emqx.medium.com/a-comprehensive-comparison-of-open-source-mqtt-brokers-2023-e70257cc5b75 and https://www.emqx.com/en/blog/open-mqtt-benchmarking-comparison-mqtt-brokers-in-2023, which were both written by EMQ Technologies. I make no judgement about possible bias in the ranking, but rest assured you will find the content to be rich in detail.
Amongst the most popular MQTT brokers we have the following:
- Mosquitto
- HiveMQ
- EMQX
- Solace
Depending on the solution you choose, licensing may be set per broker host, per connection and/or per number of messages per second.
Beyond sharing message processing loads evenly between brokers, an intermediate proxy can block undesirable and unauthenticated connection attempts and let through only legitimate, well-structured messages, so you pay only for real production traffic.
3.2. Why proxy MQTT traffic?
In 2022, the Eclipse Foundation published an IoT & Edge Developer Survey Report which highlighted some concerns related to this. Examine the following chart presented in this report:
The Eclipse Foundation explains these numbers as follows:
- An increase in connectivity concerns underscores the lack of computational capacity for efficient built-in security.
- Security remains a major concern despite the percentage drop (from 46% in 2021).
- A decrease in deployment-related concerns (from 31% to 20% in 2022) indicates that more solutions are moving past the PoC phase, and developers are focusing more on successful solution rollouts to ensure a better overall user experience.
- Concerns around integration complexity have also decreased (by 11% compared to 2021). As the number of deployments increases, developers see less complexity in the need for additional integrations with complementary technologies and systems.
4. MQTT Security
4.1. Security Context and Attack surface
In early 2024, did you hear about the 3 million toothbrushes conducting a massive DDoS attack? (https://www.securityweek.com/3-million-toothbrushes-abused-for-ddos-attacks-real-or-not/). As it turns out, this was fake news. However, this scenario is certainly within the realm of possibility.
In fact, massive DDoS attacks directed by botnets of IoT devices are very common. Think about billions of devices (~25B in 2024 according to analysts), with a fair portion of them being completely vulnerable:
- Not encrypted
- Never updated
- Poor code
I am not saying it happens because of a lack of security awareness, knowledge or skills, I honestly think it is because of a trade-off engineers made between the technical overhead of security requirements and:
- The lightness of the MQTT protocol in terms of hardware requirements for clients. Due to their reduced form factor, their power requirements and price, MQTT devices are generally constrained devices as per RFC 7228 which cannot handle optimal TLS processing.
- The large number of devices makes it difficult to manage the security settings, updates and credentials management over-the-air at scale.
Of course, sometimes there is also a lack of security awareness and of appreciation for the impact of a cyber-attack.
4.2. IoT devices generally don’t encrypt communication
This is a bold claim. How do we know this? Leverage one of the first steps employed by ethical hackers: reconnaissance!
Let’s search for open, unencrypted MQTT brokers on the Internet. How? There are various search engines that discover and index internet-connected devices. Shodan is one such service. You can run a search on Shodan for: MQTT port: 1883 code:0 and see results like the following:
Notes:
- 1883 is the default port for unencrypted MQTT.
- Code: 0 indicates that the broker accepted the connection without authentication.
I was looking into whitepapers on MQTT security, and I found several publications with Shodan outputs. On April 27th, 2017, Shodan had successfully indexed 24,998 brokers on the default port (https://www.researchgate.net/figure/Result-of-MQTT-broker-on-port-1883-in-Shodan_fig2_322059897), and that search covered all MQTT brokers on port 1883, regardless of their connection code.
Now (February 29th, 2024), searching only for unauthenticated brokers, I get 19 times as many results.
The second step is to look at authentication: any broker returning a connection code of 0 will accept any client without authentication. Again, any client means both publishers AND subscribers. In this case, you can pretty much choose what you want to exploit:
- Read confidential data (you can subscribe to all topics with “#”).
- Drain out messages from brokers (you can ACK messages, so they are deleted from brokers).
- Spam the subscribers by publishing wrong messages.
- Flood with bad messages (i.e. L7 DDoS: search for “mqtt malaria” on Google, there are plenty of open MQTT “stress” tools :( ).
4.3. Authentication and access control
MQTT brokers support username/password authentication. If no credentials are provided, you will get a connection code of 5; if your credentials are wrong, the connection code will be 4.
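These connection codes are carried in the CONNACK packet that the broker sends back after a CONNECT. As a minimal sketch, assuming MQTT v3.1.1 (MQTT v5 uses a different set of reason codes), a 4-byte CONNACK could be decoded like this. The function name `parse_connack` is made up for the example; the code table is the one defined by the v3.1.1 specification.

```python
from typing import Tuple

# CONNACK return codes defined by MQTT v3.1.1.
CONNACK_RETURN_CODES = {
    0: "connection accepted",
    1: "unacceptable protocol version",
    2: "identifier rejected",
    3: "server unavailable",
    4: "bad user name or password",
    5: "not authorized",
}


def parse_connack(packet: bytes) -> Tuple[int, str]:
    """Return (return_code, description) for an MQTT v3.1.1 CONNACK packet."""
    # Byte 0: packet type 0x20, byte 1: remaining length (always 2),
    # byte 2: acknowledge flags, byte 3: return code.
    if len(packet) != 4 or packet[0] != 0x20 or packet[1] != 0x02:
        raise ValueError("not a CONNACK packet")
    return_code = packet[3]
    return return_code, CONNACK_RETURN_CODES.get(return_code, "reserved")


open_broker = parse_connack(bytes([0x20, 0x02, 0x00, 0x00]))
no_credentials = parse_connack(bytes([0x20, 0x02, 0x00, 0x05]))
```

A broker that answers an anonymous CONNECT with code 0, as in `open_broker`, is the kind of broker that shows up in the Shodan search above.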
Again, if traffic is not encrypted, every part of your MQTT packet is in clear text including username and password.
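To see why this matters, here is a sketch that builds an MQTT v3.1.1 CONNECT packet by hand, following the packet layout in the specification. The client ID, username and password are invented for the example, and the helper names are not taken from any library.

```python
import struct


def encode_string(text: str) -> bytes:
    """MQTT strings are UTF-8 bytes prefixed with a 2-byte big-endian length."""
    data = text.encode("utf-8")
    return struct.pack("!H", len(data)) + data


def encode_remaining_length(length: int) -> bytes:
    """Variable-length integer used in the MQTT fixed header."""
    out = bytearray()
    while True:
        byte, length = length % 128, length // 128
        if length > 0:
            byte |= 0x80             # continuation bit: more bytes follow
        out.append(byte)
        if length == 0:
            return bytes(out)


def build_connect(client_id: str, username: str, password: str,
                  keepalive: int = 60) -> bytes:
    """Build a CONNECT packet carrying a username and a password."""
    flags = 0x80 | 0x40 | 0x02       # username, password, clean session
    variable_header = (encode_string("MQTT") + bytes([4, flags])
                       + struct.pack("!H", keepalive))
    payload = (encode_string(client_id) + encode_string(username)
               + encode_string(password))
    body = variable_header + payload
    return bytes([0x10]) + encode_remaining_length(len(body)) + body


# Hypothetical credentials, for illustration only.
packet = build_connect("sensor-0042", "alice", "s3cret")
password_is_visible = b"s3cret" in packet
```

`password_is_visible` is true: without TLS, anyone who can capture this packet on the network can read the username and password directly from its bytes.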
Like any application authenticated by username and password, brokers are subject to brute force, dictionary and credential stuffing attacks.
4.4. Injections
So far, we have discussed the implications of attacks on the publisher, broker and subscriber components. What are some secondary consequences?
- A subscriber will likely store the message data somewhere in a database. A SQL database? :)
- Unfiltered message data may be presented in a modern web page? :)
- MQTT is a TCP messaging protocol that can’t be inspected by traditional L7 security solutions? :)
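On the subscriber side, the database risk in the first point is usually handled with parameterized queries. The following sketch uses Python’s built-in `sqlite3` module with an in-memory database; the table name, the function `store_reading` and the hostile payload are assumptions made for the illustration.

```python
import sqlite3


def store_reading(conn: sqlite3.Connection, topic: str, payload: str) -> None:
    """Store one MQTT message; the payload is always treated as data."""
    # The "?" placeholders keep the payload out of the SQL text itself.
    conn.execute("INSERT INTO readings (topic, payload) VALUES (?, ?)",
                 (topic, payload))
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (topic TEXT, payload TEXT)")

# A payload crafted to break out of a naively concatenated SQL statement.
hostile = "20'); DROP TABLE readings; --"
store_reading(conn, "temperature/house/floor1/room3", hostile)

rows = conn.execute("SELECT payload FROM readings").fetchall()
```

The hostile string ends up stored as a plain value and the table is still there, because the message content is never interpreted as SQL.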
I did a small test with a simple MQTT temperature app written in Python with Paho (https://pypi.org/project/paho-mqtt/); reach out if you want the code.
Note:
Paho provides a client class which enables applications to connect to an MQTT broker to publish messages, and to subscribe to topics and receive published messages. It also provides some helper functions to make publishing one off messages to an MQTT server very straightforward.
I was too lazy to set up a SQL database and store data in it, so I am just passing the JSON data received by the Python code to Flask. So no SQL injection for today; XSS will be enough.
And here is what I got on the web app rendering the temperature:
The XSS payload successfully made its way to the web application without being identified or intercepted, because it was encapsulated in an MQTT message, which a WAF cannot inspect natively.
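On the application side, one common mitigation is to escape anything that comes out of an MQTT message before it is placed in a page. This is only a sketch and not the test app mentioned above: the JSON key `temperature` and the function `render_temperature` are assumptions made for the example.

```python
import html
import json


def render_temperature(raw_message: str) -> str:
    """Render a reading as HTML, escaping whatever the message contains."""
    reading = json.loads(raw_message)
    value = html.escape(str(reading["temperature"]))
    return f"<p>Temperature: {value}</p>"


# A message whose "temperature" field carries a script instead of a number.
hostile = json.dumps({"temperature": "<script>alert(1)</script>"})
page = render_temperature(hostile)
```

With escaping, the browser displays the script as text instead of running it. Since a temperature is supposed to be a number, rejecting any value that does not parse as one would be an even stricter check.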
5. How can NGINX enrich your MQTT use cases?
5.1. What is NGINX?
NGINX is a lightweight and highly performant software-based load balancer and reverse proxy. It natively supports multiple protocols like HTTP for general-purpose web and API traffic but also more specific protocols like MQTT.
NGINX has many built-in features for traffic processing and handling. However, it can also be extended through its JavaScript scripting engine, called NJS (https://nginx.org/en/docs/njs/). NJS can be used to take actions like steering traffic or rewriting packet contents based on multiple conditions.
There are multiple use cases where NGINX can help, either natively or by extending its capabilities with NJS for parsing and decision-making:
5.2. Traffic optimisation
- Reduce latency
- Uniform load balancing of brokers
- Steering messages to brokers based on topic
- Steering messages to brokers based on QoS
- Steering messages to brokers based on message content
- ...
5.3. MQTT Security
- TLS Offload
- Client Authentication offload
- Client Authorization (filter only to publish not to subscribe for example).
- Filtering broker fingerprinting attempts
- Filtering unwanted information ($SYS)
- Inspecting messages for injection attempts.
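The authorization and $SYS filtering ideas in this list boil down to a decision taken before a packet is forwarded to the broker. The sketch below shows that decision logic in Python purely as an illustration; it is not NGINX or NJS code, and the client IDs, the permission table and the function `is_allowed` are all hypothetical.

```python
from typing import Dict, Set

# Hypothetical policy: which actions each client ID may perform.
PERMISSIONS: Dict[str, Set[str]] = {
    "sensor-0042": {"publish"},      # publish-only device
    "dashboard-01": {"subscribe"},   # read-only consumer
}


def is_allowed(client_id: str, action: str, topic: str) -> bool:
    """Decide whether a PUBLISH or SUBSCRIBE should reach the broker."""
    if topic.startswith("$SYS"):
        # Hide broker internals, which also limits fingerprinting.
        return False
    # Unknown clients have no permissions at all.
    return action in PERMISSIONS.get(client_id, set())


sensor_can_publish = is_allowed("sensor-0042", "publish", "temperature/house/floor1/room3")
sensor_can_subscribe = is_allowed("sensor-0042", "subscribe", "temperature/#")
dashboard_can_read_sys = is_allowed("dashboard-01", "subscribe", "$SYS/broker/version")
```

In this example only `sensor_can_publish` is true: the sensor cannot subscribe, and nobody can read the $SYS topics through the proxy.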
5.4. Analytics and Telemetry
NGINX provides an OpenTelemetry module to help you analyze your software performance by instrumenting, generating, collecting, and exporting telemetry data.
https://docs.nginx.com/nginx/admin-guide/dynamic-modules/opentelemetry/