Life of a device connection

This article explains what happens when a device connects to mDash.

  1. TCP connection. The device opens a TCP connection to mdash.net
  2. TLS handshake. The device performs a TLS handshake
  3. MQTT authentication. The device sends an MQTT CONNECT command, passing DEVICE_ID as the MQTT username and DEVICE_TOKEN as the MQTT password
    • The device waits for the MQTT CONNACK response
    • If the ID/Token are invalid, the MQTT CONNACK response code is non-zero (failure) and mDash closes the TCP connection. The device goes back to step 1
    • If the ID/Token are valid, the MQTT CONNACK response code is zero (success)
    • If mDash already has a live connection from a device with the same ID/Token, that older connection is considered stale and is closed
  4. mDash publishes the message DEVICE_ID to the topic db/online, notifying any potential observer that the device DEVICE_ID has come online. Since the topic name starts with db/, this message gets logged to the database
  5. The device subscribes to the topic DEVICE_ID/rpc in order to receive JSON-RPC messages on that topic. The replies go to the $sys/DEVICE_ID/rpc topic
  6. About 750 milliseconds after the successful connection, mDash publishes the {"id":1,"method":"Sys.GetInfo","params":{"utc_time":1558870586},"src":"$sys/d1"} message to the DEVICE_ID/rpc topic, thus calling the Sys.GetInfo RPC method on the device
  7. The device sets its local time to utc_time if the time is not already set
  8. The device replies with a message similar to this:
     {
       "fw_version": "1.0.18-arduino-10809-pico32",
       "arch": "esp32",
       "fw_id": "20190526-102743",
       "app": "sketch_huzzah32.ino",
       "status": 0,
       "uptime": 6,
       "reboot_reason": "power-on"
     }
  9. mDash stores the received Sys.GetInfo reply in the state.reported.ota section of the device shadow, so that essential information about the device is available even when the device is offline
  10. mDash publishes the Sys.GetInfo reply to the db/online/DEVICE_ID topic. The message gets saved into the database. Thus the information about all device connections, disconnections, and Sys.GetInfo results (which contain an important reboot_reason field) is stored persistently, giving visibility into crashes, reconnections, etc.
  11. From this point on, the device performs a firmware-specific setup (e.g. it may subscribe to other topics, such as the device shadow delta) and exchanges MQTT messages in the normal way
  12. A device usually disconnects in a non-clean way (e.g. device power-off). mDash sends a TCP keep-alive every 20 seconds, and closes the connection after 3 failed keep-alives. When a connection is closed, mDash publishes the DEVICE_ID message to the db/offline topic. A disconnected device automatically goes back to step 1 (reconnects)
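
The Sys.GetInfo exchange in steps 5 to 8 can be sketched in Python without any network code. This is only an illustration, not mDash or firmware source: the Device class and the helper function names are hypothetical, and the reply topic is assumed to be the request's "src" value with "/rpc" appended, which matches the $sys/DEVICE_ID/rpc topic from step 5 if "d1" in the example request is the device ID. The reply payload contains only the fields shown in step 8; any additional framing used on the wire is not covered by this article and is left out.

```python
import json


def rpc_request_topic(device_id):
    """Topic the device subscribes to for incoming JSON-RPC calls (step 5)."""
    return f"{device_id}/rpc"


def build_get_info_request(utc_time, device_id, request_id=1):
    """Build a Sys.GetInfo request shaped like the example in step 6."""
    return json.dumps({
        "id": request_id,
        "method": "Sys.GetInfo",
        "params": {"utc_time": utc_time},
        "src": f"$sys/{device_id}",
    })


class Device:
    """Hypothetical device-side model, for illustration only."""

    def __init__(self, device_id, info):
        self.device_id = device_id
        self.info = info        # fields reported by Sys.GetInfo (step 8)
        self.utc_time = None    # None means the local clock is not set yet

    def handle_rpc(self, payload):
        """Handle one message received on DEVICE_ID/rpc.

        Returns (reply_topic, reply_payload), or None for methods
        this sketch does not implement.
        """
        request = json.loads(payload)
        if request.get("method") != "Sys.GetInfo":
            return None
        # Step 7: take the server time only if the clock is not already set.
        if self.utc_time is None:
            self.utc_time = request["params"]["utc_time"]
        # Assumption: replies go to "<src>/rpc".
        reply_topic = request["src"] + "/rpc"
        return reply_topic, json.dumps(self.info)


if __name__ == "__main__":
    device = Device("d1", {
        "fw_version": "1.0.18-arduino-10809-pico32",
        "arch": "esp32",
        "reboot_reason": "power-on",
    })
    request = build_get_info_request(1558870586, "d1")
    print("request on", rpc_request_topic("d1"), ":", request)
    topic, reply = device.handle_rpc(request)
    print("reply on", topic, ":", reply)
```

In a real firmware the handle_rpc logic would sit inside the MQTT message callback, and the returned topic and payload would be passed to the MQTT client's publish call.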