NOTE: This document is a work in progress.
This document describes the design of the networking stack on Tock.
The design described in this document is based on ideas contributed by Phil Levis, Amit Levy, Paul Crews, Hubert Teo, Mateo Garcia, Daniel Giffin, and Hudson Ayers.
This document is split into several sections. These are as follows:
Principles - Describes the main principles which the design of this stack is intended to meet, along with some justification of why these principles matter. Ultimately, the design should follow from these principles.
Stack Diagram - Graphically depicts the layout of the stack
Explanation of queuing - Describes where packets are queued prior to transmission.
List of Traits - Describes the traits which will exist at each layer of the stack. For traits that may seem surprisingly complex, provides examples of specific messages that require this more complex trait as opposed to the more obvious, simpler trait that might be expected.
Description of rx path
Description of the userland interface to the networking stack
Implementation Details - Describes how certain implementations of these traits will work, providing some examples with pseudocode or commented explanations of functionality
Example Message Traversals - Shows how different example messages (Thread or otherwise) will traverse the stack
Keep the simple case simple
Layering is separate from encapsulation
Dataplane traits are Thread-independent
Transmission and reception APIs are decoupled
IPv6 over ethernet:  Non-Thread 15.4:    Thread Stack:                                  Encapsulation Libraries
+-------------------+-------------------+----------------------------+
|                         Application                                |-------------------\
----------------------------------------+-------------+---+----------+                    \
|TCP Send| UDP Send |TCP Send| UDP Send |  | TCP Send |   | UDP Send |--\                  v
+--------+----------+--------+----------+  +----------+   +----------+   \              +------------+  +------------+
|     IP Send       |     IP Send       |         | IP Send |             \      -----> | UDP Packet |  | TCP Packet |
|                   |                   |  +-------------------------+     \   /        +------------+  +------------+
|                   |                   |                            |      \ /         +-----------+
|                   |                   |                            |      -+------->  | IP Packet |
|                   |                   |           THREAD           |      /           +-----------+
| IP Send calls eth | IP Send calls 15.4|                   <--------|------>           +-------------------------+
| 6lowpan libs with | 6lowpan libs with |                            |       \ -------> | 6lowpan compress_Packet |
| default values    | default values    |                            |        \         +-------------------------+
|                   |                   |                            |         \        +-------------------------+
|                   |                   +                +-----------|          ------> | 6lowpan fragment_Packet |
|                   |                   |                | 15.4 Send |                  +-------------------------+
|-------------------|-------------------+----------------------------+
|     ethernet      |          IEEE 802.15.4 Link Layer              |
+-------------------+------------------------------------------------+
Notes on the stack:
Queuing happens at the application layer in this stack. The userland interface to the networking stack (described in greater detail in Networking_Userland.md) already handles queueing multiple packets sent from userland apps. In the kernel, any application which wishes to send multiple UDP packets must handle queueing itself, waiting for a send_done to return from the radio before calling send on the next packet in a series of packets.
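The queuing rule above (send one packet, wait for send_done, then send the next) can be sketched as follows. This is a minimal illustration, not the Tock implementation: FakeRadio and QueueingApp are hypothetical names invented here, and the real radio and UDP interfaces are asynchronous capsule traits with different signatures.

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for the radio: it accepts one packet at a time.
#[derive(Default)]
struct FakeRadio {
    in_flight: Option<Vec<u8>>,
    transmitted: Vec<Vec<u8>>,
}

impl FakeRadio {
    /// Starts a transmission; hands the packet back if one is already in flight.
    fn send(&mut self, packet: Vec<u8>) -> Result<(), Vec<u8>> {
        if self.in_flight.is_some() {
            return Err(packet);
        }
        self.in_flight = Some(packet);
        Ok(())
    }

    /// Simulates the radio finishing; returns true if a send_done should fire.
    fn complete(&mut self) -> bool {
        match self.in_flight.take() {
            Some(packet) => {
                self.transmitted.push(packet);
                true
            }
            None => false,
        }
    }
}

/// Hypothetical in-kernel application that queues its own packets.
#[derive(Default)]
struct QueueingApp {
    pending: VecDeque<Vec<u8>>,
    busy: bool,
}

impl QueueingApp {
    /// Adds a packet to the queue and sends it right away if the radio is idle.
    fn enqueue(&mut self, radio: &mut FakeRadio, packet: Vec<u8>) {
        self.pending.push_back(packet);
        self.try_send(radio);
    }

    /// Called when the radio reports completion; sends the next queued packet.
    fn send_done(&mut self, radio: &mut FakeRadio) {
        self.busy = false;
        self.try_send(radio);
    }

    fn try_send(&mut self, radio: &mut FakeRadio) {
        if self.busy {
            return;
        }
        if let Some(packet) = self.pending.pop_front() {
            match radio.send(packet) {
                Ok(()) => self.busy = true,
                // Radio was busy: keep the packet at the head of the queue.
                Err(packet) => self.pending.push_front(packet),
            }
        }
    }
}
```

The queue lives entirely in the application, so the layers below it never hold more than one outstanding packet per sender.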
This section describes a number of traits which must be implemented by any network stack. It is expected that multiple implementations of some of these traits may exist to allow for Tock to support more than just Thread networking.
Before discussing these traits - a note on buffers:
Prior implementations of the Tock networking stack passed around references to 'static mut [u8] buffers to pass packets along the stack. This is not ideal, because we want to catch as many errors as possible at compile time. The next iteration of code will pass ‘typed’ buffers up and down the stack. There are a number of packet library traits defined below (e.g. IPPacket, UDPPacket, etc.). Transport layer traits will be implemented by a struct that contains at least one field: a [u8] buffer with lifetime 'a. Lower-level traits will simply contain payload fields that are transport-level packet traits (thanks to a TransportPacket enum). This design allows every buffer to be passed as type ‘UDPPacket’, ‘IPPacket’, etc. An added advantage of this design is that each buffer can easily be operated on using the library functions associated with its type.
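As a rough sketch of what such typed buffers might look like: the type names UDPPacket, IPPacket and TransportPacket come from the text above, but every field, method and the ip_send function below are assumptions made for illustration and may not match the actual capsule code. The 8-byte UDP header and 40-byte IPv6 header lengths are the standard fixed sizes.

```rust
/// UDP packet wrapping a borrowed byte buffer (field layout is hypothetical).
struct UDPPacket<'a> {
    src_port: u16,
    dst_port: u16,
    payload: &'a mut [u8],
}

/// Transport-level payloads that an IP packet may carry.
enum TransportPacket<'a> {
    UDP(UDPPacket<'a>),
    // A TCP variant could be added here once TCP is implemented.
}

/// IP packet whose payload is a typed transport packet, not raw bytes.
struct IPPacket<'a> {
    src_addr: [u8; 16],
    dst_addr: [u8; 16],
    payload: TransportPacket<'a>,
}

impl<'a> UDPPacket<'a> {
    /// Value of the UDP length field: 8 header bytes plus the payload.
    fn udp_length(&self) -> u16 {
        8 + self.payload.len() as u16
    }
}

impl<'a> IPPacket<'a> {
    /// Length of everything carried after the IPv6 header.
    fn payload_length(&self) -> u16 {
        match &self.payload {
            TransportPacket::UDP(udp) => udp.udp_length(),
        }
    }
}

/// Hypothetical send entry point: it only accepts a typed IPPacket, so
/// handing it a bare byte slice is rejected at compile time.
/// Returns the total datagram length (40-byte IPv6 header plus payload).
fn ip_send(packet: &IPPacket) -> u16 {
    40 + packet.payload_length()
}
```

Because ip_send takes an IPPacket rather than a byte slice, mixing up a UDP payload buffer with a fully formed IP datagram becomes a type error instead of a runtime bug.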
The traits below are organized by the network layer they would typically be associated with.
Thus far, the only transport layer protocol implemented in Tock is UDP.
Documentation describing the structs and traits that define the UDP layer can be found in capsules/src/net/udp/(udp.rs, udp_send.rs, udp_recv.rs)
Additionally, a driver exists that provides a userland interface through which UDP packets can be sent and received. This is described in greater detail in Networking_Userland.md.
The radio has a single RxClient, which is set as the MAC layer (awake_mac, typically).
The MAC layer (AwakeMac) has a single RxClient, which is the mac_device (ieee802154::Framer::framer).
The mac_device has a single receive client, MuxMac (virtual MAC device).
MuxMac can have multiple “users”, which are of type MacUser.
The MacUser used by the IP stack has a single receive client, which is the sixlowpan_state struct.
sixlowpan_state has a single rx_client, which in our case is a single struct that implements the ip_receive trait.
The ip_receive implementing struct (IP6RecvStruct) has a single client, which is udp_recv, a UDPReceive struct.
So what are the implications of all this?
Currently, any userland app could receive UDP packets intended for any other app if it implements 6lowpan itself on the received raw frames.
Currently, packets are only muxed at the Mac layer.
Right now the IPReceive struct receives all IP packets sent to the MAC address of this device, and soon will drop all packets sent to non-local addresses. Right now, the device effectively only has one address anyway, as we only support 6lowpan over 15.4, and as we haven't implemented a loopback interface on the IP_send path. If, in the future, we implement IP forwarding on Tock, we will need to add an IPSend object to the IPReceiver which would then retransmit any packets received that were not destined for local addresses.
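The planned receive-side behavior described above (deliver packets for local addresses, drop the rest, and forward them if IP forwarding is ever added) might be sketched like this. The names Ip6Addr, RxAction and classify are hypothetical and do not appear in the Tock source.

```rust
/// IPv6 address as raw bytes (hypothetical alias for this sketch).
type Ip6Addr = [u8; 16];

/// What the IP receive layer does with an incoming packet.
#[derive(Debug, PartialEq)]
enum RxAction {
    /// Destination is one of our interfaces: pass up to the transport layer.
    Deliver,
    /// Not ours, but forwarding is enabled: hand to an IP send object.
    Forward,
    /// Not ours and no forwarding: discard.
    Drop,
}

/// Decides the fate of a packet from its destination address.
fn classify(dst: &Ip6Addr, local_ifaces: &[Ip6Addr], forwarding_enabled: bool) -> RxAction {
    if local_ifaces.iter().any(|addr| addr == dst) {
        RxAction::Deliver
    } else if forwarding_enabled {
        RxAction::Forward
    } else {
        RxAction::Drop
    }
}
```

With forwarding_enabled fixed to false this reduces to the drop-non-local behavior; the Forward arm marks where an IPSend object would be needed.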
This section describes how the IP stack can be configured, including setting addresses and other parameters of the MAC layer.
Source IP address: An array of local interfaces on the device is contained in main.rs. Currently, this array contains two hardcoded addresses, and one address generated from the unique serial number on the sam4l.
Destination IP address: The destination IP address is configured by passing the address to the send_to() call when sending IPv6 packets.
src MAC address: This address is configured in main.rs. Currently, the src MAC address for each device is configured by default to be a 16-bit short address representing the last 16 bits of the unique 120-bit serial number on the sam4l. However, userland apps can change the src address by calling ieee802154_set_address().
dst MAC address: This is currently a constant set in main.rs (DST_MAC_ADDR). In the future this will change, once Tock implements IPv6 Neighbor Discovery.
src pan: This is set via a constant configured in main.rs (PAN_ID). The same constant is used for the dst pan.
dst pan: Same as src_pan. If we need to support use of the broadcast PAN as a dst_pan, this may change.
radio channel: Configured as a constant in main.rs (RADIO_CHANNEL).
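The MAC-layer parameters listed above can be pictured as one configuration value. In this sketch the constant names DST_MAC_ADDR, PAN_ID and RADIO_CHANNEL are taken from the text, but their values are placeholders, the MacConfig struct and both functions are hypothetical, and the byte order used to read the "last 16 bits" of the serial number is an assumption.

```rust
// Placeholder values: the real constants live in main.rs and may differ.
const PAN_ID: u16 = 0xABCD;
const DST_MAC_ADDR: u16 = 0x0802;
const RADIO_CHANNEL: u8 = 26;

/// Hypothetical bundle of the MAC-layer parameters described above.
#[derive(Debug, PartialEq)]
struct MacConfig {
    src_addr: u16,
    dst_addr: u16,
    src_pan: u16,
    dst_pan: u16,
    channel: u8,
}

/// Derives a 16-bit short address from the last 16 bits of a 120-bit
/// (15-byte) serial number. Assumption: the final two bytes are read
/// in big-endian order.
fn short_addr_from_serial(serial: &[u8; 15]) -> u16 {
    u16::from_be_bytes([serial[13], serial[14]])
}

/// Builds the default configuration: derived src address, constant
/// dst address, and the same PAN constant for both src and dst.
fn default_mac_config(serial: &[u8; 15]) -> MacConfig {
    MacConfig {
        src_addr: short_addr_from_serial(serial),
        dst_addr: DST_MAC_ADDR,
        src_pan: PAN_ID,
        dst_pan: PAN_ID,
        channel: RADIO_CHANNEL,
    }
}
```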
This section describes the current userland interface for the networking stack on Tock. This section should serve as a description of the abstraction provided by libTock. What the exact system call interface looks like, or how libTock or the kernel implements this functionality, is out of scope for this document.
The Tock networking stack and libTock should attempt to expose a networking interface that is similar to the POSIX networking interface. The primary motivation for this design choice is that application programmers are used to the POSIX networking interface design, and significant amounts of code have already been written for POSIX-style network interfaces. By designing the libTock networking interface to be as similar to POSIX as possible, we hope to improve developer experience while enabling the easy transition of networking code to Tock.
udp.c and udp.h in libtock-c/libtock define the userland interface to the Tock networking stack. These files interact with capsules/src/net/udp/driver.rs in the main tock repository. driver.rs implements an interface for sending and receiving UDP messages. It also exposes a list of interface addresses to the application layer. The primary functionality embedded in the UDP driver is within the allow(), subscribe(), and command() calls which can be made to the driver.
Details of this driver can be found in the doc/syscalls folder.
udp.c and udp.h in libtock-c make it easy to interact with this driver interface. Important functions available to userland apps written in C include:
udp_socket() - Sets the port on which the app will receive UDP packets, and sets the src_port of outgoing packets sent via that socket. Once socket binding is implemented in the kernel, this function will handle reserving ports to listen on and send from.
udp_close() - Currently just returns success, but once socket binding has been implemented in the kernel, this function will handle freeing bound ports.
udp_send_to() - Sends a UDP packet to a specified addr/port pair, and returns the result of the transmission once the radio has transmitted it (or once a failure has occurred).
udp_recv_from_sync() - Pass an interface to listen on and an incoming source address to listen for. Sets up a callback to wait for a received packet, and yields until that callback is triggered. This function never returns if a packet is not received.
udp_recv_from() - Pass an interface to listen on and an incoming source address to listen for. Unlike udp_recv_from_sync(), this takes a buffer into which the received packet should be placed, and returns the callback that will be triggered when a packet is received.
udp_list_ifaces() - Populates the passed pointer of IPv6 addresses with the available IPv6 addresses of the interfaces on the device. Right now this merely returns a constant hardcoded into the UDP driver, but should change to return the source IP addresses held in the network configuration file once that is created. Returns up to len addresses.
Other design notes:
The current design of the driver has a few limitations:
Currently, any app can listen on any address/port pair.
The current tx implementation allows for starvation, e.g. an app with an earlier app ID can starve a later ID by sending constantly.
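To make the starvation limitation concrete, the sketch below contrasts two ways of picking the next app to transmit. Both functions are hypothetical models: next_lowest_id assumes the driver scans apps in ID order (which is what the limitation above implies), and next_round_robin is one common mitigation, not something the driver is stated to implement.

```rust
/// Lowest-ID-first selection: always serves the first app that has a
/// packet pending. An app with a low ID that always has data pending
/// will be picked every time.
fn next_lowest_id(pending: &[bool]) -> Option<usize> {
    pending.iter().position(|&has_packet| has_packet)
}

/// Round-robin selection: scanning starts just after the app that was
/// served last, so every pending app is reached within one full pass.
fn next_round_robin(pending: &[bool], last_served: usize) -> Option<usize> {
    let n = pending.len();
    (1..=n)
        .map(|offset| (last_served + offset) % n)
        .find(|&app| pending[app])
}
```

With apps 0 and 1 both constantly pending, next_lowest_id keeps returning app 0, while next_round_robin alternates between them.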
Below is a fairly comprehensive overview of the POSIX networking socket interface. Note that much of this functionality pertains to TCP or connection-based protocols, which we currently do not handle.
socket(domain, type, protocol) -> int fd
  domain: AF_INET, AF_INET6, AF_UNIX
  type: SOCK_STREAM (TCP), SOCK_DGRAM (UDP), SOCK_SEQPACKET (?), SOCK_RAW
  protocol: IPPROTO_TCP, IPPROTO_SCTP, IPPROTO_UDP, IPPROTO_DCCP
bind(socketfd, my_addr, addrlen) -> int success
  socketfd: Socket file descriptor to bind to
  my_addr: Address to bind on
  addrlen: Length of address
listen(socketfd, backlog) -> int success
  socketfd: Socket file descriptor
  backlog: Number of pending connections to be queued
  Only necessary for stream-oriented data modes
connect(socketfd, addr, addrlen) -> int success
  socketfd: Socket file descriptor to connect with
  addr: Address to connect to (server protocol address)
  addrlen: Length of address
  When used with connectionless protocols, defines the remote address for sending and receiving data, allowing the use of functions such as send() and recv() and preventing the reception of datagrams from other sources.
accept(socketfd, cliaddr, addrlen) -> int success
  socketfd: Socket file descriptor of the listening socket that has the connection queued
  cliaddr: A pointer to an address to receive the client's address information
  addrlen: Specifies the size of the client address structure
send(socketfd, buffer, length, flags) -> int success
  socketfd: Socket file descriptor to send on
  buffer: Buffer to send
  length: Length of buffer to send
  flags: Various flags for the transmission
  Note that the send() function will only send a message when the socketfd is connected (including for connectionless sockets)
sendto(socketfd, buffer, length, flags, dst_addr, addrlen) -> int success
  socketfd: Socket file descriptor to send on
  buffer: Buffer to send
  length: Length of buffer to send
  flags: Various flags for the transmission
  dst_addr: Address to send to (ignored for connection type sockets)
  addrlen: Length of dst_addr
  Note that if the socket is a connection type, dst_addr will be ignored.
recv(socketfd, buffer, length, flags)
  socketfd: Socket file descriptor to receive on
  buffer: Buffer where the message will be stored
  length: Length of buffer
  flags: Type of message reception
  Typically used with connected sockets as it does not permit the application to retrieve the source address of received data.
recvfrom(socketfd, buffer, length, flags, address, addrlen)
  socketfd: Socket file descriptor to receive on
  buffer: Buffer to store the message
  length: Length of the buffer
  flags: Various flags for reception
  address: Pointer to a structure to store the sending address
  addrlen: Length of address structure
  Normally used with connectionless sockets as it permits the application to retrieve the source address of received data
close(socketfd)
  socketfd: Socket file descriptor to delete
gethostbyname()/gethostbyaddr()
  Legacy interfaces for resolving host names and addresses
select(nfds, readfds, writefds, errorfds, timeout)
  nfds: The range of file descriptors to be tested (0..nfds)
  readfds: On input, specifies file descriptors to be checked to see if they are ready to be read. On output, indicates which file descriptors are ready to be read
  writefds: Same as readfds, but for writing
  errorfds: Same as readfds, writefds, but for errors
  timeout: A structure that indicates the max amount of time to block if no file descriptors are ready. If None, blocks indefinitely
poll(fds, nfds, timeout)
  fds: Array of structures for file descriptors to be checked. The array members are structures which contain the file descriptor, and events to check for plus areas to write which events occurred
  nfds: Number of elements in the fds array
  timeout: If 0 return immediately, or if -1 block indefinitely. Otherwise, wait at least timeout milliseconds for an event to occur
getsockopt()/setsockopt()
Below is a list of desired functionality for the libTock userland API.
struct sock_addr_t
  ipv6_addr_t: IPv6 address (single or ANY)
  port_t: Transport level port (single or ANY)
struct sock_handle_t
  Opaque to the user; allocated in userland by malloc (or on the stack)
list_ifaces() -> iface[]
  ifaces: A list of ipv6_addr_t, name pairs corresponding to all interfaces available
udp_socket(sock_handle_t, sock_addr_t) -> int socketfd
  socketfd: Socket object to be initialized as a UDP socket with the given address information
  sock_addr_t: Contains an IPv6 address and a port
udp_close(sock_handle_t)
  sock_handle_t: Socket to close
send_to(sock_handle_t, buffer, length, sock_addr_t)
  sock_handle_t: Socket to send using
  buffer: Buffer to send
  length: Length of buffer to send
  sock_addr_t: Address struct (IPv6 address, port) to send the packet from
recv_from(sock_handle_t, buffer, length, sock_addr_t)
  sock_handle_t: Receiving socket
  buffer: Buffer to receive into
  length: Length of buffer
  sock_addr_t: Struct where the kernel writes the received packet's sender information
There are two major differences between the proposed Tock APIs and the standard POSIX APIs. First, the POSIX APIs must support connection-based protocols such as TCP, whereas the Tock API is only concerned with connectionless, datagram based protocols. Second, the POSIX interface has a concept of the sock_addr_t structure, which is used to encapsulate information such as port numbers to bind on and interface addresses. This makes bind_to_port redundant in POSIX, as we can simply set the port number in the sock_addr_t struct when binding. I think one of the major questions is whether to adopt this convention, or to use the above definitions for at least the first iteration.
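The "single or ANY" fields of the proposed sock_addr_t suggest a simple matching rule for deciding whether a received packet belongs to a socket. The sketch below models that rule in Rust rather than C, to stay consistent with the kernel side; AddrSpec, PortSpec, SockAddr and matches are hypothetical names, and the rule itself is an assumption about how ANY would be interpreted.

```rust
/// IPv6 address field of a socket address: one address or ANY.
#[derive(Clone, Copy, PartialEq, Debug)]
enum AddrSpec {
    Any,
    Single([u8; 16]),
}

/// Port field of a socket address: one port or ANY.
#[derive(Clone, Copy, PartialEq, Debug)]
enum PortSpec {
    Any,
    Single(u16),
}

/// Rust model of the proposed sock_addr_t (hypothetical layout).
struct SockAddr {
    addr: AddrSpec,
    port: PortSpec,
}

impl SockAddr {
    /// True if a packet with this address and port falls within the
    /// (address, port) pair this socket address describes.
    fn matches(&self, addr: &[u8; 16], port: u16) -> bool {
        let addr_ok = match self.addr {
            AddrSpec::Any => true,
            AddrSpec::Single(a) => a == *addr,
        };
        let port_ok = match self.port {
            PortSpec::Any => true,
            PortSpec::Single(p) => p == port,
        };
        addr_ok && port_ok
    }
}
```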
An example use of the userland networking stack, ip_sense, can be found in libtock-c/examples/ip_sense.
This section was written when the networking stack was incomplete, and aspects may be outdated. This goes for all sections following this point in the document.
The Thread specification determines an entire control plane that spans many different layers in the OSI networking model. To adequately understand the interactions and dependencies between these layers' behaviors, it might help to trace several types of messages and see how each layer processes the different types of messages. Let's trace carefully the way OpenThread handles messages.
We begin with the most fundamental message: a data-plane message that does not interact with the Thread control plane save for passing through a Thread-defined network interface. Note that some of the procedures in the below traces will not make sense when taken independently: the responsibility-passing will only make sense when all the message types are taken as a whole. Additionally, no claim is made as to whether or not this sequence of callbacks is the optimal way to express these interactions: it is just OpenThread's way of doing it.
As you can see, the data dependencies are nowhere near as clean as the OSI model dictates. The complexity mostly arises because
Note that all of the MAC layer dependencies in step 5 can be pre-decided so that the MAC layer is the only one responsible for writing the MAC header.
This gives a pretty good overview of what minimally needs to be done to even be able to send normal IPv6 datagrams, but does not cover all of Thread's complexities. Next, we look at some control-plane messages.
The only cross-layer dependency introduced by the MLE layer is the dependency between MLE-layer security and link-layer security. Whether or not the MLE layer sits atop an actual UDP socket is an implementation detail.
If Thread REED devices are to be eventually supported in Tock, then we must also consider this case. If a frame is sent to a router which is not its final destination, then the router must forward that message to the next hop.
This example shows that the IP6 transmission interface may need to handle more message types than just IP6 datagrams: there is a case where it is convenient to be able to handle a datagram that is already 6LoWPAN compressed.
From time to time, a sleepy edge device will wake up and begin polling its parent to check if any frames are available for it. This is done via a MAC command frame, which must still be sent through the transmission pipeline with link security enabled (Key ID mode 1). OpenThread does this by routing it through the IP6 transmission interface, which arguably isn't the right choice.
We could imagine giving the data poll manager direct access as a client of the MAC layer to avoid having to shuffle data through the IP6 transmission interface. This is only justified because MAC command frames are never 6LoWPAN-compressed or fragmented, nor do they depend on the IP6 interface in any way.
This type of message behaves similarly to the MAC data polls. The message is essentially an empty MAC frame, but OpenThread chooses to also route it through the IP6 transmission interface. It would be far better to allow a child supervision implementation to be a direct client of the MAC interface.
These two message types are also explicitly marked, because they require a specific Key ID Mode to be selected when producing the frame for the MAC interface.
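The message types traced in this section can be summarized as a small enum. This is only a reading aid: TxMessage and its methods are hypothetical names, and the two predicates encode just what the text above states, namely that forwarded datagrams arrive already 6LoWPAN-compressed and that data polls and child supervision frames could go straight to the MAC layer.

```rust
/// Kinds of message that currently pass through the IP6 transmission
/// interface in the traces above (hypothetical enum).
#[derive(Debug, Clone, Copy, PartialEq)]
enum TxMessage {
    /// Ordinary IPv6 datagram from the upper layers.
    Ip6Datagram,
    /// Forwarded datagram that is already 6LoWPAN-compressed.
    CompressedDatagram,
    /// MAC command frame sent by a sleepy device polling its parent.
    MacDataPoll,
    /// Essentially empty MAC frame used for child supervision.
    ChildSupervision,
}

impl TxMessage {
    /// Only plain IPv6 datagrams still need 6LoWPAN compression.
    fn needs_compression(self) -> bool {
        matches!(self, TxMessage::Ip6Datagram)
    }

    /// Messages that do not depend on the IP6 interface and could be
    /// sent by a direct client of the MAC layer instead.
    fn could_bypass_ip6(self) -> bool {
        matches!(self, TxMessage::MacDataPoll | TxMessage::ChildSupervision)
    }
}
```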
So far, it seems like we can expect the MAC layer to have no cross-layer dependencies: it receives frames with a completely specified description of how they are to be secured and transmitted, and just does so. However, this is not entirely the case.
When the frame is being secured, the key ID mode has been set by the upper layers as described above, and this key ID mode is used to select between a few different key disciplines. For example, mode 0 is only used by Joiner entrust messages and uses the Thread KEK sequence. Mode 1 uses the MAC key sequence and Mode 2 is a constant key used only in MLE announce messages. Hence, this key ID mode selection is actually enabling an upper layer to determine the specific key being used in the link layer.
Note that we cannot just reduce this dependency by allowing the upper layer to specify the key used in MAC encryption. During frame reception, the MAC layer itself has to know which key to use in order to decrypt the frames correctly.
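The key ID mode dependency described above amounts to a lookup that the MAC layer must be able to perform on both transmit and receive. A minimal sketch, with hypothetical names (KeySource, key_source_for_mode) and covering only the three modes this document mentions:

```rust
/// Which key discipline the link layer uses for a frame (hypothetical enum).
#[derive(Debug, PartialEq, Clone, Copy)]
enum KeySource {
    /// Mode 0: Thread KEK sequence, used only by Joiner entrust messages.
    ThreadKek,
    /// Mode 1: the MAC key sequence.
    MacKeySequence,
    /// Mode 2: constant key used only in MLE announce messages.
    ConstantMleAnnounceKey,
}

/// Maps the key ID mode carried with a frame to a key source.
/// Modes not discussed in this document return None.
fn key_source_for_mode(key_id_mode: u8) -> Option<KeySource> {
    match key_id_mode {
        0 => Some(KeySource::ThreadKek),
        1 => Some(KeySource::MacKeySequence),
        2 => Some(KeySource::ConstantMleAnnounceKey),
        _ => None,
    }
}
```

On reception the same lookup runs with the key ID mode read from the incoming frame, which is why the mapping has to live in the MAC layer rather than being supplied per frame by an upper layer.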