How Tashi Consensus Engine implements Meshed Networking
For two devices on the Internet to talk to each other, they need to know each other's Internet Protocol (IP) address. Any device connected to a network will know its own IP address, and so in theory it just needs to share that address with any peers it wishes to connect to.
Tashi Network Transport provides an easy way for users of a game session to learn each other's IP addresses and connect to each other through its integration with Unity Lobby. In a perfect world, this is all the information needed for multiple instances of a Tashi-enabled application to talk to each other, and then the consensus algorithm takes care of the rest.
However, in practice it is not always that simple.
Because of the limited address space of Internet Protocol version 4 (IPv4), a consumer broadband Internet connection is typically assigned only a single IP address, which must be shared by all devices attached to its Local Area Network (LAN), whether by Wi-Fi, Ethernet, USB, etc. The technique that makes this sharing possible is called Network Address Translation, or NAT (https://en.wikipedia.org/wiki/Network_address_translation).
The larger address space of IPv6 theoretically allows each device to be assigned its own globally unique address. However, IPv6 rollout is slow and has yet to reach many consumer broadband customers, at least in North America, and even network hardware that supports IPv6 often still prefers IPv4 for backwards compatibility.
This presents us with two major problems for peer-to-peer connectivity:
- 1. There is a high chance that a client's self-reported IP address is not reachable on the public Internet: it may instead be the internal IP address assigned to the client by the router for their Local Area Network.
  - As an upside from an architectural perspective, this allows multiple networks to use the same part of the IPv4 address space because they don't talk to each other directly. This eases the issue of address space exhaustion.
- 2. Routers that implement NAT typically do not accept unsolicited inbound connections as they usually have no way of determining which device on their network the connection is intended for.
  - This is also an upside for security reasons, as potentially malicious traffic from the Internet can be firewalled off from local traffic, with only authorized traffic being allowed through.
  - A workaround for advanced users is to configure port forwarding in the router, which explicitly maps one or more ports on its public interface to a single device on the Local Area Network. You may remember having to do this to play older games online, or from hosting your own game server for your friends, like Minecraft. However, this has a significant impact on the user experience and is technically a security risk.
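To make problem 1 concrete, the following minimal Python sketch checks whether a self-reported address could be reached from the public Internet at all. It uses only the standard `ipaddress` module. The helper name `is_publicly_routable` is an assumption made for illustration and is not part of any Tashi API.

```python
import ipaddress


def is_publicly_routable(address: str) -> bool:
    """Return True if `address` lies in globally routable address space."""
    return ipaddress.ip_address(address).is_global


# A typical LAN address handed out by a home router is not routable.
print(is_publicly_routable("192.168.1.10"))  # False
# A well-known public address is.
print(is_publicly_routable("8.8.8.8"))       # True
```

An address that fails this check is only useful to peers on the same LAN. Passing the check is necessary but not sufficient, since a router or firewall may still drop unsolicited inbound traffic (problem 2).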
Most online multiplayer games work around these issues by having one or more servers which are reachable at known IP addresses on the public Internet, which all players connect to. These may be dedicated servers hosted by the developer/publisher of the game, or by the playerbase if the developer gives them access to the server software, or may be shared generic infrastructure such as Unity Relay or Steamworks.
The obvious downside of this approach is cost: cloud infrastructure is not cheap, and it requires either personnel with the expertise to set it up and maintain it or paying a premium for someone else to host the infrastructure for you. Player-hosted infrastructure of course requires a playerbase with similar knowledge or expertise, or a partnership with third-party providers to offer dedicated server hosting as a service to your players.
However, there are other options that Tashi opens up.
If two peers cannot connect to each other, a third peer whom they both can connect to can relay traffic between them. Tashi handles this situation automatically, though it is likely to affect consensus latencies as Tashi also tries to avoid unfairly burdening a single peer with the whole network's bandwidth at once. The best performance scenario in most cases is when all peers have direct connections to each other.
If a given network configuration doesn't allow any peers to talk to each other, we also provide a hosted service: Tashi Relay. Unlike Unity Relay, which is all-or-nothing, Tashi Relay is used only by the peers that cannot be contacted directly. If no peers need the relay, it is not used at all, which reduces costs significantly.
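The fallback order described here (a direct connection, then a mutually reachable peer, then the hosted relay) can be sketched as a small decision function. This is an illustration only: `link`, `choose_route`, and the peer names are hypothetical, and the sketch simply picks the first eligible relay peer alphabetically rather than spreading bandwidth fairly across peers.

```python
from typing import FrozenSet, Optional, Set, Tuple


def link(a: str, b: str) -> FrozenSet[str]:
    """An undirected direct connection between two peers."""
    return frozenset((a, b))


def choose_route(
    a: str, b: str, peers: Set[str], direct: Set[FrozenSet[str]]
) -> Tuple[str, Optional[str]]:
    """Pick how traffic between `a` and `b` should travel."""
    if link(a, b) in direct:
        return ("direct", None)
    # Look for a third peer that both endpoints can reach directly.
    for candidate in sorted(peers - {a, b}):
        if link(a, candidate) in direct and link(b, candidate) in direct:
            return ("peer-relay", candidate)
    # Nobody can bridge the two peers: fall back to the hosted relay.
    return ("hosted-relay", None)


peers = {"alice", "bob", "carol", "dave"}
direct = {link("alice", "carol"), link("bob", "carol")}

print(choose_route("alice", "carol", peers, direct))  # ('direct', None)
print(choose_route("alice", "bob", peers, direct))    # ('peer-relay', 'carol')
print(choose_route("alice", "dave", peers, direct))   # ('hosted-relay', None)
```

Only the pair that cannot be bridged by another peer ends up on the hosted relay, which is the property that keeps relay usage, and therefore cost, low.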
Notice the loophole in problem #2 above: routers that implement NAT do not accept unsolicited inbound connections. What does a solicited connection look like? Well, it's any Internet traffic that originates from a device on the network. Outbound connections need a way for the reply from the remote peer to reach the client, so when a connection is opened from within the network, the router effectively adds a temporary port forwarding rule for that connection so that it knows where to route the return traffic.
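A toy model makes the difference between solicited and unsolicited traffic concrete. The `SimpleNat` class below is a hypothetical simulation, not router firmware or Tashi code. It assumes the router reuses one public port per internal endpoint and only admits inbound packets from remote endpoints that the internal device has already sent to; real routers vary in both respects.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set, Tuple

Endpoint = Tuple[str, int]  # (ip, port)


@dataclass
class SimpleNat:
    public_ip: str
    next_port: int = 40000
    port_of: Dict[Endpoint, int] = field(default_factory=dict)      # internal -> public port
    internal_of: Dict[int, Endpoint] = field(default_factory=dict)  # public port -> internal
    allowed: Set[Tuple[int, Endpoint]] = field(default_factory=set) # (public port, remote)

    def outbound(self, src: Endpoint, dst: Endpoint) -> Endpoint:
        """A LAN device sends to `dst`; return the public endpoint `dst` sees."""
        if src not in self.port_of:
            self.port_of[src] = self.next_port
            self.internal_of[self.next_port] = src
            self.next_port += 1
        port = self.port_of[src]
        # Sending out is what "solicits" replies from this remote endpoint.
        self.allowed.add((port, dst))
        return (self.public_ip, port)

    def inbound(self, src: Endpoint, public_port: int) -> Optional[Endpoint]:
        """Return the internal endpoint to deliver to, or None if dropped."""
        if (public_port, src) not in self.allowed:
            return None
        return self.internal_of[public_port]


nat = SimpleNat(public_ip="203.0.113.5")
laptop = ("192.168.1.10", 50000)
server = ("198.51.100.7", 443)

print(nat.inbound(server, 40000))    # None: nothing has been solicited yet
print(nat.outbound(laptop, server))  # ('203.0.113.5', 40000)
print(nat.inbound(server, 40000))    # ('192.168.1.10', 50000)
```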
Remember that Tashi uses QUIC, a transport protocol built on top of UDP. UDP itself has no inherent concept of connections (QUIC does have logical connections, but they take place over the connectionless UDP). UDP traffic, however, is still subject to Network Address Translation. An outgoing UDP packet will cause the router to add a temporary entry to its port forwarding table so reply packets can find their way back. And since Tashi uses one UDP port for all its traffic, only two things are typically required to establish a connection with a peer behind a NAT:
- 1. The peer needs to learn and share their true public IP address (the one assigned to the router by the Internet Service Provider), and:
- 2. The peer behind the NAT needs to send out packets destined for the peer that wishes to connect to them, so that their router creates the appropriate entries in its port forwarding table.
  - These outbound packets need to be carefully timed so as to coincide with the incoming connection attempt, as the lifetime of the port forwarding entry varies from manufacturer to manufacturer but is typically very short, on the order of seconds.
This assumes that the other peer is also behind a NAT. If their IP address is publicly reachable, the solution is trivial: the peer behind the NAT just needs to try connecting to them instead of the other way around. Tashi automatically tries connections in both directions periodically and assumes both peers are behind a NAT if they're not directly reachable. In that case, NAT hole punching happens for both peers simultaneously.
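The importance of timing can be illustrated with a small simulation of two peers, each behind its own NAT, sending packets at each other's public endpoint. Everything here is an assumption made for illustration: `TimedNat` and `simulate_punch` are hypothetical names, the addresses are placeholders, packets are treated as arriving instantly, and a forwarding entry is assumed to expire `lifetime` seconds after the last outbound packet to that remote endpoint.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Tuple

Endpoint = Tuple[str, int]  # (ip, port)


@dataclass
class TimedNat:
    lifetime: float  # seconds a forwarding entry survives without refresh
    last_sent: Dict[Endpoint, float] = field(default_factory=dict)

    def send(self, remote: Endpoint, now: float) -> None:
        """An outbound packet creates or refreshes the entry for `remote`."""
        self.last_sent[remote] = now

    def accepts(self, remote: Endpoint, now: float) -> bool:
        """Inbound traffic passes only while a matching entry is still alive."""
        sent = self.last_sent.get(remote)
        return sent is not None and now - sent <= self.lifetime


def simulate_punch(
    a_times: Iterable[float], b_times: Iterable[float], lifetime: float
) -> bool:
    """Return True if each peer receives at least one packet from the other."""
    a_pub = ("203.0.113.5", 40000)
    b_pub = ("198.51.100.7", 41000)
    nat_a, nat_b = TimedNat(lifetime), TimedNat(lifetime)
    events = sorted([(t, "a") for t in a_times] + [(t, "b") for t in b_times])
    received = {"a": False, "b": False}
    for now, sender in events:
        if sender == "a":
            nat_a.send(b_pub, now)  # opens A's hole toward B
            received["b"] = received["b"] or nat_b.accepts(a_pub, now)
        else:
            nat_b.send(a_pub, now)  # opens B's hole toward A
            received["a"] = received["a"] or nat_a.accepts(b_pub, now)
    return received["a"] and received["b"]


# Both peers send within the entry lifetime: the holes line up.
print(simulate_punch([0.0, 0.5], [0.2, 0.7], lifetime=2.0))  # True
# B sends long after A's entry has expired: nothing gets through.
print(simulate_punch([0.0], [10.0], lifetime=2.0))           # False
```

In the first run, A's initial packet is dropped by B's router but still opens A's own hole, which is what lets B's packet through a moment later.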
Because NAT hole punching is only needed when both peers are behind NAT, the hole-punching protocol requires that each peer has a connection to a third peer who is publicly reachable on the Internet. This peer reports the other peers' public IP addresses to each other and coordinates the connection attempt between them. The reported addresses are not implicitly trusted: the TLS handshake enforces, using the peer's public key, that the peer we connected to is the one we meant to connect to.
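The role of the publicly reachable third peer can be sketched as follows. This is not Tashi's wire protocol: the `Rendezvous` and `Introduction` names, the message fields, and the idea of handing both peers a shared start time are assumptions chosen to illustrate "report the addresses and coordinate the attempt".

```python
from dataclasses import dataclass
from typing import Dict, Tuple

Endpoint = Tuple[str, int]  # (ip, port)


@dataclass(frozen=True)
class Introduction:
    to_peer: str
    other_peer: str
    other_endpoint: Endpoint  # a hint only; identity is verified later via TLS
    start_at: float           # when both sides should begin sending


class Rendezvous:
    """A publicly reachable peer that sees each peer's post-NAT address."""

    def __init__(self) -> None:
        self.observed: Dict[str, Endpoint] = {}

    def observe(self, peer_id: str, source: Endpoint) -> None:
        """Record the source address seen on a packet from `peer_id`."""
        self.observed[peer_id] = source

    def introduce(
        self, a: str, b: str, now: float, delay: float = 0.5
    ) -> Tuple[Introduction, Introduction]:
        """Tell each peer the other's observed endpoint and a common start time."""
        start = now + delay
        return (
            Introduction(a, b, self.observed[b], start),
            Introduction(b, a, self.observed[a], start),
        )


rendezvous = Rendezvous()
rendezvous.observe("alice", ("203.0.113.5", 40000))
rendezvous.observe("bob", ("198.51.100.7", 41000))

to_alice, to_bob = rendezvous.introduce("alice", "bob", now=100.0)
print(to_alice.other_endpoint)  # ('198.51.100.7', 41000)
print(to_bob.other_endpoint)    # ('203.0.113.5', 40000)
```

Because the reported endpoint is only a hint, a peer that receives an `Introduction` would still need to authenticate whoever answers at that address before trusting the connection.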
In the case that no peer is able to serve as an intermediary, in comes Tashi Relay to save the day! As a hosted service, it is always reachable, and so can facilitate hole punching between other peers. Since the relay is only needed temporarily during the connection phase, we can offer this service at a significantly reduced cost compared to relaying all traffic for the duration of a session.
Not every router configuration supports NAT hole punching; it's likely that a Symmetric NAT configuration (refer to the Wikipedia link above) cannot be hole-punched by our protocol, in which case Tashi Relay can automatically take over and relay traffic to that peer. However, we expect a majority of network configurations on the public Internet to be hole-punchable, which means using the relay can be the exception rather than the rule.
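A short sketch shows why a symmetric NAT is a problem for this approach. Both classes below are hypothetical simplifications: `ConeNat` assumes one public port per internal endpoint regardless of destination, while `SymmetricNat` assumes a fresh public port for every distinct destination. Under the second assumption, the port an intermediary observes is not the port the router will use toward the other peer, so the reported address is of no use for hole punching.

```python
from itertools import count
from typing import Dict, Tuple

Endpoint = Tuple[str, int]  # (ip, port)


class ConeNat:
    """One public port per internal endpoint, whatever the destination."""

    def __init__(self, first_port: int = 40000) -> None:
        self._ports = count(first_port)
        self._by_source: Dict[Endpoint, int] = {}

    def public_port(self, src: Endpoint, dst: Endpoint) -> int:
        if src not in self._by_source:
            self._by_source[src] = next(self._ports)
        return self._by_source[src]


class SymmetricNat:
    """A separate public port for every (source, destination) pair."""

    def __init__(self, first_port: int = 40000) -> None:
        self._ports = count(first_port)
        self._by_flow: Dict[Tuple[Endpoint, Endpoint], int] = {}

    def public_port(self, src: Endpoint, dst: Endpoint) -> int:
        key = (src, dst)
        if key not in self._by_flow:
            self._by_flow[key] = next(self._ports)
        return self._by_flow[key]


client = ("192.168.1.10", 50000)
intermediary = ("198.51.100.7", 443)
other_peer = ("192.0.2.99", 6000)

cone = ConeNat()
print(cone.public_port(client, intermediary) == cone.public_port(client, other_peer))  # True

sym = SymmetricNat()
print(sym.public_port(client, intermediary) == sym.public_port(client, other_peer))    # False
```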