P2P WebRTC file sharing app: Broker using Phoenix

WebRTC was created primarily for video and voice communication, but it also has an API for sending raw binary data between two browsers. This opens up a lot of opportunities for peer-to-peer web applications that run in any modern browser. There are already interesting applications built on it, such as WebTorrent and UberConference, and this is just the beginning of the many P2P applications people will come up with.

I wanted to understand WebRTC, and what better way to understand something than to build an application with it? I chose to build a P2P file sharing application (many others have done it before). It is an interesting little project in which I had to deal with both the frontend and the backend.

The goal is to enable file sharing directly between two peers, without any middleman. That makes sharing secure and ephemeral: once you close the web page, everything is released. The sender adds files and shares a unique URL with the receiver. When the receiver visits that URL, they can see the shared files and download them. Try it out (WIP). The code for this application exists here.

Ephemeral Share

In this post, I will briefly go over my design decisions for the broker, which is used for the initial handshake. I assume you already know the basics of how WebSocket communication works in the Phoenix framework. I might discuss other aspects of this application in later posts.

Establishing connection in WebRTC:

WebRTC uses several protocols to establish a connection between peers. If you want to understand more about these protocols, read this blog post.
For this post, all we need to understand is that there is an initiator, which starts the handshake. It creates an offer that contains its network information and other media-related information.
The receiver, on receiving this offer, creates an answer describing its own network and media information and sends it back to the initiator. Once the initiator accepts the answer, the initial handshake is complete and both peers can communicate.

This exchange of offers and answers requires some means of carrying the information to the other peer. This is where our broker comes into the picture.

Broker:

Let's look at the requirements for the broker:

  1. It should be able to communicate (over WebSocket) with a peer by its ID.
  2. It needs to know which peers are connected, so that if someone tries to connect to a peer that does not exist, it can respond with an error.

Each peer is assigned a unique ID when it connects to our web app. After the web page is served, the peer communicates with the server over a WebSocket. It first requests a unique ID, and once it receives one, it registers itself using that ID.
Let's see how we satisfy each of the broker requirements.

1. Communicate using peer ID

I tried two approaches to communicating with a peer by its ID.

The first approach was to maintain a global mapping (dictionary) from peer ID to the socket associated with it. When we need to communicate with a peer, we look up the socket using the peer ID as the key and push the message to it.
Communicate using peer ID: approach 1
The second approach was to have each peer join its own unique topic. When we want to send a message to that peer, we just broadcast the message to that topic. Since only one peer is subscribed to that topic, no other peer receives the message.
Communicate using broadcast to different topic

I decided to go with the second approach. The main drawback of the first is that a single Elixir Agent storing the dictionary becomes a bottleneck and does not scale. Every request to communicate with another peer has to send a message to that Agent to get the socket, which puts all the load on one process and slows down the whole application.
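To make the bottleneck concrete, here is a minimal sketch of the first approach. It is not the code from the application: the PeerDirectory module and its function names are hypothetical, and a plain PID stands in for the Phoenix socket. Note that every push, for every peer, first makes a synchronous call into the same Agent process.

```elixir
# Hypothetical sketch of approach 1: one Agent holds the peer_id => pid map.
defmodule PeerDirectory do
  use Agent

  def start_link(_opts \\ []) do
    Agent.start_link(fn -> %{} end, name: __MODULE__)
  end

  # Remember which process serves a peer (a pid stands in for the socket).
  def register(peer_id, pid) do
    Agent.update(__MODULE__, &Map.put(&1, peer_id, pid))
  end

  # We have to call this ourselves whenever a socket goes away.
  def unregister(peer_id) do
    Agent.update(__MODULE__, &Map.delete(&1, peer_id))
  end

  # Every message to any peer first makes a synchronous call to the Agent.
  def push(peer_id, message) do
    case Agent.get(__MODULE__, &Map.fetch(&1, peer_id)) do
      {:ok, pid} ->
        send(pid, message)
        :ok

      :error ->
        {:error, :peer_not_found}
    end
  end
end

{:ok, _agent} = PeerDirectory.start_link()
:ok = PeerDirectory.register("peer-a", self())
:ok = PeerDirectory.push("peer-a", {:offer, "sdp placeholder"})
{:error, :peer_not_found} = PeerDirectory.push("missing-peer", {:offer, "sdp placeholder"})
```

The sketch also shows the bookkeeping cost: nothing removes an entry unless unregister/1 is called explicitly.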

The second approach does not have this problem. When a peer registers over the WebSocket, it joins a unique topic of the form peer:<UUID>, e.g. peer:20dd48ca-fdcf-41c9-9a3b-f192f77650f9. We send a message to that topic using the function ApplicationName.Endpoint.broadcast(topic, event, payload).
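The topic-per-peer idea can be tried outside Phoenix. The sketch below is only a stand-in: it uses Elixir's built-in Registry in place of a Phoenix channel join and the endpoint's broadcast, and the TopicSketch module and its function names are hypothetical. In the real app, joining the channel and calling ApplicationName.Endpoint.broadcast/3 do this work.

```elixir
# Stand-in for per-peer topics, using Registry instead of Phoenix channels.
defmodule TopicSketch do
  @pubsub TopicSketch.PubSub

  def start_link do
    Registry.start_link(keys: :duplicate, name: @pubsub)
  end

  # Topic format from the post: peer:<UUID>
  def topic(peer_id), do: "peer:" <> peer_id

  # Subscribes the calling process to the peer's own topic.
  def join(peer_id) do
    {:ok, _owner} = Registry.register(@pubsub, topic(peer_id), nil)
    :ok
  end

  # Sends {event, payload} to every process subscribed to the topic.
  # With one peer per topic, exactly one process receives it.
  def broadcast(topic, event, payload) do
    Registry.dispatch(@pubsub, topic, fn subscribers ->
      for {pid, _value} <- subscribers, do: send(pid, {event, payload})
    end)
  end
end

{:ok, _registry} = TopicSketch.start_link()
:ok = TopicSketch.join("20dd48ca-fdcf-41c9-9a3b-f192f77650f9")

:ok =
  TopicSketch.broadcast(
    "peer:20dd48ca-fdcf-41c9-9a3b-f192f77650f9",
    "offer",
    %{sdp: "placeholder"}
  )

receive do
  {"offer", payload} -> IO.inspect(payload, label: "received offer")
end
```

There is no shared lookup process on the sending path here: the sender only needs the topic name, which it can build from the peer ID.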

2. Maintaining peer information:

For this I also tried two approaches.

The first approach was to keep the global mapping (dictionary) from peer ID to socket. As described previously, this would also have served the purpose of getting the socket associated with a peer. We would check whether the peer ID exists in the dictionary and respond accordingly. But we would also take on the responsibility of cleaning up the dictionary whenever a socket connection closes or a socket process terminates.

The second approach leverages the global name registration facility of Elixir/Erlang, which registers a global name for a PID. Whenever that process crashes or terminates, the name is unregistered. This also works across multiple nodes.
Architecture all
I use the second approach, as it scales better and works across nodes. When a peer registers, we start a simple GenServer that maintains information about the peer and is registered under a global name like peer_state:<UUID>. This process is linked to the socket process, so if the socket closes or crashes, the GenServer also goes down and its name is unregistered. This way we don't have to maintain the list ourselves.
When we need to find out whether a peer exists, we call :global.whereis_name(Name). It returns the PID if the name is registered and :undefined otherwise, which means the peer does not exist.
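Here is a minimal sketch of that lifecycle, under a few assumptions: the PeerState module and its functions are hypothetical names, not the app's actual GenServer, and a spawned process stands in for the socket process. It registers the GenServer under a global name, checks the name with :global.whereis_name/1, and then shows the name disappearing once the owning process is shut down.

```elixir
# Hypothetical sketch: one GenServer per peer, registered under a global name.
defmodule PeerState do
  use GenServer

  # Linked to the caller, which plays the role of the socket process.
  def start_link(peer_id) do
    GenServer.start_link(__MODULE__, peer_id, name: {:global, global_name(peer_id)})
  end

  # A peer exists as long as its state process is registered.
  def exists?(peer_id) do
    :global.whereis_name(global_name(peer_id)) != :undefined
  end

  defp global_name(peer_id), do: "peer_state:" <> peer_id

  @impl true
  def init(peer_id), do: {:ok, %{peer_id: peer_id}}
end

parent = self()

# This process stands in for the socket process that owns the peer.
socket =
  spawn(fn ->
    {:ok, state_pid} = PeerState.start_link("peer-a")
    send(parent, {:started, state_pid})

    # Block forever; the process only ends when it is shut down from outside.
    receive do
      :never -> :ok
    end
  end)

state_pid =
  receive do
    {:started, pid} -> pid
  end

true = PeerState.exists?("peer-a")

# Simulate the socket closing: the linked GenServer goes down with it.
ref = Process.monitor(state_pid)
Process.exit(socket, :shutdown)

receive do
  {:DOWN, ^ref, :process, _pid, _reason} -> :ok
end

false = PeerState.exists?("peer-a")
```

One caveat with links: an owner that exits with reason :normal does not take a linked, non-trapping process down with it, so this cleanup relies on the owner exiting with a non-normal reason such as :shutdown.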

Communicating the WebRTC offer

Now to the easy part: communicating the offers. We have already discussed how to check that a peer exists and how to communicate with a peer using its unique ID, so let's see how the offers travel back and forth between the peers.
I'll refer to the peer who shares a file as the sender and the peer who intends to download it as the receiver. The share URL looks something like this: http://epicshare.zohaib.me/?peer_id=20dd48ca-fdcf-41c9-9a3b-f192f77650f9. It contains the ID of the sender.
On opening this URL, the receiver is also assigned a unique ID and registers itself with the broker as described before. Now both peers are connected to the broker over WebSockets.

The next phase is to send and receive the offer. The following sequence diagram shows how the flow works.

WebRTC offer communication sequence diagram
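The broker's part of this flow is small: check that the target peer is registered, then forward the message to its topic or reply with an error. The following is a minimal sketch under the naming used earlier in this post. The RelaySketch module, the event name, and the injected broadcast function are hypothetical; in the app the broadcast would be ApplicationName.Endpoint.broadcast/3.

```elixir
# Hypothetical sketch of the broker's relay step.
defmodule RelaySketch do
  # Forwards `payload` to the target peer's topic if that peer is registered.
  # `broadcast` is any 3-arity function shaped like Endpoint.broadcast/3.
  def relay(to_peer_id, event, payload, broadcast) when is_function(broadcast, 3) do
    case :global.whereis_name("peer_state:" <> to_peer_id) do
      :undefined ->
        {:error, :peer_not_found}

      pid when is_pid(pid) ->
        broadcast.("peer:" <> to_peer_id, event, payload)
        :ok
    end
  end
end

# For the demo, this process plays the target peer's registered state process,
# and "broadcasting" just sends the message back to us.
:yes = :global.register_name("peer_state:peer-b", self())
me = self()
broadcast = fn topic, event, payload -> send(me, {topic, event, payload}) end

:ok = RelaySketch.relay("peer-b", "offer", %{sdp: "placeholder"}, broadcast)
{:error, :peer_not_found} = RelaySketch.relay("unknown-id", "offer", %{}, broadcast)
```

The same function would carry the answer in the other direction; only the target peer ID and the event name change.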

References:

Source Code:

Code for the peer channel

Code for the GenServer which maintains the state of the peer