Each year we host around half-a-dozen interns (stagiaires), for periods of two to six months. These young men and women are usually studying or have just finished studying for undergraduate or Master’s degrees, and are eager to gain experience in the blockchain industry. Each has a unique story to tell about their individual interest in this technology.
In this blogpost, we will ask three questions of one of our current interns: Julien Coolen (and a couple of questions of his mentors at Nomadic Labs).
Julien — so happy to have you with us! We hope you’re having a wonderful and educational period with Nomadic Labs. We’d love to hear a bit about you and your activities chez nous …
I am a first-year master’s student in cryptology (double master Mathématiques, Informatique de la cryptologie et sécurité) at the Université de Paris. I graduated in 2020 with bachelor’s degrees in Mathematics and Computer Science.
Being fond of open research, open source, and formal verification, I tried to gain experience in these areas. In 2019 I worked as a system administrator intern to configure and deploy a digital library for my mathematics faculty. Then in 2020 a friend and I attempted to program and prove the correctness of an LL(1) parser using the Coq proof assistant. We thank Yann Régis-Gianas for this learning opportunity. Also in 2020, I developed features for an OCaml bot to simplify collaborative software development, under the supervision of Théo Zimmermann at Inria.
2. Tell us more about your internship: its main subject, who your mentors are, what you have learned and, especially, what has surprised you the most during these months?
I am developing a distributed hash table (DHT) for file sharing during a three-month internship from June to August 2021, under the supervision and mentorship of research engineers Vivien Pelletier and Julien Tesson at Nomadic Labs.
A hash table is a key-value store: you need a key to retrieve each piece of stored information. However, as the store grows, it becomes limited by the memory of a single computer. The distributed nature of DHTs addresses this issue: the data is spread across different computers on the internet. A well-known example is the DHT used by the IPFS distributed web.
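To make the idea concrete, here is a minimal, single-process sketch in Python. It is an illustration only, not the design built during the internship: the names `Peer`, `ToyDHT`, and `responsible_peer` are hypothetical, and the rule "store each key on the peer whose identifier is closest under XOR distance" is an assumption borrowed from Kademlia-style DHTs.

```python
import hashlib


def key_id(data: bytes) -> int:
    """Map arbitrary bytes to a 256-bit integer identifier."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


class Peer:
    """A hypothetical peer holding one slice of the key-value store."""

    def __init__(self, name: str):
        self.name = name
        self.id = key_id(name.encode())
        self.store = {}  # this peer's local part of the table


class ToyDHT:
    """All peers live in one process; a real DHT talks over a network."""

    def __init__(self, peers):
        self.peers = list(peers)

    def responsible_peer(self, key: bytes) -> Peer:
        # Assumption: the peer closest to the key under XOR distance
        # is the one responsible for storing it.
        k = key_id(key)
        return min(self.peers, key=lambda p: p.id ^ k)

    def put(self, key: bytes, value: bytes) -> None:
        self.responsible_peer(key).store[key] = value

    def get(self, key: bytes):
        # Returns None when the key was never stored.
        return self.responsible_peer(key).store.get(key)


if __name__ == "__main__":
    dht = ToyDHT([Peer("alice"), Peer("bob"), Peer("carol")])
    dht.put(b"report.pdf", b"...file contents...")
    print(dht.responsible_peer(b"report.pdf").name, dht.get(b"report.pdf"))
```

Because every peer computes the same hash, any of them can work out which peer holds a given key without a central index. No single machine has to hold the whole table.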
There are several ways to implement a DHT. We have chosen one that exploits the Tezos peer-to-peer (P2P) library as an off-the-shelf and well-tested component. With it, the distributed parts of the distributed hash table (called peers) can communicate.
This project aims to test the architecture of the peer-to-peer library of Tezos in these somewhat different circumstances:
- In the Tezos blockchain, connections between peers are arbitrary, whereas in a DHT connections follow specific patterns and rules.
- The size of a message sent across the Tezos network is typically a few kilobytes, and only occasionally megabytes. The Tezos network protocol never requires transferring a message larger than 100 megabytes; meanwhile, the serialization library (which interfaces between the Tezos network protocol and the peer-to-peer library itself) is limited to a message size of 1 gigabyte.
My DHT places no theoretical limit on message size (though there are practical ones, e.g. disk space), so we intend to benchmark and then optimize the performance of the peer-to-peer network using my DHT, to see how far we can scale the practical message size. We hope to reach at least gigabyte-sized messages, which would ensure that the practical capabilities of the peer-to-peer library are a thousandfold above those required by the exigencies of the Tezos network.
After a bit more than a month, I (with help from my mentors) have built a prototype DHT and tested it using the Tezos library for unit and integration testing. To achieve this, I had to familiarize myself with part of the Octez codebase, which is the Tezos implementation to which Nomadic Labs contributes.
I thank Vivien and Julien for explaining the innards of a full-scale industrial codebase running live and safety-critical code. This experience has also brought home to me how just one tiny mistake in a distributed system can corrupt the entire network!
3. Why did you choose to become a part of the blockchain ecosystem and Nomadic Labs? What are your plans after completing this internship?
I chose Nomadic Labs because the company applies formal methods to build robust, open-source software with the OCaml programming language. In particular, their smart-contract stack-based language Michelson is the only one of its kind to be formalized!
Blockchains are in an exploratory phase, so they raise many technical challenges and create many opportunities for innovation and discovery. The end goal is to build resilient, transparent, and accessible systems.
After this internship, I hope to continue working with functional programming languages and formal methods.
Julien is independent, and in a good way. Our supervision consists mostly of discussing objectives for the week and answering his many penetrating technical questions.
This peer-to-peer layer is tailored to the OCaml Tezos node implementation (which is now called Octez). By using this layer for other purposes — sharing files, in this case — we hope to improve its maturity.
In developing a distributed hash table, Julien had to add some features to this layer that should be useful for future improvements of the Tezos node; for example, the ability to query the network address of a peer, given its cryptographic identity. Another feature that Julien plans to add is to let users of this layer configure the topology to their needs.
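The identity-to-address query mentioned above can be pictured as a small lookup table. The following Python sketch is purely illustrative and is not the Octez API: `PeerTable`, `register`, `address_of`, and the choice of a SHA-256 digest of the public key as the identity are all assumptions made for the example.

```python
import hashlib
from typing import Dict, Optional, Tuple

Address = Tuple[str, int]  # (host, port)


def peer_identity(public_key: bytes) -> str:
    """Hypothetical identity: hex digest of the peer's public key."""
    return hashlib.sha256(public_key).hexdigest()


class PeerTable:
    """Remembers the last known network address of each known peer."""

    def __init__(self) -> None:
        self._addresses: Dict[str, Address] = {}

    def register(self, public_key: bytes, address: Address) -> str:
        # Record (or update) the address under the peer's identity.
        ident = peer_identity(public_key)
        self._addresses[ident] = address
        return ident

    def address_of(self, ident: str) -> Optional[Address]:
        # None means the identity is unknown to this table.
        return self._addresses.get(ident)


if __name__ == "__main__":
    table = PeerTable()
    ident = table.register(b"example-public-key", ("192.0.2.10", 9732))
    print(ident[:16], table.address_of(ident))
```

A DHT needs this kind of query because it decides which peer to contact from the peer's identity, and then has to find out where on the network that peer can be reached.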