Hi, Steemit! We're Textile. Here's a deeper look at the tech behind our Threads protocol.

Written by Carson Farmer & Sander Pick

Download the app, take a picture, share!

Recently, we’ve started writing more about the technologies underlying Textile Photos that help keep your photos (and likes, comments, etc.) safe and secure on the decentralized web. In our previous post, we talked about the encryption process in Textile Photos, with a focus on how Textile delivers end-to-end encrypted photo sharing. Today’s post is a follow-up (though it should be detailed enough to stand on its own), this time highlighting how Textile coordinates private photo sharing among groups of users, a feature we call Threads.

Why we built it
We designed Threads to allow groups of users to share photos securely and privately, without any centralized, authoritative database. We also made sure it all works well offline, that it’s possible to recover lost data, and that it’s easy to add new members.

What makes Threads different
Threads  allow private groups to post photos and interact over a decentralized  network, maintaining complete control over their own content. Textile  operates in a completely zero-knowledge framework. Private by design.

Why Threads are exciting
Because photos are just the first step. Today, Threads allow users to share a photo with other Thread members in a secure, decentralized way. More broadly, Threads can facilitate secure sharing, coordination, and storage of many types of data over a decentralized network. Upgradable by design.

On  the surface, you can think of each Thread like a decentralized  database, shared between specific participants. We built Threads into  the fabric of Textile (see what we did there 😉) because group members  need a record of who shared what photo, and when. But, once we created  Threads, we realized just how powerful a concept this was — for those  familiar with mobile app development, think Realm or Firebase but without the centralized server.

To understand what Threads bring to the table, you need to understand Threads themselves. So let’s dig a bit deeper into how Textile conceptualizes and implements Threads, and how that helps keep your photos (and likes, comments, etc.) safe and secure on the decentralized web. We’ll start by highlighting the specific requirements we had when developing Threads, and then break down each of these requirements into the specific solutions that we came up with. Along the way, our CTO Sander Pick will highlight how those various solutions came about, and why we think our approach is in the best interest of our users.

The experience

Textile Photos  allows small, decentralized, private groups to share photos, send  messages, and engage with each other. That’s the experience, so it has  to ‘just work’.

Requirements

To drive the Textile user experience, we identified five key features needed in the sharing protocols:

  1. A mechanism to share and receive state updates within a group of n users
    To  enable photo sharing (and other common interactions such as likes and  comments) among a group of friends and/or colleagues, some concept of a shared state is required.
  2. A way to ensure the shared state stays resilient to peers dropping out or latency issues
    Since  we’re operating in a mobile environment, we have to expect peers to  continually drop ‘offline’ due to coverage issues, app back-grounding,  battery optimizations, and a whole slew of other reasons for a mobile  device to be cut off from a network.
  3. A way to avoid state conflicts with other members of the group
    On  top of the requirements above, when peers do come back online, we don’t  want any state changes that were made by other members of the group  while they were disconnected from the network to conflict with their own  local changes.
  4. A mechanism to recover the full state from the network as a whole
    Another  important consideration in the mobile world is that the number of users  (out of n) that are online at any given time is generally unknown, and  quite possibly zero. To reiterate, we want a decentralized shared state,  but it has to work even when you are the only member online.  This means we have to assume the full group state may not ever be  directly accessible (i.e., downloadable) from a single group member.  This is in contrast to something like Bitcoin, where new nodes are able to download the full blockchain from any connected peer.
  5. A way to link updates via their content, rather than where they are stored
    Since we are building on top of the IPFS network, and would like to eventually support a Filecoin-based  future in which users can select from a multitude of decentralized  storage providers, Threads need to embrace content addressing, rather  than location addressing. This makes it easy to grow and change the  underlying network, without affecting data access and sharing.

With these requirements in mind, let’s break down our solutions into their individual components…

Solutions

1. Handling Updates — use a peer-to-peer network with structured updates

First things first: how do we handle state updates between a set of distributed peers? This is mostly about peer-to-peer (p2p) networking.  And when it comes to communicating between heterogeneous network  devices (computers, phones, IoT devices, etc), we actually need many different types of network protocols.  That way, no matter what type of device we are talking about — be it a  phone, desktop computer, browser, or Internet-enabled fridge — it is  able to communicate with other devices located in the same room, or on  the other side of the planet.

At Textile, we use the super amazing libp2p  library for our networking needs. Libp2p is a networking stack and  library (you might have heard it called a protocol suite) modularized  out of the IPFS project,  and bundled separately for other tools to use. Essentially, libp2p does  all the heavy network lifting so that we can focus on our core task:  exchanging updates between communicating peers.

Decentralized, peer-to-peer networks are radically different types of communication networks.

Libp2p  was a pretty natural choice for us. The stack includes all the crypto  and networking protocols we need to deliver messages to group members,  and the libp2p developer community is super responsive and excited about  the power of p2p interactions. Easy choice.

The other really nice thing about using the libp2p library is that it comes packed with many useful cryptography tools and functions that keep communications secure. For instance, all p2p communications over the Textile network use the secio stream security transport. All connections therefore use secure sessions provided by libp2p/secio to encrypt traffic, with a TLS-like handshake used to set up the initial communication channel.

Like many IPFS-based projects, Textile uses Protocol Buffers for over-the-wire communication, and advanced cryptographic algorithms to secure those messages. Essentially, each update to the shared group state is just an encrypted Protobuf message with two parts: a header with author and date info, and a body with the type-specific data. These pieces are sent in their own inner ‘envelope’, which contains a link to the encrypted message and the Thread ID. This inner envelope is then signed by the sender and placed into the wire ‘envelope’ along with its signature. You can read more about some of the cryptographic tools Textile uses in this previous article. You can also check out how we structure our Protobuf messages, learn a bit more about how secio works, plus check out some recent updates to message encryption while you’re at it.
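
As a rough illustration of the nesting described above, the following Python sketch builds a two-part update, puts a link to it in an inner envelope alongside the Thread ID, and signs that envelope. Everything in it is an assumption made for illustration: the field names, the JSON encoding (Textile uses Protobuf), the SHA-256 hex link standing in for an IPFS link, and the HMAC standing in for a real public-key signature. Encryption is omitted entirely; the update bytes are simply treated as if they were already ciphertext.

```python
import hashlib
import hmac
import json


def link_to(data: bytes) -> str:
    """Content link: hex SHA-256 of the bytes (stand-in for an IPFS link)."""
    return hashlib.sha256(data).hexdigest()


def build_update(author: str, update_type: str, payload: dict, date: float) -> bytes:
    """Serialize a two-part update: a header (author, date) and a typed body."""
    update = {
        "header": {"author": author, "date": date},
        "body": {"type": update_type, "data": payload},
    }
    # Encryption is omitted: callers treat these bytes as the "encrypted" update.
    return json.dumps(update, sort_keys=True).encode()


def wrap(encrypted_update: bytes, thread_id: str, signing_key: bytes) -> dict:
    """Put a link to the update in an inner envelope, then sign that envelope."""
    inner = {"thread": thread_id, "link": link_to(encrypted_update)}
    inner_bytes = json.dumps(inner, sort_keys=True).encode()
    # HMAC is a placeholder; a real peer would sign with its private key.
    signature = hmac.new(signing_key, inner_bytes, hashlib.sha256).hexdigest()
    return {"inner": inner, "sig": signature}


def verify(envelope: dict, signing_key: bytes) -> bool:
    """Recompute the signature over the inner envelope and compare."""
    inner_bytes = json.dumps(envelope["inner"], sort_keys=True).encode()
    expected = hmac.new(signing_key, inner_bytes, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, envelope["sig"])


update = build_update("peer-a", "DATA", {"photo": "beach.jpg"}, 1.0)
envelope = wrap(update, "thread-1", b"demo-key")
print(verify(envelope, b"demo-key"))
```

Because the signature covers the inner envelope (link plus Thread ID), changing either field after signing makes verification fail in this sketch.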

2. Network Resilience — support offline messaging so peers can come and go

If you are at all familiar with libp2p, then you might be thinking “ah libp2p has a pubsub layer that would be perfect for exchanging updates to a group of connecting peers”. And while you’d certainly be right, there are a few key limitations that make using pubsub for something like Textile Photos pretty cumbersome. On top of this, while pubsub is super nice for things like chat rooms or distributed services, it is a ‘fire-and-forget’ messaging protocol, meaning that once a peer publishes a message, it is up to its peers to ensure they are listening for the right message at the right time. To circumvent this, some pubsub systems introduce message echoing, to ensure a message stays in the system long enough to be picked up by the peers who might need it. However, this can lead to really noisy network traffic, and is really just a band-aid over a larger issue.

Our  initial POC involved pubsub and always-online room echoers… not  scalable or particularly decentralized. A real solution to distributing  state has to involve direct messaging with an offline mechanism.

This gets at our second requirement, that the shared state stays resilient to peers dropping out. We need to assume peers might not be around to receive important messages in ‘real-time’, which is a common problem with p2p systems. Right now, Textile addresses this problem by enabling what you might call offline messaging. Since we’re already using IPFS for data storage and communication, we wanted to take advantage of some of the core technologies driving IPFS. In particular, we (currently) use a special fork of the Kademlia-based distributed hash table (DHT) used by IPFS, which allows us to post messages for a peer directly in the DHT. For those unfamiliar with DHTs, a DHT is a hash table whose data is spread across a network of nodes or peers, and these peers are coordinated to enable efficient access and lookup in a decentralized way. You can read more about this kind of stuff in our previous article about how IPFS peers find, request, and retrieve content (and each other) on the decentralized web. So, when a peer we want to communicate with is offline, rather than blindly sending them a message that will never be received, we post a message to Textile’s DHT, and they can then retrieve that message the next time they come online. Conceptually simple, and it works pretty well in practice.

p2p network with custom Textile DHT overlay. Peers post (key, value) pairs, where the value is the message and the key is specific to the intended recipient. The key is broadcast and available to the entire network, though only the intended recipient is able to decrypt the actual message content. Based on Figure 1–2 from this thesis.
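
To make the offline-messaging idea a little more concrete, here is a minimal Python sketch. It is not Textile’s DHT fork: `ToyDHT` is an in-memory dictionary standing in for the network, and `inbox_key`, `send`, and the `deliver` callback are hypothetical names invented for this example. The only behavior it tries to capture is the one described above: deliver directly when the recipient is online, otherwise post the (already encrypted) message under a recipient-specific key for later retrieval.

```python
import hashlib


class ToyDHT:
    """In-memory stand-in for a DHT that stores values under keys."""

    def __init__(self):
        self._records = {}

    def put(self, key, value):
        self._records.setdefault(key, []).append(value)

    def get(self, key):
        # Records are returned, not removed: nothing here deletes a message.
        return list(self._records.get(key, []))


def inbox_key(peer_id: str) -> str:
    """Derive a recipient-specific key (hypothetical scheme for this sketch)."""
    return hashlib.sha256(("inbox:" + peer_id).encode()).hexdigest()


def send(dht, online_peers, recipient, ciphertext, deliver):
    """Deliver directly if the recipient is online, else post to the DHT."""
    if recipient in online_peers:
        deliver(recipient, ciphertext)
        return "direct"
    dht.put(inbox_key(recipient), ciphertext)
    return "offline"


dht = ToyDHT()
received = []
send(dht, set(), "user-b", b"<encrypted update>", lambda r, m: received.append(m))
# Later, when user-b comes back online, they look up their own inbox key.
pending = dht.get(inbox_key("user-b"))
```

Note that `get` leaves the stored messages in place, so a peer that reconnects often will keep seeing the same left-over messages.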

There are still some issues with our current approach, including that it is difficult, if not impossible, to remove messages from the DHT manually. Indeed, it can get a bit messy when left-over offline messages have to be retrieved each time a peer comes back online: for a peer that goes in and out of service frequently, this could lead to a lot of network traffic and wasted CPU cycles. So, we’ve implemented an alternative to this DHT-based offline messaging system that does not suffer from these limitations (and also allows us to participate in the public IPFS network), while still remaining decentralized and scalable in the long term. This new approach should be released soon, after more testing and evaluation. You can follow along with this progress as part of the move towards a Cafe-based setup (see also What’s Next).

3 & 4. Avoiding Conflicts & State Recovery —use a CRDT to keep an immutable history across peers

OK, so our next requirement addresses a problem that has received a great deal of research and development attention over the years. The question of “how to avoid state conflicts with other members of a group?” comes up when working collaboratively on documents, updating shared databases, etc. For the purposes of updating a shared Thread of photos, it turns out that an operation-based CRDT that supports append-only operations is pretty much all you need to get going. You can think of Textile’s CRDT setup (which shares some ideas with ipfs-log) as an immutable, append-only tree that can be used to model a mutable, shared state between peers. Every entry in the tree is saved on IPFS, and each points to the hash of its previous entry or entries, forming a graph. These trees can be merged via fast-forward or three-way merges.

If you’re familiar with git and other similar systems, you might be thinking this sounds a lot like a git hash tree, Merkle DAG, or even a blockchain. And you’d be right! The concepts are very similar, and this buys us some really nice properties for building and maintaining a shared state. By modeling our shared Thread state in this way, we benefit from tried and tested methods for allowing a peer to incorporate other peers’ updates into their state while maintaining history (via fast-forwards and three-way merging, for example).

At the end of the day, a Thread is just a git-like hash tree of updates with a deterministic merge policy. Simple.

So  what does this look like in practice? Currently — because things might  change as we make improvements to the underlying implementation — each  Thread in Textile Photos is essentially a chain of updates, where each  update represents some specific action or event. For instance, when you  create a new Thread, under-the-hood you are actually creating a JOIN update on a new Thread chain. Similarly, when you update the Thread via a new photo (DATA update), comment, or like (ANNOTATION update), you’re actually updating that Thread chain. After each modification, the HEAD of the Thread will point to the latest update.
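
A minimal sketch of such a chain, in Python, might look like the following. The class and field names are hypothetical, and a SHA-256 hash over a JSON record stands in for the IPFS hash of a real update; the point is only to show each update linking to its parent and HEAD moving forward after every modification.

```python
import hashlib
import json


def update_id(kind, author, date, parents, payload):
    """Hash the update's contents (a stand-in for the IPFS hash of an update)."""
    record = {"type": kind, "author": author, "date": date,
              "parents": parents, "payload": payload}
    return hashlib.sha256(json.dumps(record, sort_keys=True).encode()).hexdigest()


class ThreadChain:
    """Append-only chain of updates whose HEAD tracks the latest update."""

    def __init__(self):
        self.updates = {}  # update id -> update record
        self.head = None   # id of the latest update, or None while empty

    def add(self, kind, author, date, payload=None):
        # Each new update points back at the current HEAD (if there is one).
        parents = [self.head] if self.head is not None else []
        uid = update_id(kind, author, date, parents, payload)
        self.updates[uid] = {"type": kind, "author": author, "date": date,
                             "parents": parents, "payload": payload}
        self.head = uid
        return uid


thread = ThreadChain()
join = thread.add("JOIN", "user-a", 1)
photo = thread.add("DATA", "user-a", 2, {"photo": "beach.jpg"})
like = thread.add("ANNOTATION", "user-a", 3, {"like": photo})
```

After these three calls, `thread.head` is the ANNOTATION update, whose parent is the DATA update, whose parent is the original JOIN.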

Building on top of these ideas, we also have concepts such as an INVITE, which points a new peer to a given point on the Thread chain, or a MERGE, which happens when the current HEAD is not contained in an incoming update’s parent list for some reason (maybe the peer doesn’t know about it because they were offline). If two peers are merging the same subtrees, all they need to do to ensure the update resolves to the same hash is (a) include the same date and (b) exclude author info. To get the same date, they both follow a rule: choose the latest of the parents’ dates (in practice, they add a small increment to keep it ahead of both parents).
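
The merge rule above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than Textile’s implementation: the record layout, the fixed `MERGE_DATE_STEP` increment, and the choice to sort the parent IDs so that argument order does not matter are all made up for the example.

```python
import hashlib
import json

MERGE_DATE_STEP = 1  # hypothetical increment that keeps a merge ahead of its parents


def merge_update(parent_a, parent_b):
    """Build a MERGE update that depends only on its two parents."""
    # Sorting makes the result independent of argument order (an assumption).
    parents = sorted([parent_a["id"], parent_b["id"]])
    # Rule from the text: take the latest parent date, then nudge it forward.
    date = max(parent_a["date"], parent_b["date"]) + MERGE_DATE_STEP
    # No author field, so every peer produces an identical record.
    record = {"type": "MERGE", "parents": parents, "date": date}
    digest = hashlib.sha256(json.dumps(record, sort_keys=True).encode()).hexdigest()
    return {**record, "id": digest}


data = {"id": "aaa", "date": 10}
annotation = {"id": "bbb", "date": 12}
merge_on_peer_a = merge_update(data, annotation)
merge_on_peer_b = merge_update(annotation, data)
print(merge_on_peer_a["id"] == merge_on_peer_b["id"])
```

Since both peers arrive at the same hash independently, neither needs to send its MERGE update to the other.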

To give you a better idea of what exactly we’re talking about, consider the following set of operations: User A creates a new Thread, and adds a Photo. They then externally invite User B (sent via some other secure communication channel), who eventually joins the Thread. But before User B is able to join the Thread, User A adds another Photo, moving the Thread’s HEAD forward. By the time User B joins the Thread, they’d end up with a Thread sequence that looks something like this:

Thread  join example. Solid arrows point towards the ‘parent’ of a given  update, over-the-wire communications are indicated with a 📶-style  arrow, and messages that are rebroadcast (e.g., via the welcome message)  are indicated with a dashed arrow. Similarly, merges point to both  their parent updates.

Here, we see the merge happening at the end of the sequence because the bottom peer is joining via an external invite that no longer points to HEAD, forcing them to merge the most recent DATA update with their own JOIN update. But since merge results are deterministic (given the same parents), both peers create the MERGE update locally, and do not broadcast it, to avoid trading merges back and forth.

A more complete sequence is given in the following figure. Suppose User A goes ‘offline’ (e.g., their phone goes to sleep, they shut down the app, they lose their data connection, etc.), and in the meantime, both User A and User B update the Thread, with User A adding an ANNOTATION update, and User B adding a new Photo (DATA update). Now, when User A comes back online, there is a conflict, and both Users create a MERGE update to remedy this. A MERGE update has two parents: in this case, the DATA and ANNOTATION updates from the different users. As always, the HEAD continues to point to the latest update (which in the example below eventually becomes an ANNOTATION from User B). Once both peers are online again, the more straightforward update and transmit mode of operation can continue.

More  complex Thread interaction where one or more peers are temporarily  offline. Note that an external invite is the same as a normal invite,  but the invite details are encrypted with a single use key, which is  sharable with the invite update location.

The same properties that make hash trees or blockchains useful for developing a shared, consistent (consensus-driven) state also make it possible to address our fourth requirement: the ability to recover the full state from the network as a whole. Because each Thread update references its parent(s), given a single point on the Thread chain, we can trace back all the way to the beginning of the Thread. For example, at any point along the sequence in the above figures, a peer can trace back the history of the Thread, as indicated by the solid arrows. This works particularly nicely when a peer JOINs a Thread, even at a point prior to the current HEAD. They can simply JOIN, and any existing Thread member can send them the latest HEAD (even via offline messages if needed). From here, they can explore the entire history of the Thread with ease. This is all really similar to git, in which one only needs to know about a single commit to be able to trace back the entire history of a code project; it’s also essentially how blockchains work.
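
Tracing a Thread back from a single update is a plain graph walk over parent links. The Python sketch below assumes a hypothetical in-memory mapping from update ID to parent IDs; in practice each lookup would be a fetch of that update from the network by its hash.

```python
from typing import Dict, List


def history(parents_of: Dict[str, List[str]], head: str) -> List[str]:
    """Return every update reachable from head by following parent links."""
    seen = set()
    order = []
    stack = [head]
    while stack:
        uid = stack.pop()
        if uid in seen:
            continue  # merges can make an update reachable along two paths
        seen.add(uid)
        order.append(uid)
        stack.extend(parents_of[uid])
    return order


# Hypothetical Thread: two branches diverge after photo-1 and are merged.
parents_of = {
    "join-a": [],
    "photo-1": ["join-a"],
    "annotation-a": ["photo-1"],
    "photo-2": ["photo-1"],
    "merge": ["annotation-a", "photo-2"],
}
full_history = history(parents_of, "merge")
```

Starting from `"merge"` recovers all five updates, while starting from `"photo-1"` recovers only that update and the original JOIN.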

5. Content Addressing — store everything on IPFS and get ready to scale

As we alluded to earlier, each update to a Thread is backed by an IPFS CID hash (i.e., they are content-addressable chunks of data on IPFS). This means where the data is stored is no longer relevant… IPFS will find it on the network via its hash. This helps us address our fifth requirement, that we have a way to link updates via their content, rather than where they are stored. We’ve covered this topic a lot in the past, but for the uninitiated, the next paragraph provides a summary of how content addressing on IPFS works (pulled from this previous article).

Rather than referencing a file or chunk of data by its location (think HTTP), we reference it via its fingerprint. In IPFS and other such systems, this means identifying content by its cryptographic hash, or even better, a self-describing content-addressed identifier (multihash). A cryptographic hash is a (relatively) short alphanumeric string that’s calculated by running your content through a cryptographic hash function (such as SHA-256). For example, when the (unencrypted) Textile logo is added to IPFS, its multihash ends up being QmbgGgWW3vH7v9FDxVCzcouKGChqGEjtf6YLDUgSHnk5J2. This ‘hash’ is actually the CID (Content IDentifier) for that file, computed from the raw data within that PNG. For all practical purposes, it is unique to the contents of that file, and that file only. If we change that file by even one bit, the hash will become something completely different.
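
The short Python sketch below shows the two ideas in that paragraph: a hash that changes completely when the content changes, and a self-describing multihash. It builds a sha2-256 multihash (the bytes 0x12 and 0x20, followed by the 32-byte digest) and base58-encodes it, which is why such identifiers start with `Qm`. One important caveat: this hashes the raw bytes directly, whereas IPFS hashes its own encoding of the file, so this sketch will not reproduce the CID IPFS reports for a given file. The function names are invented for this example.

```python
import hashlib

BASE58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58_encode(data: bytes) -> str:
    """Encode bytes using the Bitcoin-style base58 alphabet."""
    number = int.from_bytes(data, "big")
    encoded = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        encoded = BASE58_ALPHABET[remainder] + encoded
    # Each leading zero byte is represented by a leading '1'.
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded


def sha256_multihash(content: bytes) -> str:
    """Prefix a SHA-256 digest with its multihash header and base58-encode it."""
    digest = hashlib.sha256(content).digest()
    # 0x12 identifies sha2-256; 0x20 is the digest length (32 bytes).
    return base58_encode(bytes([0x12, 0x20]) + digest)


original = sha256_multihash(b"textile")
changed = sha256_multihash(b"textilf")  # one character differs
print(original)
print(changed)
```

The two printed strings share the `Qm` prefix, which comes from the multihash header, but otherwise have nothing recognizable in common.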

Now, when we want to access a file over IPFS (like the above logo), we can simply ask the IPFS network for the file with that exact CID. The network will find the peers that have the data (using a DHT), retrieve it, and verify (using the CID) that it’s the correct file. What this means is we can technically get the file from multiple places because, as long as the file matches the hash, we know we’re getting the right data. Which brings us to the solution to our final requirement… use IPFS! For now, Textile is maintaining a network of large, homogeneous, volunteer nodes (we call them Cafes) to ‘pin’ and store content on IPFS. It is important to note here that the nodes doing the pinning are the same as the nodes on your phone: Textile Nodes that offer a pinning service to other peers. Soon, we’ll allow users to elect their own Cafe nodes, and even add additional nodes for redundancy. All this could eventually be driven by Filecoin for even greater scalability and flexibility.
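
The “get it from anywhere, then verify” property can be sketched as follows. This is a toy model, not the IPFS API: each provider is just a Python dictionary, a hex SHA-256 digest stands in for a CID, and `fetch_verified` is a hypothetical helper that accepts the first copy whose hash matches the requested identifier.

```python
import hashlib


def content_id(data: bytes) -> str:
    """Hex SHA-256 of the data (a simplified stand-in for a CID)."""
    return hashlib.sha256(data).hexdigest()


def fetch_verified(cid, providers):
    """Return the first copy whose hash matches cid, or None if none does."""
    for provider in providers:
        data = provider.get(cid)
        if data is not None and content_id(data) == cid:
            return data
    return None


logo = b"pretend these are the PNG bytes"
cid = content_id(logo)

tampered_peer = {cid: b"something else entirely"}  # claims to have it, but lies
empty_peer = {}                                    # does not have it at all
honest_peer = {cid: logo}

result = fetch_verified(cid, [tampered_peer, empty_peer, honest_peer])
```

Because the check is against the content itself, it does not matter which peer answers or whether that peer is trusted; a bad copy is simply skipped.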

What’s Next?

So  there you have it. Five solutions to five requirements for seamless,  secure, decentralized photo sharing and backup. Easy 😉. And at a  conceptual level, the Textile Thread protocol is  relatively simple: blocks of operations chained together to produce a  beautiful Thread of photos. But there’s a lot of complexity going on  under-the-hood that has required a lot of experimentation, testing, and  limit pushing, especially on mobile. And our journey isn’t over yet.

The Textile team is still hard at work iterating, updating, and improving upon what we already have working. For example, we’ll soon be moving to a new offline messaging system that allows us to drop the custom DHT fork, and move back to the public IPFS network. On top of this, our move to more powerful backup and recovery capabilities has us taking new approaches to security, profile management, offline interactions, and much much more. In addition, the team is actively working to modularize the Threads concept and code into its own stand-alone package, which should provide developers with something akin to a Realm and/or Firebase layer for decentralized mobile applications!

If you are interested in learning more about this stuff, reach out over Twitter or Slack, or pull us aside the next time you see us at a conference or event. We’re happy to provide background, thoughts, and opinions on how we think the future of decentralized apps will play out. In the meantime, don’t forget to check out our GitHub repos for code and PRs that showcase our current and old implementations. We try to make sure all our development happens out in the open, so you can see things as they develop. Additionally, if you haven’t already, don’t miss out on signing up for our waitlist, where you can get early access to Textile Photos, the beautiful interface to Textile’s Threads.

Here's an intro to Textile from @andrewxhill:

We've been in private beta through the Fall and are getting very close to a public release. We wanted to invite Steem'ers to give it an early try though. Please keep in mind that it's Beta, the app is still changing a lot between each release. We also have a major migration planned in a week in order to get users a bunch of cool new features.

Brief overview:

  • Textile runs an IPFS node directly on your mobile device
  • Photos you add to the app will be privately encrypted with a key only you have access to, and hashed for IPFS.
  • You can create Threads, which are small groups to share photos with; members all share a set of keys specific to the Thread.
  • When you post a photo to a Thread, you share the photo key to Thread members so they can fetch it and view it.

Here's another technical post on our encryption: The 5 steps to end-to-end encrypted photo storage and sharing

If you are still interested, you can join me with the referral code, 'OCCER'

Just enter it after you download the app for iOS or Android.

Then, after you have the app up and running, click the following link from your phone to join my Steem Thread to test out sharing.

https://www.textile.photos/invites/new#id=QmNxaZpX2EvsJ9GGvxzqHBdRnzF5GkGNFeX3fRX2n7N4FJ&key=1COZFJ77NuJhCeOuoIdlf8n8ajU1COZFCR0fSKN3fGcr&inviter=andrewxhill&name=Steem&referral=MSCES
