Decentralization vs. Centralization
Is it time for P2P to enter a new era of relevancy? From Napster to LimeWire to IPFS to blockchain, P2P has been the underdog Bahamut of modern infrastructure. Next year’s potential rise in decentralized A.I. will mark the beginning of a long-term battle between two ideologies that have been competing head-to-head since the late 90s and early 2000s. The floodgates of A.I. and the floodgates of crypto have only pushed this narrative further. People keep returning to peer-to-peer principles, to a need to own their data while still benefiting from a network of shared users.
My current perspective is that value in the tech industry is defined by a product that ingests data, stores it in a central place, and then sells it at a price. Whether the medium is a Large Language Model or a Social Network, the core variable in the equation of market valuation is the data you have that others don’t, ingested along a journey of patterns that is inherently more valuable than a competitor’s. When applying P2P principles, creative monetization methods can be devised that bring the user into pricing discussions. By involving the user instead of turning the user into the product, they become an active stakeholder, and their usage patterns become their return on investment.
For the rest of this write-up, I am going to write in a style of posed questions, to spark discussion in the reader’s mind, since no one truly knows where the puck will be in the future.
Where does Model Context Protocol sit in the P2P landscape? It’s an ideal solution for a high-level peer-to-peer relationship between models, but is there a lower-level solution available, one where individual LLMs matter less once they cross a threshold of hosting affordability and parameter count? This is from the perspective of start-ups or corporations that do not want to burn more than they earn. That leads to a follow-up question: how is quality defined? Can the data of a smaller startup produce results or responses that are distinct from competing Goliaths that amass data lakes for a fading promise of AGI?
It seems to me we are past the marketing era of LLMs, with dressed-up responses that turn into A.I. slop and seed A.I. psychosis in users’ minds. Should we go back to thinking of these solutions as calculators? LLMs are machines that deliver a solution for a problem you state. Nothing more, nothing less. If it gets the answer wrong, it’s not a good model or infrastructure. The “look-up” procedure is the key: mapping questions into the proper look-up tables. I wonder if we can take inspiration from LUTs in the computational photography space. And if the LLM cannot answer the question, then it should simply respond with the truth, pushing the question into a queue to be filled with an answer over time. Building a system to create that automated learning framework will be interesting to explore. I honestly don’t believe we are building alien systems; we are building ships with auto-pilot functionality, and the goal is to find India, but we may end up finding America instead.
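To make that feel less abstract, here is a minimal sketch in Python of what I mean. The `embed()` function is a stand-in for whatever embedding model you run, and the threshold and in-memory structures are placeholders rather than a real design: answer from the table when a close match exists, otherwise admit it and queue the question to be filled in later.

```python
import math
from collections import deque

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

class AnswerLUT:
    """Map questions onto a look-up table of known answers, or admit ignorance."""

    def __init__(self, embed, threshold=0.85):
        self.embed = embed          # assumed: text -> list[float]
        self.threshold = threshold  # how close a question must be to reuse an answer
        self.entries = []           # (embedding, answer) pairs
        self.unanswered = deque()   # questions waiting to be filled in over time

    def ask(self, question):
        q = self.embed(question)
        best = max(self.entries, key=lambda e: cosine(q, e[0]), default=None)
        if best and cosine(q, best[0]) >= self.threshold:
            return best[1]
        self.unanswered.append(question)  # the "truth": we don't know yet
        return "I don't know yet."

    def learn(self, question, answer):
        self.entries.append((self.embed(question), answer))
```

The `unanswered` queue is the seed of the automated learning loop: drain it, fill it, and the table grows over time.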
Seer embodies the challenge of these questions while prioritizing privacy, security, and extreme transparency. Evaluating the product should not require a pitch; it either is or is not XYZ subject matter. We want to store a user’s private information securely with a document-based strategy. One form is using embeddings generated from a user’s texts or journal entries. These embeddings are stored in a LUT-style format as a document on a server. In this paradigm, we treat embeddings, the vector representations of text, as public keys and the original text documents as private keys. Only the embeddings (public keys) are stored in a database and used to retrieve or compare against other embeddings, while the original text documents (private keys) are kept secure client-side. Even if Seer’s servers leak information, the idea is that whatever leaks can’t be used against the user or the institution.
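A rough sketch of that split, again assuming a hypothetical `embed()` function and using two plain dictionaries to stand in for device storage and Seer’s database. The point is only which side holds what.

```python
def store_entry(text, embed, client_store, server_store):
    """Keep the raw text on the device; ship only the embedding to the server."""
    doc_id = f"doc-{len(client_store)}"
    client_store[doc_id] = text          # "private key": never leaves the device
    server_store[doc_id] = embed(text)   # "public key": all the server ever sees
    return doc_id

def recall(query, embed, client_store, server_store, similarity):
    """Match against server-side vectors, then resolve the text locally."""
    q = embed(query)
    best_id = max(server_store, key=lambda k: similarity(q, server_store[k]))
    return client_store[best_id]         # only the owner can read the match back
```

Retrieval runs entirely against the server’s vectors, but only the owner can turn a match back into readable text.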
If this client-side and server-side model turns out to be scalable and valid, the follow-up would be discovering an efficient encryption solution to mask the embeddings as well. The reason is that embeddings can most likely be reverse-engineered by someone with access to the embedding model. If we can solve encrypted streams of embeddings, we can be confident in securing your stories, whether they are stored on IPFS (the InterPlanetary File System, where you can transparently see your thoughts in the open via pinning) or not.
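One possible shape of that follow-up, sketched with the `cryptography` package’s Fernet scheme purely as a stand-in. Plain symmetric encryption like this doesn’t preserve similarity search, which is exactly why the real problem is harder than this snippet suggests.

```python
import json
from cryptography.fernet import Fernet

key = Fernet.generate_key()   # stays on the client, just like the raw text

def seal_embedding(vector):
    """Encrypt an embedding before it is pinned or uploaded anywhere."""
    return Fernet(key).encrypt(json.dumps(vector).encode())

def open_embedding(token):
    """Decrypt an embedding back into a plain list of floats."""
    return json.loads(Fernet(key).decrypt(token))
```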
I do wonder, though, if the key to this solution (essentially an aggregated embedding search) sits within IPFS’s methodology of content hashes. I wonder if the embeddings themselves, when generated from the documents of others, pattern-match in a way that lets you see their internal mind map in relation to yours, so that perspectives are not simply reshaped, but also merged. Is that the bridge? Is this... what Social Networks were meant to be?
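If that bridge exists, its mechanical core is probably something as mundane as a nearest-pair search between two users’ embedding sets. A rough sketch, where the names and the similarity cutoff are illustrative only:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def shared_regions(mine, theirs, cutoff=0.8):
    """Yield (my_index, their_index, similarity) wherever two mind maps overlap."""
    for i, a in enumerate(mine):
        for j, b in enumerate(theirs):
            sim = cosine(a, b)
            if sim >= cutoff:
                yield i, j, sim
```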
