> The build and deploy instructions as well as some of the Aqua code are outdated and the deployed services are no longer available. An updated version will be available soon.
The [Ethereum virtual machine](https://ethereum.org/en/developers/docs/evm/) (EVM) is available across a variety of L1, L2 and sidechains with implementations ranging from straight clones to [compatible or equivalent](https://metisdao.medium.com/evm-equivalence-vs-evm-compatibility-199bd66f455d) incarnations. [Ethereum JSON RPC](https://ethereum.github.io/execution-apis/api-documentation/) is the base API to access EVM functionality and the underlying request model for the various web3 libraries such as [ethers](https://docs.ethers.io/v5/) or [ethers-rs](https://docs.rs/ethers/0.1.3/ethers/). In theory, and sometimes even practice, this allows developers to reuse their contracts and Web3 code across different chains with no changes but a few parameters.
While many of the various EVM chains provide the Web3 benefits desired, running (blockchain) clients tends to be a resource intensive adventure. Not surprisingly, a myriad of hosted node providers, such as [Infura](https://infura.io/), [Alchemy](https://www.alchemy.com/) and many more, provide relatively cheap access to a variety of chains. Alas, the benefits of convenience and low-to-no cost, the typical siren call of Web2 SaaS, comes at the price of introducing a single point of failure, trust in not finding your (personal) data on some blackhat site right beside the data from all the other "secure" SaaS vendors, and high exit barriers when adopting a hosting provider's specific API, thereby introducing a nontrivial chokepoint in your DApp.
In this tutorial, we illustrate how Fluence and Aqua can help DApp developers minimize and even eliminate centralized points of failure undermining the Web3 benefits of their DApp.
If you haven't set up your Fluence and Aqua development environment, head over to the [setup docs](https://fluence.dev/docs/build/tutorials/setting-up-your-environment); if you're not familiar with Fluence and Aqua, give the [developer docs](https://fluence.dev/) a gander.
In addition to the Fluence setup, you ought to sign up for a few hosted (EVM) node providers with JSON-RPC access as discussed below.
### Selecting Providers
A large number of EVM nodes are hosted by centralized providers such as [Alchemy](https://www.alchemy.com/), which isn't all that surprising how resource intensive it is to self-host nodes. The following hosting providers offer free JSON-RPC access but feel free to substitute as you see fit:
* [Infura](https://infura.io/)
* need to sign up for a free account and get API lkey
* for Ethereum mainnet, our uri is:
* [Alchemy](https://www.alchemy.com/)
* need to sign up for a free account and get API lkey
* for Ethereum mainnet, our uri is:
* [Linkpool](https://linkpool.io/)
* no need to sign up or need for an API key
* is a light client limiting historic queries
* for Ethereum mainnet, our uri is: https://main-light.eth.linkpool.io
If you are looking for another multi-blockchain API provider, checkout [BEWARELAABS API](https://bwarelabs.com/blockchain-api). Sticking with the Ethereum mainnet and the [`eth_blockNumber`](https://docs.bwarelabs.com/api-docs/ethereum-api/rpc/eth_blocknumber-method) method, we get the expected result:
Centralized hosted nodes introduce at best a single point of failure and at worst, a nefarious actor creating havoc with your DApp. Hence, a centralized source of truth easily negates the benefits of the decentralized backend. Without giving up all of the convenience and cost savings of hosted blochchain nodes, we can route identical requests to multiple hosted providers and determine, against some subjective metric, the acceptability of the responses. That is, we decentralize hosted provider responses. See Figure 1.
In order to interact with the hosted EVMs, we choose the lowest common but open and interoperable API denominator, i.e., Ethereum JSON-RPC, which provides us with flexibility and great reusability but also forgoes some of the conveniences available in the various web3 sdks or custom APIs.
For illustrative purposes, let's say we want to make sure the provider is returning the `latest` block, which, while simple, is a pretty good indicator of provider "liveness." The method we want to call is [`eth_blockNumber`](https://ethereum.github.io/execution-apis/api-documentation/) which formats to the following request:
```json
{
"jsonrpc": "2.0",
"method": "eth_blockNumber",
"params": [],
"id": 1
}
```
where `id` is the nonce. If we were to use `curl` on the command line:
```bash
curl --data '{"method":"eth_blockNumber","params":[],"id":1,"jsonrpc":"2.0"}' -H "Content-Type: application/json" -X POST <nodeorproviderurl>
```
we expect the following result:
```json
{
"id": 1,
"jsonrpc": "2.0",
"result": "0x4b7" // hex encoded latest block
}
```
Ok, let's create a Wasm service we can use to query multiple providers. Keep the curl command in mind as we'll need to use curl from our Wasm to make the provider call.
The `get_block_number` function implements a wrapper around the `eth_blockNumber` method and decodes the `hex` response to a json string. We could have implemented a more general function, say, `fn eth_rpc_wrapper(provider: ProviderInfo, method: String, parameters: Vec<String>) -> EVMResult` and either returned the raw json rpc result or added per-method decoding match arms, which *you are encouraged* to implement for the methods of your choosing.
Recall that the `#[marine]` macro brings the Fluence `marine-rust-sdk` into play to compile to the Wasi target and expose the appropriate interfaces. Moreover, note that we link the [`curl` module](./curl-adapter/) to enable our http calls.
Note that not all providers follow the JSON-RPC style when it comes to error handling. For example, submitting an invalid API key to Alchemy, results in the error captured in the JSON-RPC response:
Infura, on the other hand, does not follow the JSON-RPC route and instead returns a string in stdout with no other error codes or indicators provided:
```json
"invalid project id\n"
```
Infura-ating but such is life. We compensate for this idiosyncrasy with the following adjustment to an otherwise straight-forward curl response processor:
let response = String::from_utf8(response.stdout).unwrap();
let response: Result<RpcResponse,_> = serde_json::from_str(&response);
match response {
Ok(r) => r,
Err(e) => RpcResponse {
jsonrpc: "".to_owned(),
error: Some(RpcResponseError {
code: -1, // we know it's not an EVM error
message: e.to_string(),
}),
result: None,
},
}
```
Of course, other providers may provide even other response patterns and it is up to you to make the necessary adjustments. You may think the convenience of vendor lock-in doesn't look too bad right about now but trust yourself, it is a big risk and cost.
which should put `curl_adapter.wasm` and `multi_provider_query.wasm` in the `artifacts` directory. Before we deploy or service to one or more peers, let's check it out locally using the `marine` REPL:
If you go back to the source files, you'll see that all the interfaces marked up with the `#[marine]` macro are exposed and available in the REPL. Moreover, note that both the `curl_adapter` and `multi_provider_query` WASM modules are available as the eponymous namespaces with the corresponding (exposed) functions.
Without further ado, let's try to get the latest block with a couple of the provider urls:
Ok, so we called both Alchemy and Infura for the latest block height and got the same result, which is somewhat confidence inspiring. Let's check with a bad API key and keep in mind that we had to "coerce" the Infura response into the JSON-RPC format:
result: Object({"provider": String("infura"), "stderr": String("{\"code\":-1,\"message\":\"expected value at line 1 column 1\"}"), "stdout": String("")})
elapsed time: 419.089663ms
```
While we have been using hosted Ethereum mainnet endpoints, you can easily use other supported (EVM) networks such as [Polygn PoS on Alchemy](https://docs.polygon.technology/docs/develop/alchemy/) or [Polygon PoS on Infura](https://docs.infura.io/infura/networks/polygon-pos/how-to).
All looks well and we are ready to deploy!
## Multi-Provider Queries With Aqua
Now that we have our adapter and providers, let's think about how we can use a multi-provider query approach to ensure a high likelihood of truth and reliability in our query results. Keep in mind that in addition to our lack of trust in each of the providers, we may also not trust the Fluence peers hosting our services, which we'll ignore for the moment. Moreover, we need to parallelize our requests to make sure we don't inadvertently straddle different block times legitimately leading to different responses. As you may recall, Wasm modules are single threaded and concurrency of service execution is managed at the Aqua level. Parallel execution of a service may be accomplished on one or multiple nodes but either way, you need to deploy multiple service instances, e.g., for three providers, we want three service instances.
In the "trust the Fluence nodes" case, then, we deploy our adapter service to three (3) different nodes -- one for each provider. Let's get to it, deploy our services and create our Aqua script.
### Service Deployment
We deploy our service with the `aqua cli` tool to the `stage` testnet. To see the Fluence default peers for `stage`:
Also, if you are deploying your own service instances, you need one or more keypairs to deploy, authenticate and eventually delete your service. You can use `aqua-cli` to create keys and make sure you store them in a safe place:
If you deploy the services, your service ids are different, of course. Also, feel free to use different peers. Just as with your keys, make sure you keep the (peer id, service id) in a safe place for future use. Now that we have our services deployed, it's time to create our Aqua script.
The Fluence protocol uses function addressability, i.e., (peer id, service id) tuples, to resolve services. In our model, function addresses are input parameters just like the provider urls. So we can capture our function addresses with a struct:
result <-MultiProviderQuery.get_block_number(provider)
result2 <<-provider.name
-- join result[n2-1]
join result[n*n2-1]
<-result
```
The script we created is going to get us halfway to where we want to go. So let's try it from the command-line with `aqua-cli`, which is a bit unwieldy and we'll see in a bit how we can create a local client with Fluence JS. For now, quick and easy:
A quick scan suggests that all responses are equal across providers. That's a good start! However, we don't want to rely on manual inspection of our data but want to programmatically determine some sort of quorum. A possible quorum rule is the two-third rule but we need consider our response categories. For example:
* blocknumber response -- "correct"
* blocknumber response -- "incorrect"
* no response
In order to be able to apply a threshold decision like the 2/3 rule, we first need to determine the frequency distribution of our result set. `cd` into the `simple-quorum` directory to see a naive implementation:
Basically, our decentralized blockchain API service returns the block height with the most frequencies, which we can then compare to our quorum threshold. See Figure 2.
In essence, we are building on our prior work and piping the array of EVMResults in the SimpleQuorum service to arrive at the quorum. Again, with `aqua cli`:
Before we turn to the result, note that we replaced an increasingly hard to maintain inline `--data` representation with a much more manageable json file for our function parameters:
So we have three (3) providers called from three (3) service instances, leaving us to reasonably expect nine responses and not necessarily all the same. One possible way to handle the overall confidence in a result is to add a threshold value to our Aqua script to determine an *acceptable* quorum level for a point value. For example:
The updated Aqua workflow now returns the highest frequency response and a boolean comparing the relative frequency ratio to some threshold value -- 0.66 in our example below:
Of course, this may not always be the case but if it is, we probably want more information than just the summary stats. In order to do that, we can expand on or Aqua script:
v <-Utilities.kv_to_u64(res.stdout,"block-height")--<(1)thisisanewservice,seewasm-modules/utilities
if v != quorum[0].mode: -- (2)
deviations <<-res--(3)
on %init_peer_id% via u_addr.peer_id:
co ConsoleEVMResult.print(res) --< (4) placeholder for future processing of divergent responses
Math.add(n_dev, 1)
-- ConsoleEVMResults.print(deviations) -- (5)
<-quorum[0],is_quorum[0]
```
We introduce a new post-quorum section, see above, which kicks in if the "point estimate" count (mode) doesn't equal the number of (expected) responses, n. We
* introduce a [utility service](./wasm-modules/utilities/), to extract the block-height from the response string (1)
* cycle through the responses to find the deviants (2) (note: this could be more efficiently accomplished with a service especially when n gets large)
* collect the deviants (3)
* print the collection (4)
There a few things that warrant additional explanations:
```aqua
on %init_peer_id% via u_addr.peer_id:
co ConsoleEVMResult.print(res)
```
In order to use print, it is imperative to know that this method only works on the initiating peer. Since we are on a different peer, `u_addr.peer_id`, we need to affect a topological change to the initiating peer `init_peer_id` by topological moving via the current peer, `u_addr.peer_id`. However, we don't want to leave the primary peer, so we implement the print method as a `co`routine. Using the co-routine allows us to proceed without having to create a global variable `deviations`. Alternatively, we can use the method in (5) instead of the co print, which may be more useful for the actual processing of results.
In this case, three of the nine responses differed from the rest. With our new sub-routine, we print out the deviant responses. Of course, printing the deviants is only a placeholder for "real" processing, e.g., updating a reference count of responses.
### CIDs As Aqua Function Arguments
So far, we used function parameters in Aqua as you probably have done a million times. However, that's not the only way. In fact, we can use (IPFS) CIDs as function arguments. Such an approach opens up a lot of opportunities when it comes to safely preserving or sharing parameter sets, passing more complex data models with a single string, etc.
Since IPFS makes all data public and actually encrypting and eventually decrypting data seems like a lot of overhead, we may keep the providers (arg1) in our *quorum_params.json* as is but upload the remaining params to IPFS. So our IPFS candidates as:
Matching service granularity to IPFS document content may make sense if we want to minimize updates due to service changes. However, we could have as easily put all the function addresses into one IPFS file. Regardless, let push our content to IPFS.
how to get the IPFS sidecar multiaddr:
```aqua
-- aqua snippet to get multiaddr for IPFS sidecar to specified peer id
import "@fluencelabs/aqua-ipfs/ipfs.aqua"
func get_maddr(node: string) -> string:
on node:
res <-Ipfs.get_external_api_multiaddr()
<-res.multiaddr
```
We can use Aqua to upload our files or *ipfs cli*. Using the cli:
Woo hoo!! All dressed up and nowhere to go ... just yet. Let's change that. In order to be able to deal with IPFS documents, we need to be able to interact with IPFS from Aqua. We could write a [custom IPFS adapter](./wasm-module/ipfs-adapter) and [custom IPFS processing module](./wasm-module/ipfs-cli) or use Fluence's builtin [IPFS integration with Aqua](https://fluence.dev/docs/aqua-book/libraries/aqua-ipfs) library. If we use the latter, we don't have to implement and deploy our own ipfs adapter service(s) and don't have to manage the associated function addresses. However, the Fluence IPFS library is geared toward file management, which means that we need to write and maintain a file processing service, which still leaves us with service management chores. So let's write and deploy a super-light ipfs adapter based on the [ipfs cat](https://docs.ipfs.tech/reference/kubo/cli/#ipfs-cat) command.
Nothing new here, we link to the host's binary and wrap the command into an exposed (wasm) function `ipfs_request` which any other linked modules can use. In order to get the real lifting done, [we need a bit more](./wasm-modules/ipfs-cli/):
```rust
// wasm-module/ipfs-cli
use marine_rs_sdk::{marine, module_manifest, MountedBinaryResult};
`params_from_cid` takes the multiaddr and cid to execute the `ipfs cat` command using our ipfs adapter. Once we get have our IPFS document, we turn it into an array of `FuncAddr`s with some minimal error management -- feel free to improve on that. Let's test it in the Marine Repl before we deploy it:
Ok, so all works as intended and, as previosuly discussed, the function address parameters for the ipfs service stick out like a sore thumb. Of course we can still use a parameter file to keep things neat and managable, we update our (parameter) [data file](./parameters/quorum_mixed_params.json) and include the IPFS document references:
v <-Utilities.kv_to_u64(res.stdout,"block-height")
if v != quorum[0].mode:
deviations <<-res
on %init_peer_id% via u_addrs[0].peer_id:
co ConsoleEVMResult.print(res)
Math.add(n_dev, 1)
<-quorum[0],is_quorum[0]
```
Aside from updating our function signature, we add the subroutine processing the various IPFS documents into the necessary `FunctionAddress` structs and, where needed, index the resulting arrays through out the rest of the code. And now we are ready to run it:
In this section we rather explicitly illustrated how to use IPFS documents via CIDs to provide function arguments. The motivation to do so arises from a) the immutability of CIDs bringing realiable resussability and provability to paramters for a variety of use cases, including Marine service "pinning" and b) lay a foundation to pipe much more complex (linked) data models as succint, immutable function arguemnts. We'll need another tutorial for that!
We developed a model to decentralize blockchain APIs for our DApps and implemented a stylized solution with Fluence and Aqua. Specifically, we queried multiple centralized hosted EVM providers using the open Ethereum JSON-RPC API and settled on pulling the `latest` block as an indicator of reliability and "liveness" as opposed to, say, (stale) caches, unreachability or nefarious behavior.
Along our journey, we pretty much touched on every possible chokepoint and discussed what a feasible approach to a quorum might look like. However, we made a couple significant omissions:
* we trusted the Fluence nodes, which is not necessarily the right course of action. If you don't trust the Fluence nodes, you can expand on this tutorial by running each providers set across multiple nodes and then compare the results sets across nodes, for example. We are very much looking forward to your PR!
* we are satisfied with a probabilistic true-false quorum, which may be enough for a lot of use cases. However, we can expand our analysis to introduce learning over response (failure) attributes to the process to develop trust weights for each provider. That might allow us to use smaller subsets of providers for each call and run "test" requests outside of our workflow to update provider trust weights and then just pick the most trusted providers at "run time." Ditto for Fluence providers. Again, we are looking forward to your PRs.
* we only used the Ethereum mainnet options for the provider requests. Please run it with other hosted blockchain solutions, such as Polygon PoS. If things don't work out, let us know in Issues.
Thank you for making it all the way to end and we hope that this little tutorial not only helped you learn a bit more about Fluence and Aqua but also provided a reminder on just how vigilant DApp developers need to be to use the *D* in their Apps.