mirror of
https://github.com/fluencelabs/tendermint
synced 2025-04-24 14:22:16 +00:00
add design philosophy doc (#3034)
This commit is contained in:
parent
d6dd43cdaa
commit
ddbdffb4e5
158
PHILOSOPHY.md
Normal file
158
PHILOSOPHY.md
Normal file
@ -0,0 +1,158 @@
|
||||
## Design goals
|
||||
|
||||
The design goals for Tendermint (and the SDK and related libraries) are:
|
||||
|
||||
* Simplicity and Legibility
|
||||
* Parallel performance, namely ability to utilize multicore architecture
|
||||
* Ability to evolve the codebase bug-free
|
||||
* Debuggability
|
||||
* Complete correctness that considers all edge cases, esp in concurrency
|
||||
* Future-proof modular architecture, message protocol, APIs, and encapsulation
|
||||
|
||||
|
||||
### Justification
|
||||
|
||||
Legibility is key to maintaining bug-free software as it evolves toward more
|
||||
optimizations, more ease of debugging, and additional features.
|
||||
|
||||
It is too easy to introduce bugs over time by replacing lines of code with
|
||||
those that may panic, which means ideally locks are unlocked by defer
|
||||
statements.
|
||||
|
||||
For example,
|
||||
|
||||
```go
|
||||
func (obj *MyObj) something() {
|
||||
mtx.Lock()
|
||||
obj.something = other
|
||||
mtx.Unlock()
|
||||
}
|
||||
```
|
||||
|
||||
It is too easy to refactor the codebase in the future to replace `other` with
|
||||
`other.String()` for example, and this may introduce a bug that causes a
|
||||
deadlock. So as much as reasonably possible, we need to be using defer
|
||||
statements, even though it introduces additional overhead.
|
||||
|
||||
If it is necessary to optimize the unlocking of mutex locks, the solution is
|
||||
more modularity via smaller functions, so that defer'd unlocks are scoped
|
||||
within a smaller function.
|
||||
|
||||
Similarly, idiomatic for-loops should always be preferred over those that use
|
||||
custom counters, because it is too easy to evolve the body of a for-loop to
|
||||
become more complicated over time, and it becomes more and more difficult to
|
||||
assess the correctness of such a for-loop by visual inspection.
|
||||
|
||||
|
||||
### On performance
|
||||
|
||||
It doesn't matter whether there are alternative implementations that are 2x or
|
||||
3x more performant, when the software doesn't work, deadlocks, or if bugs
|
||||
cannot be debugged. By taking advantage of multicore concurrency, the
|
||||
Tendermint implementation will at least be an order of magnitude within the
|
||||
range of what is theoretically possible. The design philosophy of Tendermint,
|
||||
and the choice of Go as implementation language, is designed to make Tendermint
|
||||
implementation the standard specification for concurrent BFT software.
|
||||
|
||||
By focusing on the message protocols (e.g. ABCI, p2p messages), and
|
||||
encapsulation e.g. IAVL module, (relatively) independent reactors, we are both
|
||||
implementing a standard implementation to be used as the specification for
|
||||
future implementations in more optimizable languages like Rust, Java, and C++;
|
||||
as well as creating sufficiently performant software. Tendermint Core will
|
||||
never be as fast as future implementations of the Tendermint Spec, because Go
|
||||
isn't designed to be as fast as possible. The advantage of using Go is that we
|
||||
can develop the whole stack of modular components **faster** than in other
|
||||
languages.
|
||||
|
||||
Furthermore, the real bottleneck is in the application layer, and it isn't
|
||||
necessary to support more than a sufficiently decentralized set of validators
|
||||
(e.g. 100 ~ 300 validators is sufficient, with delegated bonded PoS).
|
||||
|
||||
Instead of optimizing Tendermint performance down to the metal, lets focus on
|
||||
optimizing on other matters, namely ability to push feature complete software
|
||||
that works well enough, can be debugged and maintained, and can serve as a spec
|
||||
for future implementations.
|
||||
|
||||
|
||||
### On encapsulation
|
||||
|
||||
In order to create maintainable, forward-optimizable software, it is critical
|
||||
to develop well-encapsulated objects that have well understood properties, and
|
||||
to re-use these easy-to-use-correctly components as building blocks for further
|
||||
encapsulated meta-objects.
|
||||
|
||||
For example, mutexes are cheap enough for Tendermint's design goals when there
|
||||
isn't goroutine contention, so it is encouraged to create concurrency safe
|
||||
structures with struct-level mutexes. If they are used in the context of
|
||||
non-concurrent logic, then the performance is good enough. If they are used in
|
||||
the context of concurrent logic, then it will still perform correctly.
|
||||
|
||||
Examples of this design principle can be seen in the types.ValidatorSet struct,
|
||||
and the cmn.Rand struct. It's one single struct declaration that can be used
|
||||
in both concurrent and non-concurrent logic, and due to its well encapsulation,
|
||||
it's easy to get the usage of the mutex right.
|
||||
|
||||
#### example: cmn.Rand:
|
||||
|
||||
`The default Source is safe for concurrent use by multiple goroutines, but
|
||||
Sources created by NewSource are not`. The reason why the default
|
||||
package-level source is safe for concurrent use is because it is protected (see
|
||||
`lockedSource` in https://golang.org/src/math/rand/rand.go).
|
||||
|
||||
But we shouldn't rely on the global source, we should be creating our own
|
||||
Rand/Source instances and using them, especially for determinism in testing.
|
||||
So it is reasonable to have cmn.Rand be protected by a mutex. Whether we want
|
||||
our own implementation of Rand is another question, but the answer there is
|
||||
also in the affirmative. Sometimes you want to know where Rand is being used
|
||||
in your code, so it becomes a simple matter of dropping in a log statement to
|
||||
inject inspectability into Rand usage. Also, it is nice to be able to extend
|
||||
the functionality of Rand with custom methods. For these reasons, and for the
|
||||
reasons which is outlined in this design philosophy document, we should
|
||||
continue to use the cmn.Rand object, with mutex protection.
|
||||
|
||||
Another key aspect of good encapsulation is the choice of exposed vs unexposed
|
||||
methods. It should be clear to the reader of the code, which methods are
|
||||
intended to be used in what context, and what safe usage is. Part of this is
|
||||
solved by hiding methods via unexported methods. Another part of this is
|
||||
naming conventions on the methods (e.g. underscores) with good documentation,
|
||||
and code organization. If there are too many exposed methods and it isn't
|
||||
clear what methods have what side effects, then there is something wrong about
|
||||
the design of abstractions that should be revisited.
|
||||
|
||||
|
||||
### On concurrency
|
||||
|
||||
In order for Tendermint to remain relevant in the years to come, it is vital
|
||||
for Tendermint to take advantage of multicore architectures. Due to the nature
|
||||
of the problem, namely consensus across a concurrent p2p gossip network, and to
|
||||
handle RPC requests for a large number of consuming subscribers, it is
|
||||
unavoidable for Tendermint development to require expertise in concurrency
|
||||
design, especially when it comes to the reactor design, and also for RPC
|
||||
request handling.
|
||||
|
||||
|
||||
## Guidelines
|
||||
|
||||
Here are some guidelines for designing for (sufficient) performance and concurrency:
|
||||
|
||||
* Mutex locks are cheap enough when there isn't contention.
|
||||
* Do not optimize code without analytical or observed proof that it is in a hot path.
|
||||
* Don't over-use channels when mutex locks w/ encapsulation are sufficient.
|
||||
* The need to drain channels are often a hint of unconsidered edge cases.
|
||||
* The creation of O(N) one-off goroutines is generally technical debt that
|
||||
needs to get addressed sooner than later. Avoid creating too many
|
||||
goroutines as a patch around incomplete concurrency design, or at least be
|
||||
aware of the debt and do not invest in the debt. On the other hand, Tendermint
|
||||
is designed to have a limited number of peers (e.g. 10 or 20), so the creation
|
||||
of O(C) goroutines per O(P) peers is still O(C\*P=constant).
|
||||
* Use defer statements to unlock as much as possible. If you want to unlock sooner,
|
||||
try to create more modular functions that do make use of defer statements.
|
||||
|
||||
## Matras
|
||||
|
||||
* Premature optimization kills
|
||||
* Readability is paramount
|
||||
* Beautiful is better than fast.
|
||||
* In the face of ambiguity, refuse the temptation to guess.
|
||||
* In the face of bugs, refuse the temptation to cover the bug.
|
||||
* There should be one-- and preferably only one --obvious way to do it.
|
Loading…
x
Reference in New Issue
Block a user