3 Commits

Author SHA1 Message Date
Anton Kaliaev
7fd51e6ade
make govet linter pass (#3292)
* make govet linter pass

Refs #3262

* close PipeReader and check for err
2019-02-11 16:31:34 +04:00
Ethan Buchman
39eba4e154
WAL: better errors and new fail point (#3246)
* privval: more info in errors

* wal: change Debug logs to Info

* wal: log and return error on corrupted wal instead of panicing

* fail: Exit right away instead of sending interupt

* consensus: FAIL before handling our own vote

allows to replicate #3089:
- run using `FAIL_TEST_INDEX=0`
- delete some bytes from the end of the WAL
- start normally

Results in logs like:

```
I[2019-02-03|18:12:58.225] Searching for height module=consensus wal=/Users/ethanbuchman/.tendermint/data/cs.wal/wal height=1 min=0 max=0
E[2019-02-03|18:12:58.225] Error on catchup replay. Proceeding to start ConsensusState anyway module=consensus err="failed to read data: EOF"
I[2019-02-03|18:12:58.225] Started node module=main nodeInfo="{ProtocolVersion:{P2P:6 Block:9 App:1} ID_:35e87e93f2e31f305b65a5517fd2102331b56002 ListenAddr:tcp://0.0.0.0:26656 Network:test-chain-J8JvJH Version:0.29.1 Channels:4020212223303800 Moniker:Ethans-MacBook-Pro.local Other:{TxIndex:on RPCAddress:tcp://0.0.0.0:26657}}"
E[2019-02-03|18:12:58.226] Couldn't connect to any seeds module=p2p
I[2019-02-03|18:12:59.229] Timed out module=consensus dur=998.568ms height=1 round=0 step=RoundStepNewHeight
I[2019-02-03|18:12:59.230] enterNewRound(1/0). Current: 1/0/RoundStepNewHeight module=consensus height=1 round=0
I[2019-02-03|18:12:59.230] enterPropose(1/0). Current: 1/0/RoundStepNewRound module=consensus height=1 round=0
I[2019-02-03|18:12:59.230] enterPropose: Our turn to propose module=consensus height=1 round=0 proposer=AD278B7767B05D7FBEB76207024C650988FA77D5 privValidator="PrivValidator{AD278B7767B05D7FBEB76207024C650988FA77D5 LH:1, LR:0, LS:2}"
E[2019-02-03|18:12:59.230] enterPropose: Error signing proposal module=consensus height=1 round=0 err="Error signing proposal: Step regression at height 1 round 0. Got 1, last step 2"
I[2019-02-03|18:13:02.233] Timed out module=consensus dur=3s height=1 round=0 step=RoundStepPropose
I[2019-02-03|18:13:02.233] enterPrevote(1/0). Current: 1/0/RoundStepPropose module=consensus
I[2019-02-03|18:13:02.233] enterPrevote: ProposalBlock is nil module=consensus height=1 round=0
E[2019-02-03|18:13:02.234] Error signing vote module=consensus height=1 round=0 vote="Vote{0:AD278B7767B0 1/00/1(Prevote) 000000000000 000000000000 @ 2019-02-04T02:13:02.233897Z}" err="Error signing vote: Conflicting data"
```

Notice the EOF, the step regression, and the conflicting data.

* wal: change errors to be DataCorruptionError

* exit on corrupt WAL

* fix log

* fix new line
2019-02-04 13:00:06 -05:00
Ismail Khoffi
a6011c007d Close and retry a RemoteSigner on err (#2923)
* Close and recreate a RemoteSigner on err

* Update changelog

* Address Anton's comments / suggestions:

 - update changelog
 - restart TCPVal
 - shut down on `ErrUnexpectedResponse`

* re-init remote signer client with fresh connection if Ping fails

- add/update TODOs in secret connection
- rename tcp.go -> tcp_client.go, same with ipc to clarify their purpose

* account for `conn returned by waitConnection can be `nil`

- also add TODO about RemoteSigner conn field

* Tests for retrying: IPC / TCP

 - shorter info log on success
 - set conn and use it in tests to close conn

* Tests for retrying: IPC / TCP

 - shorter info log on success
 - set conn and use it in tests to close conn
 - add rwmutex for conn field in IPC

* comments and doc.go

* fix ipc tests. fixes #2677

* use constants for tests

* cleanup some error statements

* fixes #2784, race in tests

* remove print statement

* minor fixes from review

* update comment on sts spec

* cosmetics

* p2p/conn: add failing tests

* p2p/conn: make SecretConnection thread safe

* changelog

* IPCVal signer refactor

- use a .reset() method
- don't use embedded RemoteSignerClient
- guard RemoteSignerClient with mutex
- drop the .conn
- expose Close() on RemoteSignerClient

* apply IPCVal refactor to TCPVal

* remove mtx from RemoteSignerClient

* consolidate IPCVal and TCPVal, fixes #3104

- done in tcp_client.go
- now called SocketVal
- takes a listener in the constructor
- make tcpListener and unixListener contain all the differences

* delete ipc files

* introduce unix and tcp dialer for RemoteSigner

* rename files

- drop tcp_ prefix
- rename priv_validator.go to file.go

* bring back listener options

* fix node

* fix priv_val_server

* fix node test

* minor cleanup and comments
2019-01-13 14:31:31 -05:00