7315 Commits

Author SHA1 Message Date
Ismail Khoffi
3c7bb6b571 Add some numbers for #2778 2019-03-27 16:50:42 +01:00
Anton Kaliaev
5fa540bdc9
mempool: add a safety check, write tests for mempoolIDs (#3487)
* mempool: add a safety check, write tests for mempoolIDs

and document 65536 limit in the mempool reactor spec

follow-up to https://github.com/tendermint/tendermint/pull/2778

* rename the test

* fixes after Ismail's review
2019-03-27 16:45:34 +01:00
Ismail Khoffi
52727863e1 add external contributors 2019-03-27 16:07:03 +01:00
Ismail Khoffi
e3f840e6a6 reset CHANGELOG_PENDING.md 2019-03-27 16:04:43 +01:00
Ismail Khoffi
ed63e1f378 Add more entries to the Changelog, fix formatting, linkify 2019-03-27 16:03:25 +01:00
Ismail Khoffi
55b7118c98 Prep changelog: copy from pending & update version 2019-03-27 15:35:32 +01:00
zjubfd
5a25b75b1d p2p: refactor GetSelectionWithBias for addressbook (#3475)
Why submit this PR:

    We suffered from an infinite loop in the addrbook, and it took us a long time to find out why the process had become a zombie peer. That bug was fixed in #3232, but the ADDRS_LOOP is still there, so the risk of an infinite loop still exists.
    The algorithm that randomly picks a bucket is not stable: a peer may unluckily keep choosing the wrong bucket for a long time, wasting time and CPU.

A simple improvement:
shuffle bucketsNew and bucketsOld, and pick the necessary number of addresses from them. This is a stable
algorithm.
2019-03-26 17:13:14 +01:00
Anton Kaliaev
a4d9539544
rpc/client: include NetworkClient interface into Client interface (#3473)
I think it's nice when the Client interface has all the methods. If someone does not need a particular method/set of methods, she can use individual interfaces (e.g. NetworkClient, MempoolClient) or write her own interface.

technically breaking

Fixes #3458
2019-03-26 09:44:49 +01:00
HaoyangLiu
1bb8e02a96 mempool: fix broadcastTxRoutine leak (#3478)
Refs #3306, irisnet@fdbb676

I ran an irishub validator. After the validator node had run for several days, I dumped the whole goroutine stack and found hundreds of broadcastTxRoutine goroutines, although fewer than 30 peers were connected. So I believe there must be a broadcastTxRoutine leak.

According to my analysis, the root cause of this issue is located in the code below:

		select {
		case <-next.NextWaitChan():
			// see the start of the for loop for nil check
			next = next.Next()
		case <-peer.Quit():
			return
		case <-memR.Quit():
			return
		}

As we know, if multiple cases are ready at the same time, a random one is selected. Suppose that next.NextWaitChan() and peer.Quit() are both ready, and next.NextWaitChan() is chosen.

		// send memTx
		msg := &TxMessage{Tx: memTx.tx}
		success := peer.Send(MempoolChannel, cdc.MustMarshalBinaryBare(msg))
		if !success {
			time.Sleep(peerCatchupSleepIntervalMS * time.Millisecond)
			continue
		}

Then next will be non-nil and the peer send operation won't succeed. As a result, this goroutine will be trapped in an infinite loop and will never be released.

My proposal is to check peer.Quit() and memR.Quit() in every loop iteration, regardless of whether next is nil.
2019-03-26 09:29:06 +01:00
Dev Ojha
6de7effb05 mempool no gossip back (#2778)
Closes #1798

This is done by making every mempool tx maintain a list of the peers it has received the tx from. Instead of using the 20-byte peer ID, it uses a local map from peerID to a uint16 counter, so every peer adds 2 bytes. (Probably word-aligned to make it 8 bytes.)
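
The peerID-to-uint16 mapping might look something like the sketch below. The `peerIDs` type and its methods are hypothetical and are not the structure from the PR. The sketch also leaves out locking, which a version shared between goroutines would need.

```go
package main

import (
	"errors"
	"fmt"
)

// unknownPeerID is reserved for txs that did not arrive from a peer.
const unknownPeerID uint16 = 0

// peerIDs is a hypothetical pool that hands out short IDs for long peer IDs.
type peerIDs struct {
	byPeer map[string]uint16
	inUse  map[uint16]bool
	next   uint16
}

func newPeerIDs() *peerIDs {
	return &peerIDs{
		byPeer: map[string]uint16{},
		inUse:  map[uint16]bool{unknownPeerID: true},
		next:   1,
	}
}

// reserve returns the short ID for peer, allocating a free one if needed.
func (p *peerIDs) reserve(peer string) (uint16, error) {
	if id, ok := p.byPeer[peer]; ok {
		return id, nil
	}
	if len(p.inUse) == 1<<16 {
		return unknownPeerID, errors.New("all 65536 short IDs are in use")
	}
	for p.inUse[p.next] {
		p.next++ // uint16 arithmetic wraps around after 65535
	}
	id := p.next
	p.inUse[id] = true
	p.byPeer[peer] = id
	p.next++
	return id, nil
}

// release frees the short ID held by peer, if any.
func (p *peerIDs) release(peer string) {
	if id, ok := p.byPeer[peer]; ok {
		delete(p.inUse, id)
		delete(p.byPeer, peer)
	}
}

func main() {
	ids := newPeerIDs()
	a, _ := ids.reserve("peer-a")
	b, _ := ids.reserve("peer-b")
	fmt.Println(a, b) // 1 2
}
```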

This also required resetting the callback function on every CheckTx. This likely has performance ramifications for instruction caching. The actual setting operation isn't costly with the removal of defers in this PR.

* Make the mempool not gossip txs back to peers it has received them from

* Fix adversarial memleak

* Don't break interface

* Update changelog

* Forgot to add a mtx

* forgot a mutex

* Update mempool/reactor.go

Co-Authored-By: ValarDragon <ValarDragon@users.noreply.github.com>

* Update mempool/mempool.go

Co-Authored-By: ValarDragon <ValarDragon@users.noreply.github.com>

* Use unknown peer ID

Co-Authored-By: ValarDragon <ValarDragon@users.noreply.github.com>

* fix compilation

* use next wait chan logic when skipping

* Minor fixes

* Add TxInfo

* Add reverse map

* Make activeID's auto-reserve 0

* 0 -> UnknownPeerID

Co-Authored-By: ValarDragon <ValarDragon@users.noreply.github.com>

* Switch to making the normal case set a callback on the reqres object

The recheck case is still done via the global callback, and stats
are also set via global callback

* fix merge conflict

* Address comments

* Add cache tests

* add cache tests

* minor fixes

* update metrics in reqResCb and reformat code

* goimport -w mempool/reactor.go

* mempool: update memTx senders

I had to introduce txsMap for quick mempoolTx lookups.

* change senders type from []uint16 to sync.Map

Fixes DATA RACE:

```
Read at 0x00c0013fcd3a by goroutine 183:
  github.com/tendermint/tendermint/mempool.(*MempoolReactor).broadcastTxRoutine()
      /go/src/github.com/tendermint/tendermint/mempool/reactor.go:195 +0x3c7

Previous write at 0x00c0013fcd3a by D[2019-02-27|10:10:49.058] Read PacketMsg                               switch=3 peer=35bc1e3558c182927b31987eeff3feb3d58a0fc5@127.0.0.1
:46552 conn=MConn{pipe} packet="PacketMsg{30:2B06579D0A143EB78F3D3299DE8213A51D4E11FB05ACE4D6A14F T:1}"
goroutine 190:
  github.com/tendermint/tendermint/mempool.(*Mempool).CheckTxWithInfo()
      /go/src/github.com/tendermint/tendermint/mempool/mempool.go:387 +0xdc1
  github.com/tendermint/tendermint/mempool.(*MempoolReactor).Receive()
      /go/src/github.com/tendermint/tendermint/mempool/reactor.go:134 +0xb04
  github.com/tendermint/tendermint/p2p.createMConnection.func1()
      /go/src/github.com/tendermint/tendermint/p2p/peer.go:374 +0x25b
  github.com/tendermint/tendermint/p2p/conn.(*MConnection).recvRoutine()
      /go/src/github.com/tendermint/tendermint/p2p/conn/connection.go:599 +0xcce

Goroutine 183 (running) created at:
D[2019-02-27|10:10:49.058] Send                                         switch=2 peer=1efafad5443abeea4b7a8155218e4369525d987e@127.0.0.1:46193 channel=48 conn=MConn{pipe} m
sgBytes=2B06579D0A146194480ADAE00C2836ED7125FEE65C1D9DD51049
  github.com/tendermint/tendermint/mempool.(*MempoolReactor).AddPeer()
      /go/src/github.com/tendermint/tendermint/mempool/reactor.go:105 +0x1b1
  github.com/tendermint/tendermint/p2p.(*Switch).startInitPeer()
      /go/src/github.com/tendermint/tendermint/p2p/switch.go:683 +0x13b
  github.com/tendermint/tendermint/p2p.(*Switch).addPeer()
      /go/src/github.com/tendermint/tendermint/p2p/switch.go:650 +0x585
  github.com/tendermint/tendermint/p2p.(*Switch).addPeerWithConnection()
      /go/src/github.com/tendermint/tendermint/p2p/test_util.go:145 +0x939
  github.com/tendermint/tendermint/p2p.Connect2Switches.func2()
      /go/src/github.com/tendermint/tendermint/p2p/test_util.go:109 +0x50

I[2019-02-27|10:10:49.058] Added good transaction                       validator=0 tx=43B4D1F0F03460BD262835C4AA560DB860CFBBE85BD02386D83DAC38C67B3AD7 res="&{CheckTx:gas_w
anted:1 }" height=0 total=375
Goroutine 190 (running) created at:
  github.com/tendermint/tendermint/p2p/conn.(*MConnection).OnStart()
      /go/src/github.com/tendermint/tendermint/p2p/conn/connection.go:210 +0x313
  github.com/tendermint/tendermint/libs/common.(*BaseService).Start()
      /go/src/github.com/tendermint/tendermint/libs/common/service.go:139 +0x4df
  github.com/tendermint/tendermint/p2p.(*peer).OnStart()
      /go/src/github.com/tendermint/tendermint/p2p/peer.go:179 +0x56
  github.com/tendermint/tendermint/libs/common.(*BaseService).Start()
      /go/src/github.com/tendermint/tendermint/libs/common/service.go:139 +0x4df
  github.com/tendermint/tendermint/p2p.(*peer).Start()
      <autogenerated>:1 +0x43
  github.com/tendermint/tendermint/p2p.(*Switch).startInitPeer()
```

* explain the choice of a map DS for senders

* extract ids pool/mapper to a separate struct

* fix literal copies lock value from senders: sync.Map contains sync.Mutex

* use sync.Map#LoadOrStore instead of Load

* fixes after Ismail's review

* rename resCbNormal to resCbFirstTime
2019-03-26 09:27:29 +01:00
zjubfd
25a3c8b172 rpc: support tls rpc (#3469)
Refs #3419
2019-03-23 18:08:15 +01:00
Thane Thomson
85be2a554e tools/tm-signer-harness: update height and round for test harness (#3466)
In order to re-enable the test harness for the KMS (see
tendermint/kms#227), we need some marginally more realistic proposals
and votes. This is because the KMS does some additional sanity checks
now to ensure the height and round are increasing over time.
2019-03-22 14:16:38 +01:00
Anton Kaliaev
1d4afb179b
replace PB2TM.ConsensusParams with a call to params#Update (#3448)
Fixes #3444
2019-03-21 11:05:39 +01:00
tracebundy
660bd4a53e fix comment (#3454) 2019-03-20 08:30:49 -04:00
Ethan Buchman
81b9bdf400
comments on validator ordering (#3452)
* comments on validator ordering

* NextValidatorsHash
2019-03-20 08:29:40 -04:00
Anton Kaliaev
926127c774 blockchain: update the maxHeight when a peer is removed (#3350)
* blockchain: update the maxHeight when a peer is removed

Refs #2699

* add a changelog entry

* make linter pass
2019-03-19 20:59:33 -04:00
zjubfd
03085c2da2 rpc: client disable compression (#3430) 2019-03-19 20:18:18 -04:00
Anton Kaliaev
7af4b5086a Remove RepeatTimer and refactor Switch#Broadcast (#3429)
* p2p: refactor Switch#Broadcast func

- call wg.Add only once
- do not call peers.List twice!
  * bad for performance
  * peers list can change in between calls!
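
The two points above can be illustrated with a rough sketch. The `peer` interface, `fakePeer` and the `broadcast` signature are assumptions made for this example and do not reproduce the actual Switch code.

```go
package main

import (
	"fmt"
	"sync"
)

// peer is a hypothetical, minimal stand-in for a connected peer.
type peer interface {
	Send(msg []byte) bool
}

type fakePeer struct{ ok bool }

func (p fakePeer) Send(msg []byte) bool { return p.ok }

// broadcast calls list exactly once and works on that snapshot, so the set of
// peers cannot change between sizing the WaitGroup and starting the goroutines.
func broadcast(list func() []peer, msg []byte) []bool {
	peers := list()
	results := make([]bool, len(peers))

	var wg sync.WaitGroup
	wg.Add(len(peers)) // a single Add for the whole batch
	for i, p := range peers {
		go func(i int, p peer) {
			defer wg.Done()
			results[i] = p.Send(msg) // each goroutine writes only its own slot
		}(i, p)
	}
	wg.Wait()
	return results
}

func main() {
	list := func() []peer { return []peer{fakePeer{true}, fakePeer{false}} }
	fmt.Println(broadcast(list, []byte("hello"))) // [true false]
}
```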

Refs #3306

* p2p: use time.Ticker instead of RepeatTimer

no need for RepeatTimer since we never Reset them

Refs #3306

* libs/common: remove RepeatTimer (also TimerMaker and Ticker interface)

"ancient code that’s caused no end of trouble" Ethan

I believe there's a much simpler way to write a ticker that can be reset
https://medium.com/@arpith/resetting-a-ticker-in-go-63858a2c17ec
2019-03-19 20:10:54 -04:00
needkane
60b2ae5f5a crypto: delete unused code (#3426) 2019-03-19 20:00:53 -04:00
Anca Zamfir
a6349f5063 Formalize proposer election algorithm properties (#3140)
* Update proposer-selection.md

* Fixed typos

* fixed typos

* Attempt to address some comments

* Update proposer-selection.md

* Update proposer-selection.md

* Update proposer-selection.md

Added the normalization step.

* Addressed review comments

* New example for normalization section

Added a new example to better show the need for normalization
Added requirement for changing validator set
Addressed review comments

* Fixed problem with R2

* fixed the math for new validator

* test

* more small updates

* Moved the centering above the round-robin election

- the centering is now done before the actual round-robin block
- updated examples
- cleanup

* change to reflect new implementation for new validator
2019-03-19 19:56:13 -04:00
Ethan Buchman
22bcfca87a
Merge pull request #3450 from tendermint/master
Merge master back to develop
2019-03-19 19:54:09 -04:00
Ethan Buchman
0d985ede28
Merge pull request #3417 from tendermint/release/v0.31.0
Release/v0.31.0
v0.0.0 v0.0.1 v0.31.0
2019-03-19 19:53:37 -04:00
Ismail Khoffi
1e3469789d Ensure WriteTimeout > TimeoutBroadcastTxCommit (#3443)
* Make sure config.TimeoutBroadcastTxCommit < rpcserver.WriteTimeout()

* remove redundant comment

* libs/rpc/http_server: move Read/WriteTimeout into Config

* increase defaults for read/write timeouts

Based on this article
https://www.digitalocean.com/community/tutorials/how-to-optimize-nginx-configuration

* WriteTimeout should be larger than TimeoutBroadcastTxCommit

* set a deadline for subscribing to txs

* extract duration into const

* add two changelog entries

* Update CHANGELOG_PENDING.md

Co-Authored-By: melekes <anton.kalyaev@gmail.com>

* Update CHANGELOG_PENDING.md

Co-Authored-By: melekes <anton.kalyaev@gmail.com>

* 12 -> 10

* changelog

* changelog
2019-03-19 19:45:51 -04:00
Ethan Buchman
5f68fbae37
Merge pull request #3449 from tendermint/ismail/merge_develop_into_release/0.31.0
Merge develop into release/0.31.0
2019-03-19 19:25:26 -04:00
Ismail Khoffi
e276f35f86 remove 3421 from changelog 2019-03-19 14:36:42 +01:00
Ismail Khoffi
8e62a3d62a Add #3421 to changelog and reorder alphabetically 2019-03-19 12:19:02 +01:00
Ismail Khoffi
48aaccab8f Merge in develop and update CHANGELOG.md 2019-03-19 12:09:26 +01:00
Anton Kaliaev
4162ebe8b5
types: refactor PB2TM.ConsensusParams to take BlockTimeIota as an arg (#3442)
See https://github.com/tendermint/tendermint/pull/3403/files#r266208947

In #3403 we unexposed BlockTimeIota from the ABCI, but it's still part
of the ConsensusParams struct, so we have to remember to add it back
after calling PB2TM.ConsensusParams. Instead, PB2TM.ConsensusParams
should take it as an argument

Fixes #3432
2019-03-19 11:38:32 +04:00
Ethan Buchman
551b6322f5
Update v0.31.0 release notes (#3434)
* changelog: fix formatting

* update release notes

* update changelog

* linkify

* update UPGRADING
2019-03-16 19:24:12 -04:00
Ismail Khoffi
52c4e15eb2 changelog: more review fixes/release/v0.31.0 (#3427)
* Update release summary

* Add pubsub config changes

* Add link to issue for pubsub changes
v0.31.0-rc0
2019-03-14 19:07:06 +04:00
Ismail Khoffi
5483ac6b0a minor changes / fixes to release 0.31.0 (#3422)
* bump ABCIVersion due to renaming BlockSizeParams -> BlockParams
(https://github.com/tendermint/tendermint/pull/3417#discussion_r264974791)

* Move changelog on consensus params entry to breaking

* Add @melekes' suggestion for breaking change in pubsub into upgrading.md

* Add changelog entry for #3351

* Add changelog entry for #3358 & #3359

* Add changelog entry for #3397

* remove changelog entry for #3397 (was already released in 0.30.2)

* move 3351 to improvements

* Update changelog comment
2019-03-14 15:17:49 +04:00
Anton Kaliaev
7457133307
grpcdb: close Iterator/ReverseIterator after use (#3424)
Fixes #3402
2019-03-14 15:00:58 +04:00
Anton Kaliaev
a59930a327
localnet: fix $LOG variable (#3423)
Fixes #3421

Before: it was creating a file named ${LOG:-tendermint.log} in .build/nodeX
After: it creates a file named tendermint.log
2019-03-13 16:09:05 +04:00
Ismail Khoffi
85c023db88 Prep release v0.31.0:
- update changelog, reset pending
- bump versions
- add external contributors (partly manually)
2019-03-12 20:07:26 +01:00
Anton Kaliaev
4cbd36f341
Merge pull request #3415 from tendermint/master
Merge master back to develop (do not squash)
2019-03-12 21:22:03 +04:00
Anton Kaliaev
e42f833fd4
Merge master back to develop (#3412)
* libs/db: close batch (#3397)

ClevelDB requires closing when WriteBatch is no longer needed, https://godoc.org/github.com/jmhodges/levigo#WriteBatch.Close

Fixes the memory leak in https://github.com/cosmos/cosmos-sdk/issues/3842

* update changelog and bump version to 0.30.2
2019-03-12 16:20:59 +04:00
Anton Kaliaev
ad3e990c6a
fix GO_VERSION in installation scripts (#3411)
there is no such file https://storage.googleapis.com/golang/go1.12.0.linux-amd64.tar.gz

Fixes #3405
2019-03-11 23:59:00 +04:00
srmo
676212fa8f cmd: make sure to have 'testnet' create the data directory for nonvals (#3409)
Fixes #3408
2019-03-11 23:06:03 +04:00
Anton Kaliaev
3035572034
cs: comment out log.Error to avoid TestReactorValidatorSetChanges timing out (#3401) 2019-03-11 22:52:09 +04:00
Anton Kaliaev
d741c7b478
limit number of /subscribe clients and queries per client (#3269)
* limit number of /subscribe clients and queries per client

Add the following config variables (under [rpc] section):
  * max_subscription_clients
  * max_subscriptions_per_client
  * timeout_broadcast_tx_commit

Fixes #2826

new HTTPClient interface for subscriptions

finalize HTTPClient events interface

remove EventSubscriber

fix data race

```
WARNING: DATA RACE
Read at 0x00c000a36060 by goroutine 129:
  github.com/tendermint/tendermint/rpc/client.(*Local).Subscribe.func1()
      /go/src/github.com/tendermint/tendermint/rpc/client/localclient.go:168 +0x1f0

Previous write at 0x00c000a36060 by goroutine 132:
  github.com/tendermint/tendermint/rpc/client.(*Local).Subscribe()
      /go/src/github.com/tendermint/tendermint/rpc/client/localclient.go:191 +0x4e0
  github.com/tendermint/tendermint/rpc/client.WaitForOneEvent()
      /go/src/github.com/tendermint/tendermint/rpc/client/helpers.go:64 +0x178
  github.com/tendermint/tendermint/rpc/client_test.TestTxEventsSentWithBroadcastTxSync.func1()
      /go/src/github.com/tendermint/tendermint/rpc/client/event_test.go:139 +0x298
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:827 +0x162

Goroutine 129 (running) created at:
  github.com/tendermint/tendermint/rpc/client.(*Local).Subscribe()
      /go/src/github.com/tendermint/tendermint/rpc/client/localclient.go:164 +0x4b7
  github.com/tendermint/tendermint/rpc/client.WaitForOneEvent()
      /go/src/github.com/tendermint/tendermint/rpc/client/helpers.go:64 +0x178
  github.com/tendermint/tendermint/rpc/client_test.TestTxEventsSentWithBroadcastTxSync.func1()
      /go/src/github.com/tendermint/tendermint/rpc/client/event_test.go:139 +0x298
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:827 +0x162

Goroutine 132 (running) created at:
  testing.(*T).Run()
      /usr/local/go/src/testing/testing.go:878 +0x659
  github.com/tendermint/tendermint/rpc/client_test.TestTxEventsSentWithBroadcastTxSync()
      /go/src/github.com/tendermint/tendermint/rpc/client/event_test.go:119 +0x186
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:827 +0x162
==================
```

lite client works (tested manually)

godoc comments

httpclient: do not close the out channel

use TimeoutBroadcastTxCommit

no timeout for unsubscribe

but 1s Local (5s HTTP) timeout for resubscribe

format code

change Subscribe#out cap to 1

and replace config vars with RPCConfig

TimeoutBroadcastTxCommit can't be greater than rpcserver.WriteTimeout

rpc: Context as first parameter to all functions

reformat code

fixes after my own review

fixes after Ethan's review

add test stubs

fix config.toml

* fixes after manual testing

- rpc: do not recommend using BroadcastTxCommit because it's slow and wastes
Tendermint resources (pubsub)
- rpc: better error in Subscribe and BroadcastTxCommit
- HTTPClient: do not resubscribe if err = ErrAlreadySubscribed

* fixes after Ismail's review

* Update rpc/grpc/grpc_test.go

Co-Authored-By: melekes <anton.kalyaev@gmail.com>
2019-03-11 22:45:58 +04:00
Anton Kaliaev
15f621141d
remove TimeIotaMs from ABCI consensus params (#3403)
Also

- init substructures to avoid panic in pb2tm.ConsensusParams
Before: if csp.Block is nil and we later try to access/write to it,
we'll panic.
After: if csp.Block is nil and we later try to access/write to it,
there'll be no panic.
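
The before/after behaviour can be sketched with stand-in types. The `consensusParams`, `blockParams` and `evidenceParams` structs and the `initSubstructures` helper below are hypothetical and much smaller than the real ones.

```go
package main

import "fmt"

// Hypothetical, trimmed-down parameter types with pointer substructures.
type blockParams struct{ MaxBytes int64 }
type evidenceParams struct{ MaxAge int64 }

type consensusParams struct {
	Block    *blockParams
	Evidence *evidenceParams
}

// initSubstructures replaces nil substructures with zero values so that
// later reads and writes through the pointers cannot panic.
func initSubstructures(p *consensusParams) {
	if p.Block == nil {
		p.Block = &blockParams{}
	}
	if p.Evidence == nil {
		p.Evidence = &evidenceParams{}
	}
}

func main() {
	var p consensusParams
	initSubstructures(&p)
	p.Block.MaxBytes = 1024 // would panic if Block were still nil
	fmt.Println(p.Block.MaxBytes, p.Evidence.MaxAge) // 1024 0
}
```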
2019-03-11 22:21:17 +04:00
Anca Zamfir
dc359bd3a5 types: remove check for priority order of existing validators (#3407)
When scaling and averaging are invoked, it is possible for validators
with close priorities to end up with the same priority. With the current code,
this makes it impossible to verify the priority orders before and after updates.

Fixes #3383
2019-03-11 18:17:25 +04:00
Ethan Buchman
976819537d
Merge pull request #3399 from tendermint/release/v0.30.2
Release/v0.30.2
v0.30.2
2019-03-11 08:17:14 -04:00
Anton Kaliaev
100ff08de9
p2p: do not panic when filter times out (#3384)
Fixes #3369
2019-03-11 15:31:53 +04:00
Anton Kaliaev
f996b10f47
update changelog and bump version to 0.30.2 2019-03-10 13:06:34 +04:00
Yumin Xia
36d7180ca2
libs/db: close batch (#3397)
ClevelDB requires closing when WriteBatch is no longer needed, https://godoc.org/github.com/jmhodges/levigo#WriteBatch.Close

Fixes the memory leak in https://github.com/cosmos/cosmos-sdk/issues/3842
2019-03-10 12:56:04 +04:00
Yumin Xia
b021f1e505 libs/db: close batch (#3397)
ClevelDB requires closing when WriteBatch is no longer needed, https://godoc.org/github.com/jmhodges/levigo#WriteBatch.Close

Fixes the memory leak in https://github.com/cosmos/cosmos-sdk/issues/3842
2019-03-10 12:46:32 +04:00
mircea-c
90794260bc circleci: removed complexity from docs deployment job (#3396) 2019-03-09 19:13:36 +04:00
Anton Kaliaev
b6a510a3e7
make ineffassign linter pass (#3386)
Refs #3262

This fixes two small bugs:

1) lite/dbprovider: return `ok` instead of true in parse* functions. It's odd that we were ignoring the `ok` value before.
2) consensus/state: previously, because of the shadowing, we almost never output "Error with msg". Now we declare both `added` and `err` at the beginning of the function, so there's no shadowing.
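
The shadowing pitfall from the second point can be reproduced in a few lines. The `addVote`, `shadowed` and `fixed` functions are hypothetical and only mirror the shape of the bug, not the consensus code.

```go
package main

import (
	"errors"
	"fmt"
)

// addVote is a hypothetical operation that may fail.
func addVote(fail bool) (bool, error) {
	if fail {
		return false, errors.New("bad vote")
	}
	return true, nil
}

// shadowed shows the bug: := declares a new err scoped to the if statement,
// so the outer err is never assigned and the error is silently lost.
func shadowed(fail bool) error {
	var err error
	if added, err := addVote(fail); err == nil && added {
		// handle the added vote
	}
	return err // always nil
}

// fixed declares both variables up front and assigns with =, so the
// error from addVote is the one that gets returned.
func fixed(fail bool) error {
	var (
		added bool
		err   error
	)
	if added, err = addVote(fail); err == nil && added {
		// handle the added vote
	}
	return err
}

func main() {
	fmt.Println(shadowed(true)) // <nil>
	fmt.Println(fixed(true))    // bad vote
}
```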
2019-03-08 09:46:09 +04:00
Ismail Khoffi
e415c326f9 update golang.org/x/crypto (#3392)
Update Gopkg.lock via dep ensure --update golang.org/x/crypto

see #3391 (comment) (nothing to review here really).
2019-03-08 09:40:59 +04:00