This PR contains a few major improvements:
* Code duplication has been removed.
* Everything has been refactored so that the implementation is much easier to understand.
* `future_to_promise` is now implemented in terms of `spawn_local` rather than the other way around, so `spawn_local` is faster because it no longer creates an unnecessary `Promise`.
* Both the single-threaded and multi-threaded executors have been rewritten from scratch:
  * They perform only 1-2 Rust allocations per Task, and all of them happen when the Task is created.
  * The single-threaded executor creates 1 `Promise` per tick, rather than 1 `Promise` per tick per Task.
  * Neither executor creates `Closure`s during polling; all needed `Closure`s are created ahead of time.
  * Both executors now behave correctly with regard to spurious wakeups and to waking up during the call to `poll`.
  * Both executors cache the `Waker` so it doesn't need to be recreated every time.
I believe both executors are now optimal in terms of both Rust and JS performance.
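
To make the queueing behavior described in the list above more concrete, here is a minimal sketch using only the Rust standard library. It is not the code in this PR. All names (`Queue`, `QueueState`, `Task`, `run_tick`, `WakeTwiceThenFinish`, `run_demo`) are hypothetical, it uses `Arc`/`Mutex` instead of any JS types, a plain counter stands in for creating a `Promise`, and for brevity it rebuilds the `Waker` on each poll instead of caching it. It only illustrates two ideas: one scheduled tick drains every queued Task, and a per-Task flag deduplicates wakeups while still honoring a wakeup that arrives during `poll`.

```rust
use std::collections::VecDeque;
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};

type BoxFuture = Pin<Box<dyn Future<Output = ()> + Send>>;

struct QueueState {
    tasks: VecDeque<Arc<Task>>,
    tick_pending: bool,     // true while a tick is already scheduled
    ticks_scheduled: usize, // assumption: stands in for "one Promise per tick"
}

struct Queue {
    state: Mutex<QueueState>,
}

impl Queue {
    fn push(&self, task: Arc<Task>) {
        let mut state = self.state.lock().unwrap();
        state.tasks.push_back(task);
        // Schedule at most one tick, no matter how many Tasks are queued.
        if !state.tick_pending {
            state.tick_pending = true;
            state.ticks_scheduled += 1;
        }
    }

    // Drain the queue, including Tasks that re-queue themselves while running.
    fn run_tick(&self) {
        loop {
            let next = {
                let mut state = self.state.lock().unwrap();
                let next = state.tasks.pop_front();
                if next.is_none() {
                    state.tick_pending = false;
                }
                next
            };
            match next {
                Some(task) => Task::run(&task),
                None => break,
            }
        }
    }
}

struct Task {
    future: Mutex<Option<BoxFuture>>,
    queued: AtomicBool,
    queue: Arc<Queue>,
}

impl Task {
    fn spawn(queue: &Arc<Queue>, future: impl Future<Output = ()> + Send + 'static) {
        let future: BoxFuture = Box::pin(future);
        let task = Arc::new(Task {
            future: Mutex::new(Some(future)),
            queued: AtomicBool::new(false),
            queue: Arc::clone(queue),
        });
        task.wake(); // queue the Task for its first poll
    }

    fn run(task: &Arc<Task>) {
        // Clear the flag *before* polling so a wakeup during `poll` re-queues the Task.
        task.queued.store(false, Ordering::SeqCst);
        let waker = Waker::from(Arc::clone(task));
        let mut cx = Context::from_waker(&waker);
        let mut slot = task.future.lock().unwrap();
        let done = match slot.as_mut() {
            Some(future) => future.as_mut().poll(&mut cx).is_ready(),
            None => false, // already finished: a late wakeup is harmless
        };
        if done {
            *slot = None;
        }
    }
}

impl Wake for Task {
    fn wake(self: Arc<Self>) {
        // `swap` returns the previous value: a Task that is already queued is not queued again.
        if !self.queued.swap(true, Ordering::SeqCst) {
            let queue = Arc::clone(&self.queue);
            queue.push(self);
        }
    }
}

// Demo future: wakes itself twice during its first poll, then completes on the second poll.
struct WakeTwiceThenFinish {
    polls: Arc<AtomicUsize>,
}

impl Future for WakeTwiceThenFinish {
    type Output = ();
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.polls.fetch_add(1, Ordering::SeqCst) == 0 {
            cx.waker().wake_by_ref();
            cx.waker().wake_by_ref(); // duplicate wakeup, ignored by the flag
            Poll::Pending
        } else {
            Poll::Ready(())
        }
    }
}

// Returns (number of polls of the demo future, number of ticks scheduled).
fn run_demo() -> (usize, usize) {
    let queue = Arc::new(Queue {
        state: Mutex::new(QueueState { tasks: VecDeque::new(), tick_pending: false, ticks_scheduled: 0 }),
    });
    let polls = Arc::new(AtomicUsize::new(0));
    Task::spawn(&queue, WakeTwiceThenFinish { polls: Arc::clone(&polls) });
    Task::spawn(&queue, async {});
    queue.run_tick();
    let ticks = queue.state.lock().unwrap().ticks_scheduled;
    (polls.load(Ordering::SeqCst), ticks)
}
```

In this sketch `run_demo()` returns `(2, 1)`: the demo future is polled exactly twice even though it woke itself twice during its first poll, and both Tasks share a single scheduled tick.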