
Show HN: Writing a C++20 M:N Scheduler from Scratch (EBR, Work-Stealing)

February 17, 2026

tiny_coro is a lightweight, high-performance M:N cooperative asynchronous runtime written from scratch on top of C++20 coroutines.

The project's main goal is education: taking a modern high-concurrency runtime apart. It strips away the layered abstractions of industrial-grade libraries (such as Seastar and Folly) and shows the core components in as little code as possible: an M:N scheduler, a work-stealing algorithm, EBR (epoch-based reclamation) for memory reclamation, a lock-free queue, and the Reactor pattern.

If you want to learn how coroutines work under the hood, or you are looking for a starting point with C++20 coroutines, this repository might be exactly what you need. It aims to be easy to follow without being shallow, and to serve as a worked example for understanding C++20 coroutine mechanics and systems programming in depth.

Please read this README to the end first. The final section explains how to navigate the repository for learning.

All .h, .hpp, and .cpp files and all CMakeLists.txt files in this project are licensed under the MIT License. All other files, including but not limited to how_to_make_your_M:N_scheduler.md, all .md files under the docs folder, project assets, diagrams, and this README, are licensed under CC BY-NC-SA 4.0 (commercial use strictly prohibited).

A simple web backend built on this framework, simple_http_web.cpp, reached 186,045 QPS in a local loopback wrk test. The project also passes all local tests, including runs under ASAN and TSAN.

For comparison, we also wrote a concise Go program with the same test logic, main.go. It reached 193,587 QPS in the same local loopback wrk test in the same environment.

This is not surprising. The Go runtime has been heavily optimized by Google and is tightly integrated with the operating system, so the result is expected. Keep in mind that the test environment is my local machine. The socket errors reported by wrk are most likely an artifact of OS bottlenecks combined with the behavior of the test tool, rather than a logic bug. When the system is close to port exhaustion or denial of service, the behavior of the TCP stack becomes unstable: to reclaim resources, the operating system may send RST (reset) packets directly to connections that appear stuck or are at the tail end, instead of closing them gracefully with FIN.

In lock-free programming (as in StealQueue), while one thread is reading node A, another thread may remove A and free its memory. If the system immediately reuses that memory, the reading thread has a use-after-free: it will crash or read garbage data.

tiny_coro addresses this with EBR (epoch-based reclamation):
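To make the idea concrete, here is a minimal sketch of epoch-based reclamation. It is not tiny_coro's code: the class name EbrDomain, the fixed thread-slot count, the mutex-protected limbo list, and the use of sequentially consistent atomics throughout are all assumptions chosen for readability. The rule it illustrates is the usual one: a retired node is destroyed only after the global epoch has advanced twice, and the epoch can advance only when every active reader has caught up with it.

```cpp
#include <algorithm>
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical sketch; names and layout are NOT taken from tiny_coro.
class EbrDomain {
public:
    static constexpr std::size_t kMaxThreads = 8;
    static constexpr std::uint64_t kIdle = ~std::uint64_t{0};

    EbrDomain() {
        for (auto& slot : local_) slot.store(kIdle);
    }

    // Reader side: publish the epoch we observed before touching shared nodes.
    void enter(std::size_t tid) {
        std::uint64_t e = global_.load();
        for (;;) {
            local_[tid].store(e);
            std::uint64_t now = global_.load();
            if (now == e) return;  // our pin is visible at the current epoch
            e = now;               // epoch moved while we were pinning: retry
        }
    }

    void leave(std::size_t tid) { local_[tid].store(kIdle); }

    // Writer side: call AFTER the node is unlinked; destruction is deferred.
    void retire(std::function<void()> deleter) {
        std::uint64_t e = global_.load();
        std::lock_guard<std::mutex> lk(limbo_mutex_);
        limbo_.push_back({e, std::move(deleter)});
    }

    // Try to advance the epoch, then run every deleter that is at least
    // two epochs old. Returns how many deleters ran.
    std::size_t collect() {
        std::uint64_t e = global_.load();
        bool all_caught_up = true;
        for (auto& slot : local_) {
            std::uint64_t seen = slot.load();
            if (seen != kIdle && seen != e) { all_caught_up = false; break; }
        }
        if (all_caught_up) global_.compare_exchange_strong(e, e + 1);
        std::uint64_t now = global_.load();

        std::vector<std::function<void()>> ready;
        {
            std::lock_guard<std::mutex> lk(limbo_mutex_);
            auto first_ready = std::partition(
                limbo_.begin(), limbo_.end(),
                [now](const Retired& r) { return r.epoch + 2 > now; });
            for (auto it = first_ready; it != limbo_.end(); ++it)
                ready.push_back(std::move(it->deleter));
            limbo_.erase(first_ready, limbo_.end());
        }
        for (auto& d : ready) d();  // run deleters outside the lock
        return ready.size();
    }

private:
    struct Retired {
        std::uint64_t epoch;
        std::function<void()> deleter;
    };

    std::atomic<std::uint64_t> global_{0};
    std::array<std::atomic<std::uint64_t>, kMaxThreads> local_;
    std::mutex limbo_mutex_;  // simplification: the limbo list is not lock-free
    std::vector<Retired> limbo_;
};

// Single-threaded walk-through of the protocol.
int ebr_demo() {
    EbrDomain ebr;
    int freed = 0;
    ebr.enter(0);                       // reader 0 pins epoch 0
    ebr.retire([&freed] { ++freed; });  // node unlinked and retired at epoch 0
    ebr.collect();                      // epoch 0 -> 1, node is still too young
    ebr.collect();                      // reader 0 is pinned at 0: epoch is stuck
    assert(freed == 0);                 // nothing is freed under an active reader
    ebr.leave(0);
    ebr.collect();                      // epoch 1 -> 2, the deleter runs
    return freed;
}
```

In this sketch a reader that stays pinned blocks reclamation indefinitely, which is the well-known trade-off of EBR: reads are cheap, but memory use is bounded only if readers leave their critical sections promptly.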

The StealQueue in include/queue.h uses the Chase-Lev work-stealing deque algorithm:
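For readers who have not seen Chase-Lev before, the following is a minimal sketch of the algorithm, not the contents of include/queue.h. The class name ChaseLevDeque, the fixed power-of-two capacity (the original algorithm grows its buffer), and the use of default sequentially consistent atomics instead of tuned memory orders are assumptions made to keep the example short. The owner thread pushes and pops at the bottom (LIFO), while thieves take from the top (FIFO) with a compare-and-swap.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <type_traits>

// Hypothetical sketch; NOT the StealQueue from tiny_coro.
template <typename T, std::size_t Capacity>
class ChaseLevDeque {
    static_assert(Capacity > 0 && (Capacity & (Capacity - 1)) == 0,
                  "Capacity must be a power of two");
    static_assert(std::is_trivially_copyable_v<T>,
                  "slots are stored as std::atomic<T>");

public:
    // Owner thread only. Returns false when the deque is full.
    bool push(T value) {
        std::int64_t b = bottom_.load();
        std::int64_t t = top_.load();
        if (b - t >= static_cast<std::int64_t>(Capacity)) return false;
        slot(b).store(value);
        bottom_.store(b + 1);  // publish the new element to thieves
        return true;
    }

    // Owner thread only. Takes the most recently pushed element.
    std::optional<T> pop() {
        std::int64_t b = bottom_.load() - 1;
        bottom_.store(b);  // reserve the bottom slot before reading top
        std::int64_t t = top_.load();
        if (t > b) {       // deque was empty: undo the reservation
            bottom_.store(b + 1);
            return std::nullopt;
        }
        T value = slot(b).load();
        if (t == b) {
            // Last element: race against thieves for it with a CAS on top.
            bool won = top_.compare_exchange_strong(t, t + 1);
            bottom_.store(b + 1);
            if (!won) return std::nullopt;
        }
        return value;
    }

    // Any thread. Takes the oldest element, or nothing if it loses a race.
    std::optional<T> steal() {
        std::int64_t t = top_.load();
        std::int64_t b = bottom_.load();
        if (t >= b) return std::nullopt;
        T value = slot(t).load();  // read first, then claim with the CAS
        if (!top_.compare_exchange_strong(t, t + 1)) return std::nullopt;
        return value;
    }

private:
    std::atomic<T>& slot(std::int64_t index) {
        return slots_[static_cast<std::size_t>(index) & (Capacity - 1)];
    }

    std::atomic<std::int64_t> top_{0};
    std::atomic<std::int64_t> bottom_{0};
    std::array<std::atomic<T>, Capacity> slots_{};
};

// Single-threaded walk-through: thieves see FIFO order, the owner sees LIFO.
int deque_demo() {
    ChaseLevDeque<int, 8> q;
    for (int i = 1; i <= 3; ++i) q.push(i);
    std::optional<int> stolen = q.steal();  // oldest element
    std::optional<int> popped = q.pop();    // newest element
    assert(stolen.has_value() && popped.has_value());
    return *stolen * 10 + *popped;
}
```

The split between the two ends is the point of the design: the owner works on recently pushed tasks, which are likely still hot in its cache, and only contends with thieves when a single element is left.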

include/async_mutex.h and include/channel.h rely on the bool-returning form of await_suspend:

This project depends only on the C++20 standard library and system APIs (Linux/macOS).

If you want to contact me, here is my email: lixia.chat@outlook.com

Marco Rodriguez
