Improve Performance for SRS #3666
Replies: 4 comments
-
SRS4: Refine ST Iterate Coroutines Performance
There is an optimization in ST (State Threads) that could improve performance by 5% to 10%. It mainly addresses the cost of iterating coroutines. Data reference: ossrs/state-threads#5 (comment). This optimization involves significant changes, so it will not be implemented in SRS3, but it is expected to be in SRS4.
MacPro information:
Docker information:
SRS3 for Playing Baseline
SRS3 without this optimization serves as the performance baseline, showing how much this PR improves on it.
Interpretation:
SRS3 for Playing with ST Refined
SRS3 with this PR merged, which optimizes the ST iteration logic.
Interpretation:
-
SRS3: Use Compiler O2 To Improve Performance
SRS 1, 2, and 3 have always used O0 by default, which disables compiler optimization. The data can be compared after enabling optimization.
MacPro information:
Docker information:
SRS3 Play Baseline
First, the baseline data: average CPU usage is 66%, with 39% in user space and 22% in system space.
SRS3 Play with Compiler O2
After enabling the O2 compiler option for SRS3, performance improves by about 10%: CPU usage is around 52%, with 26% in user space and 17% in system space.
-
It was found that the baseline in the Docker environment may be unstable: it is sometimes high and sometimes low, with significant differences, as shown in the following figure:
Some optimizations have been made. Some of them, such as enabling O2, are expected to improve performance, but because the baseline is unstable they are on hold for now and will be tested on a physical machine later. The following are the optimization branches:
-
Regarding ST optimization, the points that can be optimized are:
For analysis on ST, refer to: https://github.com/ossrs/state-threads/tree/srs#analysis
-
Performance optimization is an endless topic that requires continuous improvement. SRS2 has undergone a significant performance optimization, increasing from 3k to 7k. Further optimizations are needed, and the optimization process and data will be posted in this issue.
Previously, SRS2 had undergone some optimizations, as referenced below:
Play RTMP benchmark
The data for playing RTMP was benchmarked by SB (srs-bench):
Publish RTMP benchmark
The data for publishing RTMP was benchmarked by SB (srs-bench):
Play HTTP FLV benchmark
The data for playing HTTP FLV was benchmarked by SB (srs-bench):
Latency benchmark
The latency between encoder and player with the realtime config (wiki pages v3_CN_LowLatency and v3_EN_LowLatency):