Add Priority Connection Queue to Worker #4386

Open
wants to merge 24 commits into
base: main

nibanks
Member

@nibanks nibanks commented Jul 5, 2024

Description

This builds on #4279 and adds the prioritization logic at the worker queue level. A high priority operation queued on a connection now makes that connection high priority on its worker.
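
As a rough illustration of the idea (not the code in this PR), a worker queue can keep one list plus a second insertion point just past the last high priority connection. The sketch below assumes a singly linked list; every name in it (`Conn`, `WorkerQueue`, `QueueConnection`, `QueuePriorityConnection`, `DequeueConnection`, `DemoDrainOrder`) is hypothetical and chosen for the example only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical types for illustration only; not the MsQuic definitions. */
typedef struct Conn {
    int Id;
    bool Queued;
    struct Conn* Next;
} Conn;

typedef struct WorkerQueue {
    Conn* Head;
    Conn** Tail;          /* link where normal connections are appended */
    Conn** PriorityTail;  /* link just past the last high priority connection */
} WorkerQueue;

static void QueueInit(WorkerQueue* Q) {
    Q->Head = NULL;
    Q->Tail = &Q->Head;
    Q->PriorityTail = &Q->Head;
}

/* Normal priority: append at the very end of the queue. */
static void QueueConnection(WorkerQueue* Q, Conn* C) {
    assert(!C->Queued);
    C->Queued = true;
    C->Next = NULL;
    *Q->Tail = C;
    Q->Tail = &C->Next;
}

/* High priority: insert behind earlier high priority connections,
 * but ahead of every normal priority connection. */
static void QueuePriorityConnection(WorkerQueue* Q, Conn* C) {
    assert(!C->Queued);
    C->Queued = true;
    C->Next = *Q->PriorityTail;
    *Q->PriorityTail = C;
    if (Q->Tail == Q->PriorityTail) {
        Q->Tail = &C->Next;  /* no normal connections were queued */
    }
    Q->PriorityTail = &C->Next;
}

/* Pop the connection at the head, fixing up both insertion points. */
static Conn* DequeueConnection(WorkerQueue* Q) {
    Conn* C = Q->Head;
    if (C == NULL) {
        return NULL;
    }
    Q->Head = C->Next;
    if (Q->Tail == &C->Next) {
        Q->Tail = &Q->Head;
    }
    if (Q->PriorityTail == &C->Next) {
        Q->PriorityTail = &Q->Head;
    }
    C->Next = NULL;
    C->Queued = false;
    return C;
}

/* Queues 1 and 2 as normal, then 3 and 4 as high priority, and writes
 * the drain order into Out. Returns the number of connections drained. */
int DemoDrainOrder(int Out[4]) {
    Conn A = { 1, false, NULL }, B = { 2, false, NULL };
    Conn C = { 3, false, NULL }, D = { 4, false, NULL };
    WorkerQueue Q;
    QueueInit(&Q);
    QueueConnection(&Q, &A);
    QueueConnection(&Q, &B);
    QueuePriorityConnection(&Q, &C);
    QueuePriorityConnection(&Q, &D);
    int Count = 0;
    Conn* Next;
    while ((Next = DequeueConnection(&Q)) != NULL) {
        Out[Count++] = Next->Id;
    }
    return Count;
}
```

In this sketch the two high priority connections drain first, in the order they were queued, followed by the two normal ones in their original order. A real worker would also need locking and a way to promote a connection that is already queued, both of which are left out here.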

Testing

CI/CD

Documentation

N/A

@nibanks nibanks added Area: Performance, Area: Core labels Jul 5, 2024
@nibanks nibanks requested a review from a team as a code owner July 5, 2024 12:48
src/test/lib/DataTest.cpp (outdated, resolved)
@ami-GS ami-GS force-pushed the nibanks/priority-connection branch from d4f36b0 to c9950eb Compare July 11, 2024 00:16
@ami-GS
Contributor

ami-GS commented Jul 11, 2024

Random stall issue

@ami-GS
Contributor

ami-GS commented Jul 22, 2024

QuicWorkerQueueConnection is the cause of the stall, but then the priority mechanism randomly fails.


codecov bot commented Jul 22, 2024

Codecov Report

Attention: Patch coverage is 94.28571% with 2 lines in your changes missing coverage. Please review.

Project coverage is 85.00%. Comparing base (db66ab9) to head (5b79341).
Report is 16 commits behind head on main.

Files | Patch % | Lines
src/core/worker.c | 93.75% | 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4386      +/-   ##
==========================================
- Coverage   85.02%   85.00%   -0.03%     
==========================================
  Files          56       56              
  Lines       15457    15485      +28     
==========================================
+ Hits        13143    13163      +20     
- Misses       2314     2322       +8     


Comment on lines +3792 to +3804
ExpectedStartOrder[0] = &Stream1;
ExpectedStartOrder[1] = &Stream4;
ExpectedStartOrder[2] = &Stream2;
ExpectedStartOrder[3] = &Stream5;
ExpectedStartOrder[4] = &Stream3;
ExpectedSendOrder[0] = &Stream1;
ExpectedSendOrder[1] = &Stream4;
ExpectedSendOrder[2] = &Stream2;
ExpectedSendOrder[3] = &Stream5;
ExpectedSendOrder[4] = &Stream3;

TEST_TRUE(memcmp(Context.StartOrder, ExpectedStartOrder, sizeof(ExpectedStartOrder)) == 0);
TEST_TRUE(memcmp(Context.SendOrder, ExpectedSendOrder, sizeof(ExpectedSendOrder)) == 0);
Contributor

@ami-GS ami-GS Jul 23, 2024

Sometimes [3] and [4] (the non-prioritized operations) are flipped. This looks like a timing issue.

Contributor

@ami-GS ami-GS Jul 23, 2024

But how can operations in a FIFO queue with a 1:1 producer-to-consumer ratio get swapped?

Contributor

Found the cause.
There is a case where [3] is already in the worker queue before the API explicitly enqueues the operation.
A conventional enqueue doesn't re-enqueue the connection. A prioritized enqueue does.
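
The difference between the two enqueue paths can be shown with a toy model. This is only a sketch under assumptions: the queue is a small array of connection ids whose first `PriorityCount` entries are high priority, and the names `ToyQueue`, `EnqueueNormal`, and `EnqueuePriority` are made up for the example. It shows the re-enqueue behavior only and does not reproduce the test flip itself.

```c
#include <stdbool.h>
#include <string.h>

#define MAX_QUEUED 8

/* Hypothetical toy model, not the worker code: a queue of connection
 * ids where the first PriorityCount entries are high priority. */
typedef struct ToyQueue {
    int Ids[MAX_QUEUED];
    int Count;
    int PriorityCount;
} ToyQueue;

static int FindIndex(const ToyQueue* Q, int Id) {
    for (int i = 0; i < Q->Count; i++) {
        if (Q->Ids[i] == Id) {
            return i;
        }
    }
    return -1;
}

/* Conventional enqueue: a connection that is already queued keeps its
 * current position. Returns true only if the queue changed. */
static bool EnqueueNormal(ToyQueue* Q, int Id) {
    if (FindIndex(Q, Id) >= 0 || Q->Count == MAX_QUEUED) {
        return false;
    }
    Q->Ids[Q->Count++] = Id;
    return true;
}

/* Prioritized enqueue: a connection that is already queued as normal
 * is pulled out and re-inserted behind the last high priority entry. */
static bool EnqueuePriority(ToyQueue* Q, int Id) {
    int i = FindIndex(Q, Id);
    if (i >= 0) {
        if (i < Q->PriorityCount) {
            return false;  /* already high priority, nothing to do */
        }
        memmove(&Q->Ids[i], &Q->Ids[i + 1],
                (size_t)(Q->Count - i - 1) * sizeof(int));
        Q->Count--;
    } else if (Q->Count == MAX_QUEUED) {
        return false;
    }
    /* Shift the normal entries right by one to open a slot. */
    memmove(&Q->Ids[Q->PriorityCount + 1], &Q->Ids[Q->PriorityCount],
            (size_t)(Q->Count - Q->PriorityCount) * sizeof(int));
    Q->Ids[Q->PriorityCount++] = Id;
    Q->Count++;
    return true;
}
```

With ids 10, 20, 30 queued as normal, a second `EnqueueNormal(&Q, 30)` leaves the order untouched, while `EnqueuePriority(&Q, 30)` moves 30 to the front. So the relative order of connections depends on which path ran and on whether the connection was already in the queue at that moment.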

@ami-GS
Contributor

ami-GS commented Jul 24, 2024

This PR itself is ready to go.

spinquic has a stall issue. Investigation is in progress.

@ami-GS
Contributor

ami-GS commented Jul 24, 2024

RegConfig.ExecutionProfile = QUIC_EXECUTION_PROFILE_TYPE_SCAVENGER
ExecConfig.ProcessorCount = 1
1 RunThread thread

With these settings the stall still occurs.

@ami-GS
Contributor

ami-GS commented Jul 24, 2024

@nibanks
macOS doesn't hit the stall issue. I thought the difference on macOS was the number of (platform_)worker threads, but are there any other differences around scheduling?

@nibanks
Member Author

nibanks commented Jul 25, 2024

macOS only uses 1 CPU / worker thread, so all the scheduling/timings will be very different

Labels
Area: Core, Area: Performance
2 participants