Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Offline: Add batchId to batch metadata on resubmit #21767

Merged
merged 17 commits into from
Jul 10, 2024

Conversation

markfields
Copy link
Member

@markfields markfields commented Jul 3, 2024

Description

Batch ID is the key piece of information needed to detect/prevent duplication of changes due to a forked container.

We compute Batch ID as a tuple of original client ID and original Client Sequence number (of the batch start). We only stamp the batchId on batch metadata (of first message in the batch) on resubmit - if a batch is missing batchId we will be able to compute it from the client ID and Client Sequence Number.

The next PR will be more interesting - comparing batch IDs in PendingStateManager as incoming messages are processed. This just sets it up by adding batch ID to batch metadata on resubmit.

Breaking Changes

This PR breaks the format of the stashed pending local state. Back-compat support for reading old-format stashes is included. New format is backwards compatible (additive). This is slightly superfluous anyway since there is no production usage of the stashed schema.

Reviewer Guidance

For the end-to-end flow, see prototype PR #21266

Here are some tests I considered in my test plan but I am not planning to write at this point (i.e. scenarios without direct test coverage). Many of these will be covered in the next PR where we have end-to-end coverage of the forked container scenarios.

  • PendingStateManager
    • Load from old stashed format. batchIdContext !== undefined
    • Load twice from old stashed format. Batches are not correlated and duplication occurs
    • Flushing two batches when disconnected should yield unique batch IDs
    • Resubmitting twice preserves existing batchId
    • Resubmitting after rehydrate preserves stashed batchIdContext or batchId
  • end-to-end
    • batchId is present on sequenced messages being processed when expected
@github-actions github-actions bot added area: runtime Runtime related issues base: main PRs targeted against main branch labels Jul 3, 2024
@markfields markfields marked this pull request as ready for review July 9, 2024 06:24
@markfields markfields requested review from anthony-murphy, kian-thompson, dannimad, vladsud, a team, pragya91, agarwal-navin and jatgarg and removed request for a team July 9, 2024 17:48
// Only include Batch ID if "Offline Load" feature is enabled
// It's only needed to identify batches across container forks arising from misuse of offline load.
const includeBatchId =
this.mc.config.getBoolean("Fluid.Container.enableOfflineLoad") ?? false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's interesting that we're reading this flag from two different places. I don't know if that's a good or bad thing. And it always makes me worry about loader-runtime back compat with older drivers and how they might behave.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two different places in this PR? I thought it was just here.

This flag is read in multiple places and is the single gate for all Offline related scenarios. Good question about loader/runtime... Yeah... I see what you mean, the other hit is in Loader layer. I'll follow up, but leaving here for now since none of this has shipped yet.

export function asBatchMetadata(metadata: unknown): IBatchMetadata | undefined {
return isBatchMetadata(metadata) ? metadata : undefined;
return metadata as IBatchMetadata | undefined;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For some reason, I favored the older solution as it seemed a little safer, but I don't know if this is important.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I had just added that type guard, but on second though didn't think the runtime checks were that interesting. The only risk is if two different parts of code used the same key batch or batchId and stored data with a different type and different semantics. That's not going to happen though... So let's not waste time with the runtime checks.

Since all IBatchMetadata props are optional, and this thing indicates it could be undefined, callers have to handle those cases.

Maybe it's better to return Partial<IBatchMetadata> | undefined to emphasize the uncertainty (basically just want callers to be able to get intellisense on valid property names + expected types)

@msfluid-bot
Copy link
Collaborator

@fluid-example/bundle-size-tests: +2.55 KB
Metric NameBaseline SizeCompare SizeSize Diff
aqueduct.js 456.5 KB 457.11 KB +625 Bytes
azureClient.js 554.23 KB 554.86 KB +639 Bytes
connectionState.js 680 Bytes 680 Bytes No change
containerRuntime.js 257.71 KB 258.3 KB +606 Bytes
fluidFramework.js 388.04 KB 388.06 KB +14 Bytes
loader.js 134.1 KB 134.11 KB +14 Bytes
map.js 42.17 KB 42.17 KB +7 Bytes
matrix.js 145.43 KB 145.44 KB +7 Bytes
odspClient.js 522.23 KB 522.86 KB +639 Bytes
odspDriver.js 97.17 KB 97.19 KB +21 Bytes
odspPrefetchSnapshot.js 42.27 KB 42.29 KB +14 Bytes
sharedString.js 162.51 KB 162.52 KB +7 Bytes
sharedTree.js 378.51 KB 378.51 KB +7 Bytes
Total Size 3.26 MB 3.26 MB +2.55 KB

Baseline commit: 0450953

Generated by 🚫 dangerJS against 13d315d

@markfields markfields merged commit adf81de into microsoft:main Jul 10, 2024
36 checks passed
@markfields markfields deleted the o3/batch-id branch July 10, 2024 23:45
markfields added a commit that referenced this pull request Jul 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: runtime Runtime related issues base: main PRs targeted against main branch
4 participants