fix: preserve crash report ID high bits#8216
Merged
Merged
Conversation
8 tasks
📲 Install BuildsiOS
|
Contributor
Performance metrics 🚀
|
| Revision | Plain | With Sentry | Diff |
|---|---|---|---|
| ed33915 | 1231.46 ms | 1259.48 ms | 28.02 ms |
| e337271 | 1234.71 ms | 1270.10 ms | 35.40 ms |
| 1936da3 | 1223.18 ms | 1253.96 ms | 30.78 ms |
| 4014441 | 1232.16 ms | 1266.41 ms | 34.25 ms |
| dff20e3 | 1225.19 ms | 1258.66 ms | 33.47 ms |
| bd2b6d3 | 1213.78 ms | 1253.83 ms | 40.06 ms |
| ce33dcf | 1236.42 ms | 1267.76 ms | 31.34 ms |
| 8bb6aeb | 1215.82 ms | 1244.84 ms | 29.02 ms |
| 27850be | 1222.65 ms | 1254.23 ms | 31.58 ms |
| 1af7edf | 1210.63 ms | 1247.91 ms | 37.29 ms |
App size
| Revision | Plain | With Sentry | Diff |
|---|---|---|---|
| ed33915 | 24.14 KiB | 1.20 MiB | 1.17 MiB |
| e337271 | 24.14 KiB | 1.16 MiB | 1.13 MiB |
| 1936da3 | 24.14 KiB | 1.17 MiB | 1.15 MiB |
| 4014441 | 24.14 KiB | 1.16 MiB | 1.14 MiB |
| dff20e3 | 24.14 KiB | 1.15 MiB | 1.13 MiB |
| bd2b6d3 | 24.14 KiB | 1.17 MiB | 1.14 MiB |
| ce33dcf | 24.14 KiB | 1.19 MiB | 1.16 MiB |
| 8bb6aeb | 24.14 KiB | 1.17 MiB | 1.15 MiB |
| 27850be | 24.14 KiB | 1.18 MiB | 1.16 MiB |
| 1af7edf | 24.14 KiB | 1.19 MiB | 1.16 MiB |
philprime
approved these changes
Jun 26, 2026
philprime
left a comment
Member
There was a problem hiding this comment.
LGTM, thank you for fixing this and the flaky tests!
NinjaLikesCheez
approved these changes
Jun 26, 2026
itaybre
pushed a commit
that referenced
this pull request
Jun 26, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
📜 Description
This is a fun one. It essentially adds one cast to preserve the high bits in our report ID generation. This is required not only to avoid cycling IDs too frequently, but also to avoid accidentally creating 0-IDs, which we then discard. I pulled out the inner part that converts to ID from the
time_twe get fromgmtime_r()and added a regression test that shows the issue.💡 Motivation and Context
This problem appeared recently when another PR highlighted flaky tests in an unrelated part of the code. However, I also recently investigated that part of the code in the context of the
KSCrashinvestigation and thus had somewhat of a clue what was happening there.In particular, what popped into mind when looking at the logs was the fact that tests and retries all happened within the same second, which to me hinted at a connection to the report ID construction, which itself is based on a seconds counter, meaning tests running within the same second will all use the same ID base.
And it turned out I was right, but in a rather roundabout way. The three relevant parts in the code are here:
sentry-cocoa/Sources/SentryCrash/Recording/SentryCrashReportStore.c
Lines 183 to 198 in 34646fe
sentry-cocoa/Sources/SentryCrash/Recording/SentryCrashReportStore.c
Lines 153 to 158 in 34646fe
sentry-cocoa/Sources/SentryCrash/Recording/SentryCrashReportStore.c
Lines 129 to 133 in 34646fe
The first creates a report ID based on the current time. The last two are iterations from
getReportIDs()andgetReportCount(). The latter two establish the first important fact: the SDK ignores reports withID == 0. Regarding the first one, we have to look at the timestamp of the failing flaky tests: 2026-06-23 22:01:27.If we plug the timestamp into the formula, we get
which produces
4066106368or0xf25bdc00. After left-shifting by 23 bits, we get0x00792dee00000000. This means our entire lower 32 bits are 0.Now we come to the bug:
The intention here is that it creates a negative masking for the lower bits, leaving us with the upper 32 bits (in their final 64-bit position). But the problem is that the mask
0xffffffffis a 32-bit value in C, and inverting the bits yields0x00000000, which, in 64-bit, is also 0, i.e., a mask of0x0000000000000000, not the intended 64-bit mask0xffffffff00000000.Since the lower 32 bits are 0 anyway, and we have now masked away the non-zero upper 32 bits, the resulting ID equals 0. Meaning our tests checking on counts will fail to count the first report generated. But also: any report generated in production during one such unlucky second will be ignored.
How often does this happen? Since we shift the seconds by 23 bits, we only have 9 bits left in our second-counter precision. Meaning, around every 512 seconds (ignoring the formula details), we cycle the IDs. Whenever the
baseID modulo 512 == 0before shifting, the resulting ID equals zero. Of course, this also means we cycle the ID generation every 512 seconds, but I guess this is less of a problem since we generate crash reports here.The fix is to cast
0xffffffffto a 64-bit value, so that inverting the bits results in0xffffffff00000000, i.e., the expected mask. After my dive into this, I also looked upstream (in the old version,developchanged report generation considerably), and as you can expect, the bug was fixed. What is rather surprising and painful is that it was fixed back in 2020.💚 How did you test it?
Added a unit test that fails with the old implementation and correctly handles the new one.
📝 Checklist
You have to check all boxes before merging:
sendDefaultPIIis enabled.Closes #8217