Wanna Discuss Add Voice Text by arul28 · Pull Request #578 · arul28/ADE

arul28 · 2026-06-15T07:09:07Z

Summary

Describe the change.

What Changed

Key files and behaviors.

Validation

How you tested.

Risks

Anything to watch.

Open in ADE · ade/wanna-discuss-add-voice-text-46c24910 branch · PR #578

Summary by CodeRabbit

New Features
- Added on-device voice dictation to chat composers with microphone input controls
- Added voice input toggle to app settings (Desktop and iOS)
- Added live recording indicator displaying elapsed time and audio levels
- Integrated voice glossary for improved speech-to-text accuracy with contextual term recognition and automatic corrections
- Added microphone permission handling and status indicators

Greptile Summary

This PR adds on-device voice dictation to both the desktop (Electron/whisper.cpp) and iOS (iOS 26 SpeechAnalyzer) chat composers, with a shared deterministic cleanup pipeline, a Dynamic Island Live Activity, and a voice glossary for domain-term recognition.

Desktop: A module-level GlobalVoiceRecorder singleton captures 16 kHz mono PCM via Web Audio and ships it to a main-process TranscriptionService, which shells out to a bundled whisper.cpp binary. The transcript then goes through a deterministic cleanup pass (filler removal → corrections → capitalization → spacing) before it is inserted into the registered composer and copied to the clipboard as a recovery net.
iOS (iOS 26+): A SpeechDictationService streams buffers into a SpeechAnalyzer/SpeechTranscriber pipeline; a DictationController singleton lifts recording above individual composers so it survives navigation, drives a Live Activity Dynamic Island, and routes the cleaned transcript to whichever composer is visible.
Shared glossary: A single voice-glossary.json (corrections + contextual terms + fillers) is bundled on both platforms; the deterministic cleanup algorithms on both sides now agree on capitalizeSentences behavior including non-letter leading characters.
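
For readers who want a concrete picture of the cleanup order described above (trim → fillers → corrections → capitalize → spacing), here is a minimal TypeScript sketch. It is not the code from this PR: the `Glossary` shape, the helper names, and the regex-based matching are all assumptions made for illustration, and the real `cleanTranscript` may differ in details such as how it treats abbreviations or decimals.

```typescript
// Hypothetical glossary shape: the real voice-glossary.json schema is not shown in this thread.
interface Glossary {
  fillers: string[];
  corrections: { from: string; to: string }[];
}

// Escape regex metacharacters so glossary phrases match literally.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

function removeFillers(text: string, fillers: string[]): string {
  let out = text;
  for (const filler of fillers) {
    // Whole-word, case-insensitive match; also swallow a trailing comma.
    out = out.replace(new RegExp(`\\b${escapeRegExp(filler)}\\b,?`, "gi"), "");
  }
  return out;
}

function applyCorrections(text: string, corrections: Glossary["corrections"]): string {
  // Longest phrase first, so "get hub repo" would win over "get hub".
  const sorted = [...corrections].sort((a, b) => b.from.length - a.from.length);
  let out = text;
  for (const { from, to } of sorted) {
    // A replacer function keeps "$" in the replacement text literal.
    out = out.replace(new RegExp(`\\b${escapeRegExp(from)}\\b`, "gi"), () => to);
  }
  return out;
}

function capitalizeSentences(text: string): string {
  let capitalizeNext = true;
  let out = "";
  for (let i = 0; i < text.length; i++) {
    const ch = text.charAt(i);
    const isLetter = ch.toLowerCase() !== ch.toUpperCase();
    if (capitalizeNext && isLetter) {
      out += ch.toUpperCase();
      capitalizeNext = false;
    } else {
      // Non-letters (spaces, quotes, brackets) leave a pending capitalization in place.
      out += ch;
      if (ch === "." || ch === "!" || ch === "?") capitalizeNext = true;
    }
  }
  return out;
}

function normalizeSpacing(text: string): string {
  return text.replace(/\s+/g, " ").replace(/ ([,.!?;:])/g, "$1").trim();
}

function cleanTranscript(raw: string, glossary: Glossary): string {
  const withoutFillers = removeFillers(raw.trim(), glossary.fillers);
  const corrected = applyCorrections(withoutFillers, glossary.corrections);
  return normalizeSpacing(capitalizeSentences(corrected));
}
```

Because every step is a pure string transform, the same sequence can be mirrored on another platform and checked against shared fixtures, which is presumably why the summary stresses that both sides agree on `capitalizeSentences`.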

Confidence Score: 5/5

Safe to merge — all previously flagged correctness issues are resolved; remaining findings are documentation and tooling quality concerns only.

Large feature addition with well-structured platform isolation; core recording lifecycles (generation guards, teardown paths, single-flight transcription queue) are correctly implemented on both platforms. The generation-based cancel guard in GlobalVoiceRecorder correctly handles all cancel-during-getUserMedia race windows. The iOS SpeechDictationService isStarting guard, defer teardown, and double-safe cancelIOS26 calls are all sound. The cleanup pipelines on both platforms now agree. The three flagged items are documentation and tooling quality concerns, not correctness bugs in the shipped feature.

apps/desktop/src/renderer/services/globalVoiceRecorder.ts (inaccurate IPC comment), apps/ios/ADE/Services/Dictation/SpeechDictationService.swift (locale fallback comment vs. implementation), apps/desktop/scripts/materialize-whisper-resources.mjs (build-step timeout)

Important Files Changed

Filename	Overview
apps/desktop/src/renderer/services/globalVoiceRecorder.ts	New module-level singleton for Web Audio PCM capture; generation-based cancel guard correctly aborts getUserMedia and prevents orphaned audio graphs; misleading "zero-copy" comment for the IPC ArrayBuffer transfer.
apps/desktop/src/main/services/transcription/transcriptionService.ts	Well-implemented whisper.cpp wrapper: serialized single-flight queue, proper temp-file cleanup, WAV header written without native deps, typed error codes, configurable process timeout with SIGKILL fallback.
apps/ios/ADE/Services/Dictation/SpeechDictationService.swift	iOS 26 SpeechAnalyzer integration with generation-based cancellation, proper defer/catch teardown, and isStarting guard; preferredLocale comment claims a device-locale fallback that isn't implemented.
apps/ios/ADE/Services/Dictation/DictationController.swift	App-level singleton wiring dictation lifecycle to Live Activity and insertion targets; isFinishing guard, Combine republishing pattern, and double-safe cancelIOS26 calls are all correct.
apps/desktop/src/main/services/transcription/dictationCleanup.ts	Deterministic transcript cleanup (trim → fillers → corrections → capitalize → spacing); longest-first sort, module-level cache by path, and test-only reset hook are all correct.
apps/ios/ADE/Services/Dictation/DictationCleanup.swift	Swift counterpart to the TS cleanup pipeline; capitalizeSentences now correctly preserves pending capitalization through non-letter leading chars (matches TS behavior).
apps/desktop/scripts/materialize-whisper-resources.mjs	Download + build script for whisper.cpp; downloadFile now has a configurable timeout, but spawnStep used for git clone and cmake has no timeout protection against hung builds.
apps/desktop/src/renderer/components/chat/AgentChatComposer.tsx	Voice dictation wired into composer via ref-based insertion target registration; stale-closure issue on draft is correctly avoided using insertDictatedTextRef; shimmer animation injects CSS once via a style tag id guard.

Sequence Diagram

sequenceDiagram
    participant UI as Composer UI
    participant GVR as GlobalVoiceRecorder
    participant Store as Root App Store
    participant IPC as Preload IPC
    participant TS as TranscriptionService (main)
    participant Whisper as whisper.cpp

    UI->>GVR: start()
    GVR->>Store: setDictationPhase("recording") [optimistic]
    GVR->>IPC: requestMicAccess()
    IPC-->>GVR: "{status: "granted"}"
    GVR->>GVR: getUserMedia() + build AudioGraph
    note over GVR: ScriptProcessorNode collects Float32 chunks

    UI->>GVR: finish()
    GVR->>Store: setDictationPhase("transcribing")
    GVR->>GVR: downsampleToInt16(merged, srcRate)
    GVR->>IPC: "transcribe(pcm.buffer, {format:"int16"})"
    IPC->>TS: ipcMain.handle(transcriptionTranscribe)
    TS->>TS: pcmToWavBuffer() → write tmp WAV
    TS->>Whisper: spawn whisper-cli -m model -f wav -oj -np
    Whisper-->>TS: WAV.json sidecar
    TS->>TS: parseWhisperJson() + cleanTranscript()
    TS-->>IPC: "{raw, cleaned}"
    IPC-->>GVR: TranscriptionResult
    GVR->>Store: activeDictationTarget.insertText(cleaned)
    GVR->>Store: resetDictationSession() → "idle"
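
The `downsampleToInt16(merged, srcRate)` step in the diagram can be pictured with the sketch below. This is an assumption-laden illustration, not the PR's implementation: it uses simple block averaging (no proper low-pass filter), and the `TARGET_SAMPLE_RATE` constant and function body are written here only to show the Float32-to-Int16 conversion at a 16 kHz target.

```typescript
const TARGET_SAMPLE_RATE = 16000;

// Convert Float32 samples in [-1, 1] at srcRate to 16 kHz signed 16-bit PCM
// by averaging each block of source samples that maps onto one output sample.
function downsampleToInt16(samples: Float32Array, srcRate: number): Int16Array {
  if (!isFinite(srcRate) || srcRate <= 0) {
    throw new RangeError("srcRate must be a finite positive number");
  }
  const ratio = srcRate / TARGET_SAMPLE_RATE;
  const outLength = Math.floor(samples.length / ratio);
  const out = new Int16Array(outLength);
  for (let i = 0; i < outLength; i++) {
    const start = Math.floor(i * ratio);
    const end = Math.min(samples.length, Math.max(start + 1, Math.floor((i + 1) * ratio)));
    let sum = 0;
    for (let j = start; j < end; j++) sum += samples[j];
    // Clamp before scaling so out-of-range input cannot wrap around.
    const clamped = Math.max(-1, Math.min(1, sum / (end - start)));
    out[i] = clamped < 0 ? Math.round(clamped * 0x8000) : Math.round(clamped * 0x7fff);
  }
  return out;
}
```

Block averaging is a crude anti-aliasing measure; a production recorder might prefer a filtered resampler, but for speech sent to a recognizer this kind of approach is a common starting point.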

Reviews (5): Last reviewed commit: "ship: iteration 4 - address dictation re..."

vercel · 2026-06-15T07:09:13Z

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment

Project	Deployment	Actions	Updated (UTC)
ade	Ignored	Preview	Jun 15, 2026 8:38am

coderabbitai · 2026-06-15T07:09:15Z

Warning

Review limit reached

@arul28, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 8 minutes and 26 seconds. Learn how PR review limits work.

Your organization has used up its prepaid credits, and credit purchases are no longer available. Enable the review add-on in the billing tab to keep reviews running — you're only billed for reviews past your plan's rate limits ($0.25/file).


ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 8ff0d988-c84b-4a79-993b-6d0c67b18f74

📥 Commits

Reviewing files that changed from the base of the PR and between b9006de and f996a3e.

⛔ Files ignored due to path filters (4)

apps/ios/ADE.xcodeproj/project.pbxproj is excluded by !**/*.xcodeproj/project.pbxproj
docs/features/chat/composer-and-ui.md is excluded by !docs/**
docs/features/onboarding-and-settings/README.md is excluded by !docs/**
docs/features/sync-and-multi-device/ios-companion.md is excluded by !docs/**

📒 Files selected for processing (55)

.gitignore
apps/desktop/build/entitlements.mac.inherit.plist
apps/desktop/build/entitlements.mac.plist
apps/desktop/package.json
apps/desktop/resources/voice/voice-glossary.json
apps/desktop/resources/whisper/.gitkeep
apps/desktop/resources/whisper/README.md
apps/desktop/scripts/materialize-whisper-resources.mjs
apps/desktop/scripts/validate-whisper-resources.mjs
apps/desktop/src/main/main.ts
apps/desktop/src/main/services/ipc/ipcTimeouts.test.ts
apps/desktop/src/main/services/ipc/ipcTimeouts.ts
apps/desktop/src/main/services/ipc/registerIpc.ts
apps/desktop/src/main/services/transcription/bundledResources.ts
apps/desktop/src/main/services/transcription/dictationCleanup.ts
apps/desktop/src/main/services/transcription/transcriptionService.test.ts
apps/desktop/src/main/services/transcription/transcriptionService.ts
apps/desktop/src/preload/global.d.ts
apps/desktop/src/preload/preload.ts
apps/desktop/src/renderer/components/app/App.tsx
apps/desktop/src/renderer/components/app/App.workKeepAlive.test.tsx
apps/desktop/src/renderer/components/app/TopBar.tsx
apps/desktop/src/renderer/components/chat/AgentChatComposer.tsx
apps/desktop/src/renderer/components/chat/VoiceDictationButton.tsx
apps/desktop/src/renderer/components/settings/AppearanceSection.tsx
apps/desktop/src/renderer/components/settings/DictationSection.tsx
apps/desktop/src/renderer/components/voice/GlobalVoiceCaptureIndicator.tsx
apps/desktop/src/renderer/components/voice/RecordingPill.tsx
apps/desktop/src/renderer/hooks/useVoiceModelInstalled.ts
apps/desktop/src/renderer/services/globalVoiceRecorder.ts
apps/desktop/src/renderer/state/appStore.ts
apps/desktop/src/shared/ipc.ts
apps/ios/ADE/App/ADEApp.swift
apps/ios/ADE/App/ContentView.swift
apps/ios/ADE/Info.plist
apps/ios/ADE/Resources/VoiceGlossary.json
apps/ios/ADE/Services/Dictation/DictationCleanup.swift
apps/ios/ADE/Services/Dictation/DictationController.swift
apps/ios/ADE/Services/Dictation/SpeechDictationService.swift
apps/ios/ADE/Services/Dictation/VoiceGlossary.swift
apps/ios/ADE/Shared/DictationActivityShared.swift
apps/ios/ADE/Views/Components/DictationMicButton.swift
apps/ios/ADE/Views/Components/GlobalDictationPill.swift
apps/ios/ADE/Views/Components/RecordingPill.swift
apps/ios/ADE/Views/Settings/ConnectionSettingsView.swift
apps/ios/ADE/Views/Settings/SettingsVoiceInputSection.swift
apps/ios/ADE/Views/Work/WorkChatComposerAndInputViews.swift
apps/ios/ADE/Views/Work/WorkChatSessionView.swift
apps/ios/ADE/Views/Work/WorkNewChatScreen.swift
apps/ios/ADE/Views/Work/WorkNewChatSheet.swift
apps/ios/ADE/Views/Work/WorkPreviews.swift
apps/ios/ADE/Views/Work/WorkRootScreen.swift
apps/ios/ADETests/ADETests.swift
apps/ios/ADEWidgets/ADEWidgetBundle.swift
apps/ios/ADEWidgets/DictationLiveActivity.swift

📝 Walkthrough

Adds end-to-end voice dictation to both the Electron desktop app and the iOS app. On desktop, a Node.js script materializes whisper.cpp binaries and a base English model, a main-process transcription service encodes PCM to WAV and spawns whisper-cli, and three new IPC channels expose transcription, status, and mic-access to the renderer. On iOS, a new SpeechDictationService drives the iOS 26 SpeechAnalyzer/SpeechTranscriber pipeline and a DictationController routes transcripts to composer insertion targets or clipboard. Both platforms share a deterministic cleanup pipeline (filler removal, longest-first corrections, sentence capitalization) driven by a shared VoiceGlossary.json resource.

Changes

Desktop Voice Dictation

Layer / File(s)	Summary
Build infrastructure: entitlements, packaging, gitignore `.gitignore`, `apps/desktop/build/entitlements.mac*.plist`, `apps/desktop/package.json`, `apps/desktop/resources/whisper/README.md`	Adds `com.apple.security.device.audio-input` to both macOS entitlement plists, updates electron-builder `extraResources` to package `voice/` and `whisper/` directories, expands the `x64ArchFiles` glob, wires materialize/validate steps into all dist scripts, and gitignores built whisper binaries.
Whisper resource materialization and validation scripts `apps/desktop/scripts/materialize-whisper-resources.mjs`, `apps/desktop/scripts/validate-whisper-resources.mjs`, `apps/desktop/resources/voice/voice-glossary.json`	`materialize-whisper-resources.mjs` downloads `ggml-base.en.bin` and a platform-specific `whisper-cli` (or builds from source via cmake/git). `validate-whisper-resources.mjs` checks file presence, minimum size, and executable bits. Adds the voice glossary JSON with contextual terms, corrections, and fillers.
Transcription service: bundled resource resolution, cleanup, and whisper execution `apps/desktop/src/main/services/transcription/bundledResources.ts`, `apps/desktop/src/main/services/transcription/dictationCleanup.ts`, `apps/desktop/src/main/services/transcription/transcriptionService.ts`, `apps/desktop/src/main/services/transcription/transcriptionService.test.ts`	`resolveBundledResource` handles packaged vs dev path lookup. `dictationCleanup` defines `VoiceGlossary`/`PreparedGlossary` types and a `cleanTranscript` pipeline (filler removal, longest-first corrections, sentence capitalization, whitespace normalization). `transcriptionService` encodes PCM to 16kHz WAV, spawns whisper-cli with `-oj`, parses JSON sidecar output, serializes requests via a promise queue, and exposes typed `TranscriptionError` codes. Tests validate cleanup behavior and `buildWhisperArgs`.
IPC channels, main-process wiring, and preload bridge `apps/desktop/src/shared/ipc.ts`, `apps/desktop/src/main/main.ts`, `apps/desktop/src/main/services/ipc/registerIpc.ts`, `apps/desktop/src/preload/preload.ts`, `apps/desktop/src/preload/global.d.ts`	Adds three IPC channel constants. `main.ts` lazily constructs a shared `TranscriptionService` singleton and disposes it on `will-quit`. `registerIpc` adds `transcriptionTranscribe` (PCM normalization, typed error re-throw), `transcriptionStatus`, and `transcriptionRequestMicAccess` (macOS `systemPreferences` mic-access flow). The preload contextBridge and `global.d.ts` expose `window.ade.transcription`.
Renderer Zustand store: voiceInputEnabled and dictation session state `apps/desktop/src/renderer/state/appStore.ts`	Adds persisted `voiceInputEnabled` to `PersistedUserPreferences` and propagates it to project stores. Adds ephemeral dictation session state (`DictationPhase`, `dictationElapsed`, `dictationLevels`, `activeDictationTarget`) with full setter suite, `resetDictationSession`, and `register/unregisterDictationTarget`. Exports `rootAppStoreApi` and `useRootAppStore`.
GlobalVoiceRecorder singleton: Web Audio capture, waveform, and transcription `apps/desktop/src/renderer/services/globalVoiceRecorder.ts`	Module-level singleton managing mic capture via Web Audio (MediaStreamSource → Analyser + ScriptProcessorNode). `start()` optionally checks macOS mic permission, publishes waveform levels and elapsed time to the root store. `finish()` downsamples PCM to 16kHz Int16, calls `window.ade.transcription.transcribe`, inserts cleaned text into the active dictation target, and copies to clipboard as fallback.
RecordingPill, GlobalVoiceCaptureIndicator, and VoiceDictationButton `apps/desktop/src/renderer/components/voice/RecordingPill.tsx`, `apps/desktop/src/renderer/components/voice/GlobalVoiceCaptureIndicator.tsx`, `apps/desktop/src/renderer/components/chat/VoiceDictationButton.tsx`, `apps/desktop/src/renderer/components/app/TopBar.tsx`	`RecordingPill` renders idle/transcribing/recording states with a waveform bar visualization, `prefersReducedMotion` support, and cancel/done controls. `GlobalVoiceCaptureIndicator` subscribes to root store dictation state and renders the pill in `TopBar` when not idle. `VoiceDictationButton` maps typed recorder errors to user messages and renders either the pill or a mic button.
AgentChatComposer integration and DictationSection settings `apps/desktop/src/renderer/components/chat/AgentChatComposer.tsx`, `apps/desktop/src/renderer/hooks/useVoiceModelInstalled.ts`, `apps/desktop/src/renderer/components/settings/DictationSection.tsx`, `apps/desktop/src/renderer/components/settings/AppearanceSection.tsx`, `apps/desktop/src/renderer/components/app/App.tsx`, `apps/desktop/src/renderer/components/app/App.workKeepAlive.test.tsx`	`AgentChatComposer` injects shimmer keyframes, registers itself as the active dictation target via `rootAppStoreApi`, inserts dictated text at the caret/saved selection, and renders a dismissible error banner plus a `VoiceDictationButton` or disabled mic icon. `useVoiceModelInstalled` probes `window.ade.transcription.status`. `DictationSection` renders a voice-input checkbox with a "model not installed" notice. `voiceInputEnabled` is added to `rootPrefs` hydration.
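
The table above mentions that the transcription service encodes PCM to a 16 kHz WAV without native dependencies. A rough sketch of what writing such a header can look like follows. The function name `pcmToWavBytes` is hypothetical (the PR's own helper is called `pcmToWavBuffer` and is not shown here), and the sketch assumes mono, 16-bit, little-endian PCM.

```typescript
// Wrap mono 16-bit PCM samples in a canonical 44-byte RIFF/WAVE header.
function pcmToWavBytes(pcm: Int16Array, sampleRate: number): Uint8Array {
  const channels = 1;
  const bytesPerSample = 2;
  const dataSize = pcm.length * bytesPerSample;
  const buffer = new ArrayBuffer(44 + dataSize);
  const view = new DataView(buffer);

  const writeAscii = (offset: number, text: string): void => {
    for (let i = 0; i < text.length; i++) view.setUint8(offset + i, text.charCodeAt(i));
  };

  writeAscii(0, "RIFF");
  view.setUint32(4, 36 + dataSize, true); // file size minus the first 8 bytes
  writeAscii(8, "WAVE");
  writeAscii(12, "fmt ");
  view.setUint32(16, 16, true); // size of the fmt chunk for plain PCM
  view.setUint16(20, 1, true); // audio format 1 = uncompressed PCM
  view.setUint16(22, channels, true);
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * channels * bytesPerSample, true); // byte rate
  view.setUint16(32, channels * bytesPerSample, true); // block align
  view.setUint16(34, 8 * bytesPerSample, true); // bits per sample
  writeAscii(36, "data");
  view.setUint32(40, dataSize, true);

  for (let i = 0; i < pcm.length; i++) {
    view.setInt16(44 + i * bytesPerSample, pcm[i], true); // little-endian samples
  }
  return new Uint8Array(buffer);
}
```

Writing the header by hand keeps the main process free of audio libraries; the resulting bytes can be written to a temp file and handed to a CLI that reads WAV input.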

iOS Voice Dictation

Layer / File(s)	Summary
iOS permissions and app-level bootstrap `apps/ios/ADE/Info.plist`, `apps/ios/ADE/App/ADEApp.swift`, `apps/ios/ADE/App/ContentView.swift`	Declares microphone and speech recognition usage strings and the `audio` background mode in `Info.plist`. `ADEApp` owns a `@StateObject DictationController` injected as an environment object. `ContentView` adds `GlobalDictationPill()` as a top safe-area inset.
VoiceGlossary model, loading, and DictationCleanup pipeline `apps/ios/ADE/Resources/VoiceGlossary.json`, `apps/ios/ADE/Services/Dictation/VoiceGlossary.swift`, `apps/ios/ADE/Services/Dictation/DictationCleanup.swift`, `apps/ios/ADETests/ADETests.swift`	Adds `VoiceGlossary.json` with contextual terms, corrections, and fillers. `VoiceGlossary.swift` defines the model with a cached `.shared` loader that sorts corrections longest-first. `DictationCleanup.clean` applies the same deterministic pipeline as desktop: filler removal, corrections, sentence capitalization, and spacing normalization.
SpeechDictationService: AVAudioEngine + iOS 26 SpeechAnalyzer `apps/ios/ADE/Services/Dictation/SpeechDictationService.swift`	`SpeechDictationService` (`@MainActor` ObservableObject) drives `AVAudioEngine` and the iOS 26 `SpeechAnalyzer`/`SpeechTranscriber` pipeline. `startIOS26` installs locale model assets, injects contextual vocabulary into `AnalysisContext`, starts the analyzer with an `AsyncStream`, and installs an audio input tap. `DictationBufferConverter` handles mic-to-analyzer format conversion. Exponential-moving-average RMS levels power the waveform.
DictationController orchestration and Live Activity shared contract `apps/ios/ADE/Services/Dictation/DictationController.swift`, `apps/ios/ADE/Shared/DictationActivityShared.swift`	`DictationActivityAttributes`/`ContentState` define the Live Activity contract (waveform levels, elapsed time, isFinishing) shared between the app and widget targets. `DictationDoneIntent`/`DictationCancelIntent` dispatch through `DictationActivityActionRegistry`. `DictationController` owns `SpeechDictationService`, maintains an insertion-target registry with visibility tracking, routes finished transcripts to the visible target or clipboard, and manages throttled ActivityKit Live Activity updates.
RecordingPill, DictationMicButton, GlobalDictationPill `apps/ios/ADE/Views/Components/RecordingPill.swift`, `apps/ios/ADE/Views/Components/DictationMicButton.swift`, `apps/ios/ADE/Views/Components/GlobalDictationPill.swift`	`RecordingPill` shows elapsed time, a `DictationWaveform` with `Canvas`/`TimelineView` easing (Reduce Motion aware), and cancel/done controls. `DictationMicButton` registers with `DictationController`, inserts cleaned transcripts with smart leading-space and a UTF-16 undo span, shows a timed `DictationRawUndoChip`. `GlobalDictationPill` shows the pill globally when the active target is not visible.
Composer surface integrations `apps/ios/ADE/Views/Work/WorkChatSessionView.swift`, `apps/ios/ADE/Views/Work/WorkChatComposerAndInputViews.swift`, `apps/ios/ADE/Views/Work/WorkNewChatScreen.swift`, `apps/ios/ADE/Views/Work/WorkNewChatSheet.swift`, `apps/ios/ADE/Views/Work/WorkRootScreen.swift`, `apps/ios/ADE/Views/Work/WorkPreviews.swift`	Threads `dictationTargetId` through `WorkChatComposerCard` → `WorkChatComposerDraftInput`. Adds `DictationMicButton` + `DictationInsertionCoordinator` to `WorkChatComposerDraftInput`, `WorkQueuedSteerRow`, `WorkNewChatComposerBar`, and `WorkNewChatSheet`. `WorkRootScreen` re-injects `dictationController` into `navigationDestination` pushed views.
Settings section and Live Activity widget `apps/ios/ADE/Views/Settings/SettingsVoiceInputSection.swift`, `apps/ios/ADE/Views/Settings/ConnectionSettingsView.swift`, `apps/ios/ADEWidgets/DictationLiveActivity.swift`, `apps/ios/ADEWidgets/ADEWidgetBundle.swift`	Adds `SettingsVoiceInputSection` (AppStorage-backed voice input toggle) into `ConnectionSettingsView`. `DictationLiveActivity` implements an iOS 17+ Live Activity with a Dynamic Island layout (mic glyph, waveform, timer, interactive `DictationDoneIntent` Done button) and a lock-screen view. Registered in `ADEWidgetBundle` behind `#available(iOS 17.0, *)`.

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Suggested labels

desktop, ios


mintlify · 2026-06-15T07:09:15Z

Preview deployment for your docs. Learn more about Mintlify Previews.

Project	Status	Preview	Updated (UTC)
ade-ac1c6011	🟢 Ready	View Preview	Jun 15, 2026, 7:09 AM


arul28 · 2026-06-15T07:09:22Z

@copilot review but do not make fixes

arul28 · 2026-06-15T07:31:02Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 696f9fc0a0

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you:

- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-06-15T07:43:06Z

    didYouKnowEnabled: rootState.didYouKnowEnabled,
    launchPromptClipboardEnabled: rootState.launchPromptClipboardEnabled,
    launchPromptClipboardNoticeEnabled: rootState.launchPromptClipboardNoticeEnabled,
+    voiceInputEnabled: rootState.voiceInputEnabled,


Keep voice preference synced into project stores

When settings or another project surface toggles voice input while a project store already exists, this initial copy stays stale because ProjectTabHost only re-hydrates open project stores from its rootPrefs selector, and that selector does not include voiceInputEnabled. As a result, the new Settings checkbox and chat composers inside the active project can continue reading the old value until the project surface is recreated/reloaded, so disabling voice input may leave the mic visible/enabled (or enabling it may not show the mic). Add this preference to the root-pref hydration path or read it from the root store like the dictation session state.


arul28 · 2026-06-15T07:46:44Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b9006de324


chatgpt-codex-connector · 2026-06-15T07:55:14Z

+    if converter == nil || converter?.outputFormat != format {
+      converter = AVAudioConverter(from: inputFormat, to: format)
+      converter?.primeMethod = .none


Recreate the converter when the mic format changes

If a user records once, then changes audio route or sample rate before the next recording (for example switching to AirPods/Bluetooth), this cached converter is reused as long as the analyzer output format is unchanged. The converter was created for the old inputFormat, so subsequent buffers with a different format can fail conversion and get dropped by try?, leaving the analyzer with little or no audio. Include the input format in the cache check or reset the converter between sessions.


arul28 · 2026-06-15T08:01:46Z

@codex review

coderabbitai

Actionable comments posted: 17

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@apps/desktop/scripts/materialize-whisper-resources.mjs`:
- Around line 48-55: The whisperBinarySpecForHost function currently selects the
whisper-cli binary based on the host architecture, which can produce an
arm64-only binary even when building universal macOS artifacts, causing exec
format errors on Intel Macs. Modify the function to detect when building for
universal macOS (check for an environment variable or build context that
indicates universal build mode) and in that case, use a universal binary target
(such as darwin-universal) instead of the host architecture-based target. This
ensures that the universal build gets a compatible binary regardless of which
architecture the build runs on.
- Around line 141-142: The console.log statements are logging full URLs that may
contain sensitive query tokens from signed URLs, risking secret exposure in CI
logs. Remove or redact the MODEL_URL parameter from the log messages to prevent
secrets from being exposed. Keep the MODEL_BASENAME and other descriptive
information visible, but sanitize the URL by either omitting the query
parameters, replacing the URL with a redacted placeholder, or extracting only
the base path without query strings. Apply this same fix to both logging
locations mentioned in the comment (the downloadFile call and the additional
location at lines 270-271).
- Around line 204-210: The code is currently falling back to cloning the default
branch when the pinned WHISPER_SRC_REF cannot be cloned, which breaks
reproducibility. Instead of the fallback behavior, remove the spawnStep call
that clones WHISPER_SRC_REPO without the pinned ref and replace it with a throw
statement or process.exit call after the warning log. This ensures the script
fails if the pinned ref cannot be cloned, maintaining reproducibility of release
artifacts.
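
As an illustration of the URL-redaction suggestion above, one possible shape for a log-safe formatter is sketched here. The helper name `redactUrlForLog` and the `<redacted>` placeholder are assumptions, not part of the script under review; the sketch relies only on the standard `URL` class.

```typescript
// Return a version of the URL that is safe to print in CI logs:
// query string, fragment, and embedded credentials are dropped.
function redactUrlForLog(rawUrl: string): string {
  try {
    const url = new URL(rawUrl);
    const hadQuery = url.search.length > 0;
    url.search = "";
    url.hash = "";
    url.username = "";
    url.password = "";
    // Keep a marker so readers can tell a query string was present.
    return hadQuery ? `${url.toString()}?<redacted>` : url.toString();
  } catch {
    // Never echo input we could not parse; it may still contain a token.
    return "<unparseable url>";
  }
}
```

A call site could then log the model basename together with `redactUrlForLog(MODEL_URL)` instead of the raw URL, which keeps the host and path visible for debugging while hiding signed query parameters.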

In `@apps/desktop/scripts/validate-whisper-resources.mjs`:
- Around line 50-52: The statFile call for the voice glossary only validates
file existence but not JSON structure. After the statFile check for the glossary
file path (voice-glossary.json in voiceRoot), add additional validation to read
and parse the JSON content to ensure it is well-formed. If the JSON parsing
fails or the structure is invalid, throw an error so the build fails early
rather than allowing malformed JSON to pass through and degrade behavior at
runtime.
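
A build-time structure check along the lines suggested above might look like the following sketch. The key names `fillers` and `corrections`, and the `from`/`to` fields on each correction, are assumptions inferred from how the glossary is described in this thread (`contextualTerms` is named in a later comment); the real schema should be checked before adopting anything like this.

```typescript
function isStringArray(value: unknown): boolean {
  return Array.isArray(value) && value.every((item) => typeof item === "string");
}

// Assumed entry shape: { from: string, to: string }.
function isCorrection(value: unknown): boolean {
  if (typeof value !== "object" || value === null) return false;
  const entry = value as Record<string, unknown>;
  return typeof entry["from"] === "string" && typeof entry["to"] === "string";
}

// Returns a list of human-readable problems; an empty list means the file looks usable.
function findGlossaryProblems(jsonText: string): string[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(jsonText);
  } catch (error) {
    return [`not valid JSON: ${String(error)}`];
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return ["top level must be a JSON object"];
  }
  const glossary = parsed as Record<string, unknown>;
  const problems: string[] = [];
  if (!isStringArray(glossary["contextualTerms"])) {
    problems.push("contextualTerms must be an array of strings");
  }
  if (!isStringArray(glossary["fillers"])) {
    problems.push("fillers must be an array of strings");
  }
  const corrections = glossary["corrections"];
  if (!Array.isArray(corrections) || !corrections.every(isCorrection)) {
    problems.push("corrections must be an array of {from, to} string pairs");
  }
  return problems;
}
```

A validation script could read the file, call `findGlossaryProblems`, and throw with the joined messages when the list is non-empty, so a malformed glossary fails the build rather than silently degrading cleanup at runtime.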

In `@apps/desktop/src/main/main.ts`:
- Around line 5933-5936: The sharedTranscriptionService disposal code currently
lives only in the will-quit handler, but when the app is not ready during
shutdown, finalizeAppExit calls process.exit() directly, which bypasses the
will-quit event and leaves the service undisposed. Move the
sharedTranscriptionService?.dispose() and sharedTranscriptionService = null
statements from the will-quit handler to the runImmediateProcessCleanup function
so the disposal runs before both app.exit() (in the ready path) and
process.exit() (in the non-ready path) calls, ensuring the transcription service
and its child processes are properly cleaned up regardless of app readiness
state.

In `@apps/desktop/src/main/services/ipc/registerIpc.ts`:
- Around line 6198-6222: The IPC.transcriptionTranscribe handler returns
transcription results containing raw and cleaned text that can be exposed in
main-process logs through the global IPC wrapper's result logging. You need to
add redaction for this channel to prevent sensitive dictated text from appearing
in logs. First, add result redaction for the TranscriptionResult return value
(which contains the raw and cleaned transcript text) to mask the transcribed
content. Second, explicitly redact the PCM audio argument in the handler to
prevent raw audio data from being logged during IPC tracing. This redaction
should be applied at the IPC wrapper level where results and arguments are
logged for slow or verbose calls, not just at the handler implementation itself.
- Around line 6210-6222: The code in the PCM buffer handling logic accepts
renderer-controlled input without sufficient validation before constructing
typed arrays and forwarding to the transcribe service. Add validation checks
before creating the typed arrays: validate that the format parameter is one of
the expected values (such as "float32" or "int16"), verify that the sampleRate
is a finite positive number, check that the buffer byte alignment is correct for
the specified format, and enforce reasonable limits on the total audio duration
or buffer size to prevent resource exhaustion. Perform these validations after
checking for an empty buffer but before constructing the Float32Array or
Int16Array from the buffer, throwing descriptive errors for any validation
failures.
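
The validation steps listed in the comment above (format, sample rate, byte alignment, size limit) can be sketched as a single guard function. Everything named here is hypothetical: `validatePcmRequest`, the `ValidatedPcm` type, and especially the `MAX_AUDIO_SECONDS` cap of 300 are placeholders chosen for the example, not values from the PR.

```typescript
type PcmFormat = "float32" | "int16";

const BYTES_PER_SAMPLE: Record<PcmFormat, number> = { float32: 4, int16: 2 };
const MAX_AUDIO_SECONDS = 300; // hypothetical cap; tune to the product's needs

interface ValidatedPcm {
  format: PcmFormat;
  sampleRate: number;
  sampleCount: number;
}

// Validate renderer-supplied values before any typed array is constructed.
function validatePcmRequest(buffer: ArrayBuffer, format: unknown, sampleRate: unknown): ValidatedPcm {
  if (format !== "float32" && format !== "int16") {
    throw new TypeError(`Unsupported PCM format: ${String(format)}`);
  }
  const pcmFormat = format as PcmFormat;
  if (typeof sampleRate !== "number" || !isFinite(sampleRate) || sampleRate <= 0) {
    throw new RangeError("sampleRate must be a finite positive number");
  }
  const bytesPerSample = BYTES_PER_SAMPLE[pcmFormat];
  if (buffer.byteLength % bytesPerSample !== 0) {
    throw new RangeError(`Buffer length is not a multiple of ${bytesPerSample} bytes`);
  }
  const sampleCount = buffer.byteLength / bytesPerSample;
  if (sampleCount / sampleRate > MAX_AUDIO_SECONDS) {
    throw new RangeError(`Audio exceeds ${MAX_AUDIO_SECONDS} seconds`);
  }
  return { format: pcmFormat, sampleRate, sampleCount };
}
```

Running the guard after the empty-buffer check and before `new Float32Array(buffer)` or `new Int16Array(buffer)` means a misaligned buffer produces a descriptive error instead of the engine's generic typed-array exception.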

In `@apps/desktop/src/main/services/transcription/dictationCleanup.ts`:
- Around line 184-187: In the catch block that handles read/parse failures
(lines 184-187), remove the lines that cache the fallback EMPTY_GLOSSARY and its
path (the assignments to cachedGlossary and cachedGlossaryPath). Instead, only
return EMPTY_GLOSSARY without caching it, so that transient read/parse errors
allow retry attempts on subsequent calls rather than permanently disabling
glossary cleanup until process restart.
- Line 67: The sort comparator in the dictationCleanup.ts file at the sorting of
replacement phrases currently only sorts by length, which makes the order of
equal-length phrases non-deterministic and inconsistent with iOS behavior.
Modify the sort function to add a deterministic tie-break: when two phrases have
the same length (when b.from.length - a.from.length equals zero), add a
secondary comparison to sort them alphabetically by their from string value.
This ensures consistent replacement order for overlapping entries across
platforms.
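
To make the tie-break suggestion above concrete, here is a small comparator sketch. The `Correction` type is an assumed shape, and comparing by UTF-16 code units (rather than `localeCompare`) is a choice made here so the order does not depend on the host locale; the iOS side would need an equivalent ordering for the two platforms to agree.

```typescript
// Assumed entry shape for a glossary correction.
interface Correction {
  from: string;
  to: string;
}

// Longest `from` first; equal lengths fall back to a locale-independent
// code-unit comparison so the order is the same on every machine.
function compareCorrections(a: Correction, b: Correction): number {
  const byLength = b.from.length - a.from.length;
  if (byLength !== 0) return byLength;
  if (a.from < b.from) return -1;
  if (a.from > b.from) return 1;
  return 0;
}

function sortCorrections(corrections: Correction[]): Correction[] {
  // Copy first so the caller's array is left untouched.
  return [...corrections].sort(compareCorrections);
}
```

With this comparator the result no longer depends on the order of entries in the JSON file or on the engine's sort stability.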

In `@apps/desktop/src/main/services/transcription/transcriptionService.ts`:
- Around line 314-320: The sampleRate variable extracted from IPC options is not
validated before being used in the pcmToWavBuffer function call. Add validation
after the sampleRate assignment (where it defaults to TARGET_SAMPLE_RATE if not
provided) to ensure the value is a finite positive number. If validation fails,
throw a typed TranscriptionError instead of allowing invalid values to propagate
to pcmToWavBuffer, which could cause unhandled exceptions during WAV header
writes and bypass proper error handling.
- Around line 251-291: Add a hard timeout mechanism to prevent the whisper child
process from hanging indefinitely. In the Promise returned by this block, set a
timeout using setTimeout after spawning the child process. If the timeout
expires before the child process closes, kill the child process using
child.kill(), remove it from activeChildren, and reject the promise with an
appropriate TranscriptionError. Make sure to clear the timeout in the
child.on("close") handler to prevent it from firing after the process has
already completed successfully.
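
A minimal sketch of the sampleRate validation suggested above, under stated assumptions: the `TranscriptionError` class here is a stand-in (its real shape is not shown in this review), the `invalid_sample_rate` code is hypothetical, and `resolveSampleRate` is an illustrative helper name. The 16 kHz default follows the PR summary.

```typescript
// Stand-in for the service's typed error; the real class is not shown in the review.
class TranscriptionError extends Error {
  code: string;
  constructor(code: string, message: string) {
    super(`${code}: ${message}`);
    this.name = "TranscriptionError";
    this.code = code;
  }
}

// 16 kHz mono PCM, per the PR summary.
const TARGET_SAMPLE_RATE = 16000;

// Returns a usable sample rate or throws a typed error before any WAV header is written.
function resolveSampleRate(requested: unknown): number {
  const rate = requested ?? TARGET_SAMPLE_RATE;
  if (typeof rate !== "number" || !Number.isFinite(rate) || rate <= 0) {
    // "invalid_sample_rate" is a hypothetical code chosen for this sketch.
    throw new TranscriptionError(
      "invalid_sample_rate",
      `expected a finite positive number, got ${String(requested)}`,
    );
  }
  return rate;
}
```

Validating at the IPC boundary keeps malformed renderer input from reaching the buffer-writing code, so the failure surfaces through the same typed error path as other transcription errors.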

In `@apps/desktop/src/renderer/services/globalVoiceRecorder.ts`:
- Around line 274-287: The insertText call on target is not guarded by
try-catch, so if it throws an exception, control will jump out and skip the
clipboard fallback write below, causing the transcript to be lost. Wrap the
target.insertText(cleaned) call in its own try-catch block to ensure any
exceptions it throws are caught and do not prevent the clipboard write fallback
from executing. This preserves the intended behavior where the clipboard write
acts as a recovery mechanism.
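
The guarded delivery described above might be sketched as follows. `ComposerTarget`, `DeliveryResult`, and `deliverTranscript` are hypothetical names, and the clipboard write is passed in as a function because the actual clipboard bridge is not shown in this review.

```typescript
// Hypothetical minimal shapes; the real composer target is not shown in the review.
interface ComposerTarget {
  insertText(text: string): void;
}

interface DeliveryResult {
  inserted: boolean;
  copied: boolean;
}

function deliverTranscript(
  cleaned: string,
  target: ComposerTarget | null,
  writeClipboard: (text: string) => void,
): DeliveryResult {
  let inserted = false;
  if (target) {
    try {
      target.insertText(cleaned);
      inserted = true;
    } catch (error) {
      // A throwing composer must not skip the clipboard write below.
      console.warn("insertText failed; falling back to clipboard", error);
    }
  }

  let copied = false;
  try {
    writeClipboard(cleaned);
    copied = true;
  } catch (error) {
    console.warn("clipboard write failed", error);
  }
  return { inserted, copied };
}
```

Returning both flags lets the caller decide what to tell the user, for example that the text was not inserted but is available on the clipboard.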

In `@apps/ios/ADE/Resources/VoiceGlossary.json`:
- Around line 4-110: The contextualTerms array in VoiceGlossary.json currently
contains 105 items, exceeding the file's documented constraint of approximately
100 phrases. Reduce the array to align with this stated limit by removing or
deprioritizing less critical terms. Focus on keeping the most essential and
frequently used contextual terms for voice recognition, particularly those
related to the core ADE functionality and widely-used development tools and
frameworks.

In `@apps/ios/ADE/Services/Dictation/DictationCleanup.swift`:
- Around line 94-96: The line constructing
Character(String(character).uppercased()) in the capitalizeNext block can trap
at runtime when uppercase mapping expands to multiple characters (for example,
"ß" becomes "SS"). Instead of appending a Character initialized with the
uppercased string, directly append the uppercased String value to the result
variable to safely handle multi-character uppercase expansions.

In `@apps/ios/ADE/Services/Dictation/DictationController.swift`:
- Around line 64-71: The DictationController is mirroring audioLevel,
elapsedTime, and isRecording from SpeechDictationService as `@Published`
properties, but isPreparing and isStarting are not mirrored, leaving the UI
unaware of startup and download phases. Add `@Published` private(set) properties
for isPreparing and isStarting in DictationController to mirror these state
values from SpeechDictationService, then update DictationMicButton to include
these mirrored properties in its UI state gating and disable logic so users
receive deterministic preparation feedback during startup.

In `@apps/ios/ADE/Services/Dictation/SpeechDictationService.swift`:
- Around line 487-493: The converter recreation condition in the convert method
only checks if the outputFormat has changed, but during audio route changes the
inputFormat can also change independently. Update the condition that checks
whether to recreate the converter to also validate that the current converter's
inputFormat matches the buffer's inputFormat, in addition to the existing
outputFormat check. This ensures the cached converter is recreated when either
the input or output format changes, preventing stale converters from causing
conversion failures and dropped transcription buffers.

In `@apps/ios/ADE/Views/Work/WorkNewChatSheet.swift`:
- Around line 22-24: The sheet actions can still be triggered while dictation is
active because the `canStartChat` condition does not account for the
`isDictating` state. Modify the `canStartChat` logic to include a check that
prevents the Start action from firing when `isDictating` is true. This will
block sheet dismissal while an active recording is in progress and prevent loss
or redirection of the transcript insertion.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 0453fa03-a6b0-42e2-b1f4-e7307ccbf85a

📥 Commits

Reviewing files that changed from the base of the PR and between 1603520 and b9006de.

⛔ Files ignored due to path filters (4)

apps/ios/ADE.xcodeproj/project.pbxproj is excluded by !**/*.xcodeproj/project.pbxproj
docs/features/chat/composer-and-ui.md is excluded by !docs/**
docs/features/onboarding-and-settings/README.md is excluded by !docs/**
docs/features/sync-and-multi-device/ios-companion.md is excluded by !docs/**

📒 Files selected for processing (53)

.gitignore
apps/desktop/build/entitlements.mac.inherit.plist
apps/desktop/build/entitlements.mac.plist
apps/desktop/package.json
apps/desktop/resources/voice/voice-glossary.json
apps/desktop/resources/whisper/.gitkeep
apps/desktop/resources/whisper/README.md
apps/desktop/scripts/materialize-whisper-resources.mjs
apps/desktop/scripts/validate-whisper-resources.mjs
apps/desktop/src/main/main.ts
apps/desktop/src/main/services/ipc/registerIpc.ts
apps/desktop/src/main/services/transcription/bundledResources.ts
apps/desktop/src/main/services/transcription/dictationCleanup.ts
apps/desktop/src/main/services/transcription/transcriptionService.test.ts
apps/desktop/src/main/services/transcription/transcriptionService.ts
apps/desktop/src/preload/global.d.ts
apps/desktop/src/preload/preload.ts
apps/desktop/src/renderer/components/app/App.tsx
apps/desktop/src/renderer/components/app/App.workKeepAlive.test.tsx
apps/desktop/src/renderer/components/app/TopBar.tsx
apps/desktop/src/renderer/components/chat/AgentChatComposer.tsx
apps/desktop/src/renderer/components/chat/VoiceDictationButton.tsx
apps/desktop/src/renderer/components/settings/AppearanceSection.tsx
apps/desktop/src/renderer/components/settings/DictationSection.tsx
apps/desktop/src/renderer/components/voice/GlobalVoiceCaptureIndicator.tsx
apps/desktop/src/renderer/components/voice/RecordingPill.tsx
apps/desktop/src/renderer/hooks/useVoiceModelInstalled.ts
apps/desktop/src/renderer/services/globalVoiceRecorder.ts
apps/desktop/src/renderer/state/appStore.ts
apps/desktop/src/shared/ipc.ts
apps/ios/ADE/App/ADEApp.swift
apps/ios/ADE/App/ContentView.swift
apps/ios/ADE/Info.plist
apps/ios/ADE/Resources/VoiceGlossary.json
apps/ios/ADE/Services/Dictation/DictationCleanup.swift
apps/ios/ADE/Services/Dictation/DictationController.swift
apps/ios/ADE/Services/Dictation/SpeechDictationService.swift
apps/ios/ADE/Services/Dictation/VoiceGlossary.swift
apps/ios/ADE/Shared/DictationActivityShared.swift
apps/ios/ADE/Views/Components/DictationMicButton.swift
apps/ios/ADE/Views/Components/GlobalDictationPill.swift
apps/ios/ADE/Views/Components/RecordingPill.swift
apps/ios/ADE/Views/Settings/ConnectionSettingsView.swift
apps/ios/ADE/Views/Settings/SettingsVoiceInputSection.swift
apps/ios/ADE/Views/Work/WorkChatComposerAndInputViews.swift
apps/ios/ADE/Views/Work/WorkChatSessionView.swift
apps/ios/ADE/Views/Work/WorkNewChatScreen.swift
apps/ios/ADE/Views/Work/WorkNewChatSheet.swift
apps/ios/ADE/Views/Work/WorkPreviews.swift
apps/ios/ADE/Views/Work/WorkRootScreen.swift
apps/ios/ADETests/ADETests.swift
apps/ios/ADEWidgets/ADEWidgetBundle.swift
apps/ios/ADEWidgets/DictationLiveActivity.swift

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 016ec16313

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you:

- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review"

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-06-15T08:10:16Z

+    func cancelRecording() {
+        guard service.isRecording else { return }
+        ADEHaptics.warning()


Allow canceling a pending dictation start

When first use has to request/download iOS 26 Speech assets, startRecording has already set activeTargetId, but SpeechDictationService.start() does not set isRecording until after ensureModel and audio setup complete. This guard makes cancelRecording() a no-op throughout that isStarting/isPreparing window, so a user who leaves the composer or tries to stop the pending capture can still have recording begin later against their intent. Include the starting/preparing state in the cancellation path or keep the start task cancellable.

chatgpt-codex-connector · 2026-06-15T08:10:16Z

+                    HStack(spacing: 10) {
+                        DictationIslandTimer(state: ctx.state)
+                        DictationIslandDoneButton(isFinishing: ctx.state.isFinishing)
+                    }


Wire Cancel into the Live Activity controls

The background/Dynamic Island dictation surface renders only the Done intent even though DictationCancelIntent and cancelFromLiveActivity() are defined. When ADE is backgrounded or locked, users can finish but cannot discard a recording without reopening the app, leaving an unwanted mic session running; add a cancel button/intent to the Live Activity surfaces where cancellation is promised.

chatgpt-codex-connector · 2026-06-15T08:10:16Z

+      const raw = error instanceof Error ? error.message : String(error);
+      this.emitError(
+        raw.startsWith("model_not_installed:")
+          ? "model_not_installed"
+          : raw.startsWith("empty_audio:")
+            ? "no_audio"
+            : "transcribe_failed",


Match typed transcription errors after IPC wrapping

When transcription.transcribe throws from ipcMain.handle, Electron rejects ipcRenderer.invoke with a message prefixed by Error invoking remote method ..., so these startsWith checks miss the typed prefixes emitted by main. In no-audio or model-missing cases the composer reports the generic transcription failure instead of the intended actionable message; unwrap/search the message or return a structured error code.
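
One way to make the renderer tolerant of the wrapping described above is to search for the typed prefix anywhere in the message instead of requiring it at the start. This is a sketch, not the fix that landed: `classifyTranscriptionError` and `TYPED_PREFIXES` are illustrative names, and the prefix-to-code pairs are taken from the snippet above. A structured error code returned from main would be more robust, since a substring search could in principle match text that merely quotes a prefix.

```typescript
type DictationErrorCode = "model_not_installed" | "no_audio" | "transcribe_failed";

// Prefix emitted by main -> code reported to the composer (pairs from the snippet above).
const TYPED_PREFIXES: ReadonlyArray<[string, DictationErrorCode]> = [
  ["model_not_installed:", "model_not_installed"],
  ["empty_audio:", "no_audio"],
];

// Searches the whole message so a leading "Error invoking remote method ..." wrapper
// does not hide the typed prefix.
function classifyTranscriptionError(error: unknown): DictationErrorCode {
  const raw = error instanceof Error ? error.message : String(error);
  for (const [prefix, code] of TYPED_PREFIXES) {
    if (raw.includes(prefix)) return code;
  }
  return "transcribe_failed";
}
```

The same function still handles unwrapped messages, so it works whether or not the error crossed the IPC boundary.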

arul28 · 2026-06-15T08:38:45Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f996a3ef71

chatgpt-codex-connector · 2026-06-15T08:52:46Z

+
+  const onCancel = useCallback(() => globalVoiceRecorder.cancel(), []);
+  const onDone = useCallback(() => {


Surface dictation errors when the composer unmounts

When a user starts dictation in a composer, navigates away so VoiceDictationButton unmounts, and then taps Done from this global pill, globalVoiceRecorder.finish() can emit no_audio or transcribe_failed but there is no mounted onError listener because the only listener is registered by the composer button. The pill just disappears on failure, so the recording is lost without any explanation; keep a root/global error listener or surface failures from this control.
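
A root-level fallback along the lines suggested above could be sketched like this. `VoiceErrorChannel` and its methods are hypothetical and are not the actual `globalVoiceRecorder` API. The idea is that composer buttons subscribe while mounted, and a handler registered once at the app root (for example a toast) receives errors only when no composer is listening.

```typescript
type VoiceErrorCode = "no_audio" | "transcribe_failed" | "model_not_installed";
type ErrorListener = (code: VoiceErrorCode) => void;

class VoiceErrorChannel {
  private readonly listeners = new Set<ErrorListener>();
  private fallback: ErrorListener | null = null;

  // Composer-scoped subscription; call the returned function on unmount.
  onError(listener: ErrorListener): () => void {
    this.listeners.add(listener);
    return () => {
      this.listeners.delete(listener);
    };
  }

  // Root-level handler used only when no composer listener is registered.
  setFallback(listener: ErrorListener | null): void {
    this.fallback = listener;
  }

  emitError(code: VoiceErrorCode): void {
    if (this.listeners.size === 0) {
      if (this.fallback) this.fallback(code);
      return;
    }
    // Copy first so a listener that unsubscribes during dispatch is safe.
    Array.from(this.listeners).forEach((listener) => listener(code));
  }
}
```

Routing to the fallback only when the listener set is empty avoids showing the same failure twice while a composer is still mounted.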

mintlify Bot deployed to staging June 15, 2026 07:09

greptile-apps Bot reviewed Jun 15, 2026

mintlify Bot deployed to staging June 15, 2026 07:31

greptile-apps Bot reviewed Jun 15, 2026

Comment thread apps/ios/ADE/Services/Dictation/SpeechDictationService.swift

chatgpt-codex-connector Bot reviewed Jun 15, 2026

mintlify Bot deployed to staging June 15, 2026 07:47

chatgpt-codex-connector Bot reviewed Jun 15, 2026

mintlify Bot deployed to staging June 15, 2026 08:02

coderabbitai Bot reviewed Jun 15, 2026

chatgpt-codex-connector Bot reviewed Jun 15, 2026

arul28 added 5 commits June 15, 2026 04:29

feat: add on-device voice dictation

06f8a4d

ship: iteration 1 — fix dictation review issues

e3d821c

ship: iteration 2 — fix dictation start and prefs

fbea138

ship: iteration 3 — fix dictation converter cache

0190464

ship: iteration 4 - address dictation review issues

f996a3e

arul28 force-pushed the ade/wanna-discuss-add-voice-text-46c24910 branch from 016ec16 to f996a3e Compare June 15, 2026 08:38

mintlify Bot deployed to staging June 15, 2026 08:38

chatgpt-codex-connector Bot reviewed Jun 15, 2026

arul28 merged commit c3a663a into main Jun 15, 2026
27 checks passed

arul28 deleted the ade/wanna-discuss-add-voice-text-46c24910 branch June 15, 2026 08:54

